Architecture
Overview
axm-smelt follows a layered architecture with clear separation of concerns:
graph TB
CLI["CLI (cli.py)"] --> Pipeline["smelt() / check()"]
MCP["SmeltTool (AXMTool)"] --> Pipeline
Pipeline --> Detector["detect_format()"]
Pipeline --> Counter["count() — tiktoken"]
Pipeline --> Strategies["Strategy pipeline"]
Strategies --> Registry["_REGISTRY / _PRESETS"]
Pipeline --> Report["SmeltReport"]
Layers
1. Public API (__init__.py)
Three exported functions:
- smelt(text?, strategies?, preset?, *, parsed?): run the pipeline and return a SmeltReport. Accepts either text (str) or parsed (dict/list); at least one is required.
- check(text?, *, parsed?): dry-run every registered strategy and return per-strategy savings estimates. Same input contract as smelt.
- count(text, model?): count tokens via tiktoken (o200k_base by default).
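The shared input contract of smelt() and check() can be illustrated with a small sketch. This is not the library's code: the helper name resolve_input is hypothetical, and both the json.dumps serialization and the precedence of text over parsed are assumptions made for illustration.

```python
import json
from typing import Any, Optional


def resolve_input(text: Optional[str] = None, *, parsed: Any = None) -> str:
    """Hypothetical helper: pick the text to process from either input form."""
    if text is None and parsed is None:
        # Documented contract: at least one of text / parsed is required.
        raise ValueError("provide either text or parsed")
    if text is not None:
        # Assumption: when both are given, the raw text wins.
        return text
    # Assumption: structured input is turned into JSON text.
    return json.dumps(parsed)


print(resolve_input('{"a": 1}'))
print(resolve_input(parsed={"a": 1}))
```

Both calls print the same JSON text, which is the point of accepting parsed input: callers that already hold a dict or list do not need to serialize it themselves.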
2. CLI (cli.py)
Four commands via cyclopts: compact, check, count, version. All read from stdin or --file. compact also accepts --strategies, --preset, and --output. The CLI calls the same core functions as the Python API — no business logic lives in the CLI layer.
3. MCP Tool (tools/smelt.py)
SmeltTool(AXMTool) and SmeltCheckTool(AXMTool) expose the pipeline as MCP tools registered under the axm.tools entry point group. When data is already a dict or list, the tools pass it via parsed= to skip the serialize→deserialize round-trip.
4. Pipeline (core/pipeline.py)
smelt() orchestrates four steps:
- Detect format via detect_format() (iterates _PROBES: _try_json, _try_xml, _try_yaml, _try_markdown)
- Count input tokens
- Build a SmeltContext from the input text and detected format, then apply strategies in order: each strategy.apply(ctx) receives and returns a SmeltContext. A token-count guard compares the result against the current token count: the strategy is only accepted if it strictly reduces tokens (or reduces text length at equal tokens). Strategies that regress are silently discarded
- Count output tokens and compute savings_pct
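The token-count guard in the third step can be sketched as follows. This is a simplified illustration under stated assumptions, not the actual pipeline: strategies are plain functions on strings rather than objects taking a SmeltContext, count_tokens is a whitespace-based stand-in for tiktoken, and run_pipeline, minify, and pad are hypothetical names.

```python
import json
from typing import Callable, List


def count_tokens(text: str) -> int:
    """Stand-in tokenizer (assumption): whitespace-separated chunks, not tiktoken."""
    return len(text.split())


def run_pipeline(text: str, strategies: List[Callable[[str], str]]) -> str:
    """Apply strategies in order, keeping a result only if it beats the current text."""
    current = text
    current_tokens = count_tokens(current)
    for strategy in strategies:
        candidate = strategy(current)
        candidate_tokens = count_tokens(candidate)
        fewer_tokens = candidate_tokens < current_tokens
        shorter_at_tie = (
            candidate_tokens == current_tokens and len(candidate) < len(current)
        )
        if fewer_tokens or shorter_at_tie:
            current, current_tokens = candidate, candidate_tokens
        # Otherwise the candidate is dropped without raising or logging.
    return current


def minify(text: str) -> str:
    """Hypothetical strategy: re-serialize JSON without optional whitespace."""
    return json.dumps(json.loads(text), separators=(",", ":"))


def pad(text: str) -> str:
    """Hypothetical regressing strategy, used to exercise the guard."""
    return text + " extra padding"


original = '{"a": 1, "b": [1, 2]}'
result = run_pipeline(original, [minify, pad])
print(result)  # {"a":1,"b":[1,2]}
```

The minify result is accepted because it lowers the stand-in token count, while the pad result raises it and is discarded, so the final text is the minified form.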
check() runs every registered strategy independently on the original SmeltContext and records per-strategy savings without chaining. Only strategies with positive savings (> 0%) are included in strategy_estimates; strategies that regress or break even are omitted.
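A minimal sketch of this non-chained estimation, under the same simplifications: a whitespace-based count_tokens stands in for tiktoken, strategies are plain string functions, and the percentage formula is an assumption, since the source does not spell it out. The names estimate_savings, collapse, drop_last_word, and pad are hypothetical.

```python
from typing import Callable, Dict


def count_tokens(text: str) -> int:
    """Stand-in tokenizer (assumption): whitespace-separated chunks, not tiktoken."""
    return len(text.split())


def estimate_savings(
    text: str, strategies: Dict[str, Callable[[str], str]]
) -> Dict[str, float]:
    """Run each strategy on the original text alone; keep only positive savings."""
    original_tokens = count_tokens(text)
    estimates: Dict[str, float] = {}
    if original_tokens == 0:
        return estimates
    for name, strategy in strategies.items():
        # Every strategy sees the untouched original, so results never chain.
        candidate_tokens = count_tokens(strategy(text))
        # Assumed formula: percentage of original tokens removed.
        pct = (original_tokens - candidate_tokens) / original_tokens * 100
        if pct > 0:
            estimates[name] = pct
    return estimates


strategies = {
    "collapse": lambda t: " ".join(t.split()),  # breaks even with this tokenizer
    "drop_last_word": lambda t: " ".join(t.split()[:-1]),
    "pad": lambda t: t + " padding",  # regresses
}
print(estimate_savings("one two three four", strategies))  # {'drop_last_word': 25.0}
```

Only drop_last_word appears in the result: the break-even and regressing strategies are filtered out, mirroring how strategy_estimates omits them.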
5. Strategies (strategies/)
Each strategy is a class implementing SmeltStrategy (name, category, apply(ctx) -> SmeltContext). Strategies are registered in _REGISTRY and composed into presets via _PRESETS:
| Preset | Strategies |
|---|---|
| safe | minify, collapse_whitespace |
| moderate | minify, drop_nulls, flatten, dedup_values, tabular, strip_quotes, collapse_whitespace, compact_tables, strip_html_comments |
| aggressive | minify, drop_nulls, flatten, tabular, round_numbers, dedup_values, strip_quotes, collapse_whitespace, compact_tables, strip_html_comments |

| Strategy class | Name | Category |
|---|---|---|
| MinifyStrategy | minify | whitespace |
| CollapseWhitespaceStrategy | collapse_whitespace | whitespace |
| CompactTablesStrategy | compact_tables | whitespace |
| DropNullsStrategy | drop_nulls | structural |
| FlattenStrategy | flatten | structural |
| TabularStrategy | tabular | structural |
| DedupValuesStrategy | dedup_values | structural |
| StripQuotesStrategy | strip_quotes | cosmetic |
| StripHtmlCommentsStrategy | strip_html_comments | cosmetic |
| RoundNumbersStrategy | round_numbers | cosmetic |
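How a strategy class, the registry, and a preset fit together might look roughly like this. The sketch is illustrative only: apply() works on plain strings instead of a SmeltContext, the method bodies are simplified guesses at what minify and collapse_whitespace do, and Strategy, REGISTRY, PRESETS, and resolve_preset are stand-in names rather than the package's internals.

```python
import json
from typing import Dict, List, Protocol


class Strategy(Protocol):
    """Assumed shape of the strategy interface (the real one takes a SmeltContext)."""

    name: str
    category: str

    def apply(self, text: str) -> str: ...


class MinifyStrategy:
    name = "minify"
    category = "whitespace"

    def apply(self, text: str) -> str:
        try:
            return json.dumps(json.loads(text), separators=(",", ":"))
        except ValueError:
            return text  # not JSON: leave the input untouched


class CollapseWhitespaceStrategy:
    name = "collapse_whitespace"
    category = "whitespace"

    def apply(self, text: str) -> str:
        # Simplified: squeeze every whitespace run into a single space.
        return " ".join(text.split())


# Registry maps strategy names to instances; presets are ordered name lists.
REGISTRY: Dict[str, Strategy] = {
    s.name: s for s in (MinifyStrategy(), CollapseWhitespaceStrategy())
}
PRESETS: Dict[str, List[str]] = {"safe": ["minify", "collapse_whitespace"]}


def resolve_preset(preset: str) -> List[Strategy]:
    """Turn a preset name into its ordered list of strategy instances."""
    return [REGISTRY[name] for name in PRESETS[preset]]


print([s.name for s in resolve_preset("safe")])  # ['minify', 'collapse_whitespace']
```

Keeping presets as lists of names rather than lists of instances means a preset only fixes the order, and the registry stays the single place where names are bound to implementations.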
6. Format Detection (core/detector.py)
Heuristic detection returns a Format enum value (JSON, YAML, XML, TOML, CSV, MARKDOWN, TEXT). Strategies that are format-specific (e.g., minify for JSON) check the first character before attempting to parse.
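The probe-based detection can be sketched with two standard-library probes. This is an assumption-laden illustration, not the contents of core/detector.py: the enum values, the probe signatures, and the fallback to TEXT are guesses, and only JSON and XML are covered because they can be checked without third-party parsers.

```python
import json
import xml.etree.ElementTree as ET
from enum import Enum
from typing import Callable, List, Optional


class Format(str, Enum):
    JSON = "json"  # enum values are assumptions
    XML = "xml"
    TEXT = "text"


def try_json(text: str) -> Optional[Format]:
    stripped = text.lstrip()
    # Cheap first-character check before paying for a full parse.
    if not stripped or stripped[0] not in "{[":
        return None
    try:
        json.loads(stripped)
    except ValueError:
        return None
    return Format.JSON


def try_xml(text: str) -> Optional[Format]:
    stripped = text.lstrip()
    if not stripped.startswith("<"):
        return None
    try:
        ET.fromstring(stripped)
    except ET.ParseError:
        return None
    return Format.XML


PROBES: List[Callable[[str], Optional[Format]]] = [try_json, try_xml]


def detect_format(text: str) -> Format:
    """Return the first format whose probe accepts the text, else TEXT."""
    for probe in PROBES:
        found = probe(text)
        if found is not None:
            return found
    return Format.TEXT


print(detect_format('{"a": 1}').value)  # json
print(detect_format("<a><b/></a>").value)  # xml
print(detect_format("hello").value)  # text
```

Probe order matters in a design like this: the first probe that accepts the input wins, so stricter formats should be tried before more permissive ones.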
7. Models (core/models.py)
- SmeltContext: dataclass carrying the current text, detected format, and a lazily parsed JSON cache; passed through the strategy pipeline.
- SmeltReport: Pydantic model with extra = "forbid".
- Format: string enum.
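One way a lazily parsed JSON cache on a context dataclass could work is sketched below. The class name Context, its field names, and the sentinel approach are all assumptions for illustration; the real SmeltContext may be structured differently.

```python
import json
from dataclasses import dataclass, field
from typing import Any

# Sentinel: distinguishes "not parsed yet" from a document that parses to None.
_UNSET = object()


@dataclass
class Context:
    """Hypothetical stand-in for SmeltContext."""

    text: str
    format: str
    _parsed: Any = field(default=_UNSET, repr=False)

    @property
    def parsed(self) -> Any:
        # Parse on first access only; later accesses reuse the cached object.
        if self._parsed is _UNSET:
            self._parsed = json.loads(self.text)
        return self._parsed


ctx = Context(text='{"a": [1, 2]}', format="json")
first = ctx.parsed
second = ctx.parsed
print(first is second)  # True
```

With a cache like this, several JSON-aware strategies in a row can share one parse of the same text instead of each calling json.loads again.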
Data Flow
sequenceDiagram
participant User
participant API as smelt() / CLI
participant Pipeline
participant Strategies
User->>API: smelt(text, preset="moderate")
API->>Pipeline: detect_format(text)
API->>Pipeline: count(text) -> original_tokens
Pipeline->>Pipeline: SmeltContext(text, format)
loop For each strategy in preset
Pipeline->>Strategies: strategy.apply(ctx)
Strategies-->>Pipeline: ctx (accepted if tokens decrease, else discarded)
end
Pipeline->>Pipeline: count(ctx.text) -> compacted_tokens
Pipeline-->>User: SmeltReport