Index

`axm_smelt`

axm-smelt - Deterministic token compaction for LLM inputs.

`CounterBackend`

Bases: StrEnum

Backend used to produce a token count.

Currently only `TIKTOKEN` exists; the enum is retained as the seam for a future HuggingFace/SentencePiece backend (Llama/Mistral/Gemma).

Source code in packages/axm-smelt/src/axm_smelt/core/counter.py

Python
class CounterBackend(StrEnum):
    """Backend used to produce a token count.

    Currently only :attr:`TIKTOKEN` exists; the enum is retained as the seam
    for a future HuggingFace/SentencePiece backend (Llama/Mistral/Gemma).
    """

    TIKTOKEN = _TIKTOKEN_VALUE

`Format`

Bases: Enum

Supported input formats.

Source code in packages/axm-smelt/src/axm_smelt/core/models.py

Python
class Format(enum.Enum):
    """Supported input formats."""

    JSON = "json"
    YAML = "yaml"
    XML = "xml"
    TOML = "toml"
    CSV = "csv"
    MARKDOWN = "markdown"
    TEXT = "text"

`SmeltReport`

Bases: BaseModel

Report produced by the smelt pipeline.

Source code in packages/axm-smelt/src/axm_smelt/core/models.py

Python
class SmeltReport(BaseModel):  # type: ignore[explicit-any]  # reason: pydantic plugin synthesizes __init__ with Any kwargs
    """Report produced by the smelt pipeline."""

    original: str
    compacted: str
    original_tokens: int
    compacted_tokens: int
    savings_pct: float
    format: Format
    strategies_applied: list[str]
    strategy_estimates: dict[str, float] = {}
    counter_backend: CounterBackend = CounterBackend.TIKTOKEN

`check(text=None, *, parsed=None)`

Analyze text without transforming it.

The report carries two distinct savings figures:

- `strategy_estimates` maps each registry strategy to the reduction it achieves in isolation, measured against the unmutated input. These estimates are independent and non-additive: summing them overstates the achievable gain, because strategies overlap (e.g. `minify` already removes whitespace that `collapse_whitespace` would also target).
- `savings_pct` is the real cumulative gain: the reduction obtained by chaining the default strategy set (`resolve_strategies(None, None)`, i.e. the `safe` preset, exactly what `smelt` applies with no explicit strategies). It equals what a user would actually get from `smelt(text)`.

`original` and `compacted` stay identical: `check` never transforms its input, it only measures.
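To see why the isolated estimates cannot be summed, here is a minimal, self-contained sketch. It does not use `axm_smelt`: the two toy functions `minify` and `collapse_whitespace` are hypothetical stand-ins for registry strategies, and character length stands in for a real token count.

```python
import json
import re


def minify(text: str) -> str:
    """Toy stand-in: re-serialize JSON with no optional whitespace."""
    return json.dumps(json.loads(text), separators=(",", ":"))


def collapse_whitespace(text: str) -> str:
    """Toy stand-in: squeeze every whitespace run down to one space."""
    return re.sub(r"\s+", " ", text)


def savings_pct(before: str, after: str) -> float:
    """Percentage reduction, using character length as a crude token proxy."""
    return (1 - len(after) / len(before)) * 100 if before else 0.0


original = json.dumps({"name": "smelt", "tags": ["a", "b"]}, indent=4)

# Each estimate is measured in isolation against the untouched input.
isolated = {
    "minify": savings_pct(original, minify(original)),
    "collapse_whitespace": savings_pct(original, collapse_whitespace(original)),
}

# The cumulative figure chains the strategies on the same input.
chained = savings_pct(original, collapse_whitespace(minify(original)))

print(f"sum of isolated estimates: {sum(isolated.values()):.1f}%")
print(f"chained (cumulative) gain: {chained:.1f}%")
```

Both toy strategies remove the same indentation, so the second one finds nothing left to remove once the first has run. The sum of the isolated figures therefore exceeds the chained figure.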

Source code in packages/axm-smelt/src/axm_smelt/core/pipeline.py

Python
def check(
    text: str | None = None,
    *,
    parsed: dict[str, JsonValue] | list[JsonValue] | None = None,
) -> SmeltReport:
    """Analyze *text* without transforming it.

    The report carries two distinct savings figures:

    - ``strategy_estimates`` maps each registry strategy to the reduction it
      achieves *in isolation*, measured against the unmutated input. These
      estimates are **independent and non-additive**: summing them overstates
      the achievable gain, because strategies overlap (e.g. ``minify`` already
      removes whitespace that ``collapse_whitespace`` would also target).
    - ``savings_pct`` is the **real cumulative gain** — the reduction obtained
      by *chaining* the default strategy set (``resolve_strategies(None, None)``,
      i.e. the ``safe`` preset, exactly what :func:`smelt` applies with no
      explicit strategies). It equals what a user would actually get from
      ``smelt(text)``.

    ``original`` and ``compacted`` stay identical: ``check`` never transforms
    its input, it only measures.
    """
    from axm_smelt.strategies import _REGISTRY

    if parsed is not None:
        text = json.dumps(parsed, separators=(",", ":"))
    elif text is None:
        msg = "Either text or parsed must be provided"
        raise ValueError(msg)

    fmt, detected_parsed = detect_format_parsed(text)
    tokens, backend = count_with_backend(text)

    if detected_parsed is not None:
        ctx = SmeltContext(text=text, format=fmt, parsed=detected_parsed)
    else:
        ctx = SmeltContext(text=text, format=fmt)

    estimates: dict[str, float] = {}
    for name, cls in _REGISTRY.items():
        strategy = cls()
        result = _safe_apply(strategy, ctx)
        if result.text != ctx.text:
            result_tokens, b = count_with_backend(result.text)
            backend = _worst(backend, b)
            savings = (1 - result_tokens / tokens) * 100 if tokens > 0 else 0.0
            if savings > 0:
                estimates[name] = round(savings, 2)

    strats = resolve_strategies(None, None)
    chained_ctx, _applied, b_strat = _apply_strategies(ctx, strats, tokens)
    backend = _worst(backend, b_strat)
    chained_tokens, b_chain = count_with_backend(chained_ctx.text)
    backend = _worst(backend, b_chain)
    cumulative = (1 - chained_tokens / tokens) * 100 if tokens > 0 else 0.0

    return SmeltReport(
        original=text,
        compacted=text,
        original_tokens=tokens,
        compacted_tokens=tokens,
        savings_pct=cumulative,
        format=fmt,
        strategies_applied=[],
        strategy_estimates=estimates,
        counter_backend=backend,
    )

`count(text, model='o200k_base')`

Return the token count for text.

Uses tiktoken with the `model` encoding; a `claude*` or unknown model is routed to the `o200k_base` proxy.
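The routing rule can be pictured with a small sketch. This is an illustration of the described behaviour, not the library's code: `resolve_encoding_name`, `KNOWN_ENCODINGS` and `PROXY_ENCODING` are hypothetical names, and the set of known encodings is an assumption listing a few encoding names that tiktoken ships.

```python
# Illustrative only: a few encoding names shipped by tiktoken (assumed set).
KNOWN_ENCODINGS = frozenset({"o200k_base", "cl100k_base", "p50k_base", "r50k_base"})

# The proxy named in the description above.
PROXY_ENCODING = "o200k_base"


def resolve_encoding_name(model: str) -> str:
    """Pick the encoding name to count with (hypothetical helper)."""
    if model.startswith("claude"):
        # No tiktoken encoding exists for Claude models, so use the proxy.
        return PROXY_ENCODING
    if model in KNOWN_ENCODINGS:
        return model
    # Anything unrecognized also falls back to the proxy.
    return PROXY_ENCODING


for name in ("cl100k_base", "claude-3-opus", "not-a-model"):
    print(name, "->", resolve_encoding_name(name))
```

Under this reading, a count for a `claude*` model is an approximation obtained with a different tokenizer, which is worth keeping in mind when comparing figures across models.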

Source code in packages/axm-smelt/src/axm_smelt/core/counter.py

Python
def count(text: str, model: str = "o200k_base") -> int:
    """Return the token count for *text*.

    Uses tiktoken with *model* encoding; a ``claude*`` or unknown model is
    routed to the ``o200k_base`` proxy.
    """
    n, _ = count_with_backend(text, model)
    return n

`smelt(text=None, strategies=None, preset=None, *, parsed=None)`

Run the compaction pipeline and return a report.

Baseline for `savings_pct`:

- `text=` path: the baseline is the provided raw string, unchanged.
- `parsed=` path: the baseline is the pretty serialization `json.dumps(parsed, indent=2, ensure_ascii=False)`, not the compact dump. A parsed object has no canonical textual form, so measuring savings against an already-minified compact baseline would structurally under-report the reduction. The pretty baseline reflects the indented form a user would otherwise have read, so `savings_pct` matches the perceived reduction. `report.original` still holds the compact serialization (the pipeline's working text).
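The effect of the baseline choice can be sketched with the standard library alone. The snippet below does not call `smelt`; it assumes, for illustration, that the pipeline output equals the compact dump and uses character length as a rough proxy for tokens. The `savings_pct` helper here is a local function that mirrors the formula in the source listing.

```python
import json


def savings_pct(baseline_size: int, compacted_size: int) -> float:
    """Percentage reduction relative to the baseline (0.0 for an empty baseline)."""
    if baseline_size <= 0:
        return 0.0
    return (1 - compacted_size / baseline_size) * 100


parsed = {"user": {"id": 7, "roles": ["admin", "dev"]}, "active": True}

# Pretty form: the baseline described for the parsed= path.
pretty = json.dumps(parsed, indent=2, ensure_ascii=False)

# Compact form: the pipeline's working text (and, by assumption, its output here).
compact = json.dumps(parsed, separators=(",", ":"))

against_pretty = savings_pct(len(pretty), len(compact))
against_compact = savings_pct(len(compact), len(compact))

print(f"savings vs pretty baseline:  {against_pretty:.1f}%")
print(f"savings vs compact baseline: {against_compact:.1f}%")
```

Measured against the compact dump, the reduction from whitespace removal disappears entirely, which is the under-reporting the description warns about.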

Source code in packages/axm-smelt/src/axm_smelt/core/pipeline.py

Python
def smelt(
    text: str | None = None,
    strategies: list[str] | None = None,
    preset: str | None = None,
    *,
    parsed: dict[str, JsonValue] | list[JsonValue] | None = None,
) -> SmeltReport:
    """Run the compaction pipeline and return a report.

    Baseline for ``savings_pct``:

    - ``text=`` path: the baseline is the provided raw string, unchanged.
    - ``parsed=`` path: the baseline is the *pretty* serialization
      ``json.dumps(parsed, indent=2, ensure_ascii=False)``, not the compact
      dump. A parsed object has no canonical textual form, so measuring
      savings against an already-minified compact baseline would
      structurally under-report the reduction. The pretty baseline reflects
      the indented form a user would otherwise have read, so ``savings_pct``
      matches the perceived reduction. ``report.original`` still holds the
      compact serialization (the pipeline's working text).
    """
    text, parsed = resolve_input(text, parsed)

    fmt, detected_parsed = detect_format_parsed(text)
    if parsed is not None:
        detected_parsed = parsed
        baseline = json.dumps(parsed, indent=2, ensure_ascii=False)
    else:
        baseline = text
    original_tokens, b1 = count_with_backend(baseline)

    strats = resolve_strategies(strategies, preset)

    if detected_parsed is not None:
        ctx = SmeltContext(text=text, format=fmt, parsed=detected_parsed)
    else:
        ctx = SmeltContext(text=text, format=fmt)

    ctx, applied, b_strat = _apply_strategies(ctx, strats, original_tokens)

    compacted = ctx.text
    compacted_tokens, b3 = count_with_backend(compacted)
    backend = _worst(_worst(b1, b_strat), b3)
    savings = (
        (1 - compacted_tokens / original_tokens) * 100 if original_tokens > 0 else 0.0
    )

    return SmeltReport(
        original=text,
        compacted=compacted,
        original_tokens=original_tokens,
        compacted_tokens=compacted_tokens,
        savings_pct=savings,
        format=fmt,
        strategies_applied=applied,
        counter_backend=backend,
    )