Skip to content

Architecture

Overview

axm-ast follows a layered architecture: CLI → core engines → models. The core layer is entirely I/O-free — it operates on Pydantic models produced by tree-sitter parsing.

graph TD
    subgraph "User Interface"
        CLI["CLI (cyclopts)"]
    end

    subgraph "Core Engines"
        Parser["Parser (tree-sitter)"]
        Analyzer["Analyzer"]
        Cache["Cache"]
        Ranker["Ranker (PageRank)"]
        Callers["Caller Analysis"]
        Context["Context (one-shot)"]
        Impact["Impact Analysis"]
        GitCoupling["Git Coupling"]
        StructDiff["Structural Diff"]
        Workspace["Workspace"]
        Docs["Docs Discovery"]
        Formatters["Formatters"]
    end

    subgraph "Models (Pydantic)"
        ModuleInfo["ModuleInfo"]
        PackageInfo["PackageInfo"]
        WorkspaceInfo["WorkspaceInfo"]
        CallSite["CallSite"]
    end

    subgraph "External"
        TreeSitter["tree-sitter-python"]
        FS["File System"]
        PyProject["pyproject.toml"]
        Git["git log"]
    end

    CLI --> Cache
    CLI --> Context
    CLI --> Impact
    CLI --> Callers
    CLI --> Formatters
    Cache --> Analyzer
    Cache --> PackageInfo
    Analyzer --> Parser
    Ranker --> PackageInfo
    Callers --> Parser
    Context --> Cache
    Context --> Ranker
    Impact --> Callers
    Impact --> Cache
    Impact --> GitCoupling
    GitCoupling --> Git
    StructDiff --> Git
    StructDiff --> Analyzer
    Parser --> TreeSitter
    Parser --> FS
    Parser --> ModuleInfo
    Analyzer --> PackageInfo
    Callers --> CallSite
    Context --> PyProject
    CLI --> Docs
    CLI --> Workspace
    Workspace --> Cache
    Workspace --> Callers
    Workspace --> Impact
    Workspace --> WorkspaceInfo
    Docs --> FS

Layers

1. CLI (cli.py)

Cyclopts-based commands with input validation and formatted output (text + JSON). Each command follows the pattern: parse arguments → call core → format output.

2. Core Engines (core/)

Independent, composable analysis engines:

Engine Purpose Key Function
parser.py Tree-sitter AST parsing → ModuleInfo extract_module_info()
analyzer.py Package discovery, import graph (absolute + relative), search, stubs analyze_package()
cache.py Thread-safe caching of PackageInfo — avoids redundant parsing get_package(), clear_cache()
ranker.py PageRank symbol importance rank_symbols()
callers.py Call-site detection find_callers(), find_callers_workspace()
context.py One-shot project dump build_context()
impact.py Change blast radius (callers + reexports + tests + git coupling + cross-package) analyze_impact(), analyze_impact_workspace()
git_coupling.py Git co-change coupling analysis (6-month history) git_coupled_files()
structural_diff.py Symbol-level branch diff via git worktrees structural_diff()
workspace.py Multi-package workspace detection and analysis detect_workspace(), analyze_workspace()
docs.py Documentation tree discovery discover_docs()
dead_code.py Dead code detection with test/lazy-import/base-class scanning; respects .gitignore via _discover_py_files find_dead_code(), DeadSymbol
flows.py Entry point detection, BFS flow tracing, source enrichment find_entry_points(), trace_flow()

3. Formatters (formatters.py)

Output formatting with multiple detail levels:

Function Purpose
format_text() Human-readable text (summary / detailed / full)
format_compressed() AI-friendly compressed view
format_json() Machine-readable JSON
format_toc() Table-of-contents: module names + counts only
filter_modules() Case-insensitive substring filter on module names
format_mermaid() Mermaid dependency graph

4. Models (models/)

Pydantic models for structured data exchange between layers:

Model Purpose
ModuleInfo Full introspection result for a single module
PackageInfo Full introspection result for a package
FunctionInfo Function metadata (params, return type, decorators)
ClassInfo Class metadata (bases, methods, docstring)
ParameterInfo Function parameter (name, type, default)
VariableInfo Module-level variable / constant
ImportInfo Import statement (absolute/relative, names)
CallSite Call-site location (module, line, context)
WorkspaceInfo Multi-package workspace (packages, dependency edges)

5. Hooks (hooks/)

Protocol hooks registered via axm.hooks entry points. These are called by axm-engine as pre/post-hooks in protocol execution.

Hook Entry Point Purpose
TraceSourceHook ast:trace-source Run trace_flow(detail="source") and inject trace into session context
SourceBodyHook ast:source-body Fetch raw source body for a symbol and inject it into session context
FileHeaderHook ast:file-header Extract file-level header (module docstring, __all__, top-level imports) and inject into session context

Design Decisions

Decision Rationale
tree-sitter for parsing Fast, incremental, handles broken files gracefully
Pydantic models Validation, serialization, JSON output for free
PageRank for ranking Graph-based importance adapts to any project structure
Composable engines impact = callers + analyzer + ranker + test mapping + git coupling
Session cache PackageCache avoids redundant tree-sitter parsing across chained tool calls
Workspace auto-detect [tool.uv.workspace] triggers multi-package mode transparently
src/ layout PEP 621 best practice, no import conflicts