Parser

parser

Tree-sitter based Python source parser.

This module provides deterministic, fast parsing of Python source files using tree-sitter. It extracts structured information (functions, classes, imports, variables, docstrings) from the concrete syntax tree.

Example

>>> from pathlib import Path
>>> from axm_ast.core.parser import extract_module_info
>>> mod = extract_module_info(Path("my_module.py"))
>>> [f.name for f in mod.functions]
['main', 'helper']

extract_module_info(path)

Extract full module information from a Python file.

Parses the file using tree-sitter and extracts all functions, classes, imports, variables, and the module docstring.

Parameters:

Name  Type  Description          Default
path  Path  Path to a .py file.  required

Returns:

Type        Description
ModuleInfo  ModuleInfo with all extracted metadata.

Raises:

Type               Description
FileNotFoundError  If the file does not exist.
ValueError         If the file is not a .py file.

Example

>>> from pathlib import Path
>>> mod = extract_module_info(Path("my_module.py"))
>>> mod.path.name
'my_module.py'

Source code in packages/axm-ast/src/axm_ast/core/parser.py
def extract_module_info(path: Path) -> ModuleInfo:
    """Extract full module information from a Python file.

    Parses the file using tree-sitter and extracts all functions,
    classes, imports, variables, and the module docstring.

    Args:
        path: Path to a .py file.

    Returns:
        ModuleInfo with all extracted metadata.

    Raises:
        FileNotFoundError: If the file does not exist.
        ValueError: If the file is not a .py file.

    Example:
        >>> from pathlib import Path
        >>> mod = extract_module_info(Path("my_module.py"))
        >>> mod.path.name
        'my_module.py'
    """
    tree = parse_file(path)
    root = tree.root_node

    docstring = _extract_docstring(root)
    all_exports = _extract_all_exports(root)

    functions: list[FunctionInfo] = []
    classes: list[ClassInfo] = []
    imports: list[ImportInfo] = []
    variables: list[VariableInfo] = []

    for child in root.children:
        if child.type == "function_definition":
            functions.append(_extract_function(child))
        elif child.type == "decorated_definition":
            _process_decorated(child, functions, classes)
        elif child.type == "class_definition":
            classes.append(_extract_class(child))
        elif child.type in (
            "import_statement",
            "import_from_statement",
            "future_import_statement",
        ):
            imports.extend(_extract_imports(child))
        elif child.type == "expression_statement":
            var = _extract_variable(child)
            if var is not None and var.name != "__all__":
                variables.append(var)

    return ModuleInfo(
        path=path.resolve(),
        docstring=docstring,
        functions=functions,
        classes=classes,
        imports=imports,
        variables=variables,
        all_exports=all_exports,
    )
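
The top-level dispatch in the loop above can be sketched without tree-sitter. The example below is an illustration only: it swaps in the standard-library `ast` module for tree-sitter, and `ModuleSummary` and `summarize` are hypothetical names that do not exist in `axm_ast`. Because `ast` attaches decorators to the definition node itself, there is no separate branch for decorated definitions here, unlike the `decorated_definition` case in the source.

```python
import ast
from dataclasses import dataclass, field


@dataclass
class ModuleSummary:
    """Hypothetical stand-in for ModuleInfo; it stores names only."""

    functions: list[str] = field(default_factory=list)
    classes: list[str] = field(default_factory=list)
    imports: list[str] = field(default_factory=list)
    variables: list[str] = field(default_factory=list)


def summarize(source: str) -> ModuleSummary:
    """Classify top-level statements by node type (stdlib ast, not tree-sitter)."""
    summary = ModuleSummary()
    for node in ast.parse(source).body:  # top-level statements only
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            summary.functions.append(node.name)
        elif isinstance(node, ast.ClassDef):
            summary.classes.append(node.name)
        elif isinstance(node, ast.Import):
            summary.imports.extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            summary.imports.append(node.module or ".")
        elif isinstance(node, (ast.Assign, ast.AnnAssign)):
            targets = node.targets if isinstance(node, ast.Assign) else [node.target]
            for target in targets:
                # __all__ is kept out of the variable list, as in the loop above.
                if isinstance(target, ast.Name) and target.id != "__all__":
                    summary.variables.append(target.id)
    return summary


SAMPLE = '''
import os
from pathlib import Path

__all__ = ["main"]
VERSION = "1.0"

def main(): ...

class Helper: ...
'''

result = summarize(SAMPLE)
print(result.functions, result.classes, result.imports, result.variables)
```

As in `extract_module_info`, only direct children of the module are visited, so nested functions and methods do not appear in the top-level lists.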

parse_file(path)

Parse a Python file into a tree-sitter Tree.

Parameters:

Name  Type  Description          Default
path  Path  Path to a .py file.  required

Returns:

Type  Description
Tree  Parsed tree-sitter Tree.

Raises:

Type               Description
FileNotFoundError  If the file does not exist.
ValueError         If the file is not a .py file.

Example

>>> from pathlib import Path
>>> tree = parse_file(Path("setup.py"))
>>> tree.root_node.type
'module'

Source code in packages/axm-ast/src/axm_ast/core/parser.py
def parse_file(path: Path) -> Tree:
    """Parse a Python file into a tree-sitter Tree.

    Args:
        path: Path to a .py file.

    Returns:
        Parsed tree-sitter Tree.

    Raises:
        FileNotFoundError: If the file does not exist.
        ValueError: If the file is not a .py file.

    Example:
        >>> from pathlib import Path
        >>> tree = parse_file(Path("setup.py"))
        >>> tree.root_node.type
        'module'
    """
    path = Path(path).resolve()
    if not path.exists():
        msg = f"File not found: {path}"
        raise FileNotFoundError(msg)
    if path.suffix != ".py":
        msg = f"Not a Python file: {path}"
        raise ValueError(msg)
    source = path.read_text(encoding="utf-8")
    return parse_source(source)
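
The order of the two checks above determines which exception a caller sees: a missing path raises `FileNotFoundError` even when its suffix is also wrong, and the suffix comparison is exact, so an existing `.pyi` stub is rejected. A minimal sketch of that behavior, using hypothetical helpers `check_python_path` and `outcome` that copy the checks rather than calling `parse_file` itself:

```python
import tempfile
from pathlib import Path


def check_python_path(path: Path) -> Path:
    """Hypothetical helper reproducing the two checks in the same order."""
    path = Path(path).resolve()
    if not path.exists():
        raise FileNotFoundError(f"File not found: {path}")
    if path.suffix != ".py":
        raise ValueError(f"Not a Python file: {path}")
    return path


def outcome(path: Path) -> str:
    """Return the name of the exception raised, or 'ok' if none."""
    try:
        check_python_path(path)
    except (FileNotFoundError, ValueError) as exc:
        return type(exc).__name__
    return "ok"


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Create three real files; absent.txt is deliberately never written.
    (root / "good.py").write_text("x = 1\n", encoding="utf-8")
    (root / "notes.txt").write_text("hello\n", encoding="utf-8")
    (root / "stub.pyi").write_text("x: int\n", encoding="utf-8")

    results = {
        name: outcome(root / name)
        for name in ("good.py", "notes.txt", "absent.txt", "stub.pyi")
    }

for name, status in results.items():
    print(f"{name}: {status}")
```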

parse_source(source)

Parse a Python source string into a tree-sitter Tree.

Parameters:

Name    Type  Description                    Default
source  str   Python source code as string.  required

Returns:

Type  Description
Tree  Parsed tree-sitter Tree.

Example

>>> tree = parse_source("def foo(): pass")
>>> tree.root_node.type
'module'

Source code in packages/axm-ast/src/axm_ast/core/parser.py
def parse_source(source: str) -> Tree:
    """Parse a Python source string into a tree-sitter Tree.

    Args:
        source: Python source code as string.

    Returns:
        Parsed tree-sitter Tree.

    Example:
        >>> tree = parse_source("def foo(): pass")
        >>> tree.root_node.type
        'module'
    """
    parser = _get_parser()
    return parser.parse(source.encode("utf-8"))
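
Because the source is encoded to UTF-8 before parsing, positions that tree-sitter reports on nodes (`start_byte`, `end_byte`) are offsets into the encoded bytes, not into the original `str`. The two differ as soon as the source contains non-ASCII characters. The sketch below shows the mismatch with plain byte and string indexing; it does not call tree-sitter, and `node_text` is a hypothetical helper, not part of `axm_ast`.

```python
source = 'name = "café"\nx = 1\n'
data = source.encode("utf-8")  # the same encoding step parse_source performs

# Position of the second statement, by character and by byte.
char_start = source.index("x = 1")
byte_start = data.index(b"x = 1")


def node_text(data: bytes, start_byte: int, end_byte: int) -> str:
    """Hypothetical helper: recover the text for a byte span."""
    return data[start_byte:end_byte].decode("utf-8")


# Slicing the bytes with byte offsets recovers the statement.
right = node_text(data, byte_start, byte_start + 5)
# Slicing the str with the same offsets is shifted by one character,
# because "é" occupies two bytes in UTF-8.
wrong = source[byte_start:byte_start + 5]

print(char_start, byte_start)  # 14 15
print(repr(right), repr(wrong))
```

When turning node spans back into text, slice the encoded bytes and decode the result rather than indexing the original string.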