codexis/AGENTS.md
2026-04-10 15:31:52 -04:00

# Codexis
Tree-sitter powered code indexer. Produces a SQLite database of symbols, files, and line numbers at `.codexis/index.db`.
## Usage
```bash
codexis [flags] [root] # default root is current directory
codexis . # index cwd → .codexis/index.db
codexis -force . # full re-index (ignore file hashes)
codexis -o /tmp/out.db . # custom output path
```
## Architecture
- **`main.go`** — CLI entry, schema creation, orchestration
- **`indexer/walker.go`** — Uses `git ls-files` to find files, `grammars.DetectLanguage()` to filter
- **`indexer/indexer.go`** — For each file: hash check → tree-sitter tag → store symbols
- **`indexer/scope.go`** — Package extraction (language-specific AST queries with filepath fallback), export detection
- **`db/`** — sqlc-generated code from `schema.sql` and `queries.sql`
- **`extension/`** — Pi coding agent extension providing `codexis` tool for LLM SQL queries
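
The walker step above can be sketched roughly as follows. This is a minimal illustration, not the contents of `indexer/walker.go`: the `-z` flag, the `parseLsFiles` and `listFiles` helpers, and the extension table standing in for `grammars.DetectLanguage()` are all assumptions made for the example.

```go
package main

import (
	"bytes"
	"fmt"
	"os/exec"
	"path/filepath"
)

// supported is a hypothetical stand-in for grammars.DetectLanguage():
// the real indexer asks the grammar registry instead of a fixed table.
var supported = map[string]bool{".go": true, ".py": true, ".ts": true}

// parseLsFiles splits NUL-separated `git ls-files -z` output and keeps
// only the paths whose extension appears in the supported table.
func parseLsFiles(out []byte) []string {
	var files []string
	for _, p := range bytes.Split(out, []byte{0}) {
		if len(p) == 0 {
			continue // trailing NUL produces an empty final element
		}
		if supported[filepath.Ext(string(p))] {
			files = append(files, string(p))
		}
	}
	return files
}

// listFiles runs `git ls-files -z` in root and filters the result.
func listFiles(root string) ([]string, error) {
	cmd := exec.Command("git", "ls-files", "-z")
	cmd.Dir = root
	out, err := cmd.Output()
	if err != nil {
		return nil, fmt.Errorf("git ls-files: %w", err)
	}
	return parseLsFiles(out), nil
}

func main() {
	files, err := listFiles(".")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, f := range files {
		fmt.Println(f)
	}
}
```

Relying on `git ls-files` means ignored and untracked files never reach the tagger, so no separate ignore-file handling is needed in this sketch.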
## Key Dependencies
- **`github.com/odvcencio/gotreesitter`** — Pure-Go tree-sitter runtime (no CGo). 206 grammars.
  - `grammars.DetectLanguage(filename)` → language detection
  - `grammars.ResolveTagsQuery(entry)` → symbol extraction queries (inferred if not explicit)
  - `gotreesitter.NewTagger(lang, query).Tag(src)` → returns `[]Tag` with kind, name, range
- **`github.com/mattn/go-sqlite3`** — SQLite driver
- **sqlc** — Generates Go from `db/schema.sql` + `db/queries.sql`
## Schema
Two tables: `files` and `symbols`. See `db/schema.sql`.
Symbol kinds (enforced via CHECK constraint): `function`, `method`, `class`, `type`, `interface`, `constant`, `variable`, `constructor`.
Parent-child relationships (e.g., method → class) are determined by range containment in the AST.
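
One way to picture range containment is the sketch below, which picks the smallest enclosing symbol as the parent. The `Symbol` struct, the line-based ranges, and the `findParent` helper are hypothetical simplifications for illustration; the actual indexer may compare byte or point ranges and store different fields.

```go
package main

import "fmt"

// Symbol is a hypothetical, simplified stand-in for a row in the symbols table.
type Symbol struct {
	Name      string
	Kind      string
	StartLine int
	EndLine   int
}

// span returns the number of lines a symbol covers beyond its first line.
func span(s Symbol) int { return s.EndLine - s.StartLine }

// contains reports whether outer's range encloses inner's range.
// Identical ranges are not treated as a parent-child pair.
func contains(outer, inner Symbol) bool {
	if outer.StartLine == inner.StartLine && outer.EndLine == inner.EndLine {
		return false
	}
	return outer.StartLine <= inner.StartLine && inner.EndLine <= outer.EndLine
}

// findParent returns the index of the smallest symbol enclosing syms[i],
// or -1 when syms[i] is a top-level symbol.
func findParent(syms []Symbol, i int) int {
	best := -1
	for j := range syms {
		if j == i || !contains(syms[j], syms[i]) {
			continue
		}
		if best == -1 || span(syms[j]) < span(syms[best]) {
			best = j
		}
	}
	return best
}

func main() {
	syms := []Symbol{
		{"Server", "class", 1, 40},
		{"Start", "method", 5, 12},
		{"helper", "function", 45, 50},
	}
	for i, s := range syms {
		if p := findParent(syms, i); p >= 0 {
			fmt.Printf("%s -> %s\n", s.Name, syms[p].Name)
		} else {
			fmt.Printf("%s -> (top level)\n", s.Name)
		}
	}
}
```

Choosing the smallest enclosing range matters for nested declarations: a method inside an inner class should point at the inner class, not the outer one.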
## Pi Extension
`extension/codexis.ts` registers a single `codexis` tool. Install:
```bash
# Symlink into pi extensions directory
ln -s "$(pwd)/codexis/extension" ~/.pi/agent/extensions/codexis
```
The tool finds `<git-root>/.codexis/index.db` automatically and runs read-only SQL queries against it. The schema is embedded in the tool description so the LLM knows the tables and valid enum values.
## Modifying
1. Schema changes: edit `db/schema.sql` + `db/queries.sql`, run `sqlc generate` in `db/`
2. New language package queries: add to `packageQueries` map in `indexer/scope.go`
3. Export detection heuristics: `IsExported()` in `indexer/scope.go`
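
To give a feel for what a name-based export heuristic can look like, here is a hypothetical sketch. It is not the `IsExported()` function from `indexer/scope.go`, whose signature and rules are not shown here; the `isExportedSketch` name, its parameters, and the two language rules (Go's leading uppercase letter, Python's leading underscore convention) are assumptions chosen for illustration.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
	"unicode/utf8"
)

// isExportedSketch is a hypothetical heuristic, not the repository's IsExported().
// It guesses visibility from the symbol name alone.
func isExportedSketch(lang, name string) bool {
	if name == "" {
		return false
	}
	switch lang {
	case "go":
		// Go exports identifiers whose first letter is uppercase.
		r, _ := utf8.DecodeRuneInString(name)
		return unicode.IsUpper(r)
	case "python":
		// By convention, a leading underscore marks a name as private.
		return !strings.HasPrefix(name, "_")
	default:
		// Assumption for this sketch: unknown languages default to exported.
		return true
	}
}

func main() {
	fmt.Println(isExportedSketch("go", "NewTagger"))
	fmt.Println(isExportedSketch("go", "parseFlags"))
	fmt.Println(isExportedSketch("python", "_helper"))
}
```

Languages with explicit modifiers such as `export` or `public` cannot be judged by name alone, which is presumably where AST-based checks come in.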
## Principles
- **KISS** — Use the tagger as-is. Don't write custom per-language extractors unless the tagger is insufficient.
- **YAGNI** — No query CLI, no web UI, no call graph. Just produce the `.db` file.
- **Incremental** — Files are skipped if their sha256 hash hasn't changed. Use `-force` to bypass.
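
The incremental check can be sketched as below. This is a minimal illustration under stated assumptions: the `stored` map stands in for hashes that the real tool would read from the `files` table, and `hashContent` and `needsIndex` are hypothetical helper names.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// hashContent returns the hex-encoded SHA-256 digest of a file's bytes.
func hashContent(src []byte) string {
	sum := sha256.Sum256(src)
	return hex.EncodeToString(sum[:])
}

// needsIndex reports whether a file must be (re)indexed.
// stored is a hypothetical path -> hash map from the previous run;
// force mirrors the -force flag and bypasses the comparison.
func needsIndex(stored map[string]string, path string, src []byte, force bool) bool {
	if force {
		return true
	}
	prev, ok := stored[path]
	return !ok || prev != hashContent(src)
}

func main() {
	src := []byte("package main\n")
	stored := map[string]string{"main.go": hashContent(src)}

	fmt.Println(needsIndex(stored, "main.go", src, false))               // false: unchanged
	fmt.Println(needsIndex(stored, "main.go", []byte("changed"), false)) // true: content differs
	fmt.Println(needsIndex(stored, "main.go", src, true))                // true: forced
}
```

Hashing content rather than comparing modification times keeps the check stable across checkouts and clones, where timestamps change but bytes do not.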