# Codexis
Tree-sitter powered code indexer. Produces a SQLite database of symbols, files, and line numbers at `.codexis/index.db`.
## Usage
```bash
codexis [flags] [root]     # default root is current directory

codexis .                  # index cwd → .codexis/index.db
codexis -force .           # full re-index (ignore file hashes)
codexis -o /tmp/out.db .   # custom output path
```
## Architecture
- **`main.go`** — CLI entry, schema creation, orchestration
- **`indexer/walker.go`** — Uses `git ls-files` to find files, `grammars.DetectLanguage()` to filter
- **`indexer/indexer.go`** — For each file: hash check → tree-sitter tag → store symbols
- **`indexer/scope.go`** — Package extraction (language-specific AST queries with filepath fallback), export detection
- **`db/`** — sqlc-generated code from `schema.sql` and `queries.sql`
- **`extension/`** — Pi coding agent extension providing `codexis` tool for LLM SQL queries
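
The file-discovery step described for `indexer/walker.go` can be sketched roughly as below. This is a minimal illustration, not the project's code: `parseLsFiles`, `filterSupported`, and the `supportedExt` map are hypothetical names, and the map merely stands in for `grammars.DetectLanguage()`, whose real behavior is not reproduced here.

```go
package main

import (
	"bytes"
	"fmt"
	"os/exec"
	"path/filepath"
)

// supportedExt is a hypothetical stand-in for grammars.DetectLanguage():
// it maps a file extension to a language name.
var supportedExt = map[string]string{
	".go": "go",
	".py": "python",
	".ts": "typescript",
}

// parseLsFiles splits the NUL-separated output of `git ls-files -z`
// into individual paths, dropping empty entries.
func parseLsFiles(out []byte) []string {
	var paths []string
	for _, p := range bytes.Split(out, []byte{0}) {
		if len(p) > 0 {
			paths = append(paths, string(p))
		}
	}
	return paths
}

// filterSupported keeps only paths whose extension maps to a known language.
func filterSupported(paths []string) []string {
	var kept []string
	for _, p := range paths {
		if _, ok := supportedExt[filepath.Ext(p)]; ok {
			kept = append(kept, p)
		}
	}
	return kept
}

func main() {
	// List the files tracked by git in the current repository.
	out, err := exec.Command("git", "ls-files", "-z").Output()
	if err != nil {
		fmt.Println("git ls-files failed:", err)
		return
	}
	for _, p := range filterSupported(parseLsFiles(out)) {
		fmt.Println(p)
	}
}
```

Using `-z` keeps paths with spaces or unusual characters intact, and relying on `git ls-files` means ignored and untracked files never reach the indexer.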
## Key Dependencies
- **`github.com/odvcencio/gotreesitter`** — Pure-Go tree-sitter runtime (no CGo). 206 grammars.
  - `grammars.DetectLanguage(filename)` → language detection
  - `grammars.ResolveTagsQuery(entry)` → symbol extraction queries (inferred if not explicit)
  - `gotreesitter.NewTagger(lang, query).Tag(src)` → returns `[]Tag` with kind, name, range
- **`github.com/mattn/go-sqlite3`** — SQLite driver
- **sqlc** — Generates Go from `db/schema.sql` + `db/queries.sql`
## Schema
Two tables: `files` and `symbols`. See `db/schema.sql`.

Symbol kinds (enforced via CHECK constraint): `function`, `method`, `class`, `type`, `interface`, `constant`, `variable`, `constructor`.

Parent-child relationships (e.g., method → class) are determined by range containment in the AST.
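
To make the range-containment rule concrete, here is a minimal sketch that picks the smallest enclosing symbol as the parent. The `Symbol` struct, its line-based ranges, and the `findParent` helper are assumptions made for illustration; the real columns are defined in `db/schema.sql`, and the indexer may compare byte offsets rather than lines.

```go
package main

import "fmt"

// Symbol is a hypothetical, simplified record; the real schema may differ.
type Symbol struct {
	Name      string
	Kind      string
	StartLine int
	EndLine   int
}

// contains reports whether outer's range encloses inner's range
// without the two ranges being identical.
func contains(outer, inner Symbol) bool {
	if outer.StartLine == inner.StartLine && outer.EndLine == inner.EndLine {
		return false
	}
	return outer.StartLine <= inner.StartLine && inner.EndLine <= outer.EndLine
}

// findParent returns the index of the smallest symbol enclosing syms[i],
// or -1 if syms[i] is at the top level.
func findParent(syms []Symbol, i int) int {
	parent := -1
	for j, cand := range syms {
		if j == i || !contains(cand, syms[i]) {
			continue
		}
		if parent == -1 ||
			cand.EndLine-cand.StartLine < syms[parent].EndLine-syms[parent].StartLine {
			parent = j
		}
	}
	return parent
}

func main() {
	syms := []Symbol{
		{"Server", "class", 1, 40},
		{"Start", "method", 5, 12},
		{"helper", "function", 50, 60},
	}
	for i, s := range syms {
		if p := findParent(syms, i); p >= 0 {
			fmt.Printf("%s -> %s\n", s.Name, syms[p].Name)
		} else {
			fmt.Printf("%s -> (top level)\n", s.Name)
		}
	}
}
```

Choosing the smallest enclosing range means a method inside a nested class is attached to the inner class rather than the outer one.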
## Pi Extension
`extension/codexis.ts` registers a single `codexis` tool. Install:
```bash
# Symlink into pi extensions directory (quoted so paths with spaces work)
ln -s "$(pwd)/codexis/extension" ~/.pi/agent/extensions/codexis
```
The tool finds `<git-root>/.codexis/index.db` automatically and runs read-only SQL queries against it. The schema is embedded in the tool description so the LLM knows the tables and valid enum values.
## Modifying
1. Schema changes: edit `db/schema.sql` + `db/queries.sql`, run `sqlc generate` in `db/`
2. New language package queries: add to `packageQueries` map in `indexer/scope.go`
3. Export detection heuristics: `IsExported()` in `indexer/scope.go`
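
As a rough idea of what an export-detection heuristic can look like, the sketch below applies two widely known conventions. It is not the actual `IsExported()` from `indexer/scope.go`: the function name `isExportedSketch`, its signature, and the choice of languages are all hypothetical.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
	"unicode/utf8"
)

// isExportedSketch is a hypothetical heuristic, not the project's IsExported().
func isExportedSketch(lang, name string) bool {
	if name == "" {
		return false
	}
	switch lang {
	case "go":
		// Go exports identifiers that begin with an upper-case letter.
		r, _ := utf8.DecodeRuneInString(name)
		return unicode.IsUpper(r)
	case "python":
		// By convention, a leading underscore marks a name as private.
		return !strings.HasPrefix(name, "_")
	default:
		// Without a language-specific rule, assume the symbol is visible.
		return true
	}
}

func main() {
	fmt.Println(isExportedSketch("go", "NewTagger"))
	fmt.Println(isExportedSketch("go", "walk"))
	fmt.Println(isExportedSketch("python", "_private"))
}
```

Keeping each rule in its own `case` makes it straightforward to add a language without touching the others.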
## Principles
- **KISS** — Use the tagger as-is. Don't write custom per-language extractors unless the tagger is insufficient.
- **YAGNI** — No query CLI, no web UI, no call graph. Just produce the `.db` file.
- **Incremental** — Files are skipped if their sha256 hash hasn't changed. Use `-force` to bypass.
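
The incremental rule above amounts to a digest comparison. The following sketch shows one way to express it, assuming the previous digest is stored as a hex string; `hashContent` and `needsIndex` are hypothetical names, and how the indexer actually stores and looks up hashes is not shown here.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// hashContent returns the hex-encoded SHA-256 digest of a file's contents.
func hashContent(content []byte) string {
	sum := sha256.Sum256(content)
	return hex.EncodeToString(sum[:])
}

// needsIndex reports whether a file should be (re)indexed.
// storedHash is the digest recorded on the previous run ("" for a new file);
// force mirrors the effect of the -force flag.
func needsIndex(content []byte, storedHash string, force bool) bool {
	if force {
		return true
	}
	return hashContent(content) != storedHash
}

func main() {
	src := []byte("package main\n")
	stored := hashContent(src)

	fmt.Println(needsIndex(src, stored, false))                               // unchanged file
	fmt.Println(needsIndex([]byte("package main // edited\n"), stored, false)) // edited file
	fmt.Println(needsIndex(src, stored, true))                                // forced re-index
}
```

Hashing contents rather than checking modification times keeps the skip decision stable across checkouts and clones, where timestamps change but file bytes do not.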