pi-web/README.md
Evan Reichard 67ce141b1b feat: add web_fetch tool, rename to pi-web
- Rename package to @evan/pi-web (repo rename handled separately).
- Rename existing 'search' tool to 'web_search' for consistency.
- Add 'web_fetch' tool: navigates via the shared headless Firefox,
  extracts via Mozilla Readability, falls back to <body> when no
  article is detected, converts with Turndown. 50KB cap, 15s nav
  timeout. Description steers LLM to curl for raw/non-text content.
- Reuses the shared driver, so search + fetch share one warm browser.
2026-05-25 11:51:46 -04:00

# evan/pi-web
Web tools for the [pi coding agent](https://github.com/mariozechner/pi-coding-agent). Registers two tools backed by a shared headless Firefox session that's kept warm for the lifetime of the pi process.

| Tool         | Purpose                                                            |
| ------------ | ------------------------------------------------------------------ |
| `web_search` | Search the web via Kagi (session token) or SearXNG (JSON API).     |
| `web_fetch`  | Fetch a URL and return readable markdown (Readability + Turndown). |

For raw HTTP responses, non-text content, or simple API calls, the LLM is steered toward `bash` + `curl` rather than `web_fetch`.
## Search Providers
| Provider | How it works | Requires |
| --------- | --------------------------------------------------------------------------------------------- | -------------------------------------------- |
| `kagi` | Drives a headless Firefox session against `kagi.com/search?token=…&q=…` and scrapes results. | Kagi session token, `firefox`, `geckodriver` |
| `searxng` | Calls a SearXNG instance's `/search?format=json` endpoint. | A SearXNG base URL with JSON format enabled |
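
As a rough illustration of the `searxng` row, the sketch below builds the request URL and reads the `results` array out of a JSON response body. The helper names (`buildSearxngUrl`, `parseSearxngResponse`) and the `SearxngResult` interface are hypothetical and not taken from this package; the `q` parameter and the `title` / `url` / `content` fields follow SearXNG's JSON output, but treat the shape as an assumption for your instance.

```typescript
// Hypothetical shape: only the fields this sketch reads from a SearXNG result.
interface SearxngResult {
  title: string;
  url: string;
  content?: string;
}

// Build the search URL from a base URL such as "https://search.example.com".
// Trailing slashes are trimmed so the path is not doubled.
function buildSearxngUrl(baseUrl: string, query: string): string {
  const root = baseUrl.replace(/\/+$/, "");
  return `${root}/search?format=json&q=${encodeURIComponent(query)}`;
}

// Parse a JSON response body and return its results (empty if none present).
function parseSearxngResponse(body: string): SearxngResult[] {
  const parsed = JSON.parse(body) as { results?: SearxngResult[] };
  return parsed.results ?? [];
}
```

If the instance does not have the JSON format enabled, the request will typically be rejected rather than return a parseable body, so real code would want to check the HTTP status before parsing.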
## Config
Drop a JSON file at `~/.pi/pi-search/config.json`:
```json
{
  "provider": "searxng",
  "kagi": {
    "token": "<your kagi session token>"
  },
  "searxng": {
    "baseUrl": "https://search.example.com"
  }
}
```
### Env Var Overrides
| Variable | Overrides |
| ----------------------- | ----------------- |
| `PI_SEARCH_PROVIDER` | `provider` |
| `KAGI_TOKEN` | `kagi.token` |
| `PI_SEARCH_SEARXNG_URL` | `searxng.baseUrl` |
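
One way to picture the override table is as a merge where each environment variable, when set, wins over the matching config key. The sketch below is illustrative only: `PiWebConfig` and `applyEnvOverrides` are hypothetical names, and the rule that any defined variable (even an empty string) takes precedence is an assumption, not confirmed behavior of this package.

```typescript
// Hypothetical config shape mirroring the JSON file above.
interface PiWebConfig {
  provider?: string;
  kagi?: { token?: string };
  searxng?: { baseUrl?: string };
}

// Return a new config where defined env vars take precedence over file values.
// Assumption: a variable counts as "set" whenever it is not undefined.
function applyEnvOverrides(
  fileConfig: PiWebConfig,
  env: Record<string, string | undefined>
): PiWebConfig {
  return {
    provider: env.PI_SEARCH_PROVIDER ?? fileConfig.provider,
    kagi: { token: env.KAGI_TOKEN ?? fileConfig.kagi?.token },
    searxng: {
      baseUrl: env.PI_SEARCH_SEARXNG_URL ?? fileConfig.searxng?.baseUrl,
    },
  };
}
```

Passing the environment in as a plain record, rather than reading a global directly, keeps the merge easy to exercise in isolation.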
### Getting A Kagi Session Token
Open `kagi.com`, sign in, then go to **Settings → Session Link**. Copy the `token=` value from the link. Treat it like a password — it grants full account access.
## Tools
### `web_search`
| Arg     | Type   | Description       |
| ------- | ------ | ----------------- |
| `query` | string | Search query text |

Returns a markdown list of `## [title](url)\n> description` items.
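
The output shape described above could be produced along these lines. This is a minimal sketch: `SearchResult` and `formatResults` are hypothetical names, and separating items with a blank line is an assumption, since the source only specifies the per-item format.

```typescript
// Hypothetical normalized result shared by both providers.
interface SearchResult {
  title: string;
  url: string;
  description: string;
}

// Render each result as a level-2 heading link followed by a quoted description.
function formatResults(results: SearchResult[]): string {
  return results
    .map((r) => `## [${r.title}](${r.url})\n> ${r.description}`)
    .join("\n\n");
}
```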
### `web_fetch`
| Arg   | Type   | Description           |
| ----- | ------ | --------------------- |
| `url` | string | Absolute URL to fetch |

Returns markdown of the page. Pipeline:
1. Navigate the shared Firefox session to the URL (15s timeout).
2. Run [Readability](https://github.com/mozilla/readability) to extract the article subtree.
3. If Readability finds nothing, fall back to the full `<body>`.
4. Convert with [Turndown](https://github.com/mixmark-io/turndown).
5. Truncate at 50KB with a clear marker.
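
The final truncation step might look something like the sketch below. Several details are assumptions: the function name `truncateUtf8`, the marker text, and reading "50KB" as 50 × 1024 UTF-8 bytes are all illustrative, since the source only says the output is capped with a clear marker. The byte count is computed by hand so the cut never lands inside a multi-byte character.

```typescript
// Assumed values: the real limit unit and marker text are not given in the source.
const DEFAULT_LIMIT_BYTES = 50 * 1024;
const DEFAULT_MARKER = "\n\n[truncated]";

// Cut `text` so its UTF-8 size stays within `maxBytes`, then append `marker`.
// Text that already fits is returned unchanged.
function truncateUtf8(
  text: string,
  maxBytes: number = DEFAULT_LIMIT_BYTES,
  marker: string = DEFAULT_MARKER
): string {
  let bytes = 0;
  let i = 0;
  while (i < text.length) {
    const unit = text.charCodeAt(i);
    let size = 3; // default: BMP character above U+07FF (or a lone surrogate)
    let units = 1; // UTF-16 code units consumed by this character
    if (unit < 0x80) {
      size = 1;
    } else if (unit < 0x800) {
      size = 2;
    } else if (unit >= 0xd800 && unit <= 0xdbff && i + 1 < text.length) {
      const next = text.charCodeAt(i + 1);
      if (next >= 0xdc00 && next <= 0xdfff) {
        size = 4; // surrogate pair encodes one 4-byte character
        units = 2;
      }
    }
    if (bytes + size > maxBytes) {
      return text.slice(0, i) + marker;
    }
    bytes += size;
    i += units;
  }
  return text;
}
```

Steps 2 through 4 depend on the Readability and Turndown libraries and are not reproduced here.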
## Install
```bash
cd ~/.pi/agent/extensions/pi-web
npm install
```
`pi` picks up the extension via the `pi.extensions` entry in `package.json`.