diff --git a/modules/home/programs/terminal/pi/config/skills/web-glimpse/SKILL.md b/modules/home/programs/terminal/pi/config/skills/web-glimpse/SKILL.md
deleted file mode 100644
index f0e44d8..0000000
--- a/modules/home/programs/terminal/pi/config/skills/web-glimpse/SKILL.md
+++ /dev/null
@@ -1,112 +0,0 @@
----
-name: web-glimpse
-description: 'Search the web, read pages, extract content, run JavaScript, or capture screenshots using the `glimpse` headless browser tool. Use when the user asks to search the web, look something up online, read/fetch a page, inspect dynamic content, or capture visual state. Does not replace curl for simple HTTP/API requests.'
----
-
-# Web Browsing With Glimpse
-
-`glimpse` runs headless Firefox via WebDriver. Use it for web search, reading rendered pages, running JavaScript, and screenshots. Prefer `curl` for simple APIs, static files, and direct downloads.
-
-## Commands
-
-| Command | Purpose |
-| ------- | ------- |
-| `reader ` | Extract page content as Markdown (Reader View → raw fallback) |
-| `exec ` | Run JavaScript on a page, return the result |
-| `screenshot ` | Save a PNG screenshot |
-| `search ` | Search the web (Kagi) and return results |
-| `serve` | Start a persistent browser for faster repeat commands |
-
-## Persistent Server
-
-For multi-command sessions, start a persistent browser server first. All subsequent commands auto-discover it via Unix socket — no extra flags needed.
-
-```bash
-# Start persistent server (keeps geckodriver + Firefox alive)
-glimpse serve &
-
-# All commands now reuse the running browser (~300ms vs ~2-3s each)
-glimpse reader https://example.com
-glimpse reader https://other.com
-glimpse exec https://example.com --js='return document.title'
-
-# Check server status
-glimpse serve --status
-
-# Stop when done
-glimpse serve --stop
-```
-
-State (cookies, localStorage) persists between commands — this is intentional for sticky sessions. Without a running server, commands work normally with ad-hoc browser startup.
-
-## Quick Reference
-
-```bash
-# Read a page (tries Reader View, falls back to raw Turndown)
-glimpse reader https://example.com --timeout=15
-
-# Read without Reader View (raw HTML → Markdown via Turndown)
-glimpse reader https://example.com --no-reader --timeout=15
-
-# Get structured JSON instead of Markdown (includes method: "reader"|"raw")
-glimpse reader https://example.com --format=json
-
-# Save extracted content to a file
-glimpse reader https://example.com --output=page.md
-
-# Run JavaScript and return a value
-glimpse exec https://example.com --js='return document.title'
-
-# Extract specific data with JavaScript
-glimpse exec https://example.com --wait-until=complete --js='return {
-  title: document.title,
-  text: document.body.innerText.slice(0, 4000)
-}'
-
-# Wait for dynamic content before extracting
-glimpse reader https://example.com \
-  --wait-js='return document.querySelector(".content")?.innerText?.length > 100' \
-  --timeout=30
-
-# Capture a screenshot
-glimpse screenshot https://example.com --output=page.png
-
-# Search the web
-glimpse search "query terms" --timeout=15
-
-# Search and get JSON instead of Markdown
-glimpse search "query terms" --format=json
-```
-
-## Common Options
-
-| Option | Default | Purpose |
-| ------ | ------- | ------- |
-| `--timeout=` | `10` | Max wait time in seconds; increase for slow/JS-heavy pages |
-| `--wait-until=` | `none` | Wait for `none`, `interactive`, or `complete` |
-| `--wait-js=` | — | Poll JS expression until truthy |
-| `--js=` | — | Run inline JS before command logic |
-| `--script=` | — | Run JS file before command logic |
-| `--no-headless` | — | Show the browser window |
-| `--format=` | varies | Output format (reader: `markdown`/`html`/`text`/`json`; search: `markdown`/`json`) |
-| `--output=` | — | Write output to file (reader, screenshot) |
-| `--no-reader` | — | Skip Reader View, use raw page extraction |
-
-## Workflow
-
-1. **Search first** when the user asks an open-ended question. Pick authoritative results to read.
-2. **Read pages with `reader`** — it tries Firefox Reader View for clean article extraction, then falls back to converting the raw page HTML to Markdown via Turndown. Most pages work without extra options.
-3. **Add `--wait-until=complete`** for JS-heavy pages, SPAs, or pages that load content dynamically.
-4. **Use `exec`** when you need targeted data extraction via JavaScript rather than full page content.
-5. **Use `screenshot`** when visual layout, charts, or rendering state matters.
-6. **Increase timeouts** — start at `15`, go to `30` for slow sites. The default `10` is often too tight for real-world pages.
-7. **Cite URLs** when summarizing web research. Distinguish search snippets from verified page content.
-
-## Error Handling
-
-| Error | Fix |
-| ----- | --- |
-| `TIMEOUT` | Increase `--timeout` (in seconds), add `--wait-until=complete`, or use `--wait-js` |
-| `USAGE_ERROR` | Check arg order: `glimpse `, search is `glimpse search "query"` |
-| Thin/empty content | Try `--wait-until=complete`, `--no-reader`, or targeted `exec` |
-| Search auth errors | Kagi token is configured via `~/.config/glimpse/config.json` or `KAGI_TOKEN` env |
diff --git a/modules/home/programs/terminal/pi/default.nix b/modules/home/programs/terminal/pi/default.nix
index 3e39b39..d62231a 100755
--- a/modules/home/programs/terminal/pi/default.nix
+++ b/modules/home/programs/terminal/pi/default.nix
@@ -17,6 +17,7 @@ let
   # writing other fields (current model, etc.) without us clobbering them.
   piPackages = [
     "https://gitea.va.reichard.io/evan/pi-lsp.git@main"
+    "https://gitea.va.reichard.io/evan/pi-web.git@main"
     "https://gitea.va.reichard.io/evan/pi-subagents.git@main"
     "https://gitea.va.reichard.io/evan/pi-statusline.git@main"
   ];
@@ -130,6 +131,17 @@ in
         })
         piAuthApiKeys
       );
+      # Pi Web Config - Sops template so the kagi token (declared by the
+      # glimpse module, which pi enables above) can be embedded alongside
+      # the non-secret searxng base URL.
+      templates."pi-web.json" = {
+        path = "${config.home.homeDirectory}/.pi/pi-web/config.json";
+        content = builtins.toJSON {
+          provider = "searxng";
+          kagi.token = "${config.sops.placeholder.kagi_token}";
+          searxng.baseUrl = "https://search.va.reichard.io";
+        };
+      };
       templates."pi-models.json" = {
         path = "${config.home.homeDirectory}/.pi/agent/models.json";
         content = builtins.toJSON {