Files
glimpse/README.md

297 lines
7.5 KiB
Markdown

# glimpse
Small Firefox/Selenium browser utilities packaged with Nix.
The project provides a `glimpse` CLI with subcommands for page automation and provider-backed search. It runs Firefox headless by default. The Nix package wraps the binary so `firefox` and `geckodriver` are available on `PATH`.
## Requirements
### Nix Usage
Nix is the easiest way to run this project. It provides Node.js, Firefox, and geckodriver.
```bash
nix run .#glimpse -- exec https://example.com --js='return document.title'
```
### Local Node Usage
If running directly with Node.js, install dependencies and make sure `firefox` and `geckodriver` are available on `PATH`.
```bash
npm install
npm run build
node dist/src/index.js exec https://example.com --js='return document.title'
```
## Glimpse CLI
```bash
glimpse <command> [options]
```
Common options:
- `--config=<file>` - read config from a custom path instead of `~/.config/glimpse/config.json`
- `--no-headless` - show Firefox instead of running headless
- `--url=<server>` - connect to an existing WebDriver server
- `--timeout=<ms>` - maximum wait time in milliseconds for command waits (default: `10000`)
- `--wait-js=<code>` - poll JavaScript until it returns a truthy value before command-specific behavior
- `--wait-until=<state>` - wait for document readiness: `none`, `interactive`, or `complete` (default: `none`)
- `--js=<code>` - execute inline JavaScript after loading the page and before command-specific behavior
- `--script=<file>` - execute JavaScript from a file after loading the page and before command-specific behavior
Running `glimpse` with no arguments or with `--help` prints human-readable help. Runtime and validation errors are emitted as structured JSON on stderr with `ok: false`, a stable error `code`, a human-readable `message`, and `elapsedMs`.
### Snapshot A Page
Return an agent-friendly structured view of a page.
```bash
nix run .#glimpse -- snapshot https://example.com
```
Wait for asynchronous page state before extracting the snapshot:
```bash
nix run .#glimpse -- snapshot https://example.com --wait-until=complete --wait-js='return document.body.innerText.length > 0'
```
Output:
```json
{
"ok": true,
"url": "https://example.com/",
"title": "Example Domain",
"result": {
"text": "Example Domain\nThis domain is for use in illustrative examples...",
"headings": [
{
"level": 1,
"text": "Example Domain"
}
],
"links": [
{
"text": "More information...",
"href": "https://www.iana.org/domains/example"
}
],
"buttons": [],
"inputs": [],
"forms": []
}
}
```
Snapshot includes page text, headings, links, buttons, inputs, and forms. Heading extraction is best-effort and does not fail the snapshot if heading metadata cannot be extracted.
### Reader View To Markdown
Open a page with Firefox Reader View and print the readable article content as Markdown.
```bash
nix run .#glimpse -- reader https://example.com/article
```
Write Markdown to a file:
```bash
nix run .#glimpse -- reader https://example.com/article --output=article.md
```
Other formats are available:
```bash
nix run .#glimpse -- reader https://example.com/article --format=json
nix run .#glimpse -- reader https://example.com/article --format=html
nix run .#glimpse -- reader https://example.com/article --format=text
```
Options:
- `--format=<format>` - output `markdown`, `html`, `text`, or `json` (default: `markdown`)
- `--output=<file>` - write output to a file
### Execute JavaScript
JavaScript execution is available as a top-level option for every `glimpse` command. The script runs after loading the page and before command-specific behavior.
Use the `exec` command when you only want to print the returned JavaScript value.
Inline JavaScript:
```bash
nix run .#glimpse -- exec https://example.com --js='return document.title'
```
JavaScript from a file:
```bash
nix run .#glimpse -- exec https://example.com --script=extract.js
```
The script should explicitly return a value:
```javascript
return {
title: document.title,
links: Array.from(document.querySelectorAll("a")).map((a) => a.href),
};
```
Objects and arrays are printed as formatted JSON. Primitive values are printed directly.
### Screenshot A Page
Save a PNG screenshot after loading a page.
```bash
nix run .#glimpse -- screenshot https://example.com --output=example.png
```
Run JavaScript before taking the screenshot:
```bash
nix run .#glimpse -- screenshot https://example.com --js='document.body.style.zoom = "80%"' --output=example.png
```
If `--output` is omitted, the screenshot is saved to `screenshot.png`.
Output:
```json
{
"ok": true,
"result": {
"path": "example.png"
},
"elapsedMs": 842
}
```
### Search With A Provider
Search using a supported provider and print a JSON array of results. Currently only Kagi is supported.
Kagi requires a token from `--token=<token>`, `KAGI_TOKEN`, or the glimpse config file. The token is validated by the Kagi provider and sent to Kagi as the `token` query parameter.
Default config path:
```text
~/.config/glimpse/config.json
```
Example config:
```json
{
"search": {
"provider": "kagi"
},
"providers": {
"kagi": {
"token": "your-kagi-token"
}
}
}
```
Then search without exposing the token in command arguments:
```bash
nix run .#glimpse -- search "nix flakes selenium webdriver"
```
Local usage:
```bash
./result/bin/glimpse search "nix flakes selenium webdriver"
```
Options:
- `--provider=<provider>` - search provider: `kagi` (default: config or `kagi`)
- `--token=<token>` - Kagi token (default: `KAGI_TOKEN` or config)
- `--no-headless` - show Firefox instead of running headless
- `--url=<server>` - connect to an existing WebDriver server
- `--timeout=<ms>` - wait time for results before returning `[]` (default: `10000`)
Output is a JSON array of search results:
```json
[
{
"title": "Result title",
"url": "https://example.com",
"description": "Result description"
}
]
```
## Build
Build the default package, which contains `glimpse`:
```bash
nix build .#default
```
Run the built tool:
```bash
./result/bin/glimpse exec https://example.com --js='return document.title'
./result/bin/glimpse search "example query"
```
## Development
Enter the dev shell:
```bash
nix develop
```
Run linting:
```bash
npm run lint
```
Run smoke tests. These require Firefox and geckodriver on `PATH` and use local `data:` HTML pages.
```bash
npm test
```
Run focused smoke tests by tag when iterating on a specific area:
```bash
npm run test:list
npm run test:snapshot
npm run test:wait
npm run test:errors
node test/smoke.js snapshot js
```
Useful local commands:
```bash
npm run build
node dist/src/index.js snapshot 'data:text/html,<title>Hello</title><h1>Hello</h1>'
node dist/src/index.js exec 'data:text/html,<title>Hello</title>' --js='return document.title'
node dist/src/index.js screenshot 'data:text/html,<title>Hello</title>' --output=/tmp/page.png
node dist/src/index.js reader 'https://example.com/article'
```
## Project Structure
- `src/index.ts` - `glimpse` CLI with subcommands, including Firefox Reader View extraction and provider-backed search
- `src/config.ts` - home-dir config loading for CLI defaults and provider settings
- `src/driver.ts` - Firefox WebDriver creation and geckodriver resolution
- `src/providers/kagi.ts` - reusable Kagi search provider implementation
- `tsconfig.json` - TypeScript compiler settings; build output goes to `dist/`
- `flake.nix` - Nix dev shell, package, wrappers, and apps
- `KAGI.md` - Kagi-specific notes