# glimpse

Small Firefox/Selenium browser utilities packaged with Nix.

The project provides a `glimpse` CLI with subcommands for page automation and provider-backed search. It runs Firefox headless by default. The Nix package wraps the binary so `firefox` and `geckodriver` are available on `PATH`.

## Requirements

### Nix Usage

Nix is the easiest way to run this project. It provides Node.js, Firefox, and geckodriver.

```bash
nix run .#glimpse -- exec https://example.com --js='return document.title'
```

### Local Node Usage

To run directly with Node.js, make sure `firefox` and `geckodriver` are available on `PATH`, then install dependencies, build, and run the compiled entry point.

```bash
npm install
npm run build
node dist/src/index.js exec https://example.com --js='return document.title'
```

## Glimpse CLI

```bash
glimpse <command> [options]
```

Common options:

- `--config=<file>` - read config from a custom path instead of `~/.config/glimpse/config.json`
- `--no-headless` - show Firefox instead of running headless
- `--url=<server>` - connect to an existing WebDriver server
- `--timeout=<ms>` - maximum time in milliseconds that a command spends waiting (default: `10000`)
- `--wait-js=<code>` - poll the given JavaScript until it returns a truthy value, before command-specific behavior runs
- `--wait-until=<state>` - wait for document readiness: `none`, `interactive`, or `complete` (default: `none`)
- `--js=<code>` - execute inline JavaScript after loading the page and before command-specific behavior
- `--script=<file>` - execute JavaScript from a file after loading the page and before command-specific behavior
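The `--wait-until` states can be read as thresholds on page load progress. The sketch below assumes they map onto the browser's `document.readyState` values (`loading`, `interactive`, `complete`), which this README does not confirm; the `isReady` helper and its type names are hypothetical and not part of `glimpse`.

```typescript
type WaitUntil = "none" | "interactive" | "complete";
type ReadyState = "loading" | "interactive" | "complete";

// Order in which document.readyState advances while a page loads.
const READY_ORDER: ReadyState[] = ["loading", "interactive", "complete"];

// Hypothetical helper: has the page reached the requested readiness state?
// Assumes --wait-until compares against document.readyState.
function isReady(waitUntil: WaitUntil, readyState: ReadyState): boolean {
  if (waitUntil === "none") return true; // no readiness requirement
  return READY_ORDER.indexOf(readyState) >= READY_ORDER.indexOf(waitUntil);
}
```

Under this reading, `interactive` is also satisfied once the page is `complete`, while `complete` is not satisfied by `interactive`.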

Running `glimpse` with no arguments or with `--help` prints human-readable help. Runtime and validation errors are emitted as structured JSON on stderr with `ok: false`, a stable error `code`, a human-readable `message`, and `elapsedMs`.
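Callers that script around `glimpse` can check the error fields listed above with a small parser. This is a minimal sketch, not part of the project: the `GlimpseError` name and the `parseGlimpseError` helper are hypothetical, and it assumes the captured stderr text contains exactly one JSON object.

```typescript
// Hypothetical shape of a glimpse error, based only on the fields named above.
interface GlimpseError {
  ok: false;
  code: string;
  message: string;
  elapsedMs: number;
}

// Parse captured stderr text; return null if it is not a structured error.
function parseGlimpseError(stderr: string): GlimpseError | null {
  let value: unknown;
  try {
    value = JSON.parse(stderr);
  } catch {
    return null; // not JSON, for example human-readable help text
  }
  if (typeof value !== "object" || value === null) return null;
  const { ok, code, message, elapsedMs } = value as Record<string, unknown>;
  if (
    ok === false &&
    typeof code === "string" &&
    typeof message === "string" &&
    typeof elapsedMs === "number"
  ) {
    return { ok: false, code, message, elapsedMs };
  }
  return null; // JSON, but not the documented error shape
}
```

Branching on `code` rather than on `message` keeps callers independent of wording changes, since only the code is described as stable.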

### Snapshot A Page

Return an agent-friendly structured view of a page.

```bash
nix run .#glimpse -- snapshot https://example.com
```

Wait for asynchronous page state before extracting the snapshot:

```bash
nix run .#glimpse -- snapshot https://example.com --wait-until=complete --wait-js='return document.body.innerText.length > 0'
```

Output:

```json
{
  "ok": true,
  "url": "https://example.com/",
  "title": "Example Domain",
  "result": {
    "text": "Example Domain\nThis domain is for use in illustrative examples...",
    "headings": [
      {
        "level": 1,
        "text": "Example Domain"
      }
    ],
    "links": [
      {
        "text": "More information...",
        "href": "https://www.iana.org/domains/example"
      }
    ],
    "buttons": [],
    "inputs": [],
    "forms": []
  }
}
```

The snapshot includes page text, headings, links, buttons, inputs, and forms. Heading extraction is best-effort: if heading metadata cannot be extracted, the snapshot still succeeds.
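For consumers written in TypeScript, the example output above can be described with a type and a couple of helpers. This is a sketch under stated assumptions: the `SnapshotOutput`, `outline`, and `uniqueHrefs` names are hypothetical, and the element shapes of `buttons`, `inputs`, and `forms` are not shown in this README, so they are left as `unknown`.

```typescript
// Field names mirror the example output above; element shapes for
// buttons, inputs, and forms are not documented here, so they stay unknown.
interface SnapshotOutput {
  ok: boolean;
  url: string;
  title: string;
  result: {
    text: string;
    headings: { level: number; text: string }[];
    links: { text: string; href: string }[];
    buttons: unknown[];
    inputs: unknown[];
    forms: unknown[];
  };
}

// Render headings as an indented outline, two spaces per heading level.
function outline(snapshot: SnapshotOutput): string {
  return snapshot.result.headings
    .map((h) => "  ".repeat(Math.max(0, h.level - 1)) + h.text)
    .join("\n");
}

// Collect unique link targets in first-seen order.
function uniqueHrefs(snapshot: SnapshotOutput): string[] {
  return Array.from(new Set(snapshot.result.links.map((l) => l.href)));
}
```

Because heading extraction is best-effort, `outline` may return an empty string for pages whose headings could not be extracted.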

### Reader View To Markdown

Open a page with Firefox Reader View and print the readable article content as Markdown.

```bash
nix run .#glimpse -- reader https://example.com/article
```

Write Markdown to a file:

```bash
nix run .#glimpse -- reader https://example.com/article --output=article.md
```

Other formats are available:

```bash
nix run .#glimpse -- reader https://example.com/article --format=json
nix run .#glimpse -- reader https://example.com/article --format=html
nix run .#glimpse -- reader https://example.com/article --format=text
```

Options:

- `--format=<format>` - output `markdown`, `html`, `text`, or `json` (default: `markdown`)
- `--output=<file>` - write output to a file

### Execute JavaScript

JavaScript execution is available to every `glimpse` command through the top-level `--js` and `--script` options. The script runs after the page loads and before command-specific behavior.

Use the `exec` command when you only want to print the returned JavaScript value.

Inline JavaScript:

```bash
nix run .#glimpse -- exec https://example.com --js='return document.title'
```

JavaScript from a file:

```bash
nix run .#glimpse -- exec https://example.com --script=extract.js
```

The script should explicitly return a value:

```javascript
return {
  title: document.title,
  links: Array.from(document.querySelectorAll("a")).map((a) => a.href),
};
```

Objects and arrays are printed as formatted JSON. Primitive values are printed directly.
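The printing rule above can be sketched as a single function. This is an illustration rather than the project's implementation: the `formatResult` name is hypothetical, and the two-space indent and the handling of `null` and `undefined` are assumptions.

```typescript
// Sketch of the printing rule: objects and arrays become formatted JSON,
// primitives become plain text. The two-space indent is an assumption.
function formatResult(value: unknown): string {
  if (typeof value === "object" && value !== null) {
    return JSON.stringify(value, null, 2);
  }
  // Assumption: null and undefined are printed via String(), like other primitives.
  return String(value);
}
```

With this rule, a returned string such as a page title is printed without surrounding quotes, which makes `exec` output easy to use in shell pipelines.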

### Screenshot A Page

Save a PNG screenshot after loading a page.

```bash
nix run .#glimpse -- screenshot https://example.com --output=example.png
```

Run JavaScript before taking the screenshot:

```bash
nix run .#glimpse -- screenshot https://example.com --js='document.body.style.zoom = "80%"' --output=example.png
```

If `--output` is omitted, the screenshot is saved to `screenshot.png`.

Output:

```json
{
  "ok": true,
  "result": {
    "path": "example.png"
  },
  "elapsedMs": 842
}
```

### Search With A Provider

Search using a supported provider and print a JSON array of results. Currently only Kagi is supported.

Kagi requires a token from `--token=<token>`, `KAGI_TOKEN`, or the glimpse config file. The token is validated by the Kagi provider and sent to Kagi as the `token` query parameter.

Default config path:

```text
~/.config/glimpse/config.json
```

Example config:

```json
{
  "search": {
    "provider": "kagi"
  },
  "providers": {
    "kagi": {
      "token": "your-kagi-token"
    }
  }
}
```

Then search without exposing the token in command arguments:

```bash
nix run .#glimpse -- search "nix flakes selenium webdriver"
```

Using a locally built binary (see Build):

```bash
./result/bin/glimpse search "nix flakes selenium webdriver"
```

Options:

- `--provider=<provider>` - search provider: `kagi` (default: config or `kagi`)
- `--token=<token>` - Kagi token (default: `KAGI_TOKEN` or config)
- `--no-headless` - show Firefox instead of running headless
- `--url=<server>` - connect to an existing WebDriver server
- `--timeout=<ms>` - wait time for results before returning `[]` (default: `10000`)
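The token sources described above can be resolved with a simple fallback chain. This sketch assumes the precedence is flag, then `KAGI_TOKEN`, then config file, which this README does not state explicitly; the `GlimpseConfig` type and `resolveKagiToken` helper are hypothetical names, and the environment is passed in as a plain object to keep the example self-contained.

```typescript
// Config shape taken from the example config above.
interface GlimpseConfig {
  search?: { provider?: string };
  providers?: { kagi?: { token?: string } };
}

// Resolve the Kagi token. The flag > environment > config order is an
// assumption made for this sketch.
function resolveKagiToken(
  flagToken: string | undefined,
  env: Record<string, string | undefined>,
  config: GlimpseConfig,
): string | undefined {
  const candidates = [flagToken, env["KAGI_TOKEN"], config.providers?.kagi?.token];
  // Treat empty strings as missing so an empty KAGI_TOKEN does not mask the config.
  return candidates.find((t) => t !== undefined && t.length > 0);
}
```

Returning `undefined` when no source provides a token leaves it to the caller to report a missing-token error.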

Output is a JSON array of search results:

```json
[
  {
    "title": "Result title",
    "url": "https://example.com",
    "description": "Result description"
  }
]
```
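A consumer can validate this array before using it. The following is a minimal sketch based only on the three fields shown above; the `SearchResult` type and the `isSearchResult` and `parseSearchResults` helpers are hypothetical names, and dropping malformed entries instead of failing is a design choice of this example.

```typescript
// Result shape taken from the example output above.
interface SearchResult {
  title: string;
  url: string;
  description: string;
}

// Type guard: true only when all three documented fields are strings.
function isSearchResult(value: unknown): value is SearchResult {
  if (typeof value !== "object" || value === null) return false;
  const { title, url, description } = value as Record<string, unknown>;
  return (
    typeof title === "string" &&
    typeof url === "string" &&
    typeof description === "string"
  );
}

// Parse captured stdout; throws if it is not a JSON array, and silently
// drops entries that do not match the documented shape.
function parseSearchResults(stdout: string): SearchResult[] {
  const parsed: unknown = JSON.parse(stdout);
  if (!Array.isArray(parsed)) throw new Error("expected a JSON array of results");
  return parsed.filter(isSearchResult);
}
```

An empty array is a valid result here, which matches the documented behavior of returning `[]` when no results arrive before the timeout.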

## Build

Build the default package, which contains `glimpse`:

```bash
nix build .#default
```

Run the built tool:

```bash
./result/bin/glimpse exec https://example.com --js='return document.title'
./result/bin/glimpse search "example query"
```

## Development

Enter the dev shell:

```bash
nix develop
```

Run linting:

```bash
npm run lint
```

Run smoke tests. These require Firefox and geckodriver on `PATH` and use local `data:` HTML pages.

```bash
npm test
```

Run focused smoke tests by tag when iterating on a specific area:

```bash
npm run test:list
npm run test:snapshot
npm run test:wait
npm run test:errors
node test/smoke.js snapshot js
```

Useful local commands:

```bash
npm run build
node dist/src/index.js snapshot 'data:text/html,<title>Hello</title><h1>Hello</h1>'
node dist/src/index.js exec 'data:text/html,<title>Hello</title>' --js='return document.title'
node dist/src/index.js screenshot 'data:text/html,<title>Hello</title>' --output=/tmp/page.png
node dist/src/index.js reader 'https://example.com/article'
```

## Project Structure

- `src/index.ts` - `glimpse` CLI with subcommands, including Firefox Reader View extraction and provider-backed search
- `src/config.ts` - loads config from the home directory for CLI defaults and provider settings
- `src/driver.ts` - Firefox WebDriver creation and geckodriver resolution
- `src/providers/kagi.ts` - reusable Kagi search provider implementation
- `tsconfig.json` - TypeScript compiler settings; build output goes to `dist/`
- `flake.nix` - Nix dev shell, package, wrappers, and apps
- `KAGI.md` - Kagi-specific notes