Commit Graph

176 Commits

SHA1 Message Date
a01f9e34ee chore: tweak ctx 2026-05-12 09:27:42 -04:00
9824728ccb feat(pi): add pi-subagents extension 2026-05-12 08:41:06 -04:00
9ec2d61fcc chore(llama-swap): bump llama-cpp to b9048 and swap in UD-Q4/Q6 MTP configs
Replace qwen3.6-27b-thinking and qwen3.6-27b-mtp-thinking with
qwen3.6-27b-udq4-thinking (single GPU) and qwen3.6-27b-udq6-thinking
(dual GPU). Update aliases and concurrent set accordingly.
2026-05-11 15:26:39 -04:00
4df32ad273 fix(llama-swap): allow qwen thinking by default 2026-05-11 09:51:01 -04:00
ecad94aab3 fix(llama-swap): update vllm timings patch 2026-05-11 09:40:13 -04:00
187c717383 fix(pi extension): simplify regex for replacing 'pi' with 'claude code' 2026-05-11 09:06:30 -04:00
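The "simplify regex" commit above (and the earlier guard that preserves pi-coding-agent) describes a word-level substitution that must not touch hyphenated identifiers. The extension's actual code isn't shown here; as a minimal Python sketch of that rule, lookarounds reject any 'pi' adjacent to a word character or hyphen:

```python
import re

def replace_pi(text: str) -> str:
    """Replace the standalone word 'pi' with 'claude code'.

    The lookbehind/lookahead reject word characters and hyphens on
    either side, so identifiers like 'pi-coding-agent' (and words
    containing 'pi', like 'api') are left untouched.
    """
    return re.sub(r"(?<![\w-])pi(?![\w-])", "claude code", text)
```

For example, `replace_pi("ask pi to run pi-coding-agent")` rewrites only the first, standalone `pi`.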
352e99c732 feat(llama-swap): add gemma-4-26b-vision model config 2026-05-10 16:59:35 -04:00
6fff658f9d feat(llama-swap): add --default-chat-template-kwargs to vLLM 3090 configs
Sync all three Qwen3.6 27B vLLM configs (tools-text, long-text,
long-vision) with club-3090 83bf73d. Adds disable-thinking flag
and introduces upstream hash tracking comments for future syncs.

Update update-vllm-3090-configs skill to use hash-based skip logic.
2026-05-10 16:57:17 -04:00
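The commit above mentions "upstream hash tracking comments" and a skill that uses "hash-based skip logic" when syncing configs. The real skill isn't shown; a Python sketch of the idea (the `# upstream: <hash>` comment format is an assumption for illustration) is to record the synced upstream commit in each config and skip files whose recorded hash already matches:

```python
import re

# Hypothetical marker format: a comment like "# upstream: 83bf73d"
HASH_RE = re.compile(r"^#\s*upstream:\s*([0-9a-f]+)", re.MULTILINE)

def needs_sync(local_text: str, upstream_hash: str) -> bool:
    """Return True unless the local config already records this upstream hash."""
    m = HASH_RE.search(local_text)
    return m is None or m.group(1) != upstream_hash
```

A config carrying `# upstream: 83bf73d` would then be skipped when re-syncing against the same upstream commit.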
b41e9f2a84 docs(pi): add agent knowledge capture guidance 2026-05-09 10:18:13 -04:00
b25a933dd0 docs(pi): tighten agent guidance 2026-05-09 10:16:34 -04:00
37b0fae7e2 fix(llama-swap): sync qwen vllm 3090 configs 2026-05-09 10:16:32 -04:00
f3cc67b17d chore(llama-swap): tune presence penalty to 1.5 and remove repeat penalty 2026-05-07 20:37:54 -04:00
fea5cc887d feat(llama-swap): add Qwen3.6-27B MTP thinking model and bump llama-cpp to b9045
Add qwen3.6-27b-mtp-thinking model config with 150K context, MTP
speculative decoding, and thinking mode support. Bump llama-cpp
from b9009 to b9045 and apply MTP patch from upstream PR #22673.
2026-05-06 12:49:49 -04:00
d3ccbda958 refactor(llama-swap): update Genesis to v7.69 and upgrade vLLM nightly image
- Bump Genesis pin from 2db18df to 7b9fd319
- Upgrade vllm/vllm-openai nightly from 7a1eb8ac to 01d4d1ad
- Remove standalone boot-time patches now folded into Genesis (patch_tolist_cudagraph,
  patch_inputs_embeds_optional, patch_workspace_lock_disable, patch_pn25_genesis_register_fix,
  patch_pn30_dst_shaped_temp_fix, patch_pr40798_workspace)
- Reorganize environment variables across all vLLM compose configs
- Add new Genesis optimizations: P100, P101, P103, P15B, P38B, PN59 streaming GDN
2026-05-05 15:50:37 -04:00
3bccdc6382 chore(llama-swap): disable GENESIS_ENABLE_P85 2026-05-04 12:53:11 -04:00
0e3658615a fix(home/pi): use config instead of osConfig for sops checks, cleanup formatting 2026-05-04 12:20:59 -04:00
3095515963 feat(conduit): add package and home module 2026-05-04 00:00:06 -04:00
4701a97a91 feat(pi): add statusline package 2026-05-03 11:41:36 -04:00
6e20baf883 feat(pi): merge sops-managed auth keys 2026-05-02 22:44:03 -04:00
d1f17a18b4 docs(web-glimpse): simplify and streamline skill documentation
Condense the web-glimpse SKILL.md from verbose multi-section format to a
compact quick-reference style. Key changes:
- Consolidate usage patterns into a single quick reference block
- Replace separate sections per command with a concise command table
- Simplify workflow guidance and error handling into scannable tables
- Update timeout values from milliseconds to seconds
- Document new --no-reader and --format options
- Remove redundant answering guidelines
2026-05-02 20:32:36 -04:00
7c1519881a refactor(llama-swap): generate sops secrets from apiKeys list 2026-05-02 15:48:15 -04:00
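The refactor above derives the sops secret set from a single apiKeys list instead of declaring each secret by hand. The actual module is Nix; a Python sketch of the shape of that generation (secret path names and attributes here are hypothetical) maps each key name to one secret entry:

```python
def secrets_for(api_keys: list[str]) -> dict[str, dict]:
    """Generate one sops secret entry per API key name.

    Illustrative only: the real repo does this in Nix, and the
    'llama-swap/api-key-*' paths are made up for the sketch.
    """
    return {f"llama-swap/api-key-{k}": {"owner": "llama-swap"} for k in api_keys}
```

Adding a key (such as the 'evan' key from a later commit) then only requires appending one list element.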
8d45977154 chore(llama-swap): update Genesis to v7.69, add cliff 2 optimizations
- Bump Genesis pin from fc89395 to 2db18df (v7.69)
- Add PN32 GDN chunked prefill and PN34 workspace lock relax env vars
- Replace patch_workspace_lock_disable with patch_inputs_embeds_optional
- Remove setup-time PN25/PN30 patches (folded into v7.69 natively)
- Switch patch base URL to v7.69-cliff2-test branch
- Lower GPU memory utilization to 0.93 for long-text variant
- Remove python3 from preflight check prerequisites
- Add printing service to lin-va-thinkpad
2026-05-02 13:48:41 -04:00
40114f438f feat(llama-swap): sync vLLM configs from club-3090, add evan API key
Sync all three vLLM model configs from club-3090 master (ae4846f).
Update to Genesis v7.65 full PROD env set with new patches.
Update docker image to nightly-7a1eb8ac. Add torch_compile and
triton cache dirs. Add agent setup guide (AGENTS.md).

Add 'evan' API key to llama-swap sops secrets.
2026-05-02 08:27:47 -04:00
ba30222962 feat(pi): add skills and improve AGENTS.md reading guidelines
- Add 'create-skill' skill for scaffolding new skill directories
- Add 'planning' skill for structured implementation workflows
- Add search-then-read pattern guidance to AGENTS.md
2026-05-01 23:26:53 -04:00
e4d40d89d9 feat: add api keys to llama-swap 2026-05-01 22:12:51 -04:00
43a1d66e6b add 9b vision 2026-05-01 21:51:06 -04:00
1283b7cdef add fim 4b + 9b 2026-05-01 21:42:50 -04:00
09fdff4908 refactor(llama-swap): reorganize models by GPU hardware section 2026-05-01 21:08:53 -04:00
88308602c8 feat(llama-swap): add concurrent model matrix for CUDA0/CUDA1
Allow one CUDA0 and one CUDA1 model to run simultaneously. Dual-GPU
models (using -ts splits) are excluded from the matrix so they evict
everything when loaded. vLLM docker models get evict_cost=50 to
discourage eviction due to slow cold starts.
2026-05-01 16:50:28 -04:00
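The commit body above describes the eviction rule behind the concurrent matrix: one CUDA0 and one CUDA1 model may coexist, while a dual-GPU (-ts split) model clears everything it overlaps with. As a Python sketch of that rule only (llama-swap's actual config is YAML, and the evict_cost weighting is not modeled here), a model is evicted exactly when its GPU set intersects the incoming model's:

```python
def to_evict(running: dict[str, set[str]], new_gpus: set[str]) -> set[str]:
    """Return the running models that must stop before a new model loads.

    A running model is evicted when it shares any GPU with the incoming
    one; a dual-GPU model requesting {CUDA0, CUDA1} therefore evicts
    both single-GPU residents.
    """
    return {name for name, gpus in running.items() if gpus & new_gpus}
```

With `running = {"qwen-cuda0": {"CUDA0"}, "gemma-cuda1": {"CUDA1"}}`, a new CUDA0 model evicts only `qwen-cuda0`, while a CUDA0+CUDA1 model evicts both.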
1812d2ea03 feat(pi): add vision model support
Extract a shared hasType helper for model filtering and add
vision (text + image) input capability to compatible models.
Also tag two llama-swap models with the vision type.
2026-05-01 15:03:26 -04:00
ab63211a75 wip 2026-05-01 14:36:36 -04:00
561f10d2a7 fix: timing & vllm 2026-05-01 13:09:28 -04:00
a3b2efa5bb feat: vllm timings patch 2026-05-01 10:57:59 -04:00
74ff71803b feat: vllm yay 2026-05-01 10:38:43 -04:00
75eba8703f add: vllm base 3.6 27b 2026-04-30 21:47:29 -04:00
3d55b6e675 chore: add lfs 2026-04-30 20:14:41 -04:00
990b6a4392 feat: vllm 2026-04-30 20:04:58 -04:00
bcba8f6b60 feat(address-gh-review): add thread resolution with reactions and comments 2026-04-30 14:20:01 -04:00
93e2247a30 chore(nixos/llama-swap): remove synthetic peer and tune local model args 2026-04-30 11:43:04 -04:00
31363f5f8d docs(pi): add _scratch guidance for ephemeral artifacts 2026-04-30 08:32:03 -04:00
976edab339 config(llama-swap): enable preserve_thinking in chat template kwargs 2026-04-30 07:45:57 -04:00
eef4d78cb3 feat(home): add pass-backed keyring module and enable for work VM
- Add modules/home/security/pass-keyring with GPG agent, pass, and
  python keyring backend config for headless credential storage
- Enable pass-keyring for lin-va-mbp-work-vm
- Update bash PATH from ~/.bin to ~/.local/bin
2026-04-27 23:02:22 -04:00
b85b01bcaa feat: add web-glimpse skill for headless browser tasks 2026-04-27 10:45:44 -04:00
04296e282c feat: add glimpse module 2026-04-27 10:25:20 -04:00
005ba2244b feat(pi): add glimpse browser automation CLI 2026-04-27 08:12:39 -04:00
a39a314674 feat(pi): manage pi extension packages via nix module 2026-04-26 08:59:00 -04:00
e8bc4e4da7 fix(pi): guard prompt replacement to anthropic-only and preserve pi-coding-agent 2026-04-26 08:58:58 -04:00
fc1f2404d0 refactor(nixos): move supportedFilesystems nfs to common boot module 2026-04-24 07:25:06 -04:00
1070642635 feat(llama-swap): add qwen3.6-27b-thinking model 2026-04-22 13:01:38 -04:00
c3d433ddaf feat(nvim): add manual mode for LSP servers
Allow LSP servers to be enabled on-demand via a buffer-local command
instead of auto-starting on matching filetypes. The command name is
auto-derived from the server name (e.g. 'GolangciLint'). Switch
golangci-lint to manual mode as it's resource-heavy and not always needed.
2026-04-22 13:01:32 -04:00
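The nvim commit above auto-derives a buffer-local command name from the LSP server name (e.g. 'GolangciLint' from golangci-lint). The real config is Lua; the derivation rule it describes can be sketched in Python as PascalCasing the separator-split server name:

```python
def command_name(server: str) -> str:
    """Derive a Vim user-command name from an LSP server name.

    Splits on hyphens/underscores and capitalizes each part, so
    'golangci-lint' becomes 'GolangciLint'. Treating underscores the
    same as hyphens is an assumption of this sketch.
    """
    return "".join(part.capitalize() for part in server.replace("_", "-").split("-"))
```

A resource-heavy server like golangci-lint can then stay off until the user invokes `:GolangciLint` in a buffer.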