
evan/nix

Author	SHA1	Message	Date
Evan Reichard	328bb6e1db	feat(llama-swap): add ik-llama-cpp package and Qwen3.6-27B MTP config Add ikawrakow/ik_llama.cpp as a new package with CUDA/Vulkan support, enabling MTP (Multi-Token Prediction) and IQ4_KS quantization. Wire it into llama-swap with a new 'ik-qwen3.6-27b-iq4ks-thinking' model config and 'iq36' alias. Also add a chat template download to the vLLM setup script and include the binary on lin-va-desktop.	2026-05-12 16:19:34 -04:00
Evan Reichard	ecad94aab3	fix(llama-swap): update vllm timings patch	2026-05-11 09:40:13 -04:00
Evan Reichard	37b0fae7e2	fix(llama-swap): sync qwen vllm 3090 configs	2026-05-09 10:16:32 -04:00
Evan Reichard	d3ccbda958	refactor(llama-swap): update Genesis to v7.69 and upgrade vLLM nightly image - Bump Genesis pin from 2db18df to 7b9fd319 - Upgrade vllm/vllm-openai nightly from 7a1eb8ac to 01d4d1ad - Remove standalone boot-time patches now folded into Genesis (patch_tolist_cudagraph, patch_inputs_embeds_optional, patch_workspace_lock_disable, patch_pn25_genesis_register_fix, patch_pn30_dst_shaped_temp_fix, patch_pr40798_workspace) - Reorganize environment variables across all vLLM compose configs - Add new Genesis optimizations: P100, P101, P103, P15B, P38B, PN59 streaming GDN	2026-05-05 15:50:37 -04:00
Evan Reichard	8d45977154	chore(llama-swap): update Genesis to v7.69, add cliff 2 optimizations - Bump Genesis pin from fc89395 to 2db18df (v7.69) - Add PN32 GDN chunked prefill and PN34 workspace lock relax env vars - Replace patch_workspace_lock_disable with patch_inputs_embeds_optional - Remove setup-time PN25/PN30 patches (folded into v7.69 natively) - Switch patch base URL to v7.69-cliff2-test branch - Lower GPU memory utilization to 0.93 for long-text variant - Remove python3 from preflight check prerequisites - Add printing service to lin-va-thinkpad	2026-05-02 13:48:41 -04:00
Evan Reichard	40114f438f	feat(llama-swap): sync vLLM configs from club-3090, add evan API key Sync all three vLLM model configs from club-3090 master (ae4846f). Update to Genesis v7.65 full PROD env set with new patches. Update docker image to nightly-7a1eb8ac. Add torch_compile and triton cache dirs. Add agent setup guide (AGENTS.md). Add 'evan' API key to llama-swap sops secrets.	2026-05-02 08:27:47 -04:00
Evan Reichard	ab63211a75	wip	2026-05-01 14:36:36 -04:00
Evan Reichard	561f10d2a7	fix: timing & vllm	2026-05-01 13:09:28 -04:00
Evan Reichard	74ff71803b	feat: vllm yay	2026-05-01 10:38:43 -04:00

9 Commits