feat(llama-swap): add ik-llama-cpp package and Qwen3.6-27B MTP config
Add ikawrakow/ik_llama.cpp as a new package with CUDA/Vulkan support, enabling MTP (Multi-Token Prediction) and IQ4_KS quantization. Wire it into llama-swap with a new 'ik-qwen3.6-27b-iq4ks-thinking' model config and 'iq36' alias. Also add a chat template download to the vLLM setup script and include the binary on lin-va-desktop.
@@ -9,6 +9,7 @@ let
   nvidia-smi = "${config.hardware.nvidia.package.bin}/bin/nvidia-smi";
   llama-cpp = pkgs.reichard.llama-cpp;
+  ik-llama-cpp = pkgs.reichard.ik-llama-cpp;
   stable-diffusion-cpp = pkgs.reichard.stable-diffusion-cpp.override {
     cudaSupport = true;
   };
@@ -129,6 +130,7 @@ in
     # Local Packages
     llama-cpp
+    ik-llama-cpp
     stable-diffusion-cpp
   ];
 }
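
For reference, the llama-swap entry for the new model looks roughly like the minimal sketch below. The models/cmd/aliases layout follows llama-swap's YAML config format; the server binary name, model path, port macro, and flags shown here are illustrative assumptions, not the exact values wired up in this commit.

models:
  "ik-qwen3.6-27b-iq4ks-thinking":
    # Server binary provided by the ik-llama-cpp package; actual name/path may differ.
    cmd: >
      llama-server
      --port ${PORT}
      -m /models/qwen3.6-27b-iq4_ks.gguf
      -ngl 99
    aliases:
      - "iq36"

With an entry like this, requests addressed to either the full model name or the 'iq36' alias cause llama-swap to start (or swap to) the ik_llama.cpp server process for this model.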