- Increased context window from 80k to 202,752 tokens
- Added CUDA device specification for GPU acceleration
- Optimized for GLM 4.7 Flash (30B) model performance
14 KiB