agent-evals/LEADERBOARD.md
2026-02-06 19:36:09 -05:00


# Leaderboard
## Overall Rankings
1. **[pi - Kimi K2.5](../../../src/branch/eval/pi-kimi-k2.5/)** - UI: 9/10
   - Fixes: Files List, Editor Scrolling, Duplicate `.md` Name

   <details>
   <summary>Screenshot</summary>

   ![pi-kimi-k2.5](../../../raw/branch/eval/pi-kimi-k2.5/screenshot.png)

   </details>

2. **[pi - Qwen3 Coder Next (80B)](../../../src/branch/eval/pi-qwen3-coder-next-80b/)** - UI: 8/10
   - Fixes: New File, Markdown Styling

   <details>
   <summary>Screenshot</summary>

   ![pi-qwen3-coder-next-80b](../../../raw/branch/eval/pi-qwen3-coder-next-80b/screenshot.png)

   </details>

3. **[pi - GLM4.7](../../../src/branch/eval/pi-glm4.7/)** - UI: 7/10
   - Fixes: Files List, Markdown Styling, Delete Failure

   <details>
   <summary>Screenshot</summary>

   ![pi-glm4.7](../../../raw/branch/eval/pi-glm4.7/screenshot.png)

   </details>

## By Parameters
**< 100B (Local):** Qwen3 Coder Next (80B)

**> 100B (Hosted):** Kimi K2.5