diff --git a/NOTES.md b/NOTES.md
deleted file mode 100644
index 8ad511c..0000000
--- a/NOTES.md
+++ /dev/null
@@ -1,14 +0,0 @@
-### Rankings
-
-1. eval/pi-qwen-coder-next-80b
-    - Almost one shot
-    - Nicest UI
-    - Invalid route (first fix commit) - very simple fix
-    - Not displaying content (second fix commit) - simple fix
-2. eval/pi-glm4.7-flash
-    - FE wouldnt run on first try
-    - Had to downgrade a bunch of deps so it would run
-    - Used legacy packages
-    - UI is meh
-3. eval/pi-devstral-small-2
-    - Wouldnt run
diff --git a/RESULTS.md b/RESULTS.md
new file mode 100644
index 0000000..100bc10
--- /dev/null
+++ b/RESULTS.md
@@ -0,0 +1,45 @@
+## Description
+
+## Agentic Tools
+
+- [pi](https://github.com/badlogic/pi-mono/tree/main/packages/coding-agent)
+- [opencode](https://github.com/anomalyco/opencode)
+- [claude-code](https://github.com/anthropics/claude-code)
+
+## Grading
+
+Purely opinion based, but based on
+- One-shot performance
+- Follow up fixes
+- UI design
+
+## Rankings
+
+### Overall
+
+1. eval/pi-kimi-k2.5
+2. eval/pi-qwen3-coder-next-80b
+3. eval/pi-glm4.7
+
+### Parameters: < 100B (Local)
+
+1. eval/pi-qwen3-coder-next-80b
+    - UI: 8/10
+    - Fixes:
+        - New File
+        - Markdown Styling
+
+### Parameters: > 100B (Hosted)
+
+1. eval/pi-kimi-k2.5
+    - UI: 9/10
+    - Fixes:
+        - Files List
+        - Editor Scrolling
+        - Duplicate `.md` Name
+2. eval/pi-glm4.7
+    - UI: 7/10
+    - Fixes:
+        - Files List
+        - Markdown Styling
+        - Delete Failure