Commit Graph

6 Commits

SHA1 Message Date
49879ed358 chore: add leaderboard 2026-02-06 19:40:02 -05:00
304622aafc chore: update results 2026-02-06 17:22:05 -05:00
cd6eea2a7b docs: add model rankings and observations 2026-02-05 19:45:55 -05:00
- pi-qwen-coder-next-80b: Near one-shot success with UI advantages, minor routing issues
- pi-glm4.7-flash: Dependency issues, legacy packages, average UI
- pi-devstral-small-2: Runtime issues
98757e7746 chore: add gitignore to root directory 2026-02-05 16:02:54 -05:00
d9b143f4f8 docs: update specification to use TypeScript frontend 2026-02-03 20:55:02 -05:00
- Replace 'Vanilla JavaScript' with 'TypeScript' in frontend requirements
- Update flake.nix to use nodejs package instead of tailwindcss
- Clarify testing requirements to cover both frontend and backend
- Fix punctuation in evaluation checklist
fece98f5ee Initial: branch-oriented eval framework 2026-01-30 10:32:50 -05:00