Leaderboard

Run ID Model Lang Rating Timestamp
20260206-130232 moonshotai/kimi-k2.5 go 2600 2026-02-06T15:37:23.215559Z
20260205-113922 gemini-3-pro-preview go 2600 2026-02-05T13:20:09.460344Z
20260227-140328 gemini-3.1-pro-preview go 2600 2026-02-27T17:09:12.976022Z
20260227-140237 gpt-5 go 2600 2026-02-27T16:31:08.908533Z
20260214-083437 moonshotai/kimi-k2.5 go 2400 2026-02-14T17:29:47.203815Z
20260228-151134 gpt-5 go 2400 2026-02-28T18:08:17.872336Z
20260218-151827 z-ai/glm-5 go 2400 2026-02-18T20:52:08.206596Z
20260126-054408 gemini-3-pro-preview go 2200 2026-01-26T07:44:58.798378Z
20260222-064656 gemini-3.1-pro-preview go 2200 2026-02-22T08:54:16.202942Z
20260223-104458 anthropic/claude-sonnet-4.6 go 2200 2026-02-23T11:15:55.994256Z
20260225-063403 gemini-3.1-pro-preview go 2200 2026-02-25T08:03:45.589978Z
20260227-135649 x-ai/grok-4 go 2200 2026-02-27T23:09:10.882665Z
20260213-090515 moonshotai/kimi-k2.5 go 2000 2026-02-13T12:48:38.234151Z
20260228-152401 moonshotai/kimi-k2.5 go 2000 2026-02-28T19:31:41.112549Z
20260217-144912 x-ai/grok-4 go 1800 2026-02-17T22:12:51.358371Z
20260206-090110 anthropic/claude-opus-4.6 go 1800 2026-02-06T09:49:09.362864Z
20260226-141715 gpt-5.2 go 1400 2026-02-26T14:40:50.080389Z
20260218-065301 gemini-3-pro-preview go -14800 2026-02-18T08:04:16.81538Z

Evaluation History for 20260223-104458

Eval ID Run ID Model Lang Problem Rating Success Timestamp Prompt Response Stdout Stderr
841 20260223-104458 anthropic/claude-sonnet-4.6 go 878B (CF) 2300 false 2026-02-23T11:15:55.982526Z View View View View
840 20260223-104458 anthropic/claude-sonnet-4.6 go 538F (CF) 2200 true 2026-02-23T11:15:40.778041Z View View View View
839 20260223-104458 anthropic/claude-sonnet-4.6 go 530F (CF) 2100 true 2026-02-23T11:15:15.083638Z View View View View
838 20260223-104458 anthropic/claude-sonnet-4.6 go 401D (CF) 2000 true 2026-02-23T11:15:05.15413Z View View View View
837 20260223-104458 anthropic/claude-sonnet-4.6 go 535D (CF) 1900 true 2026-02-23T11:14:54.660042Z View View View View
836 20260223-104458 anthropic/claude-sonnet-4.6 go 371D (CF) 1800 true 2026-02-23T11:14:18.325768Z View View View View
835 20260223-104458 anthropic/claude-sonnet-4.6 go 1744E2 (CF) 1900 false 2026-02-23T11:14:11.165788Z View View View View
834 20260223-104458 anthropic/claude-sonnet-4.6 go 188H (CF) 1800 true 2026-02-23T11:13:55.550087Z View View View View
833 20260223-104458 anthropic/claude-sonnet-4.6 go 437D (CF) 1900 false 2026-02-23T11:13:51.793251Z View View View View
832 20260223-104458 anthropic/claude-sonnet-4.6 go 1672F1 (CF) 2000 false 2026-02-23T11:13:41.621001Z View View View View
831 20260223-104458 anthropic/claude-sonnet-4.6 go 850B (CF) 2100 false 2026-02-23T11:13:34.623547Z View View View View
830 20260223-104458 anthropic/claude-sonnet-4.6 go 494B (CF) 2000 true 2026-02-23T11:13:12.211596Z View View View View
829 20260223-104458 anthropic/claude-sonnet-4.6 go 1558B (CF) 1900 true 2026-02-23T11:12:50.80664Z View View View View
828 20260223-104458 anthropic/claude-sonnet-4.6 go 1680E (CF) 2000 false 2026-02-23T11:12:31.672095Z View View View View
827 20260223-104458 anthropic/claude-sonnet-4.6 go 1482E (CF) 2100 false 2026-02-23T11:12:09.072717Z View View View View
826 20260223-104458 anthropic/claude-sonnet-4.6 go 1993D (CF) 2200 false 2026-02-23T11:11:51.204848Z View View View View
825 20260223-104458 anthropic/claude-sonnet-4.6 go 992D (CF) 2100 true 2026-02-23T11:10:48.567382Z View View View View
824 20260223-104458 anthropic/claude-sonnet-4.6 go 1584D (CF) 2000 true 2026-02-23T11:09:23.411065Z View View View View
823 20260223-104458 anthropic/claude-sonnet-4.6 go 1365E (CF) 1900 true 2026-02-23T11:09:05.071807Z View View View View
822 20260223-104458 anthropic/claude-sonnet-4.6 go 1525D (CF) 1800 true 2026-02-23T11:08:47.299141Z View View View View
821 20260223-104458 anthropic/claude-sonnet-4.6 go 1118E (CF) 1700 true 2026-02-23T11:08:38.600399Z View View View View
820 20260223-104458 anthropic/claude-sonnet-4.6 go 939D (CF) 1600 true 2026-02-23T11:08:15.722802Z View View View View
819 20260223-104458 anthropic/claude-sonnet-4.6 go 773A (CF) 1700 false 2026-02-23T11:08:09.827208Z View View View View
818 20260223-104458 anthropic/claude-sonnet-4.6 go 794C (CF) 1800 false 2026-02-23T11:07:55.023508Z View View View View
817 20260223-104458 anthropic/claude-sonnet-4.6 go 696B (CF) 1700 true 2026-02-23T11:07:44.952742Z View View View View
816 20260223-104458 anthropic/claude-sonnet-4.6 go 150B (CF) 1600 true 2026-02-23T11:07:27.57574Z View View View View
815 20260223-104458 anthropic/claude-sonnet-4.6 go 1783C (CF) 1700 false 2026-02-23T11:07:20.000672Z View View View View
814 20260223-104458 anthropic/claude-sonnet-4.6 go 52B (CF) 1600 true 2026-02-23T11:06:36.04974Z View View View View
813 20260223-104458 anthropic/claude-sonnet-4.6 go 2029C (CF) 1700 false 2026-02-23T11:06:30.079682Z View View View View
812 20260223-104458 anthropic/claude-sonnet-4.6 go 73A (CF) 1600 true 2026-02-23T10:49:42.714312Z View View View View
811 20260223-104458 anthropic/claude-sonnet-4.6 go 2104E (CF) 1700 false 2026-02-23T10:49:20.308259Z View View View View
810 20260223-104458 anthropic/claude-sonnet-4.6 go 174C (CF) 1800 false 2026-02-23T10:49:00.335485Z View View View View
809 20260223-104458 anthropic/claude-sonnet-4.6 go 1216C (CF) 1700 true 2026-02-23T10:48:47.465286Z View View View View
808 20260223-104458 anthropic/claude-sonnet-4.6 go 255D (CF) 1800 false 2026-02-23T10:48:36.578291Z View View View View
807 20260223-104458 anthropic/claude-sonnet-4.6 go 827A (CF) 1700 true 2026-02-23T10:48:07.766477Z View View View View
806 20260223-104458 anthropic/claude-sonnet-4.6 go 768B (CF) 1600 true 2026-02-23T10:47:54.894766Z View View View View
805 20260223-104458 anthropic/claude-sonnet-4.6 go 1478C (CF) 1700 false 2026-02-23T10:47:38.928271Z View View View View
804 20260223-104458 anthropic/claude-sonnet-4.6 go 2093F (CF) 1800 false 2026-02-23T10:47:20.870416Z View View View View
803 20260223-104458 anthropic/claude-sonnet-4.6 go 281B (CF) 1700 true 2026-02-23T10:46:55.433929Z View View View View
802 20260223-104458 anthropic/claude-sonnet-4.6 go 814C (CF) 1600 true 2026-02-23T10:46:49.415998Z View View View View
801 20260223-104458 anthropic/claude-sonnet-4.6 go 2122C (CF) 1700 false 2026-02-23T10:46:32.8848Z View View View View
800 20260223-104458 anthropic/claude-sonnet-4.6 go 55B (CF) 1600 true 2026-02-23T10:46:14.437353Z View View View View
799 20260223-104458 anthropic/claude-sonnet-4.6 go 165B (CF) 1500 true 2026-02-23T10:46:03.587641Z View View View View
798 20260223-104458 anthropic/claude-sonnet-4.6 go 711B (CF) 1400 true 2026-02-23T10:45:59.76476Z View View View View
797 20260223-104458 anthropic/claude-sonnet-4.6 go 1978C (CF) 1300 true 2026-02-23T10:45:46.287252Z View View View View
796 20260223-104458 anthropic/claude-sonnet-4.6 go 787A (CF) 1200 true 2026-02-23T10:45:31.69316Z View View View View
795 20260223-104458 anthropic/claude-sonnet-4.6 go 227B (CF) 1100 true 2026-02-23T10:45:19.586369Z View View View View
794 20260223-104458 anthropic/claude-sonnet-4.6 go 118B (CF) 1000 true 2026-02-23T10:45:15.76639Z View View View View
793 20260223-104458 anthropic/claude-sonnet-4.6 go 1401A (CF) 900 true 2026-02-23T10:45:09.407725Z View View View View
792 20260223-104458 anthropic/claude-sonnet-4.6 go 432A (CF) 800 true 2026-02-23T10:45:04.542295Z View View View View
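
Read oldest first (Eval ID 792 up to 841), the rows above appear to follow a simple staircase: after a success the next problem is rated 100 higher, after a failure 100 lower. Applying the same step to the last row (2300, false) gives 2200, which matches this run's rating on the leaderboard. This rule is inferred from the data, not documented on the page, so the sketch below is only an illustration under that assumption. The names `STEP`, `next_rating`, and `final_rating` are hypothetical.

```python
from typing import List, Tuple

# Assumed step size, inferred from the rating column; not documented by the page.
STEP = 100


def next_rating(rating: int, success: bool, step: int = STEP) -> int:
    """Move one step up after a success, one step down after a failure."""
    return rating + step if success else rating - step


def final_rating(history: List[Tuple[int, bool]], step: int = STEP) -> int:
    """Check that a chronological (rating, success) history follows the
    assumed staircase and return the rating implied after the last row."""
    if not history:
        raise ValueError("history must contain at least one evaluation")
    # Each row's outcome should determine the rating of the row after it.
    for (rating, success), (following, _) in zip(history, history[1:]):
        expected = next_rating(rating, success, step)
        if following != expected:
            raise ValueError(
                f"expected {expected} after ({rating}, {success}), got {following}"
            )
    last_rating, last_success = history[-1]
    return next_rating(last_rating, last_success, step)


# Last five evaluations of run 20260223-104458 (Eval IDs 837 to 841), oldest first.
tail = [(1900, True), (2000, True), (2100, True), (2200, True), (2300, False)]
print(final_rating(tail))  # 2200
```

Passing all 50 rows in chronological order runs the same check over the whole run. A `ValueError` would flag any row that departs from the assumed rule.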