[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"article-claude-vs-gpt-vs-gemini-coding-benchmark-leaderboard-en":3,"article-related-claude-vs-gpt-vs-gemini-coding-benchmark-leaderboard-en":33,"series-industry-2c317df8-4070-4c74-bab5-48f79fe2860e":81},{"id":4,"slug":5,"title":6,"content":7,"summary":8,"source":9,"source_url":10,"author":11,"image_url":12,"cover_image":12,"category":13,"language":14,"translated_content":11,"related_article_id":15,"keywords":16,"key_takeaways":25,"views":29,"created_at":30,"published_at":31,"topic_cluster_id":32},"2c317df8-4070-4c74-bab5-48f79fe2860e","claude-vs-gpt-vs-gemini-coding-benchmark-leaderboard-en","Claude vs GPT vs Gemini: Coding Benchmark Leaderboard","\u003Cp data-speakable=\"summary\">A June 2026 coding \u003Ca href=\"\u002Ftag\u002Fbenchmark\">benchmark\u003C\u002Fa> comparison of Claude, GPT, and Gemini for model buyers.\u003C\u002Fp>\u003Cp>On the table are \u003Ca href=\"https:\u002F\u002Ftygartmedia.com\u002Fclaude-vs-gpt-vs-gemini-coding-benchmark\u002F\">Claude\u003C\u002Fa>, \u003Ca href=\"https:\u002F\u002Ftygartmedia.com\u002Fclaude-vs-gpt-vs-gemini-coding-benchmark\u002F\">GPT\u003C\u002Fa>, and \u003Ca href=\"https:\u002F\u002Ftygartmedia.com\u002Fclaude-vs-gpt-vs-gemini-coding-benchmark\u002F\">Gemini\u003C\u002Fa>, and this comparison helps you decide which one fits coding work when price, context, and benchmark evidence do not line up cleanly.\u003C\u002Fp>\u003Ch2>At a glance\u003C\u002Fh2>\u003Ctable>\u003Cthead>\u003Ctr>\u003Cth>Dimension\u003C\u002Fth>\u003Cth>Claude Fable 5\u003C\u002Fth>\u003Cth>Claude Opus 4.8\u003C\u002Fth>\u003Cth>GPT-5.5\u003C\u002Fth>\u003Cth>Gemini 3.1 Pro\u003C\u002Fth>\u003C\u002Ftr>\u003C\u002Fthead>\u003Ctbody>\u003Ctr>\u003Ctd>Input \u002F output price per 1M tokens\u003C\u002Ftd>\u003Ctd>$10 \u002F $50\u003C\u002Ftd>\u003Ctd>$5 \u002F $25\u003C\u002Ftd>\u003Ctd>$5 \u002F $30\u003C\u002Ftd>\u003Ctd>$2 \u002F $12 up to 200K, then $4 \u002F $18\u003C\u002Ftd>\u003C\u002Ftr>\u003Ctr>\u003Ctd>Context window\u003C\u002Ftd>\u003Ctd>1M tokens\u003C\u002Ftd>\u003Ctd>1M tokens\u003C\u002Ftd>\u003Ctd>1,050,000 tokens\u003C\u002Ftd>\u003Ctd>1M tokens\u003C\u002Ftd>\u003C\u002Ftr>\u003Ctr>\u003Ctd>Max output\u003C\u002Ftd>\u003Ctd>128K tokens\u003C\u002Ftd>\u003Ctd>128K tokens\u003C\u002Ftd>\u003Ctd>128K tokens\u003C\u002Ftd>\u003Ctd>64K tokens\u003C\u002Ftd>\u003C\u002Ftr>\u003Ctr>\u003Ctd>Published coding score we could verify\u003C\u002Ftd>\u003Ctd>Not machine-verifiable\u003C\u002Ftd>\u003Ctd>Not machine-verifiable\u003C\u002Ftd>\u003Ctd>83.4% Terminal-Bench 2.1, per competitor attribution\u003C\u002Ftd>\u003Ctd>80.6% SWE-bench Verified; 54.2% SWE-bench Pro Public; 2887 Elo LiveCodeBench Pro\u003C\u002Ftd>\u003C\u002Ftr>\u003Ctr>\u003Ctd>Knowledge cutoff\u003C\u002Ftd>\u003Ctd>Not stated on overview\u003C\u002Ftd>\u003Ctd>Jan 2026\u003C\u002Ftd>\u003Ctd>Dec 1, 2025\u003C\u002Ftd>\u003Ctd>Not stated on model card\u003C\u002Ftd>\u003C\u002Ftr>\u003Ctr>\u003Ctd>Verification note\u003C\u002Ftd>\u003Ctd>Official score table was image-based\u003C\u002Ftd>\u003Ctd>Official score table was image-based\u003C\u002Ftd>\u003Ctd>Primary page was not machine-readable to the fetcher\u003C\u002Ftd>\u003Ctd>Scores came from Google’s official model card\u003C\u002Ftd>\u003C\u002Ftr>\u003C\u002Ftbody>\u003C\u002Ftable>\u003Ch2>Claude: strongest on breadth, weaker on public benchmark visibility\u003C\u002Fh2>\u003Cp>Claude Fable 5 and Claude Opus 4.8 are the least tidy options to compare because \u003Ca href=\"\u002Ftag\u002Fanthropic\">Anthropic\u003C\u002Fa>’s public coding tables were not machine-readable on the verification date. That does not mean they are weak models. It means the leaderboard-style proof is harder to extract from the source, so buyers have to lean more on the spec sheet, the product tier, and their own tests.\u003C\u002Fp>\n\u003Cfigure class=\"my-6\">\u003Cimg src=\"https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1781939876788-ivgw.png\" alt=\"Claude vs GPT vs Gemini: Coding Benchmark Leaderboard\" class=\"rounded-xl w-full\" loading=\"lazy\" \u002F>\u003C\u002Ffigure>\n\u003Cp>The practical distinction is price and positioning. Claude Opus 4.8 is the better value inside Anthropic’s lineup at $5 per million input tokens and $25 per million output tokens, with a 1M-token context window and 128K output. Fable 5 doubles that to $10 \u002F $50, which signals a premium tier for teams that want Anthropic’s top release even when the benchmark table is not easy to quote back.\u003C\u002Fp>\u003Ch2>GPT: a strong middle ground if you want long context and a readable score\u003C\u002Fh2>\u003Cp>GPT-5.5 is the cleanest \u003Ca href=\"\u002Ftag\u002Fopenai\">OpenAI\u003C\u002Fa> option in this comparison because the pricing and context specs are easy to verify, and Anthropic’s page attributes a Terminal-Bench 2.1 score of 83.4% to it. The caveat is important: that number is competitor-reported, not a score we read directly from OpenAI, so it should be treated as directional rather than final proof.\u003C\u002Fp>\n\u003Cfigure class=\"my-6\">\u003Cimg src=\"https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1781939880953-vauo.png\" alt=\"Claude vs GPT vs Gemini: Coding Benchmark Leaderboard\" class=\"rounded-xl w-full\" loading=\"lazy\" \u002F>\u003C\u002Ffigure>\n\u003Cp>Still, GPT-5.5 looks attractive for teams that want a large 1,050,000-token context window, 128K max output, and a familiar $5 \u002F $30 price point. If your coding workflow involves long repo-wide prompts, agent loops, and lots of pasted context, GPT-5.5 gives you a lot of room without moving into the most expensive tier.\u003C\u002Fp>\u003Ch2>Gemini: best verified public scores and the lowest entry price\u003C\u002Fh2>\u003Cp>Gemini 3.1 Pro is the most benchmark-transparent model in this set because Google’s official model card publishes the figures directly: 80.6% on \u003Ca href=\"\u002Ftag\u002Fswe-bench-verified\">SWE-bench Verified\u003C\u002Fa>, 54.2% on SWE-bench Pro Public, and 2887 Elo on LiveCodeBench Pro. Those are single-attempt results from the official card, which makes them especially useful if you care about public, source-backed numbers more than vendor-adjacent references.\u003C\u002Fp>\u003Cp>It also has the sharpest price advantage at $2 \u002F $12 per million tokens for prompts up to 200K, with higher tiers above that, but the trade-off is a 64K max output and a benchmark profile that is not directly comparable across every test harness. Gemini is the value play \u003Ca href=\"\u002Fnews\u002Fai-coding-assistant-roi-measured-en\">when you\u003C\u002Fa> want a strong published score, lower cost, and you can live with shorter outputs.\u003C\u002Fp>\u003Ch2>When to pick what\u003C\u002Fh2>\u003Cp>Pick Claude Opus 4.8 if you want the safest Anthropic default for everyday \u003Ca href=\"\u002Ftag\u002Fagentic-coding\">agentic coding\u003C\u002Fa> and you care more about a balanced price-to-capability mix than a headline benchmark citation.\u003C\u002Fp>\u003Cp>Pick GPT-5.5 if your team works in very long contexts, wants a large output budget, and prefers a model with a strong but caveated benchmark signal that still reads well in vendor comparisons.\u003C\u002Fp>\u003Cp>Pick Gemini 3.1 Pro if you want the lowest cost, the most clearly published public coding scores, and a model that is easy to justify in a procurement review because the source numbers are directly visible.\u003C\u002Fp>\u003Cp>Default to Gemini 3.1 Pro for cost-sensitive coding teams, but switch to GPT-5.5 when your workflows regularly need the extra-long 1,050,000-token context window and higher-output headroom.\u003C\u002Fp>","A June 2026 coding benchmark comparison of Claude, GPT, and Gemini for model buyers.","tygartmedia.com","https:\u002F\u002Ftygartmedia.com\u002Fclaude-vs-gpt-vs-gemini-coding-benchmark\u002F",null,"https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1781939876788-ivgw.png","industry","en","ff19d422-4694-464f-8184-fff9bfba954a",[17,18,19,20,21,22,23,24],"Claude","GPT-5.5","Gemini 3.1 Pro","coding benchmarks","SWE-bench","LiveCodeBench Pro","Terminal-Bench","model comparison",[26,27,28],"Gemini 3.1 Pro has the clearest verified public benchmark data and the lowest listed starting price.","GPT-5.5 stands out for its 1,050,000-token context window and competitive middle-tier pricing.","Claude Opus 4.8 is the practical Anthropic pick, while Fable 5 is the premium option.",0,"2026-06-20T07:17:35.473285+00:00","2026-06-20T07:17:35.463+00:00","d19fc184-5852-4c4d-9ec0-db0c4841ac17",{"tags":34,"relatedLang":40,"relatedPosts":44},[35,38],{"name":36,"slug":37},"SWE-Bench","swe-bench",{"name":17,"slug":39},"claude",{"id":15,"slug":41,"title":42,"language":43},"claude-vs-gpt-vs-gemini-cheng-shi-ma-ji-zhun-dui-jue-zh","Claude vs GPT vs Gemini：程式碼基準對決","zh",[45,51,57,63,69,75],{"id":46,"slug":47,"title":48,"cover_image":49,"image_url":49,"created_at":50,"category":13},"b94d42b2-d124-4f93-98f7-305f60799562","wso2-600m-sale-open-source-enterprise-software-en","WSO2’s $600M sale caps a 20-year open-source run","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1781951573378-b4r4.png","2026-06-20T10:32:31.01869+00:00",{"id":52,"slug":53,"title":54,"cover_image":55,"image_url":55,"created_at":56,"category":13},"36d8e7b0-ce1b-4870-b1ee-fae9ae5e2356","google-ax-resumable-agent-runtime-en","Google AX turns agent runs into resumable jobs","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1781950658549-zc18.png","2026-06-20T10:17:16.910807+00:00",{"id":58,"slug":59,"title":60,"cover_image":61,"image_url":61,"created_at":62,"category":13},"7f9867eb-30df-48d7-9ca0-a38c2fc75394","designmd-agent-ready-ui-specs-en","design.md turns brand tokens into agent-ready UI specs","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1781947068272-459j.png","2026-06-20T09:17:20.913274+00:00",{"id":64,"slug":65,"title":66,"cover_image":67,"image_url":67,"created_at":68,"category":13},"e018e62b-a712-4e2c-aee6-21fb492b993a","clip-converter-rivals-faster-safer-2026-en","Clip Converter’s 2026 rivals are faster and safer","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1781935360953-hmtl.png","2026-06-20T06:02:19.249403+00:00",{"id":70,"slug":71,"title":72,"cover_image":73,"image_url":73,"created_at":74,"category":13},"e5877eb6-413d-46f3-b91e-3c4139b5e1f9","openai-sora-shutdown-unit-economics-en","OpenAI’s Sora shutdown proves hype can’t outrun unit economics","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1781933563230-w87h.png","2026-06-20T05:32:17.714689+00:00",{"id":76,"slug":77,"title":78,"cover_image":79,"image_url":79,"created_at":80,"category":13},"d2810cd9-a360-4466-a3a3-5a953daea1b1","anthropics-model-shutdown-safety-can-bite-back-en","Anthropic’s model shutdown shows safety can bite back","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1781932667519-4tfs.png","2026-06-20T05:17:22.132183+00:00",[82,87,92,97,102,107,112,117,122,127],{"id":83,"slug":84,"title":85,"created_at":86},"d35a1bd9-e709-412e-a2df-392df1dc572a","ai-impact-2026-developments-market-en","AI's Impact in 2026: Key Developments and Market Shifts","2026-03-25T16:20:33.205823+00:00",{"id":88,"slug":89,"title":90,"created_at":91},"5ed27921-5fd6-492e-8c59-78393bf37710","trumps-ai-legislative-framework-en","Trump's AI Legislative Framework: What's Inside?","2026-03-25T16:22:20.005325+00:00",{"id":93,"slug":94,"title":95,"created_at":96},"e454a642-f03c-4794-b185-5f651aebbaca","nvidia-gtc-2026-key-highlights-innovations-en","NVIDIA GTC 2026: Key Highlights and Innovations","2026-03-25T16:22:47.882615+00:00",{"id":98,"slug":99,"title":100,"created_at":101},"0ebb5b16-774a-4922-945d-5f2ce1df5a6d","claude-usage-diversifies-learning-curves-en","Claude Usage Diversifies, Learning Curves Emerge","2026-03-25T16:25:50.770376+00:00",{"id":103,"slug":104,"title":105,"created_at":106},"69934e86-2fc5-4280-8223-7b917a48ace8","openclaw-ai-commoditization-concerns-en","OpenClaw's Rise Raises Concerns of AI Model Commoditization","2026-03-25T16:26:30.582047+00:00",{"id":108,"slug":109,"title":110,"created_at":111},"b4b2575b-2ac8-46b2-b90e-ab1d7c060797","google-gemini-ai-rollout-2026-en","Google's Gemini AI Rollout Extended to 2026","2026-03-25T16:28:14.808842+00:00",{"id":113,"slug":114,"title":115,"created_at":116},"6e18bc65-42ae-4ad0-b564-67d7f66b979e","meta-llama4-fabricated-results-scandal-en","Meta's Llama 4 Scandal: Fabricated AI Test Results Unveiled","2026-03-25T16:29:15.482836+00:00",{"id":118,"slug":119,"title":120,"created_at":121},"bf888e9d-08be-4f47-996c-7b24b5ab3500","accenture-mistral-ai-deployment-en","Accenture and Mistral AI Team Up for AI Deployment","2026-03-25T16:31:01.894655+00:00",{"id":123,"slug":124,"title":125,"created_at":126},"5382b536-fad2-49c6-ac85-9eb2bae49f35","mistral-ai-high-stakes-2026-en","Mistral AI: Facing High Stakes in 2026","2026-03-25T16:31:39.941974+00:00",{"id":128,"slug":129,"title":130,"created_at":131},"9da3d2d6-b669-4971-ba1d-17fdb3548ed5","cursors-meteoric-rise-pressures-en","Cursor's Meteoric Rise Faces Industry Pressures","2026-03-25T16:32:21.899217+00:00"]