[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"article-kimi-k26-open-source-coding-agentic-ai-benchmarks-en":3,"article-related-kimi-k26-open-source-coding-agentic-ai-benchmarks-en":30,"series-model-release-2b2e09ae-d63f-4d0d-88c9-ca494fc7cc3b":77},{"id":4,"slug":5,"title":6,"content":7,"summary":8,"source":9,"source_url":10,"author":11,"image_url":12,"cover_image":12,"category":13,"language":14,"translated_content":11,"related_article_id":15,"keywords":16,"key_takeaways":22,"views":26,"created_at":27,"published_at":28,"topic_cluster_id":29},"2b2e09ae-d63f-4d0d-88c9-ca494fc7cc3b","kimi-k26-open-source-coding-agentic-ai-benchmarks-en","Kimi K2.6 tops coding and agentic AI benchmarks","\u003Cp data-speakable=\"summary\">\u003Ca href=\"\u002Ftag\u002Fmoonshot-ai\">Moonshot AI\u003C\u002Fa>’s Kimi K2.6 is an open-weight model built for long-horizon coding and agentic work.\u003C\u002Fp>\u003Cp>\u003Ca href=\"https:\u002F\u002Fwww.moonshot.ai\" target=\"_blank\" rel=\"noopener\">Moonshot AI\u003C\u002Fa>’s Kimi K2.6 is being pitched as a major step for open-source agentic models. 
The model, published on June 26, 2026 and available via Hugging Face and the Kimi API, uses a Mixture-of-Experts design with a 262,144-token context window and targets coding, design, and multi-agent workflows.\u003C\u002Fp>\u003Ctable>\u003Cthead>\u003Ctr>\u003Cth>Item\u003C\u002Fth>\u003Cth>Value\u003C\u002Fth>\u003C\u002Ftr>\u003C\u002Fthead>\u003Ctbody>\u003Ctr>\u003Ctd>Release date\u003C\u002Ftd>\u003Ctd>2026-06-26\u003C\u002Ftd>\u003C\u002Ftr>\u003Ctr>\u003Ctd>Context window\u003C\u002Ftd>\u003Ctd>262,144 tokens\u003C\u002Ftd>\u003C\u002Ftr>\u003Ctr>\u003Ctd>API pricing\u003C\u002Ftd>\u003Ctd>$0.74 \u002F $3.50 per 1M input\u002Foutput tokens\u003C\u002Ftd>\u003C\u002Ftr>\u003Ctr>\u003Ctd>Swarm scale\u003C\u002Ftd>\u003Ctd>300 sub-agents\u003C\u002Ftd>\u003C\u002Ftr>\u003Ctr>\u003Ctd>Agent steps\u003C\u002Ftd>\u003Ctd>4,000 coordinated steps\u003C\u002Ftd>\u003C\u002Ftr>\u003Ctr>\u003Ctd>Kimi Design Bench\u003C\u002Ftd>\u003Ctd>Outperforms Google AI Studio on visual input, landing pages, full-stack apps, and creative coding\u003C\u002Ftd>\u003C\u002Ftr>\u003C\u002Ftbody>\u003C\u002Ftable>\u003Ch2>What changed\u003C\u002Fh2>\u003Cp>K2.6 is not a small update to Kimi K2.5. Moonshot says the new model improves Toolathlon by almost 80%, adds about 8 points on BrowseComp and \u003Ca href=\"\u002Ftag\u002Fswe-bench\">SWE-Bench\u003C\u002Fa> Pro, and expands the agent swarm system from 100 agents and 1,500 steps to 300 agents and 4,000 steps.\u003C\u002Fp>\n\u003Cfigure class=\"my-6\">\u003Cimg src=\"https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1782739081936-jpdb.png\" alt=\"Kimi K2.6 tops coding and agentic AI benchmarks\" class=\"rounded-xl w-full\" loading=\"lazy\" \u002F>\u003C\u002Ffigure>\n\u003Cp>On benchmarks, K2.6 is close to the best closed models across several categories. 
Reported scores include 80.2 on \u003Ca href=\"\u002Ftag\u002Fswe-bench-verified\">SWE-Bench Verified\u003C\u002Fa>, 89.6 on LiveCodeBench v6, 76.7 on SWE-Bench Multilingual, 66.7 on Terminal-Bench 2.0, 54.0 on HLE-Full with tools, 92.5 on DeepSearchQA, and 73.1 on OSWorld-Verified.\u003C\u002Fp>\u003Cul>\u003Cli>Long-horizon coding: multi-file refactors, compiler-driven debugging, and cross-language work\u003C\u002Fli>\u003Cli>Coding-driven design: prompts that produce interactive front ends and database-backed apps\u003C\u002Fli>\u003Cli>Agent swarm coordination: hundreds of sub-agents running in parallel\u003C\u002Fli>\u003Cli>Real-world demos: 4,000+ tool calls over 12+ hours, and a 13-hour codebase overhaul\u003C\u002Fli>\u003C\u002Ful>\u003Cp>Moonshot’s demos show the model sustaining long runs without human steering. In one case, it deployed a small model locally on a Mac, rewrote inference in Zig, and pushed throughput from about 15 tokens per second to 193. In another, it made more than 1,000 code changes to an older financial matching engine and lifted medium throughput by 185% and peak throughput by 133%.\u003C\u002Fp>\u003Ch2>Why it matters\u003C\u002Fh2>\u003Cp>For developers, K2.6 matters because it compresses a lot of work into one model: planning, coding, debugging, UI generation, and tool use. That makes it relevant for teams building coding copilots, autonomous refactoring tools, research agents, and app builders that need to keep state across long sessions.\u003C\u002Fp>\n\u003Cfigure class=\"my-6\">\u003Cimg src=\"https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1782739074460-bt74.png\" alt=\"Kimi K2.6 tops coding and agentic AI benchmarks\" class=\"rounded-xl w-full\" loading=\"lazy\" \u002F>\u003C\u002Ffigure>\n\u003Cp>For the market, the bigger signal is price. 
Moonshot is offering an open-weight model that can compete with proprietary systems while charging $0.74 per million input tokens and $3.50 per million output tokens. That puts pressure on closed-model vendors and gives enterprise teams a cheaper option for agent-heavy workloads, if they can handle the infrastructure.\u003C\u002Fp>\u003Cp>That infrastructure is the catch. \u003Ca href=\"\u002Ftag\u002Flong-context\">Long context\u003C\u002Fa>, bursty tool calls, and parallel agents can overload naïve deployments, which is why the source article from \u003Ca href=\"https:\u002F\u002Fwww.truefoundry.com\" target=\"_blank\" rel=\"noopener\">TrueFoundry\u003C\u002Fa> points to the company’s AI Gateway for routing, concurrency control, tracing, and cost attribution. The practical question is no longer whether K2.6 can do the work, but which teams can serve it at scale without adding weeks of ops overhead.\u003C\u002Fp>\u003Cp>The real test for Kimi K2.6 is not the benchmark chart. It is whether open-source \u003Ca href=\"\u002Ftag\u002Fagentic-ai\">agentic AI\u003C\u002Fa> can move from impressive demos to repeatable production systems.\u003C\u002Fp>","Moonshot AI’s Kimi K2.6 hits top marks in coding and agentic tasks, with a 262K context window and API pricing of $0.74\u002F$3.50 per 1M input\u002Foutput tokens.","www.truefoundry.com","https:\u002F\u002Fwww.truefoundry.com\u002Fblog\u002Fkimi-k2-6-the-open-source-coding-giant-thats-reshaping-agentic-ai",null,"https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1782739081936-jpdb.png","model-release","en","ca1e6960-10e7-4fa7-949f-c5991c99fc7e",[17,18,19,20,21],"Kimi K2.6","Moonshot AI","agentic AI","open-source models","coding benchmarks",[23,24,25],"Kimi K2.6 targets long-horizon coding, design, and multi-agent tasks with a 262K context window.","Moonshot reports strong benchmark results, near top closed models on several coding and agent tests.","API pricing is low enough to 
pressure proprietary model economics for agent-heavy workloads.",0,"2026-06-29T13:17:26.953686+00:00","2026-06-29T13:17:26.944+00:00","1bae1133-d241-4581-9332-fbf39690c319",{"tags":31,"relatedLang":36,"relatedPosts":40},[32,34],{"name":18,"slug":33},"moonshot-ai",{"name":19,"slug":35},"agentic-ai",{"id":15,"slug":37,"title":38,"language":39},"kimi-k26-open-source-coding-agentic-ai-benchmarks-zh","Kimi K2.6 登頂程式與代理式 AI 基準","zh",[41,47,53,59,65,71],{"id":42,"slug":43,"title":44,"cover_image":45,"image_url":45,"created_at":46,"category":13},"ab62b837-c8ac-493d-a35a-4c454402fd12","kimi-2-7-price-coding-benchmark-en","Kimi 2.7 makes price the real coding benchmark","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1782746269451-4jtb.png","2026-06-29T15:17:24.882797+00:00",{"id":48,"slug":49,"title":50,"cover_image":51,"image_url":51,"created_at":52,"category":13},"666962b5-ce8c-430c-9d07-8cdfd44ffd09","llama-legends-380-season-3-heroes-raids-en","Llama Legends 3.8.0 adds Season 3 heroes and raids","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1782711179242-ednu.png","2026-06-29T05:32:33.398141+00:00",{"id":54,"slug":55,"title":56,"cover_image":57,"image_url":57,"created_at":58,"category":13},"b4840252-4311-4c44-9814-4a3d1666302f","omlx-045-dev1-glm52-minimax-m3-speedups-en","oMLX 0.4.5.dev1 speeds up GLM-5.2 and MiniMax M3","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1782709371396-mn9r.png","2026-06-29T05:02:28.770698+00:00",{"id":60,"slug":61,"title":62,"cover_image":63,"image_url":63,"created_at":64,"category":13},"1fe27411-ad64-4717-85c9-89b5c350253c","grok-45-private-beta-tesla-spacex-en","Grok 4.5 enters private beta at Tesla and 
SpaceX","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1782687764199-vjto.png","2026-06-28T23:02:23.343104+00:00",{"id":66,"slug":67,"title":68,"cover_image":69,"image_url":69,"created_at":70,"category":13},"35368bfc-0dbe-45dc-b422-87b1bd350ac0","google-openrl-llm-fine-tuning-kubernetes-en","Google OpenRL brings RL fine-tuning to Kubernetes","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1782572578249-jlty.png","2026-06-27T15:02:27.543012+00:00",{"id":72,"slug":73,"title":74,"cover_image":75,"image_url":75,"created_at":76,"category":13},"8fe33efd-3a68-4fe3-935f-f0f5d3f058fc","diffusiongemma-runs-fast-on-nvidia-rtx-dgx-en","DiffusionGemma runs fast on NVIDIA RTX and DGX","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1782570781225-7xo9.png","2026-06-27T14:32:34.997765+00:00",[78,83,88,93,98,103,108,113,118,123],{"id":79,"slug":80,"title":81,"created_at":82},"d4cffde7-9b50-4cc7-bb68-8bc9e3b15477","nvidia-rubin-ai-supercomputer-en","NVIDIA Unveils Rubin: A Leap in AI Supercomputing","2026-03-25T16:24:35.155565+00:00",{"id":84,"slug":85,"title":86,"created_at":87},"eab919b9-fbac-4048-89fc-afad6749ccef","google-gemini-ai-innovations-2026-en","Google's AI Leap with Gemini Innovations in 2026","2026-03-25T16:27:18.841838+00:00",{"id":89,"slug":90,"title":91,"created_at":92},"5f5cfc67-3384-4816-a8f6-19e44d90113d","gap-google-gemini-ai-checkout-en","Gap Teams Up with Google Gemini for AI-Driven Checkout","2026-03-25T16:27:46.483272+00:00",{"id":94,"slug":95,"title":96,"created_at":97},"f6d04567-47f6-49ec-804c-52e61ab91225","ai-model-release-wave-march-2026-en","Navigating the AI Model Release Wave of March 
2026","2026-03-25T16:28:45.409716+00:00",{"id":99,"slug":100,"title":101,"created_at":102},"895c150c-569e-4fdf-939d-dade785c990e","small-language-models-transform-ai-en","Small Language Models: Llama 3.2 and Phi-3 Transform AI","2026-03-25T16:30:26.688313+00:00",{"id":104,"slug":105,"title":106,"created_at":107},"38eb1d26-d961-4fd3-ae12-9c4089680f5f","midjourney-v8-alpha-features-pricing-en","Midjourney V8 Alpha: A Deep Dive into Its Features and Pricing","2026-03-26T01:25:36.387587+00:00",{"id":109,"slug":110,"title":111,"created_at":112},"bf36bb9e-3444-4fb8-ab19-0df6bc9d8271","rag-2026-indispensable-ai-bridge-en","RAG in 2026: The Indispensable AI Bridge","2026-03-26T01:28:34.472046+00:00",{"id":114,"slug":115,"title":116,"created_at":117},"60881d6d-2310-44ef-b1fb-7f98e9dd2f0e","xiaomi-mimo-trio-agents-robots-voice-en","Xiaomi’s MiMo trio targets agents, robots, and voice","2026-03-28T03:05:08.899895+00:00",{"id":119,"slug":120,"title":121,"created_at":122},"f063d8d1-41d1-4de4-8ebc-6c40511b9369","xiaomi-mimo-v2-pro-1t-moe-agents-en","Xiaomi MiMo-V2-Pro: 1T MoE Model for Agents","2026-03-28T03:06:19.238032+00:00",{"id":124,"slug":125,"title":126,"created_at":127},"a1379e9a-6785-4ff5-9b0a-8cff55f8264f","cursor-composer-2-started-from-kimi-en","Cursor’s Composer 2 started from Kimi","2026-03-28T03:11:59.132398+00:00"]