Tag

agentic coding

Agentic coding refers to models that plan, edit, test, and iterate across a software task instead of answering a single prompt. It raises practical issues around tool use, multi-agent coordination, long contexts, token cost, and deployment choices for SWE-bench-style workflows and Claude Code-like setups.

20 articles

Devin AI Review 2026: Benchmarks, Pricing & Tests
Tools & Apps/Jun 25

A developer guide to testing Devin AI, its benchmarks, pricing, and workflow limits.

SpaceX Acquiring Cursor Is Not Worth It; AI Coding Should Be Built In-House
Industry News/Jun 19

SpaceX spending 60 billion to acquire Cursor is not a good deal; AI coding capability is better built in-house.

8 Cursor alternatives that fit how you work
Tools & Apps/Jun 13

I break down eight Cursor alternatives, what each one is actually good for, and the template I’d start from.

MiniMax M3 Proves Open-Weight Can Still Win on Coding
Model Releases/Jun 9

MiniMax M3 makes a strong case that open-weight models can still lead on coding, context, and price.

Why MiniMax M3 matters more than another long-context model
Model Releases/Jun 6

MiniMax M3 is a real step forward because it pairs long context with multimodal and agentic control.

5 GitHub Explore picks for builders
Industry News/May 29

Five GitHub Explore picks show what builders are shipping, from agentic coding tools to document parsers and open-source apps.

GPT-5.0 to 5.5: Which ChatGPT Model Wins?
Model Releases/May 29

OpenAI’s GPT-5 family grew from a 400K-token baseline to 1M-token agentic models, with GPT-5.5 now leading benchmarks.

Why Devin AI is overrated as a software engineer
AI Agent/May 27

Devin AI is impressive, but it is not the autonomous software engineer its launch implied.

Sonar Acquires Gitar for AI Code Review
Tools & Apps/May 26

Sonar bought Gitar to add AI code review to SonarQube, pairing review with verification for agent-written code across CI workflows.

Zero turns compiler errors into agent-ready JSON
AI Agent/May 26

I break down Vercel Zero’s agent-first compiler design and give you a copy-ready template for structured diagnostics.

Why Amazon Q Developer is wrong about the future of coding
Industry News/May 19

Amazon Q Developer is a strong AWS assistant, but it should not be treated as the future of software development.

MiniMax M2 opens up cheap agentic coding
Model Releases/May 18

MiniMax open-sourced M2, a model for agents and code that costs $0.30 per million input tokens and is free for a limited time.

Why Xiaomi’s MiMo-V2.5-Pro Changes Coding Agents More Than Chatbots
Model Releases/May 14

MiMo-V2.5-Pro matters because it is built for long, tool-heavy coding work, not chat.

Kimi K2.6 and Qwen 3.6 Narrow the Gap
Model Releases/May 4

Kimi K2.6 and Qwen 3.6 are open-weight models that now rival closed models on coding and agent tasks.

How AI Agents Spend Your Money: 1000x Tokens on SWE-bench
Research/Apr 27

A study of SWE-bench Verified shows agentic coding can consume 1000x more tokens than chat, with costs that are driven by input tokens and hard to predict.

Qwen3.6-27B opens a smaller, sharper path to coding
Model Releases/Apr 27

Qwen3.6-27B is a 27B dense multimodal model that beats Qwen3.5-397B-A17B on key coding benchmarks while staying easier to deploy.

Claude Opus 4.7 Released: Better at Getting Work Done
Model Releases/Apr 22

Anthropic released Claude Opus 4.7, which is stronger on long tasks, visual understanding, and coding workflows, but also consumes more tokens.

Qwen3.6-35B-A3B: 35B Open Source Model Release
Model Releases/Apr 20

Qwen3.6-35B-A3B ships with 35B total params, 3B active params, and Anthropic API compatibility for Claude Code workflows.

Cursor Composer 2 Bets on Agentic Coding
Model Releases/Mar 28

Cursor’s Composer 2 posts 61.3 on CursorBench and 61.7 on Terminal-Bench 2.0, with pricing aimed at high-volume coding teams.

Xiaomi MiMo-V2-Pro: 1T MoE Model for Agents
Model Releases/Mar 28

Xiaomi’s MiMo-V2-Pro packs 1T parameters, 42B active, and 1M context, with SWE-bench results close to Claude Sonnet 4.6.