GLM-5.2 tops open AI rankings on Huawei chips

OraCore Editors

June 28, 2026 · 3 min read

Z.ai’s GLM-5.2 hit the top open-weight rank on Huawei Ascend chips, as Anthropic’s Fable 5 stayed offline after a US export ban.

Techtimes reports that Z.ai’s GLM-5.2 now leads open rankings after a June 17 release, while Anthropic keeps Fable 5 and Mythos 5 offline following a US export order. The model was trained entirely on Huawei Ascend 910B chips, with no Nvidia hardware in the training stack.

Item	Value
GLM-5.2 release date	June 17, 2026
Fable 5 ban date	June 12, 2026
Training chips	100,000 Huawei Ascend 910B
Total parameters	744 billion
Active parameters per inference	~40 billion
Context window	1 million tokens
Training cost estimate	$25 million

What changed

Z.ai says GLM-5.2 is the first model family it trained entirely on domestic Chinese silicon, and the release landed with MIT-licensed weights on Hugging Face. That made the model easy to download, self-host, and test against closed Western systems.

On public benchmarks, GLM-5.2 posted strong results in coding and design tasks. It ranked second on Code Arena with an Elo score of 1,595, led SWE-bench Pro with 62.1, and took first on Design Arena. The same report notes a wide gap on harder tests: GLM-5.2 scored 13.0 on SWE-Marathon, half of the 26.0 that Claude Opus 4.8 reached.

GLM-5.2 uses a Mixture-of-Experts design with 744 billion total parameters.
About 40 billion parameters activate per inference.
DeepSeek Sparse Attention helps make the 1M-token context window usable.
Inference speed is about 17 to 19 tokens per second, slower than Nvidia-backed rivals.
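
As a rough back-of-the-envelope check on the figures above, the Python sketch below works out what share of the model's parameters is active for each token and how long a response would take at the reported decode speed. The parameter counts and speed range come from the article; the 2,000-token response length is an illustrative assumption, and real throughput will vary with hardware and batch size.

```python
# Figures taken from the article; the response length is a hypothetical example.
TOTAL_PARAMS_B = 744         # total parameters, in billions
ACTIVE_PARAMS_B = 40         # approximate active parameters per token, in billions
SPEED_RANGE_TPS = (17, 19)   # reported decode speed, tokens per second


def active_fraction(active_b: float, total_b: float) -> float:
    """Share of the model's parameters used for a single token."""
    return active_b / total_b


def generation_seconds(num_tokens: int, tokens_per_second: float) -> float:
    """Time to emit num_tokens at a steady decode speed."""
    return num_tokens / tokens_per_second


if __name__ == "__main__":
    frac = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
    print(f"Active share per token: {frac:.1%}")

    slow, fast = SPEED_RANGE_TPS
    tokens = 2000  # hypothetical response length
    best = generation_seconds(tokens, fast)
    worst = generation_seconds(tokens, slow)
    print(f"{tokens} tokens: {best:.0f} to {worst:.0f} seconds")
```

Under these assumptions, roughly 5.4 percent of the parameters are active per token, and a 2,000-token answer takes about 105 to 118 seconds to generate.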

The model’s architecture is built for long-context work, especially large codebases and agentic software tasks. Z.ai says the routing system picks 8 of 256 expert sub-networks for each token, which keeps compute lower than a dense model at similar scale.
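
The routing step described above can be illustrated with a generic top-k gating sketch in Python. This is not Z.ai's implementation: the report gives only the 8-of-256 selection, so the softmax normalization over the chosen experts, the random router scores, and the function name `route_token` are all assumptions made for illustration.

```python
import math
import random

NUM_EXPERTS = 256  # expert count reported by Z.ai
TOP_K = 8          # experts selected per token


def route_token(router_logits, k=TOP_K):
    """Pick the k highest-scoring experts and softmax-normalize their weights.

    Hypothetical helper: a generic top-k gate, not the GLM-5.2 router.
    """
    # Indices of the k largest router scores, best first.
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i],
                 reverse=True)[:k]
    # Softmax over only the selected scores (subtract the max for stability).
    peak = max(router_logits[i] for i in top)
    exps = [math.exp(router_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


if __name__ == "__main__":
    rng = random.Random(0)
    # Random scores stand in for the output of a learned router layer.
    logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
    chosen = route_token(logits)
    print(len(chosen), "of", NUM_EXPERTS, "experts active")
    print("weights sum to", round(sum(w for _, w in chosen), 6))
```

Because only 8 of the 256 experts run for a given token, about 3 percent of the expert sub-networks do work at each step, which is where the compute saving over a dense model comes from.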

Why it matters

The timing matters as much as the benchmarks. On June 12, the US Commerce Department ordered Anthropic to disable Fable 5 and Mythos 5 for foreign nationals worldwide, citing a jailbreak issue. Five days later, Z.ai shipped an open-weight alternative that developers can run locally and that no export order can pull back.

That shifts the policy debate from chip denial to practical access. If a Chinese model can match key commercial use cases on coding and design while running on Huawei hardware, export controls may slow training, but they do not stop frontier-class releases from reaching developers outside the US.

There is still a gap on the hardest reasoning tests, and the report cites Epoch AI and Stanford’s 2026 AI Index to show that US labs remain ahead on evaluations built to resist gaming. But for teams choosing between closed APIs and local deployment, GLM-5.2 changes the default answer: open weights are now close enough to matter.

The real question is no longer whether Chinese labs can ship competitive code models on domestic chips, but how much leverage export controls still have once those weights are public.
