Gemma 4
Gemma 4 is Google’s open-weight model family focused on long context, multimodal input, and flexible cloud deployment. With 256K context, vision, audio, and Apache 2.0 licensing, it matters for teams using Vertex AI, Cloud Run, GKE, or TPUs.
7 articles

AtomicBot’s llama.cpp fork boosts throughput on two fronts
4 ways AtomicBot’s llama.cpp fork speeds up Gemma 4 and Qwen 3.6, with matrix-bench gains of 30-50% on the right setup.

Gemma 4 brings 256K context to open models
Google’s Gemma 4 adds text, image, and audio input, plus up to 256K context and five model sizes for local or server use.

Claude Fable 5 leads a quiet AI release week
Anthropic’s Claude Fable 5 is the only notable new model in the last 48 hours, while major labs mostly shipped smaller upgrades.

Why model-release feeds matter more than model-launch posts
Model-release feeds are the best way to track real AI progress and pricing shifts.

Gemma 4 assistant models get faster draft tokens
Gemma 4 E2B and E4B assistant models use centroid masking to cut lm_head work by about 45x with little quality loss.

Gemma 4 lands on Google Cloud
Google Cloud brings Gemma 4 to Vertex AI, Cloud Run, GKE, and TPUs, with 256K context, vision, audio, and Apache 2.0 licensing.

AIME 2026 leaderboard: Qwen leads math tests
Qwen3.6 Plus tops the AIME 2026 math benchmark with a score of 0.953, while 8 models show a wide gap in olympiad-style reasoning.