Google Research

Google Research covers the core engineering behind modern AI: inference efficiency, memory compression, prompt behavior, and benchmark reliability. Its work often shapes how models are deployed, measured, and optimized in production.

6 articles

Research/Jun 29

Google DeepMind turns science into tools

Google DeepMind’s science tools show how Google is packaging AI for researchers who want precision, not hype.

Industry News/Jun 12

TurboQuant makes long-context AI much cheaper

4 ways TurboQuant’s 100x KV cache cut could lower long-context AI costs, ease GPU needs, and change model serving.

Research/Jun 8

TurboQuant cuts KV cache memory 6x in Google tests

Google Research says TurboQuant compresses KV caches by more than 4x, using up to 6x less memory with no accuracy loss on long-context tests.

Industry News/May 20

5 KV cache takeaways for llama.cpp users

5 takeaways from TurboQuant: under-3-bit KV cache compression, memory savings, and the tradeoffs llama.cpp users should watch.

Research/Apr 3

TurboQuant cuts memory use 6x without accuracy loss

Google Research’s TurboQuant claims 6x less memory and 8x faster inference with no accuracy loss, jolting AI inference economics.

Research/Apr 2

Duplicate prompts can lift accuracy fast

A Google study found that repeating a prompt once improved results on 47 of 70 model-benchmark pairs, with one task's accuracy jumping from 21% to 97%.