Research

AI research papers, breakthroughs, and technical deep dives. From academic publications to lab findings shaping the future of AI.

Jun 29

OPD lets you distill skills without brute-force RL

I break down On-Policy Distillation and turn the idea into a copy-ready post-training template.

Jun 29

Google DeepMind turns science into tools

Google DeepMind’s science tools show how Google is packaging AI for researchers who want precision, not hype.

Jun 29

Measuring when LLM behavior actually transfers

A new framework tests whether an LLM’s behavior transfers across payoff-equivalent decision environments.

Jun 29

Prompt injection is now an AI security problem

Prompt injection lets hidden text steer LLMs, and recent tests show models like DeepSeek-R1 can be tricked at worrying rates.

Jun 29

Solver choice changes which Nash equilibrium wins

Different zero-sum game solvers can converge to different Nash equilibria, and the choice is algorithm-dependent.

Jun 29

Proper positive-only learning gets a full characterization

A new result characterizes when proper learning from positive-only samples is possible.

Jun 29

DexCompose Reuses Dexterous Policies Across Tasks

DexCompose composes pretrained hand policies into multi-task manipulation by assigning finger-level action ownership.

Jun 29

HaWoR turns hand motion into MANO params

HaWoR’s hand reconstruction setup boils down to predicting MANO parameters, not raw meshes.

Jun 29

NVIDIA’s $30,000 grant targets USC health AI

USC is advertising NVIDIA’s $30,000 academic grant for health and AI research, with applications due June 30, 2026.

Jun 29

CUDA Toolkit 13.3 fixes a nested-divergence bug

CUDA Toolkit 13.3 fixes a compiler bug from 12.8 that could corrupt registers in deeply divergent GPU kernels.

Jun 28

EAGLE3 is the real speedup for Kimi-K2.5 on MI325X

EAGLE3, not kernel tweaks, is the main reason Kimi-K2.5-W4A8 decodes faster on AMD MI325X.

Jun 27

LLM fine-tuning turns generic models into domain tools

A practical breakdown of enterprise LLM fine-tuning, from data prep to model choice, plus a copy-ready template.

Jun 27

Rust learners need permission to clone first, optimize later

Rust learners should clone freely at first, then optimize once they understand the problem.

Jun 26

Mistral OCR 4 brings structure to document AI

Mistral OCR 4 adds boxes, block labels, and confidence scores to OCR, with API pricing from $4 per 1,000 pages.

Jun 26

Autoregressive Boltzmann Generators ditch flows

ArBG replaces flow-based Boltzmann generators with autoregressive modeling for faster, more scalable equilibrium sampling.

Jun 26

RiVER trains LLMs without ground-truth answers

RiVER shows LLMs can improve from score-based tasks without ground-truth answers by calibrating rewards from execution feedback.

Jun 26

DanceOPD distills image-editing skills into one model

DanceOPD trains flow-matching image models to combine text-to-image and editing skills without them fighting each other.

Jun 26

Microsoft funds AI research on team collaboration

Microsoft Research opened a Spring 2026 CFP for AI that helps teams work better, with awards around $50K to $75K.

Jun 25

3 AI papers on code, music, and diagnosis

A Zhihu roundup highlights three AI papers from June 24, 2026 on code generation, real-time music, and rare-disease diagnosis.

Jun 25

New NLP papers map agent memory and tool use

A June 24 arXiv roundup highlights agent memory, tool-use signals, and conversational search papers that push practical NLP forward.

Jun 25

Self-Distillation Can Shrink Model Diversity

Self-distillation can boost pass@1 while quietly reducing rollout diversity and hurting out-of-distribution robustness.

Jun 25

RevengeBench tests reverse-engineering game policies

RevengeBench tests whether LLMs can reconstruct hidden game policies from behavior and improve with custom probes.

Jun 25

Learning Action Priors for Cross-Embodiment Manipulation

A two-stage training scheme gives VLA robots an explicit motion prior before cross-modal alignment.

Jun 25

OPSD lets you turn user clicks into training

I break down OPSD into a copyable loop for turning implicit user feedback into targeted correction and continual training.

Jun 25

UltraQuant: 4-bit KV caching for long agents

UltraQuant shows 4-bit KV caching can speed long, multi-turn agent serving while keeping more context resident.

Jun 24

FLUX3D fixes 3DGS detail loss from images

FLUX3D improves image-to-3D Gaussian generation by aligning sparse 3D latents with dense 2D image tokens.

Jun 24

Stochastic Subgradient Last Iterate Gets Tight Bounds

The paper tightens last-iterate bounds for stochastic subgradient descent in 1D and shows variance alone is not enough.

Jun 24

InSight lets VLAs learn new skills on their own

InSight makes vision-language-action policies learn new manipulation skills without human demos of those target tasks.

Jun 24

Anthropic is right to sound the alarm on recursive self-improvement

Anthropic’s warning is justified, but the bigger problem is that AI control is already slipping beyond easy governance.

Jun 24

OpenAI’s bug hunt rattled Chrome, Safari, Firefox

OpenAI researchers found multiple exploitable browser bugs in Chrome, Safari, and Firefox within a week.
