Databricks adds AI Gateway inference tables for served models
Databricks now logs model-serving requests and responses to Unity Catalog Delta tables for monitoring, debugging, and agent tracing.

Databricks updated its AWS docs on Jun. 30, 2026 to describe AI Gateway-enabled inference tables for served models. The feature automatically captures request and response data from Model Serving endpoints and stores it in Unity Catalog for monitoring, evaluation, debugging, and tuning.
| Item | Value |
|---|---|
| Doc update | Jun. 30, 2026 |
| Supported endpoint types | 5 |
| AI agent tables created per deployment | 3 |
| Payload data availability | Within 1 hour |
What changed
Inference tables are now described as a built-in logging layer for Databricks Model Serving endpoints. When enabled, they write incoming requests, outgoing responses, HTTP status codes, run time, request IDs, and traces into a Delta table in Unity Catalog.
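
To make the logged fields concrete, here is a minimal Python sketch that models a few inference-table rows as plain dictionaries and picks out the failed requests. The field names (`request_id`, `status_code`, `execution_duration_ms`, `request`, `response`) and the sample values are illustrative assumptions, not the documented column names. In practice the rows would be read from the Delta table in Unity Catalog.

```python
from typing import Any

# Hypothetical rows shaped after the fields the docs list.
# Field names and values are assumptions made for illustration only.
SAMPLE_ROWS: list[dict[str, Any]] = [
    {"request_id": "a1", "status_code": 200, "execution_duration_ms": 120,
     "request": '{"prompt": "hi"}', "response": '{"text": "hello"}'},
    {"request_id": "b2", "status_code": 429, "execution_duration_ms": 5,
     "request": '{"prompt": "hi again"}', "response": '{"error": "rate limited"}'},
    {"request_id": "c3", "status_code": 500, "execution_duration_ms": 900,
     "request": '{"prompt": "long job"}', "response": '{"error": "internal"}'},
]


def failed_requests(rows: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Return the rows whose HTTP status code is outside the 2xx range."""
    return [row for row in rows if not 200 <= row["status_code"] < 300]


failed_ids = [row["request_id"] for row in failed_requests(SAMPLE_ROWS)]
print(failed_ids)  # ['b2', 'c3']
```

The same filter could be expressed as a SQL `WHERE` clause against the real table; the dictionary form is used here only so the sketch runs without a workspace.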

The docs also extend the feature to deployed AI agents. For agents, Databricks stores payload and request details plus MLflow Trace logs, and agents deployed with the mlflow.deploy() API get inference tables automatically.
- Supported endpoint types include provisioned throughput, pay-per-token, external models, deployed AI agents, and custom models.
- Workspace requirements include Unity Catalog, serverless compute, and a region with model serving support.
- Databricks says both the endpoint creator and modifier need Can Manage on the endpoint plus USE CATALOG, USE SCHEMA, and CREATE TABLE in Unity Catalog.
- Databricks creates a new inference table automatically; existing tables are not supported.
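
The permission requirements above can be sketched as a simple preflight check. This is a hedged illustration in plain Python: the function name `missing_requirements` and the way permissions are passed in are hypothetical, and the sketch does not call any Databricks API. It only compares what a user holds against the permissions the docs list.

```python
# Permissions named in the docs; how they are looked up is out of scope here.
REQUIRED_ENDPOINT_PERMISSION = "Can Manage"
REQUIRED_UC_PRIVILEGES = frozenset({"USE CATALOG", "USE SCHEMA", "CREATE TABLE"})


def missing_requirements(endpoint_permission: str, uc_privileges: set[str]) -> list[str]:
    """List the requirements a user still lacks (hypothetical helper)."""
    missing: list[str] = []
    if endpoint_permission != REQUIRED_ENDPOINT_PERMISSION:
        missing.append(f"endpoint: {REQUIRED_ENDPOINT_PERMISSION}")
    # Sort so the report is stable regardless of set ordering.
    for privilege in sorted(REQUIRED_UC_PRIVILEGES - set(uc_privileges)):
        missing.append(f"unity catalog: {privilege}")
    return missing


gaps = missing_requirements("Can Query", {"USE CATALOG", "USE SCHEMA"})
print(gaps)  # ['endpoint: Can Manage', 'unity catalog: CREATE TABLE']
```

Because the docs say both the endpoint creator and the modifier need these permissions, a check like this would be run for each of them.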
The docs warn that changing the table schema, renaming the table, or deleting it can stop logging or corrupt the table. For AI agents, Databricks is also deprecating request logs and assessment logs in favor of the newer payload tables.
Why it matters
For developers, the change turns serving telemetry into queryable data instead of opaque logs. Teams can join inference tables with ground truth labels, build training corpora, monitor drift, and use Databricks SQL or notebooks to inspect failures without leaving the platform.
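
As a rough illustration of joining logged inferences with ground truth labels, the sketch below matches rows to labels by request ID and computes accuracy. Everything in it is assumed for the example: the `request_id` and `prediction` fields, the label lookup, and the spam/ham values. A real workflow would more likely run the join in Databricks SQL or a notebook against the Delta table.

```python
from typing import Any, Optional

# Hypothetical logged predictions, keyed by an assumed request_id field.
inference_rows: list[dict[str, Any]] = [
    {"request_id": "r1", "prediction": "spam"},
    {"request_id": "r2", "prediction": "ham"},
    {"request_id": "r3", "prediction": "spam"},
]

# Hypothetical ground truth collected later; r3 has no label yet.
ground_truth: dict[str, str] = {"r1": "spam", "r2": "spam"}


def join_with_labels(rows: list[dict[str, Any]], labels: dict[str, str]) -> list[dict[str, Any]]:
    """Inner-join rows to labels on request_id, dropping unlabeled rows."""
    return [
        {**row, "label": labels[row["request_id"]]}
        for row in rows
        if row["request_id"] in labels
    ]


def accuracy(joined: list[dict[str, Any]]) -> Optional[float]:
    """Fraction of joined rows where prediction equals label, or None if empty."""
    if not joined:
        return None
    correct = sum(1 for row in joined if row["prediction"] == row["label"])
    return correct / len(joined)


labeled = join_with_labels(inference_rows, ground_truth)
print(len(labeled), accuracy(labeled))  # 2 0.5
```

The labeled rows that disagree with their prediction are also natural candidates for a training or evaluation set.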

It also gives ops teams a clearer path for agent debugging. Because the tables can include MLflow traces and request metadata, teams can track slow runs, compare historical requests, and spot where response quality or latency changed.
Databricks also points to the new Unity AI Gateway beta, which it describes as the enterprise control plane for governing LLM endpoints and coding agents. That suggests inference tables are part of a broader push to make serving, governance, and observability sit in one workflow.
The practical question for teams is no longer whether they can log model traffic, but whether they want that traffic governed in Unity Catalog from day one.