NVIDIA and Microsoft unify agentic AI from PC to cloud
5 NVIDIA-Microsoft moves show how agentic AI now spans Windows devices, Azure, local deployment, and secure enterprise runtimes.

NVIDIA and Microsoft are linking PCs, cloud, local systems, and secure runtimes for agentic AI.
At Microsoft Build, NVIDIA said the stack now reaches from Windows devices to Azure and local deployment, with one benchmark showing Microsoft Fabric SQL running up to 6x faster than a CPU baseline.
| Item | Where it runs | Key spec |
|---|---|---|
| RTX Spark | Windows devices | 1 petaflop AI performance, up to 128GB unified memory |
| DGX Station for Windows | Windows desktops | Up to 20 petaflops FP4, up to 748GB coherent memory |
| Microsoft Fabric Data Warehouse | Cloud data layer | Up to 6x faster SQL execution vs CPU baseline |
| Azure Local with RTX PRO 6000 Blackwell Server Edition | Local and sovereign deployments | Multinode support, vLLM runtime |
| Vera Rubin on Azure | AI factories | Up to 10x inference throughput per megawatt |
1. RTX Spark for Windows agents
Get the latest AI news in your inbox
Weekly picks of model releases, tools, and deep dives — no spam, unsubscribe anytime.
No spam. Unsubscribe at any time.
NVIDIA and Microsoft are positioning RTX Spark as a personal AI machine for developers who want agents to run natively on Windows. The pitch is simple: build, tune, and run local agents on a laptop or small desktop without depending on the cloud for every step.

RTX Spark targets a practical middle ground between consumer PCs and full datacenter gear. NVIDIA says it delivers 1 petaflop of AI performance, up to 128GB of unified memory, and all-day battery life, while keeping full AI and graphics performance unplugged.
- Purpose-built Windows PCs for personal agents
- Ships from Microsoft Surface, ASUS, Dell, HP, Lenovo, and MSI
- Includes CUDA, RTX, DLSS, and TensorRT support
2. DGX Station for Windows for enterprise workflows
DGX Station for Windows is the bigger sibling for teams that need a deskside AI system for enterprise apps and long-running workflows. NVIDIA says it is built for always-on agents, with enough memory and compute to handle frontier-scale models locally.
According to NVIDIA, the system uses the GB300 Grace Blackwell Ultra Desktop Superchip, offers up to 748GB of coherent memory, and reaches 20 petaflops of FP4 performance. That makes it a fit for model development, local inference, and heavier agent pipelines that cannot wait on round trips to the cloud.
- Expected from ASUS, Dell, GIGABYTE, HP, MSI, and Supermicro in Q4
- Runs NVIDIA OpenShell secure runtime
- Supports models up to 1 trillion parameters
3. NVIDIA models in Microsoft Foundry
Microsoft Foundry is becoming the place where enterprises compose agent systems from multiple model families instead of betting on one model for every job. NVIDIA says its open models now sit alongside Anthropic and OpenAI models in Foundry Agent Service, with built-in identity and governance.

The most notable addition is Nemotron 3 Ultra, an open frontier reasoning model aimed at coding, research, and enterprise workflows. NVIDIA also points to Nemotron 3.5 ASR for speech recognition and Nemotron 3.5 Content Safety, plus Cosmos 3 for physical AI and Earth-2 weather models for forecasting and risk analysis.
- Nemotron models available on Foundry managed compute
- Anthropic Claude runs natively on NVIDIA GB300 Blackwell Ultra systems on Azure
- Agent Toolkit and NemoClaw blueprints are available for production agents
4. Fabric and Azure Local for faster data and local control
Agentic systems need a data layer that can keep up with repeated queries, reasoning loops, and retrieval calls. NVIDIA says its accelerated computing is now built into Microsoft Fabric Data Warehouse, where Microsoft’s internal tests showed SQL execution up to 6x faster than a CPU baseline and up to 7x faster than three other cloud data warehouse providers in high-concurrency workloads.
For teams that need to keep data on site or close to the edge, Microsoft is also bringing Foundry Local on Azure Local to the RTX PRO 6000 Blackwell Server Edition platform. Paired with Nemotron models, it supports multinode deployments and the vLLM runtime for manufacturing, energy, sovereign data centers, and other latency-sensitive uses.
Use case fit:
- Fabric Data Warehouse: cloud analytics and agent queries
- Azure Local: on-prem, hybrid, and sovereign deployments
- vLLM: scaled inference where latency matters5. OpenShell and Vera Rubin for secure agents and AI factories
As agents move from suggestions to actions, NVIDIA is pushing a security model where each agent runs in its own sandbox and every outbound call is checked against policy before it reaches files, networks, or credentials. That is the job of OpenShell, now integrated into GitHub Copilot and released as open source under Apache 2.0.
The other half of the story is the datacenter. Microsoft says Fairwater Wisconsin is live and validated for NVIDIA Vera Rubin, which can slot into Azure without retrofits. NVIDIA says the platform can deliver up to 10x inference throughput per megawatt and cut cost per agentic token by an order of magnitude, while Confidential Computing protects models and data at scale.
- OpenShell is model-agnostic
- Policies are written as code and versioned in the repo
- Vera Rubin works alongside Blackwell in Azure data centers
How to decide
If you want local development and fast iteration, start with RTX Spark. If your team needs heavier Windows-based model work, DGX Station for Windows is the stronger fit. If your priority is enterprise orchestration across data, governance, and model choice, Microsoft Foundry and Fabric are the center of gravity.
For regulated or latency-sensitive deployments, Azure Local with RTX PRO 6000 Blackwell Server Edition is the better path. If your concern is agent safety, OpenShell matters most. If you are planning large-scale inference infrastructure, Vera Rubin and the AI factory stack are the pieces to watch.
// Related Articles
- [IND]
Google Gemini outage hits users with error 1076
- [IND]
NVIDIA’s Hugging Face hub is built for AI pipelines
- [IND]
Anthropic’s survey turns AI anxiety into policy
- [IND]
ChatGPT grew from chatbot to platform
- [IND]
OpenAI Files Confidential IPO After $122B Round
- [IND]
Government access orders should govern frontier model access