Nvidia is building everything for the age of agents

From the rack to the desk, Jensen Huang reframed the entire stack around agentic AI.

Photo Credit: Paul Mah.

We're moving into the era of agentic AI, says Nvidia's Jensen Huang in his annual keynote at GTC Taipei. Here are the three most important things he shared today.

Rubin goes live

Vera Rubin is ramping into full production, says Jensen. The next-generation AI chip is slated to replace Blackwell and will eventually reach 600kW per rack over several design iterations. The good news is that assembly is radically simpler. Thanks to a hose-free, fanless, modular tray design, each compute tray now takes just five minutes to assemble, down from two hours.

Vera Rubin isn't a single product but a set of five purpose-built racks operating as a massive AI supercomputer within the data centre. It comprises the Vera Rubin NVL72, the Vera CPU, the Rubin CPX (under a non-exclusive licensing deal), BlueField-4 STX storage, and Spectrum-6 Ethernet.

Where Blackwell was built for AI, Vera Rubin is purpose-built for agents, a point Jensen reiterated several times. He says Vera Rubin will offer 10 times the agent throughput of Grace Blackwell, and up to 10 times lower cost per token.

Nvidia's CPU play

Not content with current CPU designs, Nvidia has also built a fully custom data centre CPU core called Olympus, which powers the new Vera CPU and is pitched as a "CPU for agents." According to Jensen, old CPUs are built for humans, whereas agents are far more impatient and latency-sensitive.

To support agentic AI, the new Vera CPU was designed around single-thread performance, bandwidth, and spatial multi-threading. This gives it a memory subsystem with 40% lower peak memory latency and three times the per-core bandwidth of traditional data centre CPUs. The result is 3x faster SQL data processing and 6x faster real-time streaming processing.

The PC, reinvented

Finally, Nvidia wants to reinvent the PC. The upcoming RTX Spark will fuse a Blackwell RTX GPU with a custom 20-core Grace CPU, co-designed with MediaTek. It packs 6,144 CUDA cores, up to one petaFLOP of AI performance (FP4), up to 128GB of unified memory, and 600GB/s of GPU-to-CPU bandwidth. And it will run Microsoft Windows.

Why build such powerful AI capabilities on end-user systems? Jensen is betting that the era of agentic AI doesn't stop at the data centre. It will also run on the device on your desk. He thinks your next PC will feel less like a tool and more like R2-D2.

Do you buy it? The new RTX Spark machines start arriving in September.