NVIDIA is once again pushing the envelope in AI server tech, shifting from its just-launched Blackwell Ultra GB300 units to preparing for an even bolder leap: the Vera Rubin AI server architecture. With an almost absurdly short product cycle of just six to eight months, Team Green is moving at breakneck speed, setting a pace the rest of the industry can only watch from the sidelines.
According to Taiwan Economic Daily, NVIDIA is finalizing designs for its next-gen Rubin server racks, expected to be handed over to suppliers by the end of the month.
These won’t hit the market until 2026 or 2027, but the groundwork is already in motion, showing how far ahead Jensen Huang and his team are playing the game.
Rubin promises to be a complete rethinking of AI server design. It’s built around next-gen HBM4 memory to power the R100 GPUs, with a significant performance leap over today’s HBM3E. It’ll be fabricated using TSMC’s advanced 3nm (N3P) process and feature CoWoS-L packaging for even more efficient scaling. But what really sets Rubin apart is NVIDIA’s first move into chiplet architecture – using a bold 4x reticle design, a notable step up from Blackwell’s 3.3x approach.
This marks a monumental shift for NVIDIA, comparable to the leap from Ampere to Hopper. Rubin isn’t just about performance gains – it’s about reshaping the entire infrastructure stack, from memory to interconnects. However, concerns remain. With such fast-paced releases, can the supply chain – or even NVIDIA’s own internal teams – realistically keep up? Blackwell’s rocky early deployment using the older Bianca board hints at possible turbulence ahead.
Still, NVIDIA doesn’t seem interested in slowing down. Rubin shows a clear commitment to keeping the AI race firmly in its grip, proving once again that only Jensen dares to set this kind of relentless standard.