AMD has shattered expectations again with its latest Adrenalin Edition 25.8.1 driver, unlocking up to 96 GB of system RAM as graphics memory for integrated GPUs on consumer machines.
That’s not just a bump; it’s a breakthrough that opens the door to running models of up to 128B parameters, such as Meta’s Llama 4 Scout, right on your own PC, no cloud subscription required.
Thanks to the upgraded Variable Graphics Memory (VGM) system, AMD’s Ryzen AI MAX+ APUs can now dynamically convert system RAM into graphics memory, enabling local deployment of high-performance AI assistants. Llama 4 Scout, built on a Mixture of Experts (MoE) architecture, activates only about 17B of its 109B parameters per token, making it a perfect fit for this setup: responsive, powerful, and surprisingly lightweight.
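To see why the 96 GB ceiling matters, here is a minimal back-of-the-envelope sketch in Python. The 109B total and 17B active parameter counts are Scout’s published figures; the quantization bit widths are illustrative assumptions, not AMD’s stated configuration.

```python
# Back-of-the-envelope memory math for running Llama 4 Scout locally.
# The quantization levels below are illustrative assumptions, not AMD's
# stated configuration for this driver release.

GIB = 1024**3

def weight_footprint_gib(params: float, bits_per_weight: float) -> float:
    """Approximate size of model weights at a given quantization level."""
    return params * bits_per_weight / 8 / GIB

TOTAL_PARAMS = 109e9   # Llama 4 Scout total parameters (MoE)
ACTIVE_PARAMS = 17e9   # parameters actually activated per token

VGM_BUDGET_GIB = 96    # memory the 25.8.1 driver can assign to the iGPU

for bits in (4, 8, 16):
    total = weight_footprint_gib(TOTAL_PARAMS, bits)
    active = weight_footprint_gib(ACTIVE_PARAMS, bits)
    fits = "fits" if total <= VGM_BUDGET_GIB else "does NOT fit"
    print(f"{bits:2d}-bit: all weights ~{total:5.1f} GiB ({fits} in 96 GiB), "
          f"active per token ~{active:4.1f} GiB")
```

At a typical 4-bit quantization the full weight set lands around 50 GiB, comfortably inside the new 96 GB pool, while 8-bit already overflows it, which is roughly why this driver release is the tipping point for local deployment.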
But that’s not all. AMD’s edge AI push brings a striking leap in context handling: support for sequences up to 256,000 tokens, dwarfing the 4K-token windows typical of consumer setups. For developers, power users, and AI enthusiasts, that means longer conversations, richer prompts, and fewer compromises.
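To make the context claim concrete, here is a rough KV-cache sizing sketch. The layer count, KV-head count, and head dimension below are assumed, representative values for a ~100B-class transformer, not Scout’s official architecture.

```python
# KV-cache sizing sketch: why long contexts need a big memory pool.
# Layer/head dimensions are representative assumptions for a ~100B-class
# MoE model, not official Llama 4 Scout numbers.

GIB = 1024**3

def kv_cache_gib(seq_len: int, n_layers: int = 48, n_kv_heads: int = 8,
                 head_dim: int = 128, bytes_per_elem: int = 2) -> float:
    """Keys + values cached for every layer at every position (fp16)."""
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem * seq_len / GIB

for ctx in (4_096, 32_768, 262_144):   # 4K default vs. the new 256K ceiling
    print(f"{ctx:>7} tokens -> ~{kv_cache_gib(ctx):6.2f} GiB of KV cache")
```

Under these assumptions, a full fp16 cache at 256K tokens would by itself approach the entire 96 GB pool, which is why long-context local runs usually quantize the KV cache alongside the weights.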
Of course, this future isn’t cheap. Devices running the flagship Strix Halo platform still hover above $2,000, hardly mass-market pricing. But as one commenter put it, AMD’s vision makes “AI power” more accessible. Even if that power currently demands a thick wallet, the direction is clear: AI PCs are becoming a real thing.
While some users are already dreaming of future 3nm APUs with LPDDR6X and up to 1TB of RAM, what’s available today is already game-changing. And Lisa Su’s confidence in AMD’s trajectory seems to be translating into investor interest too, especially with Intel’s performance stumbles making AMD look even sharper.
In short, AMD’s new driver isn’t just an update; it’s a declaration. LLMs on laptops are here. The AI PC era has officially begun.