A New Architecture
AMD describes Vega as the company's "most sweeping architectural change in five years." In numbers, the Vega 10 GPU features four asynchronous compute engines, four next-gen geometry engines, 64 next-gen compute units, 4,096 stream processors, 256 texture units, 64 render back-ends, 4MB of L2 cache and a 2,048-bit bus to HBM2 memory. Phew!
Vega 10 is the first AMD GPU to leverage the Infinity Fabric interconnect that plays an integral role in current Ryzen CPUs, as well as forthcoming Ryzen APUs, and transistor count sees a 40 per cent increase over Fiji XT, rising to 12.5 billion. A 14nm manufacturing process allows for all this to be packed into a 486mm² die that is nearly 20 per cent smaller than its predecessor, and AMD tells us the extra transistors are a key contributor to higher frequencies.
Indeed, the company touts 'extensive design efforts' to improve timings across the ASIC, and Vega 10's next-generation compute unit (NCU) benefits from shorter wire lengths and an optimised layout that enables core speeds of up to 1.7GHz. The peak speeds don't seem particularly high compared to what the competition has already achieved, but Vega 10 is up to 70 per cent quicker in terms of frequency than Fiji, and 30 per cent quicker than Polaris.
Vega's foundation may appear familiar, yet the goal is to improve efficiency in order to get more from the same number of shaders and ROPs. With this in mind, Rapid Packed Math allows the Vega NCU to handle two 16-bit ops simultaneously for greater throughput during certain workloads. A similar technique is already utilised by the custom AMD GPU featured in the PlayStation 4 Pro games console, and Rapid Packed Math is just one of many console-based initiatives making their way to the desktop GPU.
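To make the idea of packing concrete, here is a minimal Python sketch of two 16-bit floats sharing one 32-bit word, with a single "packed add" producing two results. It is purely illustrative: the function names are our own invention, and real hardware performs both half-width operations in one instruction rather than unpacking in software as this model does.

```python
import struct

def pack_half2(lo, hi):
    """Pack two FP16 values into one 32-bit word (lo occupies bits 0-15)."""
    return struct.unpack("<I", struct.pack("<2e", lo, hi))[0]

def unpack_half2(word):
    """Split a 32-bit word back into its two FP16 lanes."""
    return struct.unpack("<2e", struct.pack("<I", word))

def packed_add(a, b):
    """Software model of a packed add: both FP16 lanes are summed per call."""
    a_lo, a_hi = unpack_half2(a)
    b_lo, b_hi = unpack_half2(b)
    return pack_half2(a_lo + b_lo, a_hi + b_hi)

x = pack_half2(1.5, -2.0)
y = pack_half2(0.25, 4.0)
print(hex(x))                          # 0xc0003e00
print(unpack_half2(packed_add(x, y)))  # (1.75, 2.0)
```

The trade-off is precision: each lane carries only a 10-bit mantissa, which is why the benefit is limited to workloads that tolerate half-precision maths.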
Enhanced SIMD units with double the effective register space and double the peak floating-point rate with 16-bit ops are key to improving NCU efficiency. In addition to adding 40 new instructions to the ISA, AMD is able to further enhance throughput by introducing a new stage in the pipeline called the primitive shader. Sat between surface shader and rasteriser, the primitive shader allows the GPU to better organise its workflow and quickly discard primitives that will never be visible, for faster rendering.
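AMD has not detailed exactly which tests the primitive shader runs, so treat the following as a rough sketch of the general principle rather than a description of Vega's hardware. It assumes screen-space triangles, a counter-clockwise winding for front faces, and two textbook rejection rules (back-facing and zero-area); all names are hypothetical.

```python
def signed_area(tri):
    """Signed area of a 2D triangle; positive when vertices run counter-clockwise."""
    (x0, y0), (x1, y1), (x2, y2) = tri
    return 0.5 * ((x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0))

def cull_primitives(triangles):
    """Drop back-facing (negative area) and degenerate (zero area) triangles."""
    return [t for t in triangles if signed_area(t) > 0.0]

triangles = [
    ((0, 0), (4, 0), (0, 3)),  # counter-clockwise, area 6: kept
    ((0, 0), (0, 3), (4, 0)),  # clockwise, i.e. back-facing: discarded
    ((1, 1), (2, 2), (3, 3)),  # collinear, zero area: discarded
]
visible = cull_primitives(triangles)
print(len(visible))  # 1
```

The earlier a triangle is rejected, the less work the rasteriser and pixel pipeline have to do on geometry that contributes nothing to the final frame.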
Throughout Vega's development, the RTG team has been keen to stress that GPU processing power is rapidly outpacing the growth in memory capacity. With AMD asserting that memory requirements are key, Vega 10 ships with second-generation high-bandwidth memory (HBM2) as standard. HBM2 offers double the bandwidth per pin of HBM1 and eight times the capacity per stack, and Vega officially supports up to 16GB of memory via a 2,048-bit bus.
We've already seen professional Vega cards ship with the full complement of memory, but on the consumer front both Vega 64 and Vega 56 will launch with a more cost-effective 8GB frame buffer. A slight variation in memory clock speeds means that memory bandwidth for the two GPUs is rated at 484GB/s and 410GB/s, respectively. It's also worth noting that Vega's render back-ends are now clients of the L2 cache, which itself has doubled in size to 4MB in order to achieve more efficient output and reduce the need to move data around.
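Those bandwidth figures follow directly from bus width multiplied by per-pin data rate. As a quick sanity check, the snippet below assumes effective HBM2 rates of 1.89Gbps per pin for Vega 64 and 1.6Gbps per pin for Vega 56; these rates are our assumption based on the cards' published memory clocks and are not quoted above.

```python
BUS_WIDTH_BITS = 2048  # Vega 10's HBM2 interface width

def bandwidth_gb_s(pin_rate_gbps, bus_width_bits=BUS_WIDTH_BITS):
    """Peak bandwidth in GB/s: bits per transfer times rate, divided by 8 bits per byte."""
    return bus_width_bits * pin_rate_gbps / 8

# Assumed per-pin data rates in Gbps (not stated in the article)
vega64 = bandwidth_gb_s(1.89)
vega56 = bandwidth_gb_s(1.60)
print(round(vega64), round(vega56))  # 484 410
```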
HBM2 is merely a starting point for AMD's march on large-scale compute, and when you listen to the firm's engineers extol the virtues of the Vega architecture, it's hard not to notice a greater emphasis on non-gaming workloads.
One example is the theoretical potential of the high-bandwidth cache controller (HBCC), which allows the GPU to address spare system memory as an additional large pool of cache. Page-based memory management ensures optimal distribution of resources, and though AMD intimates that game developers will be able to create larger, more-detailed worlds without worrying about frame-buffer limitations, HBCC holds greater value for professional users. Real-time rendering with terabytes of assets is a genuine boon for content creators, making HBCC a key selling point for Radeon Pro hardware.
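For a feel of what page-based management means in practice, consider this toy model in which local video memory holds a fixed number of pages and the rest of the data set lives in system memory. AMD has not published HBCC's actual replacement policy, so the least-recently-used eviction here is our assumption, as are the class and method names.

```python
from collections import OrderedDict

class PageCache:
    """Toy model: a small, fast pool of pages backed by a larger, slower one."""

    def __init__(self, capacity_pages):
        self.capacity = capacity_pages
        self.resident = OrderedDict()  # page id -> True, oldest first
        self.hits = 0
        self.misses = 0

    def access(self, page_id):
        if page_id in self.resident:
            # Page already in local memory: mark it as most recently used
            self.resident.move_to_end(page_id)
            self.hits += 1
            return
        # Page must be fetched from system memory
        self.misses += 1
        if len(self.resident) >= self.capacity:
            # Assumed policy: evict the least recently used page
            self.resident.popitem(last=False)
        self.resident[page_id] = True

cache = PageCache(capacity_pages=2)
for page in [0, 1, 0, 2, 1]:
    cache.access(page)
print(cache.hits, cache.misses, list(cache.resident))  # 1 4 [2, 1]
```

The point is that only the pages actually being touched need to sit in fast memory, so a working set far larger than the frame buffer can still be addressed.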
HBCC is one for the future - it is disabled by default and must be enabled via the Radeon software if you choose to experiment - but what's more interesting for today's gamers is the available feature set. Vega 10 offers the most comprehensive support for the DirectX 12 and Vulkan APIs of any GPU to date, and there are improvements to the display engine, too.
Appreciating that Vega is key to driving FreeSync sales, AMD has ensured that Vega 10 supports DisplayPort 1.4 and HDMI 2.0, offering the ability to drive a trio of 4K HDR displays at 60Hz, a single 4K HDR display at 120Hz or a 5K HDR display at 60Hz. Choose not to take the HDR route and Vega 10 can drive an 8K panel at 80Hz or up to six 4K screens at 60Hz apiece.
Vega's architectural underpinnings have been a known quantity for some time, but the reference design and benchmarks are ripe for exploration. Let's take a look at some graphics cards, shall we?