Guts of the APU: the new CPU
AMD's legion of enthusiast supporters can rightly grumble - as it stands, the company's best CPUs can't hold a candle to what's on offer from Intel. If it's top-end performance you're after, Ivy Bridge is where the action lies and, truth be told, if you're willing to spend top dollar there's little reason to consider anything else.
And being unable to compete at the highest level has had an unwanted knock-on effect: AMD has lost some of its lustre. While Intel flaunts competition-spanking CPU performance and super-sexy Ultrabooks, AMD is having to make the case for good-value laptops built from mid-range parts that do the job well. AMD's proposition isn't a bad one, not by any means, but it doesn't grab headlines and consequently hasn't had the attention it deserves.
Despite living in the shadow of all things Core, the accelerated processing unit (APU), as AMD likes to call it, has proven to be a muted success story. The current crop - known by the codename Llano - isn't setting benchmarks alight, but it's cheap and effective and for those reasons AMD managed to ship more than 30 million APUs last year - setting a new company record for annual laptop-centric revenue in the process. The APU makes a lot of sense - it's the strength of AMD's CPU and GPU technologies rolled into a single, affordable solution that's ideal for thin-and-light laptops - and AMD's now hoping to build on that early promise with the introduction of a 2012 platform codenamed Trinity.
Thank you, come again
The APU, as we know it today, is primarily forged from the unification of an AMD Phenom II-derived CPU and a Radeon HD 5000-series-class GPU. The combination, existing in all available Llano chips, has proved to function well - the ageing CPU is competent and the Radeon, despite having roots that date back to 2009, has raised the bar for integrated graphics capabilities.
A modestly potent mix, but the two key components aren't exactly cutting edge, and now that AMD's found its feet on these newly-trodden APU paths, it's aiming to bring us up to speed in the most obvious manner: pull out both the existing CPU and GPU segments and replace them with something altogether new.
Looking first at the Trinity APU overview, we can see that out with the old, in with the new is certainly the order of the day. Llano's 'Stars' x86 cores (read: Phenom II) have been replaced by up to four 'Piledriver' x86 cores (read: Bulldozer), while the GPU portion has been bumped up from Evergreen-class (read: Radeon HD 5000 series) to Northern Islands (read: Radeon HD 6000 series).
AMD continues to play catch-up in the fabrication race, so whereas Intel has moved on to 22nm with Ivy Bridge, Trinity remains hewn from a 32nm fabric, resulting in a 246mm² die that's relatively large by 2012's standards. It's a case of evolution as opposed to revolution, and this approach is likely to elicit mixed feelings: Bulldozer didn't quite live up to expectations so Trinity's x86 cores are unlikely to truly dazzle, and the Radeon HD 6000 series, while new in mobile APU terms, has already been superseded on the desktop by top-of-the-line discrete Radeon HD 7000-series cards.
But again, set aside your lust for bleeding-edge performance and you get an idea of what AMD is offering here: an affordable all-in-one package that combines the company's latest CPU microarchitecture and one of the best GPU designs in recent years, together in a single chip that will feature in sleek laptops priced from as little as $599 and, in due course, in desktop PCs.
Enhanced Bulldozer
We know that Trinity is equipped with a Piledriver CPU based on AMD's 32nm Bulldozer architecture, and the two share many similarities. In keeping with Bulldozer design, Piledriver is built using compute modules that each contain two x86 cores that share common resources - including fetcher, decoder and 2MB of L2 cache. Having two such modules means that premium Trinity APUs contain four x86 cores and a maximum of 4MB L2 cache.
You can read more about the architecture in our in-depth Bulldozer review, but in a nutshell, this configuration can result in workloads being assigned to modules inefficiently - two threads split between two dual-core modules, for example, would be more efficient than two threads assigned to a single dual-core module, especially if the two-thread workload demands full access to the shared resources in each module. Potential problems reside largely in software - Windows 7 sees the available Piledriver cores and randomly assigns threads to them - but AMD expects future operating systems to do a better job of tapping into the performance on offer. Windows 8, for example, will feature new scheduler code that enables the software to efficiently utilise Piledriver cores and increase performance by as much as 10 per cent in certain applications.
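To make the scheduling point concrete, here's a minimal sketch in Python. It models a two-module, four-core chip and compares a module-aware placement against one that simply fills cores in numerical order. The core numbering and every function name are hypothetical, invented for illustration; this is a toy model, not a description of how the Windows scheduler is actually implemented.

```python
# Hypothetical model: cores 0-1 sit in module 0, cores 2-3 sit in module 1.
CORES_PER_MODULE = 2
MODULES = 2


def module_of(core):
    """Return the index of the module that a core belongs to."""
    return core // CORES_PER_MODULE


def naive_placement(n_threads):
    """Fill cores in numerical order, ignoring module boundaries."""
    return list(range(n_threads))


def module_aware_placement(n_threads):
    """Give every module one thread before doubling up on any module."""
    order = [
        module * CORES_PER_MODULE + slot
        for slot in range(CORES_PER_MODULE)
        for module in range(MODULES)
    ]
    return order[:n_threads]


def contended_modules(placement):
    """Count modules where two threads compete for the shared resources."""
    counts = {}
    for core in placement:
        module = module_of(core)
        counts[module] = counts.get(module, 0) + 1
    return sum(1 for threads in counts.values() if threads > 1)


if __name__ == "__main__":
    for name, place in (("naive", naive_placement),
                        ("module-aware", module_aware_placement)):
        cores = place(2)
        print(name, cores, "contended modules:", contended_modules(cores))
```

With two threads, the naive placement lands both on module 0, whereas the module-aware version puts one thread on each module so neither has to share its fetcher, decoder or L2 cache. Once all four cores are busy the two approaches are equivalent, which is why any benefit is confined to lightly-threaded workloads.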
But let's be honest, provisos for existing hardware designs aren't what we want to hear, so AMD has imbued the Piledriver core design with a couple of added upgrades. Of note are FMA3 instructions for floating-point scalar and SIMD operations, and a new half-float conversion instruction, F16C.
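For a feel of what these two additions do arithmetically, here's a rough sketch in Python. Python doesn't expose the FMA3 or F16C instructions directly, so the code only emulates their behaviour: a fused multiply-add computes a*b + c with a single rounding step, and a half-float conversion squeezes a value into 16 bits and widens it again. The function names are made up for illustration.

```python
import struct
from fractions import Fraction


def fused_multiply_add(a, b, c):
    """Emulate FMA: compute a*b + c exactly, then round once to a float."""
    return float(Fraction(a) * Fraction(b) + Fraction(c))


def separate_multiply_add(a, b, c):
    """Ordinary multiply then add: the product is rounded before the add."""
    return a * b + c


def to_half_and_back(value):
    """Round a float to IEEE 754 half precision, then widen it again."""
    packed = struct.pack("<e", value)  # 'e' is the 16-bit half-float format
    return struct.unpack("<e", packed)[0]


def half_bits(value):
    """Return the raw 16-bit pattern of the half-precision encoding."""
    return struct.unpack("<H", struct.pack("<e", value))[0]


if __name__ == "__main__":
    a = 1.0 + 2.0 ** -27
    c = -(1.0 + 2.0 ** -26)
    print("separate:", separate_multiply_add(a, a, c))
    print("fused:   ", fused_multiply_add(a, a, c))
    print("0.1 as half float:", to_half_and_back(0.1))
    print("1.0 bit pattern:  ", hex(half_bits(1.0)))
```

In this example the separate multiply and add returns 0.0 because the tiny tail of the product is rounded away before the addition, while the fused version preserves it and returns 2^-54. On the conversion side, 0.1 comes back as 0.0999755859375, a reminder that half floats trade precision for a smaller footprint.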
These additions enable Piledriver to offer greater IPC (instructions per clock), but the most obvious performance gap between Llano and Trinity will come from frequency enhancements: whereas the current crop of mobile Llano APUs ramps up to 2.7GHz on the CPU core, Trinity APUs will hit speeds of 3.2GHz. It's interesting to note, too, that Trinity emulates Llano in featuring no L3 cache - intimating that performance will be tightly linked to maximum DDR3 speed, which remains limited to 1,600MHz on mobile APUs.
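The numbers above lend themselves to some back-of-envelope arithmetic, sketched here in Python. The clock figures come straight from the text; the memory calculation assumes a dual-channel configuration with a 64-bit (8-byte) bus per channel and treats the 1,600MHz figure as 1,600 million transfers per second. Those are illustrative assumptions rather than specifications confirmed here, and the function names are hypothetical.

```python
def clock_uplift(old_ghz, new_ghz):
    """Return the relative peak-frequency increase as a fraction."""
    return new_ghz / old_ghz - 1.0


def peak_bandwidth_gb_s(mega_transfers, channels=2, bus_bytes=8):
    """Theoretical peak memory bandwidth in GB/s.

    Assumes (not confirmed by the article) a 64-bit bus per channel
    and a dual-channel setup by default.
    """
    return mega_transfers * 1e6 * bus_bytes * channels / 1e9


if __name__ == "__main__":
    uplift = clock_uplift(2.7, 3.2)
    print(f"Peak clock uplift: {uplift:.1%}")
    print(f"Peak bandwidth: {peak_bandwidth_gb_s(1600):.1f} GB/s")
```

Under those assumptions the peak clock rises by roughly 18.5 per cent and theoretical memory bandwidth tops out at 25.6GB/s. Neither figure is a performance prediction, since real-world results also depend on IPC and on how long each chip can sustain its peak frequency.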
We'll talk more about models and specifications later in the (p)review, but it's clear to see that the CPU upgrade isn't a massive one. Be it Bulldozer or Piledriver, AMD doesn't yet have what it takes to topple Intel's Ivy Bridge (or Sandy Bridge, for that matter), so it's up to the company's Radeon division to help level the playing field.