Intel Penryn (45nm) dual- and quad-core benchmarks - Conroe and Kentsfield flayed and sent packing

by Tarinder Sandhu on 18 April 2007, 12:57

Tags: Penryn, Intel (NASDAQ:INTC)

Quick Link: HEXUS.net/qaihv


Results

Please head back to our Penryn discussion, as the enhancements to the architecture play a key part in explaining the increase in performance.

Both Penryn-based CPUs operate at a ~13.5 per cent higher clock speed and 25 per cent higher FSB, albeit with identical memory speed and timings. We need to factor out those differences and evaluate what the extra enhancements offer in the way of additional performance.
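As a rough way of factoring out the clock difference, the sketch below divides an observed speedup by the clock ratio to leave the part not explained by frequency alone. It assumes performance scales linearly with clock speed, which is a simplification, and it does not separate out the FSB contribution. The function name and the 25 per cent input are illustrative only and are not measurements from this article.

```python
def residual_gain(observed_speedup, clock_ratio):
    """Return the speedup left over after dividing out pure clock scaling.

    Assumes performance scales linearly with clock frequency, which is a
    simplification; FSB effects are not separated here.
    """
    return observed_speedup / clock_ratio


# ~13.5 per cent higher clock speed, as quoted in the text above.
CLOCK_RATIO = 1.135

# Illustrative input (an assumption): a benchmark running 25 per cent faster.
gain = residual_gain(1.25, CLOCK_RATIO)
print(f"Gain not explained by clock speed: {(gain - 1) * 100:.1f} per cent")
```

Anything left over after this division is what the higher FSB and the architectural enhancements have to account for between them.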



Starting off with an H.264 encoder that isn't optimised for the Penryn's SSE4 ISA, we see that quad-core Yorkfield's advantage over the best that Kentsfield has to offer is more than the pure clock and FSB increase would suggest. The 2.93GHz quad-core model is around 22 per cent slower, more than expected from the difference in speeds, and the Penryn's Super Shuffle feature, which aligns incoming data in intelligent fashion, helps provide most of the additional gain here.



Our perennial favourite, Cinebench R9.5, shows a 25 per cent gain when switching between Kentsfield and Yorkfield, and we can attribute the majority of the architecture-related performance gain to the implementation of the Radix-16 divider, which helps speed up floating-point division and square-root operations.


Moving on to the unreleased R10 version, the gap is more pronounced. This benchmark is more sensitive to cache and FSB, so Penryn wins out again.



The above graph may appear to be erroneous. You may wonder how a dual-core Penryn is able to spank a quad-core processor that is the undisputed champion of video encoding. The secret sauce for blistering performance in this instance is SSE4. DivX has optimised its latest codec to benefit from the SSE4 support present in Penryn.

In particular, one of the fundamental limitations of video encoding lies with motion estimation, which requires analysing reference frames for matches, and it's the exactness of this search, in the main, which defines the quality of the encoded video and, by inference, file size.

Now, SSE optimisations speed up this motion estimation and, as you can see, SSE4 is rather good at it. That's why the Penryn is able to completely outmuscle the previous champ. In fact, the decoder, and not the encoder, is the limiting factor with the quad-core model.
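To give a feel for why the motion-estimation search is so expensive, here is a minimal sketch of exhaustive block matching using the sum of absolute differences (SAD), a common matching cost. It is an illustration only: we are not claiming that DivX's codec uses this exact metric or search pattern, and the function names and the tiny 4x4 frame are made up for the example. SSE4.1 does include an instruction (MPSADBW) that computes several such sums at once, which is the kind of inner loop that benefits.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized pixel blocks."""
    return sum(
        abs(a - b)
        for row_a, row_b in zip(block_a, block_b)
        for a, b in zip(row_a, row_b)
    )


def crop(frame, top, left, size):
    """Cut a size x size block out of a frame (a list of pixel rows)."""
    return [row[left:left + size] for row in frame[top:top + size]]


def best_match(reference, block, size):
    """Try every position in the reference frame and keep the lowest SAD.

    Returns a (cost, top, left) tuple for the best candidate found.
    """
    best = None
    for top in range(len(reference) - size + 1):
        for left in range(len(reference[0]) - size + 1):
            cost = sad(crop(reference, top, left, size), block)
            if best is None or cost < best[0]:
                best = (cost, top, left)
    return best


# Hypothetical 4x4 reference frame with a distinctive 2x2 patch in the middle.
reference = [
    [10, 10, 10, 10],
    [10, 50, 60, 10],
    [10, 70, 80, 10],
    [10, 10, 10, 10],
]
block = [[50, 60], [70, 80]]

print(best_match(reference, block, 2))  # (0, 1, 1): exact match at row 1, column 1
```

Every candidate position costs one full SAD, so a more exact search means many more of these sums per block, which is why hardware help with the inner loop pays off so visibly.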