The previous Radeon HD 6800-series let us down in two ways – firstly it sounded like it’d be more than a mid-range GPU that went toe-to-toe with Nvidia’s GeForce GTX 460 cards, and secondly it wasn’t all that new. But while the Barts GPU of the HD 6800 looked more like an overclocked HD 5830 with a tweaked front-end unit, the Cayman GPU of the Radeon HD 6900 is a completely new design throughout.
Before we plunge into the details of the new GPU, let’s clear up the naming. We’re sticking with calling these cards ATI Radeons because that’s what most people know them as, and AMD has said that it’s fine with its partners gradually transitioning from ATI to AMD Graphics until 2011. We still plan to switch when AMD’s Fusion APU is launched in early 2011, as it’ll be silly to talk about a single piece of silicon that has both AMD and ATI technology.
Previously, ATI has reserved the HD x900 family branding for its dual-GPU cards, to give customers a clearer indication that a dual-GPU card should be much faster than a single-GPU HD x800-series card. However, because ATI needed the HD 5700-series to co-exist with the HD 6000-series, ‘Radeon HD 6800’ was already taken by the Barts cards. The Radeon HD 6950 2GB and Radeon HD 6970 2GB are therefore both single-GPU cards. There is a plan to release a dual-GPU card based on two Cayman GPUs, which we assume will be called the Radeon HD 6990 4GB. With the names out of the way, let’s get stuck into what’s inside the silicon that makes Cayman worthy of its HD 6900 name.
A Dual Front-End Design
While Nvidia went nuts with its Fermi design, breaking apart the elements of a typical GPU front-end and scattering them throughout the chip, ATI has been much more reserved. However, we’ve been expecting ATI to get a bit more radical for a while – after all, a GPU with only one tessellator and one setup engine is starting to look a bit anachronistic these days. While the Barts GPU merely had a tessellator upgrade (to what ATI is calling its 8th Gen Tessellator, which allows off-chip buffering), the Cayman design of the Radeon HD 6900-series has two entire front-end units.
There are some obvious advantages to having two front-end engines: you get two setup engines, two 8th Gen Tessellators, two geometry engines and the ability to send twice as much work per clock to the stream processors as before. ATI claims that the HD 6970 2GB has up to three times the tessellation rate of the HD 5870 1GB. These two front-end units can also load-balance the work flowing to them, and ATI has implemented ‘asynchronous dispatch’.
The new Cayman GPU of the Radeon HD 6970 2GB has two Front-End Engines.
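To get a feel for what load-balancing across two front-end units could mean in practice, here is a minimal toy model in Python. It is only a sketch under our own assumptions: the greedy "give the next batch to whichever engine is free first" policy, the function name `cycles_to_drain` and the per-batch cycle costs are all invented for illustration, and none of it describes how ATI's hardware actually schedules work.

```python
import heapq


def cycles_to_drain(batch_costs, engines):
    """Return the cycle at which the last batch finishes.

    Toy model (an assumption, not ATI's scheduler): each batch is handed
    to whichever front-end engine becomes free first.
    """
    if engines < 1:
        raise ValueError("need at least one engine")
    # Min-heap holding the cycle at which each engine next becomes free.
    free_at = [0] * engines
    heapq.heapify(free_at)
    for cost in batch_costs:
        earliest = heapq.heappop(free_at)
        heapq.heappush(free_at, earliest + cost)
    return max(free_at)


# Hypothetical per-batch setup costs, in cycles.
even_batches = [4, 4, 4, 4, 8, 8]
print(cycles_to_drain(even_batches, engines=1))  # 32
print(cycles_to_drain(even_batches, engines=2))  # 16

# One oversized batch limits how much a second engine can help.
lumpy_batches = [10, 1, 1]
print(cycles_to_drain(lumpy_batches, engines=1))  # 12
print(cycles_to_drain(lumpy_batches, engines=2))  # 10
```

In this model a second engine halves the drain time only when the work splits evenly; a single large batch caps the gain, which is one reason a figure such as ‘up to three times’ should be read as a best case.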
Asynchronous dispatch is similar to the ability of Nvidia’s Fermi GPUs to work on two distinct kernels simultaneously. However, ATI says that its technology is ‘completely new in the marketplace’ as it allows multiple, different programs to execute on the GPU at the same time. ‘It’s not like other solutions where you [only] have one program that can spawn multiple kernels to run on the graphics card, you can genuinely have multiple, different applications running on the GPU at the same time,’ Dave Bauman, Senior Product Manager, told us.
This should make the GPU more flexible for general-purpose work – the GPU will act more like a modern CPU. Its two bidirectional DMA (Direct Memory Access) engines are also pitched as enhancing its GPU Compute capabilities, as they allow two simultaneous reads or writes, or a simultaneous read and write, per unit.
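As a rough way to picture the difference between programs taking turns on the GPU and programs genuinely running side by side, consider the toy model below. Everything in it is an assumption made for illustration: the job names, the work figures, the 10-unit GPU and especially the equal-split sharing policy are invented, and ATI has not told us how Cayman actually divides its resources between applications.

```python
def finish_times_exclusive(jobs, units):
    """Programs take turns: each one owns every unit until it is done."""
    clock = 0.0
    done = {}
    for name, work in jobs:
        clock += work / units
        done[name] = clock
    return done


def finish_times_shared(jobs, units):
    """Programs run side by side, splitting the units equally.

    The equal split is a made-up policy for this sketch only.
    """
    remaining = dict(jobs)
    clock = 0.0
    done = {}
    while remaining:
        share = units / len(remaining)
        # The program with the least work left is the next to finish.
        name = min(remaining, key=remaining.get)
        step = remaining[name] / share
        clock += step
        for other in remaining:
            remaining[other] -= step * share
        del remaining[name]
        done[name] = clock
    return done


# Hypothetical workloads, measured in unit-cycles.
jobs = [("game", 900.0), ("scanner", 100.0)]
print(finish_times_exclusive(jobs, units=10))
print(finish_times_shared(jobs, units=10))
```

With these made-up numbers the small job finishes at cycle 20 when sharing rather than cycle 100 when queued behind the large one, while the large job slips from cycle 90 to cycle 100. The total amount of work done is identical; what changes is who waits.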
However, we’re unsure of the need for such advanced capabilities when it comes to gaming. A game runs as a single application, meaning that Nvidia’s technology is perfectly adequate – a game can invoke a DirectCompute kernel and throw it, as well as DirectX shader code, at a Fermi GPU without concern.
Where the ATI technology will be useful is if many different applications try to use your graphics card, especially if they don’t have the courtesy to wait until you’ve finished gaming before doing so. It’s not impossible that anti-virus applications could be written in OpenCL, for example, as virus scanning