bit-tech.net

AMD Bulldozer and Llano Architectures

AMD Bulldozer Core – High-Performance Computing

If Bobcat APUs are meant to be small, agile and feisty, Bulldozer APUs are designed to plough through whatever work you give them. Interestingly, they’re nothing like Bobcat CPUs, as they sport a beefed-up method of shoving two process threads through one execution core. It’s a bit like Intel’s Hyper-Threading, but don’t tell AMD we said that.

AMD's Bulldozer will adopt a technique similar to Intel's Hyper-Threading. But with knobs on.

The main difference between Bulldozer’s dual-thread processing and Intel’s is that a Bulldozer core duplicates more sub-units than a current-generation Intel core does. AMD’s slide depicts Intel’s Hyper-Threading as having a few tacked-on elements, such as a second register for the second thread, which means that the threads have to be interleaved. Bulldozer APUs have far more duplicated sub-units and so, according to AMD, can process two threads in parallel much more quickly. However, not everything in a Bulldozer CPU core is duplicated, although each execution core will still present two threads to Windows, just as Hyper-Threading does.

Bulldozer APUs are designed to be modular, with each dual-thread execution core able to link into the shared Level 3 cache, memory controller and HT Link unit that completes a Bulldozer processor. Bulldozer APUs are set to replace Magny-Cours-based Opteron 6000-series CPUs.

Bulldozer APUs will have many duplicated sub-units to enhance their ability to process two threads per core simultaneously.

In an interesting statement, AMD claims that "[Bulldozer] delivers 33 per cent more cores and an estimated 50 per cent increase in throughput in the same power envelope as Magny-Cours", which rather undermines its current top-end server CPU. Asked to clarify, AMD said that Bulldozer is more efficient than the Magny-Cours design at delivering more cores in a given area, but didn’t clarify whether that meant execution threads or execution units.
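To put rough numbers on that claim, here is a back-of-envelope sketch rather than anything AMD has published. It assumes the comparison is against a 12-core Magny-Cours (two 6-core dies) and that "cores" means the same thing on both sides, which AMD didn't confirm. The function name and parameters are ours, purely for illustration.

```python
# Back-of-envelope reading of AMD's quoted percentages.
# Assumption: the baseline is a 12-core Magny-Cours part.
MAGNY_COURS_CORES = 12


def scale_claim(base_cores, core_gain=1 / 3, throughput_gain=0.5):
    """Return (new core count, per-core throughput ratio) for the claimed gains."""
    new_cores = round(base_cores * (1 + core_gain))
    # Total throughput rises by throughput_gain; divide by the rise in core count
    # to see how much of the gain comes from each core rather than from more cores.
    per_core_ratio = (1 + throughput_gain) / (new_cores / base_cores)
    return new_cores, per_core_ratio


cores, per_core = scale_claim(MAGNY_COURS_CORES)
print(f"{cores} cores, {per_core:.3f}x throughput per core")
```

On those assumptions, the claim works out to a 16-core part where each core does about 12.5 per cent more work than a Magny-Cours core, with the rest of the 50 per cent coming from the extra cores.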

When quizzed as to whether the increase in core count would put extra strain on the memory controller, AMD tantalisingly said that it’s "correct to say there’s a need to support greater memory performance because of all the cores" but didn’t go as far as to say how that support would be implemented. We’d be surprised if a Bulldozer APU had more than the four memory channels of a Magny-Cours CPU, but not if its memory controller were quicker – Magny-Cours CPUs comprise two 6-core CPU dies, so the quad-channel memory controller is really two dual-channel units split across the two dies rather than one homogeneous mega-controller.
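The memory concern is easy to illustrate with the same kind of rough arithmetic. This sketch assumes a 16-core Bulldozer part (12 cores plus the quoted 33 per cent) that keeps Magny-Cours' four channels. Both figures are our guesses rather than confirmed specifications, and the helper name is hypothetical.

```python
def channels_per_core(channels, cores):
    """Memory channels available per core, as a crude proxy for bandwidth share."""
    return channels / cores


magny_cours = channels_per_core(4, 12)  # two dual-channel controllers, 12 cores
bulldozer = channels_per_core(4, 16)    # assumed: same four channels, 16 cores

# How much faster each channel would need to run to keep per-core bandwidth level.
speedup_needed = magny_cours / bulldozer
print(f"Per-channel speed-up needed: {speedup_needed:.2f}x")
```

If those assumptions hold, each channel would need to be roughly a third faster just to stand still on a per-core basis, which fits AMD's comment about needing greater memory performance.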

AMD's Bulldozer uses a modular design, which should make it relatively easy to design different variants of the APU quickly.

AMD seemed to indicate that Bulldozer APUs will work in current-generation server and workstation motherboards after a BIOS update, but that a new socket or chipset might be required to unlock all of the features or power-saving capabilities. However, we’d wait until this information is verified before betting our next server-room overhaul on it.

AMD Llano Core – Desktop PCs

Of all the Fusion CPUs, Llano is the most interesting to the majority of bit-tech readers, and also the most disappointing. Remember how on page one we said that Fusion wasn’t just a Phenom with a GPU slapped on? Well, that wasn’t entirely true. The AMD chaps did say that to call Llano merely a Phenom with a GPU bolted on "underestimates the improvement to the uncore area. But the CPU core is K8."

Remember, AMD's current K10 architecture, which is used in Phenom IIs, is an updated K8 (yes, the same K8 that launched in 2003 with the Athlon 64), so don't expect massive improvements. That said, if AMD is updating the uncore area, there will likely be better power management via power-gating as well as clock-gating. The uncore area of a Phenom CPU also contains the memory controller and HT Link, so we could see upgrades there, such as a triple-channel memory controller or an improved HyperTransport bus.

While Bobcat and Bulldozer sound brilliant, Llano... doesn't.

What’s even more disappointing is that while Bobcat and Bulldozer APUs have a pretty tight schedule of very early 2011 and early 2011 respectively, Llano’s launch date is much further away and vaguer. AMD told us to expect the desktop APU in the second half of next year. When asked whether it would update the Turbo Core speed-boost technology of the Thuban-based Phenom II X6 1090T Black Edition, AMD answered evasively that it practised "continual refinement of power management and of hardware to shift performance to where it’s needed."

At least Llano will also be made on the 32nm manufacturing process, with high-K and metal gate transistors. It’ll also use the power gating of Bobcat and Bulldozer APUs to keep the power consumption per execution core to between 2.5W and 25W. The GPU will likely come from ATI’s Evergreen range, so it will be DX11-compatible, as AMD showed (then hid) a while ago. As with the Bulldozer APUs, AMD seemed to indicate that Llano APUs will work in a Socket AM3 motherboard (after a BIOS update) but that new sockets and chipsets might be needed to enable all the features and power management.
