bit-tech.net

AMD launches 6-core Istanbul Opteron processor

Can you see the six execution cores on this die shot? Give yourself a prize if you can!

AMD has just launched its first six-core CPUs, the Opteron 2400-series and the Opteron 8400-series, five months earlier than originally planned. The new CPUs have six execution cores, with clock speeds ranging up to 2.6GHz. Based on the Istanbul architecture, the new Opteron fits in AMD’s current Socket F (1207) CPU socket, and offers ‘30 per cent more performance per watt’ according to AMD.

The new CPU can be used in 2P, 4P and 8P configurations. AMD also bills the new chips, such as the Opteron 2435, as the ‘industry’s only six-core processors with Direct Connect Architecture’, a swipe at Intel’s six-core ‘Dunnington’ Xeon 7400-series, which has to use a single shared memory controller in the Northbridge.

The Opteron 2400-series can be deployed as a single CPU or in systems with 2, 4 or more sockets, and is set to range from $445 to $1,019 per CPU. The 8400-series can only be installed in multi-CPU motherboards and will range from $1,514 to $2,649. Every CPU has the same cache configuration (though AMD hasn’t revealed what this is) and ‘will be widely available from the beginning of June 2009 with HE, SE and EE versions’.

The Istanbul architecture introduces a new technology called ‘HT Assist’, which aims to reduce cache probe traffic between CPUs. When processing a piece of data, a CPU in a multi-CPU configuration has to probe the other CPUs to ascertain whether a copy of that data held in their caches is more up to date than the one held in main memory. Without HT Assist, probes are broadcast to all other CPUs, and the CPU that sent the probe must wait for all of them to respond before it can proceed. In a 4P configuration, AMD says this can take around ten actions. With HT Assist enabled, each CPU knows where to go for the most up-to-date version of the data, and so only two actions are required.

HT Assist prevents wasted CPU clock cycles by eliminating the need to broadcast probes
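
To make the difference between broadcasting probes and consulting a directory concrete, here is a minimal Python sketch. It is a toy model only: the names (Node, read_broadcast, read_directed) are hypothetical, the protocol is heavily simplified, and the message counts it produces come from this model rather than from AMD, so they are not meant to reproduce the ‘ten actions’ and ‘two actions’ figures quoted above.

```python
# Toy model of cache probing in a multi-socket system.
# Every name and every message count here is an illustrative assumption,
# not a description of AMD's actual coherence protocol.

class Node:
    def __init__(self, node_id):
        self.node_id = node_id
        self.cache = set()     # addresses this CPU currently caches
        self.directory = {}    # used on the home node: address -> caching node id


def read_broadcast(requester, home, nodes, addr):
    """No directory: every CPU other than requester and home is probed."""
    messages = 1                       # request: requester -> home
    for node in nodes:
        if node is not requester and node is not home:
            messages += 2              # one probe out, one response back
    messages += 1                      # data delivered to the requester
    requester.cache.add(addr)
    return messages


def read_directed(requester, home, nodes, addr):
    """With a directory: the home node probes at most one known holder."""
    messages = 1                       # request: requester -> home
    holder = home.directory.get(addr)
    if holder is not None and holder not in (requester.node_id, home.node_id):
        messages += 2                  # one targeted probe, one response with data
    else:
        messages += 1                  # no remote copy recorded: home supplies data
    home.directory[addr] = requester.node_id   # record the new holder
    requester.cache.add(addr)
    return messages


# A 4P system in which node 2 holds the line that node 1 wants to read.
nodes = [Node(i) for i in range(4)]
home, requester, holder = nodes[0], nodes[1], nodes[2]
addr = 0x1000
holder.cache.add(addr)
home.directory[addr] = holder.node_id

print(read_broadcast(requester, home, nodes, addr))  # 6 in this toy model
print(read_directed(requester, home, nodes, addr))   # 3 in this toy model
```

The point of the sketch is the scaling: the broadcast count grows with the number of sockets, while the directed count stays constant because the home node only contacts the one CPU its directory names.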

Curious to know more about how HT Assist works, we asked Pat Patla, Vice President and General Manager, Server and Workstation Business, AMD (how big are his business cards?) to explain. We also asked whether the early arrival of the six-core Opteron had anything to do with Intel stepping up the launch date of its octo-core Nehalem EX-based Xeon CPUs, and about the future of Opteron beyond this launch:

bit-tech: HT Assist seems like a great addition, but how exactly does it work? Is there a lookup table telling each CPU what the other CPUs have in their caches or is it some kind of cache mirroring system or something else entirely?
Patla: The new HT Assist feature reduces memory access latency and cache probe traffic in multi-socket systems by storing a [1MB] “directory” in each CPU’s L3 cache. Directories on each node track addresses mapped to the local DRAM on that node. Each CPU is considered the host of the cache information contained in its L3 directory. For many CPU-to-CPU transactions, the host CPU knows exactly which CPU to probe for the information it needs, eliminating the need to broadcast. For data requests, the directory will be probed to see if data has already been fetched directly to the L1 cache. If data is present, it will cancel the request. This results in a reduction of probe traffic on the HT links, increasing bandwidth available to data traffic and reducing the need to wait for probe responses before forwarding data to the cores. HT Assist sets up “indexes” in the cache on each processor. Prior to HT Assist, the cores had to search each cache to find the latest version of a piece of data. With HT Assist, the core can simply look at the cache registry and see if the data is in there. This saves time and reduces probe filter traffic, thereby increasing memory bandwidth performance by as much as 60%.

bit-tech: Has the launch of Istanbul been brought forward in response to Nehalem EX’s updated launch date?
Patla: Istanbul being pulled in by five months is a result of excellent execution by our design and manufacturing teams, who were able to take it from first stepping of silicon to production. Also, the fact that Istanbul is based on our existing socket infrastructure enables our OEMs to save time on the validation cycles that are normally associated with a new processor that delivers the performance Istanbul can.

bit-tech: We heard that Direct Connect 2.0 will be seen in 2010; can you talk about the improvements we’ll see from that?
Patla: 12-core CPUs, quad-channel integrated memory controllers and 4 HT links will be available with DCA 2.0 (so far we have had 6 cores, a 2-channel integrated memory controller and 3 HT links).

bit-tech: Wow, sounds awesome. Thanks Pat!

Will you be looking to upgrade your server or workstation with some Istanbul CPUs? Let us know in the Forums...
