Nehalem: A whole new processor concept.
If you thought the Core microarchitecture was a vast change from the NetBurst Pentium 4 range, just wait until you get a look at what Nehalem has in store! With AMD ramping up the game as it seeds Fusion and other technologies to integrate more into the CPU core, we all wondered how Intel was going to react.
While the exact details are still to be confirmed, we’ve learnt that there are a lot of changes in store for Intel's upcoming platform, and that perhaps the ideas and methods adopted by the green camp weren’t so bad after all.
Firstly, Nehalem will arrive in Q2 2008 and is being designed from the ground up on the 45nm process. Intel has confirmed it will contain a variant of the Hyper-Threading technology previously seen on the Pentium 4 CPUs, although it won’t be a hacked-on addition in response to an expected poor IPC and long pipeline, as it was in the NetBurst days. SMT (Simultaneous Multithreading) is being optimised to make use of the many cores and shared cache in a way that “intelligently” uses the available resources.
Intel is aiming for a scalable performance and core structure, including 8+ cores running 16+ threads. What gets very interesting is that Intel describes Nehalem as having a multi-level shared cache architecture, without specifically naming something along the lines of the shared L3 cache that AMD’s next-generation Barcelona will have.
Integrated memory controller... on an Intel?
Say goodbye to the northbridge, because Nehalem will integrate the memory controller into the CPU core. Intel is finally ready to do what AMD has been doing for years with the K8 architecture - incorporate an on-die memory controller, to lower memory access latencies, reduce power consumption of the whole platform and make designing future motherboards far easier.
This could be a marketing nightmare for Intel’s PR, and the green camp is going to be rolling around the floor in fits of glee at this news, but respect to Intel for ultimately biting the bullet and making the right choice. That said, Intel was in a similar situation when it created the Pentium M and had to convince the market that MHz wasn’t the only performance rating that mattered after years of preaching the contrary – and that turned out to be one of the most successful moves for Intel in recent history.
By combining the architectural power of Core with an incredibly low-latency memory controller and some high-bandwidth DDR3, we should see massive gains in multi-core applications that are now suddenly freed of the northbridge front side bus (FSB) limitation.
Traditionally, Intel CPUs in a multi-core scenario have had to queue and wait for the northbridge to serve commands to the memory, with latency getting progressively worse with every CPU you add.
Adding larger and larger L2 caches (or L3 in the case of Xeons) helps reduce the need to access memory to an extent, but ultimately that approach couldn’t last, especially with the multi-core, multi-socket platforms of the future. AMD Opterons scale exceptionally well in this respect, as every CPU has its own memory it can talk to, as well as talking to the other CPUs through Hyper-Transport.
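To put rough numbers on that scaling argument, here is a minimal sketch of a toy queueing model. It is not based on any Intel or AMD figures: the 60ns service time, the 40ns socket-to-socket hop and the 25 per cent share of remote accesses are all hypothetical values picked purely to show the trend, and the function names are our own.

```python
"""Toy model: one shared memory controller vs. one controller per CPU.

Every number here is a hypothetical placeholder, not a measured figure.
"""


def shared_controller_latency(n_cpus, service_ns=60.0):
    """Average latency when all CPUs queue at one northbridge-style controller.

    Assumption: every CPU issues one request at the same instant and the
    controller serves them strictly one at a time, so the k-th request in
    the queue completes after k service times.
    """
    completion_times = [k * service_ns for k in range(1, n_cpus + 1)]
    return sum(completion_times) / n_cpus


def integrated_controller_latency(n_cpus, service_ns=60.0,
                                  remote_fraction=0.25, hop_ns=40.0):
    """Average latency when every CPU has its own controller and memory.

    Assumption: local requests never queue behind other CPUs; a fixed
    fraction of requests target another socket's memory and pay one
    extra point-to-point hop.
    """
    if n_cpus == 1:
        return service_ns  # nothing remote to reach with a single CPU
    return service_ns + remote_fraction * hop_ns


if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        shared = shared_controller_latency(n)
        integrated = integrated_controller_latency(n)
        print(f"{n} CPUs: shared {shared:.0f}ns, integrated {integrated:.0f}ns")
```

In this simplified model the shared controller's average latency climbs with every CPU added, while the per-CPU design stays flat apart from the hop penalty. Real systems are far messier (caches, prefetching and bus arbitration all muddy the picture), so treat it as an illustration of the trend only.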
While there won’t be a “front side bus” in the traditional sense, Intel is still currently using that term in order to differentiate itself from AMD. It has commented that it will use a form of PCI-Express’ ultra fast, point-to-point serial link technology to talk to the memory.
Although this sounds a lot like Hyper-Transport, we’re sure that Intel will only use “elements” of the technology, as PCI-Express is tailored towards peripheral interconnect to provide compatibility with older technologies as well as other specific benefits like hot-plugging, scalability, flexibility and data striping, which doesn’t benefit small packets and memory addressing. In comparison, Hyper-Transport offers a low-overhead, dedicated 32-bit packet point-to-point linkage with integrated addressing that is perfectly suited to memory access.
Strangely, integrating the memory controller goes against Intel’s “80 Core Processor” ideal as well, where the company claimed that it wanted to build a CPU of many mini-cores, differentiating products by their core count as opposed to clock speed and cache sizes. This allows far easier scalability in the future, but the problem with adding a memory controller into the CPU core is that each of the new cores now needs to be wired into it. On a motherboard, adding traces is not that hard and new motherboards get made in far greater quantities than new CPU architectures get produced.
This presents the technical challenge of designing a processor die with the “potential” for everything: either use a huge substrate with a separate memory controller chip, or plan a dozen different processor design variations for just one family. Each option has its problems, whether it’s additional cost, lower performance or a phenomenal amount of work. This is partly the reason why AMD can’t just throw another dual core die onto its CPU substrate to make a quick AMD Athlon X4 range, like Intel has done with its current quad core processor.
Regardless, Intel has all the makings of a killer processor that might leave AMD gasping for air, as the Athlon family is due to have its major fundamental performance difference taken away from it.
Integrated graphics... on an Intel?
It recalls the old saying, “if you can’t beat ‘em, join ‘em”, and Intel may have realised that AMD was going to progress down the right track with Fusion, integrating an AMD CPU and ATI graphics core into one die space. As graphics cores become fundamentally more CPU-esque in their general calculation ability, and as including a graphics core on the CPU means less power consumption, lower latency and space savings, this certainly seems to be a logical step forward. With this model you don’t need a separate IGP (integrated graphics processor) on the motherboard, nor the traces that require more space and power. The results could be a lower cost PC, a smaller PC and thinner, cooler, lighter notebooks with longer battery life.
With Intel now pushing the mini-ITX form factor, having an all-in-one CPU will mean that a massive percentage of board real estate is reclaimed for other components. It also means cheaper motherboards, as a manufacturer doesn’t have to include a northbridge and/or graphics processor; those now come with the CPU, which the end user sources at their own cost. For example, a mini-ITX motherboard can currently cost three or four times as much as a moderately larger micro ATX motherboard.
Intel said that the integrated graphics will be in the same vein as its current integrated graphics, which realistically is for non-gaming systems. This is essentially to remain within the TDP envelope, but including any extra graphics processor means that the CPU cores will have to be exceptionally low power. There is a computational advantage however, as the CPU could possibly palm off calculations to the graphics core when it isn’t heavily utilised.
However it eventually turns out, Nehalem looks set to offer an interesting future of optimised low power coupled with increased performance. While AMD Fusion now has a direct competitor, Intel has tipped this as a mid-2008 product and Fusion is due in Q4 2008/Q1 2009. Could Nehalem and Intel beat AMD to market at its own game?