Instructions? We don't need no stinkin' instructions!
So, we've covered the memory addressing issues. We're done, right? Not even close.
The memory issue is by far the most visible aspect of 64-bit computing, which is probably why it is touted as the only thing that is different. However, being able to use qwords instead of dwords means a lot more than just a longer address of where to stick the data - it means a longer string of instructions explaining what to do with that data, and also a bigger chunk of the data itself.
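To put some numbers on the dword/qword difference, here is a minimal Python sketch. The constant and variable names are our own, chosen purely for illustration.

```python
import struct

# A dword is 32 bits wide and a qword is 64 bits wide.
# (Names below are illustrative, not taken from any particular toolchain.)
DWORD_BITS = 32
QWORD_BITS = 64

# Largest unsigned value each width can hold.
dword_max = 2**DWORD_BITS - 1
qword_max = 2**QWORD_BITS - 1

# struct's "<I" and "<Q" formats pack unsigned 32- and 64-bit integers.
dword_bytes = struct.pack("<I", dword_max)
qword_bytes = struct.pack("<Q", qword_max)

print(len(dword_bytes), "bytes hold values up to", dword_max)
print(len(qword_bytes), "bytes hold values up to", qword_max)
```

The same doubling applies whether the value being held is an address or a piece of data.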
Instructions in the 64-bit desktop world come in the flavour of the x86-64 group, known as AMD64 and EM64T (Intel's version). It is worth noting that Intel actually copied almost all of EM64T from AMD64, so these two sets are pretty much the same. However, neither should be confused with IA-64, which is the Intel Itanium instruction set. Unlike x86-64, which extends the 32-bit x86 instruction set, IA-64 was designed as a 64-bit instruction set from the ground up; AMD has no equivalent processors.
Though AMD had been using x86-64 instructions since 2003, Intel did not bring its first x86-64 compatible chips to life until the Prescott series in the middle of 2004. Since the release of the Core microarchitecture, Intel has renamed its version to Intel 64, and it is supported on all Core microarchitecture chips.
AMD64/Intel 64 brings quite a benefit to the x86 world. For starters, x86-64 contains a huge step forward for programmers with the introduction of "relative pointers" (formally, RIP-relative addressing). Without going into an entire programming explanation, pointers are reference points in code (in this case we are referring to the translated machine code, or assembly) that tell a program where to go next or where to look for its next piece of data. These pointers previously needed to be absolute, meaning that you needed to know the exact memory address or register that you wanted to access.
This type of programming can be very inefficient, as it requires an absolute understanding of which addresses and registers are free at the time. If the chosen location was for some reason already in use or otherwise could not be written, the program would crash with a general protection fault. It also meant that little program pieces were strewn about free memory addresses and registers rather than intelligently organised.
The AMD64 architecture was the first of its kind.
By making use of relative pointers, each program is capable of running in its own "virtual space" rather than in an absolute position within the CPU and memory. This makes program loading and unloading more efficient and organised at the machine-code level, which can noticeably speed up memory- or computation-heavy processes compared to 32-bit execution.
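The difference between absolute and relative pointers can be sketched in a few lines of Python. This is a simplified model rather than real machine code: the function names, load addresses and displacement are all hypothetical, and the only assumption carried over from x86-64 is that a relative target is computed from the address of the next instruction plus a signed offset.

```python
def absolute_target(encoded_address):
    """Absolute pointer: the value baked into the code is the final address."""
    return encoded_address


def relative_target(next_instruction_address, displacement):
    """Relative pointer: target = address of the next instruction + signed offset."""
    return next_instruction_address + displacement


# Hypothetical program: one instruction ends 0x10 bytes into the code,
# and the data it wants sits 0x200 bytes beyond that point.
INSTRUCTION_END_OFFSET = 0x10
DISPLACEMENT = 0x200

# Load the same program at two different (made-up) base addresses.
for load_base in (0x400000, 0x7F0000000000):
    next_instruction = load_base + INSTRUCTION_END_OFFSET
    target = relative_target(next_instruction, DISPLACEMENT)
    # The target moves with the program, so the code needs no patching.
    print(hex(load_base), "->", hex(target), "| offset from base:", hex(target - load_base))

# An absolute pointer, by contrast, always names the same address,
# which is only correct if the program is loaded exactly where it expects.
print(hex(absolute_target(0x400210)))
```

In both cases the data ends up at the same distance from the start of the program, which is why relatively addressed code can be placed anywhere in memory without being rewritten.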
One of the most controversial features of x86-64 has been the addition of a feature known as the NX bit. This is a kernel security feature, and is short for "No eXecute". It is not an instruction as such, but a flag that the operating system sets on regions of memory. Any attempt to execute code stored in a flagged region raises a fault instead of running - think of it as a "write protect" switch, but for execution rather than writing.
The NX bit was implemented to help guard against one of the weaknesses that the x86 architecture has had ever since the 286 - buffer overruns. Of course, there have been various other theorised uses for the NX bit, including (but not limited to) its possible use as hardware-level DRM.
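As a rough illustration of how such a flag works, the sketch below models a page-table entry as a plain integer. It assumes the x86-64 long-mode layout in which bit 0 marks a page as present, bit 1 marks it writable and bit 63 is the no-execute flag. The helper names and example entries are hypothetical, and real hardware also requires the operating system to switch the feature on first.

```python
# Flag positions in a simplified x86-64 page-table entry.
PRESENT = 1 << 0      # page is mapped
WRITABLE = 1 << 1     # page may be written to
NO_EXECUTE = 1 << 63  # NX: instruction fetches from this page fault


def can_execute(entry):
    """Return True if code on this page may run (hypothetical helper)."""
    return bool(entry & PRESENT) and not (entry & NO_EXECUTE)


def can_write(entry):
    """Return True if the page may be written to (hypothetical helper)."""
    return bool(entry & PRESENT) and bool(entry & WRITABLE)


# Typical policy: data pages are writable but not executable,
# code pages are executable but not writable.
stack_page = PRESENT | WRITABLE | NO_EXECUTE
code_page = PRESENT

for name, entry in (("stack", stack_page), ("code", code_page)):
    print(name, "| write:", can_write(entry), "| execute:", can_execute(entry))
```

A buffer overrun that dumps malicious code onto the stack can still write it there, but under this policy the processor refuses to run it.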
On top of this, there are even some benefits to 32-bit code executed on 64-bit systems. Both Intel and AMD processors have the ability to "double up" certain 32-bit instructions, running two commands at once instead of one command per clock. Though this isn't universally functional for all instructions or data, it can provide some nice little speed enhancements over the course of a program's run.
Feed me, Seymour!
All of the instructions, increased maximum data sizes and memory addresses won't do any good without the ability for the rest of the system to transport the same size chunks of information. In particular, data has to be able to flow between the northbridge, RAM and CPU with at least the same data width in order for 64-bit extensions to function well.
Fortunately, bus width is something that seems to largely stay a step ahead of the curve. All northbridge chipsets on the Intel side since the 915G have supported a full 64-bit bus, as have most Athlon 64 chipsets since Nvidia's nForce3.
For Intel, memory bandwidth has been more of a problem due to the lack of an integrated memory controller, causing a greater slowdown as data and instructions get routed through the northbridge before being stored in memory.
However, AMD's HyperTransport system has come with its own weaknesses - in order to remain compatible with 32-bit execution, HyperTransport sends a pair of 32-bit words. In order to link the two words together, a few bits from each word are used as a common identifier. By the time the system is done flagging each word as part of a memory address, indicating whether the address is an NX space or not, and adding the linking bits, the 64-bit bus can only use 40-bit memory addresses.
The real-world decrease in memory addressing from 64-bit down to 40-bit is tremendous, but the remaining limit is assuredly well beyond what any user is likely to encounter. We're talking a decrease from 16 exabytes to somewhere in the neighbourhood of 1TB - still far more RAM than modern desktops will have in the foreseeable future, and likely well in excess of what will be readily available even by the time 128-bit processing arrives.
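For anyone who wants to check the arithmetic, here is a quick Python sketch. It uses binary units (1TB taken as 2^40 bytes and 1 exabyte as 2^60 bytes), which is an assumption on our part about how the figures above are counted.

```python
# Sizes of binary units in bytes.
TIB = 2**40  # one terabyte, counted in binary
EIB = 2**60  # one exabyte, counted in binary

# Number of distinct byte addresses for each pointer width.
full_space = 2**64     # 64-bit addressing
limited_space = 2**40  # 40-bit addressing

print(full_space // EIB, "exabytes with 64-bit addresses")        # 16
print(limited_space // TIB, "terabyte with 40-bit addresses")     # 1
print(full_space // limited_space, "times smaller address space") # 16777216
```

Losing 24 bits of address cuts the space by a factor of 2^24, or roughly 16.8 million.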