Nvidia announces world's 'most complex' GPU
May 18, 2012 // 12:01 p.m.
Nvidia has revealed details of what it claims is the most complex commercially-available integrated circuit on the planet, the Kepler-based GK110 GPU.
Before you get too excited, however: this chip won't be making it into any high-end gaming cards. Instead, the company is aiming the GK110 firmly at high-performance computing (HPC) and supercomputing applications where a GPU's ability to rapidly churn through highly-parallel tasks is welcomed with open arms and blank cheques.
Unveiled at the GPU Technology Conference (GTC) this week, the GK110 is manufactured on a 28nm process node and boasts a whopping 7.1 billion transistors - making it, in Nvidia head Jen-Hsun Huang's own words, 'the most complex IC [integrated circuit] commercially available on the planet.'
In comparison, Nvidia rival AMD's consumer-grade Tahiti GPU, as found in the Radeon HD 7900 family, features just 4.3 billion transistors built on the same 28nm process node. For a real giggle: Intel's 4004, the company's first commercially available microprocessor, released back in 1971, featured 2,300 transistors on a 10µm process node.
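To put those figures in perspective, the comparison can be worked through in a few lines of Python. This is only a rough sketch using the transistor counts and process nodes quoted above; the constant and function names are our own, chosen for illustration.

```python
# Transistor counts and process nodes as quoted in the article.
GK110_TRANSISTORS = 7_100_000_000
TAHITI_TRANSISTORS = 4_300_000_000
I4004_TRANSISTORS = 2_300

GK110_NODE_NM = 28
I4004_NODE_NM = 10_000  # 10 micrometres expressed in nanometres


def ratio(larger: float, smaller: float) -> float:
    """Return how many times bigger `larger` is than `smaller`."""
    return larger / smaller


vs_tahiti = ratio(GK110_TRANSISTORS, TAHITI_TRANSISTORS)
vs_4004 = ratio(GK110_TRANSISTORS, I4004_TRANSISTORS)
node_shrink = ratio(I4004_NODE_NM, GK110_NODE_NM)

print(f"GK110 vs Tahiti: {vs_tahiti:.2f}x the transistors")
print(f"GK110 vs 4004:   {vs_4004:,.0f}x the transistors")
print(f"Process node:    {node_shrink:.0f}x smaller features")
```

On those numbers the GK110 packs roughly 1.65 times the transistors of Tahiti, and a little over three million times as many as the 4004.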
The GK110 itself boasts 15 Streaming Multiprocessor (SMX) units featuring 192 CUDA cores each, for a total of 2,880 CUDA cores in each GPU. Comments made by Huang at the event suggest that several grades of product will be made available, with the lower grades having fewer SMX units enabled, as Nvidia seeks to improve its yields on what will be a very complex chip to manufacture.
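The core-count arithmetic, and what disabling SMX units would do to it, can be sketched as follows. The 15-unit and 192-core figures come from Nvidia's announcement; the cut-down configurations of 14 and 13 units are purely hypothetical examples, as Nvidia has not said which variants it will ship.

```python
SMX_UNITS_TOTAL = 15   # SMX units on a full GK110, per Nvidia
CORES_PER_SMX = 192    # CUDA cores in each SMX unit, per Nvidia


def cuda_cores(enabled_smx: int) -> int:
    """Total CUDA cores for a part with `enabled_smx` SMX units switched on."""
    if not 0 <= enabled_smx <= SMX_UNITS_TOTAL:
        raise ValueError("enabled_smx must be between 0 and 15")
    return enabled_smx * CORES_PER_SMX


# 15 is the full chip; 14 and 13 are hypothetical yield-friendly variants.
for enabled in (15, 14, 13):
    print(f"{enabled} SMX units -> {cuda_cores(enabled):,} CUDA cores")
```

Each disabled SMX unit costs 192 cores, so even a part with two units fused off would still offer close to 2,500.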
Nvidia's first outing for the GK110 will be the HPC-centric Tesla K20 series of products, featuring a 384-bit memory bus made up of six 64-bit controllers running in parallel. The company has yet to indicate the quantity of memory available, but given the target market it seems likely that each GK110 will have between 2GB and 4GB of GDDR5 to play with.
The Tesla K20 boards won't be out until the end of the year, but Nvidia also unveiled a dual-GK104 Kepler-based Tesla in the form of the K10, featuring 4.58 teraflops of single-precision floating-point performance. The GK110, meanwhile, introduces Dynamic Parallelism, which Nvidia claims allows the GPU to adapt dynamically to data by spawning new threads on its own, and Hyper-Q, which allows multiple CPU cores to address the CUDA cores on a single Kepler-based GPU simultaneously.
Nvidia, naturally, did not share pricing information at the event.