Nvidia has officially released CUDA 5, upgrading its parallel processing platform with a raft of new features including an online resource sharing service.
As Nvidia looks to make its CUDA platform the go-to model for highly parallel programming - for the simple reason that it will sell more graphics chips that way - it spent much of its launch presentation discussing the ways in which CUDA 5 makes programmers' lives easier. These include dynamic parallelism, for spawning new parallel work from within GPU code; GPU callable libraries; GPUDirect, for high-performance, low-latency direct memory access between GPUs and PCI Express-connected devices; and an Nsight plug-in for the Eclipse IDE, which offers the ability to code, debug and optimise from a single interface.
Looking at each feature in turn, the ability to spawn new threads from within GPU threads means that the GPU can now adapt automatically to the data at hand, where previously a round trip to the CPU was required. By eliminating - or, at least, vastly reducing - the CPU's involvement in the GPU's operations, performance is claimed to be significantly improved, while the applicability of CUDA is extended to a broader set of algorithms, such as those used in computational fluid dynamics applications.
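As a rough illustration of what dynamic parallelism looks like in practice, the sketch below shows a parent kernel launching a child kernel directly from the GPU, sizing the child grid on the device without consulting the CPU. The kernel names and sizing logic here are hypothetical; the feature requires a compute capability 3.5 GPU and compilation with `nvcc -arch=sm_35 -rdc=true`.

```cuda
// Child kernel: a simple per-element operation.
__global__ void child(float *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0f;
}

// Parent kernel: decides the child launch configuration on the device,
// based on the data at hand - no round trip to the CPU is needed.
__global__ void parent(float *data, int n) {
    if (threadIdx.x == 0) {
        int threads = 256;
        int blocks  = (n + threads - 1) / threads;
        child<<<blocks, threads>>>(data, n);   // launched from GPU code
        cudaDeviceSynchronize();               // wait for the child grid
    }
}
```

Before CUDA 5, only host code could launch kernels, so data-dependent work like this had to be copied back for the CPU to inspect first.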
The GPU callable libraries are part of Nvidia's attempt to foster a wider third-party ecosystem, allowing developers to access CUDA parallelism through their own libraries. Nvidia suggests that coders can write plug-in APIs that let other developers extend the functionality of their kernels, or implement callbacks on the GPU to customise the behaviour of third-party libraries. It is clearly hoping that developers will take advantage of the new object linking capabilities to build larger and more complex CUDA-powered applications.
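The object linking capability rests on CUDA 5's separate compilation support: a `__device__` function defined in one compilation unit can now be called from a kernel in another and resolved at device-link time. A minimal sketch, with hypothetical file and function names (both files built with `nvcc -rdc=true`):

```cuda
// --- mylib.cu : the "library" side ---
__device__ float scale(float x) {
    return 2.0f * x;
}

// --- app.cu : the application side ---
// Declared here, defined in the separately compiled library object;
// the device linker resolves the call.
extern __device__ float scale(float x);

__global__ void apply(float *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = scale(data[i]);
}
```

Previously, all device code called by a kernel had to live in the same compilation unit, which made shipping reusable GPU libraries impractical.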
GPUDirect, meanwhile, has a more immediate benefit: minimising system memory bottlenecks. Designed to allow GPUs to communicate with other PCI Express-connected devices without getting the CPU and system RAM involved, GPUDirect is claimed to significantly reduce latency between nodes in a GPU cluster as well as improve overall performance where external hardware is accessed.
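GPUDirect RDMA targets third-party PCI Express devices such as network adapters, but the related peer-to-peer path between two GPUs gives a feel for the idea: a buffer is copied directly over PCI Express without staging through system RAM. A hedged sketch, assuming two peer-capable GPUs on the same PCIe root complex (the function name is hypothetical):

```cuda
#include <cuda_runtime.h>

// Copy n floats from a buffer on GPU 0 to a buffer on GPU 1 directly
// over PCI Express, bypassing host memory. Returns -1 if the two
// devices cannot access each other's memory.
int copy_between_gpus(float *dst_on_gpu1, const float *src_on_gpu0, size_t n) {
    int can_access = 0;
    cudaDeviceCanAccessPeer(&can_access, 1, 0);
    if (!can_access) return -1;

    cudaSetDevice(1);
    cudaDeviceEnablePeerAccess(0, 0);      // let GPU 1 reach GPU 0's memory
    cudaMemcpyPeer(dst_on_gpu1, 1,         // destination pointer, device
                   src_on_gpu0, 0,         // source pointer, device
                   n * sizeof(float));     // direct PCIe transfer
    return 0;
}
```

Without peer access, the same copy would bounce through a host-side staging buffer, consuming CPU time and system memory bandwidth.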
Finally, the Nsight plug-in for Eclipse provides developers with the ability to write, debug and compile CUDA code within the popular IDE on Linux and OS X platforms. Those using Eclipse will also find a new automatic refactoring tool to quickly port existing code to CUDA, along with customised syntax highlighting to differentiate between CPU and GPU code segments.
The big feature of the CUDA 5 launch, however, was the CUDA Resource Centre. Part of the Nvidia Developer Zone, the CUDA Resource Centre provides instant access to the materials programmers need to start taking advantage of parallelism. Programming guides, API references, library manuals, code samples, tools documentation and platform specifications are all included, with a total of over 1,600 files ready for viewing at launch.
More details on CUDA 5 are available on the Nvidia website.