An Intel study into GPGPU computing came up with an interesting result: it beats CPUs. Whoops.
Graphics cards are inherently good at parallel processing tasks: it's long been considered true, but support for the theory has come from an unlikely source - CPU manufacturer Intel.
As reported over on iTworld, the chip giant set out to disprove the myth that GPUs offer a 100x speed boost in parallel processing tasks over a CPU - in other words, attempting to continue selling its top-end CPUs rather than seeing its high-performance computing clusters move to GPU-based supercomputers such as the FASTRA II.
While Intel successfully debunked the myth that moving your parallel processing tasks onto the GPU via CUDA or OpenCL would net you a 100x performance boost, it failed to show that there was no performance boost. Rather, the final figures demonstrated that the Nvidia GeForce GTX 280 used in the test out-performed the Core i7 960 3.2GHz processor by a margin of 2.5x on average - with certain functions running up to fourteen times faster on the GPU than the CPU.
That's an embarrassing result for Intel - and doubly so when you realise that the graphics card used was released almost exactly two years ago in June 2008, whereas the CPU is from October 2009.
Nvidia, naturally, is crowing about the test results, with the company's general manager of GPU computing Andy Keane blogging "it's a rare day in the world of technology when a company you compete with stands up at an important conference and declares that your technology is only up to 14 times faster than theirs."
Despite Intel's test results, the day when we can ditch our CPUs is still some way off: while GPUs show great improvements in massively parallel tasks, a lot of day-to-day computing is serial in nature - and thus runs faster on a CPU.
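To put rough numbers on that trade-off, here is a minimal sketch using Amdahl's law, which is not part of Intel's paper. The function name `overall_speedup` and the parallel fractions below are illustrative assumptions; only the 2.5x and 14x figures come from the study, and treating them as the speed-up of just the parallel portion of a program is itself a simplification.

```python
def overall_speedup(parallel_fraction, parallel_speedup):
    """Amdahl's law: whole-program speed-up when only part of it is accelerated.

    parallel_fraction -- share of the original run time that can be offloaded
                         to the GPU (0.0 to 1.0); an assumed, illustrative value
    parallel_speedup  -- how much faster that share runs once offloaded
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if parallel_speedup <= 0:
        raise ValueError("parallel_speedup must be positive")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part takes as long as before; only the parallel part shrinks.
    return 1.0 / (serial_fraction + parallel_fraction / parallel_speedup)


if __name__ == "__main__":
    # Hypothetical workloads, ranging from mostly serial to fully parallel.
    for fraction in (0.1, 0.5, 0.9, 1.0):
        for gpu_gain in (2.5, 14.0):
            result = overall_speedup(fraction, gpu_gain)
            print(f"parallel share {fraction:.0%}, GPU gain {gpu_gain}x "
                  f"-> overall {result:.2f}x")
```

Under these assumptions, a program that is only half parallel gets less than a 2x overall gain even when the parallel half runs 14 times faster, which is why the serial share of everyday software matters so much.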
Are you shocked to see that an elderly Nvidia graphics card can beat one of Intel's more recent processors, or did you always know that GPGPU computing was the future? Share your thoughts over in the forums.
39 Comments
Utter annihilation?
The atmosphere catches on fire and the world ends :)
i.e. pre mid-2009 a GTX 285 would easily outperform a Core i7 in Folding@home, but after that (with the release of new clients) the CPU would be faster.
As always, it depends on what you're doing with your hardware.
I'd like to see a comparison of performance per watt of the CPU and GPU running these tasks.
@mjb501 - performance per watt would be interesting but I think we can guess given that you'd need a <5* performance improvement to make it worthwhile
@rickysio - you'd have to hope so given that's what the 480 was designed for... heck it's faster than a 5870 which was 'just' designed for graphics.
In what way? Do you mean how the apparent performance (if you're measuring in ppd) of the clients has varied over time as Stanford adjusts the points system?
Marketing wise, they've shot themselves in the foot with both barrels.
I have absolutely no doubt that they loaded each test with as much bias as possible.
The fact that they used a GPU that was 1 year older than their CPU is a case in point.
To then come out and say the GPU was at least 2.5 times as fast as their CPU (at parallel processing tasks) is amazing.
One thing is for sure: no way in Hell would Apple's Marketing Dept allow such a test result to be released.
Quite so. Actually, I'm surprised it's only 14 times; for graphics I suspect it would really be quite a lot more than that, frame rate for frame rate (though a CPU based graphics engine would likely give more accurate results, if you care).
P
It's not really marketing - it's a paper for discussion by experts.
yep that's what I meant - though I only picked up the info from the forum - so I don't honestly know if it is true.
Why do I get an image of Charlton Heston screaming on a beach looking up at a huge 480
Employee: Doc, I've just pwned myself
Prof: so what, you're a nerd anyway
they can learn from this as can the industry in general, it also shows the difference isn't as large as a lot of people thought, which in Intel's eyes is of course a positive (and so it should be).
@ shagbag : the fact that Apple wouldn't release this says more about Apple than it does about Intel
what Intel should have shown is single threaded performance, or heavily branching based performance.
a) People don't like change. It'd require a massive switch for the entire physical structure of the PC.
b) CPUs are still better at a variety of tasks, tasks which are still quite common today. Most applications currently available and used can't support the highly parallel nature of a GPU.
c) Standards such as CUDA and OpenCL need to be more widely developed/adopted.
d) Intel and AMD like money.
Hell, imagine the time it would take to program something with 64 threads.
Of course, you could be smart and have it assign an 'empty' thread automatically, over thread(1) or stuff :D
with the nerfed A3 bigadv work units the CPU i7 clocked at about 3.8-4GHz pulls around 20k PPD (26k before with the A2 work units)
good thing is, though, the nerf to points seems global: as the GPU3 comes into play that's 20% slower, A3 norm work units are about 20-40% slower and Bigadv A3 is 20-30% slower
Up to 14 times. Up to. The average was 2.5.
And they didn't report how much effort it took to recode those benchmarks. A simple port will probably run abysmally. You really have to work hard to get those hundreds of threads running that the GPU requires.
So if you have that one kernel (probably graphics) that speeds up bigtime, go for it. If you have something else, think very hard if you want to invest a couple of weeks/months in rewriting your code for a small gain.
V.
That is because software has to be massively rewritten to work efficiently on a GPU. With regular CPUs if the clock got faster, the application got faster. No work required. If the core count goes up, you have to do some stuff with threading before you see a gain. But that's manageable. With GPUs you basically have to recode your application.
If you're a relatively small application and it happens to be one that gets good speed up (read, you're a game and graphics determines your speed) then you'll invest the effort. If you're something like MS Windows, you'll never run on a GPU. Too much work and no gain.
V.
keep the clock rate the same pump up the numbers of cores by 100's or shrink the chip.
they are doing a lot of R&D to test new market, a company that size aint going to roll over with a few bad hands.
Hence why we use CPUs for general processing and GPUs for graphics processing.
Whilst the processing units are getting more powerful, there are still loads of CPU functions and capabilities that GPUs simply cannot handle.
This paper was designed to try and persuade people that Intel chips were faster in large supercomputers and clusters, where you might see over 20,000 cores.
Single threaded performance is pointless, simply because it's not representative of real world use. This is not a benchmark designed to interest geeks at their computers comparing Apples to Oranges, it's a scientific paper to convince people that when they're making their large processing clusters, they should use lots of Intel CPUs rather than nVidia GPUs.
With regard to video encoding, I assume you use Badaboom. The reason why it's much faster is that they compromise massively on quality. If you drop the settings in (say) Handbrake to a comparable level, you're going to have roughly the same speed in either, but as soon as you try to crank up the image quality to HD level, the GPU will be a long way behind.
To Gareth: I'm disappointed in you. The headline borders on the sensationalist, when even in your own article you write that it's only in one of the tests. It's a bit like saying 'i7 3000 times slower than GTX480' when the test you did was rendering Crysis. 2.5x faster is the actual figure, according to the study, so for you to say that it's 14x faster is basically taken straight from the Daily Mail Handbook of sensationalist headlines!
Don't think that Intel will take this lying down: they have the resources to adapt. AMD was thought to be crazy in buying ATI. Nvidia might want to push CUDA over competitors but I don't think they can because they aren't in a position to overthrow Intel and their presence in the software side. I think AMD/ATI might be in the best position to integrate CPU/GPU. They have a CPU that Nvidia doesn't have and far more experience than Intel in GPUs.
In the real world these ideas and exercises don't always make it to market but they usually do have a significant impact on future hardware and software architecture. I do think GPU-CPU integration will happen but the GPUs will be used to increase parallel data processing performance and not specifically for graphics output. Think of the money ATI and Nvidia make on graphics cards, they won't want to give up those highly profitable margins. Plus high performance GPUs generate a lot of heat and how much thermal density can a CPU-GPU integrated package handle? If it takes up too much real estate it will likely not push GPU cards off the board. In the mass corporate world with low demands on GPUs it will make a lot of sense to further integrate more, if not all, functions onto a single chip where thermal limits aren't pushed and power consumption is a major consideration.
In reality we're 2.5 times slower than we should be.
Way to go Intel. :/
Now when we are talking about computer clusters that handle large data sets, then threading becomes relevant. Seti @ home and folding @ home are evidence of the types of tasks that can be parallelized to the nth degree and still see performance gains. Some tasks can only be reduced so much before you don't gain anything.
I actually foresee the whole industry moving towards threading and it will take the hardware, software and people (coders etc) a good 10-15 years for good threading practice and standards to come into place. If you look at all the CPU roadmaps they are moving to 6-8+ cores. I think we'll see some type of hybrid hardware that can scale down to a few very fast cores for serial computing and scale up to N cores for massively parallel applications, à la joining a CPU and GPU into one die.
We're getting there. Flash 10.1 is just the start.
But computers are more than the underlying hardware, they are about software too. You need a solid community of systems programmers and tool authors to make the platform work. To me, this is why AMD's x64 won out over Intel's IA64.
It's the same story with the GPGPU idea.
AMD64 won because it was backwards compatible with x86.