bit-tech.net

Intel unveils 50-core maths co-processor card

The processor will be packaged on a traditional 16x PCI-E card.

Intel officially took the wraps off its next-generation ‘Knights Corner’ processor last night: a dedicated 50-core maths co-processor chip based on technology from Intel’s abandoned Larrabee graphics project.

Intel confirmed that the 50 x86 cores used in Knights Corner will be fabricated using the same 22nm Tri-Gate process as next year’s Ivy Bridge processors, meaning the processors will use the very latest transistor technology.

Intel also explained that Knights Corner is only the first product in what will eventually be a range of Many Integrated Core (MIC) processors. Indeed, another iteration of the MIC family - dubbed Knights Ferry - is already being trialled at several supercomputing laboratories across the globe.

The processors will also be packaged on a traditional 16x PCI-E card, so they'll potentially provide an easy upgrade for any workstation that requires a little extra processing grunt.

Intel understandably envisages the MIC processors competing directly with current co-processor technologies; most notably Nvidia’s range of Tesla products, which are currently based on the company’s Fermi GPU architecture.

Intel believes it has the edge in this battle, however, as it says it will be easy for people to use their existing tools when programming for its x86-based MIC processors.

‘If you can program a Xeon, you can program this microprocessor,’ says the general manager of Intel’s Many Integrated Core Computing division, Anthony Neal-Graves. ‘You can use the same tools and the same compilers. That makes parallelism simpler for the end user. It provides a saving in terms of time and money, and allows programmers to be much more efficient in terms of what they do.’
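
To put that claim in context, the snippet below is our own sketch rather than Intel sample code: an ordinary OpenMP loop in C, built with an unmodified x86 compiler, which is the kind of existing-tools programming model Neal-Graves is describing. The function and sizes (saxpy, N) are purely illustrative.

/* A minimal sketch (not Intel sample code): plain OpenMP C that any x86
 * compiler can build. The names saxpy and N are illustrative only. */
#include <stdio.h>
#include <stdlib.h>

#define N 1000000

/* y = a*x + y, split across however many x86 cores the runtime exposes -
 * four on a desktop, or 50 on a card such as Knights Corner. */
static void saxpy(long n, float a, const float *x, float *y)
{
    #pragma omp parallel for
    for (long i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}

int main(void)
{
    float *x = malloc(N * sizeof *x);
    float *y = malloc(N * sizeof *y);
    if (!x || !y)
        return 1;

    for (long i = 0; i < N; i++) { x[i] = 1.0f; y[i] = 2.0f; }

    saxpy(N, 3.0f, x, y);        /* build with: cc -fopenmp saxpy.c */
    printf("y[0] = %f\n", y[0]); /* expect 5.0 */

    free(x);
    free(y);
    return 0;
}

Whether a loop like this actually scales across 50 co-processor cores is, of course, the open question.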

Would a super-powerful maths co-processor improve your working life? Could Knights Corner threaten GPGPU computing? Let us know your thoughts in the forums.

45 Comments

Burhoom 21st June 2011, 12:46 Quote
But can it run Crysis?

...sorry, couldn't resist :P
Paradigm Shifter 21st June 2011, 12:57 Quote
Whether it becomes a threat to GPGPU depends solely on its performance in equivalent applications.

Also, different architectures will always perform differently at different tasks. It might be amazing at one thing (handily beating GPGPU) and be terrible at another.

Time will tell.
Evildead666 21st June 2011, 13:05 Quote
That's quite an exhaust vent on that card. Must need some serious cooling.
Even though that looks like the same pic we've been shown for ages, I wonder if the latest version will be a dual 8-pin affair, with some pretty high thermals.
Tattysnuc 21st June 2011, 13:10 Quote
How does this affect existing software? Will software have to be written specifically to run on this, or will it just accelerate the maths calculations that can be parallelised? Folding....?
GuilleAcoustic 21st June 2011, 13:11 Quote
I'm looking forward to it. I'm planning to develop a physics engine, and that would be lovely paired with a nice GPU for the display.
Taffy 21st June 2011, 13:42 Quote
History repeating itself: first just CPUs, then CPUs with a separate maths co-processor, then CPUs with an integrated maths co-processor. Now an additional separate maths co-processor. What next?
StoneyMahoney 21st June 2011, 13:44 Quote
I can see research lab administrators being quite happy to buy a dedicated maths accelerator card for staff workstations where they would normally baulk at requests for gaming graphics cards. How well will it perform, and how easy is it to offload work to it? Being x86-based, I imagine it'll be more flexible than GPGPUs, just not quite as fast at the things GPGPUs do well.
John_T 21st June 2011, 14:04 Quote
Quote:
Originally Posted by Evildead666
That's quite an exhaust vent on that card. Must need some serious cooling.

Maybe, but not necessarily. Without the need for video outputs like on GPUs, they may just as well open up the back as not - it'd be a bit silly not to, really...
GuilleAcoustic 21st June 2011, 14:19 Quote
I'm wondering how it will perform on price / performance / power draw compared to GPUs with CUDA / OpenCL.
Phalanx 21st June 2011, 14:21 Quote
Quote:
Originally Posted by Taffy
History repeating itself: first just CPUs, then CPUs with a separate maths co-processor, then CPUs with an integrated maths co-processor. Now an additional separate maths co-processor. What next?

They'll attach a CPU to it soon, just watch. Dual CPU PCs using one on a PCI-E port ;)
Baekkel 21st June 2011, 14:29 Quote
Quote:
Originally Posted by Phalanx
They'll attach a CPU to it soon, just watch. Dual CPU PCs using one on a PCI-E port ;)

I can see the scenario... SR-2 with Xeon add-ins
...60-core madness?
NiHiLiST 21st June 2011, 14:36 Quote
Is there any word on whether this will support OpenCL? I believe that's the way we should be going, so computers can just have a pool of processing resources - be it CPUs, graphics cards or co-processors - which work together seamlessly. I understand it's much more complex than this from a programming and performance point of view, but it's a nice idea.
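
For illustration only: the 'pool of processing resources' idea already maps onto the standard OpenCL host API. The sketch below is plain OpenCL host code - nothing specific to Knights Corner, and the array sizes are arbitrary - which just lists every CPU, GPU and accelerator the installed drivers expose.

/* A hedged sketch of the "pool of processing resources" idea: enumerate
 * every OpenCL device of every type. Plain OpenCL host code; nothing here
 * is specific to Knights Corner. Build with: cc list_devices.c -lOpenCL */
#include <stdio.h>
#include <CL/cl.h>

int main(void)
{
    cl_platform_id platforms[8];
    cl_uint num_platforms = 0;
    clGetPlatformIDs(8, platforms, &num_platforms);

    for (cl_uint p = 0; p < num_platforms; p++) {
        cl_device_id devices[16];
        cl_uint num_devices = 0;

        /* CL_DEVICE_TYPE_ALL pools CPUs, GPUs and accelerators alike. */
        if (clGetDeviceIDs(platforms[p], CL_DEVICE_TYPE_ALL,
                           16, devices, &num_devices) != CL_SUCCESS)
            continue;

        for (cl_uint d = 0; d < num_devices; d++) {
            char name[256] = "";
            cl_device_type type = 0;
            clGetDeviceInfo(devices[d], CL_DEVICE_NAME, sizeof name, name, NULL);
            clGetDeviceInfo(devices[d], CL_DEVICE_TYPE, sizeof type, &type, NULL);
            printf("%s (%s)\n", name,
                   (type & CL_DEVICE_TYPE_GPU) ? "GPU" :
                   (type & CL_DEVICE_TYPE_CPU) ? "CPU" : "accelerator");
        }
    }
    return 0;
}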
Autti 21st June 2011, 14:39 Quote
Being x86-based doesn't net any benefits - in fact, a lot of downfalls. x86 isn't designed to work specifically with high-bandwidth processors like this, and his comments about programming a Xeon are simply unfounded. You will have to completely re-write the program if you want it to perform efficiently on this. Sure, it's x86 code, but if you have to re-write the program, what does it matter what the code is, especially when CUDA offers such unparalleled support?

I will reserve comment on the performance until there is more detail of what they have taken from the Larrabee core, but one thing should be pointed out: it's not just competing against Tesla, it has to compete with a few other processors as well, such as Niagara, which is a 128-thread beast of parallel processing.
Autti 21st June 2011, 14:52 Quote
Found Intel's paper stating they achieved 950 GFLOPS, with 1,200 GFLOPS the theoretical maximum, which means Tesla comfortably beats it.
azazel1024 21st June 2011, 14:52 Quote
I don't know exactly the core design of the processor, so it's hard to know just how fast the thing is going to be in different applications (always the empirical method). My guess is that if it uses out-of-order execution and most of the other goodies you see in current Sandy Bridge or even Core x86 processors, it's going to be a lot faster than GPU computing for a number of tasks, but still slower at the basic, highly parallel maths stuff - breaking hashes and the like takes relatively little maths, is completely in-order and benefits from hugely parallel processing, where 50 cores can't out-compete 400+.

However, stuff that needs a lot of floating-point precision is probably going to run faster on this thing, with its 50 presumably faster and more flexible cores, than on a GPU with more, but less capable, cores.

My other guess: this is going to be a lot like CPU vs GPU computing. Some things work better on full-fledged CPUs and some things work better on full-fledged GPUs. I think this is going to be one of those grey areas in between, where it's better than both at some things and not as good at others. You need a steak knife for steak, a scaling knife for scaling fish and a paring knife for fruit. You can use each for the others' tasks, but they just aren't as good.
GuilleAcoustic 21st June 2011, 15:23 Quote
Quote:
Originally Posted by Baekkel
I can see the scenario... SR-2 with Xeon add-ins
...60-core madness?

Tyan Thunder n4250QE + Tyan Thunder M4985-SI Expansion Board + 8x 12-core Opterons = 96 cores ... and you still have 4x PCI-E 16x free to be used .... nom nom nom
Sensei 21st June 2011, 16:59 Quote
Can none of you see what's happening? This is Skynet all over again. Well, I'm not hanging around for some T-800 to ruin my day.

JC
moreard 21st June 2011, 17:03 Quote
This reminded me of my first choice in computing - 486SX or 486DX
RichCreedy 21st June 2011, 17:12 Quote
f@h might be a good use for this
abezors 21st June 2011, 17:15 Quote
Interesting idea, but perhaps a couple of years too late. This would have been better released in the earlier days of GPGPU I think; something to stimulate the market and development in the area. Aren't serious computing programs written with CUDA/Stream/OpenCL in mind now?
Mankz 21st June 2011, 17:45 Quote
This could be perfect for my CAD rendering rig :D
Fod 21st June 2011, 18:00 Quote
Heh, I knew this would be recycled Larrabee tech the moment I read the headline.
Crunch77 21st June 2011, 18:42 Quote
How does one take advantage of this? Do I just write my code as usual, using my usual compiler? Use a different compiler targeting Knights Corner? Any benefit for me as a developer? Could it be used to speed up compiling/testing in my IDE, making for a super nice developer machine?
thehippoz 21st June 2011, 18:55 Quote
Quote:
Originally Posted by Crunch77
How does one take advantage of this? Do I just write my code as usual, using my usual compiler? Use a different compiler targeting Knights Corner? Any benefit for me as a developer? Could it be used to speed up compiling/testing in my IDE, making for a super nice developer machine?

it's interesting isn't it? =]
schmidtbag 21st June 2011, 19:46 Quote
I'd much rather see a PCI-E card dedicated to OpenCL. Basically it'd be a GPU, but it wouldn't have video ports and it would be designed to ignore any of the instruction sets intended for video use only. Think of it like Ageia's original PPU, except OpenCL instead of PhysX.
PingCrosby 21st June 2011, 20:17 Quote
"Knights Corner"? Who the hell thought that one up?
Tulatin 21st June 2011, 21:04 Quote
Quote:
Originally Posted by schmidtbag
I'd much rather see a PCI-E card dedicated to OpenCL. Basically it'd be a GPU, but it wouldn't have video ports and it would be designed to ignore any of the instruction sets intended for video use only. Think of it like Ageia's original PPU, except OpenCL instead of PhysX.

That's the nVidia Tesla line.
Blackshark 21st June 2011, 21:24 Quote
Surely what this is about is not beating GPGPU, but providing a better solution for certain problems that benefit from out-of-order execution and the other advantages of an x86 architecture. Sure, there are much faster, many-more-core products that can do my RISCy simple maths MUCH faster. But if I was designing a large server-farm-based supercomputer for general problem solving (i.e. a university), then I would want flexibility.

Fill up a board with two GPGPU cards and two of these, and you have a machine that is capable of doing a lot pretty fast, rather than a few things a bit faster. Sure, there are specific companies that want to model proteins and the like that will likely say no thanks.

It's a good product; it has its place.
Qwatkins 21st June 2011, 21:26 Quote
Quote:
Originally Posted by Tulatin
That's the nVidia Tesla line.
Except that Tesla is CUDA only, NOT OpenCL
GuilleAcoustic 21st June 2011, 21:40 Quote
Quote:
Originally Posted by GuilleAcoustic
Tyan Thunder n4250QE + Tyan Thunder M4985-SI Expansion Board + 8x 12-core Opterons = 96 cores ... and you still have 4x PCI-E 16x free to be used .... nom nom nom

I quoted my own post :D

Add this and 2 GTX 590s (with WB and single-slot bracket) and you have 146 CPU cores and 2,048 cores for GPGPU. Pretty much a do-everything rig.
Action_Parsnip 22nd June 2011, 01:34 Quote
I still think it will suck.

Even in its 'natural home' running x86 code, it will be up against both Nvidia and ATI.

I see it like this: Intel releases a 50-core card; Tesla and AMD cards receive a 50% price cut. They're only common GPUs that haven't been intentionally gimped, with commercial-use prices and software and support packages.

CUDA is moving at break-neck speed towards ever better flexibility and performance. By 2012 Kepler will be out and about (focused on FLOPS/watt), CUDA will be improved, and I do not see x86 being 'all that' in a field (HPC) where C++ is the requirement and nothing more. Plus they've just shown their hand a year in advance. That's a lot of pricing/marketing/CUDA-pushing wiggle room for Nvidia.

Disclaimer: I do not like Nvidia much at all.
Andy Mc 22nd June 2011, 11:46 Quote
If this supports OpenCL then I hope it'll mine well.....
Bindibadgi 22nd June 2011, 11:49 Quote
Quote:
Originally Posted by schmidtbag
I'd much rather see a PCI-E card dedicated to OpenCL. Basically it'd be a GPU, but it wouldn't have video ports and it would be designed to ignore any of the instruction sets intended for video use only. Think of it like Ageia's original PPU, except OpenCL instead of PhysX.

http://www.amd.com/us/products/workstation/graphics/Pages/workstation-graphics.aspx

Job done.

Intel promised OpenCL within six months of launching Clarkdale. We still don't have it with Sandy Bridge. :(:( Yet we've had Intel compilers ready since launch for its Quick Sync Video...
HourBeforeDawn 22nd June 2011, 20:46 Quote
So would this be ideal for people on the rendering side of the field, like video editing and music mixing and whatnot?
The Infamous Mr D 22nd June 2011, 21:19 Quote
Quote:
Originally Posted by GuilleAcoustic
I quoted my own post :D

Add this and 2 GTX 590s (with WB and single-slot bracket) and you have 146 CPU cores and 2,048 cores for GPGPU. Pretty much a do-everything rig.

Including superheating your abode and creating a stupendous electric bill :)
GuilleAcoustic 22nd June 2011, 22:52 Quote
Indeed :)
rogerrabbits 23rd June 2011, 03:33 Quote
Quote:
Originally Posted by PingCrosby
"Knights Corner"? Who the hell thought that one up?

I know, sounds like a medieval yoghurt.
[USRF]Obiwan 23rd June 2011, 11:51 Quote
Seems like they don't want to do it like Atari with their E.T. cartridges.

Intel puts their stockpile of useless 'wannabe' GPU processors on a PCI-E card, sells them as maths co-processors and still makes a profit from it. Typical Intel....
Kaiwan 23rd June 2011, 12:18 Quote
Can't wait for Ansys to get their hands on this to charge me another license to use it :D
PingCrosby 23rd June 2011, 12:48 Quote
Quote:
Originally Posted by rogerrabbits
I know, sounds like a medieval yoghurt.

mmmmmmmmm.....yoghurt.;)
timevans999 26th June 2011, 08:54 Quote
Is this the answer to running Supreme Commander: Forged Alliance?
Bindibadgi 26th June 2011, 09:03 Quote
Quote:
Originally Posted by [USRF]Obiwan
Seems like they don't want to do it like Atari with their E.T. cartridges.

Intel puts their stockpile of useless 'wannabe' GPU processors on a PCI-E card, sells them as maths co-processors and still makes a profit from it. Typical Intel....

Stockpile? It'll be 22nm - they haven't even got that production-ready yet. The original Larrabee was only 32 cores, IIRC.

These are not GPUs any more, they are IA co-processors in the same way the original FPU was a co-processor to the CPU in the early 90s.
BrightCandle 16th November 2011, 16:05 Quote
Intel failed to release Larrabee and this is a fallback position. They couldn't get the graphics hardware and software to perform well enough, but by putting lots of x86 cores on a card you can potentially get the first stepping stone working. Then they can go about putting in the vector instructions that make it efficient enough for graphics work. Finally, they can add the outputs and any specialist dedicated hardware on board to make it a complete card. In the meantime, they need to be working on the software driver that makes it all work.

It's a better strategy than their first attempt; it removes the arrogance that, because they are Intel, they can take over the graphics market whenever they want. They may, however, have given up on making a massively parallel x86 graphics card and be settling for the OpenCL world.

They could really make a big hit with x86 cores if they get the software process right. Ideally, what we want is for these extra cores to be transparent to the programmer: Windows will use them if it needs to, but we want it to use the main cores in preference. Then I think we also need the ability to tell a thread that it should run on the slower cores, so we can reserve the main cores for critical work. If that is all it took to use them, then we'd be getting extra performance from programs that are already parallel almost immediately. If it's just an OpenCL implementation, or proprietary, then it's not going to be a big release.
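
A purely hypothetical sketch of that last idea: if the card's cores ever appeared to the OS as ordinary logical CPUs (assumed here, for illustration only, to start at CPU 4 and number 50 - Intel has announced nothing of the sort), steering a background thread onto the slower cores would need nothing more exotic than the existing Linux affinity call.

/* Hypothetical sketch only: assumes the co-processor's cores show up as
 * ordinary logical CPUs starting at CPU 4 - an assumption for illustration,
 * not anything Intel has announced. Build with: cc -pthread affinity.c */
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <stdio.h>

#define FIRST_SLOW_CPU 4    /* hypothetical: where the co-processor cores start */
#define NUM_SLOW_CPUS  50   /* hypothetical: 50 MIC cores */

static void *background_work(void *arg)
{
    (void)arg;
    /* ...highly parallel, latency-tolerant work goes here... */
    return NULL;
}

int main(void)
{
    pthread_t t;
    cpu_set_t slow_cores;

    CPU_ZERO(&slow_cores);
    for (int c = 0; c < NUM_SLOW_CPUS; c++)
        CPU_SET(FIRST_SLOW_CPU + c, &slow_cores);

    pthread_create(&t, NULL, background_work, NULL);

    /* The hint: keep this thread off the fast host cores. */
    if (pthread_setaffinity_np(t, sizeof slow_cores, &slow_cores) != 0)
        fprintf(stderr, "no such CPUs on this machine - it is hypothetical, after all\n");

    pthread_join(t, NULL);
    return 0;
}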
Marvin-HHGTTG 16th November 2011, 16:12 Quote
Quote:
Originally Posted by BrightCandle
Intel failed to release Larrabee and this is a fallback position. They couldn't get the graphics hardware and software to perform well enough, but by putting lots of x86 cores on a card you can potentially get the first stepping stone working. Then they can go about putting in the vector instructions that make it efficient enough for graphics work. Finally, they can add the outputs and any specialist dedicated hardware on board to make it a complete card. In the meantime, they need to be working on the software driver that makes it all work.

It's a better strategy than their first attempt; it removes the arrogance that, because they are Intel, they can take over the graphics market whenever they want. They may, however, have given up on making a massively parallel x86 graphics card and be settling for the OpenCL world.

They could really make a big hit with x86 cores if they get the software process right. Ideally, what we want is for these extra cores to be transparent to the programmer: Windows will use them if it needs to, but we want it to use the main cores in preference. Then I think we also need the ability to tell a thread that it should run on the slower cores, so we can reserve the main cores for critical work. If that is all it took to use them, then we'd be getting extra performance from programs that are already parallel almost immediately. If it's just an OpenCL implementation, or proprietary, then it's not going to be a big release.

Erm, news is known as such because it's "new." This is 5 months later... ;)