bit-tech.net

How To Build The Best Folding Rig

Comments 26 to 48 of 48

Zero82z 3rd August 2009, 22:51
Decent article, but you made an extremely important omission, which is that you completely ignored the possible contributions of CPUs to folding production. Although it's true that it's more difficult and expensive to build a farm centered around CPUs, current quad-core CPUs are actually just as good and in some cases better than GPUs for folding. A 45nm Core 2 Quad or Phenom II CPU can hit 5-6k PPD depending on the overclock when using the Linux SMP client, and an i7 CPU can hit 8-10k PPD depending on the overclock, which is actually more than any single GPU is capable of producing. Folding on a quad-core CPU in a multi-GPU folding box will give you essentially the same contribution as adding an entire GPU would, which is pretty significant in my books.

There are two ways of setting up the Linux SMP client efficiently with modern systems. You can either install Linux natively and run the GPU clients under Wine, or you can run Windows and use VMware Player in conjunction with instances of notfred's folding virtual appliance (one instance per two cores). Running two 2-core instances of the Linux SMP client produces more points per day, although the RAM requirements are higher. Still, 4GB is enough for a multi-GPU box with two VMs.

Also, the PPD chart looks a bit off (the positioning of the GTX260 192SP below the 9800GTX+ and GTS250, and the GTX280 and GTX275 being placed below the GTX260 216SP; it should be GTX260 192SP<GTX260 216SP<GTX280<GTX275<GTX285<GTX295). You also list some numbers for the 9600GSO, although you didn't specify which version (I presume it's the EOL 96SP version, since it's placed higher than the 9600GT; the current 48SP version would perform worse than the 9600GT).
MrGumby 4th August 2009, 03:07
I think people are missing the point of folding. It may be a form of willy-waving, but at the very least it accomplishes something.
dark_avenger 4th August 2009, 05:44
+1 for notfred's VM SMP client
It's worth putting in a faster quad-core, both to keep up with the GPUs and to get a few extra PPD through SMP clients.
mm vr 4th August 2009, 07:39
Why do you recommend the crappy eXtreme Outervision PSU calculator?

- It's funded by Thermaltake
- It gives wattage values way off (50-200% too much)
B3CK 4th August 2009, 07:59
I couldn't imagine turning on a rig like this here in Texas, U.S. during the summer, with constant daily temps of over 100°F. Ouch. But for those like me who hate the cold, I could put a rig like this in front of my central air intake, or in the living room, and have a nice little 1kW heater. Folding while heating: great for the winter months, but too bad for summer.
Farting Bob 4th August 2009, 11:09
Quote:
Originally Posted by MrGumby
I think people are missing the point of folding. It may be a form of willy-waving, but at the very least it accomplishes something.

What has it actually accomplished so far?
mclintox 4th August 2009, 13:13
I think some people need to get out more!
John_T 4th August 2009, 13:52
Apologies if I'm becoming a little too 'geeky' in this (the politest word I can probably get away with), but I do have a couple of questions on the hardware side, if maybe someone (Lizard?) could help out...

I've taken a look at the specialist Tesla C1060 cards that nVidia produce, and they appear at first glance to be little more than GTX 275's with 4GB RAM instead of 896MB - yet you can buy a GTX 275 for less than £200, whereas a C1060 costs around £1,300.

http://www.viglen.co.uk/pricelist/HPCpricelist.pdf (Opens a 2.1MB pdf just to warn people - p8 if you're interested).

Why on earth is there such a price disparity?

If the extra (and faster?) RAM makes such a huge difference to performance, then wouldn't it be worth paying all the extra upfront to save on the very substantial running costs of having four cards running instead of one? (Plus all the reduced CO2 emissions, heat, noise etc, etc).

And if it doesn't make that much difference, then why are four GTX 295's, (each with double the number of processor cores) still cheaper than a single C1060?

I suppose this is all a bit of a moot point as I'm not actually thinking of buying four 295's anyway, but then if one C1060 does a similar job with a quarter of the electricity, (plus all the other benefits) well, that could be something to think very seriously about.

Seeing as you'd be showcasing their products to a wide audience, would there be any chance of persuading nVidia to lend you some C1060's to build a showcase folding rig with them & see what's what?

That's a follow up article I'd love to see...
Lizard 4th August 2009, 15:10
Quote:
Originally Posted by John_T
I've taken a look at the specialist Tesla C1060 cards that nVidia produce, and they appear at first glance to be little more than GTX 275's with 4GB RAM instead of 896MB - yet you can buy a GTX 275 for less than £200, whereas a C1060 costs around £1,300. Why on earth is there such a price disparity?

There's such a huge price difference not just because of the extra RAM but because Tesla cards are professional workstation/HPC products, not gaming (GeForce) products. Because of this, Tesla cards (like Quadro cards) go through a much stricter qualification process than GeForce cards, just like a Xeon is much more thoroughly tested than a Core 2/i7.
Quote:
Originally Posted by John_T
If the extra (and faster?) RAM makes such a huge difference to performance, then wouldn't it be worth paying all the extra upfront to save on the very substantial running costs of having four cards running instead of one? (Plus all the reduced CO2 emissions, heat, noise etc, etc).

The trouble is, the extra RAM makes absolutely no difference to folding@home, so a Tesla card is completely pointless for running this GPGPU app. Other HPC apps however may well use the extra RAM.
Quote:
Originally Posted by John_T
Seeing as you'd be showcasing their products to a wide audience, would there be any chance of persuading nVidia to lend you some C1060's to build a showcase folding rig with them & see what's what? That's a follow up article I'd love to see...

I've been talking to Nvidia about doing an article on Tesla cards for the better part of six months but due to their high price Nvidia won't give any samples to the press. They're probably also pretty scared that we could slate them for their high price, even though we review Xeons/Opterons on a regular basis.
John_T 4th August 2009, 15:32
Thanks for the detailed response.

So, essentially, for folding purposes, a C1060 is going to produce (approximately) the same performance, yet for close to 7 times the price. Fair enough.
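As a rough check on that figure, the arithmetic can be sketched in Python using the two prices quoted earlier in the thread (under £200 for a GTX 275, around £1,300 for a C1060). The variable and function names are made up for illustration, and both prices are approximate, so treat the result as a ballpark only.

```python
# Approximate prices quoted in the thread, in GBP (August 2009).
GTX_275_PRICE = 200.0       # "less than £200", so this is an upper bound
TESLA_C1060_PRICE = 1300.0  # "around £1,300"


def price_ratio(expensive: float, cheap: float) -> float:
    """How many times more the expensive card costs than the cheap one."""
    return expensive / cheap


ratio = price_ratio(TESLA_C1060_PRICE, GTX_275_PRICE)
premium = TESLA_C1060_PRICE - GTX_275_PRICE

print(f"C1060 costs about {ratio:.1f}x a GTX 275 (premium: {premium:.0f} GBP)")
```

Since the GTX 275 price is an upper bound, the true ratio is at least 6.5, which fits "close to 7 times".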

All that 'stricter qualification process' may be important for mission critical applications such as medical equipment, but I suspect for even most Tesla users (who don't need the extra memory) it's probably just a waste of money - as a good quality after-market cooler should keep the thing reasonably safe.

It's no wonder they don't want to hand any over to you!

(To think I was actually weighing it up as an option...)
Lizard 4th August 2009, 15:39
Yes indeed, and to make matters worse I'm not convinced the 'additional qualification' is enough to justify the huge price difference, even if you are a professional user. For example, in contrast, a professional CPU (Xeon or Opteron) is a superior product to a desktop CPU (they have better thermals, error detection and correction) - something neither ATI nor Nvidia has addressed with their professional cards.
John_T 4th August 2009, 16:20
Haha! They've stuck some extra RAM on it, relabelled it and charged £1,100 for the privilege, haven't they - cheeky swines. Still, they're a business and not a charity I suppose - charge what you can get away with & all that.

You know, I'd absolutely LOVE to see one of these pitched against a bog-standard 295 now, as even for apps that could make use of the additional memory I'd be willing to bet the extra 240 cores would still beat it hands down.

A pity I don't know anyone who uses that kind of workstation - I'd persuade them to rip it out and send it to you for a couple of days so you could run a comparison. I'm sure they'd be willing if they could find out they could potentially increase performance for almost a £1,000 saving per card. There must be someone out there...
Splynncryth 5th August 2009, 04:17
Quote:
Originally Posted by iggy2k
you'd need university funding for the leccy bill.

That's a reason I stopped using what would have been folding overkill before the GPU clients. I nearly doubled my electric bill and that was in the winter when I could use it to help heat my place :)

Are there options for a system that only has x8 physical slots, and how does scaling the links down affect how well the GPU can do?
Lizard 5th August 2009, 08:50
8x PCI-E slots provide more than adequate bandwidth for folding - although you will need to slice the end of the slot to allow a 16x PCI-E card to fit. It's very easy to do though - just take your time with your favourite tool (Dremel/Stanley knife), being careful not to cut into the connections in the slot or the motherboard.
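For a sense of scale, here is a minimal Python sketch of the theoretical per-direction link bandwidth at different slot widths. The per-lane figures are the nominal PCI-E 1.x and 2.0 rates (250 and 500 MB/s per lane, per direction); the function name is hypothetical, and nothing here says how much bandwidth the folding client actually uses, which the thread does not state.

```python
# Nominal PCI-E throughput per lane, per direction, in MB/s.
PER_LANE_MB_S = {"1.x": 250, "2.0": 500}


def link_bandwidth_mb_s(generation: str, lanes: int) -> int:
    """Theoretical one-way bandwidth of a PCI-E link in MB/s."""
    return PER_LANE_MB_S[generation] * lanes


for generation in ("1.x", "2.0"):
    for lanes in (8, 16):
        bandwidth = link_bandwidth_mb_s(generation, lanes)
        print(f"PCI-E {generation} x{lanes}: {bandwidth} MB/s")
```

On these nominal figures, dropping from x16 to x8 halves the link bandwidth, and an x8 PCI-E 2.0 link matches an x16 PCI-E 1.x link.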
Aristide1 8th August 2009, 00:55
While the article was dead on in regards to graphics card advice, it could not have missed the mark further in the motherboard area. My K9A2 is the worst board I ever owned, buggy and unpredictable even after a BIOS update.
Lizard 9th August 2009, 10:07
I'm surprised to hear you've had problems with your K9A2. Why not pop into the folding forum and say hello? Loads of us have K9A2s there.
JackOfAll 9th August 2009, 11:51
There is absolutely no point in purchasing and using C1060 cards for FAH. Spend your $$$ on the latest 200 series cards. Aside from any differences in the qualification process, the main advantage, as has already been pointed out, is the additional RAM. Data transfer from system memory to card memory is a very expensive operation. If you have large datasets to process, there is a big speed advantage to minimizing the number of transfers, i.e. loading a large dataset onto the card as few times as possible. FAH does not operate on large datasets; it has been designed for the typical RAM available on consumer grade cards, and old ones at that, i.e. the 8 series.
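The point about minimizing transfers can be illustrated with a deliberately simplified Python model. It assumes (this is an assumption, not a measurement) that a dataset which fits in card RAM is uploaded once and stays resident, while one that does not fit has to be streamed through the card in full on every pass. The card RAM sizes are the ones mentioned in the thread; the dataset size, pass count and function name are hypothetical.

```python
def gb_copied_to_card(dataset_gb: float, card_ram_gb: float, passes: int) -> float:
    """Total GB copied from system memory to the card over `passes` passes.

    Simplified model: a dataset that fits in card RAM is uploaded once;
    one that does not fit is re-streamed in full on every pass.
    """
    if dataset_gb <= card_ram_gb:
        return dataset_gb
    return dataset_gb * passes


DATASET_GB = 3.0  # hypothetical working set
PASSES = 100      # hypothetical number of passes over the data

for name, ram_gb in [("C1060 (4GB)", 4.0), ("GTX 275 (896MB)", 0.896)]:
    copied = gb_copied_to_card(DATASET_GB, ram_gb, PASSES)
    print(f"{name}: {copied:.0f} GB copied over {PASSES} passes")
```

For a small working set that fits on either card, as described for FAH, both cases collapse to a single upload and the extra RAM buys nothing.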

You also need to bear in mind that people are using CUDA in professional or commercial datacenter applications. It's not centred on FAH or home use. Rack-mounted S1070 units are going to be purchased by the investment bank to run their Monte Carlo simulations on option positions, not home-built rigs with 4x GTX285. As a professional developer I'd expect to be using C1060s on my desktop development machine to develop the code that's going to run in production, i.e. you don't develop on consumer grade hardware and then deploy to hardware with different specs. In any case, if you're writing applications that use 4GB of card RAM, you need development hardware that supports that.

Whether it's right or wrong (or a complete waste of money if you don't need the 4GB of RAM per card), there is also the "no one ever got fired for buying IBM" predicament when purchasing hardware in a commercial setting. You simply don't purchase consumer grade hardware for a mission critical application.

The professional Tesla solutions have their place in the market, but there is absolutely no point in using them in preference to a consumer grade graphics card for FAH. No real benefit for a lot of extra expense.
JackOfAll 9th August 2009, 13:15
Quote:
Originally Posted by Lizard
Although it’s possible, mixing and matching different graphics cards in one PC isn’t always a good idea. This is due to a well documented (but still unfixed) limitation of the CUDA API – if the Nvidia GPUs have a different number of stream processors then folding will run much slower than it should.

James, I've seen several reports now that with the Windows 190.38 (CUDA 2.3) drivers this 'limitation' is fixed. Various other things as well. eg. if cards are in SLI mode, you can now access the individual GPU's, which you couldn't before.
Aristide1 16th August 2009, 01:08
My K9A2 running with 2 cards on Vista 64 behaved only slightly strangely. The RivaTuner settings would not stick (be remembered) and the FAH GPU configuration was also "forgotten", making for a massive loss of points. Then I added a 3rd card, and Windows now thinks I have 7 displays. Yes, SEVEN. Folding has gotten worse with another card, not better. The GPUs? 9600GSOs with 96 stream processors, nothing heat or bandwidth intensive. PSU? Seasonic 850.
Splynncryth 17th August 2009, 06:42
Quote:
Originally Posted by Lizard
8x PCI-E slots provide more than adequate bandwidth for folding - although you will need to slice the end of the slot to allow a 16x PCI-E card to fit. It's very easy to do though - just take your time with your favourite tool (Dremel/Stanley knife), being careful not to cut into the connections in the slot or the motherboard.

I'm just the steward of the box, so I can't go cutting things up, and it's a little pricey for me to want to try that on :) It's a serious server, which makes a few other things, like power, hard to deal with. I have not had it do much of anything for a while, but threads like this have me thinking about firing it back up for folding.
Lehmann 3rd March 2010, 17:45
A much better setup would be an i7 at 4GHz running the SMP client on A3 core work units (17,000 PPD), either standalone or with a couple of GTX 260s. I run my shaders at 1560 and average 8,000 PPD per card, for 33,000 PPD in total.
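The total can be checked with a quick sum in Python, assuming (as the numbers imply) that the 8,000 PPD figure is per GTX 260 and that "a couple" means two cards. The constant names are made up for illustration.

```python
SMP_PPD = 17_000      # i7 at 4GHz on the SMP client, as quoted
GPU_PPD_EACH = 8_000  # assumed to be per GTX 260
GPU_COUNT = 2         # "a couple of" cards

total_ppd = SMP_PPD + GPU_COUNT * GPU_PPD_EACH
cpu_share = SMP_PPD / total_ppd

print(f"Total: {total_ppd} PPD, of which the CPU contributes {cpu_share:.0%}")
```

On these figures the CPU alone accounts for roughly half of the rig's output.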
sadlydefiant 31st March 2010, 02:54
That is impressive.
I know it is for a good cause and all but that would cost more than most people are willing to spend.