Registered Member #75
Joined: Thu Feb 09 2006, 09:30AM
Location: Montana, USA
Posts: 711
Since we don't have money for a real computer in my lab, I get to build a small cluster to do our data analysis on. With a budget of roughly $5k and an emphasis on number crunching, my idea is to build 4 compute nodes with Intel's new Sandy Bridge 2600K processors and 16GB of memory each. For storage I would build an extra node with a cheap CPU and an 8-way RAID array of 2TB disks.
Most of our jobs can be parallelized manually, so I'm not really worrying about linking the computers together Beowulf-style at the moment.
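To give an idea of what I mean by parallelizing manually, here is a rough sketch of the kind of job launcher I have in mind (the node names, data files and analyze.sh script are just placeholders for illustration):

```python
#!/usr/bin/env python
# Rough sketch of a job launcher: deal independent data files out to the
# compute nodes and run an analysis script on each of them over ssh.
# Node names, file names and analyze.sh are placeholders.
import subprocess

NODES = ["node1", "node2", "node3", "node4"]
DATA_FILES = ["run001.mat", "run002.mat", "run003.mat", "run004.mat"]

def launch(node, data_file):
    """Start the analysis for one file on one node; returns a Popen handle."""
    return subprocess.Popen(["ssh", node, "./analyze.sh", data_file])

# One file per node, then wait for everything to finish. With more files
# than nodes you would queue them up, but that is the whole idea:
# embarrassingly parallel jobs, no MPI/Beowulf layer needed.
jobs = [launch(node, f) for node, f in zip(NODES, DATA_FILES)]
for job in jobs:
    job.wait()
```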
I don't really expect that anyone has built a similar system here, but I'm surely not the only one pondering putting together a Sandy Bridge rig in the near future. Since it's been a while since I last built a computer (I switched to Mac nearly 10 years ago), it would be really useful if somebody could look over my choice of components and point out any stupidities or incompatibilities. Here it goes:
Compute nodes (each):
- Intel i7 2600K
- Gigabyte H67A-UD3H (with the DVI port so I don't need graphics cards)
- G.Skill Ripjaws 4x4GB DDR3 1600
- Antec Three Hundred case
- Antec EarthWatts EA-380 PSU
- Caviar Blue 500GB HDD
Storage node:
- Pentium E6500
- Asus P5G41C
- Patriot 2x2GB DDR3 1333
- case, PSU and HDD as above
- Areca ARC-1220 RAID controller
- 8x Caviar Green 2TB HDDs
On Newegg the compute nodes come to about $900 each; the storage node, excluding disks, is even less. It's amazing how cheap computers have gotten. Short of porting all the code to run on GPUs, do you guys agree this is the cheapest way to get a lot of computing horsepower?
Registered Member #65
Joined: Thu Feb 09 2006, 06:43AM
Location:
Posts: 1155
Depends what types of problems you are trying to solve....
Sometimes two $300 graphics cards running CUDA can outrun a partition with 16 cluster nodes.
Cluster nodes work just fine off a PXE-booted OS, and workstations can be set to "dual boot" this way during the weekends. Also, a node costs under $250 for the bare-bones hardware: motherboard, 2 CPUs, power supply, expensive fiber NIC, and maximum RAM.
For $5k, you may want to look into Amazon cloud nodes with GPUs.
Registered Member #96
Joined: Thu Feb 09 2006, 05:37PM
Location: CI, Earth
Posts: 4061
A cluster made of YLOD PS3s?
Some university made one as a black hole simulator, IIRC.
The fix is really easy but takes a while; if you do them in parallel, it should take you about a day of work to do 8 or so.
Alternate idea (if you can't get PS3s): build a cluster using netbook motherboards overclocked to 40% faster than stock, with liquid cooling. Easy to do if you have access to a 3-D prototyping jig to make the cooling blocks... (plus they run Windows and with minimal hackery will boot "native" with no HDDs required!)
Registered Member #162
Joined: Mon Feb 13 2006, 10:25AM
Location: United Kingdom
Posts: 3140
I've no idea what kinds of programmes you wish to run, but I used to maintain mainframe/mini/Unix/PC systems and I think that the two best approaches are:
- a single large multi-processor Unix system with shared OS/memory/disk etc.
- using every spare computing cycle of every PC on the network (cloud computing, co-operative computing, NFS-like systems etc.)
Irrespective of the above, I would expect to gain more overall performance by spending $900 on a training course for the software than on an extra server. Mastering the software tools and programming wisely, based on a good understanding of the data and rules, can give >10x the performance of a poorly (or averagely) thought-out system.
Registered Member #1334
Joined: Tue Feb 19 2008, 04:37PM
Location: Nr. London, UK
Posts: 615
Dr. Shark wrote ...
Intel i7 2600k Gigabyte H67A-UD3H (with the DVI port so I don't need graphics cards)
Hmmm. Don't think the i7 has on-chip graphics. The Gigabyte board supports the i5 chips that do have graphics, but nothing else. You'll need a cheapo PCIe graphics card.
I may be wrong - I often am...
Edit: It seems the 2600K does have graphics when using the H67 chipset - I'd still double-check with the supplier...
Registered Member #27
Joined: Fri Feb 03 2006, 02:20AM
Location: Hyperborea
Posts: 2058
Sometimes two $300 graphics cards running CUDA can outrun a partition with 16 cluster nodes.
Most comparisons you find pit a single-threaded, badly written C program against a fully optimized GPU program.
If the comparison shows a 100x improvement on the GPU:
- Running 8 threads on an i7 gives a 6x improvement, so we are down to 16x.
- Spending the same amount of time on the C code as on the CUDA code: 8x.
- Changing over to use SSE: 4x.
The GPU is still faster but it is not always so fast that it is worth the extra effort.
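Just to illustrate how much the baseline matters, here is a quick Python/numpy sketch (numpy.dot calls an optimized BLAS, standing in for well written, vectorized and threaded CPU code; the explicit loops stand in for the badly written baseline):

```python
import time
import numpy as np

def naive_matmul(a, b):
    """Stand-in for the badly written single-threaded baseline: three loops."""
    n, k = a.shape
    m = b.shape[1]
    c = np.zeros((n, m))
    for i in range(n):
        for j in range(m):
            s = 0.0
            for p in range(k):
                s += a[i, p] * b[p, j]
            c[i, j] = s
    return c

n = 512
a = np.random.rand(n, n)
b = np.random.rand(n, n)

t0 = time.time(); naive_matmul(a, b); t_naive = time.time() - t0
t0 = time.time(); np.dot(a, b);       t_blas = time.time() - t0

print("naive loops: %.1f s, BLAS dot: %.4f s, ratio roughly %dx"
      % (t_naive, t_blas, t_naive / t_blas))
```

Against that kind of baseline, a "100x faster on the GPU" figure shrinks a long way before anyone has even touched the GPU code.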
Registered Member #75
Joined: Thu Feb 09 2006, 09:30AM
Location: Montana, USA
Posts: 711
Lots of good suggestions, especially on spending the money on training - I'll forward that to my colleagues. Seriously though, they are biologists and neuroscientists and would rather throw money at the problem than change any of the code. We have a huge repository of old MatLab code that we are using, and trying to change any of that would probably set our research back about 5 years.
In principle I like the CUDA approach, but I know a lot more about computers than anyone else in my lab combined, and I pretty much gave up when I tried it a couple of months ago. It seems it should be so easy for what we are doing - filtering data, training neural networks etc. - but all the easy toolboxes like PyCuda only get you so far. A real example: matrix multiplication is cheap and fast, but as soon as you need to slice an array in a weird way or access it element-wise, you need to write your own kernels. And once you have to write low-level code you have to know about optimal block size, what sits in cache vs. main memory, and it really becomes so complex that it takes less time in total to just run the code on a slow (CPU) machine. Maybe this is starting to change now that the latest MatLab release has CUDA support, but for now I think Bjørn is right on.
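To make that concrete, here is roughly the point where PyCuda stopped being easy for me (toy data, made-up clip kernel): whole-array maths on a gpuarray is a one-liner, but even a simple element-wise operation already means writing a bit of CUDA C.

```python
import numpy as np
import pycuda.autoinit                      # sets up a CUDA context
import pycuda.gpuarray as gpuarray
from pycuda.elementwise import ElementwiseKernel

a = gpuarray.to_gpu(np.random.randn(4, 4).astype(np.float32))

# The easy part: whole-array arithmetic works through operator overloading.
b = 2.0 * a + 1.0

# The not-so-easy part: anything element-wise or custom means writing CUDA C.
clip = ElementwiseKernel(
    "float *out, float *x, float lo, float hi",
    "out[i] = fminf(fmaxf(x[i], lo), hi)",
    "clip_kernel")

out = gpuarray.empty_like(a)
clip(out, a, np.float32(-1.0), np.float32(1.0))
print(out.get())                            # copy the result back to numpy
```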
The same applies to PS3s, by the way: we want an x86 environment so nobody has to write low-level code. We briefly discussed using Amazon cloud services, but the machines will be under full load pretty much 24/7, so it would not be economical.
My take on buying "big" machines, with say a quad-socket server board so we could have all 16 cores and the storage in one box, was that it's a lot more expensive than spreading the load over a number of smaller nodes. Even a 12-core Mac Pro or an equivalent Dell is more expensive than what I sketched out, and that is with a lot less storage and memory. Overall it seems to cost roughly twice as much as building your own. Future expandability is another big plus for the cluster.