
Building a Linux cluster

Pinky's Brain
Thu Jan 20 2011, 09:02PM
Registered Member #2901, Joined: Thu Jun 03 2010, 01:25PM, Posts: 837
Mac???? Srsly? Even Dell isn't really the one to look to for cheap servers; try Supermicro.

For a lot of cores on a single motherboard you could also consider AMD; the processors themselves are of course significantly slower, but the interconnect bandwidth is orders of magnitude better.

A 4-CPU, 2 GHz AMD system with 32 cores would be around $2500 without memory.

PS: in a 1U rackmount case ... compact, but not something you want to be near.
Carbon_Rod
Fri Jan 21 2011, 02:51AM
Registered Member #65, Joined: Thu Feb 09 2006, 06:43AM, Posts: 1155
PS3 scientific computing libraries generally require the right compiler. Yellow Dog offered the most features, but the cost per bogomip is now higher than that of cheap x86 boards. However, most of the neat stuff about the Cell architecture (and some of the libraries you would need) is now quietly no longer served by IBM.


Usually, a good place to start is traditional OpenMP, and OpenCL for transparent GPU support.
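Not OpenMP itself, but since Python comes up later in the thread, the same fork-join, data-parallel idea looks roughly like this with a plain process pool (the crunch() kernel is just a made-up stand-in for real work):

    # Rough Python analogue of an OpenMP-style parallel loop, using a
    # process pool; crunch() is a placeholder for a CPU-bound kernel.
    from multiprocessing import Pool

    def crunch(n):
        # placeholder work: sum of squares, purely to keep a core busy
        return sum(i * i for i in range(n))

    if __name__ == "__main__":
        with Pool() as pool:                 # one worker per core by default
            results = pool.map(crunch, [10**6] * 16)
        print(sum(results))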

A moment of silence for Sun's SGE... Oracle is turning it into an abomination too.
=(
Steve Conner
Fri Jan 21 2011, 11:02AM
Registered Member #30, Joined: Fri Feb 03 2006, 10:52AM, Location: Glasgow, Scotland, Posts: 6706
Carbon_rod: As the "Bogo" prefix suggests, Bogomips are not a measure of performance smile Link2

I have a Mac, as do many of my colleagues, but I can see how Macs might not be the most cost-effective compute servers. The high-end desktop ones are ridiculously expensive.

Cray have a new "deskside" model out. Link2 I have no idea how much it costs.
Dr. Shark
Fri Jan 21 2011, 05:48PM
Registered Member #75, Joined: Thu Feb 09 2006, 09:30AM, Location: Montana, USA, Posts: 711
Since nobody pointed out any serious flaws like the RAM or processor having too many pins to fit into the main board (I had that the last time I built a computer smile ) I went ahead and ordered the parts for one compute node and the RAID server.

I don't know much about bogomips, but I hear that Macs have the best bogomips / TFLOPS ratio. Seriously though, I think this Link2 comparison hits the nail on the head: there is no significant "apple tax" once you compare similar high-quality systems, but there is a 100% convenience tax for buying a complete system compared to building your own.

The biggest challenge will now be building the RAID-5 or -6 array using the hardware RAID controller in a Linux environment. I am sure there are lots of things that can go wrong, so I will be back with more questions soon.
Sulaiman
Fri Jan 21 2011, 07:45PM
Registered Member #162, Joined: Mon Feb 13 2006, 10:25AM, Location: United Kingdom, Posts: 3140
First, if your repository is of MatLab files then you may (and probably will) find that they are not suitable for a distributed processing environment; some re-programming/compiling may be required.
Contact your MatLab supplier and ask for suggestions.
Carbon_Rod
Sat Jan 22 2011, 12:16AM
Registered Member #65, Joined: Thu Feb 09 2006, 06:43AM, Posts: 1155
@Steve McConner
What metric would you recommend to quote your Linux kernel efficiency on symmetric multiprocessors in a heterogeneous networked parallel computer framework?
Prove this is optimal and you get $$$... I even paired the puzzle Link2 for you...=P
6 292 35 125 365 45 228 209 182 150 499 143 81 53 644 463 22 33 415 278 165 74 71 360 19 147 82 169 479 478 75 312 244 38 67 773 98 18 501 162 174 388 295 129 260 86 89 146 65 8 367 99 714 118 54 23 80 827 34 176 57 390 569 11 977 94 532 68 24 1 26 152 85 196 88 161 790 124 131 189 96 122 16 79 252 126 14 106 139 83 200 32 290 60 446 10 772 163 117 104 9 359 372 329 215 130 21 7 107 78 76 247 1485 338 2 316 90 596 20 254 70 571 4 43 141 115 97 583 144 15 205 132 270 36 91 100 214 206 56 231 194 5 37 297 253 40 563 49 87 61 1051 13 46 109 12 137 44 311 63 510 41 216 363 171 72 29 240 186 39 84 202 128 177 30 421 62 48 351 411 31 156 17 153 242 190 111 112 291 198 164 102 279 509 208 376 52 319 58 296 108 142 238 3 66 370 64 294 201 226 50 133


@Sulaiman
Indeed, MatLab and its 89 MB worth of "hello world" libraries have an export code generator for C/C++ (DSP libs too)... but it is not going to be optimized code.

@Dr. Shark
It may sound silly now, but you may want to start thinking about your file system choices if 2TB+ drives are in use.... he he he...


Dr. Shark
Sat Jan 22 2011, 01:17AM
Registered Member #75, Joined: Thu Feb 09 2006, 09:30AM, Location: Montana, USA, Posts: 711
MatLab has a parallel processing toolbox, but in my experience it is far inferior to using Python with openMPI. It uses Java for all communications between cluster nodes and it's really sluggish. That's kind of beside the point though, since our problems can easily be parallelized "manually". The data are organized by days and it's easy to just run one week's worth of data on one computer, the next week's data on another computer, and so on. So the problem is data-parallel in a really trivial way.
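Something like this is all the "one week per node" split really takes with mpi4py (the directory layout and the per-week function here are just placeholders):

    # Split the weekly data directories round-robin across MPI ranks.
    # Launch with e.g.:  mpirun -n 16 python run_weeks.py
    from mpi4py import MPI

    def process_week(week):
        # placeholder for the real per-week analysis
        print(f"rank {MPI.COMM_WORLD.Get_rank()} working on {week}")

    comm = MPI.COMM_WORLD
    rank, size = comm.Get_rank(), comm.Get_size()

    weeks = [f"/data/2011/week_{i:02d}" for i in range(52)]  # hypothetical paths
    for week in weeks[rank::size]:        # each rank takes every size-th week
        process_week(week)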

@Carbon_Rod, the fs is definitely a valid concern; I was planning to go with ext4 with LVM on top. LVM seems necessary for snapshots, so we could run nightly backups without having to interrupt the computing jobs. This would also allow adding more/bigger drives to the RAID as storage needs grow. Any thoughts on that?
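The nightly snapshot idea would boil down to something like this (volume group, LV, host, and mount point names are made up; it assumes the VG has free space left for the snapshot's copy-on-write data):

    # Take an LVM snapshot, back it up, then drop the snapshot, so the
    # running compute jobs never see the RAID go away.
    import subprocess

    def run(cmd):
        print("+", " ".join(cmd))
        subprocess.run(cmd, check=True)

    run(["lvcreate", "--snapshot", "--size", "20G",
         "--name", "nightly", "/dev/vg_raid/data"])       # hypothetical VG/LV
    run(["mount", "-o", "ro", "/dev/vg_raid/nightly", "/mnt/snap"])
    try:
        run(["rsync", "-a", "--delete", "/mnt/snap/", "backuphost:/backups/data/"])
    finally:
        run(["umount", "/mnt/snap"])
        run(["lvremove", "-f", "/dev/vg_raid/nightly"])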
Carbon_Rod
Sat Jan 22 2011, 03:45AM
Registered Member #65, Joined: Thu Feb 09 2006, 06:43AM, Posts: 1155
LVM is only really helpful if you are not using a RAID card, as it slows down the I/O.
And... if you boot from it... there could be "Problems" later...

A cluster file system does not usually use a traditional journal like ext4 or ext3. Note too that only the newest distros' utilities will support disk scans into the TB range... (you may have to do some compiling)

There are also other things to consider like maximum file counts and size limits.
Steve Conner
Sat Jan 22 2011, 10:22AM
Registered Member #30, Joined: Fri Feb 03 2006, 10:52AM, Location: Glasgow, Scotland, Posts: 6706
Carbon_Rod wrote ...

Prove this is optimal and you get $$$... I even paired the puzzle Link2 for you...=P

I have absolutely no idea what you're talking about tongue The only metric I understand is Number of happy customers * profit margin.

But you can't deny that Bogomips basically measure how fast the processor can execute the no-op instruction, and are therefore useless as a performance metric. (If you were a CPU designer, imagine how easy it would be to schedule all those NOPs in parallel for an industry-leading Bogomip rating. smile )

Some other things to think about re the original topic:

Integer or floating point math? For some applications, like financial trading, floating point isn't good enough; the application uses its own arbitrary-precision numbers, similar to Lisp's "bignums". The FPU doesn't get used, and the processor's integer math performance is what matters.
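Python's built-in integers are an easy way to see the difference: they are arbitrary precision, while a 64-bit float silently loses the low bits.

    # Arbitrary-precision integer arithmetic vs. 64-bit floating point.
    big = 10**30
    print((big + 1) - big)               # 1   -- exact, ints grow as needed
    print(float(big + 1) - float(big))   # 0.0 -- the +1 is below float64 precision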

For the vast majority of scientific applications, double-precision floating point is what's used. (Again, nothing to do with Bogomips.) When looking at MFLOPS figures, be sure to find out if they're single or double precision. Some FPUs can do both at the same speed, but others are twice as fast in single precision, and you can guess which figure will get quoted in the advertising literature. smile

Finally, you have to consider how your algorithms will map onto multiple cores. You said that your jobs can be easily parallelised by hand, so it may be worth checking whether your OS will allow you to schedule each job on one core, using only the memory most local to that core. Link2
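On Linux that per-job pinning can be done right from the process itself; a minimal sketch (core numbers are arbitrary), with numactl being the fuller tool if you also want to bind memory strictly to the local node:

    # Pin this process (pid 0 = self) to a single core before starting work.
    # Memory then tends to land on that core's NUMA node via first-touch;
    # for strict binding use e.g.  numactl --cpunodebind=0 --membind=0 <cmd>
    import os

    os.sched_setaffinity(0, {3})        # run only on CPU 3
    print(os.sched_getaffinity(0))      # -> {3}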

Dr. Shark
Sun Jan 23 2011, 07:07PM
Registered Member #75, Joined: Thu Feb 09 2006, 09:30AM, Location: Montana, USA, Posts: 711
Steve McConner wrote ...


Integer or floating point math? For some applications, like financial trading, floating point isn't good enough, the application uses its own arbitrary precision numbers similar to Lisp's "Bignums". The FPU doesn't get used, and the processor's integer math performance is what matters.


Everything we ever need to do is in float64. Our data comes from a 12-bit DAQ, so it would potentially make more sense to work in single precision, but MatLab does not like any format other than 64-bit doubles. MatLab is usually pretty quick to pick up new instruction sets like SSE and the new AVX, so with any luck the code can still run reasonably fast.
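For what it's worth, a quick NumPy check of what single precision would buy (array size is arbitrary): 12-bit samples fit easily inside float32's 24-bit significand, and the arrays are half the size.

    # Memory cost of double vs. single precision for the same sample count.
    import numpy as np

    samples = np.ones(10**7, dtype=np.float64)      # placeholder for DAQ data
    print(samples.nbytes // 2**20, "MiB as float64")                      # ~76 MiB
    print(samples.astype(np.float32).nbytes // 2**20, "MiB as float32")   # ~38 MiB
    print(np.finfo(np.float32).nmant, "stored mantissa bits in float32")  # 23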

MatLab can be told how many threads to use, and since it does not really do intelligent threading by itself (how could it?), it usually makes sense to run multiple instances, each constrained to a single thread. No clue whether it is doing anything smart with cache and RAM access though.
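One way to do the "one instance per core" launch is a small Python wrapper (the script name, core count, and run_chunk function are made up; -singleCompThread is MatLab's own switch for this, and taskset pins each instance to its core):

    # Start one single-threaded MatLab per core, each pinned with taskset
    # and given its own chunk index as an argument.
    import subprocess

    jobs = []
    for core in range(8):                              # assumed 8-core node
        cmd = ["taskset", "-c", str(core),
               "matlab", "-nodisplay", "-nosplash", "-singleCompThread",
               "-r", f"run_chunk({core}); exit"]       # run_chunk is hypothetical
        jobs.append(subprocess.Popen(cmd))

    for job in jobs:
        job.wait()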

wrote ...

A Cluster file system does not usually use a traditional Journal like Ext4 or Ext3. Note too that only the newest distros' utilities will support disk scans into the TB... (you may have to do some compiling)

We are not really shooting for a cluster fs; I don't know nearly enough about Linux to be able to pull that off. The idea is to have all storage on the central node and mount it via NFS (don't know if NFS 3 or 4) on all of the nodes. Ideally all data, code, and software would be on the RAID, and the compute nodes would just have Ubuntu and nothing else installed locally.
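Each node could run a small sanity check before starting jobs, just to make sure the central RAID is actually mounted (the server name and mount point here are made up):

    # Refuse to start if /data is not an NFS mount from the storage node.
    import sys

    def nfs_mounted(mountpoint="/data", server="storage"):
        with open("/proc/mounts") as f:
            for line in f:
                dev, mnt, fstype = line.split()[:3]
                if mnt == mountpoint and fstype.startswith("nfs") and dev.startswith(server):
                    return True
        return False

    if not nfs_mounted():
        sys.exit("central RAID not mounted via NFS -- refusing to run")
    print("NFS mount looks fine, starting jobs")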