HPC: If I build a cluster from a few multi-core machines, say five 4-core machines with 8 threads each, how many nodes do I have?

(Last Updated On: June 21, 2011)

This may be a silly question, but I am new to HPC. Suppose I build a cluster from a few multi-core machines, say five 4-core machines, each able to run 8 threads (e.g. Intel® Xeon® Processor X5687), with more than enough memory. How many nodes would I have: 40, 20 or 5? In other words, how many identical processes can I run simultaneously and independently with maximum performance? Is it 40, 20, 5 or some other number? Or does a node simply mean one machine, with no bearing on the number of processes?

Node == physical machine. The node count by itself says nothing about the performance, capacity or parallelism of your application(s).

Clearly a multi-core node with Hyper-Threading (HT) has more raw CPU capacity than a single-core node, but overall performance depends on your application and its performance profile (I/O, network, CPU, etc.).

Strictly speaking, you can run a 'test cluster' on a single physical node by using virtualization software to carve it up into multiple VMs. This lets you develop your parallel app, but of course it is not representative of the 'real-world performance' you would observe with multiple physical nodes connected by some interconnect (Gigabit Ethernet, 10 Gigabit Ethernet, InfiniBand, etc.).

I hope the following summary helps. 🙂
One node with an Intel X5687 processor has 4 physical cores. If Hyper-Threading (HT) is enabled (via BIOS setup), each physical core presents 2 logical CPUs, so the operating system sees 8 CPUs in total and you can run 8 application threads on one X5687 node. Even with 8 CPUs, a compute-intensive application may not run much faster on 8 threads than on 4, because the two logical CPUs of a core share that core's execution resources; the actual computation still happens on the 4 physical cores.
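The counting above can be written out as plain arithmetic. This is only a minimal Python sketch: `ClusterSpec` and its field names are hypothetical, and the figures (five machines, 4 cores each, 2 hardware threads per core) are taken from the question, assuming one processor per machine.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterSpec:
    """Hypothetical description of a homogeneous cluster (names are illustrative)."""
    nodes: int             # physical machines
    cores_per_node: int    # physical cores per machine
    threads_per_core: int  # 2 with Hyper-Threading enabled, 1 without

    @property
    def physical_cores(self) -> int:
        return self.nodes * self.cores_per_node

    @property
    def logical_cpus(self) -> int:
        return self.physical_cores * self.threads_per_core


# Figures from the question: five machines, 4 cores each, HT enabled.
spec = ClusterSpec(nodes=5, cores_per_node=4, threads_per_core=2)
print(f"nodes={spec.nodes} "
      f"physical_cores={spec.physical_cores} "
      f"logical_cpus={spec.logical_cpus}")

# On a real machine, os.cpu_count() reports logical CPUs (hardware threads),
# not physical cores, and may return None if the count cannot be determined.
print("logical CPUs on this machine:", os.cpu_count())
```

So the three numbers in the question each count something different: 5 is nodes, 20 is physical cores and 40 is logical CPUs.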
I think it depends on your definition.
For example, if a server schedules tasks to your cluster, you can run one client process per machine and multithread its calculation, giving you 5 nodes with up to 8x speedup each. Or you can run one single-threaded client per logical core, giving you a 40-node cluster.
The specs you gave confused me a little. I prefer one thread per physical core to avoid the overhead. If you are talking about 4 physical cores with Hyper-Threading or something similar, you should benchmark to decide whether to configure your environment with 20 or 40 worker threads.
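A benchmark of that kind could look roughly like the sketch below, which uses Python's standard `multiprocessing.Pool` on a single node. `burn_cpu` is a hypothetical stand-in for the real workload, and the task count and size are arbitrary assumptions; a meaningful measurement would use the actual application and much longer runs.

```python
import time
from multiprocessing import Pool


def burn_cpu(n: int) -> int:
    """Stand-in CPU-bound task (hypothetical): sum of squares below n."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def time_workers(workers: int, tasks: int = 16, size: int = 200_000) -> float:
    """Run `tasks` identical jobs on `workers` processes; return wall-clock seconds.

    The measured time includes pool start-up, so use larger jobs for real tests.
    """
    start = time.perf_counter()
    with Pool(processes=workers) as pool:
        pool.map(burn_cpu, [size] * tasks)
    return time.perf_counter() - start


if __name__ == "__main__":
    # Compare one worker per physical core (4) with one per hardware thread (8)
    # on a single node; adjust the counts to match the machine being measured.
    for workers in (4, 8):
        print(f"{workers} workers: {time_workers(workers):.2f} s")
```

If 8 workers finish noticeably faster than 4 on one node, 40 workers across the five machines is probably worth it; if the times are close, 20 may be the better setting.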
