Re: Fastest Matlab server under $4000?



"Kevin Chen" <hyperq@xxxxxxxxx> wrote in message <gfolru$67r$1@xxxxxxxxxxxxxxxxxx>...
Thanks for the suggestions. I agree that a single server will be easier to deal with than a cluster.

I checked Dell's server offerings. It looks like I can get a decent machine for a little over $2000. The basic configuration is 2 x Intel Xeon E5420 CPUs (quad core) with 8GB (4 x 2GB) of RAM. The total budget is $4000, so what other hardware should I get? I prefer not to spend another $1000 on CPUs that are just 10% faster.

I am looking for the best performance/cost ratio. Any tips on picking CPUs and RAM to get maximum performance out of Matlab? Should I also look into Nvidia GPUs?


cpp.matlab@xxxxxxxxx wrote in message <a74c2069-2766-4ae5-b98d-f5dbac18cab7@xxxxxxxxxxxxxxxxxxxxxxxxxxx>...
On Nov 13, 9:50 pm, "Kevin Chen" <hyp...@xxxxxxxxx> wrote:
I am a research professor building statistical models in Matlab for my research topics. I recently got a research grant, so I have $4000 (US dollars) to spend on hardware to run Matlab.

My models are very taxing on CPUs. The last one took a month to run on a 2 year old, shared server in my school. My question is: What is the fastest hardware I can buy with $4000? I simply need to run Matlab and nothing else.

Should I buy one large server with two Intel E5420 CPUs and 8GB ram, or a cluster of small blade servers? Which one will give me the best performance/cost ratio?


If you don't want to move away from MATLAB:
The best way would be to use that money to buy compute time at the Pittsburgh Supercomputing Center.

If you are willing to change your models:
Shift to C++. In my own experience, this reduced the run time of a gradient descent method from 4 days to 4 hours.



I am not a statistician, so I do not have an intuitive feel for what you might be doing. But I do have a background in computer science.

It boils down to this: are your problems what are termed 'embarrassingly parallel'?

That is to say, can your problems be broken down into N subproblems which can be computed independently? Or into N problems that are the same but with different parameters?

A simple example might be a genetic algorithm or other population-based search technique. A population of solutions to a problem is generated. Each is assessed at some computational cost.

But as these assessments take place, no communication is necessary between the computational processes. They are completely independent.

If you can frame your problem like that, then the best way to spend 4000 dollars is to buy multiple cheap quad-core systems (e.g. Shuttles) to maximise the number of processors you have.

For a problem of this kind, the speed at which your computation progresses will scale nearly linearly with the number of cores.
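To make the population example concrete, here is a minimal Python sketch of independent evaluations spread over several processes. Everything in it is an assumption for illustration: `fitness` is a hypothetical stand-in for one expensive model run, and the function names and worker count are invented here, not taken from the original poster's models.

```python
from concurrent.futures import ProcessPoolExecutor


def fitness(candidate):
    """Hypothetical cost function standing in for one expensive model run."""
    return sum((x - 1.0) ** 2 for x in candidate)


def evaluate_serial(population):
    """Assess every candidate one after another on a single core."""
    return [fitness(c) for c in population]


def evaluate_parallel(population, workers=4):
    """Assess candidates in separate processes.

    No candidate needs data from any other candidate, so the work can be
    handed out to the workers without any communication between them.
    """
    with ProcessPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as the input
        return list(pool.map(fitness, population))
```

Both functions return the same list of scores; only the wall-clock time differs. When run as a script, the call to `evaluate_parallel` should sit under an `if __name__ == "__main__":` guard so that worker processes can import the module safely.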

But what if your problem is not of this nature? What if it is fundamentally a serial algorithm?

Then you will completely waste your money by buying

"one large server with two Intel E5420 CPUs and 8GB"

because this is a system designed for parallel computing as I have described it. If your algorithms are fundamentally serial, adding CPUs or cores will not increase speed one iota.
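The point about serial algorithms can be put in numbers with Amdahl's law, which the post does not name but which is the standard way to state it. The sketch below assumes you can estimate what fraction of the run time is parallelisable; the function name and parameters are chosen here for illustration.

```python
def amdahl_speedup(parallel_fraction, n_cores):
    """Best-case speedup on n_cores when only parallel_fraction of the
    run time can be spread across cores (Amdahl's law)."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if n_cores < 1:
        raise ValueError("n_cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_cores)
```

With `parallel_fraction = 0.0` the result is 1.0 for any number of cores, which is the "not one iota" case. With `parallel_fraction = 1.0` the result equals the core count, which is the embarrassingly parallel case.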

Re: Matlab vs. C++, you will find that coding in C or C++ very often produces huge speed increases per CPU/core. You should look at a Beowulf cluster and MPI ("Message Passing Interface"). But actually, learning what Matlab does fast (linear operations on matrices) and what it does not (loops and much else) could save you the trouble of redoing it in C.

Say you want lots of CPUs. In terms of cost:

Buying 'one large server', even with many cores (especially from Sun, or even Dell), is much less economical than buying many small, cheap machines (a small quad-core system can cost as little as 400 dollars - so you could get 40 cores for 4000 dollars, plus the cost of networking and visualisation). This is how, e.g., Google does business.
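The arithmetic behind the 40-core figure can be checked with a short Python sketch. The 400-dollar price and 4 cores per machine come from the paragraph above; the 2100-dollar figure for the dual-CPU server is a hypothetical stand-in for "a little over $2000", and networking costs are ignored.

```python
def cores_for_budget(budget, price_per_machine, cores_per_machine):
    """Number of cores you get by buying as many whole machines as fit the budget."""
    machines = budget // price_per_machine
    return machines * cores_per_machine


# Ten 400-dollar quad-core boxes versus one dual quad-core server
cheap_boxes = cores_for_budget(4000, 400, 4)
one_server = cores_for_budget(4000, 2100, 8)  # 2100 is an assumed price
print(cheap_boxes, one_server)
```

This counts cores only. It says nothing about per-core speed or memory per machine, both of which also matter for the comparison.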

If your problem is truly serial, then there is nothing you can do. If it is parallel, you should get large numbers of cheap machines, whether you use matlab or not.

These are, trivially, the facts.


