Mechanica on multiple processors

antran7

New member
For the life of me, I can't get Mechanica to run on all 4
processors (or maybe it is, but doesn't look like it).
I'm running a fairly long FEA, and my CPU usage stays at
25% (can you tell I've got 4 processors?)

When Mechanica starts the run, here are its reported
run settings:
Parallel Processing Status
Parallel task limit for current run: 4
Parallel task limit for current platform: 64
Number of processors detected automatically: 4

But every run has the Elapsed Time nearly equal to the
CPU Time.

I've got these settings in my config.pro:
sim_run_num_threads all
sim_run_tmp_dir D:\Working\mechanica

And I tried this setting:
MECH_NUM_THREADS = 4

but it wasn't recognized

I've looked at my resource monitor, and the CPU, Disk,
Network, and Memory are not maxed out.

Any ideas what the bottleneck could be?

Thanks,
An
 
Mechanica's multiprocessor performance isn't that hot -
most multiprocessor software except perhaps rendering is
poor - tried any CFD packages?

Many of the phases Mechanica runs in a job are single
threaded. However it does use all processors during the
solving stage. I suspect the main bottleneck ends up
being write-to-disk - SATA3 Raid or solid state drive
might help more.

Your settings look about right. I generally see Elapsed
Time at about 65% of the CPU time on a dual core machine - is
your job big enough to stretch it on the solver front?
Use the solram setting from the *.pas file
(checkpoints) to set the memory allocation.

Frankly Mechanica needs a bunch of work on it - FloEFD
puts it to shame on the integration front. The WF5 update
has rolled over many of the faults from earlier versions
- it would be interesting to see if the same faults are
in CoCreate Mechanica.

It's all a pity really - it's like a piece of high class
real estate that's been handed down the inbred family and
left to rot. They don't know they've got a jewel.

There's a saying "clod to clod in three generations" -
perhaps this also applies to software...


Edited by: moriarty
 
Thanks for the info.

You're probably right: the write-to-disk might be the
bottleneck. According to the resource monitor, it's the
only resource that's occasionally maxing out (although
that's less than 5% of the time).

I'm getting a new machine with a SATA3 HD and a faster,
newer generation processor. Maybe I'll toss in a little
SSD as a working drive for Mechanica.

My final time was:
Elapsed Time: 12924.4 seconds
CPU Time: 13271.77 seconds

Elapsed Time was still 97.4% of CPU Time. So I suspect
there's some room for improvement. Maybe I should turn
off HyperThreading?
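
For anyone who wants to check their own run summary the same way, here is a minimal Python sketch of the arithmetic. The function name `parallel_ratio` is made up for illustration and is not part of Mechanica; the two times are the ones reported above. CPU Time divided by Elapsed Time is roughly the average number of busy threads, so a value near 1 means the run was effectively single threaded.

```python
def parallel_ratio(cpu_time_s: float, elapsed_time_s: float) -> float:
    """Return CPU Time / Elapsed Time (hypothetical helper, not a Mechanica API).

    A value near 1.0 means about one thread was busy on average;
    a value near N means about N threads were busy.
    """
    if elapsed_time_s <= 0:
        raise ValueError("elapsed time must be positive")
    return cpu_time_s / elapsed_time_s


# Times taken from the run summary in this post.
ratio = parallel_ratio(cpu_time_s=13271.77, elapsed_time_s=12924.4)

print(f"CPU Time / Elapsed Time: {ratio:.3f}")
print(f"Elapsed Time as % of CPU Time: {100.0 / ratio:.1f}%")
```

With these numbers the ratio comes out only a few percent above 1, which matches the 25% CPU usage seen on a 4 processor machine.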


Is there a way to designate some RAM as a temporary
working drive? That would be screaming fast and bypass
the hard drive controller. (maybe i would literally be
screaming as the FEA was running).
 
How much RAM do you have available? I have found that the multi CPU solver only works well if you can fit most of the model in RAM at the time it is running.
 
one machine has 6.5GB, and another has 4GB. Neither had
the RAM maxed out (according to the task manager or
resource manager)

And I'm running Windows7 64-bit to take advantage of the
more than 2GB RAM
 
Did you up your solram setting at all? The default (128 MB) is pretty puny. On my 8GB system I usually run it at ~4000 or so.
 
I created a RAMdisk, and am going to try running the
analysis on that (with all the temp files written to it,
obviously)

I'll keep you posted...
 
Oh, one other trick I use is to make sure my temp files go to a folder that is NOT monitored by our Virus checker... Nothing to slow down your ultra fast disk drive like the mandatory on-access scan :)
 
well.. it's still running the study. But in the
meantime, I looked at the improved write speeds

My original working drive had a write speed of 85 MB/s
(actually measured with a tool).

My RAMdisk has a write speed of 800 MB/s.

So I've got almost 10x the write speed, and the FEA isn't
running 10x faster. The bottleneck seems to be speed of
a single processor (or core)
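
A rough way to see why a 10x faster disk doesn't give a 10x faster run is Amdahl's law: only the fraction of the run that actually waits on the disk gets faster. The sketch below uses the two measured write speeds from this post. The disk-bound fractions are pure assumptions for illustration (5% is loosely based on the resource monitor observation earlier in the thread, not a measurement of the job), and `amdahl_speedup` is a name invented here.

```python
def amdahl_speedup(fraction: float, factor: float) -> float:
    """Overall speedup when `fraction` of the run time is sped up by `factor`."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    if factor <= 0:
        raise ValueError("factor must be positive")
    return 1.0 / ((1.0 - fraction) + fraction / factor)


# Measured write speeds from the post: hard drive vs RAMdisk, in MB/s.
disk_factor = 800.0 / 85.0

# ASSUMED disk-bound fractions of the total run time (illustrative only).
for fraction in (0.05, 0.25, 0.50):
    speedup = amdahl_speedup(fraction, disk_factor)
    print(f"disk-bound {fraction:4.0%} of the run -> overall speedup {speedup:.2f}x")
```

If the disk really is the limit for only about 5% of the run, the best case is a gain of a few percent, no matter how fast the working drive is.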

Too bad... maybe in the (near?) future
 
What spec is your machine and what version of WF are you
running?

There is only a small gain from hyperthreading - at least
with old technology. This may be a different story with
Core i7.

I've been told that 4GB is the minimum memory requirement with
Windows7 x64 - it runs poorly with less memory.

Don't put solram too high - use the solram setting from
the *.pas file from the job as a guide.

If someone is prepared to put a file up - we could each
run it as a benchmark and post the results?
 
I'd be up for benchmarking my current workstation and one
that's on its way. I can't post what I'm working on
right now (I'd be sleeping with the fishes, if I do).

I ran it last night, dedicating 1GB of my 6.5GB of RAM to
a RAMdisk. It ran 14% slower. Seems like the faster
read/write of 1GB of space didn't offset the loss of 1GB
of RAM.

Still looks like the bottleneck could be a single
processor, but could be the bus speed or other nerdy CS
acronyms that I bluff but really don't know what they do.


I'm currently on a pretty old workstation:

Dell Precision 670
MB Dell
North Bridge Intel E7525
South Bridge Intel 82801EB
Dual Processor Nocona Xeon 3.0 GHz
800 MHz FSB
16 KB L1 Cache
1 MB L2 Cache
6.5 GB DDR2-400 PC2-3200 (200 MHz) ECC

On my laptop, it runs in less than half the time
(about 45%)
Macbook Pro 2010
MB Apple
North Bridge Intel ID0044 rev12
South Bridge Intel HM55 rev12
Dual Core i7 2.67GHz
2x32 KB L1 Cache
2x32 KB L2 Cache
4 MB L3 Cache
4 GB DDR3-1066 PC3-8500F (533 MHz) non-ECC

Hmmm... so I take back my original hypothesis that the
bottleneck is the speed of a single processor. It could
be the FSB, RAM speed, and/or cache size and speed.

And that supports my observation that when I check the
resource monitor during the run, a single processor isn't
continually maxed out.


Funny how since I've got so much machine downtime, I've
become more interested in how to optimize for Mechanica
than the results I'm waiting for
 
Also, I'm running WF4 M100

On a side note, I have noticed a significant improvement in
Mechanica between WF2 and WF3 and beyond
 
A benchmark that runs for 15 min would be good. It probably needs to be a simple stress analysis that can run on 64 and 32 bit machines so it runs the same each time.


Anybody got a problem they can share?
 
We run the following
WF5 M040
Win7x64
1X Core i7 980X (hexcore)
16 GB DDR3-1333
SATA2
Mechanica detects 12 processors (hyperthreading ON). In the solver phase it runs at over 90% CPU usage with the 12 processors.
CPU Time/Elapsed Time= 3.51

Edited by: moriarty
 
is the default interface between the 2 parts meant to be
bonded? I assume so, since the "with contacts" checkbox
isn't shown... I'm running it on my Macbook Pro now..
 
Here's what my MacBook Pro did on your analysis:
Macbook Pro 2010
MB Apple
North Bridge Intel ID0044 rev12
South Bridge Intel HM55 rev12
Dual Core i7 2.67GHz
2x32 KB L1 Cache
2x32 KB L2 Cache
4 MB L3 Cache
4 GB DDR3-1066 PC3-8500F (533 MHz) non-ECC

Wildfire 4 M100

Elapsed Time 3527 sec (59 minutes)
CPU Time 3713.86 sec
Memory Usage 673915 kb
Work Dir Disk Usage 11770699 kb
 
WF5M040
Win7x64
Xeon DP5050 3GHz (1processor,2cores,4threads)
RAM 11 GB PC2-5300
SATA2

Elapsed Time(sec) 3138.17
CPU Time(sec) 7469.22
Memory Usage 693909 kb
Work Dir Disk Usage 10823494 kb
solram=128

Here's a graph of CPU Time / Elapsed Time through the
various phases of the job. Equation Solve multi-threads
significantly after Pass 1; Post-Processing Calcs only
slightly. Everything else is single threaded. Getting
over 3 with 2 cores isn't too bad - hyperthreading does
seem to be working even with this ancient processor.

View attachment 4572


Edited by: moriarty
 
Here are my current ancient workstation's results:

Dell Precision 670
MB Dell
North Bridge Intel E7525
South Bridge Intel 82801EB
Dual Processor Nocona Xeon 3.0 GHz
800 MHz FSB
16 KB L1 Cache
1 MB L2 Cache
6.5 GB DDR2-400 PC2-3200 (200 MHz) ECC

WF4 M100

Elapsed Time 4510.02
CPU Time 9213.73
Memory Usage 665501 kb
Work Dir Disk Usage 11408212 kb


On a side note (and back to my original post), it looks
like the ratio of Elapsed Time/CPU Time is very dependent
on what type of analysis you're running.
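
To put the three benchmark results posted so far side by side, here is a small Python sketch. The times and machine descriptions are copied from the posts above; the `Run` class and its field names are just illustration. Note that the runs differ in Wildfire release and hardware, so this is only a rough comparison, not a controlled one.

```python
from dataclasses import dataclass


@dataclass
class Run:
    machine: str
    release: str
    elapsed_s: float
    cpu_s: float

    @property
    def ratio(self) -> float:
        """CPU Time / Elapsed Time, roughly the average number of busy threads."""
        return self.cpu_s / self.elapsed_s


# Values as posted in this thread for the shared benchmark analysis.
runs = [
    Run("MacBook Pro 2010 (dual core i7)", "WF4 M100", 3527.0, 3713.86),
    Run("Xeon DP5050 (2 cores, 4 threads)", "WF5 M040", 3138.17, 7469.22),
    Run("Dell Precision 670 (dual Nocona Xeon)", "WF4 M100", 4510.02, 9213.73),
]

# Fastest wall-clock time first.
for run in sorted(runs, key=lambda r: r.elapsed_s):
    print(
        f"{run.machine:38s} {run.release:9s} "
        f"elapsed {run.elapsed_s:8.1f} s   CPU/Elapsed {run.ratio:.2f}"
    )
```

A higher CPU/Elapsed ratio does not by itself mean a faster run: it only says how many threads were busy on average, so the Elapsed Time column is the one to compare between machines.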


I (allegedly) get my new workstation tomorrow. When it's
up and running, I'll post its results
 
