Asus Z9PE-D8 WS (2x Xeons wrong scaling?)


#1

Hello

I am new to the forum.
First of all, I would like to say hello to all members. Sooo, hi!!

The reason I am writing in this thread is that I also own an Asus Z9PE-D8 WS. The problem is that I have 2x Xeon ES 2687W, B0 step 2, a very early silicon chip. I think the previous model cannot even recognise a second CPU in a two-CPU motherboard, just so you know how early my version is. Don’t get me wrong, the CPUs are OK in general. They have not given me many problems, other than that the system does not seem to run as smoothly as I expected; it sometimes feels like it lags or bottlenecks. That’s not the main problem, though. I am using this rig for 3D animation, video editing, etc.
The problem is that the render times in Vray or mental ray don’t scale as one would expect. So, let’s do the maths:

Vray 2.40 test scene (the same goes for Vray version 3.00):

  1. With 1x CPU: 4:50 minutes
  2. With 2x CPU: 3:20 minutes, instead of half the time (if one CPU takes 4:50, two should take 2:25, not 3:20)
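
For reference, the arithmetic above can be checked with a short Python sketch. It assumes the times are minutes:seconds (4:50 and 3:20, as the second list item reads them); the function names are only illustrative. With these numbers the second CPU gives roughly a 1.45x speedup rather than 2x.

```python
def to_seconds(minutes: int, seconds: int) -> int:
    """Convert a minutes:seconds render time to seconds."""
    return minutes * 60 + seconds


def speedup(t_baseline: float, t_parallel: float) -> float:
    """How many times faster the parallel run is than the baseline."""
    return t_baseline / t_parallel


def efficiency(observed_speedup: float, units: int) -> float:
    """Fraction of the ideal (linear) speedup actually achieved."""
    return observed_speedup / units


one_cpu = to_seconds(4, 50)   # render time with 1x CPU
two_cpu = to_seconds(3, 20)   # render time with 2x CPU

s = speedup(one_cpu, two_cpu)
e = efficiency(s, 2)
ideal = one_cpu / 2           # what perfectly linear scaling would give

print(f"speedup with 2 CPUs: {s:.2f}x")
print(f"scaling efficiency:  {e:.1%}")
print(f"ideal 2-CPU time:    {ideal:.0f} s, observed: {two_cpu} s")
```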

In mental ray it gets worse: it doesn’t matter whether I have 1x CPU or 2x CPU in the motherboard, the render time is the same.

So the only thing I am asking is whether someone with the same rig (mobo and CPU model), but with final-silicon Xeon 2687W chips, could please download the scene files and post the render times.
I only need the times with 2 Xeons; there is no need for you to go and unplug the second CPU.

That would be a great help, because then I would understand whether the problem is the Xeons’ early stepping or the software not scaling properly.

Fun fact: in Cinebench with 2x CPUs, the results I get are in line with other similar rigs I saw online. The same goes for Benchwell from Maxwell Render.

Rig setup:
Asus Z9PE-D8 WS
2x Xeon 2687W ES, B0 step 2
64 GB of ECC RAM (from the QVL memory list)
RAM config: 4 DIMMs of 8 GB per CPU
GPU: Nvidia Quadro 6000
Intel SSD for the OS
OS: Windows 7 Pro 64-bit
PSU: 1200 W Corsair

Any help would be appreciated.

Here are both scenes in a WeTransfer link:

http://we.tl/ehNtHFt8iH

Thank you in advance.


#2

There is no guarantee that rendering will scale linearly with the number of available units.
If you pitch all cores against the same frame, there will still be plenty of times when large amounts of work bottleneck onto fewer threads, or even a single thread.
If a particular scene or case is heavily biased towards those weak spots, you can very well see render times that are almost unchanged.

I don’t know VRay well enough, nor do I have a license to open the scenes and try them, but in general don’t assume that adding CPUs will linearly decrease render times; it usually won’t.

Run multiple frames at once if you want a near-linear increase in speed (provided you have enough RAM to handle the footprint).


#3

This. The CINEMA 4D render engine is used by Intel to benchmark new processors. Depending on the exact CPU, doubling the number of cores can produce a speedup factor of anywhere between 1.7 and 2.2. It gets more extreme if you compare processors of different types or if you use a less optimal scene configuration.
In worst-case scenarios we have seen an absolute decrease in render power when doubling the number of cores. In one case we got a factor of 0.7 instead of the hoped-for 2.x.
Cheers
Björn


#4

Thank you both for the information. I still have a question: we do prints, large prints. A still render could be 35,000 x 10,000 pixels, so what is the best solution for this?
Multiple cores in one single rig, as I have now, or multiple PCs with i7s and distributed rendering?
If that is the case, I could break down my system into single-CPU mobos, run each Xeon on its own, and combine them with the other 2 i7 systems I have. That would give me 48 threads:
16 x 2 from the Xeons and 8 per i7 system. So the final question is: will it work better distributed, or will it be the same thing?

Thank you once again.


#5

It’s impossible to tell without knowing the engine and the scene.
Certain things that are horribly storage intensive, for example, in an engine that doesn’t deal well with the problem (or in a situation leading to the same), might lead to the case Srek described, where that part of the rendering is actually negatively impacted by the number of units.
Other cases scale barely above 1x, say, when an engine’s inefficient acceleration structures are constantly waiting on a single all-important thread.
Other cases again scale ideally, hitting 2x or close to it and distributing tiles efficiently.

There is simply no best way to do it.
In your place I would move this from a hardware point of view back into rendering territory.
Go on Chaos Group’s forums and ask them what’s best for your typical scenes and settings.

Mental ray is affectionately called “Mental Delay” by many for good reasons, so I won’t comment past that :stuck_out_tongue:


#6

When testing scaling here at Maxon, it turned out that upping the render resolution is among the most unpredictable things you can do when it comes to render time. This will surely be different for each render engine, but I doubt that the behavior will be anywhere near linear.


#7

I’ve got a system here with that board and retail versions of those chips. No issues with scaling here. As others have mentioned, scaling will depend greatly on the scene and engine, but in something I/O-light and CPU-heavy you should get close to linear scaling up to 2 sockets. Try Cinebench to see if your scene is the culprit.

With ES chips, though, all bets are off, as they can behave unpredictably.


#8

This might be of interest to you. Adding more processor cores only improves the performance of tasks that can be done in parallel.

http://en.wikipedia.org/wiki/Amdahl's_law
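
As a rough illustration of Amdahl's law, here is a minimal Python sketch. It takes the two Vray timings from post #1 (4:50 with one CPU, 3:20 with two) and treats each CPU as one processing unit, which is a simplifying assumption; the function names are made up for this example. Under that assumption, the timings imply that only about 62% of the render time is parallelisable across the two sockets.

```python
def amdahl_speedup(parallel_fraction: float, units: int) -> float:
    """Amdahl's law: speedup on `units` processors when only
    `parallel_fraction` of the single-unit run time can be parallelised."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / units)


def implied_parallel_fraction(observed_speedup: float, units: int) -> float:
    """Invert Amdahl's law: the parallel fraction that would explain
    an observed speedup on `units` processors (units must be > 1)."""
    return (1.0 - 1.0 / observed_speedup) / (1.0 - 1.0 / units)


# Timings from post #1, in seconds (4:50 and 3:20).
observed = 290 / 200

p = implied_parallel_fraction(observed, 2)
print(f"implied parallel fraction: {p:.2f}")

# What the same fraction would predict for more units (purely illustrative).
for n in (2, 4, 8):
    print(f"{n} units -> {amdahl_speedup(p, n):.2f}x")
```

The model ignores memory bandwidth, disk I/O, and inter-socket effects, so it is only a first-order estimate, not a diagnosis of the ES chips.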

Processing considerations aside, there are other bottlenecks that the number of processor cores won’t have any effect on, such as how long it takes to load textures and geometry from the disk.

As Raffaele said, if you want the processor cores to be utilized to their fullest potential, then run concurrent tasks, assuming there’s enough memory. For example, if you have 12 processor cores, then run 12 concurrent renders at a time (each limited to a single core).

While each frame will take longer compared to throwing all 12 cores at one frame, a sequence of many frames will render much faster since more of the cores will be used more of the time.
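
The trade-off described above can be sketched with a toy model in Python. All numbers here (120 frames, 60 s of serial work and 600 s of parallelisable work per frame, 12 cores) are hypothetical and chosen only for illustration; the model also assumes memory and disk are never the limit.

```python
import math


def all_cores_on_each_frame(frames: int, serial_s: float,
                            parallel_s: float, cores: int) -> float:
    """Total time when frames render one after another, each using every core.
    Only the parallel part of a frame shrinks with the core count."""
    per_frame = serial_s + parallel_s / cores
    return frames * per_frame


def one_core_per_frame(frames: int, serial_s: float,
                       parallel_s: float, cores: int) -> float:
    """Total time when `cores` single-core renders run side by side."""
    batches = math.ceil(frames / cores)
    return batches * (serial_s + parallel_s)


# Hypothetical workload, not measured data.
FRAMES, SERIAL_S, PARALLEL_S, CORES = 120, 60.0, 600.0, 12

shared = all_cores_on_each_frame(FRAMES, SERIAL_S, PARALLEL_S, CORES)
concurrent = one_core_per_frame(FRAMES, SERIAL_S, PARALLEL_S, CORES)

print(f"all cores per frame:  {shared:.0f} s total")
print(f"one core per frame:   {concurrent:.0f} s total")
```

In this model each individual frame takes longer in the concurrent setup, but the whole sequence finishes sooner because no core sits idle during the serial part of another frame.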


#9

Thank you all for answering!!! The problem is that I render still prints, so I need all the cores for one frame. I rarely render animations, so I would like a solution for that, if there is one of course.
Would it be better for me to have 2-3 rigs with a single CPU each (say an i7 4930K) and use distributed rendering? I could break my rig down into single-CPU motherboards, with one Xeon as the main system and another rig for the second Xeon for distributed rendering. If I am correct, that would render one single frame faster than both of them on a two-CPU motherboard, right? And then I could add some i7s for extra distributed threads?

Thank you all very much for caring and answering!!!


#10

Having more cores in a single machine will perform better for rendering a single frame than the same number of cores distributed across several machines. Network communication adds a further bottleneck and slows things down even more. If you want a single frame to render faster, then you want a lot of fast cores in one machine. Adding more cores in other machines for distributed tile rendering can help, but it wouldn’t be my first suggestion.


#11

Thank you all for the replies and answers. I now understand what is going on with the cores, the speed and all that stuff, and I found the best solution for my rig.

Thank you once again.