Single tower CPU/GPU Renderfarm?


#1

Hello there, one and all. I have an insanely bizarre, hypothetical question: can you build a render farm in a single tower? If I were to take a Supermicro quad-socket motherboard and stuff it with four hexadeca-core Opterons, four video cards, and 64–128 GB of RAM, could I theoretically run eight instances of a CPU/GPU rendering engine like Blender Cycles or Iray, leaving a single core free on each CPU for each video card?

Would this tax the system so hard overall as to make it unstable and hurt performance (like how system overhead often makes quad SLI perform worse than triple SLI)? Would the CPUs, even with a core or two free, be unable to juggle a video card each while also rendering? I got the idea after discovering that rendering with two instances of Blender (I have two video cards) performed better than having both cards working on the same frame.
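
For what it's worth, my two-instance test was nothing fancy: each copy of Blender only gets to see one card via CUDA_VISIBLE_DEVICES. A rough sketch of the launcher I use (the .blend name and frame numbers are just placeholders, and it assumes CUDA cards and a file already saved with GPU compute enabled):

```python
import os
import subprocess

# Each instance only "sees" one card, so the two renders never fight over a GPU
a = subprocess.Popen(["blender", "-b", "shot.blend", "-f", "1"],
                     env=dict(os.environ, CUDA_VISIBLE_DEVICES="0"))
b = subprocess.Popen(["blender", "-b", "shot.blend", "-f", "2"],
                     env=dict(os.environ, CUDA_VISIBLE_DEVICES="1"))
a.wait()
b.wait()
```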

The reason I ask is that I already have two micro-ITX computers as a small render farm and have no more room, so the thought of essentially running eight computers' worth of instances inside one case is appealing. Please, if you have any other ideas or suggestions in the same vein, post them!

P.S. I work from home and live in Canada, so thermals will not be a problem. XD


#2

I think you’re missing the point of 4-socket systems. They don’t have 4 sockets so you can run 4 apps at once; the system acts as a single mega-machine. Just run one copy of Blender and it will render across all 64 cores; there’s no need to run Blender 4 times, and no need for 4 graphics cards. You’ll either want lots of graphics cards for a GPU render engine like Cycles, or lots of CPUs for a traditional render engine. Doing both at once makes no sense.
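
For what it's worth, a headless render is a single command; with -t 0 Blender autodetects and uses every core it can see (the scene name here is just a placeholder):

```python
import subprocess

# One Blender instance rendering the whole animation; -t 0 = use all available cores
subprocess.run(["blender", "-b", "scene.blend", "-t", "0", "-a"])
```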

Oh, and look at Intel for rendering; AMD’s Opterons are… lacking.


#3

I worded that wrong and also realised I made a booboo. Video cards do not scale 1:1 in performance when you add them to a single render, and for some odd reason I forgot that the same is not true of CPUs. :banghead: I am so embarrassed. :blush:

My thought process was that the CPUs would perform better each working on their own instance rather than all four on the same render at once. I knew this not to be true, yet somehow I completely forgot it while thinking about the system overhead associated with video cards. :argh:

What I originally meant to ask, before my brain decided it was time for a grand derp, was this: with a single instance of Windows 7 running, could 60 Opteron threads match the performance of a single GTX 980 while the other 4 threads are dedicated to keeping 4 GTX 980s chugging as well? That would let you run 5 instances of Blender, not 8, without system overhead impairing performance. I know that all 64 threads murder a 980, as well as a Titan Black, but could they do it while also feeding the video cards, since CPU+GPU rendering in the same instance is not supported?

My real-life plan is simply to get the cheapest chip that can support 4 video cards on the same mobo and run 4 instances of Blender (since there is a 10–20 percent performance loss when the cards work together on the same render; I have done only tentative tests with Iray), but I decided to ask about the hypothetical extreme as a potential future option.
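
The launcher for that plan would be something like this rough sketch: one Blender instance per card, each pinned with CUDA_VISIBLE_DEVICES and given its own chunk of the animation (scene name, frame range, and card count are placeholders; it assumes CUDA cards and a .blend saved with GPU rendering enabled):

```python
import os
import subprocess

BLEND = "shot.blend"     # placeholder scene
START, END = 1, 1000     # placeholder frame range
GPUS = 4                 # one Blender instance per video card

chunk = (END - START + 1) // GPUS
procs = []
for i in range(GPUS):
    s = START + i * chunk
    e = END if i == GPUS - 1 else s + chunk - 1
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(i))  # pin instance i to card i
    procs.append(subprocess.Popen(
        ["blender", "-b", BLEND,
         "-o", "//out_gpu%d_#####" % i,   # separate output prefix per instance
         "-s", str(s), "-e", str(e), "-a"],
        env=env))

for p in procs:
    p.wait()
```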

Two sub-$500 Xeons are starting to sound better from the extra research I have been doing in the meantime; they should outperform a $1,000 5960X. Still not something I am entirely decided upon; most likely I will still go with a cheap(ish) CPU.

P.S. I went full derp, never go full derp. :surprised


#4

Sort of. Even though most renderers scale well to many processor cores for ray tracing, there are still bottlenecks like loading files, loading the renderer application itself, decompressing textures, etc. Running multiple instances of a renderer on a single machine increases efficiency as long as there’s enough memory for all of the instances.

The idea is that the otherwise wasted processor cycles that go by while one instance is bottlenecked can be used by another instance for ray tracing. The more instances you have sharing the resources, the fewer chances there are for processor cycles to go unused over the day/month/year.

The downside is that each frame takes longer to render, since the resources are being split with the other instances of the renderer. Utilizing render nodes this way can be kind of annoying, for example when you want to make sure at least one frame of a submitted job comes out right (it takes longer to see feedback on a single frame), but the improvements in overall efficiency add up to a lot.
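
As a rough sketch of what I mean (instance count, scene name, and frame range are made up): interleave the frames across a few instances so that while one copy is stuck loading or decompressing, the others are still tracing rays.

```python
import subprocess

BLEND = "shot.blend"   # placeholder scene
INSTANCES = 4          # copies of the renderer sharing the machine
END = 240              # placeholder last frame

procs = []
for i in range(INSTANCES):
    # Instance i renders frames i+1, i+1+INSTANCES, ... so the stalls overlap
    procs.append(subprocess.Popen(
        ["blender", "-b", BLEND,
         "-s", str(i + 1), "-e", str(END), "-j", str(INSTANCES), "-a"]))

for p in procs:
    p.wait()
```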

Having said that, I don’t think a super-duper render node is a good idea unless you need that much memory for a single task. In terms of performance per dollar, lower-end hardware is almost always a better deal. Balance this against any software licenses, if applicable, because a cheap render node is only cheap if it doesn’t have a huge license fee attached to it.