Cryptite
12-05-2012, 02:07 PM
This is a call for advice on switch traffic theory for a small (~13 artists) VFX studio. For the most part we do fine, but we have a couple of problems (read: one big one) that choke our bandwidth, and I'd love to hear how your studios run and what has and hasn't worked for you in the past.
Here's the setup for us:
Roughly 5 fileservers (FS for the rest of the post): one main one that serves all of the production files; the other 4 are rarely used and can be ignored for the time being.
60 render farm blades; our 20 fastest render our Fusion/Nuke comps (this is where the problem lies, which I'll get to in a moment).
~13 artist workstations, presumed equal in speed; this is a bandwidth issue, so it doesn't much matter here.
The switch setup is the following:
40 render blades, all 3D renderers with no Fusion rendering, are on Switch A.
The other 20 aforementioned comp/3D rendering blades are on Switch B.
Switch B is wired to Switch A with 2 CAT6 cables.
Switch A then routes traffic from both itself and B (daisy-chain, if you will) to the "Main Switch" through 4 CAT6 cables.
The "Main Switch" hosts all of our file servers and workstations as well as all of the traffic from the render farm switches.
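To put rough numbers on the switch setup above, here is a minimal sketch of the uplink oversubscription at each hop. It assumes every CAT6 run is a 1 Gb/s link, every blade has a single 1 Gb/s NIC, and bundled cables aggregate perfectly; none of that is stated above, so swap in the real values. The function name `oversubscription` is just for illustration.

```python
# Assumptions (NOT from the post): 1 Gb/s per cable, 1 Gb/s per blade NIC,
# and ideal aggregation across bundled cables.
LINK_GBPS = 1.0


def oversubscription(hosts: int, uplink_cables: int,
                     link_gbps: float = LINK_GBPS) -> float:
    """Worst-case host demand divided by uplink capacity."""
    demand = hosts * link_gbps
    capacity = uplink_cables * link_gbps
    return demand / capacity


# Switch B: 20 comp/3D blades behind 2 cables to Switch A.
ratio_b = oversubscription(hosts=20, uplink_cables=2)

# Switch A uplink: its own 40 blades plus B's 20, over 4 cables to Main.
ratio_a = oversubscription(hosts=60, uplink_cables=4)

# 10 licensed Fusion nodes, all on Switch B, pulling EXRs at once.
ratio_fusion = oversubscription(hosts=10, uplink_cables=2)

print(f"Switch B uplink:        {ratio_b:.0f}:1")
print(f"Switch A uplink:        {ratio_a:.0f}:1")
print(f"10 Fusion nodes via B:  {ratio_fusion:.0f}:1")
```

Under those assumptions, 10 Fusion nodes alone could ask for several times what Switch B's uplink can carry, before counting the Switch A uplink they share with the other 40 blades. The main FS's own connection to the Main Switch is not modeled here.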
The main bandwidth choke point is when we launch Fusion jobs to the farm. We only have 10 render licenses for it, but when 10 of our fastest machines (or probably any 10, for that matter) start rendering Fusion, nearly everybody on the network takes a noticeable hit (our AE guys complain the most) and pretty much has to work through it until those jobs finish. The cause is the constant pull of very large EXRs, combined with how Fusion handles EXRs (poorly, but that's a topic for another day). No other jobs seem to have this problem, as all of our 3D scene files and assets are copied locally before they render.
If it needs mentioning, we use Royal Render as our render farm manager.
Also, Nuke is new to our pipeline and we're slowly introducing it. We know it handles EXRs better, but is it so much better (orders of magnitude) that this problem may not even exist anymore once we've fully transitioned?
My question to ye VFX types: how does your company route its network traffic? We've had many internal theories about separating switch connections so that the farm and the main FS can talk to each other directly, without going through the same switch that the workstations use. Either way, we know something needs to change; we're just not 100% certain what that is.
Also, for extra credit: how do you host your files in terms of production assets and renders for compositing? Do you keep them all in the same project folder? Do you put your renders on an entirely different FS so that they can be accessed without affecting the production FS? Do you all use SSDs? Do your computers sit atop wireless unicorns?
Thanks in advance!
-Crypt
