CPU render hours at DreamWorks

Old 11 November 2012   #1
CPU render hours at DreamWorks

"To create 'Puss in Boots,' artists used more than 200 high-performance HP Z800 Workstations to design everything in the film. HP ProLiant BL460 blade technology powered five different server render farms geographically dispersed across the U.S. and India. The film utilized over 60 million render hours and 117 terabytes of data."
Source

I'm interested in knowing exactly what this means. At first I thought one render hour would be one hour on an HP ProLiant BL460 blade server. However, other sources use the term CPU render hours, and these servers contain two CPUs, so does that mean every HP ProLiant BL460 blade server produces two render hours per hour?

Or is it actually per core?

60 million hours is a lot. Even if they have more than 4,000 computers in their farm, that would still be 625 days of rendering if a render hour were one hour on one computer. That seems unrealistic; can anyone help clarify this?
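As a back-of-the-envelope check, this is the sum I'm doing (a quick Python sketch; the 4,000-machine figure is just my guess at the farm size):

    # If one render hour were one hour on one machine (assumed numbers)
    render_hours = 60000000
    machines = 4000                      # guessed farm size
    days = render_hours / machines / 24  # hours per machine, then days
    print(round(days))                   # -> 625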
__________________
Martin Christensen
 
Old 11 November 2012   #2
Yeah, I think this merits a clarification.

Most of us non-Pros just count them in calendar time terms.
__________________
"Your most creative work is pre-production, once the film is in production, demands on time force you to produce rather than create."
My ArtStation
 
Old 11 November 2012   #3
Based on the math, it seems most likely to be referring to per core. A search online turns up an article by Red Hat about DreamWorks deploying it on their systems. I don't see a publication date, but it says DreamWorks has a render farm with 30,000+ cores.

Assuming they're running 4,000 servers with 2 processors at 4 cores per processor, that would be 32,000 cores. The 60 million hours figure would translate into around 80 days if you go by core hours.
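Here's that math as a quick Python sketch (the server and core counts are assumptions, not confirmed specs):

    # Core-hour reading of "60 million render hours" (assumed specs)
    render_hours = 60000000
    servers = 4000
    cores_per_server = 2 * 4                  # 2 CPUs x 4 cores each
    total_cores = servers * cores_per_server  # 32,000 cores
    days = render_hours / total_cores / 24    # ~78 days
    print(round(days))                        # -> 78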
 
Old 11 November 2012   #4
Translation: "Buy HP computers!!"

Seriously, all those big numbers come off as marketing. I remember hearing years back that it took about a year to render whichever film using not just the renderfarm, but also the artists' machines when they were logged out. Computers may be faster now, but the increased complexity means it probably takes around the same amount of time.
__________________

 
Old 11 November 2012   #5
Every place I know of these days uses core time and wallclock time for measuring jobs. I doubt DW differs; per-CPU doesn't make much sense any more, and per-blade even less.
I'd say it's fair to assume it refers to core time.
__________________
Come, Join the Cult http://www.cultofrig.com - Rigging from First Principles
 
Old 11 November 2012   #6
Thanks

Thanks guys.

I believe you are right. It makes much more sense with core time.

Actually, core time also doesn't really make sense unless you know which kind of CPU is involved.
I found a render farm online that uses the term GHz-h, and that seems like a more precise measure. My i7 CPU produces around 24 GHz-h per hour according to them.

You can find their calculator here: http://www.rebusfarm.net/en/lets-go/calculator

Knowing what DreamWorks uses, I could convert the render hours into GHz-h and have a measure that's comparable across systems.
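For what it's worth, the conversion I'm assuming they do is roughly cores x clock speed x hours; here's a sketch in Python (their actual benchmark may well differ):

    # Rough GHz-hour estimate: threads x clock speed (GHz) x hours.
    # This is my assumption of how the figure is derived, not Rebus' formula.
    def ghz_hours(threads, clock_ghz, hours):
        return threads * clock_ghz * hours

    # A quad-core i7 at ~3 GHz counted as 8 threads for one hour:
    print(ghz_hours(8, 3.0, 1))  # -> 24.0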
__________________
Martin Christensen
 
Old 11 November 2012   #7
Originally Posted by Darrolm: Based on the math, it seems most likely to be referring to per core. A search online turns up an article by Red Hat about DreamWorks deploying it on their systems. I don't see a publication date, but it says DreamWorks has a render farm with 30,000+ cores.

Assuming they're running 4,000 servers with 2 processors at 4 cores per processor, that would be 32,000 cores. The 60 million hours figure would translate into around 80 days if you go by core hours.


That actually makes sense. I wish the film people would make it clear like that.
__________________
I like to learn.
 
Old 11 November 2012   #8
Originally Posted by Sok: Thanks guys.

I believe you are right. It makes much more sense with core time.

Actually, core time also doesn't really make sense unless you know which kind of CPU is involved.
I found a render farm online that uses the term GHz-h, and that seems like a more precise measure. My i7 CPU produces around 24 GHz-h per hour according to them.

You can find their calculator here: http://www.rebusfarm.net/en/lets-go/calculator

Knowing what DreamWorks uses, I could convert the render hours into GHz-h and have a measure that's comparable across systems.


If you consider that their farm and workstations evolve over time, tracking actual GHz would be a pain since they'd have so many different CPUs. And even GHz would be misleading, since a modern Ivy Bridge gets more done per GHz than an old Kentsfield processor.
__________________
Quote: "Until you do what you believe in, how do you know whether you believe in it or not?" -Leo Tolstoy
Kai Pedersen
 
Old 11 November 2012   #9
Originally Posted by Sok: Thanks guys.

I believe you are right. It makes much more sense with core time.

Actually, core time also doesn't really make sense unless you know which kind of CPU is involved.
I found a render farm online that uses the term GHz-h, and that seems like a more precise measure. My i7 CPU produces around 24 GHz-h per hour according to them.

You can find their calculator here: http://www.rebusfarm.net/en/lets-go/calculator

Knowing what DreamWorks uses, I could convert the render hours into GHz-h and have a measure that's comparable across systems.

For the kind of jobs a large shop runs, core time is a lot more appropriate.
The farm doesn't just crunch numbers for renders; comps, caches, proxies, simulations, commits, DB compression, and a million other things run on the farm. Often they don't care much about the speed of a single core, or can't even be threaded efficiently, while other times they are hugely speed dependent.

Render farm usage needs a discrete unit and an overall run time; cores are that unit these days, and wallclock has always been the run time.

It also ties in with the core ceiling for jobs (some jobs can span threads, some can't, some benefit, some don't, and so on), and it's an efficient way to split jobs across administrative and departmental (computational) budgets as well, whereas assigning GHz to jobs would be nonsensical.

There are many well-considered reasons why we all pretty much fell in line with that standard.
__________________
Come, Join the Cult http://www.cultofrig.com - Rigging from First Principles
 
Old 11 November 2012   #10
Why don't they just always use wallclock time?

I mean, it's more transparent that way.
__________________
"Your most creative work is pre-production, once the film is in production, demands on time force you to produce rather than create."
My ArtStation
 
Old 11 November 2012   #11
Because one hour of using 120 cores on a job split between multiple point caches, texture processing, rendering, and precomping on a per-frame stream doesn't cost the same as one hour of a single core outputting a file.

Core time is transparent; wallclock time is extremely opaque and absolutely irrelevant for anything other than letting you know when a job will be ready for pick-up/hand-over.
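To put hypothetical numbers on that:

    # Same wallclock hour, very different farm cost (made-up job sizes)
    hours = 1
    big_job_cores = 120    # heavily parallel per-frame stream
    small_job_cores = 1    # single core writing out a file
    print(big_job_cores * hours)    # 120 core-hours
    print(small_job_cores * hours)  # 1 core-hour
    # Wallclock calls both "one hour"; core time shows a 120x difference.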
__________________
Come, Join the Cult http://www.cultofrig.com - Rigging from First Principles
 
Old 11 November 2012   #12
Hmmm.. I see...

Is there a standard counting/formula for this?

I would imagine each shop/farm ends up with a different way of counting render time on cores.

I suppose this is also the key means to plan for expanding a render farm or unit intelligently according to scale of work?
__________________
"Your most creative work is pre-production, once the film is in production, demands on time force you to produce rather than create."
My ArtStation
 
Old 11 November 2012   #13
The minutes a core spends on a job, between when the job is assigned and when it finishes and the core is made available for another job to use, add to the core time of that job. Too many "jobs" in that sentence...

On the rare occasion a job takes a whole CPU even though it can only use one or two cores on it (there are rare cases where that happens), all cores are usually counted towards that job's core time even if they aren't under load.

As far as I know, more or less everybody does it like this. Why would different shops do it differently? There aren't multiple standards for this; it's like saying one shop might use Martian hours instead of Earth hours for the wallclock counts ;p

Basically, when something excludes a core from being used by anything else, minutes for that core start adding up against that job.
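A minimal sketch of that accounting (made-up job records; in practice the scheduler tracks this for you):

    # Core time = cores reserved for a job x hours they were held,
    # whether or not every reserved core was actually under load.
    jobs = [
        {"name": "render",     "cores_reserved": 8, "hours_held": 2.5},
        {"name": "pointcache", "cores_reserved": 8, "hours_held": 0.75},  # only 1 core busy
    ]
    for job in jobs:
        print(job["name"], job["cores_reserved"] * job["hours_held"])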

Plans to scale the farm up are more complex than just looking at that, but for the brunt of the work, core time is of course the main metric for throughput. Concurrency, scalability, power consumption, real estate, networking, and other things come into play.
Sometimes just throwing more blades at the problem isn't ideal, or even possible.

Many shops also rent each other's nodes out when one is busy and the other is in a lull, so sometimes, if all you need is another 1,500 cores for a month and Big Studio ABC is on downtime, you can strike a deal and have those racks isolated and handed over for you to manage for a few weeks, so they don't have depreciating hardware sitting at a full loss between load spikes. This is more common in film VFX, though; places like Pixar or DW tend to have a consistent turnaround with multiple overlapping projects.

Some blades, or even entire racks, are often assigned, exclusively or inclusively, to certain jobs that have prerequisites. For example, you might keep old blades around for jobs that are light on number crunching but high on latency, or you might have a rack with OpenGL-capable cards on board (uncommon) for running OpenGL captures on the farm, and so on.

Farms are a more complex problem than most people seem to imagine once you get to a certain size and complexity.
__________________
Come, Join the Cult http://www.cultofrig.com - Rigging from First Principles
 
Old 11 November 2012   #14
Originally Posted by ThE_JacO: The minutes a core spends on a job, between when the job is assigned and when it finishes and the core is made available for another job to use, add to the core time of that job. Too many "jobs" in that sentence...

On the rare occasion a job takes a whole CPU even though it can only use one or two cores on it (there are rare cases where that happens), all cores are usually counted towards that job's core time even if they aren't under load.

As far as I know, more or less everybody does it like this. Why would different shops do it differently? There aren't multiple standards for this; it's like saying one shop might use Martian hours instead of Earth hours for the wallclock counts ;p




No, it's just that I thought different hardware manufacturers would have different CPU/core performance, so you'd need to benchmark them first, and you'd basically end up with a standard for whatever you use in your rendering unit.

I didn't mean a different way of counting time.

I meant that a core-minute may not be worth the same everywhere, because sites use different CPUs or cores; with Intel vs. AMD, for example, you wouldn't get the same jobs per minute. So you'd need an initial benchmark based on the hardware standard chosen, and I felt that would not be the same from studio to studio.

I understand the above makes no difference when counting rendering time after the fact. But I'm talking about using this data predictively, where you know exactly what your rendering unit can do so you can plan the scale of future work.
__________________
"Your most creative work is pre-production, once the film is in production, demands on time force you to produce rather than create."
My ArtStation
 
Old 11 November 2012   #15
Never mind the units; all you need to know is whether a given job's time on the farm goes up or down. Its rate of change is important. Identifying bottlenecks is important. Differences between procs and whatnot get taken care of by statistical averaging. Stats like the ones above are quite artificial anyway.
__________________
Homo Effectus
 