future of rendering - GPU rendering vs ARM CPU


We’ve all seen how many resources GPU makers are putting into GPU development recently, specifically for AI and crypto mining. It’s becoming apparent that they now want to sell specialty rackmount gear with specialty chips and push traditional GPUs to the background. Nvidia stated during the Q&A of a recent announcement that they aren’t releasing a new gaming GPU for a “long while”. Obviously AMD isn’t giving them enough competition in the gaming space.

On the CPU side, however, things are now heating up between Intel and AMD… and ARM. ARM CPUs are now competing in the server market as a lower-cost, lower-power-consumption alternative to x86 CPUs, with the potential for better scaling in the future through higher core counts. Nvidia and AMD are fragmenting their chip architectures into specialty processors for AI and crypto mining and putting gaming cards on the back burner. Microsoft has made a version of Windows that runs on ARM CPUs (via x86 emulation), and supercomputers and big-data companies like Google are starting to utilize ARM CPUs. Even Apple is now investing in ARM-based architecture for its upcoming laptops. We’ve seen this happen before, when x86 started to overtake SGI’s MIPS and DEC’s Alpha CPUs in the mainstream CG market.

It struck me the other day when I configured our render farm for network rendering with Blackmagic Design Fusion and mass photogrammetry processing, and when a team member asked whether they could use the farm for some machine-learning tasks tapping into Google’s text-to-speech library. Being CPU-based, our farm supports just about anything we’d ever throw at it. If our hardware were GPU-based, however, we would mostly be limited to using it for a small handful of CG render engines.

I started thinking about how popular the Raspberry Pi has become as a cheap and easy platform for writing code and testing mass-scale processing. Meanwhile, writing GPU code forces developers to rewrite entire code bases. If it were easy, everything would be using the GPU, right? ARM already runs native Linux distros that support it, and porting Linux software to ARM would likely be much simpler than working within CUDA’s or OpenCL’s restrictions.
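To make the portability point concrete, here is a minimal sketch (the function and values are made up for illustration) of a trivial render-farm-style task written as plain CPU code. It runs unchanged on x86, ARM, or a Raspberry Pi, while a GPU version would mean a rewrite:

```python
# A trivial image-processing step (gamma correction) as portable CPU code.
# This exact code runs on x86, ARM, or a Raspberry Pi - any node with a
# Python interpreter - with no porting work at all.
def gamma_correct(pixels, gamma=2.2):
    """Apply gamma correction to normalized pixel values (0.0-1.0)."""
    inv = 1.0 / gamma
    return [p ** inv for p in pixels]

# The GPU equivalent of this one-liner would require a kernel, explicit
# host-to-device memory copies, a launch configuration, and a vendor
# toolchain (CUDA or OpenCL) - none of which the portable version needs.
print(gamma_correct([0.0, 0.25, 1.0]))  # 0.0 and 1.0 map to themselves
```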

I know that for years most CG artists have been saying GPUs are the future, and we have GPU render engines, but that has always been contingent on GPU makers pushing the technology forward and keeping it affordable. If GPUs start regularly having a 1.5-2 year life cycle like the GTX 1080 is having, can we really still predict that GPUs are definitely the future of CG rendering? Or will they remain mainly an augmentation for rendering and the basis for a few specialty render engines?


Sentry, GPU rendering is going to undergo major changes because of two factors:

  1. Current GPU rendering engines are probably not yet fully optimized for using 1000+ GPU cores efficiently. Expect them to get faster in the future due to better algorithms, but also possibly due to better data flow between individual GPU cores, better programming interfaces, and larger, faster GPU VRAM architectures.

  2. GPU-accelerated realtime raytracing and pathtracing has, for the first time, been officially announced and demonstrated with videos. Nvidia, Microsoft, AMD, possibly Intel, and others are already on the bandwagon.

Regarding 2), there are companies like Imagination - now mostly Chinese-owned - that already had powerful, fully working realtime raytracing GPUs three years ago:


But at the time, nobody paid attention. Now that everybody is paying attention, Imagination may take its mobile raytracing GPU and create a beefed up desktop model.

[b]Basically, the CPU side of things is fairly predictable - expect the same old slow-ish progress - and the GPU side is not.[/b]

Nobody is expecting huge differences in CPUs, but GPUs are entering the realtime raytracing and pathtracing era within the next two years.


Realtime raytracing is huge, but it is still in its infancy - only doing shadows, AO, and reflections right now. It is also very taxing at the moment, as all the press demos have run on some seriously expensive gear. GPUs have been doing raytracing for the last decade - just not quickly.

Latency is an issue for real-time and AI work.

You will have massive latency with a render farm of GPUs bound by network limits if real-time is the goal. That’s why you’re seeing Nvidia wanting to sell expensive complete rackmount solutions using low-latency fabric and specialty CPUs to tie all their GPUs together. How many of us will purchase one of these $50k beasts just to play in the game? The tech will trickle down to gaming/realtime, but the priority is now on AI, because the companies working on it have big money to invest.
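A rough back-of-envelope sketch of the latency argument (the round-trip figures below are assumptions for illustration, not measurements):

```python
# At 60 fps each frame has a fixed time budget; every network round trip
# to a remote GPU eats into it.
def frame_budget_ms(fps):
    return 1000.0 / fps

budget = frame_budget_ms(60)   # ~16.7 ms per frame

ethernet_rtt_ms = 0.5     # assumed commodity-Ethernet round trip
fabric_rtt_ms = 0.005     # assumed low-latency-fabric round trip
trips_per_frame = 10      # assumed round trips needed per frame

# With commodity networking, the round trips alone consume a large slice
# of the frame budget; over a low-latency fabric they are negligible.
print(f"budget {budget:.1f} ms | "
      f"ethernet {trips_per_frame * ethernet_rtt_ms:.1f} ms | "
      f"fabric {trips_per_frame * fabric_rtt_ms:.2f} ms")
```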

GPUs have done well so far, but the market is shifting as AI gains traction. It’s a leap of faith either way, IMO. CUDA has not taken over everything for processing; as well as it works, it’s not a good investment of time for many developers - and there’s also the vendor lock-in issue. The main reason GPUs have been so great is how cheaply you can add multiple ones to a computer. That’s all. If another architecture comes along that can offer similar cost ratios, but with a much lower barrier to entry on the development side and no vendor lock-in, it’s no contest.

Specialty software will always exist, but that still has no real bearing on what the majority of the industry does.


Imagination was able to realtime raytrace games back in 2016 with a low-power mobile GPU for smartphones, of all things. It’s not even a desktop GPU, and it made it to market years ago.

So Nvidia can pretty much forget about putting 50,000 USD specialty boxes on the market - Imagination can probably do the exact same thing by sticking eight mobile GPUs on a 3,000 USD PCI card.

Intel is also doing realtime raytracing on FPGAs (Field Programmable Gate Arrays) - they bought FPGA maker Altera for 16.7 billion USD a short while ago. The resulting FPGA cards may trounce Nvidia GPUs in speed.

AMD has announced that Radeon ProRender will get realtime raytracing soon. AMD is in the game as well.

So Nvidia in no way has a monopoly on realtime raytracing - other experienced companies can do it too, and their solutions probably will not cost 50K to buy.

And then there is the possibility of superfast ASICs being created for realtime raytracing. You won’t need 4-8 GPUs at all then - just 1 or 2 PCI boards with fast ASICs on them.

Imagination already did that with Caustic Visualizer for Maya and Max. It didn’t prove popular, but they incorporated the know-how into their mobile GPUs.


Yeah, realtime raytracing sounds great, though most of what Nvidia has been showing on screen is still rasterized. I’m mainly talking about offline rendering, since it still seems premature to think Unreal/Unity will handle all the modules and capabilities built into Maya/Max/C4D/etc. well enough to be useful to 3D animators. Many of us still need to produce a level of detail that GPUs can’t handle yet. I think we’re all happy to see raytracing officially embraced by GPU makers, but we can’t expect sudden miracles either. Pixar will still continue to render their movies with RenderMan, though maybe some lower-budget broadcast TV shows will utilize realtime rendering.

I’m keeping an eye on where things go with FPGAs too; they seem like a much better choice than GPUs in the long run since they aren’t so restricted.

Things like FPGAs that compete with or outperform GPUs seem like they might be a problem for developers and users who’ve put all their resources and money into CUDA hardware. That’s the only point I wanted to bring up: is it smart to jump on the GPU rendering bandwagon at this point, or do GPUs look more like a potential dead end in the long run?


These statements are a blatant contradiction. First you state that GPU raytracing has “for the first time been officially announced”, then you state that another company announced it three years ago.

Which is also false, because it’s been “announced” for well over twenty years now. It’s not new; they just re-hyped it recently in the news and media, so it seems like you bought their marketing. It’s still far, far too slow to handle what we do on our CPUs, and nowhere near realtime for anything but light gaming-type tasks. There were papers on the topic back in the 1990s. Here’s one from 2005, even: https://www-s.ks.uiuc.edu/Research/vmd/projects/ece498/raytracing/RTonGPU.pdf

AMD tossed it into ProRender as well this year, but that doesn’t mean it’s new. Yes, GPUs are starting to catch up to the level we use our CPUs at, but they’re still nowhere near realtime (60 fps or better). Project Pica (Nvidia’s new baby) shows off the most basic shaders running some raytracing for reflections and such, but it’s still nothing saleable from a rendering standpoint.

All that said, I don’t have much faith in ARM overtaking the rendering pipelines, simply because ARM chips are too slow. ARM is years behind Intel and AMD in terms of throughput and IPC, not to mention clock speeds - ten years, perhaps. Using them in laptops is fine for battery life, but they’re toys compared to the big dogs. AMD just finally caught back up to Intel, and I don’t have any faith in ARM designs overtaking Zen. I’d be far more interested in seeing Intel or AMD beat ARM at the portable/cell-phone game. How does a single i7 core or Zen core fare against an ARM chip at the same power?


Yeah, with the recently announced 32-core Threadripper and 64-core Epyc, AMD is really pushing the envelope right now.

Someone at a press conference asked an Intel speaker: with GPUs pushing for larger chips, higher wattages, and extreme cooling, why hasn’t Intel tried pushing for the same thing with their CPU designs? Her answer was… she didn’t know. It’s as if everyone assumed 135-watt CPUs were the max - and maybe for certain IT specifications they were, for things like how many watts a single blade could draw and still stay within the power enclosure’s limits - but it turns out even that could be re-engineered.


I just saw the new Threadrippers over on Techspot! And just one day after Intel announced its 28-core CPU, too. Good for AMD - finally competitive again, and doing great! I’m looking forward to the 20-core version myself.

But on the other side of the PCI bus, there isn’t much reason to upgrade from the GTX 980 to anything newer. GPUs are still way too expensive for any real return on investment. I only recently switched from mental ray to V-Ray, and while RT is pretty cool, it’s not useful beyond previz, and I still have to render high-res for print. While I’d love to see the GPU more involved in rendering tech, there isn’t much hope for anything approaching real-time at that level of detail and resolution.


You don’t get it, do you? There is something in electronics engineering called an ASIC - Application-Specific Integrated Circuit.

It’s a fancy way of saying “custom chip” - remember how the Amiga was a bog-ordinary CPU paired with very clever custom chips for graphics (sprites, scrolling), sound, MIDI music, et cetera?

If you implement a raytracing or pathtracing algorithm, or the really expensive mathematical parts of it, as a fast ASIC - a chip designed just for rendering - that will be many, many times faster than any GPU you can currently buy from Nvidia and AMD.

That is what Imagination seems to have done - paired a programmable GPU with some ASIC circuitry that hardware-raytraces really, really quickly.

The second best thing to an ASIC - still faster than a GPU - is to do the same thing with an FPGA chip (Field Programmable Gate Array). Intel/Altera had working FPGA realtime raytracing demos 2 years ago at trade shows.

Imagination did it first years ago - with Caustic Visualizer and their mobile raytracing GPUs - but they were such a bit-player in the market that nobody paid attention to their tech. Certainly not on this forum and forums like it.

But Microsoft/Nvidia officially announcing realtime raytracing support in DirectX 12 - even if years later - actually means that realtime raytracing is coming very, very soon and is going to be a very mainstream gaming technology.

You may now say - meh, it will take years before a 500-dollar gaming GPU can raytrace in realtime.

That is wrong, because there is ASIC and FPGA technology as well - and in the hands of giants like Intel, to boot. Intel paid 16.7 billion dollars for FPGA maker Altera, and they are desperately looking for good applications for FPGA chips.

To get to the point, realtime raytracing is going to be a race between GPUs, ASICs, FPGAs that can all hardware realtime raytrace at different speeds and at different price points.

If Nvidia and AMD, for example, fail to provide the necessary GPUs before 2020, that leaves the market wide open for other GPU, ASIC, and FPGA makers.

Nvidia and AMD/ATI may be the kings of raster graphics, but they are NOT the kings of realtime raytracing. There are other companies that can do that too.

As for years being required before the GPUs get up to speed - blah blah blah. Typical Nvidia lies, in my opinion.

Watch how quickly Nvidia churns out 4K @ 60 FPS capable realtime raytracing GPUs for VR/AR use when Imagination, Intel/Altera and other competent hardware manufacturers join the fray.

The only reason you have not seen many raytracing ASICs, FPGAs and so forth to date, is that offline 3D rendering is an insignificantly small niche market for these electronics manufacturers.

And one that is very nice for selling lots of expensive multi-core CPUs to.

But if 500 to 800 Million gamers are the target market for realtime raytracing, then yes, all of a sudden, the demand is there to actually develop GPUs, ASICs, FPGAs for that market.

It is not that it is difficult or impossible to speed up raytracing and even pathtracing. The electronics technology for that is already very, very feasible to produce today.

It is that these large companies are going to invest Billions in doing this - and they simply won’t risk that unless they expect even more Billions to come back to them as profit.

So it is going to be games, VR and AR that drive this new technology.

People doing CG rendering - a tiny little niche market by comparison - are lucky enough in that the same technology will also be able to speed up offline rendering hugely.

You may not get your pathtraced Archviz running at 60 FPS realtime on this hardware, but you may very feasibly be able to get a high quality frame rendered in 5 seconds, rather than 8 minutes.
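The arithmetic behind that claim is simple enough to check:

```python
# 8 minutes per frame down to 5 seconds per frame.
frame_before_s = 8 * 60            # 480 seconds
frame_after_s = 5
speedup = frame_before_s / frame_after_s
print(f"{speedup:.0f}x faster")    # prints "96x faster"
```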

Also, the tech will not just show up in UnrealEngine or CryEngine and Unity.

It will come to the render engines that you have in Max, Maya, C4D, Modo and so forth.

As soon as render engine developers have something actual to work with - the hardware, the programming interfaces - the technology will show up in offline render engines.


So it sounds like FPGAs and other specialty chips are the future then, and not so much GPUs. People with CPU farms will be able to pack a bunch of FPGAs into their machines and not miss a beat, while people running GPU-based software will have to remove GPUs to free up PCIe slots for FPGAs.


Nope. The future is already here with the newly released DirectX Raytracing API.


I agree. I think it is very hard to build raytracing hardware more cost-efficient than current GPUs. Of course, if Nvidia, AMD, and Intel poured billions into R&D, they could perhaps come up with a highly specialized raytracing chip that outperforms GPUs, but at what cost? It is one thing to recoup R&D costs if you sell millions of units, but what if your market is only 1-10,000 units? As it stands, my guess is that any specialized raytracing hardware will always have a cost multiplier bigger than its speed multiplier compared to GPUs. But… give it 10-20 years, when the price of making custom silicon has perhaps gone down and there is perhaps more competition in the market. Don’t hold your breath for consumer-priced, holodeck-quality raytracing hardware any time soon, though.
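The cost-multiplier-vs-speed-multiplier point can be sketched numerically (all figures below are made-up assumptions, not real prices or benchmarks):

```python
# Relative price/performance of a device versus a baseline GPU.
def perf_per_dollar(speed_multiplier, cost_multiplier):
    return speed_multiplier / cost_multiplier

gpu = perf_per_dollar(1.0, 1.0)      # baseline consumer GPU
asic = perf_per_dollar(10.0, 25.0)   # assumed: 10x faster, 25x the cost

# If the cost multiplier exceeds the speed multiplier, the specialized
# chip loses on price/performance even while winning on raw speed.
print(gpu, asic, asic < gpu)   # prints "1.0 0.4 True"
```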


I think the biggest technological cliff - or supercharger - we are headed towards, the thing that, once we pass a certain threshold, will make everything many times more advanced, is not CPUs or GPUs. It is wireless transmission. Wireless transmission is the true bottleneck behind anyone’s wildly futuristic computing fantasies.

The reason I say this is: imagine once we get to the point where a cheap, tiny wireless chip can connect to and simultaneously stream to thousands of other devices at full speed. We will get to 10 Gb Ethernet speeds wirelessly at some point, and who is to say that PCIe transmission speeds could not be achieved wirelessly eventually? Wireless transmission technology is compounding in its capability just the same as CPU and GPU speeds.

Once wireless technology gets to a certain point, something will happen: instead of each CPU and GPU being constrained to its machine, all CPUs and GPUs will be able to collaborate on calculations. All computing devices will essentially become one massive CPU - or rather, the earth itself will become a DPU, a distributed/decentralized processing unit.

You will want to render an image, and it won’t go to a densely packed datacenter somewhere; instead, that task will be split and dispersed to 10,000 random mobile phones, game consoles, VR headsets, and whatever else has a chip within range, or within range of a relay. All devices will join a global decentralized compute network, and all computation globally will be spread across all devices.

So, in this light, are GPUs - which keep getting bigger and more power-hungry - the future? I don’t think so. The future is low-power mobile chips that can be put in anything. Eventually all computation will be done on these.

How long will that be, though?
3G = 10 Mb/s
4G = 60 Mb/s
5G = 3,000 Mb/s

If that rate of improvement keeps up, maybe a decade?
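Taking the generation-over-generation figures above at face value (they are the numbers quoted in the post, not verified specs), the growth factors work out as:

```python
rates_mbps = {"3G": 10, "4G": 60, "5G": 3000}

factor_3g_to_4g = rates_mbps["4G"] / rates_mbps["3G"]   # 6x
factor_4g_to_5g = rates_mbps["5G"] / rates_mbps["4G"]   # 50x
print(factor_3g_to_4g, factor_4g_to_5g)

# If a hypothetical next generation repeated even the smaller 6x jump,
# it would land at 18,000 Mb/s - though latency, not raw throughput,
# remains the harder problem for distributed compute.
print(rates_mbps["5G"] * 6)   # prints 18000
```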


What about data security, IP, and privacy in general? This assumes that, in the future, everyone will have unbreakable end-to-end encryption that can even hide all metadata, such as geolocation and timestamp info.

Who is going to develop that technology at a mass-distributable price point?


As for data and security: the data structures and algorithms to do this haven’t been completely invented yet, but lots of people are circling it. There are dozens, maybe even hundreds, of startups working on “decentralized compute clusters” - I could explain some of it, but that search term will turn up tons of information. A handful of companies already have such technology working and are trying to compete with AWS.


In fact, I think it’s more about software than hardware. Specialized chips need specialized compiler support and specialized people writing specialized code. The DirectX RT APIs and the upcoming Vulkan equivalents take a unified approach: your code can run on any supported graphics card, which also means there’s already a big community effort converging on common practices for implementing things you’d be on your own with on custom chips. That is cost-efficient too.