DeepMind’s AI can ‘imagine’ a world based on a single picture


Artificial intelligence can now put itself in someone else’s shoes. DeepMind has developed a neural network that taught itself to ‘imagine’ a scene from different viewpoints, based on just a single image. Given a 2D picture of a scene – say, a room with a brick wall, and a brightly coloured sphere and cube on the floor – the neural network can generate a 3D view from a different vantage point, rendering the opposite sides of the objects and altering where shadows fall to maintain the same light source.


Very promising research. Some of the examples almost achieve the quality of the old Wolfenstein 3D from just a few learning images.

The question is, how much computational power does this require?

If it requires petaflops of CPU/GPU power, it’s going to be a while before you can give your computer a few photos and have it generate a game level from them.

Still - some of the more interesting research I’ve seen in a while.


Not that I had any doubt, but this certainly confirms my long-held assumption that the future of CG won’t be explicitly defined geometries and brute-force simulation of light and surfaces, but rather will be perceptually based.


It is the network “training” that is computationally expensive. Once you have a working model, you could probably use it on a mobile phone. And, since this would probably be a SaaS, Google would use all the new inputs that you give it to further train the model and make it (even) more generalizable. But who knows. This is research and might be a one-trick pony for now.


Yeah, more like some kind of dreams. But eventually very precise.
So my predictions that it will create 3D models based on a concept design seem valid.
There’s nothing it can’t do, as it operates the same way our mind does. It’s just in its infancy.


Not sure what you mean by one-trick pony, unless it only works on that one image. No, this is a glimpse at the future. Single-image photogrammetry, changing viewpoints of a photo or painting, all of that will follow. I just wonder if they have it, or are close to having it, able to produce geometry from its interpolated shapes and spaces.