Why are grayscale images used for heightmaps?


This seems like a silly question, but what is it about raster grayscale images that makes them so common to generate a heightmap from?

Working from a large, like a really large, raster image has its own problems, and I’m wondering whether other ways to generate heightmaps exist.
For example, could a lattice of elevation points be generated, basically like a coordinate system, and then imported into a program?


One reason: they are incredibly easy to edit in any image manipulation app. If there is no actual need for overcomplication, keep it simple. Sure, there are other ways… like storing vertex data to represent height in 3D :)
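
For what it's worth, a regular lattice of elevation points and a grayscale image hold the same information, so converting between them is simple. Here is a minimal sketch, assuming the points arrive as (x, y, z) tuples on a complete regular grid. The function names and the sample data are made up for illustration.

```python
def lattice_to_grid(points):
    """Arrange (x, y, z) samples from a regular lattice into rows of z values."""
    xs = sorted({x for x, _, _ in points})
    ys = sorted({y for _, y, _ in points})
    col = {x: i for i, x in enumerate(xs)}
    row = {y: j for j, y in enumerate(ys)}
    grid = [[None] * len(xs) for _ in ys]
    for x, y, z in points:
        grid[row[y]][col[x]] = z
    return grid


def grid_to_gray16(grid):
    """Rescale elevations so the lowest is 0 and the highest is 65535."""
    flat = [z for r in grid for z in r]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0  # avoid dividing by zero on perfectly flat terrain
    return [[round((z - lo) / span * 65535) for z in r] for r in grid]


# Hypothetical sample: a 2 x 2 lattice of elevations in metres.
points = [(0, 0, 10.0), (1, 0, 12.5), (0, 1, 15.0), (1, 1, 20.0)]
grid = lattice_to_grid(points)
gray = grid_to_gray16(grid)
```

The resulting rows of integers are what a 16-bit grayscale image would store, one value per pixel.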


There is also vector displacement.


Depending on what 3D app and/or renderer you use, you can also use a vector image (created in Adobe Illustrator, for example) instead of a raster/bitmap as input.
In something like Illustrator you could blend between those elevation lines with a gradient and save the result to a vector file format like AI, EPS, SVG, … (no pixels).
But after some quick testing I can’t recommend this, as vector images seem to get processed as 8-bit images, which is too low a precision to use for displacement. I don’t know why, though; there should be no reason to limit vector graphics in such a way… Hopefully this gets addressed soon. (I tested this in 3ds Max; other software might handle them better.)
*Edit: after a quick Google search I found out that Adobe Illustrator only supports 8 bits per channel… which explains why I couldn’t get more accuracy out of it. I hope this will change in the near future.

This is not the same as ‘vector displacement’ mentioned by Mister3D.

Yes, terminology can get confusing sometimes… Just Google these for more info:
Vector displacement
Vector image/drawing
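
To put rough numbers on why 8 bits is too coarse for displacement, here is a small sketch. The 1000 m elevation range is a made-up figure, and `height_step` is just an illustrative helper.

```python
def height_step(elevation_range, bits):
    """Smallest height difference representable at the given bit depth."""
    levels = 2 ** bits
    return elevation_range / (levels - 1)


RANGE_M = 1000.0  # hypothetical terrain relief in metres

step_8 = height_step(RANGE_M, 8)    # 256 gray levels
step_16 = height_step(RANGE_M, 16)  # 65536 gray levels

print(f"8-bit step:  {step_8:.3f} m")
print(f"16-bit step: {step_16:.4f} m")
```

Under that assumption an 8-bit map can only move the surface in steps of about 3.9 m, which tends to show up as terracing, while a 16-bit map gets down to steps of roughly 1.5 cm.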


I think that the grayscale thing is actually a holdover from the old, pre-GPU days. It’s all about simplicity. Think about it.

If the pixel coordinates of the image map represent X & Y, then the color value at each pixel represents Z.
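
As a rough sketch of that mapping, assuming an 8-bit map given as rows of gray values (the function name and the `height_scale` parameter are made up for illustration):

```python
def heightmap_to_vertices(pixels, max_value=255, height_scale=1.0):
    """Turn rows of gray values into (x, y, z) vertices.

    The pixel column and row become X and Y; the gray value, normalised
    to 0..1 and multiplied by height_scale, becomes Z.
    """
    vertices = []
    for y, row in enumerate(pixels):
        for x, gray in enumerate(row):
            z = gray / max_value * height_scale
            vertices.append((x, y, z))
    return vertices


# Hypothetical 3 x 2 heightmap.
pixels = [
    [0, 128, 255],
    [64, 192, 32],
]
verts = heightmap_to_vertices(pixels, height_scale=10.0)
```

Black pixels end up at Z = 0 and white pixels at Z = `height_scale`, with everything else in between.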

Why grayscale, though? The logic is pretty straightforward.

  1. It doesn’t matter what channel you look up. R, G, & B will all contain the same Z value at that exact {X,Y} position. That’s less for the programmer to worry about.
  2. Compression. If R, G, & B all hold the same exact value then that’s less data to store. Knowing the R is the same as knowing the B & G.
  3. Cycling the Z values was also a sneaky way to morph the height and give the illusion of movement across the terrain. That was a trick that 64k-demo coders were fond of back in the day.
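
A small sketch of points 1 and 3, assuming an 8-bit gray image stored as RGB tuples. The helper names are made up, and the wrap-around shift is only one plausible reading of "cycling", not necessarily what those demos actually did.

```python
def to_single_channel(rgb_rows):
    """Keep only the red channel; in a gray image R == G == B anyway."""
    return [[r for (r, g, b) in row] for row in rgb_rows]


def cycle_heights(rows, offset, levels=256):
    """Shift every height by `offset`, wrapping around at `levels`."""
    return [[(v + offset) % levels for v in row] for row in rows]


# Hypothetical 2 x 2 gray image.
rgb = [
    [(10, 10, 10), (200, 200, 200)],
    [(255, 255, 255), (0, 0, 0)],
]
heights = to_single_channel(rgb)   # one value per pixel instead of three
frame = cycle_heights(heights, 60)  # heights for the next animation frame
```

Calling `cycle_heights` with a growing offset each frame makes the peaks and valleys appear to travel, without touching the image itself.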

From a technical perspective, there’s a method to the madness. Back in the days when storage was at a premium and you had to pray that the render PC had an FPU, tricks like this were a godsend.