To get vertex lighting/shadowing at the same “resolution” as lightmapping, the meshes would have to be tessellated to quite a high density, hence the use of lightmapping, which is only as slow as applying a second texture to a surface.
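To put rough numbers on that (the wall size and sample spacing below are made-up figures, purely to show the scale of the difference), here's a quick back-of-the-envelope comparison:

```cpp
// Back-of-the-envelope sketch: vertices needed for per-vertex lighting detail
// vs. a lightmap at the same detail. All the numbers are assumptions.
#include <cstdio>

int main()
{
    const float wallWidth   = 10.0f;  // metres (assumed)
    const float wallHeight  = 5.0f;   // metres (assumed)
    const float lightDetail = 0.1f;   // desired lighting sample spacing, metres

    // Per-vertex lighting: the mesh itself must be tessellated at that spacing.
    const int vertsX = static_cast<int>(wallWidth  / lightDetail) + 1;
    const int vertsY = static_cast<int>(wallHeight / lightDetail) + 1;
    std::printf("per-vertex: %d vertices to transform and light\n",
                vertsX * vertsY);

    // Lightmapping: the wall stays a quad (4 vertices); the detail lives in a
    // texture that the rasteriser samples anyway while texturing.
    const int texelsX = static_cast<int>(wallWidth  / lightDetail);
    const int texelsY = static_cast<int>(wallHeight / lightDetail);
    std::printf("lightmapped: 4 vertices + %d x %d lightmap texels\n",
                texelsX, texelsY);
    return 0;
}
```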
It’s less of an issue now, what with graphics cards having hardware transform and lighting, but then the emphasis moves to bus bandwidth. You still need to shove all those vertices to the graphics chip, and you probably wouldn’t store them on the card because that would eat into the space needed for framebuffers/textures. Damn, another problem.
Another advantage of lightmaps is that you can use a global illumination solution (most probably radiosity or photon mapping) to get “nicer” lighting: stuff like color bleeding between surfaces. Vertex lighting is only a local solution, so the resulting color is only a function of the vertex and the light, and doesn’t take into account light bounced off other surfaces.
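To make "local" concrete, here's a minimal sketch of per-vertex Lambert diffuse lighting (the Vec3/Light structs and names are invented for the example). Note the result depends only on this vertex and this light; there's no term anywhere for light arriving from other surfaces, which is exactly what the radiosity lightmap adds:

```cpp
// Minimal sketch of *local* per-vertex diffuse lighting. Assumed types/names.
#include <algorithm>
#include <cmath>

struct Vec3 { float x, y, z; };

static Vec3  sub(Vec3 a, Vec3 b)    { return { a.x - b.x, a.y - b.y, a.z - b.z }; }
static float dot(Vec3 a, Vec3 b)    { return a.x * b.x + a.y * b.y + a.z * b.z; }
static Vec3  normalize(Vec3 v)      { float l = std::sqrt(dot(v, v)); return { v.x / l, v.y / l, v.z / l }; }
static Vec3  scale(Vec3 v, float s) { return { v.x * s, v.y * s, v.z * s }; }

struct Light { Vec3 position; Vec3 color; };

// Diffuse contribution at one vertex from one light -- purely local.
Vec3 shadeVertex(Vec3 position, Vec3 normal, const Light& light)
{
    Vec3  toLight = normalize(sub(light.position, position));
    float ndotl   = std::max(0.0f, dot(normal, toLight));
    return scale(light.color, ndotl);
}
```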
The whole thing is shifting again now though: using stencil shadow volumes and dot3 lighting gives reasonably good results, although it’s back to being a local lighting solution, albeit per-pixel rather than per-vertex.
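For illustration only (real implementations do this in the texture combiners / pixel pipeline, and the helper names here are made up), this is roughly the maths the dot3 term is doing, written out on the CPU:

```cpp
// CPU-side sketch of dot3 (per-pixel) lighting: each normal-map texel stores a
// tangent-space normal, and N.L is evaluated per pixel instead of per vertex.
// The byte decode assumes the usual 0..255 -> -1..1 encoding.
#include <algorithm>
#include <cstdint>

struct Vec3 { float x, y, z; };

static float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Expand an RGB normal-map texel back into a tangent-space normal.
Vec3 decodeNormal(uint8_t r, uint8_t g, uint8_t b)
{
    return { r / 127.5f - 1.0f, g / 127.5f - 1.0f, b / 127.5f - 1.0f };
}

// The per-pixel "dot3" term: still local (one light, no bounced light),
// but evaluated at texel resolution instead of vertex resolution.
float dot3Lighting(Vec3 tangentSpaceNormal, Vec3 tangentSpaceLightDir)
{
    return std::max(0.0f, dot(tangentSpaceNormal, tangentSpaceLightDir));
}
```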
To go back to your original question, are you thinking of the lightmap calculation here, i.e. the radiosity calculation? If so, then yes, you’re correct. Light bounces around between all the surfaces until only a negligible amount of energy is leaving them. After that the energy at each surface is converted into an illumination level and stored in the lightmap texture. When rendering the surface (at run-time) all you need to do is multiply the lightmap by the albedo/color map and you get the lit/shadowed surface.
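If it helps, here's a very rough sketch of that bounce loop, in the style of progressive (shooting) radiosity. Everything in it is invented for illustration (the Patch struct, the precomputed formFactor table, the threshold), so treat it as pseudocode that happens to compile:

```cpp
// Rough sketch of an offline radiosity bounce loop, assuming the scene has
// already been diced into patches with precomputed form factors. Assumed names.
#include <cstddef>
#include <vector>

struct Patch
{
    float emitted;     // light the patch emits itself (lights have this > 0)
    float reflectance; // fraction of incoming light bounced back out (0..1)
    float unshot;      // energy not yet distributed to the rest of the scene
    float total;       // accumulated illumination -> baked into the lightmap
};

void bounceLight(std::vector<Patch>& patches,
                 const std::vector<std::vector<float>>& formFactor, // [from][to]
                 float threshold)
{
    // Start with each patch's own emission as unshot energy.
    for (Patch& p : patches) { p.unshot = p.emitted; p.total = p.emitted; }

    bool anyLeft = true;
    while (anyLeft)                   // keep bouncing until it's negligible
    {
        anyLeft = false;
        for (std::size_t i = 0; i < patches.size(); ++i)
        {
            if (patches[i].unshot <= threshold) continue;
            anyLeft = true;

            const float shoot = patches[i].unshot;
            patches[i].unshot = 0.0f;

            // Distribute this patch's unshot energy to the other patches.
            for (std::size_t j = 0; j < patches.size(); ++j)
            {
                float received = shoot * formFactor[i][j] * patches[j].reflectance;
                patches[j].unshot += received;
                patches[j].total  += received;
            }
        }
    }
    // patches[k].total then gets written into the lightmap texels covering patch k.
}
```

Once the loop settles, each patch's accumulated total is what ends up in the lightmap, and the run-time cost really is just that lightmap*albedo multiply.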
phew, big post. HTH.