Explicit Code: Calculating The Scale Under Perspective Projection

In the recent past, I did a lot of 3D graphics programming again. On the one hand for my game project Galactic Fall and on the other hand for the modeling software that goes along with it. One problem that I encountered was, how to calculate the effective amount of scaling for distant objects. In 3D applications, the world is rendered to the screen using a perspective projection, given as a transformation matrix. Under such a perspective projection, objects in the distance get scaled down, so that an illusion of depth is created on screen.

What is the factor of scale for an object in the distance?

As I started thinking about this, one of the first things that became obvious was this: It's not really 3D objects that get scaled. Imagine a very large object, for instance an oil tanker, with your eyes (and the rest of you) placed at the bridge, looking down over the ship towards the bow. Clearly, it is not the whole tanker, that gets scaled down. Rather, parts of the tanker that are farther away from the eye get scaled down more than those parts near to the eye. So, the scaling factor is not a discreet property of the object but a continuous scalar property of space itself and can be calculated for every distinct point in space.

It took me quite some time to appreciate this! The scaling factor is a property of any point in space, yet scaling points per se doesn't make much sense, because they have no extent which could be scaled. The only thing that can appear smaller is something that has spacial extent, for example a line or a triangle, or a cube. However, in general, each point of an object has a slightly different distance from the eye and thus a slightly different scaling factor. This means, that, generally speaking, no single scaling factor can be given to represent the scale of an object under a perspective projection.

What is the factor of scale for any point in space?

Nevertheless, I can determine the specific scaling factor for every single point in space. Trying to deduce the correct formula, I will now turn to a certain type of objects for my investigations. I need objects that have a spacial extent but a constant factor of scaling across all points of the object. Obviously, that means that all points have to have the same distance from the eye position and therefore have to lie on the surface of a sphere with the eye at its centre.

Target Function

Firstly, to simplify things a bit, I'll assume that all coordinates are given in eye space. That is, the eye/camera/viewer sits firmly at coordinates (0, 0, 0) and all other points in space are given relative to that origin. The target function looks something like this

$Scale (P) = ...$

returning a single floating point value, which expresses the apparent scaling at the point P.

Reference scale

Suppose, this function returns 0.2. So we know, an object at point P appears 5 times smaller than ... than what? What is the reference? What is the "normal" size of the object, when the scale is 1? Or, asked rather more purposefully: What is the distance, of a point P from the eye, so that

Scale (P)

equals 1?

The good thing is, that it doesn't matter - it is purely a matter of definition. However, as with all definitions, you should choose your definition to be meaningful in the context that you are going to use it in.

Easy answer

Well, if we can define it any way we like, why not use 1. And the beautiful answer is:

$Scale (P) = \frac{1}{|P|}$

Or in words: the scale is inversely proportional to the distance. Anything at a distance of 7 would thus appear to be a seventh of the size that it appears to have at distance 1.

Understanding the answer

To understand, why this is the case, I will analyse the following picture. It is a further simplification of the problem, by moving to the two-dimensional space, but it can be transformed to three dimensions with relative ease. It's just a lot harder to draw.

In three dimensional space, I would use the geodesic segment, which is a straight line projected onto a the surface of a sphere. In two dimensional space, this is called an arc and is a segment of a circle.

E is, of course the eye position, the origin of the coordinate system.
O is an object at a near distance.
P is the same object at a far distance.
A is the vector from the E to the object O.
B is the vector from the E to the object P.

Preliminaries

Firstly, what, in this picture, represents the scaled down apparent size of the object P? It is the angle between the grey lines that limit the objects O and P. This angle is called the visual angle of the objects O and P respectively, and I will use θ for the visual angle of O and φ for the visual angle of P.

Using the distance to an object and its visual angle, it is possible to calculate the physical size of the object. In this case, it is the length of the arc which is defined as the product of those two values. (see Wikipedia: Arc for reference)

$\begin{array}{rclr} Size (O) & = & |A| \times θ & (1) \\ Size (P) & = & |B| \times φ & (2) \end{array}$

Now, I know, that the physical size of P is the same as the physical size of O, because the former is just a translated version of the latter. This gives me:

$\begin{array}{lrclr} Size (O) & = & Size (P) \\ \Leftrightarrow & |A| \times θ & = & |B| \times φ & using (1) and (2) \\ \Leftrightarrow & \frac{|A| \times θ}{|B|} & = & φ & (3) \end{array}$

Solution

So, what is the target function? In this case, I want to know the factor of scale for the point that is indicated by the vector B - so that will be the input. The output is the scale, and it is expressed by the relation between the two angles θ and φ and I know that the result should be smaller than 1, so the larger angle (θ) goes to the divisor.

$\begin{matrix} Scale (B) & = & \frac{φ}{θ} \\ \Leftrightarrow & Scale (B) & = & \frac{\frac{|A| \times ϕ}{|B|}}{ϕ} \\ \Leftrightarrow & Scale (B) & = & \frac{|A|}{|B|} \end{matrix}$

This is a good function to calculate the scale as it is independent from the actual object, that is meant to be scaled. It only relates the two distances to each other and is actually more powerful than what I limited myself to above. The magnitude of A in the dividend is the reference distance which may be 1, as assumed above, or may be any other distance that seems opportune or serves some specific purpose.

Explicit Code

Sunday, 1 February 2015

Calculating The Scale Under Perspective Projection