WHY DIVIDE BY Z? Unraveling the geometry behind perspective projection.
by Toshi Horie
The first step toward 3D graphics in QB is to find out how to convert
3D points to screen coordinates. Graphics people call this process
"perspective projection." In online tutorials, we see formulas like
xs = x/z, ys = y/z without explanation. Why divide by z? Is it just an
approximation? Or is there really a geometric reason behind it?
The first thing to do to figure out the answers to these questions
is to draw a nice diagram. Imagine yourself looking down from the ceiling
at your monitor and where you usually sit. Here is my little ASCII diagram
to help you.
The 3D object, say a baseball, is at point P, and it is displayed
on the screen of the monitor at point S. The eye is at E, and the center of
the screen is at point C. All points are defined so that the top left corner
of the screen is the origin (0,0,0), and +y is down and +x is to the right
and +z is into the monitor (up in the following diagram). The units
for position are in SCREEN 13 pixels, since that's the screen mode the
sample code will be working in.
' [top-down view of screen, sliced at y=100]
'
            (160,100,zp)
                 Q+--------- * P(xp,100,zp)
  behind screen   |         /  (a point in 3D - assume y is 100 for now)
                  |        /
0..  (160,100,zs) |       /  (320,100,zs)
|=================C======S===============|  <-- screen
                  |     /  (xs,100,zs)
   ^+z            |    /   where the pixel is lit
   |              |   /
   +-->+x         |  /
                  | /
                 E|/
         eye(160,100,zeye)
In this figure,
* the eye is at E (160,100,zeye).
* the center of the screen 13 is at C (160,100,zs).
* the point in 3D is P (xp,100,zp).
* the point on the screen where you would plot the pixel
corresponding to the point in 3D is S at (xs,100,zs).
Now you have to notice that we have two similar triangles:
- Triangle ECS and Triangle EQP are similar. (They share the angle
at E, and each has a right angle: at C and at Q.)
(In case you don't know what similar triangles are, they are
triangles with the same shape but of different sizes**. They
have the property that their corresponding sides are proportional,
meaning they are magnified by the same amount, and thus the ratio
between the corresponding sides is the same.)
** There's a special case when similar triangles have the same
size as well, but those are usually called "congruent triangles."
This means that the ratio of the corresponding sides
of the triangle is the same for ECS and EQP! Which means:
         EC       CS
        ---- = ------    .... Eq. 1
         EQ       QP
* Notice, that
- EC is just the distance from your eye to the screen in
pixels, so it would be around 640 pixels in screen 13.
(How did I get 640? Well, my monitor has a length of
11 inches. And my eye is approximately 22 inches from
the screen [yes, I measured it], which means that my eye
is twice as far from the screen as the length of it.
Since the 11 inches length is covered by 320 pixels, 22
inches should be covered by twice the pixels, or 640 pixels.
If you are closer or farther from the screen, you have
to change this length accordingly.)
- EQ is how far behind the screen the 3D point is, plus EC,
so it is (zp-zs)+640.
- CS is (xs-160), in screen 13 pixels.
- QP is (xp-160), again in screen 13 pixels.
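If your monitor or seating distance is different, the 640 can be recomputed with the same ratio. Here is a minimal sketch using the measurements quoted above (an 11-inch-wide picture viewed from 22 inches); the variable names are made up for this example.

```basic
' Convert the eye-to-screen distance from inches to SCREEN 13 pixels.
' All variable names here are hypothetical, chosen for this example.
screenWidthInches! = 11    ' physical width of the picture
eyeDistInches! = 22        ' measured eye-to-screen distance
screenWidthPixels! = 320   ' SCREEN 13 is 320 pixels wide

' Same ratio as in the text: pixels = inches * (pixels per inch)
eyeDistPixels! = eyeDistInches! * screenWidthPixels! / screenWidthInches!
PRINT "Eye-to-screen distance in pixels:"; eyeDistPixels!
```

With these numbers the result is 640. Change the first two values to match your own setup.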
Now we want to find out what xs is, because that's the x coordinate
of the point we want to plot with PSET.
Substituting the values above into equation 1, we get:
(remember the distance between the eye and center of the screen is 640)
         640          xs-160
     ----------- = -----------    ... Eq. 2
     (zp-zs)+640      xp-160
Now, if we assume the screen is at z=0, then zs drops out and
things get easy.
         640          xs-160
     ----------- = -----------    ... Eq. 3
       zp+640         xp-160
'[first figure with more numbers filled in]
'
'
            (160,100,zp)
                 Q+--------- * P(xp,100,zp)
  behind screen   |         /  (a point in 3D)
                  |        /
      (160,100,0) |       /  (320,100,0)
|=================C======S===============|
                 :|     /  (xs,100,0)
                 6|    /   pixel for point
                 4|   /
                 0|  /
                 :| /
                E:|/
         eye(160,100,zeye)
We want to solve this for xs, so here it goes:
 - multiplying both sides by (xp-160), we get
                 640*(xp-160)
      xs-160 = ---------------       ... Eq. 3b
                   640+zp
 - adding 160 to both sides of the equation, we get
             640*(xp-160)
      xs = --------------- + 160     ... Eq. 4 (origin at top left corner of screen)
               640+zp
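To check the algebra, you can plug a sample point into Eq. 4 and confirm that both sides of Eq. 3 come out equal. This is only a sketch: the point (xp=260, zp=640) is an arbitrary choice for illustration.

```basic
' Check Eq. 4 against Eq. 3 with one sample point.
' The sample values are arbitrary; screen is at z = 0, eye 640 pixels in front.
xp! = 260: zp! = 640

xs! = 640 * (xp! - 160) / (640 + zp!) + 160    ' Eq. 4
PRINT "xs ="; xs!

' Both ratios in Eq. 3 should agree if Eq. 4 was solved correctly.
PRINT "left side of Eq. 3: "; 640 / (zp! + 640)
PRINT "right side of Eq. 3:"; (xs! - 160) / (xp! - 160)
```

Here the point is 100 pixels right of center and as far behind the screen as the eye is in front of it, so it lands 50 pixels right of center (xs = 210), and both ratios are .5.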
Next, we will find the formula for ys, then we can
plot 3D points on the screen using PSET(xs,ys),colour.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
HOW COME WE CAN ASSUME Y=100?
Okay, we got the formula for xs when y=100, but this same formula
actually works for y<>100. Why is this? Here is an intuitive
explanation:
if i was standing on a cliff ...
looking into oblivion
and there's this giant orb that just floats
say it's "30 units to the right of the center of my FOV"
and it moves along the (vertical) y-axis
no matter how far up or down it goes that x-coord is
staying the same
....................a more difficult explanation...................
: The mathematical reason behind it has to do with projection again. :
: Say y=120 (the 3D point is at xp,120,zp). The similar triangles :
: formed by this point and the eye will match the one :
: with y=100 if you project it to the y=100 plane. :
....................................................................
Because y does not have to be 100, the formula for xs, given in
equation 4 can be used any time we need to project 3D points to
the screen.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This gives us a formula for xs. But what about ys? It turns out
that ys can be found in almost the exact same way!
Now you can get off the ceiling :) Sit back in your seat, and rotate the
monitor sideways so you can't see what's on the screen. Before you do that,
you might want to copy the diagram below, so you can compare how the monitor
looks to the diagram. Okay, since the screen is sideways, the +z axis points
to the right in the diagram, and the +y axis points down. The baseball is
now at point P' (pronounced "pee-prime") this time.
'[side view of monitor and eye]
'
 in front of <--- screen --> behind screen
                   ====
                    ||:::::::::
  +-->into +z       ||         ::::::::::
  |   screen        ||   (assume x = 160
+y v                ||    for all points)
  down              ||                   ::::::::
                    ||                           :::
                    ||                           :::
Eye(160,100,zeye)   ||   behind screen           :::
         E----------C------+ Q'                  :::
           \        ||     |                     :::
             \      ||     |                     :::
                \   ||     |                     :::
                  \ ||     |                     :::
                    S'     |                     :::
                    || \   |                     :::
                    ||   \ |                     :::
                    ||     * P' (160,yp,zp)      :::
                    ||                :::::::::::
                    || :::::::::::::::::::
                    ||::::::::::
                   ====
In this figure,
* the eye is still at E (160,100,zeye).
* the center of the screen 13 is still at C at (160,100,zs).
* the new point in 3D is at P'(160,yp,zp), so everything
lies on the x=160 plane, so it's easier to solve.
* the point on the screen where you would plot the pixel
corresponding to the point in 3D is S' at (160,ys',zs).
Now you have to notice that we have two similar triangles:
- Triangle ECS' and Triangle EQ'P' are similar.
This means that the *ratio of the corresponding sides*
of the triangle is the *same* for ECS' and EQ'P'! So we have:
         EC       CS'
        ---- = ------    ... Eq. 5
         EQ'     Q'P'
Looks just like equation 1, huh? I told you that the x and y's
can be solved in the same way!
The rest of the derivation looks similar too! Just keep the numbers
straight, and you'll be fine. Plugging in the lengths of the sides of
the triangle into equation 5, we get something that looks a lot like
equation 2: (remember the distance between the eye and center of
the screen is 640 pixels for SCREEN 13.)
           640           ys'-100
      ------------- = -----------    ... Eq. 6
       640+(zp-zs)       yp-100
Again, the screen is at z=0, so zs=0 and things get easier.
          640          ys'-100
      ----------- = -----------      ... Eq. 7
        640+zp         yp-100
We want to solve this for ys', so here it goes:
 - multiplying both sides by (yp-100), we get
                  640*(yp-100)
      ys'-100 = ---------------      ... Eq. 7b
                    640+zp
 - adding 100 to both sides, we get
              640*(yp-100)
      ys' = --------------- + 100    ... Eq. 8 (origin at top left corner of screen)
                640+zp
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
HOW COME WE CAN ASSUME X=160 ?
When we solve for ys, why can we forget about the x coordinate and
assume it is 160? I can say, it works by analogy, but that's not
a proof. Here is a physics-based explanation:
If I was standing on the side of a flat street looking toward the
other side, while the cars were passing by in the x direction
(horizontally), I wouldn't see the cars moving up and down,
would I? [Now if this was a sloped street, cars going horizontally
would be either taking off or crashing into the ground, like in
"Back to the Future," but that's another story.]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Because of the above reasoning, once again, we can generalize our
equation to one that projects any 3D point to the screen, without
doing any extra work! So ys = ys' if point P' is at the same
position as point P above.
                   640*(yp-100)
      ys = ys' = --------------- + 100   ... Eq. 8a (origin at top left corner of screen)
                     640+zp
Together, Equation 4 and equation 8a give us the complete formula for
plotting 3D points (which have their origin on the top left
corner of the screen, with +x axis going to the right, the +y axis pointing down,
and the +z axis pointing into the monitor) onto the screen.
Here they are again.
             640*(xp-160)
      xs = --------------- + 160     ... Eq. 4
               640+zp

             640*(yp-100)
      ys = --------------- + 100     ... Eq. 8a
               640+zp
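To get a feel for what these two formulas do, here is a small sketch that pushes one point deeper and deeper into the screen and prints where it lands. The sample point and the variable names are made up for illustration.

```basic
' Eq. 4 and Eq. 8a for one point pushed deeper into the screen.
' Origin at the top left corner of the screen, +y down, screen at z = 0.
' The sample point and variable names are hypothetical.
eyeDist! = 640
xp! = 260: yp! = 150

FOR zp! = 0 TO 2560 STEP 640
    xs! = eyeDist! * (xp! - 160) / (eyeDist! + zp!) + 160   ' Eq. 4
    ys! = eyeDist! * (yp! - 100) / (eyeDist! + zp!) + 100   ' Eq. 8a
    PRINT "zp ="; zp!; " xs ="; xs!; " ys ="; ys!
NEXT zp!
```

At zp = 0 the point sits on the screen and lands on itself at (260,150). At zp = 640 it has moved halfway toward the center of the screen, to (210,125), and it keeps creeping toward (160,100) as zp grows. That is the familiar "things far away shrink toward the vanishing point" effect.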
Wait! "Top left corner of the screen?" That means (1,-1,1) will be
plotted off the screen! Ok, we'll fix this, but there's another
problem for people used to y axis pointing up. The y-axis on our
coordinate system points down!
To correct this, we have to return to equation 7b.
(don't worry, it's only a small change!)
                     640*(yp-100)
      -(ys'-100) = ---------------   ... Eq. 7b [+y axis is up in 3D point, down on screen]
                       640+zp
Look, all we had to do was add a minus sign! Now this makes a small change in
equations 8 and 8a. Here it is:
                   640*(yp-100)
      ys = 100 - ---------------     ... Eq. 8a [y axis fix]
                     640+zp
We didn't have to change equation 4 because the screen coordinate
(abbreviated "screen coord" below) agrees with the Cartesian coordinate system
(defined by the x, y and z axes) we used.
[to make the origin of points at center of screen]
(Note: These xp and yp variables have values different from the
xp and yp in Eq. 4 and 8a.)
                   640*(xp+160-160)
      xs = 160 + -------------------   ... Eq. 4c (origin at C)
                       640+zp              [y axis fix, origin at C]

                   640*(yp-100+100)
      ys = 100 - -------------------   ... Eq. 8c (origin at C)
                       640+zp              [y axis fix, origin at C]
*******************************************************************
Simplifying, we get a formula that works pretty well
for plotting 3D points in SCREEN 13.
                   640*xp
      xs = 160 + ----------    ... Eq. 4c' (origin at C)
                   640+zp          [y axis fix, units in pixels]

                   640*yp
      ys = 100 - ----------    ... Eq. 8c' (origin at C)
                   640+zp          [y axis fix]
*******************************************************************
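Equations 4c' and 8c' fit naturally into a small helper routine. The sketch below is one possible way to wrap them; the SUB name Project and its parameter list are assumptions made for this example. It uses single-precision variables, so it assumes the point is in front of the eye (zp > -640) and does no other checking.

```basic
DECLARE SUB Project (xp!, yp!, zp!, xs!, ys!)

SCREEN 13
' Plot the four corners of a 100x100 square centered on C,
' once on the screen (zp = 0) and once 640 pixels behind it.
FOR zp! = 0 TO 640 STEP 640
    FOR yp! = -50 TO 50 STEP 100
        FOR xp! = -50 TO 50 STEP 100
            CALL Project(xp!, yp!, zp!, xs!, ys!)
            PSET (xs!, ys!), 15
        NEXT xp!
    NEXT yp!
NEXT zp!

' Hypothetical helper: Eq. 4c' and Eq. 8c' (origin at C, +y up).
' Results come back through xs! and ys!.
SUB Project (xp!, yp!, zp!, xs!, ys!)
    xs! = 160 + 640 * xp! / (640 + zp!)
    ys! = 100 - 640 * yp! / (640 + zp!)
END SUB
```

The near corners land 50 pixels from the center of the screen and the far corners only 25 pixels from it, so the eight dots already look like a box seen in perspective.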
[How things look with the origin at C (top-down view)]
         (0,yp,zp) Q+--------- * P(xp,yp,zp)
                    |         /  (note: values of xp,yp,zp are different than before)
                    |        /
(-160,ys,0) (0,0,0) |       /  (160,ys,0)
  |=================C======S===============|
                    |     /  (xs,ys,0)
                    |    /   pixel for point
                    |   /
                    |  /
                    | /
                  E |/
            eye(0,0,-640)
////////////////behind eye/////////////////
///////////////////////////////////////////
Likewise, we can move the origin to the eye, if you want, although
this *isn't* always the best thing, because a point at the origin
will crash your 3D engine (it's equivalent to poking yourself in the eye),
unless you write an IF statement to handle the special case! (In fact, all
points with z coordinates on or behind the eye shouldn't be displayed!)
But this is actually what most 3D engines do (including OpenGL) when
doing the perspective transform.
(Note: LET xp3d = xp from Eq. 4c'
           yp3d = yp from Eq. 8c'
           zp3d = zp+640 )

                   640*xp3d
      xs = 160 + ------------    ... Eq. 4e' (origin at E, y-axis fix)
                     zp3d

                   640*yp3d
      ys = 100 - ------------    ... Eq. 8e' (origin at E, y-axis fix)
                     zp3d
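The IF statement mentioned above can be as simple as the following sketch. The sample point is arbitrary and the message text is made up; the only important part is that the divide never runs when zp3d is zero or negative.

```basic
' Eq. 4e' and Eq. 8e' with the origin at the eye.
' Points at or behind the eye (zp3d <= 0) are skipped so the
' divide can never hit zero. The sample point is arbitrary.
xp3d! = 100: yp3d! = 50: zp3d! = 1280

IF zp3d! > 0 THEN
    xs! = 160 + 640 * xp3d! / zp3d!    ' Eq. 4e'
    ys! = 100 - 640 * yp3d! / zp3d!    ' Eq. 8e'
    PRINT "xs ="; xs!; " ys ="; ys!
ELSE
    PRINT "Point is at or behind the eye; not drawn."
END IF
```

For this point xs is 210 and ys is 75. Set zp3d! to 0 to see the special case being caught instead of causing a "Division by zero" error.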
Well, if we take a quick look at the xs = x/z, ys = y/z in the introduction,
you'll see that 4e' and 8e' are very close. (Just take off the centering addition
and the *640, which multiplies the x and y by the eye-to-screen distance.)
To really get that, you have to measure everything in special units so that
the distance from the eye to screen is defined to be 1, use the coordinate
system with the origin (0,0,0) at the eye, and do WINDOW SCREEN (-160,-100)-(160,100)
to center the screen at (0,0,zs). Although that is nice in theory,
when you write a game engine, you don't want to be doing extra divide operations,
so the forms presented in equations 4e'+8e' or 4c'+8c' work best. I suggest
that you work out the math to prove to yourself that this is true.
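If you would rather see the numbers than do the algebra, the sketch below rescales one sample point into units where the eye-to-screen distance is 1, applies the bare x/z formula, and then converts the answer back to SCREEN 13 pixels. The point and the variable names are chosen only for illustration.

```basic
' Compare the plain x/z formula with Eq. 4e'.
' The sample point is given in pixels with the origin at the eye.
xp3d! = 100: zp3d! = 1280

' Rescale so that the eye-to-screen distance (640 pixels) becomes 1.
x! = xp3d! / 640
z! = zp3d! / 640

xsUnit! = x! / z!                  ' the x/z formula from the introduction
xsPixel! = 160 + 640 * xsUnit!     ' back to pixels, centered on the screen

PRINT "x/z in eye-distance units:"; xsUnit!
PRINT "same point in pixels:     "; xsPixel!
PRINT "Eq. 4e' directly:         "; 160 + 640 * xp3d! / zp3d!
```

The last two lines both print 210, which shows that x/z is the same projection in different units. It also shows the extra divides you pay for the rescaling.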
Well, we have derived several formulas for perspective projection in SCREEN 13, and we
found out that the x/z and y/z are accurate ways to do perspective projection when we
use the correct coordinate system and units. We will finish this time by writing a
simple 3D parametric function plotter.
QBasic code (finally!)
DEFINT A-Z
SCREEN 13: CLS
'=====================================
' 3D Perspective Projection Test
'=====================================
'set grayscale palette (tell the VGA to start writing at color 0 first)
OUT &H3C8, 0
FOR i = 0 TO 255: OUT &H3C9, i \ 4: OUT &H3C9, i \ 4: OUT &H3C9, i \ 4: NEXT
'draw wavy thing around zp=100 axis
FOR t! = 0 TO 6 STEP .001
    xp = INT(100 * COS(t!))
    yp = INT(100 * SIN(8 * t!))
    zp = INT(99 * SIN(t!) + 100)
    zdenom = (zp + 640)
    'perspective projection (world space to screen space)
    IF zdenom > 0 THEN
        xs = (160 + xp * 640& \ zdenom) 'using equation 4c'.
        ys = (100 - yp * 640& \ zdenom) 'using equation 8c'.
        r = (640 \ zdenom)              'find size of point
        CIRCLE (xs, ys), r, 200 - zp    'plot it on the screen!
    END IF
NEXT t!
'draw helix around the y axis
FOR t! = 0 TO 60 STEP .001
    xp = INT(100 * COS(t!))
    yp = INT(t! + .5)
    zp = INT(100 * SIN(t!) + 100)
    xp3d = xp
    yp3d = yp
    zp3d = zp + 640
    'perspective projection (world space to screen space)
    'note how zdenom = zp3d
    IF zp3d > 0 THEN 'if point is in front of eye, then
        'project the 3D point to the screen
        xs = (160 + xp3d * 640& \ zp3d) 'using equation 4e'.
        ys = (100 - yp3d * 640& \ zp3d) 'using equation 8e'.
        r = (640 \ zp3d)                'find size of point
        CIRCLE (xs, ys), r, 200 - zp    'plot it on the screen!
    END IF
NEXT t!
Next time, I'll talk about how to change the field of view, so
you can get panoramic scenes or binocular zoom vision in your
perspective code.