## Building Virtual Worlds

#### Lecture 3: Perspective and Stereo Projection

This lecture is based on Chapters 3.3, 3.4 and 3.5 of Vince, and some supplementary material on parallel projection.

Projecting an image means computing 2D points from 3D points. Rendering an image means actually drawing it. So projection is a necessary step before rendering (unless the object was two-dimensional in the first place).

Section 3.3: Perspective Projection.

To project the 3d information from an object in world coordinates into a 2d view screen, we first transform the object's coordinates into camera, or Virtual Observer (VO) coordinates. If d is the distance of the VO out on the negative Z axis, similar triangles show us that the projected value xp of a point's x value is

xp = (x · d) / (z + d)

and by identical logic,

yp = (y · d) / (z + d)

where z is, of course, how far the object is beyond the viewscreen. Big z, small image: distant objects appear smaller than close-up ones.
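The projection formulas above can be sketched directly in code. This is a minimal illustration, assuming the viewscreen lies at z = 0 and the VO sits at (0, 0, -d); the function name is my own.

```python
def project(x, y, z, d):
    """Project a point (x, y, z), given in VO coordinates with the
    viewscreen at z = 0, onto the viewscreen. d is the VO's distance
    out along the negative Z axis."""
    scale = d / (z + d)   # similar-triangles ratio; shrinks as depth z grows
    return x * scale, y * scale

# Big z, small image: the same point twice as deep projects smaller.
near = project(5.0, 5.0, 10.0, 10.0)   # scale = 10/20 -> (2.5, 2.5)
far  = project(5.0, 5.0, 30.0, 10.0)   # scale = 10/40 -> (1.25, 1.25)
```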

Field of View. To calculate the field of view as an angle, we need to know how wide and high the view screen is. Table 3.1 in the text is based on 35mm film. Since the film's image area is not square, there are different field-of-view values for width and height.

If you assume that the film's image area is close to 35 mm wide, the half-angle theta sits in a right triangle with opposite side 35/2 = 17.5 mm and adjacent side 35 mm (the focal length), so tan(theta) = 0.5 and theta is about 26.6 degrees. Doubling that gives a field of view of about 53 degrees. The textbook's Table 3.1 reports 63 degrees for a 35 mm focal length, so the image dimension the table uses must be somewhat larger than 35 mm (63 degrees in fact corresponds to the 43.3 mm diagonal of a standard 36 × 24 mm film frame).
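The corrected arithmetic is easy to check numerically. A small sketch, using only the standard library; the function name is my own:

```python
import math

def fov_degrees(w, focal_length=35.0):
    """Full field of view (degrees) subtended by an image dimension
    w mm at the given focal length, from tan(theta) = (w/2) / focal."""
    return 2 * math.degrees(math.atan((w / 2) / focal_length))

print(fov_degrees(35.0))   # ~53 degrees, not the 63 in Table 3.1
print(fov_degrees(43.3))   # the 36x24 frame's diagonal gives ~63 degrees
```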

Query 3.1. Construct a cube whose center is at (0,0,20)  of side length 10, with the sides parallel or perpendicular to the viewscreen located at Z=0.  (By "construct a cube", I mean that you should list its vertices as eight triples of real numbers.) With a viewpoint of (0,0,-10) calculate the location of the images of the cube's corners on the viewscreen. Draw the image of the cube as seen from VO. Also draw the X and Y coordinate axes.

Why does this image not look like our normal impression of a "perspective" rendering of a cube?

Query 3.2. Construct a cube whose center is at (10,10,20) of side length 5, which is rotated so that some of its faces are not parallel to the principal axial planes. Such a cube is illustrated in Figure 3.15 in the text. Now apply the perspective projection to your new cube. Do you get an image more like the one shown on Figure 3.15's view screen than the one from Query 3.1?

HINTS: There are, of course, infinitely many such cubes, and yours may not look like another student's. Here are some hints on how to construct one. From the figure, we see that we want the top of the cube to be visible in the perspective rendering. So let's just make the cube "nod" by rotating it downward about an axis parallel to the X axis and through its center. If the cube's sides are to remain 5 units long, Pythagoras' theorem immediately tells us the offsets d must satisfy d**2 + d**2 = 5**2 = 25, or d**2 = 25/2 = 12.5, so (whipping out our trusty calculator) d = 3.54. We can now compute the Z and Y positions of the corners, since we know where the center is. I hope the X positions of the corners are obvious to you: the X axis runs into the page.
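The hint amounts to a 45-degree rotation about an X-parallel axis through the cube's center. A sketch of generating the eight corners that way (variable names are my own; your cube may differ):

```python
import math

# "Nodding" cube: side 5, center (10, 10, 20), rotated 45 degrees
# about an axis parallel to X through its center.
cx, cy, cz, half = 10.0, 10.0, 20.0, 2.5
theta = math.radians(45)

corners = []
for x in (-half, half):
    for y in (-half, half):
        for z in (-half, half):
            # rotate the (y, z) offsets about the X-parallel axis,
            # then translate to the cube's center
            yr = y * math.cos(theta) - z * math.sin(theta)
            zr = y * math.sin(theta) + z * math.cos(theta)
            corners.append((cx + x, cy + yr, cz + zr))

# Each rotated offset is 0 or +/- 2.5*sqrt(2) ~ 3.54 -- the d of the hint.
```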

Parallel Projection. There is a simpler way to produce 2d pictures from 3d data: Simply throw away the z coordinate!
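In code, parallel projection really is that simple. A one-line sketch (function name my own):

```python
def parallel_project(x, y, z):
    """Parallel (orthographic) projection: throw away the z coordinate."""
    return x, y

# With depth discarded, the same (x, y) projects to the same screen
# location at any distance -- no foreshortening at all.
assert parallel_project(5.0, 5.0, 10.0) == parallel_project(5.0, 5.0, 30.0)
```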

Query 3.3: Draw a parallel projection of our rotated cube above.

Section 3.4: Human Vision

Monocular (one-eye) depth cues: motion parallax, absolute size (called "perspective depth cues" by the author; confusing!), fog/haze, occlusion (blocking of distant objects by close ones), focal adaptation. Wonder why he left several of these out?

Binocular (two-eye, or three if you have them) Depth Cues: Eye convergence, stereopsis.

These last two are NOT the same. Eye convergence is based on the eye's need to center an image on the retina: muscles pull the eyes inward until an error signal somewhere in the optic tectum of the brain is minimized. This concerns the actual direction from which light is coming. The amount of convergence applied depends on what part of a real scene we're paying attention to. Consider two posts, one near and one far. If you shift your attention from the near one to the far one, the convergence angle changes. The focal length of your eye's lens also changes, if you're young and have flexible lenses.

Stereopsis is the analysis of the perceived difference in two scenes, one per eye. The brain is quite sophisticated at deducing how far away a vertex or edge is, from this difference. But it only works out to about 10 meters, maximum. Stereopsis is applied to the whole scene, all the time. It doesn't depend on what part of the scene you're paying attention to, except that the actual images PROVIDED for stereopsis will of course be different, based on the convergence angle your eyes are using.

We compute two perspective images to make stereo pictures, of course. But they must be painted on a flat surface, such as an LCD screen. This is the source of the headache (literally) that occurs when our brain notices that focus, convergence and stereopsis are inconsistent. Looking into such a scene, the eye has only one focal distance to work with - that is, the distance from the eye to the screen surface.  The computer doesn't know at what distance into the scene your attention is focused, and so cannot know what convergence to assume. It has to assume some value and provide you with a single set of stereo images.

Usually, for computing the stereo images, we assume that the viewer is gazing straight ahead ("infinity optics"). Then we cheat to fake convergence by pulling these stereo images a bit closer together than they really should be.
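The "infinity optics" scheme above can be sketched as two perspective projections from laterally offset eye positions. This is an illustrative sketch only: the IPD value and all names are my own assumptions, not from the text, and each image is expressed in its own eye-centered coordinates.

```python
IPD = 6.5  # cm; a typical human interpupillary distance (assumed value)

def project_eye(x, y, z, d, eye_offset):
    """Perspective projection as before, but from an eye shifted
    along X by eye_offset, gazing straight ahead (no convergence)."""
    scale = d / (z + d)
    return (x - eye_offset) * scale, y * scale

def stereo_pair(x, y, z, d=10.0):
    """Left/right images of one point for eyes straddling the VO."""
    left  = project_eye(x, y, z, d, -IPD / 2)
    right = project_eye(x, y, z, d, +IPD / 2)
    return left, right

# The left/right disparity is IPD * d / (z + d): nearby points produce
# a larger disparity than distant ones, which is what stereopsis reads.
```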

Query 3.4: Draw a picture (a top-down "map style" view is best) which shows by an example, why the production of stereo images must assume a particular convergence angle.

Query 3.5: Screen based stereoscopic displays generate the images on a wall rather than inside a head mounted display. Users wear glasses which, either by time multiplexing or filters (colored or polarized) provide separate images for the left and right eyes. Will the convergence/stereopsis problem be different for stereoscopic screen displays, than for head mounted displays?
