Digital Media

Moshell - Spring 99

Lecture 14: Introduction to Computer Graphics

This lecture covers the basics of rotation, scaling, translation and perspective projection. It is very similar to Lectures 2 and 3 of the course CAP4021 - Building Virtual Worlds.

Defining a Right Handed Coordinate System:

X right (almost all systems do this)
Y Up (some use Y down, like a Windows screen)
Z Out (goes with Y up and Right Hand Rule)

The Synthetic Camera is an imaginary device inside which an image must be formed, on its own screen coordinate system. To do that, the objects must be located in the camera coordinate system. So we have to construct rules which take world 3D coordinates (x,y,z) into the viewer (Virtual Observer=VO, or camera) coordinate system (xv, yv, zv), and then into screen coords (xs, ys).

The VO is located in space at some point named VO=(xv,yv,zv). That's easy.

It is oriented in some way involving three degrees of freedom. That's hard. We discuss 3 ways of doing it.

Direction Cosines. If the world coordinate system has three orthogonal unit vectors I, J, K as its bases, and the transformed system has bases L, M and N, the idea of direction cosines is to express the base vector L in terms of I, J and K. (And likewise, M and N.) For instance,

If  L is parallel to I and perpendicular to J and K, then L's direction cosines are (1,0,0) in the IJK system.
If L is antiparallel to J, then L's direction cosines are (0,-1,0) in the IJK system.
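
The two cases above can be checked with a few lines of Python. This is only a sketch: the function name direction_cosines is made up for illustration, and it assumes the vector is given by its components in the IJK system. Dividing each component by the vector's length gives the cosine of the angle between the vector and each axis.

```python
import math

def direction_cosines(v):
    """Return the cosines of the angles between v and the I, J, K axes."""
    length = math.sqrt(sum(c * c for c in v))
    if length == 0:
        raise ValueError("the zero vector has no direction")
    return tuple(c / length for c in v)

# L parallel to I (any positive length will do)
print(direction_cosines((3.0, 0.0, 0.0)))   # (1.0, 0.0, 0.0)

# L antiparallel to J
print(direction_cosines((0.0, -2.0, 0.0)))  # (0.0, -1.0, 0.0)

# A vector halfway between the x and y axes, in the z=0 plane
halfway = direction_cosines((1.0, 1.0, 0.0))
print(halfway)  # first two components are close to (sqrt 2)/2
```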

The cases with superimposed axes are simple to understand. (These are cases where the target system is 90 degrees or 180 degrees away from the source system, in some sense.) If things are rotated so that the axes don't align, then we gotta do some math. Here's a review:

REVIEW: Cosine. The cosine of the angle between two vectors is a measure of their parallel-ness. Cos(0)=1, meaning if the angle between 2 vectors is 0, cosine is maximized. Cos(pi/2)=cos(90 degrees)=0, meaning if two vectors are perpendicular, cosine is zero. Cos(pi)=cos(180 degrees)=-1, meaning the two vectors point in opposite directions. Your calculator, or a lookup table, will provide values of cosine. The commonly needed values are these:

Cos(0)=1. Cos(pi/4)=cos(45 degrees)=(sqrt 2)/2. Cos(pi/3)=cos(60 degrees)=1/2. Cos(pi/2)=0. Cos(pi)=-1.

Unit vectors are just vectors of length 1, such as (1,0,0) or (0,0,1) or ((sqrt 2)/2, (sqrt 2)/2, 0). This last vector of course falls exactly between the x and y axes, and in the z=zero plane.

QUERY 14.1: Figure out the direction cosines (the three components) of a unit vector which falls exactly between the x,y and z axes. Hint: Its three components will be identical.

These direction cosines enable any point P(x,y,z) in one frame of reference (FOR) to be transformed into P'(x',y',z') in another frame of reference. This is literally true, but "enable" does not mean "provide all the necessary information." You need both the relative ROTATION of the two FORs, and the relative TRANSLATION of them.

Homogeneous Coordinates means using a 4 x 4 matrix operation to represent three dimensional geometric transformations - a neat trick. The fourth column provides an additive component, when we augment our 3d vectors with a fourth element of value 1. We'll see examples in a few minutes.

REVIEW: Length of a vector. A vector V= (a,b,c) has length |V| = sqrt( a**2 + b**2 + c**2). Pythagorean Theorem.

REVIEW: Dot Product of vectors. Two vectors V=(a,b,c,d) and W=(e,f,g,h) can be multiplied  together by forming the sum (ae+bf+cg+dh). Input: 2 vectors; output: one scalar number. This "dot product" also computes the value of |V||W|cos Theta, where Theta is the angle between the 2 vectors. A miracle, no doubt....
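
Here is a small Python sketch of the two reviews above (length and dot product), plus the angle you get by solving |V||W|cos Theta for Theta. The function names dot, length and angle_between are just illustrative choices, not anything from a library.

```python
import math

def dot(v, w):
    """Sum of the products of matching components."""
    return sum(a * b for a, b in zip(v, w))

def length(v):
    """Pythagorean length: the square root of v dotted with itself."""
    return math.sqrt(dot(v, v))

def angle_between(v, w):
    """Angle in radians, from dot(v, w) = |v||w|cos(theta)."""
    c = dot(v, w) / (length(v) * length(w))
    c = max(-1.0, min(1.0, c))  # guard against round-off just outside [-1, 1]
    return math.acos(c)

v = (1.0, 0.0, 0.0)
w = (1.0, 1.0, 0.0)
print(dot(v, w))                          # 1.0
print(length((3.0, 4.0, 0.0)))            # 5.0
print(math.degrees(angle_between(v, w)))  # very close to 45
```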

REVIEW: Matrix Multiplication. Two matrices placed side by side can be multiplied, if the left one has the same number of columns, as the right one has rows. For instance, a 4x4 and a 4 x 1 can be multiplied (we always tell the number of rows, first.) The 4 x 1 is called a column vector.

To multiply a 4 x 4 and a 4 x 1, take the dot product of the first row of the 4x4 with the 4x1 column vector. This is the first (top) scalar in the resulting 4 x 1 column vector. In other words
 

a b c d      u
e f g h      v
i j k l      w
m n p q      x
Multiplying this matrix by this column vector yields a new column vector whose first term is
    au + bv + cw + dx
and whose other terms are
    eu + fv + gw + hx
    iu + jv + kw + lx
    mu + nv + pw + qx
Got that?
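
The same row-by-row recipe, sketched in Python. The example matrix is an assumption chosen for illustration: an identity matrix whose fourth column holds a translation of (5, -2, 3), which shows how the fourth column becomes an additive component once the vector carries a 1 in its fourth slot. The name mat_vec is hypothetical.

```python
def mat_vec(m, v):
    """Multiply a matrix (list of rows) by a column vector (list).

    Each output element is the dot product of one row with v.
    """
    return [sum(m[r][c] * v[c] for c in range(len(v))) for r in range(len(m))]

# Homogeneous translation by (5, -2, 3): the offsets sit in the fourth column.
translate = [
    [1, 0, 0, 5],
    [0, 1, 0, -2],
    [0, 0, 1, 3],
    [0, 0, 0, 1],
]

p = [1, 1, 1, 1]  # the point (1,1,1) augmented with a fourth element of 1

print(mat_vec(translate, p))  # [6, -1, 4, 1]
```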

Query 14.2: Construct the transformation matrix which describes a VO located at 10,0,10, with its Yvo parallel to the Yw axis and its Zvo pointing straight at the origin. Verify your matrix's correctness by applying it to the world coordinate points (0,0,0) and (10,0,0). Draw pictures as necessary to convince yourself (and me) that the world coordinate points have the right coordinate values in the VO coordinate system.

XYZ Fixed Angles. Note that the terms "roll", "pitch" and "yaw", imported from aviation, relate to positive rotation around the VO's z, x and y axes respectively. You have to be careful to think in aircraft terms rather than coordinate axis terms, as other people might not line z up with the plane's long axis, but roll is always thus defined. Likewise pitch always means "nose up" and yaw means "nose left".

Matrices for roll, pitch & yaw are as follows:

cos roll     -sin roll    0            0
sin roll     cos roll     0            0
0            0            1            0
0            0            0            1

1            0            0            0
0            cos pitch    -sin pitch   0
0            sin pitch    cos pitch    0
0            0            0            1

cos yaw      0            sin yaw      0
0            1            0            0
-sin yaw     0            cos yaw      0
0            0            0            1

Matrix multiplication is non-commutative (that is, the order you line them up does matter), so you gotta have an "official" order for these operations. We just named it: roll, pitch & yaw, because these are the standard most-to-least common aviation operations.
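
To see that the order really matters, here is a Python sketch that applies a 90 degree roll and a 90 degree yaw to the same point in both orders. The function names are illustrative, angles are in radians, and the matrices are the standard rotations about z, x and y (the second row of the yaw matrix is 0 1 0 0). The helper tidy only rounds away floating-point dust for printing.

```python
import math

def roll(angle):
    """Rotation about the z axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0, 0],
            [s,  c, 0, 0],
            [0,  0, 1, 0],
            [0,  0, 0, 1]]

def pitch(angle):
    """Rotation about the x axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[1, 0,  0, 0],
            [0, c, -s, 0],
            [0, s,  c, 0],
            [0, 0,  0, 1]]

def yaw(angle):
    """Rotation about the y axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[ c, 0, s, 0],
            [ 0, 1, 0, 0],
            [-s, 0, c, 0],
            [ 0, 0, 0, 1]]

def mat_vec(m, v):
    """4x4 matrix times 4-element column vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]

def tidy(v):
    """Round to 6 places; adding 0.0 turns -0.0 into 0.0."""
    return [round(x, 6) + 0.0 for x in v]

quarter_turn = math.pi / 2
p = [1, 0, 0, 1]  # the point (1,0,0) in homogeneous form

# The matrix nearest the point is applied first.
roll_then_yaw = mat_vec(yaw(quarter_turn), mat_vec(roll(quarter_turn), p))
yaw_then_roll = mat_vec(roll(quarter_turn), mat_vec(yaw(quarter_turn), p))

print(tidy(roll_then_yaw))  # [0.0, 1.0, 0.0, 1.0]
print(tidy(yaw_then_roll))  # [0.0, 0.0, -1.0, 1.0]
```

Same two rotations, same point, two different answers.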

Euler Angles are similar but you always figure the angle based on the current object's coordinate system, moving forward one step at a time. The objective in any case is to get from a given pose, to some other pose. There is usually more than one way to describe the steps involved, and all you want is to get ONE matrix that reliably does the job. Here's how:

If a VO is located in the virtual environment by describing its XYZ Euler angles, then the point p=(x,y,z) in virtual space corresponds to a point p'=(x',y',z') in the VO or camera coordinates, according to this formula:

p' = [-yaw][-pitch][-roll][-translate][p]

Modeling and Viewing Transformations

A modeling transformation is one which moves the object in world coordinates. Such transformations usually represent either motion, or the positioning of a component with respect to the overall object being built. The following query is really about a modeling transformation.

Query 14.3. A whale is swimming north in normal whale position (belly down.) Construct a transformation matrix which would reposition this whale so that it is now swimming West and belly-up. Test the matrix on a very small whale "data set".

HINT: The small whale data set can consist of the points (0,0,0) for its tail, (0,0,-10) for its nose, and (-2,0,-5) for the tip of its left fin ("front paw"). The coordinates are in meters.

Query 14.4. The whale we described happened to be located with its tail at the origin when we did our flip-around. What if the whale had had its tail at 10 kilometers north of the origin? Correct your test data and see what the matrix does in this case.

In the following two queries, "pseudo-code" means that you can code the algorithms in English, C, Java, Pascal or anything you want as long as the loops, loop counters and computed values are explicitly named as variables. None of that "then multiply the value by the cosine"... vagueness.

Query 14.5. Write a pseudo-code procedure which has three input values (a,b,c) which represent Euler angles for roll, pitch and yaw. The procedure yields a 3 x 3 matrix M[0..2, 0..2] where the first coordinate denotes the row, which performs the stated transformation. You may find it convenient in your code to call the procedure you're about to write in the next Query.

Query 14.6. Write a pseudo-code procedure which accepts two 3 x 3 matrices (M, N) and produces their matrix product P.
 
Viewing transformations, on the other hand, produce numbers which represent the same point as seen from the new coordinate system (usually the camera or VO system.) Here's an easy way to remember what happens:

The transformation that would move the VO system on top of the VE system is the same one that moves data points from the VE system to the VO system.

Trivial example: The VO system is just like VE but its origin is at (10,0,0) in VE system. Thus, the point p=(15,0,0) in VE corresponds to p'= (5,0,0) in VO.

a) How would you move VO on top of VE? Of course you would translate it by (-10,0,0).

b) How would you compute p' from p? Of course you would add (-10,0,0) to p.
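
The trivial example in Python, as a sketch. It assumes the VO axes are parallel to the VE axes, so only the origin differs and no rotation is needed. The function name world_to_vo is made up for illustration.

```python
def world_to_vo(p, vo_origin):
    """Express world point p in a VO system whose axes match the world's.

    Moving VO on top of VE means translating by minus the VO origin,
    and that same translation is applied to the data point.
    """
    return tuple(pc - oc for pc, oc in zip(p, vo_origin))

print(world_to_vo((15, 0, 0), (10, 0, 0)))  # (5, 0, 0)
```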

Projecting the 2d Image from the 3d Data

Projecting an image means computing 2d points from 3d points. Rendering an image means actually drawing it. So projection is a necessary step before rendering (unless the object was 2 dimensional in the first place.)

Perspective Projection.

To project the 3d information from an object in world coordinates into a 2d view screen, we first transform the object's coordinates into camera, or Virtual Observer (VO) coordinates. If d is the distance of the VO out on the negative Z axis, similar triangles show us that the projected value xp of a point's x value is

xp = (xd)/(z + d)

and by identical logic,

yp = (yd)/ (z + d)

Where z is of course how far the object is beyond the viewscreen. Big z, small object. Distant objects are smaller than close-up ones.
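
A sketch of these two formulas in Python. It assumes the point is already in VO coordinates, with the viewscreen at z=0 and the viewpoint a distance d behind it on the negative Z axis. The function name project and the sample numbers are illustrative only.

```python
def project(point, d):
    """Perspective-project (x, y, z) onto the viewscreen at z = 0.

    d is the distance from the viewpoint to the viewscreen.
    """
    x, y, z = point
    if z + d <= 0:
        raise ValueError("point is at or behind the viewpoint")
    return (x * d / (z + d), y * d / (z + d))

# The same x and y at two depths: big z, small image.
print(project((4.0, 2.0, 10.0), 10.0))  # (2.0, 1.0)
print(project((4.0, 2.0, 30.0), 10.0))  # (1.0, 0.5)
```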

Field of View. To calculate the field of view as an angle, we actually have to know how wide and high the view screen is. A textbook (Vince: Virtual Reality Systems) asserts that the following relationship holds between a lens' focal length and the field of view:
 
Focal Length in mm    Field of View in Degrees
20                    94
28                    75
35                    63
50                    46
85                    28.5
135                   18
200                   12
300                   8.25

Since the film's image area is not square, there are actually different values for the field of view in width and height.

If you assume that the film's image area is close to 35 mm wide, you can see that the angle Theta is in a triangle with opposite side 35/2 = 17.5 mm and base 35 mm (the focal length), so TAN(theta) = 1/2, or theta = about 26.6 degrees. Double that angle is about 53 degrees. The textbook table 3.1 reports a field of view of 63 degrees for a 35 mm focal length, so we can tell that the film dimension behind the table's figures is larger than 35 mm (63 degrees works out to roughly 43 mm, close to the diagonal of a standard 36 x 24 mm frame).
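
The triangle argument can be checked with a short Python sketch. The half-angle has opposite side (image size)/2 and base equal to the focal length, so the full field of view is twice the arctangent of their ratio. The function name is made up, and the 43.3 mm figure is an assumption: it is the diagonal of a 36 x 24 mm frame, which seems to be what the table measures.

```python
import math

def field_of_view_degrees(image_size_mm, focal_length_mm):
    """Full field of view across an image dimension of the given size."""
    half_angle = math.atan((image_size_mm / 2.0) / focal_length_mm)
    return math.degrees(2.0 * half_angle)

# 35 mm across, 35 mm focal length: tan(theta) = 17.5/35 = 1/2
print(round(field_of_view_degrees(35.0, 35.0), 1))  # 53.1

# Same lens, measured across an assumed 43.3 mm frame diagonal
print(field_of_view_degrees(43.3, 35.0))  # between 63 and 64 degrees
```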


Query 14.7. Construct a cube whose center is at (0,0,20)  of side length 10, with the sides parallel or perpendicular to the viewscreen located at Z=0.  (By "construct a cube", I mean that you should list its vertices as eight triples of real numbers.) With a viewpoint of (0,0,-10) calculate the location of the images of the cube's corners on the viewscreen. Draw the image of the cube as seen from VO. Also draw the X and Y coordinate axes.

Why does this image not look like our normal impression of a "perspective" rendering of a cube?

Query 14.8. Construct a cube whose center is at (10,10,20) of side length 5, which is rotated so that some of its faces are not parallel to the principal axial planes. (If you haven't any idea how to do this, work with a friend or read the hint below.) Now apply the perspective projection to your new cube. Do you get an image more like what one expects a cube to look like? This will depend on the cube you designed, of course.

HINTS: Of course there are an infinite number of such cubes, and yours may not look like some other student's. Here are some hints on how to construct such a cube. From the figure, we see that we want the top of the cube to be visible in the perspective rendering. So let's just make the cube "nod" by rotating it downward, along an axis parallel to the X axis and through its center. As you can see below, if the cube's sides are to remain 5 units long, we can immediately calculate by Pythagoras' theorem that the diagonal lengths d must be such that d**2 + d**2 = 5**2 = 25 or d**2 = 25/2 = 12.5, so (whipping out our trusty calculator) we find that d=3.54. So, we can now compute the Z and Y positions of the corners, since we know where the center is. I hope that the X positions of the corners are obvious to you. The X axis runs into the page.


Parallel Projection. There is a simpler way to produce 2d pictures from 3d data: Simply throw away the z coordinate!
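
In Python this is about as short as a projection can get. A sketch, with a made-up function name:

```python
def parallel_project(point):
    """Parallel (orthographic) projection: keep x and y, discard z."""
    x, y, _z = point
    return (x, y)

# Two points at very different depths land on the same screen position.
print(parallel_project((3.0, 4.0, 10.0)))   # (3.0, 4.0)
print(parallel_project((3.0, 4.0, 500.0)))  # (3.0, 4.0)
```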

Query 14.9: Draw a parallel projection of our rotated cube above.
 

Human Three Dimensional Vision

How can we tell how far away things are? We use both one-eye (Monocular) and two-eye (Binocular) cues. Monocular (one-eye) Depth Cues include: motion parallax, absolute size, fog/haze, occlusion (blocking of distant objects by close objects), and focal adaptation.

Query 14.10: Define motion parallax and focal adaptation.

Binocular (two-eye, or three if you have them) Depth Cues include Eye convergence and stereopsis.

These two are NOT the same. Eye convergence is based on the eye's need to center an image on the retina. Muscles pull the eyes inward until an error signal somewhere in the optic tectum of the brain is minimized. This concerns the actual direction from which light is coming. The amount of convergence that is applied depends on what part of a real scene we're paying attention to. Consider two posts, one near and one far. If you shift your attention from the near to the far one, the convergence angle changes. The focal length of your eyeball also changes, if you're young and have flexible eyeballs.

Stereopsis is the analysis of the perceived difference in two scenes, one per eye. The brain is quite sophisticated at deducing how far away a vertex or edge is, from this difference. But it only works out to about 10 meters, maximum. Stereopsis is applied to the whole scene, all the time. It doesn't depend on what part of the scene you're paying attention to, except that the actual images PROVIDED for stereopsis will of course be different, based on the convergence angle your eyes are using.

We compute two perspective images to make stereo pictures, of course. But they must be painted on a flat surface, such as an LCD screen. This is the source of the headache (literally) that occurs when our brain notices that focus, convergence and stereopsis are inconsistent. Looking into such a scene, the eye has only one focal distance to work with - that is, the distance from the eye to the screen surface.  The computer doesn't know at what distance into the scene your attention is focused, and so cannot know what convergence to assume. It has to assume some value and provide you with a single set of stereo images.

Usually, for computing the stereo images, we assume that the viewer is gazing straight ahead ("infinity optics"). Then we cheat to fake convergence by pulling these stereo images a bit closer together than they really should be.
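
A sketch of the straight-ahead case in Python, before any cheating. It assumes both eyes look parallel to the Z axis, sit at x = plus and minus half the eye separation, and are a distance d in front of a screen at z=0. The function names and the numbers are illustrative only, and the shift that fakes convergence is not included.

```python
def project_for_eye(point, eye_x, d):
    """Project a point onto the screen plane z = 0 for an eye at (eye_x, 0, -d)."""
    x, y, z = point
    if z + d <= 0:
        raise ValueError("point is at or behind the eye")
    t = d / (z + d)  # fraction of the way from the eye to the point
    return (eye_x + (x - eye_x) * t, y * t)

def stereo_pair(point, eye_separation, d):
    """Return (left image position, right image position) on the screen."""
    half = eye_separation / 2.0
    left = project_for_eye(point, -half, d)
    right = project_for_eye(point, +half, d)
    return left, right

# A point straight ahead, 10 units behind the screen: the two images differ.
print(stereo_pair((0.0, 0.0, 10.0), 2.0, 10.0))  # ((-0.5, 0.0), (0.5, 0.0))

# A point lying on the screen itself: the two images coincide.
print(stereo_pair((0.0, 0.0, 0.0), 2.0, 10.0))   # ((0.0, 0.0), (0.0, 0.0))
```

The horizontal difference between the two images is the data that stereopsis works from.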

Query 14.11: Draw a picture (a top-down "map style" view is best) which shows by an example, why the production of stereo images must assume a particular convergence angle.

Query 14.12: Screen based stereoscopic displays generate the images on a wall rather than inside a head mounted display. Users wear glasses which, either by time multiplexing or filters (colored or polarized) provide separate images for the left and right eyes. Will the convergence/stereopsis problem be different for stereoscopic screen displays, than for head mounted displays?
