Computer Vision

CAP5415 - Computer Vision

Spring 2003
TR 19:00 - 20:15
CSB 0221

Instructor

Khurram Hassan Shafique
CSB 103 (Computer Vision Lab)
Phone (Vision Lab): 407-823-4733
Office Hours: TR 15:00-16:00 in CSB-255 (Grad Lab)
Phone (Grad Lab): 407-823-2245

Teaching Assistant

Cen Rao
CSB 103 (Computer Vision Lab)
Phone (Vision Lab): 407-823-4733
Office Hours: TR 16:00-17:00 in CSB-255 (Grad Lab)
Phone (Grad Lab): 407-823-2245

Course Introduction

Prerequisites: Programming experience is the only prerequisite for the course; however, knowledge of linear algebra, image processing, and/or graphics may be helpful.

"Vision is the process of discovering from images what is present in the world, and where it is" (David Marr). Ever since the advent of computational machines, researchers have wondered whether these machines can be programmed to imitate human cognitive processes (visual perception, natural language processing, deductive reasoning etc). The problem was assumed to be easier in the beginning and only processing power and limited storage was regarded as the major hurdle.

"I believe that in about fifty years' time it will be possible to programme computers to make them play the imitation game so well that an average interrogator will not have more than 70 per cent chance of making the right identification after five minutes of questioning". Alan Turing, 'Computing Machinery and Intelligence', Mind (1950), 442

But more than 50 years after Turing's remarks, and despite a huge increase in computing power and storage, the modeling and simulation of perceptual processes is still an unsolved problem.

Computer vision is the study of the analysis of pictures and videos in order to achieve results similar to those achieved by humans. Thus human vision acts as a lower bound on our ambitions with regard to computational image analysis (a Turing Test for computer vision). The field of computer vision has inspired a large number of researchers in computer science, engineering, and mathematics. Even though we are still far from achieving this ultimate goal, a great amount of work and knowledge has been gathered in the process, and the techniques developed are widely used in areas such as medical imaging, video surveillance, computer graphics, and video compression.

Course Syllabus

The course is at an introductory level and deals mostly with low-level and mid-level visual analysis. The class assignments will consist of homework problems (both programming and non-programming), a midterm exam (in-class), a term project, and a final exam (comprehensive, in-class). The course grade will be determined on the following basis:

Biweekly Assignments: 20%
Programming Assignments: 30%
Mid-Term Exam: 20%
Final Exam: 30%

The major topics include the following (not necessarily in this order). Some topics may not be covered due to time restrictions, while other topics of general interest may be introduced.

Imaging Geometry

Camera Modeling and Calibration

Filtering and Enhancing Images

Region Segmentation

Color and Texture

Line and Curve Detection

Perceptual Organization

Shape Analysis

Stereopsis

Motion and Optical Flow

Structure from Motion

High-Level Vision

The University Golden Rules will be observed in this class. Copying or plagiarism is a violation of the Golden Rules.

Reference Texts:

David A. Forsyth and Jean Ponce, "Computer Vision: A Modern Approach", Prentice Hall, 2003. (This very new book is a nice survey of computer vision techniques, though it lacks detail in some places, and is already being used as a textbook for introductory graduate courses in computer vision at many schools. An electronic version of this book is available at http://www.cs.berkeley.edu/~daf/book3chaps.html.)

Emanuele Trucco, Alessandro Verri, "Introductory Techniques for 3-D Computer Vision", Prentice Hall, 1998. (This book takes a more algorithmic approach to the problems. Not a lot of depth, but a good review of the main areas.)

Mubarak Shah, "Fundamentals of Computer Vision". (This book is only available online and provides a useful resource for most of the problems that will be covered in class. A must-read for students preparing for the qualifying exam.)

Olivier Faugeras, "Three Dimensional Computer Vision", MIT Press, 1993. (A very good and mathematically rigorous reference on 3D vision, imaging geometry and stereo. Requires background in linear algebra and projective geometry.)

Other interesting readings include the following:

B. K. P. Horn, "Robot Vision", MIT Press, 1985. (A classic text, a bit dated.)

Martin D. Levine, "Vision in Man and Machine", McGraw-Hill, 1985. (Another classic, tries to relate biological and computational vision. Out of Print.)

Robert M. Haralick and Linda G. Shapiro, "Computer and Robot Vision", Addison-Wesley, 1992-93.

Richard O. Duda, Peter E. Hart, and David G. Stork, "Pattern Classification", Wiley Interscience, 2001. (Does not cover most of the material of this course, but is a very valuable resource if you are interested in high-level vision techniques. Requires some background in statistics.)

Lectures

Lecture 1 (January 7, 2003)

Introduction and brief history

Course Overview

Lecture 2 (January 9, 2003)
Slides: PDF/ PPT

Pinhole Cameras

Projections (Perspective, Weak Perspective and Orthographic)

Lenses and their effect

Introduction to Vector Spaces and Euclidean Space

Suggested Reading: Chapter 1, David A. Forsyth and Jean Ponce, "Computer Vision: A Modern Approach"

Lecture 3 (January 14, 2003)

Introduction to Vector Spaces and Euclidean Space (Contd)

Projective Coordinates

Translation and Rotation in Euclidean Space

Suggested Reading:

Chapter 2, David A. Forsyth and Jean Ponce, "Computer Vision: A Modern Approach"

Chapter 1, Mubarak Shah, "Fundamentals of Computer Vision"

Lecture 4 (January 16, 2003)

Rigid Body Transformation

Affine and Projective Transformations

Intrinsic and Extrinsic Camera Parameters

Suggested Reading:

Chapter 2, David A. Forsyth and Jean Ponce, "Computer Vision: A Modern Approach"

Chapter 3, Olivier Faugeras, "Three Dimensional Computer Vision", MIT Press, 1993

Chapter 1, Mubarak Shah, "Fundamentals of Computer Vision"

Lecture 5 (January 21, 2003)
Slides: PDF/ PPT

Intrinsic and Extrinsic Camera Parameters (Contd)

Estimation of Camera Parameters

Suggested Reading:

Chapter 6, Emanuele Trucco, Alessandro Verri, "Introductory Techniques for 3-D Computer Vision", Prentice Hall, 1998

Chapter 1, Mubarak Shah, "Fundamentals of Computer Vision"

Lecture 6 (January 23, 2003)
Slides: PDF/ PPT

Estimation of Camera Parameters (Contd)

Types of Images and their Representation

Types of Noise and their modeling

Suggested Reading:

Chapter 6, Emanuele Trucco, Alessandro Verri, "Introductory Techniques for 3-D Computer Vision", Prentice Hall, 1998

Lecture 7 (January 28, 2003)
Slides: PDF/ PPT

Linear Filtering and Convolution

Introduction to Fourier Transform

Suggested Reading:

Chapter 7, David A. Forsyth and Jean Ponce, "Computer Vision: A Modern Approach"

Lecture 8 (January 30, 2003)
Slides: PDF/ PPT

Sampling and Aliasing

Gaussian Pyramids

Laplacian Pyramids

Suggested Reading:

Chapter 7, David A. Forsyth and Jean Ponce, "Computer Vision: A Modern Approach"

Lecture 9 (February 04, 2003)
Slides: PDF/ PPT

Derivatives and Convolution

Edge Detection

Suggested Reading:

Chapter 8, David A. Forsyth and Jean Ponce, "Computer Vision: A Modern Approach"

Chapter 4, Emanuele Trucco, Alessandro Verri, "Introductory Techniques for 3-D Computer Vision", Prentice Hall, 1998

Chapter 2, Mubarak Shah, "Fundamentals of Computer Vision"

Lecture 10 (February 06, 2003)
Slides: PDF/ PPT

Edge Detection

Suggested Reading:

Chapter 8, David A. Forsyth and Jean Ponce, "Computer Vision: A Modern Approach"

Chapter 4, Emanuele Trucco, Alessandro Verri, "Introductory Techniques for 3-D Computer Vision", Prentice Hall, 1998

Chapter 2, Mubarak Shah, "Fundamentals of Computer Vision"

Lecture 11 (February 11, 2003)
Slides: PDF/ PPT

Connected Components

Line Fitting

Suggested Reading:

Chapter 15, David A. Forsyth and Jean Ponce, "Computer Vision: A Modern Approach"

Chapter 5, Emanuele Trucco, Alessandro Verri, "Introductory Techniques for 3-D Computer Vision", Prentice Hall, 1998

Chapter 4, Mubarak Shah, "Fundamentals of Computer Vision"

"Use of the Hough Transformation to Detect Lines & Curves in Pictures," R.O. Duda & P.E. Hart, Computer methods in image analysis (On Reserve)

Lecture 12 (February 13, 2003)
Slides: PDF/ PPT

Line and Curve Fitting

Deformable Contours

Suggested Reading:

Chapter 15, David A. Forsyth and Jean Ponce, "Computer Vision: A Modern Approach"

Chapter 5, Emanuele Trucco, Alessandro Verri, "Introductory Techniques for 3-D Computer Vision", Prentice Hall, 1998

Donna Williams and Mubarak Shah, "A Fast Algorithm for Active Contours and Curvature Estimation," Computer Vision, Graphics and Image Processing, January 1992, Volume 55, pp 14-26.

Lecture 13 (February 18, 2003)
Slides: PDF/ PPT

Deformable Contours

Suggested Reading:

Chapter 5, Emanuele Trucco, Alessandro Verri, "Introductory Techniques for 3-D Computer Vision", Prentice Hall, 1998

Donna Williams and Mubarak Shah, "A Fast Algorithm for Active Contours and Curvature Estimation," Computer Vision, Graphics and Image Processing, January 1992, Volume 55, pp 14-26.

Lecture 14 (February 27, 2003)
Slides: PDF/ PPT

Segmentation by Clustering

Suggested Reading:

Chapter 14, David A. Forsyth and Jean Ponce, "Computer Vision: A Modern Approach"

Chapter 3, Mubarak Shah, "Fundamentals of Computer Vision"

Lecture 15 (March 06, 2003)
Slides: PDF/ PPT

Segmentation by Clustering

Suggested Reading:

Chapter 14, David A. Forsyth and Jean Ponce, "Computer Vision: A Modern Approach"

Jianbo Shi and Jitendra Malik, "Normalized Cuts and Image Segmentation," IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(8), 888-905, August 2000.

Lecture 16 (March 11, 2003)
Slides: PDF/ PPT

Hu Moments

Medial Axis Transform

Suggested Reading:

M.-K. Hu, "Visual pattern recognition by moment invariants," Computer methods in image analysis.

H. Blum, "A transformation for extracting new descriptors of shape," Computer methods in image analysis.

Chapter 4, Mubarak Shah, "Fundamentals of Computer Vision"

Lecture 17 (March 13, 2003)
Slides: PDF/ PPT

Color

Texture

Suggested Reading:

Chapter 6, David A. Forsyth and Jean Ponce, "Computer Vision: A Modern Approach"

Chapter 9, David A. Forsyth and Jean Ponce, "Computer Vision: A Modern Approach"

Lecture 18 (March 25, 2003)
Slides: PDF/ PPT

Motion Estimation

Optical Flow

Suggested Reading:

Chapter 8, Emanuele Trucco, Alessandro Verri, "Introductory Techniques for 3-D Computer Vision", Prentice Hall, 1998

Lecture 19 (March 27, 2003)
Slides: PDF/ PPT

Optical Flow

Motion Models

Global Flow Estimation

Suggested Reading:

Chapter 8, Emanuele Trucco, Alessandro Verri, "Introductory Techniques for 3-D Computer Vision", Prentice Hall, 1998

James R. Bergen, P. Anandan, Keith J. Hanna, Rajesh Hingorani: "Hierarchical Model-Based Motion Estimation," ECCV 1992: 237-252

Lecture 20 (April 1, 2003)
Slides: PDF/ PPT

Global Flow Estimation

Image Warping

Motion Tracking

Suggested Reading:

James R. Bergen, P. Anandan, Keith J. Hanna, Rajesh Hingorani: "Hierarchical Model-Based Motion Estimation," ECCV 1992: 237-252

Lecture 21 (April 3, 2003)
Slides: PDF/ PPT

Motion Tracking

Change Detection

Suggested Reading:

Chapter 8, Emanuele Trucco, Alessandro Verri, "Introductory Techniques for 3-D Computer Vision", Prentice Hall, 1998

C. Stauffer and W.E.L. Grimson, "Learning patterns of activity using real time tracking," IEEE Trans. On PAMI, 22(8):747-757, Aug 2000

Lecture 22 (April 8, 2003)
Slides: PDF/ PPT

Change Detection

Suggested Reading:

C. Stauffer and W.E.L. Grimson, "Learning patterns of activity using real time tracking," IEEE Trans. On PAMI, 22(8):747-757, Aug 2000

Lecture 23 (April 10, 2003)
Slides: PDF/ PPT

Multiview Geometry

Suggested Reading:

Chapter 10, David A. Forsyth and Jean Ponce, "Computer Vision: A Modern Approach"

Chapter 7, Emanuele Trucco, Alessandro Verri, "Introductory Techniques for 3-D Computer Vision", Prentice Hall, 1998

Chapter 6, Olivier Faugeras, "Three Dimensional Computer Vision", MIT Press, 1993

Lecture 24 (April 15, 2003)
Slides: PDF/ PPT

Multiview Geometry

Stereopsis

Suggested Reading:

Chapter 10, David A. Forsyth and Jean Ponce, "Computer Vision: A Modern Approach"

Chapter 7, Emanuele Trucco, Alessandro Verri, "Introductory Techniques for 3-D Computer Vision", Prentice Hall, 1998

Chapter 6, Olivier Faugeras, "Three Dimensional Computer Vision", MIT Press, 1993

Epipolar Geometry Applet

Lecture 25 (April 17, 2003)
Slides: PDF/ PPT

Stereopsis

Suggested Reading:

Chapter 7, Emanuele Trucco, Alessandro Verri, "Introductory Techniques for 3-D Computer Vision", Prentice Hall, 1998

Chapter 11, David A. Forsyth and Jean Ponce, "Computer Vision: A Modern Approach"

Chapter 6, Olivier Faugeras, "Three Dimensional Computer Vision", MIT Press, 1993

Assignments

Assignment1/Solution

Programming Assignment1, Due February 11, 2003.

Assignment2/Solution

Programming Assignment2, Due March 06, 2003.

Assignment3, Due March 25, 2003.

Programming Assignment3, Due March 25, 2003.

Assignment4, Due April 03, 2003.

Programming Assignment4, Due April 17, 2003.

Assignment5/Solution

Data Files

Test files for Programming Assignment #1, BMP Format/Binary PGM Format

Test files for Canny Edge Detector, BMP Format/Binary PGM Format

Test files for Hough Transform, BMP Format/Binary PGM Format

Test files for Single Line Fitting, BMP Format/Binary PGM Format

Reading material for Assignment #3 and Programming Assignment #3, Papers

Test files for Snake Algorithm, BMP Format/Binary PGM Format

Test files for Thinning Algorithms, BMP Format/Binary PGM Format

Test files for Lucas-Kanade Optical Flow, BMP Format/Binary PGM Format

Test files for Affine Warping, BMP Format/Binary PGM Format

Test files for Global Flow Estimation, BMP Format/Binary PGM Format

Important Dates/Deadlines

February 20, Mid-term Exam
February 28, Withdrawal Deadline
March 17-23, Spring Break
April 17, Last Class Meeting
April 24, Final Exam (19:00-21:45)

Leading Journals and Conferences in Computer Vision

International Journal of Computer Vision (IJCV)

IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI)

IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR)

International Conference on Computer Vision (ICCV)

Some Other Links

UCF Computer Vision Lab Home Page

Computer Vision Home Page

CAP5415 Computer Vision Spring 2002

CAP5415 Computer Vision Spring 2001

Last modified April 07, 2003