CAP5415 - Computer Vision

Spring 2003
TR 19:00 - 20:15
CSB 0221


Instructor

Khurram Hassan Shafique
CSB 103 (Computer Vision Lab)
Phone (Vision Lab): 407-823-4733
Office Hours: TR 15:00-16:00 in CSB-255 (Grad Lab)
Phone (Grad Lab): 407-823-2245


Teaching Assistant

Cen Rao
CSB 103 (Computer Vision Lab)
Phone (Vision Lab): 407-823-4733
Office Hours: TR 16:00-17:00 in CSB-255 (Grad Lab)
Phone (Grad Lab): 407-823-2245


Course Introduction

Prerequisite: Other than programming experience, there is no prerequisite for the course; however, knowledge of linear algebra, image processing, and/or graphics may be helpful.

"Vision is the process of discovering from images what is present in the world, and where it is" (David Marr). Ever since the advent of computational machines, researchers have wondered whether these machines can be programmed to imitate human cognitive processes (visual perception, natural language processing, deductive reasoning, etc.). The problem was initially assumed to be easy, and only limited processing power and storage were regarded as major hurdles.

"I believe that in about fifty years' time it will be possible to programme computers to make them play the imitation game so well that an average interrogator will not have more than 70 per cent chance of making the right identification after five minutes of questioning". Alan Turing, 'Computing Machinery and Intelligence', Mind (1950), 442

But more than 50 years after Turing's remarks, and despite a huge increase in computing power and storage, the modeling and simulation of perceptual processes remains an unsolved problem.

Computer vision is the study of the analysis of pictures and videos in order to achieve results similar to those achieved by humans. Thus human vision acts as a lower bound on our ambitions with regard to computational image analysis (Turing Test for computer vision). The field of computer vision has inspired a large number of researchers in computer science, engineering, and mathematics. Even though we are still far from achieving this ultimate goal, we have accumulated a great body of work and knowledge in the process, and the techniques developed are widely used in areas such as medical imaging, video surveillance, computer graphics, and video compression.


Course Syllabus

The course is introductory and deals mostly with low-level and mid-level visual analysis. The class assignments will consist of homework problems (both programming and non-programming), a midterm exam (in-class), a term project and a final exam (comprehensive, in-class). The course grade will be determined on the following basis:

Biweekly Assignments: 20%
Programming Assignments: 30%
Mid-Term Exam: 20%
Final Exam: 30%
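
As a rough illustration of how the weights above combine, the following sketch computes a weighted course score. The function name `course_score`, the component keys, and the sample scores are hypothetical and are not part of the course materials. It also assumes each component is reported as a percentage, which the syllabus does not state.

```python
# Weights copied from the grading breakdown above, as fractions of the course grade.
WEIGHTS = {
    "biweekly_assignments": 0.20,
    "programming_assignments": 0.30,
    "midterm_exam": 0.20,
    "final_exam": 0.30,
}


def course_score(scores):
    """Return the weighted course score (0-100) from per-component percentages."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical student scores, each on an assumed 0-100 scale.
example_scores = {
    "biweekly_assignments": 90.0,
    "programming_assignments": 80.0,
    "midterm_exam": 70.0,
    "final_exam": 85.0,
}

print(course_score(example_scores))
```

How a weighted score maps to a letter grade is not specified here, so the sketch stops at the numeric total.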

The major topics include the following (not necessarily in this order). Some topics may not be covered due to time restrictions, while other topics of general interest may be introduced.

  • Imaging Geometry
  • Camera Modeling and Calibration
  • Filtering and Enhancing Images
  • Region Segmentation
  • Color and Texture
  • Line and Curve Detection
  • Perceptual Organization
  • Shape Analysis
  • Stereopsis
  • Motion and Optical Flow
  • Structure from Motion
  • High-level Vision

The University Golden Rules will be observed in this class. Copying or plagiarism is a violation of the Golden Rules.


    Reference Text:

  • David A. Forsyth and Jean Ponce, "Computer Vision: A Modern Approach", Prentice Hall, 2003. (This very new book is a nice survey of computer vision techniques, though lacking detail in some places, and is already being used as a textbook for introductory graduate courses in computer vision at many schools. The electronic version of this book is available at http://www.cs.berkeley.edu/~daf/book3chaps.html.)
  • Emanuele Trucco, Alessandro Verri, "Introductory Techniques for 3-D Computer Vision", Prentice Hall, 1998. (This book takes a more algorithmic approach to the problems. Not a lot of depth, but a good review of the main areas.)
  • Mubarak Shah, "Fundamentals of Computer Vision". (This book is available only online and provides a useful resource for most of the problems that will be covered in class. A must-read for students preparing for the qualifying exam.)
  • Olivier Faugeras, "Three Dimensional Computer Vision", MIT Press, 1993 (A very good and mathematically rigorous reference on 3D vision, imaging geometry and stereo. Requires background in linear algebra and projective geometry)

    Other interesting readings include the following

  • B. K. P. Horn, "Robot Vision", MIT Press, 1985. (A classic text, a bit dated.)
  • Martin D. Levine, "Vision in Man and Machine", McGraw-Hill, 1985. (Another classic, tries to relate biological and computational vision. Out of Print.)
  • Robert M. Haralick and Linda G. Shapiro, "Computer and Robot Vision", Addison-Wesley, 1992-93.
  • Richard O. Duda, Peter E. Hart, and David G. Stork, "Pattern Classification", Wiley Interscience, 2001. (Does not cover most of the material of this course, but is a very valuable resource if you are interested in high-level vision techniques. Requires some background in statistics.)

    Lectures

    Lecture 1 (January 7, 2003)

  • Introduction and brief history
  • Course Overview

    Lecture 2 (January 9, 2003)
    Slides: PDF/ PPT

  • Pinhole Cameras
  • Projections (Perspective, Weak Perspective and Orthographic)
  • Lenses and their effect
  • Introduction to Vector Spaces and Euclidean Space
  • Suggested Reading: Chapter 1, David A. Forsyth and Jean Ponce, "Computer Vision: A Modern Approach"

    Lecture 3 (January 14, 2003)

  • Introduction to vector spaces and Euclidean space (contd)
  • Projective Coordinates
  • Translation and Rotation in Euclidean Space
  • Suggested Reading:

  • Chapter 2, David A. Forsyth and Jean Ponce, "Computer Vision: A Modern Approach"
  • Chapter 1, Mubarak Shah, "Fundamentals of Computer Vision"

    Lecture 4 (January 16, 2003)

  • Rigid Body Transformation
  • Affine and Projective Transformations
  • Intrinsic and Extrinsic Camera Parameters
  • Suggested Reading:

  • Chapter 2, David A. Forsyth and Jean Ponce, "Computer Vision: A Modern Approach"
  • Chapter 3, Olivier Faugeras, "Three Dimensional Computer Vision", MIT Press, 1993
  • Chapter 1, Mubarak Shah, "Fundamentals of Computer Vision"

    Lecture 5 (January 21, 2003)
    Slides: PDF/ PPT

  • Intrinsic and Extrinsic Camera Parameters (Contd)
  • Estimation of Camera Parameters
  • Suggested Reading:

  • Chapter 6, Emanuele Trucco, Alessandro Verri, "Introductory Techniques for 3-D Computer Vision", Prentice Hall, 1998
  • Chapter 1, Mubarak Shah, "Fundamentals of Computer Vision"

    Lecture 6 (January 23, 2003)
    Slides: PDF/ PPT

  • Estimation of Camera Parameters (Contd)
  • Types of Images and their Representation
  • Types of Noise and their modeling
  • Suggested Reading:

  • Chapter 6, Emanuele Trucco, Alessandro Verri, "Introductory Techniques for 3-D Computer Vision", Prentice Hall, 1998

    Lecture 7 (January 28, 2003)
    Slides: PDF/ PPT

  • Linear Filtering and Convolution
  • Introduction to Fourier Transform
  • Suggested Reading:

  • Chapter 7, David A. Forsyth and Jean Ponce, "Computer Vision: A Modern Approach"

    Lecture 8 (January 30, 2003)
    Slides: PDF/ PPT

  • Sampling and Aliasing
  • Gaussian Pyramids
  • Laplacian Pyramids
  • Suggested Reading:

  • Chapter 7, David A. Forsyth and Jean Ponce, "Computer Vision: A Modern Approach"

    Lecture 9 (February 04, 2003)
    Slides: PDF/ PPT

  • Derivatives and Convolution
  • Edge Detection
  • Suggested Reading:

  • Chapter 8, David A. Forsyth and Jean Ponce, "Computer Vision: A Modern Approach"
  • Chapter 4, Emanuele Trucco, Alessandro Verri, "Introductory Techniques for 3-D Computer Vision", Prentice Hall, 1998
  • Chapter 2, Mubarak Shah, "Fundamentals of Computer Vision"

    Lecture 10 (February 06, 2003)
    Slides: PDF/ PPT

  • Edge Detection
  • Suggested Reading:

  • Chapter 8, David A. Forsyth and Jean Ponce, "Computer Vision: A Modern Approach"
  • Chapter 4, Emanuele Trucco, Alessandro Verri, "Introductory Techniques for 3-D Computer Vision", Prentice Hall, 1998
  • Chapter 2, Mubarak Shah, "Fundamentals of Computer Vision"

    Lecture 11 (February 11, 2003)
    Slides: PDF/ PPT

  • Connected Components
  • Line Fitting
  • Suggested Reading:

  • Chapter 15, David A. Forsyth and Jean Ponce, "Computer Vision: A Modern Approach"
  • Chapter 5, Emanuele Trucco, Alessandro Verri, "Introductory Techniques for 3-D Computer Vision", Prentice Hall, 1998
  • Chapter 4, Mubarak Shah, "Fundamentals of Computer Vision"
  • "Use of the Hough Transformation to Detect Lines & Curves in Pictures," R.O. Duda & P.E. Hart, Computer methods in image analysis (On Reserve)

    Lecture 12 (February 13, 2003)
    Slides: PDF/ PPT

  • Line and Curve Fitting
  • Deformable Contours
  • Suggested Reading:

  • Chapter 15, David A. Forsyth and Jean Ponce, "Computer Vision: A Modern Approach"
  • Chapter 5, Emanuele Trucco, Alessandro Verri, "Introductory Techniques for 3-D Computer Vision", Prentice Hall, 1998
  • Donna Williams and Mubarak Shah, "A Fast Algorithm for Active Contours and Curvature Estimation," Computer Vision, Graphics and Image Processing, January 1992, Volume 55, pp 14-26.

    Lecture 13 (February 18, 2003)
    Slides: PDF/ PPT

  • Deformable Contours
  • Suggested Reading:

  • Chapter 5, Emanuele Trucco, Alessandro Verri, "Introductory Techniques for 3-D Computer Vision", Prentice Hall, 1998
  • Donna Williams and Mubarak Shah, "A Fast Algorithm for Active Contours and Curvature Estimation," Computer Vision, Graphics and Image Processing, January 1992, Volume 55, pp 14-26.

    Lecture 14 (February 27, 2003)
    Slides: PDF/ PPT

  • Segmentation by Clustering
  • Suggested Reading:

  • Chapter 14, David A. Forsyth and Jean Ponce, "Computer Vision: A Modern Approach"
  • Chapter 3, Mubarak Shah, "Fundamentals of Computer Vision"

    Lecture 15 (March 06, 2003)
    Slides: PDF/ PPT

  • Segmentation by Clustering
  • Suggested Reading:

  • Chapter 14, David A. Forsyth and Jean Ponce, "Computer Vision: A Modern Approach"
  • Jianbo Shi and Jitendra Malik, "Normalized Cuts and Image Segmentation," IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(8), 888-905, August 2000.

    Lecture 16 (March 11, 2003)
    Slides: PDF/ PPT

  • Hu Moments
  • Medial Axis Transform
  • Suggested Reading:

  • M.-K. Hu, "Visual pattern recognition by moment invariants," Computer methods in image analysis.
  • H. Blum, "A transformation for extracting new descriptors of shape," Computer methods in image analysis.
  • Chapter 4, Mubarak Shah, "Fundamentals of Computer Vision"

    Lecture 17 (March 13, 2003)
    Slides: PDF/ PPT

  • Color
  • Texture
  • Suggested Reading:

  • Chapter 6, David A. Forsyth and Jean Ponce, "Computer Vision: A Modern Approach"
  • Chapter 9, David A. Forsyth and Jean Ponce, "Computer Vision: A Modern Approach"

    Lecture 18 (March 25, 2003)
    Slides: PDF/ PPT

  • Motion Estimation
  • Optical Flow
  • Suggested Reading:

  • Chapter 8, Emanuele Trucco, Alessandro Verri, "Introductory Techniques for 3-D Computer Vision", Prentice Hall, 1998

    Lecture 19 (March 27, 2003)
    Slides: PDF/ PPT

  • Optical Flow
  • Motion Models
  • Global Flow Estimation
  • Suggested Reading:

  • Chapter 8, Emanuele Trucco, Alessandro Verri, "Introductory Techniques for 3-D Computer Vision", Prentice Hall, 1998
  • James R. Bergen, P. Anandan, Keith J. Hanna, Rajesh Hingorani: "Hierarchical Model-Based Motion Estimation," ECCV 1992: 237-252

    Lecture 20 (April 1, 2003)
    Slides: PDF/ PPT

  • Global Flow Estimation
  • Image Warping
  • Motion Tracking
  • Suggested Reading:

  • James R. Bergen, P. Anandan, Keith J. Hanna, Rajesh Hingorani: "Hierarchical Model-Based Motion Estimation," ECCV 1992: 237-252

    Lecture 21 (April 3, 2003)
    Slides: PDF/ PPT

  • Motion Tracking
  • Change Detection
  • Suggested Reading:

  • Chapter 8, Emanuele Trucco, Alessandro Verri, "Introductory Techniques for 3-D Computer Vision", Prentice Hall, 1998
  • C. Stauffer and W.E.L. Grimson, "Learning patterns of activity using real time tracking," IEEE Trans. on PAMI, 22(8):747-757, Aug 2000

    Lecture 22 (April 8, 2003)
    Slides: PDF/ PPT

  • Change Detection
  • Suggested Reading:

  • C. Stauffer and W.E.L. Grimson, "Learning patterns of activity using real time tracking," IEEE Trans. on PAMI, 22(8):747-757, Aug 2000

    Lecture 23 (April 10, 2003)
    Slides: PDF/ PPT

  • Multiview Geometry
  • Suggested Reading:

  • Chapter 10, David A. Forsyth and Jean Ponce, "Computer Vision: A Modern Approach"
  • Chapter 7, Emanuele Trucco, Alessandro Verri, "Introductory Techniques for 3-D Computer Vision", Prentice Hall, 1998
  • Chapter 6, Olivier Faugeras, "Three Dimensional Computer Vision", MIT Press, 1993

    Lecture 24 (April 15, 2003)
    Slides: PDF/ PPT

  • Multiview Geometry
  • Stereopsis
  • Suggested Reading:

  • Chapter 10, David A. Forsyth and Jean Ponce, "Computer Vision: A Modern Approach"
  • Chapter 7, Emanuele Trucco, Alessandro Verri, "Introductory Techniques for 3-D Computer Vision", Prentice Hall, 1998
  • Chapter 6, Olivier Faugeras, "Three Dimensional Computer Vision", MIT Press, 1993
  • Epipolar Geometry Applet

    Lecture 25 (April 17, 2003)
    Slides: PDF/ PPT

  • Stereopsis
  • Suggested Reading:

  • Chapter 7, Emanuele Trucco, Alessandro Verri, "Introductory Techniques for 3-D Computer Vision", Prentice Hall, 1998
  • Chapter 11, David A. Forsyth and Jean Ponce, "Computer Vision: A Modern Approach"
  • Chapter 6, Olivier Faugeras, "Three Dimensional Computer Vision", MIT Press, 1993

    Assignments

  • Assignment1/Solution
  • Programming Assignment1, Due February 11, 2003.
  • Assignment2/Solution
  • Programming Assignment2, Due March 06, 2003.
  • Assignment3, Due March 25, 2003.
  • Programming Assignment3, Due March 25, 2003.
  • Assignment4, Due April 03, 2003.
  • Programming Assignment4, Due April 17, 2003.
  • Assignment5/Solution

    Data Files

  • Test files for Programming Assignment #1, BMP Format/Binary PGM Format
  • Test files for Canny Edge Detector, BMP Format/Binary PGM Format
  • Test files for Hough Transform, BMP Format/Binary PGM Format
  • Test files for Single Line Fitting, BMP Format/Binary PGM Format
  • Reading material for Assignment #3 and Programming Assignment #3, Papers
  • Test files for Snake Algorithm, BMP Format/Binary PGM Format
  • Test files for Thinning Algorithms, BMP Format/Binary PGM Format
  • Test files for Lucas-Kanade Optical Flow, BMP Format/Binary PGM Format
  • Test files for Affine Warping, BMP Format/Binary PGM Format
  • Test files for Global Flow Estimation, BMP Format/Binary PGM Format

    Important Dates/Deadlines

    February 20, Mid-term Exam
    February 28, Withdrawal Deadline
    March 17-23, Spring Break
    April 17, Last Class Meeting
    April 24, Final Exam (19:00-21:45)


    Leading Journals and Conferences in Computer Vision

  • International Journal of Computer Vision (IJCV)
  • IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI)
  • IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR)
  • International Conference on Computer Vision (ICCV)

    Some Other Links

  • UCF Computer Vision Lab Home Page
  • Computer Vision Home Page
  • CAP5415 Computer Vision Spring 2002
  • CAP5415 Computer Vision Spring 2001

Last modified April 07, 2003