CAP 6412 - Advanced Computer Vision

Fall 2009
TuTh 3:00PM - 4:15PM
CLI 119 (Classroom building 1, room 119) 

Credit hours: 3
Office hours: 4:30-5:30 PM Tuesdays, 2:00-3:00 PM Thursdays
Office Location: HEC 247

Instructor: Mubarak Shah, email: shah@cs.ucf.edu

Course Web Page:  http://www.cs.ucf.edu/courses/cap6412/



Course Goals:

To prepare students for graduate research in computer vision.

Course Description:

Review recent advances in computer vision. 

Required and Optional texts:

No textbook. 

Course Prerequisites:

CAP5415 or consent of instructor. 

Exam and Grading Policy:

Reports                      30%
Discussion and Attendance    20%
Homework                     10%
Programs/Project             40%
No exam



Class Policy:

The University Golden Rules will be observed in this class. Copying or plagiarism is a violation of the Golden Rules.

Some Tips on Reading Research Papers:

1. You have to read the paper several times to understand it. When you read the paper the first time, do not get stuck if you do not understand something; keep reading and assume you will figure it out later. When you read it the second time, you will understand much more, and the third time even more ...

2. First try to get a general idea of the paper: What problem is being solved? What are the main steps? How can I implement the method, even if I do not understand why each step is performed the way it is?

3. Try to relate the method to other methods you know, and conceptually find similarities and differences.

4. In the first reading, it may be a good idea to skip the related work. Since you do not know all the other papers, they will only confuse you more.

5. Do not use a dictionary just to look up technical terms like particle filter or maximum likelihood. They are concepts, and dictionaries do not define them; a dictionary will give you the literal meaning, which may not be useful.

6. Try to understand each concept in isolation, and then integrate them to understand the whole paper. For instance, the paper "Feature Integration with adaptive weights in a sequential Monte Carlo Tracker" looks quite complex at first, because it uses Monte Carlo methods, particle filters, likelihoods, etc. But try to understand the gist of it. The paper is about tracking, and you already know a few tracking methods. It uses features: color histograms, templates in correlation, shape, etc. You know these features, and you have used them. The probabilities obtained from each feature are combined (fused) to achieve tracking. How will you combine the probabilities or confidences of the features: multiply, add, apply a threshold and then add ...?
The particle filter/condensation method is already available in the Intel OpenCV library. Use it, get some idea of how it works and what the parameters are, then go back and read the paper again ... If you keep doing this for one week, you will understand a lot about that paper! The next week you do the second paper, and so on ...
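
As a minimal sketch of the fusion question raised in tip 6 (this is not the method of the paper), the three options mentioned there can be tried on made-up numbers. The function names, the likelihood values, the weights, and the threshold below are all assumptions chosen only for illustration.

```python
import numpy as np


def fuse_product(likelihoods):
    """Multiply per-feature likelihoods (treats the features as independent)."""
    return np.prod(np.asarray(likelihoods, dtype=float), axis=0)


def fuse_weighted_sum(likelihoods, weights):
    """Weighted sum of per-feature likelihoods; weights are normalized to sum to 1."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return w @ np.asarray(likelihoods, dtype=float)


def fuse_threshold_then_sum(likelihoods, threshold):
    """Discard likelihoods below the threshold, then add what is left."""
    lk = np.asarray(likelihoods, dtype=float)
    return np.where(lk >= threshold, lk, 0.0).sum(axis=0)


# Hypothetical data: rows are features (color, template, shape),
# columns are three candidate target locations.
likelihoods = np.array([
    [0.6, 0.3, 0.1],   # color histogram
    [0.5, 0.4, 0.1],   # template correlation
    [0.2, 0.7, 0.1],   # shape
])

product = fuse_product(likelihoods)
weighted = fuse_weighted_sum(likelihoods, weights=[2.0, 1.0, 1.0])
thresholded = fuse_threshold_then_sum(likelihoods, threshold=0.25)

print("product:       ", product, "best candidate:", int(np.argmax(product)))
print("weighted sum:  ", weighted, "best candidate:", int(np.argmax(weighted)))
print("threshold+sum: ", thresholded, "best candidate:", int(np.argmax(thresholded)))
```

With these numbers the product and the threshold-then-add rules both prefer the second candidate, while the weighted sum that trusts color most prefers the first. The fusion rule and its weights therefore change the tracking result, which is the design question the paper addresses.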

Lecture List:

1- August 25, 27: Lecture 1-3  Presenter: Dr. Mubarak Shah
        - Computing optical flow
        - Pyramids
        - Global Motion Compensation

Related papers:

2- September 2, 4: Lecture 4  Presenter: Jingen Liu
        - Bag of Words Approach

Related papers:

3- September 22: Lecture 5  Presenter: Dr. Mubarak Shah
        - Alignment

Related papers:

Paper List:

1- September 8-10: Object Tracking, Presenter: Dr. Mubarak Shah
Alper Yilmaz, Omar Javed, Mubarak Shah,  "Object Tracking: A Survey", ACM Computing Surveys  2006.

Related papers:

2- September 15: Photo Synthesis, Presenter: Omar Oreifej
Noah Snavely, Steven M. Seitz, Richard Szeliski  "Photo Tourism: Exploring Photo Collections in 3D", SIGGRAPH 2006.

Related papers:

3- September 17: Human Tracking, Presenter: Subhabrata Bhattacharya
S. Pellegrini, A. Ess, K. Schindler, L. van Gool, "You’ll Never Walk Alone: Modeling Social Behavior for Multi-target Tracking", ICCV 2009.

Related papers:

Dataset: ETH Walking Pedestrians (EWAP)

4- September 22: Alignment, Presenter: Dr. Mubarak Shah

Related papers:

5- September 24: Data Clustering, Presenter: Yang Yang
Anil K. Jain, "Data Clustering: 50 Years Beyond K-Means", Pattern Recognition Letters 2009.

Related papers:

6- September 29: Machine Recognition of Human Activities, Presenter: Hakan Boyraz
Pavan Turaga, Rama Chellappa, V. S. Subrahmanian, and Octavian Udrea, "Machine Recognition of Human Activities: A Survey", PAMI 2008.

Related papers:

7- October 1: Geo-spatial Aerial Video Processing, Presenter: Chris Huff
Sameer Agarwal, Noah Snavely, Ian Simon, Steven M. Seitz, Richard Szeliski, "Building Rome in a Day", CVPR 2009.

Related papers:

8- October 6: Image Geolocation, Presenter: Amir Roshan Zamir
Evangelos Kalogerakis, Olga Vesselova, James Hays, Alexei A. Efros, Aaron Hertzmann, "Image Sequence Geolocation with Human Travel Priors", ICCV 2009.

Related papers:

9- October 8: Support Vector Machines, Presenter: Adarsh Nagaraja
R. Berwick, "An Idiot’s guide to Support vector machines".

Related papers:

10- October 13: Discussion of Homework 1, Presenter: Ramin Mehran

Related papers:

11- October 15: Image Classification, Presenter: Kishore Reddy
Oren Boiman, Eli Shechtman, Michal Irani,  "In Defense of Nearest-Neighbor Based Image Classification", CVPR08.

Related papers:

12- October 27: Image Editing, Presenter: Devina Shiwlochan 
Yael Pritch, Eitam Kav-Venaki, Shmuel Peleg,  "Shift-Map Image Editing", ICCV09.

Related papers:

13- October 29: Human Detection, Presenter:  Berkan Solmaz
Xiaoyu Wang, Tony X. Han, Shuicheng Yan,  "An HOG-LBP Human Detector with Partial Occlusion Handling", ICCV09.

Related papers:

14- November 3: Video Annotation, Presenter:  Laura Norena
Jenny Yuen, Bryan Russell, Ce Liu, Antonio Torralba,  "LabelMe video: Building a Video Database with Human Annotations", ICCV09.

Related papers:

15- November 5: Data Clustering, Presenter: Maria Villarreal
Lior Shapira, Shai Avidan, Ariel Shamir,  "Mode-Detection via Median-Shift", ICCV09.

Related papers:

16- November 10: Weakly Supervised Clustering, Presenter:  Tyler Gomez
Oncel Tuzel, Fatih Porikli, Peter Meer,  "Kernel Methods for Weakly Supervised Mean Shift Clustering", ICCV09.

Related papers:

17- November 12: Human Computer Interaction, Presenter:  Guang Shu
Tilke Judd, Krista Ehinger, Fredo Durand, Antonio Torralba,  "Learning to Predict Where Humans Look", ICCV09.

Related papers:

18- November 24: Image Enhancement, Presenter:  Pramod Chakrapani
Amit Agrawal and Ramesh Raskar,  "Resolving Objects at Higher Resolution from a Single Motion-blurred Image", CVPR07.

Related papers:

19- December 1: Annotation of Human Actions, Presenter: Naveed Imran
Olivier Duchenne, Ivan Laptev, Josef Sivic, Francis Bach and Jean Ponce, "Automatic Annotation of Human Actions in Video", ICCV09.

Related papers:

20- December 3: Subspace Clustering, Presenter:  Leon F Guerrero
Ehsan Elhamifar, Rene Vidal,  "Sparse Subspace Clustering", CVPR09.

Related papers:

21- December x: Label Propagation, Presenter:  Haroon Idrees
Hong Cheng, Zicheng Liu, Jie Yang,  "Sparsity Induced Similarity Measure for Label Propagation", ICCV09.

Related papers:

22- December x: Uncertain Geometry, Presenter: Soumyabrata Dey
Joseph L. Mundy, Ozge C. Ozcanli, "Uncertain geometry: a new approach to modeling for recognition", Proceedings of the SPIE Automatic Target Recognition XIX 2009.

Related papers:

Potential Papers (ICCV09, CVPR 09, ECCV 09, SIGGRAPH 09, PAMI, and ...):

20- November (TBD) Content Based Video Retrieval, Presenter:
J. P. Collomosse, G. McNeill and Y. Qian,  "Storyboard sketches for content based video retrieval", ICCV09.

Related papers:


Announcements:

  1. Updated on 10/22/2009: Table of matrix/vector derivatives: Vector/Matrix Derivatives and Integrals (source 1) (source 2) (table 4.1) by James E. Gentle. Also look at Matrix Identities by Sam Roweis.
  2. Updated on 10/5/2009: 9 more potential papers are posted.
  3. Updated on 9/28/2009: The PDF of the Alignment lecture is updated and the new homework is posted on the homepage.
  4. Updated on 9/14/2009: Programming assignment 1 is introduced.
  5. Updated on 9/15/2009: The classroom for regular hours changed to CLI 119 (Classroom building 1, room 119).
  6. The dataset for Assignment 1 is on the course homepage under the Data Set and Code section.

Programming Assignments:

  1. Updated on 9/14/2009: Implement the paper "You’ll Never Walk Alone: Modeling Social Behavior for Multi-target Tracking" (ICCV 2009), which we will discuss on Thursday, September 17. This programming assignment is due on October 15. It will involve a lot of work, so you need to start on it as soon as possible. (Dataset)
  2. Implement the paper "An HOG-LBP Human Detector with Partial Occlusion Handling" (ICCV09). Due on December 1. (INRIA Human Dataset from local server) (from Dalal's webpage)

Data Set and Code:

  1. Test images for optical flow
  2. ETH Walking Pedestrians Dataset (EWAP) (needed for assignment 1)
  3. INRIA Human Dataset from local server or from Dalal's webpage (needed for assignment 2)


Homeworks:

  1. Homework due September 29
      1. Derive the linear system equation in Bergen’s method.
      2. Derive the equations for Mann’s method (weighted).
      3. Derive the equations for the Taylor series approximation of the projective transformation (pseudo-perspective).
      4. Derive the equations for Mann’s method (unweighted, pseudo-perspective, and bilinear).
-> Table of matrix/vector derivatives: Vector/Matrix Derivatives and Integrals (source 1) (source 2) (table 4.1) by James E. Gentle. Also look at Matrix Identities by Sam Roweis.

CAP 6412 | Department of Electrical Engineering and Computer Sciences | University of Central Florida
Copyright 2009 University of Central Florida