CAP 6412  - Advanced Computer Vision

 Fall 2008
TuTh 3:00PM - 4:15PM
  BA 221

Credit hours: 3
Office  hours: 4:30-5:30 PM Tuesdays, 2:00-3:00 PM Thursdays
Office Location: CSB 237

Instructor: Mubarak Shah, email: shah@cs.ucf.edu

Course Web Page:  http://www.cs.ucf.edu/courses/cap6412/



Course Goals:

To prepare students for graduate research in computer vision.

Course Description:

Review recent advances in computer vision. 

Required and Optional texts:

No textbook. 

Course Prerequisites:

CAP5415 or consent of instructor. 

Exam and Grading Policy:

Reports                                                  30%
Discussion and Attendance                     20%
Homework                                            10%
Programs/Project                                   40%
No exam



Class Policy:

The University Golden Rules will be observed in this class. Copying or plagiarism is a violation of the Golden Rules.

Some Tips on Reading Research Papers:

1. You have to read a paper several times to understand it. When you read it the first time, do not get stuck if you do not understand something; keep reading and assume you will figure it out later. When you read it the second time, you will understand much more, and the third time even more ...

2. First try to get a general idea of the paper: What problem is being solved? What are the main steps? How could I implement the method, even if I do not yet understand why each step is performed the way it is?

3. Try to relate the method to other methods you know, and conceptually find similarities and differences.

4. On the first reading it may be a good idea to skip the related work section; since you do not know all the other papers, it will only confuse you more.

5. Do not use a dictionary to look up technical terms like particle filter or maximum likelihood. These are concepts, and dictionaries do not define them; they give only literal meanings, which may not be useful.

6. Try to understand each concept in isolation, and then integrate them to understand the whole paper. For instance, the paper "Feature Integration with adaptive weights in a sequential Monte Carlo Tracker" looks quite complex at first, because it uses Monte Carlo methods, particle filters, likelihoods, etc. But try to understand the gist of it. The paper is about tracking, and you already know a few tracking methods. It uses features: color histograms, templates in correlation, shape, etc. You know these features, and you have used them. The probabilities obtained from each feature are combined (fused) to achieve tracking. How would you combine the probabilities or confidences of the features: multiply, add, apply a threshold and then add ...?
A particle filter/condensation method is already available in the Intel OpenCV library. Use it, get some idea of how it works and what the parameters are, then go back and read the paper again ... If you keep doing this for one week, you will understand a lot about that paper! The next week you do the second paper, and so on ...
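
To make the question in tip 6 concrete, here is a minimal sketch of three ways to fuse per-feature likelihoods into one particle weight: multiply, weighted add, and threshold then add. It is an illustration only, not the method of the paper; all function names, the likelihood values, and the fixed weights are hypothetical.

```python
def fuse_product(likelihoods):
    """Multiply per-feature likelihoods (treats the features as independent)."""
    result = 1.0
    for p in likelihoods:
        result *= p
    return result


def fuse_weighted_sum(likelihoods, weights):
    """Weighted sum of per-feature likelihoods; weights are assumed to sum to 1."""
    return sum(w * p for w, p in zip(weights, likelihoods))


def fuse_threshold_then_sum(likelihoods, threshold):
    """Drop features whose likelihood is below the threshold, then add the rest."""
    return sum(p for p in likelihoods if p >= threshold)


def normalize(scores):
    """Scale scores so they sum to 1; fall back to uniform if all are zero."""
    total = sum(scores)
    if total == 0:
        return [1.0 / len(scores)] * len(scores)
    return [s / total for s in scores]


# Hypothetical likelihoods for three particles (candidate object positions).
# Each row holds (color histogram, template correlation, shape) scores.
particles = [
    [0.9, 0.8, 0.7],
    [0.9, 0.1, 0.6],
    [0.2, 0.3, 0.1],
]

product_weights = normalize([fuse_product(p) for p in particles])
sum_weights = normalize([fuse_weighted_sum(p, [0.5, 0.3, 0.2]) for p in particles])
gated_weights = normalize([fuse_threshold_then_sum(p, 0.5) for p in particles])

print("product:  ", product_weights)
print("sum:      ", sum_weights)
print("threshold:", gated_weights)
```

Note how the product rule punishes the second particle for its single weak feature much more than the sum does. That difference is the kind of trade-off to look for when reading how a paper adapts its feature weights.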

Research Tip in MIT:


August 26: Lecture 1
Computer Vision Story and The Changing Shape of Computer Vision in Twenty First Century

Paper List:

August 28: Anomalous Event Detection, Presenter: Arslan Basharat
Arslan Basharat, Alexei Gritai, and Mubarak Shah, "Learning Object Motion Patterns for Anomaly Detection and Improved Object Detection", CVPR 2008.

Related papers:

September 2: Video Synopsis and Indexing, Presenter: Mikel D. Rodriguez Sullivan
Y. Pritch, A. Rav-Acha, S. Peleg, "Video Synopsis and Indexing", PAMI 2008.

Related papers:

September 4: Activity Recognition, Presenter: Jingen Liu 
H. Jiang, D. Martin, "Finding Actions Using Shape Flows", ECCV 2008.

Related papers:

I. Action recognition by body joint trajectories

II. Action recognition by silhouettes

III. Action recognition by interest parts

IV. Others

September 9: Image Warping, Presenter: Ramin Mehran
T. Leyvand, D. Cohen-Or, G. Dror, D. Lischinski, "Data-Driven Enhancement of Facial Attractiveness", SIGGRAPH 2008.

Related papers:

Related Videos and images:

September 11: Object Tracking, Presenter: Enrique G. Ortiz
Yanlin Guo, Steve Hsu, Harpreet S. Sawhney, Rakesh Kumar, and Ying Shan, "Robust Object Matching for Persistent Tracking with Heterogeneous Features", PAMI 2007.

Related Material:

Related Book and Toolbox:

September 16: Geo-spatial Aerial Video Processing, Presenter: Vladimir Reilly
Jiangjian Xiao, Hui Cheng, Feng Han, Harpreet Sawhney, "Geo-spatial Aerial Video Processing for Scene Understanding and Object Tracking", CVPR 2008.

Related papers:

Book Chapter:

                              Geo Registration:

September 18: Geographic Information from Images, Presenter: Janaka Liyanage
James Hays, and Alexei A. Efros, "IM2GPS: Estimating Geographic Information from a Single Image", CVPR 2008.

Related papers:
Data on the web:

September 23: Object Localization, Presenter: Pavel Babenko
Christoph H. Lampert, Matthew B. Blaschko, Thomas Hofmann, "Beyond Sliding Windows: Object Localization by Efficient Subwindow Search", CVPR 2008.

Related papers:

September 25: 3D Pose Refinement, Presenter: Alexandre Bassel
P. Lagger, M. Salzmann, V. Lepetit, and P. Fua, "3D Pose Refinement from Reflections", CVPR 2008

Related papers:

Related Material:

September 30: Recursive GMM, Presenter: Janaka Liyanage
Zoran Zivkovic and Ferdinand van der Heijden, "Recursive unsupervised learning of finite mixture models", PAMI 2004.

Related papers

October 2: Crowd Segmentation, Presenter: Ramin Mehran
P. Tu, T. Sebastian, G. Doretto, N. Krahnstoever, J. Rittscher, and T. Yu, "Unified Crowd Segmentation", ECCV 2008.

Related papers

            October 7th and 9th: Presentations for Assignment 1

October 16: Detection and Tracking, Presenter: Alexandre Bassel
Mykhaylo Andriluka, Stefan Roth, Bernt Schiele, "People-Tracking-by-Detection and People-Detection-by-Tracking", CVPR 2008.

Related papers

October 17: Face Alignment, Presenter: Enrique G. Ortiz
H. Wu, X. Liu, G. Doretto. "Face Alignment using Boosted Ranking Models." In Proc. of  IEEE CVPR, 2008

Related papers

October 21: Actions from Movies, Presenter: Mikel D. Rodriguez Sullivan (Link to Author's Presentations)
Ivan Laptev, Marcin Marszałek, Cordelia Schmid, Benjamin Rozenfeld, "Learning realistic human actions from movies", CVPR 2008.

Related papers:
Related material:

October 28: Vision Context, Presenter: Vladimir Reilly
Zhuowen Tu, "Auto-context and Its Application to High-level Vision Tasks" , In Proc. of  IEEE CVPR, 2008

Related papers

October 30: Image Descriptor, Presenter: Pavel Babenko
Engin Tola, Vincent Lepetit, Pascal Fua, "A Fast Local Descriptor for Dense Matching", In Proc. of  IEEE CVPR, 2008

Related papers

            November 4th: Presentations for Assignment 2 (group 2)

November 6: Link Analysis, Presenter: Ramin Mehran
Gunhee Kim, Christos Faloutsos, Martial Hebert, "Unsupervised Modeling of Object Categories Using Link Analysis Techniques", In Proc. of  IEEE CVPR, 2008

Related papers

November 13: Object Category Detection, Presenter: Dr. Rahul Sukthankar    UCF Vision Class Guest Lecture
L. Yang, R. Jin, R. Sukthankar, F. Jurie. "Unifying Discriminative Visual Codebook Generation with Classifier Training for Object Category Recognition", In Proc. of  IEEE CVPR, 2008

Related papers

November 13: Levenberg-Marquardt, Presenter: Dr. Mubarak Shah   Course Lecture
Levenberg-Marquardt and Szeliski  Registration Method

Related papers

November 13: Kalman Filter, Presenter: Dr. Mubarak Shah   Course Lecture
Kalman Filter

Related papers

Potential Papers (CVPR 08, ECCV 08, SIGGRAPH 08):

(a) CVPR 2008
  1. Large-scale manifold learning
  2. Single-image Vignetting correction using radial gradient symmetry
  3. Epitomic Location Recognition
  4. 3D Pose Refinement from Reflections
  5. Viewpoint-Independent Object Class Detection using 3D Feature Maps
  6. Auto-Context and Its Application to High-level Vision Tasks
  7. Learning realistic human actions from movies
  8. People-Tracking-by-Detection and People-Detection-by-Tracking
  9. Motion blur identification from image gradients
  10. Who killed the directed model?
  11. Fast Image Search for Learned Metrics
  12. Segmentation by transduction
  13. Semantic texton forests for image categorization and segmentation
  14. Robust Dual Motion Deblurring
  15. Unsupervised Modeling of Object Categories Using Link Analysis Techniques
  16. Multi-Object Shape Estimation and Tracking from Silhouette Cues
  17. Partitioning of Image Datasets using Discriminative Context Information
  18. Beyond Sliding Windows: Object Localization by Efficient Subwindow Search
  19. Globally Optimal Bilinear Programming for Computer Vision Applications
  20. Background Subtraction in Highly Dynamic Scenes
  21. Face Alignment via Boosted Ranking Model
  22. Directions of Egomotion from Antipodal Points
  23. A Unified Framework for Generalized Linear Discriminant Analysis
  24. Transductive Object Cutout
  25. Taylor Expansion Based Classifier Adaptation: Application to Person Detection
  26. Constant Time O(1) Bilateral Filtering
  27. Demosaicing by Smoothing along 1D Features
  28. A Fast Local Descriptor for Dense Matching
  29. Kernel Integral Images: A Framework for Fast Non-Uniform Filtering
  30. Manifold-Manifold Distance with Application to Face Recognition based on Image Set
  31. Summarizing Visual Data Using Bidirectional Similarity
  32. Unifying Discriminative Visual Codebook Generation with Classifier Training for Object Category Recognition
  33. Human-Assisted Motion Annotation
  34. Directional Independent Component Analysis with Tensor Representation
  35. Re-weighting Linear Discrimination Analysis under Ranking Loss
  36. From Appearance to Context-Based Recognition: Dense Labeling in Small Images
  37. The patch transform and its applications to image editing
(b) ECCV 2008

(c) SIGGRAPH 2008

Programming Assignments:

  1. Implement "Data-Driven Enhancement of Facial Attractiveness". Due September 30
    1. Active Shape Model Code and instructions (Download)
    2. Dataset of faces with rating from Mikel Rodriguez 
  2. Work out the details of Assignment 1. Due Oct 26
    1. Experiment with the ASM model. Train the model using roughly half of the 150 annotated face images, and test face feature detection on the remaining half. Summarize the results, comment on the quality of results, difficulties, failures…
    2. Experiment with triangulation of the 150 annotated face images. Keep the order of the face points the same. Summarize the results, comment on the quality of results, difficulties, failures…
    3. Train an SVR (non-linear) for computing the beauty score of a face image. Train the SVR on some of the annotated face images and test on the remaining images. Summarize the results, comment on the quality of results, difficulties, failures…
    4. Test the LM algorithm (non-linear optimization) to estimate the optimal distances of vertices for the beautification process. Study the LM iterations, the effect of the initial estimate, the number of iterations to converge, the error...
    5. Study the warping method from MIT and summarize its main steps. Apply the warping to all images. Summarize the results, comment on the quality of results, difficulties, failures…
  3. Implement the paper: Zhuowen Tu, "Auto-context and Its Application to High-level Vision Tasks", In Proc. of IEEE CVPR, 2008. Due Last Day of Classes
  • demonstrate on:
  1. Weizmann data set for horses
  2. Human body configuration
  3. MSRC Scene parsing/labeling 
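
For the LM experiments in Assignment 2 (item 4), the following is a minimal Levenberg-Marquardt sketch for getting a feel for the iterations, the damping factor, and the effect of the initial estimate. It is not the optimization from the facial attractiveness paper: the toy problem (fitting y = a * exp(b * x) to noise-free synthetic data), the function names, and the damping schedule (divide or multiply lambda by 10) are all assumptions chosen for illustration.

```python
import math


def levenberg_marquardt(residuals, jacobian, params, max_iters=100, lam=1e-3, tol=1e-18):
    """Minimize the sum of squared residuals over two parameters (a, b).

    Returns ((a, b), history), where history holds the squared error at the
    start and after every accepted step.
    """
    a, b = params
    error = sum(r * r for r in residuals(a, b))
    history = [error]
    for _ in range(max_iters):
        if error < tol:
            break
        r = residuals(a, b)
        J = jacobian(a, b)  # one (dr/da, dr/db) pair per data point
        # Entries of J^T J and of the gradient J^T r.
        jaa = sum(ja * ja for ja, _ in J)
        jab = sum(ja * jb for ja, jb in J)
        jbb = sum(jb * jb for _, jb in J)
        ga = sum(ja * ri for (ja, _), ri in zip(J, r))
        gb = sum(jb * ri for (_, jb), ri in zip(J, r))
        # Damped normal equations: (J^T J + lam * diag(J^T J)) delta = -J^T r
        m00 = jaa * (1.0 + lam)
        m11 = jbb * (1.0 + lam)
        det = m00 * m11 - jab * jab
        if det == 0.0:
            break  # singular system, e.g. when starting from a = 0
        da = (-ga * m11 + gb * jab) / det
        db = (-gb * m00 + ga * jab) / det
        new_error = sum(ri * ri for ri in residuals(a + da, b + db))
        if new_error < error:
            # Step helped: accept it and trust the Gauss-Newton direction more.
            a, b, error = a + da, b + db, new_error
            history.append(error)
            lam /= 10.0
        else:
            # Step hurt: reject it and move toward small gradient-descent steps.
            lam *= 10.0
    return (a, b), history


# Synthetic data from y = 2 * exp(0.5 * x); the true parameters are (2, 0.5).
xs = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [2.0 * math.exp(0.5 * x) for x in xs]


def residuals(a, b):
    return [a * math.exp(b * x) - y for x, y in zip(xs, ys)]


def jacobian(a, b):
    return [(math.exp(b * x), a * x * math.exp(b * x)) for x in xs]


(a_fit, b_fit), history = levenberg_marquardt(residuals, jacobian, (1.0, 0.0))
print("estimate:", a_fit, b_fit)
print("accepted steps:", len(history) - 1)
```

Rerunning with a different starting point, for example `(5.0, -1.0)`, and comparing the length and shape of `history` is one simple way to study the effect of the initial estimate.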

Data Set and Code:


HW 1:  From "Geo-spatial Aerial Video Processing for Scene Understanding and Object Tracking", CVPR 2008. (Due Tuesday 9/23/2008)

  1. Define Bundle Adjustment.
  2. What is radial distortion, and how is it removed using eq. (1)?
  3. Describe how the Kalman filter is used to refine the parameters.

HW 2:  From  "A Fast Local Descriptor for Dense Matching", CVPR 2008. (Due Thursday 12/03/2008)

  1. Explain the role of EM in the paper.

CAP 6412 | Department of Electrical Engineering and Computer Sciences | University of Central Florida
Copyright 2008 University of Central Florida