<? echo $GLOBALS['title'] ?>

Research



Computer Vision

A Probabilistic Representation for Efficient Large Scale Visual Recognition Tasks

Abstract: In this paper, we present an efficient alternative to the traditional vocabulary based on bag-of-visual words (BoV) used for visual classification tasks. Our representation is both conceptually and computationally superior to the bag-of-visual words: (1) We iteratively generate a Maximum Likelihood estimate of an image given a set of characteristic features, in contrast to the BoV methods where an image is represented as a histogram of visual words, (2) We randomly sample a set of characteristic features instead of employing the computation-intensive clustering algorithms used during the vocabulary generation step of BoV methods. Our performance is comparable to the state of the art in experiments on a challenging scene categorization dataset and two equally challenging human action datasets, which demonstrates the universal applicability of our method. The camera-ready version is here. The code and data used in the experiments discussed in the paper will be uploaded shortly.
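The contrast drawn in the abstract above, random sampling versus clustering for vocabulary generation, can be illustrated with a small sketch. This is not the paper's implementation: the function names, the toy 2D descriptors, and the seed are hypothetical, and the encoding shown is the baseline BoV histogram rather than the iterative Maximum Likelihood estimate, whose details the abstract does not give.

```python
import random


def sample_vocabulary(descriptors, k, seed=0):
    """Pick k descriptors uniformly at random; no clustering is run."""
    rng = random.Random(seed)
    return rng.sample(descriptors, k)


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def bov_histogram(image_descriptors, vocabulary):
    """Baseline BoV encoding: normalized counts of nearest vocabulary entries."""
    counts = [0] * len(vocabulary)
    for d in image_descriptors:
        nearest = min(
            range(len(vocabulary)),
            key=lambda i: squared_distance(d, vocabulary[i]),
        )
        counts[nearest] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else counts


# Toy 2D "descriptors" (hypothetical data, for illustration only).
pool = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.9, 1.1), (5.0, 5.0)]
vocab = sample_vocabulary(pool, k=2)
hist = bov_histogram(pool, vocab)
```

Sampling costs a single pass over the descriptor pool, whereas clustering such as k-means repeatedly reassigns every descriptor, which is the computational saving the abstract refers to.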

Video Monitoring of Honey Bees

Abstract: (Coming soon!)

TRECVID MED (Multimedia Event Detection) 2010

Abstract: TRECVID Multimedia Event Detection offers an interesting but very challenging task in detecting high-level complex events (batting baseball run, making cake, assembling shelter) in user-generated videos. In this paper, we will present an overview and comparative analysis of our results, which achieved top performance among all 45 submissions in TRECVID 2010. Our aim is to answer the following questions. What kind of feature is more effective for multimedia event detection? Are features from different feature modalities (e.g., audio and visual) complementary for event detection? Can we benefit from generic concept detection of background scenes, human actions, and audio concepts? Are sequence matching and event-specific object detectors critical? Our findings indicate that spatial-temporal features are very effective for event detection, and they are also very complementary to other features such as static SIFT and audio features. As a result, our baseline run combining these three features already achieves very impressive results, with a mean minimal normalized cost (MNC) of 0.586. Incorporating the generic concept detectors using a graph diffusion algorithm provides marginal gains (mean MNC 0.579). Sequence matching with Earth Mover's Distance (EMD) further improves the results (mean MNC 0.565). The event-specific detector ("batter"), however, didn't prove useful in our current re-ranking tests. We conclude that it is important to combine strong complementary features from multiple modalities for multimedia event detection, and cross-frame matching is helpful in coping with temporal order variation. Leveraging contextual concept detectors and foreground activities remains a very attractive direction requiring further research. This is a joint effort between Columbia University and UCF which culminated in the best performance in the Multimedia Event Detection 2010 challenge. A notebook paper is available here.
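For readers unfamiliar with the Earth Mover's Distance mentioned above, the following is a minimal sketch of its simplest special case: two 1D histograms of equal total mass, with ground distance |i - j| between bins. In that case the EMD equals the sum of absolute differences between the cumulative sums. The function name is hypothetical, and the notebook paper's sequence matching operates on video frames with its own features and ground distance, which are not reproduced here.

```python
from itertools import accumulate


def emd_1d(p, q):
    """EMD between two equal-mass 1D histograms with ground distance |i - j|."""
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    if abs(sum(p) - sum(q)) > 1e-9:
        raise ValueError("histograms must have the same total mass")
    # The mass that must cross the boundary after bin i is the difference
    # between the two cumulative sums at i; summing those gives the total work.
    return sum(abs(a - b) for a, b in zip(accumulate(p), accumulate(q)))


# Moving all mass from the first bin to the last of three bins costs 2.
distance = emd_1d([1, 0, 0], [0, 0, 1])
```

Unlike a bin-by-bin comparison, this distance grows with how far mass has to travel, which is why EMD tolerates shifts in temporal order better than a direct frame-to-frame match.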

A Framework for Photo-Quality Assessment and Enhancement based on Visual Aesthetics

Abstract: We present an interactive application that enables users to improve the visual aesthetics of their digital photographs using spatial recomposition. Unlike earlier work that focuses either on photo quality assessment or interactive tools for photo editing, we enable the user to make informed decisions about improving the composition of a photograph and to implement them in a coherent framework. Specifically, the user can interactively select a foreground object and the system will present recommendations for where it can be moved in a manner that optimizes a learned aesthetic metric while obeying semantic constraints. For photographic compositions that lack a distinct foreground object, our tool provides the user with cropping or expanding recommendations that improve its aesthetic quality. We learn a support vector regression model for capturing image aesthetics from user data and seek to optimize this metric during recomposition. Rather than prescribing a fully-automated solution, we allow user-guided object segmentation and inpainting to ensure that the final photograph matches the user's criteria. Our approach achieves 86% accuracy in predicting the attractiveness of unrated images, when compared to their respective human rankings. Additionally, 73% of the images recomposited using our tool are ranked more attractive than their original counterparts by human raters.

This work was accepted at the ACM Multimedia International Conference (ACMMM 2010), held in Firenze, Italy, as a 10-page paper (17% acceptance rate). Here is an accompanying talk. A subset of the images from the dataset mentioned in the paper is available here. We received some objections from Flickr users to making their images publicly available for experiments, hence the full dataset was taken down. The code provided in the archive is unsupported.

Moving Object Detection and Tracking in Forward Looking Infra-Red Aerial Imagery

Abstract: This chapter discusses the challenges of automating surveillance and reconnaissance tasks for infra-red visual data obtained from aerial platforms. These problems have gained significant importance over the years, especially with the advent of lightweight and reliable imaging devices. Detection and tracking of objects of interest has traditionally been an area of interest in the computer vision literature. These tasks are rendered especially challenging in aerial sequences of infra-red modality. The chapter gives an overview of these problems and the associated limitations of some of the conventional techniques typically employed for these applications. We begin with a study of various image registration techniques that are required to eliminate the motion induced by the movement of the aerial sensor. Next, we present a technique for detecting moving objects from the ego-motion compensated input sequence. Finally, we describe a methodology for tracking already detected objects using their motion history. We substantiate our claims with results on a wide range of aerial video sequences.

This work is published as a chapter in the Springer book Machine Vision Beyond Visible Spectrum.
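The detection step described in the chapter abstract above works on frames that have already been compensated for sensor motion. A minimal sketch of one conventional way to do this, plain frame differencing, is shown below. It is a stand-in for illustration and not the chapter's technique: the function names, the threshold value, and the toy frames are hypothetical, and the frames are assumed to be registered already.

```python
def detect_moving_pixels(prev_frame, curr_frame, threshold=25):
    """Return a binary mask of pixels whose intensity changed by more than threshold."""
    if len(prev_frame) != len(curr_frame):
        raise ValueError("frames must have the same number of rows")
    mask = []
    for prev_row, curr_row in zip(prev_frame, curr_frame):
        if len(prev_row) != len(curr_row):
            raise ValueError("frames must have the same number of columns")
        mask.append(
            [1 if abs(c - p) > threshold else 0 for p, c in zip(prev_row, curr_row)]
        )
    return mask


def bounding_box(mask):
    """Return (top, left, bottom, right) of the set pixels, or None if there are none."""
    coords = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    if not coords:
        return None
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return (min(rows), min(cols), max(rows), max(cols))


# Two registered 4x4 toy frames: a warm object appears in column 2, rows 1-2.
prev = [[10] * 4 for _ in range(4)]
curr = [row[:] for row in prev]
curr[1][2] = 200
curr[2][2] = 200
box = bounding_box(detect_moving_pixels(prev, curr))
```

Without the registration step, the camera's own motion would change almost every pixel between frames and the mask would be dominated by background, which is why the chapter treats registration first.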

Video on Demand (Before UCF)

A Case for Grid based Video on Demand System

Abstract: Video on Demand (VoD) services incorporate streaming of video over a network and allow subscribers to select videos and play them in near real-time playback quality, including interactive functions like Fast Forward, Rewind, Random seek, etc. VoD systems put a huge amount of overhead on the processing capabilities of the video server and need an equally huge amount of storage. These systems also demand a highly optimized network backbone for data transfer. Though a lot of research has been carried out in the area of VoD distribution and network optimization, the problems mentioned above have received only cursory attention. Recently, research in the high-performance computing community has led to the development of Grid Computing technologies for precisely the problems stated above. In this paper, we propose and develop the idea of integrating VoD servers with Grid computing and describe the system as a Grid based Video on Demand (GDVoD) system. A prototype of the GDVoD system has been developed and experiments have been carried out. The experiments highlight the fact that the GDVoD system has low overhead in terms of computation without compromising the quality of the streamed video.

The paper was submitted to HPDC 2006.


Systems Virtualization

Nova: An Approach to On-Demand Virtual Execution Environments for Grids

Abstract: This paper attempts to reduce the overheads of dynamically creating and destroying virtual environments for secure job execution. It proposes a grid architecture, which we call Nova, consisting of extremely small, pre-created virtual machines whose configurations can be altered to suit the applications executed within them. The benefits of the architecture are supported by experimental results.

This work was accepted as a short paper at CCGrid 2006.


Grid/High Performance Computing

Scalable and Distributed Mechanisms for Integrated Scheduling and Replication in Data Grids

Abstract: Data Grids seek to harness geographically distributed resources for large-scale data-intensive problems. Such problems involve loosely coupled jobs and large data sets distributed remotely. Data Grids have found applications in scientific research fields such as high-energy physics and life sciences, as well as in enterprises. The issues that need to be considered in the Data Grid research area include resource management for computation and data. Computation management comprises scheduling of jobs, scalability, and response time, while data management includes replication and movement of data at selected sites. As jobs are data intensive, data management issues often become integral to the problems of scheduling and effective resource management in Data Grids. Integration of data replication and scheduling strategies is important. Such integrated solutions are either non-existent or work in a centralized manner, which is not scalable. The paper deals with the problem of integrating the scheduling and replication strategies in a distributed manner. As part of the solution, we have proposed a Distributed Replication and Scheduling Strategy (DistReSS), which aims at an iterative improvement of performance based on the coupling between scheduling and replication, achieved in a distributed and hierarchical fashion. Results suggest that, in the context of our experiments, DistReSS performs comparably to the centralized approach when the parameters are tuned properly.

This work was accepted as a poster at CCGrid 2005.