Source Code - Hao Cheng

Local and Global Structures Preserving Projection (LGSPP)

In dimensionality reduction, Principal Component Analysis (PCA) derives a set of axes along which the data exhibit greater variance than along others, and it mainly preserves the global structure of the data. Locality Preserving Projection (LPP) encodes neighborhood information in a similarity matrix and derives a linear manifold embedding as the optimal approximation to this local structure. However, neither of them handles both local and global structures in a systematic way. LGSPP is proposed to address this problem: it minimizes the distances between the points in each local neighborhood while dispersing them far away from their corresponding remote points. The code is written in MATLAB and can be downloaded from this link: [TAR].

Described in Hao Cheng, Kien A. Hua, Khanh Vu. "Local and Global Structures Preserving Projection". In 19th IEEE International Conference on Tools with Artificial Intelligence (ICTAI '07), Patras, Greece, October 2007. [IEEE] [PPT]
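The trade-off behind LGSPP can be illustrated with a small sketch. This is not the released MATLAB code and not the formulation from the paper: it only evaluates, for a given projection direction, how tightly local neighborhoods are packed and how widely remote points are spread. The function names are hypothetical, and taking the "remote points" of a point to be its k farthest points is an assumption made for illustration.

```python
import math


def project(point, w):
    """Project a point onto direction w (a 1-D linear projection)."""
    return sum(p * c for p, c in zip(point, w))


def neighbor_and_remote_sets(points, k):
    """For each point, return the indices of its k nearest and k farthest points.

    Assumption: "remote points" are simply the k farthest points.
    """
    sets = []
    for i, p in enumerate(points):
        order = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        sets.append((order[:k], order[-k:]))
    return sets


def local_and_remote_scatter(points, w, k):
    """Sum of squared projected distances to neighbors and to remote points."""
    local = remote = 0.0
    for i, (near, far) in enumerate(neighbor_and_remote_sets(points, k)):
        yi = project(points[i], w)
        local += sum((yi - project(points[j], w)) ** 2 for j in near)
        remote += sum((yi - project(points[j], w)) ** 2 for j in far)
    return local, remote


# Two small clusters separated along the x axis.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]

along_x = local_and_remote_scatter(points, (1.0, 0.0), k=1)
along_y = local_and_remote_scatter(points, (0.0, 1.0), k=1)
```

On this toy data the x direction gives a local scatter of 0 and a remote scatter of 400, while the y direction gives 4 for both, so a projection onto x keeps neighbors together and pushes remote points apart. A full method would solve for the projection rather than score a hand-picked direction.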

SubSpace Projection and Dimension Partitions

Recent dimensionality reduction techniques, such as Piecewise Aggregate Approximation (PAA), Segmented Means (SMEAN) and Mean-Standard deviation (MS), have proved very effective in reducing data dimensionality by partitioning dimensions into subsets and extracting aggregate values from each dimension subset. These partition-based techniques have many advantages, including very efficient multi-phased approximation, while being simple to implement. They are not, however, adaptive to the different characteristics of data in diverse applications, because the partitioning is fixed. The proposed SubSpace Projection (SSP) is a unified framework for these partition-based techniques. Accordingly, a greedy algorithm is designed to efficiently determine a good partitioning of the data dimensions in order to achieve robust performance in similarity search. The code is written in MATLAB/C and can be downloaded from this link: [ZIP].

Described in Hao Cheng, Khanh Vu, Kien A. Hua. "SubSpace Projection: A Unified Framework for a Class of Partition-based Dimension Reduction Techniques". To appear in Information Sciences (INS), Elsevier. [Elsevier] [REPORT]
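The partition-based reduction described above can be sketched in a few lines. This is an illustration only, not the released MATLAB/C code and not the greedy SSP algorithm: it shows how a vector is reduced to one mean per dimension subset, with PAA as the special case of contiguous, equal-width subsets. The function names and the sample data are hypothetical.

```python
from statistics import fmean


def partition_means(vector, partitions):
    """Reduce `vector` to one mean per dimension subset.

    `partitions` is a list of index lists; each index list is one subset.
    """
    return [fmean(vector[i] for i in subset) for subset in partitions]


def equal_partitions(n_dims, n_segments):
    """Fixed, contiguous, equal-width partitioning, as PAA uses.

    Assumes n_dims is divisible by n_segments to keep the sketch short.
    """
    if n_dims % n_segments != 0:
        raise ValueError("n_dims must be divisible by n_segments")
    width = n_dims // n_segments
    return [list(range(s * width, (s + 1) * width)) for s in range(n_segments)]


series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

# PAA-style reduction: three contiguous segments of two dimensions each.
paa = partition_means(series, equal_partitions(len(series), 3))

# A non-contiguous partitioning of the same dimensions, chosen by hand here.
custom = partition_means(series, [[0, 5], [1, 2], [3, 4]])
```

The second call is where an adaptive scheme would differ from PAA: the subsets are no longer fixed in advance, so a search procedure can pick the grouping that suits the data.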

Constrained Clustering

To be released soon.

Content-Based Image Retrieval Software

The software consists of client-side and server-side programs that allow users to browse and query an image dataset. The client submits query requests (the selected retrieval algorithm, its parameters, and positive/negative samples) to the server over a socket connection. The server takes the request and creates a MATLAB engine instance via the application program interface. The MATLAB script corresponding to the query parameters is then loaded, the query is executed, and the result images are returned and displayed on the client side. New retrieval algorithms can be added by simply writing new functions in the MATLAB script. The server side was written in C (Visual Studio 2003) and the client side in Object Pascal. The code can be downloaded from the link: [ZIP].
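The "add a function to add an algorithm" design can be sketched as a dispatch table. This is a hypothetical Python illustration, not the actual C server or MATLAB scripts: the request fields, the registry, and the sample ranking function are all assumptions, and the socket transport is left out.

```python
import math

# Registry mapping the algorithm name sent by a client to a function.
RETRIEVAL_ALGORITHMS = {}


def retrieval_algorithm(name):
    """Register a function under the name a client would send in its request."""
    def register(func):
        RETRIEVAL_ALGORITHMS[name] = func
        return func
    return register


@retrieval_algorithm("nearest_to_positives")
def nearest_to_positives(features, positives, negatives, top_k=2):
    """Rank unlabeled images by distance to the centroid of the positive samples."""
    dims = len(next(iter(features.values())))
    centroid = [
        sum(features[p][d] for p in positives) / len(positives)
        for d in range(dims)
    ]
    candidates = [i for i in features if i not in positives and i not in negatives]
    return sorted(candidates, key=lambda i: math.dist(features[i], centroid))[:top_k]


def handle_request(request, features):
    """Look up the requested algorithm and run it with the request's parameters."""
    func = RETRIEVAL_ALGORITHMS[request["algorithm"]]
    return func(
        features,
        request["positives"],
        request["negatives"],
        **request.get("params", {}),
    )


# Hypothetical feature vectors and query request.
features = {"a": (0.0, 0.0), "b": (1.0, 0.0), "c": (5.0, 5.0), "d": (0.5, 0.2)}
request = {
    "algorithm": "nearest_to_positives",
    "positives": ["a"],
    "negatives": ["c"],
    "params": {"top_k": 1},
}
result = handle_request(request, features)
```

With this layout, supporting another retrieval method only requires registering one more function; the request handling code stays unchanged.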

Self-Similar, Chaos and Fractal MATLAB Toolbox

This toolbox was written in 2004 for my undergraduate thesis and a seminar class project. It provides functions to compute mutual information, false nearest neighbors, correlation dimension, embedding dimension, correlation integral, Lyapunov exponent, Hurst exponent and fractal curves. It is available at the link: [ZIP]. By the way, it was great fun to find the slides of my undergraduate thesis and seminar (both in Chinese) after so many years.
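As an illustration of one of the quantities listed above, here is a minimal sketch of the correlation integral computed on delay-embedded vectors. It is written in Python rather than MATLAB and is not taken from the toolbox; the function names are hypothetical, and the definition used is the standard fraction of distinct vector pairs closer than a given radius.

```python
import math
from itertools import combinations


def delay_embed(series, dim, tau):
    """Build delay vectors (x[t], x[t + tau], ..., x[t + (dim - 1) * tau])."""
    n_vectors = len(series) - (dim - 1) * tau
    return [
        tuple(series[t + d * tau] for d in range(dim))
        for t in range(n_vectors)
    ]


def correlation_integral(vectors, radius):
    """Fraction of distinct vector pairs closer than `radius` (Euclidean)."""
    if len(vectors) < 2:
        raise ValueError("need at least two vectors")
    pairs = list(combinations(vectors, 2))
    close = sum(1 for a, b in pairs if math.dist(a, b) < radius)
    return close / len(pairs)


series = [0.0, 1.0, 2.0, 3.0]

# Embedding dimension 1: 3 of the 6 pairs are closer than 1.5.
c1 = correlation_integral(delay_embed(series, dim=1, tau=1), radius=1.5)

# Embedding dimension 2: 2 of the 3 pairs are closer than 1.5.
c2 = correlation_integral(delay_embed(series, dim=2, tau=1), radius=1.5)
```

The correlation dimension is usually estimated from the slope of log C(r) against log r over a range of radii, which this sketch does not attempt.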


Manage Call for Papers of Conferences

I use this code to maintain a list of conference calls for papers; see my list. The CFPs are stored in an XML file, and the code is written in Perl, HTML and XML and is available at the link: [TAR].

List: Personal Resource Management

I wrote this program several years ago and have been using it ever since to manage resources I found and downloaded from the Internet, especially the papers I read. It has indeed saved me a lot of trouble, and I call it `list'. In list, all items are saved in a Microsoft Access database and can be classified into multiple categories, and flexible search is supported. The code was written in Object Pascal (Borland Delphi 6) and is now available at the link: [ZIP].

Search Engine over FTP Servers

This program was written solely for fun in 2003, when I was an undergraduate at USTC. There were a number of FTP servers on campus and a lot of resources were available on them, so I coded an FTP search engine. The robot crawled around 150 sites and built a local directory image of the files on each server. Each directory image was compressed with the zlib compression library and stored in a file, which saves a significant amount of disk I/O. For a submitted keyword, decompression and string matching were executed on the fly over all the image files. The robot and the back end were coded in Delphi 6 and the front end was in CGI. The code can be downloaded from the link: [ZIP].
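The compress-then-scan idea can be sketched with Python's standard zlib module. This is not the Delphi code: the layout of a directory image (one path per line), the function names, and the host names are assumptions made for illustration.

```python
import zlib


def compress_listing(paths):
    """Store one server's file listing as zlib-compressed, newline-separated text."""
    return zlib.compress("\n".join(paths).encode("utf-8"))


def search_listing(blob, keyword, chunk_size=4096):
    """Decompress `blob` in chunks and yield the paths that contain `keyword`.

    Matching is case-insensitive and happens while decompressing, so the
    full listing is never held in memory at once.
    """
    decompressor = zlib.decompressobj()
    needle = keyword.lower()
    pending = b""
    for start in range(0, len(blob), chunk_size):
        pending += decompressor.decompress(blob[start:start + chunk_size])
        # Keep the last, possibly incomplete, line for the next chunk.
        *lines, pending = pending.split(b"\n")
        for line in lines:
            text = line.decode("utf-8")
            if needle in text.lower():
                yield text
    pending += decompressor.flush()
    for line in pending.split(b"\n"):
        text = line.decode("utf-8")
        if text and needle in text.lower():
            yield text


# Hypothetical directory images for two servers.
images = {
    "ftp.example.edu": compress_listing(["/pub/music/song.mp3", "/pub/docs/Thesis.pdf"]),
    "ftp2.example.edu": compress_listing(["/incoming/thesis_slides.ppt"]),
}

hits = [
    (host, path)
    for host, blob in images.items()
    for path in search_listing(blob, "thesis")
]
```

Reading a small compressed file and scanning it during decompression trades some CPU time for less disk I/O, which matches the motivation given above.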


Updated by Hao Cheng.