My Research

 

Haplotype Inference

I am currently working on improved algorithms for haplotype inference and block partitioning, with a focus on algorithms based on the perfect phylogeny model. The perfect phylogeny approach is relevant because of the block structure of the human genome: within a haplotype block there is little evidence of recombination, so the haplotypes in a block can be modeled as having evolved along a tree. My accomplishments in this area include the following:

·         I have developed a linear-time algorithm for the perfect phylogeny haplotyping (PPH) problem, which was introduced by Dr. Gusfield in 2002. The previously known algorithms for this problem had quadratic, O(nm^2), complexity (for n genotypes over m sites). The implementation of my opph algorithm is available from here. The PDF of my paper on this problem (published in JCB) is available here.

·         I have developed algorithms for constructing near-perfect phylogenies with multiple homoplasy events.

·         I have developed algorithms for perfect phylogeny haplotyping with missing data.
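As background for the perfect phylogeny model used in the items above, the following is a minimal sketch of the classical four-gamete test. It is not the linear-time PPH algorithm and it works on already-resolved binary haplotypes rather than genotypes. It relies on the standard fact that a set of binary haplotypes admits a perfect phylogeny (with no ancestral sequence fixed) exactly when no pair of sites shows all four combinations 00, 01, 10, and 11. The function name and the sample data are illustrative assumptions, not taken from the opph implementation.

```python
from itertools import combinations


def has_perfect_phylogeny(haplotypes):
    """Four-gamete test on binary haplotypes (hypothetical helper).

    `haplotypes` is a list of equal-length strings over '0' and '1',
    one string per haplotype and one character per site.
    Returns True if no pair of sites exhibits all four gametes.
    """
    if not haplotypes:
        return True
    n_sites = len(haplotypes[0])
    for i, j in combinations(range(n_sites), 2):
        # Collect the distinct (site i, site j) value pairs that occur.
        gametes = {(h[i], h[j]) for h in haplotypes}
        if len(gametes) == 4:
            return False  # sites i and j conflict: no perfect phylogeny
    return True


# Illustrative data only.
compatible = ["000", "100", "110", "001"]
conflicting = ["00", "01", "10", "11"]

print(has_perfect_phylogeny(compatible))
print(has_perfect_phylogeny(conflicting))
```

This pairwise check takes time quadratic in the number of sites, so it only illustrates the model. Solving the PPH problem additionally requires choosing how to resolve each heterozygous genotype site so that the resulting haplotypes pass such a test.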

 

Sequence analysis

 

Previously, I worked on sequence analysis. My work in this area covered several problems: designing oligos for microarrays, designing DNA expression vectors with certain properties, and discovering transcription factor binding sites.

·         I worked on the monad pattern finding problem. My paper on this problem can be obtained from my publications page.

·         I published a paper on codon optimization and CpG motif engineering for DNA expression vectors.

 

Data Compression and Pattern Matching

 

I worked on compressed-domain pattern matching on LZW-compressed text. Other topics I have worked on include the Burrows-Wheeler transform (BWT) and applications of suffix trees and suffix arrays.
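For readers unfamiliar with the BWT, the following is a minimal sketch of the textbook construction: sort all rotations of the text and read off the last column. It is a naive illustration, not code from my own work. Practical implementations build the transform from a suffix array instead. The function names and the "$" sentinel are assumptions made for the example.

```python
def bwt(text, sentinel="$"):
    """Naive Burrows-Wheeler transform via sorted rotations."""
    if sentinel in text:
        raise ValueError("sentinel must not occur in the text")
    s = text + sentinel
    # All cyclic rotations of s, in lexicographic order.
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    # The transform is the last character of each sorted rotation.
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(last_column, sentinel="$"):
    """Invert the transform by repeatedly prepending and re-sorting."""
    n = len(last_column)
    table = [""] * n
    for _ in range(n):
        table = sorted(last_column[i] + table[i] for i in range(n))
    # The original text is the row that ends with the sentinel.
    original = next(row for row in table if row.endswith(sentinel))
    return original[:-1]


transformed = bwt("banana")
print(transformed)
print(inverse_bwt(transformed))
```

The naive version sorts full rotations and is only suitable for short strings. It is meant to show why the transform groups equal characters together and why it is reversible.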