Xiaoman Shawn Li

Associate Professor

Burnett School of Biomedical Sciences
Department of Computer Science

University of Central Florida

Office: HEC210

Email: xiaoman@mail.ucf.edu

Telephone: 407-823-4811

Fax: 407-823-5835



Our lab is interested in solving statistical and algorithmic problems in computational biology. The appeal of this area is that the work has a direct impact on real-world problems, and that statistics and algorithms genuinely matter for mining biological data. In addition, biological problems challenge current statistical and algorithmic methods and provide a great opportunity to advance existing computational methods and create new ones. We were one of the international groups funded by NIH to develop technology for the ENCODE projects. Currently, our lab is focusing on the following classification and data integration problems: (1) transcription factor binding site prediction; (2) enhancer target gene prediction; (3) metagenomics.

Announcement: We welcome undergraduate and graduate students with excellent programming skills to join the lab. Highly motivated students interested in machine learning, data mining, and/or bioinformatics are preferred. Please email your CV to xiaoman@mail.ucf.edu.


Publications

Talukder A, Barham C, Li X, Hu H. Interpretation of deep learning in genomics and epigenomics. Briefings in Bioinformatics. 2021; 22(3): bbaa177.

Cha M, Zheng H, Talukder A, Barham C, Li X, Hu H. A two-stream convolutional neural network for microRNA transcription start site feature integration and identification. Scientific Reports. 2021; 11(1):1-13.

Talukder A, Hu H, Li X. An intriguing characteristic of enhancer-promoter interactions. BMC Genomics. 2021; 22:163.

Wang S, Talukder A, Cha M, Li X, Hu H. Computational annotation of miRNA transcription start sites. Briefings in Bioinformatics. 2021; 22(1):380-392.

Zheng H, Li X, Hu H. Deep Learning to Identify Transcription Start Sites from CAGE Data. 2020 IEEE International Conference on Bioinformatics and Biomedicine (BIBM). 2020.

Li X, Hu H, Li X. mixtureS: a novel tool for bacterial strain reconstruction from reads. Bioinformatics. 2020; 37(4): 575-577.

Wang S, Hu H, Li X. Shared distal regulatory regions may contribute to the coordinated expression of human ribosomal protein genes. Genomics. 2020; 112(4): 2886-2893.

Talukder A, Li X, Hu H. Position-wise binding preference is important for miRNA target site prediction. Bioinformatics. 2020; 36(12): 3680-3686.

Li X, Saadat S, Hu H, Li X. BHap: a novel approach for bacterial haplotype reconstruction. Bioinformatics. 2019; 35(22): 4624-4631.

Talukder A, Saadat S, Li X, Hu H. EPIP: A novel approach for condition-specific enhancer-promoter interaction prediction. Bioinformatics. 2019; 35(20): 3877-3883.

Divoux A, Sandor K, Bojcsuk D, Talukder A, Li X, Balint BL, Osborne TF, Smith S. Differential open chromatin profile and transcriptomic signature define depot-specific human subcutaneous preadipocytes: primary outcomes. Clinical Epigenetics. 2018; 10(1): 148.

Li X, Naser S, Khaled A, Hu H, Li X. When old metagenomic data meet newly sequenced genomes, a case study. PLoS One. 2018; 13(6): e0198773.

Li X, Ge P, Hu H. FlexSLiM: a Novel Approach for Short Linear Motif Discovery in Protein Sequences. Proceedings of the 2018 6th International Conference on Bioinformatics and Computational Biology. 2018; 32-39.

Zheng Y, Li X, Hu H. Discover the semantic structure of human reference epigenome by differential latent dirichlet allocation. 2017 IEEE International Conference on Bioinformatics and Biomedicine (BIBM). 2017.

Ding J, Li X, Hu H. CCmiR: a computational approach for competitive and cooperative microRNA binding prediction. Bioinformatics. 2017; 34(2): 198-206.

Wang Y, Goodison S, Li X, Hu H. Prognostic cancer gene signatures share common regulatory motifs. Scientific Reports. 2017; 7(1): 4750.

Wang Y, Hu H, Li X. rRNAFilter: a fast approach for ribosomal RNA read removal without a reference database. Journal of Computational Biology. 2016.

Roqueta-Rivera M, Esquejo RM, Phelan PE, Sandor K, Daniel B, Foufelle F, Ding J, Li X, Khorasanizadeh S, Osborne TF. SETDB2 Links Glucocorticoid to Lipid Metabolism through Insig2a Regulation. Cell Metab. 2016; 24(3): 474-84.

Zhao C, Li X, Hu H. PETModule: a motif module based approach for enhancer target gene prediction. Scientific Reports. 2016; 6: 30043.

Wang Y, Hu H, Li X. MBMC: An Effective Markov Chain Approach for Binning Metagenomic Reads from Environmental Shotgun Sequencing Projects. OMICS: A Journal of Integrative Biology. 2016; 20 (8): 470-479.


Li X, Zheng Y, Hu H, Li X. Integrative analyses shed new light on human ribosomal protein gene regulation. Scientific Reports. 2016; 6:28619.


Ding J, Li X, Hu H. TarPmiR: a new approach for microRNA target site prediction. Bioinformatics. 2016; 32(18):2768-2775.

Dhillon V, Li X. Single-Cell Genome Sequencing for Viral-Host Interactions. Journal of Computer Science & Systems Biology. 2015; 8:160-165.

Kapoor N, Niu J, Saad Y, Kumar S, Sirakova T, Becerra E, Li X, Kolattukudy PE. Transcription factors STAT6 and KLF4 implement macrophage polarization via the dual catalytic powers of MCPIP. Journal of Immunology. 2015; 194(12):6011-6023.

Wang Y, Hu H, Li X. MBBC: an efficient approach for metagenomic binning based on clustering. BMC Bioinformatics. 2015; 16(1):1.

Zheng Y, Li X, Hu H. PreDREM: a database of predicted DNA regulatory motifs from 349 human cell and tissue samples. Database. 2015; bav007.

Ding J, Dhillon V, Li X, Hu H. Systematic discovery of cofactor motifs from ChIP-seq data by SIOMICS. Methods. 2015; 79: 47-51.

Ding J, Li X, Hu H. MicroRNA modules prefer to bind weak and unconventional target sites. Bioinformatics. 2015; 31(9): 1366-1374.

Zheng Y, Li X, Hu H. Comprehensive discovery of DNA motifs in 349 human cells and tissues reveals new features of motifs. Nucleic Acids Research. 2014; 43(1):74-83.

Zheng Y, Li X, Hu H. Computational discovery of feature patterns in nucleosomal DNA sequences. Genomics. 2014; 104(2):87-95.

Wang Y, Li X, Hu H. H3K4me2 reliably defines transcription factor binding regions. Genomics. 2014; 103(2-3):222-228.

Ding J, Hu H, Li X. SIOMICS: a Novel Approach for Systematic Identification of Motifs in ChIP-seq Data. Nucleic Acids Research. 2014; 42(5): e35.

Ding J, Hu H, Li X. NIM, A novel computational method for predicting nuclear-encoded chloroplast proteins. Journal of Medical and Bioengineering. 2013; 2(2): 115-119.

Ding J, Cai X, Wang Y, Hu H, Li X. ChIPModule: Systematic discovery of transcription factors and their cofactors from ChIP-seq data. Pac Symp Biocomput. 2013.

Ding J, Li X, Hu H. Systematic Identification of cis-regulatory elements in Chlamydomonas reinhardtii genome using comparative genomics. Plant Physiology. 2012;160(2):613-23.

Wang Y, Ding J, Daniell H, Hu H, Li X. Motif analysis unveils the possible co-regulation of chloroplast genes and nuclear genes encoding chloroplast proteins. Plant Molecular Biology. 2012;80(2):177-87.

Ruppert SM, Chehtane M, Zhang G, Hu H, Li X, Khaled AR. JunD/AP-1-Mediated Gene Expression Promotes Lymphocyte Growth Dependent on Interleukin-7 Signal Transduction. PLoS One. 2012;7(2):e32262.

Ding J, Hu H, Li X. Thousands of cis-regulatory sequences are shared by Arabidopsis and populus. Plant Physiology. 2012;158(1):145-55.

Wang Y, Li X, Hu H. Transcriptional regulation of co-expressed microRNA target genes. Genomics. 2011;98(6):445-52.

Balakrishnan MP, Cilenti L, Ambivero C, Goto Y, Takata M, Turkson J, Li XS, Zervos AS. THAP5 is a DNA binding transcriptional repressor that is regulated in melanoma cells during DNA damage-induced cell death. Biochem Biophys Res Commun. 2011; 404(1):195-200.

Schanen BC, Li X. Transcriptional regulation of mammalian miRNA genes. Genomics. 2011; 97(1):1-6.

Cai X, Hou L, Su N, Hu H, Deng M, Li X. Systematic identification of conserved motif modules in the human genome. BMC Genomics. 2010; 11:567.

Hu H, Li X. Transcription factor binding site identification by phylogenetic footprinting. Book Chapter in Frontiers in Computational and Systems Biology, 113-132, 2010.

Hu H, Li X. Whole genome identification of target genes of transcription factors. The 2010 International Conference On Bioinformatics and Biomedical Technology, Chengdu, China. April 16-18, 2010.

Hu H, Li X. Hierarchical order of gene expression levels. The 2010 International Conference On Bioinformatics and Biomedical Technology, Chengdu, China. April 16-18, 2010.

Cai X, Hu H, Li X. A new measurement of sequence conservation. BMC Genomics. 2009; 10:623.

Sadat MA, Dirscherl S, ..., Li X, Grez M, Cornetta K, Mooney SD, Dinauer MC. Retroviral vector integration in post-transplant hematopoiesis in mice conditioned with either submyeloablative or ablative irradiation. Gene Ther. 2009; 16:1452-1464.

Ma X, Zhang K, Li X. Evolution of Drosophila Ribosomal Protein Gene Core Promoters. Gene. 2009; 432(2):54-59.

Hu J, Hu H, Li X. MOPAT: a graph-based method to predict recurrent cis-regulatory modules from known motifs. Nucleic Acids Res. 2008; 36(13):4488-97.

Peters B, Dirscherl S, Dantzer J, Nowacki J, Cross S, Li X, Cornetta K, Dinauer MC, Mooney SD. Automated analysis of viral integration sites in gene therapy research using the SeqMap web resource. Gene Ther. 2008; 15(18):1294-8.

Georgiadis MM, Luo M, Gaur RK, Delaplane S, Li X, Kelley MR. Evolution of the redox function in mammalian apurinic/apyrimidinic endonuclease. Mutat Res. 2008; 643(1-2):54-63.

Hu H, Li X. Networking pathways unveils association between obesity and non-insulin dependent diabetes mellitus. Pac Symp Biocomput. 2008; 13:255-66.

Humphreys TL, Li L, Li X, Janowicz DM, Fortney KR, Zhao Q, Li W, McClintick J, Katz BP, Wilkes DS, Edenberg HJ, Spinola SM. Dysregulated immune profiles for skin and dendritic cells are associated with increased host susceptibility to Haemophilus ducreyi infection in human volunteers. Infect Immun. 2007; 75(12):5686-97.

Cai X, Hu H, Li X. Tree Gibbs Sampler: Identifying Conserved Motifs without Aligning Orthologous Sequences. Bioinformatics. 2007; 23(15):2013-4.

Hu H, Li X. Transcriptional regulation in eukaryotic ribosomal protein genes. Genomics. 2007; 90(4):421-3.

Li X. Cancer Bioinformatics--From Therapy Design to Treatment. Briefings in Bioinformatics. 2006.

Li L, Cheng AS, Jin VX, Paik HH, Fan M, Li X, Zhang W, Robarge J, Balch C, Davuluri RV, Kim S, Huang TH, Nephew KP. A mixture model-based discriminate analysis for identifying ordered transcription factor binding site pairs in gene promoters directly regulated by estrogen receptor-alpha. Bioinformatics. 2006; 22(18):2210-6.

Li X, Zhong S, Wong WH. Reliable transcription factor binding sites prediction in eukaryotes by phylogenetic verification. Proc Natl Acad Sci U S A. 2005; 102(47):16945-50.

Li X, Wong WH. Sampling motifs on phylogenetic trees. Proc Natl Acad Sci U S A. 2005; 102(27): 9481-6.

LaRocque R, Harris JB, Dziejman M, Li X, et al. Transcriptional Profiling of Vibrio cholerae Recovered Directly from Early and late human infection. Infection and Immunity. 2005; 73(8):4488-93.

Bjorkbacka H, Fitzgerald KA, Huet F, Li X, et al. The induction of macrophage gene expression by LPS predominantly utilizes MyD88-independent signaling cascades. Physiol Genomics. 2004; 19(3):319-30.

Li X, Waterman MS. Estimate the repeat structure of a genome without assembly. Genome Research. 2003; 13(8):1916-22.

Yeh RF, Speed T, Waterman MS, Li X. Predicting progress in shotgun sequencing with paired ends. Center for Bioinformatics & Molecular Biostatistics, 2002.


Software Tools and Resources



conserved motif combinations in Arabidopsis and poplar.

GenomeModule: predicted motif combinations and CRMs in the human genome.

MOPAT Unix version, MOPAT cygwin version, MOPAT Windows version.

Tree Gibbs Sampler



Grants

Showalter Award. Sampling Cis-elements on Phylogenetic Trees and Its Application to Developmental Biology. July 2006 to June 2008.

NIH R01. Discovery of Cis-Regulatory Modules in the Human Genome. September 2007 to July 2011.

NSF. III: Small: Computational Inference of Microbial Community Structures from Environmental Shotgun Reads. October 2012 to September 2015.

NIH R01 (PI: Dr. Osborne). Sterol Regulatory Element Binding Proteins in Regulation of Lipid Metabolism. August 2014 to July 2018.

NIH R01 (PI: Dr. Smith). Epigenetic regulation of adipose tissue distribution. June 2016 to May 2020.

NSF. IIBR Informatics: An integrative study of distal gene regulation. August 2020 to July 2023.



Teaching

Bioinformation and Genomics, Fall 2009
Structure Bioinformatics, Spring 2010
Sequence Analysis, Fall 2010
Structure Analysis, Spring 2011
Sequence Analysis, Fall 2011
Structure Analysis, Spring 2012
Sequence Analysis, Fall 2012

Visit our group page for more information.


Current Students

Daniella Badal, Biomolecular Science Center undergraduate

Julian Quintana, Biomolecular Science Center undergraduate

Jun Ding, EECS PhD (co-advised)

Desislava Doncheva, EECS undergraduate

Shawn Hendricks, Biomolecular Science Center PhD

Vikram Dhillon, Biomolecular Science Center undergraduate

Michael Venincasa, high school student

Past Students and Postdocs

Xiaohui Cai, Postdoctoral Fellow (2006-2008)

Jianfei Hu, Postdoctoral Fellow (2006-2008)

Erik Ladewig, EECS master's student (2008-2010)

Jeremy Keller, Biomolecular Science Center, undergraduate (2010-2011)

Luke Stevens, College of Medicine, MD student (2009-2010)