Haiyan Nancy Hu 

Assistant Professor

Department of EECS

University of Central Florida

Office: HEC-233

Phone: 407-882-0134
Email: haihu@cs.ucf.edu
Mail: UCF, 4000 Central Florida Blvd, Harris Engineering Center, Bldg. 116, Room 233, Orlando, FL 32816-2005


Lab Page

Research Interests

Bioinformatics/Computational Biology; Data mining and machine learning algorithms; Pathway and network analysis; Motif finding and gene regulation; Functional Genomics; Genomics/epigenomics data integration to understand complex biological processes such as aging, obesity, diabetes, cancer, heart disease, psychiatric diseases and autoimmune diseases.

<<<Graduate and undergraduate research assistant positions are available in areas including data mining, machine learning, pattern recognition, modeling and simulation, bioinformatics, and computational biology. Interested students, please read here>>>


Grants

UCF In-House Award: Computational Modeling of miRNA binding interaction ($7,500, PI, 100% share, 05/13-04/14)

NSF CAREER: A Computational Framework to Study Epigenetic Regulation ($684,172, PI, 100% share, 05/12-04/17)

NSF BRIGE: Computational Identification of Gene Regulatory Networks in Microalgae ($174,654, PI, 100% share, 08/11-07/14)

NIH R01: Discovery of Cis-Regulatory Modules in the Human Genome ($492,405, 25% share, 08/08-07/11)


Teaching

Fall 2013: CAP 6938 Advanced Topics in Machine Learning

Fall 2013: CAP 5510 Introduction to Bioinformatics

Spring 2013: CAP 6545 Machine Learning in Bioinformatics

Fall 2012: CAP 5510 Introduction to Bioinformatics

Spring 2012: COT 3100H Introduction to Discrete Structures (Honors)

Spring 2012: CAP 6545 Machine Learning in Bioinformatics

Fall 2011: CAP 6938 Graphs and Networks in Computational Biology

Fall 2011: CAP 5510 Introduction to Bioinformatics

Spring 2011: CAP 6545 Machine Learning in Bioinformatics

Spring 2011: COT 3100 Introduction to Discrete Structures

Fall 2010: CAP 6938 Data Mining in Bioinformatics

Spring 2010: CAP 6545 Machine Learning in Bioinformatics

Fall 2009: CAP 6938 Data Mining in Bioinformatics

Spring 2009: COP 3503C Computer Science II (CS2)

Fall 2008: COP 3503 Computer Science II (CS2)

Publications

Wang Y, Li X, Hu H. H3K4me2 reliably defines transcription factor binding regions. Genomics. doi: 10.1016/j.ygeno.2014.02.002, 2014.

Ding J, Hu H, Li X. SIOMICS: a novel approach for systematic identification of motifs in ChIP-seq data, Nucleic Acids Research. doi: 10.1093/nar/gkt1288, 2013.

Ding J, Hu H, Li X. NIM, A novel computational method for predicting nuclear-encoded chloroplast proteins, Journal of Medical and Bioengineering, 2(2): 115-119. doi: 10.12720/jomb.2.2.115-119, 2013.

Ding J, Cai X, Wang Y, Hu H, Li X. ChIPModule: Systematic discovery of transcription factors and their cofactors from ChIP-seq data, Pac Symp Biocomput. 2013.

Ding J, Li X, Hu H. Systematic discovery of cis-regulatory elements in Chlamydomonas reinhardtii genome using comparative genomics, Plant Physiology, doi: http://dx.doi.org/10.1104/pp.112.200840, 2012.

Wang Y, Ding J, Daniell H, Hu H, Li X. Motif analysis unveils the possible co-regulation of chloroplast genes and nuclear genes encoding chloroplast proteins. Plant Mol Biol, 80(2): 177-187. doi:10.1007/s11103-012-9938-6, 2012.

Ruppert SM, Chehtane M, Zhang G, Hu H, Li X, Khaled AR. JunD/AP-1-Mediated gene expression promotes lymphocyte growth dependent on Interleukin-7 signal transduction. PLoS ONE 7(2): e32262. doi:10.1371/journal.pone.0032262, 2012.

Li W, Hu H, Huang Y, Li H, Mehan MR, Nunez-Iglesias J, Xu M, Yan X, Zhou XJ. Pattern mining across many massive networks. Book Chapter in Functional Coherence of Biological Networks. Springer, M. Koyuturk, S. Subramaniam, and A. Grama Eds., 137-170, 2012.

Ding J, Hu H, Li X. Thousands of cis-regulatory sequence combinations are shared by Arabidopsis and Poplar. Plant Physiology, doi: http://dx.doi.org/10.1104/pp.111.186080, 2011.

Wang Y, Li X, Hu H. Transcriptional regulation of co-expressed microRNA target genes. Genomics. doi:10.1016/j.ygeno.2011.09.004, 2011.

Li W, Hu H, Huang Y, Li H, Mehan MR, Nunez-Iglesias J, Xu M, Yan X, and Zhou XJ. Frequent pattern discovery in multiple biological networks: algorithms and applications. Statistics in Biosciences, p. 1-20. DOI: 10.1007/s12561-011-9047-0, 2011.

Hu H. Mining patterns in disease classification forests. Journal of Biomedical Informatics, 43(5):820-7, 2010.

Cai X, Hou L, Su N, Hu H, Deng M, Li X. Systematic identification of conserved motif modules in the human genome. BMC Genomics, 11:567, 2010.

Hu H, Li X. Whole genome identification of target genes of transcription factors. The 2010 International Conference On Bioinformatics and Biomedical Technology, Chengdu, China. April 16-18, 2010.

Hu H, Li X. Hierarchical order of gene expression levels. The 2010 International Conference On Bioinformatics and Biomedical Technology, Chengdu, China. April 16-18, 2010.

Hu H, Li X. Transcription factor binding site identification by phylogenetic footprinting. Book Chapter in Frontiers in Computational and Systems Biology, 113-132, 2010.

Cai X, Hu H, Li X. A new measurement of sequence conservation. BMC Genomics, 10:623, 2009.

Hu H. An efficient method to identify conditionally activated transcription factors and their corresponding signal transduction pathway segments. Bioinformatics and Biology Insights, 3:179-187, 2009.

Hu J, Hu H, Li X. MOPAT: a graph-based method to predict recurrent cis-regulatory modules from known motifs. Nucleic Acids Res, 36(13):4488-4497, 2008.

Hu H, Li X. Networking Pathways unveils Association between Obesity and Non-Insulin Dependent Diabetes Mellitus. Pac Symp Biocomput. 13:255-66, 2008.

Hu H, Li X. Transcriptional regulation in eukaryotic ribosomal protein genes, Genomics. 90(4):421-3, 2007.

Cai X, Hu H, Li X. Tree Gibbs Sampler: Identifying Conserved Motifs without Aligning Orthologous Sequences. Bioinformatics. 23(15):2013-4, 2007.

Huang Y, Li H, Hu H, Yan X, Waterman MS, Huang H, Zhou XJ. Systematic Discovery of Functional Modules and Context-Specific Functional Annotation of Human Genome. Bioinformatics, 23(13):i222-i229, 2007.

Pan F, Kamath K, Zhang K, Pulapura S, Achar A, Nunez-Iglesias J, Huang Y, Yan X, Han J, Hu H, Xu M, Zhou XJ. Integrative Array Analyzer: a software package for analysis of cross-platform and cross-species microarray data. Bioinformatics. 22(13):1665-7, 2006.

Hu H, Yan X, Huang Y, Han J, Zhou XJ. Mining coherent dense subgraphs across massive biological networks for functional discovery. Bioinformatics. 21 Suppl. 1, i213-i221, 2005.

Pan F, Kamath K, Hu H, Huang Y, Zhang K, Xu M, Yan X, Han J, Zhou XJ. BioArrayMiner: A software package for integrative analysis of cross-platform and cross-species microarray data. Bioinformatics (ISMB 2005).

Xue L, Sun X, Yang L, Hu H, Li W. Study on the control system of casing-bag machine hand. New Technology & New Process 6: 11-13, 1999.

Sun X, Hu H, Pang J, Xue L. Physical Realization of Palletizing Robot Computer Control System. Journal of Beijing Institute of Petro-Chemical Technology 2:1008-2565, 1999.

Xue L, Sun X, Yang L, Zhang S, Hu H. Robot for sheathing bags and PLC control, Low Voltage Apparatus 5: 38-39+64, 1998.


Software

CODENSE (short for Mining Coherent Dense Subgraphs) is a software package that mines coherent dense subgraphs from multiple biological networks. It reduces the problem of identifying coherent dense subgraphs across n graphs to the problem of identifying dense subgraphs in two special graphs: the summary graph and the second-order graph. This lets CODENSE efficiently mine frequent coherent dense subgraphs across large numbers of massive graphs.
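As a rough illustration of the two auxiliary graphs named above, the following Python sketch builds them from a small set of toy networks. It is not the CODENSE implementation. It assumes that the summary graph keeps the edges that occur in at least `min_support` of the input graphs, and that the second-order graph links two such edges when they tend to occur in the same input graphs. Jaccard similarity of the occurrence sets is used here as a stand-in similarity measure, and the function names, parameters, and thresholds are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def edge_key(u, v):
    """Normalize an undirected edge so (u, v) and (v, u) compare equal."""
    return (u, v) if u <= v else (v, u)


def summary_graph(graphs, min_support):
    """Assumed definition: edges present in at least min_support graphs."""
    counts = Counter()
    for edges in graphs:
        # Count each edge at most once per graph.
        counts.update({edge_key(u, v) for u, v in edges})
    return {e for e, c in counts.items() if c >= min_support}


def occurrence_sets(graphs, edges):
    """Map each edge to the set of graph indices in which it occurs."""
    occ = {e: set() for e in edges}
    for i, g in enumerate(graphs):
        for u, v in g:
            e = edge_key(u, v)
            if e in occ:
                occ[e].add(i)
    return occ


def second_order_graph(graphs, edges, min_similarity):
    """Assumed definition: nodes are edges; two edges are linked when
    their occurrence sets have Jaccard similarity >= min_similarity."""
    occ = occurrence_sets(graphs, edges)
    links = set()
    for e1, e2 in combinations(sorted(edges), 2):
        union = occ[e1] | occ[e2]
        if not union:
            continue
        if len(occ[e1] & occ[e2]) / len(union) >= min_similarity:
            links.add((e1, e2))
    return links


# Three toy networks over the same node set (hypothetical data).
graphs = [
    [("a", "b"), ("b", "c"), ("a", "c")],
    [("a", "b"), ("b", "c"), ("c", "d")],
    [("b", "a"), ("a", "c"), ("c", "d")],
]
frequent_edges = summary_graph(graphs, min_support=2)
coherent_links = second_order_graph(graphs, frequent_edges, min_similarity=0.6)
```

Dense subgraphs would then be searched for in these two graphs rather than in every input network separately, which is the simplification the description above refers to.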

MODES is short for Mining Overlapping DENSE Subgraphs. MODES builds on HCS (Mining Highly Connected Subgraphs) (Hartuv & Shamir, 2000) and adds two new features: (1) MODES is more efficient in identifying dense subgraphs; and, more importantly, (2) MODES can discover overlapping subgraphs.

MOPAT (Motif Pair Tree) identifies CRMs through the identification of motif modules, which are groups of motifs co-occurring in multiple CRMs. It can identify orthologous CRMs without multiple alignments. It can also find CRMs given a large number of known motifs. Unix version download, Cygwin version download.

Tree Gibbs Sampler is software for identifying motifs by simultaneously using the motif overrepresentation property and the motif evolutionary conservation property. It identifies motifs without depending on pre-aligned orthologous sequences, which makes it useful for the extraction of regulatory elements in multiple genomes of both closely related and distant species. Windows version download. Unix version download.

activeTF finds a set of coordinately activated transcription factors (TFs) from a given gene expression dataset.

Current Students

Jun Ding

Ying Wang

Yiyu Zheng

Liu Pei