Dr. Cui's research focuses on machine learning applications for semantic annotation of semi-structured information, currently centered on biodiversity literature. She develops and evaluates machine learning and natural language processing algorithms for converting born-digital and digitized taxonomic descriptions into new Semantic Web formats. More recently, her research has led to ontology building in the biology domain. Her work has an explicit impact on how scientific information can be retrieved and used in the digital era by turning the wealth of human-readable scientific information into something that can be read and understood by computers. She is the principal investigator or co-PI of a number of National Science Foundation-funded projects. The methodology developed by Dr. Cui has been adopted by several other research groups in the US and abroad. She leads the biosemantics research group in the iSchool.


Information Organization, Natural Language Processing Applications, Machine Learning Applications, Biodiversity Informatics, Ontology Development, Software Development


• PI, "The Value of Automated Semantic Annotation for Biodiversity Informatics," NSERC, Canada, $75,000. Awarded. 2007. PI moved in 2007.

• PI, "Fine-Grained Semantic Markup of Descriptive Data for Knowledge Applications in Biodiversity Domains," NSF, $700,452.00, Awarded (EF-0849982). 9/2009-8/2012.

• Subcontractor, "Collaborative research: ABI Development: Ontology-enabled reasoning across phenotypes from evolution and model organisms" (NSF DBI-1062542), $49,596. Awarded. 7/2011-6/2013.

• Co-PI, “Collaborative Research: Next Generation Phenomics for the Tree of Life,” NSF, $335,000. Awarded (DEB-1208567). 5/2012-4/2015.

• PI, “BCSP: Collaborative Research: ABI Development: Exploring Taxon Concepts (ETC) through analyzing fine-grained semantic markup of descriptive literature,” NSF, $1,095,946.00. Awarded (DBI-1147266). 7/2012-6/2016

• Senior Personnel. “La SCALA: Latino Scholars Cambio Leadership Academy” Type: Training Grant. Institute of Museum and Library Services, $173,000. Awarded.

• PI: Research Experience for Undergraduate Students (REU) supplement to “Collaborative Research: AVATOL - Next Generation Phenomics for the Tree of Life” $7,500. Awarded (DEB-1208567). 5/2014-5/2017.

• Co-PI: Special Creativity Extension to “Collaborative Research: AVATOL - Next Generation Phenomics for the Tree of Life” $117,422. Awarded (DEB-1208567). 5/2015-5/2017.

• Co-PI: “Collaborative Research: Building a Comprehensive Evolutionary History of Flagellate Plants” $53,198. Awarded (DEB 1541509) 1/2016-12/2019.

• “Collaborative Research: ABI innovation: Authors in the driver's seat: fast, consistent, computable phenotype data and ontology production”, 7/1/2017-6/30/2020. $640,000.

Capstone/Directed Research Projects available for MS and Ph.D students (Fall 2017 and Spring 2018):

1. Capstone projects for MS students

1.1 Content fetcher: Design and develop a software module that interactively and efficiently fetches digital content from public websites (e.g. an open access journal site) for use as input to the ETC toolkit. Required skills: programming skill in any language.

1.2 Google Drive Extension: Extend Google Drive with a new ontology add-on. The main functions include highlighting text that matches terms in an ontology and showing snippets of the ontology in the Google Docs view. Required skills: programming skill in any language.

1.3 Ontology visualization: Design and implement visualization modules using gwt-d3-api to visualize segments of an ontology in a larger GWT application.

2. Directed Research topics for Ph.D students

2.1 Deep learning for biodiversity Named Entity Recognition tasks, including mapping free phrases to ontology terms.

Email hongcui@email.arizona.edu if you are interested in any of the projects/topics. 

• Arighi, C.N., Carterette, B., Cohen K.B. et al. (2013). An Overview of the BioCreative 2012 Workshop Track III: Interactive Text Mining Task. Database. doi: 10.1093/database/bas056
• Burleigh, G., et al. (2013). Next generation phenomics for the Tree of Life. PLOS Currents. http://currents.plos.org/treeoflife/article/next-generation-phenomics-fo...
• Duan, YF., Hei ZZ., Jiu, F., & Cui, H. (2013). Heuristics based semantic annotation of biodiversity documents in Chinese. Chinese Journal of Library and Information Science (English). 6(2):33-46. http://ir.las.ac.cn/handle/12502/6238?mode=full&submit_simple=Show+full+...
• Dahdul, W.M., Cui, H., Mabee, P. et al. (2014) The Biological Spatial Ontology: anatomical descriptors for spatial and topological aspects of biological structures. Journal of Biomedical Semantics. 5:34. doi:10.1186/2041-1480-5-34 [13 pages]
• Deans AR, Lewis SE, Huala E, Anzaldo SS, Ashburner M, et al. (2015) Finding Our Way through Phenotypes. PLoS Biology 13(1): e1002033. doi:10.1371/journal.pbio.1002033 [perspective paper].
• Huang, F., Macklin, J.A., Cui, H.*, Cole, H.A., & Endara, L. (2015). OTO: Ontology Term Organizer. BMC Bioinformatics. 16:47. doi:10.1186/s12859-015-0488-1
• Carrine, B., Cui, H., Moore, L., Ramona, W. (2016). MicrO: an ontology of phenotypic and metabolic characters, assays, and culture media found in prokaryotic taxonomic descriptions. Journal of Biomedical Semantics, 7:18, DOI: 10.1186/s13326-016-0060-6, http://www.jbiomedsem.com/content/7/1/18
• Cui, H.*, Xu, D., Chong, S.S., Ramirez, M.J., Rodenhausen, T., Macklin, J.A., Ludascher, B., Morris, R.A., Soto, E. M., & Koch, N.M. (2017). Introducing Explorer of Taxon Concepts with a Case Study on Spider Measurement Matrix Building. BMC Bioinformatics.
• Mao, J., Moore, L., Blank, C., Wu, E.H-H., Ackerman, M., Ranade, S., & Cui, H.* (2017). Microbial Phenomics Information Extractor (MicroPIE): A Natural Language Processing Tool for the Automated Acquisition of Prokaryotic Phenotypic Characters from Text Sources. BMC Bioinformatics.
• Endara, L., Cole, H.A., Burleigh, J.G., Nagalingum, N., Macklin, J.A., Liu, J., & Cui, H.* (2017). Using taxonomic descriptions to build a standardized Plant Glossary. Taxon.

Hong Cui
Director of Graduate Studies; Associate Professor, Information Technologies
Telephone: 520-621-3565


Ph.D., Library and Information Science, 2005
Master's in Computer Science, 2002

Courses Taught

672: Introduction to Applied Technology
515: Organization of Information
630: Controlled Vocabularies
588 special topic: XML and Semantic Web Standards
Doctoral seminar: Text Mining

