The Biosemantic Research Group focuses on converting factual information from biodiversity literature into computable data. Its research covers information extraction, controlled vocabulary/ontology construction, and knowledge modeling.
ABI innovation: Authors in the driver's seat: fast, consistent, computable phenotype data and ontology production. NSF DBI-1661485. July 2017-Jun 2020.
Collaborative Research: AVATOL - Next Generation Phenomics for the Tree of Life. NSF DEB-1208567. May 2012-May 2017.
Collaborative Research: ABI Development: Exploring Taxon Concepts (ETC) through analyzing fine-grained semantic markup of descriptive literature. NSF DBI-1147266. 7/2012-6/2016.
Collaborative Research: Building a Comprehensive Evolutionary History of Flagellate Plants. NSF DEB-1541509. Jan 2016-Dec 2019.
Biosemantic software tools online:
Explorer of Taxon Concept Toolkit (ETC): http://etc.cs.umb.edu/etcsite
A web-based application that includes the following tools:
- Text Capture (CharaParser), which extracts trait/phenotype characters from taxonomic descriptions of different taxon groups,
- Ontology Building that facilitates the creation of a phenotype ontology using terms from taxonomic descriptions,
- Matrix Generation that builds a taxon-by-character matrix from the extracted character data,
- Key Generation that builds interactive keys using characters, and
- Taxonomy Comparison that compares taxon concepts using EULER tools and extracted characters.
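To make the Matrix Generation step concrete, here is a minimal sketch of how extracted character data could be pivoted into a taxon-by-character matrix. It is not ETC's actual implementation: the function name `build_matrix`, the `(taxon, character, value)` record format, the `|` separator for multiple states, and the `?` marker for missing data are all assumptions made for illustration.

```python
from collections import defaultdict


def build_matrix(records):
    """Pivot (taxon, character, value) triples into a list of rows.

    The record format is hypothetical; it stands in for whatever the
    text-capture step actually emits.
    """
    cells = defaultdict(set)
    taxa, characters = [], []
    for taxon, character, value in records:
        # Preserve first-seen order of taxa and characters.
        if taxon not in taxa:
            taxa.append(taxon)
        if character not in characters:
            characters.append(character)
        cells[(taxon, character)].add(value)

    rows = [["taxon"] + characters]
    for taxon in taxa:
        row = [taxon]
        for character in characters:
            values = cells.get((taxon, character))
            # Several states for one cell are joined with "|";
            # "?" marks a character not described for the taxon.
            row.append("|".join(sorted(values)) if values else "?")
        rows.append(row)
    return rows


# Made-up example data, not taken from any real description.
example_records = [
    ("Taxon A", "leaf shape", "ovate"),
    ("Taxon A", "petal color", "white"),
    ("Taxon B", "leaf shape", "lanceolate"),
    ("Taxon A", "leaf shape", "elliptic"),
]
matrix = build_matrix(example_records)
for row in matrix:
    print("\t".join(row))
```

Each row after the header corresponds to one taxon, so the result can be written out as tab-separated text or loaded into a spreadsheet.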
Ontology Term Organizer (OTO): http://biosemantics.arizona.edu/OTO
A simple web application that allows multiple users to categorize a set of terms by dragging and dropping them. The tool is meant to gather consensus from a group of users in order to support the development of a formal ontology. Supported relationships are is_a, part_of, and order (follows/precedes).
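One way to picture consensus gathering is a simple vote over users' categorization decisions. The sketch below is an illustration only and does not describe how OTO itself resolves disagreements: the function `consensus_categories`, the `(user, term, category)` decision format, and the strict-majority threshold are all assumptions.

```python
from collections import Counter


def consensus_categories(decisions, min_agreement=0.5):
    """Return {term: category}, or None where no category wins.

    decisions: iterable of hypothetical (user, term, category) triples.
    A category wins only if strictly more than `min_agreement` of the
    users who categorized the term chose it.
    """
    votes = {}
    for user, term, category in decisions:
        # If a user categorizes a term twice, the later decision wins.
        votes.setdefault(term, {})[user] = category

    result = {}
    for term, by_user in votes.items():
        category, count = Counter(by_user.values()).most_common(1)[0]
        agreed = count / len(by_user) > min_agreement
        result[term] = category if agreed else None
    return result


# Made-up decisions from three hypothetical users.
example_decisions = [
    ("user1", "ovate", "shape"),
    ("user2", "ovate", "shape"),
    ("user3", "ovate", "texture"),
    ("user1", "glabrous", "texture"),
    ("user2", "glabrous", "pubescence"),
]
consensus = consensus_categories(example_decisions)
print(consensus)
```

Terms mapped to `None` are the ones a group would still need to discuss before they are placed in a formal ontology.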
MicroPIE for extracting microbial phenomic characters: http://biosemantics.arizona.edu/micropieweb/
Extracts microbial physiological characters from descriptions and returns a taxon-by-character matrix.
CharaParser+EQ: not maintained at this time.
Biosemantic software code repository:
Biosemantics Group Members:
Hong Cui
Lead Software Developer
Group Weekly Presentations
Selected Publications since 2010
- Cui, H*. (2010). Semantic annotation of morphological descriptions: An overall strategy. BMC Bioinformatics, 11, 1-11. DOI:10.1186/1471-2105-11-278. http://www.biomedcentral.com/1471-2105/11/278
- Cui, H*., Boufford, D., & Selden, P. (2010). Semantic annotation of biosystematics literature without training examples. Journal of the American Society for Information Science and Technology, 61(3), 522-542.
- Cui, H*. (2010). Competency evaluation of plant character ontologies against domain literature. Journal of the American Society for Information Science and Technology, 61(6), 1144-1165.
- Cui, H*., Duan, Y. & Li, F. (2011). Machine learning based semantic markup of biodiversity literature in English. Document, Information, & Knowledge (in Chinese), 2, 73-77.
- Cui, H*. (2012). CharaParser for fine-grained semantic annotation of organism morphological descriptions. Journal of the American Society for Information Science and Technology, 63(4), 738-754.
- Thessen, A., Cui, H., & Mozzherin, D. (2012). Applications of natural language processing in biodiversity science. Advances in Bioinformatics.
- Duan, Y, Hei, Z, Ju, F., Cui, H. (2012). Study on Semantic Markup of Species Description Text in Chinese Based on Auto-Learned Rules. New Technology of Library and Information Services (Chinese). 2012 (5).
- Duan, Y, Hei, Z, Ju, F., Cui, H. (2012). Semantic Annotation of Species Description Text in Chinese Literature by Naive Bayes Classifier. Journal of the China Society for Scientific and Technical Information (Chinese). 31(8), 805-812.
- Arighi, C.N., Carterette, B., Cohen K.B. et al. (2013). An Overview of the BioCreative 2012 Workshop Track III: Interactive Text Mining Task. Database. doi: 10.1093/database/bas056
- Burleigh, G, et al. (2013). Next generation phenomics for the Tree of Life. PLoS Currents. http://currents.plos.org/treeoflife/article/next-generation-phenomics-for-the-tree-of-life/
- Duan, YF., Hei ZZ., Jiu, F., & Cui, H.(2013) Heuristics based semantic annotation of biodiversity documents in Chinese. Chinese Journal of Library and Information Science (English). 2013,6(2):33-46. http://ir.las.ac.cn/handle/12502/6238?mode=full&submit_simple=Show+full+item+record
- Dahdul, W.M., Cui, H., Mabee, P. et al. (2014) The Biological Spatial Ontology: anatomical descriptors for spatial and topological aspects of biological structures. Journal of Biomedical Semantics. 5:34. doi:10.1186/2041-1480-5-34
- Zhang, Y., Cui, H., Burkell, J., & Mercer, R.E. (2014) A machine learning approach for rating the quality of depression treatment web pages. iConference, 2014. http://hdl.handle.net/2142/47314 [full paper]
- Deans AR, Lewis SE, Huala E, Anzaldo SS, Ashburner M, et al. (2015) Finding Our Way through Phenotypes. PLoS Biology 13(1): e1002033. doi:10.1371/journal.pbio.1002033 [perspective paper].
- Huang, F, Macklin, J.A., Cui, H.*, Cole, H.A., & Endara, L. (2015). OTO: Ontology Term Organizer. BMC Bioinformatics. 16:47 doi:10.1186/s12859-015-0488-1
- Cui, H., Dahdul, W., Dececchi, A., Ibrahim, N., Mabee, P., Balhoff, J., Gopalakrishnan, H. (2015) CharaParser+EQ: Performance Evaluation Without Gold Standard. Annual Meeting of American Society for Information Science and Technology, Nov 6-10, St Louis, Missouri, 2015. (Full paper, acceptance rate: 36.%)
- Blank, C., Cui, H., Moore, L., Walls, R. (2016). MicrO: an ontology of phenotypic and metabolic characters, assays, and culture media found in prokaryotic taxonomic descriptions. Journal of Biomedical Semantics, 7:18, DOI: 10.1186/s13326-016-0060-6, http://www.jbiomedsem.com/content/7/1/18
- Cui, H.*, Xu, D., Chong, S.S., Ramirez, M.J., Rodenhausen, T., Macklin, J.A., Ludascher, B., Morris, R.A., Soto, E. M., & Koch, N.M. (2016). Introducing Explorer of Taxon Concepts with a Case Study on Spider Measurement Matrix Building. BMC Bioinformatics. https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-016-1352-7
- Mao, J., Moore, L., Blank, C., Wu, E.H-H, Ackerman, M., Ranade, S., & Cui, H* (2016). Microbial Phenomics Information Extractor (MicroPIE): A Natural Language Processing Tool for the Automated Acquisition of Prokaryotic Phenotypic Characters from Text Sources. BMC Bioinformatics. https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-016-1396-8
- Endara, L., Cole, H.A., Burleigh, J.G., Nagalingum, N., Macklin, J.A., Liu, J., & Cui, H*. (Under review). Using taxonomic descriptions to build a standardized Plant Glossary. Taxon.
- T. Dang, H. Cui, and A. G. Forbes (2016). MultiLayerMatrix: Visualizing large taxonomic datasets. In Proceedings of the EuroVis Workshop on Visual Analytics (EuroVA), Groningen, Netherlands, June 2016.