Introduction to Bio-Ontologies explores the computational background of ontologies. Emphasizing computational and algorithmic issues surrounding bio-ontologies, this self-contained text helps readers understand ontological algorithms and their applications.
The first part of the book defines ontology and bio-ontologies. It also explains the importance of mathematical logic for understanding concepts of inference in bio-ontologies, discusses the probability and statistics topics necessary for understanding ontology algorithms, and describes ontology languages, including OBO (the preeminent language for bio-ontologies), RDF, RDFS, and OWL.
The second part covers significant bio-ontologies and their applications. The book presents the Gene Ontology; upper-level ontologies, such as the Basic Formal Ontology and the Relation Ontology; and current bio-ontologies, including several anatomy ontologies, Chemical Entities of Biological Interest, Sequence Ontology, Mammalian Phenotype Ontology, and Human Phenotype Ontology.
The third part of the text introduces the major graph-based algorithms for bio-ontologies. The authors discuss how these algorithms are used in overrepresentation analysis, model-based procedures, semantic similarity analysis, and Bayesian networks for molecular biology and biomedical applications.
With a focus on computational reasoning topics, the final part describes the ontology languages of the Semantic Web and their applications for inference. It covers the formal semantics of RDF and RDFS, OWL inference rules, a key inference algorithm, the SPARQL query language, and the state of the art for querying OWL ontologies.
Software and data designed to complement material in the text are available on the book's website (http://bio-ontologies-book.org). The site provides the R Robo package developed for the book, along with a compressed archive of data and ontology files used in some of the exercises. It also offers teaching/presentation slides and links to other relevant websites.
This book provides readers with the foundation to use ontologies as a starting point for new bioinformatics research projects or to support current molecular genetics research projects. By supplying a self-contained introduction to OBO ontologies and the Semantic Web, it bridges the gap between the two fields and helps readers see what each can contribute to the analysis and understanding of biomedical data.
Peter N. Robinson is a research scientist and leader of the Computational Biology Group in the Institute of Medical Genetics and Human Genetics at Charité-Universitätsmedizin Berlin. Dr. Robinson completed his medical education at the University of Pennsylvania, followed by an internship at Yale University. He also studied mathematics and computer science at Columbia University. His research interests involve the use of mathematical and bioinformatics models to understand biology and hereditary disease.
Sebastian Bauer is a research assistant in the Institute of Medical Genetics and Human Genetics at Charité-Universitätsmedizin Berlin. He earned a degree in computer science from the Technical University of Ilmenau. His research interests include mathematical modeling, discrete algorithms, theoretical computer science, software engineering, and the applications of these fields to medicine and biology.
BASIC CONCEPTS
Ontologies and Applications of Ontologies in Biomedicine: What Is an Ontology?; Ontologies and Bio-Ontologies; Ontologies for Data Organization, Integration, and Searching; Computer Reasoning with Ontologies; Typical Applications of Bio-Ontologies
Mathematical Logic and Inference: Representation and Logic; Propositional Logic; First-Order Logic; Sets; Description Logic
Probability Theory and Statistics for Bio-Ontologies: Probability Theory; Bayes' Theorem; Introduction to Graphs; Bayesian Networks
Ontology Languages: OBO; RDF and RDFS; OWL and the Semantic Web
BIO-ONTOLOGIES
The Gene Ontology: A Tool for the Unification of Biology; Three Subontologies; Relations in GO; GO Annotations; GO Slims
Upper-Level Ontologies: Basic Formal Ontology; The Big Divide: Continuants and Occurrents; Universals and Particulars; Relation Ontology; Revisiting Gene Ontology; Revisiting GO Annotations
A Selective Survey of Bio-Ontologies: OBO Foundry; The National Center for Biomedical Ontology; Bio-Ontologies; What Makes a Good Ontology?
GRAPH ALGORITHMS FOR BIO-ONTOLOGIES
Overrepresentation Analysis: Definitions; Term-for-Term; Multiple Testing Problem; Term-for-Term Analysis: An Extended Example; Inferred Annotations Lead to Statistical Dependencies in Ontology DAGs; Parent-Child Algorithms; Parent-Child Analysis: An Extended Example; Topology-Based Algorithms; Topology-elim: An Extended Example; Other Approaches; Summary
Model-Based Approaches to GO Analysis: A Probabilistic Generative Model for GO Enrichment Analysis; A Bayesian Network Model; MGSA: An Extended Example; Summary
Semantic Similarity: Information Content in Ontologies; Semantic Similarity of Genes and Other Items Annotated by Ontology Terms; Statistical Significance of Semantic Similarity Scores
Frequency-Aware Bayesian Network Searches in Attribute Ontologies: Modeling Queries; Probabilistic Inference for the Items; Parameter-Augmented Network; The Frequency-Aware Network; Benchmark
INFERENCE IN ONTOLOGIES
Inference in the Gene Ontology: Inference over GO Edges; Cross-Products and Logical Definitions
RDFS Semantics and Inference: Definitions; Interpretations; RDF Entailment; RDFS Entailment; Entailment Rules; Summary
Inference in OWL Ontologies: The Semantics of Equality; The Semantics of Properties; The Semantics of Classes; The Semantics of the Schema Vocabulary; Conclusions
Algorithmic Foundations of Computational Inference: The Tableau Algorithm; Developer Libraries
SPARQL: SPARQL Queries; Combining RDF Graphs; Conclusions
Appendix A: An Overview of R
Appendix B: Information Content and Entropy
Appendix C: W3C Standards: XML, URIs, and RDF
Appendix D: W3C Standards: OWL
Bibliography
Index
Exercises and Further Reading appear at the end of each chapter.