Clustering techniques are increasingly used to analyse high-throughput biological datasets. Novel computational techniques for analysing high-throughput data in the form of sequences, gene and protein expression, pathways, and images are becoming vital for understanding diseases and for future drug discovery. This book details the complete pathway of cluster analysis, from the basics of molecular biology to the generation of biological knowledge. It also presents the latest clustering methods and clustering validation, offering the reader a comprehensive review of clustering analysis in bioinformatics, from the fundamentals through to state-of-the-art techniques and applications.
Key Features:
* Offers a contemporary review of clustering methods and applications in the field of bioinformatics, with particular emphasis on gene expression analysis
* Provides an excellent introduction to molecular biology with computer scientists and information engineering researchers in mind, laying out the basic biological knowledge behind the application of clustering analysis techniques in bioinformatics
* Explains the structure and properties of many types of high-throughput datasets commonly found in biological studies
* Discusses how clustering methods and their possible successors would be used to enhance the pace of biological discoveries in the future
* Includes a companion website hosting a selected collection of codes and links to publicly available datasets
Table of Contents:
Preface
List of Symbols
About the Authors

Part One: Introduction
1 Introduction to Bioinformatics
2 Computational Methods in Bioinformatics

Part Two: Introduction to Molecular Biology
3 The Living Cell
4 Central Dogma of Molecular Biology

Part Three: Data Acquisition and Pre-processing
5 High-throughput Technologies
6 Databases, Standards and Annotation
7 Normalisation
8 Feature Selection
9 Differential Expression

Part Four: Clustering Methods
10 Clustering Forms
11 Partitional Clustering
12 Hierarchical Clustering
13 Fuzzy Clustering
14 Neural Network-based Clustering
15 Mixture Model Clustering
16 Graph Clustering
17 Consensus Clustering
18 Biclustering
19 Clustering Methods Discussion

Part Five: Validation and Visualisation
20 Numerical Validation
21 Biological Validation
22 Visualisations and Presentations

Part Six: New Clustering Frameworks Designed for Bioinformatics
23 Splitting-Merging Awareness Tactics (SMART)
24 Tightness-tunable Clustering (UNCLES)

Appendix
Index