From the very basics to linear models, this book provides a complete introduction to statistics, data analysis, and R for bioinformatics research and applications. It covers ANOVA, cluster analysis, visualization tools, and machine learning techniques. Suitable for self-study and courses in computational biology, bioinformatics, statistics, and the life sciences, the text also presents examples of microarrays and bioinformatics applications. R code illustrates all of the essential concepts and is available on an accompanying CD-ROM.
Sorin Draghici is the Robert J. Sokol MD Endowed Chair in Systems Biology in the Department of Obstetrics and Gynecology, professor in the Department of Clinical and Translational Science and Department of Computer Science, and head of the Intelligent Systems and Bioinformatics Laboratory at Wayne State University. He is also the chief of the Bioinformatics and Data Analysis Section in the Perinatology Research Branch of the National Institute for Child Health and Development. A senior member of IEEE, Dr. Draghici is an editor of IEEE/ACM Transactions on Computational Biology and Bioinformatics, Journal of Biomedicine and Biotechnology, and International Journal of Functional Informatics and Personalized Medicine. He earned a Ph.D. in computer science from the University of St. Andrews.
Introduction
  Bioinformatics - an emerging discipline
Introduction to R
  Introduction to R; The basic concepts; Data structures and functions; Other capabilities; The R environment; Installing Bioconductor; Graphics; Control structures in R; Programming in R vs C/C++/Java
Bioconductor: Principles and Illustrations
  Overview; The portal; Some explorations and analyses
Elements of Statistics
  Introduction; Some basic concepts; Elementary statistics; Degrees of freedom; Probabilities; Bayes' theorem; Testing for (or predicting) a disease
Probability Distributions
  Probability distributions; Central limit theorem; Are replicates useful?
Basic Statistics in R
  Introduction; Descriptive statistics in R; Probabilities and distributions in R; Central limit theorem
Statistical Hypothesis Testing
  Introduction; The framework; Hypothesis testing and significance; "I do not believe God does not exist"; An algorithm for hypothesis testing; Errors in hypothesis testing
Classical Approaches to Data Analysis
  Introduction; Tests involving a single sample; Tests involving two samples
Analysis of Variance (ANOVA)
  Introduction; One-way ANOVA; Two-way ANOVA; Quality control
Linear Models in R
  Introduction and model formulation; Fitting linear models in R; Extracting information from a fitted model: testing hypotheses and making predictions; Some limitations of the linear models; Dealing with multiple predictors and interactions in the linear models, and interpreting model coefficients
Experiment Design
  The concept of experiment design; Comparing varieties; Improving the production process; Principles of experimental design; Guidelines for experimental design; A short synthesis of statistical experiment designs; Some microarray specific experiment designs
Multiple Comparisons
  Introduction; The problem of multiple comparisons; A more precise argument; Corrections for multiple comparisons; Corrections for multiple comparisons in R
Analysis and Visualization Tools
  Introduction; Box plots; Gene pies; Scatter plots; Volcano plots; Histograms; Time series; Time series plots in R; Principal component analysis (PCA); Independent component analysis (ICA)
Cluster Analysis
  Introduction; Distance metric; Clustering algorithms; Partitioning around medoids (PAM); Biclustering; Clustering in R
Machine Learning Techniques
  Introduction; Main concepts and definitions; Supervised learning; Practicalities using R
The Road Ahead