UCSF CBMB



BMI 209 - Statistical Data Mining & Analysis of Microarray Data

9/15    Lecture 1: Introduction to genomics and microarray technology [Yeh]

           [Slide]

9/22    Lecture 2: Introduction to statistics and microarray analysis [Yeh]
           - Summary statistics and exploratory data analysis methods
           - Clustering: partitioning and hierarchical methods
           - Sources of variability and experimental design
           - Data preprocessing for expression arrays


           Readings/References: Gentleman et al., Ch. 1-4.

           [Slide]

9/29    Lecture 3: Hypothesis testing and linear models [Yeh]
           - Two-sample statistics
           - Introduction to linear models for factorial experiments 
           - Multiple testing issues


           [Slide]

10/6    Lecture 4: Classification I [Fridlyand]
           - Linear methods for classification (Hastie Ch. 4)
           - Linear discriminant analysis and variations
       
            
           [Slide]

10/13  Lecture 5: Classification II [Fridlyand]
           - Tree-based methods (Hastie Ch. 9)
           - Ensembles: bagging, boosting, random forests (Hastie Ch. 10)

            
           [Slide]

10/20  Lecture 6: Classification III [Segal]
           - Support vector machines (Hastie Ch. 12)
           - Nearest centroid classifiers

           [Slide]

           Additional reference:
             1. A.R. Dabney. Classification of microarrays to nearest centroids. Bioinformatics, Sep 2005.
                       
10/27  Lecture 7: Model selection [Fridlyand / Segal]
           - Bias, variance and model complexity (Hastie Ch. 7)
           - Model search (forward/backwards/stochastic)
           - Model selection criteria: AIC/BIC
           - Cross validation and performance assessment
           - Application to estimating the number of clusters

           [Slide (Segal)]
           [Slide (Fridlyand)]

11/3    Lecture 8: Regression [Segal]
           - Penalization and selection (Hastie Ch. 3)
           - Continuous and survival endpoints

            
           [Slide]

11/10  Lecture 9: Annotations [Yeh]
           - Gene annotation
           - Functional annotation
           - cis-regulatory element annotation

           * SNP arrays

            
           [Slide]

11/17  Lecture 10: Case Studies [Fridlyand]
           - Clustering (cont.)
           - Array CGH
          
           * Student presentations

11/24  Thanksgiving holiday



Textbook:
The Elements of Statistical Learning by T. Hastie, R. Tibshirani, and J.H. Friedman. 2001. Springer.

Recommended readings:
  1. Bioinformatics and Computational Biology Solutions using R and Bioconductor, edited by R. Gentleman et al. 2005. Springer.
  2. Statistical Analysis of Gene Expression Microarray Data, edited by T.P. Speed. 2003. Chapman & Hall/CRC.
  3. A Primer of Genome Science by G. Gibson & S.V. Muse. 2001. Sinauer Associates.