A central issue in many current large-scale scientific studies is how to assess statistical significance while accounting for the multiple hypothesis testing inherent in them. This graduate course provides an in-depth treatment of the topic in the context of data science, with a focus on statistical 'omics.' We start by revisiting single hypothesis testing, the building block of multiple hypothesis testing. We then study the fundamental elements of multiple hypothesis testing, including control of the family-wise error rate and the false discovery rate. We will also touch upon more advanced topics such as data integration, selective inference, and the fallacy of p-values. The course will provide both analytical arguments and empirical evidence.
Students are evaluated on class participation and a final research report on a suggested or self-selected project related to multiple hypothesis testing.