STA 15C Introduction to Statistical Data Science III


Learning outcomes:

1. Develop a deeper understanding of statistical methodology through the use of statistical software.
2. Understand the concepts of estimation and inference in the context of linear regression and ANOVA.
3. Understand the basics of the Bayesian paradigm for statistical inference.
4. Appreciate the distinction between parametric and nonparametric methods, and learn the basics of nonparametric statistics.
5. Develop a basic understanding of the idea of "resampling".

Course content:
1. Use of the R programming environment to summarize, interpret, and display the results of statistical methodologies.
2. Model fitting through maximum likelihood and least squares principles.
3. Quantification of uncertainty in the context of linear regression and ANOVA.
4. Introduction to Bayesian statistical models and methods.
5. Nonparametric methods – sign-based and rank-based procedures.
6. Resampling methods – permutation and bootstrap procedures for inference.
7. Effects of missing data in statistical analysis.

The course includes numerous case studies, with examples drawn from the natural and social sciences, to reinforce the importance of statistical concepts and methodologies in scientific investigation.

Illustrative Reading:
1. Ramsey, F. and Schafer, D. (2012). The Statistical Sleuth: A Course in Methods of Data Analysis, 3rd Edition. Cengage Learning.
2. Bruce, P. and Bruce, A. (2017). Practical Statistics for Data Scientists: 50 Essential Concepts. O'Reilly Media.

Potential Overlap:
There is partial overlap with material in STA 104, STA 106, STA 108, and STA 145. However, the focus here is on introducing the basic methodologies and integrating the concepts with computation and data analysis.

History:
None