STA 242: Introduction to Statistical Programming

Subject: STA 242
Title: Introduction to Statistical Programming
Units: 4.0
School: College of Letters and Science (LS)
Department: Statistics (STA)
Effective Term: 2009 Winter

Learning Activities

  • Lecture - 3.0 hours
  • Laboratory - 1.0 hours

Description

Essentials of statistical computing using a general-purpose statistical language. Topics include algorithms; design; debugging and efficiency; object-oriented concepts; model specification and fitting; statistical visualization; data and text processing; databases; computer systems and platforms; comparison of scientific programming languages.

Prerequisites

STA 130A; STA 130B; or equivalent of STA 130A and STA 130B.

Expanded Course Description

Summary of Course Content: 
The intent of the class is to ensure that graduate students have (i) a vocabulary for the basic computational tasks that they will encounter in their courses, research and career, and (ii) knowledge and experience to approach future computational tasks intelligently. Students will learn to use a general-purpose statistical programming language (e.g. R), to utilize its existing facilities and also to create new functionality for non-standard tasks. Programming exercises will focus on the following: running simulations, writing functions (e.g. random number generation), reading data in a variety of formats, exploring data with graphical and numerical summaries, modeling data, and reporting results of statistical analyses. The course may cover the elementary aspects of computationally intensive tasks. The computational topics in this course will be presented in the context of analyses of real scientific/social studies. We will use a variety of statistical topics as examples and exercises, including linear models and "modern" techniques such as statistical learning methods (e.g. CART, SVMs) and Bayesian computations (e.g. MCMC).

Illustrative Reading: 
1) Using R for Data Analysis and Graphics - Introduction, Examples and Commentary by J Maindonald.
2) Modern Applied Statistics with S by WN Venables and BD Ripley.
3) The Little SAS Book by LD Delwiche and SJ Slaughter.
4) Bioinformatics with R by R Gentleman.
5) A First Course in Statistical Programming with R by WJ Braun and DJ Murdoch.

Potential Course Overlap: 
STA 141 and STA 242 both introduce students to statistical programming, but with very different emphases and applications. STA 141 is an undergraduate course focused on the practical aspects of statistical computing for course work and employment in a variety of fields. STA 242 is a graduate course with an emphasis on computing for statistical research. STA 243, in turn, covers numerical and computational issues in statistics, including algorithms and matrix calculations. The statistical programming taught in STA 243 involves implementing these computations and complements, rather than duplicates, the programming taught in STA 242.