Statistics 1: Introduction to ANOVA, Regression, and Logistic Regression



This course is for SAS software users who perform statistical analyses using SAS/STAT software. The focus is on t-tests, ANOVA, linear regression, and logistic regression. This course (or equivalent knowledge) is a prerequisite to many of the courses in the statistical analysis curriculum.


Related Certifications

  • SAS Certified Clinical Trials Programmer Using SAS 9
  • SAS Statistical Business Analysis Using SAS 9: Regression and Modeling


Who Should Attend

Statisticians, researchers, and business analysts who use SAS programming to generate analyses using either continuous or categorical response (dependent) variables.


Prerequisites

  • Completion of an undergraduate course in statistics covering p-values, hypothesis testing, analysis of variance, and regression
  • Ability to execute SAS programs and create SAS data sets

Note: You can gain this experience by completing the SAS Programming 1: Essentials course.

Learning Objectives

  • Generate descriptive statistics and explore data with graphs
  • Perform analysis of variance and apply multiple comparison techniques
  • Perform linear regression and assess the assumptions
  • Use regression model selection techniques to aid in the choice of predictor variables in multiple regression
  • Use diagnostic statistics to assess statistical assumptions and identify potential outliers in multiple regression
  • Use chi-square statistics to detect associations among categorical variables, and fit a multiple logistic regression model

1. Course Overview and Review of Concepts

  • Descriptive statistics
  • Inferential statistics
  • Examining data distributions
  • Obtaining and interpreting sample statistics using the UNIVARIATE procedure
  • Examining data distributions graphically in the UNIVARIATE and FREQ procedures
  • Constructing confidence intervals
  • Performing simple tests of hypothesis
  • Performing tests of differences between two group means using PROC TTEST

2. ANOVA and Regression

  • Performing one-way ANOVA with the GLM procedure
  • Performing post-hoc multiple comparisons tests in PROC GLM
  • Producing correlations with the CORR procedure
  • Fitting a simple linear regression model with the REG procedure

3. More Complex Linear Models

  • Performing two-way ANOVA with and without interactions
  • The concepts of multiple regression

4. Model Building and Effect Selection

  • Using automated model selection techniques in PROC GLMSELECT to choose among several candidate models
  • Interpreting and comparing selected models

5. Model Post-Fitting for Inference

  • Examining residuals
  • Investigating influential observations
  • Assessing collinearity

6. Model Building and Scoring for Prediction

  • The concepts of predictive modeling
  • The importance of data partitioning
  • The concepts of scoring
  • Obtaining predictions (scoring) for new data using PROC GLMSELECT and PROC PLM

7. Categorical Data Analysis

  • Producing frequency tables with the FREQ procedure
  • Examining tests for general and linear association using the FREQ procedure
  • Exact tests
  • The concepts of logistic regression
  • Fitting simple and multiple logistic regression models using the LOGISTIC procedure
  • Using automated model selection techniques in PROC LOGISTIC, including interaction terms
  • Obtaining predictions (scoring) for new data using PROC PLM