To cite this article: Natalie J. Blades, G. Bruce Schaalje & William F. Christensen (2015) The
Second Course in Statistics: Design and Analysis of Experiments?, The American Statistician,
69:4, 326-333, DOI: 10.1080/00031305.2015.1086437
© 2015 American Statistical Association. DOI: 10.1080/00031305.2015.1086437. The American Statistician, November 2015, Vol. 69, No. 4, p. 326
mathematical preparation for courses like mathematical statistics or regression analysis. If students are excited by statistics, it seems a pity to abandon them to a year or two of pure mathematics before education in statistics resumes. Even if their interest in statistics does not wane during this period, students are liable to develop the attitude that statistics is just a subset of mathematics. The foundational fact that statistics forces a brave but risky confrontation of mathematics with the real world can be swamped or lost.

The purpose of this article is to propose design and analysis of experiments (DAE) as an especially appropriate second course for undergraduate statistics majors and to relate our experiences with DAE as the second course over the last 4 years. This proposal may appear to be either radical by its departure from most programs or archaic in its foundational emphasis on design. Undoubtedly some statistics programs use DAE as the second course (though none of the top ten programs as defined by number of statistics majors do) but the topic deserves fresh eyes and fresh discussion.

2. CURRENT SECOND COURSES IN STATISTICS

While the introductory statistics course prepares students as critical consumers of data and analyses, it is insufficient to prepare students to apply statistical methods in their own work (ASA 2005). Students may understand basic statistical concepts after a first course but they have not mastered the application of those ideas. The second course must begin to equip students with tools to produce rather than merely consume statistical analyses. The expected learning outcomes of the second course are not as standardized as those for the first course in statistics; these learning outcomes will depend on whether the second course is the terminal statistics course of the student's undergraduate training or the next step in an undergraduate statistics core for a statistics or analytics degree and they must reflect whether advanced training in mathematics or another discipline is presupposed.

Second courses in statistics have been developed for many audiences: second courses for students in statistics majors, minors, and concentrations, second courses for math majors, second courses for economics and psychology and sociology majors, and second courses for graduate students in a wide variety of disciplines. The new ASA curriculum guidelines (2014) acknowledge, "There is not a single definition of what is appropriate as a second course in statistics, and a number of different options can be found at many institutions." This is a big space that is dominated by mathematical statistics: More than half of nonintroductory statistics courses are taught in math departments and 72% of the nonintroductory statistics courses taught in math departments are mathematical statistics or probability (Blair, Kirkman, and Maxwell 2013a, b).

A terminal second course in statistics should be focused on breadth, not depth; it should give students tools for building models with quantitative or categorical responses and quantitative or categorical predictors. The relatively small (but growing) number of textbooks with the title or subtitle "Second Course in Statistics" apparently without exception imply that the appropriate topic for a second course is regression analysis. From Loveday's (1961) Statistics a Second Course and Mosteller and Tukey's eclectic and inimitable Data Analysis and Regression: A Second Course in Statistics (1977) through Regression: A Second Course in Statistics (Wonnacott and Wonnacott 1986) and Regression with Graphics: A Second Course in Applied Statistics (Hamilton 1991) to Dielman's Applied Regression Analysis: A Second Course in Business and Economic Statistics (2004), Lomax's Statistical Concepts (2007), and the most recent edition of Mendenhall and Sincich's A Second Course in Statistics: Regression Analysis (2011), the titles and subtitles of the last three decades point to the pervasiveness of this opinion; however, these texts presuppose a potentially outdated view of the mathematical preparation of students taking a second course: that these students already have the mathematical maturity encapsulated in integral and differential calculus, multivariate calculus, and linear algebra.

The last few years have seen the publication of several new textbooks updating the regression curriculum for second courses with progressive active-learning approaches: Kuiper and Sklar's Practicing Statistics: Guided Investigations for the Second Course (2012) aims to provide a case-based introduction to statistical modeling and Cannon et al.'s STAT2: Building Models for a World of Data (2012) provides a modular approach to linear regression, analysis of variance (ANOVA) and experimental design, and logistic regression. By focusing on broad conceptual understanding these texts introduce simple and multiple regression, logistic regression, ANOVA, and experimental design without including the mathematical details: both texts present the material assuming students have only an exposure to algebra.

In addition to these regression approaches there are many other possibilities that could be considered for a second course: introduction to probability, statistical computing, data science, data scraping, or algorithmic predictive modeling. Some programs are now allowing for flexibility in the entry into the major: the undergraduate statistics major at Harvard can choose to take linear models or data science or finance after an introductory statistics course, or he or she may take probability as the first course; at Berkeley a student may choose computing with data, concepts of probability, or concepts of statistics as a second course. In our own program, while most students take the DAE course described here as the second course, students could instead take discrete probability (an option popular among students who would like to prepare for the actuary's probability exam) or an introduction to statistical computing (in either R or SAS) as his or her second course; Figure 1 briefly displays these entry points into the statistics major at BYU. A second course that focused on algorithmic modeling would address Breiman's (2001) concern regarding the heavy emphasis on data models in our profession; Draper (1987, 1995) also called for more predictive modeling to identify how the past and future are connected by comparing predictions to observable reality and Cobb (2015) further addressed this conflict between algorithmic modeling and generative data models.

When the second course is not terminal, but rather part of the statistics core for a B.S. in Statistics, the material could be presented at greater depth. Rather than a survey of advanced methods taught without accompanying mathematical detail, the
course could facilitate a deeper understanding in what will be the second of many applied and theoretical statistics courses. Such an approach would sacrifice breadth in the second course; however, that breadth will be acquired by the end of the program.

Additionally, the appropriate second course must work within the structure of the university. At BYU we are concerned with the curriculum for the second course for statistics majors housed in a department of statistics with 19 full-time tenure-track faculty. Many of our majors have not finished calculus at the time they declare a major in statistics. We require roughly 50 credits for a B.S. Statistics degree (out of the 120 credits required for graduation). This allows for more extensive study beyond the introductory course: a required core of 6 credits of methods, 6 credits of theory, 6 credits of computing, 8–14 credits of math, and 24 upper-division credits. In this context, we suggest DAE has proven to be an appropriate second course.

3. DESIGN OF EXPERIMENTS AS THE SECOND COURSE

From the founding of BYU's Department of Statistics (with concomitant creation of a B.S. in Statistics) in 1960 until 2010, the second course in statistics at BYU was regression analysis. The specific content of the second course drifted as textbooks and software changed, but not in a major way. The question of how to balance the introduction of interesting statistical topics with the need for mathematical rigor was initially not a problem because most statistics majors were sophomore- or junior-level transfer students from engineering or mathematics who had already completed multivariate calculus and linear algebra. As the major began to attract first-year students, this balance problem was approached in several ways (none of which were fully satisfactory): enforce prerequisites that delayed students from taking the second course until multivariate calculus and linear algebra had been completed, cover requisite concepts from calculus and linear algebra as part of the course, require concurrent enrollment, and so forth.

In 2010, a second course in DAE was developed to provide a platform for students with only a single introductory statistics course (possibly AP Statistics) to start understanding fundamental concepts of statistics. This DAE-based second course covers the scientific method, statistical thinking, sources of variation, randomized factorial designs, power, and sample size calculations. Students are encouraged to enroll concurrently in differential calculus; students who have already completed differential calculus are encouraged to enroll in integral or multivariate calculus, as appropriate. This second-course DAE covers the material in sufficient depth that an undergraduate statistics major would have appropriate familiarity with the design of experiments without requiring additional courses; students who wish to explore design of experiments further are well prepared for an elective course (using, e.g., Lawson 2014).

The expected learning outcomes for second-course DAE include

1. Defining the experimental unit, response variable, factor(s), and level(s) of a basic experiment;
2. Understanding the role of randomization and replication in inferring causation;
3. Performing a completely randomized design and constructing the ANOVA table in SAS and R;
4. Computing the minimum number of replicates in a completely randomized design to achieve a given level of power;
5. Computing pairwise tests of differences in means in SAS and R to understand a significant overall F-test;
6. Performing a randomized complete block design and constructing the ANOVA table in SAS and R;
7. Performing a factorial design and constructing the ANOVA table in SAS and R;
8. Explaining a statistically significant interaction;