
Running head: RTI: DUAL-DISCREPANCY POPULATION

Response to Intervention: Empirical Demonstration of a Dual-Discrepancy Population

via Random Effects Mixture Models



Abstract

Response to Intervention (RtI) is a commonly used framework to identify students in need of

additional or specialized instruction. Special education eligibility decisions within RtI rely on the assumption that there are subpopulations of students: those who demonstrate appropriate growth when provided specialized instruction and those who do not.

The purpose of the present study was to illustrate the use of random-effects mixture models

(RMMs) to estimate the likely number of (unobserved) subpopulations within one curriculum-

based measurement of oral reading (CBM-R) progress monitoring dataset. The dataset comprised

second grade students’ CBM-R data collected weekly over 20 weeks. RMMs were fit with

several numbers of classes, and a two-class model best fit the data. Results suggest that RMMs

are useful to understand subpopulations of students who need specialized instruction. Results

also provide some empirical support for the use of a dual-discrepancy model of learning

disability identification within RtI.

Keywords: CBM Reading, random-effects, mixture model, progress monitoring, response

to intervention.

Response to Intervention: Empirical Demonstration of a Dual-Discrepancy Population

via Random Effects Mixture Models

Student progress is often monitored to establish a time-series dataset that is used to

evaluate the rate of improvement (ROI) with respect to a target skill. This practice was initially

conceptualized as idiographic analysis, which was used to estimate student progress and evaluate

instructional effects (Deno & Mirkin, 1977; Deno, 1985, 1986, 1990). More recently, that

approach was incorporated into federal law (Individuals with Disabilities Education Act [IDEA],

2004) and professional practice (Batsche et al., 2005; Christ & Poncy, 2005) as Response to

Intervention (RtI). RtI is a resource allocation model in which all students are screened to ensure

expected progress is made (Fuchs & Fuchs, 2006). Students not meeting expectations are

provided targeted intervention services. RtI is an alternative to the diagnostic-prescriptive

approaches of the past (Fuchs & Fuchs, 2006), which were used to inform service delivery and

diagnosis of specific learning disabilities. For example, the RtI dual-discrepancy model

prescribes that a student may be eligible for special education services due to a learning

disability if there is a demonstrated deficit in both the level of achievement and rate of

improvement (ROI; Fuchs, 2003; Fuchs & Fuchs, 2006; Fuchs, Fuchs, & Speece, 2003). This is

an important concept with respect to this study because there is little existing empirical evidence

that the available data are sufficient to inform such decisions.

Curriculum-Based Measurement of Oral Reading (CBM-R; Deno et al., 1986, 2001)

emerged as a primary approach to monitor student progress in reading (Ardoin, Christ, et al.,

2013). CBM-R is widely recommended and used as part of RtI to measure and evaluate the level

and ROI for use in a RtI dual discrepancy model. In brief, CBM-R data are collected by an adult

who listens to a child read aloud for one minute and records the number of words read correctly

per minute (WRC). Data are collected approximately weekly with one of 20 or so alternate

forms. Each CBM-R form comprises a passage with approximately 250 to 350 words.

Once CBM-R data are collected, the level of achievement and ROI are derived. The ROI

is typically derived with visual analysis or a statistical method. Ordinary least squares regression

(OLSR) is the most widely recommended and used statistical approach to estimate ROI (Ardoin,

Christ, et al., 2013). The level and ROI of student achievement are compared to either norms or

benchmarks to facilitate interpretation. Published estimates of ROIs indicate that typically-

developing students gain approximately 1.4 WRC per week and students who receive special

education services gain approximately 0.84 WRC per week when in first and second grades (Deno

et al., 2001). Other estimates published in the peer-reviewed literature (Fuchs et al., 1993) and by

vendors (e.g., AIMSweb, Pearson, 2012; DIBELS, Good & Kaminski, 2011; FastBridge

Learning, 2018) are generally consistent with those ROIs.

Typically-developing first and second grade students often read 30 to 70 WRC, but their

level of performance often fluctuates +/-10 WRC from one CBM-R administration to the next

(Van Norman, 2015; see also Ardoin & Christ, 2009; Christ, 2006; Hintze & Christ, 2004; Christ &

Silberglitt, 2007). This variability of CBM-R data over time creates substantial challenges to

estimate a ROI with precision (cf. Christ, Zopluoglu, Long, et al., 2012). The precision of those

ROIs is relatively low, and often student performances are too variable across time for OLSR

estimates to be reliable. Published estimates for the standard error of ROI suggest they often

exceed 3 WRC per week, such that a 68% confidence interval around a moderate ROI may be

1.28 +/- 3.00 WRC per week (range, -1.72 to 4.28 WRC per week). Statistical models such as the random-effects model may be advantageous in this context because they do not analyze each student's data separately; instead, they borrow information across students to estimate ROI, which makes the estimates more precise. The OLSR model considers only one student's data at a time, which leaves growth estimates more susceptible to outliers or anomalies in the data. In contrast, random-effects models assume the underlying growth structure of all students is similar; the fit for each student therefore borrows strength from that common distribution and is less prone to be unduly influenced by a few outlying points.

Although there is some published work (Ardoin & Christ, 2009; Deno et al., 2001), there

is very little empirical work to establish that students with and without learning difficulties

demonstrate unique and atypical ROIs. Furthermore, much of the prior work either aggregated

ROIs across potentially distinct populations (Fuchs et al., 1993) or pre-classified students into

the general education or special education populations (Ardoin & Christ, 2009; Deno et al.,

2001). These two approaches either ignored the possibility of distinct groups within the

population (Fuchs et al.) or presupposed that distinct groups existed (Deno et al.).

Random-effects Mixture Models (RMMs)

Statistical methods have emerged in the last few decades to empirically evaluate the

existence of distinct subgroups in a population and evaluate the confidence of model-based

classifications. These methods are an extension of random-effects models, called random-effects

mixture models (RMMs), and have been demonstrated useful for this purpose (Muthén, 2001;

Muthén & Muthén, 2000; Muthén & Shedden, 1999).

Random-effects models. Random-effects models are used to estimate the overall growth

(i.e., mean intercept, and mean slope) and the variability around the growth coefficients for any

given repeated measures dataset. Unlike OLSR, these models include coefficients (parameters)

that are shared by all students in a population (i.e., population or mean intercept, and mean slope)

in addition to unique student-specific parameters (i.e., deviation from the mean intercept, and

mean slope; also known as random effects). Suppose $\mathbf{Y}_i$ is the vector of outcomes (words read correctly per minute) for the $i$th student in the sample, and $\mathbf{X}_i$ is a matrix of predictors that help to explain the variation in the outcome for the $i$th student. Assuming $\mathbf{Y}_i$ has a linear relationship with $\mathbf{X}_i$, a random-effects model has the form

$$\mathbf{Y}_i = \mathbf{1}(\alpha + a_i) + \mathbf{X}_i(\boldsymbol{\beta} + \mathbf{b}_i) + \boldsymbol{\varepsilon}_i \qquad (1)$$

where $\alpha$ is the overall intercept and $\boldsymbol{\beta}$ is a vector of overall linear slope coefficients for the predictor variables $\mathbf{X}_i$, shared by all students in the sample. The coefficients $a_i$ and $\mathbf{b}_i$ are student-specific random effects; these coefficients vary from student to student and capture the between-student variability. The random effects $a_i$ and $\mathbf{b}_i$ are assumed to have a normal distribution with a mean of zero and a variance-covariance matrix $\boldsymbol{\phi}$. The vector $\boldsymbol{\varepsilon}_i$ contains random errors, often assumed to have a mean-zero normal distribution, and the random effects and random errors are assumed to be independent of each other. Random-effects models are frequently used to analyze repeated measures data, where each element of $\mathbf{Y}_i$ corresponds to the value of the outcome (e.g., WRC) at a particular point in time (a particular week).
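As a concrete illustration, a model of this form with week as the single predictor can be fit with standard mixed-model software. The sketch below uses the lme4 package in R; the simulated data frame pm (columns student, week, and wrc) is a hypothetical stand-in for a long-format progress monitoring dataset, and all parameter values are illustrative, not estimates from this study.

```r
# Minimal sketch of the random-effects model in Equation (1), with week as
# the single predictor. The data frame `pm` is simulated and hypothetical.
library(lme4)

set.seed(42)
pm <- do.call(rbind, lapply(1:50, function(s) {
  data.frame(student = factor(s),
             week    = 1:20,
             wrc     = 20 + rnorm(1, 0, 8) +               # student intercept shift
                       (1.2 + rnorm(1, 0, 0.3)) * (1:20) + # student-specific slope
                       rnorm(20, 0, 7))                    # weekly measurement noise
}))

# (week | student) requests a random intercept and random slope per student,
# along with their variance-covariance matrix (phi in Equation 1).
fit <- lmer(wrc ~ week + (week | student), data = pm)
fixef(fit)   # overall intercept alpha and overall slope beta
ranef(fit)   # student-specific deviations a_i and b_i
```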

Random-effects mixture models (RMMs). One of the assumptions made by the

random-effects model is that all the individuals in the dataset are homogeneous. That is, all

individuals belong to the same underlying population, and thus share a similar overall growth

pattern. However, there are scenarios where the individuals in the dataset may not be

homogeneous. The dataset may consist of two or more distinct unobserved classes of individuals,

and these classes must be discerned from the data empirically. To make it more concrete, it is

possible that in a given progress monitoring dataset, there may be a class of students who exhibit

greater reading growth over time than other students. In this case, RMMs are more

appropriate to model population heterogeneity, along with between- and within-individual



heterogeneity. The RMMs allow for the classification of unique subpopulations within a sample

of individuals. In the context of the current study the application of RMMs is preceded by testing

the assumption about whether there are indeed unique subgroups that can be empirically

identified on the basis of CBM-R level and ROI. Although it is often assumed that subgroups of

students exist (Deno et al., 2001), distinct subgroups based on those data have yet to be empirically

established.

In RMMs, each class is defined by its own set of regression coefficients (mean growth curve parameters of intercept $\alpha$ and slope $\beta$, with the variance around the mean growth curve estimated by the random effects) and random error ($\varepsilon_i$) parameters. A general $c$-class mixture formulation for, say, the slope parameter can be written as

$$\beta \sim \sum_{k=1}^{c} \pi_k \, N(\mu_k, \sigma_k^2), \qquad (2)$$

where $\pi_k$ is the class mixing proportion, with $\sum_{k=1}^{c} \pi_k = 1$, and $\mu_k$ and $\sigma_k^2$ are, respectively, the mean and variance of the slope for the $k$th class. In the context of the example described above, a mixture model with two classes ($c = 2$) can be used to fit data where one class comprises students showing higher growth over time and is characterized by a larger value of $\mu_k$, and the other class comprises students showing smaller reading growth and is defined by a smaller value of $\mu_k$. A $c$-class mixture distribution for the intercept and the random error can be defined similarly.
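To make Equation (2) concrete, the short simulation below draws student slopes from a two-class normal mixture; all parameter values are illustrative placeholders, not estimates from this study.

```r
# Simulate slopes from a two-class normal mixture as in Equation (2):
# beta ~ pi_1 N(mu_1, sigma_1^2) + (1 - pi_1) N(mu_2, sigma_2^2).
set.seed(123)
n       <- 215
in_cls1 <- rbinom(n, 1, 0.25) == 1                   # mixing proportion pi_1
slope   <- ifelse(in_cls1,
                  rnorm(n, mean = 0.85, sd = 0.20),  # lower-growth class
                  rnorm(n, mean = 1.50, sd = 0.25))  # higher-growth class
hist(slope, breaks = 30, main = "Slopes from a two-class mixture")
```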

Purpose

This study extends previous research with an empirical evaluation of one CBM-R

progress monitoring dataset to determine whether there are two or more distinct classes based on

the level and/or ROI of CBM-R data. The number of identified classes estimates the likely number of distinct

subpopulations, and those subpopulations might differ by initial level of performance (intercept)

and ROI (slope). There were three research questions that guided the study.

1. Are there two (or more) empirically derived classifications of the CBM-R cases

when using both the initial level of achievement and ROI? We hypothesized that

there would be at least two latent classes based on empirical classification.

2. Are there unique classifications derived from ROI alone? We hypothesized that

ROI alone could support unique classifications.

3. If so, what is the relative confidence of those classifications?

Methods

Dataset

The dataset analyzed in this study consists of 215 second graders’ weekly CBM-R data. It

was a subset of a large dataset of students across states, schools, and grades who received

additional reading intervention based on insufficient level of performance in CBM-R. The 215

cases were chosen through the application of selection criteria, as follows.

Second graders’ data were selected because students in lower grades typically

demonstrate higher rates of reading growth (FastBridge Learning, 2018; Fuchs et al., 1993;

Pearson, 2012). We estimate 20% to 40% of the total population of students received additional

reading intervention. All students who were considered in the data analysis in this manuscript did

receive additional reading intervention, at the “Tier 2” level of an RtI service delivery model.

Further details about the type and intensity of intervention provided to students were not

available. The subset was further filtered to exclude three students who had only one or two progress

monitoring data points across 20 weeks. To aid in the generalizability of the results, the dataset

included students across schools and passage sets. Different schools used one of three different

probe sets: AIMSweb (n = 147; Pearson, 2012), FAST (n = 51; FastBridge Learning, 2018), and DIBELSNext (n = 20; Good & Kaminski, 2011); we present analyses on possible passage effects

in the discussion section. All examiners were trained in the administration of CBM-R, and

standard procedures were followed: words were provided after hesitations of three seconds, and

words read correctly were counted.

The median number of probes administered to each student was 17 (190 students

completed 13 to 18 probes) with a minimum of 11 (1 student) and a maximum of 20 (4 students).

Of the 215 students, 187 had scores collected at most once per week, while 25 students had two

scores in one week and one score for the remaining 19 weeks, and 3 students had two scores in

two weeks and one score in each of the remaining 18 weeks. Consistent with the data selection criteria, every retained student had data spanning the full 20 weeks. We modeled growth of CBM-R WRC scores over time with the unit of time being the week; when multiple scores fell within one week, we used the corresponding fractional week values. For example, a score collected midway between week 1 and week 2 was coded as week = 1.5. CBM-R WRC scores ranged from 0 (minimum) to 123 (maximum), with a median and mean of 47; 25% of all scores fell below 31 and 75% fell below 63.

Data were collected and analyzed in accordance with ethical guidelines for research with

human subjects. The appropriate Institutional Review Board approved the present study.

Analytic Procedures

For each student we had CBM-R WRC scores collected over the duration of 20 weeks.

The analytic procedures comprised several steps. First, we calculated the intraclass correlation

coefficient (ICC). The ICC is a descriptive statistic that quantifies the proportion of the total variance in the outcome variable (CBM-R scores) attributable to between-subject rather than within-subject variability. The ICC value indicates whether there is sufficient between-subject variability in the dataset to justify the application of RMMs. For the present study the ICC value was satisfactory, so we proceeded to fit RMMs.

In the second step, we fitted RMMs with varying numbers of classes; that is, we fitted RMMs with one class, two classes, and three classes. The goal of fitting RMMs with different numbers of classes was to ascertain empirically whether more than one distinct class of students exists based on the level and/or ROI of CBM-R data. Of the three models fitted to the dataset, the best fitting model was selected based on a model fit criterion, the BIC (Bayesian Information Criterion; Kass & Raftery, 1995). The BIC is defined as

$$\text{BIC} = -2 \log_e L + p \log_e n \qquad (3)$$

where $p$ is the number of free parameters to be estimated, $n$ is the number of data points, and $L$ is the maximized value of the likelihood function. The BIC can be calculated for each model fit, and the model with the smallest BIC is said to fit the dataset best. In the third step, the

selected RMM with $c$ classes (based on the BIC) was then fitted using two alternative approaches. In the first approach, the RMM was fitted assuming a class-constant intercept-slope covariance matrix, and in the second approach, the RMM was fitted assuming a class-specific intercept-slope covariance matrix. The objective of fitting the RMM using these two approaches was to compare the resulting model fits and determine which RMM better captures the underlying structure of the dataset at hand. In the last step, we fitted the selected RMM (based on the previous step) to intercept-

centered data to determine how the results would differ when the estimated empirical

classification of individual students into classes was conducted solely on the basis of growth and

not using the initial level of performance.



We assumed each student’s reading proficiency grew linearly over time, as is commonly

found in previous literature (Good & Shinn, 1990; Christ, 2006; Christ, Zopluoglu, Long, et al.,

2012). To test the assumption of linearity, we plotted a random sample (n = 30) of the individual

students’ growth trajectories, along with the mean of the OLSR fits for those students to visually

inspect the pattern. As seen in Figure 1, there is no clear visual evidence of non-linearity in the

data. This evidence was further corroborated when we fitted a quadratic function to the data and

the quadratic coefficient was found to be negligible.
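Such a check can be scripted directly; the sketch below adds a quadratic week term to the growth model, reusing the simulated pm data frame from the earlier sketch.

```r
# Check linearity by adding a quadratic week term; a coefficient near zero
# (relative to the linear slope) is consistent with linear growth.
library(lme4)

fit_quad <- lmer(wrc ~ week + I(week^2) + (week | student), data = pm)
fixef(fit_quad)["I(week^2)"]   # should be negligible when growth is linear
```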

Intraclass Correlation Coefficient. As discussed in the Introduction section, each

individual student’s CBM-R data over successive weeks are often highly variable (indicating large

within-subject variability of reading scores). Furthermore, if tests were administered more than

once per week, it is possible for two or more successive reading scores to be dissimilar. If the

within-subject variation is too high, it may be a source of concern because it could lead to

unreliable estimates of growth obtained by fitting statistical models to the dataset. Therefore,

before the RMM can be fitted to data, it must be verified whether the variation in each individual

student’s weekly data is substantially smaller than the variation across all the students in

the sample (i.e., between-subjects). To assess this, ICC (Raudenbush & Bryk, 2002, see Ch. 2, 4)

was calculated based on the linear mixed-effects model. That is,

$$y_{ij} = (\alpha + a_i) + \beta_1 X_{ij} + \varepsilon_{ij} \qquad (4)$$

where $y_{ij}$ is the $j$th (weekly) observation for the $i$th student, $\alpha$ is the overall intercept, $\beta_1$ is the overall weekly linear slope, $X_{ij}$ is the week of the $j$th observation for the $i$th student, $a_i$ is the $i$th student's random effect for the intercept term, and the $\varepsilon_{ij}$ are normally distributed random error terms. The coefficients $\alpha$ and $\beta_1$ are fixed effects (i.e., the same for all students). Because the ICC is defined only for random-intercept models, we included only an overall (fixed) slope in the model. The ICC is given by the following expression:

$$\text{ICC} = \frac{\text{variance between students}}{\text{total variance}} = \frac{\operatorname{Var}(a_i)}{\operatorname{Var}(a_i) + \operatorname{Var}(\varepsilon_{ij})} \qquad (5)$$
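In software, the ICC in Equation (5) can be read off the fitted variance components; a sketch with lme4, reusing the simulated pm data frame from the earlier sketch:

```r
# ICC = Var(a_i) / (Var(a_i) + Var(eps_ij)), per Equation (5), from the
# random-intercept model in Equation (4). `pm` as in the earlier sketch.
library(lme4)

fit_ri <- lmer(wrc ~ week + (1 | student), data = pm)  # random intercept only
vc  <- as.data.frame(VarCorr(fit_ri))  # row 1: intercept var; row 2: residual
icc <- vc$vcov[1] / sum(vc$vcov)
icc   # the study reports 0.70 for its dataset
```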

The ICC value for the dataset was 0.70, indicating that 70% (a substantial part) of the

total variability was due to variability from one student to another, and only 30% of the total

variability was due to variability in an individual student’s weekly scores. The large ICC value

confirmed that a random intercepts model was warranted for the dataset. However, it did not tell

us whether the students differ from each other with respect to weekly slope. Thus, to gain insight

about the between-subject variability as a function of weekly slope only, we later show the

results from fitting RMM to intercept-centered data. Note that the definition of ICC is useful and

interpretable only for the random intercepts model; hence, it does not make sense to re-calculate

the ICC value using the intercept-centered data. In sum, the ICC indicated sufficiently large between-student variability; the following sections briefly describe the different variations of RMMs that were then fit to the data.

RMM with class constant intercept-slope covariance structure. Here we describe an

RMM that allows for the estimation of class specific mean growth coefficients and residual

variances, but only class constant intercept-slope covariance structure (i.e., same variance of

intercept and slope for both classes, and same covariance between intercept and slope for both

classes). This model encodes the assumption that the rate of reading growth differed across classes, but that the student-to-student variation around the mean growth coefficients was constant across classes. In this $c$-class RMM, the data for the $i$th student are modeled as

$$y_{ij} = \beta_{0i} + \beta_{1i} X_{ij} + \varepsilon_{ij} \qquad (6)$$

where $i = 1, 2, \ldots, m$ ($m$ = number of students) and $j = 1, 2, \ldots, n_i$ ($n_i$ = number of time points for student $i$). The coefficient $\beta_{0i}$ is the random intercept for student $i$ and $\beta_{1i}$ is the random linear slope for student $i$, with $\beta_{0i} = \beta_0 + b_{0i}$ and $\beta_{1i} = \beta_1 + b_{1i}$. The mixture structure described in (2) was imposed on the overall intercept $\beta_0$ and the overall linear slope $\beta_1$, allowing for the estimation of a class-specific mean intercept and mean slope, respectively. A class-specific mixture structure was also placed on the residual variance of $\varepsilon_{ij}$. The covariance structure of the random effects, $b_{0i}$ and $b_{1i}$, was assumed to be constant across classes.

RMM with class specific intercept-slope covariance structure. In the model described

above, we assumed that the variation around the mean growth coefficients from student-to-

student was constant across classes. This assumption may not hold in practice. Thus, we

considered an RMM with class specific intercept-slope covariance structure, along with class

specific mean growth coefficients and residual variances. That is, for each class we estimated a class-specific mean intercept and mean slope, intercept and slope variances, intercept-slope covariance, and residual variance. We fitted both kinds of

RMMs (i.e., class constant, and class specific intercept-slope covariance structure) and compared

the fit of the two models to determine which model fit the data better.

Covariance matrix for repeated measures residuals. Repeated measures data may

have autocorrelated residuals over time. If we have a first order autocorrelation of the residuals

over time, then the residuals can be modeled as

$$\varepsilon_t = \rho \varepsilon_{t-1} + \omega_t,$$

where $|\rho| < 1$ and the $\omega_t$ are independent and identically distributed $N(0, \sigma^2)$ errors. This equation implies that the residual (the unexplained variation in the outcome $y_{ij}$) at time $t$ depends on the residual at the previous time point $t - 1$ through the parameter $\rho$. If $\rho = 0$, there is no first-order autocorrelation in the repeated measures data.

For each student, we calculated the residuals from the model fit and then tested $\rho = 0$. The test revealed that first-order autocorrelation was present in the data of only 5 of the 215 students. Because autocorrelation was absent for the large majority of students, we did not incorporate an autoregressive error structure in our model; instead, we used the unstructured covariance form, which imposes no restrictions on the covariance structure.
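The paper does not name the specific test of $\rho = 0$; one simple screening approach, sketched below under that caveat, fits OLSR per student and applies a Ljung-Box test of lag-1 autocorrelation to the residuals (again with the simulated pm data frame).

```r
# Per-student screen for lag-1 autocorrelation in OLSR residuals. The choice
# of the Ljung-Box test is an assumption; the paper does not specify a test.
has_ar1 <- vapply(split(pm, pm$student), function(d) {
  d <- d[order(d$week), ]
  r <- resid(lm(wrc ~ week, data = d))
  Box.test(r, lag = 1, type = "Ljung-Box")$p.value < 0.05
}, logical(1))
sum(has_ar1)   # the study found autocorrelation for 5 of 215 students
```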

Model fitting. The models were fitted by maximum likelihood via the Expectation-Maximization (EM) algorithm in R. The EM algorithm is iterative; estimation stops when the difference between the likelihood values of two successive iterations is less than 0.001. Specifically, we used the regmixEM.mixed function from the mixtools package in R (Benaglia et al., 2009). Random-effects mixture models are computationally intensive because they require the estimation of a large number of coefficients, so a set of starting values is often provided to the algorithm to ease the computation. For the current study we obtained starting values by fitting a one-class RMM (which is equivalent to fitting a random-effects model) to the data; the estimated coefficients then served as sensible starting values for the overall mean parameters of the two- and three-class RMMs. Note that the regmixEM.mixed function does not provide standard errors for any estimates.
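A minimal sketch of the two-class fit follows, reusing the simulated pm data frame from the earlier sketch. The list construction is an assumption about data preparation, the argument choices mirror the specification above (class-constant random-effects covariance, class-specific residual variance, 0.001 likelihood tolerance), and output field names such as loglik and posterior.z follow the mixtools documentation.

```r
# Sketch of the two-class RMM fit via mixtools::regmixEM.mixed, which takes
# one response vector and one design matrix per student, supplied as lists.
library(mixtools)

by_student <- split(pm, pm$student)            # `pm` from the earlier sketch
y_list <- lapply(by_student, function(d) d$wrc)
x_list <- lapply(by_student, function(d) cbind(d$week))

set.seed(1)
fit2 <- regmixEM.mixed(y = y_list, x = x_list, k = 2,
                       arb.R     = FALSE,   # class-constant covariance (Model A)
                       arb.sigma = TRUE,    # class-specific residual variance
                       epsilon   = 1e-3)    # 0.001 likelihood tolerance

# BIC per Equation (3); p_free (the number of free parameters) must be
# counted for the chosen specification before this line is usable.
n_obs <- length(unlist(y_list))
# bic <- -2 * fit2$loglik + p_free * log(n_obs)

head(fit2$posterior.z)   # posterior class membership probabilities
```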



Results

Approximately 3800 CBM-R scores across 215 student cases were retained for analysis.

No extreme values were trimmed, as all scores were inside the range of likely values for second

grade students. Assumptions of normality for CBM-R scores were examined via scatterplots and

descriptive statistics and were met for the purposes of the analyses. Descriptive statistics

obtained from fitting the OLSR model to each student in the dataset are presented in Table 1.

The following section summarizes the results obtained from fitting the RMMs described in the

Methods section.

Random-effects Mixture Models with One to Three Classes

To answer the first research question, three RMMs with different numbers of classes were fitted to the data: one-class ($c = 1$), two-class ($c = 2$), and three-class ($c = 3$) RMMs. Class-specific means and residual variances and a class-constant intercept-slope covariance structure were assumed. The BIC was calculated for the one-, two-, and three-class model fits, and the model with the lowest BIC was selected. The BIC values for the one-, two-, and three-class models were 37621, 31082, and 33352, respectively; based on these values, we selected the two-class model. Note that all models fitted to this point had class-specific means and residual variances and a class-constant intercept-slope covariance structure. Below we describe the results obtained from fitting the two-class RMM with a class-constant intercept-slope covariance structure and the two-class RMM with a class-specific intercept-slope covariance structure.

Two-class Model with Class Constant Intercept-Slope Covariance Structure

The parameter estimates obtained for this model are shown in Table 2 (we labeled this

model as Model A in the table). The slope estimates for the two-class model were 0.86 for class 1 and 1.49 for class 2. Forty-seven students were classified into class 1, and the

remaining 168 students were classified into class 2. Though all students in the dataset received

additional reading instruction, students in class 2 had an estimated mean slope that aligns with

recommended reasonable growth rates in previous literature (Deno et al., 2001; Fuchs et al.,

1993), and students in class 1 had an estimated mean slope that aligns with performance of

students receiving special education services (Deno et al., 2001). Thus, the two-class RMM

empirically identified the particular group of students who may need further additional

instruction. Table 3 shows descriptive statistics obtained from fitting the OLSR model to

students in each of the two classes.

Probability that a student belongs to a particular class. One goal of RMMs is to classify students empirically into the classes. For the dataset at hand, we computed each student's probability of belonging to class 1 (estimated $\pi_1$) and of belonging to class 2 (estimated $1 - \pi_1$). We then assigned each student to the class with the higher probability, and we call that probability the classification probability.

We classified the students using the two-class RMM; however, we needed to verify that

the class assignment had been conducted with a sufficient degree of confidence. In other words, it is desirable that each student have a clearly high probability of belonging to a particular class, as

opposed to there being a 50% chance of students belonging to either class. The left panel of

Figure 2 shows the distribution of the classification probabilities for all the students. Based on

the figure, it is evident that most of the students had a high probability (close to 1) of being

classified into a particular class. In summary, students were classified with a satisfactory degree of

confidence.

Entropy of classification. Another measure used to assess the quality of the empirical classification in a two-class model is the entropy, which quantifies the amount of uncertainty in the classification. The entropy of classification was computed to evaluate the degree of confidence in the empirical classification. For the RMM with two classes, say $a$ and $b$, the entropy is defined as

$$\text{Entropy} = -p(a) \log_2(p(a)) - p(b) \log_2(p(b)) \qquad (7)$$

where $p(a)$ is the probability of a student being classified into class $a$. Entropy takes values in $[0, 1]$, and a value of 0 indicates perfect delineation of the classes (Celeux & Soromenho, 1996). The entropy of classification for the two-class RMM ($p(a) = 0.22$) was 0.04, which is satisfactorily close to 0.
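The paper does not state the exact operationalization behind the reported value; the sketch below computes the mean per-student entropy from posterior class probabilities, one common choice that approaches 0 when assignments are confident. The matrix post is an illustrative placeholder (in practice it could come from, e.g., fit2$posterior.z in the earlier sketch).

```r
# Mean per-student classification entropy from an n x 2 matrix of posterior
# class probabilities; values near 0 indicate confident classification.
# `post` is an illustrative placeholder matrix.
post <- rbind(c(0.98, 0.02),
              c(0.95, 0.05),
              c(0.10, 0.90))
row_entropy <- function(p) {
  p <- p[p > 0]                       # drop zeros to avoid log2(0)
  -sum(p * log2(p))
}
mean(apply(post, 1, row_entropy))     # the study reports 0.04 for its data
```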

Two-class Model with Class Specific Intercept-Slope Covariance Structure

We fitted a two-class RMM with class specific means, residual variances and class

specific intercept-slope covariance structure (labeled as Model B in Table 2) to compare its fit to

the fit of the two-class RMM with class specific means and residual variances, and class constant

intercept-slope covariance structure (i.e., Model A). This enabled us to see if the incorporation of

the class specific intercept-slope covariance structure improved the overall model fit. The two-

class RMM with a class specific intercept-slope covariance structure had a slightly larger BIC (31142) than the two-class RMM with a class constant intercept-slope covariance structure (31082). We selected the model with the lower BIC; thus, the two-class RMM with class constant intercept-slope covariance is the final selected model.

Intercept-Centered Data

It is important to understand the between-subject variability as a function of weekly slope alone, with no influence of the intercept. Therefore, we fitted the two-class RMM with class constant intercept-slope covariance to intercept-centered data. This analysis served to answer our

second and third research questions. The intercept-centering ensures that the classification is not

influenced by the initial level of performance (intercept), but rather students are classified by

their slope or growth. We centered the data by subtracting the median of the first three

observations of each student from that student's data; we used the median rather than the mean because CBM data are highly variable from week to week and the mean is prone to be affected by outliers. The slope estimates for the two classes were 0.93 and 1.52, and

the class probabilities were 0.31 and 0.69 (see Table 2). The entropy of classification for the

model using intercept-centered data was 0.04; the right side of Figure 2 shows the distribution of

the classification probabilities. As seen in Figure 2, students were classified with slightly greater certainty using the uncentered data, and the model estimates from the centered fit were similar to those from the uncentered fit (Table 2). This finding suggests that the classification of students using the uncentered data was not driven solely by initial status (intercept); rather, the classification took into account ROI as well as the initial level. The BIC value for this two-

class RMM with class constant intercept-slope covariance structure fit to the intercept-centered

data was 30578. Note that this BIC value is not comparable to the BIC values from models fit to

the uncentered data since we have changed the data values by centering them.
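The centering step itself is mechanical; a sketch, reusing the simulated pm data frame from the earlier sketches:

```r
# Intercept-center each student's scores by subtracting the median of that
# student's first three observations (sorted by week). `pm` as above.
pm_centered <- do.call(rbind, lapply(split(pm, pm$student), function(d) {
  d <- d[order(d$week), ]
  d$wrc <- d$wrc - median(head(d$wrc, 3))
  d
}))
```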

Ordinary Least Squares Regression Comparison



To place the RMM results in context, student performance was also estimated via OLSR.

Estimating student performances using OLSR does not lead to an automatic model-based

classification of students into distinct subpopulations. The classification can be conducted post hoc using a cutoff pre-specified by the investigator. For example, in our data analysis we can choose a slope of 1.2 as the cutoff and classify students into two groups based on whether their linear (OLSR) slope over time is below or above 1.2. Comparing this to the classification from the fitted RMM, we found that 149 of 215 (69%) students

were classified into the same class using OLSR and RMM. In addition, we note that the OLSR

method gives some negative slope values or slope values greater than 3, whereas RMM does not

produce such outlying slope values.
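This post hoc procedure reduces to one regression per student plus a threshold; a sketch, again with the simulated pm data frame (rmm_class is a placeholder for the RMM assignments):

```r
# Post-hoc OLSR classification using a pre-specified slope cutoff of 1.2.
ols_slope <- vapply(split(pm, pm$student), function(d) {
  unname(coef(lm(wrc ~ week, data = d))["week"])
}, numeric(1))
ols_class <- ifelse(ols_slope < 1.2, 1, 2)   # 1 = lower growth, 2 = higher

# Agreement with the RMM assignments (rmm_class is a placeholder vector):
# mean(ols_class == rmm_class)   # the study found 69% agreement
```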

Discussion

The purpose of this study was to provide an empirical demonstration of subpopulations of

students with different rates of growth, as hypothesized by an RtI model of identification. To

answer our first research question, the results indicated two distinct groups likely exist in the

population. The low-low group (i.e., dual discrepant) comprised 22% of the sample and was

characterized by a low level of initial achievement (16 WRC) and low ROI (0.86 WRC per

week). The moderate-moderate group comprised 78% of the sample and had a moderate level of

achievement (39 WRC) and moderate ROI (1.49 WRC per week). Within the context of a RtI

dual discrepancy framework, the low-low group might be considered for more intensive

intervention and further consideration for special education or disability diagnosis.

To answer our second and third research questions, results suggest the assignment of

cases to groups can be also done with a high degree of confidence using ROI alone. With the

initial achievement removed from the model (intercepts centered), two distinct groups still

emerged. The low ROI group (0.93 WRC per week) comprised 31% of the sample and was

distinct from the moderate ROI group (1.52 WRC per week) that comprised 69% of the sample.

Our empirical investigation suggests that the students within each class have the same amount of heterogeneity in growth and intercept values, because the RMM with a class constant, rather than class specific, intercept-slope covariance fitted the dataset better.

Implications for the RtI Model of Student Identification

As previously described, the data used in the current study are typical of data regularly collected in educational practice rather than under research conditions. The response of a student to an educational

program is influenced by a multitude of factors. Despite these known threats to the validity of

interpretation of ROIs, results of the present study align remarkably well with the RtI model of

student identification. In other words, the statistically derived subpopulations are practically

meaningful.

Students in both empirically-derived classes in the present analysis have a level of

performance below publisher benchmarks, as would be expected for students selected to receive

additional reading instruction. Students in class 2 (moderate-moderate) demonstrate a higher

level and demonstrate appropriate (if not ambitious) growth. In other words, students in class 2

appear to be responsive to the instruction. Students in class 1 (low-low) have a much lower level

and do not demonstrate appropriate growth in response to instruction. If 20% to 40% of the

overall student population is represented in this progress monitoring dataset, and 22% of those

students are empirically classified into the low-low or non-responsive group (Table 2), then 4.4%

to 8.8% of all students would potentially be classified as not making progress. These percentages

are in line with previous intervention studies on the incidence of learning disabilities as identified

with the dual-discrepancy model (Fuchs, Fuchs, & Speece, 2002), a model that can be used in

RtI special education eligibility determinations. These results lend support to the use of ROIs in

practice. Moreover, a notable feature of the RMM modeling approach is that though the students

have been grouped into two classes, student-specific characteristics have also been accounted for

through the random effects parameters in the model.

An empirical demonstration of two subpopulations of students does not imply that

student classification is permanent. In other words, a student with low ROI could still increase

their ROI with a different educational program. Future research with RMMs may investigate the

conditions under which students change classes. Although the present study focused on

classification, RtI is a resource allocation model. Its purpose is to accurately and flexibly target

intervention resources. The use of statistical models as part of data-based decision making could

serve to improve the accurate, flexible provision of early intervention services and prevent long-

term reading failure.

Different Passage Set Types

The present dataset incorporates CBM-R data across different passage sets (AIMSweb,

DIBELSNext, and FAST). Classification proportions were consistent across these passage sets.

For example, the proportions of students placed in class 1 who were assessed with AIMSweb,

FAST, and DIBELSNext were 0.22, 0.24, and 0.20, respectively. To further investigate how, if at all, differences between passage set types interact with the prediction of classes, we refitted all models with passage set type included as a covariate. In 99% of cases, students were classified into the same class regardless of whether passage set was included as a covariate. Since only 3 of 215 students had an inconsistent classification, it is difficult to say whether the inconsistency reflects any

systematic underlying factor. We can conclude that regardless of passage set publisher and

school, students are clearly, empirically classified into two separate subpopulations. In short, for

classification purposes, different passage sets are consistent.

Predictions

One potential application of RMMs is prediction. For the present study, this would

involve the prediction of CBM-R scores at future time points, to aid in decision-making at the

present time point. As an initial exploration, we applied the two-class models to 6, 10, and 14

weeks’ data. The results using 14 weeks’ data were similar to the results using 20 weeks’ data

(92% of students remained in the same class), which indicates that it would be reasonable to

predict growth at 20 weeks using 14 weeks’ data. For comparison, only 79% of students

remained in the same class when OLSR and a cutoff score were used for classification. The need

for 14 weeks of data also aligns with previous work on the duration required to make consistent

decisions with CBM-R progress monitoring data (Christ, Zopluoglu, Long, et al., 2012; Christ,

Zopluoglu, Monaghen, et al., 2013).
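Such a check can be scripted by refitting on truncated data and comparing assignments; the sketch below reuses the hypothetical pm and fit2 objects from the earlier sketches. Note that mixture component labels can switch between fits, so labels should be aligned (e.g., by mean slope) before comparing.

```r
# Refit the two-class RMM on the first 14 weeks and compare class
# assignments with the 20-week fit (`fit2` from the earlier sketch).
library(mixtools)

pm14 <- pm[pm$week <= 14, ]
by14 <- split(pm14, pm14$student)
fit2_14 <- regmixEM.mixed(y = lapply(by14, function(d) d$wrc),
                          x = lapply(by14, function(d) cbind(d$week)),
                          k = 2, arb.R = FALSE, arb.sigma = TRUE,
                          epsilon = 1e-3)

# After aligning component labels across the two fits:
class20 <- apply(fit2$posterior.z,    1, which.max)
class14 <- apply(fit2_14$posterior.z, 1, which.max)
mean(class14 == class20)   # the study found 92% stayed in the same class
```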

RMMs could also potentially describe a range of possible ROI values when a given

intervention is used with a group of students who have similar characteristics to those in a

previous dataset and analysis. RMMs will usually ameliorate the biases in growth estimates

inherent to the use of OLSR with highly variable progress monitoring data.

Application of RMMs would be most successful if models were built into vendor

software already used by educators to prevent the need for educators to possess advanced

statistical analysis tools or skills. The use of vendor software would also appropriately limit the

application of RMMs to datasets of sufficient sizes to obtain reliable mean growth estimates. In

practice, vendors could provide initial datasets for schools that do not have sufficient numbers of

students. Provision of initial datasets would be analogous to the typical provision of norms.

More research would be needed on the consequential validity of these potential automated

classification recommendations.

Limitations

Previous research suggests that variability in individual students’ weekly scores, the

number of scores available to make the prediction, and how far into the future the prediction is to

be made may all impact any model’s prediction accuracy (Christ, Zopluoglu, Long, et al., 2012;

Christ, Zopluoglu, Monaghen, et al., 2013). Future work should isolate these variables to inform

conditions under which predictions are most accurate before application would be possible.

Although the present dataset draws from student populations around the country as well as across probe sets, the results are limited to second grade students who would benefit from modified reading instruction. For a larger cross-grade sample, the analysis must be conducted by grade

since it is not reasonable to assume the same mean growth (slope) value for all grades.

Inferences made from the present study are further limited by the minimal information

available about instruction. The students included in the dataset were all classified as students

who would benefit from reading instruction. Information was not available, however, on the

type, intensity, timing, or fidelity of intervention these students received.

We also do not know if the 20 weeks of CBM data collection were 20 consecutive weeks

or if there were gaps in the progress monitoring data collection and intervention due to school

vacations. It would be useful to know if the data collection happened uniformly for all students

or if some students had continuous progress monitoring while others did not. In the latter case, it

might be difficult to interpret growth estimates based on school weeks as actual ROI.

Complexities and risks often emerge in practice because of a lack of standardization of

implementation and measurement of educational programs. That is, the measured response of a

student to an educational program is likely influenced by multiple extraneous variables. For

example, the magnitude of a ROI could depend on the instructional program selected, fidelity of

implementation, dosage/amount of service, and the qualities of the data collected to monitor

program effects. These are threats to validity of the interpretation and use of ROIs. This study

does not address the relative influence of those threats, but it does evaluate whether distinct

groups can be derived from samples of data in which those threats abound.

Conclusions and Future Directions

Repeated measures data are a key part of Response to Intervention service delivery

models. Accurate and efficient analysis of these data promotes accurate educational decisions

and efficient allocation of resources. The present study illustrated the use of RMMs to

empirically support the existence of two subpopulations within the group of students who receive additional reading instruction. This initial application provides empirical evidence in support of the dual-discrepancy theory of learning disability identification. Future research may directly assess

the classification accuracy of RMMs for learning disabilities, may assess conditions under which

students change classification, or may extend RMMs to identification of subpopulations of

students in other academic areas and assessments, such as CBM-Math.



References

Ardoin, S. P., Christ, T. J., Morena, L. S., Cormier, D. C., & Klingbeil, D. A. (2013). A

systematic review and summarization of the recommendations and research surrounding

curriculum-based measurement of oral reading fluency (CBM-R) decision rules. Journal

of School Psychology, 51(1), 1–18.

Ardoin, S. P., & Christ, T. J. (2009). Curriculum-based measurement of oral reading: Standard

errors associated with progress monitoring outcomes from DIBELS, AIMSweb, and an

experimental passage set. School Psychology Review, 38(2), 266.

Benaglia, T., Chauveau, D., Hunter, D., & Young, D. (2009). mixtools: An R package for

analyzing finite mixture models. Journal of Statistical Software, 32(6), 1-29.

Batsche, G. M., Elliot, J., Graden, J. L., Grimes, J., Kovaleski, J. F., Prasse, D., . . . Tilly, W. D.

(2005). Response to intervention: Policy considerations and implementation. Alexandria,

VA: NASDE.

Celeux, G., & Soromenho, G. (1996). An entropy criterion for assessing the number of clusters in a mixture model. Journal of Classification, 13(2), 195-212.

Christ, T. J. (2006). Short-term estimates of growth using curriculum-based measurement of oral

reading fluency: Estimating standard error of the slope to construct confidence

intervals. School Psychology Review, 35(1), 128.

Christ, T. J., & Poncy, B. C. (2005). Guest editors' introduction to a special issue on response to

intervention. Journal of Psychoeducational Assessment, 23(4), 299-303.

Christ, T. J., & Silberglitt, B. (2007). Estimates of the standard error of measurement for

curriculum-based measures of oral reading fluency. School Psychology Review, 36(1),

130.

Christ, T. J., Zopluoglu, C., Long, J. D., & Monaghen, B. D. (2012). Curriculum-based

measurement of oral reading: Quality of progress monitoring outcomes. Exceptional

Children, 78(3), 356-373.

Christ, T. J., Zopluoglu, C., Monaghen, B., & Van Norman, E. R. (2013). Curriculum-based

measurement reading (CBM-R) progress monitoring: Multi-study evaluation of schedule,

duration, and dataset quality on progress monitoring outcomes. Journal of School

Psychology (51), 19-57. doi:10.1016/j.jsp.2012.11.001

Curran, P. J., Obeidat, K., & Losardo, D. (2010). Twelve frequently asked questions about

growth curve modeling. Journal of Cognition and Development, 11(2), 121-136.

Deno, S. L. (1985). Curriculum-based measurement: The emerging alternative. Exceptional

children, 52(3), 219-232.

Deno, S. L. (1986). Formative evaluation of individual student programs: A new role for school

psychologists. School Psychology Review.

Deno, S. L. (1990). Individual differences and individual difference. The Journal of Special

Education, 24(2), 160.

Deno, S. L., Marston, D., & Tindal, G. (1986). Direct and frequent curriculum-based

measurement: An alternative for educational decision making. Special Services in the

Schools, 2(2), 5–27.

Deno, S. L., & Mirkin, P. K. (1977). Data-based program modification: A manual. Reston, VA:

Council for Exceptional Children.

Deno, S., Fuchs, L. S., Marston, D., & Shin, J. (2001). Using curriculum-based measurement to

establish growth standards for students with learning disabilities. School Psychology

Review, 30(4), 507-524.



FastBridge Learning (2018). Formative Assessment System for Teachers. Minneapolis, MN: Author.

Fuchs, L. S. (2003). Assessing intervention responsiveness: Conceptual and technical issues.

Learning Disabilities Research and Practice, 18, 172–186.

Fuchs, D., & Fuchs, L. S. (2006). Introduction to response to intervention: What, why, and how valid is it? Reading Research Quarterly, 41(1), 93-99.

Fuchs, L. S., Fuchs, D., & Speece, D. L. (2002). Treatment validity as a unifying construct for

identifying learning disabilities. Learning Disability Quarterly, 25(1), 33-45.

Fuchs, L. S., Fuchs, D., Hamlett, C. L., Walz, L., & Germann, G. (1993). Formative evaluation

of academic progress: How much growth can we expect. School Psychology Review, 22,

27-48.

Good, R. H., & Kaminski, R. A. (2011). Dynamic Indicators of Basic Early Literacy Skills Next.

Eugene, OR: Dynamic Measurement Group. Retrieved from http://www.dibels.org/

Good, R. H., & Shinn, M. R. (1990). Forecasting accuracy of slope estimates for reading

curriculum-based measurement: Empirical evidence. Behavioral Assessment, 12, 179–

193.

Hintze, J. M., & Christ, T. J. (2004). An examination of variability as a function of passage

variance in CBM progress monitoring. School Psychology Review, 33(2), 204.

Individuals with Disabilities Education Improvement Act, 20 U.S.C., Pub. L. No. 108-446 §

1400 et seq. (2004).

Kass, R. E., & Raftery, A. E. (1995). Bayes factors. Journal of the American Statistical Association, 90(430), 773-795.



Kim, S. Y. (2012). Sample size requirements in single-and multiphase growth mixture models: A

Monte Carlo simulation study. Structural Equation Modeling: A Multidisciplinary

Journal, 19(3), 457-476.

Laird, N. M., & Ware, J. H. (1982). Random-effects models for longitudinal data. Biometrics, 38,

963-974.

McLachlan, G., & Peel, D. (2004). Finite Mixture Models. John Wiley & Sons.

Meredith, W., & Tisak, J. (1990). Latent curve analysis. Psychometrika, 55, 107-122.

Muthén, B. (2001). Latent variable mixture modeling. New Developments and Techniques in

Structural Equation Modeling, 1-33.

Muthén, B., & Muthén, L. K. (2000). Integrating person‐centered and variable‐centered analyses:

Growth mixture modeling with latent trajectory classes. Alcoholism: Clinical and

Experimental Research, 24(6), 882-891.

Muthén, L. K., & Muthén, B. O. (2002). How to use a Monte Carlo study to decide on sample

size and determine power. Structural Equation Modeling, 9(4), 599-620.

Muthén, B., & Shedden, K. (1999). Finite mixture modeling with mixture outcomes using the

EM algorithm. Biometrics, 55(2), 463-469.

Nylund, K. L., Asparouhov, T., & Muthén, B. O. (2007). Deciding on the number of classes in

latent class analysis and growth mixture modeling: A Monte Carlo simulation

study. Structural Equation Modeling, 14(4), 535-569.

Pearson, Inc. (2012). AIMSweb Technical Manual. Retrieved from http://www.aimsweb.com/

Raudenbush, S. W., & Bryk, A. S. (2002). Hierarchical linear models: Applications and data

analysis methods (Vol. 1). Sage.



Reschly, A. L., Busch, T. W., Betts, J., Deno, S. L., & Long, J. D. (2009). Curriculum-based

measurement oral reading as an indicator of reading achievement: A meta-analysis of the

correlational evidence. Journal of School Psychology, 47(6), 427-469.

TJCC (2015). Formative Assessment System for Teachers: Technical Manual Version 2.0. Minneapolis, MN: Author and FastBridge Learning (www.fastbridge.org).

Tofighi, D., & Enders, C. K. (2008). Identifying the correct number of classes in growth mixture

models. In Advances in latent variable mixture models (pp. 317-341). Information Age Publishing.

Wayman, M. M., Wallace, T., Wiley, H. I., Ticha, R., & Espin, C. A. (2007). Literature synthesis

on curriculum-based measurement in reading. The Journal of Special Education, 41(2),

85-120.
