
Running head: RTI: DUAL-DISCREPANCY POPULATION

Response to Intervention: Empirical Demonstration of a Dual-Discrepancy Population

via Random Effects Mixture Models



Abstract

Response to Intervention (RtI) is a commonly used framework to identify students in need of

additional or specialized instruction. Special education eligibility decisions within RtI rely on the assumption that there are subpopulations of students: those who demonstrate appropriate growth when provided specialized instruction and those who do not.

The purpose of the present study was to illustrate the use of random-effects mixture models

(RMMs) to estimate the likely number of (unobserved) subpopulations within one curriculum-

based measurement of oral reading (CBM-R) progress monitoring dataset. The dataset comprised

second grade students’ CBM-R data collected weekly over 20 weeks. RMMs were fit with

several numbers of classes, and a two-class model best fit the data. Results suggest that RMMs

are useful to understand subpopulations of students who need specialized instruction. Results

also provide some empirical support for the use of a dual-discrepancy model of learning

disability identification within RtI.

Keywords: CBM Reading, random-effects, mixture model, progress monitoring, response

to intervention.

Response to Intervention: Empirical Demonstration of a Dual-Discrepancy Population

via Random Effects Mixture Models

Student progress is often monitored to establish a time-series dataset that is used to

evaluate the rate of improvement (ROI) with respect to a target skill. This practice was initially

conceptualized as idiographic analysis, which was used to estimate student progress and evaluate

instructional effects (Deno & Mirkin, 1977; Deno, 1985, 1986, 1990). More recently, that

approach was incorporated into federal law (Individuals with Disabilities Education Act [IDEA],

2004) and professional practice (Batsche et al., 2005; Christ & Poncy, 2005) as Response to

Intervention (RtI). RtI is a resource allocation model in which all students are screened to ensure

expected progress is made (Fuchs & Fuchs, 2006). Students not meeting expectations are

provided targeted intervention services. RtI is an alternative to the diagnostic-prescriptive

approaches of the past (Fuchs & Fuchs, 2006), which were used to inform service delivery and

diagnosis of specific learning disabilities. For example, the RtI dual-discrepancy model

prescribes that a student may be eligible for special education services due to a learning

disability if there is a demonstrated deficit in both the level of achievement and rate of

improvement (ROI; Fuchs, 2003; Fuchs & Fuchs, 2006; Fuchs, Fuchs, & Speece, 2003). This is

an important concept with respect to this study because there is little existing empirical evidence

that the available data are sufficient to inform such decisions.

Curriculum-Based Measurement of Oral Reading (CBM-R; Deno et al., 1986, 2001)

emerged as a primary approach to monitor student progress in reading (Ardoin, Christ, et al.,

2013). CBM-R is widely recommended and used as part of RtI to measure and evaluate the level

and ROI for use in a RtI dual discrepancy model. In brief, CBM-R data are collected by an adult

who listens to a child read aloud for one minute and records the number of words read correctly

per minute (WRC). Data are collected approximately weekly with one of 20 or so alternate

forms. Each CBM-R form comprises a passage with approximately 250 to 350 words.

Once CBM-R data are collected, the level of achievement and ROI are derived. The ROI

is typically derived with visual analysis or a statistical method. Ordinary least squares regression

(OLSR) is the most widely recommended and used statistical approach to estimate ROI (Ardoin,

Christ, et al., 2013). The level and ROI of student achievement are compared to either norms or

benchmarks to facilitate interpretation. Published estimates of ROIs indicate that typically-

developing students gain approximately 1.4 WRC per week and students who receive special

education services gain approximately 0.84 WRC per week when in first and second grades (Deno

et al., 2001). Other estimates published in the peer-reviewed literature (Fuchs et al., 1993) and by

vendors (e.g., AIMSweb, Pearson, 2012; DIBELS, Good & Kaminski, 2011; FastBridge

Learning, 2018) are generally consistent with those ROIs.

Typically-developing first and second grade students often read 30 to 70 WRC, but their

level of performance often fluctuates +/-10 WRC from one CBM-R administration to the next

(Van Norman, 2015; see also Ardoin & Christ, 2009; Christ, 2006; Hintze & Christ, 2004; Christ &

Silberglitt, 2007). This variability of CBM-R data over time creates substantial challenges to

estimate a ROI with precision (cf. Christ, Zopluoglu, Long, et al., 2012). The precision of those

ROIs is relatively low, and often student performances are too variable across time for OLSR

estimates to be reliable. Published estimates for the standard error of ROI suggest they often

exceed 3 WRC per week, such that a 68% confidence interval around a moderate ROI may be

1.28 +/- 3.00 WRC per week (range, -1.72 to 4.28 WRC per week). Statistical models such as the random-effects model may be advantageous in this context because they do not analyze each student's data separately; instead, they borrow information across students to estimate ROI, which makes the estimates more precise. The OLSR model considers only one student's data at a time, which leaves growth estimates more susceptible to outliers or anomalies in the data. In contrast, random-effects models assume the underlying growth structure of all students is similar; the fit for each student therefore borrows strength from that common distribution and is less prone to be unduly influenced by a few outlying points.

Although there is some published work (Ardoin & Christ, 2009; Deno et al., 2001), there

is very little empirical work to establish that students with and without learning difficulties

demonstrate unique and atypical ROIs. Furthermore, much of the prior work either aggregated

ROIs across potentially distinct populations (Fuchs et al., 1993) or pre-classified students into

the general education or special education populations (Ardoin & Christ, 2009; Deno et al.,

2001). These two approaches either ignored the possibility of distinct groups within the

population (Fuchs et al.) or presupposed that distinct groups existed (Deno et al.).

Random-effects Mixture Models (RMMs)

Statistical methods have emerged in the last few decades to empirically evaluate the

existence of distinct subgroups in a population and evaluate the confidence of model-based

classifications. These methods are an extension of random-effects models, called random-effects

mixture models (RMMs), and have been demonstrated useful for this purpose (Muthén, 2001;

Muthén & Muthén, 2000; Muthén & Shedden, 1999).

Random-effects models. Random-effects models are used to estimate the overall growth

(i.e., mean intercept, and mean slope) and the variability around the growth coefficients for any

given repeated measures dataset. Unlike OLSR, these models include coefficients (parameters)

that are shared by all students in a population (i.e., population or mean intercept, and mean slope)

in addition to unique student-specific parameters (i.e., deviation from the mean intercept, and

mean slope; also known as random effects). Suppose $\mathbf{Y}_i$ is the vector of outcomes (words read correctly per minute) for the $i$th student in the sample, and $\mathbf{X}_i$ is a matrix of predictors that help to explain the variation in the outcome for the $i$th student. Assuming $\mathbf{Y}_i$ has a linear relationship with $\mathbf{X}_i$, a random-effects model has the form

$$\mathbf{Y}_i = \mathbf{1}(\alpha + a_i) + \mathbf{X}_i(\boldsymbol{\beta} + \mathbf{b}_i) + \boldsymbol{\varepsilon}_i \qquad (1)$$

where $\alpha$ is the overall intercept and $\boldsymbol{\beta}$ is a vector of overall linear slope coefficients for the predictor variables $\mathbf{X}_i$, shared by all students in the sample. The coefficients $a_i$ and $\mathbf{b}_i$ are student-specific random effects; these coefficients vary from student to student and capture the between-student variability. The random effects $a_i$ and $\mathbf{b}_i$ are assumed to have a normal distribution with a mean of zero and a variance-covariance matrix $\boldsymbol{\phi}$. The vector $\boldsymbol{\varepsilon}_i$ contains random errors, often assumed to have a mean-zero normal distribution, and the random effects and random errors are assumed to be independent of each other. Random-effects models are frequently used to analyze repeated measures data, where each element of $\mathbf{Y}_i$ corresponds to the value of the outcome (e.g., WRC) at a particular point in time (a particular week).
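As a concrete illustration, a model of this form with week as the single predictor can be fit with standard mixed-model software. The sketch below uses the lme4 package in R; the simulated data frame pm (columns student, week, and wrc) is a hypothetical stand-in for a long-format progress monitoring dataset, and all parameter values are illustrative, not estimates from this study.

```r
# Minimal sketch of the random-effects model in Equation (1), with week as
# the single predictor. The data frame `pm` is simulated and hypothetical.
library(lme4)

set.seed(42)
pm <- do.call(rbind, lapply(1:50, function(s) {
  data.frame(student = factor(s),
             week    = 1:20,
             wrc     = 20 + rnorm(1, 0, 8) +               # student intercept shift
                       (1.2 + rnorm(1, 0, 0.3)) * (1:20) + # student-specific slope
                       rnorm(20, 0, 7))                    # weekly measurement noise
}))

# (week | student) requests a random intercept and random slope per student,
# along with their variance-covariance matrix (phi in Equation 1).
fit <- lmer(wrc ~ week + (week | student), data = pm)
fixef(fit)   # overall intercept alpha and overall slope beta
ranef(fit)   # student-specific deviations a_i and b_i
```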

Random-effects mixture models (RMMs). One of the assumptions made by the

random-effects model is that all the individuals in the dataset are homogeneous. That is, all

individuals belong to the same underlying population, and thus share a similar overall growth

pattern. However, there are scenarios where the individuals in the dataset may not be

homogeneous. The dataset may consist of two or more distinct unobserved classes of individuals,

and these classes must be discerned from the data empirically. To make it more concrete, it is

possible that in a given progress monitoring dataset, there may be a class of students who exhibit

greater reading growth over time than other students. In this case, RMMs are more

appropriate to model population heterogeneity, along with between- and within-individual



heterogeneity. The RMMs allow for the classification of unique subpopulations within a sample

of individuals. In the context of the current study the application of RMMs is preceded by testing

the assumption about whether there are indeed unique subgroups that can be empirically

identified on the basis of CBM-R level and ROI. Although it is often assumed that subgroups of

students exist (Deno et al., 2001), distinct subgroups based on those data have yet to be empirically

established.

In RMMs, each class is defined by its own set of regression coefficients (mean growth curve parameters of intercept $\alpha$ and slope $\beta$, with the variance around the mean growth curve estimated by the random effects) and random error ($\varepsilon_i$) parameters. A general $c$-class mixture formulation for, say, the slope parameter can be written as

$$\beta \sim \sum_{k=1}^{c} \pi_k \, N(\mu_k, \sigma_k^2), \qquad (2)$$

where $\pi_k$ is the class mixing proportion, with $\sum_{k=1}^{c} \pi_k = 1$, and $\mu_k$ and $\sigma_k^2$ are, respectively, the mean and variance of the slope for the $k$th class. In the context of the example described above, a mixture model with two classes ($c = 2$) can be used to fit data where one class comprises students showing higher growth over time and is characterized by a larger value of $\mu_k$, and the other class comprises students showing smaller reading growth and is defined by a smaller value of $\mu_k$. A $c$-class mixture distribution for the intercept and the random error can be defined similarly.
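To make Equation (2) concrete, the short simulation below draws student slopes from a two-class normal mixture; all parameter values are illustrative placeholders, not estimates from this study.

```r
# Simulate slopes from a two-class normal mixture as in Equation (2):
# beta ~ pi_1 N(mu_1, sigma_1^2) + (1 - pi_1) N(mu_2, sigma_2^2).
set.seed(123)
n       <- 215
in_cls1 <- rbinom(n, 1, 0.25) == 1                   # mixing proportion pi_1
slope   <- ifelse(in_cls1,
                  rnorm(n, mean = 0.85, sd = 0.20),  # lower-growth class
                  rnorm(n, mean = 1.50, sd = 0.25))  # higher-growth class
hist(slope, breaks = 30, main = "Slopes from a two-class mixture")
```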

Purpose

This study extends previous research with an empirical evaluation of one CBM-R

progress monitoring dataset to determine whether there are two or more distinct classes based on

the level and/or ROI of CBM-R data. The number of identified classes estimates the likely number of distinct

subpopulations, and those subpopulations might differ by initial level of performance (intercept)

and ROI (slope). There were three research questions that guided the study.

1. Are there two (or more) empirically derived classifications of the CBM-R cases

when using both the initial level of achievement and ROI? We hypothesized that

there would be at least two latent classes based on empirical classification.

2. Are there unique classifications derived from ROI alone? We hypothesized that

ROI alone could support unique classifications.

3. If so, what is the relative confidence of those classifications?

Methods

Dataset

The dataset analyzed in this study consists of 215 second graders’ weekly CBM-R data. It

was a subset of a large dataset of students across states, schools, and grades who received

additional reading intervention based on insufficient level of performance in CBM-R. The 215

cases were chosen through the application of selection criteria, as follows.

Second graders’ data were selected because students in lower grades typically

demonstrate higher rates of reading growth (FastBridge Learning, 2018; Fuchs et al., 1993;

Pearson, 2012). We estimate 20% to 40% of the total population of students received additional

reading intervention. All students who were considered in the data analysis in this manuscript did

receive additional reading intervention, at the “Tier 2” level of an RtI service delivery model.

Further details about the type and intensity of intervention provided to students were not

available. The subset was further filtered to exclude three students who had only one or two progress

monitoring data points across 20 weeks. To aid in the generalizability of the results, the dataset

included students across schools and passage sets. Different schools used one of three different

probe sets: AIMSweb (n = 147; Pearson, 2012), FAST (n = 51; FastBridge Learning, 2018), and DIBELSNext (n = 20; Good & Kaminski, 2011); we present analyses on possible passage effects

in the discussion section. All examiners were trained in the administration of CBM-R, and

standard procedures were followed: words were provided after hesitations of three seconds, and

words read correctly were counted.

The median number of probes administered to each student was 17 (190 students

completed 13 to 18 probes) with a minimum of 11 (1 student) and a maximum of 20 (4 students).

Of the 215 students, 187 had scores collected at most once per week, while 25 students had two

scores in one week and one score for the remaining 19 weeks, and 3 students had two scores in

two weeks and one score in each of the remaining 18 weeks. Consistent with the data selection criteria, every retained student had data spanning the full 20 weeks. We modeled growth of CBM-R WRC scores over time with the unit of time being the week; when multiple scores fell within one week, we used the corresponding fractional week values. For example, a score collected midway between week 1 and week 2 was coded as week = 1.5. CBM-R WRC scores ranged from 0 (minimum) to 123 (maximum), with a median and mean of 47; 25% of all scores fell below 31 and 75% fell below 63.

Data were collected and analyzed in accordance with ethical guidelines for research with

human subjects. The appropriate Institutional Review Board approved the present study.

Analytic Procedures

For each student we had CBM-R WRC scores collected over the duration of 20 weeks.

The analytic procedures comprised several steps. First, we calculated the intraclass correlation

coefficient (ICC). The ICC is a descriptive statistic that quantifies the proportion of the total variance in the outcome variable (CBM-R scores) attributable to between-subject rather than within-subject variability. The ICC value indicates whether there is sufficient between-subject variability in the dataset to justify the application of RMMs. For the present study the ICC value was satisfactory, so we proceeded to fit RMMs.

In the second step, we fitted RMMs with varying numbers of classes; that is, we fitted RMMs with one class, two classes, and three classes. The goal of fitting RMMs with different numbers of classes was to ascertain empirically whether more than one distinct class of students exists based on the level and/or ROI of CBM-R data. Of the three models fitted to the dataset, the best fitting model was selected based on a model fit criterion, the BIC (Bayesian Information Criterion; Kass & Raftery, 1995). The BIC is defined as

$$\text{BIC} = -2 \log_e L + p \log_e n \qquad (3)$$

where $p$ is the number of free parameters to be estimated, $n$ is the number of data points, and $L$ is the maximized value of the likelihood function. The BIC can be calculated for each model fit, and the model with the smallest BIC is said to fit the dataset best. In the third step, the

selected RMM with $c$ classes (based on the BIC) was then fitted using two alternative approaches. In the first approach, the RMM was fitted assuming a class-constant intercept-slope covariance matrix, and in the second approach, the RMM was fitted assuming a class-specific intercept-slope covariance matrix. The objective of fitting the RMM using these two approaches was to compare the resulting model fits and determine which RMM better captures the underlying structure of the dataset at hand. In the last step, we fitted the selected RMM (based on the previous step) to intercept-

centered data to determine how the results would differ when the estimated empirical

classification of individual students into classes was conducted solely on the basis of growth and

not using the initial level of performance.



We assumed each student’s reading proficiency grew linearly over time, as is commonly

found in previous literature (Good & Shinn, 1990; Christ, 2006; Christ, Zopluoglu, Long, et al.,

2012). To test the assumption of linearity, we plotted a random sample (n = 30) of the individual

students’ growth trajectories, along with the mean of the OLSR fits for those students to visually

inspect the pattern. As seen in Figure 1, there is no clear visual evidence of non-linearity in the

data. This evidence was further corroborated when we fitted a quadratic function to the data and

the quadratic coefficient was found to be negligible.
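Such a check can be scripted directly; the sketch below adds a quadratic week term to the growth model, reusing the simulated pm data frame from the earlier sketch.

```r
# Check linearity by adding a quadratic week term; a coefficient near zero
# (relative to the linear slope) is consistent with linear growth.
library(lme4)

fit_quad <- lmer(wrc ~ week + I(week^2) + (week | student), data = pm)
fixef(fit_quad)["I(week^2)"]   # should be negligible when growth is linear
```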

Intraclass Correlation Coefficient. As discussed in the Introduction section, each

individual student’s CBM-R data over successive weeks are often highly variable (indicating large

within-subject variability of reading scores). Furthermore, if tests were administered more than

once per week, it is possible for two or more successive reading scores to be dissimilar. If the

within-subject variation is too high, it may be a source of concern because it could lead to

unreliable estimates of growth obtained by fitting statistical models to the dataset. Therefore,

before the RMM can be fitted to data, it must be verified whether the variation in each individual

student’s weekly data is substantially smaller than the variation across all the students in

the sample (i.e., between-subjects). To assess this, ICC (Raudenbush & Bryk, 2002, see Ch. 2, 4)

was calculated based on the linear mixed-effects model. That is,

$$y_{ij} = (\alpha + a_i) + \beta_1 X_{ij} + \varepsilon_{ij} \qquad (4)$$

where $y_{ij}$ is the $j$th (weekly) observation for the $i$th student, $\alpha$ is the overall intercept, $\beta_1$ is the overall weekly linear slope, $X_{ij}$ is the week of the $j$th observation for the $i$th student, $a_i$ is the $i$th student's random effect for the intercept term, and the $\varepsilon_{ij}$ are normally distributed random error terms. The coefficients $\alpha$ and $\beta_1$ are fixed effects (i.e., the same for all students). Because the ICC is defined only for random-intercept models, we included only an overall (fixed) slope in the model. The ICC is given by the following expression:

$$\text{ICC} = \frac{\text{variance between students}}{\text{total variance}} = \frac{\operatorname{Var}(a_i)}{\operatorname{Var}(a_i) + \operatorname{Var}(\varepsilon_{ij})} \qquad (5)$$
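In software, the ICC in Equation (5) can be read off the fitted variance components; a sketch with lme4, reusing the simulated pm data frame from the earlier sketch:

```r
# ICC = Var(a_i) / (Var(a_i) + Var(eps_ij)), per Equation (5), from the
# random-intercept model in Equation (4). `pm` as in the earlier sketch.
library(lme4)

fit_ri <- lmer(wrc ~ week + (1 | student), data = pm)  # random intercept only
vc  <- as.data.frame(VarCorr(fit_ri))  # row 1: intercept var; row 2: residual
icc <- vc$vcov[1] / sum(vc$vcov)
icc   # the study reports 0.70 for its dataset
```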

The ICC value for the dataset was 0.70, indicating that 70% (a substantial part) of the

total variability was due to variability from one student to another, and only 30% of the total

variability was due to variability in an individual student’s weekly scores. The large ICC value

confirmed that a random intercepts model was warranted for the dataset. However, it did not tell

us whether the students differ from each other with respect to weekly slope. Thus, to gain insight

about the between-subject variability as a function of weekly slope only, we later show the

results from fitting RMM to intercept-centered data. Note that the definition of ICC is useful and

interpretable only for the random intercepts model; hence, it does not make sense to re-calculate

the ICC value using the intercept-centered data. In sum, the ICC indicated sufficiently large between-student variability; the following sections briefly describe the different variations of RMMs that were then fit to the data.

RMM with class constant intercept-slope covariance structure. Here we describe an

RMM that allows for the estimation of class specific mean growth coefficients and residual

variances, but only class constant intercept-slope covariance structure (i.e., same variance of

intercept and slope for both classes, and same covariance between intercept and slope for both

classes). This model encodes the assumption that the rate of reading growth differed across classes, but that the student-to-student variation around the mean growth coefficients was constant across classes. In this $c$-class RMM, the data for the $i$th student are modeled as

$$y_{ij} = \beta_{0i} + \beta_{1i} X_{ij} + \varepsilon_{ij} \qquad (6)$$

where $i = 1, 2, \ldots, m$ ($m$ = number of students) and $j = 1, 2, \ldots, n_i$ ($n_i$ = number of time points for student $i$). The coefficient $\beta_{0i}$ is the random intercept for student $i$ and $\beta_{1i}$ is the random linear slope for student $i$, with $\beta_{0i} = \beta_0 + b_{0i}$ and $\beta_{1i} = \beta_1 + b_{1i}$. The mixture structure described in (2) was imposed on the overall intercept $\beta_0$ and the overall linear slope $\beta_1$, allowing for the estimation of a class-specific mean intercept and mean slope, respectively. A class-specific mixture structure was also placed on the residual variance of $\varepsilon_{ij}$. The covariance structure of the random effects, $b_{0i}$ and $b_{1i}$, was assumed to be constant across classes.

RMM with class specific intercept-slope covariance structure. In the model described

above, we assumed that the variation around the mean growth coefficients from student-to-

student was constant across classes. This assumption may not hold in practice. Thus, we

considered an RMM with class specific intercept-slope covariance structure, along with class

specific mean growth coefficients and residual variances. That is, for each class we estimated a class-specific mean intercept and mean slope, intercept and slope variances, intercept-slope covariance, and residual variance. We fitted both kinds of

RMMs (i.e., class constant, and class specific intercept-slope covariance structure) and compared

the fit of the two models to determine which model fit the data better.

Covariance matrix for repeated measures residuals. Repeated measures data may

have autocorrelated residuals over time. If we have a first order autocorrelation of the residuals

over time, then the residuals can be modeled as

$$\varepsilon_t = \rho \varepsilon_{t-1} + \omega_t,$$

where $|\rho| < 1$ and the $\omega_t$ are independent and identically distributed $N(0, \sigma^2)$ errors. This equation implies that the residual (the unexplained variation in the outcome $y_{ij}$) at time $t$ depends on the residual at the previous time point $t - 1$ through the parameter $\rho$. If $\rho = 0$, there is no first-order autocorrelation in the repeated measures data.

For each student, we calculated the residuals from the model fit and then tested $\rho = 0$. The test revealed that first-order autocorrelation was present in the data of only 5 of the 215 students. Because autocorrelation was absent for the large majority of students, we did not incorporate an autoregressive error structure in our model; instead, we used the unstructured covariance form, which imposes no restrictions on the covariance structure.
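The paper does not name the specific test of $\rho = 0$; one simple screening approach, sketched below under that caveat, fits OLSR per student and applies a Ljung-Box test of lag-1 autocorrelation to the residuals (again with the simulated pm data frame).

```r
# Per-student screen for lag-1 autocorrelation in OLSR residuals. The choice
# of the Ljung-Box test is an assumption; the paper does not specify a test.
has_ar1 <- vapply(split(pm, pm$student), function(d) {
  d <- d[order(d$week), ]
  r <- resid(lm(wrc ~ week, data = d))
  Box.test(r, lag = 1, type = "Ljung-Box")$p.value < 0.05
}, logical(1))
sum(has_ar1)   # the study found autocorrelation for 5 of 215 students
```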

Model fitting. The models were fitted by maximum likelihood via the Expectation-Maximization (EM) algorithm in R. The EM algorithm is iterative; estimation stops when the difference between the likelihood values of two successive iterations is less than 0.001. Specifically, we used the regmixEM.mixed function from the mixtools package in R (Benaglia et al., 2009). Random-effects mixture models are computationally intensive because they require the estimation of a large number of coefficients, so a set of starting values is often provided to the algorithm to ease the computation. For the current study we obtained starting values by fitting a one-class RMM (which is equivalent to fitting a random-effects model) to the data; the estimated coefficients then served as sensible starting values for the overall mean parameters of the two- and three-class RMMs. Note that the regmixEM.mixed function does not provide standard errors for any estimates.
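A minimal sketch of the two-class fit follows, reusing the simulated pm data frame from the earlier sketch. The list construction is an assumption about data preparation, the argument choices mirror the specification above (class-constant random-effects covariance, class-specific residual variance, 0.001 likelihood tolerance), and output field names such as loglik and posterior.z follow the mixtools documentation.

```r
# Sketch of the two-class RMM fit via mixtools::regmixEM.mixed, which takes
# one response vector and one design matrix per student, supplied as lists.
library(mixtools)

by_student <- split(pm, pm$student)            # `pm` from the earlier sketch
y_list <- lapply(by_student, function(d) d$wrc)
x_list <- lapply(by_student, function(d) cbind(d$week))

set.seed(1)
fit2 <- regmixEM.mixed(y = y_list, x = x_list, k = 2,
                       arb.R     = FALSE,   # class-constant covariance (Model A)
                       arb.sigma = TRUE,    # class-specific residual variance
                       epsilon   = 1e-3)    # 0.001 likelihood tolerance

# BIC per Equation (3); p_free (the number of free parameters) must be
# counted for the chosen specification before this line is usable.
n_obs <- length(unlist(y_list))
# bic <- -2 * fit2$loglik + p_free * log(n_obs)

head(fit2$posterior.z)   # posterior class membership probabilities
```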



Results

Approximately 3800 CBM-R scores across 215 student cases were retained for analysis.

No extreme values were trimmed, as all scores were inside the range of likely values for second

grade students. Assumptions of normality for CBM-R scores were examined via scatterplots and

descriptive statistics and were met for the purposes of the analyses. Descriptive statistics

obtained from fitting the OLSR model to each student in the dataset are presented in Table 1.

The following section summarizes the results obtained from fitting the RMMs described in the

Methods section.

Random-effects Mixture Models with One to Three Classes

To answer the first research question, three RMMs with different numbers of classes were fitted to the data: one-class ($c = 1$), two-class ($c = 2$), and three-class ($c = 3$) RMMs. Class-specific means and residual variances and a class-constant intercept-slope covariance structure were assumed. The BIC was calculated for the one-, two-, and three-class model fits, and the model with the lowest BIC was selected. The BIC values for the one-, two-, and three-class models were 37621, 31082, and 33352, respectively; based on these values, we selected the two-class model. Note that all models fitted to this point had class-specific means and residual variances and a class-constant intercept-slope covariance structure. Below we describe the results obtained from fitting the two-class RMM with a class-constant intercept-slope covariance structure and the two-class RMM with a class-specific intercept-slope covariance structure.

Two-class Model with Class Constant Intercept-Slope Covariance Structure

The parameter estimates obtained for this model are shown in Table 2 (we labeled this

model as Model A in the table). The slope estimates for the two-class model were 0.86 for class 1 and 1.49 for class 2. Forty-seven students were classified into class 1, and the

remaining 168 students were classified into class 2. Though all students in the dataset received

additional reading instruction, students in class 2 had an estimated mean slope that aligns with

recommended reasonable growth rates in previous literature (Deno et al., 2001; Fuchs et al.,

1993), and students in class 1 had an estimated mean slope that aligns with performance of

students receiving special education services (Deno et al., 2001). Thus, the two-class RMM

empirically identified the particular group of students who may need further additional

instruction. Table 3 shows descriptive statistics obtained from fitting the OLSR model to

students in each of the two classes.

Probability that a student belongs to a particular class. One goal of RMMs is to classify students empirically into the classes. For the dataset at hand, we computed each student's probability of belonging to class 1 (estimated $\pi_1$) and of belonging to class 2 (estimated $1 - \pi_1$). We then assigned each student to the class with the higher probability, and we call that probability the classification probability.

We classified the students using the two-class RMM; however, we needed to verify that

the class assignment had been conducted with a sufficient degree of confidence. In other words, it is desirable that each student have a clearly high probability of belonging to a particular class, as

opposed to there being a 50% chance of students belonging to either class. The left panel of

Figure 2 shows the distribution of the classification probabilities for all the students. Based on

the figure, it is evident that most of the students had a high probability (close to 1) of being

classified into a particular class. In summary, students were classified with a satisfactory degree of

confidence.

Entropy of classification. Another measure used to assess the quality of the empirical classification in a two-class model is the entropy, which quantifies the amount of uncertainty in the classification. The entropy of classification was computed to evaluate the degree of confidence in the empirical classification. For the RMM with two classes, say $a$ and $b$, the entropy is defined as

$$\text{Entropy} = -p(a) \log_2(p(a)) - p(b) \log_2(p(b)) \qquad (7)$$

where $p(a)$ is the probability of a student being classified into class $a$. Entropy takes values in $[0, 1]$, and a value of 0 indicates perfect delineation of the classes (Celeux & Soromenho, 1996). The entropy of classification for the two-class RMM ($p(a) = 0.22$) was 0.04, which is satisfactorily close to 0.
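The paper does not state the exact operationalization behind the reported value; the sketch below computes the mean per-student entropy from posterior class probabilities, one common choice that approaches 0 when assignments are confident. The matrix post is an illustrative placeholder (in practice it could come from, e.g., fit2$posterior.z in the earlier sketch).

```r
# Mean per-student classification entropy from an n x 2 matrix of posterior
# class probabilities; values near 0 indicate confident classification.
# `post` is an illustrative placeholder matrix.
post <- rbind(c(0.98, 0.02),
              c(0.95, 0.05),
              c(0.10, 0.90))
row_entropy <- function(p) {
  p <- p[p > 0]                       # drop zeros to avoid log2(0)
  -sum(p * log2(p))
}
mean(apply(post, 1, row_entropy))     # the study reports 0.04 for its data
```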

Two-class Model with Class Specific Intercept-Slope Covariance Structure

We fitted a two-class RMM with class specific means, residual variances and class

specific intercept-slope covariance structure (labeled as Model B in Table 2) to compare its fit to

the fit of the two-class RMM with class specific means and residual variances, and class constant

intercept-slope covariance structure (i.e., Model A). This enabled us to see if the incorporation of

the class specific intercept-slope covariance structure improved the overall model fit. The two-

class RMM with a class specific intercept-slope covariance structure had a slightly larger BIC (31142) than the two-class RMM with a class constant intercept-slope covariance structure (31082). We selected the model with the lower BIC; thus, the two-class RMM with class constant intercept-slope covariance is the final selected model.

Intercept-Centered Data

It is important to understand the between-subject variability as a function of weekly slope alone, with no influence of the intercept. Therefore, we fitted the two-class RMM with class constant intercept-slope covariance to intercept-centered data. This analysis served to answer our

second and third research questions. The intercept-centering ensures that the classification is not

influenced by the initial level of performance (intercept), but rather students are classified by

their slope or growth. We centered the data by subtracting the median of the first three

observations of each student from that student's data; we used the median rather than the mean because CBM data are highly variable from week to week and the mean is prone to be affected by outliers. The slope estimates for the two classes were 0.93 and 1.52, and

the class probabilities were 0.31 and 0.69 (see Table 2). The entropy of classification for the

model using intercept-centered data was 0.04; the right side of Figure 2 shows the distribution of

the classification probabilities. As seen in Figure 2, students were classified with slightly greater certainty using the uncentered data, and the model estimates from the centered fit were similar to those from the uncentered fit (Table 2). This finding suggests that the classification of students using the uncentered data was not driven solely by initial status (intercept); rather, the classification took into account ROI as well as the initial level. The BIC value for this two-

class RMM with class constant intercept-slope covariance structure fit to the intercept-centered

data was 30578. Note that this BIC value is not comparable to the BIC values from models fit to

the uncentered data since we have changed the data values by centering them.
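The centering step itself is mechanical; a sketch, reusing the simulated pm data frame from the earlier sketches:

```r
# Intercept-center each student's scores by subtracting the median of that
# student's first three observations (sorted by week). `pm` as above.
pm_centered <- do.call(rbind, lapply(split(pm, pm$student), function(d) {
  d <- d[order(d$week), ]
  d$wrc <- d$wrc - median(head(d$wrc, 3))
  d
}))
```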

Ordinary Least Squares Regression Comparison



To place the RMM results in context, student performance was also estimated via OLSR.

Estimating student performances using OLSR does not lead to an automatic model-based

classification of students into distinct subpopulations. The classification can be conducted post hoc using a cutoff pre-specified by the investigator. For example, in our data analysis we can choose a slope of 1.2 as the cutoff and classify students into two groups based on whether their linear (OLSR) slope over time is below or above 1.2. Comparing this to the classification from the fitted RMM, we found that 149 of 215 (69%) students

were classified into the same class using OLSR and RMM. In addition, we note that the OLSR

method gives some negative slope values or slope values greater than 3, whereas RMM does not

produce such outlying slope values.
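This post hoc procedure reduces to one regression per student plus a threshold; a sketch, again with the simulated pm data frame (rmm_class is a placeholder for the RMM assignments):

```r
# Post-hoc OLSR classification using a pre-specified slope cutoff of 1.2.
ols_slope <- vapply(split(pm, pm$student), function(d) {
  unname(coef(lm(wrc ~ week, data = d))["week"])
}, numeric(1))
ols_class <- ifelse(ols_slope < 1.2, 1, 2)   # 1 = lower growth, 2 = higher

# Agreement with the RMM assignments (rmm_class is a placeholder vector):
# mean(ols_class == rmm_class)   # the study found 69% agreement
```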

Discussion

The purpose of this study was to provide an empirical demonstration of subpopulations of

students with different rates of growth, as hypothesized by an RtI model of identification. To

answer our first research question, the results indicated two distinct groups likely exist in the

population. The low-low group (i.e., dual discrepant) comprised 22% of the sample and was

characterized by a low level of initial achievement (16 WRC) and low ROI (0.86 WRC per

week). The moderate-moderate group comprised 78% of the sample and had a moderate level of

achievement (39 WRC) and moderate ROI (1.49 WRC per week). Within the context of a RtI

dual discrepancy framework, the low-low group might be considered for more intensive

intervention and further consideration for special education or disability diagnosis.

To answer our second and third research questions, results suggest the assignment of

cases to groups can be also done with a high degree of confidence using ROI alone. With the

initial achievement removed from the model (intercepts centered), two distinct groups still

emerged. The low ROI group (0.93 WRC per week) comprised 31% of the sample and was

distinct from the moderate ROI group (1.52 WRC per week) that comprised 69% of the sample.

Our empirical investigation suggests that the students within each class have the same amount of heterogeneity in growth and intercept values, because the RMM with a class constant, rather than class specific, intercept-slope covariance fitted the dataset better.

Implications for the RtI Model of Student Identification

As previously described, the data used in the current study are typical of data regularly collected in educational practice rather than under research conditions. The response of a student to an educational

program is influenced by a multitude of factors. Despite these known threats to the validity of

interpretation of ROIs, results of the present study align remarkably well with the RtI model of

student identification. In other words, the statistically derived subpopulations are practically

meaningful.

Students in both empirically-derived classes in the present analysis have a level of

performance below publisher benchmarks, as would be expected for students selected to receive

additional reading instruction. Students in class 2 (moderate-moderate) demonstrate a higher

level and demonstrate appropriate (if not ambitious) growth. In other words, students in class 2

appear to be responsive to the instruction. Students in class 1 (low-low) have a much lower level

and do not demonstrate appropriate growth in response to instruction. If 20% to 40% of the

overall student population is represented in this progress monitoring dataset, and 22% of those

students are empirically classified into the low-low or non-responsive group (Table 2), then 4.4%

to 8.8% of all students would potentially be classified as not making progress. These percentages

are in line with previous intervention studies on the incidence of learning disabilities as identified

with the dual-discrepancy model (Fuchs, Fuchs, & Speece, 2002), a model that can be used in

RtI special education eligibility determinations. These results lend support to the use of ROIs in

practice. Moreover, a notable feature of the RMM modeling approach is that though the students

have been grouped into two classes, student-specific characteristics have also been accounted for

through the random effects parameters in the model.

An empirical demonstration of two subpopulations of students does not imply that

student classification is permanent. In other words, a student with low ROI could still increase

their ROI with a different educational program. Future research with RMMs may investigate the

conditions under which students change classes. Although the present study focused on

classification, RtI is a resource allocation model. Its purpose is to accurately and flexibly target

intervention resources. The use of statistical models as part of data-based decision making could

serve to improve the accurate, flexible provision of early intervention services and prevent long-

term reading failure.

Different Passage Set Types

The present dataset incorporates CBM-R data across different passage sets (AIMSweb,

DIBELSNext, and FAST). Classification proportions were consistent across these passage sets.

For example, the proportions of students placed in class 1 who were assessed with AIMSweb,

FAST, and DIBELSNext were 0.22, 0.24, and 0.20, respectively. To further investigate how, if at all, differences between passage set types interact with the prediction of classes, we refitted all models with passage set type included as a covariate. In 99% of cases, students were classified into the same class regardless of whether passage set was included as a covariate. Since only 3 of 215 students had an inconsistent classification, it is difficult to say whether the inconsistency reflects any

systematic underlying factor. We can conclude that regardless of passage set publisher and

school, students are clearly, empirically classified into two separate subpopulations. In short, for

classification purposes, different passage sets are consistent.

Predictions

One potential application of RMMs is prediction. For the present study, this would

involve the prediction of CBM-R scores at future time points, to aid in decision-making at the

present time point. As an initial exploration, we applied the two-class models to 6, 10, and 14

weeks’ data. The results using 14 weeks’ data were similar to the results using 20 weeks’ data

(92% of students remained in the same class), which indicates that it would be reasonable to

predict growth at 20 weeks using 14 weeks’ data. For comparison, only 79% of students

remained in the same class when OLSR and a cutoff score were used for classification. The need

for 14 weeks of data also aligns with previous work on the duration required to make consistent

decisions with CBM-R progress monitoring data (Christ, Zopluoglu, Long, et al., 2012; Christ,

Zopluoglu, Monaghen, et al., 2013).
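Such a check can be scripted by refitting on truncated data and comparing assignments; the sketch below reuses the hypothetical pm and fit2 objects from the earlier sketches. Note that mixture component labels can switch between fits, so labels should be aligned (e.g., by mean slope) before comparing.

```r
# Refit the two-class RMM on the first 14 weeks and compare class
# assignments with the 20-week fit (`fit2` from the earlier sketch).
library(mixtools)

pm14 <- pm[pm$week <= 14, ]
by14 <- split(pm14, pm14$student)
fit2_14 <- regmixEM.mixed(y = lapply(by14, function(d) d$wrc),
                          x = lapply(by14, function(d) cbind(d$week)),
                          k = 2, arb.R = FALSE, arb.sigma = TRUE,
                          epsilon = 1e-3)

# After aligning component labels across the two fits:
class20 <- apply(fit2$posterior.z,    1, which.max)
class14 <- apply(fit2_14$posterior.z, 1, which.max)
mean(class14 == class20)   # the study found 92% stayed in the same class
```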

RMMs could also potentially describe a range of possible ROI values when a given

intervention is used with a group of students who have similar characteristics to those in a

previous dataset and analysis. RMMs will usually ameliorate the biases in growth estimates

inherent to the use of OLSR with highly variable progress monitoring data.

Application of RMMs would be most successful if models were built into vendor

software already used by educators to prevent the need for educators to possess advanced

statistical analysis tools or skills. The use of vendor software would also appropriately limit the

application of RMMs to datasets of sufficient sizes to obtain reliable mean growth estimates. In

practice, vendors could provide initial datasets for schools that do not have sufficient numbers of

students. Provision of initial datasets would be analogous to the typical provision of norms.

More research would be needed on the consequential validity of these potential automated

classification recommendations.

Limitations

Previous research suggests that variability in individual students’ weekly scores, the

number of scores available to make the prediction, and how far into the future the prediction is to

be made may all impact any model’s prediction accuracy (Christ, Zopluoglu, Long, et al., 2012;

Christ, Zopluoglu, Monaghen, et al., 2013). Future work should isolate these variables to inform

conditions under which predictions are most accurate before application would be possible.

Although the present dataset draws from student populations around the country as well as across probe sets, the results are limited to second grade students who would benefit from modified reading instruction. For a larger cross-grade sample, the analysis must be conducted by grade

since it is not reasonable to assume the same mean growth (slope) value for all grades.

Inferences made from the present study are further limited by the minimal information

available about instruction. The students included in the dataset were all classified as students

who would benefit from reading instruction. Information was not available, however, on the

type, intensity, timing, or fidelity of intervention these students received.

We also do not know if the 20 weeks of CBM data collection were 20 consecutive weeks

or if there were gaps in the progress monitoring data collection and intervention due to school

vacations. It would be useful to know if the data collection happened uniformly for all students

or if some students had continuous progress monitoring while others did not. In the latter case, it

might be difficult to interpret growth estimates based on school weeks as actual ROI.

Complexities and risks often emerge in practice because of a lack of standardization of

implementation and measurement of educational programs. That is, the measured response of a

student to an educational program is likely influenced by multiple extraneous variables. For

example, the magnitude of a ROI could depend on the instructional program selected, fidelity of

implementation, dosage/amount of service, and the qualities of the data collected to monitor

program effects. These are threats to validity of the interpretation and use of ROIs. This study

does not address the relative influence of those threats, but it does evaluate whether distinct

groups can be derived from samples of data in which those threats abound.

Conclusions and Future Directions

Repeated measures data are a key part of Response to Intervention service delivery

models. Accurate and efficient analysis of these data promotes accurate educational decisions

and efficient allocation of resources. The present study illustrated the use of RMMs to

empirically support the existence of two subpopulations within the group of students who receive additional reading instruction. This initial application provides empirical evidence in support of the dual-discrepancy theory of learning disability identification. Future research may directly assess

the classification accuracy of RMMs for learning disabilities, may assess conditions under which

students change classification, or may extend RMMs to identification of subpopulations of

students in other academic areas and assessments, such as CBM-Math.



References

Ardoin, S. P., Christ, T. J., Morena, L. S., Cormier, D. C., & Klingbeil, D. A. (2013). A

systematic review and summarization of the recommendations and research surrounding

curriculum-based measurement of oral reading fluency (CBM-R) decision rules. Journal

of School Psychology, 51(1), 1–18.

Ardoin, S. P., & Christ, T. J. (2009). Curriculum-based measurement of oral reading: Standard

errors associated with progress monitoring outcomes from DIBELS, AIMSweb, and an

experimental passage set. School Psychology Review, 38(2), 266.

Benaglia, T., Chauveau, D., Hunter, D., & Young, D. (2009). mixtools: An R package for

analyzing finite mixture models. Journal of Statistical Software, 32(6), 1-29.

Batsche, G. M., Elliot, J., Graden, J. L., Grimes, J., Kovaleski, J. F., Prasse, D., . . . Tilly, W. D.

(2005). Response to intervention: Policy considerations and implementation. Alexandria,

VA: NASDE.

Celeux, G., & Soromenho, G. (1996). An entropy criterion for assessing the number of clusters in a mixture model. Journal of Classification, 13(2), 195-212.

Christ, T. J. (2006). Short-term estimates of growth using curriculum-based measurement of oral

reading fluency: Estimating standard error of the slope to construct confidence

intervals. School Psychology Review, 35(1), 128.

Christ, T. J., & Poncy, B. C. (2005). Guest editors' introduction to a special issue on response to

intervention. Journal of Psychoeducational Assessment, 23(4), 299-303.

Christ, T. J., & Silberglitt, B. (2007). Estimates of the standard error of measurement for

curriculum-based measures of oral reading fluency. School Psychology Review, 36(1),

130.

Christ, T. J., Zopluoglu, C., Long, J. D., & Monaghen, B. D. (2012). Curriculum-based

measurement of oral reading: Quality of progress monitoring outcomes. Exceptional

Children, 78(3), 356-373.

Christ, T. J., Zopluoglu, C., Monaghen, B., & Van Norman, E. R. (2013). Curriculum-based

measurement reading (CBM-R) progress monitoring: Multi-study evaluation of schedule,

duration, and dataset quality on progress monitoring outcomes. Journal of School

Psychology (51), 19-57. doi:10.1016/j.jsp.2012.11.001

Curran, P. J., Obeidat, K., & Losardo, D. (2010). Twelve frequently asked questions about

growth curve modeling. Journal of Cognition and Development, 11(2), 121-136.

Deno, S. L. (1985). Curriculum-based measurement: The emerging alternative. Exceptional

children, 52(3), 219-232.

Deno, S. L. (1986). Formative evaluation of individual student programs: A new role for school

psychologists. School Psychology Review.

Deno, S. L. (1990). Individual differences and individual difference. The Journal of Special

Education, 24(2), 160.

Deno, S. L., Marston, D., & Tindal, G. (1986). Direct and frequent curriculum-based

measurement: An alternative for educational decision making. Special Services in the

Schools, 2(2), 5–27.

Deno, S. L., & Mirkin, P. K. (1977). Data-based program modification: A manual. Reston, VA:

Council for Exceptional Children.

Deno, S., Fuchs, L. S., Marston, D., & Shin, J. (2001). Using curriculum-based measurement to

establish growth standards for students with learning disabilities. School Psychology

Review, 30(4), 507-524.



FastBridge Learning (2018). Formative Assessment System for Teachers. Minneapolis, MN: Author.

Fuchs, L. S. (2003). Assessing intervention responsiveness: Conceptual and technical issues.

Learning Disabilities Research and Practice, 18, 172–186.

Fuchs, D., & Fuchs, L. S. (2006). Introduction to response to intervention: What, why, and how valid is it? Reading Research Quarterly, 41(1), 93-99.

Fuchs, L. S., Fuchs, D., & Speece, D. L. (2002). Treatment validity as a unifying construct for

identifying learning disabilities. Learning Disability Quarterly, 25(1), 33-45.

Fuchs, L. S., Fuchs, D., Hamlett, C. L., Walz, L., & Germann, G. (1993). Formative evaluation

of academic progress: How much growth can we expect. School Psychology Review, 22,

27-48.

Good, R. H., & Kaminski, R. A. (2011). Dynamic Indicators of Basic Early Literacy Skills Next.

Eugene, OR: Dynamic Measurement Group. Retrieved from http://www.dibels.org/

Good, R. H., & Shinn, M. R. (1990). Forecasting accuracy of slope estimates for reading

curriculum-based measurement: Empirical evidence. Behavioral Assessment, 12, 179–

193.

Hintze, J. M., & Christ, T. J. (2004). An examination of variability as a function of passage

variance in CBM progress monitoring. School Psychology Review, 33(2), 204.

Individuals with Disabilities Education Improvement Act, 20 U.S.C., Pub. L. No. 108-446 §

1400 et seq. (2004).

Kass, R. E., & Raftery, A. E. (1995). Bayes factors. Journal of the American Statistical Association, 90(430), 773-795.



Kim, S. Y. (2012). Sample size requirements in single-and multiphase growth mixture models: A

Monte Carlo simulation study. Structural Equation Modeling: A Multidisciplinary

Journal, 19(3), 457-476.

Laird, N. M., & Ware, J. H. (1982). Random-effects models for longitudinal data. Biometrics, 38,

963-974.

McLachlan, G., & Peel, D. (2004). Finite Mixture Models. John Wiley & Sons.

Meredith, W., & Tisak, J. (1990). Latent curve analysis. Psychometrika, 55, 107-122.

Muthén, B. (2001). Latent variable mixture modeling. New Developments and Techniques in

Structural Equation Modeling, 1-33.

Muthén, B., & Muthén, L. K. (2000). Integrating person‐centered and variable‐centered analyses:

Growth mixture modeling with latent trajectory classes. Alcoholism: Clinical and

Experimental Research, 24(6), 882-891.

Muthén, L. K., & Muthén, B. O. (2002). How to use a Monte Carlo study to decide on sample

size and determine power. Structural Equation Modeling, 9(4), 599-620.

Muthén, B., & Shedden, K. (1999). Finite mixture modeling with mixture outcomes using the

EM algorithm. Biometrics, 55(2), 463-469.

Nylund, K. L., Asparouhov, T., & Muthén, B. O. (2007). Deciding on the number of classes in

latent class analysis and growth mixture modeling: A Monte Carlo simulation

study. Structural Equation Modeling, 14(4), 535-569.

Pearson, Inc. (2012). AIMSweb Technical Manual. Retrieved from http://www.aimsweb.com/

Raudenbush, S. W., & Bryk, A. S. (2002). Hierarchical linear models: Applications and data

analysis methods (Vol. 1). Sage.



Reschly, A. L., Busch, T. W., Betts, J., Deno, S. L., & Long, J. D. (2009). Curriculum-based

measurement oral reading as an indicator of reading achievement: A meta-analysis of the

correlational evidence. Journal of School Psychology, 47(6), 427-469.

TJCC (2015). Formative Assessment System for Teachers: Technical Manual Version 2.0. Minneapolis, MN: Author and FastBridge Learning (www.fastbridge.org).

Tofighi, D., & Enders, C. K. (2008). Identifying the correct number of classes in growth mixture

models. In Advances in latent variable mixture models (pp. 317-341). Information Age Publishing.

Wayman, M. M., Wallace, T., Wiley, H. I., Ticha, R., & Espin, C. A. (2007). Literature synthesis

on curriculum-based measurement in reading. The Journal of Special Education, 41(2),

85-120.
