One problem with the split-half method is that the reliability estimate obtained using any
random split of the items is likely to differ from that obtained using another. One solution to
this problem is to compute the Spearman-Brown corrected split-half reliability coefficient for
every one of the possible split-halves and then find the mean of those coefficients. This is the
motivation for Cronbach's alpha.
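This relationship can be checked numerically. The sketch below (pure Python, illustrative random data) enumerates every equal-size split of a small test and compares the mean split-half coefficient with alpha. Note that it uses the Flanagan-Rulon form of the split-half coefficient, for which the identity is exact, rather than the Spearman-Brown correction (the two agree when the half variances are equal):

```python
# Check: for a test with an even number of items, Cronbach's alpha equals the
# mean of the Flanagan-Rulon split-half coefficients over all equal-size splits.
from itertools import combinations
import random

def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def cronbach_alpha(scores):
    """scores: list of respondents, each a list of k item scores."""
    k = len(scores[0])
    item_vars = sum(variance([row[j] for row in scores]) for j in range(k))
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_vars / total_var)

def mean_split_half(scores):
    """Mean Flanagan-Rulon coefficient over all equal-size splits."""
    k = len(scores[0])
    total_var = variance([sum(row) for row in scores])
    coeffs = []
    # Fix item 0 in the first half so each split is counted exactly once.
    for rest in combinations(range(1, k), k // 2 - 1):
        half_a = (0,) + rest
        half_b = [j for j in range(k) if j not in half_a]
        va = variance([sum(row[j] for j in half_a) for row in scores])
        vb = variance([sum(row[j] for j in half_b) for row in scores])
        coeffs.append(2 * (1 - (va + vb) / total_var))
    return sum(coeffs) / len(coeffs)

random.seed(1)
data = [[random.randint(1, 7) for _ in range(6)] for _ in range(30)]
print(round(cronbach_alpha(data), 6), round(mean_split_half(data), 6))
# the two printed values agree
```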
Cronbach's alpha is superior to Kuder and Richardson Formula 20 in that it can also be used with
continuous and non-dichotomous data. In particular, it can be used for tests with partial
credit and for questionnaires using a Likert scale.
Property 1: Let xj = tj + ej where each ej is independent of tj and all the ej are independent of
each other. Also let x0 = x1 + x2 + ... + xk and t0 = t1 + t2 + ... + tk. Then the reliability of x0 is at least α, where

α = (k / (k − 1)) · (1 − (var(x1) + ... + var(xk)) / var(x0))

is Cronbach's alpha.
Here we view the xj as the measured values, the tj as the true values and the ej as the
measurement error values. A proof of Property 1 is given in the Real Statistics source article.
Observation: Cronbach's alpha provides a useful lower bound on reliability (as seen in Property
1). Cronbach's alpha will generally increase when the correlations between the items increase.
For this reason the coefficient measures the internal consistency of the test. Its maximum value
is 1, and usually its minimum is 0, although it can be negative (see below).
A commonly accepted rule of thumb is that an alpha of 0.7 (some say 0.6) indicates acceptable
reliability and 0.8 or higher indicates good reliability. Very high reliability (0.95 or higher) is not
necessarily desirable, as it may indicate that the items are largely redundant. These are only
guidelines, and the actual value of Cronbach's alpha depends on many things. For example, as the
number of items increases, Cronbach's alpha tends to increase too, even without any increase in
internal consistency.
The goal in designing a reliable instrument is for scores on similar items to be related (internally
consistent), but for each to contribute some unique information as well.
Observation: There are a number of reasons why Cronbach's alpha could be low or even
negative even for a perfectly valid test. Two such reasons are reverse coding and multiple
factors.
Reverse coding: Suppose you use a Likert scale of 1 to 7 with 1 meaning strongly disagree and 7
meaning strongly agree. Suppose two of your questions are Q1: "I like pizza" and Q20: "I dislike
pizza". These questions ask the same thing, but with reverse wording. In order to apply
Cronbach's alpha properly you need to reverse the scoring of any negatively phrased question,
Q20 in our example. Thus if a response to Q20 is, say, 2, it needs to be scored as 6 instead of 2
(i.e. 8 minus the recorded score).
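In code, reverse coding is a one-line transform (a minimal sketch; the response values are hypothetical): on a scale running from low to high, the reversed score is (low + high) minus the recorded score.

```python
# Reverse-code a negatively phrased Likert item before computing alpha.
# On a 1..7 scale the reversed score is 8 - recorded; in general it is
# (low + high) - recorded for a scale running from low to high.

def reverse_code(score, low=1, high=7):
    return low + high - score

# Hypothetical response to the negatively phrased pizza question above:
q20 = 2                                # "I dislike pizza", recorded score
print(reverse_code(q20))               # 6
print(reverse_code(5, low=1, high=5))  # 1 (on a 1..5 scale)
```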
Multiple factors: Cronbach's alpha is useful where all the questions are testing more or less the
same thing, called a factor. If there are multiple factors, then you need to determine which
questions are testing which factors. If, say, there are three factors (e.g. happiness with your job,
happiness with your marriage and happiness with yourself), then you need to split the
questionnaire/test into three tests, one containing the questions testing factor 1, one with the
questions testing factor 2 and the third with the questions testing factor 3. You then calculate
Cronbach's alpha for each of the three tests. The process of determining these hidden factors
and splitting the test by factor is called factor analysis (see Factor Analysis).
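The splitting step can be sketched in a few lines (a pure-Python illustration; the factor names and item groupings are hypothetical, and in practice they would come from a factor analysis): compute one alpha per subscale rather than a single alpha for the whole questionnaire.

```python
import random

def cronbach_alpha(scores):
    """scores: list of respondents, each a list of item scores."""
    k = len(scores[0])
    def var(xs):
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)
    item_vars = sum(var([row[j] for row in scores]) for j in range(k))
    return k / (k - 1) * (1 - item_vars / var([sum(row) for row in scores]))

# Hypothetical assignment of 9 questionnaire columns to the 3 factors:
factors = {"job": [0, 1, 2], "marriage": [3, 4, 5], "self": [6, 7, 8]}

def subscale_alphas(scores, factors):
    """One Cronbach's alpha per factor, computed on that factor's items only."""
    return {name: cronbach_alpha([[row[j] for j in cols] for row in scores])
            for name, cols in factors.items()}

random.seed(0)
data = [[random.randint(1, 7) for _ in range(9)] for _ in range(25)]
print(subscale_alphas(data, factors))
```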
Example 1: Calculate Cronbach's alpha for the data in Example 1 of Kuder and Richardson
Formula 20 (repeated in Figure 1 below).
Example 2: Calculate Cronbach's alpha for the survey in Example 1, where any one question is
removed.
The necessary calculations are displayed in Figure 3.
Figure 3 Cronbach's Alpha for Example 2
Each of the columns B through L represents the test with one question removed. Column B
corresponds to question #1, column C corresponds to question #2, etc. Figure 4 displays the
formulas corresponding to question #1 (i.e. column B); the formulas for the other questions are
similar. Some of the references are to cells shown in Figure 2.
Observation: Another way to calculate Cronbach's alpha is to use the Two Factor ANOVA
without Replication data analysis tool on the raw data and note that α = 1 − MSE / MSRow,
where MSRow is the mean square for the rows (the subjects) and MSE is the error mean square.
As you can see from Figure 5, Cronbach's alpha is .73802, the same value calculated in Figure 1.
Observation: Alternatively, we could use the Real Statistics Two Factor ANOVA data analysis
tool, setting the Number of Rows per Sample to 1. We can also obtain the same result using
the following supplemental function.
Real Statistics Function: The following function is provided in the Real Statistics Resource Pack:
CRONALPHA(R1) = Cronbach's alpha for the data in range R1
As noted above, for the data in Figure 1, CRONALPHA(B4:L15) = .738019.
Example 4: Calculate Cronbach's alpha for a 10-question questionnaire with Likert scores
between 1 and 7 based on the 15-person sample shown in Figure 6.
Statistics Corner: Questions and answers about language testing statistics
QUESTION: For what kind of test would a coefficient alpha reliability be appropriate? How
does one interpret reliability coefficients?
ANSWER: Coefficient alpha is one name for the Cronbach alpha reliability estimate. Cronbach
alpha is one of the most commonly reported reliability estimates in the language testing
literature. To adequately explain Cronbach alpha, I will need to address several sub-questions:
(a) What are the different strategies for estimating reliability? (b) Where does Cronbach alpha
fit into these strategies for estimating reliability? And, (c) how should we interpret Cronbach
alpha?
Where does Cronbach alpha fit into these strategies for estimating reliability?
Internal consistency reliability estimates come in several flavors. The most familiar are the
(a) split-half adjusted (i.e., adjusted using the Spearman-Brown prophecy formula, which is the
focus of Brown, 2001), (b) Kuder-Richardson formulas 20 and 21 (also known as K-R20 and K-
R21, see Kuder & Richardson, 1937), and (c) Cronbach alpha (see Cronbach, 1970).
The most frequently reported internal consistency estimates are the K-R20 and Cronbach
alpha. Either one provides a sound under-estimate (that is, a conservative or safe estimate) of the
reliability of a set of test results. However, the K-R20 can only be applied if the test items are
scored dichotomously (i.e., right or wrong). Cronbach alpha can also be applied when test items
are scored dichotomously, but alpha has the advantage over K-R20 of being applicable when
items are weighted (as in an item scored 0 points for a functionally and grammatically incorrect
answer, 1 point for a functionally incorrect, but grammatically correct answer, 2 points for a
functionally correct but grammatically incorrect answer, and 3 points for a functionally and
grammatically correct answer). Hence, Cronbach alpha is more flexible than K-R20 and is often
the appropriate reliability estimate for language test development projects and language
testing research.
How should we interpret Cronbach alpha?
1. Cronbach alpha provides an estimate of the internal consistency of the test, thus (a)
alpha does not indicate the stability or consistency of the test over time, which would
be better estimated using the test-retest reliability strategy, and (b) alpha does not
indicate the stability or consistency of the test across test forms, which would be better
estimated using the equivalent forms reliability strategy.
2. Cronbach alpha is appropriately applied to norm-referenced tests and norm-referenced
decisions (e.g., admissions and placement decisions), but not to criterion-referenced
tests and criterion-referenced decisions (e.g., diagnostic and achievement decisions).
3. All other factors held constant, tests that have normally distributed scores are more
likely to have high Cronbach alpha reliability estimates than tests with positively or
negatively skewed distributions, and so alpha must be interpreted in light of the
particular distribution involved.
4. All other factors held constant, Cronbach alpha will be higher for longer tests than for
shorter tests (as shown and explained in Brown 1998 & 2001), and so alpha must be
interpreted in light of the particular test length involved.
5. The standard error of measurement (or SEM) is an additional reliability statistic
calculated from the reliability estimate (as explained in Brown, 1999b) that may prove
more useful than the reliability estimate itself when you are making actual decisions
with test scores. The SEM's usefulness arises from the fact that it provides an estimate
of how much variability in actual test score points you can expect around a particular
cut-point due to unreliable variance (with roughly 68% probability if a band of plus or minus
one SEM is used, 95% for two SEMs, or 99.7% for three). (For
more on this topic, see Brown 1996 or 1999a).
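The SEM calculation itself is one line (a sketch with made-up numbers, using the standard formula SEM = SD × sqrt(1 − reliability) and the usual normal-curve coverages):

```python
import math

def sem(sd, reliability):
    """Standard error of measurement from the score SD and a reliability estimate."""
    return sd * math.sqrt(1 - reliability)

# Hypothetical test: score SD of 10 points, Cronbach alpha of .84
s = sem(10.0, 0.84)
print(round(s, 6))                         # 4.0 (sqrt of 0.16 is 0.4)

# About 68% of observed scores fall within one SEM of the true score;
# around a cut-point of 60, that band is:
print(round(60 - s, 6), round(60 + s, 6))  # 56.0 64.0
```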
Conclusion
Clearly, Cronbach alpha is a useful and flexible tool that you can use to investigate the
reliability of your language test results. In the process, it is important to remember that
reliability, regardless of the strategy used to obtain it, is not a characteristic inherent in the test
itself, but rather is an estimate of the consistency of a set of items when they are administered
to a particular group of students at a specific time under particular conditions for a specific
purpose. Extrapolating from reliability results obtained under a particular set of circumstances
to other situations must be done with great care.
References
Brown, J. D. (1996). Testing in language programs. Upper Saddle River, NJ: Prentice Hall.
Brown, J. D. (1997). Statistics Corner: Questions and answers about language testing statistics:
Reliability of surveys. Shiken: JALT Testing & Evaluation SIG Newsletter, 1 (2), 17-19. Retrieved
December 24, 2001 from the World Wide Web: http://jalt.org/test/bro_2.htm
Brown, J. D. (1998). Statistics Corner: Questions and answers about language testing statistics:
Reliability and cloze test length. Shiken: JALT Testing & Evaluation SIG Newsletter, 2 (2), 19-22.
Retrieved December 24, 2001 from the World Wide Web: http://jalt.org/test/bro_3.htm
Brown, J. D. (1999b). Statistics Corner. Questions and answers about language testing statistics:
Standard error vs. standard error of measurement. Shiken: JALT Testing & Evaluation SIG
Newsletter, 3 (1), 15-19. Retrieved December 24, 2001 from the World Wide
Web: http://jalt.org/test/bro_4.htm.
Brown, J. D. (2001). Statistics Corner. Questions and answers about language testing statistics:
Can we use the Spearman-Brown prophecy formula to defend low reliability? Shiken: JALT
Testing & Evaluation SIG Newsletter, 4 (3), 7-9. Retrieved December 24, 2001 from the World
Wide Web: http://jalt.org/test/bro_9.htm.
Cronbach, L. J. (1970). Essentials of psychological testing (3rd ed.). New York: Harper & Row.
Abstract
Summated scales are often used in survey instruments to probe underlying constructs that the
researcher wants to measure. These may consist of indexed responses to dichotomous or
multi-point questionnaires, which are later summed to arrive at a resultant score associated
with a particular respondent. Usually, development of such scales is not the end of the research
itself, but rather a means to gather predictor variables for use in objective models. However,
the question of reliability rises as the function of scales is stretched to encompass the realm of
prediction. One of the most popular reliability statistics in use today is Cronbach's alpha
(Cronbach, 1951). Cronbach's alpha determines the internal consistency or average correlation
of items in a survey instrument to gauge its reliability. This paper will illustrate the use of the
ALPHA option of the PROC CORR procedure from SAS(R) to assess and improve upon the
reliability of variables derived from summated scales.
J. Reynaldo A. Santos
Extension Information Technology
Texas Agricultural Extension Service
Texas A&M University
College Station, Texas
Internet address: j-santos@tamu.edu
Introduction
Reliability comes to the forefront when variables developed from summated scales are used as
predictor components in objective models. Since summated scales are an assembly of
interrelated items designed to measure underlying constructs, it is very important to know
whether the same set of items would elicit the same responses if the same questions are recast
and re-administered to the same respondents. Variables derived from test instruments are
declared to be reliable only when they provide stable and reliable responses over a repeated
administration of the test.
The ALPHA option in PROC CORR provides an effective tool for measuring Cronbach's alpha,
which is a numerical coefficient of reliability. Computation of alpha is based on the reliability of
a test relative to other tests with the same number of items, measuring the same construct of
interest (Hatcher, 1994). This paper will illustrate the use of the ALPHA option of the PROC
CORR procedure from SAS(R) to assess and improve upon the reliability of variables derived
from summated scales.
Procedure
Sixteen questions using Likert-type scales (1 = strongly agree; 6 = strongly disagree) from a
national agricultural and food preference policy survey were administered nationwide. Usable
survey forms, totaling 1,111, were received and processed using the PROC FACTOR and PROC
CORR procedures of SAS. Three common factors were extracted during factor analysis and were
interpreted to represent "subsidy policy" factors, "regulatory policy" factors, and "food safety
policy" factors (Santos, Lippke, & Pope, 1998).
To make the demonstration on Cronbach's alpha possible, SB8, which was a variable previously
deleted during factor analysis, was restored in the data set. SB8 was used to demonstrate how
a poorly selected item on a summated scale can affect the resulting value of alpha. It should be
noted here that factor analysis is not required in the determination of Cronbach's alpha.
After factor analysis, it is a common practice to attach a descriptive name to each common
factor once it is extracted and identified. The assigned name is indicative of the predominant
concern that each factor addresses. In SAS, a RENAME FACTOR(i)='descriptive name' statement
would do the job. In this example, this can be accomplished by
RENAME FACTOR1=SUBSIDY
FACTOR2=REGULATE
FACTOR3=FSAFETY;
While labeling is not critical, it definitely makes for an easy identification of which construct is
used in a particular procedure. At this point, the named common factors can now be
used as independent or predictor variables. However, most experienced researchers would
insist on running a reliability test for all the factors before using them in subsequent analyses.
If you were giving an evaluation survey, would it not be nice to know that the instrument you
are using will always elicit consistent and reliable responses even if the questions were replaced
with other similar questions? When you have a variable generated from such a set of questions
that return a stable response, then your variable is said to be reliable. Cronbach's alpha is an
index of reliability associated with the variation accounted for by the true score of the
"underlying construct." Construct is the hypothetical variable that is being measured (Hatcher,
1994).
Alpha coefficient ranges in value from 0 to 1 and may be used to describe the reliability of
factors extracted from dichotomous (that is, questions with two possible answers) and/or
multi-point formatted questionnaires or scales (i.e., rating scale: 1 = poor, 5 = excellent). The
higher the score, the more reliable the generated scale is. Nunnaly (1978) has indicated 0.7 to
be an acceptable reliability coefficient but lower thresholds are sometimes used in the
literature.
For this demonstration, the observed variables that loaded on the latent construct earlier
labeled "REGULATE" were used to run the Cronbach's alpha analysis. The following SAS
statements initiated the procedure:
The first statement invoked the procedure PROC CORR that implements the option ALPHA to do
Cronbach's alpha analysis on all observations with no missing values (dictated by the NOMISS
option). The VAR statement lists all the variables to be processed for the analysis.
Incidentally, the listed variables, except SB8, were the ones that loaded high (i.e., showed high
positive correlation) in factor analysis. The output from the analysis is shown in Table 1.
Table 1
Output of alpha analysis for the items included in the "REGULATE" construct
The printed output facilitates the identification of dispensable variable(s) by listing the
deleted variables in the first column together with the expected resultant alpha in the same
row in the third column. For this example, the table indicates that if SB8 were to be deleted
then the value of raw alpha will increase from the current .77 to .81. Note that the same
variable has the lowest item-total correlation value (.185652). This indicates that SB8 is not
measuring the same construct as the rest of the items in the scale are measuring. With this
process alone, not only was the author able to come up with the reliability index of the
"REGULATE" construct but he also managed to improve on it. What this means is that removal
SB8 from the scale will make the construct more reliable for use as a predictor variable.
Conclusion
This paper has demonstrated the procedure for determining the reliability of summated scales.
It emphasized that reliability tests are especially important when derivative variables are
intended to be used for subsequent predictive analyses. If the scale shows poor reliability, then
individual items within the scale must be re-examined and modified or completely changed as
needed. One good method of screening for efficient items is to run an exploratory factor
analysis on all the items contained in the survey to weed out those variables that failed to show
high correlation. In fact in this exercise, SB8 had been previously eliminated when it showed
low correlation during factor analysis. It was intentionally reinstated to demonstrate how the
ALPHA option in PROC CORR procedure would flag and mark it out for deletion to generate an
improved alpha.
References
Cronbach, L. J. (1951). Coefficient alpha and the internal structure of tests. Psychometrika, 16,
297-334.
Hatcher, L. (1994). A step-by-step approach to using the SAS(R) system for factor analysis and
structural equation modeling. Cary, NC: SAS Institute.
Santos, J.R.A., Lippke, L., and Pope, P. (1998). PROC FACTOR: A tool for extracting hidden gems
from a mountain of variables. Proceedings of the 23rd Annual SAS Users Group International
Conference. Cary, NC: SAS Institute Inc.
SPSS FAQ
What does Cronbach's alpha mean?
Cronbach's alpha is a measure of internal consistency, that is, how closely related a set of items
are as a group. It is considered to be a measure of scale reliability. A "high" value for alpha
does not imply that the measure is unidimensional. If, in addition to measuring internal
consistency, you wish to provide evidence that the scale in question is unidimensional,
additional analyses can be performed. Exploratory factor analysis is one method of checking
dimensionality. Technically speaking, Cronbach's alpha is not a statistical test - it is a coefficient
of reliability (or consistency).
Cronbach's alpha can be written as a function of the number of test items and the average
inter-correlation among the items. Below, for conceptual purposes, we show the formula for
the standardized Cronbach's alpha:

α = N·c̄ / (v̄ + (N − 1)·c̄)

Here N is equal to the number of items, c̄ (c-bar) is the average inter-item covariance among the
items and v̄ (v-bar) equals the average variance.
One can see from this formula that if you increase the number of items, you increase
Cronbach's alpha. Additionally, if the average inter-item correlation is low, alpha will be
low. As the average inter-item correlation increases, Cronbach's alpha increases as well
(holding the number of items constant).
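The item-count effect is easy to see with this formula (a small sketch with an arbitrary average inter-item correlation of .30; in the standardized case v-bar is 1 and c-bar equals the average correlation, so alpha = N·r / (1 + (N − 1)·r)):

```python
def standardized_alpha(n_items, avg_r):
    """Standardized alpha: the covariance formula with v-bar = 1, c-bar = avg_r."""
    return n_items * avg_r / (1 + (n_items - 1) * avg_r)

for n in (2, 5, 10, 20):
    print(n, round(standardized_alpha(n, 0.30), 3))
# alpha climbs from about 0.46 at 2 items to about 0.90 at 20 items,
# even though the average inter-item correlation never changes
```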
An example
Let's work through an example of how to compute Cronbach's alpha using SPSS, and how to
check the dimensionality of the scale using factor analysis. For this example, we will use a
dataset that contains four test items - q1, q2, q3 and q4, available as alpha.sav. To compute
Cronbach's alpha for all four items - q1, q2, q3, q4 - use
the reliability command:
RELIABILITY
/VARIABLES=q1 q2 q3 q4.
Here is the resulting output from the above syntax:
The alpha coefficient for the four items is .839, suggesting that the items have relatively high
internal consistency. (Note that a reliability coefficient of .70 or higher is
considered "acceptable" in most social science research situations.)
In addition to computing the alpha coefficient of reliability, we might also want to investigate
the dimensionality of the scale. We can use the factor command to do this:
FACTOR
/VARIABLES q1 q2 q3 q4
/FORMAT SORT BLANK(.35).
Here is the resulting output from the above syntax:
Looking at the table labeled Total Variance Explained, we see that the eigenvalue for the first
factor is quite a bit larger than the eigenvalue for the next factor (2.7 versus 0.54). Additionally,
the first factor accounts for 67% of the total variance. This suggests that the scale items are
unidimensional.
Reference(s):
http://www.real-statistics.com/reliability/cronbachs-alpha/
http://jalt.org/test/bro_13.htm
http://www.joe.org/joe/1999april/tt3.php
http://www.ats.ucla.edu/stat/spss/faq/alpha.html
Cronbach's Alpha (α) using SPSS
Setup in SPSS
In SPSS, the nine questions have been labelled Qu1 through to Qu9. To know how to
correctly enter your data into SPSS in order to run a Cronbach's alpha test, see
our Entering Data into SPSS tutorial. Alternately, you can learn about our enhanced
data setup content here.
Test Procedure in SPSS
The eight steps below show you how to check for internal consistency using Cronbach's
alpha in SPSS. At the end of these eight steps, we show you how to interpret the results
from your Cronbach's alpha.
Click Analyze > Scale > Reliability Analysis... on the top menu, as shown below:
Transfer the variables Qu1 to Qu9 into the Items: box. You can do this by drag-and-
dropping the variables into their respective boxes or by using the button. You will
be presented with the following screen:
Published with written permission from SPSS, IBM Corporation.
Leave the Model: set as "Alpha", which represents Cronbach's alpha in SPSS. If you
want to provide a name for the scale, enter it in the Scale label: box. Since this only prints
the name you enter at the top of the SPSS output, it is certainly not essential that you
do (in our example, we leave it blank).
Click on the Statistics... button, which will open the Reliability Analysis:
Statistics dialogue box, as shown below:
Select the Item, Scale and Scale if item deleted options in the Descriptives for area, and
the Correlations option in the Inter-Item area, as shown below:
Click the Continue button. This will return you to the Reliability Analysis dialogue box.
Click the OK button to generate the output.
SPSS Output for Cronbach's Alpha
SPSS produces many different tables. The first important table is the Reliability
Statistics table that provides the actual value for Cronbach's alpha, as shown below:
From our example, we can see that Cronbach's alpha is 0.805, which indicates a high
level of internal consistency for our scale with this specific sample.
Item-Total Statistics
The Item-Total Statistics table presents the "Cronbach's Alpha if Item Deleted" in
the final column, as shown below:
This column presents the value that Cronbach's alpha would be if that particular item
was deleted from the scale. We can see that removal of any question, except question
8, would result in a lower Cronbach's alpha. Therefore, we would not want to remove
these questions. Removal of question 8 would lead to a small improvement in
Cronbach's alpha, and we can also see that the "Corrected Item-Total Correlation"
value was low (0.128) for this item. This might lead us to consider whether we should
remove this item.
Cronbach's alpha simply provides you with an overall reliability coefficient for a set of
variables (e.g., questions). If your questions reflect different underlying personal
qualities (or other dimensions), for example, employee motivation and employee
commitment, Cronbach's alpha will not be able to distinguish between these. In order to
do this and then check their reliability (using Cronbach's alpha), you will first need to run
a test such as a principal components analysis (PCA). You can learn how to carry out
principal components analysis (PCA) using SPSS, as well as interpret and write up your
results, in our enhanced content. It is also possible to run Cronbach's alpha in Minitab.
Reference:
https://statistics.laerd.com/spss-tutorials/cronbachs-alpha-using-spss-statistics.php