Day 1: Contents
- Introduction
- Tests and Test Scores
- Data sets and their display
- Measures of central tendency
- Skewness
- Measures of dispersion
- Basic analysis using a spreadsheet
Introduction
The aims of the programme are:
- to help participants appreciate the need for statistical analysis in assessment;
- to enable participants to interpret key statistical indicators;
- to allow participants to discuss the relevance of statistics to their work.

The six-day programme will cover:
- Basic descriptive statistics;
- Basic comparative statistics;
- Classical item statistics;
- Introduction to item response theory and its uses.

The approach will place the emphasis on the interpretation of statistics, rather than the underlying mathematics, and will be related to:
- test design;
- test evaluation;
- grading.
- They are objective.
- They allow the technical performance of tests to be evaluated.
- They allow comparisons to be made among populations.
- They may highlight anomalies.
A test is an instrument designed to measure ability in a particular domain. A test is made up of a number of tasks. Each task may be made up of several items.

Students are asked to give responses to the items presented in the test. We mark each student's responses according to a marking scheme. This produces an overall test score for the student.

In the example below, the test has four tasks with 4, 3, 3 and 5 items. Applying the marking scheme to one student's responses and adding up the marks gives the total score.

Task      Marks awarded per item    Task score
Task 1    1 + 1 + 0 + 2             4
Task 2    1 + 2 + 1                 4
Task 3    0 + 0 + 1                 1
Task 4    1 + 4 + 3 + 0 + 1         9
Total                               18

If the test is valid and reliable, the total score will be a good indicator of student ability.
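The marking example above can be reproduced in a few lines of code. This is only an illustrative sketch: the item marks are the ones shown in the example, while the variable names are assumptions made for this illustration.

```python
# Marks awarded to one student, item by item, as in the marking example above.
marks_by_task = {
    "Task 1": [1, 1, 0, 2],
    "Task 2": [1, 2, 1],
    "Task 3": [0, 0, 1],
    "Task 4": [1, 4, 3, 0, 1],
}

# Add up the item marks within each task, then add the task scores together.
task_scores = {task: sum(marks) for task, marks in marks_by_task.items()}
total_score = sum(task_scores.values())

for task, score in task_scores.items():
    print(task, "=", score)
print("Total =", total_score)
```

Keeping the marks grouped by task makes it easy to report task scores as well as the overall total.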
We can now count the students who gain a particular mark. The results are recorded as a frequency distribution. We can then plot the list of numbers as a chart.
Score (x):      0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
Frequency (f):  0  0  0  0  0  0  0  0  1  1  1  2  1  2  2  2  4  3  5  4  2
The frequency is the number of students who scored a particular mark. So, for example, no one scored 7 marks, and five students scored 18 marks.

All this data is displayed on the chart below.

[Chart: Test Score Distribution]
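As a rough sketch of how a frequency distribution is built, the code below counts how many students gained each mark. The individual scores are expanded from the frequency table above; the order of the students is not given in the text, so the ordering used here is an assumption that does not affect the counts.

```python
from collections import Counter

# Individual scores expanded from the frequency table (30 students in all).
scores = [8, 9, 10, 11, 11, 12, 13, 13, 14, 14, 15, 15,
          16, 16, 16, 16, 17, 17, 17, 18, 18, 18, 18, 18,
          19, 19, 19, 19, 20, 20]

max_score = 20
counts = Counter(scores)

# One frequency per possible mark, including marks that nobody scored.
frequency = [counts.get(x, 0) for x in range(max_score + 1)]

# Print the table with a simple text bar for each score.
for x, f in enumerate(frequency):
    print(f"{x:>2}  {f:>2}  {'#' * f}")
```

The text bars give a quick picture of the distribution before any chart is drawn.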
In national examinations we have large numbers of students taking the tests. Also, our examination papers have more items or measuring points, typically 100 or more. In this case, ordering and displaying the distribution of scores is even more important.

Here are the results from one paper in an English Language examination. 514 students attempted this paper. Notice that here the frequency distribution table has an extra column. This shows the proportion (%) of students who get a particular score. For example, we can see immediately that 4,9% of students are on a mark of 40/60.
Score (x)  Frequency (f)  Proportion (%)  |  Score (x)  Frequency (f)  Proportion (%)
    0            1             0,2        |     31            8             1,6
    1            0             0,0        |     32            8             1,6
    2            0             0,0        |     33           12             2,3
    3            0             0,0        |     34            9             1,8
    4            0             0,0        |     35           13             2,5
    5            0             0,0        |     36           12             2,3
    6            0             0,0        |     37           17             3,3
    7            0             0,0        |     38           10             2,0
    8            0             0,0        |     39           16             3,1
    9            0             0,0        |     40           25             4,9
   10            0             0,0        |     41           24             4,7
   11            1             0,2        |     42           22             4,3
   12            0             0,0        |     43           29             5,6
   13            0             0,0        |     44           23             4,5
   14            0             0,0        |     45           33             6,4
   15            1             0,2        |     46           21             4,1
   16            0             0,0        |     47           25             4,9
   17            1             0,2        |     48           24             4,7
   18            0             0,0        |     49           21             4,1
   19            0             0,0        |     50           28             5,5
   20            1             0,2        |     51           21             4,1
   21            1             0,2        |     52           22             4,3
   22            5             1,0        |     53           14             2,7
   23            3             0,6        |     54           11             2,1
   24            5             1,0        |     55            8             1,6
   25            2             0,4        |     56            6             1,2
   26            4             0,8        |     57            0             0,0
   27            3             0,6        |     58            1             0,2
   28            8             1,6        |     59            2             0,4
   29            6             1,2        |     60            0             0,0
   30            7             1,4        |
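The proportion column can be calculated as 100 x frequency / number of students. The sketch below assumes the frequencies are typed in from the table above, in score order from 0 to 60; the variable names are illustrative only. Recalculated values may differ from the printed table by 0,1 in one or two rows.

```python
# Frequencies for scores 0..60, copied from the table above.
frequencies = [
    1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,          # scores 0-10
    1, 0, 0, 0, 1, 0, 1, 0, 0, 1,             # scores 11-20
    1, 5, 3, 5, 2, 4, 3, 8, 6, 7,             # scores 21-30
    8, 8, 12, 9, 13, 12, 17, 10, 16, 25,      # scores 31-40
    24, 22, 29, 23, 33, 21, 25, 24, 21, 28,   # scores 41-50
    21, 22, 14, 11, 8, 6, 0, 1, 2, 0,         # scores 51-60
]

total_students = sum(frequencies)

# Percentage of students on each score, rounded to one decimal place.
proportions = [round(100 * f / total_students, 1) for f in frequencies]

print("Students:", total_students)
print("Proportion on 40/60:", proportions[40], "%")
```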
[Chart: percentage of students on each mark. Horizontal axis: Score scale (0 to 60); vertical axis: Percent.]
Charts like these give us useful information about the pattern of scores produced when a group of students takes a test. However, they are difficult to compare directly. For this we need statistical indicators that tell us about:
- the middle of a distribution (measures of central tendency);
- the width of the distribution (measures of dispersion).
[Chart: Score Distribution. Horizontal axis: Score (0 to 60); vertical axis: # of students (0 to 35). The peak of the distribution is marked: mode = 45.]
The Median
The median is the score of the middle student when those taking the test are ranked according to their total test scores. For example, a test is taken by 15 students and their scores are: 10 17 13 20 15 14 17 8 19 12 17 ...
We arrange these in order, from lowest to highest. The middle student is the 8th in the ranking, with seven students below and seven above, and that student's score is the median.
If you have an even number of test-takers, find the students on either side of the middle and take the average of their scores. For example, with 16 students:

8 10 11 12 13 14 15 16 | 17 17 17 17 19 19 20 20

The two middle scores are 16 and 17, so median = (16 + 17) / 2 = 16,5. The median score is 16,5.

The Mean
The mean is the most commonly used indicator of typical performance. It is the arithmetic average of all test scores. It is calculated by adding up all the student scores and dividing by the number of students who took the test. For example:

Test scores: 8, 10, 11, 12, 13, 14, 15, 16, 17, 17, 17, 17, 19, 19, 20, 20
Total: 8+10+11+12+13+14+15+16+17+17+17+17+19+19+20+20 = 245
Average score = 245/16 = 15,3
The mean score = 15,3

Relationship between the mode, median and mean
If the test produces a symmetrical pattern of scores, the mode, mean and median are exactly the same. For example, the distribution below has all three equal to 5,0.
[Chart: Distribution of Test Scores. Horizontal axis: Score (0 to 10); vertical axis: Number of students (0 to 30).]
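The three measures of central tendency can be checked with Python's standard `statistics` module. This is a minimal sketch using the 16 test scores from the worked examples for the median and the mean; the variable names are illustrative.

```python
import statistics

# The 16 test scores used in the worked examples for the median and the mean.
scores = [8, 10, 11, 12, 13, 14, 15, 16, 17, 17, 17, 17, 19, 19, 20, 20]

mean = statistics.mean(scores)      # total of all scores / number of students
median = statistics.median(scores)  # average of the two middle scores (even count)
mode = statistics.mode(scores)      # the most frequently occurring score

print("Total  =", sum(scores))
print("Mean   =", round(mean, 1))
print("Median =", median)
print("Mode   =", mode)
```

For these scores the three indicators differ, which already suggests that the distribution is not perfectly symmetrical.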
Skewness
If the test produces an asymmetrical pattern of scores then the distribution is said to be skewed. In this case, the mean, mode and median move apart. For example, in the pattern below, the mode is 7 but the mean is just 6,3.
[Chart: Distribution of Test Scores. Horizontal axis: Score (0 to 10); vertical axis: Number of students (0 to 50). Marked on the chart: mode = 7, mean = 6,3.]
Skewed distributions are useful when you want to make decisions in a particular part of the marking range. For example, to identify students with learning difficulties, the distribution below would be satisfactory. This test discriminates well between students of low ability.
[Chart: Distribution of Test Scores. Horizontal axis: Score (0 to 10); vertical axis: Number of students (0 to 12).]
If you want to grade over a wide range of ability, it is better if your distribution of scores is not highly skewed.
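To see how skewness pulls the mean away from the mode, here is a small sketch. The scores are hypothetical, invented only so that the mode is 7 and the mean is about 6,3 as in the example above; they are not the data behind the chart. The skewness function uses one common sample formula, and the choice of that formula is an assumption of this sketch.

```python
import statistics

def sample_skewness(scores):
    """Adjusted sample skewness: n / ((n-1)(n-2)) * sum(((x - mean) / sd) ** 3)."""
    n = len(scores)
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)  # sample SD (divides by n - 1)
    cubed = sum(((x - mean) / sd) ** 3 for x in scores)
    return n / ((n - 1) * (n - 2)) * cubed

# Hypothetical scores with a tail of low marks (illustration only).
scores = [2, 4, 5, 6, 6, 7, 7, 7, 7, 8, 8, 9]

mean = statistics.mean(scores)
mode = statistics.mode(scores)
skewness = sample_skewness(scores)

print("Mode =", mode)
print("Mean =", round(mean, 1))
print("Skewness is negative:", skewness < 0)
```

A negative value indicates a tail towards the low scores, which is the situation where the mean falls below the mode.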
Measures of Dispersion
We are interested in the spread of test scores because this can tell us about the spread of ability. We call the spread of test scores the dispersion. Here we look at three measures: the range, the inter-quartile range, and the standard deviation.

The Range
This is the simplest measure of dispersion. It tells us the number of marks from the lowest to the highest score achieved.

Range = Maximum Score Achieved - Minimum Score Achieved + 1

(The +1 is needed because, for example, in a test marked out of 10 we actually have 11 marks available: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10.)

Example: In a national examination for History, the maximum possible score was 100 marks. When the examination was taken, the top score achieved was 96. The lowest score achieved was 18. The range of scores was: 96 - 18 + 1 = 79

In theory, this test had 101 marks available to discriminate between all the students. In the event, only 79 marks were used. For tests and examinations used for grading students across a wide range of ability, we prefer our tests to use nearly all the mark range.
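The range calculation, including the +1, can be sketched as below. Only the highest and lowest History scores are given in the text, so the list contains just those two values; the function name is an assumption made for this illustration.

```python
def score_range(scores):
    """Number of marks from the lowest to the highest score achieved, inclusive."""
    return max(scores) - min(scores) + 1

# Only the extreme scores of the History examination are given in the text.
history_extremes = [18, 96]
marks_used = score_range(history_extremes)

# A test marked out of 100 has 101 available marks: 0, 1, ..., 100.
marks_available = score_range([0, 100])

print("Marks used:", marks_used)
print("Marks available:", marks_available)
```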
The Inter-Quartile Range
The range can be misleading because it uses only the two most extreme cases. To solve this problem we look at the number of marks covering the middle 50% of students. This is the inter-quartile range.

To find the inter-quartile range we find the score below which 25% of the students fall. This is the lower quartile. Then we find the score above which we find the top 25% of students. This is the upper quartile. Then we calculate the inter-quartile range:

inter-quartile range = score at upper quartile - score at lower quartile

For this distribution, the first quartile is at 40 marks and the third quartile is at 60 marks. The inter-quartile range is 60 - 40 = 20 marks. This tells us that, on this test, the middle 50% of the population is covered by just 20 marks.
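As a minimal sketch, the quartiles and the inter-quartile range can be computed with `statistics.quantiles`. The data are the 16 scores from the earlier worked example for the mean, reused here for illustration. Different packages interpolate quartiles in slightly different ways, so the result may differ from a spreadsheet by a fraction of a mark; the `"inclusive"` method chosen here is an assumption of this sketch.

```python
import statistics

# The 16 test scores from the worked example for the mean.
scores = [8, 10, 11, 12, 13, 14, 15, 16, 17, 17, 17, 17, 19, 19, 20, 20]

# n=4 splits the ranked scores into four equal groups and returns the
# three cut points: lower quartile, median, upper quartile.
lower_q, median, upper_q = statistics.quantiles(scores, n=4, method="inclusive")

inter_quartile_range = upper_q - lower_q

print("Lower quartile:", lower_q)
print("Upper quartile:", upper_q)
print("Inter-quartile range:", inter_quartile_range)
```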
Standard Deviation
The most powerful indicator of a test's dispersion is the standard deviation. This is a measure which takes into account every student's score. The formula is given below.
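\[
\text{Standard Deviation} = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \bar{x})^2}{N}}
\]

where $x_i$ = score of student $i$, $\bar{x}$ = mean score, and $N$ = number of students.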
You don't have to remember the formula, but you should have an idea of what the standard deviation (SD) means.

Firstly, the wider the spread of test scores, the larger the SD. The smaller the SD, the narrower the distribution of test scores.

Secondly, most national examinations produce SDs of about 16 marks in 100. If the SD is larger, the student scores are more spread out. If the SD is smaller, the scores are closer together.

To get an idea of what this means, look at the score distributions below. These have been drawn so that:
- Each one has exactly the same mean score (50/100).
- Each one has exactly the same number of students (2000).
- The only thing that differs is the SD.
[Chart: Score Distributions. Three curves labelled SD=10, SD=15 and SD=20. Horizontal axis: score (0 to 100); vertical axis: # of students (0 to 90).]
Notice how an examination with an SD of 20 marks out of 100 (20%) gives a broad spread of scores. The examination with an SD of 10 marks out of 100 (10%) leaves the students clustered in the middle. Look at how much the number of students on the mean mark drops as the SD increases. For grading purposes, we prefer examinations with large standard deviations.
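A short sketch of the standard deviation calculation follows, again using the 16 scores from the worked example for the mean. Two versions are in common use: one divides by the number of students N, the other by N - 1 (the spreadsheet STDEV function uses N - 1). Which one a given report uses is something to check; both are shown here so the difference can be seen.

```python
import math
import statistics

# The 16 test scores from the worked example for the mean.
scores = [8, 10, 11, 12, 13, 14, 15, 16, 17, 17, 17, 17, 19, 19, 20, 20]

n = len(scores)
mean = sum(scores) / n

# Add up the squared distance of every student's score from the mean.
sum_of_squares = sum((x - mean) ** 2 for x in scores)

sd_population = math.sqrt(sum_of_squares / n)    # divides by N
sd_sample = math.sqrt(sum_of_squares / (n - 1))  # divides by N - 1

# The standard library gives the same two values.
assert math.isclose(sd_population, statistics.pstdev(scores))
assert math.isclose(sd_sample, statistics.stdev(scores))

print("SD (divide by N):    ", round(sd_population, 2))
print("SD (divide by N - 1):", round(sd_sample, 2))
```

With large numbers of students, as in national examinations, the two versions give almost identical values.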
17, 22, 25, 19, 21, 21, 19, 25, 16, 9, 26, 14, 17, 19, 18, 19, 21, 15, 17, 19, 16, 12, 20, 16, 23
(a) (i)
    (ii) the median
    (iii)

(b) The teacher decides to give all students who score fewer than 10 marks extra help after school. What purpose has the teacher used the test for?

(c) The teacher notes that all the students got questions 7 and 8 wrong. She decides that she will have to teach the topic tested by these questions again, using a different approach. What purpose has the teacher used the test for?

2 Here is a distribution of test scores.
[Chart: Distribution of Scores. Horizontal axis: score (0 to 20); vertical axis: # of students (0 to 8).]
(a) Find:
    (i) the number of students taking the test
    (ii) the mode
    (iii)
    (iv) the range of scores.
(b) This distribution is skewed.
    (i) What does skewed mean?
    (ii) In this distribution, is the mean smaller than, bigger than, or the same size as the mode?
    (iii) Would this test be better for identifying students with learning difficulties or for awarding prizes to the best students? Explain your answer.

3 A Mathematics examination gave these results:
    Mode = 14%
    Mean = 43%
    Standard Deviation = 24%
  What does this data tell you about the distribution of test scores? (Consider typical student performance, symmetry and dispersion.)
Function             Description
SUM(array)           The sum (total) of a number of scores.
MODE(array)          The mode of a distribution.
AVERAGE(array)       The mean of a distribution.
MEDIAN(array)        The median of a distribution.
SKEW(array)          A numerical indicator of the degree of asymmetry.
MIN(array)           The minimum score achieved.
MAX(array)           The maximum score achieved.
                     The range is given by: MAX(array)-MIN(array)+1
QUARTILE(array,1)    25% of candidates fall at or below this score.
QUARTILE(array,3)    75% of candidates fall at or below this score.
                     The inter-quartile range is given by: QUARTILE(array,3)-QUARTILE(array,1)
STDEV(array)         The standard deviation of the distribution.
To draw a frequency distribution chart in Excel, you first have to create a table of scores and frequencies using the FREQUENCY(data_array, bins_array) function, as shown below.

Bin          10   20   30   40   50   60   70   80   90  100
Frequency     1   21   68  174  286  373  447  372  189   22
Once you have your frequency distribution you can use the chart wizard to create a bar chart.
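For readers who want to see what the binning step does, here is a rough Python sketch of the same idea. It assumes, as the spreadsheet function does, that each bin value is an upper limit: a score is counted in the first bin whose value is greater than or equal to it, and one extra count at the end collects any scores above the highest bin. The scores are the 25 class test scores from exercise 1; the bin limits 10, 15, 20, 25, 30 are hypothetical and chosen only for illustration.

```python
from bisect import bisect_left

def frequency(data, bins):
    """Count the values falling at or below each bin limit (bins sorted ascending).

    Returns len(bins) + 1 counts; the last one holds values above the top bin.
    """
    counts = [0] * (len(bins) + 1)
    for value in data:
        # Index of the first bin limit that is >= value.
        counts[bisect_left(bins, value)] += 1
    return counts

# The 25 class test scores listed in exercise 1.
scores = [17, 22, 25, 19, 21, 21, 19, 25, 16, 9, 26, 14, 17,
          19, 18, 19, 21, 15, 17, 19, 16, 12, 20, 16, 23]

bins = [10, 15, 20, 25, 30]  # hypothetical upper limits for each group of marks
counts = frequency(scores, bins)

for limit, count in zip(bins, counts):
    print(f"up to {limit:>2}: {count:>2}  {'#' * count}")
```

The resulting counts can then be plotted as a bar chart in the same way as the spreadsheet table.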