Stephen Bate P6749089 MS626 TMA 02

Q1-Task-1a
Tackling Task 7.1.2 required identifying outliers with reasoning. I sourced secondary
data from the internet as it was easily available real data (Appendix1 page1) and used
Excel to generate the visual summaries (Appendix1 page1, page2), as this enabled me to
compare how effectively the charts displayed the outliers and whether a measure of the
outlier could be seen.
Height, treated as univariate data, was one of the variables I looked at. I didn't want
multivariate data as I believed the analysis would be too complicated for me at this
stage. Looking at this variable gave me a context for the data that I could relate to, as
I am under average height.

Q1-Task-1b
The big idea recognised was measurement. Remembering that outliers affect the mean
more than the median, I observed this effect on the measures of location by altering the
distance of the outliers (Appendix1 page3). An improvement in my thinking is that I now
build on existing knowledge by varying my data to strengthen my understanding.
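The effect described above can be reproduced with a short sketch. The heights below are hypothetical values in metres, not the data from Appendix 1, and the function names are illustrative only. The sketch adds a single low outlier to a cluster of heights and measures how far the mean and the median each move.

```python
from statistics import mean, median

# Hypothetical heights in metres; NOT the data set from Appendix 1.
heights = [1.62, 1.65, 1.68, 1.70, 1.71, 1.73, 1.75, 1.78]


def location_summary(values):
    """Return (mean, median) for a list of numbers."""
    return mean(values), median(values)


def shift_from_outlier(values, outlier):
    """Return how far the mean and the median move when one outlier is added."""
    mean_before, median_before = location_summary(values)
    mean_after, median_after = location_summary(values + [outlier])
    return abs(mean_after - mean_before), abs(median_after - median_before)


# Move the outlier further from the cluster and compare the two shifts.
for outlier in (1.4, 1.1):
    mean_shift, median_shift = shift_from_outlier(heights, outlier)
    print(f"outlier={outlier}: mean moves {mean_shift:.3f}, median moves {median_shift:.3f}")
```

With these made-up values, moving the outlier further away keeps pulling the mean along, while the median moves by the same small amount each time because only the ordering of the middle values matters.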
Looking at the shapes of the graphs (Appendix1 page2), I observed a cluster of data with
outliers showing a different distribution. The histogram in particular showed a bell-like
shape, identifying it as a normal distribution, that is, if the outliers are removed. I was
unsure whether to remove them or alter the data to record 14 and 11 rather than 1.4 and 1.1,
because I believed they were data entry errors. Within the heights context I expected a
normal curve because in real life most people are around my height with a few taller or
smaller. Looking at my data, no one is really 1.4 cm tall. I was now looking at the data
to see if it made sense in the real world, which I wasn't doing before.
For me, there's a connection between the word outlier and the word outsider which made the
identification easier, as I view outliers as not being part of a group; it was an evocative
word conjuring a feeling of being outcast.

Q1-Task-1c
The task frustrated me. Making calculations alone from the data was straightforward,
e.g. using a measure to determine outliers from a box plot (Appendix1 page3).
Scrutinising data and asking my own questions, though, was challenging. Every question
I posed I had doubts about with regard to its worthiness: was it too simple, and where
was this leading? Inevitably, because I like finding concrete solutions, it would end in
making a calculation, such as researching and using the Grubbs equation (Appendix1
page4) to determine an outlier (Grubbs, 1950). I knew I was diverting, and this
study is statistics and not mathematics, so I would start again but be frustrated because it
seemed I was getting nowhere.
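The two kinds of calculation mentioned above can be sketched as follows. This is an illustration under stated assumptions, not the working from Appendix 1: the data are hypothetical, the box plot rule is assumed to be the common 1.5 × IQR convention, and only the Grubbs test statistic is computed (the comparison with a critical value, which a full Grubbs test requires, is left out).

```python
from statistics import mean, quantiles, stdev

# Hypothetical heights in metres, including two suspect low values.
heights = [1.1, 1.4, 1.62, 1.65, 1.68, 1.70, 1.71, 1.73, 1.75, 1.78]


def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR beyond the quartiles.

    k = 1.5 is an assumed convention; the source does not state which
    measure was used with the box plot.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    spread = q3 - q1
    lower, upper = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < lower or v > upper]


def grubbs_statistic(values):
    """Return (G, suspect), where G is the largest absolute deviation from
    the mean in units of the sample standard deviation, and suspect is the
    value that produces it."""
    centre, spread = mean(values), stdev(values)
    suspect = max(values, key=lambda v: abs(v - centre))
    return abs(suspect - centre) / spread, suspect


print("Beyond the 1.5 x IQR fences:", iqr_outliers(heights))
g, suspect = grubbs_statistic(heights)
print(f"Grubbs statistic G = {g:.2f} for the value {suspect}")
```

Neither calculation says what to do with a flagged value; that decision still depends on the context of the data.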
Concentrating on context made me feel happier with the task as I realised I had to
consider real data where outliers were important (Appendix1 page4). I thought this
would be simple but it took time to contrive scenarios. One such would be the use of
MRS scanners to pick up abnormalities and identify illness. The few I had managed to
think about were trounced by the vast number of cases I later found online, which made
me doubt my critical thinking capability. From this I could see that outliers should be
considered case by case in order to decide whether to leave them in or remove them.

Q1-Task-2a
Expanding Task 6.2.3, I used two data sets in order to draw and compare two box plots.
Secondary real data recording student foot length was used as it was easy to obtain from
a data web site (Appendix2 page1). I used Excel for two reasons: it generated the
summaries for me, which enabled me to compare my calculations against its algorithm,
and it was a statistical tool which helped me to draw the box plots (Appendix2
page2) using the formula menu for accuracy, providing clear visual representations.
I chose continuous data as box plots display the summary values better than other
charts.

Q1-Task-2b
The big idea was visual representation. I cleaned the dirty data as there were clear
outliers that needed to be removed. I was able to recognise these entries as they had a
foot length of 0, which did not make sense (Appendix2 page1). This could be shown on
the diagram as a dotted line but, for me, that was confusing and took my eye away from
what really needed to be analysed.
One strength that I am developing is to look for new learning. The size of the boxes was
of interest. I was unsure whether I had expanded the frame by accident or whether the box
had increased to show that I had a bigger sample (Appendix2 page2). This was one
advantage of displaying the visual in Excel, as I could go back, reproduce the image
and investigate the cause (Appendix2 page3). I learnt that the box size could be
enlarged to show a greater population, but I had done this by chance. By using a
different medium from my normal approach (ICT, not just hand drawing), I was able to ask a
question that otherwise would not have cropped up. It was unexpected and only came
about because of the visual produced, which led to further learning and understanding.

Q1-Task-2c
I began the task by recalling what I knew about box plots and carried out examples for
confirmation (Appendix2 page2). I intended to build on that knowledge either by
gaining new insight from examining the visual or by researching the model for fresh
ideas. Garfield and Ben-Zvi (2008) say, "New knowledge and understanding are based
on existing knowledge", which is the basis of constructivism, and this was carried out when
comparing the size of the box plots (Appendix2 page2).
I found that by taking the time to focus on this one aspect, I was able to hypothesise that
the area of the box could represent the difference in the amount of data looked at when
comparing multiple data sets, giving an impression of sample size. I came to this
conclusion by considering pie charts, whose areas I knew could be enlarged when
comparing proportions, although this is not very often done (Appendix2 page3).
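The hypothesis above corresponds to what are usually called variable-width box plots. A minimal sketch follows, assuming the convention of making each box's width proportional to the square root of its sample size; the sample sizes, the `max_width` parameter and the function name are hypothetical and are not taken from Appendix 2.

```python
from math import sqrt


def relative_box_widths(sample_sizes, max_width=0.8):
    """Scale box widths so the largest sample gets max_width and the others
    shrink in proportion to the square root of their sample size."""
    largest = max(sample_sizes)
    return [max_width * sqrt(n / largest) for n in sample_sizes]


# Two hypothetical data sets of 25 and 100 foot-length measurements.
widths = relative_box_widths([25, 100])
print(widths)
```

Values like these could be passed to a plotting tool that accepts per-box widths; in matplotlib, for example, `boxplot` takes a `widths` argument. Under this convention a sample four times larger gets a box twice as wide.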
On reflection, this method of investigation was new to me. Usually I take a complicated
topic (because I think this is intellectually challenging) which I know little about and
research how to understand it by following the steps others have taken. I now believe I
would be better served by building upon my statistical knowledge in smaller steps of
learning, because with each step there are numerous questions that I can
ask.

Word count 1092

Q2-Task1-a
I had a female student aged 15 who possessed above average mathematical ability,
going by her recorded exam data.
Taken from an educational software supplier (MathsWatch), a visual representation
(Appendix2 page4) enabled the student to use statistical reasoning to deepen her
understanding of a histogram. This type of chart requires an understanding of the
concept of continuous data, as used for the variable age, and that the count for
frequency is discrete data. She needed to be able to redraw the histogram correctly
after pointing out what was incorrect and, from that, to make summaries which
she could use to analyse and interpret the data in context.

Q2-Task2-a
The frequency table supplied to the students deliberately contained very little
information, so that I could see what they would do with the data; the intention was that
they would draw upon their own knowledge of how to proceed. The data was made up by me for
convenience, as was the task (Appendix2 page4).
Discussion and use of visual representation were intended to bring about statistical
thinking. The students were expected to plan what needed to be done with the data, how
to present that data by drawing an appropriate chart and how a visual summary could be
attained, and then to give a reasoned explanation for each choice taken. I made charts
from the data ready to be discussed (Appendix2 page5) so they could be displayed on
the wall.

Q2-b
The diagram made for discussion is from Task 2a (Appendix2 page5).
The end of each bar represented position, but very poorly. Each bar represents a
number which could be read off the scale on the horizontal axis. Tufte, with his liking for
erasing redundant data-ink and showing minimal features, would probably want a single line
rather than a solid bar, or even a dot plot, so that only relevant information is observed. It is
only by being used to reading from the end of bars that we know the position.
The chart shows the data fairly in the sense of being accurate, but it could be improved,
as the lengths are all in the bottom third of the frame, giving a false impression of being
close together, i.e. T/Jerry appears almost the same length as H/Kong Fuey. The labelling
for the frequency (numbers) should be removed as it can be read from the scale, which
in turn should be reduced to show a scale up to 20 instead of 60. The bars do encourage the
eye to compare lengths, but the eye is taken away by the thick line framing the graph, and
Tufte would remove that. For a broader overview, Tufte would use fewer axis numbers, perhaps
going up in steps of 4, so that any reading in between could be interpolated. See corrected
chart (Appendix2 page5).
The task would be improved if the students developed their own context for the graph
(Appendix2 page6), allowing them to think of it as a real-world situation where they
could consider how the data was generated, which would engage them more than just
asking them to draw a graph. Cobb (1997) states: "statistics requires a different type of
thinking, because data are not just numbers, they are numbers with a context."
Developing the task by adding more data requires the student to think more deeply and ask
themselves more questions about the data and how best to display that information
(Appendix2 page6).

Q2-c
The adapted task (Appendix2 page6) enabled the student to provide her own context for
the data and become fully engaged. I was surprised when she said the data related to
lengths. Explaining, she said she thought the cartoons were from an old television programme
called Wacky Races and that the data represented the miles each cartoon character
travelled during the race; she was telling a story, making sense of the data. I was
surprised by the student's imagination in creating a context, which went on to generate
statistical conversations that stimulated both learner and teacher. The use of context
had drawn the student into the task as I had hoped, but I did not think she would be as
energised as she was right from the start. Drawing the charts to represent her data
(Appendix2 page 10) was done in very little time and not as accurately as is traditionally
taught in school (lack of titles, scales etc.). This did not matter as much as encouraging
the discussions and reasoning that I wanted the student to become involved in to develop
statistical understanding.
The student thought a bar chart was suitable as the lengths of the bars would represent the
distance covered, and she picked out summaries such as the furthest and shortest distance
travelled, with the two graphs representing two races. "The bars are like lanes in the 100
metres," she said. This allowed us to discuss continuous data and how it could be shown
on a graph. I asked the student to make notes before we discussed some of the points, as I
believe writing and discussing are separate skills that would help her to clarify her
thinking (Appendix2 pages11, 12). She also said she had thought of using pie charts but
that they were just for showing percentages, which enabled me to show her a prepared pie
chart (Appendix2 page9) to discuss this misconception.
Asked for another context for the data, she suggested that the data represented votes
taken in a television studio room, where executives wanted to transmit two of the five old
television cartoons again and had shown pilot episodes to choose from. Adding
a different context kept her engaged. Showing another bar chart enabled further
discussions to take place (Appendix2 page12). A misconception arose when she
thought the same number of people chose Pink Panther (3), until we discussed the
sample size. She then realised that percentages could be of some value and that
comparing lengths of bars could be misleading. At this point I was able to show
examples of pie and bar charts (Appendix2 page12). We were able to discuss comparing
different samples and whether percentages were more usefully displayed on pie charts or
could also be used on bar charts. I was also able to bring Tufte into the conversation,
showing her how I thought he would show the visual (Appendix2 page8), which she
totally disagreed with. She wanted as much information on the visuals as possible, with
small intervals displayed on the scale axis and numbers shown at the end of each bar.
For her, this made identifying the values easy because there were more ways to see
them, whereas Tufte wanted just one simple way to see them. Not everyone agrees with
the Tufte principles, so she had a point, but her opinion is possibly derived from years
of being taught to draw charts in a certain way at school to pass exams.
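The earlier point about sample size can be illustrated with a small sketch. The vote counts below are hypothetical; only the Pink Panther count of 3 comes from the discussion, and the other figures and the function name are assumptions made for illustration. Converting counts to percentages shows why equal bar lengths from samples of different sizes can mislead.

```python
def to_percentages(counts):
    """Convert a dictionary of vote counts into percentages of the total."""
    total = sum(counts.values())
    return {name: 100 * votes / total for name, votes in counts.items()}


# Hypothetical vote counts for two samples of different sizes.
small_sample = {"Pink Panther": 3, "T/Jerry": 5, "H/Kong Fuey": 2}
large_sample = {"Pink Panther": 3, "T/Jerry": 12, "H/Kong Fuey": 15}

small_share = to_percentages(small_sample)["Pink Panther"]
large_share = to_percentages(large_sample)["Pink Panther"]
print(f"Pink Panther share: {small_share:.0f}% of 10 votes, {large_share:.0f}% of 30 votes")
```

The two Pink Panther bars would be the same length on a chart of counts, yet in this made-up example they represent quite different shares of each sample.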

Word count 1122
