
Bivariate Data Project: Time VS Length

In this experiment, thumb length and the time it takes to text the quote “Why does it rain cats and dogs, how do we get stumped, and why do we want to cut to the chase?” were measured to see whether or not there is an association between the two factors. To determine this, a ruler was used to measure the length in centimeters from the base of the thumb to the tip on the dominant texting finger. After the length of the thumb was recorded, the individual being tested was given a phone, the same one used throughout the entire testing of samples, and was asked to read and then text the phrase provided. The same timer was used each time, and the data was recorded. The process was repeated for each person, making a total of 30 subjects tested.

Figure 1: Box Plot (length)

Figure 2: Histogram (length)

In Figure 1, the boxplot shows relatively symmetrical intervals between Q1 and the median and between the median and Q3. The median length is 6.3cm, which is also the approximate mean. The IQR, calculated by subtracting Q1 from Q3, is 0.8cm. Together with the symmetric quartiles, this suggests that the data is approximately normal. The spread of the data, assuming no outliers were present, starts at 5.2cm and ends at 7.2cm, making 5.2cm the minimum and 7.2cm the maximum. According to Figure 2, the histogram, which was used to describe the shape, the data is unimodal and approximately symmetrical. The thumb-length data shows that the majority of the subjects tested had a thumb of about average length, and the number of people gradually decreased as the length moved further below or above the average.
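The "no outliers" assumption above can be checked with the usual 1.5 × IQR rule. A minimal sketch follows. The report gives only the median (6.3cm) and the IQR (0.8cm), so the quartiles Q1 = 5.9cm and Q3 = 6.7cm used below are an assumption inferred from the roughly symmetric box, and the function name is illustrative.

```python
def outlier_fences(q1, q3, k=1.5):
    """Return the (lower, upper) fences of the k * IQR outlier rule."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Assumed quartiles (not reported): median 6.3 cm, symmetric box, IQR 0.8 cm.
Q1, Q3 = 5.9, 6.7
# Reported extremes of the thumb-length data.
MINIMUM, MAXIMUM = 5.2, 7.2

lower, upper = outlier_fences(Q1, Q3)
no_outliers = lower <= MINIMUM and MAXIMUM <= upper
print(f"fences: {lower:.1f} cm to {upper:.1f} cm, no outliers: {no_outliers}")
```

Under these assumed quartiles the fences fall at about 4.7cm and 7.9cm, so both the reported minimum and maximum lie inside them.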

Figure 3: Box Plot (time)

Figure 4: Length VS Time Histogram (time)

According to the boxplot shown in Figure 3, the median appears to be around 120s (120 seconds) with an IQR of 80s, calculated as before by subtracting Q1 from Q3. Unlike Figure 1, Figure 3 did not seem to be symmetrical: the interval between Q1 and the median appeared to be smaller than the interval between the median and Q3, which indicates that the data is skewed. When viewing Figure 4, which is the histogram of the data in Figure 3, the graph appears to be bimodal, or perhaps to have more modes since the data fluctuates constantly, so an actual shape may be difficult to determine. The spread starts at 66.7s and ends around 208s. Overall, the data appeared to be skewed to the right with a gap at 200s, which is considered to be an unusual feature.

Figure: Length VS Time Scatter Plot (time vs. length), with fitted line time = 14.7length + 35; r² = 0.039

The scatterplot, as displayed above, shows a weak positive linear association between the length of the thumb and the time it takes to text. The correlation coefficient for this particular set of data is 0.197, suggesting little or no correlation between the time it takes to text and the length of the thumb. Also, about 3.9% of the variation in time can be explained by the linear model for length and time. The model is expressed by the equation: Time = 35 + 14.7(length), which means that for every one-centimeter increase in length, the predicted time increases by 14.7s. The standard deviation of the residuals is 39.0113s. There also appears to be a pattern in the residual plot, therefore this would not make an appropriate linear model. Overall, a linear model is not appropriate for this data unless re-expressed.
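As a quick check of the figures above, the sketch below squares the reported correlation coefficient to recover the share of variation explained, and uses the reported model to predict a texting time. The function name and the 6.3cm input (the median thumb length reported earlier) are chosen only for illustration.

```python
R = 0.197          # reported correlation coefficient
INTERCEPT = 35.0   # reported intercept, in seconds
SLOPE = 14.7       # reported slope, in seconds per centimeter

def predict_time(length_cm):
    """Predicted texting time in seconds from the reported linear model."""
    return INTERCEPT + SLOPE * length_cm

# Share of the variation in time explained by the linear model.
r_squared = R ** 2
print(f"r^2 = {r_squared:.3f}")
print(f"predicted time at 6.3 cm: {predict_time(6.3):.1f} s")
```

Squaring 0.197 gives about 0.039, which matches the 3.9% quoted in the text.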

Re-Expression: Alligator Data

Source: This data set was used after discovering that the original set of data could not be re-expressed properly.

Collection 1

      length (in)   weight (lb)   lnweight
 1         58            28       3.3322
 2         61            44       3.78419
 3         63            33       3.49651
 4         68            39       3.66356
 5         69            36       3.58352
 6         72            38       3.63759
 7         72            61       4.11087
 8         74            54       3.98898
 9         74            51       3.93183
10         76            42       3.73767
11         78            57       4.04305
12         82            80       4.38203
13         85            84       4.43082
14         86            83       4.41884
15         86            80       4.38203
16         86            90       4.49981
17         88            70       4.2485
18         89            84       4.43082
19         90           106       4.66344
20         90           102       4.62497
21         94           110       4.70048
22         94           130       4.86753
23        114           197       5.2832
24        128           366       5.90263
25        147           640       6.46147

Figure 7: Scatter Plot (weight vs. length), with fitted line weight = 5.90length - 393; r² = 0.84

Figure 8: Scatter Plot (lnweight vs. length), with fitted line lnweight = 0.0354length + 1.34; r² = 0.96

The scatterplot, as expressed in Figure 7, shows a strong, positive, somewhat linear association with a correlation coefficient of 0.917, indicating a strong correlation between the weight of the alligators and the length. About 84% of the variation in weight can be explained by the linear model for length and weight. Despite the strong correlation, the data cannot be appropriately expressed by a linear model due to the results of the residual plot, which shows a somewhat curved pattern. Because of this pattern, the data was re-expressed using ln(weight), the natural log of the y-values, as seen in Figure 8.
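The two fits described above can be reproduced with an ordinary least-squares sketch. The lists below restate the 25 length and weight values of the alligator collection. The helper name `least_squares` is illustrative, and the fit is written out by hand rather than taken from whatever software produced the original figures.

```python
import math

# Length and weight values from the alligator collection (25 rows).
LENGTH = [58, 61, 63, 68, 69, 72, 72, 74, 74, 76, 78, 82, 85,
          86, 86, 86, 88, 89, 90, 90, 94, 94, 114, 128, 147]
WEIGHT = [28, 44, 33, 39, 36, 38, 61, 54, 51, 42, 57, 80, 84,
          83, 80, 90, 70, 84, 106, 102, 110, 130, 197, 366, 640]

def least_squares(xs, ys):
    """Return (slope, intercept, r_squared) of the line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared

# Fit the raw weights first, then the re-expressed values ln(weight).
raw_fit = least_squares(LENGTH, WEIGHT)
log_fit = least_squares(LENGTH, [math.log(w) for w in WEIGHT])

for label, (slope, intercept, r2) in [("weight", raw_fit), ("ln(weight)", log_fit)]:
    print(f"{label} = {slope:.4f} * length + {intercept:.2f}; r^2 = {r2:.2f}")
```

The printed coefficients should land close to the fitted lines quoted for Figures 7 and 8, and the larger r² of the second fit reflects the improvement from re-expression.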

After the data was re-expressed, the correlation coefficient changed to 0.978, indicating an even stronger correlation, and the residual plot became more randomly scattered, which relates to Figure 8. The new equation is: ln(weight) = 0.0354(length) + 1.34, which translates to: for every one-inch increase in length, the predicted ln(weight) increases by 0.0354, so the predicted weight grows by about 3.6%. Based on the re-expressed data, a linear model would be most ideal. If the alligator was 180 inches long, the natural log of weight would be 7.712, and the predicted weight would be 2235.01 pounds.
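The prediction above can be worked through step by step: evaluate the re-expressed model, then undo the natural log with the exponential. The sketch below uses only the reported coefficients, and the function names are illustrative.

```python
import math

SLOPE = 0.0354     # reported slope: change in ln(weight) per inch of length
INTERCEPT = 1.34   # reported intercept of the re-expressed model

def predict_ln_weight(length_in):
    """Predicted ln(weight) from the re-expressed linear model."""
    return SLOPE * length_in + INTERCEPT

def predict_weight(length_in):
    """Back-transform the prediction to pounds by undoing the natural log."""
    return math.exp(predict_ln_weight(length_in))

ln_w = predict_ln_weight(180)
weight = predict_weight(180)
# Each extra inch multiplies the predicted weight by exp(SLOPE).
growth_factor = math.exp(SLOPE)

print(f"ln(weight) = {ln_w:.3f}, weight = {weight:.2f} lb, "
      f"factor per inch = {growth_factor:.4f}")
```

Because 180 inches lies beyond the longest alligator in the data (147 inches), this prediction is an extrapolation and should be treated with caution.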