TUTORING SESSION 3
Exercise 1
The following table contains information about the horsepower (HP) and fuel consumption
(cons) of a sample of 11 cars:
SOL
1. Use Q1 = 6.3, Q3 = 8.1, Median = 6.5, min = 5.9, max = 9.1 to draw the boxplot
2. $s_{HP}^2 = 681.8182$, $s_{cons}^2 = 1.3089$, $CV_{HP} = 0.166$ and $CV_{cons} = 0.1632$. Therefore HP is
slightly more variable than consumption.
3. Draw the scatterplot using the observed pairs (HP, cons). The sample correlation coefficient is
r = 0.8043.
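The comparison in point 2 rests on the coefficient of variation, CV = s / mean. The raw data table is not reproduced here, so the following is only a minimal sketch that works from the summary values reported above: it recovers the standard deviations from the variances, backs out the sample means implied by the reported CVs, and repeats the comparison. The function and variable names are illustrative assumptions, not part of the original exercise.

```python
import math

# Values reported in the solution: sample variances and coefficients of variation.
var_hp, cv_hp = 681.8182, 0.166
var_cons, cv_cons = 1.3089, 0.1632


def implied_mean(variance, cv):
    """Since CV = s / mean, the mean implied by s and CV is s / CV."""
    return math.sqrt(variance) / cv


# Means implied by the reported figures (not read from the missing data table).
mean_hp = implied_mean(var_hp, cv_hp)
mean_cons = implied_mean(var_cons, cv_cons)

# The CV is unit-free, so HP and consumption can be compared directly.
more_variable = "HP" if cv_hp > cv_cons else "consumption"

print(f"implied mean HP   = {mean_hp:.1f}")
print(f"implied mean cons = {mean_cons:.2f}")
print(f"more variable     = {more_variable}")
```

The variances alone cannot be compared because HP and consumption are measured in different units; dividing each standard deviation by its mean removes the units.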
Exercise 2
A market research study examined the brand preferences of male and female customers
who bought a laptop in the last year. Data were collected on a sample of 180 customers across
different European countries, and are given in the following two-way table:
1. What percentage of the men in the sample bought Sony? What percentage of the sample
is female and bought Apple?
2. Compute the mode of Brand among women, and the mode of Brand in the sample.
3. Using an appropriate graph, discuss whether the variables Brand and Gender are dependent.
SOL
1. The percentage of men who bought Sony is 23.8%, while 22.23% of the sample is female and
bought Apple.
2. The mode of Brand among women is Apple, as it is for the whole sample.
3. As shown in the graph below, the two variables are dependent. For example, while there is no
predominant brand among men, the preferences of women lean strongly towards Apple.
Exercise 3
Twelve French families were asked questions about their TV subscription in 2011 and data
are reported in the following table:
1. Compute the mean of the expenses for families with a basic subscription, and the mean
for families with a premium subscription.
2. Discuss the association between number of family members and type of subscription
through an appropriate graph.
SOL
1. mean(basic) = 35.2
mean(premium) = 59.4
2. To discuss the association, build the stacked (component) bar chart using the following
conditional frequencies of subscription type (B, M, P) given the number of family members:
Family members / Subscription    B              M              P
1                                3/5 = 0.6      1/5 = 0.2      1/5 = 0.2
2-4                              1/4 = 0.25     1/4 = 0.25     2/4 = 0.5
>4                               1/3 = 0.3333   0              2/3 = 0.6667
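The conditional frequencies above are row proportions: each count is divided by the total of its family-size group. A minimal sketch of that computation follows, assuming the counts are the numerators shown in the fractions above (they add up to the twelve families in the exercise); the dictionary layout and function name are illustrative.

```python
# Counts of families by family size (rows) and subscription type (columns),
# taken from the numerators of the fractions in the solution table.
counts = {
    "1":   {"B": 3, "M": 1, "P": 1},
    "2-4": {"B": 1, "M": 1, "P": 2},
    ">4":  {"B": 1, "M": 0, "P": 2},
}


def row_conditional_frequencies(table):
    """Divide every count by its row total, giving the distribution of
    subscription type within each family-size group."""
    result = {}
    for row_label, row in table.items():
        total = sum(row.values())
        result[row_label] = {col: n / total for col, n in row.items()}
    return result


cond = row_conditional_frequencies(counts)
for label, row in cond.items():
    print(label, {col: round(freq, 4) for col, freq in row.items()})
```

Each row of `cond` sums to 1, which is what makes the segments of a stacked bar chart comparable across groups of different sizes.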
Exercise 4
In a survey of 200 married couples, information on the number of children (X) and the yearly
income of the couple in thousands of Euros (Y) was collected. The resulting data are
summarized in the following two-way table:
Y/X 0 1 2
[0,30) 10 50 60
[30,60) 4 20 36
[60,90) 6 8 6
1. Determine the frequency distribution of the variable “Number of children” and provide
an appropriate graphical representation.
2. Compute the conditional means of the variable “Number of children” within the subpopulations
defined by the different classes of the variable “Yearly income”.
3. Calculate the mean and variance for the two variables X and Y. Compare the variability
of the two variables by using an appropriate index.
4. BONUS: In the same survey, the variable “Yearly expense for goods” (Z) in thousands of
Euros was also collected. If you are told that
200
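Questions 1 to 3 can be worked directly from the two-way table. The following is a minimal sketch, not the official solution. It assumes that each income class is represented by its midpoint (15, 45, 75), and that variances use the n − 1 divisor, as in the sample-variance formula used elsewhere in this session. All variable and function names are illustrative.

```python
import math

children = [0, 1, 2]       # values of X (columns of the table)
midpoints = [15, 45, 75]   # assumption: class midpoints of [0,30), [30,60), [60,90)
counts = [                 # rows = income classes, columns = number of children
    [10, 50, 60],
    [4, 20, 36],
    [6, 8, 6],
]

n = sum(sum(row) for row in counts)

# 1. Marginal distribution of X: column totals and relative frequencies.
x_freq = [sum(row[j] for row in counts) for j in range(len(children))]
x_rel = [f / n for f in x_freq]

# 2. Conditional mean of X within each income class (one value per row).
cond_mean_x = [
    sum(x * f for x, f in zip(children, row)) / sum(row) for row in counts
]

# Marginal distribution of Y: row totals.
y_freq = [sum(row) for row in counts]


def weighted_mean(values, freqs):
    """Mean of a frequency distribution."""
    return sum(v * f for v, f in zip(values, freqs)) / sum(freqs)


def sample_variance(values, freqs):
    """Variance of a frequency distribution with the n - 1 divisor."""
    m = weighted_mean(values, freqs)
    ss = sum(f * (v - m) ** 2 for v, f in zip(values, freqs))
    return ss / (sum(freqs) - 1)


# 3. Mean, variance and coefficient of variation of X and Y.
mean_x, var_x = weighted_mean(children, x_freq), sample_variance(children, x_freq)
mean_y, var_y = weighted_mean(midpoints, y_freq), sample_variance(midpoints, y_freq)
cv_x = math.sqrt(var_x) / mean_x
cv_y = math.sqrt(var_y) / mean_y

print("X frequencies:", x_freq, "relative:", x_rel)
print("conditional means of X:", [round(m, 4) for m in cond_mean_x])
print(f"X: mean={mean_x:.2f} var={var_x:.4f} CV={cv_x:.4f}")
print(f"Y: mean={mean_y:.2f} var={var_y:.4f} CV={cv_y:.4f}")
```

Because X is a count and Y is in thousands of Euros, the coefficient of variation is the natural index for question 3. Under the midpoint assumption the CV of Y comes out larger than that of X.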
SOL
The frequencies are the number of customers for each cost category. The computations for the
mean and the standard deviation are set out in the following table
The sample mean is estimated by
$$\bar{x} = \frac{\sum_{i=1}^{k} f_i m_i}{n} = \frac{112}{20} = 5.6.$$
Since we are working with sample data, the sample variance is
$$s^2 = \frac{\sum_{i=1}^{k} f_i (m_i - \bar{x})^2}{n-1} = \frac{120.8}{19} = 6.3579.$$
Thus $s = \sqrt{6.3579} = 2.52$.
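The arithmetic in this solution can be checked with a few lines of Python. The frequency table itself is not reproduced here, so this sketch starts from the totals quoted in the solution (sum of f·m = 112, sum of f·(m − mean)² = 120.8, n = 20). The function name and its parameters are illustrative assumptions.

```python
import math


def grouped_summary(sum_fm, sum_sq_dev, n):
    """Return (mean, sample variance, sample standard deviation) of grouped
    data, given the frequency-weighted sum of class midpoints, the
    frequency-weighted sum of squared deviations from the mean, and n."""
    mean = sum_fm / n
    variance = sum_sq_dev / (n - 1)   # n - 1 divisor because the data are a sample
    return mean, variance, math.sqrt(variance)


# Totals quoted in the solution above.
mean, variance, sd = grouped_summary(sum_fm=112, sum_sq_dev=120.8, n=20)
print(round(mean, 1), round(variance, 4), round(sd, 2))
```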