The objectives of this assignment are to help you understand the statistical analysis process,
analyse data using software, and develop your integrity in reporting your assignment. The best
way to understand statistics is to involve yourself in the whole statistical process rather than
limiting yourself to studying statistics from books, videos, or websites. This assignment requires
you to follow the steps of the statistical problem-solving methodology by conducting your own study.
PART 1 involves Step 1 to Step 4 of the methodology, while PART 2 involves Step 5 to Step 6.
You will gain experience in collecting, organising, summarising, analysing, presenting, and
interpreting data, drawing conclusions from it, and preparing a report of your study.
INSTRUCTIONS:
1. Set up a group that consists of four (4) or five (5) members from your section only and name
your group using any statistical term.
2. Obtain an APPROVAL of your chosen topic from your lecturer BEFORE you start
collecting data and begin your statistical analysis.
3. Use the template on page 5 and page 6 as the cover for your assignment booklet of Part 1
and Part 2, respectively. Fill in all the required particulars clearly.
4. Answer ALL questions in PART 1 and PART 2, and use appropriate statistical notations.
LECTURER
DR. NORYANTI BINTI MUHAMMAD
PART 1
FOR EXAMINER USE ONLY
Question Marks Your Marks Question Marks Your Marks
1 2 8 4
2 2 9 12
3 1 10 12
4 7 11 2
5 6 12 2
6 5 13 1
7 4
TOTAL 60
PART 1 (60 Marks: 7%)
1. Identify a problem that you are interested in studying. Provide a brief description of
your study.
The problem that we are interested in studying is how many hours UMP
students spend on social media (Instagram, Twitter, Facebook and Snapchat) in a day.
We chose this topic because we want to know how the time that UMP students
spend on social media in a day differs between male and female students.
2. Choose a single quantitative variable that describes your chosen problem. Identify
the level of measurement for the variable.
We chose a single variable, the number of hours spent on social media in a day. It is
at the ratio level of measurement, since it possesses all the characteristics of interval
measurement and has a true zero.
3. State your population.
4. Divide the data collected into two significant groups that are related to the study (e.g.
gender, faculty, year of study, etc.).
i. State the name of the groups.
Gender
[Bar chart: time spent on social media per day (1 hour, 2 hours, 3 hours, 30 minutes, 4 hours) for Female and Male respondents]
The table of the overall data for how much time UMP students spend on social media in a day
iii. Identify the method of data collection being used. Provide the significant
evidence.
iv. Identify which sampling method you used to collect the data. Explain the
sampling method process.
5. For each group, select two sets of data of different sizes (n<30, n>30). Therefore, you
should have four sets of data in total.
(i) Present the data selected as shown in the following table.
Sample size Group 1 Group 2
n<30 26 62
n>30 24 78
(ii) Identify which sampling method you used to select the four sets of data. Explain
the sampling method process.
The sampling method that we used for this questionnaire is voluntary sampling.
The questionnaire, which we created in Google Forms, was shared with all UMP
students in Pekan through WhatsApp Messenger.
6. For each set of data, obtain the descriptive statistics using Microsoft Excel. Then,
summarise the measures of central tendency and measures of variation in the
following table.
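As an illustration only, the same measures can also be computed outside Excel. The sketch below uses Python's standard `statistics` module on a small hypothetical sample of daily hours; the values in `hours` are assumptions for demonstration and are not the survey data.

```python
import statistics

# Hypothetical hours per day on social media (NOT the survey data).
hours = [0.5, 1, 1, 2, 2, 2, 3, 4]

summary = {
    # Measures of central tendency
    "mean": statistics.mean(hours),
    "median": statistics.median(hours),
    "mode": statistics.mode(hours),
    # Measures of variation
    "range": max(hours) - min(hours),
    "variance": statistics.variance(hours),  # sample variance (n - 1 denominator)
    "std_dev": statistics.stdev(hours),      # sample standard deviation
}

for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

Both `statistics.variance` and `statistics.stdev` use the sample (n - 1) form, which is the appropriate choice when each data set is a sample rather than the whole population.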
8. Do different sample sizes affect the conclusion of the study by comparing its
measures of central tendency and measures of variation? Justify your
answer.
From the results observed above, we can see that samples of different sizes
can produce different values for the measures of central tendency and the
measures of variation. For the time male students spend on social media in a day,
the sample with n<30 gives a variance of 0.07783 and the sample with n>30 gives
0.08797. For female students, the sample with n<30 gives a variance of 0.17598 and
the sample with n>30 gives 0.1196. The median is the better indicator of the most
typical value when a set of values has an outlier. However, when the sample size
is large and does not include outliers, the mean can provide a more
accurate measure of central tendency.
9. Construct histograms for the four sets of data (be sure to label it properly!).
Identify the shape of distribution for each histogram and give your comments
based on the data distribution.
10. Construct boxplots for all data sets on the same x-axis. Identify the shape of the
distributions. Compare and comment on the average and variability of the boxplots.
Item            Male (n<30)   Male (n>30)   Female (n<30)   Female (n>30)
Minimum         0.5           2             0.5             2
Quartile 1      0.75          2             1               3
Median          1             3             2               3
Quartile 3      1             3             2               3
Maximum         4             4             3               4
IQR             0.25          1             1               0
Q1-1.5IQR       0.375         0.5           0               3
Q3+1.5IQR       1.375         4.5           4.5             4
Outlier         2             0             2               0
Left Whisker    0.25          0             0.5             1
Right Whisker   3             1             1               1
Boxplot
11. What is the best measure of central tendency to describe your data? Give a reason.
The median is the best measure of central tendency for our data because it describes the
middle of skewed data without being pulled towards extreme values.
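A minimal sketch of why the median suits skewed data: adding one extreme value moves the mean noticeably but leaves the median where it was. The values below are hypothetical and are not taken from the survey.

```python
import statistics

# Hypothetical hours per day (NOT the survey data).
typical = [1, 1, 2, 2, 3]
with_outlier = typical + [12]  # one unusually heavy user added

# How far each measure moves when the extreme value is included.
mean_shift = statistics.mean(with_outlier) - statistics.mean(typical)
median_shift = statistics.median(with_outlier) - statistics.median(typical)

print(f"mean shift:   {mean_shift:.2f}")
print(f"median shift: {median_shift:.2f}")
```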
12. What is the best variability measure to describe your data? Give a reason.
The best variability measure is the IQR. It gives an initial estimate of the outliers:
any value more than one and a half times the IQR below the first quartile or above the
third quartile, that is, below Q1 - 1.5(IQR) or above Q3 + 1.5(IQR), is flagged as a
potential outlier. The IQR is often preferred over the range because it excludes most outliers.
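The 1.5(IQR) rule can be sketched in a few lines of Python. The function names `tukey_fences` and `flag_outliers` are hypothetical helpers introduced for illustration. The quartiles passed in are the Male (n<30) values from the boxplot table in question 10 (Q1 = 0.75, Q3 = 1), while the list of observations is a made-up example, not the survey data.

```python
def tukey_fences(q1, q3, k=1.5):
    """Return the (lower, upper) limits Q1 - k*IQR and Q3 + k*IQR."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def flag_outliers(values, q1, q3):
    """Return the values that fall outside the 1.5(IQR) limits."""
    lower, upper = tukey_fences(q1, q3)
    return [v for v in values if v < lower or v > upper]


# Quartiles for Male (n<30) as reported in the boxplot table.
lower, upper = tukey_fences(0.75, 1.0)
print(lower, upper)

# Hypothetical observations (NOT the survey data).
print(flag_outliers([0.5, 1, 1, 2, 4], 0.75, 1.0))
```

With these quartiles the limits come out as 0.375 and 1.375, which agrees with the Q1-1.5IQR and Q3+1.5IQR entries for that column of the table.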
13. Based on your problem stated in (1), give any relevant conclusion for the study.
In conclusion, this survey was carried out to find out how much time UMP
students spend on social media in a day. This problem was chosen because we want to know
how the time that students in UMP Pekan spend on social media in a day differs
between male and female students, as we know that the results from last
semester's exam varied between them.