Submit your assignment to Blackboard Task 2. Please follow the submission instructions on
Blackboard.
The assignment will be marked out of a total of 100 marks and forms 30% of the total
assessment for the course. ALL assignments will be checked for plagiarism automatically by
the SafeAssign system provided by Blackboard.
Refer to your Course Outline or the Course Web Site for a copy of the Student Misconduct,
Plagiarism and Collusion guidelines.
Assignment submission extensions will only be granted in accordance with the official
Faculty of Arts, Business and Law Guidelines.
Requests for an extension to an assignment MUST be made to the course coordinator prior to
the date of submission. Requests made on or after the submission date will only be
considered in exceptional circumstances.
ICT110 Introduction to Data Science Assignment 2
Background
A research team plans to study the health development of the world over the past 15 years.
The team retrieved a dataset of Health and Population Statistics between 2001 and 2015
from the World Bank (http://databank.worldbank.org).
More details about the data attributes and data content can be found in the attached
documents.
Assignment Task
You are a member of the team, and need to perform data analysis on countries in the region
of East Asia & Pacific.
The team has not set any specific goal for the analysis. Therefore, you have the freedom to
explore the data and report anything you find interesting or significant.
You have been requested to prepare a data analysis report about your work and explain your
findings. The potential audiences include other researchers, business representatives, and
government agencies. They may have limited ICT or mathematical knowledge.
1. Introduction
Provide an introduction to the problem. Include background material as appropriate: who
cares about this problem, what impact it has, and where the data comes from.
2. Data Setup
Describe how to load the data and which libraries are needed. Provide an overview of the
data, including its dimensions and structure.
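As a rough illustration of the expected format only (not a model answer), the sketch below
loads a CSV file and prints an overview of it. The file, the column names and the values are
hypothetical stand-ins so that the snippet runs on its own; the real dataset and its
attributes are described in the attached documents.

```r
# Hypothetical stand-in for the World Bank export: a tiny CSV written to a
# temporary file. Column names and values are made up for illustration and
# are NOT taken from the real dataset.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "Country,Year,LifeExpectancy,Population",
  "A,2001,70.1,1000000",
  "A,2015,74.3,1200000",
  "B,2001,65.0,500000",
  "B,2015,69.8,650000"
), csv_path)

# Load the data (base R only, no extra libraries are needed for this step)
MyData <- read.csv(csv_path, stringsAsFactors = FALSE)

# Dimensions: number of rows and number of columns
dim(MyData)

# Structure: the type of each column and a preview of its values
str(MyData)

# First few rows and a per-column summary
head(MyData)
summary(MyData)
```

In a real report, csv_path would be replaced by the path to the downloaded data file.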
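3. Exploratory Data Analysis
Perform 3 one-variable analyses and 2 two-variable analyses (items 3.1 to 3.5 of the marking
sheet in Appendix A). Plot at least one graph for each variable. Explain why the
selected graph is appropriate.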
The analysis can be performed on all years and all countries, or on a subset of your interest.
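A two-variable analysis could, for instance, pair a scatter plot with a correlation
coefficient. The sketch below is only an illustration of the format: the attribute names
(income, lifeExpectancy) and all values are made up and are not taken from the World Bank
data.

```r
# Made-up illustrative values, NOT real World Bank figures
MyData <- data.frame(
  income         = c(1200, 3500, 8000, 15000, 42000),
  lifeExpectancy = c(62, 68, 72, 76, 82)
)

# Draw a scatter plot: suitable for showing how two numeric attributes
# vary together, one point per observation
plot(MyData$income, MyData$lifeExpectancy,
     xlab = "Income", ylab = "Life expectancy (years)",
     main = "Life expectancy against income")

# Pearson correlation: summarises the strength of the linear relationship
r <- cor(MyData$income, MyData$lifeExpectancy)
print(r)
```

A scatter plot is a reasonable choice here because both attributes are numeric; for one
numeric and one categorical attribute, a grouped boxplot may be more suitable.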
4. Advanced Analysis
4.1 Clustering
Briefly explain the concept of clustering and k-means.
Perform a clustering analysis to group countries according to some selected attributes.
The analysis can be performed on all years and all countries, or on a subset of your interest.
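The sketch below shows one possible shape for a k-means analysis using the kmeans()
function from base R. It is an illustration only: the countries, the attribute names
(lifeExpectancy, fertilityRate), the values and the choice of k = 2 are all assumptions made
for the example, not part of the supplied dataset.

```r
# Made-up illustrative values, NOT real World Bank figures
MyData <- data.frame(
  country        = c("A", "B", "C", "D", "E", "F"),
  lifeExpectancy = c(60, 62, 61, 80, 82, 81),
  fertilityRate  = c(4.8, 5.1, 4.5, 1.6, 1.4, 1.8)
)

# Scale the attributes so that each one contributes equally to the distance
features <- scale(MyData[, c("lifeExpectancy", "fertilityRate")])

# Fix the random seed so that the clustering is reproducible
set.seed(42)

# Group the countries into k = 2 clusters; nstart repeats the algorithm
# from several random starting points and keeps the best result
fit <- kmeans(features, centers = 2, nstart = 10)

# Attach the cluster label to each country and inspect the grouping
MyData$cluster <- fit$cluster
print(MyData[, c("country", "cluster")])

# Plot the countries, coloured by cluster
plot(MyData$lifeExpectancy, MyData$fertilityRate,
     col = fit$cluster, pch = 19,
     xlab = "Life expectancy (years)", ylab = "Fertility rate",
     main = "k-means clusters (k = 2)")
```

The number of clusters is fixed by hand in this sketch; in a report, the choice of k should
be justified.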
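4.2 Linear Regression
Perform two linear regression analyses (items 4.2.1 and 4.2.2 of the marking sheet in
Appendix A).
5. Conclusion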
6. Reflections
In this part, discuss any difficulties you had performing the analysis and how you solved
those difficulties. Reflect on how the analysis process went for you, what you learnt, and
what you might do differently next time.
For the data analysis, you need to provide both the R code and an explanation of the code
and its results. For sections 2 to 4, please present each R code snippet in a box with some
comments. For example:
# Draw a boxplot on the attribute Income
boxplot(MyData$income)
The following guidelines will be used in marking each section of the assignment:
Report Format
Your report should be at least 1,200 words long and preferably no longer than 2,000 words.
All comments and graph titles are counted.
For advice on report writing, the following book is recommended:
Summers, J. & Smith, B., 2014, Communication Skills Handbook, 4th Ed, Wiley, Australia.
Referencing
Two references for the explanation of clustering and two for linear regression are required. These
references should follow the Harvard method of referencing. Note that ALL references
should be from journal articles, conference papers, technical papers or a recognized expert in
the field. DO NOT use Wikipedia as a reference. The use of unqualified references will result
in the deduction of marks.
Submission
The completed assignment is to be submitted to Blackboard Task 2 by the due date of
11:59pm Friday, Week 12.
The assignment will be assessed according to the marking sheet shown on the last page.
Late submissions will be penalised according to the policy in the course outline. Please
note that Saturday and Sunday are included in the count of days late.
Assignment Guidelines
This assignment will take a number of weeks to complete and will require a good
understanding of data science and management for successful completion. It is imperative
that students take heed of the following points in relation to doing this assignment:
1. Ensure that you clearly understand the requirements for the assignment: what has to be
done and what the deliverables are.
2. If you do not understand any of the assignment requirements, please ASK the course
coordinator or your tutor.
3. Each time you work on any aspect of the assignment, reread the assignment requirements
to ensure that what is required is clearly understood.
Appendix A
Items                                                                                     Maximum Marks   Marks Obtained
Report formatting (font, header and footer, table of contents, numbering, referencing)         5
Professional communication (correct spelling, grammar, formal business language used)          5
Report introduction                                                                             8
Data setup                                                                                      5
Exploratory Data Analysis
    3.1 1st one-variable                                                                        5
    3.2 2nd one-variable                                                                        5
    3.3 3rd one-variable                                                                        5
    3.4 1st two-variable                                                                        8
    3.5 2nd two-variable                                                                        8
Advanced Analysis
    4.1 Clustering                                                                             10
    4.2.1 1st Linear                                                                           10
    4.2.2 2nd Linear                                                                           10
Conclusion                                                                                      8
Reflection                                                                                      8
Total                                                                                         100        0.0
OVERALL COMMENTS: