
INST 627 – Data Analytics for Information Professionals

Hornbake 0105
Thursday 6:00 – 8:45 PM

Instructor: Dr. Christopher Antoun


Email: antoun@umd.edu
Office: 4111G Hornbake, South Wing
Office Hours: Wednesdays, 2:00 – 4:00 PM. Use this to sign up: https://goo.gl/FCCcst

This course is focused on basic statistical methods for analyzing data, with an emphasis
on practical applications rather than statistical theories. It concentrates on the process of
finding and extracting useful information and insights from data sets.
Topics for the present course include an introduction to research design, data
manipulation, descriptive statistics, hypothesis testing, the ANOVA model, linear
regression, and logistic regression. The course will be structured around R as the main
software package.

LEARNING OBJECTIVES
After completing this course you will be able to:
- Select and apply appropriate statistical methods to the analysis of data;
- Use R for basic data manipulation and analysis; and
- Interpret analysis results to reach defensible, data-driven conclusions.

COURSE MATERIALS

Software
The following software is necessary for you to successfully complete the course.
(Required) Microsoft Excel. For Macintosh users it is available through the university’s
TERPware website (https://terpware.umd.edu).
(Required) R software. It is free and available online (https://www.r-project.org/). You
may want to use RStudio (the free version), which is an integrated development
environment for R (https://www.rstudio.com/).

Readings and Video Tutorials


(Required) The “Online Stat Book” (hereafter OSB), developed primarily by Rice
University. It provides a multimedia course of study and is available for free
(http://onlinestatbook.com/2/index.html). Completing the required readings is essential to
understanding the course material.
(Required) Video tutorials produced by Mike Marin and hosted on YouTube. The full
list of videos is shown below under “Required R Tutorials.” Occasionally he introduces
concepts that have not been covered in class; don’t worry about these. He also uses some
terminology that is different from what we use in class. For example, he says explanatory
variable to mean independent variable. Because terminology in statistics varies by
discipline, it is worth getting used to alternative terms for the same concepts.

COURSE ACTIVITIES
All students should meet the following requirements on time. Late assignments will be
penalized. If an assignment due date is a religious holiday for you, please let me know at
least one week in advance, so that an alternate due date can be set.

Exercises
There will be nine exercises. Each exercise will be graded on a scale of 0-10. You may
work with your colleagues to figure out the underlying concepts and problem-solving
processes, but you are expected to work individually to answer the specific problems that
are assigned. Completed assignments will be submitted via Canvas/ELMS. They are due by
5:00 PM on the day of class.

Mid-Term Examination
The mid-term examination tests your understanding of the concepts covered in the
readings and lectures.

Group Project
In small groups you will prepare a data-related analytic project. Over the course of the
project you will identify an interesting dataset, develop a research question, form an
analysis plan, carry out the analysis, and report on the results. There will be a few
assignments specific to the group project, including a project proposal, a progress report
(i.e. update), a presentation, and a final paper. Additional details about the group project
will be provided in ELMS/Canvas and discussed in class.

Quizzes
Throughout the semester there will be periodic quizzes. These quizzes will not count
toward your grade in the class. They are a way for you to test your knowledge and get
rapid feedback.

GRADING
The overall grade will be based on the following components:
Exercises: there will be nine exercises, each worth 5%, for a total of 45%.
Mid-Term Examination: 25% (open book)
Group Project: 30%
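
As a rough illustration of how these weights combine, here is a minimal sketch in Python (used only for illustration; the course itself uses R). The function name and the sample scores are hypothetical. It also assumes that the mid-term and group project are scored as percentages (0-100), which this syllabus does not specify.

```python
# Weights taken from the GRADING section of this syllabus.
EXERCISE_WEIGHT = 0.05  # each of the nine exercises
MIDTERM_WEIGHT = 0.25
PROJECT_WEIGHT = 0.30


def overall_grade(exercise_scores, midterm_pct, project_pct):
    """Return the overall course percentage (0-100).

    exercise_scores: nine scores on the 0-10 scale used for exercises.
    midterm_pct, project_pct: ASSUMED to be percentages (0-100); the
    syllabus does not state the scale for these two components.
    """
    if len(exercise_scores) != 9:
        raise ValueError("expected nine exercise scores")
    # Convert each 0-10 score to a percentage, then apply the 5% weight.
    exercises = sum(score / 10 * 100 * EXERCISE_WEIGHT for score in exercise_scores)
    return exercises + midterm_pct * MIDTERM_WEIGHT + project_pct * PROJECT_WEIGHT


if __name__ == "__main__":
    # Hypothetical scores, for illustration only.
    print(round(overall_grade([8] * 9, 80, 90), 2))
```

Because the three weights sum to 100%, full marks on every component give an overall grade of 100.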

MY TEACHING PHILOSOPHY
To understand how I view my role as an instructor, please read my statement of teaching
philosophy: https://sites.google.com/site/chrisantoun/teaching-philosophy

COURSE POLICIES

Academic Misconduct – Cheating in any form (copying, falsifying signatures,
plagiarism, etc.) will not be tolerated. It will result in a referral to the Office of Student
Conduct irrespective of scope and circumstances, as required by university rules and
regulations. There are severe consequences of academic misconduct, some of which are
permanent and reflected on the student’s transcript. If you have any questions regarding
the University’s policies on scholastic dishonesty, please see
http://osc.umd.edu/OSC/Default.aspx. 

It is very important that you complete your own assignments, and do not share files
(excluding raw data), partial work or final work.

University of Maryland Code of Academic Integrity – The University of Maryland,
College Park has a nationally recognized Code of Academic Integrity, administered by
the Student Honor Council. This Code sets standards for academic integrity at Maryland
for all undergraduate and graduate students. As a student you are responsible for
upholding these standards for this course. It is very important for you to be aware of the
consequences of cheating, fabrication, facilitation, and plagiarism. For more information
on the Code of Academic Integrity or the Student Honor Council, please
visit http://shc.umd.edu/SHC/Default.aspx.

Special Needs – Please come and see me as soon as possible if you think you might need
any special accommodations for disabilities. In addition, please contact the Disability
Support Services (301-314-7682 or http://www.counseling.umd.edu/DSS/). Disability
Support Services will work with us to help create appropriate academic accommodations
for any qualified students with disabilities. If you experience psychological distress
during the course of the semester, you can get professional help at the Counseling Center
(301-314-7651 or http://www.counseling.umd.edu/).

COURSE SCHEDULE

W1: Jan 25
Topic: Introduction & Course Overview

W2: Feb 1
Topic: Measurement & Design
Required Readings/Tutorials:
- OSB: Chapter 1, Sections 2-5, 7, 9; Chapter 6, Section 7
- R tutorials 1-8
Due:
- R installed
- Find an R tutorial online. Post its name on ELMS/Canvas with a brief review
  (~100 words) of its strengths and weaknesses.

W3: Feb 8
Topic: Descriptive Statistics Overview
Required Readings/Tutorials:
- OSB: Chapter 1, Section 11; Chapter 2, Section 5; Chapter 3, Sections 2-4, 12-13, 16
- “Presenting and summarising data” (PDF on ELMS/Canvas)
- R tutorials 9-10
Due:
- Exercise 1

W4: Feb 15
Topic: Probability & Sampling
Required Readings/Tutorials:
- OSB: Chapter 7, Section 3; Chapter 9, Sections 2, 6; Chapter 11, Sections 2, 3, 6
- “Samples and populations” (PDF)
- R tutorials 11-12
Due:
- Exercise 2

W5: Feb 22
Topic: Hypothesis Testing (one-sample t-tests)
Required Readings/Tutorials:
- OSB: Chapter 11, Sections 4-8; Chapter 12, Section 2
- “Hypothesis testing and P values” (PDF)
- R tutorial 13
Due:
- Exercise 3

W6: Mar 1
Topic: Hypothesis Testing (two-sample t-tests)
Required Readings/Tutorials:
- OSB: Chapter 10, Sections 7-9, 11; Chapter 12, Section 4
- “Comparison of means” (PDF)
- R tutorials 14-15
Due:
- Exercise 4

W7: Mar 8
Topic: Chi-Square
Required Readings/Tutorials:
- OSB: Chapter 17, Sections 2, 3, 5
- R tutorial 16
Due:
- Exercise 5

W8: Mar 15
MIDTERM EXAM (Open Book)

W9: Mar 22
SPRING BREAK

W10: Mar 29
Topic: Analysis of Variance (ANOVA), One Way
Required Readings/Tutorials:
- OSB: Chapter 15, Sections 2-4
- “One-way analysis of variance” (PDF)
- R tutorial 17
Due:
- Project Proposal

W11: Apr 5
Topic: Analysis of Variance (ANOVA), Two Way
Required Readings/Tutorials:
- OSB: Chapter 15, Sections 6, 8
Due:
- Exercise 6

W12: Apr 12
Topic: Correlations & Linear Regression
Required Readings/Tutorials:
- OSB: Chapter 14, Sections 2-6
- “Correlation and Regression” (PDF)
- R tutorials 18-20
Due:
- Exercise 7
- Project Update

W13: Apr 19
Topic: Multiple Linear Regression
Required Readings/Tutorials:
- OSB: Chapter 14, Section 9
Due:
- Exercise 8
- Peer Feedback on Project

W14: Apr 26
Topic: Logistic Regression
Required Readings/Tutorials:
- “Logistic regression” (PDF)
- “Logistic regression example in R” (PDF)
- “Generalized Linear Models” (http://data.princeton.edu/R/glms.html)
- R tutorials 21-22
Due:
- Exercise 9

W15: May 3
Topic: Review; Group Work

W16: May 10
PROJECT PRESENTATIONS (Final Paper is Due)

If there are updates to the schedule, they will be posted to ELMS/Canvas.

Required R Tutorials

1. Downloading and Installing R (https://www.youtube.com/watch?v=cX532N_XLIs/)
2. Import Data (https://www.youtube.com/watch?v=qPk0YEKhqB8)
3. Introduction to R (https://www.youtube.com/watch?v=UYclmg1_KLk - Click
Show More under section Published on to get the data set)
4. Introduction to R II (https://www.youtube.com/watch?v=1BcGnHwUT6k)
5. Vectors in R (https://www.youtube.com/watch?v=2TcPAZOyV0U)
6. Subsetting Data (https://www.youtube.com/watch?v=jGf7WNh-LX8)
7. Basic Plots (http://www.cyclismo.org/tutorial/R/plotting.html)
8. Summary Statistics
(https://www.youtube.com/watch?v=ACWuV16tdhY&list=PLqzoL9-eJTNBDdKgJgJzaQcY6OXmsXAHU&index=20)
9. Basic Probability Distributions
(http://www.cyclismo.org/tutorial/R/probability.html)
10. Z scores
(https://www.youtube.com/watch?v=peEsXbdMY_4&list=PLqzoL9-eJTNBDdKgJgJzaQcY6OXmsXAHU&index=26)
11. Calculating p-values (http://www.cyclismo.org/tutorial/R/pValues.html)
12. Calculating Confidence Intervals
(http://www.cyclismo.org/tutorial/R/confidence.html)
13. One sample t-test (https://www.youtube.com/watch?v=kvmSAXhX9Hs)
14. Installing packages
(https://www.youtube.com/watch?v=3RWb5U3X-T8&index=11&list=PLqzoL9-eJTNBDdKgJgJzaQcY6OXmsXAHU)
15. Two sample t-test
(https://www.youtube.com/watch?v=RlhnNbPZC0A&list=PLqzoL9-eJTNBDdKgJgJzaQcY6OXmsXAHU&index=29)
16. Chi Square test of independence
(https://www.youtube.com/watch?v=POiHEJqmiC0&list=PLqzoL9-eJTNBDdKgJgJzaQcY6OXmsXAHU&index=34)
17. Analysis of Variance (ANOVA)
(https://www.youtube.com/watch?v=lpdFr5SZR0Q)
18. Scatterplots
(https://www.youtube.com/watch?v=FEAS3akVxD8&index=19&list=PLqzoL9-eJTNBDdKgJgJzaQcY6OXmsXAHU)
19. Correlations
(https://www.youtube.com/watch?v=XaNKst8ODEQ&list=PLqzoL9-eJTNBDdKgJgJzaQcY6OXmsXAHU&index=36)
20. Linear Regression
(https://www.youtube.com/watch?v=66z_MRwtFJM&list=PLqzoL9-eJTNBDdKgJgJzaQcY6OXmsXAHU&index=37)
21. Multiple Linear Regression
(https://www.youtube.com/watch?v=q1RD5ECsSB0)
22. Checking Linear Regression Assumptions
(https://www.youtube.com/watch?v=eTZ4VUZHzxw)
