Agenda
Research Group
Administrative Issues
Content and Aims
Research Group
Scientific staff:
Prof. Dr. Jens Grabowski
Dr. Steffen Herbold
M.Sc. Xiao-Wei Wang
Dipl. Math. Verena Herbold
Dipl.-Inf. Daniel Honsel
M.Sc. Ella Albrecht
Web:
http://www.swe.informatik.uni-goettingen.de
Administration
Time and Place
Lecture
Tuesday, 14:15-15:45 (s.t.)
Room: Ifi 0.101
Exercise
Thursday, 13:15-14:45 (s.t.)
Room: Ifi -1.101
Examination
Written exam at the end of the semester
Precondition for participation in the exam
Passing the exercise
Administration
Exercise
Final project
Small data analysis project as group work
Presentation of the results required for passing
Expected Background
capability
Experience with statistical methods and basic proficiency with a statistical software package, such as R or RStudio, Minitab, Matlab, SAS, or SPSS
Experience with the conditioning and management of business data, including databases
Basic programming skills, preferably including SQL
Course Objectives
Upon completion of this course, you should be able to:
Immediately participate and contribute as a data science team member on big data and other analytics projects by:
Deploying a structured lifecycle approach to data science and big data analytics projects
Reframing a business challenge as an analytics challenge
Applying analytic techniques and tools to analyze big data, create statistical models, and identify insights that can lead to actionable results
Selecting optimal visualization techniques to clearly communicate analytic insights to business sponsors and others
Using tools such as R and RStudio, MapReduce/Hadoop, in-database analytics, and window and MADlib functions
Course Topics
Big Data Overview
State of the Practice in Analytics
The Data Scientist
Big Data Analytics in Industry Verticals
Data Analytics Lifecycle
Statistics for Model Building and Evaluation
Methods
K-means Clustering
Association Rules
Linear Regression
Logistic Regression
Naive Bayesian Classifier
Decision Trees
Time Series Analysis
Text Analysis
Tools
Analytics for Unstructured Data (MapReduce and Hadoop)
The Hadoop Ecosystem
In-database Analytics SQL Essentials
Advanced SQL and MADlib for In-database Analytics
Operationalizing an Analytics Project
Final Lab on Big Data Analytics