Big Data Management and Analytics in WS 16/17
News
- You can review your follow-up exam on Mon 08.05.2017, 14:00-15:00, in room F101A (Oettingenstr. 67).
- The follow-up exam has been corrected. The results can be seen in UniWorX.
- Announcements for the upcoming exam - With name assignments!
- The follow-up exam is scheduled for Wed. 19.04.2017, 10 s.t. - 11:30, in C 123 and B 139 (both Theresienstr. 41). The assignment to the lecture halls will be announced within the next few days.
- A second exam review, for those who did not attend the first date, is scheduled for Tue 11.04.2017, 10:00-11:00, in room 157.
- Anyone who is eligible for an extension of the examination time (e.g. due to medical issues) for the follow-up exam and wants to make use of it is asked to inform us by mail immediately. We cannot guarantee that late requests will be taken into account for the exam.
- The review of your first exam will take place in room F 106, Oettingenstraße 67.
- The follow-up exam is now scheduled for Wed. 19.04.2017, 10-13. Please register as soon as possible if you intend to take part in the follow-up exam.
- The review of your first exam is scheduled for Tue 04.04.2017, 14:00-16:00.
- The exam has been corrected. The results can be seen in UniWorX.
- Announcements for the upcoming exam - Updated name assignment!
- A simple non-programmable calculator is permitted in the exam.
- An error in tutorial 10, affecting slides 21-23, has been corrected.
- An error in tutorial 9, affecting slide 23, has been corrected! The first equation had the wrong signs for x and omega.
- An error in tutorial 4, affecting slide 3, has been corrected! In the last expression, after the sum sign, a ',' between a_ij and b_jk has been removed.
- An error in tutorial 10, affecting slides 10-17, has been corrected! h3(x) in the table should yield a value of 8 (not 7) for e=2. However, the final result of the assignment remains unchanged.
- An error in tutorial 9, slide 24, has been corrected!
- Slides for tutorial 11 have been updated!
- Anyone who is eligible for an extension of the examination time (e.g. due to medical issues) and wants to make use of it is asked to inform us by mail immediately. We cannot guarantee that late requests will be taken into account for the exam.
- There will be no tutorial sessions on 22.12.2016 and 23.12.2016.
- Registration for the first exam opens today at 12:00 via UniWorX.
- The first exam will be on 03.03.2017, 10:00-12:00, in A 140, M 118, and M 018 (main building). The second exam will be held at the beginning of April; the exact date will be announced as soon as possible.
- Registration in UniWorX opens on 1.10.16.
Content
In almost all areas of business, industry, science, and everyday life, the amount of available data that contains value and knowledge is immense and growing fast. However, turning data into information, information into knowledge, and knowledge into value is challenging. To extract the knowledge, the data needs to be stored, managed, and analyzed. In doing so, we have to cope not only with an increasing amount of data, but also with increasing velocity (i.e., data streamed at high rates) and with heterogeneous data sources, and we increasingly have to take the quality and reliability of data and information into account. These properties, referred to as the four V's (Volume, Velocity, Variety, and Veracity), are the key properties of "Big Data". Big Data grows faster than our ability to process it, so we need new architectures, algorithms, and approaches for managing, processing, and analyzing Big Data that go beyond traditional concepts for knowledge discovery and data mining. This course introduces Big Data, the challenges associated with it, and basic concepts for Big Data Management and Big Data Analytics, which are important components of the new and popular field of Data Science.
Organisation
- Scope: 3+2 hours weekly (equals 6 ECTS)
- Required: Lecture "Database Systems I" or equivalent
- Beneficial: Lecture "Knowledge Discovery in Databases I" or equivalent
- Lecturer: Prof. Dr. Matthias Schubert
- Assistant: Daniyal Kazempour
- Audience: The lecture is aimed at Bachelor students (5th term) and Master students in Media Informatics, Bioinformatics and Informatics
Time and Locations
Component | When | Where | Starts at |
---|---|---|---|
Lecture | Tue, 9.00 - 12.00 | Room B U101 (Oettingenstr. 67) | 18.10.2016 |
Tutorial 1 | Thu, 16.00 - 18.00 | Room Lehrturm-VU107 (Prof.-Huber-Pl. 2 (V)) | 27.10.2016 |
Tutorial 2 | Fri, 16.00 - 18.00 | Room Lehrturm-W401 (Prof.-Huber-Pl. 2 (W)) | 28.10.2016 |
Schedule
(Note: This schedule is tentative. As this is a new course, chapters and dates may change at short notice.)
Lecture date | Lecture topic | Tutorial date | Tutorial material |
---|---|---|---|
18.10.16 | Chapter 0: Intro & Overview; Chapter 1: Introduction to Big Data - the four V's | --- | |
25.10.16 | V1: Volume - Chapter 2, Part 1: NoSQL Databases | 27.10.16, 28.10.16 | Introduction to Python: Slides, Exercise Sheet, Python Solutions |
01.11.16 | no lecture (All Saints' Day) | 03.11.16, 04.11.16 | Exercise Sheet, Slides |
08.11.16 | V1: Volume - Chapter 2: NoSQL Databases; V1: Volume - Chapter 3: Hadoop, MapReduce, HDFS | 10.11.16, 11.11.16 | Exercise Sheet, Slides |
15.11.16 | V1: Volume - Chapter 3: Hadoop, MapReduce, HDFS | 17.11.16, 18.11.16 | Exercise Sheet, Slides |
22.11.16 | V1: Volume - Spark | 25.11.16, 26.11.16 | Exercise Sheet, Slides |
29.11.16 | V2: Velocity - Stream Processing | 01.12.16, 02.12.16 | Exercise Sheet, Matrix multiplication template, Matrix multiplication solution, Wordcount solution, Slides (updated) |
06.12.16 | V2: Velocity - Stream Processing: Apache Flink | 08.12.16, 09.12.16 | Exercise Sheet, Slides (updated) |
13.12.16 | V2: Velocity - Stream Applications and Algorithms | 15.12.16, 16.12.16 | Exercise Sheet, Matrix-Matrix-multiplication template, Slides, WordCount solution, Matrix-Matrix-multiplication solution |
20.12.16 | V3: Variety - Stream Clustering and Classification | --- | |
Winter Break | | | |
10.01.17 | V3: Variety - Text processing and high dimensional data | 12.01.17, 13.01.17 | Exercise sheet, Slides (updated) |
17.01.17 | V3: Variety - High dimensional data (cont.), efficient PCA | 19.01.17, 20.01.17 | Exercise sheet, Slides |
24.01.17 | V3: Variety - Graph Data: Part 1, Link Analysis and PageRank | 26.01.17, 27.01.17 | Exercise sheet, Slides (updated), PowerIteration Notebook |
31.01.17 | V4: Variety - Graph Data: Part 2, Community Detection | 02.02.17, 03.02.17 | Exercise sheet, Slides |
07.02.17 | Q&A | --- | |
Previous Semesters
WS 16/17, WS 15/16