Knowledge Discovery in Databases II in WS 2015/16
News
- The inspection of the 2nd exam (8.4.16) will take place on Wednesday, 1.06.2016, from 16:00 to 17:00 in room F107, Oettingenstr. 67.
- An additional written exam will be held on Friday, 8.4.16, at 10:00 in room BU101, Oettingenstr. 67. To take part, you must register for the exam in the UniWorx system by 1.4.2016.
- The exam results have been sent to the participants. The exam inspection will take place on Friday, 26.02.2016, from 14:00 to 15:00 in room F107, Oettingenstr. 67.
- The written exam will take place on 8.2.2016 from 14:00 to 16:00 in room B001, Oettingenstr. 67.
- To register for the course, subscribe here: UniWorx KDD II.
Content
In many modern application areas, data scientists face challenges that go beyond the basic techniques introduced in the module Knowledge Discovery in Databases I. The module Knowledge Discovery in Databases II covers advanced techniques for handling large data volumes, volatile data streams, complex object descriptions and linked data. These topics are also known as the three major challenges (Volume, Velocity, Variety) in Big Data analysis. The module is aimed at master's students interested in designing and developing knowledge discovery processes for various types of applications. This includes developing new data mining and data preprocessing methods as well as the ability to select the best-suited established approach for a given practical challenge.
Organisation
- Time: 3+2 hours weekly
- Lectures: Dr. Eirini Ntoutsi, PD Dr. Matthias Schubert
- Tutorial: PD Dr. Matthias Schubert
- Entry Requirements: Knowledge Discovery in Databases I
- ECTS: 6
- Type of Examination: Written Exam (90min)
Time and Locations
Teaching Component | Time | Location | Start |
---|---|---|---|
Lectures | Tue, 9.00 - 12.00 | Luisenstr. 37 (c) C006 | 13.10.2015 |
Tutorial Group 1 | Fri, 14.00 - 16.00 | Main building M203 | 23.10.2015 |
Tutorial Group 2 | Fri, 16.00 - 18.00 | Amalienstr. 73A 220 | 23.10.2015 |
Schedule
Date | Lecture | Tutorial date | Tutorial materials |
---|---|---|---|
13.10.2015 | Introduction (updated 13/10/2015) | - | |
20.10.2015 | Volume aspect: Feature selection (updated 21.10.2015) | 23.10.2015 | tutorials 1 FS_template.py iris.arff |
27.10.2015 | Volume aspect: Dimensionality reduction (updated 30/10/2015) | 30.10.15 | tutorials 2 PCA_template.py Katze.png |
03.11.2015 | Volume aspect: Clustering in high-dimensional data (updated 12/11/2015) | 06.11.15 | tutorials 3 RCA_template.py DataLoader.py ionosphere.arff |
10.11.2015 | Volume aspect: Clustering in high-dimensional data (continued - check previous slides) | 13.11.15 | tutorials 4 Py_4C_template.py data.csv |
17.11.2015 | Volume aspect: Large object cardinalities (1) | 20.11.15 | tutorials 5 SparkIntro_template.py Para_KMeans_template.py |
24.11.2015 | Volume aspect: Large object cardinalities (2) (updated 25/11/2015) CF radius proof | 27.11.15 | tutorials 6 spark_mult_template.py |
01.12.2015 | Velocity aspect: Data streams (1) (updated 01/12/2015) | 04.12.15 | tutorials 7 Micro_Clustering_template.py birch2.arff |
08.12.2015 | Velocity aspect: Data streams (2) (updated 10.12.2015) | 11.12.15 | tutorials 8 template_Stream_NB.py Random_stream.py |
15.12.2015 | Variety aspect: Ensemble learning (check slides of the following lecture) | 18.12.15 | tutorials 9 ecoc_template.py |
22.12.2015 | Variety aspect: Ensembles (continuation from Lecture 10) and MultiView Learning (updated 22/12/2015) | -- | |
29.12.2015 | No lecture (Christmas-New Year break) | -- | |
05.01.2016 | No lecture (Christmas-New Year break) | 08.01.16 | tutorials 10 ensemble_template.py |
12.01.2016 | Variety aspect: Multi-Instance Learning | 15.01.16 | tutorials 11 mi_distance_template.py |
19.01.2016 | Variety aspect: Graph & linked data (1) | 22.01.16 | tutorials 12 |
26.01.2016 | Variety aspect: Graph & linked data (2) (continued - check previous slides) | 29.01.16 | tutorials 13 fb0.edges |
02.02.2016 | Questions & Answers / Preparation for the exam | | |
08.02.2016 | Exam | | |
Mid-February | Viewing of results | | |
Tutorials
- Registration is required to take part in the tutorials and to register for the final exam: UniWorx KDD II.
- Parts of the tutorial require programming in Python 2.7 using the packages NumPy and SciPy. Here is a short tutorial for both packages.
General Links
- KDNuggets - E-Newsletter about data mining
- ACM SIGKDD - 'Special Interest Group' of the 'Association for Computing Machinery' for KDD
- Weka (data mining with open-source machine learning software in Java)
- ELKI (open-source Java project on data mining from the database group of the LMU)
Previous Semesters
WS 15/16, WS 14/15, WS 13/14, WS 12/13, WS 11/12, SS 10, SS 09, SS 08, SS 07