Seminar: Techniques for implementing main memory database systems

Information

Content

In this seminar we deal with techniques for implementing main memory database systems and related topics.

Prerequisites

Lecture Fundamentals of Databases (Grundlagen Datenbanken, GDB) or a similar course
Very good knowledge of databases; good programming skills in C++ (depending on the topic)

Dates

Weekly meeting: Tuesdays, 2 p.m. - 3:30 p.m., room 02.09.014
First meeting: October 17, 2017

Organization

First organisational meeting for the seminar: Thursday, July 13, 4:00 p.m., room MI 02.13.010
In addition to the seminar talk, you have to implement the key aspects of your topic in C++ as a component of a main memory database system. For the data mining topics, SQL queries would be the more useful deliverable.
Contact us by email or in person to obtain literature recommendations. For most topics, you will find papers by our group on our web site that serve as the primary source for your seminar.
The presentation can be done in English or in German. English is recommended only if you are proficient in English writing and conversation.

Topics

All topics are based on the architecture of our main memory database system HyPer (hyper-db.de). You will also find the corresponding literature references on that web site. Many topics are also covered in the relevant chapter of the textbook "Datenbanksysteme: Eine Einführung" (though more briefly than we expect from your written report). We also recommend using the bibliography database dblp. Contact us in good time, once you have read up on and familiarized yourself with your topic, to discuss the structure of your report.

17.10.2017: Text analysis: TFIDF (Thuy Tran) [Thesis] [Presentation]
24.10.2017: Snapshotting / Shadow storage (Daniel Kutasi) [Thesis] [Presentation]
21.11.2017: Parallel Cuckoo-Filter (Jeremias Neth) [Thesis] [Presentation]
21.11.2017: How database index structures can be used for Data Mining (Johannes Kirchmaier) [Thesis] [Presentation]
28.11.2017: Latency Hiding in Tree Lookups (supervisor: Timo Kersten) (Lukas Karnowski) [Thesis] [Presentation]
12.12.2017: Versioning for databases (Benedikt Kleiner) [Thesis] [Presentation]
12.12.2017: Graph storage: How good is CSR really? (supervisor: Jan Böttcher) (Mahammed Valiyev) [Thesis] [Presentation]
12.12.2017: Gradient descent in databases (Kevin Sterjo) [Thesis] [Presentation]
19.12.2017: Parallelization of a Query Engine (Thomas Blum) [Thesis] [Presentation]
19.12.2017: Linear/Logistic Regression (Oleg Patrascu) [Thesis] [Presentation]
09.01.2018: BW-Tree (Josef Schmeißer) [Thesis] [Presentation]
16.01.2018: Improvements of Bloom-Filters (idea by: Andreas Kipf) (Matthias Bungeroth) [Thesis] [Presentation]
23.01.2018: Database Cracking (David Werner) [Thesis] [Presentation]
06.02.2018: MapReduce and SQL: Do we need MapReduce? (Michael Schwarz) [Thesis] [Presentation]
06.02.2018: Classification: Decision Trees (Dominik Vinan) [Thesis] [Presentation]

New topics:
- Versioning for databases (Benedikt Kleiner)
- Aggregation of temporal data on NVIDIA GPUs (supervisor: Andreas Kipf, NVIDIA graphics board required)
- Improvements of Bloom-Filters (idea by: Andreas Kipf) (Matthias Bungeroth)
- Database Cracking (David Werner)
- Data mining with specific algorithms: is it possible using SQL?
  - Clustering algorithms such as DBSCAN
  - Classification: Decision Trees (Dominik Vinan)
  - Classification: Naive-Bayes
  - Linear Regression
  - Logistic Regression
  - Time series analysis: ARIMA model
  - Text analysis: TFIDF (Thuy Tran)
  - Text analysis: Topic analysis
  - Hypothesis testing
- MapReduce and SQL: Do we need MapReduce? (Michael Schwarz)
- How database index structures can be used for Data Mining (Johannes Kirchmaier)
- Topics about Graph Databases (supervisor: Jan Böttcher) (Mahammed Valiyev)
- Latency Hiding in Tree Lookups (supervisor: Timo Kersten) (Lukas Karnowski)
Last year's topics (can be reused):
- Multi-core machines / NUMA / multi-threaded parallelization
- Column-Store / Row-Store / Hybrid Store
- Snapshotting / Shadow storage (Daniel Kutasi)
- Compilation of query plans versus interpretation
- Synchronisation: Lock-free versus 2PL
- Compaction
- Indexing: ART
- Parallel hash joins: radix join versus global hash table
- Parallelization of a Query Engine (Thomas Blum)
- HTM versus Latching
- Multi-Version Concurrency Control

Material

LaTeX Template for Thesis (UTF-8): link
LaTeX Template for Thesis (Windows): link
Notes on C++: link
Slides of the organisational meeting: link
Gitlab of our Chair: link

Technische Universität München

Lehrstuhl III
Datenbanksysteme

Prof. Kemper
Prof. Neumann
Prof. Giceva
Fakultät für Informatik
TU München