Seminar: Techniques for implementing main memory database systems

Information

Content

This seminar covers techniques for implementing main memory database systems and related topics.

Prerequisites

  • Lecture Fundamentals of Databases (Grundlagen Datenbanken, GDB) or a similar course
  • Very good knowledge of databases and good programming skills in C++

Dates & Deadlines

  • Organizational meeting: Monday, 13.07.2020, 15:00, BBB room
  • Regular meeting: Monday, 16:00 - 18:00, MI 02.09.014

Schedule

09.11.2020: Session 1

Papers
  • Towards Scalable Dataframe Systems / When sweet and cute isn’t enough anymore: Solving scalability issues in Python Pandas with Grizzly
  • PolarFS: An Ultra-low Latency and Failure Resilient Distributed File System for Shared Storage Cloud Database

Deadlines
  • 12.10.2020: General Structure
  • 02.11.2020: Slides
  • 09.11.2020: Presentation Date
  • 23.11.2020: Paper & Implementation

16.11.2020: Session 2

Papers
  • BB-Tree: A practical and efficient main-memory index structure for multidimensional workloads
  • Interpolation-friendly B-trees: Bridging the Gap Between Algorithmic and Learned Indexes

Deadlines
  • 19.10.2020: General Structure
  • 09.11.2020: Slides
  • 16.11.2020: Presentation Date
  • 30.11.2020: Paper & Implementation

23.11.2020: Session 3

Papers
  • Updateable HyperLogLog Sketches
  • Leapfrog Triejoin: A Simple, Worst-Case Optimal Join Algorithm

Deadlines
  • 26.10.2020: General Structure
  • 16.11.2020: Slides
  • 23.11.2020: Presentation Date
  • 07.12.2020: Paper & Implementation

30.11.2020: Session 4

Papers
  • HetExchange: Encapsulating heterogeneous CPU-GPU parallelism in JIT compiled engines
  • DB4ML - An In-Memory Database Kernel with Machine Learning Support

Deadlines
  • 02.11.2020: General Structure
  • 23.11.2020: Slides
  • 30.11.2020: Presentation Date
  • 14.12.2020: Paper & Implementation

07.12.2020: Session 5

Papers
  • Scalable and Robust Latches for Database Systems (1)
  • Scalable and Robust Latches for Database Systems (2)

Deadlines
  • 09.11.2020: General Structure
  • 30.11.2020: Slides
  • 07.12.2020: Presentation Date
  • 21.12.2020: Paper & Implementation

14.12.2020: Session 6

Papers
  • External Merge Sort for Top-K Queries
  • White-box Compression: Learning and Exploiting Compact Table Representations

Deadlines
  • 16.11.2020: General Structure
  • 07.12.2020: Slides
  • 14.12.2020: Presentation Date
  • 28.12.2020: Paper & Implementation

Topic List

Topic | Supervisor
Scalable and Robust Latches for Database Systems (1) | Jan Böttcher
Scalable and Robust Latches for Database Systems (2) | Jan Böttcher
Towards Scalable Dataframe Systems / When sweet and cute isn’t enough anymore: Solving scalability issues in Python Pandas with Grizzly | Dominik Durner
PolarFS: An Ultra-low Latency and Failure Resilient Distributed File System for Shared Storage Cloud Database | Dominik Durner
BB-Tree: A practical and efficient main-memory index structure for multidimensional workloads | Philipp Fent
Interpolation-friendly B-trees: Bridging the Gap Between Algorithmic and Learned Indexes | Philipp Fent
External Merge Sort for Top-K Queries | Philipp Fent
Updateable HyperLogLog Sketches | Michael Freitag
Leapfrog Triejoin: A Simple, Worst-Case Optimal Join Algorithm | Michael Freitag
White-box Compression: Learning and Exploiting Compact Table Representations | Michael Freitag
HetExchange: Encapsulating heterogeneous CPU-GPU parallelism in JIT compiled engines (GPU required) | Maximilian Schüle
DB4ML - An In-Memory Database Kernel with Machine Learning Support | Maximilian Schüle

Material

  • Slides of the kickoff meeting: link
  • Introduction to modern C++: link
  • Full lecture slides on modern C++: link
  • LaTeX Template for Thesis (suggestion, based on the official ACM template): link
  • GitLab of our chair: link