Seminar Big Data Middleware for Data Analytics

Middleware is a term from the field of distributed systems, where applications on different machines interact. It describes software that provides services to applications beyond those available from the operating system. Because these services sit above the operating system, i.e. between the operating system and the application, they are called middleware.

Fig. 1: Middleware

The motivation of this course is to deepen our knowledge of Big Data middleware, especially for data analytics, and to gain presentation experience.


Topics

The seminar topics are divided into three areas:

  • Stream processing (for example Apache Spark, Kafka)
  • Key-Value Stores (for example Memcached, REDIS, Cassandra)
  • Serverless Computing (for example Amazon AWS Lambda and S3)

Lecturer

Prof. Dr. Bettina Schnor


Modules

  • Bachelor Computational Science: 6030
    • 555601 - partial module examination
    • 555602 - alternative
  • Master Computational Science: 10020
    • 555201 - partial module examination
    • 555202 - alternative
  • Master Data Science

    The seminar takes place during the summer term, and each student gives a talk. Afterwards, each student has to complete a project based on the topic of the seminar. The project topics are negotiable and will be adapted to the skills of the students.


    The seminar contributes to the following modules:

    • INF-DSAM7: Computer Engineering for Big Data (component: 2 SWS seminar)
    • INF-DSAM6A: Advanced Applied Data Science A (component: 2 SWS seminar)

Room/Dates

The seminar takes place on Mondays from 2 p.m. to 4 p.m., online and synchronous.

News
Serverless on AWS 20.07.2021 Ferdinand Hoske [Seminar Paper] [Slides] [Handout]
Cassandra 20.07.2021 Vishal Kumar Lohana [Seminar Paper] [Handout]
REDIS (Remote Dictionary Server) 29.06.2021 Maksym Nevar [Seminar Paper] [Slides] [Handout]
RESIN: Big-Data Query Optimization 21.06.2021 Leo Repp [Seminar Paper] [Slides] [Handout]
Stream Processing with Apache Kafka 15.06.2021 Marieke Warfia [Seminar Paper] [Slides] [Handout]
Stream Processing with Apache Spark 14.06.2021 Paul Rößler [Seminar Paper] [Slides] [Handout] [Source]
Topic assignment 03.05.2021 Slides with topics and presentation dates are online: [Slides]
Kick-off lecture 12.04.2020, 2 p.m. to 4 p.m. The materials are linked here: [Slides] [Video]
How to give an asynchronous online presentation The materials are linked here: [Slides] [Video]

Course materials


Requirements

The grade is composed as follows:

  • 10% presentation draft
  • 30% successful presentation
  • 30% presentation content
  • 30% documentation (double-sided printed)
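
As a rough illustration of how the weights listed above combine, the following sketch computes a weighted total. The 0-100 scale for each component, the component names, and the example scores are assumptions made for illustration only; the course page does not state how components are scored or how the total maps to a final grade.

```python
# Weights taken from the list above; they sum to 100%.
WEIGHTS = {
    "presentation_draft": 0.10,
    "successful_presentation": 0.30,
    "presentation_content": 0.30,
    "documentation": 0.30,
}


def weighted_score(scores):
    """Combine per-component scores (hypothetical 0-100 scale) into one total."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical example scores, not real course data.
example = {
    "presentation_draft": 80,
    "successful_presentation": 90,
    "presentation_content": 70,
    "documentation": 100,
}
print(round(weighted_score(example), 1))
```

With these assumed scores, the draft contributes 8 points and the other three components contribute 27, 21, and 30 points respectively.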

The presentation draft should be submitted 2 weeks before the presentation, and an appointment should be made to discuss it. A handout for the presentation is required: at most one DIN A4 page summarizing the main content of the presentation.
Successful presentation: max. 45 min. plus 15 min. discussion.
The documentation should be submitted within 1 week after the presentation.