About this online course


Data streams are everywhere: smartphones, IoT devices, cloud services, application logs, credit-card transactions, and clickstreams all produce them. Stream processing is already a billion-dollar industry and is expected to quadruple in size within five years.

A company's goal today is not only to analyze Big Data, but also to deliver timely results from that analysis. Data streams must be processed in real time and in a scalable fashion in order to have business value and offer operational insights.

Stream processing technology is used in various settings:

  • Data Engineer and Data Scientist teams implement scalable stream processing applications that ingest and analyze vast amounts of streamed data for monitoring and alerting systems.
  • Machine Learning Engineers deploy ML models in stream processing pipelines for fraud detection and risk assessment, and retrain those models in real time.

After finishing this course, learners will be able to identify data streaming use cases, perform analysis of data streams, and set up stream processing pipelines of different types.

This course is designed for data engineers, machine learning engineers, software engineers, and data scientists with a basic knowledge of scalable data processing techniques such as Hadoop and MapReduce.

The estimated effort required to finish this course is 16 to 20 hours, but you may take additional time (no later than May 19, 2021) to complete all the assignments at your own pace. After May 19, 2021, course materials will remain available for 6 months, but your work will not be checked by a course instructor and you will not receive a certificate.

What you'll learn:

  • In this course you will develop the skills to design scalable and efficient real-time stream processing pipelines using Apache Flink, the state-of-the-art open-source technology for stream processing.
  • After taking this course, you will be able to apply your knowledge to set up enterprise pipelines for processing application logs, monitoring data centers, and deploying ML models for real-time pattern detection and predictive analytics.


Course Syllabus:

Week 1:

Learn the basic concepts behind stream processing with examples from different industries. We will also cover various architectural patterns of stream processing technology. Practical assignments will help you install and experiment with Apache Flink.

In detail, the topics we will cover this week are:

  • The concept of stream processing as opposed to batch processing
  • Use cases from different industries, such as Internet-scale companies and the banking sector
  • Architectures and best practices for setting up streaming pipelines
  • Introduction to the Apache Flink stream processor
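The distinction in the first bullet can be made concrete with a minimal Python sketch (a conceptual illustration, not Flink code): a batch job computes its result only once all data has been collected, while a streaming job keeps an up-to-date result as each event arrives. The event values are made up for the example.

```python
# Batch processing: collect all records first, then compute once over the
# complete dataset.
def batch_average(records):
    return sum(records) / len(records)

# Stream processing: update the result incrementally as each event arrives,
# so a current answer is available at any point in time.
class StreamingAverage:
    def __init__(self):
        self.count = 0
        self.total = 0.0

    def on_event(self, value):
        self.count += 1
        self.total += value
        return self.total / self.count  # running average so far

events = [4, 8, 6, 2]
print(batch_average(events))  # 5.0, but only after all data is available

avg = StreamingAverage()
for e in events:
    current = avg.on_event(e)  # result refreshed on every event
print(current)  # 5.0
```

Both approaches converge to the same answer here; the difference is that the streaming version could have reported an intermediate average at any moment.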

Week 2:

Learn the fundamental ideas behind parallel stream processing. You will see how a massive data stream can be split into a set of smaller substreams, enabling a streaming computation to be scaled out to a cluster of machines that filter and transform the streams.

In detail, the topics we will cover this week are:

  • Partitioning strategies for streaming topologies
  • Using partitioning to parallelize the processing of streams
  • Transforming and filtering streams in parallel using a cluster of machines
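The key-based partitioning idea behind these topics can be sketched in a few lines of plain Python (illustrative only; in Flink this routing is done for you when you key a stream). The user keys and partition count are made-up example values.

```python
import zlib

NUM_PARTITIONS = 3  # hypothetical degree of parallelism

def partition(key: str) -> int:
    # Deterministic hash so a given key always maps to the same substream.
    return zlib.crc32(key.encode()) % NUM_PARTITIONS

# (key, value) events; keys are invented user IDs.
events = [("user-a", 1), ("user-b", 2), ("user-a", 3), ("user-c", 4)]

# Route each event to its substream: same key -> same substream, so any
# per-key state stays local to one parallel worker.
substreams = {i: [] for i in range(NUM_PARTITIONS)}
for key, value in events:
    substreams[partition(key)].append((key, value))

# Each substream can now be filtered and transformed independently,
# e.g. on a separate machine in the cluster.
```

Hash partitioning is only one strategy; the point is that the routing decision depends solely on the key, which is what makes scaling out correct.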

Week 3:

Learn the concepts of time, order, and streaming windows. This week we will show strategies for dealing with streams in which events can arrive late or out of order, and examine how the notion of time shapes stream processing, enabling us to create streaming windows and aggregate data.

In detail, the topics we will cover this week are:

  • The concepts of event time and processing time and their fundamental difference
  • Streaming windows in different time dimensions
  • Aggregation of streaming windows
  • Strategies to deal with streams containing out-of-order events
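The difference event time makes can be sketched with a tumbling-window example in plain Python (a simplification that ignores watermarks and window triggering, which a real stream processor handles; the timestamps and window size are invented for the example):

```python
from collections import defaultdict

WINDOW_SIZE = 10  # seconds; tumbling windows [0, 10), [10, 20), ...

def window_start(event_time):
    # Assign an event to a window by its own (event-time) timestamp,
    # not by when it happens to arrive.
    return (event_time // WINDOW_SIZE) * WINDOW_SIZE

# (timestamp, count) events; note they arrive out of order
# (the events stamped 7 and 9 arrive after the one stamped 12).
events = [(3, 1), (12, 1), (7, 1), (15, 1), (9, 1)]

windows = defaultdict(int)
for ts, count in events:
    windows[window_start(ts)] += count

print(dict(windows))  # {0: 3, 10: 2}
```

Because assignment uses each event's own timestamp, the aggregate is the same no matter the arrival order; a processing-time window would have put the late events in the wrong bucket.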

Week 4:

Join multiple data streams to gain insights from different streaming data sources. This week we will define the concept of joins on streams and we will see how joins can be used in conjunction with streaming windows.

In detail, the topics we will cover this week are:

  • Joining multiple data streams
  • How windows and joins can be used together
  • Different types of streaming joins
  • A use case from the banking industry
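The combination of windows and joins can be sketched as follows in plain Python (a toy windowed equi-join on unbounded-looking streams, not the Flink API; the user IDs, transactions, and login payloads are invented for the example):

```python
from collections import defaultdict

WINDOW = 10  # seconds per tumbling window

def window_of(ts):
    return (ts // WINDOW) * WINDOW

# Two keyed streams, e.g. card transactions and logins per user:
# (timestamp, key, payload)
left = [(2, "u1", "tx-100"), (11, "u2", "tx-101")]
right = [(5, "u1", "login-A"), (14, "u2", "login-B"), (25, "u1", "login-C")]

# Windowed equi-join: pair records that share a key AND fall into the
# same window. Windowing bounds the state the join must keep.
index = defaultdict(list)
for ts, key, payload in left:
    index[(window_of(ts), key)].append(payload)

joined = []
for ts, key, payload in right:
    for left_payload in index.get((window_of(ts), key), []):
        joined.append((key, left_payload, payload))

print(joined)  # [('u1', 'tx-100', 'login-A'), ('u2', 'tx-101', 'login-B')]
```

Note that "login-C" joins with nothing: its window contains no matching transaction, which is exactly how the window scopes the join.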



If you successfully complete this course, you will earn a professional education certificate and be eligible to receive 2.0 Continuing Education Units (CEUs).

View sample certificate


This course is primarily geared towards working professionals.


Prerequisites:

  • Undergraduate degree in Computer Science or a related field
  • Basic knowledge of SQL and database concepts
  • Ability to write simple programs in Java or Scala
  • A working, free GitHub account


If you have any questions about this course or the TU Delft online learning environment, please visit our Help & Support page.

Enroll now

  • Starts: Future dates to be announced
  • Fee: € 695
  • Group fee: contact us
  • Length: Self-Paced
  • Effort: 4 - 5 hours per week / 4 weeks

Related courses and programs