Data Management for the Sensor-Edge-Cloud Continuum

NebulaStream is a general-purpose, end-to-end data-management system for the sensor-edge-cloud continuum built around three core goals:

Ease of Use

Out-of-the-box functionality for multi-modal, multi-frequency streams (e.g., alignment, inference). Enables users to focus on business logic with well-known abstractions and concepts.

Extensibility

Empower users to easily integrate custom data connectors, formats, operators, and optimizations into the system.

Efficiency

Utilize distributed heterogeneous computing devices with hardware-tailored code, adaptive execution, and the interleaved processing of data sources to handle large workloads efficiently.

NebulaStream is a joint research project at BIFOLD, with first contributors from the DIMA Group at TU Berlin and the DFKI IAM Group.

Get started with NebulaStream

Use Docker for a containerized setup, or Nix for a native build.

Docker setup
git clone https://github.com/nebulastream/nebulastream.git &&
cd nebulastream &&
docker pull nebulastream/nes-development:latest &&
docker run --rm --workdir "$(pwd)" -v "$(pwd):$(pwd)" \
  nebulastream/nes-development:latest cmake -B cmake-build-debug &&
docker run --rm --workdir "$(pwd)" -v "$(pwd):$(pwd)" \
  nebulastream/nes-development:latest cmake --build cmake-build-debug --parallel

Tested on Linux. Docker uses the published NebulaStream development container; Nix requires flakes-enabled Nix.

NebulaStream Vision

Our goal is to process thousands of queries over millions of heterogeneous sources in a massively distributed environment. We achieve this through five core technologies:

  1. Heterogeneous Hardware Support: Supports a wide range of devices including different architectures (e.g., ARM, x86) and accelerators (e.g., GPUs, TPUs).
  2. Code Generation: Compiles every query to efficient, low-energy native code.
  3. In-Network Processing: Pushes operators as close as possible to the data source to reduce network traffic.
  4. On-Demand Gathering: Utilizes all available processing capabilities from the source to the sink to apply processing as early as possible, thereby reducing network traffic as much as possible.
  5. Adaptive Resource Management: Reacts to topology or workload changes without interrupting queries.

[Figure: NebulaStream core-technology diagram]
[Figure: NebulaStream architecture diagram]
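
The effect of in-network processing (item 3 above) can be illustrated with a small sketch. This is not NebulaStream code: the `Reading` type, the two placement functions, and the sample values are hypothetical and exist only to show that applying a filter on the source node, rather than in the cloud, yields the same result while shipping fewer tuples over the network.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Reading:
    """Hypothetical sensor tuple used only for this illustration."""
    sensor_id: str
    temperature: float


Predicate = Callable[[Reading], bool]


def run_in_cloud(readings: List[Reading], predicate: Predicate) -> Tuple[List[Reading], int]:
    """Ship every reading to the cloud, then filter there."""
    shipped = list(readings)  # every tuple crosses the network
    result = [r for r in shipped if predicate(r)]
    return result, len(shipped)


def run_at_source(readings: List[Reading], predicate: Predicate) -> Tuple[List[Reading], int]:
    """Filter on the source node and ship only the matching readings."""
    shipped = [r for r in readings if predicate(r)]  # only matches cross the network
    return shipped, len(shipped)


def is_hot(reading: Reading) -> bool:
    return reading.temperature > 30.0


readings = [Reading("s1", t) for t in (18.0, 21.5, 35.2, 19.9, 40.1)]

cloud_result, cloud_shipped = run_in_cloud(readings, is_hot)
edge_result, edge_shipped = run_at_source(readings, is_hot)

print(f"cloud placement shipped {cloud_shipped} tuples")
print(f"source placement shipped {edge_shipped} tuples")
```

Both placements return the same matching readings; only the number of tuples sent upstream differs, which is the quantity that operator placement near the source aims to reduce.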

NebulaStream Architecture

A modular pipeline that stretches from sensor to cloud, optimizing every hop along the way.

  • 1 Sources & Sinks: Users can send their data using different source connectors and input formats. Commonly used source connectors include JDBC, MQTT, and TCP, and common input formats include CSV and JSON, which we provide to the user out-of-the-box. In addition, users can add custom connectors or formats. Similarly, users can customize connectors and formatters in the Sink Manager.
  • 2 I/O Handling: Unlike other SPEs that handle sources individually and synchronously by assigning one thread per source, NebulaStream interleaves source processing via thread sharing within its own I/O thread pool and applies asynchronous callbacks to reduce waiting time.
  • 3 Query Submission: Users can submit queries in our SQL-like query language. NebulaStream provides many built-in operations, like re-sampling and inference. Moreover, it allows users to specify their own operators.
  • 4 Query Optimization: After submission, a query plan is created and optimized before hardware-tailored code is generated. The user can modify the optimizations by providing their own rules to the rule engine.
  • 5 Adaptive Runtime: During runtime, the query engine schedules query processing in a highly dynamic manner using task abstractions.
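
The interleaved I/O handling described in step 2 can be sketched in a few lines. This is an illustration under stated assumptions, not NebulaStream's implementation: it uses Python's single-threaded `asyncio` event loop as a stand-in for the shared I/O thread pool, and the source names, delays, and the `on_tuple` callback are hypothetical. The point is that several sources make progress on one shared thread, because waiting on one source does not block the others.

```python
import asyncio
import threading
from typing import Callable, List, Tuple

Callback = Callable[[str, int, int], None]


async def read_source(name: str, values: List[int], delay: float, on_tuple: Callback) -> None:
    """Simulate one source: wait for I/O, then pass each value to a callback."""
    for value in values:
        # Stands in for waiting on a socket or message broker; while this
        # source waits, the event loop serves the other sources.
        await asyncio.sleep(delay)
        on_tuple(name, value, threading.get_ident())


async def run_sources(sources: List[Tuple[str, List[int], float]]) -> List[Tuple[str, int, int]]:
    """Run all sources concurrently and collect (source, value, thread id)."""
    received: List[Tuple[str, int, int]] = []

    def on_tuple(name: str, value: int, thread_id: int) -> None:
        received.append((name, value, thread_id))

    await asyncio.gather(*(read_source(n, v, d, on_tuple) for n, v, d in sources))
    return received


# Hypothetical sources with different arrival rates.
sources = [("mqtt", [1, 2, 3], 0.01), ("tcp", [10, 20], 0.015), ("csv", [100], 0.005)]
received = asyncio.run(run_sources(sources))

threads_used = {thread_id for _, _, thread_id in received}
print(f"{len(received)} tuples from {len(sources)} sources on {len(threads_used)} thread(s)")
```

A thread-per-source design would instead dedicate one blocked thread to each of the three sources; here the callbacks for all of them run on the same thread, and the order of tuples from each individual source is preserved.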

Publications


Project Overview

NebulaStream: An Extensible, High-Performance Streaming Engine for Multi-Modal Edge Applications
SIGMOD 2025 | Adrian Michalke, Aljoscha Lepping, Volker Markl, Ricardo Martinez, Nils Schubert, Lukas Schwerdtfeger, Taha Tekdogan, Steffen Zeuch, Ariane Ziehn, Christoph Falkensteiner, Kyle Krüger, Alexander Meyer, Tobias Röschl, Svea Wilkending
Using and Enhancing NebulaStream – A Tutorial
DEBS 2024 | Steffen Zeuch, Ankit Chaudhary, Viktor Rosenfeld, Taha Tekdogan, Adrian Michalke, Matthis Gördel, Ariane Ziehn, Volker Markl
Showcasing Data Management Challenges for Future IoT Applications with NebulaStream
VLDB 2023 | Aljoscha Lepping, Hoang Mi Pham, Laura Mons, Balint Rueb, Ankit Chaudhary, Philipp M. Grulich, Steffen Zeuch, Volker Markl
NebulaStream: Complex Analytics Beyond the Cloud
VLIoT 2020 | Steffen Zeuch, Eleni Tzirita Zacharatou, Shuhao Zhang, Xenofon Chatziliadis, Ankit Chaudhary, Bonaventura Del Monte, Dimitrios Giouroukis, Philipp M. Grulich, Ariane Ziehn, Volker Markl
The NebulaStream Platform: Data and Application Management for the Internet of Things
CIDR 2020 | Steffen Zeuch, Ankit Chaudhary, Bonaventura Del Monte, Haralampos Gavriilidis, Dimitrios Giouroukis, Philipp M. Grulich, Sebastian Bress, Jonas Traub, Volker Markl

System Publications

NebulaStream: An Adaptive and Efficient Multi-query Stream Processing Engine
ICDE 2026 | Nils L. Schubert, Lukas Schwerdtfeger, Sara Schnaterbeck, Philipp M. Grulich, Bonaventura Del Monte, Steffen Zeuch, Volker Markl
Incremental Stream Query Placement in Massively Distributed and Volatile Infrastructures
ICDE 2025 | Ankit Chaudhary, Kaustubh Beedkar, Jeyhun Karimov, Felix Lang, Steffen Zeuch, Volker Markl
Fault Tolerance Placement in the Internet of Things
SIGMOD 2024 | Anastasiia Kozar, Bonaventura Del Monte, Steffen Zeuch, Volker Markl
Query Compilation Without Regrets
SIGMOD 2024 | Philipp M. Grulich, Aljoscha P. Lepping, Dwi P. A. Nugroho, Bonaventura Del Monte, Varun Pandey, Steffen Zeuch, Volker Markl
Efficient Placement of Decomposable Aggregation Functions for Stream Processing over Large Geo-Distributed Topologies
VLDB 2024 | Xenofon Chatziliadis, Eleni Tzirita Zacharatou, Alphan Eracar, Steffen Zeuch, Volker Markl
Incremental Stream Query Merging
EDBT 2023 | Ankit Chaudhary, Jeyhun Karimov, Steffen Zeuch, Volker Markl
Rethinking Stateful Stream Processing with RDMA
SIGMOD 2022 | Bonaventura Del Monte, Steffen Zeuch, Tilmann Rabl, Volker Markl
Babelfish: Efficient Execution of Polyglot Queries
VLDB 2022 | Philipp M. Grulich, Steffen Zeuch, Volker Markl
An Energy-Efficient Stream Join for the Internet of Things
DAMON 2021 | Adrian Michalke, Philipp M. Grulich, Clemens Lutz, Steffen Zeuch, Volker Markl
Streaming Data through the IoT via Actor-Based Semantic Routing Trees
VLIoT 2021 | Dimitrios Giouroukis, Johannes Jestram, Steffen Zeuch, Volker Markl
Monitoring of Stream Processing Engines Beyond the Cloud: an Overview
VLIoT 2021 | Xenofon Chatziliadis, Eleni Tzirita Zacharatou, Steffen Zeuch, Volker Markl
ExDRa: Exploratory Data Science on Federated Raw Data
SIGMOD 2021 | Sebastian Baunsgaard, Matthias Boehm, Ankit Chaudhary, Behrouz Derakhshan, Stefan Geißelsöder, Philipp Grulich, Michael Hildebrand, Kevin Innerebner, Volker Markl, Claus Neubauer, Sarah Osterburg, Olga Ovcharenko, Sergey Redyuk, Tobias Rieger, Alireza Rezaei Mahdiraji, Sebastian Benjamin Wrede, Steffen Zeuch
Parallelizing Intra-Window Join on Multicores: An Experimental Study
SIGMOD 2021 | Shuhao Zhang, Yancan Mao, Jiong He, Philipp M Grulich, Steffen Zeuch, Bingsheng He, Richard TB Ma, Volker Markl
Towards Resilient Data Management for the Internet of Moving Things
BTW 2021 | Elena Beatriz Ouro Paz, Eleni Tzirita Zacharatou, Volker Markl
Demand-based Sensor Data Gathering with Multi-Query Optimization
VLDB 2020 | Julius Hülsmann, Jonas Traub, Volker Markl
Complex Event Processing for the Internet of Things
VLDB 2020 PhD Workshop | Ariane Ziehn
A Survey of Adaptive Sampling and Filtering Algorithms for the Internet of Things
DEBS 2020 | Dimitrios Giouroukis, Alexander Dadian, Jonas Traub, Steffen Zeuch, Volker Markl
Rhino: Efficient Management of Very Large Distributed State for Stream Processing Engines
SIGMOD 2020 | Bonaventura Del Monte, Steffen Zeuch, Tilmann Rabl, Volker Markl
Grizzly: Efficient Stream Processing Through Adaptive Query Compilation
SIGMOD 2020 | Philipp M. Grulich, Sebastian Breß, Steffen Zeuch, Jonas Traub, Janis von Bleichert, Zongxiong Chen, Tilmann Rabl, Volker Markl
Scaling a Public Transport Monitoring System to Internet of Things Infrastructures
EDBT 2020 | Haralampos Gavriilidis, Adrian Michalke, Laura Mons, Steffen Zeuch, Volker Markl
Governor: Operator Placement for a Unified Fog-Cloud Environment
EDBT 2020 | Ankit Chaudhary, Steffen Zeuch, Volker Markl
Disco: Efficient Distributed Window Aggregation
EDBT 2020 | Lawrence Benson, Philipp M. Grulich, Steffen Zeuch, Volker Markl, Tilmann Rabl
SENSE: Scalable Data Acquisition from Distributed Sensors with Guaranteed Time Coherence
arXiv Preprint 2019 | Jonas Traub, Julius Hülsmann, Sebastian Breß, Tilmann Rabl, Volker Markl
Analyzing Efficient Stream Processing on Modern Hardware
VLDB 2019 | Steffen Zeuch, Sebastian Breß, Tilmann Rabl, Bonaventura Del Monte, Jeyhun Karimov, Clemens Lutz, Manuel Renz, Jonas Traub, Volker Markl
Efficient Window Aggregation with General Stream Slicing
EDBT 2019 | Jonas Traub, Philipp Grulich, Alejandro Rodríguez Cuéllar, Sebastian Breß, Asterios Katsifodimos, Tilmann Rabl, Volker Markl
Resense: Transparent Record and Replay of Sensor Data in the Internet of Things
EDBT 2019 | Dimitrios Giouroukis, Julius Hülsmann, Janis von Bleichert, Morgan Geldenhuys, Tim Stullich, Felipe Gutierrez, Jonas Traub, Kaustubh Beedkar, Volker Markl
Generating Reproducible Out-Of-Order Data Streams
DEBS 2019 | Philipp M. Grulich, Jonas Traub, Sebastian Breß, Asterios Katsifodimos, Tilmann Rabl, Volker Markl
Optimized On-Demand Data Streaming from Sensor Nodes
SoCC 2017 | Jonas Traub, Sebastian Breß, Tilmann Rabl, Asterios Katsifodimos, Volker Markl

Project Leads


Maintainers


Nils Schubert

Research Associate

Aljoscha Lepping

Research Associate

Leonhard Rose

Research Associate

Yannik Schröder

Research Associate

Current Researchers


Alumni


Collaborate with us!

Opportunities

Feel free to reach out to us to learn more about research opportunities as a Postdoc, PhD student, or student assistant. Furthermore, motivated students can also inquire about the possibility of pursuing a Bachelor’s or Master’s thesis with us. Our research topics span all aspects of the sensor-edge-cloud continuum: query compilation, query optimization, query processing, query languages, distributed data processing, complex-event processing, machine learning, signal processing, sensor networks, fog computing, temporal-spatial query processing, transactional data processing, and modern hardware, among others.

Contact

Database Systems and Information Management (DIMA) Group
Technische Universität Berlin
Sekr. E-N 7, Room E-N 728
Einsteinufer 17
10587 Berlin
Germany
+49 30 314 23555
nebulastream(at)dima.tu-berlin.de