Stream Processing

DejaVu

The DejaVu project explores scalable complex event processing techniques for streams of events. The goal is to provide a system that can seamlessly integrate pattern detection over live and historical streams of events behind a common, declarative interface. We are investigating various optimization ideas for efficient data access and query execution.


MaxStream - ECC Project 

Despite the availability of several commercial data stream processing engines (SPEs), it remains hard to develop and maintain streaming applications. A major difficulty is the lack of standards, and the wide (and changing) variety of application requirements. Consequently, existing SPEs vary widely in data and query models, APIs, functionality, and optimization capabilities. This has led to some organizations using multiple SPEs based on their application needs. Furthermore, management of stored data and streaming data are still mostly separate concerns, although applications increasingly require integrated access to both. In the MaxStream project, our goal is to design and build a federated stream processing architecture that seamlessly integrates multiple autonomous and heterogeneous SPEs with traditional databases behind a common SQL-based declarative query interface and a common API, in a way that facilitates the incorporation of new functionality and requirements.


SECRET

There are many academic and commercial stream processing engines (SPEs) today, each with its own execution semantics. This variation can lead to seemingly inexplicable differences in query results. SECRET takes up this challenge: it is a descriptive model that allows users to analyze the behavior of systems and understand the results of window-based queries (with time- and tuple-based windows) across a broad range of heterogeneous SPEs.
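As a toy illustration of why window semantics matter (this is not SECRET's formal model; the function names and the sample stream below are invented), the sketch applies a time-based and a tuple-based window with the same nominal size and slide to the same input and obtains different groupings:

```python
# Tuples are (timestamp, value) pairs; stream is assumed sorted by timestamp.

def time_windows(stream, size, slide):
    """Group tuples into time-based windows covering [t, t + size)."""
    if not stream:
        return []
    start = stream[0][0]
    end = max(ts for ts, _ in stream)
    windows, t = [], start
    while t <= end:
        windows.append([v for ts, v in stream if t <= ts < t + size])
        t += slide
    return windows

def tuple_windows(stream, size, slide):
    """Group tuples into count-based windows of `size` tuples each."""
    values = [v for _, v in stream]
    return [values[i:i + size] for i in range(0, len(values), slide)
            if values[i:i + size]]

stream = [(0, 'a'), (1, 'b'), (5, 'c'), (6, 'd')]
print(time_windows(stream, size=3, slide=3))   # [['a', 'b'], ['c'], ['d']]
print(tuple_windows(stream, size=3, slide=3))  # [['a', 'b', 'c'], ['d']]
```

The time-based windowing separates 'c' and 'd' by timestamp, while the tuple-based windowing groups 'c' with 'a' and 'b' by arrival count; an aggregate over these windows would therefore differ between the two semantics, which is exactly the kind of divergence SECRET is designed to explain.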


UpStream

Most data stream processing systems model streams as append-only sequences of data elements. In this model, the application expects to receive a query answer on the complete stream. However, there are many situations in which each data element in the stream is in fact an update to a previous one, and therefore, the most recent value is all that really matters to the application. In UpStream, we explore how to efficiently process continuous queries under such an update-based stream data model.


UpStream - Past Project

Project Description 

Most data stream processing systems model their inputs as append-only sequences of data elements. In this model, the application expects to receive a query answer on the complete input stream. However, there are many situations in which each data element (or a window of data elements) in the stream is in fact an update to a previous one, and therefore, the most recent arrival is all that really matters to the application. UpStream defines a storage-centric approach to efficiently processing continuous queries under such an update-based stream data model. The goal is to provide the most up-to-date answers to the application with the lowest possible staleness. To achieve this, we developed a lossy tuple storage model (called an "update queue"), which, under high load, sacrifices old tuples in favor of newer ones using a number of different update key scheduling heuristics. Our techniques can correctly process queries with different types of streaming operators (including sliding windows), while efficiently handling large numbers of update keys with different update frequencies.
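A minimal sketch of the idea behind such an update queue, assuming a simple FIFO-per-key discipline rather than any of UpStream's actual scheduling heuristics (the class and the ticker data below are hypothetical):

```python
from collections import OrderedDict

class UpdateQueue:
    """Lossy queue with one slot per update key: a new arrival for a
    pending key overwrites the stale tuple instead of queueing behind it."""

    def __init__(self):
        self._slots = OrderedDict()  # key -> latest pending tuple

    def push(self, key, value):
        # Drop any stale pending tuple for this key, then re-insert at the
        # back (FIFO per key; just one possible scheduling policy).
        self._slots.pop(key, None)
        self._slots[key] = value

    def pop(self):
        # Dequeue the key whose pending update has waited longest.
        return self._slots.popitem(last=False)

    def __len__(self):
        return len(self._slots)

q = UpdateQueue()
q.push('AAPL', 101.0)
q.push('MSFT', 55.2)
q.push('AAPL', 102.5)   # overwrites the stale AAPL price; no growth in size
print(len(q))           # 2
print(q.pop())          # ('MSFT', 55.2)
print(q.pop())          # ('AAPL', 102.5)
```

The key property is that memory is bounded by the number of distinct keys rather than by the arrival rate, so under overload the queue stays small while the application still sees the freshest value per key.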

Project Members

Current: Alexandru Moga, Nesime Tatbul
Former: Irina Botan

Publications

MaxStream - Past Project

Project Description

Despite the availability of several commercial data stream processing engines (SPEs), it remains hard to develop and maintain streaming applications. A major difficulty is the lack of standards, and the wide (and changing) variety of application requirements. Consequently, existing SPEs vary widely in data and query models, APIs, functionality, and optimization capabilities. This has led to some organizations using multiple SPEs based on their application needs. Furthermore, management of stored data and streaming data are still mostly separate concerns, although applications increasingly require integrated access to both. In the MaxStream project, our goal is to design and build a federated stream processing architecture that seamlessly integrates multiple autonomous and heterogeneous SPEs with traditional databases behind a common SQL-based declarative query interface and a common API, in a way that facilitates the incorporation of new functionality and requirements.

Project Members

  • Current: Nihal Dindar, Nesime Tatbul (ETH Zurich); Laura Haas (IBM Almaden); Renee Miller (University of Toronto).
  • Former: Irina Botan, Roozbeh Derakhshan, Younggoo Cho, Kihong Kim, Chulwon Lee, Beomjin Yun (SAP Labs Korea); Girish Mundada (SAP Labs USA); Ming-Chien Shan, Ying Yan, Jin Zhang (SAP Labs China).

Publications

More Information

Contact: Prof. Dr. Nesime Tatbul

DejaVu - Past Project

Project Description

The DejaVu project explores scalable complex event processing techniques for streams of events. The goal is to provide a system that can seamlessly integrate pattern detection over live and historical streams of events behind a common, declarative interface. We are investigating various optimization ideas for efficient data access and query execution.
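As a rough illustration of sequence-pattern detection over an event stream (a sketch only; DejaVu's declarative query language and matching semantics are far richer, and the function, event names, and consumption policy below are invented):

```python
def match_seq(events, first, then, within):
    """Yield (t1, t2) timestamp pairs where an event of kind `then`
    follows an event of kind `first` within `within` time units.
    `events` is a time-ordered list of (timestamp, kind) pairs."""
    pending = []  # open partial matches: start timestamps awaiting `then`
    for ts, kind in events:
        # Discard partial matches whose time window has expired.
        pending = [t for t in pending if ts - t <= within]
        if kind == then:
            for t in pending:
                yield (t, ts)
            pending = []  # consume matched starts (one simple policy)
        if kind == first:
            pending.append(ts)

events = [(1, 'login'), (2, 'login'), (4, 'purchase'), (9, 'purchase')]
print(list(match_seq(events, 'login', 'purchase', within=5)))
# [(1, 4), (2, 4)]
```

Running the same matcher over a historical log and a live feed is trivial here because both are just time-ordered sequences; doing it efficiently at scale, behind one declarative interface, is the hard part DejaVu targets.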

Project Members

  • Nihal Dindar, Nesime Tatbul, Baris Guc, Patrick Lau, Asli Ozal, Merve Soner

Publications