The UpStream Project
Project Description
Most data stream processing systems model their inputs as append-only sequences of data elements. In this model, the application expects to receive a query answer over the complete input stream. However, there are many situations in which each data element (or window of data elements) in the stream is in fact an update to a previous one, so the most recent arrival is all that really matters to the application. UpStream defines a storage-centric approach to efficiently processing continuous queries under such an update-based stream data model. The goal is to provide the application with the most up-to-date answers, at the lowest possible staleness. To achieve this, we developed a lossy tuple storage model (called an "update queue") which, under high load, sacrifices old tuples in favor of newer ones, using one of several update key scheduling heuristics. Our techniques correctly process queries with different types of streaming operators (including sliding windows), while efficiently handling large numbers of update keys with different update frequencies.
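As an illustration of the update queue idea, the following is a minimal Python sketch, not the UpStream implementation. It assumes one pending tuple per update key and a simple first-in-first-out ordering of keys as the scheduling heuristic; the project's actual heuristics are not described here. The class name UpdateQueue, its methods, and the stock-ticker keys are hypothetical and chosen only for the example.

```python
from collections import OrderedDict


class UpdateQueue:
    """Lossy queue that keeps at most one pending tuple per update key.

    Assumption: keys are served in the order they first became pending
    (FIFO over keys). This stands in for the scheduling heuristics,
    which are not specified in the text above.
    """

    def __init__(self):
        # Maps update key -> most recent pending value, in key-arrival order.
        self._pending = OrderedDict()
        # Number of stale tuples overwritten before they were processed.
        self.dropped = 0

    def enqueue(self, key, value):
        if key in self._pending:
            # A newer update supersedes the pending one.
            # Assigning to an existing key keeps its position in line.
            self.dropped += 1
        self._pending[key] = value

    def dequeue(self):
        """Return the (key, value) pair of the key that has waited longest.

        Raises KeyError if the queue is empty.
        """
        return self._pending.popitem(last=False)

    def __len__(self):
        return len(self._pending)


# Hypothetical input: three updates for "AAPL" arrive before any is processed.
queue = UpdateQueue()
for key, value in [("AAPL", 101), ("MSFT", 55), ("AAPL", 102), ("AAPL", 103)]:
    queue.enqueue(key, value)

first = queue.dequeue()   # newest value for the key that has waited longest
second = queue.dequeue()
```

In this sketch the queue never holds more tuples than there are distinct update keys, so memory stays bounded under overload, and the consumer only ever sees the latest value for a key. A different scheduling heuristic would change only which key dequeue picks next.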
Project Members
Current: Alexandru Moga, Nesime Tatbul
Former: Irina Botan
Publications
- A. Moga, I. Botan, N. Tatbul, "UpStream: Storage-centric Load Management for Streaming Applications with Update Semantics", VLDB Journal, to appear (published online: April 2011).
- A. Moga, N. Tatbul, "UpStream: A Storage-centric Load Management System for Real-time Update Streams", Demonstration, VLDB Conference, Seattle, WA, USA, August 2011.
- A. Moga, "UpStream: Storage-centric Load Management for Data Stream Processing Systems", VLDB PhD Workshop, Singapore, September 2010.
- A. Moga, I. Botan, N. Tatbul, "UpStream: Storage-centric Load Management for Data Streams with Update Semantics", Technical Report TR-620, ETH Zurich Department of Computer Science, March 2009.
More Information
Contact: Alexandru Moga



