Data Processing Architectures for New Hardware Platforms
Overview
The hardware landscape has changed tremendously over the past decades. Yet even the latest incarnations of mainstream operating and database systems are essentially built on the same assumptions that their earliest predecessors faced forty years ago: systems run on a single processor with small but fast random-access memory, data largely resides on small, bulky disk drives, and the primary optimization goal is I/O from/to disk.
Nothing could be further from these assumptions than today's hardware systems, and it is not surprising that an increasing number of systems researchers proclaim that "the end of an architectural era" is near and demand a "complete rewrite" of the antiquated software.
Such a complete rewrite will certainly contain a number of ideas from the latest research papers that we look at in the course of this seminar. Novel adaptations of traditional algorithms exploit the characteristics of modern computing hardware, such as database algorithms that are aware of the caching hierarchy in today's systems or take advantage of vector-oriented processing (also known as SIMD instructions) on modern CPUs. In the face of multi-core CPUs and cache-loaded I/O devices, new task scheduling or I/O control mechanisms strive to maximize processing performance. And is the current boundary between operating systems and databases the right one, after all?
This seminar is at the edge of current research, so be prepared to hear some visionary ideas with often unexpected outcomes. And have fun experimenting with these ideas on your own system if you like!
This course is part of the Information and Communications Systems Seminar, VVZ 252-3500-04L.
Students are well advised to read this guide on how to get high grades in systems seminars.
Organisation
Format
This course will be structured similarly to other seminar courses:
- One or two papers on a given topic will be selected for each week, and students will be assigned to topics
- Everyone reads the papers and emails a short summary in advance
- The assigned student presents the selected papers, and leads into a general discussion to which all students are expected to contribute
- Students submit a short report on their presentation topic
More details can be found in the introduction lecture slides.
Venue
Tuesdays 15:15–17:00 in IFW A36
Staff
- Andrew Baumann
Schedule and assigned papers
| Date | Topic and assigned papers |
| --- | --- |
| September 23rd | Introduction slides (PDF) |
| September 30th | No seminar |
| October 7th | SEDA, Cache-conscious radix-decluster projections |
| October 14th | Optimistic intra-transaction parallelism on CMP, Improving DB performance on SMT |
| October 21st | No seminar |
| October 28th | MonetDB/X100, Super-Scalar RAM-CPU Cache Compression |
| November 4th | On multidimensional data and modern disks, Rethink the Sync |
| November 11th | Cooperative Scans |
| November 18th | No seminar (change of schedule) |
| November 25th | QPipe, Corey (change of schedule) |
| December 2nd | No seminar |
| December 9th | No seminar |
| December 16th | No seminar |
Papers
This is a preliminary list of selected papers. We may add or remove papers depending on enrolments. An assignment of topics will be made in the first week.
- SEDA: an architecture for well-conditioned, scalable internet services. Matt Welsh, David Culler, Eric Brewer. SOSP 2001.
  Presenter: Animesh Trivedi
- Cache-Conscious Radix-Decluster Projections. Stefan Manegold, Peter Boncz, Niels Nes, Martin Kersten. VLDB 2004.
  Background: Optimizing main-memory join on modern hardware. Stefan Manegold, Peter Boncz, Martin Kersten. IEEE TKDE 14(4) 2002.
- Optimistic Intra-Transaction Parallelism on Chip Multiprocessors. Christopher B. Colohan, Anastassia Ailamaki, J. Gregory Steffan, Todd C. Mowry. VLDB 2005.
  Background: Incrementally parallelizing database transactions with thread-level speculation. Christopher B. Colohan, Anastassia Ailamaki, J. Gregory Steffan, Todd C. Mowry. ACM TOCS 26(1) 2008.
  Presenter: Ramon Küpfer
- Improving database performance on simultaneous multithreading processors. Jingren Zhou, John Cieslewicz, Kenneth A. Ross, Mihir Shah. VLDB 2005.
  Presenter: Sandeep Bhardwaj
- MonetDB/X100: Hyper-Pipelining Query Execution. Peter Boncz, Marcin Zukowski, Niels Nes. CIDR 2005.
  Presenter: Ivan Krivulev
- Super-Scalar RAM-CPU Cache Compression. Marcin Zukowski, Sandor Heman, Niels Nes, Peter Boncz. ICDE 2006.
  Presenter: Kajetan Abt
- On multidimensional data and modern disks. Steven W. Schlosser, Jiri Schindler, Stratos Papadomanolakis, Minglong Shao, Anastassia Ailamaki, Christos Faloutsos, Gregory R. Ganger. FAST 2005.
  Presenter: Merve Soner
- Rethink the Sync. Edmund B. Nightingale, Kaushik Veeraraghavan, Peter M. Chen, Jason Flinn. OSDI 2006.
  Presenter: Asli Özal
- Cooperative scans: dynamic bandwidth sharing in a DBMS. Marcin Zukowski, Sándor Héman, Niels Nes, Peter Boncz. VLDB 2007.
  Presenter: Fabian Schlup
- QPipe: a simultaneously pipelined relational query engine. Stavros Harizopoulos, Vladislav Shkapenyuk, Anastassia Ailamaki. SIGMOD 2005.
  Presenter: Tatyana Nikolayeva
- Corey: an operating system for many cores. Silas Boyd-Wickizer, Haibo Chen, Rong Chen, Yandong Mao, Frans Kaashoek, Robert Morris, Aleksey Pesterev, Lex Stein, Ming Wu, Yuehua Dai, Yang Zhang, Zheng Zhang. OSDI 2008.
  Presenter: Baris Güç



