Data Processing Architectures for New Hardware Platforms

Overview

The hardware landscape has changed tremendously over the past decades. Yet even the latest incarnations of mainstream operating and database systems are essentially built on the same assumptions that their earliest predecessors faced forty years ago: a single processor operates a system with small but fast random-access memory, data largely resides on small, bulky disk drives, and the primary optimization goal is to minimize I/O to and from disk.

Today's hardware could hardly be further from these assumptions, so it is not surprising that an increasing number of systems researchers proclaim that "the end of an architectural era" is near and demand a "complete rewrite" of the antiquated software.

Such a complete rewrite will certainly contain a number of ideas from the latest research papers, which we look at in the course of this seminar. Novel adaptations of traditional algorithms exploit the characteristics of modern computing hardware: for example, database algorithms that are aware of the cache hierarchy in today's systems or that take advantage of vector-oriented processing (SIMD instructions) on modern CPUs. In the face of multi-core CPUs and cache-loaded I/O devices, new task scheduling and I/O control mechanisms strive to maximize processing performance. And is the current boundary between operating systems and databases the right one, after all?

This seminar is at the edge of current research, so be prepared to hear some visionary ideas, often with unexpected outcomes. And have fun experimenting with these ideas on your own system if you like!

This course is part of the Information and Communications Systems Seminar, VVZ 252-3500-04L.

Students are well advised to read this guide on how to get high grades in systems seminars.

Organisation

Format

This course will be structured similarly to other seminar courses:

  • One or two papers on a given topic will be selected for each week, and students will be assigned to topics
  • Everyone reads the papers and emails a short summary in advance
  • The assigned student presents the selected papers, and leads into a general discussion to which all students are expected to contribute
  • Students submit a short report on their presentation topic

More details can be found in the introduction lecture slides.

Venue

Tuesdays 15:15–17:00 in IFW A36

Staff

  • Andrew Baumann

Schedule and assigned papers

September 23rd Introduction slides (PDF)
September 30th No seminar
October 7th SEDA, Cache-conscious radix-decluster projections
October 14th Optimistic intra-transaction parallelism on CMP, Improving DB performance on SMT
October 21st No seminar
October 28th MonetDB/X100, Super-Scalar RAM-CPU Cache Compression
November 4th On multidimensional data and modern disks, Rethink the Sync
November 11th Cooperative Scans
November 18th No seminar (change of schedule)
November 25th QPipe, Corey (change of schedule)
December 2nd No seminar
December 9th No seminar
December 16th No seminar

Papers

This is a preliminary list of selected papers. We may add or remove papers depending on enrolments. An assignment of topics will be made in the first week.

  1. SEDA: an architecture for well-conditioned, scalable internet services. Matt Welsh, David Culler, Eric Brewer. SOSP 2001.
    Presenter: Animesh Trivedi
  2. Cache-Conscious Radix-Decluster Projections. Stefan Manegold, Peter Boncz, Niels Nes, Martin Kersten. VLDB 2004.
    Background: Optimizing main-memory join on modern hardware. Stefan Manegold, Peter Boncz, Martin Kersten. IEEE TKDE 14(4) 2002.
  3. Optimistic Intra-Transaction Parallelism on Chip Multiprocessors. Christopher B. Colohan, Anastassia Ailamaki, J. Gregory Steffan, Todd C. Mowry. VLDB 2005.
    Background: Incrementally parallelizing database transactions with thread-level speculation. Christopher B. Colohan, Anastassia Ailamaki, J. Gregory Steffan, Todd C. Mowry. ACM TOCS 26(1) 2008.
    Presenter: Ramon Küpfer
  4. Improving database performance on simultaneous multithreading processors. Jingren Zhou, John Cieslewicz, Kenneth A. Ross, Mihir Shah. VLDB 2005.
    Presenter: Sandeep Bhardwaj
  5. MonetDB/X100: Hyper-Pipelining Query Execution. Peter Boncz, Marcin Zukowski, Niels Nes. CIDR 2005.
    Presenter: Ivan Krivulev
  6. Super-Scalar RAM-CPU Cache Compression. Marcin Zukowski, Sándor Héman, Niels Nes, Peter Boncz. ICDE 2006.
    Presenter: Kajetan Abt
  7. On multidimensional data and modern disks. Steven W. Schlosser, Jiri Schindler, Stratos Papadomanolakis, Minglong Shao, Anastassia Ailamaki, Christos Faloutsos, Gregory R. Ganger. FAST 2005.
    Presenter: Merve Soner
  8. Rethink the Sync. Edmund B. Nightingale, Kaushik Veeraraghavan, Peter M. Chen, Jason Flinn. OSDI 2006.
    Presenter: Asli Özal
  9. Cooperative scans: dynamic bandwidth sharing in a DBMS. Marcin Zukowski, Sándor Héman, Niels Nes, Peter Boncz. VLDB 2007.
    Presenter: Fabian Schlup
  10. QPipe: a simultaneously pipelined relational query engine. Stavros Harizopoulos, Vladislav Shkapenyuk, Anastassia Ailamaki. SIGMOD 2005.
    Presenter: Tatyana Nikolayeva
  11. Corey: an operating system for many cores. Silas Boyd-Wickizer, Haibo Chen, Rong Chen, Yandong Mao, Frans Kaashoek, Robert Morris, Aleksey Pesterev, Lex Stein, Ming Wu, Yuehua Dai, Yang Zhang, Zheng Zhang. OSDI 2008.
    Presenter: Baris Güç