Algorithms for Database Systems (Seminar)


Please submit your final report to all three professors by email. If you already sent us the final report but do not see it uploaded at the end of this course webpage within the next 7 days, please send us a follow-up email.


Peter Widmayer (ETH), Arijit Khan (ETH), Michael Böhlen (UZH)

Emails: widmayer AT, arijit.khan AT, boehlen AT


Overview and Objectives:

The theme of the seminar this year is Big Data. The seminar will address various topics in this area: Algorithms, Machine Learning, Data Mining, and Applications.

Students learn how to critically read and study research papers, how to summarize the contents of a paper, and how to present it in a seminar.


Teaching Format:

Each participant writes a self-contained report of about 10 pages (single-column is fine) and gives a 30-minute presentation. This year, all presentations will be at the blackboard only. (No computers; no PowerPoint!)

Each participant is paired with another participant who serves as a shepherd (aka buddy) for the report and presentation. Buddies read the report, make suggestions for improvements, and help with the presentation (e.g., dry runs).

The first version of the report is due two weeks before the date of the presentation. (No excuses!) This first version of the report and the presentation will be discussed with the buddy and a professor one week before the presentation. The final version of the report is due at the end of the semester.

Grading will depend on the quality of the report and the talk, active participation during the seminar, and impact as a shepherd.


Setup and Organization:

The setup of the seminar will be discussed on Tuesday, February 17, from 14:15 to 16:00 in room CAB H 52. In this meeting, the seminar topics will be presented and assigned to participants. The seminar talks will be given in two blocks on two Saturdays: March 21 and April 25. Participation on both Saturdays is mandatory.


Selection of Students/ Papers:

Please select exactly three papers from the list below that you would like to present and list them in descending order of preference. Send the list by email to <arijit.khan AT> with the subject "Algorithms for Database Systems Seminar - 3 Topics" before Thursday (Feb 19) 12:00 noon. No late emails will be considered. We will select the students for this seminar based on your emails. Due to the large number of registrations, we unfortunately cannot guarantee that every registered student will get a place in this seminar, nor can we guarantee that a student will be assigned one of his/her top three choices.


First Presentation:

On Saturday, March 21, we will have the first session of our seminar.  There will be eight talks (each about 30 minutes).  The coordinates are: 

  Location:  CAB H52
  Starting Time:  8:15 am (sharp)

Since the building will be locked, we shall meet at 8:00 am at the back entrance of the CAB building. (The back entrance is the entrance facing the Sternwarte, the ASV entrance, on the opposite side of the building from the main entrance.) Please be on time so that we can start on time.


Second Presentation:

On Saturday, April 25, we will have the second session of our seminar.

Location: BIN 2.A.01 (ifi, Binzmühlestrasse 14, Oerlikon)
Starting Time: 8:15

The building will be locked, so we shall meet at 8:05 am at the front entrance (roughly in the middle between the tram stops Bahnhof Oerlikon Ost and Leutschenbach).


Final Report:

The final version of the report is due at the end of the semester. The final report must be self-contained, about 10 pages (single-column is fine). Please email your final report to all three professors.



Paper | Date | Professor/Post-doc | Presenter | Buddy
(1) Mining Frequent Graph Pattern with Differential Privacy | March 21 | Arijit Khan | Linus Handschin | Diana Birenbaum
(2) Mining Top-k High Utility Itemsets | March 21 | Michael Böhlen | Daniel Yu | Sofia Orlova
(3) Mining Uncertain Data with Probabilistic Guarantees | March 21 | Michael Böhlen | Anna Durrer | Andreas Enz
(4) Collective Graph Identification | March 21 | Peter Widmayer | Floran Gmehlin | Marco Alvaro
(5) Efficient Episode Mining of Dynamic Event Streams | March 21 | Arijit Khan | Marco Alvaro | Veronika Molnar
(6) An Information Theoretic Framework for Data Mining | March 21 | Arijit Khan | Lukas Striebel | Lilian Boesch
(7) Tell Me What I Need to Know: Succinctly Summarizing Data with Itemsets | March 21 | Peter Widmayer | Sivaranjini Chithambaram | Imanol Studer
(8) Summarization-based Mining Bipartite Graphs | March 21 | Michael Böhlen | Annika Glauser | Yuves Bieri
(9) Selecting a Comprehensive Set of Reviews | March 21 | Peter Widmayer | Cyrill Gossi | Marcel Molnar
(10) Clustering Time Series using Unsupervised-Shapelets | April 25 | Arijit Khan | Lilian Boesch | Anna Durrer
(11) Comparing Apples to Oranges: A Scalable Solution with Heterogeneous Hashing | April 25 | Michael Böhlen | |
(12) Social Sampling | April 25 | Peter Widmayer | Marcel Mohler | Annika Glauser
(13) Selective Sampling on Graphs for Classification | April 25 | Arijit Khan | Diana Birenbaum | Daniel Yu
(14) Reconstructing Graphs from Neighborhood Data | April 25 | Peter Widmayer | Yuves Bieri | Sivaranjini Chithambaram
(15) Analyzing massive astrophysical datasets: Can Pig/Hadoop or a relational DBMS help? | April 25 | Arijit Khan | |
(16) Which Space Partitioning Tree to Use for Search? | April 25 | Michael Böhlen | Sofia Orlova | Linus Handschin
(17) Density estimation trees | April 25 | Michael Böhlen | Imanol Studer | Cyrill Gossi
(18) A Parallel Hashed Oct-Tree N-Body Algorithm | April 25 | Peter Widmayer | Andreas Enz | Floran Gmehlin



All reports must be written in English. All talks must be in English.