Computing Platforms Seminar Series (COMPASS)

The Computing Platforms Seminar Series (COMPASS) features talks from industry and academia on the general topic of computing platforms.

COMPASS is held on most Thursdays during the semester from 10:00 to 11:00 (with some exceptions) in CAB E 72.

Upcoming Talks:

Thursday, 28. March 2019, 10:00-11:00 in CAB E 72

Speaker: Theo Rekatsinas (University of Wisconsin)

Title: A Machine Learning Perspective on Managing Noisy Data

Abstract:

Modern analytics depend heavily on high-effort tasks like data preparation and data cleaning to produce accurate results. This is why the vast majority of the time devoted to analytics projects is spent on these tasks.

This talk describes recent work on making routine data preparation tasks dramatically easier. I will first introduce a noisy channel model to describe the quality of structured data and demonstrate how most work on noisy data management by the database community can be cast as a statistical learning and inference problem. I will then show how this noisy channel model forms the basis of HoloClean, a weakly supervised ML system for automated data cleaning. I will close with additional examples of how a statistical learning view can lead to new insights and solutions to classical database problems such as constraint discovery and consistent query answering.

Short Bio:

Theodoros (Theo) Rekatsinas is an Assistant Professor in the Department of Computer Sciences at the University of Wisconsin-Madison, where he is a member of the Database Group. He earned his Ph.D. in Computer Science from the University of Maryland and was a Moore Data Postdoctoral Fellow at Stanford University. His research interests are in data management, with a focus on data integration, data cleaning, and uncertain data. Theo's work has been recognized with an Amazon Research Award in 2018, a Best Paper Award at SDM 2015, and the Larry S. Davis Doctoral Dissertation Award in 2015.


Past COMPASS Talks:  

Date | Speaker | Affiliation | Talk
21.03.2019 | Marko Vukolic | IBM Research | Hyperledger Fabric: a Distributed Operating System for Permissioned Blockchains
28.02.2019 | Alberto Lerner | University of Fribourg | The Case for Network-Accelerated Query Processing
21.02.2019 | Thomas Würthinger | Oracle Labs | Bringing the Code to the Data with GraalVM
31.01.2019 | Irene Zhang | Microsoft Research, Redmond | Demikernel: An Operating System Architecture for Hardware-Accelerated Datacenter Servers
25.10.2018 | Mihnea Andrei | SAP HANA | Snapshot isolation in HANA - the evolution towards production-grade HTAP
04.10.2018 | Philippe Bonnet | IT University, Copenhagen, Denmark | Near-Data Processing with Open-Channel SSDs
25.09.2018 | Nandita Vijaykumar | Carnegie Mellon University | Expressive Memory: Rethinking the Hardware-Software Contract with Rich Cross-Layer Abstractions
20.09.2018 | Patrick Stüdi | IBM Research | Data processing at the speed of 100 Gbps using Apache Crail (Incubating)
15.08.2018 | Leonid Yavits | Technion | Resistive CAM based architectures: Resistive Associative In-Storage Processor and Resistive Address Decoder
06.07.2018 | Martin Burtscher | Texas State University | Automatic Hierarchical Parallelization of Linear Recurrences
15.06.2018 | Nitin Agrawal | Samsung Research | Low-Latency Analytics on Colossal Data Streams with SummaryStore
24.05.2018 | Cagri Balkesen | Oracle Labs | RAPID: In-Memory Analytical Query Processing Engine with Extreme Performance per Watt
16.05.2018 | Carsten Binnig | TU Darmstadt | Towards Interactive Data Exploration
09.05.2018 | Bastian Hossbach | Oracle Labs | Modern programming languages and code generation in the Oracle Database
26.04.2018 | Spyros Blanas | Ohio State University | Scaling database systems to high-performance computers
19.04.2018 | Jane Hung | MIT | The Challenges and Promises of Large-Scale Biological Imaging
12.04.2018 | Christoph Hagleitner | IBM Research | Heterogeneous Computing Systems for Datacenter and HPC Applications
14.03.2018 | Eric Sedlar | Oracle Labs | Why Systems Research Needs Social Science Added to the Computer Science
01.03.2018 | Saughata Ghose | Carnegie Mellon University | How Safe Is Your Storage? A Look at the Reliability and Vulnerability of Modern Solid-State Drives
22.02.2018 | Ioannis Koltsidas | IBM Research Zurich | System software for commodity solid-state storage