Events

Thursday March 21, 2019
Start: 21.03.2019 13:30

Thursday, 21. March 2019, 13:30-14:30 in CAB E 72

Speaker: Marko Vukolic (IBM Research)

Title: Hyperledger Fabric: a Distributed Operating System for Permissioned Blockchains

 

 

Abstract:

Fabric is a modular and extensible open-source system for deploying and operating permissioned blockchains and one of the Hyperledger projects hosted by the Linux Foundation (www.hyperledger.org). Fabric supports modular consensus protocols, which allows the system to be tailored to particular use cases and trust models. Fabric is also the first blockchain system that runs distributed applications written in standard, general-purpose programming languages, without systemic dependency on a native cryptocurrency. This stands in sharp contrast to existing blockchain platforms that require "smart-contracts" to be written in domain-specific languages or rely on a cryptocurrency. Fabric realizes the permissioned model using a portable notion of membership, which may be integrated with industry-standard identity management. To support such flexibility, Fabric introduces an entirely novel blockchain design and revamps the way blockchains cope with non-determinism, resource exhaustion, and performance attacks. Although not yet performance-optimized, Fabric achieves, in certain popular deployment configurations, end-to-end throughput of more than 3500 transactions per second (of a Bitcoin-inspired digital currency), with sub-second latency, scaling well to over 100 peers. In this talk we discuss Hyperledger Fabric architecture, detailing the rationale behind various design decisions. We also briefly discuss distributed ledger technology (DLT) use cases to which Hyperledger Fabric is relevant, including financial industry, manufacturing industry, supply chain management, government use cases and many more.

Short Biography:

Dr. Marko Vukolić is a Research Staff Member in the Blockchain and Industry Platforms group at IBM Research - Zurich. Previously, he was a faculty member at EURECOM and a visiting faculty member at ETH Zurich. He received his PhD in distributed systems from EPFL in 2008 and his dipl. ing. degree in telecommunications from the University of Belgrade in 2001. His research interests lie in the broad area of distributed systems, including blockchain and distributed ledgers, cloud computing security, distributed storage and fault-tolerance.


----

COMPASS TALKS

----

Thursday March 28, 2019
Start: 28.03.2019 10:00

Thursday, 28. March 2019, 10:00-11:00 in CAB E 72

Speaker: Theo Rekatsinas (University of Wisconsin)

Title: A Machine Learning Perspective on Managing Noisy Data

 

 

Abstract: 

Modern analytics depend heavily on high-effort tasks like data preparation and data cleaning to produce accurate results. For this reason, the vast majority of the time devoted to analytics projects is spent on these tasks.

This talk describes recent work on making routine data preparation tasks dramatically easier. I will first introduce a noisy channel model to describe the quality of structured data and demonstrate how most work on noisy data management by the database community can be cast as a statistical learning and inference problem. I will then show how this noisy channel model forms the basis of HoloClean, a weakly supervised ML system for automated data cleaning. I will close with additional examples of how a statistical learning view can lead to new insights and solutions to classical database problems such as constraint discovery and consistent query answering.

Short Bio:

Theodoros (Theo) Rekatsinas is an Assistant Professor in the Department of Computer Sciences at the University of Wisconsin-Madison. He is a member of the Database Group. He earned his Ph.D. in Computer Science from the University of Maryland and was a Moore Data Postdoctoral Fellow at Stanford University. His research interests are in data management, with a focus on data integration, data cleaning, and uncertain data. Theo's work has been recognized with an Amazon Research Award in 2018, a Best Paper Award at SDM 2015, and the Larry S. Davis Doctoral Dissertation award in 2015.


 

----

COMPASS TALKS

----

Wednesday April 24, 2019
Start: 24.04.2019 15:00

CAB G 51

 Moritz Hoffmann - PhD Defense: Managing and understanding distributed stream processing

Thursday April 25, 2019
Start: 25.04.2019 10:00

Thursday, 25. April 2019, 10:00-11:00 in CAB E 72

Speaker: Peter Pietzuch (Imperial College London)

Title: Scaling Deep Learning on Multi-GPU Servers

  

 

 

Abstract

With the widespread availability of GPU servers, scalability in terms of the number of GPUs when training deep learning models becomes a paramount concern. For many deep learning models, there is a scalability challenge: to keep multiple GPUs fully utilised, the batch size must be sufficiently large, but a large batch size slows down model convergence due to the less frequent model updates.

In this talk, I describe CrossBow, a new single-server multi-GPU deep learning system that avoids the above trade-off. CrossBow trains multiple model replicas concurrently on each GPU, thereby avoiding under-utilisation of GPUs even when the preferred batch size is small. For this, CrossBow (i) decides on an appropriate number of model replicas per GPU and (ii) employs an efficient and scalable synchronisation scheme within and across GPUs.

Short Bio:

Peter Pietzuch is a Professor at Imperial College London, where he leads the Large-scale Data & Systems (LSDS) group (http://lsds.doc.ic.ac.uk) in the Department of Computing. His research focuses on the design and engineering of scalable, reliable and secure large-scale software systems, with a particular interest in performance, data management and security issues. He has published papers in premier international venues, including SIGMOD, VLDB, OSDI, USENIX ATC, EuroSys, SoCC, ICDCS, CCS, CoNEXT, NSDI, and Middleware. Before joining Imperial College London, he was a post-doctoral fellow at Harvard University. He holds PhD and MA degrees from the University of Cambridge.

Friday May 17, 2019
Start: 17.05.2019 12:00

Friday, 17. May 2019, 12:00-13:00 in CAB E 72

Speaker: Tim Kraska (MIT)

Title: Towards Learned Algorithms, Data Structures, and Systems

 

  

Abstract

All systems and applications are composed from basic data structures and algorithms, such as index structures, priority queues, and sorting algorithms. Most of these primitives have been around since the early beginnings of computer science (CS) and form the basis of every CS intro lecture. Yet, we might soon face an inflection point: recent results show that machine learning has the potential to alter the way those primitives or systems at large are implemented in order to provide optimal performance for specific applications. In this talk, I will provide an overview on how machine learning is changing the way we build systems and outline different ways to build learned algorithms and data structures to achieve “instance-optimality” with a particular focus on data management systems.

Short Bio:

Tim Kraska is an Associate Professor of Electrical Engineering and Computer Science in MIT's Computer Science and Artificial Intelligence Laboratory and co-director of the Data Systems and AI Lab at MIT (DSAIL@CSAIL). Currently, his research focuses on building systems for machine learning, and using machine learning for systems. Before joining MIT, Tim was an Assistant Professor at Brown, spent time at Google Brain, and was a PostDoc in the AMPLab at UC Berkeley after he got his PhD from ETH Zurich. Tim is a 2017 Alfred P. Sloan Research Fellow in computer science and has received several awards, including the 2018 VLDB Early Career Research Contribution Award, the 2017 VMware Systems Research Award


 COMPASS TALKS

Monday May 20, 2019
Start: 20.05.2019 17:00

HG D22

 David Sidler - PhD Defense: In-Network Data Processing using FPGAs

Friday May 24, 2019
Friday May 31, 2019
Start: 31.05.2019 12:15

CAB E 72

Lunch Seminar Talk by Jansen Zhao

Title: A brief introduction to quantum computing with plausible applications to machine learning

Abstract:

I will give a brief introduction to the main concepts in quantum information and quantum computing, and review the basic set of quantum algorithmic primitives. I will then show, by the example of Gaussian processes, how these quantum building blocks can be combined and provide computational speedup in machine learning. We will discuss the practical utility of these quantum algorithms and explore the domain of anticipated near-term applications of quantum computing.

Thursday July 11, 2019
Start: 11.07.2019 10:00

Thursday, 11. July 2019, 11:00-12:00 in CAB E 72

Speaker: Boris Grot (University of Edinburgh)

Title: Scale-Out ccNUMA: Embracing Skew in Distributed Key-Value Stores

 

  

 

 

 

Abstract:

Key-value stores (KVS’s) underpin many of today’s cloud services. For scalability and performance, state-of-the-art KVS systems distribute the dataset across a pool of servers, each of which holds a shard of data in memory and serves queries for the data in the shard. An important performance bottleneck that a KVS design must address is the load imbalance caused by skewed popularity distributions, whereby the “hot” items are accessed much more frequently than the rest of the dataset. Despite recent work on skew mitigation, existing approaches are limited in their efficacy when it comes to high-performance in-memory KVS deployments.

In this talk, I will discuss our recent work on skew mitigation for distributed in-memory KVS’s. We embrace popularity skew as a performance opportunity by aggressively caching popular items at all nodes of the KVS. The main challenge for such a design is keeping the caches consistent while avoiding serialization points that can become a performance bottleneck at high load. I will describe our fully decentralized caching architecture and the cache-coherence-inspired protocol used to keep the distributed caches consistent. I will also present simple protocol extensions that enable fault tolerance, with applicability beyond skew-tolerant KVS's.

Bio:

Boris Grot is an Associate Professor in the School of Informatics at the University of Edinburgh. His research seeks to address efficiency bottlenecks and capability shortcomings of processing platforms for data-intensive applications. Boris is a member of the MICRO Hall of Fame and a recipient of various awards for his research, including IEEE Micro Top Pick and the Best Paper Award at HPCA 2019. Boris holds a PhD in Computer Science from The University of Texas at Austin and spent two years as a post-doctoral researcher at EPFL.


 COMPASS TALKS

Monday September 16, 2019
Thursday September 19, 2019
Start: 19.09.2019 10:00

Thursday, 19. September 2019, 10:00-11:00 in CAB E 72

Speakers: Martin Hentschel and Max Heimel (Snowflake)

Title: File Metadata Management at Snowflake

 

 

 

Abstract:

Snowflake is an analytic data warehouse offered as a fully-managed service in the cloud. It is faster, easier to use, and far more scalable than traditional on-premise data warehouse offerings and is used by thousands of customers around the world. Snowflake's data warehouse is not built on an existing database or "big data" software platform such as Hadoop; it uses a new SQL database engine with a unique architecture designed for the cloud. This talk provides an overview of Snowflake’s architecture, which was designed to efficiently support complex analytical workloads in the cloud. Looking at the lifecycle of micro partitions, this talk explains pruning, zero-copy cloning, and instant time travel. Pruning is a technique to speed up query processing by filtering out unnecessary micro partitions during query compilation. Zero-copy cloning makes it possible to create logical copies of the data without duplicating physical storage. Instant time travel enables the user to query data "as of" a time in the past, even if the current state of the data has changed. This talk also shows how micro partitions tie into Snowflake's unique architecture of separation of storage and compute, and enable advanced features such as automatic clustering.
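
As a rough illustration of the pruning idea described above, the sketch below assumes that each micro partition carries the minimum and maximum value of one column as metadata. That assumption, along with the names `PartitionMeta` and `prune`, is hypothetical and not taken from the talk; it only shows how a range predicate could rule out partitions before any data is read.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PartitionMeta:
    """Hypothetical per-partition metadata: value range of a single column."""
    name: str
    min_value: int
    max_value: int


def prune(partitions: List[PartitionMeta], lo: int, hi: int) -> List[PartitionMeta]:
    """Keep only partitions whose [min_value, max_value] range overlaps [lo, hi]."""
    return [p for p in partitions if p.max_value >= lo and p.min_value <= hi]


# Illustrative metadata for three partitions with disjoint value ranges.
partitions = [
    PartitionMeta("p1", 1, 10),
    PartitionMeta("p2", 11, 20),
    PartitionMeta("p3", 21, 30),
]

# A range predicate on [12, 22] cannot match any row in p1, so p1 is skipped.
survivors = prune(partitions, 12, 22)
print([p.name for p in survivors])
```

The overlap test is conservative: a surviving partition may still contain no matching rows, but a pruned partition is guaranteed to contain none.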

Speakers bio:

Martin Hentschel received a PhD in Computer Science from the Systems Group at ETH Zurich in 2012. Afterwards, he worked at Microsoft, where he built products integrating data from social networks into the Bing search engine. In 2014, he joined Snowflake, where he works on security, metadata management, and stateful microservices.

Max Heimel holds a PhD in Computer Science from the Database and Information Management Group at TU Berlin. He joined Snowflake in 2015 and is working primarily in the areas of query execution and query optimization. Before joining Snowflake, Max worked at IBM and spent several internships at Google.  


COMPASS TALKS

Thursday September 26, 2019
Start: 26.09.2019 10:00

Thursday, 26. September 2019, 10:00-11:00 in CAB E 72

Speaker: Ben Zhao (University of Chicago)

Title: Hidden Backdoors in Deep Learning Systems

 

 

 

 

Abstract:

Lack of transparency in today’s deep learning systems has paved the way for a new type of threat, commonly referred to as backdoor or Trojan attacks. In a backdoor attack, a malicious party can corrupt a deep learning model (either at initial training time or later) to embed hidden classification rules that do not interfere with normal classification, unless an unusual “trigger” is applied to the input, which would then produce unusual (and likely incorrect) results. For example, a facial recognition model with a backdoor might recognize anyone with a pink earring as Elon Musk. Backdoor attacks have been validated in a number of image classification applications, and are difficult to detect given the black-box nature of most DNN models.

In this talk, I will describe two recent results on detecting and understanding backdoor attacks on deep learning systems. I will first present Neural Cleanse (S&P 2019), the first robust tool to detect a wide range of backdoors in deep learning models. We use the idea of inter-label perturbation distances to detect when a backdoor trigger has created shortcuts to misclassification to a particular label. Second, I will describe our new work on Latent Backdoors (CCS 2019), a stronger type of backdoor attack that is more difficult to detect and survives retraining in commonly used transfer learning systems. We use experimental validation to show that latent backdoors can be quite robust and stealthy, even against the latest detection tools (including Neural Cleanse). There are no known techniques to detect latent backdoors, but we present alternative techniques to defend against them via disruption.

Bio:

Ben Zhao is the Neubauer Professor of Computer Science at the University of Chicago. He received his PhD from Berkeley (2004) and his BS from Yale (1997). He is an ACM Distinguished Scientist, and a recipient of the NSF CAREER award, MIT Technology Review's TR-35 Award (Young Innovators Under 35), ComputerWorld Magazine's Top 40 Tech Innovators award, Google Faculty award, and IEEE ITC Early Career Award. His work has been covered by media outlets such as Scientific American, New York Times, Boston Globe, LA Times, MIT Tech Review, and Slashdot. He has more than 160 publications in the areas of security and privacy, networked systems, wireless networks, data-mining and HCI (H-index > 60). He recently served as PC chair for the World Wide Web Conference (WWW 2016) and the Internet Measurement Conference (IMC 2018), and is a general cochair for Hotnets 2020.


 

Thursday October 10, 2019
Start: 10.10.2019 10:00

Thursday, 10. October 2019, 10:00-11:00 in CAB E 72

Speaker: Norman May (SAP Research)

Title: Exploiting modern hardware in SAP HANA

Abstract:

SAP HANA has a long history of exploiting modern hardware to achieve high performance for database workloads. As a recent trend, GPUs and FPGAs have the potential to offload work from general-purpose CPUs or even accelerate operations previously executed on CPUs. In this talk I will share ongoing work in SAP HANA and potential application scenarios for accelerators along the query processing pipeline. I will also discuss current limitations for the usage of accelerators in productive SAP HANA scenarios.

Short Bio:

Dr. Norman May did his doctoral thesis on algebraic optimization and evaluation of XML queries at the University of Mannheim, Germany. After joining SAP, he worked as a researcher and technical coordinator of the TEXO/Theseus research project; his research focused on service discovery, service composition, and service engineering. In 2010 he joined the SAP HANA database development team, where he now works as a database architect with a focus on query processing and effective resource management. He supervises the research of several students in the SAP HANA campus and actively contributes to the database research community.


COMPASS TALKS

 

Thursday October 17, 2019
Start: 17.10.2019 10:00

Thursday, 17. October 2019, 10:00-11:00 in CAB E 72

Speaker: Rene Müller (NVIDIA)

Title: Simplifying NVIDIA GPU Access: A Polyglot Binding for GPUs with GraalVM

 

 

Abstract:

NVIDIA GPU computing accelerates workloads and fuels breakthroughs across industries. There are many GPU-accelerated libraries developers can leverage, but integrating these libraries into existing software stacks can be challenging. Programming GPUs typically requires low-level programming while high-level scripting languages have become very popular in modern applications. Accelerated computing solutions are heterogeneous and inherently more complex: data needs to be exchanged between main memory and the GPUs, often with format conversion along the way, while the execution of code needs to be scheduled carefully. In this talk, I will present recent work from an ongoing collaboration between NVIDIA and Oracle Labs, which we released as an open-source research prototype called grCUDA. It leverages Oracle's GraalVM and exposes GPUs in polyglot environments. While GraalVM can be regarded as the "one VM to rule them all", grCUDA is the "one GPU binding to rule them all". Data is efficiently shared between GPUs and GraalVM languages (R, Python, JavaScript) while GPU kernels can be launched directly from those languages. Precompiled GPU kernels can be used as well as kernels that are generated dynamically at runtime. I will also demonstrate how to access GPU-accelerated libraries such as RAPIDS cuML.

Short Bio:

Rene Mueller is a Senior AI Developer Technology at NVIDIA working on GPU optimizations for Machine Learning and acceleration of information processing systems such as database systems. Before joining NVIDIA, he was a Research Staff Member at the IBM Research Lab in Almaden, where he worked on DB2 BLU, DB2 EventStore, and acceleration of OLAP processing using GPUs and FPGAs. Rene holds a PhD in Computer Science from ETH Zurich.

 


Thursday November 21, 2019
Start: 21.11.2019 10:00

Thursday, 21. November 2019, 10:00-11:00 in CAB E 72

Speaker: Tamás Hauer, Technical Program Manager, SRE, Google Zurich

Title: Meaningful Availability

Abstract:

High availability is a critical requirement for cloud applications; having a metric that meaningfully captures it is useful for users and system developers. Commonly used benchmarks are either too complex to be actionable or fail to capture the true perception of users, which leads to miscommunication and suboptimal engineering decisions. We propose two improvements to availability measurements. A novel metric, "user-uptime", directly models user-perceived availability and avoids the bias often found in alternatives. A presentation paradigm, "windowed availability", supports a holistic view by integrating timescales from per-minute to monthly granularity and makes it possible to distinguish between many short periods of unavailability and fewer longer ones. We demonstrate the benefits of windowed user-uptime on synthetic models and on production data from Google's G Suite. Today, all G Suite products are instrumented with this novel metric; it is used both to support engineering decisions and to communicate system health to enterprise customers.
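
One way to read the two proposals is sketched below. It assumes that user-uptime is the total time users experienced the service as up, divided by the total time they experienced it as up or down, and that windowed availability reports the worst such ratio over any contiguous window of a given length. Both definitions, the per-bucket input format, and all names are assumptions made for illustration, not necessarily the talk's exact formulation.

```python
from typing import List, Tuple

# Hypothetical input format: one (up_minutes, down_minutes) pair per time bucket,
# where each value is summed over all active users in that bucket.
Bucket = Tuple[float, float]


def user_uptime(buckets: List[Bucket]) -> float:
    """Fraction of user-experienced time during which the service was up."""
    up = sum(u for u, _ in buckets)
    down = sum(d for _, d in buckets)
    total = up + down
    # With no user activity at all, treat the service as fully available.
    return 1.0 if total == 0 else up / total


def windowed_availability(buckets: List[Bucket], window: int) -> float:
    """Worst user-uptime over any contiguous run of `window` buckets."""
    if not 0 < window <= len(buckets):
        raise ValueError("window must be between 1 and len(buckets)")
    return min(
        user_uptime(buckets[i:i + window])
        for i in range(len(buckets) - window + 1)
    )


# Two synthetic traces with the same total downtime, distributed differently.
UP, DOWN = (10.0, 0.0), (0.0, 10.0)
one_long_outage = [UP] * 4 + [DOWN] * 2 + [UP] * 4
two_short_outages = [UP, UP, DOWN, UP, UP, UP, UP, DOWN, UP, UP]

for trace in (one_long_outage, two_short_outages):
    print(user_uptime(trace), windowed_availability(trace, window=2))
```

Both traces have the same overall user-uptime, but with a two-bucket window the single long outage yields a lower minimum than the two short ones, which is the kind of distinction a windowed view is meant to surface.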

Bio:

Tamás Hauer holds a PhD in theoretical physics. After a brief career as a physicist at the Max-Planck-Institut and CERN, he worked as a research associate at the department of Applied Computer Science of the University of the West of England, leading a group in the area of semantic web, grid services and health informatics. He was work package leader of the European FP5 Mammogrid and FP6 Health-e-Child projects. As the CTO of the Swiss startup Prodema Medical / McMRI, he led the development of appMRI, a brain MRI image analysis platform, through to launch and certification as a class IIa medical device. He joined Site Reliability Engineering at Google Zürich in 2016 as a Technical Program Manager; his current interest is data analysis of service level indicators and service level objectives.

---

COMPASS TALKS 

Thursday November 28, 2019
Start: 28.11.2019 10:00

Thursday, 28. November 2019, 10:00-11:00 in CAB E 72

Speakers: Djordje Zegarac and Martin Marciniszyn (Tensor Technologies)

Title: High Frequency Trading and FPGAs 

Abstract:

High-Frequency Trading (HFT) platforms were typically implemented in software on traditional CPUs with high-performance network adapters. However, the industry-wide race to "Zero Latency" has led the trading world to explore alternative system architectures that minimize internal latency. Field-Programmable Gate Arrays (FPGAs) offer superior performance with deterministic execution while providing custom implementation flexibility. Due to these valuable architectural properties, FPGAs have become an integral part of the HFT industry, accelerating trading solutions and reducing wire-to-wire latencies. In this talk we are going to outline the general architecture of our system and describe the main design challenges.

Bio:

Djordje Zegarac received the B.Sc. degree in Electrical Engineering from the University of Calgary, Canada in 2011, and the M.Sc. degree in Electronics and Microelectronics from Ecole Polytechnique Fédérale de Lausanne and IBM Research, Switzerland in 2014. He worked as an IC Digital Design Engineer at u-blox, and as an FPGA SoC Development Engineer at Enclustra, Switzerland. Currently, he is employed by Tensor Technologies as an FPGA Engineer. His main research interest is in the area of ASIC/FPGA design and hardware acceleration.

Martin Marciniszyn received a Ph.D. in Computer Science from ETH Zurich in 2007. Afterwards he spent the largest part of his professional career as a quant researcher at IMC Trading, one of the largest global HFT companies. Currently, he is the CTO of Tensor Technologies.


 COMPASS TALKS 

Tuesday December 17, 2019
Sunday January 19, 2020
Monday January 20, 2020
Tuesday January 21, 2020
Wednesday January 22, 2020
Start: 19.01.2020
End: 22.01.2020