Hardware Architectures for Machine Learning - Spring 2019


Submit your reports via email by 31.05.2019, 23:59. The report is 4 pages in single-column format. Some (non-obligatory!) guidelines can be found here. If you follow the guidelines, the Synthesis part can discuss the relation to the other seminar papers and possible intersecting points or combinations thereof.

Submit your top-3 rankings by 28.02.2019, 23:59. Late submissions are not allowed!

The seminar covers recent results in the increasingly important field of hardware acceleration for data science, both on dedicated machines and in data centers. It aims at students interested in the systems aspects of data processing who are willing to bridge the gap across traditional disciplines: machine learning, databases, systems, and computer architecture. The seminar should be of special interest to students considering a master's thesis or even a doctoral dissertation on related topics.


The seminar will start on February 21st with an overview of the general topics and the intended format of the seminar. Students are expected to present one paper in a 30-minute talk and to complete a 4-page report on the paper's main idea and on how it relates to the other papers presented in the seminar and the discussions around them. Alternatively, students may submit a reproducibility report (instead of the summary report), which includes a re-implementation and reproduction of the results of the presented paper, along with a README-like document. Presentations will be given during the semester in the allocated time slots. The report is due on the last day of the semester (31.05.2019).

Attendance at the seminar is mandatory to fulfill the credit requirements. Active participation is also expected: students should read every paper in advance and contribute to the questions and discussion of each paper during the seminar.

Course Material

Speaker  Title  Date/Time
Dr. Tal Ben-Nun  Reproducing Deep Learning (Slides | Demo)
Prof. Ce Zhang  System Relaxations for First Order Methods: A 45-Minute Crash Course
Prof. Torsten Hoefler  How to Survive in this Seminar?  28.02.2019
Dr. Muhsen Owaida  Application Partitioning on FPGA Clusters: Inference over Decision Tree Ensembles  07.03.2019
Kaan Kara  Why are specialized hardware solutions for machine learning useful?  07.03.2019
Nicholas Dykeman  Second-order Optimization Method for Large Mini-batch: Training ResNet-50 on ImageNet in 35 Epochs  14.03.2019 15:15
Dingguang Jin  Deep Learning On Code with an Unbounded Vocabulary  14.03.2019 16:15
Peter Tatkowski  Constrained Graph Variational Autoencoders for Molecule Design  21.03.2019 15:15
Zuowen Wang  A Linearly-Convergent Stochastic L-BFGS Algorithm  21.03.2019 16:15
Eric Wolf  Relational inductive biases, deep learning, and graph networks  28.03.2019 15:15
Marc Jourdan  mixup: Beyond Empirical Risk Minimization  28.03.2019 16:15
Karolis Martinkus  Regularized Evolution for Image Classifier Architecture Search  04.04.2019 15:15
Elias Stalder  Azure Accelerated Networking: SmartNICs in the Public Cloud  04.04.2019 16:15
Diego Luis  AMC: AutoML for Model Compression and Acceleration on Mobile Devices  11.04.2019 15:15
Tommaso Ciussani  In-RDBMS Hardware Acceleration of Advanced Analytics  11.04.2019 16:15
Edoardo Caldarelli  Demystifying Automata Processing: GPUs, FPGAs or Micron’s AP?  18.04.2019 15:15
Mugeeb Hassan  YellowFin and the Art of Momentum Tuning  18.04.2019 16:15
Luis Guilherme Berenguer Todo-Bom  Efficient Sparse-Winograd Convolutional Neural Networks  02.05.2019 15:15
Fabio Maschi  Automatic Generation of Efficient Accelerators for Reconfigurable Hardware  02.05.2019 16:15
Clemens Hutter  Neural Ordinary Differential Equations  09.05.2019 15:15
Giovanni Balduzzi  Integrated Model, Batch and Domain Parallelism in Training Neural Networks  09.05.2019 16:15
Daniel Zvara  Faster Derivative-Free Stochastic Algorithm for Shared Memory Machine  16.05.2019 15:15
Sebastian Kurella  A Configurable Cloud-Scale DNN Processor for Real-Time AI  16.05.2019 16:15
Gianna Paulin  Deep Gradient Compression: Reducing the Communication Bandwidth for Distributed Training  23.05.2019 15:15
Georg Rutishauser  Plasticine: A Reconfigurable Accelerator for Parallel Patterns  23.05.2019 16:15


Seminar Hours

Thursdays, 15:00-17:00 in LEE C 104