Advanced Systems Lab - Fall 2018

Course Organization & Materials

Find below the dates and details of tutorials (T) and exercises (E):

Date  Type  Description  Materials
 18. Sep.  T  The first tutorial session will take place on Sept. 18th in CAB G61.  slides
 20. Sep.  E  The first exercise session will take place on Sept. 20th in CAB G61. It gives an overview of the project.  slides
 25. Sep.  E  The second exercise session will take place on Sept. 25th in CAB G61. It focuses on Microsoft Azure, Bash and Git.  azure, bash
 27. Sep.  E  The third exercise session will take place in individual groups. Please look up your assigned group in the table below.  slides
 2. Oct.  T  Tutorial session on the life cycle of an experiment.  slides, throughput
 4. Oct.  E  Exercise session on good and bad practices in Java middleware development.  slides
 9. Oct.  T  There will be no tutorial session on October 9th.
 11. Oct.  E  The exercise session will cover GnuPlot and Baseline experiments without the Middleware.  slides
 16. Oct.  T  Tutorial session on planning experiments.  slides
 18. Oct.  E  The exercise session will cover good and bad practices when generating plots, and Baseline experiments with the Middleware.  slides
 23. Oct.  T  Tutorial session on queueing theory.  slides
 25. Oct.  E  Exercise session on 2K experiments.  slides
 30. Oct.  T  Tutorial session on System Analysis.  slides
 1. Nov.  E  Exercise session on Queueing Theory.  slides
 6. Nov.  T  All the Tutorials are finished for this semester.
 8. Nov.  E  Exercise session on Network of Queues.  slides
 15. Nov.  E  The rest of the exercise sessions will be Q/A. No material presented.

 

Project Details

Project Description: [Project Description]

Report: [Report Outline (pdf), (tex)]

Programming: [Java Main Class] [ANT Build File] [Bash Script Examples]

Azure: [Education Hub] [VM Template]

Caution: When creating the VMs, the machines are started automatically. Stop them if you do not run experiments right away.
 

Project Deadline: December 17th, 2018, 17:00.

NOTE: THE DEADLINE TO DE-REGISTER FROM THE COURSE IS ON 14th OCTOBER 2018.


Literature

"Art of Computer Systems Performance Analysis" - Raj Jain
John Wiley & Sons Inc; 2nd revised edition (21 September 2015)

"The Art of Computer Systems Performance Analysis" - Raj Jain
Wiley Professional Computing, 1991

The following chapters of the 1st edition are of particular relevance:

  • Chapters 1, 2, 3 (General introduction, Common terminology)
  • Chapters 4, 5, 6 (Workloads)
  • Chapter 10 (Data presentation)
  • Chapters 12, 13, 14 (Probability and statistics)
  • Chapters 16, 17, 18, 20, 21, 22 (Experimental design)
  • Chapters 30, 31, 32, 33, 36 (Queueing theory)

 


Lecturer

Gustavo Alonso

 


Course Hours

Tutorials: Tuesday, 17:00 – 19:00, CAB G 61.

Exercises: Thursday, 17:00 - 19:00

General Contact: sg-asl [at] lists.inf.ethz.ch


Exercise Sessions

Exercise sessions are held on Thursdays from 17:00 to 19:00 in small groups. In the exercise sessions, we answer high-level questions related to the project and the report.

 Assistant  Room  Email  Last names assigned
 Michel Mueller  CHN D42  muellmic [at] inf.ethz.ch  A-C
 Muhsen Owaida  CHN D44  mewaida [at] inf.ethz.ch  D-He
 Alba Ríos Rodríguez  CHN D46  rialba [at] student.ethz.ch  Ho-L
 David Sidler  CAB G56  dasidler [at] inf.ethz.ch  M-Sa
 Kaan Kara   CAB G52  kkara [at] inf.ethz.ch  Sc-Z

 


Office Hours

Office hours are intended to provide you with advice that will help you complete the project and the report. To make an appointment, contact your teaching assistant by email.

  • Make sure you come prepared with concrete and well formulated questions. If possible, include them in your email.
  • We will not complete the assignment for you, and we will neither recommend nor make design decisions on your behalf.
  • We will not debug your code, provide technical support for your setup/scripts/data analysis, or give hints about whether what you have done so far is enough.
  • We will not grade your project in advance, so please avoid questions that try to determine whether what you have done is correct or sufficient for a passing grade.

 Time  Assistant
 Friday, 9:00-10:00, 13:00-14:00  Michel Mueller
 Thursday, 15:00 - 17:00   Muhsen Owaida
 Thursday, 9:00-11:00  Alba Ríos Rodríguez
 Thursday, 16:00-17:00, Friday 09:00-10:00  David Sidler
 Friday, 9:00-10:00, 13:00-14:00  Kaan Kara

 


FAQ / Tips

Q: Which Java version am I allowed to use?

A: Use Java 8.

Q: With how many threads should I run memcached?

A: Run memcached with a single thread.

Q: Can I adapt the Ant file to include the log4j library?

A: Yes, you can alter the Ant file as long as we are still able to build your middleware from a clean checkout. Note: You cannot use external libraries other than log4j.

Q: Where can I see how much money I spend on Azure?

A: Go to aka.ms/startedu and click on the "Courses" tab. Next, click on the course "ASL 2018". You should then be able to see your lab and the credit assigned and consumed so far.

Q: I think I am being charged by Azure even though I shut down my VMs. Why?

A: If you use the Azure console interface, you have to use the "deallocate" command to deallocate the VMs. If you only use the "stop" command, the VMs are not deallocated and you are still charged for them. If in doubt, check in the online interface whether the VMs are "stopped (deallocated)".

Q: How do I construct the histograms?

A: To construct the histograms from the middleware, you need to record the response time of every request with a precision of 100 µs. Note that the bucket size is not necessarily 100 µs; it can be larger. However, you need to have at least 10 buckets per histogram. To construct the histograms on the client side, use the CDF generated by memtier and decide on the bucket size yourself. BUT make sure that all histograms from both the clients and the middleware HAVE THE SAME BUCKET SIZE.
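
The middleware side of this can be sketched as follows. This is only an illustration, not a required design: the class name, the use of System.nanoTime() timestamps, the overflow handling in the last bucket and the restriction of the bucket width to multiples of 100 µs are all assumptions made for this example.

```java
/** Sketch: fixed-width histogram of response times recorded with 100 us precision. */
final class ResponseTimeHistogram {
    private final long bucketWidthUs; // assumed: a multiple of 100 us, same as on the client side
    private final long[] counts;      // the last bucket also collects all larger values

    ResponseTimeHistogram(long bucketWidthUs, int bucketCount) {
        if (bucketWidthUs < 100 || bucketWidthUs % 100 != 0) {
            throw new IllegalArgumentException("bucket width must be a multiple of 100 us");
        }
        if (bucketCount < 10) {
            throw new IllegalArgumentException("at least 10 buckets are required");
        }
        this.bucketWidthUs = bucketWidthUs;
        this.counts = new long[bucketCount];
    }

    /** Records one request from two System.nanoTime() readings. Not thread-safe. */
    void record(long startNanos, long endNanos) {
        // Truncate the response time to a multiple of 100 us.
        long responseUs = ((endNanos - startNanos) / 100_000L) * 100L;
        int bucket = (int) Math.min(responseUs / bucketWidthUs, counts.length - 1);
        counts[bucket]++;
    }

    /** Returns a copy of the bucket counts. */
    long[] counts() {
        return counts.clone();
    }
}
```

Because record() is not synchronized, one possible arrangement is to keep one histogram per worker thread and add the counts together after the experiment.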

Q: I get the error "read error: Connection reset by peer". How do I solve it?

A:

- Make sure your Middleware has started and is ready to receive connection requests before you start your memtier clients.

- Increase the backlog queue size (this is the allowed number of new connections waiting to be accepted). See this thread.

- Check if the network thread can handle the amount of incoming connections and does not run out of file descriptors. 
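
As a minimal sketch of the backlog point, assuming the middleware accepts connections through java.nio: the backlog can be passed when binding the server channel. The class and method names are made up for this example, and a suitable backlog value is for you to decide.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.ServerSocketChannel;

/** Sketch: open a listening channel with an explicit backlog. */
final class BacklogServer {
    /**
     * Binds to ip:port and asks for an accept queue of up to `backlog`
     * pending connections. The exact semantics are implementation specific,
     * and the operating system may cap the value.
     */
    static ServerSocketChannel open(String ip, int port, int backlog) throws IOException {
        ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress(ip, port), backlog);
        return server;
    }
}
```

If you use a plain java.net.ServerSocket instead, its constructors also accept a backlog argument.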

Q: For Section 6 (2K Analysis), do I have to perform two separate analyses for throughput and response time?

A: Yes, you need to solve the linear equation system separately for throughput and response time.
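
One way to compute the effects is the sign-table method, which for a 2^k design with levels coded as -1/+1 gives the same result as solving the linear system. The sketch below is an illustration only: the class name and the run ordering (bit j of the run index is the level of factor j) are assumptions, and the factors themselves are the ones given in the report outline. Call it once with the throughput measurements and once with the response time measurements.

```java
/** Sketch: effects of a 2^k factorial design via the sign table. */
final class TwoKEffects {
    /**
     * y[i] is the measurement of run i, where bit j of i is the level of
     * factor j (0 = low = -1, 1 = high = +1).
     * Returns q, where q[mask] is the effect of the factor combination whose
     * bits are set in mask; q[0] is the mean over all runs.
     */
    static double[] effects(double[] y) {
        int runs = y.length;
        if (runs < 2 || Integer.bitCount(runs) != 1) {
            throw new IllegalArgumentException("need 2^k measurements");
        }
        double[] q = new double[runs];
        for (int mask = 0; mask < runs; mask++) {
            double sum = 0.0;
            for (int i = 0; i < runs; i++) {
                // Sign-table entry: product of the levels of all factors in mask.
                // It is negative if an odd number of them is at the low level.
                int lowCount = Integer.bitCount(mask & ~i);
                sum += (lowCount % 2 == 0) ? y[i] : -y[i];
            }
            q[mask] = sum / runs;
        }
        return q;
    }
}
```

For example, with two factors A (bit 0) and B (bit 1) and the made-up measurements {15, 45, 25, 75}, the result is q[0] = 40 (mean), q[1] = 20 (A), q[2] = 10 (B) and q[3] = 5 (AB).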

Q: For Section 7, which exact configuration should I use for network of queues?

A: You need to consider both the one-middleware and the two-middleware configurations. You need to analyze both read-only and write-only workloads. You can fix the number of worker threads at the value that delivers the highest throughput.

Q: The report outline asks us to consider the under-saturated, saturated and over-saturated states of our experiment runs. What if we don't observe these states clearly?

A: You may or may not observe some of these states. You still need to argue which of these states the system is in throughout your experiments.

Q: How do I select the maximum throughput, as asked in the report outline?

A: When determining the maximum throughput of your system, it is important to take the response time into consideration and how it is affected by the increase in load. Clearly explain the reasoning behind the selection.

Q: Do I need to plot the interactive law verification in all my figures?

A: No, this is not necessary. However, it is necessary to verify it and state that it holds in the report.

Q: What do I have to consider when calculating the interactive law for the middleware?

A: When calculating the throughput based on the response time measured in the middleware, you have to adapt either a) N (the number of clients/requests) or b) Z (the think time).

a) When only the middleware and servers are considered as the system, the number of requests in the system (N) is smaller than the number of clients. Given your middleware measurements, you can determine the number of requests in the system. For this approach, the think time is still ~0.

b) You can take the RTT between client and middleware as the think time (Z). The RTT can be measured with ping, for instance. For this approach, N is still the number of clients.
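
The interactive law itself is X = N / (R + Z), with throughput X, number of clients/requests N, response time R and think time Z. The helper below is a small sketch for checking your numbers; the class and method names are made up for this example, and all times are assumed to be in seconds.

```java
/** Sketch: interactive response time law, X = N / (R + Z). */
final class InteractiveLaw {
    /** Expected throughput in requests per second. */
    static double throughput(double n, double responseTimeSec, double thinkTimeSec) {
        return n / (responseTimeSec + thinkTimeSec);
    }

    /** Expected response time in seconds, R = N / X - Z. */
    static double responseTime(double n, double throughputPerSec, double thinkTimeSec) {
        return n / throughputPerSec - thinkTimeSec;
    }
}
```

For approach a), pass the number of requests in the system as n and a think time of about 0. For approach b), pass the number of clients as n and the measured RTT as the think time. In both cases, compare the result with the throughput you measured.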

Errata / Updates

  • Project Description: Section 3 5. 2k Analysis. Changed parameters to be consistent with Report Outline.
  • Report Outline: Table in Section 6 changed the entries for "Instances of memtier per machine" and "Threads per memtier instance". Reason: typo
  • Report Outline: Section 3 & 3.1 & 3.2 changed first sentence from "Connect one load generator machine.." to "Connect three load generator machines..".
  • Report Outline: Section 5 clarified the use and meaning of the --ratio parameter.
  • Azure Slides: Slide 16 changed from "openjdk-7-jdk" to "openjdk-8-jdk"