Lectures
Lecture Slides
- Lecture 01: Introduction and Logistics
- Lecture 02: Performance Metrics
- Lecture 03: Amdahl's Law
- Lecture 04: Introduction to MIPS
- Lecture 05: Cache Introduction
- Lecture 06: Cache Optimizations
- Lecture 07: Virtual Memory
- Lecture 08: Pipelining
- Lecture 09: Handling Branches
- Lecture 10: Out-of-Order Execution
- Lecture 11: Main Memory
- Lecture 12: SIMD
- Lecture 13: Multiprocessors
- Lecture 14: Consistency and Coherence
Note
I have heard reports that some parts of these PDFs don’t render properly in Adobe Reader. I recommend that you view these PDFs in your browser or via an alternative PDF viewer such as Zathura or Evince (Linux) or Foxit Reader (Windows).
Worksheets
- Worksheet Lecture 02: Performance (Solutions)
- Worksheet Lecture 03: Amdahl's Law (Solutions)
- Worksheet Lecture 05: Cache Introduction (Solutions)
- Worksheet Lecture 06: Cache Optimizations (Solutions)
- Worksheet Lecture 08: Pipelining Hazards (Solutions)
- Worksheet Lecture 09: Handling Branches (Solutions)
- Worksheet Lecture 10: Out-of-Order Execution (Solutions)
- Worksheet Lecture 11: Main Memory (Solutions)
Turning in Worksheets
You should turn in your worksheets on Gradescope. They are due the Friday after we finish covering them in class. You need to upload the worksheet as a PDF, and many apps can scan paper to PDF for you. On Android, the Google Drive app has scan-to-PDF functionality. On iOS, you can scan using the Notes app and export to a PDF.
Gradescope requires that you submit the same number of pages as the blank worksheet template. If you don’t want to print out the worksheet and instead do the problems on a blank sheet of paper, you may need to add blank pages so that your page count matches the template's.
Supplemental Links
- Lecture 02: Performance Metrics
  - AWS takes advantage of the bandwidth of trucks carrying hard drives with AWS Snowmobile
  - CPU Bandwidth - The Worrisome 2020 Trend
- Lecture 05: Cache Introduction
- Lecture 06: Cache Optimizations
- Lecture 07: Virtual Memory
  - Notes from Lecture
  - Virtual Memory – Translation-Lookaside Buffer (TLB)
  - Why realloc is actually efficient, thanks to virtual memory and the ability to manipulate the page table: A story of Realloc (and Laziness)
- Lecture 09: Branch Prediction
  - Why is processing a sorted array faster than processing an unsorted array - Stack Overflow
  - A Stack Overflow answer on why predicated code isn’t always the best idea: gcc optimization flag -O3 makes code slower than -O2
  - Linus Torvalds to the LKML on why CMOV (conditional move) is not always that great
- Lecture 11: Main Memory
  - Flipping Bits in Memory Without Accessing Them: An Experimental Study of DRAM Disturbance Errors (PDF, DOI 10.1145/2678373.2665726)
- Lecture 13: Multiprocessors
- Lecture 14: Consistency and Coherence
Lecture Recordings
Click here for the lecture recording YouTube playlist.