Lectures
Lecture Slides
Lecture 01: Introduction and Logistics
Lecture 02: Performance Metrics
Lecture 03: Amdahl's Law
Lecture 04: Introduction to MIPS
Lecture 05: Cache Introduction
Lecture 06: Cache Optimizations
Lecture 07: Virtual Memory
Lecture 08: Pipelining
Lecture 09: Handling Branches
Lecture 10: Out-of-Order Execution
Lecture 11: Main Memory
Lecture 12: SIMD
Lecture 13: Multiprocessors
Lecture 14: Consistency and Coherence
Note
I have heard reports that some parts of these PDFs don’t render properly in Adobe Reader. I recommend that you view these PDFs in your browser or via an alternative PDF viewer such as Zathura or Evince (Linux) or Foxit Reader (Windows).
Worksheets
Worksheet Lecture 02: Performance (Solutions)
Worksheet Lecture 03: Amdahl's Law (Solutions)
Worksheet Lecture 05: Cache Introduction (Solutions)
Worksheet Lecture 06: Cache Optimizations (Solutions)
Worksheet Lecture 08: Pipelining Hazards (Solutions)
Worksheet Lecture 09: Handling Branches (Solutions)
Worksheet Lecture 10: Out-of-Order Execution (Solutions)
Worksheet Lecture 11: Main Memory (Solutions)
Turning in Worksheets
Turn in your worksheets on Gradescope. They are due the Friday after we finish covering them in class. You need to upload each worksheet as a PDF, and many apps can produce one. On Android, the Google Drive app has scan-to-PDF functionality. On iOS, you can scan using the Notes app and export to a PDF.
Gradescope requires that you submit the same number of pages as the blank worksheet template. If you work the problems on blank paper instead of printing the worksheet, you may need to add blank pages so that your page count matches the template's.
Supplemental Links
- Lecture 02: Performance Metrics
- AWS takes advantage of the bandwidth of trucks carrying hard drives with AWS Snowmobile
- CPU Bandwidth - The Worrisome 2020 Trend
- Lecture 05: Cache Introduction
- Lecture 06: Cache Optimizations
- Lecture 07: Virtual Memory
Notes from Lecture
- Virtual Memory – Translation-Lookaside Buffer (TLB)
- Why realloc is actually efficient, thanks to virtual memory and the ability to manipulate the page table: A story of Realloc (and Laziness)
- Lecture 09: Branch Prediction
- Why is processing a sorted array faster than processing an unsorted array - Stack Overflow
- A Stack Overflow answer on why predicated code isn’t always the best idea: gcc optimization flag -O3 makes code slower than -O2
- Linus Torvalds to the LKML on why CMOV (conditional move) is not always that great
- Lecture 11: Main Memory
- Flipping Bits in Memory Without Accessing Them: An Experimental Study of DRAM Disturbance Errors (PDF, DOI 10.1145/2678373.2665726)
- Lecture 13: Multiprocessors
- Lecture 14: Consistency and Coherence
Lecture Recordings
Click here for the lecture recording YouTube playlist.