CS 385 Lab 12, Spring 2009


Due date: Monday, May 4, 1:00 PM


In preparation for this lab, you should read the following article:

John D. Owens, Mike Houston, David Luebke, Simon Green, John E. Stone, and James C. Phillips. "GPU Computing." Proceedings of the IEEE, 96(5):879–899, May 2008.

This article was emailed to you; you can also copy it from ~srivoire/cs385/pickup/owens.gpu.2008.pdf on the CS department's servers.

This article is written for a technical audience with some familiarity with computer architecture and parallel programming. Reading this type of article for the first time can be difficult. Don't get stuck if you don't understand something! Here is my recommended approach to this article:

You do not have to read the sections not mentioned in this list, although they are worthwhile.


You will turn in your writeup electronically, as a text file. To get a good grade on the writeup, your answers should be in your own words and should show that you have worked to understand the article and that you are prepared for discussion.

Answer all of the questions in the following list:

  1. Your boss hands you some code and asks if the program would run significantly faster on a GPU. How do you go about answering this question?
  2. In what ways has GPU programming become more general over the years? What restrictions remain on GPU programming compared to general-purpose programming? Why are these restrictions in place?
  3. What are some challenges unique to GPU programming (or CUDA in particular) compared to parallel programming on traditional CPUs?
  4. For the case study that you read, describe the application in your own words and explain why the application works well on GPUs.

Pick two of the questions in the following list and answer them. Your answers may involve some Internet research; cite your sources. These sources don't have to be academically credible; if they help you to make sense of the article, that's fine.

  1. Section I states that GPUs work well for applications where throughput is more important than latency. What is the distinction between throughput and latency?
  2. Explain the branch granularity issue described under "design tradeoffs" in Section VIIIA. Section IIIA has some helpful background.
  3. What's the difference between scatter-gather and strided memory accesses? Section IIIA will get you started.
  4. Section IV talks about programming environments that use delayed evaluation and just-in-time (or online) compilation. What are these two things, and why do they go together?

Submitting your writeup

Make sure that your writeup is named yourlastnameL12.txt, and then copy it to ~srivoire/cs385/submit. Wait 2 minutes, and then check that it was correctly submitted by visiting http://rivoire.cs.sonoma.edu/cs385/lab12sub.txt.
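The copy step above can be sketched as a small shell function. This is only an illustration, not part of the course setup: the function name submit_lab and its two arguments are hypothetical, and it simply checks the file name before copying.

```shell
# Sketch of the submission copy step. The function name submit_lab and its
# two arguments are hypothetical, not something the course provides.
submit_lab() {
    file="$1"   # path to your writeup, e.g. smithL12.txt
    dest="$2"   # submission directory, e.g. ~srivoire/cs385/submit

    # The writeup must be named yourlastnameL12.txt.
    case "$(basename "$file")" in
        ?*L12.txt) ;;
        *) echo "error: name the file yourlastnameL12.txt" >&2; return 1 ;;
    esac

    # Refuse to continue if the writeup does not exist.
    [ -f "$file" ] || { echo "error: $file not found" >&2; return 1; }

    # Copy the writeup into the submission directory.
    cp "$file" "$dest"/
}
```

For example, a student whose last name is Smith (a made-up name) might run "submit_lab smithL12.txt ~srivoire/cs385/submit", wait 2 minutes, and then look for the file name at the URL given above.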