CS 385 Lab 12, Spring 2009
Due date: Monday, May 4, 1:00 PM
Reading
In preparation for this lab, you should read the following article:
John D. Owens, Mike Houston, David Luebke, Simon Green, John E. Stone, and James C. Phillips. "GPU Computing." Proceedings of the IEEE, 96(5):879–899, May 2008.
This article was emailed to you; you can also copy it from ~srivoire/cs385/pickup/owens.gpu.2008.pdf on the CS department's servers.
This article is written for a technical audience with some familiarity with computer architecture and parallel programming. Reading this type of article for the first time can be difficult. Don't get stuck if you don't understand something! Here is my recommended approach to this article:
- Read the abstract and Section I. Try to put each idea into your own words and to relate it to your experiences with parallel programming. Note the things you don't understand: is the problem with terminology, or do you not follow the argument? Generally, it's worth understanding the introduction of a paper as well as you possibly can, or you're likely to get lost later.
- Repeat the same procedure with Section VIII. Yes, Section VIII. The conclusion of a paper is written on the same general level as the introduction, although it may contain a few things that won't make sense until you read the rest of the paper.
- Read Section III. First, skim the section to understand how it's organized and what the main points are, and then read it in more detail. After reading the section, you should have some insight into how GPU programming has evolved from a rigid model to something more general and programmable.
- Read Section IV. The two things I want you to get out of this section are (1) what stream programming is and (2) how all the different languages and companies involved in GPGPU programming fit together: which ones are high-level? Which ones are low-level? Which ones are descendants of others?
- Read Section VA. You should be able to relate the computational primitives to things we've done or discussed in class.
- Read Section VC carefully and relate it to what we've learned about optimizing CUDA programs. (For a reminder of what a basic CUDA program looks like, see the sketch at the end of this section.)
- Pick one of the four case studies (game physics, the Folding@Home application, the NAMD simulator, or the VMD tool). Note the purpose of the application and the characteristics that make it a good fit for GPUs.
You do not have to read the sections not mentioned in this list, although they are worthwhile.
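For reference while you read Sections IV and VC, below is a minimal sketch of the kind of data-parallel CUDA program those sections assume: one kernel applied independently to every element of a large array, with the host explicitly managing device memory and the kernel launch. The kernel name, array size, and block size here are made up for illustration; they are not taken from the article.

#include <stdio.h>
#include <stdlib.h>

// Kernel: each GPU thread scales one element of the array.
__global__ void scale(float *data, float alpha, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)                      // guard threads that fall past the end
        data[i] = alpha * data[i];
}

int main(void)
{
    const int n = 1 << 20;
    size_t bytes = n * sizeof(float);

    // Host-side input.
    float *h_data = (float *)malloc(bytes);
    for (int i = 0; i < n; i++)
        h_data[i] = 1.0f;

    // Explicit device allocation and host-to-device copy: bookkeeping with
    // no analogue in an ordinary CPU program.
    float *d_data;
    cudaMalloc((void **)&d_data, bytes);
    cudaMemcpy(d_data, h_data, bytes, cudaMemcpyHostToDevice);

    // Launch a grid of thread blocks, one thread per array element.
    int threadsPerBlock = 256;
    int blocks = (n + threadsPerBlock - 1) / threadsPerBlock;
    scale<<<blocks, threadsPerBlock>>>(d_data, 2.0f, n);

    // Copy the result back and clean up.
    cudaMemcpy(h_data, d_data, bytes, cudaMemcpyDeviceToHost);
    printf("h_data[0] = %f\n", h_data[0]);   // expect 2.000000

    cudaFree(d_data);
    free(h_data);
    return 0;
}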
Writeup
You will turn in your writeup electronically, as a text file. To get a good grade on the writeup, your answers should be in your own words and should show that you have worked to understand the article and that you are prepared for discussion.
Answer all of the questions in the following list:
- Your boss hands you some code and asks if the program would run significantly faster on a GPU. How do you go about answering this question?
- In what ways has GPU programming become more general over the years? What restrictions remain on GPU programming compared to general-purpose programming? Why are these restrictions in place?
- What are some challenges unique to GPU programming (or CUDA in particular) compared to parallel programming on traditional CPUs?
- For the case study that you read, describe the application in your own words and explain why the application works well on GPUs.
Pick 2 of the questions in the following list and answer them. Your answers may involve some Internet research; cite your sources. These sources don't have to be academically credible; if they help you to make sense of the article, that's fine.
- Section I states that GPUs work well for applications where throughput is more important than latency. What is the distinction between throughput and latency?
- Explain the branch granularity issue described under "design tradeoffs" in Section VIIIA. Section IIIA has some helpful background.
- What's the difference between scatter-gather and strided memory accesses? Section IIIA will get you started. (A short code illustration follows this list.)
- Section IV talks about programming environments that use delayed evaluation and just-in-time (or online) compilation. What are these two things, and why do they go together?
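If it helps to picture the memory-access question concretely, here is a small illustration (written as a sketch, not taken from the article) of the two patterns as CUDA kernels. In a strided access, each thread's address is a regular function of its thread index; in a gather, the address comes from a data-dependent index array, and a scatter is the analogous data-dependent write.

// Strided read: thread i reads in[i * stride]; the address is a regular
// function of the thread index and the stride.
__global__ void strided_read(const float *in, float *out, int stride, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        out[i] = in[i * stride];
}

// Gather: thread i reads in[idx[i]]; the address is looked up at run time,
// so it can be arbitrary and is not known until idx is read.
__global__ void gather_read(const float *in, const int *idx, float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        out[i] = in[idx[i]];
}

Both kernels would be launched from the host in the same way as the scale kernel sketched at the end of the Reading section.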
Submitting your writeup
Make sure that your writeup is named yourlastnameL12.txt, and then copy it to ~srivoire/cs385/submit. Wait 2 minutes, and then check that it was correctly submitted by visiting http://rivoire.cs.sonoma.edu/cs385/lab12sub.txt.