Required Reading

I frequently get asked to recommend articles on multi-core/many-core software development, GPU architectures, the Cell BE, and parallel programming models. Conveniently, special issues of three major technical publications have just appeared covering these very topics. The March/April ACM Queue discusses the use of GPUs as general-purpose computational engines; the May Proceedings of the IEEE covers commodity multi-core and many-core processors and programming models; and finally, the April IEEE Computer targets “data-intensive computing” and includes a couple of interesting articles on high-performance multi-core processing and on the use of the Cell BE and GPUs for specific applications in database search and pattern analysis.

Of course, if you were really serious about multi-core and many-core development, you would read these issues cover to cover, but in this post I’m going to comment on (and recommend) a subset of these articles.

The March/April ACM Queue special issue is specifically about using GPUs for general-purpose computation (something we’ve been doing since at least 1999). The whole issue has excellent coverage of this topic, but I would especially recommend the article entitled “GPUs: A closer look” by Kayvon Fatahalian and Mike Houston. This article describes the architecture of contemporary GPUs in detail. GPUs have a fairly complex architecture with multiple levels of hardware parallelism: multiple cores, massive multithreading, and SIMD tiling, along with very aggressive mechanisms for latency hiding. While programming platforms like RapidMind do abstract away much of this complexity, the details are interesting and useful to know when performance tuning. For example, the article explains why using too much state in a GPU kernel can decrease performance due to resource oversubscription, and why it’s necessary to give a GPU lots (and lots and lots) of parallelism to work with for maximum performance.
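To make the point about kernel state a bit more concrete, here is a rough back-of-the-envelope sketch in Python. It is not taken from the article, and every number in it (the register file size, the thread limit, the latency figures) is a made-up assumption rather than the spec of any real GPU. The idea is only that a core's register file is shared among all the threads resident on it, so the more state each thread needs, the fewer threads are available to cover memory latency.

```python
import math


def resident_threads(regs_per_thread, regs_per_core=16384, max_threads_per_core=1024):
    """Threads that can be resident on one core at once.

    regs_per_core and max_threads_per_core are hypothetical values chosen
    for illustration, not the parameters of a real device.
    """
    if regs_per_thread <= 0:
        raise ValueError("regs_per_thread must be positive")
    # The register file is divided among resident threads, up to the hardware cap.
    return min(max_threads_per_core, regs_per_core // regs_per_thread)


def occupancy(regs_per_thread, regs_per_core=16384, max_threads_per_core=1024):
    """Fraction of the core's thread slots that are actually filled."""
    threads = resident_threads(regs_per_thread, regs_per_core, max_threads_per_core)
    return threads / max_threads_per_core


def threads_to_hide_latency(latency_cycles, compute_cycles_per_access):
    """Rough count of threads needed so that some thread always has work
    while the others wait on memory (a simplified model, assumed here)."""
    return math.ceil(latency_cycles / compute_cycles_per_access)


if __name__ == "__main__":
    for regs in (16, 32, 64, 128):
        print(regs, resident_threads(regs), occupancy(regs))
    print(threads_to_hide_latency(400, 20))
```

With these invented numbers, going from 16 to 64 registers per thread cuts the resident threads on a core from 1024 to 256, which is the kind of oversubscription effect the article describes. Real hardware has more limits than this (shared memory, thread-group granularity, and so on), so treat it as intuition rather than a tuning tool.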

The May special issue of the Proceedings of the IEEE on “Cutting-Edge Computing” also contains a number of useful articles, including one I wrote that surveys scalable programming models, among them the SPMD programming model used by RapidMind. This article compares and contrasts a number of multi-core architectures (including CPUs, the Cell BE, and both NVIDIA and ATI GPUs), discusses both task and data-parallel programming models, and shows how the SPMD model maps efficiently onto all of these architectures. This issue also includes good articles on the evolution of GPUs, a discussion of mobile GPUs, and a discussion of simulation, recognition, and synthesis workloads.
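For readers who haven't met the term, the basic shape of SPMD (single program, multiple data) can be sketched in a few lines of Python. This is only an illustration of the structure and is not the RapidMind API: `saxpy_kernel` and `run_spmd` are hypothetical names invented for this sketch, and Python threads will not deliver real parallel speedup for this kind of arithmetic. The point is that one program is written for a single element, and a runtime decides how to spread the index range over however many workers the hardware offers.

```python
from concurrent.futures import ThreadPoolExecutor


def saxpy_kernel(i, a, x, y):
    """The single program: computes one output element from its index."""
    return a * x[i] + y[i]


def run_spmd(kernel, n, workers, *args):
    """Run the same kernel on indices 0..n-1, split into contiguous chunks.

    Hypothetical helper for illustration; a real runtime would map chunks
    onto cores, SIMD lanes, or GPU threads instead of Python threads.
    """
    if workers < 1:
        raise ValueError("workers must be at least 1")
    if n == 0:
        return []

    def run_chunk(bounds):
        start, stop = bounds
        return [kernel(i, *args) for i in range(start, stop)]

    chunk = -(-n // workers)  # ceiling division
    bounds = [(s, min(s + chunk, n)) for s in range(0, n, chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields chunk results in submission order, so output order is kept.
        parts = list(pool.map(run_chunk, bounds))
    return [value for part in parts for value in part]


if __name__ == "__main__":
    x = [1, 2, 3, 4, 5]
    y = [10, 10, 10, 10, 10]
    print(run_spmd(saxpy_kernel, len(x), 2, 2.0, x, y))
```

Because the kernel never says how many workers exist or how the data is split, the same program can in principle be mapped onto a handful of CPU cores or thousands of GPU threads just by changing the runtime.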

The final special issue, of IEEE Computer, does not directly target multi-core or many-core processors, but does discuss data-intensive workloads such as data mining and recognition tasks. The workloads discussed include both image and string data search, and cover applications in GIS, medical imaging, and spam detection. These articles discuss strategies for parallelizing these workloads on many-core processors, including the Cell BE and GPUs.

Although we’ve been involved in this revolution since the beginning, it’s nice to see that the community is starting to take many-core architectures and programming more seriously. There are tremendous opportunities here, and a fundamental shift in how computers are built and programmed is underway that will lead to vastly improved performance.
