Archive for May, 2008

How we get improved performance on a single core - Part 2

Tuesday, May 27th, 2008
Posted By Stefanus Du Toit

In my last post I described how RapidMind-enabled C++ code often gets improved performance even on a single core, and gave one reason for this: our programming model. In this post I’d like to address another reason: runtime program generation. Read on for part two of this series on single-core performance with RapidMind.

(more…)

Required Reading

Saturday, May 24th, 2008
Posted By Dr. Michael McCool

I frequently get asked to recommend articles on multi-core/many-core software development, GPU architectures, the Cell BE, and parallel programming models. Conveniently, special issues of three major technical publications have just appeared covering these very topics. The March/April ACM Queue discusses the use of GPUs as general-purpose computational engines; the May Proceedings of the IEEE covers commodity multi-core and many-core processors and programming models; and the April IEEE Computer targets “data-intensive computing” and includes a couple of interesting articles on high-performance multi-core processing and on the use of the Cell BE and GPUs for specific applications in database search and pattern analysis.

Of course, if you were really serious about multi-core and many-core development you would read these issues cover to cover, but in this post I’m going to comment on (and recommend) a subset of these articles.
(more…)

How we get improved performance on a single core - Part 1

Thursday, May 15th, 2008
Posted By Stefanus Du Toit

RapidMind is all about achieving the full performance potential of modern multi-core processors. Generally, total performance is a combination of two factors:

                          total performance = scalability across cores × per-core performance

It’s probably no surprise that RapidMind aims to provide excellent scalability across cores. What might surprise you is that we also put significant effort into single-core performance, and RapidMind-enabled apps often significantly outperform code written in C/C++ without RapidMind. Some of our customer case studies show improvements like “25x faster than non-RapidMind-code on 8 cores”. That 25x number is made up of perfect scaling across 8 cores, combined with a roughly 3x performance advantage even on a single core.
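As a quick check of that arithmetic, here is the formula above applied to the case-study figures quoted in this post (25x total on 8 cores, assuming perfectly linear scaling). The symbols S are introduced here purely for illustration and are not RapidMind terminology:

```latex
S_{\text{total}} = S_{\text{cores}} \times S_{\text{per-core}}
\quad\Longrightarrow\quad
S_{\text{per-core}} = \frac{S_{\text{total}}}{S_{\text{cores}}} = \frac{25}{8} = 3.125 \approx 3
```

So the quoted 3x is a rounded figure: a per-core factor of exactly 3 with perfect 8-core scaling would give 24x rather than 25x.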

How’s that possible? I’m going to explore this in my next few blog posts. Read on for the first reason why we get such good performance:

Reason 1: Our programming model

(more…)

Meaningful Benchmarks

Tuesday, May 13th, 2008
Posted By Dr. Michael McCool

Lately, I’ve been thinking about apples and oranges, as in the “comparing of.” In other words, benchmarks. The RapidMind platform enables the development of high-performance software, and so we often need to quantify the performance improvements made possible by our technology. However, benchmarks must be done with care in order to be meaningful, and I want to discuss a few of the issues here, along with our philosophy in setting up good benchmarks.

(more…)

Performance: What’s it For?

Thursday, May 1st, 2008
Posted By Dr. Michael McCool

It almost seems like a silly question: what good is higher performance? The answer depends on the context. Supercomputers, of course, are built for high performance, and are often built expressly to run application workloads that demand it. But what good is high performance on ordinary servers and desktops, and for what kinds of applications?

This is an important question for RapidMind, since we help developers squeeze every drop of performance out of all the processors in their computers. Fortunately, many applications need or want all the performance they can get.

(more…)