Author Archive

How we get improved performance on a single core - Part 3

Tuesday, July 15th, 2008
Posted By Stefanus Du Toit

This is the third and last post in a series about how we achieve improved performance over regular C++ code even when running on a single core. I’ve previously talked about our programming model and runtime program generation. In this post, I’ll discuss our runtime code generation mechanism.

As mentioned in my last post, RapidMind generates machine code at runtime. This is similar to just-in-time compilation, but it happens at very specific (and controllable) points in an application’s lifetime, typically during application initialization. Generating machine code for a specific hardware target is the responsibility of RapidMind’s backends, and each backend includes code generation for the targets it supports. For example, the OpenGL backend for GPUs generates OpenGL shading language programs corresponding to a user’s computations. Backends like the x86 and Cell backends generate machine code for those architectures using a custom code generation stack, including a backend optimizer, scheduler, register allocator, etc.

How we get improved performance on a single core - Part 2

Tuesday, May 27th, 2008
Posted By Stefanus Du Toit

In my last post I wrote that RapidMind-enabled C++ code often gets improved performance even on a single core. I gave one reason for this: our programming model. In this post I’d like to address another reason: runtime program generation. Read on for part two of this series on single core performance with RapidMind.

How we get improved performance on a single core - Part 1

Thursday, May 15th, 2008
Posted By Stefanus Du Toit

RapidMind is all about achieving the full performance potential of modern multi-core processors. Generally, total performance is a combination of two factors:

                          total performance = scalability across cores × per-core performance

It’s probably no surprise that RapidMind aims to provide excellent scalability across cores. What might surprise you is that we also spend a significant amount of effort on single-core performance, and RapidMind-enabled apps often significantly outperform code written in C/C++ without RapidMind. Some of our customer case studies show improvements like “25x faster than non-RapidMind-code on 8 cores”. That 25x number is made up of perfect scaling across 8 cores (8x), combined with a roughly 3x performance advantage even on a single core.
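To make the arithmetic behind that example concrete, here is a minimal sketch of the decomposition. The function names are made up for illustration and are not part of RapidMind; the only inputs are the 25x and 8-core figures quoted above.

```cpp
// Illustrative helpers only (hypothetical names, not a RapidMind API).

// total performance = scalability across cores x per-core performance
constexpr double total_speedup(double scaling_across_cores,
                               double per_core_speedup) {
    return scaling_across_cores * per_core_speedup;
}

// Given an observed total speedup and the scaling across cores,
// recover the per-core factor implied by the formula above.
constexpr double per_core_speedup(double observed_total,
                                  double scaling_across_cores) {
    return observed_total / scaling_across_cores;
}
```

With perfect scaling on 8 cores, per_core_speedup(25.0, 8.0) evaluates to 3.125, which is the "roughly 3x" single-core advantage; conversely, total_speedup(8.0, 3.0) is 24, so an exact 3x per-core gain would give 24x overall.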

How’s that possible? I’m going to explore this in my next few blog posts. Read on for the first reason why we get such good performance:

Reason 1: Our programming model

The difference between multi-core and multi-processing

Thursday, April 17th, 2008
Posted By Stefanus Du Toit

When discussing the shift to multi-core, I often hear people ask why multi-core, which is relatively new, is so different from multi-processing, which has been with us for decades.

First, let’s start with the basics. Multi-processing simply means putting multiple processors in one system. Symmetric Multi-Processing, or SMP, implies that all of these processors are identical; such a system is also called homogeneous. SMP systems have been around in the x86 world for a very long time, and there are software systems that take advantage of SMP well.

From a technical standpoint, the difference between multi-core and SMP is relatively minor. In an SMP system, each processor plugs into a different socket, and multiple processors are connected through some kind of bus. In a multi-core processor, the “core” logic of a processor is replicated multiple times on the same chip. Multiple cores may share data through some on-chip logic or shared caches. Multiple cores are presented to applications at the OS level in exactly the same way as multiple processors in an SMP system. Furthermore, you can mix the two together, e.g. by having an 8-core system with two processors, each containing four cores.
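As a small illustration of how applications see cores and processors the same way, the sketch below asks the C++ standard library how many hardware threads are available. It assumes a C++11 or later compiler (newer than this post), and the helper name is hypothetical. The call returns a single count and does not say whether those hardware threads come from separate sockets, from multiple cores on one chip, or from a mix of both.

```cpp
#include <thread>

// Hypothetical helper: number of hardware threads visible to this process.
// std::thread::hardware_concurrency() is only a hint and may return 0 when
// the value cannot be determined, so fall back to 1 in that case.
unsigned logical_processor_count() {
    const unsigned n = std::thread::hardware_concurrency();
    return n == 0 ? 1u : n;
}
```

On the 8-core, two-processor system described above, an application would typically get 8 back from this call (more if each core runs several hardware threads), the same figure it would see on a single 8-core chip or on eight single-core processors.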

So, why is programming multi-core considered so much more of a problem than programming SMP systems? It’s not because of some fundamental technical difference between multi-core and SMP. It’s because of why these technologies exist.
