I was recently discussing aliasing problems in the context of code optimization, and something struck me: most people aren’t aware of very simple cases where aliasing just kills your performance. Before studying some simple code, let’s explain what aliasing is and how it annoys your tiny precious compiler when it comes to optimization.
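Aliasing refers to a simple fact: the same piece of memory can be accessed through different symbolic names. When a function such as findItem loops directly on its reference parameter item_index, the memory containing item_index is written just before the same memory is read. Because item_index is a reference, the compiler must assume that other memory accesses might touch it, so it cannot simply keep the counter in a register. The CPU then stalls for many cycles waiting for the store and the read to finish. Fixing this problem is easy: use a local variable. The same code again, but fixed:

bool findItem( int & item_index, const string & value, const string_table & table )
{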
    int count = table.size();
    for( int local_item_index = 0; local_item_index < count; ++local_item_index )
    {
        if( table[ local_item_index ] == value )
        {
            item_index = local_item_index;
            return true;
        }
    }
    return false;
}

Aliasing can hurt performance even when the piece of code does not seem critical. But for critical code, it can be worse. Let’s imagine a 4x4 matrix multiplication:

r->xx = a->xx * b->xx + a->xy * b->yx + ….
r->xy = a->xx * b->xy + a->xy * b->yy + ….

If no care is taken, a->xx will be read four times, and so will every other element. After r->xx has been written, the compiler must assume that any element of any matrix may have changed, since r->xx might point to the same memory. Once again, __restrict or local variables can be used to fix the problem.

Consider aliasing early in your code. It can improve performance without much work. If you use the __restrict keyword, you must respect the contract you made with the compiler (asserts are your friends for checking that the conditions hold). With local variables, the code remains valid whatever the caller passes, but might be a little less optimal. Whichever technique you decide to use, I recommend reading the generated code. You might be surprised!
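To make the matrix case concrete, here is a minimal sketch of both fixes. It is only an illustration under assumptions: Matrix4, multiplyLocal and multiplyRestrict are hypothetical names, and the matrix is stored as a 4x4 array rather than the named members (xx, xy, …) used above. __restrict is a compiler extension accepted by GCC, Clang and MSVC, not standard C++.

```cpp
#include <cassert>

// Hypothetical matrix type for this sketch (array storage instead of xx, xy, ... members).
struct Matrix4
{
    float m[4][4];
};

// Fix 1: local copies. The inputs are copied before anything is written to r,
// so the function stays correct even when r points to the same matrix as a or b.
void multiplyLocal( Matrix4 * r, const Matrix4 * a, const Matrix4 * b )
{
    const Matrix4 la = *a;
    const Matrix4 lb = *b;
    for( int i = 0; i < 4; ++i )
    {
        for( int j = 0; j < 4; ++j )
        {
            float sum = 0.0f;
            for( int k = 0; k < 4; ++k )
            {
                sum += la.m[ i ][ k ] * lb.m[ k ][ j ];
            }
            r->m[ i ][ j ] = sum;
        }
    }
}

// Fix 2: __restrict. The caller promises that r does not overlap a or b,
// which lets the compiler keep already-loaded elements in registers.
void multiplyRestrict( Matrix4 * __restrict r, const Matrix4 * __restrict a, const Matrix4 * __restrict b )
{
    // Check the contract in debug builds: the output must be a distinct matrix.
    assert( r != a && r != b );
    for( int i = 0; i < 4; ++i )
    {
        for( int j = 0; j < 4; ++j )
        {
            float sum = 0.0f;
            for( int k = 0; k < 4; ++k )
            {
                sum += a->m[ i ][ k ] * b->m[ k ][ j ];
            }
            r->m[ i ][ j ] = sum;
        }
    }
}
```

The local-copy version accepts an in-place call such as multiplyLocal( &a, &a, &b ), at the cost of two copies. The __restrict version avoids the copies, but calling it with r equal to a or b breaks the promise made to the compiler, which is why the assert is there. Whether either variant actually produces better code depends on the compiler and target, so checking the generated assembly is still the way to know.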