Comments on: Put This In Your Pipe And Execute It

Argh. Sorry man. Well, I guess now that I've ruined the surprise you'll just have to wow us all with great results ;)

By: Marshall Robin (Thu, 03 Mar 2011 01:45:59 +0000)
/2011/03/02/put-this-in-your-pipe-and-execute-it/#comment-1170

Great post! I remember learning to program the PS3 in my final year of university, but that was a few years ago and plenty of other things have been piled into my head since, so it was getting somewhat foggy. Now that area of my brain has been reawakened.

By: Jaymin Kessler (Wed, 02 Mar 2011 22:18:38 +0000)
/2011/03/02/put-this-in-your-pipe-and-execute-it/#comment-1167

Thanks for the great article!

I’ve been experimenting with a tight loop like the one you unrolled and found that software pipelining saved me a lot of cycles. For those who don’t know, it involves changing the loop from

for (…)
{
    load
    process
    store
}

to

load
for (…)
{
    process
    store
    load next iteration
}

This obviously has the downside of duplicating code, so it suffers from the same concerns you raise for unrolling, but I found it extremely useful. Also note that the load isn’t always the best thing to move out of the loop: I actually duplicated the load and part of the process stage to get the best speedup. Oh, and an extra bonus of SPU programming is that the loads moved outside the loop can read from bad addresses (which could happen if the loop was never taken) and you won’t get a crash on the SPU for this.
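
As a rough illustration of the transformation described above, here is a minimal sketch in portable C. The `process` function and the `transform_simple` / `transform_pipelined` names are hypothetical stand-ins, not code from the original post. One assumption differs from the SPU case: portable C cannot safely read from a bad address, so this version guards the empty input and peels the final iteration into an epilogue instead of loading one element past the end.

```c
#include <stddef.h>

/* Hypothetical per-element work; stands in for the "process" stage. */
static float process(float x) { return x * 2.0f + 1.0f; }

/* Baseline: load, process and store all happen in the same iteration. */
void transform_simple(const float *in, float *out, size_t n) {
    for (size_t i = 0; i < n; ++i) {
        float v = in[i];        /* load */
        float r = process(v);   /* process */
        out[i] = r;             /* store */
    }
}

/* Software-pipelined: the load for iteration i+1 is issued during
 * iteration i, so it can overlap with the process/store of iteration i. */
void transform_pipelined(const float *in, float *out, size_t n) {
    if (n == 0) return;         /* never read in[0] when there is no data */

    float v = in[0];            /* prologue: first load, hoisted out of the loop */
    for (size_t i = 0; i + 1 < n; ++i) {
        float r = process(v);   /* process the value loaded one step earlier */
        out[i] = r;             /* store */
        v = in[i + 1];          /* load next iteration */
    }
    out[n - 1] = process(v);    /* epilogue: last element, nothing left to load */
}
```

The epilogue repeats the process and store stages, which is the code duplication mentioned above. Whether this actually saves cycles depends on the compiler and target, so it is worth measuring rather than assuming.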
