8/10/2010

08-10-10 - A small note on memset16

On the SPU I'm now making heavy use of a primitive op I call "memset16" ; by this I don't mean that it must be 16-byte aligned, but rather that it memsets 16-byte patterns, not individual bytes

void rrSPU_MemSet16(void * ptr,qword pattern,int count)
{
    qword * RADRESTRICT p = (qword *) ptr;
    char * end = ((char *)ptr + count );

    while ( (char *)p < end )
    {
        *p++ = pattern;
    }
}

(and yes I know this could be faster, this is the simple version for readability).

The interesting thing has been that taking a 16-byte pattern as input actually makes it way more useful than normal byte memset. I can now also memset shorts and words, floats, doubles, and vectors! So this is now the way I do any array assignments when a chunk of consecutive elements are the same. eg. instead of doing :


float fval = param;
float array[16];

for(int i=0;i<16;i++) array[i] = fval;

you do :

float array[16];
qword pattern = si_from_float(fval);

rrSPU_MemSet16(array,pattern,sizeof(array));

In fact it's been so handy that I'd like to have it on other platforms, at least up to U32.

1 comment:

RCL said...

Thanks for the idea. Quite implementable on other platforms, too.

old rants