Error attempting to run Apple Matrix Transpose example.
Hi Folks,
I thought I'd try this after watching tutorial 4 on OpenCL.
I have to say the tutorials are really fantastic. Thanks so much for this.
So, I downloaded and ran the Matrix Transpose example. I had to set the GPU flag to 0 as I have a non-compatible video card.
Theoretically it should still run OK, right?
Unfortunately it stops on an error at this call:
err = clEnqueueNDRangeKernel(queue, kernel, 2, NULL, global, local, 0, NULL, NULL);
if (err)
{
    printf("Error: Failed to execute kernel err %d!\n",err);
    return EXIT_FAILURE;
}
Printing...
Error: Failed to execute kernel err -54!
I suppose my questions are firstly, is this the best forum for this sort of thing or is it too specific?
Secondly, is there a list of OpenCL error codes with explanations anywhere? I tried the spec but couldn't see it.
Thirdly, does anyone have any idea about this error?
Any advice would be much appreciated. I am of course an OpenCL newbie!
Thanks,
Max




Re: Error attempting to run Apple Matrix Transpose example.
That example most probably won't work on the CPU. My guess is that the local work size is greater than 1.
Dave
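On the question of an error-code list: the numeric codes are defined as macros in the Khronos cl.h header, and -54 there is CL_INVALID_WORK_GROUP_SIZE, which fits the guess about the local work size. Below is a minimal sketch of a lookup helper. The function name cl_error_name is made up for illustration, and the table only covers the OpenCL 1.0 codes most likely to come up, with values copied from cl.h.

```c
#include <stddef.h>

/* Hypothetical helper (not part of OpenCL): maps a numeric OpenCL
   error code to the macro name used in the Khronos cl.h header.
   Only a subset of the OpenCL 1.0 codes is listed. */
static const char *cl_error_name(int err)
{
    static const struct { int code; const char *name; } table[] = {
        {   0, "CL_SUCCESS" },
        {  -1, "CL_DEVICE_NOT_FOUND" },
        {  -2, "CL_DEVICE_NOT_AVAILABLE" },
        {  -3, "CL_COMPILER_NOT_AVAILABLE" },
        {  -4, "CL_MEM_OBJECT_ALLOCATION_FAILURE" },
        {  -5, "CL_OUT_OF_RESOURCES" },
        {  -6, "CL_OUT_OF_HOST_MEMORY" },
        { -11, "CL_BUILD_PROGRAM_FAILURE" },
        { -30, "CL_INVALID_VALUE" },
        { -33, "CL_INVALID_DEVICE" },
        { -34, "CL_INVALID_CONTEXT" },
        { -36, "CL_INVALID_COMMAND_QUEUE" },
        { -38, "CL_INVALID_MEM_OBJECT" },
        { -45, "CL_INVALID_PROGRAM_EXECUTABLE" },
        { -48, "CL_INVALID_KERNEL" },
        { -49, "CL_INVALID_ARG_INDEX" },
        { -50, "CL_INVALID_ARG_VALUE" },
        { -51, "CL_INVALID_ARG_SIZE" },
        { -52, "CL_INVALID_KERNEL_ARGS" },
        { -53, "CL_INVALID_WORK_DIMENSION" },
        { -54, "CL_INVALID_WORK_GROUP_SIZE" },
        { -55, "CL_INVALID_WORK_ITEM_SIZE" },
        { -56, "CL_INVALID_GLOBAL_OFFSET" },
        { -57, "CL_INVALID_EVENT_WAIT_LIST" },
        { -61, "CL_INVALID_BUFFER_SIZE" },
        { -63, "CL_INVALID_GLOBAL_WORK_SIZE" },
    };
    for (size_t i = 0; i < sizeof table / sizeof table[0]; ++i)
        if (table[i].code == err)
            return table[i].name;
    return "unknown OpenCL error code";
}
```

In the sample's error branch, the printf could then print cl_error_name(err) alongside the number, so a failed clEnqueueNDRangeKernel reports the macro name rather than a bare -54.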
Re: Error attempting to run Apple Matrix Transpose example.
Hi Dave,
My understanding was that code will always run, but that it can default to the CPU if a GPU is not available.
Is that flawed? Is the truth that a lot of code will be GPU only?
Cheers,
Max
Re: Error attempting to run Apple Matrix Transpose example.
Hi Max,
The kernel needs to be given parameters that agree with what the implementation supports. In the case of OpenCL on Mac OS X, for the CPU version to run, the local size MUST be 1. Any value greater than that and CL won't run. You can get around this in other instances by setting a flag if you get a CPU device and then using an if/else to choose the local workgroup size.
Note that in the case of the matrix transpose, you probably won't get better performance than, say, using Accelerate for the transpose on the CPU. The reason this works so well on the GPU is that values are being permuted in shared memory. I talk about this in one of the episodes, but I can't remember which.
Dave
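The if/else idea above can be generalised a little by shrinking the local size to whatever limits the device reports, rather than hard-coding a CPU case. The sketch below is only an illustration: pick_local_size is a hypothetical helper, and it assumes the caller has already queried max_wg with clGetKernelWorkGroupInfo (CL_KERNEL_WORK_GROUP_SIZE) and max_item with clGetDeviceInfo (CL_DEVICE_MAX_WORK_ITEM_SIZES). It also assumes power-of-two sizes, so that a local size that divided the global size still divides it after halving.

```c
#include <stddef.h>

/* Hypothetical helper (not part of OpenCL): chooses a 2D local work size.
   preferred : edge length wanted on a capable device, e.g. 16
   max_wg    : limit on local[0] * local[1], as reported for the kernel
   max_item  : per-dimension limits, as reported for the device
   local     : receives the chosen size */
static void pick_local_size(size_t preferred, size_t max_wg,
                            const size_t max_item[2], size_t local[2])
{
    local[0] = preferred;
    local[1] = preferred;

    /* Respect the per-dimension limits first. */
    for (int d = 0; d < 2; ++d)
        while (local[d] > 1 && local[d] > max_item[d])
            local[d] /= 2;

    /* Then halve the larger (or equal) dimension until the product fits. */
    while (local[0] * local[1] > max_wg && (local[0] > 1 || local[1] > 1)) {
        if (local[1] >= local[0])
            local[1] /= 2;
        else
            local[0] /= 2;
    }
}
```

On a device that only allows a work-group of 1, this yields a 1 x 1 local size, which matches the CPU case described above. A kernel that hard-codes its tile size in local memory may still need changes before a smaller local size gives correct results.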
Re: Error attempting to run Apple Matrix Transpose example.
Hi Dave,
That's brilliant, thanks.
I'm working through the Matrix Transpose sample line by line to figure how it handles the local memory permutation.
Your presentations four and five have been a great help in understanding this conceptually.
Don't suppose you'd know anywhere that covers the example in more detail?
Cheers,
Max
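For anyone following along, the local memory permutation can be sketched in plain C on the CPU. This is not the Apple sample's kernel, just an illustration of the general tiled-transpose technique under stated assumptions: the name transpose_tiled and the TILE value of 4 are made up, and the small tile array stands in for the work-group's local memory. On a GPU, each loop body would be one work-item, with a barrier between the two phases.

```c
#include <stddef.h>

#define TILE 4  /* illustrative tile edge, not taken from the sample */

/* Transposes the row-major rows x cols matrix `in` into the
   row-major cols x rows matrix `out`, one tile at a time. */
static void transpose_tiled(const float *in, float *out,
                            size_t rows, size_t cols)
{
    float tile[TILE][TILE];  /* stands in for local memory */

    for (size_t by = 0; by < rows; by += TILE) {
        for (size_t bx = 0; bx < cols; bx += TILE) {
            /* Phase 1: read the input tile row by row, so that
               neighbouring reads touch neighbouring addresses. */
            for (size_t y = 0; y < TILE && by + y < rows; ++y)
                for (size_t x = 0; x < TILE && bx + x < cols; ++x)
                    tile[y][x] = in[(by + y) * cols + (bx + x)];

            /* A GPU kernel would place a work-group barrier here. */

            /* Phase 2: write the output row by row as well; the
               transposition happens in the tile index swap. */
            for (size_t y = 0; y < TILE && bx + y < cols; ++y)
                for (size_t x = 0; x < TILE && by + x < rows; ++x)
                    out[(bx + y) * rows + (by + x)] = tile[x][y];
        }
    }
}
```

The point of the tile is that both the global read and the global write walk along rows, and only the small staging buffer is accessed with swapped indices.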