Parallel Computing Using CUDA and OpenCL
Parallel computing seems to be the up-and-coming technology these days. The two competing technologies are CUDA, an architecture introduced by NVIDIA, and OpenCL, which was initially proposed by Apple and is now standardised by the Khronos Group. NVIDIA supports OpenCL programming on the CUDA architecture, but CUDA programming seems to be the better choice there: CUDA is designed specifically for NVIDIA hardware and so knows how to handle it well, whereas OpenCL has to work on other hardware as well. For that reason, CUDA is usually recommended over OpenCL on the CUDA architecture. This is as far as my knowledge goes. Corrections and comments are welcome.




Comments
Parallel computing
Parallel computing on GPUs most definitely is an upcoming technology. There are, in my view, three major paths:
- GPGPU on shaders. This is the "classic" parallel processing on GPUs. Using GLSL, HLSL, Cg or even assembly language shaders (although the latter is very much obsolete), you can compute very efficiently by reformulating your computation to computations on textures. I would not rule this out; it is very efficient.
- CUDA. You get more control and get rid of the image format, which may be irrelevant for your problem. CUDA is pretty elegant, with the parallel kernels living side by side with the sequential code. Some of NVIDIA's tools are a bit clumsy to work with, and there are some gotchas and bugs that complicate things unnecessarily. The biggest disadvantage is that it is NVIDIA only: you must have NVIDIA hardware.
- OpenCL. This looks like a crossbreed of the other two. Just like with shaders, the kernel is a separate entity, delivered as text source to a compiler, and the API is not as elegant as CUDA. A big plus for OpenCL is that it requires no special tools beyond the OpenCL library. The API is a straight C API, so it can be used from just about any language and is very easy to get rolling in any IDE. Another plus is clearly the fact that it is designed to be hardware independent.
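To make the "kernels living side by side with the sequential code" point concrete, here is a minimal sketch of a CUDA vector addition. The kernel name `vecAdd`, the array size and the block size are just illustrative choices of mine, and error checking is left out to keep it short, so treat it as a sketch rather than production code.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Parallel part: each GPU thread adds one pair of elements.
// (vecAdd is a hypothetical name chosen for this example.)
__global__ void vecAdd(const float* a, const float* b, float* c, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)                      // guard: the last block may be partly unused
        c[i] = a[i] + b[i];
}

// Sequential part: ordinary host code in the same source file.
int main()
{
    const int n = 1 << 20;          // illustrative problem size
    const size_t bytes = n * sizeof(float);

    std::vector<float> h_a(n, 1.0f), h_b(n, 2.0f), h_c(n);

    // Allocate device buffers and upload the inputs.
    float *d_a, *d_b, *d_c;
    cudaMalloc((void**)&d_a, bytes);
    cudaMalloc((void**)&d_b, bytes);
    cudaMalloc((void**)&d_c, bytes);
    cudaMemcpy(d_a, h_a.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_b, h_b.data(), bytes, cudaMemcpyHostToDevice);

    // Launch enough blocks of 256 threads to cover all n elements.
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    vecAdd<<<blocks, threads>>>(d_a, d_b, d_c, n);

    // Copy the result back; this blocking copy also waits for the kernel.
    cudaMemcpy(h_c.data(), d_c, bytes, cudaMemcpyDeviceToHost);
    printf("c[0] = %f\n", h_c[0]);

    cudaFree(d_a);
    cudaFree(d_b);
    cudaFree(d_c);
    return 0;
}
```

In OpenCL the kernel body would look almost the same, but it would be handed to the runtime as a text string (via `clCreateProgramWithSource` and `clBuildProgram`) and launched through C API calls instead of the `<<<...>>>` syntax. That is the "separate entity" difference described above.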
I am very interested in this technology and include it in my courses, and it is also part of my research interests. That makes it very frustrating when I can't make my MacMini/8400M deliver any half-decent OpenCL performance. More about that in its own thread.