Tag Archives: csharp

OpenCL Cookbook: Hello World using C# Cloo host binding

So far I’ve used the C and C++ bindings in the OpenCL Cookbook series. This time I provide a quick and simple example of how to use Cloo – the C# OpenCL host binding. However, since Cloo, for whatever reason, didn’t work as expected with a char array I will use an integer array instead. In other words – instead of sending a “Hello World!” message to the kernel I will send five integers instead. My guess is that there is some sort of bug with Cloo and char arrays.

Device code using Cloo’s variant of the OpenCL language

kernel void helloWorld(global read_only int* message, int messageSize) {
	for (int i = 0; i < messageSize; i++) {
		printf("%d", message[i]);

The kernel above is merely illustrative in that it simply receives an integer array and its size and prints the array.

Note that the OpenCL syntax here is not the same as in C/C++. It has additional keywords to say whether the arguments are read only or write or read write and the kernel keyword is not prefixed with two underscores. The Cloo author must have decided that the original OpenCL syntax was for whatever reason unsuitable for adoption which IMO was a mistake. The OpenCL language syntax should be standard for portability, reusability and also so that there is only a single learning curve.

Host code using Cloo API

using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;
using System.IO;
using Cloo;

namespace test
    class Program
        static void Main(string[] args)
            // pick first platform
            ComputePlatform platform = ComputePlatform.Platforms[0];

            // create context with all gpu devices
            ComputeContext context = new ComputeContext(ComputeDeviceTypes.Gpu,
                new ComputeContextPropertyList(platform), null, IntPtr.Zero);

            // create a command queue with first gpu found
            ComputeCommandQueue queue = new ComputeCommandQueue(context,
                context.Devices[0], ComputeCommandQueueFlags.None);

            // load opencl source
            StreamReader streamReader = new StreamReader("..\..\kernels.cl");
            string clSource = streamReader.ReadToEnd();

            // create program with opencl source
            ComputeProgram program = new ComputeProgram(context, clSource);

            // compile opencl source
            program.Build(null, null, null, IntPtr.Zero);

            // load chosen kernel from program
            ComputeKernel kernel = program.CreateKernel("helloWorld");

            // create a ten integer array and its length
            int[] message = new int[] { 1, 2, 3, 4, 5 };
            int messageSize = message.Length;

            // allocate a memory buffer with the message (the int array)
            ComputeBuffer<int> messageBuffer = new ComputeBuffer<int>(context,
                ComputeMemoryFlags.ReadOnly | ComputeMemoryFlags.UseHostPointer, message);

            kernel.SetMemoryArgument(0, messageBuffer); // set the integer array
            kernel.SetValueArgument(1, messageSize); // set the array size

            // execute kernel
            queue.ExecuteTask(kernel, null);

            // wait for completion

The C# program above uses the Cloo object oriented api to interface with the underlying low level opencl implementation. It’s pretty self explanatory if you’ve been following the series so far. The output of the program is 12345.