Tag Archives: apple

The new Apple iPhone 5 is announced!

The new Apple iPhone 5 has just been announced.

Have a look at the comparison page to get an at-a-glance idea of what’s different. I love the un-Apple like two tone colour scheme which looks more rugged and yet still chic and fashionable and I love the fact that, despite all the new features, it is taller, thinner and lighter and has greater batter life than all previous iPhones. And that I’m happy with. On a geeky note – I’m very glad to see that the iPhone 5 now supports both 2.4GHz *and* 5GHz band 802.11n access. I’ll explain why this matters.

The 2.4GHz band is a mainstream band and is used by a lot of household devices and is more prone to interference and speed cuts. The 5GHz band, however, is used only by a select few devices and is far less congested and therefore suffers far less interference and provides considerably faster speeds. It is also immune to interference from the 2.4GHz band. And devices supporting both will generally pick the better band. And this is why Apple are saying that the new iPhone 5 has ultra fast wireless.

Recently, I moved into a new flat and found that my wifi speeds dropped from 5m to <1m. I worked out that this was because my router only supported the 2.4GHz band. So I bought an Apple Airport Express to create a wifi hotspot off an ethernet cable from the original router and transmit on both bands simultaneously. I found that installing this new hotspot restored both my iPad and Macbook Air wifi speeds back up to 5m (my maximum). However my iPhone 4S supporting only 2.4GHz remained at <1m. And now the new iPhone 5 needn't! So the final piece of the jigsaw is in place! Until next time! Overall, I'm pretty happy with this release and will, most probably, be pre-ordering on Friday. Will you? Update: I used the Apple Store app on the iPhone 4S to order the new iPhone 5 in 32GB Black before the web based Apple Store came online! It’ll be delivered on the 21st. I have a 64GB handset right now but was only using around 16GB so I decided to compromise and save £100. However, in reality, the real cost of the iPhone 5 to me is actual almost nil. My mother lost her iPhone 4 recently and the insurance company is paying out £525 to her which she will pass on to me. I will give her my old iPhone 4S once the new one arrives. It’s all worked out perfectly.

How to prepare for the theft of your beloved iPhone

Yes – you guessed it. I’m writing this article because it happened to me. Well – not me exactly – my Mother but I was there not far ahead. It was a dark, cold and rainy night. We had just descended into the depths of Piccadilly Underground Station. Unusually, probably as a result of the onset of heavy rain, a massive backlog of people had gathered trying to get in through the electronic gates. We joined the one strand of the widest queue I’ve ever seen and waited patiently. I was a few people ahead of her at the time. A few minutes later, soon after we’d gone through the electronic gates and down the escalator, she noticed and revealed the loss of a small shoulder satchel containing her iPhone.

Somewhere between the end of the queue and the electronic gates someone or a group of perpetrators had not only stolen her iPhone but the bag containing it too. How did they do it? Well my guess would be – they probably snipped the strap using a knife or scissors and pulled the bag from underneath her shoulder. Given the amount of pushing and shoving that was going on at the time in such a large crowd it wouldn’t have been a difficult trick to pull off.

Then the panic, confusion and distress began. Logically, she retraced her steps all the way from where she was to where she came in to the station but found nothing. I, in the mean time, remained composed and tried to call the phone in a futile attempt to locate it which initially failed due to calls going straight to voicemail which meant it was still underground. I also repeatedly loaded ‘Find my iPhone’ but it failed to locate this device. I continued to repeatedly call the phone and to try to locate it but to no avail. Little did I know that Find my iPhone wasn’t working because I’d disabled iCloud (damn you Apple). After a while someone answered my call but didn’t speak – by this time I knew it was out but I couldn’t track it.

Having spoken to the underground staff, only to find out that certain suspicious characters who’d been seen there not moments ago were all too well known, we left for home that night not only in a state of shock at the financial and personal loss we’d incurred but also in disbelief that this had happened to us for the first time despite not being careless and provoking such a crime.

We didn’t contact the police as it certainly didn’t appear as though they could recover it and we didn’t really need them for anything else. We also felt somewhat amateur for not insuring the phone and vowed to insure our valuables in the future. In due time we were to realise that not calling the police would turn out to be a fatal mistake. We would discover that her iPhone was in fact covered by her personal possessions cover under her home insurance but that not calling the police within 24 hours of the crime would impede or even prevent the insurance company from processing the claim.

As I reinstated an old iPhone 3GS with a new sim for her I realised that I hadn’t backed up her stolen iPhone since the iOS 5.1.1 update came out on 8 May. It had been four months since then and all activity on her phone since that date was lost. Disappointed at myself for making such a novice mistake I restored that outdated backup and that’s when I learnt about itunes wifi-sync and automatic icloud backup.

Throughout this whole process I’ve learnt a heck of a lot about how to prepare for such an eventuality and what to do when it finally happens. Needless to say given that this is the first time anything like this had happened to us. We were ill-prepared and clueless and as a result we suffered needless distress and financial loss.

I’m writing this article to pass on my newly gained wisdom to you so that you can be better prepared for the theft of your beloved iPhone and know exactly what to do in what order if this ever happens to you. Share it with everyone. The thieves may be professionals and turn off the phone right away and remove the sim. Or, as was the case with this particular theft, they may be stupid and leave the phone on leaving you free to track its movements as well as the thieves themselves. If there’s one thing I’ve learnt always be prepared. Then, whatever happens happens.

This guide is divided conveniently into what to do before and after the theft of your iPhone. Although it’s fairly iPhone and UK focused it’ll apply to any smartphone equally I think these days as well as applying to the vast majority of people worldwide. Note: I haven’t included any paid apps in this account as I personally feel the free apps are sufficient.

What to do before the theft of your iPhone

Insure your iPhone

If you have home insurance your iPhone will already be insured away from home as long as you have personal possessions cover which you most likely do. So do not get separate mobile phone insurance. If you, however, do not have home or contents insurance the cheapest way I found of getting mobile phone and gadget cover was to get a private bank account.

The best one I found of all the private bank accounts (best value for money and abundance of features) was Cooperative Bank Privilege Current Account with the Gadget benefit pack. It also offers worldwide family annual travel insurance. Overall it turned out to be much cheaper than getting all this cover separately.

Carry your iPhone securely

Carry it somewhere where you’ll notice its loss. I carry it in my trouser pocket where I would feel its absence. My mother was carrying it in a small shoulder satchel which disappeared entirely. I’m guessing the thieves snipped the strap and pulled the whole thing. She didn’t notice until a few minutes later.

Enable itunes wifi sync and automatic icloud backup

Enabling these will ensure that, even if you forget to backup your phone manually, it will still be automatically backed up regularly to allow you to restore as recent a backup as possible. Backups matter.

  

Set a handset PIN and SIM PIN

Go to Settings > General > Passcode Lock > Turn Passcode On and set a handset pin. Disable Voice Dial to prevent misuse by thief and disable Erase Data because if your phone wipes itself you will not be able to track it. Go to Settings > Phone > SIM PIN and set a different SIM PIN. This is so that if your sim is reused in another handset it won’t work without the PIN.

  

Set up ‘Find my iPhone’

Install Find my iPhone (iTunes).

Enable it in Settings > Location Services > Find my iPhone. Enable it also in Settings > iCloud > Find my iPhone.

    

Install Prey

Install Prey (iTunes). Enable it in Settings > Location Services > Prey.

  

Open up Prey and log in.

  

Allow it to use your location and enable Camouflage mode. Camouflage mode makes the app look like a game when opened.

    

Hide both Find my iPhone and Prey icons

Hide both tracking apps in a folder labelled something entirely unrelated. The thief might still find it but you’re making it less likely. We enable Restrictions as an additional measure below to prevent the thief from deleting these apps or disabling tracking.

Enable Restrictions

Go to Settings > General > Restrictions and enable it. Enter a different pin to the handset and sim PINs above. This is critical. If the thief forces you to give him or her the handset pin they will still need an additional Restrictions pin to disable tracking or to delete tracking apps but by then you’ll be long gone.

Disable Deleting Apps. Then select Restrictions > Location Services > Don't Allow Changes.

    

What to do after the theft of your iPhone

Report to Police

Report the theft to the Police right away to get your crime reference number which you require to make an insurance claim. Do not waste time. Do it within minutes and at the very latest within 24 hours. Otherwise the insurance company may not process your claim.

Report to Lost Property

The Police may refuse to complete filing your report until you’ve contacted Lost Property wherever you lost it. So contact them. File a report online at the Lost Property website if need be. And then call the Police back to complete the remainder of the process.

Change all your passwords

Change all your passwords – particularly email (gmail) and apple id passwords.

Track using Find my iPhone

Track the thief using Find my iPhone and iCloud.

Track using Prey

Log in to Prey, mark as missing and track the thief.

Notify network operator

If you have failed to track and reclaim your phone notify your network operator. They will bar the sim and bar the handset from all networks. Once you’ve done this you will not be able to track your phone anymore!

File an insurance claim

In order to file an insurance claim you need to fulfil the following prerequisites.

  • Provide proof of purchase (receipt) or proof that you previously owned this product.
  • Acquire a crime reference number from the Police.
  • Get proof from your network operator that the sim has been barred.

Once you’ve got all the above file an insurance claim as soon as possible. Don’t give them the satisfaction of turning you down because you came too late.

Have you been a victim? Did you do the above? How did it work out in the end? Did the above help you? Let me know in the comments.

OpenCL Cookbook: Hello World using C host binding

In our OpenCL Cookbook series so far we’ve looked at some preliminary data structures in OpenCL host programming using the C language. This time – we finally arrive at a complete end-to-end example – the customary Hello World!

What this example does is simple. The host program in C passes a character array to the GPU into which the GPU writes the characters of the phrase: “Hello, World!”. The host program then reads the contents of the character array back and prints them on the screen. The output should be “Hello, World!”.

The code is annotated using brief comments. There are some aspects of OpenCL that are new that I have not yet been through in previous articles but don’t worry I’ll go through a full dissection after presenting the complete code.

Note that error handling has been taken out completely to keep the program short for easy viewing. The most important aspect of error handling in the program below is checking the build error, status and log for a failed program build which you can find further details of in my previous post.

Host source

#include
#include
#ifdef __APPLE__
#include <OpenCL/opencl.h>
#else
#include <CL/cl.h>
#endif

int main() {

    cl_platform_id platform; cl_device_id device; cl_context context;
    cl_program program; cl_kernel kernel; cl_command_queue queue;
    cl_mem kernelBuffer;

    FILE* programHandle; char *programBuffer; char *programLog;
    size_t programSize; char hostBuffer[32];

    // get first available sdk and gpu and create context
    clGetPlatformIDs(1, &platform, NULL);
    clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, 1, &device, NULL);
    context = clCreateContext(NULL, 1, &device, NULL, NULL, NULL);

    // get size of kernel source
    programHandle = fopen("helloWorld.cl", "r");
    fseek(programHandle, 0, SEEK_END);
    programSize = ftell(programHandle);
    rewind(programHandle);

    // read kernel source into buffer
    programBuffer = (char*) malloc(programSize + 1);
    programBuffer[programSize] = '\0';
    fread(programBuffer, sizeof(char), programSize, programHandle);
    fclose(programHandle);

    // create and build program
    program = clCreateProgramWithSource(context, 1,
            (const char**) &programBuffer, &programSize, NULL);
    free(programBuffer);
    clBuildProgram(program, 1, &device, "-Werror -cl-std=CL1.1", NULL, NULL);

    // create kernel and command queue
    kernel = clCreateKernel(program, "hello", NULL);
    queue = clCreateCommandQueue(context, device, 0, NULL);

    // create kernel argument buffer and set it into kernel
    kernelBuffer = clCreateBuffer(context, CL_MEM_WRITE_ONLY,
            32 * sizeof(char), NULL, NULL);
    clSetKernelArg(kernel, 0, sizeof(cl_mem), &kernelBuffer);

    // execute kernel, read back the output and print to screen
    clEnqueueTask(queue, kernel, 0, NULL, NULL);
    clEnqueueReadBuffer(queue, kernelBuffer, CL_TRUE, 0,
            32 * sizeof(char), hostBuffer, 0, NULL, NULL);
    puts(hostBuffer);

    clFlush(queue);
    clFinish(queue);
    clReleaseKernel(kernel);
    clReleaseProgram(program);
    clReleaseMemObject(kernelBuffer);
    clReleaseCommandQueue(queue);
    clReleaseContext(context);
    return 0;

}

The host source runs on the CPU and is written in C in this case though you could also write it in C++, Python or Java whereas the kernel source runs on a device which could be one or more CPUs, GPUs or accelerators. The host source must be written in a host language whereas the kernel source must be written in OpenCL.

Host source by dissection

Here I describe what the host source is doing by dissecting it. A hello world example should ideally be entirely self contained and not rely on other articles to complement the reader’s understanding. With the exception of error handling and particularly how to debug a failed program build which I address elsewhere this example is self contained.

Below I present one snippet of code at a time followed by its dissection.

Creating platforms, devices and contexts

// get first available sdk and gpu and create context
clGetPlatformIDs(1, &platform, NULL);
clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, 1, &device, NULL);
context = clCreateContext(NULL, 1, &device, NULL, NULL, NULL);

Here I first get a platform (an OpenCL SDK/framework). As I know I only have the Apple OpenCL framework installed on my Mac it will always be the one selected. However, if you have multiple SDKs installed such as AMD, Nvidia and Intel then you may want to select one explicitly. Next I ask for a GPU device. Once again, my machine only has one GPU so it will always be the one that’s selected but if you have multiple GPUs installed you may want to choose one in particular. Finally I create a context which is an incredibly important OpenCL data structure as it is required for the creation of numerous other structures such as programs, command queues and kernel buffers.

Loading kernel sources

// get size of kernel source
programHandle = fopen("helloWorld.cl", "r");
fseek(programHandle, 0, SEEK_END);
programSize = ftell(programHandle);
rewind(programHandle);

// read kernel source into buffer
programBuffer = (char*) malloc(programSize + 1);
programBuffer[programSize] = '\0';
fread(programBuffer, sizeof(char), programSize, programHandle);
fclose(programHandle);

As this is a host source file it has the responsibility of involving the kernel source. Generally speaking the kernel source is usually compiled at runtime as part of the execution of the host source. Therefore, the host source file must pull in the kernel source and compile it. Above I first calculate the size of the kernel source file and then read the source in into a buffer of that calculated size.

Creating a program and compiling kernel sources

// create and build program
program = clCreateProgramWithSource(context, 1,
        (const char**) &programBuffer, &programSize, NULL);
free(programBuffer);
clBuildProgram(program, 1, &device, "-Werror -cl-std=CL1.1", NULL, NULL);

Here I construct a program structure by passing in a context and the buffer containing the kernel source. Then I build the program which essentially compiles the kernel source based on supplied build options. Note that a program can contain numerous kernel sources containing multiple OpenCL functions potentially drawn in from a number of files. This program build steps builds the sum total of all kernels sources read in. At this point the build could fail for a variety of reasons and it’s critically important to be able to narrow the cause easily. Here I’ve skipped this error handling but I address this subject in detail on my previous post in the series.

Creating kernels and command queues

// create kernel and command queue
kernel = clCreateKernel(program, "hello", NULL);
queue = clCreateCommandQueue(context, device, 0, NULL);

Here I create a kernel and a command queue structure. Let’s look at what each one means in turn.

A kernel is an OpenCL function that executes on one or more devices. The program structure above may contain numerous functions so the purpose of creating the kernel structure above is to pinpoint one particular one called ‘hello’. I need a reference to this kernel in order to pass it an argument later on in the process.

A command queue is exactly what the name implies. The host program invokes a device by sending it a command. The sending mechanism for that command is a queue. Commands are by default processed in FIFO order but that can be changed by a configuration option. Sending a command, also known as a task, to a command queue is a way of requesting its execution.

Setting kernel arguments

// create kernel argument buffer and set it into kernel
kernelBuffer = clCreateBuffer(context, CL_MEM_WRITE_ONLY,
        32 * sizeof(char), NULL, NULL);
clSetKernelArg(kernel, 0, sizeof(cl_mem), &kernelBuffer);

Here, I create an OpenCL memory object. There are two types of memory objects – image and buffer. Here I am not dealing with image data so I choose a buffer memory object. Our goal here is to provide the kernel with a character array big enough to hold the phrase ‘Hello, World!’. However I cannot pass a character array into a kernel directly. I must create an OpenCL buffer memory object of a given size and then set it as the first kernel argument and that’s what I’m doing above. You’ll notice that I set the memory object to be write only as the device only needs to write to it.

Executing kernels and reading output data

// execute kernel, read back the output and print to screen
clEnqueueTask(queue, kernel, 0, NULL, NULL);
clEnqueueReadBuffer(queue, kernelBuffer, CL_TRUE, 0,
        32 * sizeof(char), hostBuffer, 0, NULL, NULL);
puts(hostBuffer);

This is the final step. Earlier I created a command queue and a kernel structure for the hello function and passed in a buffer memory object as the first argument. Here I complete the entire process by enqueuing the kernel for execution as a task into the command queue and reading back the output by passing in a character array of the same size as the original kernel buffer memory object. I then print the contents of that array onto the screen to prove that it contains what the GPU originally wrote into it.

Cleaning up

clFlush(queue);
clFinish(queue);
clReleaseKernel(kernel);
clReleaseProgram(program);
clReleaseMemObject(kernelBuffer);
clReleaseCommandQueue(queue);
clReleaseContext(context);

Above I first ensure that all commands have been issed to the device associated with the command queue by calling clFlush(). Then I block until all commands have been issued and completed by calling clFinish(). The rest of the functions above simple deallocate their own respective named structures.

Kernel source

__kernel void hello(__global char* message){
message[0] = 'H';
message[1] = 'e';
message[2] = 'l';
message[3] = 'l';
message[4] = 'o';
message[5] = ',';
message[6] = ' ';
message[7] = 'W';
message[8] = 'o';
message[9] = 'r';
message[10] = 'l';
message[11] = 'd';
message[12] = '!';
message[13] = '';
}

The kernel source is fairly self explanatory. It simply receives a character array called message and writes its message into it. Kernel functions get infinitely more complex than this one but this one has been kept deliberately simple.

Compile and run as follows keeping both source files in the same directory.

clang -framework OpenCL helloWorld.c -o helloWorld && ./helloWorld

As always if you have any feedback or if this helped you let me know in the comments!

OpenCL Cookbook: Building a program and debugging failures

Last time, in the opencl cookbook series, we looked at how to create a program data structure in the C OpenCL host programming API as well as how to read kernel source from a separate file. A program is a container or collection of kernels and a kernel, in turn, is a function in OpenCL that executes on an OpenCL device such as a CPU, GPU or accelerator. This time we look at how to build a program, which in turn builds the kernels within it and also how to debug failures that occur in the program build. For the latter we recreate two kinds of failures to see how the program reacts.

But what actually happens when you build a program? The clBuildProgram() function in C takes a program and a program in turn contains the source of one or more kernels read in from one or more files. However, kernel sources in raw form are of little use. To be functionally useful they must be compiled. This is what happens when you build a program.

Every OpenCL framework/SDK/implementation (whatever you want to call it) is mandated by the specification to make a compiler accessible through the clBuildProgram(). Though, they may provide other interfaces to their OpenCL compiler. AMD provides a compile time compiler command called clc whereas NVidia provides only a runtime one. clBuildProgram() compiles and link a program for devices associated with the platform.

Our example host program below is an extension of the one in the previous article in the series. It reads in a kernel function from a separate file into an OpenCL program and builds the program which would normally succeed. However, here I introduce a couple of mistakes to show you how the API reacts in each case and how to debug such issues.

Host source (buildProgramDebug.c)

#include <stdio.h>
#include <stdlib.h>
#ifdef __APPLE__
#include <OpenCL/opencl.h>
#else
#include <CL/cl.h>
#endif

int main() {

    cl_platform_id platform; cl_device_id device; cl_context context;
    cl_program program; cl_int error; cl_build_status status;

    FILE* programHandle;
    char *programBuffer; char *programLog;
    size_t programSize; size_t logSize;

    // get first available platform and gpu and create context
    clGetPlatformIDs(1, &platform, NULL);
    clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, 1, &device, NULL);
    context = clCreateContext(NULL, 1, &device, NULL, NULL, NULL);

    // get size of kernel source
    programHandle = fopen("kernel.cl", "r");
    fseek(programHandle, 0, SEEK_END);
    programSize = ftell(programHandle);
    rewind(programHandle);

    // read kernel source into buffer
    programBuffer = (char*) malloc(programSize + 1);
    programBuffer[programSize] = '\0';
    fread(programBuffer, sizeof(char), programSize, programHandle);
    fclose(programHandle);

    // create program from buffer
    program = clCreateProgramWithSource(context, 1,
            (const char**) &programBuffer, &programSize, NULL);
    free(programBuffer);

    // build program
    const char options[] = "-Werror -cl-std=CL1.1";
    error = clBuildProgram(program, 1, &device, options, NULL, NULL);

    // build failed
    if (error != CL_SUCCESS) {

        // check build error and build status first
        clGetProgramBuildInfo(program, device, CL_PROGRAM_BUILD_STATUS, 
                sizeof(cl_build_status), &status, NULL);

        // check build log
        clGetProgramBuildInfo(program, device, 
                CL_PROGRAM_BUILD_LOG, 0, NULL, &logSize);
        programLog = (char*) calloc (logSize+1, sizeof(char));
        clGetProgramBuildInfo(program, device, 
                CL_PROGRAM_BUILD_LOG, logSize+1, programLog, NULL);
        printf("Build failed; error=%d, status=%d, programLog:nn%s", 
                error, status, programLog);
        free(programLog);

    }

    clReleaseContext(context);
    return 0;

}

Kernel source (kernel.cl)

__kernel void hello(__global char* string){

string[0] = 'H';
string[1] = 'e';
string[2] = 'l';
string[3] = 'l';
string[4] = 'o';
string[5] = ',';
string[6] = ' ';
string[7] = 'W';
string[8] = 'o';
string[9] = 'r';
string[10] = 'l';
string[11] = 'd';
string[12] = '!';
string[13] = '';

}

These two programs, as they stand, have no errors and in that state produce no output. The kernel source is read in to a program and the program built based on the command line options provided on line 41. The error variable on line 45 always equals CL_SUCCESS so nothing is printed. Let’s now introduce two errors in our source code one at a time and see what happens. Specifically, when a problem occurs, we’ll be examining three separate variables – the error code (which is the return value from clBuildProgram), the program build status (which has to be specifically requested) and the program build log (which also has to be requested).

Error 1: Rogue build option

Here I change line 41 to contain a bogus command line option called ‘-foobar’ as below.

const char options[] = "-Werror -cl-std=CL1.1 -foobar";

When the program is built, unsurprisingly it fails, with the output below.

tron:opencl dhruba$ clang -framework OpenCL buildProgramDebug.c -o buildProgramDebug && ./buildProgramDebug
Build failed; error=-43, status=-1, programLog:

Above the error code of -43 corresponds to the constant CL_INVALID_BUILD_OPTIONS and a status of -1 corresponds to the constant CL_BUILD_NONE. The former is self explanatory whereas the latter means that the kernel was not compiled which is to be expected as the build options were wrong.

Error 2: Kernel source syntax error

Here, I revert the error I introduced last time, and instead add an extra underscore as the first character of the kernel source to create an OpenCL syntax error. This time the output is longer as the program log is no longer blank.

tron:opencl dhruba$ clang -framework OpenCL buildProgramDebug.c -o buildProgramDebug && ./buildProgramDebug
Build failed; error=-11, status=-2, programLog:

:1:1: error: unknown type name '___kernel'
___kernel void hello(__global char* string){
^
:1:11: error: expected identifier or '('
___kernel void hello(__global char* string){

Above the error code of -11 refers to the constant CL_BUILD_PROGRAM_FAILURE and a program build status of -2 refers to the constant CL_BUILD_ERROR which makes sense. This time, however, we have some output in the program log field. The syntax error in the kernel source is being reported by the runtime compiler to the program log.

Error code and build status constants

One final tip: you may be asking how I knew which error codes and build statuses corresponded to which constants in OpenCL. Well – this is a bit of a nightmare to be quite honest. I had to open up the following header file in the Apple OpenCL framework to check which constants matched those integers. I’m sure there will be similar places to look in other SDKs.

/System/Library/Frameworks/OpenCL.framework/Headers/cl.h

As you can see, the above three fields, provide critically important means of debugging the failure to build of your program and can point our errors both in your host source and your kernel source. Did this help you, did you face any issues or do you have feedback for improvement? Let me know in the comments!

OpenCL Cookbook: Creating programs and reading kernels from a file

In OpenCL host programming, after platforms, devices and contexts in our OpenCL Cookbook series, comes the program data structure. Remember that a kernel represents a function in actual OpenCL code intended for execution on any given device as opposed to being written in a host language like C which can only execute on the CPU. In contrast a program is a container of kernels in the host programming language – in this case C though it could be C++, Java or Python. In order to execute a kernel function on a given device (CPU, GPU or accelerator) it must first be fed in to a host program to create a program data structure.

In the host program below, written in C, we take a small step further than last time to read in an OpenCL kernel function from a separate file into our host program and create a program data structure. We, then, retrieve the kernel source code from within the program data structure and verify that it is the same kernel function that we fed in earlier as a simple test. This can be valuable in the real world where you may be feeding in kernels from multiple files and want to make sure that all of them have been read into the program. When the program structure is queried for the kernel source it concatenates all kernel sources into one string so simply reading that will give you a global view of all kernel code previously fed in.

The host program which reads the kernel file in is as follows.

#include <stdio.h>
#include <stdlib.h>
#ifdef __APPLE__
#include <OpenCL/opencl.h>
#else
#include <CL/cl.h>
#endif

int main() {

    cl_platform_id platform;
    cl_device_id device;
    cl_context context;
    cl_program program;

    FILE* programHandle;
    size_t programSize, kernelSourceSize;
    char *programBuffer, *kernelSource;

    // get first available platform and gpu and create context
    clGetPlatformIDs(1, &platform, NULL);
    clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, 1, &device, NULL);
    context = clCreateContext(NULL, 1, &device, NULL, NULL, NULL);

    // get size of kernel source
    programHandle = fopen("kernel.cl", "r");
    fseek(programHandle, 0, SEEK_END);
    programSize = ftell(programHandle);
    rewind(programHandle);

    // read kernel source into buffer
    programBuffer = (char*) malloc(programSize + 1);
    programBuffer[programSize] = '\0';
    fread(programBuffer, sizeof(char), programSize, programHandle);
    fclose(programHandle);

    // create program from buffer
    program = clCreateProgramWithSource(context, 1,
            (const char**) &programBuffer, &programSize, NULL);
    free(programBuffer);

    // read kernel source back in from program to check
    clGetProgramInfo(program, CL_PROGRAM_SOURCE, 0, NULL, &kernelSourceSize);
    kernelSource = (char*) malloc(kernelSourceSize);
    clGetProgramInfo(program, CL_PROGRAM_SOURCE, kernelSourceSize, kernelSource, NULL);
    printf("nKernel source:nn%sn", kernelSource);
    free(kernelSource);

    clReleaseContext(context);
    return 0;

}

The kernel code, saved in a file called kernel.cl, is as below.

_kernel void hello(__global char* string){

string[0] = 'H';
string[1] = 'e';
string[2] = 'l';
string[3] = 'l';
string[4] = 'o';
string[5] = ',';
string[6] = ' ';
string[7] = 'W';
string[8] = 'o';
string[9] = 'r';
string[10] = 'l';
string[11] = 'd';
string[12] = '!';
string[13] = '';

}

Save the host program as programs.c and then compile and run as follows. If you don’t have the clang command install it.

clang -framework OpenCL programs.c -o programs && ./programs

The output on my system (Macbook Air) is, as you would expect, the kernel source exactly as above.

Did this help you, did you face any issues or do you have any feedback for improvements? Let me know in the comments!

OpenCL Cookbook: Creating contexts and reference counting

Following on from my previous articles on platforms and devices in the OpenCL Cookbook series, in this instalment, I move onto the next most critical host programming data structure in OpenCL – the context.

Contexts

A context in OpenCL requires a platform and one or more devices to function and is used to create command queues which are the structures that allow hosts to send kernels to devices. That’s a loaded sentence so let’s break it down. The program such as the one written below in C (the host) may want the CPU or the GPU (devices) to execute a calculation (a kernel i.e. a function). In order for that to happen a command queue for that device must be created and the calculation enqueued onto it. That, in essence, is how a task is relayed to a device and execution is triggered in OpenCL.

Context creation

In creating a context – the platform and devices do not necessarily have to be created and supplied to the context creating method. For example a context can be created simply by choosing a device type as below.

context = clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU, NULL, NULL, NULL);

In the example above, the platform that is selected is implementation defined but if you only have one platform like me then that’s automatically selected. The device selected will be the first available one of the type you specify. So, on my Macbook Air, the Apple SDK and my only GPU – the NVidia card are automatically selected.

Alternatively, as in the snippet below you can create and provide the platform and device explicitly.

context = clCreateContext(NULL, 1, &device, NULL, NULL, NULL);

Controlling context lifetime using its reference count

In the complete program below I also introduce another important concept related to contexts – its reference count. A context when created starts with a reference count of 1 and when the function you are creating it in goes out of scope it is deallocated automatically. But this may be undesirable – maybe you want to continue accessing the context after the current function has gone out of scope. For this reason a context’s reference count can be incremented and decremented and it is only deallocated when its reference count reaches zero.

The general guideline is that if you are writing a function that uses an already created context you should increment the reference count at the start and decrement it at the end of your function. If, however, you are creating a context then at the end of your function you must simply decrement the reference count as I’m doing below.

#include <stdio.h>
#include <stdlib.h>
#ifdef __APPLE__
#include <OpenCL/opencl.h>
#else
#include <CL/cl.h>
#endif

int main() {

    cl_platform_id platform;
    cl_device_id device;
    cl_context context;
    cl_uint refCount;

    // get first available platform
    clGetPlatformIDs(1, &platform, NULL);

    // get first available gpu device
    clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, 1, &device, NULL);

    // create context
    context = clCreateContext(NULL, 1, &device, NULL, NULL, NULL);

    // get context reference count
    clGetContextInfo(context, CL_CONTEXT_REFERENCE_COUNT,
            sizeof(refCount), &refCount, NULL);
    printf("Ref count: %u ", refCount);

    // increment reference count
    clRetainContext(context);
    clGetContextInfo(context, CL_CONTEXT_REFERENCE_COUNT,
            sizeof(refCount), &refCount, NULL);
    printf(">> %u ", refCount);

    // decrement reference count
    clReleaseContext(context);
    clGetContextInfo(context, CL_CONTEXT_REFERENCE_COUNT,
            sizeof(refCount), &refCount, NULL);
    printf(">> %u ", refCount);

    // finally release context
    clReleaseContext(context);
    printf(">> 0n");
    return 0;

}

Compile and run on the Mac as follows. If you don’t have the clang, g++ or gcc commands install them.

$ clang -framework OpenCL contexts.c -o contexts && ./contexts

The output produced on my machine is as below.

Ref count: 1 >> 2 >> 1 >> 0

As always error handling has been omitted for brevity and the code is only tested on my Macbook Air but should work on other platforms. If you have any issues or have suggestions for improvements to the code do let me know.

Did this help you? Let me know in the comments!

OpenCL Cookbook: Listing all devices and their critical attributes

Last time, in our OpenCL Cookbook series, we looked at how to list all platforms and their attributes. This time, we take the next step and list all devices that a platform provides access to and their critical attributes. This program is very useful in that it provides a quick and easy way of introspecting a given system’s OpenCL capabilities. Note that error handling has been omitted for brevity and the host language as before is C.

To recap on terminology: a platform is an OpenCL SDK such as an Apple, Intel, NVidia or AMD SDK. A device, on the other hand, may be a cpu, gpu or accelerator and as a result it’s highly likely a system will have multiple devices. For each device we list its critical attributes: hardware OpenCL version, software driver version, opencl c version supported by compiler for device and finally the number of parallel compute units (cores) it possesses which symbolises the extent of task based parallelism that we can achieve.

#include <stdio.h>                                                                                                                                               
#include <stdlib.h>
#ifdef __APPLE__
#include <OpenCL/opencl.h>
#else
#include <CL/cl.h>
#endif

int main() {

    int i, j;
    char* value;
    size_t valueSize;
    cl_uint platformCount;
    cl_platform_id* platforms;
    cl_uint deviceCount;
    cl_device_id* devices;
    cl_uint maxComputeUnits;

    // get all platforms
    clGetPlatformIDs(0, NULL, &platformCount);
    platforms = (cl_platform_id*) malloc(sizeof(cl_platform_id) * platformCount);
    clGetPlatformIDs(platformCount, platforms, NULL);

    for (i = 0; i < platformCount; i++) {

        // get all devices
        clGetDeviceIDs(platforms[i], CL_DEVICE_TYPE_ALL, 0, NULL, &deviceCount);
        devices = (cl_device_id*) malloc(sizeof(cl_device_id) * deviceCount);
        clGetDeviceIDs(platforms[i], CL_DEVICE_TYPE_ALL, deviceCount, devices, NULL);

        // for each device print critical attributes
        for (j = 0; j < deviceCount; j++) {

            // print device name
            clGetDeviceInfo(devices[j], CL_DEVICE_NAME, 0, NULL, &valueSize);
            value = (char*) malloc(valueSize);
            clGetDeviceInfo(devices[j], CL_DEVICE_NAME, valueSize, value, NULL);
            printf("%d. Device: %sn", j+1, value);
            free(value);

            // print hardware device version
            clGetDeviceInfo(devices[j], CL_DEVICE_VERSION, 0, NULL, &valueSize);
            value = (char*) malloc(valueSize);
            clGetDeviceInfo(devices[j], CL_DEVICE_VERSION, valueSize, value, NULL);
            printf(" %d.%d Hardware version: %sn", j+1, 1, value);
            free(value);

            // print software driver version
            clGetDeviceInfo(devices[j], CL_DRIVER_VERSION, 0, NULL, &valueSize);
            value = (char*) malloc(valueSize);
            clGetDeviceInfo(devices[j], CL_DRIVER_VERSION, valueSize, value, NULL);
            printf(" %d.%d Software version: %sn", j+1, 2, value);
            free(value);

            // print c version supported by compiler for device
            clGetDeviceInfo(devices[j], CL_DEVICE_OPENCL_C_VERSION, 0, NULL, &valueSize);
            value = (char*) malloc(valueSize);
            clGetDeviceInfo(devices[j], CL_DEVICE_OPENCL_C_VERSION, valueSize, value, NULL);
            printf(" %d.%d OpenCL C version: %sn", j+1, 3, value);
            free(value);

            // print parallel compute units
            clGetDeviceInfo(devices[j], CL_DEVICE_MAX_COMPUTE_UNITS,
                    sizeof(maxComputeUnits), &maxComputeUnits, NULL);
            printf(" %d.%d Parallel compute units: %dn", j+1, 4, maxComputeUnits);

        }

        free(devices);

    }

    free(platforms);
    return 0;

}

Compile and run on the Mac as follows. If you don’t have the clang, g++ or gcc commands install them. Any of those commands should work.

$ clang -framework OpenCL devices.c -o devices && ./devices

The output produced on my machine is as follows but may differ on your system.

1. Device: Intel(R) Core(TM)2 Duo CPU     U9600  @ 1.60GHz
 1.1 Hardware version: OpenCL 1.2 
 1.2 Software version: 1.1
 1.3 OpenCL C version: OpenCL C 1.2 
 1.4 Parallel compute units: 2
2. Device: GeForce 320M
 2.1 Hardware version: OpenCL 1.0 
 2.2 Software version: CLH 1.0
 2.3 OpenCL C version: OpenCL C 1.1 
 2.4 Parallel compute units: 6

As you can see my Macbook Air shows rather feeble and outdated metadata being an old slimline laptop. As always, the code is only tested on my Macbook Air but, in theory, should run on Windows and Linux though the way you compile and run will differ from above slightly. If you have any issues or would like to critique and improve my code (given I’m not a C programmer) by all means leave a comment.

Did this help you? Let me know in the comments!

OpenCL Cookbook: Listing all platforms and their attributes

The first article in the OpenCL Cookbook series looks at how to list all platforms and their attributes in OpenCL using C as a host language on an OpenCL supported system.

For those new to OpenCL (like me) a platform is a top level entity in the OpenCL API and represents an SDK. You have to get a platform before you can delve deeper into what a platform provides access to such as devices (cpu, gpu). Depending on the hardware/GPU of a system may find an AMD, NVidia, Intel or Apple OpenCL SDK. You may even find multiple SDKs, for instance, if you have multiple GPUs of different makes.

#include <stdio.h>
#include <stdlib.h>
#ifdef __APPLE__
#include <OpenCL/opencl.h>
#else
#include <CL/cl.h>
#endif

int main() {

    int i, j;
    char* info;
    size_t infoSize;
    cl_uint platformCount;
    cl_platform_id *platforms;
    const char* attributeNames[5] = { "Name", "Vendor",
        "Version", "Profile", "Extensions" };
    const cl_platform_info attributeTypes[5] = { CL_PLATFORM_NAME, CL_PLATFORM_VENDOR,
        CL_PLATFORM_VERSION, CL_PLATFORM_PROFILE, CL_PLATFORM_EXTENSIONS };
    const int attributeCount = sizeof(attributeNames) / sizeof(char*);

    // get platform count
    clGetPlatformIDs(5, NULL, &platformCount);

    // get all platforms
    platforms = (cl_platform_id*) malloc(sizeof(cl_platform_id) * platformCount);
    clGetPlatformIDs(platformCount, platforms, NULL);

    // for each platform print all attributes
    for (i = 0; i < platformCount; i++) {

        printf("n %d. Platform n", i+1);

        for (j = 0; j < attributeCount; j++) {

            // get platform attribute value size
            clGetPlatformInfo(platforms[i], attributeTypes[j], 0, NULL, &infoSize);
            info = (char*) malloc(infoSize);

            // get platform attribute value
            clGetPlatformInfo(platforms[i], attributeTypes[j], infoSize, info, NULL);

            printf("  %d.%d %-11s: %sn", i+1, j+1, attributeNames[j], info);
            free(info);

        }

        printf("n");

    }

    free(platforms);
    return 0;

}

Compile and run on the Mac as follows. If you don’t have the clang, g++ or gcc commands install them. Any of those commands should work.

clang -framework OpenCL platforms.c -o platforms && ./platforms

The output produced on my Macbook Air is as follows but may differ for your system.

 1. Platform 
  1.1 Name       : Apple
  1.2 Vendor     : Apple
  1.3 Version    : OpenCL 1.2 (Jun 20 2012 14:18:19)
  1.4 Profile    : FULL_PROFILE
  1.5 Extensions : cl_APPLE_SetMemObjectDestructor cl_APPLE_ContextLoggingFunctions cl_APPLE_clut cl_APPLE_query_kernel_names cl_APPLE_gl_sharing cl_khr_gl_event

The code has been tested only on my Macbook Air but should work on Windows and Linux too though the way you compile and run will differ from above slightly. If you find any issues or would like to suggest improvements to the program (given that I’m not a C programmer) then please let me know in the comments. If you would like a step by step dissected guide to the above program explaining what it’s doing let me know and if there’s enough demand I’ll do a breakdown in another post.

Installing g++ and gcc on Mac OS X 10.8 command line

Here’s what you need to do to install g++ and gcc on Mac OS X 10.8.

  1. Install XCode 4.4 from App Store.
  2. Start XCode.
  3. Run XCode > Preferences > Downloads > Components > Command Line Tools > Install.

Once done you should be able to run g++ and gcc from the command line as follows.

tron:~ dhruba$ g++
i686-apple-darwin11-llvm-g++-4.2: no input files

Apparently you can also download the command line package manually from ADC if you prefer (source).

Did this tip help you? Let me know in the comments!