Tag Archives: sdk

AMD release APP SDK 2.8, CodeXL 1.0 and Bolt

This is old news by now, but AMD have released APP SDK 2.8, CodeXL 1.0 and Bolt. Bolt is a C++ abstraction of OpenCL which allows you to write OpenCL applications without actually writing OpenCL. I personally haven’t tried it, as we are not going to be using it at work, but it sounds like a good step forward in making GPGPU more accessible to the mainstream. For anyone interested there are a number of blog posts by AMD on how to use Bolt.

OpenCL Cookbook: Compiling OpenCL with Ubuntu 12.10, Unity, AMD 12.11 beta drivers & AMD APP SDK 2.7

Continuing the OpenCL Cookbook series, this post is not about code but about environment setup, further diversifying the scope of the cookbook. It can be a real challenge for the uninitiated to install all of the above and compile an OpenCL C or C++ program on Linux. Here’s a short guide. First download and install Ubuntu (duh!).

Install ubuntu build tools and linux kernel extras

Then install the following packages, which are prerequisites for the AMD installers and the subsequent C/C++ compilation.

sudo apt-get update
sudo apt-get install build-essential
sudo apt-get install linux-source
sudo apt-get install linux-headers-generic

Then download AMD 12.11 beta drivers (amd-driver-installer-catalyst-12.11-beta-x86.x86_64.zip) and AMD APP SDK 2.7 (AMD-APP-SDK-v2.7-lnx64.tgz). Obviously download either 32bit or 64bit based on what your system supports.

AMD 12.11 beta drivers installation

Once you’ve done that, install the AMD 12.11 beta drivers as root first. Installation is as simple as extracting the archive, marking the script inside as executable and running the script as root. Reboot. After the reboot Unity should start using the new AMD 12.11 beta driver, and you’ll know it’s the beta because you’ll see a watermark at the bottom left of the screen saying ‘AMD Testing use only’. Note that the reason we’re using the beta here is that Unity does not work with earlier versions of the driver: you see the desktop background and a mouse pointer but there’s no toolbar or status bar. The 12.11 beta driver works, which is great.

AMD APP SDK 2.7 installation

Then install the AMD APP SDK 2.7, also as root. Again installation is very simple and exactly the same as for the beta driver above. The AMD beta drivers install a video driver and the OpenCL runtime. The AMD APP SDK installs the SDK and also OpenCL and OpenGL runtimes. However, if you’ve already installed the video driver you’ll already have the OpenCL runtime on your system in /usr/lib/libamdocl64.so, so the APP SDK won’t install another copy in its own location of /opt/AMDAPP/lib/x86_64/libOpenCL.so. You’ll see some messages during installation saying that it’s skipping the OpenCL runtime, and that’s absolutely fine for now.

Test your OpenCL environment

Now you should test your OpenCL environment by compiling and running an example C OpenCL program. As an example, take my C file that lists all devices on your system, save it as devices.c and compile it as follows.

gcc -L/usr/lib -I/opt/AMDAPP/include devices.c -lamdocl64 -o devices.o # for c
g++ -L/usr/lib -I/opt/AMDAPP/include devices.c -lamdocl64 -o devices.o # for c++

Once compiled, run the output file (devices.o, which despite its name is an executable) and if it works you should see output similar to that below.

1. Device: Tahiti
 1.1 Hardware version: OpenCL 1.2 AMD-APP (923.1)
 1.2 Software version: CAL 1.4.1741 (VM)
 1.3 OpenCL C version: OpenCL C 1.2 
 1.4 Parallel compute units: 32
2. Device: Intel(R) Xeon(R) CPU E5-2687W 0 @ 3.10GHz
 2.1 Hardware version: OpenCL 1.2 AMD-APP (923.1)
 2.2 Software version: 2.0 (sse2,avx)
 2.3 OpenCL C version: OpenCL C 1.2 
 2.4 Parallel compute units: 32

Enabling multiple gpus for OpenCL

You may find that you are only seeing one GPU in your OpenCL programs. There are two things you need to do to enable multiple GPUs in the OpenCL runtime. The first is to disable CrossFire entirely. You can do this either in the AMD Catalyst Control Centre under Performance, which you start by running amdcccle, or with the awesome amdconfig tool by running amdconfig --crossfire=off. See my post on amdconfig to find out more about this incredibly powerful tool.

The second thing you may or may not need to do is to enable COMPUTE mode as follows.

export COMPUTE=:0

Once you’ve done the above you should see output from the program above similar to that below.

dhruba@debian:~$ ./source/devices.o 
1. Device: Tahiti
 1.1 Hardware version: OpenCL 1.2 AMD-APP (1084.2)
 1.2 Software version: 1084.2 (VM)
 1.3 OpenCL C version: OpenCL C 1.2 
 1.4 Parallel compute units: 32
2. Device: Tahiti
 2.1 Hardware version: OpenCL 1.2 AMD-APP (1084.2)
 2.2 Software version: 1084.2 (VM)
 2.3 OpenCL C version: OpenCL C 1.2 
 2.4 Parallel compute units: 32
3. Device: Tahiti
 3.1 Hardware version: OpenCL 1.2 AMD-APP (1084.2)
 3.2 Software version: 1084.2 (VM)
 3.3 OpenCL C version: OpenCL C 1.2 
 3.4 Parallel compute units: 32
4. Device: Tahiti
 4.1 Hardware version: OpenCL 1.2 AMD-APP (1084.2)
 4.2 Software version: 1084.2 (VM)
 4.3 OpenCL C version: OpenCL C 1.2 
 4.4 Parallel compute units: 32
5. Device: Tahiti
 5.1 Hardware version: OpenCL 1.2 AMD-APP (1084.2)
 5.2 Software version: 1084.2 (VM)
 5.3 OpenCL C version: OpenCL C 1.2 
 5.4 Parallel compute units: 32
6. Device: Tahiti
 6.1 Hardware version: OpenCL 1.2 AMD-APP (1084.2)
 6.2 Software version: 1084.2 (VM)
 6.3 OpenCL C version: OpenCL C 1.2 
 6.4 Parallel compute units: 32
7. Device: Intel(R) Xeon(R) CPU E5-2687W 0 @ 3.10GHz
 7.1 Hardware version: OpenCL 1.2 AMD-APP (1084.2)
 7.2 Software version: 1084.2 (sse2,avx)
 7.3 OpenCL C version: OpenCL C 1.2 
 7.4 Parallel compute units: 32

Standardising the OpenCL runtime library path

Now – it may be that you wish for the OpenCL runtime library to be installed in the standard AMD APP SDK location of /opt/AMDAPP/lib/x86_64/libOpenCL.so as opposed to the non-standard location of /usr/lib/libamdocl64.so which is where the beta driver installation puts it. The proper way to do this would probably be to install the AMD APP SDK first and then the video driver or simply skip the video driver installation (I haven’t tried either of these options so they may need verification).

However, I used a little trick to make this easier since I’d already installed the video driver followed by the APP SDK. I renamed /usr/lib/libamdocl64.so to /usr/lib/libamdocl64.so.x and reinstalled the APP SDK. This time it detected that the runtime wasn’t present and installed another runtime in /opt/AMDAPP/lib/x86_64/libOpenCL.so – the standard SDK runtime path. With the new APP SDK OpenCL runtime in place I was able to compile the same program using the new runtime as below depending on whether you want the c or c++ compiler.

gcc -L/opt/AMDAPP/lib/x86_64/ -I/opt/AMDAPP/include devices.c -lOpenCL -o devices.o # for c
g++ -L/opt/AMDAPP/lib/x86_64/ -I/opt/AMDAPP/include devices.c -lOpenCL -o devices.o # for c++

Summary

And there you have it – an opencl compiler working on ubuntu 12.10 using the AMD 12.11 beta drivers and the AMD APP 2.7 SDK. Sometimes you just need someone else to have done it first and written a guide and I hope this serves to help someone out there.

OpenCL Cookbook: Building a program and debugging failures

Last time, in the opencl cookbook series, we looked at how to create a program data structure in the C OpenCL host programming API as well as how to read kernel source from a separate file. A program is a container or collection of kernels and a kernel, in turn, is a function in OpenCL that executes on an OpenCL device such as a CPU, GPU or accelerator. This time we look at how to build a program, which in turn builds the kernels within it and also how to debug failures that occur in the program build. For the latter we recreate two kinds of failures to see how the program reacts.

But what actually happens when you build a program? The clBuildProgram() function in C takes a program and a program in turn contains the source of one or more kernels read in from one or more files. However, kernel sources in raw form are of little use. To be functionally useful they must be compiled. This is what happens when you build a program.

Every OpenCL framework/SDK/implementation (whatever you want to call it) is mandated by the specification to make a compiler accessible through clBuildProgram(), though it may provide other interfaces to its OpenCL compiler as well. AMD provides a compile time compiler command called clc whereas NVidia provides only a runtime one. clBuildProgram() compiles and links a program for the devices you pass to it, or for all devices associated with the program if you pass none.

Our example host program below is an extension of the one in the previous article in the series. It reads in a kernel function from a separate file into an OpenCL program and builds the program which would normally succeed. However, here I introduce a couple of mistakes to show you how the API reacts in each case and how to debug such issues.

Host source (buildProgramDebug.c)

#include <stdio.h>
#include <stdlib.h>
#ifdef __APPLE__
#include <OpenCL/opencl.h>
#else
#include <CL/cl.h>
#endif

int main() {

    cl_platform_id platform; cl_device_id device; cl_context context;
    cl_program program; cl_int error; cl_build_status status;

    FILE* programHandle;
    char *programBuffer; char *programLog;
    size_t programSize; size_t logSize;

    // get first available platform and gpu and create context
    clGetPlatformIDs(1, &platform, NULL);
    clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, 1, &device, NULL);
    context = clCreateContext(NULL, 1, &device, NULL, NULL, NULL);

    // get size of kernel source
    programHandle = fopen("kernel.cl", "r");
    fseek(programHandle, 0, SEEK_END);
    programSize = ftell(programHandle);
    rewind(programHandle);

    // read kernel source into buffer
    programBuffer = (char*) malloc(programSize + 1);
    programBuffer[programSize] = '\0';
    fread(programBuffer, sizeof(char), programSize, programHandle);
    fclose(programHandle);

    // create program from buffer
    program = clCreateProgramWithSource(context, 1,
            (const char**) &programBuffer, &programSize, NULL);
    free(programBuffer);

    // build program
    const char options[] = "-Werror -cl-std=CL1.1";
    error = clBuildProgram(program, 1, &device, options, NULL, NULL);

    // build failed
    if (error != CL_SUCCESS) {

        // check build error and build status first
        clGetProgramBuildInfo(program, device, CL_PROGRAM_BUILD_STATUS, 
                sizeof(cl_build_status), &status, NULL);

        // check build log
        clGetProgramBuildInfo(program, device, 
                CL_PROGRAM_BUILD_LOG, 0, NULL, &logSize);
        programLog = (char*) calloc (logSize+1, sizeof(char));
        clGetProgramBuildInfo(program, device, 
                CL_PROGRAM_BUILD_LOG, logSize+1, programLog, NULL);
        printf("Build failed; error=%d, status=%d, programLog:\n\n%s",
                error, status, programLog);
        free(programLog);

    }

    clReleaseContext(context);
    return 0;

}
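
The listing above skips error handling when reading kernel.cl, so a missing file would crash at the first fseek. Below is a minimal sketch of a reader that checks each step instead. The name read_source is made up for illustration and is not part of the OpenCL API; the function uses only the standard C library and returns NULL on any failure.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* read_source is a hypothetical helper, not an OpenCL function.
 * It reads a whole file into a NUL-terminated heap buffer.
 * Returns NULL on any failure; on success the caller must free()
 * the result. If size_out is not NULL it receives the byte count. */
char *read_source(const char *path, size_t *size_out) {
    assert(path != NULL);

    FILE *fh = fopen(path, "rb");
    if (fh == NULL) {
        return NULL;                 /* file missing or unreadable */
    }

    /* find the file size by seeking to the end */
    long length = -1;
    if (fseek(fh, 0, SEEK_END) == 0) {
        length = ftell(fh);
    }
    if (length < 0 || fseek(fh, 0, SEEK_SET) != 0) {
        fclose(fh);
        return NULL;
    }

    /* one extra byte for the terminating NUL */
    char *buffer = (char *) malloc((size_t) length + 1);
    if (buffer == NULL) {
        fclose(fh);
        return NULL;
    }

    /* read the contents and check that everything arrived */
    size_t got = fread(buffer, 1, (size_t) length, fh);
    fclose(fh);
    if (got != (size_t) length) {
        free(buffer);
        return NULL;
    }

    buffer[got] = '\0';
    if (size_out != NULL) {
        *size_out = got;
    }
    return buffer;
}
```

In the host program the returned buffer and size would be passed to clCreateProgramWithSource in place of programBuffer and programSize, after checking that the result is not NULL.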

Kernel source (kernel.cl)

__kernel void hello(__global char* string){

string[0] = 'H';
string[1] = 'e';
string[2] = 'l';
string[3] = 'l';
string[4] = 'o';
string[5] = ',';
string[6] = ' ';
string[7] = 'W';
string[8] = 'o';
string[9] = 'r';
string[10] = 'l';
string[11] = 'd';
string[12] = '!';
string[13] = '\0';

}

These two programs, as they stand, have no errors and in that state produce no output. The kernel source is read into a program and the program is built using the command line options in the options array on line 41 of the host source. The error variable tested on line 45 always equals CL_SUCCESS so nothing is printed. Let’s now introduce two errors in our source code, one at a time, and see what happens. Specifically, when a problem occurs, we’ll be examining three separate values: the error code (which is the return value from clBuildProgram), the program build status (which has to be specifically requested) and the program build log (which also has to be requested).

Error 1: Rogue build option

Here I change line 41 to contain a bogus command line option called ‘-foobar’ as below.

const char options[] = "-Werror -cl-std=CL1.1 -foobar";

When the program is built, unsurprisingly it fails, with the output below.

tron:opencl dhruba$ clang -framework OpenCL buildProgramDebug.c -o buildProgramDebug && ./buildProgramDebug
Build failed; error=-43, status=-1, programLog:

Above, the error code of -43 corresponds to the constant CL_INVALID_BUILD_OPTIONS and a status of -1 corresponds to the constant CL_BUILD_NONE. The former is self explanatory whereas the latter means that the kernel was not compiled, which is to be expected as the build options were wrong.

Error 2: Kernel source syntax error

Here, I revert the error I introduced last time, and instead add an extra underscore as the first character of the kernel source to create an OpenCL syntax error. This time the output is longer as the program log is no longer blank.

tron:opencl dhruba$ clang -framework OpenCL buildProgramDebug.c -o buildProgramDebug && ./buildProgramDebug
Build failed; error=-11, status=-2, programLog:

:1:1: error: unknown type name '___kernel'
___kernel void hello(__global char* string){
^
:1:11: error: expected identifier or '('
___kernel void hello(__global char* string){

Above, the error code of -11 refers to the constant CL_BUILD_PROGRAM_FAILURE and a program build status of -2 refers to the constant CL_BUILD_ERROR, which makes sense. This time, however, we have some output in the program log field. The syntax error in the kernel source is being reported by the runtime compiler to the program log.

Error code and build status constants

One final tip: you may be asking how I knew which error codes and build statuses corresponded to which constants in OpenCL. Well – this is a bit of a nightmare to be quite honest. I had to open up the following header file in the Apple OpenCL framework to check which constants matched those integers. I’m sure there will be similar places to look in other SDKs.

/System/Library/Frameworks/OpenCL.framework/Headers/cl.h
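
To save a trip to the header each time, a small lookup helper can translate the integers into names. The sketch below is an assumption-laden illustration rather than anything the SDKs ship: the function names are invented, and the numeric values are redefined under local SKETCH_ names (copied from the Khronos cl.h header) so that it compiles without OpenCL installed. It covers only the codes discussed in this post.

```c
#include <stdio.h>

/* Values as defined in the Khronos cl.h header, repeated under local
 * names so this sketch builds without the OpenCL headers. In a real
 * host program include the header and use the CL_* constants. */
enum {
    SKETCH_CL_SUCCESS               = 0,
    SKETCH_CL_BUILD_PROGRAM_FAILURE = -11,
    SKETCH_CL_INVALID_BUILD_OPTIONS = -43
};

enum {
    SKETCH_CL_BUILD_SUCCESS     = 0,
    SKETCH_CL_BUILD_NONE        = -1,
    SKETCH_CL_BUILD_ERROR       = -2,
    SKETCH_CL_BUILD_IN_PROGRESS = -3
};

/* hypothetical helper: name of a clBuildProgram return value */
const char *build_error_name(int error) {
    switch (error) {
        case SKETCH_CL_SUCCESS:               return "CL_SUCCESS";
        case SKETCH_CL_BUILD_PROGRAM_FAILURE: return "CL_BUILD_PROGRAM_FAILURE";
        case SKETCH_CL_INVALID_BUILD_OPTIONS: return "CL_INVALID_BUILD_OPTIONS";
        default:                              return "unlisted error code";
    }
}

/* hypothetical helper: name of a CL_PROGRAM_BUILD_STATUS value */
const char *build_status_name(int status) {
    switch (status) {
        case SKETCH_CL_BUILD_SUCCESS:     return "CL_BUILD_SUCCESS";
        case SKETCH_CL_BUILD_NONE:        return "CL_BUILD_NONE";
        case SKETCH_CL_BUILD_ERROR:       return "CL_BUILD_ERROR";
        case SKETCH_CL_BUILD_IN_PROGRESS: return "CL_BUILD_IN_PROGRESS";
        default:                          return "unlisted build status";
    }
}

/* print both values with their names on one line */
void print_build_failure(FILE *out, int error, int status) {
    fprintf(out, "error=%d (%s), status=%d (%s)\n",
            error, build_error_name(error),
            status, build_status_name(status));
}
```

Calling print_build_failure(stderr, error, status) inside the failure branch of the host program would give a readable line alongside the raw numbers.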

As you can see, the above three fields provide a critically important means of debugging a failed program build and can point out errors both in your host source and your kernel source. Did this help you, did you face any issues or do you have feedback for improvement? Let me know in the comments!

OpenCL Cookbook: Creating contexts and reference counting

Following on from my previous articles on platforms and devices in the OpenCL Cookbook series, in this instalment, I move onto the next most critical host programming data structure in OpenCL – the context.

Contexts

A context in OpenCL requires a platform and one or more devices to function and is used to create command queues which are the structures that allow hosts to send kernels to devices. That’s a loaded sentence so let’s break it down. The program such as the one written below in C (the host) may want the CPU or the GPU (devices) to execute a calculation (a kernel i.e. a function). In order for that to happen a command queue for that device must be created and the calculation enqueued onto it. That, in essence, is how a task is relayed to a device and execution is triggered in OpenCL.

Context creation

In creating a context – the platform and devices do not necessarily have to be created and supplied to the context creating method. For example a context can be created simply by choosing a device type as below.

context = clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU, NULL, NULL, NULL);

In the example above, the platform that is selected is implementation defined but if you only have one platform like me then that’s automatically selected. The device selected will be the first available one of the type you specify. So, on my Macbook Air, the Apple SDK and my only GPU – the NVidia card are automatically selected.

Alternatively, as in the snippet below you can create and provide the platform and device explicitly.

context = clCreateContext(NULL, 1, &device, NULL, NULL, NULL);

Controlling context lifetime using its reference count

In the complete program below I also introduce another important concept related to contexts: the reference count. A context starts with a reference count of 1 when it is created. Note that OpenCL does not tie a context’s lifetime to C scope: nothing is deallocated automatically when the function that created it returns. Instead the count can be incremented with clRetainContext and decremented with clReleaseContext, and the context is only deallocated when its reference count reaches zero. This lets you keep using a context after the function that created it has returned, as long as something still holds a reference.

The general guideline is that if you are writing a function that uses an already created context you should increment the reference count at the start and decrement it at the end of your function. If, however, you are creating a context then at the end of your function you must simply decrement the reference count as I’m doing below.

#include <stdio.h>
#include <stdlib.h>
#ifdef __APPLE__
#include <OpenCL/opencl.h>
#else
#include <CL/cl.h>
#endif

int main() {

    cl_platform_id platform;
    cl_device_id device;
    cl_context context;
    cl_uint refCount;

    // get first available platform
    clGetPlatformIDs(1, &platform, NULL);

    // get first available gpu device
    clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, 1, &device, NULL);

    // create context
    context = clCreateContext(NULL, 1, &device, NULL, NULL, NULL);

    // get context reference count
    clGetContextInfo(context, CL_CONTEXT_REFERENCE_COUNT,
            sizeof(refCount), &refCount, NULL);
    printf("Ref count: %u ", refCount);

    // increment reference count
    clRetainContext(context);
    clGetContextInfo(context, CL_CONTEXT_REFERENCE_COUNT,
            sizeof(refCount), &refCount, NULL);
    printf(">> %u ", refCount);

    // decrement reference count
    clReleaseContext(context);
    clGetContextInfo(context, CL_CONTEXT_REFERENCE_COUNT,
            sizeof(refCount), &refCount, NULL);
    printf(">> %u ", refCount);

    // finally release context
    clReleaseContext(context);
    printf(">> 0\n");
    return 0;

}

Compile and run on the Mac as follows. If you don’t have the clang, g++ or gcc commands install them.

$ clang -framework OpenCL contexts.c -o contexts && ./contexts

The output produced on my machine is as below.

Ref count: 1 >> 2 >> 1 >> 0

As always error handling has been omitted for brevity and the code is only tested on my Macbook Air but should work on other platforms. If you have any issues or have suggestions for improvements to the code do let me know.
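
For readers who want to play with the retain/release idea without an OpenCL installation, here is a toy model in plain C. It is only a sketch of the counting behaviour: toy_context and the toy_ functions are invented names, and nothing here reflects how any OpenCL implementation actually stores its contexts.

```c
#include <assert.h>
#include <stdlib.h>

/* toy_context is a made-up stand-in for cl_context; it models the
 * reference count and nothing else. */
typedef struct {
    unsigned int refCount;
} toy_context;

static int liveContexts = 0;      /* toy contexts not yet freed */

int toy_live_count(void) {
    return liveContexts;
}

/* create: a new object starts with a reference count of 1 */
toy_context *toy_create(void) {
    toy_context *ctx = (toy_context *) malloc(sizeof(toy_context));
    if (ctx != NULL) {
        ctx->refCount = 1;
        liveContexts++;
    }
    return ctx;
}

/* retain: one more owner */
void toy_retain(toy_context *ctx) {
    assert(ctx != NULL);
    ctx->refCount++;
}

/* release: one owner fewer; the object is freed when none remain.
 * Returns the remaining count (0 means the object is gone). */
unsigned int toy_release(toy_context *ctx) {
    assert(ctx != NULL && ctx->refCount > 0);
    unsigned int remaining = --ctx->refCount;
    if (remaining == 0) {
        free(ctx);
        liveContexts--;
    }
    return remaining;
}

/* a function that borrows an existing context retains it on entry
 * and releases it on exit; returns the count seen while in use */
unsigned int toy_use(toy_context *ctx) {
    toy_retain(ctx);
    unsigned int during = ctx->refCount;
    toy_release(ctx);
    return during;
}
```

Creating a toy context, retaining it once and releasing it twice walks through the same 1, 2, 1, 0 sequence as the real program, with the free happening only on the final release.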

Did this help you? Let me know in the comments!

OpenCL Cookbook: Listing all devices and their critical attributes

Last time, in our OpenCL Cookbook series, we looked at how to list all platforms and their attributes. This time, we take the next step and list all devices that a platform provides access to and their critical attributes. This program is very useful in that it provides a quick and easy way of introspecting a given system’s OpenCL capabilities. Note that error handling has been omitted for brevity and the host language as before is C.

To recap on terminology: a platform is an OpenCL SDK such as an Apple, Intel, NVidia or AMD SDK. A device, on the other hand, may be a cpu, gpu or accelerator and as a result it’s highly likely a system will have multiple devices. For each device we list its critical attributes: hardware OpenCL version, software driver version, opencl c version supported by compiler for device and finally the number of parallel compute units (cores) it possesses which symbolises the extent of task based parallelism that we can achieve.

#include <stdio.h>                                                                                                                                               
#include <stdlib.h>
#ifdef __APPLE__
#include <OpenCL/opencl.h>
#else
#include <CL/cl.h>
#endif

int main() {

    int i, j;
    char* value;
    size_t valueSize;
    cl_uint platformCount;
    cl_platform_id* platforms;
    cl_uint deviceCount;
    cl_device_id* devices;
    cl_uint maxComputeUnits;

    // get all platforms
    clGetPlatformIDs(0, NULL, &platformCount);
    platforms = (cl_platform_id*) malloc(sizeof(cl_platform_id) * platformCount);
    clGetPlatformIDs(platformCount, platforms, NULL);

    for (i = 0; i < platformCount; i++) {

        // get all devices
        clGetDeviceIDs(platforms[i], CL_DEVICE_TYPE_ALL, 0, NULL, &deviceCount);
        devices = (cl_device_id*) malloc(sizeof(cl_device_id) * deviceCount);
        clGetDeviceIDs(platforms[i], CL_DEVICE_TYPE_ALL, deviceCount, devices, NULL);

        // for each device print critical attributes
        for (j = 0; j < deviceCount; j++) {

            // print device name
            clGetDeviceInfo(devices[j], CL_DEVICE_NAME, 0, NULL, &valueSize);
            value = (char*) malloc(valueSize);
            clGetDeviceInfo(devices[j], CL_DEVICE_NAME, valueSize, value, NULL);
            printf("%d. Device: %s\n", j+1, value);
            free(value);

            // print hardware device version
            clGetDeviceInfo(devices[j], CL_DEVICE_VERSION, 0, NULL, &valueSize);
            value = (char*) malloc(valueSize);
            clGetDeviceInfo(devices[j], CL_DEVICE_VERSION, valueSize, value, NULL);
            printf(" %d.%d Hardware version: %s\n", j+1, 1, value);
            free(value);

            // print software driver version
            clGetDeviceInfo(devices[j], CL_DRIVER_VERSION, 0, NULL, &valueSize);
            value = (char*) malloc(valueSize);
            clGetDeviceInfo(devices[j], CL_DRIVER_VERSION, valueSize, value, NULL);
            printf(" %d.%d Software version: %s\n", j+1, 2, value);
            free(value);

            // print c version supported by compiler for device
            clGetDeviceInfo(devices[j], CL_DEVICE_OPENCL_C_VERSION, 0, NULL, &valueSize);
            value = (char*) malloc(valueSize);
            clGetDeviceInfo(devices[j], CL_DEVICE_OPENCL_C_VERSION, valueSize, value, NULL);
            printf(" %d.%d OpenCL C version: %s\n", j+1, 3, value);
            free(value);

            // print parallel compute units
            clGetDeviceInfo(devices[j], CL_DEVICE_MAX_COMPUTE_UNITS,
                    sizeof(maxComputeUnits), &maxComputeUnits, NULL);
            printf(" %d.%d Parallel compute units: %u\n", j+1, 4, maxComputeUnits);

        }

        free(devices);

    }

    free(platforms);
    return 0;

}

Compile and run on the Mac as follows. If you don’t have the clang, g++ or gcc commands install them. Any of those commands should work.

$ clang -framework OpenCL devices.c -o devices && ./devices

The output produced on my machine is as follows but may differ on your system.

1. Device: Intel(R) Core(TM)2 Duo CPU     U9600  @ 1.60GHz
 1.1 Hardware version: OpenCL 1.2 
 1.2 Software version: 1.1
 1.3 OpenCL C version: OpenCL C 1.2 
 1.4 Parallel compute units: 2
2. Device: GeForce 320M
 2.1 Hardware version: OpenCL 1.0 
 2.2 Software version: CLH 1.0
 2.3 OpenCL C version: OpenCL C 1.1 
 2.4 Parallel compute units: 6

As you can see my Macbook Air shows rather feeble and outdated metadata being an old slimline laptop. As always, the code is only tested on my Macbook Air but, in theory, should run on Windows and Linux though the way you compile and run will differ from above slightly. If you have any issues or would like to critique and improve my code (given I’m not a C programmer) by all means leave a comment.
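
The hardware version strings in the output follow the layout the OpenCL specification prescribes for CL_DEVICE_VERSION: the word OpenCL, a space, major.minor, a space, then vendor-specific text. If a program needs to branch on the version rather than just print it, the numbers can be pulled out with sscanf. This is a minimal sketch with invented function names; it deliberately does not handle the OpenCL C version string, which has an extra "C" in it.

```c
#include <stdio.h>

/* parse_cl_version is a hypothetical helper, not an OpenCL call.
 * It parses the "OpenCL <major>.<minor>" prefix of a device version
 * string. Returns 1 and fills major/minor on success, 0 otherwise. */
int parse_cl_version(const char *text, int *major, int *minor) {
    if (text == NULL || major == NULL || minor == NULL) {
        return 0;
    }
    /* sscanf returns the number of fields converted; we need both */
    return sscanf(text, "OpenCL %d.%d", major, minor) == 2;
}

/* Returns 1 if the version in text is at least wantMajor.wantMinor,
 * 0 if it is lower or the string could not be parsed. */
int supports_at_least(const char *text, int wantMajor, int wantMinor) {
    int major, minor;
    if (!parse_cl_version(text, &major, &minor)) {
        return 0;
    }
    return major > wantMajor ||
           (major == wantMajor && minor >= wantMinor);
}
```

With the output above, the GeForce 320M string "OpenCL 1.0 " would fail a check for 1.1 while the CPU string "OpenCL 1.2 " would pass it.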

Did this help you? Let me know in the comments!

OpenCL Cookbook: Listing all platforms and their attributes

The first article in the OpenCL Cookbook series looks at how to list all platforms and their attributes in OpenCL using C as a host language on an OpenCL supported system.

For those new to OpenCL (like me), a platform is a top level entity in the OpenCL API and represents an SDK. You have to get a platform before you can delve deeper into what it provides access to, such as devices (CPU, GPU). Depending on the hardware/GPU of a system you may find an AMD, NVidia, Intel or Apple OpenCL SDK. You may even find multiple SDKs, for instance, if you have multiple GPUs of different makes.

#include <stdio.h>
#include <stdlib.h>
#ifdef __APPLE__
#include <OpenCL/opencl.h>
#else
#include <CL/cl.h>
#endif

int main() {

    int i, j;
    char* info;
    size_t infoSize;
    cl_uint platformCount;
    cl_platform_id *platforms;
    const char* attributeNames[5] = { "Name", "Vendor",
        "Version", "Profile", "Extensions" };
    const cl_platform_info attributeTypes[5] = { CL_PLATFORM_NAME, CL_PLATFORM_VENDOR,
        CL_PLATFORM_VERSION, CL_PLATFORM_PROFILE, CL_PLATFORM_EXTENSIONS };
    const int attributeCount = sizeof(attributeNames) / sizeof(char*);

    // get platform count
    clGetPlatformIDs(0, NULL, &platformCount);

    // get all platforms
    platforms = (cl_platform_id*) malloc(sizeof(cl_platform_id) * platformCount);
    clGetPlatformIDs(platformCount, platforms, NULL);

    // for each platform print all attributes
    for (i = 0; i < platformCount; i++) {

        printf("\n %d. Platform \n", i+1);

        for (j = 0; j < attributeCount; j++) {

            // get platform attribute value size
            clGetPlatformInfo(platforms[i], attributeTypes[j], 0, NULL, &infoSize);
            info = (char*) malloc(infoSize);

            // get platform attribute value
            clGetPlatformInfo(platforms[i], attributeTypes[j], infoSize, info, NULL);

            printf("  %d.%d %-11s: %s\n", i+1, j+1, attributeNames[j], info);
            free(info);

        }

        printf("\n");

    }

    free(platforms);
    return 0;

}

Compile and run on the Mac as follows. If you don’t have the clang, g++ or gcc commands install them. Any of those commands should work.

clang -framework OpenCL platforms.c -o platforms && ./platforms

The output produced on my Macbook Air is as follows but may differ for your system.

 1. Platform 
  1.1 Name       : Apple
  1.2 Vendor     : Apple
  1.3 Version    : OpenCL 1.2 (Jun 20 2012 14:18:19)
  1.4 Profile    : FULL_PROFILE
  1.5 Extensions : cl_APPLE_SetMemObjectDestructor cl_APPLE_ContextLoggingFunctions cl_APPLE_clut cl_APPLE_query_kernel_names cl_APPLE_gl_sharing cl_khr_gl_event

The code has been tested only on my Macbook Air but should work on Windows and Linux too though the way you compile and run will differ from above slightly. If you find any issues or would like to suggest improvements to the program (given that I’m not a C programmer) then please let me know in the comments. If you would like a step by step dissected guide to the above program explaining what it’s doing let me know and if there’s enough demand I’ll do a breakdown in another post.
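
The Extensions attribute in the output is a single space-separated list, so checking for one extension means matching a whole token, not a substring. Here is a minimal sketch of such a check; has_extension is an invented name and the function works on any string you have already fetched with clGetPlatformInfo.

```c
#include <stddef.h>
#include <string.h>

/* has_extension is a hypothetical helper, not an OpenCL call.
 * Returns 1 if name appears as a whole space-separated token in
 * list, else 0. A plain strstr() would also match a name that is
 * only a prefix of a longer extension name. */
int has_extension(const char *list, const char *name) {
    if (list == NULL || name == NULL || *name == '\0') {
        return 0;
    }

    size_t nameLen = strlen(name);
    const char *p = list;

    while (*p != '\0') {
        /* skip separators before the next token */
        while (*p == ' ') {
            p++;
        }

        /* find the end of this token */
        const char *end = p;
        while (*end != '\0' && *end != ' ') {
            end++;
        }

        /* compare only if the lengths agree */
        if ((size_t) (end - p) == nameLen && strncmp(p, name, nameLen) == 0) {
            return 1;
        }
        p = end;
    }
    return 0;
}
```

Against the Apple list above, a query for cl_APPLE_gl_sharing matches, while a query for cl_APPLE_gl does not, even though it is a prefix of a listed name.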

Augmented Reality Apps with iPhone 3.1 Update

Augmented reality applications are one area in which the iPhone is currently behind Android since, as far as I could tell from the Android for Java Developers talk I went to, this functionality already exists in Android. I wonder why this API is currently not public.

“The L.A. Times reports that Apple will begin allowing developers access to the tools they need to produce augmented reality applications starting with upcoming iPhone OS 3.1. While there have been many impressive demos floating around showing the possibilities, these applications have used unpublished APIs which prevent them from being allowed on the App Store. Apple, however, told one developer that the tools necessary would become available with iPhone 3.1.”

via Augmented Reality Apps to Arrive with iPhone 3.1 Update – Mac Rumors.

Update: Another interesting example of augmented reality.