Installing Intel CPU OpenCL on Ubuntu 12.04

Open Computing Language (OpenCL) is a language and framework for writing computationally intensive kernels that run across heterogeneous platforms, including GPUs, CPUs, and perhaps other more esoteric devices.

Intel provides an OpenCL implementation for Intel CPUs, but there aren't many instructions on how to get it set up. Here's what I did.

Installing Intel CPU OpenCL on Ubuntu (12.04)

  1. Download the Intel® SDK for OpenCL* Applications XE 2013 from the Intel website, here http://software.intel.com/en-us/vcsource/tools/opencl-sdk-xe. The download is a tarball -- the one I got is called intel_sdk_for_ocl_applications_2013_xe_sdk_3.0.67279_x64.tgz
  2. Unpack the tarball and cd into the new directory
  $ tar -xzvf intel_sdk_for_ocl_applications_2013_xe_sdk_3.0.67279_x64.tgz
  $ cd  intel_sdk_for_ocl_applications_2013_xe_sdk_3.0.67279_x64
  3. Look inside. There are a bunch of .rpm files. These are the default packages for Red Hat Linux. We can install them on Ubuntu (a Debian-based distro) by converting them to .deb files.

    a) If you don't have these packages already, you'll need them for dealing with rpm files.

      $ sudo apt-get install -y rpm alien libnuma1
    

    b) Convert all of the rpm files into deb format, and then install them with dpkg. You can do this by pasting the following commands into bash, or copying this as a script and then running it.

      #!/bin/bash
      for f in *.rpm; do
        fakeroot alien --to-deb "$f"
      done
      for f in *.deb; do
        sudo dpkg -i "$f"
      done
    
  4. Last, but not least, you need to install the so-called ICD file, which registers this OpenCL implementation so that it's available in parallel with any other. I'm not sure why this isn't done by default (maybe it is if you install the rpms directly?), but it seems necessary. The ICD files live at /etc/OpenCL/vendors/*.icd, and they tell the ICD loader which OpenCL implementations (ICDs) are installed on your machine. There's one file for each ICD. Each is a one-line text file containing the name of the dynamic library (aka shared object, aka ".so" file) that contains the implementation. The single line may be either the full absolute path or just the file name, in which case the dynamic linker must be able to find that file, perhaps with the help of the LD_LIBRARY_PATH environment variable. The names of the .icd files themselves are arbitrary, but they must have the file extension .icd. To install the ICD file, do

  sudo ln -s /opt/intel/opencl-1.2-3.0.67279/etc/intel64.icd /etc/OpenCL/vendors/intel64.icd

Hint: if your OpenCL version is more recent (I'm writing this in August 2013), then the installed path of the OpenCL implementation in /opt/intel/opencl-* might differ from the "1.2-3.0.67279" that I'm using.

  5. If this is the only OpenCL implementation on your machine, you should install a symlink to libOpenCL.so into /usr/lib, so that things can be linked up easily. If you already have the NVIDIA OpenCL platform (for your GPU), this is not necessary: installing the icd file into the registry is enough to tell the system about your new OpenCL platform.

   $ sudo ln -s /opt/intel/opencl-1.2-3.0.67279/lib64/libOpenCL.so /usr/lib/libOpenCL.so
   $ sudo ldconfig
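
To see what the ICD registry described in step 4 currently contains, a small program can list each .icd file together with the library name on its first line. This is a minimal sketch and not part of the Intel SDK: listIcdEntries and hasIcdExtension are hypothetical helper names, and the directory is passed in as a parameter rather than hard-coded.

```cpp
#include <dirent.h>
#include <fstream>
#include <string>
#include <utility>
#include <vector>

// Returns true if `name` ends with the ".icd" extension that the registry requires.
static bool hasIcdExtension(const std::string& name) {
    const std::string ext = ".icd";
    return name.size() > ext.size() &&
           name.compare(name.size() - ext.size(), ext.size(), ext) == 0;
}

// Hypothetical helper: returns (icd file name, first line) pairs found in `dir`.
// The first line is the library name or absolute path that the file registers.
std::vector<std::pair<std::string, std::string> > listIcdEntries(const std::string& dir) {
    std::vector<std::pair<std::string, std::string> > entries;
    DIR* d = opendir(dir.c_str());
    if (!d) return entries;  // directory missing or unreadable: nothing to report
    while (struct dirent* e = readdir(d)) {
        std::string name = e->d_name;
        if (!hasIcdExtension(name)) continue;  // skip ".", "..", and unrelated files
        std::ifstream in((dir + "/" + name).c_str());
        std::string line;
        std::getline(in, line);  // each .icd file is expected to hold a single line
        entries.push_back(std::make_pair(name, line));
    }
    closedir(d);
    return entries;
}
```

Calling listIcdEntries("/etc/OpenCL/vendors") and printing the pairs should show one entry per registered implementation. An empty result suggests that no .icd file has been installed yet.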

Checking your OpenCL Installation

  1. Download the file clDeviceQuery.cpp from this gist. It's a small program that reports all of the available OpenCL platforms on your machine, and all of their devices.
  2. Compile clDeviceQuery.cpp with g++, and run it. You'll need to have the OpenCL header files in your include path, and libOpenCL.so in your LD_LIBRARY_PATH. Note that you don't necessarily need the vendor-specific OpenCL implementation in your LD_LIBRARY_PATH. When libOpenCL.so is loaded, it uses the ICD registry to find all of the vendor implementations.
  $ g++ -o clDeviceQuery -I/opt/intel/opencl-1.2-3.0.67279/include clDeviceQuery.cpp -lOpenCL
  $ ./clDeviceQuery

On my machine, with both the Intel CPU OpenCL and the NVIDIA GPU OpenCL platforms, I get the following output

clDeviceQuery Starting...

2 OpenCL Platforms found

 CL_PLATFORM_NAME:   Intel(R) OpenCL
 CL_PLATFORM_VERSION: 	OpenCL 1.2 LINUX
OpenCL Device Info:

 1 devices found supporting OpenCL on: Intel(R) OpenCL

 ----------------------------------
 Device         Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz
 ---------------------------------
  CL_DEVICE_NAME: 			        Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz
  CL_DEVICE_VENDOR: 			Intel(R) Corporation
  CL_DRIVER_VERSION: 			1.2
  CL_DEVICE_TYPE:			CL_DEVICE_TYPE_CPU
  CL_DEVICE_MAX_COMPUTE_UNITS:		8
  CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS:	3
  CL_DEVICE_MAX_WORK_ITEM_SIZES:	1024 / 1024 / 1024
  CL_DEVICE_MAX_WORK_GROUP_SIZE:	1024
  CL_DEVICE_MAX_CLOCK_FREQUENCY:	3400 MHz
  CL_DEVICE_ADDRESS_BITS:		64
  CL_DEVICE_MAX_MEM_ALLOC_SIZE:		2994 MByte
  CL_DEVICE_GLOBAL_MEM_SIZE:		11979 MByte
  CL_DEVICE_ERROR_CORRECTION_SUPPORT:	no
  CL_DEVICE_LOCAL_MEM_TYPE:		global
  CL_DEVICE_LOCAL_MEM_SIZE:		32 KByte
  CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE:	128 KByte
  CL_DEVICE_QUEUE_PROPERTIES:		CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE
  CL_DEVICE_QUEUE_PROPERTIES:		CL_QUEUE_PROFILING_ENABLE
  CL_DEVICE_IMAGE_SUPPORT:		1
  CL_DEVICE_MAX_READ_IMAGE_ARGS:	480
  CL_DEVICE_MAX_WRITE_IMAGE_ARGS:	480

  CL_DEVICE_IMAGE <dim>			2D_MAX_WIDTH	 16384
					2D_MAX_HEIGHT	 16384
					3D_MAX_WIDTH	 2048
					3D_MAX_HEIGHT	 2048
					3D_MAX_DEPTH	 2048
  CL_DEVICE_PREFERRED_VECTOR_WIDTH_<t>	CHAR 1, SHORT 1, INT 1, FLOAT 1, DOUBLE 1


clDeviceQuery, Platform Name = Intel(R) OpenCL, Platform Version = OpenCL 1.2 LINUX, NumDevs = 1, Device =         Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz
 CL_PLATFORM_NAME: 	NVIDIA CUDA
 CL_PLATFORM_VERSION: 	OpenCL 1.1 CUDA 4.2.1
OpenCL Device Info:

 1 devices found supporting OpenCL on: NVIDIA CUDA

 ----------------------------------
 Device GeForce GTX 660
 ---------------------------------
  CL_DEVICE_NAME: 			GeForce GTX 660
  CL_DEVICE_VENDOR: 			NVIDIA Corporation
  CL_DRIVER_VERSION: 			310.14
  CL_DEVICE_TYPE:			CL_DEVICE_TYPE_GPU
  CL_DEVICE_MAX_COMPUTE_UNITS:		6
  CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS:	3
  CL_DEVICE_MAX_WORK_ITEM_SIZES:	1024 / 1024 / 64
  CL_DEVICE_MAX_WORK_GROUP_SIZE:	1024
  CL_DEVICE_MAX_CLOCK_FREQUENCY:	888 MHz
  CL_DEVICE_ADDRESS_BITS:		32
  CL_DEVICE_MAX_MEM_ALLOC_SIZE:		383 MByte
  CL_DEVICE_GLOBAL_MEM_SIZE:		1535 MByte
  CL_DEVICE_ERROR_CORRECTION_SUPPORT:	no
  CL_DEVICE_LOCAL_MEM_TYPE:		local
  CL_DEVICE_LOCAL_MEM_SIZE:		48 KByte
  CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE:	64 KByte
  CL_DEVICE_QUEUE_PROPERTIES:		CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE
  CL_DEVICE_QUEUE_PROPERTIES:		CL_QUEUE_PROFILING_ENABLE
  CL_DEVICE_IMAGE_SUPPORT:		1
  CL_DEVICE_MAX_READ_IMAGE_ARGS:	256
  CL_DEVICE_MAX_WRITE_IMAGE_ARGS:	16

  CL_DEVICE_IMAGE <dim>			2D_MAX_WIDTH	 32768
					2D_MAX_HEIGHT	 32768
					3D_MAX_WIDTH	 4096
					3D_MAX_HEIGHT	 4096
					3D_MAX_DEPTH	 4096
  CL_DEVICE_PREFERRED_VECTOR_WIDTH_<t>	CHAR 1, SHORT 1, INT 1, FLOAT 1, DOUBLE 1


clDeviceQuery, Platform Name = Intel(R) OpenCL, Platform Version = OpenCL 1.2 LINUX, NumDevs = 1, Device =         Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz
NVIDIA CUDA, Platform Version = OpenCL 1.1 CUDA 4.2.1, NumDevs = 1, Device = GeForce GTX 660

System Info:

 Local Time/Date =  18:33:46, 08/22/2013
 CPU Name: Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz
 # of CPU processors: 8
 Linux version 3.5.0-36-generic (buildd@roseapple) (gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5) ) #57~precise1-Ubuntu SMP Thu Jun 20 18:21:09 UTC 2013


TEST PASSED
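
When a run prints a line such as "Error N in clGetPlatformIDs Call!" instead of TEST PASSED, N is a raw OpenCL status code. The sketch below maps a handful of those codes to their symbolic names. clErrorName is a hypothetical helper that is not part of clDeviceQuery.cpp, and the numeric values are hard-coded copies of the constants in CL/cl.h and CL/cl_ext.h so that the example compiles without the OpenCL headers.

```cpp
#include <string>

// Hypothetical helper: translate a few OpenCL status codes into their names.
// The values mirror the definitions in CL/cl.h and CL/cl_ext.h.
std::string clErrorName(int code) {
    switch (code) {
        case 0:     return "CL_SUCCESS";
        case -1:    return "CL_DEVICE_NOT_FOUND";
        case -2:    return "CL_DEVICE_NOT_AVAILABLE";
        case -5:    return "CL_OUT_OF_RESOURCES";
        case -6:    return "CL_OUT_OF_HOST_MEMORY";
        case -30:   return "CL_INVALID_VALUE";
        case -32:   return "CL_INVALID_PLATFORM";
        case -33:   return "CL_INVALID_DEVICE";
        case -1001: return "CL_PLATFORM_NOT_FOUND_KHR";  // reported by the ICD loader
        default:    return "unknown OpenCL error";
    }
}
```

A result of -1001 from clGetPlatformIDs usually means the ICD loader found no usable vendor implementation, so the contents of /etc/OpenCL/vendors are a reasonable first thing to check.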

clDeviceQuery.cpp

/* Copyright 1993-2009 NVIDIA Corporation. All rights reserved.
Modified by Mark Zwolinski, December 2009
Modified by Robert McGibbon, August 2013
*/
#ifdef __APPLE__
#include <OpenCL/opencl.h>
#else
#include <CL/cl.h>
#endif
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <time.h>
#include <string>
#include <sstream>
#include <fstream>
void clPrintDevInfo(cl_device_id device) {
char device_string[1024];
// CL_DEVICE_NAME
clGetDeviceInfo(device, CL_DEVICE_NAME, sizeof(device_string), &device_string, NULL);
printf(" CL_DEVICE_NAME: \t\t\t%s\n", device_string);
// CL_DEVICE_VENDOR
clGetDeviceInfo(device, CL_DEVICE_VENDOR, sizeof(device_string), &device_string, NULL);
printf(" CL_DEVICE_VENDOR: \t\t\t%s\n", device_string);
// CL_DRIVER_VERSION
clGetDeviceInfo(device, CL_DRIVER_VERSION, sizeof(device_string), &device_string, NULL);
printf(" CL_DRIVER_VERSION: \t\t\t%s\n", device_string);
// CL_DEVICE_INFO
cl_device_type type;
clGetDeviceInfo(device, CL_DEVICE_TYPE, sizeof(type), &type, NULL);
if( type & CL_DEVICE_TYPE_CPU )
printf(" CL_DEVICE_TYPE:\t\t\t%s\n", "CL_DEVICE_TYPE_CPU");
if( type & CL_DEVICE_TYPE_GPU )
printf(" CL_DEVICE_TYPE:\t\t\t%s\n", "CL_DEVICE_TYPE_GPU");
if( type & CL_DEVICE_TYPE_ACCELERATOR )
printf(" CL_DEVICE_TYPE:\t\t\t%s\n", "CL_DEVICE_TYPE_ACCELERATOR");
if( type & CL_DEVICE_TYPE_DEFAULT )
printf(" CL_DEVICE_TYPE:\t\t\t%s\n", "CL_DEVICE_TYPE_DEFAULT");
// CL_DEVICE_MAX_COMPUTE_UNITS
cl_uint compute_units;
clGetDeviceInfo(device, CL_DEVICE_MAX_COMPUTE_UNITS, sizeof(compute_units), &compute_units, NULL);
printf(" CL_DEVICE_MAX_COMPUTE_UNITS:\t\t%u\n", compute_units);
// CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS
cl_uint workitem_dims; // this query returns a cl_uint, not a size_t
clGetDeviceInfo(device, CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS, sizeof(workitem_dims), &workitem_dims, NULL);
printf(" CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS:\t%u\n", workitem_dims);
// CL_DEVICE_MAX_WORK_ITEM_SIZES
size_t workitem_size[3];
clGetDeviceInfo(device, CL_DEVICE_MAX_WORK_ITEM_SIZES, sizeof(workitem_size), &workitem_size, NULL);
printf(" CL_DEVICE_MAX_WORK_ITEM_SIZES:\t%zu / %zu / %zu \n", workitem_size[0], workitem_size[1], workitem_size[2]);
// CL_DEVICE_MAX_WORK_GROUP_SIZE
size_t workgroup_size;
clGetDeviceInfo(device, CL_DEVICE_MAX_WORK_GROUP_SIZE, sizeof(workgroup_size), &workgroup_size, NULL);
printf(" CL_DEVICE_MAX_WORK_GROUP_SIZE:\t%zu\n", workgroup_size);
// CL_DEVICE_MAX_CLOCK_FREQUENCY
cl_uint clock_frequency;
clGetDeviceInfo(device, CL_DEVICE_MAX_CLOCK_FREQUENCY, sizeof(clock_frequency), &clock_frequency, NULL);
printf(" CL_DEVICE_MAX_CLOCK_FREQUENCY:\t%u MHz\n", clock_frequency);
// CL_DEVICE_ADDRESS_BITS
cl_uint addr_bits;
clGetDeviceInfo(device, CL_DEVICE_ADDRESS_BITS, sizeof(addr_bits), &addr_bits, NULL);
printf(" CL_DEVICE_ADDRESS_BITS:\t\t%u\n", addr_bits);
// CL_DEVICE_MAX_MEM_ALLOC_SIZE
cl_ulong max_mem_alloc_size;
clGetDeviceInfo(device, CL_DEVICE_MAX_MEM_ALLOC_SIZE, sizeof(max_mem_alloc_size), &max_mem_alloc_size, NULL);
printf(" CL_DEVICE_MAX_MEM_ALLOC_SIZE:\t\t%u MByte\n", (unsigned int)(max_mem_alloc_size / (1024 * 1024)));
// CL_DEVICE_GLOBAL_MEM_SIZE
cl_ulong mem_size;
clGetDeviceInfo(device, CL_DEVICE_GLOBAL_MEM_SIZE, sizeof(mem_size), &mem_size, NULL);
printf(" CL_DEVICE_GLOBAL_MEM_SIZE:\t\t%u MByte\n", (unsigned int)(mem_size / (1024 * 1024)));
// CL_DEVICE_ERROR_CORRECTION_SUPPORT
cl_bool error_correction_support;
clGetDeviceInfo(device, CL_DEVICE_ERROR_CORRECTION_SUPPORT, sizeof(error_correction_support), &error_correction_support, NULL);
printf(" CL_DEVICE_ERROR_CORRECTION_SUPPORT:\t%s\n", error_correction_support == CL_TRUE ? "yes" : "no");
// CL_DEVICE_LOCAL_MEM_TYPE
cl_device_local_mem_type local_mem_type;
clGetDeviceInfo(device, CL_DEVICE_LOCAL_MEM_TYPE, sizeof(local_mem_type), &local_mem_type, NULL);
printf(" CL_DEVICE_LOCAL_MEM_TYPE:\t\t%s\n", local_mem_type == 1 ? "local" : "global");
// CL_DEVICE_LOCAL_MEM_SIZE
clGetDeviceInfo(device, CL_DEVICE_LOCAL_MEM_SIZE, sizeof(mem_size), &mem_size, NULL);
printf(" CL_DEVICE_LOCAL_MEM_SIZE:\t\t%u KByte\n", (unsigned int)(mem_size / 1024));
// CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE
clGetDeviceInfo(device, CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE, sizeof(mem_size), &mem_size, NULL);
printf(" CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE:\t%u KByte\n", (unsigned int)(mem_size / 1024));
// CL_DEVICE_QUEUE_PROPERTIES
cl_command_queue_properties queue_properties;
clGetDeviceInfo(device, CL_DEVICE_QUEUE_PROPERTIES, sizeof(queue_properties), &queue_properties, NULL);
if( queue_properties & CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE )
printf(" CL_DEVICE_QUEUE_PROPERTIES:\t\t%s\n", "CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE");
if( queue_properties & CL_QUEUE_PROFILING_ENABLE )
printf(" CL_DEVICE_QUEUE_PROPERTIES:\t\t%s\n", "CL_QUEUE_PROFILING_ENABLE");
// CL_DEVICE_IMAGE_SUPPORT
cl_bool image_support;
clGetDeviceInfo(device, CL_DEVICE_IMAGE_SUPPORT, sizeof(image_support), &image_support, NULL);
printf(" CL_DEVICE_IMAGE_SUPPORT:\t\t%u\n", image_support);
// CL_DEVICE_MAX_READ_IMAGE_ARGS
cl_uint max_read_image_args;
clGetDeviceInfo(device, CL_DEVICE_MAX_READ_IMAGE_ARGS, sizeof(max_read_image_args), &max_read_image_args, NULL);
printf(" CL_DEVICE_MAX_READ_IMAGE_ARGS:\t%u\n", max_read_image_args);
// CL_DEVICE_MAX_WRITE_IMAGE_ARGS
cl_uint max_write_image_args;
clGetDeviceInfo(device, CL_DEVICE_MAX_WRITE_IMAGE_ARGS, sizeof(max_write_image_args), &max_write_image_args, NULL);
printf(" CL_DEVICE_MAX_WRITE_IMAGE_ARGS:\t%u\n", max_write_image_args);
// CL_DEVICE_IMAGE2D_MAX_WIDTH, CL_DEVICE_IMAGE2D_MAX_HEIGHT, CL_DEVICE_IMAGE3D_MAX_WIDTH, CL_DEVICE_IMAGE3D_MAX_HEIGHT, CL_DEVICE_IMAGE3D_MAX_DEPTH
size_t szMaxDims[5];
printf("\n CL_DEVICE_IMAGE <dim>");
clGetDeviceInfo(device, CL_DEVICE_IMAGE2D_MAX_WIDTH, sizeof(size_t), &szMaxDims[0], NULL);
printf("\t\t\t2D_MAX_WIDTH\t %zu\n", szMaxDims[0]);
clGetDeviceInfo(device, CL_DEVICE_IMAGE2D_MAX_HEIGHT, sizeof(size_t), &szMaxDims[1], NULL);
printf("\t\t\t\t\t2D_MAX_HEIGHT\t %zu\n", szMaxDims[1]);
clGetDeviceInfo(device, CL_DEVICE_IMAGE3D_MAX_WIDTH, sizeof(size_t), &szMaxDims[2], NULL);
printf("\t\t\t\t\t3D_MAX_WIDTH\t %zu\n", szMaxDims[2]);
clGetDeviceInfo(device, CL_DEVICE_IMAGE3D_MAX_HEIGHT, sizeof(size_t), &szMaxDims[3], NULL);
printf("\t\t\t\t\t3D_MAX_HEIGHT\t %zu\n", szMaxDims[3]);
clGetDeviceInfo(device, CL_DEVICE_IMAGE3D_MAX_DEPTH, sizeof(size_t), &szMaxDims[4], NULL);
printf("\t\t\t\t\t3D_MAX_DEPTH\t %zu\n", szMaxDims[4]);
// CL_DEVICE_PREFERRED_VECTOR_WIDTH_<type>
printf(" CL_DEVICE_PREFERRED_VECTOR_WIDTH_<t>\t");
cl_uint vec_width [6];
clGetDeviceInfo(device, CL_DEVICE_PREFERRED_VECTOR_WIDTH_CHAR, sizeof(cl_uint), &vec_width[0], NULL);
clGetDeviceInfo(device, CL_DEVICE_PREFERRED_VECTOR_WIDTH_SHORT, sizeof(cl_uint), &vec_width[1], NULL);
clGetDeviceInfo(device, CL_DEVICE_PREFERRED_VECTOR_WIDTH_INT, sizeof(cl_uint), &vec_width[2], NULL);
clGetDeviceInfo(device, CL_DEVICE_PREFERRED_VECTOR_WIDTH_LONG, sizeof(cl_uint), &vec_width[3], NULL);
clGetDeviceInfo(device, CL_DEVICE_PREFERRED_VECTOR_WIDTH_FLOAT, sizeof(cl_uint), &vec_width[4], NULL);
clGetDeviceInfo(device, CL_DEVICE_PREFERRED_VECTOR_WIDTH_DOUBLE, sizeof(cl_uint), &vec_width[5], NULL);
// vec_width[3] holds the LONG width, which this report does not print
printf("CHAR %u, SHORT %u, INT %u, FLOAT %u, DOUBLE %u\n\n\n",
vec_width[0], vec_width[1], vec_width[2], vec_width[4], vec_width[5]);
}
int main(int argc, const char** argv) {
// start logs
printf("clDeviceQuery Starting...\n\n");
bool bPassed = true;
std::string sProfileString = "clDeviceQuery, Platform Name = ";
// Enumerate every available OpenCL platform and report on each one
char cBuffer[1024];
cl_platform_id clSelectedPlatformID = NULL;
cl_platform_id* clPlatformIDs;
cl_uint num_platforms;
cl_int ciErrNum = clGetPlatformIDs(0, NULL, &num_platforms);
if (ciErrNum != CL_SUCCESS) {
printf(" Error %i in clGetPlatformIDs Call!\n\n", ciErrNum);
bPassed = false;
} else {
if (num_platforms == 0) {
printf("No OpenCL platform found!\n\n");
bPassed = false;
} else {
// if there's one platform or more, make space for ID's
if ((clPlatformIDs = (cl_platform_id*)malloc(num_platforms * sizeof(cl_platform_id))) == NULL) {
printf("Failed to allocate memory for cl_platform ID's!\n\n");
bPassed = false;
}
printf("%d OpenCL Platforms found\n\n", num_platforms);
// get platform info for each platform
ciErrNum = clGetPlatformIDs (num_platforms, clPlatformIDs, NULL);
for(cl_uint i = 0; i < num_platforms; ++i) {
ciErrNum = clGetPlatformInfo (clPlatformIDs[i], CL_PLATFORM_NAME, 1024, &cBuffer, NULL);
if(ciErrNum == CL_SUCCESS) {
clSelectedPlatformID = clPlatformIDs[i];
// Get OpenCL platform name and version
ciErrNum = clGetPlatformInfo (clSelectedPlatformID, CL_PLATFORM_NAME, sizeof(cBuffer), cBuffer, NULL);
if (ciErrNum == CL_SUCCESS) {
printf(" CL_PLATFORM_NAME: \t%s\n", cBuffer);
sProfileString += cBuffer;
} else {
printf(" Error %i in clGetPlatformInfo Call !!!\n\n", ciErrNum);
bPassed = false;
}
sProfileString += ", Platform Version = ";
ciErrNum = clGetPlatformInfo (clSelectedPlatformID, CL_PLATFORM_VERSION, sizeof(cBuffer), cBuffer, NULL);
if (ciErrNum == CL_SUCCESS) {
printf(" CL_PLATFORM_VERSION: \t%s\n", cBuffer);
sProfileString += cBuffer;
} else {
printf(" Error %i in clGetPlatformInfo Call !!!\n\n", ciErrNum);
bPassed = false;
}
// Log OpenCL SDK Version # (for convenience: not specific to OpenCL)
sProfileString += ", NumDevs = ";
// Get and log OpenCL device info
cl_uint ciDeviceCount;
cl_device_id *devices;
printf("OpenCL Device Info:\n\n");
ciErrNum = clGetDeviceIDs (clSelectedPlatformID, CL_DEVICE_TYPE_ALL, 0, NULL, &ciDeviceCount);
// check for 0 devices found or errors...
if (ciDeviceCount == 0) {
printf(" No devices found supporting OpenCL (return code %i)\n\n", ciErrNum);
bPassed = false;
sProfileString += "0";
} else if (ciErrNum != CL_SUCCESS) {
printf(" Error %i in clGetDeviceIDs call !!!\n\n", ciErrNum);
bPassed = false;
} else {
// Get and log the OpenCL device ID's
ciErrNum = clGetPlatformInfo (clSelectedPlatformID, CL_PLATFORM_NAME, sizeof(cBuffer), cBuffer, NULL);
printf(" %u devices found supporting OpenCL on: %s\n\n", ciDeviceCount, cBuffer);
char cTemp[16]; // large enough for any 32-bit count plus the terminator
snprintf(cTemp, sizeof(cTemp), "%u", ciDeviceCount);
sProfileString += cTemp;
if ((devices = (cl_device_id*)malloc(sizeof(cl_device_id) * ciDeviceCount)) == NULL) {
printf(" Failed to allocate memory for devices !!!\n\n");
bPassed = false;
}
ciErrNum = clGetDeviceIDs (clSelectedPlatformID, CL_DEVICE_TYPE_ALL, ciDeviceCount, devices, &ciDeviceCount);
if (ciErrNum == CL_SUCCESS) {
for(unsigned int i = 0; i < ciDeviceCount; ++i ) {
printf(" ----------------------------------\n");
clGetDeviceInfo(devices[i], CL_DEVICE_NAME, sizeof(cBuffer), &cBuffer, NULL);
printf(" Device %s\n", cBuffer);
printf(" ---------------------------------\n");
clPrintDevInfo(devices[i]);
sProfileString += ", Device = ";
sProfileString += cBuffer;
}
} else {
printf(" Error %i in clGetDeviceIDs call !!!\n\n", ciErrNum);
bPassed = false;
}
}
// masterlog info
sProfileString += "\n";
printf("%s", sProfileString.c_str());
}
}
// release the platform list once, after the loop over platforms has finished
free(clPlatformIDs);
}
}
// Log system info(for convenience: not specific to OpenCL)
printf( "\nSystem Info: \n\n");
char timestr[255];
time_t now = time(NULL);
struct tm *ts;
ts = localtime(&now);
strftime(timestr, 255, " %H:%M:%S, %m/%d/%Y",ts);
// write time and date to logs
printf(" Local Time/Date = %s\n", timestr);
// write proc and OS info to logs
// parse /proc/cpuinfo
std::ifstream cpuinfo( "/proc/cpuinfo" ); // open the file in /proc
std::string tmp;
int cpu_num = 0;
std::string cpu_name = "none";
do {
cpuinfo >> tmp;
if( tmp == "processor" )
cpu_num++;
if( tmp == "name" ) {
cpuinfo >> tmp; // skip :
std::stringstream tmp_stream("");
// collect words up to the "stepping" field; stop at end of file so the loop cannot spin forever
while ((cpuinfo >> tmp) && tmp != "stepping") {
tmp_stream << tmp << " ";
}
cpu_name = tmp_stream.str();
}
}
while ( (! cpuinfo.eof()) );
// Linux version
std::ifstream version( "/proc/version" );
char versionstr[255];
version.getline(versionstr, 255);
printf(" CPU Name: %s\n # of CPU processors: %u\n %s\n\n\n",
cpu_name.c_str(),cpu_num,versionstr);
// finish
printf("TEST %s\n\n", bPassed ? "PASSED" : "FAILED !!!");
}
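
The /proc/cpuinfo loop at the end of main scans word by word and relies on a "stepping" field to end the CPU name, which is an x86 detail. A line-based parse avoids depending on any particular terminating field. The following is a sketch under that assumption; CpuSummary, trim, and summarizeCpuInfo are hypothetical names and not part of the original program.

```cpp
#include <istream>
#include <sstream>
#include <string>

struct CpuSummary {
    int processors;    // number of "processor" entries seen
    std::string name;  // first "model name" value, or "none" if the field is absent
};

// Strips leading and trailing whitespace.
static std::string trim(const std::string& s) {
    const char* ws = " \t\r\n";
    std::string::size_type b = s.find_first_not_of(ws);
    if (b == std::string::npos) return "";
    std::string::size_type e = s.find_last_not_of(ws);
    return s.substr(b, e - b + 1);
}

// Hypothetical helper: summarise /proc/cpuinfo-style "key : value" text.
CpuSummary summarizeCpuInfo(std::istream& in) {
    CpuSummary s;
    s.processors = 0;
    s.name = "none";
    std::string line;
    while (std::getline(in, line)) {
        std::string::size_type colon = line.find(':');
        if (colon == std::string::npos) continue;  // blank or malformed line
        std::string key = trim(line.substr(0, colon));
        std::string value = trim(line.substr(colon + 1));
        if (key == "processor") {
            ++s.processors;
        } else if (key == "model name" && s.name == "none") {
            s.name = value;  // keep only the first CPU's name
        }
    }
    return s;
}
```

Passing a std::ifstream opened on /proc/cpuinfo gives the processor count and CPU name in a single pass, and the loop always ends at end of file regardless of which fields the kernel reports.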

BeauJoh commented Dec 30, 2013

You legend! Thanks a bundle for the fantastically helpful gist.

znawaz commented Feb 18, 2014

Thanks a lot. I was stuck earlier.

zahlenteufel commented Feb 25, 2014

I did all the steps and finally got:
$ g++ -o clDeviceQuery -I/opt/intel/opencl-1.2-3.0.67279/include clDeviceQuery.cpp -lOpenCL

clDeviceQuery.cpp:8:19: fatal error: CL/cl.h: No such file or directory
compilation terminated.

abergmeier commented Nov 4, 2014

@zahlenteufel On debian try apt-file search CL/cl.h and install that package.

suminb commented Dec 31, 2014

This is a great tutorial. Thanks for posting!

However I've been struggling with a strange issue... The clDeviceQuery.cpp works fine (as it outputs TEST PASSED), but I'm getting an error when I try to establish a context in Python code.

  *** RuntimeError: Context failed: device not available

I suspected the following line might be problematic.

    ctx = cl.create_some_context()

So I tried to create a context with an explicit device object as:

    platform = cl.get_platforms()[0]
    device = platform.get_devices()[0]
    ctx = cl.Context([device])

Then I got the same error (device not available). Any idea where this is coming from?

piotrglazar commented Apr 18, 2015

Hi,
I tried to install opencl-1.2-sdk-5.0.0.43 but unfortunately I failed. I did everything from your tutorial but when I try to compile I get:

/tmp/ccyqVitl.o: In function `clPrintDevInfo(_cl_device_id*)':
clDeviceQuery.cpp:(.text+0x46): undefined reference to `clGetDeviceInfo'

and a bunch of other undefined references. Have you experienced something similar?

imrehg commented Jun 12, 2015

Hi, I was checking this out on Parallella, and there the final proc info parsing gets into an infinite loop, because ARM boards appear to use different items in /proc/cpuinfo than x86. You have to break on "Features" instead of "stepping". Also, the test is done twice within that loop. I'm using the code below, and it works both on an Intel/x86 laptop and on the Parallella/ARM.

std::stringstream tmp_stream("");
do {
    cpuinfo >> tmp;
    if ((tmp != std::string("stepping")) && (tmp != std::string("Features"))) {
        tmp_stream << tmp.c_str() << " ";
    } else {
        break;
    }
} 
while(1);

xealits commented Sep 23, 2015

Thanks!
To share info:
the same worked in 2015 on Ubuntu 14.04 for opencl-1.2-4.5.0.8 (checked with pyopencl, not clDeviceQuery.cpp ),
which came from Intel's OpenCL™ Runtime 14.2 for Intel® CPU and Intel® Xeon Phi™ coprocessors for Linux* (64-bit)

I already had AMD's SDK installed, so didn't link libOpenCL.so in lib at the final step.
(Btw, there is no libOpenCL.so in my lib -- ldconfig finds it somewhere else.)
Also some guides refer to libOpenCL.so as "ICD loader" -- is it correct?

jihema commented Feb 16, 2016

Regarding the ICD loader (libOpenCL.so). AMD's distribution provides one, as well as Intel, and they seem to both be able to load each other's libraries pointed to by the ICD registry in /etc/OpenCL/vendors/*.icd. However, there is an important difference: unlike Intel's, the AMD loader will look for ICD registries in $OPENCL_VENDOR_PATH before looking in the default location above, which allows for individual install (useful if you haven't root access).

anil123df commented Mar 30, 2016

I am unable to convert the rpm files to deb files. When I use the fakeroot alien command, it says "opencl-1.2-devel-6.0.0.1049-1.x86_64.rpm is for architecture amd64 ; the package cannot be built on this system". I have an Intel processor, I have all 5 rpm files, and I ran "sudo apt-get install -y rpm alien libnuma1". How do I deal with this?

Sudipta-Paul commented Sep 26, 2016

I am getting the following output.
Can someone tell me how I can solve the problem?


clDeviceQuery Starting...

Error -1001 in clGetPlatformIDs Call!

System Info:

Local Time/Date = 15:11:45, 09/26/2016
CPU Name: Intel(R) Xeon(R) CPU E31280 @ 3.50GHz
# of CPU processors: 8

Linux version 3.19.0-69-generic (buildd@lgw01-06) (gcc version 4.9.2 (Ubuntu 4.9.2-10ubuntu13) ) #77-Ubuntu SMP Mon Aug 29 19:53:54 UTC 2016

TEST FAILED !!!

Picoteando commented Nov 15, 2016

Hello, I compiled and ran the clDeviceQuery example on an Intel Joule module running an Ubuntu Linux distribution, and it did not find the GPU. The result I get is the following:

clDeviceQuery Starting...

1 OpenCL Platforms found

CL_PLATFORM_NAME: Experimental OpenCL 2.1 CPU Only Platform
CL_PLATFORM_VERSION: OpenCL 2.1 LINUX
OpenCL Device Info:

1 devices found supporting OpenCL on: Experimental OpenCL 2.1 CPU Only Platform


----------------------------------
Device Intel(R) Atom(TM) Processor T5700 @ 1.70GHz
---------------------------------
CL_DEVICE_NAME: Intel(R) Atom(TM) Processor T5700 @ 1.70GHz
CL_DEVICE_VENDOR: Intel(R) Corporation
CL_DRIVER_VERSION: 1.2.0.18
CL_DEVICE_TYPE: CL_DEVICE_TYPE_CPU
CL_DEVICE_MAX_COMPUTE_UNITS: 4
CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS: 3
CL_DEVICE_MAX_WORK_ITEM_SIZES: 8192 / 8192 / 8192
CL_DEVICE_MAX_WORK_GROUP_SIZE: 8192
CL_DEVICE_MAX_CLOCK_FREQUENCY: 1700 MHz
CL_DEVICE_ADDRESS_BITS: 64
CL_DEVICE_MAX_MEM_ALLOC_SIZE: 958 MByte
CL_DEVICE_GLOBAL_MEM_SIZE: 3833 MByte
CL_DEVICE_ERROR_CORRECTION_SUPPORT: no
CL_DEVICE_LOCAL_MEM_TYPE: global
CL_DEVICE_LOCAL_MEM_SIZE: 32 KByte
CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE: 128 KByte
CL_DEVICE_QUEUE_PROPERTIES: CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE
CL_DEVICE_QUEUE_PROPERTIES: CL_QUEUE_PROFILING_ENABLE
CL_DEVICE_IMAGE_SUPPORT: 1
CL_DEVICE_MAX_READ_IMAGE_ARGS: 480
CL_DEVICE_MAX_WRITE_IMAGE_ARGS: 480

CL_DEVICE_IMAGE <dim> 2D_MAX_WIDTH 16384
2D_MAX_HEIGHT 16384
3D_MAX_WIDTH 2048
3D_MAX_HEIGHT 2048
3D_MAX_DEPTH 2048
CL_DEVICE_PREFERRED_VECTOR_WIDTH_<t> CHAR 1, SHORT 1, INT 1, FLOAT 1, DOUBLE 1

clDeviceQuery, Platform Name = Experimental OpenCL 2.1 CPU Only Platform, Platform Version = OpenCL 2.1 LINUX, NumDevs = 1, Device = Intel(R) Atom(TM) Processor T5700 @ 1.70GHz

System Info:

Local Time/Date = 09:12:08, 11/15/2016
CPU Name: Intel(R) Atom(TM) Processor T5700 @ 1.70GHz
# of CPU processors: 4

Linux version 4.4.0-47-generic (buildd@lcy01-03) (gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.2) ) #68-Ubuntu SMP Wed Oct 26 19:39:52 UTC 2016

TEST PASSED

Why did it not find the GPU? Thanks a lot!

fateh288 commented Feb 22, 2017

On running ./clDeviceQuery , I get the following error

clDeviceQuery Starting...

 Error -1001 in clGetPlatformIDs Call!


System Info: 

 Local Time/Date =  00:57:46, 02/23/2017
 CPU Name: Intel(R) Core(TM) i5-4210U CPU @ 1.70GHz 
 # of CPU processors: 4
 Linux version 4.4.0-38-generic (buildd@lgw01-58) (gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.2) ) #57-Ubuntu SMP Tue Sep 6 15:42:33 UTC 2016


TEST FAILED !!!

What might be the problem? And how to solve it?

SnehilVerma commented Apr 11, 2017

Thanks for the guide. It worked perfectly for me.

rowanthorpe commented Apr 20, 2017

When running the compiled c++ tester I got a double-free/corruption error. Looking at the relevant malloc call's location it seemed the free() call is inside the for-loop, but the malloc() call is made outside the loop, before it starts, so I made the following change and the error went away. Replace:

256 free(clPlatformIDs);
257       }

with:

256       }
257 free(clPlatformIDs);