~2013

Quick manual OpenCL installation on Linux

According to Debian docs you should be able to do

sudo apt-get install amd-libopencl1

However, on Ubuntu Quantal (12.10) this is buggy. So here's a quick manual installation of AMD's implementation, which should get you up and running on the CPU.

First download the AMD-APP-SDK for your platform:

* http://developer.amd.com/tools/heterogeneous-computing/amd-accelerated-parallel-processing-app-sdk/downloads/

Note: I'm assuming an x86_64 architecture!

Extract the SDK and copy the libs to /usr/local/lib.

tar -xzf AMD-APP-SDK-v2.8-RC-lnx64.tgz
sudo cp AMD-APP-SDK-v2.8-RC-lnx64/lib/x86_64/lib* /usr/local/lib/
sudo ldconfig

The ICD (Installable Client Driver) loader needs to be told which OpenCL implementation you are using. Just create the file /etc/OpenCL/vendors/amdocl64.icd containing:

libamdocl64.so
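As a minimal sketch, the same step can be scripted. `write_icd` is a hypothetical helper name, not something from the SDK; it takes the target directory and the library name as arguments so it can be tried against a scratch directory before touching /etc.

```shell
# write_icd DIR LIBNAME: hypothetical helper for this sketch.
# Creates DIR if needed and writes DIR/amdocl64.icd with LIBNAME as its only line.
write_icd() {
    mkdir -p "$1" || return 1
    printf '%s\n' "$2" > "$1/amdocl64.icd"
}
```

In a root shell, `write_icd /etc/OpenCL/vendors libamdocl64.so` reproduces the step above; any other directory gives a harmless dry run.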

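Now try the clinfo binary supplied with the SDK. The fglrx messages at the top refer to AMD's GPU kernel driver, which is not installed here; the CPU device is still listed: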

$ AMD-APP-SDK-v2.8-RC-lnx64/bin/x86_64/clinfo
Setting of real/effective user Id to 0/0 failed
FATAL: Module fglrx not found.
Error! Fail to load fglrx kernel module! Maybe you can switch to root user to load kernel module directly
Number of platforms:                 1
  Platform Profile:              FULL_PROFILE
  Platform Version:              OpenCL 1.2 AMD-APP (1113.2)
  Platform Name:                 AMD Accelerated Parallel Processing
  Platform Vendor:               Advanced Micro Devices, Inc.
  Platform Extensions:               cl_khr_icd cl_amd_event_callback cl_amd_offline_devices

Platform Name:               AMD Accelerated Parallel Processing
Number of devices:               1
  Device Type:                   CL_DEVICE_TYPE_CPU
  Device ID:                     4098
  Board name:                   
  Max compute units:                 2
  Max work items dimensions:             3
    Max work items[0]:               1024
    Max work items[1]:               1024
    Max work items[2]:               1024
  Max work group size:               1024
  Preferred vector width char:           16
  Preferred vector width short:          8
  Preferred vector width int:            4
  Preferred vector width long:           2
  Preferred vector width float:          4
  Preferred vector width double:         2
  Native vector width char:          16
  Native vector width short:             8
  Native vector width int:           4
  Native vector width long:          2
  Native vector width float:             4
  Native vector width double:            2
  Max clock frequency:               800Mhz
  Address bits:                  64
  Max memory allocation:             2147483648
  Image support:                 Yes
  Max number of images read arguments:       128
  Max number of images write arguments:      8
  Max image 2D width:                8192
  Max image 2D height:               8192
  Max image 3D width:                2048
  Max image 3D height:               2048
  Max image 3D depth:                2048
  Max samplers within kernel:            16
  Max size of kernel argument:           4096
  Alignment (bits) of base address:      1024
  Minimum alignment (bytes) for any datatype:    128
  Single precision floating point capability
    Denorms:                     Yes
    Quiet NaNs:                  Yes
    Round to nearest even:           Yes
    Round to zero:               Yes
    Round to +ve and infinity:           Yes
    IEEE754-2008 fused multiply-add:         Yes
  Cache type:                    Read/Write
  Cache line size:               64
  Cache size:                    32768
  Global memory size:                3975372800
  Constant buffer size:              65536
  Max number of constant args:           8
  Local memory type:                 Global
  Local memory size:                 32768
  Kernel Preferred work group size multiple:     1
  Error correction support:          0
  Unified memory for Host and Device:        1
  Profiling timer resolution:            1
  Device endianess:              Little
  Available:                     Yes
  Compiler available:                Yes
  Execution capabilities:               
    Execute OpenCL kernels:          Yes
    Execute native function:             Yes
  Queue properties:             
    Out-of-Order:                No
    Profiling :                  Yes
  Platform ID:                   0x00007fafc98664e0
  Name:                      Intel(R) Core(TM)2 Duo CPU     L9400  @ 1.86GHz
  Vendor:                    GenuineIntel
  Device OpenCL C version:           OpenCL C 1.2
  Driver version:                1113.2 (sse2)
  Profile:                   FULL_PROFILE
  Version:                   OpenCL 1.2 AMD-APP (1113.2)
  Extensions:                    cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_ext_device_fission cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_popcnt
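That dump is long. As a rough sketch, a small filter can cut it down to the lines that confirm a platform and a CPU device were found. `clinfo_summary` is a hypothetical helper name, and the field list is simply a handful of labels taken from the output above.

```shell
# clinfo_summary: hypothetical helper for this sketch.
# Reads clinfo output on stdin and keeps only lines whose label is one of
# the fields below. Labels are matched at the start of the line (after any
# indentation), so "Platform Version:" and "Driver version:" are not kept.
clinfo_summary() {
    grep -E '^[[:space:]]*(Platform Name|Device Type|Max compute units|Name|Version):'
}
```

For example: `AMD-APP-SDK-v2.8-RC-lnx64/bin/x86_64/clinfo 2>/dev/null | clinfo_summary`. Discarding stderr is only a precaution in case the fglrx messages are written there; the filter drops them either way.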

For coding you also need the header files. They are included in the SDK, but your distribution probably supplies them as well:

sudo apt-get install opencl-headers

I've found this blog post most helpful. It also has a nice suggestion for installing multiple vendors side by side:

* http://streamcomputing.eu/blog/2011-06-24/install-opencl-on-debianubuntu-orderly/

Also note: the OpenCL SDK supplies its own libGLEW, which interferes with my system-wide installed version when using tools like CMake.
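To check whether this affects you, look for GLEW files among the libraries copied earlier. A minimal sketch, assuming the SDK's copy has a file name starting with `libGLEW` (check the SDK's lib directory to be sure); `list_glew` is a hypothetical helper name.

```shell
# list_glew DIR: hypothetical helper for this sketch.
# Prints every file directly inside DIR whose name starts with libGLEW.
list_glew() {
    find "$1" -maxdepth 1 -name 'libGLEW*' -print
}
```

Running `list_glew /usr/local/lib` shows what the copy step put there. If it prints anything you did not install yourself, that is a likely candidate for the conflict.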

Update

Here's another great post about the current OpenCL situation on Linux:

* http://mhr3.blogspot.nl/2013/06/opencl-on-ubuntu-1304.html