Class CLCapabilities
- java.lang.Object
-
- org.lwjgl.opencl.CLCapabilities
-
public class CLCapabilities extends java.lang.Object
Defines the capabilities of an OpenCL platform or device.
The instance returned by CL.createPlatformCapabilities(long) exposes the functionality present on either the platform or any of its devices. This is unlike the PLATFORM_EXTENSIONS string, which returns only platform functionality, supported across all platform devices.
The instance returned by CL.createDeviceCapabilities(long, org.lwjgl.opencl.CLCapabilities) exposes only the functionality available on that particular device.
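The difference between the two scopes can be pictured with plain sets. The sketch below does not call LWJGL: the class name CapabilityScopeSketch, its helper methods, and the sample extension strings are all hypothetical. It treats "present on any device" as a set union and "supported across all platform devices" as a set intersection, which is one reading of the description above and not a statement about how LWJGL computes its flags.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Plain-Java model of "any device" vs. "all devices" extension reporting (no LWJGL calls). */
final class CapabilityScopeSketch {

    /** Splits a space-separated extension string into a sorted set of names. */
    static Set<String> parse(String extensions) {
        Set<String> result = new TreeSet<>();
        for (String name : extensions.trim().split("\\s+")) {
            if (!name.isEmpty()) {
                result.add(name);
            }
        }
        return result;
    }

    /** Extensions reported by at least one of the given extension strings. */
    static Set<String> unionOf(List<String> extensionStrings) {
        Set<String> union = new TreeSet<>();
        for (String s : extensionStrings) {
            union.addAll(parse(s));
        }
        return union;
    }

    /** Extensions reported by every one of the given extension strings. */
    static Set<String> commonTo(List<String> extensionStrings) {
        Set<String> common = null;
        for (String s : extensionStrings) {
            Set<String> current = parse(s);
            if (common == null) {
                common = current;
            } else {
                common.retainAll(current);
            }
        }
        return common == null ? new TreeSet<>() : common;
    }

    public static void main(String[] args) {
        // Hypothetical per-device extension strings, for illustration only.
        List<String> devices = Arrays.asList(
            "cl_khr_fp64 cl_khr_gl_sharing cl_khr_icd",
            "cl_khr_gl_sharing cl_khr_icd");
        System.out.println("any device:  " + unionOf(devices));
        System.out.println("all devices: " + commonTo(devices));
    }
}
```

In this model, cl_khr_fp64 appears in the "any device" set but not in the "all devices" set, which is why a per-device check is still needed before relying on it.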
-
Field Detail
-
clBuildProgram
public final long clBuildProgram
-
clCloneKernel
public final long clCloneKernel
-
clCompileProgram
public final long clCompileProgram
-
clCreateAcceleratorINTEL
public final long clCreateAcceleratorINTEL
-
clCreateBuffer
public final long clCreateBuffer
-
clCreateCommandQueue
public final long clCreateCommandQueue
-
clCreateCommandQueueWithProperties
public final long clCreateCommandQueueWithProperties
-
clCreateCommandQueueWithPropertiesAPPLE
public final long clCreateCommandQueueWithPropertiesAPPLE
-
clCreateContext
public final long clCreateContext
-
clCreateContextFromType
public final long clCreateContextFromType
-
clCreateEventFromEGLSyncKHR
public final long clCreateEventFromEGLSyncKHR
-
clCreateEventFromGLsyncKHR
public final long clCreateEventFromGLsyncKHR
-
clCreateFromEGLImageKHR
public final long clCreateFromEGLImageKHR
-
clCreateFromGLBuffer
public final long clCreateFromGLBuffer
-
clCreateFromGLRenderbuffer
public final long clCreateFromGLRenderbuffer
-
clCreateFromGLTexture
public final long clCreateFromGLTexture
-
clCreateFromGLTexture2D
public final long clCreateFromGLTexture2D
-
clCreateFromGLTexture3D
public final long clCreateFromGLTexture3D
-
clCreateFromVA_APIMediaSurfaceINTEL
public final long clCreateFromVA_APIMediaSurfaceINTEL
-
clCreateImage
public final long clCreateImage
-
clCreateImage2D
public final long clCreateImage2D
-
clCreateImage3D
public final long clCreateImage3D
-
clCreateKernel
public final long clCreateKernel
-
clCreateKernelsInProgram
public final long clCreateKernelsInProgram
-
clCreatePipe
public final long clCreatePipe
-
clCreateProgramWithBinary
public final long clCreateProgramWithBinary
-
clCreateProgramWithBuiltInKernels
public final long clCreateProgramWithBuiltInKernels
-
clCreateProgramWithIL
public final long clCreateProgramWithIL
-
clCreateProgramWithSource
public final long clCreateProgramWithSource
-
clCreateSampler
public final long clCreateSampler
-
clCreateSamplerWithProperties
public final long clCreateSamplerWithProperties
-
clCreateSubBuffer
public final long clCreateSubBuffer
-
clCreateSubDevices
public final long clCreateSubDevices
-
clCreateSubDevicesEXT
public final long clCreateSubDevicesEXT
-
clCreateUserEvent
public final long clCreateUserEvent
-
clEnqueueAcquireEGLObjectsKHR
public final long clEnqueueAcquireEGLObjectsKHR
-
clEnqueueAcquireGLObjects
public final long clEnqueueAcquireGLObjects
-
clEnqueueAcquireVA_APIMediaSurfacesINTEL
public final long clEnqueueAcquireVA_APIMediaSurfacesINTEL
-
clEnqueueBarrier
public final long clEnqueueBarrier
-
clEnqueueBarrierWithWaitList
public final long clEnqueueBarrierWithWaitList
-
clEnqueueCopyBuffer
public final long clEnqueueCopyBuffer
-
clEnqueueCopyBufferRect
public final long clEnqueueCopyBufferRect
-
clEnqueueCopyBufferToImage
public final long clEnqueueCopyBufferToImage
-
clEnqueueCopyImage
public final long clEnqueueCopyImage
-
clEnqueueCopyImageToBuffer
public final long clEnqueueCopyImageToBuffer
-
clEnqueueFillBuffer
public final long clEnqueueFillBuffer
-
clEnqueueFillImage
public final long clEnqueueFillImage
-
clEnqueueMakeBuffersResidentAMD
public final long clEnqueueMakeBuffersResidentAMD
-
clEnqueueMapBuffer
public final long clEnqueueMapBuffer
-
clEnqueueMapImage
public final long clEnqueueMapImage
-
clEnqueueMarker
public final long clEnqueueMarker
-
clEnqueueMarkerWithWaitList
public final long clEnqueueMarkerWithWaitList
-
clEnqueueMigrateMemObjectEXT
public final long clEnqueueMigrateMemObjectEXT
-
clEnqueueMigrateMemObjects
public final long clEnqueueMigrateMemObjects
-
clEnqueueNDRangeKernel
public final long clEnqueueNDRangeKernel
-
clEnqueueNativeKernel
public final long clEnqueueNativeKernel
-
clEnqueueReadBuffer
public final long clEnqueueReadBuffer
-
clEnqueueReadBufferRect
public final long clEnqueueReadBufferRect
-
clEnqueueReadImage
public final long clEnqueueReadImage
-
clEnqueueReleaseEGLObjectsKHR
public final long clEnqueueReleaseEGLObjectsKHR
-
clEnqueueReleaseGLObjects
public final long clEnqueueReleaseGLObjects
-
clEnqueueReleaseVA_APIMediaSurfacesINTEL
public final long clEnqueueReleaseVA_APIMediaSurfacesINTEL
-
clEnqueueSVMFree
public final long clEnqueueSVMFree
-
clEnqueueSVMMap
public final long clEnqueueSVMMap
-
clEnqueueSVMMemFill
public final long clEnqueueSVMMemFill
-
clEnqueueSVMMemcpy
public final long clEnqueueSVMMemcpy
-
clEnqueueSVMMigrateMem
public final long clEnqueueSVMMigrateMem
-
clEnqueueSVMUnmap
public final long clEnqueueSVMUnmap
-
clEnqueueTask
public final long clEnqueueTask
-
clEnqueueUnmapMemObject
public final long clEnqueueUnmapMemObject
-
clEnqueueWaitForEvents
public final long clEnqueueWaitForEvents
-
clEnqueueWaitSignalAMD
public final long clEnqueueWaitSignalAMD
-
clEnqueueWriteBuffer
public final long clEnqueueWriteBuffer
-
clEnqueueWriteBufferRect
public final long clEnqueueWriteBufferRect
-
clEnqueueWriteImage
public final long clEnqueueWriteImage
-
clEnqueueWriteSignalAMD
public final long clEnqueueWriteSignalAMD
-
clFinish
public final long clFinish
-
clFlush
public final long clFlush
-
clGetAcceleratorInfoINTEL
public final long clGetAcceleratorInfoINTEL
-
clGetCommandQueueInfo
public final long clGetCommandQueueInfo
-
clGetContextInfo
public final long clGetContextInfo
-
clGetDeviceAndHostTimer
public final long clGetDeviceAndHostTimer
-
clGetDeviceIDs
public final long clGetDeviceIDs
-
clGetDeviceIDsFromVA_APIMediaAdapterINTEL
public final long clGetDeviceIDsFromVA_APIMediaAdapterINTEL
-
clGetDeviceImageInfoQCOM
public final long clGetDeviceImageInfoQCOM
-
clGetDeviceInfo
public final long clGetDeviceInfo
-
clGetEventInfo
public final long clGetEventInfo
-
clGetEventProfilingInfo
public final long clGetEventProfilingInfo
-
clGetExtensionFunctionAddress
public final long clGetExtensionFunctionAddress
-
clGetExtensionFunctionAddressForPlatform
public final long clGetExtensionFunctionAddressForPlatform
-
clGetGLContextInfoAPPLE
public final long clGetGLContextInfoAPPLE
-
clGetGLContextInfoKHR
public final long clGetGLContextInfoKHR
-
clGetGLObjectInfo
public final long clGetGLObjectInfo
-
clGetGLTextureInfo
public final long clGetGLTextureInfo
-
clGetHostTimer
public final long clGetHostTimer
-
clGetImageInfo
public final long clGetImageInfo
-
clGetKernelArgInfo
public final long clGetKernelArgInfo
-
clGetKernelInfo
public final long clGetKernelInfo
-
clGetKernelSubGroupInfo
public final long clGetKernelSubGroupInfo
-
clGetKernelSubGroupInfoKHR
public final long clGetKernelSubGroupInfoKHR
-
clGetKernelWorkGroupInfo
public final long clGetKernelWorkGroupInfo
-
clGetMemObjectInfo
public final long clGetMemObjectInfo
-
clGetPipeInfo
public final long clGetPipeInfo
-
clGetPlatformIDs
public final long clGetPlatformIDs
-
clGetPlatformInfo
public final long clGetPlatformInfo
-
clGetProgramBuildInfo
public final long clGetProgramBuildInfo
-
clGetProgramInfo
public final long clGetProgramInfo
-
clGetSamplerInfo
public final long clGetSamplerInfo
-
clGetSupportedImageFormats
public final long clGetSupportedImageFormats
-
clLinkProgram
public final long clLinkProgram
-
clLogMessagesToStderrAPPLE
public final long clLogMessagesToStderrAPPLE
-
clLogMessagesToStdoutAPPLE
public final long clLogMessagesToStdoutAPPLE
-
clLogMessagesToSystemLogAPPLE
public final long clLogMessagesToSystemLogAPPLE
-
clReleaseAcceleratorINTEL
public final long clReleaseAcceleratorINTEL
-
clReleaseCommandQueue
public final long clReleaseCommandQueue
-
clReleaseContext
public final long clReleaseContext
-
clReleaseDevice
public final long clReleaseDevice
-
clReleaseDeviceEXT
public final long clReleaseDeviceEXT
-
clReleaseEvent
public final long clReleaseEvent
-
clReleaseKernel
public final long clReleaseKernel
-
clReleaseMemObject
public final long clReleaseMemObject
-
clReleaseProgram
public final long clReleaseProgram
-
clReleaseSampler
public final long clReleaseSampler
-
clReportLiveObjectsAltera
public final long clReportLiveObjectsAltera
-
clRetainAcceleratorINTEL
public final long clRetainAcceleratorINTEL
-
clRetainCommandQueue
public final long clRetainCommandQueue
-
clRetainContext
public final long clRetainContext
-
clRetainDevice
public final long clRetainDevice
-
clRetainDeviceEXT
public final long clRetainDeviceEXT
-
clRetainEvent
public final long clRetainEvent
-
clRetainKernel
public final long clRetainKernel
-
clRetainMemObject
public final long clRetainMemObject
-
clRetainProgram
public final long clRetainProgram
-
clRetainSampler
public final long clRetainSampler
-
clSVMAlloc
public final long clSVMAlloc
-
clSVMFree
public final long clSVMFree
-
clSetDefaultDeviceCommandQueue
public final long clSetDefaultDeviceCommandQueue
-
clSetEventCallback
public final long clSetEventCallback
-
clSetKernelArg
public final long clSetKernelArg
-
clSetKernelArgSVMPointer
public final long clSetKernelArgSVMPointer
-
clSetKernelExecInfo
public final long clSetKernelExecInfo
-
clSetMemObjectDestructorCallback
public final long clSetMemObjectDestructorCallback
-
clSetProgramReleaseCallback
public final long clSetProgramReleaseCallback
-
clSetProgramSpecializationConstant
public final long clSetProgramSpecializationConstant
-
clSetUserEventStatus
public final long clSetUserEventStatus
-
clTerminateContextKHR
public final long clTerminateContextKHR
-
clTrackLiveObjectsAltera
public final long clTrackLiveObjectsAltera
-
clUnloadCompiler
public final long clUnloadCompiler
-
clUnloadPlatformCompiler
public final long clUnloadPlatformCompiler
-
clWaitForEvents
public final long clWaitForEvents
-
OpenCL10
public final boolean OpenCL10
When true, CL10 is supported.
-
OpenCL10GL
public final boolean OpenCL10GL
When true, CL10GL is supported.
-
OpenCL11
public final boolean OpenCL11
When true, CL11 is supported.
-
OpenCL12
public final boolean OpenCL12
When true, CL12 is supported.
-
OpenCL12GL
public final boolean OpenCL12GL
When true, CL12GL is supported.
-
OpenCL20
public final boolean OpenCL20
When true, CL20 is supported.
-
OpenCL21
public final boolean OpenCL21
When true, CL21 is supported.
-
OpenCL22
public final boolean OpenCL22
When true, CL22 is supported.
-
cl_altera_compiler_mode
public final boolean cl_altera_compiler_mode
When true, ALTERACompilerMode is supported.
-
cl_altera_device_temperature
public final boolean cl_altera_device_temperature
When true, ALTERADeviceTemperature is supported.
-
cl_altera_live_object_tracking
public final boolean cl_altera_live_object_tracking
When true, ALTERALiveObjectTracking is supported.
-
cl_amd_bus_addressable_memory
public final boolean cl_amd_bus_addressable_memory
When true, AMDBusAddressableMemory is supported.
-
cl_amd_compile_options
public final boolean cl_amd_compile_options
When true, the amd_compile_options extension is supported.
This extension adds the following options, which are not part of the OpenCL specification:
- -g – This is an experimental feature that lets you use the GNU project debugger, GDB, to debug kernels on x86 CPUs running Linux or Cygwin/MinGW under Windows. This option does not affect the default optimization of the OpenCL code.
- -O0 – Tells the compiler not to optimize. This is equivalent to the OpenCL standard option -cl-opt-disable.
- -f[no-]bin-source – Does [not] generate OpenCL source in the .source section. By default, the source is NOT generated.
- -f[no-]bin-llvmir – Does [not] generate LLVM IR in the .llvmir section. By default, LLVM IR IS generated.
- -f[no-]bin-amdil – Does [not] generate AMD IL in the .amdil section. By default, AMD IL is NOT generated.
- -f[no-]bin-exe – Does [not] generate the executable (ISA) in the .text section. By default, the executable IS generated.
- -f[no-]bin-hsail – Does [not] generate HSAIL/BRIG in the binary. By default, HSAIL/BRIG is NOT generated.
To avoid source changes, two environment variables can be used to change CL options at runtime:
- AMD_OCL_BUILD_OPTIONS – Overrides the CL options specified in BuildProgram.
- AMD_OCL_BUILD_OPTIONS_APPEND – Appends options to the options specified in BuildProgram.
-
cl_amd_device_attribute_query
public final boolean cl_amd_device_attribute_query
When true, AMDDeviceAttributeQuery is supported.
-
cl_amd_device_board_name
public final boolean cl_amd_device_board_name
When true, AMDDeviceBoardName is supported.
-
cl_amd_device_persistent_memory
public final boolean cl_amd_device_persistent_memory
When true, AMDDevicePersistentMemory is supported.
-
cl_amd_device_profiling_timer_offset
public final boolean cl_amd_device_profiling_timer_offset
When true, AMDDeviceProfilingTimerOffset is supported.
-
cl_amd_device_topology
public final boolean cl_amd_device_topology
When true, AMDDeviceTopology is supported.
-
cl_amd_event_callback
public final boolean cl_amd_event_callback
When true, the amd_event_callback extension is supported.
This extension provides the ability to register event callbacks for states other than COMPLETE. The full set of event states is allowed: QUEUED, SUBMITTED, and RUNNING.
-
cl_amd_fp64
public final boolean cl_amd_fp64
When true, the amd_fp64 extension is supported.
This extension provides a subset of the functionality provided by the cl_khr_fp64 extension. When enabled, the compiler recognizes the double scalar and vector types, compiles expressions involving those types, and accepts calls to all built-in functions enabled by the cl_khr_fp64 extension. However, this extension does not guarantee that all cl_khr_fp64 built-in functions are implemented, nor that the built-in functions that have been implemented are conformant to the cl_khr_fp64 extension.
-
cl_amd_media_ops
public final boolean cl_amd_media_ops
When true, the amd_media_ops extension is supported.
The directive, when enabled, adds the following built-in functions to the OpenCL language.
Note: typen denotes an OpenCL scalar type {n = 1} and vector types {n = 4, 8, 16}.
Built-in Function: uint amd_pack(float4 src)
Description:
  dst = ((((uint)src.s0) & 0xff))
      + ((((uint)src.s1) & 0xff) << 8)
      + ((((uint)src.s2) & 0xff) << 16)
      + ((((uint)src.s3) & 0xff) << 24)
Built-in Function: floatn amd_unpack3(uintn src)
Description:
  dst.s0 = (float)((src.s0 >> 24) & 0xff)
  A similar operation is applied to the other components of the vectors.
Built-in Function: floatn amd_unpack2(uintn src)
Description:
  dst.s0 = (float)((src.s0 >> 16) & 0xff)
  A similar operation is applied to the other components of the vectors.
Built-in Function: floatn amd_unpack1(uintn src)
Description:
  dst.s0 = (float)((src.s0 >> 8) & 0xff)
  A similar operation is applied to the other components of the vectors.
Built-in Function: floatn amd_unpack0(uintn src)
Description:
  dst.s0 = (float)(src.s0 & 0xff)
  A similar operation is applied to the other components of the vectors.
Built-in Function: uintn amd_bitalign(uintn src0, uintn src1, uintn src2)
Description:
  dst.s0 = (uint) (((((long)src0.s0) << 32) | (long)src1.s0) >> (src2.s0 & 31))
  A similar operation is applied to the other components of the vectors.
Built-in Function: uintn amd_bytealign(uintn src0, uintn src1, uintn src2)
Description:
  dst.s0 = (uint) (((((long)src0.s0) << 32) | (long)src1.s0) >> ((src2.s0 & 3)*8))
  A similar operation is applied to the other components of the vectors.
Built-in Function: uintn amd_lerp(uintn src0, uintn src1, uintn src2)
Description:
  dst.s0 = (((((src0.s0 >> 0) & 0xff) + ((src1.s0 >> 0) & 0xff) + ((src2.s0 >> 0) & 1)) >> 1) << 0)
         + (((((src0.s0 >> 8) & 0xff) + ((src1.s0 >> 8) & 0xff) + ((src2.s0 >> 8) & 1)) >> 1) << 8)
         + (((((src0.s0 >> 16) & 0xff) + ((src1.s0 >> 16) & 0xff) + ((src2.s0 >> 16) & 1)) >> 1) << 16)
         + (((((src0.s0 >> 24) & 0xff) + ((src1.s0 >> 24) & 0xff) + ((src2.s0 >> 24) & 1)) >> 1) << 24);
  A similar operation is applied to the other components of the vectors.
Built-in Function: uintn amd_sad(uintn src0, uintn src1, uintn src2)
Description:
  dst.s0 = src2.s0
         + abs(((src0.s0 >> 0) & 0xff) - ((src1.s0 >> 0) & 0xff))
         + abs(((src0.s0 >> 8) & 0xff) - ((src1.s0 >> 8) & 0xff))
         + abs(((src0.s0 >> 16) & 0xff) - ((src1.s0 >> 16) & 0xff))
         + abs(((src0.s0 >> 24) & 0xff) - ((src1.s0 >> 24) & 0xff));
  A similar operation is applied to the other components of the vectors.
Built-in Function: uintn amd_sadhi(uintn src0, uintn src1, uintn src2)
Description:
  dst.s0 = src2.s0
         + (abs(((src0.s0 >> 0) & 0xff) - ((src1.s0 >> 0) & 0xff)) << 16)
         + (abs(((src0.s0 >> 8) & 0xff) - ((src1.s0 >> 8) & 0xff)) << 16)
         + (abs(((src0.s0 >> 16) & 0xff) - ((src1.s0 >> 16) & 0xff)) << 16)
         + (abs(((src0.s0 >> 24) & 0xff) - ((src1.s0 >> 24) & 0xff)) << 16);
  A similar operation is applied to the other components of the vectors.
Built-in Function: uint amd_sad4(uint4 src0, uint4 src1, uint src2)
Description:
  dst = src2
      + abs(((src0.s0 >> 0) & 0xff) - ((src1.s0 >> 0) & 0xff))
      + abs(((src0.s0 >> 8) & 0xff) - ((src1.s0 >> 8) & 0xff))
      + abs(((src0.s0 >> 16) & 0xff) - ((src1.s0 >> 16) & 0xff))
      + abs(((src0.s0 >> 24) & 0xff) - ((src1.s0 >> 24) & 0xff))
      + abs(((src0.s1 >> 0) & 0xff) - ((src1.s1 >> 0) & 0xff))
      + abs(((src0.s1 >> 8) & 0xff) - ((src1.s1 >> 8) & 0xff))
      + abs(((src0.s1 >> 16) & 0xff) - ((src1.s1 >> 16) & 0xff))
      + abs(((src0.s1 >> 24) & 0xff) - ((src1.s1 >> 24) & 0xff))
      + abs(((src0.s2 >> 0) & 0xff) - ((src1.s2 >> 0) & 0xff))
      + abs(((src0.s2 >> 8) & 0xff) - ((src1.s2 >> 8) & 0xff))
      + abs(((src0.s2 >> 16) & 0xff) - ((src1.s2 >> 16) & 0xff))
      + abs(((src0.s2 >> 24) & 0xff) - ((src1.s2 >> 24) & 0xff))
      + abs(((src0.s3 >> 0) & 0xff) - ((src1.s3 >> 0) & 0xff))
      + abs(((src0.s3 >> 8) & 0xff) - ((src1.s3 >> 8) & 0xff))
      + abs(((src0.s3 >> 16) & 0xff) - ((src1.s3 >> 16) & 0xff))
      + abs(((src0.s3 >> 24) & 0xff) - ((src1.s3 >> 24) & 0xff));
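As a worked illustration of the formulas above, here is a host-side Java sketch of the scalar (n = 1) case for a few of the functions. The class MediaOpsSketch and its method names are hypothetical and are not part of LWJGL or the AMD extension. Java has no unsigned 32-bit type, so uint values are carried in int and widened with a mask where needed, and amdPack assumes non-negative inputs that fit in an int.

```java
/** Host-side scalar reference for some amd_media_ops formulas (hypothetical helper, n = 1 case). */
final class MediaOpsSketch {

    /** amd_pack: low byte of each converted component, packed into one 32-bit word. */
    static int amdPack(float s0, float s1, float s2, float s3) {
        return (((int) s0) & 0xff)
             + ((((int) s1) & 0xff) << 8)
             + ((((int) s2) & 0xff) << 16)
             + ((((int) s3) & 0xff) << 24);
    }

    /** amd_unpack0..3: byte k (0..3) of src, converted to float. */
    static float amdUnpack(int src, int k) {
        return (float) ((src >>> (8 * k)) & 0xff);
    }

    /** amd_bitalign: shifts the 64-bit value src0:src1 right by (src2 & 31) bits, keeps the low 32 bits. */
    static int amdBitalign(int src0, int src1, int src2) {
        long wide = ((src0 & 0xffffffffL) << 32) | (src1 & 0xffffffffL);
        return (int) (wide >>> (src2 & 31));
    }

    /** amd_bytealign: same as amd_bitalign, but the shift is (src2 & 3) whole bytes. */
    static int amdBytealign(int src0, int src1, int src2) {
        long wide = ((src0 & 0xffffffffL) << 32) | (src1 & 0xffffffffL);
        return (int) (wide >>> ((src2 & 3) * 8));
    }

    /** amd_lerp: per-byte average of src0 and src1; bit 0 of each src2 byte is added before halving. */
    static int amdLerp(int src0, int src1, int src2) {
        int dst = 0;
        for (int shift = 0; shift < 32; shift += 8) {
            int sum = ((src0 >>> shift) & 0xff) + ((src1 >>> shift) & 0xff) + ((src2 >>> shift) & 1);
            dst += (sum >> 1) << shift;
        }
        return dst;
    }

    /** amd_sad: src2 plus the sum of absolute differences of the four byte pairs. */
    static int amdSad(int src0, int src1, int src2) {
        int sum = src2;
        for (int shift = 0; shift < 32; shift += 8) {
            sum += Math.abs(((src0 >>> shift) & 0xff) - ((src1 >>> shift) & 0xff));
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(String.format("%08x", amdPack(1f, 2f, 3f, 4f)));                  // 04030201
        System.out.println(amdUnpack(0x04030201, 3));                                        // 4.0
        System.out.println(String.format("%08x", amdBitalign(0x12345678, 0x9abcdef0, 8)));   // 789abcde
        System.out.println(String.format("%08x", amdBytealign(0x12345678, 0x9abcdef0, 2)));  // 56789abc
        System.out.println(String.format("%08x", amdLerp(0x10203040, 0x20304050, 0)));       // 18283848
        System.out.println(amdSad(0x01020304, 0x04030201, 10));                              // 18
    }
}
```

A reference like this can be useful for checking kernel results on the host; the vector variants apply the same computation to each component.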
-
cl_amd_media_ops2
public final boolean cl_amd_media_ops2
When true, the amd_media_ops2 extension is supported.
The directive, when enabled, adds the following built-in functions to the OpenCL language.
Note: typen denotes an OpenCL scalar type {n = 1} and vector types {n = 2, 4, 8, 16}.
Built-in Function: uintn amd_msad(uintn src0, uintn src1, uintn src2)
Description:
  uchar4 src0u8 = as_uchar4(src0.s0);
  uchar4 src1u8 = as_uchar4(src1.s0);
  dst.s0 = src2.s0
         + ((src1u8.s0 == 0) ? 0 : abs(src0u8.s0 - src1u8.s0))
         + ((src1u8.s1 == 0) ? 0 : abs(src0u8.s1 - src1u8.s1))
         + ((src1u8.s2 == 0) ? 0 : abs(src0u8.s2 - src1u8.s2))
         + ((src1u8.s3 == 0) ? 0 : abs(src0u8.s3 - src1u8.s3));
  A similar operation is applied to the other components of the vectors.
Built-in Function: ulongn amd_qsad(ulongn src0, uintn src1, ulongn src2)
Description:
  uchar8 src0u8 = as_uchar8(src0.s0);
  ushort4 src2u16 = as_ushort4(src2.s0);
  ushort4 dstu16;
  dstu16.s0 = amd_sad(as_uint(src0u8.s0123), src1.s0, src2u16.s0);
  dstu16.s1 = amd_sad(as_uint(src0u8.s1234), src1.s0, src2u16.s1);
  dstu16.s2 = amd_sad(as_uint(src0u8.s2345), src1.s0, src2u16.s2);
  dstu16.s3 = amd_sad(as_uint(src0u8.s3456), src1.s0, src2u16.s3);
  dst.s0 = as_uint2(dstu16);
  A similar operation is applied to the other components of the vectors.
Built-in Function: ulongn amd_mqsad(ulongn src0, uintn src1, ulongn src2)
Description:
  uchar8 src0u8 = as_uchar8(src0.s0);
  ushort4 src2u16 = as_ushort4(src2.s0);
  ushort4 dstu16;
  dstu16.s0 = amd_msad(as_uint(src0u8.s0123), src1.s0, src2u16.s0);
  dstu16.s1 = amd_msad(as_uint(src0u8.s1234), src1.s0, src2u16.s1);
  dstu16.s2 = amd_msad(as_uint(src0u8.s2345), src1.s0, src2u16.s2);
  dstu16.s3 = amd_msad(as_uint(src0u8.s3456), src1.s0, src2u16.s3);
  dst.s0 = as_uint2(dstu16);
  A similar operation is applied to the other components of the vectors.
Built-in Function: uintn amd_sadw(uintn src0, uintn src1, uintn src2)
Description:
  ushort2 src0u16 = as_ushort2(src0.s0);
  ushort2 src1u16 = as_ushort2(src1.s0);
  dst.s0 = src2.s0 + abs(src0u16.s0 - src1u16.s0) + abs(src0u16.s1 - src1u16.s1);
  A similar operation is applied to the other components of the vectors.
Built-in Function: uintn amd_sadd(uintn src0, uintn src1, uintn src2)
Description:
  dst.s0 = src2.s0 + abs(src0.s0 - src1.s0);
  A similar operation is applied to the other components of the vectors.
Built-in Function: uintn amd_bfm(uintn src0, uintn src1)
Description:
  dst.s0 = ((1 << (src0.s0 & 0x1f)) - 1) << (src1.s0 & 0x1f);
  A similar operation is applied to the other components of the vectors.
Built-in Function: uintn amd_bfe(uintn src0, uintn src1, uintn src2)
Description:
  NOTE: operator >> below represents a logical right shift.
  offset = src1.s0 & 31;
  width = src2.s0 & 31;
  if width == 0
    dst.s0 = 0;
  else if (offset + width) < 32
    dst.s0 = (src0.s0 << (32 - offset - width)) >> (32 - width);
  else
    dst.s0 = src0.s0 >> offset;
  A similar operation is applied to the other components of the vectors.
Built-in Function: intn amd_bfe(intn src0, uintn src1, uintn src2)
Description:
  NOTE: operator >> below represents an arithmetic right shift.
  offset = src1.s0 & 31;
  width = src2.s0 & 31;
  if width == 0
    dst.s0 = 0;
  else if (offset + width) < 32
    dst.s0 = (src0.s0 << (32 - offset - width)) >> (32 - width);
  else
    dst.s0 = src0.s0 >> offset;
  A similar operation is applied to the other components of the vectors.
Built-in Function: intn amd_median3(intn src0, intn src1, intn src2)
                   uintn amd_median3(uintn src0, uintn src1, uintn src2)
                   floatn amd_median3(floatn src0, floatn src1, floatn src2)
Description:
  Returns the median of src0, src1, and src2.
Built-in Function: intn amd_min3(intn src0, intn src1, intn src2)
                   uintn amd_min3(uintn src0, uintn src1, uintn src2)
                   floatn amd_min3(floatn src0, floatn src1, floatn src2)
Description:
  Returns the minimum of src0, src1, and src2.
Built-in Function: intn amd_max3(intn src0, intn src1, intn src2)
                   uintn amd_max3(uintn src0, uintn src1, uintn src2)
                   floatn amd_max3(floatn src0, floatn src1, floatn src2)
Description:
  Returns the maximum of src0, src1, and src2.
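The bit-field and masked-SAD definitions above are easier to follow with concrete numbers. The following host-side Java sketch covers the scalar case of amd_bfm, both amd_bfe variants, amd_msad, and the int form of amd_median3. The class MediaOps2Sketch and its method names are hypothetical; uint values are carried in Java int, with >>> standing in for the logical shift and >> for the arithmetic shift.

```java
/** Host-side scalar reference for some amd_media_ops2 formulas (hypothetical helper, n = 1 case). */
final class MediaOps2Sketch {

    /** amd_bfm: a mask of (src0 & 31) one-bits, shifted left by (src1 & 31). */
    static int amdBfm(int src0, int src1) {
        return ((1 << (src0 & 0x1f)) - 1) << (src1 & 0x1f);
    }

    /** Unsigned amd_bfe: extracts `width` bits starting at `offset`, zero-extended. */
    static int amdBfeUnsigned(int src0, int src1, int src2) {
        int offset = src1 & 31;
        int width = src2 & 31;
        if (width == 0) {
            return 0;
        }
        if (offset + width < 32) {
            return (src0 << (32 - offset - width)) >>> (32 - width);
        }
        return src0 >>> offset;
    }

    /** Signed amd_bfe: same extraction, but the field is sign-extended. */
    static int amdBfeSigned(int src0, int src1, int src2) {
        int offset = src1 & 31;
        int width = src2 & 31;
        if (width == 0) {
            return 0;
        }
        if (offset + width < 32) {
            return (src0 << (32 - offset - width)) >> (32 - width);
        }
        return src0 >> offset;
    }

    /** amd_msad: like amd_sad, but byte pairs whose src1 byte is zero are skipped. */
    static int amdMsad(int src0, int src1, int src2) {
        int sum = src2;
        for (int shift = 0; shift < 32; shift += 8) {
            int a = (src0 >>> shift) & 0xff;
            int b = (src1 >>> shift) & 0xff;
            if (b != 0) {
                sum += Math.abs(a - b);
            }
        }
        return sum;
    }

    /** amd_median3 (int form): the middle value of three. */
    static int amdMedian3(int a, int b, int c) {
        return Math.max(Math.min(a, b), Math.min(Math.max(a, b), c));
    }

    public static void main(String[] args) {
        System.out.println(Integer.toHexString(amdBfm(4, 8)));            // f00
        System.out.println(amdBfeUnsigned(0x00000f00, 8, 4));             // 15
        System.out.println(amdBfeSigned(0x00000f00, 8, 4));               // -1
        System.out.println(amdMsad(0x01020304, 0x04000201, 10));          // 17
        System.out.println(amdMedian3(3, 1, 2));                          // 2
    }
}
```

The signed and unsigned amd_bfe calls extract the same four bits (1111); only the extension of the top bit differs, giving 15 versus -1.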
-
cl_amd_offline_devices
public final boolean cl_amd_offline_devices
When true, AMDOfflineDevices is supported.
-
cl_amd_popcnt
public final boolean cl_amd_popcnt
When true, the amd_popcnt extension is supported.
This extension introduces a “population count” function called popcnt. This extension was taken into core OpenCL 1.2, and the function was renamed popcount. The core 1.2 popcount function is identical to the AMD extension popcnt function.
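For readers unfamiliar with the term, a population count is the number of bits set to 1 in a value. The host-side Java sketch below shows the computation; PopcountSketch is a hypothetical name, and the code is a reference for the arithmetic only, not the kernel-side built-in.

```java
/** Host-side illustration of a 32-bit population count (hypothetical helper). */
final class PopcountSketch {

    /** Counts set bits by repeatedly clearing the lowest set bit. */
    static int popcount(int x) {
        int count = 0;
        while (x != 0) {
            x &= x - 1; // clears the lowest set bit
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(popcount(0b1011));          // 3
        System.out.println(popcount(-1));              // 32
        // The JDK provides the same computation as Integer.bitCount.
        System.out.println(Integer.bitCount(0b1011));  // 3
    }
}
```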
-
cl_amd_predefined_macros
public final boolean cl_amd_predefined_macros
When true, the amd_predefined_macros extension is supported.
The following macros are predefined when compiling OpenCL™ C kernels. These macros are defined automatically based on the device for which the code is being compiled.
GPU devices
- __Barts__
- __BeaverCreek__
- __Bheem__
- __Bonaire__
- __Caicos__
- __Capeverde__
- __Carrizo__
- __Cayman__
- __Cedar__
- __Cypress__
- __Devastator__
- __Hainan__
- __Iceland__
- __Juniper__
- __Kalindi__
- __Kauai__
- __Lombok__
- __Loveland__
- __Mullins__
- __Oland__
- __Pitcairn__
- __RV710__
- __RV730__
- __RV740__
- __RV770__
- __RV790__
- __Redwood__
- __Scrapper__
- __Spectre__
- __Spooky__
- __Tahiti__
- __Tonga__
- __Turks__
- __WinterPark__
- __GPU__
CPU devices
- __CPU__
- __X86__
- __X86_64__
Note that __GPU__ or __CPU__ are predefined whenever a GPU or CPU device is the compilation target.
-
cl_amd_printf
public final boolean cl_amd_printf
When true, the amd_printf extension is supported.
This extension adds the built-in function printf(__constant char * restrict format, …);
This function writes output to the stdout stream associated with the host application. The format string is a null-terminated character sequence composed of zero or more directives:
- ordinary characters (i.e. not %), which are copied directly to the output stream unchanged, and
- conversion specifications, each of which can result in fetching zero or more arguments, converting them, and then writing the final result to the output stream.
The format string must be resolvable at compile time; thus, it cannot be dynamically created by the executing program. (Note that the use of variadic arguments in the built-in printf does not imply its use in other builtins; more importantly, it is not valid to use printf in user-defined functions or kernels.)
The OpenCL C printf closely matches the definition found as part of the C99 standard. Note that conversions introduced in the format string with % are supported with the following guidelines:
- A 32-bit floating point argument is not converted to a 64-bit double, unless the extension cl_khr_fp64 is supported and enabled. This includes the double variants if cl_khr_fp64 is supported and defined in the corresponding compilation unit.
- 64-bit integer types can be printed using %ld / %lx / %lu.
- %lld / %llx / %llu are not supported and reserved for 128-bit integer types (long long).
- All OpenCL vector types can be explicitly passed and printed using the modifier vn, where n can be 2, 3, 4, 8, or 16. This modifier appears before the original conversion specifier for the vector’s component type (for example, %v4f to print a float4). Since vn is a conversion specifier, it is valid to apply optional flags, such as field width and precision, just as it is when printing the component types. Since a vector is an aggregate type, the comma separator is used between the components: 0:1, … , n-2:n-1.
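To make the shape of vn output concrete, the Java sketch below emulates it on the host: each component is formatted with the scalar conversion and the results are joined with commas. This is an emulation only. VectorPrintfSketch and formatVector are hypothetical names, the code does not invoke the OpenCL printf, and the exact digits printed by a real device may differ.

```java
import java.util.Locale;
import java.util.StringJoiner;

/** Host-side emulation of how a vn conversion lays out vector components (hypothetical helper). */
final class VectorPrintfSketch {

    /**
     * Formats every component with the same scalar conversion (for example "%.1f")
     * and separates the results with commas, as described for the vn modifier.
     */
    static String formatVector(String componentFormat, float... components) {
        StringJoiner joined = new StringJoiner(",");
        for (float component : components) {
            // Locale.ROOT keeps the decimal separator a '.' on every machine.
            joined.add(String.format(Locale.ROOT, componentFormat, component));
        }
        return joined.toString();
    }

    public static void main(String[] args) {
        // Rough analogue of printing a float4 with a precision flag applied per component.
        System.out.println(formatVector("%.1f", 1f, 2f, 3f, 4f)); // 1.0,2.0,3.0,4.0
    }
}
```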
-
cl_amd_vec3
public final boolean cl_amd_vec3
When true, the amd_vec3 extension is supported.
This extension adds support for vectors with three elements: float3, short3, char3, etc. This data type was added to OpenCL 1.1 as a core feature.
-
cl_APPLE_biased_fixed_point_image_formats
public final boolean cl_APPLE_biased_fixed_point_image_formats
When true, APPLEBiasedFixedPointImageFormats is supported.
-
cl_APPLE_command_queue_priority
public final boolean cl_APPLE_command_queue_priority
When true, APPLECommandQueuePriority is supported.
-
cl_APPLE_command_queue_select_compute_units
public final boolean cl_APPLE_command_queue_select_compute_units
When true, APPLECommandQueueSelectComputeUnits is supported.
-
cl_APPLE_ContextLoggingFunctions
public final boolean cl_APPLE_ContextLoggingFunctions
When true, APPLEContextLoggingFunctions is supported.
-
cl_APPLE_fixed_alpha_channel_orders
public final boolean cl_APPLE_fixed_alpha_channel_orders
When true, APPLEFixedAlphaChannelOrders is supported.
-
cl_APPLE_fp64_basic_ops
public final boolean cl_APPLE_fp64_basic_ops
When true, APPLE_fp64_basic_ops is supported.
-
cl_APPLE_gl_sharing
public final boolean cl_APPLE_gl_sharing
When true, APPLEGLSharing is supported.
-
cl_APPLE_query_kernel_names
public final boolean cl_APPLE_query_kernel_names
When true, APPLEQueryKernelNames is supported.
-
cl_arm_core_id
public final boolean cl_arm_core_id
When true, the arm_core_id extension is supported.
This extension provides a built-in function (uint arm_get_core_id( void )) which returns the physical core id (OpenCL Compute Unit) that a work-group is running on. This value is uniform for a work-group.
This value can be used for a core-specific cache or atomic pool where the storage is required to be in global memory and persistent (but not ordered) between work-groups. This does not provide any additional ordering on top of the existing guarantees between work-groups, nor does it provide any guarantee of concurrent execution.
-
cl_arm_printf
public final boolean cl_arm_printf
When true, ARMPrintf is supported.
-
cl_ext_atomic_counters_32
public final boolean cl_ext_atomic_counters_32
When true, EXTAtomicCounters32 is supported.
-
cl_ext_atomic_counters_64
public final boolean cl_ext_atomic_counters_64
When true, EXTAtomicCounters64 is supported.
-
cl_ext_device_fission
public final boolean cl_ext_device_fission
When true, EXTDeviceFission is supported.
-
cl_ext_migrate_memobject
public final boolean cl_ext_migrate_memobject
When true, EXTMigrateMemobject is supported.
-
cl_intel_accelerator
public final boolean cl_intel_accelerator
When true, INTELAccelerator is supported.
-
cl_intel_advanced_motion_estimation
public final boolean cl_intel_advanced_motion_estimation
When true, INTELAdvancedMotionEstimation is supported.
-
cl_intel_device_partition_by_names
public final boolean cl_intel_device_partition_by_names
When true, INTELDevicePartitionByNames is supported.
-
cl_intel_device_side_avc_motion_estimation
public final boolean cl_intel_device_side_avc_motion_estimation
When true, INTELDeviceSideAVCMotionEstimation is supported.
-
cl_intel_driver_diagnostics
public final boolean cl_intel_driver_diagnostics
When true, INTELDriverDiagnostics is supported.
-
cl_intel_egl_image_yuv
public final boolean cl_intel_egl_image_yuv
When true, INTELEGLImageYUV is supported.
-
cl_intel_media_block_io
public final boolean cl_intel_media_block_io
This extension augments the block read/write functionality available in the Intel vendor extensions intel_subgroups and intel_media_block_io by specifying additional built-in functions to facilitate the reading and writing of flexible 2D regions from images. This API allows for the explicit specification of the width and height of the image regions.
While not required, this extension is most useful when the subgroup size is known at compile time. The primary use case for this extension is to support the reading of the edge texels (or image elements) of neighboring macro-blocks as described in the Intel vendor extension intel_device_side_avc_motion_estimation. When using the built-in functions from cl_intel_device_side_avc_motion_estimation, the subgroup size is implicitly fixed to 16. In other use cases the subgroup size may be fixed using the intel_required_subgroup_size extension, if needed.
-
cl_intel_motion_estimation
public final boolean cl_intel_motion_estimation
When true, INTELMotionEstimation is supported.
-
cl_intel_packed_yuv
public final boolean cl_intel_packed_yuv
When true, INTELPackedYUV is supported.
-
cl_intel_planar_yuv
public final boolean cl_intel_planar_yuv
When true, INTELPlanarYUV is supported.
-
cl_intel_printf
public final boolean cl_intel_printf
When true, intel_printf is supported.
-
cl_intel_required_subgroup_size
public final boolean cl_intel_required_subgroup_size
When true, INTELRequiredSubgroupSize is supported.
-
cl_intel_simultaneous_sharing
public final boolean cl_intel_simultaneous_sharing
When true, INTELSimultaneousSharing is supported.
-
cl_intel_subgroups
public final boolean cl_intel_subgroups
When true, INTELSubgroups is supported.
-
cl_intel_subgroups_short
public final boolean cl_intel_subgroups_short
The goal of this extension is to allow programmers to improve the performance of applications operating on 16-bit data types by extending the subgroup functions described in the intel_subgroups extension to support 16-bit integer data types (shorts and ushorts). Specifically, the extension:
- Extends the subgroup broadcast function to allow 16-bit integer values to be broadcast from one work item to all other work items in the subgroup.
- Extends the subgroup scan and reduction functions to operate on 16-bit integer data types.
- Extends the Intel subgroup shuffle functions to allow arbitrarily exchanging 16-bit integer values among work items in the subgroup.
- Extends the Intel subgroup block read and write functions to allow reading and writing 16-bit integer data from images and buffers.
Requires OpenCL 1.2 and intel_subgroups.
-
cl_intel_thread_local_exec
public final boolean cl_intel_thread_local_exec
When true, INTELThreadLocalExec is supported.
-
cl_intel_va_api_media_sharing
public final boolean cl_intel_va_api_media_sharing
When true, INTELVAAPIMediaSharing is supported.
-
cl_khr_3d_image_writes
public final boolean cl_khr_3d_image_writes
When true, the khr_3d_image_writes extension is supported.
This extension adds support for kernel writes to 3D images.
-
cl_khr_byte_addressable_store
public final boolean cl_khr_byte_addressable_store
When true, the khr_byte_addressable_store extension is supported. This extension removes the restriction that disallows writes to a pointer (or array elements) of types less than 32 bits wide in a kernel program.
-
cl_khr_depth_images
public final boolean cl_khr_depth_images
When true, KHRDepthImages is supported.
-
cl_khr_device_enqueue_local_arg_types
public final boolean cl_khr_device_enqueue_local_arg_types
When true, the khr_device_enqueue_local_arg_types extension is supported. This extension allows arguments to blocks passed to enqueue_kernel functions to be declared as a pointer to any type (built-in or user-defined) in local memory instead of just local void *.
-
cl_khr_egl_event
public final boolean cl_khr_egl_event
When true, KHREGLEvent is supported.
-
cl_khr_egl_image
public final boolean cl_khr_egl_image
When true, KHREGLImage is supported.
-
cl_khr_fp16
public final boolean cl_khr_fp16
When true, KHRFP16 is supported.
-
cl_khr_fp64
public final boolean cl_khr_fp64
When true, KHRFP64 is supported.
-
cl_khr_gl_depth_images
public final boolean cl_khr_gl_depth_images
When true, KHRGLDepthImages is supported.
-
cl_khr_gl_event
public final boolean cl_khr_gl_event
When true, KHRGLEvent is supported.
-
cl_khr_gl_msaa_sharing
public final boolean cl_khr_gl_msaa_sharing
When true, KHRGLMSAASharing is supported.
-
cl_khr_gl_sharing
public final boolean cl_khr_gl_sharing
When true, KHRGLSharing is supported.
-
cl_khr_global_int32_base_atomics
public final boolean cl_khr_global_int32_base_atomics
When true, the khr_global_int32_base_atomics extension is supported. This extension adds basic atomic operations on 32-bit integers in global memory.
-
cl_khr_global_int32_extended_atomics
public final boolean cl_khr_global_int32_extended_atomics
When true, the khr_global_int32_extended_atomics extension is supported. This extension adds extended atomic operations on 32-bit integers in global memory.
-
cl_khr_icd
public final boolean cl_khr_icd
When true, KHRICD is supported.
-
cl_khr_image2d_from_buffer
public final boolean cl_khr_image2d_from_buffer
When true, KHRImage2DFromBuffer is supported.
-
cl_khr_initialize_memory
public final boolean cl_khr_initialize_memory
When true, KHRInitializeMemory is supported.
-
cl_khr_int64_base_atomics
public final boolean cl_khr_int64_base_atomics
When true, the khr_int64_base_atomics extension is supported. This extension adds basic atomic operations on 64-bit integers in both global and local memory.
-
cl_khr_int64_extended_atomics
public final boolean cl_khr_int64_extended_atomics
When true, the khr_int64_extended_atomics extension is supported. This extension adds extended atomic operations on 64-bit integers in both global and local memory.
-
cl_khr_local_int32_base_atomics
public final boolean cl_khr_local_int32_base_atomics
When true, the khr_local_int32_base_atomics extension is supported. This extension adds basic atomic operations on 32-bit integers in local memory.
-
cl_khr_local_int32_extended_atomics
public final boolean cl_khr_local_int32_extended_atomics
When true, the khr_local_int32_extended_atomics extension is supported. This extension adds extended atomic operations on 32-bit integers in local memory.
-
cl_khr_mipmap_image
public final boolean cl_khr_mipmap_image
When true, KHRMipmapImage is supported.
-
cl_khr_mipmap_image_writes
public final boolean cl_khr_mipmap_image_writes
When true, the khr_mipmap_image_writes extension is supported. This extension adds built-in functions that can be used to write a mip-mapped image in an OpenCL C program.
-
cl_khr_priority_hints
public final boolean cl_khr_priority_hints
When true, KHRPriorityHints is supported.
-
cl_khr_select_fprounding_mode
public final boolean cl_khr_select_fprounding_mode
When true, the khr_select_fprounding_mode extension is supported. This extension adds support for specifying the rounding mode for an instruction or group of instructions in the program source.
The appropriate rounding mode can be specified using #pragma OPENCL SELECT_ROUNDING_MODE rounding-mode in the program source.
The #pragma OPENCL SELECT_ROUNDING_MODE sets the rounding mode for all instructions that operate on floating-point types (scalar or vector types) or produce floating-point values that follow this pragma in the program source until the next #pragma OPENCL SELECT_ROUNDING_MODE is encountered. Note that the rounding mode specified for a block of code is known at compile time. Except where otherwise documented, the callee functions do not inherit the rounding mode of the caller function.
If this extension is enabled, the __ROUNDING_MODE__ preprocessor symbol shall be defined to be one of the following according to the current rounding mode:
#define __ROUNDING_MODE__ rte
#define __ROUNDING_MODE__ rtz
#define __ROUNDING_MODE__ rtp
#define __ROUNDING_MODE__ rtn
The default rounding mode is round to nearest even. The built-in math functions, the common functions, and the geometric functions are implemented with the round to nearest even rounding mode.
Various built-in conversions and the vstore_half and vstorea_halfn built-in functions that do not specify a rounding mode inherit the current rounding mode. Conversions from floating-point to integer type always use rtz mode, except where the user specifically asks for another rounding mode.
Notes: The above four rounding modes are defined by IEEE 754. Floating-point calculations may be carried out internally with extra precision and then rounded to fit into the destination type. Round to nearest even is currently the only rounding mode required by the OpenCL specification and is therefore the default rounding mode. In addition, only static selection of rounding mode is supported. Dynamically reconfiguring the rounding modes as specified by the IEEE 754 spec is not a requirement.
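As a rough host-side analogy for the four rounding directions (round to nearest even, toward zero, toward positive infinity, toward negative infinity), the sketch below maps the suffixes rte, rtz, rtp and rtn onto java.math.RoundingMode. This is not OpenCL code and not part of LWJGL; the class RoundingModeSketch, its methods, and the mapping itself are assumptions made for illustration only.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

final class RoundingModeSketch {
    // Hypothetical mapping from OpenCL-style suffixes to Java rounding modes.
    static RoundingMode toJava(String clMode) {
        switch (clMode) {
            case "rte": return RoundingMode.HALF_EVEN; // nearest, ties to even
            case "rtz": return RoundingMode.DOWN;      // toward zero
            case "rtp": return RoundingMode.CEILING;   // toward positive infinity
            case "rtn": return RoundingMode.FLOOR;     // toward negative infinity
            default:
                throw new IllegalArgumentException("unknown rounding mode: " + clMode);
        }
    }

    // Rounds a decimal literal to an integer using the given mode.
    static long roundToLong(String value, String clMode) {
        return new BigDecimal(value).setScale(0, toJava(clMode)).longValueExact();
    }

    public static void main(String[] args) {
        for (String mode : new String[] {"rte", "rtz", "rtp", "rtn"}) {
            System.out.println(mode + ": 2.5 -> " + roundToLong("2.5", mode)
                    + ", -2.5 -> " + roundToLong("-2.5", mode));
        }
    }
}
```

The values 2.5 and -2.5 are convenient probes because they sit exactly on a tie and on both sides of zero, so each of the four modes gives a distinguishable pair of results.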
-
cl_khr_spir
public final boolean cl_khr_spir
When true, KHRSPIR is supported.
-
cl_khr_subgroup_named_barrier
public final boolean cl_khr_subgroup_named_barrier
When true, KHRSubgroupNamedBarrier is supported.
-
cl_khr_terminate_context
public final boolean cl_khr_terminate_context
When true, KHRTerminateContext is supported.
-
cl_khr_throttle_hints
public final boolean cl_khr_throttle_hints
When true, KHRThrottleHints is supported.
-
cl_nv_compiler_options
public final boolean cl_nv_compiler_options
When true, the nv_compiler_options extension is supported. This extension allows the programmer to pass options to the PTX assembler, allowing greater control over code generation.
-cl-nv-maxrregcount N
Passed on to ptxas as --maxrregcount N. N is a positive integer. Specify the maximum number of registers that GPU functions can use. Up to a function-specific limit, a higher value will generally increase the performance of individual GPU threads that execute this function. However, because thread registers are allocated from a global register pool on each GPU, a higher value of this option will also reduce the maximum thread block size, thereby reducing the amount of thread parallelism. Hence, a good maxrregcount value is the result of a trade-off. If this option is not specified, then no maximum is assumed. Otherwise the specified value will be rounded to the next multiple of 4 registers, up to the GPU-specific maximum of 128 registers.
-cl-nv-opt-level N
Passed on to ptxas as --opt-level N. N is a positive integer, or 0 (no optimization). Specify optimization level. Default value: 3.
-cl-nv-verbose
Passed on to ptxas as --verbose. Enable verbose mode. Output will be reported in the build log (accessible through the callback parameter to clBuildProgram).
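The rounding rule stated for the register limit (round up to the next multiple of 4, capped at 128) can be sketched as plain arithmetic. The class MaxRegCountSketch and the method effectiveMaxRegisters are hypothetical names; the sketch models only the rule as worded above and assumes a value that is already a multiple of 4 is left unchanged. It does not call ptxas or any OpenCL API.

```java
final class MaxRegCountSketch {
    // GPU-specific maximum quoted in the text above.
    static final int GPU_MAX_REGISTERS = 128;

    // Returns the register limit after rounding up to a multiple of 4
    // and clamping to the stated maximum. Assumes exact multiples of 4
    // are kept as-is.
    static int effectiveMaxRegisters(int requested) {
        if (requested <= 0) {
            throw new IllegalArgumentException("N must be a positive integer");
        }
        int rounded = ((requested + 3) / 4) * 4;
        return Math.min(rounded, GPU_MAX_REGISTERS);
    }

    public static void main(String[] args) {
        for (int n : new int[] {1, 30, 32, 127, 200}) {
            System.out.println(n + " -> " + effectiveMaxRegisters(n));
        }
    }
}
```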
-
cl_nv_device_attribute_query
public final boolean cl_nv_device_attribute_query
When true, NVDeviceAttributeQuery is supported.
-
cl_nv_pragma_unroll
public final boolean cl_nv_pragma_unroll
When true, the nv_pragma_unroll extension is supported.
Overview
This extension extends the OpenCL C language with a hint that allows loops to be unrolled. This pragma must be used for a loop and can be used to specify full unrolling or partial unrolling by a certain amount. This is a hint and the compiler may ignore this pragma for any reason.
Goals
The principal goal of the pragma unroll is to improve the performance of loops via unrolling. Typically this enables other optimizations or improves instruction level parallelism of a thread.
Details
A user may specify that a loop in the source program be unrolled. This is done via a pragma. The syntax of this pragma is as follows:
#pragma unroll [unroll-factor]
The pragma unroll may optionally specify an unroll factor. The pragma must be placed immediately before the loop and only applies to that loop.
If an unroll factor is not specified, the compiler will try to fully unroll the loop. If an unroll factor is specified, the compiler will perform partial loop unrolling. The unroll factor, if specified, must be a compile-time non-negative integer constant.
A loop unroll factor of 1 means that the compiler should not unroll the loop.
A complete unroll specification has no effect if the trip count of the loop is not compile-time computable.
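To make partial unrolling concrete, the sketch below hand-unrolls a summation loop by a factor of 4 in Java, with a remainder loop for trip counts that are not a multiple of 4. This only illustrates the kind of transformation an unroll factor of 4 asks for; the OpenCL compiler would apply it to kernel code, and may ignore the hint. The class UnrollSketch and its methods are hypothetical.

```java
final class UnrollSketch {
    // Reference loop: one element per iteration (unroll factor 1).
    static long sumRolled(int[] data) {
        long sum = 0;
        for (int i = 0; i < data.length; i++) {
            sum += data[i];
        }
        return sum;
    }

    // Same computation, manually unrolled by a factor of 4.
    static long sumUnrolledBy4(int[] data) {
        long sum = 0;
        int i = 0;
        int limit = data.length - (data.length % 4);
        // Main body: four elements per iteration.
        for (; i < limit; i += 4) {
            sum += data[i];
            sum += data[i + 1];
            sum += data[i + 2];
            sum += data[i + 3];
        }
        // Remainder loop: handles the last 0 to 3 elements.
        for (; i < data.length; i++) {
            sum += data[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
        System.out.println(sumRolled(data) == sumUnrolledBy4(data));
    }
}
```

Both methods compute the same result; the unrolled form trades code size for fewer loop-control branches per element.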
-
cl_qcom_ext_host_ptr
public final boolean cl_qcom_ext_host_ptr
When true, QCOMEXTHostPtr is supported.
-
-