java.lang.Object
com.ibm.cuda.CudaFunction
The CudaFunction class represents a kernel entry point found in a specific CudaModule loaded on a CUDA-capable device.
Field Summary
Modifier and Type    Field    Description
static final int    ATTRIBUTE_BINARY_VERSION    The binary architecture version for which the function was compiled.
static final int    ATTRIBUTE_CONST_SIZE_BYTES    The size in bytes of user-allocated constant memory required by this function.
static final int    ATTRIBUTE_LOCAL_SIZE_BYTES    The size in bytes of local memory used by each thread of this function.
static final int    ATTRIBUTE_MAX_THREADS_PER_BLOCK    The maximum number of threads per block, beyond which a launch of the function would fail.
static final int    ATTRIBUTE_NUM_REGS    The number of registers used by each thread of this function.
static final int    ATTRIBUTE_PTX_VERSION    The PTX virtual architecture version for which the function was compiled.
static final int    ATTRIBUTE_SHARED_SIZE_BYTES    The size in bytes of statically-allocated shared memory required by this function.
Method Summary
Modifier and Type    Method    Description
int    getAttribute(int attribute)    Returns the value of the specified attribute.
void    setCacheConfig(CudaDevice.CacheConfig config)    Configures the cache for this function.
void    Configures the shared memory of this function.
Field Details
ATTRIBUTE_BINARY_VERSION
public static final int ATTRIBUTE_BINARY_VERSION
The binary architecture version for which the function was compiled. This value is the major binary version * 10 + the minor binary version, so a binary version 1.3 function would return the value 13. Note that this will return a value of 10 for legacy cubins that do not have a properly-encoded binary architecture version.
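The "major * 10 + minor" encoding can be unpacked with integer division and remainder. The following is a minimal sketch of that arithmetic only; the class name VersionCode and its methods are hypothetical helpers and are not part of com.ibm.cuda.

```java
/** Hypothetical helper that decodes the "major * 10 + minor" version encoding. */
final class VersionCode {
    private VersionCode() {}

    /** Returns the major part of an encoded version, e.g. 1 for 13. */
    static int major(int encoded) {
        return encoded / 10;
    }

    /** Returns the minor part of an encoded version, e.g. 3 for 13. */
    static int minor(int encoded) {
        return encoded % 10;
    }

    /** Formats an encoded version as "major.minor". */
    static String format(int encoded) {
        return major(encoded) + "." + minor(encoded);
    }

    public static void main(String[] args) {
        // 13 is the example from the field description (version 1.3).
        System.out.println(format(13)); // prints "1.3"
    }
}
```

In real code the encoded value would come from getAttribute(ATTRIBUTE_BINARY_VERSION) rather than a literal.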
ATTRIBUTE_CONST_SIZE_BYTES
public static final int ATTRIBUTE_CONST_SIZE_BYTES
The size in bytes of user-allocated constant memory required by this function.
ATTRIBUTE_LOCAL_SIZE_BYTES
public static final int ATTRIBUTE_LOCAL_SIZE_BYTES
The size in bytes of local memory used by each thread of this function.
ATTRIBUTE_MAX_THREADS_PER_BLOCK
public static final int ATTRIBUTE_MAX_THREADS_PER_BLOCK
The maximum number of threads per block, beyond which a launch of the function would fail. This number depends on both the function and the device on which the function is currently loaded.
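Because a launch with more threads per block than this limit would fail, a caller may want to clamp its preferred block size and derive the block count from the result. The sketch below shows one way to do that arithmetic; LaunchGeometry, its methods, and the limit of 512 used in main are illustrative assumptions, and the real limit would be read with getAttribute(ATTRIBUTE_MAX_THREADS_PER_BLOCK).

```java
/** Hypothetical helper for choosing launch dimensions within a per-function limit. */
final class LaunchGeometry {
    private LaunchGeometry() {}

    /** Clamps a requested block size to the limit reported for the function. */
    static int clampBlockSize(int requested, int maxThreadsPerBlock) {
        if (requested <= 0 || maxThreadsPerBlock <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        return Math.min(requested, maxThreadsPerBlock);
    }

    /** Returns the number of blocks needed to cover elementCount (ceiling division). */
    static int blocksFor(int elementCount, int blockSize) {
        if (elementCount < 0 || blockSize <= 0) {
            throw new IllegalArgumentException("invalid element count or block size");
        }
        // Widen to long so the addition cannot overflow for large element counts.
        return (int) (((long) elementCount + blockSize - 1) / blockSize);
    }

    public static void main(String[] args) {
        int maxThreadsPerBlock = 512; // assumed value, for illustration only
        int blockSize = clampBlockSize(1024, maxThreadsPerBlock);
        int blocks = blocksFor(1_000_000, blockSize);
        System.out.println("blockSize=" + blockSize + ", blocks=" + blocks);
    }
}
```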
ATTRIBUTE_NUM_REGS
public static final int ATTRIBUTE_NUM_REGS
The number of registers used by each thread of this function.
ATTRIBUTE_PTX_VERSION
public static final int ATTRIBUTE_PTX_VERSION
The PTX virtual architecture version for which the function was compiled. This value is the major PTX version * 10 + the minor PTX version, so a PTX version 1.3 function would return the value 13. Note that this may return the undefined value of 0 for cubins compiled prior to CUDA 3.0.
ATTRIBUTE_SHARED_SIZE_BYTES
public static final int ATTRIBUTE_SHARED_SIZE_BYTES
The size in bytes of statically-allocated shared memory required by this function. This does not include dynamically-allocated shared memory requested by the user at runtime.
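Since the attribute covers only the static portion, the shared memory a launch actually needs per block is the static size plus whatever is requested dynamically at launch time. The sketch below adds the two and compares the sum with a per-block limit; SharedMemoryBudget, its methods, and the 48 KiB limit in main are assumptions made for illustration, not values taken from this API.

```java
/** Hypothetical helper that combines static and dynamic shared memory sizes. */
final class SharedMemoryBudget {
    private SharedMemoryBudget() {}

    /** Returns static plus dynamic shared memory in bytes, widened to long. */
    static long totalBytes(int staticBytes, int dynamicBytes) {
        if (staticBytes < 0 || dynamicBytes < 0) {
            throw new IllegalArgumentException("sizes must be non-negative");
        }
        return (long) staticBytes + dynamicBytes;
    }

    /** Returns true if the combined size does not exceed the given per-block limit. */
    static boolean fits(int staticBytes, int dynamicBytes, long perBlockLimitBytes) {
        return totalBytes(staticBytes, dynamicBytes) <= perBlockLimitBytes;
    }

    public static void main(String[] args) {
        int staticBytes = 1024;            // would come from ATTRIBUTE_SHARED_SIZE_BYTES
        int dynamicBytes = 2048;           // chosen by the caller at launch time
        long assumedLimit = 48L * 1024;    // assumed per-block limit, for illustration only
        System.out.println(fits(staticBytes, dynamicBytes, assumedLimit));
    }
}
```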
Method Details
getAttribute
public int getAttribute(int attribute) throws CudaException
Returns the value of the specified attribute.
Parameters:
attribute - the attribute to be queried (see ATTRIBUTE_XXX)
Returns:
the attribute value
Throws:
CudaException - if a CUDA exception occurs
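A common use of an attribute query like this is to read several attributes in one pass and keep them under readable names. The sketch below does so against a stand-in interface so that it runs without a CUDA device; AttributeReport, AttributeSource, and the numeric attribute ids in main are hypothetical, and in real code a CudaFunction instance and its ATTRIBUTE_XXX constants would take their place.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper that collects several attribute values by name. */
final class AttributeReport {
    private AttributeReport() {}

    /** Stand-in for an object exposing int getAttribute(int); not part of com.ibm.cuda. */
    interface AttributeSource {
        int getAttribute(int attribute) throws Exception;
    }

    /**
     * Queries every attribute id in attributeIds and returns the values keyed by
     * the same names, in iteration order. Failures are rethrown unchecked.
     */
    static Map<String, Integer> collect(AttributeSource source, Map<String, Integer> attributeIds) {
        Map<String, Integer> values = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> entry : attributeIds.entrySet()) {
            try {
                values.put(entry.getKey(), source.getAttribute(entry.getValue()));
            } catch (Exception e) {
                throw new IllegalStateException("query failed for " + entry.getKey(), e);
            }
        }
        return values;
    }

    public static void main(String[] args) {
        // Placeholder ids; real code would use the ATTRIBUTE_XXX constants.
        Map<String, Integer> ids = new LinkedHashMap<>();
        ids.put("numRegs", 1);
        ids.put("sharedSizeBytes", 2);
        // Fake source that returns id * 100 instead of talking to a device.
        System.out.println(collect(id -> id * 100, ids));
    }
}
```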
setCacheConfig
public void setCacheConfig(CudaDevice.CacheConfig config) throws CudaException
Configures the cache for this function.
Parameters:
config - the desired cache configuration
Throws:
CudaException - if a CUDA exception occurs