Module openj9.cuda
Package com.ibm.cuda

Class CudaJitOptions

  • All Implemented Interfaces:
    Cloneable

    public final class CudaJitOptions
    extends Object
    implements Cloneable
    The CudaJitOptions class represents a set of options that influence the behavior of linking and loading modules.
    • Constructor Detail

      • CudaJitOptions

        public CudaJitOptions()
        Creates a new options object.
    • Method Detail

      • clone

        protected CudaJitOptions clone()
        Creates a new options object with the same state as this object.
        Overrides:
        clone in class Object
        Returns:
        a shallow copy of this object
      • getErrorLogBuffer

        public String getErrorLogBuffer()
        Returns the contents of the error log.

        The result will be empty unless setErrorLogBufferSize(int) was called with a positive value, this object was used in connection with a CudaModule or a CudaLinker, and errors were reported.

        Returns:
        the contents of the error log
      • getInfoLogBuffer

        public String getInfoLogBuffer()
        Returns the contents of the information log.

        The result will be empty unless setInfoLogBufferSize(int) was called with a positive value, this object was used in connection with a CudaModule or a CudaLinker, and informational messages were reported.

        Returns:
        the contents of the information log
      • getThreadsPerBlock

        public int getThreadsPerBlock()
        Returns the maximum number of threads per block.

        The result will only be meaningful if setThreadsPerBlock(int) was called with a positive value, and this object was used in connection with a CudaModule or a CudaLinker involving PTX code.

        Returns:
        the maximum number of threads per block
      • getWallTime

        public float getWallTime()
        Returns the total elapsed time, in milliseconds, spent in the compiler and linker.

        Applies to compiler and linker.

        Returns:
        the total elapsed time, in milliseconds, spent in the compiler and linker
      • recordWallTime

        public CudaJitOptions recordWallTime()
        Requests recording of the total wall clock time, in milliseconds, spent in the compiler and linker.

        Applies to compiler and linker.

        Returns:
        this options object
      • setCacheMode

        public CudaJitOptions setCacheMode(CudaJitOptions.CacheMode mode)
        Specifies the desired caching behavior (-dlcm).

        Applies to compiler only.

        Parameters:
        mode - the desired caching behavior
        Returns:
        this options object
      • setErrorLogBufferSize

        public CudaJitOptions setErrorLogBufferSize(int size)
        Specifies the size, in bytes, to allocate for capturing error messages.

        Applies to compiler and linker.

        Parameters:
        size - the size, in bytes, of the error log buffer
        Returns:
        this options object
      • setGenerateDebugInfo

        public CudaJitOptions setGenerateDebugInfo(boolean enabled)
        Specifies whether to generate debug information.

        Applies to compiler and linker.

        Parameters:
        enabled - whether debug information should be generated
        Returns:
        this options object
      • setGenerateLineInfo

        public CudaJitOptions setGenerateLineInfo(boolean enabled)
        Specifies whether to generate line number information.

        Applies to compiler only.

        Parameters:
        enabled - whether line number information should be generated
        Returns:
        this options object
      • setInfoLogBufferSize

        public CudaJitOptions setInfoLogBufferSize(int size)
        Specifies the size, in bytes, to allocate for capturing informational messages.

        Applies to compiler and linker.

        Parameters:
        size - the size, in bytes, of the information log buffer
        Returns:
        this options object
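
        Because each setter returns this options object, calls can be chained. The sketch below illustrates that pattern with a hypothetical stand-in class, OptionsSketch, which only mirrors the two documented buffer-size setter signatures so that it can run without a CUDA runtime. It is an assumption for illustration, not the real com.ibm.cuda.CudaJitOptions, and its accessor and factory methods do not exist on the real class.

```java
/**
 * Hypothetical stand-in that mirrors two documented setter signatures.
 * It is NOT com.ibm.cuda.CudaJitOptions and never contacts a compiler or linker.
 */
final class OptionsSketch {
    private int errorLogBufferSize;
    private int infoLogBufferSize;

    OptionsSketch setErrorLogBufferSize(int size) {
        this.errorLogBufferSize = size;
        return this; // returning this is what makes chaining possible
    }

    OptionsSketch setInfoLogBufferSize(int size) {
        this.infoLogBufferSize = size;
        return this;
    }

    // Accessors exist only so the sketch can be inspected; they are assumptions.
    int errorLogBufferSize() {
        return errorLogBufferSize;
    }

    int infoLogBufferSize() {
        return infoLogBufferSize;
    }

    /** Builds a sketch whose error and information log buffers have the same size. */
    static OptionsSketch withLogBuffers(int bytes) {
        return new OptionsSketch()
                .setErrorLogBufferSize(bytes)
                .setInfoLogBufferSize(bytes);
    }
}
```

        With the real class, the equivalent chain would presumably read new CudaJitOptions().setErrorLogBufferSize(4096).setInfoLogBufferSize(4096), after which getErrorLogBuffer() and getInfoLogBuffer() can be consulted once the options have been used with a CudaModule or a CudaLinker.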
      • setJitFallbackStrategy

        public CudaJitOptions setJitFallbackStrategy(CudaJitOptions.Fallback strategy)
        Specifies the fallback strategy if an exactly matching binary object cannot be found.

        Applies to compiler only.

        Parameters:
        strategy - the desired fallback strategy
        Returns:
        this options object
      • setLogVerbose

        public CudaJitOptions setLogVerbose(boolean verbose)
        Specifies whether to generate verbose log messages.

        Applies to compiler and linker.

        Parameters:
        verbose - whether verbose log messages should be generated
        Returns:
        this options object
      • setMaxRegisters

        public CudaJitOptions setMaxRegisters(int limit)
        Specifies the maximum number of registers that a thread may use.

        Applies to compiler only.

        Parameters:
        limit - the maximum number of registers a thread may use
        Returns:
        this options object
      • setOptimizationLevel

        public CudaJitOptions setOptimizationLevel(int level)
        Specifies the level of optimization to apply to generated code, from 0 to 4. Level 4 is both the default and the highest level of optimization.

        Applies to compiler only.

        Parameters:
        level - the desired optimization level
        Returns:
        this options object
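
        Since the documented range is 0 to 4, a caller may want to validate a level before passing it on. The following is a minimal caller-side sketch. OptLevels and checkedLevel are hypothetical names that are not part of com.ibm.cuda, and how the library itself treats out-of-range values is not stated here.

```java
/** Hypothetical caller-side helper; not part of com.ibm.cuda. */
final class OptLevels {
    static final int MIN_LEVEL = 0;
    static final int MAX_LEVEL = 4; // documented default and highest level

    private OptLevels() {
    }

    /** Returns the level unchanged if it lies in 0..4, otherwise throws. */
    static int checkedLevel(int level) {
        if (level < MIN_LEVEL || level > MAX_LEVEL) {
            throw new IllegalArgumentException(
                    "optimization level must be in [0, 4], got " + level);
        }
        return level;
    }
}
```

        A call such as options.setOptimizationLevel(OptLevels.checkedLevel(3)) would then reject an invalid level before it reaches the compiler.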
      • setTarget

        public CudaJitOptions setTarget(CudaJitTarget target)
        Specifies the desired compute target.

        Cannot be combined with setThreadsPerBlock(int).

        Applies to compiler and linker.

        Parameters:
        target - the desired compute target
        Returns:
        this options object
      • setTargetFromCuContext

        public CudaJitOptions setTargetFromCuContext()
        Specifies that the target should be determined based on the currently attached context.

        Applies to compiler and linker.

        Returns:
        this options object
      • setThreadsPerBlock

        public CudaJitOptions setThreadsPerBlock(int limit)
        Specifies the minimum number of threads per block for compilation.

        This restricts the compiler's resource utilization (e.g. maximum registers) so that a block with the given number of threads should be able to launch based on register limitations. Note that this option does not currently take into account any other resource limitations, such as shared memory utilization.

        Cannot be combined with setTarget(CudaJitTarget).

        Applies to compiler only.

        Parameters:
        limit - the desired minimum number of threads per block
        Returns:
        this options object
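
        setTarget(CudaJitTarget) and setThreadsPerBlock(int) cannot be combined, but this page does not say what happens if both are called. One way for calling code to avoid the conflict is a small guard that tracks which choice was made. The sketch below is an assumption for illustration: TargetSelectionGuard and its methods are hypothetical and are not part of com.ibm.cuda.

```java
/**
 * Hypothetical caller-side guard; not part of com.ibm.cuda. It only records
 * which of the two mutually exclusive choices has been made.
 */
final class TargetSelectionGuard {
    private boolean targetChosen;
    private boolean threadsPerBlockChosen;

    /** Call before invoking setTarget(CudaJitTarget) on the real options object. */
    TargetSelectionGuard chooseTarget() {
        if (threadsPerBlockChosen) {
            throw new IllegalStateException(
                    "setTarget cannot be combined with setThreadsPerBlock");
        }
        targetChosen = true;
        return this;
    }

    /** Call before invoking setThreadsPerBlock(int) on the real options object. */
    TargetSelectionGuard chooseThreadsPerBlock(int limit) {
        // getThreadsPerBlock() is documented as meaningful only for positive values.
        if (limit <= 0) {
            throw new IllegalArgumentException("limit must be positive, got " + limit);
        }
        if (targetChosen) {
            throw new IllegalStateException(
                    "setThreadsPerBlock cannot be combined with setTarget");
        }
        threadsPerBlockChosen = true;
        return this;
    }
}
```

        Calling chooseTarget() and then chooseThreadsPerBlock(256) on the same guard throws IllegalStateException, so the conflicting configuration is caught before the real options object is used.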