QUDA  1.0.0
quda::Tunable Class Reference (abstract)

#include <tune_quda.h>


Public Member Functions

 Tunable ()
 
virtual ~Tunable ()
 
virtual TuneKey tuneKey () const =0
 
virtual void apply (const cudaStream_t &stream)=0
 
virtual void preTune ()
 
virtual void postTune ()
 
virtual int tuningIter () const
 
virtual std::string paramString (const TuneParam &param) const
 
virtual std::string perfString (float time) const
 
virtual void initTuneParam (TuneParam &param) const
 
virtual void defaultTuneParam (TuneParam &param) const
 
virtual bool advanceTuneParam (TuneParam &param) const
 
void checkLaunchParam (TuneParam &param)
 
CUresult jitifyError () const
 
CUresult & jitifyError ()
 

Protected Member Functions

virtual long long flops () const =0
 
virtual long long bytes () const
 
virtual unsigned int sharedBytesPerThread () const =0
 
virtual unsigned int sharedBytesPerBlock (const TuneParam &param) const =0
 
virtual unsigned int minThreads () const
 
virtual bool tuneGridDim () const
 
virtual bool tuneAuxDim () const
 
virtual bool tuneSharedBytes () const
 
virtual bool advanceGridDim (TuneParam &param) const
 
virtual unsigned int maxBlockSize (const TuneParam &param) const
 
virtual unsigned int maxGridSize () const
 
virtual unsigned int minGridSize () const
 
virtual int gridStep () const
 gridStep sets the step size when iterating the grid size in advanceGridDim.
 
virtual int blockStep () const
 
virtual int blockMin () const
 
virtual void resetBlockDim (TuneParam &param) const
 
virtual bool advanceBlockDim (TuneParam &param) const
 
unsigned int maxBlocksPerSM () const
 This cannot be queried from the device properties, so it is set here, based on Table 14 of the CUDA Programming Guide 10.0 (Technical Specifications per Compute Capability).
 
template<typename F >
void setMaxDynamicSharedBytesPerBlock (F *func) const
 Enable the maximum dynamic shared bytes for the kernel "func" (values given by maxDynamicSharedBytesPerBlock()).
 
unsigned int maxDynamicSharedBytesPerBlock () const
 This cannot be correctly queried in CUDA for all architectures, so it is set here, based on Table 14 of the CUDA Programming Guide 10.0 (Technical Specifications per Compute Capability).
 
virtual unsigned int maxSharedBytesPerBlock () const
 The maximum shared memory that a CUDA thread block can use in the autotuner. This isn't necessarily the same as maxDynamicSharedBytesPerBlock(), since that limit may need an explicit opt-in to enable (by calling setMaxDynamicSharedBytesPerBlock() for the kernel in question). If the CUDA kernel in question does opt in, this function can be overridden to return maxDynamicSharedBytesPerBlock().
 
virtual bool advanceSharedBytes (TuneParam &param) const
 
virtual bool advanceAux (TuneParam &param) const
 
int writeAuxString (const char *format,...)
 

Protected Attributes

char aux [TuneKey::aux_n]
 
CUresult jitify_error
 

Detailed Description

Definition at line 59 of file tune_quda.h.

Constructor & Destructor Documentation

◆ Tunable()

quda::Tunable::Tunable ( )
inline

Definition at line 279 of file tune_quda.h.

◆ ~Tunable()

virtual quda::Tunable::~Tunable ( )
inline virtual

Definition at line 280 of file tune_quda.h.

References quda::stream.

Member Function Documentation

◆ advanceAux()

virtual bool quda::Tunable::advanceAux ( TuneParam & param ) const
inline protected virtual

◆ advanceBlockDim()

virtual bool quda::Tunable::advanceBlockDim ( TuneParam & param ) const
inline protected virtual

◆ advanceGridDim()

virtual bool quda::Tunable::advanceGridDim ( TuneParam & param ) const
inline protected virtual

◆ advanceSharedBytes()

virtual bool quda::Tunable::advanceSharedBytes ( TuneParam & param ) const
inline protected virtual

◆ advanceTuneParam()

virtual bool quda::Tunable::advanceTuneParam ( TuneParam & param ) const
inline virtual
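
The page lists initTuneParam(), advanceTuneParam() and apply() without describing how they interact. The following is a minimal sketch of one plausible reading: an exhaustive search that starts from an initial candidate, evaluates it, and advances until no candidates remain. Everything here is a hypothetical stand-in (ToyParam, ToyTunable, timeLaunch and tune are not QUDA names), and the cost model replaces the timing of a real kernel launch.

```cpp
#include <cassert>
#include <limits>

// Hypothetical stand-in for quda::TuneParam: only a 1-d block size is modelled.
struct ToyParam {
  unsigned int block = 0;
};

// Hypothetical stand-in for a Tunable-like object (not the QUDA class).
struct ToyTunable {
  unsigned int block_min = 32, block_step = 32, block_max = 256;

  // Start the search at the smallest candidate.
  void initTuneParam(ToyParam &p) const { p.block = block_min; }

  // Move to the next candidate; reset and return false once exhausted.
  bool advanceTuneParam(ToyParam &p) const {
    p.block += block_step;
    if (p.block > block_max) {
      p.block = block_min;
      return false;
    }
    return true;
  }

  // Pretend cost model in place of timing a real launch: fastest at block == 128.
  float timeLaunch(const ToyParam &p) const {
    float d = static_cast<float>(p.block) - 128.0f;
    return 1.0f + d * d;
  }
};

// Exhaustive search: evaluate every candidate and keep the fastest one.
inline ToyParam tune(const ToyTunable &t) {
  ToyParam p, best;
  float best_time = std::numeric_limits<float>::max();
  t.initTuneParam(p);
  do {
    float time = t.timeLaunch(p);
    if (time < best_time) {
      best_time = time;
      best = p;
    }
  } while (t.advanceTuneParam(p));
  return best;
}
```

Calling tune() on a default ToyTunable visits block sizes 32, 64, ..., 256 and returns the candidate with the lowest modelled time.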

◆ apply()

virtual void quda::Tunable::apply ( const cudaStream_t &  stream)
pure virtual

Implemented in quda::dslash::DslashPolicyTune< Dslash >, quda::blas::TileSizeTune< ReducerDiagonal, writeDiagonal, ReducerOffDiagonal, writeOffDiagonal >, quda::TwistClover< Float, nSpin, nColor, Arg >, quda::DslashCoarsePolicyTune, quda::Clover< Float, nSpin, nColor, Arg >, quda::ProjectSU3< Float, G >, quda::TwistGamma< Float, nColor, Arg >, quda::KSLongLinkForce< Float, Result, Oprod, Gauge >, quda::Gamma< ValueType, basis, dir >, quda::CopySpinorEx< FloatOut, FloatIn, Ns, Nc, OutOrder, InOrder, Basis, extend >, quda::blas::ReduceCuda< doubleN, ReduceType, FloatN, M, SpinorX, SpinorY, SpinorZ, SpinorW, SpinorV, Reducer >, quda::CopyColorSpinor< 4, Arg >, quda::ExtractGhostEx< Float, length, nDim, dim, Order >, quda::ExtractGhost< nDim, Arg >, quda::Pack< Float, nColor, spin_project >, quda::CalculateY< from_coarse, Float, fineSpin, fineColor, coarseSpin, coarseColor, Arg >, quda::blas::MultiReduceCuda< NXZ, doubleN, ReduceType, FloatN, M, SpinorX, SpinorY, SpinorZ, SpinorW, Reducer >, quda::CopyColorSpinor< Ns, Arg >, quda::GaugeOvrImpSTOUT< Float, Arg >, quda::WuppertalSmearing< Float, Ns, Nc, Arg >, quda::ShiftColorSpinorField< Output, Input >, quda::CopyGaugeEx< FloatOut, FloatIn, length, OutOrder, InOrder >, quda::Dslash5< Float, nColor, Arg >, quda::KSForceComplete< Float, Oprod, Gauge, Mom >, quda::GaugeGauss< Float, Arg >, quda::blas::MultiBlas< NXZ, FloatN, M, SpinorX, SpinorY, SpinorZ, SpinorW, Functor, T >, quda::SpinorNoise< real, Ns, Nc, type, Arg >, quda::blas::BlasCuda< FloatN, M, SpinorX, SpinorY, SpinorZ, SpinorW, SpinorV, Functor >, quda::GenericPackGhostLauncher< Float, block_float, Ns, Ms, Nc, Mc, Arg >, quda::CopyGauge< FloatOut, FloatIn, length, Arg >, quda::blas::copy_ns::CopyCuda< FloatN, N, Output, Input >, quda::QudaMemCopy, quda::CopySpinor< FloatOut, FloatIn, Ns, Nc, OutOrder, InOrder >, quda::NdegTwistedMassPreconditioned< Float, nDim, nColor, Arg >, quda::Laplace< Float, nDim, nColor, Arg >, quda::TwistedMassPreconditioned< Float, 
nDim, nColor, Arg >, quda::Wilson< Float, nDim, nColor, Arg >, quda::DomainWall4D< Float, nDim, nColor, Arg >, quda::TwistedCloverPreconditioned< Float, nDim, nColor, Arg >, quda::WilsonCloverPreconditioned< Float, nDim, nColor, Arg >, quda::NdegTwistedMass< Float, nDim, nColor, Arg >, quda::TwistedClover< Float, nDim, nColor, Arg >, quda::WilsonClover< Float, nDim, nColor, Arg >, quda::DomainWall5D< Float, nDim, nColor, Arg >, quda::Staggered< Float, nDim, nColor, Arg >, quda::Staggered< Float, nDim, nColor, Arg >, quda::TwistedMass< Float, nDim, nColor, Arg >, and quda::GaugePlaq< Float, Gauge >.

Referenced by quda::dslash::DslashBasic< Dslash >::operator()(), quda::dslash::DslashFusedExterior< Dslash >::operator()(), quda::dslash::DslashGDR< Dslash >::operator()(), quda::dslash::DslashFusedGDR< Dslash >::operator()(), quda::dslash::DslashGDRRecv< Dslash >::operator()(), quda::dslash::DslashFusedGDRRecv< Dslash >::operator()(), quda::dslash::DslashZeroCopyPack< Dslash >::operator()(), quda::dslash::DslashFusedZeroCopyPack< Dslash >::operator()(), quda::dslash::DslashZeroCopyPackGDRRecv< Dslash >::operator()(), quda::dslash::DslashFusedZeroCopyPackGDRRecv< Dslash >::operator()(), quda::dslash::DslashZeroCopy< Dslash >::operator()(), quda::dslash::DslashFusedZeroCopy< Dslash >::operator()(), quda::dslash::DslashNC< Dslash >::operator()(), and quda::tuneLaunch().


◆ blockMin()

virtual int quda::Tunable::blockMin ( ) const
inline protected virtual

◆ blockStep()

virtual int quda::Tunable::blockStep ( ) const
inline protected virtual

◆ bytes()

virtual long long quda::Tunable::bytes ( ) const
inline protected virtual

Reimplemented in quda::dslash::DslashPolicyTune< Dslash >, quda::blas::TileSizeTune< ReducerDiagonal, writeDiagonal, ReducerOffDiagonal, writeOffDiagonal >, quda::DslashCoarsePolicyTune, quda::TwistClover< Float, nSpin, nColor, Arg >, quda::ProjectSU3< Float, G >, quda::Clover< Float, nSpin, nColor, Arg >, quda::TwistGamma< Float, nColor, Arg >, quda::Dslash< Float >, quda::KSLongLinkForce< Float, Result, Oprod, Gauge >, quda::Pack< Float, nColor, spin_project >, quda::blas::ReduceCuda< doubleN, ReduceType, FloatN, M, SpinorX, SpinorY, SpinorZ, SpinorW, SpinorV, Reducer >, quda::CopyColorSpinor< 4, Arg >, quda::blas::MultiReduceCuda< NXZ, doubleN, ReduceType, FloatN, M, SpinorX, SpinorY, SpinorZ, SpinorW, Reducer >, quda::CopySpinorEx< FloatOut, FloatIn, Ns, Nc, OutOrder, InOrder, Basis, extend >, quda::Gamma< ValueType, basis, dir >, quda::ExtractGhostEx< Float, length, nDim, dim, Order >, quda::blas::MultiBlas< NXZ, FloatN, M, SpinorX, SpinorY, SpinorZ, SpinorW, Functor, T >, quda::ExtractGhost< nDim, Arg >, quda::GaugeOvrImpSTOUT< Float, Arg >, quda::CopyColorSpinor< Ns, Arg >, quda::ShiftColorSpinorField< Output, Input >, quda::GenericPackGhostLauncher< Float, block_float, Ns, Ms, Nc, Mc, Arg >, quda::WuppertalSmearing< Float, Ns, Nc, Arg >, quda::CopyGaugeEx< FloatOut, FloatIn, length, OutOrder, InOrder >, quda::blas::BlasCuda< FloatN, M, SpinorX, SpinorY, SpinorZ, SpinorW, SpinorV, Functor >, quda::KSForceComplete< Float, Oprod, Gauge, Mom >, quda::GaugeGauss< Float, Arg >, quda::QudaMemCopy, quda::CopyGauge< FloatOut, FloatIn, length, Arg >, quda::SpinorNoise< real, Ns, Nc, type, Arg >, quda::Laplace< Float, nDim, nColor, Arg >, quda::Staggered< Float, nDim, nColor, Arg >, quda::blas::copy_ns::CopyCuda< FloatN, N, Output, Input >, quda::TwistedCloverPreconditioned< Float, nDim, nColor, Arg >, quda::WilsonCloverPreconditioned< Float, nDim, nColor, Arg >, quda::CalculateY< from_coarse, Float, fineSpin, fineColor, coarseSpin, coarseColor, Arg >, 
quda::CopySpinor< FloatOut, FloatIn, Ns, Nc, OutOrder, InOrder >, quda::DomainWall5D< Float, nDim, nColor, Arg >, quda::TwistedClover< Float, nDim, nColor, Arg >, quda::WilsonClover< Float, nDim, nColor, Arg >, quda::Dslash5< Float, nColor, Arg >, and quda::GaugePlaq< Float, Gauge >.

Definition at line 63 of file tune_quda.h.

References param.

◆ checkLaunchParam()

void quda::Tunable::checkLaunchParam ( TuneParam & param )
inline

Check the launch parameters of the kernel to ensure that they are valid for the current device.

Definition at line 344 of file tune_quda.h.

References quda::TuneParam::block, deviceProp, errorQuda, and quda::TuneParam::grid.

Referenced by quda::tuneLaunch().
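
As an illustration of what such a validity check can look like, here is a minimal sketch that compares block and grid dimensions against per-device limits. Dim3, LaunchParam, DeviceLimits and checkLaunch are hypothetical stand-ins for CUDA's dim3, quda::TuneParam, the relevant fields of cudaDeviceProp and the member function itself. Throwing an exception stands in for errorQuda, and the total-threads check is an extra that the reference page does not mention.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

struct Dim3 { unsigned int x = 1, y = 1, z = 1; };  // stand-in for CUDA dim3

struct LaunchParam { Dim3 block, grid; };            // stand-in for quda::TuneParam

// Stand-in for the limit fields of cudaDeviceProp.
struct DeviceLimits {
  unsigned int maxThreadsDim[3];
  unsigned int maxGridSize[3];
  unsigned int maxThreadsPerBlock;
};

// Throws if any launch dimension exceeds the corresponding device limit.
inline void checkLaunch(const LaunchParam &p, const DeviceLimits &lim) {
  const unsigned int block[3] = {p.block.x, p.block.y, p.block.z};
  const unsigned int grid[3] = {p.grid.x, p.grid.y, p.grid.z};
  for (int d = 0; d < 3; d++) {
    if (block[d] > lim.maxThreadsDim[d])
      throw std::runtime_error("block dimension " + std::to_string(d) + " exceeds device limit");
    if (grid[d] > lim.maxGridSize[d])
      throw std::runtime_error("grid dimension " + std::to_string(d) + " exceeds device limit");
  }
  // Extra check (an assumption, not taken from the page): total threads per block.
  unsigned long long threads = 1ull * block[0] * block[1] * block[2];
  if (threads > lim.maxThreadsPerBlock)
    throw std::runtime_error("threads per block exceed device limit");
}
```

Running a check like this before every launch turns an invalid configuration into an immediate, descriptive error rather than a failed kernel launch.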


◆ defaultTuneParam()

virtual void quda::Tunable::defaultTuneParam ( TuneParam & param ) const
inline virtual

Sets default values for when tuning is disabled.

Reimplemented in quda::dslash::DslashPolicyTune< Dslash >, quda::blas::TileSizeTune< ReducerDiagonal, writeDiagonal, ReducerOffDiagonal, writeOffDiagonal >, quda::DslashCoarsePolicyTune, quda::CalculateY< from_coarse, Float, fineSpin, fineColor, coarseSpin, coarseColor, Arg >, quda::TunableVectorYZ, quda::TunableVectorY, quda::TunableLocalParity, quda::Pack< Float, nColor, spin_project >, quda::blas::ReduceCuda< doubleN, ReduceType, FloatN, M, SpinorX, SpinorY, SpinorZ, SpinorW, SpinorV, Reducer >, quda::blas::MultiReduceCuda< NXZ, doubleN, ReduceType, FloatN, M, SpinorX, SpinorY, SpinorZ, SpinorW, Reducer >, quda::blas::MultiBlas< NXZ, FloatN, M, SpinorX, SpinorY, SpinorZ, SpinorW, Functor, T >, quda::Dslash5< Float, nColor, Arg >, quda::ShiftColorSpinorField< Output, Input >, quda::GenericPackGhostLauncher< Float, block_float, Ns, Ms, Nc, Mc, Arg >, quda::blas::BlasCuda< FloatN, M, SpinorX, SpinorY, SpinorZ, SpinorW, SpinorV, Functor >, quda::NdegTwistedMassPreconditioned< Float, nDim, nColor, Arg >, and quda::blas::copy_ns::CopyCuda< FloatN, N, Output, Input >.

Definition at line 329 of file tune_quda.h.

References quda::TuneParam::grid.

Referenced by quda::blas::copy_ns::CopyCuda< FloatN, N, Output, Input >::defaultTuneParam(), quda::blas::MultiReduceCuda< NXZ, doubleN, ReduceType, FloatN, M, SpinorX, SpinorY, SpinorZ, SpinorW, Reducer >::defaultTuneParam(), quda::blas::ReduceCuda< doubleN, ReduceType, FloatN, M, SpinorX, SpinorY, SpinorZ, SpinorW, SpinorV, Reducer >::defaultTuneParam(), quda::TunableLocalParity::defaultTuneParam(), quda::TunableVectorY::defaultTuneParam(), quda::DslashCoarsePolicyTune::defaultTuneParam(), quda::blas::TileSizeTune< ReducerDiagonal, writeDiagonal, ReducerOffDiagonal, writeOffDiagonal >::defaultTuneParam(), quda::dslash::DslashPolicyTune< Dslash >::defaultTuneParam(), and quda::tuneLaunch().


◆ flops()

virtual long long quda::Tunable::flops ( ) const
protected pure virtual

Implemented in quda::dslash::DslashPolicyTune< Dslash >, quda::blas::TileSizeTune< ReducerDiagonal, writeDiagonal, ReducerOffDiagonal, writeOffDiagonal >, quda::DslashCoarsePolicyTune, quda::TwistClover< Float, nSpin, nColor, Arg >, quda::ProjectSU3< Float, G >, quda::Clover< Float, nSpin, nColor, Arg >, quda::TwistGamma< Float, nColor, Arg >, quda::KSLongLinkForce< Float, Result, Oprod, Gauge >, quda::Pack< Float, nColor, spin_project >, quda::Dslash< Float >, quda::blas::ReduceCuda< doubleN, ReduceType, FloatN, M, SpinorX, SpinorY, SpinorZ, SpinorW, SpinorV, Reducer >, quda::CopyColorSpinor< 4, Arg >, quda::CopySpinorEx< FloatOut, FloatIn, Ns, Nc, OutOrder, InOrder, Basis, extend >, quda::Gamma< ValueType, basis, dir >, quda::blas::MultiReduceCuda< NXZ, doubleN, ReduceType, FloatN, M, SpinorX, SpinorY, SpinorZ, SpinorW, Reducer >, quda::ExtractGhostEx< Float, length, nDim, dim, Order >, quda::blas::MultiBlas< NXZ, FloatN, M, SpinorX, SpinorY, SpinorZ, SpinorW, Functor, T >, quda::ExtractGhost< nDim, Arg >, quda::GaugeOvrImpSTOUT< Float, Arg >, quda::CopyColorSpinor< Ns, Arg >, quda::ShiftColorSpinorField< Output, Input >, quda::GenericPackGhostLauncher< Float, block_float, Ns, Ms, Nc, Mc, Arg >, quda::WuppertalSmearing< Float, Ns, Nc, Arg >, quda::CopyGaugeEx< FloatOut, FloatIn, length, OutOrder, InOrder >, quda::blas::BlasCuda< FloatN, M, SpinorX, SpinorY, SpinorZ, SpinorW, SpinorV, Functor >, quda::KSForceComplete< Float, Oprod, Gauge, Mom >, quda::GaugeGauss< Float, Arg >, quda::QudaMemCopy, quda::CopyGauge< FloatOut, FloatIn, length, Arg >, quda::SpinorNoise< real, Ns, Nc, type, Arg >, quda::NdegTwistedMassPreconditioned< Float, nDim, nColor, Arg >, quda::blas::copy_ns::CopyCuda< FloatN, N, Output, Input >, quda::CopySpinor< FloatOut, FloatIn, Ns, Nc, OutOrder, InOrder >, quda::Staggered< Float, nDim, nColor, Arg >, quda::TwistedMassPreconditioned< Float, nDim, nColor, Arg >, quda::TwistedCloverPreconditioned< Float, nDim, nColor, Arg >, 
quda::WilsonCloverPreconditioned< Float, nDim, nColor, Arg >, quda::Laplace< Float, nDim, nColor, Arg >, quda::NdegTwistedMass< Float, nDim, nColor, Arg >, quda::TwistedClover< Float, nDim, nColor, Arg >, quda::WilsonClover< Float, nDim, nColor, Arg >, quda::TwistedMass< Float, nDim, nColor, Arg >, quda::DomainWall5D< Float, nDim, nColor, Arg >, quda::CalculateY< from_coarse, Float, fineSpin, fineColor, coarseSpin, coarseColor, Arg >, quda::GaugePlaq< Float, Gauge >, and quda::Dslash5< Float, nColor, Arg >.

◆ gridStep()

virtual int quda::Tunable::gridStep ( ) const
inline protected virtual

gridStep sets the step size when iterating the grid size in advanceGridDim.

Returns
Grid step size

Reimplemented in quda::Pack< Float, nColor, spin_project >.

Definition at line 103 of file tune_quda.h.

Referenced by quda::Pack< Float, nColor, spin_project >::gridStep().
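
To make the role of the step size concrete, here is a minimal sketch of how a grid-size iteration could combine gridStep() with minGridSize() and maxGridSize(). GridTuner is a hypothetical stand-in, and the reset-to-minimum behaviour on exhaustion is an assumption rather than something this page states. The sketch also ignores tuneGridDim(), which presumably gates whether the grid is tuned at all.

```cpp
#include <cassert>

// Hypothetical stand-in exposing the three grid-related hooks.
struct GridTuner {
  unsigned int min_grid, max_grid;
  int step;

  unsigned int minGridSize() const { return min_grid; }
  unsigned int maxGridSize() const { return max_grid; }
  int gridStep() const { return step; }

  // Advance grid_x by one step; when the maximum is exceeded,
  // reset to the minimum and report that the search is exhausted.
  bool advanceGridDim(unsigned int &grid_x) const {
    grid_x += gridStep();
    if (grid_x > maxGridSize()) {
      grid_x = minGridSize();
      return false;
    }
    return true;
  }
};
```

With min_grid = 1, max_grid = 10 and step = 4, starting from 1, the candidates visited are 1, 5 and 9 before the search wraps back to 1.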


◆ initTuneParam()

virtual void quda::Tunable::initTuneParam ( TuneParam & param ) const
inline virtual

Reimplemented in quda::dslash::DslashPolicyTune< Dslash >, quda::blas::TileSizeTune< ReducerDiagonal, writeDiagonal, ReducerOffDiagonal, writeOffDiagonal >, quda::DslashCoarsePolicyTune, quda::CalculateY< from_coarse, Float, fineSpin, fineColor, coarseSpin, coarseColor, Arg >, quda::TunableVectorYZ, quda::TunableVectorY, quda::TunableLocalParity, quda::Pack< Float, nColor, spin_project >, quda::blas::ReduceCuda< doubleN, ReduceType, FloatN, M, SpinorX, SpinorY, SpinorZ, SpinorW, SpinorV, Reducer >, quda::blas::MultiReduceCuda< NXZ, doubleN, ReduceType, FloatN, M, SpinorX, SpinorY, SpinorZ, SpinorW, Reducer >, quda::blas::MultiBlas< NXZ, FloatN, M, SpinorX, SpinorY, SpinorZ, SpinorW, Functor, T >, quda::Dslash5< Float, nColor, Arg >, quda::ShiftColorSpinorField< Output, Input >, quda::GenericPackGhostLauncher< Float, block_float, Ns, Ms, Nc, Mc, Arg >, quda::blas::BlasCuda< FloatN, M, SpinorX, SpinorY, SpinorZ, SpinorW, SpinorV, Functor >, quda::blas::copy_ns::CopyCuda< FloatN, N, Output, Input >, and quda::NdegTwistedMassPreconditioned< Float, nDim, nColor, Arg >.

Definition at line 304 of file tune_quda.h.

References quda::TuneParam::block, deviceProp, errorQuda, quda::TuneParam::grid, and quda::TuneParam::shared_bytes.

Referenced by quda::blas::BlasCuda< FloatN, M, SpinorX, SpinorY, SpinorZ, SpinorW, SpinorV, Functor >::defaultTuneParam(), quda::blas::copy_ns::CopyCuda< FloatN, N, Output, Input >::initTuneParam(), quda::blas::BlasCuda< FloatN, M, SpinorX, SpinorY, SpinorZ, SpinorW, SpinorV, Functor >::initTuneParam(), quda::blas::MultiReduceCuda< NXZ, doubleN, ReduceType, FloatN, M, SpinorX, SpinorY, SpinorZ, SpinorW, Reducer >::initTuneParam(), quda::blas::ReduceCuda< doubleN, ReduceType, FloatN, M, SpinorX, SpinorY, SpinorZ, SpinorW, SpinorV, Reducer >::initTuneParam(), quda::TunableLocalParity::initTuneParam(), quda::TunableVectorY::initTuneParam(), quda::DslashCoarsePolicyTune::initTuneParam(), quda::blas::TileSizeTune< ReducerDiagonal, writeDiagonal, ReducerOffDiagonal, writeOffDiagonal >::initTuneParam(), quda::dslash::DslashPolicyTune< Dslash >::initTuneParam(), and quda::tuneLaunch().


◆ jitifyError() [1/2]

CUresult quda::Tunable::jitifyError ( ) const
inline

Definition at line 375 of file tune_quda.h.

Referenced by quda::blas::multiReduceLaunch(), quda::blas::reduceLaunch(), and quda::tuneLaunch().


◆ jitifyError() [2/2]

CUresult& quda::Tunable::jitifyError ( )
inline

Definition at line 376 of file tune_quda.h.

◆ maxBlockSize()

virtual unsigned int quda::Tunable::maxBlockSize ( const TuneParam & param ) const
inline protected virtual

◆ maxBlocksPerSM()

unsigned int quda::Tunable::maxBlocksPerSM ( ) const
inline protected

This cannot be queried from the device properties, so it is set here, based on Table 14 of the CUDA Programming Guide 10.0 (Technical Specifications per Compute Capability).

Returns
The maximum number of simultaneously resident blocks per SM

Definition at line 153 of file tune_quda.h.

References deviceProp, and warningQuda.

◆ maxDynamicSharedBytesPerBlock()

unsigned int quda::Tunable::maxDynamicSharedBytesPerBlock ( ) const
inline protected

This cannot be correctly queried in CUDA for all architectures, so it is set here, based on Table 14 of the CUDA Programming Guide 10.0 (Technical Specifications per Compute Capability).

Returns
The maximum dynamic shared memory available to a CUDA thread block

Definition at line 198 of file tune_quda.h.

References deviceProp, and warningQuda.

Referenced by quda::Pack< Float, nColor, spin_project >::defaultTuneParam(), quda::Pack< Float, nColor, spin_project >::initTuneParam(), quda::Dslash5< Float, nColor, Arg >::maxSharedBytesPerBlock(), quda::Dslash< Float >::maxSharedBytesPerBlock(), and quda::Pack< Float, nColor, spin_project >::tuneSharedBytes().


◆ maxGridSize()

virtual unsigned int quda::Tunable::maxGridSize ( ) const
inline protected virtual

Reimplemented in quda::Pack< Float, nColor, spin_project >.

Definition at line 95 of file tune_quda.h.

References deviceProp.

Referenced by quda::Pack< Float, nColor, spin_project >::maxGridSize().


◆ maxSharedBytesPerBlock()

virtual unsigned int quda::Tunable::maxSharedBytesPerBlock ( ) const
inline protected virtual

The maximum shared memory that a CUDA thread block can use in the autotuner. This isn't necessarily the same as maxDynamicSharedBytesPerBlock(), since that limit may need an explicit opt-in to enable (by calling setMaxDynamicSharedBytesPerBlock() for the kernel in question). If the CUDA kernel in question does opt in, this function can be overridden to return maxDynamicSharedBytesPerBlock().

Returns
The maximum shared bytes limit per block that the autotuner will utilize.

Reimplemented in quda::Dslash< Float >, and quda::Dslash5< Float, nColor, Arg >.

Definition at line 229 of file tune_quda.h.

References deviceProp.

Referenced by quda::Dslash5< Float, nColor, Arg >::maxSharedBytesPerBlock(), and quda::Pack< Float, nColor, spin_project >::tuneSharedBytes().
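
The opt-in pattern described above can be sketched as follows. MockTunable and OptInKernel are hypothetical stand-ins, not QUDA classes, and the 48 KiB and 96 KiB values are illustrative placeholders rather than the limits QUDA computes for any particular architecture. The public sharedLimit() helper exists only so the protected hook can be observed from outside.

```cpp
#include <cassert>

// Hypothetical mock of the two shared-memory hooks on the base class.
class MockTunable {
public:
  virtual ~MockTunable() = default;

  // Helper for illustration only: exposes the limit the tuner would use.
  unsigned int sharedLimit() const { return maxSharedBytesPerBlock(); }

protected:
  // Placeholder for the architecture-dependent opt-in maximum.
  unsigned int maxDynamicSharedBytesPerBlock() const { return 96 * 1024; }

  // Placeholder default: the limit available without opting in.
  virtual unsigned int maxSharedBytesPerBlock() const { return 48 * 1024; }
};

// A kernel wrapper that opts in, so the tuner may explore the larger limit.
class OptInKernel : public MockTunable {
protected:
  unsigned int maxSharedBytesPerBlock() const override {
    return maxDynamicSharedBytesPerBlock();
  }
};
```

Keeping the default conservative means a kernel that never opts in is not tuned toward shared-memory sizes it cannot actually launch with.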


◆ minGridSize()

virtual unsigned int quda::Tunable::minGridSize ( ) const
inline protected virtual

Reimplemented in quda::Pack< Float, nColor, spin_project >.

Definition at line 96 of file tune_quda.h.

Referenced by quda::Pack< Float, nColor, spin_project >::minGridSize().


◆ minThreads()

virtual unsigned int quda::Tunable::minThreads ( ) const
inline protected virtual

◆ paramString()

virtual std::string quda::Tunable::paramString ( const TuneParam & param ) const
inline virtual

Definition at line 287 of file tune_quda.h.

References param.

Referenced by quda::tuneLaunch().


◆ perfString()

virtual std::string quda::Tunable::perfString ( float  time) const
inline virtual

Definition at line 294 of file tune_quda.h.

References quda::blas::bytes, quda::blas::flops, and quda::TuneParam::time.

Referenced by quda::tuneLaunch().
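
Since perfString() takes a time and references the flop and byte counts, it presumably turns them into throughput figures. Below is a minimal sketch of such a conversion, written as a free function with the counts passed in explicitly. The assumption that time is in seconds and the exact output format are illustrative choices, not taken from this page.

```cpp
#include <cassert>
#include <iomanip>
#include <sstream>
#include <string>

// Format throughput from operation counts and an elapsed time.
// Assumption: 'time' is in seconds and is strictly positive.
inline std::string perfString(long long flops, long long bytes, float time) {
  double gflops = flops / (1e9 * time);  // floating-point operations per ns
  double gbytes = bytes / (1e9 * time);  // bytes moved per ns
  std::ostringstream ss;
  ss << std::fixed << std::setprecision(2) << gflops << " Gflop/s, " << gbytes << " GB/s";
  return ss.str();
}
```

Reporting both numbers side by side helps show whether a kernel is limited by arithmetic or by memory bandwidth.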


◆ postTune()

virtual void quda::Tunable::postTune ( )
inline virtual

◆ preTune()

virtual void quda::Tunable::preTune ( )
inline virtual

◆ resetBlockDim()

virtual void quda::Tunable::resetBlockDim ( TuneParam & param ) const
inline protected virtual

Definition at line 108 of file tune_quda.h.

References quda::TuneParam::block, deviceProp, and errorQuda.

◆ setMaxDynamicSharedBytesPerBlock()

template<typename F >
void quda::Tunable::setMaxDynamicSharedBytesPerBlock ( F *  func) const
inline protected

Enable the maximum dynamic shared bytes for the kernel "func" (values given by maxDynamicSharedBytesPerBlock()).

Parameters
[in] func: Function pointer to the kernel for which we want to enable the maximum shared memory per block

Definition at line 181 of file tune_quda.h.

Referenced by quda::Dslash< Float >::launch(), quda::Dslash5< Float, nColor, Arg >::launch(), and quda::Pack< Float, nColor, spin_project >::launch().


◆ sharedBytesPerBlock()

virtual unsigned int quda::Tunable::sharedBytesPerBlock ( const TuneParam & param ) const
protected pure virtual

◆ sharedBytesPerThread()

virtual unsigned int quda::Tunable::sharedBytesPerThread ( ) const
protected pure virtual

◆ tuneAuxDim()

virtual bool quda::Tunable::tuneAuxDim ( ) const
inline protected virtual

◆ tuneGridDim()

virtual bool quda::Tunable::tuneGridDim ( ) const
inline protected virtual

◆ tuneKey()

virtual TuneKey quda::Tunable::tuneKey ( ) const
pure virtual

Implemented in quda::dslash::DslashPolicyTune< Dslash >, quda::blas::TileSizeTune< ReducerDiagonal, writeDiagonal, ReducerOffDiagonal, writeOffDiagonal >, quda::TwistClover< Float, nSpin, nColor, Arg >, quda::DslashCoarsePolicyTune, quda::CalculateY< from_coarse, Float, fineSpin, fineColor, coarseSpin, coarseColor, Arg >, quda::Clover< Float, nSpin, nColor, Arg >, quda::ProjectSU3< Float, G >, quda::TwistGamma< Float, nColor, Arg >, quda::KSLongLinkForce< Float, Result, Oprod, Gauge >, quda::Pack< Float, nColor, spin_project >, quda::Gamma< ValueType, basis, dir >, quda::CopyColorSpinor< 4, Arg >, quda::CopySpinorEx< FloatOut, FloatIn, Ns, Nc, OutOrder, InOrder, Basis, extend >, quda::blas::ReduceCuda< doubleN, ReduceType, FloatN, M, SpinorX, SpinorY, SpinorZ, SpinorW, SpinorV, Reducer >, quda::ExtractGhostEx< Float, length, nDim, dim, Order >, quda::ExtractGhost< nDim, Arg >, quda::Dslash5< Float, nColor, Arg >, quda::GaugeOvrImpSTOUT< Float, Arg >, quda::ShiftColorSpinorField< Output, Input >, quda::CopyColorSpinor< Ns, Arg >, quda::WuppertalSmearing< Float, Ns, Nc, Arg >, quda::blas::MultiReduceCuda< NXZ, doubleN, ReduceType, FloatN, M, SpinorX, SpinorY, SpinorZ, SpinorW, Reducer >, quda::Staggered< Float, nDim, nColor, Arg >, quda::GenericPackGhostLauncher< Float, block_float, Ns, Ms, Nc, Mc, Arg >, quda::Laplace< Float, nDim, nColor, Arg >, quda::CopyGaugeEx< FloatOut, FloatIn, length, OutOrder, InOrder >, quda::KSForceComplete< Float, Oprod, Gauge, Mom >, quda::GaugeGauss< Float, Arg >, quda::TwistedCloverPreconditioned< Float, nDim, nColor, Arg >, quda::WilsonCloverPreconditioned< Float, nDim, nColor, Arg >, quda::NdegTwistedMassPreconditioned< Float, nDim, nColor, Arg >, quda::QudaMemCopy, quda::SpinorNoise< real, Ns, Nc, type, Arg >, quda::CopyGauge< FloatOut, FloatIn, length, Arg >, quda::blas::MultiBlas< NXZ, FloatN, M, SpinorX, SpinorY, SpinorZ, SpinorW, Functor, T >, quda::TwistedClover< Float, nDim, nColor, Arg >, quda::WilsonClover< Float, nDim, 
nColor, Arg >, quda::DomainWall5D< Float, nDim, nColor, Arg >, quda::blas::BlasCuda< FloatN, M, SpinorX, SpinorY, SpinorZ, SpinorW, SpinorV, Functor >, quda::TwistedMassPreconditioned< Float, nDim, nColor, Arg >, quda::NdegTwistedMass< Float, nDim, nColor, Arg >, quda::CopySpinor< FloatOut, FloatIn, Ns, Nc, OutOrder, InOrder >, quda::TwistedMass< Float, nDim, nColor, Arg >, quda::blas::copy_ns::CopyCuda< FloatN, N, Output, Input >, quda::DomainWall4D< Float, nDim, nColor, Arg >, quda::Staggered< Float, nDim, nColor, Arg >, quda::Wilson< Float, nDim, nColor, Arg >, and quda::GaugePlaq< Float, Gauge >.

Referenced by quda::dslash::DslashPolicyTune< Dslash >::tuneKey(), and quda::tuneLaunch().


◆ tuneSharedBytes()

virtual bool quda::Tunable::tuneSharedBytes ( ) const
inline protected virtual

◆ tuningIter()

virtual int quda::Tunable::tuningIter ( ) const
inline virtual

◆ writeAuxString()

int quda::Tunable::writeAuxString ( const char *  format,
  ... 
)
inline protected

Definition at line 267 of file tune_quda.h.

References quda::TuneKey::aux_n, and errorQuda.
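
Given the printf-style signature and the references to TuneKey::aux_n and errorQuda, this function presumably formats into the fixed-size aux buffer and reports an error on overflow. A minimal sketch of that pattern follows. AuxHolder is a hypothetical stand-in, the buffer size of 256 is an assumed value for aux_n, and throwing an exception stands in for errorQuda.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>
#include <cstring>
#include <stdexcept>
#include <string>

constexpr int aux_n = 256;  // assumed size; stands in for TuneKey::aux_n

// Hypothetical holder for the auxiliary tuning string.
struct AuxHolder {
  char aux[aux_n];

  // Format into the fixed-size buffer; fail loudly rather than truncate silently.
  int writeAuxString(const char *format, ...) {
    va_list args;
    va_start(args, format);
    int n = std::vsnprintf(aux, aux_n, format, args);
    va_end(args);
    // vsnprintf returns the length that would have been written, so
    // n >= aux_n means the output did not fit (n < 0 is an encoding error).
    if (n < 0 || n >= aux_n) throw std::runtime_error("aux string error or truncation");
    return n;
  }
};
```

A call such as writeAuxString("threads=%d,prec=%d", 1024, 8) leaves "threads=1024,prec=8" in aux and returns its length.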

Member Data Documentation

◆ aux

char quda::Tunable::aux[TuneKey::aux_n]
protected

◆ jitify_error

CUresult quda::Tunable::jitify_error
protected

The documentation for this class was generated from the following file: