QUDA  0.9.0
quda::Tunable Class Reference (abstract)

#include <tune_quda.h>

Public Member Functions

 Tunable ()
 
virtual ~Tunable ()
 
virtual TuneKey tuneKey () const =0
 
virtual void apply (const cudaStream_t &stream)=0
 
virtual void preTune ()
 
virtual void postTune ()
 
virtual int tuningIter () const
 
virtual std::string paramString (const TuneParam &param) const
 
virtual std::string perfString (float time) const
 
virtual void initTuneParam (TuneParam &param) const
 
virtual void defaultTuneParam (TuneParam &param) const
 
virtual bool advanceTuneParam (TuneParam &param) const
 
void checkLaunchParam (TuneParam &param)
 

Protected Member Functions

virtual long long flops () const =0
 
virtual long long bytes () const
 
virtual unsigned int sharedBytesPerThread () const =0
 
virtual unsigned int sharedBytesPerBlock (const TuneParam &param) const =0
 
virtual unsigned int minThreads () const
 
virtual bool tuneGridDim () const
 
virtual bool tuneAuxDim () const
 
virtual bool tuneSharedBytes () const
 
virtual bool advanceGridDim (TuneParam &param) const
 
virtual unsigned int maxBlockSize () const
 
virtual unsigned int maxGridSize () const
 
virtual unsigned int minGridSize () const
 
virtual int blockStep () const
 
virtual int blockMin () const
 
virtual bool advanceBlockDim (TuneParam &param) const
 
unsigned int maxBlocksPerSM () const
 For some reason this can't be queried from the device properties, so we set it here, based on Table 14 of the CUDA Programming Guide 9.0 (Technical Specifications per Compute Capability).
 
virtual bool advanceSharedBytes (TuneParam &param) const
 
virtual bool advanceAux (TuneParam &param) const
 
int writeAuxString (const char *format,...)
 

Protected Attributes

char aux [TuneKey::aux_n]
 

Detailed Description

Definition at line 60 of file tune_quda.h.
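
As an orientation aid, the following is a minimal, self-contained sketch of how a driver in the style of quda::tuneLaunch() might exercise this interface: initialise a parameter set, time apply() for each candidate, and advance until the search space is exhausted. Every name in it (MiniParam, MiniTunable, Recorder, tune) is a hypothetical stand-in and not part of QUDA. The real class works on a TuneParam and a cudaStream_t, and its search also covers grid size, shared memory and the aux dimension.

```cpp
#include <cassert>
#include <chrono>
#include <vector>

// Hypothetical stand-in for quda::TuneParam: only a 1-d block size.
struct MiniParam {
  unsigned int block;
  MiniParam() : block(0) {}
};

// Hypothetical stand-in for quda::Tunable, reduced to the search hooks.
class MiniTunable {
public:
  virtual ~MiniTunable() {}
  virtual void apply(const MiniParam &param) = 0; // run the kernel once
  virtual void initTuneParam(MiniParam &param) const { param.block = blockMin(); }
  // Step to the next candidate; reset and return false once exhausted.
  virtual bool advanceTuneParam(MiniParam &param) const {
    param.block += blockStep();
    if (param.block > maxBlockSize()) {
      param.block = blockMin();
      return false;
    }
    return true;
  }
  virtual int tuningIter() const { return 1; }

protected:
  // Assumed limits for the sketch; the real ones come from the device.
  virtual unsigned int blockMin() const { return 32; }
  virtual unsigned int blockStep() const { return 32; }
  virtual unsigned int maxBlockSize() const { return 128; }
};

// Example subclass: "apply" only records which block sizes were tried.
struct Recorder : MiniTunable {
  std::vector<unsigned int> seen;
  void apply(const MiniParam &param) override { seen.push_back(param.block); }
};

// Time every candidate and return the fastest one.
inline MiniParam tune(MiniTunable &t) {
  assert(t.tuningIter() > 0);
  MiniParam param, best;
  t.initTuneParam(param);
  double best_time = -1.0;
  do {
    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < t.tuningIter(); i++) t.apply(param);
    std::chrono::duration<double> dt = std::chrono::steady_clock::now() - start;
    if (best_time < 0.0 || dt.count() < best_time) {
      best_time = dt.count();
      best = param;
    }
  } while (t.advanceTuneParam(param));
  return best;
}
```

With the assumed limits above, calling tune() on a Recorder visits block sizes 32, 64, 96 and 128 in that order. The "advance, and on overflow reset and return false" convention is what lets several dimensions be chained like an odometer.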

Constructor & Destructor Documentation

◆ Tunable()

quda::Tunable::Tunable ( )
inline

Definition at line 200 of file tune_quda.h.

◆ ~Tunable()

virtual quda::Tunable::~Tunable ( )
inline virtual

Definition at line 201 of file tune_quda.h.

Member Function Documentation

◆ advanceAux()

virtual bool quda::Tunable::advanceAux ( TuneParam & param) const
inline protected virtual

Reimplemented in quda::DslashCoarsePolicyTune, and quda::blas::TileSizeTune< ReducerDiagonal, writeDiagonal, ReducerOffDiagonal, writeOffDiagonal >.

Definition at line 187 of file tune_quda.h.

Referenced by advanceTuneParam().

◆ advanceBlockDim()

virtual bool quda::Tunable::advanceBlockDim ( TuneParam & param) const
inline protected virtual

◆ advanceGridDim()

virtual bool quda::Tunable::advanceGridDim ( TuneParam & param) const
inline protected virtual

Reimplemented in quda::ShiftColorSpinorField< Output, Input >.

Definition at line 78 of file tune_quda.h.

References maxGridSize(), minGridSize(), param, and tuneGridDim().

Referenced by advanceTuneParam().

◆ advanceSharedBytes()

virtual bool quda::Tunable::advanceSharedBytes ( TuneParam & param) const
inline protected virtual

The goal here is to throttle the number of thread blocks per SM by over-allocating shared memory (in order to improve L2 utilization, etc.). We thus request the smallest amount of dynamic shared memory that guarantees throttling to a given number of blocks, in order to allow some extra leeway.
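
The arithmetic behind this can be illustrated with a small standalone sketch. It is not the QUDA implementation: the function names are hypothetical, and it considers only the shared-memory limit, ignoring thread and register limits. If an SM offers M bytes and b blocks currently fit, the smallest request that no longer admits b blocks is floor(M / b) + 1 bytes.

```cpp
#include <algorithm>
#include <cassert>

// Number of blocks that fit on one SM when each block requests `bytes` of
// shared memory out of `max_shared` (every other resource limit ignored).
inline int blocksThatFit(int max_shared, int bytes) {
  assert(max_shared > 0 && bytes > 0);
  return max_shared / bytes;
}

// Given the current request, return the smallest request that admits one
// block fewer per SM. `max_blocks` caps the starting occupancy.
// Returns max_shared + 1 when no further throttling is possible.
inline int nextSharedBytes(int max_shared, int current_bytes, int max_blocks) {
  assert(max_shared > 0 && current_bytes >= 0 && max_blocks > 0);
  int blocks = std::min(max_shared / std::max(current_bytes, 1), max_blocks);
  return blocks > 0 ? max_shared / blocks + 1 : max_shared + 1;
}
```

Assuming 49152 bytes of shared memory and a cap of 16 resident blocks, the first step from a request of 0 bytes yields 3073 bytes, which admits 15 blocks where 3072 bytes would still admit 16. Repeating the step from 3073 yields 3277 bytes, which admits 14.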

Reimplemented in quda::CopySpinorEx< FloatOut, FloatIn, Ns, Nc, OutOrder, InOrder, Basis, extend >, quda::CopyColorSpinor< FloatOut, FloatIn, 4, Nc, Arg >, quda::CopyColorSpinor< FloatOut, FloatIn, Ns, Nc, Arg >, quda::blas::copy_ns::CopyCuda< FloatN, N, Output, Input >, and quda::CopySpinor< FloatOut, FloatIn, Ns, Nc, OutOrder, InOrder >.

Definition at line 163 of file tune_quda.h.

References advanceBlockDim(), quda::TuneParam::block, deviceProp, maxBlocksPerSM(), param, sharedBytesPerBlock(), sharedBytesPerThread(), and tuneSharedBytes().

Referenced by advanceTuneParam().

◆ advanceTuneParam()

virtual bool quda::Tunable::advanceTuneParam ( TuneParam & param) const
inline virtual

◆ apply()

virtual void quda::Tunable::apply ( const cudaStream_t &  stream)
pure virtual

Implemented in quda::CalculateYhat< Float, n, Arg >, quda::CalculateY< from_coarse, Float, fineSpin, fineColor, coarseSpin, coarseColor, Arg >, quda::DslashCoarsePolicyTune, quda::GaugeOvrImpSTOUT< Float, GaugeOr, GaugeDs >, quda::TwistClover< Float, nSpin, nColor, Arg >, quda::ProjectSU3< Float, G >, quda::Clover< Float, nSpin, nColor, Arg >, quda::blas::TileSizeTune< ReducerDiagonal, writeDiagonal, ReducerOffDiagonal, writeOffDiagonal >, quda::TwistGamma< Float, nColor, Arg >, quda::KSLongLinkForce< Float, Result, Oprod, Gauge >, quda::CopySpinorEx< FloatOut, FloatIn, Ns, Nc, OutOrder, InOrder, Basis, extend >, quda::Gamma< ValueType, basis, dir >, quda::CopyColorSpinor< FloatOut, FloatIn, 4, Nc, Arg >, quda::CopyGauge< FloatOut, FloatIn, length, OutOrder, InOrder, isGhost >, quda::ExtractGhostEx< Float, length, nDim, dim, Order >, quda::ExtractGhost< Float, length, nDim, Order >, quda::Laplace< Float, nDim, nColor, Arg >, quda::WuppertalSmearing< Float, Ns, Nc, Arg >, quda::CopyColorSpinor< FloatOut, FloatIn, Ns, Nc, Arg >, quda::ShiftColorSpinorField< Output, Input >, quda::KSForceComplete< Float, Oprod, Gauge, Mom >, quda::GenericPackGhostLauncher< Float, Ns, Ms, Nc, Mc, Arg >, quda::CopyGaugeEx< FloatOut, FloatIn, length, OutOrder, InOrder >, quda::GaussSpinor< FloatIn, Ns, Nc, InOrder >, quda::blas::copy_ns::CopyCuda< FloatN, N, Output, Input >, quda::QudaMemCopy, and quda::CopySpinor< FloatOut, FloatIn, Ns, Nc, OutOrder, InOrder >.

Referenced by quda::tuneLaunch().

◆ blockMin()

virtual int quda::Tunable::blockMin ( ) const
inline protected virtual

Definition at line 100 of file tune_quda.h.

References deviceProp.

Referenced by initTuneParam().

◆ blockStep()

virtual int quda::Tunable::blockStep ( ) const
inline protected virtual

Definition at line 99 of file tune_quda.h.

References deviceProp.

Referenced by advanceBlockDim().

◆ bytes()

virtual long long quda::Tunable::bytes ( ) const
inline protected virtual

Reimplemented in quda::CalculateYhat< Float, n, Arg >, quda::DslashCoarsePolicyTune, quda::CalculateY< from_coarse, Float, fineSpin, fineColor, coarseSpin, coarseColor, Arg >, quda::GaugeOvrImpSTOUT< Float, GaugeOr, GaugeDs >, quda::TwistClover< Float, nSpin, nColor, Arg >, quda::blas::TileSizeTune< ReducerDiagonal, writeDiagonal, ReducerOffDiagonal, writeOffDiagonal >, quda::ProjectSU3< Float, G >, quda::Clover< Float, nSpin, nColor, Arg >, quda::KSLongLinkForce< Float, Result, Oprod, Gauge >, quda::TwistGamma< Float, nColor, Arg >, quda::CopySpinorEx< FloatOut, FloatIn, Ns, Nc, OutOrder, InOrder, Basis, extend >, quda::CopyColorSpinor< FloatOut, FloatIn, 4, Nc, Arg >, quda::ExtractGhostEx< Float, length, nDim, dim, Order >, quda::CopyGauge< FloatOut, FloatIn, length, OutOrder, InOrder, isGhost >, quda::Gamma< ValueType, basis, dir >, quda::ExtractGhost< Float, length, nDim, Order >, quda::ShiftColorSpinorField< Output, Input >, quda::CopyColorSpinor< FloatOut, FloatIn, Ns, Nc, Arg >, quda::Laplace< Float, nDim, nColor, Arg >, quda::KSForceComplete< Float, Oprod, Gauge, Mom >, quda::WuppertalSmearing< Float, Ns, Nc, Arg >, quda::GenericPackGhostLauncher< Float, Ns, Ms, Nc, Mc, Arg >, quda::CopyGaugeEx< FloatOut, FloatIn, length, OutOrder, InOrder >, quda::QudaMemCopy, quda::blas::copy_ns::CopyCuda< FloatN, N, Output, Input >, quda::GaussSpinor< FloatIn, Ns, Nc, InOrder >, and quda::CopySpinor< FloatOut, FloatIn, Ns, Nc, OutOrder, InOrder >.

Definition at line 64 of file tune_quda.h.

Referenced by perfString().

◆ checkLaunchParam()

void quda::Tunable::checkLaunchParam ( TuneParam & param)
inline

Check the launch parameters of the kernel to ensure that they are valid for the current device.

Definition at line 269 of file tune_quda.h.

References deviceProp, errorQuda, and param.

Referenced by quda::tuneLaunch().
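
A check of this kind can be sketched as follows. This is an illustration only, not the QUDA code: Dim3, Limits, Launch and checkLaunch are hypothetical stand-ins, with Limits borrowing field names from cudaDeviceProp so the mapping to a real device query is clear. The sketch returns a message instead of aborting, whereas the real method appears to report failures through errorQuda.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-ins; Limits borrows field names from cudaDeviceProp.
struct Dim3 {
  unsigned int x, y, z;
  Dim3() : x(1), y(1), z(1) {}
};
struct Limits {
  int maxThreadsDim[3];
  int maxGridSize[3];
  int maxThreadsPerBlock;
  std::size_t sharedMemPerBlock;
};
struct Launch {
  Dim3 block, grid;
  std::size_t shared_bytes;
  Launch() : shared_bytes(0) {}
};

// Returns an empty string if the launch is valid for the given limits,
// otherwise a short description of the first violation found.
inline std::string checkLaunch(const Launch &p, const Limits &d) {
  assert(d.maxThreadsPerBlock > 0);
  const unsigned int block[3] = {p.block.x, p.block.y, p.block.z};
  const unsigned int grid[3] = {p.grid.x, p.grid.y, p.grid.z};
  for (int i = 0; i < 3; i++) {
    if (block[i] == 0 || block[i] > static_cast<unsigned int>(d.maxThreadsDim[i]))
      return "block dimension " + std::to_string(i) + " out of range";
    if (grid[i] == 0 || grid[i] > static_cast<unsigned int>(d.maxGridSize[i]))
      return "grid dimension " + std::to_string(i) + " out of range";
  }
  const unsigned long long threads = 1ULL * block[0] * block[1] * block[2];
  if (threads > static_cast<unsigned long long>(d.maxThreadsPerBlock))
    return "too many threads per block";
  if (p.shared_bytes > d.sharedMemPerBlock) return "too much shared memory";
  return std::string();
}
```

Checking each dimension separately before the thread total matters: a block can satisfy every per-dimension limit and still exceed the per-block thread limit.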

◆ defaultTuneParam()

virtual void quda::Tunable::defaultTuneParam ( TuneParam & param) const
inline virtual

◆ flops()

virtual long long quda::Tunable::flops ( ) const
protected pure virtual

Implemented in quda::CalculateYhat< Float, n, Arg >, quda::DslashCoarsePolicyTune, quda::CalculateY< from_coarse, Float, fineSpin, fineColor, coarseSpin, coarseColor, Arg >, quda::GaugeOvrImpSTOUT< Float, GaugeOr, GaugeDs >, quda::TwistClover< Float, nSpin, nColor, Arg >, quda::blas::TileSizeTune< ReducerDiagonal, writeDiagonal, ReducerOffDiagonal, writeOffDiagonal >, quda::ProjectSU3< Float, G >, quda::Clover< Float, nSpin, nColor, Arg >, quda::KSLongLinkForce< Float, Result, Oprod, Gauge >, quda::TwistGamma< Float, nColor, Arg >, quda::CopySpinorEx< FloatOut, FloatIn, Ns, Nc, OutOrder, InOrder, Basis, extend >, quda::CopyColorSpinor< FloatOut, FloatIn, 4, Nc, Arg >, quda::ExtractGhostEx< Float, length, nDim, dim, Order >, quda::CopyGauge< FloatOut, FloatIn, length, OutOrder, InOrder, isGhost >, quda::Gamma< ValueType, basis, dir >, quda::ExtractGhost< Float, length, nDim, Order >, quda::ShiftColorSpinorField< Output, Input >, quda::CopyColorSpinor< FloatOut, FloatIn, Ns, Nc, Arg >, quda::Laplace< Float, nDim, nColor, Arg >, quda::KSForceComplete< Float, Oprod, Gauge, Mom >, quda::WuppertalSmearing< Float, Ns, Nc, Arg >, quda::GenericPackGhostLauncher< Float, Ns, Ms, Nc, Mc, Arg >, quda::CopyGaugeEx< FloatOut, FloatIn, length, OutOrder, InOrder >, quda::QudaMemCopy, quda::blas::copy_ns::CopyCuda< FloatN, N, Output, Input >, quda::GaussSpinor< FloatIn, Ns, Nc, InOrder >, and quda::CopySpinor< FloatOut, FloatIn, Ns, Nc, OutOrder, InOrder >.

Referenced by perfString().

◆ initTuneParam()

virtual void quda::Tunable::initTuneParam ( TuneParam & param) const
inline virtual

◆ maxBlockSize()

virtual unsigned int quda::Tunable::maxBlockSize ( ) const
inline protected virtual

Reimplemented in quda::TunableLocalParity, quda::Laplace< Float, nDim, nColor, Arg >, and quda::WuppertalSmearing< Float, Ns, Nc, Arg >.

Definition at line 95 of file tune_quda.h.

References deviceProp.

Referenced by advanceBlockDim().

◆ maxBlocksPerSM()

unsigned int quda::Tunable::maxBlocksPerSM ( ) const
inline protected

For some reason this can't be queried from the device properties, so we set it here, based on Table 14 of the CUDA Programming Guide 9.0 (Technical Specifications per Compute Capability).

Returns
The maximum number of simultaneously resident blocks per SM
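
A lookup of this kind could look like the sketch below. The function name is hypothetical, and the values are quoted from memory of the referenced table (16 for compute capability 3.x, 32 for 5.x through 7.0), so they should be checked against the guide before being relied on. Returning 0 for an unknown architecture is a choice made for this sketch. The real method references errorQuda, which suggests it treats that case as an error.

```cpp
#include <cassert>

// Maximum resident blocks per SM keyed on the compute-capability major
// version. Values are an assumption recalled from the CUDA 9.0 Programming
// Guide table; verify before use. Returns 0 for an unrecognised major.
inline unsigned int maxBlocksPerSMFor(int major) {
  assert(major >= 0);
  switch (major) {
  case 3: return 16; // Kepler
  case 5:            // Maxwell
  case 6:            // Pascal
  case 7: return 32; // Volta (7.0)
  default: return 0;
  }
}
```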

Definition at line 140 of file tune_quda.h.

References deviceProp, and errorQuda.

Referenced by advanceSharedBytes().

◆ maxGridSize()

virtual unsigned int quda::Tunable::maxGridSize ( ) const
inline protected virtual

Definition at line 96 of file tune_quda.h.

References deviceProp.

Referenced by advanceGridDim().

◆ minGridSize()

virtual unsigned int quda::Tunable::minGridSize ( ) const
inline protected virtual

Definition at line 97 of file tune_quda.h.

Referenced by advanceGridDim(), and initTuneParam().

◆ minThreads()

virtual unsigned int quda::Tunable::minThreads ( ) const
inline protected virtual

◆ paramString()

virtual std::string quda::Tunable::paramString ( const TuneParam & param) const
inline virtual

Definition at line 208 of file tune_quda.h.

References param, tuneAuxDim(), and tuneGridDim().

Referenced by quda::tuneLaunch().

◆ perfString()

virtual std::string quda::Tunable::perfString ( float time) const
inline virtual

Definition at line 220 of file tune_quda.h.

References bytes(), flops(), and time().

Referenced by quda::tuneLaunch().
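
Since this method draws on flops() and bytes(), a plausible reading is that it turns the operation counts and the elapsed time into throughput figures. The sketch below shows that conversion under two assumptions: that time is given in seconds, and that the output format is free to choose. perfSummary is a hypothetical name, and the exact string QUDA produces may differ.

```cpp
#include <cassert>
#include <iomanip>
#include <sstream>
#include <string>

// Format throughput from operation counts and elapsed time in seconds.
inline std::string perfSummary(long long flops, long long bytes, float time) {
  assert(time > 0.0f);
  const double gflops = flops / (1e9 * time); // floating-point rate
  const double gbytes = bytes / (1e9 * time); // memory bandwidth
  std::ostringstream ss;
  ss << std::fixed << std::setprecision(2) << gflops << " Gflop/s, " << gbytes << " GB/s";
  return ss.str();
}
```

For example, 2e9 floating-point operations and 4e9 bytes moved in 0.5 s correspond to 4 Gflop/s and 8 GB/s.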

◆ postTune()

virtual void quda::Tunable::postTune ( )
inline virtual

◆ preTune()

virtual void quda::Tunable::preTune ( )
inline virtual

◆ sharedBytesPerBlock()

virtual unsigned int quda::Tunable::sharedBytesPerBlock ( const TuneParam & param) const
protected pure virtual

◆ sharedBytesPerThread()

virtual unsigned int quda::Tunable::sharedBytesPerThread ( ) const
protected pure virtual

◆ tuneAuxDim()

virtual bool quda::Tunable::tuneAuxDim ( ) const
inline protected virtual

Reimplemented in quda::DslashCoarsePolicyTune.

Definition at line 75 of file tune_quda.h.

Referenced by paramString().

◆ tuneGridDim()

virtual bool quda::Tunable::tuneGridDim ( ) const
inline protected virtual

◆ tuneKey()

virtual TuneKey quda::Tunable::tuneKey ( ) const
pure virtual

Implemented in quda::CalculateYhat< Float, n, Arg >, quda::CalculateY< from_coarse, Float, fineSpin, fineColor, coarseSpin, coarseColor, Arg >, quda::DslashCoarsePolicyTune, quda::GaugeOvrImpSTOUT< Float, GaugeOr, GaugeDs >, quda::TwistClover< Float, nSpin, nColor, Arg >, quda::blas::TileSizeTune< ReducerDiagonal, writeDiagonal, ReducerOffDiagonal, writeOffDiagonal >, quda::ProjectSU3< Float, G >, quda::Clover< Float, nSpin, nColor, Arg >, quda::TwistGamma< Float, nColor, Arg >, quda::KSLongLinkForce< Float, Result, Oprod, Gauge >, quda::CopySpinorEx< FloatOut, FloatIn, Ns, Nc, OutOrder, InOrder, Basis, extend >, quda::CopyColorSpinor< FloatOut, FloatIn, 4, Nc, Arg >, quda::Gamma< ValueType, basis, dir >, quda::ExtractGhostEx< Float, length, nDim, dim, Order >, quda::CopyGauge< FloatOut, FloatIn, length, OutOrder, InOrder, isGhost >, quda::ExtractGhost< Float, length, nDim, Order >, quda::Laplace< Float, nDim, nColor, Arg >, quda::ShiftColorSpinorField< Output, Input >, quda::WuppertalSmearing< Float, Ns, Nc, Arg >, quda::CopyColorSpinor< FloatOut, FloatIn, Ns, Nc, Arg >, quda::KSForceComplete< Float, Oprod, Gauge, Mom >, quda::GenericPackGhostLauncher< Float, Ns, Ms, Nc, Mc, Arg >, quda::CopyGaugeEx< FloatOut, FloatIn, length, OutOrder, InOrder >, quda::QudaMemCopy, quda::GaussSpinor< FloatIn, Ns, Nc, InOrder >, quda::CopySpinor< FloatOut, FloatIn, Ns, Nc, OutOrder, InOrder >, and quda::blas::copy_ns::CopyCuda< FloatN, N, Output, Input >.

Referenced by quda::tuneLaunch().

◆ tuneSharedBytes()

virtual bool quda::Tunable::tuneSharedBytes ( ) const
inline protected virtual

Reimplemented in quda::KSLongLinkForce< Float, Result, Oprod, Gauge >, and quda::KSForceComplete< Float, Oprod, Gauge, Mom >.

Definition at line 76 of file tune_quda.h.

Referenced by advanceSharedBytes().

◆ tuningIter()

virtual int quda::Tunable::tuningIter ( ) const
inline virtual

Reimplemented in quda::DslashCoarsePolicyTune, and quda::blas::copy_ns::CopyCuda< FloatN, N, Output, Input >.

Definition at line 206 of file tune_quda.h.

Referenced by quda::tuneLaunch().

◆ writeAuxString()

int quda::Tunable::writeAuxString ( const char * format, ... )
inline protected

Definition at line 191 of file tune_quda.h.

References aux, quda::TuneKey::aux_n, errorQuda, n, and vsnprintf().

Referenced by quda::ExtractGhost< Float, length, nDim, Order >::ExtractGhost().
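
The references above (aux, TuneKey::aux_n, vsnprintf) point to a bounded printf-style write into the fixed-size aux buffer. A minimal sketch of that technique follows. AuxHolder, writeAux and the buffer size of 32 are hypothetical, and the sketch returns -1 on overflow where the real method appears to report the problem through errorQuda.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>
#include <cstring>

// Hypothetical holder with a fixed-size buffer standing in for Tunable::aux.
struct AuxHolder {
  enum { aux_n = 32 }; // assumed size; the real bound is TuneKey::aux_n
  char aux[aux_n];

  // printf-style write into aux. Returns the number of characters written
  // (excluding the terminator), or -1 if the formatted text did not fit.
  int writeAux(const char *format, ...) {
    va_list args;
    va_start(args, format);
    const int n = std::vsnprintf(aux, aux_n, format, args);
    va_end(args);
    if (n < 0 || n >= aux_n) return -1; // encoding error or truncation
    return n;
  }
};
```

Because vsnprintf reports the length the full string would have had, comparing its return value with the buffer size detects truncation without a second pass.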

Member Data Documentation

◆ aux

char quda::Tunable::aux[TuneKey::aux_n]
protected

Definition at line 189 of file tune_quda.h.

Referenced by quda::CalculateY< from_coarse, Float, fineSpin, fineColor, coarseSpin, coarseColor, Arg >::CalculateY(), quda::CalculateYhat< Float, n, Arg >::CalculateYhat(), quda::Clover< Float, nSpin, nColor, Arg >::Clover(), quda::DslashCoarsePolicyTune::DslashCoarsePolicyTune(), quda::Gamma< ValueType, basis, dir >::Gamma(), quda::GenericPackGhostLauncher< Float, Ns, Ms, Nc, Mc, Arg >::GenericPackGhostLauncher(), quda::Laplace< Float, nDim, nColor, Arg >::Laplace(), quda::QudaMemCopy::QudaMemCopy(), quda::blas::TileSizeTune< ReducerDiagonal, writeDiagonal, ReducerOffDiagonal, writeOffDiagonal >::TileSizeTune(), quda::QudaMemCopy::tuneKey(), quda::GenericPackGhostLauncher< Float, Ns, Ms, Nc, Mc, Arg >::tuneKey(), quda::WuppertalSmearing< Float, Ns, Nc, Arg >::tuneKey(), quda::ShiftColorSpinorField< Output, Input >::tuneKey(), quda::Laplace< Float, nDim, nColor, Arg >::tuneKey(), quda::ExtractGhost< Float, length, nDim, Order >::tuneKey(), quda::Gamma< ValueType, basis, dir >::tuneKey(), quda::TwistGamma< Float, nColor, Arg >::tuneKey(), quda::Clover< Float, nSpin, nColor, Arg >::tuneKey(), quda::ProjectSU3< Float, G >::tuneKey(), quda::blas::TileSizeTune< ReducerDiagonal, writeDiagonal, ReducerOffDiagonal, writeOffDiagonal >::tuneKey(), quda::TwistClover< Float, nSpin, nColor, Arg >::tuneKey(), quda::GaugeOvrImpSTOUT< Float, GaugeOr, GaugeDs >::tuneKey(), quda::DslashCoarsePolicyTune::tuneKey(), quda::CalculateY< from_coarse, Float, fineSpin, fineColor, coarseSpin, coarseColor, Arg >::tuneKey(), quda::CalculateYhat< Float, n, Arg >::tuneKey(), quda::TwistClover< Float, nSpin, nColor, Arg >::TwistClover(), quda::TwistGamma< Float, nColor, Arg >::TwistGamma(), writeAuxString(), and quda::WuppertalSmearing< Float, Ns, Nc, Arg >::WuppertalSmearing().


The documentation for this class was generated from the following file:

tune_quda.h