QUDA
1.0.0
#include <tune_quda.h>
Public Member Functions | |
bool | advanceBlockDim (TuneParam &param) const |
void | initTuneParam (TuneParam &param) const |
void | defaultTuneParam (TuneParam &param) const |
Public Member Functions inherited from quda::Tunable | |
Tunable () | |
virtual | ~Tunable () |
virtual TuneKey | tuneKey () const =0 |
virtual void | apply (const cudaStream_t &stream)=0 |
virtual void | preTune () |
virtual void | postTune () |
virtual int | tuningIter () const |
virtual std::string | paramString (const TuneParam &param) const |
virtual std::string | perfString (float time) const |
virtual bool | advanceTuneParam (TuneParam &param) const |
void | checkLaunchParam (TuneParam &param) |
CUresult | jitifyError () const |
CUresult & | jitifyError () |
Protected Member Functions | |
unsigned int | sharedBytesPerThread () const |
unsigned int | sharedBytesPerBlock (const TuneParam &param) const |
virtual bool | tuneGridDim () const |
unsigned int | maxBlockSize (const TuneParam &param) const |
Protected Member Functions inherited from quda::Tunable | |
virtual long long | flops () const =0 |
virtual long long | bytes () const |
virtual unsigned int | minThreads () const |
virtual bool | tuneAuxDim () const |
virtual bool | tuneSharedBytes () const |
virtual bool | advanceGridDim (TuneParam &param) const |
virtual unsigned int | maxGridSize () const |
virtual unsigned int | minGridSize () const |
virtual int | gridStep () const |
gridStep sets the step size when iterating the grid size in advanceGridDim. | |
virtual int | blockStep () const |
virtual int | blockMin () const |
virtual void | resetBlockDim (TuneParam &param) const |
unsigned int | maxBlocksPerSM () const |
This cannot be queried from the device properties, so the value is set here, based on Table 14 of the CUDA Programming Guide 10.0 (Technical Specifications per Compute Capability). | |
template<typename F > | |
void | setMaxDynamicSharedBytesPerBlock (F *func) const |
Enable the maximum dynamic shared bytes for the kernel "func" (values given by maxDynamicSharedBytesPerBlock()). | |
unsigned int | maxDynamicSharedBytesPerBlock () const |
This cannot be queried correctly in CUDA for all architectures, so the value is set here, based on Table 14 of the CUDA Programming Guide 10.0 (Technical Specifications per Compute Capability). | |
virtual unsigned int | maxSharedBytesPerBlock () const |
The maximum shared memory that a CUDA thread block can use in the autotuner. This is not necessarily the same as maxDynamicSharedBytesPerBlock(), since that limit may need an explicit opt-in (by calling setMaxDynamicSharedBytesPerBlock() for the kernel in question). If the CUDA kernel in question performs this opt-in, this function can be overridden to return maxDynamicSharedBytesPerBlock(). | |
virtual bool | advanceSharedBytes (TuneParam &param) const |
virtual bool | advanceAux (TuneParam &param) const |
int | writeAuxString (const char *format,...) |
Additional Inherited Members | |
Protected Attributes inherited from quda::Tunable | |
char | aux [TuneKey::aux_n] |
CUresult | jitify_error |
This derived class is for algorithms that deploy parity across the y dimension of the thread block, with no shared memory tuning. The x threads will typically correspond to the checkerboarded volume.
Definition at line 386 of file tune_quda.h.
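The thread layout described above can be emulated on the host. The following is a minimal sketch, not QUDA code: `Dim3`, `SiteParity`, and `enumerateThreads` are hypothetical names, and the block is assumed to have exactly two threads in y, one per parity. It lists which (checkerboard site, parity) pairs a launch would visit if the x index addresses the checkerboarded volume and the y index selects the parity.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for CUDA's dim3; not a QUDA or CUDA type.
struct Dim3 {
  unsigned int x, y, z;
};

// One (site, parity) pair as a thread in the emulated launch would see it.
struct SiteParity {
  unsigned int x_cb;   // checkerboarded site index, taken from the x dimension
  unsigned int parity; // 0 or 1, taken from the y dimension of the block
};

// Emulate on the host which (x_cb, parity) pairs a launch with the given
// grid and block would visit, skipping threads past the end of the volume.
std::vector<SiteParity> enumerateThreads(Dim3 grid, Dim3 block, unsigned int volume_cb) {
  assert(block.y == 2 && "assumed: one y thread per parity");
  std::vector<SiteParity> visited;
  for (unsigned int bx = 0; bx < grid.x; ++bx) {
    for (unsigned int ty = 0; ty < block.y; ++ty) {
      for (unsigned int tx = 0; tx < block.x; ++tx) {
        // Same arithmetic as blockIdx.x * blockDim.x + threadIdx.x in a kernel.
        unsigned int x_cb = bx * block.x + tx;
        if (x_cb >= volume_cb) continue; // bounds guard for a partial last block
        visited.push_back({x_cb, ty});   // the y thread index selects the parity
      }
    }
  }
  return visited;
}
```

For a checkerboarded volume of 6 sites, a grid of 2 blocks, and a 4 x 2 block, this visits each of the 6 sites once per parity, 12 pairs in total.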
bool advanceBlockDim (TuneParam &param) const | inline virtual |
Reimplemented from quda::Tunable.
Definition at line 402 of file tune_quda.h.
References quda::Tunable::advanceBlockDim(), and quda::TuneParam::block.
void defaultTuneParam (TuneParam &param) const | inline virtual |
Sets default values for when tuning is disabled.
Reimplemented from quda::Tunable.
Definition at line 413 of file tune_quda.h.
References quda::TuneParam::block, and quda::Tunable::defaultTuneParam().
void initTuneParam (TuneParam &param) const | inline virtual |
Reimplemented from quda::Tunable.
Definition at line 408 of file tune_quda.h.
References quda::TuneParam::block, and quda::Tunable::initTuneParam().
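The three overrides above each reference both the base-class method and `TuneParam::block`, which suggests a pattern of delegating to the base class and then fixing the y dimension of the block. The following sketch illustrates that pattern under stated assumptions: everything in the `sketch` namespace is a simplified, hypothetical stand-in (including the step size, the x limit, and the value 2 for the y dimension), not the code in tune_quda.h.

```cpp
#include <cassert>

namespace sketch { // hypothetical namespace; none of this is QUDA code

struct Dim3 { unsigned int x, y, z; };
struct TuneParam { Dim3 block{1, 1, 1}; };

constexpr unsigned int n_parity = 2; // assumed: two parities, one y thread each

// Simplified stand-in for the base class: steps block.x until a limit.
class Tunable {
public:
  virtual ~Tunable() = default;

  // Returns true while another candidate block size is available.
  virtual bool advanceBlockDim(TuneParam &param) const {
    const unsigned int step = 32, max_x = 128; // assumed values for illustration
    param.block.x += step;
    if (param.block.x > max_x) {
      param.block.x = step; // search space exhausted: reset and report false
      return false;
    }
    return true;
  }
  virtual void initTuneParam(TuneParam &param) const { param.block = Dim3{32, 1, 1}; }
  virtual void defaultTuneParam(TuneParam &param) const { param.block = Dim3{32, 1, 1}; }
};

// Derived class: let the base class choose block.x, then pin block.y.
class ParityInY : public Tunable {
public:
  bool advanceBlockDim(TuneParam &param) const override {
    bool advanced = Tunable::advanceBlockDim(param);
    param.block.y = n_parity; // re-pin y after every step of the search
    return advanced;
  }
  void initTuneParam(TuneParam &param) const override {
    Tunable::initTuneParam(param);
    param.block.y = n_parity;
  }
  void defaultTuneParam(TuneParam &param) const override {
    Tunable::defaultTuneParam(param);
    param.block.y = n_parity;
  }
};

} // namespace sketch
```

Pinning `block.y` in all three entry points keeps the y dimension fixed whether the launch parameters come from the start of a tuning run, from a later step of it, or from the untuned defaults.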
unsigned int maxBlockSize (const TuneParam &param) const | inline protected virtual |
The maximum block size in the x dimension is the maximum number of threads per block divided by the size of the y dimension.
Reimplemented from quda::Tunable.
Definition at line 399 of file tune_quda.h.
References deviceProp.
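The division described above can be written out as a small helper. This is a sketch only: `maxBlockSizeX` is a hypothetical name, and the thread limit is passed in as a plain argument on the assumption that the real method reads it from `deviceProp` (for example the `maxThreadsPerBlock` field of `cudaDeviceProp`).

```cpp
#include <cassert>

// Largest block size allowed in x when block_y threads are reserved in y.
// max_threads_per_block is the device-wide limit on threads in one block.
unsigned int maxBlockSizeX(unsigned int max_threads_per_block, unsigned int block_y) {
  assert(block_y > 0); // a zero-sized y dimension would make the division invalid
  return max_threads_per_block / block_y;
}
```

For example, with a limit of 1024 threads per block and a y dimension of 2, the x dimension may be at most 512.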
unsigned int sharedBytesPerBlock (const TuneParam &param) const | inline protected virtual |
Implements quda::Tunable.
Definition at line 390 of file tune_quda.h.
unsigned int sharedBytesPerThread () const | inline protected virtual |
Implements quda::Tunable.
Definition at line 389 of file tune_quda.h.
bool tuneGridDim () const | inline protected virtual |
Reimplemented from quda::Tunable.
Reimplemented in quda::GaugePlaq< Float, Gauge >.
Definition at line 393 of file tune_quda.h.