|
QUDA
0.9.0
|
#include <tune_quda.h>


Public Member Functions | |
| bool | advanceBlockDim (TuneParam &param) const |
| void | initTuneParam (TuneParam &param) const |
| void | defaultTuneParam (TuneParam &param) const |
Public Member Functions inherited from quda::Tunable | |
| Tunable () | |
| virtual | ~Tunable () |
| virtual TuneKey | tuneKey () const =0 |
| virtual void | apply (const cudaStream_t &stream)=0 |
| virtual void | preTune () |
| virtual void | postTune () |
| virtual int | tuningIter () const |
| virtual std::string | paramString (const TuneParam &param) const |
| virtual std::string | perfString (float time) const |
| virtual bool | advanceTuneParam (TuneParam &param) const |
| void | checkLaunchParam (TuneParam &param) |
Protected Member Functions | |
| unsigned int | sharedBytesPerThread () const |
| unsigned int | sharedBytesPerBlock (const TuneParam &param) const |
| bool | tuneGridDim () const |
| unsigned int | maxBlockSize () const |
Protected Member Functions inherited from quda::Tunable | |
| virtual long long | flops () const =0 |
| virtual long long | bytes () const |
| virtual unsigned int | minThreads () const |
| virtual bool | tuneAuxDim () const |
| virtual bool | tuneSharedBytes () const |
| virtual bool | advanceGridDim (TuneParam &param) const |
| virtual unsigned int | maxGridSize () const |
| virtual unsigned int | minGridSize () const |
| virtual int | blockStep () const |
| virtual int | blockMin () const |
| unsigned int | maxBlocksPerSM () const |
| For some reason this can't be queried from the device properties, so it is set here, based on Table 14 of the CUDA Programming Guide 9.0 (Technical Specifications per Compute Capability). | |
| virtual bool | advanceSharedBytes (TuneParam &param) const |
| virtual bool | advanceAux (TuneParam &param) const |
| int | writeAuxString (const char *format,...) |
Additional Inherited Members | |
Protected Attributes inherited from quda::Tunable | |
| char | aux [TuneKey::aux_n] |
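The note on maxBlocksPerSM () above describes a value that is hard-coded per compute capability rather than queried. A minimal sketch of such a lookup follows, assuming the resident-blocks-per-multiprocessor figures listed in the CUDA Programming Guide 9.0 for compute capabilities 2.x through 7.0. The function name `maxBlocksPerSMFor` and the fallback value of 0 are hypothetical and are not taken from QUDA.

```cpp
// Hypothetical helper, not QUDA's implementation: maps the major compute
// capability to the maximum number of resident blocks per multiprocessor.
unsigned int maxBlocksPerSMFor(int major) {
  switch (major) {
    case 2: return 8;    // compute capability 2.x
    case 3: return 16;   // compute capability 3.x
    case 5:              // compute capability 5.x
    case 6:              // compute capability 6.x
    case 7: return 32;   // compute capability 7.0
    default: return 0;   // assumption: 0 signals an unknown architecture
  }
}
```

A caller would pass the `major` field of the device properties and treat a return value of 0 as "not covered by this table".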
This derived class is for algorithms that deploy parity across the y dimension of the thread block with no shared memory tuning. The x threads will typically correspond to the checkerboarded volume.
Definition at line 306 of file tune_quda.h.
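The launch layout described above can be illustrated with a small sketch. It assumes that both parities (even and odd) occupy the y dimension of the block, so the y extent is 2, and that the x threads cover the checkerboarded (single-parity) volume. `Dim3`, `LaunchGeometry`, and `parityLaunch` are hypothetical names introduced for illustration; they are not part of QUDA or CUDA.

```cpp
#include <cassert>

// Hypothetical stand-in for CUDA's dim3, so the sketch compiles without CUDA.
struct Dim3 { unsigned int x, y, z; };

struct LaunchGeometry { Dim3 block; Dim3 grid; };

// Assumption: block.y == 2 holds the two parities, and block.x threads
// sweep the checkerboarded volume volume_cb.
LaunchGeometry parityLaunch(unsigned int volume_cb, unsigned int block_x) {
  assert(block_x > 0);
  LaunchGeometry g;
  g.block = Dim3{block_x, 2u, 1u};
  // Round up so every checkerboard site is covered by some x thread.
  g.grid = Dim3{(volume_cb + block_x - 1) / block_x, 1u, 1u};
  return g;
}
```

For example, a checkerboarded volume of 1000 sites with 64 x threads per block needs 16 blocks in x, and each block holds 128 threads in total.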
| bool advanceBlockDim (TuneParam &param) const | inline virtual |
Reimplemented from quda::Tunable.
Definition at line 322 of file tune_quda.h.
References quda::Tunable::advanceBlockDim(), and param.

| void defaultTuneParam (TuneParam &param) const | inline virtual |
Sets default values for when tuning is disabled.
Reimplemented from quda::Tunable.
Definition at line 333 of file tune_quda.h.
References quda::Tunable::defaultTuneParam(), and param.

| void initTuneParam (TuneParam &param) const | inline virtual |
Reimplemented from quda::Tunable.
Definition at line 328 of file tune_quda.h.
References quda::Tunable::initTuneParam(), and param.

| unsigned int maxBlockSize () const | inline protected virtual |
The maximum block size in the x dimension is the maximum number of threads per block divided by the size of the y dimension.
Reimplemented from quda::Tunable.
Definition at line 319 of file tune_quda.h.
References deviceProp.
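As a worked illustration of the rule above, the following sketch computes the x-dimension limit from a per-block thread limit and a y extent. The function names are hypothetical, and the figure of 1024 threads per block used in the note below is an assumed device limit, not a value read from this page.

```cpp
#include <cassert>

// Hypothetical helper: largest block.x such that block.x * block_y
// does not exceed the device's per-block thread limit.
unsigned int maxBlockSizeX(unsigned int max_threads_per_block,
                           unsigned int block_y) {
  assert(block_y > 0);
  return max_threads_per_block / block_y;
}

// Hypothetical check that a candidate (x, y) block shape is launchable.
bool fitsInBlock(unsigned int block_x, unsigned int block_y,
                 unsigned int max_threads_per_block) {
  return block_x * block_y <= max_threads_per_block;
}
```

With an assumed limit of 1024 threads per block and a y extent of 2, the x dimension may be at most 512.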
| unsigned int sharedBytesPerBlock (const TuneParam &param) const | inline protected virtual |
Implements quda::Tunable.
Definition at line 310 of file tune_quda.h.
| unsigned int sharedBytesPerThread () const | inline protected virtual |
Implements quda::Tunable.
Definition at line 309 of file tune_quda.h.
| bool tuneGridDim () const | inline protected virtual |
Reimplemented from quda::Tunable.
Definition at line 313 of file tune_quda.h.