QUDA  1.0.0
quda::TunableLocalParity Class Reference

#include <tune_quda.h>


Public Member Functions

bool advanceBlockDim (TuneParam &param) const
 
void initTuneParam (TuneParam &param) const
 
void defaultTuneParam (TuneParam &param) const
 
- Public Member Functions inherited from quda::Tunable
 Tunable ()
 
virtual ~Tunable ()
 
virtual TuneKey tuneKey () const =0
 
virtual void apply (const cudaStream_t &stream)=0
 
virtual void preTune ()
 
virtual void postTune ()
 
virtual int tuningIter () const
 
virtual std::string paramString (const TuneParam &param) const
 
virtual std::string perfString (float time) const
 
virtual bool advanceTuneParam (TuneParam &param) const
 
void checkLaunchParam (TuneParam &param)
 
CUresult jitifyError () const
 
CUresult & jitifyError ()
 

Protected Member Functions

unsigned int sharedBytesPerThread () const
 
unsigned int sharedBytesPerBlock (const TuneParam &param) const
 
virtual bool tuneGridDim () const
 
unsigned int maxBlockSize (const TuneParam &param) const
 
- Protected Member Functions inherited from quda::Tunable
virtual long long flops () const =0
 
virtual long long bytes () const
 
virtual unsigned int minThreads () const
 
virtual bool tuneAuxDim () const
 
virtual bool tuneSharedBytes () const
 
virtual bool advanceGridDim (TuneParam &param) const
 
virtual unsigned int maxGridSize () const
 
virtual unsigned int minGridSize () const
 
virtual int gridStep () const
 gridStep sets the step size when iterating the grid size in advanceGridDim.
 
virtual int blockStep () const
 
virtual int blockMin () const
 
virtual void resetBlockDim (TuneParam &param) const
 
unsigned int maxBlocksPerSM () const
 The maximum number of blocks per SM cannot be queried from the device properties, so it is set here, based on Table 14 of the CUDA Programming Guide 10.0 (Technical Specifications per Compute Capability).
 
template<typename F >
void setMaxDynamicSharedBytesPerBlock (F *func) const
 Enable the maximum dynamic shared bytes for the kernel "func" (values given by maxDynamicSharedBytesPerBlock()).
 
unsigned int maxDynamicSharedBytesPerBlock () const
 This cannot be queried correctly in CUDA for all architectures, so it is set here, based on Table 14 of the CUDA Programming Guide 10.0 (Technical Specifications per Compute Capability).
 
virtual unsigned int maxSharedBytesPerBlock () const
 The maximum shared memory that a CUDA thread block can use in the autotuner. This is not necessarily the same as maxDynamicSharedBytesPerBlock(), since the larger limit may need an explicit opt-in (by calling setMaxDynamicSharedBytesPerBlock() for the kernel in question). If the CUDA kernel in question does opt in, this function can be overridden to return maxDynamicSharedBytesPerBlock().
 
virtual bool advanceSharedBytes (TuneParam &param) const
 
virtual bool advanceAux (TuneParam &param) const
 
int writeAuxString (const char *format,...)
 

Additional Inherited Members

- Protected Attributes inherited from quda::Tunable
char aux [TuneKey::aux_n]
 
CUresult jitify_error
 

Detailed Description

This derived class is for algorithms that deploy parity across the y dimension of the thread block, with no shared memory tuning. The x threads will typically correspond to the checkerboarded volume.

Definition at line 386 of file tune_quda.h.
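
The thread layout described above can be sketched in plain C++. This is an illustration only, not QUDA code: `Dim3`, `LaunchGeometry`, and `local_parity_geometry` are hypothetical names, and both the block y size of 2 (one thread per parity) and the rounded-up grid size are assumptions made for the example.

```cpp
#include <cassert>

// Hypothetical stand-in for CUDA's dim3; not a QUDA type.
struct Dim3 { unsigned int x = 1, y = 1, z = 1; };

// Hypothetical container for a kernel launch configuration.
struct LaunchGeometry { Dim3 grid; Dim3 block; };

// Builds a launch configuration in which the x threads cover the
// checkerboarded volume and the y threads cover the two parities.
// Assumption: the grid is rounded up so that every site gets a thread.
inline LaunchGeometry local_parity_geometry(unsigned int volume_cb, unsigned int block_x) {
  assert(block_x > 0);
  LaunchGeometry g;
  g.block.x = block_x;                             // candidate chosen by the tuner
  g.block.y = 2;                                   // assumption: one y thread per parity
  g.grid.x = (volume_cb + block_x - 1) / block_x;  // ceiling division
  return g;
}
```

With a checkerboarded volume of 1000 sites and a candidate block.x of 128, this sketch yields 8 blocks of 128 x 2 threads each.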

Member Function Documentation

◆ advanceBlockDim()

bool quda::TunableLocalParity::advanceBlockDim ( TuneParam & param ) const
inline virtual

Reimplemented from quda::Tunable.

Definition at line 402 of file tune_quda.h.

References quda::Tunable::advanceBlockDim(), and quda::TuneParam::block.
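
A minimal sketch of the override pattern suggested by the references above: delegate the x-dimension search to the base class, then fix block.y. The `Mock*` types are hypothetical stand-ins, the base-class stepping logic is simplified, and the value 2 for block.y is an assumption; none of this is the QUDA implementation.

```cpp
#include <cassert>

// Hypothetical stand-ins for QUDA's block dimensions and TuneParam.
struct Dim3 { unsigned int x = 1, y = 1, z = 1; };
struct MockTuneParam { Dim3 block; };

// Simplified base tuner: steps block.x by a fixed stride up to a maximum.
struct MockTunable {
  virtual ~MockTunable() = default;
  virtual unsigned int blockStep() const { return 32; }
  virtual unsigned int maxBlockSize() const { return 1024; }
  virtual bool advanceBlockDim(MockTuneParam &param) const {
    param.block.x += blockStep();
    if (param.block.x > maxBlockSize()) {
      param.block.x = blockStep();  // exhausted: reset to the smallest candidate
      return false;
    }
    return true;
  }
};

// Derived tuner: reuse the base x search, then pin y to the two parities.
struct MockLocalParity : MockTunable {
  unsigned int maxBlockSize() const override { return 1024 / 2; }
  bool advanceBlockDim(MockTuneParam &param) const override {
    bool advanced = MockTunable::advanceBlockDim(param);
    param.block.y = 2;  // assumption: one y thread per parity
    return advanced;
  }
};

// Counts the block.x candidates a tuner visits, starting from blockStep().
inline int count_candidates(const MockTunable &t) {
  assert(t.blockStep() > 0);
  MockTuneParam p;
  p.block.x = t.blockStep();
  int n = 1;
  while (t.advanceBlockDim(p)) ++n;
  return n;
}
```

In this sketch, halving the maximum block size halves the number of x candidates, so the total thread count per block (x times y) stays within the same limit as the base tuner.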


◆ defaultTuneParam()

void quda::TunableLocalParity::defaultTuneParam ( TuneParam & param ) const
inline virtual

Sets the default values used when tuning is disabled.

Reimplemented from quda::Tunable.

Definition at line 413 of file tune_quda.h.

References quda::TuneParam::block, and quda::Tunable::defaultTuneParam().


◆ initTuneParam()

void quda::TunableLocalParity::initTuneParam ( TuneParam & param ) const
inline virtual

Reimplemented from quda::Tunable.

Definition at line 408 of file tune_quda.h.

References quda::TuneParam::block, and quda::Tunable::initTuneParam().


◆ maxBlockSize()

unsigned int quda::TunableLocalParity::maxBlockSize ( const TuneParam & param ) const
inline protected virtual

The maximum block size in the x dimension is the maximum number of threads per block divided by the size of the y dimension.

Reimplemented from quda::Tunable.

Definition at line 399 of file tune_quda.h.

References deviceProp.
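
The arithmetic behind this limit can be sketched as follows. `max_block_x` is a hypothetical helper, not part of QUDA; the thread limit is passed in as a parameter here, whereas the real function presumably reads it from deviceProp.

```cpp
#include <cassert>

// Hypothetical helper: the largest block.x that still fits in a thread
// block when block_y threads are used in the y dimension.
inline unsigned int max_block_x(unsigned int max_threads_per_block, unsigned int block_y) {
  assert(block_y > 0);
  return max_threads_per_block / block_y;  // integer division rounds down
}
```

For example, on a device that allows 1024 threads per block, a y dimension of 2 leaves at most 512 threads in x.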

◆ sharedBytesPerBlock()

unsigned int quda::TunableLocalParity::sharedBytesPerBlock ( const TuneParam & param ) const
inline protected virtual

Implements quda::Tunable.

Definition at line 390 of file tune_quda.h.

◆ sharedBytesPerThread()

unsigned int quda::TunableLocalParity::sharedBytesPerThread ( ) const
inline protected virtual

Implements quda::Tunable.

Definition at line 389 of file tune_quda.h.

◆ tuneGridDim()

virtual bool quda::TunableLocalParity::tuneGridDim ( ) const
inline protected virtual

Reimplemented from quda::Tunable.

Reimplemented in quda::GaugePlaq< Float, Gauge >.

Definition at line 393 of file tune_quda.h.


The documentation for this class was generated from the following file: