QUDA  1.0.0
quda::blas::ReduceCuda< doubleN, ReduceType, FloatN, M, SpinorX, SpinorY, SpinorZ, SpinorW, SpinorV, Reducer > Class Template Reference

Public Member Functions

 ReduceCuda (doubleN &result, SpinorX &X, SpinorY &Y, SpinorZ &Z, SpinorW &W, SpinorV &V, Reducer &r, ColorSpinorField &x, ColorSpinorField &y, ColorSpinorField &z, ColorSpinorField &w, ColorSpinorField &v, int length)
 
virtual ~ReduceCuda ()
 
TuneKey tuneKey () const
 
void apply (const cudaStream_t &stream)
 
void preTune ()
 
void postTune ()
 
void initTuneParam (TuneParam &param) const
 
void defaultTuneParam (TuneParam &param) const
 
long long flops () const
 
long long bytes () const
 
int tuningIter () const
 
Public Member Functions inherited from quda::Tunable
 Tunable ()
 
virtual ~Tunable ()
 
virtual std::string paramString (const TuneParam &param) const
 
virtual std::string perfString (float time) const
 
virtual bool advanceTuneParam (TuneParam &param) const
 
void checkLaunchParam (TuneParam &param)
 
CUresult jitifyError () const
 
CUresult & jitifyError ()
 

Private Member Functions

unsigned int sharedBytesPerThread () const
 
unsigned int sharedBytesPerBlock (const TuneParam &param) const
 
virtual bool advanceSharedBytes (TuneParam &param) const
 

Private Attributes

const int nParity
 
ReductionArg< ReduceType, SpinorX, SpinorY, SpinorZ, SpinorW, SpinorV, Reducer > arg
 
doubleN & result
 
const ColorSpinorField & x
 
const ColorSpinorField & y
 
const ColorSpinorField & z
 
const ColorSpinorField & w
 
const ColorSpinorField & v
 
char * X_h
 
char * Y_h
 
char * Z_h
 
char * W_h
 
char * V_h
 
char * Xnorm_h
 
char * Ynorm_h
 
char * Znorm_h
 
char * Wnorm_h
 
char * Vnorm_h
 

Additional Inherited Members

Protected Member Functions inherited from quda::Tunable
virtual unsigned int minThreads () const
 
virtual bool tuneGridDim () const
 
virtual bool tuneAuxDim () const
 
virtual bool tuneSharedBytes () const
 
virtual bool advanceGridDim (TuneParam &param) const
 
virtual unsigned int maxBlockSize (const TuneParam &param) const
 
virtual unsigned int maxGridSize () const
 
virtual unsigned int minGridSize () const
 
virtual int gridStep () const
 gridStep sets the step size when iterating the grid size in advanceGridDim.
 
virtual int blockStep () const
 
virtual int blockMin () const
 
virtual void resetBlockDim (TuneParam &param) const
 
virtual bool advanceBlockDim (TuneParam &param) const
 
unsigned int maxBlocksPerSM () const
 For some reason this can't be queried from the device properties, so we set it here, based on Table 14 of the CUDA Programming Guide 10.0 (Technical Specifications per Compute Capability).
 
template<typename F >
void setMaxDynamicSharedBytesPerBlock (F *func) const
 Enable the maximum dynamic shared bytes for the kernel "func" (values given by maxDynamicSharedBytesPerBlock()).
 
unsigned int maxDynamicSharedBytesPerBlock () const
 This can't be correctly queried in CUDA for all architectures, so we set it here, based on Table 14 of the CUDA Programming Guide 10.0 (Technical Specifications per Compute Capability).
 
virtual unsigned int maxSharedBytesPerBlock () const
 The maximum shared memory that a CUDA thread block can use in the autotuner. This isn't necessarily the same as maxDynamicSharedBytesPerBlock(), since that may need an explicit opt-in to enable (by calling setMaxDynamicSharedBytesPerBlock() for the kernel in question). If the CUDA kernel in question does this opt-in, then this function can be overridden to return maxDynamicSharedBytesPerBlock().
 
virtual bool advanceAux (TuneParam &param) const
 
int writeAuxString (const char *format,...)
 
Protected Attributes inherited from quda::Tunable
char aux [TuneKey::aux_n]
 
CUresult jitify_error
 

Detailed Description

template<typename doubleN, typename ReduceType, typename FloatN, int M, typename SpinorX, typename SpinorY, typename SpinorZ, typename SpinorW, typename SpinorV, typename Reducer>
class quda::blas::ReduceCuda< doubleN, ReduceType, FloatN, M, SpinorX, SpinorY, SpinorZ, SpinorW, SpinorV, Reducer >

Definition at line 179 of file reduce_quda.cu.

Constructor & Destructor Documentation

◆ ReduceCuda()

template<typename doubleN , typename ReduceType , typename FloatN , int M, typename SpinorX , typename SpinorY , typename SpinorZ , typename SpinorW , typename SpinorV , typename Reducer >
quda::blas::ReduceCuda< doubleN, ReduceType, FloatN, M, SpinorX, SpinorY, SpinorZ, SpinorW, SpinorV, Reducer >::ReduceCuda ( doubleN &  result,
SpinorX &  X,
SpinorY &  Y,
SpinorZ &  Z,
SpinorW &  W,
SpinorV &  V,
Reducer &  r,
ColorSpinorField & x,
ColorSpinorField & y,
ColorSpinorField & z,
ColorSpinorField & w,
ColorSpinorField & v,
int  length 
)
inline

Definition at line 209 of file reduce_quda.cu.

References quda::LatticeField::AuxString(), quda::blas::getFastReduce(), and quda::LatticeField::Precision().

◆ ~ReduceCuda()

template<typename doubleN , typename ReduceType , typename FloatN , int M, typename SpinorX , typename SpinorY , typename SpinorZ , typename SpinorW , typename SpinorV , typename Reducer >
virtual quda::blas::ReduceCuda< doubleN, ReduceType, FloatN, M, SpinorX, SpinorY, SpinorZ, SpinorW, SpinorV, Reducer >::~ReduceCuda ( )
inline, virtual

Definition at line 242 of file reduce_quda.cu.

Member Function Documentation

◆ advanceSharedBytes()

template<typename doubleN , typename ReduceType , typename FloatN , int M, typename SpinorX , typename SpinorY , typename SpinorZ , typename SpinorW , typename SpinorV , typename Reducer >
virtual bool quda::blas::ReduceCuda< doubleN, ReduceType, FloatN, M, SpinorX, SpinorY, SpinorZ, SpinorW, SpinorV, Reducer >::advanceSharedBytes ( TuneParam & param) const
inline, private, virtual

The goal here is to throttle the number of thread blocks per SM by over-allocating shared memory (in order to improve L2 utilization, etc.). We thus request the smallest amount of dynamic shared memory that guarantees throttling to a given number of blocks, in order to allow some extra leeway.
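
The arithmetic behind this throttling can be illustrated with a minimal sketch. It is not taken from reduce_quda.cu: the helper names `residentBlocks` and `throttleSharedBytes` are hypothetical, and the sketch assumes that shared memory is the only resource limiting how many blocks fit on an SM.

```cpp
#include <cassert>

// Hypothetical helper (not part of QUDA): number of thread blocks that can be
// resident on one SM when shared memory is the only limiting resource.
inline int residentBlocks(int max_shared_per_sm, int shared_bytes_per_block)
{
  assert(max_shared_per_sm > 0 && shared_bytes_per_block > 0);
  return max_shared_per_sm / shared_bytes_per_block;
}

// Hypothetical helper (not part of QUDA): smallest per-block shared-memory
// request that limits residency to at most target_blocks blocks per SM.
// We need floor(S / b) <= n, i.e. b > S / (n + 1), so the smallest such
// integer is floor(S / (n + 1)) + 1.
inline int throttleSharedBytes(int max_shared_per_sm, int target_blocks)
{
  assert(max_shared_per_sm > 0 && target_blocks > 0);
  return max_shared_per_sm / (target_blocks + 1) + 1;
}
```

For example, with 49152 bytes of shared memory per SM and a target of 3 resident blocks, the sketch requests 12289 bytes per block; a request of 12288 bytes would still allow 4 blocks. Requesting the smallest throttling amount, rather than an even share of the SM, leaves the remaining shared memory unclaimed.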

Reimplemented from quda::Tunable.

Definition at line 197 of file reduce_quda.cu.

References quda::TuneParam::block, and quda::TuneParam::shared_bytes.

◆ apply()

template<typename doubleN , typename ReduceType , typename FloatN , int M, typename SpinorX , typename SpinorY , typename SpinorZ , typename SpinorW , typename SpinorV , typename Reducer >
void quda::blas::ReduceCuda< doubleN, ReduceType, FloatN, M, SpinorX, SpinorY, SpinorZ, SpinorW, SpinorV, Reducer >::apply ( const cudaStream_t &  stream)
inline, virtual

Implements quda::Tunable.

Definition at line 246 of file reduce_quda.cu.

References quda::arg(), getTuning(), getVerbosity(), quda::stream, and quda::tuneLaunch().

◆ bytes()

template<typename doubleN , typename ReduceType , typename FloatN , int M, typename SpinorX , typename SpinorY , typename SpinorZ , typename SpinorW , typename SpinorV , typename Reducer >
long long quda::blas::ReduceCuda< doubleN, ReduceType, FloatN, M, SpinorX, SpinorY, SpinorZ, SpinorW, SpinorV, Reducer >::bytes ( ) const
inline, virtual

Reimplemented from quda::Tunable.

Definition at line 284 of file reduce_quda.cu.

References quda::ColorSpinorField::Bytes(), and quda::blas::ReductionArg< ReduceType, SpinorX, SpinorY, SpinorZ, SpinorW, SpinorV, Reducer >::r.

◆ defaultTuneParam()

template<typename doubleN , typename ReduceType , typename FloatN , int M, typename SpinorX , typename SpinorY , typename SpinorZ , typename SpinorW , typename SpinorV , typename Reducer >
void quda::blas::ReduceCuda< doubleN, ReduceType, FloatN, M, SpinorX, SpinorY, SpinorZ, SpinorW, SpinorV, Reducer >::defaultTuneParam ( TuneParam & param) const
inline, virtual

Sets default values for when tuning is disabled.

Reimplemented from quda::Tunable.

Definition at line 276 of file reduce_quda.cu.

References quda::Tunable::defaultTuneParam(), and quda::TuneParam::grid.

◆ flops()

template<typename doubleN , typename ReduceType , typename FloatN , int M, typename SpinorX , typename SpinorY , typename SpinorZ , typename SpinorW , typename SpinorV , typename Reducer >
long long quda::blas::ReduceCuda< doubleN, ReduceType, FloatN, M, SpinorX, SpinorY, SpinorZ, SpinorW, SpinorV, Reducer >::flops ( ) const
inline, virtual

◆ initTuneParam()

template<typename doubleN , typename ReduceType , typename FloatN , int M, typename SpinorX , typename SpinorY , typename SpinorZ , typename SpinorW , typename SpinorV , typename Reducer >
void quda::blas::ReduceCuda< doubleN, ReduceType, FloatN, M, SpinorX, SpinorY, SpinorZ, SpinorW, SpinorV, Reducer >::initTuneParam ( TuneParam & param) const
inline, virtual

Reimplemented from quda::Tunable.

Definition at line 270 of file reduce_quda.cu.

References quda::TuneParam::grid, and quda::Tunable::initTuneParam().

◆ postTune()

template<typename doubleN , typename ReduceType , typename FloatN , int M, typename SpinorX , typename SpinorY , typename SpinorZ , typename SpinorW , typename SpinorV , typename Reducer >
void quda::blas::ReduceCuda< doubleN, ReduceType, FloatN, M, SpinorX, SpinorY, SpinorZ, SpinorW, SpinorV, Reducer >::postTune ( )
inline, virtual

◆ preTune()

template<typename doubleN , typename ReduceType , typename FloatN , int M, typename SpinorX , typename SpinorY , typename SpinorZ , typename SpinorW , typename SpinorV , typename Reducer >
void quda::blas::ReduceCuda< doubleN, ReduceType, FloatN, M, SpinorX, SpinorY, SpinorZ, SpinorW, SpinorV, Reducer >::preTune ( )
inline, virtual

◆ sharedBytesPerBlock()

template<typename doubleN , typename ReduceType , typename FloatN , int M, typename SpinorX , typename SpinorY , typename SpinorZ , typename SpinorW , typename SpinorV , typename Reducer >
unsigned int quda::blas::ReduceCuda< doubleN, ReduceType, FloatN, M, SpinorX, SpinorY, SpinorZ, SpinorW, SpinorV, Reducer >::sharedBytesPerBlock ( const TuneParam & param) const
inline, private, virtual

Implements quda::Tunable.

Definition at line 195 of file reduce_quda.cu.

◆ sharedBytesPerThread()

template<typename doubleN , typename ReduceType , typename FloatN , int M, typename SpinorX , typename SpinorY , typename SpinorZ , typename SpinorW , typename SpinorV , typename Reducer >
unsigned int quda::blas::ReduceCuda< doubleN, ReduceType, FloatN, M, SpinorX, SpinorY, SpinorZ, SpinorW, SpinorV, Reducer >::sharedBytesPerThread ( ) const
inline, private, virtual

Implements quda::Tunable.

Definition at line 194 of file reduce_quda.cu.

◆ tuneKey()

template<typename doubleN , typename ReduceType , typename FloatN , int M, typename SpinorX , typename SpinorY , typename SpinorZ , typename SpinorW , typename SpinorV , typename Reducer >
TuneKey quda::blas::ReduceCuda< doubleN, ReduceType, FloatN, M, SpinorX, SpinorY, SpinorZ, SpinorW, SpinorV, Reducer >::tuneKey ( ) const
inline, virtual

Implements quda::Tunable.

Definition at line 244 of file reduce_quda.cu.

References quda::blas::ReductionArg< ReduceType, SpinorX, SpinorY, SpinorZ, SpinorW, SpinorV, Reducer >::r, and quda::LatticeField::VolString().

◆ tuningIter()

template<typename doubleN , typename ReduceType , typename FloatN , int M, typename SpinorX , typename SpinorY , typename SpinorZ , typename SpinorW , typename SpinorV , typename Reducer >
int quda::blas::ReduceCuda< doubleN, ReduceType, FloatN, M, SpinorX, SpinorY, SpinorZ, SpinorW, SpinorV, Reducer >::tuningIter ( ) const
inline, virtual

Reimplemented from quda::Tunable.

Definition at line 291 of file reduce_quda.cu.

Member Data Documentation

◆ arg

template<typename doubleN , typename ReduceType , typename FloatN , int M, typename SpinorX , typename SpinorY , typename SpinorZ , typename SpinorW , typename SpinorV , typename Reducer >
ReductionArg<ReduceType, SpinorX, SpinorY, SpinorZ, SpinorW, SpinorV, Reducer> quda::blas::ReduceCuda< doubleN, ReduceType, FloatN, M, SpinorX, SpinorY, SpinorZ, SpinorW, SpinorV, Reducer >::arg
mutable, private

Definition at line 184 of file reduce_quda.cu.

◆ nParity

template<typename doubleN , typename ReduceType , typename FloatN , int M, typename SpinorX , typename SpinorY , typename SpinorZ , typename SpinorW , typename SpinorV , typename Reducer >
const int quda::blas::ReduceCuda< doubleN, ReduceType, FloatN, M, SpinorX, SpinorY, SpinorZ, SpinorW, SpinorV, Reducer >::nParity
private

Definition at line 183 of file reduce_quda.cu.

◆ result

template<typename doubleN , typename ReduceType , typename FloatN , int M, typename SpinorX , typename SpinorY , typename SpinorZ , typename SpinorW , typename SpinorV , typename Reducer >
doubleN& quda::blas::ReduceCuda< doubleN, ReduceType, FloatN, M, SpinorX, SpinorY, SpinorZ, SpinorW, SpinorV, Reducer >::result
private

Definition at line 185 of file reduce_quda.cu.

◆ v

template<typename doubleN , typename ReduceType , typename FloatN , int M, typename SpinorX , typename SpinorY , typename SpinorZ , typename SpinorW , typename SpinorV , typename Reducer >
const ColorSpinorField & quda::blas::ReduceCuda< doubleN, ReduceType, FloatN, M, SpinorX, SpinorY, SpinorZ, SpinorW, SpinorV, Reducer >::v
private

Definition at line 187 of file reduce_quda.cu.

◆ V_h

template<typename doubleN , typename ReduceType , typename FloatN , int M, typename SpinorX , typename SpinorY , typename SpinorZ , typename SpinorW , typename SpinorV , typename Reducer >
char * quda::blas::ReduceCuda< doubleN, ReduceType, FloatN, M, SpinorX, SpinorY, SpinorZ, SpinorW, SpinorV, Reducer >::V_h
private

Definition at line 191 of file reduce_quda.cu.

◆ Vnorm_h

template<typename doubleN , typename ReduceType , typename FloatN , int M, typename SpinorX , typename SpinorY , typename SpinorZ , typename SpinorW , typename SpinorV , typename Reducer >
char * quda::blas::ReduceCuda< doubleN, ReduceType, FloatN, M, SpinorX, SpinorY, SpinorZ, SpinorW, SpinorV, Reducer >::Vnorm_h
private

Definition at line 192 of file reduce_quda.cu.

◆ w

template<typename doubleN , typename ReduceType , typename FloatN , int M, typename SpinorX , typename SpinorY , typename SpinorZ , typename SpinorW , typename SpinorV , typename Reducer >
const ColorSpinorField & quda::blas::ReduceCuda< doubleN, ReduceType, FloatN, M, SpinorX, SpinorY, SpinorZ, SpinorW, SpinorV, Reducer >::w
private

Definition at line 187 of file reduce_quda.cu.

◆ W_h

template<typename doubleN , typename ReduceType , typename FloatN , int M, typename SpinorX , typename SpinorY , typename SpinorZ , typename SpinorW , typename SpinorV , typename Reducer >
char * quda::blas::ReduceCuda< doubleN, ReduceType, FloatN, M, SpinorX, SpinorY, SpinorZ, SpinorW, SpinorV, Reducer >::W_h
private

Definition at line 191 of file reduce_quda.cu.

◆ Wnorm_h

template<typename doubleN , typename ReduceType , typename FloatN , int M, typename SpinorX , typename SpinorY , typename SpinorZ , typename SpinorW , typename SpinorV , typename Reducer >
char * quda::blas::ReduceCuda< doubleN, ReduceType, FloatN, M, SpinorX, SpinorY, SpinorZ, SpinorW, SpinorV, Reducer >::Wnorm_h
private

Definition at line 192 of file reduce_quda.cu.

◆ x

template<typename doubleN , typename ReduceType , typename FloatN , int M, typename SpinorX , typename SpinorY , typename SpinorZ , typename SpinorW , typename SpinorV , typename Reducer >
const ColorSpinorField& quda::blas::ReduceCuda< doubleN, ReduceType, FloatN, M, SpinorX, SpinorY, SpinorZ, SpinorW, SpinorV, Reducer >::x
private

Definition at line 187 of file reduce_quda.cu.

◆ X_h

template<typename doubleN , typename ReduceType , typename FloatN , int M, typename SpinorX , typename SpinorY , typename SpinorZ , typename SpinorW , typename SpinorV , typename Reducer >
char* quda::blas::ReduceCuda< doubleN, ReduceType, FloatN, M, SpinorX, SpinorY, SpinorZ, SpinorW, SpinorV, Reducer >::X_h
private

Definition at line 191 of file reduce_quda.cu.

◆ Xnorm_h

template<typename doubleN , typename ReduceType , typename FloatN , int M, typename SpinorX , typename SpinorY , typename SpinorZ , typename SpinorW , typename SpinorV , typename Reducer >
char* quda::blas::ReduceCuda< doubleN, ReduceType, FloatN, M, SpinorX, SpinorY, SpinorZ, SpinorW, SpinorV, Reducer >::Xnorm_h
private

Definition at line 192 of file reduce_quda.cu.

◆ y

template<typename doubleN , typename ReduceType , typename FloatN , int M, typename SpinorX , typename SpinorY , typename SpinorZ , typename SpinorW , typename SpinorV , typename Reducer >
const ColorSpinorField & quda::blas::ReduceCuda< doubleN, ReduceType, FloatN, M, SpinorX, SpinorY, SpinorZ, SpinorW, SpinorV, Reducer >::y
private

Definition at line 187 of file reduce_quda.cu.

◆ Y_h

template<typename doubleN , typename ReduceType , typename FloatN , int M, typename SpinorX , typename SpinorY , typename SpinorZ , typename SpinorW , typename SpinorV , typename Reducer >
char * quda::blas::ReduceCuda< doubleN, ReduceType, FloatN, M, SpinorX, SpinorY, SpinorZ, SpinorW, SpinorV, Reducer >::Y_h
private

Definition at line 191 of file reduce_quda.cu.

◆ Ynorm_h

template<typename doubleN , typename ReduceType , typename FloatN , int M, typename SpinorX , typename SpinorY , typename SpinorZ , typename SpinorW , typename SpinorV , typename Reducer >
char * quda::blas::ReduceCuda< doubleN, ReduceType, FloatN, M, SpinorX, SpinorY, SpinorZ, SpinorW, SpinorV, Reducer >::Ynorm_h
private

Definition at line 192 of file reduce_quda.cu.

◆ z

template<typename doubleN , typename ReduceType , typename FloatN , int M, typename SpinorX , typename SpinorY , typename SpinorZ , typename SpinorW , typename SpinorV , typename Reducer >
const ColorSpinorField & quda::blas::ReduceCuda< doubleN, ReduceType, FloatN, M, SpinorX, SpinorY, SpinorZ, SpinorW, SpinorV, Reducer >::z
private

Definition at line 187 of file reduce_quda.cu.

◆ Z_h

template<typename doubleN , typename ReduceType , typename FloatN , int M, typename SpinorX , typename SpinorY , typename SpinorZ , typename SpinorW , typename SpinorV , typename Reducer >
char * quda::blas::ReduceCuda< doubleN, ReduceType, FloatN, M, SpinorX, SpinorY, SpinorZ, SpinorW, SpinorV, Reducer >::Z_h
private

Definition at line 191 of file reduce_quda.cu.

◆ Znorm_h

template<typename doubleN , typename ReduceType , typename FloatN , int M, typename SpinorX , typename SpinorY , typename SpinorZ , typename SpinorW , typename SpinorV , typename Reducer >
char * quda::blas::ReduceCuda< doubleN, ReduceType, FloatN, M, SpinorX, SpinorY, SpinorZ, SpinorW, SpinorV, Reducer >::Znorm_h
private

Definition at line 192 of file reduce_quda.cu.


The documentation for this class was generated from the following file: