Public Types
typedef scalar< ReduceType >::type	real

Public Member Functions
	HeavyQuarkResidualNorm_ (const Float2 &a, const Float2 &b)

__device__ __host__ void	pre ()
	pre-computation routine called before the "M-loop"

__device__ __host__ void	operator() (ReduceType &sum, FloatN &x, FloatN &y, FloatN &z, FloatN &w, FloatN &v)
	where the reduction is usually computed, along with any auxiliary operations

__device__ __host__ void	post (ReduceType &sum)
	sum the solution and residual norms, and compute the heavy-quark norm
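
The calling sequence implied by these three members can be sketched on the host as follows. This is a minimal mock, not the code in reduce_quda.cu: Reduce3, Vec2, MockHeavyQuarkNorm and reduce_sites are hypothetical names, the function bodies are assumptions inferred from the brief descriptions above, and the guard for a site with zero solution norm is likewise assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for the CUDA vector types.
struct Reduce3 { double x, y, z; };  // plays the role of ReduceType
struct Vec2 { double x, y; };        // plays the role of FloatN

// Mock functor mirroring the documented interface; bodies are assumptions.
struct MockHeavyQuarkNorm {
  Reduce3 aux{0.0, 0.0, 0.0};  // per-site scratch accumulators

  // Reset the per-site accumulators before the M-loop.
  void pre() { aux.x = 0.0; aux.y = 0.0; }

  // Accumulate one FloatN chunk of the solution (x) and residual (y).
  void operator()(Reduce3 &, Vec2 &x, Vec2 &y, Vec2 &, Vec2 &, Vec2 &) {
    aux.x += x.x * x.x + x.y * x.y;
    aux.y += y.x * y.x + y.y * y.y;
  }

  // Fold the finished site into the running sums.
  void post(Reduce3 &sum) {
    sum.x += aux.x;
    sum.y += aux.y;
    sum.z += (aux.x > 0.0) ? aux.y / aux.x : 1.0;  // assumed zero-norm guard
  }
};

// Hypothetical driver: one pre(), M operator() calls, one post() per site.
template <int M>
Reduce3 reduce_sites(std::vector<Vec2> &x, std::vector<Vec2> &y) {
  assert(x.size() == y.size() && x.size() % M == 0);
  MockHeavyQuarkNorm f;
  Reduce3 sum{0.0, 0.0, 0.0};
  Vec2 unused{0.0, 0.0};  // z, w, v are not read by this mock
  for (std::size_t site = 0; site < x.size() / M; ++site) {
    f.pre();
    for (int j = 0; j < M; ++j)
      f(sum, x[site * M + j], y[site * M + j], unused, unused, unused);
    f.post(sum);
  }
  return sum;
}
```

In this mock the third component of the result depends on M, because the ratio is formed once per pre()/post() pair: calling reduce_sites<1> and reduce_sites<2> on the same data generally gives different values of sum.z, while sum.x and sum.y agree.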

Static Public Member Functions
static int	streams ()
 	total number of input and output streams

static int	flops ()
 	floating-point operation count for this functor

Public Attributes
Float2	a

Float2	b

ReduceType	aux

Detailed Description

template<typename ReduceType, typename Float2, typename FloatN>
struct quda::blas::HeavyQuarkResidualNorm_< ReduceType, Float2, FloatN >

This kernel returns the solution norm (x, x) and the residual norm (r, r), together with the so-called heavy-quark norm as used by MILC: (1 / N) * sum_i (r, r)_i / (x, x)_i, where i is the site index and N is the number of sites. Because the ratio is formed site by site, the parameter M in the launcher must equal the number of FloatN fields used to represent the spinor at one site, e.g., M=6 for Wilson and M=3 for staggered. By default this holds only for half-precision kernels; for other precisions, the siteUnroll template parameter must be set to true when reduceCuda is instantiated.
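
As a plain reference for the three returned quantities, the following host-side sketch evaluates them on flat arrays. It is not the CUDA kernel: HeavyQuarkResult and heavy_quark_reference are hypothetical names, the site-major layout is an assumption, and so is the choice that a site with zero solution norm contributes 1 to the average.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical result type holding the three reduced quantities.
struct HeavyQuarkResult {
  double x_norm2;  // (x, x) summed over all sites
  double r_norm2;  // (r, r) summed over all sites
  double hq_norm;  // (1 / N) * sum_i (r, r)_i / (x, x)_i
};

// x and r hold N sites of site_len real components each, stored site-major.
HeavyQuarkResult heavy_quark_reference(const std::vector<double> &x,
                                       const std::vector<double> &r,
                                       std::size_t site_len) {
  assert(site_len > 0 && x.size() == r.size() && x.size() % site_len == 0);
  const std::size_t n_sites = x.size() / site_len;
  HeavyQuarkResult out{0.0, 0.0, 0.0};
  for (std::size_t i = 0; i < n_sites; ++i) {
    double xx = 0.0, rr = 0.0;  // norms restricted to site i
    for (std::size_t k = 0; k < site_len; ++k) {
      const double xv = x[i * site_len + k];
      const double rv = r[i * site_len + k];
      xx += xv * xv;
      rr += rv * rv;
    }
    out.x_norm2 += xx;
    out.r_norm2 += rr;
    // Assumption: a site with zero solution norm contributes 1.
    out.hq_norm += (xx > 0.0) ? rr / xx : 1.0;
  }
  if (n_sites > 0) out.hq_norm /= static_cast<double>(n_sites);
  return out;
}
```

The per-site average is in general not equal to the global ratio (r, r) / (x, x), which is why all components of a site have to be accumulated before the division is taken.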

Definition at line 680 of file reduce_quda.cu.

Member Typedef Documentation

◆ real

template<typename ReduceType, typename Float2, typename FloatN>

typedef scalar<ReduceType>::type quda::blas::HeavyQuarkResidualNorm_< ReduceType, Float2, FloatN >::real

Definition at line 681 of file reduce_quda.cu.

Constructor & Destructor Documentation

◆ HeavyQuarkResidualNorm_()

template<typename ReduceType, typename Float2, typename FloatN>

quda::blas::HeavyQuarkResidualNorm_< ReduceType, Float2, FloatN >::HeavyQuarkResidualNorm_ (const Float2 &a, const Float2 &b)

inline

Definition at line 685 of file reduce_quda.cu.

Member Function Documentation

◆ flops()

template<typename ReduceType, typename Float2, typename FloatN>

static int quda::blas::HeavyQuarkResidualNorm_< ReduceType, Float2, FloatN >::flops ( )

inline static

floating-point operation count for this functor

Definition at line 700 of file reduce_quda.cu.

◆ operator()()

template<typename ReduceType, typename Float2, typename FloatN>

__device__ __host__ void quda::blas::HeavyQuarkResidualNorm_< ReduceType, Float2, FloatN >::operator() (ReduceType &sum, FloatN &x, FloatN &y, FloatN &z, FloatN &w, FloatN &v)