Inheritance diagram for quda::blas::xpyHeavyQuarkResidualNorm_< ReduceType, Float2, FloatN >:

Collaboration diagram for quda::blas::xpyHeavyQuarkResidualNorm_< ReduceType, Float2, FloatN >:

Public Types
typedef scalar< ReduceType >::type	real

Public Member Functions
	xpyHeavyQuarkResidualNorm_ (const Float2 &a, const Float2 &b)

__device__ __host__ void	pre ()
	pre-computation routine called before the "M-loop"

__device__ __host__ void	operator() (ReduceType &sum, FloatN &x, FloatN &y, FloatN &z, FloatN &w, FloatN &v)
	where the reduction is usually computed, along with any auxiliary operations

__device__ __host__ void	post (ReduceType &sum)
	sum the solution and residual norms, and compute the heavy-quark norm
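
The three member functions above form a per-site protocol: pre() runs before the inner loop, operator() runs once per inner-loop step, and post() folds the site's result into the running sum. The following host-only C++ sketch illustrates that call order. It is not the QUDA implementation: `Sum3`, `SiteFunctorSketch`, `reduce_sites`, the scalar element type, the site layout of `m` consecutive elements, and the per-site ratio with its fallback value of 1 are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for ReduceType: three accumulated components.
struct Sum3 {
  double x = 0.0; // accumulated solution norm squared
  double y = 0.0; // accumulated residual norm squared
  double z = 0.0; // accumulated per-site ratio (assumed heavy-quark term)
};

// Host-only sketch of the pre / operator() / post call protocol.
struct SiteFunctorSketch {
  Sum3 aux; // per-site scratch accumulator, mirroring the `aux` attribute

  // Reset the per-site accumulator before the inner loop.
  void pre() { aux = Sum3{}; }

  // One inner-loop step: add x and y to form the solution element,
  // then accumulate the solution and residual contributions.
  void operator()(double x, double y, double r) {
    const double s = x + y;
    aux.x += s * s;
    aux.y += r * r;
  }

  // Fold this site's norms into the running sum and add the site ratio.
  // The fallback of 1 for a zero solution norm is an assumption.
  void post(Sum3 &sum) const {
    sum.x += aux.x;
    sum.y += aux.y;
    sum.z += (aux.x > 0.0) ? aux.y / aux.x : 1.0;
  }
};

// Hypothetical driver: `m` consecutive elements make up one site.
inline Sum3 reduce_sites(const std::vector<double> &x,
                         const std::vector<double> &y,
                         const std::vector<double> &r, std::size_t m) {
  assert(m > 0 && x.size() == y.size() && x.size() == r.size());
  assert(x.size() % m == 0);
  Sum3 sum;
  SiteFunctorSketch f;
  for (std::size_t site = 0; site < x.size() / m; ++site) {
    f.pre();
    for (std::size_t j = 0; j < m; ++j) {
      const std::size_t i = site * m + j;
      f(x[i], y[i], r[i]);
    }
    f.post(sum);
  }
  return sum;
}
```

Keeping the per-site state in `aux` and touching `sum` only in post() is what lets a ratio be formed per site rather than from the global totals.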

Static Public Member Functions
static int	streams ()
	total number of input and output streams

static int	flops ()
	floating-point operation count for this functor

Public Attributes
Float2	a

Float2	b

ReduceType	aux

Detailed Description

template<typename ReduceType, typename Float2, typename FloatN>
struct quda::blas::xpyHeavyQuarkResidualNorm_< ReduceType, Float2, FloatN >

Variant of the HeavyQuarkResidualNorm kernel that takes three arguments: the first two are summed to form the solution, and the third is the residual vector. This removes the need for an additional xpy call in the solvers, improving performance.

Definition at line 719 of file reduce_quda.cu.
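
The saving described above can be sketched on the host as follows. This is an illustration under stated assumptions rather than the QUDA code: `NormPair`, `two_pass`, and `fused_pass` are hypothetical names, vectors are plain `std::vector<double>`, and only the two global norms are accumulated. Whether the real kernel writes the summed solution back to memory is not stated on this page, so the fused version here leaves its inputs untouched.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical result type: squared norms of the solution and the residual.
struct NormPair {
  double sol2 = 0.0;
  double res2 = 0.0;
};

// Unfused reference: a separate xpy pass (y += x) followed by a norm pass.
// `y` is taken by value so the caller's vector is not modified.
inline NormPair two_pass(const std::vector<double> &x, std::vector<double> y,
                         const std::vector<double> &r) {
  assert(x.size() == y.size() && x.size() == r.size());
  for (std::size_t i = 0; i < y.size(); ++i) y[i] += x[i]; // pass 1: xpy
  NormPair n;
  for (std::size_t i = 0; i < y.size(); ++i) {             // pass 2: norms
    n.sol2 += y[i] * y[i];
    n.res2 += r[i] * r[i];
  }
  return n;
}

// Fused version: form x + y on the fly and accumulate both norms in one pass.
inline NormPair fused_pass(const std::vector<double> &x,
                           const std::vector<double> &y,
                           const std::vector<double> &r) {
  assert(x.size() == y.size() && x.size() == r.size());
  NormPair n;
  for (std::size_t i = 0; i < x.size(); ++i) {
    const double s = x[i] + y[i];
    n.sol2 += s * s;
    n.res2 += r[i] * r[i];
  }
  return n;
}
```

Both functions return the same norms; the fused form reads each vector once and skips the intermediate store, which is where the benefit would come from for a memory-bound kernel.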

Member Typedef Documentation

◆ real

template<typename ReduceType , typename Float2 , typename FloatN >

typedef scalar<ReduceType>::type quda::blas::xpyHeavyQuarkResidualNorm_< ReduceType, Float2, FloatN >::real

Definition at line 720 of file reduce_quda.cu.

Constructor & Destructor Documentation

◆ xpyHeavyQuarkResidualNorm_()

template<typename ReduceType , typename Float2 , typename FloatN >

quda::blas::xpyHeavyQuarkResidualNorm_< ReduceType, Float2, FloatN >::xpyHeavyQuarkResidualNorm_ (const Float2 &a, const Float2 &b)

inline

Definition at line 724 of file reduce_quda.cu.

Member Function Documentation

◆ flops()

template<typename ReduceType , typename Float2 , typename FloatN >

static int quda::blas::xpyHeavyQuarkResidualNorm_< ReduceType, Float2, FloatN >::flops ( )

inline static

floating-point operation count for this functor

Definition at line 739 of file reduce_quda.cu.

◆ operator()()

template<typename ReduceType , typename Float2 , typename FloatN >

__device__ __host__ void quda::blas::xpyHeavyQuarkResidualNorm_< ReduceType, Float2, FloatN >::operator() (ReduceType &sum, FloatN &x, FloatN &y, FloatN &z, FloatN &w, FloatN &v)