QUDA
v1.1.0
A library for QCD on GPUs
#include <transform_reduce.h>
Public Member Functions | |
TransformReduceArg (const std::vector< T * > &v, count_t n_items, transformer h, reduce_t init, reducer r) | |
Public Member Functions inherited from quda::ReduceArg< reduce_t > | |
ReduceArg (int n_reduce=1) | |
void | complete (std::vector< host_t > &result, const qudaStream_t stream=0, bool reset=false) |
Finalize the reduction and store the computed value in result. With heterogeneous atomics, the host polls the atomics until their value differs from init_value. The alternative legacy path posts an event after the kernel and then polls for completion of that event. | |
void | complete (host_t &result, const qudaStream_t stream=0, bool reset=false) |
Overload providing a simple reference interface. | |
__device__ void | reduce2d (const reduce_t &in, const int idx=0) |
Generic reduction function that reduces block-distributed data "in" per thread to a single value. This is the legacy variant, which requires explicit host-device synchronization to signal the completion of the reduction to the host. | |
__device__ void | reduce (const reduce_t &in, const int idx=0) |
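The polling behaviour described for complete() can be illustrated with a host-only analogue. This is a minimal sketch and not QUDA code: it assumes a single `std::atomic<long>` slot standing in for the heterogeneous atomics, a producer thread standing in for the kernel, and a sentinel that the reduction itself can never produce. The names `publish`, `poll_complete`, and `run_demo` are hypothetical, and the `stream` and `reset` parameters of complete() are deliberately left out.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical sentinel standing in for "init_value": it must be a value
// that the reduction itself can never produce.
constexpr long init_value = -1;

// Plays the role of the kernel: publishes the reduced value exactly once.
inline void publish(std::atomic<long> &slot, long value)
{
  slot.store(value, std::memory_order_release);
}

// Plays the role of complete(): spins until the slot no longer holds the
// sentinel, then returns the published value.
inline long poll_complete(const std::atomic<long> &slot)
{
  long value;
  while ((value = slot.load(std::memory_order_acquire)) == init_value) {
    std::this_thread::yield(); // be polite while waiting
  }
  return value;
}

// Sums 1..n on a worker thread and waits for the result by polling.
inline long run_demo(long n)
{
  assert(n >= 0); // a non-negative sum can never equal the sentinel
  std::atomic<long> slot(init_value);
  std::thread producer([&slot, n] {
    long sum = 0;
    for (long i = 1; i <= n; i++) sum += i;
    publish(slot, sum);
  });
  long result = poll_complete(slot);
  producer.join();
  return result;
}
```

Calling `run_demo(100)` waits on the atomic rather than on a separate completion event, which is the distinction the description draws between the heterogeneous-atomic path and the legacy path.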
Public Attributes | |
const T * | v [n_batch_max] |
count_t | n_items |
int | n_batch |
reduce_t | init |
reduce_t | result [n_batch_max] |
transformer | h |
reducer | r |
Public Attributes inherited from quda::ReduceArg< reduce_t > | |
qudaError_t | launch_error |
Static Public Attributes | |
static constexpr int | block_size = 512 |
static constexpr int | n_batch_max = 8 |
Definition at line 35 of file transform_reduce.h.
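To make the roles of the members listed above concrete, the following host-only sketch mirrors their layout and shows one plausible way they combine: `h` is applied to every item, and `r` folds the transformed values into `result[j]` starting from `init`, once per batch. Everything here is an assumption for illustration: `HostTransformReduceArg`, `host_transform_reduce`, `Square`, and `Plus` are hypothetical names, the batch-count check is a guess at sensible behaviour, and the real reduction runs on the GPU through the inherited `ReduceArg` machinery.

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

// Hypothetical host-side analogue of TransformReduceArg; not QUDA code.
template <typename reduce_t, typename T, typename count_t, typename transformer, typename reducer>
struct HostTransformReduceArg {
  static constexpr int n_batch_max = 8; // same cap as the documented static member
  const T *v[n_batch_max];              // one input pointer per batch
  count_t n_items;                      // number of items in each batch
  int n_batch;                          // number of batches actually in use
  reduce_t init;                        // starting value of every reduction
  reduce_t result[n_batch_max];         // one reduced value per batch
  transformer h;                        // applied to each item
  reducer r;                            // combines two reduce_t values

  HostTransformReduceArg(const std::vector<T *> &v_, count_t n_items, transformer h, reduce_t init, reducer r) :
    n_items(n_items), n_batch(static_cast<int>(v_.size())), init(init), h(h), r(r)
  {
    // Assumption: refuse more batches than the fixed-size arrays can hold.
    if (n_batch > n_batch_max) throw std::invalid_argument("too many batches");
    for (int j = 0; j < n_batch; j++) v[j] = v_[j];
  }
};

// Serial reference: result[j] = r(... r(r(init, h(v[j][0])), h(v[j][1])) ...).
template <typename reduce_t, typename T, typename count_t, typename transformer, typename reducer>
void host_transform_reduce(HostTransformReduceArg<reduce_t, T, count_t, transformer, reducer> &arg)
{
  assert(arg.n_batch <= arg.n_batch_max);
  for (int j = 0; j < arg.n_batch; j++) {
    reduce_t acc = arg.init;
    for (count_t i = 0; i < arg.n_items; i++) acc = arg.r(acc, arg.h(arg.v[j][i]));
    arg.result[j] = acc;
  }
}

// Example functors: a sum of squares.
struct Square {
  double operator()(double x) const { return x * x; }
};
struct Plus {
  double operator()(double a, double b) const { return a + b; }
};
```

With two batches `{1, 2, 3}` and `{0, 4, 0}`, instantiating `HostTransformReduceArg<double, double, int, Square, Plus>` with `init = 0.0` and calling `host_transform_reduce` fills `result[0]` and `result[1]` with the two sums of squares.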
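quda::TransformReduceArg< reduce_t, T, count_t, transformer, reducer >::TransformReduceArg (const std::vector< T * > &v, count_t n_items, transformer h, reduce_t init, reducer r) |
inline |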
Definition at line 45 of file transform_reduce.h.
static constexpr int quda::TransformReduceArg< reduce_t, T, count_t, transformer, reducer >::block_size = 512 |
Definition at line 36 of file transform_reduce.h.
transformer quda::TransformReduceArg< reduce_t, T, count_t, transformer, reducer >::h |
Definition at line 43 of file transform_reduce.h.
reduce_t quda::TransformReduceArg< reduce_t, T, count_t, transformer, reducer >::init |
Definition at line 41 of file transform_reduce.h.
int quda::TransformReduceArg< reduce_t, T, count_t, transformer, reducer >::n_batch |
Definition at line 40 of file transform_reduce.h.
static constexpr int quda::TransformReduceArg< reduce_t, T, count_t, transformer, reducer >::n_batch_max = 8 |
Definition at line 37 of file transform_reduce.h.
count_t quda::TransformReduceArg< reduce_t, T, count_t, transformer, reducer >::n_items |
Definition at line 39 of file transform_reduce.h.
reducer quda::TransformReduceArg< reduce_t, T, count_t, transformer, reducer >::r |
Definition at line 44 of file transform_reduce.h.
reduce_t quda::TransformReduceArg< reduce_t, T, count_t, transformer, reducer >::result[n_batch_max] |
Definition at line 42 of file transform_reduce.h.
const T* quda::TransformReduceArg< reduce_t, T, count_t, transformer, reducer >::v[n_batch_max] |
Definition at line 38 of file transform_reduce.h.