QUDA
v1.1.0
A library for QCD on GPUs
Functions | |
void | init () |
Create the BLAS context. | |
void | destroy () |
Destroy the BLAS context. | |
long long | BatchInvertMatrix (void *Ainv, void *A, const int n, const uint64_t batch, QudaPrecision precision, QudaFieldLocation location) |
Batch inversion of the matrix field using an LU decomposition method. | |
long long | stridedBatchGEMM (void *A, void *B, void *C, QudaBLASParam blas_param, QudaFieldLocation location) |
Strided batch GEMM. This function performs N GEMM-type operations in a strided batched fashion. | |
template<typename EigenMatrix , typename Float > | |
void | invertEigen (std::complex< Float > *A_eig, std::complex< Float > *Ainv_eig, int n, uint64_t batch) |
template<typename EigenMat , typename T > | |
void | fillArray (EigenMat &EigenArr, T *arr, int rows, int cols, int ld, int offset, bool fill_eigen) |
template<typename EigenMat , typename T > | |
void | GEMM (void *A_h, void *B_h, void *C_h, T alpha, T beta, int max_stride, QudaBLASParam &blas_param) |
The generic namespace is where we can deploy any target-independent blas/lapack operations that are not supported on the native target. To this end, we use Eigen on the host.
long long quda::blas_lapack::generic::BatchInvertMatrix | ( | void * | Ainv, |
void * | A, | ||
const int | n, | ||
const uint64_t | batch, | ||
QudaPrecision | precision, | ||
QudaFieldLocation | location | ||
) |
Batch inversion of the matrix field using an LU decomposition method.
[out] | Ainv | Matrix field containing the inverse matrices |
[in] | A | Matrix field containing the input matrices |
[in] | n | Dimension of each matrix |
[in] | batch | Problem batch size |
[in] | precision | Precision of the input/output data |
[in] | location | Location of the input/output data |
Definition at line 52 of file blas_lapack_eigen.cpp.
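To illustrate what a batched LU-based inversion does, here is a minimal host-side sketch. It is not the code in blas_lapack_eigen.cpp: it swaps Eigen for a hand-written LU factorization with partial pivoting so that it builds with the standard library alone. The names `invertLuSketch` and `batchInvertSketch` are hypothetical, and the sketch assumes double-precision complex data stored row-major with the matrices of a batch packed back to back. The return value (the number of matrices inverted) is a choice made for this sketch and says nothing about what `BatchInvertMatrix` returns.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

using Complex = std::complex<double>;

// Invert one n x n row-major matrix via LU decomposition with partial pivoting.
// Returns false if the matrix is singular (exactly zero pivot).
inline bool invertLuSketch(const Complex *A, Complex *Ainv, int n)
{
  std::vector<Complex> lu(A, A + std::size_t(n) * n); // working copy, overwritten with L and U
  std::vector<int> perm(n);                           // row i of P*A is row perm[i] of A
  for (int i = 0; i < n; i++) perm[i] = i;

  for (int col = 0; col < n; col++) {
    // Choose the row with the largest magnitude in this column as the pivot.
    int pivot = col;
    for (int row = col + 1; row < n; row++)
      if (std::abs(lu[row * n + col]) > std::abs(lu[pivot * n + col])) pivot = row;
    if (std::abs(lu[pivot * n + col]) == 0.0) return false;
    if (pivot != col) {
      for (int j = 0; j < n; j++) std::swap(lu[col * n + j], lu[pivot * n + j]);
      std::swap(perm[col], perm[pivot]);
    }
    // Eliminate below the pivot; the multipliers are stored in the L part.
    for (int row = col + 1; row < n; row++) {
      lu[row * n + col] /= lu[col * n + col];
      for (int j = col + 1; j < n; j++) lu[row * n + j] -= lu[row * n + col] * lu[col * n + j];
    }
  }

  // Solve L U x = P e_c for every unit vector e_c; x is column c of the inverse.
  std::vector<Complex> x(n);
  for (int c = 0; c < n; c++) {
    for (int i = 0; i < n; i++) { // forward substitution (L has a unit diagonal)
      x[i] = (perm[i] == c) ? Complex(1.0) : Complex(0.0);
      for (int j = 0; j < i; j++) x[i] -= lu[i * n + j] * x[j];
    }
    for (int i = n - 1; i >= 0; i--) { // back substitution with U
      for (int j = i + 1; j < n; j++) x[i] -= lu[i * n + j] * x[j];
      x[i] /= lu[i * n + i];
    }
    for (int i = 0; i < n; i++) Ainv[i * n + c] = x[i];
  }
  return true;
}

// Invert `batch` matrices stored contiguously; returns how many were inverted.
inline std::uint64_t batchInvertSketch(const std::vector<Complex> &A, std::vector<Complex> &Ainv,
                                       int n, std::uint64_t batch)
{
  assert(n > 0);
  const std::size_t size = std::size_t(n) * n;
  assert(A.size() >= batch * size && Ainv.size() >= batch * size);
  std::uint64_t inverted = 0;
  for (std::uint64_t b = 0; b < batch; b++)
    if (invertLuSketch(A.data() + b * size, Ainv.data() + b * size, n)) inverted++;
  return inverted;
}
```

Each matrix in the batch is factorized independently, which is why the operation parallelizes naturally over the batch index.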
void quda::blas_lapack::generic::destroy | ( | ) |
Destroy the BLAS context.
Definition at line 21 of file blas_lapack_eigen.cpp.
void quda::blas_lapack::generic::fillArray | ( | EigenMat & | EigenArr, |
T * | arr, | ||
int | rows, | ||
int | cols, | ||
int | ld, | ||
int | offset, | ||
bool | fill_eigen | ||
) |
Definition at line 115 of file blas_lapack_eigen.cpp.
void quda::blas_lapack::generic::GEMM | ( | void * | A_h, |
void * | B_h, | ||
void * | C_h, | ||
T | alpha, | ||
T | beta, | ||
int | max_stride, | ||
QudaBLASParam & | blas_param | ||
) |
Definition at line 131 of file blas_lapack_eigen.cpp.
void quda::blas_lapack::generic::init | ( | ) |
Create the BLAS context.
Definition at line 19 of file blas_lapack_eigen.cpp.
void quda::blas_lapack::generic::invertEigen | ( | std::complex< Float > * | A_eig, |
std::complex< Float > * | Ainv_eig, | ||
int | n, | ||
uint64_t | batch | ||
) |
Definition at line 26 of file blas_lapack_eigen.cpp.
long long quda::blas_lapack::generic::stridedBatchGEMM | ( | void * | A, |
void * | B, | ||
void * | C, | ||
QudaBLASParam | blas_param, | ||
QudaFieldLocation | location | ||
) |
Strided batch GEMM. This function performs N GEMM-type operations in a strided batched fashion. If the user passes
stride<A,B,C> = -1
the routine deduces the strides for the A, B, and C arrays from the matrix dimensions, leading dimensions, etc., and behaves identically to the batched GEMM. If any of the stride<A,B,C> values passed in the parameter structure is greater than or equal to 0, the routine uses the user's value instead.
Example: If the user passes
a_stride = 0
the routine will use only the first matrix in the A array and compute
C_{n} <- a * A_{0} * B_{n} + b * C_{n}
where n is the batch index.
[in] | A | Matrix field containing the A input matrices |
[in] | B | Matrix field containing the B input matrices |
[in,out] | C | Matrix field containing the result, and matrix to be added |
[in] | blas_param | Parameter structure defining the GEMM type |
[in] | location | Location of the input/output data |
Definition at line 204 of file blas_lapack_eigen.cpp.
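The stride semantics described above can be sketched with a small reference loop. This is an illustration only, not the library implementation: `GemmSketchParam` and `stridedBatchGemmSketch` are hypothetical names and do not mirror `QudaBLASParam`. The sketch assumes row-major storage, no transposes, strides counted in whole matrices, and that a negative stride is deduced as one matrix per batch entry.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

using Complex = std::complex<double>;

// Hypothetical parameter bundle for this sketch (not QudaBLASParam).
struct GemmSketchParam {
  int m, n, k;                      // A is m x k, B is k x n, C is m x n
  int a_stride, b_stride, c_stride; // in whole matrices; negative means "deduce"
  Complex alpha, beta;
  int batch_count;
};

// For each batch index i, computes
//   C_{i*c_stride} <- alpha * A_{i*a_stride} * B_{i*b_stride} + beta * C_{i*c_stride}
inline void stridedBatchGemmSketch(const std::vector<Complex> &A, const std::vector<Complex> &B,
                                   std::vector<Complex> &C, const GemmSketchParam &p)
{
  assert(p.m > 0 && p.n > 0 && p.k > 0 && p.batch_count >= 0);
  const std::size_t a_size = std::size_t(p.m) * p.k;
  const std::size_t b_size = std::size_t(p.k) * p.n;
  const std::size_t c_size = std::size_t(p.m) * p.n;

  // Assumption: a deduced stride advances by exactly one matrix per batch entry.
  const std::size_t a_stride = p.a_stride < 0 ? 1 : std::size_t(p.a_stride);
  const std::size_t b_stride = p.b_stride < 0 ? 1 : std::size_t(p.b_stride);
  const std::size_t c_stride = p.c_stride < 0 ? 1 : std::size_t(p.c_stride);

  for (int batch = 0; batch < p.batch_count; batch++) {
    // A stride of 0 keeps the pointer on the first matrix for every batch entry.
    const Complex *a = A.data() + batch * a_stride * a_size;
    const Complex *b = B.data() + batch * b_stride * b_size;
    Complex *c = C.data() + batch * c_stride * c_size;

    for (int i = 0; i < p.m; i++) {
      for (int j = 0; j < p.n; j++) {
        Complex sum(0.0, 0.0);
        for (int l = 0; l < p.k; l++) sum += a[i * p.k + l] * b[l * p.n + j];
        c[i * p.n + j] = p.alpha * sum + p.beta * c[i * p.n + j];
      }
    }
  }
}
```

With `a_stride = 0` and deduced strides for B and C, the A array needs to hold only a single matrix, which is then applied to every B_{n} in the batch.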