QUDA  v1.1.0
A library for QCD on GPUs
quda::blas_lapack::generic Namespace Reference

Functions

void init ()
 Create the BLAS context.
 
void destroy ()
 Destroy the BLAS context.
 
long long BatchInvertMatrix (void *Ainv, void *A, const int n, const uint64_t batch, QudaPrecision precision, QudaFieldLocation location)
 Batch-invert the matrix field using an LU decomposition method.
 
long long stridedBatchGEMM (void *A, void *B, void *C, QudaBLASParam blas_param, QudaFieldLocation location)
 Strided batch GEMM. Performs N GEMM-type operations in a strided, batched fashion.
 
template<typename EigenMatrix , typename Float >
void invertEigen (std::complex< Float > *A_eig, std::complex< Float > *Ainv_eig, int n, uint64_t batch)
 
template<typename EigenMat , typename T >
void fillArray (EigenMat &EigenArr, T *arr, int rows, int cols, int ld, int offset, bool fill_eigen)
 
template<typename EigenMat , typename T >
void GEMM (void *A_h, void *B_h, void *C_h, T alpha, T beta, int max_stride, QudaBLASParam &blas_param)
 

Detailed Description

The generic namespace is where we can deploy any target-independent blas/lapack operations that are not supported on the native target. To this end, we use Eigen on the host.

Function Documentation

◆ BatchInvertMatrix()

long long quda::blas_lapack::generic::BatchInvertMatrix ( void *  Ainv,
void *  A,
const int  n,
const uint64_t  batch,
QudaPrecision  precision,
QudaFieldLocation  location 
)

Batch-invert the matrix field using an LU decomposition method.

Parameters
[out] Ainv  Matrix field containing the inverse matrices
[in] A  Matrix field containing the input matrices
[in] n  Dimension of each matrix
[in] batch  Problem batch size
[in] precision  Precision of the input/output data
[in] location  Location of the input/output data
Returns
Number of flops done in this computation

Definition at line 52 of file blas_lapack_eigen.cpp.
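
The following is a minimal sketch of what batched inversion by LU decomposition means, not the QUDA implementation (which, per the namespace description, uses Eigen on the host). The helper name batch_invert_lu, the row-major back-to-back storage layout, the fixed double precision, and the bool return value are all assumptions made for illustration; the real routine takes void pointers plus precision and location arguments and returns a flop count.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

using Complex = std::complex<double>;

// Hypothetical helper (not part of QUDA): invert `batch` row-major n x n
// matrices stored back to back in A, writing the inverses to Ainv.
// Returns false if an exactly zero pivot is found (singular matrix).
bool batch_invert_lu(const Complex *A, Complex *Ainv, int n, std::uint64_t batch)
{
  assert(n > 0);
  const std::size_t mat = static_cast<std::size_t>(n) * n;
  std::vector<Complex> lu(mat); // in-place L (unit diagonal) and U factors
  std::vector<int> perm(n);     // row permutation from pivoting
  std::vector<Complex> x(n);    // one column of the inverse

  for (std::uint64_t b = 0; b < batch; b++) {
    const Complex *a = A + b * mat;
    Complex *inv = Ainv + b * mat;
    std::copy(a, a + mat, lu.begin());
    for (int i = 0; i < n; i++) perm[i] = i;

    // LU factorisation with partial pivoting: P * A = L * U.
    for (int k = 0; k < n; k++) {
      int p = k;
      for (int i = k + 1; i < n; i++)
        if (std::abs(lu[i * n + k]) > std::abs(lu[p * n + k])) p = i;
      if (std::abs(lu[p * n + k]) == 0.0) return false;
      if (p != k) {
        for (int j = 0; j < n; j++) std::swap(lu[k * n + j], lu[p * n + j]);
        std::swap(perm[k], perm[p]);
      }
      for (int i = k + 1; i < n; i++) {
        lu[i * n + k] /= lu[k * n + k];
        for (int j = k + 1; j < n; j++) lu[i * n + j] -= lu[i * n + k] * lu[k * n + j];
      }
    }

    // Solve L * U * x = P * e_c for each unit vector e_c;
    // the solution x is column c of the inverse.
    for (int c = 0; c < n; c++) {
      for (int i = 0; i < n; i++) { // forward substitution
        Complex s = (perm[i] == c) ? Complex(1.0) : Complex(0.0);
        for (int j = 0; j < i; j++) s -= lu[i * n + j] * x[j];
        x[i] = s;
      }
      for (int i = n - 1; i >= 0; i--) { // backward substitution
        Complex s = x[i];
        for (int j = i + 1; j < n; j++) s -= lu[i * n + j] * x[j];
        x[i] = s / lu[i * n + i];
      }
      for (int i = 0; i < n; i++) inv[i * n + c] = x[i];
    }
  }
  return true;
}
```

Each matrix in the batch is factorised and inverted independently, which is why the batch size appears only as the outer loop bound.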

◆ destroy()

void quda::blas_lapack::generic::destroy ( )

Destroy the BLAS context.

Definition at line 21 of file blas_lapack_eigen.cpp.

◆ fillArray()

template<typename EigenMat , typename T >
void quda::blas_lapack::generic::fillArray ( EigenMat &  EigenArr,
T *  arr,
int  rows,
int  cols,
int  ld,
int  offset,
bool  fill_eigen 
)

Definition at line 115 of file blas_lapack_eigen.cpp.

◆ GEMM()

template<typename EigenMat , typename T >
void quda::blas_lapack::generic::GEMM ( void *  A_h,
void *  B_h,
void *  C_h,
T  alpha,
T  beta,
int  max_stride,
QudaBLASParam &  blas_param 
)

Definition at line 131 of file blas_lapack_eigen.cpp.

◆ init()

void quda::blas_lapack::generic::init ( )

Create the BLAS context.

Definition at line 19 of file blas_lapack_eigen.cpp.

◆ invertEigen()

template<typename EigenMatrix , typename Float >
void quda::blas_lapack::generic::invertEigen ( std::complex< Float > *  A_eig,
std::complex< Float > *  Ainv_eig,
int  n,
uint64_t  batch 
)

Definition at line 26 of file blas_lapack_eigen.cpp.

◆ stridedBatchGEMM()

long long quda::blas_lapack::generic::stridedBatchGEMM ( void *  A,
void *  B,
void *  C,
QudaBLASParam  blas_param,
QudaFieldLocation  location 
)

Strided batch GEMM. This function performs N GEMM-type operations in a strided, batched fashion. If the user passes

stride<A,B,C> = -1

the routine deduces the strides for the A, B, and C arrays from the matrix dimensions, leading dimensions, etc., and behaves identically to the batched GEMM. If any of the stride<A,B,C> values passed in the parameter structure is greater than or equal to 0, the routine uses the user's value instead.

Example: If the user passes

a_stride = 0

the routine will use only the first matrix in the A array and compute

C_{n} <- a * A_{0} * B_{n} + b * C_{n}

where n is the batch index.

Parameters
[in] A  Matrix field containing the A input matrices
[in] B  Matrix field containing the B input matrices
[in,out] C  Matrix field containing the result, and the matrix to be added
[in] blas_param  Parameter structure defining the GEMM type
[in] location  Location of the input/output data
Returns
Number of flops done in this computation

Definition at line 204 of file blas_lapack_eigen.cpp.
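
The stride semantics described above can be illustrated with a small reference loop. This is a sketch under stated assumptions, not the QUDA code: the helper name strided_batch_gemm_ref is hypothetical, matrices are taken to be square, row-major, and untransposed, strides are counted in whole matrices, and a negative stride is "deduced" as one matrix per batch step. The real routine reads its configuration from QudaBLASParam, which is not reproduced here.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

using Complex = std::complex<double>;

// Hypothetical reference helper (not part of QUDA). For each batch index i:
//   C_{i*c_stride} <- alpha * A_{i*a_stride} * B_{i*b_stride} + beta * C_{i*c_stride}
// Assumptions: n x n row-major matrices, strides in units of whole matrices.
// A negative stride is replaced by 1 (contiguous batch); a stride of 0 reuses
// the first matrix of that array for every batch index.
void strided_batch_gemm_ref(const Complex *A, const Complex *B, Complex *C, int n, int batch,
                            long a_stride, long b_stride, long c_stride,
                            Complex alpha, Complex beta)
{
  assert(n > 0 && batch >= 0);
  const long mat = static_cast<long>(n) * n;
  if (a_stride < 0) a_stride = 1;
  if (b_stride < 0) b_stride = 1;
  if (c_stride < 0) c_stride = 1;

  for (long i = 0; i < batch; i++) {
    const Complex *a = A + i * a_stride * mat;
    const Complex *b = B + i * b_stride * mat;
    Complex *c = C + i * c_stride * mat;
    for (int row = 0; row < n; row++) {
      for (int col = 0; col < n; col++) {
        Complex sum(0.0);
        for (int k = 0; k < n; k++) sum += a[row * n + k] * b[k * n + col];
        c[row * n + col] = alpha * sum + beta * c[row * n + col];
      }
    }
  }
}
```

Calling this helper with a_stride = 0 and negative strides for B and C reproduces the example in the text: the A array needs to hold only one matrix, which is multiplied into every B_{n}.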