QUDA 0.9.0
Functions | |
| long long | BatchInvertMatrix (void *Ainv, void *A, const int n, const int batch, QudaPrecision precision, QudaFieldLocation location) |
| template<typename T > | |
| __global__ void | set_pointer (T **output_array_a, T *input_a, T **output_array_b, T *input_b, int batch_offset) |
long long quda::cublas::BatchInvertMatrix (void *Ainv, void *A, const int n, const int batch, QudaPrecision precision, QudaFieldLocation location)
Batch-invert the matrix field using an LU decomposition method.
| [out] | Ainv | Matrix field containing the inverse matrices |
| [in] | A | Matrix field containing the input matrices |
| [in] | n | Dimension of each matrix |
| [in] | batch | Problem batch size |
| [in] | precision | Precision of the input/output data |
| [in] | location | Location of the input/output data |
Definition at line 33 of file blas_cublas.cu.
References e, errorQuda, quda::blas::flops, FLOPS_CGETRF, FLOPS_CGETRI, handle, fused_exterior_ndeg_tm_dslash_cuda_gen::i, n, pool_device_free, pool_device_malloc, pool_pinned_free, pool_pinned_malloc, prec, printfQuda, QUDA_CPU_FIELD_LOCATION, QUDA_CUDA_FIELD_LOCATION, QUDA_DOUBLE_PRECISION, QUDA_SINGLE_PRECISION, quda::qudaDeviceSynchronize(), qudaMemcpy, size, start, time(), timeval::tv_sec, timeval::tv_usec, and warningQuda.
Referenced by quda::calculateY().
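To make "batch inversion using an LU decomposition" concrete, the following is a minimal CPU reference sketch in C++. It is not the QUDA implementation: the references to FLOPS_CGETRF and FLOPS_CGETRI above suggest the real routine delegates to batched getrf/getri library calls on the GPU. The function name `batch_invert_reference`, the row-major layout, and the contiguous storage with stride n*n per matrix are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical CPU reference (not part of QUDA): invert `batch` dense
// n x n matrices stored back to back in row-major order, so matrix b
// starts at offset b*n*n. Returns false if a zero pivot is encountered.
template <typename T>
bool batch_invert_reference(std::vector<T> &Ainv, const std::vector<T> &A, int n, int batch)
{
  assert(n > 0 && batch >= 0);
  const std::size_t stride = static_cast<std::size_t>(n) * n;
  assert(A.size() == stride * batch);
  Ainv.assign(A.size(), T(0));

  std::vector<T> lu(stride);  // in-place L (unit lower) and U factors
  std::vector<int> perm(n);   // row permutation from pivoting
  std::vector<T> x(n);        // one column of the inverse

  for (int b = 0; b < batch; b++) {
    // Work on a copy so the input matrix is left untouched.
    std::copy(A.begin() + b * stride, A.begin() + (b + 1) * stride, lu.begin());
    for (int i = 0; i < n; i++) perm[i] = i;

    // LU factorization with partial pivoting: P*A = L*U.
    for (int k = 0; k < n; k++) {
      int p = k;
      for (int i = k + 1; i < n; i++)
        if (std::abs(lu[i * n + k]) > std::abs(lu[p * n + k])) p = i;
      if (std::abs(lu[p * n + k]) == 0.0) return false;  // singular matrix
      if (p != k) {
        for (int j = 0; j < n; j++) std::swap(lu[k * n + j], lu[p * n + j]);
        std::swap(perm[k], perm[p]);
      }
      for (int i = k + 1; i < n; i++) {
        lu[i * n + k] /= lu[k * n + k];
        for (int j = k + 1; j < n; j++) lu[i * n + j] -= lu[i * n + k] * lu[k * n + j];
      }
    }

    // Solve A*x = e_c for every unit vector e_c to build the inverse.
    T *inv = Ainv.data() + b * stride;
    for (int c = 0; c < n; c++) {
      // Forward substitution: L*y = P*e_c (y is stored in x).
      for (int i = 0; i < n; i++) {
        T s = (perm[i] == c) ? T(1) : T(0);
        for (int j = 0; j < i; j++) s -= lu[i * n + j] * x[j];
        x[i] = s;
      }
      // Back substitution: U*x = y.
      for (int i = n - 1; i >= 0; i--) {
        T s = x[i];
        for (int j = i + 1; j < n; j++) s -= lu[i * n + j] * x[j];
        x[i] = s / lu[i * n + i];
      }
      for (int i = 0; i < n; i++) inv[i * n + c] = x[i];
    }
  }
  return true;
}
```

The template also accepts `std::complex<float>` or `std::complex<double>`, which is closer to the complex matrices the FLOPS_C* names imply. A reference like this is mainly useful for validating results from a batched GPU path on small inputs.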


template<typename T > __global__ void quda::cublas::set_pointer (T **output_array_a, T *input_a, T **output_array_b, T *input_b, int batch_offset)
Definition at line 26 of file blas_cublas.cu.
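This page gives only the signature of set_pointer, not its body. One plausible reading, stated here as an assumption, is that the kernel fills two arrays of per-matrix pointers, because batched cuBLAS routines take arrays of device pointers rather than a single base pointer. The host-side C++ analogue below sketches that idea. The name `set_pointer_host`, the use of `std::vector`, and the reading of `batch_offset` as the number of elements between consecutive matrices are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side analogue (not QUDA code): entry b of each output
// array points at the start of matrix b inside a contiguous buffer.
// ASSUMPTION: batch_offset is the element count between consecutive matrices.
template <typename T>
void set_pointer_host(std::vector<T *> &output_array_a, T *input_a,
                      std::vector<T *> &output_array_b, T *input_b,
                      int batch_offset)
{
  assert(output_array_a.size() == output_array_b.size());
  for (std::size_t b = 0; b < output_array_a.size(); b++) {
    // On a GPU, each batch index would presumably be handled by one
    // thread or block instead of a loop iteration.
    output_array_a[b] = input_a + b * batch_offset;
    output_array_b[b] = input_b + b * batch_offset;
  }
}
```

For a batch of n x n matrices stored back to back, `batch_offset` would be n*n under this reading, and the two pointer arrays would address the input and output matrices respectively.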