|
template<typename Arg , typename Dslash > |
void | quda::dslash::setFusedParam (Arg &param, Dslash &dslash, const int *faceVolumeCB) |
|
template<typename Dslash > |
void | quda::dslash::issueRecv (cudaColorSpinorField &input, const Dslash &dslash, cudaStream_t *stream, bool gdr) |
| This helper function simply posts all receives in all directions.
|
|
template<typename Dslash > |
void | quda::dslash::issuePack (cudaColorSpinorField &in, const Dslash &dslash, int parity, MemoryLocation location, int packIndex) |
| This helper function simply posts the packing kernel needed for halo exchange.
|
|
template<typename Dslash > |
void | quda::dslash::issueGather (cudaColorSpinorField &in, const Dslash &dslash) |
| This helper function simply posts the device-host memory copies of all halos in all dimensions and directions.
|
|
template<typename T > |
int | quda::dslash::getStreamIndex (const T &dslashParam) |
| Returns a stream index on which to post the pack and scatter operations. We want a stream index that is not being used for peer-to-peer communication. This is used by the fused halo dslash kernels, where all scatters are posted to the same stream so that there is only a single event to wait on before the exterior kernel is applied, and by the zero-copy dslash kernels, where the packing kernel should be posted to an unused stream.
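The selection rule described for getStreamIndex can be illustrated with a minimal, self-contained sketch. Everything named here is hypothetical and not part of QUDA: the `CommState` struct, the `pickNonP2PStream` function, the `2 * dim + dir` index layout, and the fallback to index 0 when every partitioned direction uses peer-to-peer are all assumptions made for illustration only.

```cpp
#include <array>

// Hypothetical stand-in for the communication state that the real
// dslashParam would carry; not a QUDA type.
struct CommState {
  std::array<bool, 4> commDim;             // commDim[d]: is dimension d partitioned?
  std::array<std::array<bool, 2>, 4> p2p;  // p2p[d][dir]: is peer-to-peer enabled?
};

// Return the index 2*d + dir of the first partitioned (dimension, direction)
// that does not use peer-to-peer communication. The index layout is an
// assumption. Falls back to 0 if every partitioned direction uses
// peer-to-peer, or if nothing is partitioned.
int pickNonP2PStream(const CommState &s)
{
  for (int d = 0; d < 4; d++) {
    if (!s.commDim[d]) continue;  // no communication in this dimension
    for (int dir = 0; dir < 2; dir++) {
      if (!s.p2p[d][dir]) return 2 * d + dir;
    }
  }
  return 0;  // always return a valid index
}
```

Under these assumptions, a caller would post every scatter to the one stream returned here, so a single event on that stream is enough to order the exterior kernel after all of them.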
|
|
template<typename Dslash > |
bool | quda::dslash::commsComplete (cudaColorSpinorField &in, const Dslash &dslash, int dim, int dir, bool gdr_send, bool gdr_recv, bool zero_copy_recv, bool async, int scatterIndex=-1) |
| Wrapper for querying whether communication has finished in the dslash and, if it has, taking the appropriate action.
|
|
template<typename T > |
void | quda::dslash::completeDslash (const ColorSpinorField &in, const T &dslashParam) |
| Ensure that the dslash is complete. By construction, the dslash will have completed (or be in flight) on this process; however, we must also ensure that no local work begins until all communication in flight from this process to another has completed. This prevents a race condition in which a subsequent computation starts updating the local buffers before sending has finished.
|
|
template<typename Dslash > |
void | quda::dslash::setMappedGhost (Dslash &dslash, ColorSpinorField &in, bool to_mapped) |
| Sets the ghosts to the mapped CPU ghost buffer, or unsets them if already set. Note that this must not be called until after the interior dslash has been called, since that call sets the peer-to-peer ghost pointers, and this needs to be done without the mapped ghost enabled.
|
|
void | quda::dslash::enable_policy (QudaDslashPolicy p) |
|
void | quda::dslash::disable_policy (QudaDslashPolicy p) |
|