QUDA v0.4.0
A library for QCD on GPUs
quda/lib/blas_core.h File Reference


Classes

class  BlasCuda< FloatN, M, writeX, writeY, writeZ, writeW, InputX, InputY, InputZ, InputW, OutputX, OutputY, OutputZ, OutputW, Functor >

Functions

template<typename FloatN , int M, int writeX, int writeY, int writeZ, int writeW, typename InputX , typename InputY , typename InputZ , typename InputW , typename OutputX , typename OutputY , typename OutputZ , typename OutputW , typename Functor >
__global__ void blasKernel (InputX X, InputY Y, InputZ Z, InputW W, Functor f, OutputX XX, OutputY YY, OutputZ ZZ, OutputW WW, int length)
template<template< typename Float, typename FloatN > class Functor, int writeX, int writeY, int writeZ, int writeW>
void blasCuda (const int kernel, const double2 &a, const double2 &b, const double2 &c, cudaColorSpinorField &x, cudaColorSpinorField &y, cudaColorSpinorField &z, cudaColorSpinorField &w)

Function Documentation

template<template< typename Float, typename FloatN > class Functor, int writeX, int writeY, int writeZ, int writeW>
void blasCuda ( const int  kernel,
const double2 &  a,
const double2 &  b,
const double2 &  c,
cudaColorSpinorField &  x,
cudaColorSpinorField &  y,
cudaColorSpinorField &  z,
cudaColorSpinorField &  w 
)

Driver for generic blas routine with four loads and two stores.

Definition at line 119 of file blas_core.h.
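
The template-template parameter in the signature above suggests that the driver instantiates the functor itself for whatever precision the fields use. The following is a minimal host-side sketch of that pattern in plain C++, not the code in blas_core.h: Complex2 (a stand-in for CUDA's double2), AxpbySketch and blasDriverSketch are hypothetical names, std::vector replaces cudaColorSpinorField, the kernel argument is omitted, and the assumption that only the real parts of a and b are used is made purely for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for CUDA's double2 (hypothetical; keeps the sketch host-only).
struct Complex2 { double x, y; };

// Hypothetical functor computing y = a*x + b*y using the real parts of a and b.
// It follows the Functor<Float, FloatN> shape of the driver's template parameter.
template <typename Float, typename FloatN>
struct AxpbySketch {
    Float a, b;
    AxpbySketch(const Complex2 &a_, const Complex2 &b_, const Complex2 & /*c*/)
        : a(static_cast<Float>(a_.x)), b(static_cast<Float>(b_.x)) {}
    void operator()(FloatN &x, FloatN &y, FloatN & /*z*/, FloatN & /*w*/) const {
        y = a * x + b * y;
    }
};

// Hypothetical driver: builds the functor for the element type Real and
// applies it element by element. The write flags choose which fields are stored.
template <template <typename Float, typename FloatN> class Functor,
          int writeX, int writeY, int writeZ, int writeW, typename Real>
void blasDriverSketch(const Complex2 &a, const Complex2 &b, const Complex2 &c,
                      std::vector<Real> &x, std::vector<Real> &y,
                      std::vector<Real> &z, std::vector<Real> &w) {
    const std::size_t length = x.size();
    assert(y.size() == length && z.size() == length && w.size() == length);

    Functor<Real, Real> f(a, b, c);  // precision is fixed here, at instantiation
    for (std::size_t i = 0; i < length; ++i) {
        Real xi = x[i], yi = y[i], zi = z[i], wi = w[i];  // four loads
        f(xi, yi, zi, wi);
        if (writeX) x[i] = xi;  // stores selected at compile time
        if (writeY) y[i] = yi;
        if (writeZ) z[i] = zi;
        if (writeW) w[i] = wi;
    }
}
```

Under these assumptions a caller would write blasDriverSketch<AxpbySketch, 0, 1, 0, 0>(a, b, c, x, y, z, w), naming the functor template without its type arguments and letting the driver supply them.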

template<typename FloatN , int M, int writeX, int writeY, int writeZ, int writeW, typename InputX , typename InputY , typename InputZ , typename InputW , typename OutputX , typename OutputY , typename OutputZ , typename OutputW , typename Functor >
__global__ void blasKernel ( InputX  X,
InputY  Y,
InputZ  Z,
InputW  W,
Functor  f,
OutputX  XX,
OutputY  YY,
OutputZ  ZZ,
OutputW  WW,
int  length 
)

Generic blas kernel with four loads and up to four stores.

Definition at line 7 of file blas_core.h.
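
To illustrate what "four loads and up to four stores" can look like, here is a minimal host-side analogue in plain C++. It is a sketch under stated assumptions, not the body of blasKernel: the names blasLoopSketch and AxpyFunctor are hypothetical, a serial loop stands in for the CUDA thread index, and the FloatN, M and Input/Output accessor parameters are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical functor computing y = a*x + y; it only modifies y.
struct AxpyFunctor {
    double a;
    void operator()(double &x, double &y, double & /*z*/, double & /*w*/) const {
        y = a * x + y;
    }
};

// Host-side analogue of a generic element-wise kernel: every element is
// loaded from all four fields, the functor is applied, and only the fields
// whose compile-time write flag is non-zero are stored back.
template <int writeX, int writeY, int writeZ, int writeW, typename Functor>
void blasLoopSketch(std::vector<double> &X, std::vector<double> &Y,
                    std::vector<double> &Z, std::vector<double> &W,
                    Functor f) {
    const std::size_t length = X.size();
    assert(Y.size() == length && Z.size() == length && W.size() == length);

    for (std::size_t i = 0; i < length; ++i) {  // stands in for the thread index
        double x = X[i], y = Y[i], z = Z[i], w = W[i];  // four loads
        f(x, y, z, w);
        if (writeX) X[i] = x;  // up to four stores
        if (writeY) Y[i] = y;
        if (writeZ) Z[i] = z;
        if (writeW) W[i] = w;
    }
}
```

Because the write flags are template parameters, a call such as blasLoopSketch<0, 1, 0, 0>(X, Y, Z, W, AxpyFunctor{2.0}) lets the compiler discard the three unused stores entirely instead of testing the flags at run time.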
