CUDA-l2: Surpassing cuBLAS performance for matrix multiplication through RL

126 points by dzign | 14 comments