[NVPTX] make load on global readonly memory to use ldg
author Jingyue Wu <jingyue@google.com>
Mon, 20 Jul 2015 21:28:54 +0000 (21:28 +0000)
committer Jingyue Wu <jingyue@google.com>
Mon, 20 Jul 2015 21:28:54 +0000 (21:28 +0000)
commit c9f86c12604a377cf27a5627c11ced288c31cf1e
tree c6bd5c125db87af5e3e9139059c4456566486b86
parent d94e17bde92fe7e4411cf07b0cebaa772fc808ad
[NVPTX] make load on global readonly memory to use ldg

Summary:
As described in [1], nvcc may use ld.global.nc to load memory when
__restrict__ is used and the compiler can determine that the read-only
data cache is safe to use.
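
For illustration, the source pattern in question looks roughly like the function below. This is a hypothetical example, not code from the patch or from SHOC: the name scale_add and its parameters are invented, and it is written as a plain C++ function (in CUDA it would be a __global__ kernel) so that it compiles with a host compiler that accepts __restrict__, such as GCC or Clang.

```cpp
#include <cstddef>

// Hypothetical example; in CUDA this would be declared __global__ and
// indexed by thread, but the pointer qualifiers are the point here.
void scale_add(const float* __restrict__ in,  // only read, never written
               float* __restrict__ out,       // promised not to alias `in`
               float a, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        // No store in this function can change what `in` points to, so
        // loads from `in` are candidates for the read-only data cache.
        out[i] = a * in[i] + out[i];
    }
}
```

Whether a given compiler actually emits ld.global.nc for the loads from `in` depends on its own safety analysis; the qualifiers only make that analysis possible.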

This patch checks whether ldg (ld.global.nc) is safe to use and, where
it is, uses it in place of ld.global. The change improves performance by
18-29% on the affected kernels (ratt*_kernel and rwdot*_kernel) in the
S3D benchmark of SHOC [2].
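
The shape of such a check can be sketched as a small standalone model. Everything here is an assumption made for illustration: the types LoadInfo and LoadOp, the function selectLoad, and the specific conditions (global address space, read-only, no-alias, target support) are invented and are not the code in NVPTXISelDAGToDAG.cpp, whose actual conditions may differ.

```cpp
// Hypothetical model of the instruction choice; all names are invented.
enum class AddrSpace { Generic, Global, Shared, Local };

struct LoadInfo {
    AddrSpace space;  // address space of the pointer being loaded from
    bool readOnly;    // assumed: memory is never written while the kernel runs
    bool noAlias;     // assumed: pointer has a __restrict__-style guarantee
};

enum class LoadOp { LdGlobal, LdGlobalNc, Other };

// Pick the non-coherent load only when every assumed safety condition
// holds; otherwise fall back to the ordinary global load.
LoadOp selectLoad(const LoadInfo& load, bool targetHasLdg) {
    if (load.space != AddrSpace::Global)
        return LoadOp::Other;  // ldg only applies to global memory
    if (targetHasLdg && load.readOnly && load.noAlias)
        return LoadOp::LdGlobalNc;
    return LoadOp::LdGlobal;
}
```

The fallback matters: when any condition cannot be proven, the sketch keeps ld.global, so an inconclusive analysis costs performance but not correctness.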

Patch by Xuetian Weng.

[1] http://docs.nvidia.com/cuda/kepler-tuning-guide/#read-only-data-cache
[2] https://github.com/vetter/shoc

Test Plan: test/CodeGen/NVPTX/load-with-non-coherent-cache.ll

Reviewers: jholewinski, jingyue

Subscribers: jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D11314

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242713 91177308-0d34-0410-b5e6-96231b3b80d8
lib/Target/NVPTX/NVPTXISelDAGToDAG.cpp
test/CodeGen/NVPTX/load-with-non-coherent-cache.ll [new file with mode: 0644]