Programming Languages Research Group: Git

author	Jingyue Wu <jingyue@google.com>
	Mon, 20 Jul 2015 21:28:54 +0000 (21:28 +0000)
committer	Jingyue Wu <jingyue@google.com>
	Mon, 20 Jul 2015 21:28:54 +0000 (21:28 +0000)
commit	c9f86c12604a377cf27a5627c11ced288c31cf1e
tree	c6bd5c125db87af5e3e9139059c4456566486b86
parent	d94e17bde92fe7e4411cf07b0cebaa772fc808ad

[NVPTX] make load on global readonly memory to use ldg

Summary:
As described in [1], nvcc may use ld.global.nc to load memory when
__restrict__ is used and the compiler can determine that the read-only
data cache is safe to use.
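
For illustration, the source pattern in question looks roughly like the
sketch below. It is written as plain C++ so that it compiles on the host
with GCC or Clang (which accept __restrict__ as an extension); in CUDA the
function would be a __global__ kernel and the loop would be replaced by a
thread index. The names scale and scale_copy are hypothetical and do not
come from the patch or from SHOC.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical example, not code from the patch.
// `in` is const and __restrict__: the function only reads through it and
// promises that `out` does not alias it. In a CUDA kernel, this is the
// situation in which a load of in[i] becomes a candidate for ld.global.nc.
void scale(const float* __restrict__ in, float* __restrict__ out,
           std::size_t n, float factor) {
  for (std::size_t i = 0; i < n; ++i) {
    out[i] = in[i] * factor;  // read-only load from `in`
  }
}

// Host-side driver: uses separate input and output buffers so that the
// no-alias promise made by __restrict__ actually holds.
std::vector<float> scale_copy(const std::vector<float>& src, float factor) {
  std::vector<float> dst(src.size());
  scale(src.data(), dst.data(), src.size(), factor);
  assert(dst.size() == src.size());
  return dst;
}
```

If `in` and `out` were allowed to overlap, a store through `out` could
change data that a later load through `in` expects to see, and serving that
load from a non-coherent cache would not be safe.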

This patch checks whether ldg is safe to use and, where it is, uses it
in place of ld.global. This change can improve performance by 18% to 29%
on the affected kernels (ratt*_kernel and rwdot*_kernel) in the S3D
benchmark of SHOC [2].

Patch by Xuetian Weng.

[1] http://docs.nvidia.com/cuda/kepler-tuning-guide/#read-only-data-cache
[2] https://github.com/vetter/shoc

Test Plan: test/CodeGen/NVPTX/load-with-non-coherent-cache.ll

Reviewers: jholewinski, jingyue

Subscribers: jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D11314

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242713 91177308-0d34-0410-b5e6-96231b3b80d8

lib/Target/NVPTX/NVPTXISelDAGToDAG.cpp
test/CodeGen/NVPTX/load-with-non-coherent-cache.ll	[new file with mode: 0644]