Benny Halevy [Wed, 20 Oct 2010 04:18:01 +0000 (00:18 -0400)]
NFS: create and destroy inode's layout cache
At the start of the io paths, try to grab the relevant layout
information. This will initiate the inode's layout cache, but
stubs ensure the cache stays empty.
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Dean Hildebrand <dhildebz@umich.edu>
Signed-off-by: Marc Eshel <eshel@almaden.ibm.com>
Signed-off-by: Tao Guo <guotao@nrchpc.ac.cn>
Signed-off-by: Ricardo Labiaga <ricardo.labiaga@netapp.com>
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Dean Hildebrand [Wed, 20 Oct 2010 04:18:00 +0000 (00:18 -0400)]
NFSv4.1: pnfs: filelayout: introduce minimal file layout driver
This driver just registers itself and supplies trivial mount/umount functions.
Signed-off-by: Dean Hildebrand <dhildebz@umich.edu>
Signed-off-by: Marc Eshel <eshel@almaden.ibm.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Fred Isaman [Wed, 20 Oct 2010 04:17:59 +0000 (00:17 -0400)]
NFSv4.1: pnfs: full mount/umount infrastructure
Allow a module implementing a layout type to register, and
have its mount/umount routines called for filesystems that
the server declares support it.
Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Marc Eshel <eshel@almaden.ibm.com>
Signed-off-by: Andy Adamson<andros@netapp.com>
Signed-off-by: Bian Naimeng <biannm@cn.fujitsu.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Ricardo Labiaga [Wed, 20 Oct 2010 04:17:58 +0000 (00:17 -0400)]
NFS: set layout driver
Put in the infrastructure that uses information returned from the
server at mount to select a layout driver module.
In this patch, a stub is used that always returns "no driver found".
Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com>
Signed-off-by: Dean Hildebrand <dhildebz@umich.edu>
Signed-off-by: Marc Eshel <eshel@almaden.ibm.com>
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Andy Adamson [Wed, 20 Oct 2010 04:17:57 +0000 (00:17 -0400)]
NFS: ask for layouttypes during v4 fsinfo call
This information will be used to determine which layout driver,
if any, to use for subsequent IO on this filesystem. Each driver
is assigned an integer id, with 0 reserved to indicate no driver.
The server can in theory return multiple ids. However, our current
client implementation only notes the first entry and ignores the
rest.
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Alexandros Batsakis [Wed, 20 Oct 2010 04:17:56 +0000 (00:17 -0400)]
NFS: change stateid to be a union
In NFSv4.1 the stateid consists of the other and seqid fields. For layout
processing we need to numerically compare the seqid value of layout stateids.
To do so, introduce a union to nfs4_stateid to switch between opaque(16 bytes)
and opaque(12 bytes) / __be32
Signed-off-by: Alexandros Batsakis <batsakis@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Dean Hildebrand [Wed, 20 Oct 2010 04:17:55 +0000 (00:17 -0400)]
NFSv4.1: pnfsd, pnfs: protocol level pnfs constants
Use only layoutreturn constant for both returns and recalls.
(return_* works better for recall_type rather the other way around)
Signed-off-by: Dean Hildebrand <dhildebz@umich.edu>
Signed-off-by: Marc Eshel <eshel@almaden.ibm.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Benny Halevy [Wed, 20 Oct 2010 04:17:54 +0000 (00:17 -0400)]
SUNRPC: define xdr_decode_opaque_fixed
A helper for decoding a fixed length opaque value.
Returns a pointer to the next item in the xdr stream.
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Andy Adamson [Wed, 20 Oct 2010 04:17:53 +0000 (00:17 -0400)]
NFSD: remove duplicate NFS4_STATEID_SIZE
Already accepted by Bruce
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Sun, 24 Oct 2010 22:00:46 +0000 (18:00 -0400)]
SUNRPC: Cleanup duplicate assignment in rpcauth_refreshcred
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Roman Borisov [Wed, 13 Oct 2010 12:54:51 +0000 (16:54 +0400)]
nfs: fix unchecked value
Return value of "decode_attr_bitmap()" was not checked;
Signed-off-by: Roman Borisov <ext-roman.borisov@nokia.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Ricardo Labiaga [Tue, 12 Oct 2010 23:30:06 +0000 (16:30 -0700)]
Ask for time_delta during fsinfo probe
Used by the client to determine if the server has a granular enough
time stamp.
Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Ricardo Labiaga [Tue, 12 Oct 2010 23:30:05 +0000 (16:30 -0700)]
Revalidate caches on lock
Instead of blindly zapping the caches, attempt to revalidate them if
the server has indicated that it uses high resolution timestamps.
NFSv4 should be able to always revalidate the cache since the
protocol requires the update of the change attribute on modification of
the data. In reality, there are servers (the Linux NFS server
for example) that do not obey this requirement and use ctime as the
basis for change attribute. Long term, the server needs to be fixed.
At this time, and to be on the safe side, continue zapping caches if
the server indicates that it does not have a high resolution timestamp.
Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Sun, 24 Oct 2010 21:17:31 +0000 (17:17 -0400)]
SUNRPC: After calling xprt_release(), we must restart from call_reserve
Rob Leslie reports seeing the following Oops after his Kerberos session
expired.
BUG: unable to handle kernel NULL pointer dereference at
00000058
IP: [<
e186ed94>] rpcauth_refreshcred+0x11/0x12c [sunrpc]
*pde =
00000000
Oops: 0000 [#1]
last sysfs file: /sys/devices/platform/pc87360.26144/temp3_input
Modules linked in: autofs4 authenc esp4 xfrm4_mode_transport ipt_LOG ipt_REJECT xt_limit xt_state ipt_REDIRECT xt_owner xt_HL xt_hl xt_tcpudp xt_mark cls_u32 cls_tcindex sch_sfq sch_htb sch_dsmark geodewdt deflate ctr twofish_generic twofish_i586 twofish_common camellia serpent blowfish cast5 cbc xcbc rmd160 sha512_generic sha1_generic hmac crypto_null af_key rpcsec_gss_krb5 nfsd exportfs nfs lockd fscache nfs_acl auth_rpcgss sunrpc ip_gre sit tunnel4 dummy ext3 jbd nf_nat_irc nf_conntrack_irc nf_nat_ftp nf_conntrack_ftp iptable_mangle iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 iptable_filter ip_tables x_tables pc8736x_gpio nsc_gpio pc87360 hwmon_vid loop aes_i586 aes_generic sha256_generic dm_crypt cs5535_gpio serio_raw cs5535_mfgpt hifn_795x des_generic geode_rng rng_core led_class ext4 mbcache jbd2 crc16 dm_mirror dm_region_hash dm_log dm_snapshot dm_mod sd_mod crc_t10dif ide_pci_generic cs5536 amd74xx ide_core pata_cs5536 ata_generic libata usb_storage via_rhine mii scsi_mod btrfs zlib_deflate crc32c libcrc32c [last unloaded: scsi_wait_scan]
Pid: 12875, comm: sudo Not tainted 2.6.36-net5501 #1 /
EIP: 0060:[<
e186ed94>] EFLAGS:
00010292 CPU: 0
EIP is at rpcauth_refreshcred+0x11/0x12c [sunrpc]
EAX:
00000000 EBX:
defb13a0 ECX:
00000006 EDX:
e18683b8
ESI:
defb13a0 EDI:
00000000 EBP:
00000000 ESP:
de571d58
DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
Process sudo (pid: 12875, ti=
de570000 task=
decd1430 task.ti=
de570000)
Stack:
e186e008 00000000 defb13a0 0000000d deda6000 e1868f22 e196f12b defb13a0
<0>
defb13d8 00000000 00000000 e186e0aa 00000000 defb13a0 de571dac 00000000
<0>
e186956c de571e34 debea5c0 de571dc8 e186967a 00000000 debea5c0 de571e34
Call Trace:
[<
e186e008>] ? rpc_wake_up_next+0x114/0x11b [sunrpc]
[<
e1868f22>] ? call_decode+0x24a/0x5af [sunrpc]
[<
e196f12b>] ? nfs4_xdr_dec_access+0x0/0xa2 [nfs]
[<
e186e0aa>] ? __rpc_execute+0x62/0x17b [sunrpc]
[<
e186956c>] ? rpc_run_task+0x91/0x97 [sunrpc]
[<
e186967a>] ? rpc_call_sync+0x40/0x5b [sunrpc]
[<
e1969ca2>] ? nfs4_proc_access+0x10a/0x176 [nfs]
[<
e19572fa>] ? nfs_do_access+0x2b1/0x2c0 [nfs]
[<
e186ed61>] ? rpcauth_lookupcred+0x62/0x84 [sunrpc]
[<
e19573b6>] ? nfs_permission+0xad/0x13b [nfs]
[<
c0177824>] ? exec_permission+0x15/0x4b
[<
c0177fbd>] ? link_path_walk+0x4f/0x456
[<
c017867d>] ? path_walk+0x4c/0xa8
[<
c0179678>] ? do_path_lookup+0x1f/0x68
[<
c017a3fb>] ? user_path_at+0x37/0x5f
[<
c016359c>] ? handle_mm_fault+0x229/0x55b
[<
c0170a2d>] ? sys_faccessat+0x93/0x146
[<
c0170aef>] ? sys_access+0xf/0x13
[<
c02cf615>] ? syscall_call+0x7/0xb
Code: 0f 94 c2 84 d2 74 09 8b 44 24 0c e8 6a e9 8b de 83 c4 14 89 d8 5b 5e 5f 5d c3 55 57 56 53 83 ec 1c fc 89 c6 8b 40 10 89 44 24 04 <8b> 58 58 85 db 0f 85 d4 00 00 00 0f b7 46 70 8b 56 20 89 c5 83
EIP: [<
e186ed94>] rpcauth_refreshcred+0x11/0x12c [sunrpc] SS:ESP 0068:
de571d58
CR2:
0000000000000058
This appears to be caused by the function rpc_verify_header() first
calling xprt_release(), then doing a call_refresh. If we release the
transport slot, we should _always_ jump back to call_reserve before
calling anything else.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@kernel.org
Trond Myklebust [Sun, 24 Oct 2010 16:11:42 +0000 (12:11 -0400)]
NFSv4: Fix up the 'dircount' hint in encode_readdir
Also ensure we only ask for either fileid or mounted_on_fileid in the
readdirplus case too...
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Sun, 24 Oct 2010 15:52:55 +0000 (11:52 -0400)]
NFSv4: Clean up nfs4_decode_dirent
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Sun, 24 Oct 2010 17:14:02 +0000 (13:14 -0400)]
NFSv4: nfs4_decode_dirent must clear entry->fattr->valid
Otherwise, we may end up reading uninitialised data from the resulting
struct nfs_fattr.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Sat, 23 Oct 2010 19:43:10 +0000 (15:43 -0400)]
NFSv4: Fix a regression in decode_getfattr
We don't want to have the mounted_on_fileid overwrite the true fileid. We
only return the former if the server didn't supply the true fileid.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Sat, 23 Oct 2010 19:34:20 +0000 (15:34 -0400)]
NFSv4: Fix up decode_attr_filehandle() to handle the case of empty fh pointer
decode_attr_filehandle still needs to skip the XDR-encoded filehandle if
someone passes a null pointer argument.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Sat, 23 Oct 2010 18:53:23 +0000 (14:53 -0400)]
NFS: Ensure we check all allocation return values in new readdir code
Also some clean ups.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Bryan Schumaker [Thu, 21 Oct 2010 20:33:18 +0000 (16:33 -0400)]
NFS: Readdir plus in v4
By requsting more attributes during a readdir, we can mimic the readdir plus
operation that was in NFSv3.
To test, I ran the command `ls -lU --color=none` on directories with various
numbers of files. Without readdir plus, I see this:
n files | 100 | 1,000 | 10,000 | 100,000 | 1,000,000
--------+-----------+-----------+-----------+-----------+----------
real | 0m00.153s | 0m00.589s | 0m05.601s | 0m56.691s | 9m59.128s
user | 0m00.007s | 0m00.007s | 0m00.077s | 0m00.703s | 0m06.800s
sys | 0m00.010s | 0m00.070s | 0m00.633s | 0m06.423s | 1m10.005s
access | 3 | 1 | 1 | 4 | 31
getattr | 2 | 1 | 1 | 1 | 1
lookup | 104 | 1,003 | 10,003 | 100,003 | 1,000,003
readdir | 2 | 16 | 158 | 1,575 | 15,749
total | 111 | 1,021 | 10,163 | 101,583 | 1,015,784
With readdir plus enabled, I see this:
n files | 100 | 1,000 | 10,000 | 100,000 | 1,000,000
--------+-----------+-----------+-----------+-----------+----------
real | 0m00.115s | 0m00.206s | 0m01.079s | 0m12.521s | 2m07.528s
user | 0m00.003s | 0m00.003s | 0m00.040s | 0m00.290s | 0m03.296s
sys | 0m00.007s | 0m00.020s | 0m00.120s | 0m01.357s | 0m17.556s
access | 3 | 1 | 1 | 1 | 7
getattr | 2 | 1 | 1 | 1 | 1
lookup | 4 | 3 | 3 | 3 | 3
readdir | 6 | 62 | 630 | 6,300 | 62,993
total | 15 | 67 | 635 | 6,305 | 63,004
Readdir plus disabled has about a 16x increase in the number of rpc calls and
is 4 - 5 times slower on large directories.
Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Bryan Schumaker [Thu, 21 Oct 2010 20:33:17 +0000 (16:33 -0400)]
NFS: introduce generic decode_getattr function
Getattr should be able to decode errors and the readdir file handle.
decode_getfattr_attrs does the actual attribute decoding, while
decode_getfattr_generic will check the opcode before decoding. This will
let other functions call decode_getfattr_attrs to decode their attributes.
Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Bryan Schumaker [Thu, 21 Oct 2010 20:33:16 +0000 (16:33 -0400)]
NFS: check xdr_decode for errors
Check if the decoded entry has the eof bit set when returning from xdr_decode
with an error. If it does, we should set the eof bits in the array before
returning. This should keep us from looping when we expect more data but the
server doesn't give us anything new.
Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Bryan Schumaker [Thu, 21 Oct 2010 20:33:16 +0000 (16:33 -0400)]
NFS: nfs_readdir_filler catch all errors
Check for all errors, not a specific one.
Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Bryan Schumaker [Wed, 20 Oct 2010 19:44:37 +0000 (15:44 -0400)]
NFS: readdir with vmapped pages
We can use vmapped pages to read more information from the network at once.
This will reduce the number of calls needed to complete a readdir.
Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
[trondmy: Added #include for linux/vmalloc.h> in fs/nfs/dir.c]
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Bryan Schumaker [Wed, 20 Oct 2010 19:44:31 +0000 (15:44 -0400)]
NFS: remove page size checking code
Remove the page size checking code for a readdir decode. This is now done
by decode_dirent with xdr_streams.
Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Bryan Schumaker [Wed, 20 Oct 2010 19:44:29 +0000 (15:44 -0400)]
NFS: decode_dirent should use an xdr_stream
Convert nfs*xdr.c to use an xdr stream in decode_dirent. This will prevent a
kernel oops that has been occuring when reading a vmapped page.
Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Tue, 19 Oct 2010 23:58:49 +0000 (19:58 -0400)]
SUNRPC: Add a helper function xdr_inline_peek
We sometimes need to be able to read ahead in an xdr_stream without
incrementing the current pointer position.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Bryan Schumaker [Fri, 24 Sep 2010 22:50:01 +0000 (18:50 -0400)]
NFS: remove readdir plus limit
We will now use readdir plus even on directories that are very large.
Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Bryan Schumaker [Fri, 24 Sep 2010 22:50:01 +0000 (18:50 -0400)]
NFS: re-add readdir plus
This patch adds readdir plus support to the cache array.
Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Fri, 24 Sep 2010 22:49:43 +0000 (18:49 -0400)]
NFS: Optimise the readdir searches
If we're going through the loop in nfs_readdir() more than once, we usually
do not want to restart searching from the beginning of the pages cache.
We only want to do that if the previous search failed...
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Bryan Schumaker [Fri, 24 Sep 2010 18:48:42 +0000 (14:48 -0400)]
NFS: add readdir cache array
This patch adds the readdir cache array and functions to retreive the array
stored on a cache page, clear the array by freeing allocated memory, add an
entry to the array, and search the array for a given cookie.
It then modifies readdir to make use of the new cache array.
With the new cache array method, we no longer need some of this code.
Finally, nfs_llseek_dir() will set file->f_pos to a value greater than 0 and
desc->dir_cookie to zero. When we see this, readdir needs to find the file
at position file->f_pos from the start of the directory.
Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Randy Dunlap [Fri, 22 Oct 2010 23:18:52 +0000 (16:18 -0700)]
nfs: include ratelimit.h, fix nfs4state build error
nfs4state.c uses interfaces from ratelimit.h. It needs to include
that header file to fix build errors:
fs/nfs/nfs4state.c:1195: warning: type defaults to 'int' in declaration of 'DEFINE_RATELIMIT_STATE'
fs/nfs/nfs4state.c:1195: warning: parameter names (without types) in function declaration
fs/nfs/nfs4state.c:1195: error: invalid storage class for function 'DEFINE_RATELIMIT_STATE'
fs/nfs/nfs4state.c:1195: error: implicit declaration of function '__ratelimit'
fs/nfs/nfs4state.c:1195: error: '_rs' undeclared (first use in this function)
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: linux-nfs@vger.kernel.org
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Tue, 19 Oct 2010 23:47:49 +0000 (19:47 -0400)]
NFSv4: The state manager must ignore EKEYEXPIRED.
Otherwise, we cannot recover state correctly.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Sat, 23 Oct 2010 15:24:25 +0000 (11:24 -0400)]
NFSv4: Don't ignore the error return codes from nfs_intent_set_file
If nfs_intent_set_file() returns an error, we usually want to pass that
back up the stack.
Also ensure that nfs_open_revalidate() returns '1' on success.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Mon, 4 Oct 2010 21:59:08 +0000 (17:59 -0400)]
NFSv4: Don't call nfs4_reclaim_complete() on receiving NFS4ERR_STALE_CLIENTID
If the server sends us an NFS4ERR_STALE_CLIENTID while the state management
thread is busy reclaiming state, we do want to treat all state that wasn't
reclaimed before the STALE_CLIENTID as if a network partition occurred (see
the edge conditions described in RFC3530 and RFC5661).
What we do not want to do is to send an nfs4_reclaim_complete(), since we
haven't yet even started reclaiming state after the server rebooted.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@kernel.org
Trond Myklebust [Mon, 4 Oct 2010 21:59:08 +0000 (17:59 -0400)]
NFSv4: Don't call nfs4_state_mark_reclaim_reboot() from error handlers
In the case of a server reboot, the state recovery thread starts by calling
nfs4_state_end_reclaim_reboot() in order to avoid edge conditions when
the server reboots while the client is in the middle of recovery.
However, if the client has already marked the nfs4_state as requiring
reboot recovery, then the above behaviour will cause the recovery thread to
treat the open as if it was part of such an edge condition: the open will
be recovered as if it was part of a lease expiration (and all the locks
will be lost).
Fix is to remove the call to nfs4_state_mark_reclaim_reboot from
nfs4_async_handle_error(), and nfs4_handle_exception(). Instead we leave it
to the recovery thread to do this for us.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@kernel.org
Trond Myklebust [Mon, 4 Oct 2010 21:59:08 +0000 (17:59 -0400)]
NFSv4: Fix open recovery
NFSv4 open recovery is currently broken: since we do not clear the
state->flags states before attempting recovery, we end up with the
'can_open_cached()' function triggering. This again leads to no OPEN call
being put on the wire.
Reported-by: Sachin Prabhu <sprabhu@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@kernel.org
Trond Myklebust [Mon, 4 Oct 2010 21:59:08 +0000 (17:59 -0400)]
NFS: Don't SIGBUS if nfs_vm_page_mkwrite races with a cache invalidation
In the case where we lock the page, and then find out that the page has
been thrown out of the page cache, we should just return VM_FAULT_NOPAGE.
This is what block_page_mkwrite() does in these situations.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@kernel.org
Bryan Schumaker [Wed, 29 Sep 2010 19:41:49 +0000 (15:41 -0400)]
NFS: new idmapper
This patch creates a new idmapper system that uses the request-key function to
place a call into userspace to map user and group ids to names. The old
idmapper was single threaded, which prevented more than one request from running
at a single time. This means that a user would have to wait for an upcall to
finish before accessing a cached result.
The upcall result is stored on a keyring of type id_resolver. See the file
Documentation/filesystems/nfs/idmapper.txt for instructions.
Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
[Trond: fix up the return value of nfs_idmap_lookup_name and clean up code]
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Wed, 29 Sep 2010 19:11:56 +0000 (15:11 -0400)]
NFS: We must use list_for_each_entry_safe in nfs_access_cache_shrinker
We may end up removing the current entry from nfs_access_lru_list.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Jeff Layton [Tue, 28 Sep 2010 13:14:01 +0000 (09:14 -0400)]
NFS: don't use FLUSH_SYNC on WB_SYNC_NONE COMMIT calls (try #2)
WB_SYNC_NONE is supposed to mean "don't wait on anything". That should
also include not waiting for COMMIT calls to complete.
WB_SYNC_NONE is also implied when wbc->nonblocking and
wbc->for_background are set, so we can replace those checks in
nfs_commit_unstable_pages with a check for WB_SYNC_NONE.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Mon, 27 Sep 2010 19:51:20 +0000 (15:51 -0400)]
NFS: Really fix put_nfs_open_context()
In nfs_open_revalidate(), if the open_context() call returns an inode that
is not the same as dentry->d_inode, then we will call
put_nfs_open_context() with a valid dentry->d_inode, but without the
context being part of the nfsi->open_files list.
In this case too, we want to just skip the list removal, but we do want to
call the ->close_context() callback in order to close the NFSv4 state.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: Jeff Layton <jlayton@redhat.com>
Benny Halevy [Fri, 24 Sep 2010 13:17:01 +0000 (09:17 -0400)]
NFSv4.1: keep seq_res.sr_slot as pointer rather than an index
Having to explicitly initialize sr_slotid to NFS4_MAX_SLOT_TABLE
resulted in numerous bugs. Keeping the current slot as a pointer
to the slot table is more straight forward and robust as it's
implicitly set up to NULL wherever the seq_res member is initialized
to zeroes.
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Suresh Jayaraman [Thu, 23 Sep 2010 18:26:48 +0000 (14:26 -0400)]
nfs: show "local_lock" mount option in /proc/mounts
Display the status of 'local_lock' mount option in /proc/mounts.
Signed-off-by: Suresh Jayaraman <sjayaraman@suse.de>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Benny Halevy [Thu, 23 Sep 2010 16:22:09 +0000 (12:22 -0400)]
NFS: handle inode==NULL in __put_nfs_open_context
inode may be NULL when put_nfs_open_context is called from nfs_atomic_lookup
before d_add_unique(dentry, inode)
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Suresh Jayaraman [Thu, 23 Sep 2010 12:55:58 +0000 (08:55 -0400)]
nfs: introduce mount option '-olocal_lock' to make locks local
NFS clients since 2.6.12 support flock locks by emulating fcntl byte-range
locks. Due to this, some windows applications which seem to use both flock
(share mode lock mapped as flock by Samba) and fcntl locks sequentially on
the same file, can't lock as they falsely assume the file is already locked.
The problem was reported on a setup with windows clients accessing excel files
on a Samba exported share which is originally a NFS mount from a NetApp filer.
Older NFS clients (< 2.6.12) did not see this problem as flock locks were
considered local. To support legacy flock behavior, this patch adds a mount
option "-olocal_lock=" which can take the following values:
'none' - Neither flock locks nor POSIX locks are local
'flock' - flock locks are local
'posix' - fcntl/POSIX locks are local
'all' - Both flock locks and POSIX locks are local
Testing:
- This patch was tested by using -olocal_lock option with different values
and the NLM calls were noted from the network packet captured.
'none' - NLM calls were seen during both flock() and fcntl(), flock lock
was granted, fcntl was denied
'flock' - no NLM calls for flock(), NLM call was seen for fcntl(),
granted
'posix' - NLM call was seen for flock() - granted, no NLM call for fcntl()
'all' - no NLM calls were seen during both flock() and fcntl()
- No bugs were seen during NFSv4 locking/unlocking in general and NFSv4
reboot recovery.
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: Suresh Jayaraman <sjayaraman@suse.de>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Bryan Schumaker [Wed, 22 Sep 2010 13:50:35 +0000 (09:50 -0400)]
lockd: Remove BKL from the client
This patch removes all calls to lock_kernel() from the client. This patch
should be applied after the "fs/lock.c prepare for BKL removal" patch submitted
by Arnd Bergmann on September 18.
Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Tue, 21 Sep 2010 20:55:48 +0000 (16:55 -0400)]
SUNRPC: Refactor logic to NUL-terminate strings in pages
Clean up: Introduce a helper to '\0'-terminate XDR strings
that are placed in a page in the page cache.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Tue, 21 Sep 2010 20:55:48 +0000 (16:55 -0400)]
SUNRPC: Correct an rpcbind debugging message
Clean up.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Tue, 21 Sep 2010 20:55:47 +0000 (16:55 -0400)]
NFS: Fix NFSv3 debugging messages in fs/nfs/nfs3proc.c
Clean up.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Tue, 21 Sep 2010 20:55:31 +0000 (16:55 -0400)]
NFS: Convert nfsiod to use alloc_workqueue()
create_singlethread_workqueue() is deprecated.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Tue, 21 Sep 2010 20:54:34 +0000 (16:54 -0400)]
SUNRPC: Convert rpciod to use the alloc_workqueue() interface
create_workqueue() is a deprecated function.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Tue, 21 Sep 2010 20:52:40 +0000 (16:52 -0400)]
NFSv4.1: Fix the slotid initialisation in nfs_async_rename()
This fixes an Oopsable condition that was introduced by commit
d3d4152a5d59af9e13a73efa9e9c24383fbe307f (nfs: make sillyrename an async
operation)
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Tue, 21 Sep 2010 20:52:40 +0000 (16:52 -0400)]
NFS: Fix a use-after-free case in nfs_async_rename()
The call to nfs_async_rename_release() after rpc_run_task() is incorrect.
The rpc_run_task() is always guaranteed to call the ->rpc_release() method.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Jeff Layton [Fri, 17 Sep 2010 21:31:57 +0000 (17:31 -0400)]
nfs: make sillyrename an async operation
A synchronous rename can be interrupted by a SIGKILL. If that happens
during a sillyrename operation, it's possible for the rename call to
be sent to the server, but the task exits before processing the
reply. If this happens, the sillyrenamed file won't get cleaned up
during nfs_dentry_iput and the server is left with a dangling .nfs* file
hanging around.
Fix this problem by turning sillyrename into an asynchronous operation
and have the task doing the sillyrename just wait on the reply. If the
task is killed before the sillyrename completes, it'll still proceed
to completion.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Jeff Layton [Fri, 17 Sep 2010 21:31:30 +0000 (17:31 -0400)]
nfs: move nfs_sillyrename to unlink.c
...since that's where most of the sillyrenaming code lives. A comment
block is added to the beginning as well to clarify how sillyrenaming
works. Also, make nfs_async_unlink static as nfs_sillyrename is the only
caller.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Jeff Layton [Fri, 17 Sep 2010 21:31:06 +0000 (17:31 -0400)]
nfs: standardize the rename response container
Right now, v3 and v4 have their own variants. Create a standard struct
that will work for v3 and v4. v2 doesn't get anything but a simple error
and so isn't affected by this.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Jeff Layton [Fri, 17 Sep 2010 21:30:25 +0000 (17:30 -0400)]
nfs: standardize the rename args container
Each NFS version has its own version of the rename args container.
Standardize them on a common one that's identical to the one NFSv4
uses.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Fri, 17 Sep 2010 14:56:51 +0000 (10:56 -0400)]
NFS: Add an 'open_context' element to struct nfs_rpc_ops
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Fri, 17 Sep 2010 14:56:51 +0000 (10:56 -0400)]
NFS: Clean up nfs4_proc_create()
Remove all remaining references to the struct nameidata from the low level
NFS layers. Again pass down a partially initialised struct nfs_open_context
when we want to do atomic open+create.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Fri, 17 Sep 2010 14:56:51 +0000 (10:56 -0400)]
NFSv4: Further cleanups for nfs4_open_revalidate()
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Fri, 17 Sep 2010 14:56:51 +0000 (10:56 -0400)]
NFSv4: Clean up nfs4_open_revalidate
Remove references to 'struct nameidata' from the low-level open_revalidate
code, and replace them with a struct nfs_open_context which will be
correctly initialised upon success.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Fri, 17 Sep 2010 14:56:50 +0000 (10:56 -0400)]
NFSv4: Further minor cleanups for nfs4_atomic_open()
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Fri, 17 Sep 2010 14:56:50 +0000 (10:56 -0400)]
NFSv4: Clean up nfs4_atomic_open
Start moving the 'struct nameidata' dependent code out of the lower level
NFS code in preparation for the removal of open intents.
Instead of the struct nameidata, we pass down a partially initialised
struct nfs_open_context that will be fully initialised by the atomic open
upon success.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Fri, 17 Sep 2010 14:54:37 +0000 (10:54 -0400)]
SUNRPC: Remove rpcb_getport_sync()
Clean up: rpcb_getport_sync() has no more users, so remove it.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Fri, 17 Sep 2010 14:54:37 +0000 (10:54 -0400)]
NFS: Allow NFSROOT debugging messages to be enabled dynamically
As a convenience, introduce a kernel command line option to enable
NFSROOT debugging messages.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Fri, 17 Sep 2010 14:54:37 +0000 (10:54 -0400)]
NFS: Clean up nfsroot.c
Clean up: now that mount option parsing for nfsroot is handled
in fs/nfs/super.c, remove code in fs/nfs/nfsroot.c that is no
longer used. This includes code that constructs the legacy
nfs_mount_data structure, and code that does a MNT call to the
server.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Fri, 17 Sep 2010 14:54:37 +0000 (10:54 -0400)]
NFS: Use super.c for NFSROOT mount option parsing
Replace duplicate code in NFSROOT for mounting an NFS server on '/'
with logic that uses the existing mainline text-based logic in the NFS
client.
Add documenting comments where appropriate.
Note that this means NFSROOT mounts now use the same default settings
as v2/v3 mounts done via mount(2) from user space.
vers=3,tcp,rsize=<negotiated default>,wsize=<negotiated default>
As before, however, no version/protocol negotiation with the server is
done.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Fri, 17 Sep 2010 14:54:37 +0000 (10:54 -0400)]
NFS: Clean up NFSROOT command line parsing
Clean up: To reduce confusion, rename nfs_root_name as nfs_root_parms,
as this buffer contains more than just the name of the remote server.
Introduce documenting comments around nfs_root_setup().
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Fri, 17 Sep 2010 14:54:37 +0000 (10:54 -0400)]
NFS: Remove \t from mount debugging message
During boot, a random character is displayed instead of a tab.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Sun, 29 Aug 2010 16:13:16 +0000 (12:13 -0400)]
SUNRPC: Don't truncate tail data unnecessarily in xdr_shrink_pagelen
If we have unused buffer space, then we should make use of that rather
than unnecessarily truncating the message.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Benny Halevy [Sun, 29 Aug 2010 16:13:15 +0000 (12:13 -0400)]
sunrpc: simplify xdr_shrink_pagelen use of "copy"
The "copy" variable value can be computed using the existing
logic rather than repeating it.
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Benny Halevy [Sun, 29 Aug 2010 16:13:15 +0000 (12:13 -0400)]
sunrpc: don't use the copy variable in nested block
to clean up the code "copy" will be set prior to the block
hence it mustn't be used there.
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Benny Halevy [Sun, 29 Aug 2010 16:13:15 +0000 (12:13 -0400)]
sunrpc: clean up xdr_shrink_pagelen use of temporary pointer
char *p is used only as a shorthand for tail->iov_base + len in a nested
block. Move it there.
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Benny Halevy [Sun, 29 Aug 2010 16:13:15 +0000 (12:13 -0400)]
sunrpc: don't shorten buflen twice in xdr_shrink_pagelen
On Jan. 14, 2009, 2:50 +0200, andros@netapp.com wrote:
> From: Andy Adamson <andros@netapp.com>
>
> The buflen is reset for all cases at the end of xdr_shrink_pagelen.
> The data left in the tail after xdr_read_pages is not processed when the
> buflen is incorrectly set.
Note that in this case we also lose (len - tail->iov_len)
bytes from the buffered data in pages.
Reported-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Linus Torvalds [Sun, 29 Aug 2010 15:36:04 +0000 (08:36 -0700)]
Linux 2.6.36-rc3
Linus Torvalds [Sun, 29 Aug 2010 15:19:02 +0000 (08:19 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/ieee1394/linux1394-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6:
firewire: ohci: work around VIA and NEC PHY packet reception bug
firewire: core: do not use del_timer_sync() in interrupt context
firewire: net: fix unicast reception RCODE in failure paths
firewire: sbp2: fix stall with "Unsolicited response"
firewire: sbp2: fix memory leak in sbp2_cancel_orbs or at send error
ieee1394: Adjust confusing if indentation
Stefan Richter [Sat, 28 Aug 2010 12:21:26 +0000 (14:21 +0200)]
firewire: ohci: work around VIA and NEC PHY packet reception bug
VIA VT6306, VIA VT6308, and NEC OrangeLink controllers do not write
packet event codes for received PHY packets (or perhaps write
evt_no_status, hard to tell). Work around it by overwriting the
packet's ACK by ack_complete, so that upper layers that listen to PHY
packet reception get to see these packets.
(Also tested: TI TSB82AA2, TI TSB43AB22/A, TI XIO2213A, Agere FW643,
JMicron JMB381 --- these do not exhibit this bug.)
Clemens proposed a quirks flag for that, IOW whitelist known misbehaving
controllers for this workaround. Though to me it seems harmless enough
to enable for all controllers.
The log_ar_at_event() debug log will continue to show the original
status from the DMA unit.
Reported-by: Clemens Ladisch <clemens@ladisch.de> (VT6308)
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Linus Torvalds [Sat, 28 Aug 2010 22:42:44 +0000 (15:42 -0700)]
Merge git://git./linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
net/ipv4: Eliminate kstrdup memory leak
net/caif/cfrfml.c: use asm/unaligned.h
ax25: missplaced sock_put(sk)
qlge: reset the chip before freeing the buffers
l2tp: test for ethernet header in l2tp_eth_dev_recv()
tcp: select(writefds) don't hang up when a peer close connection
tcp: fix three tcp sysctls tuning
tcp: Combat per-cpu skew in orphan tests.
pxa168_eth: silence gcc warnings
pxa168_eth: update call to phy_mii_ioctl()
pxa168_eth: fix error handling in prope
pxa168_eth: remove unneeded null check
phylib: Fix race between returning phydev and calling adjust_link
caif-driver: add HAS_DMA dependency
3c59x: Fix deadlock between boomerang_interrupt and boomerang_start_tx
qlcnic: fix poll implementation
netxen: fix poll implementation
bridge: netfilter: fix a memory leak
Linus Torvalds [Sat, 28 Aug 2010 21:24:49 +0000 (14:24 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/vapier/blackfin
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/vapier/blackfin:
Blackfin: bf52x/bf54x boards: drop unused nand page size
Blackfin: punt duplicate SPORT MMR defines
Linus Torvalds [Sat, 28 Aug 2010 21:24:34 +0000 (14:24 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/tiwai/sound-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
ALSA: pcm: add more format names
sound: oss: fix uninitialized spinlock
ALSA: asihpi - Return hw error directly from oustream_write.
ASoC: soc-core: fix debugfs_pop_time file permissions
ALSA: hda - Add Sony VAIO quirk for ALC269
Linus Torvalds [Sat, 28 Aug 2010 21:12:05 +0000 (14:12 -0700)]
Merge branch 's5p-fixes-for-linus' of git://git./linux/kernel/git/kgene/linux-samsung
* 's5p-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/kgene/linux-samsung:
ARM: S5PV310: Fix on Secondary CPU startup
ARM: S5PV310: Bug fix on uclk1 and sclk_pwm
ARM: S5PV310: Fix missed uart clocks
ARM: S5PV310: Should be clk_sclk_apll not clk_mout_apll
ARM: S5PV310: Fix on PLL setting for S5PV310
ARM: S5PV310: Add CMU block for S5PV310 Clock
ARM: S5PV310: Fix on typo irqs.h of S5PV310
ARM: S5PV310: Fix on default ZRELADDR of ARCH_S5PV310
ARM: S5PV310: Fix on GPIO base addresses
ARM: SAMSUNG: Fix on build warning regarding VMALLOC_END type
ARM: S5P: VMALLOC_END should be unsigned long
Linus Torvalds [Sat, 28 Aug 2010 21:11:04 +0000 (14:11 -0700)]
Merge branch 'for-linus' of git://git.infradead.org/users/eparis/notify
* 'for-linus' of git://git.infradead.org/users/eparis/notify:
fsnotify: drop two useless bools in the fnsotify main loop
fsnotify: fix list walk order
fanotify: Return EPERM when a process is not privileged
fanotify: resize pid and reorder structure
fanotify: drop duplicate pr_debug statement
fanotify: flush outstanding perm requests on group destroy
fsnotify: fix ignored mask handling between inode and vfsmount marks
fanotify: add MAINTAINERS entry
fsnotify: reset used_inode and used_vfsmount on each pass
fanotify: do not dereference inode_mark when it is unset
Linus Torvalds [Sat, 28 Aug 2010 21:10:43 +0000 (14:10 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/ecryptfs/ecryptfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ecryptfs/ecryptfs-2.6:
eCryptfs: Fix encrypted file name lookup regression
ecryptfs: properly mark init functions
fs/ecryptfs: Return -ENOMEM on memory allocation failure
Linus Torvalds [Sat, 28 Aug 2010 21:09:16 +0000 (14:09 -0700)]
Merge branch 'for-linus' of git://android.git./kernel/tegra
* 'for-linus' of git://android.git.kernel.org/kernel/tegra:
arm: tegra: VMALLOC_END should be unsigned long
arm: tegra: fix compilation of board-harmony.c
Linus Torvalds [Sat, 28 Aug 2010 21:08:38 +0000 (14:08 -0700)]
Merge branch 'drm-fixes' of git://git./linux/kernel/git/airlied/drm-2.6
* 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
vgaarb: Wrap vga_(get|put) in CONFIG_VGA_ARB
drm/radeon/kms: add missing scratch update in dp_detect
drm/modes: Fix CVT-R modeline generation
drm: fix regression in drm locking since BKL removal.
drm/radeon/kms: remove stray radeon_i2c_destroy
drm: mm: fix range restricted allocations
drm/nouveau: drop drm_global_mutex before sleeping in submission path
drm: export drm_global_mutex for drivers to use
drm/nv20: Don't use pushbuf calls on the original nv20.
drm/nouveau: Fix TMDS on some DCB1.5 boards.
drm/nouveau: Fix backlight control on PPC machines with an internal TMDS panel.
drm/nv30: Apply modesetting to the correct slave encoder
drm/nouveau: Use a helper function to match PCI device/subsystem IDs.
drm/nv50: add dcb type 14 to enum to prevent compiler complaint
Linus Torvalds [Sat, 28 Aug 2010 21:08:10 +0000 (14:08 -0700)]
Merge branch 'lguest' of git://git./linux/kernel/git/rusty/linux-2.6-for-linus
* 'lguest' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
lguest: Odd Fixes
lguest: clean up warnings in demonstration launcher.
Linus Torvalds [Sat, 28 Aug 2010 21:07:38 +0000 (14:07 -0700)]
Merge branch 'omap-fixes-for-linus' of git://git./linux/kernel/git/tmlind/linux-omap-2.6
* 'omap-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap-2.6:
OMAP3: PM: ensure IO wakeups are properly disabled
omap: Fix omap_4430sdp_defconfig for make oldconfig
omap: Use CONFIG_SMP for test_for_ipi and test_for_ltirq
omap: Fix sev instruction usage for multi-omap
OMAP3: Fix a cpu type check problem
omap3: id: fix 3630 rev detection
Linus Torvalds [Sat, 28 Aug 2010 21:07:20 +0000 (14:07 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/sage/ceph-client
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
ceph: fix get_ticket_handler() error handling
ceph: don't BUG on ENOMEM during mds reconnect
ceph: ceph_mdsc_build_path() returns an ERR_PTR
ceph: Fix warnings
ceph: ceph_get_inode() returns an ERR_PTR
ceph: initialize fields on new dentry_infos
ceph: maintain i_head_snapc when any caps are dirty, not just for data
ceph: fix osd request lru adjustment when sending request
ceph: don't improperly set dir complete when holding EXCL cap
mm: exporting account_page_dirty
ceph: direct requests in snapped namespace based on nonsnap parent
ceph: queue cap snap writeback for realm children on snap update
ceph: include dirty xattrs state in snapped caps
ceph: fix xattr cap writeback
ceph: fix multiple mds session shutdown
Linus Torvalds [Sat, 28 Aug 2010 21:06:19 +0000 (14:06 -0700)]
Merge branch 'pm-fixes' of git://git./linux/kernel/git/rafael/suspend-2.6
* 'pm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6:
PM QoS: Fix inline documentation.
PM QoS: Fix kzalloc() parameters swapped in pm_qos_power_open()
Linus Torvalds [Sat, 28 Aug 2010 21:05:55 +0000 (14:05 -0700)]
Merge branch 'for-2.6.36' of git://linux-nfs.org/~bfields/linux
* 'for-2.6.36' of git://linux-nfs.org/~bfields/linux:
nfsd: fix NULL dereference in nfsd_statfs()
nfsd4: fix downgrade/lock logic
nfsd4: typo fix in find_any_file
nfsd4: bad BUG() in preprocess_stateid_op
Linus Torvalds [Sat, 28 Aug 2010 21:05:15 +0000 (14:05 -0700)]
Merge git://git./linux/kernel/git/sfrench/cifs-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
Cannot allocate memory error on mount
[CIFS] Eliminate unused variable warning
David Howells [Thu, 26 Aug 2010 16:44:35 +0000 (17:44 +0100)]
Alpha: Fix a missing comma in sys_osf_statfs()
Fix a comma that got accidentally deleted from sys_osf_statfs() leading to the
following warning:
arch/alpha/kernel/osf_sys.c: In function 'SYSC_osf_statfs':
arch/alpha/kernel/osf_sys.c:255: error: syntax error before 'buffer'
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Howells [Thu, 26 Aug 2010 15:00:34 +0000 (16:00 +0100)]
NOMMU: Stub out vm_get_page_prot() if there's no MMU
Stub out vm_get_page_prot() if there's no MMU.
This was added by commit
804af2cf6e7a ("[AGPGART] remove private page
protection map") and is used in commit
c07fbfd17e61 ("fbmem: VM_IO set,
but not propagated") in the fbmem video driver, but the function doesn't
exist on NOMMU, resulting in an undefined symbol at link time.
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sat, 28 Aug 2010 20:55:54 +0000 (13:55 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/bp/bp
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp:
amd64_edac: Do not report error overflow as a separate error
MCE, AMD: Limit MCE decoding to current families for now
Linus Torvalds [Sat, 28 Aug 2010 20:55:31 +0000 (13:55 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/dtor/input
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: pxa27x_keypad - remove input_free_device() in pxa27x_keypad_remove()
Input: mousedev - fix regression of inverting axes
Input: uinput - add devname alias to allow module on-demand load
Input: hil_kbd - fix compile error
USB: drop tty argument from usb_serial_handle_sysrq_char()
Input: sysrq - drop tty argument form handle_sysrq()
Input: sysrq - drop tty argument from sysrq ops handlers
Linus Torvalds [Sat, 28 Aug 2010 20:54:55 +0000 (13:54 -0700)]
Merge branch 'upstream-linus' of git://git./linux/kernel/git/jgarzik/libata-dev
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
libata-sff: remove harmful BUG_ON from ata_bmdma_qc_issue
sata_mv: fix broken DSM/TRIM support (v2)
libata: be less of a drama queen on empty data commands
[libata] sata_dwc_460ex: signdness bug
ahci: add HFLAG_YES_FBS and apply it to 88SE9128
libata: remove no longer needed pata_winbond driver
pata_cmd64x: revert commit
d62f5576
Hugh Dickins [Thu, 26 Aug 2010 06:12:54 +0000 (23:12 -0700)]
mm: fix hang on anon_vma->root->lock
After several hours, kbuild tests hang with anon_vma_prepare() spinning on
a newly allocated anon_vma's lock - on a box with CONFIG_TREE_PREEMPT_RCU=y
(which makes this very much more likely, but it could happen without).
The ever-subtle page_lock_anon_vma() now needs a further twist: since
anon_vma_prepare() and anon_vma_fork() are liable to change the ->root
of a reused anon_vma structure at any moment, page_lock_anon_vma()
needs to check page_mapped() again before succeeding, otherwise
page_unlock_anon_vma() might address a different root->lock.
Signed-off-by: Hugh Dickins <hughd@google.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Takashi Iwai [Sat, 28 Aug 2010 19:44:15 +0000 (21:44 +0200)]
Merge branch 'fix/asoc' into for-linus