firefly-linux-kernel-4.4.55.git
13 years ago inode: Make unused inode LRU per superblock
Dave Chinner [Fri, 8 Jul 2011 04:14:39 +0000 (14:14 +1000)]
inode: Make unused inode LRU per superblock

The inode unused list is currently a global LRU. This does not match
the other global filesystem cache - the dentry cache - which uses
per-superblock LRU lists. Hence we have related filesystem object
types using different LRU reclamation schemes.

To enable a per-superblock filesystem cache shrinker, both of these
caches need to have per-sb unused object LRU lists. Hence this patch
converts the global inode LRU to per-sb LRUs.

The patch only does rudimentary per-sb proportioning in the shrinker
infrastructure, as this gets removed when the per-sb shrinker
callouts are introduced later on.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years ago inode: convert inode_stat.nr_unused to per-cpu counters
Dave Chinner [Fri, 8 Jul 2011 04:14:38 +0000 (14:14 +1000)]
inode: convert inode_stat.nr_unused to per-cpu counters

Before we split up the inode_lru_lock, the unused inode counter
needs to be made independent of the global inode_lru_lock. Convert
it to per-cpu counters to do this.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years ago vmscan: add customisable shrinker batch size
Dave Chinner [Fri, 8 Jul 2011 04:14:37 +0000 (14:14 +1000)]
vmscan: add customisable shrinker batch size

For shrinkers that have their own cond_resched* calls, having
shrink_slab break the work down into small batches is not
particularly efficient. Add a custom batchsize field to the struct
shrinker so that shrinkers can use a larger batch size if they
desire.

A value of zero (uninitialised) means "use the default", so
behaviour is unchanged by this patch.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
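
The zero-means-default rule from the commit above can be sketched in user-space C as follows. This is a minimal illustration, not the patch itself: `struct shrinker_sketch`, `shrinker_batch_size()` and the default of 128 are assumptions made for the example.

```c
/* Illustrative stand-in for the kernel's struct shrinker; only the
 * field relevant to this sketch is modelled (hypothetical layout). */
struct shrinker_sketch {
    long batch; /* 0 (uninitialised) means "use the default" */
};

/* Assumed default batch size; the real constant lives in the kernel. */
#define DEFAULT_SHRINK_BATCH 128

/* Pick the number of objects a shrink_slab()-style loop asks the
 * shrinker to scan per call. */
long shrinker_batch_size(const struct shrinker_sketch *s)
{
    return s->batch ? s->batch : DEFAULT_SHRINK_BATCH;
}
```

A shrinker that never sets the field keeps the old behaviour, while one that does its own rescheduling could ask for a larger batch by setting `batch` to, say, 1024.
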
13 years ago vmscan: reduce wind up shrinker->nr when shrinker can't do work
Dave Chinner [Fri, 8 Jul 2011 04:14:36 +0000 (14:14 +1000)]
vmscan: reduce wind up shrinker->nr when shrinker can't do work

When a shrinker returns -1 to shrink_slab() to indicate it cannot do
any work given the current memory reclaim requirements, it adds the
entire total_scan count to shrinker->nr. The idea behind this is that
when the shrinker is next called and can do work, it will do the work
of the previously aborted shrinker call as well.

However, if a filesystem is doing lots of allocation with GFP_NOFS
set, then we get many, many more aborts from the shrinkers than we
do successful calls. The result is that shrinker->nr winds up to
its maximum permissible value (twice the current cache size) and
then when the next shrinker call that can do work is issued, it
has enough scan count built up to free the entire cache twice over.

This manifests itself in the cache going from full to empty in a
matter of seconds, even when only a small part of the cache
needs to be emptied to free sufficient memory.

Under metadata intensive workloads on ext4 and XFS, I'm seeing the
VFS caches increase memory consumption up to 75% of memory (no page
cache pressure) over a period of 30-60s, and then the shrinker
empties them down to zero in the space of 2-3s. This cycle repeats
over and over again, with the shrinker completely trashing the inode
and dentry caches every minute or so the workload continues.

This behaviour was made obvious by the shrink_slab tracepoints added
earlier in the series, and made worse by the patch that corrected
the concurrent accounting of shrinker->nr.

To avoid this problem, stop repeated small increments of the total
scan value from winding shrinker->nr up to a value that can cause
the entire cache to be freed. We still need to allow it to wind up,
so use the delta as the "large scan" threshold check - if the delta
is more than a quarter of the entire cache size, then it is a large
scan and allowed to cause lots of windup because we are clearly
needing to free lots of memory.

If it isn't a large scan then limit the total scan to half the size
of the cache so that windup never increases to consume the whole
cache. Reducing the total scan limit further does not allow enough
wind-up to maintain the current levels of performance, whilst a
higher threshold does not prevent the windup from freeing the entire
cache under sustained workloads.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
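
The windup limit described in the commit above might look roughly like this. It is a minimal sketch assuming that `total_scan`, `delta` and `max_pass` (the current cache size) are plain longs; `limit_total_scan()` is a hypothetical helper, and the final clamp reflects the pre-existing maximum of twice the cache size mentioned in the message.

```c
/* Clamp the accumulated scan count as the commit message describes.
 * total_scan: deferred work carried over plus this call's delta
 * delta:      scan count contributed by the current call
 * max_pass:   current number of objects in the cache */
long limit_total_scan(long total_scan, long delta, long max_pass)
{
    /* Not a "large scan": never let windup exceed half the cache. */
    if (delta < max_pass / 4 && total_scan > max_pass / 2)
        total_scan = max_pass / 2;

    /* Upper bound for any scan: twice the current cache size. */
    if (total_scan > max_pass * 2)
        total_scan = max_pass * 2;

    return total_scan;
}
```

With a cache of 1000 objects, a wound-up count of 1900 and a small delta of 10 is cut to 500, whereas the same count with a delta of 300 (more than a quarter of the cache) is left alone.
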
13 years ago vmscan: shrinker->nr updates race and go wrong
Dave Chinner [Fri, 8 Jul 2011 04:14:35 +0000 (14:14 +1000)]
vmscan: shrinker->nr updates race and go wrong

shrink_slab() allows shrinkers to be called in parallel so the
struct shrinker can be updated concurrently. It does not provide any
exclusion for such updates, so we can get the shrinker->nr value
increasing or decreasing incorrectly.

As a result, when a shrinker repeatedly returns a value of -1 (e.g.
a VFS shrinker called w/ GFP_NOFS), the shrinker->nr goes haywire,
sometimes updating with the scan count that wasn't used, sometimes
losing it altogether. Worse is when a shrinker does work and that
update is lost due to racy updates, which means the shrinker will do
the work again!

Fix this by making the total_scan calculations independent of
shrinker->nr, and making the shrinker->nr updates atomic w.r.t.
other updates via cmpxchg loops.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
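
The cmpxchg-loop idea from the commit above can be illustrated in user space with C11 atomics. This is only a sketch: `take_deferred()` and `return_unused()` are hypothetical names, and `atomic_compare_exchange_weak()` stands in for the kernel's cmpxchg().

```c
#include <stdatomic.h>

/* Atomically claim all deferred scan work, leaving zero behind, so
 * two concurrent callers can never both claim the same count. */
long take_deferred(atomic_long *nr)
{
    long old = atomic_load(nr);

    /* On failure 'old' is refreshed with the current value; retry. */
    while (!atomic_compare_exchange_weak(nr, &old, 0))
        ;
    return old;
}

/* Atomically hand back scan count that was not used; returns the
 * resulting total. Nothing is written when there is nothing to add. */
long return_unused(atomic_long *nr, long unused)
{
    long old = atomic_load(nr);

    if (unused <= 0)
        return old;
    /* 'old + unused' is recomputed from the refreshed 'old' on retry. */
    while (!atomic_compare_exchange_weak(nr, &old, old + unused))
        ;
    return old + unused;
}
```

Because the scan total is computed from the value returned by `take_deferred()` rather than by re-reading the shared counter, a racing update cannot cause the same work to be counted twice or lost.
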
13 years ago vmscan: add shrink_slab tracepoints
Dave Chinner [Fri, 8 Jul 2011 04:14:34 +0000 (14:14 +1000)]
vmscan: add shrink_slab tracepoints

It is impossible to understand what the shrinkers are actually doing
without instrumenting the code, so add some tracepoints to allow
insight to be gained.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years ago make d_splice_alias(ERR_PTR(err), dentry) = ERR_PTR(err)
Al Viro [Sat, 9 Jul 2011 01:20:11 +0000 (21:20 -0400)]
make d_splice_alias(ERR_PTR(err), dentry) = ERR_PTR(err)

... and simplify the living hell out of callers

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years ago deuglify squashfs_lookup()
Al Viro [Sat, 9 Jul 2011 00:57:47 +0000 (20:57 -0400)]
deuglify squashfs_lookup()

d_splice_alias(NULL, dentry) is equivalent to d_add(dentry, NULL), NULL
so no need for that if (inode) ... in there (or ERR_PTR(0), for that
matter)

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years ago nfsd4_list_rec_dir(): don't bother with reopening rec_file
Al Viro [Thu, 7 Jul 2011 22:43:21 +0000 (18:43 -0400)]
nfsd4_list_rec_dir(): don't bother with reopening rec_file

just rewind it to the beginning before vfs_readdir() and be
done with that...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years ago kill useless checks for sb->s_op == NULL
Al Viro [Thu, 7 Jul 2011 19:45:59 +0000 (15:45 -0400)]
kill useless checks for sb->s_op == NULL

never is...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years ago btrfs: kill magical embedded struct superblock
Al Viro [Thu, 7 Jul 2011 19:44:25 +0000 (15:44 -0400)]
btrfs: kill magical embedded struct superblock

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years ago get rid of pointless checks for dentry->sb == NULL
Al Viro [Thu, 7 Jul 2011 19:12:51 +0000 (15:12 -0400)]
get rid of pointless checks for dentry->sb == NULL

it never is...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years ago Make ->d_sb assign-once and always non-NULL
Al Viro [Thu, 7 Jul 2011 19:03:58 +0000 (15:03 -0400)]
Make ->d_sb assign-once and always non-NULL

New helper (non-exported, fs/internal.h-only): __d_alloc(sb, name).
Allocates dentry, sets its ->d_sb to given superblock and sets
->d_op accordingly.  Old d_alloc(NULL, name) callers are converted
to that (all of them know what superblock they want).  d_alloc()
itself is left only for parent != NULL case; uses __d_alloc(),
inserts result into the list of parent's children.

Note that now ->d_sb is assign-once and never NULL *and*
->d_parent is never NULL either.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years ago unexport kern_path_parent()
Al Viro [Mon, 27 Jun 2011 21:14:56 +0000 (17:14 -0400)]
unexport kern_path_parent()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years ago switch vfs_path_lookup() to struct path
Al Viro [Mon, 27 Jun 2011 21:00:37 +0000 (17:00 -0400)]
switch vfs_path_lookup() to struct path

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years ago kill lookup_create()
Al Viro [Mon, 27 Jun 2011 20:53:43 +0000 (16:53 -0400)]
kill lookup_create()

folded into the only caller (kern_path_create())

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years ago devtmpfs: get rid of bogus mkdir in create_path()
Al Viro [Mon, 27 Jun 2011 20:37:12 +0000 (16:37 -0400)]
devtmpfs: get rid of bogus mkdir in create_path()

We do _NOT_ want to mkdir the path itself - we are preparing to
mknod it, after all.  Normally it'll fail with -ENOENT and
just do nothing, but if somebody has created the parent in
the meanwhile, we'll get buggered...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years ago switch devtmpfs to kern_path_create()
Al Viro [Mon, 27 Jun 2011 20:35:45 +0000 (16:35 -0400)]
switch devtmpfs to kern_path_create()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years ago switch devtmpfs object creation/removal to separate kernel thread
Al Viro [Mon, 27 Jun 2011 20:25:29 +0000 (16:25 -0400)]
switch devtmpfs object creation/removal to separate kernel thread

... and give it a namespace where devtmpfs would be mounted on root,
thus avoiding abuses of vfs_path_lookup() (it was never intended to
be used with LOOKUP_PARENT).  Games with credentials are also gone.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years ago make sure that nsproxy_cache is initialized early enough
Al Viro [Tue, 28 Jun 2011 19:41:10 +0000 (15:41 -0400)]
make sure that nsproxy_cache is initialized early enough

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years ago switch do_spufs_create() to user_path_create(), fix double-unlock
Al Viro [Sun, 26 Jun 2011 15:54:58 +0000 (11:54 -0400)]
switch do_spufs_create() to user_path_create(), fix double-unlock

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years ago new helpers: kern_path_create/user_path_create
Al Viro [Sun, 26 Jun 2011 15:50:15 +0000 (11:50 -0400)]
new helpers: kern_path_create/user_path_create

combination of kern_path_parent() and lookup_create().  Does *not*
expose struct nameidata to caller.  Syscalls converted to that...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years ago kill LOOKUP_CONTINUE
Al Viro [Sun, 26 Jun 2011 01:59:52 +0000 (21:59 -0400)]
kill LOOKUP_CONTINUE

LOOKUP_PARENT is equivalent to it now

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years ago nfs: LOOKUP_{OPEN,CREATE,EXCL} is set only on the last step
Al Viro [Sun, 26 Jun 2011 01:48:43 +0000 (21:48 -0400)]
nfs: LOOKUP_{OPEN,CREATE,EXCL} is set only on the last step

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years ago cifs_lookup(): LOOKUP_OPEN is set only on the last component
Al Viro [Sun, 26 Jun 2011 01:45:21 +0000 (21:45 -0400)]
cifs_lookup(): LOOKUP_OPEN is set only on the last component

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years ago ceph: LOOKUP_OPEN is set only when it's the last component
Al Viro [Sun, 26 Jun 2011 01:43:56 +0000 (21:43 -0400)]
ceph: LOOKUP_OPEN is set only when it's the last component

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years ago jfs_ci_revalidate() is safe from RCU mode
Al Viro [Sun, 26 Jun 2011 01:41:09 +0000 (21:41 -0400)]
jfs_ci_revalidate() is safe from RCU mode

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years ago LOOKUP_CREATE and LOOKUP_RENAME_TARGET can be set only on the last step
Al Viro [Sun, 26 Jun 2011 01:37:18 +0000 (21:37 -0400)]
LOOKUP_CREATE and LOOKUP_RENAME_TARGET can be set only on the last step

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years ago no need to check for LOOKUP_OPEN in ->create() instances
Al Viro [Sun, 26 Jun 2011 01:17:17 +0000 (21:17 -0400)]
no need to check for LOOKUP_OPEN in ->create() instances

... it will be set in nd->flags for all cases with non-NULL nd
(i.e. when called from do_last()).

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years ago don't pass nameidata to vfs_create() from ecryptfs_create()
Al Viro [Sun, 26 Jun 2011 01:08:31 +0000 (21:08 -0400)]
don't pass nameidata to vfs_create() from ecryptfs_create()

Instead of playing with removal of LOOKUP_OPEN, mangling (and
restoring) nd->path, just pass NULL to vfs_create().  The whole
point of what's being done there is to suppress any attempts
to open file by underlying fs, which is what nd == NULL indicates.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years ago don't transliterate lower bits of ->intent.open.flags to FMODE_...
Al Viro [Sat, 25 Jun 2011 23:15:54 +0000 (19:15 -0400)]
don't transliterate lower bits of ->intent.open.flags to FMODE_...

->create() instances are much happier that way...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years ago Don't pass nameidata when calling vfs_create() from mknod()
Al Viro [Thu, 23 Jun 2011 16:35:50 +0000 (12:35 -0400)]
Don't pass nameidata when calling vfs_create() from mknod()

All instances can cope with that now (and ceph one actually
starts working properly).

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years ago fix mknod() on nfs4 (hopefully)
Al Viro [Wed, 22 Jun 2011 22:53:18 +0000 (18:53 -0400)]
fix mknod() on nfs4 (hopefully)

a) check the right flags in ->create() (LOOKUP_OPEN, not LOOKUP_CREATE)
b) default (!LOOKUP_OPEN) open_flags is O_CREAT|O_EXCL|FMODE_READ, not 0
c) lookup_instantiate_filp() should be done only with LOOKUP_OPEN;
otherwise we need to issue CLOSE, lest we leak stateid on server.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years ago nameidata_to_nfs_open_context() doesn't need nameidata, actually...
Al Viro [Wed, 22 Jun 2011 22:47:28 +0000 (18:47 -0400)]
nameidata_to_nfs_open_context() doesn't need nameidata, actually...

just open flags; switched to passing just those and
renamed to create_nfs_open_context()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years ago nfs_open_context doesn't need struct path either
Al Viro [Wed, 22 Jun 2011 22:40:12 +0000 (18:40 -0400)]
nfs_open_context doesn't need struct path either

just dentry, please...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years ago nfs4_opendata doesn't need struct path either
Al Viro [Wed, 22 Jun 2011 22:30:55 +0000 (18:30 -0400)]
nfs4_opendata doesn't need struct path either

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years ago nfs4_closedata doesn't need to mess with struct path
Al Viro [Wed, 22 Jun 2011 22:20:23 +0000 (18:20 -0400)]
nfs4_closedata doesn't need to mess with struct path

instead of path_get()/path_put(), we can just use nfs_sb_{,de}active()
to pin the superblock down.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years ago cifs: fix the type of cifs_demultiplex_thread()
Al Viro [Tue, 21 Jun 2011 12:51:28 +0000 (08:51 -0400)]
cifs: fix the type of cifs_demultiplex_thread()

... and get rid of a bogus typecast, while we are at it; it's not
just that we want a function returning int and not void, but cast
to pointer to function taking void * and returning void would be
(void (*)(void *)) and not (void *)(void *), TYVM...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
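
The point about pointer-to-function syntax in the commit above can be shown in a few lines of standalone C. The names `demultiplex_sketch()`, `thread_fn_t`, `void_fn_t` and `pick_thread_fn()` are hypothetical and exist only for this example.

```c
/* A thread entry point of the shape the commit asks for:
 * takes void *, returns int. */
int demultiplex_sketch(void *arg)
{
    (void)arg;
    return 0;
}

/* Pointer-to-function types need (*) around the declarator. */
typedef int  (*thread_fn_t)(void *); /* function returning int  */
typedef void (*void_fn_t)(void *);   /* function returning void */

/* Once the function itself has the right type, it converts to the
 * pointer type without any cast. */
thread_fn_t pick_thread_fn(void)
{
    return demultiplex_sketch;
}
```

A cast written as `(void *)(void *)` does not name a pointer-to-function type at all, which is why giving the function the correct return type and dropping the cast is the cleaner fix.
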
13 years ago ecryptfs_inode_permission() doesn't need to bail out on RCU
Al Viro [Tue, 21 Jun 2011 05:01:59 +0000 (01:01 -0400)]
ecryptfs_inode_permission() doesn't need to bail out on RCU

... now that inode_permission() can take MAY_NOT_BLOCK and handle it
properly.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years ago kill IPERM_FLAG_RCU
Al Viro [Tue, 21 Jun 2011 05:01:22 +0000 (01:01 -0400)]
kill IPERM_FLAG_RCU

not used anymore

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years ago ->permission() sanitizing: document API changes
Al Viro [Tue, 21 Jun 2011 01:56:31 +0000 (21:56 -0400)]
->permission() sanitizing: document API changes

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years ago merge do_revalidate() into its only caller
Al Viro [Mon, 20 Jun 2011 14:55:26 +0000 (10:55 -0400)]
merge do_revalidate() into its only caller

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years ago no reason to keep exec_permission() separate now
Al Viro [Mon, 20 Jun 2011 23:57:03 +0000 (19:57 -0400)]
no reason to keep exec_permission() separate now

cache footprint alone makes it a bad idea...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years ago massage generic_permission() to treat directories on a separate path
Al Viro [Mon, 20 Jun 2011 23:55:42 +0000 (19:55 -0400)]
massage generic_permission() to treat directories on a separate path

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years ago ->permission() sanitizing: don't pass flags to exec_permission()
Al Viro [Mon, 20 Jun 2011 23:48:41 +0000 (19:48 -0400)]
->permission() sanitizing: don't pass flags to exec_permission()

pass mask instead; kill security_inode_exec_permission() since we can use
security_inode_permission() instead.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years ago selinux: don't transliterate MAY_NOT_BLOCK to IPERM_FLAG_RCU
Al Viro [Mon, 20 Jun 2011 23:44:08 +0000 (19:44 -0400)]
selinux: don't transliterate MAY_NOT_BLOCK to IPERM_FLAG_RCU

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years ago ->permission() sanitizing: don't pass flags to ->inode_permission()
Al Viro [Mon, 20 Jun 2011 23:38:15 +0000 (19:38 -0400)]
->permission() sanitizing: don't pass flags to ->inode_permission()

pass that via mask instead.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years ago ->permission() sanitizing: don't pass flags to ->permission()
Al Viro [Mon, 20 Jun 2011 23:28:19 +0000 (19:28 -0400)]
->permission() sanitizing: don't pass flags to ->permission()

not used by the instances anymore.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years ago ->permission() sanitizing: don't pass flags to generic_permission()
Al Viro [Mon, 20 Jun 2011 23:16:29 +0000 (19:16 -0400)]
->permission() sanitizing: don't pass flags to generic_permission()

redundant; all callers get it duplicated in mask & MAY_NOT_BLOCK and none of
them removes that bit.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years ago ->permission() sanitizing: don't pass flags to ->check_acl()
Al Viro [Mon, 20 Jun 2011 23:12:17 +0000 (19:12 -0400)]
->permission() sanitizing: don't pass flags to ->check_acl()

not used in the instances anymore.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years ago ->permission() sanitizing: pass MAY_NOT_BLOCK to ->check_acl()
Al Viro [Mon, 20 Jun 2011 23:06:22 +0000 (19:06 -0400)]
->permission() sanitizing: pass MAY_NOT_BLOCK to ->check_acl()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years ago ->permission() sanitizing: MAY_NOT_BLOCK
Al Viro [Mon, 20 Jun 2011 22:59:02 +0000 (18:59 -0400)]
->permission() sanitizing: MAY_NOT_BLOCK

Duplicate the flags argument into mask bitmap.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years ago kill check_acl callback of generic_permission()
Al Viro [Mon, 20 Jun 2011 15:31:30 +0000 (11:31 -0400)]
kill check_acl callback of generic_permission()

its value depends only on inode and does not change; we might as
well store it in ->i_op->check_acl and be done with that.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years ago lockless get_write_access/deny_write_access
Al Viro [Mon, 20 Jun 2011 14:52:57 +0000 (10:52 -0400)]
lockless get_write_access/deny_write_access

new helpers: atomic_inc_unless_negative()/atomic_dec_unless_positive()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
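
The two helpers named in the commit above can be approximated in user space with C11 atomics. This is a sketch of the semantics their names suggest, not the kernel implementation: `inc_unless_negative()` and `dec_unless_positive()` are illustrative functions operating on an `atomic_int` rather than the kernel's atomic_t.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Increment *v unless it is negative; true if the increment happened. */
bool inc_unless_negative(atomic_int *v)
{
    int old = atomic_load(v);

    while (old >= 0) {
        if (atomic_compare_exchange_weak(v, &old, old + 1))
            return true;
        /* 'old' now holds the fresh value; the loop re-checks its sign. */
    }
    return false;
}

/* Decrement *v unless it is positive; true if the decrement happened. */
bool dec_unless_positive(atomic_int *v)
{
    int old = atomic_load(v);

    while (old <= 0) {
        if (atomic_compare_exchange_weak(v, &old, old - 1))
            return true;
    }
    return false;
}
```

With a single signed counter, positive values can stand for active writers and negative values for parties denying write access, so each side can check and update the count in one lockless step. A get_write_access()-style caller would presumably turn a false return into an error code; that mapping is an assumption here.
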
13 years ago move exec_permission() up to the rest of permission-related functions
Al Viro [Sun, 19 Jun 2011 17:14:21 +0000 (13:14 -0400)]
move exec_permission() up to the rest of permission-related functions

... and convert the comment before it into linuxdoc form.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years ago kill file_permission() completely
Al Viro [Sun, 19 Jun 2011 16:55:10 +0000 (12:55 -0400)]
kill file_permission() completely

convert the last remaining caller to inode_permission()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years ago consolidate BINPRM_FLAGS_ENFORCE_NONDUMP handling
Al Viro [Sun, 19 Jun 2011 16:49:47 +0000 (12:49 -0400)]
consolidate BINPRM_FLAGS_ENFORCE_NONDUMP handling

new helper: would_dump(bprm, file).  Checks if we are allowed to
read the file and if we are not - sets ENFORCE_NONDUMP.  Exported,
used in places that previously open-coded the same logic.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years ago switch path_init() to exec_permission()
Al Viro [Sun, 19 Jun 2011 15:54:42 +0000 (11:54 -0400)]
switch path_init() to exec_permission()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years ago switch udf_ioctl() to inode_permission()
Al Viro [Sun, 19 Jun 2011 15:49:08 +0000 (11:49 -0400)]
switch udf_ioctl() to inode_permission()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years ago make exec_permission(dir) really equivalent to inode_permission(dir, MAY_EXEC)
Al Viro [Sun, 19 Jun 2011 05:50:08 +0000 (01:50 -0400)]
make exec_permission(dir) really equivalent to inode_permission(dir, MAY_EXEC)

capability overrides apply only to the default case; if fs has ->permission()
that does _not_ call generic_permission(), we have no business doing them.
Moreover, if it has ->permission() that does call generic_permission(), we
have no need to recheck capabilities.

Besides, the capability overrides should apply only if we got -EACCES from
acl_permission_check(); any other value (-EIO, etc.) should be returned
to caller, capabilities or no capabilities.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years ago new helper: iterate_supers_type()
Al Viro [Sat, 4 Jun 2011 00:16:57 +0000 (20:16 -0400)]
new helper: iterate_supers_type()

Call the given function for all superblocks of given type.  Function
gets a superblock (with s_umount locked shared) and (void *) argument
supplied by caller of iterator.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years ago fs: add a DCACHE_NEED_LOOKUP flag for d_flags
Josef Bacik [Tue, 31 May 2011 15:58:49 +0000 (11:58 -0400)]
fs: add a DCACHE_NEED_LOOKUP flag for d_flags

Btrfs (and I'd venture most other fs's) stores its indexes in nice disk order
for readdir, but unfortunately in the case of anything that stats the files in
order that readdir spits back (like oh say ls) that means we still have to do
the normal lookup of the file, which means looking up our other index and then
looking up the inode.  What I want is a way to create dummy dentries when we
find them in readdir so that when ls or anything else subsequently does a
stat(), we already have the location information in the dentry and can go
straight to the inode itself.  The lookup stuff just assumes that if it finds a
dentry it is done, it doesn't perform a lookup.  So add a DCACHE_NEED_LOOKUP
flag so that the lookup code knows it still needs to run i_op->lookup() on the
parent to get the inode for the dentry.  I have tested this with btrfs and I
went from something that looks like this

http://people.redhat.com/jwhiter/ls-noreada.png

To this

http://people.redhat.com/jwhiter/ls-good.png

That's a saving of 1300 seconds, or 22 minutes.  That is a significant saving.
Thanks,

Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years ago Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph...
Linus Torvalds [Wed, 20 Jul 2011 05:10:28 +0000 (22:10 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
  ceph: fix file mode calculation

13 years ago Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/linux-arm-soc
Linus Torvalds [Wed, 20 Jul 2011 05:10:05 +0000 (22:10 -0700)]
Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/linux-arm-soc

* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/linux-arm-soc:
  davinci: DM365 EVM: fix video input mux bits
  ARM: davinci: Check for NULL return from irq_alloc_generic_chip
  arm: davinci: Fix low level gpio irq handlers' argument

13 years ago vmscan: fix a livelock in kswapd
Shaohua Li [Tue, 19 Jul 2011 15:49:26 +0000 (08:49 -0700)]
vmscan: fix a livelock in kswapd

I'm running a workload which triggers a lot of swap in a machine with 4
nodes.  After I killed the workload, I found a kswapd livelock.  Sometimes
kswapd3 or kswapd2 keeps running and I can't access the filesystem, even
though most memory is free.

This looks like a regression since commit 08951e545918c159 ("mm: vmscan:
correct check for kswapd sleeping in sleeping_prematurely").

Nodes 2 and 3 have only ZONE_NORMAL, but balance_pgdat() will return 0
for classzone_idx.  The reason is that end_zone in balance_pgdat() is 0 by
default; if all zones have their watermarks ok, end_zone stays 0.

Later, sleeping_prematurely() always returns true: because this is an
order 3 wakeup and classzone_idx is 0, both balanced_pages and
present_pages in pgdat_balanced() are 0.  We add a special case here:
if a zone has no pages, we consider it balanced.  This fixes the livelock.

Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
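
The special case from the commit above, where zero balanced pages out of zero present pages must count as balanced, can be sketched as below. The function name `pgdat_balanced_sketch()` and the one-quarter threshold are assumptions for illustration; the message itself only states that an empty zone range is treated as balanced.

```c
#include <stdbool.h>

/* Decide whether enough of a node's pages sit in balanced zones.
 * The 25% threshold (present_pages >> 2) is an assumed policy. */
bool pgdat_balanced_sketch(unsigned long balanced_pages,
                           unsigned long present_pages)
{
    /* Using '>=' rather than '>' makes the 0-versus-0 case true, so a
     * node range with no pages is reported as balanced instead of
     * keeping the reclaim thread awake indefinitely. */
    return balanced_pages >= (present_pages >> 2);
}
```

Under a strict greater-than comparison the same inputs would give 0 > 0, which is false on every check, matching the livelock described in the message.
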
13 years ago fs/libfs.c: fix simple_attr_write() on 32bit machines
Akinobu Mita [Tue, 19 Jul 2011 15:49:25 +0000 (08:49 -0700)]
fs/libfs.c: fix simple_attr_write() on 32bit machines

Assume that /sys/kernel/debug/dummy64 is a debugfs file created by
debugfs_create_x64().

# cd /sys/kernel/debug
# echo 0x1234567812345678 > dummy64
# cat dummy64
0x0000000012345678

# echo 0x80000000 > dummy64
# cat dummy64
0xffffffff80000000

A value larger than INT_MAX cannot be written to a debugfs file created
by debugfs_create_u64() or debugfs_create_x64() on a 32bit machine, because
simple_attr_write() uses simple_strtol() for the conversion.

To fix this, use simple_strtoll() instead.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
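
The truncation shown in the commit above can be reproduced on any machine by forcing the parsed value through a 32-bit signed type. This is a user-space sketch: `parse_via_32bit_long()` and `parse_via_64bit()` are hypothetical helpers, and the standard `strtoull()` stands in for the kernel's simple_strtol() and simple_strtoll().

```c
#include <stdint.h>
#include <stdlib.h>

/* Mimic parsing through a 32-bit long and storing the result in a
 * 64-bit attribute: the value is truncated, then sign-extended. */
uint64_t parse_via_32bit_long(const char *s)
{
    uint64_t full = strtoull(s, NULL, 0);
    int32_t narrow = (int32_t)(uint32_t)full; /* keep the low 32 bits */

    return (uint64_t)(int64_t)narrow;         /* sign extension */
}

/* Mimic the fixed path: parse at full 64-bit width. */
uint64_t parse_via_64bit(const char *s)
{
    return strtoull(s, NULL, 0);
}
```

Feeding "0x1234567812345678" and "0x80000000" to the first function yields 0x0000000012345678 and 0xffffffff80000000, the two values reported in the message, while the second function returns both inputs unchanged.
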
13 years ago Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
Linus Torvalds [Wed, 20 Jul 2011 04:50:21 +0000 (21:50 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
  vfs: fix race in rcu lookup of pruned dentry
  Fix cifs_get_root()

[ Edited the last commit to get rid of a 'unused variable "seq"'
  warning due to Al editing the patch.  - Linus ]

13 years ago vfs: fix race in rcu lookup of pruned dentry
Linus Torvalds [Mon, 18 Jul 2011 22:43:29 +0000 (15:43 -0700)]
vfs: fix race in rcu lookup of pruned dentry

Don't update *inode in __follow_mount_rcu() until we've verified that
there is a mountpoint there.  Kudos to Hugh Dickins for catching that
one in the first place and eventually figuring out the solution (and
catching a braino in the earlier version of patch).

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years ago ceph: fix file mode calculation
Sage Weil [Tue, 19 Jul 2011 18:25:04 +0000 (11:25 -0700)]
ceph: fix file mode calculation

open(2) must always include one of O_RDONLY, O_WRONLY, or O_RDWR.  No need
for any O_APPEND special case.

Passing O_WRONLY|O_RDWR is undefined according to the man page, but the
Linux VFS interprets this as O_RDWR, so we'll do the same.

This fixes open(2) with flags O_RDWR|O_APPEND, which was incorrectly being
translated to readonly.

Reported-by: Fyodor Ustinov <ufm@ufm.su>
Signed-off-by: Sage Weil <sage@newdream.net>
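
The access-mode rule from the commit above amounts to looking only at the `O_ACCMODE` bits of the open flags. The following is a minimal sketch assuming Linux flag values; `enum file_mode_sketch` and `flags_to_mode_sketch()` are hypothetical names, not the ceph client's own.

```c
#include <fcntl.h>

/* Illustrative mode values; not the filesystem's real constants. */
enum file_mode_sketch { MODE_RD = 1, MODE_WR = 2, MODE_RDWR = 3 };

/* Map open(2) flags to a file mode using only the access-mode bits,
 * so O_APPEND and other flags cannot change the outcome. */
enum file_mode_sketch flags_to_mode_sketch(int flags)
{
    switch (flags & O_ACCMODE) {
    case O_RDONLY:
        return MODE_RD;
    case O_WRONLY:
        return MODE_WR;
    default:
        /* O_RDWR, plus the undefined O_WRONLY|O_RDWR combination,
         * which is treated as read-write. */
        return MODE_RDWR;
    }
}
```

With this mapping, `O_RDWR | O_APPEND` comes out as read-write, which is the case the message says was previously translated to read-only.
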
13 years ago davinci: DM365 EVM: fix video input mux bits
Jon Povey [Tue, 19 Jul 2011 03:30:11 +0000 (12:30 +0900)]
davinci: DM365 EVM: fix video input mux bits

Video input mux settings for tvp7002 and imager inputs were swapped.
Comment was correct.

Tested on EVM with tvp7002 input.

Signed-off-by: Jon Povey <jon.povey@racelogic.co.uk>
Acked-by: Manjunath Hadli <manjunath.hadli@ti.com>
Cc: stable@kernel.org
Signed-off-by: Sekhar Nori <nsekhar@ti.com>
13 years ago ARM: davinci: Check for NULL return from irq_alloc_generic_chip
Todd Poynor [Sun, 17 Jul 2011 05:39:35 +0000 (22:39 -0700)]
ARM: davinci: Check for NULL return from irq_alloc_generic_chip

Avoid NULL dereference of irq_alloc_generic_chip return in low
memory conditions.

Signed-off-by: Todd Poynor <toddpoynor@google.com>
Signed-off-by: Sekhar Nori <nsekhar@ti.com>
13 years ago Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
Linus Torvalds [Mon, 18 Jul 2011 20:29:26 +0000 (13:29 -0700)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
  pppoe: Must flush connections when MAC address changes too.
  include/linux/sdla.h: remove the prototype of sdla()
  tulip: dmfe: Remove old log spamming pr_debugs

13 years ago pppoe: Must flush connections when MAC address changes too.
David S. Miller [Mon, 18 Jul 2011 18:48:28 +0000 (11:48 -0700)]
pppoe: Must flush connections when MAC address changes too.

Kernel bugzilla: 39252

Signed-off-by: David S. Miller <davem@davemloft.net>
13 years ago include/linux/sdla.h: remove the prototype of sdla()
WANG Cong [Sat, 16 Jul 2011 22:22:20 +0000 (22:22 +0000)]
include/linux/sdla.h: remove the prototype of sdla()

`make headers_check` complains that

linux-2.6/usr/include/linux/sdla.h:116: userspace cannot reference
function or variable defined in the kernel

This is because there is no such kernel function:

void sdla(void *cfg_info, char *dev, struct frad_conf *conf, int quiet);

I don't know why we have it in a kernel header, so remove it.

Signed-off-by: WANG Cong <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years ago Fix cifs_get_root()
Al Viro [Mon, 18 Jul 2011 17:50:40 +0000 (13:50 -0400)]
Fix cifs_get_root()

Add missing ->i_mutex, convert to lookup_one_len() instead of
(broken) open-coded analog, cope with getting something like
a//b as relative pathname.  Simplify the hell out of it, while
we are there...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
13 years ago tulip: dmfe: Remove old log spamming pr_debugs
Joe Perches [Mon, 18 Jul 2011 17:44:44 +0000 (10:44 -0700)]
tulip: dmfe: Remove old log spamming pr_debugs

Commit 726b65ad444d ("tulip: Convert uses of KERN_DEBUG") enabled
some old previously inactive uses of pr_debug converted by
commit dde7c8ef1679 ("tulip/dmfe.c: Use dev_<level> and pr_<level>").

Remove these pr_debugs.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years ago si4713-i2c: avoid potential buffer overflow on si4713
Mauro Carvalho Chehab [Sun, 17 Jul 2011 03:24:37 +0000 (00:24 -0300)]
si4713-i2c: avoid potential buffer overflow on si4713

While compiling it with Fedora 15, I noticed this issue:

  inlined from ‘si4713_write_econtrol_string’ at drivers/media/radio/si4713-i2c.c:1065:24:
  arch/x86/include/asm/uaccess_32.h:211:26: error: call to ‘copy_from_user_overflow’ declared with attribute error: copy_from_user() buffer size is not provably correct

Cc: stable@kernel.org
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Acked-by: Sakari Ailus <sakari.ailus@maxwell.research.nokia.com>
Acked-by: Eduardo Valentin <edubezval@gmail.com>
Reviewed-by: Eugene Teo <eugeneteo@kernel.sg>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
13 years agoMerge branch 'fix/asoc' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6
Linus Torvalds [Mon, 18 Jul 2011 16:05:59 +0000 (09:05 -0700)]
Merge branch 'fix/asoc' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6

* 'fix/asoc' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
  ASoC: Correct WM8994 MICBIAS supply widget hookup
  ASoC: Fix shift in WM8958 accessory detection default implementation
  ASoC: sh: fsi-hdmi: fixup snd_soc_card name
  ASoC: sh: fsi-da7210: fixup snd_soc_card name
  ASoC: sh: fsi-ak4642: fixup snd_soc_card name

13 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
Linus Torvalds [Mon, 18 Jul 2011 16:03:15 +0000 (09:03 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
  hppfs_lookup(): don't open-code lookup_one_len()
  hppfs: fix dentry leak
  cramfs: get_cramfs_inode() returns ERR_PTR() on failure
  ufs should use d_splice_alias()
  fix exofs ->get_parent()
  ceph analog of cifs build_path_from_dentry() race fix
  cifs: build_path_from_dentry() race fix

13 years agoMerge branch 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelv...
Linus Torvalds [Mon, 18 Jul 2011 16:02:58 +0000 (09:02 -0700)]
Merge branch 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging

* 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
  hwmon: (max1111) Fix race condition causing NULL pointer exception
  hwmon: (it87) Fix label group removal
  hwmon: (asus_atk0110) Fix memory leak

13 years agohppfs_lookup(): don't open-code lookup_one_len()
Al Viro [Mon, 18 Jul 2011 02:27:22 +0000 (22:27 -0400)]
hppfs_lookup(): don't open-code lookup_one_len()

... and it's getting it wrong, too - missing ->d_revalidate() calls when
it's dealing with filesystem (procfs) that has non-trivial ->d_revalidate()...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years agohppfs: fix dentry leak
Al Viro [Mon, 18 Jul 2011 02:24:15 +0000 (22:24 -0400)]
hppfs: fix dentry leak

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years agocramfs: get_cramfs_inode() returns ERR_PTR() on failure
Al Viro [Sun, 17 Jul 2011 23:04:14 +0000 (19:04 -0400)]
cramfs: get_cramfs_inode() returns ERR_PTR() on failure

... and we want to report these failures in ->lookup() anyway.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
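
A userspace sketch of the ERR_PTR() convention this commit relies on, under the assumption that error codes are small negative integers encoded in the top of the pointer range. The helpers `err_ptr()`, `ptr_err()` and `is_err()` are simplified stand-ins for the kernel macros, and `get_entry()` and `read_entry()` are hypothetical; the sketch only shows why the caller must test for an encoded error rather than for NULL, and how it reports the failure upward.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
static void *err_ptr(long err)
{
    return (void *)(intptr_t)err;
}

/* Recover the errno value from an encoded pointer. */
static long ptr_err(const void *p)
{
    return (long)(intptr_t)p;
}

/* True if the pointer lies in the range reserved for error codes. */
static int is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

static int table[4] = { 10, 20, 30, 40 };

/* Hypothetical getter: returns a valid pointer or an encoded error, never NULL. */
static int *get_entry(int index)
{
    if (index < 0 || index >= 4)
        return err_ptr(-ENOENT);
    return &table[index];
}

/* Caller in the role of ->lookup(): propagate the failure instead of ignoring it. */
static long read_entry(int index, int *out)
{
    int *entry = get_entry(index);

    if (is_err(entry))
        return ptr_err(entry);
    *out = *entry;
    return 0;
}
```

A NULL check on the result of `get_entry()` would never fire, so the failure would be dereferenced instead of reported.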
13 years agoufs should use d_splice_alias()
Al Viro [Sun, 17 Jul 2011 14:07:34 +0000 (10:07 -0400)]
ufs should use d_splice_alias()

it's NFS-exportable, so...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years agofix exofs ->get_parent()
Al Viro [Sat, 9 Jul 2011 00:56:55 +0000 (20:56 -0400)]
fix exofs ->get_parent()

NULL is not a possible return value for that method, TYVM...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6
Linus Torvalds [Sun, 17 Jul 2011 19:49:55 +0000 (12:49 -0700)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
  [CIFS] update cifs to version 1.74
  [CIFS] update limit for snprintf in cifs_construct_tcon
  cifs: Fix signing failure when server mandates signing for NTLMSSP

13 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
Linus Torvalds [Sun, 17 Jul 2011 19:49:28 +0000 (12:49 -0700)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
  Bluetooth: Fix crash with incoming L2CAP connections
  Bluetooth: Fix regression in L2CAP connection procedure
  gianfar: rx parser
  r6040: only disable RX interrupt if napi_schedule_prep is successful
  net: remove NETIF_F_ALL_TX_OFFLOADS
  net: sctp: fix checksum marking for outgoing packets

13 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/wim/linux-2.6-watchdog
Linus Torvalds [Sun, 17 Jul 2011 19:48:52 +0000 (12:48 -0700)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/wim/linux-2.6-watchdog

* git://git.kernel.org/pub/scm/linux/kernel/git/wim/linux-2.6-watchdog:
  watchdog: hpwdt depends on PCI
  watchdog: fix hpwdt Kconfig regression in 3.0-rc

13 years agoMerge branch 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab...
Linus Torvalds [Sun, 17 Jul 2011 19:48:18 +0000 (12:48 -0700)]
Merge branch 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6

* 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6:
  [media] tuner-core: fix a 2.6.39 regression with mt20xx
  [media] dvb_frontend: fix race condition in stopping/starting frontend
  [media] media: fix radio-sf16fmr2 build when SND is not enabled
  [media] MEDIA: Fix non-ISA_DMA_API link failure of sound code
  [media] nuvoton-cir: make idle timeout more sane
  [media] mceusb: increase default timeout to 100ms
  [media] mceusb: Timeout unit corrections
  [media] Revert "V4L/DVB: cx23885: Enable Message Signaled Interrupts(MSI)"

13 years agoMerge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux...
Linus Torvalds [Sun, 17 Jul 2011 19:47:47 +0000 (12:47 -0700)]
Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6

* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6:
  ACPI: Fixes device power states array overflow
  ACPI, APEI, HEST, Detect duplicated hardware error source ID
  ACPI: Fix lockdep false positives in acpi_power_off()

13 years agoMerge branch 'pm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspe...
Linus Torvalds [Sun, 17 Jul 2011 19:47:27 +0000 (12:47 -0700)]
Merge branch 'pm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6

* 'pm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6:
  PM / MIPS: Convert i8259.c to using syscore_ops

13 years agoMerge branch 's5p-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 17 Jul 2011 19:47:11 +0000 (12:47 -0700)]
Merge branch 's5p-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/kgene/linux-samsung

* 's5p-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/kgene/linux-samsung:
  ARM: SAMSUNG: DMA Cleanup as per sparse
  ARM: SAMSUNG: Check NULL return from irq_alloc_generic_chip

13 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6
Linus Torvalds [Sun, 17 Jul 2011 19:43:58 +0000 (12:43 -0700)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6:
  sparc: sun4m SMP: fix wrong shift instruction in IPI handler
  sparc32,leon: Added __init declaration to leon_flush_needed()
  sparc/irqs: Do not trace arch_local_{*,irq_*} functions

13 years agohwmon: (max1111) Fix race condition causing NULL pointer exception
Pavel Herrmann [Sun, 17 Jul 2011 16:39:19 +0000 (18:39 +0200)]
hwmon: (max1111) Fix race condition causing NULL pointer exception

The spi_sync call uses its spi_message parameter to keep completion
information, so using a shared drvdata structure for it is not thread-safe.
Use a mutex to prevent concurrent access to the shared driver data.

Signed-off-by: Pavel Herrmann <morpheus.ibis@gmail.com>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Pavel Machek <pavel@ucw.cz>
Acked-by: Marek Vasut <marek.vasut@gmail.com>
Acked-by: Cyril Hrubis <metan@ucw.cz>
Tested-by: Stanislav Brabec <utx@penguin.cz>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: stable@kernel.org
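
A minimal userspace sketch of the locking pattern described above, using a pthread mutex in place of the kernel mutex. The structure `sensor_data` and the functions `sensor_init()`, `fake_transfer()` and `sensor_read()` are hypothetical and do not mirror the max1111 driver; `fake_transfer()` merely stands in for spi_sync() by echoing the request. The idea shown is that the shared buffers are only touched between lock and unlock.

```c
#include <pthread.h>
#include <stdint.h>

/* Hypothetical per-device data shared by all callers. */
struct sensor_data {
    pthread_mutex_t lock;   /* serialises use of the buffers below */
    uint8_t tx_buf[1];
    uint8_t rx_buf[1];
};

static void sensor_init(struct sensor_data *d)
{
    pthread_mutex_init(&d->lock, NULL);
    d->tx_buf[0] = 0;
    d->rx_buf[0] = 0;
}

/* Stand-in for the bus transfer: echoes the request into the reply. */
static void fake_transfer(struct sensor_data *d)
{
    d->rx_buf[0] = d->tx_buf[0];
}

/* Read one channel; the shared buffers are used only under the lock. */
static int sensor_read(struct sensor_data *d, uint8_t channel)
{
    int value;

    pthread_mutex_lock(&d->lock);
    d->tx_buf[0] = channel;
    fake_transfer(d);
    value = d->rx_buf[0];
    pthread_mutex_unlock(&d->lock);

    return value;
}
```

Without the lock, two concurrent readers could overwrite each other's request between filling `tx_buf` and reading `rx_buf`.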
13 years agohwmon: (it87) Fix label group removal
Jean Delvare [Sun, 17 Jul 2011 16:39:19 +0000 (18:39 +0200)]
hwmon: (it87) Fix label group removal

A copy-and-paste error caused it87_attributes_vid to be referenced
where it87_attributes_label should be. Thankfully the group is only
used for attribute removal, not attribute creation, so the effects of
this bug are limited, but let's fix it still.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: stable@kernel.org
Acked-by: Guenter Roeck <guenter.roeck@ericsson.com>
13 years agohwmon: (asus_atk0110) Fix memory leak
Luca Tettamanti [Sun, 17 Jul 2011 16:39:18 +0000 (18:39 +0200)]
hwmon: (asus_atk0110) Fix memory leak

The object returned by atk_gitm is dynamically allocated and must be
freed.

Signed-off-by: Luca Tettamanti <kronos.it@gmail.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: stable@kernel.org
13 years agowatchdog: hpwdt depends on PCI
Randy Dunlap [Sat, 16 Jul 2011 19:25:49 +0000 (12:25 -0700)]
watchdog: hpwdt depends on PCI

hpwdt is a PCI driver so it should depend on PCI.
Fixes these build errors:

drivers/watchdog/hpwdt.c:762: error: implicit declaration of function 'pci_iomap'
drivers/watchdog/hpwdt.c:762: warning: assignment makes pointer from integer without a cast
drivers/watchdog/hpwdt.c:797: error: implicit declaration of function 'pci_iounmap'

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Cc: Thomas Mingarelli <thomas.mingarelli@hp.com>
13 years agoASoC: Correct WM8994 MICBIAS supply widget hookup
Mark Brown [Thu, 14 Jul 2011 09:21:37 +0000 (18:21 +0900)]
ASoC: Correct WM8994 MICBIAS supply widget hookup

The WM8994 and WM8958 series of devices have two MICBIAS supplies rather
than one; the current widget actually manages the microphone detection
control register bit (which is managed separately by the relevant API).

Fix this, hooking the relevant supplies up to the MICBIAS1 and MICBIAS2
widgets.

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Cc: stable@kernel.org
13 years agoceph analog of cifs build_path_from_dentry() race fix
Al Viro [Sun, 17 Jul 2011 03:43:58 +0000 (23:43 -0400)]
ceph analog of cifs build_path_from_dentry() race fix

... unfortunately, cifs bug got copied.  Fix is essentially the same.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years agocifs: build_path_from_dentry() race fix
Al Viro [Sun, 17 Jul 2011 03:37:20 +0000 (23:37 -0400)]
cifs: build_path_from_dentry() race fix

deal with d_move() races properly; rename_lock read-retry loop,
rcu_read_lock() held while walking to root, d_lock held over
subtraction from namelen and copying the component to stabilize
->d_name.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
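
A userspace sketch of a sequence-counter read-retry loop, the general pattern behind the rename_lock loop mentioned above. It uses C11 atomics rather than the kernel seqlock API, assumes a single writer, and does not model the rcu_read_lock() or d_lock parts of the fix. The names `seq_pair`, `seq_pair_init()`, `seq_pair_write()` and `seq_pair_read()` are hypothetical.

```c
#include <stdatomic.h>

/* Two values that must be observed together, plus a sequence counter. */
struct seq_pair {
    atomic_uint seq;    /* odd while a writer is in progress */
    atomic_int first;
    atomic_int second;
};

static void seq_pair_init(struct seq_pair *p, int first, int second)
{
    atomic_init(&p->seq, 0);
    atomic_init(&p->first, first);
    atomic_init(&p->second, second);
}

/* Writer (single writer assumed): bump the counter around the update. */
static void seq_pair_write(struct seq_pair *p, int first, int second)
{
    atomic_fetch_add(&p->seq, 1);       /* counter becomes odd */
    atomic_store(&p->first, first);
    atomic_store(&p->second, second);
    atomic_fetch_add(&p->seq, 1);       /* counter becomes even again */
}

/* Reader: retry until a snapshot is taken with no writer in between. */
static void seq_pair_read(struct seq_pair *p, int *first, int *second)
{
    unsigned int start;

    do {
        start = atomic_load(&p->seq);
        *first = atomic_load(&p->first);
        *second = atomic_load(&p->second);
    } while ((start & 1u) || atomic_load(&p->seq) != start);
}
```

The reader never blocks the writer; it simply discards any snapshot that overlapped an update and tries again.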