Jaegeuk Kim [Wed, 2 Apr 2014 06:34:36 +0000 (15:34 +0900)]
f2fs: introduce f2fs_issue_flush to avoid redundant flush issue
Some storage devices show relatively high latencies to complete cache_flush
commands, even though their normal IO speed is prettry much high. In such
the case, it needs to merge cache_flush commands as much as possible to avoid
issuing them redundantly.
So, this patch introduces a mount option, "-o flush_merge", to mitigate such
the overhead.
If this option is enabled by user, F2FS merges the cache_flush commands and then
issues just one cache_flush on behalf of them. Once the single command is
finished, F2FS sends a completion signal to all the pending threads.
Note that, this option can be used under a workload consisting of very intensive
concurrent fsync calls, while the storage handles cache_flush commands slowly.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Jaegeuk Kim [Wed, 2 Apr 2014 00:04:42 +0000 (09:04 +0900)]
f2fs: fix to cover io->bio with io_rwsem
In the f2fs_wait_on_page_writeback, io->bio should be covered by io_rwsem.
Otherwise, the bio pointer can become a dangling pointer due to data races.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Chao Yu [Sat, 29 Mar 2014 07:30:40 +0000 (15:30 +0800)]
f2fs: fix error path when fail to read inline data
We should unlock page in ->readpage() path and also should unlock & release page
in error path of ->write_begin() to avoid deadlock or memory leak.
So let's add release code to fix the problem when we fail to read inline data.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Chao Yu [Sat, 29 Mar 2014 03:33:17 +0000 (11:33 +0800)]
f2fs: use list_for_each_entry{_safe} for simplyfying code
This patch use list_for_each_entry{_safe} instead of list_for_each{_safe} for
simplfying code.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Chao Yu [Wed, 2 Apr 2014 00:55:00 +0000 (08:55 +0800)]
f2fs: avoid free slab cache under spinlock
Move kmem_cache_free out of spinlock protection region for better performance.
Change log from v1:
o remove spinlock protection for kmem_cache_free in destroy_node_manager
suggested by Jaegeuk Kim.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Chao Yu [Sat, 22 Mar 2014 06:59:45 +0000 (14:59 +0800)]
f2fs: avoid unneeded lookup when xattr name length is too long
In f2fs_setxattr we have limit this attribute name length, so we should also
check it in f2fs_getxattr to avoid useless lookup caused by invalid name length.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Chao Yu [Sat, 22 Mar 2014 06:57:23 +0000 (14:57 +0800)]
f2fs: avoid unnecessary bio submit when wait page writeback
This patch introduce is_merged_page() to check whether current page is merged
in f2fs bio cache. When page is not in cache, we can avoid submitting bio cache,
resulting in having more chance to merge pages.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Jaegeuk Kim [Tue, 1 Apr 2014 08:38:26 +0000 (17:38 +0900)]
f2fs: return -EIO when node id is not matched
During the cleaing of node segments, F2FS can get errored node blocks due to
data race between node page lock and its valid bitmap operations.
In that case, it needs to return an error to skip such the obsolete block copy.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Jaegeuk Kim [Thu, 20 Mar 2014 13:21:08 +0000 (22:21 +0900)]
f2fs: avoid RECLAIM_FS-ON-W warning
This patch should resolve the following possible bug.
RECLAIM_FS-ON-W at:
mark_held_locks+0xb9/0x140
lockdep_trace_alloc+0x85/0xf0
__kmalloc+0x53/0x1d0
read_all_xattrs+0x3d1/0x3f0 [f2fs]
f2fs_getxattr+0x4f/0x100 [f2fs]
f2fs_get_acl+0x4c/0x290 [f2fs]
get_acl+0x4f/0x80
posix_acl_create+0x72/0x180
f2fs_init_acl+0x29/0xcc [f2fs]
__f2fs_add_link+0x259/0x710 [f2fs]
f2fs_create+0xad/0x1c0 [f2fs]
vfs_create+0xed/0x150
do_last+0xd36/0xed0
path_openat+0xc5/0x680
do_filp_open+0x43/0xa0
do_sys_open+0x13c/0x230
SyS_creat+0x1e/0x20
system_call_fastpath+0x16/0x1b
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Jaegeuk Kim [Thu, 20 Mar 2014 12:52:53 +0000 (21:52 +0900)]
f2fs: skip unnecessary node writes during fsync
If multiple redundant fsync calls are triggered, we don't need to write its
node pages with fsync mark continuously.
So, this patch adds FI_NEED_FSYNC to track whether the latest node block is
written with the fsync mark or not.
If the mark was set, a new fsync doesn't need to write a node block.
Otherwise, we should do a new node block with the mark for roll-forward
recovery.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Jaegeuk Kim [Thu, 20 Mar 2014 10:10:08 +0000 (19:10 +0900)]
f2fs: introduce fi->i_sem to protect fi's info
This patch introduces fi->i_sem to protect fi's info that includes xattr_ver,
pino, i_nlink.
This enables to remove i_mutex during f2fs_sync_file, resulting in performance
improvement when a number of fsync calls are triggered from many concurrent
threads.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Jaegeuk Kim [Wed, 19 Mar 2014 05:17:21 +0000 (14:17 +0900)]
f2fs: change reclaim rate in percentage
It is more reasonable to determine the reclaiming rate of prefree segments
according to the volume size, which is set to 5% by default.
For example, if the volume is 128GB, the prefree segments are reclaimed
when the number reaches to 6.4GB.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Jaegeuk Kim [Wed, 19 Mar 2014 04:40:09 +0000 (13:40 +0900)]
f2fs: add missing documentation for dir_level
This patch adds missing dir_level documentation.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Jaegeuk Kim [Wed, 19 Mar 2014 04:45:52 +0000 (13:45 +0900)]
f2fs: remove unnecessary threshold
The NM_WOUT_THRESHOLD is now obsolete since f2fs starts to control on a basis
of the memory footprint.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Jaegeuk Kim [Wed, 19 Mar 2014 04:31:37 +0000 (13:31 +0900)]
f2fs: throttle the memory footprint with a sysfs entry
This patch introduces ram_thresh, a sysfs entry, which controls the memory
footprint used by the free nid list and the nat cache.
Previously, the free nid list was controlled by MAX_FREE_NIDS, while the nat
cache was managed by NM_WOUT_THRESHOLD.
However, this approach cannot be applied dynamically according to the system.
So, this patch adds ram_thresh that users can specify the threshold, which is
in order of 1 / 1024.
For example, if the total ram size is 4GB and the value is set to 10 by default,
f2fs tries to control the number of free nids and nat caches not to consume over
10 * (4GB / 1024) = 10MB.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Jaegeuk Kim [Wed, 19 Mar 2014 01:43:59 +0000 (10:43 +0900)]
f2fs: avoid to drop nat entries due to the negative nr_shrink
The try_to_free_nats should not receive the negative nr_shrink.
Otherwise, it can drop all the nat entries by the while loop.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Jaegeuk Kim [Tue, 18 Mar 2014 04:29:07 +0000 (13:29 +0900)]
f2fs: call f2fs_wait_on_page_writeback instead of native function
If a page is on writeback, f2fs can face with deadlock due to under writepages.
This is caused by merging IOs inside f2fs, so if it comes to detect, let's throw
merged IOs, which is implemented by f2fs_wait_on_page_writeback.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Jaegeuk Kim [Tue, 18 Mar 2014 04:47:11 +0000 (13:47 +0900)]
f2fs: introduce nr_pages_to_write for segment alignment
This patch introduces nr_pages_to_write to align page writes to the segment
or other operational unit size, which can be tuned according to the system
environment.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Jaegeuk Kim [Tue, 18 Mar 2014 04:43:05 +0000 (13:43 +0900)]
f2fs: increase pages_skipped when skipping writepages
This patch increases pages_skipped when skipping writepages.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Jaegeuk Kim [Tue, 18 Mar 2014 03:40:49 +0000 (12:40 +0900)]
f2fs: avoid small data writes by skipping writepages
This patch introduces nr_pages_to_skip(sbi, type) to determine writepages can
be skipped.
The dentry, node, and meta pages can be conrolled by F2FS without breaking the
FS consistency.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Jaegeuk Kim [Tue, 18 Mar 2014 03:33:06 +0000 (12:33 +0900)]
f2fs: introduce get_dirty_dents for readability
The get_dirty_dents gives us the number of dirty dentry pages.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Chao Yu [Tue, 18 Mar 2014 01:07:59 +0000 (09:07 +0800)]
f2fs: fix incorrect parsing with option string
Previously 'background_gc={on***,off***}' is being parsed as correct option,
with this patch we cloud fix the trivial bug in mount process.
Change log from v1:
o need to check length of parameter suggested by Jaegeuk Kim.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Chao Yu [Mon, 17 Mar 2014 08:36:24 +0000 (16:36 +0800)]
f2fs: avoid to return incorrect errno of read_normal_summaries
We should return error number of read_normal_summaries instead of -EINVAL when
read_normal_summaries failed.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Chao Yu [Mon, 17 Mar 2014 08:35:06 +0000 (16:35 +0800)]
f2fs: introduce f2fs_has_xattr_block for better readability
This patch introduces a help function f2fs_has_xattr_block for better
readability.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Chao Yu [Mon, 17 Mar 2014 02:31:06 +0000 (10:31 +0800)]
f2fs: print type for each segment in segment_info's show
The original segment_info's show looks out-of-format:
cat /proc/fs/f2fs/loop0/segment_info
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 512
512 512 512 512 512 512 512 0 0 512
348 0 263 0 0 512 0 0 512 512
512 512 0 512 512 512 512 512 512 512
512 512 511 328 512 512 512 512 512 512
512 512 512 512 512 512 512 0 0 175
Let's fix this and show type for each segment.
cat /proc/fs/f2fs/loop0/segment_info
format: segment_type|valid_blocks
segment_type(0:HD, 1:WD, 2:CD, 3:HN, 4:WN, 5:CN)
0 2|0 1|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0
10 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0
20 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0
30 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0
40 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0
50 3|0 3|0 3|0 3|0 3|0 3|0 3|0 0|0 3|0 3|0
60 3|0 3|0 3|0 3|0 3|0 3|0 3|0 3|0 3|0 3|512
70 3|512 3|512 3|512 3|512 3|512 3|512 3|512 3|0 3|0 3|512
80 3|0 3|0 3|0 3|0 3|0 3|512 3|0 3|0 3|512 3|512
90 3|512 0|512 3|274 0|512 0|512 0|512 0|512 0|512 0|512 3|512
100 3|512 0|512 3|511 0|328 3|512 0|512 0|512 3|512 0|512 0|512
110 0|512 0|512 0|512 0|512 0|512 0|512 0|512 5|0 4|0 3|512
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Chao Yu [Wed, 12 Mar 2014 09:08:36 +0000 (17:08 +0800)]
f2fs: check upper bound of ino value in f2fs_nfs_get_inode
Upper bound checking of ino should be added to f2fs_nfs_get_inode, so unneeded
process before do_read_inode in f2fs_iget could be avoided when ino is invalid.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Chao Yu [Wed, 12 Mar 2014 07:59:03 +0000 (15:59 +0800)]
f2fs: introduce f2fs_has_inline_xattr for better readability
This patch introduces a help function f2fs_has_inline_xattr for better
readability.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Chao Yu [Tue, 11 Mar 2014 05:37:38 +0000 (13:37 +0800)]
f2fs: recover inline xattr data in roll-forward process
Previously we do not recover inline xattr data of inode after power-cut, so
inline xattr data may be lost.
We should recover the data during the roll-forward process.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Gu Zheng [Fri, 7 Mar 2014 10:43:36 +0000 (18:43 +0800)]
f2fs: optimize restore_node_summary slightly
Previously, we ra_sum_pages to pre-read contiguous pages as more
as possible, and if we fail to alloc more pages, an ENOMEM error
will be reported upstream, even though we have alloced some pages
yet. In fact, we can use the available pages to do the job partly,
and continue the rest in the following circle. Only reporting ENOMEM
upstream if we really can not alloc any available page.
And another fix is ignoring dealing with the following pages if an
EIO occurs when reading page from page_list.
Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Reviewed-by: Chao Yu <chao2.yu@samsung.com>
[Jaegeuk Kim: modify the flow for better neat code]
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Gu Zheng [Fri, 7 Mar 2014 10:43:33 +0000 (18:43 +0800)]
f2fs: format segment_info's show for better legibility
The original segment_info's show is a bit out-of-format:
[root@guz Demoes]# cat /proc/fs/f2fs/loop0/segment_info
0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
......
0 0 0 0 0 0 0 0 0 0
0 0 1 0 0 1 [root@guz Demoes]#
so we fix it here for better legibility.
[root@guz Demoes]# cat /proc/fs/f2fs/loop0/segment_info
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
......
0 0 0 0 0 0 0 0 0 0
0 0 1 0 0 1
[root@guz Demoes]#
Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Gu Zheng [Fri, 7 Mar 2014 10:43:28 +0000 (18:43 +0800)]
f2fs: remove the unused ctor argument of f2fs_kmem_cache_create()
Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Gu Zheng [Fri, 7 Mar 2014 10:43:24 +0000 (18:43 +0800)]
f2fs: update start nid only once each circle
Integrated a couple of minor changes for better readability suggested by
Chao Yu.
Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Reviewed-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Jaegeuk Kim [Wed, 5 Mar 2014 01:48:25 +0000 (10:48 +0900)]
f2fs: fix wrong kernel coding style
This patch includes a simple fix to adjust coding style.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Jaegeuk Kim [Mon, 3 Mar 2014 02:28:40 +0000 (11:28 +0900)]
f2fs: fix to write node pages with WRITE_SYNC
This patch fixes performance regression of dbench reported by
Alex <hbx7d@yandex.com>.
This issue was revealed by Phoronix tests results:
http://www.phoronix.com/scan.php?page=article&item=linux_314_ssdfs&num=2
It turns out that we need to assign WRITE_SYNC to the node writes, if
fsync is triggered.
The performance numbers are like below, which is measured by Alex.
1. 355MB/s ext4
2. 225MB/s f2fs : WRITE for node writes
3. 525MB/s f2fs : WRITE_SYNC for node writes
Reported-And-Tested-by: Alex <hbx7d@yandex.com>.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Chao Yu [Fri, 28 Feb 2014 02:12:05 +0000 (10:12 +0800)]
f2fs: fix dirty page accounting when redirty
We should de-account dirty counters for page when redirty in ->writepage().
Wu Fengguang described in 'commit
971767caf632190f77a40b4011c19948232eed75':
"writeback: fix dirtied pages accounting on redirty
De-account the accumulative dirty counters on page redirty.
Page redirties (very common in ext4) will introduce mismatch between
counters (a) and (b)
a) NR_DIRTIED, BDI_DIRTIED, tsk->nr_dirtied
b) NR_WRITTEN, BDI_WRITTEN
This will introduce systematic errors in balanced_rate and result in
dirty page position errors (ie. the dirty pages are no longer balanced
around the global/bdi setpoints)."
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Chao Yu [Thu, 27 Feb 2014 11:52:21 +0000 (19:52 +0800)]
f2fs: use existing macro to clean up some codes
This patch use existing macro F2FS_INODE/NEXT_FREE_BLKADDR to clean up some
codes.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Chao Yu [Thu, 27 Feb 2014 11:12:24 +0000 (19:12 +0800)]
f2fs: readahead contiguous SSA blocks for f2fs_gc
If there are multi segments in one section, we will read those SSA blocks which
have contiguous address one by one in f2fs_gc. It may lost performance, let's
read ahead SSA blocks by merge multi read request.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Jaegeuk Kim [Thu, 27 Feb 2014 11:09:05 +0000 (20:09 +0900)]
f2fs: add an sysfs entry to control the directory level
This patch adds an sysfs entry to control dir_level used by the large directory.
The description of this entry is:
dir_level This parameter controls the directory level to
support large directory. If a directory has a
number of files, it can reduce the file lookup
latency by increasing this dir_level value.
Otherwise, it needs to decrease this value to
reduce the space overhead. The default value is 0.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Jaegeuk Kim [Thu, 27 Feb 2014 09:20:00 +0000 (18:20 +0900)]
f2fs: introduce large directory support
This patch introduces an i_dir_level field to support large directory.
Previously, f2fs maintains multi-level hash tables to find a dentry quickly
from a bunch of chiild dentries in a directory, and the hash tables consist of
the following tree structure as below.
In Documentation/filesystems/f2fs.txt,
----------------------
A : bucket
B : block
N : MAX_DIR_HASH_DEPTH
----------------------
level #0 | A(2B)
|
level #1 | A(2B) - A(2B)
|
level #2 | A(2B) - A(2B) - A(2B) - A(2B)
. | . . . .
level #N/2 | A(2B) - A(2B) - A(2B) - A(2B) - A(2B) - ... - A(2B)
. | . . . .
level #N | A(4B) - A(4B) - A(4B) - A(4B) - A(4B) - ... - A(4B)
But, if we can guess that a directory will handle a number of child files,
we don't need to traverse the tree from level #0 to #N all the time.
Since the lower level tables contain relatively small number of dentries,
the miss ratio of the target dentry is likely to be high.
In order to avoid that, we can configure the hash tables sparsely from level #0
like this.
level #0 | A(2B) - A(2B) - A(2B) - A(2B)
level #1 | A(2B) - A(2B) - A(2B) - A(2B) - A(2B) - ... - A(2B)
. | . . . .
level #N/2 | A(2B) - A(2B) - A(2B) - A(2B) - A(2B) - ... - A(2B)
. | . . . .
level #N | A(4B) - A(4B) - A(4B) - A(4B) - A(4B) - ... - A(4B)
With this structure, we can skip the ineffective tree searches in lower level
hash tables.
This patch adds just a facility for this by introducing i_dir_level in
f2fs_inode.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Jaegeuk Kim [Thu, 27 Feb 2014 04:57:53 +0000 (13:57 +0900)]
f2fs: remove costly bit operations for f2fs_find_entry
It turns out that a bit operation like find_next_bit is not always fast enough
for f2fs_find_entry.
Instead, it is pretty much simple and fast to traverse each dentries.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Jaegeuk Kim [Mon, 24 Feb 2014 04:00:13 +0000 (13:00 +0900)]
f2fs: implement a lock-free stat_show
The stat_show is just to show the current status of f2fs.
So, we can remove all the there-in locks.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Jaegeuk Kim [Fri, 21 Feb 2014 05:29:35 +0000 (14:29 +0900)]
f2fs: introduce a radix_tree for the free_nid list
This patch introduces a radix tree for the list of free_nids, which enhances
the performance on free nid management.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Gu Zheng [Fri, 21 Feb 2014 10:08:29 +0000 (18:08 +0800)]
f2fs: introduce help macro on_build_free_nids()
Introduce help macro on_build_free_nids() which just uses build_lock
to judge whether the building free nid is going, so that we can remove
the on_build_free_nids field from f2fs_sb_info.
Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
[Jaegeuk Kim: remove an unnecessary white line removal]
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Jaegeuk Kim [Fri, 21 Feb 2014 04:17:22 +0000 (13:17 +0900)]
f2fs: fix to mark the checkpointed nat entry correctly
The nat cache entry maintains a status whether it is checkpointed or not.
So, if a new cache entry is loaded from the last checkpoint,
nat_entry->checkpointed should be true.
If the cache entry is modified as being dirty, nat_entry->checkpoint should
be false.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Jaegeuk Kim [Wed, 19 Feb 2014 09:23:32 +0000 (18:23 +0900)]
f2fs: fix to do build_stat prior to the recovery procedure
At the end of the recovery procedure, write_checkpoint is called and updates
the cp count which is managed by f2fs stat.
But, previously build_stat() is called after the recovery procedure, which
results in:
BUG: unable to handle kernel NULL pointer dereference at
000000000000012c
IP: [<
ffffffffa03b1030>] write_checkpoint+0x720/0xbc0 [f2fs]
Call Trace:
[<
ffffffff810a6b44>] ? mark_held_locks+0x74/0x140
[<
ffffffff8109a3e0>] ? __init_waitqueue_head+0x60/0x60
[<
ffffffffa03bf036>] recover_fsync_data+0x656/0xf20 [f2fs]
[<
ffffffff812ee3eb>] ? security_d_instantiate+0x1b/0x30
[<
ffffffffa03aeb4d>] f2fs_fill_super+0x94d/0xa00 [f2fs]
[<
ffffffff811a9825>] mount_bdev+0x1a5/0x1f0
[<
ffffffff8114915e>] ? __get_free_pages+0xe/0x40
[<
ffffffffa03ae200>] ? f2fs_remount+0x130/0x130 [f2fs]
[<
ffffffffa03aa575>] f2fs_mount+0x15/0x20 [f2fs]
[<
ffffffff811aa713>] mount_fs+0x43/0x1b0
[<
ffffffff811c7124>] vfs_kern_mount+0x74/0x160
[<
ffffffff811c5cb1>] ? __get_fs_type+0x51/0x60
[<
ffffffff811c9727>] do_mount+0x237/0xb50
[<
ffffffff811c936a>] ? copy_mount_options+0x3a/0x170
So, this patche changes the order of recovery_fsync_data() and
f2fs_build_stats().
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Jaegeuk Kim [Mon, 17 Feb 2014 10:29:27 +0000 (19:29 +0900)]
f2fs: fix not to write data pages on the page reclaiming path
Even if f2fs_write_data_page is called by the page reclaiming path, we should
not write the page to provide enough free segments for the worst case scenario.
Otherwise, f2fs can face with no free segment while gc is conducted, resulting
in:
------------[ cut here ]------------
kernel BUG at /home/zeus/f2fs_test/src/fs/f2fs/segment.c:565!
RIP: 0010:[<
ffffffffa02c3b11>] [<
ffffffffa02c3b11>] new_curseg+0x331/0x340 [f2fs]
Call Trace:
allocate_segment_by_default+0x204/0x280 [f2fs]
allocate_data_block+0x108/0x210 [f2fs]
write_data_page+0x8a/0xc0 [f2fs]
do_write_data_page+0xe1/0x2a0 [f2fs]
move_data_page+0x8a/0xf0 [f2fs]
f2fs_gc+0x446/0x970 [f2fs]
f2fs_balance_fs+0xb6/0xd0 [f2fs]
f2fs_write_begin+0x50/0x350 [f2fs]
? unlock_page+0x27/0x30
? unlock_page+0x27/0x30
generic_file_buffered_write+0x10a/0x280
? file_update_time+0xa3/0xf0
__generic_file_aio_write+0x1c8/0x3d0
? generic_file_aio_write+0x52/0xb0
? generic_file_aio_write+0x52/0xb0
generic_file_aio_write+0x65/0xb0
do_sync_write+0x5a/0x90
vfs_write+0xc5/0x1f0
SyS_write+0x55/0xa0
system_call_fastpath+0x16/0x1b
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Jaegeuk Kim [Mon, 17 Feb 2014 03:44:20 +0000 (12:44 +0900)]
f2fs: fix the calculation of max_nids
Total nids that f2fs can use should not include 0, nid for node inode, and nid
for meta inode.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Changman Lee [Thu, 13 Feb 2014 06:12:29 +0000 (15:12 +0900)]
f2fs: show counts of checkpoint in status
This patch shows the counts of checkpoint in f2fs' status.
Signed-off-by: Changman Lee <cm224.lee@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Chao Yu [Fri, 7 Feb 2014 08:11:53 +0000 (16:11 +0800)]
f2fs: introduce ra_meta_pages to readahead CP/NAT/SIT pages
This patch help us to cleanup the readahead code by merging ra_{sit,nat}_pages
function into ra_meta_pages.
Additionally the new function is used to readahead cp block in
recover_orphan_inodes.
Change log from v1:
o fix a deadloop bug pointed by Jaegeuk Kim.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Chao Yu [Tue, 28 Jan 2014 02:29:26 +0000 (10:29 +0800)]
f2fs: use inode mutex to keep atomicity of f2fs_falloc
Previously without protection of inode mutex, f2fs_falloc and other data
correlated operations will interfere with each other.
So let's use inode mutex to keep atomicity of f2fs_falloc.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Jaegeuk Kim [Fri, 7 Feb 2014 01:00:06 +0000 (10:00 +0900)]
f2fs: clean up redundant function call
This patch integrates inode_[inc|dec]_dirty_dents with inc_page_count to remove
redundant calls.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Jaegeuk Kim [Wed, 5 Feb 2014 04:03:57 +0000 (13:03 +0900)]
f2fs: fix f2fs_write_meta_page at no checkpoint status
If f2fs entered errorneous checkpoint status, it should skip writing meta
pages instead of redirtying the pages out.
Otherwise, it cannot unmount the partition even though f2fs is under read-only
status.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Jaegeuk Kim [Wed, 5 Feb 2014 02:16:39 +0000 (11:16 +0900)]
f2fs: fix to truncate dentry pages in the error case
When a new directory is allocated, if an error is occurred, we should truncate
preallocated dentry pages too.
This bug was reported by Andrey Tsyvarev after a while as follows.
mkdir()->
f2fs_add_link()->
init_inode_metadata()->
f2fs_init_acl()->
f2fs_get_acl()->
f2fs_getxattr()->
read_all_xattrs() fails.
Also there was a BUG_ON triggered after the fault in
mkdir()->
f2fs_add_link()->
init_inode_metadata()->
remove_inode_page() ->
f2fs_bug_on(inode->i_blocks != 0 && inode->i_blocks != 1);
But, previous patch wasn't perfect to resolve that bug, so the following bug
report was also submitted.
kernel BUG at fs/f2fs/inode.c:274!
Call Trace:
[<
ffffffff811fde03>] evict+0xa3/0x1a0
[<
ffffffff811fe615>] iput+0xf5/0x180
[<
ffffffffa01c7f63>] f2fs_mkdir+0xf3/0x150 [f2fs]
[<
ffffffff811f2a77>] vfs_mkdir+0xb7/0x160
[<
ffffffff811f36bf>] SyS_mkdir+0x5f/0xc0
[<
ffffffff81680769>] system_call_fastpath+0x16/0x1b
Finally, this patch resolves all the issues like below.
If an error is occurred after make_empty_dir(),
1. truncate_inode_pages()
The make_bad_inode() prior to iput() will change i_mode to S_IFREG, which
means that f2fs will not decrement fi->dirty_dents during f2fs_evict_inode.
But, by calling it here, we can do that.
2. truncate_blocks()
Preallocated dentry pages are trucated here to sync i_blocks.
3. remove_dirty_dir_inode()
Remove this directory inode from the list.
Reported-and-Tested-by: Andrey Tsyvarev <tsyvarev@ispras.ru>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Jaegeuk Kim [Tue, 28 Jan 2014 05:54:07 +0000 (14:54 +0900)]
f2fs: fix a build warning
This patch modifies flow a little bit to avoid the following build warnings.
src/fs/f2fs/recovery.c: In function ‘check_index_in_prev_nodes’:
src/fs/f2fs/recovery.c:288:51: warning: ‘sum.<U5390>.<U52f8>.ofs_in_node’ may
be used uninitialized in this function [-Wmaybe-uninitialized]
src/fs/f2fs/recovery.c:260:23: warning: ‘sum.nid’ may be used uninitialized
in this function [-Wmaybe-uninitialized]
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Jaegeuk Kim [Tue, 4 Feb 2014 04:01:10 +0000 (13:01 +0900)]
f2fs: clean up with a macro
This patch adds GET_BLKOFF_FROM_SEG0 to clean up some codes.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Jaegeuk Kim [Mon, 3 Feb 2014 08:24:51 +0000 (17:24 +0900)]
f2fs: fix the potential mismatch between dir's i_size and i_blocks
This is the erroneous scenario.
i_size on-disk i_size i_blocks
__f2fs_add_link() 4096 4096 2
get_new_data_page 8192 4096 3
-ENOSPC = init_inode_metadata
checkpoint - 4096 3
POR and reboot
__f2fs_add_link() 4096 4096 3
page = get_new_data_page (page->index = 1 by NEW_ADDR)
add a dentry to the page successfully
f2fs_rmdir()
f2fs_empty_dir() 4096 4096 3
f2fs_unlink() goes, since there is no valid dentry due to i_size = 4096.
But, still there is one dentry in page->index = 1.
So this patch moves the code to write dir->i_size into on-disk i_size in order
to sync dir's i_size, on-disk i_size, and its i_blocks.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Jaegeuk Kim [Mon, 3 Feb 2014 01:50:22 +0000 (10:50 +0900)]
f2fs: remove the ugly pointer conversion
This patch modifies the use of bi_private to remove pointer chasing for sbi.
Previously, we had a bi_private structure, but it needs memory allocation.
So this patch uses bi_private by the sbi pointer and adds a completion pointer
into the sbi.
This can achieve no memory allocation and nice use of the bi_private.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Jaegeuk Kim [Tue, 28 Jan 2014 03:25:06 +0000 (12:25 +0900)]
f2fs: fix to recover xattr node block
If a new xattr node page was allocated and its inode is fsynced, we should
recover the xattr node page during the roll-forward process after power-cut.
But, previously, f2fs didn't handle that case, resulting in kernel panic as
follows reported by Tom Li.
BUG: unable to handle kernel paging request at
ffffc9001c861a98
IP: [<
ffffffffa0295236>] check_index_in_prev_nodes+0x86/0x2d0 [f2fs]
Call Trace:
[<
ffffffff815ece9b>] ? printk+0x48/0x4a
[<
ffffffffa029626a>] recover_fsync_data+0xdca/0xf50 [f2fs]
[<
ffffffffa02873ae>] f2fs_fill_super+0x92e/0x970 [f2fs]
[<
ffffffff8112c9f8>] mount_bdev+0x1b8/0x200
[<
ffffffffa0286a80>] ? f2fs_remount+0x130/0x130 [f2fs]
[<
ffffffffa0285e40>] f2fs_mount+0x10/0x20 [f2fs]
[<
ffffffff8112d4de>] mount_fs+0x3e/0x1b0
[<
ffffffff810ef4eb>] ? __alloc_percpu+0xb/0x10
[<
ffffffff8114761f>] vfs_kern_mount+0x6f/0x120
[<
ffffffff811497b9>] do_mount+0x259/0xa90
[<
ffffffff810ead1d>] ? memdup_user+0x3d/0x80
[<
ffffffff810eadb3>] ? strndup_user+0x53/0x70
[<
ffffffff8114a2c9>] SyS_mount+0x89/0xd0
[<
ffffffff815feae2>] system_call_fastpath+0x16/0x1b
This patch adds a recovery function of xattr node pages.
Reported-by: Tom Li <biergaizi@members.fsf.org>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Jaegeuk Kim [Tue, 28 Jan 2014 03:22:14 +0000 (12:22 +0900)]
f2fs: handle dirty segments inside refresh_sit_entry
This patch cleans up the refresh_sit_entry to handle locate_dirty_segments.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Jaegeuk Kim [Fri, 24 Jan 2014 00:42:16 +0000 (09:42 +0900)]
f2fs: update_inode_page should be done all the time
In order to make fs consistency, update_inode_page should not be failed all
the time. Otherwise, it is possible to lose some metadata in the inode like
a link count.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Linus Torvalds [Sun, 16 Feb 2014 21:30:25 +0000 (13:30 -0800)]
Linux 3.14-rc3
Linus Torvalds [Sun, 16 Feb 2014 19:05:27 +0000 (11:05 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/mason/linux-btrfs
Pull btrfs fixes from Chris Mason:
"We have a small collection of fixes in my for-linus branch.
The big thing that stands out is a revert of a new ioctl. Users
haven't shipped yet in btrfs-progs, and Dave Sterba found a better way
to export the information"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
Btrfs: use right clone root offset for compressed extents
btrfs: fix null pointer deference at btrfs_sysfs_add_one+0x105
Btrfs: unset DCACHE_DISCONNECTED when mounting default subvol
Btrfs: fix max_inline mount option
Btrfs: fix a lockdep warning when cleaning up aborted transaction
Revert "btrfs: add ioctl to export size of global metadata reservation"
Linus Torvalds [Sun, 16 Feb 2014 19:03:58 +0000 (11:03 -0800)]
Merge tag 'dt-fixes-for-3.14' of git://git./linux/kernel/git/robh/linux
Pull devicetree fixes from Rob Herring:
"Fix booting on PPC boards. Changes to of_match_node matching caused
the serial port on some PPC boards to stop working. Reverted the
change and reimplement to split matching between new style compatible
only matching and fallback to old matching algorithm"
* tag 'dt-fixes-for-3.14' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
of: search the best compatible match first in __of_match_node()
Revert "OF: base: match each node compatible against all given matches first"
Kevin Hao [Fri, 14 Feb 2014 05:22:46 +0000 (13:22 +0800)]
of: search the best compatible match first in __of_match_node()
Currently, of_match_node compares each given match against all node's
compatible strings with of_device_is_compatible.
To achieve multiple compatible strings per node with ordering from
specific to generic, this requires given matches to be ordered from
specific to generic. For most of the drivers this is not true and also
an alphabetical ordering is more sane there.
Therefore, this patch introduces a function to match each of the node's
compatible strings against all given compatible matches without type and
name first, before checking the next compatible string. This implies
that node's compatibles are ordered from specific to generic while
given matches can be in any order. If we fail to find such a match
entry, then fall-back to the old method in order to keep compatibility.
Cc: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Signed-off-by: Kevin Hao <haokexin@gmail.com>
Tested-by: Stephen Chivers <schivers@csc.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Linus Torvalds [Sun, 16 Feb 2014 00:18:47 +0000 (16:18 -0800)]
Merge git://git./linux/kernel/git/nab/target-pending
Pull SCSI target fixes from Nicholas Bellinger:
"Mostly minor fixes this time to v3.14-rc1 related changes. Also
included is one fix for a free after use regression in persistent
reservations UNREGISTER logic that is CC'ed to >= v3.11.y stable"
* git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
Target/sbc: Fix protection copy routine
IB/srpt: replace strict_strtoul() with kstrtoul()
target: Simplify command completion by removing CMD_T_FAILED flag
iser-target: Fix leak on failure in isert_conn_create_fastreg_pool
iscsi-target: Fix SNACK Type 1 + BegRun=0 handling
target: Fix missing length check in spc_emulate_evpd_83()
qla2xxx: Remove last vestiges of qla_tgt_cmd.cmd_list
target: Fix 32-bit + CONFIG_LBDAF=n link error w/ sector_div
target: Fix free-after-use regression in PR unregister
Linus Torvalds [Sun, 16 Feb 2014 00:17:51 +0000 (16:17 -0800)]
Merge branch 'i2c/for-current' of git://git./linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
"i2c has a bugfix and documentation improvements for you"
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
Documentation: i2c: mention ACPI method for instantiating devices
Documentation: i2c: describe devicetree method for instantiating devices
i2c: mv64xxx: refactor message start to ensure proper initialization
Linus Torvalds [Sun, 16 Feb 2014 00:06:12 +0000 (16:06 -0800)]
Merge branches 'irq-urgent-for-linus' and 'irq-core-for-linus' of git://git./linux/kernel/git/tip/tip
Pull irq update from Thomas Gleixner:
"Fix from the urgent branch: a trivial oneliner adding the missing
Kconfig dependency curing build failures which have been discovered by
several build robots.
The update in the irq-core branch provides a new function in the
irq/devres code, which is a prerequisite for driver developers to get
rid of boilerplate code all over the place.
Not a bugfix, but it has zero impact on the current kernel due to the
lack of users. It's simpler to provide the infrastructure to
interested parties via your tree than fulfilling the wishlist of
driver maintainers on which particular commit or tag this should be
based on"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
genirq: Add missing irq_to_desc export for CONFIG_SPARSE_IRQ=n
* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
genirq: Add devm_request_any_context_irq()
Linus Torvalds [Sun, 16 Feb 2014 00:04:42 +0000 (16:04 -0800)]
Merge branch 'timers-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull timer fixes from Thomas Gleixner:
"The following trilogy of patches brings you:
- fix for a long standing math overflow issue with HZ < 60
- an onliner fix for a corner case in the dreaded tick broadcast
mechanism affecting a certain range of AMD machines which are
infested with the infamous automagic C1E power control misfeature
- a fix for one of the ARM platforms which allows the kernel to
proceed and boot instead of stupidly panicing for no good reason.
The patch is slightly larger than necessary, but it's less ugly
than the alternative 5 liner"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
tick: Clear broadcast pending bit when switching to oneshot
clocksource: Kona: Print warning rather than panic
time: Fix overflow when HZ is smaller than 60
Linus Torvalds [Sat, 15 Feb 2014 23:03:34 +0000 (15:03 -0800)]
Merge tag 'trace-fixes-v3.14-rc2' of git://git./linux/kernel/git/rostedt/linux-trace
Pull twi tracing fixes from Steven Rostedt:
"Two urgent fixes in the tracing utility.
The first is a fix for the way the ring buffer stores timestamps.
After a restructure of the code was done, the ring buffer timestamp
logic missed the fact that the first event on a sub buffer is to have
a zero delta, as the full timestamp is stored on the sub buffer
itself. But because the delta was not cleared to zero, the timestamp
for that event will be calculated as the real timestamp + the delta
from the last timestamp. This can skew the timestamps of the events
and have them say they happened when they didn't really happen.
That's bad.
The second fix is for modifying the function graph caller site. When
the stop machine was removed from updating the function tracing code,
it missed updating the function graph call site location. It is still
modified as if it is being done via stop machine. But it's not. This
can lead to a GPF and kernel crash if the function graph call site
happens to lie between cache lines and one CPU is executing it while
another CPU is doing the update. It would be a very hard condition to
hit, but the result is severe enough to have it fixed ASAP"
* tag 'trace-fixes-v3.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
ftrace/x86: Use breakpoints for converting function graph caller
ring-buffer: Fix first commit on sub-buffer having non-zero delta
Linus Torvalds [Sat, 15 Feb 2014 23:02:28 +0000 (15:02 -0800)]
Merge branch 'x86-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull x86 EFI fixes from Peter Anvin:
"A few more EFI-related fixes"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/efi: Check status field to validate BGRT header
x86/efi: Fix 32-bit fallout
Linus Torvalds [Sat, 15 Feb 2014 23:01:33 +0000 (15:01 -0800)]
Merge tag 'fixes-for-linus' of git://git./linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Kevin Hilman:
"A collection of ARM SoC fixes for v3.14-rc1.
Mostly a collection of Kconfig, device tree data and compilation fixes
along with fix to drivers/phy that fixes a boot regression on some
Marvell mvebu platforms"
* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
dma: mv_xor: Silence a bunch of LPAE-related warnings
ARM: ux500: disable msp2 device tree node
ARM: zynq: Reserve not DMAable space in front of the kernel
ARM: multi_v7_defconfig: Select CONFIG_SOC_DRA7XX
ARM: imx6: Initialize low-power mode early again
ARM: pxa: fix various compilation problems
ARM: pxa: fix compilation problem on AM300EPD board
ARM: at91: add Atmel's SAMA5D3 Xplained board
spi/atmel: document clock properties
mmc: atmel-mci: document clock properties
ARM: at91: enable USB host on at91sam9n12ek board
ARM: at91/dt: fix sama5d3 ohci hclk clock reference
ARM: at91/dt: sam9263: fix compatibility string for the I2C
ata: sata_mv: Fix probe failures with optional phys
drivers: phy: Add support for optional phys
drivers: phy: Make NULL a valid phy reference
ARM: fix HAVE_ARM_TWD selection for OMAP and shmobile
ARM: moxart: move DMA_OF selection to driver
ARM: hisi: fix kconfig warning on HAVE_ARM_TWD
Wolfram Sang [Sat, 15 Feb 2014 14:58:35 +0000 (15:58 +0100)]
Documentation: i2c: mention ACPI method for instantiating devices
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Acked-by: Guenter Roeck <linux@roeck-us.net>
Wolfram Sang [Mon, 10 Feb 2014 10:03:55 +0000 (11:03 +0100)]
Documentation: i2c: describe devicetree method for instantiating devices
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Filipe David Borba Manana [Sat, 15 Feb 2014 15:53:16 +0000 (15:53 +0000)]
Btrfs: use right clone root offset for compressed extents
For non compressed extents, iterate_extent_inodes() gives us offsets
that take into account the data offset from the file extent items, while
for compressed extents it doesn't. Therefore we have to adjust them before
placing them in a send clone instruction. Not doing this adjustment leads to
the receiving end requesting for a wrong a file range to the clone ioctl,
which results in different file content from the one in the original send
root.
Issue reproducible with the following excerpt from the test I made for
xfstests:
_scratch_mkfs
_scratch_mount "-o compress-force=lzo"
$XFS_IO_PROG -f -c "truncate 118811" $SCRATCH_MNT/foo
$XFS_IO_PROG -c "pwrite -S 0x0d -b 39987 92267 39987" $SCRATCH_MNT/foo
$BTRFS_UTIL_PROG subvolume snapshot -r $SCRATCH_MNT $SCRATCH_MNT/mysnap1
$XFS_IO_PROG -c "pwrite -S 0x3e -b 80000 200000 80000" $SCRATCH_MNT/foo
$BTRFS_UTIL_PROG filesystem sync $SCRATCH_MNT
$XFS_IO_PROG -c "pwrite -S 0xdc -b 10000 250000 10000" $SCRATCH_MNT/foo
$XFS_IO_PROG -c "pwrite -S 0xff -b 10000 300000 10000" $SCRATCH_MNT/foo
# will be used for incremental send to be able to issue clone operations
$BTRFS_UTIL_PROG subvolume snapshot -r $SCRATCH_MNT $SCRATCH_MNT/clones_snap
$BTRFS_UTIL_PROG subvolume snapshot -r $SCRATCH_MNT $SCRATCH_MNT/mysnap2
$FSSUM_PROG -A -f -w $tmp/1.fssum $SCRATCH_MNT/mysnap1
$FSSUM_PROG -A -f -w $tmp/2.fssum -x $SCRATCH_MNT/mysnap2/mysnap1 \
-x $SCRATCH_MNT/mysnap2/clones_snap $SCRATCH_MNT/mysnap2
$FSSUM_PROG -A -f -w $tmp/clones.fssum $SCRATCH_MNT/clones_snap \
-x $SCRATCH_MNT/clones_snap/mysnap1 -x $SCRATCH_MNT/clones_snap/mysnap2
$BTRFS_UTIL_PROG send $SCRATCH_MNT/mysnap1 -f $tmp/1.snap
$BTRFS_UTIL_PROG send $SCRATCH_MNT/clones_snap -f $tmp/clones.snap
$BTRFS_UTIL_PROG send -p $SCRATCH_MNT/mysnap1 \
-c $SCRATCH_MNT/clones_snap $SCRATCH_MNT/mysnap2 -f $tmp/2.snap
_scratch_unmount
_scratch_mkfs
_scratch_mount
$BTRFS_UTIL_PROG receive $SCRATCH_MNT -f $tmp/1.snap
$FSSUM_PROG -r $tmp/1.fssum $SCRATCH_MNT/mysnap1 2>> $seqres.full
$BTRFS_UTIL_PROG receive $SCRATCH_MNT -f $tmp/clones.snap
$FSSUM_PROG -r $tmp/clones.fssum $SCRATCH_MNT/clones_snap 2>> $seqres.full
$BTRFS_UTIL_PROG receive $SCRATCH_MNT -f $tmp/2.snap
$FSSUM_PROG -r $tmp/2.fssum $SCRATCH_MNT/mysnap2 2>> $seqres.full
Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
Signed-off-by: Chris Mason <clm@fb.com>
Anand Jain [Wed, 15 Jan 2014 09:22:28 +0000 (17:22 +0800)]
btrfs: fix null pointer deference at btrfs_sysfs_add_one+0x105
bdev is null when disk has disappeared and mounted with
the degrade option
stack trace
---------
btrfs_sysfs_add_one+0x105/0x1c0 [btrfs]
open_ctree+0x15f3/0x1fe0 [btrfs]
btrfs_mount+0x5db/0x790 [btrfs]
? alloc_pages_current+0xa4/0x160
mount_fs+0x34/0x1b0
vfs_kern_mount+0x62/0xf0
do_mount+0x22e/0xa80
? __get_free_pages+0x9/0x40
? copy_mount_options+0x31/0x170
SyS_mount+0x7e/0xc0
system_call_fastpath+0x16/0x1b
---------
reproducer:
-------
mkfs.btrfs -draid1 -mraid1 /dev/sdc /dev/sdd
(detach a disk)
devmgt detach /dev/sdc [1]
mount -o degrade /dev/sdd /btrfs
-------
[1] github.com/anajain/devmgt.git
Signed-off-by: Anand Jain <Anand.Jain@oracle.com>
Tested-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>
Wolfram Sang [Thu, 13 Feb 2014 20:36:29 +0000 (21:36 +0100)]
i2c: mv64xxx: refactor message start to ensure proper initialization
Because the offload mechanism can fall back to a standard transfer,
having two seperate initialization states is unfortunate. Let's just
have one state which does things consistently. This fixes a bug where
some preparation was missing when the fallback happened. And it makes
the code much easier to follow. To implement this, we put the check
if offload is possible at the top of the offload setup function.
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Tested-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Cc: stable@vger.kernel.org # v3.12+
Fixes: 930ab3d403ae (i2c: mv64xxx: Add I2C Transaction Generator support)
Linus Torvalds [Sat, 15 Feb 2014 00:15:45 +0000 (16:15 -0800)]
Merge tag 'usb-3.14-rc3' of git://git./linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here is a bunch of USB fixes for 3.14-rc3. Most of these are xhci
reverts, fixing a bunch of reported issues with USB 3 host controller
issues that loads of people have been hitting (with the exception of
kernel developers, all of our machines seem to be working fine, which
is why these took so long to get resolved...)
There are some other minor fixes and new device ids, as ususal. All
have been in linux-next successfully"
* tag 'usb-3.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (22 commits)
usb: option: blacklist ZTE MF667 net interface
Revert "usb: xhci: Link TRB must not occur within a USB payload burst"
Revert "xhci: Avoid infinite loop when sg urb requires too many trbs"
Revert "xhci: Set scatter-gather limit to avoid failed block writes."
xhci 1.0: Limit arbitrarily-aligned scatter gather.
Modpost: fixed USB alias generation for ranges including 0x9 and 0xA
usb: core: Fix potential memory leak adding dyn USBdevice IDs
USB: ftdi_sio: add Tagsys RFID Reader IDs
usb: qcserial: add Netgear Aircard 340U
usb-storage: enable multi-LUN scanning when needed
USB: simple: add Dynastream ANT USB-m Stick device support
usb-storage: add unusual-devs entry for BlackBerry 9000
usb-storage: restrict bcdDevice range for Super Top in Cypress ATACB
usb: phy: move some error messages to debug
usb: ftdi_sio: add Mindstorms EV3 console adapter
usb: dwc2: fix memory corruption in dwc2 driver
usb: dwc2: fix role switch breakage
usb: dwc2: bail out early when booting with "nousb"
Revert "xhci: replace xhci_read_64() with readq()"
Revert "xhci: replace xhci_write_64() with writeq()"
...
Linus Torvalds [Sat, 15 Feb 2014 00:15:03 +0000 (16:15 -0800)]
Merge tag 'tty-3.14-rc3' of git://git./linux/kernel/git/gregkh/tty
Pull tty/serial driver fixes from Greg KH:
"Here are a small number of tty/serial driver fixes to resolve reported
issues with 3.14-rc and earlier (in the case of the vt bugfix). Some
of these have been tested and reported by a number of people as the
tty bugfix was pretty commonly hit on some platforms.
All have been in linux-next for a while"
* tag 'tty-3.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
vt: Fix secure clear screen
serial: 8250: Support XR17V35x fraction divisor
n_tty: Fix stale echo output
serial: sirf: fix kernel panic caused by unpaired spinlock
serial: 8250_pci: unbreak last serial ports on NetMos 9865 cards
n_tty: Fix poll() when TIME_CHAR and MIN_CHAR == 0
serial: omap: fix rs485 probe on defered pinctrl
serial: 8250_dw: fix compilation warning when !CONFIG_PM_SLEEP
serial: omap-serial: Move info message to probe function
tty: Set correct tty name in 'active' sysfs attribute
tty: n_gsm: Fix for modems with brk in modem status control
drivers/tty/hvc: don't use module_init in non-modular hyp. console code
Linus Torvalds [Sat, 15 Feb 2014 00:14:11 +0000 (16:14 -0800)]
Merge tag 'staging-3.14-rc3' of git://git./linux/kernel/git/gregkh/staging
Pull staging driver fixes from Greg KH:
"Here are a number (lots, I know) of fixes for staging drivers to
resolve a bunch of reported issues.
The largest patches here is one revert of a patch that is in 3.14-rc1
to fix reported problems, and a sync of a usb host driver that
required some ARM patches to go in before it could be accepted (which
is why it missed -rc1)
All of these have been in linux-next for a while with no reported
issues"
* tag 'staging-3.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (56 commits)
staging/rtl8821ae: fix build, depends on MAC80211
iio: max1363: Use devm_regulator_get_optional for optional regulator
iio:accel:bma180: Use modifier instead of index in channel specification
iio: adis16400: Set timestamp as the last element in chan_spec
iio: ak8975: Fix calculation formula for convert micro tesla to gauss unit
staging:iio:ad799x fix typo in ad799x_events[]
iio: mxs-lradc: remove useless scale_available files
iio: mxs-lradc: fix buffer overflow
iio:magnetometer:mag3110: Fix output of decimal digits in show_int_plus_micros()
iio:magnetometer:mag3110: Report busy in _read_raw() / write_raw() when buffer is enabled
wlags49_h2: Fix overflow in wireless_set_essid()
xlr_net: Fix missing trivial allocation check
staging: r8188eu: overflow in rtw_p2p_get_go_device_address()
staging: r8188eu: array overflow in rtw_mp_ioctl_hdl()
staging: r8188eu: Fix typo in USB_DEVICE list
usbip/userspace/libsrc/names.c: memory leak
gpu: ion: dereferencing an ERR_PTR
staging: comedi: usbduxsigma: fix unaligned dereferences
staging: comedi: fix too early cleanup in comedi_auto_config()
staging: android: ion: dummy: fix an error code
...
Linus Torvalds [Sat, 15 Feb 2014 00:13:40 +0000 (16:13 -0800)]
Merge tag 'driver-core-3.14-rc3' of git://git./linux/kernel/git/gregkh/driver-core
Pull driver core fix from Greg KH:
"Here is a single driver core patch for 3.14-rc3 for the component code
that Russell has found and fixed"
* tag 'driver-core-3.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
drivers/base: fix devres handling for master device
Linus Torvalds [Sat, 15 Feb 2014 00:13:00 +0000 (16:13 -0800)]
Merge tag 'char-misc-3.14-rc3' of git://git./linux/kernel/git/gregkh/char-misc
Pull char/misc fixes from Greg KH:
"Here are some small char/misc driver fixes, along with some
documentation updates, for 3.14-rc3. Nothing major, just a number of
fixes for reported issues"
* tag 'char-misc-3.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
Revert "misc: eeprom: sunxi: Add new compatibles"
Revert "ARM: sunxi: dt: Convert to the new SID compatibles"
misc: mic: fix possible signed underflow (undefined behavior) in userspace API
ARM: sunxi: dt: Convert to the new SID compatibles
misc: eeprom: sunxi: Add new compatibles
misc: genwqe: Fix potential memory leak when pinning memory
Documentation:Update Documentation/zh_CN/arm64/memory.txt
Documentation:Update Documentation/zh_CN/arm64/booting.txt
Documentation:Chinese translation of Documentation/arm64/tagged-pointers.txt
raw: set range for MAX_RAW_DEVS
raw: test against runtime value of max_raw_minors
Drivers: hv: vmbus: Don't timeout during the initial connection with host
Drivers: hv: vmbus: Specify the target CPU that should receive notification
VME: Correct read/write alignment algorithm
mei: don't unset read cb ptr on reset
mei: clear write cb from waiting list on reset
Josef Bacik [Fri, 14 Feb 2014 18:43:48 +0000 (13:43 -0500)]
Btrfs: unset DCACHE_DISCONNECTED when mounting default subvol
A user was running into errors from an NFS export of a subvolume that had a
default subvol set. When we mount a default subvol we will use d_obtain_alias()
to find an existing dentry for the subvolume in the case that the root subvol
has already been mounted, or a dummy one is allocated in the case that the root
subvol has not already been mounted. This allows us to connect the dentry later
on if we wander into the path. However if we don't ever wander into the path we
will keep DCACHE_DISCONNECTED set for a long time, which angers NFS. It doesn't
appear to cause any problems but it is annoying nonetheless, so simply unset
DCACHE_DISCONNECTED in the get_default_root case and switch btrfs_lookup() to
use d_materialise_unique() instead which will make everything play nicely
together and reconnect stuff if we wander into the defaul subvol path from a
different way. With this patch I'm no longer getting the NFS errors when
exporting a volume that has been mounted with a default subvol set. Thanks,
cc: bfields@fieldses.org
cc: ebiederm@xmission.com
Signed-off-by: Josef Bacik <jbacik@fb.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Chris Mason <clm@fb.com>
Mitch Harder [Thu, 13 Feb 2014 15:13:16 +0000 (09:13 -0600)]
Btrfs: fix max_inline mount option
Currently, the only mount option for max_inline that has any effect is
max_inline=0. Any other value that is supplied to max_inline will be
adjusted to a minimum of 4k. Since max_inline has an effective maximum
of ~3900 bytes due to page size limitations, the current behaviour
only has meaning for max_inline=0.
This patch will allow the the max_inline mount option to accept non-zero
values as indicated in the documentation.
Signed-off-by: Mitch Harder <mitch.harder@sabayonlinux.org>
Signed-off-by: Chris Mason <clm@fb.com>
Liu Bo [Sat, 8 Feb 2014 07:33:08 +0000 (15:33 +0800)]
Btrfs: fix a lockdep warning when cleaning up aborted transaction
Given now we have 2 spinlock for management of delayed refs,
CONFIG_DEBUG_SPINLOCK=y helped me find this,
[ 4723.413809] BUG: spinlock wrong CPU on CPU#1, btrfs-transacti/2258
[ 4723.414882] lock: 0xffff880048377670, .magic:
dead4ead, .owner: btrfs-transacti/2258, .owner_cpu: 2
[ 4723.417146] CPU: 1 PID: 2258 Comm: btrfs-transacti Tainted: G W O 3.12.0+ #4
[ 4723.421321] Call Trace:
[ 4723.421872] [<
ffffffff81680fe7>] dump_stack+0x54/0x74
[ 4723.422753] [<
ffffffff81681093>] spin_dump+0x8c/0x91
[ 4723.424979] [<
ffffffff816810b9>] spin_bug+0x21/0x26
[ 4723.425846] [<
ffffffff81323956>] do_raw_spin_unlock+0x66/0x90
[ 4723.434424] [<
ffffffff81689bf7>] _raw_spin_unlock+0x27/0x40
[ 4723.438747] [<
ffffffffa015da9e>] btrfs_cleanup_one_transaction+0x35e/0x710 [btrfs]
[ 4723.443321] [<
ffffffffa015df54>] btrfs_cleanup_transaction+0x104/0x570 [btrfs]
[ 4723.444692] [<
ffffffff810c1b5d>] ? trace_hardirqs_on_caller+0xfd/0x1c0
[ 4723.450336] [<
ffffffff810c1c2d>] ? trace_hardirqs_on+0xd/0x10
[ 4723.451332] [<
ffffffffa015e5ee>] transaction_kthread+0x22e/0x270 [btrfs]
[ 4723.452543] [<
ffffffffa015e3c0>] ? btrfs_cleanup_transaction+0x570/0x570 [btrfs]
[ 4723.457833] [<
ffffffff81079efa>] kthread+0xea/0xf0
[ 4723.458990] [<
ffffffff81079e10>] ? kthread_create_on_node+0x140/0x140
[ 4723.460133] [<
ffffffff81692aac>] ret_from_fork+0x7c/0xb0
[ 4723.460865] [<
ffffffff81079e10>] ? kthread_create_on_node+0x140/0x140
[ 4723.496521] ------------[ cut here ]------------
----------------------------------------------------------------------
The reason is that we get to call cond_resched_lock(&head_ref->lock) while
still holding @delayed_refs->lock.
So it's different with __btrfs_run_delayed_refs(), where we do drop-acquire
dance before and after actually processing delayed refs.
Here we don't drop the lock, others are not able to add new delayed refs to
head_ref, so cond_resched_lock(&head_ref->lock) is not necessary here.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Chris Mason <clm@fb.com>
Chris Mason [Fri, 14 Feb 2014 21:42:13 +0000 (13:42 -0800)]
Revert "btrfs: add ioctl to export size of global metadata reservation"
This reverts commit
01e219e8069516cdb98594d417b8bb8d906ed30d.
David Sterba found a different way to provide these features without adding a new
ioctl. We haven't released any progs with this ioctl yet, so I'm taking this out
for now until we finalize things.
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
CC: Jeff Mahoney <jeffm@suse.com>
Linus Torvalds [Fri, 14 Feb 2014 20:48:46 +0000 (12:48 -0800)]
Merge branch 'for-3.14' of git://linux-nfs.org/~bfields/linux
Pull two nfsd bugfixes from Bruce Fields.
* 'for-3.14' of git://linux-nfs.org/~bfields/linux:
lockd: send correct lock when granting a delayed lock.
nfsd4: fix acl buffer overrun
Linus Torvalds [Fri, 14 Feb 2014 20:48:16 +0000 (12:48 -0800)]
Merge tag 'md/3.14-fixes' of git://neil.brown.name/md
Pull md fixes from Neil Brown:
"Two bugfixes for md
both tagged for -stable"
* tag 'md/3.14-fixes' of git://neil.brown.name/md:
md/raid5: Fix CPU hotplug callback registration
md/raid1: restore ability for check and repair to fix read errors.
Kevin Hao [Fri, 14 Feb 2014 05:22:45 +0000 (13:22 +0800)]
Revert "OF: base: match each node compatible against all given matches first"
This reverts commit
105353145eafb3ea919f5cdeb652a9d8f270228e.
Stephen Chivers reported this is broken as we will get a match
entry '.type = "serial"' instead of the '.compatible = "ns16550"'
in the following scenario:
serial0: serial@4500 {
compatible = "fsl,ns16550", "ns16550";
}
struct of_device_id of_platform_serial_table[] = {
{ .compatible = "ns8250", .data = (void *)PORT_8250, },
{ .compatible = "ns16450", .data = (void *)PORT_16450, },
{ .compatible = "ns16550a", .data = (void *)PORT_16550A, },
{ .compatible = "ns16550", .data = (void *)PORT_16550, },
{ .compatible = "ns16750", .data = (void *)PORT_16750, },
{ .compatible = "ns16850", .data = (void *)PORT_16850, },
...
{ .type = "serial", .data = (void *)PORT_UNKNOWN, },
{ /* end of list */ },
};
So just revert this patch, we will use another implementation to find
the best compatible match in a follow-on patch.
Reported-by: Stephen N Chivers <schivers@csc.com.au>
Cc: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Greg Kroah-Hartman [Fri, 14 Feb 2014 19:16:08 +0000 (11:16 -0800)]
Revert "misc: eeprom: sunxi: Add new compatibles"
This reverts commit
f0de8e04a7201a2000f3c6d09732c11e7f35d42d, it is
incorrect, a future patch will fix this up properly.
Cc: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Greg Kroah-Hartman [Fri, 14 Feb 2014 19:15:40 +0000 (11:15 -0800)]
Revert "ARM: sunxi: dt: Convert to the new SID compatibles"
This reverts commit
01ab1167cd2d861d20195eda08505652c536df97, it is
incorrect, a future patch will fix this up properly.
Cc: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
H. Peter Anvin [Fri, 14 Feb 2014 19:11:18 +0000 (11:11 -0800)]
Merge remote-tracking branch 'efi/urgent' into x86/urgent
There have been reports of EFI crashes since -rc1. The following two
commits fix known issues.
* Fix boot failure on 32-bit EFI due to the recent EFI memmap changes
merged during the merge window - Borislav Petkov
* Avoid a crash during efi_bgrt_init() by detecting invalid BGRT
headers based on the 'status' field.
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Linus Torvalds [Fri, 14 Feb 2014 19:10:49 +0000 (11:10 -0800)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm
Pull KVM fixes from Paolo Bonzini:
"A small error handling problem and a compile breakage for ARM64"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
arm64: KVM: Add VGIC device control for arm64
KVM: return an error code in kvm_vm_ioctl_register_coalesced_mmio()
Linus Torvalds [Fri, 14 Feb 2014 19:09:11 +0000 (11:09 -0800)]
Merge branch 'x86-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull x86 fixes from Peter Anvin:
"A collection of small fixes:
- There still seem to be problems with asm goto which requires the
empty asm hack.
- If SMAP is disabled at compile time, don't enable it nor try to
interpret a page fault as an SMAP violation.
- Fix a case of unbounded recursion while tracing"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86, smap: smap_violation() is bogus if CONFIG_X86_SMAP is off
x86, smap: Don't enable SMAP if CONFIG_X86_SMAP is disabled
compiler/gcc4: Make quirk for asm_volatile_goto() unconditional
x86: Use preempt_disable_notrace() in cycles_2_ns()
Linus Torvalds [Fri, 14 Feb 2014 19:07:29 +0000 (11:07 -0800)]
Merge tag 'pm+acpi-3.14-rc3' of git://git./linux/kernel/git/rafael/linux-pm
Pull ACPI and power management fixes from Rafael Wysocki:
"These include a fix for a recent intel_pstate regression, a fix for a
regression in the ACPI-based PCI hotplug (ACPIPHP) code introduced
during the 3.12 cycle, fixes for two bugs in the ACPI core introduced
recently and a MAINTAINERS update related to cpufreq.
Specifics:
- Fix for a recent regression in the intel_pstate driver that
introduced a race condition causing systems to crash during
initialization in some situations. This removes the affected code
altogether. From Dirk Brandewie.
- ACPIPHP fix for a regression introduced during the 3.12 cycle
causing devices to be dropped as a result of bus check
notifications after system resume on some systems due to the way
ACPIPHP interprets _STA return values (arguably incorrectly). From
Mika Westerberg.
- ACPI dock driver fix for a problem causing docking to fail due to a
check that always fails after recent ACPI core changes (found by
code inspection).
- ACPI container driver fix to prevent memory from being leaked in an
error code path after device_register() failures.
- Update of the arm_big_little cpufreq driver maintainer's e-mail
address"
* tag 'pm+acpi-3.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
MAINTAINERS / cpufreq: update Sudeep's email address
intel_pstate: Remove energy reporting from pstate_sample tracepoint
ACPI / container: Fix error code path in container_device_attach()
ACPI / hotplug / PCI: Relax the checking of _STA return values
ACPI / dock: Use acpi_device_enumerated() to check if dock is present
Linus Torvalds [Fri, 14 Feb 2014 19:05:41 +0000 (11:05 -0800)]
Merge tag 'edac_for_3.14' of git://git./linux/kernel/git/bp/bp
Pull EDAC fixes from Borislav Petkov:
"Fix polling timeout setting through sysfs.
You're surely wondering why the patches are not based on an rc. Well,
Andrew sent you
79040cad3f82 ("drivers/edac/edac_mc_sysfs.c: poll
timeout cannot be zero sent you") already (it got in in -rc2) but it
is not enough as a fix because for one, setting too low polling
intervals (< 1sec) don't make any sense and cause unnecessary polling
load on the system.
Then, even if we set some interval, we explode with
[ 4143.094342] WARNING: CPU: 1 PID: 0 at kernel/workqueue.c:1393 __queue_work+0x1d7/0x340()
because the workqueue setup path is used also for the timeout period
resetting and we're doing INIT_DELAYED_WORK() on an already active
workqueue. Which is total bollocks. So this is taken care of by the
second patch.
I've CCed stable for those two"
* tag 'edac_for_3.14' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp:
EDAC: Correct workqueue setup path
EDAC: Poll timeout cannot be zero, p2
Linus Torvalds [Fri, 14 Feb 2014 19:04:54 +0000 (11:04 -0800)]
Merge tag 'fbdev-fixes-3.14' of git://git./linux/kernel/git/tomba/linux
Pull fbdev fixes from Tomi Valkeinen:
"Minor fbdev fixes for 3.14"
* tag 'fbdev-fixes-3.14' of git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux:
video: Kconfig: Allow more broad selection of the imxfb framebuffer driver.
video: exynos: Fix S6E8AX0 LCD driver build error
OMAPDSS: fix fck field types
OMAPDSS: DISPC: decimation rounding fix
Linus Torvalds [Fri, 14 Feb 2014 18:45:18 +0000 (10:45 -0800)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull block IO fixes from Jens Axboe:
"Second round of updates and fixes for 3.14-rc2. Most of this stuff
has been queued up for a while. The notable exception is the blk-mq
changes, which are naturally a bit more in flux still.
The pull request contains:
- Two bug fixes for the new immutable vecs, causing crashes with raid
or swap. From Kent.
- Various blk-mq tweaks and fixes from Christoph. A fix for
integrity bio's from Nic.
- A few bcache fixes from Kent and Darrick Wong.
- xen-blk{front,back} fixes from David Vrabel, Matt Rushton, Nicolas
Swenson, and Roger Pau Monne.
- Fix for a vec miscount with integrity vectors from Martin.
- Minor annotations or fixes from Masanari Iida and Rashika Kheria.
- Tweak to null_blk to do more normal FIFO processing of requests
from Shlomo Pongratz.
- Elevator switching bypass fix from Tejun.
- Softlockup in blkdev_issue_discard() fix when !CONFIG_PREEMPT from
me"
* 'for-linus' of git://git.kernel.dk/linux-block: (31 commits)
block: add cond_resched() to potentially long running ioctl discard loop
xen-blkback: init persistent_purge_work work_struct
blk-mq: pair blk_mq_start_request / blk_mq_requeue_request
blk-mq: dont assume rq->errors is set when returning an error from ->queue_rq
block: Fix cloning of discard/write same bios
block: Fix type mismatch in ssize_t_blk_mq_tag_sysfs_show
blk-mq: rework flush sequencing logic
null_blk: use blk_complete_request and blk_mq_complete_request
virtio_blk: use blk_mq_complete_request
blk-mq: rework I/O completions
fs: Add prototype declaration to appropriate header file include/linux/bio.h
fs: Mark function as static in fs/bio-integrity.c
block/null_blk: Fix completion processing from LIFO to FIFO
block: Explicitly handle discard/write same segments
block: Fix nr_vecs for inline integrity vectors
blk-mq: Add bio_integrity setup to blk_mq_make_request
blk-mq: initialize sg_reserved_size
blk-mq: handle dma_drain_size
blk-mq: divert __blk_put_request for MQ ops
blk-mq: support at_head inserations for blk_execute_rq
...
Linus Torvalds [Fri, 14 Feb 2014 18:34:30 +0000 (10:34 -0800)]
Merge tag 'sound-3.14-rc3' of git://git./linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"Again only fixes for HD-audio:
- regression fixes due to the modularization
- a few fixups for Dell, Sony and HP laptops
- a revert of the previous fix as it leads to another regression"
* tag 'sound-3.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: Revert "ALSA: hda/realtek - Avoid invalid COEFs for ALC271X"
ALSA: hda - Fix undefined symbol due to builtin/module mixup
ALSA: hda - Fix mic capture on Sony VAIO Pro 11
ALSA: hda - Add a headset quirk for Dell XPS 13
ALSA: hda - Fix inconsistent Mic mute LED
ALSA: hda - Fix leftover ifdef checks after modularization
Linus Torvalds [Fri, 14 Feb 2014 18:33:45 +0000 (10:33 -0800)]
Merge tag 'rdma-for-linus' of git://git./linux/kernel/git/roland/infiniband
Pull RDMA/InfiniBand fixes from Roland Dreier:
- Fix some rough edges from the "IP addressing for IBoE" merge
- Other misc fixes, mostly to hardware drivers
* tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (21 commits)
RDMA/ocrdma: Fix load time panic during GID table init
RDMA/ocrdma: Fix traffic class shift
IB/iser: Fix use after free in iser_snd_completion()
IB/iser: Avoid dereferencing iscsi_iser conn object when not bound to iser connection
IB/usnic: Fix smatch endianness error
IB/mlx5: Remove dependency on X86
mlx5: Add include of <linux/slab.h> because of kzalloc()/kfree() use
IB/qib: Add missing serdes init sequence
RDMA/cxgb4: Add missing neigh_release in LE-Workaround path
IB: Report using RoCE IP based gids in port caps
IB/mlx4: Build the port IBoE GID table properly under bonding
IB/mlx4: Do IBoE GID table resets per-port
IB/mlx4: Do IBoE locking earlier when initializing the GID table
IB/mlx4: Move rtnl locking to the right place
IB/mlx4: Make sure GID index 0 is always occupied
IB/mlx4: Don't allocate range of steerable UD QPs for Ethernet-only device
RDMA/amso1100: Fix error return code
RDMA/nes: Fix error return code
IB/mlx5: Don't set "block multicast loopback" capability
IB/mlx5: Fix binary compatibility with libmlx5
...
Linus Torvalds [Fri, 14 Feb 2014 18:33:13 +0000 (10:33 -0800)]
Merge tag 'hwmon-for-linus' of git://git./linux/kernel/git/groeck/linux-staging
Pull hwmon fix from Guenter Roeck:
"Fix arithmetic overflow in ntc_thermistor driver"
* tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon: (ntc_thermistor) Avoid math overflow