Linus Torvalds [Thu, 23 Oct 2008 16:52:39 +0000 (09:52 -0700)]
Merge git://git./linux/kernel/git/bart/ide-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6:
ide: remove useless subdirs from drivers/ide/
Linus Torvalds [Thu, 23 Oct 2008 16:50:12 +0000 (09:50 -0700)]
Merge git://git./linux/kernel/git/agk/linux-2.6-dm
* git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dm:
dm: tidy local_init
dm: remove unused flush_all
dm raid1: separate region_hash interface part1
dm: mark split bio as cloned
dm crypt: remove waitqueue
dm crypt: fix async split
dm crypt: tidy sector
dm: remove dm header from targets
dm: publish array_too_big
dm exception store: fix misordered writes
dm exception store: refactor zero_area
dm snapshot: drop unused last_percent
dm snapshot: fix primary_pe race
dm kcopyd: avoid queue shuffle
Linus Torvalds [Thu, 23 Oct 2008 16:38:55 +0000 (09:38 -0700)]
Merge branch 'core-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
rcupdate: fix bug of rcu_barrier*()
profiling: fix !procfs build
Fixed trivial conflicts in 'include/linux/profile.h'
Linus Torvalds [Thu, 23 Oct 2008 16:37:16 +0000 (09:37 -0700)]
Merge branch 'sched-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
sched: disable the hrtick for now
sched: revert back to per-rq vruntime
sched: fair scheduler should not resched rt tasks
sched: optimize group load balancer
sched: minor fast-path overhead reduction
sched: fix the wrong mask_len, cleanup
sched: kill unused scheduler decl.
sched: fix the wrong mask_len
sched: only update rq->clock while holding rq->lock
Linus Torvalds [Thu, 23 Oct 2008 16:36:55 +0000 (09:36 -0700)]
Merge branch 'irq-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
genirq: NULL struct irq_desc's member 'name' in dynamic_irq_cleanup()
genirq: fix off by one and coding style
genirq: fix set_irq_type() when recording trigger type
Lee Howard [Tue, 21 Oct 2008 12:50:14 +0000 (13:50 +0100)]
8250: Add more OxSemi devices
These have the Mainpine PCI identifier on however
Additional paranoia check for Tornado versions added by Alan Cox
(and this time I remembered to do an stg refresh so that the corrections ended
up in these patches not randomly attached to another diff -- Alan)
Signed-off-by: Lee Howard <lee.howard@mainpine.com>
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Lee Howard [Tue, 21 Oct 2008 12:48:58 +0000 (13:48 +0100)]
8250: Oxford Semiconductor Devices
Add support for the OxSemi 'Tornado' devices.
Reformatted and reworked a bit by Alan Cox
Signed-off-by: Lee Howard <lee.howard@mainpine.com>
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Alan Cox [Tue, 21 Oct 2008 12:47:44 +0000 (13:47 +0100)]
tty: Fix tty_port kref screwup
Pass the brown paper bags please. I changed the semantics of this so the
function was supposed to do the extra kref itself then forgot to do the
change.. duh....
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Alan Cox [Tue, 21 Oct 2008 12:44:58 +0000 (13:44 +0100)]
watchdog: Fix warning
This seems to have popped up after the recent merges:
drivers/watchdog/w83697ug_wdt.c: In function ‘w83697ug_select_wd_register’:
drivers/watchdog/w83697ug_wdt.c:105: warning: ‘return’ with a value, in function returning void
Signed-off-by: Alan Cox <alan@redhat.com>
Acked-by: Wim Van Sebroeck <wim@iguana.be>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Nick Piggin [Tue, 21 Oct 2008 08:59:15 +0000 (10:59 +0200)]
mutex: speed up generic mutex implementations
- atomic operations which both modify the variable and return something imply
full smp memory barriers before and after the memory operations involved
(failing atomic_cmpxchg, atomic_add_unless, etc don't imply a barrier because
they don't modify the target). See Documentation/atomic_ops.txt.
So remove extra barriers and branches.
- All architectures support atomic_cmpxchg. This has no relation to
__HAVE_ARCH_CMPXCHG. We can just take the atomic_cmpxchg path unconditionally
This reduces a simple single threaded fastpath lock+unlock test from 590 cycles
to 203 cycles on a ppc970 system.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Thu, 23 Oct 2008 16:16:56 +0000 (09:16 -0700)]
Merge git://git./linux/kernel/git/czankel/xtensa-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/czankel/xtensa-2.6:
xtensa: Add config files for Diamond 232L - Rev B processor variant
xtensa: Fix io regions
xtensa: Add support for the Sonic Ethernet device for the XT2000 board.
xtensa: replace remaining __FUNCTION__ occurrences
xtensa: use newer __SPIN_LOCK_UNLOCKED macro
XTENSA: warn about including <asm/rwsem.h> directly.
KAMEZAWA Hiroyuki [Wed, 22 Oct 2008 21:15:05 +0000 (14:15 -0700)]
memcg: fix page_cgroup allocation
page_cgroup_init() is called from mem_cgroup_init(). But at this
point, we cannot call alloc_bootmem().
(and this caused panic at boot.)
This patch moves page_cgroup_init() to init/main.c.
Time table is following:
==
parse_args(). # we can trust mem_cgroup_subsys.disabled bit after this.
....
cgroup_init_early() # "early" init of cgroup.
....
setup_arch() # memmap is allocated.
...
page_cgroup_init();
mem_init(); # we cannot call alloc_bootmem after this.
....
cgroup_init() # mem_cgroup is initialized.
==
Before page_cgroup_init(), mem_map must be initialized. So,
I added page_cgroup_init() to init/main.c directly.
(*) maybe this is not very clean but
- cgroup_init_early() is too early
- in cgroup_init(), we have to use vmalloc instead of alloc_bootmem().
use of vmalloc area in x86-32 is important and we should avoid very large
vmalloc() in x86-32. So, we want to use alloc_bootmem() and added page_cgroup_init()
directly to init/main.c
[akpm@linux-foundation.org: remove unneeded/bad mem_cgroup_subsys declaration]
[akpm@linux-foundation.org: fix build]
Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Tested-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Duane Griffin [Wed, 22 Oct 2008 21:15:03 +0000 (14:15 -0700)]
jbd: abort instead of waiting for nonexistent transactions
The __log_wait_for_space function sits in a loop checkpointing
transactions until there is sufficient space free in the journal.
However, if there are no transactions to be processed (e.g. because the
free space calculation is wrong due to a corrupted filesystem) it will
never progress.
Check for space being required when no transactions are outstanding and
abort the journal instead of endlessly looping.
This patch fixes the bug reported by Sami Liedes at:
http://bugzilla.kernel.org/show_bug.cgi?id=10976
Signed-off-by: Duane Griffin <duaneg@dghda.com>
Tested-by: Sami Liedes <sliedes@cc.hut.fi>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Hidehiro Kawai [Wed, 22 Oct 2008 21:15:02 +0000 (14:15 -0700)]
jbd: test BH_Write_EIO to detect errors on metadata buffers
__try_to_free_cp_buf(), __process_buffer(), and __wait_cp_io() test
BH_Uptodate flag to detect write I/O errors on metadata buffers. But by
commit
95450f5a7e53d5752ce1a0d0b8282e10fe745ae0 "ext3: don't read inode
block if the buffer has a write error"(*), BH_Uptodate flag can be set to
inode buffers with BH_Write_EIO in order to avoid reading old inode data.
So now, we have to test BH_Write_EIO flag of checkpointing inode buffers
instead of BH_Uptodate. This patch does it.
Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
Acked-by: Jan Kara <jack@suse.cz>
Acked-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Hidehiro Kawai [Wed, 22 Oct 2008 21:15:01 +0000 (14:15 -0700)]
ext3: add checks for errors from jbd
If the journal has aborted due to a checkpointing failure, we have to
keep the contents of the journal space. Otherwise, the filesystem will
lose uncheckpointed metadata completely and become inconsistent. To
avoid this, we need to keep needs_recovery flag if checkpoint has
failed.
With this patch, ext3_put_super() detects a checkpointing failure from
the return value of journal_destroy(), then it invokes ext3_abort() to
make the filesystem read only and keep needs_recovery flag. Errors
from journal_flush() are also handled by this patch in some places.
Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
Cc: Jan Kara <jack@ucw.cz>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Hidehiro Kawai [Wed, 22 Oct 2008 21:15:00 +0000 (14:15 -0700)]
jbd: fix error handling for checkpoint io
When a checkpointing IO fails, current JBD code doesn't check the error
and continue journaling. This means latest metadata can be lost from both
the journal and filesystem.
This patch leaves the failed metadata blocks in the journal space and
aborts journaling in the case of log_do_checkpoint(). To achieve this, we
need to do:
1. don't remove the failed buffer from the checkpoint list where in
the case of __try_to_free_cp_buf() because it may be released or
overwritten by a later transaction
2. log_do_checkpoint() is the last chance, remove the failed buffer
from the checkpoint list and abort the journal
3. when checkpointing fails, don't update the journal super block to
prevent the journaled contents from being cleaned. For safety,
don't update j_tail and j_tail_sequence either
4. when checkpointing fails, notify this error to the ext3 layer so
that ext3 don't clear the needs_recovery flag, otherwise the
journaled contents are ignored and cleaned in the recovery phase
5. if the recovery fails, keep the needs_recovery flag
6. prevent cleanup_journal_tail() from being called between
__journal_drop_transaction() and journal_abort() (a race issue
between journal_flush() and __log_wait_for_space()
Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
Acked-by: Jan Kara <jack@suse.cz>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Paul Mundt [Wed, 22 Oct 2008 21:14:59 +0000 (14:14 -0700)]
profiling: fix up CONFIG_PROC_FS=n build
In the case where procfs is disabled, create_proc_profile() does not
exist. Stub it in with the others.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Paul Mundt [Wed, 22 Oct 2008 21:14:58 +0000 (14:14 -0700)]
mm: page_cgroup needs linux/vmalloc.h for vmalloc_node()/vfree().
mm/page_cgroup.c: In function 'init_section_page_cgroup':
mm/page_cgroup.c:111: error: implicit declaration of function 'vmalloc_node'
mm/page_cgroup.c:111: warning: assignment makes pointer from integer without a cast
mm/page_cgroup.c: In function '__free_page_cgroup':
mm/page_cgroup.c:140: error: implicit declaration of function 'vfree'
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Thu, 23 Oct 2008 15:28:25 +0000 (08:28 -0700)]
Merge git://git./linux/kernel/git/benh/powerpc
* git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (53 commits)
powerpc: Support for relocatable kdump kernel
powerpc: Don't use a 16G page if beyond mem= limits
powerpc: Add del_node() for early boot code to prune inapplicable devices.
powerpc: Further compile fixup for STRICT_MM_TYPECHECKS
powerpc: Remove empty #else from signal_64.c
powerpc: Move memory size print into common show_cpuinfo for 32-bit
hvc_console: Remove __devexit annotation of hvc_remove()
hvc_console: Add support for tty window resizing
hvc_console: Fix loop if put_char() returns 0
hvc_console: Add tty driver flag TTY_DRIVER_RESET_TERMIOS
hvc_console: Add a hangup notifier for backends
powerpc/83xx: Add DS1339 RTC support for MPC8349E-mITX boards .dts
powerpc/83xx: Add support for MCU microcontroller in .dts files
powerpc/85xx: Move mpc8572ds.dts to address-cells/size-cells = <2>
of/spi: Support specifying chip select as active high via device tree
powerpc: Remove device_type = "board_control" properties in .dts files
i2c-cpm: Suppress autoprobing for devices
powerpc/85xx: Fix mpc8536ds dma interrupt numbers
powerpc/85xx: Enable enhanced functions for 8536 TSEC
powerpc: Delete unused prom_strtoul and prom_memparse
...
Linus Torvalds [Thu, 23 Oct 2008 15:20:34 +0000 (08:20 -0700)]
Merge branch 'for-upstream' of git://git./linux/kernel/git/dvrabel/uwb
* 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/dvrabel/uwb: (47 commits)
uwb: wrong sizeof argument in mac address compare
uwb: don't use printk_ratelimit() so often
uwb: use kcalloc where appropriate
uwb: use time_after() when purging stale beacons
uwb: add credits for the original developers of the UWB/WUSB/WLP subsystems
uwb: add entries in the MAINTAINERS file
uwb: depend on EXPERIMENTAL
wusb: wusb-cbaf (CBA driver) sysfs ABI simplification
uwb: document UWB and WUSB sysfs files
uwb: add symlinks in sysfs between radio controllers and PALs
uwb: dont tranmit identification IEs
uwb: i1480/GUWA100U: fix firmware download issues
uwb: i1480: remove MAC/PHY information checking function
uwb: add Intel i1480 HWA to the UWB RC quirk table
uwb: disable command/event filtering for D-Link DUB-1210
uwb: initialize the debug sub-system
uwb: Fix handling IEs with empty IE data in uwb_est_get_size()
wusb: fix bmRequestType for Abort RPipe request
wusb: fix error path for wusb_set_dev_addr()
wusb: add HWA host controller driver
...
Linus Torvalds [Thu, 23 Oct 2008 15:18:33 +0000 (08:18 -0700)]
Merge branch 'for-next' of git://git.o-hand.com/linux-mfd
* 'for-next' of git://git.o-hand.com/linux-mfd:
mfd: check for platform_get_irq() return value in sm501
mfd: use pci_ioremap_bar() in sm501
mfd: Don't store volatile bits in WM8350 register cache
mfd: don't export wm3850 static functions
mfd: twl4030-gpio driver
mfd: rtc-twl4030 driver
mfd: twl4030 IRQ handling update
Linus Torvalds [Thu, 23 Oct 2008 15:16:03 +0000 (08:16 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/roland/infiniband
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
IB/ehca: Reject dynamic memory add/remove when ehca adapter is present
IB/ehca: Fix reported max number of QPs and CQs in systems with >1 adapter
IPoIB: Set netdev offload features properly for child (VLAN) interfaces
IPoIB: Clean up ethtool support
mlx4_core: Add Ethernet PCI device IDs
mlx4_en: Add driver for Mellanox ConnectX 10GbE NIC
mlx4_core: Multiple port type support
mlx4_core: Ethernet MAC/VLAN management
mlx4_core: Get ethernet MTU and default address from firmware
mlx4_core: Support multiple pre-reserved QP regions
Update NetEffect maintainer emails to Intel emails
RDMA/cxgb3: Remove cmid reference on tid allocation failures
IB/mad: Use krealloc() to resize snoop table
IPoIB: Always initialize poll_timer to avoid crash on unload
IB/ehca: Don't allow creating UC QP with SRQ
mlx4_core: Add QP range reservation support
RDMA/ucma: Test ucma_alloc_multicast() return against NULL, not with IS_ERR()
Linus Torvalds [Thu, 23 Oct 2008 15:15:29 +0000 (08:15 -0700)]
Merge branch 'upstream-linus' of git://git./linux/kernel/git/jgarzik/libata-dev
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
sata_via: load DEVICE register when CTL changes
libata: set device class to NONE if phys_offline
libata-eh: fix slave link EH action mask handling
libata: transfer EHI control flags to slave ehc.i
libata-sff: fix ata_sff_post_internal_cmd()
libata: initialize port_task when !CONFIG_ATA_SFF
Linus Torvalds [Thu, 23 Oct 2008 15:12:21 +0000 (08:12 -0700)]
Merge branch 'for-linus' of /home/rmk/linux-2.6-arm
* 'for-linus' of master.kernel.org:/home/rmk/linux-2.6-arm:
[ARM] clps711x: add sparsemem definitions
[ARM] 5315/1: Fix section mismatch warning (sa1111)
[ARM] Orion: activate workaround for
88f6183 SPI clock erratum
[ARM] Orion: instantiate the dsa switch driver
[ARM] mv78xx0: force link speed/duplex on eth2/eth3
[ARM] remove extra brace in arch/arm/mach-pxa/trizeps4.c
[ARM] balance parenthesis in header file
[ARM] pxa: fix trizeps PCMCIA build
[ARM] pxa: fix trizeps defconfig
[ARM] dmabounce requires ZONE_DMA
[ARM] 5303/1: period_cycles should be greater than 1
[ARM] 5310/1: Fix cache flush functions for ARMv4
[ARM] pxa: fix
3bca103a1e658d23737d20e1989139d9ca8973bf
[ARM] pxa: fix redefinition of NR_IRQS
[ARM] S3C24XX: Fix redefine of DEFINE_TIMER() in s3c24xx pwm-clock.c
[ARM] S3C2443: Fix HCLK rate
[ARM] S3C24XX: Serial driver debug depends on DEBUG_LL
[ARM] S3C24XX: pwm-clock set_parent mask fix
Linus Torvalds [Thu, 23 Oct 2008 15:07:35 +0000 (08:07 -0700)]
Merge branch 'release' of git://git./linux/kernel/git/aegl/linux-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6: (41 commits)
[IA64] Fix annoying IA64_TR_ALLOC_MAX message.
[IA64] kill sys32_pipe
[IA64] remove sys32_pause
[IA64] Add Variable Page Size and IA64 Support in Intel IOMMU
ia64/pv_ops: paravirtualized instruction checker.
ia64/xen: a recipe for using xen/ia64 with pv_ops.
ia64/pv_ops: update Kconfig for paravirtualized guest and xen.
ia64/xen: preliminary support for save/restore.
ia64/xen: define xen machine vector for domU.
ia64/pv_ops/xen: implement xen pv_time_ops.
ia64/pv_ops/xen: implement xen pv_irq_ops.
ia64/pv_ops/xen: define the nubmer of irqs which xen needs.
ia64/pv_ops/xen: implement xen pv_iosapic_ops.
ia64/pv_ops/xen: paravirtualize entry.S for ia64/xen.
ia64/pv_ops/xen: paravirtualize ivt.S for xen.
ia64/pv_ops/xen: paravirtualize DO_SAVE_MIN for xen.
ia64/pv_ops/xen: define xen paravirtualized instructions for hand written assembly code
ia64/pv_ops/xen: define xen pv_cpu_ops.
ia64/pv_ops/xen: define xen pv_init_ops for various xen initialization.
ia64/pv_ops/xen: elf note based xen startup.
...
Tejun Heo [Tue, 21 Oct 2008 15:45:57 +0000 (00:45 +0900)]
sata_via: load DEVICE register when CTL changes
VIA controllers clear DEVICE register when IEN changes. Make sure
DEVICE is updated along with CTL.
This change is separated from Joseph Chan's larger patch.
http://thread.gmane.org/gmane.linux.kernel.commits.mm/40640
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Joseph Chan <JosephChan@via.com.tw>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Tejun Heo [Tue, 21 Oct 2008 15:31:34 +0000 (00:31 +0900)]
libata: set device class to NONE if phys_offline
Reset methods don't have access to phys link status for slave links
and may incorrectly indicate device presence causing unnecessary probe
failures for unoccupied links. This patch clears device class to NONE
during post-reset processing if phys link is offline.
As on/offlineness semantics is strictly defined and used in multiple
places by the core layer, this won't change behavior for drivers which
don't use slave links.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Tejun Heo [Tue, 21 Oct 2008 11:37:21 +0000 (20:37 +0900)]
libata-eh: fix slave link EH action mask handling
Slave link action mask is transferred to master link and all the EH
actions are taken by the master link. ata_eh_about_to_do() and
ata_eh_done() are called with ATA_EH_ALL_ACTIONS to clear the slave
link actions during transfer. This always sets ATA_PFLAG_RECOVERED
flag causing spurious "EH complete" messages.
Don't set ATA_PFLAG_RECOVERED for slave link actions.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Tejun Heo [Tue, 21 Oct 2008 05:26:39 +0000 (14:26 +0900)]
libata: transfer EHI control flags to slave ehc.i
ATA_EHI_NO_AUTOPSY and ATA_EHI_QUIET are used to control the behavior
of EH. As only the master link is visible outside EH, these flags are
set only for the master link although they should also apply to the
slave link, which causes spurious EH messages during probe and
suspend/resume.
This patch transfers those two flags to slave ehc.i before performing
slave autopsy and reporting.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Tejun Heo [Mon, 20 Oct 2008 04:10:21 +0000 (13:10 +0900)]
libata-sff: fix ata_sff_post_internal_cmd()
ata_sff_post_internal_cmd() needs to grab port lock before calling
ata_bmdma_stop() and also need to clear hsm_task_state. Fix it.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Tejun Heo [Mon, 20 Oct 2008 04:11:56 +0000 (13:11 +0900)]
libata: initialize port_task when !CONFIG_ATA_SFF
ap->port_task was not initialized if !CONFIG_ATA_SFF later triggering
lockdep warning. Make sure it's initialized.
Reported by Larry Finger.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Roland Dreier [Wed, 22 Oct 2008 22:56:41 +0000 (15:56 -0700)]
Merge branches 'cma', 'cxgb3', 'ehca', 'ipoib', 'mad', 'mlx4' and 'nes' into for-next
Stefan Roscher [Wed, 22 Oct 2008 22:54:38 +0000 (15:54 -0700)]
IB/ehca: Reject dynamic memory add/remove when ehca adapter is present
Since the ehca device driver does not support dynamic memory add and
remove operations, the driver must explicitly reject such requests in
order to prevent unpredictable behaviors related to existing memory
regions that cover all of memory being used by InfiniBand protocols in
the kernel.
The solution (for now at least) is to add a memory notifier to the
ehca device driver and if a request for dynamic memory add or remove
comes in, ehca will always reject it. The user can add or remove
memory by hot-removing the ehca adapter, performing the memory
operation, and then hot-adding the ehca adapter back.
Signed-off-by: Stefan Roscher <stefan.roscher@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Stefan Roscher [Wed, 22 Oct 2008 22:52:31 +0000 (15:52 -0700)]
IB/ehca: Fix reported max number of QPs and CQs in systems with >1 adapter
Because ehca adapters can differ in the maximum number of QPs and CQs
we have to save the maximum number of these ressources per adapter and
not globally per ehca driver. This fix introduces 2 new members to the
shca structure to store the maximum value for QPs and CQs per adapter.
The module parameters are now used as initial values for those
variables. If a user selects an invalid number of CQs or QPs we don't
print an error any longer, instead we will inform the user with a
warning and set the values to the respective maximum supported by the
HW.
Signed-off-by: Stefan Roscher <stefan.roscher@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Or Gerlitz [Wed, 22 Oct 2008 22:49:49 +0000 (15:49 -0700)]
IPoIB: Set netdev offload features properly for child (VLAN) interfaces
Child devices were created without any offload features set, fix this by
moving the code that computes the features into generic function which is
now called through non-child and child device creation.
Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
-- v1 has a bug where the 'result' flag in ipoib_vlan_add may be used uninitialized
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Or Gerlitz [Wed, 22 Oct 2008 22:49:29 +0000 (15:49 -0700)]
IPoIB: Clean up ethtool support
Add a get_rx_csum method. Remove the driver's own get_tso method, as
the ethtool kernel code uses the default one if nothing is provided.
Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Yevgeny Petrilin [Wed, 22 Oct 2008 22:48:03 +0000 (15:48 -0700)]
mlx4_core: Add Ethernet PCI device IDs
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Yevgeny Petrilin [Wed, 22 Oct 2008 22:47:49 +0000 (15:47 -0700)]
mlx4_en: Add driver for Mellanox ConnectX 10GbE NIC
The Mellanox ConnectX can operate as an InfiniBand adapter, as an
Ethernet NIC, or as a Fibre Channel (FC) HBA. The kernel has a
low-level driver, mlx4_core, which handles multiplexing access to the
device, and there is also already an InfiniBad driver, mlx4_ib.
This patch adds a new driver, mlx4_en, which implements a standard
Ethernet NIC driver.
Signed-off-by: Liran Liss <liranl@mellanox.co.il>
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Yevgeny Petrilin [Wed, 22 Oct 2008 22:38:42 +0000 (15:38 -0700)]
mlx4_core: Multiple port type support
Multi-protocol adapters support different port types. Each consumer
of mlx4_core queries for supported port types; in particular mlx4_ib
can no longer assume that all physical ports belong to it. Port type
is configured through a sysfs interface. When the type of a port is
changed, all mlx4 interfaces are unregistered, and then registered
again with the new port types.
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Yevgeny Petrilin [Wed, 22 Oct 2008 18:44:46 +0000 (11:44 -0700)]
mlx4_core: Ethernet MAC/VLAN management
Add support for managing MAC and VLAN filters for each port.
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: Oren Duer <oren@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Yevgeny Petrilin [Wed, 22 Oct 2008 17:56:48 +0000 (10:56 -0700)]
mlx4_core: Get ethernet MTU and default address from firmware
Get maximum ethernet MTU and default MAC address from the firmware
QUERY_DEV_CAP command.
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Yevgeny Petrilin [Wed, 22 Oct 2008 17:25:29 +0000 (10:25 -0700)]
mlx4_core: Support multiple pre-reserved QP regions
For ethernet support, we need to reserve QPs for the ethernet and
fibre channel driver. The QPs are reserved at the end of the QP
table. (This way we assure that they are aligned to their size)
We need to consider these reserved ranges in bitmap creation, so we
extend the mlx4 bitmap utility functions to allow reserved ranges at
both the bottom and the top of the range.
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Mohan Kumar M [Tue, 21 Oct 2008 17:38:10 +0000 (17:38 +0000)]
powerpc: Support for relocatable kdump kernel
This adds relocatable kernel support for kdump. With this one can
use the same regular kernel to capture the kdump. A signature (0xfeed1234)
is passed in r6 from panic code to the next kernel through kexec_sequence
and purgatory code. The signature is used to differentiate between
kdump kernel and non-kdump kernels.
The purgatory code compares the signature and sets the __kdump_flag in
head_64.S. During the boot up, kernel code checks __kdump_flag and if it
is set, the kernel will behave as relocatable kdump kernel. This kernel
will boot at the address where it was loaded by kexec-tools ie. at the
address reserved through crashkernel boot parameter.
CONFIG_CRASH_DUMP depends on CONFIG_RELOCATABLE option to build kdump
kernel as relocatable. So the same kernel can be used as production and
kdump kernel.
This patch incorporates the changes suggested by Paul Mackerras to avoid
GOT use and to avoid two copies of the code.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Mohan Kumar M <mohan@in.ibm.com>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Jon Tollefson [Tue, 21 Oct 2008 15:27:36 +0000 (15:27 +0000)]
powerpc: Don't use a 16G page if beyond mem= limits
If mem= is used on the boot command line to limit memory then the memory block where a 16G page resides may not be available.
Thanks to Michael Ellerman for finding the problem.
Signed-off-by: Jon Tollefson <kniht@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Mike Ditto [Tue, 21 Oct 2008 11:32:29 +0000 (11:32 +0000)]
powerpc: Add del_node() for early boot code to prune inapplicable devices.
Some platforms have variants that can share most of a flat device tree but need
a few devices selectively pruned at boot time. This adds del_node() to ops.h
to allow access to the existing fdt_del_node().
Signed-off-by: Mike Ditto <mditto@consentry.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
David Gibson [Mon, 20 Oct 2008 17:55:29 +0000 (17:55 +0000)]
powerpc: Further compile fixup for STRICT_MM_TYPECHECKS
A patch of mine was recently committed to fix up STRICT_MM_TYPECHECKS
behaviour on powerpc (
f5ea64dcbad89875d130596df14c9b25d994a737).
However, something which breaks it again seems to have slipped in
afterwards. So, here's another small fix.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Michael Neuling [Wed, 15 Oct 2008 18:21:07 +0000 (18:21 +0000)]
powerpc: Remove empty #else from signal_64.c
Remove empty/bogus #else from signal_64.c
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Becky Bruce [Wed, 15 Oct 2008 08:25:28 +0000 (08:25 +0000)]
powerpc: Move memory size print into common show_cpuinfo for 32-bit
Most of the platforms were printing the size of the memory
in their show_cpuinfo implementations. This moves that to
the common show_cpuinfo, so that all 32-bit platforms will
now print the size of memory. I also update the code
to deal with the fact that total_memory is now a phys_addr_t.
Signed-off-by: Becky Bruce <becky.bruce@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Hendrik Brueckner [Mon, 13 Oct 2008 23:12:52 +0000 (23:12 +0000)]
hvc_console: Remove __devexit annotation of hvc_remove()
Removed __devexit annotation of hvc_remove() to avoid a section mismatch
if the backend initialization fails and hvc_remove() must be used to
clean up allocated hvc structs (called in section __init or __devinit).
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Hendrik Brueckner [Tue, 14 Oct 2008 05:02:09 +0000 (05:02 +0000)]
hvc_console: Add support for tty window resizing
The patch provides the hvc_resize() function to update the terminal
window dimensions (struct winsize) for a specified hvc console.
The function stores the new window size and schedules a function
that finally updates the tty winsize and signals the change to
user space (SIGWINCH).
Because the winsize update must acquire a mutex and might sleep,
the function is scheduled instead of being called from hvc_poll()
or khvcd.
This patch uses the tty_do_resize() routine from the tty layer.
A pending resize work is canceled in hvc_close() and hvc_hangup().
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Hendrik Brueckner [Mon, 13 Oct 2008 23:12:50 +0000 (23:12 +0000)]
hvc_console: Fix loop if put_char() returns 0
If put_char() routine of a hvc console backend returns 0, then the
hvc console starts looping in the following scenarios:
1. hvc_console_print()
If put_char() returns 0 then the while loop may loop forever.
I have added the missing check for 0 to throw away console messages.
2. khvcd may loop:
The thread calls hvc_poll() --> hvc_push()... if there are still
buffered data then the HVC_POLL_WRITE bit is set and causes the
khvcd thread to loop (if yield() returns immediately).
However, instead of looping, the khvcd thread could sleep for
MIN_TIMEOUT (doing the same as for get_chars()).
The MIN_TIMEOUT is set if hvc_push() was not able to write
data to the backend. If data has been written, the timeout is
set to 0 to immediately re-schedule hvc_poll().
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com> (virtio_console)
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Hendrik Brueckner [Mon, 13 Oct 2008 23:12:49 +0000 (23:12 +0000)]
hvc_console: Add tty driver flag TTY_DRIVER_RESET_TERMIOS
After a tty hangup() or close() operation, processes might not reset the
termio settings to a sane state. In order to reset the termios to its
default settings the tty driver flag TTY_DRIVER_RESET_TERMIOS has been added.
TTY driver flag description from include/linux/tty_driver.h:
TTY_DRIVER_RESET_TERMIOS --- requests the tty layer to reset the
termios setting when the last process has closed the device.
Used for PTY's, in particular.
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Hendrik Brueckner [Mon, 13 Oct 2008 23:12:48 +0000 (23:12 +0000)]
hvc_console: Add a hangup notifier for backends
I have added a hangup notifier that can be used by hvc console
backends to handle a tty hangup. The default irq hangup notifier
calls the notifier_del_irq() for compatibility.
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Roel Kluin [Tue, 21 Oct 2008 23:39:55 +0000 (01:39 +0200)]
mfd: check for platform_get_irq() return value in sm501
sm501_devdata->irq is unsigned, while platform_get_irq() returns a
signed int.
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
Arjan van de Ven [Sun, 28 Sep 2008 23:14:52 +0000 (16:14 -0700)]
mfd: use pci_ioremap_bar() in sm501
Use the newly introduced pci_ioremap_bar() function in drivers/mfd.
pci_ioremap_bar() just takes a pci device and a bar number, with the goal
of making it really hard to get wrong, while also having a central place
to stick sanity checks.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
Mark Brown [Mon, 20 Oct 2008 21:58:50 +0000 (23:58 +0200)]
mfd: Don't store volatile bits in WM8350 register cache
This makes the contents of the cache clearer and fixes incorrect
initialisation of the cache for partially volatile registers.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
Stephen Rothwell [Mon, 20 Oct 2008 21:55:30 +0000 (23:55 +0200)]
mfd: don't export wm3850 static functions
October 10th linux-next build (powerpc allyesconfig) failed like this:
drivers/mfd/wm8350-core.c:1131: error: __ksymtab_wm8350_create_cache causes a section type conflict
Caused by commit
89b4012befb1abca5e86d232bc0e2a797b0d9825 ("mfd: Core
support for the WM8350 AudioPlus PMIC"). wm8350_create_cache is not used
elsewhere, so remove the EXPORT_SYMBOL.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
David Brownell [Mon, 20 Oct 2008 21:51:46 +0000 (23:51 +0200)]
mfd: twl4030-gpio driver
This adds basic support for the GPIOs in the twl4030 power management
chip. That includes two open drain LED drivers, and the use of GPIO-0
(and GPIO-1) as MMC/SD card detect switches which can control whether
the VMMC1 (and VMMC2) regulators are active.
This version of the code has a debounce call that will probably be
replaced before long, when a more generic interface exists.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
David Brownell [Mon, 20 Oct 2008 21:50:05 +0000 (23:50 +0200)]
mfd: rtc-twl4030 driver
This adds a driver for the RTC inside the TWL4030 multi-function device.
It's a fairly basic RTC, with a wake-capable alarm.
Note that many of the pre-release Overo boards now in circulation can't
effectively use this RTC, because of a wiring error that puts its TWL
chip into "secure" mode. (As in "secure yourself against tampering".)
This isn't an issue on other OMAP3 boards now supported in mainline,
such as Beagle and Labrador.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Acked-by: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
David Brownell [Mon, 20 Oct 2008 21:46:28 +0000 (23:46 +0200)]
mfd: twl4030 IRQ handling update
- Move it into a separate file; clean and streamline it
- Restructure the init code for reuse during secondary dispatch
- Support both levels (primary, secondary) of IRQ dispatch
- Use a workqueue for irq mask/unmask and trigger configuration
Code for two subchips currently share that secondary handler code.
One is the power subchip; its IRQs are now handled by this core,
courtesy of this patch. The other is the GPIO module, which will
be supported through a later patch.
There are also minor changes to the header file, mostly related
to GPIO support; nothing yet in mainline cares about those. A
few references to OMAP-specific symbols are disabled; when they
can all be removed, the TWL4030 support ceases being OMAP-specific.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
Russell King [Tue, 21 Oct 2008 22:29:56 +0000 (23:29 +0100)]
[ARM] clps711x: add sparsemem definitions
Fix the edb7211_defconfig build errors.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Kristoffer Ericson [Tue, 21 Oct 2008 18:47:22 +0000 (19:47 +0100)]
[ARM] 5315/1: Fix section mismatch warning (sa1111)
This patch fixes the section mismatch warning from
sa1111.o at buildtime.
CC arch/arm/common/sa1111.o
LD arch/arm/common/built-in.o
LD vmlinux.o
MODPOST vmlinux.o
WARNING: vmlinux.o(.text+0x87f4): Section mismatch in reference from the function sa1111_probe() to the function .devinit.text:sa1110_mb_enable()
The function sa1111_probe() references
the function __devinit sa1110_mb_enable().
This is often because sa1111_probe lacks a __devinit
annotation or the annotation of sa1110_mb_enable is wrong.
Signed-off-by: Kristoffer Ericson <kristoffer.ericson@gmail.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Bartlomiej Zolnierkiewicz [Tue, 21 Oct 2008 18:57:23 +0000 (20:57 +0200)]
ide: remove useless subdirs from drivers/ide/
Suggested-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Kiyoshi Ueda [Tue, 21 Oct 2008 16:45:08 +0000 (17:45 +0100)]
dm: tidy local_init
This patch tidies local_init() in preparation for request-based dm.
No functional change.
Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Kiyoshi Ueda [Tue, 21 Oct 2008 16:45:07 +0000 (17:45 +0100)]
dm: remove unused flush_all
This patch removes the DM_WQ_FLUSH_ALL state that is unnecessary.
The dm_queue_flush(md, DM_WQ_FLUSH_ALL, NULL) in dm_suspend()
is never invoked because:
- 'goto flush_and_out' is the same as 'goto out' because
the 'goto flush_and_out' is called only when '!noflush'
- If r is non-zero, then the code above will invoke 'goto out'
and skip this code.
No functional change.
Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Heinz Mauelshagen [Tue, 21 Oct 2008 16:45:06 +0000 (17:45 +0100)]
dm raid1: separate region_hash interface part1
Separate the region hash code from raid1 so it can be shared by forthcoming
targets. Use BUG_ON() for failed async dm_io() calls.
Signed-off-by: Heinz Mauelshagen <hjm@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Martin K. Petersen [Tue, 21 Oct 2008 16:45:04 +0000 (17:45 +0100)]
dm: mark split bio as cloned
When a bio gets split, mark its fragments with the BIO_CLONED flag.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Milan Broz [Tue, 21 Oct 2008 16:45:03 +0000 (17:45 +0100)]
dm crypt: remove waitqueue
Remove waitqueue no longer needed with the async crypto interface.
Signed-off-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Milan Broz [Tue, 21 Oct 2008 16:45:02 +0000 (17:45 +0100)]
dm crypt: fix async split
When writing io, dm-crypt has to allocate a new cloned bio
and encrypt the data into newly-allocated pages attached to this bio.
In rare cases, because of hw restrictions (e.g. physical segment limit)
or memory pressure, sometimes more than one cloned bio has to be used,
each processing a different fragment of the original.
Currently there is one waitqueue which waits for one fragment to finish
and continues processing the next fragment.
But when using asynchronous crypto this doesn't work, because several
fragments may be processed asynchronously or in parallel and there is
only one crypt context that cannot be shared between the bio fragments.
The result may be corruption of the data contained in the encrypted bio.
The patch fixes this by allocating new dm_crypt_io structs (with new
crypto contexts) and running them independently.
The fragments contains a pointer to the base dm_crypt_io struct to
handle reference counting, so the base one is properly deallocated
after all the fragments are finished.
In a low memory situation, this only uses one additional object from the
mempool. If the mempool is empty, the next allocation simple waits for
previous fragments to complete.
Signed-off-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Milan Broz [Tue, 21 Oct 2008 16:45:00 +0000 (17:45 +0100)]
dm crypt: tidy sector
Prepare local sector variable (offset) for later patch.
Do not update io->sector for still-running I/O.
No functional change.
Signed-off-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Mikulas Patocka [Tue, 21 Oct 2008 16:44:59 +0000 (17:44 +0100)]
dm: remove dm header from targets
Change #include "dm.h" to #include <linux/device-mapper.h> in all targets.
Targets should not need direct access to internal DM structures.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Mikulas Patocka [Tue, 21 Oct 2008 16:44:57 +0000 (17:44 +0100)]
dm: publish array_too_big
Move array_too_big to include/linux/device-mapper.h because it is
used by targets.
Remove the test from dm-raid1 as the number of mirror legs is limited
such that it can never fail. (Even for stripes it seems rather
unlikely.)
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Mikulas Patocka [Tue, 21 Oct 2008 16:44:56 +0000 (17:44 +0100)]
dm exception store: fix misordered writes
We must zero the next chunk on disk *before* writing out the current chunk, not
after. Otherwise if the machine crashes at the wrong time, the "end of metadata"
marker may be missing.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Cc: stable@kernel.org
Alasdair G Kergon [Tue, 21 Oct 2008 16:44:55 +0000 (17:44 +0100)]
dm exception store: refactor zero_area
Use a separate buffer for writing zeroes to the on-disk snapshot
exception store, make the updating of ps->current_area explicit and
refactor the code in preparation for the fix in the next patch.
No functional change.
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@kernel.org
Mikulas Patocka [Tue, 21 Oct 2008 16:44:53 +0000 (17:44 +0100)]
dm snapshot: drop unused last_percent
The last_percent field is unused - remove it.
(It dates from when events were triggered as each X% filled up.)
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Mikulas Patocka [Tue, 21 Oct 2008 16:44:51 +0000 (17:44 +0100)]
dm snapshot: fix primary_pe race
Fix a race condition with primary_pe ref_count handling.
put_pending_exception runs under dm_snapshot->lock, it does atomic_dec_and_test
on primary_pe->ref_count, and later does atomic_read primary_pe->ref_count.
__origin_write does atomic_dec_and_test on primary_pe->ref_count without holding
dm_snapshot->lock.
This opens the following race condition:
Assume two CPUs, CPU1 is executing put_pending_exception (and holding
dm_snapshot->lock). CPU2 is executing __origin_write in parallel.
primary_pe->ref_count == 2.
CPU1:
if (primary_pe && atomic_dec_and_test(&primary_pe->ref_count))
origin_bios = bio_list_get(&primary_pe->origin_bios);
... decrements primary_pe->ref_count to 1. Doesn't load origin_bios
CPU2:
if (first && atomic_dec_and_test(&primary_pe->ref_count)) {
flush_bios(bio_list_get(&primary_pe->origin_bios));
free_pending_exception(primary_pe);
/* If we got here, pe_queue is necessarily empty. */
return r;
}
... decrements primary_pe->ref_count to 0, submits pending bios, frees
primary_pe.
CPU1:
if (!primary_pe || primary_pe != pe)
free_pending_exception(pe);
... this has no effect.
if (primary_pe && !atomic_read(&primary_pe->ref_count))
free_pending_exception(primary_pe);
... sees ref_count == 0 (written by CPU 2), does double free !!
This bug can happen only if someone is simultaneously writing to both the
origin and the snapshot.
If someone is writing only to the origin, __origin_write will submit kcopyd
request after it decrements primary_pe->ref_count (so it can't happen that the
finished copy races with primary_pe->ref_count decrementation).
If someone is writing only to the snapshot, __origin_write isn't invoked at all
and the race can't happen.
The race happens when someone writes to the snapshot --- this creates
pending_exception with primary_pe == NULL and starts copying. Then, someone
writes to the same chunk in the snapshot, and __origin_write races with
termination of already submitted request in pending_complete (that calls
put_pending_exception).
This race may be reason for bugs:
http://bugzilla.kernel.org/show_bug.cgi?id=11636
https://bugzilla.redhat.com/show_bug.cgi?id=465825
The patch fixes the code to make sure that:
1. If atomic_dec_and_test(&primary_pe->ref_count) returns false, the process
must no longer dereference primary_pe (because someone else may free it under
us).
2. If atomic_dec_and_test(&primary_pe->ref_count) returns true, the process
is responsible for freeing primary_pe.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Cc: stable@kernel.org
Kazuo Ito [Tue, 21 Oct 2008 16:44:50 +0000 (17:44 +0100)]
dm kcopyd: avoid queue shuffle
Write throughput to LVM snapshot origin volume is an order
of magnitude slower than those to LV without snapshots or
snapshot target volumes, especially in the case of sequential
writes with O_SYNC on.
The following patch originally written by Kevin Jamieson and
Jan Blunck and slightly modified for the current RCs by myself
tries to improve the performance by modifying the behaviour
of kcopyd, so that it pushes back an I/O job to the head of
the job queue instead of the tail as process_jobs() currently
does when it has to wait for free pages. This way, write
requests aren't shuffled to cause extra seeks.
I tested the patch against 2.6.27-rc5 and got the following results.
The test is a dd command writing to snapshot origin followed by fsync
to the file just created/updated. A couple of filesystem benchmarks
gave me similar results in case of sequential writes, while random
writes didn't suffer much.
dd if=/dev/zero of=<somewhere on snapshot origin> bs=4096 count=...
[conv=notrunc when updating]
1) linux 2.6.27-rc5 without the patch, write to snapshot origin,
average throughput (MB/s)
10M 100M 1000M
create,dd 511.46 610.72 11.81
create,dd+fsync 7.10 6.77 8.13
update,dd 431.63 917.41 12.75
update,dd+fsync 7.79 7.43 8.12
compared with write throughput to LV without any snapshots,
all dd+fsync and 1000 MiB writes perform very poorly.
10M 100M 1000M
create,dd 555.03 608.98 123.29
create,dd+fsync 114.27 72.78 76.65
update,dd 152.34 1267.27 124.04
update,dd+fsync 130.56 77.81 77.84
2) linux 2.6.27-rc5 with the patch, write to snapshot origin,
average throughput (MB/s)
10M 100M 1000M
create,dd 537.06 589.44 46.21
create,dd+fsync 31.63 29.19 29.23
update,dd 487.59 897.65 37.76
update,dd+fsync 34.12 30.07 26.85
Although still not on par with plain LV performance -
cannot be avoided because it's copy on write anyway -
this simple patch successfully improves throughtput
of dd+fsync while not affecting the rest.
Signed-off-by: Jan Blunck <jblunck@suse.de>
Signed-off-by: Kazuo Ito <ito.kazuo@oss.ntt.co.jp>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Cc: stable@kernel.org
Chris Zankel [Tue, 21 Oct 2008 16:11:43 +0000 (09:11 -0700)]
xtensa: Add config files for Diamond 232L - Rev B processor variant
The Diamond 232L processor is a pre-configured Xtensa processor tailored
for Linux application.
Signed-off-by: Chris Zankel <chris@zankel.net>
Chris Zankel [Tue, 21 Oct 2008 16:08:31 +0000 (09:08 -0700)]
xtensa: Fix io regions
The uncached area starts at e000.0000 and spans 1GB. Also add an IOADDR
macro to determine the bypass region to access the IO space.
Signed-off-by: Chris Zankel <chris@zankel.net>
Chris Zankel [Tue, 6 May 2008 06:36:35 +0000 (23:36 -0700)]
xtensa: Add support for the Sonic Ethernet device for the XT2000 board.
Add support for the on-board Sonic Ethernet device for the XT2000
evaluation board.
Signed-off-by: Chris Zankel <chris@zankel.net>
Lai Jiangshan [Fri, 17 Oct 2008 06:40:30 +0000 (14:40 +0800)]
rcupdate: fix bug of rcu_barrier*()
current rcu_barrier_bh() is like this:
void rcu_barrier_bh(void)
{
BUG_ON(in_interrupt());
/* Take cpucontrol mutex to protect against CPU hotplug */
mutex_lock(&rcu_barrier_mutex);
init_completion(&rcu_barrier_completion);
atomic_set(&rcu_barrier_cpu_count, 0);
/*
* The queueing of callbacks in all CPUs must be atomic with
* respect to RCU, otherwise one CPU may queue a callback,
* wait for a grace period, decrement barrier count and call
* complete(), while other CPUs have not yet queued anything.
* So, we need to make sure that grace periods cannot complete
* until all the callbacks are queued.
*/
rcu_read_lock();
on_each_cpu(rcu_barrier_func, (void *)RCU_BARRIER_BH, 1);
rcu_read_unlock();
wait_for_completion(&rcu_barrier_completion);
mutex_unlock(&rcu_barrier_mutex);
}
The inconsistency of the code and the comments show a bug here.
rcu_read_lock() cannot make sure that "grace periods for RCU_BH
cannot complete until all the callbacks are queued".
it only make sure that race periods for RCU cannot complete
until all the callbacks are queued.
so we must use rcu_read_lock_bh() for rcu_barrier_bh().
like this:
void rcu_barrier_bh(void)
{
......
rcu_read_lock_bh();
on_each_cpu(rcu_barrier_func, (void *)RCU_BARRIER_BH, 1);
rcu_read_unlock_bh();
......
}
and also rcu_barrier() rcu_barrier_sched() are implemented like this.
it will bring a lot of duplicate code. My patch uses another way to
fix this bug, please see the comment of my patch.
Thank Paul E. McKenney for he rewrote the comment.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Dean Nelson [Sat, 18 Oct 2008 23:06:56 +0000 (16:06 -0700)]
genirq: NULL struct irq_desc's member 'name' in dynamic_irq_cleanup()
If the member 'name' of the irq_desc structure happens to point to a
character string that is resident within a kernel module, problems ensue
if that module is rmmod'd (at which time dynamic_irq_cleanup() is called)
and then later show_interrupts() is called by someone.
It is also not a good thing if the character string resided in kmalloc'd
space that has been kfree'd (after having called dynamic_irq_cleanup()).
dynamic_irq_cleanup() fails to NULL the 'name' member and
show_interrupts() references it on a few architectures (like h8300, sh and
x86).
Signed-off-by: Dean Nelson <dcn@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Tue, 21 Oct 2008 13:49:59 +0000 (15:49 +0200)]
genirq: fix off by one and coding style
Fix off-by-one in for_each_irq_desc_reverse().
Impact is near zero in practice, because nothing substantial wants to
iterate down to IRQ#0 - but fix it nevertheless.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Chris Friesen [Mon, 20 Oct 2008 18:41:58 +0000 (12:41 -0600)]
genirq: fix set_irq_type() when recording trigger type
Impact: fix boot hang on a G5
In set_irq_type() we want to pass the type rather than the current
interrupt state.
Signed-off-by: Chris Friesen <cfriesen@nortel.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Benjamin Herrenschmidt [Tue, 21 Oct 2008 04:52:04 +0000 (15:52 +1100)]
Merge commit 'origin' into master
Manual merge of:
arch/powerpc/Kconfig
arch/powerpc/include/asm/page.h
Benjamin Herrenschmidt [Tue, 21 Oct 2008 04:49:55 +0000 (15:49 +1100)]
Merge commit 'kumar/kumar-for-2.6.28'
Anton Vorontsov [Fri, 17 Oct 2008 18:56:59 +0000 (22:56 +0400)]
powerpc/83xx: Add DS1339 RTC support for MPC8349E-mITX boards .dts
The RTC is sitting on the I2C2 bus at address 0x68. RTC interrupt signal
is connected to the IPIC's EXT2 interrupt line, the line is shared with
Vitesse 8201 Ethernet PHY.
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Anton Vorontsov [Fri, 17 Oct 2008 18:57:09 +0000 (22:57 +0400)]
powerpc/83xx: Add support for MCU microcontroller in .dts files
MCU is an external Freescale MC9S08QG8 microcontroller, mainly used to
provide soft power-off function, but also exports two GPIOs (wired to
the LEDs and also available from the external headers).
Added the MCU on mpc8349emitx, mpc837xrdb and mpc8315erdb boards.
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Kumar Gala [Tue, 21 Oct 2008 04:02:26 +0000 (23:02 -0500)]
powerpc/85xx: Move mpc8572ds.dts to address-cells/size-cells = <2>
Change the top-level #address-cells and #size-cells to <2> so the
mpc8572ds.dts is easier to deal with both a true 32-bit physical
or 36-bit physical address space.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Wolfgang Ocker [Wed, 15 Oct 2008 13:00:47 +0000 (15:00 +0200)]
of/spi: Support specifying chip select as active high via device tree
The patch allows to specify that an SPI device needs an active high chip
select.
Signed-off-by: Wolfgang Ocker <weo@reccoware.de>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Anton Vorontsov [Sat, 18 Oct 2008 00:23:52 +0000 (04:23 +0400)]
powerpc: Remove device_type = "board_control" properties in .dts files
We don't want to encourage the bogus device_type usage.
The device type isn't used in the code, so we can simply remove it from
the documentation and dts files.
Boards should specify proper compatible entries instead.
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Wolfram Sang [Thu, 16 Oct 2008 19:12:00 +0000 (21:12 +0200)]
i2c-cpm: Suppress autoprobing for devices
Similar to commit
618b26d52843c0f85b8eb143cf2695d7f6fd072d, also remove
automatic probing for this i2c controller. Might need updates to dts files
using it.
Signed-off-by: Wolfram Sang <w.sang@pengutronix.de>
Acked-by: Jochen Friedrich <jochen@scram.de>
Acked-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Ed Swarthout [Fri, 17 Oct 2008 05:41:32 +0000 (00:41 -0500)]
powerpc/85xx: Fix mpc8536ds dma interrupt numbers
Signed-off-by: Ed Swarthout <Ed.Swarthout@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Jason Jin [Thu, 16 Oct 2008 09:31:32 +0000 (17:31 +0800)]
powerpc/85xx: Enable enhanced functions for 8536 TSEC
Signed-off-by: Jason Jin <Jason.jin@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Milton Miller [Mon, 20 Oct 2008 18:03:18 +0000 (18:03 +0000)]
powerpc: Delete unused prom_strtoul and prom_memparse
These functions should have been static, and inspection shows they
are no longer used. (We used to parse mem= but we now defer that
to early_param).
Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Paul Mackerras [Mon, 20 Oct 2008 17:42:42 +0000 (17:42 +0000)]
powerpc: Revert CHRP boot wrapper to real-base = 12MB on 32-bit
Commit
9b09c6d909dfd8de96b99b9b9c808b94b0a71614 ("powerpc: Change the
default link address for pSeries zImage kernels") changed the
real-base value in the CHRP note added by addnote to the zImage from
12MB to 32MB. It turns out that this causes unnecessary extra reboots
on old 32-bit CHRP machines. This therefore adds a -r flag to addnote
to allow us to specify what real-base value it should put in the CHRP
note, and adjusts the wrapper script to pass -r c00000 to addnote when
making a zImage for a CHRP machine. Also, CHRP machines ignore the
RPA note, so we don't need to arrange for it to be the same as the
kernel's.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Milton Miller [Mon, 20 Oct 2008 15:37:04 +0000 (15:37 +0000)]
powerpc: Always trim numa memory to lmb_end_of_DRAM()
numa_enforce_memory_limit tried to be smart and only call lmb_end_of_DRAM
when a memory limit was set via mem= on the command line. However,
the early boot code will also limit memory added to the lmb system
when iommu=off is specified. When this happens, the page allocator
is given pages not in the linear mapping and this results in a fatal
data reference to the unmapped page.
Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Milton Miller [Mon, 20 Oct 2008 15:37:03 +0000 (15:37 +0000)]
powerpc: Use cpu_thread_in_core in smp_init for of_spin_map
We used to assume that even numbered threads were the primary
threads, ie those that would be listed and started as a cpu from
open firmware. Replace a left over is even (% 2) check with a check
for it being a primary thread and update the comments.
Tested with a debug print on pseries, identical code found for cell.
Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Milton Miller [Mon, 20 Oct 2008 15:37:03 +0000 (15:37 +0000)]
powerpc: Find and destroy possible stale kernel added properties
64 bit powerpc requires the kexec user space tools avoid overwriting
the static kernel image and translation hash table when choosing
where to put memory image data because it copies the data into place
using the kernels virtual memory system. Kexec userspace determines
these and other areas blocked by reading properties the kernel adds,
but does not filter these properties when creating the device tree
for the next kernel.
When the second kernel tries to add its values for these properties,
the export via /proc/device-tree is hidden by the pre-existing but
stale values from the flat tree. Kexec userspace reads the old
property, allocates the new kernel at the old kernel's end, and
gets rejected by the overlap check.
Search and remove these stale properties before adding the new values.
Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Kumar Gala [Mon, 20 Oct 2008 08:03:46 +0000 (08:03 +0000)]
powerpc: Remove Kconfig support for PPC_MERGE
There are no users of PPC_MERGE in tree so we can get rid of it.
It was a hold over from the arch/ppc days.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>