Nicolas Pitre [Tue, 24 Jul 2007 06:09:39 +0000 (02:09 -0400)]
sdio: add interface for host side SDIO interrupt reporting
Signed-off-by: Nicolas Pitre <npitre@mvista.com>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Pierre Ossman [Fri, 6 Jul 2007 11:35:01 +0000 (13:35 +0200)]
sdio: support IO_RW_EXTENDED
Support the multi-byte transfer operation, including handlers for
common operations like writel()/readl().
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Nicolas Pitre [Thu, 5 Jul 2007 03:40:34 +0000 (23:40 -0400)]
sdio: add /proc interface to sdio_uart driver
This mimics what the serial_core does. Useful for diagnostics.
Signed-off-by: Nicolas Pitre <npitre@mvista.com>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Nicolas Pitre [Sat, 30 Jun 2007 06:04:21 +0000 (02:04 -0400)]
sdio: UART/GPS driver
This currently only accepts the GPS class since that's all I have for
testing. Tested with a Matsushita GPS and gpsd version 2.34.
Signed-off-by: Nicolas Pitre <npitre@mvista.com>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Nicolas Pitre [Sat, 30 Jun 2007 14:29:41 +0000 (16:29 +0200)]
sdio: core support for SDIO function interrupt
Signed-off-by: Nicolas Pitre <npitre@mvista.com>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Nicolas Pitre [Sat, 30 Jun 2007 14:21:52 +0000 (16:21 +0200)]
sdio: allow for mmc_claim_host to be aborted
It is sometimes necessary to give up on trying to claim the host lock,
especially if that happens in a thread that has to be stopped.
While at it, fix the description for mmc_claim_host() which was wrong.
Signed-off-by: Nicolas Pitre <npitre@mvista.com>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Nicolas Pitre [Sun, 17 Jun 2007 01:40:07 +0000 (21:40 -0400)]
sdio: defines for some standard interface types
Signed-off-by: Nicolas Pitre <npitre@mvista.com>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Pierre Ossman [Sun, 17 Jun 2007 09:42:21 +0000 (11:42 +0200)]
sdio: add basic sysfs attributes
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Pierre Ossman [Sun, 17 Jun 2007 09:34:23 +0000 (11:34 +0200)]
sdio: add modalias support
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Pierre Ossman [Sun, 17 Jun 2007 09:18:46 +0000 (11:18 +0200)]
mmc: whip bus uevent handler into shape
Make the mmc bus uevent callback look like all other subsystems.
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Pierre Ossman [Sat, 16 Jun 2007 13:54:55 +0000 (15:54 +0200)]
sdio: add device id table and matching
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Nicolas Pitre [Sat, 16 Jun 2007 06:07:53 +0000 (02:07 -0400)]
mmc: initialize mmc subsystem with subsys_initcall()
The problem is that the sdio_bus must be registered before any SDIO
drivers are registered against it otherwise the kernel sulks. Because
the sdio_bus registration happens through module_init (equivalent to
device_initcall), then any SDIO
drivers linked before the SDIO core code in the kernel will be initialized
first.
Upcoming SDIO function drivers are likely to be located outside the
drivers/mmc directory as it is common practice to group drivers according
to their function rather than the bus they use. SDIO drivers are therefore
likely to appear at random location in the kernel link.
To make sure the sdio_bus is always initialized before any SDIO drivers,
let's move the MMC init to the subsys_initcall level.
Signed-off-by: Nicolas Pitre <npitre@mvista.com>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Pierre Ossman [Mon, 30 Jul 2007 13:15:30 +0000 (15:15 +0200)]
sdio: split up common and function CIS parsing
Add a more clean separation between global, common CIS information
and the function specific one as we need the common information in
places where no specific function is specified.
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Nicolas Pitre [Sat, 16 Jun 2007 06:06:47 +0000 (02:06 -0400)]
sdio: link unknown CIS tuples to the sdio_func structure
This way those tuples that the core cares about are consumed by the core
code, and tuples that only function drivers might make sense of are
available to drivers.
Signed-off-by: Nicolas Pitre <npitre@mvista.com>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Nicolas Pitre [Sat, 16 Jun 2007 06:04:16 +0000 (02:04 -0400)]
sdio: initial CIS parsing code
Signed-off-by: Nicolas Pitre <npitre@mvista.com>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Pierre Ossman [Mon, 11 Jun 2007 19:01:00 +0000 (21:01 +0200)]
sdio: basic parsing of FBR
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Pierre Ossman [Mon, 11 Jun 2007 18:25:43 +0000 (20:25 +0200)]
sdio: read and decode interesting parts of the CCCR
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Pierre Ossman [Sun, 27 May 2007 12:22:37 +0000 (14:22 +0200)]
mmc: enable/disable functions for SDIO
Like many other buses, the devices (functions) on the SDIO bus
must be enabled before they can be used. Add functions that allow
drivers to do so.
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Pierre Ossman [Sun, 27 May 2007 10:57:15 +0000 (12:57 +0200)]
mmc: add basic SDIO I/O operations
Add command wrappers that simplify register access from SDIO
function drivers.
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Pierre Ossman [Sun, 27 May 2007 10:00:02 +0000 (12:00 +0200)]
mmc: add SDIO driver handling
Add basic driver handling to the SDIO device model.
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Pierre Ossman [Sat, 26 May 2007 11:48:18 +0000 (13:48 +0200)]
mmc: basic SDIO device model
Add the sdio bus type and basic device handling.
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Pierre Ossman [Tue, 22 May 2007 18:25:21 +0000 (20:25 +0200)]
mmc: implement SDIO IO_RW_DIRECT operation
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Pierre Ossman [Mon, 21 May 2007 18:23:20 +0000 (20:23 +0200)]
mmc: detect SDIO cards
Really basic init sequence for SDIO cards.
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Marc Pignat [Thu, 9 Aug 2007 11:56:29 +0000 (13:56 +0200)]
mmc: at91_mci: disable handling of blocks with size not multiple of 4 bytes
This kind of transfer is not supported, so don't advertise it and make it
fail early.
Signed-off-by: Marc Pignat <marc.pignat@hevs.ch>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Pierre Ossman [Tue, 24 Jul 2007 19:53:43 +0000 (21:53 +0200)]
mmc: add missing printk levels
Some printk:s were missing an explicit level.
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Pierre Ossman [Tue, 24 Jul 2007 19:11:47 +0000 (21:11 +0200)]
mmc: remove confusing flag
The MMC_DATA_MULTI flag never had a proper definition of what it
means, so remove it and let the drivers check the block count in
the request.
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Pierre Ossman [Tue, 24 Jul 2007 18:38:53 +0000 (20:38 +0200)]
mmc: remove BYTEBLOCK capability
Remove the BYTEBLOCK capability and let the broken hosts fail the
requests with -EINVAL instead.
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Pierre Ossman [Tue, 24 Jul 2007 17:16:54 +0000 (19:16 +0200)]
mmc: mmc_set_data_timeout() parameter write is redundant
The write parameter in mmc_set_data_timeout() is redundant as the
data structure contains information about the direction of the
transfer.
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Pierre Ossman [Sun, 22 Jul 2007 22:34:07 +0000 (00:34 +0200)]
mmc: read ext_csd version number
Make sure we do not try to parse a structure we do not
understand.
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Pierre Ossman [Sun, 22 Jul 2007 21:08:30 +0000 (23:08 +0200)]
mmc: improve error code feedback
Now that we use "normal" error codes, improve the reporting and response
to error codes in the core.
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Pierre Ossman [Sun, 22 Jul 2007 20:18:46 +0000 (22:18 +0200)]
mmc: remove custom error codes
Convert the MMC layer to use standard error codes and not its own,
incompatible values.
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Thomas Gleixner [Sat, 22 Sep 2007 22:29:06 +0000 (22:29 +0000)]
clockevents: remove the suspend/resume workaround^Wthinko
In a desparate attempt to fix the suspend/resume problem on Andrews
VAIO I added a workaround which enforced the broadcast of the oneshot
timer on resume. This was actually resolving the problem on the VAIO
but was just a stupid workaround, which was not tackling the root
cause: the assignement of lower idle C-States in the ACPI processor_idle
code. The cpuidle patches, which utilize the dynamic tick feature and
go faster into deeper C-states exposed the problem again. The correct
solution is the previous patch, which prevents lower C-states across
the suspend/resume.
Remove the enforcement code, including the conditional broadcast timer
arming, which helped to pamper over the real problem for quite a time.
The oneshot broadcast flag for the cpu, which runs the resume code can
never be set at the time when this code is executed. It only gets set,
when the CPU is entering a lower idle C-State.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Thomas Gleixner [Sat, 22 Sep 2007 22:29:05 +0000 (22:29 +0000)]
ACPI: disable lower idle C-states across suspend/resume
device_suspend() calls ACPI suspend functions, which seems to have undesired
side effects on lower idle C-states. It took me some time to realize that
especially the VAIO BIOSes (both Andrews jinxed UP and my elfstruck SMP one)
show this effect. I'm quite sure that other bug reports against suspend/resume
about turning the system into a brick have the same root cause.
After fishing in the dark for quite some time, I realized that removing the ACPI
processor module before suspend (this removes the lower C-state functionality)
made the problem disappear. Interestingly enough the propability of having a
bricked box is influenced by various factors (interrupts, size of the ram image,
...). Even adding a bunch of printks in the wrong places made the problem go
away. The previous periodic tick implementation simply pampered over the
problem, which explains why the dyntick / clockevents changes made this more
prominent.
We avoid complex functionality during the boot process and we have to do the
same during suspend/resume. It is a similar scenario and equaly fragile.
Add suspend / resume functions to the ACPI processor code and disable the lower
idle C-states across suspend/resume. Fall back to the default idle
implementation (halt) instead.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sat, 22 Sep 2007 19:56:48 +0000 (12:56 -0700)]
Merge branch 'release' of git://git./linux/kernel/git/lenb/linux-acpi-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6:
ACPI: suspend: consolidate handling of Sx states addendum
ACPI: suspend: consolidate handling of Sx states.
ACPI: video: remove dmesg spam
ACPI: video: _DOS=0 by default to prevent hotkey hang
Linus Torvalds [Sat, 22 Sep 2007 19:56:13 +0000 (12:56 -0700)]
Merge branch 'for-linus' of git://oss.sgi.com:8090/xfs/xfs-2.6
* 'for-linus' of git://oss.sgi.com:8090/xfs/xfs-2.6:
[XFS] fix valid but harmless sparse warning
[XFS] fix filestreams on 32-bit boxes
Avi Kivity [Sat, 22 Sep 2007 12:43:45 +0000 (14:43 +0200)]
KVM: Fix virtualization menu help text
What guest drivers?
Cc: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Avi Kivity <avi@qumranet.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Len Brown [Sat, 22 Sep 2007 01:55:34 +0000 (21:55 -0400)]
Pull suspend.now into release branch
Len Brown [Sat, 22 Sep 2007 01:55:29 +0000 (21:55 -0400)]
Pull now into release branch
Frans Pop [Thu, 20 Sep 2007 20:27:44 +0000 (22:27 +0200)]
ACPI: suspend: consolidate handling of Sx states addendum
Make the S0 state be always reported as supported
Signed-off: Frans Pop <elendil@planet.nl>
Acked-by: Alexey Starikovskiy <astarikovskiy@suse.de>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Len Brown <len.brown@intel.com>
Linus Torvalds [Fri, 21 Sep 2007 21:05:45 +0000 (14:05 -0700)]
Merge master.kernel.org:/home/rmk/linux-2.6-arm
* master.kernel.org:/home/rmk/linux-2.6-arm:
[ARM] 4569/1: ep93xx_gpio_irq_type(): fix spurious enumeration offset for FGPIO handling
[ARM] 4568/1: fix l2x0 cache invalidate handling of unaligned addresses
Linus Torvalds [Fri, 21 Sep 2007 19:09:41 +0000 (12:09 -0700)]
Revert "x86_64: Quicklist support for x86_64"
This reverts commit
34feb2c83beb3bdf13535a36770f7e50b47ef299.
Suresh Siddha points out that this one breaks the fundamental
requirement that you cannot free page table pages before the TLB caches
are flushed. The quicklists do not give the same kinds of guarantees
that the mmu_gather structure does, at least not in NUMA configurations.
Requested-by: Suresh Siddha <suresh.b.siddha@intel.com>
Acked-by: Andi Kleen <ak@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: Asit Mallick <asit.k.mallick@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Fri, 21 Sep 2007 17:00:52 +0000 (10:00 -0700)]
Merge branch 'upstream' of git://ftp.linux-mips.org/upstream-linus
* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus:
[MIPS] BCM1480: include <linux/init.h>.
[MIPS] BCM1480: Export zbbus_mhz.
Ralf Baechle [Fri, 21 Sep 2007 08:01:15 +0000 (09:01 +0100)]
[MIPS] BCM1480: include <linux/init.h>.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Ralf Baechle [Fri, 21 Sep 2007 07:59:07 +0000 (08:59 +0100)]
[MIPS] BCM1480: Export zbbus_mhz.
Symbol is required by the ZBus profiler.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Linus Torvalds [Fri, 21 Sep 2007 16:52:20 +0000 (09:52 -0700)]
Merge branch 'upstream-linus' of git://git./linux/kernel/git/mfasheh/ocfs2
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2:
ocfs2: Pack vote message and response structures
ocfs2: Don't double set write parameters
ocfs2: Fix pos/len passed to ocfs2_write_cluster
ocfs2: Allow smaller allocations during large writes
Andi Kleen [Fri, 21 Sep 2007 14:16:18 +0000 (16:16 +0200)]
x86_64: Zero extend all registers after ptrace in 32bit entry path.
Strictly it's only needed for eax.
It actually does a little more than strictly needed -- the other registers
are already zero extended.
Also remove the now unnecessary and non functional compat task check
in ptrace.
This is CVE-2007-4573
Found by Wojciech Purczynski
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Alexey Starikovskiy [Thu, 20 Sep 2007 17:32:35 +0000 (21:32 +0400)]
ACPI: suspend: consolidate handling of Sx states.
Recent changes to sleep initialization in ACPI dropped reporting of supported Sx
states above S3. Fix that and also move S5 init into same file as other Sx.
The only functional change is adding printk() for S4 and S5 cases.
Signed-off-by: Alexey Starikovskiy <astarikovskiy@suse.de>
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Len Brown <len.brown@intel.com>
Sunil Mushran [Thu, 20 Sep 2007 17:59:48 +0000 (10:59 -0700)]
ocfs2: Pack vote message and response structures
The ocfs2_vote_msg and ocfs2_response_msg structs needed to be
packed to ensure similar sizeofs in 32-bit and 64-bit arches. Without this,
we had inadvertantly broken 32/64 bit cross mounts.
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Mark Fasheh [Wed, 19 Sep 2007 00:49:29 +0000 (17:49 -0700)]
ocfs2: Don't double set write parameters
The target page offsets were being incorrectly set a second time in
ocfs2_prepare_page_for_write(), which was causing problems on a 16k page
size kernel. Additionally, ocfs2_write_failure() was incorrectly using those
parameters instead of the parameters for the individual page being cleaned
up.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Mark Fasheh [Mon, 17 Sep 2007 16:06:29 +0000 (09:06 -0700)]
ocfs2: Fix pos/len passed to ocfs2_write_cluster
This was broken for file systems whose cluster size is greater than page
size. Pos needs to be incremented as we loop through the descriptors, and
len needs to be capped to the size of a single cluster.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Mark Fasheh [Mon, 17 Sep 2007 03:10:16 +0000 (20:10 -0700)]
ocfs2: Allow smaller allocations during large writes
The ocfs2 write code loops through a page much like the block code, except
that ocfs2 allocation units can be any size, including larger than page
size. Typically it's equal to or larger than page size - most kernels run 4k
pages, the minimum ocfs2 allocation (cluster) size.
Some changes introduced during 2.6.23 changed the way writes to pages are
handled, and inadvertantly broke support for > 4k page size. Instead of just
writing one cluster at a time, we now handle the whole page in one pass.
This means that multiple (small) seperate allocations might happen in the
same pass. The allocation code howver typically optimizes by getting the
maximum which was reserved. This triggered a BUG_ON in the extend code where
it'd ask for a single bit (for one part of a > 4k page) and get back more
than it asked for.
Fix this by providing a variant of the high level allocation function which
allows the caller to specify a maximum. The traditional function remains and
just calls the new one with a maximum determined from the initial
reservation.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Linus Torvalds [Thu, 20 Sep 2007 20:25:35 +0000 (13:25 -0700)]
Merge branch 'upstream-linus' of /linux/kernel/git/jgarzik/libata-dev
* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev:
[libata] ahci: add ATI SB800 PCI IDs
libata-sff: Fix documentation
libata: Update the blacklist with a few more devices
Davide Libenzi [Thu, 20 Sep 2007 19:40:16 +0000 (12:40 -0700)]
signalfd simplification
This simplifies signalfd code, by avoiding it to remain attached to the
sighand during its lifetime.
In this way, the signalfd remain attached to the sighand only during
poll(2) (and select and epoll) and read(2). This also allows to remove
all the custom "tsk == current" checks in kernel/signal.c, since
dequeue_signal() will only be called by "current".
I think this is also what Ben was suggesting time ago.
The external effect of this, is that a thread can extract only its own
private signals and the group ones. I think this is an acceptable
behaviour, in that those are the signals the thread would be able to
fetch w/out signalfd.
Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Wolfgang Walter [Thu, 20 Sep 2007 19:51:46 +0000 (15:51 -0400)]
rpc: fix garbage in printk in svc_tcp_accept()
we upgraded the kernel of a nfs-server from 2.6.17.11 to 2.6.22.6. Since
then we get the message
lockd: too many open TCP sockets, consider increasing the number of nfsd threads
lockd: last TCP connect from ^\\236^\É^D
These random characters in the second line are caused by a bug in
svc_tcp_accept.
(Note: there are two previous __svc_print_addr(sin, buf, sizeof(buf))
calls in this function, either of which would initialize buf correctly;
but both are inside "if"'s and are not necessarily executed. This is
less obvious in the second case, which is inside a dprintk(), which is a
macro which expands to an if statement.)
Signed-off-by: Wolfgang Walter <wolfgang.walter@studentenwerk.mhn.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
henry su [Thu, 20 Sep 2007 20:07:33 +0000 (16:07 -0400)]
[libata] ahci: add ATI SB800 PCI IDs
ATI/AMD SB800 shares some device IDs with SB700,
and SB800 adds two more device IDs:0x4394,0x4395.
Signed-off-by: henry su <henry.su.ati@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Alan Cox [Thu, 20 Sep 2007 14:03:07 +0000 (15:03 +0100)]
libata-sff: Fix documentation
Code moved to ioread/iowrite but the comment didn't
Also note a posting issue
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Alan Cox [Thu, 20 Sep 2007 14:22:47 +0000 (15:22 +0100)]
libata: Update the blacklist with a few more devices
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Linus Torvalds [Thu, 20 Sep 2007 19:42:47 +0000 (12:42 -0700)]
Merge branch 'master' of /linux/kernel/git/davem/net-2.6
* 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6:
[BNX2]: Add PHY workaround for 5709 A1.
[PPP] L2TP: Fix skb handling in pppol2tp_xmit
[PPP] L2TP: Fix skb handling in pppol2tp_recv_core
[PPP] L2TP: Disallow non-UDP datagram sockets
[PPP] pppoe: Fix double-free on skb after transmit failure
[PKT_SCHED]: Fix 'SFQ qdisc crashes with limit of 2 packets'
[NETFILTER]: MAINTAINERS update
[NETFILTER]: nfnetlink_log: fix sending of multipart messages
Linus Torvalds [Thu, 20 Sep 2007 19:42:23 +0000 (12:42 -0700)]
Merge branch 'upstream-linus' of /linux/kernel/git/jgarzik/netdev-2.6
* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6:
sky2: version 1.18
sky2: receive FIFO checking
sky2: fe+ chip support
sky2: reorganize chip revision features
sky2: ethtool speed report bug
sky2: fix VLAN receive processing (resend)
phy: export phy_mii_ioctl
myri10ge: Add support for PCI device id 9
Stephen Hemminger [Wed, 19 Sep 2007 22:36:47 +0000 (15:36 -0700)]
sky2: version 1.18
Update version number
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Stephen Hemminger [Wed, 19 Sep 2007 22:36:46 +0000 (15:36 -0700)]
sky2: receive FIFO checking
A driver writer from another operating system hinted that
the versions of Yukon 2 chip with rambuffer (EC and XL) have
a hardware bug that if the FIFO ever gets completely full it
will hang. Sounds like a classic ring full vs ring empty wrap around
bug.
As a workaround, use the existing watchdog timer to check for
ring full lockup.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Stephen Hemminger [Wed, 19 Sep 2007 22:36:45 +0000 (15:36 -0700)]
sky2: fe+ chip support
Add support for newest Marvell chips.
The Yukon FE plus chip is found in some not yet released laptops.
Tested on hardware evaluation boards.
This version of the patch is for 2.6.23. It supersedes
the two previous patches that are sitting in netdev-2.6 (upstream branch).
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Stephen Hemminger [Wed, 19 Sep 2007 22:36:44 +0000 (15:36 -0700)]
sky2: reorganize chip revision features
This patch should cause no functional changes in driver behaviour.
There are (too) many revisions of the Yukon 2 chip now. Instead of
adding more conditionals based on chip revision; rerganize into a
set of feature flags so adding new versions is less problematic.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Stephen Hemminger [Wed, 19 Sep 2007 22:36:43 +0000 (15:36 -0700)]
sky2: ethtool speed report bug
On 100mbit versions, the driver always reports gigabit speed
available. The correct modes are already computed, then overwritten.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Stephen Hemminger [Wed, 19 Sep 2007 22:36:42 +0000 (15:36 -0700)]
sky2: fix VLAN receive processing (resend)
The length check for truncated frames was not correctly handling
the case where VLAN acceleration had already read the tag.
Also, the Yukon EX has some features that use high bit of status
as security tag.
Signed-off-by: Pierre-Yves Ritschard <pyr@spootnik.org>
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Stefan Richter [Thu, 20 Sep 2007 19:17:33 +0000 (21:17 +0200)]
ieee1394: ohci1394: fix initialization if built non-modular
Initialization of ohci1394 was broken according to one reporter if the
driver was statically linked, i.e. not built as loadable module. Dmesg:
PCI: Device 0000:02:07.0 not available because of resource collisions
ohci1394: Failed to enable OHCI hardware.
This was reported for a Toshiba Satellite 5100-503. The cause is commit
8df4083c5291b3647e0381d3c69ab2196f5dd3b7 in Linux 2.6.19-rc1 which only
served purposes of early remote debugging via FireWire. This
functionality is better provided by the currently out-of-tree driver
ohci1394_earlyinit. Reversal of the commit was OK'd by Andi Kleen.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Michael Chan [Thu, 20 Sep 2007 18:04:58 +0000 (11:04 -0700)]
[BNX2]: Add PHY workaround for 5709 A1.
Add the DIS_EARLY_DAC PHY workaround for 5709 A1. Without it, link
sometimes does not come up.
Update version to 1.6.5.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Herbert Xu [Wed, 19 Sep 2007 17:46:28 +0000 (10:46 -0700)]
[PPP] L2TP: Fix skb handling in pppol2tp_xmit
This patch makes pppol2tp_xmit call skb_cow_head so that we don't modify
cloned skb data. It also gets rid of skb2 we only need to preserve the
original skb for congestion notification, which is only applicable for
ppp_async and ppp_sync.
The other semantic change made here is the removal of socket accounting
for data tranmitted out of pppol2tp_xmit. The original code leaked any
existing socket skb accounting. We could fix this by dropping the
original skb owner. However, this is undesirable as the packet has not
physically left the host yet.
In fact, all other tunnels in the kernel do not account skb's passing
through to their own socket. In partciular, ESP over UDP does not do
so and it is the closest tunnel type to PPPoL2TP. So this patch simply
removes the socket accounting in pppol2tp_xmit. The accounting still
applies to control packets of course.
I've also added a reminder that the outgoing checksum here doesn't work.
I suppose existing deployments don't actually enable checksums.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Herbert Xu [Tue, 18 Sep 2007 20:18:42 +0000 (13:18 -0700)]
[PPP] L2TP: Fix skb handling in pppol2tp_recv_core
The function pppol2tp_recv_core doesn't handle non-linear packets properly.
It also fails to check the remote offset field.
This patch fixes these problems. It also removes an unnecessary check on
the UDP header which has already been performed by the UDP layer.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Herbert Xu [Tue, 18 Sep 2007 20:18:17 +0000 (13:18 -0700)]
[PPP] L2TP: Disallow non-UDP datagram sockets
With the addition of UDP-Lite we need to refine the socket check so
that only genuine UDP sockets are allowed through.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Herbert Xu [Wed, 19 Sep 2007 17:45:02 +0000 (10:45 -0700)]
[PPP] pppoe: Fix double-free on skb after transmit failure
When I got rid of the second packet in __pppoe_xmit I created
a double-free on the skb because of the goto abort on failure.
This patch removes that.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexey Kuznetsov [Wed, 19 Sep 2007 17:42:03 +0000 (10:42 -0700)]
[PKT_SCHED]: Fix 'SFQ qdisc crashes with limit of 2 packets'
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Tue, 18 Sep 2007 20:19:26 +0000 (13:19 -0700)]
[NETFILTER]: MAINTAINERS update
Update netfilter list addresses and an old email address of myself.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Leblond [Tue, 18 Sep 2007 20:07:15 +0000 (13:07 -0700)]
[NETFILTER]: nfnetlink_log: fix sending of multipart messages
The following patch fixes the handling of netlink packets containing
multiple messages.
As exposed during netfilter workshop, nfnetlink_log was overwritten the
message type of the last message (setting it to MSG_DONE) in a multipart
packet. The consequence was libnfnetlink to ignore the last message in the
packet.
The following patch adds a supplementary message (with type MSG_DONE) af
the end of the netlink skb.
Signed-off-by: Eric Leblond <eric@inl.fr>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Thu, 20 Sep 2007 18:33:45 +0000 (11:33 -0700)]
Fix CRLF line endings in Documentation/input/iforce-protocol.txt
Emil Medve points out that this documentation file uses CRLF line
endings, which means that if you use
[core]
autocrlf=input
(which makes sense if you ever develop under Windows, for example, or if
you use other broken tools) in your git config, git will always complain
about the file being dirty.
This removes the bogus DOS line endings, and removes whitespace at the
end of line.
Cc: Emil Medve <Emilian.Medve@Freescale.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Paul Bolle [Mon, 10 Sep 2007 21:39:02 +0000 (23:39 +0200)]
[x86 setup] Fix typo in arch/i386/boot/header.S
There's an obvious typo in arch/i386/boot/header.S (in your
linux-2.6-x86setup.git) that I noticed by just studying the code.
Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
H. Peter Anvin [Thu, 13 Sep 2007 21:16:37 +0000 (14:16 -0700)]
[acpi] Correct the decoding of video mode numbers in wakeup.S
wakeup.S looks at the video mode number from the setup header and
looks to see if it is a VESA mode. Unfortunately, the decoding is
done incorrectly and it will attempt to frob the VESA BIOS for any
mode number 0x0200 or larger. Correct this, and remove a bunch of #if
0'd code.
Massive thanks to Jeff Chua for reporting the bug, and suffering
though a large number of experiments in order to track this problem
down.
Cc: Pavel Machek <pavel@ucw.cz>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
H. Peter Anvin [Thu, 13 Sep 2007 21:14:29 +0000 (14:14 -0700)]
[x86 setup] Present the canonical video mode number to the kernel
Canonicalize the video mode number as presented to the kernel. The
video mode number may be user-entered (e.g. ASK_VGA), an alias
(e.g. NORMAL_VGA), or a size specification, and that confuses the
suspend wakeup code.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Christoph Hellwig [Wed, 19 Sep 2007 05:27:30 +0000 (15:27 +1000)]
[XFS] fix valid but harmless sparse warning
The new xlog_recover_do_reg_buffer checks call be16_to_cpu on di_gen which
is a 32bit value so sparse rightly complains. Fortunately the warning is
harmless because we don't care for the value, but only whether it's
non-NULL. Due to that fact we can simply kill the endian swaps on this and
the previous di_mode check entirely.
SGI-PV: 969656
SGI-Modid: xfs-linux-melb:xfs-kern:29709a
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
Eric Sandeen [Thu, 30 Aug 2007 07:21:38 +0000 (17:21 +1000)]
[XFS] fix filestreams on 32-bit boxes
xfs_filestream_mount() sets up an mru cache with:
err = xfs_mru_cache_create(&mp->m_filestream, lifetime, grp_count,
(xfs_mru_cache_free_func_t)xfs_fstrm_free_func);
but that cast is causing problems...
typedef void (*xfs_mru_cache_free_func_t)(unsigned long, void*);
but:
void xfs_fstrm_free_func( xfs_ino_t ino, fstrm_item_t *item)
so on a 32-bit box, it's casting (32, 32) args into (64, 32) and I assume
it's getting garbage for *item, which subsequently causes an explosion.
With this change the filestreams xfsqa tests don't oops on my 32-bit box.
SGI-PV: 967795
SGI-Modid: xfs-linux-melb:xfs-kern:29510a
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
Herbert Valerio Riedel [Sun, 16 Sep 2007 15:44:39 +0000 (16:44 +0100)]
[ARM] 4569/1: ep93xx_gpio_irq_type(): fix spurious enumeration offset for FGPIO handling
The EP93XX_GPIO_LINE_F() macro is supposed to be called with a line
number between 0 and 7, but the current code causes it to get called
with an spuriously offset number range {16..23}.
Signed-off-by: Herbert Valerio Riedel <hvr@gnu.org>
Signed-off-by: Lennert Buytenhek <kernel@wantstofly.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Domen Puncer [Mon, 17 Sep 2007 20:21:40 +0000 (22:21 +0200)]
phy: export phy_mii_ioctl
Export phy_mii_ioctl, so network drivers can use it when built
as modules too.
Signed-off-by: Domen Puncer <domen@coderock.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Linus Torvalds [Wed, 19 Sep 2007 23:01:13 +0000 (16:01 -0700)]
Linux 2.6.23-rc7
Linus Torvalds [Wed, 19 Sep 2007 22:47:59 +0000 (15:47 -0700)]
Merge git://git./linux/kernel/git/mingo/linux-2.6-sched
* git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched:
sched: fix invalid sched_class use
sched: add /proc/sys/kernel/sched_compat_yield
Eric Paris [Wed, 19 Sep 2007 21:19:12 +0000 (17:19 -0400)]
SELinux: fix array out of bounds when mounting with selinux options
Given an illegal selinux option it was possible for match_token to work in
random memory at the end of the match_table_t array.
Note that privilege is required to perform a context mount, so this issue is
effectively limited to root only.
Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
Hiroshi Shimamoto [Wed, 19 Sep 2007 21:34:46 +0000 (23:34 +0200)]
sched: fix invalid sched_class use
When using rt_mutex, a NULL pointer dereference is occurred at
enqueue_task_rt. Here is a scenario;
1) there are two threads, the thread A is fair_sched_class and
thread B is rt_sched_class.
2) Thread A is boosted up to rt_sched_class, because the thread A
has a rt_mutex lock and the thread B is waiting the lock.
3) At this time, when thread A create a new thread C, the thread
C has a rt_sched_class.
4) When doing wake_up_new_task() for the thread C, the priority
of the thread C is out of the RT priority range, because the
normal priority of thread A is not the RT priority. It makes
data corruption by overflowing the rt_prio_array.
The new thread C should be fair_sched_class.
The new thread should be valid scheduler class before queuing.
This patch fixes to set the suitable scheduler class.
Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Ingo Molnar [Wed, 19 Sep 2007 21:34:46 +0000 (23:34 +0200)]
sched: add /proc/sys/kernel/sched_compat_yield
add /proc/sys/kernel/sched_compat_yield to make sys_sched_yield()
more agressive, by moving the yielding task to the last position
in the rbtree.
with sched_compat_yield=0:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
2539 mingo 20 0 1576 252 204 R 50 0.0 0:02.03 loop_yield
2541 mingo 20 0 1576 244 196 R 50 0.0 0:02.05 loop
with sched_compat_yield=1:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
2584 mingo 20 0 1576 248 196 R 99 0.0 0:52.45 loop
2582 mingo 20 0 1576 256 204 R 0 0.0 0:00.00 loop_yield
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Brice Goglin [Thu, 13 Sep 2007 22:40:14 +0000 (00:40 +0200)]
myri10ge: Add support for PCI device id 9
Add support for new Myri-10G boards with PCI device id 9.
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Linus Torvalds [Wed, 19 Sep 2007 18:45:32 +0000 (11:45 -0700)]
Merge branch 'upstream' of git://ftp.linux-mips.org/upstream-linus
* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus:
[MIPS] cpu-bugs64.c: GCC 3.3 constraint workaround
[MIPS] DEC: Initialise ioasic_ssr_lock
Linus Torvalds [Wed, 19 Sep 2007 18:41:15 +0000 (11:41 -0700)]
Merge /pub/scm/linux/kernel/git/mchehab/v4l-dvb
* master.kernel.org:/pub/scm/linux/kernel/git/mchehab/v4l-dvb:
V4L/DVB (6173a): Documentation: Remove reference to dead "cpia_pp=" boot-time option
Revert "V4L/DVB (6173a): Documentation: Remove reference to dead "cpia_pp=" boot-time option"
Linus Torvalds [Wed, 19 Sep 2007 18:40:13 +0000 (11:40 -0700)]
Merge branch 'for-linus' of git://oss.sgi.com:8090/xfs/xfs-2.6
* 'for-linus' of git://oss.sgi.com:8090/xfs/xfs-2.6:
[XFS] Avoid replaying inode buffer initialisation log items if on-disk version is newer.
[XFS] Ensure file size updates have been completed before writing inode to disk.
[XFS] On-demand reaping of the MRU cache
Linus Torvalds [Wed, 19 Sep 2007 18:39:39 +0000 (11:39 -0700)]
Merge /pub/scm/linux/kernel/git/davem/sparc-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6:
[SUNSAB]: Fix several bugs.
Linus Torvalds [Wed, 19 Sep 2007 18:39:10 +0000 (11:39 -0700)]
Merge /pub/scm/linux/kernel/git/bart/ide-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/bart/ide-2.6:
ide: remove unused variables from drivers/ide/ppc/pmac.c
ide: ST320413A has the same problem as ST340823A
Linus Torvalds [Wed, 19 Sep 2007 18:38:25 +0000 (11:38 -0700)]
Merge branch 'merge' of git://git./linux/kernel/git/paulus/powerpc
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc:
[POWERPC] Fix timekeeping on PowerPC 601
[POWERPC] Don't expose clock vDSO functions when CPU has no timebase
[POWERPC] spusched: Fix null pointer dereference in find_victim
Linus Torvalds [Wed, 19 Sep 2007 18:37:14 +0000 (11:37 -0700)]
x86-64: page faults from user mode are always user faults
Randy Dunlap noticed an interesting "crashme" behaviour on his dual
Prescott Xeon setup, where he gets page faults with the error code
having a zero "user" bit, but the register state points back to user
mode.
This may be a CPU microcode buglet triggered by some strange instruction
pattern that crashme generates, and loading a microcode update seems to
possibly have fixed it.
Regardless, we really should trust the register state more than the
error code, since it's really the register state that determines whether
we can actually send a signal, or whether we're in kernel mode and need
to oops/kill the process in the case of a page fault.
Cc: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Maciej W. Rozycki [Mon, 17 Sep 2007 16:11:07 +0000 (17:11 +0100)]
[MIPS] cpu-bugs64.c: GCC 3.3 constraint workaround
Add a workaround to address warnings generated on the "n" constraint by
GCC 3.3 and below.
Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Maciej W. Rozycki [Mon, 17 Sep 2007 15:58:18 +0000 (16:58 +0100)]
[MIPS] DEC: Initialise ioasic_ssr_lock
Fix the definition of the ioasic_ssr_lock spinlock to include a proper
initialisation.
Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Dmitry Torokhov [Wed, 19 Sep 2007 05:46:50 +0000 (22:46 -0700)]
Driver core: fix deprectated sysfs structure for nested class devices
Nested class devices used to have 'device' symlink point to a real
(physical) device instead of a parent class device. When converting
subsystems to struct device we need to keep doing what class devices did if
CONFIG_SYSFS_DEPRECATED is Y, otherwise parts of udev break.
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Acked-by: Greg KH <greg@kroah.com>
Tested-by: Anssi Hannula <anssi.hannula@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jeff Dike [Wed, 19 Sep 2007 05:46:49 +0000 (22:46 -0700)]
uml: fix irqstack crash
This patch fixes a crash caused by an interrupt coming in when an IRQ stack
is being torn down. When this happens, handle_signal will loop, setting up
the IRQ stack again because the tearing down had finished, and handling
whatever signals had come in.
However, to_irq_stack returns a mask of pending signals to be handled, plus
bit zero is set if the IRQ stack was already active, and thus shouldn't be
torn down. This causes a problem because when handle_signal goes around
the loop, sig will be zero, and to_irq_stack will duly set bit zero in the
returned mask, faking handle_signal into believing that it shouldn't tear
down the IRQ stack and return thread_info pointers back to their original
values.
This will eventually cause a crash, as the IRQ stack thread_info will
continue pointing to the original task_struct and an interrupt will look
into it after it has been freed.
The fix is to stop passing a signal number into to_irq_stack. Rather, the
pending signals mask is initialized beforehand with the bit for sig already
set. References to sig in to_irq_stack can be replaced with references to
the mask.
[akpm@linux-foundation.org: use UL]
Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Lee Schermerhorn [Wed, 19 Sep 2007 05:46:47 +0000 (22:46 -0700)]
Fix NUMA Memory Policy Reference Counting
This patch proposes fixes to the reference counting of memory policy in the
page allocation paths and in show_numa_map(). Extracted from my "Memory
Policy Cleanups and Enhancements" series as stand-alone.
Shared policy lookup [shmem] has always added a reference to the policy,
but this was never unrefed after page allocation or after formatting the
numa map data.
Default system policy should not require additional ref counting, nor
should the current task's task policy. However, show_numa_map() calls
get_vma_policy() to examine what may be [likely is] another task's policy.
The latter case needs protection against freeing of the policy.
This patch adds a reference count to a mempolicy returned by
get_vma_policy() when the policy is a vma policy or another task's
mempolicy. Again, shared policy is already reference counted on lookup. A
matching "unref" [__mpol_free()] is performed in alloc_page_vma() for
shared and vma policies, and in show_numa_map() for shared and another
task's mempolicy. We can call __mpol_free() directly, saving an admittedly
inexpensive inline NULL test, because we know we have a non-NULL policy.
Handling policy ref counts for hugepages is a bit trickier.
huge_zonelist() returns a zone list that might come from a shared or vma
'BIND policy. In this case, we should hold the reference until after the
huge page allocation in dequeue_hugepage(). The patch modifies
huge_zonelist() to return a pointer to the mempolicy if it needs to be
unref'd after allocation.
Kernel Build [16cpu, 32GB, ia64] - average of 10 runs:
w/o patch w/ refcount patch
Avg Std Devn Avg Std Devn
Real: 100.59 0.38 100.63 0.43
User: 1209.60 0.37 1209.91 0.31
System: 81.52 0.42 81.64 0.34
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Acked-by: Andi Kleen <ak@suse.de>
Cc: Christoph Lameter <clameter@sgi.com>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>