Linus Torvalds [Mon, 21 Apr 2008 22:41:27 +0000 (15:41 -0700)]
Merge branch 'semaphore' of git://git./linux/kernel/git/willy/misc
* 'semaphore' of git://git.kernel.org/pub/scm/linux/kernel/git/willy/misc:
Deprecate the asm/semaphore.h files in feature-removal-schedule.
Convert asm/semaphore.h users to linux/semaphore.h
security: Remove unnecessary inclusions of asm/semaphore.h
lib: Remove unnecessary inclusions of asm/semaphore.h
kernel: Remove unnecessary inclusions of asm/semaphore.h
include: Remove unnecessary inclusions of asm/semaphore.h
fs: Remove unnecessary inclusions of asm/semaphore.h
drivers: Remove unnecessary inclusions of asm/semaphore.h
net: Remove unnecessary inclusions of asm/semaphore.h
arch: Remove unnecessary inclusions of asm/semaphore.h
Linus Torvalds [Mon, 21 Apr 2008 22:40:55 +0000 (15:40 -0700)]
Merge branch 'for-linus' of /home/rmk/linux-2.6-arm
* 'for-linus' of master.kernel.org:/home/rmk/linux-2.6-arm: (212 commits)
[ARM] pxa: Phycore pcm-990-specific code for the PXA270 Quick Capture driver
[ARM] pxa: V4L2 soc_camera driver for PXA270
[ARM] pxa: restrict availability of pxa2xx PCMCIA drivers
[ARM] 5005/1: BAST: Fix kset_name initialiser
[ARM] 4967/1: Adds functions to set clkout rate for Samsung S3C2410
[ARM] 4988/1: Add GPIO lib support to the EP93xx
[ARM] Add initial sparsemem support
[ARM] pxa: initialise PXA devices before platform init code
[ARM] 5002/1: tosa: add two more leds
[ARM] 5004/1: Tosa: make several unreferenced structures static.
[ARM] 5003/1: Shut up sparse warnings
[ARM] 4977/2: soc - pxa2xx-ac97 - Add missing clk_enable()
[ARM] 4976/1: zylonite: Configure GPIO for WM9713 IRQ line
[ARM] 4974/1: Drop unused leds-tosa.
[ARM] 4973/1: Tosa: use leds-gpio driver.
[ARM] 4972/1: Tosa: convert scoop GPIOs usage to generic gpio code
[ARM] 4971/1: pxaficp_ir: provide startup and shutdown hooks
[ARM] pxa: lubbock: move mis-placed SPI info
[ARM] 4970/1: tosa: correct gpio used for wake up.
[ARM] 4966/1: magician: add MFP pin configuration
...
Linus Torvalds [Mon, 21 Apr 2008 22:40:24 +0000 (15:40 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/mingo/linux-2.6-sched-devel
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched-devel: (62 commits)
sched: build fix
sched: better rt-group documentation
sched: features fix
sched: /debug/sched_features
sched: add SCHED_FEAT_DEADLINE
sched: debug: show a weight tree
sched: fair: weight calculations
sched: fair-group: de-couple load-balancing from the rb-trees
sched: fair-group scheduling vs latency
sched: rt-group: optimize dequeue_rt_stack
sched: debug: add some debug code to handle the full hierarchy
sched: fair-group: SMP-nice for group scheduling
sched, cpuset: customize sched domains, core
sched, cpuset: customize sched domains, docs
sched: prepatory code movement
sched: rt: multi level group constraints
sched: task_group hierarchy
sched: fix the task_group hierarchy for UID grouping
sched: allow the group scheduler to have multiple levels
sched: mix tasks and groups
...
Linus Torvalds [Mon, 21 Apr 2008 22:38:43 +0000 (15:38 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/x86/linux-2.6-x86
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86: (77 commits)
x86: UV startup of slave cpus
x86: integrate pci-dma.c
x86: don't do dma if mask is NULL.
x86: return conditional to mmu
x86: remove kludge from x86_64
x86: unify gfp masks
x86: retry allocation if failed
x86: don't try to allocate from DMA zone at first
x86: use a fallback dev for i386
x86: use numa allocation function in i386
x86: remove virt_to_bus in pci-dma_64.c
x86: adjust dma_free_coherent for i386
x86: move bad_dma_address
x86: isolate coherent mapping functions
x86: move dma_coherent functions to pci-dma.c
x86: merge iommu initialization parameters
x86: merge dma_supported
x86: move pci fixup to pci-dma.c
x86: move x86_64-specific to common code.
x86: move initialization functions to pci-dma.c
...
Linus Torvalds [Mon, 21 Apr 2008 22:38:14 +0000 (15:38 -0700)]
Merge branch 'ro-bind.b6' of git://git./linux/kernel/git/viro/vfs-2.6
* 'ro-bind.b6' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (24 commits)
[PATCH] r/o bind mounts: debugging for missed calls
[PATCH] r/o bind mounts: honor mount writer counts at remount
[PATCH] r/o bind mounts: track numbers of writers to mounts
[PATCH] r/o bind mounts: check mnt instead of superblock directly
[PATCH] r/o bind mounts: elevate count for xfs timestamp updates
[PATCH] r/o bind mounts: make access() use new r/o helper
[PATCH] r/o bind mounts: write counts for truncate()
[PATCH] r/o bind mounts: elevate write count for chmod/chown callers
[PATCH] r/o bind mounts: elevate write count for open()s
[PATCH] r/o bind mounts: elevate write count for ioctls()
[PATCH] r/o bind mounts: write count for file_update_time()
[PATCH] r/o bind mounts: elevate write count for do_utimes()
[PATCH] r/o bind mounts: write counts for touch_atime()
[PATCH] r/o bind mounts: elevate write count for ncp_ioctl()
[PATCH] r/o bind mounts: elevate write count for xattr_permission() callers
[PATCH] r/o bind mounts: get write access for vfs_rename() callers
[PATCH] r/o bind mounts: write counts for link/symlink
[PATCH] r/o bind mounts: get callers of vfs_mknod/create/mkdir()
[PATCH] r/o bind mounts: elevate write count for rmdir and unlink.
[PATCH] r/o bind mounts: drop write during emergency remount
...
Linus Torvalds [Mon, 21 Apr 2008 22:37:47 +0000 (15:37 -0700)]
Merge git://git./linux/kernel/git/lethal/sh-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6: (27 commits)
sh: Fix up L2 cache probe.
sh: Fix up SH-4A part probe.
sh: Add support for SH7723 CPU subtype.
sh: Fix up SH7763 build.
sh: Add migor_ts support to MigoR
sh: Add rs5c732b RTC support to MigoR
sh: Add I2C support to MigoR
sh: Add I2C platform data to sh7722
sh: MigoR NAND flash support using gen_flash
sh: MigoR NOR flash support using physmap-flash
sh: Fix up mach-types formatting from merge damage.
sh: r7780rp: Hook up the I2C and SMBus platform devices.
sh: Use phyical addresses for MigoR smc91x resources
sh: Use physical addresses for sh7722 USBF resources
sh: Add MigoR header file
Fix sh_keysc double free
sh: Fix up __access_ok() check for nommu.
sh: Allow optimized clear/copy page routines to be used on SH-2.
sh: Hook up the rest of the SH7770 serial ports.
sh: Add support for Solution Engine SH7721 board
...
Linus Torvalds [Mon, 21 Apr 2008 04:59:13 +0000 (21:59 -0700)]
Fix RCU list iterator use of 'rcu_dereference()'
The RCU iterators used 'rcu_dereference()' on an already-fetched RCU
pointer value, which defeats the whole point of the exercise.
When we dereference a pointer protected by RCU, we need to make sure
that we only fetch the value _once_, because if the compiler ends up
re-loading it due to register pressure, the newly reloaded value could
be different from the previously fetched one, and you get inconsistent
results.
Cleaned-up, fixed, and the pointless list_for_each_safe_rcu #define
deleted by Paul Kenney.
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Matthew Wilcox [Sat, 19 Apr 2008 17:49:34 +0000 (13:49 -0400)]
Deprecate the asm/semaphore.h files in feature-removal-schedule.
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Ingo Molnar [Sat, 19 Apr 2008 10:11:10 +0000 (12:11 +0200)]
sched: build fix
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Viktor Radnai [Sat, 19 Apr 2008 17:45:01 +0000 (19:45 +0200)]
sched: better rt-group documentation
Viktor was nice enough to enhance the document based on my replies to
his questions on the subject.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Fri, 18 Apr 2008 08:55:34 +0000 (10:55 +0200)]
sched: features fix
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Peter Zijlstra [Sat, 19 Apr 2008 17:45:00 +0000 (19:45 +0200)]
sched: /debug/sched_features
provide a text based interface to the scheduler features; this saves the
'user' from setting bits using decimal arithmetic.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Sat, 19 Apr 2008 07:25:58 +0000 (09:25 +0200)]
sched: add SCHED_FEAT_DEADLINE
unused at the moment.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Peter Zijlstra [Sat, 19 Apr 2008 17:45:00 +0000 (19:45 +0200)]
sched: debug: show a weight tree
Print a tree of weights.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Peter Zijlstra [Sat, 19 Apr 2008 17:45:00 +0000 (19:45 +0200)]
sched: fair: weight calculations
In order to level the hierarchy, we need to calculate load based on the
root view. That is, each task's load is in the same unit.
A
/ \
B 1
/ \
2 3
To compute 1's load we do:
weight(1)
--------------
rq_weight(A)
To compute 2's load we do:
weight(2) weight(B)
------------ * -----------
rq_weight(B) rw_weight(A)
This yields load fractions in comparable units.
The consequence is that it changes virtual time. We used to have:
time_{i}
vtime_{i} = ------------
weight_{i}
vtime = \Sum vtime_{i} = time / rq_weight.
But with the new way of load calculation we get that vtime equals time.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Peter Zijlstra [Sat, 19 Apr 2008 17:45:00 +0000 (19:45 +0200)]
sched: fair-group: de-couple load-balancing from the rb-trees
De-couple load-balancing from the rb-trees, so that I can change their
organization.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Peter Zijlstra [Sat, 19 Apr 2008 17:45:00 +0000 (19:45 +0200)]
sched: fair-group scheduling vs latency
Currently FAIR_GROUP sched grows the scheduler latency outside of
sysctl_sched_latency, invert this so it stays within.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Peter Zijlstra [Sat, 19 Apr 2008 17:45:00 +0000 (19:45 +0200)]
sched: rt-group: optimize dequeue_rt_stack
Now that the group hierarchy can have an arbitrary depth the O(n^2) nature
of RT task dequeues will really hurt. Optimize this by providing space to
store the tree path, so we can walk it the other way.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Peter Zijlstra [Sat, 19 Apr 2008 17:45:00 +0000 (19:45 +0200)]
sched: debug: add some debug code to handle the full hierarchy
Add some extra debug output so we can get a better overview of the
full hierarchy.
We print the cgroup path after each cfs_rq, so we can see what group
we're looking at.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Peter Zijlstra [Sat, 19 Apr 2008 17:45:00 +0000 (19:45 +0200)]
sched: fair-group: SMP-nice for group scheduling
Implement SMP nice support for the full group hierarchy.
On each load-balance action, compile a sched_domain wide view of the full
task_group tree. We compute the domain wide view when walking down the
hierarchy, and readjust the weights when walking back up.
After collecting and readjusting the domain wide view, we try to balance the
tasks within the task_groups. The current approach is a naively balance each
task group until we've moved the targeted amount of load.
Inspired by Srivatsa Vaddsgiri's previous code and Abhishek Chandra's H-SMP
paper.
XXX: there will be some numerical issues due to the limited nature of
SCHED_LOAD_SCALE wrt to representing a task_groups influence on the
total weight. When the tree is deep enough, or the task weight small
enough, we'll run out of bits.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
CC: Abhishek Chandra <chandra@cs.umn.edu>
CC: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Hidetoshi Seto [Tue, 15 Apr 2008 05:04:23 +0000 (14:04 +0900)]
sched, cpuset: customize sched domains, core
[rebased for sched-devel/latest]
- Add a new cpuset file, having levels:
sched_relax_domain_level
- Modify partition_sched_domains() and build_sched_domains()
to take attributes parameter passed from cpuset.
- Fill newidle_idx for node domains which currently unused but
might be required if sched_relax_domain_level become higher.
- We can change the default level by boot option 'relax_domain_level='.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Hidetoshi Seto [Tue, 15 Apr 2008 05:03:17 +0000 (14:03 +0900)]
sched, cpuset: customize sched domains, docs
This patch introduces new feature of cpuset - sched domain customization.
This version provides a per-cpuset file 'sched_relax_domain_level' that
enable us to change the searching range of scheduler, which used to limit
how many cpus the scheduler searches at some schedule events, such as
wakening task and running out of runqueue.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Peter Zijlstra [Sat, 19 Apr 2008 17:45:00 +0000 (19:45 +0200)]
sched: prepatory code movement
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Peter Zijlstra [Sat, 19 Apr 2008 17:45:00 +0000 (19:45 +0200)]
sched: rt: multi level group constraints
multi level rt constraints
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Peter Zijlstra [Sat, 19 Apr 2008 17:45:00 +0000 (19:45 +0200)]
sched: task_group hierarchy
Add the full parent<->child relation thing into task_groups as well.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Peter Zijlstra [Sat, 19 Apr 2008 17:45:00 +0000 (19:45 +0200)]
sched: fix the task_group hierarchy for UID grouping
UID grouping doesn't actually have a task_group representing the root of
the task_group tree. Add one.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Dhaval Giani [Sat, 19 Apr 2008 17:44:59 +0000 (19:44 +0200)]
sched: allow the group scheduler to have multiple levels
This patch makes the group scheduler multi hierarchy aware.
[a.p.zijlstra@chello.nl: rt-parts and assorted fixes]
Signed-off-by: Dhaval Giani <dhaval@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Dhaval Giani [Sat, 19 Apr 2008 17:44:59 +0000 (19:44 +0200)]
sched: mix tasks and groups
This patch allows tasks and groups to exist in the same cfs_rq. With this
change the CFS group scheduling follows a 1/(M+N) model from a 1/(1+N)
fairness model where M tasks and N groups exist at the cfs_rq level.
[a.p.zijlstra@chello.nl: rt bits and assorted fixes]
Signed-off-by: Dhaval Giani <dhaval@linux.vnet.ibm.com>
Signed-off-by: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Tue, 25 Mar 2008 12:51:45 +0000 (13:51 +0100)]
sched: fix checks
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Peter Zijlstra [Wed, 19 Mar 2008 10:43:36 +0000 (11:43 +0100)]
sched: old sleeper bonus
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Mike Travis [Wed, 26 Mar 2008 21:23:49 +0000 (14:23 -0700)]
sched: add new set_cpus_allowed_ptr function
Add a new function that accepts a pointer to the "newly allowed cpus"
cpumask argument.
int set_cpus_allowed_ptr(struct task_struct *p, const cpumask_t *new_mask)
The current set_cpus_allowed() function is modified to use the above
but this does not result in an ABI change. And with some compiler
optimization help, it may not introduce any additional overhead.
Additionally, to enforce the read only nature of the new_mask arg, the
"const" property is migrated to sub-functions called by set_cpus_allowed.
This silences compiler warnings.
Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Mike Travis [Wed, 26 Mar 2008 21:23:48 +0000 (14:23 -0700)]
init: move setup of nr_cpu_ids to as early as possible
Move the setting of nr_cpu_ids from sched_init() to start_kernel()
so that it's available as early as possible.
Note that an arch has the option of setting it even earlier if need be,
but it should not result in a different value than the setup_nr_cpu_ids()
function.
Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Mike Travis [Tue, 15 Apr 2008 23:35:52 +0000 (16:35 -0700)]
sched: remove another cpumask_t variable from stack
* Remove another cpumask_t variable from stack that was missed in the
last kernel_sched_c updates.
Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Mike Travis [Tue, 8 Apr 2008 18:43:04 +0000 (11:43 -0700)]
cpumask: add show cpu map functions
* Add cpu_sysdev_class functions to display the following maps
with cpulist_scnprintf().
cpu_online_map
cpu_present_map
cpu_possible_map
* Small change to include/linux/sysdev.h to allow the attribute
name and label to be different (to avoid collision with the
"attr_online" entry for bringing cpus on- and off-line.)
Cc: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Mike Travis [Tue, 8 Apr 2008 18:43:03 +0000 (11:43 -0700)]
cpumask: use new cpus_scnprintf function
* Cleaned up references to cpumask_scnprintf() and added new
cpulist_scnprintf() interfaces where appropriate.
* Fix some small bugs (or code efficiency improvments) for various uses
of cpumask_scnprintf.
* Clean up some checkpatch errors.
Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Mike Travis [Tue, 8 Apr 2008 18:43:02 +0000 (11:43 -0700)]
x86: modify show_shared_cpu_map in intel_cacheinfo
* Removed kmalloc (or local array) in show_shared_cpu_map().
* Added show_shared_cpu_list() function.
Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Mike Travis [Sat, 5 Apr 2008 01:11:01 +0000 (18:11 -0700)]
x86: convert cpumask_of_cpu macro to allocated array
* Here is a simple patch to use an allocated array of cpumasks to
represent cpumask_of_cpu() instead of constructing one on the stack.
It's based on the Kconfig option "HAVE_CPUMASK_OF_CPU_MAP" which is
currently only set for x86_64 SMP. Otherwise the the existing
cpumask_of_cpu() is used but has been changed to produce an lvalue
so a pointer to it can be used.
Cc: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Mike Travis [Sat, 5 Apr 2008 01:11:02 +0000 (18:11 -0700)]
cpumask: add CPU_MASK_ALL_PTR macro
* Add a static cpumask_t variable "CPU_MASK_ALL_PTR" to use as
a pointer reference to CPU_MASK_ALL. This reduces where possible
the instances where CPU_MASK_ALL allocates and fills a large
array on the stack. Used only if NR_CPUS > BITS_PER_LONG.
* Change init/main.c to use new set_cpus_allowed_ptr().
Depends on:
[sched-devel]: sched: add new set_cpus_allowed_ptr function
Cc: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Mike Travis [Sat, 5 Apr 2008 01:11:11 +0000 (18:11 -0700)]
cpumask: reduce stack usage in SD_x_INIT initializers
* Remove empty cpumask_t (and all non-zero/non-null) variables
in SD_*_INIT macros. Use memset(0) to clear. Also, don't
inline the initializer functions to save on stack space in
build_sched_domains().
* Merge change to include/linux/topology.h that uses the new
node_to_cpumask_ptr function in the nr_cpus_node macro into
this patch.
Depends on:
[mm-patch]: asm-generic-add-node_to_cpumask_ptr-macro.patch
[sched-devel]: sched: add new set_cpus_allowed_ptr function
Cc: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Mike Travis [Sat, 5 Apr 2008 01:11:10 +0000 (18:11 -0700)]
nodemask: use new node_to_cpumask_ptr function
* Use new node_to_cpumask_ptr. This creates a pointer to the
cpumask for a given node. This definition is in mm patch:
asm-generic-add-node_to_cpumask_ptr-macro.patch
* Use new set_cpus_allowed_ptr function.
Depends on:
[mm-patch]: asm-generic-add-node_to_cpumask_ptr-macro.patch
[sched-devel]: sched: add new set_cpus_allowed_ptr function
[x86/latest]: x86: add cpus_scnprintf function
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Cc: Greg Banks <gnb@melbourne.sgi.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Mike Travis [Sat, 5 Apr 2008 01:11:08 +0000 (18:11 -0700)]
generic: reduce stack pressure in sched_affinity
* Modify sched_affinity functions to pass cpumask_t variables by reference
instead of by value.
* Use new set_cpus_allowed_ptr function.
Depends on:
[sched-devel]: sched: add new set_cpus_allowed_ptr function
Cc: Paul Jackson <pj@sgi.com>
Cc: Cliff Wickman <cpw@sgi.com>
Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Mike Travis [Sat, 5 Apr 2008 01:11:07 +0000 (18:11 -0700)]
cpuset: modify cpuset_set_cpus_allowed to use cpumask pointer
* Modify cpuset_cpus_allowed to return the currently allowed cpuset
via a pointer argument instead of as the function return value.
* Use new set_cpus_allowed_ptr function.
* Cleanup CPU_MASK_ALL and NODE_MASK_ALL uses.
Depends on:
[sched-devel]: sched: add new set_cpus_allowed_ptr function
Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Mike Travis [Sat, 5 Apr 2008 01:11:06 +0000 (18:11 -0700)]
generic: use new set_cpus_allowed_ptr function
* Use new set_cpus_allowed_ptr() function added by previous patch,
which instead of passing the "newly allowed cpus" cpumask_t arg
by value, pass it by pointer:
-int set_cpus_allowed(struct task_struct *p, cpumask_t new_mask)
+int set_cpus_allowed_ptr(struct task_struct *p, const cpumask_t *new_mask)
* Modify CPU_MASK_ALL
Depends on:
[sched-devel]: sched: add new set_cpus_allowed_ptr function
Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Mike Travis [Sat, 5 Apr 2008 01:11:05 +0000 (18:11 -0700)]
x86: use new set_cpus_allowed_ptr function
* Use new set_cpus_allowed_ptr() function added by previous patch,
which instead of passing the "newly allowed cpus" cpumask_t arg
by value, pass it by pointer:
-int set_cpus_allowed(struct task_struct *p, cpumask_t new_mask)
+int set_cpus_allowed_ptr(struct task_struct *p, const cpumask_t *new_mask)
* Cleanup uses of CPU_MASK_ALL.
* Collapse other NR_CPUS changes to arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c
Use pointers to cpumask_t arguments whenever possible.
Depends on:
[sched-devel]: sched: add new set_cpus_allowed_ptr function
Cc: Len Brown <len.brown@intel.com>
Cc: Dave Jones <davej@codemonkey.org.uk>
Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Mike Travis [Sat, 5 Apr 2008 01:11:04 +0000 (18:11 -0700)]
sched: remove fixed NR_CPUS sized arrays in kernel_sched_c
* Change fixed size arrays to per_cpu variables or dynamically allocated
arrays in sched_init() and sched_init_smp().
(1) static struct sched_entity *init_sched_entity_p[NR_CPUS];
(1) static struct cfs_rq *init_cfs_rq_p[NR_CPUS];
(1) static struct sched_rt_entity *init_sched_rt_entity_p[NR_CPUS];
(1) static struct rt_rq *init_rt_rq_p[NR_CPUS];
static struct sched_group **sched_group_nodes_bycpu[NR_CPUS];
(1) - these arrays are allocated via alloc_bootmem_low()
* Change sched_domain_debug_one() to use cpulist_scnprintf instead of
cpumask_scnprintf. This reduces the output buffer required and improves
readability when large NR_CPU count machines arrive.
* In sched_create_group() we allocate new arrays based on nr_cpu_ids.
Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Mike Travis [Sat, 5 Apr 2008 01:11:12 +0000 (18:11 -0700)]
cpumask: Cleanup more uses of CPU_MASK and NODE_MASK
* Replace usages of CPU_MASK_NONE, CPU_MASK_ALL, NODE_MASK_NONE,
NODE_MASK_ALL to reduce stack requirements for large NR_CPUS
and MAXNODES counts.
* In some cases, the cpumask variable was initialized but then overwritten
with another value. This is the case for changes like this:
- cpumask_t oldmask = CPU_MASK_ALL;
+ cpumask_t oldmask;
Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Mike Travis [Sat, 5 Apr 2008 01:11:09 +0000 (18:11 -0700)]
numa: move large array from stack to _initdata section
* Move large array "struct bootnode nodes" from stack to _initdata
section to reduce amount of stack space required.
Cc: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Mike Travis [Mon, 31 Mar 2008 15:41:55 +0000 (08:41 -0700)]
asm-generic: add node_to_cpumask_ptr macro
Create a simple macro to always return a pointer to the node_to_cpumask(node)
value. This relies on compiler optimization to remove the extra indirection:
#define node_to_cpumask_ptr(v, node) \
cpumask_t _##v = node_to_cpumask(node), *v = &_##v
For those systems with a large cpumask size, then a true pointer
to the array element can be used:
#define node_to_cpumask_ptr(v, node) \
cpumask_t *v = &(node_to_cpumask_map[node])
A node_to_cpumask_ptr_next() macro is provided to access another
node_to_cpumask value.
The other change is to always include asm-generic/topology.h moving the
ifdef CONFIG_NUMA to this same file.
Note: there are no references to either of these new macros in this patch,
only the definition.
Based on 2.6.25-rc5-mm1
# alpha
Cc: Richard Henderson <rth@twiddle.net>
# fujitsu
Cc: David Howells <dhowells@redhat.com>
# ia64
Cc: Tony Luck <tony.luck@intel.com>
# powerpc
Cc: Paul Mackerras <paulus@samba.org>
Cc: Anton Blanchard <anton@samba.org>
# sparc
Cc: David S. Miller <davem@davemloft.net>
Cc: William L. Irwin <wli@holomorphy.com>
# x86
Cc: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Mike Travis [Tue, 25 Mar 2008 22:06:59 +0000 (15:06 -0700)]
x86: oprofile: remove NR_CPUS arrays in arch/x86/oprofile/nmi_int.c
Change the following arrays sized by NR_CPUS to be PERCPU variables:
static struct op_msrs cpu_msrs[NR_CPUS];
static unsigned long saved_lvtpc[NR_CPUS];
Also some minor complaints from checkpatch.pl fixed.
Based on:
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86.git
All changes were transparent except for:
static void nmi_shutdown(void)
{
+ struct op_msrs *msrs = &__get_cpu_var(cpu_msrs);
nmi_enabled = 0;
on_each_cpu(nmi_cpu_shutdown, NULL, 0, 1);
unregister_die_notifier(&profile_exceptions_nb);
- model->shutdown(cpu_msrs);
+ model->shutdown(msrs);
free_msrs();
}
The existing code passed a reference to cpu 0's instance of struct op_msrs
to model->shutdown, whilst the other functions are passed a reference to
<this cpu's> instance of a struct op_msrs. This seemed to be a bug to me
even though as long as cpu 0 and <this cpu> are of the same type it would
have the same effect...?
Cc: Philippe Elie <phil.el@wanadoo.fr>
Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Mike Travis [Tue, 25 Mar 2008 22:06:56 +0000 (15:06 -0700)]
x86: reduce memory and stack usage in intel_cacheinfo
* Change the following static arrays sized by NR_CPUS to
per_cpu data variables:
_cpuid4_info *cpuid4_info[NR_CPUS];
_index_kobject *index_kobject[NR_CPUS];
kobject * cache_kobject[NR_CPUS];
* Remove the local NR_CPUS array with a kmalloc'd region in
show_shared_cpu_map().
Also some minor complaints from checkpatch.pl fixed.
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Mike Travis [Tue, 25 Mar 2008 22:06:55 +0000 (15:06 -0700)]
cpumask: add cpumask_scnprintf_len function
Add a new function cpumask_scnprintf_len() to return the number of
characters needed to display "len" cpumask bits. The current method
of allocating NR_CPUS bytes is incorrect as what's really needed is
9 characters per 32-bit word of cpumask bits (8 hex digits plus the
seperator [','] or the terminating NULL.) This function provides the
caller the means to allocate the correct string length.
Cc: Paul Jackson <pj@sgi.com>
Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Gregory Haskins [Tue, 12 Feb 2008 18:30:05 +0000 (13:30 -0500)]
sched: fix cpus_allowed settings
Signed-off-by: Gregory Haskins <ghaskins@novell.com>
Acked-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Dhaval Giani [Fri, 29 Feb 2008 04:32:44 +0000 (10:02 +0530)]
sched: allow cpuacct stats to be reset
Currently the schedstats implementation does not allow the statistics
to be reset. This patch aims to allow that.
echo 0 > cpuacct.usage
resets the usage. Any other value is not allowed and returns -EINVAL.
Signed-off-by: Dhaval Giani <dhaval@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Dhaval Giani [Fri, 29 Feb 2008 04:32:43 +0000 (10:02 +0530)]
sched: cleanup cpuacct variable names
Change the variable names to the common convention for the cpuacct
subsystem.
Signed-off-by: Dhaval Giani <dhaval@linux.vnet.ibm.com>
Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Olof Johansson [Tue, 4 Mar 2008 23:23:25 +0000 (15:23 -0800)]
tasklets: execute tasklets in the same order they were queued
I noticed this when looking at an openswan issue. Openswan (ab?)uses the
tasklet API to defer processing of packets in some situations, with one
packet per tasklet_action(). I started noticing sequences of
backwards-ordered sequence numbers coming over the wire, since new tasklets
are always queued at the head of the list but processed sequentially.
Convert it to instead append new entries to the tail of the list. As an
extra bonus, the splicing code in takeover_tasklets() no longer has to
iterate over the list.
Signed-off-by: Olof Johansson <olof@lixom.net>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Peter Zijlstra [Sat, 19 Apr 2008 17:44:58 +0000 (19:44 +0200)]
sched: rt-group: smp balancing
Currently the rt group scheduling does a per cpu runtime limit, however
the rt load balancer makes no guarantees about an equal spread of real-
time tasks, just that at any one time, the highest priority tasks run.
Solve this by making the runtime limit a global property by borrowing
excessive runtime from the other cpus once the local limit runs out.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Peter Zijlstra [Sat, 19 Apr 2008 17:44:57 +0000 (19:44 +0200)]
sched: rt-group: synchonised bandwidth period
Various SMP balancing algorithms require that the bandwidth period
run in sync.
Possible improvements are moving the rt_bandwidth thing into root_domain
and keeping a span per rt_bandwidth which marks throttled cpus.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Wed, 27 Feb 2008 13:05:10 +0000 (14:05 +0100)]
time: add ns_to_ktime()
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Peter Zijlstra [Mon, 18 Feb 2008 12:39:37 +0000 (13:39 +0100)]
sched: fix regression with sched yield
Balbir Singh reported:
> 1:mon> t
> [
c0000000e7677da0]
c000000000067de0 .sys_sched_yield+0x6c/0xbc
> [
c0000000e7677e30]
c000000000008748 syscall_exit+0x0/0x40
> --- Exception: c01 (System Call) at
00000400001d09e4
> SP (
4000664cb10) is in userspace
> 1:mon> r
> cpu 0x1: Vector: 300 (Data Access) at [
c0000000e7677aa0]
> pc:
c000000000068e50: .yield_task_fair+0x94/0xc4
> lr:
c000000000067de0: .sys_sched_yield+0x6c/0xbc
the check that should have avoided that is:
/*
* Are we the only task in the tree?
*/
if (unlikely(rq->load.weight == curr->se.load.weight))
return;
But I guess that overlooks rt tasks, they also increase the load.
So I guess something like this ought to fix it..
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Dmitry Adamushko [Sun, 17 Feb 2008 21:34:07 +0000 (22:34 +0100)]
latencytop: optimize LT_BACKTRACEDEPTH loops a bit
There is no need to loop any longer when 'same == 0'.
Signed-off-by: Dmitry Adamushko <dmitry.adamushko@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Fri, 14 Mar 2008 15:09:59 +0000 (16:09 +0100)]
sched: remove sysctl_sched_batch_wakeup_granularity
it's unused.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Wed, 19 Mar 2008 00:37:10 +0000 (01:37 +0100)]
sched: reenable sync wakeups
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Mon, 17 Mar 2008 08:36:53 +0000 (09:36 +0100)]
sched: cache hot buddy
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Wed, 19 Mar 2008 00:39:19 +0000 (01:39 +0100)]
sched: feat affine wakeups
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Sun, 16 Mar 2008 19:03:22 +0000 (20:03 +0100)]
sched: introduce SCHED_FEAT_SYNC_WAKEUPS, turn it off
turn off sync wakeups by default. They are not needed anymore - the
buddy logic should be smart enough to keep the system from
overscheduling.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Peter Zijlstra [Sat, 19 Apr 2008 17:44:57 +0000 (19:44 +0200)]
sched: fix wakeup granularity for buddies
The wakeup buddy logic didn't use the same wakeup granularity logic as the
wakeup preemption did, this might cause the ->next buddy to be selected past
the point where we would have preempted had the task been a single running
instance.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Guillaume Chazarain [Sat, 19 Apr 2008 17:44:57 +0000 (19:44 +0200)]
sched: fix rq->clock overflows detection with CONFIG_NO_HZ
When using CONFIG_NO_HZ, rq->tick_timestamp is not updated every TICK_NSEC.
We check that the number of skipped ticks matches the clock jump seen in
__update_rq_clock().
Signed-off-by: Guillaume Chazarain <guichaz@yahoo.fr>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Reynes Philippe [Mon, 17 Mar 2008 23:19:05 +0000 (16:19 -0700)]
sched: sched.c needs tick.h
kernel/sched.c:506: erreur: implicit declaration of function tick_get_tick_sched
kernel/sched.c:506: erreur: invalid type argument of ->
kernel/sched.c:506: erreur: NOHZ_MODE_INACTIVE undeclared (first use in this function)
kernel/sched.c:506: erreur: (Each undeclared identifier is reported only once
kernel/sched.c:506: erreur: for each function it appears in.)
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Thu, 28 Feb 2008 20:00:21 +0000 (21:00 +0100)]
sched: make cpu_clock() globally synchronous
Alexey Zaytsev reported (and bisected) that the introduction of
cpu_clock() in printk made the timestamps jump back and forth.
Make cpu_clock() more reliable while still keeping it fast when it's
called frequently.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Mon, 14 Apr 2008 06:53:32 +0000 (08:53 +0200)]
sched: re-do "sched: fix fair sleepers"
re-apply:
| commit
e22ecef1d2658ba54ed7d3fdb5d60829fb434c23
| Author: Ingo Molnar <mingo@elte.hu>
| Date: Fri Mar 14 22:16:08 2008 +0100
|
| sched: fix fair sleepers
|
| Fair sleepers need to scale their latency target down by runqueue
| weight. Otherwise busy systems will gain ever larger sleep bonus.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Jack Steiner [Wed, 16 Apr 2008 16:45:15 +0000 (11:45 -0500)]
x86: UV startup of slave cpus
This patch changes smpboot.c so that it can start slave cpus running
in UV non-unique apicid mode. The SIPI must be sent using a UV-specific
mechanism.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Glauber Costa [Wed, 9 Apr 2008 16:18:10 +0000 (13:18 -0300)]
x86: integrate pci-dma.c
The code in pci-dma_{32,64}.c are now sufficiently
close to each other. We merge them in pci-dma.c.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Glauber Costa [Wed, 9 Apr 2008 16:18:09 +0000 (13:18 -0300)]
x86: don't do dma if mask is NULL.
if the device hasn't provided a mask, abort allocation.
Note that we're using a fallback device now, so it does not cover
the case of a NULL device: just drivers passing NULL masks around.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Glauber Costa [Wed, 9 Apr 2008 16:18:08 +0000 (13:18 -0300)]
x86: return conditional to mmu
Just return our allocation if we don't have an mmu. For i386, where this patch
is being applied, we never have. So our goal is just to have the code to look like
x86_64's.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Glauber Costa [Wed, 9 Apr 2008 16:18:07 +0000 (13:18 -0300)]
x86: remove kludge from x86_64
The claim is that i386 does it. Just it does not.
So remove it.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Glauber Costa [Wed, 9 Apr 2008 16:18:06 +0000 (13:18 -0300)]
x86: unify gfp masks
Use the same gfp masks for x86_64 and i386.
It involves using HIGHMEM or DMA32 where necessary, for the sake
of code compatibility, (no real effect), and using the NORETRY
mask for i386.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Glauber Costa [Wed, 9 Apr 2008 16:18:05 +0000 (13:18 -0300)]
x86: retry allocation if failed
This patch puts in the code to retry allocation in case it fails. By its
own, it does not make much sense but making the code look like x86_64.
But later patches in this series will make we try to allocate from
zones other than DMA first, which will possibly fail.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Glauber Costa [Tue, 8 Apr 2008 16:21:05 +0000 (13:21 -0300)]
x86: don't try to allocate from DMA zone at first
If we fail, we'll loop into the allocation again,
and then allocate in the DMA zone.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Glauber Costa [Tue, 8 Apr 2008 16:21:04 +0000 (13:21 -0300)]
x86: use a fallback dev for i386
We can use a fallback dev for cases of a NULL device being passed (mostly ISA)
This comes from x86_64 implementation.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Glauber Costa [Tue, 8 Apr 2008 16:21:02 +0000 (13:21 -0300)]
x86: use numa allocation function in i386
We can do it here to, in the same way x86_64 does.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Glauber Costa [Tue, 8 Apr 2008 16:21:01 +0000 (13:21 -0300)]
x86: remove virt_to_bus in pci-dma_64.c
virt_to_bus() is deprecated according to the docs, and moreover,
won't return the right thing in i386 if we're dealing with high memory mappings.
So we make our allocation function return a page, and then use page_address() (for
virtual addr) and page_to_phys() (for physical addr) instead.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Glauber Costa [Tue, 8 Apr 2008 16:20:59 +0000 (13:20 -0300)]
x86: adjust dma_free_coherent for i386
We call unmap_single, if available.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Glauber Costa [Tue, 8 Apr 2008 16:21:00 +0000 (13:21 -0300)]
x86: move bad_dma_address
It goes to pci-dma.c, and is removed from the arch-specific files.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Glauber Costa [Tue, 8 Apr 2008 16:20:58 +0000 (13:20 -0300)]
x86: isolate coherent mapping functions
i386 implements the declare coherent memory API, and x86_64 does not
it is reflected in pieces of dma_alloc_coherent and dma_free_coherent.
Those pieces are isolated in separate functions, that are declared
as empty macros in x86_64. This way we can make the code the same.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Glauber Costa [Tue, 8 Apr 2008 16:20:57 +0000 (13:20 -0300)]
x86: move dma_coherent functions to pci-dma.c
They are placed in an ifdef, since they are i386 specific
the structure definition goes to dma-mapping.h.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Glauber Costa [Tue, 8 Apr 2008 16:20:56 +0000 (13:20 -0300)]
x86: merge iommu initialization parameters
we merge the iommu initialization parameters in pci-dma.c
Nice thing, that both architectures at least recognize the same
parameters.
usedac i386 parameter is marked for deprecation
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Glauber Costa [Tue, 8 Apr 2008 16:20:55 +0000 (13:20 -0300)]
x86: merge dma_supported
The code for both arches are very similar, so this patch merge them.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Glauber Costa [Tue, 8 Apr 2008 16:20:53 +0000 (13:20 -0300)]
x86: move pci fixup to pci-dma.c
via_no_dac provides a fixup that is the same for both
architectures. Move it to pci-dma.c.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Glauber Costa [Tue, 8 Apr 2008 16:20:54 +0000 (13:20 -0300)]
x86: move x86_64-specific to common code.
This patch moves the bootmem functions, that are largely
x86_64-specific into pci-dma.c. The code goes inside an ifdef.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Glauber Costa [Tue, 8 Apr 2008 16:20:51 +0000 (13:20 -0300)]
x86: move initialization functions to pci-dma.c
initcalls that triggers the various possibiities for
dma subsys are moved to pci-dma.c.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Glauber Costa [Tue, 8 Apr 2008 16:20:52 +0000 (13:20 -0300)]
x86: unify pci-nommu
merge pci-base_32.c and pci-nommu_64.c into pci-nommu.c
Their code were made the same, so now they can be merged.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Glauber Costa [Tue, 8 Apr 2008 16:20:50 +0000 (13:20 -0300)]
x86: move definition to pci-dma.c
Move dma_ops structure definition to pci-dma.c, where it
belongs.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Glauber Costa [Tue, 8 Apr 2008 16:20:48 +0000 (13:20 -0300)]
x86: use dma_length in i386
This is done to get the code closer to x86_64.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Glauber Costa [Tue, 8 Apr 2008 16:20:49 +0000 (13:20 -0300)]
x86: use WARN_ON in mapping functions
In the very same way i386 do, we use WARN_ON functions
in map_simple and map_sg.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Glauber Costa [Tue, 8 Apr 2008 16:20:47 +0000 (13:20 -0300)]
x86: use sg_phys in x86_64
To make the code usable in i386, where we have high memory mappings,
we drop te virt_to_bus(sg_virt()) construction in favour of sg_phys.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Glauber Costa [Tue, 8 Apr 2008 16:20:46 +0000 (13:20 -0300)]
x86: Add flush_write_buffers in nommu functions
This patch adds flush_write_buffers() in some functions of pci-nommu_64.c
They are added anywhere i386 would also have it. This is not a problem
for x86_64, since flush_rite_buffers() an nop for it.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Glauber Costa [Tue, 8 Apr 2008 16:20:45 +0000 (13:20 -0300)]
x86: implement mapping_error in pci-nommu_64.c
This patch implements mapping_error for pci-nommu_64.c.
It takes care to keep the same compatible behaviour it already
had. Although this file is not (yet) used for i386, we introduce
the i386 version here. Again, care is taken, even at the expense of
an ifdef, to keep the same behaviour inconditionally.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Glauber Costa [Tue, 8 Apr 2008 16:20:44 +0000 (13:20 -0300)]
x86: delete empty functions from pci-nommu_64.c
This functions are now called conditionally on their
existence in the struct. So just delete them, instead
of keeping an empty implementation.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Glauber Costa [Tue, 8 Apr 2008 16:20:43 +0000 (13:20 -0300)]
x86: introduce pci-dma.c
This patch introduces pci-dma.c, a common file for pci dma
between i386 and x86_64. As a start, dma_set_mask() is the same
between architectures, and is placed there.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Mark McLoughlin [Thu, 27 Mar 2008 11:03:15 +0000 (11:03 +0000)]
x86: move dma_supported and dma_set_mask to pci-dma_32.c, fix
ERROR: "dma_supported" [drivers/ssb/ssb.ko] undefined!
ERROR: "dma_set_mask" [drivers/scsi/qla2xxx/qla2xxx.ko] undefined!
ERROR: "dma_set_mask" [drivers/scsi/aic7xxx/aic7xxx.ko] undefined!
ERROR: "dma_set_mask" [drivers/scsi/aic7xxx/aic79xx.ko] undefined!
ERROR: "dma_supported" [drivers/net/pcnet32.ko] undefined!
ERROR: "dma_supported" [drivers/media/video/saa7134/saa7134.ko] undefined!
ERROR: "dma_set_mask" [drivers/media/video/meye.ko] undefined!
ERROR: "dma_supported" [drivers/media/video/cx88/cx8802.ko] undefined!
ERROR: "dma_supported" [drivers/media/video/cx88/cx8800.ko] undefined!
ERROR: "dma_supported" [drivers/media/video/cx88/cx88-alsa.ko] undefined!
ERROR: "dma_supported" [drivers/media/video/cx23885/cx23885.ko] undefined!
They just need to be exported like on x86_64.
dma_supported() and dma_set_mask() were previously inlined,
but are now moved to pci-dma_32.c.
Since they're used by various drivers, they need to be
exported.
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>