David Howells [Wed, 23 Aug 2006 00:06:13 +0000 (20:06 -0400)]
NFS: Add server and volume lists to /proc
Make two new proc files available:
/proc/fs/nfsfs/servers
/proc/fs/nfsfs/volumes
The first lists the servers with which we are currently dealing (struct
nfs_client), and the second lists the volumes we have on those servers (struct
nfs_server).
Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
David Howells [Wed, 23 Aug 2006 00:06:13 +0000 (20:06 -0400)]
NFS: Share NFS superblocks per-protocol per-server per-FSID
The attached patch makes NFS share superblocks between mounts from the same
server and FSID over the same protocol.
It does this by creating each superblock with a false root and returning the
real root dentry in the vfsmount presented by get_sb(). The root dentry set
starts off as an anonymous dentry if we don't already have a dentry for its
inode; otherwise the dentry we already have is simply returned.
We may thus end up with several trees of dentries in the superblock, and if at
some later point one of the anonymous tree roots is discovered by normal filesystem
activity to be located in another tree within the superblock, the anonymous
root is named and materialises attached to the second tree at the appropriate
point.
Why do it this way? Why not pass an extra argument to the mount() syscall to
indicate the subpath and then pathwalk from the server root to the desired
directory? You can't guarantee this will work for two reasons:
(1) The root and intervening nodes may not be accessible to the client.
With NFS2 and NFS3, for instance, mountd is called on the server to get
the filehandle for the tip of a path. mountd won't give us handles for
anything we don't have permission to access, and so we can't set up NFS
inodes for such nodes, and so can't easily set up dentries (we'd have to
have ghost inodes or something).
With this patch we don't actually create dentries until we get handles
from the server that we can use to set up their inodes, and we don't
actually bind them into the tree until we know for sure where they go.
(2) Inaccessible symbolic links.
If we're asked to mount two exports from the server, eg:
mount warthog:/warthog/aaa/xxx /mmm
mount warthog:/warthog/bbb/yyy /nnn
We may not be able to access anything nearer the root than xxx and yyy,
but we may find out later that /mmm/www/yyy, say, is actually the same
directory as the one mounted on /nnn. What we might then find out, for
example, is that /warthog/bbb was actually a symbolic link to
/warthog/aaa/xxx/www, but we can't actually determine that by talking to
the server until /warthog is made available by NFS.
This would lead to having constructed an erroneous dentry tree which we
can't easily fix. We can end up with a dentry marked as a directory when
it should actually be a symlink, or we could end up with an apparently
hardlinked directory.
With this patch we need not make assumptions about the type of a dentry
for which we can't retrieve information, nor need we assume we know its
place in the grand scheme of things until we actually see that place.
This patch reduces the possibility of aliasing in the inode and page caches for
inodes that may be accessed by more than one NFS export. It also reduces the
number of superblocks required for NFS where there are many NFS exports being
used from a server (home directory server + autofs for example).
This in turn makes it simpler to do local caching of network filesystems, as it
can then be guaranteed that there won't be links from multiple inodes in
separate superblocks to the same cache file.
Obviously, cache aliasing between different levels of NFS protocol could still
be a problem, but at least that gives us another key to use when indexing the
cache.
This patch makes the following changes:
(1) The server record construction/destruction has been abstracted out into
its own set of functions to make things easier to get right. These have
been moved into fs/nfs/client.c.
All the code in fs/nfs/client.c has to do with the management of
connections to servers, and doesn't touch superblocks in any way; the
remaining code in fs/nfs/super.c has to do with VFS superblock management.
(2) The sequence of events undertaken by NFS mount is now reordered:
(a) A volume representation (struct nfs_server) is allocated.
(b) A server representation (struct nfs_client) is acquired. This may be
allocated or shared, and is keyed on server address, port and NFS
version.
(c) If allocated, the client representation is initialised. The state
member variable of nfs_client is used to prevent a race during
initialisation from two mounts.
(d) For NFS4 a simple pathwalk is performed, walking from FH to FH to find
the root filehandle for the mount (fs/nfs/getroot.c). For NFS2/3 we
are given the root FH in advance.
(e) The volume FSID is probed for on the root FH.
(f) The volume representation is initialised from the FSINFO record
retrieved on the root FH.
(g) sget() is called to acquire a superblock. This may be allocated or
shared, keyed on client pointer and FSID.
(h) If allocated, the superblock is initialised.
(i) If the superblock is shared, then the new nfs_server record is
discarded.
(j) The root dentry for this mount is looked up from the root FH.
(k) The root dentry for this mount is assigned to the vfsmount.
(3) nfs_readdir_lookup() creates dentries for each of the entries readdir()
returns; this function now attaches disconnected trees from alternate
roots that happen to be discovered attached to a directory being read (in
the same way nfs_lookup() is made to do for lookup ops).
The new d_materialise_unique() function is now used to do this, thus
permitting the whole thing to be done under one set of locks, and thus
avoiding any race between mount and lookup operations on the same
directory.
(4) The client management code uses a new debug facility: NFSDBG_CLIENT which
is set by echoing 1024 to /proc/net/sunrpc/nfs_debug.
(5) Clone mounts are now called xdev mounts.
(6) Use the dentry passed to the statfs() op as the handle for retrieving fs
statistics rather than the root dentry of the superblock (which is now a
dummy).
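Steps (g) to (i) of the mount sequence above can be sketched in userspace C as
follows. This is only an illustration of sharing keyed on client pointer and
FSID: the structures, the fixed-size table and the name get_superblock() are
hypothetical stand-ins, not the kernel's sget() interface.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; only the fields used as the sharing key are modelled. */
struct fsid { uint64_t major, minor; };
struct client { int id; };              /* plays the role of struct nfs_client */

struct superblock {
    const struct client *client;        /* key part 1: client pointer */
    struct fsid fsid;                   /* key part 2: volume FSID */
    int in_use;
    int mounts;                         /* mounts currently sharing this superblock */
};

#define MAX_SB 8
static struct superblock sb_table[MAX_SB];

/*
 * Return the superblock matching (client, fsid), or claim a free slot.
 * *shared is set to 1 when an existing superblock was reused (the caller
 * would then discard its freshly built volume record) and to 0 when a new
 * one was set up. Returns NULL if the table is full.
 */
static struct superblock *get_superblock(const struct client *clp,
                                         struct fsid fsid, int *shared)
{
    struct superblock *free_slot = NULL;

    for (size_t i = 0; i < MAX_SB; i++) {
        struct superblock *sb = &sb_table[i];

        if (!sb->in_use) {
            if (!free_slot)
                free_slot = sb;
            continue;
        }
        if (sb->client == clp &&
            sb->fsid.major == fsid.major && sb->fsid.minor == fsid.minor) {
            sb->mounts++;
            *shared = 1;
            return sb;
        }
    }
    if (!free_slot)
        return NULL;

    free_slot->client = clp;
    free_slot->fsid = fsid;
    free_slot->in_use = 1;
    free_slot->mounts = 1;
    *shared = 0;
    return free_slot;
}
```

Two mounts with the same client and FSID get the same superblock back; a
different client or a different FSID yields a separate one.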
Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
David Howells [Wed, 23 Aug 2006 00:06:12 +0000 (20:06 -0400)]
NFS: Start rpciod in server common management
Start rpciod in the server common (nfs_client struct) management code rather
than in the superblock management code. This means we only need to "start" it
once per server instead of once per superblock.
Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
David Howells [Wed, 23 Aug 2006 00:06:12 +0000 (20:06 -0400)]
NFS: Eliminate client_sys in favour of cl_rpcclient
Eliminate nfs_server::client_sys in favour of nfs_client::cl_rpcclient as we
only really need one per server that we're talking to since it doesn't have any
security on it.
The retransmission management variables are also moved to the common struct as
they're required to set up the cl_rpcclient connection.
The NFS2/3 client and client_acl connections are thenceforth derived by cloning
the cl_rpcclient connection and post-applying the authorisation flavour.
The code for setting up the initial common connection has been moved to
client.c as nfs_create_rpc_client(). All the NFS program definition tables are
also moved there as that's where they're now required rather than super.c.
Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
David Howells [Wed, 23 Aug 2006 00:06:12 +0000 (20:06 -0400)]
NFS: Move rpc_ops from nfs_server to nfs_client
Move the rpc_ops from the nfs_server struct to the nfs_client struct as they're
common to all server records of a particular NFS protocol version.
Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
David Howells [Wed, 23 Aug 2006 00:06:11 +0000 (20:06 -0400)]
NFS: Make better use of inode* dereferencing macros
Make better use of inode* dereferencing macros to hide dereferencing chains
(including NFS_PROTO and NFS_CLIENT).
Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
David Howells [Wed, 23 Aug 2006 00:06:11 +0000 (20:06 -0400)]
NFS: Maintain a common server record for NFS2/3 as well as for NFS4
Maintain a common server record for NFS2/3 as well as for NFS4 so that common
stuff can be moved there from struct nfs_server.
Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
David Howells [Wed, 23 Aug 2006 00:06:11 +0000 (20:06 -0400)]
NFS: Add extra const qualifiers
Add some extra const qualifiers into NFS.
Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
David Howells [Wed, 23 Aug 2006 00:06:10 +0000 (20:06 -0400)]
NFS: Use the dentry superblock directly in nfs_statfs()
Use the nominated dentry's superblock directly in the NFS statfs() op to get a
file handle, rather than using s_root (which will become a dummy dentry in a
future patch).
Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
David Howells [Wed, 23 Aug 2006 00:06:10 +0000 (20:06 -0400)]
NFS: Generalise the nfs_client structure
Generalise the nfs_client structure by:
(1) Moving nfs_client to a more general place (nfs_fs_sb.h).
(2) Renaming its maintenance routines to be non-NFS4 specific.
(3) Move those maintenance routines to a new non-NFS4 specific file (client.c)
and move the declarations to internal.h.
(4) Make nfs_find/get_client() take a full sockaddr_in to include the port
number (will be required for NFS2/3).
(5) Make nfs_find/get_client() take the NFS protocol version (again will be
required to differentiate NFS2, 3 & 4 client records).
Also:
(6) Make nfs_client construction proceed akin to inodes, marking them as under
construction and providing a function to indicate completion.
(7) Make nfs_get_client() wait interruptibly if it finds a client that it can
share, but that client is currently being constructed.
(8) Make nfs4_create_client() use (6) and (7) instead of locking cl_sem.
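The lookup-or-create pattern behind (4) to (7) can be sketched in userspace C
as below. It is single-threaded and purely illustrative: the structure layout,
the state names and the functions get_client() and mark_client_ready() are
assumptions for this sketch, and where the kernel would wait interruptibly for
a client under construction the sketch merely exposes the state to the caller.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical construction states, in the spirit of inode construction. */
enum client_state { CLIENT_INITING, CLIENT_READY };

struct client {
    uint32_t addr;              /* IPv4 server address */
    uint16_t port;              /* server port */
    int version;                /* NFS protocol version: 2, 3 or 4 */
    enum client_state state;
    int refcount;
    int in_use;
};

#define MAX_CLIENTS 8
static struct client clients[MAX_CLIENTS];

/*
 * Find a client record keyed on (addr, port, version), or allocate one.
 * *created is 1 if the caller now owns construction of a new record and
 * must call mark_client_ready() when done; it is 0 if an existing record
 * is being shared, in which case the caller should not use it until its
 * state is CLIENT_READY. Returns NULL if the table is full.
 */
static struct client *get_client(uint32_t addr, uint16_t port, int version,
                                 int *created)
{
    struct client *free_slot = NULL;

    for (size_t i = 0; i < MAX_CLIENTS; i++) {
        struct client *clp = &clients[i];

        if (!clp->in_use) {
            if (!free_slot)
                free_slot = clp;
            continue;
        }
        if (clp->addr == addr && clp->port == port &&
            clp->version == version) {
            clp->refcount++;
            *created = 0;
            return clp;
        }
    }
    if (!free_slot)
        return NULL;

    free_slot->addr = addr;
    free_slot->port = port;
    free_slot->version = version;
    free_slot->state = CLIENT_INITING;
    free_slot->refcount = 1;
    free_slot->in_use = 1;
    *created = 1;
    return free_slot;
}

/* Signal that construction has completed and sharers may proceed. */
static void mark_client_ready(struct client *clp)
{
    clp->state = CLIENT_READY;
}
```

Including the port and the protocol version in the key is what keeps NFS2, 3
and 4 records for the same server address apart.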
Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
David Howells [Wed, 23 Aug 2006 00:06:10 +0000 (20:06 -0400)]
NFS: Add a server capabilities NFS RPC op
Add a set_capabilities NFS RPC op so that the server capabilities can be set.
Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
David Howells [Wed, 23 Aug 2006 00:06:09 +0000 (20:06 -0400)]
NFS: Add a lookupfh NFS RPC op
Add a lookup filehandle NFS RPC op so that a file handle can be looked up
without requiring dentries and inodes and other VFS stuff when doing an NFS4
pathwalk during mounting.
Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
David Howells [Wed, 23 Aug 2006 00:06:09 +0000 (20:06 -0400)]
NFS: Return an error when starting the idmapping pipe
Return an error when starting the idmapping pipe so that we can detect it
failing.
Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
David Howells [Wed, 23 Aug 2006 00:06:09 +0000 (20:06 -0400)]
NFS: Rename nfs_server::nfs4_state
Rename nfs_server::nfs4_state to nfs_client as it will be used to represent the
client state for NFS2 and NFS3 also.
Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
David Howells [Wed, 23 Aug 2006 00:06:08 +0000 (20:06 -0400)]
NFS: Rename struct nfs4_client to struct nfs_client
Rename struct nfs4_client to struct nfs_client so that it can become the basis
for a general client record for NFS2 and NFS3 in addition to NFS4.
Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
David Howells [Wed, 23 Aug 2006 00:06:08 +0000 (20:06 -0400)]
NFS: Fix NFS4 callback up/down prototypes
Make the nfs_callback_up()/down() prototypes just do nothing if NFS4 is not
enabled. Also make the down function void type since we can't really do
anything if it fails.
Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
David Howells [Wed, 23 Aug 2006 00:06:08 +0000 (20:06 -0400)]
NFS: Disambiguate nfs_stat_to_errno()
Rename the NFS4 version of nfs_stat_to_errno() so that it doesn't conflict with
the common one used by NFS2 and NFS3.
Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
David Howells [Wed, 23 Aug 2006 00:06:07 +0000 (20:06 -0400)]
NFS: Fix up split of fs/nfs/inode.c
Fix ups for the splitting of the superblock stuff out of fs/nfs/inode.c,
including:
(*) Move the callback tcpport module param into callback.c.
(*) Move the idmap cache timeout module param into idmap.c.
(*) Changes to internal.h:
(*) namespace-nfs4.c was renamed to nfs4namespace.c.
(*) nfs_stat_to_errno() is in nfs2xdr.c, not nfs4xdr.c.
(*) nfs4xdr.c is contingent on CONFIG_NFS_V4.
(*) nfs4_path() is only used if CONFIG_NFS_V4 is set.
Plus also:
(*) The sec_flavours[] table should really be const.
Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
David Howells [Wed, 23 Aug 2006 00:06:07 +0000 (20:06 -0400)]
NFS: Add dentry materialisation op
The attached patch adds a new directory cache management function that prepares
a disconnected anonymous dentry to be connected into the dentry tree. The
anonymous dentry is transferred the name and parentage from another dentry.
The following changes were made in [try #2]:
(*) d_materialise_dentry() now switches the parentage of the two nodes around
correctly when one or other of them is self-referential.
The following changes were made in [try #7]:
(*) d_instantiate_unique() has had the interior part split out as function
__d_instantiate_unique(). Callers of this latter function must be holding
the appropriate locks.
(*) _d_rehash() has been added as a wrapper around __d_rehash() to call it
with the most obvious hash list (the one from the name). d_rehash() now
calls _d_rehash().
(*) d_materialise_dentry() is now __d_materialise_dentry() and is static.
(*) d_materialise_unique() added to perform the combination of d_find_alias(),
d_materialise_dentry() and d_add_unique() that the NFS client was doing
twice, all within a single dcache_lock critical section. This reduces the
number of times two different spinlocks were being accessed.
The following further changes were made:
(*) Add the dentries onto their parents' d_subdirs lists.
Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Tue, 25 Jul 2006 15:28:19 +0000 (11:28 -0400)]
NFS: Add an ACCESS cache memory shrinker
A pinned inode may in theory end up filling memory with cached ACCESS
calls. This patch ensures that the VM may shrink away the cache in these
particular cases.
The shrinker works by iterating through the list of inodes on the global
nfs_access_lru_list, and removing the least recently used access
cache entry until it is done (or until the entire cache is empty).
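The eviction order described above can be sketched in userspace C. This is a
simplification under stated assumptions: it keeps one flat LRU list of cache
entries rather than a list of inodes each holding entries, and the names
lru_add() and lru_shrink() are hypothetical, not the kernel shrinker API.

```c
#include <stddef.h>
#include <stdlib.h>

/* One cached ACCESS result; "cred" is a hypothetical stand-in for the key. */
struct entry {
    struct entry *prev, *next;
    int cred;
};

/* Circular doubly linked list: head.next is least recently used. */
struct lru {
    struct entry head;
    size_t count;
};

static void lru_init(struct lru *l)
{
    l->head.prev = l->head.next = &l->head;
    l->count = 0;
}

/* Append at the tail, i.e. mark as most recently used. Returns 0 on success. */
static int lru_add(struct lru *l, int cred)
{
    struct entry *e = malloc(sizeof(*e));

    if (!e)
        return -1;
    e->cred = cred;
    e->prev = l->head.prev;
    e->next = &l->head;
    l->head.prev->next = e;
    l->head.prev = e;
    l->count++;
    return 0;
}

/*
 * Free up to nr_to_scan entries starting from the least recently used end,
 * stopping early if the cache becomes empty. Returns the entries remaining.
 */
static size_t lru_shrink(struct lru *l, size_t nr_to_scan)
{
    while (nr_to_scan > 0 && l->head.next != &l->head) {
        struct entry *e = l->head.next;

        e->prev->next = e->next;
        e->next->prev = e->prev;
        free(e);
        l->count--;
        nr_to_scan--;
    }
    return l->count;
}
```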
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Tue, 25 Jul 2006 15:28:18 +0000 (11:28 -0400)]
NFS: Add a global LRU list for the ACCESS cache
...in order to allow the addition of a memory shrinker.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Tue, 25 Jul 2006 15:28:18 +0000 (11:28 -0400)]
NFS: Add a new ACCESS rpc call cache to the linux nfs client
The current access cache only allows one entry at a time to be cached for each
inode. Add a per-inode red-black tree in order to allow more than one to
be cached at a time.
Should significantly cut down the time spent in path traversal for shared
directories such as ${PATH}, /usr/share, etc.
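As a rough userspace illustration of keeping more than one entry per inode,
the sketch below stores access results in an ordered tree keyed per user. Note
the substitutions: it uses a plain unbalanced binary search tree rather than a
red-black tree, and keys on a numeric uid; both are assumptions made to keep
the example short, and all names are hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>

/* One cached ACCESS result for one user of one inode. */
struct access_entry {
    unsigned int uid;           /* hypothetical lookup key */
    unsigned int mask;          /* access bits granted by the server */
    struct access_entry *left, *right;
};

/* Return the entry for uid, or NULL if nothing is cached for that user. */
static struct access_entry *access_lookup(struct access_entry *root,
                                          unsigned int uid)
{
    while (root) {
        if (uid == root->uid)
            return root;
        root = uid < root->uid ? root->left : root->right;
    }
    return NULL;
}

/* Insert or update the entry for uid. Returns 0 on success, -1 on OOM. */
static int access_insert(struct access_entry **rootp, unsigned int uid,
                         unsigned int mask)
{
    struct access_entry *e;

    while (*rootp) {
        if (uid == (*rootp)->uid) {
            (*rootp)->mask = mask;      /* refresh the existing entry */
            return 0;
        }
        rootp = uid < (*rootp)->uid ? &(*rootp)->left : &(*rootp)->right;
    }
    e = calloc(1, sizeof(*e));
    if (!e)
        return -1;
    e->uid = uid;
    e->mask = mask;
    *rootp = e;
    return 0;
}
```

With a single-entry cache, two users alternately walking the same directory
would evict each other's result on every access; with a tree per inode both
results stay cached.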
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Linus Torvalds [Sat, 23 Sep 2006 00:51:59 +0000 (17:51 -0700)]
Merge git://git./linux/kernel/git/sfrench/cifs-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
[CIFS] statfs for cifs unix extensions no longer experimental
[CIFS] New POSIX locking code not setting rc properly to zero on successful
[CIFS] Support deep tree mounts (e.g. mounts to //server/share/path)
Linus Torvalds [Sat, 23 Sep 2006 00:50:50 +0000 (17:50 -0700)]
Merge /pub/scm/linux/kernel/git/davej/agpgart
* master.kernel.org:/pub/scm/linux/kernel/git/davej/agpgart:
[AGPGART] Rework AGPv3 modesetting fallback.
[AGPGART] Add suspend callback for i965
[AGPGART] Fix number of aperture sizes in 830 gart structs.
[AGPGART] Intel 965 Express support.
[AGPGART] agp.h: constify struct agp_bridge_data::version
[AGPGART] const'ify VIA AGP PCI table.
[AGPGART] CONFIG_PM=n slim: drivers/char/agp/intel-agp.c
[AGPGART] CONFIG_PM=n slim: drivers/char/agp/efficeon-agp.c
[AGPGART] Const'ify the agpgart driver version.
[AGPGART] remove private page protection map
Linus Torvalds [Sat, 23 Sep 2006 00:50:22 +0000 (17:50 -0700)]
Merge /pub/scm/linux/kernel/git/davej/cpufreq
* master.kernel.org:/pub/scm/linux/kernel/git/davej/cpufreq:
[CPUFREQ] sw_any_bug_dmi_table can be used on resume, so it isn't initdata
[CPUFREQ] Fix some more CPU hotplug locking.
[CPUFREQ] Workaround for BIOS bug in software coordination of frequency
[CPUFREQ] Longhaul - Add voltage scaling to driver
[CPUFREQ] Fix sparse warning in ondemand
[CPUFREQ] make drivers/cpufreq/cpufreq_ondemand.c:powersave_bias_target() static
[CPUFREQ] Longhaul - Add ignore_latency option
[CPUFREQ] Longhaul - Disable arbiter
[CPUFREQ][2/2] ondemand: updated add powersave_bias tunable
[CPUFREQ][1/2] ondemand: updated tune for hardware coordination
[CPUFREQ] Fix typo.
Al Viro [Sat, 23 Sep 2006 00:29:34 +0000 (01:29 +0100)]
[PATCH] fallout from hcd-core patch
missing le16_to_cpu()
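For readers unfamiliar with the helper: le16_to_cpu() converts a 16-bit
little-endian value to the CPU's byte order. A portable userspace analogue,
under the hypothetical name le16_to_host(), might look like this:

```c
#include <stdint.h>

/*
 * Assemble a 16-bit value from two bytes stored least significant first.
 * The result is the same on little-endian and big-endian hosts.
 */
static uint16_t le16_to_host(const uint8_t b[2])
{
    return (uint16_t)(b[0] | ((unsigned int)b[1] << 8));
}
```

Using a little-endian field without such a conversion happens to work on
little-endian machines and silently byte-swaps the value on big-endian ones.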
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Al Viro [Sat, 23 Sep 2006 00:27:30 +0000 (01:27 +0100)]
[PATCH] fix the survivors of fbcon_vbl_handler() renaming
In
|Author: James Simmons <jsimmons@kozmo.(none)>
|Date: Thu Mar 13 22:37:08 2003 -0800
|
| [FBCON] Cursor handling clean up. I nuked several static variables.
we have
-static void fbcon_vbl_handler(int irq, void *dummy, struct pt_regs *fp)
+static void fb_vbl_handler(int irq, void *dev_id, struct pt_regs *fp)
and 3 years later a couple of instances missed back then still remain
there.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Al Viro [Sat, 23 Sep 2006 00:26:02 +0000 (01:26 +0100)]
[PATCH] sun4: fix sbus_setup_iommu()
iommu_init() and iounit_init() are never called for sun4, but that's not
enough - these calls should be ifdefed out since the functions in question
simply do not exist for CONFIG_SUN4 kernel.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Al Viro [Sat, 23 Sep 2006 00:25:18 +0000 (01:25 +0100)]
[PATCH] asm/backlight.h is ppc-only
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Al Viro [Sat, 23 Sep 2006 00:22:46 +0000 (01:22 +0100)]
[PATCH] sanitize frv archclean
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Al Viro [Sat, 23 Sep 2006 00:24:25 +0000 (01:24 +0100)]
[PATCH] aoa is pmac-only
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Al Viro [Sat, 23 Sep 2006 00:20:31 +0000 (01:20 +0100)]
[PATCH] memcpy_fromio() missing in istallion
memcpy() from iomem is a bad thing...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Al Viro [Sat, 23 Sep 2006 00:18:41 +0000 (01:18 +0100)]
[PATCH] fix ancient breakage in ebus_init()
Back when pci_dev had base_address[], loop of form
base = &...->base_address[0];
for (.....) {
...
*base++ = addr;
}
was fine, but when that array got spread in ->resource[...].start
replacing the initialization with
base = &...->resource[0].start;
was not a sufficient modification. IOW this code got broken for cases
when there had been more than one resource to fill. All the way back in
2.3.41-pre3...
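The stride problem can be shown with a userspace sketch. The struct below is a
hypothetical, simplified stand-in for struct resource, and fill_starts() is an
illustrative name. With a flat array, *base++ steps to the next address slot;
with an array of structures, the same pointer increment lands on the next word
of the same structure, not on the next element's start field.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for the kernel's struct resource. */
struct resource {
    unsigned long start;
    unsigned long end;
};

/*
 * Distance between consecutive .start fields, measured in unsigned longs.
 * An "unsigned long *base; *base++ = addr;" loop assumes this is 1.
 */
static size_t start_stride(void)
{
    return sizeof(struct resource) / sizeof(unsigned long);
}

/* Index the array of structures so that each store hits a .start field. */
static void fill_starts(struct resource *res, const unsigned long *addrs,
                        size_t n)
{
    for (size_t i = 0; i < n; i++)
        res[i].start = addrs[i];
}
```

Since the stride here is 2 rather than 1, the pointer-walking loop is only
correct when there is a single resource to fill, which matches the breakage
described above.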
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Al Viro [Fri, 22 Sep 2006 23:10:18 +0000 (00:10 +0100)]
[PATCH] fix missing ifdefs in syscall classes hookup for generic targets
several targets have no ....at() family and m32r calls its only chown variant
chown32(), with __NR_chown being undefined. creat(2) is also absent in some
targets.
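A sketch of the guarding pattern, assuming a Linux userspace where
<sys/syscall.h> provides the __NR_* macros. The table name chattr_class and
the selection of syscalls are illustrative only; the point is that every entry
which may be absent on some target is wrapped in its own #ifdef.

```c
#include <stddef.h>
#include <sys/syscall.h>

/* Illustrative syscall class table; ~0U marks the end of the list. */
static const unsigned int chattr_class[] = {
#ifdef __NR_chmod
    __NR_chmod,
#endif
#ifdef __NR_fchmod
    __NR_fchmod,
#endif
#ifdef __NR_chown
    __NR_chown,         /* undefined on targets that only have chown32() */
#endif
#ifdef __NR_chown32
    __NR_chown32,
#endif
#ifdef __NR_fchmodat
    __NR_fchmodat,      /* the ...at() family is missing on several targets */
#endif
#ifdef __NR_fchownat
    __NR_fchownat,
#endif
    ~0U
};

/* Count the entries before the terminator. */
static size_t class_len(const unsigned int *tbl)
{
    size_t n = 0;

    while (tbl[n] != ~0U)
        n++;
    return n;
}
```

Because every entry is optional, the table still compiles on a target that
defines none of them; it then holds just the terminator.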
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Jeremy Fitzhardinge [Wed, 13 Sep 2006 01:55:53 +0000 (18:55 -0700)]
[CPUFREQ] sw_any_bug_dmi_table can be used on resume, so it isn't initdata
sw_any_bug_dmi_table can be used on resume, so it isn't initdata.
Signed-off-by: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Dave Jones <davej@redhat.com>
Dave Jones [Fri, 22 Sep 2006 23:15:23 +0000 (19:15 -0400)]
[CPUFREQ] Fix some more CPU hotplug locking.
Lukewarm IQ detected in hotplug locking
BUG: warning at kernel/cpu.c:38/lock_cpu_hotplug()
[<b0134a42>] lock_cpu_hotplug+0x42/0x65
[<b02f8af1>] cpufreq_update_policy+0x25/0xad
[<b0358756>] kprobe_flush_task+0x18/0x40
[<b0355aab>] schedule+0x63f/0x68b
[<b01377c2>] __link_module+0x0/0x1f
[<b0119e7d>] __cond_resched+0x16/0x34
[<b03560bf>] cond_resched+0x26/0x31
[<b0355b0e>] wait_for_completion+0x17/0xb1
[<f965c547>] cpufreq_stat_cpu_callback+0x13/0x20 [cpufreq_stats]
[<f9670074>] cpufreq_stats_init+0x74/0x8b [cpufreq_stats]
[<b0137872>] sys_init_module+0x91/0x174
[<b0102c81>] sysenter_past_esp+0x56/0x79
As there are other places that call cpufreq_update_policy without
the hotplug lock, it seems better to keep the hotplug locking
at the lower level for the time being until this is revamped.
Signed-off-by: Dave Jones <davej@redhat.com>
Linus Torvalds [Fri, 22 Sep 2006 22:47:06 +0000 (15:47 -0700)]
Merge branch 'for-linus' of /linux/kernel/git/roland/infiniband
* 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband: (65 commits)
IB: Fix typo in kerneldoc for ib_set_client_data()
IPoIB: Add some likely/unlikely annotations in hot path
IPoIB: Remove unused include of vmalloc.h
IPoIB: Rejoin all multicast groups after a port event
IPoIB: Create MCGs with all attributes required by RFC
IB/sa: fix ib_sa_selector names
IB/iser: INFINIBAND_ISER depends on INET
IB/mthca: Simplify calls to mthca_cq_clean()
RDMA/cma: Document rdma_accept() error handling
IB/mthca: Recover from catastrophic errors
RDMA/cma: Document rdma_destroy_id() function
IB/cm: Do not track remote QPN in timewait state
IB/sa: Require SA registration
IPoIB: Refactor completion handling
IB/iser: Do not use FMR for a single dma entry sg
IB/iser: fix some debug prints
IB/iser: make FMR "page size" be 4K and not PAGE_SIZE
IB/iser: Limit the max size of a scsi command
IB/iser: fix a check of SG alignment for RDMA
RDMA/cma: Protect against adding device during destruction
...
Linus Torvalds [Fri, 22 Sep 2006 22:37:31 +0000 (15:37 -0700)]
Merge branch 'upstream-linus' of /linux/kernel/git/jgarzik/netdev-2.6
* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6:
[netdrvr] mv643xx_eth: fix obvious typo, which caused build breakage
[netdrvr] lp486e: fix typo
Krishna Kumar [Fri, 22 Sep 2006 22:22:58 +0000 (15:22 -0700)]
IB: Fix typo in kerneldoc for ib_set_client_data()
Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Eli Cohen [Fri, 22 Sep 2006 22:22:58 +0000 (15:22 -0700)]
IPoIB: Add some likely/unlikely annotations in hot path
Signed-off-by: Eli Cohen <eli@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Dotan Barak [Thu, 21 Sep 2006 15:26:43 +0000 (18:26 +0300)]
IPoIB: Remove unused include of vmalloc.h
IPoIB doesn't use anything from <linux/vmalloc.h>, so don't include it.
Signed-off-by: Dotan Barak <dotanb@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Eli Cohen [Fri, 22 Sep 2006 22:22:56 +0000 (15:22 -0700)]
IPoIB: Rejoin all multicast groups after a port event
When ipoib_ib_dev_flush() is called because of a port event, the
driver needs to rejoin all multicast groups, since the flush will call
ipoib_mcast_dev_flush() (via ipoib_ib_dev_down()). Otherwise no
(non-broadcast) multicast groups will be rejoined until the networking
core calls ->set_multicast_list again, and so multicast reception will
be broken for potentially a long time.
Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Roland Dreier [Fri, 22 Sep 2006 22:22:56 +0000 (15:22 -0700)]
IPoIB: Create MCGs with all attributes required by RFC
RFC 4391 ("Transmission of IP over InfiniBand (IPoIB)") says:
If the IB multicast group does not already exist, one must be
created first with the IPoIB link MTU. The MGID MUST use the same
P_Key, Q_Key, SL, MTU, and HopLimit as those used in the
broadcast-GID. The rest of attributes SHOULD follow the values used
in the broadcast-GID as well.
However, the current IPoIB driver is only setting the attributes
required by the InfiniBand spec to create a multicast group, so in
particular the MTU and HopLimit are not being set. Add these
attributes when creating MCGs, and also set the Rate attribute, since
IPoIB pays attention to that attribute as well.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Michael S. Tsirkin [Mon, 18 Sep 2006 19:17:08 +0000 (22:17 +0300)]
IB/sa: fix ib_sa_selector names
Relevant SA queries are actually "greater than" / "less than", not
"greater than or equal" / "less than or equal" as the names imply.
(See IB spec 1.2 Vol 1, 15.2.5.16 PATHRECORD/Table 205 PathRecord)
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Roland Dreier [Fri, 22 Sep 2006 22:22:55 +0000 (15:22 -0700)]
IB/iser: INFINIBAND_ISER depends on INET
iSER won't build without CONFIG_INET enabled, so make Kconfig reflect that.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Roland Dreier [Fri, 22 Sep 2006 22:22:55 +0000 (15:22 -0700)]
IB/mthca: Simplify calls to mthca_cq_clean()
If a QP has separate send and receive CQs, then the send CQ will never
have receive completions from that QP in it. So when cleaning the
send CQ, there's no need to pass in an SRQ pointer, even if the QP is
attached to an SRQ.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Or Gerlitz [Fri, 22 Sep 2006 22:22:54 +0000 (15:22 -0700)]
RDMA/cma: Document rdma_accept() error handling
Document the error handling done in rdma_accept(): sending a reject and
modifying the QP to the error state.
Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Jack Morgenstein [Tue, 15 Aug 2006 18:11:18 +0000 (21:11 +0300)]
IB/mthca: Recover from catastrophic errors
Trigger device remove and then add when a catastrophic error is
detected in hardware. This, in turn, will cause a device reset, which
we hope will recover from the catastrophic condition.
Since this might interfere with debugging the root cause, add a
module option to suppress this behaviour.
Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Or Gerlitz [Tue, 12 Sep 2006 16:03:33 +0000 (09:03 -0700)]
RDMA/cma: Document rdma_destroy_id() function
Clarify that rdma_destroy_id cancels outstanding asynchronous operations on the
associated id.
Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Michael S. Tsirkin [Mon, 28 Aug 2006 13:32:50 +0000 (16:32 +0300)]
IB/cm: Do not track remote QPN in timewait state
Do not track remote QPN in TimeWait state, since QP is not connected.
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Michael S. Tsirkin [Mon, 21 Aug 2006 23:40:12 +0000 (16:40 -0700)]
IB/sa: Require SA registration
Require users to register with SA module, to prevent the sa_query
module text from going away while an SA query callback is still
running. Update all in-tree users for the new interface.
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Roland Dreier [Fri, 22 Sep 2006 22:22:52 +0000 (15:22 -0700)]
IPoIB: Refactor completion handling
Split up ipoib_ib_handle_wc() into ipoib_ib_handle_rx_wc() and
ipoib_ib_handle_tx_wc() to make the code easier to read. This will
also help implement NAPI in the future.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Erez Zilber [Mon, 11 Sep 2006 09:26:33 +0000 (12:26 +0300)]
IB/iser: Do not use FMR for a single dma entry sg
Fast Memory Registration (fmr) is used to register for rdma an sg whose
elements are not linearly sequential after dma mapping.
The IB verbs layer provides an "all dma memory MR (memory region)" which
can be used for RDMA-ing a dma linearly sequential buffer.
Change the code to use the dma mr instead of doing fmr when dma mapping
produces a single dma entry sg.
Signed-off-by: Erez Zilber <erezz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Erez Zilber [Mon, 11 Sep 2006 09:24:00 +0000 (12:24 +0300)]
IB/iser: fix some debug prints
fix and add some debug prints related to iser
handling of memory for rdma.
Signed-off-by: Erez Zilber <erezz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Erez Zilber [Mon, 11 Sep 2006 09:22:30 +0000 (12:22 +0300)]
IB/iser: make FMR "page size" be 4K and not PAGE_SIZE
As iser is able to use at most one rdma operation for the
execution of a scsi command, and registration of the sg
associated with scsi command has its restrictions, the code
checks if an sg is "aligned for rdma".
Alignment for rdma is measured in "fmr page" units whose
possible resolutions are different between HCAs and can be
smaller than, equal to or bigger than the system page size.
When the system page size is bigger than 4KB (e.g. the default
with ia64 kernels) there is a bigger chance that an sg would be
aligned for rdma if the fmr page size is 4KB.
Change the code to create FMR whose pages are of size 4KB
and to take that into account when processing the sg.
Signed-off-by: Erez Zilber <erezz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Erez Zilber [Mon, 11 Sep 2006 09:20:54 +0000 (12:20 +0300)]
IB/iser: Limit the max size of a scsi command
Currently, the data length of a command coming down from scsi-ml
is limited only by the size of its sg list (sg_tablesize). The
max data length may be different for different page size values.
By setting max_sectors, we limit the data length to
max_sectors*512 bytes.
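The arithmetic is simple enough to state as code. The helper name
max_data_len() and the sample value used below are hypothetical; only the
512-byte sector size comes from the description above.

```c
#include <stdint.h>

#define SECTOR_SIZE 512u

/* Upper bound on the data length of one command, in bytes. */
static uint64_t max_data_len(uint32_t max_sectors)
{
    return (uint64_t)max_sectors * SECTOR_SIZE;
}
```

For instance, a max_sectors of 1024 would cap a command at 524288 bytes
(512 KiB), independent of the page size.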
Signed-off-by: Erez Zilber <erezz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Erez Zilber [Mon, 11 Sep 2006 09:19:17 +0000 (12:19 +0300)]
IB/iser: fix a check of SG alignment for RDMA
dma mapping may include a "compaction" of the sg associated with scsi command.
Hence, the size of the maximal prefix of the SG which is aligned for rdma must be
compared against the length of the dma mapped sg (mem->dma_nents) and not against
the size of it before it was mapped (mem->size).
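A minimal userspace sketch of such a check, under stated assumptions: the
segment structure, the function names and the exact alignment rule (every
segment but the first starts on a page boundary, every segment but the last
ends on one) are illustrative simplifications, not the driver's code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of one dma-mapped sg entry. */
struct dma_seg {
    uint64_t addr;
    uint32_t len;
};

/*
 * Length of the longest prefix of the mapped sg that could be covered by a
 * single registration. page_size must be a power of two.
 */
static size_t aligned_prefix_len(const struct dma_seg *sg, size_t nents,
                                 uint64_t page_size)
{
    uint64_t mask = page_size - 1;
    size_t i;

    if (nents == 0)
        return 0;
    for (i = 0; i + 1 < nents; i++) {
        uint64_t end = sg[i].addr + sg[i].len;

        /* A gap inside a page between two segments breaks the prefix. */
        if ((end & mask) || (sg[i + 1].addr & mask))
            break;
    }
    return i + 1;
}

/*
 * Compare against the number of entries after dma mapping (dma_nents),
 * not the entry count before mapping, since mapping may merge entries.
 */
static int is_rdma_aligned(const struct dma_seg *sg, size_t dma_nents,
                           uint64_t page_size)
{
    return aligned_prefix_len(sg, dma_nents, page_size) == dma_nents;
}
```

The same list can pass with a 4KB page size and fail with a 64KB one, because
segment boundaries that fall on 4KB multiples are not necessarily on 64KB
multiples.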
Signed-off-by: Erez Zilber <erezz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Sean Hefty [Fri, 1 Sep 2006 22:33:55 +0000 (15:33 -0700)]
RDMA/cma: Protect against adding device during destruction
Closes a window where address resolution can attach an rdma_cm_id to a
device during destruction of the rdma_cm_id. This can result in the
rdma_cm_id remaining in the device list after its memory has been
freed.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Tom Tucker [Fri, 22 Sep 2006 22:22:48 +0000 (15:22 -0700)]
RDMA/amso1100: Add driver for Ammasso 1100 RNIC
Add a driver for the Ammasso 1100 gigabit ethernet RNIC.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Tom Tucker [Thu, 3 Aug 2006 21:02:42 +0000 (16:02 -0500)]
RDMA: iWARP Core Changes.
Modifications to the existing rdma header files, core files, drivers,
and ulp files to support iWARP, including:
- Hook iWARP CM into the build system and use it in rdma_cm.
- Convert enum ib_node_type to enum rdma_node_type, which includes
the possibility of RDMA_NODE_RNIC, and update everything for this.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
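A hedged sketch of how a node type that includes RNICs can be mapped to a transport. Only enum rdma_node_type and RDMA_NODE_RNIC are named in the commit message; the other enumerators, enum transport_kind and node_transport() are assumptions made for illustration and should be checked against the patch itself.

```c
#include <assert.h>

/* Assumed enumerators; only RDMA_NODE_RNIC is named in the commit message. */
enum rdma_node_type {
    RDMA_NODE_IB_CA = 1,
    RDMA_NODE_IB_SWITCH,
    RDMA_NODE_IB_ROUTER,
    RDMA_NODE_RNIC
};

/* Hypothetical transport classification used by this sketch. */
enum transport_kind {
    TRANSPORT_IB,
    TRANSPORT_IWARP
};

/* Map a node type to the transport it speaks. */
static enum transport_kind node_transport(enum rdma_node_type type)
{
    switch (type) {
    case RDMA_NODE_IB_CA:
    case RDMA_NODE_IB_SWITCH:
    case RDMA_NODE_IB_ROUTER:
        return TRANSPORT_IB;
    case RDMA_NODE_RNIC:
        return TRANSPORT_IWARP;
    }
    assert(!"unknown node type");
    return TRANSPORT_IB;
}
```

Transport-independent code can then branch on the transport rather than on individual node types.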
Tom Tucker [Thu, 3 Aug 2006 21:02:40 +0000 (16:02 -0500)]
RDMA: iWARP Connection Manager.
Add an iWARP Connection Manager (CM), which abstracts connection
management for iWARP devices (RNICs). It is a logical instance of the
xx_cm where xx is the transport type (ib or iw). The symbols exported
are used by the transport independent rdma_cm module, and are
available also for transport dependent ULPs.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Roland Dreier [Fri, 22 Sep 2006 22:22:46 +0000 (15:22 -0700)]
IB: Whitespace fixes
Remove some trailing whitespace that has snuck in despite the best
efforts of whitespace=error-all. Also fix a few other whitespace
bogosities.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Sean Hefty [Mon, 28 Aug 2006 22:15:18 +0000 (15:15 -0700)]
IB/cm: Randomize starting comm ID
Randomize the starting local comm ID to avoid getting a rejected
connection due to a stale connection after a system reboot or
reloading of the ib_cm.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
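One way to randomize the starting ID while keeping a cheap reverse lookup is to XOR a sequential index with a random operand chosen once at load time. The sketch below assumes that scheme; the commit message does not say how the randomization is done, and all names here are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-load random value; a real module would fill this
 * from a random source when it is initialised. */
static uint32_t random_id_operand;

/* Set the operand once, for example at module load. */
static void comm_id_init(uint32_t random_value)
{
    random_id_operand = random_value;
}

/* Derive the local comm ID handed to the peer from a table index. */
static uint32_t local_comm_id(uint32_t index)
{
    return index ^ random_id_operand;
}

/* Recover the table index from a comm ID seen on the wire. */
static uint32_t comm_id_to_index(uint32_t id)
{
    return id ^ random_id_operand;
}
```

Because XOR is its own inverse, the mapping is one-to-one, and IDs issued after a reboot or module reload are unlikely to collide with those of stale connections.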
James Lentini [Mon, 28 Aug 2006 22:12:04 +0000 (15:12 -0700)]
IB/mad: Remove unused includes
The ib_mad module does not use a kthread function, but mad_priv.h
includes <linux/kthread.h>. mad_rmpp.c does not do any DMA-related
stuff, but includes <linux/dma-mapping.h>. Remove the unused includes.
Signed-off-by: James Lentini <jlentini@netapp.com>
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Sean Hefty [Mon, 28 Aug 2006 22:10:32 +0000 (15:10 -0700)]
IB/mad: Add support for dual-sided RMPP transfers.
The implementation assumes that any RMPP request that requires a
response uses DS RMPP. Based on the RMPP start-up scenarios defined
by the spec, this should be a valid assumption. That is, there is no
start-up scenario defined where an RMPP request is followed by a
non-RMPP response. By having this assumption we avoid any API
changes.
In order for a node that supports DS RMPP to communicate with one that
does not, RMPP responses assume a new window size of 1 if a DS ACK has
not been received. (By DS ACK, I'm referring to the turn-around ACK
after the final ACK of the request.) This is a slight spec deviation,
but is necessary to allow communication with nodes that do not
generate the DS ACK. It also handles the case when a response is sent
after the request state has been discarded.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
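The window fallback described above can be sketched in a few lines. The function and parameter names are hypothetical and do not come from the ib_mad code.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Pick the send window for an RMPP response. Without a DS ACK the
 * responder cannot know the requester's window, so it assumes 1.
 */
static unsigned int response_window(bool ds_ack_received,
                                    unsigned int peer_window)
{
    assert(peer_window >= 1);
    return ds_ack_received ? peer_window : 1;
}
```

A window of 1 means one segment is sent and then acknowledged before the next, which is slow but works with peers that never generate the DS ACK.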
Sean Hefty [Mon, 28 Aug 2006 18:57:42 +0000 (11:57 -0700)]
IB/cm: Use correct reject code for invalid GID
Set the reject code properly when rejecting a request that contains an
invalid GID. A suitable GID is returned by the IB CM in the
additional reject information (ARI). This is a spec compliance issue.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Sean Hefty [Mon, 28 Aug 2006 18:55:52 +0000 (11:55 -0700)]
IB/cm: Enable atomics along with RDMA reads
Enable atomic operations along with RDMA reads if a local RDMA
read/atomic depth is provided by the user.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Jack Morgenstein [Mon, 28 Aug 2006 16:12:39 +0000 (19:12 +0300)]
IB/mthca: Return correct number of bits for static rate in query_qp
An incorrect number of bits was taken for the static_rate field.
Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Jack Morgenstein [Mon, 28 Aug 2006 16:10:34 +0000 (19:10 +0300)]
IB/mthca: Return port number for unconnected QPs in query_qp
port_num was not being returned for unconnected QPs.
Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Jack Morgenstein [Mon, 28 Aug 2006 16:08:53 +0000 (19:08 +0300)]
IB/mthca: Fix default static rate returned for Tavor in AV
When the default static rate is returned for Tavor, it needs to be
translated to an ib rate value.
Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Bryan O'Sullivan [Fri, 25 Aug 2006 18:24:48 +0000 (11:24 -0700)]
IB/ipath: control receive polarity inversion
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Bryan O'Sullivan [Fri, 25 Aug 2006 18:24:46 +0000 (11:24 -0700)]
IB/ipath: fix return value from ipath_poll
This stops the generic poll code from waiting for a timeout.
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Bryan O'Sullivan [Fri, 25 Aug 2006 18:24:45 +0000 (11:24 -0700)]
IB/ipath: allow SMA to be disabled
This is useful for testing purposes.
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Bryan O'Sullivan [Fri, 25 Aug 2006 18:24:44 +0000 (11:24 -0700)]
IB/ipath: handle sq_sig_all field correctly
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Bryan O'Sullivan [Fri, 25 Aug 2006 18:24:43 +0000 (11:24 -0700)]
IB/ipath: put a limit on the number of QPs that can be created
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Bryan O'Sullivan [Fri, 25 Aug 2006 18:24:42 +0000 (11:24 -0700)]
IB/ipath: validate path_mig_state properly
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Bryan O'Sullivan [Fri, 25 Aug 2006 18:24:41 +0000 (11:24 -0700)]
IB/ipath: be more strict about testing the modify QP verb
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Bryan O'Sullivan [Fri, 25 Aug 2006 18:24:40 +0000 (11:24 -0700)]
IB/ipath: add serial number to hardware freeze error message
Also added the word "Hardware" after "Fatal" to make it more obvious
that it's hardware, not software.
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Bryan O'Sullivan [Fri, 25 Aug 2006 18:24:39 +0000 (11:24 -0700)]
IB/ipath: support new QLogic product naming scheme
This patch only renames files, fixes product names, and updates
comments.
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Bryan O'Sullivan [Fri, 25 Aug 2006 18:24:38 +0000 (11:24 -0700)]
IB/ipath: account for attached QPs correctly
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Bryan O'Sullivan [Fri, 25 Aug 2006 18:24:37 +0000 (11:24 -0700)]
IB/ipath: do not allow use of CQ entries with invalid counts
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Bryan O'Sullivan [Fri, 25 Aug 2006 18:24:36 +0000 (11:24 -0700)]
IB/ipath: add new minor device to allow sending of diag packets
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Bryan O'Sullivan [Fri, 25 Aug 2006 18:24:35 +0000 (11:24 -0700)]
IB/ipath: trivial cleanups
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Bryan O'Sullivan [Fri, 25 Aug 2006 18:24:34 +0000 (11:24 -0700)]
IB/ipath: remove stale references to userspace SMA
When we first submitted a userspace subnet management agent, it was
rejected, so we left it out of the final driver submission. This patch
removes a number of vestigial references to it.
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Bryan O'Sullivan [Fri, 25 Aug 2006 18:24:33 +0000 (11:24 -0700)]
IB/ipath: simplify debugging code after ipath_core and ib_ipath merger
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Bryan O'Sullivan [Fri, 25 Aug 2006 18:24:32 +0000 (11:24 -0700)]
IB/ipath: simplify layering code
A lot of ipath layer code was only called in one place. Now that the
ipath_core and ib_ipath drivers are merged, it's more sensible to simply
inline the simple stuff that the layer code was doing.
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Bryan O'Sullivan [Fri, 25 Aug 2006 18:24:31 +0000 (11:24 -0700)]
IB/ipath: merge ipath_core and ib_ipath drivers
There is little point in keeping the two drivers separate, so we are
merging them.
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Bryan O'Sullivan [Fri, 25 Aug 2006 18:24:30 +0000 (11:24 -0700)]
IB/ipath: drop requirement that PIO buffers be mmaped write-only
Some userlands try to mmap these pages read-write, so accommodate them.
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Bryan O'Sullivan [Fri, 25 Aug 2006 18:24:29 +0000 (11:24 -0700)]
IB/ipath: fix handling of kpiobufs
Change the comment so that it no longer implies that the user can set
ipath_kpiobufs to zero. Actually set ipath_kpiobufs from the parameter.
Previously only the per-device ipath_lastport_piobuf was altered, and it
was overwritten in chip init.
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Bryan O'Sullivan [Fri, 25 Aug 2006 18:24:28 +0000 (11:24 -0700)]
IB/ipath: fix for crash on module unload, if cfgports < portcnt
Allocate enough pointers for all possible ports, to avoid problems in
cleanup/unload.
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Bryan O'Sullivan [Fri, 25 Aug 2006 18:24:27 +0000 (11:24 -0700)]
IB/ipath: lock resource limit counters correctly
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Bryan O'Sullivan [Fri, 25 Aug 2006 18:24:26 +0000 (11:24 -0700)]
IB/ipath: More changes to support InfiniPath on PowerPC 970 systems
Ordering of writethrough store buffers needs to be forced, and we need
to use ifdef to get writethrough behavior to InfiniPath buffers, because
there is no generic way to specify that at this time (similar to code
in char/drm/drm_vm.c and block/z2ram.c).
Signed-off-by: John Gregor <john.gregor@qlogic.com>
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Ralph Campbell [Fri, 22 Sep 2006 22:22:26 +0000 (15:22 -0700)]
IB/ipath: Performance improvements via mmap of queues
Improve performance of userspace post receive, post SRQ receive, and
poll CQ operations for ipath by allowing userspace to directly mmap()
receive queues and completion queues. This eliminates the copying
between userspace and the kernel in the data path.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Ralph Campbell [Fri, 11 Aug 2006 21:58:09 +0000 (14:58 -0700)]
IB/uverbs: Pass userspace data to modify_srq and modify_qp methods
Pass a struct ib_udata to the low-level driver's ->modify_srq() and
->modify_qp() methods, so that it can get to the device-specific data
passed in by the userspace driver.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Ralph Campbell [Fri, 22 Sep 2006 22:22:24 +0000 (15:22 -0700)]
IB/uverbs: Allow resize CQ operation to return driver-specific data
Add an ib_uverbs_resize_cq_resp.driver_data field so that low-level
drivers can return data from a resize CQ operation to userspace. Have
ib_uverbs_resize_cq() only copy the cqe field, to avoid having to bump
the userspace ABI.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
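A user-space sketch of the ABI-preserving idea: the response struct grows a trailing driver_data area, but the generic code still copies only the cqe field. The struct layout (including the reserved field) and the helper name are assumptions for illustration, and memcpy() stands in for the kernel's copy to userspace.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout modelled on the description in the commit message. */
struct resize_cq_resp {
    uint32_t cqe;
    uint32_t reserved;      /* assumed padding field */
    uint64_t driver_data[]; /* filled in by the low-level driver, if at all */
};

/*
 * Copy only the cqe field, so that callers built against the old,
 * smaller response still receive exactly what they expect.
 * Returns the number of bytes written, or 0 if the buffer is too small.
 */
static size_t copy_resize_resp(void *user_buf, size_t user_len,
                               const struct resize_cq_resp *resp)
{
    size_t n = sizeof(resp->cqe);

    assert(user_buf != NULL && resp != NULL);
    if (user_len < n)
        return 0;
    memcpy(user_buf, &resp->cqe, n);
    return n;
}
```

Bytes past the cqe field in the caller's buffer are left untouched by the generic path, so a low-level driver is free to write its own data there.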
Heiko J Schick [Fri, 22 Sep 2006 22:22:22 +0000 (15:22 -0700)]
IB/ehca: Add driver for IBM eHCA InfiniBand adapters
Add a driver for IBM GX bus InfiniBand adapters, which are usable with
some pSeries/System p systems.
Signed-off-by: Heiko J Schick <schickhj.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Ishai Rabinovitz [Tue, 15 Aug 2006 14:34:52 +0000 (17:34 +0300)]
IB/srp: Add port/device attributes
Add local_ib_device and local_ib_port attributes to srp scsi_host.
These are needed when we want to connect to the same target through
multiple distinct ports.
Signed-off-by: Ishai Rabinovitz <ishai@mellanox.co.il>
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
James Lentini [Fri, 22 Sep 2006 22:17:20 +0000 (15:17 -0700)]
IB/mthca: Include the header we really want
Signed-off-by: James Lentini <jlentini@netapp.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Roland Dreier [Fri, 22 Sep 2006 22:17:20 +0000 (15:17 -0700)]
IB/uverbs: Fix lockdep warning when QP is created with 2 CQs
Lockdep warns when userspace creates a QP that uses different CQs for
send completions and receive completions, because both CQs are locked
and their mutexes belong to the same lock class. However, we know
that the mutexes are distinct and the nesting is safe (there is no
possibility of AB-BA deadlock because the mutexes are locked with
down_read()), so annotate the situation with SINGLE_DEPTH_NESTING to
get rid of the lockdep warning.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Roland Dreier [Fri, 22 Sep 2006 22:17:19 +0000 (15:17 -0700)]
IB/uverbs: Use idr_read_cq() where appropriate
There were two functions that open-coded idr_read_cq() in terms of
idr_read_uobj() rather than using the helper.
Signed-off-by: Roland Dreier <rolandd@cisco.com>