Sylwester Nawrocki [Thu, 26 Jul 2012 10:15:42 +0000 (07:15 -0300)]
[media] s5p-fimc: Don't allocate fimc-capture video device dynamically
This fixes potential invalid pointer de-reference, when
media_entity_cleanup() is called before video device
is unregistered.
Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Sylwester Nawrocki [Thu, 26 Jul 2012 10:13:08 +0000 (07:13 -0300)]
[media] s5p-fimc: Don't allocate fimc-lite video device structure dynamically
This fixes potential invalid pointer de-reference, when
media_entity_cleanup() is called after video_unregister_device,
and video device structure memory is already freed.
Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Sylwester Nawrocki [Thu, 26 Jul 2012 12:50:49 +0000 (09:50 -0300)]
[media] s5p-fimc: Enable FIMC-LITE driver only for SOC_EXYNOS4x12
Allow to compile-in the FIMC-LITE driver only on Exynos4212,
Exynos4412 and Exynos5250 SoC where the device is available.
Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Albert Wang [Wed, 1 Aug 2012 05:45:41 +0000 (02:45 -0300)]
[media] media: soc_camera: don't clear pix->sizeimage in JPEG mode
In JPEG mode, the size of image is variable due to different JPEG compression
rate. We only can get the pix->sizeimage from the user.
If we clear pix->sizeimage in soc_camera_try_fmt() then we will get it from:
ret = soc_mbus_image_size(xlate->host_fmt, pix->bytesperline,
pix->height);
if (ret < 0)
return ret;
pix->sizeimage = max_t(u32, pix->sizeimage, ret);
In general, this sizeimage will be larger than the actul JPEG image size.
But vb2 will check the buffer and size of image in __qbuf_userptr():
/* Check if the provided plane buffer is large enough */
if (planes[plane].length < q->plane_sizes[plane])
So we shouldn't clear the pix->sizeimage and also shouldn't re-calculate
the pix->sizeimage in soc_mbus_image_size() in JPEG mode
We also shouldn't re-calculate pix->bytesperline:
ret = soc_mbus_bytes_per_line(pix->width, xlate->host_fmt);
if (ret < 0)
return ret;
pix->bytesperline = max_t(u32, pix->bytesperline, ret);
pix->bytesperline also should be set by the user or by the driver's
try_fmt() implementation.
Change-Id: I700690a2287346127a624b5260922eaa5427a596
Signed-off-by: Albert Wang <twang13@marvell.com>
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Javier Martin [Wed, 1 Aug 2012 09:16:44 +0000 (06:16 -0300)]
[media] media: mx2_camera: Fix clock handling for i.MX27
On i.MX27 two clocks are required: emma-ipg and emma-ahb. The ahb clock
has to be requested using both a device and a connection ID.
Signed-off-by: Javier Martin <javier.martin@vista-silicon.com>
[g.liakhovetski@gmx.de: rebase to the current media tree]
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Fabio Estevam [Fri, 25 May 2012 23:14:48 +0000 (20:14 -0300)]
[media] video: mx2_camera: Use clk_prepare_enable/clk_disable_unprepare
Prepare the clock before enabling it.
Cc: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Cc: <linux-media@vger.kernel.org>
Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Fabio Estevam [Fri, 25 May 2012 23:14:47 +0000 (20:14 -0300)]
[media] video: mx1_camera: Use clk_prepare_enable/clk_disable_unprepare
Prepare the clock before enabling it.
Cc: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Cc: <linux-media@vger.kernel.org>
Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Alex Gershgorin [Wed, 1 Aug 2012 08:05:10 +0000 (05:05 -0300)]
[media] media: mx3_camera: buf_init() add buffer state check
This patch checks the state of the buffer when calling .buf_init() method.
This is needed for the USERPTR buffer type, because in that case
.buf_init() is called every time a buffer is queued, and not only once
during the preparation stage, like in the MMAP case. Without this check
buffers get initialised repeatedly, which also leads to the allocation
of new DMA descriptors, of which there is only a final relatively small
number available. Both MMAP and USERPTR methods were successfully tested.
Signed-off-by: Alex Gershgorin <alexg@meprolight.com>
[g.liakhovetski@gmx.de: remove mx3_camera_buffer::state completely]
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Hans de Goede [Sat, 11 Aug 2012 09:34:55 +0000 (06:34 -0300)]
[media] radio-shark2: Only compile led support when CONFIG_LED_CLASS is set
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Hans de Goede [Sat, 11 Aug 2012 09:34:54 +0000 (06:34 -0300)]
[media] radio-shark: Only compile led support when CONFIG_LED_CLASS is set
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Hans de Goede [Sat, 11 Aug 2012 09:34:53 +0000 (06:34 -0300)]
[media] radio-shark*: Call cancel_work_sync from disconnect rather then release
This removes the need for shark_led_work to take the v4l2 lock.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Hans de Goede [Sat, 11 Aug 2012 09:34:52 +0000 (06:34 -0300)]
[media] radio-shark*: Remove work-around for dangling pointer in usb intfdata
Recent kernels properly clear the usb intfdata pointer when another
driver fails to bind (in the radio-shark* case the usbhid driver would try
to bind first.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Guenter Roeck [Mon, 6 Aug 2012 03:15:20 +0000 (00:15 -0300)]
[media] Add USB dependency for IguanaWorks USB IR Transceiver
This patch fixes the error
drivers/usb/core/hub.c:3753: undefined reference to `usb_speed_string'
seen in various random configurations.
Cc: Sean Young <sean@mess.org>
Cc: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Hans Verkuil [Wed, 1 Aug 2012 19:18:41 +0000 (16:18 -0300)]
[media] Add missing logging for rangelow/high of hwseek
struct v4l2_hw_freq_seek has two new fields that weren't printed in the
logging function. Added.
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Hans Verkuil [Wed, 1 Aug 2012 18:52:46 +0000 (15:52 -0300)]
[media] VIDIOC_ENUM_FREQ_BANDS fix
When VIDIOC_ENUM_FREQ_BANDS is called for a driver that doesn't supply an
enum_freq_bands op, then it will fall back to reporting a single freq band
based on information from g_tuner or g_modulator.
Due to a bug this is an infinite list since the index field wasn't tested.
This patch fixes this and returns -EINVAL if index != 0.
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Hans Verkuil [Wed, 1 Aug 2012 06:32:33 +0000 (03:32 -0300)]
[media] mem2mem_testdev: fix querycap regression
Trival but important patch.
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Hans Verkuil [Fri, 3 Aug 2012 14:16:59 +0000 (11:16 -0300)]
[media] si470x: v4l2-compliance fixes
Just a few fixes for problems found after updating v4l2-compliance to check
the frequency band enumeration.
Note that the i2c driver doesn't fill in bus_info, but since I can't test that
driver I've decided not to fix that.
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Hans Verkuil [Fri, 3 Aug 2012 14:16:29 +0000 (11:16 -0300)]
[media] DocBook: Remove a spurious character
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Jayakrishnan Memana [Sun, 15 Jul 2012 13:54:03 +0000 (10:54 -0300)]
[media] uvcvideo: Reset the bytesused field when recycling an erroneous buffer
Buffers marked as erroneous are recycled immediately by the driver if
the nodrop module parameter isn't set. The buffer payload size is reset
to 0, but the buffer bytesused field isn't. This results in the buffer
being immediately considered as complete, leading to an infinite loop in
interrupt context.
Fix the problem by resetting the bytesused field when recycling the
buffer.
Cc: <stable@vger.kernel.org>
Signed-off-by: Jayakrishnan Memana <jayakrishnan.memana@maxim-ic.com>
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Linus Torvalds [Wed, 1 Aug 2012 01:47:44 +0000 (18:47 -0700)]
Merge branch 'v4l_for_linus' of git://git./linux/kernel/git/mchehab/linux-media
Pull second set of media updates from Mauro Carvalho Chehab:
- radio API: add support to work with radio frequency bands
- new AM/FM radio drivers: radio-shark, radio-shark2
- new Remote Controller USB driver: iguanair
- conversion of several drivers to the v4l2 core control framework
- new board additions at existing drivers
- the remaining (and vast majority of the patches) are due to
drivers/DocBook fixes/cleanups.
* 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (154 commits)
[media] radio-tea5777: use library for 64bits div
[media] tlg2300: Declare MODULE_FIRMWARE usage
[media] lgs8gxx: Declare MODULE_FIRMWARE usage
[media] xc5000: Add MODULE_FIRMWARE statements
[media] s2255drv: Add MODULE_FIRMWARE statement
[media] dib8000: move dereference after check for NULL
[media] Documentation: Update cardlists
[media] bttv: add support for Aposonic W-DVR
[media] cx25821: Remove bad strcpy to read-only char*
[media] pms.c: remove duplicated include
[media] smiapp-core.c: remove duplicated include
[media] via-camera: pass correct format settings to sensor
[media] rtl2832.c: minor cleanup
[media] Add support for the IguanaWorks USB IR Transceiver
[media] Minor cleanups for MCE USB
[media] drivers/media/dvb/siano/smscoreapi.c: use list_for_each_entry
[media] Use a named union in struct v4l2_ioctl_info
[media] mceusb: Add Twisted Melon USB IDs
[media] staging/media/solo6x10: use module_pci_driver macro
[media] staging/media/dt3155v4l: use module_pci_driver macro
...
Conflicts:
Documentation/feature-removal-schedule.txt
Linus Torvalds [Wed, 1 Aug 2012 01:45:44 +0000 (18:45 -0700)]
Merge tag 'nfs-for-3.6-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull second wave of NFS client updates from Trond Myklebust:
- Patches from Bryan to allow splitting of the NFSv2/v3/v4 code into
separate modules.
- Fix Oopses in the NFSv4 idmapper
- Fix a deadlock whereby rpciod tries to allocate a new socket and ends
up recursing into the NFS code due to memory reclaim.
- Increase the number of permitted callback connections.
* tag 'nfs-for-3.6-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
nfs: explicitly reject LOCK_MAND flock() requests
nfs: increase number of permitted callback connections.
SUNRPC: return negative value in case rpcbind client creation error
NFS: Convert v4 into a module
NFS: Convert v3 into a module
NFS: Convert v2 into a module
NFS: Keep module parameters in the generic NFS client
NFS: Split out remaining NFS v4 inode functions
NFS: Pass super operations and xattr handlers in the nfs_subversion
NFS: Only initialize the ACL client in the v3 case
NFS: Create a try_mount rpc op
NFS: Remove the NFS v4 xdev mount function
NFS: Add version registering framework
NFS: Fix a number of bugs in the idmapper
nfs: skip commit in releasepage if we're freeing memory for fs-related reasons
sunrpc: clarify comments on rpc_make_runnable
pnfsblock: bail out partial page IO
Linus Torvalds [Wed, 1 Aug 2012 01:43:13 +0000 (18:43 -0700)]
Merge git://git./linux/kernel/git/davem/net
Pull networking update from David S. Miller:
"I think Eric Dumazet and I have dealt with all of the known routing
cache removal fallout. Some other minor fixes all around.
1) Fix RCU of cached routes, particular of output routes which require
liberation via call_rcu() instead of call_rcu_bh(). From Eric
Dumazet.
2) Make sure we purge net device references in cached routes properly.
3) TG3 driver bug fixes from Michael Chan.
4) Fix reported 'expires' value in ipv6 routes, from Li Wei.
5) TUN driver ioctl leaks kernel bytes to userspace, from Mathias
Krause."
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (22 commits)
ipv4: Properly purge netdev references on uncached routes.
ipv4: Cache routes in nexthop exception entries.
ipv4: percpu nh_rth_output cache
ipv4: Restore old dst_free() behavior.
bridge: make port attributes const
ipv4: remove rt_cache_rebuild_count
net: ipv4: fix RCU races on dst refcounts
net: TCP early demux cleanup
tun: Fix formatting.
net/tun: fix ioctl() based info leaks
tg3: Update version to 3.124
tg3: Fix race condition in tg3_get_stats64()
tg3: Add New 5719 Read DMA workaround
tg3: Fix Read DMA workaround for 5719 A0.
tg3: Request APE_LOCK_PHY before PHY access
ipv6: fix incorrect route 'expires' value passed to userspace
mISDN: Bugfix only few bytes are transfered on a connection
seeq: use PTR_RET at init_module of driver
bnx2x: remove cast around the kmalloc in bnx2x_prev_mark_path
ipv4: clean up put_child
...
Linus Torvalds [Wed, 1 Aug 2012 01:08:25 +0000 (18:08 -0700)]
Merge tag 'for-v3.6' of git://git.infradead.org/battery-2.6
Pull battery updates from Anton Vorontsov:
"The tag contains just a few battery-related changes for v3.6. It's is
all pretty straightforward, except one thing.
One of our patches added thermal support for power supply class, but
thermal/ subsystem changed under our feet. We (well, Stephen, that
is) caught the issue and it was decided[1] that I'd just delay the
battery pull request, and then will fix it up by merging upstream back
into battery tree at the specific commit.
That's not all though: another[2] small fixup for thermal subsystem
was needed to get rid of a warning in power supply subsystem (the
warning was not drivers/power's "fault", the thermal registration
function just needed a proper const annotation, which is also done by
a small commit on top of the merge.
So, to sum this up:
- The 'master' branch of the battery tree was in the -next tree for
weeks, was never rebased, altered etc. It should be all OK;
- Although, for-v3.6 tag contains the 'master' branch + merge + the
warning fix.
[1] http://lkml.org/lkml/2012/6/19/23
[2] http://lkml.org/lkml/2012/6/18/28"
* tag 'for-v3.6' of git://git.infradead.org/battery-2.6: (23 commits)
thermal: Constify 'type' argument for the registration routine
olpc-battery: update CHARGE_FULL_DESIGN property for BYD LiFe batteries
olpc-battery: Add VOLTAGE_MAX_DESIGN property
charger-manager: Fix build break related to EXTCON
lp8727_charger: Move header file into platform_data directory
power_supply: Add min/max alert properties for CAPACITY, TEMP, TEMP_AMBIENT
bq27x00_battery: Add support for BQ27425 chip
charger-manager: Set current limit of regulator for over current protection
charger-manager: Use EXTCON Subsystem to detect charger cables for charging
test_power: Add VOLTAGE_NOW and BATTERY_TEMP properties
test_power: Add support for USB AC source
gpio-charger: Use cansleep version of gpio_set_value
bq27x00_battery: Add support for power average and health properties
sbs-battery: Don't trigger false supply_changed event
twl4030_charger: Allow charger to control the regulator that feeds it
twl4030_charger: Add backup-battery charging
twl4030_charger: Fix some typos
max17042_battery: Support CHARGE_COUNTER power supply attribute
smb347-charger: Add constant charge and current properties
power_supply: Add constant charge_current and charge_voltage properties
...
Linus Torvalds [Tue, 31 Jul 2012 22:34:13 +0000 (15:34 -0700)]
Merge branch 'perf-core-for-linus' of git://git./linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar:
"The biggest changes are Intel Nehalem-EX PMU uncore support, uprobes
updates/cleanups/fixes from Oleg and diverse tooling updates (mostly
fixes) now that Arnaldo is back from vacation."
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (40 commits)
uprobes: __replace_page() needs munlock_vma_page()
uprobes: Rename vma_address() and make it return "unsigned long"
uprobes: Fix register_for_each_vma()->vma_address() check
uprobes: Introduce vaddr_to_offset(vma, vaddr)
uprobes: Teach build_probe_list() to consider the range
uprobes: Remove insert_vm_struct()->uprobe_mmap()
uprobes: Remove copy_vma()->uprobe_mmap()
uprobes: Fix overflow in vma_address()/find_active_uprobe()
uprobes: Suppress uprobe_munmap() from mmput()
uprobes: Uprobe_mmap/munmap needs list_for_each_entry_safe()
uprobes: Clean up and document write_opcode()->lock_page(old_page)
uprobes: Kill write_opcode()->lock_page(new_page)
uprobes: __replace_page() should not use page_address_in_vma()
uprobes: Don't recheck vma/f_mapping in write_opcode()
perf/x86: Fix missing struct before structure name
perf/x86: Fix format definition of SNB-EP uncore QPI box
perf/x86: Make bitfield unsigned
perf/x86: Fix LLC-* and node-* events on Intel SandyBridge
perf/x86: Add Intel Nehalem-EX uncore support
perf/x86: Fix typo in format definition of uncore PCU filter
...
Linus Torvalds [Tue, 31 Jul 2012 22:33:04 +0000 (15:33 -0700)]
Merge branch 'merge' of git://git./linux/kernel/git/benh/powerpc
Pull powerpc updates from Benjamin Herrenschmidt:
"Kumar sent me a handful of Freescale related fixes and I added another
regression fix to the pile.
PS. I -will- eventually learn about that signed tag business :-)"
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
powerpc/kvm/book3s_32: Fix MTMSR_EERI macro
powerpc/85xx: p1022ds: fix DIU/LBC switching with NAND enabled
powerpc/85xx: p1022ds: disable the NAND flash node if video is enabled
powerpc/85xx: Fix sram_offset parameter type
powerpc/85xx: P3041DS - change espi input-clock from 40MHz to 35MHz
powerpc/85xx: Fix pci base address error for p2020rdb-pc in dts
Linus Torvalds [Tue, 31 Jul 2012 22:32:05 +0000 (15:32 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/s390/linux
Pull s390 updates from Martin Schwidefsky:
"This it the second batch of s390 patches for the 3.6 merge window.
Included is enablement for two common code changes, killable page
faults and sorted exception tables. And the regular set of cleanup
and bug fix patches."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390: make use of user_mode() macro where possible
s390/mm: rename user_mode variable to addressing_mode
s390/mm: fix fault handling for page table walk case
s390/mm: make page faults killable
s390: update defconfig
s390/mm: downgrade page table after fork of a 31 bit process
s390/ipl: Use diagnose 8 command separation
s390/linker script: use RO_DATA_SECTION
s390/exceptions: sort exception table at build time
s390/debug: remove module_exit function / move EXPORT_SYMBOLs
David S. Miller [Tue, 31 Jul 2012 22:06:50 +0000 (15:06 -0700)]
ipv4: Properly purge netdev references on uncached routes.
When a device is unregistered, we have to purge all of the
references to it that may exist in the entire system.
If a route is uncached, we currently have no way of accomplishing
this.
So create a global list that is scanned when a network device goes
down. This mirrors the logic in net/core/dst.c's dst_ifdown().
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 31 Jul 2012 22:02:02 +0000 (15:02 -0700)]
ipv4: Cache routes in nexthop exception entries.
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Tue, 31 Jul 2012 21:42:28 +0000 (14:42 -0700)]
Merge branch 'nfsd-next' of git://linux-nfs.org/~bfields/linux
Pull nfsd changes from J. Bruce Fields:
"This has been an unusually quiet cycle--mostly bugfixes and cleanup.
The one large piece is Stanislav's work to containerize the server's
grace period--but that in itself is just one more step in a
not-yet-complete project to allow fully containerized nfs service.
There are a number of outstanding delegation, container, v4 state, and
gss patches that aren't quite ready yet; 3.7 may be wilder."
* 'nfsd-next' of git://linux-nfs.org/~bfields/linux: (35 commits)
NFSd: make boot_time variable per network namespace
NFSd: make grace end flag per network namespace
Lockd: move grace period management from lockd() to per-net functions
LockD: pass actual network namespace to grace period management functions
LockD: manage grace list per network namespace
SUNRPC: service request network namespace helper introduced
NFSd: make nfsd4_manager allocated per network namespace context.
LockD: make lockd manager allocated per network namespace
LockD: manage grace period per network namespace
Lockd: add more debug to host shutdown functions
Lockd: host complaining function introduced
LockD: manage used host count per networks namespace
LockD: manage garbage collection timeout per networks namespace
LockD: make garbage collector network namespace aware.
LockD: mark host per network namespace on garbage collect
nfsd4: fix missing fault_inject.h include
locks: move lease-specific code out of locks_delete_lock
locks: prevent side-effects of locks_release_private before file_lock is initialized
NFSd: set nfsd_serv to NULL after service destruction
NFSd: introduce nfsd_destroy() helper
...
Eric Dumazet [Tue, 31 Jul 2012 05:45:30 +0000 (05:45 +0000)]
ipv4: percpu nh_rth_output cache
Input path is mostly run under RCU and doesnt touch dst refcnt
But output path on forwarding or UDP workloads hits
badly dst refcount, and we have lot of false sharing, for example
in ipv4_mtu() when reading rt->rt_pmtu
Using a percpu cache for nh_rth_output gives a nice performance
increase at a small cost.
24 udpflood test on my 24 cpu machine (dummy0 output device)
(each process sends 1.000.000 udp frames, 24 processes are started)
before : 5.24 s
after : 2.06 s
For reference, time on linux-3.5 : 6.60 s
Signed-off-by: Eric Dumazet <edumazet@google.com>
Tested-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Tue, 31 Jul 2012 01:08:23 +0000 (01:08 +0000)]
ipv4: Restore old dst_free() behavior.
commit
404e0a8b6a55 (net: ipv4: fix RCU races on dst refcounts) tried
to solve a race but added a problem at device/fib dismantle time :
We really want to call dst_free() as soon as possible, even if sockets
still have dst in their cache.
dst_release() calls in free_fib_info_rcu() are not welcomed.
Root of the problem was that now we also cache output routes (in
nh_rth_output), we must use call_rcu() instead of call_rcu_bh() in
rt_free(), because output route lookups are done in process context.
Based on feedback and initial patch from David Miller (adding another
call_rcu_bh() call in fib, but it appears it was not the right fix)
I left the inet_sk_rx_dst_set() helper and added __rcu attributes
to nh_rth_output and nh_rth_input to better document what is going on in
this code.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Tue, 31 Jul 2012 21:35:28 +0000 (14:35 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/sage/ceph-client
Pull Ceph changes from Sage Weil:
"Lots of stuff this time around:
- lots of cleanup and refactoring in the libceph messenger code, and
many hard to hit races and bugs closed as a result.
- lots of cleanup and refactoring in the rbd code from Alex Elder,
mostly in preparation for the layering functionality that will be
coming in 3.7.
- some misc rbd cleanups from Josh Durgin that are finally going
upstream
- support for CRUSH tunables (used by newer clusters to improve the
data placement)
- some cleanup in our use of d_parent that Al brought up a while back
- a random collection of fixes across the tree
There is another patch coming that fixes up our ->atomic_open()
behavior, but I'm going to hammer on it a bit more before sending it."
Fix up conflicts due to commits that were already committed earlier in
drivers/block/rbd.c, net/ceph/{messenger.c, osd_client.c}
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (132 commits)
rbd: create rbd_refresh_helper()
rbd: return obj version in __rbd_refresh_header()
rbd: fixes in rbd_header_from_disk()
rbd: always pass ops array to rbd_req_sync_op()
rbd: pass null version pointer in add_snap()
rbd: make rbd_create_rw_ops() return a pointer
rbd: have __rbd_add_snap_dev() return a pointer
libceph: recheck con state after allocating incoming message
libceph: change ceph_con_in_msg_alloc convention to be less weird
libceph: avoid dropping con mutex before fault
libceph: verify state after retaking con lock after dispatch
libceph: revoke mon_client messages on session restart
libceph: fix handling of immediate socket connect failure
ceph: update MAINTAINERS file
libceph: be less chatty about stray replies
libceph: clear all flags on con_close
libceph: clean up con flags
libceph: replace connection state bits with states
libceph: drop unnecessary CLOSED check in socket state change callback
libceph: close socket directly from ceph_con_close()
...
Mauro Carvalho Chehab [Tue, 31 Jul 2012 19:38:41 +0000 (16:38 -0300)]
[media] radio-tea5777: use library for 64bits div
drivers/built-in.o: In function `radio_tea5777_set_freq':
radio-tea5777.c:(.text+0x4d8704): undefined reference to `__udivdi3'
Reported-by: Randy Dunlap <rdunlap@xenotime.net>
Cc: Hans de Goede <hdegoede@redhat.com>
Acked-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Jeff Layton [Mon, 23 Jul 2012 19:46:23 +0000 (15:46 -0400)]
nfs: explicitly reject LOCK_MAND flock() requests
We have no mechanism to emulate LOCK_MAND locks on NFSv4, so explicitly
return -EINVAL if someone requests it.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
NeilBrown [Tue, 31 Jul 2012 04:40:12 +0000 (14:40 +1000)]
nfs: increase number of permitted callback connections.
By default a sunrpc service is limited to (N+3)*20 connections
where N is the number of threads. This is 80 when N==1.
If this number is exceeded a warning is printed suggesting that
the number of threads be increased. However with services which
run a single thread, this is impossible.
For such services there is a ->sv_maxconn setting that can be
used to forcibly increase the limit, and silence the message.
This is used by lockd.
The nfs client uses a sunrpc service to handle callbacks and
it too is single-threaded, so to avoid the useless messages,
and to allow a reasonable number of concurrent connections,
we need to set ->sv_maxconn. 1024 seems like a good number.
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Anton Vorontsov [Tue, 31 Jul 2012 11:39:30 +0000 (04:39 -0700)]
thermal: Constify 'type' argument for the registration routine
thermal_zone_device_register() does not modify 'type' argument, so it is
safe to declare it as const. Otherwise, if we pass a const string, we are
getting the ugly warning:
CC drivers/power/power_supply_core.o
drivers/power/power_supply_core.c: In function 'psy_register_thermal':
drivers/power/power_supply_core.c:204:6: warning: passing argument 1 of 'thermal_zone_device_register' discards 'const' qualifier from pointer target type [enabled by default]
include/linux/thermal.h:140:29: note: expected 'char *' but argument is of type 'const char *'
Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org>
Acked-by: Jean Delvare <khali@linux-fr.org>
Anton Vorontsov [Tue, 31 Jul 2012 11:59:42 +0000 (04:59 -0700)]
Merge ... upstream to accommodate with thermal changes
This merge is performed to take commit
c56f5c0342dfee11a1 ("Thermal: Make
Thermal trip points writeable") out of Linus' tree and then fixup power
supply class. This is needed since thermal stuff added a new argument:
CC drivers/power/power_supply_core.o
drivers/power/power_supply_core.c: In function ‘psy_register_thermal’:
drivers/power/power_supply_core.c:204:6: warning: passing argument 3 of ‘thermal_zone_device_register’ makes integer from pointer without a cast [enabled by default]
include/linux/thermal.h:154:29: note: expected ‘int’ but argument is of type ‘struct power_supply *’
drivers/power/power_supply_core.c:204:6: error: too few arguments to function ‘thermal_zone_device_register’
include/linux/thermal.h:154:29: note: declared here
make[1]: *** [drivers/power/power_supply_core.o] Error 1
make: *** [drivers/power/] Error 2
Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org>
Alexander Graf [Mon, 30 Jul 2012 12:32:59 +0000 (12:32 +0000)]
powerpc/kvm/book3s_32: Fix MTMSR_EERI macro
Commit
b38c77d82e4 moved the MTMSR_EERI macro from the KVM code to generic
ppc_asm.h code. However, while adding it in the headers for the ppc32 case,
it missed out to remove the former definition in the KVM code.
This patch fixes compilation on server type PPC32 targets with CONFIG_KVM
enabled.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Benjamin Herrenschmidt [Tue, 31 Jul 2012 05:18:31 +0000 (15:18 +1000)]
Merge remote-tracking branch 'kumar/merge' into merge
Kumar says:
"A few patches that missed the initial 3.6 window. These are bug fixes at
this point."
Linus Torvalds [Tue, 31 Jul 2012 05:14:04 +0000 (22:14 -0700)]
Merge tag 'writeback-proportions' of git://git./linux/kernel/git/wfg/linux
Pull writeback updates from Wu Fengguang:
"Use time based periods to age the writeback proportions, which can
adapt equally well to fast/slow devices."
Fix up trivial conflict in comment in fs/sync.c
* tag 'writeback-proportions' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linux:
writeback: Fix some comment errors
block: Convert BDI proportion calculations to flexible proportions
lib: Fix possible deadlock in flexible proportion code
lib: Proportions with flexible period
Tim Gardner [Wed, 25 Jul 2012 18:41:04 +0000 (15:41 -0300)]
[media] tlg2300: Declare MODULE_FIRMWARE usage
Cc: Huang Shijie <shijie8@gmail.com>
Cc: Kang Yong <kangyong@telegent.com>
Cc: Zhang Xiaobing <xbzhang@telegent.com>
Cc: Mauro Carvalho Chehab <mchehab@infradead.org>
Cc: linux-media@vger.kernel.org
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Huang Shijie <shijie8@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Tim Gardner [Wed, 25 Jul 2012 15:54:02 +0000 (12:54 -0300)]
[media] lgs8gxx: Declare MODULE_FIRMWARE usage
Cc: Mauro Carvalho Chehab <mchehab@infradead.org>
Cc: Hans Verkuil <hans.verkuil@cisco.com>
Cc: linux-media@vger.kernel.org
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Tim Gardner [Wed, 25 Jul 2012 12:15:19 +0000 (09:15 -0300)]
[media] xc5000: Add MODULE_FIRMWARE statements
This will make modinfo more useful with regard
to discovering necessary firmware files.
Cc: Mauro Carvalho Chehab <mchehab@infradead.org>
Cc: Michael Krufky <mkrufky@kernellabs.com>
Cc: Eddi De Pieri <eddi@depieri.net>
Cc: linux-media@vger.kernel.org
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Tim Gardner [Tue, 24 Jul 2012 19:52:27 +0000 (16:52 -0300)]
[media] s2255drv: Add MODULE_FIRMWARE statement
Cc: Mauro Carvalho Chehab <mchehab@infradead.org>
Cc: Dean Anderson <linux-dev@sensoray.com>
Cc: Hans Verkuil <hans.verkuil@cisco.com>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Hans de Goede <hdegoede@redhat.com>
Cc: linux-media@vger.kernel.org
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Linus Torvalds [Tue, 31 Jul 2012 02:16:57 +0000 (19:16 -0700)]
Merge tag 'nfs-for-3.6-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client updates from Trond Myklebust:
"Features include:
- More preparatory patches for modularising NFSv2/v3/v4. Split out
the various NFSv2/v3/v4-specific code into separate files
- More preparation for the NFSv4 migration code
- Ensure that OPEN(O_CREATE) observes the pNFS mds threshold
parameters
- pNFS fast failover when the data servers are down
- Various cleanups and debugging patches"
* tag 'nfs-for-3.6-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (67 commits)
nfs: fix fl_type tests in NFSv4 code
NFS: fix pnfs regression with directio writes
NFS: fix pnfs regression with directio reads
sunrpc: clnt: Add missing braces
nfs: fix stub return type warnings
NFS: exit_nfs_v4() shouldn't be an __exit function
SUNRPC: Add a missing spin_unlock to gss_mech_list_pseudoflavors
NFS: Split out NFS v4 client functions
NFS: Split out the NFS v4 filesystem types
NFS: Create a single nfs_clone_super() function
NFS: Split out NFS v4 server creating code
NFS: Initialize the NFS v4 client from init_nfs_v4()
NFS: Move the v4 getroot code to nfs4getroot.c
NFS: Split out NFS v4 file operations
NFS: Initialize v4 sysctls from nfs_init_v4()
NFS: Create an init_nfs_v4() function
NFS: Split out NFS v4 inode operations
NFS: Split out NFS v3 inode operations
NFS: Split out NFS v2 inode operations
NFS: Clean up nfs4_proc_setclientid() and friends
...
Dan Carpenter [Fri, 20 Jul 2012 10:11:57 +0000 (07:11 -0300)]
[media] dib8000: move dereference after check for NULL
My static checker complains that we dereference "state" inside the call
to fft_to_mode() before checking for NULL. The comments say that it is
possible for "state" to be NULL so I have moved the dereference after
the check.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Tue, 31 Jul 2012 02:10:38 +0000 (23:10 -0300)]
[media] Documentation: Update cardlists
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Tony Gentile [Thu, 19 Jul 2012 12:36:33 +0000 (09:36 -0300)]
[media] bttv: add support for Aposonic W-DVR
Forwarded-by: Gerd Hoffmann <kraxel@bytesex.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Linus Torvalds [Tue, 31 Jul 2012 02:06:25 +0000 (19:06 -0700)]
Merge tag 'mfd-for-linus-3.6-1' of git://git./linux/kernel/git/sameo/mfd-2.6
Pull MFD fix from Samuel Ortiz:
"This one fixes an s5m8767 regulator build breakage due to a merge
conflict caused by the MFD s5m API changes."
* tag 'mfd-for-linus-3.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6:
regulator: Fix an s5m8767 build failure
Ezequiel GarcÃa [Wed, 18 Jul 2012 16:41:11 +0000 (13:41 -0300)]
[media] cx25821: Remove bad strcpy to read-only char*
The strcpy was being used to set the name of the board.
This was both wrong and redundant,
since the destination char* was read-only and
the name is set statically at compile time.
The type of the name field is changed to const char*
to prevent future errors.
Reported-by: Radek Masin <radek@masin.eu>
Signed-off-by: Ezequiel Garcia <elezegarcia@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Linus Torvalds [Tue, 31 Jul 2012 02:03:41 +0000 (19:03 -0700)]
Merge branch 'v4l_for_linus' of git://git./linux/kernel/git/mchehab/linux-media
Pull media updates from Mauro Carvalho Chehab:
"This is the first part of the media patches for v3.6.
This patch series contain:
- new DVB frontend: rtl2832
- new video drivers: adv7393
- some unused files got removed
- a selection API cleanup between V4L2 and V4L2 subdev API's
- a major redesign at v4l-ioctl2, in order to clean it up
- several driver fixes and improvements."
* 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (174 commits)
v4l: Export v4l2-common.h in include/linux/Kbuild
media: Revert "[media] Terratec Cinergy S2 USB HD Rev.2"
[media] media: Use pr_info not homegrown pr_reg macro
[media] Terratec Cinergy S2 USB HD Rev.2
[media] v4l: Correct conflicting V4L2 subdev selection API documentation
[media] Feature removal: V4L2 selections API target and flag definitions
[media] v4l: Unify selection flags documentation
[media] v4l: Unify selection flags
[media] v4l: Common documentation for selection targets
[media] v4l: Unify selection targets across V4L2 and V4L2 subdev interfaces
[media] v4l: Remove "_ACTUAL" from subdev selection API target definition names
[media] V4L: Remove "_ACTIVE" from the selection target name definitions
[media] media: dvb-usb: print mac address via native %pM
[media] s5p-tv: Use module_i2c_driver in sii9234_drv.c file
[media] media: gpio-ir-recv: add allowed_protos for platform data
[media] s5p-jpeg: Use module_platform_driver in jpeg-core.c file
[media] saa7134: fix spelling of detach in label
[media] cx88-blackbird: replace ioctl by unlocked_ioctl
[media] cx88: don't use current_norm
[media] cx88: fix a number of v4l2-compliance violations
...
Duan Jiong [Wed, 18 Jul 2012 13:41:47 +0000 (10:41 -0300)]
[media] pms.c: remove duplicated include
Signed-off-by: Duan Jiong <djduanjiong@gmail.com>
Acked-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Duan Jiong [Wed, 18 Jul 2012 13:38:16 +0000 (10:38 -0300)]
[media] smiapp-core.c: remove duplicated include
Signed-off-by: Duan Jiong <djduanjiong@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Alex Elder [Wed, 25 Jul 2012 14:32:41 +0000 (09:32 -0500)]
rbd: create rbd_refresh_helper()
Create a simple helper that handles the common case of calling
__rbd_refresh_header() while holding the ctl_mutex.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Alex Elder [Wed, 25 Jul 2012 14:32:41 +0000 (09:32 -0500)]
rbd: return obj version in __rbd_refresh_header()
Add a new parameter to __rbd_refresh_header() through which the
version of the header object is passed back to the caller. In most
cases this isn't needed. The main motivation is to normalize
(almost) all calls to __rbd_refresh_header() so they are all
wrapped immediately by mutex_lock()/mutex_unlock().
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Alex Elder [Wed, 11 Jul 2012 01:30:10 +0000 (20:30 -0500)]
rbd: fixes in rbd_header_from_disk()
This fixes a few issues in rbd_header_from_disk():
- There is a check intended to catch overflow, but it's wrong in
two ways.
- First, the type we don't want to overflow is size_t, not
unsigned int, and there is now a SIZE_MAX we can use for
use with that type.
- Second, we're allocating the snapshot ids and snapshot
image sizes separately (each has type u64; on disk they
grouped together as a rbd_image_header_ondisk structure).
So we can use the size of u64 in this overflow check.
- If there are no snapshots, then there should be no snapshot
names. Enforce this, and issue a warning if we encounter a
header with no snapshots but a non-zero snap_names_len.
- When saving the snapshot names into the header, be more direct
in defining the offset in the on-disk structure from which
they're being copied by using "snap_count" rather than "i"
in the array index.
- If an error occurs, the "snapc" and "snap_names" fields are
freed at the end of the function. Make those fields be null
pointers after they're freed, to be explicit that they are
no longer valid.
- Finally, move the definition of the local variable "i" to the
innermost scope in which it's needed.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Alex Elder [Tue, 26 Jun 2012 19:57:03 +0000 (12:57 -0700)]
rbd: always pass ops array to rbd_req_sync_op()
All of the callers of rbd_req_sync_op() except one pass a non-null
"ops" pointer. The only one that does not is rbd_req_sync_read(),
which passes CEPH_OSD_OP_READ as its "opcode" and, CEPH_OSD_FLAG_READ
for "flags".
By allocating the ops array in rbd_req_sync_read() and moving the
special case code for the null ops pointer into it, it becomes
clear that much of that code is not even necessary.
In addition, the "opcode" argument to rbd_req_sync_op() is never
actually used, so get rid of that.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Alex Elder [Sat, 14 Jul 2012 01:35:11 +0000 (20:35 -0500)]
rbd: pass null version pointer in add_snap()
rbd_header_add_snap() passes the address of a version variable to
rbd_req_sync_exec(), but it ignores the result. Just pass a null
pointer instead.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Alex Elder [Tue, 26 Jun 2012 19:57:03 +0000 (12:57 -0700)]
rbd: make rbd_create_rw_ops() return a pointer
Either rbd_create_rw_ops() will succeed, or it will fail because a
memory allocation failed. Have it just return a valid pointer or
null rather than stuffing a pointer into a provided address and
returning an errno.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Alex Elder [Wed, 11 Jul 2012 01:30:10 +0000 (20:30 -0500)]
rbd: have __rbd_add_snap_dev() return a pointer
It's not obvious whether the snapshot pointer whose address is
provided to __rbd_add_snap_dev() will be assigned by that function.
Change it to return the snapshot, or a pointer-coded errno in the
event of a failure.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Sage Weil [Tue, 31 Jul 2012 01:19:45 +0000 (18:19 -0700)]
libceph: recheck con state after allocating incoming message
We drop the lock when calling the ->alloc_msg() con op, which means
we need to (a) not clobber con->in_msg without the mutex held, and (b)
we need to verify that we are still in the OPEN state when we retake
it to avoid causing any mayhem. If the state does change, -EAGAIN
will get us back to con_work() and loop.
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>
Sage Weil [Tue, 31 Jul 2012 01:19:30 +0000 (18:19 -0700)]
libceph: change ceph_con_in_msg_alloc convention to be less weird
This function's calling convention is very limiting. In particular,
we can't return any error other than ENOMEM (and only implicitly),
which is a problem (see next patch).
Instead, return an normal 0 or error code, and make the skip a pointer
output parameter. Drop the useless in_hdr argument (we have the con
pointer).
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>
Sage Weil [Tue, 31 Jul 2012 01:17:13 +0000 (18:17 -0700)]
libceph: avoid dropping con mutex before fault
The ceph_fault() function takes the con mutex, so we should avoid
dropping it before calling it. This fixes a potential race with
another thread calling ceph_con_close(), or _open(), or similar (we
don't reverify con->state after retaking the lock).
Add annotation so that lockdep realizes we will drop the mutex before
returning.
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>
Sage Weil [Tue, 31 Jul 2012 01:16:56 +0000 (18:16 -0700)]
libceph: verify state after retaking con lock after dispatch
We drop the con mutex when delivering a message. When we retake the
lock, we need to verify we are still in the OPEN state before
preparing to read the next tag, or else we risk stepping on a
connection that has been closed.
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>
Sage Weil [Tue, 31 Jul 2012 01:16:40 +0000 (18:16 -0700)]
libceph: revoke mon_client messages on session restart
Revoke all mon_client messages when we shut down the old connection.
This is mostly moot since we are re-using the same ceph_connection,
but it is cleaner.
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>
Sage Weil [Tue, 31 Jul 2012 01:16:16 +0000 (18:16 -0700)]
libceph: fix handling of immediate socket connect failure
If the connect() call immediately fails such that sock == NULL, we
still need con_close_socket() to reset our socket state to CLOSED.
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>
Sage Weil [Mon, 30 Jul 2012 23:27:48 +0000 (16:27 -0700)]
ceph: update MAINTAINERS file
* shiny new inktank.com email addresses
* add include/linux/crush directory (previous oversight)
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>
Sage Weil [Mon, 30 Jul 2012 23:26:13 +0000 (16:26 -0700)]
libceph: be less chatty about stray replies
There are many (normal) conditions that can lead to us getting
unexpected replies, include cluster topology changes, osd failures,
and timeouts. There's no need to spam the console about it.
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>
Sage Weil [Sat, 21 Jul 2012 00:30:40 +0000 (17:30 -0700)]
libceph: clear all flags on con_close
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil [Sat, 21 Jul 2012 00:29:55 +0000 (17:29 -0700)]
libceph: clean up con flags
Rename flags with CON_FLAG prefix, move the definitions into the c file,
and (better) document their meaning.
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil [Sat, 21 Jul 2012 00:24:40 +0000 (17:24 -0700)]
libceph: replace connection state bits with states
Use a simple set of 6 enumerated values for the socket states (CON_STATE_*)
and use those instead of the state bits. All of the con->state checks are
now under the protection of the con mutex, so this is safe. It also
simplifies many of the state checks because we can check for anything other
than the expected state instead of various bits for races we can think of.
This appears to hold up well to stress testing both with and without socket
failure injection on the server side.
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil [Sat, 21 Jul 2012 00:19:43 +0000 (17:19 -0700)]
libceph: drop unnecessary CLOSED check in socket state change callback
If we are CLOSED, the socket is closed and we won't get these.
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil [Fri, 20 Jul 2012 23:45:49 +0000 (16:45 -0700)]
libceph: close socket directly from ceph_con_close()
It is simpler to do this immediately, since we already hold the con mutex.
It also avoids the need to deal with a not-quite-CLOSED socket in con_work.
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil [Fri, 20 Jul 2012 22:40:04 +0000 (15:40 -0700)]
libceph: drop gratuitous socket close calls in con_work
If the state is CLOSED or OPENING, we shouldn't have a socket.
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil [Fri, 20 Jul 2012 22:34:04 +0000 (15:34 -0700)]
libceph: move ceph_con_send() closed check under the con mutex
Take the con mutex before checking whether the connection is closed to
avoid racing with someone else closing it.
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil [Fri, 20 Jul 2012 22:33:04 +0000 (15:33 -0700)]
libceph: move msgr clear_standby under con mutex protection
Avoid dropping and retaking con->mutex in the ceph_con_send() case by
leaving locking up to the caller.
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil [Fri, 20 Jul 2012 22:22:53 +0000 (15:22 -0700)]
libceph: fix fault locking; close socket on lossy fault
If we fault on a lossy connection, we should still close the socket
immediately, and do so under the con mutex.
We should also take the con mutex before printing out the state bits in
the debug output.
Signed-off-by: Sage Weil <sage@inktank.com>
Alex Elder [Wed, 25 Jul 2012 14:32:41 +0000 (09:32 -0500)]
rbd: drop "object_name" from rbd_req_sync_unwatch()
rbd_req_sync_unwatch() only ever uses rbd_dev->header_name as the
value of its "object_name" parameter, and that value is available
within the function already. So get rid of the parameter.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Alex Elder [Wed, 25 Jul 2012 14:32:40 +0000 (09:32 -0500)]
rbd: drop "object_name" from rbd_req_sync_notify_ack()
rbd_req_sync_notify_ack() only ever uses rbd_dev->header_name as the
value of its "object_name" parameter, and that value is available
within the function already. So get rid of the parameter.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Alex Elder [Wed, 25 Jul 2012 14:32:40 +0000 (09:32 -0500)]
rbd: drop "object_name" from rbd_req_sync_notify()
rbd_req_sync_notify() only ever uses rbd_dev->header_name as the
value of its "object_name" parameter, and that value is available
within the function already. So get rid of the parameter.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Alex Elder [Wed, 25 Jul 2012 14:32:40 +0000 (09:32 -0500)]
rbd: drop "object_name" from rbd_req_sync_watch()
rbd_req_sync_watch() is only called in one place, and in that place
it passes rbd_dev->header_name as the value of the "object_name"
parameter. This value is available within the function already.
Having the extra parameter leaves the impression the object name
could take on different values, but it does not.
So get rid of the parameter. We can always add it back again if
we find we want to watch some other object in the future.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Alex Elder [Thu, 19 Jul 2012 14:09:27 +0000 (09:09 -0500)]
rbd: drop rbd_dev parameter in snap functions
Both rbd_register_snap_dev() and __rbd_remove_snap_dev() have
rbd_dev parameters that are unused. Remove them.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Alex Elder [Thu, 19 Jul 2012 14:09:27 +0000 (09:09 -0500)]
rbd: drop rbd_header_from_disk() gfp_flags parameter
The function rbd_header_from_disk() is only called in one spot, and
it passes GFP_KERNEL as its value for the gfp_flags parameter.
Just drop that parameter and substitute GFP_KERNEL everywhere within
that function it had been used. (If we find we need the parameter
again in the future it's easy enough to add back again.)
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Alex Elder [Thu, 19 Jul 2012 14:09:27 +0000 (09:09 -0500)]
rbd: snapc is unused in rbd_req_sync_read()
The "snapc" parameter to in rbd_req_sync_read() is not used, so
get rid of it.
Reported-by: Josh Durgin <josh.durgin@inktank.com>
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Alex Elder [Tue, 3 Jul 2012 21:01:19 +0000 (16:01 -0500)]
rbd: rename rbd_device->id
The "id" field of an rbd device structure represents the unique
client-local device id mapped to the underlying rbd image. Each rbd
image will have another id--the image id--and each snapshot has its
own id as well. The simple name "id" no longer conveys the
information one might like to have.
Rename the device "id" field in struct rbd_dev to be "dev_id" to
make it a little more obvious what we're dealing with without having
to think more about context.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Alex Elder [Wed, 25 Jul 2012 14:32:40 +0000 (09:32 -0500)]
rbd: encapsulate header validity test
If an rbd image header is read and it doesn't begin with the
expected magic information, a warning is displayed. This is
a fairly simple test, but it could be extended at some point.
Fix the comparison so it actually looks at the "text" field
rather than the front of the structure.
In any case, encapsulate the validity test in its own function.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Alex Elder [Sat, 14 Jul 2012 01:35:11 +0000 (20:35 -0500)]
ceph: define snap counts as u32 everywhere
There are two structures in which a count of snapshots are
maintained:
struct ceph_snap_context {
...
u32 num_snaps;
...
}
and
struct ceph_snap_realm {
...
u32 num_prior_parent_snaps; /* had prior to parent_since */
...
u32 num_snaps;
...
}
These fields never take on negative values (e.g., to hold special
meaning), and so are really inherently unsigned. Furthermore they
take their value from over-the-wire or on-disk formatted 32-bit
values.
So change their definition to have type u32, and change some spots
elsewhere in the code to account for this change.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Alex Elder [Sat, 14 Jul 2012 01:35:11 +0000 (20:35 -0500)]
rbd: clean up a few dout() calls
There was a dout() call in rbd_do_request() that was reporting
the reporting the offset as the length and vice versa. While
fixing that I did a quick scan of other dout() calls and fixed
a couple of other minor things.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Alex Elder [Thu, 19 Jul 2012 14:09:27 +0000 (09:09 -0500)]
rbd: simplify __rbd_remove_all_snaps()
This just replaces a while loop with list_for_each_entry_safe()
in __rbd_remove_all_snaps().
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Alex Elder [Thu, 19 Jul 2012 14:09:27 +0000 (09:09 -0500)]
rbd: drop extra header_rwsem init
In commit
c666601a there was inadvertently added an extra
initialization of rbd_dev->header_rwsem. This gets rid of the
duplicate.
Reported-by: Guangliang Zhao <gzhao@suse.com>
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Alex Elder [Thu, 19 Jul 2012 13:49:18 +0000 (08:49 -0500)]
rbd: kill rbd_image_header->snap_seq
The snap_seq field in an rbd_image_header structure held the value
from the rbd image header when it was last refreshed. We now
maintain this value in the snapc->seq field. So get rid of the
other one.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Alex Elder [Thu, 19 Jul 2012 13:49:18 +0000 (08:49 -0500)]
rbd: set snapc->seq only when refreshing header
In rbd_header_add_snap() there is code to set snapc->seq to the
just-added snapshot id. This is the only remnant left of the
use of that field for recording which snapshot an rbd_dev was
associated with. That functionality is no longer supported,
so get rid of that final bit of code.
Doing so means we never actually set snapc->seq any more. On the
server, the snapshot context's sequence value represents the highest
snapshot id ever issued for a particular rbd image. So we'll make
it have that meaning here as well. To do so, set this value
whenever the rbd header is (re-)read. That way it will always be
consistent with the rest of the snapshot context we maintain.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Alex Elder [Thu, 19 Jul 2012 13:49:18 +0000 (08:49 -0500)]
rbd: preserve snapc->seq in rbd_header_set_snap()
In rbd_header_set_snap(), there is logic to make the snap context's
seq field get set to a particular snapshot id, or 0 if there is no
snapshot for the rbd image.
This seems to be an artifact of how the current snapshot id for an
rbd_dev was recorded before the rbd_dev->snap_id field began to be
used for that purpose.
There's no need to update the value of snapc->seq here any more, so
stop doing it. Tidy up a few local variables in that function
while we're at it.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Alex Elder [Thu, 19 Jul 2012 13:49:18 +0000 (08:49 -0500)]
rbd: don't use snapc->seq that way
In what appears to be an artifact of a different way of encoding
whether an rbd image maps a snapshot, __rbd_refresh_header() has
code that arranges to update the seq value in an rbd image's
snapshot context to point to the first entry in its snapshot
array if that's where it was pointing initially.
We now use rbd_dev->snap_id to record the snapshot id--using the
special value CEPH_NOSNAP to indicate the rbd_dev is not mapping a
snapshot at all.
There is therefore no need to check for this case, nor to update the
seq value, in __rbd_refresh_header(). Just preserve the seq value
that rbd_read_header() provides (which, at the moment, is nothing).
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Josh Durgin [Tue, 6 Dec 2011 02:10:44 +0000 (18:10 -0800)]
rbd: send header version when notifying
Previously the original header version was sent. Now, we update it
when the header changes.
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Reviewed-by: Alex Elder <elder@inktank.com>
Josh Durgin [Mon, 5 Dec 2011 22:03:05 +0000 (14:03 -0800)]
rbd: use reference counting for the snap context
This prevents a race between requests with a given snap context and
header updates that free it. The osd client was already expecting the
snap context to be reference counted, since it get()s it in
ceph_osdc_build_request and put()s it when the request completes.
Also remove the second down_read()/up_read() on header_rwsem in
rbd_do_request, which wasn't actually preventing this race or
protecting any other data.
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Reviewed-by: Alex Elder <elder@inktank.com>
Josh Durgin [Mon, 5 Dec 2011 18:41:28 +0000 (10:41 -0800)]
rbd: set image size when header is updated
The image may have been resized.
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Reviewed-by: Alex Elder <elder@inktank.com>
Josh Durgin [Mon, 5 Dec 2011 18:35:04 +0000 (10:35 -0800)]
rbd: expose the correct size of the device in sysfs
If an image was mapped to a snapshot, the size of the head version
would be shown. Protect capacity with header_rwsem, since it may
change.
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Reviewed-by: Alex Elder <elder@inktank.com>
Josh Durgin [Tue, 22 Nov 2011 01:13:54 +0000 (17:13 -0800)]
rbd: only reset capacity when pointing to head
Snapshots cannot be resized, and the new capacity of head should not
be reflected by the snapshot.
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>
Josh Durgin [Tue, 22 Nov 2011 02:14:25 +0000 (18:14 -0800)]
rbd: return errors for mapped but deleted snapshot
When a snapshot is deleted, the OSD will return ENOENT when reading
from it. This is normally interpreted as a hole by rbd, which will
return zeroes. To minimize the time in which this can happen, stop
requests early when we are notified that our snapshot no longer
exists.
[elder@inktank.com: updated __rbd_init_snaps_header() logic]
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>