commit 20a807a7b294ee58ab8a8b38bff9f04b6fc127c4
Author: Greg Kroah-Hartman
Date:   Tue Sep 15 10:46:05 2009 -0700

    Linux 2.6.30.7

commit 0ce76a2beb6ee0b6eb3182f012c8dbce57f4b490
Author: Mikulas Patocka
Date:   Fri Sep 4 20:40:43 2009 +0100

    dm snapshot: fix on disk chunk size validation

    commit ae0b7448e91353ea5f821601a055aca6b58042cd upstream.

    Fix some problems seen in the chunk size processing when activating
    a pre-existing snapshot.

    For a new snapshot, the chunk size can either be supplied by the
    creator or a default value can be used. For an existing snapshot,
    the chunk size in the snapshot header on disk should always be used.

    If someone attempts to load an existing snapshot and has the
    'default chunk size' option set, the kernel uses its default value
    even when it is incorrect for the snapshot being loaded. This patch
    ensures the correct on-disk value is always used.

    Secondly, when the code does use the chunk size stored on the disk
    it is prudent to revalidate it, so the code can exit cleanly if it
    got corrupted as happened in
    https://bugzilla.redhat.com/show_bug.cgi?id=461506 .

    Signed-off-by: Mikulas Patocka
    Signed-off-by: Alasdair G Kergon
    Signed-off-by: Greg Kroah-Hartman

commit da8341635673d7cf46312c39e4c69912c2629a30
Author: Mikulas Patocka
Date:   Fri Sep 4 20:40:41 2009 +0100

    dm exception store: split set_chunk_size

    commit 2defcc3fb4661e7351cb2ac48d843efc4c64db13 upstream.

    Break the function set_chunk_size into two functions in preparation
    for the fix in the following patch.

    Signed-off-by: Mikulas Patocka
    Signed-off-by: Alasdair G Kergon
    Signed-off-by: Greg Kroah-Hartman

commit 30d026addc77ab55120db88f4e17acae217b9115
Author: Mikulas Patocka
Date:   Fri Sep 4 20:40:39 2009 +0100

    dm snapshot: fix header corruption race on invalidation

    commit 61578dcd3fafe6babd72e8db32110cc0b630a432 upstream.

    If a persistent snapshot fills up, a race can corrupt the on-disk
    header which causes a crash on any future attempt to activate the
    snapshot (typically while booting).
    This patch fixes the race.

    When the snapshot overflows, __invalidate_snapshot is called, which
    calls snapshot store method drop_snapshot. It goes to
    persistent_drop_snapshot that calls write_header. write_header
    constructs the new header in the "area" location.

    Concurrently, an existing kcopyd job may finish, call copy_callback
    and commit_exception method, that goes to
    persistent_commit_exception. persistent_commit_exception doesn't do
    locking, relying on the fact that callbacks are single-threaded, but
    it can race with snapshot invalidation and overwrite the header that
    is just being written while the snapshot is being invalidated.

    The result of this race is a corrupted header being written that can
    lead to a crash on further reactivation (if chunk_size is zero in
    the corrupted header).

    The fix is to use separate memory areas for each.

    See the bug: https://bugzilla.redhat.com/show_bug.cgi?id=461506

    Signed-off-by: Mikulas Patocka
    Signed-off-by: Alasdair G Kergon
    Signed-off-by: Greg Kroah-Hartman

commit 008f14e7842214765ee1f535f003ca1aa72f7ee3
Author: Mikulas Patocka
Date:   Fri Sep 4 20:40:37 2009 +0100

    dm snapshot: refactor zero_disk_area to use chunk_io

    commit 02d2fd31defce6ff77146ad0fef4f19006055d86 upstream.

    Refactor chunk_io to prepare for the fix in the following patch.

    Pass an area pointer to chunk_io and simplify zero_disk_area to use
    chunk_io. No functional change.

    Signed-off-by: Mikulas Patocka
    Signed-off-by: Alasdair G Kergon
    Signed-off-by: Greg Kroah-Hartman

commit 8c57b59008353a5e218c12bccb8bd1c2cf7ea5d9
Author: Jonathan Brassow
Date:   Fri Sep 4 20:40:32 2009 +0100

    dm raid1: do not allow log_failure variable to unset after being set

    commit d2b698644c97cb033261536a4f2010924a00eac9 upstream.

    This patch fixes a bug which was triggering a case where the primary
    leg could not be changed on failure even when the mirror was
    in-sync.

    The case involves the failure of the primary device along with the
    transient failure of the log device.
    The problem is that bios can be put on the 'failures' list (due to
    log failure) before 'fail_mirror' is called due to the primary
    device failure. Normally, this is fine, but if the log device
    failure is transient, a subsequent iteration of the work thread,
    'do_mirror', will reset 'log_failure'. The 'do_failures' function
    then resets the 'in_sync' variable when processing bios on the
    failures list. The 'in_sync' variable is what is used to determine
    if the primary device can be switched in the event of a failure.
    Since this has been reset, the primary device is incorrectly assumed
    to be not switchable.

    The case has been seen in the cluster mirror context, where one
    machine realizes the log device is dead before the other machines.
    As the responsibilities of the server migrate from one node to
    another (because the mirror is being reconfigured due to the
    failure), the new server may think for a moment that the log device
    is fine - thus resetting the 'log_failure' variable.

    In any case, it is inappropriate for us to reset the 'log_failure'
    variable. The above bug simply illustrates that it can actually
    hurt us.

    Signed-off-by: Jonathan Brassow
    Signed-off-by: Alasdair G Kergon
    Signed-off-by: Greg Kroah-Hartman

commit 8f78d18b849ef860dccfa636b209a48e5a34a410
Author: Clemens Ladisch
Date:   Tue Sep 1 08:23:58 2009 +0200

    sound: oxygen: fix MCLK rate for 192 kHz playback

    commit b91ab72b830e1494c2c7f8de05ccb2ab2c9cfb26 upstream.

    Do not forget to program the MCLK ratio for the I2S output.
    Otherwise, the master clock frequency can be too high for the DACs
    at sample frequencies above 96 kHz.

    Signed-off-by: Clemens Ladisch
    Signed-off-by: Takashi Iwai
    Signed-off-by: Greg Kroah-Hartman

commit ec20282569b8c528265974c8aef921ce24e14e25
Author: Clemens Ladisch
Date:   Wed Sep 2 18:25:39 2009 +0200

    sound: oxygen: handle cards with missing EEPROM

    commit 92653453c3015c083b9fe0ad48261c6b2267d482 upstream.
    The card model detection code introduced in 2.6.30 that tries to
    work around partially broken EEPROM contents by reading the EEPROM
    directly does not handle cards where the EEPROM has been omitted.
    In this case, we have to use the default ID to allow the driver to
    load.

    Signed-off-by: Clemens Ladisch
    Reported-and-tested-by: Ozan Çağlayan
    Signed-off-by: Takashi Iwai
    Signed-off-by: Greg Kroah-Hartman

commit 65a11d230e61e2d760ee114781235d25725fd9d9
Author: James Bottomley
Date:   Tue May 26 20:35:48 2009 +0000

    SCSI: sd: fix bug in SCSI async probing

    commit 601e7638254c118fca135af9b1a9f35061420f62 upstream.

    The async split up of probing in sd.c created a potential failure
    case where something goes wrong with device_add(), but which we
    don't recover properly. Since, in general, asynchronous error
    handling is hard, move the device_add() into the asynchronous path
    (it should be fast) and make sure all the deferred processing cannot
    fail.

    Signed-off-by: James Bottomley
    Signed-off-by: Greg Kroah-Hartman

commit 354c51ed795e4f0a8301a3815b963845d6d6131c
Author: Chris Wright
Date:   Fri Aug 28 13:00:06 2009 -0700

    PCI SR-IOV: correct broken resource alignment calculations

    commit 6faf17f6f1ffc586d16efc2f9fa2083a7785ee74 upstream.

    An SR-IOV capable device includes an SR-IOV PCIe capability which
    describes the Virtual Function (VF) BAR requirements. A typical
    SR-IOV device can support multiple VFs whose BARs must be in a
    contiguous region, effectively an array of VF BARs. The BAR reports
    the size requirement for a single VF. We calculate the full range
    needed by simply multiplying the VF BAR size with the number of
    possible VFs and create a resource spanning the full range.

    This all seems sane enough except it artificially inflates the
    alignment requirement for the VF BAR. The VF BAR need only be
    aligned to the size of a single BAR not the contiguous range of VF
    BARs. This can cause us to fail to allocate resources for the BAR
    despite the fact that we actually have enough space.
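
The alignment inflation described above can be illustrated with a small, self-contained userspace sketch. This is not the kernel code: the helper names, the VF count, the BAR size and the window addresses are all made up for illustration. It only shows why aligning the whole VF BAR array to its total size can fail in a window where aligning to a single VF BAR size succeeds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Round addr up to the next multiple of align (align must be a power of two). */
static uint64_t align_up(uint64_t addr, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}

/*
 * Hypothetical helper: can a region of `size` bytes, aligned to `align`,
 * be placed inside the window [start, end)?
 */
static bool fits_in_window(uint64_t start, uint64_t end,
                           uint64_t size, uint64_t align)
{
    uint64_t base = align_up(start, align);

    return base <= end && end - base >= size;
}

/* Illustrative numbers only: 8 VFs, 1 MiB per VF BAR. */
enum { NUM_VFS = 8 };
static const uint64_t VF_BAR_SIZE = 1ull << 20;

/* A 9 MiB window that starts 1 MiB aligned but not 8 MiB aligned. */
static const uint64_t WIN_START = 0x90100000ull;
static const uint64_t WIN_END   = 0x90100000ull + 9 * (1ull << 20);

/* Alignment taken from the size of the whole VF BAR array (the inflated case). */
static bool fits_with_inflated_alignment(void)
{
    uint64_t total = VF_BAR_SIZE * NUM_VFS;

    return fits_in_window(WIN_START, WIN_END, total, total);
}

/* Alignment taken from a single VF BAR, as the commit text says is sufficient. */
static bool fits_with_single_bar_alignment(void)
{
    uint64_t total = VF_BAR_SIZE * NUM_VFS;

    return fits_in_window(WIN_START, WIN_END, total, VF_BAR_SIZE);
}
```

With these example numbers the 8 MiB array does not fit when it must be 8 MiB aligned, because rounding the start up to 0x90800000 leaves only 2 MiB, but it fits when only 1 MiB alignment is required.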
    This patch adds a thin PCI specific layer over the generic
    resource_alignment() function which is aware of the special nature
    of VF BARs and does sorting and allocation based on the smaller
    alignment requirement.

    I recognize that while resource_alignment is generic, it's basically
    a PCI helper. An alternative to this patch is to add PCI VF BAR
    specific information to struct resource. I opted for the extra
    layer rather than adding such PCI specific information to struct
    resource. This does have the slight downside that we don't cache
    the BAR size and re-read for each alignment query (happens a small
    handful of times during boot for each VF BAR).

    Signed-off-by: Chris Wright
    Cc: Ivan Kokshaysky
    Cc: Linus Torvalds
    Cc: Matthew Wilcox
    Cc: Yu Zhao
    Signed-off-by: Jesse Barnes
    Signed-off-by: Greg Kroah-Hartman

commit 820fdfd8431c2b06723ce2c42806f3cef0ddb2e5
Author: Ryusuke Konishi
Date:   Sun Aug 30 04:21:41 2009 +0900

    nilfs2: fix preempt count underflow in nilfs_btnode_prepare_change_key

    commit b1f1b8ce0a1d71cbc72f7540134d52b79bd8f5ac upstream.

    This will fix the following preempt count underflow reported from
    users with the title "[NILFS users] segctord problem" (Message-ID:
    <949415.6494.qm@web58808.mail.re1.yahoo.com> and Message-ID: ):

     WARNING: at kernel/sched.c:4890 sub_preempt_count+0x95/0xa0()
     Hardware name: HP Compaq 6530b (KR980UT#ABC)
     Modules linked in: bridge stp llc bnep rfcomm l2cap xfs exportfs
       nilfs2 cowloop loop vboxnetadp vboxnetflt vboxdrv btusb bluetooth
       uvcvideo videodev v4l1_compat v4l2_compat_ioctl32 arc4
       snd_hda_codec_analog ecb iwlagn iwlcore rfkill lib80211 mac80211
       snd_hda_intel snd_hda_codec ehci_hcd uhci_hcd usbcore snd_hwdep
       snd_pcm tg3 cfg80211 psmouse snd_timer joydev libphy ohci1394
       snd_page_alloc hp_accel lis3lv02d ieee1394 led_class i915 drm
       i2c_algo_bit video backlight output i2c_core dm_crypt dm_mod
     Pid: 4197, comm: segctord Not tainted 2.6.30-gentoo-r4-64 #7
     Call Trace:
      [] ? sub_preempt_count+0x95/0xa0
      [] warn_slowpath_common+0x78/0xd0
      [] warn_slowpath_null+0xf/0x20
      [] sub_preempt_count+0x95/0xa0
      [] nilfs_btnode_prepare_change_key+0x11b/0x190 [nilfs2]
      [] nilfs_btree_assign_p+0x19d/0x1e0 [nilfs2]
      [] nilfs_btree_assign+0xbd/0x130 [nilfs2]
      [] nilfs_bmap_assign+0x47/0x70 [nilfs2]
      [] nilfs_segctor_do_construct+0x956/0x20f0 [nilfs2]
      [] ? _spin_unlock_irqrestore+0x12/0x40
      [] ? __up_write+0xe0/0x150
      [] ? up_write+0x9/0x10
      [] ? nilfs_bmap_test_and_clear_dirty+0x43/0x60 [nilfs2]
      [] ? nilfs_mdt_fetch_dirty+0x27/0x60 [nilfs2]
      [] nilfs_segctor_construct+0x8c/0xd0 [nilfs2]
      [] nilfs_segctor_thread+0x15c/0x3a0 [nilfs2]
      [] ? nilfs_construction_timeout+0x0/0x10 [nilfs2]
      [] ? add_timer+0x13/0x20
      [] ? __wake_up_common+0x5a/0x90
      [] ? autoremove_wake_function+0x0/0x40
      [] ? nilfs_segctor_thread+0x0/0x3a0 [nilfs2]
      [] ? nilfs_segctor_thread+0x0/0x3a0 [nilfs2]
      [] kthread+0x56/0x90
      [] child_rip+0xa/0x20
      [] ? kthread+0x0/0x90
      [] ? child_rip+0x0/0x20

    This problem was caused due to a missing radix_tree_preload() call
    in the retry path of nilfs_btnode_prepare_change_key() function.

    Reported-by: Eric A
    Reported-by: Jerome Poulin
    Signed-off-by: Ryusuke Konishi
    Tested-by: Jerome Poulin
    Signed-off-by: Greg Kroah-Hartman

commit e8f23f08a0dbfbaf5e95fa77a12c8cb2b1bd97e7
Author: Eric Dumazet
Date:   Thu Sep 3 22:38:59 2009 +0300

    slub: Fix kmem_cache_destroy() with SLAB_DESTROY_BY_RCU

    commit d76b1590e06a63a3d8697168cd0aabf1c4b3cb3a upstream.

    kmem_cache_destroy() should call rcu_barrier() *after*
    kmem_cache_close() and *before* sysfs_slab_remove() or risk
    rcu_free_slab() being called after kmem_cache is deleted (kfreed).

    rmmod nf_conntrack can crash the machine because it has to
    kmem_cache_destroy() a SLAB_DESTROY_BY_RCU enabled cache.

    Reported-by: Zdenek Kabelac
    Signed-off-by: Eric Dumazet
    Acked-by: Paul E. McKenney
    Signed-off-by: Pekka Enberg
    Signed-off-by: Greg Kroah-Hartman

commit 6c135b4e133017a8aeef1534107ee4b1f11d3265
Author: Massimo Cirillo
Date:   Thu Aug 27 10:44:09 2009 +0200

    JFFS2: add missing verify buffer allocation/deallocation

    commit bc8cec0dff072f1a45ce7f6b2c5234bb3411ac51 upstream.

    The function jffs2_nor_wbuf_flash_setup() doesn't allocate the
    verify buffer if CONFIG_JFFS2_FS_WBUF_VERIFY is defined, so causing
    a kernel panic when that macro is enabled and the verify function is
    called. Similarly the jffs2_nor_wbuf_flash_cleanup() must free the
    buffer if CONFIG_JFFS2_FS_WBUF_VERIFY is enabled.

    The following patch fixes the problem.
    The following patch applies to 2.6.30 kernel.

    Signed-off-by: Massimo Cirillo
    Signed-off-by: Artem Bityutskiy
    Signed-off-by: David Woodhouse
    Signed-off-by: Greg Kroah-Hartman

commit 22a3f9b74ab211deb28e35549284d9f459232c5f
Author: Mathieu Desnoyers
Date:   Tue Aug 18 20:16:55 2009 -0700

    sparc: sys32.S incorrect compat-layer splice() system call

    [ Upstream commit e2c6cbd9ace61039d3de39e717195e38f1492aee ]

    I think arch/sparc/kernel/sys32.S has an incorrect splice
    definition:

        SIGN2(sys32_splice, sys_splice, %o0, %o1)

    The splice() prototype looks like :

        long splice(int fd_in, loff_t *off_in, int fd_out,
                    loff_t *off_out, size_t len, unsigned int flags);

    So I think we should have :

        SIGN2(sys32_splice, sys_splice, %o0, %o2)

    Signed-off-by: Mathieu Desnoyers
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

commit eb2bbea7da00366d35e1a57084e7ca864292494d
Author: David S. Miller
Date:   Fri Sep 4 03:38:54 2009 -0700

    sparc64: Fix bootup with mcount in some configs.

    [ Upstream commit bd4352cadfacb9084c97c853b025fac010266c26 ]

    Functions invoked early when booting up a cpu can't use tracing
    because mcount requires a valid 'current_thread_info()' and TLB
    mappings to be setup.

    The code path of sun4v_register_mondo_queues --> register_one_mondo
    is one such case.
    sun4v_register_mondo_queues already has the necessary 'notrace'
    annotation, but register_one_mondo does not.

    Normally register_one_mondo is inlined so the bug doesn't trigger,
    but with some config/compiler combinations, it won't be so we must
    properly mark it notrace.

    While we're here, add 'notrace' annotations to prom_printf and
    prom_halt so that early error handling won't have the same problem.

    Reported-by: Alexander Beregalov
    Reported-by: Leif Sawyer
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

commit db945b995c0cfdc3276ef886f77cc90e5ceb9ff0
Author: David S. Miller
Date:   Tue Aug 25 16:47:46 2009 -0700

    sparc64: Validate linear D-TLB misses.

    [ Upstream commit d8ed1d43e17898761c7221014a15a4c7501d2ff3 ]

    When page alloc debugging is not enabled, we essentially accept any
    virtual address for linear kernel TLB misses. But with kgdb, kernel
    address probing, and other facilities we can try to access arbitrary
    crap.

    So, make sure the address we miss on will translate to physical
    memory that actually exists.

    In order to make this work we have to embed the valid address bitmap
    into the kernel image. And in order to make that less expensive we
    make an adjustment, in that the max physical memory address is
    decreased to "1 << 41", even on the chips that support a 42-bit
    physical address space. We can do this because bit 41 indicates
    "I/O space" and thus covers non-memory ranges.

    The result of this is that:

    1) kpte_linear_bitmap shrinks from 2K to 1K in size

    2) we need 64K more for the valid address bitmap

    We can't let the valid address bitmap be dynamically allocated once
    we start using it to validate TLB misses, otherwise we have crazy
    issues to deal with wrt. recursive TLB misses and such.

    If we're in a TLB miss it could be the deepest trap level that's
    legal inside of the cpu. So if we TLB miss referencing the bitmap,
    the cpu will be out of trap levels and enter RED state.
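
The sizes quoted above can be checked with a small userspace sketch. It is not the sparc64 trap-table code, and the names are hypothetical. It assumes one bitmap bit per 4MB of physical memory, which is an assumption chosen because it reproduces the 64K figure for a "1 << 41" address limit, and it rejects any address with bits set above bit 40 before indexing the bitmap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_PHYS_BITS 41   /* from the commit text: max address is "1 << 41" */
#define CHUNK_SHIFT   22   /* assumption: one bitmap bit per 4MB chunk */
#define BITMAP_BITS   (1ull << (MAX_PHYS_BITS - CHUNK_SHIFT))
#define BITMAP_BYTES  (BITMAP_BITS / 8)   /* 2^19 bits = 65536 bytes = 64K */

/* Statically sized, mirroring the idea of embedding the bitmap in the image. */
static uint8_t valid_addr_bitmap[BITMAP_BYTES];

/* Hypothetical helper: record that the 4MB chunk containing paddr exists. */
static void mark_valid(uint64_t paddr)
{
    uint64_t bit;

    assert((paddr >> MAX_PHYS_BITS) == 0);
    bit = paddr >> CHUNK_SHIFT;
    valid_addr_bitmap[bit / 8] |= (uint8_t)(1u << (bit % 8));
}

/* Hypothetical helper: does paddr fall in a chunk marked as real memory? */
static bool paddr_is_valid(uint64_t paddr)
{
    uint64_t bit;

    /* Any bit above bit 40 set: reject before touching the bitmap. */
    if (paddr >> MAX_PHYS_BITS)
        return false;

    bit = paddr >> CHUNK_SHIFT;
    return (valid_addr_bitmap[bit / 8] >> (bit % 8)) & 1;
}
```

Because the array has a fixed size known at build time, a lookup never needs an allocation, which matches the stated reason for not allocating the bitmap dynamically.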
    To guard against out-of-range accesses to the bitmap, we have to
    check to make sure no bits in the physical address above bit 40 are
    set. We could export and use last_valid_pfn for this check, but
    that's just an unnecessary extra memory reference.

    On the plus side of all this, since we load all of these
    translations into the special 4MB mapping TSB, and we check the TSB
    first for TLB misses, there should be absolutely no real cost for
    these new checks in the TLB miss path.

    Reported-by: heyongli@gmail.com
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

commit 5522658d2894bd792dc0afa81a8212656889f64b
Author: David S. Miller
Date:   Thu Sep 3 02:35:20 2009 -0700

    sparc64: Kill spurious NMI watchdog triggers by increasing limit to 30 seconds.

    [ Upstream commit e6617c6ec28a17cf2f90262b835ec05b9b861400 ]

    This is a compromise and a temporary workaround for bootup NMI
    watchdog triggers some people see with qla2xxx devices present.

    This happens when, for example:

    CPU 0 is in the driver init and looping submitting mailbox commands
    to load the firmware, then waiting for completion.

    CPU 1 is receiving the device interrupts. CPU 1 is where the NMI
    watchdog triggers.

    CPU 0 is submitting mailbox commands fast enough that by the time
    CPU 1 returns from the device interrupt handler, a new one is
    pending. This sequence runs for more than 5 seconds.

    The problematic case is CPU 1's timer interrupt running when the
    barrage of device interrupts begins. Then we have:

        timer interrupt
        return for softirq checking
        pending, thus enable interrupts
        qla2xxx interrupt
        return
        qla2xxx interrupt
        return
        ... 5+ seconds pass
        final qla2xxx interrupt for fw load
        return
        run timer softirq
        return

    At some point in the multi-second qla2xxx interrupt storm we trigger
    the NMI watchdog on CPU 1 from the NMI interrupt handler.

    The timer softirq, once we get back to running it, is smart enough
    to run the timer work enough times to make up for the missed timer
    interrupts.
    However, the NMI watchdogs (both x86 and sparc) use the timer
    interrupt count to notice the cpu is wedged. But in the above
    scenario we'll receive only one such timer interrupt even if we last
    all the way back to running the timer softirq.

    The default watchdog trigger point is only 5 seconds, which is
    pretty low (the softwatchdog triggers at 60 seconds). So increase
    it to 30 seconds for now.

    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

commit 461eca1d34f0fa01887019a4d3ba39d145b8c799
Author: Eric Dumazet
Date:   Tue Jul 28 02:36:15 2009 +0000

    net: net_assign_generic() fix

    [ Upstream commit 144586301f6af5ae5943a002f030d8c626fa4fdd ]

    memcpy() should take into account size of pointers, not only number
    of pointers to copy.

    Signed-off-by: Eric Dumazet
    Acked-by: Pavel Emelyanov
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

commit bcb9e52411cb2069fce6720cdc314651c1f200c1
Author: Eric Dumazet
Date:   Tue Jul 28 03:47:39 2009 +0000

    pppol2tp: calls unregister_pernet_gen_device() at unload time

    [ Upstream commit 446e72f30eca76d6f9a1a54adf84d2c6ba2831f8 ]

    Failure to call unregister_pernet_gen_device() can exhaust memory if
    module is loaded/unloaded many times.

    Signed-off-by: Eric Dumazet
    Acked-by: Cyrill Gorcunov
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

commit 0f316c424183f5a0a8f1d538c533966483dbbb8a
Author: Ben McKeegan
Date:   Tue Jul 28 07:43:57 2009 +0000

    ppp: fix lost fragments in ppp_mp_explode() (resubmit)

    [ Upstream commit a53a8b56827cc429c6d9f861ad558beeb5f6103f ]

    This patch fixes the corner cases where the sum of MTU of the free
    channels (adjusted for fragmentation overheads) is less than the MTU
    of PPP link. There are at least 3 situations where this case might
    arise:

    - some of the channels are busy

    - the multilink session is running in a degraded state (i.e. with
      less than its full complement of active channels)

    - by design, where multilink protocol is being used to artificially
      increase the effective link MTU of a single link.

    Without this patch, at most 1 fragment is ever sent per free channel
    for a given PPP frame and any remaining part of the PPP frame that
    does not fit into those fragments is silently discarded.

    This patch restores the original behaviour which was broken by
    commit 9c705260feea6ae329bc6b6d5f6d2ef0227eda0a 'ppp:ppp_mp_explode()
    redesign'. Once all 'free' channels have been given a fragment, an
    additional fragment is queued to each available channel in turn, as
    many times as necessary, until the entire PPP frame has been
    consumed.

    Signed-off-by: Ben McKeegan
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

commit f94cae4285c03f8060daaabd77809793924c6100
Author: Tom Goff
Date:   Fri Aug 14 16:33:56 2009 -0700

    gre: Fix MTU calculation for bound GRE tunnels

    [ Upstream commit 8cdb045632e5ee22854538619ac6f150eb0a4894 ]

    The GRE header length should be subtracted when the tunnel MTU is
    calculated. This just corrects for the associativity change
    introduced by commit 42aa916265d740d66ac1f17290366e9494c884c2
    ("gre: Move MTU setting out of ipgre_tunnel_bind_dev").

    Signed-off-by: Tom Goff
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

commit 4d422a0590a44d9de3749dadb673b7b0561cc0d1
Author: Krzysztof Hałasa
Date:   Sun Aug 23 19:02:13 2009 -0700

    E100: fix interaction with swiotlb on X86.

    [ Upstream commit 6ff9c2e7fa8ca63a575792534b63c5092099c286 ]

    E100 places its RX packet descriptors inside skb->data and uses them
    with bidirectional streaming DMA mapping. Data in descriptors is
    accessed simultaneously by the chip (writing status and size when a
    packet is received) and CPU (reading to check if the packet was
    received). This isn't a valid usage of PCI DMA API, which requires
    use of the coherent (consistent) memory for such purpose.
    Unfortunately e100 chips working in "simplified" RX mode have to
    store received data directly after the descriptor. Fixing the driver
    to conform to the API would require using unsupported "flexible" RX
    mode or receiving data into a coherent memory and using CPU to copy
    it to network buffers.

    This patch, while not yet making the driver conform to the PCI DMA
    API, allows it to work correctly on X86 with swiotlb (while not
    breaking other architectures).

    Signed-off-by: Krzysztof Hałasa
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

commit 64a0893c7619a93077de2d32260d5597affafd23
Author: Wei Yongjun
Date:   Tue Aug 4 21:44:39 2009 +0000

    dccp: missing destroy of percpu counter variable while unload module

    [ Upstream commit 476181cb05c6a3aea3ef42309388e255c934a06f ]

    percpu counter dccp_orphan_count is initialized in dccp_init() by
    percpu_counter_init() while dccp module is loaded, but the destroy
    of it is missing while dccp module is unloaded. We can get the
    kernel WARNING about this. Reproduce with the following commands:

      $ modprobe dccp
      $ rmmod dccp
      $ modprobe dccp

    WARNING: at lib/list_debug.c:26 __list_add+0x27/0x5c()
    Hardware name: VMware Virtual Platform
    list_add corruption. next->prev should be prev (c080c0c4), but was (null). (next =ca7188cc).
    Modules linked in: dccp(+) nfsd lockd nfs_acl auth_rpcgss exportfs sunrpc
    Pid: 1956, comm: modprobe Not tainted 2.6.31-rc5 #55
    Call Trace:
     [] warn_slowpath_common+0x6a/0x81
     [] ? __list_add+0x27/0x5c
     [] warn_slowpath_fmt+0x29/0x2c
     [] __list_add+0x27/0x5c
     [] __percpu_counter_init+0x4d/0x5d
     [] dccp_init+0x19/0x2ed [dccp]
     [] do_one_initcall+0x4f/0x111
     [] ? dccp_init+0x0/0x2ed [dccp]
     [] ? notifier_call_chain+0x26/0x48
     [] ? __blocking_notifier_call_chain+0x45/0x51
     [] sys_init_module+0xac/0x1bd
     [] sysenter_do_call+0x12/0x22

    Signed-off-by: Wei Yongjun
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman
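
The load/unload imbalance in the dccp commit above can be modelled with a simplified userspace sketch. This is not the kernel's percpu_counter implementation: the struct, the registry list and the `on_list` flag are hypothetical stand-ins, used only to show that an init which registers an object on a global list needs a matching destroy before the object is initialized again.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a counter that registers itself on a global list. */
struct counter {
    long count;
    struct counter *next;
    bool on_list;
};

static struct counter *registry;   /* global list of live counters */

/* Returns false if the counter is still registered (init without destroy). */
static bool counter_init(struct counter *c)
{
    if (c->on_list)
        return false;
    c->count = 0;
    c->next = registry;
    registry = c;
    c->on_list = true;
    return true;
}

/* Unlink the counter from the registry so a later init is safe. */
static void counter_destroy(struct counter *c)
{
    struct counter **pp;

    for (pp = &registry; *pp; pp = &(*pp)->next) {
        if (*pp == c) {
            *pp = c->next;
            break;
        }
    }
    c->next = NULL;
    c->on_list = false;
}

static struct counter orphan_count;   /* plays the role of dccp_orphan_count */

static bool module_load(void)         { return counter_init(&orphan_count); }
static void module_unload_buggy(void) { /* forgets counter_destroy() */ }
static void module_unload_fixed(void) { counter_destroy(&orphan_count); }
```

In this model a load, buggy unload, load sequence makes the second load fail because the stale entry is still registered, which loosely corresponds to the list corruption warning in the trace. With the fixed unload path, repeated load and unload cycles leave the registry empty.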