commit 0ee0f94f82153cc3e4a94a180349c28d1218f1d7
Author: Greg Kroah-Hartman
Date:   Sun Sep 26 17:22:13 2010 -0700

    Linux 2.6.32.23

commit a1a34b6cceb66a843dfbec0a197b0067fe9a8dc9
Author: H. Peter Anvin
Date:   Tue Jul 27 17:01:49 2010 -0700

    x86: Add memory modify constraints to xchg() and cmpxchg()

    commit 113fc5a6e8c2288619ff7e8187a6f556b7e0d372 upstream.

    [ Backport to .32 by Tomáš Janoušek ]

    xchg() and cmpxchg() modify their memory operands, not merely read
    them.  For some versions of gcc the "memory" clobber has apparently
    dealt with the situation, but not for all.

    Originally-by: Linus Torvalds
    Signed-off-by: H. Peter Anvin
    Cc: Glauber Costa
    Cc: Avi Kivity
    Cc: Peter Palfrader
    Cc: Greg KH
    Cc: Alan Cox
    Cc: Zachary Amsden
    Cc: Marcelo Tosatti
    LKML-Reference: <4C4F7277.8050306@zytor.com>
    Signed-off-by: Greg Kroah-Hartman

commit 5d881186be461f3ccea69fe93f674e3266db7336
Author: Michael Cree
Date:   Wed Sep 1 11:25:17 2010 -0400

    alpha: Fix printk format errors

    commit 3e073367a57d41e506f20aebb98e308387ce3090 upstream.

    When compiling alpha generic build get errors such as:

    arch/alpha/kernel/err_marvel.c: In function ‘marvel_print_err_cyc’:
    arch/alpha/kernel/err_marvel.c:119: error: format ‘%ld’ expects type ‘long int’, but argument 6 has type ‘u64’

    Replaced a number of %ld format specifiers with %lld since u64
    is unsigned long long.

    Signed-off-by: Michael Cree
    Signed-off-by: Matt Turner
    Signed-off-by: Greg Kroah-Hartman

commit 38db8f310a497bf97863e5c0bb738e2afc67e3b4
Author: Ben Hutchings
Date:   Wed Mar 24 03:33:48 2010 +0000

    sis-agp: Remove SIS 760, handled by amd64-agp

    commit d831692a1a8e9ceaaa9bb16bb3fc503b7e372558 upstream.

    SIS 760 is listed in the device tables for both amd64-agp and sis-agp.
    amd64-agp is apparently preferable since it has workarounds for some
    BIOS misconfigurations that sis-agp doesn't handle.

    Signed-off-by: Ben Hutchings
    Signed-off-by: Dave Airlie
    Signed-off-by: Greg Kroah-Hartman

commit 6d607033481c7031e5fbfe9750542deb52e9b94c
Author: Ralf Baechle
Date:   Tue Mar 23 17:56:38 2010 +0100

    MIPS: Sibyte: Fix M3 TLB exception handler workaround.

    commit 3d45285dd1ff4d4a1361b95e2d6508579a4402b5 upstream.

    The M3 workaround needs to compare the region and VPN2 fields only.

    Signed-off-by: Ralf Baechle
    Signed-off-by: Greg Kroah-Hartman

commit 28496c5713772499967266e43a8cfd8f2e5afc0a
Author: Bartlomiej Zolnierkiewicz
Date:   Sat Feb 13 17:43:17 2010 -0500

    pata_pdc202xx_old: fix UDMA mode for PDC2026x chipsets

    commit 750e519da7b3f470fe1b5b55c8d8f52d6d6371e4 upstream.

    PDC2026x chipsets need the same treatment as the PDC20246 one.

    This is completely untested but will hopefully fix UDMA issues
    that people have been reporting against pata_pdc202xx_old for
    the last couple of years.

    Signed-off-by: Bartlomiej Zolnierkiewicz
    Signed-off-by: Jeff Garzik
    Signed-off-by: Greg Kroah-Hartman

commit 526c2b2bdf2ddbb3ff020ef1d35d531476e3587d
Author: Bartlomiej Zolnierkiewicz
Date:   Sat Feb 13 14:35:53 2010 +0100

    pata_pdc202xx_old: fix UDMA mode for Promise UDMA33 cards

    commit a75032e8772d13dab5e3501413d7e14a148281b4 upstream.

    On Monday 04 January 2010 02:30:24 pm Russell King wrote:

    > Found the problem - getting rid of the read of the alt status register
    > after the command has been written fixes the UDMA CRC errors on write:
    >
    > @@ -676,7 +676,8 @@ void ata_sff_exec_command(struct ata_port *ap, const struct
    > ata_taskfile *tf)
    >         DPRINTK("ata%u: cmd 0x%X\n", ap->print_id, tf->command);
    >
    >         iowrite8(tf->command, ap->ioaddr.command_addr);
    > -       ata_sff_pause(ap);
    > +       ndelay(400);
    > +//     ata_sff_pause(ap);
    >  }
    >  EXPORT_SYMBOL_GPL(ata_sff_exec_command);
    >
    >
    > This rather makes sense.  The PDC20247 handles the UDMA part of the
    > protocol.
    > It has no way to tell the PDC20246 to wait while it suspends
    > UDMA, so that a normal register access can take place - the 246 ploughs
    > on with the register access without any regard to the state of the 247.
    >
    > If the drive immediately starts the UDMA protocol after a write to the
    > command register (as it probably will for the DMA WRITE command), then
    > we'll be accessing the taskfile in the middle of the UDMA setup, which
    > can't be good.  It's certainly a violation of the ATA specs.

    Fix it by adding a custom ->sff_exec_command method for UDMA33 chipsets.

    Debugged-by: Russell King
    Signed-off-by: Bartlomiej Zolnierkiewicz
    Signed-off-by: Jeff Garzik
    Signed-off-by: Greg Kroah-Hartman

commit 606fa05a76966af96cb3c170e6d09b65194e97c9
Author: Ralf Baechle
Date:   Tue Mar 23 15:54:50 2010 +0100

    MIPS: uasm: Add OR instruction.

    commit 5808184f1b2fe06ef8a54a2b7fb1596d58098acf upstream.

    This is needed for the fix of the M3 workaround.

    Signed-off-by: Ralf Baechle
    [Backported by Aurelien Jarno]
    Signed-off-by: Greg Kroah-Hartman

commit a4d6b59efe59fc769e2d55292653922cb23c41c7
Author: Ben Hutchings
Date:   Sun Jun 13 22:22:59 2010 +0100

    MIPS: Set io_map_base for several PCI bridges lacking it

    commit 8faf2e6c201d95b780cd3b4674b7a55ede6dcbbb upstream.

    Several MIPS platforms don't set pci_controller::io_map_base for their
    PCI bridges.  This results in a panic in pci_iomap().  (The panic is
    conditional on CONFIG_PCI_DOMAINS, but that is now enabled for all PCI
    MIPS systems.)

    Signed-off-by: Ben Hutchings
    Cc: linux-mips@linux-mips.org
    Cc: Martin Michlmayr
    Cc: Aurelien Jarno
    Cc: 584784@bugs.debian.org
    Patchwork: https://patchwork.linux-mips.org/patch/1377/
    Signed-off-by: Ralf Baechle
    Signed-off-by: Greg Kroah-Hartman

commit 74845b58bc3d60d50acd754d3f33fb5a245b7974
Author: David Daney
Date:   Thu Jul 22 11:59:27 2010 -0700

    MIPS: Quit using undefined behavior of ADDU in 64-bit atomic operations.

    commit f2a68272d799bf4092443357142f63b74f7669a1 upstream.

    For 64-bit, we must use DADDU and DSUBU.

    Signed-off-by: David Daney
    To: linux-mips@linux-mips.org
    Patchwork: https://patchwork.linux-mips.org/patch/1483/
    Signed-off-by: Ralf Baechle
    Signed-off-by: Greg Kroah-Hartman

commit e1155f2ae2c0562b4efe8b2b5165ad911be0d008
Author: Dmitry Torokhov
Date:   Mon Jan 11 00:05:43 2010 -0800

    Input: add compat support for sysfs and /proc capabilities output

    commit 15e184afa83a45cf8bafdb9dc906b97a8fbc974f upstream.

    Input core displays capabilities bitmasks in the form of one or more
    longs printed in hex and separated by spaces.  Unfortunately this does
    not work well for 32-bit applications running on 64-bit kernels, since
    the applications expect each number to be "worth" only 32 bits while
    the kernel advances by 64 bits.

    Fix that by ensuring that output produced for compat tasks uses 32-bit
    units.

    Reported-and-tested-by: Michael Tokarev
    Signed-off-by: Dmitry Torokhov
    Signed-off-by: Greg Kroah-Hartman

commit a48981e31d8ea85f7ef0a5b5dead4fd4a82b2fc9
Author: Eric Paris
Date:   Wed Jul 28 10:18:37 2010 -0400

    inotify: fix inotify oneshot support

    commit ff311008ab8d2f2cfdbbefd407d1b05acc8164b2 upstream.

    During the large inotify rewrite to fsnotify I completely dropped support
    for IN_ONESHOT.  Reimplement that support.

    Signed-off-by: Eric Paris
    Signed-off-by: Greg Kroah-Hartman

commit af3fc1bc4bb4173e14e83035b4c4ab02d380edbb
Author: John W. Linville
Date:   Tue Jul 13 14:06:32 2010 -0400

    hostap_pci: set dev->base_addr during probe

    commit 0f4da2d77e1bf424ac36424081afc22cbfc3ff2b upstream.

    "hostap: Protect against initialization interrupt" (which reinstated
    "wireless: hostap, fix oops due to early probing interrupt")
    reintroduced Bug 16111.  This is because hostap_pci wasn't setting
    dev->base_addr, which is now checked in prism2_interrupt.  As a result,
    initialization was failing for PCI-based hostap devices.  This corrects
    that oversight.

    Signed-off-by: John W. Linville
    Signed-off-by: Greg Kroah-Hartman

commit 372ee0be3eb7d1b4fa744a617dd9821a62afd8de
Author: Herbert Xu
Date:   Thu May 20 23:07:56 2010 -0700

    gro: Fix bogus gso_size on the first fraglist entry

    commit 622e0ca1cd4d459f5af4f2c65f4dc0dd823cb4c3 upstream.

    When GRO produces fraglist entries, and the resulting skb hits
    an interface that is incapable of TSO but capable of FRAGLIST,
    we end up producing a bogus packet with gso_size non-zero.

    This was reported in the field with older versions of KVM that
    did not set the TSO bits on tuntap.

    This patch fixes that.

    Reported-by: Igor Zhang
    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

commit cc760d4a8859a02aeb777d508a276d4b1e726eed
Author: Aurelien Jarno
Date:   Mon May 31 21:45:48 2010 +0000

    clocksource: sh_tmu: compute mult and shift before registration

    commit 66f49121ffa41a19c59965b31b046d8368fec3c7 upstream.

    Since commit 98962465ed9e6ea99c38e0af63fe1dcb5a79dc25 ("nohz: Prevent
    clocksource wrapping during idle"), the CPU of an R2D board never goes
    to idle.  That commit assumes that mult and shift are assigned before
    the clocksource is registered.  As a consequence the safe maximum sleep
    time is negative and the CPU never goes into idle.

    This patch fixes the problem by moving mult and shift initialization
    from sh_tmu_clocksource_enable() to sh_tmu_register_clocksource().

    Signed-off-by: Aurelien Jarno
    Signed-off-by: Paul Mundt
    Signed-off-by: Greg Kroah-Hartman

commit a4693e59fcd9b4a1461ba9988f85421d6542914e
Author: Peter Oberparleiter
Date:   Mon Jul 19 09:22:35 2010 +0200

    dasd: use correct label location for diag fba disks

    commit cffab6bc5511cd6f67a60bf16b62de4267b68c4c upstream.

    Partition boundary calculation fails for DASD FBA disks under the
    following conditions:
    - disk is formatted with CMS FORMAT with a blocksize of more than
      512 bytes
    - all of the disk is reserved to a single CMS file using CMS RESERVE
    - the disk is accessed using the DIAG mode of the DASD driver

    Under these circumstances, the partition detection code tries to
    read the CMS label block containing partition-relevant information
    from logical block offset 1, while it is in fact located at physical
    block offset 1.

    Fix this problem by using the correct CMS label block location
    depending on the device type as determined by the DASD SENSE ID
    information.

    Signed-off-by: Peter Oberparleiter
    Signed-off-by: Martin Schwidefsky
    [bwh: Adjust for 2.6.32]
    Signed-off-by: Greg Kroah-Hartman

commit e665d4c5a0857d7a8e9d8d00dcdbcd81b320013a
Author: Jussi Kivilinna
Date:   Tue Mar 9 12:24:38 2010 +0000

    asix: fix setting mac address for AX88772

    commit 7f29a3baa825725d29db399663790d15c78cddcf upstream.

    Setting new MAC address only worked when device was set to promiscuous
    mode.  Fix MAC address by writing new address to device using
    undocumented command AX_CMD_READ_NODE_ID+1.  Patch is tested with
    AX88772 device.

    Signed-off-by: Jussi Kivilinna
    Acked-by: David Hollis
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

commit 33c567a7081fec50b5a83e3d7fee7950ba253bd6
Author: Ben Hutchings
Date:   Wed Apr 7 20:55:47 2010 -0700

    3c503: Fix IRQ probing

    commit b0cf4dfb7cd21556efd9a6a67edcba0840b4d98d upstream.

    The driver attempts to select an IRQ for the NIC automatically by
    testing which of the supported IRQs are available and then probing
    each available IRQ with probe_irq_{on,off}().  There are obvious race
    conditions here, besides which:
    1. The test for availability is done by passing a NULL handler, which
       now always returns -EINVAL, thus the device cannot be opened:
    2. probe_irq_off() will report only the first ISA IRQ handled,
       potentially leading to a false negative.

    There was another bug that meant it ignored all error codes from
    request_irq() except -EBUSY, so it would 'succeed' despite this
    (possibly causing conflicts with other ISA devices).  This was fixed
    by ab08999d6029bb2c79c16be5405d63d2bedbdfea 'WARNING: some
    request_irq() failures ignored in el2_open()', which exposed bug 1.

    This patch:
    1. Replaces the use of probe_irq_{on,off}() with a real interrupt handler
    2. Adds a delay before checking the interrupt-seen flag
    3. Disables interrupts on all failure paths
    4. Distinguishes error codes from the second request_irq() call,
       consistently with the first

    Compile-tested only.

    Signed-off-by: Ben Hutchings
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

commit ca92b22ffac5a8b47d6be1a6f2e0dbe68b485f18
Author: Vlad Yasevich
Date:   Wed Sep 15 10:00:26 2010 -0400

    sctp: Do not reset the packet during sctp_packet_config().

    commit 4bdab43323b459900578b200a4b8cf9713ac8fab upstream.

    sctp_packet_config() is called when getting the packet ready
    for appending of chunks.  The function should not touch the
    current state, since it's possible to ping-pong between two
    transports when sending, and that can result in packet
    corruption followed by an skb overflow crash.

    Reported-by: Thomas Dreibholz
    Signed-off-by: Vlad Yasevich
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

commit bded361d94196ecca089c8a3c28ed60af4906d8f
Author: Daniel J Blueman
Date:   Tue Aug 17 23:56:55 2010 +0100

    Fix unprotected access to task credentials in waitid()

    commit f362b73244fb16ea4ae127ced1467dd8adaa7733 upstream.

    Using a program like the following:

        #include <signal.h>
        #include <stdlib.h>
        #include <unistd.h>
        #include <sys/wait.h>

        int main() {
            id_t id;
            siginfo_t infop;
            pid_t res;

            id = fork();
            if (id == 0) { sleep(1); exit(0); }
            kill(id, SIGSTOP);
            alarm(1);
            waitid(P_PID, id, &infop, WCONTINUED);
            return 0;
        }

    to call waitid() on a stopped process results in access to the child
    task's credentials without the RCU read lock being held - which may be
    replaced in the meantime - eliciting the following warning:

        ===================================================
        [ INFO: suspicious rcu_dereference_check() usage. ]
        ---------------------------------------------------
        kernel/exit.c:1460 invoked rcu_dereference_check() without protection!

        other info that might help us debug this:

        rcu_scheduler_active = 1, debug_locks = 1
        2 locks held by waitid02/22252:
         #0:  (tasklist_lock){.?.?..}, at: [] do_wait+0xc5/0x310
         #1:  (&(&sighand->siglock)->rlock){-.-...}, at: []
        wait_consider_task+0x19a/0xbe0

        stack backtrace:
        Pid: 22252, comm: waitid02 Not tainted 2.6.35-323cd+ #3
        Call Trace:
         [] lockdep_rcu_dereference+0xa4/0xc0
         [] wait_consider_task+0xaf1/0xbe0
         [] do_wait+0xf5/0x310
         [] sys_waitid+0x86/0x1f0
         [] ? child_wait_callback+0x0/0x70
         [] system_call_fastpath+0x16/0x1b

    This is fixed by holding the RCU read lock in wait_task_continued() to
    ensure that the task's current credentials aren't destroyed between us
    reading the cred pointer and us reading the UID from those credentials.

    Furthermore, protect wait_task_stopped() in the same way.

    We don't need to keep holding the RCU read lock once we've read the UID
    from the credentials as holding the RCU read lock doesn't stop the target
    task from changing its creds under us - so the credentials may be
    outdated immediately after we've read the pointer, lock or no lock.

    Signed-off-by: Daniel J Blueman
    Signed-off-by: David Howells
    Acked-by: Paul E. McKenney
    Acked-by: Oleg Nesterov
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

commit c837b58c0ea48760cc83a2e3ffddd92ac88dd156
Author: Luck, Tony
Date:   Tue Aug 24 11:44:18 2010 -0700

    guard page for stacks that grow upwards

    commit 8ca3eb08097f6839b2206e2242db4179aee3cfb3 upstream.

    pa-risc and ia64 have stacks that grow upwards.  Check that
    they do not run into other mappings.  By making VM_GROWSUP
    0x0 on architectures that do not ever use it, we can avoid
    some unpleasant #ifdefs in check_stack_guard_page().

    Signed-off-by: Tony Luck
    Signed-off-by: Linus Torvalds
    Cc: dann frazier
    Signed-off-by: Greg Kroah-Hartman

commit 288841853e8c83bba82a35937ec19eba59f14548
Author: Mel Gorman
Date:   Thu Sep 9 16:38:16 2010 -0700

    mm: page allocator: update free page counters after pages are placed on the free list

    commit 72853e2991a2702ae93aaf889ac7db743a415dd3 upstream.

    When allocating a page, the system uses NR_FREE_PAGES counters to
    determine if watermarks would remain intact after the allocation was
    made.  This check is made without interrupts disabled or the zone lock
    held and so is race-prone by nature.  Unfortunately, when pages are being
    freed in batch, the counters are updated before the pages are added on
    the list.  During this window, the counters are misleading as the pages
    do not exist yet.  When under significant pressure on systems with large
    numbers of CPUs, it's possible for processes to make progress even though
    they should have been stalled.  This is particularly problematic if a
    number of the processes are using GFP_ATOMIC as the min watermark can be
    accidentally breached and in extreme cases, the system can livelock.

    This patch updates the counters after the pages have been added to the
    list.  This makes the allocator more cautious with respect to preserving
    the watermarks and mitigates livelock possibilities.

    [akpm@linux-foundation.org: avoid modifying incoming args]
    Signed-off-by: Mel Gorman
    Reviewed-by: Rik van Riel
    Reviewed-by: Minchan Kim
    Reviewed-by: KAMEZAWA Hiroyuki
    Reviewed-by: Christoph Lameter
    Reviewed-by: KOSAKI Motohiro
    Acked-by: Johannes Weiner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

commit c2222d66adedc567318a7c288726363daeb3374b
Author: Christoph Lameter
Date:   Thu Sep 9 16:38:17 2010 -0700

    mm: page allocator: calculate a better estimate of NR_FREE_PAGES when memory is low and kswapd is awake

    commit aa45484031ddee09b06350ab8528bfe5b2c76d1c upstream.

    Ordinarily watermark checks are based on the vmstat NR_FREE_PAGES as it
    is cheaper than scanning a number of lists.  To avoid synchronization
    overhead, counter deltas are maintained on a per-cpu basis and drained
    both periodically and when the delta is above a threshold.  On large CPU
    systems, the difference between the estimated and real value of
    NR_FREE_PAGES can be very high.  If NR_FREE_PAGES is much higher than
    the number of pages that are really free in the buddy allocator, the VM
    can allocate pages below the min watermark, at worst reducing the real
    number of pages to zero.  Even if the OOM killer kills some victim to
    free memory, it may not free memory if the exit path requires a new
    page, resulting in livelock.

    This patch introduces a zone_page_state_snapshot() function (courtesy of
    Christoph) that takes a slightly more accurate view of an arbitrary
    vmstat counter.  It is used to read NR_FREE_PAGES while kswapd is awake
    to avoid the watermark being accidentally broken.  The estimate is not
    perfect and may result in cache line bounces but is expected to be
    lighter than the IPI calls necessary to continually drain the per-cpu
    counters while kswapd is awake.
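
The difference between the cheap counter read and the snapshot read described above can be sketched in plain userspace C. This is only an illustration under stated assumptions: `struct zone_sketch`, `NR_CPUS_SKETCH` and all function names here are hypothetical stand-ins and not the kernel's data structures or the patch's code.

```c
#include <stddef.h>

#define NR_CPUS_SKETCH 4    /* hypothetical CPU count, for illustration only */

/* Hypothetical stand-in for one zone's free-page bookkeeping. */
struct zone_sketch {
    long global_free;                /* shared counter, may lag behind */
    long cpu_delta[NR_CPUS_SKETCH];  /* per-CPU deltas not yet folded in */
};

/* Cheap read: only the shared counter, off by whatever is still pending. */
long free_pages_estimate(const struct zone_sketch *z)
{
    return z->global_free;
}

/* Snapshot read: shared counter plus every pending per-CPU delta. */
long free_pages_snapshot(const struct zone_sketch *z)
{
    long x = z->global_free;

    for (size_t cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
        x += z->cpu_delta[cpu];

    return x < 0 ? 0 : x;   /* a transiently negative sum is reported as 0 */
}

/* Watermark check that pays for the snapshot only while kswapd is awake. */
int above_min_watermark(const struct zone_sketch *z, long min_wmark,
                        int kswapd_awake)
{
    long free_pages = kswapd_awake ? free_pages_snapshot(z)
                                   : free_pages_estimate(z);
    return free_pages > min_wmark;
}
```

With a shared counter of 100 and pending per-CPU deltas of -30, -30, -25 and -10, the cheap read still reports 100 free pages while the snapshot reports 5, so a min watermark of 20 passes the cheap check and fails the snapshot check.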

    Signed-off-by: Christoph Lameter
    Signed-off-by: Mel Gorman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

commit 9fad902072e4689c9179b326669aac8918920e5c
Author: Mel Gorman
Date:   Thu Sep 9 16:38:18 2010 -0700

    mm: page allocator: drain per-cpu lists after direct reclaim allocation fails

    commit 9ee493ce0a60bf42c0f8fd0b0fe91df5704a1cbf upstream.

    When under significant memory pressure, a process enters direct reclaim
    and immediately afterwards tries to allocate a page.  If it fails and no
    further progress is made, it's possible the system will go OOM.  However,
    on systems with large amounts of memory, it's possible that a significant
    number of pages are on per-cpu lists and inaccessible to the calling
    process.  This leads to a process entering direct reclaim more often than
    it should, increasing the pressure on the system and compounding the
    problem.

    This patch notes that if direct reclaim is making progress but
    allocations are still failing, the system is already under heavy
    pressure.  In this case, it drains the per-cpu lists and tries the
    allocation a second time before continuing.

    Signed-off-by: Mel Gorman
    Reviewed-by: Minchan Kim
    Reviewed-by: KAMEZAWA Hiroyuki
    Reviewed-by: KOSAKI Motohiro
    Reviewed-by: Christoph Lameter
    Cc: Dave Chinner
    Cc: Wu Fengguang
    Cc: David Rientjes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

commit c5b75362daff6a41168e6bf848d745763be99a21
Author: Divy Le Ray
Date:   Wed Mar 3 09:49:47 2010 +0000

    cxgb3: fix hot plug removal crash

    commit a6f018e324ba91d0464cca6895447c2b89e6d578 upstream.

    queue restart tasklets need to be stopped after napi handlers are stopped
    since the latter can restart them.  So stop them after stopping napi.

    Signed-off-by: Divy Le Ray
    Signed-off-by: David S. Miller
    Signed-off-by: Brandon Philips
    Signed-off-by: Greg Kroah-Hartman

commit 392cbf6593e72c8fd5ad6c66fb178ede498b83ec
Author: Nicolas Ferre
Date:   Fri Aug 20 16:44:33 2010 +0200

    AT91: change dma resource index

    commit 8d2602e0778299e2d6084f03086b716d6e7a1e1e upstream.

    Reported-by: Dan Liang
    Signed-off-by: Nicolas Ferre
    Signed-off-by: Greg Kroah-Hartman

commit 2b0a9cce5e3866adc5d15567df60dcea277c9898
Author: Michael Chan
Date:   Tue Jun 1 15:05:36 2010 +0000

    bnx2: Fix hang during rmmod bnx2.

    commit f048fa9c8686119c3858a463cab6121dced7c0bf upstream.

    The regression is caused by:

    commit 4327ba435a56ada13eedf3eb332e583c7a0586a9
        bnx2: Fix netpoll crash.

    If ->open() and ->close() are called multiple times, the same napi structs
    will be added to dev->napi_list multiple times, corrupting the dev->napi_list.
    This causes free_netdev() to hang during rmmod.

    We fix this by calling netif_napi_del() during ->close().

    Also, bnx2_init_napi() must not be in the __devinit section since it is
    called by ->open().

    Signed-off-by: Michael Chan
    Signed-off-by: Benjamin Li
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

commit 508d1ab51e66eb8f50acba0301524b09babe37b3
Author: Benjamin Li
Date:   Tue Mar 23 13:13:11 2010 +0000

    bnx2: Fix netpoll crash.

    commit 4327ba435a56ada13eedf3eb332e583c7a0586a9 upstream.

    The bnx2 driver calls netif_napi_add() for all the NAPI structs during
    ->probe() time but not all of them will be used if we're not in MSI-X
    mode.  This creates a problem for netpoll since it will poll all the
    NAPI structs in the dev_list whether or not they are scheduled, resulting
    in a crash when we access structure fields not initialized for that vector.

    We fix it by moving the netif_napi_add() call to ->open() after the number
    of IRQ vectors has been determined.

    Signed-off-by: Benjamin Li
    Signed-off-by: Michael Chan
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

commit 06f796fb6a720e5b3b56d0ba08c968d82a48f60b
Author: Zhang Rui
Date:   Mon Dec 21 16:13:15 2009 +0800

    ACPI: disable _OSI(Windows 2009) on Asus K50IJ

    commit 81074e90f5c150ca70ab8dfcc77860cbe76f364d upstream.

    Fix a win7 compatibility issue on Asus K50IJ.

    Here is the _BCM method of this laptop:

        Method (_BCM, 1, NotSerialized)
        {
            If (LGreaterEqual (OSFG, OSVT))
            {
                If (LNotEqual (OSFG, OSW7))
                {
                    Store (One, BCMD)
                    Store (GCBL (Arg0), Local0)
                    Subtract (0x0F, Local0, LBTN)
                    ^^^SBRG.EC0.STBR ()
                    ...
                }
                Else
                {
                    DBGR (0x0B, Zero, Zero, Arg0)
                    Store (Arg0, LBTN)
                    ^^^SBRG.EC0.STBR ()
                    ...
                }
            }
        }

    LBTN is used to store the index of the brightness level in the _BCL.
    GCBL is a method that converts the percentage value to the index value.
    If _OSI(Windows 2009) is not disabled, LBTN stores a percentage value,
    which is surely beyond the end of the _BCL package.

    http://bugzilla.kernel.org/show_bug.cgi?id=14753

    Signed-off-by: Zhang Rui
    Signed-off-by: Len Brown
    Cc: maximilian attems
    Cc: Paolo Ornati
    Signed-off-by: Greg Kroah-Hartman

commit d912e785e25fb48bc4eae43c497430b1d5b7e044
Author: Dan Rosenberg
Date:   Wed Sep 15 19:08:24 2010 -0400

    drivers/video/via/ioctl.c: prevent reading uninitialized stack memory

    commit b4aaa78f4c2f9cde2f335b14f4ca30b01f9651ca upstream.

    The VIAFB_GET_INFO device ioctl allows unprivileged users to read 246
    bytes of uninitialized stack memory, because the "reserved" member of
    the viafb_ioctl_info struct declared on the stack is not altered or
    zeroed before being copied back to the user.  This patch takes care of
    it.

    Signed-off-by: Dan Rosenberg
    Signed-off-by: Florian Tobias Schandinat
    Signed-off-by: Greg Kroah-Hartman

commit 043d7866aebbc60a37dc3245035aee41836eb9be
Author: Dan Rosenberg
Date:   Mon Sep 6 18:24:57 2010 -0400

    xfs: prevent reading uninitialized stack memory

    commit a122eb2fdfd78b58c6dd992d6f4b1aaef667eef9 upstream.

    The XFS_IOC_FSGETXATTR ioctl allows unprivileged users to read 12
    bytes of uninitialized stack memory, because the fsxattr struct
    declared on the stack in xfs_ioc_fsgetxattr() does not alter (or zero)
    the 12-byte fsx_pad member before copying it back to the user.  This
    patch takes care of it.

    Signed-off-by: Dan Rosenberg
    Reviewed-by: Eric Sandeen
    Signed-off-by: Alex Elder
    Cc: dann frazier
    Signed-off-by: Greg Kroah-Hartman

commit db1a0b94ba65383e790bbba18c95b319ecbe534e
Author: David Howells
Date:   Fri Sep 10 09:59:51 2010 +0100

    KEYS: Fix bug in keyctl_session_to_parent() if parent has no session keyring

    commit 3d96406c7da1ed5811ea52a3b0905f4f0e295376 upstream.

    Fix a bug in keyctl_session_to_parent() whereby it tries to check the
    ownership of the parent process's session keyring whether or not the
    parent has a session keyring [CVE-2010-2960].

    This results in the following oops:

      BUG: unable to handle kernel NULL pointer dereference at 00000000000000a0
      IP: [] keyctl_session_to_parent+0x251/0x443
      ...
      Call Trace:
       [] ? keyctl_session_to_parent+0x67/0x443
       [] ? __do_fault+0x24b/0x3d0
       [] sys_keyctl+0xb4/0xb8
       [] system_call_fastpath+0x16/0x1b

    if the parent process has no session keyring.

    If the system is using pam_keyinit then it is mostly protected against
    this, as all processes derived from a login will have inherited the
    session keyring created by pam_keyinit during the login procedure.

    To test this, pam_keyinit calls need to be commented out in /etc/pam.d/.

    Reported-by: Tavis Ormandy
    Signed-off-by: David Howells
    Acked-by: Tavis Ormandy
    Cc: dann frazier
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

commit 516d04d051294af5124d866ed65a838c71006ba5
Author: David Howells
Date:   Fri Sep 10 09:59:46 2010 +0100

    KEYS: Fix RCU no-lock warning in keyctl_session_to_parent()

    commit 9d1ac65a9698513d00e5608d93fca0c53f536c14 upstream.

    There's an unprotected access to the parent process's credentials in
    the middle of keyctl_session_to_parent().
    This results in the following RCU warning:

      ===================================================
      [ INFO: suspicious rcu_dereference_check() usage. ]
      ---------------------------------------------------
      security/keys/keyctl.c:1291 invoked rcu_dereference_check() without protection!

      other info that might help us debug this:

      rcu_scheduler_active = 1, debug_locks = 0
      1 lock held by keyctl-session-/2137:
       #0:  (tasklist_lock){.+.+..}, at: [] keyctl_session_to_parent+0x60/0x236

      stack backtrace:
      Pid: 2137, comm: keyctl-session- Not tainted 2.6.36-rc2-cachefs+ #1
      Call Trace:
       [] lockdep_rcu_dereference+0xaa/0xb3
       [] keyctl_session_to_parent+0xed/0x236
       [] sys_keyctl+0xb4/0xb6
       [] system_call_fastpath+0x16/0x1b

    The code should take the RCU read lock to make sure the parent's
    credentials don't go away, even though it's holding a spinlock and has
    IRQs disabled.

    Signed-off-by: David Howells
    Signed-off-by: Linus Torvalds
    Cc: dann frazier
    Signed-off-by: Greg Kroah-Hartman

commit 6e01cc9e2407c3f929c903f5813bc20c079f8e0b
Author: Petr Tesarik
Date:   Wed Sep 15 15:35:48 2010 -0700

    IA64: Optimize ticket spinlocks in fsys_rt_sigprocmask

    commit 2d2b6901649a62977452be85df53eda2412def24 upstream.

    Tony's fix (f574c843191728d9407b766a027f779dcd27b272) has a small bug,
    it incorrectly uses "r3" as a scratch register in the first of the two
    unlock paths ... it is also inefficient.  Optimize the fast path again.

    Signed-off-by: Petr Tesarik
    Signed-off-by: Tony Luck
    Signed-off-by: Greg Kroah-Hartman

commit e3c0109cdf6b0656887364b6dd435cb5e7d30b15
Author: Tony Luck
Date:   Thu Sep 9 15:16:56 2010 -0700

    IA64: fix siglock

    commit f574c843191728d9407b766a027f779dcd27b272 upstream.

    When ia64 converted to using ticket locks, an inline implementation
    of trylock/unlock in fsys.S was missed.  This was not noticed because
    in most circumstances it simply resulted in using the slow path because
    the siglock was apparently not available (under old spinlock rules).

    Problems occur when the ticket spinlock has value 0x0 (when first
    initialised, or when it wraps around).  At this point the fsys.S
    code acquires the lock (changing the 0x0 to 0x1).  If another process
    attempts to get the lock at this point, it will change the value from
    0x1 to 0x2 (using new ticket lock rules).  Then the fsys.S code will
    free the lock using old spinlock rules by writing 0x0 to it.  From
    here a variety of bad things can happen.

    Signed-off-by: Tony Luck
    Signed-off-by: Greg Kroah-Hartman

commit acf5fad61bc0f2c28e58ccfe21511ff242ea80e4
Author: Dmitry Monakhov
Date:   Sat Jun 5 11:51:27 2010 -0400

    ext4: Fix remaining racy updates of EXT4_I(inode)->i_flags

    commit 84a8dce2710cc425089a2b92acc354d4fbb5788d upstream.

    A few functions were still modifying i_flags in a racy manner.

    Signed-off-by: Dmitry Monakhov
    Signed-off-by: "Theodore Ts'o"
    Signed-off-by: Greg Kroah-Hartman

commit aaf3b48b50681f779723ea9bb141931739b75c4b
Author: Ryan Kuester
Date:   Mon Apr 26 18:11:54 2010 -0500

    SCSI: mptsas: fix hangs caused by ATA pass-through

    commit 2a1b7e575b80ceb19ea50bfa86ce0053ea57181d upstream.

    I may have an explanation for the LSI 1068 HBA hangs provoked by ATA
    pass-through commands, in particular by smartctl.

    First, my version of the symptoms.  On an LSI SAS1068E B3 HBA running
    01.29.00.00 firmware, with SATA disks, and with smartd running, I'm seeing
    occasional task, bus, and host resets, some of which lead to hard faults of
    the HBA requiring a reboot.  Abusively looping the smartctl command,

        # while true; do smartctl -a /dev/sdb > /dev/null; done

    dramatically increases the frequency of these failures to nearly one per
    minute.  A high IO load through the HBA while looping smartctl seems to
    increase the chance of a full scsi host reset or a non-recoverable hang.

    I reduced what smartctl was doing down to a simple test case which
    causes the hang with a single IO when pointed at the sd interface.  See
    the code at the bottom of this e-mail.
    It uses an SG_IO ioctl to issue a single pass-through ATA identify
    device command.  If the buffer userspace gives for the read data has
    certain alignments, the task is issued to the HBA but the HBA fails
    to respond.  If run against the sg interface, neither the test code
    nor smartctl causes a hang.

    sd and sg handle the SG_IO ioctl slightly differently.  Unless you
    specifically set a flag to do direct IO, sg passes a buffer of its own,
    which is page-aligned, to the block layer and later copies the result
    into the userspace buffer regardless of its alignment.  sd, on the other
    hand, always does direct IO unless the userspace buffer fails an alignment
    test at block/blk-map.c line 57, in which case a page-aligned buffer is
    created and used for the transfer.

    The alignment test currently checks for word-alignment, the default
    setup by scsi_lib.c; therefore, userspace buffers of almost any
    alignment are given directly to the HBA as DMA targets.  The LSI 1068
    hardware doesn't seem to like at least a couple of the alignments which
    cross a page boundary (see the test code below).  Curiously, many
    page-boundary-crossing alignments do work just fine.

    So, either the hardware has a bug handling certain alignments or the
    hardware has a stricter alignment requirement than the driver is
    advertising.  If stricter alignment is required, then in no case should
    misaligned buffers from userspace be allowed through without being
    bounced or at least causing an error to be returned.

    It seems the mptsas driver could use blk_queue_dma_alignment() to advertise
    a stricter alignment requirement.  If it does, sd does the right thing and
    bounces misaligned buffers (see block/blk-map.c line 57).  The following
    patch to 2.6.34-rc5 makes my symptoms go away.  I'm sure this is the wrong
    place for this code, but it gets my idea across.
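
The alignment test the message refers to can be sketched as a small standalone helper. This is a minimal illustration, not the code at block/blk-map.c: the function name `needs_bounce` is hypothetical, and the mask values used in the usage note (0x3 for word alignment, 511 for a stricter requirement) are assumptions chosen for the example rather than values taken from the patch.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns nonzero when a user buffer has to be
 * bounced through an aligned kernel copy instead of being handed to the
 * device as a direct DMA target.  align_mask is (alignment - 1), so 0x3
 * means 4-byte (word) alignment.
 */
int needs_bounce(uintptr_t addr, size_t len, unsigned int align_mask)
{
    /* Either a misaligned start address or a misaligned length fails. */
    return ((addr | len) & align_mask) != 0;
}
```

With a word-alignment mask of 0x3, a 512-byte buffer at address 0x1004 passes and goes straight to the hardware. With an assumed stricter mask of 511 the same buffer fails the test and is bounced, which is the behaviour the message asks the driver to request.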

    Acked-by: Kashyap Desai
    Signed-off-by: James Bottomley
    Signed-off-by: Greg Kroah-Hartman

commit e39ae50950e2cc9b92812f643d86f1a02cbab738
Author: Eric Paris
Date:   Wed Jul 28 10:18:37 2010 -0400

    inotify: send IN_UNMOUNT events

    commit 611da04f7a31b2208e838be55a42c7a1310ae321 upstream.

    Since the .31 or so notify rewrite inotify has not sent events about
    inodes which are unmounted.  This patch restores those events.

    Signed-off-by: Eric Paris
    Cc: Ben Hutchings
    Signed-off-by: Greg Kroah-Hartman

commit 02e33709e19a12720a8da0c5bfa5572ed8b5c9ec
Author: Jeff Moyer
Date:   Fri Sep 10 14:16:00 2010 -0700

    aio: check for multiplication overflow in do_io_submit

    commit 75e1c70fc31490ef8a373ea2a4bea2524099b478 upstream.

    Tavis Ormandy pointed out that do_io_submit does not do proper bounds
    checking on the passed-in iocb array:

           if (unlikely(nr < 0))
                   return -EINVAL;

           if (unlikely(!access_ok(VERIFY_READ, iocbpp, (nr*sizeof(iocbpp)))))
                   return -EFAULT;                      ^^^^^^^^^^^^^^^^^^

    The attached patch checks for overflow, and if it is detected, the
    number of iocbs submitted is scaled down to a number that will fit in
    the long.  This is an ok thing to do, as sys_io_submit is documented as
    returning the number of iocbs submitted, so callers should handle a
    return value of less than the 'nr' argument passed in.

    Reported-by: Tavis Ormandy
    Signed-off-by: Jeff Moyer
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

commit 61e0ce1b361c699387029fdda4ba08e6ae4af326
Author: Tejun Heo
Date:   Tue Sep 21 07:57:19 2010 +0200

    percpu: fix pcpu_last_unit_cpu

    commit 46b30ea9bc3698bc1d1e6fd726c9601d46fa0a91 upstream.

    pcpu_first/last_unit_cpu are used to track which cpu has the first and
    last units assigned.  This in turn is used to determine the span of a
    chunk for map/unmap cache flushes and whether an address belongs to
    the first chunk or not in per_cpu_ptr_to_phys().
When the number of possible CPUs isn't a power of two, a chunk may contain unassigned units towards the end of a chunk. The logic to determine pcpu_last_unit_cpu was incorrect when there was an unused unit at the end of a chunk. It failed to ignore the unused unit and assigned the unused marker NR_CPUS to pcpu_last_unit_cpu. This was discovered by CAI Qian through a kdump failure caused by a malfunctioning per_cpu_ptr_to_phys() on a kvm setup with 50 possible CPUs. Signed-off-by: Tejun Heo Reported-by: CAI Qian Signed-off-by: Greg Kroah-Hartman commit e286d83959320fcf7eadfc81aeceb6e1a81667c9 Author: Dan Rosenberg Date: Wed Sep 22 13:05:09 2010 -0700 drivers/video/sis/sis_main.c: prevent reading uninitialized stack memory commit fd02db9de73faebc51240619c7c7f99bee9f65c7 upstream. The FBIOGET_VBLANK device ioctl allows unprivileged users to read 16 bytes of uninitialized stack memory, because the "reserved" member of the fb_vblank struct declared on the stack is not altered or zeroed before being copied back to the user. This patch takes care of it. Signed-off-by: Dan Rosenberg Cc: Thomas Winischhofer Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 0a71220bc6dc9ef745de3beb618f18c42af433db Author: Andrew Morton Date: Wed Sep 22 13:05:11 2010 -0700 drivers/pci/intel-iommu.c: fix build with older gcc's commit df08cdc7ef606509debe7677c439be0ca48790e4 upstream. drivers/pci/intel-iommu.c: In function `__iommu_calculate_agaw': drivers/pci/intel-iommu.c:437: sorry, unimplemented: inlining failed in call to 'width_to_agaw': function body not available drivers/pci/intel-iommu.c:445: sorry, unimplemented: called from here Move the offending function (and its siblings) to top-of-file, remove the forward declaration.
Addresses https://bugzilla.kernel.org/show_bug.cgi?id=17441 Reported-by: Martin Mokrejs Cc: David Woodhouse Cc: Jesse Barnes Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 52a366e0f6b5d2190055dc1ae39a3b421c1f21e2 Author: Jan Kara Date: Tue Sep 21 11:49:01 2010 +0200 char: Mark /dev/zero and /dev/kmem as not capable of writeback commit 371d217ee1ff8b418b8f73fb2a34990f951ec2d4 upstream. These devices don't do any writeback but their device inodes still can get dirty so mark bdi appropriately so that bdi code does the right thing and files inodes to lists of bdi carrying the device inodes. Signed-off-by: Jan Kara Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit 702c0563a79cb487988dfc83a4e9089435baf8e3 Author: Patrick Simmons Date: Wed Sep 8 10:34:28 2010 -0400 oprofile: Add Support for Intel CPU Family 6 / Model 22 (Intel Celeron 540) commit c33f543d320843e1732534c3931da4bbd18e6c14 upstream. This patch adds CPU type detection for the Intel Celeron 540, which is part of the Core 2 family according to Wikipedia; the family and ID pair is absent from the Volume 3B table referenced in the source code comments. I have tested this patch on an Intel Celeron 540 machine reporting itself as Family 6 Model 22, and OProfile runs on the machine without issue. Spec: http://download.intel.com/design/mobile/SPECUPDT/317667.pdf Signed-off-by: Patrick Simmons Acked-by: Andi Kleen Acked-by: Arnd Bergmann Signed-off-by: Robert Richter Signed-off-by: Greg Kroah-Hartman commit 58d5e434228b39f0906b17ce4f2b14d2cd78e947 Author: Stanislaw Gruszka Date: Tue Sep 14 16:35:14 2010 +0200 sched: Fix user time incorrectly accounted as system time on 32-bit commit e75e863dd5c7d96b91ebbd241da5328fc38a78cc upstream. We have 32-bit variable overflow possibility when multiply in task_times() and thread_group_times() functions. 
When the overflow happens, the scaled utime value becomes erroneously small and the scaled stime becomes erroneously big. Reported here: https://bugzilla.redhat.com/show_bug.cgi?id=633037 https://bugzilla.kernel.org/show_bug.cgi?id=16559 Reported-by: Michael Chapman Reported-by: Ciriaco Garcia de Celis Signed-off-by: Stanislaw Gruszka Signed-off-by: Peter Zijlstra Cc: Hidetoshi Seto LKML-Reference: <20100914143513.GB8415@redhat.com> Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit 6b7b329a3112648aa31e2d05db61f9bfd4a9b0b6 Author: Paul E. McKenney Date: Tue Aug 31 17:00:18 2010 -0700 pid: make setpgid() system call use RCU read-side critical section commit 950eaaca681c44aab87a46225c9e44f902c080aa upstream. [ 23.584719] [ 23.584720] =================================================== [ 23.585059] [ INFO: suspicious rcu_dereference_check() usage. ] [ 23.585176] --------------------------------------------------- [ 23.585176] kernel/pid.c:419 invoked rcu_dereference_check() without protection!
[ 23.585176] [ 23.585176] other info that might help us debug this: [ 23.585176] [ 23.585176] [ 23.585176] rcu_scheduler_active = 1, debug_locks = 1 [ 23.585176] 1 lock held by rc.sysinit/728: [ 23.585176] #0: (tasklist_lock){.+.+..}, at: [] sys_setpgid+0x5f/0x193 [ 23.585176] [ 23.585176] stack backtrace: [ 23.585176] Pid: 728, comm: rc.sysinit Not tainted 2.6.36-rc2 #2 [ 23.585176] Call Trace: [ 23.585176] [] lockdep_rcu_dereference+0x99/0xa2 [ 23.585176] [] find_task_by_pid_ns+0x50/0x6a [ 23.585176] [] find_task_by_vpid+0x1d/0x1f [ 23.585176] [] sys_setpgid+0x67/0x193 [ 23.585176] [] system_call_fastpath+0x16/0x1b [ 24.959669] type=1400 audit(1282938522.956:4): avc: denied { module_request } for pid=766 comm="hwclock" kmod="char-major-10-135" scontext=system_u:system_r:hwclock_t:s0 tcontext=system_u:system_r:kernel_t:s0 tclas It turns out that the setpgid() system call fails to enter an RCU read-side critical section before doing a PID-to-task_struct translation. This commit therefore does rcu_read_lock() before the translation, and also does rcu_read_unlock() after the last use of the returned pointer. Reported-by: Andrew Morton Signed-off-by: Paul E. McKenney Acked-by: David Howells Cc: Jiri Slaby Cc: Oleg Nesterov Signed-off-by: Greg Kroah-Hartman commit ab0b42d8a04ce4c767c5c39a1cab1ef1a8289905 Author: Dan Carpenter Date: Fri Sep 10 01:56:16 2010 +0000 net/llc: make opt unsigned in llc_ui_setsockopt() commit 339db11b219f36cf7da61b390992d95bb6b7ba2e upstream. The members of struct llc_sock are unsigned so if we pass a negative value for "opt" it can cause a sign bug. Also it can cause an integer overflow when we multiply "opt * HZ". Signed-off-by: Dan Carpenter Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit e220aa2dd5c106fbeb97558d68475b84d8fbd12a Author: Dan Carpenter Date: Mon Sep 6 14:32:30 2010 +0200 Staging: vt6655: fix buffer overflow commit dd173abfead903c7df54e977535973f3312cd307 upstream. 
"param->u.wpa_associate.wpa_ie_len" comes from the user. We should check it so that the copy_from_user() doesn't overflow the buffer. Also further down in the function, we assume that if "param->u.wpa_associate.wpa_ie_len" is set then "abyWPAIE[0]" is initialized. To make that work, I changed the test here to say that if "wpa_ie_len" is set then "wpa_ie" has to be a valid pointer or we return -EINVAL. Oddly, we only use the first element of the abyWPAIE[] array. So I suspect there may be some other issues in this function. Signed-off-by: Dan Carpenter Signed-off-by: Greg Kroah-Hartman commit b00df6475ff298aecca148e4812675aadab0f859 Author: Andy Gospodarek Date: Fri Sep 10 11:43:20 2010 +0000 bonding: correctly process non-linear skbs commit ab12811c89e88f2e66746790b1fe4469ccb7bdd9 upstream. It was recently brought to my attention that 802.3ad mode bonds would no longer form when using some network hardware after a driver update. After snooping around I realized that the particular hardware was using page-based skbs and found that skb->data did not contain a valid LACPDU as it was not stored there. That explained the inability to form an 802.3ad-based bond. For balance-alb mode bonds this was also an issue as ARPs would not be properly processed. This patch fixes the issue in my tests and should be applied to 2.6.36 and as far back as anyone cares to add it to stable. Thanks to Alexander Duyck and Jesse Brandeburg for the suggestions on this one. Signed-off-by: Andy Gospodarek CC: Alexander Duyck CC: Jesse Brandeburg Signed-off-by: Jay Vosburgh Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit ab5fc422d50d6e43faed1d757be1deb2ba2200e0 Author: Dan Rosenberg Date: Wed Sep 15 11:43:04 2010 +0000 drivers/net/eql.c: prevent reading uninitialized stack memory commit 44467187dc22fdd33a1a06ea0ba86ce20be3fe3c upstream. Fixed formatting (tabs and line breaks). 
The EQL_GETMASTRCFG device ioctl allows unprivileged users to read 16 bytes of uninitialized stack memory, because the "master_name" member of the master_config_t struct declared on the stack in eql_g_master_cfg() is not altered or zeroed before being copied back to the user. This patch takes care of it. Signed-off-by: Dan Rosenberg Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit c99489058f070b580faf1093a448d47fbd676489 Author: Dan Rosenberg Date: Wed Sep 15 11:43:12 2010 +0000 drivers/net/cxgb3/cxgb3_main.c: prevent reading uninitialized stack memory commit 49c37c0334a9b85d30ab3d6b5d1acb05ef2ef6de upstream. Fixed formatting (tabs and line breaks). The CHELSIO_GET_QSET_NUM device ioctl allows unprivileged users to read 4 bytes of uninitialized stack memory, because the "addr" member of the ch_reg struct declared on the stack in cxgb_extension_ioctl() is not altered or zeroed before being copied back to the user. This patch takes care of it. Signed-off-by: Dan Rosenberg Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit adf62df3786df2525f0817c8847d609bf72113a3 Author: Dan Rosenberg Date: Wed Sep 15 11:43:28 2010 +0000 drivers/net/usb/hso.c: prevent reading uninitialized memory commit 7011e660938fc44ed86319c18a5954e95a82ab3e upstream. Fixed formatting (tabs and line breaks). The TIOCGICOUNT device ioctl allows unprivileged users to read uninitialized stack memory, because the "reserved" member of the serial_icounter_struct struct declared on the stack in hso_get_count() is not altered or zeroed before being copied back to the user. This patch takes care of it. Signed-off-by: Dan Rosenberg Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 710acfd8d1b88d5b6761f5b5141dbaca08254046 Author: David S. Miller Date: Sun Sep 19 17:50:44 2010 -0700 sparc64: Get rid of indirect p1275 PROM call buffer. 
[ Upstream commit 25edd6946a1d74e5e77813c2324a0908c68bcf9e ] This is based upon a report by Meelis Roos showing that it's possible that we'll try to fetch a property that is 32K in size with some devices. With the current fixed 3K buffer we use for moving data in and out of the firmware during PROM calls, that simply won't work. In fact, it will scramble random kernel data during bootup. The reasoning behind the temporary buffer is entirely historical. It used to be the case that we had problems referencing dynamic kernel memory (including the stack) early in the boot process before we explicitly told the firmware to switch us over to the kernel trap table. So what we did was always give the firmware buffers that were locked into the main kernel image. But we no longer have problems like that, so get rid of all of this indirect bounce buffering. Besides fixing Meelis's bug, this also makes the kernel data about 3K smaller. It was also discovered during these conversions that the implementation of prom_retain() was completely wrong, so that was fixed here as well. Currently that interface is not in use. Reported-by: Meelis Roos Tested-by: Meelis Roos Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 0bc4d2f86c7c6bf4bc8dab446e184744cdee51d0 Author: Timo Teräs Date: Wed Jun 9 17:31:48 2010 -0700 r8169: fix mdio_read and update mdio_write according to hw specs [ Upstream commit 81a95f049962ec20a9aed888e676208b206f0f2e ] Realtek confirmed that a 20us delay is needed after mdio_read and mdio_write operations. Reduce the delay in mdio_write, and add it to mdio_read too. Also add a comment that the 20us is from hw specs. Signed-off-by: Timo Teräs Acked-by: Francois Romieu Signed-off-by: David S.
Miller Signed-off-by: Greg Kroah-Hartman commit 9ab48c2ea9806947d8f5fcadb0c47767c29f175c Author: Timo Teräs Date: Sun Jun 6 15:38:47 2010 -0700 r8169: fix random mdio_write failures [ Upstream commit 024a07bacf8287a6ddfa83e9d5b951c5e8b4070e ] Some configurations need a delay between the "write completed" indication and a new write to work reliably. The Realtek driver seems to use a longer delay when polling the "write complete" bit, so it waits long enough between writes with high probability (but could probably break too). This patch adds a new udelay to make sure we wait unconditionally some time after the write complete indication. This caused a regression with XID 18000000 boards when the board specific phy configuration writing many mdio registers was added in commit 2e955856ff (r8169: phy init for the 8169scd). Some of the configuration mdio writes would almost always fail, and depending on the failure might leave the PHY in a non-working state. Signed-off-by: Timo Teräs Acked-off-by: Francois Romieu Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 0987a43dadd8a45843c906d674221b180488e42d Author: Tetsuo Handa Date: Sat Sep 4 01:34:28 2010 +0000 UNIX: Do not loop forever at unix_autobind(). [ Upstream commit a9117426d0fcc05a194f728159a2d43df43c7add ] We assumed that unix_autobind() never fails if kzalloc() succeeded. But unix_autobind() allows only 1048576 names. If /proc/sys/fs/file-max is larger than 1048576 (e.g. systems with more than 10GB of RAM), a local user can consume all names using fork()/socket()/bind(). If all names are in use, those who call bind() with addr_len == sizeof(short) or connect()/sendmsg() with setsockopt(SO_PASSCRED) will keep spinning in the while (1) yield(); loop in unix_autobind() until a name becomes available. This patch adds a loop counter in order to give up after 1048576 attempts.
Calling yield() once per 256 attempts may not be sufficient when many names are already in use, because __unix_find_socket_byname() can take a long time under such circumstances. Therefore, this patch also adds a cond_resched() call. Note that currently a local user can consume 2GB of kernel memory if the user is allowed to create and autobind 1048576 UNIX domain sockets. We should consider adding some restriction for autobind operation. Signed-off-by: Tetsuo Handa Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 36f2140e8f572790a71899e9db7ff6f2986987cd Author: Alexey Kuznetsov Date: Wed Sep 15 10:27:52 2010 -0700 tcp: Prevent overzealous packetization by SWS logic. [ Upstream commit 01f83d69844d307be2aa6fea88b0e8fe5cbdb2f4 ] If peer uses tiny MSS (say, 75 bytes) and similarly tiny advertised window, the SWS logic will packetize to half the MSS unnecessarily. This causes problems with some embedded devices. However, for large MSS devices we do want to half-MSS packetize, otherwise we never get enough packets into the pipe for things like fast retransmit and recovery to work. Be careful also to handle the case where MSS > window, otherwise we'll never send until the probe timer. Reported-by: ツ Leandro Melo de Sales Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit f99150007ae1ff46aefbd5c96cf22af0cdc34563 Author: Eric Dumazet Date: Mon Aug 16 03:25:00 2010 +0000 rds: fix a leak of kernel memory [ Upstream commit f037590fff3005ce8a1513858d7d44f50053cc8f ] struct rds_rdma_notify contains a 32-bit hole on 64-bit arches; make sure it is zeroed before copying it to user. Signed-off-by: Eric Dumazet CC: Andy Grover Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 12fc5c218002041cf2d3be3a5fd26fad993c6fbc Author: Steven J.
Magnani Date: Tue Mar 30 13:56:01 2010 -0700 net: Fix oops from tcp_collapse() when using splice() [ Upstream commit baff42ab1494528907bf4d5870359e31711746ae ] tcp_read_sock() can eat skbs without immediately advancing copied_seq. This can cause a panic in tcp_collapse() if it is called as a result of the recv_actor dropping the socket lock. A userspace program that splices data from a socket to either another socket or to a file can trigger this bug. Signed-off-by: Steven J. Magnani Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit c9d740b86fee3c2be9eaf3d7728c15c0e03f4d83 Author: David S. Miller Date: Sun Sep 19 21:45:29 2010 -0700 bridge: Clear INET control block of SKBs passed into ip_fragment(). [ Upstream commit 4ce6b9e1621c187a32a47a17bf6be93b1dc4a3df ] In a similar vein to commit 17762060c25590bfddd68cc1131f28ec720f405f ("bridge: Clear IPCB before possible entry into IP stack") Any time we call into the IP stack we have to make sure the state there is as expected by the ipv4 code. With help from Eric Dumazet and Herbert Xu. Reported-by: Brandan Das Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit aa018b2a1b0bfb81517380aa651cf8271e4c7451 Author: Herbert Xu Date: Mon Jul 5 21:29:28 2010 +0000 bridge: Clear IPCB before possible entry into IP stack [ Upstream commit 17762060c25590bfddd68cc1131f28ec720f405f ] The bridge protocol lives dangerously by having incestuous relations with the IP stack. In this instance an abomination has been created where a bogus IPCB area from a bridged packet leads to a crash in the IP stack because it's interpreted as IP options. This patch papers over the problem by clearing the IPCB area in that particular spot. To fix this properly we'd also need to parse any IP options if present but I'm way too lazy for that. Signed-off-by: Herbert Xu Signed-off-by: David S.
Miller Signed-off-by: Greg Kroah-Hartman commit f535c2e115f90cf18408d91f11f7c5585bc38ef0 Author: Eric Dumazet Date: Sun Sep 19 21:38:12 2010 -0700 tcp: fix three tcp sysctls tuning [ Upstream commit c5ed63d66f24fd4f7089b5a6e087b0ce7202aa8e ] As discovered by Anton Blanchard, current code to autotune tcp_death_row.sysctl_max_tw_buckets, sysctl_tcp_max_orphans and sysctl_max_syn_backlog makes little sense. The bigger a page is, the smaller tcp_max_orphans is: 4096 on a 512GB machine in Anton's case. (tcp_hashinfo.bhash_size * sizeof(struct inet_bind_hashbucket)) is much bigger if spinlock debugging is on. It's wrong to select bigger limits in this case (where kernel structures are also bigger). bhash_size max is 65536, and we get this value even for small machines. A better basis is the size of the ehash table; this also makes the code shorter and more obvious. Based on a patch from Anton, and another from David. Reported-and-tested-by: Anton Blanchard Signed-off-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit a89d316f2bdf5df8f9a109d32f0e7b0940041e7f Author: David S. Miller Date: Wed Aug 25 02:27:49 2010 -0700 tcp: Combat per-cpu skew in orphan tests. [ Upstream commit ad1af0fedba14f82b240a03fe20eb9b2fdbd0357 ] As reported by Anton Blanchard when we use percpu_counter_read_positive() to make our orphan socket limit checks, the check can be off by up to num_cpus_online() * batch (which is 32 by default) which on a 128 cpu machine can be as large as the default orphan limit itself. Fix this by doing the full expensive sum check if the optimized check triggers. Reported-by: Anton Blanchard Signed-off-by: David S.
Miller Acked-by: Eric Dumazet Signed-off-by: Greg Kroah-Hartman commit ebde7b9827da782d1fd0a51cdbac81b589552494 Author: KOSAKI Motohiro Date: Tue Aug 24 16:05:48 2010 +0000 tcp: select(writefds) don't hang up when a peer close connection [ Upstream commit d84ba638e4ba3c40023ff997aa5e8d3ed002af36 ] This issue comes from the Ruby language community. The test program below hangs only when run on Linux. % uname -mrsv Linux 2.6.26-2-486 #1 Sat Dec 26 08:37:39 UTC 2009 i686 % ruby -rsocket -ve ' BasicSocket.do_not_reverse_lookup = true serv = TCPServer.open("127.0.0.1", 0) s1 = TCPSocket.open("127.0.0.1", serv.addr[1]) s2 = serv.accept s2.close s1.write("a") rescue p $! s1.write("a") rescue p $! Thread.new { s1.write("a") }.join' ruby 1.9.3dev (2010-07-06 trunk 28554) [i686-linux] # [Hang Here] FreeBSD, Solaris, and Mac don't hang. This happens because Ruby's write() method calls select() internally, and tcp_poll() has a bug. SUS defines 'ready for writing' for select() as follows. | A descriptor shall be considered ready for writing when a call to an output | function with O_NONBLOCK clear would not block, whether or not the function | would transfer data successfully. That is, the EPIPE situation clearly counts as 'ready for writing'. We don't have a read-side issue because tcp_poll() already handles read-side shutdown. | if (sk->sk_shutdown & RCV_SHUTDOWN) | mask |= POLLIN | POLLRDNORM | POLLRDHUP; So let's insert the same logic on the write side. - reference url http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-core/31065 http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-core/31068 Signed-off-by: KOSAKI Motohiro Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 8e75a0d65dce9d24b551d4a5ccbe00a23771583a Author: David S. Miller Date: Sun Sep 19 17:56:19 2010 -0700 irda: Correctly clean up self->ias_obj on irda_bind() failure.
[ Upstream commit 628e300cccaa628d8fb92aa28cb7530a3d5f2257 ] If irda_open_tsap() fails, the irda_bind() code tries to destroy the ->ias_obj object by hand, but does so wrongly. In particular, it fails to a) release the hashbin attached to the object and b) reset the self->ias_obj pointer to NULL. Fix both problems by using irias_delete_object() and explicitly setting self->ias_obj to NULL, just as irda_release() does. Reported-by: Tavis Ormandy Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 602b219309ced41c9787976230e5536c1796c720 Author: Jarek Poplawski Date: Sat Sep 4 10:34:29 2010 +0000 gro: Re-fix different skb headrooms [ Upstream commit 64289c8e6851bca0e589e064c9a5c9fbd6ae5dd4 ] The patch: "gro: fix different skb headrooms" in its part: "2) allocate a minimal skb for head of frag_list" is buggy. The copied skb has p->data set at the ip header at the moment, and skb_gro_offset is the length of ip + tcp headers. So, after the change the length of mac header is skipped. Later skb_set_mac_header() sets it into the NET_SKB_PAD area (if it's long enough) and ip header is misaligned at NET_SKB_PAD + NET_IP_ALIGN offset. There is no reason to assume the original skb was wrongly allocated, so let's copy it as it was. bugzilla : https://bugzilla.kernel.org/show_bug.cgi?id=16626 fixes commit: 3d3be4333fdf6faa080947b331a6a19bce1a4f57 Reported-by: Plamen Petrov Signed-off-by: Jarek Poplawski CC: Eric Dumazet Acked-by: Eric Dumazet Tested-by: Plamen Petrov Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit c69a7edf623d417fd389507281752722bbbeebca Author: Eric Dumazet Date: Wed Sep 1 00:50:51 2010 +0000 gro: fix different skb headrooms [ Upstream commit 3d3be4333fdf6faa080947b331a6a19bce1a4f57 ] Packets entering GRO might have different headrooms, even for a given flow (because of implementation details in drivers, like copybreak). We can't force drivers to deliver packets with a fixed headroom.
1) fix skb_segment() skb_segment() makes the false assumption that the headrooms of fragments are the same as that of the head. When CHECKSUM_PARTIAL is used, this can give csum_start errors, and crash later in skb_copy_and_csum_dev() 2) allocate a minimal skb for head of frag_list skb_gro_receive() uses netdev_alloc_skb(headroom + skb_gro_offset(p)) to allocate a fresh skb. This adds NET_SKB_PAD to a padding already provided by netdevice, depending on various things, like copybreak. Use alloc_skb() to allocate an exact padding, to reduce cache line needs: NET_SKB_PAD + NET_IP_ALIGN bugzilla : https://bugzilla.kernel.org/show_bug.cgi?id=16626 Many thanks to Plamen Petrov for testing many debugging patches! With help of Jarek Poplawski. Reported-by: Plamen Petrov Signed-off-by: Eric Dumazet CC: Jarek Poplawski Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 79faa9a964ce53d43cc5b63538f289ae83f77f28 Author: David S. Miller Date: Wed Mar 3 02:30:37 2010 -0800 sparc: Provide io{read,write}{16,32}be(). [ Upstream commit 1bff4dbb79a2bc0ee4881c8ea6a4fbed64ea6309 ] Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit f2ba1916d33431e42f4ef440c903aeb5dbac62fb Author: Dan Rosenberg Date: Wed Sep 15 17:44:16 2010 -0400 USB: serial/mos*: prevent reading uninitialized stack memory commit a0846f1868b11cd827bdfeaf4527d8b1b1c0b098 upstream. The TIOCGICOUNT device ioctl in both mos7720.c and mos7840.c allows unprivileged users to read uninitialized stack memory, because the "reserved" member of the serial_icounter_struct struct declared on the stack is not altered or zeroed before being copied back to the user. This patch takes care of it. Signed-off-by: Dan Rosenberg Signed-off-by: Greg Kroah-Hartman
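Several commits in this log fix the same class of bug: a struct is declared on the stack, only some members are filled in, and the whole struct is copied back to the user along with whatever the "reserved" members happened to contain. The userspace sketch below illustrates the general remedy of zeroing the struct before use. It is not the code from any of these patches; struct icount_reply, copy_out() and get_icount() are hypothetical stand-ins (copy_out() plays the role that copy_to_user() plays in the kernel).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a struct such as serial_icounter_struct:
 * a few live counters plus members reserved for future use. */
struct icount_reply {
    int rx;
    int tx;
    int reserved[4];
};

/* Hypothetical stand-in for copy_to_user(): copies the whole struct,
 * including any bytes the caller never wrote. */
static void copy_out(void *dst, const struct icount_reply *src)
{
    memcpy(dst, src, sizeof(*src));
}

/* Build the reply on the stack and hand it to the caller's buffer. */
static void get_icount(void *user_buf, int rx, int tx)
{
    struct icount_reply reply;

    assert(user_buf != NULL);

    /* Zero everything first so reserved[] and any padding cannot
     * carry stale stack contents out to the caller. */
    memset(&reply, 0, sizeof(reply));
    reply.rx = rx;
    reply.tx = tx;

    copy_out(user_buf, &reply);
}
```

Clearing the whole struct with memset() rather than assigning each reserved member individually also covers compiler-inserted padding, which is the case the rds commit above describes.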