commit afc84dacd12c94d2ade2fbc45fa2e4b57da37b65 Author: Greg Kroah-Hartman Date: Wed Oct 8 20:24:05 2008 -0700 Linux 2.6.26.6 commit 34f3c11bc4d09fe7d3b105b5e4e6127dc4d8ee24 Author: Jarod Wilson Date: Tue Sep 9 12:38:56 2008 +0200 S390: CVE-2008-1514: prevent ptrace padding area read/write in 31-bit mode commit 3d6e48f43340343d97839eadb1ab7b6a3ea98797 upstream When running a 31-bit ptrace, on either an s390 or s390x kernel, reads and writes into a padding area in struct user_regs_struct32 will result in a kernel panic. This is also known as CVE-2008-1514. Test case available here: http://sources.redhat.com/cgi-bin/cvsweb.cgi/~checkout~/tests/ptrace-tests/tests/user-area-padding.c?cvsroot=systemtap Steps to reproduce: 1) wget the above 2) gcc -o user-area-padding-31bit user-area-padding.c -Wall -ggdb2 -D_GNU_SOURCE -m31 3) ./user-area-padding-31bit Test status ----------- Without patch, both s390 and s390x kernels panic. With patch, the test case, as well as the gdb testsuite, pass without incident, padding area reads returning zero, writes ignored. Nb: original version returned -EINVAL on write attempts, which broke the gdb test and made the test case slightly unhappy, Jan Kratochvil suggested the change to return 0 on write attempts. Signed-off-by: Jarod Wilson Tested-by: Jan Kratochvil Signed-off-by: Martin Schwidefsky Cc: Moritz Muehlenhoff Signed-off-by: Greg Kroah-Hartman commit 553d7dd7336a3c1f3dd12085b5c42451c17225e1 Author: Balbir Singh Date: Sun Oct 5 17:43:37 2008 +0100 mm owner: fix race between swapoff and exit [Here's a backport of 2.6.27-rc8's 31a78f23bac0069004e69f98808b6988baccb6b6 to 2.6.26 or 2.6.26.5: I wouldn't trouble -stable for the (root only) swapoff case which uncovered the bug, but the /proc/<pid>/ case is open to all, so I think worth plugging in the next 2.6.26-stable. - Hugh] There's a race between mm->owner assignment and swapoff, more easily seen when task slab poisoning is turned on. 
The condition occurs when try_to_unuse() runs in parallel with an exiting task. A similar race can occur with callers of get_task_mm(), such as /proc/<pid>/ or ptrace or page migration.

CPU0                                    CPU1
                                        try_to_unuse
                                        looks at mm = task0->mm
                                        increments mm->mm_users
task 0 exits
mm->owner needs to be updated, but no
new owner is found (mm_users > 1, but
no other task has task->mm = task0->mm)
mm_update_next_owner() leaves
                                        mmput(mm) decrements mm->mm_users
task0 freed
                                        dereferencing mm->owner fails

The fix is to notify the subsystem via mm_owner_changed callback(), if no new owner is found, by specifying the new task as NULL. Jiri Slaby: mm->owner was set to NULL prior to calling cgroup_mm_owner_callbacks(), but must be set after that, so as not to pass NULL as old owner causing oops. Daisuke Nishimura: mm_update_next_owner() may set mm->owner to NULL, but mem_cgroup_from_task() and its callers need to take account of this situation to avoid oops. Hugh Dickins: Lockdep warning and hang below exec_mmap() when testing these patches. exit_mm() up_reads mmap_sem before calling mm_update_next_owner(), so exec_mmap() now needs to do the same. And with that repositioning, there's now no point in mm_need_new_owner() allowing for NULL mm. Reported-by: Hugh Dickins Signed-off-by: Balbir Singh Signed-off-by: Jiri Slaby Signed-off-by: Daisuke Nishimura Signed-off-by: Hugh Dickins Cc: KAMEZAWA Hiroyuki Cc: Paul Menage Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit eb07718d62cfd8da699a8127110fbb9fa5a18663 Author: Marcin Slusarz Date: Sat Oct 4 01:25:03 2008 +0000 rtc: fix kernel panic on second use of SIGIO nofitication commit 2e4a75cdcb89ff53bb182dda3a6dcdc14befe007 upstream When userspace uses SIGIO notification and forgets to disable it before closing the file descriptor, rtc->async_queue contains a stale pointer to struct file. 
When user space enables again SIGIO notification in different process, kernel dereferences this (poisoned) pointer and crashes. So disable SIGIO notification on close. Kernel panic: (second run of qemu (requires echo 1024 > /sys/class/rtc/rtc0/max_user_freq)) general protection fault: 0000 [1] PREEMPT CPU 0 Modules linked in: af_packet snd_pcm_oss snd_mixer_oss snd_seq_oss snd_seq_midi_event snd_seq usbhid tuner tea5767 tda8290 tuner_xc2028 xc5000 tda9887 tuner_simple tuner_types mt20xx tea5761 tda9875 uhci_hcd ehci_hcd usbcore bttv snd_via82xx snd_ac97_codec ac97_bus snd_pcm snd_timer ir_common compat_ioctl32 snd_page_alloc videodev v4l1_compat snd_mpu401_uart snd_rawmidi v4l2_common videobuf_dma_sg videobuf_core snd_seq_device snd btcx_risc soundcore tveeprom i2c_viapro Pid: 5781, comm: qemu-system-x86 Not tainted 2.6.27-rc6 #363 RIP: 0010:[] [] __lock_acquire+0x3db/0x73f RSP: 0000:ffffffff80674cb8 EFLAGS: 00010002 RAX: ffff8800224c62f0 RBX: 0000000000000046 RCX: 0000000000000002 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8800224c62f0 RBP: ffffffff80674d08 R08: 0000000000000002 R09: 0000000000000001 R10: ffffffff80238941 R11: 0000000000000001 R12: 0000000000000000 R13: 6b6b6b6b6b6b6b6b R14: ffff88003a450080 R15: 0000000000000000 FS: 00007f98b69516f0(0000) GS:ffffffff80623200(0000) knlGS:00000000f7cc86d0 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000a87000 CR3: 0000000022598000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process qemu-system-x86 (pid: 5781, threadinfo ffff880028812000, task ffff88003a450080) Stack: ffffffff80674cf8 0000000180238440 0000000200000002 0000000000000000 ffff8800224c62f0 0000000000000046 0000000000000000 0000000000000002 0000000000000002 0000000000000000 ffffffff80674d68 ffffffff8024fc7a Call Trace: [] lock_acquire+0x85/0xa9 [] ? send_sigio+0x2a/0x184 [] _read_lock+0x3e/0x4a [] ? 
send_sigio+0x2a/0x184 [] send_sigio+0x2a/0x184 [] ? __lock_acquire+0x6e1/0x73f [] ? kill_fasync+0x2c/0x4e [] __kill_fasync+0x54/0x65 [] kill_fasync+0x3a/0x4e [] rtc_update_irq+0x9c/0xa5 [] cmos_interrupt+0xae/0xc0 [] handle_IRQ_event+0x25/0x5a [] handle_edge_irq+0xdd/0x123 [] do_IRQ+0xe4/0x144 [] ret_from_intr+0x0/0xf [] ? __alloc_pages_internal+0xe7/0x3ad [] ? clear_page_c+0x7/0x10 [] ? get_page_from_freelist+0x385/0x450 [] ? __alloc_pages_internal+0xe7/0x3ad [] ? anon_vma_prepare+0x2e/0xf6 [] ? handle_mm_fault+0x227/0x6a5 [] ? do_page_fault+0x494/0x83f [] ? error_exit+0x0/0xa9 Code: cc 41 39 45 28 74 24 e8 5e 1d 0f 00 85 c0 0f 84 6a 03 00 00 83 3d 8f a9 aa 00 00 be 47 03 00 00 0f 84 6a 02 00 00 e9 53 03 00 00 <41> ff 85 38 01 00 00 45 8b be 90 06 00 00 41 83 ff 2f 76 24 e8 RIP [] __lock_acquire+0x3db/0x73f RSP ---[ end trace 431877d860448760 ]--- Kernel panic - not syncing: Aiee, killing interrupt handler! Signed-off-by: Marcin Slusarz Acked-by: Alessandro Zummo Acked-by: David Brownell Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit be38e82a6675bf9ee6a750f32683159c8b5ab1e5 Author: David Winn Date: Fri Oct 3 01:46:02 2008 +0000 fbcon: fix monochrome color value calculation commit 08650869e0ec581f8d88cfdb563d37f5383abfe2 upstream Commit 22af89aa0c0b4012a7431114a340efd3665a7617 ("fbcon: replace mono_col macro with static inline") changed the order of operations for computing monochrome color values. This generates 0xffff000f instead of 0x0000000f for a 4 bit monochrome color, leading to image corruption if it is passed to cfb_imageblit or other similar functions. Fix it up. Cc: Harvey Harrison Cc: "Antonino A. 
Daplas" Cc: Krzysztof Helt Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit ff37b8e1ac5c7c0c663526d1c42a8ce3f9b9386b Author: Risto Suominen Date: Thu Oct 2 22:55:15 2008 +0000 ALSA: snd-powermac: HP detection for 1st iMac G3 SL commit 030b655b062fe5190fc490e0091ea50307d7a86f upstream Correct headphone detection for 1st generation iMac G3 Slot-loading (Screamer). This patch fixes the regression in the recent snd-powermac which doesn't support some G3/G4 PowerMacs: http://lkml.org/lkml/2008/10/1/220 Signed-off-by: Risto Suominen Tested-by: Mariusz Kozlowski Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 0433c92cb3490c6daf3e313484bd5bf45e22b0bb Author: Risto Suominen Date: Thu Oct 2 22:55:18 2008 +0000 ALSA: snd-powermac: mixers for PowerMac G4 AGP commit 4dbf95ba6c344186ec6d38ff514dc675da464bec upstream Add mixer controls for PowerMac G4 AGP (Screamer). This patch fixes the regression in the recent snd-powermac which doesn't support some G3/G4 PowerMacs: http://lkml.org/lkml/2008/10/1/220 Signed-off-by: Risto Suominen Tested-by: Mariusz Kozlowski Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit c6b06fdb17a6467fa17b18a41c8d8147f4fb64e0 Author: Pascal Terjan Date: Fri Oct 3 01:45:55 2008 +0000 braille_console: only register notifiers when the braille console is used commit c0c9209ddd96bc4f1d70a8b9958710671e076080 upstream Only register the braille driver VT and keyboard notifiers when the braille console is used. Avoids eating insert or backspace keys. Addresses http://bugzilla.kernel.org/show_bug.cgi?id=11242 Signed-off-by: Pascal Terjan Signed-off-by: Samuel Thibault Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Cc: Moritz Muehlenhoff Signed-off-by: Greg Kroah-Hartman commit 88e399f0f57023d72dfe7f29d4f283e5462f000e Author: David S. Miller Date: Mon Sep 22 15:42:24 2008 -0700 sparc64: Fix missing devices due to PCI bridge test in of_create_pci_dev(). 
[ Upstream commit 44b50e5a1af13c605d6c3b17a60e42eb0ee48d5f ] Just like in the arch/sparc64/kernel/of_device.c code fix commit 071d7f4c3b411beae08d27656e958070c43b78b4 ("sparc64: Fix disappearing PCI devices on e3500.") we have to check the OF device node name for "pci" instead of relying upon the 'device_type' property being there on all PCI bridges. Tested by Meelis Roos, and confirmed to make the PCI QFE devices reappear on the E3500 system. Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit d78fdd8a0e39de2115ef051f4d33b4b5df476164 Author: David S. Miller Date: Sat Sep 20 22:00:40 2008 -0700 sparc64: Fix disappearing PCI devices on e3500. [ Upstream commit 7ee766d8fba9dfd93bf3eca7a8d84a25404a68dc ] Based upon a bug report by Meelis Roos. The OF device layer builds properties by matching bus types and applying 'range' properties as appropriate, up to the root. The match for "PCI" busses is looking at the 'device_type' property, and this does work 99% of the time. But on an E3500 system with a PCI QFE card, the DEC 21153 bridge sitting above the QFE network interface devices has a 'name' of "pci", but it completely lacks a 'device_type' property. So we don't match it as a PCI bus, and subsequently we end up with no resource values at all for the devices sitting under that DEC bridge. Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 28a65ba636ea471e15c8552faeb2be0eb179b385 Author: David S. Miller Date: Tue Sep 16 09:53:42 2008 -0700 sparc64: Fix OOPS in psycho_pcierr_intr_other(). [ Upstream commit f948cc6ab9e61a8e88d70ee9aafc690e6d26f92c ] We no longer put the top-level PCI controller device into the PCI layer device list. So pbm->pci_bus->self is always NULL. Therefore, use direct PCI config space accesses to get at the PCI controller's PCI_STATUS register. Tested by Meelis Roos. Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 284be31eccb95054e4e3c4eb9f86b9d4b562008a Author: David S. 
Miller Date: Wed Sep 10 14:08:27 2008 -0700 sparc64: Fix interrupt register calculations on Psycho and Sabre. [ Upstream commit ebfb2c63405f2410897674f14e41c031c9302909 ] Use the IMAP offset calculation for OBIO devices as documented in the programmer's manual. Which is "0x10000 + ((ino & 0x1f) << 3)" Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 24c5886b091f6d8f3c31d2b9a5793c6f56274fc2 Author: David S. Miller Date: Fri Sep 12 15:13:15 2008 -0700 sparc64: Fix PCI error interrupt registry on PSYCHO. [ Upstream commit 80a56ab626c70468be92e74cf3d288ffaed23fdb ] We need to pass IRQF_SHARED, otherwise we get things like: IRQ handler type mismatch for IRQ 33 current handler: PSYCHO_UE Call Trace: [000000000048394c] request_irq+0xac/0x120 [00000000007c5f6c] psycho_scan_bus+0x98/0x158 [00000000007c2bc0] pcibios_init+0xdc/0x12c [0000000000426a5c] do_one_initcall+0x1c/0x160 [00000000007c0180] kernel_init+0x9c/0xfc [0000000000427050] kernel_thread+0x30/0x60 [00000000006ae1d0] rest_init+0x10/0x60 on e3500 and similar systems. On a single board, the UE interrupts of two Psycho nodes are funneled through the same interrupt, from of_debug=3 dump: /pci@b,4000: direct translate 2ee --> 21 ... /pci@b,2000: direct translate 2ee --> 21 Decimal "33" mentioned above is the hex "21" mentioned here. Thanks to Meelis Roos for dumps and testing. Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit fc69b36cd5d05d78c7aa34fd490e8f156be9e5f6 Author: Herbert Xu Date: Mon Sep 15 11:48:46 2008 -0700 udp: Fix rcv socket locking [ Upstream commit 93821778def10ec1e69aa3ac10adee975dad4ff3 ] The previous patch in response to the recursive locking on IPsec reception is broken as it tries to drop the BH socket lock while in user context. This patch fixes it by shrinking the section protected by the socket lock to sock_queue_rcv_skb only. The only reason we added the lock is for the accounting which happens in that function. 
Signed-off-by: Herbert Xu Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit ce8fd8b97b424c7abb5123640e05bd0f3d292131 Author: Vlad Yasevich Date: Thu Sep 18 16:28:27 2008 -0700 sctp: Fix oops when INIT-ACK indicates that peer doesn't support AUTH [ Upstream commit add52379dde2e5300e2d574b172e62c6cf43b3d3 ] If INIT-ACK is received with a SupportedExtensions parameter which indicates that the peer does not support AUTH, the packet will be silently ignored, and sctp_process_init() will clean up all of the transports in the association. When the T1-Init timer expires, an OOPS happens while we try to choose a different init transport. The solution is to only clean up the non-active transports, i.e. the ones that the peer added. However, that introduces a problem with sctp_connectx(), because we don't mark the proper state for the transports provided by the user. So, we'll simply mark user-provided transports as ACTIVE. That will allow INIT retransmissions to work properly in the sctp_connectx() context and prevent the crash. Signed-off-by: Vlad Yasevich Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 43562861a5c17416151964c9a6c09a38fdda00a7 Author: Vlad Yasevich Date: Thu Sep 18 16:27:38 2008 -0700 sctp: do not enable peer features if we can't do them. [ Upstream commit 0ef46e285c062cbe35d60c0adbff96f530d31c86 ] Do not enable peer features like addip and auth, if they are administratively disabled locally. If the peer reports that it supports something that we don't, neither end can use it so enabling it is pointless. This solves a problem when talking to a peer that has auth and addip enabled while we do not. Found by Andrei Pelinescu-Onciul. Signed-off-by: Vlad Yasevich Signed-off-by: David S. 
Miller Signed-off-by: Greg Kroah-Hartman commit b047cf6dfa81ca03b62f2e3ae63793ef5c300158 Author: Herbert Xu Date: Tue Sep 30 02:03:19 2008 -0700 ipsec: Fix pskb_expand_head corruption in xfrm_state_check_space [ Upstream commit d01dbeb6af7a0848063033f73c3d146fec7451f3 ] We're never supposed to shrink the headroom or tailroom. In fact, shrinking the headroom is a fatal action. Signed-off-by: Herbert Xu Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 877755eb1c4e46b460ac1af9938dec6f9d528fc2 Author: Vegard Nossum Date: Thu Sep 11 19:05:29 2008 -0700 netlink: fix overrun in attribute iteration [ Upstream commit 1045b03e07d85f3545118510a587035536030c1c ] kmemcheck reported this: kmemcheck: Caught 16-bit read from uninitialized memory (f6c1ba30) 0500110001508abf050010000500000002017300140000006f72672e66726565 i i i i i i i i i i i i i u u u u u u u u u u u u u u u u u u u ^ Pid: 3462, comm: wpa_supplicant Not tainted (2.6.27-rc3-00054-g6397ab9-dirty #13) EIP: 0060:[] EFLAGS: 00010296 CPU: 0 EIP is at nla_parse+0x5a/0xf0 EAX: 00000008 EBX: fffffffd ECX: c06f16c0 EDX: 00000005 ESI: 00000010 EDI: f6c1ba30 EBP: f6367c6c ESP: c0a11e88 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 CR0: 8005003b CR2: f781cc84 CR3: 3632f000 CR4: 000006d0 DR0: c0ead9bc DR1: 00000000 DR2: 00000000 DR3: 00000000 DR6: ffff4ff0 DR7: 00000400 [] rtnl_setlink+0x63/0x130 [] rtnetlink_rcv_msg+0x165/0x200 [] netlink_rcv_skb+0x76/0xa0 [] rtnetlink_rcv+0x1e/0x30 [] netlink_unicast+0x281/0x290 [] netlink_sendmsg+0x1b9/0x2b0 [] sock_sendmsg+0xd2/0x100 [] sys_sendto+0xa5/0xd0 [] sys_send+0x36/0x40 [] sys_socketcall+0x1e6/0x2c0 [] sysenter_do_call+0x12/0x3f [] 0xffffffff This is the line in nla_ok(): /** * nla_ok - check if the netlink attribute fits into the remaining bytes * @nla: netlink attribute * @remaining: number of bytes remaining in attribute stream */ static inline int nla_ok(const struct nlattr *nla, int remaining) { return remaining >= sizeof(*nla) && nla->nla_len >= 
sizeof(*nla) && nla->nla_len <= remaining; }

It turns out that remaining can become negative due to alignment in nla_next(). But GCC promotes "remaining" to unsigned in the test against sizeof(*nla) above. Therefore the test succeeds, and the nla_for_each_attr() may access memory outside the received buffer. A short example illustrating this point is here:

	#include <stdio.h>

	int main(void)
	{
		/* -1 is converted to an unsigned type before the comparison */
		printf("%d\n", -1 >= sizeof(int));
		return 0;
	}

...which prints "1". This patch adds a cast in front of the sizeof so that GCC will make a signed comparison and fix the illegal memory dereference. With the patch applied, there is no kmemcheck report. Signed-off-by: Vegard Nossum Acked-by: Thomas Graf Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 99479c654ea71eeb156b94fc497fdadd5e4d440c Author: Santwona Behera Date: Fri Sep 12 16:04:26 2008 -0700 niu: panic on reset [ Upstream commit cff502a38394fd33693f6233e03fca363dfa956d ] The reset_task function in the niu driver does not reset the tx and rx buffers properly. This leads to panic on reset. This patch is a modified implementation of the previously posted fix. Signed-off-by: Santwona Behera Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 1e4c1698a4f5f177f9a1d83921e8341de0798968 Author: Neil Horman Date: Tue Sep 9 13:51:35 2008 -0700 ipv6: Fix OOPS in ip6_dst_lookup_tail(). [ Upstream commit e550dfb0c2c31b6363aa463a035fc9f8dcaa3c9b ] This fixes kernel bugzilla 11469: "TUN with 1024 neighbours: ip6_dst_lookup_tail NULL crash" dst->neighbour is not necessarily hooked up at this point in the processing path, so blindly dereferencing it is the wrong thing to do. This NULL check exists in other similar paths and this case was just an oversight. Also fix the completely wrong and confusing indentation here while we're at it. Based upon a patch by Evgeniy Polyakov. Signed-off-by: Neil Horman Signed-off-by: David S. 
Miller Signed-off-by: Greg Kroah-Hartman commit 9c44da042826e2db05f21a4d2fe8df468d82e24f Author: Arnaud Ebalard Date: Wed Oct 1 02:37:56 2008 -0700 XFRM,IPv6: initialize ip6_dst_blackhole_ops.kmem_cachep [ Upstream commit 5dc121e9a7a8a3721cefeb07f3559f50fbedc67e ] ip6_dst_blackhole_ops.kmem_cachep is not expected to be NULL (i.e. to be initialized) when dst_alloc() is called from ip6_dst_blackhole(). Otherwise, it results in the following (xfrm_larval_drop is now set to 1 by default): [ 78.697642] Unable to handle kernel paging request for data at address 0x0000004c [ 78.703449] Faulting instruction address: 0xc0097f54 [ 78.786896] Oops: Kernel access of bad area, sig: 11 [#1] [ 78.792791] PowerMac [ 78.798383] Modules linked in: btusb usbhid bluetooth b43 mac80211 cfg80211 ehci_hcd ohci_hcd sungem sungem_phy usbcore ssb [ 78.804263] NIP: c0097f54 LR: c0334a28 CTR: c002d430 [ 78.809997] REGS: eef19ad0 TRAP: 0300 Not tainted (2.6.27-rc5) [ 78.815743] MSR: 00001032 CR: 22242482 XER: 20000000 [ 78.821550] DAR: 0000004c, DSISR: 40000000 [ 78.827278] TASK = eef0df40[3035] 'mip6d' THREAD: eef18000 [ 78.827408] GPR00: 00001032 eef19b80 eef0df40 00000000 00008020 eef19c30 00000001 00000000 [ 78.833249] GPR08: eee5101c c05a5c10 ef9ad500 00000000 24242422 1005787c 00000000 1004f960 [ 78.839151] GPR16: 00000000 10024e90 10050040 48030018 0fe44150 00000000 00000000 eef19c30 [ 78.845046] GPR24: eef19e44 00000000 eef19bf8 efb37c14 eef19bf8 00008020 00009032 c0596064 [ 78.856671] NIP [c0097f54] kmem_cache_alloc+0x20/0x94 [ 78.862581] LR [c0334a28] dst_alloc+0x40/0xc4 [ 78.868451] Call Trace: [ 78.874252] [eef19b80] [c03c1810] ip6_dst_lookup_tail+0x1c8/0x1dc (unreliable) [ 78.880222] [eef19ba0] [c0334a28] dst_alloc+0x40/0xc4 [ 78.886164] [eef19bb0] [c03cd698] ip6_dst_blackhole+0x28/0x1cc [ 78.892090] [eef19be0] [c03d9be8] rawv6_sendmsg+0x75c/0xc88 [ 78.897999] [eef19cb0] [c038bca4] inet_sendmsg+0x4c/0x78 [ 78.903907] [eef19cd0] [c03207c8] sock_sendmsg+0xac/0xe4 [ 78.909734] 
[eef19db0] [c03209e4] sys_sendmsg+0x1e4/0x2a0 [ 78.915540] [eef19f00] [c03220a8] sys_socketcall+0xfc/0x210 [ 78.921406] [eef19f40] [c0014b3c] ret_from_syscall+0x0/0x38 [ 78.927295] --- Exception: c01 at 0xfe2d730 [ 78.927297] LR = 0xfe2d71c [ 78.939019] Instruction dump: [ 78.944835] 91640018 9144001c 900a0000 4bffff44 9421ffe0 7c0802a6 bf810010 7c9d2378 [ 78.950694] 90010024 7fc000a6 57c0045e 7c000124 <83e3004c> 8383005c 2f9f0000 419e0050 [ 78.956464] ---[ end trace 05fa1ed7972487a1 ]--- As commented by Benjamin Thery, the bug was introduced by f2fc6a54585a1be6669613a31fbaba2ecbadcd36, while adding network namespaces support to ipv6 routes. Signed-off-by: Arnaud Ebalard Acked-by: Benjamin Thery Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 1ead836b8d0f04dc441f29f0126a6a3b2cb574e6 Author: Timo Teras Date: Wed Oct 1 05:17:54 2008 -0700 af_key: Free dumping state on socket close [ Upstream commit 0523820482dcb42784572ffd2296c2f08c275a2b ] Fix a xfrm_{state,policy}_walk leak if pfkey socket is closed while dumping is on-going. Signed-off-by: Timo Teras Signed-off-by: David S. Miller commit 400f9f32043fbec8fc8de42fd9b9428b4557b19c Author: Alan Cox Date: Thu Oct 2 09:53:38 2008 +0200 pcmcia: Fix broken abuse of dev->driver_data [ Upstream commit: cec5eb7be3a104fffd27ca967ee8e15a123050e2 ] PCMCIA abuses dev->private_data in the probe methods. Unfortunately it continues to abuse it after calling drv->probe() which leads to crashes and other nasties (such as bogus probes of multifunction devices) giving errors like pcmcia: registering new device pcmcia0.1 kernel: 0.1: GetNextTuple: No more items Extract the passed data before calling the driver probe function that way we don't blow up when the driver reuses dev->private_data as its right. 
Signed-off-by: Alan Cox Signed-off-by: Dominik Brodowski Signed-off-by: Greg Kroah-Hartman commit bc3ac469af00b0e5b7799c127d00b6650fab5587 Author: Thomas Gleixner Date: Tue Sep 9 21:38:57 2008 +0200 clockevents: remove WARN_ON which was used to gather information commit 61c22c34c6f80a8e89cff5ff717627c54cc14fd4 upstream The issue of the endless reprogramming loop due to a too small min_delta_ns was fixed with the previous updates of the clock events code, but we had no information about the spread of this problem. I added a WARN_ON to get automated information via kerneloops.org and to get some direct reports, which allowed me to analyse the affected machines. The WARN_ON has served its purpose and would be annoying for a release kernel. Remove it and just keep the information about the increase of the min_delta_ns value. Signed-off-by: Thomas Gleixner Signed-off-by: Greg Kroah-Hartman commit e0d725a2b5770e9631b893c0ee37396569767de5 Author: Maciej W. Rozycki Date: Fri Sep 5 14:05:31 2008 -0700 ntp: fix calculation of the next jiffie to trigger RTC sync commit 4ff4b9e19a80b73959ebeb28d1df40176686f0a8 upstream We have a bug in the calculation of the next jiffie to trigger the RTC synchronisation. The aim here is to run sync_cmos_clock() as close as possible to the middle of a second. Which means we want this function to be called less than or equal to half a jiffie away from when now.tv_nsec equals 5e8 (500000000). If this is not the case for a given call to the function, for this purpose instead of updating the RTC we calculate the offset in nanoseconds to the next point in time where now.tv_nsec will be equal to 5e8. The calculated offset is then converted to jiffies as these are the unit used by the timer. However, timespec_to_jiffies() used here uses a ceil()-type rounding mode, where the resulting value is rounded up. 
As a result the range of now.tv_nsec when the timer will trigger is from 5e8 to 5e8 + TICK_NSEC rather than the desired 5e8 - TICK_NSEC / 2 to 5e8 + TICK_NSEC / 2. Consequently, if for example sync_cmos_clock() happens to be called at the time when now.tv_nsec is between 5e8 + TICK_NSEC / 2 and 5e8 + TICK_NSEC, it will simply be rescheduled HZ jiffies later, falling in the same range of now.tv_nsec again. Similarly for cases offset by an integer multiple of TICK_NSEC. This change addresses the problem by subtracting TICK_NSEC / 2 from the nanosecond offset to the next point in time where now.tv_nsec will be equal to 5e8, effectively shifting the following rounding in timespec_to_jiffies() so that it produces a rounded-to-nearest result. Signed-off-by: Maciej W. Rozycki Signed-off-by: Andrew Morton Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit 9c57bca1856eae14e62d363a35cb161edfa134e9 Author: Thomas Gleixner Date: Sat Sep 6 03:06:08 2008 +0200 x86: HPET: read back compare register before reading counter commit 72d43d9bc9210d24d09202eaf219eac09e17b339 upstream After fixing the u32 thinko I still had occasional hiccups on ATI chipsets with small deltas. There seems to be a delay between writing the compare register and the transfer to the internal register which triggers the interrupt. Reading back the value makes sure that it hit the internal match register before we compare against the counter value. Signed-off-by: Thomas Gleixner Signed-off-by: Greg Kroah-Hartman commit 45f9d5228563175bf2e340e1863f3c936a7d5888 Author: Thomas Gleixner Date: Sat Sep 6 03:03:32 2008 +0200 x86: HPET fix moronic 32/64bit thinko commit f7676254f179eac6b5244a80195ec8ae0e9d4606 upstream We use the HPET only in 32bit mode because: 1) some HPETs are 32bit only 2) on i386 there is no way to read/write the HPET atomically 64bit wide The HPET code unification done by the "moron of the year" did not take into account that unsigned long is different on 32 and 64 bit. 
This thinko results in a possible endless loop in the clockevents code, when the return comparison fails due to the 64bit/32bit unawareness. unsigned long cnt = (u32) hpet_read() + delta can wrap over 32bit, but the final compare will fail and return -ETIME, causing endless loops. Signed-off-by: Thomas Gleixner Signed-off-by: Greg Kroah-Hartman commit 92741d2d653769b582015c6a379e7b46e113435d Author: Thomas Gleixner Date: Sat Sep 6 03:01:45 2008 +0200 clockevents: broadcast fixup possible waiters commit 7300711e8c6824fcfbd42a126980ff50439d8dd0 upstream Until the C1E patches arrived there were no users of periodic broadcast before switching to oneshot mode. Now we need to trigger a possible waiter for a periodic broadcast when switching to oneshot mode. Otherwise we can starve them forever. Signed-off-by: Thomas Gleixner Signed-off-by: Greg Kroah-Hartman commit f8a5d65f576686312aee4cfa74bd50a002863927 Author: Thomas Gleixner Date: Wed Sep 3 21:37:24 2008 +0000 HPET: make minimum reprogramming delta useful commit 7cfb0435330364f90f274a26ecdc5f47f738498c upstream The minimum reprogramming delta was hardcoded in HPET ticks, which is stupid as it does not work with faster running HPETs. The C1E idle patches made this prominent on AMD/RS690 chipsets, where the HPET runs at 25MHz. Set it to 5us which seems to be a reasonable value and fixes the problems on the bug reporters' machines. We have a further sanity check now in the clock events, which increases the delta when it is not sufficient. Signed-off-by: Thomas Gleixner Tested-by: Luiz Fernando N. 
Capitulino Tested-by: Dmitry Nezhevenko Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit 9b4989324acb35a5ada4d52e13fd339e5da89762 Author: Thomas Gleixner Date: Wed Sep 3 21:37:14 2008 +0000 clockevents: prevent endless loop lockup commit 1fb9b7d29d8e85ba3196eaa7ab871bf76fc98d36 upstream The C1E/HPET bug reports on AMDX2/RS690 systems were tracked down to a too small value of the HPET minimum delta for programming an event. The clockevents code needs to enforce an interrupt event on the clock event device in some cases. The enforcement code was stupid and naive, as it just added the minimum delta to the current time and tried to reprogram the device. When the minimum delta is too small, then this loops forever. Add a sanity check. Allow reprogramming to fail 3 times, then print a warning and double the minimum delta value to make sure that this does not happen again. Use the same function for both tick-oneshot and tick-broadcast code. Signed-off-by: Thomas Gleixner Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit 7f0a673a75d3f9f50a64f83055b71be67526efd7 Author: Thomas Gleixner Date: Wed Sep 3 21:37:08 2008 +0000 clockevents: prevent multiple init/shutdown commit 9c17bcda991000351cb2373f78be7e4b1c44caa3 upstream While chasing the C1E/HPET bugreports I went through the clock events code inch by inch and found that the broadcast device can be initialized and shutdown multiple times. Multiple shutdowns are not critical, but a useless waste of time. Multiple initializations are simply broken. Another CPU might have the device in use already after the first initialization and the second init could just render it unusable again. 
Signed-off-by: Thomas Gleixner Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit ea16e1b4b005e8a574efce13fb57d0fdbc543d67 Author: Thomas Gleixner Date: Wed Sep 3 21:37:03 2008 +0000 clockevents: enforce reprogram in oneshot setup commit 7205656ab48da29a95d7f55e43a81db755d3cb3a upstream In tick_oneshot_setup we program the device to the given next_event, but we do not check the return value. We need to make sure that the device programming is enforced so the interrupt handler engine starts working. Split out the reprogramming function from tick_program_event() and call it with the device, which was handed in to tick_setup_oneshot(). Set the force argument, so the device fires an interrupt. Signed-off-by: Thomas Gleixner Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit cf25095cf6e21b9abe299d709835db2d6338b2b5 Author: Thomas Gleixner Date: Wed Sep 3 21:36:57 2008 +0000 clockevents: prevent endless loop in periodic broadcast handler commit d4496b39559c6d43f83e4c08b899984f8b8089b5 upstream The reprogramming of the periodic broadcast handler was broken, when the first programming returned -ETIME. The clockevents code stores the new expiry value in the clock events device next_event field only when the programming time has not elapsed yet. The loop in question calculates the new expiry value from the next_event value and therefore never increases. Signed-off-by: Thomas Gleixner Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit 2a2bac600a84eedf9d9dd6766232640876593856 Author: Venkatesh Pallipadi Date: Wed Sep 3 21:36:50 2008 +0000 clockevents: prevent clockevent event_handler ending up handler_noop commit 7c1e76897492d92b6a1c2d6892494d39ded9680c upstream There is an ordering-related problem with clockevents code, due to which clockevents_register_device() called after tickless/highres switch will not work. The new clockevent ends up with clockevents_handle_noop as event handler, resulting in no timer activity. 
The problematic path seems to be:
* old device already has hrtimer_interrupt as the event_handler
* new clockevent device registers with a higher rating
* tick_check_new_device() is called
* clockevents_exchange_device() gets called
* old->event_handler is set to clockevents_handle_noop
* tick_setup_device() is called for the new device
* which sets new->event_handler using the old->event_handler which is noop.

Change the ordering so that the new device inherits the proper handler. This does not cause any issue in the normal case, as most likely all the clockevent devices are set up before the highres switch. But it can potentially affect some corner cases where HPET force detect happens after the highres switch. This was a problem with HPET in MSI mode code that we have been experimenting with. Signed-off-by: Venkatesh Pallipadi Signed-off-by: Shaohua Li Signed-off-by: Thomas Gleixner Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit 579b4b38460a47b3ba51c35412b960dafe3d0949 Author: Prarit Bhargava Date: Wed Sep 24 20:27:49 2008 -0400 x86: fix memmap=exactmap boot argument Backport of d6be118a97ce51ca84035270f91c2bccecbfac5f by Chuck Ebbert When using kdump, modifying the e820 map is yielding strange results. 
For example starting with BIOS-provided physical RAM map: BIOS-e820: 0000000000000100 - 0000000000093400 (usable) BIOS-e820: 0000000000093400 - 00000000000a0000 (reserved) BIOS-e820: 0000000000100000 - 000000003fee0000 (usable) BIOS-e820: 000000003fee0000 - 000000003fef3000 (ACPI data) BIOS-e820: 000000003fef3000 - 000000003ff80000 (ACPI NVS) BIOS-e820: 000000003ff80000 - 0000000040000000 (reserved) BIOS-e820: 00000000e0000000 - 00000000f0000000 (reserved) BIOS-e820: 00000000fec00000 - 00000000fec10000 (reserved) BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved) BIOS-e820: 00000000ff000000 - 0000000100000000 (reserved) and booting with args memmap=exactmap memmap=640K@0K memmap=5228K@16384K memmap=125188K@22252K memmap=76K#1047424K memmap=564K#1047500K resulted in: user-defined physical RAM map: user: 0000000000000000 - 0000000000093400 (usable) user: 0000000000093400 - 00000000000a0000 (reserved) user: 0000000000100000 - 000000003fee0000 (usable) user: 000000003fee0000 - 000000003fef3000 (ACPI data) user: 000000003fef3000 - 000000003ff80000 (ACPI NVS) user: 000000003ff80000 - 0000000040000000 (reserved) user: 00000000e0000000 - 00000000f0000000 (reserved) user: 00000000fec00000 - 00000000fec10000 (reserved) user: 00000000fee00000 - 00000000fee01000 (reserved) user: 00000000ff000000 - 0000000100000000 (reserved) But should have resulted in: user-defined physical RAM map: user: 0000000000000000 - 00000000000a0000 (usable) user: 0000000001000000 - 000000000151b000 (usable) user: 00000000015bb000 - 0000000008ffc000 (usable) user: 000000003fee0000 - 000000003ff80000 (ACPI data) This is happening because of an improper usage of strcmp() in the e820 parsing code. The strcmp() always returns !0 and never resets the value for e820.nr_map and returns an incorrect user-defined map. This patch fixes the problem. 
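The strcmp() pitfall described above can be sketched in isolation. This is only an illustration, not the code from the patch: it assumes the option text handed to the parser may still carry the remaining boot arguments (the message does not spell this out), and both helper names are hypothetical.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: full-string comparison. strcmp() returns 0 only
 * when both strings are identical, so any trailing text makes the
 * result non-zero and the option is not recognised. */
static bool is_exactmap_full(const char *arg)
{
    return strcmp(arg, "exactmap") == 0;
}

/* Hypothetical helper: prefix comparison. Only the first
 * strlen("exactmap") characters are compared, so trailing text
 * after the keyword does not prevent a match. */
static bool is_exactmap_prefix(const char *arg)
{
    return strncmp(arg, "exactmap", strlen("exactmap")) == 0;
}
```

With an argument such as "exactmap memmap=640K@0K", the full comparison reports no match while the prefix comparison does; a parser relying on the former would never reach the branch that resets the map.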
Signed-off-by: Prarit Bhargava Signed-off-by: Ingo Molnar Cc: Chuck Ebbert Signed-off-by: Greg Kroah-Hartman commit 130bdec9b321e1c0fdf7c2fdef03166f12815c5a Author: Chuck Ebbert Date: Wed Sep 24 19:26:04 2008 -0400 x86: add io delay quirk for Presario F700 commit e6a5652fd156a286faadbf7a4062b5354d4e346e upstream Manually adding "io_delay=0xed" fixes system lockups in ioapic mode on this machine. System Information Manufacturer: Hewlett-Packard Product Name: Presario F700 (KA695EA#ABF) Base Board Information Manufacturer: Quanta Product Name: 30D3 Reference: https://bugzilla.redhat.com/show_bug.cgi?id=459546 Signed-off-by: Chuck Ebbert Signed-off-by: H. Peter Anvin Signed-off-by: Greg Kroah-Hartman commit e6908f26e33567ebd565fad04096537a5853fec0 Author: Zhao Yakui Date: Tue Sep 23 13:38:13 2008 +0800 ACPI: Avoid bogus EC timeout when EC is in Polling mode commit 9d699ed92a459cb408e2577e8bbeabc8ec3989e1 upstream When EC is in Polling mode, OS will check the EC status continually by using the following source code: clear_bit(EC_FLAGS_WAIT_GPE, &ec->flags); while (time_before(jiffies, delay)) { if (acpi_ec_check_status(ec, event)) return 0; msleep(1); } But msleep is realized by the function of schedule_timeout. At the same time although one process is already waken up by some events, it won't be scheduled immediately. So maybe there exists the following phenomena: a. The current jiffies is already after the predefined jiffies. But before timeout happens, OS has no chance to check the EC status again. b. If preemptible schedule is enabled, maybe preempt schedule will happen before checking loop. When the process is resumed again, maybe timeout already happens, which means that OS has no chance to check the EC status. In such case maybe EC status is already what OS expects when timeout happens. But OS has no chance to check the EC status and regards it as AE_TIME. So it will be more appropriate that OS will try to check the EC status again when timeout happens. 
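The idea of checking the status once more after the deadline has passed can be sketched in plain C. This is a simulation under stated assumptions, not the ACPI EC driver: struct fake_ec, its fields, and the function names are hypothetical, the clock is a simple counter, and the comparison ignores the jiffies wraparound that time_before() handles in the kernel.

```c
#include <stdbool.h>

/* Hypothetical simulated device: it becomes ready at tick ready_at,
 * and every sleep advances the clock by nap_len ticks (modelling a
 * sleeper that is woken later than requested). */
struct fake_ec {
    unsigned long now;
    unsigned long ready_at;
    unsigned long nap_len;
};

static bool fake_ec_ready(const struct fake_ec *ec)
{
    return ec->now >= ec->ready_at;
}

static void fake_ec_sleep(struct fake_ec *ec)
{
    ec->now += ec->nap_len;
}

/* Poll until the deadline, then look at the status one last time.
 * Returns 0 when the device is ready, -1 on a genuine timeout. */
static int poll_with_final_check(struct fake_ec *ec, unsigned long deadline)
{
    while (ec->now < deadline) {
        if (fake_ec_ready(ec))
            return 0;
        fake_ec_sleep(ec);
    }
    /* The deadline passed while we slept; the device may have become
     * ready in the meantime, so check again before giving up. */
    if (fake_ec_ready(ec))
        return 0;
    return -1;
}
```

With ready_at = 5, nap_len = 10 and a deadline of 8, the loop body runs once, oversleeps past the deadline, and only the final check sees that the device is ready.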
If the EC status is what we expect, it won't be regarded as timeout. Only when the EC status is not what we expect, it will be regarded as timeout, which means that EC controller can't give a response in time. http://bugzilla.kernel.org/show_bug.cgi?id=9823 http://bugzilla.kernel.org/show_bug.cgi?id=11141 Signed-off-by: Zhao Yakui Signed-off-by: Zhang Rui Signed-off-by: Andi Kleen Signed-off-by: Greg Kroah-Hartman commit fe1c832405d5450bcde9a2e60b3ac008c406c8e6 Author: Pekka Paalanen Date: Mon May 12 21:21:01 2008 +0200 x86: fix SMP alternatives: use mutex instead of spinlock, text_poke is sleepable commit 2f1dafe50cc4e58a239fd81bd47f87f32042a1ee upstream text_poke is sleepable. The original fix by Mathieu Desnoyers . Signed-off-by: Pekka Paalanen Signed-off-by: Ingo Molnar Signed-off-by: Thomas Gleixner Signed-off-by: Greg Kroah-Hartman commit 459872ffa5ac2357472118b6a875758a2b11139e Author: Ingo Molnar Date: Sat Aug 23 17:59:07 2008 +0200 rtc: fix deadlock commit 38c052f8cff1bd323ccfa968136a9556652ee420 upstream if get_rtc_time() is _ever_ called with IRQs off, we deadlock badly in it, waiting for jiffies to increment. So make the code more robust by doing an explicit mdelay(20). This solves a very hard to reproduce/debug hard lockup reported by Mikael Pettersson. Reported-by: Mikael Pettersson Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit 2d5794c10db3863fb9af076195900adc15645d51 Author: Nick Piggin Date: Wed Sep 3 20:27:35 2008 -0400 mm: dirty page tracking race fix commit 479db0bf408e65baa14d2a9821abfcbc0804b847 upstream There is a race with dirty page accounting where a page may not properly be accounted for. clear_page_dirty_for_io() calls page_mkclean; then TestClearPageDirty. page_mkclean walks the rmaps for that page, and for each one it cleans and write protects the pte if it was dirty. It uses page_check_address to find the pte. That function has a shortcut to avoid the ptl if the pte is not present. 
Unfortunately, the pte can be switched to not-present then back to present by other code while holding the page table lock -- this should not be a signal for page_mkclean to ignore that pte, because it may be dirty. For example, powerpc64's set_pte_at will clear a previously present pte before setting it to the desired value. There may also be other code in core mm or in arch which do similar things. The consequence of the bug is loss of data integrity due to msync, and loss of dirty page accounting accuracy. XIP's __xip_unmap could easily also be unreliable (depending on the exact XIP locking scheme), which can lead to data corruption. Fix this by having an option to always take ptl to check the pte in page_check_address. It's possible to retain this optimization for page_referenced and try_to_unmap. Signed-off-by: Nick Piggin Cc: Jared Hulbert Cc: Carsten Otte Cc: Hugh Dickins Acked-by: Peter Zijlstra Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Cc: Chuck Ebbert Signed-off-by: Greg Kroah-Hartman commit 6ff36eba0bc2beb3cc8ebc95655a2a01dbdd575c Author: Jan Beulich Date: Wed Sep 3 20:25:24 2008 -0400 x86-64: fix overlap of modules and fixmap areas commit 66d4bdf22b8652cda215e2653c8bbec7a767ed57 upstream Plus add a build time check so this doesn't go unnoticed again. Signed-off-by: Jan Beulich Signed-off-by: Ingo Molnar Cc: Chuck Ebbert Signed-off-by: Greg Kroah-Hartman commit 05039ab90e8b5c1e4f38beeed90956c7bd1b5895 Author: Venkatesh Pallipadi Date: Wed Sep 3 19:54:55 2008 -0400 x86: PAT proper tracking of set_memory_uc and friends commit c15238df3b65e34fadb1021b0fb0d5aebc7c42c6 upstream Big thinko in pat memtype tracking code. reserve_memtype should be called with physical address and not virtual address. 
Signed-off-by: Venkatesh Pallipadi Signed-off-by: Suresh Siddha Signed-off-by: Ingo Molnar Cc: Chuck Ebbert Signed-off-by: Greg Kroah-Hartman commit 9db33a508500ad82e907a51570bc1e6a5b94163d Author: Andi Kleen Date: Wed Sep 3 19:47:05 2008 -0400 x86: fix oprofile + hibernation badness commit 80a8c9fffa78f57d7d4351af2f15a56386805ceb upstream Vegard Nossum reported oprofile + hibernation problems: > Now some warnings: > > ------------[ cut here ]------------ > WARNING: at /uio/arkimedes/s29/vegardno/git-working/linux-2.6/kernel/smp.c:328 smp_call_function_mask+0x194/0x1a0() The usual problem: the suspend function, when interrupts are already disabled, calls smp_call_function, which is not allowed with interrupts off. But at this point all the other CPUs should be already down anyways, so it should be enough to just drop that. This patch should fix that problem at least by fixing cpu hotplug & suspend support. [ mingo@elte.hu: fixed 5 coding style errors. ] Backported by Chuck Ebbert Signed-off-by: Andi Kleen Tested-by: Vegard Nossum Signed-off-by: Ingo Molnar Cc: Chuck Ebbert Signed-off-by: Greg Kroah-Hartman commit 3850427d9afd4c1ee7b51e357e028e3e38bd9aa3 Author: Krzysztof Helt Date: Wed Sep 3 19:44:55 2008 -0400 x86: fdiv bug detection fix commit e0d22d03c06c4e2c194d7010bc1e4a972199f156 upstream The fdiv detection code writes an s32 integer into boot_cpu_data.fdiv_bug. However, boot_cpu_data.fdiv_bug is only a char (s8) field, so the detection overwrites already set fields for other bugs, e.g. the f00f bug field. Use a local s32 variable to receive the result. This is a partial fix to Bugzilla #9928 - fixes wrong information about the f00f bug (tested) and probably for coma bug (I have no cpu to test this).
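The effect of storing a 32-bit result into a one-byte field can be shown with a small sketch. The struct below is hypothetical (the real boot_cpu_data layout is not given in the message), and the wide store is modelled with memcpy() over the struct's bytes rather than with the inline assembly a detection routine would use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: several one-byte flags packed next to each other. */
struct cpu_flags {
    char fdiv_bug;
    char f00f_bug;
    char coma_bug;
    char pad0;
};

/* Models the faulty pattern: four bytes are written starting at the
 * one-byte fdiv_bug field, so the neighbouring flags are overwritten. */
static void store_wide(struct cpu_flags *c, int32_t result)
{
    assert(sizeof(struct cpu_flags) >= sizeof(int32_t));
    memcpy((unsigned char *)c + offsetof(struct cpu_flags, fdiv_bug),
           &result, sizeof(result));
}

/* Models the repaired pattern: the 32-bit result stays in a local
 * variable and only a one-byte flag is assigned to the field. */
static void store_narrow(struct cpu_flags *c, int32_t result)
{
    int32_t local = result;
    c->fdiv_bug = (char)(local != 0);
}
```

Starting from a struct with f00f_bug set, store_wide() with a zero result clears that flag as a side effect, while store_narrow() leaves it untouched.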
Signed-off-by: Krzysztof Helt Cc: Andrew Morton Signed-off-by: Ingo Molnar Cc: Chuck Ebbert Signed-off-by: Greg Kroah-Hartman commit 939a3b7956341f34aadeb2e24b394e3bc96bf497 Author: Ivo van Doorn Date: Fri Jul 4 13:41:31 2008 +0200 rt2x00: Use ieee80211_hw->workqueue again commit 8e260c22238dd8b57aefb1f5e4bd114486a9c17d upstream Remove the rt2x00 singlethreaded workqueue and move the link tuner and packet filter scheduled work to the ieee80211_hw->workqueue again. The only exception is the interface scheduled work handler which uses the mac80211 interface iterator under the RTNL lock. This work needs to be handled on the kernel workqueue to prevent lockdep issues. Signed-off-by: Ivo van Doorn Signed-off-by: John W. Linville Signed-off-by: Greg Kroah-Hartman commit 634e781bb4af630592df4632508986e90d6d79c0 Author: Ravikiran Thirumalai Date: Tue Sep 23 11:03:50 2008 -0700 x86: Fix 27-rc crash on vsmp due to paravirt during module load commit 05e12e1c4c09cd35ac9f4e6af1e42b0036375d72 upstream. vsmp_patch has been marked with __init ever since pvops, however, apply_paravirt can be called during module load causing calls to freed memory location. Since apply_paravirt can only be called during bootup and module load, mark vsmp patch with "__init_or_module" Signed-off-by: Ravikiran Thirumalai Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit e568b3605f6f6ad1e9cbe37231cf5b578ff16d4b Author: FUJITA Tomonori Date: Sat Sep 13 01:16:45 2008 +0900 sg: disable interrupts inside sg_copy_buffer This is the backport of the upstream commit 50bed2e2862a8f3a4f7d683d0d27292e71ef18b9 The callers of sg_copy_buffer must disable interrupts before calling it (since it uses kmap_atomic). Some callers use it on interrupt-disabled code but some need to take the trouble to disable interrupts just for this. 
No wonder they forget about it and we hit a bug like: http://bugzilla.kernel.org/show_bug.cgi?id=11529 James said that it might be better to disable interrupts inside the function rather than risk the callers getting it wrong. Signed-off-by: FUJITA Tomonori Signed-off-by: Jens Axboe Signed-off-by: James Bottomley Signed-off-by: Greg Kroah-Hartman commit 2e8e9ac3bd989d4a654b7750e04a6ce5f60b0dc5 Author: Joel Becker Date: Wed Sep 10 06:27:07 2008 -0700 ocfs2: Increment the reference count of an already-active stack. commit d6817cdbd143f87f9d7c59a4c3194091190eeb84 upstream The ocfs2_stack_driver_request() function failed to increment the refcount of an already-active stack. It only did the increment on the first reference. Whoops. Signed-off-by: Joel Becker Tested-by: Marcos Matsunaga Signed-off-by: Mark Fasheh Signed-off-by: Greg Kroah-Hartman commit 90af668a965fd4732996274c7babfc63b090ddf0 Author: Yinghai Lu Date: Fri Sep 12 13:08:18 2008 +0200 APIC routing fix commit e0da33646826b66ef933d47ea2fb7a693fd849bf upstream x86: introduce max_physical_apicid for bigsmp switching a multi-socket test-system with 3 or 4 ioapics, when 4 dualcore cpus or 2 quadcore cpus installed, needs to switch to bigsmp or physflat. CPU apic id is [4,11] instead of [0,7], and we need to check max apic id instead of cpu numbers. also add check for 32 bit when acpi is not compiled in or acpi=off. Signed-off-by: Yinghai Lu Signed-off-by: Ingo Molnar Cc: Thomas Gleixner Signed-off-by: Greg Kroah-Hartman commit 092609f380fe84ab974062729942ba6b0be3a78f Author: Balbir Singh Date: Fri Sep 5 18:12:23 2008 +0200 sched: fix process time monotonicity commit 49048622eae698e5c4ae61f7e71200f265ccc529 upstream Spencer reported a problem where utime and stime were going negative despite the fixes in commit b27f03d4bdc145a09fb7b0c0e004b29f1ee555fa. 
The suspected reason for the problem is that signal_struct maintains its own utime and stime (of exited tasks); these are not updated using the new task_utime() routine, hence sig->utime can go backwards and cause the same problem to occur (sig->utime adds tsk->utime and not task_utime()). This patch fixes the problem. TODO: using max(task->prev_utime, derived utime) works for now, but a more generic solution is to implement cputime_max() and use the cputime_gt() function for comparison. Reported-by: spencer@bluehost.com Signed-off-by: Balbir Singh Signed-off-by: Peter Zijlstra Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit 1fac74ef6eb3ac9e7355c3d43803ef8ec9c0971f Author: Jens Axboe Date: Wed Sep 3 19:49:10 2008 -0400 block: submit_bh() inadvertently discards barrier flag on a sync write commit 48fd4f93a00eac844678629f2f00518e146ed30d upstream Reported by Milan Broz, commit 18ce3751 inadvertently made submit_bh() discard the barrier bit for a WRITE_SYNC request. Fix that up. Signed-off-by: Jens Axboe Cc: Chuck Ebbert Signed-off-by: Greg Kroah-Hartman commit b22d675062c08a757798367daf561c8b5f795275 Author: Zachary Amsden Date: Wed Oct 1 16:45:04 2008 +0000 x86: Fix broken LDT access in VMI commit de59985e3a623d4d5d6207f1777398ca0606ab1c upstream After investigating a JRE failure, I found this bug was introduced a long time ago, and had already managed to survive another bugfix which occurred on the same line. The result is a total failure of the JRE due to LDT selectors not working properly. This one took a long time to rear up because LDT usage is not very common, but the bug is quite serious.
It got introduced along with another bug, already fixed, by 75b8bb3e56ca09a467fbbe5229bc68627f7445be Signed-off-by: Zachary Amsden Cc: Ingo Molnar Cc: Glauber de Oliveira Costa Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 42f5a87ed3db2b49da7374e61ed4a7aa5f46e626 Author: Suresh Siddha Date: Tue Sep 30 17:56:13 2008 -0700 x64, fpu: fix possible FPU leakage in error conditions [Upstream commit: 6ffac1e90a17ea0aded5c581204397421eec91b6] On Thu, Jul 24, 2008 at 03:43:44PM -0700, Linus Torvalds wrote: > So how about this patch as a starting point? This is the RightThing(tm) to > do regardless, and if it then makes it easier to do some other cleanups, > we should do it first. What do you think? restore_fpu_checking() calls init_fpu() in error conditions. While this is wrong(as our main intention is to clear the fpu state of the thread), this was benign before commit 92d140e21f1 ("x86: fix taking DNA during 64bit sigreturn"). Post commit 92d140e21f1, live FPU registers may not belong to this process at this error scenario. In the error condition for restore_fpu_checking() (especially during the 64bit signal return), we are doing init_fpu(), which saves the live FPU register state (possibly belonging to some other process context) into the thread struct (through unlazy_fpu() in init_fpu()). This is wrong and can leak the FPU data. For the signal handler restore error condition in restore_i387(), clear the fpu state present in the thread struct(before ultimately sending a SIGSEGV for badframe). For the paranoid error condition check in math_state_restore(), send a SIGSEGV, if we fail to restore the state. 
Signed-off-by: Suresh Siddha Cc: Linus Torvalds Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit 568fc52e4917a0bfb0ac8b54eb8636f9e51886c1 Author: Linus Torvalds Date: Tue Sep 30 17:56:12 2008 -0700 x86-64: Clean up save/restore_i387() usage [ Upstream commit b30f3ae50cd03ef2ff433a5030fbf88dd8323528] Suresh Siddha wants to fix a possible FPU leakage in error conditions, but the fact that save/restore_i387() are inlines in a header file makes that harder to do than necessary. So start off with an obvious cleanup. This just moves the x86-64 version of save/restore_i387() out of the header file, and moves it to the only file that it is actually used in: arch/x86/kernel/signal_64.c. So exposing it in a header file was wrong to begin with. [ Side note: I'd like to fix up some of the games we play with the 32-bit version of these functions too, but that's a separate matter. The 32-bit versions are shared - under different names at that! - by both the native x86-32 code and the x86-64 32-bit compatibility code ] Acked-by: Suresh Siddha Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 689f18f9c2e72b4b8589b055a51ff7bc7ffbd5bd Author: Joerg Roedel Date: Sat Sep 13 08:38:42 2008 +0300 KVM: SVM: fix guest global tlb flushes with NPT (cherry picked from commit e5eab0cede4b1ffaca4ad857d840127622038e55) Accesses to CR4 are intercepted even with Nested Paging enabled. But the code does not check if the guest wants to do a global TLB flush. So this flush gets lost. This patch adds the check and the flush to svm_set_cr4. Signed-off-by: Joerg Roedel Signed-off-by: Avi Kivity Signed-off-by: Greg Kroah-Hartman commit feec4f615504a766e1897412d5a9a28b1c4eec6c Author: Joerg Roedel Date: Sat Sep 13 08:38:41 2008 +0300 KVM: SVM: fix random segfaults with NPT enabled (cherry picked from commit 44874f84918e37b64bec6df1587e5fe2fdf6ab62) This patch introduces a guest TLB flush on every NPF exit in KVM. 
This fixes random segfaults and #UD exceptions in the guest seen under some workloads (e.g. long running compile workloads or tbench). A kernbench run with and without that fix showed that it has a slowdown lower than 0.5% Signed-off-by: Joerg Roedel Signed-off-by: Alexander Graf Signed-off-by: Avi Kivity Signed-off-by: Greg Kroah-Hartman commit 7fa544746f2837f5d743b50e638fa60dd36da5a8 Author: Takashi Iwai Date: Tue Sep 30 11:54:12 2008 +0200 ALSA: remove unneeded power_mutex lock in snd_pcm_drop Upstream-commit-id: 24e8fc498e9618338854bfbcf8d1d737e0bf1775 The power_mutex lock in snd_pcm_drop may cause a possible deadlock chain, and above all, it's unneeded. Let's get rid of it. Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 875e33b0a336ada0441f5305ddc862b354c457cb Author: Takashi Iwai Date: Tue Sep 30 11:52:57 2008 +0200 ALSA: fix locking in snd_pcm_open*() and snd_rawmidi_open*() Upstream-commit-id: 399ccdc1cd4e92e541d4dacbbf18c52bd693418b The PCM and rawmidi open callbacks have a lock against card->controls_list but it takes a wrong one, card->controls_rwsem, instead of a right one card->ctl_files_rwlock. This patch fixes them. This change also fixes automatically the potential deadlocks due to mm->mmap_sem in munmap and copy_from/to_user, reported by Sitsofe Wheeler: A: snd_ctl_elem_user_tlv(): card->controls_rwsem => mm->mmap_sem B: snd_pcm_open(): card->open_mutex => card->controls_rwsem C: munmap: mm->mmap_sem => snd_pcm_release(): card->open_mutex The patch breaks the chain. Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit b7bacf78acade979a96792b302a42c9f9f122246 Author: Clemens Ladisch Date: Tue Sep 30 11:50:35 2008 +0200 ALSA: oxygen: fix distorted output on AK4396-based cards Upstream-commit-id: df91bc23dcb052ff2da71b3482bf3c5fbf4b8a53 When changing the sample rate, the CMI8788's master clock output becomes unstable for a short time. 
The AK4396 needs the master clock to do SPI writes, so writing to an AK4396 control register directly after a sample rate change will garble the value. In our case, this leads to the DACs being misconfigured to I2S sample format, which results in a wrong output level and horrible distortions on samples louder than -6 dB. To fix this, we need to wait until the new master clock signal has become stable before doing SPI writes. Signed-off-by: Clemens Ladisch Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 5af6467733f9af297ce977a2594662bbef1f999d Author: Takashi Iwai Date: Tue Sep 30 18:15:10 2008 +0000 ALSA: hda - Fix model for Dell Inspiron 1525 commit 24918b61b55c21e09a3e07cd82e1b3a8154782dc upstream Dell Inspiron 1525 seems to have a buggy BIOS setup and screws up the recent codec parser, as reported by Oleksandr Natalenko: http://lkml.org/lkml/2008/9/12/203 This patch adds the working model, dell-3stack, statically. Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 61c8bd1d3a9e7e57843d98d9b133fec77ecf4e1b Author: Andrew Vasquez Date: Mon Sep 29 15:15:04 2008 +0000 SCSI: qla2xxx: Defer enablement of RISC interrupts until ISP initialization completes. commit 048feec5548c0582ee96148c61b87cccbcb5f9be upstream Josip Rodin noted (http://article.gmane.org/gmane.linux.ports.sparc/10152) the driver oopsing during registration of an rport to the FC-transport layer with a backtrace indicating a dereferencing of an shost->shost_data equal to NULL. David Miller identified a small window in driver logic where this could happen: > Look at how the driver registers the IRQ handler before the host has > been registered with the SCSI layer. > > That leads to a window of time where the shost hasn't been setup > fully, yet ISRs can come in and trigger DPC thread events, such as > loop resyncs, which expect the transport area to be setup. > > But it won't be setup, because scsi_add_host() hasn't finished yet. 
> > Note that in Josip's crash log, we don't even see the > > qla_printk(KERN_INFO, ha, "\n" > " QLogic Fibre Channel HBA Driver: %s\n" > " QLogic %s - %s\n" > " ISP%04X: %s @ %s hdma%c, host#=%ld, fw=%s\n", > ... > > message yet. > > Which means that the crash occurs between qla2x00_request_irqs() > and printing that message. Close this window by enabling RISC interrupts after the host has been registered with the SCSI midlayer. Reported-by: Josip Rodin Signed-off-by: Andrew Vasquez Signed-off-by: James Bottomley Signed-off-by: Greg Kroah-Hartman commit 90e21dd5346538810cff7f5fa2d3b0ae4c88989d Author: Geoff Levand Date: Tue Sep 23 22:05:34 2008 +0000 USB: fix hcd interrupt disabling commit 83a798207361cc26385187b2e71efa2b5d75de7f upstream Commit de85422b94ddb23c021126815ea49414047c13dc, 'USB: fix interrupt disabling for HCDs with shared interrupt handlers' changed usb_add_hcd() to strip IRQF_DISABLED from irqflags prior to calling request_irq() with the justification that such a removal was necessary for shared interrupts to work properly. Unfortunately, the change in that commit unconditionally removes the IRQF_DISABLED flag, causing problems on platforms that don't use a shared interrupt but require IRQF_DISABLED. This change adds a check for IRQF_SHARED prior to removing the IRQF_DISABLED flag. Fixes the PS3 system startup hang reported with recent Fedora and OpenSUSE kernels. Note that this problem is hidden when CONFIG_LOCKDEP=y (ps3_defconfig), as local_irq_enable_in_hardirq() is defined as a null statement for that config. Signed-off-by: Geoff Levand Cc: Alan Stern Cc: Stefan Becker Signed-off-by: Greg Kroah-Hartman commit 6d859b16f920fc0369dc900a08f0208fa7ecef36 Author: Kirill A. Shutemov Date: Tue Sep 23 17:25:04 2008 +0000 smb.h: do not include linux/time.h in userspace commit c32a162fd420fe8dfb049db941b2438061047fcc upstream linux/time.h conflicts with time.h from glibc It breaks building smbmount from samba. 
It's a regression introduced by commit 76308da ("smb.h: uses struct timespec but didn't include linux/time.h"). Signed-off-by: Kirill A. Shutemov Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit ff1301544d010243e577f7652a451e0901de1322 Author: Mike Rapoport Date: Wed Oct 1 10:39:24 2008 -0700 pxa2xx_spi: fix build breakage commit 20b918dc77b383e9779dafceee3f2198a6f7b0e5 upstream This patch fixes a build error in the pxa2xx-spi driver, introduced by commit 7e96445533ac3f4f7964646a202ff3620602fab4 ("pxa2xx_spi: dma bugfixes") CC drivers/spi/pxa2xx_spi.o drivers/spi/pxa2xx_spi.c: In function 'map_dma_buffers': drivers/spi/pxa2xx_spi.c:331: error: invalid operands to binary & drivers/spi/pxa2xx_spi.c:331: error: invalid operands to binary & drivers/spi/pxa2xx_spi.c: In function 'pump_transfers': drivers/spi/pxa2xx_spi.c:897: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'unsigned int' [dbrownell@users.sourceforge.net: fix warning too ] Signed-off-by: Mike Rapoport Acked-by: Eric Miao Signed-off-by: Andrew Morton Signed-off-by: David Brownell Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 1453bc9be93730042a1ee1cf9fd6358dc3bde91d Author: Ned Forrester Date: Sat Sep 13 22:05:54 2008 +0000 pxa2xx_spi: chipselect bugfixes commit 8423597d676615f3dd2d9ab36f59f147086b90b8 upstream Fixes several chipselect bugs in the pxa2xx_spi driver. These bugs are in all versions of this driver and prevent using it with chips like m25p16 flash. 1. The spi_transfer.cs_change flag is handled too early: before spi_transfer.delay_usecs applies, thus making the delay ineffective at holding chip select. 2. spi_transfer.delay_usecs is ignored on the last transfer of a message (likewise not holding chipselect long enough). 3.
If spi_transfer.cs_change is set on the last transfer, the chip select is always disabled, instead of the intended meaning: optionally holding chip select enabled for the next message. Those first three bugs were fixed with a relocation of delays and chip select de-assertions. 4. If a message has the cs_change flag set on the last transfer, and had the chip select stayed enabled as requested (see 3, above), it would not have been disabled if the next message is for a different chip. Fixed by dropping chip select regardless of cs_change at end of a message, if there is no next message or if the next message is for a different chip. This patch should apply to all kernels back to and including 2.6.20; it was test patched against 2.6.20. An additional patch would be required for older kernels, but those versions are very buggy anyway. Signed-off-by: Ned Forrester Cc: Vernon Sauder Cc: Eric Miao Signed-off-by: David Brownell Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit c2d562dda6e0e6cb31a43c71868055f013d1b914 Author: Ned Forrester Date: Sat Sep 13 22:05:47 2008 +0000 pxa2xx_spi: dma bugfixes commit 7e96445533ac3f4f7964646a202ff3620602fab4 upstream Fixes two DMA bugs in the pxa2xx_spi driver. The first bug is in all versions of this driver; the second was introduced in the 2.6.20 kernel, and prevents using the driver with chips like m25p16 flash (which can issue large DMA reads). 1. Zero length transfers are permitted for use to insert timing, but pxa2xx_spi.c will fail if this is requested in DMA mode. Fixed by using programmed I/O (PIO) mode for such transfers. 2. Transfers larger than 8191 are not permitted in DMA mode. A test for length rejects all large transfers regardless of DMA or PIO mode. Worked around by rejecting only large transfers with DMA mapped buffers, and forcing all other transfers larger than 8191 to use PIO mode. A rate limited warning is issued for DMA transfers forced to PIO mode. 
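The two DMA rules above amount to a small decision table, sketched here. The enum, the function name, and its parameters are hypothetical and are not taken from pxa2xx_spi.c; only the 8191 limit and the zero-length, pre-mapped and fall-back-to-PIO cases come from the description.

```c
#include <stdbool.h>
#include <stddef.h>

/* Largest length the description allows in DMA mode. */
#define MAX_DMA_LEN 8191u

enum xfer_mode { XFER_PIO, XFER_DMA, XFER_REJECT };

/* Hypothetical helper: pick a transfer mode for one transfer.
 *  - zero-length transfers (used to insert timing) go through PIO
 *  - over-long transfers with DMA-mapped buffers are rejected
 *  - other over-long transfers are forced to PIO
 *  - everything else may use DMA when it is enabled */
static enum xfer_mode choose_xfer_mode(size_t len, bool dma_enabled,
                                       bool buffers_premapped)
{
    if (!dma_enabled)
        return XFER_PIO;
    if (len == 0)
        return XFER_PIO;
    if (len > MAX_DMA_LEN)
        return buffers_premapped ? XFER_REJECT : XFER_PIO;
    return XFER_DMA;
}
```

A caller following the description would also emit a rate-limited warning whenever an over-long transfer is demoted from DMA to PIO; that logging is left out of the sketch.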
This patch should apply to all kernels back to and including 2.6.20; it was test patched against 2.6.20. An additional patch would be required for older kernels, but those versions are very buggy anyway. Signed-off-by: Ned Forrester Cc: Vernon Sauder Cc: Eric Miao Signed-off-by: David Brownell Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 6b546b3dbbc51800bdbd075da923288c6a4fe5af Author: Mel Gorman Date: Sat Sep 13 22:05:39 2008 +0000 mm: mark the correct zone as full when scanning zonelists commit 5bead2a0680687b9576d57c177988e8aa082b922 upstream The iterator for_each_zone_zonelist() uses a struct zoneref *z cursor when scanning zonelists to keep track of where in the zonelist it is. The zoneref that is returned corresponds to the next zone that is to be scanned, not the current one. It was intended to be treated as an opaque list. When the page allocator is scanning a zonelist, it marks elements in the zonelist corresponding to zones that are temporarily full. As the zonelist is being updated, it uses the cursor here: if (NUMA_BUILD) zlc_mark_zone_full(zonelist, z); This is intended to prevent rescanning in the near future, but the zoneref cursor does not correspond to the zone that has been found to be full. This is an easy misunderstanding to make, so this patch corrects the problem by changing the zoneref cursor to be the current zone being scanned instead of the next one. Signed-off-by: Mel Gorman Cc: Andy Whitcroft Cc: KAMEZAWA Hiroyuki Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 36b560bfebe9f35a15d7069b3708a7ef7ad414d6 Author: Yuri Tikhonov Date: Sat Sep 6 01:10:06 2008 +0000 async_tx: fix the bug in async_tx_run_dependencies commit de24125dd0a452bfd4502fc448e3534c5d2e87aa upstream Should clear the next pointer of the TX if we are sure that the next TX (say NXT) will be submitted to the channel too.
Otherwise, we break the chain of descriptors, because we lose the information about the next descriptor to run. So next time, when async_tx_run_dependencies() is invoked with TX, its TX->next will be NULL, and NXT will never be submitted. Signed-off-by: Yuri Tikhonov Signed-off-by: Ilya Yanok Signed-off-by: Dan Williams Signed-off-by: Greg Kroah-Hartman commit 4fa9a2f9e5ad0171b890ecea3433276e66bc8353 Author: Andrew Morton Date: Sat Sep 6 01:10:03 2008 +0000 drivers/mmc/card/block.c: fix refcount leak in mmc_block_open() commit 70bb08962ea9bd50797ae9f16b2493f5f7c65053 upstream mmc_block_open() increments md->usage although it returns with -EROFS when default mounting a MMC/SD card with the write protect switch on. This reference counting bug prevents /dev/mmcblkX from being released on card removal, and the situation worsens with each reinsertion until the minor number range runs out. Reported-by: Acked-by: Pierre Ossman Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit e2bbadddf9ff42e79cb1cba9f1046d9c581d0177 Author: Andy Gospodarek Date: Thu Sep 4 01:05:06 2008 +0000 ixgbe: initialize interrupt throttle rate commit 15e79f24b60c4b0bf8019423bda4e03a576b02f2 upstream This commit dropped the setting of the default interrupt throttle rate. commit 021230d40ae0e6508d6c717b6e0d6d81cd77ac25 Author: Ayyappan Veeraiyan Date: Mon Mar 3 15:03:45 2008 -0800 ixgbe: Introduce MSI-X queue vector code The following patch adds it back. Without this the default value of 0 causes the performance of this card to be awful. Restoring these to the default values yields much better performance. This regression has been around since 2.6.25.
Signed-off-by: Andy Gospodarek Acked-by: Jesse Brandeburg Signed-off-by: Jeff Kirsher Signed-off-by: Jeff Garzik Signed-off-by: Greg Kroah-Hartman commit 75678e311ea588ef8f0134ba534482d91fc1e0cb Author: Sven Wegener Date: Sun Sep 28 14:14:21 2008 +0200 i2c-dev: Return correct error code on class_create() failure In Linus' tree: http://git.kernel.org/?p=linux%2Fkernel%2Fgit%2Ftorvalds%2Flinux-2.6.git;a=commit;h=e74783ec3cb981211689bd2cfd3248f8dc48ec01 We need to convert the error pointer from class_create(), else we'll return the successful return code from register_chrdev() on failure. Signed-off-by: Sven Wegener Signed-off-by: Jean Delvare Signed-off-by: Greg Kroah-Hartman commit 536e90ea2fb9552a9e17f8ca7ea83043f4240fd2 Author: Milan Broz Date: Wed Sep 3 19:41:12 2008 -0400 ACPI: Fix thermal shutdowns commit 9f497bcc695fb828da023d74ad3c966b1e58ad21 upstream ACPI: Fix thermal shutdowns Do not use unsigned int if there is test for negative number... See drivers/acpi/processor_perflib.c static unsigned int ignore_ppc = -1; ... if (event == CPUFREQ_START && ignore_ppc <= 0) { ignore_ppc = 0; ... Signed-off-by: Milan Broz Signed-off-by: Andi Kleen Cc: Chuck Ebbert Signed-off-by: Greg Kroah-Hartman commit cf1b2b7e7d2603bc99ce39b2f6f362afa4389a95 Author: Chuck Ebbert