commit 6b5f4c2c71c9013971c6520b6260e21a17d06882 Author: Greg Kroah-Hartman Date: Mon Jul 5 11:09:05 2010 -0700 Linux 2.6.27.48 commit 7694247af71eb2a9067c07ee716ea2ead7bd8fe2 Author: Wei Yongjun Date: Mon May 17 22:51:58 2010 -0700 sctp: fix append error cause to ERROR chunk correctly commit 2e3219b5c8a2e44e0b83ae6e04f52f20a82ac0f2 upstream. commit 5fa782c2f5ef6c2e4f04d3e228412c9b4a4c8809 sctp: Fix skb_over_panic resulting from multiple invalid \ parameter errors (CVE-2010-1173) (v4) cause 'error cause' never be add the the ERROR chunk due to some typo when check valid length in sctp_init_cause_fixed(). Signed-off-by: Wei Yongjun Reviewed-by: Neil Horman Acked-by: Vlad Yasevich Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 8e3c5b14d0aa33e347569e53f42f874a83f426c5 Author: Toshiyuki Okajima Date: Fri Apr 30 14:32:13 2010 +0100 KEYS: find_keyring_by_name() can gain access to a freed keyring commit cea7daa3589d6b550546a8c8963599f7c1a3ae5c upstream. find_keyring_by_name() can gain access to a keyring that has had its reference count reduced to zero, and is thus ready to be freed. This then allows the dead keyring to be brought back into use whilst it is being destroyed. The following timeline illustrates the process: |(cleaner) (user) | | free_user(user) sys_keyctl() | | | | key_put(user->session_keyring) keyctl_get_keyring_ID() | || //=> keyring->usage = 0 | | |schedule_work(&key_cleanup_task) lookup_user_key() | || | | kmem_cache_free(,user) | | . |[KEY_SPEC_USER_KEYRING] | . install_user_keyrings() | . || | key_cleanup() [<= worker_thread()] || | | || | [spin_lock(&key_serial_lock)] |[mutex_lock(&key_user_keyr..mutex)] | | || | atomic_read() == 0 || | |{ rb_ease(&key->serial_node,) } || | | || | [spin_unlock(&key_serial_lock)] |find_keyring_by_name() | | ||| | keyring_destroy(keyring) ||[read_lock(&keyring_name_lock)] | || ||| | |[write_lock(&keyring_name_lock)] ||atomic_inc(&keyring->usage) | |. ||| *** GET freeing keyring *** | |. ||[read_unlock(&keyring_name_lock)] | || || | |list_del() |[mutex_unlock(&key_user_k..mutex)] | || | | |[write_unlock(&keyring_name_lock)] ** INVALID keyring is returned ** | | . | kmem_cache_free(,keyring) . | . | atomic_dec(&keyring->usage) v *** DESTROYED *** TIME If CONFIG_SLUB_DEBUG=y then we may see the following message generated: ============================================================================= BUG key_jar: Poison overwritten ----------------------------------------------------------------------------- INFO: 0xffff880197a7e200-0xffff880197a7e200. First byte 0x6a instead of 0x6b INFO: Allocated in key_alloc+0x10b/0x35f age=25 cpu=1 pid=5086 INFO: Freed in key_cleanup+0xd0/0xd5 age=12 cpu=1 pid=10 INFO: Slab 0xffffea000592cb90 objects=16 used=2 fp=0xffff880197a7e200 flags=0x200000000000c3 INFO: Object 0xffff880197a7e200 @offset=512 fp=0xffff880197a7e300 Bytes b4 0xffff880197a7e1f0: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ Object 0xffff880197a7e200: 6a 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b jkkkkkkkkkkkkkkk Alternatively, we may see a system panic happen, such as: BUG: unable to handle kernel NULL pointer dereference at 0000000000000001 IP: [] kmem_cache_alloc+0x5b/0xe9 PGD 6b2b4067 PUD 6a80d067 PMD 0 Oops: 0000 [#1] SMP last sysfs file: /sys/kernel/kexec_crash_loaded CPU 1 ... Pid: 31245, comm: su Not tainted 2.6.34-rc5-nofixed-nodebug #2 D2089/PRIMERGY RIP: 0010:[] [] kmem_cache_alloc+0x5b/0xe9 RSP: 0018:ffff88006af3bd98 EFLAGS: 00010002 RAX: 0000000000000000 RBX: 0000000000000001 RCX: ffff88007d19900b RDX: 0000000100000000 RSI: 00000000000080d0 RDI: ffffffff81828430 RBP: ffffffff81828430 R08: ffff88000a293750 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000100000 R12: 00000000000080d0 R13: 00000000000080d0 R14: 0000000000000296 R15: ffffffff810f20ce FS: 00007f97116bc700(0000) GS:ffff88000a280000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000001 CR3: 000000006a91c000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process su (pid: 31245, threadinfo ffff88006af3a000, task ffff8800374414c0) Stack: 0000000512e0958e 0000000000008000 ffff880037f8d180 0000000000000001 0000000000000000 0000000000008001 ffff88007d199000 ffffffff810f20ce 0000000000008000 ffff88006af3be48 0000000000000024 ffffffff810face3 Call Trace: [] ? get_empty_filp+0x70/0x12f [] ? do_filp_open+0x145/0x590 [] ? tlb_finish_mmu+0x2a/0x33 [] ? unmap_region+0xd3/0xe2 [] ? virt_to_head_page+0x9/0x2d [] ? alloc_fd+0x69/0x10e [] ? do_sys_open+0x56/0xfc [] ? system_call_fastpath+0x16/0x1b Code: 0f 1f 44 00 00 49 89 c6 fa 66 0f 1f 44 00 00 65 4c 8b 04 25 60 e8 00 00 48 8b 45 00 49 01 c0 49 8b 18 48 85 db 74 0d 48 63 45 18 <48> 8b 04 03 49 89 00 eb 14 4c 89 f9 83 ca ff 44 89 e6 48 89 ef RIP [] kmem_cache_alloc+0x5b/0xe9 This problem is that find_keyring_by_name does not confirm that the keyring is valid before accepting it. Skipping keyrings that have been reduced to a zero count seems the way to go. To this end, use atomic_inc_not_zero() to increment the usage count and skip the candidate keyring if that returns false. The following script _may_ cause the bug to happen, but there's no guarantee as the window of opportunity is small: #!/bin/sh LOOP=100000 USER=dummy_user /bin/su -c "exit;" $USER || { /usr/sbin/adduser -m $USER; add=1; } for ((i=0; i /dev/null" $USER done (( add == 1 )) && /usr/sbin/userdel -r $USER exit Note that the nominated user must not be in use. An alternative way of testing this may be: for ((i=0; i<100000; i++)) do keyctl session foo /bin/true || break done >&/dev/null as that uses a keyring named "foo" rather than relying on the user and user-session named keyrings. Reported-by: Toshiyuki Okajima Signed-off-by: David Howells Tested-by: Toshiyuki Okajima Acked-by: Serge Hallyn Signed-off-by: James Morris Cc: Ben Hutchings Cc: Chuck Ebbert Signed-off-by: Greg Kroah-Hartman commit e24ad12540142e1ae7e24da1072e977c5b17003b Author: Dan Carpenter Date: Mon May 17 14:42:35 2010 +0100 KEYS: Return more accurate error codes commit 4d09ec0f705cf88a12add029c058b53f288cfaa2 upstream. We were using the wrong variable here so the error codes weren't being returned properly. The original code returns -ENOKEY. Signed-off-by: Dan Carpenter Signed-off-by: David Howells Signed-off-by: James Morris Signed-off-by: Greg Kroah-Hartman commit 648efc40754d15a7f38fc31d50bdc4e04024630d Author: Helge Deller Date: Mon May 3 20:44:21 2010 +0000 parisc: clear floating point exception flag on SIGFPE signal commit 550f0d922286556c7ea43974bb7921effb5a5278 upstream. Clear the floating point exception flag before returning to user space. This is needed, else the libc trampoline handler may hit the same SIGFPE again while building up a trampoline to a signal handler. Fixes debian bug #559406. Signed-off-by: Helge Deller Signed-off-by: Kyle McMartin Signed-off-by: Greg Kroah-Hartman commit 670037a94e70b107a79fd9ac0f4f2c9e1d26f742 Author: Neil Horman Date: Wed Mar 3 08:31:23 2010 +0000 tipc: Fix oops on send prior to entering networked mode (v3) commit d0021b252eaf65ca07ed14f0d66425dd9ccab9a6 upstream. Fix TIPC to disallow sending to remote addresses prior to entering NET_MODE user programs can oops the kernel by sending datagrams via AF_TIPC prior to entering networked mode. The following backtrace has been observed: ID: 13459 TASK: ffff810014640040 CPU: 0 COMMAND: "tipc-client" [exception RIP: tipc_node_select_next_hop+90] RIP: ffffffff8869d3c3 RSP: ffff81002d9a5ab8 RFLAGS: 00010202 RAX: 0000000000000001 RBX: 0000000000000001 RCX: 0000000000000001 RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000001001001 RBP: 0000000001001001 R8: 0074736575716552 R9: 0000000000000000 R10: ffff81003fbd0680 R11: 00000000000000c8 R12: 0000000000000008 R13: 0000000000000001 R14: 0000000000000001 R15: ffff810015c6ca00 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 RIP: 0000003cbd8d49a3 RSP: 00007fffc84e0be8 RFLAGS: 00010206 RAX: 000000000000002c RBX: ffffffff8005d116 RCX: 0000000000000000 RDX: 0000000000000008 RSI: 00007fffc84e0c00 RDI: 0000000000000003 RBP: 0000000000000000 R8: 00007fffc84e0c10 R9: 0000000000000010 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: 00007fffc84e0d10 R14: 0000000000000000 R15: 00007fffc84e0c30 ORIG_RAX: 000000000000002c CS: 0033 SS: 002b What happens is that, when the tipc module in inserted it enters a standalone node mode in which communication to its own address is allowed <0.0.0> but not to other addresses, since the appropriate data structures have not been allocated yet (specifically the tipc_net pointer). There is nothing stopping a client from trying to send such a message however, and if that happens, we attempt to dereference tipc_net.zones while the pointer is still NULL, and explode. The fix is pretty straightforward. Since these oopses all arise from the dereference of global pointers prior to their assignment to allocated values, and since these allocations are small (about 2k total), lets convert these pointers to static arrays of the appropriate size. All the accesses to these bits consider 0/NULL to be a non match when searching, so all the lookups still work properly, and there is no longer a chance of a bad dererence anywhere. As a bonus, this lets us eliminate the setup/teardown routines for those pointers, and elimnates the need to preform any locking around them to prevent access while their being allocated/freed. I've updated the tipc_net structure to behave this way to fix the exact reported problem, and also fixed up the tipc_bearers and media_list arrays to fix an obvious simmilar problem that arises from issuing tipc-config commands to manipulate bearers/links prior to entering networked mode I've tested this for a few hours by running the sanity tests and stress test with the tipcutils suite, and nothing has fallen over. There have been a few lockdep warnings, but those were there before, and can be addressed later, as they didn't actually result in any deadlock. Signed-off-by: Neil Horman CC: Allan Stephens CC: David S. Miller CC: tipc-discussion@lists.sourceforge.net Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 23b5e014564bb71eb05871af019912957263f158 Author: Miklos Szeredi Date: Wed Feb 10 12:15:53 2010 +0100 vfs: add NOFOLLOW flag to umount(2) commit db1f05bb85d7966b9176e293f3ceead1cb8b5d79 upstream. Add a new UMOUNT_NOFOLLOW flag to umount(2). This is needed to prevent symlink attacks in unprivileged unmounts (fuse, samba, ncpfs). Additionally, return -EINVAL if an unknown flag is used (and specify an explicitly unused flag: UMOUNT_UNUSED). This makes it possible for the caller to determine if a flag is supported or not. CC: Eugene Teo CC: Michael Kerrisk Signed-off-by: Miklos Szeredi Signed-off-by: Al Viro Signed-off-by: Greg Kroah-Hartman commit b1b7bf1eede2a2d954dc0e4e7db6bb94e7650f60 Author: Neil Horman Date: Wed Apr 28 10:30:59 2010 +0000 sctp: Fix skb_over_panic resulting from multiple invalid parameter errors (CVE-2010-1173) (v4) commit 5fa782c2f5ef6c2e4f04d3e228412c9b4a4c8809 upstream. Ok, version 4 Change Notes: 1) Minor cleanups, from Vlads notes Summary: Hey- Recently, it was reported to me that the kernel could oops in the following way: <5> kernel BUG at net/core/skbuff.c:91! <5> invalid operand: 0000 [#1] <5> Modules linked in: sctp netconsole nls_utf8 autofs4 sunrpc iptable_filter ip_tables cpufreq_powersave parport_pc lp parport vmblock(U) vsock(U) vmci(U) vmxnet(U) vmmemctl(U) vmhgfs(U) acpiphp dm_mirror dm_mod button battery ac md5 ipv6 uhci_hcd ehci_hcd snd_ens1371 snd_rawmidi snd_seq_device snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd_page_alloc snd_ac97_codec snd soundcore pcnet32 mii floppy ext3 jbd ata_piix libata mptscsih mptsas mptspi mptscsi mptbase sd_mod scsi_mod <5> CPU: 0 <5> EIP: 0060:[] Not tainted VLI <5> EFLAGS: 00010216 (2.6.9-89.0.25.EL) <5> EIP is at skb_over_panic+0x1f/0x2d <5> eax: 0000002c ebx: c033f461 ecx: c0357d96 edx: c040fd44 <5> esi: c033f461 edi: df653280 ebp: 00000000 esp: c040fd40 <5> ds: 007b es: 007b ss: 0068 <5> Process swapper (pid: 0, threadinfo=c040f000 task=c0370be0) <5> Stack: c0357d96 e0c29478 00000084 00000004 c033f461 df653280 d7883180 e0c2947d <5> 00000000 00000080 df653490 00000004 de4f1ac0 de4f1ac0 00000004 df653490 <5> 00000001 e0c2877a 08000800 de4f1ac0 df653490 00000000 e0c29d2e 00000004 <5> Call Trace: <5> [] sctp_addto_chunk+0xb0/0x128 [sctp] <5> [] sctp_addto_chunk+0xb5/0x128 [sctp] <5> [] sctp_init_cause+0x3f/0x47 [sctp] <5> [] sctp_process_unk_param+0xac/0xb8 [sctp] <5> [] sctp_verify_init+0xcc/0x134 [sctp] <5> [] sctp_sf_do_5_1B_init+0x83/0x28e [sctp] <5> [] sctp_do_sm+0x41/0x77 [sctp] <5> [] cache_grow+0x140/0x233 <5> [] sctp_endpoint_bh_rcv+0xc5/0x108 [sctp] <5> [] sctp_inq_push+0xe/0x10 [sctp] <5> [] sctp_rcv+0x454/0x509 [sctp] <5> [] ipt_hook+0x17/0x1c [iptable_filter] <5> [] nf_iterate+0x40/0x81 <5> [] ip_local_deliver_finish+0x0/0x151 <5> [] ip_local_deliver_finish+0xc6/0x151 <5> [] nf_hook_slow+0x83/0xb5 <5> [] ip_local_deliver+0x1a2/0x1a9 <5> [] ip_local_deliver_finish+0x0/0x151 <5> [] ip_rcv+0x334/0x3b4 <5> [] netif_receive_skb+0x320/0x35b <5> [] init_stall_timer+0x67/0x6a [uhci_hcd] <5> [] process_backlog+0x6c/0xd9 <5> [] net_rx_action+0xfe/0x1f8 <5> [] __do_softirq+0x35/0x79 <5> [] handle_IRQ_event+0x0/0x4f <5> [] do_softirq+0x46/0x4d Its an skb_over_panic BUG halt that results from processing an init chunk in which too many of its variable length parameters are in some way malformed. The problem is in sctp_process_unk_param: if (NULL == *errp) *errp = sctp_make_op_error_space(asoc, chunk, ntohs(chunk->chunk_hdr->length)); if (*errp) { sctp_init_cause(*errp, SCTP_ERROR_UNKNOWN_PARAM, WORD_ROUND(ntohs(param.p->length))); sctp_addto_chunk(*errp, WORD_ROUND(ntohs(param.p->length)), param.v); When we allocate an error chunk, we assume that the worst case scenario requires that we have chunk_hdr->length data allocated, which would be correct nominally, given that we call sctp_addto_chunk for the violating parameter. Unfortunately, we also, in sctp_init_cause insert a sctp_errhdr_t structure into the error chunk, so the worst case situation in which all parameters are in violation requires chunk_hdr->length+(sizeof(sctp_errhdr_t)*param_count) bytes of data. The result of this error is that a deliberately malformed packet sent to a listening host can cause a remote DOS, described in CVE-2010-1173: http://cve.mitre.org/cgi-bin/cvename.cgi?name=2010-1173 I've tested the below fix and confirmed that it fixes the issue. We move to a strategy whereby we allocate a fixed size error chunk and ignore errors we don't have space to report. Tested by me successfully Signed-off-by: Neil Horman Acked-by: Vlad Yasevich Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 70635511aad7493f7748aa4e42dd852dfc0eb73f Author: Aneesh Kumar K.V Date: Fri May 28 14:27:23 2010 -0500 ext4: Implement range_cyclic in ext4_da_writepages instead of write_cache_pages commit 2acf2c261b823d9d9ed954f348b97620297a36b5 upstream. With delayed allocation we lock the page in write_cache_pages() and try to build an in memory extent of contiguous blocks. This is needed so that we can get large contiguous blocks request. If range_cyclic mode is enabled, write_cache_pages() will loop back to the 0 index if no I/O has been done yet, and try to start writing from the beginning of the range. That causes an attempt to take the page lock of lower index page while holding the page lock of higher index page, which can cause a dead lock with another writeback thread. The solution is to implement the range_cyclic behavior in ext4_da_writepages() instead. http://bugzilla.kernel.org/show_bug.cgi?id=12579 Signed-off-by: Aneesh Kumar K.V Signed-off-by: "Theodore Ts'o" Signed-off-by: Jayson R. King Signed-off-by: Greg Kroah-Hartman commit f4a4420cd056a7dcc4134dd6df6314d0960ba39f Author: Aneesh Kumar K.V Date: Fri May 28 14:26:57 2010 -0500 ext4: Fix file fragmentation during large file write. commit 22208dedbd7626e5fc4339c417f8d24cc21f79d7 upstream. The range_cyclic writeback mode uses the address_space writeback_index as the start index for writeback. With delayed allocation we were updating writeback_index wrongly resulting in highly fragmented file. This patch reduces the number of extents reduced from 4000 to 27 for a 3GB file. Signed-off-by: Aneesh Kumar K.V Signed-off-by: Theodore Ts'o [dev@jaysonking.com: Some changed lines from the original version of this patch were dropped, since they were rolled up with another cherry-picked patch applied to 2.6.27.y earlier.] [dev@jaysonking.com: Use of wbc->no_nrwrite_index_update was dropped, since write_cache_pages_da() implies it.] Signed-off-by: Jayson R. King Signed-off-by: Greg Kroah-Hartman commit 14dd35b4c54e223de27d1161985eeef847a130ee Author: Theodore Ts'o Date: Fri May 28 14:26:25 2010 -0500 ext4: Use our own write_cache_pages() commit 8e48dcfbd7c0892b4cfd064d682cc4c95a29df32 upstream. Make a copy of write_cache_pages() for the benefit of ext4_da_writepages(). This allows us to simplify the code some, and will allow us to further customize the code in future patches. There are some nasty hacks in write_cache_pages(), which Linus has (correctly) characterized as vile. I've just copied it into write_cache_pages_da(), without trying to clean those bits up lest I break something in the ext4's delalloc implementation, which is a bit fragile right now. This will allow Dave Chinner to clean up write_cache_pages() in mm/page-writeback.c, without worrying about breaking ext4. Eventually write_cache_pages_da() will go away when I rewrite ext4's delayed allocation and create a general ext4_writepages() which is used for all of ext4's writeback. Until now this is the lowest risk way to clean up the core write_cache_pages() function. Signed-off-by: "Theodore Ts'o" Cc: Dave Chinner [dev@jaysonking.com: Dropped the hunks which reverted the use of no_nrwrite_index_update, since those lines weren't ever created on 2.6.27.y] [dev@jaysonking.com: Copied from 2.6.27.y's version of write_cache_pages(), plus the changes to it from patch "vfs: Add no_nrwrite_index_update writeback control flag"] Signed-off-by: Jayson R. King Signed-off-by: Greg Kroah-Hartman commit 805fe57fb54f9f66256a2c0f81b11abeb83b0142 Author: Eric Sandeen Date: Sun May 16 01:00:00 2010 -0400 ext4: check s_log_groups_per_flex in online resize code commit 42007efd569f1cf3bfb9a61da60ef6c2179508ca upstream. If groups_per_flex < 2, sbi->s_flex_groups[] doesn't get filled out, and every other access to this first tests s_log_groups_per_flex; same thing needs to happen in resize or we'll wander off into a null pointer when doing an online resize of the file system. Thanks to Christoph Biedl, who came up with the trivial testcase: # truncate --size 128M fsfile # mkfs.ext3 -F fsfile # tune2fs -O extents,uninit_bg,dir_index,flex_bg,huge_file,dir_nlink,extra_isize fsfile # e2fsck -yDf -C0 fsfile # truncate --size 132M fsfile # losetup /dev/loop0 fsfile # mount /dev/loop0 mnt # resize2fs -p /dev/loop0 https://bugzilla.kernel.org/show_bug.cgi?id=13549 Reported-by: Alessandro Polverini Test-case-by: Christoph Biedl Signed-off-by: Eric Sandeen Signed-off-by: "Theodore Ts'o" Signed-off-by: Greg Kroah-Hartman commit f5cdcd6bb088de9c8ce7b63a2487f7064305c03c Author: Richard Kennedy Date: Thu May 27 10:22:28 2010 +0100 gconfig: fix build failure on fedora 13 commit cbab05f041a4cff6ca15856bdd35238b282b64eb upstream. Making gconfig fails on fedora 13 as the linker cannot resolve dlsym. Adding libdl to the link command fixes this. make shows this error :- /usr/bin/ld: scripts/kconfig/kconfig_load.o: undefined reference to symbol 'dlsym@@GLIBC_2.2.5' /usr/bin/ld: note: 'dlsym@@GLIBC_2.2.5' is defined in DSO /lib64/libdl.so.2 so try adding it to the linker command line /lib64/libdl.so.2: could not read symbols: Invalid operation tested on x86_64 fedora 13. Signed-off-by: Richard Kennedy Reviewed-by: WANG Cong Signed-off-by: Andrew Morton Signed-off-by: Michal Marek Signed-off-by: Greg Kroah-Hartman commit 310c0ab725ad1950bb1592306d583a60e9f000fb Author: Jiri Kosina Date: Wed May 26 14:43:53 2010 -0700 ipmi: handle run_to_completion properly in deliver_recv_msg() commit a747c5abc329611220f16df0bb4cf0ca4a7fdf0c upstream. If run_to_completion flag is set, it means that we are running in a single-threaded mode, and thus no locks are held. This fixes a deadlock when IPMI notifier is being called during panic. Signed-off-by: Jiri Kosina Acked-by: Corey Minyard Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 9b698b5535523995150a8bebad483d541c5ab9cf Author: Jeff Moyer Date: Wed May 26 11:49:40 2010 -0400 do_generic_file_read: clear page errors when issuing a fresh read of the page commit 91803b499cca2fe558abad709ce83dc896b80950 upstream. I/O errors can happen due to temporary failures, like multipath errors or losing network contact with the iSCSI server. Because of that, the VM will retry readpage on the page. However, do_generic_file_read does not clear PG_error. This causes the system to be unable to actually use the data in the page cache page, even if the subsequent readpage completes successfully! The function filemap_fault has had a ClearPageError before readpage forever. This patch simply adds the same to do_generic_file_read. Signed-off-by: Jeff Moyer Signed-off-by: Rik van Riel Acked-by: Larry Woodman Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 789242303b22bddc1cd03a0c013bfdc091556d83 Author: Dan Williams Date: Wed May 12 08:25:37 2010 +1000 md: set mddev readonly flag on blkdev BLKROSET ioctl commit e2218350465e7e0931676b4849b594c978437bce upstream. When the user sets the block device to readwrite then the mddev should follow suit. Otherwise, the BUG_ON in md_write_start() will be set to trigger. The reverse direction, setting mddev->ro to match a set readonly request, can be ignored because the blkdev level readonly flag precludes the need to have mddev->ro set correctly. Nevermind the fact that setting mddev->ro to 1 may fail if the array is in use. Signed-off-by: Dan Williams Signed-off-by: NeilBrown Signed-off-by: Greg Kroah-Hartman commit f2a41e6807357c237ec2bbd61e79cb1558e48b8a Author: NeilBrown Date: Sat May 8 08:20:17 2010 +1000 md: Fix read balancing in RAID1 and RAID10 on drives > 2TB commit af3a2cd6b8a479345786e7fe5e199ad2f6240e56 upstream. read_balance uses a "unsigned long" for a sector number which will get truncated beyond 2TB. This will cause read-balancing to be non-optimal, and can cause data to be read from the 'wrong' branch during a resync. This has a very small chance of returning wrong data. Reported-by: Jordan Russell Signed-off-by: NeilBrown Signed-off-by: Greg Kroah-Hartman commit faeb8ec4bf76f163317e9b885c5eb35ee960f705 Author: NeilBrown Date: Tue May 18 15:27:13 2010 +1000 md/raid1: fix counting of write targets. commit 964147d5c86d63be79b442c30f3783d49860c078 upstream. There is a very small race window when writing to a RAID1 such that if a device is marked faulty at exactly the wrong time, the write-in-progress will not be sent to the device, but the bitmap (if present) will be updated to say that the write was sent. Then if the device turned out to still be usable as was re-added to the array, the bitmap-based-resync would skip resyncing that block, possibly leading to corruption. This would only be a problem if no further writes were issued to that area of the device (i.e. that bitmap chunk). Suitable for any pending -stable kernel. Signed-off-by: NeilBrown Signed-off-by: Greg Kroah-Hartman commit 43bee391355ad45021579511963dfb46d6e810fb Author: Denis Kirjanov Date: Tue Jun 1 15:43:34 2010 -0400 powerpc/oprofile: fix potential buffer overrun in op_model_cell.c commit 238c1a78c957f3dc7cb848b161dcf4805793ed56 upstream. Fix potential initial_lfsr buffer overrun. Writing past the end of the buffer could happen when index == ENTRIES Signed-off-by: Denis Kirjanov Signed-off-by: Robert Richter Signed-off-by: Greg Kroah-Hartman commit 21dfd35afeaf03c5fe4409d1021b2cc888c11326 Author: Michael Neuling Date: Wed Apr 28 13:39:41 2010 +0000 powerpc/pseries: Make query_cpu_stopped callable outside hotplug cpu commit f8b67691828321f5c85bb853283aa101ae673130 upstream. This moves query_cpu_stopped() out of the hotplug cpu code and into smp.c so it can called in other places and renames it to smp_query_cpu_stopped(). It also cleans up the return values by adding some #defines Signed-off-by: Michael Neuling Signed-off-by: Benjamin Herrenschmidt Signed-off-by: Greg Kroah-Hartman commit 4be4e01c7d844bbb7b88a2bcf931ac00408289c3 Author: Michael Neuling Date: Wed Apr 28 13:39:41 2010 +0000 powerpc/pseries: Only call start-cpu when a CPU is stopped commit aef40e87d866355ffd279ab21021de733242d0d5 upstream. Currently we always call start-cpu irrespective of if the CPU is stopped or not. Unfortunatley on POWER7, firmware seems to not like start-cpu being called when a cpu already been started. This was not the case on POWER6 and earlier. This patch checks to see if the CPU is stopped or not via an query-cpu-stopped-state call, and only calls start-cpu on CPUs which are stopped. This fixes a bug with kexec on POWER7 on PHYP where only the primary thread would make it to the second kernel. Reported-by: Ankita Garg Signed-off-by: Michael Neuling Signed-off-by: Benjamin Herrenschmidt Signed-off-by: Greg Kroah-Hartman commit ae7aff994d0934d86a3278a65889f34629c778a7 Author: Jeff Mahoney Date: Wed Mar 17 10:55:51 2010 +0000 powerpc: Fix handling of strncmp with zero len commit 637a99022fb119b90fb281715d13172f0394fc12 upstream. Commit 0119536c, which added the assembly version of strncmp to powerpc, mentions that it adds two instructions to the version from boot/string.S to allow it to handle len=0. Unfortunately, it doesn't always return 0 when that is the case. The length is passed in r5, but the return value is passed back in r3. In certain cases, this will happen to work. Otherwise it will pass back the address of the first string as the return value. This patch lifts the len <= 0 handling code from memcpy to handle that case. Reported by: Christian_Sellars@symantec.com Signed-off-by: Jeff Mahoney Signed-off-by: Benjamin Herrenschmidt Signed-off-by: Greg Kroah-Hartman commit 88102a584bf833892493a514430bcebd53d8f017 Author: Andreas Bombe Date: Mon May 17 23:12:46 2010 -0700 ARCNET: Limit com20020 PCI ID matches for SOHARD cards commit e7971c80a8e0299f91272ad8e8ac4167623e1862 upstream. The SH SOHARD ARCNET cards are implemented using generic PLX Technology PCI<->IOBus bridges. Subvendor and subdevice IDs were not specified, causing the driver to attach to any such bridge and likely crash the system by attempting to initialize an unrelated device. Fix by specifying subvendor and subdevice according to the values found in the PCI-ID Repository at http://pci-ids.ucw.cz/ . Signed-off-by: Andreas Bombe Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit b9cb946fac3288e7a35571525ef09946e2ba6d7f Author: Pavel Emelyanov Date: Fri May 14 15:33:36 2010 +0400 NFSD: don't report compiled-out versions as present commit 15ddb4aec54422ead137b03ea4e9b3f5db3f7cc2 upstream. The /proc/fs/nfsd/versions file calls nfsd_vers() to check whether the particular nfsd version is present/available. The problem is that once I turn off e.g. NFSD-V4 this call returns -1 which is true from the callers POV which is wrong. The proposal is to report false in that case. The bug has existed since 6658d3a7bbfd1768 "[PATCH] knfsd: remove nfsd_versbits as intermediate storage for desired versions". Signed-off-by: Pavel Emelyanov Acked-by: NeilBrown Signed-off-by: J. Bruce Fields Signed-off-by: Greg Kroah-Hartman commit 90bfd720bcda3a04e22be93b66138a117c0bcf28 Author: Tejun Heo Date: Wed May 19 15:38:58 2010 +0200 libata: disable ATAPI AN by default commit e7ecd435692ca9bde9d124be30b3a26e672ea6c2 upstream. There are ATAPI devices which raise AN when hit by commands issued by open(). This leads to infinite loop of AN -> MEDIA_CHANGE uevent -> udev open() to check media -> AN. Both ACS and SerialATA standards don't define in which case ATAPI devices are supposed to raise or not raise AN. They both list media insertion event as a possible use case for ATAPI ANs but there is no clear description of what constitutes such events. As such, it seems a bit too naive to export ANs directly to userland as MEDIA_CHANGE events without further verification (which should behave similarly to windows as it apparently is the only thing that some hardware vendors are testing against). This patch adds libata.atapi_an module parameter and disables ATAPI AN by default for now. Signed-off-by: Tejun Heo Cc: Kay Sievers Cc: Nick Bowler Cc: David Zeuthen Signed-off-by: Jeff Garzik Signed-off-by: Greg Kroah-Hartman