- Jul 05, 2010
-
-
Greg Kroah-Hartman authored
-
Wei Yongjun authored
commit 2e3219b5 upstream. commit 5fa782c2 sctp: Fix skb_over_panic resulting from multiple invalid \ parameter errors (CVE-2010-1173) (v4) cause 'error cause' never be add the the ERROR chunk due to some typo when check valid length in sctp_init_cause_fixed(). Signed-off-by:
Wei Yongjun <yjwei@cn.fujitsu.com> Reviewed-by:
Neil Horman <nhorman@tuxdriver.com> Acked-by:
Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Toshiyuki Okajima authored
commit cea7daa3 upstream. find_keyring_by_name() can gain access to a keyring that has had its reference count reduced to zero, and is thus ready to be freed. This then allows the dead keyring to be brought back into use whilst it is being destroyed. The following timeline illustrates the process: |(cleaner) (user) | | free_user(user) sys_keyctl() | | | | key_put(user->session_keyring) keyctl_get_keyring_ID() | || //=> keyring->usage = 0 | | |schedule_work(&key_cleanup_task) lookup_user_key() | || | | kmem_cache_free(,user) | | . |[KEY_SPEC_USER_KEYRING] | . install_user_keyrings() | . || | key_cleanup() [<= worker_thread()] || | | || | [spin_lock(&key_serial_lock)] |[mutex_lock(&key_user_keyr..mutex)] | | || | atomic_read() == 0 || | |{ rb_ease(&key->serial_node,) } || | | || | [spin_unlock(&key_serial_lock)] |find_keyring_by_name() | | ||| | keyring_destroy(keyring) ||[read_lock(&keyring_name_lock)] | || ||| | |[write_lock(&keyring_name_lock)] ||atomic_inc(&keyring->usage) | |. ||| *** GET freeing keyring *** | |. ||[read_unlock(&keyring_name_lock)] | || || | |list_del() |[mutex_unlock(&key_user_k..mutex)] | || | | |[write_unlock(&keyring_name_lock)] ** INVALID keyring is returned ** | | . | kmem_cache_free(,keyring) . | . | atomic_dec(&keyring->usage) v *** DESTROYED *** TIME If CONFIG_SLUB_DEBUG=y then we may see the following message generated: ============================================================================= BUG key_jar: Poison overwritten ----------------------------------------------------------------------------- INFO: 0xffff880197a7e200-0xffff880197a7e200. First byte 0x6a instead of 0x6b INFO: Allocated in key_alloc+0x10b/0x35f age=25 cpu=1 pid=5086 INFO: Freed in key_cleanup+0xd0/0xd5 age=12 cpu=1 pid=10 INFO: Slab 0xffffea000592cb90 objects=16 used=2 fp=0xffff880197a7e200 flags=0x200000000000c3 INFO: Object 0xffff880197a7e200 @offset=512 fp=0xffff880197a7e300 Bytes b4 0xffff880197a7e1f0: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ Object 0xffff880197a7e200: 6a 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b jkkkkkkkkkkkkkkk Alternatively, we may see a system panic happen, such as: BUG: unable to handle kernel NULL pointer dereference at 0000000000000001 IP: [<ffffffff810e61a3>] kmem_cache_alloc+0x5b/0xe9 PGD 6b2b4067 PUD 6a80d067 PMD 0 Oops: 0000 [#1] SMP last sysfs file: /sys/kernel/kexec_crash_loaded CPU 1 ... Pid: 31245, comm: su Not tainted 2.6.34-rc5-nofixed-nodebug #2 D2089/PRIMERGY RIP: 0010:[<ffffffff810e61a3>] [<ffffffff810e61a3>] kmem_cache_alloc+0x5b/0xe9 RSP: 0018:ffff88006af3bd98 EFLAGS: 00010002 RAX: 0000000000000000 RBX: 0000000000000001 RCX: ffff88007d19900b RDX: 0000000100000000 RSI: 00000000000080d0 RDI: ffffffff81828430 RBP: ffffffff81828430 R08: ffff88000a293750 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000100000 R12: 00000000000080d0 R13: 00000000000080d0 R14: 0000000000000296 R15: ffffffff810f20ce FS: 00007f97116bc700(0000) GS:ffff88000a280000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000001 CR3: 000000006a91c000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process su (pid: 31245, threadinfo ffff88006af3a000, task ffff8800374414c0) Stack: 0000000512e0958e 0000000000008000 ffff880037f8d180 0000000000000001 0000000000000000 0000000000008001 ffff88007d199000 ffffffff810f20ce 0000000000008000 ffff88006af3be48 0000000000000024 ffffffff810face3 Call Trace: [<ffffffff810f20ce>] ? get_empty_filp+0x70/0x12f [<ffffffff810face3>] ? do_filp_open+0x145/0x590 [<ffffffff810ce208>] ? tlb_finish_mmu+0x2a/0x33 [<ffffffff810ce43c>] ? unmap_region+0xd3/0xe2 [<ffffffff810e4393>] ? virt_to_head_page+0x9/0x2d [<ffffffff81103916>] ? alloc_fd+0x69/0x10e [<ffffffff810ef4ed>] ? do_sys_open+0x56/0xfc [<ffffffff81008a02>] ? system_call_fastpath+0x16/0x1b Code: 0f 1f 44 00 00 49 89 c6 fa 66 0f 1f 44 00 00 65 4c 8b 04 25 60 e8 00 00 48 8b 45 00 49 01 c0 49 8b 18 48 85 db 74 0d 48 63 45 18 <48> 8b 04 03 49 89 00 eb 14 4c 89 f9 83 ca ff 44 89 e6 48 89 ef RIP [<ffffffff810e61a3>] kmem_cache_alloc+0x5b/0xe9 This problem is that find_keyring_by_name does not confirm that the keyring is valid before accepting it. Skipping keyrings that have been reduced to a zero count seems the way to go. To this end, use atomic_inc_not_zero() to increment the usage count and skip the candidate keyring if that returns false. The following script _may_ cause the bug to happen, but there's no guarantee as the window of opportunity is small: #!/bin/sh LOOP=100000 USER=dummy_user /bin/su -c "exit;" $USER || { /usr/sbin/adduser -m $USER; add=1; } for ((i=0; i<LOOP; i++)) do /bin/su -c "echo '$i' > /dev/null" $USER done (( add == 1 )) && /usr/sbin/userdel -r $USER exit Note that the nominated user must not be in use. An alternative way of testing this may be: for ((i=0; i<100000; i++)) do keyctl session foo /bin/true || break done >&/dev/null as that uses a keyring named "foo" rather than relying on the user and user-session named keyrings. Reported-by:
Toshiyuki Okajima <toshi.okajima@jp.fujitsu.com> Signed-off-by:
David Howells <dhowells@redhat.com> Tested-by:
Toshiyuki Okajima <toshi.okajima@jp.fujitsu.com> Acked-by:
Serge Hallyn <serue@us.ibm.com> Signed-off-by:
James Morris <jmorris@namei.org> Cc: Ben Hutchings <ben@decadent.org.uk> Cc: Chuck Ebbert <cebbert@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Dan Carpenter authored
commit 4d09ec0f upstream. We were using the wrong variable here so the error codes weren't being returned properly. The original code returns -ENOKEY. Signed-off-by:
Dan Carpenter <error27@gmail.com> Signed-off-by:
David Howells <dhowells@redhat.com> Signed-off-by:
James Morris <jmorris@namei.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Helge Deller authored
commit 550f0d92 upstream. Clear the floating point exception flag before returning to user space. This is needed, else the libc trampoline handler may hit the same SIGFPE again while building up a trampoline to a signal handler. Fixes debian bug #559406. Signed-off-by:
Helge Deller <deller@gmx.de> Signed-off-by:
Kyle McMartin <kyle@mcmartin.ca> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Neil Horman authored
commit d0021b25 upstream. Fix TIPC to disallow sending to remote addresses prior to entering NET_MODE user programs can oops the kernel by sending datagrams via AF_TIPC prior to entering networked mode. The following backtrace has been observed: ID: 13459 TASK: ffff810014640040 CPU: 0 COMMAND: "tipc-client" [exception RIP: tipc_node_select_next_hop+90] RIP: ffffffff8869d3c3 RSP: ffff81002d9a5ab8 RFLAGS: 00010202 RAX: 0000000000000001 RBX: 0000000000000001 RCX: 0000000000000001 RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000001001001 RBP: 0000000001001001 R8: 0074736575716552 R9: 0000000000000000 R10: ffff81003fbd0680 R11: 00000000000000c8 R12: 0000000000000008 R13: 0000000000000001 R14: 0000000000000001 R15: ffff810015c6ca00 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 RIP: 0000003cbd8d49a3 RSP: 00007fffc84e0be8 RFLAGS: 00010206 RAX: 000000000000002c RBX: ffffffff8005d116 RCX: 0000000000000000 RDX: 0000000000000008 RSI: 00007fffc84e0c00 RDI: 0000000000000003 RBP: 0000000000000000 R8: 00007fffc84e0c10 R9: 0000000000000010 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: 00007fffc84e0d10 R14: 0000000000000000 R15: 00007fffc84e0c30 ORIG_RAX: 000000000000002c CS: 0033 SS: 002b What happens is that, when the tipc module in inserted it enters a standalone node mode in which communication to its own address is allowed <0.0.0> but not to other addresses, since the appropriate data structures have not been allocated yet (specifically the tipc_net pointer). There is nothing stopping a client from trying to send such a message however, and if that happens, we attempt to dereference tipc_net.zones while the pointer is still NULL, and explode. The fix is pretty straightforward. Since these oopses all arise from the dereference of global pointers prior to their assignment to allocated values, and since these allocations are small (about 2k total), lets convert these pointers to static arrays of the appropriate size. All the accesses to these bits consider 0/NULL to be a non match when searching, so all the lookups still work properly, and there is no longer a chance of a bad dererence anywhere. As a bonus, this lets us eliminate the setup/teardown routines for those pointers, and elimnates the need to preform any locking around them to prevent access while their being allocated/freed. I've updated the tipc_net structure to behave this way to fix the exact reported problem, and also fixed up the tipc_bearers and media_list arrays to fix an obvious simmilar problem that arises from issuing tipc-config commands to manipulate bearers/links prior to entering networked mode I've tested this for a few hours by running the sanity tests and stress test with the tipcutils suite, and nothing has fallen over. There have been a few lockdep warnings, but those were there before, and can be addressed later, as they didn't actually result in any deadlock. Signed-off-by:
Neil Horman <nhorman@tuxdriver.com> CC: Allan Stephens <allan.stephens@windriver.com> CC: David S. Miller <davem@davemloft.net> CC: tipc-discussion@lists.sourceforge.net Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Miklos Szeredi authored
commit db1f05bb upstream. Add a new UMOUNT_NOFOLLOW flag to umount(2). This is needed to prevent symlink attacks in unprivileged unmounts (fuse, samba, ncpfs). Additionally, return -EINVAL if an unknown flag is used (and specify an explicitly unused flag: UMOUNT_UNUSED). This makes it possible for the caller to determine if a flag is supported or not. CC: Eugene Teo <eugene@redhat.com> CC: Michael Kerrisk <mtk.manpages@gmail.com> Signed-off-by:
Miklos Szeredi <mszeredi@suse.cz> Signed-off-by:
Al Viro <viro@zeniv.linux.org.uk> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Neil Horman authored
commit 5fa782c2 upstream. Ok, version 4 Change Notes: 1) Minor cleanups, from Vlads notes Summary: Hey- Recently, it was reported to me that the kernel could oops in the following way: <5> kernel BUG at net/core/skbuff.c:91! <5> invalid operand: 0000 [#1] <5> Modules linked in: sctp netconsole nls_utf8 autofs4 sunrpc iptable_filter ip_tables cpufreq_powersave parport_pc lp parport vmblock(U) vsock(U) vmci(U) vmxnet(U) vmmemctl(U) vmhgfs(U) acpiphp dm_mirror dm_mod button battery ac md5 ipv6 uhci_hcd ehci_hcd snd_ens1371 snd_rawmidi snd_seq_device snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd_page_alloc snd_ac97_codec snd soundcore pcnet32 mii floppy ext3 jbd ata_piix libata mptscsih mptsas mptspi mptscsi mptbase sd_mod scsi_mod <5> CPU: 0 <5> EIP: 0060:[<c02bff27>] Not tainted VLI <5> EFLAGS: 00010216 (2.6.9-89.0.25.EL) <5> EIP is at skb_over_panic+0x1f/0x2d <5> eax: 0000002c ebx: c033f461 ecx: c0357d96 edx: c040fd44 <5> esi: c033f461 edi: df653280 ebp: 00000000 esp: c040fd40 <5> ds: 007b es: 007b ss: 0068 <5> Process swapper (pid: 0, threadinfo=c040f000 task=c0370be0) <5> Stack: c0357d96 e0c29478 00000084 00000004 c033f461 df653280 d7883180 e0c2947d <5> 00000000 00000080 df653490 00000004 de4f1ac0 de4f1ac0 00000004 df653490 <5> 00000001 e0c2877a 08000800 de4f1ac0 df653490 00000000 e0c29d2e 00000004 <5> Call Trace: <5> [<e0c29478>] sctp_addto_chunk+0xb0/0x128 [sctp] <5> [<e0c2947d>] sctp_addto_chunk+0xb5/0x128 [sctp] <5> [<e0c2877a>] sctp_init_cause+0x3f/0x47 [sctp] <5> [<e0c29d2e>] sctp_process_unk_param+0xac/0xb8 [sctp] <5> [<e0c29e90>] sctp_verify_init+0xcc/0x134 [sctp] <5> [<e0c20322>] sctp_sf_do_5_1B_init+0x83/0x28e [sctp] <5> [<e0c25333>] sctp_do_sm+0x41/0x77 [sctp] <5> [<c01555a4>] cache_grow+0x140/0x233 <5> [<e0c26ba1>] sctp_endpoint_bh_rcv+0xc5/0x108 [sctp] <5> [<e0c2b863>] sctp_inq_push+0xe/0x10 [sctp] <5> [<e0c34600>] sctp_rcv+0x454/0x509 [sctp] <5> [<e084e017>] ipt_hook+0x17/0x1c [iptable_filter] <5> [<c02d005e>] nf_iterate+0x40/0x81 <5> [<c02e0bb9>] ip_local_deliver_finish+0x0/0x151 <5> [<c02e0c7f>] ip_local_deliver_finish+0xc6/0x151 <5> [<c02d0362>] nf_hook_slow+0x83/0xb5 <5> [<c02e0bb2>] ip_local_deliver+0x1a2/0x1a9 <5> [<c02e0bb9>] ip_local_deliver_finish+0x0/0x151 <5> [<c02e103e>] ip_rcv+0x334/0x3b4 <5> [<c02c66fd>] netif_receive_skb+0x320/0x35b <5> [<e0a0928b>] init_stall_timer+0x67/0x6a [uhci_hcd] <5> [<c02c67a4>] process_backlog+0x6c/0xd9 <5> [<c02c690f>] net_rx_action+0xfe/0x1f8 <5> [<c012a7b1>] __do_softirq+0x35/0x79 <5> [<c0107efb>] handle_IRQ_event+0x0/0x4f <5> [<c01094de>] do_softirq+0x46/0x4d Its an skb_over_panic BUG halt that results from processing an init chunk in which too many of its variable length parameters are in some way malformed. The problem is in sctp_process_unk_param: if (NULL == *errp) *errp = sctp_make_op_error_space(asoc, chunk, ntohs(chunk->chunk_hdr->length)); if (*errp) { sctp_init_cause(*errp, SCTP_ERROR_UNKNOWN_PARAM, WORD_ROUND(ntohs(param.p->length))); sctp_addto_chunk(*errp, WORD_ROUND(ntohs(param.p->length)), param.v); When we allocate an error chunk, we assume that the worst case scenario requires that we have chunk_hdr->length data allocated, which would be correct nominally, given that we call sctp_addto_chunk for the violating parameter. Unfortunately, we also, in sctp_init_cause insert a sctp_errhdr_t structure into the error chunk, so the worst case situation in which all parameters are in violation requires chunk_hdr->length+(sizeof(sctp_errhdr_t)*param_count) bytes of data. The result of this error is that a deliberately malformed packet sent to a listening host can cause a remote DOS, described in CVE-2010-1173: http://cve.mitre.org/cgi-bin/cvename.cgi?name=2010-1173 I've tested the below fix and confirmed that it fixes the issue. We move to a strategy whereby we allocate a fixed size error chunk and ignore errors we don't have space to report. Tested by me successfully Signed-off-by:
Neil Horman <nhorman@tuxdriver.com> Acked-by:
Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Aneesh Kumar K.V authored
commit 2acf2c26 upstream. With delayed allocation we lock the page in write_cache_pages() and try to build an in memory extent of contiguous blocks. This is needed so that we can get large contiguous blocks request. If range_cyclic mode is enabled, write_cache_pages() will loop back to the 0 index if no I/O has been done yet, and try to start writing from the beginning of the range. That causes an attempt to take the page lock of lower index page while holding the page lock of higher index page, which can cause a dead lock with another writeback thread. The solution is to implement the range_cyclic behavior in ext4_da_writepages() instead. http://bugzilla.kernel.org/show_bug.cgi?id=12579 Signed-off-by:
Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu> Signed-off-by:
Jayson R. King <dev@jaysonking.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Aneesh Kumar K.V authored
commit 22208ded upstream. The range_cyclic writeback mode uses the address_space writeback_index as the start index for writeback. With delayed allocation we were updating writeback_index wrongly resulting in highly fragmented file. This patch reduces the number of extents reduced from 4000 to 27 for a 3GB file. Signed-off-by:
Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by:
Theodore Ts'o <tytso@mit.edu> [dev@jaysonking.com: Some changed lines from the original version of this patch were dropped, since they were rolled up with another cherry-picked patch applied to 2.6.27.y earlier.] [dev@jaysonking.com: Use of wbc->no_nrwrite_index_update was dropped, since write_cache_pages_da() implies it.] Signed-off-by:
Jayson R. King <dev@jaysonking.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Theodore Ts'o authored
commit 8e48dcfb upstream. Make a copy of write_cache_pages() for the benefit of ext4_da_writepages(). This allows us to simplify the code some, and will allow us to further customize the code in future patches. There are some nasty hacks in write_cache_pages(), which Linus has (correctly) characterized as vile. I've just copied it into write_cache_pages_da(), without trying to clean those bits up lest I break something in the ext4's delalloc implementation, which is a bit fragile right now. This will allow Dave Chinner to clean up write_cache_pages() in mm/page-writeback.c, without worrying about breaking ext4. Eventually write_cache_pages_da() will go away when I rewrite ext4's delayed allocation and create a general ext4_writepages() which is used for all of ext4's writeback. Until now this is the lowest risk way to clean up the core write_cache_pages() function. Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu> Cc: Dave Chinner <david@fromorbit.com> [dev@jaysonking.com: Dropped the hunks which reverted the use of no_nrwrite_index_update, since those lines weren't ever created on 2.6.27.y] [dev@jaysonking.com: Copied from 2.6.27.y's version of write_cache_pages(), plus the changes to it from patch "vfs: Add no_nrwrite_index_update writeback control flag"] Signed-off-by:
Jayson R. King <dev@jaysonking.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Eric Sandeen authored
commit 42007efd upstream. If groups_per_flex < 2, sbi->s_flex_groups[] doesn't get filled out, and every other access to this first tests s_log_groups_per_flex; same thing needs to happen in resize or we'll wander off into a null pointer when doing an online resize of the file system. Thanks to Christoph Biedl, who came up with the trivial testcase: # truncate --size 128M fsfile # mkfs.ext3 -F fsfile # tune2fs -O extents,uninit_bg,dir_index,flex_bg,huge_file,dir_nlink,extra_isize fsfile # e2fsck -yDf -C0 fsfile # truncate --size 132M fsfile # losetup /dev/loop0 fsfile # mount /dev/loop0 mnt # resize2fs -p /dev/loop0 https://bugzilla.kernel.org/show_bug.cgi?id=13549 Reported-by:
Alessandro Polverini <alex@nibbles.it> Test-case-by:
Christoph Biedl <bugzilla.kernel.bpeb@manchmal.in-ulm.de> Signed-off-by:
Eric Sandeen <sandeen@redhat.com> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Richard Kennedy authored
commit cbab05f0 upstream. Making gconfig fails on fedora 13 as the linker cannot resolve dlsym. Adding libdl to the link command fixes this. make shows this error :- /usr/bin/ld: scripts/kconfig/kconfig_load.o: undefined reference to symbol 'dlsym@@GLIBC_2.2.5' /usr/bin/ld: note: 'dlsym@@GLIBC_2.2.5' is defined in DSO /lib64/libdl.so.2 so try adding it to the linker command line /lib64/libdl.so.2: could not read symbols: Invalid operation tested on x86_64 fedora 13. Signed-off-by:
Richard Kennedy <richard@rsk.demon.co.uk> Reviewed-by:
WANG Cong <xiyou.wangcong@gmail.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Michal Marek <mmarek@suse.cz> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Jiri Kosina authored
commit a747c5ab upstream. If run_to_completion flag is set, it means that we are running in a single-threaded mode, and thus no locks are held. This fixes a deadlock when IPMI notifier is being called during panic. Signed-off-by:
Jiri Kosina <jkosina@suse.cz> Acked-by:
Corey Minyard <minyard@acm.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Jeff Moyer authored
commit 91803b49 upstream. I/O errors can happen due to temporary failures, like multipath errors or losing network contact with the iSCSI server. Because of that, the VM will retry readpage on the page. However, do_generic_file_read does not clear PG_error. This causes the system to be unable to actually use the data in the page cache page, even if the subsequent readpage completes successfully! The function filemap_fault has had a ClearPageError before readpage forever. This patch simply adds the same to do_generic_file_read. Signed-off-by:
Jeff Moyer <jmoyer@redhat.com> Signed-off-by:
Rik van Riel <riel@redhat.com> Acked-by:
Larry Woodman <lwoodman@redhat.com> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Dan Williams authored
commit e2218350 upstream. When the user sets the block device to readwrite then the mddev should follow suit. Otherwise, the BUG_ON in md_write_start() will be set to trigger. The reverse direction, setting mddev->ro to match a set readonly request, can be ignored because the blkdev level readonly flag precludes the need to have mddev->ro set correctly. Nevermind the fact that setting mddev->ro to 1 may fail if the array is in use. Signed-off-by:
Dan Williams <dan.j.williams@intel.com> Signed-off-by:
NeilBrown <neilb@suse.de> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
NeilBrown authored
commit af3a2cd6 upstream. read_balance uses a "unsigned long" for a sector number which will get truncated beyond 2TB. This will cause read-balancing to be non-optimal, and can cause data to be read from the 'wrong' branch during a resync. This has a very small chance of returning wrong data. Reported-by:
Jordan Russell <jr-list-2010@quo.to> Signed-off-by:
NeilBrown <neilb@suse.de> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
NeilBrown authored
commit 964147d5 upstream. There is a very small race window when writing to a RAID1 such that if a device is marked faulty at exactly the wrong time, the write-in-progress will not be sent to the device, but the bitmap (if present) will be updated to say that the write was sent. Then if the device turned out to still be usable as was re-added to the array, the bitmap-based-resync would skip resyncing that block, possibly leading to corruption. This would only be a problem if no further writes were issued to that area of the device (i.e. that bitmap chunk). Suitable for any pending -stable kernel. Signed-off-by:
NeilBrown <neilb@suse.de> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Denis Kirjanov authored
commit 238c1a78 upstream. Fix potential initial_lfsr buffer overrun. Writing past the end of the buffer could happen when index == ENTRIES Signed-off-by:
Denis Kirjanov <dkirjanov@kernel.org> Signed-off-by:
Robert Richter <robert.richter@amd.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Michael Neuling authored
commit f8b67691 upstream. This moves query_cpu_stopped() out of the hotplug cpu code and into smp.c so it can called in other places and renames it to smp_query_cpu_stopped(). It also cleans up the return values by adding some #defines Signed-off-by:
Michael Neuling <mikey@neuling.org> Signed-off-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Michael Neuling authored
commit aef40e87 upstream. Currently we always call start-cpu irrespective of if the CPU is stopped or not. Unfortunatley on POWER7, firmware seems to not like start-cpu being called when a cpu already been started. This was not the case on POWER6 and earlier. This patch checks to see if the CPU is stopped or not via an query-cpu-stopped-state call, and only calls start-cpu on CPUs which are stopped. This fixes a bug with kexec on POWER7 on PHYP where only the primary thread would make it to the second kernel. Reported-by:
Ankita Garg <ankita@linux.vnet.ibm.com> Signed-off-by:
Michael Neuling <mikey@neuling.org> Signed-off-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Jeff Mahoney authored
commit 637a9902 upstream. Commit 0119536c, which added the assembly version of strncmp to powerpc, mentions that it adds two instructions to the version from boot/string.S to allow it to handle len=0. Unfortunately, it doesn't always return 0 when that is the case. The length is passed in r5, but the return value is passed back in r3. In certain cases, this will happen to work. Otherwise it will pass back the address of the first string as the return value. This patch lifts the len <= 0 handling code from memcpy to handle that case. Reported by: Christian_Sellars@symantec.com Signed-off-by:
Jeff Mahoney <jeffm@suse.com> Signed-off-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Andreas Bombe authored
commit e7971c80 upstream. The SH SOHARD ARCNET cards are implemented using generic PLX Technology PCI<->IOBus bridges. Subvendor and subdevice IDs were not specified, causing the driver to attach to any such bridge and likely crash the system by attempting to initialize an unrelated device. Fix by specifying subvendor and subdevice according to the values found in the PCI-ID Repository at http://pci-ids.ucw.cz/ . Signed-off-by:
Andreas Bombe <aeb@debian.org> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Pavel Emelyanov authored
commit 15ddb4ae upstream. The /proc/fs/nfsd/versions file calls nfsd_vers() to check whether the particular nfsd version is present/available. The problem is that once I turn off e.g. NFSD-V4 this call returns -1 which is true from the callers POV which is wrong. The proposal is to report false in that case. The bug has existed since 6658d3a7 "[PATCH] knfsd: remove nfsd_versbits as intermediate storage for desired versions". Signed-off-by:
Pavel Emelyanov <xemul@openvz.org> Acked-by:
NeilBrown <neilb@suse.de> Signed-off-by:
J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Tejun Heo authored
commit e7ecd435 upstream. There are ATAPI devices which raise AN when hit by commands issued by open(). This leads to infinite loop of AN -> MEDIA_CHANGE uevent -> udev open() to check media -> AN. Both ACS and SerialATA standards don't define in which case ATAPI devices are supposed to raise or not raise AN. They both list media insertion event as a possible use case for ATAPI ANs but there is no clear description of what constitutes such events. As such, it seems a bit too naive to export ANs directly to userland as MEDIA_CHANGE events without further verification (which should behave similarly to windows as it apparently is the only thing that some hardware vendors are testing against). This patch adds libata.atapi_an module parameter and disables ATAPI AN by default for now. Signed-off-by:
Tejun Heo <tj@kernel.org> Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: Nick Bowler <nbowler@elliptictech.com> Cc: David Zeuthen <david@fubar.dk> Signed-off-by:
Jeff Garzik <jgarzik@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
- May 26, 2010
-
-
Greg Kroah-Hartman authored
-
Junjiro R. Okajima authored
commit 1b79cd04 upstream. The previous patch from Alan Cox ("nfsd: fix vm overcommit crash", commit 731572d3) fixed the problem where knfsd crashes on exported shmemfs objects and strict overcommit is set. But the patch forgot supporting the case when CONFIG_SECURITY is disabled. This patch copies a part of his fix which is mainly for detecting a bug earlier. Acked-by:
James Morris <jmorris@namei.org> Signed-off-by:
Alan Cox <alan@redhat.com> Signed-off-by:
Junjiro R. Okajima <hooanon05@yahoo.co.jp> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Alan Cox authored
commit 731572d3 upstream. Junjiro R. Okajima reported a problem where knfsd crashes if you are using it to export shmemfs objects and run strict overcommit. In this situation the current->mm based modifier to the overcommit goes through a NULL pointer. We could simply check for NULL and skip the modifier but we've caught other real bugs in the past from mm being NULL here - cases where we did need a valid mm set up (eg the exec bug about a year ago). To preserve the checks and get the logic we want shuffle the checking around and add a new helper to the vm_ security wrappers Also fix a current->mm reference in nommu that should use the passed mm [akpm@linux-foundation.org: coding-style fixes] [akpm@linux-foundation.org: fix build] Reported-by:
Junjiro R. Okajima <hooanon05@yahoo.co.jp> Acked-by:
James Morris <jmorris@namei.org> Signed-off-by:
Alan Cox <alan@redhat.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Jean Delvare authored
commit 1c010ff8 upstream. The functionality bit vector is always returned as a little-endian 32-bit number by the device, so it must be byte-swapped to the host endianness. On the other hand, the delay value is handled by the USB stack, so no byte swapping is needed on our side. This fixes bug #15105: http://bugzilla.kernel.org/show_bug.cgi?id=15105 Reported-by:
Jens Richter <jens@richter-stutensee.de> Signed-off-by:
Jean Delvare <khali@linux-fr.org> Tested-by:
Jens Richter <jens@richter-stutensee.de> Cc: Till Harbaum <till@harbaum.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Jean Delvare authored
commit c074c39d upstream. Experience has shown that the block buffer can only be used for SMBus (not I2C) block transactions, even though the datasheet doesn't mention this limitation. Reported-by:
Felix Rubinstein <felixru@gmail.com> Signed-off-by:
Jean Delvare <khali@linux-fr.org> Cc: Oleg Ryjkov <oryjkov@gmail.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Jean Delvare authored
commit b0bcdd3c upstream. Different motherboards have different PNP declarations for W83781D/W83782D chips. Some declare the whole range of I/O ports (8 ports), some declare only the useful ports (2 ports at offset 5) and some declare fancy ranges, for example 4 ports at offset 4. To properly handle all cases, request all ports individually for probing. After we have determined that we really have a W83781D or W83782D chip, the useful port range will be requested again, as a single block. I did not see a board which needs this yet, but I know of one for lm78 driver and I'd like to keep the logic of these two drivers in sync. Signed-off-by:
Jean Delvare <khali@linux-fr.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Tom Tucker authored
commit 22945e4a upstream. A race between svc_revisit and svc_delete_xprt can result in deferred requests holding references on a transport that can never be recovered because dead transports are not enqueued for subsequent processing. Check for XPT_DEAD in revisit to clean up completing deferrals on a dead transport and sweep a transport's deferred queue to do the same for queued but unprocessed deferrals. Signed-off-by:
Tom Tucker <tom@opengridcomputing.com> Signed-off-by:
J. Bruce Fields <bfields@citi.umich.edu> Cc: roma1390 <roma1390@gmail.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Tejun Heo authored
commit 534ead70 upstream. libata currently doesn't retry if a command fails with AC_ERR_INVALID assuming that retrying won't get it any further even if retried. However, a failure may be classified as invalid through hardware glitch (incorrect reading of the error register or firmware bug) and there isn't whole lot to gain by not retrying as actually invalid commands will be failed immediately. Also, commands serving FS IOs are extremely unlikely to be invalid. Retry FS IOs even if it's marked invalid. Transient and incorrect invalid failure was seen while debugging firmware related issue on Samsung n130 on bko#14314. http://bugzilla.kernel.org/show_bug.cgi?id=14314 Signed-off-by:
Tejun Heo <tj@kernel.org> Reported-by:
Johannes Stezenbach <js@sig21.net> Signed-off-by:
Jeff Garzik <jgarzik@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Jeff Garzik authored
libata: ensure NCQ error result taskfile is fully initialized before returning it via qc->result_tf. commit a09bf4cd upstream. Signed-off-by:
Jeff Garzik <jgarzik@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Jean Delvare authored
commit b1d4b390 upstream. Some FSC hardware monitoring chips (Syleus at least) doesn't like quick writes we typically use to probe for I2C chips. Use a regular byte read instead for the address they live at (0x73). These are the only known chips living at this address on PC systems. For clarity, this fix should not be needed for kernels 2.6.30 and later, as we started instantiating the hwmon devices explicitly based on DMI data. Still, this fix is valuable in the following two cases: * Support for recent FSC chips on older kernels. The DMI-based device instantiation is more difficult to backport than the device support itself. * Case where the DMI-based device instantiation fails, whatever the reason. We fall back to probing in that case, so it should work. This fixes kernel bug #15634: https://bugzilla.kernel.org/show_bug.cgi?id=15634 Signed-off-by:
Jean Delvare <khali@linux-fr.org> Acked-by:
Hans de Goede <hdegoede@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Chuck Lever authored
commit 356e76b8 upstream. NFSv4 mounts ignore the rsize and wsize mount options, and always use the default transfer size for both. This seems to be because all NFSv4 mounts are now cloned, and the cloning logic doesn't copy the rsize and wsize settings from the parent nfs_server. I tested Fedora's 2.6.32.11-99 and it seems to have this problem as well, so I'm guessing that .33, .32, and perhaps older kernels have this issue as well. Signed-off-by:
Chuck Lever <chuck.lever@oracle.com> Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Al Viro authored
commit d9e80b7d upstream. If dentry found stale happens to be a root of disconnected tree, we can't d_drop() it; its d_hash is actually part of s_anon and d_drop() would simply hide it from shrink_dcache_for_umount(), leading to all sorts of fun, including busy inodes on umount and oopsen after that. Bug had been there since at least 2006 (commit c636eb already has it), so it's definitely -stable fodder. Signed-off-by:
Al Viro <viro@zeniv.linux.org.uk> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Dan Carpenter authored
commit fa7fe7af upstream. There is a typo here. We should be testing "*dentry" which was just assigned instead of "dentry". This could result in dereferencing an ERR_PTR inside either usbfs_mkdir() or usbfs_create(). Signed-off-by:
Dan Carpenter <error27@gmail.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Neil Brown authored
commit 2bc3c117 upstream. When read_buf is called to move over to the next page in the pagelist of an NFSv4 request, it sets argp->end to essentially a random number, certainly not an address within the page which argp->p now points to. So subsequent calls to READ_BUF will think there is much more than a page of spare space (the cast to u32 ensures an unsigned comparison) so we can expect to fall off the end of the second page. We never encountered thsi in testing because typically the only operations which use more than two pages are write-like operations, which have their own decoding logic. Something like a getattr after a write may cross a page boundary, but it would be very unusual for it to cross another boundary after that. Signed-off-by:
J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Suresh Siddha authored
This is a merge of two mainline commits, intended for stable@kernel.org submission for 2.6.27 kernel. commit f833bab8 and commit 918aae42 Changelog of both: Currently clockevents_notify() is called with interrupts enabled at some places and interrupts disabled at some other places. This results in a deadlock in this scenario. cpu A holds clockevents_lock in clockevents_notify() with irqs enabled cpu B waits for clockevents_lock in clockevents_notify() with irqs disabled cpu C doing set_mtrr() which will try to rendezvous of all the cpus. This will result in C and A come to the rendezvous point and waiting for B. B is stuck forever waiting for the spinlock and thus not reaching the rendezvous point. Fix the clockevents code so that clockevents_lock is taken with interrupts disabled and thus avoid the above deadlock. Also call lapic_timer_propagate_broadcast() on the destination cpu so that we avoid calling smp_call_function() in the clockevents notifier chain. This issue left us wondering if we need to change the MTRR rendezvous logic to use stop machine logic (instead of smp_call_function) or add a check in spinlock debug code to see if there are other spinlocks which gets taken under both interrupts enabled/disabled conditions. Signed-off-by:
Suresh Siddha <suresh.b.siddha@intel.com> Cc: "Brown Len" <len.brown@intel.com> Cc: stable@kernel.org LKML-Reference: <1250544899.2709.210.camel@sbs-t61.sc.intel.com> Signed-off-by:
Thomas Gleixner <tglx@linutronix.de> I got following warning on ia64 box: In function 'acpi_processor_power_verify': 642: warning: passing argument 2 of 'smp_call_function_single' from incompatible pointer type This smp_call_function_single() was introduced by a commit f833bab8: The problem is that the lapic_timer_propagate_broadcast() has 2 versions: One is real code that modified in the above commit, and the other is NOP code that used when !ARCH_APICTIMER_STOPS_ON_C3: static void lapic_timer_propagate_broadcast(struct acpi_processor *pr) { } So I got warning because of !ARCH_APICTIMER_STOPS_ON_C3. We really want to do nothing here on !ARCH_APICTIMER_STOPS_ON_C3, so modify lapic_timer_propagate_broadcast() of real version to use smp_call_function_single() in it. Signed-off-by:
Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Acked-by:
Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by:
Len Brown <len.brown@intel.com> Signed-off-by:
Thomas Renninger <trenn@suse.de> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-