Commits · v2.6.27.48 · BeagleBoard.org / Linux

Jul 05, 2010

Linux 2.6.27.48 · 6b5f4c2c
Greg Kroah-Hartman authored 14 years ago

v2.6.27.48

6b5f4c2c

sctp: fix append error cause to ERROR chunk correctly · 7694247a

Wei Yongjun authored 15 years ago


commit 2e3219b5 upstream.

commit 5fa782c2
  sctp: Fix skb_over_panic resulting from multiple invalid \
    parameter errors (CVE-2010-1173) (v4)

cause 'error cause' never be add the the ERROR chunk due to
some typo when check valid length in sctp_init_cause_fixed().

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Reviewed-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

7694247a

KEYS: find_keyring_by_name() can gain access to a freed keyring · 8e3c5b14

Toshiyuki Okajima authored 15 years ago


commit cea7daa3 upstream.

find_keyring_by_name() can gain access to a keyring that has had its reference
count reduced to zero, and is thus ready to be freed.  This then allows the
dead keyring to be brought back into use whilst it is being destroyed.

The following timeline illustrates the process:

|(cleaner)                           (user)
|
| free_user(user)                    sys_keyctl()
|  |                                  |
|  key_put(user->session_keyring)     keyctl_get_keyring_ID()
|  ||	//=> keyring->usage = 0        |
|  |schedule_work(&key_cleanup_task)   lookup_user_key()
|  ||                                   |
|  kmem_cache_free(,user)               |
|  .                                    |[KEY_SPEC_USER_KEYRING]
|  .                                    install_user_keyrings()
|  .                                    ||
| key_cleanup() [<= worker_thread()]    ||
|  |                                    ||
|  [spin_lock(&key_serial_lock)]        |[mutex_lock(&key_user_keyr..mutex)]
|  |                                    ||
|  atomic_read() == 0                   ||
|  |{ rb_ease(&key->serial_node,) }     ||
|  |                                    ||
|  [spin_unlock(&key_serial_lock)]      |find_keyring_by_name()
|  |                                    |||
|  keyring_destroy(keyring)             ||[read_lock(&keyring_name_lock)]
|  ||                                   |||
|  |[write_lock(&keyring_name_lock)]    ||atomic_inc(&keyring->usage)
|  |.                                   ||| *** GET freeing keyring ***
|  |.                                   ||[read_unlock(&keyring_name_lock)]
|  ||                                   ||
|  |list_del()                          |[mutex_unlock(&key_user_k..mutex)]
|  ||                                   |
|  |[write_unlock(&keyring_name_lock)]  ** INVALID keyring is returned **
|  |                                    .
|  kmem_cache_free(,keyring)            .
|                                       .
|                                       atomic_dec(&keyring->usage)
v                                         *** DESTROYED ***
TIME

If CONFIG_SLUB_DEBUG=y then we may see the following message generated:

	=============================================================================
	BUG key_jar: Poison overwritten
	-----------------------------------------------------------------------------

	INFO: 0xffff880197a7e200-0xffff880197a7e200. First byte 0x6a instead of 0x6b
	INFO: Allocated in key_alloc+0x10b/0x35f age=25 cpu=1 pid=5086
	INFO: Freed in key_cleanup+0xd0/0xd5 age=12 cpu=1 pid=10
	INFO: Slab 0xffffea000592cb90 objects=16 used=2 fp=0xffff880197a7e200 flags=0x200000000000c3
	INFO: Object 0xffff880197a7e200 @offset=512 fp=0xffff880197a7e300

	Bytes b4 0xffff880197a7e1f0:  5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ
	  Object 0xffff880197a7e200:  6a 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b jkkkkkkkkkkkkkkk

Alternatively, we may see a system panic happen, such as:

	BUG: unable to handle kernel NULL pointer dereference at 0000000000000001
	IP: [<ffffffff810e61a3>] kmem_cache_alloc+0x5b/0xe9
	PGD 6b2b4067 PUD 6a80d067 PMD 0
	Oops: 0000 [#1] SMP
	last sysfs file: /sys/kernel/kexec_crash_loaded
	CPU 1
	...
	Pid: 31245, comm: su Not tainted 2.6.34-rc5-nofixed-nodebug #2 D2089/PRIMERGY
	RIP: 0010:[<ffffffff810e61a3>]  [<ffffffff810e61a3>] kmem_cache_alloc+0x5b/0xe9
	RSP: 0018:ffff88006af3bd98  EFLAGS: 00010002
	RAX: 0000000000000000 RBX: 0000000000000001 RCX: ffff88007d19900b
	RDX: 0000000100000000 RSI: 00000000000080d0 RDI: ffffffff81828430
	RBP: ffffffff81828430 R08: ffff88000a293750 R09: 0000000000000000
	R10: 0000000000000001 R11: 0000000000100000 R12: 00000000000080d0
	R13: 00000000000080d0 R14: 0000000000000296 R15: ffffffff810f20ce
	FS:  00007f97116bc700(0000) GS:ffff88000a280000(0000) knlGS:0000000000000000
	CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
	CR2: 0000000000000001 CR3: 000000006a91c000 CR4: 00000000000006e0
	DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
	DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
	Process su (pid: 31245, threadinfo ffff88006af3a000, task ffff8800374414c0)
	Stack:
	 0000000512e0958e 0000000000008000 ffff880037f8d180 0000000000000001
	 0000000000000000 0000000000008001 ffff88007d199000 ffffffff810f20ce
	 0000000000008000 ffff88006af3be48 0000000000000024 ffffffff810face3
	Call Trace:
	 [<ffffffff810f20ce>] ? get_empty_filp+0x70/0x12f
	 [<ffffffff810face3>] ? do_filp_open+0x145/0x590
	 [<ffffffff810ce208>] ? tlb_finish_mmu+0x2a/0x33
	 [<ffffffff810ce43c>] ? unmap_region+0xd3/0xe2
	 [<ffffffff810e4393>] ? virt_to_head_page+0x9/0x2d
	 [<ffffffff81103916>] ? alloc_fd+0x69/0x10e
	 [<ffffffff810ef4ed>] ? do_sys_open+0x56/0xfc
	 [<ffffffff81008a02>] ? system_call_fastpath+0x16/0x1b
	Code: 0f 1f 44 00 00 49 89 c6 fa 66 0f 1f 44 00 00 65 4c 8b 04 25 60 e8 00 00 48 8b 45 00 49 01 c0 49 8b 18 48 85 db 74 0d 48 63 45 18 <48> 8b 04 03 49 89 00 eb 14 4c 89 f9 83 ca ff 44 89 e6 48 89 ef
	RIP  [<ffffffff810e61a3>] kmem_cache_alloc+0x5b/0xe9

This problem is that find_keyring_by_name does not confirm that the keyring is
valid before accepting it.

Skipping keyrings that have been reduced to a zero count seems the way to go.
To this end, use atomic_inc_not_zero() to increment the usage count and skip
the candidate keyring if that returns false.

The following script _may_ cause the bug to happen, but there's no guarantee
as the window of opportunity is small:

	#!/bin/sh
	LOOP=100000
	USER=dummy_user
	/bin/su -c "exit;" $USER || { /usr/sbin/adduser -m $USER; add=1; }
	for ((i=0; i<LOOP; i++))
	do
		/bin/su -c "echo '$i' > /dev/null" $USER
	done
	(( add == 1 )) && /usr/sbin/userdel -r $USER
	exit

Note that the nominated user must not be in use.

An alternative way of testing this may be:

	for ((i=0; i<100000; i++))
	do
		keyctl session foo /bin/true || break
	done >&/dev/null

as that uses a keyring named "foo" rather than relying on the user and
user-session named keyrings.

Reported-by: Toshiyuki Okajima <toshi.okajima@jp.fujitsu.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Toshiyuki Okajima <toshi.okajima@jp.fujitsu.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
Cc: Ben Hutchings <ben@decadent.org.uk>
Cc: Chuck Ebbert <cebbert@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

8e3c5b14

KEYS: Return more accurate error codes · e24ad125

Dan Carpenter authored 15 years ago


commit 4d09ec0f upstream.

We were using the wrong variable here so the error codes weren't being returned
properly.  The original code returns -ENOKEY.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

e24ad125

parisc: clear floating point exception flag on SIGFPE signal · 648efc40

Helge Deller authored 15 years ago


commit 550f0d92 upstream.

Clear the floating point exception flag before returning to
user space. This is needed, else the libc trampoline handler
may hit the same SIGFPE again while building up a trampoline
to a signal handler.

Fixes debian bug #559406.

Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

648efc40

tipc: Fix oops on send prior to entering networked mode (v3) · 670037a9

Neil Horman authored 15 years ago

commit d0021b25 upstream.

Fix TIPC to disallow sending to remote addresses prior to entering NET_MODE

user programs can oops the kernel by sending datagrams via AF_TIPC prior to
entering networked mode. The following backtrace has been observed:

ID: 13459 TASK: ffff810014640040 CPU: 0 COMMAND: "tipc-client"
[exception RIP: tipc_node_select_next_hop+90]
RIP: ffffffff8869d3c3 RSP: ffff81002d9a5ab8 RFLAGS: 00010202
RAX: 0000000000000001 RBX: 0000000000000001 RCX: 0000000000000001
RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000001001001
RBP: 0000000001001001 R8: 0074736575716552 R9: 0000000000000000
R10: ffff81003fbd0680 R11: 00000000000000c8 R12: 0000000000000008
R13: 0000000000000001 R14: 0000000000000001 R15: ffff810015c6ca00
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
RIP: 0000003cbd8d49a3 RSP: 00007fffc84e0be8 RFLAGS: 00010206
RAX: 000000000000002c RBX: ffffffff8005d116 RCX: 0000000000000000
RDX: 0000000000000008 RSI: 00007fffc84e0c00 RDI: 0000000000000003
RBP: 0000000000000000 R8: 00007fffc84e0c10 R9: 0000000000000010
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007fffc84e0d10 R14: 0000000000000000 R15: 00007fffc84e0c30
ORIG_RAX: 000000000000002c CS: 0033 SS: 002b

What happens is that, when the tipc module in inserted it enters a standalone
node mode in which communication to its own address is allowed <0.0.0> but not
to other addresses, since the appropriate data structures have not been
allocated yet (specifically the tipc_net pointer). There is nothing stopping a
client from trying to send such a message however, and if that happens, we
attempt to dereference tipc_net.zones while the pointer is still NULL, and
explode. The fix is pretty straightforward. Since these oopses all arise from
the dereference of global pointers prior to their assignment to allocated
values, and since these allocations are small (about 2k total), lets convert
these pointers to static arrays of the appropriate size. All the accesses to
these bits consider 0/NULL to be a non match when searching, so all the lookups
still work properly, and there is no longer a chance of a bad dererence
anywhere. As a bonus, this lets us eliminate the setup/teardown routines for
those pointers, and elimnates the need to preform any locking around them to
prevent access while their being allocated/freed.

I've updated the tipc_net structure to behave this way to fix the exact reported
problem, and also fixed up the tipc_bearers and media_list arrays to fix an
obvious simmilar problem that arises from issuing tipc-config commands to
manipulate bearers/links prior to entering networked mode

I've tested this for a few hours by running the sanity tests and stress test
with the tipcutils suite, and nothing has fallen over. There have been a few
lockdep warnings, but those were there before, and can be addressed later, as
they didn't actually result in any deadlock.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Allan Stephens <allan.stephens@windriver.com>
CC: David S. Miller <davem@davemloft.net>
CC: tipc-discussion@lists.sourceforge.net
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

670037a9

vfs: add NOFOLLOW flag to umount(2) · 23b5e014

Miklos Szeredi authored 15 years ago


commit db1f05bb upstream.

Add a new UMOUNT_NOFOLLOW flag to umount(2).  This is needed to prevent
symlink attacks in unprivileged unmounts (fuse, samba, ncpfs).

Additionally, return -EINVAL if an unknown flag is used (and specify
an explicitly unused flag: UMOUNT_UNUSED).  This makes it possible for
the caller to determine if a flag is supported or not.

CC: Eugene Teo <eugene@redhat.com>
CC: Michael Kerrisk <mtk.manpages@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

23b5e014

sctp: Fix skb_over_panic resulting from multiple invalid parameter errors (CVE-2010-1173) (v4) · b1b7bf1e

Neil Horman authored 15 years ago

commit 5fa782c2 upstream.

Ok, version 4

Change Notes:
1) Minor cleanups, from Vlads notes

Summary:

Hey-
	Recently, it was reported to me that the kernel could oops in the
following way:

<5> kernel BUG at net/core/skbuff.c:91!
<5> invalid operand: 0000 [#1]
<5> Modules linked in: sctp netconsole nls_utf8 autofs4 sunrpc iptable_filter
ip_tables cpufreq_powersave parport_pc lp parport vmblock(U) vsock(U) vmci(U)
vmxnet(U) vmmemctl(U) vmhgfs(U) acpiphp dm_mirror dm_mod button battery ac md5
ipv6 uhci_hcd ehci_hcd snd_ens1371 snd_rawmidi snd_seq_device snd_pcm_oss
snd_mixer_oss snd_pcm snd_timer snd_page_alloc snd_ac97_codec snd soundcore
pcnet32 mii floppy ext3 jbd ata_piix libata mptscsih mptsas mptspi mptscsi
mptbase sd_mod scsi_mod
<5> CPU:    0
<5> EIP:    0060:[<c02bff27>]    Not tainted VLI
<5> EFLAGS: 00010216   (2.6.9-89.0.25.EL)
<5> EIP is at skb_over_panic+0x1f/0x2d
<5> eax: 0000002c   ebx: c033f461   ecx: c0357d96   edx: c040fd44
<5> esi: c033f461   edi: df653280   ebp: 00000000   esp: c040fd40
<5> ds: 007b   es: 007b   ss: 0068
<5> Process swapper (pid: 0, threadinfo=c040f000 task=c0370be0)
<5> Stack: c0357d96 e0c29478 00000084 00000004 c033f461 df653280 d7883180
e0c2947d
<5>        00000000 00000080 df653490 00000004 de4f1ac0 de4f1ac0 00000004
df653490
<5>        00000001 e0c2877a 08000800 de4f1ac0 df653490 00000000 e0c29d2e
00000004
<5> Call Trace:
<5>  [<e0c29478>] sctp_addto_chunk+0xb0/0x128 [sctp]
<5>  [<e0c2947d>] sctp_addto_chunk+0xb5/0x128 [sctp]
<5>  [<e0c2877a>] sctp_init_cause+0x3f/0x47 [sctp]
<5>  [<e0c29d2e>] sctp_process_unk_param+0xac/0xb8 [sctp]
<5>  [<e0c29e90>] sctp_verify_init+0xcc/0x134 [sctp]
<5>  [<e0c20322>] sctp_sf_do_5_1B_init+0x83/0x28e [sctp]
<5>  [<e0c25333>] sctp_do_sm+0x41/0x77 [sctp]
<5>  [<c01555a4>] cache_grow+0x140/0x233
<5>  [<e0c26ba1>] sctp_endpoint_bh_rcv+0xc5/0x108 [sctp]
<5>  [<e0c2b863>] sctp_inq_push+0xe/0x10 [sctp]
<5>  [<e0c34600>] sctp_rcv+0x454/0x509 [sctp]
<5>  [<e084e017>] ipt_hook+0x17/0x1c [iptable_filter]
<5>  [<c02d005e>] nf_iterate+0x40/0x81
<5>  [<c02e0bb9>] ip_local_deliver_finish+0x0/0x151
<5>  [<c02e0c7f>] ip_local_deliver_finish+0xc6/0x151
<5>  [<c02d0362>] nf_hook_slow+0x83/0xb5
<5>  [<c02e0bb2>] ip_local_deliver+0x1a2/0x1a9
<5>  [<c02e0bb9>] ip_local_deliver_finish+0x0/0x151
<5>  [<c02e103e>] ip_rcv+0x334/0x3b4
<5>  [<c02c66fd>] netif_receive_skb+0x320/0x35b
<5>  [<e0a0928b>] init_stall_timer+0x67/0x6a [uhci_hcd]
<5>  [<c02c67a4>] process_backlog+0x6c/0xd9
<5>  [<c02c690f>] net_rx_action+0xfe/0x1f8
<5>  [<c012a7b1>] __do_softirq+0x35/0x79
<5>  [<c0107efb>] handle_IRQ_event+0x0/0x4f
<5>  [<c01094de>] do_softirq+0x46/0x4d

Its an skb_over_panic BUG halt that results from processing an init chunk in
which too many of its variable length parameters are in some way malformed.

The problem is in sctp_process_unk_param:
if (NULL == *errp)
	*errp = sctp_make_op_error_space(asoc, chunk,
					 ntohs(chunk->chunk_hdr->length));

	if (*errp) {
		sctp_init_cause(*errp, SCTP_ERROR_UNKNOWN_PARAM,
				 WORD_ROUND(ntohs(param.p->length)));
		sctp_addto_chunk(*errp,
			WORD_ROUND(ntohs(param.p->length)),
				  param.v);

When we allocate an error chunk, we assume that the worst case scenario requires
that we have chunk_hdr->length data allocated, which would be correct nominally,
given that we call sctp_addto_chunk for the violating parameter.  Unfortunately,
we also, in sctp_init_cause insert a sctp_errhdr_t structure into the error
chunk, so the worst case situation in which all parameters are in violation
requires chunk_hdr->length+(sizeof(sctp_errhdr_t)*param_count) bytes of data.

The result of this error is that a deliberately malformed packet sent to a
listening host can cause a remote DOS, described in CVE-2010-1173:
http://cve.mitre.org/cgi-bin/cvename.cgi?name=2010-1173



I've tested the below fix and confirmed that it fixes the issue.  We move to a
strategy whereby we allocate a fixed size error chunk and ignore errors we don't
have space to report.  Tested by me successfully

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

b1b7bf1e

ext4: Implement range_cyclic in ext4_da_writepages instead of write_cache_pages · 70635511

Aneesh Kumar K.V authored 15 years ago

commit 2acf2c26 upstream.

With delayed allocation we lock the page in write_cache_pages() and
try to build an in memory extent of contiguous blocks.  This is needed
so that we can get large contiguous blocks request.  If range_cyclic
mode is enabled, write_cache_pages() will loop back to the 0 index if
no I/O has been done yet, and try to start writing from the beginning
of the range.  That causes an attempt to take the page lock of lower
index page while holding the page lock of higher index page, which can
cause a dead lock with another writeback thread.

The solution is to implement the range_cyclic behavior in
ext4_da_writepages() instead.

http://bugzilla.kernel.org/show_bug.cgi?id=12579



Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Jayson R. King <dev@jaysonking.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

70635511

ext4: Fix file fragmentation during large file write. · f4a4420c

Aneesh Kumar K.V authored 15 years ago


commit 22208ded upstream.

The range_cyclic writeback mode uses the address_space writeback_index
as the start index for writeback.  With delayed allocation we were
updating writeback_index wrongly resulting in highly fragmented file.
This patch reduces the number of extents reduced from 4000 to 27 for a
3GB file.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
[dev@jaysonking.com: Some changed lines from the original version of this patch were dropped, since they were rolled up with another cherry-picked patch applied to 2.6.27.y earlier.]
[dev@jaysonking.com: Use of wbc->no_nrwrite_index_update was dropped, since write_cache_pages_da() implies it.]
Signed-off-by: Jayson R. King <dev@jaysonking.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

f4a4420c

ext4: Use our own write_cache_pages() · 14dd35b4

Theodore Ts'o authored 15 years ago


commit 8e48dcfb upstream.

Make a copy of write_cache_pages() for the benefit of
ext4_da_writepages().  This allows us to simplify the code some, and
will allow us to further customize the code in future patches.

There are some nasty hacks in write_cache_pages(), which Linus has
(correctly) characterized as vile.  I've just copied it into
write_cache_pages_da(), without trying to clean those bits up lest I
break something in the ext4's delalloc implementation, which is a bit
fragile right now.  This will allow Dave Chinner to clean up
write_cache_pages() in mm/page-writeback.c, without worrying about
breaking ext4.  Eventually write_cache_pages_da() will go away when I
rewrite ext4's delayed allocation and create a general
ext4_writepages() which is used for all of ext4's writeback.  Until
now this is the lowest risk way to clean up the core
write_cache_pages() function.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Dave Chinner <david@fromorbit.com>
[dev@jaysonking.com: Dropped the hunks which reverted the use of no_nrwrite_index_update, since those lines weren't ever created on 2.6.27.y]
[dev@jaysonking.com: Copied from 2.6.27.y's version of write_cache_pages(), plus the changes to it from patch "vfs: Add no_nrwrite_index_update writeback control flag"]
Signed-off-by: Jayson R. King <dev@jaysonking.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

14dd35b4

ext4: check s_log_groups_per_flex in online resize code · 805fe57f

Eric Sandeen authored 15 years ago

commit 42007efd upstream.

If groups_per_flex < 2, sbi->s_flex_groups[] doesn't get filled out,
and every other access to this first tests s_log_groups_per_flex;
same thing needs to happen in resize or we'll wander off into
a null pointer when doing an online resize of the file system.

Thanks to Christoph Biedl, who came up with the trivial testcase:

# truncate --size 128M fsfile
# mkfs.ext3 -F fsfile
# tune2fs -O extents,uninit_bg,dir_index,flex_bg,huge_file,dir_nlink,extra_isize fsfile
# e2fsck -yDf -C0 fsfile
# truncate --size 132M fsfile
# losetup /dev/loop0 fsfile
# mount /dev/loop0 mnt
# resize2fs -p /dev/loop0

	https://bugzilla.kernel.org/show_bug.cgi?id=13549



Reported-by: Alessandro Polverini <alex@nibbles.it>
Test-case-by: Christoph Biedl <bugzilla.kernel.bpeb@manchmal.in-ulm.de>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

805fe57f

gconfig: fix build failure on fedora 13 · f5cdcd6b

Richard Kennedy authored 15 years ago


commit cbab05f0 upstream.

Making gconfig fails on fedora 13 as the linker cannot resolve dlsym.

Adding libdl to the link command fixes this.

make shows this error :-
    /usr/bin/ld: scripts/kconfig/kconfig_load.o: undefined reference to symbol 'dlsym@@GLIBC_2.2.5'
    /usr/bin/ld: note: 'dlsym@@GLIBC_2.2.5' is defined in DSO /lib64/libdl.so.2 so try adding it to the linker command line
    /lib64/libdl.so.2: could not read symbols: Invalid operation

tested on x86_64 fedora 13.

Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk>
Reviewed-by: WANG Cong <xiyou.wangcong@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Michal Marek <mmarek@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

f5cdcd6b

ipmi: handle run_to_completion properly in deliver_recv_msg() · 310c0ab7

Jiri Kosina authored 15 years ago


commit a747c5ab upstream.

If run_to_completion flag is set, it means that we are running in a
single-threaded mode, and thus no locks are held.

This fixes a deadlock when IPMI notifier is being called during panic.

Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Acked-by: Corey Minyard <minyard@acm.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

310c0ab7

do_generic_file_read: clear page errors when issuing a fresh read of the page · 9b698b55

Jeff Moyer authored 15 years ago


commit 91803b49 upstream.

I/O errors can happen due to temporary failures, like multipath
errors or losing network contact with the iSCSI server. Because
of that, the VM will retry readpage on the page.

However, do_generic_file_read does not clear PG_error.  This
causes the system to be unable to actually use the data in the
page cache page, even if the subsequent readpage completes
successfully!

The function filemap_fault has had a ClearPageError before
readpage forever.  This patch simply adds the same to
do_generic_file_read.

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Rik van Riel <riel@redhat.com>
Acked-by: Larry Woodman <lwoodman@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

9b698b55

md: set mddev readonly flag on blkdev BLKROSET ioctl · 78924230

Dan Williams authored 15 years ago


commit e2218350 upstream.

When the user sets the block device to readwrite then the mddev should
follow suit.  Otherwise, the BUG_ON in md_write_start() will be set to
trigger.

The reverse direction, setting mddev->ro to match a set readonly
request, can be ignored because the blkdev level readonly flag precludes
the need to have mddev->ro set correctly.  Nevermind the fact that
setting mddev->ro to 1 may fail if the array is in use.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

78924230

md: Fix read balancing in RAID1 and RAID10 on drives > 2TB · f2a41e68

NeilBrown authored 15 years ago


commit af3a2cd6 upstream.

read_balance uses a "unsigned long" for a sector number which
will get truncated beyond 2TB.
This will cause read-balancing to be non-optimal, and can cause
data to be read from the 'wrong' branch during a resync.  This has a
very small chance of returning wrong data.

Reported-by: Jordan Russell <jr-list-2010@quo.to>
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

f2a41e68

md/raid1: fix counting of write targets. · faeb8ec4

NeilBrown authored 15 years ago


commit 964147d5 upstream.

There is a very small race window when writing to a
RAID1 such that if a device is marked faulty at exactly the wrong
time, the write-in-progress will not be sent to the device,
but the bitmap (if present) will be updated to say that
the write was sent.

Then if the device turned out to still be usable as was re-added
to the array, the bitmap-based-resync would skip resyncing that
block, possibly leading to corruption.  This would only be a problem
if no further writes were issued to that area of the device (i.e.
that bitmap chunk).

Suitable for any pending -stable kernel.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

faeb8ec4

powerpc/oprofile: fix potential buffer overrun in op_model_cell.c · 43bee391

Denis Kirjanov authored 14 years ago


commit 238c1a78 upstream.

Fix potential initial_lfsr buffer overrun.
Writing past the end of the buffer could happen when index == ENTRIES

Signed-off-by: Denis Kirjanov <dkirjanov@kernel.org>
Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

43bee391

powerpc/pseries: Make query_cpu_stopped callable outside hotplug cpu · 21dfd35a

Michael Neuling authored 15 years ago


commit f8b67691 upstream.

This moves query_cpu_stopped() out of the hotplug cpu code and into
smp.c so it can called in other places and renames it to
smp_query_cpu_stopped().

It also cleans up the return values by adding some #defines

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

21dfd35a

powerpc/pseries: Only call start-cpu when a CPU is stopped · 4be4e01c

Michael Neuling authored 15 years ago


commit aef40e87 upstream.

Currently we always call start-cpu irrespective of if the CPU is
stopped or not. Unfortunatley on POWER7, firmware seems to not like
start-cpu being called when a cpu already been started.  This was not
the case on POWER6 and earlier.

This patch checks to see if the CPU is stopped or not via an
query-cpu-stopped-state call, and only calls start-cpu on CPUs which
are stopped.

This fixes a bug with kexec on POWER7 on PHYP where only the primary
thread would make it to the second kernel.

Reported-by: Ankita Garg <ankita@linux.vnet.ibm.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

4be4e01c

powerpc: Fix handling of strncmp with zero len · ae7aff99

Jeff Mahoney authored 15 years ago


commit 637a9902 upstream.

Commit 0119536c, which added the assembly version of strncmp to
powerpc, mentions that it adds two instructions to the version from
boot/string.S to allow it to handle len=0. Unfortunately, it doesn't
always return 0 when that is the case. The length is passed in r5, but
the return value is passed back in r3. In certain cases, this will
happen to work. Otherwise it will pass back the address of the first
string as the return value.

This patch lifts the len <= 0 handling code from memcpy to handle that
case.

Reported by: Christian_Sellars@symantec.com
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

ae7aff99

ARCNET: Limit com20020 PCI ID matches for SOHARD cards · 88102a58

Andreas Bombe authored 15 years ago

commit e7971c80 upstream.

The SH SOHARD ARCNET cards are implemented using generic PLX Technology
PCI<->IOBus bridges. Subvendor and subdevice IDs were not specified,
causing the driver to attach to any such bridge and likely crash the
system by attempting to initialize an unrelated device.

Fix by specifying subvendor and subdevice according to the values found
in the PCI-ID Repository at http://pci-ids.ucw.cz/

 .

Signed-off-by: Andreas Bombe <aeb@debian.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

88102a58

NFSD: don't report compiled-out versions as present · b9cb946f

Pavel Emelyanov authored 15 years ago


commit 15ddb4ae upstream.

The /proc/fs/nfsd/versions file calls nfsd_vers() to check whether
the particular nfsd version is present/available. The problem is
that once I turn off e.g. NFSD-V4 this call returns -1 which is
true from the callers POV which is wrong.

The proposal is to report false in that case.

The bug has existed since 6658d3a7 "[PATCH] knfsd: remove
nfsd_versbits as intermediate storage for desired versions".

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

b9cb946f

libata: disable ATAPI AN by default · 90bfd720

Tejun Heo authored 15 years ago


commit e7ecd435 upstream.

There are ATAPI devices which raise AN when hit by commands issued by
open().  This leads to infinite loop of AN -> MEDIA_CHANGE uevent ->
udev open() to check media -> AN.

Both ACS and SerialATA standards don't define in which case ATAPI
devices are supposed to raise or not raise AN.  They both list media
insertion event as a possible use case for ATAPI ANs but there is no
clear description of what constitutes such events.  As such, it seems
a bit too naive to export ANs directly to userland as MEDIA_CHANGE
events without further verification (which should behave similarly to
windows as it apparently is the only thing that some hardware vendors
are testing against).

This patch adds libata.atapi_an module parameter and disables ATAPI AN
by default for now.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: Nick Bowler <nbowler@elliptictech.com>
Cc: David Zeuthen <david@fubar.dk>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

90bfd720

May 26, 2010

Linux 2.6.27.47 · a9a5c1f8
Greg Kroah-Hartman authored 15 years ago

v2.6.27.47

a9a5c1f8

nfsd: fix vm overcommit crash fix #2 · c0ac2723

Junjiro R. Okajima authored 16 years ago


commit 1b79cd04 upstream.

The previous patch from Alan Cox ("nfsd: fix vm overcommit crash",
commit 731572d3) fixed the problem where
knfsd crashes on exported shmemfs objects and strict overcommit is set.

But the patch forgot supporting the case when CONFIG_SECURITY is
disabled.

This patch copies a part of his fix which is mainly for detecting a bug
earlier.

Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Junjiro R. Okajima <hooanon05@yahoo.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

c0ac2723

nfsd: fix vm overcommit crash · 3f955fe6

Alan Cox authored 16 years ago


commit 731572d3 upstream.

Junjiro R.  Okajima reported a problem where knfsd crashes if you are
using it to export shmemfs objects and run strict overcommit.  In this
situation the current->mm based modifier to the overcommit goes through a
NULL pointer.

We could simply check for NULL and skip the modifier but we've caught
other real bugs in the past from mm being NULL here - cases where we did
need a valid mm set up (eg the exec bug about a year ago).

To preserve the checks and get the logic we want shuffle the checking
around and add a new helper to the vm_ security wrappers

Also fix a current->mm reference in nommu that should use the passed mm

[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: fix build]
Reported-by: Junjiro R. Okajima <hooanon05@yahoo.co.jp>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

3f955fe6

i2c-tiny-usb: Fix on big-endian systems · 0ce0dd4e

Jean Delvare authored 15 years ago

commit 1c010ff8 upstream.

The functionality bit vector is always returned as a little-endian
32-bit number by the device, so it must be byte-swapped to the host
endianness.

On the other hand, the delay value is handled by the USB stack, so no
byte swapping is needed on our side.

This fixes bug #15105:
http://bugzilla.kernel.org/show_bug.cgi?id=15105



Reported-by: Jens Richter <jens@richter-stutensee.de>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Tested-by: Jens Richter <jens@richter-stutensee.de>
Cc: Till Harbaum <till@harbaum.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

0ce0dd4e

i2c-i801: Don't use the block buffer for I2C block writes · ae3936ed

Jean Delvare authored 15 years ago


commit c074c39d upstream.

Experience has shown that the block buffer can only be used for SMBus
(not I2C) block transactions, even though the datasheet doesn't
mention this limitation.

Reported-by: Felix Rubinstein <felixru@gmail.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Oleg Ryjkov <oryjkov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

ae3936ed

hwmon: (w83781d) Request I/O ports individually for probing · 6dfbed03

Jean Delvare authored 15 years ago


commit b0bcdd3c upstream.

Different motherboards have different PNP declarations for
W83781D/W83782D chips. Some declare the whole range of I/O ports (8
ports), some declare only the useful ports (2 ports at offset 5) and
some declare fancy ranges, for example 4 ports at offset 4. To
properly handle all cases, request all ports individually for probing.
After we have determined that we really have a W83781D or W83782D
chip, the useful port range will be requested again, as a single
block.

I did not see a board which needs this yet, but I know of one for lm78
driver and I'd like to keep the logic of these two drivers in sync.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

6dfbed03

svc: Clean up deferred requests on transport destruction · 60f30297

Tom Tucker authored 16 years ago


commit 22945e4a upstream.

A race between svc_revisit and svc_delete_xprt can result in
deferred requests holding references on a transport that can never be
recovered because dead transports are not enqueued for subsequent
processing.

Check for XPT_DEAD in revisit to clean up completing deferrals on a dead
transport and sweep a transport's deferred queue to do the same for queued
but unprocessed deferrals.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Cc: roma1390 <roma1390@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

60f30297

libata: retry FS IOs even if it has failed with AC_ERR_INVALID · ce58c0ab

Tejun Heo authored 15 years ago

commit 534ead70 upstream.

libata currently doesn't retry if a command fails with AC_ERR_INVALID
assuming that retrying won't get it any further even if retried.
However, a failure may be classified as invalid through hardware
glitch (incorrect reading of the error register or firmware bug) and
there isn't whole lot to gain by not retrying as actually invalid
commands will be failed immediately.  Also, commands serving FS IOs
are extremely unlikely to be invalid.  Retry FS IOs even if it's
marked invalid.

Transient and incorrect invalid failure was seen while debugging
firmware related issue on Samsung n130 on bko#14314.

  http://bugzilla.kernel.org/show_bug.cgi?id=14314



Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Johannes Stezenbach <js@sig21.net>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

ce58c0ab

libata: ensure NCQ error result taskfile is fully initialized before returning... · bf3d0e3c

Jeff Garzik authored 15 years ago

libata: ensure NCQ error result taskfile is fully initialized before returning it via qc->result_tf.

commit a09bf4cd upstream.

Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

bf3d0e3c

i2c: Fix probing of FSC hardware monitoring chips · ba1531fc

Jean Delvare authored 15 years ago

commit b1d4b390 upstream.

Some FSC hardware monitoring chips (Syleus at least) doesn't like
quick writes we typically use to probe for I2C chips. Use a regular
byte read instead for the address they live at (0x73). These are the
only known chips living at this address on PC systems.

For clarity, this fix should not be needed for kernels 2.6.30 and
later, as we started instantiating the hwmon devices explicitly based
on DMI data. Still, this fix is valuable in the following two cases:
* Support for recent FSC chips on older kernels. The DMI-based device
  instantiation is more difficult to backport than the device support
  itself.
* Case where the DMI-based device instantiation fails, whatever the
  reason. We fall back to probing in that case, so it should work.

This fixes kernel bug #15634:
https://bugzilla.kernel.org/show_bug.cgi?id=15634



Signed-off-by: Jean Delvare <khali@linux-fr.org>
Acked-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

ba1531fc

NFS: rsize and wsize settings ignored on v4 mounts · e3f8d359

Chuck Lever authored 15 years ago


commit 356e76b8 upstream.

NFSv4 mounts ignore the rsize and wsize mount options, and always use
the default transfer size for both.  This seems to be because all
NFSv4 mounts are now cloned, and the cloning logic doesn't copy the
rsize and wsize settings from the parent nfs_server.

I tested Fedora's 2.6.32.11-99 and it seems to have this problem as
well, so I'm guessing that .33, .32, and perhaps older kernels have
this issue as well.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

e3f8d359

nfs d_revalidate() is too trigger-happy with d_drop() · 5ea176dc

Al Viro authored 15 years ago


commit d9e80b7d upstream.

If dentry found stale happens to be a root of disconnected tree, we
can't d_drop() it; its d_hash is actually part of s_anon and d_drop()
would simply hide it from shrink_dcache_for_umount(), leading to
all sorts of fun, including busy inodes on umount and oopsen after
that.

Bug had been there since at least 2006 (commit c636eb already has it),
so it's definitely -stable fodder.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

5ea176dc

USB: fix testing the wrong variable in fs_create_by_name() · d130b918

Dan Carpenter authored 15 years ago


commit fa7fe7af upstream.

There is a typo here.  We should be testing "*dentry" which was just
assigned instead of "dentry".  This could result in dereferencing an
ERR_PTR inside either usbfs_mkdir() or usbfs_create().

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

d130b918

nfsd4: bug in read_buf · dc116e1a

Neil Brown authored 15 years ago


commit 2bc3c117 upstream.

When read_buf is called to move over to the next page in the pagelist
of an NFSv4 request, it sets argp->end to essentially a random
number, certainly not an address within the page which argp->p now
points to.  So subsequent calls to READ_BUF will think there is much
more than a page of spare space (the cast to u32 ensures an unsigned
comparison) so we can expect to fall off the end of the second
page.

We never encountered thsi in testing because typically the only
operations which use more than two pages are write-like operations,
which have their own decoding logic.  Something like a getattr after a
write may cross a page boundary, but it would be very unusual for it to
cross another boundary after that.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

dc116e1a

clockevent: Prevent dead lock on clockevents_lock · 843447c6

Suresh Siddha authored 15 years ago


This is a merge of two mainline commits, intended
for stable@kernel.org submission for 2.6.27 kernel.

commit f833bab8
and

commit 918aae42
Changelog of both:

    Currently clockevents_notify() is called with interrupts enabled at
    some places and interrupts disabled at some other places.

    This results in a deadlock in this scenario.

    cpu A holds clockevents_lock in clockevents_notify() with irqs enabled
    cpu B waits for clockevents_lock in clockevents_notify() with irqs disabled
    cpu C doing set_mtrr() which will try to rendezvous of all the cpus.

    This will result in C and A come to the rendezvous point and waiting
    for B. B is stuck forever waiting for the spinlock and thus not
    reaching the rendezvous point.

    Fix the clockevents code so that clockevents_lock is taken with
    interrupts disabled and thus avoid the above deadlock.

    Also call lapic_timer_propagate_broadcast() on the destination cpu so
    that we avoid calling smp_call_function() in the clockevents notifier
    chain.

    This issue left us wondering if we need to change the MTRR rendezvous
    logic to use stop machine logic (instead of smp_call_function) or add
    a check in spinlock debug code to see if there are other spinlocks
    which gets taken under both interrupts enabled/disabled conditions.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
    Cc: "Brown Len" <len.brown@intel.com>
    Cc: stable@kernel.org
    LKML-Reference: <1250544899.2709.210.camel@sbs-t61.sc.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

    I got following warning on ia64 box:
      In function 'acpi_processor_power_verify':
      642: warning: passing argument 2 of 'smp_call_function_single' from
      incompatible pointer type

    This smp_call_function_single() was introduced by a commit
    f833bab8:

    The problem is that the lapic_timer_propagate_broadcast() has 2 versions:
    One is real code that modified in the above commit, and the other is NOP
    code that used when !ARCH_APICTIMER_STOPS_ON_C3:

      static void lapic_timer_propagate_broadcast(struct acpi_processor *pr) { }

    So I got warning because of !ARCH_APICTIMER_STOPS_ON_C3.

    We really want to do nothing here on !ARCH_APICTIMER_STOPS_ON_C3, so
    modify lapic_timer_propagate_broadcast() of real version to use
    smp_call_function_single() in it.

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>

Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

843447c6

Admin message