- Feb 23, 2021
-
-
Greg Kroah-Hartman authored
Tested-by:
Florian Fainelli <f.fainelli@gmail.com> Tested-by:
Guenter Roeck <linux@roeck-us.net> Tested-by:
Jason Self <jason@bluehome.net> Tested-by:
Linux Kernel Functional Testing <lkft@linaro.org> Link: https://lore.kernel.org/r/20210222121022.546148341@linuxfoundation.org Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Lai Jiangshan authored
commit 88bf56d0 upstream In kvm_mmu_notifier_invalidate_range_start(), tlbs_dirty is used as: need_tlb_flush |= kvm->tlbs_dirty; with need_tlb_flush's type being int and tlbs_dirty's type being long. It means that tlbs_dirty is always used as int and the higher 32 bits is useless. We need to check tlbs_dirty in a correct way and this change checks it directly without propagating it to need_tlb_flush. Note: it's _extremely_ unlikely this neglecting of higher 32 bits can cause problems in practice. It would require encountering tlbs_dirty on a 4 billion count boundary, and KVM would need to be using shadow paging or be running a nested guest. Cc: stable@vger.kernel.org Fixes: a4ee1ca4 ("KVM: MMU: delay flush all tlbs on sync_page path") Signed-off-by:
Lai Jiangshan <laijs@linux.alibaba.com> Message-Id: <20201217154118.16497-1-jiangshanlai@gmail.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com> [sudip: adjust context] Signed-off-by:
Sudip Mukherjee <sudipm.mukherjee@gmail.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Arun Easi authored
commit 8de309e7 upstream Crash stack: [576544.715489] Unable to handle kernel paging request for data at address 0xd00000000f970000 [576544.715497] Faulting instruction address: 0xd00000000f880f64 [576544.715503] Oops: Kernel access of bad area, sig: 11 [#1] [576544.715506] SMP NR_CPUS=2048 NUMA pSeries : [576544.715703] NIP [d00000000f880f64] .qla27xx_fwdt_template_valid+0x94/0x100 [qla2xxx] [576544.715722] LR [d00000000f7952dc] .qla24xx_load_risc_flash+0x2fc/0x590 [qla2xxx] [576544.715726] Call Trace: [576544.715731] [c0000004d0ffb000] [c0000006fe02c350] 0xc0000006fe02c350 (unreliable) [576544.715750] [c0000004d0ffb080] [d00000000f7952dc] .qla24xx_load_risc_flash+0x2fc/0x590 [qla2xxx] [576544.715770] [c0000004d0ffb170] [d00000000f7aa034] .qla81xx_load_risc+0x84/0x1a0 [qla2xxx] [576544.715789] [c0000004d0ffb210] [d00000000f79f7c8] .qla2x00_setup_chip+0xc8/0x910 [qla2xxx] [576544.715808] [c0000004d0ffb300] [d00000000f7a631c] .qla2x00_initialize_adapter+0x4dc/0xb00 [qla2xxx] [576544.715826] [c0000004d0ffb3e0] [d00000000f78ce28] .qla2x00_probe_one+0xf08/0x2200 [qla2xxx] Link: https://lore.kernel.org/r/20201202132312.19966-8-njavali@marvell.com Fixes: f73cb695 ("[SCSI] qla2xxx: Add support for ISP2071.") Cc: stable@vger.kernel.org Reviewed-by:
Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by:
Arun Easi <aeasi@marvell.com> Signed-off-by:
Nilesh Javali <njavali@marvell.com> Signed-off-by:
Martin K. Petersen <martin.petersen@oracle.com> [sudip: adjust context] Signed-off-by:
Sudip Mukherjee <sudipm.mukherjee@gmail.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Jan Beulich authored
commit 871997bc upstream. The function uses a goto-based loop, which may lead to an earlier error getting discarded by a later iteration. Exit this ad-hoc loop when an error was encountered. The out-of-memory error path additionally fails to fill a structure field looked at by xen_blkbk_unmap_prepare() before inspecting the handle which does get properly set (to BLKBACK_INVALID_HANDLE). Since the earlier exiting from the ad-hoc loop requires the same field filling (invalidation) as that on the out-of-memory path, fold both paths. While doing so, drop the pr_alert(), as extra log messages aren't going to help the situation (the kernel will log oom conditions already anyway). This is XSA-365. Signed-off-by:
Jan Beulich <jbeulich@suse.com> Reviewed-by:
Juergen Gross <jgross@suse.com> Reviewed-by:
Julien Grall <julien@xen.org> Signed-off-by:
Juergen Gross <jgross@suse.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Jan Beulich authored
commit 7c77474b upstream. In particular -ENOMEM may come back here, from set_foreign_p2m_mapping(). Don't make problems worse, the more that handling elsewhere (together with map's status fields now indicating whether a mapping wasn't even attempted, and hence has to be considered failed) doesn't require this odd way of dealing with errors. This is part of XSA-362. Signed-off-by:
Jan Beulich <jbeulich@suse.com> Cc: stable@vger.kernel.org Reviewed-by:
Juergen Gross <jgross@suse.com> Signed-off-by:
Juergen Gross <jgross@suse.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Jan Beulich authored
commit 3194a174 upstream. In particular -ENOMEM may come back here, from set_foreign_p2m_mapping(). Don't make problems worse, the more that handling elsewhere (together with map's status fields now indicating whether a mapping wasn't even attempted, and hence has to be considered failed) doesn't require this odd way of dealing with errors. This is part of XSA-362. Signed-off-by:
Jan Beulich <jbeulich@suse.com> Cc: stable@vger.kernel.org Reviewed-by:
Juergen Gross <jgross@suse.com> Signed-off-by:
Juergen Gross <jgross@suse.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Jan Beulich authored
commit 5a264285 upstream. In particular -ENOMEM may come back here, from set_foreign_p2m_mapping(). Don't make problems worse, the more that handling elsewhere (together with map's status fields now indicating whether a mapping wasn't even attempted, and hence has to be considered failed) doesn't require this odd way of dealing with errors. This is part of XSA-362. Signed-off-by:
Jan Beulich <jbeulich@suse.com> Cc: stable@vger.kernel.org Reviewed-by:
Juergen Gross <jgross@suse.com> Signed-off-by:
Juergen Gross <jgross@suse.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Stefano Stabellini authored
commit 36bf1dfb upstream. set_phys_to_machine can fail due to lack of memory, see the kzalloc call in arch/arm/xen/p2m.c:__set_phys_to_machine_multi. Don't ignore the potential return error in set_foreign_p2m_mapping, returning it to the caller instead. This is part of XSA-361. Signed-off-by:
Stefano Stabellini <stefano.stabellini@xilinx.com> Cc: stable@vger.kernel.org Reviewed-by:
Julien Grall <jgrall@amazon.com> Signed-off-by:
Juergen Gross <jgross@suse.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Jan Beulich authored
commit ebee0eab upstream. Failure of the kernel part of the mapping operation should also be indicated as an error to the caller, or else it may assume the respective kernel VA is okay to access. Furthermore gnttab_map_refs() failing still requires recording successfully mapped handles, so they can be unmapped subsequently. This in turn requires there to be a way to tell full hypercall failure from partial success - preset map_op status fields such that they won't "happen" to look as if the operation succeeded. Also again use GNTST_okay instead of implying its value (zero). This is part of XSA-361. Signed-off-by:
Jan Beulich <jbeulich@suse.com> Cc: stable@vger.kernel.org Reviewed-by:
Juergen Gross <jgross@suse.com> Signed-off-by:
Juergen Gross <jgross@suse.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Jan Beulich authored
commit dbe52836 upstream. We may not skip setting the field in the unmap structure when GNTMAP_device_map is in use - such an unmap would fail to release the respective resources (a page ref in the hypervisor). Otoh the field doesn't need setting at all when GNTMAP_device_map is not in use. To record the value for unmapping, we also better don't use our local p2m: In particular after a subsequent change it may not have got updated for all the batch elements. Instead it can simply be taken from the respective map's results. We can additionally avoid playing this game altogether for the kernel part of the mappings in (x86) PV mode. This is part of XSA-361. Signed-off-by:
Jan Beulich <jbeulich@suse.com> Cc: stable@vger.kernel.org Reviewed-by:
Stefano Stabellini <sstabellini@kernel.org> Signed-off-by:
Juergen Gross <jgross@suse.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Jan Beulich authored
commit b512e1b0 upstream. We should not set up further state if either mapping failed; paying attention to just the user mapping's status isn't enough. Also use GNTST_okay instead of implying its value (zero). This is part of XSA-361. Signed-off-by:
Jan Beulich <jbeulich@suse.com> Cc: stable@vger.kernel.org Reviewed-by:
Juergen Gross <jgross@suse.com> Signed-off-by:
Juergen Gross <jgross@suse.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Jan Beulich authored
commit a35f2ef3 upstream. Its sibling (set_foreign_p2m_mapping()) as well as the sibling of its only caller (gnttab_map_refs()) don't clean up after themselves in case of error. Higher level callers are expected to do so. However, in order for that to really clean up any partially set up state, the operation should not terminate upon encountering an entry in unexpected state. It is particularly relevant to notice here that set_foreign_p2m_mapping() would skip setting up a p2m entry if its grant mapping failed, but it would continue to set up further p2m entries as long as their mappings succeeded. Arguably down the road set_foreign_p2m_mapping() may want its page state related WARN_ON() also converted to an error return. This is part of XSA-361. Signed-off-by:
Jan Beulich <jbeulich@suse.com> Cc: stable@vger.kernel.org Reviewed-by:
Juergen Gross <jgross@suse.com> Signed-off-by:
Juergen Gross <jgross@suse.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Vasily Gorbik authored
commit 07d04081 upstream. Currently if CONFIG_FTRACE_MCOUNT_RECORD is enabled -mrecord-mcount compiler flag support is tested for every Makefile. Top 4 cc-option usages: 511 -mrecord-mcount 11 -fno-stack-protector 9 -Wno-override-init 2 -fsched-pressure To address that move cc-option from scripts/Makefile.build to top Makefile and export CC_USING_RECORD_MCOUNT to be used in original place. While doing that also add -mrecord-mcount to CC_FLAGS_FTRACE (if gcc actually supports it). Link: http://lkml.kernel.org/r/patch-2.thread-aa7b8d.git-de935bace15a.your-ad-here.call-01533557518-ext-9465@work.hours Acked-by:
Andi Kleen <ak@linux.intel.com> Signed-off-by:
Vasily Gorbik <gor@linux.ibm.com> Signed-off-by:
Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Greg Thelen authored
commit ed7d40bc upstream. Non gcc-5 builds with CONFIG_STACK_VALIDATION=y and SKIP_STACK_VALIDATION=1 fail. Example output: /bin/sh: init/.tmp_main.o: Permission denied commit 96f60dfa ("trace: Use -mcount-record for dynamic ftrace"), added a mismatched endif. This causes cmd_objtool to get mistakenly set. Relocate endif to balance the newly added -record-mcount check. Link: http://lkml.kernel.org/r/20180608214746.136554-1-gthelen@google.com Fixes: 96f60dfa ("trace: Use -mcount-record for dynamic ftrace") Acked-by:
Andi Kleen <ak@linux.intel.com> Tested-by:
David Rientjes <rientjes@google.com> Signed-off-by:
Greg Thelen <gthelen@google.com> Signed-off-by:
Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Andi Kleen authored
commit 96f60dfa upstream. gcc 5 supports a new -mcount-record option to generate ftrace tables directly. This avoids the need to run record_mcount manually. Use this option when available. So far doesn't use -mcount-nop, which also exists now. This is needed to make ftrace work with LTO because the normal record-mcount script doesn't run over the link time output. It should also improve build times slightly in the general case. Link: http://lkml.kernel.org/r/20171127213423.27218-12-andi@firstfloor.org Signed-off-by:
Andi Kleen <ak@linux.intel.com> Signed-off-by:
Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Borislav Petkov authored
commit 256b92af upstream. Commit 20bf2b37 ("x86/build: Disable CET instrumentation in the kernel") disabled CET instrumentation which gets added by default by the Ubuntu gcc9 and 10 by default, but did that only for 64-bit builds. It would still fail when building a 32-bit target. So disable CET for all x86 builds. Fixes: 20bf2b37 ("x86/build: Disable CET instrumentation in the kernel") Reported-by:
AC <achirvasub@gmail.com> Signed-off-by:
Borislav Petkov <bp@suse.de> Acked-by:
Josh Poimboeuf <jpoimboe@redhat.com> Tested-by:
AC <achirvasub@gmail.com> Link: https://lkml.kernel.org/r/YCCIgMHkzh/xT4ex@arch-chirva.localdomain Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Stefano Garzarella authored
commit 1c5fae9c upstream. In vsock_shutdown() we touched some socket fields without holding the socket lock, such as 'state' and 'sk_flags'. Also, after the introduction of multi-transport, we are accessing 'vsk->transport' in vsock_send_shutdown() without holding the lock and this call can be made while the connection is in progress, so the transport can change in the meantime. To avoid issues, we hold the socket lock when we enter in vsock_shutdown() and release it when we leave. Among the transports that implement the 'shutdown' callback, only hyperv_transport acquired the lock. Since the caller now holds it, we no longer take it. Fixes: d021c344 ("VSOCK: Introduce VM Sockets") Signed-off-by:
Stefano Garzarella <sgarzare@redhat.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Stefano Garzarella authored
commit ce7536bc upstream. If the socket is closed or is being released, some resources used by virtio_transport_space_update() such as 'vsk->trans' may be released. To avoid a use after free bug we should only update the available credit when we are sure the socket is still open and we have the lock held. Fixes: 06a8fc78 ("VSOCK: Introduce virtio_vsock_common.ko") Signed-off-by:
Stefano Garzarella <sgarzare@redhat.com> Acked-by:
Michael S. Tsirkin <mst@redhat.com> Link: https://lore.kernel.org/r/20210208144454.84438-1-sgarzare@redhat.com Signed-off-by:
Jakub Kicinski <kuba@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Edwin Peer authored
commit 3aa6bce9 upstream. Prevent netif_tx_disable() running concurrently with dev_watchdog() by taking the device global xmit lock. Otherwise, the recommended: netif_carrier_off(dev); netif_tx_disable(dev); driver shutdown sequence can happen after the watchdog has already checked carrier, resulting in possible false alarms. This is because netif_tx_lock() only sets the frozen bit without maintaining the locks on the individual queues. Fixes: c3f26a26 ("netdev: Fix lockdep warnings in multiqueue configurations.") Signed-off-by:
Edwin Peer <edwin.peer@broadcom.com> Reviewed-by:
Jakub Kicinski <kuba@kernel.org> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Norbert Slusarek authored
commit 3d0bc44d upstream. A possible locking issue in vsock_connect_timeout() was recognized by Eric Dumazet which might cause a null pointer dereference in vsock_transport_cancel_pkt(). This patch assures that vsock_transport_cancel_pkt() will be called within the lock, so a race condition won't occur which could result in vsk->transport to be set to NULL. Fixes: 380feae0 ("vsock: cancel packets when failing to connect") Reported-by:
Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by:
Norbert Slusarek <nslusarek@gmx.net> Reviewed-by:
Stefano Garzarella <sgarzare@redhat.com> Link: https://lore.kernel.org/r/trinity-f8e0937a-cf0e-4d80-a76e-d9a958ba3ef1-1612535522360@3c-app-gmx-bap12 Signed-off-by:
Jakub Kicinski <kuba@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Serge Semin authored
commit fca3f138 upstream Originally the procedure of the ULPI transaction finish detection has been developed as a simple busy-loop with just decrementing counter and no delays. It's wrong since on different systems the loop will take a different time to complete. So if the system bus and CPU are fast enough to overtake the ULPI bus and the companion PHY reaction, then we'll get to take a false timeout error. Fix this by converting the busy-loop procedure to take the standard bus speed, address value and the registers access mode into account for the busy-loop delay calculation. Here is the way the fix works. It's known that the ULPI bus is clocked with 60MHz signal. In accordance with [1] the ULPI bus protocol is created so to spend 5 and 6 clock periods for immediate register write and read operations respectively, and 6 and 7 clock periods - for the extended register writes and reads. Based on that we can easily pre-calculate the time which will be needed for the controller to perform a requested IO operation. Note we'll still preserve the attempts counter in case if the DWC USB3 controller has got some internals delays. [1] UTMI+ Low Pin Interface (ULPI) Specification, Revision 1.1, October 20, 2004, pp. 30 - 36. Fixes: 88bc9d19 ("usb: dwc3: add ULPI interface support") Acked-by:
Heikki Krogerus <heikki.krogerus@linux.intel.com> Signed-off-by:
Serge Semin <Sergey.Semin@baikalelectronics.ru> Link: https://lore.kernel.org/r/20201210085008.13264-3-Sergey.Semin@baikalelectronics.ru Cc: stable <stable@vger.kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> [sudip: adjust context] Signed-off-by:
Sudip Mukherjee <sudipm.mukherjee@gmail.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Felipe Balbi authored
commit 2a499b45 upstream no functional changes. Signed-off-by:
Felipe Balbi <balbi@kernel.org> Signed-off-by:
Sudip Mukherjee <sudipm.mukherjee@gmail.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Randy Dunlap authored
[ Upstream commit ade9679c ] Fix a build error for undefined 'TI_PRE_COUNT' by adding it to asm-offsets.c. h8300-linux-ld: arch/h8300/kernel/entry.o: in function `resume_kernel': (.text+0x29a): undefined reference to `TI_PRE_COUNT' Link: https://lkml.kernel.org/r/20210212021650.22740-1-rdunlap@infradead.org Fixes: df2078b8 ("h8300: Low level entry") Signed-off-by:
Randy Dunlap <rdunlap@infradead.org> Reported-by:
kernel test robot <lkp@intel.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Florian Westphal authored
[ Upstream commit 07998281 ] The origin skip check needs to re-test the zone. Else, we might skip a colliding tuple in the reply direction. This only occurs when using 'directional zones' where origin tuples reside in different zones but the reply tuples share the same zone. This causes the new conntrack entry to be dropped at confirmation time because NAT clash resolution was elided. Fixes: 4e35c1cb ("netfilter: nf_nat: skip nat clash resolution for same-origin entries") Signed-off-by:
Florian Westphal <fw@strlen.de> Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Juergen Gross authored
[ Upstream commit ec7d8e7d ] Since commit 23025393 ("xen/netback: use lateeoi irq binding") xenvif_rx_ring_slots_available() is no longer called only from the rx queue kernel thread, so it needs to access the rx queue with the associated queue held. Reported-by:
Igor Druzhinin <igor.druzhinin@citrix.com> Fixes: 23025393 ("xen/netback: use lateeoi irq binding") Signed-off-by:
Juergen Gross <jgross@suse.com> Acked-by:
Wei Liu <wl@xen.org> Link: https://lore.kernel.org/r/20210202070938.7863-1-jgross@suse.com Signed-off-by:
Jakub Kicinski <kuba@kernel.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Jozsef Kadlecsik authored
[ Upstream commit b1bdde33 ] When both --reap and --update flag are specified, there's a code path at which the entry to be updated is reaped beforehand, which then leads to kernel crash. Reap only entries which won't be updated. Fixes kernel bugzilla #207773. Link: https://bugzilla.kernel.org/show_bug.cgi?id=207773 Reported-by:
Reindl Harald <h.reindl@thelounge.net> Fixes: 0079c5ae ("netfilter: xt_recent: add an entry reaper") Signed-off-by:
Jozsef Kadlecsik <kadlec@netfilter.org> Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Bui Quang Minh authored
[ Upstream commit 6183f4d3 ] On 32-bit architecture, roundup_pow_of_two() can return 0 when the argument has upper most bit set due to resulting 1UL << 32. Add a check for this case. Fixes: d5a3b1f6 ("bpf: introduce BPF_MAP_TYPE_STACK_TRACE") Signed-off-by:
Bui Quang Minh <minhquangbui99@gmail.com> Signed-off-by:
Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210127063653.3576-1-minhquangbui99@gmail.com Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Roman Gushchin authored
[ Upstream commit 2dcb3964 ] With kaslr the kernel image is placed at a random place, so starting the bottom-up allocation with the kernel_end can result in an allocation failure and a warning like this one: hugetlb_cma: reserve 2048 MiB, up to 2048 MiB per node ------------[ cut here ]------------ memblock: bottom-up allocation failed, memory hotremove may be affected WARNING: CPU: 0 PID: 0 at mm/memblock.c:332 memblock_find_in_range_node+0x178/0x25a Modules linked in: CPU: 0 PID: 0 Comm: swapper Not tainted 5.10.0+ #1169 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-1.fc33 04/01/2014 RIP: 0010:memblock_find_in_range_node+0x178/0x25a Code: e9 6d ff ff ff 48 85 c0 0f 85 da 00 00 00 80 3d 9b 35 df 00 00 75 15 48 c7 c7 c0 75 59 88 c6 05 8b 35 df 00 01 e8 25 8a fa ff <0f> 0b 48 c7 44 24 20 ff ff ff ff 44 89 e6 44 89 ea 48 c7 c1 70 5c RSP: 0000:ffffffff88803d18 EFLAGS: 00010086 ORIG_RAX: 0000000000000000 RAX: 0000000000000000 RBX: 0000000240000000 RCX: 00000000ffffdfff RDX: 00000000ffffdfff RSI: 00000000ffffffea RDI: 0000000000000046 RBP: 0000000100000000 R08: ffffffff88922788 R09: 0000000000009ffb R10: 00000000ffffe000 R11: 3fffffffffffffff R12: 0000000000000000 R13: 0000000000000000 R14: 0000000080000000 R15: 00000001fb42c000 FS: 0000000000000000(0000) GS:ffffffff88f71000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffa080fb401000 CR3: 00000001fa80a000 CR4: 00000000000406b0 Call Trace: memblock_alloc_range_nid+0x8d/0x11e cma_declare_contiguous_nid+0x2c4/0x38c hugetlb_cma_reserve+0xdc/0x128 flush_tlb_one_kernel+0xc/0x20 native_set_fixmap+0x82/0xd0 flat_get_apic_id+0x5/0x10 register_lapic_address+0x8e/0x97 setup_arch+0x8a5/0xc3f start_kernel+0x66/0x547 load_ucode_bsp+0x4c/0xcd secondary_startup_64_no_verify+0xb0/0xbb random: get_random_bytes called from __warn+0xab/0x110 with crng_init=0 ---[ end trace f151227d0b39be70 ]--- At the same time, the kernel image is protected with memblock_reserve(), so we can just start searching at PAGE_SIZE. In this case the bottom-up allocation has the same chances to success as a top-down allocation, so there is no reason to fallback in the case of a failure. All together it simplifies the logic. Link: https://lkml.kernel.org/r/20201217201214.3414100-2-guro@fb.com Fixes: 8fabc623 ("powerpc: Ensure that swiotlb buffer is allocated from low memory") Signed-off-by:
Roman Gushchin <guro@fb.com> Reviewed-by:
Mike Rapoport <rppt@linux.ibm.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Rik van Riel <riel@surriel.com> Cc: Wonhyuk Yang <vvghjk1234@gmail.com> Cc: Thiago Jung Bauermann <bauerman@linux.ibm.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Alexandre Belloni authored
[ Upstream commit 5638159f ] This reverts commit c17e9377. The lpc32xx clock driver is not able to actually change the PLL rate as this would require reparenting ARM_CLK, DDRAM_CLK, PERIPH_CLK to SYSCLK, then stop the PLL, update the register, restart the PLL and wait for the PLL to lock and finally reparent ARM_CLK, DDRAM_CLK, PERIPH_CLK to HCLK PLL. Currently, the HCLK driver simply updates the registers but this has no real effect and all the clock rate calculation end up being wrong. This is especially annoying for the peripheral (e.g. UARTs, I2C, SPI). Signed-off-by:
Alexandre Belloni <alexandre.belloni@bootlin.com> Tested-by:
Gregory CLEMENT <gregory.clement@bootlin.com> Link: https://lore.kernel.org/r/20210203090320.GA3760268@piout.net ' Signed-off-by:
Arnd Bergmann <arnd@arndb.de> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Amir Goldstein authored
[ Upstream commit 03fedf93 ] When inode has no listxattr op of its own (e.g. squashfs) vfs_listxattr calls the LSM inode_listsecurity hooks to list the xattrs that LSMs will intercept in inode_getxattr hooks. When selinux LSM is installed but not initialized, it will list the security.selinux xattr in inode_listsecurity, but will not intercept it in inode_getxattr. This results in -ENODATA for a getxattr call for an xattr returned by listxattr. This situation was manifested as overlayfs failure to copy up lower files from squashfs when selinux is built-in but not initialized, because ovl_copy_xattr() iterates the lower inode xattrs by vfs_listxattr() and vfs_getxattr(). ovl_copy_xattr() skips copy up of security labels that are indentified by inode_copy_up_xattr LSM hooks, but it does that after vfs_getxattr(). Since we are not going to copy them, skip vfs_getxattr() of the security labels. Reported-by:
Michael Labriola <michael.d.labriola@gmail.com> Tested-by:
Michael Labriola <michael.d.labriola@gmail.com> Link: https://lore.kernel.org/linux-unionfs/2nv9d47zt7.fsf@aldarion.sourceruckus.org/ Signed-off-by:
Amir Goldstein <amir73il@gmail.com> Signed-off-by:
Miklos Szeredi <mszeredi@redhat.com> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Steven Rostedt (VMware) authored
commit b220c049 upstream. When filters are used by trace events, a page is allocated on each CPU and used to copy the trace event fields to this page before writing to the ring buffer. The reason to use the filter and not write directly into the ring buffer is because a filter may discard the event and there's more overhead on discarding from the ring buffer than the extra copy. The problem here is that there is no check against the size being allocated when using this page. If an event asks for more than a page size while being filtered, it will get only a page, leading to the caller writing more that what was allocated. Check the length of the request, and if it is more than PAGE_SIZE minus the header default back to allocating from the ring buffer directly. The ring buffer may reject the event if its too big anyway, but it wont overflow. Link: https://lore.kernel.org/ath10k/1612839593-2308-1-git-send-email-wgong@codeaurora.org/ Cc: stable@vger.kernel.org Fixes: 0fc1b09f ("tracing: Use temp buffer when filtering events") Reported-by:
Wen Gong <wgong@codeaurora.org> Signed-off-by:
Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Steven Rostedt (VMware) authored
commit 256cfdd6 upstream. The file /sys/kernel/tracing/events/enable is used to enable all events by echoing in "1", or disabling all events when echoing in "0". To know if all events are enabled, disabled, or some are enabled but not all of them, cating the file should show either "1" (all enabled), "0" (all disabled), or "X" (some enabled but not all of them). This works the same as the "enable" files in the individule system directories (like tracing/events/sched/enable). But when all events are enabled, the top level "enable" file shows "X". The reason is that its checking the "ftrace" events, which are special events that only exist for their format files. These include the format for the function tracer events, that are enabled when the function tracer is enabled, but not by the "enable" file. The check includes these events, which will always be disabled, and even though all true events are enabled, the top level "enable" file will show "X" instead of "1". To fix this, have the check test the event's flags to see if it has the "IGNORE_ENABLE" flag set, and if so, not test it. Cc: stable@vger.kernel.org Fixes: 553552ce ("tracing: Combine event filter_active and enable into single flags field") Reported-by:
"Yordan Karadzhov (VMware)" <y.karadz@gmail.com> Signed-off-by:
Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Phillip Lougher authored
commit 506220d2 upstream. Sysbot has reported a warning where a kmalloc() attempt exceeds the maximum limit. This has been identified as corruption of the xattr_ids count when reading the xattr id lookup table. This patch adds a number of additional sanity checks to detect this corruption and others. 1. It checks for a corrupted xattr index read from the inode. This could be because the metadata block is uncompressed, or because the "compression" bit has been corrupted (turning a compressed block into an uncompressed block). This would cause an out of bounds read. 2. It checks against corruption of the xattr_ids count. This can either lead to the above kmalloc failure, or a smaller than expected table to be read. 3. It checks the contents of the index table for corruption. [phillip@squashfs.org.uk: fix checkpatch issue] Link: https://lkml.kernel.org/r/270245655.754655.1612770082682@webmail.123-reg.co.uk Link: https://lkml.kernel.org/r/20210204130249.4495-5-phillip@squashfs.org.uk Signed-off-by:
Phillip Lougher <phillip@squashfs.org.uk> Reported-by:
<syzbot+2ccea6339d368360800d@syzkaller.appspotmail.com> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Phillip Lougher authored
commit eabac19e upstream. Sysbot has reported an "slab-out-of-bounds read" error which has been identified as being caused by a corrupted "ino_num" value read from the inode. This could be because the metadata block is uncompressed, or because the "compression" bit has been corrupted (turning a compressed block into an uncompressed block). This patch adds additional sanity checks to detect this, and the following corruption. 1. It checks against corruption of the inodes count. This can either lead to a larger table to be read, or a smaller than expected table to be read. In the case of a too large inodes count, this would often have been trapped by the existing sanity checks, but this patch introduces a more exact check, which can identify too small values. 2. It checks the contents of the index table for corruption. [phillip@squashfs.org.uk: fix checkpatch issue] Link: https://lkml.kernel.org/r/527909353.754618.1612769948607@webmail.123-reg.co.uk Link: https://lkml.kernel.org/r/20210204130249.4495-4-phillip@squashfs.org.uk Signed-off-by:
Phillip Lougher <phillip@squashfs.org.uk> Reported-by:
<syzbot+04419e3ff19d2970ea28@syzkaller.appspotmail.com> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Phillip Lougher authored
commit f37aa4c7 upstream. Sysbot has reported a number of "slab-out-of-bounds reads" and "use-after-free read" errors which has been identified as being caused by a corrupted index value read from the inode. This could be because the metadata block is uncompressed, or because the "compression" bit has been corrupted (turning a compressed block into an uncompressed block). This patch adds additional sanity checks to detect this, and the following corruption. 1. It checks against corruption of the ids count. This can either lead to a larger table to be read, or a smaller than expected table to be read. In the case of a too large ids count, this would often have been trapped by the existing sanity checks, but this patch introduces a more exact check, which can identify too small values. 2. It checks the contents of the index table for corruption. Link: https://lkml.kernel.org/r/20210204130249.4495-3-phillip@squashfs.org.uk Signed-off-by:
Phillip Lougher <phillip@squashfs.org.uk> Reported-by:
<syzbot+b06d57ba83f604522af2@syzkaller.appspotmail.com> Reported-by:
<syzbot+c021ba012da41ee9807c@syzkaller.appspotmail.com> Reported-by:
<syzbot+5024636e8b5fd19f0f19@syzkaller.appspotmail.com> Reported-by:
<syzbot+bcbc661df46657d0fa4f@syzkaller.appspotmail.com> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Thomas Gleixner authored
commit da791a66 upstream. Stefan reported, that the glibc tst-robustpi4 test case fails occasionally. That case creates the following race between sys_exit() and sys_futex_lock_pi(): CPU0 CPU1 sys_exit() sys_futex() do_exit() futex_lock_pi() exit_signals(tsk) No waiters: tsk->flags |= PF_EXITING; *uaddr == 0x00000PID mm_release(tsk) Set waiter bit exit_robust_list(tsk) { *uaddr = 0x80000PID; Set owner died attach_to_pi_owner() { *uaddr = 0xC0000000; tsk = get_task(PID); } if (!tsk->flags & PF_EXITING) { ... attach(); tsk->flags |= PF_EXITPIDONE; } else { if (!(tsk->flags & PF_EXITPIDONE)) return -EAGAIN; return -ESRCH; <--- FAIL } ESRCH is returned all the way to user space, which triggers the glibc test case assert. Returning ESRCH unconditionally is wrong here because the user space value has been changed by the exiting task to 0xC0000000, i.e. the FUTEX_OWNER_DIED bit is set and the futex PID value has been cleared. This is a valid state and the kernel has to handle it, i.e. taking the futex. Cure it by rereading the user space value when PF_EXITING and PF_EXITPIDONE is set in the task which 'owns' the futex. If the value has changed, let the kernel retry the operation, which includes all regular sanity checks and correctly handles the FUTEX_OWNER_DIED case. If it hasn't changed, then return ESRCH as there is no way to distinguish this case from malfunctioning user space. This happens when the exiting task did not have a robust list, the robust list was corrupted or the user space value in the futex was simply bogus. Reported-by:
Stefan Liebler <stli@linux.ibm.com> Signed-off-by:
Thomas Gleixner <tglx@linutronix.de> Acked-by:
Peter Zijlstra <peterz@infradead.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Darren Hart <dvhart@infradead.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Sasha Levin <sashal@kernel.org> Cc: stable@vger.kernel.org Link: https://bugzilla.kernel.org/show_bug.cgi?id=200467 Link: https://lkml.kernel.org/r/20181210152311.986181245@linutronix.de Signed-off-by:
Sudip Mukherjee <sudipm.mukherjee@gmail.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> [Lee: Required to satisfy functional dependency from futex back-port. Re-add the missing handle_exit_race() parts from: 3d4775df ("futex: Replace PF_EXITPIDONE with a state")] Signed-off-by:
Lee Jones <lee.jones@linaro.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peter Zijlstra authored
Currently futex-pi relies on hb->lock to serialize everything. But hb->lock creates another set of problems, especially priority inversions on RT where hb->lock becomes a rt_mutex itself. The rt_mutex::wait_lock is the most obvious protection for keeping the futex user space value and the kernel internal pi_state in sync. Rework and document the locking so rt_mutex::wait_lock is held accross all operations which modify the user space value and the pi state. This allows to invoke rt_mutex_unlock() (including deboost) without holding hb->lock as a next step. Nothing yet relies on the new locking rules. Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Cc: juri.lelli@arm.com Cc: bigeasy@linutronix.de Cc: xlpang@redhat.com Cc: rostedt@goodmis.org Cc: mathieu.desnoyers@efficios.com Cc: jdesfossez@efficios.com Cc: dvhart@infradead.org Cc: bristot@redhat.com Link: http://lkml.kernel.org/r/20170322104151.751993333@infradead.org Signed-off-by:
Thomas Gleixner <tglx@linutronix.de> [Lee: Back-ported in support of a previous futex back-port attempt] Signed-off-by:
Lee Jones <lee.jones@linaro.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Thomas Gleixner authored
commit 12bb3f7f upstream In case that futex_lock_pi() was aborted by a signal or a timeout and the task returned without acquiring the rtmutex, but is the designated owner of the futex due to a concurrent futex_unlock_pi() fixup_owner() is invoked to establish consistent state. In that case it invokes fixup_pi_state_owner() which in turn tries to acquire the rtmutex again. If that succeeds then it does not propagate this success to fixup_owner() and futex_lock_pi() returns -EINTR or -ETIMEOUT despite having the futex locked. Return success from fixup_pi_state_owner() in all cases where the current task owns the rtmutex and therefore the futex and propagate it correctly through fixup_owner(). Fixup the other callsite which does not expect a positive return value. Fixes: c1e2f0ea ("futex: Avoid violating the 10th rule of futex") Signed-off-by:
Thomas Gleixner <tglx@linutronix.de> Acked-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Cc: stable@vger.kernel.org Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> [Lee: Back-ported in support of a previous futex attempt] Signed-off-by:
Lee Jones <lee.jones@linaro.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Theodore Ts'o authored
[ Upstream commit 68f23b89 ] Without memcg, there is a one-to-one mapping between the bdi and bdi_writeback structures. In this world, things are fairly straightforward; the first thing bdi_unregister() does is to shutdown the bdi_writeback structure (or wb), and part of that writeback ensures that no other work queued against the wb, and that the wb is fully drained. With memcg, however, there is a one-to-many relationship between the bdi and bdi_writeback structures; that is, there are multiple wb objects which can all point to a single bdi. There is a refcount which prevents the bdi object from being released (and hence, unregistered). So in theory, the bdi_unregister() *should* only get called once its refcount goes to zero (bdi_put will drop the refcount, and when it is zero, release_bdi gets called, which calls bdi_unregister). Unfortunately, del_gendisk() in block/gen_hd.c never got the memo about the Brave New memcg World, and calls bdi_unregister directly. It does this without informing the file system, or the memcg code, or anything else. This causes the root wb associated with the bdi to be unregistered, but none of the memcg-specific wb's are shutdown. So when one of these wb's are woken up to do delayed work, they try to dereference their wb->bdi->dev to fetch the device name, but unfortunately bdi->dev is now NULL, thanks to the bdi_unregister() called by del_gendisk(). As a result, *boom*. Fortunately, it looks like the rest of the writeback path is perfectly happy with bdi->dev and bdi->owner being NULL, so the simplest fix is to create a bdi_dev_name() function which can handle bdi->dev being NULL. This also allows us to bulletproof the writeback tracepoints to prevent them from dereferencing a NULL pointer and crashing the kernel if one is tracing with memcg's enabled, and an iSCSI device dies or a USB storage stick is pulled. The most common way of triggering this will be hotremoval of a device while writeback with memcg enabled is going on. It was triggering several times a day in a heavily loaded production environment. Google Bug Id: 145475544 Link: https://lore.kernel.org/r/20191227194829.150110-1-tytso@mit.edu Link: http://lkml.kernel.org/r/20191228005211.163952-1-tytso@mit.edu Signed-off-by:
Theodore Ts'o <tytso@mit.edu> Cc: Chris Mason <clm@fb.com> Cc: Tejun Heo <tj@kernel.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Qian Cai authored
[ Upstream commit d1a445d3 ] There are many of those warnings. In file included from ./arch/powerpc/include/asm/paca.h:15, from ./arch/powerpc/include/asm/current.h:13, from ./include/linux/thread_info.h:21, from ./include/asm-generic/preempt.h:5, from ./arch/powerpc/include/generated/asm/preempt.h:1, from ./include/linux/preempt.h:78, from ./include/linux/spinlock.h:51, from fs/fs-writeback.c:19: In function 'strncpy', inlined from 'perf_trace_writeback_page_template' at ./include/trace/events/writeback.h:56:1: ./include/linux/string.h:260:9: warning: '__builtin_strncpy' specified bound 32 equals destination size [-Wstringop-truncation] return __builtin_strncpy(p, q, size); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Fix it by using the new strscpy_pad() which was introduced in "lib/string: Add strscpy_pad() function" and will always be NUL-terminated instead of strncpy(). Also, change strlcpy() to use strscpy_pad() in this file for consistency. Link: http://lkml.kernel.org/r/1564075099-27750-1-git-send-email-cai@lca.pw Fixes: 455b2864 ("writeback: Initial tracing support") Fixes: 028c2dd1 ("writeback: Add tracing to balance_dirty_pages") Fixes: e84d0a4f ("writeback: trace event writeback_queue_io") Fixes: b48c104d ("writeback: trace event bdi_dirty_ratelimit") Fixes: cc1676d9 ("writeback: Move requeueing when I_SYNC set to writeback_sb_inodes()") Fixes: 9fb0a7da ("writeback: add more tracepoints") Signed-off-by:
Qian Cai <cai@lca.pw> Reviewed-by:
Jan Kara <jack@suse.cz> Cc: Tobin C. Harding <tobin@kernel.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Tejun Heo <tj@kernel.org> Cc: Dave Chinner <dchinner@redhat.com> Cc: Fengguang Wu <fengguang.wu@intel.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Joe Perches <joe@perches.com> Cc: Kees Cook <keescook@chromium.org> Cc: Jann Horn <jannh@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Nitin Gote <nitin.r.gote@intel.com> Cc: Rasmus Villemoes <rasmus.villemoes@prevas.dk> Cc: Stephen Kitt <steve@sk2.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-