- Feb 25, 2023
-
-
Greg Kroah-Hartman authored
Link: https://lore.kernel.org/r/20230223130426.817998725@linuxfoundation.org Link: https://lore.kernel.org/r/20230223141540.701637224@linuxfoundation.org Tested-by:
Florian Fainelli <f.fainelli@gmail.com> Tested-by:
Pavel Machek (CIP) <pavel@denx.de> Tested-by:
Shuah Khan <skhan@linuxfoundation.org> Tested-by:
Slade Watkins <srw@sladewatkins.net> Tested-by:
Guenter Roeck <linux@roeck-us.net> Tested-by:
Sudip Mukherjee <sudip.mukherjee@codethink.co.uk> Tested-by:
Linux Kernel Functional Testing <lkft@linaro.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Linus Torvalds authored
commit f3dd0c53 upstream. Commit 74e19ef0 ("uaccess: Add speculation barrier to copy_from_user()") built fine on x86-64 and arm64, and that's the extent of my local build testing. It turns out those got the <linux/nospec.h> include incidentally through other header files (<linux/kvm_host.h> in particular), but that was not true of other architectures, resulting in build errors kernel/bpf/core.c: In function ‘___bpf_prog_run’: kernel/bpf/core.c:1913:3: error: implicit declaration of function ‘barrier_nospec’ so just make sure to explicitly include the proper <linux/nospec.h> header file to make everybody see it. Fixes: 74e19ef0 ("uaccess: Add speculation barrier to copy_from_user()") Reported-by:
kernel test robot <lkp@intel.com> Reported-by:
Viresh Kumar <viresh.kumar@linaro.org> Reported-by:
Huacai Chen <chenhuacai@loongson.cn> Tested-by:
Geert Uytterhoeven <geert@linux-m68k.org> Tested-by:
Dave Hansen <dave.hansen@linux.intel.com> Acked-by:
Alexei Starovoitov <alexei.starovoitov@gmail.com> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Vladimir Oltean authored
commit af7b29b1 upstream. taprio_attach() has this logic at the end, which should have been removed with the blamed patch (which is now being reverted): /* access to the child qdiscs is not needed in offload mode */ if (FULL_OFFLOAD_IS_ENABLED(q->flags)) { kfree(q->qdiscs); q->qdiscs = NULL; } because otherwise, we make use of q->qdiscs[] even after this array was deallocated, namely in taprio_leaf(). Therefore, whenever one would try to attach a valid child qdisc to a fully offloaded taprio root, one would immediately dereference a NULL pointer. $ tc qdisc replace dev eno0 handle 8001: parent root taprio \ num_tc 8 \ map 0 1 2 3 4 5 6 7 \ queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 \ max-sdu 0 0 0 0 0 200 0 0 \ base-time 200 \ sched-entry S 80 20000 \ sched-entry S a0 20000 \ sched-entry S 5f 60000 \ flags 2 $ max_frame_size=1500 $ data_rate_kbps=20000 $ port_transmit_rate_kbps=1000000 $ idleslope=$data_rate_kbps $ sendslope=$(($idleslope - $port_transmit_rate_kbps)) $ locredit=$(($max_frame_size * $sendslope / $port_transmit_rate_kbps)) $ hicredit=$(($max_frame_size * $idleslope / $port_transmit_rate_kbps)) $ tc qdisc replace dev eno0 parent 8001:7 cbs \ idleslope $idleslope \ sendslope $sendslope \ hicredit $hicredit \ locredit $locredit \ offload 0 Unable to handle kernel NULL pointer dereference at virtual address 0000000000000030 pc : taprio_leaf+0x28/0x40 lr : qdisc_leaf+0x3c/0x60 Call trace: taprio_leaf+0x28/0x40 tc_modify_qdisc+0xf0/0x72c rtnetlink_rcv_msg+0x12c/0x390 netlink_rcv_skb+0x5c/0x130 rtnetlink_rcv+0x1c/0x2c The solution is not as obvious as the problem. The code which deallocates q->qdiscs[] is in fact copied and pasted from mqprio, which also deallocates the array in mqprio_attach() and never uses it afterwards. Therefore, the identical cleanup logic of priv->qdiscs[] that mqprio_destroy() has is deceptive because it will never take place at qdisc_destroy() time, but just at raw ops->destroy() time (otherwise said, priv->qdiscs[] do not last for the entire lifetime of the mqprio root), but rather, this is just the twisted way in which the Qdisc API understands error path cleanup should be done (Qdisc_ops :: destroy() is called even when Qdisc_ops :: init() never succeeded). Side note, in fact this is also what the comment in mqprio_init() says: /* pre-allocate qdisc, attachment can't fail */ Or reworded, mqprio's priv->qdiscs[] scheme is only meant to serve as data passing between Qdisc_ops :: init() and Qdisc_ops :: attach(). [ this comment was also copied and pasted into the initial taprio commit, even though taprio_attach() came way later ] The problem is that taprio also makes extensive use of the q->qdiscs[] array in the software fast path (taprio_enqueue() and taprio_dequeue()), but it does not keep a reference of its own on q->qdiscs[i] (you'd think that since it creates these Qdiscs, it holds the reference, but nope, this is not completely true). To understand the difference between taprio_destroy() and mqprio_destroy() one must look before commit 13511704 ("net: taprio offload: enforce qdisc to netdev queue mapping"), because that just muddied the waters. In the "original" taprio design, taprio always attached itself (the root Qdisc) to all netdev TX queues, so that dev_qdisc_enqueue() would go through taprio_enqueue(). It also called qdisc_refcount_inc() on itself for as many times as there were netdev TX queues, in order to counter-balance what tc_get_qdisc() does when destroying a Qdisc (simplified for brevity below): if (n->nlmsg_type == RTM_DELQDISC) err = qdisc_graft(dev, parent=NULL, new=NULL, q, extack); qdisc_graft(where "new" is NULL so this deletes the Qdisc): for (i = 0; i < num_q; i++) { struct netdev_queue *dev_queue; dev_queue = netdev_get_tx_queue(dev, i); old = dev_graft_qdisc(dev_queue, new); if (new && i > 0) qdisc_refcount_inc(new); qdisc_put(old); ~~~~~~~~~~~~~~ this decrements taprio's refcount once for each TX queue } notify_and_destroy(net, skb, n, classid, rtnl_dereference(dev->qdisc), new); ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ and this finally decrements it to zero, making qdisc_put() call qdisc_destroy() The q->qdiscs[] created using qdisc_create_dflt() (or their replacements, if taprio_graft() was ever to get called) were then privately freed by taprio_destroy(). This is still what is happening after commit 13511704 ("net: taprio offload: enforce qdisc to netdev queue mapping"), but only for software mode. In full offload mode, the per-txq "qdisc_put(old)" calls from qdisc_graft() now deallocate the child Qdiscs rather than decrement taprio's refcount. So when notify_and_destroy(taprio) finally calls taprio_destroy(), the difference is that the child Qdiscs were already deallocated. And this is exactly why the taprio_attach() comment "access to the child qdiscs is not needed in offload mode" is deceptive too. Not only the q->qdiscs[] array is not needed, but it is also necessary to get rid of it as soon as possible, because otherwise, we will also call qdisc_put() on the child Qdiscs in qdisc_destroy() -> taprio_destroy(), and this will cause a nasty use-after-free/refcount-saturate/whatever. In short, the problem is that since the blamed commit, taprio_leaf() needs q->qdiscs[] to not be freed by taprio_attach(), while qdisc_destroy() -> taprio_destroy() does need q->qdiscs[] to be freed by taprio_attach() for full offload. Fixing one problem triggers the other. All of this can be solved by making taprio keep its q->qdiscs[i] with a refcount elevated at 2 (in offloaded mode where they are attached to the netdev TX queues), both in taprio_attach() and in taprio_graft(). The generic qdisc_graft() would just decrement the child qdiscs' refcounts to 1, and taprio_destroy() would give them the final coup de grace. However the rabbit hole of changes is getting quite deep, and the complexity increases. The blamed commit was supposed to be a bug fix in the first place, and the bug it addressed is not so significant so as to justify further rework in stable trees. So I'd rather just revert it. I don't know enough about multi-queue Qdisc design to make a proper judgement right now regarding what is/isn't idiomatic use of Qdisc concepts in taprio. I will try to study the problem more and come with a different solution in net-next. Fixes: 1461d212 ("net/sched: taprio: make qdisc_leaf() see the per-netdev-queue pfifo child qdiscs") Reported-by:
Muhammad Husaini Zulkifli <muhammad.husaini.zulkifli@intel.com> Reported-by:
Vinicius Costa Gomes <vinicius.gomes@intel.com> Signed-off-by:
Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by:
Vinicius Costa Gomes <vinicius.gomes@intel.com> Link: https://lore.kernel.org/r/20221004220100.1650558-1-vladimir.oltean@nxp.com Signed-off-by:
Jakub Kicinski <kuba@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Kees Cook authored
commit 118901ad upstream. With clang's kernel control flow integrity (kCFI, CONFIG_CFI_CLANG), indirect call targets are validated against the expected function pointer prototype to make sure the call target is valid to help mitigate ROP attacks. If they are not identical, there is a failure at run time, which manifests as either a kernel panic or thread getting killed. ext4_feat_ktype was setting the "release" handler to "kfree", which doesn't have a matching function prototype. Add a simple wrapper with the correct prototype. This was found as a result of Clang's new -Wcast-function-type-strict flag, which is more sensitive than the simpler -Wcast-function-type, which only checks for type width mismatches. Note that this code is only reached when ext4 is a loadable module and it is being unloaded: CFI failure at kobject_put+0xbb/0x1b0 (target: kfree+0x0/0x180; expected type: 0x7c4aa698) ... RIP: 0010:kobject_put+0xbb/0x1b0 ... Call Trace: <TASK> ext4_exit_sysfs+0x14/0x60 [ext4] cleanup_module+0x67/0xedb [ext4] Fixes: b99fee58 ("ext4: create ext4_feat kobject dynamically") Cc: Theodore Ts'o <tytso@mit.edu> Cc: Eric Biggers <ebiggers@kernel.org> Cc: stable@vger.kernel.org Build-tested-by:
Gustavo A. R. Silva <gustavoars@kernel.org> Reviewed-by:
Gustavo A. R. Silva <gustavoars@kernel.org> Reviewed-by:
Nathan Chancellor <nathan@kernel.org> Link: https://lore.kernel.org/r/20230103234616.never.915-kees@kernel.org Signed-off-by:
Kees Cook <keescook@chromium.org> Reviewed-by:
Eric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20230104210908.gonna.388-kees@kernel.org Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Paul Moore authored
commit 6c6cd913 upstream. We've moved the upstream Linux Kernel audit subsystem discussions to a new mailing list, this patch updates the MAINTAINERS info with the new list address. Marking this for stable inclusion to help speed uptake of the new list across all of the supported kernel releases. This is a doc only patch so the risk should be close to nil. Cc: stable@vger.kernel.org Signed-off-by:
Paul Moore <paul@paul-moore.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Lukas Wunner authored
commit 36dd7a4c upstream. Commit e3fffc1f ("devicetree: document new marvell-8xxx and pwrseq-sd8787 options") documented a compatible string for SD8787 in the devicetree bindings, but neglected to add it to the mwifiex driver. Fixes: e3fffc1f ("devicetree: document new marvell-8xxx and pwrseq-sd8787 options") Signed-off-by:
Lukas Wunner <lukas@wunner.de> Cc: stable@vger.kernel.org # v4.11+ Cc: Matt Ranostay <mranostay@ti.com> Signed-off-by:
Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/320de5005ff3b8fd76be2d2b859fd021689c3681.1674827105.git.lukas@wunner.de Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Zhang Wensheng authored
commit 858f1bf6 upstream. When 'index' is a big numbers, it may become negative which forced to 'int'. then 'index << part_shift' might overflow to a positive value that is not greater than '0xfffff', then sysfs might complains about duplicate creation. Because of this, move the 'index' judgment to the front will fix it and be better. Fixes: b0d9111a ("nbd: use an idr to keep track of nbd devices") Fixes: 940c2649 ("nbd: fix possible overflow for 'first_minor' in nbd_dev_add()") Signed-off-by:
Zhang Wensheng <zhangwensheng5@huawei.com> Signed-off-by:
Yu Kuai <yukuai3@huawei.com> Reviewed-by:
Josef Bacik <josef@toxicpanda.com> Link: https://lore.kernel.org/r/20220521073749.3146892-6-yukuai3@huawei.com Signed-off-by:
Jens Axboe <axboe@kernel.dk> Signed-off-by:
Wen Yang <wenyang.linux@foxmail.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Yu Kuai authored
commit 940c2649 upstream. If 'part_shift' is not zero, then 'index << part_shift' might overflow to a value that is not greater than '0xfffff', then sysfs might complains about duplicate creation. Fixes: b0d9111a ("nbd: use an idr to keep track of nbd devices") Signed-off-by:
Yu Kuai <yukuai3@huawei.com> Reviewed-by:
Josef Bacik <josef@toxicpanda.com> Link: https://lore.kernel.org/r/20211102015237.2309763-3-yebin10@huawei.com Signed-off-by:
Jens Axboe <axboe@kernel.dk> Signed-off-by:
Wen Yang <wenyang.linux@foxmail.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Yu Kuai authored
commit e4c4871a upstream. commit b1a81163 ("block: nbd: add sanity check for first_minor") checks that 'first_minor' should not be greater than 0xff, which is wrong. Whitout the commit, the details that when user pass 0x100000, it ends up create sysfs dir "/sys/block/43:0" are as follows: nbd_dev_add disk->first_minor = index << part_shift -> default part_shift is 5, first_minor is 0x2000000 device_add_disk ddev->devt = MKDEV(disk->major, disk->first_minor) -> (0x2b << 20) | (0x2000000) = 0x2b00000 device_add device_create_sys_dev_entry format_dev_t sprintf(buffer, "%u:%u", MAJOR(dev), MINOR(dev)); -> got 43:0 sysfs_create_link -> /sys/block/43:0 By the way, with the wrong fix, when part_shift is the default value, only 8 ndb devices can be created since 8 << 5 is greater than 0xff. Since the max bits for 'first_minor' should be the same as what MKDEV() does, which is 20. Change the upper bound of 'first_minor' from 0xff to 0xfffff. Fixes: b1a81163 ("block: nbd: add sanity check for first_minor") Signed-off-by:
Yu Kuai <yukuai3@huawei.com> Reviewed-by:
Josef Bacik <josef@toxicpanda.com> Link: https://lore.kernel.org/r/20211102015237.2309763-2-yebin10@huawei.com Signed-off-by:
Jens Axboe <axboe@kernel.dk> Signed-off-by:
Wen Yang <wenyang.linux@foxmail.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Wen Yang authored
This reverts commit 0daa75bf. These problems such as: https://lore.kernel.org/all/CACPK8XfUWoOHr-0RwRoYoskia4fbAbZ7DYf5wWBnv6qUnGq18w@mail.gmail.com/ It was introduced by introduced by commit b1a81163 ("block: nbd: add sanity check for first_minor") and has been have been fixed by commit e4c4871a ("nbd: fix max value for 'first_minor'"). Cc: Joel Stanley <joel@jms.id.au> Cc: Christoph Hellwig <hch@lst.de> Cc: Pavel Skripkin <paskripkin@gmail.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Sasha Levin <sashal@kernel.org> Cc: stable@vger.kernel.org # v5.10+ Signed-off-by:
Wen Yang <wenyang.linux@foxmail.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Dave Hansen authored
commit 74e19ef0 upstream. The results of "access_ok()" can be mis-speculated. The result is that you can end speculatively: if (access_ok(from, size)) // Right here even for bad from/size combinations. On first glance, it would be ideal to just add a speculation barrier to "access_ok()" so that its results can never be mis-speculated. But there are lots of system calls just doing access_ok() via "copy_to_user()" and friends (example: fstat() and friends). Those are generally not problematic because they do not _consume_ data from userspace other than the pointer. They are also very quick and common system calls that should not be needlessly slowed down. "copy_from_user()" on the other hand uses a user-controller pointer and is frequently followed up with code that might affect caches. Take something like this: if (!copy_from_user(&kernelvar, uptr, size)) do_something_with(kernelvar); If userspace passes in an evil 'uptr' that *actually* points to a kernel addresses, and then do_something_with() has cache (or other) side-effects, it could allow userspace to infer kernel data values. Add a barrier to the common copy_from_user() code to prevent mis-speculated values which happen after the copy. Also add a stub for architectures that do not define barrier_nospec(). This makes the macro usable in generic code. Since the barrier is now usable in generic code, the x86 #ifdef in the BPF code can also go away. Reported-by:
Jordy Zomer <jordyzomer@google.com> Suggested-by:
Linus Torvalds <torvalds@linuxfoundation.org> Signed-off-by:
Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by:
Thomas Gleixner <tglx@linutronix.de> Acked-by: Daniel Borkmann <daniel@iogearbox.net> # BPF bits Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Pavel Skripkin authored
commit 8b5cb7e4 upstream. Syzbot hit NULL deref in rhashtable_free_and_destroy(). The problem was in mesh_paths and mpp_paths being NULL. mesh_pathtbl_init() could fail in case of memory allocation failure, but nobody cared, since ieee80211_mesh_init_sdata() returns void. It led to leaving 2 pointers as NULL. Syzbot has found null deref on exit path, but it could happen anywhere else, because code assumes these pointers are valid. Since all ieee80211_*_setup_sdata functions are void and do not fail, let's embedd mesh_paths and mpp_paths into parent struct to avoid adding error handling on higher levels and follow the pattern of others setup_sdata functions Fixes: 60854fd9 ("mac80211: mesh: convert path table to rhashtable") Reported-and-tested-by:
<syzbot+860268315ba86ea6b96b@syzkaller.appspotmail.com> Signed-off-by:
Pavel Skripkin <paskripkin@gmail.com> Link: https://lore.kernel.org/r/20211230195547.23977-1-paskripkin@gmail.com Signed-off-by:
Johannes Berg <johannes.berg@intel.com> [pchelkin@ispras.ru: adapt a comment spell fixing issue] Signed-off-by:
Fedor Pchelkin <pchelkin@ispras.ru> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Zheng Wang authored
commit 4a61648a upstream. If intel_gvt_dma_map_guest_page failed, it will call ppgtt_invalidate_spt, which will finally free the spt. But the caller function ppgtt_populate_spt_by_guest_entry does not notice that, it will free spt again in its error path. Fix this by canceling the mapping of DMA address and freeing sub_spt. Besides, leave the handle of spt destroy to caller function instead of callee function when error occurs. Fixes: b901b252 ("drm/i915/gvt: Add 2M huge gtt support") Signed-off-by:
Zheng Wang <zyytlz.wz@163.com> Reviewed-by:
Zhenyu Wang <zhenyuw@linux.intel.com> Signed-off-by:
Zhenyu Wang <zhenyuw@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20221229165641.1192455-1-zyytlz.wz@163.com Signed-off-by:
Ovidiu Panait <ovidiu.panait@eng.windriver.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Sean Anderson authored
[ Upstream commit 8d8bee13 ] There aren't enough resources to run these ports at 10G speeds. Disable 10G for these ports, reverting to the previous speed. Fixes: 36926a7d ("powerpc: dts: t208x: Mark MAC1 and MAC2 as 10G") Reported-by:
Camelia Alexandra Groza <camelia.groza@nxp.com> Signed-off-by:
Sean Anderson <sean.anderson@seco.com> Reviewed-by:
Camelia Groza <camelia.groza@nxp.com> Tested-by:
Camelia Groza <camelia.groza@nxp.com> Link: https://lore.kernel.org/r/20221216172937.2960054-1-sean.anderson@seco.com Signed-off-by:
Jakub Kicinski <kuba@kernel.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Marc Kleine-Budde authored
[ Upstream commit f0062291 ] Debian's gcc-13 [1] throws the following error in kvaser_usb_hydra_cmd_size(): [1] gcc version 13.0.0 20221214 (experimental) [master r13-4693-g512098a3316] (Debian 13-20221214-1) | drivers/net/can/usb/kvaser_usb/kvaser_usb_hydra.c:502:65: error: | array subscript ‘struct kvaser_cmd_ext[0]’ is partly outside array | bounds of ‘unsigned char[32]’ [-Werror=array-bounds=] | 502 | ret = le16_to_cpu(((struct kvaser_cmd_ext *)cmd)->len); kvaser_usb_hydra_cmd_size() returns the size of given command. It depends on the command number (cmd->header.cmd_no). For extended commands (cmd->header.cmd_no == CMD_EXTENDED) the above shown code is executed. Help gcc to recognize that this code path is not taken in all cases, by calling kvaser_usb_hydra_cmd_size() directly after assigning the command number. Fixes: aec5fb22 ("can: kvaser_usb: Add support for Kvaser USB hydra family") Cc: Jimmy Assarsson <extja@kvaser.com> Cc: Anssi Hannula <anssi.hannula@bitwise.fi> Link: https://lore.kernel.org/all/20221219110104.1073881-1-mkl@pengutronix.de Tested-by:
Jimmy Assarsson <extja@kvaser.com> Signed-off-by:
Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Jim Mattson authored
[ Upstream commit 2e7eab81 ] According to Intel's document on Indirect Branch Restricted Speculation, "Enabling IBRS does not prevent software from controlling the predicted targets of indirect branches of unrelated software executed later at the same predictor mode (for example, between two different user applications, or two different virtual machines). Such isolation can be ensured through use of the Indirect Branch Predictor Barrier (IBPB) command." This applies to both basic and enhanced IBRS. Since L1 and L2 VMs share hardware predictor modes (guest-user and guest-kernel), hardware IBRS is not sufficient to virtualize IBRS. (The way that basic IBRS is implemented on pre-eIBRS parts, hardware IBRS is actually sufficient in practice, even though it isn't sufficient architecturally.) For virtual CPUs that support IBRS, add an indirect branch prediction barrier on emulated VM-exit, to ensure that the predicted targets of indirect branches executed in L1 cannot be controlled by software that was executed in L2. Since we typically don't intercept guest writes to IA32_SPEC_CTRL, perform the IBPB at emulated VM-exit regardless of the current IA32_SPEC_CTRL.IBRS value, even though the IBPB could technically be deferred until L1 sets IA32_SPEC_CTRL.IBRS, if IA32_SPEC_CTRL.IBRS is clear at emulated VM-exit. This is CVE-2022-2196. Fixes: 5c911bef ("KVM: nVMX: Skip IBPB when switching between vmcs01 and vmcs02") Cc: Sean Christopherson <seanjc@google.com> Signed-off-by:
Jim Mattson <jmattson@google.com> Reviewed-by:
Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221019213620.1953281-3-jmattson@google.com Signed-off-by:
Sean Christopherson <seanjc@google.com> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Sean Christopherson authored
[ Upstream commit 5c30e810 ] Skip the WRMSR fastpath in SVM's VM-Exit handler if the next RIP isn't valid, e.g. because KVM is running with nrips=false. SVM must decode and emulate to skip the WRMSR if the CPU doesn't provide the next RIP. Getting the instruction bytes to decode the WRMSR requires reading guest memory, which in turn means dereferencing memslots, and that isn't safe because KVM doesn't hold SRCU when the fastpath runs. Don't bother trying to enable the fastpath for this case, e.g. by doing only the WRMSR and leaving the "skip" until later. NRIPS is supported on all modern CPUs (KVM has considered making it mandatory), and the next RIP will be valid the vast, vast majority of the time. ============================= WARNING: suspicious RCU usage 6.0.0-smp--4e557fcd3d80-skip #13 Tainted: G O ----------------------------- include/linux/kvm_host.h:954 suspicious rcu_dereference_check() usage! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 1 lock held by stable/206475: #0: ffff9d9dfebcc0f0 (&vcpu->mutex){+.+.}-{3:3}, at: kvm_vcpu_ioctl+0x8b/0x620 [kvm] stack backtrace: CPU: 152 PID: 206475 Comm: stable Tainted: G O 6.0.0-smp--4e557fcd3d80-skip #13 Hardware name: Google, Inc. Arcadia_IT_80/Arcadia_IT_80, BIOS 10.48.0 01/27/2022 Call Trace: <TASK> dump_stack_lvl+0x69/0xaa dump_stack+0x10/0x12 lockdep_rcu_suspicious+0x11e/0x130 kvm_vcpu_gfn_to_memslot+0x155/0x190 [kvm] kvm_vcpu_gfn_to_hva_prot+0x18/0x80 [kvm] paging64_walk_addr_generic+0x183/0x450 [kvm] paging64_gva_to_gpa+0x63/0xd0 [kvm] kvm_fetch_guest_virt+0x53/0xc0 [kvm] __do_insn_fetch_bytes+0x18b/0x1c0 [kvm] x86_decode_insn+0xf0/0xef0 [kvm] x86_emulate_instruction+0xba/0x790 [kvm] kvm_emulate_instruction+0x17/0x20 [kvm] __svm_skip_emulated_instruction+0x85/0x100 [kvm_amd] svm_skip_emulated_instruction+0x13/0x20 [kvm_amd] handle_fastpath_set_msr_irqoff+0xae/0x180 [kvm] svm_vcpu_run+0x4b8/0x5a0 [kvm_amd] vcpu_enter_guest+0x16ca/0x22f0 [kvm] kvm_arch_vcpu_ioctl_run+0x39d/0x900 [kvm] kvm_vcpu_ioctl+0x538/0x620 [kvm] __se_sys_ioctl+0x77/0xc0 __x64_sys_ioctl+0x1d/0x20 do_syscall_64+0x3d/0x80 entry_SYSCALL_64_after_hwframe+0x63/0xcd Fixes: 404d5d7b ("KVM: X86: Introduce more exit_fastpath_completion enum values") Signed-off-by:
Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220930234031.1732249-1-seanjc@google.com Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Sean Christopherson authored
[ Upstream commit 17122c06 ] Treat any exception during instruction decode for EMULTYPE_SKIP as a "full" emulation failure, i.e. signal failure instead of queuing the exception. When decoding purely to skip an instruction, KVM and/or the CPU has already done some amount of emulation that cannot be unwound, e.g. on an EPT misconfig VM-Exit KVM has already processeed the emulated MMIO. KVM already does this if a #UD is encountered, but not for other exceptions, e.g. if a #PF is encountered during fetch. In SVM's soft-injection use case, queueing the exception is particularly problematic as queueing exceptions while injecting events can put KVM into an infinite loop due to bailing from VM-Enter to service the newly pending exception. E.g. multiple warnings to detect such behavior fire: ------------[ cut here ]------------ WARNING: CPU: 3 PID: 1017 at arch/x86/kvm/x86.c:9873 kvm_arch_vcpu_ioctl_run+0x1de5/0x20a0 [kvm] Modules linked in: kvm_amd ccp kvm irqbypass CPU: 3 PID: 1017 Comm: svm_nested_soft Not tainted 6.0.0-rc1+ #220 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 RIP: 0010:kvm_arch_vcpu_ioctl_run+0x1de5/0x20a0 [kvm] Call Trace: kvm_vcpu_ioctl+0x223/0x6d0 [kvm] __x64_sys_ioctl+0x85/0xc0 do_syscall_64+0x2b/0x50 entry_SYSCALL_64_after_hwframe+0x46/0xb0 ---[ end trace 0000000000000000 ]--- ------------[ cut here ]------------ WARNING: CPU: 3 PID: 1017 at arch/x86/kvm/x86.c:9987 kvm_arch_vcpu_ioctl_run+0x12a3/0x20a0 [kvm] Modules linked in: kvm_amd ccp kvm irqbypass CPU: 3 PID: 1017 Comm: svm_nested_soft Tainted: G W 6.0.0-rc1+ #220 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 RIP: 0010:kvm_arch_vcpu_ioctl_run+0x12a3/0x20a0 [kvm] Call Trace: kvm_vcpu_ioctl+0x223/0x6d0 [kvm] __x64_sys_ioctl+0x85/0xc0 do_syscall_64+0x2b/0x50 entry_SYSCALL_64_after_hwframe+0x46/0xb0 ---[ end trace 0000000000000000 ]--- Fixes: 6ea6e843 ("KVM: x86: inject exceptions produced by x86_decode_insn") Signed-off-by:
Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220930233632.1725475-1-seanjc@google.com Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Jason A. Donenfeld authored
[ Upstream commit d7bf7f3b ] add_latent_entropy() is called every time a process forks, in kernel_clone(). This in turn calls add_device_randomness() using the latent entropy global state. add_device_randomness() does two things: 2) Mixes into the input pool the latent entropy argument passed; and 1) Mixes in a cycle counter, a sort of measurement of when the event took place, the high precision bits of which are presumably difficult to predict. (2) is impossible without CONFIG_GCC_PLUGIN_LATENT_ENTROPY=y. But (1) is always possible. However, currently CONFIG_GCC_PLUGIN_LATENT_ENTROPY=n disables both (1) and (2), instead of just (2). This commit causes the CONFIG_GCC_PLUGIN_LATENT_ENTROPY=n case to still do (1) by passing NULL (len 0) to add_device_randomness() when add_latent_ entropy() is called. Cc: Dominik Brodowski <linux@dominikbrodowski.net> Cc: PaX Team <pageexec@freemail.hu> Cc: Emese Revfy <re.emese@gmail.com> Fixes: 38addce8 ("gcc-plugins: Add latent_entropy plugin") Signed-off-by:
Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Rahul Tanwar authored
[ Upstream commit 7256d1f4 ] Commit 03617731 ("clk: mxl: Switch from direct readl/writel based IO to regmap based IO") introduced code resulting in below warning issued by the smatch static checker. drivers/clk/x86/clk-lgm.c:441 lgm_cgu_probe() warn: passing zero to 'PTR_ERR' Fix the warning by replacing incorrect IS_ERR_OR_NULL() with IS_ERR(). Fixes: 03617731 ("clk: mxl: Switch from direct readl/writel based IO to regmap based IO") Reported-by:
Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by:
Rahul Tanwar <rtanwar@maxlinear.com> Link: https://lore.kernel.org/r/49e339d4739e4ae4c92b00c1b2918af0755d4122.1666695221.git.rtanwar@maxlinear.com Signed-off-by:
Stephen Boyd <sboyd@kernel.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Sean Anderson authored
[ Upstream commit 36926a7d ] On the T208X SoCs, MAC1 and MAC2 support XGMII. Add some new MAC dtsi fragments, and mark the QMAN ports as 10G. Fixes: da414bb9 ("powerpc/mpc85xx: Add FSL QorIQ DPAA FMan support to the SoC device tree(s)") Signed-off-by:
Sean Anderson <sean.anderson@seco.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Rahul Tanwar authored
[ Upstream commit 106ef3bd ] One of the clock entry "dcl" clk has some HW limitations. One is that its rate can only by changed by changing its parent clk's rate & two is that HW does not support enable/disable for this clk. Handle above two limitations by adding relevant flags. Add standard flag CLK_SET_RATE_PARENT to handle rate change and add driver internal flag DIV_CLK_NO_MASK to handle enable/disable. Fixes: d058fd9e ("clk: intel: Add CGU clock driver for a new SoC") Reviewed-by:
Yi xin Zhu <yzhu@maxlinear.com> Signed-off-by:
Rahul Tanwar <rtanwar@maxlinear.com> Link: https://lore.kernel.org/r/a4770e7225f8a0c03c8ab2ba80434a4e8e9afb17.1665642720.git.rtanwar@maxlinear.com Signed-off-by:
Stephen Boyd <sboyd@kernel.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Rahul Tanwar authored
[ Upstream commit a5d49bd3 ] In MxL's LGM SoC, gate clocks can be controlled either from CGU clk driver i.e. this driver or directly from power management driver/daemon. It is dependent on the power policy/profile requirements of the end product. To support such use cases, provide option to override gate clks enable/disable by adding a flag GATE_CLK_HW which controls if these gate clks are controlled by HW i.e. this driver or overridden in order to allow it to be controlled by power profiles instead. Reviewed-by:
Yi xin Zhu <yzhu@maxlinear.com> Signed-off-by:
Rahul Tanwar <rtanwar@maxlinear.com> Link: https://lore.kernel.org/r/bdc9c89317b5d338a6c4f1d49386b696e947a672.1665642720.git.rtanwar@maxlinear.com [sboyd@kernel.org: Add braces on many line if-else] Signed-off-by:
Stephen Boyd <sboyd@kernel.org> Stable-dep-of: 106ef3bd ("clk: mxl: Fix a clk entry by adding relevant flags") Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Rahul Tanwar authored
[ Upstream commit eaabee88 ] Patch 1/4 of this patch series switches from direct readl/writel based register access to regmap based register access. Instead of using direct readl/writel, regmap API's are used to read, write & read-modify-write clk registers. Regmap API's already use their own spinlocks to serialize the register accesses across multiple cores in which case additional driver spinlocks becomes redundant. Hence, remove redundant spinlocks from driver in this patch 2/4. Reviewed-by:
Yi xin Zhu <yzhu@maxlinear.com> Signed-off-by:
Rahul Tanwar <rtanwar@maxlinear.com> Link: https://lore.kernel.org/r/a8a02c8773b88924503a9fdaacd37dd2e6488bf3.1665642720.git.rtanwar@maxlinear.com Signed-off-by:
Stephen Boyd <sboyd@kernel.org> Stable-dep-of: 106ef3bd ("clk: mxl: Fix a clk entry by adding relevant flags") Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Rahul Tanwar authored
[ Upstream commit 03617731 ] Earlier version of driver used direct io remapped register read writes using readl/writel. But we need secure boot access which is only possible when registers are read & written using regmap. This is because the security bus/hook is written & coupled only with regmap layer. Switch the driver from direct readl/writel based register accesses to regmap based register accesses. Additionally, update the license headers to latest status. Reviewed-by:
Yi xin Zhu <yzhu@maxlinear.com> Signed-off-by:
Rahul Tanwar <rtanwar@maxlinear.com> Link: https://lore.kernel.org/r/2610331918206e0e3bd18babb39393a558fb34f9.1665642720.git.rtanwar@maxlinear.com Signed-off-by:
Stephen Boyd <sboyd@kernel.org> Stable-dep-of: 106ef3bd ("clk: mxl: Fix a clk entry by adding relevant flags") Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Bitterblue Smith authored
[ Upstream commit 791082ec ] Re-enable the function rtl8xxxu_gen2_report_connect. It informs the firmware when connecting to a network. This makes the firmware enable the rate control, which makes the upload faster. It also informs the firmware when disconnecting from a network. In the past this made reconnecting impossible because it was sending the auth on queue 0x7 (TXDESC_QUEUE_VO) instead of queue 0x12 (TXDESC_QUEUE_MGNT): wlp0s20f0u3: send auth to 90:55:de:__:__:__ (try 1/3) wlp0s20f0u3: send auth to 90:55:de:__:__:__ (try 2/3) wlp0s20f0u3: send auth to 90:55:de:__:__:__ (try 3/3) wlp0s20f0u3: authentication with 90:55:de:__:__:__ timed out Probably the firmware disables the unnecessary TX queues when it knows it's disconnected. However, this was fixed in commit edd5747a ("wifi: rtl8xxxu: Fix skb misuse in TX queue selection"). Fixes: c59f13bb ("rtl8xxxu: Work around issue with 8192eu and 8723bu devices not reconnecting") Signed-off-by:
Bitterblue Smith <rtl8821cerfe2@gmail.com> Signed-off-by:
Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/43200afc-0c65-ee72-48f8-231edd1df493@gmail.com Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Lucas Stach authored
[ Upstream commit d37c120b ] While the interface for the MMU mapping takes phys_addr_t to hold a full 64bit address when necessary and MMUv2 is able to map physical addresses with up to 40bit, etnaviv_iommu_map() truncates the address to 32bits. Fix this by using the correct type. Fixes: 931e97f3 ("drm/etnaviv: mmuv2: support 40 bit phys address") Signed-off-by:
Lucas Stach <l.stach@pengutronix.de> Reviewed-by:
Philipp Zabel <p.zabel@pengutronix.de> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
- Feb 22, 2023
-
-
Greg Kroah-Hartman authored
Link: https://lore.kernel.org/r/20230220133549.360169435@linuxfoundation.org Tested-by:
Pavel Machek (CIP) <pavel@denx.de> Tested-by:
Linux Kernel Functional Testing <lkft@linaro.org> Tested-by:
Jon Hunter <jonathanh@nvidia.com> Tested-by:
Sudip Mukherjee <sudip.mukherjee@codethink.co.uk> Tested-by:
Guenter Roeck <linux@roeck-us.net> Tested-by:
Salvatore Bonaccorso <carnil@debian.org> Tested-by:
Florian Fainelli <f.fainelli@gmail.com> Tested-by:
Shuah Khan <skhan@linuxfoundation.org> Tested-by:
Hulk Robot <hulkrobot@huawei.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Russell King (Oracle) authored
commit 0c4862b1 upstream. Dan Carpenter points out that the return code was not set in commit 60c8b4aebd8e ("nvmem: core: fix cleanup after dev_set_name()"), but this is not the only issue - we also need to zero wp_gpio to prevent gpiod_put() being called on an error value. Fixes: 560181d3 ("nvmem: core: fix cleanup after dev_set_name()") Cc: stable@vger.kernel.org Reported-by:
kernel test robot <lkp@intel.com> Reported-by:
Dan Carpenter <error27@gmail.com> Signed-off-by:
Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by:
Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Link: https://lore.kernel.org/r/20230127104015.23839-10-srinivas.kandagatla@linaro.org Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Dan Carpenter authored
commit 9cec2aaf upstream. The > needs be >= to prevent an out of bounds access. Fixes: de5ca4c3 ("net: sched: sch: Bounds check priority") Signed-off-by:
Dan Carpenter <error27@gmail.com> Reviewed-by:
Simon Horman <simon.horman@corigine.com> Reviewed-by:
Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/Y+D+KN18FQI2DKLq@kili Signed-off-by:
Jakub Kicinski <kuba@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Pierre-Louis Bossart authored
commit 1f810d2b upstream. The HDaudio stream allocation is done first, and in a second step the LOSIDV parameter is programmed for the multi-link used by a codec. This leads to a possible stream_tag leak, e.g. if a DisplayAudio link is not used. This would happen when a non-Intel graphics card is used and userspace unconditionally uses the Intel Display Audio PCMs without checking if they are connected to a receiver with jack controls. We should first check that there is a valid multi-link entry to configure before allocating a stream_tag. This change aligns the dma_assign and dma_cleanup phases. Complements: b0cd60f3 ("ALSA/ASoC: hda: clarify bus_get_link() and bus_link_get() helpers") Link: https://github.com/thesofproject/linux/issues/4151 Signed-off-by:
Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Reviewed-by:
Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Reviewed-by:
Rander Wang <rander.wang@intel.com> Reviewed-by:
Bard Liao <yung-chuan.liao@linux.intel.com> Signed-off-by:
Peter Ujfalusi <peter.ujfalusi@linux.intel.com> Link: https://lore.kernel.org/r/20230216162340.19480-1-peter.ujfalusi@linux.intel.com Signed-off-by:
Mark Brown <broonie@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Thomas Gleixner authored
commit d125d134 upstream. syzbot reported a RCU stall which is caused by setting up an alarmtimer with a very small interval and ignoring the signal. The reproducer arms the alarm timer with a relative expiry of 8ns and an interval of 9ns. Not a problem per se, but that's an issue when the signal is ignored because then the timer is immediately rearmed because there is no way to delay that rearming to the signal delivery path. See posix_timer_fn() and commit 58229a18 ("posix-timers: Prevent softirq starvation by small intervals and SIG_IGN") for details. The reproducer does not set SIG_IGN explicitely, but it sets up the timers signal with SIGCONT. That has the same effect as explicitely setting SIG_IGN for a signal as SIGCONT is ignored if there is no handler set and the task is not ptraced. The log clearly shows that: [pid 5102] --- SIGCONT {si_signo=SIGCONT, si_code=SI_TIMER, si_timerid=0, si_overrun=316014, si_int=0, si_ptr=NULL} --- It works because the tasks are traced and therefore the signal is queued so the tracer can see it, which delays the restart of the timer to the signal delivery path. But then the tracer is killed: [pid 5087] kill(-5102, SIGKILL <unfinished ...> ... ./strace-static-x86_64: Process 5107 detached and after it's gone the stall can be observed: syzkaller login: [ 79.439102][ C0] hrtimer: interrupt took 68471 ns [ 184.460538][ C1] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks: ... [ 184.658237][ C1] rcu: Stack dump where RCU GP kthread last ran: [ 184.664574][ C1] Sending NMI from CPU 1 to CPUs 0: [ 184.669821][ C0] NMI backtrace for cpu 0 [ 184.669831][ C0] CPU: 0 PID: 5108 Comm: syz-executor192 Not tainted 6.2.0-rc6-next-20230203-syzkaller #0 ... [ 184.670036][ C0] Call Trace: [ 184.670041][ C0] <IRQ> [ 184.670045][ C0] alarmtimer_fired+0x327/0x670 posix_timer_fn() prevents that by checking whether the interval for timers which have the signal ignored is smaller than a jiffie and artifically delay it by shifting the next expiry out by a jiffie. That's accurate vs. the overrun accounting, but slightly inaccurate vs. timer_gettimer(2). The comment in that function says what needs to be done and there was a fix available for the regular userspace induced SIG_IGN mechanism, but that did not work due to the implicit ignore for SIGCONT and similar signals. This needs to be worked on, but for now the only available workaround is to do exactly what posix_timer_fn() does: Increase the interval of self-rearming timers, which have their signal ignored, to at least a jiffie. Interestingly this has been fixed before via commit ff86bf0c ("alarmtimer: Rate limit periodic intervals") already, but that fix got lost in a later rework. Reported-by:
<syzbot+b9564ba6e8e00694511b@syzkaller.appspotmail.com> Fixes: f2c45807 ("alarmtimer: Switch over to generic set/get/rearm routine") Signed-off-by:
Thomas Gleixner <tglx@linutronix.de> Acked-by:
John Stultz <jstultz@google.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/87k00q1no2.ffs@tglx Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Greg Kroah-Hartman authored
commit 2c10b614 upstream. When calling the KVM_GET_DEBUGREGS ioctl, on some configurations, there might be some unitialized portions of the kvm_debugregs structure that could be copied to userspace. Prevent this as is done in the other kvm ioctls, by setting the whole structure to 0 before copying anything into it. Bonus is that this reduces the lines of code as the explicit flag setting and reserved space zeroing out can be removed. Cc: Sean Christopherson <seanjc@google.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: <x86@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: stable <stable@kernel.org> Reported-by:
Xingyuan Mo <hdthky0@gmail.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Message-Id: <20230214103304.3689213-1-gregkh@linuxfoundation.org> Tested-by:
Xingyuan Mo <hdthky0@gmail.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Pedro Tammela authored
[ Upstream commit 42018a32 ] Syzkaller found an issue where a handle greater than 16 bits would trigger a null-ptr-deref in the imperfect hash area update. general protection fault, probably for non-canonical address 0xdffffc0000000015: 0000 [#1] PREEMPT SMP KASAN KASAN: null-ptr-deref in range [0x00000000000000a8-0x00000000000000af] CPU: 0 PID: 5070 Comm: syz-executor456 Not tainted 6.2.0-rc7-syzkaller-00112-gc68f345b7c42 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/21/2023 RIP: 0010:tcindex_set_parms+0x1a6a/0x2990 net/sched/cls_tcindex.c:509 Code: 01 e9 e9 fe ff ff 4c 8b bd 28 fe ff ff e8 0e 57 7d f9 48 8d bb a8 00 00 00 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 94 0c 00 00 48 8b 85 f8 fd ff ff 48 8b 9b a8 00 RSP: 0018:ffffc90003d3ef88 EFLAGS: 00010202 RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000000000 RDX: 0000000000000015 RSI: ffffffff8803a102 RDI: 00000000000000a8 RBP: ffffc90003d3f1d8 R08: 0000000000000001 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000000 R12: ffff88801e2b10a8 R13: dffffc0000000000 R14: 0000000000030000 R15: ffff888017b3be00 FS: 00005555569af300(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000056041c6d2000 CR3: 000000002bfca000 CR4: 00000000003506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> tcindex_change+0x1ea/0x320 net/sched/cls_tcindex.c:572 tc_new_tfilter+0x96e/0x2220 net/sched/cls_api.c:2155 rtnetlink_rcv_msg+0x959/0xca0 net/core/rtnetlink.c:6132 netlink_rcv_skb+0x165/0x440 net/netlink/af_netlink.c:2574 netlink_unicast_kernel net/netlink/af_netlink.c:1339 [inline] netlink_unicast+0x547/0x7f0 net/netlink/af_netlink.c:1365 netlink_sendmsg+0x91b/0xe10 net/netlink/af_netlink.c:1942 sock_sendmsg_nosec net/socket.c:714 [inline] sock_sendmsg+0xd3/0x120 net/socket.c:734 ____sys_sendmsg+0x334/0x8c0 net/socket.c:2476 ___sys_sendmsg+0x110/0x1b0 net/socket.c:2530 __sys_sendmmsg+0x18f/0x460 net/socket.c:2616 __do_sys_sendmmsg net/socket.c:2645 [inline] __se_sys_sendmmsg net/socket.c:2642 [inline] __x64_sys_sendmmsg+0x9d/0x100 net/socket.c:2642 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80 Fixes: ee059170 ("net/sched: tcindex: update imperfect hash filters respecting rcu") Signed-off-by:
Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by:
Pedro Tammela <pctammela@mojatatu.com> Reported-by:
syzbot <syzkaller@googlegroups.com> Reviewed-by:
Eric Dumazet <edumazet@google.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Natalia Petrova authored
[ Upstream commit 7fa0b526 ] The result of nlmsg_find_attr() 'br_spec' is dereferenced in nla_for_each_nested(), but it can take NULL value in nla_find() function, which will result in an error. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: 51616018 ("i40e: Add support for getlink, setlink ndo ops") Signed-off-by:
Natalia Petrova <n.petrova@fintech.ru> Reviewed-by:
Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by:
Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://lore.kernel.org/r/20230209172833.3596034-1-anthony.l.nguyen@intel.com Signed-off-by:
Jakub Kicinski <kuba@kernel.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Pedro Tammela authored
[ Upstream commit 21c167aa ] The tc action act_ctinfo was using shared stats, fix it to use percpu stats since bstats_update() must be called with locks or with a percpu pointer argument. tdc results: 1..12 ok 1 c826 - Add ctinfo action with default setting ok 2 0286 - Add ctinfo action with dscp ok 3 4938 - Add ctinfo action with valid cpmark and zone ok 4 7593 - Add ctinfo action with drop control ok 5 2961 - Replace ctinfo action zone and action control ok 6 e567 - Delete ctinfo action with valid index ok 7 6a91 - Delete ctinfo action with invalid index ok 8 5232 - List ctinfo actions ok 9 7702 - Flush ctinfo actions ok 10 3201 - Add ctinfo action with duplicate index ok 11 8295 - Add ctinfo action with invalid index ok 12 3964 - Replace ctinfo action with invalid goto_chain control Fixes: 24ec483c ("net: sched: Introduce act_ctinfo action") Reviewed-by:
Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by:
Pedro Tammela <pctammela@mojatatu.com> Reviewed-by:
Larysa Zaremba <larysa.zaremba@intel.com> Link: https://lore.kernel.org/r/20230210200824.444856-1-pctammela@mojatatu.com Signed-off-by:
Jakub Kicinski <kuba@kernel.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Baowen Zheng authored
[ Upstream commit 40bd094d ] Fill flags to action structure to allow user control if the action should be offloaded to hardware or not. Signed-off-by:
Baowen Zheng <baowen.zheng@corigine.com> Signed-off-by:
Louis Peens <louis.peens@corigine.com> Signed-off-by:
Simon Horman <simon.horman@corigine.com> Acked-by:
Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Stable-dep-of: 21c167aa ("net/sched: act_ctinfo: use percpu stats") Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Matt Roper authored
[ Upstream commit d5a1224a ] The UNSLICE_UNIT_LEVEL_CLKGATE register programmed by this workaround has 'BUS' style reset, indicating that it does not lose its value on engine resets. Furthermore, this register is part of the GT forcewake domain rather than the RENDER domain, so it should not be impacted by RCS engine resets. As such, we should implement this on the GT workaround list rather than an engine list. Bspec: 19219 Fixes: 3551ff92 ("drm/i915/gen11: Moving WAs to rcs_engine_wa_init()") Signed-off-by:
Matt Roper <matthew.d.roper@intel.com> Reviewed-by:
Gustavo Sousa <gustavo.sousa@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230201222831.608281-2-matthew.d.roper@intel.com (cherry picked from commit 5f21dc07) Signed-off-by:
Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Raviteja Goud Talla authored
[ Upstream commit 67b858dd ] Bspec page says "Reset: BUS", Accordingly moving w/a's: Wa_1407352427,Wa_1406680159 to proper function icl_gt_workarounds_init() Which will resolve guc enabling error v2: - Previous patch rev2 was created by email client which caused the Build failure, This v2 is to resolve the previous broken series Reviewed-by:
John Harrison <John.C.Harrison@Intel.com> Signed-off-by:
Raviteja Goud Talla <ravitejax.goud.talla@intel.com> Signed-off-by:
John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211203145603.4006937-1-ravitejax.goud.talla@intel.com Stable-dep-of: d5a1224a ("drm/i915/gen11: Wa_1408615072/Wa_1407596294 should be on GT list") Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Ryusuke Konishi authored
commit 99b9402a upstream. Macro NILFS_SB2_OFFSET_BYTES, which computes the position of the second superblock, underflows when the argument device size is less than 4096 bytes. Therefore, when using this macro, it is necessary to check in advance that the device size is not less than a lower limit, or at least that underflow does not occur. The current nilfs2 implementation lacks this check, causing out-of-bound block access when mounting devices smaller than 4096 bytes: I/O error, dev loop0, sector 36028797018963960 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2 NILFS (loop0): unable to read secondary superblock (blocksize = 1024) In addition, when trying to resize the filesystem to a size below 4096 bytes, this underflow occurs in nilfs_resize_fs(), passing a huge number of segments to nilfs_sufile_resize(), corrupting parameters such as the number of segments in superblocks. This causes excessive loop iterations in nilfs_sufile_resize() during a subsequent resize ioctl, causing semaphore ns_segctor_sem to block for a long time and hang the writer thread: INFO: task segctord:5067 blocked for more than 143 seconds. Not tainted 6.2.0-rc8-syzkaller-00015-gf6feea56f66d #0 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:segctord state:D stack:23456 pid:5067 ppid:2 flags:0x00004000 Call Trace: <TASK> context_switch kernel/sched/core.c:5293 [inline] __schedule+0x1409/0x43f0 kernel/sched/core.c:6606 schedule+0xc3/0x190 kernel/sched/core.c:6682 rwsem_down_write_slowpath+0xfcf/0x14a0 kernel/locking/rwsem.c:1190 nilfs_transaction_lock+0x25c/0x4f0 fs/nilfs2/segment.c:357 nilfs_segctor_thread_construct fs/nilfs2/segment.c:2486 [inline] nilfs_segctor_thread+0x52f/0x1140 fs/nilfs2/segment.c:2570 kthread+0x270/0x300 kernel/kthread.c:376 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:308 </TASK> ... Call Trace: <TASK> folio_mark_accessed+0x51c/0xf00 mm/swap.c:515 __nilfs_get_page_block fs/nilfs2/page.c:42 [inline] nilfs_grab_buffer+0x3d3/0x540 fs/nilfs2/page.c:61 nilfs_mdt_submit_block+0xd7/0x8f0 fs/nilfs2/mdt.c:121 nilfs_mdt_read_block+0xeb/0x430 fs/nilfs2/mdt.c:176 nilfs_mdt_get_block+0x12d/0xbb0 fs/nilfs2/mdt.c:251 nilfs_sufile_get_segment_usage_block fs/nilfs2/sufile.c:92 [inline] nilfs_sufile_truncate_range fs/nilfs2/sufile.c:679 [inline] nilfs_sufile_resize+0x7a3/0x12b0 fs/nilfs2/sufile.c:777 nilfs_resize_fs+0x20c/0xed0 fs/nilfs2/super.c:422 nilfs_ioctl_resize fs/nilfs2/ioctl.c:1033 [inline] nilfs_ioctl+0x137c/0x2440 fs/nilfs2/ioctl.c:1301 ... This fixes these issues by inserting appropriate minimum device size checks or anti-underflow checks, depending on where the macro is used. Link: https://lkml.kernel.org/r/0000000000004e1dfa05f4a48e6b@google.com Link: https://lkml.kernel.org/r/20230214224043.24141-1-konishi.ryusuke@gmail.com Signed-off-by:
Ryusuke Konishi <konishi.ryusuke@gmail.com> Reported-by:
<syzbot+f0c4082ce5ebebdac63b@syzkaller.appspotmail.com> Tested-by:
Ryusuke Konishi <konishi.ryusuke@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-