- Feb 25, 2023
-
-
Greg Kroah-Hartman authored
Link: https://lore.kernel.org/r/20230223130431.553657459@linuxfoundation.org Link: https://lore.kernel.org/r/20230223141545.280864003@linuxfoundation.org Tested-by:
Salvatore Bonaccorso <carnil@debian.org> Tested-by:
Justin M. Forbes <jforbes@fedoraproject.org> Tested-by:
Florian Fainelli <f.fainelli@gmail.com> Tested-by:
Conor Dooley <conor.dooley@microchip.com> Tested-by:
Shuah Khan <skhan@linuxfoundation.org> Tested-by:
Bagas Sanjaya <bagasdotme@gmail.com> Tested-by:
Guenter Roeck <linux@roeck-us.net> Tested-by:
Linux Kernel Functional Testing <lkft@linaro.org> Tested-by:
Sudip Mukherjee <sudip.mukherjee@codethink.co.uk> Tested-by:
Slade Watkins <srw@sladewatkins.net> Tested-by:
Ron Economos <re@w6rz.net> Tested-by:
Allen Pais <apais@linux.microsoft.com> Tested-by:
Kelsey Steele <kelseysteele@linux.microsoft.com> Tested-by:
Rudi Heitbaum <rudi@heitbaum.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Linus Torvalds authored
commit f3dd0c53 upstream. Commit 74e19ef0 ("uaccess: Add speculation barrier to copy_from_user()") built fine on x86-64 and arm64, and that's the extent of my local build testing. It turns out those got the <linux/nospec.h> include incidentally through other header files (<linux/kvm_host.h> in particular), but that was not true of other architectures, resulting in build errors kernel/bpf/core.c: In function ‘___bpf_prog_run’: kernel/bpf/core.c:1913:3: error: implicit declaration of function ‘barrier_nospec’ so just make sure to explicitly include the proper <linux/nospec.h> header file to make everybody see it. Fixes: 74e19ef0 ("uaccess: Add speculation barrier to copy_from_user()") Reported-by:
kernel test robot <lkp@intel.com> Reported-by:
Viresh Kumar <viresh.kumar@linaro.org> Reported-by:
Huacai Chen <chenhuacai@loongson.cn> Tested-by:
Geert Uytterhoeven <geert@linux-m68k.org> Tested-by:
Dave Hansen <dave.hansen@linux.intel.com> Acked-by:
Alexei Starovoitov <alexei.starovoitov@gmail.com> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Eric Biggers authored
commit 78f7a3fd upstream. The randstruct support released in Clang 15 is unsafe to use due to a bug that can cause miscompilations: "-frandomize-layout-seed inconsistently randomizes all-function-pointers structs" (https://github.com/llvm/llvm-project/issues/60349). It has been fixed on the Clang 16 release branch, so add a Clang version check. Fixes: 035f7f87 ("randstruct: Enable Clang support") Cc: stable@vger.kernel.org Signed-off-by:
Eric Biggers <ebiggers@google.com> Acked-by:
Nick Desaulniers <ndesaulniers@google.com> Reviewed-by:
Nathan Chancellor <nathan@kernel.org> Reviewed-by:
Bill Wendling <morbo@google.com> Signed-off-by:
Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20230208065133.220589-1-ebiggers@kernel.org Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Kees Cook authored
commit 118901ad upstream. With clang's kernel control flow integrity (kCFI, CONFIG_CFI_CLANG), indirect call targets are validated against the expected function pointer prototype to make sure the call target is valid to help mitigate ROP attacks. If they are not identical, there is a failure at run time, which manifests as either a kernel panic or thread getting killed. ext4_feat_ktype was setting the "release" handler to "kfree", which doesn't have a matching function prototype. Add a simple wrapper with the correct prototype. This was found as a result of Clang's new -Wcast-function-type-strict flag, which is more sensitive than the simpler -Wcast-function-type, which only checks for type width mismatches. Note that this code is only reached when ext4 is a loadable module and it is being unloaded: CFI failure at kobject_put+0xbb/0x1b0 (target: kfree+0x0/0x180; expected type: 0x7c4aa698) ... RIP: 0010:kobject_put+0xbb/0x1b0 ... Call Trace: <TASK> ext4_exit_sysfs+0x14/0x60 [ext4] cleanup_module+0x67/0xedb [ext4] Fixes: b99fee58 ("ext4: create ext4_feat kobject dynamically") Cc: Theodore Ts'o <tytso@mit.edu> Cc: Eric Biggers <ebiggers@kernel.org> Cc: stable@vger.kernel.org Build-tested-by:
Gustavo A. R. Silva <gustavoars@kernel.org> Reviewed-by:
Gustavo A. R. Silva <gustavoars@kernel.org> Reviewed-by:
Nathan Chancellor <nathan@kernel.org> Link: https://lore.kernel.org/r/20230103234616.never.915-kees@kernel.org Signed-off-by:
Kees Cook <keescook@chromium.org> Reviewed-by:
Eric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20230104210908.gonna.388-kees@kernel.org Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Hans de Goede authored
commit 0d9bdd8a upstream. On some Lenovo Legion models, the backlight might be driven by either one of nvidia_wmi_ec_backlight or amdgpu_bl0 at different times. When the Nvidia WMI EC backlight interface reports the backlight is controlled by the EC, the current backlight handling only registers nvidia_wmi_ec_backlight (and registers no other backlight interfaces). This hides (never registers) the amdgpu_bl0 interface, where as prior to 6.1.4 users would have both nvidia_wmi_ec_backlight and amdgpu_bl0 and could work around things in userspace. Add a force module parameter which can be used with acpi_backlight=native to restore the old behavior as a workound (for now) by passing: "acpi_backlight=native nvidia-wmi-ec-backlight.force=1" Fixes: 8d0ca287 ("platform/x86: nvidia-wmi-ec-backlight: Use acpi_video_get_backlight_type()") Link: https://bugzilla.kernel.org/show_bug.cgi?id=217026 Cc: stable@vger.kernel.org Signed-off-by:
Hans de Goede <hdegoede@redhat.com> Reviewed-by:
Daniel Dadap <ddadap@nvidia.com> Link: https://lore.kernel.org/r/20230217144208.5721-1-hdegoede@redhat.com Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Shyam Sundar S K authored
commit 3004e8d2 upstream. It is reported that amd_pmf driver is missing "depends on" for CONFIG_POWER_SUPPLY causing the following build error. ld: drivers/platform/x86/amd/pmf/core.o: in function `amd_pmf_remove': core.c:(.text+0x10): undefined reference to `power_supply_unreg_notifier' ld: drivers/platform/x86/amd/pmf/core.o: in function `amd_pmf_probe': core.c:(.text+0x38f): undefined reference to `power_supply_reg_notifier' make[1]: *** [scripts/Makefile.vmlinux:34: vmlinux] Error 1 make: *** [Makefile:1248: vmlinux] Error 2 Add this to the Kconfig file. Link: https://bugzilla.kernel.org/show_bug.cgi?id=217028 Fixes: c5258d39 ("platform/x86/amd/pmf: Add helper routine to update SPS thermals") Signed-off-by:
Shyam Sundar S K <Shyam-sundar.S-k@amd.com> Link: https://lore.kernel.org/r/20230213121457.1764463-1-Shyam-sundar.S-k@amd.com Cc: stable@vger.kernel.org Reviewed-by:
Hans de Goede <hdegoede@redhat.com> Signed-off-by:
Hans de Goede <hdegoede@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Paul Moore authored
commit 6c6cd913 upstream. We've moved the upstream Linux Kernel audit subsystem discussions to a new mailing list, this patch updates the MAINTAINERS info with the new list address. Marking this for stable inclusion to help speed uptake of the new list across all of the supported kernel releases. This is a doc only patch so the risk should be close to nil. Cc: stable@vger.kernel.org Signed-off-by:
Paul Moore <paul@paul-moore.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Lukas Wunner authored
commit 36dd7a4c upstream. Commit e3fffc1f ("devicetree: document new marvell-8xxx and pwrseq-sd8787 options") documented a compatible string for SD8787 in the devicetree bindings, but neglected to add it to the mwifiex driver. Fixes: e3fffc1f ("devicetree: document new marvell-8xxx and pwrseq-sd8787 options") Signed-off-by:
Lukas Wunner <lukas@wunner.de> Cc: stable@vger.kernel.org # v4.11+ Cc: Matt Ranostay <mranostay@ti.com> Signed-off-by:
Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/320de5005ff3b8fd76be2d2b859fd021689c3681.1674827105.git.lukas@wunner.de Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Tom Saeger authored
commit c1c551be upstream. sh vmlinux fails to link with GNU ld < 2.40 (likely < 2.36) since commit 99cb0d91 ("arch: fix broken BuildID for arm64 and riscv"). This is similar to fixes for powerpc and s390: commit 4b9880db ("powerpc/vmlinux.lds: Define RUNTIME_DISCARD_EXIT"). commit a494398b ("s390: define RUNTIME_DISCARD_EXIT to fix link error with GNU ld < 2.36"). $ sh4-linux-gnu-ld --version | head -n1 GNU ld (GNU Binutils for Debian) 2.35.2 $ make ARCH=sh CROSS_COMPILE=sh4-linux-gnu- microdev_defconfig $ make ARCH=sh CROSS_COMPILE=sh4-linux-gnu- `.exit.text' referenced in section `__bug_table' of crypto/algboss.o: defined in discarded section `.exit.text' of crypto/algboss.o `.exit.text' referenced in section `__bug_table' of drivers/char/hw_random/core.o: defined in discarded section `.exit.text' of drivers/char/hw_random/core.o make[2]: *** [scripts/Makefile.vmlinux:34: vmlinux] Error 1 make[1]: *** [Makefile:1252: vmlinux] Error 2 arch/sh/kernel/vmlinux.lds.S keeps EXIT_TEXT: /* * .exit.text is discarded at runtime, not link time, to deal with * references from __bug_table */ .exit.text : AT(ADDR(.exit.text)) { EXIT_TEXT } However, EXIT_TEXT is thrown away by DISCARD(include/asm-generic/vmlinux.lds.h) because sh does not define RUNTIME_DISCARD_EXIT. GNU ld 2.40 does not have this issue and builds fine. This corresponds with Masahiro's comments in a494398b: "Nathan [Chancellor] also found that binutils commit 21401fc7bf67 ("Duplicate output sections in scripts") cured this issue, so we cannot reproduce it with binutils 2.36+, but it is better to not rely on it." Link: https://lkml.kernel.org/r/9166a8abdc0f979e50377e61780a4bba1dfa2f52.1674518464.git.tom.saeger@oracle.com Fixes: 99cb0d91 ("arch: fix broken BuildID for arm64 and riscv") Link: https://lore.kernel.org/all/Y7Jal56f6UBh1abE@dev-arch.thelio-3990X/ Link: https://lore.kernel.org/all/20230123194218.47ssfzhrpnv3xfez@oracle.com/ Signed-off-by:
Tom Saeger <tom.saeger@oracle.com> Tested-by:
John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christoph Hellwig <hch@lst.de> Cc: Dennis Gilmore <dennis@ausil.us> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Naresh Kamboju <naresh.kamboju@linaro.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Palmer Dabbelt <palmer@rivosinc.com> Cc: Rich Felker <dalias@libc.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Tom Saeger <tom.saeger@oracle.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Masahiro Yamada authored
commit a494398b upstream. Nathan Chancellor reports that the s390 vmlinux fails to link with GNU ld < 2.36 since commit 99cb0d91 ("arch: fix broken BuildID for arm64 and riscv"). It happens for defconfig, or more specifically for CONFIG_EXPOLINE=y. $ s390x-linux-gnu-ld --version | head -n1 GNU ld (GNU Binutils for Debian) 2.35.2 $ make -s ARCH=s390 CROSS_COMPILE=s390x-linux-gnu- allnoconfig $ ./scripts/config -e CONFIG_EXPOLINE $ make -s ARCH=s390 CROSS_COMPILE=s390x-linux-gnu- olddefconfig $ make -s ARCH=s390 CROSS_COMPILE=s390x-linux-gnu- `.exit.text' referenced in section `.s390_return_reg' of drivers/base/dd.o: defined in discarded section `.exit.text' of drivers/base/dd.o make[1]: *** [scripts/Makefile.vmlinux:34: vmlinux] Error 1 make: *** [Makefile:1252: vmlinux] Error 2 arch/s390/kernel/vmlinux.lds.S wants to keep EXIT_TEXT: .exit.text : { EXIT_TEXT } But, at the same time, EXIT_TEXT is thrown away by DISCARD because s390 does not define RUNTIME_DISCARD_EXIT. I still do not understand why the latter wins after 99cb0d91, but defining RUNTIME_DISCARD_EXIT seems correct because the comment line in arch/s390/kernel/vmlinux.lds.S says: /* * .exit.text is discarded at runtime, not link time, * to deal with references from __bug_table */ Nathan also found that binutils commit 21401fc7bf67 ("Duplicate output sections in scripts") cured this issue, so we cannot reproduce it with binutils 2.36+, but it is better to not rely on it. Fixes: 99cb0d91 ("arch: fix broken BuildID for arm64 and riscv") Link: https://lore.kernel.org/all/Y7Jal56f6UBh1abE@dev-arch.thelio-3990X/ Reported-by:
Nathan Chancellor <nathan@kernel.org> Signed-off-by:
Masahiro Yamada <masahiroy@kernel.org> Link: https://lore.kernel.org/r/20230105031306.1455409-1-masahiroy@kernel.org Signed-off-by:
Heiko Carstens <hca@linux.ibm.com> Signed-off-by:
Tom Saeger <tom.saeger@oracle.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Michael Ellerman authored
commit 07b050f9 upstream. Relocatable kernels must not discard relocations, they need to be processed at runtime. As such they are included for CONFIG_RELOCATABLE builds in the powerpc linker script (line 340). However they are also unconditionally discarded later in the script (line 414). Previously that worked because the earlier inclusion superseded the discard. However commit 99cb0d91 ("arch: fix broken BuildID for arm64 and riscv") introduced an earlier use of DISCARD as part of the RO_DATA macro (line 137). With binutils < 2.36 that causes the DISCARD directives later in the script to be applied earlier, causing .rela* to actually be discarded at link time, leading to build warnings and a kernel that doesn't boot: ld: warning: discarding dynamic section .rela.init.rodata Fix it by conditionally discarding .rela* only when CONFIG_RELOCATABLE is disabled. Fixes: 99cb0d91 ("arch: fix broken BuildID for arm64 and riscv") Signed-off-by:
Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230105132349.384666-2-mpe@ellerman.id.au Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Tom Saeger <tom.saeger@oracle.com>
-
Michael Ellerman authored
commit 4b9880db upstream. The powerpc linker script explicitly includes .exit.text, because otherwise the link fails due to references from __bug_table and __ex_table. The code is freed (discarded) at runtime along with .init.text and data. That has worked in the past despite powerpc not defining RUNTIME_DISCARD_EXIT because DISCARDS appears late in the powerpc linker script (line 410), and the explicit inclusion of .exit.text earlier (line 280) supersedes the discard. However commit 99cb0d91 ("arch: fix broken BuildID for arm64 and riscv") introduced an earlier use of DISCARD as part of the RO_DATA macro (line 136). With binutils < 2.36 that causes the DISCARD directives later in the script to be applied earlier [1], causing .exit.text to actually be discarded at link time, leading to build errors: '.exit.text' referenced in section '__bug_table' of crypto/algboss.o: defined in discarded section '.exit.text' of crypto/algboss.o '.exit.text' referenced in section '__ex_table' of drivers/nvdimm/core.o: defined in discarded section '.exit.text' of drivers/nvdimm/core.o Fix it by defining RUNTIME_DISCARD_EXIT, which causes the generic DISCARDS macro to not include .exit.text at all. 1: https://lore.kernel.org/lkml/87fscp2v7k.fsf@igel.home/ Fixes: 99cb0d91 ("arch: fix broken BuildID for arm64 and riscv") Signed-off-by:
Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230105132349.384666-1-mpe@ellerman.id.au Signed-off-by:
Tom Saeger <tom.saeger@oracle.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Masahiro Yamada authored
commit 99cb0d91 upstream. Dennis Gilmore reports that the BuildID is missing in the arm64 vmlinux since commit 994b7ac1 ("arm64: remove special treatment for the link order of head.o"). The issue is that the type of .notes section, which contains the BuildID, changed from NOTES to PROGBITS. Ard Biesheuvel figured out that whichever object gets linked first gets to decide the type of a section. The PROGBITS type is the result of the compiler emitting .note.GNU-stack as PROGBITS rather than NOTE. While Ard provided a fix for arm64, I want to fix this globally because the same issue is happening on riscv since commit 2348e6bf ("riscv: remove special treatment for the link order of head.o"). This problem will happen in general for other architectures if they start to drop unneeded entries from scripts/head-object-list.txt. Discard .note.GNU-stack in include/asm-generic/vmlinux.lds.h. Link: https://lore.kernel.org/lkml/CAABkxwuQoz1CTbyb57n0ZX65eSYiTonFCU8-LCQc=74D=xE=rA@mail.gmail.com/ Fixes: 994b7ac1 ("arm64: remove special treatment for the link order of head.o") Fixes: 2348e6bf ("riscv: remove special treatment for the link order of head.o") Reported-by:
Dennis Gilmore <dennis@ausil.us> Suggested-by:
Ard Biesheuvel <ardb@kernel.org> Signed-off-by:
Masahiro Yamada <masahiroy@kernel.org> Acked-by:
Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by:
Tom Saeger <tom.saeger@oracle.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Masahiro Yamada authored
commit 994b7ac1 upstream. In the previous discussion (see the Link tag), Ard pointed out that arm/arm64/kernel/head.o does not need any special treatment - the only piece that must appear right at the start of the binary image is the image header which is emitted into .head.text. The linker script does the right thing to do. The build system does not need to manipulate the link order of head.o. Link: https://lore.kernel.org/lkml/CAMj1kXH77Ja8bSsq2Qj8Ck9iSZKw=1F8Uy-uAWGVDm4-CG=EuA@mail.gmail.com/ Suggested-by:
Ard Biesheuvel <ardb@kernel.org> Signed-off-by:
Masahiro Yamada <masahiroy@kernel.org> Reviewed-by:
Nicolas Schier <nicolas@fjasle.eu> Link: https://lore.kernel.org/r/20221012233500.156764-1-masahiroy@kernel.org Signed-off-by:
Will Deacon <will@kernel.org> Signed-off-by:
Tom Saeger <tom.saeger@oracle.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Jisheng Zhang authored
commit 2348e6bf upstream. arch/riscv/kernel/head.o does not need any special treatment - the only requirement is the ".head.text" section must be placed before the normal ".text" section. The linker script does the right thing to do. The build system does not need to manipulate the link order of head.o. Signed-off-by:
Jisheng Zhang <jszhang@kernel.org> Link: https://lore.kernel.org/r/20221018141200.1040-1-jszhang@kernel.org Signed-off-by:
Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by:
Tom Saeger <tom.saeger@oracle.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Shengyu Qu authored
commit ca2a9944 upstream. Add IDs to usb_device_id table for WCN6855. IDs are extracted from Windows driver of Lenovo Thinkpad T14 Gen 2(Driver version 1.0.0.1205 Windows 10) Windows driver download address: https://pcsupport.lenovo.com/us/en/products/laptops-and-netbooks/ thinkpad-t-series-laptops/thinkpad-t14-gen-2-type-20xk-20xl/downloads /driver-list/ Signed-off-by:
Shengyu Qu <wiagn233@outlook.com> Signed-off-by:
Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Cc: "Limonciello, Mario" <Mario.Limonciello@amd.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peter Zijlstra authored
commit 923510c8 upstream. Clang likes to create conditional tail calls like: 0000000000000350 <amd_pmu_add_event>: 350: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1) 351: R_X86_64_NONE __fentry__-0x4 355: 48 83 bf 20 01 00 00 00 cmpq $0x0,0x120(%rdi) 35d: 0f 85 00 00 00 00 jne 363 <amd_pmu_add_event+0x13> 35f: R_X86_64_PLT32 __SCT__amd_pmu_branch_add-0x4 363: e9 00 00 00 00 jmp 368 <amd_pmu_add_event+0x18> 364: R_X86_64_PLT32 __x86_return_thunk-0x4 Where 0x35d is a static call site that's turned into a conditional tail-call using the Jcc class of instructions. Teach the in-line static call text patching about this. Notably, since there is no conditional-ret, in that case patch the Jcc to point at an empty stub function that does the ret -- or the return thunk when needed. Reported-by:
"Erhard F." <erhard_f@mailbox.org> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by:
Ingo Molnar <mingo@kernel.org> Reviewed-by:
Masami Hiramatsu (Google) <mhiramat@kernel.org> Link: https://lore.kernel.org/r/Y9Kdg9QjHkr9G5b5@hirez.programming.kicks-ass.net [nathan: Backport to 6.1: - Use __x86_return_thunk instead of x86_return_thunk for func in __static_call_transform() - Remove ASM_FUNC_ALIGN in __static_call_return() asm, as call depth tracking was merged in 6.2] Signed-off-by:
Nathan Chancellor <nathan@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peter Zijlstra authored
commit ac0ee0a9 upstream. In order to re-write Jcc.d32 instructions text_poke_bp() needs to be taught about them. The biggest hurdle is that the whole machinery is currently made for 5 byte instructions and extending this would grow struct text_poke_loc which is currently a nice 16 bytes and used in an array. However, since text_poke_loc contains a full copy of the (s32) displacement, it is possible to map the Jcc.d32 2 byte opcodes to Jcc.d8 1 byte opcode for the int3 emulation. This then leaves the replacement bytes; fudge that by only storing the last 5 bytes and adding the rule that 'length == 6' instruction will be prefixed with a 0x0f byte. Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by:
Ingo Molnar <mingo@kernel.org> Reviewed-by:
Masami Hiramatsu (Google) <mhiramat@kernel.org> Link: https://lore.kernel.org/r/20230123210607.115718513@infradead.org [nathan: Introduce is_jcc32() as part of this change; upstream introduced it in 3b6c1747 ] Signed-off-by:
Nathan Chancellor <nathan@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peter Zijlstra authored
commit db7adcfd upstream. Move the kprobe Jcc emulation into int3_emulate_jcc() so it can be used by more code -- specifically static_call() will need this. Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by:
Ingo Molnar <mingo@kernel.org> Reviewed-by:
Masami Hiramatsu (Google) <mhiramat@kernel.org> Link: https://lore.kernel.org/r/20230123210607.057678245@infradead.org Signed-off-by:
Nathan Chancellor <nathan@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Dave Hansen authored
commit 74e19ef0 upstream. The results of "access_ok()" can be mis-speculated. The result is that you can end speculatively: if (access_ok(from, size)) // Right here even for bad from/size combinations. On first glance, it would be ideal to just add a speculation barrier to "access_ok()" so that its results can never be mis-speculated. But there are lots of system calls just doing access_ok() via "copy_to_user()" and friends (example: fstat() and friends). Those are generally not problematic because they do not _consume_ data from userspace other than the pointer. They are also very quick and common system calls that should not be needlessly slowed down. "copy_from_user()" on the other hand uses a user-controller pointer and is frequently followed up with code that might affect caches. Take something like this: if (!copy_from_user(&kernelvar, uptr, size)) do_something_with(kernelvar); If userspace passes ...
-
Yu Xiao authored
[ Upstream commit 821de68c ] Unsupported port speed can be set and cause error. Now fixing it and return an error if setting unsupported speed. This fix depends on the following, which was included in v6.2-rc1: commit a61474c4 ("nfp: ethtool: support reporting link modes"). Fixes: 7c698737 ("nfp: add support for .set_link_ksettings()") Signed-off-by:
Yu Xiao <yu.xiao@corigine.com> Signed-off-by:
Simon Horman <simon.horman@corigine.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Yu Xiao authored
[ Upstream commit a61474c4 ] Add support for reporting link modes, including `Supported link modes` and `Advertised link modes`, via ethtool $DEV. A new command `SPCODE_READ_MEDIA` is added to read info from management firmware. Also, the mapping table `nfp_eth_media_table` associates the link modes between NFP and kernel. Both of them help to support this ability. Signed-off-by:
Yu Xiao <yu.xiao@corigine.com> Reviewed-by:
Louis Peens <louis.peens@corigine.com> Signed-off-by:
Simon Horman <simon.horman@corigine.com> Reviewed-by:
Alexander Lobakin <alexandr.lobakin@intel.com> Link: https://lore.kernel.org/r/20221125113030.141642-1-simon.horman@corigine.com Signed-off-by:
Paolo Abeni <pabeni@redhat.com> Stable-dep-of: 821de68c ("nfp: ethtool: fix the bug of setting unsupported port speed") Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Michael Ellerman authored
[ Upstream commit 111bcb37 ] If a relocatable kernel is loaded at a non-zero address and told not to relocate to zero (kdump or RELOCATABLE_TEST), the mapping of the interrupt code at zero is left with RWX permissions. That is a security weakness, and leads to a warning at boot if CONFIG_DEBUG_WX is enabled: powerpc/mm: Found insecure W+X mapping at address 00000000056435bc/0xc000000000000000 WARNING: CPU: 1 PID: 1 at arch/powerpc/mm/ptdump/ptdump.c:193 note_page+0x484/0x4c0 CPU: 1 PID: 1 Comm: swapper/0 Not tainted 6.2.0-rc1-00001-g8ae8e98aea82-dirty #175 Hardware name: IBM pSeries (emulated by qemu) POWER9 (raw) 0x4e1202 0xf000005 of:SLOF,git-dd0dca hv:linux,kvm pSeries NIP: c0000000004a1c34 LR: c0000000004a1c30 CTR: 0000000000000000 REGS: c000000003503770 TRAP: 0700 Not tainted (6.2.0-rc1-00001-g8ae8e98aea82-dirty) MSR: 8000000002029033 <SF,VEC,EE,ME,IR,DR,RI,LE> CR: 24000220 XER: 00000000 CFAR: c000000000545a58 IRQMASK: 0 ... NIP note_page+0x484/0x4c0 LR note_page+0x480/0x4c0 Call Trace: note_page+0x480/0x4c0 (unreliable) ptdump_pmd_entry+0xc8/0x100 walk_pgd_range+0x618/0xab0 walk_page_range_novma+0x74/0xc0 ptdump_walk_pgd+0x98/0x170 ptdump_check_wx+0x94/0x100 mark_rodata_ro+0x30/0x70 kernel_init+0x78/0x1a0 ret_from_kernel_thread+0x5c/0x64 The fix has two parts. Firstly the pages from zero up to the end of interrupts need to be marked read-only, so that they are left with R-X permissions. Secondly the mapping logic needs to be taught to ensure there is a page boundary at the end of the interrupt region, so that the permission change only applies to the interrupt text, and not the region following it. Fixes: c55d7b5e ("powerpc: Remove STRICT_KERNEL_RWX incompatibility with RELOCATABLE") Reported-by:
Sachin Sant <sachinp@linux.ibm.com> Tested-by:
Sachin Sant <sachinp@linux.ibm.com> Signed-off-by:
Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230110124753.1325426-2-mpe@ellerman.id.au Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Paolo Bonzini authored
[ Upstream commit 50aa870b ] Placing a declaration of evt_reset is pedantically invalid according to the C standard. While GCC does not really care and only warns with -Wpedantic, clang ignores the declaration altogether with an error: x86_64/xen_shinfo_test.c:965:2: error: expected expression struct kvm_xen_hvm_attr evt_reset = { ^ x86_64/xen_shinfo_test.c:969:38: error: use of undeclared identifier evt_reset vm_ioctl(vm, KVM_XEN_HVM_SET_ATTR, &evt_reset); ^ Reported-by:
Yu Zhang <yu.c.zhang@linux.intel.com> Reported-by:
Sean Christopherson <seanjc@google.com> Fixes: a79b53aa ("KVM: x86: fix deadlock for KVM_XEN_EVTCHN_RESET", 2022-12-28) Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Paolo Bonzini authored
[ Upstream commit a79b53aa ] While KVM_XEN_EVTCHN_RESET is usually called with no vCPUs running, if that happened it could cause a deadlock. This is due to kvm_xen_eventfd_reset() doing a synchronize_srcu() inside a kvm->lock critical section. To avoid this, first collect all the evtchnfd objects in an array and free all of them once the kvm->lock critical section is over and th SRCU grace period has expired. Reported-by:
Michal Luczaj <mhal@rbox.co> Cc: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Lucas De Marchi authored
[ Upstream commit fff75869 ] The attribute __maybe_unused should remain only until the respective info is not in the pciidlist. The info can't be added together with its definition because that would cause the driver to automatically probe for the device, while it's still not ready for that. However once pciidlist contains it, the attribute can be removed. Fixes: 78353039 ("drm/i915/mtl: Add MeteorLake PCI IDs") Signed-off-by:
Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by:
Radhakrishna Sripada <radhakrishna.sripada@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221214194944.3670344-1-lucas.demarchi@intel.com (cherry picked from commit 50490ce0 ) Signed-off-by:
Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Ricardo Ribalda authored
[ Upstream commit b24cded8 ] If the irq is enabled after the spi si registered, there can be a race with the initialization of the devices on the spi bus. Eg: mtk-spi 1100a000.spi: spi-mem transfer timeout spi-nor: probe of spi0.0 failed with error -110 Unable to handle kernel NULL pointer dereference at virtual address 0000000000000010 ... Call trace: mtk_spi_can_dma+0x0/0x2c Fixes: c6f78746 ("spi: mediatek: Enable irq when pdata is ready") Reported-by:
Daniel Golle <daniel@makrotopia.org> Signed-off-by:
Ricardo Ribalda <ribalda@chromium.org> Tested-by:
Daniel Golle <daniel@makrotopia.org> Link: https://lore.kernel.org/r/20221225-mtk-spi-fixes-v1-0-bb6c14c232f8@chromium.org Signed-off-by:
Mark Brown <broonie@kernel.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Sean Anderson authored
[ Upstream commit 8d8bee13 ] There aren't enough resources to run these ports at 10G speeds. Disable 10G for these ports, reverting to the previous speed. Fixes: 36926a7d ("powerpc: dts: t208x: Mark MAC1 and MAC2 as 10G") Reported-by:
Camelia Alexandra Groza <camelia.groza@nxp.com> Signed-off-by:
Sean Anderson <sean.anderson@seco.com> Reviewed-by:
Camelia Groza <camelia.groza@nxp.com> Tested-by:
Camelia Groza <camelia.groza@nxp.com> Link: https://lore.kernel.org/r/20221216172937.2960054-1-sean.anderson@seco.com Signed-off-by:
Jakub Kicinski <kuba@kernel.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Marc Kleine-Budde authored
[ Upstream commit f0062291 ] Debian's gcc-13 [1] throws the following error in kvaser_usb_hydra_cmd_size(): [1] gcc version 13.0.0 20221214 (experimental) [master r13-4693-g512098a3316] (Debian 13-20221214-1) | drivers/net/can/usb/kvaser_usb/kvaser_usb_hydra.c:502:65: error: | array subscript ‘struct kvaser_cmd_ext[0]’ is partly outside array | bounds of ‘unsigned char[32]’ [-Werror=array-bounds=] | 502 | ret = le16_to_cpu(((struct kvaser_cmd_ext *)cmd)->len); kvaser_usb_hydra_cmd_size() returns the size of given command. It depends on the command number (cmd->header.cmd_no). For extended commands (cmd->header.cmd_no == CMD_EXTENDED) the above shown code is executed. Help gcc to recognize that this code path is not taken in all cases, by calling kvaser_usb_hydra_cmd_size() directly after assigning the command number. Fixes: aec5fb22 ("can: kvaser_usb: Add support for Kvaser USB hydra family") Cc: Jimmy Assarsson <extja@kvaser.com> Cc: Anssi Hannula <anssi.hannula@bitwise.fi> Link: https://lore.kernel.org/all/20221219110104.1073881-1-mkl@pengutronix.de Tested-by:
Jimmy Assarsson <extja@kvaser.com> Signed-off-by:
Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Jim Mattson authored
[ Upstream commit 2e7eab81 ] According to Intel's document on Indirect Branch Restricted Speculation, "Enabling IBRS does not prevent software from controlling the predicted targets of indirect branches of unrelated software executed later at the same predictor mode (for example, between two different user applications, or two different virtual machines). Such isolation can be ensured through use of the Indirect Branch Predictor Barrier (IBPB) command." This applies to both basic and enhanced IBRS. Since L1 and L2 VMs share hardware predictor modes (guest-user and guest-kernel), hardware IBRS is not sufficient to virtualize IBRS. (The way that basic IBRS is implemented on pre-eIBRS parts, hardware IBRS is actually sufficient in practice, even though it isn't sufficient architecturally.) For virtual CPUs that support IBRS, add an indirect branch prediction barrier on emulated VM-exit, to ensure that the predicted targets of indirect branches executed in L1 cannot be controlled by software that was executed in L2. Since we typically don't intercept guest writes to IA32_SPEC_CTRL, perform the IBPB at emulated VM-exit regardless of the current IA32_SPEC_CTRL.IBRS value, even though the IBPB could technically be deferred until L1 sets IA32_SPEC_CTRL.IBRS, if IA32_SPEC_CTRL.IBRS is clear at emulated VM-exit. This is CVE-2022-2196. Fixes: 5c911bef ("KVM: nVMX: Skip IBPB when switching between vmcs01 and vmcs02") Cc: Sean Christopherson <seanjc@google.com> Signed-off-by:
Jim Mattson <jmattson@google.com> Reviewed-by:
Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221019213620.1953281-3-jmattson@google.com Signed-off-by:
Sean Christopherson <seanjc@google.com> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Sean Christopherson authored
[ Upstream commit 5c30e810 ] Skip the WRMSR fastpath in SVM's VM-Exit handler if the next RIP isn't valid, e.g. because KVM is running with nrips=false. SVM must decode and emulate to skip the WRMSR if the CPU doesn't provide the next RIP. Getting the instruction bytes to decode the WRMSR requires reading guest memory, which in turn means dereferencing memslots, and that isn't safe because KVM doesn't hold SRCU when the fastpath runs. Don't bother trying to enable the fastpath for this case, e.g. by doing only the WRMSR and leaving the "skip" until later. NRIPS is supported on all modern CPUs (KVM has considered making it mandatory), and the next RIP will be valid the vast, vast majority of the time. ============================= WARNING: suspicious RCU usage 6.0.0-smp--4e557fcd3d80-skip #13 Tainted: G O ----------------------------- include/linux/kvm_host.h:954 suspicious rcu_dereference_check() usage! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 1 lock held by stable/206475: #0: ffff9d9dfebcc0f0 (&vcpu->mutex){+.+.}-{3:3}, at: kvm_vcpu_ioctl+0x8b/0x620 [kvm] stack backtrace: CPU: 152 PID: 206475 Comm: stable Tainted: G O 6.0.0-smp--4e557fcd3d80-skip #13 Hardware name: Google, Inc. Arcadia_IT_80/Arcadia_IT_80, BIOS 10.48.0 01/27/2022 Call Trace: <TASK> dump_stack_lvl+0x69/0xaa dump_stack+0x10/0x12 lockdep_rcu_suspicious+0x11e/0x130 kvm_vcpu_gfn_to_memslot+0x155/0x190 [kvm] kvm_vcpu_gfn_to_hva_prot+0x18/0x80 [kvm] paging64_walk_addr_generic+0x183/0x450 [kvm] paging64_gva_to_gpa+0x63/0xd0 [kvm] kvm_fetch_guest_virt+0x53/0xc0 [kvm] __do_insn_fetch_bytes+0x18b/0x1c0 [kvm] x86_decode_insn+0xf0/0xef0 [kvm] x86_emulate_instruction+0xba/0x790 [kvm] kvm_emulate_instruction+0x17/0x20 [kvm] __svm_skip_emulated_instruction+0x85/0x100 [kvm_amd] svm_skip_emulated_instruction+0x13/0x20 [kvm_amd] handle_fastpath_set_msr_irqoff+0xae/0x180 [kvm] svm_vcpu_run+0x4b8/0x5a0 [kvm_amd] vcpu_enter_guest+0x16ca/0x22f0 [kvm] kvm_arch_vcpu_ioctl_run+0x39d/0x900 [kvm] kvm_vcpu_ioctl+0x538/0x620 [kvm] __se_sys_ioctl+0x77/0xc0 __x64_sys_ioctl+0x1d/0x20 do_syscall_64+0x3d/0x80 entry_SYSCALL_64_after_hwframe+0x63/0xcd Fixes: 404d5d7b ("KVM: X86: Introduce more exit_fastpath_completion enum values") Signed-off-by:
Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220930234031.1732249-1-seanjc@google.com Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Sean Christopherson authored
[ Upstream commit 17122c06 ] Treat any exception during instruction decode for EMULTYPE_SKIP as a "full" emulation failure, i.e. signal failure instead of queuing the exception. When decoding purely to skip an instruction, KVM and/or the CPU has already done some amount of emulation that cannot be unwound, e.g. on an EPT misconfig VM-Exit KVM has already processeed the emulated MMIO. KVM already does this if a #UD is encountered, but not for other exceptions, e.g. if a #PF is encountered during fetch. In SVM's soft-injection use case, queueing the exception is particularly problematic as queueing exceptions while injecting events can put KVM into an infinite loop due to bailing from VM-Enter to service the newly pending exception. E.g. multiple warnings to detect such behavior fire: ------------[ cut here ]------------ WARNING: CPU: 3 PID: 1017 at arch/x86/kvm/x86.c:9873 kvm_arch_vcpu_ioctl_run+0x1de5/0x20a0 [kvm] Modules linked in: kvm_amd ccp kvm irqbypass CPU: 3 PID: 1017 Comm: svm_nested_soft Not tainted 6.0.0-rc1+ #220 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 RIP: 0010:kvm_arch_vcpu_ioctl_run+0x1de5/0x20a0 [kvm] Call Trace: kvm_vcpu_ioctl+0x223/0x6d0 [kvm] __x64_sys_ioctl+0x85/0xc0 do_syscall_64+0x2b/0x50 entry_SYSCALL_64_after_hwframe+0x46/0xb0 ---[ end trace 0000000000000000 ]--- ------------[ cut here ]------------ WARNING: CPU: 3 PID: 1017 at arch/x86/kvm/x86.c:9987 kvm_arch_vcpu_ioctl_run+0x12a3/0x20a0 [kvm] Modules linked in: kvm_amd ccp kvm irqbypass CPU: 3 PID: 1017 Comm: svm_nested_soft Tainted: G W 6.0.0-rc1+ #220 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 RIP: 0010:kvm_arch_vcpu_ioctl_run+0x12a3/0x20a0 [kvm] Call Trace: kvm_vcpu_ioctl+0x223/0x6d0 [kvm] __x64_sys_ioctl+0x85/0xc0 do_syscall_64+0x2b/0x50 entry_SYSCALL_64_after_hwframe+0x46/0xb0 ---[ end trace 0000000000000000 ]--- Fixes: 6ea6e843 ("KVM: x86: inject exceptions produced by x86_decode_insn") Signed-off-by:
Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220930233632.1725475-1-seanjc@google.com Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Yicong Yang authored
[ Upstream commit eb79f12b ] The PMU instance will be called hisi_pcie<sicl>_core<core> rather than hisi_pcie<sicl>_<core>. Fix this in the documentation. Fixes: c8602008 ("docs: perf: Add description for HiSilicon PCIe PMU driver") Reviewed-by:
Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by:
Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20221117084136.53572-3-yangyicong@huawei.com Signed-off-by:
Will Deacon <will@kernel.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Ricardo Ribalda authored
[ Upstream commit c6f78746 ] If the device does not come straight from reset, we might receive an IRQ before we are ready to handle it. Fixes: [ 0.832328] Unable to handle kernel read from unreadable memory at virtual address 0000000000000010 [ 1.040343] Call trace: [ 1.040347] mtk_spi_can_dma+0xc/0x40 ... [ 1.262265] start_kernel+0x338/0x42c Signed-off-by:
Ricardo Ribalda <ribalda@chromium.org> Reviewed-by:
AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Link: https://lore.kernel.org/r/20221128-spi-mt65xx-v1-0-509266830665@chromium.org Signed-off-by:
Mark Brown <broonie@kernel.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Jie Zhan authored
[ Upstream commit 3c2673a0 ] SATA devices on an expander may be removed and not be found again when I_T nexus reset and revalidation are processed simultaneously. The issue comes from: - Revalidation can remove SATA devices in link reset, e.g. in hisi_sas_clear_nexus_ha(). - However, hisi_sas_debug_I_T_nexus_reset() polls the state of a SATA device on an expander after sending link_reset, where it calls: hisi_sas_debug_I_T_nexus_reset sas_ata_wait_after_reset ata_wait_after_reset ata_wait_ready smp_ata_check_ready sas_ex_phy_discover sas_ex_phy_discover_helper sas_set_ex_phy The ex_phy's change count is updated in sas_set_ex_phy(), so SATA devices after a link reset may not be found later through revalidation. A similar issue was reported in: commit 0f3fce5c ("[SCSI] libsas: fix ata_eh clobbering ex_phys via smp_ata_check_ready") commit 87c8331f ("[SCSI] libsas: prevent domain rediscovery competing with ata error handling"). To address this issue, in hisi_sas_debug_I_T_nexus_reset(), we now call smp_ata_check_ready_type() that only polls the device type while not updating the ex_phy's data of libsas. Fixes: 71453bd9 ("scsi: hisi_sas: Use sas_ata_wait_after_reset() in IT nexus reset") Signed-off-by:
Jie Zhan <zhanjie9@hisilicon.com> Link: https://lore.kernel.org/r/20221118083714.4034612-5-zhanjie9@hisilicon.com Signed-off-by:
Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Jie Zhan authored
[ Upstream commit 9181ce3c ] Create function smp_ata_check_ready_type() for LLDDs to wait for SATA devices to come up after a link reset. Signed-off-by:
Jie Zhan <zhanjie9@hisilicon.com> Link: https://lore.kernel.org/r/20221118083714.4034612-4-zhanjie9@hisilicon.com Signed-off-by:
Martin K. Petersen <martin.petersen@oracle.com> Stable-dep-of: 3c2673a0 ("scsi: hisi_sas: Fix SATA devices missing issue during I_T nexus reset") Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Jason A. Donenfeld authored
[ Upstream commit d7bf7f3b ] add_latent_entropy() is called every time a process forks, in kernel_clone(). This in turn calls add_device_randomness() using the latent entropy global state. add_device_randomness() does two things: 2) Mixes into the input pool the latent entropy argument passed; and 1) Mixes in a cycle counter, a sort of measurement of when the event took place, the high precision bits of which are presumably difficult to predict. (2) is impossible without CONFIG_GCC_PLUGIN_LATENT_ENTROPY=y. But (1) is always possible. However, currently CONFIG_GCC_PLUGIN_LATENT_ENTROPY=n disables both (1) and (2), instead of just (2). This commit causes the CONFIG_GCC_PLUGIN_LATENT_ENTROPY=n case to still do (1) by passing NULL (len 0) to add_device_randomness() when add_latent_ entropy() is called. Cc: Dominik Brodowski <linux@dominikbrodowski.net> Cc: PaX Team <pageexec@freemail.hu> Cc: Emese Revfy <re.emese@gmail.com> Fixes: 38addce8 ("gcc-plugins: Add latent_entropy plugin") Signed-off-by:
Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Suren Baghdasaryan authored
[ Upstream commit 710ffe67 ] Psi polling mechanism is trying to minimize the number of wakeups to run psi_poll_work and is currently relying on timer_pending() to detect when this work is already scheduled. This provides a window of opportunity for psi_group_change to schedule an immediate psi_poll_work after poll_timer_fn got called but before psi_poll_work could reschedule itself. Below is the depiction of this entire window: poll_timer_fn wake_up_interruptible(&group->poll_wait); psi_poll_worker wait_event_interruptible(group->poll_wait, ...) psi_poll_work psi_schedule_poll_work if (timer_pending(&group->poll_timer)) return; ... mod_timer(&group->poll_timer, jiffies + delay); Prior to 461daba0 we used to rely on poll_scheduled atomic which was reset and set back inside psi_poll_work and therefore this race window was much smaller. The larger window causes increased number of wakeups and our partners report visible power regression of ~10mA after applying 461daba0. Bring back the poll_scheduled atomic and make this race window even narrower by resetting poll_scheduled only when we reach polling expiration time. This does not completely eliminate the possibility of extra wakeups caused by a race with psi_group_change however it will limit it to the worst case scenario of one extra wakeup per every tracking window (0.5s in the worst case). This patch also ensures correct ordering between clearing poll_scheduled flag and obtaining changed_states using memory barrier. Correct ordering between updating changed_states and setting poll_scheduled is ensured by atomic_xchg operation. By tracing the number of immediate rescheduling attempts performed by psi_group_change and the number of these attempts being blocked due to psi monitor being already active, we can assess the effects of this change: Before the patch: Run#1 Run#2 Run#3 Immediate reschedules attempted: 684365 1385156 1261240 Immediate reschedules blocked: 682846 1381654 1258682 Immediate reschedules (delta): 1519 3502 2558 Immediate reschedules (% of attempted): 0.22% 0.25% 0.20% After the patch: Run#1 Run#2 Run#3 Immediate reschedules attempted: 882244 770298 426218 Immediate reschedules blocked: 881996 769796 426074 Immediate reschedules (delta): 248 502 144 Immediate reschedules (% of attempted): 0.03% 0.07% 0.03% The number of non-blocked immediate reschedules dropped from 0.22-0.25% to 0.03-0.07%. The drop is attributed to the decrease in the race window size and the fact that we allow this race only when psi monitors reach polling window expiration time. Fixes: 461daba0 ("psi: eliminate kthread_worker from psi trigger scheduling mechanism") Reported-by:
Kathleen Chang <yt.chang@mediatek.com> Reported-by:
Wenju Xu <wenju.xu@mediatek.com> Reported-by:
Jonathan Chen <jonathan.jmchen@mediatek.com> Signed-off-by:
Suren Baghdasaryan <surenb@google.com> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by:
Chengming Zhou <zhouchengming@bytedance.com> Acked-by:
Johannes Weiner <hannes@cmpxchg.org> Tested-by:
SH Chen <show-hong.chen@mediatek.com> Link: https://lore.kernel.org/r/20221028194541.813985-1-surenb@google.com Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Rahul Tanwar authored
[ Upstream commit 7256d1f4 ] Commit 03617731 ("clk: mxl: Switch from direct readl/writel based IO to regmap based IO") introduced code resulting in below warning issued by the smatch static checker. drivers/clk/x86/clk-lgm.c:441 lgm_cgu_probe() warn: passing zero to 'PTR_ERR' Fix the warning by replacing incorrect IS_ERR_OR_NULL() with IS_ERR(). Fixes: 03617731 ("clk: mxl: Switch from direct readl/writel based IO to regmap based IO") Reported-by:
Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by:
Rahul Tanwar <rtanwar@maxlinear.com> Link: https://lore.kernel.org/r/49e339d4739e4ae4c92b00c1b2918af0755d4122.1666695221.git.rtanwar@maxlinear.com Signed-off-by:
Stephen Boyd <sboyd@kernel.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Sean Anderson authored
[ Upstream commit 36926a7d ] On the T208X SoCs, MAC1 and MAC2 support XGMII. Add some new MAC dtsi fragments, and mark the QMAN ports as 10G. Fixes: da414bb9 ("powerpc/mpc85xx: Add FSL QorIQ DPAA FMan support to the SoC device tree(s)") Signed-off-by:
Sean Anderson <sean.anderson@seco.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-