Forum | Documentation | Website | Blog

Skip to content
Snippets Groups Projects
  1. Mar 17, 2023
    • Greg Kroah-Hartman's avatar
    • Rhythm Mahajan's avatar
      x86/cpu: Fix LFENCE serialization check in init_amd() · 9e1f4df3
      Rhythm Mahajan authored
      The commit: 3f235279 ("x86/cpu: Restore AMD's DE_CFG MSR after resume")
      which was backported from the upstream commit: 2632daeb renamed the
      MSR_F10H_DECFG_LFENCE_SERIALIZE macro to MSR_AMD64_DE_CFG_LFENCE_SERIALIZE.
      The fix for 4.14 and 4.9 changed MSR_F10H_DECFG_LFENCE_SERIALIZE to
      MSR_AMD64_DE_CFG_LFENCE_SERIALIZE_BIT in the init_amd() function, but should
      have used MSR_AMD64_DE_CFG_LFENCE_SERIALIZE.  This causes a discrepency in the
      LFENCE serialization check in the init_amd() function.
      
      This causes a ~16% sysbench memory regression, when running:
          sysbench --test=memory run
      
      Fixes: 3f235279
      
       ("x86/cpu: Restore AMD's DE_CFG MSR after resume")
      Signed-off-by: default avatarRhythm Mahajan <rhythm.m.mahajan@oracle.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9e1f4df3
    • John Harrison's avatar
      drm/i915: Don't use BAR mappings for ring buffers with LLC · 7f973ce9
      John Harrison authored
      commit 85636167
      
       upstream.
      
      Direction from hardware is that ring buffers should never be mapped
      via the BAR on systems with LLC. There are too many caching pitfalls
      due to the way BAR accesses are routed. So it is safest to just not
      use it.
      
      Signed-off-by: default avatarJohn Harrison <John.C.Harrison@Intel.com>
      Fixes: 9d80841e
      
       ("drm/i915: Allow ringbuffers to be bound anywhere")
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Jani Nikula <jani.nikula@linux.intel.com>
      Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
      Cc: intel-gfx@lists.freedesktop.org
      Cc: <stable@vger.kernel.org> # v4.9+
      Tested-by: default avatarJouni Högander <jouni.hogander@intel.com>
      Reviewed-by: default avatarDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20230216011101.1909009-3-John.C.Harrison@Intel.com
      (cherry picked from commit 65c08339
      
      )
      Signed-off-by: default avatarJani Nikula <jani.nikula@intel.com>
      Signed-off-by: default avatarJohn Harrison <John.C.Harrison@Intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7f973ce9
    • Tung Nguyen's avatar
      tipc: improve function tipc_wait_for_cond() · 79f749ad
      Tung Nguyen authored
      commit 223b7329 upstream.
      
      Commit 844cf763
      
       ("tipc: make macro tipc_wait_for_cond() smp safe")
      replaced finish_wait() with remove_wait_queue() but still used
      prepare_to_wait(). This causes unnecessary conditional
      checking  before adding to wait queue in prepare_to_wait().
      
      This commit replaces prepare_to_wait() with add_wait_queue()
      as the pair function with remove_wait_queue().
      
      Acked-by: default avatarYing Xue <ying.xue@windriver.com>
      Acked-by: default avatarJon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: default avatarTung Nguyen <tung.q.nguyen@dektech.com.au>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Cc: Lee Jones <lee@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      79f749ad
    • Paul Elder's avatar
      media: ov5640: Fix analogue gain control · f9be5f66
      Paul Elder authored
      [ Upstream commit afa48057
      
       ]
      
      Gain control is badly documented in publicly available (including
      leaked) documentation.
      
      There is an AGC pre-gain in register 0x3a13, expressed as a 6-bit value
      (plus an enable bit in bit 6). The driver hardcodes it to 0x43, which
      one application note states is equal to x1.047. The documentation also
      states that 0x40 is equel to x1.000. The pre-gain thus seems to be
      expressed as in 1/64 increments, and thus ranges from x1.00 to x1.984.
      What the pre-gain does is however unspecified.
      
      There is then an AGC gain limit, in registers 0x3a18 and 0x3a19,
      expressed as a 10-bit "real gain format" value. One application note
      sets it to 0x00f8 and states it is equal to x15.5, so it appears to be
      expressed in 1/16 increments, up to x63.9375.
      
      The manual gain is stored in registers 0x350a and 0x350b, also as a
      10-bit "real gain format" value. It is documented in the application
      note as a Q6.4 values, up to x63.9375.
      
      One version of the datasheet indicates that the sensor supports a
      digital gain:
      
        The OV5640 supports 1/2/4 digital gain. Normally, the gain is
        controlled automatically by the automatic gain control (AGC) block.
      
      It isn't clear how that would be controlled manually.
      
      There appears to be no indication regarding whether the gain controlled
      through registers 0x350a and 0x350b is an analogue gain only or also
      includes digital gain. The words "real gain" don't necessarily mean
      "combined analogue and digital gains". Some OmniVision sensors (such as
      the OV8858) are documented as supoprting different formats for the gain
      values, selectable through a register bit, and they are called "real
      gain format" and "sensor gain format". For that sensor, we have (one of)
      the gain registers documented as
      
        0x3503[2]=0, gain[7:0] is real gain format, where low 4 bits are
        fraction bits, for example, 0x10 is 1x gain, 0x28 is 2.5x gain
      
        If 0x3503[2]=1, gain[7:0] is sensor gain format, gain[7:4] is coarse
        gain, 00000: 1x, 00001: 2x, 00011: 4x, 00111: 8x, gain[7] is 1,
        gain[3:0] is fine gain. For example, 0x10 is 1x gain, 0x30 is 2x gain,
        0x70 is 4x gain
      
      (The second part of the text makes little sense)
      
      "Real gain" may thus refer to the combination of the coarse and fine
      analogue gains as a single value.
      
      The OV5640 0x350a and 0x350b registers thus appear to control analogue
      gain. The driver incorrectly uses V4L2_CID_GAIN as V4L2 has a specific
      control for analogue gain, V4L2_CID_ANALOGUE_GAIN. Use it.
      
      If registers 0x350a and 0x350b are later found to control digital gain
      as well, the driver could then restrict the range of the analogue gain
      control value to lower than x64 and add a separate digital gain control.
      
      Signed-off-by: default avatarPaul Elder <paul.elder@ideasonboard.com>
      Signed-off-by: default avatarLaurent Pinchart <laurent.pinchart@ideasonboard.com>
      Reviewed-by: default avatarJacopo Mondi <jacopo.mondi@ideasonboard.com>
      Reviewed-by: default avatarJai Luthra <j-luthra@ti.com>
      Signed-off-by: default avatarSakari Ailus <sakari.ailus@linux.intel.com>
      Signed-off-by: default avatarMauro Carvalho Chehab <mchehab@kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      f9be5f66
    • Alvaro Karsz's avatar
      PCI: Add SolidRun vendor ID · 8b088495
      Alvaro Karsz authored
      [ Upstream commit db6c4dee
      
       ]
      
      Add SolidRun vendor ID to pci_ids.h
      
      The vendor ID is used in 2 different source files, the SNET vDPA driver
      and PCI quirks.
      
      Signed-off-by: default avatarAlvaro Karsz <alvaro.karsz@solid-run.com>
      Acked-by: default avatarBjorn Helgaas <bhelgaas@google.com>
      Message-Id: <20230110165638.123745-2-alvaro.karsz@solid-run.com>
      Signed-off-by: default avatarMichael S. Tsirkin <mst@redhat.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      8b088495
    • Nathan Chancellor's avatar
      macintosh: windfarm: Use unsigned type for 1-bit bitfields · 10e549f9
      Nathan Chancellor authored
      [ Upstream commit 748ea32d
      
       ]
      
      Clang warns:
      
        drivers/macintosh/windfarm_lm75_sensor.c:63:14: error: implicit truncation from 'int' to a one-bit wide bit-field changes value from 1 to -1 [-Werror,-Wsingle-bit-bitfield-constant-conversion]
                        lm->inited = 1;
                                   ^ ~
      
        drivers/macintosh/windfarm_smu_sensors.c:356:19: error: implicit truncation from 'int' to a one-bit wide bit-field changes value from 1 to -1 [-Werror,-Wsingle-bit-bitfield-constant-conversion]
                        pow->fake_volts = 1;
                                        ^ ~
        drivers/macintosh/windfarm_smu_sensors.c:368:18: error: implicit truncation from 'int' to a one-bit wide bit-field changes value from 1 to -1 [-Werror,-Wsingle-bit-bitfield-constant-conversion]
                        pow->quadratic = 1;
                                       ^ ~
      
      There is no bug here since no code checks the actual value of these
      fields, just whether or not they are zero (boolean context), but this
      can be easily fixed by switching to an unsigned type.
      
      Signed-off-by: default avatarNathan Chancellor <nathan@kernel.org>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20230215-windfarm-wsingle-bit-bitfield-constant-conversion-v1-1-26415072e855@kernel.org
      
      
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      10e549f9
    • Edward Humes's avatar
      alpha: fix R_ALPHA_LITERAL reloc for large modules · 758fcca4
      Edward Humes authored
      [ Upstream commit b6b17a8b
      
       ]
      
      Previously, R_ALPHA_LITERAL relocations would overflow for large kernel
      modules.
      
      This was because the Alpha's apply_relocate_add was relying on the kernel's
      module loader to have sorted the GOT towards the very end of the module as it
      was mapped into memory in order to correctly assign the global pointer. While
      this behavior would mostly work fine for small kernel modules, this approach
      would overflow on kernel modules with large GOT's since the global pointer
      would be very far away from the GOT, and thus, certain entries would be out of
      range.
      
      This patch fixes this by instead using the Tru64 behavior of assigning the
      global pointer to be 32KB away from the start of the GOT. The change made
      in this patch won't work for multi-GOT kernel modules as it makes the
      assumption the module only has one GOT located at the beginning of .got,
      although for the vast majority kernel modules, this should be fine. Of the
      kernel modules that would previously result in a relocation error, none of
      them, even modules like nouveau, have even come close to filling up a single
      GOT, and they've all worked fine under this patch.
      
      Signed-off-by: default avatarEdward Humes <aurxenon@lunos.org>
      Signed-off-by: default avatarMatt Turner <mattst88@gmail.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      758fcca4
    • xurui's avatar
      MIPS: Fix a compilation issue · 1f91fc28
      xurui authored
      [ Upstream commit 109d587a
      
       ]
      
      arch/mips/include/asm/mach-rc32434/pci.h:377:
      cc1: error: result of ‘-117440512 << 16’ requires 44 bits to represent, but ‘int’ only has 32 bits [-Werror=shift-overflow=]
      
      All bits in KORINA_STAT are already at the correct position, so there is
      no addtional shift needed.
      
      Signed-off-by: default avatarxurui <xurui@kylinos.cn>
      Signed-off-by: default avatarThomas Bogendoerfer <tsbogend@alpha.franken.de>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      1f91fc28
    • Shigeru Yoshida's avatar
      net: caif: Fix use-after-free in cfusbl_device_notify() · 68a45c3c
      Shigeru Yoshida authored
      [ Upstream commit 9781e98a ]
      
      syzbot reported use-after-free in cfusbl_device_notify() [1].  This
      causes a stack trace like below:
      
      BUG: KASAN: use-after-free in cfusbl_device_notify+0x7c9/0x870 net/caif/caif_usb.c:138
      Read of size 8 at addr ffff88807ac4e6f0 by task kworker/u4:6/1214
      
      CPU: 0 PID: 1214 Comm: kworker/u4:6 Not tainted 5.19.0-rc3-syzkaller-00146-g92f20ff72066 #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Workqueue: netns cleanup_net
      Call Trace:
       <TASK>
       __dump_stack lib/dump_stack.c:88 [inline]
       dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
       print_address_description.constprop.0.cold+0xeb/0x467 mm/kasan/report.c:313
       print_report mm/kasan/report.c:429 [inline]
       kasan_report.cold+0xf4/0x1c6 mm/kasan/report.c:491
       cfusbl_device_notify+0x7c9/0x870 net/caif/caif_usb.c:138
       notifier_call_chain+0xb5/0x200 kernel/notifier.c:87
       call_netdevice_notifiers_info+0xb5/0x130 net/core/dev.c:1945
       call_netdevice_notifiers_extack net/core/dev.c:1983 [inline]
       call_netdevice_notifiers net/core/dev.c:1997 [inline]
       netdev_wait_allrefs_any net/core/dev.c:10227 [inline]
       netdev_run_todo+0xbc0/0x10f0 net/core/dev.c:10341
       default_device_exit_batch+0x44e/0x590 net/core/dev.c:11334
       ops_exit_list+0x125/0x170 net/core/net_namespace.c:167
       cleanup_net+0x4ea/0xb00 net/core/net_namespace.c:594
       process_one_work+0x996/0x1610 kernel/workqueue.c:2289
       worker_thread+0x665/0x1080 kernel/workqueue.c:2436
       kthread+0x2e9/0x3a0 kernel/kthread.c:376
       ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:302
       </TASK>
      
      When unregistering a net device, unregister_netdevice_many_notify()
      sets the device's reg_state to NETREG_UNREGISTERING, calls notifiers
      with NETDEV_UNREGISTER, and adds the device to the todo list.
      
      Later on, devices in the todo list are processed by netdev_run_todo().
      netdev_run_todo() waits devices' reference count become 1 while
      rebdoadcasting NETDEV_UNREGISTER notification.
      
      When cfusbl_device_notify() is called with NETDEV_UNREGISTER multiple
      times, the parent device might be freed.  This could cause UAF.
      Processing NETDEV_UNREGISTER multiple times also causes inbalance of
      reference count for the module.
      
      This patch fixes the issue by accepting only first NETDEV_UNREGISTER
      notification.
      
      Fixes: 7ad65bf6
      
       ("caif: Add support for CAIF over CDC NCM USB interface")
      CC: sjur.brandeland@stericsson.com <sjur.brandeland@stericsson.com>
      Reported-by: default avatar <syzbot+b563d33852b893653a9e@syzkaller.appspotmail.com>
      Link: https://syzkaller.appspot.com/bug?id=c3bfd8e2450adab3bffe4d80821fbbced600407f
      
       [1]
      Signed-off-by: default avatarShigeru Yoshida <syoshida@redhat.com>
      Link: https://lore.kernel.org/r/20230301163913.391304-1-syoshida@redhat.com
      
      
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      68a45c3c
    • Eric Dumazet's avatar
      ila: do not generate empty messages in ila_xlat_nl_cmd_get_mapping() · b26bc586
      Eric Dumazet authored
      [ Upstream commit 693aa2c0 ]
      
      ila_xlat_nl_cmd_get_mapping() generates an empty skb,
      triggerring a recent sanity check [1].
      
      Instead, return an error code, so that user space
      can get it.
      
      [1]
      skb_assert_len
      WARNING: CPU: 0 PID: 5923 at include/linux/skbuff.h:2527 skb_assert_len include/linux/skbuff.h:2527 [inline]
      WARNING: CPU: 0 PID: 5923 at include/linux/skbuff.h:2527 __dev_queue_xmit+0x1bc0/0x3488 net/core/dev.c:4156
      Modules linked in:
      CPU: 0 PID: 5923 Comm: syz-executor269 Not tainted 6.2.0-syzkaller-18300-g2ebd1fbb946d #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/21/2023
      pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
      pc : skb_assert_len include/linux/skbuff.h:2527 [inline]
      pc : __dev_queue_xmit+0x1bc0/0x3488 net/core/dev.c:4156
      lr : skb_assert_len include/linux/skbuff.h:2527 [inline]
      lr : __dev_queue_xmit+0x1bc0/0x3488 net/core/dev.c:4156
      sp : ffff80001e0d6c40
      x29: ffff80001e0d6e60 x28: dfff800000000000 x27: ffff0000c86328c0
      x26: dfff800000000000 x25: ffff0000c8632990 x24: ffff0000c8632a00
      x23: 0000000000000000 x22: 1fffe000190c6542 x21: ffff0000c8632a10
      x20: ffff0000c8632a00 x19: ffff80001856e000 x18: ffff80001e0d5fc0
      x17: 0000000000000000 x16: ffff80001235d16c x15: 0000000000000000
      x14: 0000000000000000 x13: 0000000000000001 x12: 0000000000000001
      x11: ff80800008353a30 x10: 0000000000000000 x9 : 21567eaf25bfb600
      x8 : 21567eaf25bfb600 x7 : 0000000000000001 x6 : 0000000000000001
      x5 : ffff80001e0d6558 x4 : ffff800015c74760 x3 : ffff800008596744
      x2 : 0000000000000001 x1 : 0000000100000000 x0 : 000000000000000e
      Call trace:
      skb_assert_len include/linux/skbuff.h:2527 [inline]
      __dev_queue_xmit+0x1bc0/0x3488 net/core/dev.c:4156
      dev_queue_xmit include/linux/netdevice.h:3033 [inline]
      __netlink_deliver_tap_skb net/netlink/af_netlink.c:307 [inline]
      __netlink_deliver_tap+0x45c/0x6f8 net/netlink/af_netlink.c:325
      netlink_deliver_tap+0xf4/0x174 net/netlink/af_netlink.c:338
      __netlink_sendskb net/netlink/af_netlink.c:1283 [inline]
      netlink_sendskb+0x6c/0x154 net/netlink/af_netlink.c:1292
      netlink_unicast+0x334/0x8d4 net/netlink/af_netlink.c:1380
      nlmsg_unicast include/net/netlink.h:1099 [inline]
      genlmsg_unicast include/net/genetlink.h:433 [inline]
      genlmsg_reply include/net/genetlink.h:443 [inline]
      ila_xlat_nl_cmd_get_mapping+0x620/0x7d0 net/ipv6/ila/ila_xlat.c:493
      genl_family_rcv_msg_doit net/netlink/genetlink.c:968 [inline]
      genl_family_rcv_msg net/netlink/genetlink.c:1048 [inline]
      genl_rcv_msg+0x938/0xc1c net/netlink/genetlink.c:1065
      netlink_rcv_skb+0x214/0x3c4 net/netlink/af_netlink.c:2574
      genl_rcv+0x38/0x50 net/netlink/genetlink.c:1076
      netlink_unicast_kernel net/netlink/af_netlink.c:1339 [inline]
      netlink_unicast+0x660/0x8d4 net/netlink/af_netlink.c:1365
      netlink_sendmsg+0x800/0xae0 net/netlink/af_netlink.c:1942
      sock_sendmsg_nosec net/socket.c:714 [inline]
      sock_sendmsg net/socket.c:734 [inline]
      ____sys_sendmsg+0x558/0x844 net/socket.c:2479
      ___sys_sendmsg net/socket.c:2533 [inline]
      __sys_sendmsg+0x26c/0x33c net/socket.c:2562
      __do_sys_sendmsg net/socket.c:2571 [inline]
      __se_sys_sendmsg net/socket.c:2569 [inline]
      __arm64_sys_sendmsg+0x80/0x94 net/socket.c:2569
      __invoke_syscall arch/arm64/kernel/syscall.c:38 [inline]
      invoke_syscall+0x98/0x2c0 arch/arm64/kernel/syscall.c:52
      el0_svc_common+0x138/0x258 arch/arm64/kernel/syscall.c:142
      do_el0_svc+0x64/0x198 arch/arm64/kernel/syscall.c:193
      el0_svc+0x58/0x168 arch/arm64/kernel/entry-common.c:637
      el0t_64_sync_handler+0x84/0xf0 arch/arm64/kernel/entry-common.c:655
      el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:591
      irq event stamp: 136484
      hardirqs last enabled at (136483): [<ffff800008350244>] __up_console_sem+0x60/0xb4 kernel/printk/printk.c:345
      hardirqs last disabled at (136484): [<ffff800012358d60>] el1_dbg+0x24/0x80 arch/arm64/kernel/entry-common.c:405
      softirqs last enabled at (136418): [<ffff800008020ea8>] softirq_handle_end kernel/softirq.c:414 [inline]
      softirqs last enabled at (136418): [<ffff800008020ea8>] __do_softirq+0xd4c/0xfa4 kernel/softirq.c:600
      softirqs last disabled at (136371): [<ffff80000802b4a4>] ____do_softirq+0x14/0x20 arch/arm64/kernel/irq.c:80
      ---[ end trace 0000000000000000 ]---
      skb len=0 headroom=0 headlen=0 tailroom=192
      mac=(0,0) net=(0,-1) trans=-1
      shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0))
      csum(0x0 ip_summed=0 complete_sw=0 valid=0 level=0)
      hash(0x0 sw=0 l4=0) proto=0x0010 pkttype=6 iif=0
      dev name=nlmon0 feat=0x0000000000005861
      
      Fixes: 7f00feaf
      
       ("ila: Add generic ILA translation facility")
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      b26bc586
    • Kang Chen's avatar
      nfc: fdp: add null check of devm_kmalloc_array in fdp_nci_i2c_read_device_properties · ad11b872
      Kang Chen authored
      [ Upstream commit 11f180a5 ]
      
      devm_kmalloc_array may fails, *fw_vsc_cfg might be null and cause
      out-of-bounds write in device_property_read_u8_array later.
      
      Fixes: a06347c0
      
       ("NFC: Add Intel Fields Peak NFC solution driver")
      Signed-off-by: default avatarKang Chen <void0red@gmail.com>
      Reviewed-by: default avatarKrzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
      Reviewed-by: default avatarSimon Horman <simon.horman@corigine.com>
      Link: https://lore.kernel.org/r/20230227093037.907654-1-void0red@gmail.com
      
      
      Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      ad11b872
    • Fedor Pchelkin's avatar
      nfc: change order inside nfc_se_io error path · 192027fe
      Fedor Pchelkin authored
      commit 7d834b4d upstream.
      
      cb_context should be freed on the error path in nfc_se_io as stated by
      commit 25ff6f8a
      
       ("nfc: fix memory leak of se_io context in
      nfc_genl_se_io").
      
      Make the error path in nfc_se_io unwind everything in reverse order, i.e.
      free the cb_context after unlocking the device.
      
      Suggested-by: default avatarKrzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
      Signed-off-by: default avatarFedor Pchelkin <pchelkin@ispras.ru>
      Reviewed-by: default avatarKrzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
      Link: https://lore.kernel.org/r/20230306212650.230322-1-pchelkin@ispras.ru
      
      
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      192027fe
    • Zhihao Cheng's avatar
      ext4: zero i_disksize when initializing the bootloader inode · d6c1447e
      Zhihao Cheng authored
      commit f5361da1 upstream.
      
      If the boot loader inode has never been used before, the
      EXT4_IOC_SWAP_BOOT inode will initialize it, including setting the
      i_size to 0.  However, if the "never before used" boot loader has a
      non-zero i_size, then i_disksize will be non-zero, and the
      inconsistency between i_size and i_disksize can trigger a kernel
      warning:
      
       WARNING: CPU: 0 PID: 2580 at fs/ext4/file.c:319
       CPU: 0 PID: 2580 Comm: bb Not tainted 6.3.0-rc1-00004-g703695902cfa
       RIP: 0010:ext4_file_write_iter+0xbc7/0xd10
       Call Trace:
        vfs_write+0x3b1/0x5c0
        ksys_write+0x77/0x160
        __x64_sys_write+0x22/0x30
        do_syscall_64+0x39/0x80
      
      Reproducer:
       1. create corrupted image and mount it:
             mke2fs -t ext4 /tmp/foo.img 200
             debugfs -wR "sif <5> size 25700" /tmp/foo.img
             mount -t ext4 /tmp/foo.img /mnt
             cd /mnt
             echo 123 > file
       2. Run the reproducer program:
             posix_memalign(&buf, 1024, 1024)
             fd = open("file", O_RDWR | O_DIRECT);
             ioctl(fd, EXT4_IOC_SWAP_BOOT);
             write(fd, buf, 1024);
      
      Fix this by setting i_disksize as well as i_size to zero when
      initiaizing the boot loader inode.
      
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=217159
      
      
      Cc: stable@kernel.org
      Signed-off-by: default avatarZhihao Cheng <chengzhihao1@huawei.com>
      Link: https://lore.kernel.org/r/20230308032643.641113-1-chengzhihao1@huawei.com
      
      
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d6c1447e
    • Ye Bin's avatar
      ext4: fix WARNING in ext4_update_inline_data · c5aa102b
      Ye Bin authored
      commit 2b96b4a5
      
       upstream.
      
      Syzbot found the following issue:
      EXT4-fs (loop0): mounted filesystem 00000000-0000-0000-0000-000000000000 without journal. Quota mode: none.
      fscrypt: AES-256-CTS-CBC using implementation "cts-cbc-aes-aesni"
      fscrypt: AES-256-XTS using implementation "xts-aes-aesni"
      ------------[ cut here ]------------
      WARNING: CPU: 0 PID: 5071 at mm/page_alloc.c:5525 __alloc_pages+0x30a/0x560 mm/page_alloc.c:5525
      Modules linked in:
      CPU: 1 PID: 5071 Comm: syz-executor263 Not tainted 6.2.0-rc1-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/26/2022
      RIP: 0010:__alloc_pages+0x30a/0x560 mm/page_alloc.c:5525
      RSP: 0018:ffffc90003c2f1c0 EFLAGS: 00010246
      RAX: ffffc90003c2f220 RBX: 0000000000000014 RCX: 0000000000000000
      RDX: 0000000000000028 RSI: 0000000000000000 RDI: ffffc90003c2f248
      RBP: ffffc90003c2f2d8 R08: dffffc0000000000 R09: ffffc90003c2f220
      R10: fffff52000785e49 R11: 1ffff92000785e44 R12: 0000000000040d40
      R13: 1ffff92000785e40 R14: dffffc0000000000 R15: 1ffff92000785e3c
      FS:  0000555556c0d300(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007f95d5e04138 CR3: 00000000793aa000 CR4: 00000000003506f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       <TASK>
       __alloc_pages_node include/linux/gfp.h:237 [inline]
       alloc_pages_node include/linux/gfp.h:260 [inline]
       __kmalloc_large_node+0x95/0x1e0 mm/slab_common.c:1113
       __do_kmalloc_node mm/slab_common.c:956 [inline]
       __kmalloc+0xfe/0x190 mm/slab_common.c:981
       kmalloc include/linux/slab.h:584 [inline]
       kzalloc include/linux/slab.h:720 [inline]
       ext4_update_inline_data+0x236/0x6b0 fs/ext4/inline.c:346
       ext4_update_inline_dir fs/ext4/inline.c:1115 [inline]
       ext4_try_add_inline_entry+0x328/0x990 fs/ext4/inline.c:1307
       ext4_add_entry+0x5a4/0xeb0 fs/ext4/namei.c:2385
       ext4_add_nondir+0x96/0x260 fs/ext4/namei.c:2772
       ext4_create+0x36c/0x560 fs/ext4/namei.c:2817
       lookup_open fs/namei.c:3413 [inline]
       open_last_lookups fs/namei.c:3481 [inline]
       path_openat+0x12ac/0x2dd0 fs/namei.c:3711
       do_filp_open+0x264/0x4f0 fs/namei.c:3741
       do_sys_openat2+0x124/0x4e0 fs/open.c:1310
       do_sys_open fs/open.c:1326 [inline]
       __do_sys_openat fs/open.c:1342 [inline]
       __se_sys_openat fs/open.c:1337 [inline]
       __x64_sys_openat+0x243/0x290 fs/open.c:1337
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
      Above issue happens as follows:
      ext4_iget
         ext4_find_inline_data_nolock ->i_inline_off=164 i_inline_size=60
      ext4_try_add_inline_entry
         __ext4_mark_inode_dirty
            ext4_expand_extra_isize_ea ->i_extra_isize=32 s_want_extra_isize=44
               ext4_xattr_shift_entries
      	 ->after shift i_inline_off is incorrect, actually is change to 176
      ext4_try_add_inline_entry
        ext4_update_inline_dir
          get_max_inline_xattr_value_size
            if (EXT4_I(inode)->i_inline_off)
      	entry = (struct ext4_xattr_entry *)((void *)raw_inode +
      			EXT4_I(inode)->i_inline_off);
              free += EXT4_XATTR_SIZE(le32_to_cpu(entry->e_value_size));
      	->As entry is incorrect, then 'free' may be negative
         ext4_update_inline_data
            value = kzalloc(len, GFP_NOFS);
            -> len is unsigned int, maybe very large, then trigger warning when
               'kzalloc()'
      
      To resolve the above issue we need to update 'i_inline_off' after
      'ext4_xattr_shift_entries()'.  We do not need to set
      EXT4_STATE_MAY_INLINE_DATA flag here, since ext4_mark_inode_dirty()
      already sets this flag if needed.  Setting EXT4_STATE_MAY_INLINE_DATA
      when it is needed may trigger a BUG_ON in ext4_writepages().
      
      Reported-by: default avatar <syzbot+d30838395804afc2fa6f@syzkaller.appspotmail.com>
      Cc: stable@kernel.org
      Signed-off-by: default avatarYe Bin <yebin10@huawei.com>
      Reviewed-by: default avatarJan Kara <jack@suse.cz>
      Link: https://lore.kernel.org/r/20230307015253.2232062-3-yebin@huaweicloud.com
      
      
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c5aa102b
    • Ye Bin's avatar
      ext4: move where set the MAY_INLINE_DATA flag is set · dbd0db09
      Ye Bin authored
      commit 1dcdce59
      
       upstream.
      
      The only caller of ext4_find_inline_data_nolock() that needs setting of
      EXT4_STATE_MAY_INLINE_DATA flag is ext4_iget_extra_inode().  In
      ext4_write_inline_data_end() we just need to update inode->i_inline_off.
      Since we are going to add one more caller that does not need to set
      EXT4_STATE_MAY_INLINE_DATA, just move setting of EXT4_STATE_MAY_INLINE_DATA
      out to ext4_iget_extra_inode().
      
      Signed-off-by: default avatarYe Bin <yebin10@huawei.com>
      Cc: stable@kernel.org
      Reviewed-by: default avatarJan Kara <jack@suse.cz>
      Link: https://lore.kernel.org/r/20230307015253.2232062-2-yebin@huaweicloud.com
      
      
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      dbd0db09
    • Darrick J. Wong's avatar
      ext4: fix another off-by-one fsmap error on 1k block filesystems · a70b49dc
      Darrick J. Wong authored
      commit c993799b
      
       upstream.
      
      Apparently syzbot figured out that issuing this FSMAP call:
      
      struct fsmap_head cmd = {
      	.fmh_count	= ...;
      	.fmh_keys	= {
      		{ .fmr_device = /* ext4 dev */, .fmr_physical = 0, },
      		{ .fmr_device = /* ext4 dev */, .fmr_physical = 0, },
      	},
      ...
      };
      ret = ioctl(fd, FS_IOC_GETFSMAP, &cmd);
      
      Produces this crash if the underlying filesystem is a 1k-block ext4
      filesystem:
      
      kernel BUG at fs/ext4/ext4.h:3331!
      invalid opcode: 0000 [#1] PREEMPT SMP
      CPU: 3 PID: 3227965 Comm: xfs_io Tainted: G        W  O       6.2.0-rc8-achx
      Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
      RIP: 0010:ext4_mb_load_buddy_gfp+0x47c/0x570 [ext4]
      RSP: 0018:ffffc90007c03998 EFLAGS: 00010246
      RAX: ffff888004978000 RBX: ffffc90007c03a20 RCX: ffff888041618000
      RDX: 0000000000000000 RSI: 00000000000005a4 RDI: ffffffffa0c99b11
      RBP: ffff888012330000 R08: ffffffffa0c2b7d0 R09: 0000000000000400
      R10: ffffc90007c03950 R11: 0000000000000000 R12: 0000000000000001
      R13: 00000000ffffffff R14: 0000000000000c40 R15: ffff88802678c398
      FS:  00007fdf2020c880(0000) GS:ffff88807e100000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007ffd318a5fe8 CR3: 000000007f80f001 CR4: 00000000001706e0
      Call Trace:
       <TASK>
       ext4_mballoc_query_range+0x4b/0x210 [ext4 dfa189daddffe8fecd3cdfd00564e0f265a8ab80]
       ext4_getfsmap_datadev+0x713/0x890 [ext4 dfa189daddffe8fecd3cdfd00564e0f265a8ab80]
       ext4_getfsmap+0x2b7/0x330 [ext4 dfa189daddffe8fecd3cdfd00564e0f265a8ab80]
       ext4_ioc_getfsmap+0x153/0x2b0 [ext4 dfa189daddffe8fecd3cdfd00564e0f265a8ab80]
       __ext4_ioctl+0x2a7/0x17e0 [ext4 dfa189daddffe8fecd3cdfd00564e0f265a8ab80]
       __x64_sys_ioctl+0x82/0xa0
       do_syscall_64+0x2b/0x80
       entry_SYSCALL_64_after_hwframe+0x46/0xb0
      RIP: 0033:0x7fdf20558aff
      RSP: 002b:00007ffd318a9e30 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
      RAX: ffffffffffffffda RBX: 00000000000200c0 RCX: 00007fdf20558aff
      RDX: 00007fdf1feb2010 RSI: 00000000c0c0583b RDI: 0000000000000003
      RBP: 00005625c0634be0 R08: 00005625c0634c40 R09: 0000000000000001
      R10: 0000000000000000 R11: 0000000000000246 R12: 00007fdf1feb2010
      R13: 00005625be70d994 R14: 0000000000000800 R15: 0000000000000000
      
      For GETFSMAP calls, the caller selects a physical block device by
      writing its block number into fsmap_head.fmh_keys[01].fmr_device.
      To query mappings for a subrange of the device, the starting byte of the
      range is written to fsmap_head.fmh_keys[0].fmr_physical and the last
      byte of the range goes in fsmap_head.fmh_keys[1].fmr_physical.
      
      IOWs, to query what mappings overlap with bytes 3-14 of /dev/sda, you'd
      set the inputs as follows:
      
      	fmh_keys[0] = { .fmr_device = major(8, 0), .fmr_physical = 3},
      	fmh_keys[1] = { .fmr_device = major(8, 0), .fmr_physical = 14},
      
      Which would return you whatever is mapped in the 12 bytes starting at
      physical offset 3.
      
      The crash is due to insufficient range validation of keys[1] in
      ext4_getfsmap_datadev.  On 1k-block filesystems, block 0 is not part of
      the filesystem, which means that s_first_data_block is nonzero.
      ext4_get_group_no_and_offset subtracts this quantity from the blocknr
      argument before cracking it into a group number and a block number
      within a group.  IOWs, block group 0 spans blocks 1-8192 (1-based)
      instead of 0-8191 (0-based) like what happens with larger blocksizes.
      
      The net result of this encoding is that blocknr < s_first_data_block is
      not a valid input to this function.  The end_fsb variable is set from
      the keys that are copied from userspace, which means that in the above
      example, its value is zero.  That leads to an underflow here:
      
      	blocknr = blocknr - le32_to_cpu(es->s_first_data_block);
      
      The division then operates on -1:
      
      	offset = do_div(blocknr, EXT4_BLOCKS_PER_GROUP(sb)) >>
      		EXT4_SB(sb)->s_cluster_bits;
      
      Leaving an impossibly large group number (2^32-1) in blocknr.
      ext4_getfsmap_check_keys checked that keys[0].fmr_physical and
      keys[1].fmr_physical are in increasing order, but
      ext4_getfsmap_datadev adjusts keys[0].fmr_physical to be at least
      s_first_data_block.  This implies that we have to check it again after
      the adjustment, which is the piece that I forgot.
      
      Reported-by: default avatar <syzbot+6be2b977c89f79b6b153@syzkaller.appspotmail.com>
      Fixes: 4a495624 ("ext4: fix off-by-one fsmap error on 1k block filesystems")
      Link: https://syzkaller.appspot.com/bug?id=79d5768e9bfe362911ac1a5057a36fc6b5c30002
      
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarDarrick J. Wong <djwong@kernel.org>
      Link: https://lore.kernel.org/r/Y+58NPTH7VNGgzdd@magnolia
      
      
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a70b49dc
    • Eric Whitney's avatar
      ext4: fix RENAME_WHITEOUT handling for inline directories · e49ddab1
      Eric Whitney authored
      commit c9f62c8b upstream.
      
      A significant number of xfstests can cause ext4 to log one or more
      warning messages when they are run on a test file system where the
      inline_data feature has been enabled.  An example:
      
      "EXT4-fs warning (device vdc): ext4_dirblock_csum_set:425: inode
       #16385: comm fsstress: No space for directory leaf checksum. Please
      run e2fsck -D."
      
      The xfstests include: ext4/057, 058, and 307; generic/013, 051, 068,
      070, 076, 078, 083, 232, 269, 270, 390, 461, 475, 476, 482, 579, 585,
      589, 626, 631, and 650.
      
      In this situation, the warning message indicates a bug in the code that
      performs the RENAME_WHITEOUT operation on a directory entry that has
      been stored inline.  It doesn't detect that the directory is stored
      inline, and incorrectly attempts to compute a dirent block checksum on
      the whiteout inode when creating it.  This attempt fails as a result
      of the integrity checking in get_dirent_tail (usually due to a failure
      to match the EXT4_FT_DIR_CSUM magic cookie), and the warning message
      is then emitted.
      
      Fix this by simply collecting the inlined data state at the time the
      search for the source directory entry is performed.  Existing code
      handles the rest, and this is sufficient to eliminate all spurious
      warning messages produced by the tests above.  Go one step further
      and do the same in the code that resets the source directory entry in
      the event of failure.  The inlined state should be present in the
      "old" struct, but given the possibility of a race there's no harm
      in taking a conservative approach and getting that information again
      since the directory entry is being reread anyway.
      
      Fixes: b7ff91fd
      
       ("ext4: find old entry again if failed to rename whiteout")
      Cc: stable@kernel.org
      Signed-off-by: default avatarEric Whitney <enwlinux@gmail.com>
      Reviewed-by: default avatarJan Kara <jack@suse.cz>
      Link: https://lore.kernel.org/r/20230210173244.679890-1-enwlinux@gmail.com
      
      
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e49ddab1
    • Andrew Cooper's avatar
      x86/CPU/AMD: Disable XSAVES on AMD family 0x17 · 35769d15
      Andrew Cooper authored
      commit b0563468
      
       upstream.
      
      AMD Erratum 1386 is summarised as:
      
        XSAVES Instruction May Fail to Save XMM Registers to the Provided
        State Save Area
      
      This piece of accidental chronomancy causes the %xmm registers to
      occasionally reset back to an older value.
      
      Ignore the XSAVES feature on all AMD Zen1/2 hardware.  The XSAVEC
      instruction (which works fine) is equivalent on affected parts.
      
        [ bp: Typos, move it into the F17h-specific function. ]
      
      Reported-by: default avatarTavis Ormandy <taviso@gmail.com>
      Signed-off-by: default avatarAndrew Cooper <andrew.cooper3@citrix.com>
      Signed-off-by: default avatarBorislav Petkov (AMD) <bp@alien8.de>
      Cc: <stable@kernel.org>
      Link: https://lore.kernel.org/r/20230307174643.1240184-1-andrew.cooper3@citrix.com
      
      
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      35769d15
    • Theodore Ts'o's avatar
      fs: prevent out-of-bounds array speculation when closing a file descriptor · f31cd5da
      Theodore Ts'o authored
      commit 609d5444
      
       upstream.
      
      Google-Bug-Id: 114199369
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f31cd5da
  2. Mar 13, 2023
  3. Mar 11, 2023