  1. Mar 23, 2022
    • btrfs: skip reserved bytes warning on unmount after log cleanup failure · 44557a8f
      Filipe Manana authored
      commit 40cdc509 upstream.
      
      After the recent changes made by commit c2e39305 ("btrfs: clear
      extent buffer uptodate when we fail to write it") and its followup fix,
      commit 651740a5 ("btrfs: check WRITE_ERR when trying to read an
      extent buffer"), we can now end up not cleaning up space reservations of
      log tree extent buffers after a transaction abort happens, as well as not
      cleaning up still dirty extent buffers.
      
      This happens because if writeback for a log tree extent buffer failed,
      then we have cleared the bit EXTENT_BUFFER_UPTODATE from the extent buffer
      and we have also set the bit EXTENT_BUFFER_WRITE_ERR on it. Later on,
      when trying to free the log tree with free_log_tree(), which iterates
      over the tree, we can end up getting an -EIO error when trying to read
      a node or a leaf, since read_extent_buffer_pages() returns -EIO if an
      extent buffer does not have EXTENT_BUFFER_UPTODATE set and has the
      EXTENT_BUFFER_WRITE_ERR bit set. Getting that -EIO means that we return
      immediately as we can not iterate over the entire tree.
      
      In that case we never update the reserved space for an extent buffer in
      the respective block group and space_info object.
      
      When this happens we get the following traces when unmounting the fs:
      
      [174957.284509] BTRFS: error (device dm-0) in cleanup_transaction:1913: errno=-5 IO failure
      [174957.286497] BTRFS: error (device dm-0) in free_log_tree:3420: errno=-5 IO failure
      [174957.399379] ------------[ cut here ]------------
      [174957.402497] WARNING: CPU: 2 PID: 3206883 at fs/btrfs/block-group.c:127 btrfs_put_block_group+0x77/0xb0 [btrfs]
      [174957.407523] Modules linked in: btrfs overlay dm_zero (...)
      [174957.424917] CPU: 2 PID: 3206883 Comm: umount Tainted: G        W         5.16.0-rc5-btrfs-next-109 #1
      [174957.426689] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
      [174957.428716] RIP: 0010:btrfs_put_block_group+0x77/0xb0 [btrfs]
      [174957.429717] Code: 21 48 8b bd (...)
      [174957.432867] RSP: 0018:ffffb70d41cffdd0 EFLAGS: 00010206
      [174957.433632] RAX: 0000000000000001 RBX: ffff8b09c3848000 RCX: ffff8b0758edd1c8
      [174957.434689] RDX: 0000000000000001 RSI: ffffffffc0b467e7 RDI: ffff8b0758edd000
      [174957.436068] RBP: ffff8b0758edd000 R08: 0000000000000000 R09: 0000000000000000
      [174957.437114] R10: 0000000000000246 R11: 0000000000000000 R12: ffff8b09c3848148
      [174957.438140] R13: ffff8b09c3848198 R14: ffff8b0758edd188 R15: dead000000000100
      [174957.439317] FS:  00007f328fb82800(0000) GS:ffff8b0a2d200000(0000) knlGS:0000000000000000
      [174957.440402] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [174957.441164] CR2: 00007fff13563e98 CR3: 0000000404f4e005 CR4: 0000000000370ee0
      [174957.442117] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [174957.443076] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [174957.443948] Call Trace:
      [174957.444264]  <TASK>
      [174957.444538]  btrfs_free_block_groups+0x255/0x3c0 [btrfs]
      [174957.445238]  close_ctree+0x301/0x357 [btrfs]
      [174957.445803]  ? call_rcu+0x16c/0x290
      [174957.446250]  generic_shutdown_super+0x74/0x120
      [174957.446832]  kill_anon_super+0x14/0x30
      [174957.447305]  btrfs_kill_super+0x12/0x20 [btrfs]
      [174957.447890]  deactivate_locked_super+0x31/0xa0
      [174957.448440]  cleanup_mnt+0x147/0x1c0
      [174957.448888]  task_work_run+0x5c/0xa0
      [174957.449336]  exit_to_user_mode_prepare+0x1e5/0x1f0
      [174957.449934]  syscall_exit_to_user_mode+0x16/0x40
      [174957.450512]  do_syscall_64+0x48/0xc0
      [174957.450980]  entry_SYSCALL_64_after_hwframe+0x44/0xae
      [174957.451605] RIP: 0033:0x7f328fdc4a97
      [174957.452059] Code: 03 0c 00 f7 (...)
      [174957.454320] RSP: 002b:00007fff13564ec8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
      [174957.455262] RAX: 0000000000000000 RBX: 00007f328feea264 RCX: 00007f328fdc4a97
      [174957.456131] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000560b8ae51dd0
      [174957.457118] RBP: 0000560b8ae51ba0 R08: 0000000000000000 R09: 00007fff13563c40
      [174957.458005] R10: 00007f328fe49fc0 R11: 0000000000000246 R12: 0000000000000000
      [174957.459113] R13: 0000560b8ae51dd0 R14: 0000560b8ae51cb0 R15: 0000000000000000
      [174957.460193]  </TASK>
      [174957.460534] irq event stamp: 0
      [174957.461003] hardirqs last  enabled at (0): [<0000000000000000>] 0x0
      [174957.461947] hardirqs last disabled at (0): [<ffffffffb0e94214>] copy_process+0x934/0x2040
      [174957.463147] softirqs last  enabled at (0): [<ffffffffb0e94214>] copy_process+0x934/0x2040
      [174957.465116] softirqs last disabled at (0): [<0000000000000000>] 0x0
      [174957.466323] ---[ end trace bc7ee0c490bce3af ]---
      [174957.467282] ------------[ cut here ]------------
      [174957.468184] WARNING: CPU: 2 PID: 3206883 at fs/btrfs/block-group.c:3976 btrfs_free_block_groups+0x330/0x3c0 [btrfs]
      [174957.470066] Modules linked in: btrfs overlay dm_zero (...)
      [174957.483137] CPU: 2 PID: 3206883 Comm: umount Tainted: G        W         5.16.0-rc5-btrfs-next-109 #1
      [174957.484691] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
      [174957.486853] RIP: 0010:btrfs_free_block_groups+0x330/0x3c0 [btrfs]
      [174957.488050] Code: 00 00 00 ad de (...)
      [174957.491479] RSP: 0018:ffffb70d41cffde0 EFLAGS: 00010206
      [174957.492520] RAX: ffff8b08d79310b0 RBX: ffff8b09c3848000 RCX: 0000000000000000
      [174957.493868] RDX: 0000000000000001 RSI: fffff443055ee600 RDI: ffffffffb1131846
      [174957.495183] RBP: ffff8b08d79310b0 R08: 0000000000000000 R09: 0000000000000000
      [174957.496580] R10: 0000000000000001 R11: 0000000000000000 R12: ffff8b08d7931000
      [174957.498027] R13: ffff8b09c38492b0 R14: dead000000000122 R15: dead000000000100
      [174957.499438] FS:  00007f328fb82800(0000) GS:ffff8b0a2d200000(0000) knlGS:0000000000000000
      [174957.500990] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [174957.502117] CR2: 00007fff13563e98 CR3: 0000000404f4e005 CR4: 0000000000370ee0
      [174957.503513] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [174957.504864] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [174957.506167] Call Trace:
      [174957.506654]  <TASK>
      [174957.507047]  close_ctree+0x301/0x357 [btrfs]
      [174957.507867]  ? call_rcu+0x16c/0x290
      [174957.508567]  generic_shutdown_super+0x74/0x120
      [174957.509447]  kill_anon_super+0x14/0x30
      [174957.510194]  btrfs_kill_super+0x12/0x20 [btrfs]
      [174957.511123]  deactivate_locked_super+0x31/0xa0
      [174957.511976]  cleanup_mnt+0x147/0x1c0
      [174957.512610]  task_work_run+0x5c/0xa0
      [174957.513309]  exit_to_user_mode_prepare+0x1e5/0x1f0
      [174957.514231]  syscall_exit_to_user_mode+0x16/0x40
      [174957.515069]  do_syscall_64+0x48/0xc0
      [174957.515718]  entry_SYSCALL_64_after_hwframe+0x44/0xae
      [174957.516688] RIP: 0033:0x7f328fdc4a97
      [174957.517413] Code: 03 0c 00 f7 d8 (...)
      [174957.521052] RSP: 002b:00007fff13564ec8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
      [174957.522514] RAX: 0000000000000000 RBX: 00007f328feea264 RCX: 00007f328fdc4a97
      [174957.523950] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000560b8ae51dd0
      [174957.525375] RBP: 0000560b8ae51ba0 R08: 0000000000000000 R09: 00007fff13563c40
      [174957.526763] R10: 00007f328fe49fc0 R11: 0000000000000246 R12: 0000000000000000
      [174957.528058] R13: 0000560b8ae51dd0 R14: 0000560b8ae51cb0 R15: 0000000000000000
      [174957.529404]  </TASK>
      [174957.529843] irq event stamp: 0
      [174957.530256] hardirqs last  enabled at (0): [<0000000000000000>] 0x0
      [174957.531061] hardirqs last disabled at (0): [<ffffffffb0e94214>] copy_process+0x934/0x2040
      [174957.532075] softirqs last  enabled at (0): [<ffffffffb0e94214>] copy_process+0x934/0x2040
      [174957.533083] softirqs last disabled at (0): [<0000000000000000>] 0x0
      [174957.533865] ---[ end trace bc7ee0c490bce3b0 ]---
      [174957.534452] BTRFS info (device dm-0): space_info 4 has 1070841856 free, is not full
      [174957.535404] BTRFS info (device dm-0): space_info total=1073741824, used=2785280, pinned=0, reserved=49152, may_use=0, readonly=65536 zone_unusable=0
      [174957.537029] BTRFS info (device dm-0): global_block_rsv: size 0 reserved 0
      [174957.537859] BTRFS info (device dm-0): trans_block_rsv: size 0 reserved 0
      [174957.538697] BTRFS info (device dm-0): chunk_block_rsv: size 0 reserved 0
      [174957.539552] BTRFS info (device dm-0): delayed_block_rsv: size 0 reserved 0
      [174957.540403] BTRFS info (device dm-0): delayed_refs_rsv: size 0 reserved 0
      
      This also means that in case we have log tree extent buffers that are
      still dirty, we can end up not cleaning them up in case we find an
      extent buffer with EXTENT_BUFFER_WRITE_ERR set on it, as in that case
      we have no way for iterating over the rest of the tree.
      
      This issue is very often triggered with test cases generic/475 and
      generic/648 from fstests.
      
      The issue could almost be fixed by iterating over the io tree attached to
      each log root which keeps track of the range of allocated extent buffers,
      log_root->dirty_log_pages, however that does not work and has some
      inconveniences:
      
      1) After we sync the log, we clear the range of the extent buffers from
         the io tree, so we can't find them after writeback. We could keep the
         ranges in the io tree, with a separate bit to signal they represent
         extent buffers already written, but that means we need to hold onto
         more memory until the transaction commits.
      
         How much more memory is used depends a lot on whether we are able to
         allocate contiguous extent buffers on disk (and how often) for a log
         tree - if we are able to, then a single extent state record can
         represent multiple extent buffers, otherwise we need multiple extent
         state record structures to track each extent buffer.
         In fact, my earlier approach did that:
      
         https://lore.kernel.org/linux-btrfs/3aae7c6728257c7ce2279d6660ee2797e5e34bbd.1641300250.git.fdmanana@suse.com/
      
         However that can cause a very significant negative impact on
         performance, not only due to the extra memory usage but also because
         we get a larger and deeper dirty_log_pages io tree.
         We got a report that, on beefy machines at least, we can get such a
         performance drop with fsmark for example:
      
         https://lore.kernel.org/linux-btrfs/20220117082426.GE32491@xsang-OptiPlex-9020/
      
      2) We would be doing it only to deal with an unexpected and exceptional
         case, which is basically failure to read an extent buffer from disk
         due to IO failures. On a healthy system we don't expect transaction
         aborts to happen after all;
      
      3) Instead of relying on iterating the log tree or tracking the ranges
         of extent buffers in the dirty_log_pages io tree, using the radix
         tree that tracks extent buffers (fs_info->buffer_radix) to find all
         log tree extent buffers is not reliable either, because after writeback
         of an extent buffer it can be evicted from memory by the release page
         callback of the btree inode (btree_releasepage()).
      
      Since there's no way to properly clean up a log tree without being able
      to read its extent buffers from disk and without using more memory to
      track the logical ranges of the allocated extent buffers, do the
      following:
      
      1) When we fail to clean up a log tree, set a flag that indicates that
         failure;
      
      2) Trigger writeback of all log tree extent buffers that are still dirty,
         and wait for the writeback to complete. This is just to clean up their
         state, page states, page leaks, etc;
      
      3) When unmounting the fs, ignore if the number of bytes reserved in a
         block group and in a space_info is not 0 if, and only if, we failed to
         cleanup a log tree. Also ignore only for metadata block groups and the
         metadata space_info object.
      
      This is far from a perfect solution, but it serves to silence test
      failures such as those from generic/475 and generic/648. Having a
      non-zero value for the reserved bytes counters on unmount after a
      transaction abort is not such a terrible thing: it is completely
      harmless and does not affect the filesystem integrity in any way.
      
      Signed-off-by: Filipe Manana <fdmanana@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Anand Jain <anand.jain@oracle.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • Revert "ath10k: drop beacon and probe response which leak from other channel" · b8c56b04
      Kalle Valo authored
      commit 45b4eb7e upstream.
      
      This reverts commit 3bf2537e.
      
      It was reported to me privately that this commit breaks AP and mesh mode on
      QCA9984 (firmware 10.4-3.9.0.2-00156). So revert the commit to fix the
      regression.
      
      There was a conflict due to cfg80211 API changes but that was easy to fix.
      
      Fixes: 3bf2537e ("ath10k: drop beacon and probe response which leak from other channel")
      Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
      Link: https://lore.kernel.org/r/20220315155455.20446-1-kvalo@kernel.org
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • Revert "arm64: dts: freescale: Fix 'interrupt-map' parent address cells" · 439b7bb6
      Vladimir Oltean authored
      commit 1447c635 upstream.
      
      This reverts commit 869f0ec0. That
      updated the expected device tree binding format for the ls-extirq
      driver, without also updating the parsing code (ls_extirq_parse_map)
      to the new format.
      
      The context is that the ls-extirq driver uses the standard
      "interrupt-map" OF property in a non-standard way, as suggested by
      Rob Herring during review:
      https://lore.kernel.org/lkml/20190927161118.GA19333@bogus/
      
      This has turned out to be problematic, as Marc Zyngier discovered
      through commit 04128418 ("of/irq: Allow matching of an interrupt-map
      local to an interrupt controller"), later fixed through commit
      de4adddc ("of/irq: Add a quirk for controllers with their own
      definition of interrupt-map"). Marc's position, expressed on multiple
      occasions, is that:
      
      (a) [ making private use of the reserved "interrupt-map" name in a
          driver ] "is wrong, by the very letter of what an interrupt-map
          means. If the interrupt map points to an interrupt controller,
          that's the target for the interrupt."
      https://lore.kernel.org/lkml/87k0g8jlmg.wl-maz@kernel.org/
      
      (b) [ updating the driver's bindings to accept a non-reserved name for
          this property, as an alternative ] "is totally pointless. These
          machines have been in the wild for years, and existing DTs will be
          there *forever*."
      https://lore.kernel.org/lkml/87ilvrk1r0.wl-maz@kernel.org/
      
      Considering the above, the Linux kernel has quirks in place to deal with
      the ls-extirq's non-standard use of the "interrupt-map". These quirks
      may be needed in other operating systems that consume this device tree,
      yet this is seen as the only viable solution.
      
      Therefore, the premise of the patch being reverted here is invalid.
      It doesn't matter whether the driver, in its non-standard use of the
      property, complies with the standard format or not, since this property
      isn't expected to be used for interrupt translation by the core.
      
      This change restores LS1088A, LS2088A/LS2085A and LX2160A to their
      previous bindings, which allows these systems to continue to use
      external interrupt lines with the correct polarity.
      
      Fixes: 869f0ec0 ("arm64: dts: freescale: Fix 'interrupt-map' parent address cells")
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Acked-by: Marc Zyngier <maz@kernel.org>
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • perf symbols: Fix symbol size calculation condition · 2a55a52b
      Michael Petlan authored
      commit 3cf6a32f upstream.
      
      Before this patch, two conditions had to be met for the symbol end
      address fixup to be called:
      
        if (prev->end == prev->start && prev->end != curr->start)
      
      Where
        "prev->end == prev->start" means that prev is zero-long
                                   (and thus needs a fixup)
      and
        "prev->end != curr->start" means that fixup hasn't been applied yet
      
      However, this logic is incorrect in the following situation:
      
      *curr  = {rb_node = {__rb_parent_color = 278218928,
        rb_right = 0x0, rb_left = 0x0},
        start = 0xc000000000062354,
        end = 0xc000000000062354, namelen = 40, type = 2 '\002',
        binding = 0 '\000', idle = 0 '\000', ignore = 0 '\000',
        inlined = 0 '\000', arch_sym = 0 '\000', annotate2 = false,
        name = 0x1159739e "kprobe_optinsn_page\t[__builtin__kprobes]"}
      
      *prev = {rb_node = {__rb_parent_color = 278219041,
        rb_right = 0x109548b0, rb_left = 0x109547c0},
        start = 0xc000000000062354,
        end = 0xc000000000062354, namelen = 12, type = 2 '\002',
        binding = 1 '\001', idle = 0 '\000', ignore = 0 '\000',
        inlined = 0 '\000', arch_sym = 0 '\000', annotate2 = false,
        name = 0x1095486e "optinsn_slot"}
      
      In this case, prev->start == prev->end == curr->start == curr->end,
      thus the condition above thinks that "we need a fixup due to zero
      length of prev symbol, but it has probably been done, since the
      prev->end == curr->start", which is wrong.
      
      After the patch, the execution path proceeds to arch__symbols__fixup_end
      function which fixes up the size of prev symbol by adding page_size to
      its end offset.
      
      Fixes: 3b01a413 ("perf symbols: Improve kallsyms symbol end addr calculation")
      Signed-off-by: Michael Petlan <mpetlan@redhat.com>
      Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
      Link: http://lore.kernel.org/lkml/20220317135536.805-1-mpetlan@redhat.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • arm64: errata: avoid duplicate field initializer · 6be29c4b
      Arnd Bergmann authored
      commit 316e46f6 upstream.
      
      The '.type' field is initialized both in place and in the macro
      as reported by this W=1 warning:
      
      arch/arm64/include/asm/cpufeature.h:281:9: error: initialized field overwritten [-Werror=override-init]
        281 |         (ARM64_CPUCAP_SCOPE_LOCAL_CPU | ARM64_CPUCAP_OPTIONAL_FOR_LATE_CPU)
            |         ^
      arch/arm64/kernel/cpu_errata.c:136:17: note: in expansion of macro 'ARM64_CPUCAP_LOCAL_CPU_ERRATUM'
        136 |         .type = ARM64_CPUCAP_LOCAL_CPU_ERRATUM,                         \
            |                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      arch/arm64/kernel/cpu_errata.c:145:9: note: in expansion of macro 'ERRATA_MIDR_RANGE'
        145 |         ERRATA_MIDR_RANGE(m, var, r_min, var, r_max)
            |         ^~~~~~~~~~~~~~~~~
      arch/arm64/kernel/cpu_errata.c:613:17: note: in expansion of macro 'ERRATA_MIDR_REV_RANGE'
        613 |                 ERRATA_MIDR_REV_RANGE(MIDR_CORTEX_A510, 0, 0, 2),
            |                 ^~~~~~~~~~~~~~~~~~~~~
      arch/arm64/include/asm/cpufeature.h:281:9: note: (near initialization for 'arm64_errata[18].type')
        281 |         (ARM64_CPUCAP_SCOPE_LOCAL_CPU | ARM64_CPUCAP_OPTIONAL_FOR_LATE_CPU)
            |         ^
      
      Remove the extraneous initializer.
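
The pattern behind the warning can be shown with a small standalone sketch. All names and values here (`cap_entry`, `CAP_LOCAL_CPU_ERRATUM`, `ERRATUM_REV_RANGE`, the model number) are hypothetical and only mimic the shape of the kernel macros.

```c
#include <assert.h>

#define CAP_LOCAL_CPU_ERRATUM 0x11u   /* hypothetical value */

/* Hypothetical, reduced version of a capability table entry. */
struct cap_entry {
    unsigned int type;
    unsigned int midr_model;
    unsigned int rev_min;
    unsigned int rev_max;
};

/* The range macro already initializes .type as part of its expansion. */
#define ERRATUM_REV_RANGE(model, lo, hi)      \
    .type = CAP_LOCAL_CPU_ERRATUM,            \
    .midr_model = (model),                    \
    .rev_min = (lo),                          \
    .rev_max = (hi)

/*
 * Writing "{ .type = CAP_LOCAL_CPU_ERRATUM, ERRATUM_REV_RANGE(...) }" would
 * initialize .type twice, which GCC reports under -Woverride-init.
 * Relying on the macro alone avoids the duplicate.
 */
const struct cap_entry example_erratum = {
    ERRATUM_REV_RANGE(0xd46, 0, 2),
};
```

The duplicate is harmless at run time because both initializers carry the same value, but it turns into a build failure once the warning is promoted to an error.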
      
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Fixes: 1dd498e5 ("KVM: arm64: Workaround Cortex-A510's single-step and PAC trap errata")
      Link: https://lore.kernel.org/r/20220316183800.1546731-1-arnd@kernel.org
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • Input: aiptek - properly check endpoint type · 35069e65
      Pavel Skripkin authored
      commit 5600f698 upstream.
      
      Syzbot reported a warning in usb_submit_urb() which is caused by a wrong
      endpoint type. There was a check for the number of endpoints, but not
      for the type of endpoint.
      
      Fix it by replacing the old desc.bNumEndpoints check with the
      usb_find_common_endpoints() helper for finding endpoints.
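
The difference between counting endpoints and checking their type can be sketched in userspace as below. `struct ep_desc` and `find_int_in()` are hypothetical names for this illustration; only the bit layout of bEndpointAddress and bmAttributes is taken from the USB descriptor format.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the two endpoint descriptor fields used here. */
struct ep_desc {
    uint8_t bEndpointAddress;  /* bit 7 set means IN (device to host) */
    uint8_t bmAttributes;      /* bits 1:0 hold the transfer type */
};

#define EP_DIR_IN        0x80
#define EP_XFERTYPE_MASK 0x03
#define EP_XFER_INT      0x03

/* Return the first interrupt IN endpoint, or NULL if there is none. */
const struct ep_desc *find_int_in(const struct ep_desc *eps, size_t count)
{
    assert(eps != NULL || count == 0);

    for (size_t i = 0; i < count; i++) {
        bool is_in = (eps[i].bEndpointAddress & EP_DIR_IN) != 0;
        bool is_int = (eps[i].bmAttributes & EP_XFERTYPE_MASK) == EP_XFER_INT;

        if (is_in && is_int)
            return &eps[i];
    }
    return NULL;
}
```

A device that exposes one bulk endpoint would pass a check on the endpoint count but yields NULL here, which is the situation a probe routine needs to reject.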
      
      Fail log:
      
      usb 5-1: BOGUS urb xfer, pipe 1 != type 3
      WARNING: CPU: 2 PID: 48 at drivers/usb/core/urb.c:502 usb_submit_urb+0xed2/0x18a0 drivers/usb/core/urb.c:502
      Modules linked in:
      CPU: 2 PID: 48 Comm: kworker/2:2 Not tainted 5.17.0-rc6-syzkaller-00226-g07ebd38a0da2 #0
      Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-2 04/01/2014
      Workqueue: usb_hub_wq hub_event
      ...
      Call Trace:
       <TASK>
       aiptek_open+0xd5/0x130 drivers/input/tablet/aiptek.c:830
       input_open_device+0x1bb/0x320 drivers/input/input.c:629
       kbd_connect+0xfe/0x160 drivers/tty/vt/keyboard.c:1593
      
      Fixes: 8e20cf2b ("Input: aiptek - fix crash on detecting device without endpoints")
      Reported-and-tested-by: <syzbot+75cccf2b7da87fb6f84b@syzkaller.appspotmail.com>
      Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
      Link: https://lore.kernel.org/r/20220308194328.26220-1-paskripkin@gmail.com
      Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • scsi: mpt3sas: Page fault in reply q processing · 3916e33b
      Matt Lupfer authored
      commit 69ad4ef8 upstream.
      
      A page fault was encountered in mpt3sas on a LUN reset error path:
      
      [  145.763216] mpt3sas_cm1: Task abort tm failed: handle(0x0002),timeout(30) tr_method(0x0) smid(3) msix_index(0)
      [  145.778932] scsi 1:0:0:0: task abort: FAILED scmd(0x0000000024ba29a2)
      [  145.817307] scsi 1:0:0:0: attempting device reset! scmd(0x0000000024ba29a2)
      [  145.827253] scsi 1:0:0:0: [sg1] tag#2 CDB: Receive Diagnostic 1c 01 01 ff fc 00
      [  145.837617] scsi target1:0:0: handle(0x0002), sas_address(0x500605b0000272b9), phy(0)
      [  145.848598] scsi target1:0:0: enclosure logical id(0x500605b0000272b8), slot(0)
      [  149.858378] mpt3sas_cm1: Poll ReplyDescriptor queues for completion of smid(0), task_type(0x05), handle(0x0002)
      [  149.875202] BUG: unable to handle page fault for address: 00000007fffc445d
      [  149.885617] #PF: supervisor read access in kernel mode
      [  149.894346] #PF: error_code(0x0000) - not-present page
      [  149.903123] PGD 0 P4D 0
      [  149.909387] Oops: 0000 [#1] PREEMPT SMP NOPTI
      [  149.917417] CPU: 24 PID: 3512 Comm: scsi_eh_1 Kdump: loaded Tainted: G S         O      5.10.89-altav-1 #1
      [  149.934327] Hardware name: DDN           200NVX2             /200NVX2-MB          , BIOS ATHG2.2.02.01 09/10/2021
      [  149.951871] RIP: 0010:_base_process_reply_queue+0x4b/0x900 [mpt3sas]
      [  149.961889] Code: 0f 84 22 02 00 00 8d 48 01 49 89 fd 48 8d 57 38 f0 0f b1 4f 38 0f 85 d8 01 00 00 49 8b 45 10 45 31 e4 41 8b 55 0c 48 8d 1c d0 <0f> b6 03 83 e0 0f 3c 0f 0f 85 a2 00 00 00 e9 e6 01 00 00 0f b7 ee
      [  149.991952] RSP: 0018:ffffc9000f1ebcb8 EFLAGS: 00010246
      [  150.000937] RAX: 0000000000000055 RBX: 00000007fffc445d RCX: 000000002548f071
      [  150.011841] RDX: 00000000ffff8881 RSI: 0000000000000001 RDI: ffff888125ed50d8
      [  150.022670] RBP: 0000000000000000 R08: 0000000000000000 R09: c0000000ffff7fff
      [  150.033445] R10: ffffc9000f1ebb68 R11: ffffc9000f1ebb60 R12: 0000000000000000
      [  150.044204] R13: ffff888125ed50d8 R14: 0000000000000080 R15: 34cdc00034cdea80
      [  150.054963] FS:  0000000000000000(0000) GS:ffff88dfaf200000(0000) knlGS:0000000000000000
      [  150.066715] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  150.076078] CR2: 00000007fffc445d CR3: 000000012448a006 CR4: 0000000000770ee0
      [  150.086887] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  150.097670] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [  150.108323] PKRU: 55555554
      [  150.114690] Call Trace:
      [  150.120497]  ? printk+0x48/0x4a
      [  150.127049]  mpt3sas_scsih_issue_tm.cold.114+0x2e/0x2b3 [mpt3sas]
      [  150.136453]  mpt3sas_scsih_issue_locked_tm+0x86/0xb0 [mpt3sas]
      [  150.145759]  scsih_dev_reset+0xea/0x300 [mpt3sas]
      [  150.153891]  scsi_eh_ready_devs+0x541/0x9e0 [scsi_mod]
      [  150.162206]  ? __scsi_host_match+0x20/0x20 [scsi_mod]
      [  150.170406]  ? scsi_try_target_reset+0x90/0x90 [scsi_mod]
      [  150.178925]  ? blk_mq_tagset_busy_iter+0x45/0x60
      [  150.186638]  ? scsi_try_target_reset+0x90/0x90 [scsi_mod]
      [  150.195087]  scsi_error_handler+0x3a5/0x4a0 [scsi_mod]
      [  150.203206]  ? __schedule+0x1e9/0x610
      [  150.209783]  ? scsi_eh_get_sense+0x210/0x210 [scsi_mod]
      [  150.217924]  kthread+0x12e/0x150
      [  150.224041]  ? kthread_worker_fn+0x130/0x130
      [  150.231206]  ret_from_fork+0x1f/0x30
      
      This is caused by mpt3sas_base_sync_reply_irqs() using an invalid reply_q
      pointer outside of the list_for_each_entry() loop. At the end of the full
      list traversal the pointer is invalid.
      
      Move the _base_process_reply_queue() call inside of the loop.
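
Why the cursor is unusable after a full traversal can be seen in the following userspace sketch of a circular list with a sentinel head. The types and the helpers `queue_of()`, `process_queue()` and `sync_all_queues()` are hypothetical and only imitate the shape of the kernel list API.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular list link; the list head is a bare sentinel node. */
struct list_node {
    struct list_node *next;
};

/* Hypothetical reply queue that embeds its list link. */
struct reply_queue {
    int msix_index;
    int processed;
    struct list_node link;
};

/* Recover the enclosing reply_queue from a pointer to its link member. */
#define queue_of(node) \
    ((struct reply_queue *)((char *)(node) - offsetof(struct reply_queue, link)))

/* Stand-in for the per-queue processing step. */
void process_queue(struct reply_queue *q)
{
    q->processed++;
}

/* Process every queue inside the loop, where the cursor is a real entry. */
int sync_all_queues(struct list_node *head)
{
    int visited = 0;

    assert(head != NULL);
    for (struct list_node *n = head->next; n != head; n = n->next) {
        process_queue(queue_of(n));
        visited++;
    }
    /*
     * The loop ends when the cursor is back at the sentinel. Applying
     * queue_of() to the sentinel would not yield a real reply_queue, so
     * nothing may be processed through the cursor after this point.
     */
    return visited;
}
```

Keeping the call inside the loop body guarantees that it only ever sees pointers derived from real list entries.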
      
      Link: https://lore.kernel.org/r/d625deae-a958-0ace-2ba3-0888dd0a415b@ddn.com
      Fixes: 711a923c ("scsi: mpt3sas: Postprocessing of target and LUN reset")
      Cc: stable@vger.kernel.org
      Acked-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
      Signed-off-by: Matt Lupfer <mlupfer@ddn.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • usb: usbtmc: Fix bug in pipe direction for control transfers · 5f6a2d63
      Alan Stern authored
      commit e9b667a8 upstream.
      
      The syzbot fuzzer reported a minor bug in the usbtmc driver:
      
      usb 5-1: BOGUS control dir, pipe 80001e80 doesn't match bRequestType 0
      WARNING: CPU: 0 PID: 3813 at drivers/usb/core/urb.c:412
      usb_submit_urb+0x13a5/0x1970 drivers/usb/core/urb.c:410
      Modules linked in:
      CPU: 0 PID: 3813 Comm: syz-executor122 Not tainted
      5.17.0-rc5-syzkaller-00306-g2293be58d6a1 #0
      ...
      Call Trace:
       <TASK>
       usb_start_wait_urb+0x113/0x530 drivers/usb/core/message.c:58
       usb_internal_control_msg drivers/usb/core/message.c:102 [inline]
       usb_control_msg+0x2a5/0x4b0 drivers/usb/core/message.c:153
       usbtmc_ioctl_request drivers/usb/class/usbtmc.c:1947 [inline]
      
      The problem is that usbtmc_ioctl_request() uses usb_rcvctrlpipe() for
      all of its transfers, whether they are in or out.  It's easy to fix.
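
A minimal sketch of choosing the direction from the setup packet instead of hard-coding it follows. The helper name `control_transfer_is_in()` is hypothetical; treating a transfer without a data stage as OUT is an assumption made here to match the "BOGUS control dir" check shown in the log.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DIR_IN 0x80  /* bit 7 of bRequestType: device to host */

/*
 * A control transfer is treated as IN only when it has a data stage
 * (wLength != 0) and the direction bit of bRequestType is set.
 */
bool control_transfer_is_in(uint8_t bRequestType, uint16_t wLength)
{
    return wLength != 0 && (bRequestType & DIR_IN) != 0;
}
```

A driver would then pick usb_rcvctrlpipe() when this returns true and usb_sndctrlpipe() otherwise, so that the pipe direction always agrees with bRequestType.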
      
      CC: <stable@vger.kernel.org>
      Reported-and-tested-by: <syzbot+a48e3d1a875240cab5de@syzkaller.appspotmail.com>
      Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
      Link: https://lore.kernel.org/r/YiEsYTPEE6lOCOA5@rowland.harvard.edu
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • usb: gadget: Fix use-after-free bug by not setting udc->dev.driver · 27d64436
      Alan Stern authored
      commit 16b1941e upstream.
      
      The syzbot fuzzer found a use-after-free bug:
      
      BUG: KASAN: use-after-free in dev_uevent+0x712/0x780 drivers/base/core.c:2320
      Read of size 8 at addr ffff88802b934098 by task udevd/3689
      
      CPU: 2 PID: 3689 Comm: udevd Not tainted 5.17.0-rc4-syzkaller-00229-g4f12b742eb2b #0
      Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-2 04/01/2014
      Call Trace:
       <TASK>
       __dump_stack lib/dump_stack.c:88 [inline]
       dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
       print_address_description.constprop.0.cold+0x8d/0x303 mm/kasan/report.c:255
       __kasan_report mm/kasan/report.c:442 [inline]
       kasan_report.cold+0x83/0xdf mm/kasan/report.c:459
       dev_uevent+0x712/0x780 drivers/base/core.c:2320
       uevent_show+0x1b8/0x380 drivers/base/core.c:2391
       dev_attr_show+0x4b/0x90 drivers/base/core.c:2094
      
      Although the bug manifested in the driver core, the real cause was a
      race with the gadget core.  dev_uevent() does:
      
      	if (dev->driver)
      		add_uevent_var(env, "DRIVER=%s", dev->driver->name);
      
      and between the test and the dereference of dev->driver, the gadget
      core sets dev->driver to NULL.
      
      The race wouldn't occur if the gadget core registered its devices on
      a real bus, using the standard synchronization techniques of the
      driver core.  However, it's not necessary to make such a large change
      in order to fix this bug; all we need to do is make sure that
      udc->dev.driver is always NULL.
      
      In fact, there is no reason for udc->dev.driver ever to be set to
      anything, let alone to the value it currently gets: the address of the
      gadget's driver.  After all, a gadget driver only knows how to manage
      a gadget, not how to manage a UDC.
      
      This patch simply removes the statements in the gadget core that touch
      udc->dev.driver.
      
      Fixes: 2ccea03a ("usb: gadget: introduce UDC Class")
      CC: <stable@vger.kernel.org>
      Reported-and-tested-by: <syzbot+348b571beb5eeb70a582@syzkaller.appspotmail.com>
      Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
      Link: https://lore.kernel.org/r/YiQgukfFFbBnwJ/9@rowland.harvard.edu
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      27d64436
    • Dan Carpenter's avatar
      usb: gadget: rndis: prevent integer overflow in rndis_set_response() · df7e088d
      Dan Carpenter authored
      commit 65f3324f upstream.
      
      If "BufOffset" is very large the "BufOffset + 8" operation can have an
      integer overflow.
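To see why the addition is risky, consider a 32-bit unsigned offset close to UINT32_MAX: adding 8 wraps around to a small value and slips past an upper-bound test. The sketch below is a user-space illustration only; `MAX_TOTAL`, `naive_check()` and `safe_check()` are hypothetical names and values, not the driver's actual code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical buffer limit, chosen only for this illustration. */
#define MAX_TOTAL 1558u

/* Naive check: the sum wraps modulo 2^32 before it is compared. */
static bool naive_check(uint32_t buf_offset)
{
	return buf_offset + 8u < MAX_TOTAL;
}

/* Safer check: bound the offset first, so the addition cannot wrap. */
static bool safe_check(uint32_t buf_offset)
{
	return buf_offset < MAX_TOTAL && buf_offset + 8u < MAX_TOTAL;
}
```

With an offset of 0xFFFFFFFC the naive check accepts the value because the sum wraps to 4, while the bounded check rejects it.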
      
      Cc: stable@kernel.org
      Fixes: 38ea1eac ("usb: gadget: rndis: check size of RNDIS_MSG_SET command")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Link: https://lore.kernel.org/r/20220301080424.GA17208@kili
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • arm64: fix clang warning about TRAMP_VALIAS · c4874bdd
      Arnd Bergmann authored
      [ Upstream commit 7f34b43e ]
      
      The newly introduced TRAMP_VALIAS definition causes a build warning
      with clang-14:
      
      arch/arm64/include/asm/vectors.h:66:31: error: arithmetic on a null pointer treated as a cast from integer to pointer is a GNU extension [-Werror,-Wnull-pointer-arithmetic]
                      return (char *)TRAMP_VALIAS + SZ_2K * slot;
      
      Change the addition to something clang does not complain about.
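One way to avoid this class of warning is to do the arithmetic on the integer value and cast the result to a pointer once, rather than casting first and then adding to the pointer. The sketch below is a user-space illustration; `BASE_ADDR`, `SLOT_SIZE` and `slot_address()` are hypothetical stand-ins for the kernel definitions, and this is not necessarily the exact change made upstream.

```c
#include <stdint.h>

#define BASE_ADDR 0x10000000ul	/* hypothetical base address */
#define SLOT_SIZE 0x800ul	/* 2 KiB per slot */

/* Compute the address as an integer, then cast the final value once. */
static char *slot_address(unsigned long slot)
{
	return (char *)(uintptr_t)(BASE_ADDR + SLOT_SIZE * slot);
}
```

Because no pointer arithmetic is performed, the compiler has nothing to flag even if the base constant happens to be zero in some configuration.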
      
      Fixes: bd09128d ("arm64: Add percpu vectors for EL1")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Acked-by: James Morse <james.morse@arm.com>
      Link: https://lore.kernel.org/r/20220316183833.1563139-1-arnd@kernel.org
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • iavf: Fix hang during reboot/shutdown · 4477b9a4
      Ivan Vecera authored
      [ Upstream commit b04683ff ]
      
      Recent commit 97457801 ("iavf: Add waiting so the port is
      initialized in remove") adds a wait-loop at the beginning of
      iavf_remove() to ensure that port initialization is finished
      prior to unregistering the net device. This causes a regression
      in the reboot/shutdown scenario because in this case the callback
      iavf_shutdown() is called, and this callback detaches the device,
      brings it down if it is running and sets its state to __IAVF_REMOVE.
      Later the shutdown callback of the associated PF driver (e.g. ice_shutdown)
      is called. That callback calls, among other things, sriov_disable(),
      which indirectly calls iavf_remove() (see stack trace below).
      As the adapter state is already __IAVF_REMOVE, the mentioned
      loop is endless and the shutdown process hangs.
      
      The patch fixes this by checking the adapter's state at the beginning
      of iavf_remove() and skipping the rest of the function if the adapter
      is already in the remove state (shutdown is in progress).
      
      Reproducer:
      1. Create VF on PF driven by ice or i40e driver
      2. Ensure that the VF is bound to iavf driver
      3. Reboot
      
      [52625.981294] sysrq: SysRq : Show Blocked State
      [52625.988377] task:reboot          state:D stack:    0 pid:17359 ppid:     1 f2
      [52625.996732] Call Trace:
      [52625.999187]  __schedule+0x2d1/0x830
      [52626.007400]  schedule+0x35/0xa0
      [52626.010545]  schedule_hrtimeout_range_clock+0x83/0x100
      [52626.020046]  usleep_range+0x5b/0x80
      [52626.023540]  iavf_remove+0x63/0x5b0 [iavf]
      [52626.027645]  pci_device_remove+0x3b/0xc0
      [52626.031572]  device_release_driver_internal+0x103/0x1f0
      [52626.036805]  pci_stop_bus_device+0x72/0xa0
      [52626.040904]  pci_stop_and_remove_bus_device+0xe/0x20
      [52626.045870]  pci_iov_remove_virtfn+0xba/0x120
      [52626.050232]  sriov_disable+0x2f/0xe0
      [52626.053813]  ice_free_vfs+0x7c/0x340 [ice]
      [52626.057946]  ice_remove+0x220/0x240 [ice]
      [52626.061967]  ice_shutdown+0x16/0x50 [ice]
      [52626.065987]  pci_device_shutdown+0x34/0x60
      [52626.070086]  device_shutdown+0x165/0x1c5
      [52626.074011]  kernel_restart+0xe/0x30
      [52626.077593]  __do_sys_reboot+0x1d2/0x210
      [52626.093815]  do_syscall_64+0x5b/0x1a0
      [52626.097483]  entry_SYSCALL_64_after_hwframe+0x65/0xca
      
      Fixes: 97457801 ("iavf: Add waiting so the port is initialized in remove")
      Signed-off-by: Ivan Vecera <ivecera@redhat.com>
      Link: https://lore.kernel.org/r/20220317104524.2802848-1-ivecera@redhat.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • net: mscc: ocelot: fix backwards compatibility with single-chain tc-flower offload · d0618aae
      Vladimir Oltean authored
      [ Upstream commit 8e0341ae ]
      
      ACL rules can be offloaded to VCAP IS2 either through chain 0, or, since
      the blamed commit, through a chain index whose number encodes a specific
      PAG (Policy Action Group) and lookup number.
      
      The chain number is translated through ocelot_chain_to_pag() into a PAG,
      and through ocelot_chain_to_lookup() into a lookup number.
      
      The problem with the blamed commit is that the above 2 functions don't
      have special treatment for chain 0. So ocelot_chain_to_pag(0) returns
      filter->pag = 224, which is in fact -32, but the "pag" field is an u8.
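The arithmetic behind "224, which is in fact -32" can be checked with a tiny user-space sketch. `store_pag()` is a hypothetical helper that only mimics storing a signed intermediate result in an 8-bit unsigned field; it is not the driver's code.

```c
#include <stdint.h>

/* Store a signed intermediate result in an 8-bit unsigned field. */
static uint8_t store_pag(int value)
{
	return (uint8_t)value;	/* conversion is performed modulo 256 */
}
```

Since -32 + 256 = 224, a negative result silently becomes a large, valid-looking PAG value once it lands in a u8.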
      
      So we end up programming the hardware with VCAP IS2 entries having a PAG
      of 224. But the way in which the PAG works is that it defines a subset
      of VCAP IS2 filters which should match on a packet. The default PAG is
      0, and previous VCAP IS1 rules (which we offload using 'goto') can
      modify it. So basically, we are installing filters with a PAG on which
      no packet will ever match. This is the hardware equivalent of adding
      filters to a chain which has no 'goto' to it.
      
      Restore the previous functionality by making ACL filters offloaded to
      chain 0 go to PAG 0 and lookup number 0. The choice of PAG is clearly
      correct, but the choice of lookup number isn't "as before" (which was to
      leave the lookup a "don't care"). However, lookup 0 should be fine,
      since even though there are ACL actions (policers) which have a
      requirement to be used in a specific lookup, that lookup is 0.
      
      Fixes: 226e9cd8 ("net: mscc: ocelot: only install TCAM entries into a specific lookup and PAG")
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Link: https://lore.kernel.org/r/20220316192117.2568261-1-vladimir.oltean@nxp.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • net: bcmgenet: skip invalid partial checksums · 65a2e32d
      Doug Berger authored
      [ Upstream commit 0f643c88 ]
      
      The RXCHK block will return a partial checksum of 0 if it encounters
      a problem while receiving a packet. Since a 1's complement sum can
      only produce this result if no bits are set in the received data
      stream it is fair to treat it as an invalid partial checksum and
      not pass it up the stack.
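The claim that a 1's complement sum is zero only for all-zero data follows from the end-around carry: any carry out of the top bit is added back in, so a nonzero input can yield 0xFFFF but never 0x0000. The sketch below is a generic user-space illustration with the hypothetical helper `ones_sum16()`; it makes no assumption about how the RXCHK hardware computes its value.

```c
#include <stddef.h>
#include <stdint.h>

/* 16-bit one's complement sum with end-around carry (no final inversion). */
static uint16_t ones_sum16(const uint16_t *words, size_t n)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		sum += words[i];
		/* Fold the carry out of bit 15 back into the low 16 bits. */
		sum = (sum & 0xFFFFu) + (sum >> 16);
	}
	return (uint16_t)sum;
}
```

For example, 0x0001 + 0xFFFE gives 0xFFFF, and 0xFFFF + 0x0001 wraps to 0x0001 rather than 0x0000.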
      
      Fixes: 81015539 ("net: bcmgenet: use CHECKSUM_COMPLETE for NETIF_F_RXCSUM")
      Signed-off-by: Doug Berger <opendmb@gmail.com>
      Acked-by: Florian Fainelli <f.fainelli@gmail.com>
      Link: https://lore.kernel.org/r/20220317012812.1313196-1-opendmb@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • bnx2x: fix built-in kernel driver load failure · 851edd04
      Manish Chopra authored
      [ Upstream commit 424e7834 ]
      
      Commit b7a49f73 ("bnx2x: Utilize firmware 7.13.21.0")
      added request_firmware() logic in probe() which caused
      load failure when firmware file is not present in initrd (below),
      as access to firmware file is not feasible during probe.
      
        Direct firmware load for bnx2x/bnx2x-e2-7.13.15.0.fw failed with error -2
        Direct firmware load for bnx2x/bnx2x-e2-7.13.21.0.fw failed with error -2
      
      This patch fixes this issue by:
      
      1. Removing the request_firmware() logic from probe()
         so that .ndo_open() handles it, as it did before.
      
      2. Relaxing the FW version comparison against the already
         loaded FW version (loaded by other PFs of the same
         adapter). Since request_firmware() is removed from probe(),
         a PF in the probe flow no longer knows which firmware
         file version it will use to initialize the device in
         ndo_open(), so different compatible/close enough FWs must be
         allowed for multiple PFs running in different environments.
      
      Link: https://lore.kernel.org/all/46f2d9d9-ae7f-b332-ddeb-b59802be2bab@molgen.mpg.de/
      Reported-by: Paul Menzel <pmenzel@molgen.mpg.de>
      Tested-by: Paul Menzel <pmenzel@molgen.mpg.de>
      Fixes: b7a49f73 ("bnx2x: Utilize firmware 7.13.21.0")
      Signed-off-by: Manish Chopra <manishc@marvell.com>
      Signed-off-by: Ariel Elior <aelior@marvell.com>
      Link: https://lore.kernel.org/r/20220316214613.6884-1-manishc@marvell.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • net: phy: mscc: Add MODULE_FIRMWARE macros · 2fad4864
      Juerg Haefliger authored
      [ Upstream commit f1858c27 ]
      
      The driver requires firmware so define MODULE_FIRMWARE so that modinfo
      provides the details.
      
      Fixes: fa164e40 ("net: phy: mscc: split the driver into separate files")
      Signed-off-by: Juerg Haefliger <juergh@canonical.com>
      Link: https://lore.kernel.org/r/20220316151835.88765-1-juergh@canonical.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • net: dsa: Add missing of_node_put() in dsa_port_parse_of · fd22b43b
      Miaoqian Lin authored
      [ Upstream commit cb0b430b ]
      
      The device_node pointer is returned by of_parse_phandle() with its
      refcount incremented. We should call of_node_put() on it when done.
      
      Fixes: 6d4e5c57 ("net: dsa: get port type at parse time")
      Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
      Link: https://lore.kernel.org/r/20220316082602.10785-1-linmq006@gmail.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • drm: Don't make DRM_PANEL_BRIDGE dependent on DRM_KMS_HELPERS · 939b6ebc
      Thomas Zimmermann authored
      [ Upstream commit 3c338405 ]
      
      Fix a number of undefined references to drm_kms_helper.ko in
      drm_dp_helper.ko:
      
        arm-suse-linux-gnueabi-ld: drivers/gpu/drm/dp/drm_dp_mst_topology.o: in function `drm_dp_mst_duplicate_state':
        drm_dp_mst_topology.c:(.text+0x2df0): undefined reference to `__drm_atomic_helper_private_obj_duplicate_state'
        arm-suse-linux-gnueabi-ld: drivers/gpu/drm/dp/drm_dp_mst_topology.o: in function `drm_dp_delayed_destroy_work':
        drm_dp_mst_topology.c:(.text+0x370c): undefined reference to `drm_kms_helper_hotplug_event'
        arm-suse-linux-gnueabi-ld: drivers/gpu/drm/dp/drm_dp_mst_topology.o: in function `drm_dp_mst_up_req_work':
        drm_dp_mst_topology.c:(.text+0x7938): undefined reference to `drm_kms_helper_hotplug_event'
        arm-suse-linux-gnueabi-ld: drivers/gpu/drm/dp/drm_dp_mst_topology.o: in function `drm_dp_mst_link_probe_work':
        drm_dp_mst_topology.c:(.text+0x82e0): undefined reference to `drm_kms_helper_hotplug_event'
      
      This happens if panel-edp.ko has been configured with
      
        DRM_PANEL_EDP=y
        DRM_DP_HELPER=y
        DRM_KMS_HELPER=m
      
      which builds DP helpers into the kernel and KMS helpers as a module.
      Making DRM_PANEL_EDP select DRM_KMS_HELPER resolves this problem.
      
      To avoid a resulting cyclic dependency with DRM_PANEL_BRIDGE, don't
      make the latter depend on DRM_KMS_HELPER and fix the one DRM bridge
      driver that doesn't already select DRM_KMS_HELPER. As KMS helpers
      cannot be selected directly by the user, config symbols should avoid
      depending on them anyway.
      
      Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
      Fixes: 3755d35e ("drm/panel: Select DRM_DP_HELPER for DRM_PANEL_EDP")
      Acked-by: Sam Ravnborg <sam@ravnborg.org>
      Tested-by: Brian Masney <bmasney@redhat.com>
      Reported-by: kernel test robot <lkp@intel.com>
      Cc: Thomas Zimmermann <tzimmermann@suse.de>
      Cc: Naresh Kamboju <naresh.kamboju@linaro.org>
      Cc: Linux Kernel Functional Testing <lkft@linaro.org>
      Cc: Lyude Paul <lyude@redhat.com>
      Cc: Sam Ravnborg <sam@ravnborg.org>
      Cc: Daniel Vetter <daniel@ffwll.ch>
      Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
      Cc: Maxime Ripard <mripard@kernel.org>
      Cc: dri-devel@lists.freedesktop.org
      Cc: Dave Airlie <airlied@redhat.com>
      Cc: Thierry Reding <thierry.reding@gmail.com>
      Link: https://patchwork.freedesktop.org/patch/478296/
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • net: handle ARPHRD_PIMREG in dev_is_mac_header_xmit() · a50011f2
      Nicolas Dichtel authored
      [ Upstream commit 4ee06de7 ]
      
      This kind of interface doesn't have a mac header. This patch fixes
      bpf_redirect() to a PIM interface.
      
      Fixes: 27b29f63 ("bpf: add bpf_redirect() helper")
      Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Link: https://lore.kernel.org/r/20220315092008.31423-1-nicolas.dichtel@6wind.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • drm/panel: simple: Fix Innolux G070Y2-L01 BPP settings · c40fe00d
      Marek Vasut authored
      [ Upstream commit fc1b6ef7 ]
      
      The Innolux G070Y2-L01 supports two modes of operation:
      1) FRC=Low/NC ... MEDIA_BUS_FMT_RGB666_1X7X3_SPWG ... BPP=6
      2) FRC=High ..... MEDIA_BUS_FMT_RGB888_1X7X4_SPWG ... BPP=8
      
      Currently the panel description mixes both, BPP from 1) and bus
      format from 2), which triggers a warning at panel-simple.c:615.
      
      Pick the latter, set bpp=8, fix the warning.
      
      Fixes: a5d2ade6 ("drm/panel: simple: Add support for Innolux G070Y2-L01")
      Signed-off-by: Marek Vasut <marex@denx.de>
      Cc: Christoph Fritz <chf.fritz@googlemail.com>
      Cc: Laurent Pinchart <Laurent.pinchart@ideasonboard.com>
      Cc: Maxime Ripard <maxime@cerno.tech>
      Cc: Sam Ravnborg <sam@ravnborg.org>
      Cc: Thomas Zimmermann <tzimmermann@suse.de>
      Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20220220040718.532866-1-marex@denx.de
      Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • drm/imx: parallel-display: Remove bus flags check in imx_pd_bridge_atomic_check() · a2f9aaa0
      Christoph Niedermaier authored
      [ Upstream commit 6061806a ]
      
      If display timings were read from the devicetree using
      of_get_display_timing() and pixelclk-active is defined
      there, the flag DISPLAY_FLAGS_SYNC_POSEDGE/NEGEDGE is
      automatically generated. Through the function
      drm_bus_flags_from_videomode(), called e.g. in the
      panel-simple driver, this flag ends up in the bus flags,
      but then the bus flag check in imx_pd_bridge_atomic_check()
      fails and the display is not initialized. The
      original commit fe141ced does not explain why this
      check was introduced. So remove the bus flags check,
      because it stops the initialization of the display with
      valid bus flags.
      
      Fixes: fe141ced ("drm/imx: pd: Use bus format/flags provided by the bridge when available")
      Signed-off-by: Christoph Niedermaier <cniedermaier@dh-electronics.com>
      Cc: Marek Vasut <marex@denx.de>
      Cc: Boris Brezillon <boris.brezillon@collabora.com>
      Cc: Philipp Zabel <p.zabel@pengutronix.de>
      Cc: David Airlie <airlied@linux.ie>
      Cc: Daniel Vetter <daniel@ffwll.ch>
      Cc: Shawn Guo <shawnguo@kernel.org>
      Cc: Sascha Hauer <s.hauer@pengutronix.de>
      Cc: Pengutronix Kernel Team <kernel@pengutronix.de>
      Cc: Fabio Estevam <festevam@gmail.com>
      Cc: NXP Linux Team <linux-imx@nxp.com>
      Cc: linux-arm-kernel@lists.infradead.org
      To: dri-devel@lists.freedesktop.org
      Tested-by: Max Krummenacher <max.krummenacher@toradex.com>
      Acked-by: Boris Brezillon <boris.brezillon@collabora.com>
      Signed-off-by: Marek Vasut <marex@denx.de>
      Link: https://patchwork.freedesktop.org/patch/msgid/20220201113643.4638-1-cniedermaier@dh-electronics.com
      Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • hv_netvsc: Add check for kvmalloc_array · 411e256d
      Jiasheng Jiang authored
      [ Upstream commit 886e44c9 ]
      
      Because kvmalloc_array() can fail, check its return
      value and restore 'data' on failure in order to avoid
      dereferencing a NULL pointer.
      
      Fixes: 6ae74671 ("hv_netvsc: Add per-cpu ethtool stats for netvsc")
      Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn>
      Link: https://lore.kernel.org/r/20220314020125.2365084-1-jiasheng@iscas.ac.cn
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • iavf: Fix double free in iavf_reset_task · ed1d0411
      Przemyslaw Patynowski authored
      [ Upstream commit 16b2dd8c ]
      
      Fix double free possibility in iavf_disable_vf, as crit_lock is
      freed in caller, iavf_reset_task. Add kernel-doc for iavf_disable_vf.
      Remove mutex_unlock in iavf_disable_vf.
      Without this patch there is double free scenario, when calling
      iavf_reset_task.
      
      Fixes: e85ff9c6 ("iavf: Fix deadlock in iavf_reset_task")
      Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com>
      Suggested-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
      Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • ice: fix NULL pointer dereference in ice_update_vsi_tx_ring_stats() · 2397270e
      Maciej Fijalkowski authored
      [ Upstream commit f1535469 ]
      
      It is possible to do NULL pointer dereference in routine that updates
      Tx ring stats. Currently only stats and bytes are updated when ring
      pointer is valid, but later on ring is accessed to propagate gathered Tx
      stats onto VSI stats.
      
      Change the existing logic to move to next ring when ring is NULL.
      
      Fixes: e72bba21 ("ice: split ice_ring onto Tx/Rx separate structs")
      Reported-by: kernel test robot <lkp@intel.com>
      Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
      Acked-by: Alexander Lobakin <alexandr.lobakin@intel.com>
      Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel)
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • atm: eni: Add check for dma_map_single · 12494334
      Jiasheng Jiang authored
      [ Upstream commit 0f74b29a ]
      
      Because dma_map_single() can fail, check its result
      and return an error on failure.
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • nvmet: revert "nvmet: make discovery NQN configurable" · 80e4153c
      Hannes Reinecke authored
      [ Upstream commit 0c48645a ]
      
      Revert commit 626851e9 ("nvmet: make discovery NQN configurable");
      the interface was deemed incorrect and will be replaced with a different
      one.
      
      Fixes: 626851e9 ("nvmet: make discovery NQN configurable")
      Signed-off-by: Hannes Reinecke <hare@suse.de>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • net/packet: fix slab-out-of-bounds access in packet_recvmsg() · ef591b35
      Eric Dumazet authored
      [ Upstream commit c700525f ]
      
      syzbot found that when an AF_PACKET socket is using PACKET_COPY_THRESH
      and mmap operations, tpacket_rcv() is queueing skbs with
      garbage in skb->cb[], triggering a too big copy [1]
      
      Presumably, users of af_packet using mmap() already get correct
      metadata from the mapped buffer, so we can simply make sure
      to clear 12 bytes that might be copied to user space later.
      
      BUG: KASAN: stack-out-of-bounds in memcpy include/linux/fortify-string.h:225 [inline]
      BUG: KASAN: stack-out-of-bounds in packet_recvmsg+0x56c/0x1150 net/packet/af_packet.c:3489
      Write of size 165 at addr ffffc9000385fb78 by task syz-executor233/3631
      
      CPU: 0 PID: 3631 Comm: syz-executor233 Not tainted 5.17.0-rc7-syzkaller-02396-g0b3660695e80 #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       <TASK>
       __dump_stack lib/dump_stack.c:88 [inline]
       dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
       print_address_description.constprop.0.cold+0xf/0x336 mm/kasan/report.c:255
       __kasan_report mm/kasan/report.c:442 [inline]
       kasan_report.cold+0x83/0xdf mm/kasan/report.c:459
       check_region_inline mm/kasan/generic.c:183 [inline]
       kasan_check_range+0x13d/0x180 mm/kasan/generic.c:189
       memcpy+0x39/0x60 mm/kasan/shadow.c:66
       memcpy include/linux/fortify-string.h:225 [inline]
       packet_recvmsg+0x56c/0x1150 net/packet/af_packet.c:3489
       sock_recvmsg_nosec net/socket.c:948 [inline]
       sock_recvmsg net/socket.c:966 [inline]
       sock_recvmsg net/socket.c:962 [inline]
       ____sys_recvmsg+0x2c4/0x600 net/socket.c:2632
       ___sys_recvmsg+0x127/0x200 net/socket.c:2674
       __sys_recvmsg+0xe2/0x1a0 net/socket.c:2704
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      RIP: 0033:0x7fdfd5954c29
      Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 41 15 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48
      RSP: 002b:00007ffcf8e71e48 EFLAGS: 00000246 ORIG_RAX: 000000000000002f
      RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007fdfd5954c29
      RDX: 0000000000000000 RSI: 0000000020000500 RDI: 0000000000000005
      RBP: 0000000000000000 R08: 000000000000000d R09: 000000000000000d
      R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffcf8e71e60
      R13: 00000000000f4240 R14: 000000000000c1ff R15: 00007ffcf8e71e54
       </TASK>
      
      addr ffffc9000385fb78 is located in stack of task syz-executor233/3631 at offset 32 in frame:
       ____sys_recvmsg+0x0/0x600 include/linux/uio.h:246
      
      this frame has 1 object:
       [32, 160) 'addr'
      
      Memory state around the buggy address:
       ffffc9000385fa80: 00 04 f3 f3 f3 f3 f3 00 00 00 00 00 00 00 00 00
       ffffc9000385fb00: 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00
      >ffffc9000385fb80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f3
                                                                      ^
       ffffc9000385fc00: f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00 f1
       ffffc9000385fc80: f1 f1 f1 00 f2 f2 f2 00 f2 f2 f2 00 00 00 00 00
      ==================================================================
      
      Fixes: 0fb375fb ("[AF_PACKET]: Allow for > 8 byte hardware addresses.")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Link: https://lore.kernel.org/r/20220312232958.3535620-1-eric.dumazet@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • net: phy: marvell: Fix invalid comparison in the resume and suspend functions · 4df8f62d
      Kurt Cancemi authored
      [ Upstream commit 837d9e49 ]
      
      This bug resulted in only the current mode being resumed and suspended
      when the PHY supported both fiber and copper modes. When the PHY only
      supported copper mode, the driver would incorrectly attempt to resume
      and suspend the fiber mode.
      
      Fixes: 3758be3d ("Marvell phy: add functions to suspend and resume both interfaces: fiber and copper links.")
      Signed-off-by: Kurt Cancemi <kurt@x64architecture.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Link: https://lore.kernel.org/r/20220312201512.326047-1-kurt@x64architecture.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • esp6: fix check on ipv6_skip_exthdr's return value · 4feb2e0b
      Sabrina Dubroca authored
      [ Upstream commit 4db4075f ]
      
      Commit 5f9c55c8 ("ipv6: check return value of ipv6_skip_exthdr")
      introduced an incorrect check, which leads to all ESP packets over
      either TCPv6 or UDPv6 encapsulation being dropped. In this particular
      case, offset is negative, since skb->data points to the ESP header in
      the following chain of headers, while skb->network_header points to
      the IPv6 header:
      
          IPv6 | ext | ... | ext | UDP | ESP | ...
      
      That doesn't seem to be a problem, especially considering that if we
      reach esp6_input_done2, we're guaranteed to have a full set of headers
      available (otherwise the packet would have been dropped earlier in the
      stack). However, it means that the return value will (intentionally)
      be negative. We can make the test more specific, as the expected
      return value of ipv6_skip_exthdr will be the (negated) size of either
      a UDP header, or a TCP header with possible options.
      
      In the future, we should probably either make ipv6_skip_exthdr
      explicitly accept negative offsets (and adjust its return value for
      error cases), or make ipv6_skip_exthdr only take non-negative
      offsets (and audit all callers).
      
      Fixes: 5f9c55c8 ("ipv6: check return value of ipv6_skip_exthdr")
      Reported-by: Xiumei Mu <xmu@redhat.com>
      Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • vsock: each transport cycles only on its own sockets · 76cd8ac3
      Jiyong Park authored
      [ Upstream commit 8e6ed963 ]
      
      When iterating over sockets using vsock_for_each_connected_socket, make
      sure that a transport filters out sockets that don't belong to the
      transport.
      
      There actually was an issue caused by this; in a nested VM
      configuration, destroying the nested VM (which often involves the
      closing of /dev/vhost-vsock if there were h2g connections to the nested
      VM) kills not only the h2g connections, but also all existing g2h
      connections to the (outermost) host, which are totally unrelated.
      
      Tested: Executed the following steps on Cuttlefish (Android running on a
      VM) [1]: (1) Enter into an `adb shell` session - to have a g2h
      connection inside the VM, (2) open and then close /dev/vhost-vsock by
      `exec 3< /dev/vhost-vsock && exec 3<&-`, (3) observe that the adb
      session is not reset.
      
      [1] https://android.googlesource.com/device/google/cuttlefish/
      
      Fixes: c0cfa2d8 ("vsock: add multi-transports support")
      Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
      Acked-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: Jiyong Park <jiyong@google.com>
      Link: https://lore.kernel.org/r/20220311020017.1509316-1-jiyong@google.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • alx: acquire mutex for alx_reinit in alx_change_mtu · f1bcbfea
      Niels Dossche authored
      [ Upstream commit 46b348fd ]
      
      alx_reinit has a lockdep assertion that the alx->mtx mutex must be held.
      alx_reinit is called from two places: alx_reset and alx_change_mtu.
      alx_reset does acquire alx->mtx before calling alx_reinit.
      alx_change_mtu does not acquire this mutex, nor do its callers or any
      path towards alx_change_mtu.
      Acquire the mutex in alx_change_mtu.
      
      The issue was introduced when the fine-grained locking was introduced
      to the code to replace the RTNL. The same commit also introduced the
      lockdep assertion.
      
      Fixes: 4a5fe57e ("alx: use fine-grained locking instead of RTNL")
      Signed-off-by: Niels Dossche <dossche.niels@gmail.com>
      Link: https://lore.kernel.org/r/20220310232707.44251-1-dossche.niels@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • efi: fix return value of __setup handlers · ddac7764
      Randy Dunlap authored
      [ Upstream commit 9feaf8b3 ]
      
      When "dump_apple_properties" is used on the kernel boot command line,
      it causes an Unknown parameter message and the string is added to init's
      argument strings:
      
        Unknown kernel command line parameters "dump_apple_properties
          BOOT_IMAGE=/boot/bzImage-517rc6 efivar_ssdt=newcpu_ssdt", will be
          passed to user space.
      
       Run /sbin/init as init process
         with arguments:
           /sbin/init
           dump_apple_properties
         with environment:
           HOME=/
           TERM=linux
           BOOT_IMAGE=/boot/bzImage-517rc6
           efivar_ssdt=newcpu_ssdt
      
      Similarly when "efivar_ssdt=somestring" is used, it is added to the
      Unknown parameter message and to init's environment strings, polluting
      them (see examples above).
      
      Change the return value of the __setup functions to 1 to indicate
      that the __setup options have been handled.
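
The effect of the return value can be sketched with a simplified userspace model in Python. It is an illustration under assumptions, not the kernel's parser: `parse_cmdline` and `make_handlers` are hypothetical names, and the only rule modelled is that an option counts as handled when its handler returns non-zero, while unhandled options are passed on to init as arguments (bare words) or environment strings (`name=value`).

```python
def parse_cmdline(cmdline, setup_handlers):
    """Split boot options into (init_args, init_env) leftovers.

    An option is consumed only when a registered handler returns non-zero.
    """
    init_args, init_env = [], []
    for option in cmdline.split():
        name, eq, value = option.partition("=")
        handler = setup_handlers.get(name + eq)
        if handler is not None and handler(value) != 0:
            continue  # handled by the kernel, not passed on
        # Unhandled: "name=value" goes to the environment, bare words to argv.
        (init_env if eq else init_args).append(option)
    return init_args, init_env


def make_handlers(retval):
    """Build hypothetical handlers that all return `retval`."""
    return {
        "dump_apple_properties": lambda _value: retval,
        "efivar_ssdt=": lambda _value: retval,
    }


CMDLINE = "dump_apple_properties efivar_ssdt=newcpu_ssdt"
before = parse_cmdline(CMDLINE, make_handlers(0))  # handlers return 0
after = parse_cmdline(CMDLINE, make_handlers(1))   # handlers return 1
```

With handlers returning 0, both options leak through to init in this model; with handlers returning 1, nothing is passed on.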
      
      Fixes: 58c5475a ("x86/efi: Retrieve and assign Apple device properties")
      Fixes: 475fb4e8 ("efi / ACPI: load SSTDs from EFI variables")
      Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
      Reported-by: Igor Zhbanov <i.zhbanov@omprussia.ru>
      Link: lore.kernel.org/r/64644a2f-4a20-bab3-1e15-3b2cdd0defe3@omprussia.ru
      Cc: Ard Biesheuvel <ardb@kernel.org>
      Cc: linux-efi@vger.kernel.org
      Cc: Lukas Wunner <lukas@wunner.de>
      Cc: Octavian Purdila <octavian.purdila@intel.com>
      Cc: "Rafael J. Wysocki" <rafael@kernel.org>
      Cc: Matt Fleming <matt@codeblueprint.co.uk>
      Link: https://lore.kernel.org/r/20220301041851.12459-1-rdunlap@infradead.org
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • drm/mgag200: Fix PLL setup for g200wb and g200ew · 9b11e64e
      Jocelyn Falempe authored
      commit 40ce1121 upstream.
      
      Commit f86c3ed5 ("drm/mgag200: Split PLL setup into compute and
      update functions") introduced a regression for g200wb and g200ew.
      The PLLs are not set up properly, and the VGA screen stays
      black or displays an "out of range" message.
      
      MGA1064_WB_PIX_PLLC_N/M/P was mistakenly replaced with
      MGA1064_PIX_PLLC_N/M/P which have different addresses.
      
      Patch tested on a Dell T310 with g200wb
      
      Fixes: f86c3ed5 ("drm/mgag200: Split PLL setup into compute and update functions")
      Cc: stable@vger.kernel.org
      Signed-off-by: Jocelyn Falempe <jfalempe@redhat.com>
      Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
      Link: https://patchwork.freedesktop.org/patch/msgid/20220308174321.225606-1-jfalempe@redhat.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • block: release rq qos structures for queue without disk · 60c2c8e2
      Ming Lei authored
      commit daaca352 upstream.
      
      blkcg_init_queue() may add rq qos structures to a request queue.
      Previously blk_cleanup_queue() called rq_qos_exit() to release them,
      but commit 8e141f9e ("block: drain file system I/O on del_gendisk")
      moved rq_qos_exit() into del_gendisk().  This causes a memory leak,
      because some queues never have a disk, such as queues for non-present
      SCSI LUNs, the NVMe admin queue, ...
      
      Fix the issue by adding rq_qos_exit() back to blk_cleanup_queue().
      
      BTW, v5.18 won't need this patch any more since we move
      blkcg_init_queue()/blkcg_exit_queue() into disk allocation/release
      handler, and patches have been in for-5.18/block.
      
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: stable@vger.kernel.org
      Fixes: 8e141f9e ("block: drain file system I/O on del_gendisk")
      Reported-by: <syzbot+b42749a851a47a0f581b@syzkaller.appspotmail.com>
      Signed-off-by: Ming Lei <ming.lei@redhat.com>
      Reviewed-by: Bart Van Assche <bvanassche@acm.org>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Link: https://lore.kernel.org/r/20220314043018.177141-1-ming.lei@redhat.com
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mm: swap: get rid of livelock in swapin readahead · 8728a9a4
      Guo Ziliang authored
      commit 029c4628 upstream.
      
      In our testing, a livelocked task was found.  Through sysrq printing, the
      same stack was found every time, as follows:
      
        __swap_duplicate+0x58/0x1a0
        swapcache_prepare+0x24/0x30
        __read_swap_cache_async+0xac/0x220
        read_swap_cache_async+0x58/0xa0
        swapin_readahead+0x24c/0x628
        do_swap_page+0x374/0x8a0
        __handle_mm_fault+0x598/0xd60
        handle_mm_fault+0x114/0x200
        do_page_fault+0x148/0x4d0
        do_translation_fault+0xb0/0xd4
        do_mem_abort+0x50/0xb0
      
      The reason for the livelock is that swapcache_prepare() always returns
      EEXIST, indicating that SWAP_HAS_CACHE has not been cleared, so that it
      cannot jump out of the loop.  We suspect that the task that clears the
      SWAP_HAS_CACHE flag never gets a chance to run.  We try to lower the
      priority of the task stuck in a livelock so that the task that clears
      the SWAP_HAS_CACHE flag will run.  The results show that the system
      returns to normal after the priority is lowered.
      
      In our testing, multiple real-time tasks are bound to the same core, and
      the task in the livelock is the highest priority task of the core, so
      the livelocked task cannot be preempted.
      
      Although cond_resched() is used by __read_swap_cache_async(), it is an
      empty function on a preemptible kernel and cannot achieve the purpose
      of releasing the CPU.  A high-priority task cannot release the CPU
      unless preempted by a higher-priority task.  But when this task is
      already the highest priority task on this core, other tasks will not be
      able to be scheduled.  So we think we should replace cond_resched() with
      schedule_timeout_uninterruptible(1).  schedule_timeout_uninterruptible()
      sets the task state first, so the task is removed from the run queue,
      which gives up the CPU and prevents the task from running in kernel
      mode for too long.
      
      (akpm: ugly hack becomes uglier.  But it fixes the issue in a
      backportable-to-stable fashion while we hopefully work on something
      better)
      
      Link: https://lkml.kernel.org/r/20220221111749.1928222-1-cgel.zte@gmail.com
      Signed-off-by: Guo Ziliang <guo.ziliang@zte.com.cn>
      Reported-by: Zeal Robot <zealci@zte.com.cn>
      Reviewed-by: Ran Xiaokai <ran.xiaokai@zte.com.cn>
      Reviewed-by: Jiang Xuexin <jiang.xuexin@zte.com.cn>
      Reviewed-by: Yang Yang <yang.yang29@zte.com.cn>
      Acked-by: Hugh Dickins <hughd@google.com>
      Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Roger Quadros <rogerq@kernel.org>
      Cc: Ziliang Guo <guo.ziliang@zte.com.cn>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ocfs2: fix crash when initialize filecheck kobj fails · ae102801
      Joseph Qi authored
      commit 7b0b1332 upstream.
      
      Once s_root is set, generic_shutdown_super() will be called if
      fill_super() fails.  That means we will call ocfs2_dismount_volume()
      twice in that case, which can lead to a kernel crash.
      
      Fix this issue by initializing filecheck kobj before setting s_root.
      
      Link: https://lkml.kernel.org/r/20220310081930.86305-1-joseph.qi@linux.alibaba.com
      Fixes: 5f483c4a ("ocfs2: add kobject for online file check")
      Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
      Cc: Mark Fasheh <mark@fasheh.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Junxiao Bi <junxiao.bi@oracle.com>
      Cc: Changwei Ge <gechangwei@live.cn>
      Cc: Gang He <ghe@suse.com>
      Cc: Jun Piao <piaojun@huawei.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • crypto: qcom-rng - ensure buffer for generate is completely filled · 485995cb
      Brian Masney authored
      commit a680b183 upstream.
      
      The generate function in struct rng_alg expects that the destination
      buffer is completely filled if the function returns 0. qcom_rng_read()
      can run into a situation where the buffer is partially filled with
      randomness and the remaining part of the buffer is zeroed since
      qcom_rng_generate() doesn't check the return value. This issue can
      be reproduced by running the following from libkcapi:
      
          kcapi-rng -b 9000000 > OUTFILE
      
      The generated OUTFILE will have three huge sections that contain all
      zeros, and this is caused by the code where the test
      'val & PRNG_STATUS_DATA_AVAIL' fails.
      
      Let's fix this issue by ensuring that qcom_rng_read() always returns
      with a full buffer if the function returns success. Let's also have
      qcom_rng_generate() return the correct value.
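
The difference between the two behaviours can be sketched with a userspace model in Python. Everything here is an assumption made for illustration: `FakePrng` is a hypothetical device that reports "no data" on every Nth status poll, the read and generate steps are merged into one function, and the bounded polling with a timeout error is a choice of this sketch rather than something stated in the text above.

```python
import errno

WORD_SIZE = 4


class FakePrng:
    """Hypothetical device that reports "no data" on every Nth status poll."""

    def __init__(self, stall_every=3):
        self.stall_every = stall_every
        self.polls = 0

    def data_available(self):
        self.polls += 1
        return self.polls % self.stall_every != 0

    def read_word(self):
        return bytes([0xA5]) * WORD_SIZE  # placeholder, not real randomness


def read_before(dev, size):
    """Pre-fix behaviour: stop at the first stall, still report success."""
    buf = bytearray(size)
    filled = 0
    while filled < size:
        if not dev.data_available():
            break  # the tail of buf stays zeroed
        chunk = dev.read_word()[: size - filled]
        buf[filled:filled + len(chunk)] = chunk
        filled += len(chunk)
    return 0, bytes(buf)


def read_after(dev, size, max_polls=100):
    """Post-fix behaviour: wait for data; return 0 only with a full buffer."""
    buf = bytearray(size)
    filled = 0
    while filled < size:
        for _ in range(max_polls):
            if dev.data_available():
                break
        else:
            return -errno.ETIMEDOUT, bytes(buf)  # never claim success here
        chunk = dev.read_word()[: size - filled]
        buf[filled:filled + len(chunk)] = chunk
        filled += len(chunk)
    return 0, bytes(buf)
```

In this model, `read_before()` returns success with a zeroed tail as soon as the device stalls once, while `read_after()` either fills the whole buffer or returns an error.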
      
      Here are some statistics from the ent project
      (https://www.fourmilab.ch/random/) that show information about the
      quality of the generated numbers:
      
          $ ent -c qcom-random-before
          Value Char Occurrences Fraction
            0           606748   0.067416
            1            33104   0.003678
            2            33001   0.003667
          ...
          253   ý        32883   0.003654
          254   þ        33035   0.003671
          255   ÿ        33239   0.003693
      
          Total:       9000000   1.000000
      
          Entropy = 7.811590 bits per byte.
      
          Optimum compression would reduce the size
          of this 9000000 byte file by 2 percent.
      
          Chi square distribution for 9000000 samples is 9329962.81, and
          randomly would exceed this value less than 0.01 percent of the
          times.
      
          Arithmetic mean value of data bytes is 119.3731 (127.5 = random).
          Monte Carlo value for Pi is 3.197293333 (error 1.77 percent).
          Serial correlation coefficient is 0.159130 (totally uncorrelated =
          0.0).
      
      Without this patch, the chi-square result would be exceeded randomly
      less than 0.01% of the time, and the numbers are certainly not random
      according to ent's project page.  The results improve with this patch:
      
          $ ent -c qcom-random-after
          Value Char Occurrences Fraction
            0            35432   0.003937
            1            35127   0.003903
            2            35424   0.003936
          ...
          253   ý        35201   0.003911
          254   þ        34835   0.003871
          255   ÿ        35368   0.003930
      
          Total:       9000000   1.000000
      
          Entropy = 7.999979 bits per byte.
      
          Optimum compression would reduce the size
          of this 9000000 byte file by 0 percent.
      
          Chi square distribution for 9000000 samples is 258.77, and randomly
          would exceed this value 42.24 percent of the times.
      
          Arithmetic mean value of data bytes is 127.5006 (127.5 = random).
          Monte Carlo value for Pi is 3.141277333 (error 0.01 percent).
          Serial correlation coefficient is 0.000468 (totally uncorrelated =
          0.0).
      
      This change was tested on a Nexus 5 phone (msm8974 SoC).
      
      Signed-off-by: Brian Masney <bmasney@redhat.com>
      Fixes: ceec5f5b ("crypto: qcom-rng - Add Qcom prng driver")
      Cc: stable@vger.kernel.org # 4.19+
      Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
      Reviewed-by: Andrew Halaney <ahalaney@redhat.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  2. Mar 19, 2022
    • ice: Fix race condition during interface enslave · e1014fc5
      Ivan Vecera authored
      commit 5cb1ebdb upstream.
      
      Commit 5dbbbd01 ("ice: Avoid RTNL lock when re-creating
      auxiliary device") changed the re-creation of the aux device
      so that ice_plug_aux_dev() is called from ice_service_task() context.
      This unfortunately opens a race window that can result in a deadlock
      when an interface has left a LAG and immediately enters a LAG again.
      
      Reproducer:
      ```
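      #!/bin/bash
      # Brace expansion in the loop below is a bash feature, not POSIX sh.
      
      ip link add lag0 type bond mode 1 miimon 100
      ip link set lag0
      
      # Repeatedly enslave and release the interface to hit the race window.
      for n in {1..10}; do
              echo "Cycle: $n"
              ip link set ens7f0 master lag0
              sleep 1
              ip link set ens7f0 nomaster
      done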
      ```
      
      This results in:
      [20976.208697] Workqueue: ice ice_service_task [ice]
      [20976.213422] Call Trace:
      [20976.215871]  __schedule+0x2d1/0x830
      [20976.219364]  schedule+0x35/0xa0
      [20976.222510]  schedule_preempt_disabled+0xa/0x10
      [20976.227043]  __mutex_lock.isra.7+0x310/0x420
      [20976.235071]  enum_all_gids_of_dev_cb+0x1c/0x100 [ib_core]
      [20976.251215]  ib_enum_roce_netdev+0xa4/0xe0 [ib_core]
      [20976.256192]  ib_cache_setup_one+0x33/0xa0 [ib_core]
      [20976.261079]  ib_register_device+0x40d/0x580 [ib_core]
      [20976.266139]  irdma_ib_register_device+0x129/0x250 [irdma]
      [20976.281409]  irdma_probe+0x2c1/0x360 [irdma]
      [20976.285691]  auxiliary_bus_probe+0x45/0x70
      [20976.289790]  really_probe+0x1f2/0x480
      [20976.298509]  driver_probe_device+0x49/0xc0
      [20976.302609]  bus_for_each_drv+0x79/0xc0
      [20976.306448]  __device_attach+0xdc/0x160
      [20976.310286]  bus_probe_device+0x9d/0xb0
      [20976.314128]  device_add+0x43c/0x890
      [20976.321287]  __auxiliary_device_add+0x43/0x60
      [20976.325644]  ice_plug_aux_dev+0xb2/0x100 [ice]
      [20976.330109]  ice_service_task+0xd0c/0xed0 [ice]
      [20976.342591]  process_one_work+0x1a7/0x360
      [20976.350536]  worker_thread+0x30/0x390
      [20976.358128]  kthread+0x10a/0x120
      [20976.365547]  ret_from_fork+0x1f/0x40
      ...
      [20976.438030] task:ip              state:D stack:    0 pid:213658 ppid:213627 flags:0x00004084
      [20976.446469] Call Trace:
      [20976.448921]  __schedule+0x2d1/0x830
      [20976.452414]  schedule+0x35/0xa0
      [20976.455559]  schedule_preempt_disabled+0xa/0x10
      [20976.460090]  __mutex_lock.isra.7+0x310/0x420
      [20976.464364]  device_del+0x36/0x3c0
      [20976.467772]  ice_unplug_aux_dev+0x1a/0x40 [ice]
      [20976.472313]  ice_lag_event_handler+0x2a2/0x520 [ice]
      [20976.477288]  notifier_call_chain+0x47/0x70
      [20976.481386]  __netdev_upper_dev_link+0x18b/0x280
      [20976.489845]  bond_enslave+0xe05/0x1790 [bonding]
      [20976.494475]  do_setlink+0x336/0xf50
      [20976.502517]  __rtnl_newlink+0x529/0x8b0
      [20976.543441]  rtnl_newlink+0x43/0x60
      [20976.546934]  rtnetlink_rcv_msg+0x2b1/0x360
      [20976.559238]  netlink_rcv_skb+0x4c/0x120
      [20976.563079]  netlink_unicast+0x196/0x230
      [20976.567005]  netlink_sendmsg+0x204/0x3d0
      [20976.570930]  sock_sendmsg+0x4c/0x50
      [20976.574423]  ____sys_sendmsg+0x1eb/0x250
      [20976.586807]  ___sys_sendmsg+0x7c/0xc0
      [20976.606353]  __sys_sendmsg+0x57/0xa0
      [20976.609930]  do_syscall_64+0x5b/0x1a0
      [20976.613598]  entry_SYSCALL_64_after_hwframe+0x65/0xca
      
      1. Command 'ip link ... set nomaster' causes ice_plug_aux_dev() to be
         called from ice_service_task() context; the aux device is created
         and the associated device->lock is taken.
      2. Command 'ip link ... set master...' calls ice's notifier under the
         RTNL lock, and that notifier calls ice_unplug_aux_dev(). That
         function tries to take the aux device->lock, but it is already
         taken by ice_plug_aux_dev() in step 1.
      3. Later, ice_plug_aux_dev() tries to take the RTNL lock, but it is
         already taken in step 2.
      4. Deadlock.
      
      The patch fixes this issue with the following changes:
      - Bit ICE_FLAG_PLUG_AUX_DEV is kept set during the ice_plug_aux_dev()
        call in ice_service_task().
      - The bit is checked in ice_clear_rdma_cap(), and only if it is not set
        is ice_unplug_aux_dev() called. If it is set (in other words,
        plugging of the aux device was requested and ice_plug_aux_dev() is
        potentially running), then the function only clears the bit.
      - Once the ice_plug_aux_dev() call (in ice_service_task()) has finished,
        the bit ICE_FLAG_PLUG_AUX_DEV is cleared, but it is also checked
        whether it was already cleared by ice_clear_rdma_cap(). If so, the
        aux device is unplugged.
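
The flag handshake in the list above can be sketched as a small userspace model in Python. It is a simplification under stated assumptions: `Pf` and its fields are hypothetical, a lock-protected boolean stands in for the atomic bit ICE_FLAG_PLUG_AUX_DEV, and the `racing_call` hook is an invention of this sketch that runs the competing caller right after the plug step instead of truly concurrently.

```python
import threading


class Pf:
    """Hypothetical stand-in for the driver's per-function state."""

    def __init__(self):
        self._lock = threading.Lock()
        self.plug_flag = False  # models ICE_FLAG_PLUG_AUX_DEV
        self.plugged = False
        self.events = []

    def test_and_clear_plug_flag(self):
        """Atomically clear the flag and return its previous value."""
        with self._lock:
            old, self.plug_flag = self.plug_flag, False
            return old


def plug_aux_dev(pf):
    pf.plugged = True
    pf.events.append("plug")


def unplug_aux_dev(pf):
    pf.plugged = False
    pf.events.append("unplug")


def clear_rdma_cap(pf):
    # Unplug directly only if no plug request is pending or running;
    # otherwise just clear the flag and let the service task unplug.
    if not pf.test_and_clear_plug_flag():
        unplug_aux_dev(pf)


def service_task(pf, racing_call=None):
    if not pf.plug_flag:
        return
    plug_aux_dev(pf)  # the flag stays set for the whole call
    if racing_call is not None:
        racing_call(pf)  # hypothetical hook: models a competing caller
    # If the flag was cleared in the meantime, an unplug was requested.
    if not pf.test_and_clear_plug_flag():
        unplug_aux_dev(pf)
```

Calling `service_task(pf, racing_call=clear_rdma_cap)` with the flag set models the racy case: the competing caller only clears the flag, and the service task performs the unplug itself once plugging has finished.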
      
      Signed-off-by: Ivan Vecera <ivecera@redhat.com>
      Co-developed-by: Petr Oros <poros@redhat.com>
      Signed-off-by: Petr Oros <poros@redhat.com>
      Reviewed-by: Dave Ertman <david.m.ertman@intel.com>
      Link: https://lore.kernel.org/r/20220310171641.3863659-1-ivecera@redhat.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>