  1. Aug 10, 2015
  2. Aug 06, 2015
    • powerpc/kvm: Disable in-kernel MPIC emulation for PREEMPT_RT_FULL · 728533c0
      Bogdan Purcareata authored
      
      While converting the openpic emulation code to use a raw_spinlock_t enables
      guests to run on RT, there is still a performance issue. For interrupts sent in
      directed delivery mode with a multiple-CPU mask, the emulated openpic loops
      through all of the VCPUs, and for each VCPU it calls IRQ_check, which in turn
      loops through all the pending interrupts for that VCPU. This is done while
      holding the raw_lock, meaning that for all this time interrupts and preemption
      are disabled on the host Linux. A malicious user app can max out both of these
      numbers and cause a DoS.
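      
      To make the cost concrete, the pattern looks roughly like the sketch below. The
      names (openpic_deliver_directed, irq_check_pending, MAX_CPU, opp->dst) are
      hypothetical stand-ins, not the actual arch/powerpc/kvm/mpic.c code:
      
      static void openpic_deliver_directed(struct openpic *opp, u32 cpu_mask)
      {
      	unsigned long flags;
      	int cpu;
      
      	/* raw lock: interrupts and preemption stay off for the whole walk */
      	raw_spin_lock_irqsave(&opp->lock, flags);
      	for (cpu = 0; cpu < MAX_CPU; cpu++) {
      		if (!(cpu_mask & (1u << cpu)))
      			continue;
      		/*
      		 * Stand-in for IRQ_check: walks every pending interrupt queued
      		 * for this VCPU, so the total work done under the raw lock is
      		 * O(num_vcpus * pending_irqs).
      		 */
      		irq_check_pending(opp, &opp->dst[cpu]);
      	}
      	raw_spin_unlock_irqrestore(&opp->lock, flags);
      }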
      
      This temporary fix is sent for two reasons. The first is to make users who want
      to use the in-kernel MPIC emulation aware of the potential latencies, so they
      can make sure that their hardware MPIC and usage scenario do not involve
      interrupts sent in directed delivery mode and that the number of possible
      pending interrupts is kept small. The second is to provide an incentive to
      develop a proper openpic emulation that would be better suited for RT.
      
      Cc: stable-rt@vger.kernel.org
      Acked-by: Scott Wood <scottwood@freescale.com>
      Signed-off-by: Bogdan Purcareata <bogdan.purcareata@freescale.com>
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • xfs: Disable percpu SB on PREEMPT_RT_FULL · 64091a79
      Steven Rostedt authored
      Running a test with xfs on a box with a large CPU count, I hit a livelock
      with the following backtraces on several CPUs:
      
       Call Trace:
        [<ffffffff812c34f8>] __const_udelay+0x28/0x30
        [<ffffffffa033ab9a>] xfs_icsb_lock_cntr+0x2a/0x40 [xfs]
        [<ffffffffa033c871>] xfs_icsb_modify_counters+0x71/0x280 [xfs]
        [<ffffffffa03413e1>] xfs_trans_reserve+0x171/0x210 [xfs]
        [<ffffffffa0378cfd>] xfs_create+0x24d/0x6f0 [xfs]
        [<ffffffff8124c8eb>] ? avc_has_perm_flags+0xfb/0x1e0
        [<ffffffffa0336eeb>] xfs_vn_mknod+0xbb/0x1e0 [xfs]
        [<ffffffffa0337043>] xfs_vn_create+0x13/0x20 [xfs]
        [<ffffffff811b0edd>] vfs_create+0xcd/0x130
        [<ffffffff811b21ef>] do_last+0xb8f/0x1240
        [<ffffffff811b39b2>] path_openat+0xc2/0x490
      
      Looking at the code I see it was stuck at:
      
      STATIC void
      xfs_icsb_lock_cntr(
      	xfs_icsb_cnts_t	*icsbp)
      {
      	while (test_and_set_bit(XFS_ICSB_FLAG_LOCK, &icsbp->icsb_flags)) {
      		ndelay(1000);
      	}
      }
      
      In xfs_icsb_modify_counters() the code is fine: preempt_disable() is called
      when taking this bit spinlock, and preempt_enable() after it is released. The
      issue is that, when PREEMPT_RT is set, not all locations that take the lock are
      protected by preempt_disable(), namely the places that grab all of the per-CPU
      cntr locks.
      
      STATIC void
      xfs_icsb_lock_all_counters(
      	xfs_mount_t	*mp)
      {
      	xfs_icsb_cnts_t *cntp;
      	int		i;
      
      	for_each_online_cpu(i) {
      		cntp = (xfs_icsb_cnts_t *)per_cpu_ptr(mp->m_sb_cnts, i);
      		xfs_icsb_lock_cntr(cntp);
      	}
      }
      
      STATIC void
      xfs_icsb_disable_counter()
      {
      	[...]
      	xfs_icsb_lock_all_counters(mp);
      	[...]
      	xfs_icsb_unlock_all_counters(mp);
      }
      
      STATIC void
      xfs_icsb_balance_counter_locked()
      {
      	[...]
      	xfs_icsb_disable_counter();
      	[...]
      }
      
      STATIC void
      xfs_icsb_balance_counter(
      	xfs_mount_t	*mp,
      	xfs_sb_field_t  fields,
      	int		min_per_cpu)
      {
      	spin_lock(&mp->m_sb_lock);
      	xfs_icsb_balance_counter_locked(mp, fields, min_per_cpu);
      	spin_unlock(&mp->m_sb_lock);
      }
      
      Now, when PREEMPT_RT is not enabled, that spin_lock() disables preemption; with
      PREEMPT_RT it does not. Although I was not able to capture the state of all
      tasks on my test box, I'm assuming that some task called
      xfs_icsb_lock_all_counters(), was preempted by an RT task and could not finish,
      causing all other callers of that lock to block indefinitely.
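      
      For contrast, the per-CPU fast path keeps preemption disabled around the bit
      lock. A simplified sketch of the xfs_icsb_modify_counters() pattern described
      above (details elided, not the exact 3.x source):
      
      STATIC int
      xfs_icsb_modify_counters(
      	xfs_mount_t	*mp,
      	[...])
      {
      	xfs_icsb_cnts_t	*icsbp;
      
      	preempt_disable();			/* cannot be preempted while spinning ... */
      	icsbp = this_cpu_ptr(mp->m_sb_cnts);
      	xfs_icsb_lock_cntr(icsbp);		/* ... so the bit lock holder always makes progress */
      	[...]
      	xfs_icsb_unlock_cntr(icsbp);
      	preempt_enable();
      	[...]
      }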
      
      Dave Chinner has stated that the scalability of that code will probably
      be negated by PREEMPT_RT, and that it is probably best to just disable
      the code in question. Also, this code has been rewritten in newer kernels.
      
      Link: http://lkml.kernel.org/r/20150504004844.GA21261@dastard
      
      
      
      Cc: stable-rt@vger.kernel.org
      Suggested-by: Dave Chinner <david@fromorbit.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • x86-Tell-irq-work-about-self-IPI-support-3.14 · eb19fc2b
      Frederic Weisbecker authored
      commit 3010279f
      
      
      Author: Frederic Weisbecker <fweisbec@gmail.com>
      Date:   Sat Aug 16 18:47:15 2014 +0200
      
      x86: Tell irq work about self IPI support
      
      x86 supports irq work self-IPIs when the local APIC is available. This is only
      partly known at runtime, so let's implement arch_irq_work_has_interrupt()
      accordingly.
      
      It should be safe to call this after setup_arch().
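      
      The resulting arch hook is small. Roughly the shape it takes (a sketch assuming
      a cpu_has_apic-style feature check, not the verbatim upstream header):
      
      /* arch/x86/include/asm/irq_work.h (sketch) */
      static inline bool arch_irq_work_has_interrupt(void)
      {
      	/* irq work can only send a self-IPI when a local APIC is present */
      	return cpu_has_apic;
      }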
      
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • mm/slub: move slab initialization into irq enabled region · 92e72acf
      Thomas Gleixner authored
      
      Initializing a new slab can introduce rather large latencies because most of
      the initialization always runs with interrupts disabled.
      
      There is no point in doing so.  The newly allocated slab is not visible
      yet, so there is no reason to protect it against concurrent alloc/free.
      
      Move the expensive parts of the initialization into allocate_slab(), so that
      interrupts are enabled for all allocations with GFP_WAIT set.
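      
      The shape of the change is roughly the following (a simplified sketch of the
      allocate_slab() pattern, with the actual page allocation and object setup
      elided):
      
      static struct page *allocate_slab(struct kmem_cache *s, gfp_t flags, int node)
      {
      	struct page *page;
      
      	if (flags & __GFP_WAIT)
      		local_irq_enable();	/* the expensive part may now be preempted */
      
      	[...] /* allocate the pages into "page" and initialize every object on the new slab */
      
      	if (flags & __GFP_WAIT)
      		local_irq_disable();	/* restore the caller's expectation */
      
      	return page;
      }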
      
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • Revert "slub: delay ctor until the object is requested" · f1aca908
      Sebastian Andrzej Siewior authored
      
      This approach is broken with SLAB_DESTROY_BY_RCU allocations.
      Reported by Steven Rostedt and Koehrer Mathias.
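      
      For background: with SLAB_DESTROY_BY_RCU, an RCU reader may still dereference
      an object after it has been freed and re-allocated, so the object must already
      be in its constructed state whenever it is reused. Delaying the ctor until the
      object is next requested breaks that guarantee. A minimal sketch of such a
      cache (struct foo and the names here are hypothetical):
      
      struct foo {
      	spinlock_t	lock;	/* must remain valid across RCU-deferred reuse */
      };
      
      static struct kmem_cache *foo_cache;
      
      static void foo_ctor(void *obj)
      {
      	struct foo *f = obj;
      
      	spin_lock_init(&f->lock);
      }
      
      static int __init foo_init(void)
      {
      	/*
      	 * The ctor runs when the slab is created, so rcu_read_lock() users can
      	 * safely take f->lock even if the object was freed and handed out again.
      	 */
      	foo_cache = kmem_cache_create("foo", sizeof(struct foo), 0,
      				      SLAB_DESTROY_BY_RCU, foo_ctor);
      	return foo_cache ? 0 : -ENOMEM;
      }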
      
      Cc: stable-rt@vger.kernel.org
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  3. Jul 30, 2015