  1. Apr 15, 2016
    • v4.4.7-rt16 · b16a2ce4
      Thomas Gleixner authored
      
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • workqueue: Prevent deadlock/stall on RT · 246a3384
      Thomas Gleixner authored
      
      Austin reported an XFS deadlock/stall on RT where scheduled work
      never gets executed and tasks wait for each other forever.
      
      The underlying problem is the modification of the RT code to the
      handling of workers which are about to go to sleep. In mainline a
      worker thread which goes to sleep wakes an idle worker if there is
      more work to do. This happens from the guts of the schedule()
      function. On RT this must be outside and the accessed data structures
      are not protected against scheduling due to the spinlock to rtmutex
      conversion. So the naive solution to this was to move the code outside
      of the scheduler and protect the data structures by the pool
      lock. That approach turned out to be a little naive as we cannot call
      into that code when the thread blocks on a lock, as it is not allowed
      to block on two locks in parallel. So we don't call into the worker
      wakeup magic when the worker is blocked on a lock, which causes the
      deadlock/stall observed by Austin and Mike.
      
      Looking deeper into that worker code it turns out that the only
      relevant data structure which needs to be protected is the list of
      idle workers which can be woken up.
      
      So the solution is to protect the list manipulation operations with
      preempt_enable/disable pairs on RT and call unconditionally into the
      worker code even when the worker is blocked on a lock. The preemption
      protection is safe as there is nothing which can fiddle with the list
      outside of thread context.
      
      Reported-and-tested-by: Austin Schuh <austin@peloton-tech.com>
      Reported-and-tested-by: Mike Galbraith <umgwanakikbuti@gmail.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Link: http://vger.kernel.org/r/alpine.DEB.2.10.1406271249510.5170@nanos
      Cc: Richard Weinberger <richard.weinberger@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
    • md: disable bcache · ad1a5510
      Sebastian Andrzej Siewior authored
      
      It uses anon semaphores
      |drivers/md/bcache/request.c: In function ‘cached_dev_write_complete’:
      |drivers/md/bcache/request.c:1007:2: error: implicit declaration of function ‘up_read_non_owner’ [-Werror=implicit-function-declaration]
      |  up_read_non_owner(&dc->writeback_lock);
      |  ^
      |drivers/md/bcache/request.c: In function ‘request_write’:
      |drivers/md/bcache/request.c:1033:2: error: implicit declaration of function ‘down_read_non_owner’ [-Werror=implicit-function-declaration]
      |  down_read_non_owner(&dc->writeback_lock);
      |  ^
      
      either we get rid of those or we have to introduce them…
      
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
    • rt,ntp: Move call to schedule_delayed_work() to helper thread · e396ffe7
      Steven Rostedt authored
      
      The ntp code for notify_cmos_timer() is called from hard interrupt
      context. Under PREEMPT_RT_FULL, schedule_delayed_work() takes spinlocks
      that have been converted to mutexes, so calling schedule_delayed_work()
      from interrupt context is not safe.
      
      Add a helper thread that makes the call to schedule_delayed_work(), and
      wake that thread instead of calling schedule_delayed_work() directly.
      This is only for CONFIG_PREEMPT_RT_FULL, otherwise the code still calls
      schedule_delayed_work() directly in irq context.
      
      Note: There are a few places in the kernel that do this. Perhaps the RT
      code should have a dedicated thread that does the checks. Just register
      a notifier on boot-up for your check and wake up the thread when
      needed. This remains a todo.
      
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • memcontrol: Prevent scheduling while atomic in cgroup code · e09953b8
      Mike Galbraith authored
      
      mm, memcg: make refill_stock() use get_cpu_light()
      
      Nikita reported the following memcg scheduling while atomic bug:
      
      Call Trace:
      [e22d5a90] [c0007ea8] show_stack+0x4c/0x168 (unreliable)
      [e22d5ad0] [c0618c04] __schedule_bug+0x94/0xb0
      [e22d5ae0] [c060b9ec] __schedule+0x530/0x550
      [e22d5bf0] [c060bacc] schedule+0x30/0xbc
      [e22d5c00] [c060ca24] rt_spin_lock_slowlock+0x180/0x27c
      [e22d5c70] [c00b39dc] res_counter_uncharge_until+0x40/0xc4
      [e22d5ca0] [c013ca88] drain_stock.isra.20+0x54/0x98
      [e22d5cc0] [c01402ac] __mem_cgroup_try_charge+0x2e8/0xbac
      [e22d5d70] [c01410d4] mem_cgroup_charge_common+0x3c/0x70
      [e22d5d90] [c0117284] __do_fault+0x38c/0x510
      [e22d5df0] [c011a5f4] handle_pte_fault+0x98/0x858
      [e22d5e50] [c060ed08] do_page_fault+0x42c/0x6fc
      [e22d5f40] [c000f5b4] handle_page_fault+0xc/0x80
      
      What happens:
      
         refill_stock()
            get_cpu_var()
            drain_stock()
               res_counter_uncharge()
                  res_counter_uncharge_until()
                     spin_lock() <== boom
      
      Fix it by replacing get/put_cpu_var() with get/put_cpu_light().
      
      
      Reported-by: Nikita Yushchenko <nyushchenko@dev.rtsoft.ru>
      Signed-off-by: Mike Galbraith <umgwanakikbuti@gmail.com>
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
    • cgroups: use simple wait in css_release() · 19c316a4
      Sebastian Andrzej Siewior authored
      
      To avoid:
      |BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:914
      |in_atomic(): 1, irqs_disabled(): 0, pid: 92, name: rcuc/11
      |2 locks held by rcuc/11/92:
      | #0:  (rcu_callback){......}, at: [<ffffffff810e037e>] rcu_cpu_kthread+0x3de/0x940
      | #1:  (rcu_read_lock_sched){......}, at: [<ffffffff81328390>] percpu_ref_call_confirm_rcu+0x0/0xd0
      |Preemption disabled at:[<ffffffff813284e2>] percpu_ref_switch_to_atomic_rcu+0x82/0xc0
      |CPU: 11 PID: 92 Comm: rcuc/11 Not tainted 3.18.7-rt0+ #1
      | ffff8802398cdf80 ffff880235f0bc28 ffffffff815b3a12 0000000000000000
      | 0000000000000000 ffff880235f0bc48 ffffffff8109aa16 0000000000000000
      | ffff8802398cdf80 ffff880235f0bc78 ffffffff815b8dd4 000000000000df80
      |Call Trace:
      | [<ffffffff815b3a12>] dump_stack+0x4f/0x7c
      | [<ffffffff8109aa16>] __might_sleep+0x116/0x190
      | [<ffffffff815b8dd4>] rt_spin_lock+0x24/0x60
      | [<ffffffff8108d2cd>] queue_work_on+0x6d/0x1d0
      | [<ffffffff8110c881>] css_release+0x81/0x90
      | [<ffffffff8132844e>] percpu_ref_call_confirm_rcu+0xbe/0xd0
      | [<ffffffff813284e2>] percpu_ref_switch_to_atomic_rcu+0x82/0xc0
      | [<ffffffff810e03e5>] rcu_cpu_kthread+0x445/0x940
      | [<ffffffff81098a2d>] smpboot_thread_fn+0x18d/0x2d0
      | [<ffffffff810948d8>] kthread+0xe8/0x100
      | [<ffffffff815b9c3c>] ret_from_fork+0x7c/0xb0
      
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
    • drm,i915: Use local_lock/unlock_irq() in intel_pipe_update_start/end() · a217688b
      Mike Galbraith authored
      
      [    8.014039] BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:918
      [    8.014041] in_atomic(): 0, irqs_disabled(): 1, pid: 78, name: kworker/u4:4
      [    8.014045] CPU: 1 PID: 78 Comm: kworker/u4:4 Not tainted 4.1.7-rt7 #5
      [    8.014055] Workqueue: events_unbound async_run_entry_fn
      [    8.014059]  0000000000000000 ffff880037153748 ffffffff815f32c9 0000000000000002
      [    8.014063]  ffff88013a50e380 ffff880037153768 ffffffff815ef075 ffff8800372c06c8
      [    8.014066]  ffff8800372c06c8 ffff880037153778 ffffffff8107c0b3 ffff880037153798
      [    8.014067] Call Trace:
      [    8.014074]  [<ffffffff815f32c9>] dump_stack+0x4a/0x61
      [    8.014078]  [<ffffffff815ef075>] ___might_sleep.part.93+0xe9/0xee
      [    8.014082]  [<ffffffff8107c0b3>] ___might_sleep+0x53/0x80
      [    8.014086]  [<ffffffff815f9064>] rt_spin_lock+0x24/0x50
      [    8.014090]  [<ffffffff8109368b>] prepare_to_wait+0x2b/0xa0
      [    8.014152]  [<ffffffffa016c04c>] intel_pipe_update_start+0x17c/0x300 [i915]
      [    8.014156]  [<ffffffff81093b40>] ? prepare_to_wait_event+0x120/0x120
      [    8.014201]  [<ffffffffa0158f36>] intel_begin_crtc_commit+0x166/0x1e0 [i915]
      [    8.014215]  [<ffffffffa00c806d>] drm_atomic_helper_commit_planes+0x5d/0x1a0 [drm_kms_helper]
      [    8.014260]  [<ffffffffa0171e9b>] intel_atomic_commit+0xab/0xf0 [i915]
      [    8.014288]  [<ffffffffa00654c7>] drm_atomic_commit+0x37/0x60 [drm]
      [    8.014298]  [<ffffffffa00c6fcd>] drm_atomic_helper_plane_set_property+0x8d/0xd0 [drm_kms_helper]
      [    8.014301]  [<ffffffff815f77d9>] ? __ww_mutex_lock+0x39/0x40
      [    8.014319]  [<ffffffffa0053b3d>] drm_mode_plane_set_obj_prop+0x2d/0x90 [drm]
      [    8.014328]  [<ffffffffa00c8edb>] restore_fbdev_mode+0x6b/0xf0 [drm_kms_helper]
      [    8.014337]  [<ffffffffa00cae49>] drm_fb_helper_restore_fbdev_mode_unlocked+0x29/0x80 [drm_kms_helper]
      [    8.014346]  [<ffffffffa00caec2>] drm_fb_helper_set_par+0x22/0x50 [drm_kms_helper]
      [    8.014390]  [<ffffffffa016dfba>] intel_fbdev_set_par+0x1a/0x60 [i915]
      [    8.014394]  [<ffffffff81327dc4>] fbcon_init+0x4f4/0x580
      [    8.014398]  [<ffffffff8139ef4c>] visual_init+0xbc/0x120
      [    8.014401]  [<ffffffff813a1623>] do_bind_con_driver+0x163/0x330
      [    8.014405]  [<ffffffff813a1b2c>] do_take_over_console+0x11c/0x1c0
      [    8.014408]  [<ffffffff813236e3>] do_fbcon_takeover+0x63/0xd0
      [    8.014410]  [<ffffffff81328965>] fbcon_event_notify+0x785/0x8d0
      [    8.014413]  [<ffffffff8107c12d>] ? __might_sleep+0x4d/0x90
      [    8.014416]  [<ffffffff810775fe>] notifier_call_chain+0x4e/0x80
      [    8.014419]  [<ffffffff810779cd>] __blocking_notifier_call_chain+0x4d/0x70
      [    8.014422]  [<ffffffff81077a06>] blocking_notifier_call_chain+0x16/0x20
      [    8.014425]  [<ffffffff8132b48b>] fb_notifier_call_chain+0x1b/0x20
      [    8.014428]  [<ffffffff8132d8fa>] register_framebuffer+0x21a/0x350
      [    8.014439]  [<ffffffffa00cb164>] drm_fb_helper_initial_config+0x274/0x3e0 [drm_kms_helper]
      [    8.014483]  [<ffffffffa016f1cb>] intel_fbdev_initial_config+0x1b/0x20 [i915]
      [    8.014486]  [<ffffffff8107912c>] async_run_entry_fn+0x4c/0x160
      [    8.014490]  [<ffffffff81070ffa>] process_one_work+0x14a/0x470
      [    8.014493]  [<ffffffff81071489>] worker_thread+0x169/0x4c0
      [    8.014496]  [<ffffffff81071320>] ? process_one_work+0x470/0x470
      [    8.014499]  [<ffffffff81076606>] kthread+0xc6/0xe0
      [    8.014502]  [<ffffffff81070000>] ? queue_work_on+0x80/0x110
      [    8.014506]  [<ffffffff81076540>] ? kthread_worker_fn+0x1c0/0x1c0
      
      Signed-off-by: Mike Galbraith <umgwanakikbuti@gmail.com>
      Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Cc: linux-rt-users <linux-rt-users@vger.kernel.org>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • drm,radeon,i915: Use preempt_disable/enable_rt() where recommended · f2c1eae3
      Mike Galbraith authored
      
      DRM folks identified the spots, so use them.
      
      Signed-off-by: Mike Galbraith <umgwanakikbuti@gmail.com>
      Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Cc: linux-rt-users <linux-rt-users@vger.kernel.org>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • i915: bogus warning from i915 when running on PREEMPT_RT · 281f762b
      Clark Williams authored
      
      The i915 driver has a 'WARN_ON(!in_interrupt())' in the display
      handler, which whines constantly on the RT kernel (since the interrupt
      is actually handled in a threaded handler and not in actual interrupt
      context).

      Change the WARN_ON to WARN_ON_NORT.
      
      Tested-by: Joakim Hernberg <jhernberg@alchemy.lu>
      Signed-off-by: Clark Williams <williams@redhat.com>
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
    • drm/i915: drop trace_i915_gem_ring_dispatch on rt · ee0f802b
      Sebastian Andrzej Siewior authored
      
      This tracepoint is responsible for:
      
      |[<814cc358>] __schedule_bug+0x4d/0x59
      |[<814d24cc>] __schedule+0x88c/0x930
      |[<814d3b90>] ? _raw_spin_unlock_irqrestore+0x40/0x50
      |[<814d3b95>] ? _raw_spin_unlock_irqrestore+0x45/0x50
      |[<810b57b5>] ? task_blocks_on_rt_mutex+0x1f5/0x250
      |[<814d27d9>] schedule+0x29/0x70
      |[<814d3423>] rt_spin_lock_slowlock+0x15b/0x278
      |[<814d3786>] rt_spin_lock+0x26/0x30
      |[<a00dced9>] gen6_gt_force_wake_get+0x29/0x60 [i915]
      |[<a00e183f>] gen6_ring_get_irq+0x5f/0x100 [i915]
      |[<a00b2a33>] ftrace_raw_event_i915_gem_ring_dispatch+0xe3/0x100 [i915]
      |[<a00ac1b3>] i915_gem_do_execbuffer.isra.13+0xbd3/0x1430 [i915]
      |[<810f8943>] ? trace_buffer_unlock_commit+0x43/0x60
      |[<8113e8d2>] ? ftrace_raw_event_kmem_alloc+0xd2/0x180
      |[<8101d063>] ? native_sched_clock+0x13/0x80
      |[<a00acf29>] i915_gem_execbuffer2+0x99/0x280 [i915]
      |[<a00114a3>] drm_ioctl+0x4c3/0x570 [drm]
      |[<8101d0d9>] ? sched_clock+0x9/0x10
      |[<a00ace90>] ? i915_gem_execbuffer+0x480/0x480 [i915]
      |[<810f1c18>] ? rb_commit+0x68/0xa0
      |[<810f1c6c>] ? ring_buffer_unlock_commit+0x1c/0xa0
      |[<81197467>] do_vfs_ioctl+0x97/0x540
      |[<81021318>] ? ftrace_raw_event_sys_enter+0xd8/0x130
      |[<811979a1>] sys_ioctl+0x91/0xb0
      |[<814db931>] tracesys+0xe1/0xe6
      
      Chris Wilson does not want i915_trace_irq_get() moved out of the macro:
      
      |No. This enables the IRQ, as well as making a number of
      |very expensively serialised read, unconditionally.
      
      so it is gone now on RT.
      
      
      Reported-by: Joakim Hernberg <jbh@alchemy.lu>
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
    • gpu/i915: don't open code these things · abde397f
      Sebastian Andrzej Siewior authored
      The open-coded part is gone since commit 1f83fee0
      ("drm/i915: clear up wedged transitions");
      the owner check is still there.
      
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
    • drivers/block/zram: Replace bit spinlocks with rtmutex for -rt · 8e3af934
      Mike Galbraith authored
      
      They're nondeterministic, and lead to ___might_sleep() splats in -rt.
      OTOH, they're a lot less wasteful than an rtmutex per page.
      
      Signed-off-by: Mike Galbraith <umgwanakikbuti@gmail.com>
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
    • cpufreq: drop K8's driver from being selected · 232b0e1a
      Sebastian Andrzej Siewior authored
      
      Ralf posted a picture of a backtrace from
      
      | powernowk8_target_fn() -> transition_frequency_fidvid() and then at the
      | end:
      | 932         policy = cpufreq_cpu_get(smp_processor_id());
      | 933         cpufreq_cpu_put(policy);
      
      crashing the system on -RT. I assumed that policy was a NULL pointer,
      but that was ruled out. Since Ralf can't do any more investigation on
      this and I have no machine with this, I simply switch it off.
      
      Reported-by: Ralf Mardorf <ralf.mardorf@alice-dsl.net>
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
    • mmci: Remove bogus local_irq_save() · f82ab2fc
      Thomas Gleixner authored
      
      On !RT the interrupt handler runs with interrupts disabled. On RT it
      runs in a thread, so there is no need to disable interrupts at all.
      
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • i2c/omap: drop the lock in hard irq context · 60419a7a
      Sebastian Andrzej Siewior authored
      
      The lock is taken while reading two registers. On RT the lock is taken
      both in hard irq context, where it might sleep, and in the threaded
      irq handler. The threaded irq runs in oneshot mode, so the hard irq
      does not run again until the thread completes; there is thus no reason
      to grab the lock in the hard irq handler.
      
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
    • leds: trigger: disable CPU trigger on -RT · 0d26db89
      Sebastian Andrzej Siewior authored
      
      as it triggers:
      |CPU: 0 PID: 0 Comm: swapper Not tainted 3.12.8-rt10 #141
      |[<c0014aa4>] (unwind_backtrace+0x0/0xf8) from [<c0012788>] (show_stack+0x1c/0x20)
      |[<c0012788>] (show_stack+0x1c/0x20) from [<c043c8dc>] (dump_stack+0x20/0x2c)
      |[<c043c8dc>] (dump_stack+0x20/0x2c) from [<c004c5e8>] (__might_sleep+0x13c/0x170)
      |[<c004c5e8>] (__might_sleep+0x13c/0x170) from [<c043f270>] (__rt_spin_lock+0x28/0x38)
      |[<c043f270>] (__rt_spin_lock+0x28/0x38) from [<c043fa00>] (rt_read_lock+0x68/0x7c)
      |[<c043fa00>] (rt_read_lock+0x68/0x7c) from [<c036cf74>] (led_trigger_event+0x2c/0x5c)
      |[<c036cf74>] (led_trigger_event+0x2c/0x5c) from [<c036e0bc>] (ledtrig_cpu+0x54/0x5c)
      |[<c036e0bc>] (ledtrig_cpu+0x54/0x5c) from [<c000ffd8>] (arch_cpu_idle_exit+0x18/0x1c)
      |[<c000ffd8>] (arch_cpu_idle_exit+0x18/0x1c) from [<c00590b8>] (cpu_startup_entry+0xa8/0x234)
      |[<c00590b8>] (cpu_startup_entry+0xa8/0x234) from [<c043b2cc>] (rest_init+0xb8/0xe0)
      |[<c043b2cc>] (rest_init+0xb8/0xe0) from [<c061ebe0>] (start_kernel+0x2c4/0x380)
      
      
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
    • arm+arm64: lazy-preempt: add TIF_NEED_RESCHED_LAZY to _TIF_WORK_MASK · ece1ff28
      Sebastian Andrzej Siewior authored
      
      _TIF_WORK_MASK is used to check for TIF_NEED_RESCHED so we need to check
      for TIF_NEED_RESCHED_LAZY here, too.
      
      Reported-by: Grygorii Strashko <grygorii.strashko@ti.com>
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
    • arch/arm64: Add lazy preempt support · 34ca3c06
      Anders Roxell authored
      
      arm64 is missing support for PREEMPT_RT. The main feature which is
      lacking is support for lazy preemption. The arch-specific entry code,
      thread information structure definitions, and associated data tables
      have to be extended to provide this support. Then the Kconfig file has
      to be extended to indicate the support is available, and also to
      indicate that support for full RT preemption is now available.
      
      Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
    • powerpc: Add support for lazy preemption · 39009cc0
      Thomas Gleixner authored
      
      Implement the powerpc pieces for lazy preempt.
      
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • arm: Add support for lazy preemption · 1b29b700
      Thomas Gleixner authored
      
      Implement the arm pieces for lazy preempt.
      
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • x86: Support for lazy preemption · 31215861
      Thomas Gleixner authored
      
      Implement the x86 pieces for lazy preempt.
      
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • preempt-lazy: Add the lazy-preemption check to preempt_schedule() · 582ff318
      Sebastian Andrzej Siewior authored
      
      Probably during the rebase onto v4.1 this check got moved into the less
      commonly used preempt_schedule_notrace(). This patch ensures that both
      functions use it.
      
      Reported-by: Mike Galbraith <umgwanakikbuti@gmail.com>
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
    • sched: Add support for lazy preemption · 48ed1d44
      Thomas Gleixner authored
      
      It has become an obsession to mitigate the determinism vs. throughput
      loss of RT. Looking at the mainline semantics of preemption points
      gives a hint why RT sucks throughput wise for ordinary SCHED_OTHER
      tasks. One major issue is the wakeup of tasks which right away
      preempt the waking task while the waking task holds a lock on which
      the woken task will block right after having preempted the waker. In
      mainline this is prevented by the implicit preemption disable of
      spin/rw_lock held regions. On RT this is not possible due to the fully
      preemptible nature of sleeping spinlocks.
      
      Though for a SCHED_OTHER task preempting another SCHED_OTHER task this
      is really not a correctness issue. RT folks are concerned about
      SCHED_FIFO/RR tasks preemption and not about the purely fairness
      driven SCHED_OTHER preemption latencies.
      
      So I introduced a lazy preemption mechanism which only applies to
      SCHED_OTHER tasks preempting another SCHED_OTHER task. Aside from the
      existing preempt_count, each task now sports a preempt_lazy_count
      which is manipulated on lock acquisition and release. This is slightly
      imprecise, as for laziness reasons I coupled this to
      migrate_disable/enable, so some other mechanisms get the same
      treatment (e.g. get_cpu_light).
      
      Now on the scheduler side, instead of setting NEED_RESCHED this sets
      NEED_RESCHED_LAZY in case of a SCHED_OTHER/SCHED_OTHER preemption and
      therefore allows the waking task to exit the lock-held region before
      the woken task preempts it. That also works better for cross-CPU
      wakeups, as the other side can stay in the adaptive spinning loop.

      For RT-class preemption there is no change. This simply sets
      NEED_RESCHED and forgoes the lazy preemption counter.
      
      Initial tests do not expose any observable latency increase, but
      history shows that I've been proven wrong before :)
      
      Lazy preemption mode is on by default, but with CONFIG_SCHED_DEBUG
      enabled it can be disabled via:
      
       # echo NO_PREEMPT_LAZY >/sys/kernel/debug/sched_features
      
      and reenabled via
      
       # echo PREEMPT_LAZY >/sys/kernel/debug/sched_features
      
      The test results so far are very machine- and workload-dependent, but
      there is a clear trend that it enhances non-RT workload performance.
      
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • rcu: make RCU_BOOST default on RT · 580d95b8
      Sebastian Andrzej Siewior authored
      
      Since RCU callback processing is no longer invoked from softirq,
      people run into OOM more often if the priority of the RCU thread is
      too low. Making boosting the default on RT should help in those
      cases, and it can be switched off if someone knows better.
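A hedged sketch of what the Kconfig effect could look like (the actual option layout in the RT tree may differ):

```
# Illustrative Kconfig fragment: RCU_BOOST defaults to y on RT but stays
# user-visible so it can still be switched off.
config RCU_BOOST
        bool "Enable RCU priority boosting"
        depends on RT_MUTEXES && PREEMPT_RCU
        default y if PREEMPT_RT_FULL
```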
      
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
    • rcu: Eliminate softirq processing from rcutree · 0bb1727c
      Paul E. McKenney authored
      
      Running RCU out of softirq is a problem for some workloads that would
      like to manage RCU core processing independently of other softirq work,
      for example, setting kthread priority.  This commit therefore moves the
      RCU core work from softirq to a per-CPU/per-flavor SCHED_OTHER kthread
      named rcuc.  The SCHED_OTHER approach avoids the scalability problems
      that appeared with the earlier attempt to move RCU core processing
      from softirq to kthreads.  That said, kernels built with RCU_BOOST=y
      will run the rcuc kthreads at the RCU-boosting priority.
      
      Reported-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: Mike Galbraith <bitbucket@online.de>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
    • rcu: Disable RCU_FAST_NO_HZ on RT · 92abf19a
      Thomas Gleixner authored
      
      This uses a timer_list timer from the irq disabled guts of the idle
      code. Disable it for now to prevent wreckage.
      
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • kernel/perf: mark perf_cpu_context's timer as irqsafe · b54159a7
      Sebastian Andrzej Siewior authored
      
      Otherwise we get a WARN_ON() backtrace and some events are reported as
      "not counted".
      
      Cc: stable-rt@vger.kernel.org
      Reported-by: Yang Shi <yang.shi@linaro.org>
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
    • perf: Make swevent hrtimer run in irq instead of softirq · 087cd4b1
      Yong Zhang authored
      
      Otherwise we get a deadlock like below:
      
      [ 1044.042749] BUG: scheduling while atomic: ksoftirqd/21/141/0x00010003
      [ 1044.042752] INFO: lockdep is turned off.
      [ 1044.042754] Modules linked in:
      [ 1044.042757] Pid: 141, comm: ksoftirqd/21 Tainted: G        W    3.4.0-rc2-rt3-23676-ga723175-dirty #29
      [ 1044.042759] Call Trace:
      [ 1044.042761]  <IRQ>  [<ffffffff8107d8e5>] __schedule_bug+0x65/0x80
      [ 1044.042770]  [<ffffffff8168978c>] __schedule+0x83c/0xa70
      [ 1044.042775]  [<ffffffff8106bdd2>] ? prepare_to_wait+0x32/0xb0
      [ 1044.042779]  [<ffffffff81689a5e>] schedule+0x2e/0xa0
      [ 1044.042782]  [<ffffffff81071ebd>] hrtimer_wait_for_timer+0x6d/0xb0
      [ 1044.042786]  [<ffffffff8106bb30>] ? wake_up_bit+0x40/0x40
      [ 1044.042790]  [<ffffffff81071f20>] hrtimer_cancel+0x20/0x40
      [ 1044.042794]  [<ffffffff8111da0c>] perf_swevent_cancel_hrtimer+0x3c/0x50
      [ 1044.042798]  [<ffffffff8111da31>] task_clock_event_stop+0x11/0x40
      [ 1044.042802]  [<ffffffff8111da6e>] task_clock_event_del+0xe/0x10
      [ 1044.042805]  [<ffffffff8111c568>] event_sched_out+0x118/0x1d0
      [ 1044.042809]  [<ffffffff8111c649>] group_sched_out+0x29/0x90
      [ 1044.042813]  [<ffffffff8111ed7e>] __perf_event_disable+0x18e/0x200
      [ 1044.042817]  [<ffffffff8111c343>] remote_function+0x63/0x70
      [ 1044.042821]  [<ffffffff810b0aae>] generic_smp_call_function_single_interrupt+0xce/0x120
      [ 1044.042826]  [<ffffffff81022bc7>] smp_call_function_single_interrupt+0x27/0x40
      [ 1044.042831]  [<ffffffff8168d50c>] call_function_single_interrupt+0x6c/0x80
      [ 1044.042833]  <EOI>  [<ffffffff811275b0>] ? perf_event_overflow+0x20/0x20
      [ 1044.042840]  [<ffffffff8168b970>] ? _raw_spin_unlock_irq+0x30/0x70
      [ 1044.042844]  [<ffffffff8168b976>] ? _raw_spin_unlock_irq+0x36/0x70
      [ 1044.042848]  [<ffffffff810702e2>] run_hrtimer_softirq+0xc2/0x200
      [ 1044.042853]  [<ffffffff811275b0>] ? perf_event_overflow+0x20/0x20
      [ 1044.042857]  [<ffffffff81045265>] __do_softirq_common+0xf5/0x3a0
      [ 1044.042862]  [<ffffffff81045c3d>] __thread_do_softirq+0x15d/0x200
      [ 1044.042865]  [<ffffffff81045dda>] run_ksoftirqd+0xfa/0x210
      [ 1044.042869]  [<ffffffff81045ce0>] ? __thread_do_softirq+0x200/0x200
      [ 1044.042873]  [<ffffffff81045ce0>] ? __thread_do_softirq+0x200/0x200
      [ 1044.042877]  [<ffffffff8106b596>] kthread+0xb6/0xc0
      [ 1044.042881]  [<ffffffff8168b97b>] ? _raw_spin_unlock_irq+0x3b/0x70
      [ 1044.042886]  [<ffffffff8168d994>] kernel_thread_helper+0x4/0x10
      [ 1044.042889]  [<ffffffff8107d98c>] ? finish_task_switch+0x8c/0x110
      [ 1044.042894]  [<ffffffff8168b97b>] ? _raw_spin_unlock_irq+0x3b/0x70
      [ 1044.042897]  [<ffffffff8168bd5d>] ? retint_restore_args+0xe/0xe
      [ 1044.042900]  [<ffffffff8106b4e0>] ? kthreadd+0x1e0/0x1e0
      [ 1044.042902]  [<ffffffff8168d990>] ? gs_change+0xb/0xb
      
      Signed-off-by: Yong Zhang <yong.zhang0@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/r/1341476476-5666-1-git-send-email-yong.zhang0@gmail.com
      
      
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • lockdep: selftest: fix warnings due to missing PREEMPT_RT conditionals · ade31113
      Josh Cartwright authored
      
      "lockdep: Selftest: Only do hardirq context test for raw spinlock"
      disabled the execution of certain tests with PREEMPT_RT_FULL, but did
      not prevent the tests from still being defined.  This leads to warnings
      like:
      
        ./linux/lib/locking-selftest.c:574:1: warning: 'irqsafe1_hard_rlock_12' defined but not used [-Wunused-function]
        ./linux/lib/locking-selftest.c:574:1: warning: 'irqsafe1_hard_rlock_21' defined but not used [-Wunused-function]
        ./linux/lib/locking-selftest.c:577:1: warning: 'irqsafe1_hard_wlock_12' defined but not used [-Wunused-function]
        ./linux/lib/locking-selftest.c:577:1: warning: 'irqsafe1_hard_wlock_21' defined but not used [-Wunused-function]
        ./linux/lib/locking-selftest.c:580:1: warning: 'irqsafe1_soft_spin_12' defined but not used [-Wunused-function]
        ...
      
      Fixed by wrapping the test definitions in #ifndef CONFIG_PREEMPT_RT_FULL
      conditionals.
      
      
      Signed-off-by: Josh Cartwright <josh.cartwright@ni.com>
      Signed-off-by: Xander Huff <xander.huff@ni.com>
      Acked-by: Gratian Crisan <gratian.crisan@ni.com>
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
    • lockdep: selftest: Only do hardirq context test for raw spinlock · e6530e89
      Yong Zhang authored
      
      On -rt there is no softirq context any more and rwlocks are sleepable,
      so disable the softirq context tests and the rwlock+irq tests.
      
      Signed-off-by: Yong Zhang <yong.zhang0@gmail.com>
      Cc: Yong Zhang <yong.zhang@windriver.com>
      Link: http://lkml.kernel.org/r/1334559716-18447-3-git-send-email-yong.zhang0@gmail.com
      
      
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • Peter Zijlstra's avatar
      crypto: Convert crypto notifier chain to SRCU · ce52ad3e
      Peter Zijlstra authored
      
The crypto notifier chain deadlocks on RT, though this can be a real
deadlock on mainline as well due to FIFO-fair rwsems.
      
      The involved parties here are:
      
      [   82.172678] swapper/0       S 0000000000000001     0     1      0 0x00000000
      [   82.172682]  ffff88042f18fcf0 0000000000000046 ffff88042f18fc80 ffffffff81491238
      [   82.172685]  0000000000011cc0 0000000000011cc0 ffff88042f18c040 ffff88042f18ffd8
      [   82.172688]  0000000000011cc0 0000000000011cc0 ffff88042f18ffd8 0000000000011cc0
      [   82.172689] Call Trace:
      [   82.172697]  [<ffffffff81491238>] ? _raw_spin_unlock_irqrestore+0x6c/0x7a
      [   82.172701]  [<ffffffff8148fd3f>] schedule+0x64/0x66
      [   82.172704]  [<ffffffff8148ec6b>] schedule_timeout+0x27/0xd0
      [   82.172708]  [<ffffffff81043c0c>] ? unpin_current_cpu+0x1a/0x6c
      [   82.172713]  [<ffffffff8106e491>] ? migrate_enable+0x12f/0x141
      [   82.172716]  [<ffffffff8148fbbd>] wait_for_common+0xbb/0x11f
      [   82.172719]  [<ffffffff810709f2>] ? try_to_wake_up+0x182/0x182
      [   82.172722]  [<ffffffff8148fc96>] wait_for_completion_interruptible+0x1d/0x2e
      [   82.172726]  [<ffffffff811debfd>] crypto_wait_for_test+0x49/0x6b
      [   82.172728]  [<ffffffff811ded32>] crypto_register_alg+0x53/0x5a
      [   82.172730]  [<ffffffff811ded6c>] crypto_register_algs+0x33/0x72
      [   82.172734]  [<ffffffff81ad7686>] ? aes_init+0x12/0x12
      [   82.172737]  [<ffffffff81ad76ea>] aesni_init+0x64/0x66
      [   82.172741]  [<ffffffff81000318>] do_one_initcall+0x7f/0x13b
      [   82.172744]  [<ffffffff81ac4d34>] kernel_init+0x199/0x22c
      [   82.172747]  [<ffffffff81ac44ef>] ? loglevel+0x31/0x31
      [   82.172752]  [<ffffffff814987c4>] kernel_thread_helper+0x4/0x10
      [   82.172755]  [<ffffffff81491574>] ? retint_restore_args+0x13/0x13
      [   82.172759]  [<ffffffff81ac4b9b>] ? start_kernel+0x3ca/0x3ca
      [   82.172761]  [<ffffffff814987c0>] ? gs_change+0x13/0x13
      
      [   82.174186] cryptomgr_test  S 0000000000000001     0    41      2 0x00000000
      [   82.174189]  ffff88042c971980 0000000000000046 ffffffff81d74830 0000000000000292
      [   82.174192]  0000000000011cc0 0000000000011cc0 ffff88042c96eb80 ffff88042c971fd8
      [   82.174195]  0000000000011cc0 0000000000011cc0 ffff88042c971fd8 0000000000011cc0
      [   82.174195] Call Trace:
      [   82.174198]  [<ffffffff8148fd3f>] schedule+0x64/0x66
      [   82.174201]  [<ffffffff8148ec6b>] schedule_timeout+0x27/0xd0
      [   82.174204]  [<ffffffff81043c0c>] ? unpin_current_cpu+0x1a/0x6c
      [   82.174206]  [<ffffffff8106e491>] ? migrate_enable+0x12f/0x141
      [   82.174209]  [<ffffffff8148fbbd>] wait_for_common+0xbb/0x11f
      [   82.174212]  [<ffffffff810709f2>] ? try_to_wake_up+0x182/0x182
      [   82.174215]  [<ffffffff8148fc96>] wait_for_completion_interruptible+0x1d/0x2e
      [   82.174218]  [<ffffffff811e4883>] cryptomgr_notify+0x280/0x385
      [   82.174221]  [<ffffffff814943de>] notifier_call_chain+0x6b/0x98
      [   82.174224]  [<ffffffff8108a11c>] ? rt_down_read+0x10/0x12
      [   82.174227]  [<ffffffff810677cd>] __blocking_notifier_call_chain+0x70/0x8d
      [   82.174230]  [<ffffffff810677fe>] blocking_notifier_call_chain+0x14/0x16
      [   82.174234]  [<ffffffff811dd272>] crypto_probing_notify+0x24/0x50
      [   82.174236]  [<ffffffff811dd7a1>] crypto_alg_mod_lookup+0x3e/0x74
      [   82.174238]  [<ffffffff811dd949>] crypto_alloc_base+0x36/0x8f
      [   82.174241]  [<ffffffff811e9408>] cryptd_alloc_ablkcipher+0x6e/0xb5
      [   82.174243]  [<ffffffff811dd591>] ? kzalloc.clone.5+0xe/0x10
      [   82.174246]  [<ffffffff8103085d>] ablk_init_common+0x1d/0x38
      [   82.174249]  [<ffffffff8103852a>] ablk_ecb_init+0x15/0x17
      [   82.174251]  [<ffffffff811dd8c6>] __crypto_alloc_tfm+0xc7/0x114
      [   82.174254]  [<ffffffff811e0caa>] ? crypto_lookup_skcipher+0x1f/0xe4
      [   82.174256]  [<ffffffff811e0dcf>] crypto_alloc_ablkcipher+0x60/0xa5
      [   82.174258]  [<ffffffff811e5bde>] alg_test_skcipher+0x24/0x9b
      [   82.174261]  [<ffffffff8106d96d>] ? finish_task_switch+0x3f/0xfa
      [   82.174263]  [<ffffffff811e6b8e>] alg_test+0x16f/0x1d7
      [   82.174267]  [<ffffffff811e45ac>] ? cryptomgr_probe+0xac/0xac
      [   82.174269]  [<ffffffff811e45d8>] cryptomgr_test+0x2c/0x47
      [   82.174272]  [<ffffffff81061161>] kthread+0x7e/0x86
      [   82.174275]  [<ffffffff8106d9dd>] ? finish_task_switch+0xaf/0xfa
      [   82.174278]  [<ffffffff814987c4>] kernel_thread_helper+0x4/0x10
      [   82.174281]  [<ffffffff81491574>] ? retint_restore_args+0x13/0x13
      [   82.174284]  [<ffffffff810610e3>] ? __init_kthread_worker+0x8c/0x8c
      [   82.174287]  [<ffffffff814987c0>] ? gs_change+0x13/0x13
      
      [   82.174329] cryptomgr_probe D 0000000000000002     0    47      2 0x00000000
      [   82.174332]  ffff88042c991b70 0000000000000046 ffff88042c991bb0 0000000000000006
      [   82.174335]  0000000000011cc0 0000000000011cc0 ffff88042c98ed00 ffff88042c991fd8
      [   82.174338]  0000000000011cc0 0000000000011cc0 ffff88042c991fd8 0000000000011cc0
      [   82.174338] Call Trace:
      [   82.174342]  [<ffffffff8148fd3f>] schedule+0x64/0x66
      [   82.174344]  [<ffffffff814901ad>] __rt_mutex_slowlock+0x85/0xbe
      [   82.174347]  [<ffffffff814902d2>] rt_mutex_slowlock+0xec/0x159
      [   82.174351]  [<ffffffff81089c4d>] rt_mutex_fastlock.clone.8+0x29/0x2f
      [   82.174353]  [<ffffffff81490372>] rt_mutex_lock+0x33/0x37
      [   82.174356]  [<ffffffff8108a0f2>] __rt_down_read+0x50/0x5a
      [   82.174358]  [<ffffffff8108a11c>] ? rt_down_read+0x10/0x12
      [   82.174360]  [<ffffffff8108a11c>] rt_down_read+0x10/0x12
      [   82.174363]  [<ffffffff810677b5>] __blocking_notifier_call_chain+0x58/0x8d
      [   82.174366]  [<ffffffff810677fe>] blocking_notifier_call_chain+0x14/0x16
      [   82.174369]  [<ffffffff811dd272>] crypto_probing_notify+0x24/0x50
      [   82.174372]  [<ffffffff811debd6>] crypto_wait_for_test+0x22/0x6b
      [   82.174374]  [<ffffffff811decd3>] crypto_register_instance+0xb4/0xc0
      [   82.174377]  [<ffffffff811e9b76>] cryptd_create+0x378/0x3b6
      [   82.174379]  [<ffffffff811de512>] ? __crypto_lookup_template+0x5b/0x63
      [   82.174382]  [<ffffffff811e4545>] cryptomgr_probe+0x45/0xac
      [   82.174385]  [<ffffffff811e4500>] ? crypto_alloc_pcomp+0x1b/0x1b
      [   82.174388]  [<ffffffff81061161>] kthread+0x7e/0x86
      [   82.174391]  [<ffffffff8106d9dd>] ? finish_task_switch+0xaf/0xfa
      [   82.174394]  [<ffffffff814987c4>] kernel_thread_helper+0x4/0x10
      [   82.174398]  [<ffffffff81491574>] ? retint_restore_args+0x13/0x13
      [   82.174401]  [<ffffffff810610e3>] ? __init_kthread_worker+0x8c/0x8c
      [   82.174403]  [<ffffffff814987c0>] ? gs_change+0x13/0x13
      
      cryptomgr_test spawns the cryptomgr_probe thread from the notifier
      call. The probe thread fires the same notifier as the test thread and
      deadlocks on the rwsem on RT.
      
Now this is a potential deadlock in mainline as well, because we have
FIFO-fair rwsems. If another thread blocks with a down_write() on the
notifier chain before the probe thread issues the down_read(), it will
block the probe thread and the whole party is deadlocked.
      
      Signed-off-by: default avatarPeter Zijlstra <peterz@infradead.org>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      ce52ad3e
    • Sebastian Andrzej Siewior's avatar
      net: Add a mutex around devnet_rename_seq · bd6a10ec
      Sebastian Andrzej Siewior authored
      
On RT write_seqcount_begin() disables preemption, while device_rename()
allocates memory with GFP_KERNEL and later grabs the sysfs_mutex.
Serialize with a mutex and use the non-preemption-disabling
__write_seqcount_begin().
      
      To avoid writer starvation, let the reader grab the mutex and release
      it when it detects a writer in progress. This keeps the normal case
      (no reader on the fly) fast.
      
      [ tglx: Instead of replacing the seqcount by a mutex, add the mutex ]
      
      Signed-off-by: default avatarSebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      bd6a10ec
    • Thomas Gleixner's avatar
      net: netfilter: Serialize xt_write_recseq sections on RT · 420716a9
      Thomas Gleixner authored
      
The netfilter code relies only on the implicit semantics of
local_bh_disable() for serializing xt_write_recseq sections. RT breaks
that and needs explicit serialization here.
      
      Reported-by: default avatarPeter LaDow <petela@gocougs.wsu.edu>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      420716a9
    • Sebastian Andrzej Siewior's avatar
      net/core: protect users of napi_alloc_cache against reentrance · 0dc18fc3
      Sebastian Andrzej Siewior authored
      
On -RT code running in BH cannot be moved to another CPU, so CPU-local
variables remain local. However, the code can be preempted, and another
task may enter BH on the same CPU and access the same napi_alloc_cache
variable.
This patch ensures that each user of napi_alloc_cache takes a local lock.
      
      Cc: stable-rt@vger.kernel.org
      Signed-off-by: default avatarSebastian Andrzej Siewior <bigeasy@linutronix.de>
      0dc18fc3
    • Thomas Gleixner's avatar
      net: Another local_irq_disable/kmalloc headache · 55a3a08f
      Thomas Gleixner authored
      
Replace it with a local lock, though that's pretty inefficient :(
      
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      55a3a08f
    • Priyanka Jain's avatar
      net: Remove preemption disabling in netif_rx() · 6c9de450
      Priyanka Jain authored
      
1) enqueue_to_backlog() (called from netif_rx) should be
   bound to a particular CPU. This can be achieved by
   disabling migration; there is no need to disable preemption.

2) Fixes the crash "BUG: scheduling while atomic: ksoftirqd"
   in case of RT.
   If preemption is disabled, enqueue_to_backlog() is called
   in atomic context, and if the backlog exceeds its count,
   kfree_skb() is called. But on RT kfree_skb() might get
   scheduled out, so it expects a non-atomic context.

3) When CONFIG_PREEMPT_RT_FULL is not defined,
   migrate_enable() and migrate_disable() map to
   preempt_enable() and preempt_disable(), so there is no
   functional change in the non-RT case.

- Replace preempt_enable() and preempt_disable() with
  migrate_enable() and migrate_disable() respectively
- Replace get_cpu() and put_cpu() with get_cpu_light() and
  put_cpu_light() respectively
      
      Signed-off-by: default avatarPriyanka Jain <Priyanka.Jain@freescale.com>
      Acked-by: default avatarRajan Srivastava <Rajan.Srivastava@freescale.com>
Cc: <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/r/1337227511-2271-1-git-send-email-Priyanka.Jain@freescale.com
      
      
      
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      6c9de450
    • John Kacur's avatar
      scsi: qla2xxx: Use local_irq_save_nort() in qla2x00_poll · 5781b79a
      John Kacur authored
      
      RT triggers the following:
      
      [   11.307652]  [<ffffffff81077b27>] __might_sleep+0xe7/0x110
      [   11.307663]  [<ffffffff8150e524>] rt_spin_lock+0x24/0x60
      [   11.307670]  [<ffffffff8150da78>] ? rt_spin_lock_slowunlock+0x78/0x90
      [   11.307703]  [<ffffffffa0272d83>] qla24xx_intr_handler+0x63/0x2d0 [qla2xxx]
      [   11.307736]  [<ffffffffa0262307>] qla2x00_poll+0x67/0x90 [qla2xxx]
      
Function qla2x00_poll() does local_irq_save() before calling qla24xx_intr_handler(),
which takes a spinlock. Since spinlocks are sleepable on RT, they must not be
taken with interrupts disabled. Therefore we use local_irq_save_nort()
instead, which saves flags without disabling interrupts.
      
      This fix needs to be applied to v3.0-rt, v3.2-rt and v3.4-rt
      
      Suggested-by: Thomas Gleixner
      Signed-off-by: default avatarJohn Kacur <jkacur@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: David Sommerseth <davids@redhat.com>
      Link: http://lkml.kernel.org/r/1335523726-10024-1-git-send-email-jkacur@redhat.com
      
      
      
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      5781b79a
    • Thomas Gleixner's avatar
rt/locking: Reenable migration across schedule · 45ba2e2a
      Thomas Gleixner authored
      
      We currently disable migration across lock acquisition. That includes the part
      where we block on the lock and schedule out. We cannot disable migration after
      taking the lock as that would cause a possible lock inversion.
      
      But we can be smart and enable migration when we block and schedule out. That
      allows the scheduler to place the task freely at least if this is the first
      migrate disable level. For nested locking this does not help at all.
      
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarSebastian Andrzej Siewior <bigeasy@linutronix.de>
      45ba2e2a
    • Sebastian Andrzej Siewior's avatar
      rtmutex: push down migrate_disable() into rt_spin_lock() · 7665ed1c
      Sebastian Andrzej Siewior authored
      
No point in having the migrate disable/enable invocations in all the
macros/inlines. That's just more code for no win, as we do a function
call anyway. Move it to the core code and save quite some text size.
      
          text      data        bss        dec        filename
      11034127   3676912   14901248   29612287  vmlinux.before
      10990437   3676848   14901248   29568533  vmlinux.after
      
i.e. the text segment shrinks by ~43 KiB
      
      Signed-off-by: default avatarSebastian Andrzej Siewior <bigeasy@linutronix.de>
      7665ed1c
    • Mike Galbraith's avatar
      hotplug: Use set_cpus_allowed_ptr() in sync_unplug_thread() · 005b6ff2
      Mike Galbraith authored
      
      do_set_cpus_allowed() is not safe vs ->sched_class change.
      
      crash> bt
      PID: 11676  TASK: ffff88026f979da0  CPU: 22  COMMAND: "sync_unplug/22"
       #0 [ffff880274d25bc8] machine_kexec at ffffffff8103b41c
       #1 [ffff880274d25c18] crash_kexec at ffffffff810d881a
       #2 [ffff880274d25cd8] oops_end at ffffffff81525818
       #3 [ffff880274d25cf8] do_invalid_op at ffffffff81003096
       #4 [ffff880274d25d90] invalid_op at ffffffff8152d3de
          [exception RIP: set_cpus_allowed_rt+18]
          RIP: ffffffff8109e012  RSP: ffff880274d25e48  RFLAGS: 00010202
          RAX: ffffffff8109e000  RBX: ffff88026f979da0  RCX: ffff8802770cb6e8
          RDX: 0000000000000000  RSI: ffffffff81add700  RDI: ffff88026f979da0
          RBP: ffff880274d25e78   R8: ffffffff816112e0   R9: 0000000000000001
          R10: 0000000000000001  R11: 0000000000011940  R12: ffff88026f979da0
          R13: ffff8802770cb6d0  R14: ffff880274d25fd8  R15: 0000000000000000
          ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
       #5 [ffff880274d25e60] do_set_cpus_allowed at ffffffff8108e65f
       #6 [ffff880274d25e80] sync_unplug_thread at ffffffff81058c08
       #7 [ffff880274d25ed8] kthread at ffffffff8107cad6
       #8 [ffff880274d25f50] ret_from_fork at ffffffff8152bbbc
      crash> task_struct ffff88026f979da0 | grep class
        sched_class = 0xffffffff816111e0 <fair_sched_class+64>,
      
      Signed-off-by: default avatarMike Galbraith <umgwanakikbuti@gmail.com>
      
      Signed-off-by: default avatarSebastian Andrzej Siewior <bigeasy@linutronix.de>
      005b6ff2