Skip to content

Commit

Permalink
mm, oom_reaper: skip mm structs with mmu notifiers
Browse files Browse the repository at this point in the history
Andrea has noticed that the oom_reaper doesn't invalidate the range via
mmu notifiers (mmu_notifier_invalidate_range_start/end) and that can
corrupt the memory of the kvm guest for example.

tlb_flush_mmu_tlbonly already invokes mmu notifiers but that is not
sufficient as per Andrea:

 "mmu_notifier_invalidate_range cannot be used in replacement of
  mmu_notifier_invalidate_range_start/end. For KVM
  mmu_notifier_invalidate_range is a noop and rightfully so. A MMU
  notifier implementation has to implement either ->invalidate_range
  method or the invalidate_range_start/end methods, not both. And if you
  implement invalidate_range_start/end like KVM is forced to do, calling
  mmu_notifier_invalidate_range in common code is a noop for KVM.

  For those MMU notifiers that can get away only implementing
  ->invalidate_range, the ->invalidate_range is implicitly called by
  mmu_notifier_invalidate_range_end(). And only those secondary MMUs
  that share the same pagetable with the primary MMU (like AMD iommuv2)
  can get away only implementing ->invalidate_range"

As the callback is allowed to sleep and the implementation is out of
hand of the MM it is safer to simply bail out if there is an mmu
notifier registered.  In order to not fail too early make the
mm_has_notifiers check under the oom_lock and have a little nap before
failing to give the current oom victim some more time to exit.

[[email protected]: coding-style fixes]
Link: http://lkml.kernel.org/r/[email protected]
Fixes: aac4536 ("mm, oom: introduce oom reaper")
Signed-off-by: Michal Hocko <[email protected]>
Reported-by: Andrea Arcangeli <[email protected]>
Reviewed-by: Andrea Arcangeli <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
  • Loading branch information
Michal Hocko authored and torvalds committed Oct 4, 2017
1 parent d5567c9 commit 4d4bbd8
Show file tree
Hide file tree
Showing 2 changed files with 21 additions and 0 deletions.
5 changes: 5 additions & 0 deletions include/linux/mmu_notifier.h
Original file line number Diff line number Diff line change
Expand Up @@ -400,6 +400,11 @@ extern void mmu_notifier_synchronize(void);

#else /* CONFIG_MMU_NOTIFIER */

static inline int mm_has_notifiers(struct mm_struct *mm)
{
return 0;
}

static inline void mmu_notifier_release(struct mm_struct *mm)
{
}
Expand Down
16 changes: 16 additions & 0 deletions mm/oom_kill.c
Original file line number Diff line number Diff line change
Expand Up @@ -40,6 +40,7 @@
#include <linux/ratelimit.h>
#include <linux/kthread.h>
#include <linux/init.h>
#include <linux/mmu_notifier.h>

#include <asm/tlb.h>
#include "internal.h"
Expand Down Expand Up @@ -494,6 +495,21 @@ static bool __oom_reap_task_mm(struct task_struct *tsk, struct mm_struct *mm)
goto unlock_oom;
}

/*
* If the mm has notifiers then we would need to invalidate them around
* unmap_page_range and that is risky because notifiers can sleep and
* what they do is basically undeterministic. So let's have a short
* sleep to give the oom victim some more time.
* TODO: we really want to get rid of this ugly hack and make sure that
* notifiers cannot block for unbounded amount of time and add
* mmu_notifier_invalidate_range_{start,end} around unmap_page_range
*/
if (mm_has_notifiers(mm)) {
up_read(&mm->mmap_sem);
schedule_timeout_idle(HZ);
goto unlock_oom;
}

/*
* MMF_OOM_SKIP is set by exit_mmap when the OOM reaper can't
* work on the mm anymore. The check for MMF_OOM_SKIP must run
Expand Down

0 comments on commit 4d4bbd8

Please sign in to comment.