Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking updates from Ingo Molnar:
 "The main changes in this cycle are:

   - rwsem scalability improvements, phase #2, by Waiman Long, which are
     rather impressive:

       "On a 2-socket 40-core 80-thread Skylake system with 40 reader
        and writer locking threads, the min/mean/max locking operations
        done in a 5-second testing window before the patchset were:

         40 readers, Iterations Min/Mean/Max = 1,807/1,808/1,810
         40 writers, Iterations Min/Mean/Max = 1,807/50,344/151,255

        After the patchset, they became:

         40 readers, Iterations Min/Mean/Max = 30,057/31,359/32,741
         40 writers, Iterations Min/Mean/Max = 94,466/95,845/97,098"

     There are many changes to the locking implementation that make it
     similar to qrwlock, including owner handoff for fairer locking.

     Another microbenchmark shows the improvements across the spectrum:

       "With a locking microbenchmark running on 5.1 based kernel, the
        total locking rates (in kops/s) on a 2-socket Skylake system
        with equal numbers of readers and writers (mixed) before and
        after this patchset were:

        # of Threads   Before Patch      After Patch
        ------------   ------------      -----------
             2            2,618             4,193
             4            1,202             3,726
             8              802             3,622
            16              729             3,359
            32              319             2,826
            64              102             2,744"

     The changes are extensive and the patch-set has been through
     several iterations addressing various locking workloads. There
     might be more regressions, but unless they are pathological I
     believe we want to use this new implementation as the baseline
     going forward.

   - jump-label optimizations by Daniel Bristot de Oliveira: the primary
     motivation was to remove IPI disturbance of isolated RT-workload
     CPUs, which resulted in the implementation of batched jump-label
     updates. Beyond improving the kernel's real-time characteristics,
     in one test this patchset reduced static key update overhead from
     57 msecs to just 1.4 msecs - a nice speedup as well.

   - atomic64_t cross-arch type cleanups by Mark Rutland: over the last
     ~10 years of atomic64_t existence the various types used by the
     APIs only had to be self-consistent within each architecture -
     which means they became wildly inconsistent across architectures.
     Mark puts an end to this by reworking all the atomic64
     implementations to use 's64' as the base type for atomic64_t, and
     to ensure that this type is consistently used for parameters and
     return values in the API, avoiding further problems in this area
     (a brief usage sketch follows the quoted log below).

   - A large set of small improvements to lockdep by Yuyang Du: type
     cleanups, output cleanups, function return type and other cleanups
     all around the place.

   - A set of percpu ops cleanups and fixes by Peter Zijlstra.

   - Misc other changes - please see the Git log for more details"
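
As a concrete illustration of the atomic64_t item above, here is a
minimal, hedged sketch: atomic64_xchg() is the real kernel API, while
drain_counter() is a hypothetical helper, not code from this pull:

#include <linux/atomic.h>

/* Hypothetical helper: with the cross-arch cleanup, generic code
 * passes and returns s64 uniformly, with no per-architecture
 * long vs. long long casts. */
static s64 drain_counter(atomic64_t *ctr)
{
        return atomic64_xchg(ctr, 0);   /* old value, atomically, as s64 */
}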

* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (82 commits)
  locking/lockdep: increase size of counters for lockdep statistics
  locking/atomics: Use sed(1) instead of non-standard head(1) option
  locking/lockdep: Move mark_lock() inside CONFIG_TRACE_IRQFLAGS && CONFIG_PROVE_LOCKING
  x86/jump_label: Make tp_vec_nr static
  x86/percpu: Optimize raw_cpu_xchg()
  x86/percpu, sched/fair: Avoid local_clock()
  x86/percpu, x86/irq: Relax {set,get}_irq_regs()
  x86/percpu: Relax smp_processor_id()
  x86/percpu: Differentiate this_cpu_{}() and __this_cpu_{}()
  locking/rwsem: Guard against making count negative
  locking/rwsem: Adaptive disabling of reader optimistic spinning
  locking/rwsem: Enable time-based spinning on reader-owned rwsem
  locking/rwsem: Make rwsem->owner an atomic_long_t
  locking/rwsem: Enable readers spinning on writer
  locking/rwsem: Clarify usage of owner's nonspinaable bit
  locking/rwsem: Wake up almost all readers in wait queue
  locking/rwsem: More optimal RT task handling of null owner
  locking/rwsem: Always release wait_lock before waking up tasks
  locking/rwsem: Implement lock handoff to prevent lock starvation
  locking/rwsem: Make rwsem_spin_on_owner() return owner state
  ...
torvalds committed Jul 8, 2019
2 parents 46f1ec2 + 9156e54 commit e192832
Showing 55 changed files with 2,788 additions and 2,020 deletions.
9 changes: 7 additions & 2 deletions Documentation/atomic_t.txt
@@ -81,9 +81,11 @@ Non-RMW ops:

The non-RMW ops are (typically) regular LOADs and STOREs and are canonically
implemented using READ_ONCE(), WRITE_ONCE(), smp_load_acquire() and
smp_store_release() respectively.
smp_store_release() respectively. Therefore, if you find yourself only using
the Non-RMW operations of atomic_t, you do not in fact need atomic_t at all
and are doing it wrong.

The one detail to this is that atomic_set{}() should be observable to the RMW
A subtle detail of atomic_set{}() is that it should be observable to the RMW
ops. That is:

C atomic-set
@@ -200,6 +202,9 @@ These helper barriers exist because architectures have varying implicit
ordering on their SMP atomic primitives. For example our TSO architectures
provide full ordered atomics and these barriers are no-ops.

NOTE: when the atomic RmW ops are fully ordered, they should also imply a
compiler barrier.

Thus:

atomic_fetch_add();
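As a gloss on the atomic_t.txt hunk above, here is a minimal sketch of
the canonical mapping the text describes. READ_ONCE() and WRITE_ONCE()
are the real primitives; the my_*() names are hypothetical stand-ins
for atomic_read()/atomic_set():

#include <linux/compiler.h>
#include <linux/types.h>

/* Hypothetical stand-ins showing how the non-RMW atomic ops are
 * canonically implemented. */
static inline int my_atomic_read(const atomic_t *v)
{
        return READ_ONCE(v->counter);
}

static inline void my_atomic_set(atomic_t *v, int i)
{
        WRITE_ONCE(v->counter, i);
}
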
112 changes: 84 additions & 28 deletions Documentation/locking/lockdep-design.txt
@@ -15,34 +15,48 @@ tens of thousands of) instantiations. For example a lock in the inode
struct is one class, while each inode has its own instantiation of that
lock class.

The validator tracks the 'state' of lock-classes, and it tracks
dependencies between different lock-classes. The validator maintains a
rolling proof that the state and the dependencies are correct.

Unlike a lock instantiation, the lock-class itself never goes away: when
a lock-class is used for the first time after bootup it gets registered,
and all subsequent uses of that lock-class will be attached to this
lock-class.
The validator tracks the 'usage state' of lock-classes, and it tracks
the dependencies between different lock-classes. Lock usage indicates
how a lock is used with regard to its IRQ contexts, while lock
dependency can be understood as lock order, where L1 -> L2 suggests that
a task is attempting to acquire L2 while holding L1. From lockdep's
perspective, the two locks (L1 and L2) are not necessarily related; that
dependency just means that this order happened at least once. The
validator maintains a continuing effort to prove that lock usages and
dependencies are correct; if they are not, it will shoot a splat.

A lock-class's behavior is constructed by its instances collectively:
when the first instance of a lock-class is used after bootup the class
gets registered, then all (subsequent) instances will be mapped to the
class and hence their usages and dependencies will contribute to those of
the class. A lock-class does not go away when a lock instance does, but
it can be removed if the memory space of the lock class (static or
dynamic) is reclaimed; this happens, for example, when a module is
unloaded or a workqueue is destroyed.

State
-----

The validator tracks lock-class usage history into 4 * nSTATEs + 1 separate
state bits:
The validator tracks lock-class usage history and divides the usage into
(4 usages * n STATEs + 1) categories:

where the 4 usages can be:
- 'ever held in STATE context'
- 'ever held as readlock in STATE context'
- 'ever held with STATE enabled'
- 'ever held as readlock with STATE enabled'

Where STATE can be either one of (kernel/locking/lockdep_states.h)
- hardirq
- softirq
where the n STATEs are coded in kernel/locking/lockdep_states.h and as of
now they include:
- hardirq
- softirq

where the last 1 category is:
- 'ever used' [ == !unused ]

When locking rules are violated, these state bits are presented in the
locking error messages, inside curlies. A contrived example:
When locking rules are violated, these usage bits are presented in the
locking error messages, inside curlies, with a total of 2 * n STATEs bits.
A contrived example:

modprobe/2287 is trying to acquire lock:
(&sio_locks[i].lock){-.-.}, at: [<c02867fd>] mutex_lock+0x21/0x24
@@ -51,28 +65,67 @@ locking error messages, inside curlies. A contrived example:
(&sio_locks[i].lock){-.-.}, at: [<c02867fd>] mutex_lock+0x21/0x24


The bit position indicates STATE, STATE-read, for each of the states listed
above, and the character displayed in each indicates:
For a given lock, the bit positions from left to right indicate the usage
of the lock and of its readlock (if one exists), for each of the n STATEs
listed above respectively, and the character displayed at each bit
position indicates:

'.' acquired while irqs disabled and not in irq context
'-' acquired in irq context
'+' acquired with irqs enabled
'?' acquired in irq context with irqs enabled.

Unused mutexes cannot be part of the cause of an error.
The bits are illustrated with an example:

(&sio_locks[i].lock){-.-.}, at: [<c02867fd>] mutex_lock+0x21/0x24
                     ||||
                     ||| \-> softirq disabled and not in softirq context
                     || \--> acquired in softirq context
                     | \---> hardirq disabled and not in hardirq context
                     \----> acquired in hardirq context


For a given STATE, whether the lock is ever acquired in that STATE
context and whether that STATE is enabled yields four possible cases as
shown in the table below. The bit character indicates which exact case
applies to the lock as of the reporting time.

---------------------------------------------
|              | irq enabled | irq disabled |
|-------------------------------------------|
| ever in irq  |      ?      |      -       |
|-------------------------------------------|
| never in irq |      +      |      .       |
---------------------------------------------

The character '-' suggests irqs are disabled, because otherwise the
character '?' would have been shown instead. A similar deduction can be
applied for '+' too.
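
The table reads naturally as a two-key lookup. Purely as an
illustration (usage_char() is hypothetical, not actual lockdep code):

#include <linux/types.h>

/* Hypothetical: map the two usage facts to the reported character. */
static char usage_char(bool ever_in_irq, bool ever_irq_enabled)
{
        if (ever_in_irq)
                return ever_irq_enabled ? '?' : '-';

        return ever_irq_enabled ? '+' : '.';
}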

Unused locks (e.g., mutexes) cannot be part of the cause of an error.


Single-lock state rules:
------------------------

A lock being irq-safe means it was ever used in an irq context, while a
lock being irq-unsafe means it was ever acquired with irqs enabled.

A softirq-unsafe lock-class is automatically hardirq-unsafe as well. The
following states are exclusive, and only one of them is allowed to be
set for any lock-class:
following states must be exclusive: only one of them is allowed to be set
for any lock-class based on its usage:

<hardirq-safe> or <hardirq-unsafe>
<softirq-safe> or <softirq-unsafe>

<hardirq-safe> and <hardirq-unsafe>
<softirq-safe> and <softirq-unsafe>
This is because if a lock can be used in irq context (irq-safe), then it
must never be acquired with irqs enabled (irq-unsafe); otherwise a
deadlock may happen. For example, if the context is interrupted after
this lock was acquired but before it was released, the interrupt handler
will attempt to acquire the lock a second time, which creates a
deadlock, referred to as a lock recursion deadlock.

The validator detects and reports lock usage that violate these
The validator detects and reports lock usage that violates these
single-lock state rules.
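
A minimal sketch of such a violation, using hypothetical driver code
(my_lock, my_irq_handler() and process_ctx() are illustrative, not from
this patch set):

#include <linux/spinlock.h>
#include <linux/interrupt.h>

static DEFINE_SPINLOCK(my_lock);

static irqreturn_t my_irq_handler(int irq, void *dev)
{
        spin_lock(&my_lock);            /* marks the class hardirq-safe */
        /* ... */
        spin_unlock(&my_lock);
        return IRQ_HANDLED;
}

static void process_ctx(void)
{
        spin_lock(&my_lock);            /* irqs enabled: hardirq-unsafe */
        /* an irq arriving here re-enters my_irq_handler(): deadlock */
        spin_unlock(&my_lock);
}

The fix is for process_ctx() to use spin_lock_irqsave(), making the
usage consistently hardirq-safe.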

Multi-lock dependency rules:
Expand All @@ -81,15 +134,18 @@ Multi-lock dependency rules:
The same lock-class must not be acquired twice, because this could lead
to lock recursion deadlocks.

Furthermore, two locks may not be taken in different order:
Furthermore, two locks cannot be taken in inverse order:

<L1> -> <L2>
<L2> -> <L1>

because this could lead to lock inversion deadlocks. (The validator
finds such dependencies in arbitrary complexity, i.e. there can be any
other locking sequence between the acquire-lock operations, the
validator will still track all dependencies between locks.)
because this could lead to a deadlock - referred to as a lock inversion
deadlock - as attempts to acquire the two locks form a circle, which
could lead to the two contexts waiting for each other permanently. The
validator will find such a dependency circle in arbitrary complexity,
i.e., there can be any other locking sequence between the acquire-lock
operations; the validator will still find whether these locks can be
acquired in a circular fashion.
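
A minimal sketch of such a circle, again with hypothetical code (ctx1(),
ctx2() and the locks A and B are illustrative):

#include <linux/mutex.h>

static DEFINE_MUTEX(A);
static DEFINE_MUTEX(B);

static void ctx1(void)
{
        mutex_lock(&A);
        mutex_lock(&B);         /* records the dependency A -> B */
        mutex_unlock(&B);
        mutex_unlock(&A);
}

static void ctx2(void)
{
        mutex_lock(&B);
        mutex_lock(&A);         /* B -> A closes the circle: splat */
        mutex_unlock(&A);
        mutex_unlock(&B);
}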

Furthermore, the following usage based lock dependencies are not allowed
between any two lock-classes:
20 changes: 10 additions & 10 deletions arch/alpha/include/asm/atomic.h
@@ -93,9 +93,9 @@ static inline int atomic_fetch_##op##_relaxed(int i, atomic_t *v) \
}

#define ATOMIC64_OP(op, asm_op) \
static __inline__ void atomic64_##op(long i, atomic64_t * v) \
static __inline__ void atomic64_##op(s64 i, atomic64_t * v) \
{ \
unsigned long temp; \
s64 temp; \
__asm__ __volatile__( \
"1: ldq_l %0,%1\n" \
" " #asm_op " %0,%2,%0\n" \
@@ -109,9 +109,9 @@ static __inline__ void atomic64_##op(long i, atomic64_t * v) \
} \

#define ATOMIC64_OP_RETURN(op, asm_op) \
static __inline__ long atomic64_##op##_return_relaxed(long i, atomic64_t * v) \
static __inline__ s64 atomic64_##op##_return_relaxed(s64 i, atomic64_t * v) \
{ \
long temp, result; \
s64 temp, result; \
__asm__ __volatile__( \
"1: ldq_l %0,%1\n" \
" " #asm_op " %0,%3,%2\n" \
@@ -128,9 +128,9 @@ static __inline__ long atomic64_##op##_return_relaxed(long i, atomic64_t * v) \
}

#define ATOMIC64_FETCH_OP(op, asm_op) \
static __inline__ long atomic64_fetch_##op##_relaxed(long i, atomic64_t * v) \
static __inline__ s64 atomic64_fetch_##op##_relaxed(s64 i, atomic64_t * v) \
{ \
long temp, result; \
s64 temp, result; \
__asm__ __volatile__( \
"1: ldq_l %2,%1\n" \
" " #asm_op " %2,%3,%0\n" \
@@ -246,9 +246,9 @@ static __inline__ int atomic_fetch_add_unless(atomic_t *v, int a, int u)
* Atomically adds @a to @v, so long as it was not @u.
* Returns the old value of @v.
*/
static __inline__ long atomic64_fetch_add_unless(atomic64_t *v, long a, long u)
static __inline__ s64 atomic64_fetch_add_unless(atomic64_t *v, s64 a, s64 u)
{
long c, new, old;
s64 c, new, old;
smp_mb();
__asm__ __volatile__(
"1: ldq_l %[old],%[mem]\n"
@@ -276,9 +276,9 @@ static __inline__ long atomic64_fetch_add_unless(atomic64_t *v, long a, long u)
* The function returns the old value of *v minus 1, even if
* the atomic variable, v, was not decremented.
*/
static inline long atomic64_dec_if_positive(atomic64_t *v)
static inline s64 atomic64_dec_if_positive(atomic64_t *v)
{
long old, tmp;
s64 old, tmp;
smp_mb();
__asm__ __volatile__(
"1: ldq_l %[old],%[mem]\n"
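A hedged usage sketch for the atomic64_dec_if_positive() signature
above; the refcount-style helper put_ref64() is hypothetical:

#include <linux/atomic.h>

/* Returns true when this call dropped the last reference. Note the
 * documented quirk: the function returns old - 1 even when it did not
 * decrement, so a return of 0 can only come from an actual 1 -> 0
 * transition. */
static bool put_ref64(atomic64_t *refs)
{
        return atomic64_dec_if_positive(refs) == 0;
}
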
41 changes: 20 additions & 21 deletions arch/arc/include/asm/atomic.h
@@ -321,14 +321,14 @@ ATOMIC_OPS(xor, ^=, CTOP_INST_AXOR_DI_R2_R2_R3)
*/

typedef struct {
aligned_u64 counter;
s64 __aligned(8) counter;
} atomic64_t;

#define ATOMIC64_INIT(a) { (a) }

static inline long long atomic64_read(const atomic64_t *v)
static inline s64 atomic64_read(const atomic64_t *v)
{
unsigned long long val;
s64 val;

__asm__ __volatile__(
" ldd %0, [%1] \n"
@@ -338,7 +338,7 @@ static inline long long atomic64_read(const atomic64_t *v)
return val;
}

static inline void atomic64_set(atomic64_t *v, long long a)
static inline void atomic64_set(atomic64_t *v, s64 a)
{
/*
* This could have been a simple assignment in "C" but would need
@@ -359,9 +359,9 @@ static inline void atomic64_set(atomic64_t *v, long long a)
}

#define ATOMIC64_OP(op, op1, op2) \
static inline void atomic64_##op(long long a, atomic64_t *v) \
static inline void atomic64_##op(s64 a, atomic64_t *v) \
{ \
unsigned long long val; \
s64 val; \
\
__asm__ __volatile__( \
"1: \n" \
@@ -372,13 +372,13 @@ static inline void atomic64_##op(long long a, atomic64_t *v) \
" bnz 1b \n" \
: "=&r"(val) \
: "r"(&v->counter), "ir"(a) \
: "cc"); \
: "cc"); \
} \

#define ATOMIC64_OP_RETURN(op, op1, op2) \
static inline long long atomic64_##op##_return(long long a, atomic64_t *v) \
static inline s64 atomic64_##op##_return(s64 a, atomic64_t *v) \
{ \
unsigned long long val; \
s64 val; \
\
smp_mb(); \
\
@@ -399,9 +399,9 @@ static inline long long atomic64_##op##_return(long long a, atomic64_t *v) \
}

#define ATOMIC64_FETCH_OP(op, op1, op2) \
static inline long long atomic64_fetch_##op(long long a, atomic64_t *v) \
static inline s64 atomic64_fetch_##op(s64 a, atomic64_t *v) \
{ \
unsigned long long val, orig; \
s64 val, orig; \
\
smp_mb(); \
\
@@ -441,10 +441,10 @@ ATOMIC64_OPS(xor, xor, xor)
#undef ATOMIC64_OP_RETURN
#undef ATOMIC64_OP

static inline long long
atomic64_cmpxchg(atomic64_t *ptr, long long expected, long long new)
static inline s64
atomic64_cmpxchg(atomic64_t *ptr, s64 expected, s64 new)
{
long long prev;
s64 prev;

smp_mb();

@@ -464,9 +464,9 @@ atomic64_cmpxchg(atomic64_t *ptr, long long expected, long long new)
return prev;
}

static inline long long atomic64_xchg(atomic64_t *ptr, long long new)
static inline s64 atomic64_xchg(atomic64_t *ptr, s64 new)
{
long long prev;
s64 prev;

smp_mb();

@@ -492,9 +492,9 @@ static inline long long atomic64_xchg(atomic64_t *ptr, long long new)
* the atomic variable, v, was not decremented.
*/

static inline long long atomic64_dec_if_positive(atomic64_t *v)
static inline s64 atomic64_dec_if_positive(atomic64_t *v)
{
long long val;
s64 val;

smp_mb();

@@ -525,10 +525,9 @@ static inline long long atomic64_dec_if_positive(atomic64_t *v)
* Atomically adds @a to @v, if it was not @u.
* Returns the old value of @v
*/
static inline long long atomic64_fetch_add_unless(atomic64_t *v, long long a,
long long u)
static inline s64 atomic64_fetch_add_unless(atomic64_t *v, s64 a, s64 u)
{
long long old, temp;
s64 old, temp;

smp_mb();

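A hedged usage sketch for the atomic64_cmpxchg() shown above: a
lock-free 64-bit high-watermark; track_max64() is hypothetical:

#include <linux/atomic.h>

/* Hypothetical: raise *max to sample without taking a lock. */
static void track_max64(atomic64_t *max, s64 sample)
{
        s64 old = atomic64_read(max);

        while (sample > old) {
                s64 prev = atomic64_cmpxchg(max, old, sample);

                if (prev == old)
                        break;          /* our update won the race */
                old = prev;             /* lost the race: retry */
        }
}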
