Skip to content

Commit

Permalink
Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/lin…
Browse files Browse the repository at this point in the history
…ux/kernel/git/tip/tip

Pull RCU updates from Ingo Molnar:

 - Continued initialization/Kconfig updates: hide most Kconfig options
   from unsuspecting users.

   There's now a single high level configuration option:

        *
        * RCU Subsystem
        *
        Make expert-level adjustments to RCU configuration (RCU_EXPERT) [N/y/?] (NEW)

   Which if answered in the negative, leaves us with a single
   interactive configuration option:

        Offload RCU callback processing from boot-selected CPUs (RCU_NOCB_CPU) [N/y/?] (NEW)

   All the rest of the RCU options are configured automatically.  Later
   on we'll remove this single leftover configuration option as well.

 - Remove all uses of RCU-protected array indexes: replace the
   rcu_[access|dereference]_index_check() APIs with READ_ONCE() and
   rcu_lockdep_assert()

 - RCU CPU-hotplug cleanups

 - Updates to Tiny RCU: a race fix and further code shrinkage.

 - RCU torture-testing updates: fixes, speedups, cleanups and
   documentation updates.

 - Miscellaneous fixes

 - Documentation updates

* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (60 commits)
  rcutorture: Allow repetition factors in Kconfig-fragment lists
  rcutorture: Display "make oldconfig" errors
  rcutorture: Update TREE_RCU-kconfig.txt
  rcutorture: Make rcutorture scripts force RCU_EXPERT
  rcutorture: Update configuration fragments for rcutree.rcu_fanout_exact
  rcutorture: TASKS_RCU set directly, so don't explicitly set it
  rcutorture: Test SRCU cleanup code path
  rcutorture: Replace barriers with smp_store_release() and smp_load_acquire()
  locktorture: Change longdelay_us to longdelay_ms
  rcutorture: Allow negative values of nreaders to oversubscribe
  rcutorture: Exchange TREE03 and TREE08 NR_CPUS, speed up CPU hotplug
  rcutorture: Exchange TREE03 and TREE04 geometries
  locktorture: fix deadlock in 'rw_lock_irq' type
  rcu: Correctly handle non-empty Tiny RCU callback list with none ready
  rcutorture: Test both RCU-sched and RCU-bh for Tiny RCU
  rcu: Further shrink Tiny RCU by making empty functions static inlines
  rcu: Conditionally compile RCU's eqs warnings
  rcu: Remove prompt for RCU implementation
  rcu: Make RCU able to tolerate undefined CONFIG_RCU_KTHREAD_PRIO
  rcu: Make RCU able to tolerate undefined CONFIG_RCU_FANOUT_LEAF
  ...
  • Loading branch information
torvalds committed Jun 22, 2015
2 parents 052b398 + 085c789 commit fc934d4
Show file tree
Hide file tree
Showing 57 changed files with 809 additions and 624 deletions.
20 changes: 16 additions & 4 deletions Documentation/RCU/arrayRCU.txt
Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,19 @@ also be used to protect arrays. Three situations are as follows:

3. Resizeable Arrays

Each of these situations are discussed below.
Each of these three situations involves an RCU-protected pointer to an
array that is separately indexed. It might be tempting to consider use
of RCU to instead protect the index into an array, however, this use
case is -not- supported. The problem with RCU-protected indexes into
arrays is that compilers can play way too many optimization games with
integers, which means that the rules governing handling of these indexes
are far more trouble than they are worth. If RCU-protected indexes into
arrays prove to be particularly valuable (which they have not thus far),
explicit cooperation from the compiler will be required to permit them
to be safely used.

That aside, each of the three RCU-protected pointer situations are
described in the following sections.


Situation 1: Hash Tables
Expand All @@ -36,9 +48,9 @@ Quick Quiz: Why is it so important that updates be rare when
Situation 3: Resizeable Arrays

Use of RCU for resizeable arrays is demonstrated by the grow_ary()
function used by the System V IPC code. The array is used to map from
semaphore, message-queue, and shared-memory IDs to the data structure
that represents the corresponding IPC construct. The grow_ary()
function formerly used by the System V IPC code. The array is used
to map from semaphore, message-queue, and shared-memory IDs to the data
structure that represents the corresponding IPC construct. The grow_ary()
function does not acquire any locks; instead its caller must hold the
ids->sem semaphore.

Expand Down
10 changes: 0 additions & 10 deletions Documentation/RCU/lockdep.txt
Original file line number Diff line number Diff line change
Expand Up @@ -47,11 +47,6 @@ checking of rcu_dereference() primitives:
Use explicit check expression "c" along with
srcu_read_lock_held()(). This is useful in code that
is invoked by both SRCU readers and updaters.
rcu_dereference_index_check(p, c):
Use explicit check expression "c", but the caller
must supply one of the rcu_read_lock_held() functions.
This is useful in code that uses RCU-protected arrays
that is invoked by both RCU readers and updaters.
rcu_dereference_raw(p):
Don't check. (Use sparingly, if at all.)
rcu_dereference_protected(p, c):
Expand All @@ -64,11 +59,6 @@ checking of rcu_dereference() primitives:
but retain the compiler constraints that prevent duplicating
or coalescsing. This is useful when when testing the
value of the pointer itself, for example, against NULL.
rcu_access_index(idx):
Return the value of the index and omit all barriers, but
retain the compiler constraints that prevent duplicating
or coalescsing. This is useful when when testing the
value of the index itself, for example, against -1.

The rcu_dereference_check() check expression can be any boolean
expression, but would normally include a lockdep expression. However,
Expand Down
38 changes: 17 additions & 21 deletions Documentation/RCU/rcu_dereference.txt
Original file line number Diff line number Diff line change
Expand Up @@ -25,17 +25,6 @@ o You must use one of the rcu_dereference() family of primitives
for an example where the compiler can in fact deduce the exact
value of the pointer, and thus cause misordering.

o Do not use single-element RCU-protected arrays. The compiler
is within its right to assume that the value of an index into
such an array must necessarily evaluate to zero. The compiler
could then substitute the constant zero for the computation, so
that the array index no longer depended on the value returned
by rcu_dereference(). If the array index no longer depends
on rcu_dereference(), then both the compiler and the CPU
are within their rights to order the array access before the
rcu_dereference(), which can cause the array access to return
garbage.

o Avoid cancellation when using the "+" and "-" infix arithmetic
operators. For example, for a given variable "x", avoid
"(x-x)". There are similar arithmetic pitfalls from other
Expand Down Expand Up @@ -76,14 +65,15 @@ o Do not use the results from the boolean "&&" and "||" when
dereferencing. For example, the following (rather improbable)
code is buggy:

int a[2];
int index;
int force_zero_index = 1;
int *p;
int *q;

...

r1 = rcu_dereference(i1)
r2 = a[r1 && force_zero_index]; /* BUGGY!!! */
p = rcu_dereference(gp)
q = &global_q;
q += p != &oom_p1 && p != &oom_p2;
r1 = *q; /* BUGGY!!! */

The reason this is buggy is that "&&" and "||" are often compiled
using branches. While weak-memory machines such as ARM or PowerPC
Expand All @@ -94,14 +84,15 @@ o Do not use the results from relational operators ("==", "!=",
">", ">=", "<", or "<=") when dereferencing. For example,
the following (quite strange) code is buggy:

int a[2];
int index;
int flip_index = 0;
int *p;
int *q;

...

r1 = rcu_dereference(i1)
r2 = a[r1 != flip_index]; /* BUGGY!!! */
p = rcu_dereference(gp)
q = &global_q;
q += p > &oom_p;
r1 = *q; /* BUGGY!!! */

As before, the reason this is buggy is that relational operators
are often compiled using branches. And as before, although
Expand Down Expand Up @@ -193,6 +184,11 @@ o Be very careful about comparing pointers obtained from
pointer. Note that the volatile cast in rcu_dereference()
will normally prevent the compiler from knowing too much.

However, please note that if the compiler knows that the
pointer takes on only one of two values, a not-equal
comparison will provide exactly the information that the
compiler needs to deduce the value of the pointer.

o Disable any value-speculation optimizations that your compiler
might provide, especially if you are making use of feedback-based
optimizations that take data collected from prior runs. Such
Expand Down
6 changes: 3 additions & 3 deletions Documentation/RCU/whatisRCU.txt
Original file line number Diff line number Diff line change
Expand Up @@ -256,7 +256,9 @@ rcu_dereference()
If you are going to be fetching multiple fields from the
RCU-protected structure, using the local variable is of
course preferred. Repeated rcu_dereference() calls look
ugly and incur unnecessary overhead on Alpha CPUs.
ugly, do not guarantee that the same pointer will be returned
if an update happened while in the critical section, and incur
unnecessary overhead on Alpha CPUs.

Note that the value returned by rcu_dereference() is valid
only within the enclosing RCU read-side critical section.
Expand Down Expand Up @@ -879,9 +881,7 @@ SRCU: Initialization/cleanup

All: lockdep-checked RCU-protected pointer access

rcu_access_index
rcu_access_pointer
rcu_dereference_index_check
rcu_dereference_raw
rcu_lockdep_assert
rcu_sleep_check
Expand Down
33 changes: 30 additions & 3 deletions Documentation/kernel-parameters.txt
Original file line number Diff line number Diff line change
Expand Up @@ -2998,11 +2998,34 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
Set maximum number of finished RCU callbacks to
process in one batch.

rcutree.dump_tree= [KNL]
Dump the structure of the rcu_node combining tree
out at early boot. This is used for diagnostic
purposes, to verify correct tree setup.

rcutree.gp_cleanup_delay= [KNL]
Set the number of jiffies to delay each step of
RCU grace-period cleanup. This only has effect
when CONFIG_RCU_TORTURE_TEST_SLOW_CLEANUP is set.

rcutree.gp_init_delay= [KNL]
Set the number of jiffies to delay each step of
RCU grace-period initialization. This only has
effect when CONFIG_RCU_TORTURE_TEST_SLOW_INIT is
set.
effect when CONFIG_RCU_TORTURE_TEST_SLOW_INIT
is set.

rcutree.gp_preinit_delay= [KNL]
Set the number of jiffies to delay each step of
RCU grace-period pre-initialization, that is,
the propagation of recent CPU-hotplug changes up
the rcu_node combining tree. This only has effect
when CONFIG_RCU_TORTURE_TEST_SLOW_PREINIT is set.

rcutree.rcu_fanout_exact= [KNL]
Disable autobalancing of the rcu_node combining
tree. This is used by rcutorture, and might
possibly be useful for architectures having high
cache-to-cache transfer latencies.

rcutree.rcu_fanout_leaf= [KNL]
Increase the number of CPUs assigned to each
Expand Down Expand Up @@ -3107,7 +3130,11 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
test, hence the "fake".

rcutorture.nreaders= [KNL]
Set number of RCU readers.
Set number of RCU readers. The value -1 selects
N-1, where N is the number of CPUs. A value
"n" less than -1 selects N-n-2, where N is again
the number of CPUs. For example, -2 selects N
(the number of CPUs), -3 selects N+1, and so on.

rcutorture.object_debug= [KNL]
Enable debug-object double-call_rcu() testing.
Expand Down
62 changes: 36 additions & 26 deletions Documentation/memory-barriers.txt
Original file line number Diff line number Diff line change
Expand Up @@ -617,16 +617,16 @@ case what's actually required is:
However, stores are not speculated. This means that ordering -is- provided
for load-store control dependencies, as in the following example:

q = ACCESS_ONCE(a);
q = READ_ONCE_CTRL(a);
if (q) {
ACCESS_ONCE(b) = p;
}

Control dependencies pair normally with other types of barriers.
That said, please note that ACCESS_ONCE() is not optional! Without the
ACCESS_ONCE(), might combine the load from 'a' with other loads from
'a', and the store to 'b' with other stores to 'b', with possible highly
counterintuitive effects on ordering.
Control dependencies pair normally with other types of barriers. That
said, please note that READ_ONCE_CTRL() is not optional! Without the
READ_ONCE_CTRL(), the compiler might combine the load from 'a' with
other loads from 'a', and the store to 'b' with other stores to 'b',
with possible highly counterintuitive effects on ordering.

Worse yet, if the compiler is able to prove (say) that the value of
variable 'a' is always non-zero, it would be well within its rights
Expand All @@ -636,12 +636,15 @@ as follows:
q = a;
b = p; /* BUG: Compiler and CPU can both reorder!!! */

So don't leave out the ACCESS_ONCE().
Finally, the READ_ONCE_CTRL() includes an smp_read_barrier_depends()
that DEC Alpha needs in order to respect control depedencies.

So don't leave out the READ_ONCE_CTRL().

It is tempting to try to enforce ordering on identical stores on both
branches of the "if" statement as follows:

q = ACCESS_ONCE(a);
q = READ_ONCE_CTRL(a);
if (q) {
barrier();
ACCESS_ONCE(b) = p;
Expand All @@ -655,7 +658,7 @@ branches of the "if" statement as follows:
Unfortunately, current compilers will transform this as follows at high
optimization levels:

q = ACCESS_ONCE(a);
q = READ_ONCE_CTRL(a);
barrier();
ACCESS_ONCE(b) = p; /* BUG: No ordering vs. load from a!!! */
if (q) {
Expand Down Expand Up @@ -685,7 +688,7 @@ memory barriers, for example, smp_store_release():
In contrast, without explicit memory barriers, two-legged-if control
ordering is guaranteed only when the stores differ, for example:

q = ACCESS_ONCE(a);
q = READ_ONCE_CTRL(a);
if (q) {
ACCESS_ONCE(b) = p;
do_something();
Expand All @@ -694,14 +697,14 @@ ordering is guaranteed only when the stores differ, for example:
do_something_else();
}

The initial ACCESS_ONCE() is still required to prevent the compiler from
proving the value of 'a'.
The initial READ_ONCE_CTRL() is still required to prevent the compiler
from proving the value of 'a'.

In addition, you need to be careful what you do with the local variable 'q',
otherwise the compiler might be able to guess the value and again remove
the needed conditional. For example:

q = ACCESS_ONCE(a);
q = READ_ONCE_CTRL(a);
if (q % MAX) {
ACCESS_ONCE(b) = p;
do_something();
Expand All @@ -714,7 +717,7 @@ If MAX is defined to be 1, then the compiler knows that (q % MAX) is
equal to zero, in which case the compiler is within its rights to
transform the above code into the following:

q = ACCESS_ONCE(a);
q = READ_ONCE_CTRL(a);
ACCESS_ONCE(b) = p;
do_something_else();

Expand All @@ -725,7 +728,7 @@ is gone, and the barrier won't bring it back. Therefore, if you are
relying on this ordering, you should make sure that MAX is greater than
one, perhaps as follows:

q = ACCESS_ONCE(a);
q = READ_ONCE_CTRL(a);
BUILD_BUG_ON(MAX <= 1); /* Order load from a with store to b. */
if (q % MAX) {
ACCESS_ONCE(b) = p;
Expand All @@ -742,14 +745,15 @@ of the 'if' statement.
You must also be careful not to rely too much on boolean short-circuit
evaluation. Consider this example:

q = ACCESS_ONCE(a);
q = READ_ONCE_CTRL(a);
if (a || 1 > 0)
ACCESS_ONCE(b) = 1;

Because the second condition is always true, the compiler can transform
this example as following, defeating control dependency:
Because the first condition cannot fault and the second condition is
always true, the compiler can transform this example as following,
defeating control dependency:

q = ACCESS_ONCE(a);
q = READ_ONCE_CTRL(a);
ACCESS_ONCE(b) = 1;

This example underscores the need to ensure that the compiler cannot
Expand All @@ -762,8 +766,8 @@ demonstrated by two related examples, with the initial values of
x and y both being zero:

CPU 0 CPU 1
===================== =====================
r1 = ACCESS_ONCE(x); r2 = ACCESS_ONCE(y);
======================= =======================
r1 = READ_ONCE_CTRL(x); r2 = READ_ONCE_CTRL(y);
if (r1 > 0) if (r2 > 0)
ACCESS_ONCE(y) = 1; ACCESS_ONCE(x) = 1;

Expand All @@ -783,14 +787,21 @@ But because control dependencies do -not- provide transitivity, the above
assertion can fail after the combined three-CPU example completes. If you
need the three-CPU example to provide ordering, you will need smp_mb()
between the loads and stores in the CPU 0 and CPU 1 code fragments,
that is, just before or just after the "if" statements.
that is, just before or just after the "if" statements. Furthermore,
the original two-CPU example is very fragile and should be avoided.

These two examples are the LB and WWC litmus tests from this paper:
http://www.cl.cam.ac.uk/users/pes20/ppc-supplemental/test6.pdf and this
site: https://www.cl.cam.ac.uk/~pes20/ppcmem/index.html.

In summary:

(*) Control dependencies must be headed by READ_ONCE_CTRL().
Or, as a much less preferable alternative, interpose
be headed by READ_ONCE() or an ACCESS_ONCE() read and must
have smp_read_barrier_depends() between this read and the
control-dependent write.

(*) Control dependencies can order prior loads against later stores.
However, they do -not- guarantee any other sort of ordering:
Not prior loads against later loads, nor prior stores against
Expand Down Expand Up @@ -1784,10 +1795,9 @@ for each construct. These operations all imply certain barriers:

Memory operations issued before the ACQUIRE may be completed after
the ACQUIRE operation has completed. An smp_mb__before_spinlock(),
combined with a following ACQUIRE, orders prior loads against
subsequent loads and stores and also orders prior stores against
subsequent stores. Note that this is weaker than smp_mb()! The
smp_mb__before_spinlock() primitive is free on many architectures.
combined with a following ACQUIRE, orders prior stores against
subsequent loads and stores. Note that this is weaker than smp_mb()!
The smp_mb__before_spinlock() primitive is free on many architectures.

(2) RELEASE operation implication:

Expand Down
1 change: 1 addition & 0 deletions arch/powerpc/include/asm/barrier.h
Original file line number Diff line number Diff line change
Expand Up @@ -89,5 +89,6 @@ do { \

#define smp_mb__before_atomic() smp_mb()
#define smp_mb__after_atomic() smp_mb()
#define smp_mb__before_spinlock() smp_mb()

#endif /* _ASM_POWERPC_BARRIER_H */
Loading

0 comments on commit fc934d4

Please sign in to comment.