Skip to content

Commit

Permalink
Merge branch 'master' of /home/davem/src/GIT/linux-2.6/
Browse files Browse the repository at this point in the history
Conflicts:
	include/linux/mod_devicetable.h
	scripts/mod/file2alias.c
  • Loading branch information
davem330 committed May 19, 2010
2 parents 7b39f90 + 537b60d commit 2ec8c6b
Show file tree
Hide file tree
Showing 577 changed files with 24,542 additions and 14,740 deletions.
94 changes: 71 additions & 23 deletions Documentation/RCU/stallwarn.txt
Original file line number Diff line number Diff line change
Expand Up @@ -3,56 +3,104 @@ Using RCU's CPU Stall Detector
The CONFIG_RCU_CPU_STALL_DETECTOR kernel config parameter enables
RCU's CPU stall detector, which detects conditions that unduly delay
RCU grace periods. The stall detector's idea of what constitutes
"unduly delayed" is controlled by a pair of C preprocessor macros:
"unduly delayed" is controlled by a set of C preprocessor macros:

RCU_SECONDS_TILL_STALL_CHECK

This macro defines the period of time that RCU will wait from
the beginning of a grace period until it issues an RCU CPU
stall warning. It is normally ten seconds.
stall warning. This time period is normally ten seconds.

RCU_SECONDS_TILL_STALL_RECHECK

This macro defines the period of time that RCU will wait after
issuing a stall warning until it issues another stall warning.
It is normally set to thirty seconds.
issuing a stall warning until it issues another stall warning
for the same stall. This time period is normally set to thirty
seconds.

RCU_STALL_RAT_DELAY

The CPU stall detector tries to make the offending CPU rat on itself,
as this often gives better-quality stack traces. However, if
the offending CPU does not detect its own stall in the number
of jiffies specified by RCU_STALL_RAT_DELAY, then other CPUs will
complain. This is normally set to two jiffies.
The CPU stall detector tries to make the offending CPU print its
own warnings, as this often gives better-quality stack traces.
However, if the offending CPU does not detect its own stall in
the number of jiffies specified by RCU_STALL_RAT_DELAY, then
some other CPU will complain. This delay is normally set to
two jiffies.

The following problems can result in an RCU CPU stall warning:
When a CPU detects that it is stalling, it will print a message similar
to the following:

INFO: rcu_sched_state detected stall on CPU 5 (t=2500 jiffies)

This message indicates that CPU 5 detected that it was causing a stall,
and that the stall was affecting RCU-sched. This message will normally be
followed by a stack dump of the offending CPU. On TREE_RCU kernel builds,
RCU and RCU-sched are implemented by the same underlying mechanism,
while on TREE_PREEMPT_RCU kernel builds, RCU is instead implemented
by rcu_preempt_state.

On the other hand, if the offending CPU fails to print out a stall-warning
message quickly enough, some other CPU will print a message similar to
the following:

INFO: rcu_bh_state detected stalls on CPUs/tasks: { 3 5 } (detected by 2, 2502 jiffies)

This message indicates that CPU 2 detected that CPUs 3 and 5 were both
causing stalls, and that the stall was affecting RCU-bh. This message
will normally be followed by stack dumps for each CPU. Please note that
TREE_PREEMPT_RCU builds can be stalled by tasks as well as by CPUs,
and that the tasks will be indicated by PID, for example, "P3421".
It is even possible for a rcu_preempt_state stall to be caused by both
CPUs -and- tasks, in which case the offending CPUs and tasks will all
be called out in the list.

Finally, if the grace period ends just as the stall warning starts
printing, there will be a spurious stall-warning message:

INFO: rcu_bh_state detected stalls on CPUs/tasks: { } (detected by 4, 2502 jiffies)

This is rare, but does happen from time to time in real life.

So your kernel printed an RCU CPU stall warning. The next question is
"What caused it?" The following problems can result in RCU CPU stall
warnings:

o A CPU looping in an RCU read-side critical section.

o A CPU looping with interrupts disabled.
o A CPU looping with interrupts disabled. This condition can
result in RCU-sched and RCU-bh stalls.

o A CPU looping with preemption disabled.
o A CPU looping with preemption disabled. This condition can
result in RCU-sched stalls and, if ksoftirqd is in use, RCU-bh
stalls.

o A CPU looping with bottom halves disabled. This condition can
result in RCU-sched and RCU-bh stalls.

o For !CONFIG_PREEMPT kernels, a CPU looping anywhere in the kernel
without invoking schedule().

o A bug in the RCU implementation.

o A hardware failure. This is quite unlikely, but has occurred
at least once in a former life. A CPU failed in a running system,
at least once in real life. A CPU failed in a running system,
becoming unresponsive, but not causing an immediate crash.
This resulted in a series of RCU CPU stall warnings, eventually
leading the realization that the CPU had failed.

The RCU, RCU-sched, and RCU-bh implementations have CPU stall warning.
SRCU does not do so directly, but its calls to synchronize_sched() will
result in RCU-sched detecting any CPU stalls that might be occurring.

To diagnose the cause of the stall, inspect the stack traces. The offending
function will usually be near the top of the stack. If you have a series
of stall warnings from a single extended stall, comparing the stack traces
can often help determine where the stall is occurring, which will usually
be in the function nearest the top of the stack that stays the same from
trace to trace.
The RCU, RCU-sched, and RCU-bh implementations have CPU stall
warning. SRCU does not have its own CPU stall warnings, but its
calls to synchronize_sched() will result in RCU-sched detecting
RCU-sched-related CPU stalls. Please note that RCU only detects
CPU stalls when there is a grace period in progress. No grace period,
no CPU stall warnings.

To diagnose the cause of the stall, inspect the stack traces.
The offending function will usually be near the top of the stack.
If you have a series of stall warnings from a single extended stall,
comparing the stack traces can often help determine where the stall
is occurring, which will usually be in the function nearest the top of
that portion of the stack which remains the same from trace to trace.
If you can reliably trigger the stall, ftrace can be quite helpful.

RCU bugs can often be debugged with the help of CONFIG_RCU_TRACE.
10 changes: 0 additions & 10 deletions Documentation/RCU/torture.txt
Original file line number Diff line number Diff line change
Expand Up @@ -182,16 +182,6 @@ Similarly, sched_expedited RCU provides the following:
sched_expedited-torture: Reader Pipe: 12660320201 95875 0 0 0 0 0 0 0 0 0
sched_expedited-torture: Reader Batch: 12660424885 0 0 0 0 0 0 0 0 0 0
sched_expedited-torture: Free-Block Circulation: 1090795 1090795 1090794 1090793 1090792 1090791 1090790 1090789 1090788 1090787 0
state: -1 / 0:0 3:0 4:0

As before, the first four lines are similar to those for RCU.
The last line shows the task-migration state. The first number is
-1 if synchronize_sched_expedited() is idle, -2 if in the process of
posting wakeups to the migration kthreads, and N when waiting on CPU N.
Each of the colon-separated fields following the "/" is a CPU:state pair.
Valid states are "0" for idle, "1" for waiting for quiescent state,
"2" for passed through quiescent state, and "3" when a race with a
CPU-hotplug event forces use of the synchronize_sched() primitive.


USAGE
Expand Down
35 changes: 19 additions & 16 deletions Documentation/RCU/trace.txt
Original file line number Diff line number Diff line change
Expand Up @@ -256,23 +256,23 @@ o Each element of the form "1/1 0:127 ^0" represents one struct
The output of "cat rcu/rcu_pending" looks as follows:

rcu_sched:
0 np=255892 qsp=53936 cbr=0 cng=14417 gpc=10033 gps=24320 nf=6445 nn=146741
1 np=261224 qsp=54638 cbr=0 cng=25723 gpc=16310 gps=2849 nf=5912 nn=155792
2 np=237496 qsp=49664 cbr=0 cng=2762 gpc=45478 gps=1762 nf=1201 nn=136629
3 np=236249 qsp=48766 cbr=0 cng=286 gpc=48049 gps=1218 nf=207 nn=137723
4 np=221310 qsp=46850 cbr=0 cng=26 gpc=43161 gps=4634 nf=3529 nn=123110
5 np=237332 qsp=48449 cbr=0 cng=54 gpc=47920 gps=3252 nf=201 nn=137456
6 np=219995 qsp=46718 cbr=0 cng=50 gpc=42098 gps=6093 nf=4202 nn=120834
7 np=249893 qsp=49390 cbr=0 cng=72 gpc=38400 gps=17102 nf=41 nn=144888
0 np=255892 qsp=53936 rpq=85 cbr=0 cng=14417 gpc=10033 gps=24320 nf=6445 nn=146741
1 np=261224 qsp=54638 rpq=33 cbr=0 cng=25723 gpc=16310 gps=2849 nf=5912 nn=155792
2 np=237496 qsp=49664 rpq=23 cbr=0 cng=2762 gpc=45478 gps=1762 nf=1201 nn=136629
3 np=236249 qsp=48766 rpq=98 cbr=0 cng=286 gpc=48049 gps=1218 nf=207 nn=137723
4 np=221310 qsp=46850 rpq=7 cbr=0 cng=26 gpc=43161 gps=4634 nf=3529 nn=123110
5 np=237332 qsp=48449 rpq=9 cbr=0 cng=54 gpc=47920 gps=3252 nf=201 nn=137456
6 np=219995 qsp=46718 rpq=12 cbr=0 cng=50 gpc=42098 gps=6093 nf=4202 nn=120834
7 np=249893 qsp=49390 rpq=42 cbr=0 cng=72 gpc=38400 gps=17102 nf=41 nn=144888
rcu_bh:
0 np=146741 qsp=1419 cbr=0 cng=6 gpc=0 gps=0 nf=2 nn=145314
1 np=155792 qsp=12597 cbr=0 cng=0 gpc=4 gps=8 nf=3 nn=143180
2 np=136629 qsp=18680 cbr=0 cng=0 gpc=7 gps=6 nf=0 nn=117936
3 np=137723 qsp=2843 cbr=0 cng=0 gpc=10 gps=7 nf=0 nn=134863
4 np=123110 qsp=12433 cbr=0 cng=0 gpc=4 gps=2 nf=0 nn=110671
5 np=137456 qsp=4210 cbr=0 cng=0 gpc=6 gps=5 nf=0 nn=133235
6 np=120834 qsp=9902 cbr=0 cng=0 gpc=6 gps=3 nf=2 nn=110921
7 np=144888 qsp=26336 cbr=0 cng=0 gpc=8 gps=2 nf=0 nn=118542
0 np=146741 qsp=1419 rpq=6 cbr=0 cng=6 gpc=0 gps=0 nf=2 nn=145314
1 np=155792 qsp=12597 rpq=3 cbr=0 cng=0 gpc=4 gps=8 nf=3 nn=143180
2 np=136629 qsp=18680 rpq=1 cbr=0 cng=0 gpc=7 gps=6 nf=0 nn=117936
3 np=137723 qsp=2843 rpq=0 cbr=0 cng=0 gpc=10 gps=7 nf=0 nn=134863
4 np=123110 qsp=12433 rpq=0 cbr=0 cng=0 gpc=4 gps=2 nf=0 nn=110671
5 np=137456 qsp=4210 rpq=1 cbr=0 cng=0 gpc=6 gps=5 nf=0 nn=133235
6 np=120834 qsp=9902 rpq=2 cbr=0 cng=0 gpc=6 gps=3 nf=2 nn=110921
7 np=144888 qsp=26336 rpq=0 cbr=0 cng=0 gpc=8 gps=2 nf=0 nn=118542

As always, this is once again split into "rcu_sched" and "rcu_bh"
portions, with CONFIG_TREE_PREEMPT_RCU kernels having an additional
Expand All @@ -284,6 +284,9 @@ o "np" is the number of times that __rcu_pending() has been invoked
o "qsp" is the number of times that the RCU was waiting for a
quiescent state from this CPU.

o "rpq" is the number of times that the CPU had passed through
a quiescent state, but not yet reported it to RCU.

o "cbr" is the number of times that this CPU had RCU callbacks
that had passed through a grace period, and were thus ready
to be invoked.
Expand Down
16 changes: 9 additions & 7 deletions Documentation/intel_txt.txt
Original file line number Diff line number Diff line change
Expand Up @@ -161,13 +161,15 @@ o In order to put a system into any of the sleep states after a TXT
has been restored, it will restore the TPM PCRs and then
transfer control back to the kernel's S3 resume vector.
In order to preserve system integrity across S3, the kernel
provides tboot with a set of memory ranges (kernel
code/data/bss, S3 resume code, and AP trampoline) that tboot
will calculate a MAC (message authentication code) over and then
seal with the TPM. On resume and once the measured environment
has been re-established, tboot will re-calculate the MAC and
verify it against the sealed value. Tboot's policy determines
what happens if the verification fails.
provides tboot with a set of memory ranges (RAM and RESERVED_KERN
in the e820 table, but not any memory that BIOS might alter over
the S3 transition) that tboot will calculate a MAC (message
authentication code) over and then seal with the TPM. On resume
and once the measured environment has been re-established, tboot
will re-calculate the MAC and verify it against the sealed value.
Tboot's policy determines what happens if the verification fails.
Note that the c/s 194 of tboot which has the new MAC code supports
this.

That's pretty much it for TXT support.

Expand Down
8 changes: 7 additions & 1 deletion Documentation/kernel-parameters.txt
Original file line number Diff line number Diff line change
Expand Up @@ -324,6 +324,8 @@ and is between 256 and 4096 characters. It is defined in the file
they are unmapped. Otherwise they are
flushed before they will be reused, which
is a lot of faster
off - do not initialize any AMD IOMMU found in
the system

amijoy.map= [HW,JOY] Amiga joystick support
Map of devices attached to JOY0DAT and JOY1DAT
Expand Down Expand Up @@ -784,8 +786,12 @@ and is between 256 and 4096 characters. It is defined in the file
as early as possible in order to facilitate early
boot debugging.

ftrace_dump_on_oops
ftrace_dump_on_oops[=orig_cpu]
[FTRACE] will dump the trace buffers on oops.
If no parameter is passed, ftrace will dump
buffers of all CPUs, but if you pass orig_cpu, it will
dump only the buffer of the CPU that triggered the
oops.

ftrace_filter=[function-list]
[FTRACE] Limit the functions traced by the function
Expand Down
10 changes: 2 additions & 8 deletions Documentation/kprobes.txt
Original file line number Diff line number Diff line change
Expand Up @@ -165,8 +165,8 @@ the user entry_handler invocation is also skipped.

1.4 How Does Jump Optimization Work?

If you configured your kernel with CONFIG_OPTPROBES=y (currently
this option is supported on x86/x86-64, non-preemptive kernel) and
If your kernel is built with CONFIG_OPTPROBES=y (currently this flag
is automatically set 'y' on x86/x86-64, non-preemptive kernel) and
the "debug.kprobes_optimization" kernel parameter is set to 1 (see
sysctl(8)), Kprobes tries to reduce probe-hit overhead by using a jump
instruction instead of a breakpoint instruction at each probepoint.
Expand Down Expand Up @@ -271,8 +271,6 @@ tweak the kernel's execution path, you need to suppress optimization,
using one of the following techniques:
- Specify an empty function for the kprobe's post_handler or break_handler.
or
- Config CONFIG_OPTPROBES=n.
or
- Execute 'sysctl -w debug.kprobes_optimization=n'

2. Architectures Supported
Expand Down Expand Up @@ -307,10 +305,6 @@ it useful to "Compile the kernel with debug info" (CONFIG_DEBUG_INFO),
so you can use "objdump -d -l vmlinux" to see the source-to-object
code mapping.

If you want to reduce probing overhead, set "Kprobes jump optimization
support" (CONFIG_OPTPROBES) to "y". You can find this option under the
"Kprobes" line.

4. API Reference

The Kprobes API includes a "register" function and an "unregister"
Expand Down
58 changes: 58 additions & 0 deletions Documentation/rbtree.txt
Original file line number Diff line number Diff line change
Expand Up @@ -190,3 +190,61 @@ Example:
for (node = rb_first(&mytree); node; node = rb_next(node))
printk("key=%s\n", rb_entry(node, struct mytype, node)->keystring);

Support for Augmented rbtrees
-----------------------------

Augmented rbtree is an rbtree with "some" additional data stored in each node.
This data can be used to augment some new functionality to rbtree.
Augmented rbtree is an optional feature built on top of basic rbtree
infrastructure. rbtree user who wants this feature will have an augment
callback function in rb_root initialized.

This callback function will be called from rbtree core routines whenever
a node has a change in one or both of its children. It is the responsibility
of the callback function to recalculate the additional data that is in the
rb node using new children information. Note that if this new additional
data affects the parent node's additional data, then callback function has
to handle it and do the recursive updates.


Interval tree is an example of augmented rb tree. Reference -
"Introduction to Algorithms" by Cormen, Leiserson, Rivest and Stein.
More details about interval trees:

Classical rbtree has a single key and it cannot be directly used to store
interval ranges like [lo:hi] and do a quick lookup for any overlap with a new
lo:hi or to find whether there is an exact match for a new lo:hi.

However, rbtree can be augmented to store such interval ranges in a structured
way making it possible to do efficient lookup and exact match.

This "extra information" stored in each node is the maximum hi
(max_hi) value among all the nodes that are its descendents. This
information can be maintained at each node just be looking at the node
and its immediate children. And this will be used in O(log n) lookup
for lowest match (lowest start address among all possible matches)
with something like:

find_lowest_match(lo, hi, node)
{
lowest_match = NULL;
while (node) {
if (max_hi(node->left) > lo) {
// Lowest overlap if any must be on left side
node = node->left;
} else if (overlap(lo, hi, node)) {
lowest_match = node;
break;
} else if (lo > node->lo) {
// Lowest overlap if any must be on right side
node = node->right;
} else {
break;
}
}
return lowest_match;
}

Finding exact match will be to first find lowest match and then to follow
successor nodes looking for exact match, until the start of a node is beyond
the hi value we are looking for.
Loading

0 comments on commit 2ec8c6b

Please sign in to comment.