Merge tag 'pm-4.17-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull more power management updates from Rafael Wysocki:
 "These include one big-ticket item which is the rework of the idle loop
  in order to prevent CPUs from spending too much time in shallow idle
  states. It reduces idle power on some systems by 10% or more and may
  improve performance of workloads in which the idle loop overhead
  matters. This has been in the works for several weeks and it has been
  tested and reviewed quite thoroughly.

  Also included are changes that finalize the cpufreq cleanup moving
  frequency table validation from drivers to the core, a few fixes and
  cleanups of cpufreq drivers, a cpuidle documentation update and a PM
  QoS core update to mark the expected switch fall-throughs in it.

  Specifics:

   - Rework the idle loop in order to prevent CPUs from spending too
     much time in shallow idle states by making it stop the scheduler
     tick before putting the CPU into an idle state only if the idle
     duration predicted by the idle governor is long enough.

     That required the code to be reordered to invoke the idle governor
     before stopping the tick, among other things (Rafael Wysocki,
     Frederic Weisbecker, Arnd Bergmann).

   - Add the missing description of the residency sysfs attribute to the
     cpuidle documentation (Prashanth Prakash).

   - Finalize the cpufreq cleanup moving frequency table validation from
     drivers to the core (Viresh Kumar).

   - Fix a clock leak regression in the armada-37xx cpufreq driver
     (Gregory Clement).

   - Fix the initialization of the CPU performance data structures for
     shared policies in the CPPC cpufreq driver (Shunyong Yang).

   - Clean up the ti-cpufreq, intel_pstate and CPPC cpufreq drivers a
     bit (Viresh Kumar, Rafael Wysocki).

   - Mark the expected switch fall-throughs in the PM QoS core (Gustavo
     Silva)"

* tag 'pm-4.17-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (23 commits)
  tick-sched: avoid a maybe-uninitialized warning
  cpufreq: Drop cpufreq_table_validate_and_show()
  cpufreq: SCMI: Don't validate the frequency table twice
  cpufreq: CPPC: Initialize shared perf capabilities of CPUs
  cpufreq: armada-37xx: Fix clock leak
  cpufreq: CPPC: Don't set transition_latency
  cpufreq: ti-cpufreq: Use builtin_platform_driver()
  cpufreq: intel_pstate: Do not include debugfs.h
  PM / QoS: mark expected switch fall-throughs
  cpuidle: Add definition of residency to sysfs documentation
  time: hrtimer: Use timerqueue_iterate_next() to get to the next timer
  nohz: Avoid duplication of code related to got_idle_tick
  nohz: Gather tick_sched booleans under a common flag field
  cpuidle: menu: Avoid selecting shallow states with stopped tick
  cpuidle: menu: Refine idle state selection for running tick
  sched: idle: Select idle state before stopping the tick
  time: hrtimer: Introduce hrtimer_next_event_without()
  time: tick-sched: Split tick_nohz_stop_sched_tick()
  cpuidle: Return nohz hint from cpuidle_select()
  jiffies: Introduce USER_TICK_USEC and redefine TICK_USEC
  ...
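
The idle-loop rework described in the pull message (ask the governor for a state first, then stop the tick only if the predicted idle duration is long enough) can be illustrated with a small userspace sketch. This is not the kernel's code: `struct idle_state`, `select_idle_state()`, `idle_step()` and `TICK_PERIOD_US` are hypothetical names, and the selection rule is a deliberately simplified stand-in for what a governor such as menu does. Only the `stop_tick` convention (the governor writes `false` when the tick should keep running) is taken from the `cpuidle_select()` comment in this commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified idle state description (not the kernel's struct). */
struct idle_state {
	unsigned int exit_latency_us;     /* time needed to leave the state */
	unsigned int target_residency_us; /* break-even time vs. shallower states */
};

/* Assumed tick period for illustration only (4 ms, as with HZ=250). */
#define TICK_PERIOD_US 4000u

/*
 * Simplified governor: pick the deepest state whose target residency fits
 * the predicted idle duration and whose exit latency respects the limit.
 * Writes 'false' to *stop_tick when the predicted idle duration is shorter
 * than one tick period, otherwise leaves *stop_tick untouched.
 */
int select_idle_state(const struct idle_state *states, size_t count,
		      unsigned int predicted_us, unsigned int latency_limit_us,
		      bool *stop_tick)
{
	int idx = 0;
	size_t i;

	assert(count > 0 && stop_tick != NULL);

	/* States are assumed to be ordered from shallowest to deepest. */
	for (i = 1; i < count; i++) {
		if (states[i].target_residency_us > predicted_us)
			break;
		if (states[i].exit_latency_us > latency_limit_us)
			break;
		idx = (int)i;
	}

	if (predicted_us < TICK_PERIOD_US)
		*stop_tick = false;

	return idx;
}

/*
 * One idle-loop step: consult the governor before touching the tick, then
 * report whether the tick would be stopped for the selected state.
 */
int idle_step(const struct idle_state *states, size_t count,
	      unsigned int predicted_us, unsigned int latency_limit_us,
	      bool *tick_stopped)
{
	bool stop_tick = true; /* default: the tick may be stopped */
	int idx = select_idle_state(states, count, predicted_us,
				    latency_limit_us, &stop_tick);

	*tick_stopped = stop_tick;
	return idx;
}
```

With three states whose target residencies are 1, 200 and 5000 microseconds, a predicted idle duration of 300 microseconds selects the middle state and keeps the tick running, while a prediction of 10000 microseconds selects the deepest state and allows the tick to be stopped.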
torvalds committed Apr 12, 2018
2 parents 9697376 + 51798de commit 1fe4311
Showing 25 changed files with 438 additions and 161 deletions.
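
The cpufreq part of this merge removes `cpufreq_table_validate_and_show()`: drivers now only assign `policy->freq_table`, and the core verifies the table when that pointer is valid. The following userspace sketch mimics that division of labour under stated assumptions. All names here (`struct policy`, `struct freq_entry`, `mock_driver_init()`, `mock_core_validate()`, `mock_policy_online()`, `ENTRY_INVALID`, `TABLE_END`) are hypothetical stand-ins, and the min/max scan is a simplified approximation of what the core's validation does, not a copy of it.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define ENTRY_INVALID (~0u) /* stand-in for CPUFREQ_ENTRY_INVALID */
#define TABLE_END     (~1u) /* stand-in for CPUFREQ_TABLE_END */

/* Hypothetical, reduced versions of the kernel structures. */
struct freq_entry {
	unsigned int frequency; /* kHz */
};

struct policy {
	const struct freq_entry *freq_table; /* set by the driver, may be NULL */
	unsigned int min_freq;
	unsigned int max_freq;
};

/* Driver side: just hand the table over, no validation call. */
int mock_driver_init(struct policy *p, const struct freq_entry *table)
{
	assert(p != NULL);
	p->freq_table = table;
	return 0;
}

/* Core side: validate the table only if the driver supplied one. */
int mock_core_validate(struct policy *p)
{
	unsigned int min = ~0u, max = 0;
	const struct freq_entry *e;

	if (!p->freq_table)
		return 0; /* nothing to validate */

	for (e = p->freq_table; e->frequency != TABLE_END; e++) {
		if (e->frequency == ENTRY_INVALID)
			continue; /* skip entries marked invalid */
		if (e->frequency < min)
			min = e->frequency;
		if (e->frequency > max)
			max = e->frequency;
	}

	if (max == 0)
		return -EINVAL; /* no usable frequency found */

	p->min_freq = min;
	p->max_freq = max;
	return 0;
}

/* Core-driven bring-up: driver init first, validation afterwards. */
int mock_policy_online(struct policy *p, const struct freq_entry *table)
{
	int ret = mock_driver_init(p, table);

	if (ret)
		return ret;
	return mock_core_validate(p);
}
```

Keeping validation in one place means a driver such as scmi-cpufreq no longer needs its own error path for an invalid table, which is what the `out_free_cpufreq_table` removal in this commit reflects.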
12 changes: 5 additions & 7 deletions Documentation/cpu-freq/core.txt
@@ -97,12 +97,10 @@ flags - flags of the cpufreq driver
 ==================================================================
 For details about OPP, see Documentation/power/opp.txt

-dev_pm_opp_init_cpufreq_table - cpufreq framework typically is initialized with
-	cpufreq_table_validate_and_show() which is provided with the list of
-	frequencies that are available for operation. This function provides
-	a ready to use conversion routine to translate the OPP layer's internal
-	information about the available frequencies into a format readily
-	providable to cpufreq.
+dev_pm_opp_init_cpufreq_table -
+	This function provides a ready to use conversion routine to translate
+	the OPP layer's internal information about the available frequencies
+	into a format readily providable to cpufreq.

 	WARNING: Do not use this function in interrupt context.

@@ -112,7 +110,7 @@ dev_pm_opp_init_cpufreq_table - cpufreq framework typically is initialized with
 	/* Do things */
 	r = dev_pm_opp_init_cpufreq_table(dev, &freq_table);
 	if (!r)
-		cpufreq_table_validate_and_show(policy, freq_table);
+		policy->freq_table = freq_table;
 	/* Do other things */
 }

6 changes: 2 additions & 4 deletions Documentation/cpu-freq/cpu-drivers.txt
@@ -259,10 +259,8 @@ CPUFREQ_ENTRY_INVALID. The entries don't need to be in sorted in any
 particular order, but if they are cpufreq core will do DVFS a bit
 quickly for them as search for best match is faster.

-By calling cpufreq_table_validate_and_show(), the cpuinfo.min_freq and
-cpuinfo.max_freq values are detected, and policy->min and policy->max
-are set to the same values. This is helpful for the per-CPU
-initialization stage.
+The cpufreq table is verified automatically by the core if the policy contains a
+valid pointer in its policy->freq_table field.

 cpufreq_frequency_table_verify() assures that at least one valid
 frequency is within policy->min and policy->max, and all other criteria
6 changes: 6 additions & 0 deletions Documentation/cpuidle/sysfs.txt
@@ -40,6 +40,7 @@ total 0
 -r--r--r-- 1 root root 4096 Feb 8 10:42 latency
 -r--r--r-- 1 root root 4096 Feb 8 10:42 name
 -r--r--r-- 1 root root 4096 Feb 8 10:42 power
+-r--r--r-- 1 root root 4096 Feb 8 10:42 residency
 -r--r--r-- 1 root root 4096 Feb 8 10:42 time
 -r--r--r-- 1 root root 4096 Feb 8 10:42 usage

@@ -50,6 +51,7 @@ total 0
 -r--r--r-- 1 root root 4096 Feb 8 10:42 latency
 -r--r--r-- 1 root root 4096 Feb 8 10:42 name
 -r--r--r-- 1 root root 4096 Feb 8 10:42 power
+-r--r--r-- 1 root root 4096 Feb 8 10:42 residency
 -r--r--r-- 1 root root 4096 Feb 8 10:42 time
 -r--r--r-- 1 root root 4096 Feb 8 10:42 usage

@@ -60,6 +62,7 @@ total 0
 -r--r--r-- 1 root root 4096 Feb 8 10:42 latency
 -r--r--r-- 1 root root 4096 Feb 8 10:42 name
 -r--r--r-- 1 root root 4096 Feb 8 10:42 power
+-r--r--r-- 1 root root 4096 Feb 8 10:42 residency
 -r--r--r-- 1 root root 4096 Feb 8 10:42 time
 -r--r--r-- 1 root root 4096 Feb 8 10:42 usage

@@ -70,6 +73,7 @@ total 0
 -r--r--r-- 1 root root 4096 Feb 8 10:42 latency
 -r--r--r-- 1 root root 4096 Feb 8 10:42 name
 -r--r--r-- 1 root root 4096 Feb 8 10:42 power
+-r--r--r-- 1 root root 4096 Feb 8 10:42 residency
 -r--r--r-- 1 root root 4096 Feb 8 10:42 time
 -r--r--r-- 1 root root 4096 Feb 8 10:42 usage
 --------------------------------------------------------------------------------
@@ -78,6 +82,8 @@ total 0
 * desc : Small description about the idle state (string)
 * disable : Option to disable this idle state (bool) -> see note below
 * latency : Latency to exit out of this idle state (in microseconds)
+* residency : Time after which a state becomes more effecient than any
+              shallower state (in microseconds)
 * name : Name of the idle state (string)
 * power : Power consumed while in this idle state (in milliwatts)
 * time : Total time spent in this idle state (in microseconds)
1 change: 1 addition & 0 deletions arch/x86/xen/smp_pv.c
@@ -425,6 +425,7 @@ static void xen_pv_play_dead(void) /* used only with HOTPLUG_CPU */
 	 * data back is to call:
 	 */
 	tick_nohz_idle_enter();
+	tick_nohz_idle_stop_tick_protected();

 	cpuhp_online_idle(CPUHP_AP_ONLINE_IDLE);
 }
2 changes: 2 additions & 0 deletions drivers/cpufreq/armada-37xx-cpufreq.c
@@ -202,6 +202,7 @@ static int __init armada37xx_cpufreq_driver_init(void)
 	cur_frequency = clk_get_rate(clk);
 	if (!cur_frequency) {
 		dev_err(cpu_dev, "Failed to get clock rate for CPU\n");
+		clk_put(clk);
 		return -EINVAL;
 	}

@@ -210,6 +211,7 @@ static int __init armada37xx_cpufreq_driver_init(void)
 		return -EINVAL;

 	armada37xx_cpufreq_dvfs_setup(nb_pm_base, clk, dvfs->divider);
+	clk_put(clk);

 	for (load_lvl = ARMADA_37XX_DVFS_LOAD_0; load_lvl < LOAD_LEVEL_NR;
 	     load_lvl++) {
15 changes: 12 additions & 3 deletions drivers/cpufreq/cppc_cpufreq.c
@@ -162,14 +162,23 @@ static int cppc_cpufreq_cpu_init(struct cpufreq_policy *policy)
 		cpu->perf_caps.highest_perf;
 	policy->cpuinfo.max_freq = cppc_dmi_max_khz;

-	policy->cpuinfo.transition_latency = cppc_get_transition_latency(cpu_num);
 	policy->transition_delay_us = cppc_get_transition_latency(cpu_num) /
 		NSEC_PER_USEC;
 	policy->shared_type = cpu->shared_type;

-	if (policy->shared_type == CPUFREQ_SHARED_TYPE_ANY)
+	if (policy->shared_type == CPUFREQ_SHARED_TYPE_ANY) {
+		int i;
+
 		cpumask_copy(policy->cpus, cpu->shared_cpu_map);
-	else if (policy->shared_type == CPUFREQ_SHARED_TYPE_ALL) {
+
+		for_each_cpu(i, policy->cpus) {
+			if (unlikely(i == policy->cpu))
+				continue;
+
+			memcpy(&all_cpu_data[i]->perf_caps, &cpu->perf_caps,
+			       sizeof(cpu->perf_caps));
+		}
+	} else if (policy->shared_type == CPUFREQ_SHARED_TYPE_ALL) {
 		/* Support only SW_ANY for now. */
 		pr_debug("Unsupported CPU co-ord type\n");
 		return -EFAULT;
14 changes: 0 additions & 14 deletions drivers/cpufreq/freq_table.c
@@ -352,20 +352,6 @@ static int set_freq_table_sorted(struct cpufreq_policy *policy)
 	return 0;
 }

-int cpufreq_table_validate_and_show(struct cpufreq_policy *policy,
-				      struct cpufreq_frequency_table *table)
-{
-	int ret;
-
-	ret = cpufreq_frequency_table_cpuinfo(policy, table);
-	if (ret)
-		return ret;
-
-	policy->freq_table = table;
-	return 0;
-}
-EXPORT_SYMBOL_GPL(cpufreq_table_validate_and_show);
-
 int cpufreq_table_validate_and_sort(struct cpufreq_policy *policy)
 {
 	int ret;
1 change: 0 additions & 1 deletion drivers/cpufreq/intel_pstate.c
@@ -26,7 +26,6 @@
 #include <linux/sysfs.h>
 #include <linux/types.h>
 #include <linux/fs.h>
-#include <linux/debugfs.h>
 #include <linux/acpi.h>
 #include <linux/vmalloc.h>
 #include <trace/events/power.h>
10 changes: 1 addition & 9 deletions drivers/cpufreq/scmi-cpufreq.c
@@ -159,13 +159,7 @@ static int scmi_cpufreq_init(struct cpufreq_policy *policy)
 	priv->domain_id = handle->perf_ops->device_domain_id(cpu_dev);

 	policy->driver_data = priv;
-
-	ret = cpufreq_table_validate_and_show(policy, freq_table);
-	if (ret) {
-		dev_err(cpu_dev, "%s: invalid frequency table: %d\n", __func__,
-			ret);
-		goto out_free_cpufreq_table;
-	}
+	policy->freq_table = freq_table;

 	/* SCMI allows DVFS request for any domain from any CPU */
 	policy->dvfs_possible_from_any_cpu = true;
@@ -179,8 +173,6 @@ static int scmi_cpufreq_init(struct cpufreq_policy *policy)
 	policy->fast_switch_possible = true;
 	return 0;

-out_free_cpufreq_table:
-	dev_pm_opp_free_cpufreq_table(cpu_dev, &freq_table);
 out_free_priv:
 	kfree(priv);
 out_free_opp:
2 changes: 1 addition & 1 deletion drivers/cpufreq/ti-cpufreq.c
@@ -304,7 +304,7 @@ static struct platform_driver ti_cpufreq_driver = {
 		.name = "ti-cpufreq",
 	},
 };
-module_platform_driver(ti_cpufreq_driver);
+builtin_platform_driver(ti_cpufreq_driver);

 MODULE_DESCRIPTION("TI CPUFreq/OPP hw-supported driver");
 MODULE_AUTHOR("Dave Gerlach <[email protected]>");
10 changes: 8 additions & 2 deletions drivers/cpuidle/cpuidle.c
@@ -272,12 +272,18 @@ int cpuidle_enter_state(struct cpuidle_device *dev, struct cpuidle_driver *drv,
  *
  * @drv: the cpuidle driver
  * @dev: the cpuidle device
+ * @stop_tick: indication on whether or not to stop the tick
  *
  * Returns the index of the idle state. The return value must not be negative.
+ *
+ * The memory location pointed to by @stop_tick is expected to be written the
+ * 'false' boolean value if the scheduler tick should not be stopped before
+ * entering the returned state.
  */
-int cpuidle_select(struct cpuidle_driver *drv, struct cpuidle_device *dev)
+int cpuidle_select(struct cpuidle_driver *drv, struct cpuidle_device *dev,
+		   bool *stop_tick)
 {
-	return cpuidle_curr_governor->select(drv, dev);
+	return cpuidle_curr_governor->select(drv, dev, stop_tick);
 }

 /**
3 changes: 2 additions & 1 deletion drivers/cpuidle/governors/ladder.c
@@ -63,9 +63,10 @@ static inline void ladder_do_selection(struct ladder_device *ldev,
  * ladder_select_state - selects the next state to enter
  * @drv: cpuidle driver
  * @dev: the CPU
+ * @dummy: not used
  */
 static int ladder_select_state(struct cpuidle_driver *drv,
-				struct cpuidle_device *dev)
+			       struct cpuidle_device *dev, bool *dummy)
 {
 	struct ladder_device *ldev = this_cpu_ptr(&ladder_devices);
 	struct device *device = get_cpu_device(dev->cpu);