Skip to content

Commit

Permalink
Merge tag 'mm-stable-2023-06-24-19-15' of git://git.kernel.org/pub/sc…
Browse files Browse the repository at this point in the history
…m/linux/kernel/git/akpm/mm

Pull mm updates from Andrew Morton:

 - Yosry Ahmed brought back some cgroup v1 stats in OOM logs

 - Yosry has also eliminated cgroup's atomic rstat flushing

 - Nhat Pham adds the new cachestat() syscall. It provides userspace
   with the ability to query pagecache status - a similar concept to
   mincore() but more powerful and with improved usability

 - Mel Gorman provides more optimizations for compaction, reducing the
   prevalence of page rescanning

 - Lorenzo Stoakes has done some maintanance work on the
   get_user_pages() interface

 - Liam Howlett continues with cleanups and maintenance work to the
   maple tree code. Peng Zhang also does some work on maple tree

 - Johannes Weiner has done some cleanup work on the compaction code

 - David Hildenbrand has contributed additional selftests for
   get_user_pages()

 - Thomas Gleixner has contributed some maintenance and optimization
   work for the vmalloc code

 - Baolin Wang has provided some compaction cleanups,

 - SeongJae Park continues maintenance work on the DAMON code

 - Huang Ying has done some maintenance on the swap code's usage of
   device refcounting

 - Christoph Hellwig has some cleanups for the filemap/directio code

 - Ryan Roberts provides two patch series which yield some
   rationalization of the kernel's access to pte entries - use the
   provided APIs rather than open-coding accesses

 - Lorenzo Stoakes has some fixes to the interaction between pagecache
   and directio access to file mappings

 - John Hubbard has a series of fixes to the MM selftesting code

 - ZhangPeng continues the folio conversion campaign

 - Hugh Dickins has been working on the pagetable handling code, mainly
   with a view to reducing the load on the mmap_lock

 - Catalin Marinas has reduced the arm64 kmalloc() minimum alignment
   from 128 to 8

 - Domenico Cerasuolo has improved the zswap reclaim mechanism by
   reorganizing the LRU management

 - Matthew Wilcox provides some fixups to make gfs2 work better with the
   buffer_head code

 - Vishal Moola also has done some folio conversion work

 - Matthew Wilcox has removed the remnants of the pagevec code - their
   functionality is migrated over to struct folio_batch

* tag 'mm-stable-2023-06-24-19-15' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (380 commits)
  mm/hugetlb: remove hugetlb_set_page_subpool()
  mm: nommu: correct the range of mmap_sem_read_lock in task_mem()
  hugetlb: revert use of page_cache_next_miss()
  Revert "page cache: fix page_cache_next/prev_miss off by one"
  mm/vmscan: fix root proactive reclaim unthrottling unbalanced node
  mm: memcg: rename and document global_reclaim()
  mm: kill [add|del]_page_to_lru_list()
  mm: compaction: convert to use a folio in isolate_migratepages_block()
  mm: zswap: fix double invalidate with exclusive loads
  mm: remove unnecessary pagevec includes
  mm: remove references to pagevec
  mm: rename invalidate_mapping_pagevec to mapping_try_invalidate
  mm: remove struct pagevec
  net: convert sunrpc from pagevec to folio_batch
  i915: convert i915_gpu_error to use a folio_batch
  pagevec: rename fbatch_count()
  mm: remove check_move_unevictable_pages()
  drm: convert drm_gem_put_pages() to use a folio_batch
  i915: convert shmem_sg_free_table() to use a folio_batch
  scatterlist: add sg_set_folio()
  ...
  • Loading branch information
torvalds committed Jun 28, 2023
2 parents 6aeadf7 + acc72d5 commit 6e17c6d
Show file tree
Hide file tree
Showing 334 changed files with 8,341 additions and 6,905 deletions.
2 changes: 1 addition & 1 deletion Documentation/admin-guide/cgroup-v1/memory.rst
Original file line number Diff line number Diff line change
Expand Up @@ -297,7 +297,7 @@ Lock order is as follows::

Page lock (PG_locked bit of page->flags)
mm->page_table_lock or split pte_lock
lock_page_memcg (memcg->move_lock)
folio_memcg_lock (memcg->move_lock)
mapping->i_pages lock
lruvec->lru_lock.

Expand Down
7 changes: 7 additions & 0 deletions Documentation/admin-guide/cgroup-v2.rst
Original file line number Diff line number Diff line change
Expand Up @@ -1580,6 +1580,13 @@ PAGE_SIZE multiple when read back.

Healthy workloads are not expected to reach this limit.

memory.swap.peak
A read-only single value file which exists on non-root
cgroups.

The max swap usage recorded for the cgroup and its
descendants since the creation of the cgroup.

memory.swap.max
A read-write single value file which exists on non-root
cgroups. The default is "max".
Expand Down
10 changes: 5 additions & 5 deletions Documentation/admin-guide/mm/damon/start.rst
Original file line number Diff line number Diff line change
Expand Up @@ -119,9 +119,9 @@ set size has chronologically changed.::
Data Access Pattern Aware Memory Management
===========================================

Below three commands make every memory region of size >=4K that doesn't
accessed for >=60 seconds in your workload to be swapped out. ::
Below command makes every memory region of size >=4K that has not accessed for
>=60 seconds in your workload to be swapped out. ::

$ echo "#min-size max-size min-acc max-acc min-age max-age action" > test_scheme
$ echo "4K max 0 0 60s max pageout" >> test_scheme
$ damo schemes -c test_scheme <pid of your workload>
$ sudo damo schemes --damos_access_rate 0 0 --damos_sz_region 4K max \
--damos_age 60s max --damos_action pageout \
<pid of your workload>
146 changes: 62 additions & 84 deletions Documentation/admin-guide/mm/damon/usage.rst
Original file line number Diff line number Diff line change
Expand Up @@ -10,21 +10,16 @@ DAMON provides below interfaces for different users.
`This <https://github.com/awslabs/damo>`_ is for privileged people such as
system administrators who want a just-working human-friendly interface.
Using this, users can use the DAMON’s major features in a human-friendly way.
It may not be highly tuned for special cases, though. It supports both
virtual and physical address spaces monitoring. For more detail, please
refer to its `usage document
It may not be highly tuned for special cases, though. For more detail,
please refer to its `usage document
<https://github.com/awslabs/damo/blob/next/USAGE.md>`_.
- *sysfs interface.*
:ref:`This <sysfs_interface>` is for privileged user space programmers who
want more optimized use of DAMON. Using this, users can use DAMON’s major
features by reading from and writing to special sysfs files. Therefore,
you can write and use your personalized DAMON sysfs wrapper programs that
reads/writes the sysfs files instead of you. The `DAMON user space tool
<https://github.com/awslabs/damo>`_ is one example of such programs. It
supports both virtual and physical address spaces monitoring. Note that this
interface provides only simple :ref:`statistics <damos_stats>` for the
monitoring results. For detailed monitoring results, DAMON provides a
:ref:`tracepoint <tracepoint>`.
<https://github.com/awslabs/damo>`_ is one example of such programs.
- *debugfs interface. (DEPRECATED!)*
:ref:`This <debugfs_interface>` is almost identical to :ref:`sysfs interface
<sysfs_interface>`. This is deprecated, so users should move to the
Expand Down Expand Up @@ -139,7 +134,7 @@ scheme of the kdamond. Writing ``clear_schemes_tried_regions`` to ``state``
file clears the DAMON-based operating scheme action tried regions directory for
each DAMON-based operation scheme of the kdamond. For details of the
DAMON-based operation scheme action tried regions directory, please refer to
:ref:tried_regions section <sysfs_schemes_tried_regions>`.
:ref:`tried_regions section <sysfs_schemes_tried_regions>`.

If the state is ``on``, reading ``pid`` shows the pid of the kdamond thread.

Expand Down Expand Up @@ -259,12 +254,9 @@ be equal or smaller than ``start`` of directory ``N+1``.
contexts/<N>/schemes/
---------------------

For usual DAMON-based data access aware memory management optimizations, users
would normally want the system to apply a memory management action to a memory
region of a specific access pattern. DAMON receives such formalized operation
schemes from the user and applies those to the target memory regions. Users
can get and set the schemes by reading from and writing to files under this
directory.
The directory for DAMON-based Operation Schemes (:ref:`DAMOS
<damon_design_damos>`). Users can get and set the schemes by reading from and
writing to files under this directory.

In the beginning, this directory has only one file, ``nr_schemes``. Writing a
number (``N``) to the file creates the number of child directories named ``0``
Expand All @@ -277,12 +269,12 @@ In each scheme directory, five directories (``access_pattern``, ``quotas``,
``watermarks``, ``filters``, ``stats``, and ``tried_regions``) and one file
(``action``) exist.

The ``action`` file is for setting and getting what action you want to apply to
memory regions having specific access pattern of the interest. The keywords
that can be written to and read from the file and their meaning are as below.
The ``action`` file is for setting and getting the scheme's :ref:`action
<damon_design_damos_action>`. The keywords that can be written to and read
from the file and their meaning are as below.

Note that support of each action depends on the running DAMON operations set
`implementation <sysfs_contexts>`.
:ref:`implementation <sysfs_contexts>`.

- ``willneed``: Call ``madvise()`` for the region with ``MADV_WILLNEED``.
Supported by ``vaddr`` and ``fvaddr`` operations set.
Expand All @@ -304,56 +296,48 @@ Note that support of each action depends on the running DAMON operations set
schemes/<N>/access_pattern/
---------------------------

The target access pattern of each DAMON-based operation scheme is constructed
with three ranges including the size of the region in bytes, number of
monitored accesses per aggregate interval, and number of aggregated intervals
for the age of the region.
The directory for the target access :ref:`pattern
<damon_design_damos_access_pattern>` of the given DAMON-based operation scheme.

Under the ``access_pattern`` directory, three directories (``sz``,
``nr_accesses``, and ``age``) each having two files (``min`` and ``max``)
exist. You can set and get the access pattern for the given scheme by writing
to and reading from the ``min`` and ``max`` files under ``sz``,
``nr_accesses``, and ``age`` directories, respectively.
``nr_accesses``, and ``age`` directories, respectively. Note that the ``min``
and the ``max`` form a closed interval.

schemes/<N>/quotas/
-------------------

Optimal ``target access pattern`` for each ``action`` is workload dependent, so
not easy to find. Worse yet, setting a scheme of some action too aggressive
can cause severe overhead. To avoid such overhead, users can limit time and
size quota for each scheme. In detail, users can ask DAMON to try to use only
up to specific time (``time quota``) for applying the action, and to apply the
action to only up to specific amount (``size quota``) of memory regions having
the target access pattern within a given time interval (``reset interval``).

When the quota limit is expected to be exceeded, DAMON prioritizes found memory
regions of the ``target access pattern`` based on their size, access frequency,
and age. For personalized prioritization, users can set the weights for the
three properties.
The directory for the :ref:`quotas <damon_design_damos_quotas>` of the given
DAMON-based operation scheme.

Under ``quotas`` directory, three files (``ms``, ``bytes``,
``reset_interval_ms``) and one directory (``weights``) having three files
(``sz_permil``, ``nr_accesses_permil``, and ``age_permil``) in it exist.

You can set the ``time quota`` in milliseconds, ``size quota`` in bytes, and
``reset interval`` in milliseconds by writing the values to the three files,
respectively. You can also set the prioritization weights for size, access
frequency, and age in per-thousand unit by writing the values to the three
files under the ``weights`` directory.
respectively. Then, DAMON tries to use only up to ``time quota`` milliseconds
for applying the ``action`` to memory regions of the ``access_pattern``, and to
apply the action to only up to ``bytes`` bytes of memory regions within the
``reset_interval_ms``. Setting both ``ms`` and ``bytes`` zero disables the
quota limits.

You can also set the :ref:`prioritization weights
<damon_design_damos_quotas_prioritization>` for size, access frequency, and age
in per-thousand unit by writing the values to the three files under the
``weights`` directory.

schemes/<N>/watermarks/
-----------------------

To allow easy activation and deactivation of each scheme based on system
status, DAMON provides a feature called watermarks. The feature receives five
values called ``metric``, ``interval``, ``high``, ``mid``, and ``low``. The
``metric`` is the system metric such as free memory ratio that can be measured.
If the metric value of the system is higher than the value in ``high`` or lower
than ``low`` at the memoent, the scheme is deactivated. If the value is lower
than ``mid``, the scheme is activated.
The directory for the :ref:`watermarks <damon_design_damos_watermarks>` of the
given DAMON-based operation scheme.

Under the watermarks directory, five files (``metric``, ``interval_us``,
``high``, ``mid``, and ``low``) for setting each value exist. You can set and
``high``, ``mid``, and ``low``) for setting the metric, the time interval
between check of the metric, and the three watermarks exist. You can set and
get the five values by writing to the files, respectively.

Keywords and meanings of those that can be written to the ``metric`` file are
Expand All @@ -367,12 +351,8 @@ The ``interval`` should written in microseconds unit.
schemes/<N>/filters/
--------------------

Users could know something more than the kernel for specific types of memory.
In the case, users could do their own management for the memory and hence
doesn't want DAMOS bothers that. Users could limit DAMOS by setting the access
pattern of the scheme and/or the monitoring regions for the purpose, but that
can be inefficient in some cases. In such cases, users could set non-access
pattern driven filters using files in this directory.
The directory for the :ref:`filters <damon_design_damos_filters>` of the given
DAMON-based operation scheme.

In the beginning, this directory has only one file, ``nr_filters``. Writing a
number (``N``) to the file creates the number of child directories named ``0``
Expand Down Expand Up @@ -432,13 +412,17 @@ starting from ``0`` under this directory. Each directory contains files
exposing detailed information about each of the memory region that the
corresponding scheme's ``action`` has tried to be applied under this directory,
during next :ref:`aggregation interval <sysfs_monitoring_attrs>`. The
information includes address range, ``nr_accesses``, , and ``age`` of the
region.
information includes address range, ``nr_accesses``, and ``age`` of the region.

The directories will be removed when another special keyword,
``clear_schemes_tried_regions``, is written to the relevant
``kdamonds/<N>/state`` file.

The expected usage of this directory is investigations of schemes' behaviors,
and query-like efficient data access monitoring results retrievals. For the
latter use case, in particular, users can set the ``action`` as ``stat`` and
set the ``access pattern`` as their interested pattern that they want to query.

tried_regions/<N>/
------------------

Expand Down Expand Up @@ -600,15 +584,10 @@ update.
Schemes
-------

For usual DAMON-based data access aware memory management optimizations, users
would simply want the system to apply a memory management action to a memory
region of a specific access pattern. DAMON receives such formalized operation
schemes from the user and applies those to the target processes.

Users can get and set the schemes by reading from and writing to ``schemes``
debugfs file. Reading the file also shows the statistics of each scheme. To
the file, each of the schemes should be represented in each line in below
form::
Users can get and set the DAMON-based operation :ref:`schemes
<damon_design_damos>` by reading from and writing to ``schemes`` debugfs file.
Reading the file also shows the statistics of each scheme. To the file, each
of the schemes should be represented in each line in below form::

<target access pattern> <action> <quota> <watermarks>

Expand All @@ -617,8 +596,9 @@ You can disable schemes by simply writing an empty string to the file.
Target Access Pattern
~~~~~~~~~~~~~~~~~~~~~

The ``<target access pattern>`` is constructed with three ranges in below
form::
The target access :ref:`pattern <damon_design_damos_access_pattern>` of the
scheme. The ``<target access pattern>`` is constructed with three ranges in
below form::

min-size max-size min-acc max-acc min-age max-age

Expand All @@ -631,9 +611,9 @@ closed interval.
Action
~~~~~~

The ``<action>`` is a predefined integer for memory management actions, which
DAMON will apply to the regions having the target access pattern. The
supported numbers and their meanings are as below.
The ``<action>`` is a predefined integer for memory management :ref:`actions
<damon_design_damos_action>`. The supported numbers and their meanings are as
below.

- 0: Call ``madvise()`` for the region with ``MADV_WILLNEED``. Ignored if
``target`` is ``paddr``.
Expand All @@ -649,10 +629,8 @@ supported numbers and their meanings are as below.
Quota
~~~~~

Optimal ``target access pattern`` for each ``action`` is workload dependent, so
not easy to find. Worse yet, setting a scheme of some action too aggressive
can cause severe overhead. To avoid such overhead, users can limit time and
size quota for the scheme via the ``<quota>`` in below form::
Users can set the :ref:`quotas <damon_design_damos_quotas>` of the given scheme
via the ``<quota>`` in below form::

<ms> <sz> <reset interval> <priority weights>

Expand All @@ -662,19 +640,17 @@ the action to memory regions of the ``target access pattern`` within the
``<sz>`` bytes of memory regions within the ``<reset interval>``. Setting both
``<ms>`` and ``<sz>`` zero disables the quota limits.

When the quota limit is expected to be exceeded, DAMON prioritizes found memory
regions of the ``target access pattern`` based on their size, access frequency,
and age. For personalized prioritization, users can set the weights for the
three properties in ``<priority weights>`` in below form::
For the :ref:`prioritization <damon_design_damos_quotas_prioritization>`, users
can set the weights for the three properties in ``<priority weights>`` in below
form::

<size weight> <access frequency weight> <age weight>

Watermarks
~~~~~~~~~~

Some schemes would need to run based on current value of the system's specific
metrics like free memory ratio. For such cases, users can specify watermarks
for the condition.::
Users can specify :ref:`watermarks <damon_design_damos_watermarks>` of the
given scheme via ``<watermarks>`` in below form::

<metric> <check interval> <high mark> <middle mark> <low mark>

Expand Down Expand Up @@ -797,10 +773,12 @@ root directory only.
Tracepoint for Monitoring Results
=================================

DAMON provides the monitoring results via a tracepoint,
``damon:damon_aggregated``. While the monitoring is turned on, you could
record the tracepoint events and show results using tracepoint supporting tools
like ``perf``. For example::
Users can get the monitoring results via the :ref:`tried_regions
<sysfs_schemes_tried_regions>` or a tracepoint, ``damon:damon_aggregated``.
While the tried regions directory is useful for getting a snapshot, the
tracepoint is useful for getting a full record of the results. While the
monitoring is turned on, you could record the tracepoint events and show
results using tracepoint supporting tools like ``perf``. For example::

# echo on > monitor_on
# perf record -e damon:damon_aggregated &
Expand Down
9 changes: 6 additions & 3 deletions Documentation/dev-tools/kasan.rst
Original file line number Diff line number Diff line change
Expand Up @@ -107,9 +107,12 @@ effectively disables ``panic_on_warn`` for KASAN reports.
Alternatively, independent of ``panic_on_warn``, the ``kasan.fault=`` boot
parameter can be used to control panic and reporting behaviour:

- ``kasan.fault=report`` or ``=panic`` controls whether to only print a KASAN
report or also panic the kernel (default: ``report``). The panic happens even
if ``kasan_multi_shot`` is enabled.
- ``kasan.fault=report``, ``=panic``, or ``=panic_on_write`` controls whether
to only print a KASAN report, panic the kernel, or panic the kernel on
invalid writes only (default: ``report``). The panic happens even if
``kasan_multi_shot`` is enabled. Note that when using asynchronous mode of
Hardware Tag-Based KASAN, ``kasan.fault=panic_on_write`` always panics on
asynchronously checked accesses (including reads).

Software and Hardware Tag-Based KASAN modes (see the section about various
modes below) support altering stack trace collection behavior:
Expand Down
1 change: 1 addition & 0 deletions Documentation/dev-tools/kselftest.rst
Original file line number Diff line number Diff line change
Expand Up @@ -36,6 +36,7 @@ Running the selftests (hotplug tests are run in limited mode)

To build the tests::

$ make headers
$ make -C tools/testing/selftests

To run the tests::
Expand Down
Loading

0 comments on commit 6e17c6d

Please sign in to comment.