Merge branch 'akpm' (patches from Andrew)
Merge second patch-bomb from Andrew Morton:
 "Almost all of the rest of MM.  There was an unusually large amount of
  MM material this time"

* emailed patches from Andrew Morton <[email protected]>: (141 commits)
  zpool: remove no-op module init/exit
  mm: zbud: constify the zbud_ops
  mm: zpool: constify the zpool_ops
  mm: swap: zswap: maybe_preload & refactoring
  zram: unify error reporting
  zsmalloc: remove null check from destroy_handle_cache()
  zsmalloc: do not take class lock in zs_shrinker_count()
  zsmalloc: use class->pages_per_zspage
  zsmalloc: consider ZS_ALMOST_FULL as migrate source
  zsmalloc: partial page ordering within a fullness_list
  zsmalloc: use shrinker to trigger auto-compaction
  zsmalloc: account the number of compacted pages
  zsmalloc/zram: introduce zs_pool_stats api
  zsmalloc: cosmetic compaction code adjustments
  zsmalloc: introduce zs_can_compact() function
  zsmalloc: always keep per-class stats
  zsmalloc: drop unused variable `nr_to_migrate'
  mm/memblock.c: fix comment in __next_mem_range()
  mm/page_alloc.c: fix type information of memoryless node
  memory-hotplug: fix comments in zone_spanned_pages_in_node() and zone_spanned_pages_in_node()
  ...
torvalds committed Sep 9, 2015
2 parents 839fe91 + df69f52 commit f6f7a63
Showing 99 changed files with 2,784 additions and 1,569 deletions.
7 changes: 7 additions & 0 deletions Documentation/DMA-API.txt
@@ -104,6 +104,13 @@ crossing restrictions, pass 0 for alloc; passing 4096 says memory allocated
from this pool must not cross 4KByte boundaries.


	void *dma_pool_zalloc(struct dma_pool *pool, gfp_t mem_flags,
			      dma_addr_t *handle)

Wraps dma_pool_alloc() and also zeroes the returned memory if the
allocation attempt succeeded.


	void *dma_pool_alloc(struct dma_pool *pool, gfp_t gfp_flags,
			     dma_addr_t *dma_handle);

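For illustration, a caller might use the new helper like this (a minimal
sketch; the pool variable and the my_desc descriptor type are made up for
the example):

	struct my_desc *desc;	/* illustrative descriptor type */
	dma_addr_t desc_dma;

	desc = dma_pool_zalloc(pool, GFP_KERNEL, &desc_dma);
	if (!desc)
		return -ENOMEM;
	/* memory is already zero-filled, so no memset() is needed;
	 * desc_dma holds the corresponding DMA address */
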
3 changes: 2 additions & 1 deletion Documentation/blockdev/zram.txt
@@ -144,7 +144,8 @@ mem_used_max RW the maximum amount memory zram have consumed to
store compressed data
mem_limit RW the maximum amount of memory ZRAM can use to store
the compressed data
-num_migrated    RO  the number of objects migrated migrated by compaction
+pages_compacted RO  the number of pages freed during compaction
+                    (available only via zram<id>/mm_stat node)
compact WO trigger memory compaction

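For reference, a minimal userspace sketch of reading pages_compacted from
the mm_stat node (this assumes the mm_stat column layout of this kernel,
where pages_compacted is the seventh and last field):

	#include <stdio.h>

	int main(void)
	{
		/* columns assumed: orig_data_size compr_data_size
		 * mem_used_total mem_limit mem_used_max zero_pages
		 * pages_compacted */
		unsigned long long v[7];
		FILE *f = fopen("/sys/block/zram0/mm_stat", "r");

		if (!f)
			return 1;
		if (fscanf(f, "%llu %llu %llu %llu %llu %llu %llu",
			   &v[0], &v[1], &v[2], &v[3], &v[4], &v[5],
			   &v[6]) == 7)
			printf("pages_compacted: %llu\n", v[6]);
		fclose(f);
		return 0;
	}
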
WARNING
7 changes: 4 additions & 3 deletions Documentation/filesystems/dax.txt
@@ -60,9 +60,10 @@ Filesystem support consists of
- implementing the direct_IO address space operation, and calling
dax_do_io() instead of blockdev_direct_IO() if S_DAX is set
- implementing an mmap file operation for DAX files which sets the
-  VM_MIXEDMAP flag on the VMA, and setting the vm_ops to include handlers
-  for fault and page_mkwrite (which should probably call dax_fault() and
-  dax_mkwrite(), passing the appropriate get_block() callback)
+  VM_MIXEDMAP and VM_HUGEPAGE flags on the VMA, and setting the vm_ops to
+  include handlers for fault, pmd_fault and page_mkwrite (which should
+  probably call dax_fault(), dax_pmd_fault() and dax_mkwrite(), passing the
+  appropriate get_block() callback)
- calling dax_truncate_page() instead of block_truncate_page() for DAX files
- calling dax_zero_page_range() instead of zero_user() for DAX files
- ensuring that there is sufficient locking between reads, writes,
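
A sketch of what the mmap item above might look like in a filesystem (the
example_* names are illustrative, not taken from any particular filesystem):

	static const struct vm_operations_struct example_dax_vm_ops = {
		.fault		= example_dax_fault,	 /* calls dax_fault() */
		.pmd_fault	= example_dax_pmd_fault, /* calls dax_pmd_fault() */
		.page_mkwrite	= example_dax_mkwrite,	 /* calls dax_mkwrite() */
	};

	static int example_file_mmap(struct file *file, struct vm_area_struct *vma)
	{
		file_accessed(file);
		vma->vm_ops = &example_dax_vm_ops;
		vma->vm_flags |= VM_MIXEDMAP | VM_HUGEPAGE;
		return 0;
	}
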
18 changes: 13 additions & 5 deletions Documentation/filesystems/proc.txt
@@ -424,6 +424,7 @@ Private_Dirty: 0 kB
Referenced: 892 kB
Anonymous: 0 kB
Swap: 0 kB
SwapPss: 0 kB
KernelPageSize: 4 kB
MMUPageSize: 4 kB
Locked: 374 kB
@@ -433,16 +434,23 @@ the first of these lines shows the same information as is displayed for the
mapping in /proc/PID/maps. The remaining lines show the size of the mapping
(size), the amount of the mapping that is currently resident in RAM (RSS), the
process' proportional share of this mapping (PSS), the number of clean and
-dirty private pages in the mapping. Note that even a page which is part of a
-MAP_SHARED mapping, but has only a single pte mapped, i.e. is currently used
-by only one process, is accounted as private and not as shared. "Referenced"
-indicates the amount of memory currently marked as referenced or accessed.
+dirty private pages in the mapping.

The "proportional set size" (PSS) of a process is the count of pages it has
in memory, where each page is divided by the number of processes sharing it.
So if a process has 1000 pages all to itself, and 1000 shared with one other
process, its PSS will be 1500.
Note that even a page which is part of a MAP_SHARED mapping, but has only
a single pte mapped, i.e. is currently used by only one process, is accounted
as private and not as shared.
"Referenced" indicates the amount of memory currently marked as referenced or
accessed.
"Anonymous" shows the amount of memory that does not belong to any file. Even
a mapping associated with a file may contain anonymous pages: when MAP_PRIVATE
and a page is modified, the file page is replaced by a private anonymous copy.
"Swap" shows how much would-be-anonymous memory is also used, but out on
swap.

"SwapPss" shows proportional swap share of this mapping.
"VmFlags" field deserves a separate description. This member represents the kernel
flags associated with the particular virtual memory area in two letter encoded
manner. The codes are the following:
4 changes: 2 additions & 2 deletions Documentation/sysctl/vm.txt
@@ -349,7 +349,7 @@ zone[i]'s protection[j] is calculated by following expression.

(i < j):
zone[i]->protection[j]
-    = (total sums of present_pages from zone[i+1] to zone[j] on the node)
+    = (total sums of managed_pages from zone[i+1] to zone[j] on the node)
/ lowmem_reserve_ratio[i];
(i = j):
(should not be protected. = 0;
@@ -360,7 +360,7 @@ The default values of lowmem_reserve_ratio[i] are
256 (if zone[i] means DMA or DMA32 zone)
32 (others).
As above expression, they are reciprocal number of ratio.
-256 means 1/256. # of protection pages becomes about "0.39%" of total present
+256 means 1/256. # of protection pages becomes about "0.39%" of total managed
pages of higher zones on the node.

If you would like to protect more pages, smaller values are effective.
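
In code form, the expression above amounts to the following sketch (the
managed[] array is an illustrative stand-in for each zone's managed_pages
on the node):

	/* protection[j] for zone i, per the formula above */
	unsigned long prot = 0;
	int k;

	for (k = i + 1; k <= j; k++)
		prot += managed[k];
	prot /= lowmem_reserve_ratio[i];	/* e.g. /256 ~= 0.39% */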
3 changes: 2 additions & 1 deletion Documentation/sysrq.txt
@@ -75,7 +75,8 @@ On all - write a character to /proc/sysrq-trigger. e.g.:

'e' - Send a SIGTERM to all processes, except for init.

-'f' - Will call oom_kill to kill a memory hog process.
+'f' - Will call the oom killer to kill a memory hog process, but do not
+      panic if nothing can be killed.

'g' - Used by kgdb (kernel debugger)

15 changes: 11 additions & 4 deletions Documentation/vm/hugetlbpage.txt
@@ -329,7 +329,14 @@ Examples

3) hugepage-mmap: see tools/testing/selftests/vm/hugepage-mmap.c

-4) The libhugetlbfs (http://libhugetlbfs.sourceforge.net) library provides a
-   wide range of userspace tools to help with huge page usability, environment
-   setup, and control. Furthermore it provides useful test cases that should be
-   used when modifying code to ensure no regressions are introduced.
+4) The libhugetlbfs (https://github.com/libhugetlbfs/libhugetlbfs) library
+   provides a wide range of userspace tools to help with huge page usability,
+   environment setup, and control.
+
+Kernel development regression testing
+=====================================
+
+The most complete set of hugetlb tests are in the libhugetlbfs repository.
+If you modify any hugetlb related code, use the libhugetlbfs test suite
+to check for regressions. In addition, if you add any new hugetlb
+functionality, please add appropriate tests to libhugetlbfs.
15 changes: 13 additions & 2 deletions Documentation/vm/pagemap.txt
@@ -16,11 +16,17 @@ There are three components to pagemap:
* Bits 0-4 swap type if swapped
* Bits 5-54 swap offset if swapped
* Bit 55 pte is soft-dirty (see Documentation/vm/soft-dirty.txt)
-* Bits 56-60 zero
-* Bit 61 page is file-page or shared-anon
+* Bit 56 page exclusively mapped (since 4.2)
+* Bits 57-60 zero
+* Bit 61 page is file-page or shared-anon (since 3.5)
* Bit 62 page swapped
* Bit 63 page present

Since Linux 4.0 only users with the CAP_SYS_ADMIN capability can get PFNs.
In 4.0 and 4.1 opens by unprivileged users fail with -EPERM. Starting from
4.2 the PFN field is zeroed if the user does not have CAP_SYS_ADMIN.
Reason: information about PFNs helps in exploiting the Rowhammer vulnerability.

If the page is not present but in swap, then the PFN contains an
encoding of the swap file number and the page's offset into the
swap. Unmapped pages return a null PFN. This allows determining
@@ -159,3 +165,8 @@ Other notes:
Reading from any of the files will return -EINVAL if you are not starting
the read on an 8-byte boundary (e.g., if you sought an odd number of bytes
into the file), or if the size of the read is not a multiple of 8 bytes.

Before Linux 3.11 pagemap bits 55-60 were used for "page-shift" (which is
always 12 on most architectures). Since Linux 3.11 their meaning changes
after the first clear of soft-dirty bits. Since Linux 4.2 they are used for
flags unconditionally.
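
To make the bit layout concrete, here is a sketch that reads the pagemap
entry for one virtual address of the current process (bit positions as
documented above; note that since 4.0/4.2 the PFN field is only useful
with CAP_SYS_ADMIN):

	#include <fcntl.h>
	#include <stdint.h>
	#include <stdio.h>
	#include <unistd.h>

	int main(void)
	{
		uintptr_t addr = (uintptr_t)&addr;	/* any mapped address */
		uint64_t ent;
		long psize = sysconf(_SC_PAGESIZE);
		int fd = open("/proc/self/pagemap", O_RDONLY);

		if (fd < 0)
			return 1;
		if (pread(fd, &ent, sizeof(ent), (addr / psize) * 8) != sizeof(ent))
			return 1;
		close(fd);
		printf("present=%d swapped=%d exclusive=%d pfn=0x%llx\n",
		       (int)(ent >> 63 & 1), (int)(ent >> 62 & 1),
		       (int)(ent >> 56 & 1),
		       (unsigned long long)(ent & ((1ULL << 55) - 1)));
		return 0;
	}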
62 changes: 62 additions & 0 deletions arch/arm64/kernel/setup.c
@@ -339,6 +339,67 @@ static void __init request_standard_resources(void)
	}
}

#ifdef CONFIG_BLK_DEV_INITRD
/*
 * Relocate initrd if it is not completely within the linear mapping.
 * This would be the case if mem= cuts out all or part of it.
 */
static void __init relocate_initrd(void)
{
	phys_addr_t orig_start = __virt_to_phys(initrd_start);
	phys_addr_t orig_end = __virt_to_phys(initrd_end);
	phys_addr_t ram_end = memblock_end_of_DRAM();
	phys_addr_t new_start;
	unsigned long size, to_free = 0;
	void *dest;

	if (orig_end <= ram_end)
		return;

	/*
	 * Any of the original initrd which overlaps the linear map should
	 * be freed after relocating.
	 */
	if (orig_start < ram_end)
		to_free = ram_end - orig_start;

	size = orig_end - orig_start;

	/* initrd needs to be relocated completely inside linear mapping */
	new_start = memblock_find_in_range(0, PFN_PHYS(max_pfn),
					   size, PAGE_SIZE);
	if (!new_start)
		panic("Cannot relocate initrd of size %ld\n", size);
	memblock_reserve(new_start, size);

	initrd_start = __phys_to_virt(new_start);
	initrd_end = initrd_start + size;

	pr_info("Moving initrd from [%llx-%llx] to [%llx-%llx]\n",
		orig_start, orig_start + size - 1,
		new_start, new_start + size - 1);

	dest = (void *)initrd_start;

	if (to_free) {
		memcpy(dest, (void *)__phys_to_virt(orig_start), to_free);
		dest += to_free;
	}

	copy_from_early_mem(dest, orig_start + to_free, size - to_free);

	if (to_free) {
		pr_info("Freeing original RAMDISK from [%llx-%llx]\n",
			orig_start, orig_start + to_free - 1);
		memblock_free(orig_start, to_free);
	}
}
#else
static inline void __init relocate_initrd(void)
{
}
#endif

u64 __cpu_logical_map[NR_CPUS] = { [0 ... NR_CPUS-1] = INVALID_HWID };

void __init setup_arch(char **cmdline_p)
@@ -372,6 +433,7 @@ void __init setup_arch(char **cmdline_p)
	acpi_boot_table_init();

	paging_init();
	relocate_initrd();
	request_standard_resources();

	early_ioremap_reset();
6 changes: 1 addition & 5 deletions arch/ia64/hp/common/sba_iommu.c
@@ -1140,13 +1140,9 @@ sba_alloc_coherent(struct device *dev, size_t size, dma_addr_t *dma_handle,

#ifdef CONFIG_NUMA
	{
-		int node = ioc->node;
		struct page *page;

-		if (node == NUMA_NO_NODE)
-			node = numa_node_id();
-
-		page = alloc_pages_exact_node(node, flags, get_order(size));
+		page = alloc_pages_node(ioc->node, flags, get_order(size));
		if (unlikely(!page))
			return NULL;

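A note on this pattern: this series renames alloc_pages_exact_node() to
__alloc_pages_node() and teaches alloc_pages_node() itself to fall back to
the local node for NUMA_NO_NODE, which is why the open-coded check above
could be dropped. Roughly:

	/* before: every caller resolved NUMA_NO_NODE by hand */
	if (node == NUMA_NO_NODE)
		node = numa_node_id();
	page = alloc_pages_exact_node(node, flags, order);

	/* after: alloc_pages_node() handles NUMA_NO_NODE itself;
	 * __alloc_pages_node() is for callers with a known-valid nid */
	page = alloc_pages_node(node, flags, order);
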
2 changes: 1 addition & 1 deletion arch/ia64/kernel/uncached.c
@@ -97,7 +97,7 @@ static int uncached_add_chunk(struct uncached_pool *uc_pool, int nid)

	/* attempt to allocate a granule's worth of cached memory pages */

-	page = alloc_pages_exact_node(nid,
+	page = __alloc_pages_node(nid,
				GFP_KERNEL | __GFP_ZERO | __GFP_THISNODE,
				IA64_GRANULE_SHIFT-PAGE_SHIFT);
	if (!page) {
2 changes: 1 addition & 1 deletion arch/ia64/sn/pci/pci_dma.c
@@ -92,7 +92,7 @@ static void *sn_dma_alloc_coherent(struct device *dev, size_t size,
	 */
	node = pcibus_to_node(pdev->bus);
	if (likely(node >=0)) {
-		struct page *p = alloc_pages_exact_node(node,
+		struct page *p = __alloc_pages_node(node,
					flags, get_order(size));

		if (likely(p))
2 changes: 1 addition & 1 deletion arch/powerpc/platforms/cell/ras.c
@@ -123,7 +123,7 @@ static int __init cbe_ptcal_enable_on_node(int nid, int order)

	area->nid = nid;
	area->order = order;
-	area->pages = alloc_pages_exact_node(area->nid,
+	area->pages = __alloc_pages_node(area->nid,
					GFP_KERNEL|__GFP_THISNODE,
					area->order);

2 changes: 1 addition & 1 deletion arch/sparc/include/asm/pgtable_32.h
@@ -14,7 +14,7 @@
#include <asm-generic/4level-fixup.h>

#include <linux/spinlock.h>
-#include <linux/swap.h>
+#include <linux/mm_types.h>
#include <asm/types.h>
#include <asm/pgtsrmmu.h>
#include <asm/vaddrs.h>
22 changes: 1 addition & 21 deletions arch/x86/kernel/setup.c
@@ -317,15 +317,12 @@ static u64 __init get_ramdisk_size(void)
	return ramdisk_size;
}

-#define MAX_MAP_CHUNK	(NR_FIX_BTMAPS << PAGE_SHIFT)
static void __init relocate_initrd(void)
{
	/* Assume only end is not page aligned */
	u64 ramdisk_image = get_ramdisk_image();
	u64 ramdisk_size = get_ramdisk_size();
	u64 area_size = PAGE_ALIGN(ramdisk_size);
-	unsigned long slop, clen, mapaddr;
-	char *p, *q;

	/* We need to move the initrd down into directly mapped mem */
	relocated_ramdisk = memblock_find_in_range(0, PFN_PHYS(max_pfn_mapped),
@@ -343,25 +340,8 @@ static void __init relocate_initrd(void)
printk(KERN_INFO "Allocated new RAMDISK: [mem %#010llx-%#010llx]\n",
relocated_ramdisk, relocated_ramdisk + ramdisk_size - 1);

q = (char *)initrd_start;

/* Copy the initrd */
while (ramdisk_size) {
slop = ramdisk_image & ~PAGE_MASK;
clen = ramdisk_size;
if (clen > MAX_MAP_CHUNK-slop)
clen = MAX_MAP_CHUNK-slop;
mapaddr = ramdisk_image & PAGE_MASK;
p = early_memremap(mapaddr, clen+slop);
memcpy(q, p+slop, clen);
early_memunmap(p, clen+slop);
q += clen;
ramdisk_image += clen;
ramdisk_size -= clen;
}
copy_from_early_mem((void *)initrd_start, ramdisk_image, ramdisk_size);

ramdisk_image = get_ramdisk_image();
ramdisk_size = get_ramdisk_size();
printk(KERN_INFO "Move RAMDISK from [mem %#010llx-%#010llx] to"
" [mem %#010llx-%#010llx]\n",
ramdisk_image, ramdisk_image + ramdisk_size - 1,
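
The loop removed above was factored into a generic helper shared with the
arm64 code earlier in this merge; a sketch of what copy_from_early_mem()
does, reconstructed from that loop (the real helper added by this series
lives in mm/early_ioremap.c):

	/* chunked copy out of not-yet-mapped physical memory, bounded
	 * so each early_memremap() fits within the fixmap BTMAP slots */
	void __init copy_from_early_mem(void *dest, phys_addr_t src,
					unsigned long size)
	{
		unsigned long slop, clen;
		char *p;

		while (size) {
			slop = src & ~PAGE_MASK;
			clen = size;
			if (clen > MAX_MAP_CHUNK - slop)
				clen = MAX_MAP_CHUNK - slop;
			p = early_memremap(src & PAGE_MASK, clen + slop);
			memcpy(dest, p + slop, clen);
			early_memunmap(p, clen + slop);
			dest += clen;
			src += clen;
			size -= clen;
		}
	}
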
2 changes: 1 addition & 1 deletion arch/x86/kvm/vmx.c
@@ -3150,7 +3150,7 @@ static struct vmcs *alloc_vmcs_cpu(int cpu)
	struct page *pages;
	struct vmcs *vmcs;

-	pages = alloc_pages_exact_node(node, GFP_KERNEL, vmcs_config.order);
+	pages = __alloc_pages_node(node, GFP_KERNEL, vmcs_config.order);
	if (!pages)
		return NULL;
	vmcs = page_address(pages);
6 changes: 4 additions & 2 deletions arch/x86/mm/numa.c
@@ -246,8 +246,10 @@ int __init numa_cleanup_meminfo(struct numa_meminfo *mi)
		bi->start = max(bi->start, low);
		bi->end = min(bi->end, high);

-		/* and there's no empty block */
-		if (bi->start >= bi->end)
+		/* and there's no empty or non-exist block */
+		if (bi->start >= bi->end ||
+		    !memblock_overlaps_region(&memblock.memory,
+					      bi->start, bi->end - bi->start))
			numa_remove_memblk_from(i--, mi);
}
