linux/tools/testing/selftests/mm/ksm_functional_tests.c
Linus Torvalds 61307b7be4
Merge tag 'mm-stable-2024-05-17-19-19' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull mm updates from Andrew Morton:
 "The usual shower of singleton fixes and minor series all over MM,
  documented (hopefully adequately) in the respective changelogs.
  Notable series include:

   - Lucas Stach has provided some page-mapping cleanup/consolidation/
     maintainability work in the series "mm/treewide: Remove pXd_huge()
     API".

   - In the series "Allow migrate on protnone reference with
     MPOL_PREFERRED_MANY policy", Donet Tom has optimized mempolicy's
     MPOL_PREFERRED_MANY mode, yielding almost doubled performance in
     one test.

   - In their series "Memory allocation profiling" Kent Overstreet and
     Suren Baghdasaryan have contributed a means of determining (via
     /proc/allocinfo) whereabouts in the kernel memory is being
     allocated: number of calls and amount of memory.

   - Matthew Wilcox has provided the series "Various significant MM
     patches" which does a number of rather unrelated things, but in
     largely similar code sites.

   - In his series "mm: page_alloc: freelist migratetype hygiene"
     Johannes Weiner has fixed the page allocator's handling of
     migratetype requests, with resulting improvements in compaction
     efficiency.

   - In the series "make the hugetlb migration strategy consistent"
     Baolin Wang has fixed a hugetlb migration issue, which should
     improve hugetlb allocation reliability.

   - Liu Shixin has hit an I/O meltdown caused by readahead in a
     memory-tight memcg. Addressed in the series "Fix I/O high when
     memory almost met memcg limit".

   - In the series "mm/filemap: optimize folio adding and splitting"
     Kairui Song has optimized pagecache insertion, yielding ~10%
     performance improvement in one test.

   - Baoquan He has cleaned up and consolidated the early zone
     initialization code in the series "mm/mm_init.c: refactor
     free_area_init_core()".

   - Baoquan has also redone some MM initialization code in the series
     "mm/init: minor clean up and improvement".

   - MM helper cleanups from Christoph Hellwig in his series "remove
     follow_pfn".

   - More cleanups from Matthew Wilcox in the series "Various
     page->flags cleanups".

   - Vlastimil Babka has contributed maintainability improvements in the
     series "memcg_kmem hooks refactoring".

   - More folio conversions and cleanups in Matthew Wilcox's series:
	"Convert huge_zero_page to huge_zero_folio"
	"khugepaged folio conversions"
	"Remove page_idle and page_young wrappers"
	"Use folio APIs in procfs"
	"Clean up __folio_put()"
	"Some cleanups for memory-failure"
	"Remove page_mapping()"
	"More folio compat code removal"

   - David Hildenbrand chipped in with "fs/proc/task_mmu: convert
     hugetlb functions to work on folios".

   - Code consolidation and cleanup work related to GUP's handling of
     hugetlbs in Peter Xu's series "mm/gup: Unify hugetlb, part 2".

   - Rick Edgecombe has developed some fixes to stack guard gaps in the
     series "Cover a guard gap corner case".

   - Jinjiang Tu has fixed KSM's behaviour after a fork+exec in the
     series "mm/ksm: fix ksm exec support for prctl".

   - Baolin Wang has implemented NUMA balancing for multi-size THPs.
     This is a simple first-cut implementation for now. The series is
     "support multi-size THP numa balancing".

   - Cleanups to vma handling helper functions from Matthew Wilcox in
     the series "Unify vma_address and vma_pgoff_address".

   - Some selftests maintenance work from Dev Jain in the series
     "selftests/mm: mremap_test: Optimizations and style fixes".

   - Improvements to the swapping of multi-size THPs from Ryan Roberts
     in the series "Swap-out mTHP without splitting".

   - Kefeng Wang has significantly optimized the handling of arm64's
     permission page faults in the series
	"arch/mm/fault: accelerate pagefault when badaccess"
	"mm: remove arch's private VM_FAULT_BADMAP/BADACCESS"

   - GUP cleanups from David Hildenbrand in "mm/gup: consistently call
     it GUP-fast".

   - hugetlb fault code cleanups from Vishal Moola in "Hugetlb fault
     path to use struct vm_fault".

   - selftests build fixes from John Hubbard in the series "Fix
     selftests/mm build without requiring "make headers"".

   - Memory tiering fixes/improvements from Ho-Ren (Jack) Chuang in the
     series "Improved Memory Tier Creation for CPUless NUMA Nodes".
     Fixes the initialization code so that migration between different
     memory types works as intended.

   - David Hildenbrand has improved follow_pte() and fixed an errant
     driver in the series "mm: follow_pte() improvements and acrn
     follow_pte() fixes".

   - David also did some cleanup work on large folio mapcounts in his
     series "mm: mapcount for large folios + page_mapcount() cleanups".

   - Folio conversions in KSM in Alex Shi's series "transfer page to
     folio in KSM".

   - Barry Song has added some sysfs stats for monitoring multi-size
     THP's in the series "mm: add per-order mTHP alloc and swpout
     counters".

   - Some zswap cleanups from Yosry Ahmed in the series "zswap
     same-filled and limit checking cleanups".

   - Matthew Wilcox has been looking at buffer_head code and found the
     documentation to be lacking. The series is "Improve buffer head
     documentation".

   - Multi-size THPs get more work, this time from Lance Yang. His
     series "mm/madvise: enhance lazyfreeing with mTHP in madvise_free"
     optimizes the freeing of these things.

   - Kemeng Shi has added more userspace-visible writeback
     instrumentation in the series "Improve visibility of writeback".

   - Kemeng Shi then sent some maintenance work on top in the series
     "Fix and cleanups to page-writeback".

   - Matthew Wilcox reduces mmap_lock traffic in the anon vma code in
     the series "Improve anon_vma scalability for anon VMAs". Intel's
     test bot reported an improbable 3x improvement in one test.

   - SeongJae Park adds some DAMON feature work in the series
	"mm/damon: add a DAMOS filter type for page granularity access recheck"
	"selftests/damon: add DAMOS quota goal test"

   - Also some maintenance work in the series
	"mm/damon/paddr: simplify page level access re-check for pageout"
	"mm/damon: misc fixes and improvements"

   - David Hildenbrand has disabled some known-to-fail selftests in the
     series "selftests: mm: cow: flag vmsplice() hugetlb tests as
     XFAIL".

   - memcg metadata storage optimizations from Shakeel Butt in "memcg:
     reduce memory consumption by memcg stats".

   - DAX fixes and maintenance work from Vishal Verma in the series
     "dax/bus.c: Fixups for dax-bus locking""

* tag 'mm-stable-2024-05-17-19-19' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (426 commits)
  memcg, oom: cleanup unused memcg_oom_gfp_mask and memcg_oom_order
  selftests/mm: hugetlb_madv_vs_map: avoid test skipping by querying hugepage size at runtime
  mm/hugetlb: add missing VM_FAULT_SET_HINDEX in hugetlb_wp
  mm/hugetlb: add missing VM_FAULT_SET_HINDEX in hugetlb_fault
  selftests: cgroup: add tests to verify the zswap writeback path
  mm: memcg: make alloc_mem_cgroup_per_node_info() return bool
  mm/damon/core: fix return value from damos_wmark_metric_value
  mm: do not update memcg stats for NR_{FILE/SHMEM}_PMDMAPPED
  selftests: cgroup: remove redundant enabling of memory controller
  Docs/mm/damon/maintainer-profile: allow posting patches based on damon/next tree
  Docs/mm/damon/maintainer-profile: change the maintainer's timezone from PST to PT
  Docs/mm/damon/design: use a list for supported filters
  Docs/admin-guide/mm/damon/usage: fix wrong schemes effective quota update command
  Docs/admin-guide/mm/damon/usage: fix wrong example of DAMOS filter matching sysfs file
  selftests/damon: classify tests for functionalities and regressions
  selftests/damon/_damon_sysfs: use 'is' instead of '==' for 'None'
  selftests/damon/_damon_sysfs: find sysfs mount point from /proc/mounts
  selftests/damon/_damon_sysfs: check errors from nr_schemes file reads
  mm/damon/core: initialize ->esz_bp from damos_quota_init_priv()
  selftests/damon: add a test for DAMOS quota goal
  ...
2024-05-19 09:21:03 -07:00

713 lines
17 KiB
C

// SPDX-License-Identifier: GPL-2.0-only
/*
 * KSM functional tests
 *
 * Copyright 2022, Red Hat, Inc.
 *
 * Author(s): David Hildenbrand <david@redhat.com>
 */
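/*
 * Build/run sketch (the usual kselftest paths, shown for illustration):
 *
 *	$ cd tools/testing/selftests/mm
 *	$ make ksm_functional_tests
 *	# ./ksm_functional_tests
 *
 * Writing /sys/kernel/mm/ksm/run and reading PFNs from /proc/self/pagemap
 * generally require root; if the KSM sysfs files cannot be opened, the
 * whole run is skipped.
 */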
#define _GNU_SOURCE
#include <stdlib.h>
#include <string.h>
#include <stdbool.h>
#include <stdint.h>
#include <unistd.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <sys/ioctl.h>
#include <sys/wait.h>
#include <linux/userfaultfd.h>

#include "../kselftest.h"
#include "vm_util.h"

#define KiB 1024u
#define MiB (1024 * KiB)
#define FORK_EXEC_CHILD_PRG_NAME "ksm_fork_exec_child"

#define MAP_MERGE_FAIL ((void *)-1)
#define MAP_MERGE_SKIP ((void *)-2)

enum ksm_merge_mode {
	KSM_MERGE_PRCTL,
	KSM_MERGE_MADVISE,
	KSM_MERGE_NONE, /* PRCTL already set */
};

static int mem_fd;
static int ksm_fd;
static int ksm_full_scans_fd;
static int proc_self_ksm_stat_fd;
static int proc_self_ksm_merging_pages_fd;
static int ksm_use_zero_pages_fd;
static int pagemap_fd;
static size_t pagesize;

static bool range_maps_duplicates(char *addr, unsigned long size)
{
	unsigned long offs_a, offs_b, pfn_a, pfn_b;

	/*
	 * There is no easy way to check if there are KSM pages mapped into
	 * this range. We only check that the range does not map the same PFN
	 * twice by comparing each pair of mapped pages.
	 */
	for (offs_a = 0; offs_a < size; offs_a += pagesize) {
		pfn_a = pagemap_get_pfn(pagemap_fd, addr + offs_a);
		/* Page not present or PFN not exposed by the kernel. */
		if (pfn_a == -1ul || !pfn_a)
			continue;

		for (offs_b = offs_a + pagesize; offs_b < size;
		     offs_b += pagesize) {
			pfn_b = pagemap_get_pfn(pagemap_fd, addr + offs_b);
			if (pfn_b == -1ul || !pfn_b)
				continue;
			if (pfn_a == pfn_b)
				return true;
		}
	}
	return false;
}
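
/*
 * Not used by the tests: a minimal sketch of what a pagemap_get_pfn()-style
 * helper (the real one lives in vm_util.c) has to do. Each
 * /proc/<pid>/pagemap entry is 64 bit: bit 63 flags the page present,
 * bits 0-54 hold the PFN, and unprivileged readers see the PFN field as 0.
 */
static __attribute__((unused))
unsigned long sketch_pagemap_pfn(int fd, char *addr)
{
	uint64_t entry;
	off_t offset = ((uintptr_t)addr / pagesize) * sizeof(entry);

	if (pread(fd, &entry, sizeof(entry), offset) != sizeof(entry))
		return -1ul;
	if (!(entry & (1ull << 63)))
		return -1ul;	/* Page not present. */
	return entry & ((1ull << 55) - 1);	/* PFN: bits 0-54. */
}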

static long get_my_ksm_zero_pages(void)
{
	char buf[200];
	char *substr_ksm_zero;
	size_t value_pos;
	ssize_t read_size;
	unsigned long my_ksm_zero_pages;

	if (!proc_self_ksm_stat_fd)
		return 0;

	read_size = pread(proc_self_ksm_stat_fd, buf, sizeof(buf) - 1, 0);
	if (read_size < 0)
		return -errno;

	buf[read_size] = 0;

	substr_ksm_zero = strstr(buf, "ksm_zero_pages");
	if (!substr_ksm_zero)
		return 0;

	value_pos = strcspn(substr_ksm_zero, "0123456789");
	my_ksm_zero_pages = strtol(substr_ksm_zero + value_pos, NULL, 10);

	return my_ksm_zero_pages;
}
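
/*
 * For reference: /proc/<pid>/ksm_stat is a small "key value" text file.
 * The exact set of counters depends on the kernel version; an illustrative
 * (made-up) example:
 *
 *	ksm_rmap_items 512
 *	ksm_zero_pages 0
 *	ksm_merging_pages 256
 *	ksm_process_profit 1044480
 *
 * get_my_ksm_zero_pages() above only depends on the "ksm_zero_pages" line.
 */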

static long get_my_merging_pages(void)
{
	char buf[10];
	ssize_t ret;

	if (proc_self_ksm_merging_pages_fd < 0)
		return proc_self_ksm_merging_pages_fd;

	ret = pread(proc_self_ksm_merging_pages_fd, buf, sizeof(buf) - 1, 0);
	if (ret <= 0)
		return -errno;
	buf[ret] = 0;

	return strtol(buf, NULL, 10);
}

static long ksm_get_full_scans(void)
{
	char buf[10];
	ssize_t ret;

	ret = pread(ksm_full_scans_fd, buf, sizeof(buf) - 1, 0);
	if (ret <= 0)
		return -errno;
	buf[ret] = 0;

	return strtol(buf, NULL, 10);
}

static int ksm_merge(void)
{
	long start_scans, end_scans;

	/* Wait for two full scans such that any possible merging happened. */
	start_scans = ksm_get_full_scans();
	if (start_scans < 0)
		return start_scans;
	if (write(ksm_fd, "1", 1) != 1)
		return -errno;
	do {
		end_scans = ksm_get_full_scans();
		if (end_scans < 0)
			return end_scans;
	} while (end_scans < start_scans + 2);

	return 0;
}

static int ksm_unmerge(void)
{
	if (write(ksm_fd, "2", 1) != 1)
		return -errno;
	return 0;
}

static char *__mmap_and_merge_range(char val, unsigned long size, int prot,
				    enum ksm_merge_mode mode)
{
	char *map;
	char *err_map = MAP_MERGE_FAIL;
	int ret;

	/* Stabilize accounting by disabling KSM completely. */
	if (ksm_unmerge()) {
		ksft_print_msg("Disabling (unmerging) KSM failed\n");
		return err_map;
	}

	if (get_my_merging_pages() > 0) {
		ksft_print_msg("Still pages merged\n");
		return err_map;
	}

	map = mmap(NULL, size, PROT_READ|PROT_WRITE,
		   MAP_PRIVATE|MAP_ANON, -1, 0);
	if (map == MAP_FAILED) {
		ksft_print_msg("mmap() failed\n");
		return err_map;
	}

	/* Don't use THP. Ignore if THP are not around on a kernel. */
	if (madvise(map, size, MADV_NOHUGEPAGE) && errno != EINVAL) {
		ksft_print_msg("MADV_NOHUGEPAGE failed\n");
		goto unmap;
	}

	/* Make sure each page contains the same values to merge them. */
	memset(map, val, size);

	if (mprotect(map, size, prot)) {
		ksft_print_msg("mprotect() failed\n");
		err_map = MAP_MERGE_SKIP;
		goto unmap;
	}

	switch (mode) {
	case KSM_MERGE_PRCTL:
		ret = prctl(PR_SET_MEMORY_MERGE, 1, 0, 0, 0);
		if (ret < 0 && errno == EINVAL) {
			ksft_print_msg("PR_SET_MEMORY_MERGE not supported\n");
			err_map = MAP_MERGE_SKIP;
			goto unmap;
		} else if (ret) {
			ksft_print_msg("PR_SET_MEMORY_MERGE=1 failed\n");
			goto unmap;
		}
		break;
	case KSM_MERGE_MADVISE:
		if (madvise(map, size, MADV_MERGEABLE)) {
			ksft_print_msg("MADV_MERGEABLE failed\n");
			goto unmap;
		}
		break;
	case KSM_MERGE_NONE:
		break;
	}

	/* Run KSM to trigger merging and wait. */
	if (ksm_merge()) {
		ksft_print_msg("Running KSM failed\n");
		goto unmap;
	}

	/*
	 * Check if anything was merged at all. Ignore the zero page that is
	 * accounted differently (depending on kernel support).
	 */
	if (val && !get_my_merging_pages()) {
		ksft_print_msg("No pages got merged\n");
		goto unmap;
	}

	return map;
unmap:
	munmap(map, size);
	return err_map;
}

static char *mmap_and_merge_range(char val, unsigned long size, int prot,
				  enum ksm_merge_mode mode)
{
	char *map;
	char *ret = MAP_FAILED;

	map = __mmap_and_merge_range(val, size, prot, mode);
	if (map == MAP_MERGE_FAIL)
		ksft_test_result_fail("Merging memory failed");
	else if (map == MAP_MERGE_SKIP)
		ksft_test_result_skip("Merging memory skipped");
	else
		ret = map;

	return ret;
}

static void test_unmerge(void)
{
	const unsigned int size = 2 * MiB;
	char *map;

	ksft_print_msg("[RUN] %s\n", __func__);

	map = mmap_and_merge_range(0xcf, size, PROT_READ | PROT_WRITE, KSM_MERGE_MADVISE);
	if (map == MAP_FAILED)
		return;

	if (madvise(map, size, MADV_UNMERGEABLE)) {
		ksft_test_result_fail("MADV_UNMERGEABLE failed\n");
		goto unmap;
	}

	ksft_test_result(!range_maps_duplicates(map, size),
			 "Pages were unmerged\n");
unmap:
	munmap(map, size);
}

static void test_unmerge_zero_pages(void)
{
	const unsigned int size = 2 * MiB;
	char *map;
	unsigned int offs;
	unsigned long pages_expected;

	ksft_print_msg("[RUN] %s\n", __func__);

	if (proc_self_ksm_stat_fd < 0) {
		ksft_test_result_skip("open(\"/proc/self/ksm_stat\") failed\n");
		return;
	}
	if (ksm_use_zero_pages_fd < 0) {
		ksft_test_result_skip("open \"/sys/kernel/mm/ksm/use_zero_pages\" failed\n");
		return;
	}
	if (write(ksm_use_zero_pages_fd, "1", 1) != 1) {
		ksft_test_result_skip("write \"/sys/kernel/mm/ksm/use_zero_pages\" failed\n");
		return;
	}

	/* Let KSM deduplicate zero pages. */
	map = mmap_and_merge_range(0x00, size, PROT_READ | PROT_WRITE, KSM_MERGE_MADVISE);
	if (map == MAP_FAILED)
		return;

	/* Check if ksm_zero_pages is updated correctly after KSM merging */
	pages_expected = size / pagesize;
	if (pages_expected != get_my_ksm_zero_pages()) {
		ksft_test_result_fail("'ksm_zero_pages' updated after merging\n");
		goto unmap;
	}

	/* Try to unmerge half of the region */
	if (madvise(map, size / 2, MADV_UNMERGEABLE)) {
		ksft_test_result_fail("MADV_UNMERGEABLE failed\n");
		goto unmap;
	}

	/* Check if ksm_zero_pages is updated correctly after unmerging */
	pages_expected /= 2;
	if (pages_expected != get_my_ksm_zero_pages()) {
		ksft_test_result_fail("'ksm_zero_pages' updated after unmerging\n");
		goto unmap;
	}

	/* Trigger unmerging of the other half by writing to the pages. */
	for (offs = size / 2; offs < size; offs += pagesize)
		*((unsigned int *)&map[offs]) = offs;

	/* Now we should have no zeropages remaining. */
	if (get_my_ksm_zero_pages()) {
		ksft_test_result_fail("'ksm_zero_pages' updated after write fault\n");
		goto unmap;
	}

	/* Check if ksm zero pages are really unmerged */
	ksft_test_result(!range_maps_duplicates(map, size),
			 "KSM zero pages were unmerged\n");
unmap:
	munmap(map, size);
}

static void test_unmerge_discarded(void)
{
	const unsigned int size = 2 * MiB;
	char *map;

	ksft_print_msg("[RUN] %s\n", __func__);

	map = mmap_and_merge_range(0xcf, size, PROT_READ | PROT_WRITE, KSM_MERGE_MADVISE);
	if (map == MAP_FAILED)
		return;

	/* Discard half of all mapped pages so we have pte_none() entries. */
	if (madvise(map, size / 2, MADV_DONTNEED)) {
		ksft_test_result_fail("MADV_DONTNEED failed\n");
		goto unmap;
	}

	if (madvise(map, size, MADV_UNMERGEABLE)) {
		ksft_test_result_fail("MADV_UNMERGEABLE failed\n");
		goto unmap;
	}

	ksft_test_result(!range_maps_duplicates(map, size),
			 "Pages were unmerged\n");
unmap:
	munmap(map, size);
}

#ifdef __NR_userfaultfd
static void test_unmerge_uffd_wp(void)
{
	struct uffdio_writeprotect uffd_writeprotect;
	const unsigned int size = 2 * MiB;
	struct uffdio_api uffdio_api;
	char *map;
	int uffd;

	ksft_print_msg("[RUN] %s\n", __func__);

	map = mmap_and_merge_range(0xcf, size, PROT_READ | PROT_WRITE, KSM_MERGE_MADVISE);
	if (map == MAP_FAILED)
		return;

	/* See if UFFD is around. */
	uffd = syscall(__NR_userfaultfd, O_CLOEXEC | O_NONBLOCK);
	if (uffd < 0) {
		ksft_test_result_skip("__NR_userfaultfd failed\n");
		goto unmap;
	}

	/* See if UFFD-WP is around. */
	uffdio_api.api = UFFD_API;
	uffdio_api.features = UFFD_FEATURE_PAGEFAULT_FLAG_WP;
	if (ioctl(uffd, UFFDIO_API, &uffdio_api) < 0) {
		ksft_test_result_fail("UFFDIO_API failed\n");
		goto close_uffd;
	}
	if (!(uffdio_api.features & UFFD_FEATURE_PAGEFAULT_FLAG_WP)) {
		ksft_test_result_skip("UFFD_FEATURE_PAGEFAULT_FLAG_WP not available\n");
		goto close_uffd;
	}

	/* Register UFFD-WP, no need for an actual handler. */
	if (uffd_register(uffd, map, size, false, true, false)) {
		ksft_test_result_fail("UFFDIO_REGISTER_MODE_WP failed\n");
		goto close_uffd;
	}

	/* Write-protect the range using UFFD-WP. */
	uffd_writeprotect.range.start = (unsigned long) map;
	uffd_writeprotect.range.len = size;
	uffd_writeprotect.mode = UFFDIO_WRITEPROTECT_MODE_WP;
	if (ioctl(uffd, UFFDIO_WRITEPROTECT, &uffd_writeprotect)) {
		ksft_test_result_fail("UFFDIO_WRITEPROTECT failed\n");
		goto close_uffd;
	}

	if (madvise(map, size, MADV_UNMERGEABLE)) {
		ksft_test_result_fail("MADV_UNMERGEABLE failed\n");
		goto close_uffd;
	}

	ksft_test_result(!range_maps_duplicates(map, size),
			 "Pages were unmerged\n");
close_uffd:
	close(uffd);
unmap:
	munmap(map, size);
}
#endif

/* Verify that KSM can be enabled / queried with prctl. */
static void test_prctl(void)
{
	int ret;

	ksft_print_msg("[RUN] %s\n", __func__);

	ret = prctl(PR_SET_MEMORY_MERGE, 1, 0, 0, 0);
	if (ret < 0 && errno == EINVAL) {
		ksft_test_result_skip("PR_SET_MEMORY_MERGE not supported\n");
		return;
	} else if (ret) {
		ksft_test_result_fail("PR_SET_MEMORY_MERGE=1 failed\n");
		return;
	}

	ret = prctl(PR_GET_MEMORY_MERGE, 0, 0, 0, 0);
	if (ret < 0) {
		ksft_test_result_fail("PR_GET_MEMORY_MERGE failed\n");
		return;
	} else if (ret != 1) {
		ksft_test_result_fail("PR_SET_MEMORY_MERGE=1 not effective\n");
		return;
	}

	ret = prctl(PR_SET_MEMORY_MERGE, 0, 0, 0, 0);
	if (ret) {
		ksft_test_result_fail("PR_SET_MEMORY_MERGE=0 failed\n");
		return;
	}

	ret = prctl(PR_GET_MEMORY_MERGE, 0, 0, 0, 0);
	if (ret < 0) {
		ksft_test_result_fail("PR_GET_MEMORY_MERGE failed\n");
		return;
	} else if (ret != 0) {
		ksft_test_result_fail("PR_SET_MEMORY_MERGE=0 not effective\n");
		return;
	}

	ksft_test_result_pass("Setting/clearing PR_SET_MEMORY_MERGE works\n");
}

static int test_child_ksm(void)
{
	const unsigned int size = 2 * MiB;
	char *map;

	/* Test if KSM is enabled for the process. */
	if (prctl(PR_GET_MEMORY_MERGE, 0, 0, 0, 0) != 1)
		return -1;

	/* Test if merge could really happen. */
	map = __mmap_and_merge_range(0xcf, size, PROT_READ | PROT_WRITE, KSM_MERGE_NONE);
	if (map == MAP_MERGE_FAIL)
		return -2;
	else if (map == MAP_MERGE_SKIP)
		return -3;

	munmap(map, size);
	return 0;
}

static void test_child_ksm_err(int status)
{
	if (status == -1)
		ksft_test_result_fail("unexpected PR_GET_MEMORY_MERGE result in child\n");
	else if (status == -2)
		ksft_test_result_fail("Merge in child failed\n");
	else if (status == -3)
		ksft_test_result_skip("Merge in child skipped\n");
}

/* Verify that prctl ksm flag is inherited. */
static void test_prctl_fork(void)
{
	int ret, status;
	pid_t child_pid;

	ksft_print_msg("[RUN] %s\n", __func__);

	ret = prctl(PR_SET_MEMORY_MERGE, 1, 0, 0, 0);
	if (ret < 0 && errno == EINVAL) {
		ksft_test_result_skip("PR_SET_MEMORY_MERGE not supported\n");
		return;
	} else if (ret) {
		ksft_test_result_fail("PR_SET_MEMORY_MERGE=1 failed\n");
		return;
	}

	child_pid = fork();
	if (!child_pid) {
		exit(test_child_ksm());
	} else if (child_pid < 0) {
		ksft_test_result_fail("fork() failed\n");
		return;
	}

	if (waitpid(child_pid, &status, 0) < 0) {
		ksft_test_result_fail("waitpid() failed\n");
		return;
	}

	/* exit() truncates the child's negative return codes to a byte. */
	status = (signed char)WEXITSTATUS(status);
	if (status) {
		test_child_ksm_err(status);
		return;
	}

	if (prctl(PR_SET_MEMORY_MERGE, 0, 0, 0, 0)) {
		ksft_test_result_fail("PR_SET_MEMORY_MERGE=0 failed\n");
		return;
	}

	ksft_test_result_pass("PR_SET_MEMORY_MERGE value is inherited\n");
}

static void test_prctl_fork_exec(void)
{
	int ret, status;
	pid_t child_pid;

	ksft_print_msg("[RUN] %s\n", __func__);

	ret = prctl(PR_SET_MEMORY_MERGE, 1, 0, 0, 0);
	if (ret < 0 && errno == EINVAL) {
		ksft_test_result_skip("PR_SET_MEMORY_MERGE not supported\n");
		return;
	} else if (ret) {
		ksft_test_result_fail("PR_SET_MEMORY_MERGE=1 failed\n");
		return;
	}

	child_pid = fork();
	if (child_pid == -1) {
		ksft_test_result_skip("fork() failed\n");
		return;
	} else if (child_pid == 0) {
		char *prg_name = "./ksm_functional_tests";
		/* execv() requires a NULL-terminated argument vector. */
		char *argv_for_program[] = { prg_name, FORK_EXEC_CHILD_PRG_NAME, NULL };

		execv(prg_name, argv_for_program);
		return;
	}

	if (waitpid(child_pid, &status, 0) > 0) {
		if (WIFEXITED(status)) {
			/* exit() truncates the child's negative return codes to a byte. */
			status = (signed char)WEXITSTATUS(status);
			if (status) {
				test_child_ksm_err(status);
				return;
			}
		} else {
			ksft_test_result_fail("program didn't terminate normally\n");
			return;
		}
	} else {
		ksft_test_result_fail("waitpid() failed\n");
		return;
	}

	if (prctl(PR_SET_MEMORY_MERGE, 0, 0, 0, 0)) {
		ksft_test_result_fail("PR_SET_MEMORY_MERGE=0 failed\n");
		return;
	}

	ksft_test_result_pass("PR_SET_MEMORY_MERGE value is inherited\n");
}

static void test_prctl_unmerge(void)
{
	const unsigned int size = 2 * MiB;
	char *map;

	ksft_print_msg("[RUN] %s\n", __func__);

	map = mmap_and_merge_range(0xcf, size, PROT_READ | PROT_WRITE, KSM_MERGE_PRCTL);
	if (map == MAP_FAILED)
		return;

	if (prctl(PR_SET_MEMORY_MERGE, 0, 0, 0, 0)) {
		ksft_test_result_fail("PR_SET_MEMORY_MERGE=0 failed\n");
		goto unmap;
	}

	ksft_test_result(!range_maps_duplicates(map, size),
			 "Pages were unmerged\n");
unmap:
	munmap(map, size);
}

static void test_prot_none(void)
{
	const unsigned int size = 2 * MiB;
	char *map;
	int i;

	ksft_print_msg("[RUN] %s\n", __func__);

	map = mmap_and_merge_range(0x11, size, PROT_NONE, KSM_MERGE_MADVISE);
	if (map == MAP_FAILED)
		return;

	/* Store a unique value in each page on one half using ptrace */
	for (i = 0; i < size / 2; i += pagesize) {
		lseek(mem_fd, (uintptr_t) map + i, SEEK_SET);
		if (write(mem_fd, &i, sizeof(i)) != sizeof(i)) {
			ksft_test_result_fail("ptrace write failed\n");
			goto unmap;
		}
	}

	/* Trigger unsharing on the other half. */
	if (madvise(map + size / 2, size / 2, MADV_UNMERGEABLE)) {
		ksft_test_result_fail("MADV_UNMERGEABLE failed\n");
		goto unmap;
	}

	ksft_test_result(!range_maps_duplicates(map, size),
			 "Pages were unmerged\n");
unmap:
	munmap(map, size);
}

int main(int argc, char **argv)
{
	unsigned int tests = 8;
	int err;

	if (argc > 1 && !strcmp(argv[1], FORK_EXEC_CHILD_PRG_NAME)) {
		exit(test_child_ksm());
	}

#ifdef __NR_userfaultfd
	tests++;
#endif

	ksft_print_header();
	ksft_set_plan(tests);

	pagesize = getpagesize();

	mem_fd = open("/proc/self/mem", O_RDWR);
	if (mem_fd < 0)
		ksft_exit_fail_msg("opening /proc/self/mem failed\n");
	ksm_fd = open("/sys/kernel/mm/ksm/run", O_RDWR);
	if (ksm_fd < 0)
		ksft_exit_skip("open(\"/sys/kernel/mm/ksm/run\") failed\n");
	ksm_full_scans_fd = open("/sys/kernel/mm/ksm/full_scans", O_RDONLY);
	if (ksm_full_scans_fd < 0)
		ksft_exit_skip("open(\"/sys/kernel/mm/ksm/full_scans\") failed\n");
	pagemap_fd = open("/proc/self/pagemap", O_RDONLY);
	if (pagemap_fd < 0)
		ksft_exit_skip("open(\"/proc/self/pagemap\") failed\n");
	proc_self_ksm_stat_fd = open("/proc/self/ksm_stat", O_RDONLY);
	proc_self_ksm_merging_pages_fd = open("/proc/self/ksm_merging_pages",
					      O_RDONLY);
	ksm_use_zero_pages_fd = open("/sys/kernel/mm/ksm/use_zero_pages", O_RDWR);

	test_unmerge();
	test_unmerge_zero_pages();
	test_unmerge_discarded();
#ifdef __NR_userfaultfd
	test_unmerge_uffd_wp();
#endif

	test_prot_none();

	test_prctl();
	test_prctl_fork();
	test_prctl_fork_exec();
	test_prctl_unmerge();

	err = ksft_get_fail_cnt();
	if (err)
		ksft_exit_fail_msg("%d out of %d tests failed\n",
				   err, ksft_test_num());
	ksft_exit_pass();
}