Let's use alloc_contig_pages() for allocating memory and remove the
linear mapping manually via arch_remove_linear_mapping(). Mark all pages
PG_offline, such that they will definitely not get touched - e.g.,
when hibernating. When freeing memory, try to revert what we did.

The original idea was discussed in:
 https://lkml.kernel.org/r/48340e96-7e6b-736f-9e23-d3111b915b6e@redhat.com

This is similar to CONFIG_DEBUG_PAGEALLOC handling on other
architectures, whereby only single pages are unmapped from the linear
mapping. Let's mimic what memory hot(un)plug would do with the linear
mapping.

We now need MEMORY_HOTPLUG and CONTIG_ALLOC as dependencies. Add a TODO
that we want to use __GFP_ZERO for clearing once alloc_contig_pages()
understands that.

Tested in QEMU/TCG with 10 GiB of main memory:
  [root@localhost ~]# echo 0x40000000 > /sys/kernel/debug/powerpc/memtrace/enable
  [ 105.903043][ T1080] memtrace: Allocated trace memory on node 0 at 0x0000000080000000
  [root@localhost ~]# echo 0x40000000 > /sys/kernel/debug/powerpc/memtrace/enable
  [ 145.042493][ T1080] radix-mmu: Mapped 0x0000000080000000-0x00000000c0000000 with 64.0 KiB pages
  [ 145.049019][ T1080] memtrace: Freed trace memory back on node 0
  [ 145.333960][ T1080] memtrace: Allocated trace memory on node 0 at 0x0000000080000000
  [root@localhost ~]# echo 0x80000000 > /sys/kernel/debug/powerpc/memtrace/enable
  [ 213.606916][ T1080] radix-mmu: Mapped 0x0000000080000000-0x00000000c0000000 with 64.0 KiB pages
  [ 213.613855][ T1080] memtrace: Freed trace memory back on node 0
  [ 214.185094][ T1080] memtrace: Allocated trace memory on node 0 at 0x0000000080000000
  [root@localhost ~]# echo 0x100000000 > /sys/kernel/debug/powerpc/memtrace/enable
  [ 234.874872][ T1080] radix-mmu: Mapped 0x0000000080000000-0x0000000100000000 with 64.0 KiB pages
  [ 234.886974][ T1080] memtrace: Freed trace memory back on node 0
  [ 234.890153][ T1080] memtrace: Failed to allocate trace memory on node 0
  [root@localhost ~]# echo 0x40000000 > /sys/kernel/debug/powerpc/memtrace/enable
  [ 259.490196][ T1080] memtrace: Allocated trace memory on node 0 at 0x0000000080000000

I also made sure allocated memory is properly zeroed.

Note 1: We currently won't be allocating from ZONE_MOVABLE - because our
pages are not movable. However, as we don't run with any memory
hot(un)plug mechanism around, we could make an exception to increase the
chance of allocations succeeding.

Note 2: PG_reserved isn't sufficient. E.g., kernel_page_present() used
along with PG_reserved in hibernation code will always return "true" on
powerpc, resulting in the pages getting touched. PG_reserved is also too
generic - e.g., it indicates boot allocations.

Note 3: For now, we keep using memory_block_size_bytes() as minimum
granularity.

Suggested-by: Michal Hocko <mhocko@kernel.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201111145322.15793-9-david@redhat.com
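
Note 3 says the memory block size remains the minimum granularity for trace allocations. The following is a minimal shell sketch for checking a size before writing it to the debugfs file, assuming the size has to be a non-zero multiple of the block size. is_block_aligned is a hypothetical helper introduced here for illustration; it is not part of the patch or the kernel.

```shell
# Hypothetical helper (not part of the patch): succeed when SIZE is a
# non-zero multiple of BLOCK. Both arguments may be decimal or
# 0x-prefixed hex; shell arithmetic handles the conversion.
is_block_aligned() {
    _size=$(( $1 ))
    _block=$(( $2 ))
    # A zero or negative block size cannot be used as a divisor.
    [ "$_block" -gt 0 ] || return 2
    [ "$_size" -gt 0 ] && [ $(( _size % _block )) -eq 0 ]
}
```

On a running system the block size can typically be read from /sys/devices/system/memory/block_size_bytes, which reports the value in hex without a 0x prefix, so a call could look like: is_block_aligned 0x40000000 "0x$(cat /sys/devices/system/memory/block_size_bytes)".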
52 lines
1.3 KiB
Text
# SPDX-License-Identifier: GPL-2.0
config PPC_POWERNV
	depends on PPC64 && PPC_BOOK3S
	bool "IBM PowerNV (Non-Virtualized) platform support"
	select PPC_NATIVE
	select PPC_XICS
	select PPC_ICP_NATIVE
	select PPC_XIVE_NATIVE
	select PPC_P7_NAP
	select FORCE_PCI
	select PCI_MSI
	select EPAPR_BOOT
	select PPC_INDIRECT_PIO
	select PPC_UDBG_16550
	select ARCH_RANDOM
	select CPU_FREQ
	select PPC_DOORBELL
	select MMU_NOTIFIER
	select FORCE_SMP
	default y

config OPAL_PRD
	tristate 'OPAL PRD driver'
	depends on PPC_POWERNV
	help
	  This enables the opal-prd driver, a facility to run processor
	  recovery diagnostics on OpenPower machines

config PPC_MEMTRACE
	bool "Enable runtime allocation of RAM for tracing"
	depends on PPC_POWERNV && MEMORY_HOTPLUG && CONTIG_ALLOC
	help
	  Enabling this option allows for runtime allocation of memory (RAM)
	  for hardware tracing.

config PPC_VAS
	bool "IBM Virtual Accelerator Switchboard (VAS)"
	depends on PPC_POWERNV && PPC_64K_PAGES
	default y
	help
	  This enables support for IBM Virtual Accelerator Switchboard (VAS).

	  VAS allows accelerators in co-processors like NX-GZIP and NX-842
	  to be accessible to kernel subsystems and user processes.

	  VAS adapters are found in POWER9 based systems.

	  If unsure, say N.

config SCOM_DEBUGFS
	bool "Expose SCOM controllers via debugfs"
	depends on DEBUG_FS