linux

mirror of synced 2025-03-06 20:59:54 +01:00

Author	SHA1	Message	Date
Christophe Leroy	e2fb251188	powerpc/mm: flatten function __find_linux_pte() step 2 __find_linux_pte() is full of if/else which is hard to follow allthough the handling is pretty simple. Previous patch left { } blocks. This patch removes the first one by shifting its content to the left. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-05-03 01:20:24 +10:00
Christophe Leroy	fab9a1165b	powerpc/mm: flatten function __find_linux_pte() step 1 __find_linux_pte() is full of if/else which is hard to follow allthough the handling is pretty simple. This patch flattens the function by getting rid of as much if/else as possible. In order to ease the review, this is done in three steps. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-05-03 01:20:24 +10:00
Christophe Leroy	4df4b27585	powerpc/mm: cleanup remaining ifdef mess in hugetlbpage.c Only 3 subarches support huge pages. So when it is either 2 of them, it is not the third one. And mmu_has_feature() is known by all subarches so IS_ENABLED() can be used instead of #ifdef Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-05-03 01:20:24 +10:00
Christophe Leroy	c5710cd207	powerpc/mm: cleanup HPAGE_SHIFT setup Only book3s/64 may select default among several HPAGE_SHIFT at runtime. 8xx always defines 512K pages as default FSL_BOOK3E always defines 4M pages as default This patch limits HUGETLB_PAGE_SIZE_VARIABLE to book3s/64 moves the definitions in subarches files. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-05-03 01:20:24 +10:00
Christophe Leroy	45d0ba527b	powerpc/mm: move hugetlb_disabled into asm/hugetlb.h No need to have this in asm/page.h, move it into asm/hugetlb.h Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-05-03 01:20:24 +10:00
Christophe Leroy	723f268f19	powerpc/mm: cleanup ifdef mess in add_huge_page_size() Introduce a subarch specific helper check_and_get_huge_psize() to check the huge page sizes and cleanup the ifdef mess in add_huge_page_size() Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-05-03 01:20:23 +10:00
Christophe Leroy	5fb84fec46	powerpc/mm: add a helper to populate hugepd This patchs adds a subarch helper to populate hugepd. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-05-03 01:20:23 +10:00
Christophe Leroy	0001e5aa5c	powerpc/mm: make gup_hugepte() static gup_huge_pd() is the only user of gup_hugepte() and it is located in the same file. This patch moves gup_huge_pd() after gup_hugepte() and makes gup_hugepte() static. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-05-03 01:20:23 +10:00
Christophe Leroy	b7dcf96ce0	powerpc/mm: make hugetlbpage.c depend on CONFIG_HUGETLB_PAGE The only function in hugetlbpage.c which doesn't depend on CONFIG_HUGETLB_PAGE is gup_hugepte(), and this function is only called from gup_huge_pd() which depends on CONFIG_HUGETLB_PAGE so all the content of hugetlbpage.c depends on CONFIG_HUGETLB_PAGE. This patch modifies Makefile to only compile hugetlbpage.c when CONFIG_HUGETLB_PAGE is set. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-05-03 01:20:23 +10:00
Christophe Leroy	0caed4de50	powerpc/mm: move __find_linux_pte() out of hugetlbpage.c __find_linux_pte() is the only function in hugetlbpage.c which is compiled in regardless on CONFIG_HUGETLBPAGE This patch moves it in pgtable.c. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-05-03 01:20:23 +10:00
Christophe Leroy	3dea7332cc	powerpc/book3e: hugetlbpage is only for CONFIG_PPC_FSL_BOOK3E As per Kconfig.cputype, only CONFIG_PPC_FSL_BOOK3E gets to select SYS_SUPPORTS_HUGETLBFS so simplify accordingly. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-05-03 01:20:23 +10:00
Christophe Leroy	5874cabe29	powerpc/64: only book3s/64 supports CONFIG_PPC_64K_PAGES CONFIG_PPC_64K_PAGES cannot be selected by nohash/64. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-05-03 01:20:23 +10:00
Christophe Leroy	a521c44c3d	powerpc/book3e: drop mmu_get_tsize() This function is not used anymore, drop it. Fixes: `b42279f016` ("powerpc/mm/nohash: MM_SLICE is only used by book3s 64") Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-05-03 01:20:23 +10:00
Christophe Leroy	5953fb4f46	powerpc/mm: define subarch SLB_ADDR_LIMIT_DEFAULT This patch defines a subarch specific SLB_ADDR_LIMIT_DEFAULT to remove the #ifdefs around the setup of mm->context.slb_addr_limit It also generalises the use of mm_ctx_set_slb_addr_limit() helper. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-05-03 01:20:23 +10:00
Christophe Leroy	43ed7909d7	powerpc/mm: define get_slice_psize() all the time get_slice_psize() can be defined regardless of CONFIG_PPC_MM_SLICES to avoid ifdefs Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-05-03 01:20:23 +10:00
Christophe Leroy	203a1fa628	powerpc/mm: remove a couple of #ifdef CONFIG_PPC_64K_PAGES in mm/slice.c This patch replaces a couple of #ifdef CONFIG_PPC_64K_PAGES by IS_ENABLED(CONFIG_PPC_64K_PAGES) to improve code maintainability. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-05-03 01:20:23 +10:00
Christophe Leroy	b4baad0b27	powerpc/mm: remove unnecessary #ifdef CONFIG_PPC64 For PPC32 that's a noop, gcc should be smart enough to ignore it. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-05-03 01:20:22 +10:00
Christophe Leroy	fca5c1e9eb	powerpc/mm: move slice_mask_for_size() into mmu.h Move slice_mask_for_size() into subarch mmu.h Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> [mpe: Retain the BUG_ON()s, rather than converting to VM_BUG_ON()] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-05-03 01:20:22 +10:00
Christophe Leroy	6f60cc98df	powerpc/mm: hand a context_t over to slice_mask_for_size() instead of mm_struct slice_mask_for_size() only uses mm->context, so hand directly a pointer to the context. This will help moving the function in subarch mmu.h in the next patch by avoiding having to include the definition of struct mm_struct Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-05-03 01:20:22 +10:00
Christophe Leroy	27e23b5f5f	powerpc/mm: Move nohash specifics in subdirectory mm/nohash Many files in arch/powerpc/mm are only for nohash. This patch creates a subdirectory for them. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> [mpe: Shorten new filenames] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-05-03 01:20:22 +10:00
Christophe Leroy	17312f258c	powerpc/mm: Move book3s32 specifics in subdirectory mm/book3s64 Several files in arch/powerpc/mm are only for book3S32. This patch creates a subdirectory for them. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> [mpe: Shorten new filenames] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-05-03 01:20:22 +10:00
Christophe Leroy	47d99948ee	powerpc/mm: Move book3s64 specifics in subdirectory mm/book3s64 Many files in arch/powerpc/mm are only for book3S64. This patch creates a subdirectory for them. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> [mpe: Update the selftest sym links, shorten new filenames, cleanup some whitespace and formatting in the new files.] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-05-03 01:18:38 +10:00
Christophe Leroy	9d9f2cccde	powerpc/mm: change #include "mmu_decl.h" to <mm/mmu_decl.h> This patch make inclusion of mmu_decl.h independant of the location of the file including it. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-05-02 21:18:58 +10:00
Christophe Leroy	a1ac2a9c4f	powerpc/book3e: drop BUG_ON() in map_kernel_page() early_alloc_pgtable() never returns NULL as it panics on failure. This patch drops the three BUG_ON() which check the non nullity of early_alloc_pgtable() returned value. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-05-02 21:18:57 +10:00
Christophe Leroy	12f363511d	powerpc/32s: Fix BATs setting with CONFIG_STRICT_KERNEL_RWX Serge reported some crashes with CONFIG_STRICT_KERNEL_RWX enabled on a book3s32 machine. Analysis shows two issues: - BATs addresses and sizes are not properly aligned. - There is a gap between the last address covered by BATs and the first address covered by pages. Memory mapped with DBATs: 0: 0xc0000000-0xc07fffff 0x00000000 Kernel RO coherent 1: 0xc0800000-0xc0bfffff 0x00800000 Kernel RO coherent 2: 0xc0c00000-0xc13fffff 0x00c00000 Kernel RW coherent 3: 0xc1400000-0xc23fffff 0x01400000 Kernel RW coherent 4: 0xc2400000-0xc43fffff 0x02400000 Kernel RW coherent 5: 0xc4400000-0xc83fffff 0x04400000 Kernel RW coherent 6: 0xc8400000-0xd03fffff 0x08400000 Kernel RW coherent 7: 0xd0400000-0xe03fffff 0x10400000 Kernel RW coherent Memory mapped with pages: 0xe1000000-0xefffffff 0x21000000 240M rw present dirty accessed This patch fixes both issues. With the patch, we get the following which is as expected: Memory mapped with DBATs: 0: 0xc0000000-0xc07fffff 0x00000000 Kernel RO coherent 1: 0xc0800000-0xc0bfffff 0x00800000 Kernel RO coherent 2: 0xc0c00000-0xc0ffffff 0x00c00000 Kernel RW coherent 3: 0xc1000000-0xc1ffffff 0x01000000 Kernel RW coherent 4: 0xc2000000-0xc3ffffff 0x02000000 Kernel RW coherent 5: 0xc4000000-0xc7ffffff 0x04000000 Kernel RW coherent 6: 0xc8000000-0xcfffffff 0x08000000 Kernel RW coherent 7: 0xd0000000-0xdfffffff 0x10000000 Kernel RW coherent Memory mapped with pages: 0xe0000000-0xefffffff 0x20000000 256M rw present dirty accessed Fixes: `63b2bc6195` ("powerpc/mm/32s: Use BATs for STRICT_KERNEL_RWX") Reported-by: Serge Belyshev <belyshev@depni.sinp.msu.ru> Acked-by: Segher Boessenkool <segher@kernel.crashing.org> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-05-02 15:33:46 +10:00
Aneesh Kumar K.V	2c474c0350	powerpc/mm/radix: Fix kernel crash when running subpage protect test This patch fixes the below crash by making sure we touch the subpage protection related structures only if we know they are allocated on the platform. With radix translation we don't allocate hash context at all and trying to access subpage_prot_table results in: Faulting instruction address: 0xc00000000008bdb4 Oops: Kernel access of bad area, sig: 11 [#1] LE PAGE_SIZE=64K MMU=Radix MMU=Hash SMP NR_CPUS=2048 NUMA PowerNV .... NIP [c00000000008bdb4] sys_subpage_prot+0x74/0x590 LR [c00000000000b688] system_call+0x5c/0x70 Call Trace: [c00020002c6b7d30] [c00020002c6b7d90] 0xc00020002c6b7d90 (unreliable) [c00020002c6b7e20] [c00000000000b688] system_call+0x5c/0x70 Instruction dump: fb61ffd8 fb81ffe0 fba1ffe8 fbc1fff0 fbe1fff8 f821ff11 e92d1178 f9210068 39200000 e92d0968 ebe90630 e93f03e8 <eb891038> 60000000 3860fffe e9410068 We also move the subpage_prot_table with mmp_sem held to avoid race between two parallel subpage_prot syscall. Fixes: `701101865f` ("powerpc/mm: Reduce memory usage for mm_context_t for radix") Reported-by: Sachin Sant <sachinp@linux.ibm.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Tested-by: Sachin Sant <sachinp@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-05-01 23:01:33 +10:00
Nathan Fontenot	b2d3b5ee66	powerpc/pseries: Track LMB nid instead of using device tree When removing memory we need to remove the memory from the node it was added to instead of looking up the node it should be in in the device tree. During testing we have seen scenarios where the affinity for a LMB changes due to a partition migration or PRRN event. In these cases the node the LMB exists in may not match the node the device tree indicates it belongs in. This can lead to a system crash when trying to DLPAR remove the LMB after a migration or PRRN event. The current code looks up the node in the device tree to remove the LMB from, the crash occurs when we try to offline this node and it does not have any data, i.e. node_data[nid] == NULL. 36:mon> e cpu 0x36: Vector: 300 (Data Access) at [c0000001828b7810] pc: c00000000036d08c: try_offline_node+0x2c/0x1b0 lr: c0000000003a14ec: remove_memory+0xbc/0x110 sp: c0000001828b7a90 msr: 800000000280b033 dar: 9a28 dsisr: 40000000 current = 0xc0000006329c4c80 paca = 0xc000000007a55200 softe: 0 irq_happened: 0x01 pid = 76926, comm = kworker/u320:3 36:mon> t [link register ] c0000000003a14ec remove_memory+0xbc/0x110 [c0000001828b7a90] c00000000006a1cc arch_remove_memory+0x9c/0xd0 (unreliable) [c0000001828b7ad0] c0000000003a14e0 remove_memory+0xb0/0x110 [c0000001828b7b20] c0000000000c7db4 dlpar_remove_lmb+0x94/0x160 [c0000001828b7b60] c0000000000c8ef8 dlpar_memory+0x7e8/0xd10 [c0000001828b7bf0] c0000000000bf828 handle_dlpar_errorlog+0xf8/0x160 [c0000001828b7c60] c0000000000bf8cc pseries_hp_work_fn+0x3c/0xa0 [c0000001828b7c90] c000000000128cd8 process_one_work+0x298/0x5a0 [c0000001828b7d20] c000000000129068 worker_thread+0x88/0x620 [c0000001828b7dc0] c00000000013223c kthread+0x1ac/0x1c0 [c0000001828b7e30] c00000000000b45c ret_from_kernel_thread+0x5c/0x80 To resolve this we need to track the node a LMB belongs to when it is added to the system so we can remove it from that node instead of the node that the device tree indicates it should belong to. Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-04-29 22:27:16 +10:00
Colin Ian King	f341d89790	powerpc/mm: fix spelling mistake "Outisde" -> "Outside" There are several identical spelling mistakes in warning messages, fix these. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-04-28 16:28:23 +10:00
Aneesh Kumar K.V	26ad26718d	powerpc/mm: Fix section mismatch warning This patch fix the below section mismatch warnings. WARNING: vmlinux.o(.text+0x2d1f44): Section mismatch in reference from the function devm_memremap_pages_release() to the function .meminit.text:arch_remove_memory() WARNING: vmlinux.o(.text+0x2d265c): Section mismatch in reference from the function devm_memremap_pages() to the function .meminit.text:arch_add_memory() Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-04-21 23:12:40 +10:00
Aneesh Kumar K.V	5f53d28608	powerpc/mm/hash: Rename KERNEL_REGION_ID to LINEAR_MAP_REGION_ID The region actually point to linear map. Rename the #define to clarify thati. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-04-21 23:12:40 +10:00
Aneesh Kumar K.V	e09093927e	powerpc/mm: Validate address values against different region limits This adds an explicit check in various functions. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-04-21 23:12:39 +10:00
Aneesh Kumar K.V	0034d395f8	powerpc/mm/hash64: Map all the kernel regions in the same 0xc range This patch maps vmalloc, IO and vmemap regions in the 0xc address range instead of the current 0xd and 0xf range. This brings the mapping closer to radix translation mode. With hash 64K page size each of this region is 512TB whereas with 4K config we are limited by the max page table range of 64TB and hence there regions are of 16TB size. The kernel mapping is now: On 4K hash kernel_region_map_size = 16TB kernel vmalloc start = 0xc000100000000000 kernel IO start = 0xc000200000000000 kernel vmemmap start = 0xc000300000000000 64K hash, 64K radix and 4k radix: kernel_region_map_size = 512TB kernel vmalloc start = 0xc008000000000000 kernel IO start = 0xc00a000000000000 kernel vmemmap start = 0xc00c000000000000 Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-04-21 23:12:39 +10:00
Aneesh Kumar K.V	a35a3c6f60	powerpc/mm/hash64: Add a variable to track the end of IO mapping This makes it easy to update the region mapping in the later patch Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-04-21 23:12:39 +10:00
Aneesh Kumar K.V	ef629cc5bf	powerc/mm/hash: Reduce hash_mm_context size Allocate subpage protect related variables only if we use the feature. This helps in reducing the hash related mm context struct by around 4K Before the patch sizeof(struct hash_mm_context) = 8288 After the patch sizeof(struct hash_mm_context) = 4160 Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-04-21 23:12:39 +10:00
Aneesh Kumar K.V	701101865f	powerpc/mm: Reduce memory usage for mm_context_t for radix Currently, our mm_context_t on book3s64 include all hash specific context details like slice mask and subpage protection details. We can skip allocating these with radix translation. This will help us to save 8K per mm_context with radix translation. With the patch applied we have sizeof(mm_context_t) = 136 sizeof(struct hash_mm_context) = 8288 Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-04-21 23:12:39 +10:00
Aneesh Kumar K.V	67fda38f0d	powerpc/mm: Move slb_addr_linit to early_init_mmu Avoid #ifdef in generic code. Also enables us to do this specific to MMU translation mode on book3s64 Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-04-21 23:12:39 +10:00
Aneesh Kumar K.V	60458fba46	powerpc/mm: Add helpers for accessing hash translation related variables We want to switch to allocating them runtime only when hash translation is enabled. Add helpers so that both book3s and nohash can be adapted to upcoming change easily. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-04-21 23:12:38 +10:00
Christophe Leroy	a68c31fc01	powerpc/32s: Implement Kernel Userspace Access Protection This patch implements Kernel Userspace Access Protection for book3s/32. Due to limitations of the processor page protection capabilities, the protection is only against writing. read protection cannot be achieved using page protection. The previous patch modifies the page protection so that RW user pages are RW for Key 0 and RO for Key 1, and it sets Key 0 for both user and kernel. This patch changes userspace segment registers are set to Ku 0 and Ks 1. When kernel needs to write to RW pages, the associated segment register is then changed to Ks 0 in order to allow write access to the kernel. In order to avoid having the read all segment registers when locking/unlocking the access, some data is kept in the thread_struct and saved on stack on exceptions. The field identifies both the first unlocked segment and the first segment following the last unlocked one. When no segment is unlocked, it contains value 0. As the hash_page() function is not able to easily determine if a protfault is due to a bad kernel access to userspace, protfaults need to be handled by handle_page_fault when KUAP is set. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> [mpe: Drop allow_read/write_to/from_user() as they're now in kup.h, and adapt allow_user_access() to do nothing when to == NULL] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-04-21 23:11:47 +10:00
Christophe Leroy	f342adca3a	powerpc/32s: Prepare Kernel Userspace Access Protection This patch prepares Kernel Userspace Access Protection for book3s/32. Due to limitations of the processor page protection capabilities, the protection is only against writing. read protection cannot be achieved using page protection. book3s/32 provides the following values for PP bits: PP00 provides RW for Key 0 and NA for Key 1 PP01 provides RW for Key 0 and RO for Key 1 PP10 provides RW for all PP11 provides RO for all Today PP10 is used for RW pages and PP11 for RO pages, and user segment register's Kp and Ks are set to 1. This patch modifies page protection to use PP01 for RW pages and sets user segment registers to Kp 0 and Ks 0. This will allow to setup Userspace write access protection by settng Ks to 1 in the following patch. Kernel space segment registers remain unchanged. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-04-21 23:11:46 +10:00
Christophe Leroy	31ed2b13c4	powerpc/32s: Implement Kernel Userspace Execution Prevention. To implement Kernel Userspace Execution Prevention, this patch sets NX bit on all user segments on kernel entry and clears NX bit on all user segments on kernel exit. Note that powerpc 601 doesn't have the NX bit, so KUEP will not work on it. A warning is displayed at startup. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-04-21 23:11:46 +10:00
Christophe Leroy	2679f9bd0a	powerpc/8xx: Add Kernel Userspace Access Protection This patch adds Kernel Userspace Access Protection on the 8xx. When a page is RO or RW, it is set RO or RW for Key 0 and NA for Key 1. Up to now, the User group is defined with Key 0 for both User and Supervisor. By changing the group to Key 0 for User and Key 1 for Supervisor, this patch prevents the Kernel from being able to access user data. At exception entry, the kernel saves SPRN_MD_AP in the regs struct, and reapply the protection. At exception exit it restores SPRN_MD_AP with the value saved on exception entry. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> [mpe: Drop allow_read/write_to/from_user() as they're now in kup.h] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-04-21 23:11:46 +10:00
Christophe Leroy	06fbe81b59	powerpc/8xx: Add Kernel Userspace Execution Prevention This patch adds Kernel Userspace Execution Prevention on the 8xx. When a page is Executable, it is set Executable for Key 0 and NX for Key 1. Up to now, the User group is defined with Key 0 for both User and Supervisor. By changing the group to Key 0 for User and Key 1 for Supervisor, this patch prevents the Kernel from being able to execute user code. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-04-21 23:11:46 +10:00
Michael Ellerman	5e5be3aed2	powerpc/mm: Detect bad KUAP faults When KUAP is enabled we have logic to detect page faults that occur outside of a valid user access region and are blocked by the AMR. What we don't have at the moment is logic to detect a fault within a valid user access region, that has been incorrectly blocked by AMR. This is not meant to ever happen, but it can if we incorrectly save/restore the AMR, or if the AMR was overwritten for some other reason. Currently if that happens we assume it's just a regular fault that will be corrected by handling the fault normally, so we just return. But there is nothing the fault handling code can do to fix it, so the fault just happens again and we spin forever, leading to soft lockups. So add some logic to detect that case and WARN() if we ever see it. Arguably it should be a BUG(), but it's more polite to fail the access and let the kernel continue, rather than taking down the box. There should be no data integrity issue with failing the fault rather than BUG'ing, as we're just going to disallow an access that should have been allowed. To make the code a little easier to follow, unroll the condition at the end of bad_kernel_fault() and comment each case, before adding the call to bad_kuap_fault(). Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-04-21 23:06:04 +10:00
Michael Ellerman	890274c2dc	powerpc/64s: Implement KUAP for Radix MMU Kernel Userspace Access Prevention utilises a feature of the Radix MMU which disallows read and write access to userspace addresses. By utilising this, the kernel is prevented from accessing user data from outside of trusted paths that perform proper safety checks, such as copy_{to/from}_user() and friends. Userspace access is disabled from early boot and is only enabled when performing an operation like copy_{to/from}_user(). The register that controls this (AMR) does not prevent userspace from accessing itself, so there is no need to save and restore when entering and exiting userspace. When entering the kernel from the kernel we save AMR and if it is not blocking user access (because eg. we faulted doing a user access) we reblock user access for the duration of the exception (ie. the page fault) and then restore the AMR when returning back to the kernel. This feature can be tested by using the lkdtm driver (CONFIG_LKDTM=y) and performing the following: # (echo ACCESS_USERSPACE) > [debugfs]/provoke-crash/DIRECT If enabled, this should send SIGSEGV to the thread. We also add paranoid checking of AMR in switch and syscall return under CONFIG_PPC_KUAP_DEBUG. Co-authored-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-04-21 23:06:02 +10:00
Russell Currey	1bb2bae2e6	powerpc/mm/radix: Use KUEP API for Radix MMU Execution protection already exists on radix, this just refactors the radix init to provide the KUEP setup function instead. Thus, the only functional change is that it can now be disabled. Signed-off-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-04-21 23:06:01 +10:00
Russell Currey	b28c97505e	powerpc/64: Setup KUP on secondary CPUs Some platforms (i.e. Radix MMU) need per-CPU initialisation for KUP. Any platforms that only want to do KUP initialisation once globally can just check to see if they're running on the boot CPU, or check if whatever setup they need has already been performed. Note that this is only for 64-bit. Signed-off-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-04-21 23:05:59 +10:00
Christophe Leroy	de78a9c42a	powerpc: Add a framework for Kernel Userspace Access Protection This patch implements a framework for Kernel Userspace Access Protection. Then subarches will have the possibility to provide their own implementation by providing setup_kuap() and allow/prevent_user_access(). Some platforms will need to know the area accessed and whether it is accessed from read, write or both. Therefore source, destination and size and handed over to the two functions. mpe: Rename to allow/prevent rather than unlock/lock, and add read/write wrappers. Drop the 32-bit code for now until we have an implementation for it. Add kuap to pt_regs for 64-bit as well as 32-bit. Don't split strings, use pr_crit_ratelimited(). Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-04-21 23:05:57 +10:00
Christophe Leroy	0fb1c25ab5	powerpc: Add skeleton for Kernel Userspace Execution Prevention This patch adds a skeleton for Kernel Userspace Execution Prevention. Then subarches implementing it have to define CONFIG_PPC_HAVE_KUEP and provide setup_kuep() function. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> [mpe: Don't split strings, use pr_crit_ratelimited()] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-04-21 23:05:56 +10:00
Christophe Leroy	69795cabe4	powerpc: Add framework for Kernel Userspace Protection This patch adds a skeleton for Kernel Userspace Protection functionnalities like Kernel Userspace Access Protection and Kernel Userspace Execution Prevention The subsequent implementation of KUAP for radix makes use of a MMU feature in order to patch out assembly when KUAP is disabled or unsupported. This won't work unless there's an entry point for KUP support before the feature magic happens, so for PPC64 setup_kup() is called early in setup. On PPC32, feature_fixup() is done too early to allow the same. Suggested-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-04-21 23:05:54 +10:00
Nathan Lynch	558f86493d	powerpc/numa: document topology_updates_enabled, disable by default Changing the NUMA associations for CPUs and memory at runtime is basically unsupported by the core mm, scheduler etc. We see all manner of crashes, warnings and instability when the pseries code tries to do this. Disable this behavior by default, and document the switch a bit. Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-04-20 22:03:59 +10:00

... 2 3 4 5 6 ...

2250 commits