linux

mirror of synced 2025-03-06 20:59:54 +01:00

Author	SHA1	Message	Date
Fabiano Rosas	c3fa64c99c	KVM: PPC: Book3S HV: Decouple the debug timing from the P8 entry path We are currently doing the timing for debug purposes of the P9 entry path using the accumulators and terminology defined by the old entry path for P8 machines. Not only the "real-mode" and "napping" mentions are out of place for the P9 Radix entry path but also we cannot change them because the timing code is coupled to the structures defined in struct kvm_vcpu_arch. Add a new CONFIG_KVM_BOOK3S_HV_P9_TIMING to enable the timing code for the P9 entry path. For now, just add the new CONFIG and duplicate the structures. A subsequent patch will add the P9 changes. Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220525130554.2614394-4-farosas@linux.ibm.com	2022-06-29 19:21:21 +10:00
Fabiano Rosas	3f8ed993be	KVM: PPC: Book3S HV: Add a new config for P8 debug timing Turn the existing Kconfig KVM_BOOK3S_HV_EXIT_TIMING into KVM_BOOK3S_HV_P8_TIMING in preparation for the addition of a new config for P9 timings. This applies only to P8 code, the generic timing code is still kept under KVM_BOOK3S_HV_EXIT_TIMING. Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220525130554.2614394-3-farosas@linux.ibm.com	2022-06-29 19:21:21 +10:00
Fabiano Rosas	9981bace85	KVM: PPC: Book3S HV: Fix "rm_exit" entry in debugfs timings At debugfs/kvm/<pid>/vcpu0/timings we show how long each part of the code takes to run: $ cat /sys/kernel/debug/kvm/-/vcpu0/timings rm_entry: 123785 49398892 118 4898 rm_intr: 123780 6075890 22 390 rm_exit: 0 0 0 0 <-- NOK guest: 123780 46732919988 402 9997638 cede: 0 0 0 0 <-- OK, no cede napping in P9 The "rm_exit" is always showing zero because it is the last one and end_timing does not increment the counter of the previous entry. We can fix it by calling accumulate_time again instead of end_timing. That way the counter gets incremented. The rest of the arithmetic can be ignored because there are no timing points after this and the accumulators are reset before the next round. Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220525130554.2614394-2-farosas@linux.ibm.com	2022-06-29 19:21:21 +10:00
Christophe Leroy	c7b9ed7c34	powerpc/64e: KASAN Full support for BOOK3E/64 We now have memory organised in a way that allows implementing KASAN. Unlike book3s/64, book3e always has translation active so the only thing needed to use KASAN is to setup an early zero shadow mapping just after setting a stack pointer and before calling early_setup(). The memory layout is now as follows +------------------------+ Kernel virtual map end (0xc000200000000000) \| \| \| 16TB of KASAN map \| \| \| +------------------------+ Kernel KASAN shadow map start \| \| \| 16TB of IO map \| \| \| +------------------------+ Kernel IO map start \| \| \| 16TB of vmemmap \| \| \| +------------------------+ Kernel vmemmap start \| \| \| 16TB of vmap \| \| \| +------------------------+ Kernel virt start (0xc000100000000000) \| \| \| 64TB of linear mem \| \| \| +------------------------+ Kernel linear (0xc.....) Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/0bef8beda27baf71e3b9e8b13e620fba6e19499b.1656427701.git.christophe.leroy@csgroup.eu	2022-06-29 17:04:15 +10:00
Christophe Leroy	059c189389	powerpc/64e: Reorganise virtual memory Reduce the size of IO map in order to leave the last quarter of virtual MAP for KASAN shadow mapping. This gives the following layout. +------------------------+ Kernel virtual map end (0xc000200000000000) \| \| \| 16TB (unused) \| \| \| +------------------------+ Kernel IO map end \| \| \| 16TB of IO map \| \| \| +------------------------+ Kernel IO map start \| \| \| 16TB of vmemmap \| \| \| +------------------------+ Kernel vmemmap start \| \| \| 16TB of vmap \| \| \| +------------------------+ Kernel virt start (0xc000100000000000) \| \| \| 64TB of linear mem \| \| \| +------------------------+ Kernel linear (0xc.....) Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/54ef01673bf14228106afd629f795c83acb9a00c.1656427701.git.christophe.leroy@csgroup.eu	2022-06-29 17:04:15 +10:00
Christophe Leroy	128c1ea2f8	powerpc/64e: Move virtual memory closer to linear memory Today nohash/64 have linear memory based at 0xc000000000000000 and virtual memory based at 0x8000000000000000. In order to implement KASAN, we need to regroup both areas. Move virtual memmory at 0xc000100000000000. This complicates a bit TLB miss handlers. Until now, memory region was easily identified with the 4 higher bits of address: - 0 ==> User - c ==> Linear Memory - 8 ==> Virtual Memory Now we need to rely on the 20 higher bits, with: - 0xxxx ==> User - c0000 ==> Linear Memory - c0001 ==> Virtual Memory Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/4b225168031449fc34fc7132f3923cc8dc54af60.1656427701.git.christophe.leroy@csgroup.eu	2022-06-29 17:04:15 +10:00
Christophe Leroy	b646c1f7f4	powerpc/64e: Remove unused REGION related macros Those macros are not used anywhere. Remove them as they are soon going to be wrong and are not worth modifying as they are not used. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/f0efde8cee0924c3991790042b176ac77ad35e1f.1656427701.git.christophe.leroy@csgroup.eu	2022-06-29 17:04:15 +10:00
Christophe Leroy	3adfb457b8	powerpc/64e: Remove MMU_FTR_USE_TLBRSRV and MMU_FTR_USE_PAIRED_MAS Commit `fb5a515704` ("powerpc: Remove platforms/wsp and associated pieces") removed the last CPU having features MMU_FTRS_A2 and commit `cd68098bce` ("powerpc: Clean up MMU_FTRS_A2 and MMU_FTR_TYPE_3E") removed MMU_FTRS_A2 which was the last user of MMU_FTR_USE_TLBRSRV and MMU_FTR_USE_PAIRED_MAS. Remove all code that relies on MMU_FTR_USE_TLBRSRV and MMU_FTR_USE_PAIRED_MAS. With this change done, TLB miss can happen before the mmu feature fixups. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/cfd5a0ecdb1598da968832e1bddf7431ec267200.1656427701.git.christophe.leroy@csgroup.eu	2022-06-29 17:04:14 +10:00
Christophe Leroy	0931764311	powerpc/64e: Fix early TLB miss with KUAP With KUAP, the TLB miss handler bails out when an access to user memory is performed with a nul TID. But the normal TLB miss routine which is only used early during boot does the check regardless for all memory areas, not only user memory. By chance there is no early IO or vmalloc access, but when KASAN come we will start having early TLB misses. Fix it by creating a special branch for user accesses similar to the one in the 'bolted' TLB miss handlers. Unfortunately SPRN_MAS1 is now read too early and there are no registers available to preserve it so it will be read a second time. Fixes: `57bc963837` ("powerpc/kuap: Wire-up KUAP on book3e/64") Cc: stable@vger.kernel.org Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/8d6c5859a45935d6e1a336da4dc20be421e8cea7.1656427701.git.christophe.leroy@csgroup.eu	2022-06-29 17:04:14 +10:00
Christophe Leroy	dd8de84b57	powerpc/ptdump: Fix display of RW pages on FSL_BOOK3E On FSL_BOOK3E, _PAGE_RW is defined with two bits, one for user and one for supervisor. As soon as one of the two bits is set, the page has to be display as RW. But the way it is implemented today requires both bits to be set in order to display it as RW. Instead of display RW when _PAGE_RW bits are set and R otherwise, reverse the logic and display R when _PAGE_RW bits are all 0 and RW otherwise. This change has no impact on other platforms as _PAGE_RW is a single bit on all of them. Fixes: `8eb07b1870` ("powerpc/mm: Dump linux pagetables") Cc: stable@vger.kernel.org Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/0c33b96317811edf691e81698aaee8fa45ec3449.1656427391.git.christophe.leroy@csgroup.eu	2022-06-29 17:03:42 +10:00
Christophe Leroy	2db2008e63	powerpc/64e: Rewrite p4d_populate() as a static inline function Rewrite p4d_populate() as a static inline function instead of a macro. This change allows typechecking and would have helped detecting a recently found bug in map_kernel_page(). Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Acked-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1b416f8a8fe1bc3f4e01175680ce310b7eb3a1e4.1655974565.git.christophe.leroy@csgroup.eu	2022-06-29 17:03:11 +10:00
Christophe Leroy	12a9eddd23	powerpc: Remove _PAGE_SAO stub for book3e/64 Since commit `634093c59a` ("powerpc/mm: enable ARCH_HAS_VM_GET_PAGE_PROT"), _PAGE_SAO is used only in arch/powerpc/mm/book3s64/pgtable.c The _PAGE_SAO stub defined as 0 for book3e/64 can be removed. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/715e644fb3c7d992c0b71f6165ab6cf8c682055a.1655706069.git.christophe.leroy@csgroup.eu	2022-06-29 16:59:31 +10:00
Christophe Leroy	513f5bbac7	powerpc/32: Remove __map_without_ltlbs __map_without_ltlbs is used only for 40x, and only when STRICT_KERNEL_RWX, KFENCE or DEBUG_PAGEALLOC is active. Do the verification directly in 40x version of mmu_mapin_ram() and remove __map_without_ltlbs from core ppc32. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/3422094db965d218c4c3d8580f526963a9ac897f.1655202721.git.christophe.leroy@csgroup.eu	2022-06-29 16:59:06 +10:00
Christophe Leroy	56e54b4e6c	powerpc/32: Remove 'noltlbs' kernel parameter Mapping without large TLBs has no added value on the 8xx. Mapping without large TLBs is still necessary on 40x when selecting CONFIG_KFENCE or CONFIG_DEBUG_PAGEALLOC or CONFIG_STRICT_KERNEL_RWX, but this is done automatically and doesn't require user selection. Remove 'noltlbs' kernel parameter, the user has no reason to use it. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/80ca17bd39cf608a8ebd0764d7064a498e131199.1655202721.git.christophe.leroy@csgroup.eu	2022-06-29 16:59:06 +10:00
Christophe Leroy	1ce844973b	powerpc/32: Remove the 'nobats' kernel parameter Mapping without BATs doesn't bring any added value to the user. Remove that option. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/6977314c823cfb728bc0273cea634b41807bfb64.1655202721.git.christophe.leroy@csgroup.eu	2022-06-29 16:59:06 +10:00
Christophe Leroy	92f89ec1b5	powerpc: Restore CONFIG_DEBUG_INFO in defconfigs Commit `f9b3cd2457` ("Kconfig.debug: make DEBUG_INFO selectable from a choice") broke the selection of CONFIG_DEBUG_INFO by powerpc defconfigs. It is now necessary to select one of the three DEBUG_INFO_DWARF* options to get DEBUG_INFO enabled. Replace DEBUG_INFO=y by DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y in all defconfigs using the following command: sed -i s/DEBUG_INFO=y/DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y/g `git grep -l DEBUG_INFO arch/powerpc/configs/` Fixes: `f9b3cd2457` ("Kconfig.debug: make DEBUG_INFO selectable from a choice") Cc: stable@vger.kernel.org Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/98a4c2603bf9e4b776e219f5b8541d23aa24e854.1654930308.git.christophe.leroy@csgroup.eu	2022-06-29 16:58:49 +10:00
Christophe Leroy	78f1c24abd	powerpc/irq: Simplify __do_irq() Remove duplicated code by implementing a proper if/else. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/5a3b21311191f1240850db6ab29b19ac7885fe03.1654769775.git.christophe.leroy@csgroup.eu	2022-06-29 16:58:31 +10:00
Christophe Leroy	e90855be9e	powerpc/irq: Perform stack_overflow detection after switching to IRQ stack When KASAN is enabled, as shown by the Oops below, the 2k limit is not enough to allow stack dump after a stack overflow detection when CONFIG_DEBUG_STACKOVERFLOW is selected: do_IRQ: stack overflow: 1984 CPU: 0 PID: 126 Comm: systemd-udevd Not tainted 5.18.0-gentoo-PMacG4 #1 Call Trace: Oops: Kernel stack overflow, sig: 11 [#1] BE PAGE_SIZE=4K MMU=Hash SMP NR_CPUS=2 PowerMac Modules linked in: sr_mod cdrom radeon(+) ohci_pci(+) hwmon i2c_algo_bit drm_ttm_helper ttm drm_dp_helper snd_aoa_i2sbus snd_aoa_soundbus snd_pcm ehci_pci snd_timer ohci_hcd snd ssb ehci_hcd 8250_pci soundcore drm_kms_helper pcmcia 8250 pcmcia_core syscopyarea usbcore sysfillrect 8250_base sysimgblt serial_mctrl_gpio fb_sys_fops usb_common pkcs8_key_parser fuse drm drm_panel_orientation_quirks configfs CPU: 0 PID: 126 Comm: systemd-udevd Not tainted 5.18.0-gentoo-PMacG4 #1 NIP: c02e5558 LR: c07eb3bc CTR: c07f46a8 REGS: e7fe9f50 TRAP: 0000 Not tainted (5.18.0-gentoo-PMacG4) MSR: 00001032 <ME,IR,DR,RI> CR: 44a14824 XER: 20000000 GPR00: c07eb3bc eaa1c000 c26baea0 eaa1c0a0 00000008 00000000 c07eb3bc eaa1c010 GPR08: eaa1c0a8 04f3f3f3 f1f1f1f1 c07f4c84 44a14824 0080f7e4 00000005 00000010 GPR16: 00000025 eaa1c154 eaa1c158 c0dbad64 00000020 fd543810 eaa1c0a0 eaa1c29e GPR24: c0dbad44 c0db8740 05ffffff fd543802 eaa1c150 c0c9a3c0 eaa1c0a0 c0c9a3c0 NIP [c02e5558] kasan_check_range+0xc/0x2b4 LR [c07eb3bc] format_decode+0x80/0x604 Call Trace: [eaa1c000] [c07eb3bc] format_decode+0x80/0x604 (unreliable) [eaa1c070] [c07f4dac] vsnprintf+0x128/0x938 [eaa1c110] [c07f5788] sprintf+0xa0/0xc0 [eaa1c180] [c0154c1c] __sprint_symbol.constprop.0+0x170/0x198 [eaa1c230] [c07ee71c] symbol_string+0xf8/0x260 [eaa1c430] [c07f46d0] pointer+0x15c/0x710 [eaa1c4b0] [c07f4fbc] vsnprintf+0x338/0x938 [eaa1c550] [c00e8fa0] vprintk_store+0x2a8/0x678 [eaa1c690] [c00e94e4] vprintk_emit+0x174/0x378 [eaa1c6d0] [c00ea008] _printk+0x9c/0xc0 [eaa1c750] [c000ca94] show_stack+0x21c/0x260 [eaa1c7a0] [c07d0bd4] dump_stack_lvl+0x60/0x90 [eaa1c7c0] [c0009234] __do_IRQ+0x170/0x174 [eaa1c800] [c0009258] do_IRQ+0x20/0x34 [eaa1c820] [c00045b4] HardwareInterrupt_virt+0x108/0x10c ... As the detection is asynchronously performed at IRQs, we could be long after the limit has been crossed, so increasing the limit would not solve the problem completely. In order to be sure that there is enough stack space for the stack dump, do it after the switch to the IRQ stack. That way it is sure that the stack is large enough, unless the IRQ stack has been overfilled in which case the end of life is close. Reported-by: Erhard Furtner <erhard_f@mailbox.org> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/c215d714329f475b431a6193369035aadfc0d182.1654769775.git.christophe.leroy@csgroup.eu	2022-06-29 16:58:31 +10:00
Christophe Leroy	051bd351a2	powerpc/irq: Make __do_irq() static Since commit `48cf12d889` ("powerpc/irq: Inline call_do_irq() and call_do_softirq()"), __do_irq() is not used outside irq.c Reorder functions and make __do_irq() static and drop the declaration in irq.h. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/adbe1c8315ec2d63259f41468e82e51677bb1eda.1654769775.git.christophe.leroy@csgroup.eu	2022-06-29 16:58:27 +10:00
Christophe Leroy	41f20d6db2	powerpc/irq: Increase stack_overflow detection limit when KASAN is enabled When KASAN is enabled, as shown by the Oops below, the 2k limit is not enough to allow stack dump after a stack overflow detection when CONFIG_DEBUG_STACKOVERFLOW is selected: do_IRQ: stack overflow: 1984 CPU: 0 PID: 126 Comm: systemd-udevd Not tainted 5.18.0-gentoo-PMacG4 #1 Call Trace: Oops: Kernel stack overflow, sig: 11 [#1] BE PAGE_SIZE=4K MMU=Hash SMP NR_CPUS=2 PowerMac Modules linked in: sr_mod cdrom radeon(+) ohci_pci(+) hwmon i2c_algo_bit drm_ttm_helper ttm drm_dp_helper snd_aoa_i2sbus snd_aoa_soundbus snd_pcm ehci_pci snd_timer ohci_hcd snd ssb ehci_hcd 8250_pci soundcore drm_kms_helper pcmcia 8250 pcmcia_core syscopyarea usbcore sysfillrect 8250_base sysimgblt serial_mctrl_gpio fb_sys_fops usb_common pkcs8_key_parser fuse drm drm_panel_orientation_quirks configfs CPU: 0 PID: 126 Comm: systemd-udevd Not tainted 5.18.0-gentoo-PMacG4 #1 NIP: c02e5558 LR: c07eb3bc CTR: c07f46a8 REGS: e7fe9f50 TRAP: 0000 Not tainted (5.18.0-gentoo-PMacG4) MSR: 00001032 <ME,IR,DR,RI> CR: 44a14824 XER: 20000000 GPR00: c07eb3bc eaa1c000 c26baea0 eaa1c0a0 00000008 00000000 c07eb3bc eaa1c010 GPR08: eaa1c0a8 04f3f3f3 f1f1f1f1 c07f4c84 44a14824 0080f7e4 00000005 00000010 GPR16: 00000025 eaa1c154 eaa1c158 c0dbad64 00000020 fd543810 eaa1c0a0 eaa1c29e GPR24: c0dbad44 c0db8740 05ffffff fd543802 eaa1c150 c0c9a3c0 eaa1c0a0 c0c9a3c0 NIP [c02e5558] kasan_check_range+0xc/0x2b4 LR [c07eb3bc] format_decode+0x80/0x604 Call Trace: [eaa1c000] [c07eb3bc] format_decode+0x80/0x604 (unreliable) [eaa1c070] [c07f4dac] vsnprintf+0x128/0x938 [eaa1c110] [c07f5788] sprintf+0xa0/0xc0 [eaa1c180] [c0154c1c] __sprint_symbol.constprop.0+0x170/0x198 [eaa1c230] [c07ee71c] symbol_string+0xf8/0x260 [eaa1c430] [c07f46d0] pointer+0x15c/0x710 [eaa1c4b0] [c07f4fbc] vsnprintf+0x338/0x938 [eaa1c550] [c00e8fa0] vprintk_store+0x2a8/0x678 [eaa1c690] [c00e94e4] vprintk_emit+0x174/0x378 [eaa1c6d0] [c00ea008] _printk+0x9c/0xc0 [eaa1c750] [c000ca94] show_stack+0x21c/0x260 [eaa1c7a0] [c07d0bd4] dump_stack_lvl+0x60/0x90 [eaa1c7c0] [c0009234] __do_IRQ+0x170/0x174 [eaa1c800] [c0009258] do_IRQ+0x20/0x34 [eaa1c820] [c00045b4] HardwareInterrupt_virt+0x108/0x10c ... An investigation shows that on PPC32, calling dump_stack() requires more than 1k when KASAN is not selected and a bit more than 2k bytes when KASAN is selected. On PPC64 the registers are twice the size of PPC32 registers, so the need should be approximately twice the need on PPC32. In the meantime we have THREAD_SIZE which is twice larger on PPC64 than PPC32, and twice larger when KASAN is selected. So we can easily use the value of THREAD_SIZE to set the limit. On PPC32, THREAD_SIZE is 8k without KASAN and 16k with KASAN. On PPC64, THREAD_SIZE is 16k without KASAN. To be on the safe side, leave 2k on PPC32 without KASAN, 4k with KASAN, and 4k on PPC64 without KASAN. It means the limit should be one fourth of THREAD_SIZE. Reported-by: Erhard Furtner <erhard_f@mailbox.org> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/e8b4eb82a126c3c6c352173a544fe94609ff660b.1654261687.git.christophe.leroy@csgroup.eu	2022-06-29 16:57:57 +10:00
Christophe Leroy	077fc62b2b	powerpc/irq: remove inline assembly in hard_irq_disable macro Use WRITE_ONCE() instead of opencoding the saving of current stack pointeur. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/9f05937d8722ddd2064a7c2362d8f9000e15e1ba.1652863723.git.christophe.leroy@csgroup.eu	2022-06-29 16:57:25 +10:00
Christophe Leroy	78ffe6a7e2	powerpc/irq: Replace #ifdefs by IS_ENABLED() Replace #ifdef CONFIG_PPC_IRQ_SOFT_MASK_DEBUG and #ifdef CONFIG_PERF_EVENTS by IS_ENABLED() in hw_irq.h and plpar_wrappers.h Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/c1ded642f8d9002767f8fed48ed6d1e76254ed73.1652862729.git.christophe.leroy@csgroup.eu	2022-06-29 16:57:09 +10:00
Christophe Leroy	ef5b570d37	powerpc/irq: Don't open code irq_soft_mask helpers Use READ_ONCE() and WRITE_ONCE() instead of open coding read and write of local PACA irq_soft_mask. For the write, add a barrier to keep the memory clobber that was there previously. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/e2454434992cc932a5a34b695ae981c0b2f4c28e.1652862729.git.christophe.leroy@csgroup.eu	2022-06-29 16:57:09 +10:00
Christophe Leroy	98552307e3	powerpc/irq64: Remove get_irq_happened() No need to open code the read of local_paca->irq_happened in assembly, we have READ_ONCE() for doing the same. Replace get_irq_happened() by READ_ONCE(local_paca->irq_happened). Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/af511b53e4eb51f8fbc51eda7f5597175e68dce6.1652859593.git.christophe.leroy@csgroup.eu	2022-06-29 16:47:43 +10:00
Christophe Leroy	7d7b28b302	powerpc/irq: Split irq.c More than half of irq.c is dedicated to PPC64. Move PPC64 code out of irq.c into irq_64.c Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/9f1a47de80f78d3dd270a7a72f69f55f581c4054.1652859593.git.christophe.leroy@csgroup.eu	2022-06-29 16:47:42 +10:00
Christophe Leroy	e93dee186f	powerpc: Don't include asm/ppc_asm.h in other headers asm/ppc_asm.h is not needed in any of the header it is included. It is only needed by irq.c. Include it there and remove it from other headers. word-at-a-time.h only need ex_table.h, so include it instead. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/e2d7b96547037f852c7ed164e4f79e8918c2607a.1651828453.git.christophe.leroy@csgroup.eu	2022-06-29 16:45:13 +10:00
Christophe Leroy	46d60bdb12	powerpc: Include asm/firmware.h in all users of firmware_has_feature() Trying to remove asm/ppc_asm.h from all places that don't need it leads to several failures linked to firmware_has_feature(). To fix it, include asm/firmware.h in all files using firmware_has_feature() All users found with: git grep -L "firmware\.h" ` git grep -l "firmware_has_feature("` Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/11956ec181a034b51a881ac9c059eea72c679a73.1651828453.git.christophe.leroy@csgroup.eu	2022-06-29 16:45:05 +10:00
Athira Rajeev	890005a7d9	powerpc/perf: Optimize clearing the pending PMI and remove WARN_ON for PMI check in power_pmu_disable commit `2c9ac51b85` ("powerpc/perf: Fix PMU callbacks to clear pending PMI before resetting an overflown PMC") added a new function "pmi_irq_pending" in hw_irq.h. This function is to check if there is a PMI marked as pending in Paca (PACA_IRQ_PMI).This is used in power_pmu_disable in a WARN_ON. The intention here is to provide a warning if there is PMI pending, but no counter is found overflown. During some of the perf runs, below warning is hit: WARNING: CPU: 36 PID: 0 at arch/powerpc/perf/core-book3s.c:1332 power_pmu_disable+0x25c/0x2c0 Modules linked in: ----- NIP [c000000000141c3c] power_pmu_disable+0x25c/0x2c0 LR [c000000000141c8c] power_pmu_disable+0x2ac/0x2c0 Call Trace: [c000000baffcfb90] [c000000000141c8c] power_pmu_disable+0x2ac/0x2c0 (unreliable) [c000000baffcfc10] [c0000000003e2f8c] perf_pmu_disable+0x4c/0x60 [c000000baffcfc30] [c0000000003e3344] group_sched_out.part.124+0x44/0x100 [c000000baffcfc80] [c0000000003e353c] __perf_event_disable+0x13c/0x240 [c000000baffcfcd0] [c0000000003dd334] event_function+0xc4/0x140 [c000000baffcfd20] [c0000000003d855c] remote_function+0x7c/0xa0 [c000000baffcfd50] [c00000000026c394] flush_smp_call_function_queue+0xd4/0x300 [c000000baffcfde0] [c000000000065b24] smp_ipi_demux_relaxed+0xa4/0x100 [c000000baffcfe20] [c0000000000cb2b0] xive_muxed_ipi_action+0x20/0x40 [c000000baffcfe40] [c000000000207c3c] __handle_irq_event_percpu+0x8c/0x250 [c000000baffcfee0] [c000000000207e2c] handle_irq_event_percpu+0x2c/0xa0 [c000000baffcff10] [c000000000210a04] handle_percpu_irq+0x84/0xc0 [c000000baffcff40] [c000000000205f14] generic_handle_irq+0x54/0x80 [c000000baffcff60] [c000000000015740] __do_irq+0x90/0x1d0 [c000000baffcff90] [c000000000016990] __do_IRQ+0xc0/0x140 [c0000009732f3940] [c000000bafceaca8] 0xc000000bafceaca8 [c0000009732f39d0] [c000000000016b78] do_IRQ+0x168/0x1c0 [c0000009732f3a00] [c0000000000090c8] hardware_interrupt_common_virt+0x218/0x220 This means that there is no PMC overflown among the active events in the PMU, but there is a PMU pending in Paca. The function "any_pmc_overflown" checks the PMCs on active events in cpuhw->n_events. Code snippet: <<>> if (any_pmc_overflown(cpuhw)) clear_pmi_irq_pending(); else WARN_ON(pmi_irq_pending()); <<>> Here the PMC overflown is not from active event. Example: When we do perf record, default cycles and instructions will be running on PMC6 and PMC5 respectively. It could happen that overflowed event is currently not active and pending PMI is for the inactive event. Debug logs from trace_printk: <<>> any_pmc_overflown: idx is 5: pmc value is 0xd9a power_pmu_disable: PMC1: 0x0, PMC2: 0x0, PMC3: 0x0, PMC4: 0x0, PMC5: 0xd9a, PMC6: 0x80002011 <<>> Here active PMC (from idx) is PMC5 , but overflown PMC is PMC6(0x80002011). When we handle PMI interrupt for such cases, if the PMC overflown is from inactive event, it will be ignored. Reference commit: commit `bc09c219b2` ("powerpc/perf: Fix finding overflowed PMC in interrupt") Patch addresses two changes: 1) Fix 1 : Removal of warning ( WARN_ON(pmi_irq_pending()); ) We were printing warning if no PMC is found overflown among active PMU events, but PMI pending in PACA. But this could happen in cases where PMC overflown is not in active PMC. An inactive event could have caused the overflow. Hence the warning is not needed. To know pending PMI is from an inactive event, we need to loop through all PMC's which will cause more SPR reads via mfspr and increase in context switch. Also in existing function: perf_event_interrupt, already we ignore PMI's overflown when it is from an inactive PMC. 2) Fix 2: optimization in clearing pending PMI. Currently we check for any active PMC overflown before clearing PMI pending in Paca. This is causing additional SPR read also. From point 1, we know that if PMI pending in Paca from inactive cases, that is going to be ignored during replay. Hence if there is pending PMI in Paca, just clear it irrespective of PMC overflown or not. In summary, remove the any_pmc_overflown check entirely in power_pmu_disable. ie If there is a pending PMI in Paca, clear it, since we are in pmu_disable. There could be cases where PMI is pending because of inactive PMC ( which later when replayed also will get ignored ), so WARN_ON could give false warning. Hence removing it. Fixes: `2c9ac51b85` ("powerpc/perf: Fix PMU callbacks to clear pending PMI before resetting an overflown PMC") Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220522142256.24699-1-atrajeev@linux.vnet.ibm.com	2022-06-28 23:41:15 +10:00
Arnd Bergmann	4313a24985	arch/*/: remove CONFIG_VIRT_TO_BUS All architecture-independent users of virt_to_bus() and bus_to_virt() have been fixed to use the dma mapping interfaces or have been removed now. This means the definitions on most architectures, and the CONFIG_VIRT_TO_BUS symbol are now obsolete and can be removed. The only exceptions to this are a few network and scsi drivers for m68k Amiga and VME machines and ppc32 Macintosh. These drivers work correctly with the old interfaces and are probably not worth changing. On alpha and parisc, virt_to_bus() were still used in asm/floppy.h. alpha can use isa_virt_to_bus() like x86 does, and parisc can just open-code the virt_to_phys() here, as this is architecture specific code. I tried updating the bus-virt-phys-mapping.rst documentation, which started as an email from Linus to explain some details of the Linux-2.0 driver interfaces. The bits about virt_to_bus() were declared obsolete backin 2000, and the rest is not all that relevant any more, so in the end I just decided to remove the file completely. Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Acked-by: Helge Deller <deller@gmx.de> # parisc Signed-off-by: Arnd Bergmann <arnd@arndb.de>	2022-06-28 13:20:21 +02:00
Mike Rapoport	ee65728e10	docs: rename Documentation/vm to Documentation/mm so it will be consistent with code mm directory and with Documentation/admin-guide/mm and won't be confused with virtual machines. Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Suggested-by: Matthew Wilcox <willy@infradead.org> Tested-by: Ira Weiny <ira.weiny@intel.com> Acked-by: Jonathan Corbet <corbet@lwn.net> Acked-by: Wu XiangCheng <bobwxc@email.cn>	2022-06-27 12:52:53 -07:00
akpm	46a3b11253	Merge branch 'master' into mm-stable	2022-06-27 10:31:34 -07:00
Christophe Leroy	d7f3964615	powerpc/powermac: Remove empty function note_scsi_host() note_scsi_host() has been an empty function since commit `6ee0d9f744` ("[POWERPC] Remove unused old code from powermac setup code"). Remove it. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/26f8b72a4276c0bd8ed63860c7316f6361c351b4.1655978907.git.christophe.leroy@csgroup.eu	2022-06-26 10:29:44 +10:00
Michael Ellerman	2d38676975	powerpc: Update asm-prototypes.h comment This header was recently cleaned up in commit `76222808fc` ("powerpc: Move C prototypes out of asm-prototypes.h"), update the comment to reflect it's proper purpose. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220617080243.2177583-1-mpe@ellerman.id.au	2022-06-26 10:29:43 +10:00
Liam Howlett	6886da5f49	powerpc/prom_init: Fix kernel config grep When searching for config options, use the KCONFIG_CONFIG shell variable so that builds using non-standard config locations work. Fixes: `26deb04342` ("powerpc: prepare string/mem functions for KASAN") Cc: stable@vger.kernel.org # v5.2+ Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220624011745.4060795-1-Liam.Howlett@oracle.com	2022-06-24 13:47:26 +10:00
Alexey Kardashevskiy	a784101f77	KVM: PPC: Book3s: Fix warning about xics_rm_h_xirr_x This fixes "no previous prototype": arch/powerpc/kvm/book3s_hv_rm_xics.c:482:15: warning: no previous prototype for 'xics_rm_h_xirr_x' [-Wmissing-prototypes] Reported by the kernel test robot. Fixes: `b22af90419` ("KVM: PPC: Book3s: Remove real mode interrupt controller hcalls handlers") Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220622055235.1139204-1-aik@ozlabs.ru	2022-06-24 12:58:33 +10:00
Christophe Leroy	9864816180	powerpc/book3e: Fix PUD allocation size in map_kernel_page() Commit `2fb4706057` ("powerpc: add support for folded p4d page tables") erroneously changed PUD setup to a mix of PMD and PUD. Fix it. While at it, use PTE_TABLE_SIZE instead of PAGE_SIZE for PTE tables in order to avoid any confusion. Fixes: `2fb4706057` ("powerpc: add support for folded p4d page tables") Cc: stable@vger.kernel.org # v5.8+ Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Acked-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/95ddfd6176d53e6c85e13bd1c358359daa56775f.1655974558.git.christophe.leroy@csgroup.eu	2022-06-24 12:42:45 +10:00
Nathan Lynch	19fc5bb93c	powerpc/xive/spapr: correct bitmap allocation size kasan detects access beyond the end of the xibm->bitmap allocation: BUG: KASAN: slab-out-of-bounds in _find_first_zero_bit+0x40/0x140 Read of size 8 at addr c00000001d1d0118 by task swapper/0/1 CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.19.0-rc2-00001-g90df023b36dd #28 Call Trace: [c00000001d98f770] [c0000000012baab8] dump_stack_lvl+0xac/0x108 (unreliable) [c00000001d98f7b0] [c00000000068faac] print_report+0x37c/0x710 [c00000001d98f880] [c0000000006902c0] kasan_report+0x110/0x354 [c00000001d98f950] [c000000000692324] __asan_load8+0xa4/0xe0 [c00000001d98f970] [c0000000011c6ed0] _find_first_zero_bit+0x40/0x140 [c00000001d98f9b0] [c0000000000dbfbc] xive_spapr_get_ipi+0xcc/0x260 [c00000001d98fa70] [c0000000000d6d28] xive_setup_cpu_ipi+0x1e8/0x450 [c00000001d98fb30] [c000000004032a20] pSeries_smp_probe+0x5c/0x118 [c00000001d98fb60] [c000000004018b44] smp_prepare_cpus+0x944/0x9ac [c00000001d98fc90] [c000000004009f9c] kernel_init_freeable+0x2d4/0x640 [c00000001d98fd90] [c0000000000131e8] kernel_init+0x28/0x1d0 [c00000001d98fe10] [c00000000000cd54] ret_from_kernel_thread+0x5c/0x64 Allocated by task 0: kasan_save_stack+0x34/0x70 __kasan_kmalloc+0xb4/0xf0 __kmalloc+0x268/0x540 xive_spapr_init+0x4d0/0x77c pseries_init_irq+0x40/0x27c init_IRQ+0x44/0x84 start_kernel+0x2a4/0x538 start_here_common+0x1c/0x20 The buggy address belongs to the object at c00000001d1d0118 which belongs to the cache kmalloc-8 of size 8 The buggy address is located 0 bytes inside of 8-byte region [c00000001d1d0118, c00000001d1d0120) The buggy address belongs to the physical page: page:c00c000000074740 refcount:1 mapcount:0 mapping:0000000000000000 index:0xc00000001d1d0558 pfn:0x1d1d flags: 0x7ffff000000200(slab\|node=0\|zone=0\|lastcpupid=0x7ffff) raw: 007ffff000000200 c00000001d0003c8 c00000001d0003c8 c00000001d010480 raw: c00000001d1d0558 0000000001e1000a 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: c00000001d1d0000: fc 00 fc fc fc fc fc fc fc fc fc fc fc fc fc fc c00000001d1d0080: fc fc 00 fc fc fc fc fc fc fc fc fc fc fc fc fc >c00000001d1d0100: fc fc fc 02 fc fc fc fc fc fc fc fc fc fc fc fc ^ c00000001d1d0180: fc fc fc fc 04 fc fc fc fc fc fc fc fc fc fc fc c00000001d1d0200: fc fc fc fc fc 04 fc fc fc fc fc fc fc fc fc fc This happens because the allocation uses the wrong unit (bits) when it should pass (BITS_TO_LONGS(count) * sizeof(long)) or equivalent. With small numbers of bits, the allocated object can be smaller than sizeof(long), which results in invalid accesses. Use bitmap_zalloc() to allocate and initialize the irq bitmap, paired with bitmap_free() for consistency. Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220623182509.3985625-1-nathanl@linux.ibm.com	2022-06-24 12:40:38 +10:00
Jason A. Donenfeld	f3eac42665	powerpc/powernv: wire up rng during setup_arch The platform's RNG must be available before random_init() in order to be useful for initial seeding, which in turn means that it needs to be called from setup_arch(), rather than from an init call. Complicating things, however, is that POWER8 systems need some per-cpu state and kmalloc, which isn't available at this stage. So we split things up into an early phase and a later opportunistic phase. This commit also removes some noisy log messages that don't add much. Fixes: `a4da0d50b2` ("powerpc: Implement arch_get_random_long/int() for powernv") Cc: stable@vger.kernel.org # v3.13+ Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> [mpe: Add of_node_put(), use pnv naming, minor change log editing] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220621140849.127227-1-Jason@zx2c4.com	2022-06-22 19:47:22 +10:00
Andy Shevchenko	00bcb550dc	powerpc/52xx: Get rid of of_node assignment Let GPIO library assign of_node from the parent device. This allows to move GPIO library and drivers to use fwnode APIs instead of being stuck with OF-only interfaces. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220507100147.5802-3-andriy.shevchenko@linux.intel.com	2022-06-22 12:51:49 +10:00
Andy Shevchenko	de06fba62a	powerpc/mpc5xxx: Switch mpc5xxx_get_bus_frequency() to use fwnode Switch mpc5xxx_get_bus_frequency() to use fwnode in order to help cleaning up other parts of the kernel from OF specific code. No functional change intended. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Chris Packham <chris.packham@alliedtelesis.co.nz> # for i2c-mpc Acked-by: Wolfram Sang <wsa@kernel.org> # for the I2C part Acked-by: Mark Brown <broonie@kernel.org> Acked-by: Marc Kleine-Budde <mkl@pengutronix.de> # for mscan/mpc5xxx_can Acked-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220507100147.5802-2-andriy.shevchenko@linux.intel.com	2022-06-22 12:51:49 +10:00
Andy Shevchenko	6d056b7254	powerpc/52xx: Remove dead code, i.e. mpc52xx_get_xtal_freq() It seems mpc52xx_get_xtal_freq() is not used anywhere. Remove dead code. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Wolfram Sang <wsa@kernel.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220507100147.5802-1-andriy.shevchenko@linux.intel.com	2022-06-22 12:51:49 +10:00
Christophe Leroy	7dc3ba0a07	powerpc: Move prom_init() out of asm-prototypes.h This is the end of the work started with commit `76222808fc` ("powerpc: Move C prototypes out of asm-prototypes.h") Now that asm/machdep.h doesn't include asm/setup.h anymore, there are no conflicts anymore with the function prom_init() defined in drivers/gpu/drm/nouveau/nvkm/subdev/bios/shadowrom.o So we can move it to asm/setup.h Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/e111e4f0addb0fa810d5f6a71d3b8e62c0b53492.1654966508.git.christophe.leroy@csgroup.eu	2022-06-20 11:29:49 +10:00
Christophe Leroy	113fe88eed	powerpc: Don't include asm/setup.h in asm/machdep.h asm/machdep.h doesn't need asm/setup.h Remove it. Add it directly in files that needs it. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/3b1dfb19a2c3265fb4abc2bfc7b6eae9261a998b.1654966508.git.christophe.leroy@csgroup.eu	2022-06-20 11:29:49 +10:00
Christophe Leroy	ca5dabcff1	powerpc/prom_init: Fix build failure with GCC_PLUGIN_STRUCTLEAK_BYREF_ALL and KASAN When CONFIG_KASAN is selected, we expect prom_init to use __memset() because it is too early to use memset(). But with CONFIG_GCC_PLUGIN_STRUCTLEAK_BYREF_ALL, the compiler adds calls to memset() to clear objects on stack, hence the following failure: PROMCHK arch/powerpc/kernel/prom_init_check Error: External symbol 'memset' referenced from prom_init.c make[2]: *** [arch/powerpc/kernel/Makefile:204 : arch/powerpc/kernel/prom_init_check] Erreur 1 prom_find_machine_type() is called from prom_init() and is called only once, so lets put compat[] in BSS instead of stack to avoid that. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/3802811f7cf94f730be44688539c01bba3a3b5c0.1654875808.git.christophe.leroy@csgroup.eu	2022-06-19 21:57:56 +10:00
Andrew Donnellan	7bc08056a6	powerpc/rtas: Allow ibm,platform-dump RTAS call with null buffer address Add a special case to block_rtas_call() to allow the ibm,platform-dump RTAS call through the RTAS filter if the buffer address is 0. According to PAPR, ibm,platform-dump is called with a null buffer address to notify the platform firmware that processing of a particular dump is finished. Without this, on a pseries machine with CONFIG_PPC_RTAS_FILTER enabled, an application such as rtas_errd that is attempting to retrieve a dump will encounter an error at the end of the retrieval process. Fixes: `bd59380c5b` ("powerpc/rtas: Restrict RTAS requests from userspace") Cc: stable@vger.kernel.org Reported-by: Sathvika Vasireddy <sathvika@linux.ibm.com> Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com> Reviewed-by: Tyrel Datwyler <tyreld@linux.ibm.com> Reviewed-by: Nathan Lynch <nathanl@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220614134952.156010-1-ajd@linux.ibm.com	2022-06-18 10:19:10 +10:00
Naveen N. Rao	ec6d0dde71	powerpc: Enable execve syscall exit tracepoint On execve[at], we are zero'ing out most of the thread register state including gpr[0], which contains the syscall number. Due to this, we fail to trigger the syscall exit tracepoint properly. Fix this by retaining gpr[0] in the thread register state. Before this patch: # tail /sys/kernel/debug/tracing/trace cat-123 [000] ..... 61.449351: sys_execve(filename: 7fffa6b23448, argv: 7fffa6b233e0, envp: 7fffa6b233f8) cat-124 [000] ..... 62.428481: sys_execve(filename: 7fffa6b23448, argv: 7fffa6b233e0, envp: 7fffa6b233f8) echo-125 [000] ..... 65.813702: sys_execve(filename: 7fffa6b23378, argv: 7fffa6b233a0, envp: 7fffa6b233b0) echo-125 [000] ..... 65.822214: sys_execveat(fd: 0, filename: 1009ac48, argv: 7ffff65d0c98, envp: 7ffff65d0ca8, flags: 0) After this patch: # tail /sys/kernel/debug/tracing/trace cat-127 [000] ..... 100.416262: sys_execve(filename: 7fffa41b3448, argv: 7fffa41b33e0, envp: 7fffa41b33f8) cat-127 [000] ..... 100.418203: sys_execve -> 0x0 echo-128 [000] ..... 103.873968: sys_execve(filename: 7fffa41b3378, argv: 7fffa41b33a0, envp: 7fffa41b33b0) echo-128 [000] ..... 103.875102: sys_execve -> 0x0 echo-128 [000] ..... 103.882097: sys_execveat(fd: 0, filename: 1009ac48, argv: 7fffd10d2148, envp: 7fffd10d2158, flags: 0) echo-128 [000] ..... 103.883225: sys_execveat -> 0x0 Cc: stable@vger.kernel.org Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Tested-by: Sumit Dubey2 <Sumit.Dubey2@ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220609103328.41306-1-naveen.n.rao@linux.vnet.ibm.com	2022-06-18 10:19:10 +10:00
Jason A. Donenfeld	e561e472a3	powerpc/pseries: wire up rng during setup_arch() The platform's RNG must be available before random_init() in order to be useful for initial seeding, which in turn means that it needs to be called from setup_arch(), rather than from an init call. Fortunately, each platform already has a setup_arch function pointer, which means it's easy to wire this up. This commit also removes some noisy log messages that don't add much. Fixes: `a489043f46` ("powerpc/pseries: Implement arch_get_random_long() based on H_RANDOM") Cc: stable@vger.kernel.org # v3.13+ Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220611151015.548325-4-Jason@zx2c4.com	2022-06-18 10:19:10 +10:00
Jason A. Donenfeld	20a9689b36	powerpc/microwatt: wire up rng during setup_arch() The platform's RNG must be available before random_init() in order to be useful for initial seeding, which in turn means that it needs to be called from setup_arch(), rather than from an init call. Fortunately, each platform already has a setup_arch function pointer, which means it's easy to wire this up. This commit also removes some noisy log messages that don't add much. Fixes: `c25769fdda` ("powerpc/microwatt: Add support for hardware random number generator") Cc: stable@vger.kernel.org # v5.14+ Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220611151015.548325-2-Jason@zx2c4.com	2022-06-18 10:19:10 +10:00
Michael Ellerman	6cf06c17e9	powerpc/mm: Move CMA reservations after initmem_init() After commit `11ac3e87ce` ("mm: cma: use pageblock_order as the single alignment") there is an error at boot about the KVM CMA reservation failing, eg: kvm_cma_reserve: reserving 6553 MiB for global area cma: Failed to reserve 6553 MiB That makes it impossible to start KVM guests using the hash MMU with more than 2G of memory, because the VM is unable to allocate a large enough region for the hash page table, eg: $ qemu-system-ppc64 -enable-kvm -M pseries -m 4G ... qemu-system-ppc64: Failed to allocate KVM HPT of order 25: Cannot allocate memory Aneesh pointed out that this happens because when kvm_cma_reserve() is called, pageblock_order has not been initialised yet, and is still zero, causing the checks in cma_init_reserved_mem() against CMA_MIN_ALIGNMENT_PAGES to fail. Fix it by moving the call to kvm_cma_reserve() after initmem_init(). The pageblock_order is initialised in sparse_init() which is called from initmem_init(). Also move the hugetlb CMA reservation. Fixes: `11ac3e87ce` ("mm: cma: use pageblock_order as the single alignment") Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Reviewed-by: Zi Yan <ziy@nvidia.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220616120033.1976732-1-mpe@ellerman.id.au	2022-06-18 10:18:55 +10:00
Peter Xu	d92725256b	mm: avoid unnecessary page fault retires on shared memory types I observed that for each of the shared file-backed page faults, we're very likely to retry one more time for the 1st write fault upon no page. It's because we'll need to release the mmap lock for dirty rate limit purpose with balance_dirty_pages_ratelimited() (in fault_dirty_shared_page()). Then after that throttling we return VM_FAULT_RETRY. We did that probably because VM_FAULT_RETRY is the only way we can return to the fault handler at that time telling it we've released the mmap lock. However that's not ideal because it's very likely the fault does not need to be retried at all since the pgtable was well installed before the throttling, so the next continuous fault (including taking mmap read lock, walk the pgtable, etc.) could be in most cases unnecessary. It's not only slowing down page faults for shared file-backed, but also add more mmap lock contention which is in most cases not needed at all. To observe this, one could try to write to some shmem page and look at "pgfault" value in /proc/vmstat, then we should expect 2 counts for each shmem write simply because we retried, and vm event "pgfault" will capture that. To make it more efficient, add a new VM_FAULT_COMPLETED return code just to show that we've completed the whole fault and released the lock. It's also a hint that we should very possibly not need another fault immediately on this page because we've just completed it. This patch provides a ~12% perf boost on my aarch64 test VM with a simple program sequentially dirtying 400MB shmem file being mmap()ed and these are the time it needs: Before: 650.980 ms (+-1.94%) After: 569.396 ms (+-1.38%) I believe it could help more than that. We need some special care on GUP and the s390 pgfault handler (for gmap code before returning from pgfault), the rest changes in the page fault handlers should be relatively straightforward. Another thing to mention is that mm_account_fault() does take this new fault as a generic fault to be accounted, unlike VM_FAULT_RETRY. I explicitly didn't touch hmm_vma_fault() and break_ksm() because they do not handle VM_FAULT_RETRY even with existing code, so I'm literally keeping them as-is. Link: https://lkml.kernel.org/r/20220530183450.42886-1-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Vineet Gupta <vgupta@kernel.org> Acked-by: Guo Ren <guoren@kernel.org> Acked-by: Max Filippov <jcmvbkbc@gmail.com> Acked-by: Christian Borntraeger <borntraeger@linux.ibm.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Acked-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Alistair Popple <apopple@nvidia.com> Reviewed-by: Ingo Molnar <mingo@kernel.org> Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> [arm part] Acked-by: Heiko Carstens <hca@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Stafford Horne <shorne@gmail.com> Cc: David S. Miller <davem@davemloft.net> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: Brian Cain <bcain@quicinc.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Richard Weinberger <richard@nod.at> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Janosch Frank <frankja@linux.ibm.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Jonas Bonn <jonas@southpole.se> Cc: Will Deacon <will@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Michal Simek <monstr@monstr.eu> Cc: Matt Turner <mattst88@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: David Hildenbrand <david@redhat.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Chris Zankel <chris@zankel.net> Cc: Hugh Dickins <hughd@google.com> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Rich Felker <dalias@libc.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Helge Deller <deller@gmx.de> Cc: Yoshinori Sato <ysato@users.osdn.me> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-06-16 19:48:27 -07:00

... 2 3 4 5 6 ...

25332 commits