linux

mirror of synced 2025-03-06 20:59:54 +01:00

Author	SHA1	Message	Date
Jann Horn	59c1dcbed5	x86/traps: Print address on #GP A frequent cause of #GP exceptions are memory accesses to non-canonical addresses. Unlike #PF, #GP doesn't report a fault address in CR2, so the kernel doesn't currently print the fault address for a #GP. Luckily, the necessary infrastructure for decoding x86 instructions and computing the memory address being accessed is already present. Hook it up to the #GP handler so that the address operand of the faulting instruction can be figured out and printed. Distinguish two cases: a) (Part of) the memory range being accessed lies in the non-canonical address range; in this case, it is likely that the decoded address is actually the one that caused the #GP. b) The entire memory range of the decoded operand lies in canonical address space; the #GP may or may not be related in some way to the computed address. Print it, but with hedging language in the message. While it is already possible to compute the faulting address manually by disassembling the opcode dump and evaluating the instruction against the register dump, this should make it slightly easier to identify crashes at a glance. Note that the operand length which comes from the instruction decoder and is used to determine whether the access straddles into non-canonical address space, is currently somewhat unreliable; but it should be good enough, considering that Linux on x86-64 never maps the page directly before the start of the non-canonical range anyway, and therefore the case where a memory range begins in that page and potentially straddles into the non-canonical range should be fairly uncommon. In the case the address is still computed wrongly, it only influences whether the error message claims that the access is canonical. [ bp: Remove ambiguous "we", massage, reflow comments and spacing. ] Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com> Tested-by: Sean Christopherson <sean.j.christopherson@intel.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: kasan-dev@googlegroups.com Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20191218231150.12139-2-jannh@google.com	2019-12-31 12:31:13 +01:00
Qian Cai	e278af89f1	x86/resctrl: Fix an imbalance in domain_remove_cpu() A system that supports resource monitoring may have multiple resources while not all of these resources are capable of monitoring. Monitoring related state is initialized only for resources that are capable of monitoring and correspondingly this state should subsequently only be removed from these resources that are capable of monitoring. domain_add_cpu() calls domain_setup_mon_state() only when r->mon_capable is true where it will initialize d->mbm_over. However, domain_remove_cpu() calls cancel_delayed_work(&d->mbm_over) without checking r->mon_capable resulting in an attempt to cancel d->mbm_over on all resources, even those that never initialized d->mbm_over because they are not capable of monitoring. Hence, it triggers a debugobjects warning when offlining CPUs because those timer debugobjects are never initialized: ODEBUG: assert_init not available (active state 0) object type: timer_list hint: 0x0 WARNING: CPU: 143 PID: 789 at lib/debugobjects.c:484 debug_print_object Hardware name: HP Synergy 680 Gen9/Synergy 680 Gen9 Compute Module, BIOS I40 05/23/2018 RIP: 0010:debug_print_object Call Trace: debug_object_assert_init del_timer try_to_grab_pending cancel_delayed_work resctrl_offline_cpu cpuhp_invoke_callback cpuhp_thread_fun smpboot_thread_fn kthread ret_from_fork Fixes: `e33026831b` ("x86/intel_rdt/mbm: Handle counter overflow") Signed-off-by: Qian Cai <cai@lca.pw> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Reinette Chatre <reinette.chatre@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: john.stultz@linaro.org Cc: sboyd@kernel.org Cc: <stable@vger.kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: tj@kernel.org Cc: Tony Luck <tony.luck@intel.com> Cc: Vikas Shivappa <vikas.shivappa@linux.intel.com> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20191211033042.2188-1-cai@lca.pw	2019-12-30 19:25:59 +01:00
Peter Zijlstra	1f676247f3	x86/alternatives: Implement a better poke_int3_handler() completion scheme Commit: `285a54efe3` ("x86/alternatives: Sync bp_patching update for avoiding NULL pointer exception") added an additional text_poke_sync() IPI to text_poke_bp_batch() to handle the rare case where another CPU is still inside an INT3 handler while we clear the global state. Instead of spraying IPIs around, count the active INT3 handlers and wait for them to go away before proceeding to clear/reuse the data. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org> Reviewed-by: Daniel Bristot de Oliveira <bristot@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>	2019-12-25 10:43:29 +01:00
Ingo Molnar	46f5cfc13d	Merge branch 'core/kprobes' into perf/core, to pick up a completed branch Signed-off-by: Ingo Molnar <mingo@kernel.org>	2019-12-25 10:43:08 +01:00
Omar Sandoval	8757dc970f	x86/crash: Define arch_crash_save_vmcoreinfo() if CONFIG_CRASH_CORE=y On x86 kernels configured with CONFIG_PROC_KCORE=y and CONFIG_KEXEC_CORE=n, the vmcoreinfo note in /proc/kcore is incomplete. Specifically, it is missing arch-specific information like the KASLR offset and whether 5-level page tables are enabled. This breaks applications like drgn [1] and crash [2], which need this information for live debugging via /proc/kcore. This happens because: 1. CONFIG_PROC_KCORE selects CONFIG_CRASH_CORE. 2. kernel/crash_core.c (compiled if CONFIG_CRASH_CORE=y) calls arch_crash_save_vmcoreinfo() to get the arch-specific parts of vmcoreinfo. If it is not defined, then it uses a no-op fallback. 3. x86 defines arch_crash_save_vmcoreinfo() in arch/x86/kernel/machine_kexec_.c, which is only compiled if CONFIG_KEXEC_CORE=y. Therefore, an x86 kernel with CONFIG_CRASH_CORE=y and CONFIG_KEXEC_CORE=n uses the no-op fallback and gets incomplete vmcoreinfo data. This isn't relevant to kdump, which requires CONFIG_KEXEC_CORE. It only affects applications which read vmcoreinfo at runtime, like the ones mentioned above. Fix it by moving arch_crash_save_vmcoreinfo() into two new arch/x86/kernel/crash_core_.c files, which are gated behind CONFIG_CRASH_CORE. 1: `73dd7def12/libdrgn/program.c (L385)` 2: `60a42d7092` Signed-off-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Kairui Song <kasong@redhat.com> Cc: Lianbo Jiang <lijiang@redhat.com> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/0589961254102cca23e3618b96541b89f2b249e2.1576858905.git.osandov@fb.com	2019-12-23 12:58:41 +01:00
Linus Torvalds	5c741e2583	Merge branch 'ras-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 RAS fixes from Borislav Petkov: "Three urgent RAS fixes for the AMD side of things: - initialize struct mce.bank so that calculated error severity on AMD SMCA machines is correct - do not send IPIs early during bank initialization, when interrupts are disabled - a fix for when only a subset of MCA banks are enabled, which led to boot hangs on some new AMD CPUs" * 'ras-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mce: Fix possibly incorrect severity calculation on AMD x86/MCE/AMD: Allow Reserved types to be overwritten in smca_banks[] x86/MCE/AMD: Do not use rdmsr_safe_on_cpu() in smca_configure()	2019-12-21 06:04:12 -08:00
Linus Torvalds	2abf193275	Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fixes from Ingo Molnar: "Add HPET quirks for the Intel 'Coffee Lake H' and 'Ice Lake' platforms" * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/intel: Disable HPET on Intel Ice Lake platforms x86/intel: Disable HPET on Intel Coffee Lake H platforms	2019-12-17 11:11:08 -08:00
Jan H. Schönherr	81736abd55	x86/mce: Remove mce_inject_log() in favor of mce_log() The mutex in mce_inject_log() became unnecessary with commit `5de97c9f6d` ("x86/mce: Factor out and deprecate the /dev/mcelog driver"), though the original reason for its presence only vanished with commit `7298f08ea8` ("x86/mcelog: Get rid of RCU remnants"). Drop the mutex. And as that makes mce_inject_log() identical to mce_log(), get rid of the former in favor of the latter. Signed-off-by: Jan H. Schönherr <jschoenh@amazon.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20191210000733.17979-7-jschoenh@amazon.de	2019-12-17 10:26:41 +01:00
Jan H. Schönherr	2d806d0723	x86/mce: Pass MCE message to mce_panic() on failed kernel recovery In commit `b2f9d678e2` ("x86/mce: Check for faults tagged in EXTABLE_CLASS_FAULT exception table entries") another call to mce_panic() was introduced. Pass the message of the handled MCE to that instance of mce_panic() as well, as there doesn't seem to be a reason not to. Signed-off-by: Jan H. Schönherr <jschoenh@amazon.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20191210000733.17979-6-jschoenh@amazon.de	2019-12-17 10:26:35 +01:00
Arnd Bergmann	db1ae0314f	x86/mce/therm_throt: Mark throttle_active_work() as __maybe_unused throttle_active_work() is only called if CONFIG_SYSFS is set, otherwise we get a harmless warning: arch/x86/kernel/cpu/mce/therm_throt.c:238:13: error: 'throttle_active_work' \ defined but not used [-Werror=unused-function] Mark the function as __maybe_unused to avoid the warning. Fixes: `f6656208f0` ("x86/mce/therm_throt: Optimize notifications of thermal throttle") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Cc: bberg@redhat.com Cc: ckellner@redhat.com Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: hdegoede@redhat.com Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20191210203925.3119091-1-arnd@arndb.de	2019-12-17 10:26:28 +01:00
Jan H. Schönherr	a3a57ddad0	x86/mce: Fix possibly incorrect severity calculation on AMD The function mce_severity_amd_smca() requires m->bank to be initialized for correct operation. Fix the one case, where mce_severity() is called without doing so. Fixes: `6bda529ec4` ("x86/mce: Grade uncorrected errors for SMCA-enabled systems") Fixes: `d28af26faa` ("x86/MCE: Initialize mce.bank in the case of a fatal error in mce_no_way_out()") Signed-off-by: Jan H. Schönherr <jschoenh@amazon.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: <stable@vger.kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86-ml <x86@kernel.org> Cc: Yazen Ghannam <Yazen.Ghannam@amd.com> Link: https://lkml.kernel.org/r/20191210000733.17979-4-jschoenh@amazon.de	2019-12-17 09:39:53 +01:00
Yazen Ghannam	966af20929	x86/MCE/AMD: Allow Reserved types to be overwritten in smca_banks[] Each logical CPU in Scalable MCA systems controls a unique set of MCA banks in the system. These banks are not shared between CPUs. The bank types and ordering will be the same across CPUs on currently available systems. However, some CPUs may see a bank as Reserved/Read-as-Zero (RAZ) while other CPUs do not. In this case, the bank seen as Reserved on one CPU is assumed to be the same type as the bank seen as a known type on another CPU. In general, this occurs when the hardware represented by the MCA bank is disabled, e.g. disabled memory controllers on certain models, etc. The MCA bank is disabled in the hardware, so there is no possibility of getting an MCA/MCE from it even if it is assumed to have a known type. For example: Full system: Bank \| Type seen on CPU0 \| Type seen on CPU1 ------------------------------------------------ 0 \| LS \| LS 1 \| UMC \| UMC 2 \| CS \| CS System with hardware disabled: Bank \| Type seen on CPU0 \| Type seen on CPU1 ------------------------------------------------ 0 \| LS \| LS 1 \| UMC \| RAZ 2 \| CS \| CS For this reason, there is a single, global struct smca_banks[] that is initialized at boot time. This array is initialized on each CPU as it comes online. However, the array will not be updated if an entry already exists. This works as expected when the first CPU (usually CPU0) has all possible MCA banks enabled. But if the first CPU has a subset, then it will save a "Reserved" type in smca_banks[]. Successive CPUs will then not be able to update smca_banks[] even if they encounter a known bank type. This may result in unexpected behavior. Depending on the system configuration, a user may observe issues enumerating the MCA thresholding sysfs interface. The issues may be as trivial as sysfs entries not being available, or as severe as system hangs. For example: Bank \| Type seen on CPU0 \| Type seen on CPU1 ------------------------------------------------ 0 \| LS \| LS 1 \| RAZ \| UMC 2 \| CS \| CS Extend the smca_banks[] entry check to return if the entry is a non-reserved type. Otherwise, continue so that CPUs that encounter a known bank type can update smca_banks[]. Fixes: `68627a697c` ("x86/mce/AMD, EDAC/mce_amd: Enumerate Reserved SMCA bank type") Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: <stable@vger.kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20191121141508.141273-1-Yazen.Ghannam@amd.com	2019-12-17 09:39:53 +01:00
Konstantin Khlebnikov	246ff09f89	x86/MCE/AMD: Do not use rdmsr_safe_on_cpu() in smca_configure() ... because interrupts are disabled that early and sending IPIs can deadlock: BUG: sleeping function called from invalid context at kernel/sched/completion.c:99 in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 0, name: swapper/1 no locks held by swapper/1/0. irq event stamp: 0 hardirqs last enabled at (0): [<0000000000000000>] 0x0 hardirqs last disabled at (0): [<ffffffff8106dda9>] copy_process+0x8b9/0x1ca0 softirqs last enabled at (0): [<ffffffff8106dda9>] copy_process+0x8b9/0x1ca0 softirqs last disabled at (0): [<0000000000000000>] 0x0 Preemption disabled at: [<ffffffff8104703b>] start_secondary+0x3b/0x190 CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.5.0-rc2+ #1 Hardware name: GIGABYTE MZ01-CE1-00/MZ01-CE1-00, BIOS F02 08/29/2018 Call Trace: dump_stack ___might_sleep.cold.92 wait_for_completion ? generic_exec_single rdmsr_safe_on_cpu ? wrmsr_on_cpus mce_amd_feature_init mcheck_cpu_init identify_cpu identify_secondary_cpu smp_store_cpu_info start_secondary secondary_startup_64 The function smca_configure() is called only on the current CPU anyway, therefore replace rdmsr_safe_on_cpu() with atomic rdmsr_safe() and avoid the IPI. [ bp: Update commit message. ] Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: <stable@vger.kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/157252708836.3876.4604398213417262402.stgit@buzz	2019-12-17 09:39:33 +01:00
Borislav Petkov	d157aa0fb2	x86/cpu/tsx: Define pr_fmt() ... so that all current and future pr_* statements in this file have the proper prefix. No functional changes. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: x86@kernel.org Link: https://lkml.kernel.org/r/20191112221823.19677-2-bp@alien8.de	2019-12-15 10:58:54 +01:00
Borislav Petkov	72c2ce9867	x86/bugs: Move enum taa_mitigations to bugs.c ... because it is used only there. No functional changes. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: x86@kernel.org Link: https://lkml.kernel.org/r/20191112221823.19677-1-bp@alien8.de	2019-12-14 16:06:33 +01:00
yu kuai	27353d5785	x86/process: Remove set but not used variables prev and next Remove two unused variables: arch/x86/kernel/process.c: In function ‘__switch_to_xtra’: arch/x86/kernel/process.c:618:31: warning: variable ‘next’ set but not used [-Wunused-but-set-variable] 618 \| struct thread_struct prev, next; \| ^~~~ arch/x86/kernel/process.c:618:24: warning: variable ‘prev’ set but not used [-Wunused-but-set-variable] 618 \| struct thread_struct prev, next; \| They are never used and so can be removed. Signed-off-by: yu kuai <yukuai3@huawei.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86-ml <x86@kernel.org> Cc: yi.zhang@huawei.com Cc: zhengbin13@huawei.com Link: https://lkml.kernel.org/r/20191213121253.10072-1-yukuai3@huawei.com	2019-12-14 08:26:00 +01:00
Linus Torvalds	22ff311af9	treewide conversion from FIELD_SIZEOF() to sizeof_field() -----BEGIN PGP SIGNATURE----- Comment: Kees Cook <kees@outflux.net> iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAl3umDgWHGtlZXNjb29r QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJlvsD/49R12HK7UzTxNTrcpvbadJ4t7j j/qJvjMerW7iVNAPOoNAOePUa21+y3rI1AZPvoPyzIqp1Bf2eOICf5SdisG2cG+O X0A8EKWvS0SSQWSKaT6udUKJ3nBJItwvOvQ5B58KQzcOj3S4X7B9iVBWgieMHrzz urkZm7pqowrZB3wuF8keRtli5IZaoiCwzApy48Qrn70G3OeXymknFbpHTDwIAiGw RiE5Xh0R4EzQdsYyCgjR8U56gBchadAmj8BUJU0ppMnOFMyIAG670hNLrs0L3roP 8TOIeyb993ZC5GZaMlnR8mz0jfibfkPa3Z85VAsVyQSPaOQldwc9j8TGBqD5Gfat 1PjOU5RVwma0pH5xTPOeevWPQpIK9KovQpQYqMMN9GMxOEx96IOUjwTrnNK2xWoN UGyOVlESFGoniClhCiKYzPSrYOjlIBk5ovf15PdTe+bwyUDMfyfy5CZV88OS2DHz ZBZvpLrH/EMW9zJ+FqMTp0C4s4wa2Ioid3bSh6XuNUTtltKSjp71eUja8ZEz+2sd 5AGstCC+hYqxaEk+6/851pfkQ9sbBjwuGtNrtX+pqreiLUvWLhQ0yUj6cLXlEQNH aucjCukCjI+4lMzofeaQ2LbNhtff4YsfO4b1Ye8maoDdHjzUVL57n3bTOxKhdzbt y6FM3lApOjk3OyaTJQ== =YU4A -----END PGP SIGNATURE----- Merge tag 'sizeof_field-v5.5-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull FIELD_SIZEOF conversion from Kees Cook: "A mostly mechanical treewide conversion from FIELD_SIZEOF() to sizeof_field(). This avoids the redundancy of having 2 macros (actually 3) doing the same thing, and consolidates on sizeof_field(). While "field" is not an accurate name, it is the common name used in the kernel, and doesn't result in any unintended innuendo. As there are still users of FIELD_SIZEOF() in -next, I will clean up those during this coming development cycle and send the final old macro removal patch at that time" * tag 'sizeof_field-v5.5-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: treewide: Use sizeof_field() macro MIPS: OCTEON: Replace SIZEOF_FIELD() macro	2019-12-13 14:02:12 -08:00
Shile Zhang	f14bf6a350	x86/unwind/orc: Remove boot-time ORC unwind tables sorting Now that the orc_unwind and orc_unwind_ip tables are sorted at build time, remove the boot time sorting pass. No change in functionality. [ mingo: Rewrote the changelog and code comments. ] Signed-off-by: Shile Zhang <shile.zhang@linux.alibaba.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: linux-kbuild@vger.kernel.org Link: https://lkml.kernel.org/r/20191204004633.88660-8-shile.zhang@linux.alibaba.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2019-12-13 10:47:58 +01:00
Linus Torvalds	6674fdb25a	This contains 3 changes: - Removal of code I accidentally applied when doing a minor fix up to a patch, and then using "git commit -a --amend", which pulled in some other changes I was playing with. - Remove an used variable in trace_events_inject code - Fix to function graph tracer when it traces a ftrace direct function. It will now ignore tracing a function that has a ftrace direct tramploine attached. This is needed for eBPF to use the ftrace direct code. -----BEGIN PGP SIGNATURE----- iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCXfD/thQccm9zdGVkdEBn b29kbWlzLm9yZwAKCRAp5XQQmuv6qoo2AP4j7ONw7BTmMyo+GdYqPPntBeDnClHK vfMKrgK1j5BxYgEA7LgkwuUT9bcyLjfJVcyfeW67rB2PtmovKTWnKihFOwI= =DZ6N -----END PGP SIGNATURE----- Merge tag 'trace-v5.5-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fixes from Steven Rostedt: - Remove code I accidentally applied when doing a minor fix up to a patch, and then using "git commit -a --amend", which pulled in some other changes I was playing with. - Remove an used variable in trace_events_inject code - Fix function graph tracer when it traces a ftrace direct function. It will now ignore tracing a function that has a ftrace direct tramploine attached. This is needed for eBPF to use the ftrace direct code. * tag 'trace-v5.5-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: ftrace: Fix function_graph tracer interaction with BPF trampoline tracing: remove set but not used variable 'buffer' module: Remove accidental change of module_enable_x()	2019-12-11 12:22:38 -08:00
Alexei Starovoitov	ff205766db	ftrace: Fix function_graph tracer interaction with BPF trampoline Depending on type of BPF programs served by BPF trampoline it can call original function. In such case the trampoline will skip one stack frame while returning. That will confuse function_graph tracer and will cause crashes with bad RIP. Teach graph tracer to skip functions that have BPF trampoline attached. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>	2019-12-10 13:53:59 -05:00
Sean Christopherson	960786422f	x86/ACPI/sleep: Move acpi_get_wakeup_address() into sleep.c, remove <asm/realmode.h> from <asm/acpi.h> Move the definition of acpi_get_wakeup_address() into sleep.c to break linux/acpi.h's dependency (by way of asm/acpi.h) on asm/realmode.h. Everyone and their mother includes linux/acpi.h, i.e. modifying realmode.h results in a full kernel rebuild, which makes the already inscrutable real mode boot code even more difficult to understand and is positively rage inducing when trying to make changes to x86's boot flow. No functional change intended. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Link: https://lkml.kernel.org/r/20191126165417.22423-13-sean.j.christopherson@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2019-12-10 10:15:48 +01:00
Sean Christopherson	cb28909525	x86/ACPI/sleep: Remove an unnecessary include of asm/realmode.h None of the declarations in x86's acpi/sleep.h are in any way dependent on the real mode boot code. Remove sleep.h's include of asm/realmode.h to limit the dependencies on realmode.h to code that actually interacts with the boot code. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://lkml.kernel.org/r/20191126165417.22423-11-sean.j.christopherson@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2019-12-10 10:15:48 +01:00
Sean Christopherson	6315ec9286	x86/kprobes: Explicitly include vmalloc.h for set_vm_flush_reset_perms() The inclusion of linux/vmalloc.h, which is required for its definition of set_vm_flush_reset_perms(), is somehow dependent on asm/realmode.h being included by asm/acpi.h. Explicitly include linux/vmalloc.h so that a future patch can drop the realmode.h include from asm/acpi.h without breaking the build. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Link: https://lkml.kernel.org/r/20191126165417.22423-5-sean.j.christopherson@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2019-12-10 10:15:48 +01:00
Sean Christopherson	ac0b14dc16	x86/ftrace: Explicitly include vmalloc.h for set_vm_flush_reset_perms() The inclusion of linux/vmalloc.h, which is required for its definition of set_vm_flush_reset_perms(), is somehow dependent on asm/realmode.h being included by asm/acpi.h. Explicitly include linux/vmalloc.h so that a future patch can drop the realmode.h include from asm/acpi.h without breaking the build. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Link: https://lkml.kernel.org/r/20191126165417.22423-4-sean.j.christopherson@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2019-12-10 10:15:47 +01:00
Sean Christopherson	ca947b72e1	x86/boot: Explicitly include realmode.h to handle RM reservations Explicitly include asm/realmode.h, which provides reserve_real_mode(), instead of picking it up by an indirect include of asm/acpi.h. acpi.h will soon stop including realmode.h so that changing realmode.h doesn't require a full kernel rebuild. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Link: https://lkml.kernel.org/r/20191126165417.22423-3-sean.j.christopherson@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2019-12-10 10:15:47 +01:00
Ingo Molnar	186525bd6b	mm, x86/mm: Untangle address space layout definitions from basic pgtable type definitions - Untangle the somewhat incestous way of how VMALLOC_START is used all across the kernel, but is, on x86, defined deep inside one of the lowest level page table headers. It doesn't help that vmalloc.h only includes a single asm header: #include <asm/page.h> /* pgprot_t */ So there was no existing cross-arch way to decouple address layout definitions from page.h details. I used this: #ifndef VMALLOC_START # include <asm/vmalloc.h> #endif This way every architecture that wants to simplify page.h can do so. - Also on x86 we had a couple of LDT related inline functions that used the late-stage address space layout positions - but these could be uninlined without real trouble - the end result is cleaner this way as well. Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Rik van Riel <riel@redhat.com> Cc: linux-kernel@vger.kernel.org Cc: linux-mm@kvack.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2019-12-10 10:12:55 +01:00
Ingo Molnar	eb243d1d28	x86/mm/pat: Rename <asm/pat.h> => <asm/memtype.h> pat.h is a file whose main purpose is to provide the memtype_*() APIs. PAT is the low level hardware mechanism - but the high level abstraction is memtype. So name the header <memtype.h> as well - this goes hand in hand with memtype.c and memtype_interval.c. Signed-off-by: Ingo Molnar <mingo@kernel.org>	2019-12-10 10:12:55 +01:00
Ingo Molnar	2040cf9f59	Linux 5.5-rc1 -----BEGIN PGP SIGNATURE----- iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl3tf/0eHHRvcnZhbGRz QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGlKwH/3fTToujuJfTx5E5 mrARAP65J1L/DxpEKvKRt2bNZo6w13mNd8g7ZPmYChz90bYGvXQSG8hYTU9iAw3O yimSTJlNXDhVAluB53XnDdUxIWC4HUZsNxWJNCeXMuiMcGNsTGX+v3f+x7oHCT0P jI1RSIsFGjgr0RWqZ8U5aJckQo2xABC1TfYw53K66Oc/JLZpSFJFwMgjf1fD5diU HGDA8E2p0u1TQIyNzr86iqMvnlSRYBQwBQn6OgEKCG4Z0NLtXfDF4mqnxsXgLmIH oQoFfxaMKXyGWds7ZxwcGWntALCF41ThfpiJWDIyxjWxFEty4bqTCbDPwwyp7ip0 iuASmTI= =YqO2 -----END PGP SIGNATURE----- Merge tag 'v5.5-rc1' into core/kprobes, to resolve conflicts Signed-off-by: Ingo Molnar <mingo@kernel.org>	2019-12-10 10:11:00 +01:00
Ingo Molnar	360db4ace3	x86/setup: Enhance the comments Update various comments, fix outright mistakes and meaningless descriptions. Also harmonize the style across the file, both in form and in language. Cc: linux-kernel@vger.kernel.org Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>	2019-12-10 09:59:38 +01:00
Ingo Molnar	12609013c4	x86/setup: Clean up the header portion of setup.c In 20 years we accumulated 89 #include lines in setup.c, but we only need 30 of them (!) ... Get rid of the excessive ones, and while at it, sort the remaining ones alphabetically. Also get rid of the incomplete changelogs at the header of the file, and explain better what this file does. Cc: linux-kernel@vger.kernel.org Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>	2019-12-10 09:59:37 +01:00
Pankaj Bharadiya	c593642c8b	treewide: Use sizeof_field() macro Replace all the occurrences of FIELD_SIZEOF() with sizeof_field() except at places where these are defined. Later patches will remove the unused definition of FIELD_SIZEOF(). This patch is generated using following script: EXCLUDE_FILES="include/linux/stddef.h\|include/linux/kernel.h" git grep -l -e "\bFIELD_SIZEOF\b" \| while read file; do if [[ "$file" =~ $EXCLUDE_FILES ]]; then continue fi sed -i -e 's/\bFIELD_SIZEOF\b/sizeof_field/g' $file; done Signed-off-by: Pankaj Bharadiya <pankaj.laxminarayan.bharadiya@intel.com> Link: https://lore.kernel.org/r/20190924105839.110713-3-pankaj.laxminarayan.bharadiya@intel.com Co-developed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: David Miller <davem@davemloft.net> # for net	2019-12-09 10:36:44 -08:00
Kees Cook	4fc265a9c9	x86/mtrr: Require CAP_SYS_ADMIN for all access Zhang Xiaoxu noted that physical address locations for MTRR were visible to non-root users, which could be considered an information leak. In discussing[1] the options for solving this, it sounded like just moving the capable check into open() was the first step. If this breaks userspace, then we will have a test case for the more conservative approaches discussed in the thread. In summary: - MTRR should check capabilities at open time (or retain the checks on the opener's permissions for later checks). - changing the DAC permissions might break something that expects to open mtrr when not uid 0. - if we leave the DAC permissions alone and just move the capable check to the opener, we should get the desired protection. (i.e. check against CAP_SYS_ADMIN not just the wider uid 0.) - if that still breaks things, as in userspace expects to be able to read other parts of the file as non-uid-0 and non-CAP_SYS_ADMIN, then we need to censor the contents using the opener's permissions. For example, as done in other /proc cases, like commit `51d7b12041` ("/proc/iomem: only expose physical resource addresses to privileged users"). [1] https://lore.kernel.org/lkml/201911110934.AC5BA313@keescook/ Reported-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: James Morris <jamorris@linux.microsoft.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Colin Ian King <colin.king@canonical.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: linux-security-module@vger.kernel.org Cc: Matthew Garrett <mjg59@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tyler Hicks <tyhicks@canonical.com> Cc: x86-ml <x86@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/201911181308.63F06502A1@keescook	2019-12-09 09:24:24 +01:00
Borislav Petkov	2e30dd9e06	x86/mtrr: Get rid of mtrr_seq_show() forward declaration ... by moving the function up in the file. No functional changes. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: x86@kernel.org Link: https://lkml.kernel.org/r/20191108200815.24589-1-bp@alien8.de	2019-12-09 09:23:44 +01:00
Linus Torvalds	e5b3fc125d	Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "Various fixes: - Fix the PAT performance regression that downgraded write-combining device memory regions to uncached. - There's been a number of bugs in 32-bit double fault handling - hopefully all fixed now. - Fix an LDT crash - Fix an FPU over-optimization that broke with GCC9 code optimizations. - Misc cleanups" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mm/pat: Fix off-by-one bugs in interval tree search x86/ioperm: Save an indentation level in tss_update_io_bitmap() x86/fpu: Don't cache access to fpu_fpregs_owner_ctx x86/entry/32: Remove unused 'restore_all_notrace' local label x86/ptrace: Document FSBASE and GSBASE ABI oddities x86/ptrace: Remove set_segment_reg() implementations for current x86/traps: die() instead of panicking on a double fault x86/doublefault/32: Rewrite the x86_32 #DF handler and unify with 64-bit x86/doublefault/32: Move #DF stack and TSS to cpu_entry_area x86/doublefault/32: Rename doublefault.c to doublefault_32.c x86/traps: Disentangle the 32-bit and 64-bit doublefault code lkdtm: Add a DOUBLE_FAULT crash type on x86 selftests/x86/single_step_syscall: Check SYSENTER directly x86/mm/32: Sync only to VMALLOC_END in vmalloc_sync_all()	2019-12-01 19:05:07 -08:00
Linus Torvalds	8fa91bfa9b	Merge branch 'ras-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RAS fix from Borislav Petkov: "One urgent fix for the thermal throttling machinery: the recent change reworking the thermal notifications forgot to mask out read-only and reserved bits in the thermal status MSRs, leading to exceptions while writing those MSRs. The fix takes care of masking out those bits first" * 'ras-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mce/therm_throt: Mask out read-only and reserved MSR bits	2019-11-30 14:49:08 -08:00
Borislav Petkov	7b0b8cfd26	x86/ioperm: Save an indentation level in tss_update_io_bitmap() ... for better readability. No functional changes. [ Minor edit. ] Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>	2019-11-30 18:06:56 +01:00
Kai-Heng Feng	e0748539e3	x86/intel: Disable HPET on Intel Ice Lake platforms Like CFL and CFL-H, ICL SoC has skewed HPET timer once it hits PC10. So let's disable HPET on ICL. Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: bp@alien8.de Cc: feng.tang@intel.com Cc: harry.pan@intel.com Cc: hpa@zytor.com Link: https://lkml.kernel.org/r/20191129062303.18982-2-kai.heng.feng@canonical.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2019-11-29 12:17:58 +01:00
Kai-Heng Feng	f8edbde885	x86/intel: Disable HPET on Intel Coffee Lake H platforms Coffee Lake H SoC has similar behavior as Coffee Lake, skewed HPET timer once the SoCs entered PC10. So let's disable HPET on CFL-H platforms. Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: bp@alien8.de Cc: feng.tang@intel.com Cc: harry.pan@intel.com Cc: hpa@zytor.com Link: https://lkml.kernel.org/r/20191129062303.18982-1-kai.heng.feng@canonical.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2019-11-29 12:17:58 +01:00
Srinivas Pandruvada	5a43b87b3c	x86/mce/therm_throt: Mask out read-only and reserved MSR bits While writing to MSR IA32_THERM_STATUS/IA32_PKG_THERM_STATUS, avoid writing 1 to read only and reserved fields because updating some fields generates exception. [ bp: Vertically align for better readability. ] Fixes: `f6656208f0` ("x86/mce/therm_throt: Optimize notifications of thermal throttle") Reported-by: Dominik Brodowski <linux@dominikbrodowski.net> Tested-by: Dominik Brodowski <linux@dominikbrodowski.net> Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20191128150824.22413-1-srinivas.pandruvada@linux.intel.com	2019-11-29 09:17:52 +01:00
Linus Torvalds	81b6b96475	dma-mapping updates for 5.5-rc1 - improve dma-debug scalability (Eric Dumazet) - tiny dma-debug cleanup (Dan Carpenter) - check for vmap memory in dma_map_single (Kees Cook) - check for dma_addr_t overflows in dma-direct when using DMA offsets (Nicolas Saenz Julienne) - switch the x86 sta2x11 SOC to use more generic DMA code (Nicolas Saenz Julienne) - fix arm-nommu dma-ranges handling (Vladimir Murzin) - use __initdata in CMA (Shyam Saini) - replace the bus dma mask with a limit (Nicolas Saenz Julienne) - merge the remapping helpers into the main dma-direct flow (me) - switch xtensa to the generic dma remap handling (me) - various cleanups around dma_capable (me) - remove unused dev arguments to various dma-noncoherent helpers (me) -----BEGIN PGP SIGNATURE----- iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAl3f+eULHGhjaEBsc3Qu ZGUACgkQD55TZVIEUYPyPg/+PVHCrhmepudQQFHu6wfurE5U77iNnoUifvG+b5z5 5mHmTMkQwyox6rKDe8NuFApAhz1VJDSUgSelPmvTSOIEIGXCvX1p+GqRSVS5YQON aLzGvbWKE8hCpaPdDHKYDauD1FZGMM8L2P5oOMF9X9fQ94xxRqfqJM6c8iD16Sgg +aOgPNzTnxQHJFF/Dbt/mjJrKXWI+XF+bgUbH+l9yKa7Dd7ibmJR8yl9hs1jmp0H 1CZ+CizwnAs57rCd1a6Ybc6gj59tySc03NMnnbTko+KDxrcbD3Ee2tpqHVkkCjYz Yl0m4FIpbotrpokL/FIS727bVvkjbWgoeM+kiVPoYzmZea3pq/tFDr6tp/BxDhFj TZXSFfgQljlYMD3ppSoklFlfjGriVWV0tPO3arPXwuuMF5EX/IMQmvxei05jpc8n iELNXOP9iZZkY4tLHy2hn2uWrxBRrS1WQwlLg9hahlNRzyfFSyHeP0zWlVDt+RgF 5CCbEI+HQcUqg1FApB30lQNWTn1+dJftrpKVBlgNBIyIa/z2rFbt8GdSnItxjfQX /XX8EZbFvF6AcXkgURkYFIoKM/EbYShOSLcYA3PTUtcuTnF6Kk5eimySiGWZTVCS prruSFDZJOvL3SnOIMIiYVmBdB7lEbDyLI/VYuhoECXEDCJpVmRktNkJNg4q6/E+ fjQ= =e5wO -----END PGP SIGNATURE----- Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux; tag 'dma-mapping-5.5' of git://git.infradead.org/users/hch/dma-mapping Pull dma-mapping updates from Christoph Hellwig: - improve dma-debug scalability (Eric Dumazet) - tiny dma-debug cleanup (Dan Carpenter) - check for vmap memory in dma_map_single (Kees Cook) - check for dma_addr_t overflows in dma-direct when using DMA offsets (Nicolas Saenz Julienne) - switch the x86 sta2x11 SOC to use more generic DMA code (Nicolas Saenz Julienne) - fix arm-nommu dma-ranges handling (Vladimir Murzin) - use __initdata in CMA (Shyam Saini) - replace the bus dma mask with a limit (Nicolas Saenz Julienne) - merge the remapping helpers into the main dma-direct flow (me) - switch xtensa to the generic dma remap handling (me) - various cleanups around dma_capable (me) - remove unused dev arguments to various dma-noncoherent helpers (me) * 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux: * tag 'dma-mapping-5.5' of git://git.infradead.org/users/hch/dma-mapping: (22 commits) dma-mapping: treat dev->bus_dma_mask as a DMA limit dma-direct: exclude dma_direct_map_resource from the min_low_pfn check dma-direct: don't check swiotlb=force in dma_direct_map_resource dma-debug: clean up put_hash_bucket() powerpc: remove support for NULL dev in __phys_to_dma / __dma_to_phys dma-direct: avoid a forward declaration for phys_to_dma dma-direct: unify the dma_capable definitions dma-mapping: drop the dev argument to arch_sync_dma_for_* x86/PCI: sta2x11: use default DMA address translation dma-direct: check for overflows on 32 bit DMA addresses dma-debug: increase HASH_SIZE dma-debug: reorder struct dma_debug_entry fields xtensa: use the generic uncached segment support dma-mapping: merge the generic remapping helpers into dma-direct dma-direct: provide mmap and get_sgtable method overrides dma-direct: remove the dma_handle argument to __dma_direct_alloc_pages dma-direct: remove __dma_direct_free_pages usb: core: Remove redundant vmap checks kernel: dma-contiguous: mark CMA parameters __initdata/__initconst dma-debug: add a schedule point in debug_dma_dump_mappings() ...	2019-11-28 11:16:43 -08:00
Linus Torvalds	95f1fa9e34	New tracing features: - PERAMAENT flag to ftrace_ops when attaching a callback to a function As /proc/sys/kernel/ftrace_enabled when set to zero will disable all attached callbacks in ftrace, this has a detrimental impact on live kernel tracing, as it disables all that it patched. If a ftrace_ops is registered to ftrace with the PERMANENT flag set, it will prevent ftrace_enabled from being disabled, and if ftrace_enabled is already disabled, it will prevent a ftrace_ops with PREMANENT flag set from being registered. - New register_ftrace_direct(). As eBPF would like to register its own trampolines to be called by the ftrace nop locations directly, without going through the ftrace trampoline, this function has been added. This allows for eBPF trampolines to live along side of ftrace, perf, kprobe and live patching. It also utilizes the ftrace enabled_functions file that keeps track of functions that have been modified in the kernel, to allow for security auditing. - Allow for kernel internal use of ftrace instances. Subsystems in the kernel can now create and destroy their own tracing instances which allows them to have their own tracing buffer, and be able to record events without worrying about other users from writing over their data. - New seq_buf_hex_dump() that lets users use the hex_dump() in their seq_buf usage. - Notifications now added to tracing_max_latency to allow user space to know when a new max latency is hit by one of the latency tracers. - Wider spread use of generic compare operations for use of bsearch and friends. - More synthetic event fields may be defined (32 up from 16) - Use of xarray for architectures with sparse system calls, for the system call trace events. This along with small clean ups and fixes. -----BEGIN PGP SIGNATURE----- iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCXdwv4BQccm9zdGVkdEBn b29kbWlzLm9yZwAKCRAp5XQQmuv6qnB5AP91vsdHQjwE1+/UWG/cO+qFtKvn2QJK QmBRIJNH/s+1TAD/fAOhgw+ojSK3o/qc+NpvPTEW9AEwcJL1wacJUn+XbQc= =ztql -----END PGP SIGNATURE----- Merge tag 'trace-v5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing updates from Steven Rostedt: "New tracing features: - New PERMANENT flag to ftrace_ops when attaching a callback to a function. As /proc/sys/kernel/ftrace_enabled when set to zero will disable all attached callbacks in ftrace, this has a detrimental impact on live kernel tracing, as it disables all that it patched. If a ftrace_ops is registered to ftrace with the PERMANENT flag set, it will prevent ftrace_enabled from being disabled, and if ftrace_enabled is already disabled, it will prevent a ftrace_ops with PREMANENT flag set from being registered. - New register_ftrace_direct(). As eBPF would like to register its own trampolines to be called by the ftrace nop locations directly, without going through the ftrace trampoline, this function has been added. This allows for eBPF trampolines to live along side of ftrace, perf, kprobe and live patching. It also utilizes the ftrace enabled_functions file that keeps track of functions that have been modified in the kernel, to allow for security auditing. - Allow for kernel internal use of ftrace instances. Subsystems in the kernel can now create and destroy their own tracing instances which allows them to have their own tracing buffer, and be able to record events without worrying about other users from writing over their data. - New seq_buf_hex_dump() that lets users use the hex_dump() in their seq_buf usage. - Notifications now added to tracing_max_latency to allow user space to know when a new max latency is hit by one of the latency tracers. - Wider spread use of generic compare operations for use of bsearch and friends. - More synthetic event fields may be defined (32 up from 16) - Use of xarray for architectures with sparse system calls, for the system call trace events. This along with small clean ups and fixes" * tag 'trace-v5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (51 commits) tracing: Enable syscall optimization for MIPS tracing: Use xarray for syscall trace events tracing: Sample module to demonstrate kernel access to Ftrace instances. tracing: Adding new functions for kernel access to Ftrace instances tracing: Fix Kconfig indentation ring-buffer: Fix typos in function ring_buffer_producer ftrace: Use BIT() macro ftrace: Return ENOTSUPP when DYNAMIC_FTRACE_WITH_DIRECT_CALLS is not configured ftrace: Rename ftrace_graph_stub to ftrace_stub_graph ftrace: Add a helper function to modify_ftrace_direct() to allow arch optimization ftrace: Add helper find_direct_entry() to consolidate code ftrace: Add another check for match in register_ftrace_direct() ftrace: Fix accounting bug with direct->count in register_ftrace_direct() ftrace/selftests: Fix spelling mistake "wakeing" -> "waking" tracing: Increase SYNTH_FIELDS_MAX for synthetic_events ftrace/samples: Add a sample module that implements modify_ftrace_direct() ftrace: Add modify_ftrace_direct() tracing: Add missing "inline" in stub function of latency_fsnotify() tracing: Remove stray tab in TRACE_EVAL_MAP_FILE's help text tracing: Use seq_buf_hex_dump() to dump buffers ...	2019-11-27 11:42:01 -08:00
Masami Hiramatsu	285a54efe3	x86/alternatives: Sync bp_patching update for avoiding NULL pointer exception ftracetest multiple_kprobes.tc testcase hits the following NULL pointer exception: BUG: kernel NULL pointer dereference, address: 0000000000000000 PGD 800000007bf60067 P4D 800000007bf60067 PUD 7bf5f067 PMD 0 Oops: 0000 [#1] PREEMPT SMP PTI RIP: 0010:poke_int3_handler+0x39/0x100 Call Trace: <IRQ> do_int3+0xd/0xf0 int3+0x42/0x50 RIP: 0010:sched_clock+0x6/0x10 poke_int3_handler+0x39 was alternatives:958: static inline void text_poke_addr(struct text_poke_loc tp) { return _stext + tp->rel_addr; <------ Here is line #958 } This seems to be caused by tp (bp_patching.vec) being NULL but bp_patching.nr_entries != 0. There is a small chance for this to happen, because we have no synchronization between the zeroing of bp_patching.nr_entries and before clearing bp_patching.vec. Steve suggested we could fix this by adding sync_core(), because int3 is done with interrupts disabled, and the on_each_cpu() requires all CPUs to have had their interrupts enabled. [ mingo: Edited the comments and the changelog. ] Suggested-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Tested-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: bristot@redhat.com Fixes: `c0213b0ac0` ("x86/alternative: Batch of patch operations") Link: https://lkml.kernel.org/r/157483421229.25881.15314414408559963162.stgit@devnote2 Signed-off-by: Ingo Molnar <mingo@kernel.org>	2019-11-27 07:44:25 +01:00
Peter Zijlstra	76ffa7204b	x86/alternatives: Use INT3_INSN_SIZE Use INT3_INSN_SIZE instead of sizeof(int3). Suggested-by: Ingo Molnar <mingo@kernel.org> Tested-by: Alexei Starovoitov <ast@kernel.org> Tested-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Alexei Starovoitov <ast@kernel.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20191111132458.460144656@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2019-11-27 07:44:25 +01:00
Peter Zijlstra	f2cb4f95b7	x86/kprobe: Add comments to arch_{,un}optimize_kprobes() Add a few words describing how it is safe to overwrite the 4 bytes after a kprobe. In specific it is possible the JMP.d32 required for the optimized kprobe overwrites multiple instructions. Tested-by: Alexei Starovoitov <ast@kernel.org> Tested-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Alexei Starovoitov <ast@kernel.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20191111132458.401696663@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2019-11-27 07:44:25 +01:00
Peter Zijlstra	5c02ece818	x86/kprobes: Fix ordering while text-patching Kprobes does something like: register: arch_arm_kprobe() text_poke(INT3) /* guarantees nothing, INT3 will become visible at some point, maybe / kprobe_optimizer() / guarantees the bytes after INT3 are unused / synchronize_rcu_tasks(); text_poke_bp(JMP32); / implies IPI-sync, kprobe really is enabled / unregister: __disarm_kprobe() unoptimize_kprobe() text_poke_bp(INT3 + tail); / implies IPI-sync, so tail is guaranteed visible / arch_disarm_kprobe() text_poke(old); / guarantees nothing, old will maybe become visible */ synchronize_rcu() free-stuff Now the problem is that on register, the synchronize_rcu_tasks() does not imply sufficient to guarantee all CPUs have already observed INT3 (although in practice this is exceedingly unlikely not to have happened) (similar to how MEMBARRIER_CMD_PRIVATE_EXPEDITED does not imply MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE). Worse, even if it did, we'd have to do 2 synchronize calls to provide the guarantee we're looking for, the first to ensure INT3 is visible, the second to guarantee nobody is then still using the instruction bytes after INT3. Similar on unregister; the synchronize_rcu() between __unregister_kprobe_top() and __unregister_kprobe_bottom() does not guarantee all CPUs are free of the INT3 (and observe the old text). Therefore, sprinkle some IPI-sync love around. This guarantees that all CPUs agree on the text and RCU once again provides the required guaranteed. Tested-by: Alexei Starovoitov <ast@kernel.org> Tested-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Paul E. McKenney <paulmck@kernel.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20191111132458.162172862@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2019-11-27 07:44:24 +01:00
Peter Zijlstra	ab09e95ca0	x86/kprobes: Convert to text-patching.h Convert kprobes to the new text-poke naming. Tested-by: Alexei Starovoitov <ast@kernel.org> Tested-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Acked-by: Alexei Starovoitov <ast@kernel.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20191111132458.103959370@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2019-11-27 07:44:24 +01:00
Borislav Petkov	38ebd8d119	x86/ftrace: Mark ftrace_modify_code_direct() __ref ... because it calls the .init.text function text_poke_early(). That is ok because it does call that function early, during boot. Fixes: 9706f7c3531f ("x86/ftrace: Use text_poke()") Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Daniel Bristot de Oliveira <bristot@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20191116204607.GC23231@zn.tnic	2019-11-27 07:44:24 +01:00
Peter Zijlstra	4531ef6a8a	x86/alternative: Shrink text_poke_loc Employ the fact that all text must be within a s32 displacement of one another to shrink the text_poke_loc::addr field. Make it relative to _stext. This then shrinks struct text_poke_loc to 16 bytes, and consequently increases TP_VEC_MAX from 170 to 256. Tested-by: Alexei Starovoitov <ast@kernel.org> Tested-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Alexei Starovoitov <ast@kernel.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20191111132458.047052889@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2019-11-27 07:44:24 +01:00
Peter Zijlstra	97e6c977cc	x86/alternative: Remove text_poke_loc::len Per the BUG_ON(len != insn.length) in text_poke_loc_init(), tp->len must indeed be the same as text_opcode_size(tp->opcode). Use this to remove this field from the structure. Sadly, due to 8 byte alignment, this only increases the structure padding. Tested-by: Alexei Starovoitov <ast@kernel.org> Tested-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Alexei Starovoitov <ast@kernel.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20191111132457.989922744@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2019-11-27 07:44:24 +01:00
Peter Zijlstra	67c1d4a280	x86/ftrace: Use text_gen_insn() Replace the ftrace_code_union with the generic text_gen_insn() helper, which does exactly this. Tested-by: Alexei Starovoitov <ast@kernel.org> Tested-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Alexei Starovoitov <ast@kernel.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20191111132457.932808000@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2019-11-27 07:44:24 +01:00

1 2 3 4 5 ...

15800 commits