ARM:

* Clean up vCPU targets, always returning generic v8 as the preferred
  target
* Trap forwarding infrastructure for nested virtualization (used for
  traps that are taken from an L2 guest and are needed by the L1
  hypervisor)
* FEAT_TLBIRANGE support to only invalidate specific ranges of
  addresses when collapsing a table PTE to a block PTE. This spares
  the guest from refilling the TLBs again for addresses that aren't
  covered by the table PTE.
* Fix vPMU issues related to handling of PMUver
* Don't unnecessarily align non-stack allocations in the EL2 VA space
* Drop HCR_VIRT_EXCP_MASK, which was never used...
* Don't use smp_processor_id() in kvm_arch_vcpu_load(), but the cpu
  parameter instead
* Drop redundant call to kvm_set_pfn_accessed() in user_mem_abort()
* Remove prototypes without implementations

RISC-V:

* Zba, Zbs, Zicntr, Zicsr, Zifencei, and Zihpm support for guest
* Added ONE_REG interface for SATP mode
* Added ONE_REG interface to enable/disable multiple ISA extensions
* Improved error codes returned by ONE_REG interfaces
* Added KVM_GET_REG_LIST ioctl() implementation for KVM RISC-V
* Added get-reg-list selftest for KVM RISC-V

s390:

* PV crypto passthrough enablement (Tony, Steffen, Viktor, Janosch)
  Allows a PV guest to use crypto cards. Card access is governed by
  the firmware and once a crypto queue is "bound" to a PV VM every
  other entity (PV or not) loses access until it is no longer bound.
  Enablement is done via flags when creating the PV VM.
* Guest debug fixes (Ilya)

x86:

* Clean up KVM's handling of Intel architectural events
* Intel bugfixes
* Add support for SEV-ES DebugSwap, allowing SEV-ES guests to use
  debug registers and generate/handle #DBs
* Clean up LBR virtualization code
* Fix a bug where KVM fails to set the target pCPU during an IRTE
  update
* Fix fatal bugs in SEV-ES intrahost migration
* Fix a bug where the recent (architecturally correct) change to
  reinject #BP and skip INT3 broke SEV guests (can't decode INT3 to
  skip it)
* Retry APIC map recalculation if a vCPU is added/enabled
* Overhaul emergency reboot code to bring SVM up to par with VMX, tie
  the "emergency disabling" behavior to KVM actually being loaded, and
  move all of the logic within KVM
* Fix user-triggerable WARNs in SVM where KVM incorrectly assumes the
  TSC ratio MSR cannot diverge from the default when TSC scaling is
  disabled, and clean up related code
* Add a framework to allow "caching" feature flags so that KVM can
  check if the guest can use a feature without needing to search guest
  CPUID
* Rip out the ancient MMU_DEBUG crud and replace the useful bits with
  CONFIG_KVM_PROVE_MMU
* Fix KVM's handling of !visible guest roots to avoid premature triple
  fault injection
* Overhaul KVM's page-track APIs, and KVMGT's usage, to reduce the API
  surface that is needed by external users (currently only KVMGT), and
  fix a variety of issues in the process

This last item had a silly one-character bug in the topic branch that
was sent to me. Because it caused pretty bad selftest failures in some
configurations, I decided to squash in the fix. So, while the exact
commit ids haven't been in linux-next, the code has (from the kvm-x86
tree).

Generic:

* Wrap kvm_{gfn,hva}_range.pte in a union to allow mmu_notifier events
  to pass action-specific data without needing to constantly update the
  main handlers.
* Drop unused function declarations

Selftests:

* Add testcases to x86's sync_regs_test for detecting KVM TOCTOU bugs
* Add support for printf() in guest code and convert all guest asserts
  to use printf-based reporting
* Clean up the PMU event filter test and add new testcases
* Include x86 selftests in the KVM x86 MAINTAINERS entry

-----BEGIN PGP SIGNATURE-----

iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmT1m0kUHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroMNgggAiN7nz6UC423FznuI+yO3TLm8tkx1
CpKh5onqQogVtchH+vrngi97cfOzZb1/AtifY90OWQi31KEWhehkeofcx7G6ERhj
5a9NFADY1xGBsX4exca/VHDxhnzsbDWaWYPXw5vWFWI6erft9Mvy3tp1LwTvOzqM
v8X4aWz+g5bmo/DWJf4Wu32tEU6mnxzkrjKU14JmyqQTBawVmJ3RYvHVJ/Agpw+n
hRtPAy7FU6XTdkmq/uCT+KUHuJEIK0E/l1js47HFAqSzwdW70UDg14GGo1o4ETxu
RjZQmVNvL57yVgi6QU38/A0FWIsWQm5IlaX1Ug6x8pjZPnUKNbo9BY4T1g==
=W+4p
-----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm updates from Paolo Bonzini:
 "ARM:

   - Clean up vCPU targets, always returning generic v8 as the
     preferred target

   - Trap forwarding infrastructure for nested virtualization (used for
     traps that are taken from an L2 guest and are needed by the L1
     hypervisor)

   - FEAT_TLBIRANGE support to only invalidate specific ranges of
     addresses when collapsing a table PTE to a block PTE. This spares
     the guest from refilling the TLBs again for addresses that aren't
     covered by the table PTE.

   - Fix vPMU issues related to handling of PMUver

   - Don't unnecessarily align non-stack allocations in the EL2 VA
     space

   - Drop HCR_VIRT_EXCP_MASK, which was never used...

   - Don't use smp_processor_id() in kvm_arch_vcpu_load(), but the cpu
     parameter instead

   - Drop redundant call to kvm_set_pfn_accessed() in user_mem_abort()

   - Remove prototypes without implementations

  RISC-V:

   - Zba, Zbs, Zicntr, Zicsr, Zifencei, and Zihpm support for guest

   - Added ONE_REG interface for SATP mode

   - Added ONE_REG interface to enable/disable multiple ISA extensions

   - Improved error codes returned by ONE_REG interfaces

   - Added KVM_GET_REG_LIST ioctl() implementation for KVM RISC-V

   - Added get-reg-list selftest for KVM RISC-V

  s390:

   - PV crypto passthrough enablement (Tony, Steffen, Viktor, Janosch)

     Allows a PV guest to use crypto cards. Card access is governed by
     the firmware and once a crypto queue is "bound" to a PV VM every
     other entity (PV or not) loses access until it is no longer bound.

     Enablement is done via flags when creating the PV VM.

   - Guest debug fixes (Ilya)

  x86:

   - Clean up KVM's handling of Intel architectural events

   - Intel bugfixes

   - Add support for SEV-ES DebugSwap, allowing SEV-ES guests to use
     debug registers and generate/handle #DBs

   - Clean up LBR virtualization code

   - Fix a bug where KVM fails to set the target pCPU during an IRTE
     update

   - Fix fatal bugs in SEV-ES intrahost migration

   - Fix a bug where the recent (architecturally correct) change to
     reinject #BP and skip INT3 broke SEV guests (can't decode INT3 to
     skip it)

   - Retry APIC map recalculation if a vCPU is added/enabled

   - Overhaul emergency reboot code to bring SVM up to par with VMX,
     tie the "emergency disabling" behavior to KVM actually being
     loaded, and move all of the logic within KVM

   - Fix user-triggerable WARNs in SVM where KVM incorrectly assumes
     the TSC ratio MSR cannot diverge from the default when TSC scaling
     is disabled, and clean up related code

   - Add a framework to allow "caching" feature flags so that KVM can
     check if the guest can use a feature without needing to search
     guest CPUID

   - Rip out the ancient MMU_DEBUG crud and replace the useful bits
     with CONFIG_KVM_PROVE_MMU

   - Fix KVM's handling of !visible guest roots to avoid premature
     triple fault injection

   - Overhaul KVM's page-track APIs, and KVMGT's usage, to reduce the
     API surface that is needed by external users (currently only
     KVMGT), and fix a variety of issues in the process

  Generic:

   - Wrap kvm_{gfn,hva}_range.pte in a union to allow mmu_notifier
     events to pass action-specific data without needing to constantly
     update the main handlers.

   - Drop unused function declarations

  Selftests:

   - Add testcases to x86's sync_regs_test for detecting KVM TOCTOU
     bugs

   - Add support for printf() in guest code and convert all guest
     asserts to use printf-based reporting

   - Clean up the PMU event filter test and add new testcases

   - Include x86 selftests in the KVM x86 MAINTAINERS entry"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (279 commits)
  KVM: x86/mmu: Include mmu.h in spte.h
  KVM: x86/mmu: Use dummy root, backed by zero page, for !visible guest roots
  KVM: x86/mmu: Disallow guest from using !visible slots for page tables
  KVM: x86/mmu: Harden TDP MMU iteration against root w/o shadow page
  KVM: x86/mmu: Harden new PGD against roots without shadow pages
  KVM: x86/mmu: Add helper to convert root hpa to shadow page
  drm/i915/gvt: Drop final dependencies on KVM internal details
  KVM: x86/mmu: Handle KVM bookkeeping in page-track APIs, not callers
  KVM: x86/mmu: Drop @slot param from exported/external page-track APIs
  KVM: x86/mmu: Bug the VM if write-tracking is used but not enabled
  KVM: x86/mmu: Assert that correct locks are held for page write-tracking
  KVM: x86/mmu: Rename page-track APIs to reflect the new reality
  KVM: x86/mmu: Drop infrastructure for multiple page-track modes
  KVM: x86/mmu: Use page-track notifiers iff there are external users
  KVM: x86/mmu: Move KVM-only page-track declarations to internal header
  KVM: x86: Remove the unused page-track hook track_flush_slot()
  drm/i915/gvt: switch from ->track_flush_slot() to ->track_remove_region()
  KVM: x86: Add a new page-track hook to handle memslot deletion
  drm/i915/gvt: Don't bother removing write-protection on to-be-deleted slot
  KVM: x86: Reject memslot MOVE operations if KVMGT is attached
  ...
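
As a concrete illustration of the KVM_GET_REG_LIST item above: on RISC-V the
ioctl follows the same two-call pattern it already uses on other
architectures. The sketch below is hedged -- it assumes a vcpu_fd already
obtained via KVM_CREATE_VM/KVM_CREATE_VCPU, and get_reg_list is a
hypothetical helper name, not kernel or selftest code.

#include <errno.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/*
 * Enumerate all ONE_REG register ids for a vCPU. The first call is made
 * deliberately undersized (n = 0); the kernel fails it with E2BIG but
 * writes the required count back into n, so the second call can be made
 * with a correctly sized buffer.
 */
struct kvm_reg_list *get_reg_list(int vcpu_fd)
{
	struct kvm_reg_list probe = { .n = 0 };
	struct kvm_reg_list *list;

	if (ioctl(vcpu_fd, KVM_GET_REG_LIST, &probe) == 0 || errno != E2BIG)
		return NULL;	/* 0 regs, or an unexpected error */

	list = calloc(1, sizeof(*list) + probe.n * sizeof(__u64));
	if (!list)
		return NULL;
	list->n = probe.n;

	if (ioctl(vcpu_fd, KVM_GET_REG_LIST, list)) {
		free(list);
		return NULL;
	}
	return list;	/* list->reg[0..n-1] holds KVM_REG_* identifiers */
}

This is roughly the flow the new get-reg-list selftest exercises when it
snapshots the register list exposed for each supported ISA extension.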
/* SPDX-License-Identifier: GPL-2.0 */
#ifndef _ASM_X86_KEXEC_H
#define _ASM_X86_KEXEC_H

#ifdef CONFIG_X86_32
# define PA_CONTROL_PAGE 0
# define VA_CONTROL_PAGE 1
# define PA_PGD 2
# define PA_SWAP_PAGE 3
# define PAGES_NR 4
#else
# define PA_CONTROL_PAGE 0
# define VA_CONTROL_PAGE 1
# define PA_TABLE_PAGE 2
# define PA_SWAP_PAGE 3
# define PAGES_NR 4
#endif

# define KEXEC_CONTROL_CODE_MAX_SIZE 2048

#ifndef __ASSEMBLY__

#include <linux/string.h>
#include <linux/kernel.h>

#include <asm/page.h>
#include <asm/ptrace.h>
#include <asm/bootparam.h>

struct kimage;

/*
 * KEXEC_SOURCE_MEMORY_LIMIT is the maximum page get_free_page can return,
 * i.e. the maximum page that is mapped directly into kernel memory so that
 * kmap is not required.
 *
 * So far x86_64 is limited to 40 physical address bits.
 */
#ifdef CONFIG_X86_32
/* Maximum physical address we can use pages from */
# define KEXEC_SOURCE_MEMORY_LIMIT (-1UL)
/* Maximum address we can reach in physical address mode */
# define KEXEC_DESTINATION_MEMORY_LIMIT (-1UL)
/* Maximum address we can use for the control code buffer */
# define KEXEC_CONTROL_MEMORY_LIMIT TASK_SIZE

# define KEXEC_CONTROL_PAGE_SIZE 4096

/* The native architecture */
# define KEXEC_ARCH KEXEC_ARCH_386

/* We can also handle crash dumps from a 64 bit kernel. */
# define vmcore_elf_check_arch_cross(x) ((x)->e_machine == EM_X86_64)
#else
/* Maximum physical address we can use pages from */
# define KEXEC_SOURCE_MEMORY_LIMIT (MAXMEM-1)
/* Maximum address we can reach in physical address mode */
# define KEXEC_DESTINATION_MEMORY_LIMIT (MAXMEM-1)
/* Maximum address we can use for the control pages */
# define KEXEC_CONTROL_MEMORY_LIMIT (MAXMEM-1)

/* Allocate one page for the pdp and the second for the code */
# define KEXEC_CONTROL_PAGE_SIZE (4096UL + 4096UL)

/* The native architecture */
# define KEXEC_ARCH KEXEC_ARCH_X86_64
#endif

/*
 * This function is responsible for capturing register states if coming
 * via panic; otherwise it just fixes up ss and sp if coming via a
 * kernel-mode exception.
 */
static inline void crash_setup_regs(struct pt_regs *newregs,
				    struct pt_regs *oldregs)
{
	if (oldregs) {
		memcpy(newregs, oldregs, sizeof(*newregs));
	} else {
#ifdef CONFIG_X86_32
		asm volatile("movl %%ebx,%0" : "=m"(newregs->bx));
		asm volatile("movl %%ecx,%0" : "=m"(newregs->cx));
		asm volatile("movl %%edx,%0" : "=m"(newregs->dx));
		asm volatile("movl %%esi,%0" : "=m"(newregs->si));
		asm volatile("movl %%edi,%0" : "=m"(newregs->di));
		asm volatile("movl %%ebp,%0" : "=m"(newregs->bp));
		asm volatile("movl %%eax,%0" : "=m"(newregs->ax));
		asm volatile("movl %%esp,%0" : "=m"(newregs->sp));
		asm volatile("movl %%ss, %%eax;" :"=a"(newregs->ss));
		asm volatile("movl %%cs, %%eax;" :"=a"(newregs->cs));
		asm volatile("movl %%ds, %%eax;" :"=a"(newregs->ds));
		asm volatile("movl %%es, %%eax;" :"=a"(newregs->es));
		asm volatile("pushfl; popl %0" :"=m"(newregs->flags));
#else
		asm volatile("movq %%rbx,%0" : "=m"(newregs->bx));
		asm volatile("movq %%rcx,%0" : "=m"(newregs->cx));
		asm volatile("movq %%rdx,%0" : "=m"(newregs->dx));
		asm volatile("movq %%rsi,%0" : "=m"(newregs->si));
		asm volatile("movq %%rdi,%0" : "=m"(newregs->di));
		asm volatile("movq %%rbp,%0" : "=m"(newregs->bp));
		asm volatile("movq %%rax,%0" : "=m"(newregs->ax));
		asm volatile("movq %%rsp,%0" : "=m"(newregs->sp));
		asm volatile("movq %%r8,%0" : "=m"(newregs->r8));
		asm volatile("movq %%r9,%0" : "=m"(newregs->r9));
		asm volatile("movq %%r10,%0" : "=m"(newregs->r10));
		asm volatile("movq %%r11,%0" : "=m"(newregs->r11));
		asm volatile("movq %%r12,%0" : "=m"(newregs->r12));
		asm volatile("movq %%r13,%0" : "=m"(newregs->r13));
		asm volatile("movq %%r14,%0" : "=m"(newregs->r14));
		asm volatile("movq %%r15,%0" : "=m"(newregs->r15));
		asm volatile("movl %%ss, %%eax;" :"=a"(newregs->ss));
		asm volatile("movl %%cs, %%eax;" :"=a"(newregs->cs));
		asm volatile("pushfq; popq %0" :"=m"(newregs->flags));
#endif
		newregs->ip = _THIS_IP_;
	}
}
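
/*
 * Illustrative only (not part of the original header): a sketch of how the
 * crash path consumes crash_setup_regs(). In kernel/kexec_core.c,
 * __crash_kexec() does roughly the following, where @regs may be NULL when
 * the panic did not arrive through an exception frame:
 *
 *	struct pt_regs fixed_regs;
 *
 *	crash_setup_regs(&fixed_regs, regs);
 *	crash_save_vmcoreinfo();
 *	machine_crash_shutdown(&fixed_regs);
 *	machine_kexec(kexec_crash_image);
 *
 * The NULL case is why the inline asm above snapshots the live registers:
 * the dump must still contain a plausible register state for the
 * panicking CPU.
 */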

#ifdef CONFIG_X86_32
asmlinkage unsigned long
relocate_kernel(unsigned long indirection_page,
		unsigned long control_page,
		unsigned long start_address,
		unsigned int has_pae,
		unsigned int preserve_context);
#else
unsigned long
relocate_kernel(unsigned long indirection_page,
		unsigned long page_list,
		unsigned long start_address,
		unsigned int preserve_context,
		unsigned int host_mem_enc_active);
#endif

#define ARCH_HAS_KIMAGE_ARCH

#ifdef CONFIG_X86_32
struct kimage_arch {
	pgd_t *pgd;
#ifdef CONFIG_X86_PAE
	pmd_t *pmd0;
	pmd_t *pmd1;
#endif
	pte_t *pte0;
	pte_t *pte1;
};
#else
struct kimage_arch {
	p4d_t *p4d;
	pud_t *pud;
	pmd_t *pmd;
	pte_t *pte;
};
#endif /* CONFIG_X86_32 */

#ifdef CONFIG_X86_64
/*
 * Number of elements and order of elements in this structure should match
 * with the ones in arch/x86/purgatory/entry64.S. If you make a change here
 * make an appropriate change in purgatory too.
 */
struct kexec_entry64_regs {
	uint64_t rax;
	uint64_t rcx;
	uint64_t rdx;
	uint64_t rbx;
	uint64_t rsp;
	uint64_t rbp;
	uint64_t rsi;
	uint64_t rdi;
	uint64_t r8;
	uint64_t r9;
	uint64_t r10;
	uint64_t r11;
	uint64_t r12;
	uint64_t r13;
	uint64_t r14;
	uint64_t r15;
	uint64_t rip;
};

extern int arch_kexec_post_alloc_pages(void *vaddr, unsigned int pages,
				       gfp_t gfp);
#define arch_kexec_post_alloc_pages arch_kexec_post_alloc_pages

extern void arch_kexec_pre_free_pages(void *vaddr, unsigned int pages);
#define arch_kexec_pre_free_pages arch_kexec_pre_free_pages

void arch_kexec_protect_crashkres(void);
#define arch_kexec_protect_crashkres arch_kexec_protect_crashkres

void arch_kexec_unprotect_crashkres(void);
#define arch_kexec_unprotect_crashkres arch_kexec_unprotect_crashkres

#ifdef CONFIG_KEXEC_FILE
struct purgatory_info;
int arch_kexec_apply_relocations_add(struct purgatory_info *pi,
				     Elf_Shdr *section,
				     const Elf_Shdr *relsec,
				     const Elf_Shdr *symtab);
#define arch_kexec_apply_relocations_add arch_kexec_apply_relocations_add

int arch_kimage_file_post_load_cleanup(struct kimage *image);
#define arch_kimage_file_post_load_cleanup arch_kimage_file_post_load_cleanup
#endif
#endif

extern void kdump_nmi_shootdown_cpus(void);

#ifdef CONFIG_CRASH_HOTPLUG
void arch_crash_handle_hotplug_event(struct kimage *image);
#define arch_crash_handle_hotplug_event arch_crash_handle_hotplug_event

#ifdef CONFIG_HOTPLUG_CPU
int arch_crash_hotplug_cpu_support(void);
#define crash_hotplug_cpu_support arch_crash_hotplug_cpu_support
#endif

#ifdef CONFIG_MEMORY_HOTPLUG
int arch_crash_hotplug_memory_support(void);
#define crash_hotplug_memory_support arch_crash_hotplug_memory_support
#endif

unsigned int arch_crash_get_elfcorehdr_size(void);
#define crash_get_elfcorehdr_size arch_crash_get_elfcorehdr_size
#endif

#endif /* __ASSEMBLY__ */

#endif /* _ASM_X86_KEXEC_H */
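
A closing note on how userspace pairs with the constants in this header: the
KEXEC_ARCH value advertised here corresponds to the architecture flag that
userspace passes to the kexec_load(2) syscall. The following is a minimal,
hedged sketch of that calling convention only -- the payload buffer, load
address, and entry point are placeholders rather than a bootable image, so a
real run needs CAP_SYS_BOOT and an image prepared the way kexec(8) prepares
one.

#include <stdio.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <linux/kexec.h>	/* struct kexec_segment, KEXEC_* flags */

int main(void)
{
	/* Hypothetical single segment; real loads pass several segments
	 * (kernel, initrd, boot params) at addresses the kernel accepts. */
	static char payload[4096];	/* page-sized, zero-filled placeholder */
	struct kexec_segment seg = {
		.buf   = payload,
		.bufsz = sizeof(payload),
		.mem   = (void *)0x100000,	/* assumed load address */
		.memsz = sizeof(payload),	/* must be page-aligned */
	};

	/* entry would be the physical entry point of the loaded image. */
	unsigned long entry = 0x100000;

	/* KEXEC_ARCH_DEFAULT means "same architecture as the running
	 * kernel"; this header pins the native value via KEXEC_ARCH. */
	long ret = syscall(SYS_kexec_load, entry, 1UL, &seg,
			   (unsigned long)KEXEC_ARCH_DEFAULT);
	if (ret != 0)
		perror("kexec_load");	/* expected here: the payload is fake */
	return ret == 0 ? 0 : 1;
}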