The HV_REGISTER_ are used as arguments to hv_set/get_register(), which
delegate to arch-specific mechanisms for getting/setting synthetic
Hyper-V MSRs.
On arm64, HV_REGISTER_ defines are synthetic VP registers accessed via
the get/set vp registers hypercalls. The naming matches the TLFS
document, although these register names are not specific to arm64.
However, on x86 the prefix HV_REGISTER_ indicates Hyper-V MSRs accessed
via rdmsrl()/wrmsrl(). This is not consistent with the TLFS doc, where
HV_REGISTER_ is *only* used for used for VP register names used by
the get/set register hypercalls.
To fix this inconsistency and prevent future confusion, change the
arch-generic aliases used by callers of hv_set/get_register() to have
the prefix HV_MSR_ instead of HV_REGISTER_.
Use the prefix HV_X64_MSR_ for the x86-only Hyper-V MSRs. On x86, the
generic HV_MSR_'s point to the corresponding HV_X64_MSR_.
Move the arm64 HV_REGISTER_* defines to the asm-generic hyperv-tlfs.h,
since these are not specific to arm64. On arm64, the generic HV_MSR_'s
point to the corresponding HV_REGISTER_.
While at it, rename hv_get/set_registers() and related functions to
hv_get/set_msr(), hv_get/set_nested_msr(), etc. These are only used for
Hyper-V MSRs and this naming makes that clear.
Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
Reviewed-by: Wei Liu <wei.liu@kernel.org>
Reviewed-by: Michael Kelley <mhklinux@outlook.com>
Link: https://lore.kernel.org/r/1708440933-27125-1-git-send-email-nunodasneves@linux.microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Message-ID: <1708440933-27125-1-git-send-email-nunodasneves@linux.microsoft.com>
- Limit the hardcoded topology quirk for Hygon CPUs to those which have a
model ID less than 4. The newer models have the topology CPUID leaf 0xB
correctly implemented and are not affected.
- Make SMT control more robust against enumeration failures
SMT control was added to allow controlling SMT at boottime or
runtime. The primary purpose was to provide a simple mechanism to
disable SMT in the light of speculation attack vectors.
It turned out that the code is sensible to enumeration failures and
worked only by chance for XEN/PV. XEN/PV has no real APIC enumeration
which means the primary thread mask is not set up correctly. By chance
a XEN/PV boot ends up with smp_num_siblings == 2, which makes the
hotplug control stay at its default value "enabled". So the mask is
never evaluated.
The ongoing rework of the topology evaluation caused XEN/PV to end up
with smp_num_siblings == 1, which sets the SMT control to "not
supported" and the empty primary thread mask causes the hotplug core to
deny the bringup of the APS.
Make the decision logic more robust and take 'not supported' and 'not
implemented' into account for the decision whether a CPU should be
booted or not.
- Fake primary thread mask for XEN/PV
Pretend that all XEN/PV vCPUs are primary threads, which makes the
usage of the primary thread mask valid on XEN/PV. That is consistent
with because all of the topology information on XEN/PV is fake or even
non-existent.
- Encapsulate topology information in cpuinfo_x86
Move the randomly scattered topology data into a separate data
structure for readability and as a preparatory step for the topology
evaluation overhaul.
- Consolidate APIC ID data type to u32
It's fixed width hardware data and not randomly u16, int, unsigned long
or whatever developers decided to use.
- Cure the abuse of cpuinfo for persisting logical IDs.
Per CPU cpuinfo is used to persist the logical package and die
IDs. That's really not the right place simply because cpuinfo is
subject to be reinitialized when a CPU goes through an offline/online
cycle.
Use separate per CPU data for the persisting to enable the further
topology management rework. It will be removed once the new topology
management is in place.
- Provide a debug interface for inspecting topology information
Useful in general and extremly helpful for validating the topology
management rework in terms of correctness or "bug" compatibility.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmU+yX0THHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYoROUD/4vlvKEcpm9rbI5DzLcaq4DFHKbyEZF
cQtzuOSM/9vTc9DHnuoNNLl9TWSYxiVYnejf3E21evfsqspYlzbTH8bId9XBCUid
6B68AJW842M2erNuwj0b0HwF1z++zpDmBDyhGOty/KQhoM8pYOHMvntAmbzJbuso
Dgx6BLVFcboTy6RwlfRa0EE8f9W5V+JbmG/VBDpdyCInal7VrudoVFZmWQnPIft7
zwOJpAoehkp8OKq7geKDf79yWxu9a1sNPd62HtaVEvfHwehHqE6OaMLss1us+0vT
SJ/D6gmRQBOwcXaZL0wL1dG7Km9Et4AisOvzhXGvTa5b2D5oljVoqJ7V7FTf5g3u
y3aqWbeUJzERUbeJt1HoGVAKyA4GtZOvg+TNIysf6F1Z4khl9alfa9jiqjj4g1au
zgItq/ZMBEBmJ7X4FxQUEUVBG2CDsEidyNBDRcimWQUDfBakV/iCs0suD8uu8ZOD
K5jMx8Hi2+xFx7r1YqsfsyMBYOf/zUZw65RbNe+kI992JbJ9nhcODbnbo5MlAsyv
vcqlK5FwXgZ4YAC8dZHU/tyTiqAW7oaOSkqKwTP5gcyNEqsjQHV//q6v+uqtjfYn
1C4oUsRHT2vJiV9ktNJTA4GQHIYF4geGgpG8Ih2SjXsSzdGtUd3DtX1iq0YiLEOk
eHhYsnniqsYB5g==
=xrz8
-----END PGP SIGNATURE-----
Merge tag 'x86-core-2023-10-29-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 core updates from Thomas Gleixner:
- Limit the hardcoded topology quirk for Hygon CPUs to those which have
a model ID less than 4.
The newer models have the topology CPUID leaf 0xB correctly
implemented and are not affected.
- Make SMT control more robust against enumeration failures
SMT control was added to allow controlling SMT at boottime or
runtime. The primary purpose was to provide a simple mechanism to
disable SMT in the light of speculation attack vectors.
It turned out that the code is sensible to enumeration failures and
worked only by chance for XEN/PV. XEN/PV has no real APIC enumeration
which means the primary thread mask is not set up correctly. By
chance a XEN/PV boot ends up with smp_num_siblings == 2, which makes
the hotplug control stay at its default value "enabled". So the mask
is never evaluated.
The ongoing rework of the topology evaluation caused XEN/PV to end up
with smp_num_siblings == 1, which sets the SMT control to "not
supported" and the empty primary thread mask causes the hotplug core
to deny the bringup of the APS.
Make the decision logic more robust and take 'not supported' and 'not
implemented' into account for the decision whether a CPU should be
booted or not.
- Fake primary thread mask for XEN/PV
Pretend that all XEN/PV vCPUs are primary threads, which makes the
usage of the primary thread mask valid on XEN/PV. That is consistent
with because all of the topology information on XEN/PV is fake or
even non-existent.
- Encapsulate topology information in cpuinfo_x86
Move the randomly scattered topology data into a separate data
structure for readability and as a preparatory step for the topology
evaluation overhaul.
- Consolidate APIC ID data type to u32
It's fixed width hardware data and not randomly u16, int, unsigned
long or whatever developers decided to use.
- Cure the abuse of cpuinfo for persisting logical IDs.
Per CPU cpuinfo is used to persist the logical package and die IDs.
That's really not the right place simply because cpuinfo is subject
to be reinitialized when a CPU goes through an offline/online cycle.
Use separate per CPU data for the persisting to enable the further
topology management rework. It will be removed once the new topology
management is in place.
- Provide a debug interface for inspecting topology information
Useful in general and extremly helpful for validating the topology
management rework in terms of correctness or "bug" compatibility.
* tag 'x86-core-2023-10-29-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits)
x86/apic, x86/hyperv: Use u32 in hv_snp_boot_ap() too
x86/cpu: Provide debug interface
x86/cpu/topology: Cure the abuse of cpuinfo for persisting logical ids
x86/apic: Use u32 for wakeup_secondary_cpu[_64]()
x86/apic: Use u32 for [gs]et_apic_id()
x86/apic: Use u32 for phys_pkg_id()
x86/apic: Use u32 for cpu_present_to_apicid()
x86/apic: Use u32 for check_apicid_used()
x86/apic: Use u32 for APIC IDs in global data
x86/apic: Use BAD_APICID consistently
x86/cpu: Move cpu_l[l2]c_id into topology info
x86/cpu: Move logical package and die IDs into topology info
x86/cpu: Remove pointless evaluation of x86_coreid_bits
x86/cpu: Move cu_id into topology info
x86/cpu: Move cpu_core_id into topology info
hwmon: (fam15h_power) Use topology_core_id()
scsi: lpfc: Use topology_core_id()
x86/cpu: Move cpu_die_id into topology info
x86/cpu: Move phys_proc_id into topology info
x86/cpu: Encapsulate topology information in cpuinfo_x86
...
The data type for APIC IDs was standardized to 'u32' in the
following recent commit:
db4a4086a2 ("x86/apic: Use u32 for wakeup_secondary_cpu[_64]()")
Which changed the function arguments type signature of the
apic->wakeup_secondary_cpu() APIC driver function.
Propagate this to hv_snp_boot_ap() as well, which also addresses a
'assignment from incompatible pointer type' build warning that triggers
under the -Werror=incompatible-pointer-types GCC warning.
Fixes: db4a4086a2 ("x86/apic: Use u32 for wakeup_secondary_cpu[_64]()")
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20230814085113.233274223@linutronix.de
There has been cases reported where HYPERV_VTL_MODE is enabled by mistake,
on a non Hyper-V platforms. This causes the hv_vtl_early_init function to
be called in an non Hyper-V/VTL platforms which results the memory
corruption.
Remove the early_initcall for hv_vtl_early_init and call it at the end of
hyperv_init to make sure it is never called in a non Hyper-V platform by
mistake.
Reported-by: Mathias Krause <minipli@grsecurity.net>
Closes: https://lore.kernel.org/lkml/40467722-f4ab-19a5-4989-308225b1f9f0@grsecurity.net/
Signed-off-by: Saurabh Sengar <ssengar@linux.microsoft.com>
Acked-by: Mathias Krause <minipli@grsecurity.net>
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Link: https://lore.kernel.org/r/1695358720-27681-1-git-send-email-ssengar@linux.microsoft.com
-----BEGIN PGP SIGNATURE-----
iQFHBAABCgAxFiEEIbPD0id6easf0xsudhRwX5BBoF4FAmT0EE8THHdlaS5saXVA
a2VybmVsLm9yZwAKCRB2FHBfkEGgXg5FCACGJ6n2ikhtRHAENHIVY/mTh+HbhO07
ERzjADfqKF43u1Nt9cslgT4MioqwLjQsAu/A0YcJgVxVSOtg7dnbDmurRAjrGT/3
iKqcVvnaiwSV44TkF8evpeMttZSOg29ImmpyQjoZJJvDMfpxleEK53nuKB9EsjKL
Mz/0gSPoNc79bWF+85cVhgPnGIh9nBarxHqVsuWjMhc+UFhzjf9mOtk34qqPfJ1Q
4RsKGEjkVkeXoG6nGd6Gl/+8WoTpenOZQLchhInocY+k9FlAzW1Kr+ICLDx+Topw
8OJ6fv2rMDOejT9aOaA3/imf7LMer0xSUKb6N0sqQAQX8KzwcOYyKtQJ
=rC/v
-----END PGP SIGNATURE-----
Merge tag 'hyperv-next-signed-20230902' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux
Pull hyperv updates from Wei Liu:
- Support for SEV-SNP guests on Hyper-V (Tianyu Lan)
- Support for TDX guests on Hyper-V (Dexuan Cui)
- Use SBRM API in Hyper-V balloon driver (Mitchell Levy)
- Avoid dereferencing ACPI root object handle in VMBus driver (Maciej
Szmigiero)
- A few misecllaneous fixes (Jiapeng Chong, Nathan Chancellor, Saurabh
Sengar)
* tag 'hyperv-next-signed-20230902' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: (24 commits)
x86/hyperv: Remove duplicate include
x86/hyperv: Move the code in ivm.c around to avoid unnecessary ifdef's
x86/hyperv: Remove hv_isolation_type_en_snp
x86/hyperv: Use TDX GHCI to access some MSRs in a TDX VM with the paravisor
Drivers: hv: vmbus: Bring the post_msg_page back for TDX VMs with the paravisor
x86/hyperv: Introduce a global variable hyperv_paravisor_present
Drivers: hv: vmbus: Support >64 VPs for a fully enlightened TDX/SNP VM
x86/hyperv: Fix serial console interrupts for fully enlightened TDX guests
Drivers: hv: vmbus: Support fully enlightened TDX guests
x86/hyperv: Support hypercalls for fully enlightened TDX guests
x86/hyperv: Add hv_isolation_type_tdx() to detect TDX guests
x86/hyperv: Fix undefined reference to isolation_type_en_snp without CONFIG_HYPERV
x86/hyperv: Add missing 'inline' to hv_snp_boot_ap() stub
hv: hyperv.h: Replace one-element array with flexible-array member
Drivers: hv: vmbus: Don't dereference ACPI root object handle
x86/hyperv: Add hyperv-specific handling for VMMCALL under SEV-ES
x86/hyperv: Add smp support for SEV-SNP guest
clocksource: hyper-v: Mark hyperv tsc page unencrypted in sev-snp enlightened guest
x86/hyperv: Use vmmcall to implement Hyper-V hypercall in sev-snp enlightened guest
drivers: hv: Mark percpu hvcall input arg page unencrypted in SEV-SNP enlightened guest
...
In ms_hyperv_init_platform(), do not distinguish between a SNP VM with
the paravisor and a SNP VM without the paravisor.
Replace hv_isolation_type_en_snp() with
!ms_hyperv.paravisor_present && hv_isolation_type_snp().
The hv_isolation_type_en_snp() in drivers/hv/hv.c and
drivers/hv/hv_common.c can be changed to hv_isolation_type_snp() since
we know !ms_hyperv.paravisor_present is true there.
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Reviewed-by: Tianyu Lan <tiala@microsoft.com>
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Link: https://lore.kernel.org/r/20230824080712.30327-10-decui@microsoft.com
When the paravisor is present, a SNP VM must use GHCB to access some
special MSRs, including HV_X64_MSR_GUEST_OS_ID and some SynIC MSRs.
Similarly, when the paravisor is present, a TDX VM must use TDX GHCI
to access the same MSRs.
Implement hv_tdx_msr_write() and hv_tdx_msr_read(), and use the helper
functions hv_ivm_msr_read() and hv_ivm_msr_write() to access the MSRs
in a unified way for SNP/TDX VMs with the paravisor.
Do not export hv_tdx_msr_write() and hv_tdx_msr_read(), because we never
really used hv_ghcb_msr_write() and hv_ghcb_msr_read() in any module.
Update arch/x86/include/asm/mshyperv.h so that the kernel can still build
if CONFIG_AMD_MEM_ENCRYPT or CONFIG_INTEL_TDX_GUEST is not set, or
neither is set.
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Tianyu Lan <tiala@microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Link: https://lore.kernel.org/r/20230824080712.30327-9-decui@microsoft.com
The new variable hyperv_paravisor_present is set only when the VM
is a SNP/TDX VM with the paravisor running: see ms_hyperv_init_platform().
We introduce hyperv_paravisor_present because we can not use
ms_hyperv.paravisor_present in arch/x86/include/asm/mshyperv.h:
struct ms_hyperv_info is defined in include/asm-generic/mshyperv.h, which
is included at the end of arch/x86/include/asm/mshyperv.h, but at the
beginning of arch/x86/include/asm/mshyperv.h, we would already need to use
struct ms_hyperv_info in hv_do_hypercall().
We use hyperv_paravisor_present only in include/asm-generic/mshyperv.h,
and use ms_hyperv.paravisor_present elsewhere. In the future, we'll
introduce a hypercall function structure for different VM types, and
at boot time, the right function pointers would be written into the
structure so that runtime testing of TDX vs. SNP vs. normal will be
avoided and hyperv_paravisor_present will no longer be needed.
Call hv_vtom_init() when it's a VBS VM or when ms_hyperv.paravisor_present
is true, i.e. the VM is a SNP VM or TDX VM with the paravisor.
Enhance hv_vtom_init() for a TDX VM with the paravisor.
In hv_common_cpu_init(), don't decrypt the hyperv_pcpu_input_arg
for a TDX VM with the paravisor, just like we don't decrypt the page
for a SNP VM with the paravisor.
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Tianyu Lan <tiala@microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Link: https://lore.kernel.org/r/20230824080712.30327-7-decui@microsoft.com
A fully enlightened TDX guest on Hyper-V (i.e. without the paravisor) only
uses the GHCI call rather than hv_hypercall_pg. Do not initialize
hypercall_pg for such a guest.
In hv_common_cpu_init(), the hyperv_pcpu_input_arg page needs to be
decrypted in such a guest.
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Reviewed-by: Tianyu Lan <tiala@microsoft.com>
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Link: https://lore.kernel.org/r/20230824080712.30327-3-decui@microsoft.com
No logic change to SNP/VBS guests.
hv_isolation_type_tdx() will be used to instruct a TDX guest on Hyper-V to
do some TDX-specific operations, e.g. for a fully enlightened TDX guest
(i.e. without the paravisor), hv_do_hypercall() should use
__tdx_hypercall() and such a guest on Hyper-V should handle the Hyper-V
Event/Message/Monitor pages specially.
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Reviewed-by: Tianyu Lan <tiala@microsoft.com>
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Link: https://lore.kernel.org/r/20230824080712.30327-2-decui@microsoft.com
When building without CONFIG_AMD_MEM_ENCRYPT, there are several
repeated instances of -Wunused-function due to missing 'inline' on
the stub of hy_snp_boot_ap():
In file included from drivers/hv/hv_common.c:29:
./arch/x86/include/asm/mshyperv.h:272:12: error: 'hv_snp_boot_ap' defined but not used [-Werror=unused-function]
272 | static int hv_snp_boot_ap(int cpu, unsigned long start_ip) { return 0; }
| ^~~~~~~~~~~~~~
cc1: all warnings being treated as errors
Add 'inline' to fix the warnings.
Fixes: 44676bb9d5 ("x86/hyperv: Add smp support for SEV-SNP guest")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Link: https://lore.kernel.org/r/20230822-hv_snp_boot_ap-missing-inline-v1-1-e712dcb2da0f@kernel.org
In the AMD SEV-SNP guest, AP needs to be started up via sev es
save area and Hyper-V requires to call HVCALL_START_VP hypercall
to pass the gpa of sev es save area with AP's vp index and VTL(Virtual
trust level) parameters. Override wakeup_secondary_cpu_64 callback
with hv_snp_boot_ap.
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Tianyu Lan <tiala@microsoft.com>
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Link: https://lore.kernel.org/r/20230818102919.1318039-8-ltykernel@gmail.com
In sev-snp enlightened guest, Hyper-V hypercall needs
to use vmmcall to trigger vmexit and notify hypervisor
to handle hypercall request.
Signed-off-by: Tianyu Lan <tiala@microsoft.com>
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Link: https://lore.kernel.org/r/20230818102919.1318039-6-ltykernel@gmail.com
-----BEGIN PGP SIGNATURE-----
iQFHBAABCgAxFiEEIbPD0id6easf0xsudhRwX5BBoF4FAmTNh9UTHHdlaS5saXVA
a2VybmVsLm9yZwAKCRB2FHBfkEGgXkhFCACvhcb/scp31lcBjaGb8AWogejPBFfW
sb38M9X0E50omXoWahYXCMe2Dot7sgxEPuCr7PgwMBGj7Mbdy+kJg9OgdQc7mvks
XVENJmfe5H4pyNw69eVjGz/ceNP+GzpTEO0ut+suCxa8+JiBDNnTfLd1W/UYNCJ8
JOGauWnvkzA5B6/lgAB6JSldDTijKQr1NQGYGZOO6LMxVSUYb/9jFRceyEaGRL/S
HQdQU3FVcBYK2R2bqp4KCmQWFF352szmYuKMioMhpXQZTo868/llLPVarIwL/fUA
2u7NLLmza0N+DR3esZnKmMASa8wuNJgxV1Ros3nRHBa50xsTwfbvICjW
=MaTo
-----END PGP SIGNATURE-----
Merge tag 'hyperv-fixes-signed-20230804' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux
Pull hyperv fixes from Wei Liu:
- Fix a bug in a python script for Hyper-V (Ani Sinha)
- Workaround a bug in Hyper-V when IBT is enabled (Michael Kelley)
- Fix an issue parsing MP table when Linux runs in VTL2 (Saurabh
Sengar)
- Several cleanup patches (Nischala Yelchuri, Kameron Carr, YueHaibing,
ZhiHu)
* tag 'hyperv-fixes-signed-20230804' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
Drivers: hv: vmbus: Remove unused extern declaration vmbus_ontimer()
x86/hyperv: add noop functions to x86_init mpparse functions
vmbus_testing: fix wrong python syntax for integer value comparison
x86/hyperv: fix a warning in mshyperv.h
x86/hyperv: Disable IBT when hypercall page lacks ENDBR instruction
x86/hyperv: Improve code for referencing hyperv_pcpu_input_arg
Drivers: hv: Change hv_free_hyperv_page() to take void * argument
The following checkpatch warning is removed:
WARNING: Use #include <linux/io.h> instead of <asm/io.h>
Signed-off-by: ZhiHu <huzhi001@208suo.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Wei Liu <wei.liu@kernel.org>
With the intent to provide local_clock_noinstr(), a variant of
local_clock() that's safe to be called from noinstr code (with the
assumption that any such code will already be non-preemptible),
prepare for things by making the Hyper-V TSC and MSR sched_clock
implementations noinstr.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Co-developed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com> # Hyper-V
Link: https://lore.kernel.org/r/20230519102715.843039089@infradead.org
- Mark arch_cpu_idle_dead() __noreturn, make all architectures & drivers that did
this inconsistently follow this new, common convention, and fix all the fallout
that objtool can now detect statically.
- Fix/improve the ORC unwinder becoming unreliable due to UNWIND_HINT_EMPTY ambiguity,
split it into UNWIND_HINT_END_OF_STACK and UNWIND_HINT_UNDEFINED to resolve it.
- Fix noinstr violations in the KCSAN code and the lkdtm/stackleak code.
- Generate ORC data for __pfx code
- Add more __noreturn annotations to various kernel startup/shutdown/panic functions.
- Misc improvements & fixes.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmRK1x0RHG1pbmdvQGtl
cm5lbC5vcmcACgkQEnMQ0APhK1ghxQ/+IkCynMYtdF5OG9YwbcGJqsPSfOPMEcEM
pUSFYg+gGPBDT/fJfcVSqvUtdnWbLC2kXt9yiswXz3X3J2nmNkBk5YKQftsNDcul
TmKeqIIAK51XTncpegKH0EGnOX63oZ9Vxa8CTPdDlb+YF23Km2FoudGRI9F5qbUd
LoraXqGYeiaeySkGyWmZVl6Uc8dIxnMkTN3H/oI9aB6TOrsi059hAtFcSaFfyemP
c4LqXXCH7k2baiQt+qaLZ8cuZVG/+K5r2N2cmjO5kmJc6ynIaFnfMe4XxZLjp5LT
/PulYI15bXkvSARKx5CRh/CDHMOx5Blw+ASO0RhWbdy0WH4ZhhcaVF5AeIpPW86a
1LBcz97rMp72WmvKgrJeVO1r9+ll4SI6/YKGJRsxsCMdP3hgFpqntXyVjTFNdTM1
0gH6H5v55x06vJHvhtTk8SR3PfMTEM2fRU5jXEOrGowoGifx+wNUwORiwj6LE3KQ
SKUdT19RNzoW3VkFxhgk65ThK1S7YsJUKRoac3YdhttpqqqtFV//erenrZoR4k/p
vzvKy68EQ7RCNyD5wNWNFe0YjeJl5G8gQ8bUm4Xmab7djjgz+pn4WpQB8yYKJLAo
x9dqQ+6eUbw3Hcgk6qQ9E+r/svbulnAL0AeALAWK/91DwnZ2mCzKroFkLN7napKi
fRho4CqzrtM=
=NwEV
-----END PGP SIGNATURE-----
Merge tag 'objtool-core-2023-04-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull objtool updates from Ingo Molnar:
- Mark arch_cpu_idle_dead() __noreturn, make all architectures &
drivers that did this inconsistently follow this new, common
convention, and fix all the fallout that objtool can now detect
statically
- Fix/improve the ORC unwinder becoming unreliable due to
UNWIND_HINT_EMPTY ambiguity, split it into UNWIND_HINT_END_OF_STACK
and UNWIND_HINT_UNDEFINED to resolve it
- Fix noinstr violations in the KCSAN code and the lkdtm/stackleak code
- Generate ORC data for __pfx code
- Add more __noreturn annotations to various kernel startup/shutdown
and panic functions
- Misc improvements & fixes
* tag 'objtool-core-2023-04-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (52 commits)
x86/hyperv: Mark hv_ghcb_terminate() as noreturn
scsi: message: fusion: Mark mpt_halt_firmware() __noreturn
x86/cpu: Mark {hlt,resume}_play_dead() __noreturn
btrfs: Mark btrfs_assertfail() __noreturn
objtool: Include weak functions in global_noreturns check
cpu: Mark nmi_panic_self_stop() __noreturn
cpu: Mark panic_smp_self_stop() __noreturn
arm64/cpu: Mark cpu_park_loop() and friends __noreturn
x86/head: Mark *_start_kernel() __noreturn
init: Mark start_kernel() __noreturn
init: Mark [arch_call_]rest_init() __noreturn
objtool: Generate ORC data for __pfx code
x86/linkage: Fix padding for typed functions
objtool: Separate prefix code from stack validation code
objtool: Remove superfluous dead_end_function() check
objtool: Add symbol iteration helpers
objtool: Add WARN_INSN()
scripts/objdump-func: Support multiple functions
context_tracking: Fix KCSAN noinstr violation
objtool: Add stackleak instrumentation to uaccess safe list
...
Virtual Trust Levels (VTL) helps enable Hyper-V Virtual Secure Mode (VSM)
feature. VSM is a set of hypervisor capabilities and enlightenments
offered to host and guest partitions which enable the creation and
management of new security boundaries within operating system software.
VSM achieves and maintains isolation through VTLs.
Add early initialization for Virtual Trust Levels (VTL). This includes
initializing the x86 platform for VTL and enabling boot support for
secondary CPUs to start in targeted VTL context. For now, only enable
the code for targeted VTL level as 2.
When starting an AP at a VTL other than VTL0, the AP must start directly
in 64-bit mode, bypassing the usual 16-bit -> 32-bit -> 64-bit mode
transition sequence that occurs after waking up an AP with SIPI whose
vector points to the 16-bit AP startup trampoline code.
Signed-off-by: Saurabh Sengar <ssengar@linux.microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Reviewed-by: Stanislav Kinsburskii <stanislav.kinsburskii@gmail.com>
Link: https://lore.kernel.org/r/1681192532-15460-6-git-send-email-ssengar@linux.microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Annotate the function prototype and definition as noreturn to prevent
objtool warnings like:
vmlinux.o: warning: objtool: hyperv_init+0x55c: unreachable instruction
Also, as per Josh's suggestion, add it to the global_noreturns list.
As a comparison, an objdump output without the annotation:
[...]
1b63: mov $0x1,%esi
1b68: xor %edi,%edi
1b6a: callq ffffffff8102f680 <hv_ghcb_terminate>
1b6f: jmpq ffffffff82f217ec <hyperv_init+0x9c> # unreachable
1b74: cmpq $0xffffffffffffffff,-0x702a24(%rip)
[...]
Now, after adding the __noreturn to the function prototype:
[...]
17df: callq ffffffff8102f6d0 <hv_ghcb_negotiate_protocol>
17e4: test %al,%al
17e6: je ffffffff82f21bb9 <hyperv_init+0x469>
[...] <many insns>
1bb9: mov $0x1,%esi
1bbe: xor %edi,%edi
1bc0: callq ffffffff8102f680 <hv_ghcb_terminate>
1bc5: nopw %cs:0x0(%rax,%rax,1) # end of function
Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/32453a703dfcf0d007b473c9acbf70718222b74b.1681342859.git.jpoimboe@kernel.org
Hyper-V guests on AMD SEV-SNP hardware have the option of using the
"virtual Top Of Memory" (vTOM) feature specified by the SEV-SNP
architecture. With vTOM, shared vs. private memory accesses are
controlled by splitting the guest physical address space into two
halves.
vTOM is the dividing line where the uppermost bit of the physical
address space is set; e.g., with 47 bits of guest physical address
space, vTOM is 0x400000000000 (bit 46 is set). Guest physical memory is
accessible at two parallel physical addresses -- one below vTOM and one
above vTOM. Accesses below vTOM are private (encrypted) while accesses
above vTOM are shared (decrypted). In this sense, vTOM is like the
GPA.SHARED bit in Intel TDX.
Support for Hyper-V guests using vTOM was added to the Linux kernel in
two patch sets[1][2]. This support treats the vTOM bit as part of
the physical address. For accessing shared (decrypted) memory, these
patch sets create a second kernel virtual mapping that maps to physical
addresses above vTOM.
A better approach is to treat the vTOM bit as a protection flag, not
as part of the physical address. This new approach is like the approach
for the GPA.SHARED bit in Intel TDX. Rather than creating a second kernel
virtual mapping, the existing mapping is updated using recently added
coco mechanisms.
When memory is changed between private and shared using
set_memory_decrypted() and set_memory_encrypted(), the PTEs for the
existing kernel mapping are changed to add or remove the vTOM bit in the
guest physical address, just as with TDX. The hypercalls to change the
memory status on the host side are made using the existing callback
mechanism. Everything just works, with a minor tweak to map the IO-APIC
to use private accesses.
To accomplish the switch in approach, the following must be done:
* Update Hyper-V initialization to set the cc_mask based on vTOM
and do other coco initialization.
* Update physical_mask so the vTOM bit is no longer treated as part
of the physical address
* Remove CC_VENDOR_HYPERV and merge the associated vTOM functionality
under CC_VENDOR_AMD. Update cc_mkenc() and cc_mkdec() to set/clear
the vTOM bit as a protection flag.
* Code already exists to make hypercalls to inform Hyper-V about pages
changing between shared and private. Update this code to run as a
callback from __set_memory_enc_pgtable().
* Remove the Hyper-V special case from __set_memory_enc_dec()
* Remove the Hyper-V specific call to swiotlb_update_mem_attributes()
since mem_encrypt_init() will now do it.
* Add a Hyper-V specific implementation of the is_private_mmio()
callback that returns true for the IO-APIC and vTPM MMIO addresses
[1] https://lore.kernel.org/all/20211025122116.264793-1-ltykernel@gmail.com/
[2] https://lore.kernel.org/all/20211213071407.314309-1-ltykernel@gmail.com/
[ bp: Touchups. ]
Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/1679838727-87310-7-git-send-email-mikelley@microsoft.com
hv_get_nested_reg only translates SINT0, resulting in the wrong sint
being registered by nested vmbus.
Fix the issue with new utility function hv_is_sint_reg.
While at it, improve clarity of hv_set_non_nested_register and hv_is_synic_reg.
Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
Reviewed-by: Jinank Jain <jinankjain@linux.microsoft.com>
Link: https://lore.kernel.org/r/1675980172-6851-1-git-send-email-nunodasneves@linux.microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
According to TLFS, in order to communicate to L0 hypervisor there needs
to be an additional bit set in the control register. This communication
is required to perform privileged instructions which can only be
performed by L0 hypervisor. An example of that could be setting up the
VMBus infrastructure.
Signed-off-by: Jinank Jain <jinankjain@linux.microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/24f9d46d5259a688113e6e5e69e21002647f4949.1672639707.git.jinankjain@linux.microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Child partitions are free to allocate SynIC message and event page but in
case of root partition it must use the pages allocated by Microsoft
Hypervisor (MSHV). Base address for these pages can be found using
synthetic MSRs exposed by MSHV. There is a slight difference in those MSRs
for nested vs non-nested root partition.
Signed-off-by: Jinank Jain <jinankjain@linux.microsoft.com>
Reviewed-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/cb951fb1ad6814996fc54f4a255c5841a20a151f.1672639707.git.jinankjain@linux.microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
clocksource/hyperv_timer.h is included into the VDSO build. It includes
asm/mshyperv.h which in turn includes the world and some more. This worked
so far by chance, but any subtle change in the include chain results in a
build breakage because VDSO builds are building user space libraries.
Include asm/hyperv-tlfs.h instead which contains everything what the VDSO
build needs except the hv_get_raw_timer() define. Move this define into a
separate header file, which contains the prerequisites (msr.h) and is
included by clocksource/hyperv_timer.h.
Fixup drivers/hv/vmbus_drv.c which relies on the indirect include of
asm/mshyperv.h.
With that the VDSO build only pulls in the minimum requirements.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/87fsemtut0.ffs@tglx
Hyper-V Isolation VM current code uses sev_es_ghcb_hv_call()
to read/write MSR via GHCB page and depends on the sev code.
This may cause regression when sev code changes interface
design.
The latest SEV-ES code requires to negotiate GHCB version before
reading/writing MSR via GHCB page and sev_es_ghcb_hv_call() doesn't
work for Hyper-V Isolation VM. Add Hyper-V ghcb related implementation
to decouple SEV and Hyper-V code. Negotiate GHCB version in the
hyperv_init() and use the version to communicate with Hyper-V
in the ghcb hv call function.
Fixes: 2ea29c5abb ("x86/sev: Save the negotiated GHCB version")
Signed-off-by: Tianyu Lan <Tianyu.Lan@microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20220614014553.1915929-1-ltykernel@gmail.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQFHBAABCAAxFiEEIbPD0id6easf0xsudhRwX5BBoF4FAmHhw7oTHHdlaS5saXVA
a2VybmVsLm9yZwAKCRB2FHBfkEGgXrjSB/979LV4Dn1PMcFYsSdlFEMeHcjzJdw/
kFnLPXMaPJyfg6QPuf83jxzw9uxw8fcePMdVq/FFBtmVV9fJMAv62B8jaGS1p58c
WnAg+7zsTN+xEoJn+tskSSon8BNMWVrl41zP3K4Ged+5j8UEBk62GB8Orz1qkpwL
fTh3/+xAvczJeD4zZb1dAm4WnmcQJ4vhg45p07jX6owvnwQAikMFl45aSW54I5o8
vAxGzFgdsZ2NtExnRNKh3b3DozA8JUE89KckBSZnDtq4rH8Fyy6Wij56Hc6v6Cml
SUohiNbHX7hsNwit/lxL8wuF97IiA0pQSABobEg3rxfTghTUep51LlaN
=/m4A
-----END PGP SIGNATURE-----
Merge tag 'hyperv-next-signed-20220114' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux
Pull hyperv updates from Wei Liu:
- More patches for Hyper-V isolation VM support (Tianyu Lan)
- Bug fixes and clean-up patches from various people
* tag 'hyperv-next-signed-20220114' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
scsi: storvsc: Fix storvsc_queuecommand() memory leak
x86/hyperv: Properly deal with empty cpumasks in hyperv_flush_tlb_multi()
Drivers: hv: vmbus: Initialize request offers message for Isolation VM
scsi: storvsc: Fix unsigned comparison to zero
swiotlb: Add CONFIG_HAS_IOMEM check around swiotlb_mem_remap()
x86/hyperv: Fix definition of hv_ghcb_pg variable
Drivers: hv: Fix definition of hypercall input & output arg variables
net: netvsc: Add Isolation VM support for netvsc driver
scsi: storvsc: Add Isolation VM support for storvsc driver
hyper-v: Enable swiotlb bounce buffer for Isolation VM
x86/hyper-v: Add hyperv Isolation VM check in the cc_platform_has()
swiotlb: Add swiotlb bounce buffer remap function for HV IVM
Encapsulate arch dependencies in Hyper-V vPCI through a set of
arch-dependent interfaces. Adding these arch specific interfaces will
allow for an implementation for other architectures, such as arm64.
There are no functional changes expected from this patch.
Link: https://lore.kernel.org/r/1641411156-31705-2-git-send-email-sunilmut@linux.microsoft.com
Signed-off-by: Sunil Muthuswamy <sunilmut@microsoft.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Boqun Feng <boqun.feng@gmail.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
The percpu variable hv_ghcb_pg is incorrectly defined. The __percpu
qualifier should be associated with the union hv_ghcb * (i.e.,
a pointer), not with the target of the pointer. This distinction
makes no difference to gcc and the generated code, but sparse
correctly complains. Fix the definition in the interest of
general correctness in addition to making sparse happy.
No functional change.
Fixes: 0cc4f6d9f0 ("x86/hyperv: Initialize GHCB page in Isolation VM")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/1640662315-22260-2-git-send-email-mikelley@microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Hyperv provides GHCB protocol to write Synthetic Interrupt
Controller MSR registers in Isolation VM with AMD SEV SNP
and these registers are emulated by hypervisor directly.
Hyperv requires to write SINTx MSR registers twice. First
writes MSR via GHCB page to communicate with hypervisor
and then writes wrmsr instruction to talk with paravisor
which runs in VMPL0. Guest OS ID MSR also needs to be set
via GHCB page.
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Tianyu Lan <Tianyu.Lan@microsoft.com>
Link: https://lore.kernel.org/r/20211025122116.264793-7-ltykernel@gmail.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Add new hvcall guest address host visibility support to mark
memory visible to host. Call it inside set_memory_decrypted
/encrypted(). Add HYPERVISOR feature check in the
hv_is_isolation_supported() to optimize in non-virtualization
environment.
Acked-by: Dave Hansen <dave.hansen@intel.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Tianyu Lan <Tianyu.Lan@microsoft.com>
Link: https://lore.kernel.org/r/20211025122116.264793-4-ltykernel@gmail.com
[ wei: fix conflicts with tip ]
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Hyperv exposes GHCB page via SEV ES GHCB MSR for SNP guest
to communicate with hypervisor. Map GHCB page for all
cpus to read/write MSR register and submit hvcall request
via ghcb page.
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Tianyu Lan <Tianyu.Lan@microsoft.com>
Link: https://lore.kernel.org/r/20211025122116.264793-2-ltykernel@gmail.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
The code to allocate and initialize the hv_vp_index array is
architecture neutral. Similarly, the code to allocate and
populate the hypercall input and output arg pages is architecture
neutral. Move both sets of code out from arch/x86 and into
utility functions in drivers/hv/hv_common.c that can be shared
by Hyper-V initialization on ARM64.
No functional changes. However, the allocation of the hypercall
input and output arg pages is done differently so that the
size is always the Hyper-V page size, even if not the same as
the guest page size (such as with ARM64's 64K page size).
Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/1626287687-2045-2-git-send-email-mikelley@microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
There is not a consistent pattern for checking Hyper-V hypercall status.
Existing code uses a number of variants. The variants work, but a consistent
pattern would improve the readability of the code, and be more conformant
to what the Hyper-V TLFS says about hypercall status.
Implemented new helper functions hv_result(), hv_result_success(), and
hv_repcomp(). Changed the places where hv_do_hypercall() and related variants
are used to use the helper functions.
Signed-off-by: Joseph Salisbury <joseph.salisbury@microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/1618620183-9967-2-git-send-email-joseph.salisbury@linux.microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
This patch makes no functional changes. It simply moves hv_do_rep_hypercall()
out of arch/x86/include/asm/mshyperv.h and into asm-generic/mshyperv.h
hv_do_rep_hypercall() is architecture independent, so it makes sense that it
should be in the architecture independent mshyperv.h, not in the x86-specific
mshyperv.h.
This is done in preperation for a follow up patch which creates a consistent
pattern for checking Hyper-V hypercall status.
Signed-off-by: Joseph Salisbury <joseph.salisbury@microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/1618620183-9967-1-git-send-email-joseph.salisbury@linux.microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
STIMER0 interrupts are most naturally modeled as per-cpu IRQs. But
because x86/x64 doesn't have per-cpu IRQs, the core STIMER0 interrupt
handling machinery is done in code under arch/x86 and Linux IRQs are
not used. Adding support for ARM64 means adding equivalent code
using per-cpu IRQs under arch/arm64.
A better model is to treat per-cpu IRQs as the normal path (which it is
for modern architectures), and the x86/x64 path as the exception. Do this
by incorporating standard Linux per-cpu IRQ allocation into the main
SITMER0 driver code, and bypass it in the x86/x64 exception case. For
x86/x64, special case code is retained under arch/x86, but no STIMER0
interrupt handling code is needed under arch/arm64.
No functional change.
Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/1614721102-2241-11-git-send-email-mikelley@microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
While the Hyper-V Reference TSC code is architecture neutral, the
pv_ops.time.sched_clock() function is implemented for x86/x64, but not
for ARM64. Current code calls a utility function under arch/x86 (and
coming, under arch/arm64) to handle the difference.
Change this approach to handle the difference inline based on whether
GENERIC_SCHED_CLOCK is present. The new approach removes code under
arch/* since the difference is tied more to the specifics of the Linux
implementation than to the architecture.
No functional change.
Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Reviewed-by: Boqun Feng <boqun.feng@gmail.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/1614721102-2241-9-git-send-email-mikelley@microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
While the driver for the Hyper-V Reference TSC and STIMERs is architecture
neutral, vDSO is implemented for x86/x64, but not for ARM64. Current code
calls into utility functions under arch/x86 (and coming, under arch/arm64)
to handle the difference.
Change this approach to handle the difference inline based on whether
VDSO_CLOCK_MODE_HVCLOCK is present. The new approach removes code under
arch/* since the difference is tied more to the specifics of the Linux
implementation than to the architecture.
No functional change.
Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Reviewed-by: Boqun Feng <boqun.feng@gmail.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/1614721102-2241-8-git-send-email-mikelley@microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
VMbus interrupts are most naturally modelled as per-cpu IRQs. But
because x86/x64 doesn't have per-cpu IRQs, the core VMbus interrupt
handling machinery is done in code under arch/x86 and Linux IRQs are
not used. Adding support for ARM64 means adding equivalent code
using per-cpu IRQs under arch/arm64.
A better model is to treat per-cpu IRQs as the normal path (which it is
for modern architectures), and the x86/x64 path as the exception. Do this
by incorporating standard Linux per-cpu IRQ allocation into the main VMbus
driver, and bypassing it in the x86/x64 exception case. For x86/x64,
special case code is retained under arch/x86, but no VMbus interrupt
handling code is needed under arch/arm64.
No functional change.
Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Reviewed-by: Boqun Feng <boqun.feng@gmail.com>
Link: https://lore.kernel.org/r/1614721102-2241-7-git-send-email-mikelley@microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
On x86/x64, Hyper-V provides a flag to indicate auto EOI functionality,
but it doesn't on ARM64. Handle this quirk inline instead of calling
into code under arch/x86 (and coming, under arch/arm64).
No functional change.
Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Reviewed-by: Boqun Feng <boqun.feng@gmail.com>
Link: https://lore.kernel.org/r/1614721102-2241-6-git-send-email-mikelley@microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Current code defines a separate get and set macro for each Hyper-V
synthetic MSR used by the VMbus driver. Furthermore, the get macro
can't be converted to a standard function because the second argument
is modified in place, which is somewhat bad form.
Redo this by providing a single get and a single set function that
take a parameter specifying the MSR to be operated on. Fixup usage
of the get function. Calling locations are no more complex than before,
but the code under arch/x86 and the upcoming code under arch/arm64
is significantly simplified.
Also standardize the names of Hyper-V synthetic MSRs that are
architecture neutral. But keep the old x86-specific names as aliases
that can be removed later when all references (particularly in KVM
code) have been cleaned up in a separate patch series.
No functional change.
Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Reviewed-by: Boqun Feng <boqun.feng@gmail.com>
Link: https://lore.kernel.org/r/1614721102-2241-4-git-send-email-mikelley@microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
The Hyper-V page allocator functions are implemented in an architecture
neutral way. Move them into the architecture neutral VMbus module so
a separate implementation for ARM64 is not needed.
No functional change.
Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Reviewed-by: Boqun Feng <boqun.feng@gmail.com>
Link: https://lore.kernel.org/r/1614721102-2241-2-git-send-email-mikelley@microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Just like MSI/MSI-X, IO-APIC interrupts are remapped by Microsoft
Hypervisor when Linux runs as the root partition. Implement an IRQ
domain to handle mapping and unmapping of IO-APIC interrupts.
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Acked-by: Joerg Roedel <joro@8bytes.org>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20210203150435.27941-17-wei.liu@kernel.org
When Linux runs as the root partition on Microsoft Hypervisor, its
interrupts are remapped. Linux will need to explicitly map and unmap
interrupts for hardware.
Implement an MSI domain to issue the correct hypercalls. And initialize
this irq domain as the default MSI irq domain.
Signed-off-by: Sunil Muthuswamy <sunilmut@microsoft.com>
Co-Developed-by: Sunil Muthuswamy <sunilmut@microsoft.com>
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20210203150435.27941-16-wei.liu@kernel.org
We will soon need to access fields inside the MSI address and MSI data
fields. Introduce hv_msi_address_register and hv_msi_data_register.
Fix up one user of hv_msi_entry in mshyperv.h.
No functional change expected.
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20210203150435.27941-12-wei.liu@kernel.org
They are used to deposit pages into Microsoft Hypervisor and bring up
logical and virtual processors.
Signed-off-by: Lillian Grassin-Drake <ligrassi@microsoft.com>
Signed-off-by: Sunil Muthuswamy <sunilmut@microsoft.com>
Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
Co-Developed-by: Lillian Grassin-Drake <ligrassi@microsoft.com>
Co-Developed-by: Sunil Muthuswamy <sunilmut@microsoft.com>
Co-Developed-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20210203150435.27941-10-wei.liu@kernel.org
We will need the partition ID for executing some hypercalls later.
Signed-off-by: Lillian Grassin-Drake <ligrassi@microsoft.com>
Co-Developed-by: Sunil Muthuswamy <sunilmut@microsoft.com>
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20210203150435.27941-7-wei.liu@kernel.org
When Linux runs as the root partition, it will need to make hypercalls
which return data from the hypervisor.
Allocate pages for storing results when Linux runs as the root
partition.
Signed-off-by: Lillian Grassin-Drake <ligrassi@microsoft.com>
Co-Developed-by: Lillian Grassin-Drake <ligrassi@microsoft.com>
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20210203150435.27941-6-wei.liu@kernel.org
For now we can use the privilege flag to check. Stash the value to be
used later.
Put in a bunch of defines for future use when we want to have more
fine-grained detection.
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Reviewed-by: Pavel Tatashin <pasha.tatashin@soleen.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20210203150435.27941-3-wei.liu@kernel.org