1
0
Fork 0
mirror of synced 2025-03-06 20:59:54 +01:00
Commit graph

17961 commits

Author SHA1 Message Date
Ingo Molnar
9cea0d46f5 Merge branch 'x86/cpu' into x86/core, to resolve conflicts
Conflicts:
	arch/x86/include/asm/cpufeatures.h

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2022-03-15 12:52:51 +01:00
Ingo Molnar
8c490b42fe Merge branch 'x86/pasid' into x86/core, to resolve conflicts
Conflicts:
	tools/objtool/arch/x86/decode.c

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2022-03-15 12:50:59 +01:00
Masahiro Yamada
83a44a4f47 x86: Remove toolchain check for X32 ABI capability
Commit 0bf6276392 ("x32: Warn and disable rather than error if
binutils too old") added a small test in arch/x86/Makefile because
binutils 2.22 or newer is needed to properly support elf32-x86-64. This
check is no longer necessary, as the minimum supported version of
binutils is 2.23, which is enforced at configuration time with
scripts/min-tool-version.sh.

Remove this check and replace all uses of CONFIG_X86_X32 with
CONFIG_X86_X32_ABI, as two symbols are no longer necessary.

[nathan: Rebase, fix up a few places where CONFIG_X86_X32 was still
         used, and simplify commit message to satisfy -tip requirements]

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220314194842.3452-2-nathan@kernel.org
2022-03-15 10:32:48 +01:00
Peter Zijlstra
ed53a0d971 x86/alternative: Use .ibt_endbr_seal to seal indirect calls
Objtool's --ibt option generates .ibt_endbr_seal which lists
superfluous ENDBR instructions. That is those instructions for which
the function is never indirectly called.

Overwrite these ENDBR instructions with a NOP4 such that these
function can never be indirect called, reducing the number of viable
ENDBR targets in the kernel.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154319.822545231@infradead.org
2022-03-15 10:32:47 +01:00
Peter Zijlstra
89bc853eae objtool: Find unused ENDBR instructions
Find all ENDBR instructions which are never referenced and stick them
in a section such that the kernel can poison them, sealing the
functions from ever being an indirect call target.

This removes about 1-in-4 ENDBR instructions.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154319.763643193@infradead.org
2022-03-15 10:32:47 +01:00
Peter Zijlstra
f9cdf7ca57 x86: Mark stop_this_cpu() __noreturn
vmlinux.o: warning: objtool: smp_stop_nmi_callback()+0x2b: unreachable instruction

0000 0000000000047cf0 <smp_stop_nmi_callback>:
...
0026    47d16:  e8 00 00 00 00          call   47d1b <smp_stop_nmi_callback+0x2b>       47d17: R_X86_64_PLT32   stop_this_cpu-0x4
002b    47d1b:  b8 01 00 00 00          mov    $0x1,%eax

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154319.290905453@infradead.org
2022-03-15 10:32:43 +01:00
Peter Zijlstra
e8d61bdf0f x86/ibt,sev: Annotations
No IBT on AMD so far.. probably correct, who knows.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154318.995109889@infradead.org
2022-03-15 10:32:41 +01:00
Peter Zijlstra
3215de84c0 x86/ibt,ftrace: Annotate ftrace code patching
These are code patching sites, not indirect targets.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154318.936599479@infradead.org
2022-03-15 10:32:41 +01:00
Peter Zijlstra
3e3f069504 x86/ibt: Annotate text references
Annotate away some of the generic code references. This is things
where we take the address of a symbol for exception handling or return
addresses (eg. context switch).

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154318.877758523@infradead.org
2022-03-15 10:32:40 +01:00
Peter Zijlstra
fe379fa4d1 x86/ibt: Disable IBT around firmware
Assume firmware isn't IBT clean and disable it across calls.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154318.759989383@infradead.org
2022-03-15 10:32:40 +01:00
Peter Zijlstra
99c95c5d4f x86/alternative: Simplify int3_selftest_ip
Similar to ibt_selftest_ip, apply the same pattern.

Suggested-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154318.700456643@infradead.org
2022-03-15 10:32:40 +01:00
Peter Zijlstra
af22700390 x86/ibt,kexec: Disable CET on kexec
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154318.641454603@infradead.org
2022-03-15 10:32:39 +01:00
Peter Zijlstra
991625f3dd x86/ibt: Add IBT feature, MSR and #CP handling
The bits required to make the hardware go.. Of note is that, provided
the syscall entry points are covered with ENDBR, #CP doesn't need to
be an IST because we'll never hit the syscall gap.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154318.582331711@infradead.org
2022-03-15 10:32:39 +01:00
Peter Zijlstra
cc66bb9145 x86/ibt,kprobes: Cure sym+0 equals fentry woes
In order to allow kprobes to skip the ENDBR instructions at sym+0 for
X86_KERNEL_IBT builds, change _kprobe_addr() to take an architecture
callback to inspect the function at hand and modify the offset if
needed.

This streamlines the existing interface to cover more cases and
require less hooks. Once PowerPC gets fully converted there will only
be the one arch hook.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154318.405947704@infradead.org
2022-03-15 10:32:38 +01:00
Peter Zijlstra
e52fc2cf3f x86/ibt,ftrace: Make function-graph play nice
Return trampoline must not use indirect branch to return; while this
preserves the RSB, it is fundamentally incompatible with IBT. Instead
use a retpoline like ROP gadget that defeats IBT while not unbalancing
the RSB.

And since ftrace_stub is no longer a plain RET, don't use it to copy
from. Since RET is a trivial instruction, poke it directly.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154318.347296408@infradead.org
2022-03-15 10:32:37 +01:00
Peter Zijlstra
aebfd12521 x86/ibt,ftrace: Search for __fentry__ location
Currently a lot of ftrace code assumes __fentry__ is at sym+0. However
with Intel IBT enabled the first instruction of a function will most
likely be ENDBR.

Change ftrace_location() to not only return the __fentry__ location
when called for the __fentry__ location, but also when called for the
sym+0 location.

Then audit/update all callsites of this function to consistently use
these new semantics.

Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154318.227581603@infradead.org
2022-03-15 10:32:37 +01:00
Peter Zijlstra
c3b037917c x86/ibt,paravirt: Sprinkle ENDBR
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154318.051635891@infradead.org
2022-03-15 10:32:36 +01:00
Peter Zijlstra
8f93402b92 x86/ibt,entry: Sprinkle ENDBR dust
Kernel entry points should be having ENDBR on for IBT configs.

The SYSCALL entry points are found through taking their respective
address in order to program them in the MSRs, while the exception
entry points are found through UNWIND_HINT_IRET_REGS.

The rule is that any UNWIND_HINT_IRET_REGS at sym+0 should have an
ENDBR, see the later objtool ibt validation patch.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154317.933157479@infradead.org
2022-03-15 10:32:35 +01:00
Peter Zijlstra
5b2fc51576 x86/ibt,xen: Sprinkle the ENDBR
Even though Xen currently doesn't advertise IBT, prepare for when it
will eventually do so and sprinkle the ENDBR dust accordingly.

Even though most of the entry points are IRET like, the CPL0
Hypervisor can set WAIT-FOR-ENDBR and demand ENDBR at these sites.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154317.873919996@infradead.org
2022-03-15 10:32:35 +01:00
Peter Zijlstra
8b87d8cec1 x86/entry,xen: Early rewrite of restore_regs_and_return_to_kernel()
By doing an early rewrite of 'jmp native_iret` in
restore_regs_and_return_to_kernel() we can get rid of the last
INTERRUPT_RETURN user and paravirt_iret.

Suggested-by: Andrew Cooper <Andrew.Cooper3@citrix.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154317.815039833@infradead.org
2022-03-15 10:32:34 +01:00
Peter Zijlstra
ba27d1a808 x86/ibt,paravirt: Use text_gen_insn() for paravirt_patch()
Less duplication is more better.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154317.697253958@infradead.org
2022-03-15 10:32:34 +01:00
Ingo Molnar
ccdbf33c23 Linux 5.17-rc8
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmIuUskeHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGCFkH/2n3mpGXuITp0ZXE
 TNrpbdZOof5SgLw+w7THswXuo6m5yRGNKQs9fvIvDD8Vf7/OdQQfPOmF1cIE5+nk
 wcz6aHKbdrok8Jql2qjJqWXZ5xbGj6qywg3zZrwOUsCKFP5p+AjBJcmZOsvQHjSp
 ASODy1moOlK+nO52TrMaJw74a8xQPmQiNa+T2P+FedEYjlcRH/c7hLJ7GEnL6+cC
 /R4bATZq3tiInbTBlkC0hR0iVNgRXwXNyv9PEXrYYYHnekh8G1mgSNf06iejLcsG
 aAYsW9NyPxu8zPhhHNx79K9o8BMtxGD4YQpsfdfIEnf9Q3euqAKe2evRWqHHlDms
 RuSCtsc=
 =M9Nc
 -----END PGP SIGNATURE-----

Merge tag 'v5.17-rc8' into sched/core, to pick up fixes

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2022-03-15 10:28:12 +01:00
Jani Nikula
5f1b97cb9a x86/gpu: include drm/i915_pciids.h directly in early quirks
early-quirks.c is the only user of drm/i915_drm.h that also needs
drm/i915_pciids.h. Include the masses of PCI ID macros only where
needed.

Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: linux-pci@vger.kernel.org
Cc: x86@kernel.org
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220311100639.114685-1-jani.nikula@intel.com
2022-03-14 17:57:00 +02:00
Linus Torvalds
f0e18b03fc - Free shmem backing storage for SGX enclave pages when those are
swapped back into EPC memory
 
 - Prevent do_int3() from being kprobed, to avoid recursion
 
 - Remap setup_data and setup_indirect structures properly when accessing
 their members
 
 - Correct the alternatives patching order for modules too
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmItzJgACgkQEsHwGGHe
 VUqaow/8C115xuZEBn+iT+adcQxbqrg3S2en/Hq0aJOEhkNkbOhgAW0OWHvj7Gs3
 +2taD35MqzneEOfa0Gv46600V4+SV5K5NAFndr4PA2FVgIw01rEQios2oc4QSQBP
 PVJgvGyIMpN71ODKTiZ8w4ihp3J7MWDkCP1z4hbO/lfM4tOXcYzh2Lv1fE8hHr5b
 qFtPDyYgfEKUVFa+sv2sE1cJw670UFDcFqGAIjxUUm0r78GKmPz08gZm9YiBTJgV
 jrxySdpAh/eaPeHNfFH9RzAD2ZGZppgIkPCp33ZdrMEhnZmwLz7vc76BMbkD2P6w
 1fBmBZ5F8yOMaaLHSGh4Ek5Gs3p9DjmaZdEWwz+yiIe1RFLKyOQu6gsmGbAyuQx4
 KSfPFnfkOfw/7cz6BSp3Sh6zgrGPqloIVcHkWRth/LJZSV/fVgM8bPg3VLJP6WFi
 o4WTcNAq/fNMAmGwtIVpTUW/QJafXvOauKkDGQkMQ87U68QSh6uDrvrvMHPF8W+Y
 SPcYrdsAPagLxq0GCCQ6doSvBjWNTolXfTnfAoATZpae0URmrvu9ddgUbIlgeQWY
 n/rK+cKk+iuLTEZC55+v5OALwEMOM3Tuz4Ghko8re0pkD/kE61m3Az6w5sKN3Inc
 c21tvO/dxHhAnHV+34d2LM27PU4qoFdVO2mPup702x68XT+X0/g=
 =YLph
 -----END PGP SIGNATURE-----

Merge tag 'x86_urgent_for_v5.17_rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Borislav Petkov:

 - Free shmem backing storage for SGX enclave pages when those are
   swapped back into EPC memory

 - Prevent do_int3() from being kprobed, to avoid recursion

 - Remap setup_data and setup_indirect structures properly when
   accessing their members

 - Correct the alternatives patching order for modules too

* tag 'x86_urgent_for_v5.17_rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/sgx: Free backing memory after faulting the enclave page
  x86/traps: Mark do_int3() NOKPROBE_SYMBOL
  x86/boot: Add setup_indirect support in early_memremap_is_setup_data()
  x86/boot: Fix memremap of setup_indirect structures
  x86/module: Fix the paravirt vs alternative order
2022-03-13 10:36:38 -07:00
Jarkko Sakkinen
08999b2489 x86/sgx: Free backing memory after faulting the enclave page
There is a limited amount of SGX memory (EPC) on each system.  When that
memory is used up, SGX has its own swapping mechanism which is similar
in concept but totally separate from the core mm/* code.  Instead of
swapping to disk, SGX swaps from EPC to normal RAM.  That normal RAM
comes from a shared memory pseudo-file and can itself be swapped by the
core mm code.  There is a hierarchy like this:

	EPC <-> shmem <-> disk

After data is swapped back in from shmem to EPC, the shmem backing
storage needs to be freed.  Currently, the backing shmem is not freed.
This effectively wastes the shmem while the enclave is running.  The
memory is recovered when the enclave is destroyed and the backing
storage freed.

Sort this out by freeing memory with shmem_truncate_range(), as soon as
a page is faulted back to the EPC.  In addition, free the memory for
PCMD pages as soon as all PCMD's in a page have been marked as unused
by zeroing its contents.

Cc: stable@vger.kernel.org
Fixes: 1728ab54b4 ("x86/sgx: Add a page reclaimer")
Reported-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lkml.kernel.org/r/20220303223859.273187-1-jarkko@kernel.org
2022-03-11 10:31:06 -08:00
Li Huafei
a365a65f9c x86/traps: Mark do_int3() NOKPROBE_SYMBOL
Since kprobe_int3_handler() is called in do_int3(), probing do_int3()
can cause a breakpoint recursion and crash the kernel. Therefore,
do_int3() should be marked as NOKPROBE_SYMBOL.

Fixes: 21e28290b3 ("x86/traps: Split int3 handler up")
Signed-off-by: Li Huafei <lihuafei1@huawei.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20220310120915.63349-1-lihuafei1@huawei.com
2022-03-11 19:19:30 +01:00
Jakub Kicinski
1e8a3f0d2a Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
net/dsa/dsa2.c
  commit afb3cc1a39 ("net: dsa: unlock the rtnl_mutex when dsa_master_setup() fails")
  commit e83d565378 ("net: dsa: replay master state events in dsa_tree_{setup,teardown}_master")
https://lore.kernel.org/all/20220307101436.7ae87da0@canb.auug.org.au/

drivers/net/ethernet/intel/ice/ice.h
  commit 97b0129146 ("ice: Fix error with handling of bonding MTU")
  commit 43113ff734 ("ice: add TTY for GNSS module for E810T device")
https://lore.kernel.org/all/20220310112843.3233bcf1@canb.auug.org.au/

drivers/staging/gdm724x/gdm_lte.c
  commit fc7f750dc9 ("staging: gdm724x: fix use after free in gdm_lte_rx()")
  commit 4bcc4249b4 ("staging: Use netif_rx().")
https://lore.kernel.org/all/20220308111043.1018a59d@canb.auug.org.au/

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-10 17:16:56 -08:00
Eric W. Biederman
355f841a3f tracehook: Remove tracehook.h
Now that all of the definitions have moved out of tracehook.h into
ptrace.h, sched/signal.h, resume_user_mode.h there is nothing left in
tracehook.h so remove it.

Update the few files that were depending upon tracehook.h to bring in
definitions to use the headers they need directly.

Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lkml.kernel.org/r/20220309162454.123006-13-ebiederm@xmission.com
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2022-03-10 16:51:51 -06:00
Eric W. Biederman
8ba62d3794 task_work: Call tracehook_notify_signal from get_signal on all architectures
Always handle TIF_NOTIFY_SIGNAL in get_signal.  With commit 35d0b389f3
("task_work: unconditionally run task_work from get_signal()") always
calling task_work_run all of the work of tracehook_notify_signal is
already happening except clearing TIF_NOTIFY_SIGNAL.

Factor clear_notify_signal out of tracehook_notify_signal and use it in
get_signal so that get_signal only needs one call of task_work_run.

To keep the semantics in sync update xfer_to_guest_mode_work (which
does not call get_signal) to call tracehook_notify_signal if either
_TIF_SIGPENDING or _TIF_NOTIFY_SIGNAL.

Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lkml.kernel.org/r/20220309162454.123006-8-ebiederm@xmission.com
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2022-03-10 16:51:36 -06:00
Ross Philipson
7228918b34 x86/boot: Fix memremap of setup_indirect structures
As documented, the setup_indirect structure is nested inside
the setup_data structures in the setup_data list. The code currently
accesses the fields inside the setup_indirect structure but only
the sizeof(struct setup_data) is being memremapped. No crash
occurred but this is just due to how the area is remapped under the
covers.

Properly memremap both the setup_data and setup_indirect structures
in these cases before accessing them.

Fixes: b3c72fc9a7 ("x86/boot: Introduce setup_indirect")
Signed-off-by: Ross Philipson <ross.philipson@oracle.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/1645668456-22036-2-git-send-email-ross.philipson@oracle.com
2022-03-09 12:49:44 +01:00
Michael Kelley
eeda29db98 x86/hyperv: Output host build info as normal Windows version number
Hyper-V provides host version number information that is output in
text form by a Linux guest when it boots. For whatever reason, the
formatting has historically been non-standard. Change it to output
in normal Windows version format for better readability.

Similar code for ARM64 guests already outputs in normal Windows
version format.

Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/1646767364-2234-1-git-send-email-mikelley@microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
2022-03-08 20:44:50 +00:00
Mark Cilissen
e702196bf8 ACPI / x86: Work around broken XSDT on Advantech DAC-BJ01 board
On this board the ACPI RSDP structure points to both a RSDT and an XSDT,
but the XSDT points to a truncated FADT. This causes all sorts of trouble
and usually a complete failure to boot after the following error occurs:

  ACPI Error: Unsupported address space: 0x20 (*/hwregs-*)
  ACPI Error: AE_SUPPORT, Unable to initialize fixed events (*/evevent-*)
  ACPI: Unable to start ACPI Interpreter

This leaves the ACPI implementation in such a broken state that subsequent
kernel subsystem initialisations go wrong, resulting in among others
mismapped PCI memory, SATA and USB enumeration failures, and freezes.

As this is an older embedded platform that will likely never see any BIOS
updates to address this issue and its default shipping OS only complies to
ACPI 1.0, work around this by forcing `acpi=rsdt`. This patch, applied on
top of Linux 5.10.102, was confirmed on real hardware to fix the issue.

Signed-off-by: Mark Cilissen <mark@yotsuba.nl>
Cc: All applicable <stable@vger.kernel.org>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-03-08 19:52:22 +01:00
Huang Rui
eb5616d4ad x86/ACPI: CPPC: Move init_freq_invariance_cppc() into x86 CPPC
The init_freq_invariance_cppc code actually doesn't need the SMP
functionality. So setting the CONFIG_SMP as the check condition for
init_freq_invariance_cppc may cause the confusion to misunderstand the
CPPC. And the x86 CPPC file is better space to store the CPPC related
functions, while the init_freq_invariance_cppc is out of smpboot, that
means, the CONFIG_SMP won't be mandatory condition any more. And It's more
clear than before.

Signed-off-by: Huang Rui <ray.huang@amd.com>
[ rjw: Subject adjustment ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-03-08 19:16:43 +01:00
Huang Rui
666f6ecf35 x86: Expose init_freq_invariance() to topology header
The function init_freq_invariance will be used on x86 CPPC, so expose it in
the topology header.

Signed-off-by: Huang Rui <ray.huang@amd.com>
[ rjw: Subject adjustment ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-03-08 19:16:43 +01:00
Huang Rui
82d8936914 x86/ACPI: CPPC: Move AMD maximum frequency ratio setting function into x86 CPPC
The AMD maximum frequency ratio setting function depends on CPPC, so the
x86 CPPC implementation file is better space for this function.

Signed-off-by: Huang Rui <ray.huang@amd.com>
[ rjw: Subject adjustment ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-03-08 19:16:43 +01:00
Huang Rui
fd8af343a2 x86/ACPI: CPPC: Rename cppc_msr.c to cppc.c
Rename the cppc_msr.c to cppc.c in x86 ACPI, that expects to use this file
to cover more function implementation for ACPI CPPC beside MSR helpers.
Naming as "cppc" is more straightforward as one of the functionalities
under ACPI subsystem.

Signed-off-by: Huang Rui <ray.huang@amd.com>
[ rjw: Subject ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-03-08 19:16:43 +01:00
Peter Zijlstra
5adf349439 x86/module: Fix the paravirt vs alternative order
Ever since commit

  4e6292114c ("x86/paravirt: Add new features for paravirt patching")

there is an ordering dependency between patching paravirt ops and
patching alternatives, the module loader still violates this.

Fixes: 4e6292114c ("x86/paravirt: Add new features for paravirt patching")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20220303112825.068773913@infradead.org
2022-03-08 14:15:25 +01:00
Linus Torvalds
4a01e748a5 - Mitigate Spectre v2-type Branch History Buffer attacks on machines
which support eIBRS, i.e., the hardware-assisted speculation restriction
 after it has been shown that such machines are vulnerable even with the
 hardware mitigation.
 
 - Do not use the default LFENCE-based Spectre v2 mitigation on AMD as it
 is insufficient to mitigate such attacks. Instead, switch to retpolines
 on all AMD by default.
 
 - Update the docs and add some warnings for the obviously vulnerable
 cmdline configurations.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmIkktUACgkQEsHwGGHe
 VUo7ZQ/+O4hzL/tHY0V/ekkDxCrJ3q3Hp+DcxUl2ee5PC3Qgxv1Z1waH6ppK8jQs
 marAGr7FYbvzY039ON7irxhpSIckBCpx9tM2F43zsPxxY8EdxGojkHbmaqso5HtW
 l3/O28AcZYoKN/fF8rRAIJy4hrTVascKrNJ2fOiYWYBT62ZIoPm0FusgXbKTZPD+
 gT7iUMoyPjBnKdWDT9L6kKOxDF9TivX1Y6JdDHbnnBsgRkeFatkeq9BJ93M73q63
 Ziq9c8ZcEXyKez+cGFCfXM7+pNYmfsiL48lilTyf+v+GXahDJQOkFw39j5zXEALm
 Nk6yB3PRQ74pEwm5WbK7KO8iwPpblmnDB978mfUcpk+9xWJD8pyoUcItAmCBsXh1
 LjIImYPqL6YihUb9udh+PEDISsfzWNzr4T+kgW9/yXXG4ZmGy3TLInhTK+rNAxJa
 EshWZExEZj6yJvt83Vu08W9fppYJq976tJvl8LWOYthaxqY7IQz0q7mYd799yxk0
 MLPqvZP1+4pHzqn2c9yeHgrwHwMmoqcyMx6B3EA5maYQPdlT7Fk9RCBeCdIA/ieF
 OgGxy1WwMH+cvUa5MaBy3Y32LeYU3bUJh0yPFq/7BxEYGG9PJtLhg2xTo1Ui8F1d
 fKrcSFcjZKVJ9UE5HaqOcp4ka+Q220I9IDGURXkAFQlnOU7X7CE=
 =Athd
 -----END PGP SIGNATURE-----

Merge tag 'x86_bugs_for_v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 spectre fixes from Borislav Petkov:

 - Mitigate Spectre v2-type Branch History Buffer attacks on machines
   which support eIBRS, i.e., the hardware-assisted speculation
   restriction after it has been shown that such machines are vulnerable
   even with the hardware mitigation.

 - Do not use the default LFENCE-based Spectre v2 mitigation on AMD as
   it is insufficient to mitigate such attacks. Instead, switch to
   retpolines on all AMD by default.

 - Update the docs and add some warnings for the obviously vulnerable
   cmdline configurations.

* tag 'x86_bugs_for_v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/speculation: Warn about eIBRS + LFENCE + Unprivileged eBPF + SMT
  x86/speculation: Warn about Spectre v2 LFENCE mitigation
  x86/speculation: Update link to AMD speculation whitepaper
  x86/speculation: Use generic retpoline by default on AMD
  x86/speculation: Include unprivileged eBPF status in Spectre v2 mitigation reporting
  Documentation/hw-vuln: Update spectre doc
  x86/speculation: Add eIBRS + Retpoline options
  x86/speculation: Rename RETPOLINE_AMD to RETPOLINE_LFENCE
2022-03-07 17:29:47 -08:00
Linus Torvalds
f81664f760 x86 guest:
* Tweaks to the paravirtualization code, to avoid using them
 when they're pointless or harmful
 
 x86 host:
 
 * Fix for SRCU lockdep splat
 
 * Brown paper bag fix for the propagation of errno
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmIkkdsUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroP15Qf7B8BXNMlNkret5WN/4pGf06gNdIY6
 ZqC8t/Lx1+fCkzGk+VtAw0bxRscOF4z1XzvfywO5ZI5bxQB/b2xTyBkVY90SqhsB
 shug5QpikejpmvVZJXxwD3+loCUah2T6FUT6QJa0sKVhW+XiqOva8fAmYLG5agaa
 VGvqFXTXiVmbiw/O9ZI/CfUC0WNrn+I1iDO+oGWyhv/22tePxGCizVczRFJn6DAD
 Vh5P6AfOqXjmzdpUeOiU544FQZPHAZehb7/xYc0T9GSW4fPnTmHwRzwhUqgJnx7d
 3E+eWGwny+Q/OrpKf7SbxtB65yn7lHRmdN/YtCHygl4sjs6CdjSPY8/9jQ==
 =PPz1
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
 "x86 guest:

   - Tweaks to the paravirtualization code, to avoid using them when
     they're pointless or harmful

  x86 host:

   - Fix for SRCU lockdep splat

   - Brown paper bag fix for the propagation of errno"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: x86: pull kvm->srcu read-side to kvm_arch_vcpu_ioctl_run
  KVM: x86/mmu: Passing up the error state of mmu_alloc_shadow_roots()
  KVM: x86: Yield to IPI target vCPU only if it is busy
  x86/kvmclock: Fix Hyper-V Isolated VM's boot issue when vCPUs > 64
  x86/kvm: Don't waste memory if kvmclock is disabled
  x86/kvm: Don't use PV TLB/yield when mwait is advertised
2022-03-06 12:08:42 -08:00
Josh Poimboeuf
0de05d056a x86/speculation: Warn about eIBRS + LFENCE + Unprivileged eBPF + SMT
The commit

   44a3918c82 ("x86/speculation: Include unprivileged eBPF status in Spectre v2 mitigation reporting")

added a warning for the "eIBRS + unprivileged eBPF" combination, which
has been shown to be vulnerable against Spectre v2 BHB-based attacks.

However, there's no warning about the "eIBRS + LFENCE retpoline +
unprivileged eBPF" combo. The LFENCE adds more protection by shortening
the speculation window after a mispredicted branch. That makes an attack
significantly more difficult, even with unprivileged eBPF. So at least
for now the logic doesn't warn about that combination.

But if you then add SMT into the mix, the SMT attack angle weakens the
effectiveness of the LFENCE considerably.

So extend the "eIBRS + unprivileged eBPF" warning to also include the
"eIBRS + LFENCE + unprivileged eBPF + SMT" case.

  [ bp: Massage commit message. ]

Suggested-by: Alyssa Milburn <alyssa.milburn@linux.intel.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
2022-03-05 09:30:47 +01:00
Josh Poimboeuf
eafd987d4a x86/speculation: Warn about Spectre v2 LFENCE mitigation
With:

  f8a66d608a ("x86,bugs: Unconditionally allow spectre_v2=retpoline,amd")

it became possible to enable the LFENCE "retpoline" on Intel. However,
Intel doesn't recommend it, as it has some weaknesses compared to
retpoline.

Now AMD doesn't recommend it either.

It can still be left available as a cmdline option. It's faster than
retpoline but is weaker in certain scenarios -- particularly SMT, but
even non-SMT may be vulnerable in some cases.

So just unconditionally warn if the user requests it on the cmdline.

  [ bp: Massage commit message. ]

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
2022-03-05 09:16:24 +01:00
Jakub Kicinski
80901bff81 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
net/batman-adv/hard-interface.c
  commit 690bb6fb64 ("batman-adv: Request iflink once in batadv-on-batadv check")
  commit 6ee3c393ee ("batman-adv: Demote batadv-on-batadv skip error message")
https://lore.kernel.org/all/20220302163049.101957-1-sw@simonwunderlich.de/

net/smc/af_smc.c
  commit 4d08b7b57e ("net/smc: Fix cleanup when register ULP fails")
  commit 462791bbfa ("net/smc: add sysctl interface for SMC")
https://lore.kernel.org/all/20220302112209.355def40@canb.auug.org.au/

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-03 11:55:12 -08:00
Kim Phillips
244d00b5dd x86/speculation: Use generic retpoline by default on AMD
AMD retpoline may be susceptible to speculation. The speculation
execution window for an incorrect indirect branch prediction using
LFENCE/JMP sequence may potentially be large enough to allow
exploitation using Spectre V2.

By default, don't use retpoline,lfence on AMD.  Instead, use the
generic retpoline.

Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
2022-02-28 18:37:08 +01:00
Greg Kroah-Hartman
4a248f85b3 Merge 5.17-rc6 into driver-core-next
We need the driver core fix in here as well for future changes.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-02-28 07:45:41 +01:00
Dave Airlie
6c64ae228f Linux 5.17-rc6
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmIb/PEeHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGugMH/R8icwG5gOjAuxBr
 fuz9032ECVFS36Cy3ps9Eqgf5EdS4G6yz3joM4aNtp4B7e5FzI9lzBaHS8OAguNL
 y7puFtBr9CywsnniJumZzciB9pEHmF/yyEKfMlRZA3JsRfLDacFstETp+duJnXoA
 +s49IWsy1ot5zoherhDXcFLqDoFhLVU4hYwE1xpLpW/cllqmSnW8SDZMtWW1Ui/B
 Of7zqINR4zBMmRjP4ymGq/ZrPWlFyWLdtOo0xxVoAQkeMgm33kfaaeGzfyK25FqR
 JDGUZ7lkKvwz3PYh2hqJ7dc5K+vhJ18I+F1UmOiL6QAAUF/k8jq7i3Qaf6MKqh2Z
 2v+A5k0=
 =Hiar
 -----END PGP SIGNATURE-----

Backmerge tag 'v5.17-rc6' into drm-next

This backmerges v5.17-rc6 so I can merge some amdgpu and some tegra changes on top.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2022-02-28 14:57:14 +10:00
Li RongQing
9ee83635d8 KVM: x86: Yield to IPI target vCPU only if it is busy
When sending a call-function IPI-many to vCPUs, yield to the
IPI target vCPU which is marked as preempted.

but when emulating HLT, an idling vCPU will be voluntarily
scheduled out and mark as preempted from the guest kernel
perspective. yielding to idle vCPU is pointless and increase
unnecessary vmexit, maybe miss the true preempted vCPU

so yield to IPI target vCPU only if vCPU is busy and preempted

Signed-off-by: Li RongQing <lirongqing@baidu.com>
Message-Id: <1644380201-29423-1-git-send-email-lirongqing@baidu.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-02-25 10:09:35 -05:00
Dexuan Cui
92e68cc558 x86/kvmclock: Fix Hyper-V Isolated VM's boot issue when vCPUs > 64
When Linux runs as an Isolated VM on Hyper-V, it supports AMD SEV-SNP
but it's partially enlightened, i.e. cc_platform_has(
CC_ATTR_GUEST_MEM_ENCRYPT) is true but sev_active() is false.

Commit 4d96f91091 per se is good, but with it now
kvm_setup_vsyscall_timeinfo() -> kvmclock_init_mem() calls
set_memory_decrypted(), and later gets stuck when trying to zere out
the pages pointed by 'hvclock_mem', if Linux runs as an Isolated VM on
Hyper-V. The cause is that here now the Linux VM should no longer access
the original guest physical addrss (GPA); instead the VM should do
memremap() and access the original GPA + ms_hyperv.shared_gpa_boundary:
see the example code in drivers/hv/connection.c: vmbus_connect() or
drivers/hv/ring_buffer.c: hv_ringbuffer_init(). If the VM tries to
access the original GPA, it keepts getting injected a fault by Hyper-V
and gets stuck there.

Here the issue happens only when the VM has >=65 vCPUs, because the
global static array hv_clock_boot[] can hold 64 "struct
pvclock_vsyscall_time_info" (the sizeof of the struct is 64 bytes), so
kvmclock_init_mem() only allocates memory in the case of vCPUs > 64.

Since the 'hvclock_mem' pages are only useful when the kvm clock is
supported by the underlying hypervisor, fix the issue by returning
early when Linux VM runs on Hyper-V, which doesn't support kvm clock.

Fixes: 4d96f91091 ("x86/sev: Replace occurrences of sev_active() with cc_platform_has()")
Tested-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Message-Id: <20220225084600.17817-1-decui@microsoft.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-02-25 10:09:34 -05:00
Wanpeng Li
3c51d0a6c7 x86/kvm: Don't waste memory if kvmclock is disabled
Even if "no-kvmclock" is passed in cmdline parameter, the guest kernel
still allocates hvclock_mem which is scaled by the number of vCPUs,
let's check kvmclock enable in advance to avoid this memory waste.

Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Message-Id: <1645520523-30814-1-git-send-email-wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-02-25 10:09:34 -05:00
Wanpeng Li
40cd58dbf1 x86/kvm: Don't use PV TLB/yield when mwait is advertised
MWAIT is advertised in host is not overcommitted scenario, however, PV
TLB/sched yield should be enabled in host overcommitted scenario. Let's
add the MWAIT checking when enabling PV TLB/sched yield.

Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Message-Id: <1645777780-2581-1-git-send-email-wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-02-25 10:09:34 -05:00
Arnd Bergmann
36903abedf x86: remove __range_not_ok()
The __range_not_ok() helper is an x86 (and sparc64) specific interface
that does roughly the same thing as __access_ok(), but with different
calling conventions.

Change this to use the normal interface in order for consistency as we
clean up all access_ok() implementations.

This changes the limit from TASK_SIZE to TASK_SIZE_MAX, which Al points
out is the right thing do do here anyway.

The callers have to use __access_ok() instead of the normal access_ok()
though, because on x86 that contains a WARN_ON_IN_IRQ() check that cannot
be used inside of NMI context while tracing.

The check in copy_code() is not needed any more, because this one is
already done by copy_from_user_nmi().

Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Suggested-by: Christoph Hellwig <hch@infradead.org>
Link: https://lore.kernel.org/lkml/YgsUKcXGR7r4nINj@zeniv-ca.linux.org.uk/
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-02-25 09:36:05 +01:00