- Add Kan Liang to MAINTAINERS as a perf tools reviewer.
- Add support for using the 'capstone' disassembler library in various tools,
such as 'perf script' and 'perf annotate'. This is an alternative for the
use of the 'xed' and 'objdump' disassemblers.
- Data-type profiling improvements:
Resolve types for a->b->c by backtracking the assignments until it finds
DWARF info for one of those members
Support for global variables, keeping a cache to speed up lookups.
Handle the 'call' instruction, dealing with effects on registers and handling
its return when tracking register data types.
Handle x86's segment based addressing like %gs:0x28, to support things like
per CPU variables, the stack canary, etc.
Data-type profiling got big speedups when using capstone for disassembling.
The objdump outoput parsing method is left as a fallback when capstone fails or
isn't available. There are patches posted for 6.11 that to use a LLVM
disassembler.
Support event group display in the TUI when annotating types with --data-type,
for instance to show memory load and store events for the data type fields.
Optimize the 'perf annotate' data structures, reducing memory usage.
Add a initial 'perf test' for 'perf annotate', checking that a target symbol
appears on the output, specifying objdump via the command line, etc.
- Integrate the shellcheck utility with the build of perf to allow catching
shell problems early in areas such as 'perf test', 'perf trace' scrape
scripts, etc.
- Add 'uretprobe' variant in the 'perf bench uprobe' tool.
- Add script to run instances of 'perf script' in parallel.
- Allow parsing tracepoint names that start with digits, such as
9p/9p_client_req, etc. Make sure 'perf test' tests it even on systems
where those tracepoints aren't available.
Vendor Events:
- Update Intel JSON files for Cascade Lake X, Emerald Rapids, Grand Ridge, Ice
Lake X, Lunar Lake, Meteor Lake, Sapphire Rapids, Sierra Forest, Sky Lake X,
Sky Lake and Snow Ridge X. Remove info metrics erroneously in TopdownL1.
- Add AMD's Zen 5 core and uncore events and metrics. Those come from the
"Performance Monitor Counters for AMD Family 1Ah Model 00h- 0Fh Processors"
document, with events that capture information on op dispatch, execution and
retirement, branch prediction, L1 and L2 cache activity, TLB activity, etc.
- Mark L1D_CACHE_INVAL impacted by errata for ARM64's AmpereOne/AmpereOneX.
Miscellaneous:
- Sync header copies with the kernel sources.
- Move some header copies used only for generating translation string tables
for ioctl cmds and other syscall integer arguments to a new directory under
tools/perf/beauty/, to separate from copies in tools/include/ that are used
to build the tools.
- Introduce scrape script for several syscall 'flags'/'mask' arguments.
- Improve cpumap utilization, fixing up pairing of refcounts, using the right
iterators (perf_cpu_map__for_each_cpu), etc.
- Give more details about raw event encodings in 'perf list', show tracepoint
encoding in the detailed output.
- Refactor the DSOs handling code, reducing memory usage.
- Document the BPF event modifier and add a 'perf test' for it.
- Improve the event parser, better error messages and add further 'perf test's
for it.
- Add reference count checking to 'struct comm_str' and 'struct mem_info'.
- Make ARM64's 'perf test' entries for the Neoverse N1 more robust.
- Tweak the ARM64's Coresight 'perf test's.
- Improve ARM64's CoreSight ETM version detection and error reporting.
- Fix handling of symbols when using kcore.
- Fix PAI (Processor Activity Instrumentation) counter names for s390 virtual
machines in 'perf report'.
- Fix -g/--call-graph option failure in 'perf sched timehist'.
- Add LIBTRACEEVENT_DIR build option to allow building with libtraceevent
installed in non-standard directories, such as when doing cross builds.
- Various 'perf test' and 'perf bench' fixes.
- Improve 'perf probe' error message for long C++ probe names.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCZkzdjgAKCRCyPKLppCJ+
J8ZYAP46rcbOuDWol6tjD9FDXd+spkWc40bnqeSnOR+TWlmJXwEA87XU4+3LAh6p
HQxKXehJRh90I90yn954mK2NuN+58Q0=
=sie+
-----END PGP SIGNATURE-----
Merge tag 'perf-tools-for-v6.10-1-2024-05-21' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools
Pull perf tools updates from Arnaldo Carvalho de Melo:
"General:
- Integrate the shellcheck utility with the build of perf to allow
catching shell problems early in areas such as 'perf test', 'perf
trace' scrape scripts, etc
- Add 'uretprobe' variant in the 'perf bench uprobe' tool
- Add script to run instances of 'perf script' in parallel
- Allow parsing tracepoint names that start with digits, such as
9p/9p_client_req, etc. Make sure 'perf test' tests it even on
systems where those tracepoints aren't available
- Add Kan Liang to MAINTAINERS as a perf tools reviewer
- Add support for using the 'capstone' disassembler library in
various tools, such as 'perf script' and 'perf annotate'. This is
an alternative for the use of the 'xed' and 'objdump' disassemblers
Data-type profiling improvements:
- Resolve types for a->b->c by backtracking the assignments until it
finds DWARF info for one of those members
- Support for global variables, keeping a cache to speed up lookups
- Handle the 'call' instruction, dealing with effects on registers
and handling its return when tracking register data types
- Handle x86's segment based addressing like %gs:0x28, to support
things like per CPU variables, the stack canary, etc
- Data-type profiling got big speedups when using capstone for
disassembling. The objdump outoput parsing method is left as a
fallback when capstone fails or isn't available. There are patches
posted for 6.11 that to use a LLVM disassembler
- Support event group display in the TUI when annotating types with
--data-type, for instance to show memory load and store events for
the data type fields
- Optimize the 'perf annotate' data structures, reducing memory usage
- Add a initial 'perf test' for 'perf annotate', checking that a
target symbol appears on the output, specifying objdump via the
command line, etc
Vendor Events:
- Update Intel JSON files for Cascade Lake X, Emerald Rapids, Grand
Ridge, Ice Lake X, Lunar Lake, Meteor Lake, Sapphire Rapids, Sierra
Forest, Sky Lake X, Sky Lake and Snow Ridge X. Remove info metrics
erroneously in TopdownL1
- Add AMD's Zen 5 core and uncore events and metrics. Those come from
the "Performance Monitor Counters for AMD Family 1Ah Model 00h- 0Fh
Processors" document, with events that capture information on op
dispatch, execution and retirement, branch prediction, L1 and L2
cache activity, TLB activity, etc
- Mark L1D_CACHE_INVAL impacted by errata for ARM64's AmpereOne/
AmpereOneX
Miscellaneous:
- Sync header copies with the kernel sources
- Move some header copies used only for generating translation string
tables for ioctl cmds and other syscall integer arguments to a new
directory under tools/perf/beauty/, to separate from copies in
tools/include/ that are used to build the tools
- Introduce scrape script for several syscall 'flags'/'mask'
arguments
- Improve cpumap utilization, fixing up pairing of refcounts, using
the right iterators (perf_cpu_map__for_each_cpu), etc
- Give more details about raw event encodings in 'perf list', show
tracepoint encoding in the detailed output
- Refactor the DSOs handling code, reducing memory usage
- Document the BPF event modifier and add a 'perf test' for it
- Improve the event parser, better error messages and add further
'perf test's for it
- Add reference count checking to 'struct comm_str' and 'struct
mem_info'
- Make ARM64's 'perf test' entries for the Neoverse N1 more robust
- Tweak the ARM64's Coresight 'perf test's
- Improve ARM64's CoreSight ETM version detection and error reporting
- Fix handling of symbols when using kcore
- Fix PAI (Processor Activity Instrumentation) counter names for s390
virtual machines in 'perf report'
- Fix -g/--call-graph option failure in 'perf sched timehist'
- Add LIBTRACEEVENT_DIR build option to allow building with
libtraceevent installed in non-standard directories, such as when
doing cross builds
- Various 'perf test' and 'perf bench' fixes
- Improve 'perf probe' error message for long C++ probe names"
* tag 'perf-tools-for-v6.10-1-2024-05-21' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (260 commits)
tools lib subcmd: Show parent options in help
perf pmu: Count sys and cpuid JSON events separately
perf stat: Don't display metric header for non-leader uncore events
perf annotate-data: Ensure the number of type histograms
perf annotate: Fix segfault on sample histogram
perf daemon: Fix file leak in daemon_session__control
libsubcmd: Fix parse-options memory leak
perf lock: Avoid memory leaks from strdup()
perf sched: Rename 'switches' column header to 'count' and add usage description, options for latency
perf tools: Ignore deleted cgroups
perf parse: Allow tracepoint names to start with digits
perf parse-events: Add new 'fake_tp' parameter for tests
perf parse-events: pass parse_state to add_tracepoint
perf symbols: Fix ownership of string in dso__load_vmlinux()
perf symbols: Update kcore map before merging in remaining symbols
perf maps: Re-use __maps__free_maps_by_name()
perf symbols: Remove map from list before updating addresses
perf tracepoint: Don't scan all tracepoints to test if one exists
perf dwarf-aux: Fix build with HAVE_DWARF_CFI_SUPPORT
perf thread: Fixes to thread__new() related to initializing comm
...
- Extend the x86 instruction decoder with APX and
other new instructions
- Misc cleanups
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmZIa8ERHG1pbmdvQGtl
cm5lbC5vcmcACgkQEnMQ0APhK1gHbw//Zet6K5cbgp5QB570J+rdDyViAl+spYxt
sbWk8CUg0/jk5oSo45psl9xR8mmSPpeEOpTsuJPzEGfbunvTLU8G6HV/l1EDAk8I
Yeia3zLvssTfsirIfSck6spSDRmCRQTiKWibj2mlXSFlXRuVXiIKmbSYZGyx4vk6
5zkKuC6+k77X1qlWYCl9M9Sn0nWr/oEuXPXotliDqhev/DdhP5iBniKHEhkzUOEn
KHtfFTu0B4GbTC1w3hZ3Dmbqz3nrdXf56Py1Vf/uMyzP3UhuE0vE+tC4h7TnfZf6
LBTLEpw+K4KRuppcI2PbEMvzfMT41rtx7S8u83gzKIBhqrfSm1L6OSi8UEOph68G
+p1IS1H4c4woY+0JefaFLiTeweuws4L45PiNNa4qnQp9HX/3G3bTt+kc1vddbfjg
x7pnIntSDKwLtKfo5GYJ+OtTfKQRC13dQroLujsmFa0/me3MbFao+i50UlAoWWBa
1qSCsJpSpGAhYlchxBVfitiiLVpGU7+O39m6ZosA6n2HGSpfgfW1p3xigaPYRISq
GcedKmx8lIThe483T0Y8/Bk2QtCeVCryZb9Qij3B2NKFttlNJaGx/iabE2AuLheY
qnEEQ5UqYgrXEJz1Vu/QqR5Yb9dqkC2MID8llawK66M+kH91cXSXg7RcBEkoLBF4
eT9AuGGWMp4=
=mmyf
-----END PGP SIGNATURE-----
Merge tag 'perf-urgent-2024-05-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf event updates from Ingo Molnar:
- Extend the x86 instruction decoder with APX and
other new instructions
- Misc cleanups
* tag 'perf-urgent-2024-05-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/cstate: Remove unused 'struct perf_cstate_msr'
perf/x86/rapl: Rename 'maxdie' to nr_rapl_pmu and 'dieid' to rapl_pmu_idx
x86/insn: Add support for APX EVEX instructions to the opcode map
x86/insn: Add support for APX EVEX to the instruction decoder logic
x86/insn: x86/insn: Add support for REX2 prefix to the instruction decoder opcode map
x86/insn: Add support for REX2 prefix to the instruction decoder logic
x86/insn: Add misc new Intel instructions
x86/insn: Add VEX versions of VPDPBUSD, VPDPBUSDS, VPDPWSSD and VPDPWSSDS
x86/insn: Fix PUSH instruction in x86 instruction decoder opcode map
x86/insn: Add Key Locker instructions to the opcode map
Highlights:
- New drivers/platform/arm64 directory for arm64 embedded-controller drivers
- New drivers for:
- Acer Aspire 1 embedded controllers (for arm64 models)
- ACPI quickstart PNP0C32 buttons
- Dell All-In-One backlight support (dell-uart-backlight)
- Lenovo WMI camera buttons
- Lenovo Yoga Tablet 2 Pro 1380F/L fast charging
- MeeGoPad ANX7428 Type-C Cross Switch (power sequencing only)
- MSI WMI sensors (fan speed sensors only for now)
- Asus WMI:
- 2024 ROG Mini-LED support
- MCU powersave support
- Vivobook GPU MUX support
- Misc. other improvements
- Ideapad laptop:
- Export FnLock LED as LED class device
- Switch platform profiles using thermal management key
- Intel drivers:
- IFS: various improvements
- PMC: Lunar Lake support
- SDSI: various improvements
- TPMI/ISST: various improvements
- tools: intel-speed-select: various improvements
- MS Surface drivers:
- Fan profile switching support
- Surface Pro thermal sensors support
- ThinkPad ACPI:
- Reworked hotkey support to use sparse keymaps
- Add support for new trackpoint-doubletap, Fn+N and Fn+G hotkeys
- WMI core:
- New WMI driver development guide
- x86 Android tablets:
- Lenovo Yoga Tablet 2 Pro 1380F/L support
- Xiaomi MiPad 2 status LED and bezel touch buttons backlight support
- Miscellaneous cleanups / fixes / improvements
The following is an automated git shortlog grouped by driver:
ACPI:
- platform-profile: add platform_profile_cycle()
Add ACPI quickstart button (PNP0C32) driver:
- Add ACPI quickstart button (PNP0C32) driver
Add lenovo-yoga-tab2-pro-1380-fastcharger driver:
- Add lenovo-yoga-tab2-pro-1380-fastcharger driver
Add new Dell UART backlight driver:
- Add new Dell UART backlight driver
Add lenovo WMI camera button driver:
- Add lenovo WMI camera button driver
Add new MeeGoPad ANX7428 Type-C Cross Switch driver:
- Add new MeeGoPad ANX7428 Type-C Cross Switch driver
ISST:
- Support SST-BF and SST-TF per level
- Add missing MODULE_DESCRIPTION
- Add dev_fmt
- Use in_range() to check package ID validity
- Support partitioned systems
- Shorten the assignments for power_domain_info
- Use local variable for auxdev->dev
MAINTAINERS:
- drop Daniel Oliveira Nascimento
arm64:
- dts: qcom: acer-aspire1: Add embedded controller
asus-laptop:
- Use sysfs_emit() and sysfs_emit_at() to replace sprintf()
asus-wmi:
- cleanup main struct to avoid some holes
- Add support for MCU powersave
- ROG Ally increase wait time, allow MCU powersave
- adjust formatting of ppt-<name>() functions
- store a min default for ppt options
- support toggling POST sound
- add support variant of TUF RGB
- add support for Vivobook GPU MUX
- add support for 2024 ROG Mini-LED
- use sysfs_emit() instead of sprintf()
classmate-laptop:
- Add missing MODULE_DESCRIPTION()
devm-helpers:
- Fix a misspelled cancellation in the comments
dt-bindings:
- leds: Add LED_FUNCTION_FNLOCK
- platform: Add Acer Aspire 1 EC
hp-wmi:
- use sysfs_emit() instead of sprintf()
huawei-wmi:
- use sysfs_emit() instead of sprintf()
ideapad-laptop:
- switch platform profiles using thermal management key
- add FnLock LED class device
- add fn_lock_get/set functions
intel-vbtn:
- Log event code on unexpected button events
intel/pmc:
- Enable S0ix blocker show in Lunar Lake
- Add support to show S0ix blocker counter
- Update LNL signal status map
msi-laptop:
- Use sysfs_emit() to replace sprintf()
p2sb:
- Don't init until unassigned resources have been assigned
- Make p2sb_get_devfn() return void
platform:
- arm64: Add Acer Aspire 1 embedded controller driver
- Add ARM64 platform directory
platform/surface:
- aggregator: Log critical errors during SAM probing
- aggregator_registry: Add support for thermal sensors on the Surface Pro 9
- platform_profile: add fan profile switching
platform/x86/amd:
- pmc: Add new ACPI ID AMDI000B
- pmf: Add new ACPI ID AMDI0105
platform/x86/amd/hsmp:
- switch to use device_add_groups()
platform/x86/amd/pmc:
- Fix implicit declaration error on i386
- Add AMD MP2 STB functionality
platform/x86/fujitsu-laptop:
- Replace sprintf() with sysfs_emit()
platform/x86/intel-uncore-freq:
- Don't present root domain on error
platform/x86/intel/ifs:
- Disable irq during one load stage
- trace: display batch num in hex
- Classify error scenarios correctly
platform/x86/intel/pmc:
- Fix PCH names in comments
platform/x86/intel/sdsi:
- Add attribute to read the current meter state
- Add in-band BIOS lock support
- Combine read and write mailbox flows
- Set message size during writes
platform/x86/intel/tpmi:
- Add additional TPMI header fields
- Align comments in kernel-doc
- Check major version change for TPMI Information
- Handle error from tpmi_process_info()
quickstart:
- Fix race condition when reporting input event
- fix Kconfig selects
- Miscellaneous improvements
samsung-laptop:
- Use sysfs_emit() to replace the old interface sprintf()
think-lmi:
- Convert container_of() macros to static inline
thinkpad_acpi:
- Use false to set acpi_send_ev to false
- Support hotkey to disable trackpoint doubletap
- Support for system debug info hotkey
- Support for trackpoint doubletap
- Simplify known_ev handling
- Add mappings for adaptive kbd clipping-tool and cloud keys
- Switch to using sparse-keymap helpers
- Drop KEY_RESERVED special handling
- Use correct keycodes for volume and brightness keys
- Change hotkey_reserved_mask initialization
- Do not send ACPI netlink events for unknown hotkeys
- Move tpacpi_driver_event() call to tpacpi_input_send_key()
- Move hkey > scancode mapping to tpacpi_input_send_key()
- Drop tpacpi_input_send_key_masked() and hotkey_driver_event()
- Always call tpacpi_driver_event() for hotkeys
- Move hotkey_user_mask check to tpacpi_input_send_key()
- Move special original hotkeys handling out of switch-case
- Move adaptive kbd event handling to tpacpi_driver_event()
- Make tpacpi_driver_event() return if it handled the event
- Do hkey to scancode translation later
- Use tpacpi_input_send_key() in adaptive kbd code
- Drop ignore_acpi_ev
- Drop setting send_/ignore_acpi_ev defaults twice
- Provide hotkey_poll_stop_sync() dummy
- Take hotkey_mutex during hotkey_exit()
- change sprintf() to sysfs_emit()
- use platform_profile_cycle()
tools arch x86:
- Add dell-uart-backlight-emulator
tools/arch/x86/intel_sdsi:
- Add current meter support
- Simplify ascii printing
- Fix meter_certificate decoding
- Fix meter_show display
- Fix maximum meter bundle length
tools/power/x86/intel-speed-select:
- v1.19 release
- Display CPU as None for -1
- SST BF/TF support per level
- Increase number of CPUs displayed
- Present all TRL levels for turbo-freq
- Fix display for unsupported levels
- Support multiple dies
- Increase die count
toshiba_acpi:
- Add quirk for buttons on Z830
uv_sysfs:
- use sysfs_emit() instead of sprintf()
wmi:
- Add MSI WMI Platform driver
- Add driver development guide
- Mark simple WMI drivers as legacy-free
- Avoid returning AE_OK upon unknown error
- Support reading/writing 16 bit EC values
x86-android-tablets:
- Create LED device for Xiaomi Pad 2 bottom bezel touch buttons
- Xiaomi pad2 RGB LED fwnode updates
- Pass struct device to init()
- Add Lenovo Yoga Tablet 2 Pro 1380F/L data
- Unregister devices in reverse order
- Add swnode for Xiaomi pad2 indicator LED
- Use GPIO_LOOKUP() macro
xiaomi-wmi:
- Drop unnecessary NULL checks
- Fix race condition when reporting key events
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEEuvA7XScYQRpenhd+kuxHeUQDJ9wFAmZF1kwUHGhkZWdvZWRl
QHJlZGhhdC5jb20ACgkQkuxHeUQDJ9wSXwgAsaSH6Sawn5sHOj52lQY7gNI0uf3V
YfZFawRpreCrlwLPU2f7SX0mLW+hh+ekQ2C1NvaUUVqQwzONELh0DWSYJpzz/v1r
jD14EcY2dnTv+FVyvCj5jZsiYxo/ViTvthMduiO7rrJKN7aOej9iNn68P0lvcY8s
HDJ2lPFNGnY01snz3C1NyjyIWw8YsfwqXEqOmhrDyyoKLXpsDs8H/Jqq5yXfeLax
hSpjbGB85EGJPXna6Ux5TziPh/MYMtF1+8R4Fn0sGvfcZO6/H1fDne0uI9UwrKnN
d2g4VHXU2DIhTshUc14YT2AU27eQiZVN+J3VpuYIbC9cmlQ2F6bjN3uxoQ==
=UWbu
-----END PGP SIGNATURE-----
Merge tag 'platform-drivers-x86-v6.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86
Pull x86 platform driver updates from Hans de Goede:
- New drivers/platform/arm64 directory for arm64 embedded-controller
drivers
- New drivers:
- Acer Aspire 1 embedded controllers (for arm64 models)
- ACPI quickstart PNP0C32 buttons
- Dell All-In-One backlight support (dell-uart-backlight)
- Lenovo WMI camera buttons
- Lenovo Yoga Tablet 2 Pro 1380F/L fast charging
- MeeGoPad ANX7428 Type-C Cross Switch (power sequencing only)
- MSI WMI sensors (fan speed sensors only for now)
- Asus WMI:
- 2024 ROG Mini-LED support
- MCU powersave support
- Vivobook GPU MUX support
- Misc. other improvements
- Ideapad laptop:
- Export FnLock LED as LED class device
- Switch platform profiles using thermal management key
- Intel drivers:
- IFS: various improvements
- PMC: Lunar Lake support
- SDSI: various improvements
- TPMI/ISST: various improvements
- tools: intel-speed-select: various improvements
- MS Surface drivers:
- Fan profile switching support
- Surface Pro thermal sensors support
- ThinkPad ACPI:
- Reworked hotkey support to use sparse keymaps
- Add support for new trackpoint-doubletap, Fn+N and Fn+G hotkeys
- WMI core:
- New WMI driver development guide
- x86 Android tablets:
- Lenovo Yoga Tablet 2 Pro 1380F/L support
- Xiaomi MiPad 2 status LED and bezel touch buttons backlight
support
- Miscellaneous cleanups / fixes / improvements
* tag 'platform-drivers-x86-v6.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: (128 commits)
platform/x86: Add new MeeGoPad ANX7428 Type-C Cross Switch driver
devm-helpers: Fix a misspelled cancellation in the comments
tools arch x86: Add dell-uart-backlight-emulator
platform/x86: Add new Dell UART backlight driver
platform/x86: x86-android-tablets: Create LED device for Xiaomi Pad 2 bottom bezel touch buttons
platform/x86: x86-android-tablets: Xiaomi pad2 RGB LED fwnode updates
platform/x86: x86-android-tablets: Pass struct device to init()
platform/x86/amd: pmc: Add new ACPI ID AMDI000B
platform/x86/amd: pmf: Add new ACPI ID AMDI0105
platform/x86: p2sb: Don't init until unassigned resources have been assigned
platform/surface: aggregator: Log critical errors during SAM probing
platform/x86: ISST: Support SST-BF and SST-TF per level
platform/x86/fujitsu-laptop: Replace sprintf() with sysfs_emit()
tools/power/x86/intel-speed-select: v1.19 release
tools/power/x86/intel-speed-select: Display CPU as None for -1
tools/power/x86/intel-speed-select: SST BF/TF support per level
tools/power/x86/intel-speed-select: Increase number of CPUs displayed
tools/power/x86/intel-speed-select: Present all TRL levels for turbo-freq
tools/power/x86/intel-speed-select: Fix display for unsupported levels
tools/power/x86/intel-speed-select: Support multiple dies
...
ACPI:
* Support for the Firmware ACPI Control Structure (FACS) signature
feature which is used to reboot out of hibernation on some systems.
Kbuild:
* Support for building Flat Image Tree (FIT) images, where the kernel
Image is compressed alongside a set of devicetree blobs.
Memory management:
* Optimisation of our early page-table manipulation for creation of the
linear mapping.
* Support for userfaultfd write protection, which brings along some nice
cleanups to our handling of invalid but present ptes.
* Extend our use of range TLBI invalidation at EL1.
Perf and PMUs:
* Ensure that the 'pmu->parent' pointer is correctly initialised by PMU
drivers.
* Avoid allocating 'cpumask_t' types on the stack in some PMU drivers.
* Fix parsing of the CPU PMU "version" field in assembly code, as it
doesn't follow the usual architectural rules.
* Add best-effort unwinding support for USER_STACKTRACE
* Minor driver fixes and cleanups.
Selftests:
* Minor cleanups to the arm64 selftests (missing NULL check, unused
variable).
Miscellaneous
* Add a command-line alias for disabling 32-bit application support.
* Add part number for Neoverse-V2 CPUs.
* Minor fixes and cleanups.
-----BEGIN PGP SIGNATURE-----
iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAmY+IWkQHHdpbGxAa2Vy
bmVsLm9yZwAKCRC3rHDchMFjNBVNB/9JG4jlmgxzbTDoer0md31YFvWCDGeOKx1x
g3XhE24W5w8eLXnc75p7/tOUKfo0TNWL4qdUs0hJCEUAOSy6a4Qz13bkkkvvBtDm
nnHvEjidx5yprHggocsoTF29CKgHMJ3bt8rJe6g+O3Lp1JAFlXXNgplX5koeaVtm
TtaFvX9MGyDDNkPIcQ/SQTFZJ2Oz51+ik6O8SYuGYtmAcR7MzlxH77lHl2mrF1bf
Jzv/f5n0lS+Gt9tRuFWhbfEm4aKdUlLha4ufzUq42/vJvELboZbG3LqLxRG8DbqR
+HvyZOG/xtu2dbzDqHkRumMToWmwzD4oBGSK4JAoJxeHavEdAvSG
=JMvT
-----END PGP SIGNATURE-----
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Will Deacon:
"The most interesting parts are probably the mm changes from Ryan which
optimise the creation of the linear mapping at boot and (separately)
implement write-protect support for userfaultfd.
Outside of our usual directories, the Kbuild-related changes under
scripts/ have been acked by Masahiro whilst the drivers/acpi/ parts
have been acked by Rafael and the addition of cpumask_any_and_but()
has been acked by Yury.
ACPI:
- Support for the Firmware ACPI Control Structure (FACS) signature
feature which is used to reboot out of hibernation on some systems
Kbuild:
- Support for building Flat Image Tree (FIT) images, where the kernel
Image is compressed alongside a set of devicetree blobs
Memory management:
- Optimisation of our early page-table manipulation for creation of
the linear mapping
- Support for userfaultfd write protection, which brings along some
nice cleanups to our handling of invalid but present ptes
- Extend our use of range TLBI invalidation at EL1
Perf and PMUs:
- Ensure that the 'pmu->parent' pointer is correctly initialised by
PMU drivers
- Avoid allocating 'cpumask_t' types on the stack in some PMU drivers
- Fix parsing of the CPU PMU "version" field in assembly code, as it
doesn't follow the usual architectural rules
- Add best-effort unwinding support for USER_STACKTRACE
- Minor driver fixes and cleanups
Selftests:
- Minor cleanups to the arm64 selftests (missing NULL check, unused
variable)
Miscellaneous:
- Add a command-line alias for disabling 32-bit application support
- Add part number for Neoverse-V2 CPUs
- Minor fixes and cleanups"
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (64 commits)
arm64/mm: Fix pud_user_accessible_page() for PGTABLE_LEVELS <= 2
arm64/mm: Add uffd write-protect support
arm64/mm: Move PTE_PRESENT_INVALID to overlay PTE_NG
arm64/mm: Remove PTE_PROT_NONE bit
arm64/mm: generalize PMD_PRESENT_INVALID for all levels
arm64: simplify arch_static_branch/_jump function
arm64: Add USER_STACKTRACE support
arm64: Add the arm64.no32bit_el0 command line option
drivers/perf: hisi: hns3: Actually use devm_add_action_or_reset()
drivers/perf: hisi: hns3: Fix out-of-bound access when valid event group
drivers/perf: hisi_pcie: Fix out-of-bound access when valid event group
kselftest: arm64: Add a null pointer check
arm64: defer clearing DAIF.D
arm64: assembler: update stale comment for disable_step_tsk
arm64/sysreg: Update PIE permission encodings
kselftest/arm64: Remove unused parameters in abi test
perf/arm-spe: Assign parents for event_source device
perf/arm-smmuv3: Assign parents for event_source device
perf/arm-dsu: Assign parents for event_source device
perf/arm-dmc620: Assign parents for event_source device
...
Dell All In One (AIO) models released after 2017 use a backlight controller
board connected to an UART.
Add a small emulator to allow development and testing of
the drivers/platform/x86/dell/dell-uart-backlight.c driver for
this board, without requiring access to an actual Dell All In One.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20240513144603.93874-3-hdegoede@redhat.com
To support APX functionality, the EVEX prefix is used to:
- promote legacy instructions
- promote VEX instructions
- add new instructions
Promoted VEX instructions require no extra annotation because the opcodes
do not change and the permissive nature of the instruction decoder already
allows them to have an EVEX prefix.
Promoted legacy instructions and new instructions are placed in map 4 which
has not been used before.
Create a new table for map 4 and add APX instructions.
Annotate SCALABLE instructions with "(es)" - refer to patch "x86/insn: Add
support for APX EVEX to the instruction decoder logic". SCALABLE
instructions must be represented in both no-prefix (NP) and 66 prefix
forms.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20240502105853.5338-9-adrian.hunter@intel.com
Intel Advanced Performance Extensions (APX) extends the EVEX prefix to
support:
- extended general purpose registers (EGPRs) i.e. r16 to r31
- Push-Pop Acceleration (PPX) hints
- new data destination (NDD) register
- suppress status flags writes (NF) of common instructions
- new instructions
Refer to the Intel Advanced Performance Extensions (Intel APX) Architecture
Specification for details.
The extended EVEX prefix does not need amended instruction decoder logic,
except in one area. Some instructions are defined as SCALABLE which means
the EVEX.W bit and EVEX.pp bits are used to determine operand size.
Specifically, if an instruction is SCALABLE and EVEX.W is zero, then
EVEX.pp value 0 (representing no prefix NP) means default operand size,
whereas EVEX.pp value 1 (representing 66 prefix) means operand size
override i.e. 16 bits
Add an attribute (INAT_EVEX_SCALABLE) to identify such instructions, and
amend the logic appropriately.
Amend the awk script that generates the attribute tables from the opcode
map, to recognise "(es)" as attribute INAT_EVEX_SCALABLE.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20240502105853.5338-8-adrian.hunter@intel.com
Support for REX2 has been added to the instruction decoder logic and the
awk script that generates the attribute tables from the opcode map.
Add REX2 prefix byte (0xD5) to the opcode map.
Add annotation (!REX2) for map 0/1 opcodes that are reserved under REX2.
Add JMPABS to the opcode map and add annotation (REX2) to identify that it
has a mandatory REX2 prefix. A separate opcode attribute table is not
needed at this time because JMPABS has the same attribute encoding as the
MOV instruction that it shares an opcode with i.e. INAT_MOFFSET.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20240502105853.5338-7-adrian.hunter@intel.com
Intel Advanced Performance Extensions (APX) uses a new 2-byte prefix named
REX2 to select extended general purpose registers (EGPRs) i.e. r16 to r31.
The REX2 prefix is effectively an extended version of the REX prefix.
REX2 and EVEX are also used with PUSH/POP instructions to provide a
Push-Pop Acceleration (PPX) hint. With PPX hints, a CPU will attempt to
fast-forward register data between matching PUSH and POP instructions.
REX2 is valid only with opcodes in maps 0 and 1. Similar extension for
other maps is provided by the EVEX prefix, covered in a separate patch.
Some opcodes in maps 0 and 1 are reserved under REX2. One of these is used
for a new 64-bit absolute direct jump instruction JMPABS.
Refer to the Intel Advanced Performance Extensions (Intel APX) Architecture
Specification for details.
Define a code value for the REX2 prefix (INAT_PFX_REX2), and add attribute
flags for opcodes reserved under REX2 (INAT_NO_REX2) and to identify
opcodes (only JMPABS) that require a mandatory REX2 prefix
(INAT_REX2_VARIANT).
Amend logic to read the REX2 prefix and get the opcode attribute for the
map number (0 or 1) encoded in the REX2 prefix.
Amend the awk script that generates the attribute tables from the opcode
map, to recognise "REX2" as attribute INAT_PFX_REX2, and "(!REX2)"
as attribute INAT_NO_REX2, and "(REX2)" as attribute INAT_REX2_VARIANT.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20240502105853.5338-6-adrian.hunter@intel.com
The x86 instruction decoder is used not only for decoding kernel
instructions. It is also used by perf uprobes (user space probes) and by
perf tools Intel Processor Trace decoding. Consequently, it needs to
support instructions executed by user space also.
Add instructions documented in Intel Architecture Instruction Set
Extensions and Future Features Programming Reference March 2024
319433-052, that have not been added yet:
AADD
AAND
AOR
AXOR
CMPccXADD
PBNDKB
RDMSRLIST
URDMSR
UWRMSR
VBCSTNEBF162PS
VBCSTNESH2PS
VCVTNEEBF162PS
VCVTNEEPH2PS
VCVTNEOBF162PS
VCVTNEOPH2PS
VCVTNEPS2BF16
VPDPB[SU,UU,SS]D[,S]
VPDPW[SU,US,UU]D[,S]
VPMADD52HUQ
VPMADD52LUQ
VSHA512MSG1
VSHA512MSG2
VSHA512RNDS2
VSM3MSG1
VSM3MSG2
VSM3RNDS2
VSM4KEY4
VSM4RNDS4
WRMSRLIST
TCMMIMFP16PS
TCMMRLFP16PS
TDPFP16PS
PREFETCHIT1
PREFETCHIT0
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20240502105853.5338-5-adrian.hunter@intel.com
The x86 instruction decoder is used not only for decoding kernel
instructions. It is also used by perf uprobes (user space probes) and by
perf tools Intel Processor Trace decoding. Consequently, it needs to
support instructions executed by user space also.
Intel Architecture Instruction Set Extensions and Future Features manual
number 319433-044 of May 2021, documented VEX versions of instructions
VPDPBUSD, VPDPBUSDS, VPDPWSSD and VPDPWSSDS, but the opcode map has them
listed as EVEX only.
Remove EVEX-only (ev) annotation from instructions VPDPBUSD, VPDPBUSDS,
VPDPWSSD and VPDPWSSDS, which allows them to be decoded with either a VEX
or EVEX prefix.
Fixes: 0153d98f2d ("x86/insn: Add misc instructions to x86 instruction decoder")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20240502105853.5338-4-adrian.hunter@intel.com
The x86 instruction decoder is used not only for decoding kernel
instructions. It is also used by perf uprobes (user space probes) and by
perf tools Intel Processor Trace decoding. Consequently, it needs to
support instructions executed by user space also.
Opcode 0x68 PUSH instruction is currently defined as 64-bit operand size
only i.e. (d64). That was based on Intel SDM Opcode Map. However that is
contradicted by the Instruction Set Reference section for PUSH in the
same manual.
Remove 64-bit operand size only annotation from opcode 0x68 PUSH
instruction.
Example:
$ cat pushw.s
.global _start
.text
_start:
pushw $0x1234
mov $0x1,%eax # system call number (sys_exit)
int $0x80
$ as -o pushw.o pushw.s
$ ld -s -o pushw pushw.o
$ objdump -d pushw | tail -4
0000000000401000 <.text>:
401000: 66 68 34 12 pushw $0x1234
401004: b8 01 00 00 00 mov $0x1,%eax
401009: cd 80 int $0x80
$ perf record -e intel_pt//u ./pushw
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.014 MB perf.data ]
Before:
$ perf script --insn-trace=disasm
Warning:
1 instruction trace errors
pushw 10349 [000] 10586.869237014: 401000 [unknown] (/home/ahunter/git/misc/rtit-tests/pushw) pushw $0x1234
pushw 10349 [000] 10586.869237014: 401006 [unknown] (/home/ahunter/git/misc/rtit-tests/pushw) addb %al, (%rax)
pushw 10349 [000] 10586.869237014: 401008 [unknown] (/home/ahunter/git/misc/rtit-tests/pushw) addb %cl, %ch
pushw 10349 [000] 10586.869237014: 40100a [unknown] (/home/ahunter/git/misc/rtit-tests/pushw) addb $0x2e, (%rax)
instruction trace error type 1 time 10586.869237224 cpu 0 pid 10349 tid 10349 ip 0x40100d code 6: Trace doesn't match instruction
After:
$ perf script --insn-trace=disasm
pushw 10349 [000] 10586.869237014: 401000 [unknown] (./pushw) pushw $0x1234
pushw 10349 [000] 10586.869237014: 401004 [unknown] (./pushw) movl $1, %eax
Fixes: eb13296cfa ("x86: Instruction decoder API")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20240502105853.5338-3-adrian.hunter@intel.com
The x86 instruction decoder needs to know these new instructions that
are going to be used in the crypto library as well as the x86 core
code. Add the following:
LOADIWKEY:
Load a CPU-internal wrapping key.
ENCODEKEY128:
Wrap a 128-bit AES key to a key handle.
ENCODEKEY256:
Wrap a 256-bit AES key to a key handle.
AESENC128KL:
Encrypt a 128-bit block of data using a 128-bit AES key
indicated by a key handle.
AESENC256KL:
Encrypt a 128-bit block of data using a 256-bit AES key
indicated by a key handle.
AESDEC128KL:
Decrypt a 128-bit block of data using a 128-bit AES key
indicated by a key handle.
AESDEC256KL:
Decrypt a 128-bit block of data using a 256-bit AES key
indicated by a key handle.
AESENCWIDE128KL:
Encrypt 8 128-bit blocks of data using a 128-bit AES key
indicated by a key handle.
AESENCWIDE256KL:
Encrypt 8 128-bit blocks of data using a 256-bit AES key
indicated by a key handle.
AESDECWIDE128KL:
Decrypt 8 128-bit blocks of data using a 128-bit AES key
indicated by a key handle.
AESDECWIDE256KL:
Decrypt 8 128-bit blocks of data using a 256-bit AES key
indicated by a key handle.
The detail can be found in Intel Software Developer Manual.
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Link: https://lore.kernel.org/r/20240502105853.5338-2-adrian.hunter@intel.com
Add support to read the 'meter_current' file. The display is the same as
the 'meter_certificate', but will show the current snapshot of the
counters.
Signed-off-by: David E. Box <david.e.box@linux.intel.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://lore.kernel.org/r/20240411025856.2782476-10-david.e.box@linux.intel.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Add #define for feature length and move NUL assignment from callers to
get_feature().
Signed-off-by: David E. Box <david.e.box@linux.intel.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://lore.kernel.org/r/20240411025856.2782476-9-david.e.box@linux.intel.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Fix errors in the calculation of the start position of the counters and in
the display loop. While here, use a #define for the bundle count and size.
Fixes: 7fdc03a737 ("tools/arch/x86: intel_sdsi: Add support for reading meter certificates")
Signed-off-by: David E. Box <david.e.box@linux.intel.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://lore.kernel.org/r/20240411025856.2782476-8-david.e.box@linux.intel.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Fixes sdsi_meter_cert_show() to correctly decode and display the meter
certificate output. Adds and displays a missing version field, displays the
ASCII name of the signature, and fixes the print alignment.
Fixes: 7fdc03a737 ("tools/arch/x86: intel_sdsi: Add support for reading meter certificates")
Signed-off-by: David E. Box <david.e.box@linux.intel.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://lore.kernel.org/r/20240411025856.2782476-7-david.e.box@linux.intel.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
The maximum number of bundles in the meter certificate was set to 8 which
is much less than the maximum. Instead, since the bundles appear at the end
of the file, set it based on the remaining file size from the bundle start
position.
Fixes: 7fdc03a737 ("tools/arch/x86: intel_sdsi: Add support for reading meter certificates")
Signed-off-by: David E. Box <david.e.box@linux.intel.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://lore.kernel.org/r/20240411025856.2782476-6-david.e.box@linux.intel.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Fix left shift overflow issue when the parameter idx is greater than or
equal to 8 in the calculation of perm in PIRx_ELx_PERM macro.
Fix this by modifying the encoding to use a long integer type.
Signed-off-by: Shiqi Liu <shiqiliu@hust.edu.cn>
Acked-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20240421063328.29710-1-shiqiliu@hust.edu.cn
Signed-off-by: Will Deacon <will@kernel.org>
To pick the changes from:
95a6ccbdc7 ("x86/bhi: Mitigate KVM by default")
ec9404e40e ("x86/bhi: Add BHI mitigation knob")
be482ff950 ("x86/bhi: Enumerate Branch History Injection (BHI) bug")
0f4a837615 ("x86/bhi: Define SPEC_CTRL_BHI_DIS_S")
7390db8aea ("x86/bhi: Add support for clearing branch history at syscall entry")
This causes these perf files to be rebuilt and brings some X86_FEATURE
that will be used when updating the copies of
tools/arch/x86/lib/mem{cpy,set}_64.S with the kernel sources:
CC /tmp/build/perf/bench/mem-memcpy-x86-64-asm.o
CC /tmp/build/perf/bench/mem-memset-x86-64-asm.o
And addresses this perf build warning:
Warning: Kernel ABI header differences:
diff -u tools/arch/x86/include/asm/cpufeatures.h arch/x86/include/asm/cpufeatures.h
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Daniel Sneddon <daniel.sneddon@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/lkml/ZirIx4kPtJwGFZS0@x1
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To pick up the changes from:
fb091ff394 ("arm64: Subscribe Microsoft Azure Cobalt 100 to ARM Neoverse N2 errata")
This should address these tools/perf build warnings:
Warning: Kernel ABI header differences:
diff -u tools/arch/arm64/include/asm/cputype.h arch/arm64/include/asm/cputype.h
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240408185520.1550865-10-namhyung@kernel.org
To pick up the changes from:
598c2fafc0 ("perf/x86/amd/lbr: Use freeze based on availability")
7f274e609f ("x86/cpufeatures: Add new word for scattered features")
This should address these tools/perf build warnings:
Warning: Kernel ABI header differences:
diff -u tools/arch/x86/include/asm/disabled-features.h arch/x86/include/asm/disabled-features.h
diff -u tools/arch/x86/include/asm/required-features.h arch/x86/include/asm/required-features.h
diff -u tools/arch/x86/include/asm/cpufeatures.h arch/x86/include/asm/cpufeatures.h
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240408185520.1550865-6-namhyung@kernel.org
To pick up the changes from:
6bda055d62 ("KVM: define __KVM_HAVE_GUEST_DEBUG unconditionally")
5d9cb71642 ("KVM: arm64: move ARM-specific defines to uapi/asm/kvm.h")
71cd774ad2 ("KVM: s390: move s390-specific structs to uapi/asm/kvm.h")
d750951c9e ("KVM: powerpc: move powerpc-specific structs to uapi/asm/kvm.h")
bcac047727 ("KVM: x86: move x86-specific structs to uapi/asm/kvm.h")
c0a411904e ("KVM: remove more traces of device assignment UAPI")
f3c80061c0 ("KVM: SEV: fix compat ABI for KVM_MEMORY_ENCRYPT_OP")
That should be used to beautify the KVM arguments and it addresses these
tools/perf build warnings:
Warning: Kernel ABI header differences:
diff -u tools/include/uapi/linux/kvm.h include/uapi/linux/kvm.h
diff -u tools/arch/x86/include/uapi/asm/kvm.h arch/x86/include/uapi/asm/kvm.h
diff -u tools/arch/powerpc/include/uapi/asm/kvm.h arch/powerpc/include/uapi/asm/kvm.h
diff -u tools/arch/s390/include/uapi/asm/kvm.h arch/s390/include/uapi/asm/kvm.h
diff -u tools/arch/arm64/include/uapi/asm/kvm.h arch/arm64/include/uapi/asm/kvm.h
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: kvm@vger.kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240408185520.1550865-4-namhyung@kernel.org
1, Add objtool support for LoongArch;
2, Add ORC stack unwinder support for LoongArch;
3, Add kernel livepatching support for LoongArch;
4, Select ARCH_HAS_CURRENT_STACK_POINTER in Kconfig;
5, Select HAVE_ARCH_USERFAULTFD_MINOR in Kconfig;
6, Some bug fixes and other small changes.
-----BEGIN PGP SIGNATURE-----
iQJKBAABCAA0FiEEzOlt8mkP+tbeiYy5AoYrw/LiJnoFAmX69KsWHGNoZW5odWFj
YWlAa2VybmVsLm9yZwAKCRAChivD8uImerjFD/9rAzm1+G4VxFvFzlOiJXEqquNJ
+Vz2fAZLU3lJhBlx0uUKXFVijvLe8s/DnoLrM9e/M6gk4ivT9eszy3DnqT3NjGDX
njYFkPUWZhZGACmbkoVk9St80R8sPIdZrwXtW3q7g3T0bC7LXUXrJw52Sh4gmbYx
RqLsE6GoEWGY0zhhWqeeAM9LkKDuLxxyjH4fYE4g623EhQt7A7hP5okyaC+xHzp+
qp/4dPFLu61LeqIfeBUKK7nQ6uzno3EWLiME2eHEHiuelYfzmh+BtNMcX9Ugb/En
j0vLGNsoDGmEYw7xGa6OSRaCR/nCwVJz4SvuH33wbbbHhVAiUKUBVNFR3gmAtLlc
BSa2dDZbKhHkiWSUCM9K2ihr7WiQNuraTK1kKHwBgfa+RbEVOTu1q8yokAB9XCaT
T7lijJ8MKQmzHpMvgev7nN41baDB6V5bPIni0Ueh+NhQJKZ2/IxtYA3XzV5D0UgL
TBovVgYB/VNThS9gzOrlenKuDX9hT+kCQgyudErXaoIo645P6dsPFowOZRQxCEIv
WnLskZatLTCA8xWl1XyC1bqtGxhp34Gbhg0ZcvUqlNE20luaK/qi8wtW9Mv1Utp+
aXFO3i7d93z99oAcUT0oc1N83T0x0M/p69Z+rL/2+L0sYQgBf1cwUEiDNRW4OCdI
h15289rRTxjeL7NZPw==
=lSkY
-----END PGP SIGNATURE-----
Merge tag 'loongarch-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson
Pull LoongArch updates from Huacai Chen:
- Add objtool support for LoongArch
- Add ORC stack unwinder support for LoongArch
- Add kernel livepatching support for LoongArch
- Select ARCH_HAS_CURRENT_STACK_POINTER in Kconfig
- Select HAVE_ARCH_USERFAULTFD_MINOR in Kconfig
- Some bug fixes and other small changes
* tag 'loongarch-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
LoongArch/crypto: Clean up useless assignment operations
LoongArch: Define the __io_aw() hook as mmiowb()
LoongArch: Remove superfluous flush_dcache_page() definition
LoongArch: Move {dmw,tlb}_virt_to_page() definition to page.h
LoongArch: Change __my_cpu_offset definition to avoid mis-optimization
LoongArch: Select HAVE_ARCH_USERFAULTFD_MINOR in Kconfig
LoongArch: Select ARCH_HAS_CURRENT_STACK_POINTER in Kconfig
LoongArch: Add kernel livepatching support
LoongArch: Add ORC stack unwinder support
objtool: Check local label in read_unwind_hints()
objtool: Check local label in add_dead_ends()
objtool/LoongArch: Enable orc to be built
objtool/x86: Separate arch-specific and generic parts
objtool/LoongArch: Implement instruction decoder
objtool/LoongArch: Enable objtool to be built
It is used only to generate string tables, not to build perf, so move it
to the tools/perf/trace/beauty/{include,arch}/ hierarchies, that is used
just for scraping.
This is a something that should've have happened, as happened with the
linux/socket.h scrapper, do it now as Ian suggested while doing an
audit/refactor session in the headers used by perf.
No other tools/ living code uses it, just <linux/usbdevice_fs.h> coming
from either 'make install_headers' or from the system /usr/include/
directory.
Suggested-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/20240315204835.748716-3-acme@kernel.org
Link: https://lore.kernel.org/lkml/CAP-5=fWZVrpRufO4w-S4EcSi9STXcTAN2ERLwTSN7yrSSA-otQ@mail.gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
It is used only to generate string tables, not to build perf, so move it
to the tools/perf/trace/beauty/include/ hierarchy, that is used just for
scraping.
This is a something that should've have happened, as happened with the
linux/socket.h scrapper, do it now as Ian suggested while doing an
audit/refactor session in the headers used by perf.
No other tools/ living code uses it.
Suggested-by: Ian Rogers <irogers@google.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/CAP-5=fWZVrpRufO4w-S4EcSi9STXcTAN2ERLwTSN7yrSSA-otQ@mail.gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* Changes to FPU handling came in via the main s390 pull request
* Only deliver to the guest the SCLP events that userspace has
requested.
* More virtual vs physical address fixes (only a cleanup since
virtual and physical address spaces are currently the same).
* Fix selftests undefined behavior.
x86:
* Fix a restriction that the guest can't program a PMU event whose
encoding matches an architectural event that isn't included in the
guest CPUID. The enumeration of an architectural event only says
that if a CPU supports an architectural event, then the event can be
programmed *using the architectural encoding*. The enumeration does
NOT say anything about the encoding when the CPU doesn't report support
the event *in general*. It might support it, and it might support it
using the same encoding that made it into the architectural PMU spec.
* Fix a variety of bugs in KVM's emulation of RDPMC (more details on
individual commits) and add a selftest to verify KVM correctly emulates
RDMPC, counter availability, and a variety of other PMC-related
behaviors that depend on guest CPUID and therefore are easier to
validate with selftests than with custom guests (aka kvm-unit-tests).
* Zero out PMU state on AMD if the virtual PMU is disabled, it does not
cause any bug but it wastes time in various cases where KVM would check
if a PMC event needs to be synthesized.
* Optimize triggering of emulated events, with a nice ~10% performance
improvement in VM-Exit microbenchmarks when a vPMU is exposed to the
guest.
* Tighten the check for "PMI in guest" to reduce false positives if an NMI
arrives in the host while KVM is handling an IRQ VM-Exit.
* Fix a bug where KVM would report stale/bogus exit qualification information
when exiting to userspace with an internal error exit code.
* Add a VMX flag in /proc/cpuinfo to report 5-level EPT support.
* Rework TDP MMU root unload, free, and alloc to run with mmu_lock held for
read, e.g. to avoid serializing vCPUs when userspace deletes a memslot.
* Tear down TDP MMU page tables at 4KiB granularity (used to be 1GiB). KVM
doesn't support yielding in the middle of processing a zap, and 1GiB
granularity resulted in multi-millisecond lags that are quite impolite
for CONFIG_PREEMPT kernels.
* Allocate write-tracking metadata on-demand to avoid the memory overhead when
a kernel is built with i915 virtualization support but the workloads use
neither shadow paging nor i915 virtualization.
* Explicitly initialize a variety of on-stack variables in the emulator that
triggered KMSAN false positives.
* Fix the debugregs ABI for 32-bit KVM.
* Rework the "force immediate exit" code so that vendor code ultimately decides
how and when to force the exit, which allowed some optimization for both
Intel and AMD.
* Fix a long-standing bug where kvm_has_noapic_vcpu could be left elevated if
vCPU creation ultimately failed, causing extra unnecessary work.
* Cleanup the logic for checking if the currently loaded vCPU is in-kernel.
* Harden against underflowing the active mmu_notifier invalidation
count, so that "bad" invalidations (usually due to bugs elsehwere in the
kernel) are detected earlier and are less likely to hang the kernel.
x86 Xen emulation:
* Overlay pages can now be cached based on host virtual address,
instead of guest physical addresses. This removes the need to
reconfigure and invalidate the cache if the guest changes the
gpa but the underlying host virtual address remains the same.
* When possible, use a single host TSC value when computing the deadline for
Xen timers in order to improve the accuracy of the timer emulation.
* Inject pending upcall events when the vCPU software-enables its APIC to fix
a bug where an upcall can be lost (and to follow Xen's behavior).
* Fall back to the slow path instead of warning if "fast" IRQ delivery of Xen
events fails, e.g. if the guest has aliased xAPIC IDs.
RISC-V:
* Support exception and interrupt handling in selftests
* New self test for RISC-V architectural timer (Sstc extension)
* New extension support (Ztso, Zacas)
* Support userspace emulation of random number seed CSRs.
ARM:
* Infrastructure for building KVM's trap configuration based on the
architectural features (or lack thereof) advertised in the VM's ID
registers
* Support for mapping vfio-pci BARs as Normal-NC (vaguely similar to
x86's WC) at stage-2, improving the performance of interacting with
assigned devices that can tolerate it
* Conversion of KVM's representation of LPIs to an xarray, utilized to
address serialization some of the serialization on the LPI injection
path
* Support for _architectural_ VHE-only systems, advertised through the
absence of FEAT_E2H0 in the CPU's ID register
* Miscellaneous cleanups, fixes, and spelling corrections to KVM and
selftests
LoongArch:
* Set reserved bits as zero in CPUCFG.
* Start SW timer only when vcpu is blocking.
* Do not restart SW timer when it is expired.
* Remove unnecessary CSR register saving during enter guest.
* Misc cleanups and fixes as usual.
Generic:
* cleanup Kconfig by removing CONFIG_HAVE_KVM, which was basically always
true on all architectures except MIPS (where Kconfig determines the
available depending on CPU capabilities). It is replaced either by
an architecture-dependent symbol for MIPS, and IS_ENABLED(CONFIG_KVM)
everywhere else.
* Factor common "select" statements in common code instead of requiring
each architecture to specify it
* Remove thoroughly obsolete APIs from the uapi headers.
* Move architecture-dependent stuff to uapi/asm/kvm.h
* Always flush the async page fault workqueue when a work item is being
removed, especially during vCPU destruction, to ensure that there are no
workers running in KVM code when all references to KVM-the-module are gone,
i.e. to prevent a very unlikely use-after-free if kvm.ko is unloaded.
* Grab a reference to the VM's mm_struct in the async #PF worker itself instead
of gifting the worker a reference, so that there's no need to remember
to *conditionally* clean up after the worker.
Selftests:
* Reduce boilerplate especially when utilize selftest TAP infrastructure.
* Add basic smoke tests for SEV and SEV-ES, along with a pile of library
support for handling private/encrypted/protected memory.
* Fix benign bugs where tests neglect to close() guest_memfd files.
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmX0iP8UHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroND7wf+JZoNvwZ+bmwWe/4jn/YwNoYi/C5z
eypn8M1gsWEccpCpqPBwznVm9T29rF4uOlcMvqLEkHfTpaL1EKUUjP1lXPz/ileP
6a2RdOGxAhyTiFC9fjy+wkkjtLbn1kZf6YsS0hjphP9+w0chNbdn0w81dFVnXryd
j7XYI8R/bFAthNsJOuZXSEjCfIHxvTTG74OrTf1B1FEBB+arPmrgUeJftMVhffQK
Sowgg8L/Ii/x6fgV5NZQVSIyVf1rp8z7c6UaHT4Fwb0+RAMW8p9pYv9Qp1YkKp8y
5j0V9UzOHP7FRaYimZ5BtwQoqiZXYylQ+VuU/Y2f4X85cvlLzSqxaEMAPA==
=mqOV
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm updates from Paolo Bonzini:
"S390:
- Changes to FPU handling came in via the main s390 pull request
- Only deliver to the guest the SCLP events that userspace has
requested
- More virtual vs physical address fixes (only a cleanup since
virtual and physical address spaces are currently the same)
- Fix selftests undefined behavior
x86:
- Fix a restriction that the guest can't program a PMU event whose
encoding matches an architectural event that isn't included in the
guest CPUID. The enumeration of an architectural event only says
that if a CPU supports an architectural event, then the event can
be programmed *using the architectural encoding*. The enumeration
does NOT say anything about the encoding when the CPU doesn't
report support the event *in general*. It might support it, and it
might support it using the same encoding that made it into the
architectural PMU spec
- Fix a variety of bugs in KVM's emulation of RDPMC (more details on
individual commits) and add a selftest to verify KVM correctly
emulates RDMPC, counter availability, and a variety of other
PMC-related behaviors that depend on guest CPUID and therefore are
easier to validate with selftests than with custom guests (aka
kvm-unit-tests)
- Zero out PMU state on AMD if the virtual PMU is disabled, it does
not cause any bug but it wastes time in various cases where KVM
would check if a PMC event needs to be synthesized
- Optimize triggering of emulated events, with a nice ~10%
performance improvement in VM-Exit microbenchmarks when a vPMU is
exposed to the guest
- Tighten the check for "PMI in guest" to reduce false positives if
an NMI arrives in the host while KVM is handling an IRQ VM-Exit
- Fix a bug where KVM would report stale/bogus exit qualification
information when exiting to userspace with an internal error exit
code
- Add a VMX flag in /proc/cpuinfo to report 5-level EPT support
- Rework TDP MMU root unload, free, and alloc to run with mmu_lock
held for read, e.g. to avoid serializing vCPUs when userspace
deletes a memslot
- Tear down TDP MMU page tables at 4KiB granularity (used to be
1GiB). KVM doesn't support yielding in the middle of processing a
zap, and 1GiB granularity resulted in multi-millisecond lags that
are quite impolite for CONFIG_PREEMPT kernels
- Allocate write-tracking metadata on-demand to avoid the memory
overhead when a kernel is built with i915 virtualization support
but the workloads use neither shadow paging nor i915 virtualization
- Explicitly initialize a variety of on-stack variables in the
emulator that triggered KMSAN false positives
- Fix the debugregs ABI for 32-bit KVM
- Rework the "force immediate exit" code so that vendor code
ultimately decides how and when to force the exit, which allowed
some optimization for both Intel and AMD
- Fix a long-standing bug where kvm_has_noapic_vcpu could be left
elevated if vCPU creation ultimately failed, causing extra
unnecessary work
- Cleanup the logic for checking if the currently loaded vCPU is
in-kernel
- Harden against underflowing the active mmu_notifier invalidation
count, so that "bad" invalidations (usually due to bugs elsehwere
in the kernel) are detected earlier and are less likely to hang the
kernel
x86 Xen emulation:
- Overlay pages can now be cached based on host virtual address,
instead of guest physical addresses. This removes the need to
reconfigure and invalidate the cache if the guest changes the gpa
but the underlying host virtual address remains the same
- When possible, use a single host TSC value when computing the
deadline for Xen timers in order to improve the accuracy of the
timer emulation
- Inject pending upcall events when the vCPU software-enables its
APIC to fix a bug where an upcall can be lost (and to follow Xen's
behavior)
- Fall back to the slow path instead of warning if "fast" IRQ
delivery of Xen events fails, e.g. if the guest has aliased xAPIC
IDs
RISC-V:
- Support exception and interrupt handling in selftests
- New self test for RISC-V architectural timer (Sstc extension)
- New extension support (Ztso, Zacas)
- Support userspace emulation of random number seed CSRs
ARM:
- Infrastructure for building KVM's trap configuration based on the
architectural features (or lack thereof) advertised in the VM's ID
registers
- Support for mapping vfio-pci BARs as Normal-NC (vaguely similar to
x86's WC) at stage-2, improving the performance of interacting with
assigned devices that can tolerate it
- Conversion of KVM's representation of LPIs to an xarray, utilized
to address serialization some of the serialization on the LPI
injection path
- Support for _architectural_ VHE-only systems, advertised through
the absence of FEAT_E2H0 in the CPU's ID register
- Miscellaneous cleanups, fixes, and spelling corrections to KVM and
selftests
LoongArch:
- Set reserved bits as zero in CPUCFG
- Start SW timer only when vcpu is blocking
- Do not restart SW timer when it is expired
- Remove unnecessary CSR register saving during enter guest
- Misc cleanups and fixes as usual
Generic:
- Clean up Kconfig by removing CONFIG_HAVE_KVM, which was basically
always true on all architectures except MIPS (where Kconfig
determines the available depending on CPU capabilities). It is
replaced either by an architecture-dependent symbol for MIPS, and
IS_ENABLED(CONFIG_KVM) everywhere else
- Factor common "select" statements in common code instead of
requiring each architecture to specify it
- Remove thoroughly obsolete APIs from the uapi headers
- Move architecture-dependent stuff to uapi/asm/kvm.h
- Always flush the async page fault workqueue when a work item is
being removed, especially during vCPU destruction, to ensure that
there are no workers running in KVM code when all references to
KVM-the-module are gone, i.e. to prevent a very unlikely
use-after-free if kvm.ko is unloaded
- Grab a reference to the VM's mm_struct in the async #PF worker
itself instead of gifting the worker a reference, so that there's
no need to remember to *conditionally* clean up after the worker
Selftests:
- Reduce boilerplate especially when utilize selftest TAP
infrastructure
- Add basic smoke tests for SEV and SEV-ES, along with a pile of
library support for handling private/encrypted/protected memory
- Fix benign bugs where tests neglect to close() guest_memfd files"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (246 commits)
selftests: kvm: remove meaningless assignments in Makefiles
KVM: riscv: selftests: Add Zacas extension to get-reg-list test
RISC-V: KVM: Allow Zacas extension for Guest/VM
KVM: riscv: selftests: Add Ztso extension to get-reg-list test
RISC-V: KVM: Allow Ztso extension for Guest/VM
RISC-V: KVM: Forward SEED CSR access to user space
KVM: riscv: selftests: Add sstc timer test
KVM: riscv: selftests: Change vcpu_has_ext to a common function
KVM: riscv: selftests: Add guest helper to get vcpu id
KVM: riscv: selftests: Add exception handling support
LoongArch: KVM: Remove unnecessary CSR register saving during enter guest
LoongArch: KVM: Do not restart SW timer when it is expired
LoongArch: KVM: Start SW timer only when vcpu is blocking
LoongArch: KVM: Set reserved bits as zero in CPUCFG
KVM: selftests: Explicitly close guest_memfd files in some gmem tests
KVM: x86/xen: fix recursive deadlock in timer injection
KVM: pfncache: simplify locking and make more self-contained
KVM: x86/xen: remove WARN_ON_ONCE() with false positives in evtchn delivery
KVM: x86/xen: inject vCPU upcall vector when local APIC is enabled
KVM: x86/xen: improve accuracy of Xen timers
...
- The biggest change is the rework of the percpu code,
to support the 'Named Address Spaces' GCC feature,
by Uros Bizjak:
- This allows C code to access GS and FS segment relative
memory via variables declared with such attributes,
which allows the compiler to better optimize those accesses
than the previous inline assembly code.
- The series also includes a number of micro-optimizations
for various percpu access methods, plus a number of
cleanups of %gs accesses in assembly code.
- These changes have been exposed to linux-next testing for
the last ~5 months, with no known regressions in this area.
- Fix/clean up __switch_to()'s broken but accidentally
working handling of FPU switching - which also generates
better code.
- Propagate more RIP-relative addressing in assembly code,
to generate slightly better code.
- Rework the CPU mitigations Kconfig space to be less idiosyncratic,
to make it easier for distros to follow & maintain these options.
- Rework the x86 idle code to cure RCU violations and
to clean up the logic.
- Clean up the vDSO Makefile logic.
- Misc cleanups and fixes.
[ Please note that there's a higher number of merge commits in
this branch (three) than is usual in x86 topic trees. This happened
due to the long testing lifecycle of the percpu changes that
involved 3 merge windows, which generated a longer history
and various interactions with other core x86 changes that we
felt better about to carry in a single branch. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmXvB0gRHG1pbmdvQGtl
cm5lbC5vcmcACgkQEnMQ0APhK1jUqRAAqnEQPiabF5acQlHrwviX+cjSobDlqtH5
9q2AQy9qaEHapzD0XMOxvFye6XIvehGOGxSPvk6CoviSxBND8rb56lvnsEZuLeBV
Bo5QSIL2x42Zrvo11iPHwgXZfTIusU90sBuKDRFkYBAxY3HK2naMDZe8MAsYCUE9
nwgHF8DDc/NYiSOXV8kosWoWpNIkoK/STyH5bvTQZMqZcwyZ49AIeP1jGZb/prbC
e/rbnlrq5Eu6brpM7xo9kELO0Vhd34urV14KrrIpdkmUKytW2KIsyvW8D6fqgDBj
NSaQLLcz0pCXbhF+8Nqvdh/1coR4L7Ymt08P1rfEjCsQgb/2WnSAGUQuC5JoGzaj
ngkbFcZllIbD9gNzMQ1n4Aw5TiO+l9zxCqPC/r58Uuvstr+K9QKlwnp2+B3Q73Ft
rojIJ04NJL6lCHdDgwAjTTks+TD2PT/eBWsDfJ/1pnUWttmv9IjMpnXD5sbHxoiU
2RGGKnYbxXczYdq/ALYDWM6JXpfnJZcXL3jJi0IDcCSsb92xRvTANYFHnTfyzGfw
EHkhbF4e4Vy9f6QOkSP3CvW5H26BmZS9DKG0J9Il5R3u2lKdfbb5vmtUmVTqHmAD
Ulo5cWZjEznlWCAYSI/aIidmBsp9OAEvYd+X7Z5SBIgTfSqV7VWHGt0BfA1heiVv
F/mednG0gGc=
=3v4F
-----END PGP SIGNATURE-----
Merge tag 'x86-core-2024-03-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull core x86 updates from Ingo Molnar:
- The biggest change is the rework of the percpu code, to support the
'Named Address Spaces' GCC feature, by Uros Bizjak:
- This allows C code to access GS and FS segment relative memory
via variables declared with such attributes, which allows the
compiler to better optimize those accesses than the previous
inline assembly code.
- The series also includes a number of micro-optimizations for
various percpu access methods, plus a number of cleanups of %gs
accesses in assembly code.
- These changes have been exposed to linux-next testing for the
last ~5 months, with no known regressions in this area.
- Fix/clean up __switch_to()'s broken but accidentally working handling
of FPU switching - which also generates better code
- Propagate more RIP-relative addressing in assembly code, to generate
slightly better code
- Rework the CPU mitigations Kconfig space to be less idiosyncratic, to
make it easier for distros to follow & maintain these options
- Rework the x86 idle code to cure RCU violations and to clean up the
logic
- Clean up the vDSO Makefile logic
- Misc cleanups and fixes
* tag 'x86-core-2024-03-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (52 commits)
x86/idle: Select idle routine only once
x86/idle: Let prefer_mwait_c1_over_halt() return bool
x86/idle: Cleanup idle_setup()
x86/idle: Clean up idle selection
x86/idle: Sanitize X86_BUG_AMD_E400 handling
sched/idle: Conditionally handle tick broadcast in default_idle_call()
x86: Increase brk randomness entropy for 64-bit systems
x86/vdso: Move vDSO to mmap region
x86/vdso/kbuild: Group non-standard build attributes and primary object file rules together
x86/vdso: Fix rethunk patching for vdso-image-{32,64}.o
x86/retpoline: Ensure default return thunk isn't used at runtime
x86/vdso: Use CONFIG_COMPAT_32 to specify vdso32
x86/vdso: Use $(addprefix ) instead of $(foreach )
x86/vdso: Simplify obj-y addition
x86/vdso: Consolidate targets and clean-files
x86/bugs: Rename CONFIG_RETHUNK => CONFIG_MITIGATION_RETHUNK
x86/bugs: Rename CONFIG_CPU_SRSO => CONFIG_MITIGATION_SRSO
x86/bugs: Rename CONFIG_CPU_IBRS_ENTRY => CONFIG_MITIGATION_IBRS_ENTRY
x86/bugs: Rename CONFIG_CPU_UNRET_ENTRY => CONFIG_MITIGATION_UNRET_ENTRY
x86/bugs: Rename CONFIG_SLS => CONFIG_MITIGATION_SLS
...
kernel to be used as a KVM hypervisor capable of running SNP (Secure
Nested Paging) guests. Roughly speaking, SEV-SNP is the ultimate goal
of the AMD confidential computing side, providing the most
comprehensive confidential computing environment up to date.
This is the x86 part and there is a KVM part which did not get ready
in time for the merge window so latter will be forthcoming in the next
cycle.
- Rework the early code's position-dependent SEV variable references in
order to allow building the kernel with clang and -fPIE/-fPIC and
-mcmodel=kernel
- The usual set of fixes, cleanups and improvements all over the place
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmXvH0wACgkQEsHwGGHe
VUrzmA//VS/n6dhHRnm/nAGngr4PeegkgV1OhyKYFfiZ272rT6P9QvblQrgcY0dc
Ij1DOhEKlke51pTHvMOQ33B3P4Fuc0mx3dpCLY0up5V26kzQiKCjRKEkC4U1bcw8
W4GqMejaR89bE14bYibmwpSib9T/uVsV65eM3xf1iF5UvsnoUaTziymDoy+nb43a
B1pdd5vcl4mBNqXeEvt0qjg+xkMLpWUI9tJDB8mbMl/cnIFGgMZzBaY8oktHSROK
QpuUnKegOgp1RXpfLbNjmZ2Q4Rkk4MNazzDzWq3EIxaRjXL3Qp507ePK7yeA2qa0
J3jCBQc9E2j7lfrIkUgNIzOWhMAXM2YH5bvH6UrIcMi1qsWJYDmkp2MF1nUedjdf
Wj16/pJbeEw1aKKIywJGwsmViSQju158vY3SzXG83U/A/Iz7zZRHFmC/ALoxZptY
Bi7VhfcOSpz98PE3axnG8CvvxRDWMfzBr2FY1VmQbg6VBNo1Xl1aP/IH1I8iQNKg
/laBYl/qP+1286TygF1lthYROb1lfEIJprgi2xfO6jVYUqPb7/zq2sm78qZRfm7l
25PN/oHnuidfVfI/H3hzcGubjOG9Zwra8WWYBB2EEmelf21rT0OLqq+eS4T6pxFb
GNVfc0AzG77UmqbrpkAMuPqL7LrGaSee4NdU3hkEdSphlx1/YTo=
=c1ps
-----END PGP SIGNATURE-----
Merge tag 'x86_sev_for_v6.9_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 SEV updates from Borislav Petkov:
- Add the x86 part of the SEV-SNP host support.
This will allow the kernel to be used as a KVM hypervisor capable of
running SNP (Secure Nested Paging) guests. Roughly speaking, SEV-SNP
is the ultimate goal of the AMD confidential computing side,
providing the most comprehensive confidential computing environment
up to date.
This is the x86 part and there is a KVM part which did not get ready
in time for the merge window so latter will be forthcoming in the
next cycle.
- Rework the early code's position-dependent SEV variable references in
order to allow building the kernel with clang and -fPIE/-fPIC and
-mcmodel=kernel
- The usual set of fixes, cleanups and improvements all over the place
* tag 'x86_sev_for_v6.9_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits)
x86/sev: Disable KMSAN for memory encryption TUs
x86/sev: Dump SEV_STATUS
crypto: ccp - Have it depend on AMD_IOMMU
iommu/amd: Fix failure return from snp_lookup_rmpentry()
x86/sev: Fix position dependent variable references in startup code
crypto: ccp: Make snp_range_list static
x86/Kconfig: Remove CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT
Documentation: virt: Fix up pre-formatted text block for SEV ioctls
crypto: ccp: Add the SNP_SET_CONFIG command
crypto: ccp: Add the SNP_COMMIT command
crypto: ccp: Add the SNP_PLATFORM_STATUS command
x86/cpufeatures: Enable/unmask SEV-SNP CPU feature
KVM: SEV: Make AVIC backing, VMSA and VMCB memory allocation SNP safe
crypto: ccp: Add panic notifier for SEV/SNP firmware shutdown on kdump
iommu/amd: Clean up RMP entries for IOMMU pages during SNP shutdown
crypto: ccp: Handle legacy SEV commands when SNP is enabled
crypto: ccp: Handle non-volatile INIT_EX data when SNP is enabled
crypto: ccp: Handle the legacy TMR allocation when SNP is enabled
x86/sev: Introduce an SNP leaked pages list
crypto: ccp: Provide an API to issue SEV and SNP commands
...
FRED is a replacement for IDT event delivery on x86 and addresses most of
the technical nightmares which IDT exposes:
1) Exception cause registers like CR2 need to be manually preserved in
nested exception scenarios.
2) Hardware interrupt stack switching is suboptimal for nested exceptions
as the interrupt stack mechanism rewinds the stack on each entry which
requires a massive effort in the low level entry of #NMI code to handle
this.
3) No hardware distinction between entry from kernel or from user which
makes establishing kernel context more complex than it needs to be
especially for unconditionally nestable exceptions like NMI.
4) NMI nesting caused by IRET unconditionally reenabling NMIs, which is a
problem when the perf NMI takes a fault when collecting a stack trace.
5) Partial restore of ESP when returning to a 16-bit segment
6) Limitation of the vector space which can cause vector exhaustion on
large systems.
7) Inability to differentiate NMI sources
FRED addresses these shortcomings by:
1) An extended exception stack frame which the CPU uses to save exception
cause registers. This ensures that the meta information for each
exception is preserved on stack and avoids the extra complexity of
preserving it in software.
2) Hardware interrupt stack switching is non-rewinding if a nested
exception uses the currently interrupt stack.
3) The entry points for kernel and user context are separate and GS BASE
handling which is required to establish kernel context for per CPU
variable access is done in hardware.
4) NMIs are now nesting protected. They are only reenabled on the return
from NMI.
5) FRED guarantees full restore of ESP
6) FRED does not put a limitation on the vector space by design because it
uses a central entry points for kernel and user space and the CPUstores
the entry type (exception, trap, interrupt, syscall) on the entry stack
along with the vector number. The entry code has to demultiplex this
information, but this removes the vector space restriction.
The first hardware implementations will still have the current
restricted vector space because lifting this limitation requires
further changes to the local APIC.
7) FRED stores the vector number and meta information on stack which
allows having more than one NMI vector in future hardware when the
required local APIC changes are in place.
The series implements the initial FRED support by:
- Reworking the existing entry and IDT handling infrastructure to
accomodate for the alternative entry mechanism.
- Expanding the stack frame to accomodate for the extra 16 bytes FRED
requires to store context and meta information
- Providing FRED specific C entry points for events which have information
pushed to the extended stack frame, e.g. #PF and #DB.
- Providing FRED specific C entry points for #NMI and #MCE
- Implementing the FRED specific ASM entry points and the C code to
demultiplex the events
- Providing detection and initialization mechanisms and the necessary
tweaks in context switching, GS BASE handling etc.
The FRED integration aims for maximum code reuse vs. the existing IDT
implementation to the extent possible and the deviation in hot paths like
context switching are handled with alternatives to minimalize the
impact. The low level entry and exit paths are seperate due to the extended
stack frame and the hardware based GS BASE swichting and therefore have no
impact on IDT based systems.
It has been extensively tested on existing systems and on the FRED
simulation and as of now there are know outstanding problems.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmXuKPgTHHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYoWyUEACevJMHU+Ot9zqBPizSWxByM1uunHbp
bjQXhaFeskd3mt7k7HU6GsPRSmC3q4lliP1Y9ypfbU0DvYSI2h/PhMWizjhmot2y
nIvFpl51r/NsI+JHx1oXcFetz0eGHEqBui/4YQ/swgOCMymYgfqgHhazXTdldV3g
KpH9/8W3AeGvw79uzXFH9tjBzTkbvywpam3v0LYNDJWTCuDkilyo8PjhsgRZD4x3
V9f1nLD7nSHZW8XLoktdJJ38bKwI2Lhao91NQ0ErwopekA4/9WphZEKsDpidUSXJ
sn1O148oQ8X92IO2OaQje8XC5pLGr5GqQBGPWzRH56P/Vd3+WOwBxaFoU6Drxc5s
tIe23ZjkVcpA8EEG7BQBZV1Un/NX7XaCCnMniOt0RauXw+1NaslX7t/tnUAh5F1V
TWCH4D0I0oJ0qJ7kNliGn2BP3agYXOVg81xVEUjT6KfHcYU4ImUrwi+BkeNXuXtL
Ch5ADnbYAcUjWLFnAmEmaRtfmfNGY5T7PeGFHW2RRkaOJ88v5g14Voo6gPJaDUPn
wMQ0nLq1xN4xZWF6ZgfRqAhArvh20k38ZujRku5vXEqnhOugQ76TF2UYiFEwOXbQ
8jcM+yEBLGgBz7tGMwmIAml6kfxaFF1KPpdrtcPxNkGlbE6KTSuIolLx2YGUvlSU
6/O8nwZy49ckmQ==
=Ib7w
-----END PGP SIGNATURE-----
Merge tag 'x86-fred-2024-03-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 FRED support from Thomas Gleixner:
"Support for x86 Fast Return and Event Delivery (FRED).
FRED is a replacement for IDT event delivery on x86 and addresses most
of the technical nightmares which IDT exposes:
1) Exception cause registers like CR2 need to be manually preserved
in nested exception scenarios.
2) Hardware interrupt stack switching is suboptimal for nested
exceptions as the interrupt stack mechanism rewinds the stack on
each entry which requires a massive effort in the low level entry
of #NMI code to handle this.
3) No hardware distinction between entry from kernel or from user
which makes establishing kernel context more complex than it needs
to be especially for unconditionally nestable exceptions like NMI.
4) NMI nesting caused by IRET unconditionally reenabling NMIs, which
is a problem when the perf NMI takes a fault when collecting a
stack trace.
5) Partial restore of ESP when returning to a 16-bit segment
6) Limitation of the vector space which can cause vector exhaustion
on large systems.
7) Inability to differentiate NMI sources
FRED addresses these shortcomings by:
1) An extended exception stack frame which the CPU uses to save
exception cause registers. This ensures that the meta information
for each exception is preserved on stack and avoids the extra
complexity of preserving it in software.
2) Hardware interrupt stack switching is non-rewinding if a nested
exception uses the currently interrupt stack.
3) The entry points for kernel and user context are separate and GS
BASE handling which is required to establish kernel context for
per CPU variable access is done in hardware.
4) NMIs are now nesting protected. They are only reenabled on the
return from NMI.
5) FRED guarantees full restore of ESP
6) FRED does not put a limitation on the vector space by design
because it uses a central entry points for kernel and user space
and the CPUstores the entry type (exception, trap, interrupt,
syscall) on the entry stack along with the vector number. The
entry code has to demultiplex this information, but this removes
the vector space restriction.
The first hardware implementations will still have the current
restricted vector space because lifting this limitation requires
further changes to the local APIC.
7) FRED stores the vector number and meta information on stack which
allows having more than one NMI vector in future hardware when the
required local APIC changes are in place.
The series implements the initial FRED support by:
- Reworking the existing entry and IDT handling infrastructure to
accomodate for the alternative entry mechanism.
- Expanding the stack frame to accomodate for the extra 16 bytes FRED
requires to store context and meta information
- Providing FRED specific C entry points for events which have
information pushed to the extended stack frame, e.g. #PF and #DB.
- Providing FRED specific C entry points for #NMI and #MCE
- Implementing the FRED specific ASM entry points and the C code to
demultiplex the events
- Providing detection and initialization mechanisms and the necessary
tweaks in context switching, GS BASE handling etc.
The FRED integration aims for maximum code reuse vs the existing IDT
implementation to the extent possible and the deviation in hot paths
like context switching are handled with alternatives to minimalize the
impact. The low level entry and exit paths are seperate due to the
extended stack frame and the hardware based GS BASE swichting and
therefore have no impact on IDT based systems.
It has been extensively tested on existing systems and on the FRED
simulation and as of now there are no outstanding problems"
* tag 'x86-fred-2024-03-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (38 commits)
x86/fred: Fix init_task thread stack pointer initialization
MAINTAINERS: Add a maintainer entry for FRED
x86/fred: Fix a build warning with allmodconfig due to 'inline' failing to inline properly
x86/fred: Invoke FRED initialization code to enable FRED
x86/fred: Add FRED initialization functions
x86/syscall: Split IDT syscall setup code into idt_syscall_init()
KVM: VMX: Call fred_entry_from_kvm() for IRQ/NMI handling
x86/entry: Add fred_entry_from_kvm() for VMX to handle IRQ/NMI
x86/entry/calling: Allow PUSH_AND_CLEAR_REGS being used beyond actual entry code
x86/fred: Fixup fault on ERETU by jumping to fred_entrypoint_user
x86/fred: Let ret_from_fork_asm() jmp to asm_fred_exit_user when FRED is enabled
x86/traps: Add sysvec_install() to install a system interrupt handler
x86/fred: FRED entry/exit and dispatch code
x86/fred: Add a machine check entry stub for FRED
x86/fred: Add a NMI entry stub for FRED
x86/fred: Add a debug fault entry stub for FRED
x86/idtentry: Incorporate definitions/declarations of the FRED entries
x86/fred: Make exc_page_fault() work for FRED
x86/fred: Allow single-step trap and NMI when starting a new task
x86/fred: No ESPFIX needed when FRED is enabled
...
Implement arch-specific init_orc_entry(), write_orc_entry(), reg_name(),
orc_type_name(), print_reg() and orc_print_dump(), then set BUILD_ORC as
y to build the orc related files.
Co-developed-by: Jinyang He <hejinyang@loongson.cn>
Signed-off-by: Jinyang He <hejinyang@loongson.cn>
Co-developed-by: Youling Tang <tangyouling@loongson.cn>
Signed-off-by: Youling Tang <tangyouling@loongson.cn>
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Only copy the minimal definitions of instruction opcodes and formats
in inst.h from arch/loongarch to tools/arch/loongarch, and also copy
the definition of sign_extend64() to tools/include/linux/bitops.h to
decode the following kinds of instructions:
(1) stack pointer related instructions
addi.d, ld.d, st.d, ldptr.d and stptr.d
(2) branch and jump related instructions
beq, bne, blt, bge, bltu, bgeu, beqz, bnez, bceqz, bcnez, b, bl and jirl
(3) other instructions
break, nop and ertn
See more info about instructions in LoongArch Reference Manual:
https://loongson.github.io/LoongArch-Documentation/LoongArch-Vol1-EN.html
Co-developed-by: Jinyang He <hejinyang@loongson.cn>
Signed-off-by: Jinyang He <hejinyang@loongson.cn>
Co-developed-by: Youling Tang <tangyouling@loongson.cn>
Signed-off-by: Youling Tang <tangyouling@loongson.cn>
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
- Exception and interrupt handling for selftests
- Sstc (aka arch_timer) selftest
- Forward seed CSR access to KVM userspace
- Ztso extension support for Guest/VM
- Zacas extension support for Guest/VM
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEZdn75s5e6LHDQ+f/rUjsVaLHLAcFAmXpXfMACgkQrUjsVaLH
LAfZoRAAgIJqgL9jMQ2dliTq8sk24dQQUPo1V2yP7yEYLeUyp/9EAR31ZhBDP1EE
yTT0yvEbyn21fxhdHagVnqJXOepHz1MPmGGxEFx96RVib/m80zDhyDKogW6IgP4c
e9yXW1Wo6m0R+7aQhn8YA9+47xbmq/cNDCwjlkIp0oL/SyktJdTcPjZUTr524jde
dYTGLqSyoQyZMm+wBrJYTiME6nFK3RhKf7V9Mn77VTTFnhIk4J2upl+1kE2pLTAQ
Zp3EwXCK1B2D2J1bdOuqclCApglw3H0CM4c81knDaEyB4w/l/OwpuCA+u58tSU9g
z6DTO+vYvwlROmfeFjjmHKx1pl9uSktYlFlVqinelW7IG7Y3qDD1zPnbT27OpkLP
rFIsF4Dm42MnmcC0sTxaMgKQtMvb56lgpaoa9XHL/DD76pAUvgKoWUnWay+32j1e
8Hhx/PEp16ALGvDfm+9Wo8AgbvrFGl37epe2LXFFr6+zOzmGN+6vATW2EqgJ3Ueo
P5TNkcFSyKg61r0moEr/ZSKZNvv8MuVUKSe0EmPykIccqg6oYHAoAJ66FL5yaX1k
n/aIVNnIavzwlN6DeaQyjAxZxfSg39Z7wOH5PyKmateYKg9yM8P3RY/DndiO6F3c
me9Q0eshu+C7h4YBRXPVstdWkkhCSVa+TN1b/6hnlSP0HLtyHik=
=cWpl
-----END PGP SIGNATURE-----
Merge tag 'kvm-riscv-6.9-1' of https://github.com/kvm-riscv/linux into HEAD
KVM/riscv changes for 6.9
- Exception and interrupt handling for selftests
- Sstc (aka arch_timer) selftest
- Forward seed CSR access to KVM userspace
- Ztso extension support for Guest/VM
- Zacas extension support for Guest/VM
- Make KVM_MEM_GUEST_MEMFD mutually exclusive with KVM_MEM_READONLY to
avoid creating ABI that KVM can't sanely support.
- Update documentation for KVM_SW_PROTECTED_VM to make it abundantly
clear that such VMs are purely a development and testing vehicle, and
come with zero guarantees.
- Limit KVM_SW_PROTECTED_VM guests to the TDP MMU, as the long term plan
is to support confidential VMs with deterministic private memory (SNP
and TDX) only in the TDP MMU.
- Fix a bug in a GUEST_MEMFD negative test that resulted in false passes
when verifying that KVM_MEM_GUEST_MEMFD memslots can't be dirty logged.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEKTobbabEP7vbhhN9OlYIJqCjN/0FAmXZB/8ACgkQOlYIJqCj
N/3XlQ//RIsvqr38k7kELSKhCMyWgF4J57itABrHpMqAZu3gaAo5sETX8AGcHEe5
mxmquxyNQSf4cthhWy1kzxjGCy6+fk+Z0Z7wzfz0Yd5D+FI6vpo3HhkjovLb2gpt
kSrHuhJyuj2vkftNvdaz0nHX1QalVyIEnXnR3oqTmxUUsg6lp1x/zr5SP0KBXjo8
ZzJtyFd0fkRXWpA792T7XPRBWrzPV31HYZBLX8sPlYmJATcbIx9rYSThgCN6XuVN
bfE6wATsC+mwv5BpCoDFpCKmFcqSqamag9NGe5qE5mOby5DQGYTCRMCQB8YXXBR0
97ppaY9ZJV4nOVjrYJn6IMOSMVNfoG7nTRFfcd0eFP4tlPEgHwGr5BGDaBtQPkrd
KcgWJw8nS02eCA2iOE+FtCXvGJwKhTTjQ45w7rU4EcfUk603L5J4GO1ddmjMhPcP
upGGcWDK9vCGrSUFTm8pyWp/NKRJPvAQEiQd/BweSk9+isQHTX2RYCQgPAQnwlTS
wTg7ZPNSLoUkRYmd6r+TUT32ELJGNc8GLftMnxIwweq6V7AgNMi0HE60eMovuBNO
7DAWWzfBEZmJv+0mNNZPGXczHVv4YvMWysRdKkhztBc3+sO7P3AL1zWIDlm5qwoG
LpFeeI3qo3o5ZNaqGzkSop2pUUGNGpWCH46WmP0AG7RpzW/Natw=
=M0td
-----END PGP SIGNATURE-----
Merge tag 'kvm-x86-guest_memfd_fixes-6.8' of https://github.com/kvm-x86/linux into HEAD
KVM GUEST_MEMFD fixes for 6.8:
- Make KVM_MEM_GUEST_MEMFD mutually exclusive with KVM_MEM_READONLY to
avoid creating ABI that KVM can't sanely support.
- Update documentation for KVM_SW_PROTECTED_VM to make it abundantly
clear that such VMs are purely a development and testing vehicle, and
come with zero guarantees.
- Limit KVM_SW_PROTECTED_VM guests to the TDP MMU, as the long term plan
is to support confidential VMs with deterministic private memory (SNP
and TDX) only in the TDP MMU.
- Fix a bug in a GUEST_MEMFD negative test that resulted in false passes
when verifying that KVM_MEM_GUEST_MEMFD memslots can't be dirty logged.
Borrow the cpu_relax() definitions from kernel's
arch/riscv/include/asm/vdso/processor.h to tools/ for riscv.
Signed-off-by: Haibo Xu <haibo1.xu@intel.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
Borrow the csr definitions and operations from kernel's
arch/riscv/include/asm/csr.h to tools/ for riscv.
Signed-off-by: Haibo Xu <haibo1.xu@intel.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
No point in checking again as this was already done by the caller.
Signed-off-by: Nikolay Borisov <nik.borisov@suse.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240222111636.2214523-3-nik.borisov@suse.com
It's pointless checking if a particular part of an instruction is
decoded before calling the routine responsible for decoding it as this
check is duplicated in the routines itself. Streamline the code by
removing the superfluous checks. No functional difference.
Signed-off-by: Nikolay Borisov <nik.borisov@suse.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240222111636.2214523-2-nik.borisov@suse.com
We've had issues with gcc and 'asm goto' before, and we created a
'asm_volatile_goto()' macro for that in the past: see commits
3f0116c323 ("compiler/gcc4: Add quirk for 'asm goto' miscompilation
bug") and a9f180345f ("compiler/gcc4: Make quirk for
asm_volatile_goto() unconditional").
Then, much later, we ended up removing the workaround in commit
43c249ea0b ("compiler-gcc.h: remove ancient workaround for gcc PR
58670") because we no longer supported building the kernel with the
affected gcc versions, but we left the macro uses around.
Now, Sean Christopherson reports a new version of a very similar
problem, which is fixed by re-applying that ancient workaround. But the
problem in question is limited to only the 'asm goto with outputs'
cases, so instead of re-introducing the old workaround as-is, let's
rename and limit the workaround to just that much less common case.
It looks like there are at least two separate issues that all hit in
this area:
(a) some versions of gcc don't mark the asm goto as 'volatile' when it
has outputs:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98619https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110420
which is easy to work around by just adding the 'volatile' by hand.
(b) Internal compiler errors:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110422
which are worked around by adding the extra empty 'asm' as a
barrier, as in the original workaround.
but the problem Sean sees may be a third thing since it involves bad
code generation (not an ICE) even with the manually added 'volatile'.
but the same old workaround works for this case, even if this feels a
bit like voodoo programming and may only be hiding the issue.
Reported-and-tested-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/all/20240208220604.140859-1-seanjc@google.com/
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Uros Bizjak <ubizjak@gmail.com>
Cc: Jakub Jelinek <jakub@redhat.com>
Cc: Andrew Pinski <quic_apinski@quicinc.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
It is more accurate to check if KVM is enabled, instead of having the
architecture say so. Architectures always "have" KVM, so for example
checking CONFIG_HAVE_KVM in x86 code is pointless, but if KVM is disabled
in a specific build, there is no need for support code.
Alternatively, many of the #ifdefs could simply be deleted. However,
this would add completely dead code. For example, when KVM is disabled,
there should not be any posted interrupts, i.e. NOT wiring up the "dummy"
handlers and treating IRQs on those vectors as spurious is the right
thing to do.
Cc: x86@kernel.org
Cc: kbingham@kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add MSR numbers for the FRED configuration registers per FRED spec 5.0.
Originally-by: Megha Dey <megha.dey@intel.com>
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Signed-off-by: Xin Li <xin3.li@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Tested-by: Shan Kang <shan.kang@intel.com>
Link: https://lore.kernel.org/r/20231205105030.8698-13-xin3.li@intel.com
ERETU returns from an event handler while making a transition to ring 3,
and ERETS returns from an event handler while staying in ring 0.
Add instruction opcodes used by ERET[US] to the x86 opcode map; opcode
numbers are per FRED spec v5.0.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Signed-off-by: Xin Li <xin3.li@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Tested-by: Shan Kang <shan.kang@intel.com>
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Link: https://lore.kernel.org/r/20231205105030.8698-10-xin3.li@intel.com
This is to get the changes from:
94ea9c0521 ("x86/headers: Replace #include <asm/export.h> with #include <linux/export.h>")
10f4c9b9a3 ("x86/asm: Fix build of UML with KASAN")
That addresses these perf tools build warning:
Warning: Kernel ABI header differences:
diff -u tools/arch/x86/lib/memcpy_64.S arch/x86/lib/memcpy_64.S
diff -u tools/arch/x86/lib/memset_64.S arch/x86/lib/memset_64.S
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Vincent Whitchurch <vincent.whitchurch@axis.com>
Link: https://lore.kernel.org/lkml/ZbkIKpKdNqOFdMwJ@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To pick the changes from:
1e536e1068 ("x86/cpu: Detect TDX partial write machine check erratum")
765a0542fd ("x86/virt/tdx: Detect TDX during kernel boot")
30fa92832f ("x86/CPU/AMD: Add ZenX generations flags")
04c3024560 ("x86/barrier: Do not serialize MSR accesses on AMD")
This causes these perf files to be rebuilt and brings some X86_FEATURE
that will be used when updating the copies of
tools/arch/x86/lib/mem{cpy,set}_64.S with the kernel sources:
CC /tmp/build/perf/bench/mem-memcpy-x86-64-asm.o
CC /tmp/build/perf/bench/mem-memset-x86-64-asm.o
And addresses this perf build warning:
Warning: Kernel ABI header differences:
diff -u tools/arch/x86/include/asm/cpufeatures.h arch/x86/include/asm/cpufeatures.h
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kai Huang <kai.huang@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>