With multiple PCI segment support, the number of BDFs supported by each
segment may differ. Hence introduce a per-segment device table size
which depends on last_bdf. This will replace the global
"device_table_size" variable.
Co-developed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Link: https://lore.kernel.org/r/20220706113825.25582-12-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The current code uses the global "amd_iommu_last_bdf" to track the last BDF
supported by the system. This value is used for various memory
allocations, device data flushing, etc.
Introduce a per-PCI-segment last_bdf, which will be used to track the last BDF
supported by the given PCI segment, and use this value for all per-segment
memory allocations. Eventually it will replace the global
"amd_iommu_last_bdf".
Co-developed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Link: https://lore.kernel.org/r/20220706113825.25582-11-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Newer AMD systems can support multiple PCI segments. In order to support
multiple PCI segments, the IVMD table in the IVRS structure is enhanced to
include a PCI segment ID. Update the ivmd_header structure to include "pci_seg".
Also introduce a per-PCI-segment unity map list. It will replace the global
amd_iommu_unity_map list.
Note that the "pci_seg id" is carried in what used to be a "reserved" field
in the IVMD table, which was always set to zero. This takes care of backward
compatibility (a new kernel will work fine on older systems).
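For illustration, the updated header could look roughly like this (the exact split of the old reserved area is an assumption of the sketch, not taken from the spec):

  /* One IVMD entry in the ACPI IVRS table. */
  struct ivmd_header {
          u8  type;
          u8  flags;
          u16 length;
          u16 devid;
          u16 aux;
          u16 pci_seg;	/* carved out of the formerly all-zero reserved field */
          u8  resv[6];	/* remaining reserved bytes */
          u64 range_start;
          u64 range_length;
  } __attribute__((packed));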
Co-developed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Link: https://lore.kernel.org/r/20220706113825.25582-10-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Introduce a per-PCI-segment rlookup table. This will replace the global
rlookup table (amd_iommu_rlookup_table).
Add helper functions to set/get the rlookup table for a given device.
Also add macros to get the seg/devid from an sbdf.
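A sketch of what such helpers could look like (the macro names, the 16/16 bit split of sbdf, and the find_pci_segment() helper are assumptions made for the example):

  /* sbdf packs segment:bus:device:function into 32 bits. */
  #define PCI_SBDF_TO_SEGID(sbdf)	(((sbdf) >> 16) & 0xffff)
  #define PCI_SBDF_TO_DEVID(sbdf)	((sbdf) & 0xffff)

  /* Map an sbdf to the IOMMU serving it, via the segment's own rlookup table. */
  static struct amd_iommu *rlookup_iommu(u32 sbdf)
  {
          struct amd_iommu_pci_seg *pci_seg =
                  find_pci_segment(PCI_SBDF_TO_SEGID(sbdf));	/* assumed helper */

          return pci_seg ? pci_seg->rlookup_table[PCI_SBDF_TO_DEVID(sbdf)] : NULL;
  }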
Co-developed-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Link: https://lore.kernel.org/r/20220706113825.25582-5-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Introduce a per-PCI-segment device table. All IOMMUs within the segment
will share this device table. This will replace the global device
table (i.e. amd_iommu_dev_table).
Also introduce a helper function to get the device table for a given IOMMU.
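A minimal sketch of such a helper, assuming the iommu->pci_seg pointer introduced elsewhere in this series:

  /* Every IOMMU in a segment shares that segment's device table. */
  static inline struct dev_table_entry *get_dev_table(struct amd_iommu *iommu)
  {
          return iommu->pci_seg->dev_table;
  }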
Co-developed-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Link: https://lore.kernel.org/r/20220706113825.25582-4-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Newer AMD systems can support multiple PCI segments, where each segment
contains one or more IOMMU instances. However, an IOMMU instance can only
support a single PCI segment.
The current code assumes that the system contains only one PCI segment
(segment 0) and creates global data structures such as the device table,
rlookup table, etc.
Introduce a per-PCI-segment data structure, which contains the segment-specific
data structures. This will eventually replace the global
data structures.
Also update the `amd_iommu->pci_seg` variable to point to the PCI segment
structure instead of the PCI segment ID.
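The shape of the change, roughly (only a subset of fields is shown and the names are illustrative):

  struct amd_iommu_pci_seg {
          struct list_head list;	/* linked into a global list of segments */
          u16 id;			/* PCI segment number */
          /* per-segment device table, rlookup table, unity map list, ... */
  };

  struct amd_iommu {
          /* ... */
          struct amd_iommu_pci_seg *pci_seg;	/* was: a bare segment number */
  };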
Co-developed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Link: https://lore.kernel.org/r/20220706113825.25582-3-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Merge tag 'iommu-updates-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull iommu updates from Joerg Roedel:
 - Intel VT-d driver updates:
     - Domain force snooping improvement.
     - Cleanups, no intentional functional changes.
 - ARM SMMU driver updates:
     - Add new Qualcomm device-tree compatible strings
     - Add new Nvidia device-tree compatible string for Tegra234
     - Fix UAF in SMMUv3 shared virtual addressing code
     - Force identity-mapped domains for users of ye olde SMMU legacy
       binding
     - Minor cleanups
 - Fix a BUG_ON in the vfio_iommu_group_notifier:
     - Groundwork for upcoming iommufd framework
     - Introduction of DMA ownership so that an entire IOMMU group is
       either controlled by the kernel or by user-space
 - MT8195 and MT8186 support in the Mediatek IOMMU driver
 - Make forcing of cache-coherent DMA more coherent between IOMMU
   drivers
 - Fixes for thunderbolt device DMA protection
 - Various smaller fixes and cleanups
* tag 'iommu-updates-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (88 commits)
iommu/amd: Increase timeout waiting for GA log enablement
iommu/s390: Tolerate repeat attach_dev calls
iommu/vt-d: Remove hard coding PGSNP bit in PASID entries
iommu/vt-d: Remove domain_update_iommu_snooping()
iommu/vt-d: Check domain force_snooping against attached devices
iommu/vt-d: Block force-snoop domain attaching if no SC support
iommu/vt-d: Size Page Request Queue to avoid overflow condition
iommu/vt-d: Fold dmar_insert_one_dev_info() into its caller
iommu/vt-d: Change return type of dmar_insert_one_dev_info()
iommu/vt-d: Remove unneeded validity check on dev
iommu/dma: Explicitly sort PCI DMA windows
iommu/dma: Fix iova map result check bug
iommu/mediatek: Fix NULL pointer dereference when printing dev_name
iommu: iommu_group_claim_dma_owner() must always assign a domain
iommu/arm-smmu: Force identity domains for legacy binding
iommu/arm-smmu: Support Tegra234 SMMU
dt-bindings: arm-smmu: Add compatible for Tegra234 SOC
dt-bindings: arm-smmu: Document nvidia,memory-controller property
iommu/arm-smmu-qcom: Add SC8280XP support
dt-bindings: arm-smmu: Add compatible for Qualcomm SC8280XP
...
On some systems it can take a long time for the hardware to enable the
GA log of the AMD IOMMU. The current wait time is only 0.1ms, but
testing showed that it can take up to 14ms for the GA log to enter
running state after it has been enabled.
Sometimes the long delay happens when booting the system, sometimes
only on resume. Adjust the timeout accordingly so that no warning is
printed when the hardware takes longer than usual.
There has already been an attempt to fix this with commit
9b45a7738e ("iommu/amd: Fix loop timeout issue in iommu_ga_log_enable()"),
but that commit was based on some wrong math and did not fix the issue
in all cases.
Cc: "D. Ziegfeld" <dzigg@posteo.de>
Cc: Jörg-Volker Peetz <jvpeetz@web.de>
Fixes: 8bda0cfbdc ("iommu/amd: Detect and initialize guest vAPIC log")
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Link: https://lore.kernel.org/r/20220520102214.12563-1-joro@8bytes.org
The IOMMU table tries to separate the different IOMMUs into different
backends, but actually requires various cross calls.
Rewrite the code to do the generic swiotlb/swiotlb-xen setup directly
in pci-dma.c and then just call into the IOMMU drivers.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
The AMD IOMMU logs I/O page faults and such to a ring buffer in
system memory, and this ring buffer can overflow. The AMD IOMMU
spec has the following to say about the interrupt status bit that
signals this overflow condition:
  EventOverflow: Event log overflow. RW1C. Reset 0b. 1 = IOMMU
  event log overflow has occurred. This bit is set when a new
  event is to be written to the event log and there is no usable
  entry in the event log, causing the new event information to
  be discarded. An interrupt is generated when EventOverflow = 1b
  and MMIO Offset 0018h[EventIntEn] = 1b. No new event log
  entries are written while this bit is set. Software Note: To
  resume logging, clear EventOverflow (W1C), and write a 1 to
  MMIO Offset 0018h[EventLogEn].
The AMD IOMMU driver doesn't currently implement this recovery
sequence, meaning that if a ring buffer overflow occurs, logging
of EVT/PPR/GA events will cease entirely.
This patch implements the spec-mandated reset sequence, with the
minor tweak that the hardware seems to want to have a 0 written to
MMIO Offset 0018h[EventLogEn] first, before writing a 1 into this
field, or the IOMMU won't actually resume logging events.
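A sketch of that sequence follows; the register offsets and bit positions are assumptions for illustration, only the ordering reflects the description above:

  #define MMIO_CONTROL_OFFSET	0x0018
  #define MMIO_STATUS_OFFSET	0x2020		/* assumed status register offset */
  #define CTRL_EVT_LOG_EN		BIT_ULL(2)	/* assumed EventLogEn position */
  #define STATUS_EVT_OVERFLOW	BIT_ULL(0)	/* assumed EventOverflow position */

  static void restart_event_logging(struct amd_iommu *iommu)
  {
          u64 ctrl = readq(iommu->mmio_base + MMIO_CONTROL_OFFSET);

          /* 1. Write a 0 to EventLogEn first - the quirk described above. */
          writeq(ctrl & ~CTRL_EVT_LOG_EN, iommu->mmio_base + MMIO_CONTROL_OFFSET);

          /* 2. Clear EventOverflow, which is write-1-to-clear. */
          writeq(STATUS_EVT_OVERFLOW, iommu->mmio_base + MMIO_STATUS_OFFSET);

          /* 3. Re-enable the event log so the IOMMU resumes writing entries. */
          writeq(ctrl | CTRL_EVT_LOG_EN, iommu->mmio_base + MMIO_CONTROL_OFFSET);
  }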
Signed-off-by: Lennert Buytenhek <buytenh@arista.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/YVrSXEdW2rzEfOvk@wantstofly.org
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The polling loop for the register change in iommu_ga_log_enable() needs
to have a udelay() in it. Otherwise the CPU might be faster than the
IOMMU hardware and wrongly trigger the WARN_ON() further down the code
stream. Use 10us for udelay(), as there is some hardware where
activation of the GA log can take more than 100ms.
A future optimization should move the activation check of the GA log
to the point where it gets used for the first time. But that is a
bigger change and not suitable for a fix.
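For reference, the resulting loop looks roughly like this (status bit, register offset and iteration count are illustrative assumptions):

  #define MMIO_STATUS_OFFSET	0x2020		/* assumed status register offset */
  #define STATUS_GALOG_RUN	BIT_ULL(8)	/* assumed "GA log running" bit */
  #define GALOG_LOOP_TIMEOUT	(1 << 16)	/* ~0.65s worth of 10us steps */

  static int wait_for_ga_log(struct amd_iommu *iommu)
  {
          int i;

          for (i = 0; i < GALOG_LOOP_TIMEOUT; i++) {
                  u64 status = readq(iommu->mmio_base + MMIO_STATUS_OFFSET);

                  if (status & STATUS_GALOG_RUN)
                          return 0;
                  /* Without this the CPU can out-race the IOMMU and hit the
                   * warning below even though the log would come up fine. */
                  udelay(10);
          }

          WARN_ON(1);
          return -ETIMEDOUT;
  }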
Fixes: 8bda0cfbdc ("iommu/amd: Detect and initialize guest vAPIC log")
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Link: https://lore.kernel.org/r/20220204115537.3894-1-joro@8bytes.org
Use IRQCHIP_MASK_ON_SUSPEND to make the core irq code
mask the iommu interrupt on suspend and unmask it on resume.
Since the unmask function now updates the INTX settings,
they will be restored on resume from s3/s4.
Since IRQCHIP_MASK_ON_SUSPEND is only effective for interrupts
which are not wakeup sources, remove the IRQCHIP_SKIP_SET_WAKE flag
and instead implement a dummy .irq_set_wake which doesn't allow
the interrupt to become a wakeup source.
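A sketch of the resulting irq_chip; the function names and bodies are illustrative, the flag and the dummy .irq_set_wake are the point:

  static void intcapxt_mask_irq(struct irq_data *irqd)
  {
          /* write the masked INTX/XT settings to the IOMMU registers */
  }

  static void intcapxt_unmask_irq(struct irq_data *irqd)
  {
          /* rewrite the INTX/XT settings - this is what restores them when
           * the core unmasks the interrupt on resume from s3/s4 */
  }

  static int intcapxt_set_wake(struct irq_data *irqd, unsigned int on)
  {
          /* refuse to become a wakeup source, so IRQCHIP_MASK_ON_SUSPEND
           * actually takes effect for this interrupt */
          return on ? -EOPNOTSUPP : 0;
  }

  static struct irq_chip intcapxt_controller = {
          .name		= "IOMMU-MSI",
          .irq_mask	= intcapxt_mask_irq,
          .irq_unmask	= intcapxt_unmask_irq,
          .irq_set_wake	= intcapxt_set_wake,
          .flags		= IRQCHIP_MASK_ON_SUSPEND,
  };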
Fixes: 6692981295 ("iommu/amd: Add support for X2APIC IOMMU interrupts")
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Link: https://lore.kernel.org/r/20211123161038.48009-5-mlevitsk@redhat.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This is more logically correct and will also allow us to
use the mask/unmask logic to restore the INTX settings after
resume from s3/s4.
Fixes: 6692981295 ("iommu/amd: Add support for X2APIC IOMMU interrupts")
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Link: https://lore.kernel.org/r/20211123161038.48009-4-mlevitsk@redhat.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This will give the IOMMU GA log a chance to work after resume
from s3/s4.
Fixes: 8bda0cfbdc ("iommu/amd: Detect and initialize guest vAPIC log")
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Link: https://lore.kernel.org/r/20211123161038.48009-2-mlevitsk@redhat.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Merge tag 'overflow-v5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull overflow updates from Kees Cook:
"The end goal of the current buffer overflow detection work[0] is to
gain full compile-time and run-time coverage of all detectable buffer
overflows seen via array indexing or memcpy(), memmove(), and
memset(). The str*() family of functions already have full coverage.
While much of the work for these changes has been ongoing for many
releases (i.e. 0-element and 1-element array replacements, as well as
avoiding false positives and fixing discovered overflows[1]), this
series contains the foundational elements of several related buffer
overflow detection improvements by providing new common helpers and
FORTIFY_SOURCE changes needed to gain the introspection required for
compiler visibility into array sizes. Also included are a handful of
already Acked instances using the helpers (or related clean-ups), with
many more waiting at the ready to be taken via subsystem-specific
trees[2].
The new helpers are:
 - struct_group() for gaining struct member range introspection
 - memset_after() and memset_startat() for clearing to the end of
   structures
 - DECLARE_FLEX_ARRAY() for using flex arrays in unions or alone in
   structs
Also included is the beginning of the refactoring of FORTIFY_SOURCE to
support memcpy() introspection, fix missing and regressed coverage
under GCC, and to prepare to fix the currently broken Clang support.
Finishing this work is part of the larger series[0], but depends on
all the false positives and buffer overflow bug fixes to have landed
already and those that depend on this series to land.
As part of the FORTIFY_SOURCE refactoring, a set of both a
compile-time and run-time tests are added for FORTIFY_SOURCE and the
mem*()-family functions respectively. The compile time tests have
found a legitimate (though corner-case) bug[6] already.
Please note that the appearance of "panic" and "BUG" in the
FORTIFY_SOURCE refactoring are the result of relocating existing code,
and no new use of those code-paths are expected nor desired.
Finally, there are two tree-wide conversions for 0-element arrays and
flexible array unions to gain sane compiler introspection coverage
that result in no known object code differences.
After this series (and the changes that have now landed via netdev and
usb), we are very close to finally being able to build with
-Warray-bounds and -Wzero-length-bounds.
However, due to corner cases in GCC[3] and Clang[4], I have not included
the last two patches that turn on these options, as I don't want to
introduce any known warnings to the build. Hopefully these can be
solved soon"
Link: https://lore.kernel.org/lkml/20210818060533.3569517-1-keescook@chromium.org/ [0]
Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/log/?qt=grep&q=FORTIFY_SOURCE [1]
Link: https://lore.kernel.org/lkml/202108220107.3E26FE6C9C@keescook/ [2]
Link: https://lore.kernel.org/lkml/3ab153ec-2798-da4c-f7b1-81b0ac8b0c5b@roeck-us.net/ [3]
Link: https://bugs.llvm.org/show_bug.cgi?id=51682 [4]
Link: https://lore.kernel.org/lkml/202109051257.29B29745C0@keescook/ [5]
Link: https://lore.kernel.org/lkml/20211020200039.170424-1-keescook@chromium.org/ [6]
* tag 'overflow-v5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (30 commits)
fortify: strlen: Avoid shadowing previous locals
compiler-gcc.h: Define __SANITIZE_ADDRESS__ under hwaddress sanitizer
treewide: Replace 0-element memcpy() destinations with flexible arrays
treewide: Replace open-coded flex arrays in unions
stddef: Introduce DECLARE_FLEX_ARRAY() helper
btrfs: Use memset_startat() to clear end of struct
string.h: Introduce memset_startat() for wiping trailing members and padding
xfrm: Use memset_after() to clear padding
string.h: Introduce memset_after() for wiping trailing members/padding
lib: Introduce CONFIG_MEMCPY_KUNIT_TEST
fortify: Add compile-time FORTIFY_SOURCE tests
fortify: Allow strlen() and strnlen() to pass compile-time known lengths
fortify: Prepare to improve strnlen() and strlen() warnings
fortify: Fix dropped strcpy() compile-time write overflow check
fortify: Explicitly disable Clang support
fortify: Move remaining fortify helpers into fortify-string.h
lib/string: Move helper functions out of string.c
compiler_types.h: Remove __compiletime_object_size()
cm4000_cs: Use struct_group() to zero struct cm4000_dev region
can: flexcan: Use struct_group() to zero struct flexcan_regs regions
...
Replace uses of sme_active() with the more generic cc_platform_has()
using CC_ATTR_HOST_MEM_ENCRYPT. If future support is added for other
memory encryption technologies, the use of CC_ATTR_HOST_MEM_ENCRYPT
can be updated, as required.
This also replaces two usages of sev_active() that are really geared
towards detecting if SME is active.
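The call-site change is mechanical; a minimal sketch of the replacement check:

  #include <linux/cc_platform.h>

  /* Previously: if (sme_active()) ... */
  static bool host_mem_encrypt_active(void)
  {
          return cc_platform_has(CC_ATTR_HOST_MEM_ENCRYPT);
  }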
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210928191009.32551-6-bp@alien8.de
In preparation for FORTIFY_SOURCE performing compile-time and run-time
field bounds checking for memcpy(), memmove(), and memset(), avoid
intentionally writing across neighboring fields.
Use struct_group() in struct ivhd_entry around members ext and hidh, so
they can be referenced together. This will allow memcpy() and sizeof()
to more easily reason about sizes, improve readability, and avoid future
warnings about writing beyond the end of ext.
"pahole" shows no size nor member offset changes to struct ivhd_entry.
"objdump -d" shows no object code changes.
Cc: Will Deacon <will@kernel.org>
Cc: iommu@lists.linux-foundation.org
Acked-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Kees Cook <keescook@chromium.org>
Since the function has been simplified and now only calls iommu_init_ga_log(),
remove it and call iommu_init_ga_log() directly instead.
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Link: https://lore.kernel.org/r/20210820202957.187572-4-suravee.suthikulpanit@amd.com
Fixes: 8bda0cfbdc ("iommu/amd: Detect and initialize guest vAPIC log")
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Currently, iommu_init_ga() checks and disables IOMMU VAPIC support
(i.e. AMD AVIC support in the IOMMU) when the GAMSup feature bit is not set.
However, it forgets to clear IRQ_POSTING_CAP from the previously set
amd_iommu_irq_ops.capability.
This triggers an invalid page fault bug during guest VM warm reboot
if AVIC is enabled, since irq_remapping_cap(IRQ_POSTING_CAP) is
incorrectly set, and crashes the system with the following kernel trace:
  BUG: unable to handle page fault for address: 0000000000400dd8
  RIP: 0010:amd_iommu_deactivate_guest_mode+0x19/0xbc
  Call Trace:
   svm_set_pi_irte_mode+0x8a/0xc0 [kvm_amd]
   ? kvm_make_all_cpus_request_except+0x50/0x70 [kvm]
   kvm_request_apicv_update+0x10c/0x150 [kvm]
   svm_toggle_avic_for_irq_window+0x52/0x90 [kvm_amd]
   svm_enable_irq_window+0x26/0xa0 [kvm_amd]
   vcpu_enter_guest+0xbbe/0x1560 [kvm]
   ? avic_vcpu_load+0xd5/0x120 [kvm_amd]
   ? kvm_arch_vcpu_load+0x76/0x240 [kvm]
   ? svm_get_segment_base+0xa/0x10 [kvm_amd]
   kvm_arch_vcpu_ioctl_run+0x103/0x590 [kvm]
   kvm_vcpu_ioctl+0x22a/0x5d0 [kvm]
   __x64_sys_ioctl+0x84/0xc0
   do_syscall_64+0x33/0x40
   entry_SYSCALL_64_after_hwframe+0x44/0xae
Fix this by moving the initialization of the AMD IOMMU interrupt remapping mode
(amd_iommu_guest_ir) earlier, before setting up
amd_iommu_irq_ops.capability with the appropriate IRQ_POSTING_CAP flag.
[joro: Squashed the two patches and limited
check_features_on_all_iommus() to CONFIG_IRQ_REMAP
to fix a compile warning.]
Signed-off-by: Wei Huang <wei.huang2@amd.com>
Co-developed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Link: https://lore.kernel.org/r/20210820202957.187572-2-suravee.suthikulpanit@amd.com
Link: https://lore.kernel.org/r/20210820202957.187572-3-suravee.suthikulpanit@amd.com
Fixes: 8bda0cfbdc ("iommu/amd: Detect and initialize guest vAPIC log")
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Remove the new use of the variable introduced in the AMD driver branch.
The variable was removed already in the iommu core branch, causing build
errors when the branches are merged.
Cc: Nadav Amit <namit@vmware.com>
Cc: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Link: https://lore.kernel.org/r/20210802150643.3634-1-joro@8bytes.org
Do not use the flush-queue in virtualized environments, where the NpCache
capability of the IOMMU is set. This is required to reduce
virtualization overheads.
This change follows a similar change to Intel's VT-d; a detailed
explanation of the rationale is given in commit 29b3283972
("iommu/vt-d: Do not use flush-queue when caching-mode is on").
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Will Deacon <will@kernel.org>
Cc: Jiajun Cao <caojiajun@vmware.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Cc: iommu@lists.linux-foundation.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Nadav Amit <namit@vmware.com>
Link: https://lore.kernel.org/r/20210723093209.714328-3-namit@vmware.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
We now only ever enable strict mode in iommu_set_dma_strict(), so
just remove the argument.
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/1626088340-5838-7-git-send-email-john.garry@huawei.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Make IOMMU_DEFAULT_LAZY default for when AMD_IOMMU config is set, which
matches current behaviour.
For "fullflush" param, just call iommu_set_dma_strict(true) directly.
Since we get a strict vs lazy mode print already in iommu_subsys_init(),
and maintain a deprecation print when "fullflush" param is passed, drop the
prints in amd_iommu_init_dma_ops().
Finally, drop the global flag amd_iommu_unmap_flush, as it no longer has any
purpose.
[jpg: Rebase for relocated file and drop amd_iommu_unmap_flush]
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Link: https://lore.kernel.org/r/1626088340-5838-6-git-send-email-john.garry@huawei.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Now that the x86 drivers support iommu.strict, deprecate the custom
methods.
Signed-off-by: John Garry <john.garry@huawei.com>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/1626088340-5838-2-git-send-email-john.garry@huawei.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
A recent commit introduced this section mismatch warning:
WARNING: modpost: vmlinux.o(.text.unlikely+0x22a1f): Section mismatch in reference from the function detect_ivrs() to the variable .init.data:amd_iommu_force_enable
The reason is that detect_ivrs() is not marked __init while it should
be, because it is only called from another __init function. Mark
detect_ivrs() __init to get rid of the warning.
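The fix itself is a one-line annotation, roughly (the function's return type is shown only for illustration):

  -static bool detect_ivrs(void)
  +static bool __init detect_ivrs(void)

With __init, the function lives in .init.text and may legitimately reference .init.data objects such as amd_iommu_force_enable.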
Fixes: b1e650db2c ("iommu/amd: Add amd_iommu=force_enable option")
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Link: https://lore.kernel.org/r/20210608122843.8413-1-joro@8bytes.org
Now that DMA ops are part of the core API via iommu-dma, fold the
vestigial remains of the IOMMU_DMA_OPS init state into the IOMMU API
phase, and clean up a few other leftovers. This should also close the
race window wherein bus_set_iommu() effectively makes the DMA ops state
visible before its nominal initialisation - it seems this was previously
fairly benign, but since commit a250c23f15 ("iommu: remove
DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE") it can now lead to the strict flush
queue policy inadvertently being picked for default domains allocated
during that window, with a corresponding unexpected performance impact.
Reported-by: Jussi Maki <joamaki@gmail.com>
Tested-by: Jussi Maki <joamaki@gmail.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Fixes: a250c23f15 ("iommu: remove DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE")
Link: https://lore.kernel.org/r/665db61e23ff8d54ac5eb391bef520b3a803fcb9.1622727974.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Add this option to enable the IOMMU on platforms like AMD Stoney,
where the kernel usually disables it because it may cause problems in
some scenarios.
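For reference, the switch (named in the Fixes tag of the detect_ivrs fix earlier in this log) is passed on the kernel command line:

  amd_iommu=force_enable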
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://lore.kernel.org/r/20210603130203.29016-1-joro@8bytes.org
print_iommu_info prints the EFR register and then the decoded list of
features on a separate line:
  pci 0000:00:00.2: AMD-Vi: Extended features (0x206d73ef22254ade):
   PPR X2APIC NX GT IA GA PC GA_vAPIC
The second line is emitted via 'pr_cont', which causes it to have a
different ('warn') loglevel compared to the previous line ('info').
Commit 9a295ff0ff attempted to rectify this by removing the newline
from the pci_info format string, but this doesn't work, as pci_info
calls implicitly append a newline anyway.
Printing the decoded features on the same line would make it quite long.
Instead, change pci_info() to pr_info() to omit PCI bus location info,
which is also shown in the preceding message. This results in:
  pci 0000:00:00.2: AMD-Vi: Found IOMMU cap 0x40
  AMD-Vi: Extended features (0x206d73ef22254ade): PPR X2APIC NX GT IA GA PC GA_vAPIC
  AMD-Vi: Interrupt remapping enabled
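The change itself is a one-liner; a sketch with an abridged format string:

  static void print_extended_features(struct amd_iommu *iommu)
  {
          /* pci_info(iommu->dev, ...) prefixed the PCI address and left the
           * pr_cont() continuation at a different loglevel; plain pr_info()
           * keeps the whole feature dump at 'info'. */
          pr_info("Extended features (%#llx):", iommu->features);
  }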
Fixes: 9a295ff0ff ("iommu/amd: Print extended features in one line to fix divergent log levels")
Link: https://lore.kernel.org/lkml/alpine.LNX.2.20.13.2104112326460.11104@monopod.intra.ispras.ru
Signed-off-by: Alexander Monakov <amonakov@ispras.ru>
Cc: Paul Menzel <pmenzel@molgen.mpg.de>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: iommu@lists.linux-foundation.org
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Link: https://lore.kernel.org/r/20210504102220.1793-1-amonakov@ispras.ru
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Rather than have separate opaque setter functions that are easy to
overlook and lead to repetitive boilerplate in drivers, let's pass the
relevant initialisation parameters directly to iommu_device_register().
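Roughly, a driver's registration then collapses to a single call (the driver structure and ops table below are placeholders):

  #include <linux/iommu.h>

  struct my_iommu {				/* placeholder driver structure */
          struct iommu_device iommu;
  };

  static const struct iommu_ops my_iommu_ops;	/* placeholder ops table */

  static int example_register(struct my_iommu *iommu, struct device *hwdev)
  {
          /* The ops (and an optional hardware device) are passed directly,
           * instead of via separate set_ops/set_fwnode setter calls. */
          return iommu_device_register(&iommu->iommu, &my_iommu_ops, hwdev);
  }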
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/ab001b87c533b6f4db71eb90db6f888953986c36.1617285386.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
In early AMD desktop/mobile platforms (during 2013), when the IOMMU
Performance Counter (PMC) support was first introduced in
commit 30861ddc9c ("perf/x86/amd: Add IOMMU Performance Counter
resource management"), there was a HW bug where the counters could not
be accessed. The result was that reading the counters always returned zero.
At the time, the suggested workaround was to add test logic prior
to initializing the PMC feature to check if the counters can be programmed
and read back the same value. This worked fine until more recent
desktop/mobile platforms started enabling power gating for the PMC,
which prevents access to the counters. This results in the PMC support
being disabled unnecessarily.
Unfortunately, there is no documentation of the hardware generation in
which the original PMC HW bug was fixed, although it was fixed soon
after the PMC was first introduced. Based on this, we assume that the
buggy platforms are unlikely to still be in use, and it should be
relatively safe to remove this legacy logic.
Link: https://lore.kernel.org/linux-iommu/alpine.LNX.3.20.13.2006030935570.3181@monopod.intra.ispras.ru/
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=201753
Cc: Tj (Elloe Linux) <ml.linux@elloe.vision>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: Alexander Monakov <amonakov@ispras.ru>
Cc: David Coe <david.coe@live.co.uk>
Cc: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Tested-by: Shuah Khan <skhan@linuxfoundation.org>
Link: https://lore.kernel.org/r/20210409085848.3908-3-suravee.suthikulpanit@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This reverts commit 6778ff5b21.
The original commit tries to address an issue where PMC power-gating
causes the IOMMU PMC pre-init test to fail on certain desktop/mobile
platforms where the power-gating is normally enabled.
There have been several reports that the workaround still is not
guaranteed to work, and can add up to 100 ms (in the worst case)
to the boot process on certain platforms such as the MSI B350M MORTAR
with AMD Ryzen 3 2200G.
Therefore, revert this commit as a prelude to removing the pre-init
test.
Link: https://lore.kernel.org/linux-iommu/alpine.LNX.3.20.13.2006030935570.3181@monopod.intra.ispras.ru/
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=201753
Cc: Tj (Elloe Linux) <ml.linux@elloe.vision>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: Alexander Monakov <amonakov@ispras.ru>
Cc: David Coe <david.coe@live.co.uk>
Signed-off-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Link: https://lore.kernel.org/r/20210409085848.3908-2-suravee.suthikulpanit@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Remove exports for functions that are only used in the AMD IOMMU driver
itself, or in the perf events support, which is also always built in.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210402143312.372386-4-hch@lst.de
Signed-off-by: Joerg Roedel <jroedel@suse.de>