Commit 8e01797331 ("iommu/amd: Enable Guest Translation before
registering devices") moved IOMMU Guest Translation (GT) enablement to
early init path. It does feature check based on Global EFR value (got from
ACPI IVRS table). Later it adjusts EFR value based on IOMMU feature
register (late_iommu_features_init()).
It seems in some systems BIOS doesn't set gloabl EFR value properly.
This is causing mismatch. Hence move IOMMU GT enablement after
late_iommu_features_init() so that it does check based on IOMMU EFR
value.
Fixes: 8e01797331 ("iommu/amd: Enable Guest Translation before registering devices")
Reported-by: Klara Modin <klarasmodin@gmail.com>
Closes: https://lore.kernel.org/linux-iommu/333e6eb6-361c-4afb-8107-2573324bf689@gmail.com/
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Tested-by: Klara Modin <klarasmodin@gmail.com>
Link: https://lore.kernel.org/r/20240506082039.7575-1-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
AMD IOMMU hardware supports PCI Peripheral Paging Request (PPR) using
a PPR log, which is a circular buffer containing requests from downstream
end-point devices.
There is one PPR log per IOMMU instance. Therefore, allocate an iopf_queue
per IOMMU instance during driver initialization, and free the queue during
driver deinitialization.
Also rename enable_iommus_v2() -> enable_iommus_ppr() to reflect its
usage. And add amd_iommu_gt_ppr_supported() check before enabling PPR
log.
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Co-developed-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20240418103400.6229-10-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
In preparation to subsequent PPR-related patches, and also remove static
declaration for certain helper functions so that it can be reused in other
files.
Also rename below functions:
alloc_ppr_log -> amd_iommu_alloc_ppr_log
iommu_enable_ppr_log -> amd_iommu_enable_ppr_log
free_ppr_log -> amd_iommu_free_ppr_log
iommu_poll_ppr_log -> amd_iommu_poll_ppr_log
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Co-developed-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20240418103400.6229-5-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The Intel IOMMU code currently tries to allocate all DMAR fault interrupt
vectors on the boot cpu. On large systems with high DMAR counts this
results in vector exhaustion, and most of the vectors are not initially
allocated socket local.
Instead, have a cpu on each node do the vector allocation for the DMARs on
that node. The boot cpu still does the allocation for its node during its
boot sequence.
Signed-off-by: Dimitri Sivanich <sivanich@hpe.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/Zfydpp2Hm+as16TY@hpe.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Use consistent log severity (pr_warn) to log all messages in SNP
enable path.
Suggested-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Link: https://lore.kernel.org/r/20240410101643.32309-1-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
DTE[Mode]=0 is not supported when SNP is enabled in the host. That means
to support SNP, IOMMU must be configured with V1 page table (See IOMMU
spec [1] for the details). If user passes kernel command line to configure
IOMMU domains with v2 page table (amd_iommu=pgtbl_v2) then disable SNP
as the user asked by not forcing the page table to v1.
[1] https://www.amd.com/content/dam/amd/en/documents/processor-tech-docs/specifications/48882_IOMMU.pdf
Cc: Ashish Kalra <ashish.kalra@amd.com>
Cc: Michael Roth <michael.roth@amd.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Link: https://lore.kernel.org/r/20240410085702.31869-1-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The host SNP worthiness can determined later, after alternatives have
been patched, in snp_rmptable_init() depending on cmdline options like
iommu=pt which is incompatible with SNP, for example.
Which means that one cannot use X86_FEATURE_SEV_SNP and will need to
have a special flag for that control.
Use that newly added CC_ATTR_HOST_SEV_SNP in the appropriate places.
Move kdump_sev_callback() to its rightful place, while at it.
Fixes: 216d106c7f ("x86/sev: Add SEV-SNP host initialization support")
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Tested-by: Srikanth Aithal <sraithal@amd.com>
Link: https://lore.kernel.org/r/20240327154317.29909-6-bp@alien8.de
Including:
- Core changes:
- Constification of bus_type pointer
- Preparations for user-space page-fault delivery
- Use a named kmem_cache for IOVA magazines
- Intel VT-d changes from Lu Baolu:
- Add RBTree to track iommu probed devices
- Add Intel IOMMU debugfs document
- Cleanup and refactoring
- ARM-SMMU Updates from Will Deacon:
- Device-tree binding updates for a bunch of Qualcomm SoCs
- SMMUv2: Support for Qualcomm X1E80100 MDSS
- SMMUv3: Significant rework of the driver's STE manipulation and
domain handling code. This is the initial part of a larger scale
rework aiming to improve the driver's implementation of the
IOMMU-API in preparation for hooking up IOMMUFD support.
- AMD-Vi Updates:
- Refactor GCR3 table support for SVA
- Cleanups
- Some smaller cleanups and fixes
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEr9jSbILcajRFYWYyK/BELZcBGuMFAmXuyf8ACgkQK/BELZcB
GuNXwxAApkjDm7VWM2D2K8Y+8YLbtaljMCCudNZKhgT++HEo4YlXcA5NmOddMIFc
qhF9EwAWlQfj3krJLJQSZ6v/joKpXSwS6LDYuEGmJ/pIGfN5HqaTsOCItriP7Mle
ZgRTI28u5ykZt4b6IKG8QeexilQi2DsIxT46HFiHL0GrvcBcdxDuKnE22PNCTwU2
25WyJzgo//Ht2BrwlhrduZVQUh0KzXYuV5lErvoobmT0v/a4llS20ov+IE/ut54w
FxIqGR8rMdJ9D2dM0bWRkdJY/vJxokah2QHm0gcna3Gr2iENL2xWFUtm+j1B6Smb
VuxbwMkB0Iz530eShebmzQ07e2f1rRb4DySriu4m/jb8we20AYqKMYaxQxZkU68T
1hExo+/QJQil9p1t+7Eur+S1u6gRHOdqfBnCzGOth/zzY1lbEzpdp8b9M8wnGa4K
Y0EDeUpKtVIP1ZRCBi8CGyU1jgJF13Nx7MnOalgGWjDysB5RPamnrhz71EuD6rLw
Jxp2EYo8NQPmPbEcl9NDS+oOn5Fz5TyPiMF2GUzhb9KisLxUjriLoTaNyBsdFkds
2q+x6KY8qPGk37NhN0ktfpk9CtSGN47Pm8ZznEkFt9AR96GJDX+3NhUNAwEKslwt
1tavDmmdOclOfIpWtaMlKQTHGhuSBZo1A40ATeM/MjHQ8rEtwXk=
=HV07
-----END PGP SIGNATURE-----
Merge tag 'iommu-updates-v6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull iommu updates from Joerg Roedel:
"Core changes:
- Constification of bus_type pointer
- Preparations for user-space page-fault delivery
- Use a named kmem_cache for IOVA magazines
Intel VT-d changes from Lu Baolu:
- Add RBTree to track iommu probed devices
- Add Intel IOMMU debugfs document
- Cleanup and refactoring
ARM-SMMU Updates from Will Deacon:
- Device-tree binding updates for a bunch of Qualcomm SoCs
- SMMUv2: Support for Qualcomm X1E80100 MDSS
- SMMUv3: Significant rework of the driver's STE manipulation and
domain handling code. This is the initial part of a larger scale
rework aiming to improve the driver's implementation of the
IOMMU-API in preparation for hooking up IOMMUFD support.
AMD-Vi Updates:
- Refactor GCR3 table support for SVA
- Cleanups
Some smaller cleanups and fixes"
* tag 'iommu-updates-v6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (88 commits)
iommu: Fix compilation without CONFIG_IOMMU_INTEL
iommu/amd: Fix sleeping in atomic context
iommu/dma: Document min_align_mask assumption
iommu/vt-d: Remove scalabe mode in domain_context_clear_one()
iommu/vt-d: Remove scalable mode context entry setup from attach_dev
iommu/vt-d: Setup scalable mode context entry in probe path
iommu/vt-d: Fix NULL domain on device release
iommu: Add static iommu_ops->release_domain
iommu/vt-d: Improve ITE fault handling if target device isn't present
iommu/vt-d: Don't issue ATS Invalidation request when device is disconnected
PCI: Make pci_dev_is_disconnected() helper public for other drivers
iommu/vt-d: Use device rbtree in iopf reporting path
iommu/vt-d: Use rbtree to track iommu probed devices
iommu/vt-d: Merge intel_svm_bind_mm() into its caller
iommu/vt-d: Remove initialization for dynamically heap-allocated rcu_head
iommu/vt-d: Remove treatment for revoking PASIDs with pending page faults
iommu/vt-d: Add the document for Intel IOMMU debugfs
iommu/vt-d: Use kcalloc() instead of kzalloc()
iommu/vt-d: Remove INTEL_IOMMU_BROKEN_GFX_WA
iommu: re-use local fwnode variable in iommu_ops_from_fwnode()
...
On many systems that have an AMD IOMMU the following sequence of
warnings is observed during bootup.
```
pci 0000:00:00.2 can't derive routing for PCI INT A
pci 0000:00:00.2: PCI INT A: not connected
```
This series of events happens because of the IOMMU initialization
sequence order and the lack of _PRT entries for the IOMMU.
During initialization the IOMMU driver first enables the PCI device
using pci_enable_device(). This will call acpi_pci_irq_enable()
which will check if the interrupt is declared in a PCI routing table
(_PRT) entry. According to the PCI spec [1] these routing entries
are only required under PCI root bridges:
The _PRT object is required under all PCI root bridges
The IOMMU is directly connected to the root complex, so there is no
parent bridge to look for a _PRT entry. The first warning is emitted
since no entry could be found in the hierarchy. The second warning is
then emitted because the interrupt hasn't yet been configured to any
value. The pin was configured in pci_read_irq() but the byte in
PCI_INTERRUPT_LINE return 0xff which means "Unknown".
After that sequence of events pci_enable_msi() is called and this
will allocate an interrupt.
That is both of these warnings are totally harmless because the IOMMU
uses MSI for interrupts. To avoid even trying to probe for a _PRT
entry mark the IOMMU as IRQ managed. This avoids both warnings.
Link: https://uefi.org/htmlspecs/ACPI_Spec_6_4_html/06_Device_Configuration/Device_Configuration.html?highlight=_prt#prt-pci-routing-table [1]
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Fixes: cffe0a2b5a ("x86, irq: Keep balance of IOAPIC pin reference count")
Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
Link: https://lore.kernel.org/r/20240122233400.1802-1-mario.limonciello@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
IOMMU Guest Translation (GT) feature needs to be enabled before
invalidating guest translations (CMD_INV_IOMMU_PAGES with GN=1).
Currently GT feature is enabled after setting up interrupt handler.
So far it was fine as we were not invalidating guest page table
before this point.
Upcoming series will introduce per device GCR3 table and it will
invalidate guest pages after configuring. Hence move GT feature
enablement to early_enable_iommu().
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20240205115615.6053-3-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
.. as IOMMU perf counters are always built as part of kernel.
No functional change intended.
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Reviewed-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20240118090105.5864-7-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Commit
f366a8dac1: ("iommu/amd: Clean up RMP entries for IOMMU pages during SNP shutdown")
leads to the following Smatch static checker warning:
drivers/iommu/amd/init.c:3820 iommu_page_make_shared() error: uninitialized symbol 'assigned'.
Fix it.
[ bp: Address the other error cases too. ]
Fixes: f366a8dac1 ("iommu/amd: Clean up RMP entries for IOMMU pages during SNP shutdown")
Closes: https://lore.kernel.org/linux-iommu/1be69f6a-e7e1-45f9-9a74-b2550344f3fd@moroto.mountain
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Cc: Joerg Roedel <jroedel@suse.com>
Link: https://lore.kernel.org/lkml/20240126041126.1927228-20-michael.roth@amd.com
Add a new IOMMU API interface amd_iommu_snp_disable() to transition
IOMMU pages to Hypervisor state from Reclaim state after SNP_SHUTDOWN_EX
command. Invoke this API from the CCP driver after SNP_SHUTDOWN_EX
command.
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20240126041126.1927228-20-michael.roth@amd.com
Currently, the expectation is that the kernel will call
amd_iommu_snp_enable() to perform various checks and set the
amd_iommu_snp_en flag that the IOMMU uses to adjust its setup routines
to account for additional requirements on hosts where SNP is enabled.
This is somewhat fragile as it relies on this call being done prior to
IOMMU setup. It is more robust to just do this automatically as part of
IOMMU initialization, so rework the code accordingly.
There is still a need to export information about whether or not the
IOMMU is configured in a manner compatible with SNP, so relocate the
existing amd_iommu_snp_en flag so it can be used to convey that
information in place of the return code that was previously provided by
calls to amd_iommu_snp_enable().
While here, also adjust the kernel messages related to IOMMU SNP
enablement for consistency/grammar/clarity.
Suggested-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Co-developed-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Joerg Roedel <jroedel@suse.de>
Link: https://lore.kernel.org/r/20240126041126.1927228-4-michael.roth@amd.com
Drop EXPORT_SYMBOLS for the functions that are not used by any modules.
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Tested-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://lore.kernel.org/r/20231006095706.5694-5-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Commit 1adf3cc20d ("iommu: Add max_pasids field in struct iommu_device")
introduced a variable struct iommu_device.max_pasids to track max
PASIDS supported by each IOMMU.
Let us initialize this field for AMD IOMMU. IOMMU core will use this value
to set max PASIDs per device (see __iommu_probe_device()).
Also remove unused global 'amd_iommu_max_pasid' variable.
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Link: https://lore.kernel.org/r/20230921092147.5930-15-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
In order to support v2 page table, IOMMU driver need to check if the
hardware can support Guest Translation (GT) and Peripheral Page Request
(PPR) features. Currently, IOMMU driver uses global (amd_iommu_v2_present)
and per-iommu (struct amd_iommu.is_iommu_v2) variables to track the
features. There variables area redundant since we could simply just check
the global EFR mask.
Therefore, replace it with a helper function with appropriate name.
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Co-developed-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Link: https://lore.kernel.org/r/20230921092147.5930-10-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Currently, IOMMU driver assumes capabilities on all IOMMU instances to be
homogeneous. During early_amd_iommu_init(), the driver probes all IVHD
blocks and do sanity check to make sure that only features common among all
IOMMU instances are supported. This is tracked in the global amd_iommu_efr
and amd_iommu_efr2, which should be used whenever the driver need to check
hardware capabilities.
Therefore, introduce check_feature() and check_feature2(), and modify
the driver to adopt the new helper functions.
In addition, clean up the print_iommu_info() to avoid reporting redundant
EFR/EFR2 for each IOMMU instance.
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Link: https://lore.kernel.org/r/20230921092147.5930-9-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Since AMD IOMMU page table is not used in passthrough mode, switching to
v1 page table is not required.
Therefore, remove redundant amd_iommu_pgtable update and misleading
warning message.
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Link: https://lore.kernel.org/r/20230921092147.5930-7-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Current code enables PPR and GA interrupts before setting up the
interrupt handler (in state_next()). Make sure interrupt handler
is in place before enabling these interrupt.
amd_iommu_enable_interrupts() gets called in normal boot, kdump as well
as in suspend/resume path. Hence moving interrupt enablement to this
function works fine.
Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Link: https://lore.kernel.org/r/20230628054554.6131-4-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Move PPR log interrupt bit setting to iommu_enable_ppr_log(). Also
rearrange iommu_enable_ppr_log() such that PPREn bit is enabled
before enabling PPRLog and PPRInt bits. So that when PPRLog bit is
set it will clear the PPRLogOverflow bit and sets the PPRLogRun bit
in the IOMMU Status Register [MMIO Offset 2020h].
Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Link: https://lore.kernel.org/r/20230628054554.6131-3-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
AMD IOMMU has three log buffers (i.e. Event, PPR, and GA). These logs can
be configured to generate different interrupts when an entry is inserted
into a log buffer.
However, current implementation share single interrupt to handle all three
logs. With increasing usages of the GA (for IOMMU AVIC) and PPR logs (for
IOMMUv2 APIs and SVA), interrupt sharing could potentially become
performance bottleneck.
Hence, separate IOMMU interrupt into use three separate vectors and irq
threads with corresponding name, which will be displayed in the
/proc/interrupts as "AMD-Vi<x>-[Evt/PPR/GA]", where "x" is an IOMMU id.
Note that this patch changes interrupt handling only in IOMMU x2apic mode
(MMIO 0x18[IntCapXTEn]=1). In legacy mode it will continue to use single
MSI interrupt.
Signed-off-by: Vasant Hegde<vasant.hegde@amd.com>
Reviewed-by: Alexey Kardashevskiy<aik@amd.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Link: https://lore.kernel.org/r/20230628053222.5962-3-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Some ATS-capable peripherals can issue requests to the processor to service
peripheral page requests using PCIe PRI (the Page Request Interface). IOMMU
supports PRI using PPR log buffer. IOMMU writes PRI request to PPR log
buffer and sends PPR interrupt to host. When there is no space in the
PPR log buffer (PPR log overflow) it will set PprOverflow bit in 'MMIO
Offset 2020h IOMMU Status Register'. When this happens PPR log needs to be
restarted as specified in IOMMU spec [1] section 2.6.2.
When handling the event it just resumes the PPR log without resizing
(similar to the way event and GA log overflow is handled).
Failing to handle PPR overflow means device may not work properly as
IOMMU stops processing new PPR events from device.
[1] https://www.amd.com/system/files/TechDocs/48882_3.07_PUB.pdf
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Link: https://lore.kernel.org/r/20230628051624.5792-3-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Each IOMMU has three log buffers (Event, GA and PPR log). Once a buffer
becomes full, IOMMU generates an interrupt with the corresponding overflow
status bit, and stop processing the log. To handle an overflow, the IOMMU
driver needs to disable the log, clear the overflow status bit, and
re-enable the log. This procedure is same among all types of log
buffer except it uses different overflow status bit and enabling bit.
Hence, to consolidate the log buffer restarting logic, introduce a helper
function amd_iommu_restart_log(), which caller can specify parameters
specific for each type of log buffer.
Also rename MMIO_STATUS_EVT_OVERFLOW_INT_MASK as
MMIO_STATUS_EVT_OVERFLOW_MASK.
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Link: https://lore.kernel.org/r/20230628051624.5792-2-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Invalidating Interrupt Remapping Table (IRT) requires, the AMD IOMMU driver
to issue INVALIDATE_INTERRUPT_TABLE and COMPLETION_WAIT commands.
Currently, the driver issues the two commands separately, which requires
calling raw_spin_lock_irqsave() twice. In addition, the COMPLETION_WAIT
could potentially be interleaved with other commands causing delay of
the COMPLETION_WAIT command.
Therefore, combine issuing of the two commands in one spin-lock, and
changing struct amd_iommu.cmd_sem_val to use atomic64 to minimize
locking.
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Link: https://lore.kernel.org/r/20230530141137.14376-6-suravee.suthikulpanit@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
An Interrupt Remapping Table (IRT) stores interrupt remapping configuration
for each device. In a normal operation, the AMD IOMMU caches the table
to optimize subsequent data accesses. This requires the IOMMU driver to
invalidate IRT whenever it updates the table. The invalidation process
includes issuing an INVALIDATE_INTERRUPT_TABLE command following by
a COMPLETION_WAIT command.
However, there are cases in which the IRT is updated at a high rate.
For example, for IOMMU AVIC, the IRTE[IsRun] bit is updated on every
vcpu scheduling (i.e. amd_iommu_update_ga()). On system with large
amount of vcpus and VFIO PCI pass-through devices, the invalidation
process could potentially become a performance bottleneck.
Introducing a new kernel boot option:
amd_iommu=irtcachedis
which disables IRTE caching by setting the IRTCachedis bit in each IOMMU
Control register, and bypass the IRT invalidation process.
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Co-developed-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Link: https://lore.kernel.org/r/20230530141137.14376-4-suravee.suthikulpanit@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
GALog exists to propagate interrupts into all vCPUs in the system when
interrupts are marked as non running (e.g. when vCPUs aren't running). A
GALog overflow happens when there's in no space in the log to record the
GATag of the interrupt. So when the GALOverflow condition happens, the
GALog queue is processed and the GALog is restarted, as the IOMMU
manual indicates in section "2.7.4 Guest Virtual APIC Log Restart
Procedure":
| * Wait until MMIO Offset 2020h[GALogRun]=0b so that all request
| entries are completed as circumstances allow. GALogRun must be 0b to
| modify the guest virtual APIC log registers safely.
| * Write MMIO Offset 0018h[GALogEn]=0b.
| * As necessary, change the following values (e.g., to relocate or
| resize the guest virtual APIC event log):
| - the Guest Virtual APIC Log Base Address Register
| [MMIO Offset 00E0h],
| - the Guest Virtual APIC Log Head Pointer Register
| [MMIO Offset 2040h][GALogHead], and
| - the Guest Virtual APIC Log Tail Pointer Register
| [MMIO Offset 2048h][GALogTail].
| * Write MMIO Offset 2020h[GALOverflow] = 1b to clear the bit (W1C).
| * Write MMIO Offset 0018h[GALogEn] = 1b, and either set
| MMIO Offset 0018h[GAIntEn] to enable the GA log interrupt or clear
| the bit to disable it.
Failing to handle the GALog overflow means that none of the VFs (in any
guest) will work with IOMMU AVIC forcing the user to power cycle the
host. When handling the event it resumes the GALog without resizing
much like how it is done in the event handler overflow. The
[MMIO Offset 2020h][GALOverflow] bit might be set in status register
without the [MMIO Offset 2020h][GAInt] bit, so when deciding to poll
for GA events (to clear space in the galog), also check the overflow
bit.
[suravee: Check for GAOverflow without GAInt, toggle CONTROL_GAINT_EN]
Co-developed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
Link: https://lore.kernel.org/r/20230419201154.83880-3-joao.m.martins@oracle.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Use sysfs_emit() instead of the sprintf() for sysfs entries. sysfs_emit()
knows the maximum of the temporary buffer used for outputting sysfs
content and avoids overrunning the buffer length.
Prefer 'long long' over 'long long int' as suggested by checkpatch.pl.
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20230322123421.278852-1-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Use numa information to allocate irq resources and also to set
irq affinity. This optimizes the IOMMU interrupt handling.
Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Reviewed-by: Alexey Kardashevskiy <aik@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Link: https://lore.kernel.org/r/20230321092348.6127-3-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The 'acpiid' buffer in the parse_ivrs_acpihid function may overflow,
because the string specifier in the format string sscanf()
has no width limitation.
Found by InfoTeCS on behalf of Linux Verification Center
(linuxtesting.org) with SVACE.
Fixes: ca3bf5d47c ("iommu/amd: Introduces ivrs_acpihid kernel parameter")
Cc: stable@vger.kernel.org
Signed-off-by: Ilia.Gavrilov <Ilia.Gavrilov@infotecs.ru>
Reviewed-by: Kim Phillips <kim.phillips@amd.com>
Link: https://lore.kernel.org/r/20230202082719.1513849-1-Ilia.Gavrilov@infotecs.ru
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Currently, these options cause the following libkmod error:
libkmod: ERROR ../libkmod/libkmod-config.c:489 kcmdline_parse_result: \
Ignoring bad option on kernel command line while parsing module \
name: 'ivrs_xxxx[XX:XX'
Fix by introducing a new parameter format for these options and
throw a warning for the deprecated format.
Users are still allowed to omit the PCI Segment if zero.
Adding a Link: to the reason why we're modding the syntax parsing
in the driver and not in libkmod.
Fixes: ca3bf5d47c ("iommu/amd: Introduces ivrs_acpihid kernel parameter")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/linux-modules/20200310082308.14318-2-lucas.demarchi@intel.com/
Reported-by: Kim Phillips <kim.phillips@amd.com>
Co-developed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Link: https://lore.kernel.org/r/20220919155638.391481-2-kim.phillips@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The second (UID) strcmp in acpi_dev_hid_uid_match considers
"0" and "00" different, which can prevent device registration.
Have the AMD IOMMU driver's ivrs_acpihid parsing code remove
any leading zeroes to make the UID strcmp succeed. Now users
can safely specify "AMDxxxxx:00" or "AMDxxxxx:0" and expect
the same behaviour.
Fixes: ca3bf5d47c ("iommu/amd: Introduces ivrs_acpihid kernel parameter")
Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Cc: stable@vger.kernel.org
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Cc: Joerg Roedel <jroedel@suse.de>
Link: https://lore.kernel.org/r/20220919155638.391481-1-kim.phillips@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
CHECK drivers/iommu/amd/iommu.c
drivers/iommu/amd/iommu.c:73:24: warning: symbol 'amd_iommu_ops' was not declared. Should it be static?
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Link: https://lore.kernel.org/r/20220912063248.7909-6-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
AMD IOMMU introduces support for Guest I/O protection where the request
from the I/O device without a PASID are treated as if they have PASID 0.
Co-developed-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Link: https://lore.kernel.org/r/20220825063939.8360-8-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>