1
0
Fork 0
mirror of synced 2025-03-06 20:59:54 +01:00
Commit graph

152 commits

Author SHA1 Message Date
Linus Torvalds
e1cbc3b96a IOMMU Updates for Linux v5.19
Including:
 
 	- Intel VT-d driver updates
 	  - Domain force snooping improvement.
 	  - Cleanups, no intentional functional changes.
 
 	- ARM SMMU driver updates
 	  - Add new Qualcomm device-tree compatible strings
 	  - Add new Nvidia device-tree compatible string for Tegra234
 	  - Fix UAF in SMMUv3 shared virtual addressing code
 	  - Force identity-mapped domains for users of ye olde SMMU
 	    legacy binding
 	  - Minor cleanups
 
 	- Patches to fix a BUG_ON in the vfio_iommu_group_notifier
 	  - Groundwork for upcoming iommufd framework
 	  - Introduction of DMA ownership so that an entire IOMMU group
 	    is either controlled by the kernel or by user-space
 
 	- MT8195 and MT8186 support in the Mediatek IOMMU driver
 
 	- Patches to make forcing of cache-coherent DMA more coherent
 	  between IOMMU drivers
 
 	- Fixes for thunderbolt device DMA protection
 
 	- Various smaller fixes and cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEr9jSbILcajRFYWYyK/BELZcBGuMFAmKWCbUACgkQK/BELZcB
 GuPHmRAAuoH9iK/jrC3SgrqpBfH2iRN7ovIX8dFvgbQWX27lhXF4gvj2/nYdIvPK
 75j/LmdibuzV3Iez4kjbGKNG1AikwK3dKIH21a84f3ctnoamQyL6nMfCVBFaVD/D
 kvPpTHyjbGPNf6KZyWQdkJ5DXD1aoG1DKkBnslH5pTNPqGuNqbcnRTg0YxiJFLBv
 5w2B6jL06XRzunh+Sp1Dbj+po8ROjLRCEU+tdrndO8W/Dyp6+ZNNuxL9/3BM9zMj
 py0M4piFtGnhmJSdym1eeHm7r1YRjkZw+MN+e8NcrcSihmDutEWo7nRRxA5uVaa+
 3O2DNERqCvQUYxfNRUOKwzV8v51GYQHEPhvOe/MLgaEQDmDmlF2dHNGm93eCMdrv
 m1cT011oU7pa4qHomwLyTJxSsR7FzJ37igq/WeY++MBhl+frqfzEQPVxF+W7GLb8
 QvT/+woCPzLVpJbE7s0FUD4nbPd8c1dAz4+HO1DajxILIOTq1bnPIorSjgXODRjq
 yzsiP1rAg0L0PsL7pXn3cPMzNCE//xtOsRsAGmaVv6wBoMLyWVFCU/wjPEdjrSWA
 nXpAuCL84uxCEl/KLYMsg9UhjT6ko7CuKdsybIG9zNIiUau43uSqgTen0xCpYt0i
 m//O/X3tPyxmoLKRW+XVehGOrBZW+qrQny6hk/Zex+6UJQqVMTA=
 =W0hj
 -----END PGP SIGNATURE-----

Merge tag 'iommu-updates-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull iommu updates from Joerg Roedel:

 - Intel VT-d driver updates:
     - Domain force snooping improvement.
     - Cleanups, no intentional functional changes.

 - ARM SMMU driver updates:
     - Add new Qualcomm device-tree compatible strings
     - Add new Nvidia device-tree compatible string for Tegra234
     - Fix UAF in SMMUv3 shared virtual addressing code
     - Force identity-mapped domains for users of ye olde SMMU legacy
       binding
     - Minor cleanups

 - Fix a BUG_ON in the vfio_iommu_group_notifier:
     - Groundwork for upcoming iommufd framework
     - Introduction of DMA ownership so that an entire IOMMU group is
       either controlled by the kernel or by user-space

 - MT8195 and MT8186 support in the Mediatek IOMMU driver

 - Make forcing of cache-coherent DMA more coherent between IOMMU
   drivers

 - Fixes for thunderbolt device DMA protection

 - Various smaller fixes and cleanups

* tag 'iommu-updates-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (88 commits)
  iommu/amd: Increase timeout waiting for GA log enablement
  iommu/s390: Tolerate repeat attach_dev calls
  iommu/vt-d: Remove hard coding PGSNP bit in PASID entries
  iommu/vt-d: Remove domain_update_iommu_snooping()
  iommu/vt-d: Check domain force_snooping against attached devices
  iommu/vt-d: Block force-snoop domain attaching if no SC support
  iommu/vt-d: Size Page Request Queue to avoid overflow condition
  iommu/vt-d: Fold dmar_insert_one_dev_info() into its caller
  iommu/vt-d: Change return type of dmar_insert_one_dev_info()
  iommu/vt-d: Remove unneeded validity check on dev
  iommu/dma: Explicitly sort PCI DMA windows
  iommu/dma: Fix iova map result check bug
  iommu/mediatek: Fix NULL pointer dereference when printing dev_name
  iommu: iommu_group_claim_dma_owner() must always assign a domain
  iommu/arm-smmu: Force identity domains for legacy binding
  iommu/arm-smmu: Support Tegra234 SMMU
  dt-bindings: arm-smmu: Add compatible for Tegra234 SOC
  dt-bindings: arm-smmu: Document nvidia,memory-controller property
  iommu/arm-smmu-qcom: Add SC8280XP support
  dt-bindings: arm-smmu: Add compatible for Qualcomm SC8280XP
  ...
2022-05-31 09:56:54 -07:00
Linus Torvalds
2518f226c6 drm for 5.19-rc1
dma-buf:
 - add dma_resv_replace_fences
 - add dma_resv_get_singleton
 - make dma_excl_fence private
 
 core:
 - EDID parser refactorings
 - switch drivers to drm_mode_copy/duplicate
 - DRM managed mutex initialization
 
 display-helper:
 - put HDMI, SCDC, HDCP, DSC and DP into new module
 
 gem:
 - rework fence handling
 
 ttm:
 - rework bulk move handling
 - add common debugfs for resource managers
 - convert to kvcalloc
 
 format helpers:
 - support monochrome formats
 - RGB888, RGB565 to XRGB8888 conversions
 
 fbdev:
 - cfb/sys_imageblit fixes
 - pagelist corruption fix
 - create offb platform device
 - deferred io improvements
 
 sysfb:
 - Kconfig rework
 - support for VESA mode selection
 
 bridge:
 - conversions to devm_drm_of_get_bridge
 - conversions to panel_bridge
 - analogix_dp - autosuspend support
 - it66121 - audio support
 - tc358767 - DSI to DPI support
 - icn6211 - PLL/I2C fixes, DT property
 - adv7611 - enable DRM_BRIDGE_OP_HPD
 - anx7625 - fill ELD if no monitor
 - dw_hdmi - add audio support
 - lontium LT9211 support, i.MXMP LDB
 - it6505: Kconfig fix, DPCD set power fix
 - adv7511 - CEC support for ADV7535
 
 panel:
 - ltk035c5444t, B133UAN01, NV3052C panel support
 - DataImage FG040346DSSWBG04 support
 - st7735r - DT bindings fix
 - ssd130x - fixes
 
 i915:
 - DG2 laptop PCI-IDs ("motherboard down")
 - Initial RPL-P PCI IDs
 - compute engine ABI
 - DG2 Tile4 support
 - DG2 CCS clear color compression support
 - DG2 render/media compression formats support
 - ATS-M platform info
 - RPL-S PCI IDs added
 - Bump ADL-P DMC version to v2.16
 - Support static DRRS
 - Support multiple eDP/LVDS native mode refresh rates
 - DP HDR support for HSW+
 - Lots of display refactoring + fixes
 - GuC hwconfig support and query
 - sysfs support for multi-tile
 - fdinfo per-client gpu utilisation
 - add geometry subslices query
 - fix prime mmap with LMEM
 - fix vm open count and remove vma refcounts
 - contiguous allocation fixes
 - steered register write support
 - small PCI BAR enablement
 - GuC error capture support
 - sunset igpu legacy mmap support for newer devices
 - GuC version 70.1.1 support
 
 amdgpu:
 - Initial SoC21 support
 - SMU 13.x enablement
 - SMU 13.0.4 support
 - ttm_eu cleanups
 - USB-C, GPUVM updates
 - TMZ fixes for RV
 - RAS support for VCN
 - PM sysfs code cleanup
 - DC FP rework
 - extend CG/PG flags to 64-bit
 - SI dpm lockdep fix
 - runtime PM fixes
 
 amdkfd:
 - RAS/SVM fixes
 - TLB flush fixes
 - CRIU GWS support
 - ignore bogus MEC signals more efficiently
 
 msm:
 - Fourcc modifier for tiled but not compressed layouts
 - Support for userspace allocated IOVA (GPU virtual address)
 - DPU: DSC (Display Stream Compression) support
 - DP: eDP support
 - DP: conversion to use drm_bridge and drm_bridge_connector
 - Merge DPU1 and MDP5 MDSS driver
 - DPU: writeback support
 
 nouveau:
 - make some structures static
 - make some variables static
 - switch to drm_gem_plane_helper_prepare_fb
 
 radeon:
 - misc fixes/cleanups
 
 mxsfb:
 - rework crtc mode setting
 - LCDIF CRC support
 
 etnaviv:
 - fencing improvements
 - fix address space collisions
 - cleanup MMU reference handling
 
 gma500:
 - GEM/GTT improvements
 - connector handling fixes
 
 komeda:
 - switch to plane reset helper
 
 mediatek:
 - MIPI DSI improvements
 
 omapdrm:
 - GEM improvements
 
 qxl:
 - aarch64 support
 
 vc4:
 - add a CL submission tracepoint
 - HDMI YUV support
 - HDMI/clock improvements
 - drop is_hdmi caching
 
 virtio:
 - remove restriction of non-zero blob types
 
 vmwgfx:
 - support for cursormob and cursorbypass 4
 - fence improvements
 
 tidss:
 - reset DISPC on startup
 
 solomon:
 - SPI support
 - DT improvements
 
 sun4i:
 - allwinner D1 support
 - drop is_hdmi caching
 
 imx:
 - use swap() instead of open-coding
 - use devm_platform_ioremap_resource
 - remove redunant initializations
 
 ast:
 - Displayport support
 
 rockchip:
 - Refactor IOMMU initialisation
 - make some structures static
 - replace drm_detect_hdmi_monitor with drm_display_info.is_hdmi
 - support swapped YUV formats,
 - clock improvements
 - rk3568 support
 - VOP2 support
 
 mediatek:
 - MT8186 support
 
 tegra:
 - debugabillity improvements
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEEKbZHaGwW9KfbeusDHTzWXnEhr4FAmKNxkAACgkQDHTzWXnE
 hr4hghAAqSXeMEw1w34miyM28hcOpXqkDfT1VVooqFxBT8MBqamzpZvCH94qsZwm
 3DRXlhQ4pk8wzUcWJpGprdNakxNQPpFVs2UuxYxOyrxYpdkbOwqsEcM3d8VXD9Cy
 E36z+dr85A8Te/J0Yg/FLoZMHulTlidqEZeOz6SMaNUohtrmH/oPWR+cPIy4/Zpp
 yysfbBSKTwblJFDf4+nIpks/VvJYAUO3i6KClT/Rh79Yg6de582AU0YaNcEArTi6
 JqdiYIoYLx609Ecy5NVme6wR/ai46afFLMYt3ZIP4OfHRINk+YL01BYMo2JE2M8l
 xjOH0Iwb7evzWqLK/ESwqp3P7nyppmLlfbZOFHWUfNJsjq2H3ePaAGhzOlYx1c70
 XENzY4IvpYYdR0pJuh1gw1cNZfM9JDAynGJ5jvsATLGBGQbpFsy3w/PMZT17q8an
 DpBwqQmShUdCJ2m+6zznC3VsxJpbvWKNE1I93NxAWZXmFYxoHCzRihahUxKcNDrQ
 ZLH7RSlk9SE/ZtNSLkU15YnKtoW+ThFIssUpVio6U/fZot1+efZkmkXplSuFvj6R
 i7s14hMWQjSJzpJg1DXfhDMycEOujNiQppCG2EaDlVxvUtCqYBd3EHOI7KQON//+
 iVtmEEnWh5rcCM+WsxLGf3Y7sVP3vfo1LOCxshb1XVfDmeMksoI=
 =BYQA
 -----END PGP SIGNATURE-----

Merge tag 'drm-next-2022-05-25' of git://anongit.freedesktop.org/drm/drm

Pull drm updates from Dave Airlie:
 "Intel have enabled DG2 on certain SKUs for laptops, AMD has started
  some new GPU support, msm has user allocated VA controls

  dma-buf:
   - add dma_resv_replace_fences
   - add dma_resv_get_singleton
   - make dma_excl_fence private

  core:
   - EDID parser refactorings
   - switch drivers to drm_mode_copy/duplicate
   - DRM managed mutex initialization

  display-helper:
   - put HDMI, SCDC, HDCP, DSC and DP into new module

  gem:
   - rework fence handling

  ttm:
   - rework bulk move handling
   - add common debugfs for resource managers
   - convert to kvcalloc

  format helpers:
   - support monochrome formats
   - RGB888, RGB565 to XRGB8888 conversions

  fbdev:
   - cfb/sys_imageblit fixes
   - pagelist corruption fix
   - create offb platform device
   - deferred io improvements

  sysfb:
   - Kconfig rework
   - support for VESA mode selection

  bridge:
   - conversions to devm_drm_of_get_bridge
   - conversions to panel_bridge
   - analogix_dp - autosuspend support
   - it66121 - audio support
   - tc358767 - DSI to DPI support
   - icn6211 - PLL/I2C fixes, DT property
   - adv7611 - enable DRM_BRIDGE_OP_HPD
   - anx7625 - fill ELD if no monitor
   - dw_hdmi - add audio support
   - lontium LT9211 support, i.MXMP LDB
   - it6505: Kconfig fix, DPCD set power fix
   - adv7511 - CEC support for ADV7535

  panel:
   - ltk035c5444t, B133UAN01, NV3052C panel support
   - DataImage FG040346DSSWBG04 support
   - st7735r - DT bindings fix
   - ssd130x - fixes

  i915:
   - DG2 laptop PCI-IDs ("motherboard down")
   - Initial RPL-P PCI IDs
   - compute engine ABI
   - DG2 Tile4 support
   - DG2 CCS clear color compression support
   - DG2 render/media compression formats support
   - ATS-M platform info
   - RPL-S PCI IDs added
   - Bump ADL-P DMC version to v2.16
   - Support static DRRS
   - Support multiple eDP/LVDS native mode refresh rates
   - DP HDR support for HSW+
   - Lots of display refactoring + fixes
   - GuC hwconfig support and query
   - sysfs support for multi-tile
   - fdinfo per-client gpu utilisation
   - add geometry subslices query
   - fix prime mmap with LMEM
   - fix vm open count and remove vma refcounts
   - contiguous allocation fixes
   - steered register write support
   - small PCI BAR enablement
   - GuC error capture support
   - sunset igpu legacy mmap support for newer devices
   - GuC version 70.1.1 support

  amdgpu:
   - Initial SoC21 support
   - SMU 13.x enablement
   - SMU 13.0.4 support
   - ttm_eu cleanups
   - USB-C, GPUVM updates
   - TMZ fixes for RV
   - RAS support for VCN
   - PM sysfs code cleanup
   - DC FP rework
   - extend CG/PG flags to 64-bit
   - SI dpm lockdep fix
   - runtime PM fixes

  amdkfd:
   - RAS/SVM fixes
   - TLB flush fixes
   - CRIU GWS support
   - ignore bogus MEC signals more efficiently

  msm:
   - Fourcc modifier for tiled but not compressed layouts
   - Support for userspace allocated IOVA (GPU virtual address)
   - DPU: DSC (Display Stream Compression) support
   - DP: eDP support
   - DP: conversion to use drm_bridge and drm_bridge_connector
   - Merge DPU1 and MDP5 MDSS driver
   - DPU: writeback support

  nouveau:
   - make some structures static
   - make some variables static
   - switch to drm_gem_plane_helper_prepare_fb

  radeon:
   - misc fixes/cleanups

  mxsfb:
   - rework crtc mode setting
   - LCDIF CRC support

  etnaviv:
   - fencing improvements
   - fix address space collisions
   - cleanup MMU reference handling

  gma500:
   - GEM/GTT improvements
   - connector handling fixes

  komeda:
   - switch to plane reset helper

  mediatek:
   - MIPI DSI improvements

  omapdrm:
   - GEM improvements

  qxl:
   - aarch64 support

  vc4:
   - add a CL submission tracepoint
   - HDMI YUV support
   - HDMI/clock improvements
   - drop is_hdmi caching

  virtio:
   - remove restriction of non-zero blob types

  vmwgfx:
   - support for cursormob and cursorbypass 4
   - fence improvements

  tidss:
   - reset DISPC on startup

  solomon:
   - SPI support
   - DT improvements

  sun4i:
   - allwinner D1 support
   - drop is_hdmi caching

  imx:
   - use swap() instead of open-coding
   - use devm_platform_ioremap_resource
   - remove redunant initializations

  ast:
   - Displayport support

  rockchip:
   - Refactor IOMMU initialisation
   - make some structures static
   - replace drm_detect_hdmi_monitor with drm_display_info.is_hdmi
   - support swapped YUV formats,
   - clock improvements
   - rk3568 support
   - VOP2 support

  mediatek:
   - MT8186 support

  tegra:
   - debugabillity improvements"

* tag 'drm-next-2022-05-25' of git://anongit.freedesktop.org/drm/drm: (1740 commits)
  drm/i915/dsi: fix VBT send packet port selection for ICL+
  drm/i915/uc: Fix undefined behavior due to shift overflowing the constant
  drm/i915/reg: fix undefined behavior due to shift overflowing the constant
  drm/i915/gt: Fix use of static in macro mismatch
  drm/i915/audio: fix audio code enable/disable pipe logging
  drm/i915: Fix CFI violation with show_dynamic_id()
  drm/i915: Fix 'mixing different enum types' warnings in intel_display_power.c
  drm/i915/gt: Fix build error without CONFIG_PM
  drm/msm/dpu: handle pm_runtime_get_sync() errors in bind path
  drm/msm/dpu: add DRM_MODE_ROTATE_180 back to supported rotations
  drm/msm: don't free the IRQ if it was not requested
  drm/msm/dpu: limit writeback modes according to max_linewidth
  drm/amd: Don't reset dGPUs if the system is going to s2idle
  drm/amdgpu: Unmap legacy queue when MES is enabled
  drm: msm: fix possible memory leak in mdp5_crtc_cursor_set()
  drm/msm: Fix fb plane offset calculation
  drm/msm/a6xx: Fix refcount leak in a6xx_gpu_init
  drm/msm/dsi: don't powerup at modeset time for parade-ps8640
  drm/rockchip: Change register space names in vop2
  dt-bindings: display: rockchip: make reg-names mandatory for VOP2
  ...
2022-05-25 16:18:27 -07:00
Joerg Roedel
b0dacee202 Merge branches 'apple/dart', 'arm/mediatek', 'arm/msm', 'arm/smmu', 'ppc/pamu', 'x86/vt-d', 'x86/amd' and 'vfio-notifier-fix' into next 2022-05-20 12:27:17 +02:00
Lu Baolu
e80552267b iommu/vt-d: Remove domain_update_iommu_snooping()
The IOMMU force snooping capability is not required to be consistent
among all the IOMMUs anymore. Remove force snooping capability check
in the IOMMU hot-add path and domain_update_iommu_snooping() becomes
a dead code now.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20220508123525.1973626-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20220510023407.2759143-8-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-05-13 15:14:56 +02:00
Lu Baolu
fc0051cb95 iommu/vt-d: Check domain force_snooping against attached devices
As domain->force_snooping only impacts the devices attached with the
domain, there's no need to check against all IOMMU units. On the other
hand, force_snooping could be set on a domain no matter whether it has
been attached or not, and once set it is an immutable flag. If no
device attached, the operation always succeeds. Then this empty domain
can be only attached to a device of which the IOMMU supports snoop
control.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20220508123525.1973626-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20220510023407.2759143-7-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-05-13 15:14:56 +02:00
Lu Baolu
9d6ab26a75 iommu/vt-d: Block force-snoop domain attaching if no SC support
In the attach_dev callback of the default domain ops, if the domain has
been set force_snooping, but the iommu hardware of the device does not
support SC(Snoop Control) capability, the callback should block it and
return a corresponding error code.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20220508123525.1973626-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20220510023407.2759143-6-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-05-13 15:14:56 +02:00
Lu Baolu
bac4e778d6 iommu/vt-d: Fold dmar_insert_one_dev_info() into its caller
Fold dmar_insert_one_dev_info() into domain_add_dev_info() which is its
only caller.

No intentional functional impact.

Suggested-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20220416120423.879552-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20220510023407.2759143-4-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-05-13 15:14:56 +02:00
Lu Baolu
e19c3992b9 iommu/vt-d: Change return type of dmar_insert_one_dev_info()
The dmar_insert_one_dev_info() returns the pass-in domain on success and
NULL on failure. This doesn't make much sense. Change it to an integer.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20220416120423.879552-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20220510023407.2759143-3-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-05-13 15:14:56 +02:00
Muhammad Usama Anjum
cd901e9284 iommu/vt-d: Remove unneeded validity check on dev
dev_iommu_priv_get() is being used at the top of this function which
dereferences dev. Dev cannot be NULL after this. Remove the validity
check on dev and simplify the code.

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Link: https://lore.kernel.org/r/20220313150337.593650-1-usama.anjum@collabora.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20220510023407.2759143-2-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-05-13 15:14:56 +02:00
Jason Gunthorpe
f78dc1dad8 iommu: Redefine IOMMU_CAP_CACHE_COHERENCY as the cap flag for IOMMU_CACHE
While the comment was correct that this flag was intended to convey the
block no-snoop support in the IOMMU, it has become widely implemented and
used to mean the IOMMU supports IOMMU_CACHE as a map flag. Only the Intel
driver was different.

Now that the Intel driver is using enforce_cache_coherency() update the
comment to make it clear that IOMMU_CAP_CACHE_COHERENCY is only about
IOMMU_CACHE.  Fix the Intel driver to return true since IOMMU_CACHE always
works.

The two places that test this flag, usnic and vdpa, are both assigning
userspace pages to a driver controlled iommu_domain and require
IOMMU_CACHE behavior as they offer no way for userspace to synchronize
caches.

Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/3-v3-2cf356649677+a32-intel_no_snoop_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-04-28 17:24:57 +02:00
Jason Gunthorpe
71cfafda9c vfio: Move the Intel no-snoop control off of IOMMU_CACHE
IOMMU_CACHE means "normal DMA to this iommu_domain's IOVA should be cache
coherent" and is used by the DMA API. The definition allows for special
non-coherent DMA to exist - ie processing of the no-snoop flag in PCIe
TLPs - so long as this behavior is opt-in by the device driver.

The flag is mainly used by the DMA API to synchronize the IOMMU setting
with the expected cache behavior of the DMA master. eg based on
dev_is_dma_coherent() in some case.

For Intel IOMMU IOMMU_CACHE was redefined to mean 'force all DMA to be
cache coherent' which has the practical effect of causing the IOMMU to
ignore the no-snoop bit in a PCIe TLP.

x86 platforms are always IOMMU_CACHE, so Intel should ignore this flag.

Instead use the new domain op enforce_cache_coherency() which causes every
IOPTE created in the domain to have the no-snoop blocking behavior.

Reconfigure VFIO to always use IOMMU_CACHE and call
enforce_cache_coherency() to operate the special Intel behavior.

Remove the IOMMU_CACHE test from Intel IOMMU.

Ultimately VFIO plumbs the result of enforce_cache_coherency() back into
the x86 platform code through kvm_arch_register_noncoherent_dma() which
controls if the WBINVD instruction is available in the guest.  No other
archs implement kvm_arch_register_noncoherent_dma() nor are there any
other known consumers of VFIO_DMA_CC_IOMMU that might be affected by the
user visible result change on non-x86 archs.

Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/2-v3-2cf356649677+a32-intel_no_snoop_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-04-28 17:24:57 +02:00
Jason Gunthorpe
6043257b1d iommu: Introduce the domain op enforce_cache_coherency()
This new mechanism will replace using IOMMU_CAP_CACHE_COHERENCY and
IOMMU_CACHE to control the no-snoop blocking behavior of the IOMMU.

Currently only Intel and AMD IOMMUs are known to support this
feature. They both implement it as an IOPTE bit, that when set, will cause
PCIe TLPs to that IOVA with the no-snoop bit set to be treated as though
the no-snoop bit was clear.

The new API is triggered by calling enforce_cache_coherency() before
mapping any IOVA to the domain which globally switches on no-snoop
blocking. This allows other implementations that might block no-snoop
globally and outside the IOPTE - AMD also documents such a HW capability.

Leave AMD out of sync with Intel and have it block no-snoop even for
in-kernel users. This can be trivially resolved in a follow up patch.

Only VFIO needs to call this API because it does not have detailed control
over the device to avoid requesting no-snoop behavior at the device
level. Other places using domains with real kernel drivers should simply
avoid asking their devices to set the no-snoop bit.

Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/1-v3-2cf356649677+a32-intel_no_snoop_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-04-28 17:24:57 +02:00
David Stevens
59bf3557cf iommu/vt-d: Calculate mask for non-aligned flushes
Calculate the appropriate mask for non-size-aligned page selective
invalidation. Since psi uses the mask value to mask out the lower order
bits of the target address, properly flushing the iotlb requires using a
mask value such that [pfn, pfn+pages) all lie within the flushed
size-aligned region.  This is not normally an issue because iova.c
always allocates iovas that are aligned to their size. However, iovas
which come from other sources (e.g. userspace via VFIO) may not be
aligned.

To properly flush the IOTLB, both the start and end pfns need to be
equal after applying the mask. That means that the most efficient mask
to use is the index of the lowest bit that is equal where all higher
bits are also equal. For example, if pfn=0x17f and pages=3, then
end_pfn=0x181, so the smallest mask we can use is 8. Any differences
above the highest bit of pages are due to carrying, so by xnor'ing pfn
and end_pfn and then masking out the lower order bits based on pages, we
get 0xffffff00, where the first set bit is the mask we want to use.

Fixes: 6fe1010d6d ("vfio/type1: DMA unmap chunking")
Cc: stable@vger.kernel.org
Signed-off-by: David Stevens <stevensd@chromium.org>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20220401022430.1262215-1-stevensd@google.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20220410013533.3959168-2-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-04-28 11:01:06 +02:00
Robin Murphy
d0be55fbeb iommu: Add capability for pre-boot DMA protection
VT-d's dmar_platform_optin() actually represents a combination of
properties fairly well standardised by Microsoft as "Pre-boot DMA
Protection" and "Kernel DMA Protection"[1]. As such, we can provide
interested consumers with an abstracted capability rather than
driver-specific interfaces that won't scale. We name it for the former
aspect since that's what external callers are most likely to be
interested in; the latter is for the IOMMU layer to handle itself.

[1] https://docs.microsoft.com/en-us/windows-hardware/design/device-experiences/oem-kernel-dma-protection

Suggested-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/d6218dff2702472da80db6aec2c9589010684551.1650878781.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-04-28 10:30:25 +02:00
Jani Nikula
83970cd63b Merge drm/drm-next into drm-intel-next
Sync up with v5.18-rc1, in particular to get 5e3094cfd9
("drm/i915/xehpsdv: Add has_flat_ccs to device info").

Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2022-04-11 16:01:56 +03:00
Joerg Roedel
e17c6debd4 Merge branches 'arm/mediatek', 'arm/msm', 'arm/renesas', 'arm/rockchip', 'arm/smmu', 'x86/vt-d' and 'x86/amd' into next 2022-03-08 12:21:31 +01:00
Yian Chen
97f2f2c531 iommu/vt-d: Enable ATS for the devices in SATC table
Starting from Intel VT-d v3.2, Intel platform BIOS can provide additional
SATC table structure. SATC table includes a list of SoC integrated devices
that support ATC (Address translation cache).

Enabling ATC (via ATS capability) can be a functional requirement for SATC
device operation or optional to enhance device performance/functionality.
This is determined by the bit of ATC_REQUIRED in SATC table. When IOMMU is
working in scalable mode, software chooses to always enable ATS for every
device in SATC table because Intel SoC devices in SATC table are trusted to
use ATS.

On the other hand, if IOMMU is in legacy mode, ATS of SATC capable devices
can work transparently to software and be automatically enabled by IOMMU
hardware. As the result, there is no need for software to enable ATS on
these devices.

This also removes dmar_find_matched_atsr_unit() helper as it becomes dead
code now.

Signed-off-by: Yian Chen <yian.chen@intel.com>
Link: https://lore.kernel.org/r/20220222185416.1722611-1-yian.chen@intel.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20220301020159.633356-13-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-03-04 16:46:31 +01:00
Marco Bonelli
45967ffb9e iommu/vt-d: Add missing "__init" for rmrr_sanity_check()
rmrr_sanity_check() calls arch_rmrr_sanity_check(), which is marked as
"__init". Add the annotation for rmrr_sanity_check() too. This function is
currently only called from dmar_parse_one_rmrr() which is also "__init".

Signed-off-by: Marco Bonelli <marco@mebeim.net>
Link: https://lore.kernel.org/r/20220210023141.9208-1-marco@mebeim.net
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20220301020159.633356-11-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-03-04 16:46:31 +01:00
Lu Baolu
2187a57ef0 iommu/vt-d: Fix indentation of goto labels
Remove blanks before goto labels.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20220214025704.3184654-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20220301020159.633356-9-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-03-04 16:46:31 +01:00
Lu Baolu
782861df7d iommu/vt-d: Remove unnecessary prototypes
Some prototypes in iommu.c are unnecessary. Delete them.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20220214025704.3184654-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20220301020159.633356-8-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-03-04 16:46:30 +01:00
Lu Baolu
763e656c69 iommu/vt-d: Remove unnecessary includes
Remove unnecessary include files and sort the remaining alphabetically.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20220214025704.3184654-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20220301020159.633356-7-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-03-04 16:46:30 +01:00
Lu Baolu
586081d3f6 iommu/vt-d: Remove DEFER_DEVICE_DOMAIN_INFO
Allocate and set the per-device iommu private data during iommu device
probe. This makes the per-device iommu private data always available
during iommu_probe_device() and iommu_release_device(). With this changed,
the dummy DEFER_DEVICE_DOMAIN_INFO pointer could be removed. The wrappers
for getting the private data and domain are also cleaned.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20220214025704.3184654-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20220301020159.633356-6-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-03-04 16:46:30 +01:00
Lu Baolu
ee2653bbe8 iommu/vt-d: Remove domain and devinfo mempool
The domain and devinfo memory blocks are only allocated during device
probe and released during remove. There's no hot-path context, hence
no need for memory pools.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20220214025704.3184654-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20220301020159.633356-5-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-03-04 16:46:30 +01:00
Lu Baolu
c8850a6e6d iommu/vt-d: Remove iova_cache_get/put()
These have been done in drivers/iommu/dma-iommu.c. Remove this duplicate
code.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20220214025704.3184654-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20220301020159.633356-4-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-03-04 16:46:30 +01:00
Lu Baolu
c5d27545fb iommu/vt-d: Remove finding domain in dmar_insert_one_dev_info()
The Intel IOMMU driver has already converted to use default domain
framework in iommu core. There's no need to find a domain for the
device in the domain attaching path. Cleanup that code.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20220214025704.3184654-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20220301020159.633356-3-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-03-04 16:46:30 +01:00
Lu Baolu
402e6688a7 iommu/vt-d: Remove intel_iommu::domains
The "domains" field of the intel_iommu structure keeps the mapping of
domain_id to dmar_domain. This information is not used anywhere. Remove
and cleanup it to avoid unnecessary memory consumption.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20220214025704.3184654-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20220301020159.633356-2-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-03-04 16:46:30 +01:00
Tejas Upadhyay
0a967f5bfd iommu/vt-d: Add RPLS to quirk list to skip TE disabling
The VT-d spec requires (10.4.4 Global Command Register, TE
field) that:

Hardware implementations supporting DMA draining must drain
any in-flight DMA read/write requests queued within the
Root-Complex before completing the translation enable
command and reflecting the status of the command through
the TES field in the Global Status register.

Unfortunately, some integrated graphic devices fail to do
so after some kind of power state transition. As the
result, the system might stuck in iommu_disable_translati
on(), waiting for the completion of TE transition.

This adds RPLS to a quirk list for those devices and skips
TE disabling if the qurik hits.

Link: https://gitlab.freedesktop.org/drm/intel/-/issues/4898
Tested-by: Raviteja Goud Talla <ravitejax.goud.talla@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220302043256.191529-1-tejaskumarx.surendrakumar.upadhyay@intel.com
2022-03-02 17:06:30 -05:00
Adrian Huang
b00833768e iommu/vt-d: Fix double list_add when enabling VMD in scalable mode
When enabling VMD and IOMMU scalable mode, the following kernel panic
call trace/kernel log is shown in Eagle Stream platform (Sapphire Rapids
CPU) during booting:

pci 0000:59:00.5: Adding to iommu group 42
...
vmd 0000:59:00.5: PCI host bridge to bus 10000:80
pci 10000:80:01.0: [8086:352a] type 01 class 0x060400
pci 10000:80:01.0: reg 0x10: [mem 0x00000000-0x0001ffff 64bit]
pci 10000:80:01.0: enabling Extended Tags
pci 10000:80:01.0: PME# supported from D0 D3hot D3cold
pci 10000:80:01.0: DMAR: Setup RID2PASID failed
pci 10000:80:01.0: Failed to add to iommu group 42: -16
pci 10000:80:03.0: [8086:352b] type 01 class 0x060400
pci 10000:80:03.0: reg 0x10: [mem 0x00000000-0x0001ffff 64bit]
pci 10000:80:03.0: enabling Extended Tags
pci 10000:80:03.0: PME# supported from D0 D3hot D3cold
------------[ cut here ]------------
kernel BUG at lib/list_debug.c:29!
invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
CPU: 0 PID: 7 Comm: kworker/0:1 Not tainted 5.17.0-rc3+ #7
Hardware name: Lenovo ThinkSystem SR650V3/SB27A86647, BIOS ESE101Y-1.00 01/13/2022
Workqueue: events work_for_cpu_fn
RIP: 0010:__list_add_valid.cold+0x26/0x3f
Code: 9a 4a ab ff 4c 89 c1 48 c7 c7 40 0c d9 9e e8 b9 b1 fe ff 0f
      0b 48 89 f2 4c 89 c1 48 89 fe 48 c7 c7 f0 0c d9 9e e8 a2 b1
      fe ff <0f> 0b 48 89 d1 4c 89 c6 4c 89 ca 48 c7 c7 98 0c d9
      9e e8 8b b1 fe
RSP: 0000:ff5ad434865b3a40 EFLAGS: 00010246
RAX: 0000000000000058 RBX: ff4d61160b74b880 RCX: ff4d61255e1fffa8
RDX: 0000000000000000 RSI: 00000000fffeffff RDI: ffffffff9fd34f20
RBP: ff4d611d8e245c00 R08: 0000000000000000 R09: ff5ad434865b3888
R10: ff5ad434865b3880 R11: ff4d61257fdc6fe8 R12: ff4d61160b74b8a0
R13: ff4d61160b74b8a0 R14: ff4d611d8e245c10 R15: ff4d611d8001ba70
FS:  0000000000000000(0000) GS:ff4d611d5ea00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ff4d611fa1401000 CR3: 0000000aa0210001 CR4: 0000000000771ef0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
 <TASK>
 intel_pasid_alloc_table+0x9c/0x1d0
 dmar_insert_one_dev_info+0x423/0x540
 ? device_to_iommu+0x12d/0x2f0
 intel_iommu_attach_device+0x116/0x290
 __iommu_attach_device+0x1a/0x90
 iommu_group_add_device+0x190/0x2c0
 __iommu_probe_device+0x13e/0x250
 iommu_probe_device+0x24/0x150
 iommu_bus_notifier+0x69/0x90
 blocking_notifier_call_chain+0x5a/0x80
 device_add+0x3db/0x7b0
 ? arch_memremap_can_ram_remap+0x19/0x50
 ? memremap+0x75/0x140
 pci_device_add+0x193/0x1d0
 pci_scan_single_device+0xb9/0xf0
 pci_scan_slot+0x4c/0x110
 pci_scan_child_bus_extend+0x3a/0x290
 vmd_enable_domain.constprop.0+0x63e/0x820
 vmd_probe+0x163/0x190
 local_pci_probe+0x42/0x80
 work_for_cpu_fn+0x13/0x20
 process_one_work+0x1e2/0x3b0
 worker_thread+0x1c4/0x3a0
 ? rescuer_thread+0x370/0x370
 kthread+0xc7/0xf0
 ? kthread_complete_and_exit+0x20/0x20
 ret_from_fork+0x1f/0x30
 </TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
...
Kernel panic - not syncing: Fatal exception
Kernel Offset: 0x1ca00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
---[ end Kernel panic - not syncing: Fatal exception ]---

The following 'lspci' output shows devices '10000:80:*' are subdevices of
the VMD device 0000:59:00.5:

  $ lspci
  ...
  0000:59:00.5 RAID bus controller: Intel Corporation Volume Management Device NVMe RAID Controller (rev 20)
  ...
  10000:80:01.0 PCI bridge: Intel Corporation Device 352a (rev 03)
  10000:80:03.0 PCI bridge: Intel Corporation Device 352b (rev 03)
  10000:80:05.0 PCI bridge: Intel Corporation Device 352c (rev 03)
  10000:80:07.0 PCI bridge: Intel Corporation Device 352d (rev 03)
  10000:81:00.0 Non-Volatile memory controller: Intel Corporation NVMe Datacenter SSD [3DNAND, Beta Rock Controller]
  10000:82:00.0 Non-Volatile memory controller: Intel Corporation NVMe Datacenter SSD [3DNAND, Beta Rock Controller]

The symptom 'list_add double add' is caused by the following failure
message:

  pci 10000:80:01.0: DMAR: Setup RID2PASID failed
  pci 10000:80:01.0: Failed to add to iommu group 42: -16
  pci 10000:80:03.0: [8086:352b] type 01 class 0x060400

Device 10000:80:01.0 is the subdevice of the VMD device 0000:59:00.5,
so invoking intel_pasid_alloc_table() gets the pasid_table of the VMD
device 0000:59:00.5. Here is call path:

  intel_pasid_alloc_table
    pci_for_each_dma_alias
     get_alias_pasid_table
       search_pasid_table

pci_real_dma_dev() in pci_for_each_dma_alias() gets the real dma device
which is the VMD device 0000:59:00.5. However, pte of the VMD device
0000:59:00.5 has been configured during this message "pci 0000:59:00.5:
Adding to iommu group 42". So, the status -EBUSY is returned when
configuring pasid entry for device 10000:80:01.0.

It then invokes dmar_remove_one_dev_info() to release
'struct device_domain_info *' from iommu_devinfo_cache. But, the pasid
table is not released because of the following statement in
__dmar_remove_one_dev_info():

	if (info->dev && !dev_is_real_dma_subdevice(info->dev)) {
		...
		intel_pasid_free_table(info->dev);
        }

The subsequent dmar_insert_one_dev_info() operation of device
10000:80:03.0 allocates 'struct device_domain_info *' from
iommu_devinfo_cache. The allocated address is the same address that
is released previously for device 10000:80:01.0. Finally, invoking
device_attach_pasid_table() causes the issue.

`git bisect` points to the offending commit 474dd1c650 ("iommu/vt-d:
Fix clearing real DMA device's scalable-mode context entries"), which
releases the pasid table if the device is not the subdevice by
checking the returned status of dev_is_real_dma_subdevice().
Reverting the offending commit can work around the issue.

The solution is to prevent from allocating pasid table if those
devices are subdevices of the VMD device.

Fixes: 474dd1c650 ("iommu/vt-d: Fix clearing real DMA device's scalable-mode context entries")
Cc: stable@vger.kernel.org # v5.14+
Signed-off-by: Adrian Huang <ahuang12@lenovo.com>
Link: https://lore.kernel.org/r/20220216091307.703-1-adrianhuang0701@gmail.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20220221053348.262724-2-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-02-28 13:48:44 +01:00
Lu Baolu
9a630a4b41 iommu: Split struct iommu_ops
Move the domain specific operations out of struct iommu_ops into a new
structure that only has domain specific operations. This solves the
problem of needing to know if the method vector for a given operation
needs to be retrieved from the device or the domain. Logically the domain
ops are the ones that make sense for external subsystems and endpoint
drivers to use, while device ops, with the sole exception of domain_alloc,
are IOMMU API internals.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20220216025249.3459465-10-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-02-28 13:25:49 +01:00
Lu Baolu
41bb23e70b iommu: Remove unused argument in is_attach_deferred
The is_attach_deferred iommu_ops callback is a device op. The domain
argument is unnecessary and never used. Remove it to make code clean.

Suggested-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20220216025249.3459465-9-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-02-28 13:25:49 +01:00
Lu Baolu
241469685d iommu/vt-d: Remove aux-domain related callbacks
The aux-domain related callbacks are not called in the tree. Remove them
to avoid dead code.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20220216025249.3459465-4-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-02-28 13:25:48 +01:00
Lu Baolu
989192ac6a iommu/vt-d: Remove guest pasid related callbacks
The guest pasid related callbacks are not called in the tree. Remove them
to avoid dead code.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20220216025249.3459465-2-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-02-28 13:25:48 +01:00
Joerg Roedel
66dc1b791c Merge branches 'arm/smmu', 'virtio', 'x86/amd', 'x86/vt-d' and 'core' into next 2022-01-04 10:33:45 +01:00
Matthew Wilcox (Oracle)
87f60cc65d iommu/vt-d: Use put_pages_list
page->freelist is for the use of slab.  We already have the ability
to free a list of pages in the core mm, but it requires the use of a
list_head and for the pages to be chained together through page->lru.
Switch the Intel IOMMU and IOVA code over to using free_pages_list().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
[rm: split from original patch, cosmetic tweaks, fix fq entries]
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/2115b560d9a0ce7cd4b948bd51a2b7bde8fdfd59.1639753638.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-12-20 09:03:05 +01:00
Maíra Canal
c95a9c278d iommu/vt-d: Remove unused dma_to_mm_pfn function
Remove dma_to_buf_pfn function, which is not used in the codebase.

This was pointed by clang with the following warning:

'dma_to_mm_pfn' [-Wunused-function]
static inline unsigned long dma_to_mm_pfn(unsigned long dma_pfn)
                            ^
https://lore.kernel.org/r/YYhY7GqlrcTZlzuA@fedora

drivers/iommu/intel/iommu.c:136:29: warning: unused function
Signed-off-by: Maíra Canal <maira.canal@usp.br>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20211217083817.1745419-4-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-12-17 09:40:29 +01:00
Kefeng Wang
f5209f9127 iommu/vt-d: Drop duplicate check in dma_pte_free_pagetable()
The BUG_ON check exists in dma_pte_clear_range(), kill the duplicate
check.

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Link: https://lore.kernel.org/r/20211025032307.182974-1-wangkefeng.wang@huawei.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20211217083817.1745419-3-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-12-17 09:40:29 +01:00
Christophe JAILLET
bb71257396 iommu/vt-d: Use bitmap_zalloc() when applicable
'iommu->domain_ids' is a bitmap. So use 'bitmap_zalloc()' to simplify code
and improve the semantic.

Also change the corresponding 'kfree()' into 'bitmap_free()' to keep
consistency.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://lore.kernel.org/r/cb7a3e0a8d522447a06298a4f244c3df072f948b.1635018498.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20211217083817.1745419-2-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-12-17 09:40:29 +01:00
Kees Cook
4599d78a82 iommu/vt-d: Use correctly sized arguments for bit field
The find.h APIs are designed to be used only on unsigned long arguments.
This can technically result in a over-read, but it is harmless in this
case. Regardless, fix it to avoid the warning seen under -Warray-bounds,
which we'd like to enable globally:

In file included from ./include/linux/bitmap.h:9,
                 from drivers/iommu/intel/iommu.c:17:
drivers/iommu/intel/iommu.c: In function 'domain_context_mapping_one':
./include/linux/find.h:119:37: warning: array subscript 'long unsigned int[0]' is partly outside array bounds of 'int[1]' [-Warray-bounds]
  119 |                 unsigned long val = *addr & GENMASK(size - 1, 0);
      |                                     ^~~~~
drivers/iommu/intel/iommu.c:2115:18: note: while referencing 'max_pde'
 2115 |         int pds, max_pde;
      |                  ^~~~~~~

Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Yury Norov <yury.norov@gmail.com>
Link: https://lore.kernel.org/r/20211215232432.2069605-1-keescook@chromium.org
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-12-17 09:05:04 +01:00
Alex Williamson
86dc40c7ea iommu/vt-d: Fix unmap_pages support
When supporting only the .map and .unmap callbacks of iommu_ops,
the IOMMU driver can make assumptions about the size and alignment
used for mappings based on the driver provided pgsize_bitmap.  VT-d
previously used essentially PAGE_MASK for this bitmap as any power
of two mapping was acceptably filled by native page sizes.

However, with the .map_pages and .unmap_pages interface we're now
getting page-size and count arguments.  If we simply combine these
as (page-size * count) and make use of the previous map/unmap
functions internally, any size and alignment assumptions are very
different.

As an example, a given vfio device assignment VM will often create
a 4MB mapping at IOVA pfn [0x3fe00 - 0x401ff].  On a system that
does not support IOMMU super pages, the unmap_pages interface will
ask to unmap 1024 4KB pages at the base IOVA.  dma_pte_clear_level()
will recurse down to level 2 of the page table where the first half
of the pfn range exactly matches the entire pte level.  We clear the
pte, increment the pfn by the level size, but (oops) the next pte is
on a new page, so we exit the loop an pop back up a level.  When we
then update the pfn based on that higher level, we seem to assume
that the previous pfn value was at the start of the level.  In this
case the level size is 256K pfns, which we add to the base pfn and
get a results of 0x7fe00, which is clearly greater than 0x401ff,
so we're done.  Meanwhile we never cleared the ptes for the remainder
of the range.  When the VM remaps this range, we're overwriting valid
ptes and the VT-d driver complains loudly, as reported by the user
report linked below.

The fix for this seems relatively simple, if each iteration of the
loop in dma_pte_clear_level() is assumed to clear to the end of the
level pte page, then our next pfn should be calculated from level_pfn
rather than our working pfn.

Fixes: 3f34f12597 ("iommu/vt-d: Implement map/unmap_pages() iommu_ops callback")
Reported-by: Ajay Garg <ajaygargnsit@gmail.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Tested-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Link: https://lore.kernel.org/all/20211002124012.18186-1-ajaygargnsit@gmail.com/
Link: https://lore.kernel.org/r/163659074748.1617923.12716161410774184024.stgit@omen
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20211126135556.397932-3-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-11-26 22:54:47 +01:00
Longpeng(Mike)
9906b9352a iommu/vt-d: Avoid duplicate removing in __domain_mapping()
The __domain_mapping() always removes the pages in the range from
'iov_pfn' to 'end_pfn', but the 'end_pfn' is always the last pfn
of the range that the caller wants to map.

This would introduce too many duplicated removing and leads the
map operation take too long, for example:

  Map iova=0x100000,nr_pages=0x7d61800
    iov_pfn: 0x100000, end_pfn: 0x7e617ff
    iov_pfn: 0x140000, end_pfn: 0x7e617ff
    iov_pfn: 0x180000, end_pfn: 0x7e617ff
    iov_pfn: 0x1c0000, end_pfn: 0x7e617ff
    iov_pfn: 0x200000, end_pfn: 0x7e617ff
    ...
  it takes about 50ms in total.

We can reduce the cost by recalculate the 'end_pfn' and limit it
to the boundary of the end of this pte page.

  Map iova=0x100000,nr_pages=0x7d61800
    iov_pfn: 0x100000, end_pfn: 0x13ffff
    iov_pfn: 0x140000, end_pfn: 0x17ffff
    iov_pfn: 0x180000, end_pfn: 0x1bffff
    iov_pfn: 0x1c0000, end_pfn: 0x1fffff
    iov_pfn: 0x200000, end_pfn: 0x23ffff
    ...
  it only need 9ms now.

This also removes a meaningless BUG_ON() in __domain_mapping().

Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com>
Tested-by: Liujunjie <liujunjie23@huawei.com>
Link: https://lore.kernel.org/r/20211008000433.1115-1-longpeng2@huawei.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20211014053839.727419-10-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-10-18 12:31:48 +02:00
Lu Baolu
94f797ad61 iommu/vt-d: Delete dev_has_feat callback
The commit 262948f8ba ("iommu: Delete iommu_dev_has_feature()") has
deleted the iommu_dev_has_feature() interface. Remove the dev_has_feat
callback to avoid dead code.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210929072030.1330225-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20211014053839.727419-7-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-10-18 12:31:48 +02:00
Lu Baolu
032c5ee40e iommu/vt-d: Use second level for GPA->HPA translation
The IOMMU VT-d implementation uses the first level for GPA->HPA translation
by default. Although both the first level and the second level could handle
the DMA translation, they're different in some way. For example, the second
level translation has separate controls for the Access/Dirty page tracking.
With the first level translation, there's no such control. On the other
hand, the second level translation has the page-level control for forcing
snoop, but the first level only has global control with pasid granularity.

This uses the second level for GPA->HPA translation so that we can provide
a consistent hardware interface for use cases like dirty page tracking for
live migration.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20210926114535.923263-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20211014053839.727419-6-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-10-18 12:31:48 +02:00
Lu Baolu
b34380a6d7 iommu/vt-d: Remove duplicate identity domain flag
The iommu_domain data structure already has the "type" field to keep the
type of a domain. It's unnecessary to have the DOMAIN_FLAG_STATIC_IDENTITY
flag in the vt-d implementation. This cleans it up with no functionality
change.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20210926114535.923263-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20211014053839.727419-4-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-10-18 12:31:48 +02:00
Kyung Min Park
914ff7719e iommu/vt-d: Dump DMAR translation structure when DMA fault occurs
When the dmar translation fault happens, the kernel prints a single line
fault reason with corresponding hexadecimal code defined in the Intel VT-d
specification.

Currently, when user wants to debug the translation fault in detail,
debugfs is used for dumping the dmar_translation_struct, which is not
available when the kernel failed to boot.

Dump the DMAR translation structure, pagewalk the IO page table and print
the page table entry when the fault happens.

This takes effect only when CONFIG_DMAR_DEBUG is enabled.

Signed-off-by: Kyung Min Park <kyung.min.park@intel.com>
Link: https://lore.kernel.org/r/20210815203845.31287-1-kyung.min.park@intel.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20211014053839.727419-3-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-10-18 12:31:48 +02:00
Tvrtko Ursulin
5240aed2cd iommu/vt-d: Do not falsely log intel_iommu is unsupported kernel option
Handling of intel_iommu kernel command line option should return "true" to
indicate option is valid and so avoid logging it as unknown by the core
parsing code.

Also log unknown sub-options at the notice level to let user know of
potential typos or similar.

Reported-by: Eero Tamminen <eero.t.tamminen@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://lore.kernel.org/r/20210831112947.310080-1-tvrtko.ursulin@linux.intel.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20211014053839.727419-2-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-10-18 12:31:48 +02:00
Lu Baolu
48811c4434 iommu/vt-d: Allow devices to have more than 32 outstanding PRs
The minimum per-IOMMU PRQ queue size is one 4K page, this is more entries
than the hardcoded limit of 32 in the current VT-d code. Some devices can
support up to 512 outstanding PRQs but underutilized by this limit of 32.
Although, 32 gives some rough fairness when multiple devices share the same
IOMMU PRQ queue, but far from optimal for customized use case. This extends
the per-IOMMU PRQ queue size to four 4K pages and let the devices have as
many outstanding page requests as they can.

Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210720013856.4143880-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20210818134852.1847070-7-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-08-19 10:41:08 +02:00
Lu Baolu
289b3b005c iommu/vt-d: Preset A/D bits for user space DMA usage
We preset the access and dirty bits for IOVA over first level usage only
for the kernel DMA (i.e., when domain type is IOMMU_DOMAIN_DMA). We should
also preset the FL A/D for user space DMA usage. The idea is that even the
user space A/D bit memory write is unnecessary. We should avoid it to
minimize the overhead.

Suggested-by: Sanjay Kumar <sanjay.k.kumar@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210720013856.4143880-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20210818134852.1847070-6-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-08-19 10:41:08 +02:00
Lu Baolu
792fb43ce2 iommu/vt-d: Enable Intel IOMMU scalable mode by default
The commit 8950dcd83a ("iommu/vt-d: Leave scalable mode default off")
leaves the scalable mode default off and end users could turn it on with
"intel_iommu=sm_on". Using the Intel IOMMU scalable mode for kernel DMA,
user-level device access and Shared Virtual Address have been enabled.
This enables the scalable mode by default if the hardware advertises the
support and adds kernel options of "intel_iommu=sm_on/sm_off" for end
users to configure it through the kernel parameters.

Suggested-by: Ashok Raj <ashok.raj@intel.com>
Suggested-by: Sanjay Kumar <sanjay.k.kumar@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20210720013856.4143880-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20210818134852.1847070-5-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-08-19 10:41:08 +02:00
Lu Baolu
01dac2d9d2 iommu/vt-d: Refactor Kconfig a bit
Put all sub-options inside a "if INTEL_IOMMU" so that they don't need to
always depend on INTEL_IOMMU. Use IS_ENABLED() instead of #ifdef as well.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210720013856.4143880-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20210818134852.1847070-4-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-08-19 10:41:08 +02:00
Zhen Lei
5e41c99894 iommu/vt-d: Remove unnecessary oom message
Fixes scripts/checkpatch.pl warning:
WARNING: Possible unnecessary 'out of memory' message

Remove it can help us save a bit of memory.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/r/20210609124937.14260-1-thunder.leizhen@huawei.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210818134852.1847070-3-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-08-19 10:41:08 +02:00