linux

mirror of synced 2025-03-06 20:59:54 +01:00

Author	SHA1	Message	Date
Arjan van de Ven	e11452eb07	dmaengine: idxd: add a new security check to deal with a hardware erratum On Sapphire Rapids and related platforms, the DSA and IAA devices have an erratum that causes direct access (for example, by using the ENQCMD or MOVDIR64 instructions) from untrusted applications to be a security problem. To solve this, add a flag to the PCI device enumeration and device structures to indicate the presence/absence of this security exposure. In the mmap() method of the device, this flag is then used to enforce that the user has the CAP_SYS_RAWIO capability. In a future patch, a write() based method will be added that allows untrusted applications submit work to the accelerator, where the kernel can do sanity checking on the user input to ensure secure operation of the accelerator. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>	2024-05-13 14:07:40 +00:00
Rex Zhang	d5638de827	dmaengine: idxd: Convert spinlock to mutex to lock evl workqueue drain_workqueue() cannot be called safely in a spinlocked context due to possible task rescheduling. In the multi-task scenario, calling queue_work() while drain_workqueue() will lead to a Call Trace as pushing a work on a draining workqueue is not permitted in spinlocked context. Call Trace: <TASK> ? __warn+0x7d/0x140 ? __queue_work+0x2b2/0x440 ? report_bug+0x1f8/0x200 ? handle_bug+0x3c/0x70 ? exc_invalid_op+0x18/0x70 ? asm_exc_invalid_op+0x1a/0x20 ? __queue_work+0x2b2/0x440 queue_work_on+0x28/0x30 idxd_misc_thread+0x303/0x5a0 [idxd] ? __schedule+0x369/0xb40 ? __pfx_irq_thread_fn+0x10/0x10 ? irq_thread+0xbc/0x1b0 irq_thread_fn+0x21/0x70 irq_thread+0x102/0x1b0 ? preempt_count_add+0x74/0xa0 ? __pfx_irq_thread_dtor+0x10/0x10 ? __pfx_irq_thread+0x10/0x10 kthread+0x103/0x140 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x31/0x50 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1b/0x30 </TASK> The current implementation uses a spinlock to protect event log workqueue and will lead to the Call Trace due to potential task rescheduling. To address the locking issue, convert the spinlock to mutex, allowing the drain_workqueue() to be called in a safe mutex-locked context. This change ensures proper synchronization when accessing the event log workqueue, preventing potential Call Trace and improving the overall robustness of the code. Fixes: `c40bd7d973` ("dmaengine: idxd: process user page faults for completion record") Signed-off-by: Rex Zhang <rex.zhang@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Fenghua Yu <fenghua.yu@intel.com> Reviewed-by: Lijun Pan <lijun.pan@intel.com> Link: https://lore.kernel.org/r/20240404223949.2885604-1-fenghua.yu@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2024-04-07 13:31:44 +05:30
Fenghua Yu	d3ea125df3	dmaengine: idxd: Ensure safe user copy of completion record If CONFIG_HARDENED_USERCOPY is enabled, copying completion record from event log cache to user triggers a kernel bug. [ 1987.159822] usercopy: Kernel memory exposure attempt detected from SLUB object 'dsa0' (offset 74, size 31)! [ 1987.170845] ------------[ cut here ]------------ [ 1987.176086] kernel BUG at mm/usercopy.c:102! [ 1987.180946] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI [ 1987.186866] CPU: 17 PID: 528 Comm: kworker/17:1 Not tainted 6.8.0-rc2+ #5 [ 1987.194537] Hardware name: Intel Corporation AvenueCity/AvenueCity, BIOS BHSDCRB1.86B.2492.D03.2307181620 07/18/2023 [ 1987.206405] Workqueue: wq0.0 idxd_evl_fault_work [idxd] [ 1987.212338] RIP: 0010:usercopy_abort+0x72/0x90 [ 1987.217381] Code: 58 65 9c 50 48 c7 c2 17 85 61 9c 57 48 c7 c7 98 fd 6b 9c 48 0f 44 d6 48 c7 c6 b3 08 62 9c 4c 89 d1 49 0f 44 f3 e8 1e 2e d5 ff <0f> 0b 49 c7 c1 9e 42 61 9c 4c 89 cf 4d 89 c8 eb a9 66 66 2e 0f 1f [ 1987.238505] RSP: 0018:ff62f5cf20607d60 EFLAGS: 00010246 [ 1987.244423] RAX: 000000000000005f RBX: 000000000000001f RCX: 0000000000000000 [ 1987.252480] RDX: 0000000000000000 RSI: ffffffff9c61429e RDI: 00000000ffffffff [ 1987.260538] RBP: ff62f5cf20607d78 R08: ff2a6a89ef3fffe8 R09: 00000000fffeffff [ 1987.268595] R10: ff2a6a89eed00000 R11: 0000000000000003 R12: ff2a66934849c89a [ 1987.276652] R13: 0000000000000001 R14: ff2a66934849c8b9 R15: ff2a66934849c899 [ 1987.284710] FS: 0000000000000000(0000) GS:ff2a66b22fe40000(0000) knlGS:0000000000000000 [ 1987.293850] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1987.300355] CR2: 00007fe291a37000 CR3: 000000010fbd4005 CR4: 0000000000f71ef0 [ 1987.308413] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 1987.316470] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400 [ 1987.324527] PKRU: 55555554 [ 1987.327622] Call Trace: [ 1987.330424] <TASK> [ 1987.332826] ? show_regs+0x6e/0x80 [ 1987.336703] ? die+0x3c/0xa0 [ 1987.339988] ? do_trap+0xd4/0xf0 [ 1987.343662] ? do_error_trap+0x75/0xa0 [ 1987.347922] ? usercopy_abort+0x72/0x90 [ 1987.352277] ? exc_invalid_op+0x57/0x80 [ 1987.356634] ? usercopy_abort+0x72/0x90 [ 1987.360988] ? asm_exc_invalid_op+0x1f/0x30 [ 1987.365734] ? usercopy_abort+0x72/0x90 [ 1987.370088] __check_heap_object+0xb7/0xd0 [ 1987.374739] __check_object_size+0x175/0x2d0 [ 1987.379588] idxd_copy_cr+0xa9/0x130 [idxd] [ 1987.384341] idxd_evl_fault_work+0x127/0x390 [idxd] [ 1987.389878] process_one_work+0x13e/0x300 [ 1987.394435] ? __pfx_worker_thread+0x10/0x10 [ 1987.399284] worker_thread+0x2f7/0x420 [ 1987.403544] ? _raw_spin_unlock_irqrestore+0x2b/0x50 [ 1987.409171] ? __pfx_worker_thread+0x10/0x10 [ 1987.414019] kthread+0x107/0x140 [ 1987.417693] ? __pfx_kthread+0x10/0x10 [ 1987.421954] ret_from_fork+0x3d/0x60 [ 1987.426019] ? __pfx_kthread+0x10/0x10 [ 1987.430281] ret_from_fork_asm+0x1b/0x30 [ 1987.434744] </TASK> The issue arises because event log cache is created using kmem_cache_create() which is not suitable for user copy. Fix the issue by creating event log cache with kmem_cache_create_usercopy(), ensuring safe user copy. Fixes: `c2f156bf16` ("dmaengine: idxd: create kmem cache for event log fault items") Reported-by: Tony Zhu <tony.zhu@intel.com> Tested-by: Tony Zhu <tony.zhu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Reviewed-by: Lijun Pan <lijun.pan@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20240209191412.1050270-1-fenghua.yu@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2024-02-22 19:27:22 +05:30
Tom Zanussi	979f6ded93	dmaengine: idxd: Add support for device/wq defaults Add a load_device_defaults() function pointer to struct idxd_driver_data, which if defined, will be called when an idxd device is probed and will allow the idxd device to be configured with default values. The load_device_defaults() function is passed an idxd device to work with to set specific device attributes. Also add a load_device_defaults() implementation IAA devices; future patches would add default functions for other device types such as DSA. The way idxd device probing works, if the device configuration is valid at that point e.g. at least one workqueue and engine is properly configured then the device will be enabled and ready to go. The IAA implementation, idxd_load_iaa_device_defaults(), configures a single workqueue (wq0) for each device with the following default values: mode "dedicated" threshold 0 size Total WQ Size from WQCAP priority 10 type IDXD_WQT_KERNEL group 0 name "iaa_crypto" driver_name "crypto" Note that this now adds another configuration step for any users that want to configure their own devices/workqueus with something different in that they'll first need to disable (in the case of IAA) wq0 and the device itself before they can set their own attributes and re-enable, since they've been already been auto-enabled. Note also that in order for the new configuration to be applied to the deflate-iaa crypto algorithm the iaa_crypto module needs to unregister the old version, which is accomplished by removing the iaa_crypto module, and re-registering it with the new configuration by reinserting the iaa_crypto module. Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2023-12-15 17:52:53 +08:00
Jacob Pan	f5ccf55e10	dmaengine/idxd: Re-enable kernel workqueue under DMA API Kernel workqueues were disabled due to flawed use of kernel VA and SVA API. Now that we have the support for attaching PASID to the device's default domain and the ability to reserve global PASIDs from SVA APIs, we can re-enable the kernel work queues and use them under DMA API. We also use non-privileged access for in-kernel DMA to be consistent with the IOMMU settings. Consequently, interrupt for user privilege is enabled for work completion IRQs. Link: https://lore.kernel.org/linux-iommu/20210511194726.GP1002214@nvidia.com/ Tested-by: Tony Zhu <tony.zhu@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Fenghua Yu <fenghua.yu@intel.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Acked-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Link: https://lore.kernel.org/r/20230802212427.1497170-9-jacob.jun.pan@linux.intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2023-08-09 17:44:39 +02:00
Linus Torvalds	7994beabfb	dmaengine updates for v6.4 New support: - Apple admac t8112 device support - StarFive JH7110 DMA controller Updates: - Big pile of idxd updates to support IAA 2.0 device capabilities, DSA 2.0 Event Log and completion record faulting features and new DSA operations - at_xdmac supend & resume updates and driver code cleanup - k3-udma supend & resume support - k3-psil thread support for J784s4 -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE+vs47OPLdNbVcHzyfBQHDyUjg0cFAmRSOJYACgkQfBQHDyUj g0fmUQ//ZmrcNgbubI8evje8oovapLZUqdCgxoq9RbUbJ0TA6y7QCZ4ebkQYQymA LCJSV+NpG3++gLXLmIoYWiWXCabHZWDPh5M+uUiRm51MV2AtKeHy7UzQolmOWb6E d4JYf4HdGvzto7nMM7zo/PDaEDKdkostJNb7ZFYZrRqqCbSSSTJYmNGg3PfYRfaw yJpQsOjizgboM8EciKbqLh9gX3+FfQGhKsnkaiGQBwtdZD29LHeC7UoQrslkH9uh EiGTMYPKCZIAb+75G4WSO3sCNZcShQPEQt8w3SBP9X6eyaH75AZPymSRhF6O8gte ecGW/mRzc7t8lOrRnzD1zvZ/4NgwlnxTQAl+561XwWjmZIW/Zd9dH8dHFzwS5alS z7uVperSYXoEo//UmmwuhuZXlOWD4muOswWB4I5lTeHW97OUyccU5S3DhxYnffI6 P8qnGMmxLj8nsmk98T8TsOmzdb5txxSbPP+PumvDVF7g9LoFKfz8wDAZG3buTsJP L/HR9sAFaFO0GrjoC7EsHJsiLauSzzcDnYIUJ5UIv/6Ogf2mUCrBhgwSmVhIeR26 hUzYAdpwmhrrVXtbMXuec7jAz62bqkACdJIWGd21t/FTzUNUaDv5OimiNEZncpnY 9q2cT9x94iHHVCl430jlIAOa/oF2AQpeqs4any7+OI2xwiJEA7g= =Vqrw -----END PGP SIGNATURE----- Merge tag 'dmaengine-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine Pull dmaengine updates from Vinod Koul: "New support: - Apple admac t8112 device support - StarFive JH7110 DMA controller Updates: - Big pile of idxd updates to support IAA 2.0 device capabilities, DSA 2.0 Event Log and completion record faulting features and new DSA operations - at_xdmac supend & resume updates and driver code cleanup - k3-udma supend & resume support - k3-psil thread support for J784s4" * tag 'dmaengine-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: (57 commits) dmaengine: idxd: add per wq PRS disable dmaengine: idxd: add pid to exported sysfs attribute for opened file dmaengine: idxd: expose fault counters to sysfs dmaengine: idxd: add a device to represent the file opened dmaengine: idxd: add per file user counters for completion record faults dmaengine: idxd: process batch descriptor completion record faults dmaengine: idxd: add descs_completed field for completion record dmaengine: idxd: process user page faults for completion record dmaengine: idxd: add idxd_copy_cr() to copy user completion record during page fault handling dmaengine: idxd: create kmem cache for event log fault items dmaengine: idxd: add per DSA wq workqueue for processing cr faults dmanegine: idxd: add debugfs for event log dump dmaengine: idxd: add interrupt handling for event log dmaengine: idxd: setup event log configuration dmaengine: idxd: add event log size sysfs attribute dmaengine: idxd: make misc interrupt one shot dt-bindings: dma: snps,dw-axi-dmac: constrain the items of resets for JH7110 dma dt-bindings: dma: Drop unneeded quotes dmaengine: at_xdmac: align declaration of ret with the rest of variables dmaengine: at_xdmac: add a warning message regarding for unpaused channels ...	2023-05-03 11:11:56 -07:00
Joerg Roedel	e51b419839	Merge branches 'iommu/fixes', 'arm/allwinner', 'arm/exynos', 'arm/mediatek', 'arm/omap', 'arm/renesas', 'arm/rockchip', 'arm/smmu', 'ppc/pamu', 'unisoc', 'x86/vt-d', 'x86/amd', 'core' and 'platform-remove_new' into next	2023-04-14 13:45:50 +02:00
Lu Baolu	84c9ef72b6	dmaengine: idxd: Add enable/disable device IOPF feature The iommu subsystem requires IOMMU_DEV_FEAT_IOPF must be enabled before and disabled after IOMMU_DEV_FEAT_SVA, if device's I/O page faults rely on the IOMMU. Add explicit IOMMU_DEV_FEAT_IOPF enabling/disabling in this driver. At present, missing IOPF enabling/disabling doesn't cause any real issue, because the IOMMU driver places the IOPF enabling/disabling in the path of SVA feature handling. But this may change. Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Fenghua Yu <fenghua.yu@intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Acked-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20230324120234.313643-2-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2023-04-13 12:05:46 +02:00
Dave Jiang	2442b7473a	dmaengine: idxd: process batch descriptor completion record faults Add event log processing for faulting of user batch descriptor completion record. When encountering an event log entry for a page fault on a completion record, the driver is expected to do the following: 1. If the "first error in batch" bit in event log entry error info is set, discard any previously recorded errors associated with the "batch identifier". 2. Fix the page fault according to the fault address in the event log. If successful, write the completion record to the fault address in user space. 3. If an error is encountered while writing the completion record and it is associated to a descriptor in the batch, the driver associates the error with the batch identifier of the event log entry and tracks it until the event log entry for the corresponding batch desc is encountered. While processing an event log entry for a batch descriptor with error indicating that one or more descs in the batch had event log entries, the driver will do the following before writing the batch completion record: 1. If the status field of the completion record is 0x1, the driver will change it to error code 0x5 (one or more operations in batch completed with status not successful) and changes the result field to 1. 2. If the status is error code 0x6 (page fault on batch descriptor list address), change the result field to 1. 3. If status is any other value, the completion record is not changed. 4. Clear the recorded error in preparation for next batch with same batch identifier. The result field is for user software to determine whether to set the "Batch Error" flag bit in the descriptor for continuation of partial batch descriptor completion. See DSA spec 2.0 for additional information. If no error has been recorded for the batch, the batch completion record is written to user space as is. Tested-by: Tony Zhu <tony.zhu@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Co-developed-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Link: https://lore.kernel.org/r/20230407203143.2189681-12-fenghua.yu@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2023-04-12 23:18:45 +05:30
Dave Jiang	c40bd7d973	dmaengine: idxd: process user page faults for completion record DSA supports page fault handling through PRS. However, the DMA engine that's processing the descriptor is blocked until the PRS response is received. Other workqueues sharing the engine are also blocked. Page fault handing by the driver with PRS disabled can be used to mitigate the stalling. With PRS disabled while ATS remain enabled, DSA handles page faults on a completion record by reporting an event in the event log. In this instance, the descriptor is completed and the event log contains the completion record address and the contents of the completion record. Add support to the event log handling code to fault in the completion record and copy the content of the completion record to user memory. A bitmap is introduced to keep track of discarded event log entries. When the user process initiates ->release() of the char device, it no longer is interested in any remaining event log entries tied to the relevant wq and PASID. The driver will mark the event log entry index in the bitmap. Upon encountering the entries during processing, the event log handler will just clear the bitmap bit and skip the entry rather than attempt to process the event log entry. Tested-by: Tony Zhu <tony.zhu@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Co-developed-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Link: https://lore.kernel.org/r/20230407203143.2189681-10-fenghua.yu@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2023-04-12 23:18:45 +05:30
Fenghua Yu	b022f59725	dmaengine: idxd: add idxd_copy_cr() to copy user completion record during page fault handling Define idxd_copy_cr() to copy completion record to fault address in user address that is found by work queue (wq) and PASID. It will be used to write the user's completion record that the hardware device is not able to write due to user completion record page fault. An xarray is added to associate the PASID and mm with the struct idxd_user_context so mm can be found by PASID and wq. It is called when handling the completion record fault in a kernel thread context. Switch to the mm using kthread_use_vm() and copy the completion record to the mm via copy_to_user(). Once the copy is completed, switch back to the current mm using kthread_unuse_mm(). Suggested-by: Christoph Hellwig <hch@infradead.org> Suggested-by: Jason Gunthorpe <jgg@nvidia.com> Suggested-by: Tony Luck <tony.luck@intel.com> Tested-by: Tony Zhu <tony.zhu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20230407203143.2189681-9-fenghua.yu@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2023-04-12 23:18:45 +05:30
Dave Jiang	c2f156bf16	dmaengine: idxd: create kmem cache for event log fault items Add a kmem cache per device for allocating event log fault context. The context allows an event log entry to be copied and passed to a software workqueue to be processed. Due to each device can have different sized event log entry depending on device type, it's not possible to have a global kmem cache. Tested-by: Tony Zhu <tony.zhu@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Co-developed-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Link: https://lore.kernel.org/r/20230407203143.2189681-8-fenghua.yu@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2023-04-12 23:18:45 +05:30
Dave Jiang	5fbe6503b5	dmanegine: idxd: add debugfs for event log dump Add debugfs entry to dump the content of the event log for debugging. The function will dump all non-zero entries in the event log. It will note which entries are processed and which entries are still pending processing at the time of the dump. The entries may not always be in chronological order due to the log is a circular buffer. Tested-by: Tony Zhu <tony.zhu@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Co-developed-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Link: https://lore.kernel.org/r/20230407203143.2189681-6-fenghua.yu@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2023-04-12 23:18:45 +05:30
Dave Jiang	244da66cda	dmaengine: idxd: setup event log configuration Add setup of event log feature for supported device. Event log addresses error reporting that was lacking in gen 1 DSA devices where a second error event does not get reported when a first event is pending software handling. The event log allows a circular buffer that the device can push error events to. It is up to the user to create a large enough event log ring in order to capture the expected events. The evl size can be set in the device sysfs attribute. By default 64 entries are supported as minimal when event log is enabled. Tested-by: Tony Zhu <tony.zhu@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Co-developed-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Link: https://lore.kernel.org/r/20230407203143.2189681-4-fenghua.yu@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2023-04-12 23:18:45 +05:30
Dave Jiang	1649091f91	dmaengine: idxd: add event log size sysfs attribute Add support for changing of the event log size. Event log is a feature added to DSA 2.0 hardware to improve error reporting. It supersedes the SWERROR register on DSA 1.0 hardware and hope to prevent loss of reported errors. The error log size determines how many error entries supported for the device. It can be configured by the user via sysfs attribute. Tested-by: Tony Zhu <tony.zhu@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Co-developed-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Link: https://lore.kernel.org/r/20230407203143.2189681-3-fenghua.yu@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2023-04-12 23:18:44 +05:30
Dave Jiang	9f0d99b327	dmaengine: idxd: expose IAA CAP register via sysfs knob Add IAA (IAX) capability mask sysfs attribute to expose to applications. The mask provides application knowledge of what capabilities this IAA device supports. This mask is available for IAA 2.0 device or later. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Co-developed-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Link: https://lore.kernel.org/r/20230303213732.3357494-3-fenghua.yu@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2023-03-31 17:26:53 +05:30
Dave Jiang	34ca00662e	dmaengine: idxd: reformat swerror output to standard Linux bitmap output SWERROR register is 4 64bit wide registers. Currently the sysfs attribute just outputs 4 64bit hex integers. Convert to output with %*pb format specifier. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Co-developed-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Link: https://lore.kernel.org/r/20230303213732.3357494-2-fenghua.yu@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2023-03-31 17:26:53 +05:30
Jacob Pan	fffaed1e24	iommu/ioasid: Rename INVALID_IOASID INVALID_IOASID and IOMMU_PASID_INVALID are duplicated. Rename INVALID_IOASID and consolidate since we are moving away from IOASID infrastructure. Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Link: https://lore.kernel.org/r/20230322200803.869130-7-jacob.jun.pan@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2023-03-31 10:03:27 +02:00
Bjorn Helgaas	3c5cc03979	dmaengine: idxd: Remove unnecessary aer.h include <linux/aer.h> is unused, so remove it. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Dave Jiang <dave.jiang@intel.com> Acked-by: Fenghua Yu <fenghua.yu@intel.com> Acked-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20230307192655.874008-3-helgaas@kernel.org Signed-off-by: Vinod Koul <vkoul@kernel.org>	2023-03-17 23:16:47 +05:30
Fenghua Yu	601bdadadb	dmaengine: idxd: Fix default allowed read buffers value in group Currently default read buffers that is allowed in a group is 0. grpcfg will be configured to max read buffers that IDXD can support if the group's allowed read buffers value is 0. But 0 is an invalid read buffers value and user may get confused when seeing the invalid initial value 0 through sysfs interface. To show only valid allowed read buffers value and eliminate confusion, directly initialize the allowed read buffers to IDXD's max read buffers. User still can change the value through sysfs interface. Suggested-by: Ramesh Thomas <ramesh.thomas@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Reviewed-by: Nikhil Rao <nikhil.rao@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20230127192855.966929-1-fenghua.yu@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2023-02-16 18:45:48 +05:30
Fenghua Yu	9735bde364	dmaengine: idxd: Set traffic class values in GRPCFG on DSA 2.0 On DSA/IAX 1.0, TC-A and TC-B in GRPCFG are set as 1 to have best performance and cannot be changed through sysfs knobs unless override option is given. The same values should be set on DSA 2.0 as well. Fixes: `ea7c8f598c` ("dmaengine: idxd: restore traffic class defaults after wq reset") Fixes: `ade8a86b51` ("dmaengine: idxd: Set defaults for GRPCFG traffic class") Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20221209172141.562648-1-fenghua.yu@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2022-12-28 12:30:50 +05:30
Jason Gunthorpe	90337f526c	Merge tag 'v6.1-rc7' into iommufd.git for-next Resolve conflicts in drivers/vfio/vfio_main.c by using the iommfd version. The rc fix was done a different way when iommufd patches reworked this code. Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-12-02 12:04:39 -04:00
Xiaochen Shen	e8dbd6445d	dmaengine: idxd: Fix max batch size for Intel IAA >From Intel IAA spec [1], Intel IAA does not support batch processing. Two batch related default values for IAA are incorrect in current code: (1) The max batch size of device is set during device initialization, that indicates batch is supported. It should be always 0 on IAA. (2) The max batch size of work queue is set to WQ_DEFAULT_MAX_BATCH (32) as the default value regardless of Intel DSA or IAA device during work queue setup and cleanup. It should be always 0 on IAA. Fix the issues by setting the max batch size of device and max batch size of work queue to 0 on IAA device, that means batch is not supported. [1]: https://cdrdv2.intel.com/v1/dl/getContent/721858 Fixes: `23084545db` ("dmaengine: idxd: set max_xfer and max_batch for RO device") Fixes: `92452a72eb` ("dmaengine: idxd: set defaults for wq configs") Fixes: `bfe1d56091` ("dmaengine: idxd: Init and probe for Intel data accelerators") Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Fenghua Yu <fenghua.yu@intel.com> Link: https://lore.kernel.org/r/20220930201528.18621-2-xiaochen.shen@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2022-11-08 10:43:55 +05:30
Lu Baolu	942fd5435d	iommu: Remove SVM_FLAG_SUPERVISOR_MODE support The current kernel DMA with PASID support is based on the SVA with a flag SVM_FLAG_SUPERVISOR_MODE. The IOMMU driver binds the kernel memory address space to a PASID of the device. The device driver programs the device with kernel virtual address (KVA) for DMA access. There have been security and functional issues with this approach: - The lack of IOTLB synchronization upon kernel page table updates. (vmalloc, module/BPF loading, CONFIG_DEBUG_PAGEALLOC etc.) - Other than slight more protection, using kernel virtual address (KVA) has little advantage over physical address. There are also no use cases yet where DMA engines need kernel virtual addresses for in-kernel DMA. This removes SVM_FLAG_SUPERVISOR_MODE support from the IOMMU interface. The device drivers are suggested to handle kernel DMA with PASID through the kernel DMA APIs. The drvdata parameter in iommu_sva_bind_device() and all callbacks is not needed anymore. Cleanup them as well. Link: https://lore.kernel.org/linux-iommu/20210511194726.GP1002214@nvidia.com/ Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Fenghua Yu <fenghua.yu@intel.com> Tested-by: Zhangfei Gao <zhangfei.gao@linaro.org> Tested-by: Tony Zhu <tony.zhu@intel.com> Link: https://lore.kernel.org/r/20221031005917.45690-4-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2022-11-03 15:47:45 +01:00
Dave Jiang	b0325aefd3	dmaengine: idxd: add WQ operation cap restriction support DSA 2.0 add the capability of configuring DMA ops on a per workqueue basis. This means that certain ops can be disabled by the system administrator for certain wq. By default, all ops are available. A bitmap is used to store the ops due to total op size of 256 bits and it is more convenient to use a range list to specify which bits are enabled. One of the usage to support this is for VM migration between different iteration of devices. The newer ops are disabled in order to allow guest to migrate to a host that only support older ops. Another usage is to restrict the WQ to certain operations for QoS of performance. A sysfs of ops_config attribute is added per wq. It is only usable when the ops_config bit is set under WQ_CAP register. This means that this attribute will return -EOPNOTSUPP on DSA 1.x devices. The expected input is a range list for the bits per operation the WQ supports. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Co-developed-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Link: https://lore.kernel.org/r/20220917161222.2835172-4-fenghua.yu@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2022-09-29 22:46:08 +05:30
Dave Jiang	a8563a33a5	dmanegine: idxd: reformat opcap output to match bitmap_parse() input To make input and output consistent and prepping for the per WQ operation configuration support, change the output of opcap display to match the input that is expected by bitmap_parse() helper function. The output will be a bitmap with field width as the number of bits using the %*pb format specifier for printk() family. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Co-developed-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Link: https://lore.kernel.org/r/20220917161222.2835172-3-fenghua.yu@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2022-09-29 22:46:08 +05:30
Jerry Snitselaar	de5819b994	dmaengine: idxd: track enabled workqueues in bitmap Now that idxd_wq_disable_cleanup() sets the workqueue state to IDXD_WQ_DISABLED, use a bitmap to track which workqueues have been enabled. This will then be used to determine which workqueues should be re-enabled when attempting a software reset to recover from a device halt state. Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Vinod Koul <vkoul@kernel.org> Signed-off-by: Jerry Snitselaar <jsnitsel@redhat.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20220928154856.623545-3-jsnitsel@redhat.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2022-09-29 13:02:38 +05:30
Jerry Snitselaar	8ffccd119a	dmaengine: idxd: Only call idxd_enable_system_pasid() if succeeded in enabling SVA feature On a Sapphire Rapids system if boot without intel_iommu=on, the IDXD driver will crash during probe in iommu_sva_bind_device(). [ 21.423729] BUG: kernel NULL pointer dereference, address: 0000000000000038 [ 21.445108] #PF: supervisor read access in kernel mode [ 21.450912] #PF: error_code(0x0000) - not-present page [ 21.456706] PGD 0 [ 21.459047] Oops: 0000 [#1] PREEMPT SMP NOPTI [ 21.464004] CPU: 0 PID: 1420 Comm: kworker/0:3 Not tainted 5.19.0-0.rc3.27.eln120.x86_64 #1 [ 21.464011] Hardware name: Intel Corporation EAGLESTREAM/EAGLESTREAM, BIOS EGSDCRB1.SYS.0067.D12.2110190954 10/19/2021 [ 21.464015] Workqueue: events work_for_cpu_fn [ 21.464030] RIP: 0010:iommu_sva_bind_device+0x1d/0xe0 [ 21.464046] Code: c3 cc 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 41 57 41 56 49 89 d6 41 55 41 54 55 53 48 83 ec 08 48 8b 87 d8 02 00 00 <48> 8b 40 38 48 8b 50 10 48 83 7a 70 00 48 89 14 24 0f 84 91 00 00 [ 21.464050] RSP: 0018:ff7245d9096b7db8 EFLAGS: 00010296 [ 21.464054] RAX: 0000000000000000 RBX: ff1eadeec8a51000 RCX: 0000000000000000 [ 21.464058] RDX: ff7245d9096b7e24 RSI: 0000000000000000 RDI: ff1eadeec8a510d0 [ 21.464060] RBP: ff1eadeec8a51000 R08: ffffffffb1a12300 R09: ff1eadffbfce25b4 [ 21.464062] R10: ffffffffffffffff R11: 0000000000000038 R12: ffffffffc09f8000 [ 21.464065] R13: ff1eadeec8a510d0 R14: ff7245d9096b7e24 R15: ff1eaddf54429000 [ 21.464067] FS: 0000000000000000(0000) GS:ff1eadee7f600000(0000) knlGS:0000000000000000 [ 21.464070] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 21.464072] CR2: 0000000000000038 CR3: 00000008c0e10006 CR4: 0000000000771ef0 [ 21.464074] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 21.464076] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400 [ 21.464078] PKRU: 55555554 [ 21.464079] Call Trace: [ 21.464083] <TASK> [ 21.464092] idxd_pci_probe+0x259/0x1070 [idxd] [ 21.464121] local_pci_probe+0x3e/0x80 [ 21.464132] work_for_cpu_fn+0x13/0x20 [ 21.464136] process_one_work+0x1c4/0x380 [ 21.464143] worker_thread+0x1ab/0x380 [ 21.464147] ? _raw_spin_lock_irqsave+0x23/0x50 [ 21.464158] ? process_one_work+0x380/0x380 [ 21.464161] kthread+0xe6/0x110 [ 21.464168] ? kthread_complete_and_exit+0x20/0x20 [ 21.464172] ret_from_fork+0x1f/0x30 iommu_sva_bind_device() requires SVA has been enabled successfully on the IDXD device before it's called. Otherwise, iommu_sva_bind_device() will access a NULL pointer. If Intel IOMMU is disabled, SVA cannot be enabled and thus idxd_enable_system_pasid() and iommu_sva_bind_device() should not be called. Fixes: `42a1b73852` ("dmaengine: idxd: Separate user and kernel pasid enabling") Cc: Vinod Koul <vkoul@kernel.org> Cc: linux-kernel@vger.kernel.org Cc: Dave Jiang <dave.jiang@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Link: https://lore.kernel.org/dmaengine/20220623170232.6whonfjuh3m5vcoy@cantor/ Signed-off-by: Jerry Snitselaar <jsnitsel@redhat.com> Acked-by: Fenghua Yu <fenghua.yu@intel.com> Link: https://lore.kernel.org/r/20220626051648.14249-1-jsnitsel@redhat.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2022-07-05 18:28:05 +05:30
Dave Jiang	42a1b73852	dmaengine: idxd: Separate user and kernel pasid enabling The idxd driver always gated the pasid enabling under a single knob and this assumption is incorrect. The pasid used for kernel operation can be independently toggled and has no dependency on the user pasid (and vice versa). Split the two so they are independent "enabled" flags. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/165231431746.986466.5666862038354800551.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2022-05-16 18:19:29 +05:30
Christophe JAILLET	b6f2f0352c	dmaengine: idxd: Remove useless DMA-32 fallback configuration As stated in [1], dma_set_mask() with a 64-bit mask never fails if dev->dma_mask is non-NULL. So, if it fails, the 32 bits case will also fail for the same reason. Simplify code and remove some dead code accordingly. [1]: https://lore.kernel.org/linux-kernel/YL3vSPK5DXTNvgdx@infradead.org/#t Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Acked-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/009c80294dba72858cd8a6ed2ed81041df1b1e82.1642231430.git.christophe.jaillet@wanadoo.fr Signed-off-by: Vinod Koul <vkoul@kernel.org>	2022-03-11 15:23:36 +05:30
Dave Jiang	7ed6f1b85f	dmaengine: idxd: change bandwidth token to read buffers DSA spec v1.2 has changed the term of "bandwidth tokens" to "read buffers" in order to make the concept clearer. Deprecate bandwidth token naming in the driver and convert to read buffers in order to match with the spec and reduce confusion when reading the spec. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/163951338932.2988321.6162640806935567317.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2022-01-05 13:14:25 +05:30
Dave Jiang	403a2e2365	dmaengine: idxd: change MSIX allocation based on per wq activation Change the driver where WQ interrupt is requested only when wq is being enabled. This new scheme set things up so that request_threaded_irq() is only called when a kernel wq type is being enabled. This also sets up for future interrupt request where different interrupt handler such as wq occupancy interrupt can be setup instead of the wq completion interrupt. Not calling request_irq() until the WQ actually needs an irq also prevents wasting of CPU irq vectors on x86 systems, which is a limited resource. idxd_flush_pending_descs() is moved to device.c since descriptor flushing is now part of wq disable rather than shutdown(). Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/163942149487.2412839.6691222855803875848.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2022-01-05 13:11:22 +05:30
Dave Jiang	23a50c8035	dmaengine: idxd: fix descriptor flushing locking The descriptor flushing for shutdown is not holding the irq_entry list lock. If there's ongoing interrupt completion handling, this can corrupt the list. Add locking to protect list walking. Also refactor the code so it's more compact. Fixes: `8f47d1a5e5` ("dmaengine: idxd: connect idxd to dmaengine subsystem") Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/163942148935.2412839.18282664745572777280.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2022-01-05 13:11:22 +05:30
Dave Jiang	ec0d642316	dmaengine: idxd: embed irq_entry in idxd_wq struct With irq_entry already being associated with the wq in a 1:1 relationship, embed the irq_entry in the idxd_wq struct and remove back pointers for idxe_wq and idxd_device. In the process of this work, clean up the interrupt handle assignment so that there's no decision to be made during submit call on where interrupt handle value comes from. Set the interrupt handle during irq request initialization time. irq_entry 0 is designated as special and is tied to the device itself. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/163942148362.2412839.12055447853311267866.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2022-01-05 13:11:21 +05:30
Dave Jiang	7930d85535	dmaengine: idxd: add knob for enqcmds retries Add a sysfs knob to allow tuning of retries for the kernel ENQCMDS descriptor submission. While on host, it is not as likely that ENQCMDS return busy during normal operations due to the driver controlling the number of descriptors allocated for submission. However, when the driver is operating as a guest driver, the chance of retry goes up significantly due to sharing a wq with multiple VMs. A default value is provided with the system admin being able to tune the value on a per WQ basis. Suggested-by: Sanjay Kumar <sanjay.k.kumar@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/163820629464.2702134.7577370098568297574.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2021-12-17 21:39:28 +05:30
Dave Jiang	92452a72eb	dmaengine: idxd: set defaults for wq configs Add default values for wq size, max_xfer_size and max_batch_size. These values should provide a general guidance for the wq configuration when the user does not specify any specific values. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/163528473483.3926048.7950067926287180976.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2021-12-17 21:39:28 +05:30
Dave Jiang	56fc39f5a3	dmaengine: idxd: handle interrupt handle revoked event "Interrupt handle revoked" is an event that happens when the driver is running on a guest kernel and the VM is migrated to a new machine. The device will trigger an interrupt that signals to the guest driver that the interrupt handles need to be replaced. The misc irq thread function calls a helper function to handle the event. The function uses the WQ percpu_ref to quiesce the kernel submissions. It then replaces the interrupt handles by requesting interrupt handle command for each I/O MSIX vector. Once the handle is updated, the driver will unblock the submission path to allow new submissions. The submitter will attempt to acquire a percpu_ref before submission. When the request fails, it will wait on the wq_resurrect 'completion'. The driver does anticipate the possibility of descriptors being submitted before the WQ percpu_ref is killed. If a descriptor has already been submitted, it will return with incorrect interrupt handle status. The descriptor will be re-submitted with the new interrupt handle on the completion path. For descriptors with incorrect interrupt handles, completion interrupt won't be triggered. At the completion of the interrupt handle refresh, the handling function will call idxd_int_handle_refresh_drain() to issue drain descriptors to each of the wq with associated interrupt handle. The drain descriptor will have interrupt request set but without completion record. This will ensure all descriptors with incorrect interrupt completion handle get drained and a completion interrupt is triggered for the guest driver to process them. Reviewed-by: Kevin Tian <kevin.tian@intel.com> Co-Developed-by: Sanjay Kumar <sanjay.k.kumar@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/163528420189.3925689.18212568593220415551.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2021-11-22 11:21:26 +05:30
Dave Jiang	8b67426e05	dmaengine: idxd: int handle management refactoring Attach int_handle to irq_entry. This removes the separate management of int handles and reduces the confusion of interating through int handles that is off by 1 count. Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/163528417065.3925689.11505755433684476288.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2021-11-22 11:21:26 +05:30
Dave Jiang	5d78abb6fb	dmaengine: idxd: rework descriptor free path on failure Refactor the completion function to allow skipping of descriptor freeing on the submission failure path. This completely removes descriptor freeing from the submit failure path and leave the responsibility to the caller. Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/163528416222.3925689.12859769271667814762.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2021-11-22 11:21:26 +05:30
Dave Jiang	98da0106aa	dmanegine: idxd: fix resource free ordering on driver removal Fault triggers on ioread32() when pci driver unbind is envoked. The placement of idxd sub-driver removal causes the probing of the device mmio region after the mmio mapping being torn down. The driver needs the sub-drivers to be unbound but not release the idxd context until all shutdown activities has been done. Move the sub-driver unregistering up before the remove() calls shutdown(). But take a device ref on the idxd->conf_dev so that the memory does not get freed in ->release(). When all cleanup activities has been done, release the ref to allow the idxd memory to be freed. [57159.542766] RIP: 0010:ioread32+0x27/0x60 [57159.547097] Code: 00 66 90 48 81 ff ff ff 03 00 77 1e 48 81 ff 00 00 01 00 76 05 0f b7 d7 ed c3 8b 15 03 50 41 01 b8 ff ff ff ff 85 d2 75 04 c3 <8b> 07 c3 55 83 ea 01 48 89 fe 48 c7 c7 00 70 5f 82 48 89 e5 48 83 [57159.566647] RSP: 0018:ffffc900011abb60 EFLAGS: 00010292 [57159.572295] RAX: ffffc900011e0000 RBX: ffff888107d39800 RCX: 0000000000000000 [57159.579842] RDX: 0000000000000000 RSI: ffffffff82b1e448 RDI: ffffc900011e0090 [57159.587421] RBP: ffffc900011abb88 R08: 0000000000000000 R09: 0000000000000001 [57159.594972] R10: 0000000000000001 R11: 0000000000000000 R12: ffff8881019840d0 [57159.602533] R13: ffff8881097e9000 R14: ffffffffa08542a0 R15: 00000000000003a8 [57159.610093] FS: 00007f991e0a8740(0000) GS:ffff888459900000(0000) knlGS:00000000000 00000 [57159.618614] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [57159.624814] CR2: ffffc900011e0090 CR3: 000000010862a002 CR4: 00000000003706e0 [57159.632397] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [57159.639973] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [57159.647601] Call Trace: [57159.650502] ? idxd_device_disable+0x41/0x110 [idxd] [57159.655948] idxd_device_drv_remove+0x2b/0x80 [idxd] [57159.661374] idxd_config_bus_remove+0x16/0x20 [57159.666191] __device_release_driver+0x163/0x240 [57159.671320] device_release_driver+0x2b/0x40 [57159.676052] bus_remove_device+0xf5/0x160 [57159.680524] device_del+0x19c/0x400 [57159.684440] device_unregister+0x18/0x60 [57159.688792] idxd_remove+0x140/0x1c0 [idxd] [57159.693406] pci_device_remove+0x3e/0xb0 [57159.697758] __device_release_driver+0x163/0x240 [57159.702788] device_driver_detach+0x43/0xb0 [57159.707424] unbind_store+0x11e/0x130 [57159.711537] drv_attr_store+0x24/0x30 [57159.715646] sysfs_kf_write+0x4b/0x60 [57159.719710] kernfs_fop_write_iter+0x153/0x1e0 [57159.724563] new_sync_write+0x120/0x1b0 [57159.728812] vfs_write+0x23e/0x350 [57159.732624] ksys_write+0x70/0xf0 [57159.736335] __x64_sys_write+0x1a/0x20 [57159.740492] do_syscall_64+0x3b/0x90 [57159.744465] entry_SYSCALL_64_after_hwframe+0x44/0xae [57159.749908] RIP: 0033:0x7f991e19c387 [57159.753898] Code: 0d 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24 [57159.773564] RSP: 002b:00007ffc2ce2d6a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 [57159.781550] RAX: ffffffffffffffda RBX: 000000000000000c RCX: 00007f991e19c387 [57159.789133] RDX: 000000000000000c RSI: 000055ee2630e140 RDI: 0000000000000001 [57159.796695] RBP: 000055ee2630e140 R08: 0000000000000000 R09: 00007f991e2324e0 [57159.804246] R10: 00007f991e2323e0 R11: 0000000000000246 R12: 000000000000000c [57159.811800] R13: 00007f991e26f520 R14: 000000000000000c R15: 00007f991e26f700 [57159.819373] Modules linked in: idxd bridge stp llc bnep sunrpc nls_iso8859_1 intel_ rapl_msr intel_rapl_common x86_pkg_temp_thermal intel_powerclamp coretemp snd_hda_code c_realtek iTCO_wdt 8250_dw snd_hda_codec_generic kvm_intel ledtrig_audio iTCO_vendor_s upport snd_hda_intel snd_intel_dspcfg ppdev kvm snd_hda_codec intel_wmi_thunderbolt sn d_hwdep irqbypass iwlwifi btusb snd_hda_core rapl btrtl intel_cstate snd_seq btbcm snd _seq_device btintel snd_pcm cfg80211 bluetooth pcspkr psmouse input_leds snd_timer int el_lpss_pci mei_me intel_lpss snd ecdh_generic ecc mei ucsi_acpi i2c_i801 idma64 i2c_s mbus virt_dma soundcore typec_ucsi typec wmi parport_pc parport video mac_hid acpi_pad sch_fq_codel drm ip_tables x_tables crct10dif_pclmul crc32_pclmul ghash_clmulni_intel usbkbd hid_generic usbmouse aesni_intel usbhid crypto_simd cryptd e1000e hid serio_ra w ahci libahci pinctrl_sunrisepoint fuse msr autofs4 [last unloaded: idxd] [57159.904082] CR2: ffffc900011e0090 [57159.907877] ---[ end trace b4e32f49ce9176a4 ]--- Fixes: `49c4959f04` ("dmaengine: idxd: fix sequence for pci driver remove() and shutdown()") Reported-by: Ziye Yang <ziye.yang@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/163225535868.4152687.9318737776682088722.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2021-10-25 11:51:22 +05:30
Dave Jiang	ade8a86b51	dmaengine: idxd: Set defaults for GRPCFG traffic class Set GRPCFG traffic class to value of 1 for best performance on current generation of accelerators. Also add override option to allow experimentation. Sysfs knobs are disabled for DSA/IAX gen1 devices. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/162681373005.1968485.3761065664382799202.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2021-07-28 17:55:40 +05:30
Dave Jiang	6e7f3ee97b	dmaengine: idxd: move dsa_drv support to compatible mode The original architecture of /sys/bus/dsa invented a scheme whereby a single entry in the list of bus drivers, /sys/bus/drivers/dsa, handled all device types and internally routed them to different different drivers. Those internal drivers were invisible to userspace. With the idxd driver transitioned to a proper bus device-driver model, the legacy behavior needs to be preserved due to it being exposed to user space via sysfs. Create a compat driver to provide the legacy behavior for /sys/bus/dsa/drivers/dsa. This should satisfy user tool accel-config v3.2 or ealier where this behavior is expected. If the distro has a newer accel-config then the legacy mode does not need to be enabled. When the compat driver binds the device (i.e. dsa0) to the dsa driver, it will be bound to the new idxd_drv. The wq device (i.e. wq0.0) will be bound to either the dmaengine_drv or the user_drv. The dsa_drv becomes a routing mechansim for the new drivers. It will not support additional external drivers that are implemented later. Reviewed-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/162637468705.744545.4399080971745974435.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2021-07-21 10:09:16 +05:30
Dave Jiang	d9e5481fca	dmaengine: dsa: move dsa_bus_type out of idxd driver to standalone In preparation for dsa_drv compat support to be built-in, move the bus code to its own compilation unit. A follow-on patch adds the compat implementation. Recall that the compat implementation allows for the deprecated / omnibus dsa_drv binding scheme rather than the idiomatic organization of a full fledged bus driver per driver type. Reviewed-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/162637468142.744545.2811632736881720857.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2021-07-21 10:09:16 +05:30
Dave Jiang	448c3de8ac	dmaengine: idxd: create user driver for wq 'device' The original architecture of /sys/bus/dsa invented a scheme whereby a single entry in the list of bus drivers, /sys/bus/drivers/dsa, handled all device types and internally routed them to different drivers. Those internal drivers were invisible to userspace. Now, as /sys/bus/dsa wants to grow support for alternate drivers for a given device, for example vfio-mdev instead of kernel-internal-dmaengine, a proper bus device-driver model is needed. The first step in that process is separating the existing omnibus/implicit "dsa" driver into proper individual drivers registered on /sys/bus/dsa. Establish the idxd_user_drv driver that controls the enabling and disabling of the wq and also register and unregister a char device to allow user space to mmap the descriptor submission portal. The cdev related bits are moved to the cdev driver probe/remove and out of the drv_enabe/disable_wq() calls. These bits are exclusive to the cdev operation and not part of the generic enable/disable of the wq device. Reviewed-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/162637467578.744545.10203997610072341376.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2021-07-21 10:09:16 +05:30
Dave Jiang	0cda4f6986	dmaengine: idxd: create dmaengine driver for wq 'device' The original architecture of /sys/bus/dsa invented a scheme whereby a single entry in the list of bus drivers, /sys/bus/drivers/dsa, handled all device types and internally routed them to different drivers. Those internal drivers were invisible to userspace. Now, as /sys/bus/dsa wants to grow support for alternate drivers for a given device, for example vfio-mdev instead of kernel-internal-dmaengine, a proper bus device-driver model is needed. The first step in that process is separating the existing omnibus/implicit "dsa" driver into proper individual drivers registered on /sys/bus/dsa. Establish the idxd_dmaengine_drv driver that controls the enabling and disabling of the wq and also register and unregister the dma channel. idxd_wq_alloc_resources() and idxd_wq_free_resources() also get moved to the dmaengine driver. The resources (dma descriptors allocation and setup) are only used by the dmaengine driver and should only happen when it loads. The char dev driver (cdev) related bits are left in the __drv_enable_wq() and __drv_disable_wq() calls to be moved when we split out the char dev driver just like how the dmaengine driver is split out. WQ autoload support is not expected currently. With the amount of configuration needed for the device, the wq is always expected to be enabled by a tool (or via sysfs) rather than auto enabled at driver load. Reviewed-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/162637467033.744545.12330636655625405394.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2021-07-21 10:09:16 +05:30
Dave Jiang	034b3290ba	dmaengine: idxd: create idxd_device sub-driver The original architecture of /sys/bus/dsa invented a scheme whereby a single entry in the list of bus drivers, /sys/bus/drivers/dsa, handled all device types and internally routed them to different drivers. Those internal drivers were invisible to userspace. Now, as /sys/bus/dsa wants to grow support for alternate drivers for a given device, for example vfio-mdev instead of kernel-internal-dmaengine, a proper bus device-driver model is needed. The first step in that process is separating the existing omnibus/implicit "dsa" driver into proper individual drivers registered on /sys/bus/dsa. Establish the idxd_drv driver that control the enabling and disabling of the accelerator device. Reviewed-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/162637466439.744545.15210886092627144577.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2021-07-21 10:09:16 +05:30
Dave Jiang	5fee6567ec	dmaengine: idxd: add type to driver in order to allow device matching Add an array of support device types to the idxd_device_driver definition in order to enable simple matching of device type to a given driver. The deprecated / omnibus dsa_drv driver specifies IDXD_DEV_NONE as its only role is to service legacy userspace (old accel-config) directed bind requests and route them to them the proper driver. It need not attach to a device when the bus is autoprobed. The accel-config tooling is being updated to drop its dependency on this deprecated bind scheme. Reviewed-by: Dan Willliams <dan.j.williams@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/162637465882.744545.17456174666211577867.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2021-07-21 10:09:16 +05:30
Dave Jiang	c05257b560	dmanegine: idxd: open code the dsa_drv registration Don't need a wrapper to register the driver. Just do it directly. Reviewed-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/162637465319.744545.16325178432532362906.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2021-07-21 10:09:16 +05:30
Dave Jiang	f52058ae11	dmaengine: idxd: remove IDXD_DEV_CONF_READY The IDXD_DEV_CONF_READY state flag is no longer needed. The current implementation uses this flag to stop the device from doing configuration until the pci driver probe has completed. With the driver architecture going towards multiple sub-driver attached to the dsa_bus, this is no longer feasible. The sub-drivers will be allowed to probe and return with failure when they are not ready to complete the probe rather than using a state flag to gate the probing. There is no expectation that the devices auto-attach to a driver. Userspace configuration is expected to setup the device before enabling. Reviewed-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/162637460633.744545.8902095097471365420.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2021-07-21 10:09:15 +05:30
Dave Jiang	700af3a0a2	dmaengine: idxd: add 'struct idxd_dev' as wrapper for conf_dev Add a 'struct idxd_dev' that wraps the 'struct device' for idxd conf_dev that registers with the dsa bus. This is introduced in order to deal with multiple different types of 'devices' that are registered on the dsa_bus when the compat driver needs to route them to the correct driver to attach. The bind() call now can determine the type of device and then do the appropriate driver matching. Reviewed-by Dan Williams <dan.j.williams@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/162637460065.744545.584492831446090984.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2021-07-21 10:09:15 +05:30

1 2

99 commits