This patch changes the implementation of AMDGPU_PTE_MTYPE_GFX12,
clear the bits before setting the new one.
This fixed the potential issue that GFX12 setting memory to NC.
v2: Clear mtype field before setting the new one (Alex)
v3: Fix typo (Felix)
Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: longlyao <Longlong.Yao@amd.com>
Signed-off-by: Shane Xiao <shane.xiao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
On GFXIP9.4.3, make CPX mode as the default compute mode if the node is
setup in NPS4 memory partition mode. This change is only applicable for
dGPU, for APU, continue to use TPX mode.
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
With commit 89773b8559
("drm/amdkfd: Let VRAM allocations go to GTT domain on small APUs")
big and small APU "VRAM" handling in KFD was unified. Since AMD_IS_APU
is set for both big and small APUs, we can simplify the checks in
the code.
v2: clean up a few more places (Lang)
Acked-by: Felix Kuehling <felix.kuehling@amd.com>
Reviewed-by: Lang Yu <Lang.Yu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The pointer parent may be NULLed by the function amdgpu_vm_pt_parent.
To make the code more robust, check the pointer parent.
Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Use current speed/width on devices which don't support
dynamic PCIe switching.
Fixes: 466a7d1153 ("drm/amd: Use the first non-dGPU PCI device for BW limits")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3289
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
On GFXIP9.4.3, make CPX mode as the default compute mode if the node is
setup in NPS4 memory partition mode. This change is only applicable for
dGPU, for APU, continue to use TPX mode.
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Use correct ref/mask for differnent gfx ring pipe. Ported from
ZhenGuo's patch for gfx10.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
flush_gpu_tlb may be called from another thread while
device_gpu_recover is running.
Both of these threads access registers through the VF
RLCG interface during VF Full Access. Add a lock around this interface
to prevent race conditions between these threads.
Signed-off-by: Victor Skvortsov <victor.skvortsov@amd.com>
Reviewed-by: Zhigang Luo <zhigang.luo@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
With commit 89773b8559
("drm/amdkfd: Let VRAM allocations go to GTT domain on small APUs")
big and small APU "VRAM" handling in KFD was unified. Since AMD_IS_APU
is set for both big and small APUs, we can simplify the checks in
the code.
v2: clean up a few more places (Lang)
Acked-by: Felix Kuehling <felix.kuehling@amd.com>
Reviewed-by: Lang Yu <Lang.Yu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
fix typo "info.ue_count" in amdgpu_ras_aca_sysfs_read() function.
Fixes: 865d339763 ("drm/amdgpu: add aca deferred error type support")
Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The pointer parent may be NULLed by the function amdgpu_vm_pt_parent.
To make the code more robust, check the pointer parent.
Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
AMD_PG_SUPPORT_VCN_DPG is needed for secure parts
and should/can be enabled by now.
Signed-off-by: David (Ming Qiang) Wu <David.Wu3@amd.com>
Reviewed-by: Sonny Jiang <sonjiang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
It was an enablement vehicle for MES 11 and was never
productized. Remove it.
v2: drop additional checks in the GFX10 code.
v3: drop mes_api_def.h
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Use current speed/width on devices which don't support
dynamic PCIe switching.
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3289
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
nouveau:
- fix bo metadata uAPI for vm bind
panthor:
- Fixes for panthor's heap logical block.
- Reset on unrecoverable fault
- Fix VM references.
- Reset fix.
xlnx:
- xlnx compile and doc fixes.
amdgpu:
- Handle vbios table integrated info v2.3
amdkfd:
- Handle duplicate BOs in reserve_bo_and_cond_vms
- Handle memory limitations on small APUs
dp/mst:
- MST null deref fix.
bridge:
- Don't let next bridge create connector in adv7511 to make probe work.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEEKbZHaGwW9KfbeusDHTzWXnEhr4FAmZQ9rAACgkQDHTzWXnE
hr7kew//e+OwUJcock4rqpMsXrVwgse4bJ1H7gnHLmN/AQKsXWcaHKixJu+WkkQI
62gyhOEiSH+J4Je6zi5JbIt9AKgIKyZeJqg/QeSMqUFJzYkPKhrhj0aX4AefuNJB
dxCI1+UnkNdQ/LM+Dwb4sq757FX0/yPqhzBiytWEJJVHP4Th5SMRFmSs8YsT1Z4F
bn220QFvPv+KrVV2eKEf6plJqYhs92nT8cVkNw8x9LhnivLRhyh7we3Ew6R2zouV
Izm1nYx1CtTqaiHo3+UOeqRLzhqpmukzmJyML9zk/qt+s+YMj9yI/ie02EnTPqg6
NQCqqmoDy1BpUa37N486Q7MFTiS076HMaQpYiDmVz3PIWWPvKZDqcC5vonoh99ra
b88X4J4w3e747xZP7VDsnAOFsaLgcRah4JNtv8H3rcurbz0FrcV/0g/W/kt+Nenc
k2m2s2hv56yWXd6B7bd3tfUe5dOwxc+CgiNddM5X3WPenEvnH+DKBOoFojSRl2o7
LpRL7WhY6fceRUqVrHzfIPLYgdfz/Ho9LYST2jGIdKDnwV0Hw0sNfU9C0KOzm0J6
rjgGEeMHgVdZPzSeXSmaO82mBRvR7Ke6da7FIazbocledOapsB5qxjhLvkRyULAp
mlGZPRPJdNVM8slNLxEqrT/ugorIo5owtQdEnJmDJXFdozcLTMY=
=pexB
-----END PGP SIGNATURE-----
Merge tag 'drm-next-2024-05-25' of https://gitlab.freedesktop.org/drm/kernel
Pull drm fixes from Dave Airlie:
"Some fixes for the end of the merge window, mostly amdgpu and panthor,
with one nouveau uAPI change that fixes a bad decision we made a few
months back.
nouveau:
- fix bo metadata uAPI for vm bind
panthor:
- Fixes for panthor's heap logical block.
- Reset on unrecoverable fault
- Fix VM references.
- Reset fix.
xlnx:
- xlnx compile and doc fixes.
amdgpu:
- Handle vbios table integrated info v2.3
amdkfd:
- Handle duplicate BOs in reserve_bo_and_cond_vms
- Handle memory limitations on small APUs
dp/mst:
- MST null deref fix.
bridge:
- Don't let next bridge create connector in adv7511 to make probe
work"
* tag 'drm-next-2024-05-25' of https://gitlab.freedesktop.org/drm/kernel:
drm/amdgpu/atomfirmware: add intergrated info v2.3 table
drm/mst: Fix NULL pointer dereference at drm_dp_add_payload_part2
drm/amdkfd: Let VRAM allocations go to GTT domain on small APUs
drm/amdkfd: handle duplicate BOs in reserve_bo_and_cond_vms
drm/bridge: adv7511: Attach next bridge without creating connector
drm/buddy: Fix the warn on's during force merge
drm/nouveau: use tile_mode and pte_kind for VM_BIND bo allocations
drm/panthor: Call panthor_sched_post_reset() even if the reset failed
drm/panthor: Reset the FW VM to NULL on unplug
drm/panthor: Keep a ref to the VM at the panthor_kernel_bo level
drm/panthor: Force an immediate reset on unrecoverable faults
drm/panthor: Document drm_panthor_tiler_heap_destroy::handle validity constraints
drm/panthor: Fix an off-by-one in the heap context retrieval logic
drm/panthor: Relax the constraints on the tiler chunk size
drm/panthor: Make sure the tiler initial/max chunks are consistent
drm/panthor: Fix tiler OOM handling to allow incremental rendering
drm: xlnx: zynqmp_dpsub: Fix compilation error
drm: xlnx: zynqmp_dpsub: Fix few function comments
hbm filed takes bit 13 and bit 14 in boot status.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add support of all the CP GFX queues for gfx11 ipdump
to be used by devcoredump.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add gfx11 support of CP queue registers for all queues
to be used by devcoredump.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add support of gfx11 ipdump print so devcoredump
could trigger it to dump the captured registers
in devcoredump.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add general registers of gfx11 in ipdump for
devcoredump support.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Adding more device information:
a. PCI info
b. VRAM and GTT info
c. GDC config
Also correct the print layout and section information for
in devcoredump.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
add prints before and after ip state is dumped.
It avoids user to think of system being
stuck/hung as dump could take some time after a
gpu hang.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add gfx queue register for all instances in devcoredump
for gfx10.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add support to dump registers of all instances of
cp queue registers of gfx10 to devcoredump.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Rename the memory pointer from ip_dump to ip_dump_core
to make it specific to core registers and rest other
registers to be dumped in their respective memories.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
the inst passed to amdgpu_virt_rlcg_reg_rw should be physical instance.
Fix the miss matched code.
Signed-off-by: Victor Zhao <Victor.Zhao@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
v1
- driver MMIO read the register to check whether write is required
- if write is required, sriov full time to use rlcg, otherwise use KIQ
v2
- include gfx v11 sriov runtime case
Signed-off-by: Jane Jian <Jane.Jian@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
hbm filed takes bit 13 and bit 14 in boot status.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
device_cntl2 is accessible from pci config space, so program it through pci cfg
space instead of mmio.
Signed-off-by: Frank Min <Frank.Min@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
The vram width value is 0.
Because the integratedsysteminfo table in VBIOS has updated to 2.3.
[How]
Driver needs a new intergrated info v2.3 table too.
Then the vram width value will be correct.
Signed-off-by: Li Ma <li.ma@amd.com>
Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This commit fixes a format truncation issue arosed by the snprintf
function potentially writing more characters into the ring->name buffer
than it can hold, in the amdgpu_gfx_kiq_init_ring function
The issue occurred because the '%d' format specifier could write between
1 and 10 bytes into a region of size between 0 and 8, depending on the
values of xcc_id, ring->me, ring->pipe, and ring->queue. The snprintf
function could output between 12 and 41 bytes into a destination of size
16, leading to potential truncation.
To resolve this, the snprintf line was modified to use the '%hhu' format
specifier for xcc_id, ring->me, ring->pipe, and ring->queue. The '%hhu'
specifier is used for unsigned char variables and ensures that these
values are printed as unsigned decimal integers.
Fixes the below with gcc W=1:
drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c: In function ‘amdgpu_gfx_kiq_init_ring’:
drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c:332:61: warning: ‘%d’ directive output may be truncated writing between 1 and 10 bytes into a region of size between 0 and 8 [-Wformat-truncation=]
332 | snprintf(ring->name, sizeof(ring->name), "kiq_%d.%d.%d.%d",
| ^~
drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c:332:50: note: directive argument in the range [0, 2147483647]
332 | snprintf(ring->name, sizeof(ring->name), "kiq_%d.%d.%d.%d",
| ^~~~~~~~~~~~~~~~~
drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c:332:9: note: ‘snprintf’ output between 12 and 41 bytes into a destination of size 16
332 | snprintf(ring->name, sizeof(ring->name), "kiq_%d.%d.%d.%d",
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
333 | xcc_id, ring->me, ring->pipe, ring->queue);
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Fixes: 345a36c4f1 ("drm/amdgpu: prefer snprintf over sprintf")
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Since the type of pg_flags is u32, adev->pg_flags >> 16 >> 16 is 0
regardless of the values of its operands.
So removing the operations upper_32_bits and lower_32_bits.
Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com>
Suggested-by: Tim Huang <Tim.Huang@amd.com>
Reviewed-by: Tim Huang <Tim.Huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Port mes11 hw_fini to mes12, fix for mode1 reset.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Since the type of data_size is uint32_t, adev->umsch_mm.data_size - 1 >> 16 >> 16 is 0
regardless of the values of its operands
So removing the operations upper_32_bits and lower_32_bits.
Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com>
Suggested-by: Tim Huang <Tim.Huang@amd.com>
Reviewed-by: Tim Huang <Tim.Huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
When user space sets an invalid ta type, the pointer context will be empty.
So it need to check the pointer context before using it
Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com>
Suggested-by: Tim Huang <Tim.Huang@amd.com>
Reviewed-by: Tim Huang <Tim.Huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
skip to create 'xxx_err_count' node when ACA is enabled.
Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
amdgpu_connector_edid() copies the EDID from edid_blob_ptr as a side
effect if amdgpu_connector->edid isn't initialized. However, everywhere
that the returned EDID is used, the EDID should have been set
beforehands.
Only the drm EDID code should look at the EDID property, anyway, so stop
using it.
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Pan
Cc: amd-gfx@lists.freedesktop.org
Reviewed-by: Robert Foss <rfoss@kernel.org>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1463862965d76e9458551598fd4d287a08d3d264.1715353572.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
With the rework of how the __string() handles dynamic strings where it
saves off the source string in field in the helper structure[1], the
assignment of that value to the trace event field is stored in the helper
value and does not need to be passed in again.
This means that with:
__string(field, mystring)
Which use to be assigned with __assign_str(field, mystring), no longer
needs the second parameter and it is unused. With this, __assign_str()
will now only get a single parameter.
There's over 700 users of __assign_str() and because coccinelle does not
handle the TRACE_EVENT() macro I ended up using the following sed script:
git grep -l __assign_str | while read a ; do
sed -e 's/\(__assign_str([^,]*[^ ,]\) *,[^;]*/\1)/' $a > /tmp/test-file;
mv /tmp/test-file $a;
done
I then searched for __assign_str() that did not end with ';' as those
were multi line assignments that the sed script above would fail to catch.
Note, the same updates will need to be done for:
__assign_str_len()
__assign_rel_str()
__assign_rel_str_len()
I tested this with both an allmodconfig and an allyesconfig (build only for both).
[1] https://lore.kernel.org/linux-trace-kernel/20240222211442.634192653@goodmis.org/
Link: https://lore.kernel.org/linux-trace-kernel/20240516133454.681ba6a0@rorschach.local.home
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Julia Lawall <Julia.Lawall@inria.fr>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Christian König <christian.koenig@amd.com> for the amdgpu parts.
Acked-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> #for
Acked-by: Rafael J. Wysocki <rafael@kernel.org> # for thermal
Acked-by: Takashi Iwai <tiwai@suse.de>
Acked-by: Darrick J. Wong <djwong@kernel.org> # xfs
Tested-by: Guenter Roeck <linux@roeck-us.net>
[Why]
The vram width value is 0.
Because the integratedsysteminfo table in VBIOS has updated to 2.3.
[How]
Driver needs a new intergrated info v2.3 table too.
Then the vram width value will be correct.
Signed-off-by: Li Ma <li.ma@amd.com>
Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
-----BEGIN PGP SIGNATURE-----
iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAmZLzNIUHGJoZWxnYWFz
QGdvb2dsZS5jb20ACgkQWYigwDrT+vwr/Q//STe2XGKI8bAKqP2wbbkzm+ISnK4A
Lqf3FEAIXunxDRspszfXKKV2p4vaIkmOFiwIdtp/kWvd0DQn5+ATXJ/iQtp8aFX/
R+6BQ7EZc2G7fN5fbQuK54+CvmWEpkKEMbXYbd6ivQ14Cijdb3Nbu+w+DYFjS+6C
k2a9lS1bTW7Xcy0fyiO1w6GQiWqtmOH8U3OlQtIrI0EVkDG9OG1LsLuc92/FgkOo
REN+sU+hX1K5fHrvm2CtjYDn/9/B6bJ/It22H1dPgUL9nKvKC67fYzosMtUCOX1M
6XSPjZIuXOmQGeZXHhpSlVwaidxoUjYO98I7nMquxKdCy6yct3geK7ULG/xeQCgD
ML7MGQB4+sTiSWalXUQaziKqF1FIDEvU3HMGXFWnoBL5l56eRp8KS1EI9Eqk9pU3
pk9fJaCkcFnkzPtMFzqPOm5q9zUZ6bGbfYb0hs72TUKplmVDhFo2T1YsW2AOyHZ7
mjuDzUYZX0H7uM1tntA56IgZX+oNOrLvhBt5L5M/BQeCsZFBBUfIcAEaYoL9LwXO
AYgIG3jdqzHHyAUzutJF+XHKinJLMHm0XVYbFmO6saPhFzrUJSNHqT7NzW1DGGTl
OnO8e1WNMX1EcnKvnc6fXyGmM3SgVwy45FsbG/zRnhn4uBKqKtjrh6uX/myA22LK
CSeqSUK9XmXxFNA=
=xjoS
-----END PGP SIGNATURE-----
Merge tag 'pci-v6.10-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci
Pull pci updates from Bjorn Helgaas:
"Enumeration:
- Skip E820 checks for MCFG ECAM regions for new (2016+) machines,
since there's no requirement to describe them in E820 and some
platforms require ECAM to work (Bjorn Helgaas)
- Rename PCI_IRQ_LEGACY to PCI_IRQ_INTX to be more specific (Damien
Le Moal)
- Remove last user and pci_enable_device_io() (Heiner Kallweit)
- Wait for Link Training==0 to avoid possible race (Ilpo Järvinen)
- Skip waiting for devices that have been disconnected while
suspended (Ilpo Järvinen)
- Clear Secondary Status errors after enumeration since Master Aborts
and Unsupported Request errors are an expected part of enumeration
(Vidya Sagar)
MSI:
- Remove unused IMS (Interrupt Message Store) support (Bjorn Helgaas)
Error handling:
- Mask Genesys GL975x SD host controller Replay Timer Timeout
correctable errors caused by a hardware defect; the errors cause
interrupts that prevent system suspend (Kai-Heng Feng)
- Fix EDR-related _DSM support, which previously evaluated revision 5
but assumed revision 6 behavior (Kuppuswamy Sathyanarayanan)
ASPM:
- Simplify link state definitions and mask calculation (Ilpo
Järvinen)
Power management:
- Avoid D3cold for HP Pavilion 17 PC/1972 PCIe Ports, where BIOS
apparently doesn't know how to put them back in D0 (Mario
Limonciello)
CXL:
- Support resetting CXL devices; special handling required because
CXL Ports mask Secondary Bus Reset by default (Dave Jiang)
DOE:
- Support DOE Discovery Version 2 (Alexey Kardashevskiy)
Endpoint framework:
- Set endpoint BAR to be 64-bit if the driver says that's all the
device supports, in addition to doing so if the size is >2GB
(Niklas Cassel)
- Simplify endpoint BAR allocation and setting interfaces (Niklas
Cassel)
Cadence PCIe controller driver:
- Drop DT binding redundant msi-parent and pci-bus.yaml (Krzysztof
Kozlowski)
Cadence PCIe endpoint driver:
- Configure endpoint BARs to be 64-bit based on the BAR type, not the
BAR value (Niklas Cassel)
Freescale Layerscape PCIe controller driver:
- Convert DT binding to YAML (Frank Li)
MediaTek MT7621 PCIe controller driver:
- Add DT binding missing 'reg' property for child Root Ports
(Krzysztof Kozlowski)
- Fix theoretical string truncation in PHY name (Sergio Paracuellos)
NVIDIA Tegra194 PCIe controller driver:
- Return success for endpoint probe instead of falling through to the
failure path (Vidya Sagar)
Renesas R-Car PCIe controller driver:
- Add DT binding missing IOMMU properties (Geert Uytterhoeven)
- Add DT binding R-Car V4H compatible for host and endpoint mode
(Yoshihiro Shimoda)
Rockchip PCIe controller driver:
- Configure endpoint BARs to be 64-bit based on the BAR type, not the
BAR value (Niklas Cassel)
- Add DT binding missing maxItems to ep-gpios (Krzysztof Kozlowski)
- Set the Subsystem Vendor ID, which was previously zero because it
was masked incorrectly (Rick Wertenbroek)
Synopsys DesignWare PCIe controller driver:
- Restructure DBI register access to accommodate devices where this
requires Refclk to be active (Manivannan Sadhasivam)
- Remove the deinit() callback, which was only need by the
pcie-rcar-gen4, and do it directly in that driver (Manivannan
Sadhasivam)
- Add dw_pcie_ep_cleanup() so drivers that support PERST# can clean
up things like eDMA (Manivannan Sadhasivam)
- Rename dw_pcie_ep_exit() to dw_pcie_ep_deinit() to make it parallel
to dw_pcie_ep_init() (Manivannan Sadhasivam)
- Rename dw_pcie_ep_init_complete() to dw_pcie_ep_init_registers() to
reflect the actual functionality (Manivannan Sadhasivam)
- Call dw_pcie_ep_init_registers() directly from all the glue
drivers, not just those that require active Refclk from the host
(Manivannan Sadhasivam)
- Remove the "core_init_notifier" flag, which was an obscure way for
glue drivers to indicate that they depend on Refclk from the host
(Manivannan Sadhasivam)
TI J721E PCIe driver:
- Add DT binding J784S4 SoC Device ID (Siddharth Vadapalli)
- Add DT binding J722S SoC support (Siddharth Vadapalli)
TI Keystone PCIe controller driver:
- Add DT binding missing num-viewport, phys and phy-name properties
(Jan Kiszka)
Miscellaneous:
- Constify and annotate with __ro_after_init (Heiner Kallweit)
- Convert DT bindings to YAML (Krzysztof Kozlowski)
- Check for kcalloc() failure in of_pci_prop_intr_map() (Duoming
Zhou)"
* tag 'pci-v6.10-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: (97 commits)
PCI: Do not wait for disconnected devices when resuming
x86/pci: Skip early E820 check for ECAM region
PCI: Remove unused pci_enable_device_io()
ata: pata_cs5520: Remove unnecessary call to pci_enable_device_io()
PCI: Update pci_find_capability() stub return types
PCI: Remove PCI_IRQ_LEGACY
scsi: vmw_pvscsi: Do not use PCI_IRQ_LEGACY instead of PCI_IRQ_LEGACY
scsi: pmcraid: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
scsi: mpt3sas: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
scsi: megaraid_sas: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
scsi: ipr: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
scsi: hpsa: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
scsi: arcmsr: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
wifi: rtw89: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
dt-bindings: PCI: rockchip,rk3399-pcie: Add missing maxItems to ep-gpios
Revert "genirq/msi: Provide constants for PCI/IMS support"
Revert "x86/apic/msi: Enable PCI/IMS"
Revert "iommu/vt-d: Enable PCI/IMS"
Revert "iommu/amd: Enable PCI/IMS"
Revert "PCI/MSI: Provide IMS (Interrupt Message Store) support"
...
Small APUs(i.e., consumer, embedded products) usually have a small
carveout device memory which can't satisfy most compute workloads
memory allocation requirements.
We can't even run a Basic MNIST Example with a default 512MB carveout.
https://github.com/pytorch/examples/tree/main/mnist. Error Log:
"torch.cuda.OutOfMemoryError: HIP out of memory. Tried to allocate
84.00 MiB. GPU 0 has a total capacity of 512.00 MiB of which 0 bytes
is free. Of the allocated memory 103.83 MiB is allocated by PyTorch,
and 22.17 MiB is reserved by PyTorch but unallocated"
Though we can change BIOS settings to enlarge carveout size,
which is inflexible and may bring complaint. On the other hand,
the memory resource can't be effectively used between host and device.
The solution is MI300A approach, i.e., let VRAM allocations go to GTT.
Then device and host can flexibly and effectively share memory resource.
v2: Report local_mem_size_private as 0. (Felix)
Signed-off-by: Lang Yu <Lang.Yu@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Align kerneldoc with the function argument name.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Fixes: 26e20235ce ("drm/amdgpu: Add amdgpu_bo_is_vm_bo helper")
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
'hqd_registers' used to be used in a member of the 'bonaire_mqd'
struct. 'bonaire_mqd' was removed by
commit 486d807cd9 ("drm/amdgpu: remove duplicate definition of cik_mqd")
It's now unused.
Remove 'hqd_registers' as well.
Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Avoid overflow issue.
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>