Clear overflowed array index read warning by cast operation.
Signed-off-by: Tim Huang <Tim.Huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Clear resource leak warning that when the prepare fails,
the allocated amdgpu job object will never be released.
Signed-off-by: Tim Huang <Tim.Huang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
because the ue valid mca count will only be cleared after gpu reset,
so only dump mca log on the first time to get mca bank after receive RAS interrupt.
Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
v1:
because SMU CE valid mca bank will be cleared after reading,
this patch adds mca cache at the driver level to ensure that the mca bank is not lost.
v2:
refine amdgpu_mca_init/fini/reset() function name.
v3:
add mca_cache.lock support
only add CE bank to mca bank cache.
Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
- Refine mca driver code.
- Centralize mca bank dispatch code logic.
Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add initial support for MMHUB 4.1.0.
v1: Add mmhub v4_1_0 ip block support.
v2: Switch to AMDGPU_MMHUB0(0).
v3: squash in fix for ip version check (Alex)
v4: squash in vm_contexts_disable fix (Alex)
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Likun Gao <Likun.Gao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
When TTM failed to alloc VRAM, TTM try evict BOs from VRAM to system
memory then retry the allocation, this skips the KFD BOs from the same
process because KFD require all BOs are resident for user queues.
If TTM with TTM_PL_FLAG_CONTIGUOUS flag to alloc contiguous VRAM, allow
TTM evict KFD BOs from the same process, this will evict the user queues
first, and restore the queues later after contiguous VRAM allocation.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Define macro AMDGPU_MAX_SG_SEGMENT_SIZE 2GB, because struct scatterlist
length is unsigned int, and some users of it cast to a signed int, so
every segment of sg table is limited to size 2GB maximum.
For contiguous VRAM allocation, don't limit the max buddy block size in
order to get contiguous VRAM memory. To workaround the sg table segment
size limit, allocate multiple segments if contiguous size is bigger than
AMDGPU_MAX_SG_SEGMENT_SIZE.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
- remove unused callback functions.
- make part of mca functions static and refine the function order.
Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Adjust mes queue initialization before kgq/kcq initialization
to enable mes mapping legacy queue.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This reverts drm/amdgpu: fix ftrace event amdgpu_bo_move always move
on same heap. The basic problem here is that after the move the old
location is simply not available any more.
Some fixes were suggested, but essentially we should call the move
notification before actually moving things because only this way we have
the correct order for DMA-buf and VM move notifications as well.
Also rework the statistic handling so that we don't update the eviction
counter before the move.
v2: add missing NULL check
Signed-off-by: Christian König <christian.koenig@amd.com>
Fixes: 94aeb41173 ("drm/amdgpu: fix ftrace event amdgpu_bo_move always move on same heap")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3171
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
CC: stable@vger.kernel.org
RDMA device with limited scatter-gather ability requires contiguous VRAM
buffer allocation for RDMA peer direct support.
Add a new KFD alloc memory flag and store as bo alloc flag
AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS. When pin this bo to export for RDMA
peerdirect access, this will set TTM_PL_FLAG_CONTIFUOUS flag, and ask
VRAM buddy allocator to get contiguous VRAM.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Assign value to clock to fix the warning below:
"Using uninitialized value res. Field res.clock is uninitialized"
Signed-off-by: Ma Jun <Jun.Ma2@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-----BEGIN PGP SIGNATURE-----
iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmYutdweHHRvcnZhbGRz
QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGG5oH/3Ggwz5N+gdEK3np
qxUfpgJTWo+fJ6xhRLGy84TnEZ9s9UnK0Su6UuVyOb0F/2Y8hesJ6iwB16yQFKNe
Nore/VvuBZ+utshz5N20yNyPugNOP74GGbyOm+d+iJwIKnmSE8jSjWyMwNFHJCZM
BfoBxZrpwU/YD/0PD1KkI44jhPX1H/EcEmtNiklLnuYvJydTWiRFeku+CSgcOiRz
6fIFrcZMREZrAytMQSwteBAvI3vWblC0S39ZgJmtZt+oi+s1ksIUNG8Mm5uMAiyF
LcGep2tWV06x9uB9XVvrk0qco/kOaUgYuQrHQwNu9LzsaUdYGKqBBMk41SoNXFkB
hBXN26U=
=N2E2
-----END PGP SIGNATURE-----
Merge v6.9-rc6 into drm-next
Thomas needs the defio fixes, Maíra needs the vkms fixes and Joonas
has some fun with i915-gem conflicts.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
GFX12 swizzle mode and GCC formats changed and is much simpler. Use a
seperate function for the same. Changes:
* Swizzle mode is now 3 bits only
* DCC enablement doesn't come from tiling_flags, it is always set in modifiers
* DCC max compressed block size of 128B
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Acked-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Converting size from size_t to int may overflow.
v2: keep reverse xmas tree order (Christian)
Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Clear warning that uses uninitialized value fw_size.
Signed-off-by: Tim Huang <Tim.Huang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Do not dump the ip registers during driver reload
in passthrough environment.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Now we have two flags for contiguous VRAM buffer allocation.
If the application request for AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS,
it would set the ttm place TTM_PL_FLAG_CONTIGUOUS flag in the
buffer's placement function.
This patch will change the default behaviour of the two flags.
When we set AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS
- This means contiguous is not mandatory.
- we will try to allocate the contiguous buffer. Say if the
allocation fails, we fallback to allocate the individual pages.
When we setTTM_PL_FLAG_CONTIGUOUS
- This means contiguous allocation is mandatory.
- we are setting this in amdgpu_bo_pin_restricted() before bo validation
and check this flag in the vram manager file.
- if this is set, we should allocate the buffer pages contiguously.
the allocation fails, we return -ENOSPC.
v2:
- keep the mem_flags and bo->flags check as is(Christian)
- place the TTM_PL_FLAG_CONTIGUOUS flag setting into the
amdgpu_bo_pin_restricted function placement range iteration
loop(Christian)
- rename find_pages with amdgpu_vram_mgr_calculate_pages_per_block
(Christian)
- Keep the kernel BO allocation as is(Christain)
- If BO pin vram allocation failed, we need to return -ENOSPC as
RDMA cannot work with scattered VRAM pages(Philip)
v3(Christian):
- keep contiguous flag handling outside of pages_per_block
calculation
- remove the hacky implementation in contiguous flag error
handling code
v4(Christian):
- use any variable and return value for non-contiguous
fallback
v5: rebase to amd-staging-drm-next branch
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
There are new control registers introduced in gfx10 used to configure
hardware watchpoints triggered by SMEM instructions:
SQ_WATCH{0,1,2,3}_{CNTL_ADDR_HI,ADDR_L}.
Those registers work in a similar way as the TCP_WATCH* registers
currently used for gfx9 and above.
This patch adds support to program the SQ_WATCH registers for gfx10.
The SQ_WATCH?_CNTL.MASK field has one bit more than
TCP_WATCH?_CNTL.MASK, so SQ watchpoints can have a finer granularity
than TCP_WATCH watchpoints. In this patch, we keep the capabilities
advertised to the debugger unchanged
(HSA_DBG_WATCH_ADDR_MASK_*_BIT_GFX10) as this reflects what both TCP and
SQ watchpoints can do and both watchpoints are configured together.
Signed-off-by: Lancelot SIX <lancelot.six@amd.com>
Reviewed-by: Jonathan Kim <jonathan.kim@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The function gfx_v9_4_3_init_microcode in gfx_v9_4_3.c was generating
about potential truncation of output when using the snprintf function.
The issue was due to the size of the buffer 'ucode_prefix' being too
small to accommodate the maximum possible length of the string being
written into it.
The string being written is "amdgpu/%s_mec.bin" or "amdgpu/%s_rlc.bin",
where %s is replaced by the value of 'chip_name'. The length of this
string without the %s is 16 characters. The warning message indicated
that 'chip_name' could be up to 29 characters long, resulting in a total
of 45 characters, which exceeds the buffer size of 30 characters.
To resolve this issue, the size of the 'ucode_prefix' buffer has been
reduced from 30 to 15. This ensures that the maximum possible length of
the string being written into the buffer will not exceed its size, thus
preventing potential buffer overflow and truncation issues.
Fixes the below with gcc W=1:
drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c: In function ‘gfx_v9_4_3_early_init’:
drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c:379:52: warning: ‘%s’ directive output may be truncated writing up to 29 bytes into a region of size 23 [-Wformat-truncation=]
379 | snprintf(fw_name, sizeof(fw_name), "amdgpu/%s_rlc.bin", chip_name);
| ^~
......
439 | r = gfx_v9_4_3_init_rlc_microcode(adev, ucode_prefix);
| ~~~~~~~~~~~~
drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c:379:9: note: ‘snprintf’ output between 16 and 45 bytes into a destination of size 30
379 | snprintf(fw_name, sizeof(fw_name), "amdgpu/%s_rlc.bin", chip_name);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c:413:52: warning: ‘%s’ directive output may be truncated writing up to 29 bytes into a region of size 23 [-Wformat-truncation=]
413 | snprintf(fw_name, sizeof(fw_name), "amdgpu/%s_mec.bin", chip_name);
| ^~
......
443 | r = gfx_v9_4_3_init_cp_compute_microcode(adev, ucode_prefix);
| ~~~~~~~~~~~~
drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c:413:9: note: ‘snprintf’ output between 16 and 45 bytes into a destination of size 30
413 | snprintf(fw_name, sizeof(fw_name), "amdgpu/%s_mec.bin", chip_name);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Fixes: 8630112969 ("drm/amdgpu: split gc v9_4_3 functionality from gc v9_0")
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Suggested-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
In amdgpu_umc_bad_page_polling_timeout, the amdgpu_umc_handle_bad_pages
will be run many times so that double free err_addr in some special case.
So set the err_addr to NULL to avoid the warnings.
Signed-off-by: Bob Zhou <bob.zhou@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The parameter "last_jump_jiffies" should be initialized
before being used in the function atom_op_jump.
Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Check if ring is not a mes queue before freeing the wb entry,
because we only allocate a wb entry when it's not a mes queue.
Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
After amdgpu_i2c_get_byte fail, amdgpu_i2c_put_byte shouldn't be
conducted to put wrong value.
So return and check the i2c transfer result.
Signed-off-by: Bob Zhou <bob.zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
if the sdma_v4_0_irq_id_to_seq return -EINVAL, the process should
be stop to avoid out-of-bounds read, so directly return -EINVAL.
Signed-off-by: Bob Zhou <bob.zhou@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Le Ma <le.ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Initialize the interrupt timestamp for some legacy SOCs
to fix the coverity issue "Uninitialized scalar variable"
Signed-off-by: Ma Jun <Jun.Ma2@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Use new interface to reserve bad page.
Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
retired_page is page frame and should be expanded
to the full address when querying status.
Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Retire bad pages for umc v12_0.
Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add delay work to retire bad pages.
Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
1. umc v12_0 logs ecc errors.
2. Reserve newly detected ecc error pages.
3. Add tag for bad pages, so that they can
be retired later.
Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Initialize the size before calling amdgpu_vce_cs_reloc, such as case 0x03000001.
V2: To really improve the handling we would actually
need to have a separate value of 0xffffffff.(Christian)
Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
We can't use a shared fence location because each transaction
should be considered independently.
Reviewed-by: Shaoyun.liu <shaoyunl@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
As we use wb slots more dynamically, we need to lock
access to avoid racing on allocation or free.
Reviewed-by: Shaoyun.liu <shaoyunl@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>