1
0
Fork 0
mirror of synced 2025-03-06 20:59:54 +01:00
Commit graph

75 commits

Author SHA1 Message Date
Evan Quan
3059cd8c5f drm/amd/pm: disable cstate feature for gpu reset scenario
Suggested by PMFW team and same as what did for gfxoff feature.
This can address some Mode1Reset failures observed on SMU13.0.0.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.0.x
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-18 22:12:20 -04:00
Darren Powell
ceb180361e amdgpu/pm: Fix possible array out-of-bounds if SCLK levels != 2
[v2]
simplified fix after Lijo's feedback
 removed clocks.num_levels from calculation of loop count
   removed unsafe accesses to shim table freq_values
 retained corner case output only min,now if
   clocks.num_levels == 1 && now > min

 [v1]
added a check to populate and use SCLK shim table freq_values only
   if using dpm_level == AMD_DPM_FORCED_LEVEL_MANUAL or
                         AMD_DPM_FORCED_LEVEL_PERF_DETERMINISM
removed clocks.num_levels from calculation of shim table size
removed unsafe accesses to shim table freq_values
   output gfx_table values if using other dpm levels
added check for freq_match when using freq_values for when now == min_clk

== Test ==
LOGFILE=aldebaran-sclk.test.log
AMDGPU_PCI_ADDR=`lspci -nn | grep "VGA\|Display" | cut -d " " -f 1`
AMDGPU_HWMON=`ls -la /sys/class/hwmon | grep $AMDGPU_PCI_ADDR | awk '{print $9}'`
HWMON_DIR=/sys/class/hwmon/${AMDGPU_HWMON}

lspci -nn | grep "VGA\|Display"  > $LOGFILE
FILES="pp_od_clk_voltage
pp_dpm_sclk"

for f in $FILES
do
  echo === $f === >> $LOGFILE
  cat $HWMON_DIR/device/$f >> $LOGFILE
done
cat $LOGFILE

Signed-off-by: Darren Powell <darren.powell@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-22 16:55:22 -04:00
Darren Powell
543faf57ee amdgpu/pm: Fix incorrect variable for size of clocks array
[v2]
No Changes, added RB
 [v1]
Size of pp_clock_levels_with_latency is PP_MAX_CLOCK_LEVELS, not MAX_NUM_CLOCKS.
Both are currently defined as 16, modifying in case one value is modified in future
Changed code in both arcturus and aldabaran.

Also removed unneeded var count, and used min_t function

Signed-off-by: Darren Powell <darren.powell@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-22 16:55:12 -04:00
Alex Deucher
da1db031cd drm/amdgpu/swsmu: add SMU mailbox registers in SMU context
So we can eventaully use them in the common smu code for
accessing the SMU mailboxes without needing a lot of
per asic logic in the common code.

Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-03 16:45:00 -04:00
Stanley.Yang
cbd3e8440e drm/amdgpu: print umc correctable error address
Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Acked-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-03 16:44:15 -04:00
Stanley.Yang
2f6247dad2 drm/amdgpu/pm: support mca_ceumc_addr in ecctable
SMU add a new variable mca_ceumc_addr to record
umc correctable error address in EccInfo table,
driver side add EccInfo_V2_t to support this feature

Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-03 16:43:36 -04:00
Lijo Lazar
b0f4d663fc drm/amd/pm: Fix missing thermal throttler status
On aldebaran, when thermal throttling happens due to excessive GPU
temperature, the reason for throttling event is missed in warning
message. This patch fixes it.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26 14:56:32 -04:00
Kent Russell
4a93d938a4 drm/amdgpu: Use metrics data function to get unique_id for Aldebaran
This is abstracted well enough in the get_metrics_data function, so use
the function

Signed-off-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-03-31 23:05:54 -04:00
Stanley.Yang
d510eccfa5 drm/amd/pm: add send bad channel info function
support message SMU update bad channel info to update HBM bad channel
info in OOB table

Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-03-15 14:25:16 -04:00
Evan Quan
2d282665d2 drm/amd/pm: update the data type for retrieving enabled ppfeatures
Use uint64_t instead of an array of uint32_t. This can avoid
some non-necessary intermediate uint32_t -> uint64_t conversions.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-02-07 18:01:16 -05:00
Luben Tuikov
00d6936dbd drm/amdgpu: Set FRU bus for Aldebaran and Vega 20
The FRU and RAS EEPROMs share the same I2C bus on Aldebaran and Vega 20
ASICs. Set the FRU bus "pointer" to this single bus, as access to the FRU
is sought through that bus "pointer" and not through the RAS bus "pointer".

Cc: Roy Sun <Roy.Sun@amd.com>
Cc: Alex Deucher <Alexander.Deucher@amd.com>
Fixes: 2f60dd5076 ("drm/amd: Expose the FRU SMU I2C bus")
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alex Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-02-07 17:59:53 -05:00
Alex Deucher
e281d5940a drm/amdgpu/swsmu/i2c: return an error if the SMU is not running
Return an error if someone tries to use the i2c bus when the
SMU is not running.  Otherwise we can end up sending commands
to the SMU which will either get ignored or could cause other
issues depending on what state the GPU and SMU are in.

Cc: Luben.Tuikov@amd.com
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-27 15:50:03 -05:00
Luben Tuikov
2f60dd5076 drm/amd: Expose the FRU SMU I2C bus
Expose both SMU I2C buses. Some boards use the same bus for both the RAS
and FRU EEPROMs and others use different buses.  This enables the
additional I2C bus and sets the right buses to use for RAS and FRU EEPROM
access.

Cc: Roy Sun <Roy.Sun@amd.com>
Co-developed-by: Alex Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alex Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-27 15:49:48 -05:00
Evan Quan
56383e8f4d drm/amd/pm: drop unneeded smu->sensor_lock
As all those related APIs are already well protected by
adev->pm.mutex and smu->message_lock.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-25 18:00:32 -05:00
Evan Quan
da11407f06 drm/amd/pm: drop unneeded smu->metrics_lock
As all those related APIs are already well protected by
adev->pm.mutex and smu->message_lock.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-25 18:00:32 -05:00
Evan Quan
e0638c7abc drm/amd/pm: drop unneeded lock protection smu->mutex
As all those APIs are already protected either by adev->pm.mutex
or smu->message_lock.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-25 18:00:32 -05:00
Evan Quan
837d542a09 drm/amd/pm: relocate the power related headers
Instead of centralizing all headers in the same folder. Separate them into
different folders and place them among those source files those who really
need them.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-14 17:51:14 -05:00
Evan Quan
ebfc253335 drm/amd/pm: do not expose the smu_context structure used internally in power
This can cover the power implementation details. And as what did for
powerplay framework, we hook the smu_context to adev->powerplay.pp_handle.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-14 17:51:14 -05:00
Tao Zhou
15084a8e16 drm/amd/pm: only send GmiPwrDnControl msg on master die (v3)
PMFW only returns 0 on master die and sends NACK back on other dies for
the message.

v2: only send GmiPwrDnControl msg on master die instead of all
dies.
v3: remove the pointer check for get_socket_id and get_die_id as they
should be present on Aldebaran.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-11 15:44:28 -05:00
Kent Russell
de0af8a65e drm/amdgpu: Only overwrite serial if field is empty
On Aldebaran, the serial may be obtained from the FRU. Only overwrite
the serial with the unique_id if the serial is empty. This will support
printing serial numbers for mGPU devices where there are 2 unique_ids
for the 2 GPUs, but only one serial number for the board

Signed-off-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-30 08:54:43 -05:00
sashank saye
4da8b63944 drm/amdgpu: Send Message to SMU on aldebaran passthrough for sbr handling
For Aldebaran chip passthrough case we need to intimate SMU
about special handling for SBR.On older chips we send
LightSBR to SMU, enabling the same for Aldebaran. Slight
difference, compared to previous chips, is on Aldebaran, SMU
would do a heavy reset on SBR. Hence, the word Heavy
instead of Light SBR is used for SMU to differentiate.

Reviewed by: Shaoyun.liu <Shaoyun.liu@amd.com>
Signed-off-by: sashank saye <sashank.saye@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-28 16:03:19 -05:00
Lijo Lazar
3c27abee3f drm/amd/pm: Fix xgmi link control on aldebaran
Fix the message argument.
	0: Allow power down
	1: Disallow power down

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-16 13:43:46 -05:00
Lijo Lazar
91e16017b6 drm/amd/pm: Skip power state allocation
Power states are not valid for arcturus and aldebaran, no need to
allocate memory.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-14 16:09:42 -05:00
Stanley.Yang
edd7942085 drm/amd/pm: add message smu to get ecc_table v2
support ECC TABLE message, this table include umc ras error count
and error address

v2:
    add smu version check to query whether support ecctable
    call smu_cmn_update_table to get ecctable directly

Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-22 14:45:26 -05:00
Tao Zhou
3ebd8bf023 drm/amdgpu: support new mode-1 reset interface (v2)
If gpu reset is triggered by ras fatal error, tell it to smu in mode-1
reset message.

v2: move mode-1 reset function to aldebaran_ppt.c since it's aldebaran
specific currently.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-22 14:45:02 -05:00
Darren Powell
2d1ac1cbe5 amdgpu/pm: (v2) add limit_type to (pptable_funcs)->set_power_limit signature
v2
 add check for SMU_DEFAULT_PPT_LIMIT

 v1
 modify (pptable_funcs)->set_power_limit signature
 modify smu11 set_power_limit signature (arcturus, navi10, sienna_cichlid)
 modify smu13 set_power_limit signature (aldabaran)
 modify vangogh_set_power_limit signature (vangogh)

=== Test ===
sudo bash

AMDGPU_PCI_ADDR=`lspci -nn | grep "VGA\|Display" | cut -d " " -f 1`
AMDGPU_HWMON=`ls -la /sys/class/hwmon | grep $AMDGPU_PCI_ADDR | awk '{print $9}'`
HWMON_DIR=/sys/class/hwmon/${AMDGPU_HWMON}
LOGFILE=pp_show_power_cap.log

cp $LOGFILE{,.old}
lspci -nn | grep "VGA\|Display" > $LOGFILE
FILES="
power1_cap
power2_cap"

for f in $FILES
do
  if test -f "$HWMON_DIR/$f"; then
    echo === $f === >> $LOGFILE
    cat $HWMON_DIR/$f >> $LOGFILE
    RESTORE_VALUE=`cat $HWMON_DIR/$f` 2>&1  >> $LOGFILE
    echo RESTORE_VALUE $RESTORE_VALUE >> $LOGFILE
    echo 120000000 > $HWMON_DIR/$f
    sleep 3
    cat $HWMON_DIR/$f >> $LOGFILE
    echo $RESTORE_VALUE > $HWMON_DIR/$f
    sleep 3
    cat $HWMON_DIR/$f >> $LOGFILE
  else
    echo === $f === >> $LOGFILE
    echo File Not Found >> $LOGFILE
  fi
done
cat $LOGFILE

Signed-off-by: Darren Powell <darren.powell@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-13 14:14:54 -04:00
Lang Yu
8f48ba303d drm/amdgpu: fix sysfs_emit/sysfs_emit_at warnings(v2)
sysfs_emit and sysfs_emit_at requrie a page boundary
aligned buf address. Make them happy!

v2: use an inline function.

Warning Log:
[  492.545174] invalid sysfs_emit_at: buf:00000000f19bdfde at:0
[  492.546416] WARNING: CPU: 7 PID: 1304 at fs/sysfs/file.c:765 sysfs_emit_at+0x4a/0xa0
[  492.654805] Call Trace:
[  492.655353]  ? smu_cmn_get_metrics_table+0x40/0x50 [amdgpu]
[  492.656780]  vangogh_print_clk_levels+0x369/0x410 [amdgpu]
[  492.658245]  vangogh_common_print_clk_levels+0x77/0x80 [amdgpu]
[  492.659733]  ? preempt_schedule_common+0x18/0x30
[  492.660713]  smu_print_ppclk_levels+0x65/0x90 [amdgpu]
[  492.662107]  amdgpu_get_pp_od_clk_voltage+0x13d/0x190 [amdgpu]
[  492.663620]  dev_attr_show+0x1d/0x40

Signed-off-by: Lang Yu <lang.yu@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-16 09:56:23 -04:00
Kees Cook
4a9bd6db19 drm/amd/pm: And destination bounds checking to struct copy
In preparation for FORTIFY_SOURCE performing compile-time and run-time
field bounds checking for memcpy(), memmove(), and memset(), avoid
intentionally writing across neighboring fields.

The "Board Parameters" members of the structs:
	struct atom_smc_dpm_info_v4_5
	struct atom_smc_dpm_info_v4_6
	struct atom_smc_dpm_info_v4_7
	struct atom_smc_dpm_info_v4_10
are written to the corresponding members of the corresponding PPTable_t
variables, but they lack destination size bounds checking, which means
the compiler cannot verify at compile time that this is an intended and
safe memcpy().

Since the header files are effectively immutable[1] and a struct_group()
cannot be used, nor a common struct referenced by both sides of the
memcpy() arguments, add a new helper, amdgpu_memcpy_trailing(), to
perform the bounds checking at compile time. Replace the open-coded
memcpy()s with amdgpu_memcpy_trailing() which includes enough context
for the bounds checking.

"objdump -d" shows no object code changes.

[1] https://lore.kernel.org/lkml/e56aad3c-a06f-da07-f491-a894a570d78f@amd.com

Cc: "Christian König" <christian.koenig@amd.com>
Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Cc: Feifei Xu <Feifei.Xu@amd.com>
Cc: Likun Gao <Likun.Gao@amd.com>
Cc: Jiawei Gu <Jiawei.Gu@amd.com>
Cc: Evan Quan <evan.quan@amd.com>
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-30 14:59:34 -04:00
Kevin Wang
becf6c9552 drm/amd/pm: change smu msg's attribute to allow working under sriov
the following message is allowed in sriov mode:
1. GetEnabledSmuFeaturesLow
2. GetEnabledSmuFeaturesHigh

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-16 15:17:44 -04:00
Kevin Wang
cb5da84a5f drm/amd/pm: change return value in aldebaran_get_power_limit()
v1:
1. change return value to avoid smu driver probe fails when FEATURE_PPT is
not enabled.
2. if FEATURE_PPT is not enabled, set power limit value to 0.

v2:
instead dev_err with dev_warn

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-16 15:17:35 -04:00
Kevin Wang
19838cbae7 drm/amd/pm: correct DPM_XGMI/VCN_DPM feature name
the following feature is wrong, it will cause sysnode of pp_features show error:
1. DPM_XGMI
2. VCN_DPM

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-16 15:17:17 -04:00
Darren Powell
e738c2f0e6 amdgpu/pm: Replace smu12/13 usage of sprintf with sysfs_emit
initial modification of files
  renoir_ppt.c
  aldebaran_ppt.c
  yellow_carp_ppt.c

=== Test ===
AMDGPU_PCI_ADDR=`lspci -nn | grep "VGA\|Display" | cut -d " " -f 1`
AMDGPU_HWMON=`ls -la /sys/class/hwmon | grep $AMDGPU_PCI_ADDR | awk '{print $9}'`
HWMON_DIR=/sys/class/hwmon/${AMDGPU_HWMON}
LOGFILE=pp_printf.test.log

lspci -nn | grep "VGA\|Display"  > $LOGFILE
FILES="pp_dpm_sclk
pp_power_profile_mode "

for f in $FILES
do
  echo === $f === >> $LOGFILE
  cat $HWMON_DIR/device/$f >> $LOGFILE
done
cat $LOGFILE

Signed-off-by: Darren Powell <darren.powell@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-10 10:34:50 -04:00
Kevin Wang
a38414335d drm/amd/pm: correct aldebaran smu feature mapping FEATURE_DATA_CALCULATIONS
correct smu feature mapping: FEATURE_DATA_CALCULATIONS
it will cause sysfs node of "pp_features" show error.

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-05 21:17:58 -04:00
Graham Sider
410e302ea5 drm/amdkfd: Update SMI throttle event bitmask
Update Arcturus/Aldebaran thermal throttle SMI event path to use
ASIC-independent throttler bits when logging.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-23 10:08:00 -04:00
Lijo Lazar
54e6065461 drm/amd/pm: Support board calibration on aldebaran
Add support for board power calibration on Aldebaran.
Board calibration is done after DC offset calibration.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-23 10:07:59 -04:00
Luben Tuikov
c0838d3a93 drm/amdgpu: The I2C IP doesn't support 0 writes/reads
The I2C IP doesn't support writes or reads of 0 bytes.

In order for a START/STOP transaction to take
place on the bus, the data written/read has to be
at least one byte.

That is, you cannot generate a write with 0 bytes,
just to get the ACK from a device, just so you can
probe that device if it is on the bus and so to
discover all devices on the bus--you'll have to
read at least one byte. Writes of 0 bytes generate
no START/STOP on this I2C IP--the bus is not
engaged at all.

Set the I2C_AQ_NO_ZERO_LEN to the existing I2C
quirk tables for Aldebaran, Arcturus, Navi10 and
Sienna Cichlid, and add a quirk table to the I2C
driver which drives the bus when the SMU
doesn't--for instance on Vega20.

Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Lijo Lazar <Lijo.Lazar@amd.com>
Cc: John Clements <john.clements@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Lijo Lazar <Lijo.Lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-08 15:12:51 -04:00
Luben Tuikov
ae87df0775 drm/amd/pm: Add I2C quirk table to Aldebaran
Add I2C quirk table to Aldebaran.

Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Lijo Lazar <Lijo.Lazar@amd.com>
Cc: John Clements <john.clements@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Lijo Lazar <Lijo.Lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-08 15:12:44 -04:00
Luben Tuikov
da98d99b0a drm/amd/pm: Simplify managed I2C transfer of Aldebaran
Simplify Aldebaran managed I2C transfer function
to correctly play with the upper I2C layers.

This gets it in line with Navi10, Acturus, and
Sienna Cichlid.

Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Lijo Lazar <Lijo.Lazar@amd.com>
Cc: John Clements <john.clements@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01 00:25:33 -04:00
Stanley.Yang
513befa634 drm/amdgpu: message smu to update hbm bad page number
Use SMU to update the bad pages rather than directly
accessing the EEPROM from the driver.

Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-18 17:11:56 -04:00
Lijo Lazar
076f55a45e drm/amd/pm: Only primary die supports power data
On aldebaran, only primary die fetches valid power data. Show
power/energy values as 0 on secondary die. Also, power limit should not
be set through secondary die.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-11 16:03:19 -04:00
Evan Quan
488f211dab drm/amd/pm: correct the power limits reporting on OOB supported
As OOB(out-of-band) interface may be used to update the power limits.
Thus to make sure the power limits reporting of our driver always
reflects the correct values, the internal cache must be aligned
carefully.

V2: add support for out-of-band of other ASICs
    align cached current power limit with OOB imposed

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-By: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-11 16:03:09 -04:00
Graham Sider
56d9bf6201 drm/amd/pm: Add aldebaran throttler translation
Perform dependent to independent throttle status translation
for aldebaran.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-10 11:44:25 -04:00
Lijo Lazar
ff05bb18e1 drm/amd/pm: Remove BACO check for aldebaran
BACO/MACO is not applicable for aldebaran. Remove the redundant check.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-04 16:02:13 -04:00
Alex Deucher
dd1d82c04e drm/amdgpu/swsmu/aldebaran: fix check in is_dpm_running
If smu_cmn_get_enabled_mask() fails, return false to be
consistent with other asics.

Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Lee Jones <lee.jones@linaro.org>
Reviewed-by: Lee Jones <lee.jones@linaro.org>
2021-05-27 12:33:52 -04:00
Lee Jones
d26ebc5852 drm/amd/pm/inc/smu_v13_0: Move table into the only source file that uses it
Fixes the following W=1 kernel build warning(s):

 drivers/gpu/drm/amd/amdgpu/../pm/inc/smu_v13_0.h:54:43: warning: ‘smu13_thermal_policy’ defined but not used [-Wunused-const-variable=]

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Kevin Wang <kevin1.wang@amd.com>
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-05-27 12:33:50 -04:00
Feifei Xu
5051cb794a drm/amd/pm: fix return value in aldebaran_set_mp1_state()
For default cases,we should return 0. Otherwise resume will
abort because of the wrong return value.

Signed-off-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-05-21 17:59:47 -04:00
David M Nieto
7884245712 drm/amdgpu/pm: display vcn pp dpm (v4)
Enable displaying DPM levels for VCN clocks
in swsmu supported ASICs

v2: removed set functions for navi, renoir
v3: removed set function from arcturus
v4: added missing defines in drm_table and remove
 uneeded goto label in navi10_ppt.c

Signed-off-by: David M Nieto <david.nieto@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-05-21 10:31:55 -04:00
Lijo Lazar
5709121a58 drm/amd/pm: Reset max GFX clock after disabling determinism
When determinism mode is disabled on aldebaran, max GFX clock will
be reset to default max frequency value.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-05-19 22:38:20 -04:00
Lijo Lazar
e943dd8861 drm/amd/pm: Fix showing incorrect frequencies on aldebaran
v1: Use the current and custom pstate frequencies to track the current and
user-set min/max values in manual and determinism mode. Previously, only
actual_* value was used to track the currrent and user requested value.
The value will get reassigned whenever user requests a new value with
pp_od_clk_voltage node. Hence it will show incorrect values when user
requests an invalid value or tries a partial request without committing
the values. Separating out to custom and current variable fixes such
issues.

v2: Remove redundant if-else check

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-05-19 22:38:17 -04:00
Evan Quan
cfd053be1f drm/amd/pm: expose pmfw attached timestamp on Aldebaran
Available with 68.18.0 and later PMFWs.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-05-10 18:06:46 -04:00