For Aldebaran chip passthrough case we need to intimate SMU
about special handling for SBR.On older chips we send
LightSBR to SMU, enabling the same for Aldebaran. Slight
difference, compared to previous chips, is on Aldebaran, SMU
would do a heavy reset on SBR. Hence, the word Heavy
instead of Light SBR is used for SMU to differentiate.
Reviewed by: Shaoyun.liu <Shaoyun.liu@amd.com>
Signed-off-by: sashank saye <sashank.saye@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
As the smu_context will be invisible from outside(of power). Also,
the smu_debug_mask can be shared around all power code instead of
some specific framework(swSMU) only.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
SMU firmware expects the driver maintains error context
and doesn't interact with SMU any more when SMU errors
occurred. That will aid in debugging SMU firmware issues.
Add SMU debug option support for this request, it can be
enabled or disabled via amdgpu_smu_debug debugfs file.
Use a 32-bit mask to indicate corresponding debug modes.
Currently, only one mode(HALT_ON_ERROR) is supported.
When enabled, it brings hardware to a kind of halt state
so that no one can touch it any more in the envent of SMU
errors.
The dirver interacts with SMU via sending messages. And
threre are three ways to sending messages to SMU in current
implementation. Handle them respectively as following:
1, smu_cmn_send_smc_msg_with_param() for normal timeout cases
Halt on any error.
2, smu_cmn_send_msg_without_waiting()/smu_cmn_wait_for_response()
for longer timeout cases
Halt on errors apart from ETIME. Otherwise this way won't work.
Let the user handle ETIME error in such a case.
3, smu_cmn_send_msg_without_waiting() for no waiting cases
Halt on errors apart from ETIME. Otherwise second way won't work.
== Command Guide ==
1, enable HALT_ON_ERROR mode
# echo 0x1 > /sys/kernel/debug/dri/0/amdgpu_smu_debug
2, disable HALT_ON_ERROR mode
# echo 0x0 > /sys/kernel/debug/dri/0/amdgpu_smu_debug
v5:
- Use bit mask to allow more debug features.(Evan)
- Use WRAN() instead of BUG().(Evan)
v4:
- Set to halt state instead of a simple hang.(Christian)
v3:
- Use debugfs_create_bool().(Christian)
- Put variable into smu_context struct.
- Don't resend command when timeout.
v2:
- Resend command when timeout.(Lijo)
- Use debugfs file instead of module parameter.
Signed-off-by: Lang Yu <lang.yu@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
support ECC TABLE message, this table include umc ras error count
and error address
v2:
add smu version check to query whether support ecctable
call smu_cmn_update_table to get ecctable directly
Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
update smu driver if version to 0x08 to avoid mismatch log
A version mismatch can still happen with an older FW
Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
If gpu reset is triggered by ras fatal error, tell it to smu in mode-1
reset message.
v2: move mode-1 reset function to aldebaran_ppt.c since it's aldebaran
specific currently.
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This was added by commit bd8dcea93a ("drm/amd/pm: add callbacks to
read/write sysfs file pp_power_profile_mode") but the feature was
deprecated from PMFW. Remove it from the driver.
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add manual sclk/vddc setting supoort via pp_od_clk_voltage sysfs
to maintain consistency with other asics. As cyan skillfish doesn't
support DPM, there is only a single frequency and voltage to adjust.
v2: maintain consistency and add command guide.
v3: adjust user settings storage and coding style.
Command guide:
echo vc point sclk vddc > pp_od_clk_voltage
"vc" - sclk voltage curve
"point" - must be 0
"sclk" - target value of sclk(MHz), should be in safe range
"vddc" - target value of vddc(mV), a 6.25(mV) stepping is
recommended and should be in safe range (the real
vddc is an approximation of target value)
echo c > pp_od_clk_voltage
"c" - commit the changes of sclk and vddc, only after
the commit command, the target values set by "vc"
command will take effect
echo r > pp_od_clk_voltage
"r" - reset sclk and vddc to default value, a subsequent
commit command is needed to take effect
Example:
1) Check default sclk and vddc
$ cat pp_od_clk_voltage
OD_SCLK:
0: 1800Mhz *
OD_VDDC:
0: 862mV *
OD_RANGE:
SCLK: 1000Mhz 2000Mhz
VDDC: 700mV 1129mV
2) Set sclk to 1500MHz and vddc to 700mV
$ echo vc 0 1500 700 > pp_od_clk_voltage
$ echo c > pp_od_clk_voltage
$ cat pp_od_clk_voltage
OD_SCLK:
0: 1500Mhz *
OD_VDDC:
0: 693mV *
OD_RANGE:
SCLK: 1000Mhz 2000Mhz
VDDC: 700mV 1129mV
3) Reset sclk and vddc to default
$ echo r > pp_od_clk_voltage
$ echo c > pp_od_clk_voltage
$ cat pp_od_clk_voltage
OD_SCLK:
0: 1800Mhz *
OD_VDDC:
0: 874mV *
OD_RANGE:
SCLK: 1000Mhz 2000Mhz
VDDC: 700mV 1129mV
NOTE:
We don't specify an explicit safe range, you can set any values
between min and max at your own risk. Enjoy!
Signed-off-by: Lang Yu <lang.yu@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add SmuMetrics_t definition for cyan skilfish.
v2: update SmuMetrics_t definition.
v3: cleanup and rearrange the order of fields.
Signed-off-by: Lang Yu <lang.yu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add some PPSMC MSGs for cyan skilfish.
Signed-off-by: Lang Yu <lang.yu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
In preparation for FORTIFY_SOURCE performing compile-time and run-time
field bounds checking for memcpy(), memmove(), and memset(), avoid
intentionally writing across neighboring fields.
The "Board Parameters" members of the structs:
struct atom_smc_dpm_info_v4_5
struct atom_smc_dpm_info_v4_6
struct atom_smc_dpm_info_v4_7
struct atom_smc_dpm_info_v4_10
are written to the corresponding members of the corresponding PPTable_t
variables, but they lack destination size bounds checking, which means
the compiler cannot verify at compile time that this is an intended and
safe memcpy().
Since the header files are effectively immutable[1] and a struct_group()
cannot be used, nor a common struct referenced by both sides of the
memcpy() arguments, add a new helper, amdgpu_memcpy_trailing(), to
perform the bounds checking at compile time. Replace the open-coded
memcpy()s with amdgpu_memcpy_trailing() which includes enough context
for the bounds checking.
"objdump -d" shows no object code changes.
[1] https://lore.kernel.org/lkml/e56aad3c-a06f-da07-f491-a894a570d78f@amd.com
Cc: "Christian König" <christian.koenig@amd.com>
Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Cc: Feifei Xu <Feifei.Xu@amd.com>
Cc: Likun Gao <Likun.Gao@amd.com>
Cc: Jiawei Gu <Jiawei.Gu@amd.com>
Cc: Evan Quan <evan.quan@amd.com>
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Currently, the readout of fan speed pwm is transited into percent-based
and then pwm-based. However, the transition into percent-based is totally
unnecessary and make the final output less accurate.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The relationship "PWM = RPM / smu->fan_max_rpm" between fan speed
PWM and RPM is not true for SMU11 ASICs. So, we need a new way to
retrieving the fan speed RPM.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The relationship "PWM = RPM / smu->fan_max_rpm" between fan speed
PWM and RPM is not true for SMU11 ASICs. So, we need a new way to
retrieving the fan speed PWM.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
As the relationship "PWM = RPM / smu->fan_max_rpm" between fan speed
PWM and RPM is not true for SMU11 ASICs. So, both the RPM and PWM
settings need to be saved.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The relationship "PWM = RPM / smu->fan_max_rpm" between fan speed
PWM and RPM is not true for SMU11 ASICs. So, we need a new way to
perform the fan speed RPM setting.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
the following feature is wrong, it will cause sysnode of pp_features show error:
1. DPM_XGMI
2. VCN_DPM
Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
correct smu feature mapping: FEATURE_DATA_CALCULATIONS
it will cause sysfs node of "pp_features" show error.
Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Correct yellow carp driver-PMFW interface version to v4.
Signed-off-by: Xiaomeng Hou <Xiaomeng.Hou@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The customized OD settings can be divided into two parts: those
committed ones and non-committed ones.
- For those changes which had been fed to SMU before S3/S4/Runpm
suspend kicked, they are committed changes. They should be properly
restored and fed to SMU on S3/S4/Runpm resume.
- For those non-committed changes, they are restored only without feeding
to SMU.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add check_fw_version function support for cyan_skillfish.
Signed-off-by: Lang Yu <Lang.Yu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add support for board power calibration on Aldebaran.
Board calibration is done after DC offset calibration.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Update the version to 0xD for beige_goby.
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Jack Gui <Jack.Gui@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Since there's nothing special in smu implementation for yellow carp,
it's better to reuse the common smu_v13_0 interfaces and drop the
specific smu_v13_0_1.c|h files.
v2: remove the duplicate register offset and shift mask header files as
well.
Signed-off-by: Xiaomeng Hou <Xiaomeng.Hou@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
To suppress the annoying warning about version mismatch.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Due to the structure layout change: "uint32_t ThrottlerStatus" -> "
uint8_t ThrottlingPercentage[THROTTLER_COUNT]".
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
So we lock software as well as hardware access to the bus.
v2: fix mutex handling.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Use SMU to update the bad pages rather than directly
accessing the EEPROM from the driver.
Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
As OOB(out-of-band) interface may be used to update the power limits.
Thus to make sure the power limits reporting of our driver always
reflects the correct values, the internal cache must be aligned
carefully.
V2: add support for out-of-band of other ASICs
align cached current power limit with OOB imposed
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-By: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
For some ASICs, the real dpm feature disablement job is handled by
PMFW during baco reset and custom pptable loading. Cached dpm feature
status need to be updated to pair that.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Via the fSMC_MSG_ArmD3 message, PMFW can properly act on the
Dstate change. Driver involvement for determining the timing for
BACO enter/exit is not needed.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add new defines for thermal throttle status bits which are ASIC
independent. This bit field will be visible to userspace via
gpu_metrics alongside the previous ASIC dependent bit fields. Seperated
into four types: power throttlers (16 bits), current throttlers (16
bits), temperature (24 bits), other (8 bits).
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add get_vbios_bootup_values function to get the bootup values for yellow
carp.
Signed-off-by: Xiaomeng Hou <Xiaomeng.Hou@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This patch adds set_driver_table_location implementation for yellow
carp.
Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
For supporting yellow carp, we need to add smu13 ip
support for the moment.
V2: add smu_v13_0_1.c|h dedicated for apu.
V3: cleanup code.
Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This patch is to add smu v13.0.1 smc header for yellow carp.
v2: squash in updates (Alex)
Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This patch is to add smu v13.0.1 firmware header for yellow carp.
Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This patch is to add smu v13 driver interface header for yellow carp.
v2: squash in updates (Alex)
v3: squash in v69.29.0 update (Alex)
Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>