diff --git a/.mailmap b/.mailmap index 5ff0e5d681e7..399322897938 100644 --- a/.mailmap +++ b/.mailmap @@ -121,6 +121,8 @@ Ben Widawsky Benjamin Poirier Benjamin Tissoires Benjamin Tissoires +Bingwu Zhang +Bingwu Zhang Bjorn Andersson Bjorn Andersson Bjorn Andersson @@ -200,6 +202,7 @@ Elliot Berman Enric Balletbo i Serra Enric Balletbo i Serra Erik Kaneda +Ethan Carter Edwards Ethan Edwards Eugen Hristev Eugen Hristev Evgeniy Polyakov @@ -435,7 +438,7 @@ Martin Kepplinger Martin Kepplinger Martin Kepplinger Martyna Szapar-Mudlaw -Mathieu Othacehe +Mathieu Othacehe Mat Martineau Mat Martineau Matthew Wilcox @@ -735,6 +738,7 @@ Wolfram Sang Wolfram Sang Yakir Yang Yanteng Si +Ying Huang Yusuke Goda Zack Rusin Zhu Yanjun diff --git a/CREDITS b/CREDITS index b1777b53c63a..cda68f04d5f1 100644 --- a/CREDITS +++ b/CREDITS @@ -20,6 +20,10 @@ N: Thomas Abraham E: thomas.ab@samsung.com D: Samsung pin controller driver +N: Jose Abreu +E: jose.abreu@synopsys.com +D: Synopsys DesignWare XPCS MDIO/PCS driver. + N: Dragos Acostachioaie E: dragos@iname.com W: http://www.arbornet.org/~dragos @@ -1428,6 +1432,10 @@ S: 8124 Constitution Apt. 7 S: Sterling Heights, Michigan 48313 S: USA +N: Andy Gospodarek +E: andy@greyhouse.net +D: Maintenance and contributions to the network interface bonding driver. 
+ N: Wolfgang Grandegger E: wg@grandegger.com D: Controller Area Network (device drivers) @@ -1812,6 +1820,10 @@ D: Author/maintainer of most DRM drivers (especially ATI, MGA) D: Core DRM templates, general DRM and 3D-related hacking S: No fixed address +N: Woojung Huh +E: woojung.huh@microchip.com +D: Microchip LAN78XX USB Ethernet driver + N: Kenn Humborg E: kenn@wombat.ie D: Mods to loop device to support sparse backing files diff --git a/Documentation/ABI/testing/sysfs-class-watchdog b/Documentation/ABI/testing/sysfs-class-watchdog index 94fb74615951..70eabccf0557 100644 --- a/Documentation/ABI/testing/sysfs-class-watchdog +++ b/Documentation/ABI/testing/sysfs-class-watchdog @@ -76,7 +76,7 @@ Description: timeout when the pretimeout interrupt is delivered. Pretimeout is an optional feature. -What: /sys/class/watchdog/watchdogn/pretimeout_avaialable_governors +What: /sys/class/watchdog/watchdogn/pretimeout_available_governors Date: February 2017 Contact: Wim Van Sebroeck Description: diff --git a/Documentation/accel/amdxdna/amdnpu.rst b/Documentation/accel/amdxdna/amdnpu.rst new file mode 100644 index 000000000000..fbe0a7585345 --- /dev/null +++ b/Documentation/accel/amdxdna/amdnpu.rst @@ -0,0 +1,281 @@ +.. SPDX-License-Identifier: GPL-2.0-only + +.. include:: + +========= + AMD NPU +========= + +:Copyright: |copy| 2024 Advanced Micro Devices, Inc. +:Author: Sonal Santan + +Overview +======== + +AMD NPU (Neural Processing Unit) is a multi-user AI inference accelerator +integrated into AMD client APU. NPU enables efficient execution of Machine +Learning applications like CNN, LLM, etc. NPU is based on +`AMD XDNA Architecture`_. NPU is managed by **amdxdna** driver. + + +Hardware Description +==================== + +AMD NPU consists of the following hardware components: + +AMD XDNA Array +-------------- + +AMD XDNA Array comprises of 2D array of compute and memory tiles built with +`AMD AI Engine Technology`_. 
Each column has 4 rows of compute tiles and 1 +row of memory tile. Each compute tile contains a VLIW processor with its own +dedicated program and data memory. The memory tile acts as L2 memory. The 2D +array can be partitioned at a column boundary creating a spatially isolated +partition which can be bound to a workload context. + +Each column also has dedicated DMA engines to move data between host DDR and +memory tile. + +AMD Phoenix and AMD Hawk Point client NPU have a 4x5 topology, i.e., 4 rows of +compute tiles arranged into 5 columns. AMD Strix Point client APU have 4x8 +topology, i.e., 4 rows of compute tiles arranged into 8 columns. + +Shared L2 Memory +---------------- + +The single row of memory tiles create a pool of software managed on chip L2 +memory. DMA engines are used to move data between host DDR and memory tiles. +AMD Phoenix and AMD Hawk Point NPUs have a total of 2560 KB of L2 memory. +AMD Strix Point NPU has a total of 4096 KB of L2 memory. + +Microcontroller +--------------- + +A microcontroller runs NPU Firmware which is responsible for command processing, +XDNA Array partition setup, XDNA Array configuration, workload context +management and workload orchestration. + +NPU Firmware uses a dedicated instance of an isolated non-privileged context +called ERT to service each workload context. ERT is also used to execute user +provided ``ctrlcode`` associated with the workload context. + +NPU Firmware uses a single isolated privileged context called MERT to service +management commands from the amdxdna driver. + +Mailboxes +--------- + +The microcontroller and amdxdna driver use a privileged channel for management +tasks like setting up of contexts, telemetry, query, error handling, setting up +user channel, etc. As mentioned before, privileged channel requests are +serviced by MERT. The privileged channel is bound to a single mailbox. + +The microcontroller and amdxdna driver use a dedicated user channel per +workload context. 
The user channel is primarily used for submitting work to +the NPU. As mentioned before, user channel requests are serviced by an +instance of ERT. Each user channel is bound to its own dedicated mailbox. + +PCIe EP +------- + +NPU is visible to the x86 host CPU as a PCIe device with multiple BARs and some +MSI-X interrupt vectors. NPU uses a dedicated high bandwidth SoC level fabric +for reading or writing into host memory. Each instance of ERT gets its own +dedicated MSI-X interrupt. MERT gets a single instance of MSI-X interrupt. + +The number of PCIe BARs varies depending on the specific device. Based on their +functions, PCIe BARs can generally be categorized into the following types. + +* PSP BAR: Expose the AMD PSP (Platform Security Processor) function +* SMU BAR: Expose the AMD SMU (System Management Unit) function +* SRAM BAR: Expose ring buffers for the mailbox +* Mailbox BAR: Expose the mailbox control registers (head, tail and ISR + registers etc.) +* Public Register BAR: Expose public registers + +On specific devices, the above-mentioned BAR types might be combined into a +single physical PCIe BAR. Or a module might require two physical PCIe BARs to +be fully functional. For example, + +* On AMD Phoenix device, PSP, SMU, Public Register BARs are on PCIe BAR index 0. +* On AMD Strix Point device, Mailbox and Public Register BARs are on PCIe BAR + index 0. The PSP has some registers in PCIe BAR index 0 (Public Register BAR) + and PCIe BAR index 4 (PSP BAR). + +Process Isolation Hardware +-------------------------- + +As explained before, XDNA Array can be dynamically divided into isolated +spatial partitions, each of which may have one or more columns. The spatial +partition is set up by programming the column isolation registers by the +microcontroller. Each spatial partition is associated with a PASID which is +also programmed by the microcontroller. Hence multiple spatial partitions in +the NPU can make concurrent host access protected by PASID. 
+ +The NPU FW itself uses microcontroller MMU enforced isolated contexts for +servicing user and privileged channel requests. + + +Mixed Spatial and Temporal Scheduling +===================================== + +AMD XDNA architecture supports mixed spatial and temporal (time sharing) +scheduling of 2D array. This means that spatial partitions may be setup and +torn down dynamically to accommodate various workloads. A *spatial* partition +may be *exclusively* bound to one workload context while another partition may +be *temporarily* bound to more than one workload contexts. The microcontroller +updates the PASID for a temporarily shared partition to match the context that +has been bound to the partition at any moment. + +Resource Solver +--------------- + +The Resource Solver component of the amdxdna driver manages the allocation +of 2D array among various workloads. Every workload describes the number +of columns required to run the NPU binary in its metadata. The Resource Solver +component uses hints passed by the workload and its own heuristics to +decide 2D array (re)partition strategy and mapping of workloads for spatial and +temporal sharing of columns. The FW enforces the context-to-column(s) resource +binding decisions made by the Resource Solver. + +AMD Phoenix and AMD Hawk Point client NPU can support 6 concurrent workload +contexts. AMD Strix Point can support 16 concurrent workload contexts. + + +Application Binaries +==================== + +A NPU application workload is comprised of two separate binaries which are +generated by the NPU compiler. + +1. AMD XDNA Array overlay, which is used to configure a NPU spatial partition. + The overlay contains instructions for setting up the stream switch + configuration and ELF for the compute tiles. The overlay is loaded on the + spatial partition bound to the workload by the associated ERT instance. + Refer to the + `Versal Adaptive SoC AIE-ML Architecture Manual (AM020)`_ for more details. + +2. 
``ctrlcode``, used for orchestrating the overlay loaded on the spatial + partition. ``ctrlcode`` is executed by the ERT running in protected mode on + the microcontroller in the context of the workload. ``ctrlcode`` is made up + of a sequence of opcodes named ``XAie_TxnOpcode``. Refer to the + `AI Engine Run Time`_ for more details. + + +Special Host Buffers +==================== + +Per-context Instruction Buffer +------------------------------ + +Every workload context uses a host resident 64 MB buffer which is memory +mapped into the ERT instance created to service the workload. The ``ctrlcode`` +used by the workload is copied into this special memory. This buffer is +protected by PASID like all other input/output buffers used by that workload. +Instruction buffer is also mapped into the user space of the workload. + +Global Privileged Buffer +------------------------ + +In addition, the driver also allocates a single buffer for maintenance tasks +like recording errors from MERT. This global buffer uses the global IOMMU +domain and is only accessible by MERT. + + +High-level Use Flow +=================== + +Here are the steps to run a workload on AMD NPU: + +1. Compile the workload into an overlay and a ``ctrlcode`` binary. +2. Userspace opens a context in the driver and provides the overlay. +3. The driver checks with the Resource Solver for provisioning a set of columns + for the workload. +4. The driver then asks MERT to create a context on the device with the desired + columns. +5. MERT then creates an instance of ERT. MERT also maps the Instruction Buffer + into ERT memory. +6. The userspace then copies the ``ctrlcode`` to the Instruction Buffer. +7. Userspace then creates a command buffer with pointers to input, output, and + instruction buffer; it then submits command buffer with the driver and goes + to sleep waiting for completion. +8. The driver sends the command over the Mailbox to ERT. +9. ERT *executes* the ``ctrlcode`` in the instruction buffer. 
+10. Execution of the ``ctrlcode`` kicks off DMAs to and from the host DDR while + AMD XDNA Array is running. +11. When ERT reaches end of ``ctrlcode``, it raises an MSI-X to send completion + signal to the driver which then wakes up the waiting workload. + + +Boot Flow +========= + +amdxdna driver uses PSP to securely load signed NPU FW and kick off the boot +of the NPU microcontroller. amdxdna driver then waits for the alive signal in +a special location on BAR 0. The NPU is switched off during SoC suspend and +turned on after resume where the NPU FW is reloaded, and the handshake is +performed again. + + +Userspace components +==================== + +Compiler +-------- + +Peano is an LLVM based open-source compiler for AMD XDNA Array compute tile +available at: +https://github.com/Xilinx/llvm-aie + +The open-source IREE compiler supports graph compilation of ML models for AMD +NPU and uses Peano underneath. It is available at: +https://github.com/nod-ai/iree-amd-aie + +Usermode Driver (UMD) +--------------------- + +The open-source XRT runtime stack interfaces with amdxdna kernel driver. XRT +can be found at: +https://github.com/Xilinx/XRT + +The open-source XRT shim for NPU can be found at: +https://github.com/amd/xdna-driver + + +DMA Operation +============= + +DMA operation instructions are encoded in the ``ctrlcode`` as +``XAIE_IO_BLOCKWRITE`` opcode. When ERT executes ``XAIE_IO_BLOCKWRITE``, DMA +operations between host DDR and L2 memory are effected. + + +Error Handling +============== + +When MERT detects an error in AMD XDNA Array, it pauses execution for that +workload context and sends an asynchronous message to the driver over the +privileged channel. The driver then sends a buffer pointer to MERT to capture +the register states for the partition bound to faulting workload context. The +driver then decodes the error by reading the contents of the buffer pointer. 
+ + +Telemetry +========= + +MERT can report various kinds of telemetry information like the following: + +* L1 interrupt counter +* DMA counter +* Deep Sleep counter +* etc. + + +References +========== + +- `AMD XDNA Architecture `_ +- `AMD AI Engine Technology `_ +- `Peano `_ +- `Versal Adaptive SoC AIE-ML Architecture Manual (AM020) `_ +- `AI Engine Run Time `_ diff --git a/Documentation/accel/amdxdna/index.rst b/Documentation/accel/amdxdna/index.rst new file mode 100644 index 000000000000..38c16939f1fc --- /dev/null +++ b/Documentation/accel/amdxdna/index.rst @@ -0,0 +1,11 @@ +.. SPDX-License-Identifier: GPL-2.0-only + +===================================== + accel/amdxdna NPU driver +===================================== + +The accel/amdxdna driver supports the AMD NPU (Neural Processing Unit). + +.. toctree:: + + amdnpu diff --git a/Documentation/accel/index.rst b/Documentation/accel/index.rst index e94a0160b6a0..bc85f26533d8 100644 --- a/Documentation/accel/index.rst +++ b/Documentation/accel/index.rst @@ -8,6 +8,7 @@ Compute Accelerators :maxdepth: 1 introduction + amdxdna/index qaic/index .. only:: subproject and html diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst index 315ede811c9d..cb1b4e759b7e 100644 --- a/Documentation/admin-guide/cgroup-v2.rst +++ b/Documentation/admin-guide/cgroup-v2.rst @@ -64,13 +64,14 @@ v1 is available under :ref:`Documentation/admin-guide/cgroup-v1/index.rst `_). In some ASICs, the highest CPPC performance is not the one in the ``_CPC`` table, so we need to expose it to sysfs. If boost is not active, but still supported, this maximum frequency will be larger than the one in -``cpuinfo``. On systems that support preferred core, the driver will have -different values for some cores than others and this will reflect the values -advertised by the platform at bootup. +``cpuinfo``. This attribute is read-only. 
``amd_pstate_lowest_nonlinear_freq`` diff --git a/Documentation/admin-guide/pm/cpuidle.rst b/Documentation/admin-guide/pm/cpuidle.rst index 19754beb5a4e..eb58d7a5affd 100644 --- a/Documentation/admin-guide/pm/cpuidle.rst +++ b/Documentation/admin-guide/pm/cpuidle.rst @@ -269,27 +269,7 @@ Namely, when invoked to select an idle state for a CPU (i.e. an idle state that the CPU will ask the processor hardware to enter), it attempts to predict the idle duration and uses the predicted value for idle state selection. -It first obtains the time until the closest timer event with the assumption -that the scheduler tick will be stopped. That time, referred to as the *sleep -length* in what follows, is the upper bound on the time before the next CPU -wakeup. It is used to determine the sleep length range, which in turn is needed -to get the sleep length correction factor. - -The ``menu`` governor maintains two arrays of sleep length correction factors. -One of them is used when tasks previously running on the given CPU are waiting -for some I/O operations to complete and the other one is used when that is not -the case. Each array contains several correction factor values that correspond -to different sleep length ranges organized so that each range represented in the -array is approximately 10 times wider than the previous one. - -The correction factor for the given sleep length range (determined before -selecting the idle state for the CPU) is updated after the CPU has been woken -up and the closer the sleep length is to the observed idle duration, the closer -to 1 the correction factor becomes (it must fall between 0 and 1 inclusive). -The sleep length is multiplied by the correction factor for the range that it -falls into to obtain the first approximation of the predicted idle duration. - -Next, the governor uses a simple pattern recognition algorithm to refine its +It first uses a simple pattern recognition algorithm to obtain a preliminary idle duration prediction. 
Namely, it saves the last 8 observed idle duration values and, when predicting the idle duration next time, it computes the average and variance of them. If the variance is small (smaller than 400 square @@ -301,29 +281,39 @@ Again, if the variance of them is small (in the above sense), the average is taken as the "typical interval" value and so on, until either the "typical interval" is determined or too many data points are disregarded, in which case the "typical interval" is assumed to equal "infinity" (the maximum unsigned -integer value). The "typical interval" computed this way is compared with the -sleep length multiplied by the correction factor and the minimum of the two is -taken as the predicted idle duration. +integer value). -Then, the governor computes an extra latency limit to help "interactive" -workloads. It uses the observation that if the exit latency of the selected -idle state is comparable with the predicted idle duration, the total time spent -in that state probably will be very short and the amount of energy to save by -entering it will be relatively small, so likely it is better to avoid the -overhead related to entering that state and exiting it. Thus selecting a -shallower state is likely to be a better option then. The first approximation -of the extra latency limit is the predicted idle duration itself which -additionally is divided by a value depending on the number of tasks that -previously ran on the given CPU and now they are waiting for I/O operations to -complete. The result of that division is compared with the latency limit coming -from the power management quality of service, or `PM QoS `_, -framework and the minimum of the two is taken as the limit for the idle states' -exit latency. +If the "typical interval" computed this way is long enough, the governor obtains +the time until the closest timer event with the assumption that the scheduler +tick will be stopped. 
That time, referred to as the *sleep length* in what follows, +is the upper bound on the time before the next CPU wakeup. It is used to determine +the sleep length range, which in turn is needed to get the sleep length correction +factor. + +The ``menu`` governor maintains an array containing several correction factor +values that correspond to different sleep length ranges organized so that each +range represented in the array is approximately 10 times wider than the previous +one. + +The correction factor for the given sleep length range (determined before +selecting the idle state for the CPU) is updated after the CPU has been woken +up and the closer the sleep length is to the observed idle duration, the closer +to 1 the correction factor becomes (it must fall between 0 and 1 inclusive). +The sleep length is multiplied by the correction factor for the range that it +falls into to obtain an approximation of the predicted idle duration that is +compared to the "typical interval" determined previously and the minimum of +the two is taken as the idle duration prediction. + +If the "typical interval" value is small, which means that the CPU is likely +to be woken up soon enough, the sleep length computation is skipped as it may +be costly and the idle duration is simply predicted to equal the "typical +interval" value. Now, the governor is ready to walk the list of idle states and choose one of them. For this purpose, it compares the target residency of each state with -the predicted idle duration and the exit latency of it with the computed latency -limit. It selects the state with the target residency closest to the predicted +the predicted idle duration and the exit latency of it with the latency +limit coming from the power management quality of service, or `PM QoS `_, +framework. It selects the state with the target residency closest to the predicted idle duration, but still below it, and exit latency that does not exceed the limit. 
diff --git a/Documentation/arch/arm64/silicon-errata.rst b/Documentation/arch/arm64/silicon-errata.rst index 77db10e944f0..b42fea07c5ce 100644 --- a/Documentation/arch/arm64/silicon-errata.rst +++ b/Documentation/arch/arm64/silicon-errata.rst @@ -255,8 +255,9 @@ stable kernels. +----------------+-----------------+-----------------+-----------------------------+ | Hisilicon | Hip08 SMMU PMCG | #162001800 | N/A | +----------------+-----------------+-----------------+-----------------------------+ -| Hisilicon | Hip{08,09,10,10C| #162001900 | N/A | -| | ,11} SMMU PMCG | | | +| Hisilicon | Hip{08,09,09A,10| #162001900 | N/A | +| | ,10C,11} | | | +| | SMMU PMCG | | | +----------------+-----------------+-----------------+-----------------------------+ | Hisilicon | Hip09 | #162100801 | HISILICON_ERRATUM_162100801 | +----------------+-----------------+-----------------+-----------------------------+ diff --git a/Documentation/core-api/cgroup.rst b/Documentation/core-api/cgroup.rst new file mode 100644 index 000000000000..734ea21e1e17 --- /dev/null +++ b/Documentation/core-api/cgroup.rst @@ -0,0 +1,9 @@ +================== +Cgroup Kernel APIs +================== + +Device Memory Cgroup API (dmemcg) +================================= +.. kernel-doc:: kernel/cgroup/dmem.c + :export: + diff --git a/Documentation/core-api/index.rst b/Documentation/core-api/index.rst index 563b8fc0002f..913d91feaf76 100644 --- a/Documentation/core-api/index.rst +++ b/Documentation/core-api/index.rst @@ -109,6 +109,7 @@ more memory-management documentation in Documentation/mm/index.rst. 
dma-isa-lpc swiotlb mm-api + cgroup genalloc pin_user_pages boot-time-mm diff --git a/Documentation/core-api/symbol-namespaces.rst b/Documentation/core-api/symbol-namespaces.rst index 12e4aecdae94..27a9cccc792c 100644 --- a/Documentation/core-api/symbol-namespaces.rst +++ b/Documentation/core-api/symbol-namespaces.rst @@ -46,7 +46,7 @@ Please note that due to macro expansion that argument needs to be a preprocessor symbol. E.g. to export the symbol ``usb_stor_suspend`` into the namespace ``USB_STORAGE``, use:: - EXPORT_SYMBOL_NS(usb_stor_suspend, USB_STORAGE); + EXPORT_SYMBOL_NS(usb_stor_suspend, "USB_STORAGE"); The corresponding ksymtab entry struct ``kernel_symbol`` will have the member ``namespace`` set accordingly. A symbol that is exported without a namespace will @@ -68,7 +68,7 @@ is to define the default namespace in the ``Makefile`` of the subsystem. E.g. to export all symbols defined in usb-common into the namespace USB_COMMON, add a line like this to drivers/usb/common/Makefile:: - ccflags-y += -DDEFAULT_SYMBOL_NAMESPACE=USB_COMMON + ccflags-y += -DDEFAULT_SYMBOL_NAMESPACE='"USB_COMMON"' That will affect all EXPORT_SYMBOL() and EXPORT_SYMBOL_GPL() statements. A symbol exported with EXPORT_SYMBOL_NS() while this definition is present, will @@ -79,7 +79,7 @@ A second option to define the default namespace is directly in the compilation unit as preprocessor statement. The above example would then read:: #undef DEFAULT_SYMBOL_NAMESPACE - #define DEFAULT_SYMBOL_NAMESPACE USB_COMMON + #define DEFAULT_SYMBOL_NAMESPACE "USB_COMMON" within the corresponding compilation unit before any EXPORT_SYMBOL macro is used. @@ -94,7 +94,7 @@ for the namespaces it uses symbols from. E.g. a module using the usb_stor_suspend symbol from above, needs to import the namespace USB_STORAGE using a statement like:: - MODULE_IMPORT_NS(USB_STORAGE); + MODULE_IMPORT_NS("USB_STORAGE"); This will create a ``modinfo`` tag in the module for each imported namespace. 
This has the side effect, that the imported namespaces of a module can be diff --git a/Documentation/devicetree/bindings/crypto/fsl,sec-v4.0.yaml b/Documentation/devicetree/bindings/crypto/fsl,sec-v4.0.yaml index 9c8c9991f29a..f0c4a7c83568 100644 --- a/Documentation/devicetree/bindings/crypto/fsl,sec-v4.0.yaml +++ b/Documentation/devicetree/bindings/crypto/fsl,sec-v4.0.yaml @@ -114,8 +114,9 @@ patternProperties: table that specifies the PPID to LIODN mapping. Needed if the PAMU is used. Value is a 12 bit value where value is a LIODN ID for this JR. This property is normally set by boot firmware. - $ref: /schemas/types.yaml#/definitions/uint32 - maximum: 0xfff + $ref: /schemas/types.yaml#/definitions/uint32-array + items: + - maximum: 0xfff '^rtic@[0-9a-f]+$': type: object @@ -186,8 +187,9 @@ patternProperties: Needed if the PAMU is used. Value is a 12 bit value where value is a LIODN ID for this JR. This property is normally set by boot firmware. - $ref: /schemas/types.yaml#/definitions/uint32 - maximum: 0xfff + $ref: /schemas/types.yaml#/definitions/uint32-array + items: + - maximum: 0xfff fsl,rtic-region: description: diff --git a/Documentation/devicetree/bindings/display/brcm,bcm2711-hdmi.yaml b/Documentation/devicetree/bindings/display/brcm,bcm2711-hdmi.yaml index 5b35adf34c7b..6d11f5955b51 100644 --- a/Documentation/devicetree/bindings/display/brcm,bcm2711-hdmi.yaml +++ b/Documentation/devicetree/bindings/display/brcm,bcm2711-hdmi.yaml @@ -14,6 +14,8 @@ properties: enum: - brcm,bcm2711-hdmi0 - brcm,bcm2711-hdmi1 + - brcm,bcm2712-hdmi0 + - brcm,bcm2712-hdmi1 reg: items: diff --git a/Documentation/devicetree/bindings/display/brcm,bcm2835-hvs.yaml b/Documentation/devicetree/bindings/display/brcm,bcm2835-hvs.yaml index 2e8566f47e63..f91c9dce2a44 100644 --- a/Documentation/devicetree/bindings/display/brcm,bcm2835-hvs.yaml +++ b/Documentation/devicetree/bindings/display/brcm,bcm2835-hvs.yaml @@ -13,6 +13,7 @@ properties: compatible: enum: - brcm,bcm2711-hvs + - 
brcm,bcm2712-hvs - brcm,bcm2835-hvs reg: @@ -36,7 +37,9 @@ if: properties: compatible: contains: - const: brcm,bcm2711-hvs + enum: + - brcm,bcm2711-hvs + - brcm,bcm2712-hvs then: required: diff --git a/Documentation/devicetree/bindings/display/brcm,bcm2835-pixelvalve0.yaml b/Documentation/devicetree/bindings/display/brcm,bcm2835-pixelvalve0.yaml index 4e1ba03f6477..6b5b1d3fbc0b 100644 --- a/Documentation/devicetree/bindings/display/brcm,bcm2835-pixelvalve0.yaml +++ b/Documentation/devicetree/bindings/display/brcm,bcm2835-pixelvalve0.yaml @@ -20,6 +20,9 @@ properties: - brcm,bcm2711-pixelvalve2 - brcm,bcm2711-pixelvalve3 - brcm,bcm2711-pixelvalve4 + - brcm,bcm2712-pixelvalve0 + - brcm,bcm2712-pixelvalve1 + - brcm,bcm2712-pixelvalve2 reg: maxItems: 1 diff --git a/Documentation/devicetree/bindings/display/brcm,bcm2835-txp.yaml b/Documentation/devicetree/bindings/display/brcm,bcm2835-txp.yaml index bb186197e471..16f45afd2bad 100644 --- a/Documentation/devicetree/bindings/display/brcm,bcm2835-txp.yaml +++ b/Documentation/devicetree/bindings/display/brcm,bcm2835-txp.yaml @@ -11,7 +11,10 @@ maintainers: properties: compatible: - const: brcm,bcm2835-txp + enum: + - brcm,bcm2712-mop + - brcm,bcm2712-moplet + - brcm,bcm2835-txp reg: maxItems: 1 diff --git a/Documentation/devicetree/bindings/display/brcm,bcm2835-vc4.yaml b/Documentation/devicetree/bindings/display/brcm,bcm2835-vc4.yaml index 49a5e041aa49..2aa9d5d2afff 100644 --- a/Documentation/devicetree/bindings/display/brcm,bcm2835-vc4.yaml +++ b/Documentation/devicetree/bindings/display/brcm,bcm2835-vc4.yaml @@ -18,6 +18,7 @@ properties: compatible: enum: - brcm,bcm2711-vc5 + - brcm,bcm2712-vc6 - brcm,bcm2835-vc4 - brcm,cygnus-vc4 diff --git a/Documentation/devicetree/bindings/display/bridge/renesas,dsi-csi2-tx.yaml b/Documentation/devicetree/bindings/display/bridge/renesas,dsi-csi2-tx.yaml index d33026f85e19..c167795c63f6 100644 --- a/Documentation/devicetree/bindings/display/bridge/renesas,dsi-csi2-tx.yaml +++ 
b/Documentation/devicetree/bindings/display/bridge/renesas,dsi-csi2-tx.yaml @@ -19,6 +19,7 @@ properties: enum: - renesas,r8a779a0-dsi-csi2-tx # for V3U - renesas,r8a779g0-dsi-csi2-tx # for V4H + - renesas,r8a779h0-dsi-csi2-tx # for V4M reg: maxItems: 1 diff --git a/Documentation/devicetree/bindings/display/bridge/ti,sn65dsi83.yaml b/Documentation/devicetree/bindings/display/bridge/ti,sn65dsi83.yaml index 48a97bb3e2e0..bad6f5c81b06 100644 --- a/Documentation/devicetree/bindings/display/bridge/ti,sn65dsi83.yaml +++ b/Documentation/devicetree/bindings/display/bridge/ti,sn65dsi83.yaml @@ -80,12 +80,12 @@ properties: - const: 4 port@2: - $ref: /schemas/graph.yaml#/properties/port description: Video port for LVDS Channel-A output (panel or bridge). + $ref: '#/$defs/lvds-port' port@3: - $ref: /schemas/graph.yaml#/properties/port description: Video port for LVDS Channel-B output (panel or bridge). + $ref: '#/$defs/lvds-port' required: - port@0 @@ -96,6 +96,36 @@ required: - reg - ports +$defs: + lvds-port: + $ref: /schemas/graph.yaml#/$defs/port-base + unevaluatedProperties: false + + properties: + endpoint: + $ref: /schemas/media/video-interfaces.yaml# + unevaluatedProperties: false + + properties: + ti,lvds-termination-ohms: + description: The value of near end differential termination in ohms. + enum: [100, 200] + default: 200 + + ti,lvds-vod-swing-clock-microvolt: + description: LVDS differential output voltage for clock + lanes in microvolts. + $ref: /schemas/types.yaml#/definitions/uint32-array + minItems: 2 + maxItems: 2 + + ti,lvds-vod-swing-data-microvolt: + description: LVDS differential output voltage for data + lanes in microvolts. 
+ $ref: /schemas/types.yaml#/definitions/uint32-array + minItems: 2 + maxItems: 2 + allOf: - if: properties: diff --git a/Documentation/devicetree/bindings/display/mediatek/mediatek,dp.yaml b/Documentation/devicetree/bindings/display/mediatek/mediatek,dp.yaml index 2aef1eb32e11..75ce92f4a5fd 100644 --- a/Documentation/devicetree/bindings/display/mediatek/mediatek,dp.yaml +++ b/Documentation/devicetree/bindings/display/mediatek/mediatek,dp.yaml @@ -42,6 +42,9 @@ properties: interrupts: maxItems: 1 + '#sound-dai-cells': + const: 0 + ports: $ref: /schemas/graph.yaml#/properties/ports properties: @@ -85,7 +88,21 @@ required: - ports - max-linkrate-mhz -additionalProperties: false +allOf: + - $ref: /schemas/sound/dai-common.yaml# + - if: + not: + properties: + compatible: + contains: + enum: + - mediatek,mt8188-dp-tx + - mediatek,mt8195-dp-tx + then: + properties: + '#sound-dai-cells': false + +unevaluatedProperties: false examples: - | diff --git a/Documentation/devicetree/bindings/display/msm/dp-controller.yaml b/Documentation/devicetree/bindings/display/msm/dp-controller.yaml index a212f335d5ff..e00b88332f2f 100644 --- a/Documentation/devicetree/bindings/display/msm/dp-controller.yaml +++ b/Documentation/devicetree/bindings/display/msm/dp-controller.yaml @@ -8,6 +8,7 @@ title: MSM Display Port Controller maintainers: - Kuogee Hsieh + - Abhinav Kumar description: | Device tree bindings for DisplayPort host controller for MSM targets diff --git a/Documentation/devicetree/bindings/display/msm/dsi-controller-main.yaml b/Documentation/devicetree/bindings/display/msm/dsi-controller-main.yaml index b0fd96b76ed1..a9636b76854d 100644 --- a/Documentation/devicetree/bindings/display/msm/dsi-controller-main.yaml +++ b/Documentation/devicetree/bindings/display/msm/dsi-controller-main.yaml @@ -30,6 +30,7 @@ properties: - qcom,sdm845-dsi-ctrl - qcom,sm6115-dsi-ctrl - qcom,sm6125-dsi-ctrl + - qcom,sm6150-dsi-ctrl - qcom,sm6350-dsi-ctrl - qcom,sm6375-dsi-ctrl - qcom,sm7150-dsi-ctrl 
@@ -349,6 +350,7 @@ allOf: enum: - qcom,sc7180-dsi-ctrl - qcom,sc7280-dsi-ctrl + - qcom,sm6150-dsi-ctrl - qcom,sm7150-dsi-ctrl - qcom,sm8150-dsi-ctrl - qcom,sm8250-dsi-ctrl diff --git a/Documentation/devicetree/bindings/display/msm/dsi-phy-14nm.yaml b/Documentation/devicetree/bindings/display/msm/dsi-phy-14nm.yaml index 52bbe132e6da..29bbc2f1c766 100644 --- a/Documentation/devicetree/bindings/display/msm/dsi-phy-14nm.yaml +++ b/Documentation/devicetree/bindings/display/msm/dsi-phy-14nm.yaml @@ -20,6 +20,7 @@ properties: - qcom,dsi-phy-14nm-660 - qcom,dsi-phy-14nm-8953 - qcom,sm6125-dsi-phy-14nm + - qcom,sm6150-dsi-phy-14nm reg: items: diff --git a/Documentation/devicetree/bindings/display/msm/qcom,sa8775p-mdss.yaml b/Documentation/devicetree/bindings/display/msm/qcom,sa8775p-mdss.yaml index 58f8a01f29c7..4536bb2f971f 100644 --- a/Documentation/devicetree/bindings/display/msm/qcom,sa8775p-mdss.yaml +++ b/Documentation/devicetree/bindings/display/msm/qcom,sa8775p-mdss.yaml @@ -168,7 +168,8 @@ examples: reg = <0xaf54000 0x104>, <0xaf54200 0x0c0>, <0xaf55000 0x770>, - <0xaf56000 0x09c>; + <0xaf56000 0x09c>, + <0xaf57000 0x09c>; interrupt-parent = <&mdss0>; interrupts = <12>; diff --git a/Documentation/devicetree/bindings/display/msm/qcom,sm6150-dpu.yaml b/Documentation/devicetree/bindings/display/msm/qcom,sm6150-dpu.yaml new file mode 100644 index 000000000000..b4f437172218 --- /dev/null +++ b/Documentation/devicetree/bindings/display/msm/qcom,sm6150-dpu.yaml @@ -0,0 +1,108 @@ +# SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/display/msm/qcom,sm6150-dpu.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: Qualcomm SM6150 Display DPU + +maintainers: + - Abhinav Kumar + - Dmitry Baryshkov + +$ref: /schemas/display/msm/dpu-common.yaml# + +properties: + compatible: + const: qcom,sm6150-dpu + + reg: + items: + - description: Address offset and size for mdp register set + - description: Address 
offset and size for vbif register set + + reg-names: + items: + - const: mdp + - const: vbif + + clocks: + items: + - description: Display ahb clock + - description: Display hf axi clock + - description: Display core clock + - description: Display vsync clock + + clock-names: + items: + - const: iface + - const: bus + - const: core + - const: vsync + +unevaluatedProperties: false + +examples: + - | + #include + #include + + display-controller@ae01000 { + compatible = "qcom,sm6150-dpu"; + reg = <0x0ae01000 0x8f000>, + <0x0aeb0000 0x2008>; + reg-names = "mdp", "vbif"; + + clocks = <&dispcc_mdss_ahb_clk>, + <&gcc_disp_hf_axi_clk>, + <&dispcc_mdss_mdp_clk>, + <&dispcc_mdss_vsync_clk>; + clock-names = "iface", "bus", "core", "vsync"; + + assigned-clocks = <&dispcc_mdss_vsync_clk>; + assigned-clock-rates = <19200000>; + + operating-points-v2 = <&mdp_opp_table>; + power-domains = <&rpmhpd RPMHPD_CX>; + + interrupt-parent = <&mdss>; + interrupts = <0>; + + ports { + #address-cells = <1>; + #size-cells = <0>; + + port@0 { + reg = <0>; + dpu_intf0_out: endpoint { + }; + }; + + port@1 { + reg = <1>; + dpu_intf1_out: endpoint { + remote-endpoint = <&mdss_dsi0_in>; + }; + }; + }; + + mdp_opp_table: opp-table { + compatible = "operating-points-v2"; + + opp-19200000 { + opp-hz = /bits/ 64 <19200000>; + required-opps = <&rpmhpd_opp_low_svs>; + }; + + opp-25600000 { + opp-hz = /bits/ 64 <25600000>; + required-opps = <&rpmhpd_opp_svs>; + }; + + opp-307200000 { + opp-hz = /bits/ 64 <307200000>; + required-opps = <&rpmhpd_opp_nom>; + }; + }; + }; +... 
diff --git a/Documentation/devicetree/bindings/display/msm/qcom,sm6150-mdss.yaml b/Documentation/devicetree/bindings/display/msm/qcom,sm6150-mdss.yaml new file mode 100644 index 000000000000..9ac24f99d3ad --- /dev/null +++ b/Documentation/devicetree/bindings/display/msm/qcom,sm6150-mdss.yaml @@ -0,0 +1,245 @@ +# SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/display/msm/qcom,sm6150-mdss.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: Qualcomm SM6150 Display MDSS + +maintainers: + - Abhinav Kumar + - Dmitry Baryshkov + +description: + Device tree bindings for MSM Mobile Display Subsystem(MDSS) that encapsulates + sub-blocks like DPU display controller, DSI and DP interfaces etc. Device tree + bindings of MDSS are mentioned for SM6150 target. + +$ref: /schemas/display/msm/mdss-common.yaml# + +properties: + compatible: + items: + - const: qcom,sm6150-mdss + + clocks: + items: + - description: Display AHB clock from gcc + - description: Display hf axi clock + - description: Display core clock + + clock-names: + items: + - const: iface + - const: bus + - const: core + + iommus: + maxItems: 1 + + interconnects: + maxItems: 2 + + interconnect-names: + maxItems: 2 + +patternProperties: + "^display-controller@[0-9a-f]+$": + type: object + additionalProperties: true + properties: + compatible: + const: qcom,sm6150-dpu + + "^dsi@[0-9a-f]+$": + type: object + additionalProperties: true + properties: + compatible: + items: + - const: qcom,sm6150-dsi-ctrl + - const: qcom,mdss-dsi-ctrl + + "^phy@[0-9a-f]+$": + type: object + additionalProperties: true + properties: + compatible: + const: qcom,sm6150-dsi-phy-14nm + +unevaluatedProperties: false + +examples: + - | + #include + #include + #include + #include + #include + + display-subsystem@ae00000 { + #address-cells = <1>; + #size-cells = <1>; + compatible = "qcom,sm6150-mdss"; + reg = <0x0ae00000 0x1000>; + reg-names = "mdss"; + + interconnects = 
<&mmss_noc MASTER_MDP0 QCOM_ICC_TAG_ALWAYS + &mc_virt SLAVE_EBI1 QCOM_ICC_TAG_ALWAYS>, + <&gem_noc MASTER_APPSS_PROC QCOM_ICC_TAG_ACTIVE_ONLY + &config_noc SLAVE_DISPLAY_CFG QCOM_ICC_TAG_ACTIVE_ONLY>; + interconnect-names = "mdp0-mem", "cpu-cfg"; + + power-domains = <&dispcc_mdss_gdsc>; + + clocks = <&dispcc_mdss_ahb_clk>, + <&gcc_disp_hf_axi_clk>, + <&dispcc_mdss_mdp_clk>; + + interrupts = ; + interrupt-controller; + #interrupt-cells = <1>; + + iommus = <&apps_smmu 0x800 0x0>; + + ranges; + + display-controller@ae01000 { + compatible = "qcom,sm6150-dpu"; + reg = <0x0ae01000 0x8f000>, + <0x0aeb0000 0x2008>; + reg-names = "mdp", "vbif"; + + clocks = <&dispcc_mdss_ahb_clk>, + <&gcc_disp_hf_axi_clk>, + <&dispcc_mdss_mdp_clk>, + <&dispcc_mdss_vsync_clk>; + clock-names = "iface", "bus", "core", "vsync"; + + assigned-clocks = <&dispcc_mdss_vsync_clk>; + assigned-clock-rates = <19200000>; + + operating-points-v2 = <&mdp_opp_table>; + power-domains = <&rpmhpd RPMHPD_CX>; + + interrupt-parent = <&mdss>; + interrupts = <0>; + + ports { + #address-cells = <1>; + #size-cells = <0>; + + port@0 { + reg = <0>; + dpu_intf0_out: endpoint { + }; + }; + + port@1 { + reg = <1>; + dpu_intf1_out: endpoint { + remote-endpoint = <&mdss_dsi0_in>; + }; + }; + }; + + mdp_opp_table: opp-table { + compatible = "operating-points-v2"; + + opp-19200000 { + opp-hz = /bits/ 64 <19200000>; + required-opps = <&rpmhpd_opp_low_svs>; + }; + + opp-25600000 { + opp-hz = /bits/ 64 <25600000>; + required-opps = <&rpmhpd_opp_svs>; + }; + + opp-307200000 { + opp-hz = /bits/ 64 <307200000>; + required-opps = <&rpmhpd_opp_nom>; + }; + }; + }; + + dsi@ae94000 { + compatible = "qcom,sm6150-dsi-ctrl", + "qcom,mdss-dsi-ctrl"; + reg = <0x0ae94000 0x400>; + reg-names = "dsi_ctrl"; + + interrupt-parent = <&mdss>; + interrupts = <4>; + + clocks = <&dispcc_mdss_byte0_clk>, + <&dispcc_mdss_byte0_intf_clk>, + <&dispcc_mdss_pclk0_clk>, + <&dispcc_mdss_esc0_clk>, + <&dispcc_mdss_ahb_clk>, + <&gcc_disp_hf_axi_clk>; + 
clock-names = "byte", + "byte_intf", + "pixel", + "core", + "iface", + "bus"; + + assigned-clocks = <&dispcc_mdss_byte0_clk_src>, + <&dispcc_mdss_pclk0_clk_src>; + assigned-clock-parents = <&mdss_dsi0_phy 0>, + <&mdss_dsi0_phy 1>; + + operating-points-v2 = <&dsi0_opp_table>; + + phys = <&mdss_dsi0_phy>; + + #address-cells = <1>; + #size-cells = <0>; + + ports { + #address-cells = <1>; + #size-cells = <0>; + + port@0 { + reg = <0>; + mdss_dsi0_in: endpoint { + remote-endpoint = <&dpu_intf1_out>; + }; + }; + + port@1 { + reg = <1>; + mdss_dsi0_out: endpoint { + }; + }; + }; + + dsi0_opp_table: opp-table { + compatible = "operating-points-v2"; + + opp-164000000 { + opp-hz = /bits/ 64 <164000000>; + required-opps = <&rpmhpd_opp_low_svs>; + }; + }; + }; + + mdss_dsi0_phy: phy@ae94400 { + compatible = "qcom,sm6150-dsi-phy-14nm"; + reg = <0x0ae94400 0x100>, + <0x0ae94500 0x300>, + <0x0ae94800 0x188>; + reg-names = "dsi_phy", + "dsi_phy_lane", + "dsi_pll"; + + #clock-cells = <1>; + #phy-cells = <0>; + + clocks = <&dispcc_mdss_ahb_clk>, + <&rpmhcc RPMH_CXO_CLK>; + clock-names = "iface", "ref"; + }; + }; +... diff --git a/Documentation/devicetree/bindings/display/panel/panel-lvds.yaml b/Documentation/devicetree/bindings/display/panel/panel-lvds.yaml index 5af2d6930075..fcb5834f799a 100644 --- a/Documentation/devicetree/bindings/display/panel/panel-lvds.yaml +++ b/Documentation/devicetree/bindings/display/panel/panel-lvds.yaml @@ -42,6 +42,8 @@ properties: # Admatec 9904379 10.1" 1024x600 LVDS panel - admatec,9904379 - auo,b101ew05 + # AUO G084SN05 V9 8.4" 800x600 LVDS panel + - auo,g084sn05 # Chunghwa Picture Tubes Ltd. 
7" WXGA (800x1280) TFT LCD LVDS panel - chunghwa,claa070wp03xg # EDT ETML0700Z9NDHA 7.0" WSVGA (1024x600) color TFT LCD LVDS panel diff --git a/Documentation/devicetree/bindings/display/panel/panel-simple.yaml b/Documentation/devicetree/bindings/display/panel/panel-simple.yaml index 18b63f356bb4..e3ee3a332bb7 100644 --- a/Documentation/devicetree/bindings/display/panel/panel-simple.yaml +++ b/Documentation/devicetree/bindings/display/panel/panel-simple.yaml @@ -206,12 +206,16 @@ properties: - mitsubishi,aa070mc01-ca1 # Mitsubishi AA084XE01 8.4" XGA TFT LCD panel - mitsubishi,aa084xe01 + # Multi-Inno Technology Co.,Ltd MI0700A2T-30 7" 800x480 TFT Resistive Touch Module + - multi-inno,mi0700a2t-30 # Multi-Inno Technology Co.,Ltd MI0700S4T-6 7" 800x480 TFT Resistive Touch Module - multi-inno,mi0700s4t-6 # Multi-Inno Technology Co.,Ltd MI0800FT-9 8" 800x600 TFT Resistive Touch Module - multi-inno,mi0800ft-9 # Multi-Inno Technology Co.,Ltd MI1010AIT-1CP 10.1" 1280x800 LVDS IPS Cap Touch Mod. - multi-inno,mi1010ait-1cp + # Multi-Inno Technology Co.,Ltd MI1010Z1T-1CP11 10.1" 1024x600 TFT Resistive Touch Module + - multi-inno,mi1010z1t-1cp11 # NEC LCD Technologies, Ltd. 12.1" WXGA (1280x800) LVDS TFT LCD panel - nec,nl12880bc20-05 # NEC LCD Technologies,Ltd. WQVGA TFT LCD panel @@ -280,10 +284,14 @@ properties: - team-source-display,tst043015cmhx # Tianma Micro-electronics TM070JDHG30 7.0" WXGA TFT LCD panel - tianma,tm070jdhg30 + # Tianma Micro-electronics TM070JDHG34-00 7.0" WXGA (1280x800) LVDS TFT LCD panel + - tianma,tm070jdhg34-00 # Tianma Micro-electronics TM070JVHG33 7.0" WXGA TFT LCD panel - tianma,tm070jvhg33 # Tianma Micro-electronics TM070RVHG71 7.0" WXGA TFT LCD panel - tianma,tm070rvhg71 + # Topland TIAN-G07017-01 7.0" WSVGA TFT-LCD panel with capacitive touch + - topland,tian-g07017-01 # Toshiba 8.9" WXGA (1280x768) TFT LCD panel - toshiba,lt089ac29000 # TPK U.S.A. 
LLC Fusion 7" 800 x 480 (WVGA) LCD panel with capacitive touch diff --git a/Documentation/devicetree/bindings/display/panel/samsung,atna33xc20.yaml b/Documentation/devicetree/bindings/display/panel/samsung,atna33xc20.yaml index 032f783eefc4..684c2896d238 100644 --- a/Documentation/devicetree/bindings/display/panel/samsung,atna33xc20.yaml +++ b/Documentation/devicetree/bindings/display/panel/samsung,atna33xc20.yaml @@ -23,6 +23,8 @@ properties: - samsung,atna45af01 # Samsung 14.5" 3K (2944x1840 pixels) eDP AMOLED panel - samsung,atna45dc02 + # Samsung 15.6" 3K (2880x1620 pixels) eDP AMOLED panel + - samsung,atna56ac03 - const: samsung,atna33xc20 enable-gpios: true diff --git a/Documentation/devicetree/bindings/display/renesas,du.yaml b/Documentation/devicetree/bindings/display/renesas,du.yaml index c5b9e6812bce..3880b4c2ea9a 100644 --- a/Documentation/devicetree/bindings/display/renesas,du.yaml +++ b/Documentation/devicetree/bindings/display/renesas,du.yaml @@ -41,6 +41,7 @@ properties: - renesas,du-r8a77995 # for R-Car D3 compatible DU - renesas,du-r8a779a0 # for R-Car V3U compatible DU - renesas,du-r8a779g0 # for R-Car V4H compatible DU + - renesas,du-r8a779h0 # for R-Car V4M compatible DU reg: maxItems: 1 @@ -69,14 +70,12 @@ properties: $ref: /schemas/graph.yaml#/properties/port unevaluatedProperties: false - required: - - port@0 - - port@1 - unevaluatedProperties: false renesas,cmms: $ref: /schemas/types.yaml#/definitions/phandle-array + minItems: 1 + maxItems: 4 items: maxItems: 1 description: @@ -85,6 +84,8 @@ properties: renesas,vsps: $ref: /schemas/types.yaml#/definitions/phandle-array + minItems: 1 + maxItems: 4 items: items: - description: phandle to VSP instance that serves the DU channel @@ -489,9 +490,11 @@ allOf: renesas,cmms: minItems: 4 + maxItems: 4 renesas,vsps: minItems: 4 + maxItems: 4 required: - clock-names @@ -558,9 +561,11 @@ allOf: renesas,cmms: minItems: 3 + maxItems: 3 renesas,vsps: minItems: 3 + maxItems: 3 required: - clock-names @@ 
-627,9 +632,11 @@ allOf: renesas,cmms: minItems: 3 + maxItems: 3 renesas,vsps: minItems: 3 + maxItems: 3 required: - clock-names @@ -683,7 +690,7 @@ allOf: - port@1 renesas,vsps: - minItems: 1 + maxItems: 1 required: - clock-names @@ -746,9 +753,11 @@ allOf: renesas,cmms: minItems: 2 + maxItems: 2 renesas,vsps: minItems: 2 + maxItems: 2 required: - clock-names @@ -799,6 +808,54 @@ allOf: renesas,vsps: minItems: 2 + maxItems: 2 + + required: + - clock-names + - interrupts + - resets + - reset-names + - renesas,vsps + + - if: + properties: + compatible: + contains: + enum: + - renesas,du-r8a779h0 + then: + properties: + clocks: + items: + - description: Functional clock + + clock-names: + items: + - const: du.0 + + interrupts: + maxItems: 1 + + resets: + maxItems: 1 + + reset-names: + items: + - const: du.0 + + ports: + properties: + port@0: + description: DSI 0 + port@1: false + port@2: false + port@3: false + + required: + - port@0 + + renesas,vsps: + maxItems: 1 required: - clock-names diff --git a/Documentation/devicetree/bindings/display/rockchip/rockchip,rk3588-mipi-dsi2.yaml b/Documentation/devicetree/bindings/display/rockchip/rockchip,rk3588-mipi-dsi2.yaml new file mode 100644 index 000000000000..53384e47b507 --- /dev/null +++ b/Documentation/devicetree/bindings/display/rockchip/rockchip,rk3588-mipi-dsi2.yaml @@ -0,0 +1,120 @@ +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/display/rockchip/rockchip,rk3588-mipi-dsi2.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: Rockchip specific extensions to the Synopsys Designware MIPI DSI2 + +maintainers: + - Heiko Stuebner + +properties: + compatible: + enum: + - rockchip,rk3588-mipi-dsi2 + + reg: + maxItems: 1 + + interrupts: + maxItems: 1 + + clocks: + maxItems: 2 + + clock-names: + items: + - const: pclk + - const: sys + + rockchip,grf: + $ref: /schemas/types.yaml#/definitions/phandle + description: + This SoC uses GRF regs to 
switch between vopl/vopb. + + phys: + maxItems: 1 + + phy-names: + const: dcphy + + power-domains: + maxItems: 1 + + resets: + maxItems: 1 + + reset-names: + const: apb + + ports: + $ref: /schemas/graph.yaml#/properties/ports + + properties: + port@0: + $ref: /schemas/graph.yaml#/properties/port + description: Input node to receive pixel data. + + port@1: + $ref: /schemas/graph.yaml#/properties/port + description: DSI output node to panel. + + required: + - port@0 + - port@1 + +required: + - compatible + - clocks + - clock-names + - rockchip,grf + - phys + - phy-names + - ports + - reg + +allOf: + - $ref: /schemas/display/dsi-controller.yaml# + +unevaluatedProperties: false + +examples: + - | + #include + #include + #include + #include + #include + #include + + soc { + #address-cells = <2>; + #size-cells = <2>; + + dsi@fde20000 { + compatible = "rockchip,rk3588-mipi-dsi2"; + reg = <0x0 0xfde20000 0x0 0x10000>; + interrupts = ; + clocks = <&cru PCLK_DSIHOST0>, <&cru CLK_DSIHOST0>; + clock-names = "pclk", "sys"; + resets = <&cru SRST_P_DSIHOST0>; + reset-names = "apb"; + power-domains = <&power RK3588_PD_VOP>; + phys = <&mipidcphy0 PHY_TYPE_DPHY>; + phy-names = "dcphy"; + rockchip,grf = <&vop_grf>; + + ports { + #address-cells = <1>; + #size-cells = <0>; + dsi0_in: port@0 { + reg = <0>; + }; + + dsi0_out: port@1 { + reg = <1>; + }; + }; + }; + }; diff --git a/Documentation/devicetree/bindings/display/xlnx/xlnx,zynqmp-dpsub.yaml b/Documentation/devicetree/bindings/display/xlnx/xlnx,zynqmp-dpsub.yaml index 554f9d5809d4..6b754d4f260e 100644 --- a/Documentation/devicetree/bindings/display/xlnx/xlnx,zynqmp-dpsub.yaml +++ b/Documentation/devicetree/bindings/display/xlnx/xlnx,zynqmp-dpsub.yaml @@ -100,12 +100,16 @@ properties: - description: Video layer, plane 1 (U/V or U) - description: Video layer, plane 2 (V) - description: Graphics layer + - description: Audio channel 0 + - description: Audio channel 1 dma-names: items: - const: vid0 - const: vid1 - const: vid2 - const: 
gfx0 + - const: aud0 + - const: aud1 phys: description: PHYs for the DP data lanes @@ -194,11 +198,13 @@ examples: power-domains = <&pd_dp>; resets = <&reset ZYNQMP_RESET_DP>; - dma-names = "vid0", "vid1", "vid2", "gfx0"; + dma-names = "vid0", "vid1", "vid2", "gfx0", "aud0", "aud1"; dmas = <&xlnx_dpdma 0>, <&xlnx_dpdma 1>, <&xlnx_dpdma 2>, - <&xlnx_dpdma 3>; + <&xlnx_dpdma 3>, + <&xlnx_dpdma 4>, + <&xlnx_dpdma 5>; phys = <&psgtr 1 PHY_TYPE_DP 0 3>, <&psgtr 0 PHY_TYPE_DP 1 3>; diff --git a/Documentation/devicetree/bindings/iio/st,st-sensors.yaml b/Documentation/devicetree/bindings/iio/st,st-sensors.yaml index 71c1ee33a393..e955eb8e8797 100644 --- a/Documentation/devicetree/bindings/iio/st,st-sensors.yaml +++ b/Documentation/devicetree/bindings/iio/st,st-sensors.yaml @@ -65,6 +65,7 @@ properties: - st,lsm9ds0-gyro - description: STMicroelectronics Magnetometers enum: + - st,iis2mdc - st,lis2mdl - st,lis3mdl-magn - st,lsm303agr-magn diff --git a/Documentation/devicetree/bindings/mtd/partitions/fixed-partitions.yaml b/Documentation/devicetree/bindings/mtd/partitions/fixed-partitions.yaml index 058253d6d889..62086366837c 100644 --- a/Documentation/devicetree/bindings/mtd/partitions/fixed-partitions.yaml +++ b/Documentation/devicetree/bindings/mtd/partitions/fixed-partitions.yaml @@ -82,7 +82,7 @@ examples: uimage@100000 { reg = <0x0100000 0x200000>; - compress = "lzma"; + compression = "lzma"; }; }; diff --git a/Documentation/devicetree/bindings/net/pse-pd/pse-controller.yaml b/Documentation/devicetree/bindings/net/pse-pd/pse-controller.yaml index a12cda8aa764..cd09560e0aea 100644 --- a/Documentation/devicetree/bindings/net/pse-pd/pse-controller.yaml +++ b/Documentation/devicetree/bindings/net/pse-pd/pse-controller.yaml @@ -81,7 +81,7 @@ properties: List of phandles, each pointing to the power supply for the corresponding pairset named in 'pairset-names'. This property aligns with IEEE 802.3-2022, Section 33.2.3 and 145.2.4. 
- PSE Pinout Alternatives (as per IEEE 802.3-2022 Table 145\u20133) + PSE Pinout Alternatives (as per IEEE 802.3-2022 Table 145-3) |-----------|---------------|---------------|---------------|---------------| | Conductor | Alternative A | Alternative A | Alternative B | Alternative B | | | (MDI-X) | (MDI) | (X) | (S) | diff --git a/Documentation/devicetree/bindings/phy/fsl,imx8mq-usb-phy.yaml b/Documentation/devicetree/bindings/phy/fsl,imx8mq-usb-phy.yaml index 6d6d211883ae..daee0c0fc915 100644 --- a/Documentation/devicetree/bindings/phy/fsl,imx8mq-usb-phy.yaml +++ b/Documentation/devicetree/bindings/phy/fsl,imx8mq-usb-phy.yaml @@ -113,11 +113,8 @@ allOf: maxItems: 1 - if: - properties: - compatible: - contains: - enum: - - fsl,imx95-usb-phy + required: + - orientation-switch then: $ref: /schemas/usb/usb-switch.yaml# diff --git a/Documentation/devicetree/bindings/power/mediatek,power-controller.yaml b/Documentation/devicetree/bindings/power/mediatek,power-controller.yaml index 6d37c06b2f65..591a080ca3ff 100644 --- a/Documentation/devicetree/bindings/power/mediatek,power-controller.yaml +++ b/Documentation/devicetree/bindings/power/mediatek,power-controller.yaml @@ -55,6 +55,10 @@ patternProperties: patternProperties: "^power-domain@[0-9a-f]+$": $ref: "#/$defs/power-domain-node" + patternProperties: + "^power-domain@[0-9a-f]+$": + $ref: "#/$defs/power-domain-node" + unevaluatedProperties: false unevaluatedProperties: false unevaluatedProperties: false unevaluatedProperties: false diff --git a/Documentation/devicetree/bindings/regulator/qcom,qca6390-pmu.yaml b/Documentation/devicetree/bindings/regulator/qcom,qca6390-pmu.yaml index ca401a209cca..47c425c9fff1 100644 --- a/Documentation/devicetree/bindings/regulator/qcom,qca6390-pmu.yaml +++ b/Documentation/devicetree/bindings/regulator/qcom,qca6390-pmu.yaml @@ -18,6 +18,7 @@ properties: compatible: enum: - qcom,qca6390-pmu + - qcom,wcn6750-pmu - qcom,wcn6855-pmu - qcom,wcn7850-pmu @@ -27,6 +28,9 @@ properties: 
vddaon-supply: description: VDD_AON supply regulator handle + vddasd-supply: + description: VDD_ASD supply regulator handle + vdddig-supply: description: VDD_DIG supply regulator handle @@ -42,6 +46,9 @@ properties: vddio1p2-supply: description: VDD_IO_1P2 supply regulator handle + vddrfa0p8-supply: + description: VDD_RFA_0P8 supply regulator handle + vddrfa0p95-supply: description: VDD_RFA_0P95 supply regulator handle @@ -51,12 +58,18 @@ properties: vddrfa1p3-supply: description: VDD_RFA_1P3 supply regulator handle + vddrfa1p7-supply: + description: VDD_RFA_1P7 supply regulator handle + vddrfa1p8-supply: description: VDD_RFA_1P8 supply regulator handle vddrfa1p9-supply: description: VDD_RFA_1P9 supply regulator handle + vddrfa2p2-supply: + description: VDD_RFA_2P2 supply regulator handle + vddpcie1p3-supply: description: VDD_PCIE_1P3 supply regulator handle @@ -119,6 +132,20 @@ allOf: - vddpcie1p3-supply - vddpcie1p9-supply - vddio-supply + - if: + properties: + compatible: + contains: + const: qcom,wcn6750-pmu + then: + required: + - vddaon-supply + - vddasd-supply + - vddpmu-supply + - vddrfa0p8-supply + - vddrfa1p2-supply + - vddrfa1p7-supply + - vddrfa2p2-supply - if: properties: compatible: diff --git a/Documentation/devicetree/bindings/soc/fsl/fsl,qman-portal.yaml b/Documentation/devicetree/bindings/soc/fsl/fsl,qman-portal.yaml index 17016184143f..e459fec02ba8 100644 --- a/Documentation/devicetree/bindings/soc/fsl/fsl,qman-portal.yaml +++ b/Documentation/devicetree/bindings/soc/fsl/fsl,qman-portal.yaml @@ -35,6 +35,7 @@ properties: fsl,liodn: $ref: /schemas/types.yaml#/definitions/uint32-array + maxItems: 2 description: See pamu.txt. Two LIODN(s). 
DQRR LIODN (DLIODN) and Frame LIODN (FLIODN) @@ -69,6 +70,7 @@ patternProperties: type: object properties: fsl,liodn: + $ref: /schemas/types.yaml#/definitions/uint32-array description: See pamu.txt, PAMU property used for static LIODN assignment fsl,iommu-parent: diff --git a/Documentation/devicetree/bindings/sound/realtek,rt5645.yaml b/Documentation/devicetree/bindings/sound/realtek,rt5645.yaml index 13f09f1bc800..0a698798c22b 100644 --- a/Documentation/devicetree/bindings/sound/realtek,rt5645.yaml +++ b/Documentation/devicetree/bindings/sound/realtek,rt5645.yaml @@ -51,7 +51,7 @@ properties: description: Power supply for AVDD, providing 1.8V. cpvdd-supply: - description: Power supply for CPVDD, providing 3.5V. + description: Power supply for CPVDD, providing 1.8V. hp-detect-gpios: description: diff --git a/Documentation/devicetree/bindings/vendor-prefixes.yaml b/Documentation/devicetree/bindings/vendor-prefixes.yaml index da01616802c7..42d14899d584 100644 --- a/Documentation/devicetree/bindings/vendor-prefixes.yaml +++ b/Documentation/devicetree/bindings/vendor-prefixes.yaml @@ -1524,6 +1524,8 @@ patternProperties: description: Topeet "^topic,.*": description: Topic Embedded Systems + "^topland,.*": + description: Topland Electronics (H.K) Co., Ltd. 
"^toppoly,.*": description: TPO (deprecated, use tpo) deprecated: true diff --git a/Documentation/devicetree/bindings/watchdog/airoha,en7581-wdt.yaml b/Documentation/devicetree/bindings/watchdog/airoha,en7581-wdt.yaml new file mode 100644 index 000000000000..6bbab3cb28e5 --- /dev/null +++ b/Documentation/devicetree/bindings/watchdog/airoha,en7581-wdt.yaml @@ -0,0 +1,47 @@ +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/watchdog/airoha,en7581-wdt.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: Airoha EN7581 Watchdog Timer + +maintainers: + - Christian Marangi + +allOf: + - $ref: watchdog.yaml# + +properties: + compatible: + const: airoha,en7581-wdt + + reg: + maxItems: 1 + + clocks: + description: BUS clock (timer ticks at half the BUS clock) + maxItems: 1 + + clock-names: + const: bus + +required: + - compatible + - reg + - clocks + - clock-names + +unevaluatedProperties: false + +examples: + - | + #include + + watchdog@1fbf0100 { + compatible = "airoha,en7581-wdt"; + reg = <0x1fbf0100 0x3c>; + + clocks = <&scuclk EN7523_CLK_BUS>; + clock-names = "bus"; + }; diff --git a/Documentation/devicetree/bindings/watchdog/fsl-imx-wdt.yaml b/Documentation/devicetree/bindings/watchdog/fsl-imx-wdt.yaml index 36b836d0620c..0da953cb7127 100644 --- a/Documentation/devicetree/bindings/watchdog/fsl-imx-wdt.yaml +++ b/Documentation/devicetree/bindings/watchdog/fsl-imx-wdt.yaml @@ -48,6 +48,8 @@ properties: clocks: maxItems: 1 + big-endian: true + fsl,ext-reset-output: $ref: /schemas/types.yaml#/definitions/flag description: | @@ -93,6 +95,18 @@ allOf: properties: fsl,suspend-in-wait: false + - if: + not: + properties: + compatible: + contains: + enum: + - fsl,ls1012a-wdt + - fsl,ls1043a-wdt + then: + properties: + big-endian: false + unevaluatedProperties: false examples: diff --git a/Documentation/devicetree/bindings/watchdog/qcom-wdt.yaml 
b/Documentation/devicetree/bindings/watchdog/qcom-wdt.yaml index 932393f8c649..34896a39fa91 100644 --- a/Documentation/devicetree/bindings/watchdog/qcom-wdt.yaml +++ b/Documentation/devicetree/bindings/watchdog/qcom-wdt.yaml @@ -26,6 +26,8 @@ properties: - qcom,apss-wdt-msm8994 - qcom,apss-wdt-qcm2290 - qcom,apss-wdt-qcs404 + - qcom,apss-wdt-qcs615 + - qcom,apss-wdt-qcs8300 - qcom,apss-wdt-sa8255p - qcom,apss-wdt-sa8775p - qcom,apss-wdt-sc7180 diff --git a/Documentation/devicetree/bindings/watchdog/samsung-wdt.yaml b/Documentation/devicetree/bindings/watchdog/samsung-wdt.yaml index 77a5ddd0426e..d175ae968336 100644 --- a/Documentation/devicetree/bindings/watchdog/samsung-wdt.yaml +++ b/Documentation/devicetree/bindings/watchdog/samsung-wdt.yaml @@ -26,6 +26,7 @@ properties: - samsung,exynos7-wdt # for Exynos7 - samsung,exynos850-wdt # for Exynos850 - samsung,exynosautov9-wdt # for Exynosautov9 + - samsung,exynosautov920-wdt # for Exynosautov920 - items: - enum: - tesla,fsd-wdt @@ -77,6 +78,7 @@ allOf: - samsung,exynos7-wdt - samsung,exynos850-wdt - samsung,exynosautov9-wdt + - samsung,exynosautov920-wdt then: required: - samsung,syscon-phandle @@ -88,6 +90,7 @@ allOf: - google,gs101-wdt - samsung,exynos850-wdt - samsung,exynosautov9-wdt + - samsung,exynosautov920-wdt then: properties: clocks: diff --git a/Documentation/gpu/drm-compute.rst b/Documentation/gpu/drm-compute.rst new file mode 100644 index 000000000000..f90c3e63aa9e --- /dev/null +++ b/Documentation/gpu/drm-compute.rst @@ -0,0 +1,54 @@ +================================== +Long running workloads and compute +================================== + +Long running workloads (compute) are workloads that will not complete in 10 +seconds (the time a user will wait before reaching for the power button). +This means that other techniques need to be used to manage those workloads, +that cannot use fences.
+ +Some hardware may schedule compute jobs, and have no way to preempt them, or +have their memory swapped out from them. Or they simply want their workload +not to be preempted or swapped out at all. + +This means that it differs from what is described in driver-api/dma-buf.rst. + +As with normal compute jobs, dma-fence may not be used at all. In this case, +not even to force preemption. The driver is simply forced to unmap a BO +from the long compute job's address space on unbind immediately, not even +waiting for the workload to complete. Effectively this terminates the workload +when there is no hardware support to recover. + +Since this is undesirable, there need to be mitigations to prevent a workload +from being terminated. There are several possible approaches, all with their +advantages and drawbacks. + +The first approach you will likely try is to pin all buffers used by compute. +This guarantees that the job will run uninterrupted, but also allows a +denial of service attack by pinning as much memory as possible, hogging +all the GPU memory, and possibly a huge chunk of CPU memory. + +A second approach that will work slightly better on its own is adding an option +not to evict when creating a new job (any kind). If all of userspace opts in +to this flag, it would prevent cooperating userspace from forcibly terminating +older compute jobs to start a new one. + +If job preemption and recoverable pagefaults are not available, those are the +only approaches possible. So even with those, you want a separate way of +controlling resources. The standard kernel way of doing so is cgroups. + +This creates a third option, using cgroups to prevent eviction. Both GPU and +driver-allocated CPU memory would be accounted to the correct cgroup, and +eviction would be made cgroup aware. This allows the GPU to be partitioned +into cgroups that will allow jobs to run next to each other without +interference.
+ +The interface to the cgroup would be similar to the current CPU memory +interface, with similar semantics for min/low/high/max, if eviction can +be made cgroup aware. + +What should be noted is that each memory region (tiled memory for example) +should have its own accounting. + +The key is set to the regionid set by the driver, for example "tile0". +For the value of $card, we use drmGetUnique(). diff --git a/Documentation/gpu/drm-kms-helpers.rst b/Documentation/gpu/drm-kms-helpers.rst index 8cf2f041af47..b4ee25af1702 100644 --- a/Documentation/gpu/drm-kms-helpers.rst +++ b/Documentation/gpu/drm-kms-helpers.rst @@ -221,6 +221,9 @@ Panel Helper Reference .. kernel-doc:: drivers/gpu/drm/drm_panel_orientation_quirks.c :export: +.. kernel-doc:: drivers/gpu/drm/drm_panel_backlight_quirks.c + :export: + Panel Self Refresh Helper Reference =================================== diff --git a/Documentation/gpu/drm-usage-stats.rst b/Documentation/gpu/drm-usage-stats.rst index 2717cb2a597e..b7fc106dad99 100644 --- a/Documentation/gpu/drm-usage-stats.rst +++ b/Documentation/gpu/drm-usage-stats.rst @@ -145,57 +145,57 @@ both. Memory ^^^^^^ -- drm-memory-: [KiB|MiB] - -Each possible memory type which can be used to store buffer objects by the -GPU in question shall be given a stable and unique name to be returned as the -string here. +Each possible memory type which can be used to store buffer objects by the GPU +in question shall be given a stable and unique name to be used as the "" +string. The region name "memory" is reserved to refer to normal system memory. -Value shall reflect the amount of storage currently consumed by the buffer +The value shall reflect the amount of storage currently consumed by the buffer objects belong to this client, in the respective memory region. Default unit shall be bytes with optional unit specifiers of 'KiB' or 'MiB' indicating kibi- or mebi-bytes. -This key is deprecated and is an alias for drm-resident-. 
Only one of -the two should be present in the output. +- drm-total-: [KiB|MiB] + +The total size of all requested buffers, including both shared and private +memory. The backing store for the buffers does not need to be currently +instantiated to count under this category. To avoid double-counting, if a buffer +has multiple regions where it can be allocated to, the implementation should +consistently select a single region for accounting purposes. - drm-shared-: [KiB|MiB] -The total size of buffers that are shared with another file (e.g., have more -than a single handle). - -- drm-total-: [KiB|MiB] - -The total size of all created buffers including shared and private memory. The -backing store for the buffers does not have to be currently instantiated to be -counted under this category. +The total size of buffers that are shared with another file (i.e., have more +than one handle). The same requirement to avoid double-counting that applies to +drm-total- also applies here. - drm-resident-: [KiB|MiB] -The total size of buffers that are resident (have their backing store present or -instantiated) in the specified region. +The total size of buffers that are resident (i.e., have their backing store +present or instantiated) in the specified region. -This is an alias for drm-memory- and only one of the two should be -present in the output. +- drm-memory-: [KiB|MiB] + +This key is deprecated and is only printed by amdgpu; it is an alias for +drm-resident-. - drm-purgeable-: [KiB|MiB] -The total size of buffers that are purgeable. +The total size of buffers that are resident and purgeable. -For example drivers which implement a form of 'madvise' like functionality can -here count buffers which have instantiated backing store, but have been marked -with an equivalent of MADV_DONTNEED. +For example, drivers that implement functionality similar to 'madvise' can count +buffers that have instantiated backing stores but have been marked with an +equivalent of MADV_DONTNEED. 
- drm-active-: [KiB|MiB] The total size of buffers that are active on one or more engines. -One practical example of this can be presence of unsignaled fences in an GEM -buffer reservation object. Therefore the active category is a subset of -resident. +One practical example of this could be the presence of unsignaled fences in a +GEM buffer reservation object. Therefore, the active category is a subset of the +resident category. Implementation Details ====================== diff --git a/Documentation/gpu/index.rst b/Documentation/gpu/index.rst index 37e383ccf73f..7dcb15850afd 100644 --- a/Documentation/gpu/index.rst +++ b/Documentation/gpu/index.rst @@ -13,6 +13,7 @@ GPU Driver Developer's Guide drm-usage-stats driver-uapi drm-client + drm-compute drivers backlight vga-switcheroo diff --git a/Documentation/gpu/xe/index.rst b/Documentation/gpu/xe/index.rst index 3f07aa3b5432..92cfb25e64d3 100644 --- a/Documentation/gpu/xe/index.rst +++ b/Documentation/gpu/xe/index.rst @@ -23,4 +23,5 @@ DG2, etc is provided to prototype the driver. xe_firmware xe_tile xe_debugging + xe_devcoredump xe-drm-usage-stats.rst diff --git a/Documentation/gpu/xe/xe_devcoredump.rst b/Documentation/gpu/xe/xe_devcoredump.rst new file mode 100644 index 000000000000..ae4ec0e34dc0 --- /dev/null +++ b/Documentation/gpu/xe/xe_devcoredump.rst @@ -0,0 +1,14 @@ +.. SPDX-License-Identifier: (GPL-2.0+ OR MIT) + +================== +Xe Device Coredump +================== + +.. kernel-doc:: drivers/gpu/drm/xe/xe_devcoredump.c + :doc: Xe device coredump + +Internal API +============ + +.. kernel-doc:: drivers/gpu/drm/xe/xe_devcoredump.c + :internal: diff --git a/Documentation/gpu/zynqmp.rst b/Documentation/gpu/zynqmp.rst index f57bfa0ad6ec..1a6f9193de22 100644 --- a/Documentation/gpu/zynqmp.rst +++ b/Documentation/gpu/zynqmp.rst @@ -144,6 +144,4 @@ Internals .. kernel-doc:: drivers/gpu/drm/xlnx/zynqmp_dp.c -.. kernel-doc:: drivers/gpu/drm/xlnx/zynqmp_dpsub.c - .. 
kernel-doc:: drivers/gpu/drm/xlnx/zynqmp_kms.c diff --git a/Documentation/mm/process_addrs.rst b/Documentation/mm/process_addrs.rst index e8618fbc62c9..1d416658d7f5 100644 --- a/Documentation/mm/process_addrs.rst +++ b/Documentation/mm/process_addrs.rst @@ -3,3 +3,853 @@ ================= Process Addresses ================= + +.. toctree:: + :maxdepth: 3 + + +Userland memory ranges are tracked by the kernel via Virtual Memory Areas or +'VMA's of type :c:struct:`!struct vm_area_struct`. + +Each VMA describes a virtually contiguous memory range with identical +attributes, each described by a :c:struct:`!struct vm_area_struct` +object. Userland access outside of VMAs is invalid except in the case where an +adjacent stack VMA could be extended to contain the accessed address. + +All VMAs are contained within one and only one virtual address space, described +by a :c:struct:`!struct mm_struct` object which is referenced by all tasks (that is, +threads) which share the virtual address space. We refer to this as the +:c:struct:`!mm`. + +Each mm object contains a maple tree data structure which describes all VMAs +within the virtual address space. + +.. note:: An exception to this is the 'gate' VMA which is provided by + architectures which use :c:struct:`!vsyscall` and is a global static + object which does not belong to any specific mm. + +------- +Locking +------- + +The kernel is designed to be highly scalable against concurrent read operations +on VMA **metadata** so a complicated set of locks are required to ensure memory +corruption does not occur. + +.. note:: Locking VMAs for their metadata does not have any impact on the memory + they describe nor the page tables that map them. + +Terminology +----------- + +* **mmap locks** - Each MM has a read/write semaphore :c:member:`!mmap_lock` + which locks at a process address space granularity which can be acquired via + :c:func:`!mmap_read_lock`, :c:func:`!mmap_write_lock` and variants. 
+* **VMA locks** - The VMA lock is at VMA granularity (of course) which behaves + as a read/write semaphore in practice. A VMA read lock is obtained via + :c:func:`!lock_vma_under_rcu` (and unlocked via :c:func:`!vma_end_read`) and a + write lock via :c:func:`!vma_start_write` (all VMA write locks are unlocked + automatically when the mmap write lock is released). To take a VMA write lock + you **must** have already acquired an :c:func:`!mmap_write_lock`. +* **rmap locks** - When trying to access VMAs through the reverse mapping via a + :c:struct:`!struct address_space` or :c:struct:`!struct anon_vma` object + (reachable from a folio via :c:member:`!folio->mapping`). VMAs must be stabilised via + :c:func:`!anon_vma_[try]lock_read` or :c:func:`!anon_vma_[try]lock_write` for + anonymous memory and :c:func:`!i_mmap_[try]lock_read` or + :c:func:`!i_mmap_[try]lock_write` for file-backed memory. We refer to these + locks as the reverse mapping locks, or 'rmap locks' for brevity. + +We discuss page table locks separately in the dedicated section below. + +The first thing **any** of these locks achieve is to **stabilise** the VMA +within the MM tree. That is, guaranteeing that the VMA object will not be +deleted from under you nor modified (except for some specific fields +described below). + +Stabilising a VMA also keeps the address space described by it around. + +Lock usage +---------- + +If you want to **read** VMA metadata fields or just keep the VMA stable, you +must do one of the following: + +* Obtain an mmap read lock at the MM granularity via :c:func:`!mmap_read_lock` (or a + suitable variant), unlocking it with a matching :c:func:`!mmap_read_unlock` when + you're done with the VMA, *or* +* Try to obtain a VMA read lock via :c:func:`!lock_vma_under_rcu`. 
This tries to + acquire the lock atomically so might fail, in which case fall-back logic is + required to instead obtain an mmap read lock if this returns :c:macro:`!NULL`, + *or* +* Acquire an rmap lock before traversing the locked interval tree (whether + anonymous or file-backed) to obtain the required VMA. + +If you want to **write** VMA metadata fields, then things vary depending on the +field (we explore each VMA field in detail below). For the majority you must: + +* Obtain an mmap write lock at the MM granularity via :c:func:`!mmap_write_lock` (or a + suitable variant), unlocking it with a matching :c:func:`!mmap_write_unlock` when + you're done with the VMA, *and* +* Obtain a VMA write lock via :c:func:`!vma_start_write` for each VMA you wish to + modify, which will be released automatically when :c:func:`!mmap_write_unlock` is + called. +* If you want to be able to write to **any** field, you must also hide the VMA + from the reverse mapping by obtaining an **rmap write lock**. + +VMA locks are special in that you must obtain an mmap **write** lock **first** +in order to obtain a VMA **write** lock. A VMA **read** lock however can be +obtained without any other lock (:c:func:`!lock_vma_under_rcu` will acquire then +release an RCU lock to lookup the VMA for you). + +This constrains the impact of writers on readers, as a writer can interact with +one VMA while a reader interacts with another simultaneously. + +.. note:: The primary users of VMA read locks are page fault handlers, which + means that without a VMA write lock, page faults will run concurrent with + whatever you are doing. + +Examining all valid lock states: + +.. table:: + + ========= ======== ========= ======= ===== =========== ========== + mmap lock VMA lock rmap lock Stable? Read? Write most? Write all? 
+ ========= ======== ========= ======= ===== =========== ========== + \- \- \- N N N N + \- R \- Y Y N N + \- \- R/W Y Y N N + R/W \-/R \-/R/W Y Y N N + W W \-/R Y Y Y N + W W W Y Y Y Y + ========= ======== ========= ======= ===== =========== ========== + +.. warning:: While it's possible to obtain a VMA lock while holding an mmap read lock, + attempting to do the reverse is invalid as it can result in deadlock - if + another task already holds an mmap write lock and attempts to acquire a VMA + write lock that will deadlock on the VMA read lock. + +All of these locks behave as read/write semaphores in practice, so you can +obtain either a read or a write lock for each of these. + +.. note:: Generally speaking, a read/write semaphore is a class of lock which + permits concurrent readers. However a write lock can only be obtained + once all readers have left the critical region (and pending readers + made to wait). + + This renders read locks on a read/write semaphore concurrent with other + readers and write locks exclusive against all others holding the semaphore. + +VMA fields +^^^^^^^^^^ + +We can subdivide :c:struct:`!struct vm_area_struct` fields by their purpose, which makes it +easier to explore their locking characteristics: + +.. note:: We exclude VMA lock-specific fields here to avoid confusion, as these + are in effect an internal implementation detail. + +.. table:: Virtual layout fields + + ===================== ======================================== =========== + Field Description Write lock + ===================== ======================================== =========== + :c:member:`!vm_start` Inclusive start virtual address of range mmap write, + VMA describes. VMA write, + rmap write. + :c:member:`!vm_end` Exclusive end virtual address of range mmap write, + VMA describes. VMA write, + rmap write. 
+ :c:member:`!vm_pgoff` Describes the page offset into the file, mmap write, + the original page offset within the VMA write, + virtual address space (prior to any rmap write. + :c:func:`!mremap`), or PFN if a PFN map + and the architecture does not support + :c:macro:`!CONFIG_ARCH_HAS_PTE_SPECIAL`. + ===================== ======================================== =========== + +These fields describe the size, start and end of the VMA, and as such cannot be +modified without first being hidden from the reverse mapping since these fields +are used to locate VMAs within the reverse mapping interval trees. + +.. table:: Core fields + + ============================ ======================================== ========================= + Field Description Write lock + ============================ ======================================== ========================= + :c:member:`!vm_mm` Containing mm_struct. None - written once on + initial map. + :c:member:`!vm_page_prot` Architecture-specific page table mmap write, VMA write. + protection bits determined from VMA + flags. + :c:member:`!vm_flags` Read-only access to VMA flags describing N/A + attributes of the VMA, in union with + private writable + :c:member:`!__vm_flags`. + :c:member:`!__vm_flags` Private, writable access to VMA flags mmap write, VMA write. + field, updated by + :c:func:`!vm_flags_*` functions. + :c:member:`!vm_file` If the VMA is file-backed, points to a None - written once on + struct file object describing the initial map. + underlying file, if anonymous then + :c:macro:`!NULL`. + :c:member:`!vm_ops` If the VMA is file-backed, then either None - Written once on + the driver or file-system provides a initial map by + :c:struct:`!struct vm_operations_struct` :c:func:`!f_ops->mmap()`. + object describing callbacks to be + invoked on VMA lifetime events. + :c:member:`!vm_private_data` A :c:member:`!void *` field for Handled by driver. + driver-specific metadata.
+ ============================ ======================================== ========================= + +These are the core fields which describe the MM the VMA belongs to and its attributes. + +.. table:: Config-specific fields + + ================================= ===================== ======================================== =============== + Field Configuration option Description Write lock + ================================= ===================== ======================================== =============== + :c:member:`!anon_name` CONFIG_ANON_VMA_NAME A field for storing a mmap write, + :c:struct:`!struct anon_vma_name` VMA write. + object providing a name for anonymous + mappings, or :c:macro:`!NULL` if none + is set or the VMA is file-backed. The + underlying object is reference counted + and can be shared across multiple VMAs + for scalability. + :c:member:`!swap_readahead_info` CONFIG_SWAP Metadata used by the swap mechanism mmap read, + to perform readahead. This field is swap-specific + accessed atomically. lock. + :c:member:`!vm_policy` CONFIG_NUMA :c:type:`!mempolicy` object which mmap write, + describes the NUMA behaviour of the VMA write. + VMA. The underlying object is reference + counted. + :c:member:`!numab_state` CONFIG_NUMA_BALANCING :c:type:`!vma_numab_state` object which mmap read, + describes the current state of numab-specific + NUMA balancing in relation to this VMA. lock. + Updated under mmap read lock by + :c:func:`!task_numa_work`. + :c:member:`!vm_userfaultfd_ctx` CONFIG_USERFAULTFD Userfaultfd context wrapper object of mmap write, + type :c:type:`!vm_userfaultfd_ctx`, VMA write. + either of zero size if userfaultfd is + disabled, or containing a pointer + to an underlying + :c:type:`!userfaultfd_ctx` object which + describes userfaultfd metadata. 
+ ================================= ===================== ======================================== =============== + +These fields are present or not depending on whether the relevant kernel +configuration option is set. + +.. table:: Reverse mapping fields + + =================================== ========================================= ============================ + Field Description Write lock + =================================== ========================================= ============================ + :c:member:`!shared.rb` A red/black tree node used, if the mmap write, VMA write, + mapping is file-backed, to place the VMA i_mmap write. + in the + :c:member:`!struct address_space->i_mmap` + red/black interval tree. + :c:member:`!shared.rb_subtree_last` Metadata used for management of the mmap write, VMA write, + interval tree if the VMA is file-backed. i_mmap write. + :c:member:`!anon_vma_chain` List of pointers to both forked/CoW’d mmap read, anon_vma write. + :c:type:`!anon_vma` objects and + :c:member:`!vma->anon_vma` if it is + non-:c:macro:`!NULL`. + :c:member:`!anon_vma` :c:type:`!anon_vma` object used by When :c:macro:`NULL` and + anonymous folios mapped exclusively to setting non-:c:macro:`NULL`: + this VMA. Initially set by mmap read, page_table_lock. + :c:func:`!anon_vma_prepare` serialised + by the :c:macro:`!page_table_lock`. This When non-:c:macro:`NULL` and + is set as soon as any page is faulted in. setting :c:macro:`NULL`: + mmap write, VMA write, + anon_vma write. + =================================== ========================================= ============================ + +These fields are used to both place the VMA within the reverse mapping, and for +anonymous mappings, to be able to access both related :c:struct:`!struct anon_vma` objects +and the :c:struct:`!struct anon_vma` in which folios mapped exclusively to this VMA should +reside. + +.. 
note:: If a file-backed mapping is mapped with :c:macro:`!MAP_PRIVATE` set + then it can be in both the :c:type:`!anon_vma` and :c:type:`!i_mmap` + trees at the same time, so all of these fields might be utilised at + once. + +Page tables +----------- + +We won't speak exhaustively on the subject but broadly speaking, page tables map +virtual addresses to physical ones through a series of page tables, each of +which contain entries with physical addresses for the next page table level +(along with flags), and at the leaf level the physical addresses of the +underlying physical data pages or a special entry such as a swap entry, +migration entry or other special marker. Offsets into these pages are provided +by the virtual address itself. + +In Linux these are divided into five levels - PGD, P4D, PUD, PMD and PTE. Huge +pages might eliminate one or two of these levels, but when this is the case we +typically refer to the leaf level as the PTE level regardless. + +.. note:: In instances where the architecture supports fewer page tables than + five the kernel cleverly 'folds' page table levels, that is stubbing + out functions related to the skipped levels. This allows us to + conceptually act as if there were always five levels, even if the + compiler might, in practice, eliminate any code relating to missing + ones. + +There are four key operations typically performed on page tables: + +1. **Traversing** page tables - Simply reading page tables in order to traverse + them. This only requires that the VMA is kept stable, so a lock which + establishes this suffices for traversal (there are also lockless variants + which eliminate even this requirement, such as :c:func:`!gup_fast`). +2. **Installing** page table mappings - Whether creating a new mapping or + modifying an existing one in such a way as to change its identity. This + requires that the VMA is kept stable via an mmap or VMA lock (explicitly not + rmap locks). +3. 
**Zapping/unmapping** page table entries - This is what the kernel calls + clearing page table mappings at the leaf level only, whilst leaving all page + tables in place. This is a very common operation in the kernel performed on + file truncation, the :c:macro:`!MADV_DONTNEED` operation via + :c:func:`!madvise`, and others. This is performed by a number of functions + including :c:func:`!unmap_mapping_range` and :c:func:`!unmap_mapping_pages`. + The VMA need only be kept stable for this operation. +4. **Freeing** page tables - When finally the kernel removes page tables from a + userland process (typically via :c:func:`!free_pgtables`) extreme care must + be taken to ensure this is done safely, as this logic finally frees all page + tables in the specified range, ignoring existing leaf entries (it assumes the + caller has both zapped the range and prevented any further faults or + modifications within it). + +.. note:: Modifying mappings for reclaim or migration is performed under rmap + lock as it, like zapping, does not fundamentally modify the identity + of what is being mapped. + +**Traversing** and **zapping** ranges can be performed holding any one of the +locks described in the terminology section above - that is the mmap lock, the +VMA lock or either of the reverse mapping locks. + +That is - as long as you keep the relevant VMA **stable** - you are good to go +ahead and perform these operations on page tables (though internally, kernel +operations that perform writes also acquire internal page table locks to +serialise - see the page table implementation detail section for more details). + +When **installing** page table entries, the mmap or VMA lock must be held to +keep the VMA stable. We explore why this is in the page table locking details +section below. + +.. warning:: Page tables are normally only traversed in regions covered by VMAs. + If you want to traverse page tables in areas that might not be + covered by VMAs, heavier locking is required. 
+ See :c:func:`!walk_page_range_novma` for details. + +**Freeing** page tables is an entirely internal memory management operation and +has special requirements (see the page freeing section below for more details). + +.. warning:: When **freeing** page tables, it must not be possible for VMAs + containing the ranges those page tables map to be accessible via + the reverse mapping. + + The :c:func:`!free_pgtables` function removes the relevant VMAs + from the reverse mappings, but no other VMAs can be permitted to be + accessible and span the specified range. + +Lock ordering +------------- + +As we have multiple locks across the kernel which may or may not be taken at the +same time as explicit mm or VMA locks, we have to be wary of lock inversion, and +the **order** in which locks are acquired and released becomes very important. + +.. note:: Lock inversion occurs when two threads need to acquire multiple locks, + but in doing so inadvertently cause a mutual deadlock. + + For example, consider thread 1 which holds lock A and tries to acquire lock B, + while thread 2 holds lock B and tries to acquire lock A. + + Both threads are now deadlocked on each other. However, had they attempted to + acquire locks in the same order, one would have waited for the other to + complete its work and no deadlock would have occurred. + +The opening comment in :c:macro:`!mm/rmap.c` describes in detail the required +ordering of locks within memory management code: + +.. 
code-block:: + + inode->i_rwsem (while writing or truncating, not reading or faulting) + mm->mmap_lock + mapping->invalidate_lock (in filemap_fault) + folio_lock + hugetlbfs_i_mmap_rwsem_key (in huge_pmd_share, see hugetlbfs below) + vma_start_write + mapping->i_mmap_rwsem + anon_vma->rwsem + mm->page_table_lock or pte_lock + swap_lock (in swap_duplicate, swap_info_get) + mmlist_lock (in mmput, drain_mmlist and others) + mapping->private_lock (in block_dirty_folio) + i_pages lock (widely used) + lruvec->lru_lock (in folio_lruvec_lock_irq) + inode->i_lock (in set_page_dirty's __mark_inode_dirty) + bdi.wb->list_lock (in set_page_dirty's __mark_inode_dirty) + sb_lock (within inode_lock in fs/fs-writeback.c) + i_pages lock (widely used, in set_page_dirty, + in arch-dependent flush_dcache_mmap_lock, + within bdi.wb->list_lock in __sync_single_inode) + +There is also a file-system specific lock ordering comment located at the top of +:c:macro:`!mm/filemap.c`: + +.. code-block:: + + ->i_mmap_rwsem (truncate_pagecache) + ->private_lock (__free_pte->block_dirty_folio) + ->swap_lock (exclusive_swap_page, others) + ->i_pages lock + + ->i_rwsem + ->invalidate_lock (acquired by fs in truncate path) + ->i_mmap_rwsem (truncate->unmap_mapping_range) + + ->mmap_lock + ->i_mmap_rwsem + ->page_table_lock or pte_lock (various, mainly in memory.c) + ->i_pages lock (arch-dependent flush_dcache_mmap_lock) + + ->mmap_lock + ->invalidate_lock (filemap_fault) + ->lock_page (filemap_fault, access_process_vm) + + ->i_rwsem (generic_perform_write) + ->mmap_lock (fault_in_readable->do_page_fault) + + bdi->wb.list_lock + sb_lock (fs/fs-writeback.c) + ->i_pages lock (__sync_single_inode) + + ->i_mmap_rwsem + ->anon_vma.lock (vma_merge) + + ->anon_vma.lock + ->page_table_lock or pte_lock (anon_vma_prepare and various) + + ->page_table_lock or pte_lock + ->swap_lock (try_to_unmap_one) + ->private_lock (try_to_unmap_one) + ->i_pages lock (try_to_unmap_one) + ->lruvec->lru_lock 
(follow_page_mask->mark_page_accessed) + ->lruvec->lru_lock (check_pte_range->folio_isolate_lru) + ->private_lock (folio_remove_rmap_pte->set_page_dirty) + ->i_pages lock (folio_remove_rmap_pte->set_page_dirty) + bdi.wb->list_lock (folio_remove_rmap_pte->set_page_dirty) + ->inode->i_lock (folio_remove_rmap_pte->set_page_dirty) + bdi.wb->list_lock (zap_pte_range->set_page_dirty) + ->inode->i_lock (zap_pte_range->set_page_dirty) + ->private_lock (zap_pte_range->block_dirty_folio) + +Please check the current state of these comments which may have changed since +the time of writing of this document. + +------------------------------ +Locking Implementation Details +------------------------------ + +.. warning:: Locking rules for PTE-level page tables are very different from + locking rules for page tables at other levels. + +Page table locking details +-------------------------- + +In addition to the locks described in the terminology section above, we have +additional locks dedicated to page tables: + +* **Higher level page table locks** - Higher level page tables, that is PGD, P4D + and PUD each make use of the process address space granularity + :c:member:`!mm->page_table_lock` lock when modified. + +* **Fine-grained page table locks** - PMDs and PTEs each have fine-grained locks + either kept within the folios describing the page tables or allocated + separately and pointed at by the folios if :c:macro:`!ALLOC_SPLIT_PTLOCKS` is + set. The PMD spin lock is obtained via :c:func:`!pmd_lock`, however PTEs are + mapped into higher memory (if a 32-bit system) and carefully locked via + :c:func:`!pte_offset_map_lock`. + +These locks represent the minimum required to interact with each page table +level, but there are further requirements. + +Importantly, note that on a **traversal** of page tables, sometimes no such +locks are taken.
However, at the PTE level, at least concurrent page table +deletion must be prevented (using RCU) and the page table must be mapped into +high memory, see below. + +Whether care is taken on reading the page table entries depends on the +architecture, see the section on atomicity below. + +Locking rules +^^^^^^^^^^^^^ + +We establish basic locking rules when interacting with page tables: + +* When changing a page table entry the page table lock for that page table + **must** be held, except if you can safely assume nobody can access the page + tables concurrently (such as on invocation of :c:func:`!free_pgtables`). +* Reads from and writes to page table entries must be *appropriately* + atomic. See the section on atomicity below for details. +* Populating previously empty entries requires that the mmap or VMA locks are + held (read or write), doing so with only rmap locks would be dangerous (see + the warning below). +* As mentioned previously, zapping can be performed while simply keeping the VMA + stable, that is holding any one of the mmap, VMA or rmap locks. + +.. warning:: Populating previously empty entries is dangerous as, when unmapping + VMAs, :c:func:`!vms_clear_ptes` has a window of time between + zapping (via :c:func:`!unmap_vmas`) and freeing page tables (via + :c:func:`!free_pgtables`), where the VMA is still visible in the + rmap tree. :c:func:`!free_pgtables` assumes that the zap has + already been performed and removes PTEs unconditionally (along with + all other page tables in the freed range), so installing new PTE + entries could leak memory and also cause other unexpected and + dangerous behaviour. + +There are additional rules applicable when moving page tables, which we discuss +in the section on this topic below. 
+ +PTE-level page tables are different from page tables at other levels, and there +are extra requirements for accessing them: + +* On 32-bit architectures, they may be in high memory (meaning they need to be + mapped into kernel memory to be accessible). +* When empty, they can be unlinked and RCU-freed while holding an mmap lock or + rmap lock for reading in combination with the PTE and PMD page table locks. + In particular, this happens in :c:func:`!retract_page_tables` when handling + :c:macro:`!MADV_COLLAPSE`. + So accessing PTE-level page tables requires at least holding an RCU read lock; + but that only suffices for readers that can tolerate racing with concurrent + page table updates such that an empty PTE is observed (in a page table that + has actually already been detached and marked for RCU freeing) while another + new page table has been installed in the same location and filled with + entries. Writers normally need to take the PTE lock and revalidate that the + PMD entry still refers to the same PTE-level page table. + +To access PTE-level page tables, a helper like :c:func:`!pte_offset_map_lock` or +:c:func:`!pte_offset_map` can be used depending on stability requirements. +These map the page table into kernel memory if required, take the RCU lock, and +depending on variant, may also look up or acquire the PTE lock. +See the comment on :c:func:`!__pte_offset_map_lock`. + +Atomicity +^^^^^^^^^ + +Regardless of page table locks, the MMU hardware concurrently updates accessed +and dirty bits (perhaps more, depending on architecture). Additionally, page +table traversal operations may run in parallel (though holding the VMA stable), and +functionality like GUP-fast locklessly traverses (that is, reads) page tables +without even keeping the VMA stable at all.
+ +When performing a page table traversal and keeping the VMA stable, whether a +read must be performed once and only once or not depends on the architecture +(for instance x86-64 does not require any special precautions). + +If a write is being performed, or if a read informs whether a write takes place +(on an installation of a page table entry say, for instance in +:c:func:`!__pud_install`), special care must always be taken. In these cases we +can never assume that page table locks give us entirely exclusive access, and +must retrieve page table entries once and only once. + +If we are reading page table entries, then we need only ensure that the compiler +does not rearrange our loads. This is achieved via :c:func:`!pXXp_get` +functions - :c:func:`!pgdp_get`, :c:func:`!p4dp_get`, :c:func:`!pudp_get`, +:c:func:`!pmdp_get`, and :c:func:`!ptep_get`. + +Each of these uses :c:func:`!READ_ONCE` to guarantee that the compiler reads +the page table entry only once. + +However, if we wish to manipulate an existing page table entry and care about +the previously stored data, we must go further and use a hardware atomic +operation as, for example, in :c:func:`!ptep_get_and_clear`. + +Equally, operations that do not rely on the VMA being held stable, such as +GUP-fast (see :c:func:`!gup_fast` and its various page table level handlers like +:c:func:`!gup_fast_pte_range`), must very carefully interact with page table +entries, using functions such as :c:func:`!ptep_get_lockless` and equivalent for +higher level page table levels. + +Writes to page table entries must also be appropriately atomic, as established +by :c:func:`!set_pXX` functions - :c:func:`!set_pgd`, :c:func:`!set_p4d`, +:c:func:`!set_pud`, :c:func:`!set_pmd`, and :c:func:`!set_pte`.
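The difference between reading an entry once, writing it in a single store, and atomically fetching-and-clearing it can be illustrated outside the kernel. The following is a minimal userspace sketch using C11 atomics, not kernel code: the `demo_pte_t` type, the `DEMO_PTE_*` flag values and every `demo_*` helper are hypothetical names invented for illustration, and a relaxed atomic load is used as a rough analogue of what `READ_ONCE` provides.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for a page table entry; this is NOT the kernel's pte_t. */
typedef _Atomic(uint64_t) demo_pte_t;

/* Illustrative flag bits only; real layouts are architecture-specific. */
#define DEMO_PTE_PRESENT 0x1ULL
#define DEMO_PTE_DIRTY   0x2ULL

/* Rough analogue of ptep_get(): load the entry exactly once into a local. */
static uint64_t demo_pte_get(demo_pte_t *ptep)
{
	return atomic_load_explicit(ptep, memory_order_relaxed);
}

/* Rough analogue of set_pte(): publish the new value in a single store. */
static void demo_pte_set(demo_pte_t *ptep, uint64_t val)
{
	atomic_store_explicit(ptep, val, memory_order_relaxed);
}

/*
 * Rough analogue of ptep_get_and_clear(): fetch the old value and clear the
 * entry in one atomic step, so a dirty bit set concurrently by "hardware"
 * cannot be lost between a separate read and a separate clear.
 */
static uint64_t demo_pte_get_and_clear(demo_pte_t *ptep)
{
	return atomic_exchange(ptep, 0);
}

/* Simulates the MMU setting the dirty bit behind the software's back. */
static void demo_hw_set_dirty(demo_pte_t *ptep)
{
	atomic_fetch_or(ptep, DEMO_PTE_DIRTY);
}
```

A caller that needs the previous contents (for instance to propagate a dirty bit) would use `demo_pte_get_and_clear()` and inspect its return value, rather than calling `demo_pte_get()` followed by a separate clear.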
+ +Equally functions which clear page table entries must be appropriately atomic, +as in :c:func:`!pXX_clear` functions - :c:func:`!pgd_clear`, +:c:func:`!p4d_clear`, :c:func:`!pud_clear`, :c:func:`!pmd_clear`, and +:c:func:`!pte_clear`. + +Page table installation +^^^^^^^^^^^^^^^^^^^^^^^ + +Page table installation is performed with the VMA held stable explicitly by an +mmap or VMA lock in read or write mode (see the warning in the locking rules +section for details as to why). + +When allocating a P4D, PUD or PMD and setting the relevant entry in the above +PGD, P4D or PUD, the :c:member:`!mm->page_table_lock` must be held. This is +acquired in :c:func:`!__p4d_alloc`, :c:func:`!__pud_alloc` and +:c:func:`!__pmd_alloc` respectively. + +.. note:: :c:func:`!__pmd_alloc` actually invokes :c:func:`!pud_lock` and + :c:func:`!pud_lockptr` in turn, however at the time of writing it ultimately + references the :c:member:`!mm->page_table_lock`. + +Allocating a PTE will either use the :c:member:`!mm->page_table_lock` or, if +:c:macro:`!USE_SPLIT_PMD_PTLOCKS` is defined, a lock embedded in the PMD +physical page metadata in the form of a :c:struct:`!struct ptdesc`, acquired by +:c:func:`!pmd_ptdesc` called from :c:func:`!pmd_lock` and ultimately +:c:func:`!__pte_alloc`. + +Finally, modifying the contents of the PTE requires special treatment, as the +PTE page table lock must be acquired whenever we want stable and exclusive +access to entries contained within a PTE, especially when we wish to modify +them. + +This is performed via :c:func:`!pte_offset_map_lock` which carefully checks to +ensure that the PTE hasn't changed from under us, ultimately invoking +:c:func:`!pte_lockptr` to obtain a spin lock at PTE granularity contained within +the :c:struct:`!struct ptdesc` associated with the physical PTE page. The lock +must be released via :c:func:`!pte_unmap_unlock`. + +.. 
note:: There are some variants on this, such as + :c:func:`!pte_offset_map_rw_nolock` when we know we hold the PTE stable but + for brevity we do not explore this. See the comment for + :c:func:`!__pte_offset_map_lock` for more details. + +When modifying data in ranges we typically only wish to allocate higher page +tables as necessary, using these locks to avoid races or overwriting anything, +and set/clear data at the PTE level as required (for instance when page faulting +or zapping). + +A typical pattern taken when traversing page table entries to install a new +mapping is to optimistically determine whether the page table entry in the table +above is empty, if so, only then acquiring the page table lock and checking +again to see if it was allocated underneath us. + +This allows for a traversal with page table locks only being taken when +required. An example of this is :c:func:`!__pud_alloc`. + +At the leaf page table, that is the PTE, we can't entirely rely on this pattern +as we have separate PMD and PTE locks and a THP collapse for instance might have +eliminated the PMD entry as well as the PTE from under us. + +This is why :c:func:`!__pte_offset_map_lock` locklessly retrieves the PMD entry +for the PTE, carefully checking it is as expected, before acquiring the +PTE-specific lock, and then *again* checking that the PMD entry is as expected. + +If a THP collapse (or similar) were to occur then the lock on both pages would +be acquired, so we can ensure this is prevented while the PTE lock is held. + +Installing entries this way ensures mutual exclusion on write. + +Page table freeing +^^^^^^^^^^^^^^^^^^ + +Tearing down page tables themselves is something that requires significant +care. There must be no way that page tables designated for removal can be +traversed or referenced by concurrent tasks. 
+ +It is insufficient to simply hold an mmap write lock and VMA lock (which will +prevent racing faults, and rmap operations), as a file-backed mapping can be +truncated under the :c:struct:`!struct address_space->i_mmap_rwsem` alone. + +As a result, no VMA which can be accessed via the reverse mapping (either +through the :c:struct:`!struct anon_vma->rb_root` or the :c:member:`!struct +address_space->i_mmap` interval trees) can have its page tables torn down. + +The operation is typically performed via :c:func:`!free_pgtables`, which assumes +either the mmap write lock has been taken (as specified by its +:c:member:`!mm_wr_locked` parameter), or that the VMA is already unreachable. + +It carefully removes the VMA from all reverse mappings, however it's important +that no new ones overlap these or any route remain to permit access to addresses +within the range whose page tables are being torn down. + +Additionally, it assumes that a zap has already been performed and steps have +been taken to ensure that no further page table entries can be installed between +the zap and the invocation of :c:func:`!free_pgtables`. + +Since it is assumed that all such steps have been taken, page table entries are +cleared without page table locks (in the :c:func:`!pgd_clear`, :c:func:`!p4d_clear`, +:c:func:`!pud_clear`, and :c:func:`!pmd_clear` functions). + +.. note:: It is possible for leaf page tables to be torn down independent of + the page tables above it as is done by + :c:func:`!retract_page_tables`, which is performed under the i_mmap + read lock, PMD, and PTE page table locks, without this level of care. + +Page table moving +^^^^^^^^^^^^^^^^^ + +Some functions manipulate page table levels above PMD (that is PUD, P4D and PGD +page tables). Most notable of these is :c:func:`!mremap`, which is capable of +moving higher level page tables. + +In these instances, it is required that **all** locks are taken, that is +the mmap lock, the VMA lock and the relevant rmap locks. 
+ +You can observe this in the :c:func:`!mremap` implementation in the functions +:c:func:`!take_rmap_locks` and :c:func:`!drop_rmap_locks` which perform the rmap +side of lock acquisition, invoked ultimately by :c:func:`!move_page_tables`. + +VMA lock internals +------------------ + +Overview +^^^^^^^^ + +VMA read locking is entirely optimistic - if the lock is contended or a competing +write has started, then we do not obtain a read lock. + +A VMA **read** lock is obtained by :c:func:`!lock_vma_under_rcu`, which first +calls :c:func:`!rcu_read_lock` to ensure that the VMA is looked up in an RCU +critical section, then attempts to VMA lock it via :c:func:`!vma_start_read`, +before releasing the RCU lock via :c:func:`!rcu_read_unlock`. + +VMA read locks hold the read lock on the :c:member:`!vma->vm_lock` semaphore for +their duration and the caller of :c:func:`!lock_vma_under_rcu` must release it +via :c:func:`!vma_end_read`. + +VMA **write** locks are acquired via :c:func:`!vma_start_write` in instances where a +VMA is about to be modified, unlike :c:func:`!vma_start_read` the lock is always +acquired. An mmap write lock **must** be held for the duration of the VMA write +lock, releasing or downgrading the mmap write lock also releases the VMA write +lock so there is no :c:func:`!vma_end_write` function. + +Note that a semaphore write lock is not held across a VMA lock. Rather, a +sequence number is used for serialisation, and the write semaphore is only +acquired at the point of write lock to update this. + +This ensures the semantics we require - VMA write locks provide exclusive write +access to the VMA. + +Implementation details +^^^^^^^^^^^^^^^^^^^^^^ + +The VMA lock mechanism is designed to be a lightweight means of avoiding the use +of the heavily contended mmap lock. It is implemented using a combination of a +read/write semaphore and sequence numbers belonging to the containing +:c:struct:`!struct mm_struct` and the VMA. 
+ +Read locks are acquired via :c:func:`!vma_start_read`, which is an optimistic +operation, i.e. it tries to acquire a read lock but returns false if it is +unable to do so. At the end of the read operation, :c:func:`!vma_end_read` is +called to release the VMA read lock. + +Invoking :c:func:`!vma_start_read` requires that :c:func:`!rcu_read_lock` has +been called first, establishing that we are in an RCU critical section upon VMA +read lock acquisition. Once acquired, the RCU lock can be released as it is only +required for lookup. This is abstracted by :c:func:`!lock_vma_under_rcu` which +is the interface a user should use. + +Writing requires the mmap to be write-locked and the VMA lock to be acquired via +:c:func:`!vma_start_write`, however the write lock is released by the termination or +downgrade of the mmap write lock so no :c:func:`!vma_end_write` is required. + +All this is achieved by the use of per-mm and per-VMA sequence counts, which are +used in order to reduce complexity, especially for operations which write-lock +multiple VMAs at once. + +If the mm sequence count, :c:member:`!mm->mm_lock_seq` is equal to the VMA +sequence count :c:member:`!vma->vm_lock_seq` then the VMA is write-locked. If +they differ, then it is not. + +Each time the mmap write lock is released in :c:func:`!mmap_write_unlock` or +:c:func:`!mmap_write_downgrade`, :c:func:`!vma_end_write_all` is invoked which +also increments :c:member:`!mm->mm_lock_seq` via +:c:func:`!mm_lock_seqcount_end`. + +This way, we ensure that, regardless of the VMA's sequence number, a write lock +is never incorrectly indicated and that when we release an mmap write lock we +efficiently release **all** VMA write locks contained within the mmap at the +same time. + +Since the mmap write lock is exclusive against others who hold it, the automatic +release of any VMA locks on its release makes sense, as you would never want to +keep VMAs locked across entirely separate write operations. 
It also maintains +correct lock ordering. + +Each time a VMA read lock is acquired, we acquire a read lock on the +:c:member:`!vma->vm_lock` read/write semaphore and hold it, while checking that +the sequence count of the VMA does not match that of the mm. + +If it does, the read lock fails. If it does not, we hold the lock, excluding +writers, but permitting other readers, who will also obtain this lock under RCU. + +Importantly, maple tree operations performed in :c:func:`!lock_vma_under_rcu` +are also RCU safe, so the whole read lock operation is guaranteed to function +correctly. + +On the write side, we acquire a write lock on the :c:member:`!vma->vm_lock` +read/write semaphore, before setting the VMA's sequence number under this lock, +also simultaneously holding the mmap write lock. + +This way, if any read locks are in effect, :c:func:`!vma_start_write` will sleep +until these are finished and mutual exclusion is achieved. + +After setting the VMA's sequence number, the lock is released, avoiding +complexity with a long-term held write lock. + +This clever combination of a read/write semaphore and sequence count allows for +fast RCU-based per-VMA lock acquisition (especially on page fault, though +utilised elsewhere) with minimal complexity around lock ordering. + +mmap write lock downgrading +--------------------------- + +When an mmap write lock is held one has exclusive access to resources within the +mmap (with the usual caveats about requiring VMA write locks to avoid races with +tasks holding VMA read locks). + +It is then possible to **downgrade** from a write lock to a read lock via +:c:func:`!mmap_write_downgrade` which, similar to :c:func:`!mmap_write_unlock`, +implicitly terminates all VMA write locks via :c:func:`!vma_end_write_all`, but +importantly does not relinquish the mmap lock while downgrading, therefore +keeping the locked virtual address space stable. 
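The sequence-count scheme described above, including the way releasing or downgrading the mmap write lock drops every VMA write lock at once, can be modelled in a few lines of userspace C. This is only a toy sketch under stated assumptions: the ``toy_`` types and helpers are hypothetical, there is no semaphore, RCU or mmap lock, and only the field names ``mm_lock_seq`` and ``vm_lock_seq`` are taken from the text.

```c
#include <stdbool.h>

/* Toy userspace model; not the kernel's data structures. */
struct toy_mm {
	unsigned int mm_lock_seq;	/* per-mm sequence count */
};

struct toy_vma {
	struct toy_mm *mm;		/* containing address space */
	unsigned int vm_lock_seq;	/* per-VMA sequence count */
};

/* A VMA counts as write-locked when its count equals the mm's count. */
static bool toy_vma_is_write_locked(const struct toy_vma *vma)
{
	return vma->vm_lock_seq == vma->mm->mm_lock_seq;
}

/* Models the effect of vma_start_write(): copy the mm count into the VMA. */
static void toy_vma_start_write(struct toy_vma *vma)
{
	vma->vm_lock_seq = vma->mm->mm_lock_seq;
}

/*
 * Models the effect of vma_end_write_all(): bumping the mm count makes
 * every VMA count stale, so all VMA write locks are dropped together.
 */
static void toy_vma_end_write_all(struct toy_mm *mm)
{
	mm->mm_lock_seq++;
}
```

In this model a single increment in ``toy_vma_end_write_all()`` unlocks any number of VMAs without visiting them, which is the property the text relies on when it says no ``vma_end_write`` function is needed.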
+ +An interesting consequence of this is that downgraded locks are exclusive +against any other task possessing a downgraded lock (since a racing task would +have to acquire a write lock first to downgrade it, and the downgraded lock +prevents a new write lock from being obtained until the original lock is +released). + +For clarity, we map read (R)/downgraded write (D)/write (W) locks against one +another showing which locks exclude the others: + +.. list-table:: Lock exclusivity + :widths: 5 5 5 5 + :header-rows: 1 + :stub-columns: 1 + + * - + - R + - D + - W + * - R + - N + - N + - Y + * - D + - N + - Y + - Y + * - W + - Y + - Y + - Y + +Here a Y indicates the locks in the matching row/column are mutually exclusive, +and N indicates that they are not. + +Stack expansion +--------------- + +Stack expansion throws up additional complexities in that we cannot permit there +to be racing page faults, as a result we invoke :c:func:`!vma_start_write` to +prevent this in :c:func:`!expand_downwards` or :c:func:`!expand_upwards`. diff --git a/Documentation/netlink/specs/mptcp_pm.yaml b/Documentation/netlink/specs/mptcp_pm.yaml index dc190bf838fe..dfd017780d2f 100644 --- a/Documentation/netlink/specs/mptcp_pm.yaml +++ b/Documentation/netlink/specs/mptcp_pm.yaml @@ -22,65 +22,67 @@ definitions: doc: unused event - name: created - doc: - token, family, saddr4 | saddr6, daddr4 | daddr6, sport, dport + doc: >- A new MPTCP connection has been created. It is the good time to allocate memory and send ADD_ADDR if needed. Depending on the traffic-patterns it can take a long time until the MPTCP_EVENT_ESTABLISHED is sent. + Attributes: token, family, saddr4 | saddr6, daddr4 | daddr6, sport, + dport, server-side. - name: established - doc: - token, family, saddr4 | saddr6, daddr4 | daddr6, sport, dport + doc: >- A MPTCP connection is established (can start new subflows). + Attributes: token, family, saddr4 | saddr6, daddr4 | daddr6, sport, + dport, server-side. 
- name: closed - doc: - token + doc: >- A MPTCP connection has stopped. + Attribute: token. - name: announced value: 6 - doc: - token, rem_id, family, daddr4 | daddr6 [, dport] + doc: >- A new address has been announced by the peer. + Attributes: token, rem_id, family, daddr4 | daddr6 [, dport]. - name: removed - doc: - token, rem_id + doc: >- An address has been lost by the peer. + Attributes: token, rem_id. - name: sub-established value: 10 - doc: - token, family, loc_id, rem_id, saddr4 | saddr6, daddr4 | daddr6, sport, - dport, backup, if_idx [, error] + doc: >- A new subflow has been established. 'error' should not be set. + Attributes: token, family, loc_id, rem_id, saddr4 | saddr6, daddr4 | + daddr6, sport, dport, backup, if_idx [, error]. - name: sub-closed - doc: - token, family, loc_id, rem_id, saddr4 | saddr6, daddr4 | daddr6, sport, - dport, backup, if_idx [, error] + doc: >- A subflow has been closed. An error (copy of sk_err) could be set if an error has been detected for this subflow. + Attributes: token, family, loc_id, rem_id, saddr4 | saddr6, daddr4 | + daddr6, sport, dport, backup, if_idx [, error]. - name: sub-priority value: 13 - doc: - token, family, loc_id, rem_id, saddr4 | saddr6, daddr4 | daddr6, sport, - dport, backup, if_idx [, error] + doc: >- The priority of a subflow has changed. 'error' should not be set. + Attributes: token, family, loc_id, rem_id, saddr4 | saddr6, daddr4 | + daddr6, sport, dport, backup, if_idx [, error]. - name: listener-created value: 15 - doc: - family, sport, saddr4 | saddr6 + doc: >- A new PM listener is created. + Attributes: family, sport, saddr4 | saddr6. - name: listener-closed - doc: - family, sport, saddr4 | saddr6 + doc: >- A PM listener is closed. + Attributes: family, sport, saddr4 | saddr6. 
attribute-sets: - @@ -306,8 +308,8 @@ operations: attributes: - addr - - name: flush-addrs - doc: flush addresses + name: flush-addrs + doc: Flush addresses attribute-set: endpoint dont-validate: [ strict ] flags: [ uns-admin-perm ] @@ -351,7 +353,7 @@ operations: - addr-remote - name: announce - doc: announce new sf + doc: Announce new address attribute-set: attr dont-validate: [ strict ] flags: [ uns-admin-perm ] @@ -362,7 +364,7 @@ operations: - token - name: remove - doc: announce removal + doc: Announce removal attribute-set: attr dont-validate: [ strict ] flags: [ uns-admin-perm ] @@ -373,7 +375,7 @@ operations: - loc-id - name: subflow-create - doc: todo + doc: Create subflow attribute-set: attr dont-validate: [ strict ] flags: [ uns-admin-perm ] @@ -385,7 +387,7 @@ operations: - addr-remote - name: subflow-destroy - doc: todo + doc: Destroy subflow attribute-set: attr dont-validate: [ strict ] flags: [ uns-admin-perm ] diff --git a/Documentation/networking/bareudp.rst b/Documentation/networking/bareudp.rst index b9d04ee6dac1..621cb9575c8f 100644 --- a/Documentation/networking/bareudp.rst +++ b/Documentation/networking/bareudp.rst @@ -6,16 +6,17 @@ Bare UDP Tunnelling Module Documentation There are various L3 encapsulation standards using UDP being discussed to leverage the UDP based load balancing capability of different networks. -MPLSoUDP (__ https://tools.ietf.org/html/rfc7510) is one among them. +MPLSoUDP (https://tools.ietf.org/html/rfc7510) is one among them. The Bareudp tunnel module provides a generic L3 encapsulation support for tunnelling different L3 protocols like MPLS, IP, NSH etc. inside a UDP tunnel. Special Handling ---------------- + The bareudp device supports special handling for MPLS & IP as they can have multiple ethertypes. -MPLS procotcol can have ethertypes ETH_P_MPLS_UC (unicast) & ETH_P_MPLS_MC (multicast). +The MPLS protocol can have ethertypes ETH_P_MPLS_UC (unicast) & ETH_P_MPLS_MC (multicast). 
IP protocol can have ethertypes ETH_P_IP (v4) & ETH_P_IPV6 (v6). This special handling can be enabled only for ethertypes ETH_P_IP & ETH_P_MPLS_UC with a flag called multiproto mode. @@ -52,7 +53,7 @@ be enabled explicitly with the "multiproto" flag. 3) Device Usage The bareudp device could be used along with OVS or flower filter in TC. -The OVS or TC flower layer must set the tunnel information in SKB dst field before -sending packet buffer to the bareudp device for transmission. On reception the -bareudp device extracts and stores the tunnel information in SKB dst field before +The OVS or TC flower layer must set the tunnel information in the SKB dst field before +sending the packet buffer to the bareudp device for transmission. On reception, the +bareUDP device extracts and stores the tunnel information in the SKB dst field before passing the packet buffer to the network stack. diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst index eacf8983e230..dcbb6f6caf6d 100644 --- a/Documentation/networking/ip-sysctl.rst +++ b/Documentation/networking/ip-sysctl.rst @@ -2170,6 +2170,12 @@ nexthop_compat_mode - BOOLEAN understands the new API, this sysctl can be disabled to achieve full performance benefits of the new API by disabling the nexthop expansion and extraneous notifications. + + Note that as a backward-compatible mode, dumping of modern features + might be incomplete or wrong. For example, resilient groups will not be + shown as such, but rather as just a list of next hops. Also weights that + do not fit into 8 bits will show incorrectly. 
+ Default: true (backward compat mode) fib_notify_on_flag_change - INTEGER diff --git a/Documentation/power/runtime_pm.rst b/Documentation/power/runtime_pm.rst index 53d1996460ab..12f429359a82 100644 --- a/Documentation/power/runtime_pm.rst +++ b/Documentation/power/runtime_pm.rst @@ -347,7 +347,9 @@ drivers/base/power/runtime.c and include/linux/pm_runtime.h: `int pm_runtime_resume_and_get(struct device *dev);` - run pm_runtime_resume(dev) and if successful, increment the device's - usage counter; return the result of pm_runtime_resume + usage counter; returns 0 on success (whether or not the device's + runtime PM status was already 'active') or the error code from + pm_runtime_resume() on failure. `int pm_request_idle(struct device *dev);` - submit a request to execute the subsystem-level idle callback for the diff --git a/Documentation/sound/codecs/cs35l56.rst b/Documentation/sound/codecs/cs35l56.rst new file mode 100644 index 000000000000..98c6f6c74394 --- /dev/null +++ b/Documentation/sound/codecs/cs35l56.rst @@ -0,0 +1,292 @@ +.. SPDX-License-Identifier: GPL-2.0-only + +===================================================================== +Audio drivers for Cirrus Logic CS35L54/56/57 Boosted Smart Amplifiers +===================================================================== +:Copyright: 2025 Cirrus Logic, Inc. and + Cirrus Logic International Semiconductor Ltd. + +Contact: patches@opensource.cirrus.com + +Summary +======= + +The high-level summary of this document is: + +**If you have a laptop that uses CS35L54/56/57 amplifiers but audio is not +working, DO NOT ATTEMPT TO USE FIRMWARE AND SETTINGS FROM ANOTHER LAPTOP, +EVEN IF THAT LAPTOP SEEMS SIMILAR.** + +The CS35L54/56/57 amplifiers must be correctly configured for the power +supply voltage, speaker impedance, maximum speaker voltage/current, and +other external hardware connections. 
+ +The amplifiers feature advanced boost technology that increases the voltage +used to drive the speakers, while proprietary speaker protection algorithms +allow these boosted amplifiers to push the limits of the speakers without +causing damage. These **must** be configured correctly. + +Supported Cirrus Logic amplifiers +--------------------------------- + +The cs35l56 drivers support: + +* CS35L54 +* CS35L56 +* CS35L57 + +There are two drivers in the kernel + +*For systems using SoundWire*: sound/soc/codecs/cs35l56.c and associated files + +*For systems using HDA*: sound/pci/hda/cs35l56_hda.c + +Firmware +======== + +The amplifier is controlled and managed by firmware running on the internal +DSP. Firmware files are essential to enable the full capabilities of the +amplifier. + +Firmware is distributed in the linux-firmware repository: +https://gitlab.com/kernel-firmware/linux-firmware.git + +On most SoundWire systems the amplifier has a default minimum capability to +produce audio. However this will be + +* at low volume, to protect the speakers, since the speaker specifications + and power supply voltages are unknown. +* a mono mix of left and right channels. + +On some SoundWire systems that have both CS42L43 and CS35L56/57 the CS35L56/57 +receive their audio from the CS42L43 instead of directly from the host +SoundWire interface. These systems can be identified by the CS42L43 showing +in dmesg as a SoundWire device, but the CS35L56/57 as SPI. On these systems +the firmware is *mandatory* to enable receiving the audio from the CS42L43. + +On HDA systems the firmware is *mandatory* to enable HDA bridge mode. There +will not be any audio from the amplifiers without firmware. + +Cirrus Logic firmware files +--------------------------- + +Each amplifier requires two firmware files. One file has a .wmfw suffix, the +other has a .bin suffix. + +The firmware is customized by the OEM to match the hardware of each laptop, +and the firmware is specific to that laptop. 
Because of this, there are many +firmware files in linux-firmware for these amplifiers. Firmware files are +**not interchangeable between laptops**. + +Cirrus Logic submits files for known laptops to the upstream linux-firmware +repository. Providing Cirrus Logic is aware of a particular laptop and has +permission from the manufacturer to publish the firmware, it will be pushed +to linux-firmware. You may need to upgrade to a newer release of +linux-firmware to obtain the firmware for your laptop. + +**Important:** the Makefile for linux-firmware creates symlinks that are listed +in the WHENCE file. These symlinks are required for the CS35L56 driver to be +able to load the firmware. + +How do I know which firmware file I should have? +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +All firmware file names are qualified with a unique "system ID". On normal +x86 PCs with PCI audio this is the Vendor Subsystem ID (SSID) of the host +PCI audio interface. + +The SSID can be viewed using the lspci tool:: + + lspci -v -nn | grep -A2 -i audio + 0000:00:1f.3 Audio device [0403]: Intel Corporation Meteor Lake-P HD Audio Controller [8086:7e28] + Subsystem: Dell Meteor Lake-P HD Audio Controller [1028:0c63] + +In this example the SSID is 10280c63. + +The format of the firmware file names is: + + cs35lxx-b0-dsp1-misc-SSID[-spkidX]-ampN + +Where: + + * cs35lxx-b0 is the amplifier model and silicon revision. This information + is logged by the driver during initialization. + * SSID is the 8-digit hexadecimal SSID value. + * ampN is the amplifier number (for example amp1). This is the same as + the prefix on the ALSA control names except that it is always lower-case + in the file name. + * spkidX is an optional part, used for laptops that have firmware + configurations for different makes and models of internal speakers. 
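As an illustration of the naming pattern above, the following sketch formats a base file name from its parts. It is not the driver's code: ``toy_fw_base_name()`` and its parameters are hypothetical, and the lower-case 8-digit hexadecimal SSID and the use of a negative value to omit the optional ``spkidX`` part are assumptions made for the example.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: format "cs35lxx-b0-dsp1-misc-SSID[-spkidX]-ampN".
 * part is the amplifier model (e.g. "cs35l56"), rev the silicon revision
 * (e.g. "b0"). A negative spkid omits the optional "-spkidX" part.
 * Returns the snprintf() result, i.e. the length that was needed.
 */
static int toy_fw_base_name(char *buf, size_t len, const char *part,
			    const char *rev, uint32_t ssid, int spkid, int amp)
{
	if (spkid < 0)
		return snprintf(buf, len, "%s-%s-dsp1-misc-%08x-amp%d",
				part, rev, (unsigned int)ssid, amp);

	return snprintf(buf, len, "%s-%s-dsp1-misc-%08x-spkid%d-amp%d",
			part, rev, (unsigned int)ssid, spkid, amp);
}
```

With the SSID from the lspci example (10280c63), model ``cs35l56``, revision ``b0`` and amplifier 1, this yields ``cs35l56-b0-dsp1-misc-10280c63-amp1``. Which of the resulting names the driver actually requests, and in what order, is not covered here.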
+ +Sound Open Firmware and ALSA topology files +------------------------------------------- + +All SoundWire systems will require a Sound Open Firmware (SOF) for the +host CPU audio DSP, together with an ALSA topology file (.tplg). + +The SOF firmware will usually be provided by the manufacturer of the host +CPU (i.e. Intel or AMD). The .tplg file is normally part of the SOF firmware +release. + +SOF binary builds are available from: https://github.com/thesofproject/sof-bin/releases + +The main SOF source is here: https://github.com/thesofproject + +ALSA-ucm configurations +----------------------- +Typically an appropriate ALSA-ucm configuration file is needed for +use-case managers and audio servers such as PipeWire. + +Configuration files are available from the alsa-ucm-conf repository: +https://git.alsa-project.org/?p=alsa-ucm-conf.git + +Kernel log messages +=================== + +SoundWire +--------- +A successful initialization will look like this (this will be repeated for +each amplifier):: + + [ 7.568374] cs35l56 sdw:0:0:01fa:3556:01:0: supply VDD_P not found, using dummy regulator + [ 7.605208] cs35l56 sdw:0:0:01fa:3556:01:0: supply VDD_IO not found, using dummy regulator + [ 7.605313] cs35l56 sdw:0:0:01fa:3556:01:0: supply VDD_A not found, using dummy regulator + [ 7.939279] cs35l56 sdw:0:0:01fa:3556:01:0: Cirrus Logic CS35L56 Rev B0 OTP3 fw:3.4.4 (patched=0) + [ 7.947844] cs35l56 sdw:0:0:01fa:3556:01:0: Slave 4 state check1: UNATTACHED, status was 1 + [ 8.740280] cs35l56 sdw:0:0:01fa:3556:01:0: supply VDD_B not found, using dummy regulator + [ 8.740552] cs35l56 sdw:0:0:01fa:3556:01:0: supply VDD_AMP not found, using dummy regulator + [ 9.242164] cs35l56 sdw:0:0:01fa:3556:01:0: DSP1: cirrus/cs35l56-b0-dsp1-misc-xxxxxxxx.wmfw: format 3 timestamp 0x66b2b872 + [ 9.242173] cs35l56 sdw:0:0:01fa:3556:01:0: DSP1: cirrus/cs35l56-b0-dsp1-misc-xxxxxxxx.wmfw: Tue 05 Dec 2023 21:37:21 GMT Standard Time + [ 9.991709] cs35l56 sdw:0:0:01fa:3556:01:0: DSP1: Firmware: 
1a00d6 vendor: 0x2 v3.11.23, 41 algorithms + [10.039098] cs35l56 sdw:0:0:01fa:3556:01:0: DSP1: cirrus/cs35l56-b0-dsp1-misc-xxxxxxxx-amp1.bin: v3.11.23 + [10.879235] cs35l56 sdw:0:0:01fa:3556:01:0: Slave 4 state check1: UNATTACHED, status was 1 + [11.401536] cs35l56 sdw:0:0:01fa:3556:01:0: Calibration applied + +HDA +--- +A successful initialization will look like this (this will be repeated for +each amplifier):: + + [ 6.306475] cs35l56-hda i2c-CSC3556:00-cs35l56-hda.0: Cirrus Logic CS35L56 Rev B0 OTP3 fw:3.4.4 (patched=0) + [ 6.613892] cs35l56-hda i2c-CSC3556:00-cs35l56-hda.0: DSP system name: 'xxxxxxxx', amp name: 'AMP1' + [ 8.266660] snd_hda_codec_cs8409 ehdaudio0D0: bound i2c-CSC3556:00-cs35l56-hda.0 (ops cs35l56_hda_comp_ops [snd_hda_scodec_cs35l56]) + [ 8.287525] cs35l56-hda i2c-CSC3556:00-cs35l56-hda.0: DSP1: cirrus/cs35l56-b0-dsp1-misc-xxxxxxxx.wmfw: format 3 timestamp 0x66b2b872 + [ 8.287528] cs35l56-hda i2c-CSC3556:00-cs35l56-hda.0: DSP1: cirrus/cs35l56-b0-dsp1-misc-xxxxxxxx.wmfw: Tue 05 Dec 2023 21:37:21 GMT Standard Time + [ 9.984335] cs35l56-hda i2c-CSC3556:00-cs35l56-hda.0: DSP1: Firmware: 1a00d6 vendor: 0x2 v3.11.23, 41 algorithms + [10.085797] cs35l56-hda i2c-CSC3556:00-cs35l56-hda.0: DSP1: cirrus/cs35l56-b0-dsp1-misc-xxxxxxxx-amp1.bin: v3.11.23 + [10.655237] cs35l56-hda i2c-CSC3556:00-cs35l56-hda.0: Calibration applied + +Important messages +~~~~~~~~~~~~~~~~~~ +Cirrus Logic CS35L56 Rev B0 OTP3 fw:3.4.4 (patched=0) + Shows that the driver has been able to read device ID registers from the + amplifier. + + * The actual amplifier type and silicon revision (CS35L56 B0 in this + example) is shown, as read from the amplifier identification registers. + * (patched=0) is normal, and indicates that the amplifier has been hard + reset and is running default ROM firmware. 
+ * (patched=1) means that something has previously downloaded firmware + to the amplifier and the driver does not have control of the RESET + signal to be able to replace this preloaded firmware. This is normal + for systems where the BIOS downloads firmware to the amplifiers + before OS boot. + This status can also be seen if the cs35l56 kernel module is unloaded + and reloaded on a system where the driver does not have control of + RESET. SoundWire systems typically do not give the driver control of + RESET and only a BIOS (re)boot can reset the amplifiers. + +DSP1: cirrus/cs35l56-b0-dsp1-misc-xxxxxxxx.wmfw + Shows that a .wmfw firmware file was found and downloaded. + +DSP1: cirrus/cs35l56-b0-dsp1-misc-xxxxxxxx-amp1.bin + Shows that a .bin firmware file was found and downloaded. + +Calibration applied + Factory calibration data in EFI was written to the amplifier. + +Error messages +============== +This section explains some of the error messages that the driver can log. + +Algorithm coefficient version %d.%d.%d but expected %d.%d.%d + The version of the .bin file content does not match the loaded firmware. + Caused by mismatched .wmfw and .bin file, or .bin file was found but + .wmfw was not. + +No %s for algorithm %x + The version of the .bin file content does not match the loaded firmware. + Caused by mismatched .wmfw and .bin file, or .bin file was found but + .wmfw was not. + +.bin file required but not found + HDA driver did not find a .bin file that matches this hardware. + +Calibration disabled due to missing firmware controls + Driver was not able to write EFI calibration data to firmware registers. + This typically means that either: + + * The driver did not find a suitable wmfw for this hardware, or + * The amplifier has already been patched with firmware by something + previously, and the driver does not have control of a hard RESET line + to be able to reset the amplifier and download the firmware files it + found. 
This situation is indicated by the device identification + string in the kernel log showing "(patched=1)". + +Failed to write calibration + Same meaning and cause as "Calibration disabled due to missing firmware + controls" + +Failed to read calibration data from EFI + Factory calibration data in EFI is missing, empty or corrupt. + This is most likely to be caused by accidentally deleting the file from + the EFI filesystem. + +No calibration for silicon ID + The factory calibration data in EFI does not match this hardware. + The most likely cause is that an amplifier has been replaced on the + motherboard without going through the manufacturer calibration process to + generate calibration data for the new amplifier. + +Did not find any buses for CSCxxxx + Only on HDA systems. The HDA codec driver found an ACPI entry for + Cirrus Logic companion amps, but could not enumerate the ACPI entries for + the I2C/SPI buses. The most likely cause of this is that: + + * The relevant bus driver (I2C or SPI) is not part of the kernel. + * The HDA codec driver was built-in to the kernel but the I2C/SPI + bus driver is a module and so the HDA codec driver cannot call the + bus driver functions. + +init_completion timed out + The SoundWire bus controller (host end) did not enumerate the amplifier. + In other words, the ACPI says there is an amplifier but for some reason + it was not detected on the bus. + +No AF01 node + Indicates an error in ACPI. A SoundWire system should have a Device() + node named "AF01" but it was not found. + +Failed to get spk-id-gpios + ACPI says that the driver should request a GPIO but the driver was not + able to get that GPIO. The most likely cause is that the kernel does not + include the correct GPIO or PINCTRL driver for this system. + +Failed to read spk-id + ACPI says that the driver should request a GPIO but the driver was not + able to read that GPIO. 
+ +Unexpected spk-id element count + AF01 contains more speaker ID GPIO entries than the driver supports. + +Overtemp error + Amplifier overheat protection was triggered and the amplifier shut down + to protect itself. + +Amp short error + Amplifier detected a short-circuit on the speaker output pins and shut + down for protection. This would normally indicate a damaged speaker. + +Hibernate wake failed + The driver tried to wake the amplifier from its power-saving state but + did not see the expected responses from the amplifier. This can be caused + by using firmware that does not match the hardware. diff --git a/Documentation/sound/codecs/index.rst b/Documentation/sound/codecs/index.rst new file mode 100644 index 000000000000..2cb95d87bbef --- /dev/null +++ b/Documentation/sound/codecs/index.rst @@ -0,0 +1,9 @@ +.. SPDX-License-Identifier: GPL-2.0 + +Codec-Specific Information +========================== + +.. toctree:: + :maxdepth: 2 + + cs35l56 diff --git a/Documentation/sound/index.rst b/Documentation/sound/index.rst index c437f2a4bc85..51cd736f65b5 100644 --- a/Documentation/sound/index.rst +++ b/Documentation/sound/index.rst @@ -13,6 +13,7 @@ Sound Subsystem Documentation alsa-configuration hd-audio/index cards/index + codecs/index utimers .. only:: subproject and html diff --git a/Documentation/trace/ftrace.rst b/Documentation/trace/ftrace.rst index 272464bb7c60..2b74f96d09d5 100644 --- a/Documentation/trace/ftrace.rst +++ b/Documentation/trace/ftrace.rst @@ -810,6 +810,12 @@ Here is the list of current tracers that may be configured. to draw a graph of function calls similar to C code source. + Note that the function graph calculates the timings of when the + function starts and returns internally and for each instance. If + there are two instances that run the function graph tracer and trace + the same functions, the length of the timings may be slightly off as + each reads the timestamp separately and not at the same time. + "blk" The block tracer. 
The tracer used by the blktrace user diff --git a/Documentation/translations/it_IT/core-api/symbol-namespaces.rst b/Documentation/translations/it_IT/core-api/symbol-namespaces.rst index 17abc25ee4c1..6ee713988531 100644 --- a/Documentation/translations/it_IT/core-api/symbol-namespaces.rst +++ b/Documentation/translations/it_IT/core-api/symbol-namespaces.rst @@ -43,7 +43,7 @@ Tenete presente che per via dell'espansione delle macro questo argomento deve essere un simbolo di preprocessore. Per esempio per esportare il simbolo ``usb_stor_suspend`` nello spazio dei nomi ``USB_STORAGE`` usate:: - EXPORT_SYMBOL_NS(usb_stor_suspend, USB_STORAGE); + EXPORT_SYMBOL_NS(usb_stor_suspend, "USB_STORAGE"); Di conseguenza, nella tabella dei simboli del kernel ci sarà una voce rappresentata dalla struttura ``kernel_symbol`` che avrà il campo @@ -69,7 +69,7 @@ Per esempio per esportare tutti i simboli definiti in usb-common nello spazio dei nomi USB_COMMON, si può aggiungere la seguente linea in drivers/usb/common/Makefile:: - ccflags-y += -DDEFAULT_SYMBOL_NAMESPACE=USB_COMMON + ccflags-y += -DDEFAULT_SYMBOL_NAMESPACE='"USB_COMMON"' Questo cambierà tutte le macro EXPORT_SYMBOL() ed EXPORT_SYMBOL_GPL(). Invece, un simbolo esportato con EXPORT_SYMBOL_NS() non verrà cambiato e il simbolo @@ -79,7 +79,7 @@ Una seconda possibilità è quella di definire il simbolo di preprocessore direttamente nei file da compilare. L'esempio precedente diventerebbe:: #undef DEFAULT_SYMBOL_NAMESPACE - #define DEFAULT_SYMBOL_NAMESPACE USB_COMMON + #define DEFAULT_SYMBOL_NAMESPACE "USB_COMMON" Questo va messo prima di un qualsiasi uso di EXPORT_SYMBOL. @@ -94,7 +94,7 @@ dei nomi che contiene i simboli desiderati. Per esempio un modulo che usa il simbolo usb_stor_suspend deve importare lo spazio dei nomi USB_STORAGE usando la seguente dichiarazione:: - MODULE_IMPORT_NS(USB_STORAGE); + MODULE_IMPORT_NS("USB_STORAGE"); Questo creerà un'etichetta ``modinfo`` per ogni spazio dei nomi importato. 
Un risvolto di questo fatto è che gli spazi dei diff --git a/Documentation/translations/zh_CN/core-api/symbol-namespaces.rst b/Documentation/translations/zh_CN/core-api/symbol-namespaces.rst index bb16f0611046..b1bec219912d 100644 --- a/Documentation/translations/zh_CN/core-api/symbol-namespaces.rst +++ b/Documentation/translations/zh_CN/core-api/symbol-namespaces.rst @@ -48,7 +48,7 @@ 要是一个预处理器符号。例如,要把符号 ``usb_stor_suspend`` 导出到命名空间 ``USB_STORAGE``, 请使用:: - EXPORT_SYMBOL_NS(usb_stor_suspend, USB_STORAGE); + EXPORT_SYMBOL_NS(usb_stor_suspend, "USB_STORAGE"); 相应的 ksymtab 条目结构体 ``kernel_symbol`` 将有相应的成员 ``命名空间`` 集。 导出时未指明命名空间的符号将指向 ``NULL`` 。如果没有定义命名空间,则默认没有。 @@ -66,7 +66,7 @@ 子系统的 ``Makefile`` 中定义默认命名空间。例如,如果要将usb-common中定义的所有符号导 出到USB_COMMON命名空间,可以在drivers/usb/common/Makefile中添加这样一行:: - ccflags-y += -DDEFAULT_SYMBOL_NAMESPACE=USB_COMMON + ccflags-y += -DDEFAULT_SYMBOL_NAMESPACE='"USB_COMMON"' 这将影响所有 EXPORT_SYMBOL() 和 EXPORT_SYMBOL_GPL() 语句。当这个定义存在时, 用EXPORT_SYMBOL_NS()导出的符号仍然会被导出到作为命名空间参数传递的命名空间中, @@ -76,7 +76,7 @@ 成:: #undef DEFAULT_SYMBOL_NAMESPACE - #define DEFAULT_SYMBOL_NAMESPACE USB_COMMON + #define DEFAULT_SYMBOL_NAMESPACE "USB_COMMON" 应置于相关编译单元中任何 EXPORT_SYMBOL 宏之前 @@ -88,7 +88,7 @@ 表示它所使用的命名空间的符号。例如,一个使用usb_stor_suspend符号的 模块,需要使用如下语句导入命名空间USB_STORAGE:: - MODULE_IMPORT_NS(USB_STORAGE); + MODULE_IMPORT_NS("USB_STORAGE"); 这将在模块中为每个导入的命名空间创建一个 ``modinfo`` 标签。这也顺带 使得可以用modinfo检查模块已导入的命名空间:: diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst index 454c2aaa155e..f15b61317aad 100644 --- a/Documentation/virt/kvm/api.rst +++ b/Documentation/virt/kvm/api.rst @@ -1914,6 +1914,9 @@ No flags are specified so far, the corresponding field must be set to zero. #define KVM_IRQ_ROUTING_HV_SINT 4 #define KVM_IRQ_ROUTING_XEN_EVTCHN 5 +On s390, adding a KVM_IRQ_ROUTING_S390_ADAPTER is rejected on ucontrol VMs with +error -EINVAL. 
+ flags: - KVM_MSI_VALID_DEVID: used along with KVM_IRQ_ROUTING_MSI routing entry diff --git a/Documentation/virt/kvm/devices/s390_flic.rst b/Documentation/virt/kvm/devices/s390_flic.rst index ea96559ba501..b784f8016748 100644 --- a/Documentation/virt/kvm/devices/s390_flic.rst +++ b/Documentation/virt/kvm/devices/s390_flic.rst @@ -58,11 +58,15 @@ Groups: Enables async page faults for the guest. So in case of a major page fault the host is allowed to handle this async and continues the guest. + -EINVAL is returned when called on the FLIC of a ucontrol VM. + KVM_DEV_FLIC_APF_DISABLE_WAIT Disables async page faults for the guest and waits until already pending async page faults are done. This is necessary to trigger a completion interrupt for every init interrupt before migrating the interrupt list. + -EINVAL is returned when called on the FLIC of a ucontrol VM. + KVM_DEV_FLIC_ADAPTER_REGISTER Register an I/O adapter interrupt source. Takes a kvm_s390_io_adapter describing the adapter to register:: diff --git a/Documentation/watchdog/watchdog-parameters.rst b/Documentation/watchdog/watchdog-parameters.rst index 29153eed6689..0a0119edfa82 100644 --- a/Documentation/watchdog/watchdog-parameters.rst +++ b/Documentation/watchdog/watchdog-parameters.rst @@ -120,16 +120,6 @@ coh901327_wdt: ------------------------------------------------- -cpu5wdt: - port: - base address of watchdog card, default is 0x91 - verbose: - be verbose, default is 0 (no) - ticks: - count down ticks, default is 10000 - -------------------------------------------------- - cpwd: wd0_timeout: Default watchdog0 timeout in 1/10secs diff --git a/MAINTAINERS b/MAINTAINERS index c669c5bd61e7..fd4f528bc414 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -949,7 +949,6 @@ AMAZON ETHERNET DRIVERS M: Shay Agroskin M: Arthur Kiyanovski R: David Arinzon -R: Noam Dagan R: Saeed Bishara L: netdev@vger.kernel.org S: Supported @@ -1194,6 +1193,17 @@ L: linux-spi@vger.kernel.org S: Supported F: drivers/spi/spi-amd.c +AMD 
XDNA DRIVER +M: Min Ma +M: Lizhi Hou +L: dri-devel@lists.freedesktop.org +S: Supported +T: git https://gitlab.freedesktop.org/drm/misc/kernel.git +F: Documentation/accel/amdxdna/ +F: drivers/accel/amdxdna/ +F: include/trace/events/amdxdna.h +F: include/uapi/drm/amdxdna_accel.h + AMD XGBE DRIVER M: "Shyam Sundar S K" L: netdev@vger.kernel.org @@ -1797,7 +1807,6 @@ F: include/uapi/linux/if_arcnet.h ARM AND ARM64 SoC SUB-ARCHITECTURES (COMMON PARTS) M: Arnd Bergmann -M: Olof Johansson L: linux-arm-kernel@lists.infradead.org (moderated for non-subscribers) L: soc@lists.linux.dev S: Maintained @@ -2691,7 +2700,6 @@ N: at91 N: atmel ARM/Microchip Sparx5 SoC support -M: Lars Povlsen M: Steen Hegelund M: Daniel Machon M: UNGLinuxDriver@microchip.com @@ -3376,6 +3384,8 @@ S: Maintained T: git git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git F: Documentation/arch/arm64/ F: arch/arm64/ +F: drivers/virt/coco/arm-cca-guest/ +F: drivers/virt/coco/pkvm-guest/ F: tools/testing/selftests/arm64/ X: arch/arm64/boot/dts/ @@ -3606,6 +3616,7 @@ F: drivers/phy/qualcomm/phy-ath79-usb.c ATHEROS ATH GENERIC UTILITIES M: Kalle Valo +M: Jeff Johnson L: linux-wireless@vger.kernel.org S: Supported F: drivers/net/wireless/ath/* @@ -3891,7 +3902,7 @@ W: http://www.baycom.org/~tom/ham/ham.html F: drivers/net/hamradio/baycom* BCACHE (BLOCK LAYER CACHE) -M: Coly Li +M: Coly Li M: Kent Overstreet L: linux-bcache@vger.kernel.org S: Maintained @@ -4056,7 +4067,6 @@ F: net/bluetooth/ BONDING DRIVER M: Jay Vosburgh -M: Andy Gospodarek L: netdev@vger.kernel.org S: Maintained F: Documentation/networking/bonding.rst @@ -4129,7 +4139,6 @@ S: Odd Fixes F: drivers/net/ethernet/netronome/nfp/bpf/ BPF JIT for POWERPC (32-BIT AND 64-BIT) -M: Michael Ellerman M: Hari Bathini M: Christophe Leroy R: Naveen N Rao @@ -5467,6 +5476,7 @@ L: linux-sound@vger.kernel.org L: patches@opensource.cirrus.com S: Maintained F: Documentation/devicetree/bindings/sound/cirrus,cs* +F: Documentation/sound/codecs/cs* F: 
drivers/mfd/cs42l43* F: drivers/pinctrl/cirrus/pinctrl-cs42l43* F: drivers/spi/spi-cs42l43* @@ -7068,7 +7078,8 @@ T: git https://gitlab.freedesktop.org/drm/misc/kernel.git F: drivers/gpu/drm/sun4i/sun8i* DRM DRIVER FOR ARM PL111 CLCD -S: Orphan +M: Linus Walleij +S: Maintained T: git https://gitlab.freedesktop.org/drm/misc/kernel.git F: drivers/gpu/drm/pl111/ @@ -7383,7 +7394,7 @@ L: virtualization@lists.linux.dev S: Obsolete W: https://www.kraxel.org/blog/2014/10/qemu-using-cirrus-considered-harmful/ T: git https://gitlab.freedesktop.org/drm/misc/kernel.git -F: drivers/gpu/drm/tiny/cirrus.c +F: drivers/gpu/drm/tiny/cirrus-qemu.c DRM DRIVER FOR QXL VIRTUAL GPU M: Dave Airlie @@ -7794,6 +7805,7 @@ F: drivers/gpu/drm/rockchip/ DRM DRIVERS FOR STI M: Alain Volmat +M: Raphael Gallais-Pou L: dri-devel@lists.freedesktop.org S: Maintained T: git https://gitlab.freedesktop.org/drm/misc/kernel.git @@ -8451,7 +8463,7 @@ F: include/video/s1d13xxxfb.h EROFS FILE SYSTEM M: Gao Xiang M: Chao Yu -R: Yue Hu +R: Yue Hu R: Jeffle Xu R: Sandeep Dhavale L: linux-erofs@lists.ozlabs.org @@ -12630,7 +12642,7 @@ F: arch/mips/include/uapi/asm/kvm* F: arch/mips/kvm/ KERNEL VIRTUAL MACHINE FOR POWERPC (KVM/powerpc) -M: Michael Ellerman +M: Madhavan Srinivasan R: Nicholas Piggin L: linuxppc-dev@lists.ozlabs.org L: kvm@vger.kernel.org @@ -13209,11 +13221,11 @@ X: drivers/macintosh/adb-iop.c X: drivers/macintosh/via-macii.c LINUX FOR POWERPC (32-BIT AND 64-BIT) +M: Madhavan Srinivasan M: Michael Ellerman R: Nicholas Piggin R: Christophe Leroy R: Naveen N Rao -M: Madhavan Srinivasan L: linuxppc-dev@lists.ozlabs.org S: Supported W: https://github.com/linuxppc/wiki/wiki @@ -14564,7 +14576,6 @@ F: drivers/dma/mediatek/ MEDIATEK ETHERNET DRIVER M: Felix Fietkau M: Sean Wang -M: Mark Lee M: Lorenzo Bianconi L: netdev@vger.kernel.org S: Maintained @@ -14754,7 +14765,7 @@ F: drivers/memory/mtk-smi.c F: include/soc/mediatek/smi.h MEDIATEK SWITCH DRIVER -M: Arınç ÜNAL +M: Chester A. 
Unal M: Daniel Golle M: DENG Qingfang M: Sean Wang @@ -15343,7 +15354,7 @@ M: Daniel Machon M: UNGLinuxDriver@microchip.com L: netdev@vger.kernel.org S: Maintained -F: drivers/net/ethernet/microchip/lan969x/* +F: drivers/net/ethernet/microchip/sparx5/lan969x/* MICROCHIP LCDFB DRIVER M: Nicolas Ferre @@ -16267,6 +16278,7 @@ F: Documentation/devicetree/bindings/net/ F: Documentation/networking/net_cachelines/net_device.rst F: drivers/connector/ F: drivers/net/ +F: drivers/ptp/ F: include/dt-bindings/net/ F: include/linux/cn_proc.h F: include/linux/etherdevice.h @@ -16334,6 +16346,7 @@ F: Documentation/networking/ F: Documentation/networking/net_cachelines/ F: Documentation/process/maintainer-netdev.rst F: Documentation/userspace-api/netlink/ +F: include/linux/ethtool.h F: include/linux/framer/framer-provider.h F: include/linux/framer/framer.h F: include/linux/in.h @@ -16348,6 +16361,7 @@ F: include/linux/rtnetlink.h F: include/linux/seq_file_net.h F: include/linux/skbuff* F: include/net/ +F: include/uapi/linux/ethtool.h F: include/uapi/linux/genetlink.h F: include/uapi/linux/hsr_netlink.h F: include/uapi/linux/in.h @@ -18455,7 +18469,7 @@ F: Documentation/devicetree/bindings/pinctrl/mediatek,mt8183-pinctrl.yaml F: drivers/pinctrl/mediatek/ PIN CONTROLLER - MEDIATEK MIPS -M: Arınç ÜNAL +M: Chester A. Unal M: Sergio Paracuellos L: linux-mediatek@lists.infradead.org (moderated for non-subscribers) L: linux-mips@vger.kernel.org @@ -19499,7 +19513,7 @@ S: Maintained F: arch/mips/ralink RALINK MT7621 MIPS ARCHITECTURE -M: Arınç ÜNAL +M: Chester A. 
Unal M: Sergio Paracuellos L: linux-mips@vger.kernel.org S: Maintained @@ -20902,6 +20916,8 @@ F: kernel/sched/ SCHEDULER - SCHED_EXT R: Tejun Heo R: David Vernet +R: Andrea Righi +R: Changwoo Min L: linux-kernel@vger.kernel.org S: Maintained W: https://github.com/sched-ext/scx @@ -21986,6 +22002,7 @@ W: https://github.com/thesofproject/linux/ F: sound/soc/sof/ SOUND - GENERIC SOUND CARD (Simple-Audio-Card, Audio-Graph-Card) +M: Mark Brown M: Kuninori Morimoto S: Supported L: linux-sound@vger.kernel.org @@ -22407,7 +22424,7 @@ F: drivers/char/hw_random/jh7110-trng.c STARFIVE WATCHDOG DRIVER M: Xingyu Wu -M: Samin Guo +M: Ziv Xu S: Supported F: Documentation/devicetree/bindings/watchdog/starfive* F: drivers/watchdog/starfive-wdt.c @@ -22496,11 +22513,8 @@ F: Documentation/devicetree/bindings/phy/st,stm32mp25-combophy.yaml F: drivers/phy/st/phy-stm32-combophy.c STMMAC ETHERNET DRIVER -M: Alexandre Torgue -M: Jose Abreu L: netdev@vger.kernel.org -S: Supported -W: http://www.stlinux.com +S: Orphan F: Documentation/networking/device_drivers/ethernet/stmicro/ F: drivers/net/ethernet/stmicro/stmmac/ @@ -22732,9 +22746,8 @@ S: Supported F: drivers/net/ethernet/synopsys/ SYNOPSYS DESIGNWARE ETHERNET XPCS DRIVER -M: Jose Abreu L: netdev@vger.kernel.org -S: Supported +S: Orphan F: drivers/net/pcs/pcs-xpcs.c F: drivers/net/pcs/pcs-xpcs.h F: include/linux/pcs/pcs-xpcs.h @@ -23642,7 +23655,6 @@ F: tools/testing/selftests/timers/ TIPC NETWORK LAYER M: Jon Maloy -M: Ying Xue L: netdev@vger.kernel.org (core kernel code) L: tipc-discussion@lists.sourceforge.net (user apps, general discussion) S: Maintained @@ -24248,7 +24260,8 @@ F: Documentation/devicetree/bindings/usb/nxp,isp1760.yaml F: drivers/usb/isp1760/* USB LAN78XX ETHERNET DRIVER -M: Woojung Huh +M: Thangaraj Samynathan +M: Rengarajan Sundararajan M: UNGLinuxDriver@microchip.com L: netdev@vger.kernel.org S: Maintained diff --git a/Makefile b/Makefile index 93ab62cef244..b9464c88ac72 100644 --- a/Makefile +++ b/Makefile @@ 
-2,7 +2,7 @@ VERSION = 6 PATCHLEVEL = 13 SUBLEVEL = 0 -EXTRAVERSION = -rc1 +EXTRAVERSION = NAME = Baby Opossum Posse # *DOCUMENTATION* diff --git a/arch/arc/Kconfig b/arch/arc/Kconfig index 5b2488142041..4f2eeda907ec 100644 --- a/arch/arc/Kconfig +++ b/arch/arc/Kconfig @@ -6,6 +6,7 @@ config ARC def_bool y select ARC_TIMERS + select ARCH_HAS_CPU_CACHE_ALIASING select ARCH_HAS_CACHE_LINE_SIZE select ARCH_HAS_DEBUG_VM_PGTABLE select ARCH_HAS_DMA_PREP_COHERENT @@ -297,7 +298,6 @@ config ARC_PAGE_SIZE_16K config ARC_PAGE_SIZE_4K bool "4KB" select HAVE_PAGE_SIZE_4KB - depends on ARC_MMU_V3 || ARC_MMU_V4 endchoice @@ -474,7 +474,8 @@ config HIGHMEM config ARC_HAS_PAE40 bool "Support for the 40-bit Physical Address Extension" - depends on ISA_ARCV2 + depends on ARC_MMU_V4 + depends on !ARC_PAGE_SIZE_4K select HIGHMEM select PHYS_ADDR_T_64BIT help diff --git a/arch/arc/Makefile b/arch/arc/Makefile index 2390dd042e36..fb98478ed1ab 100644 --- a/arch/arc/Makefile +++ b/arch/arc/Makefile @@ -6,7 +6,7 @@ KBUILD_DEFCONFIG := haps_hs_smp_defconfig ifeq ($(CROSS_COMPILE),) -CROSS_COMPILE := $(call cc-cross-prefix, arc-linux- arceb-linux-) +CROSS_COMPILE := $(call cc-cross-prefix, arc-linux- arceb-linux- arc-linux-gnu-) endif cflags-y += -fno-common -pipe -fno-builtin -mmedium-calls -D__linux__ diff --git a/arch/arc/boot/dts/axc001.dtsi b/arch/arc/boot/dts/axc001.dtsi index 2a151607b080..88bcc7ab6f5a 100644 --- a/arch/arc/boot/dts/axc001.dtsi +++ b/arch/arc/boot/dts/axc001.dtsi @@ -54,7 +54,7 @@ compatible = "snps,dw-apb-gpio-port"; gpio-controller; #gpio-cells = <2>; - snps,nr-gpios = <30>; + ngpios = <30>; reg = <0>; interrupt-controller; #interrupt-cells = <2>; diff --git a/arch/arc/boot/dts/axc003.dtsi b/arch/arc/boot/dts/axc003.dtsi index c0a812674ce9..9a2dc39a5cff 100644 --- a/arch/arc/boot/dts/axc003.dtsi +++ b/arch/arc/boot/dts/axc003.dtsi @@ -62,7 +62,7 @@ compatible = "snps,dw-apb-gpio-port"; gpio-controller; #gpio-cells = <2>; - snps,nr-gpios = <30>; + ngpios = <30>; reg 
= <0>; interrupt-controller; #interrupt-cells = <2>; diff --git a/arch/arc/boot/dts/axc003_idu.dtsi b/arch/arc/boot/dts/axc003_idu.dtsi index 67556f4b7057..f31382cb8be4 100644 --- a/arch/arc/boot/dts/axc003_idu.dtsi +++ b/arch/arc/boot/dts/axc003_idu.dtsi @@ -69,7 +69,7 @@ compatible = "snps,dw-apb-gpio-port"; gpio-controller; #gpio-cells = <2>; - snps,nr-gpios = <30>; + ngpios = <30>; reg = <0>; interrupt-controller; #interrupt-cells = <2>; diff --git a/arch/arc/boot/dts/axs10x_mb.dtsi b/arch/arc/boot/dts/axs10x_mb.dtsi index b64435385304..3add2fe257f8 100644 --- a/arch/arc/boot/dts/axs10x_mb.dtsi +++ b/arch/arc/boot/dts/axs10x_mb.dtsi @@ -250,7 +250,7 @@ compatible = "snps,dw-apb-gpio-port"; gpio-controller; #gpio-cells = <2>; - snps,nr-gpios = <32>; + ngpios = <32>; reg = <0>; }; @@ -258,7 +258,7 @@ compatible = "snps,dw-apb-gpio-port"; gpio-controller; #gpio-cells = <2>; - snps,nr-gpios = <8>; + ngpios = <8>; reg = <1>; }; @@ -266,7 +266,7 @@ compatible = "snps,dw-apb-gpio-port"; gpio-controller; #gpio-cells = <2>; - snps,nr-gpios = <8>; + ngpios = <8>; reg = <2>; }; }; @@ -281,7 +281,7 @@ compatible = "snps,dw-apb-gpio-port"; gpio-controller; #gpio-cells = <2>; - snps,nr-gpios = <30>; + ngpios = <30>; reg = <0>; }; @@ -289,7 +289,7 @@ compatible = "snps,dw-apb-gpio-port"; gpio-controller; #gpio-cells = <2>; - snps,nr-gpios = <10>; + ngpios = <10>; reg = <1>; }; @@ -297,7 +297,7 @@ compatible = "snps,dw-apb-gpio-port"; gpio-controller; #gpio-cells = <2>; - snps,nr-gpios = <8>; + ngpios = <8>; reg = <2>; }; }; diff --git a/arch/arc/boot/dts/hsdk.dts b/arch/arc/boot/dts/hsdk.dts index 41b980df862b..98bb850722a4 100644 --- a/arch/arc/boot/dts/hsdk.dts +++ b/arch/arc/boot/dts/hsdk.dts @@ -308,7 +308,7 @@ compatible = "snps,dw-apb-gpio-port"; gpio-controller; #gpio-cells = <2>; - snps,nr-gpios = <24>; + ngpios = <24>; reg = <0>; }; }; diff --git a/arch/arc/include/asm/arcregs.h b/arch/arc/include/asm/arcregs.h index 4b13f60fe7ca..005d9e4d187a 100644 --- 
a/arch/arc/include/asm/arcregs.h +++ b/arch/arc/include/asm/arcregs.h @@ -146,7 +146,7 @@ #ifndef __ASSEMBLY__ -#include +#include /* Helpers */ #define TO_KB(bytes) ((bytes) >> 10) diff --git a/arch/arc/include/asm/cachetype.h b/arch/arc/include/asm/cachetype.h new file mode 100644 index 000000000000..acd3b6cb4bf5 --- /dev/null +++ b/arch/arc/include/asm/cachetype.h @@ -0,0 +1,8 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef __ASM_ARC_CACHETYPE_H +#define __ASM_ARC_CACHETYPE_H + +#define cpu_dcache_is_aliasing() false +#define cpu_icache_is_aliasing() true + +#endif diff --git a/arch/arc/include/asm/cmpxchg.h b/arch/arc/include/asm/cmpxchg.h index 58045c898340..76f43db0890f 100644 --- a/arch/arc/include/asm/cmpxchg.h +++ b/arch/arc/include/asm/cmpxchg.h @@ -48,7 +48,7 @@ \ switch(sizeof((_p_))) { \ case 1: \ - _prev_ = (__typeof__(*(ptr)))cmpxchg_emu_u8((volatile u8 *)_p_, (uintptr_t)_o_, (uintptr_t)_n_); \ + _prev_ = (__typeof__(*(ptr)))cmpxchg_emu_u8((volatile u8 *__force)_p_, (uintptr_t)_o_, (uintptr_t)_n_); \ break; \ case 4: \ _prev_ = __cmpxchg(_p_, _o_, _n_); \ diff --git a/arch/arc/include/asm/mmu-arcv2.h b/arch/arc/include/asm/mmu-arcv2.h index d85dc0721907..41412642f279 100644 --- a/arch/arc/include/asm/mmu-arcv2.h +++ b/arch/arc/include/asm/mmu-arcv2.h @@ -9,7 +9,7 @@ #ifndef _ASM_ARC_MMU_ARCV2_H #define _ASM_ARC_MMU_ARCV2_H -#include +#include /* * TLB Management regs diff --git a/arch/arc/net/bpf_jit_arcv2.c b/arch/arc/net/bpf_jit_arcv2.c index 4458e409ca0a..6d989b6d88c6 100644 --- a/arch/arc/net/bpf_jit_arcv2.c +++ b/arch/arc/net/bpf_jit_arcv2.c @@ -2916,7 +2916,7 @@ bool check_jmp_32(u32 curr_off, u32 targ_off, u8 cond) addendum = (cond == ARC_CC_AL) ? 
0 : INSN_len_normal; disp = get_displacement(curr_off + addendum, targ_off); - if (ARC_CC_AL) + if (cond == ARC_CC_AL) return is_valid_far_disp(disp); else return is_valid_near_disp(disp); diff --git a/arch/arm/boot/dts/nxp/imx/imxrt1050.dtsi b/arch/arm/boot/dts/nxp/imx/imxrt1050.dtsi index dd714d235d5f..b0bad0d1ba36 100644 --- a/arch/arm/boot/dts/nxp/imx/imxrt1050.dtsi +++ b/arch/arm/boot/dts/nxp/imx/imxrt1050.dtsi @@ -87,7 +87,7 @@ reg = <0x402c0000 0x4000>; interrupts = <110>; clocks = <&clks IMXRT1050_CLK_IPG_PDOF>, - <&clks IMXRT1050_CLK_OSC>, + <&clks IMXRT1050_CLK_AHB_PODF>, <&clks IMXRT1050_CLK_USDHC1>; clock-names = "ipg", "ahb", "per"; bus-width = <4>; diff --git a/arch/arm/common/locomo.c b/arch/arm/common/locomo.c index 06b0e5fd54a6..cb6ef449b987 100644 --- a/arch/arm/common/locomo.c +++ b/arch/arm/common/locomo.c @@ -516,7 +516,7 @@ static void locomo_remove(struct platform_device *dev) */ static struct platform_driver locomo_device_driver = { .probe = locomo_probe, - .remove_new = locomo_remove, + .remove = locomo_remove, #ifdef CONFIG_PM .suspend = locomo_suspend, .resume = locomo_resume, diff --git a/arch/arm/common/sa1111.c b/arch/arm/common/sa1111.c index 550978dc3c50..9846f30990f7 100644 --- a/arch/arm/common/sa1111.c +++ b/arch/arm/common/sa1111.c @@ -1154,7 +1154,7 @@ static struct dev_pm_ops sa1111_pm_ops = { */ static struct platform_driver sa1111_device_driver = { .probe = sa1111_probe, - .remove_new = sa1111_remove, + .remove = sa1111_remove, .driver = { .name = "sa1111", .pm = &sa1111_pm_ops, diff --git a/arch/arm/common/scoop.c b/arch/arm/common/scoop.c index 9018c7240166..0b08b6621878 100644 --- a/arch/arm/common/scoop.c +++ b/arch/arm/common/scoop.c @@ -250,7 +250,7 @@ static void scoop_remove(struct platform_device *pdev) static struct platform_driver scoop_driver = { .probe = scoop_probe, - .remove_new = scoop_remove, + .remove = scoop_remove, .suspend = scoop_suspend, .resume = scoop_resume, .driver = { diff --git 
a/arch/arm/configs/imx_v6_v7_defconfig b/arch/arm/configs/imx_v6_v7_defconfig index 0beecdde55f5..f25eadcba5e6 100644 --- a/arch/arm/configs/imx_v6_v7_defconfig +++ b/arch/arm/configs/imx_v6_v7_defconfig @@ -323,6 +323,7 @@ CONFIG_SND_SOC_IMX_SGTL5000=y CONFIG_SND_SOC_FSL_ASOC_CARD=y CONFIG_SND_SOC_AC97_CODEC=y CONFIG_SND_SOC_CS42XX8_I2C=y +CONFIG_SND_SOC_SPDIF=y CONFIG_SND_SOC_TLV320AIC3X_I2C=y CONFIG_SND_SOC_WM8960=y CONFIG_SND_SOC_WM8962=y diff --git a/arch/arm/mach-imx/Kconfig b/arch/arm/mach-imx/Kconfig index e4fe059cd861..dc47b2312127 100644 --- a/arch/arm/mach-imx/Kconfig +++ b/arch/arm/mach-imx/Kconfig @@ -6,6 +6,7 @@ menuconfig ARCH_MXC select CLKSRC_IMX_GPT select GENERIC_IRQ_CHIP select GPIOLIB + select PINCTRL select PM_OPP if PM select SOC_BUS select SRAM diff --git a/arch/arm/mach-imx/mmdc.c b/arch/arm/mach-imx/mmdc.c index b68cb86dbe4c..e898f7c2733e 100644 --- a/arch/arm/mach-imx/mmdc.c +++ b/arch/arm/mach-imx/mmdc.c @@ -596,7 +596,7 @@ static struct platform_driver imx_mmdc_driver = { .of_match_table = imx_mmdc_dt_ids, }, .probe = imx_mmdc_probe, - .remove_new = imx_mmdc_remove, + .remove = imx_mmdc_remove, }; static int __init imx_mmdc_init(void) diff --git a/arch/arm/mach-omap1/omap-dma.c b/arch/arm/mach-omap1/omap-dma.c index f091f78631d0..aebe5e55ff60 100644 --- a/arch/arm/mach-omap1/omap-dma.c +++ b/arch/arm/mach-omap1/omap-dma.c @@ -832,7 +832,7 @@ static void omap_system_dma_remove(struct platform_device *pdev) static struct platform_driver omap_system_dma_driver = { .probe = omap_system_dma_probe, - .remove_new = omap_system_dma_remove, + .remove = omap_system_dma_remove, .driver = { .name = "omap_dma_system" }, diff --git a/arch/arm/mach-pxa/sharpsl_pm.c b/arch/arm/mach-pxa/sharpsl_pm.c index 72fa2e3fd353..0c8d9000df5a 100644 --- a/arch/arm/mach-pxa/sharpsl_pm.c +++ b/arch/arm/mach-pxa/sharpsl_pm.c @@ -919,7 +919,7 @@ static void sharpsl_pm_remove(struct platform_device *pdev) static struct platform_driver sharpsl_pm_driver = { .probe = 
sharpsl_pm_probe, - .remove_new = sharpsl_pm_remove, + .remove = sharpsl_pm_remove, .suspend = sharpsl_pm_suspend, .resume = sharpsl_pm_resume, .driver = { diff --git a/arch/arm/mach-sa1100/jornada720_ssp.c b/arch/arm/mach-sa1100/jornada720_ssp.c index 1956b095e699..d94810217095 100644 --- a/arch/arm/mach-sa1100/jornada720_ssp.c +++ b/arch/arm/mach-sa1100/jornada720_ssp.c @@ -188,7 +188,7 @@ static void jornada_ssp_remove(struct platform_device *dev) struct platform_driver jornadassp_driver = { .probe = jornada_ssp_probe, - .remove_new = jornada_ssp_remove, + .remove = jornada_ssp_remove, .driver = { .name = "jornada_ssp", }, diff --git a/arch/arm/mach-sa1100/neponset.c b/arch/arm/mach-sa1100/neponset.c index 0ef0ebbf31ac..88fe79f0a4ed 100644 --- a/arch/arm/mach-sa1100/neponset.c +++ b/arch/arm/mach-sa1100/neponset.c @@ -423,7 +423,7 @@ static const struct dev_pm_ops neponset_pm_ops = { static struct platform_driver neponset_device_driver = { .probe = neponset_probe, - .remove_new = neponset_remove, + .remove = neponset_remove, .driver = { .name = "neponset", .pm = PM_OPS, diff --git a/arch/arm64/boot/dts/arm/fvp-base-revc.dts b/arch/arm64/boot/dts/arm/fvp-base-revc.dts index 19973ab4ea6b..9e10d7a6b5a2 100644 --- a/arch/arm64/boot/dts/arm/fvp-base-revc.dts +++ b/arch/arm64/boot/dts/arm/fvp-base-revc.dts @@ -233,7 +233,7 @@ #interrupt-cells = <0x1>; compatible = "pci-host-ecam-generic"; device_type = "pci"; - bus-range = <0x0 0x1>; + bus-range = <0x0 0xff>; reg = <0x0 0x40000000 0x0 0x10000000>; ranges = <0x2000000 0x0 0x50000000 0x0 0x50000000 0x0 0x10000000>; interrupt-map = <0 0 0 1 &gic 0 0 GIC_SPI 168 IRQ_TYPE_LEVEL_HIGH>, diff --git a/arch/arm64/boot/dts/broadcom/bcm2712.dtsi b/arch/arm64/boot/dts/broadcom/bcm2712.dtsi index 6e5a984c1d4e..26a29e5e5078 100644 --- a/arch/arm64/boot/dts/broadcom/bcm2712.dtsi +++ b/arch/arm64/boot/dts/broadcom/bcm2712.dtsi @@ -67,7 +67,7 @@ l2_cache_l0: l2-cache-l0 { compatible = "cache"; cache-size = <0x80000>; - cache-line-size 
= <128>; + cache-line-size = <64>; cache-sets = <1024>; //512KiB(size)/64(line-size)=8192ways/8-way set cache-level = <2>; cache-unified; @@ -91,7 +91,7 @@ l2_cache_l1: l2-cache-l1 { compatible = "cache"; cache-size = <0x80000>; - cache-line-size = <128>; + cache-line-size = <64>; cache-sets = <1024>; //512KiB(size)/64(line-size)=8192ways/8-way set cache-level = <2>; cache-unified; @@ -115,7 +115,7 @@ l2_cache_l2: l2-cache-l2 { compatible = "cache"; cache-size = <0x80000>; - cache-line-size = <128>; + cache-line-size = <64>; cache-sets = <1024>; //512KiB(size)/64(line-size)=8192ways/8-way set cache-level = <2>; cache-unified; @@ -139,7 +139,7 @@ l2_cache_l3: l2-cache-l3 { compatible = "cache"; cache-size = <0x80000>; - cache-line-size = <128>; + cache-line-size = <64>; cache-sets = <1024>; //512KiB(size)/64(line-size)=8192ways/8-way set cache-level = <2>; cache-unified; diff --git a/arch/arm64/boot/dts/freescale/imx8-ss-audio.dtsi b/arch/arm64/boot/dts/freescale/imx8-ss-audio.dtsi index a60ebb718789..c32a6947ae9c 100644 --- a/arch/arm64/boot/dts/freescale/imx8-ss-audio.dtsi +++ b/arch/arm64/boot/dts/freescale/imx8-ss-audio.dtsi @@ -165,7 +165,7 @@ audio_subsys: bus@59000000 { }; esai0: esai@59010000 { - compatible = "fsl,imx8qm-esai"; + compatible = "fsl,imx8qm-esai", "fsl,imx6ull-esai"; reg = <0x59010000 0x10000>; interrupts = ; clocks = <&esai0_lpcg IMX_LPCG_CLK_4>, diff --git a/arch/arm64/boot/dts/freescale/imx8qm-ss-audio.dtsi b/arch/arm64/boot/dts/freescale/imx8qm-ss-audio.dtsi index e24e639b98ee..c9b55f02497a 100644 --- a/arch/arm64/boot/dts/freescale/imx8qm-ss-audio.dtsi +++ b/arch/arm64/boot/dts/freescale/imx8qm-ss-audio.dtsi @@ -134,7 +134,7 @@ }; esai1: esai@59810000 { - compatible = "fsl,imx8qm-esai"; + compatible = "fsl,imx8qm-esai", "fsl,imx6ull-esai"; reg = <0x59810000 0x10000>; interrupts = ; clocks = <&esai1_lpcg IMX_LPCG_CLK_0>, diff --git a/arch/arm64/boot/dts/freescale/imx95.dtsi b/arch/arm64/boot/dts/freescale/imx95.dtsi index 
d10f62eacfe0..e9c7a8265d71 100644 --- a/arch/arm64/boot/dts/freescale/imx95.dtsi +++ b/arch/arm64/boot/dts/freescale/imx95.dtsi @@ -1673,7 +1673,7 @@ netcmix_blk_ctrl: syscon@4c810000 { compatible = "nxp,imx95-netcmix-blk-ctrl", "syscon"; - reg = <0x0 0x4c810000 0x0 0x10000>; + reg = <0x0 0x4c810000 0x0 0x8>; #clock-cells = <1>; clocks = <&scmi_clk IMX95_CLK_BUSNETCMIX>; assigned-clocks = <&scmi_clk IMX95_CLK_BUSNETCMIX>; diff --git a/arch/arm64/boot/dts/qcom/sa8775p.dtsi b/arch/arm64/boot/dts/qcom/sa8775p.dtsi index 9f315a51a7c1..9da62d7c4d27 100644 --- a/arch/arm64/boot/dts/qcom/sa8775p.dtsi +++ b/arch/arm64/boot/dts/qcom/sa8775p.dtsi @@ -2440,6 +2440,7 @@ qcom,cmb-element-bits = <32>; qcom,cmb-msrs-num = <32>; + status = "disabled"; out-ports { port { @@ -6092,7 +6093,7 @@ <0x0 0x40000000 0x0 0xf20>, <0x0 0x40000f20 0x0 0xa8>, <0x0 0x40001000 0x0 0x4000>, - <0x0 0x40200000 0x0 0x100000>, + <0x0 0x40200000 0x0 0x1fe00000>, <0x0 0x01c03000 0x0 0x1000>, <0x0 0x40005000 0x0 0x2000>; reg-names = "parf", "dbi", "elbi", "atu", "addr_space", @@ -6250,7 +6251,7 @@ <0x0 0x60000000 0x0 0xf20>, <0x0 0x60000f20 0x0 0xa8>, <0x0 0x60001000 0x0 0x4000>, - <0x0 0x60200000 0x0 0x100000>, + <0x0 0x60200000 0x0 0x1fe00000>, <0x0 0x01c13000 0x0 0x1000>, <0x0 0x60005000 0x0 0x2000>; reg-names = "parf", "dbi", "elbi", "atu", "addr_space", diff --git a/arch/arm64/boot/dts/qcom/x1e78100-lenovo-thinkpad-t14s.dts b/arch/arm64/boot/dts/qcom/x1e78100-lenovo-thinkpad-t14s.dts index 975550139e10..66513fc8e67a 100644 --- a/arch/arm64/boot/dts/qcom/x1e78100-lenovo-thinkpad-t14s.dts +++ b/arch/arm64/boot/dts/qcom/x1e78100-lenovo-thinkpad-t14s.dts @@ -773,6 +773,10 @@ status = "okay"; }; +&usb_1_ss0_dwc3 { + dr_mode = "host"; +}; + &usb_1_ss0_dwc3_hs { remote-endpoint = <&pmic_glink_ss0_hs_in>; }; @@ -801,6 +805,10 @@ status = "okay"; }; +&usb_1_ss1_dwc3 { + dr_mode = "host"; +}; + &usb_1_ss1_dwc3_hs { remote-endpoint = <&pmic_glink_ss1_hs_in>; }; diff --git 
a/arch/arm64/boot/dts/qcom/x1e80100-crd.dts b/arch/arm64/boot/dts/qcom/x1e80100-crd.dts index 39f9d9cdc10d..d51a9bdcf67f 100644 --- a/arch/arm64/boot/dts/qcom/x1e80100-crd.dts +++ b/arch/arm64/boot/dts/qcom/x1e80100-crd.dts @@ -1197,6 +1197,10 @@ status = "okay"; }; +&usb_1_ss0_dwc3 { + dr_mode = "host"; +}; + &usb_1_ss0_dwc3_hs { remote-endpoint = <&pmic_glink_ss0_hs_in>; }; @@ -1225,6 +1229,10 @@ status = "okay"; }; +&usb_1_ss1_dwc3 { + dr_mode = "host"; +}; + &usb_1_ss1_dwc3_hs { remote-endpoint = <&pmic_glink_ss1_hs_in>; }; @@ -1253,6 +1261,10 @@ status = "okay"; }; +&usb_1_ss2_dwc3 { + dr_mode = "host"; +}; + &usb_1_ss2_dwc3_hs { remote-endpoint = <&pmic_glink_ss2_hs_in>; }; diff --git a/arch/arm64/boot/dts/qcom/x1e80100.dtsi b/arch/arm64/boot/dts/qcom/x1e80100.dtsi index 88805629ed2b..7e4f46ad8edd 100644 --- a/arch/arm64/boot/dts/qcom/x1e80100.dtsi +++ b/arch/arm64/boot/dts/qcom/x1e80100.dtsi @@ -2924,7 +2924,7 @@ #address-cells = <3>; #size-cells = <2>; ranges = <0x01000000 0x0 0x00000000 0x0 0x70200000 0x0 0x100000>, - <0x02000000 0x0 0x70300000 0x0 0x70300000 0x0 0x1d00000>; + <0x02000000 0x0 0x70300000 0x0 0x70300000 0x0 0x3d00000>; bus-range = <0x00 0xff>; dma-coherent; @@ -4066,8 +4066,6 @@ dma-coherent; - usb-role-switch; - ports { #address-cells = <1>; #size-cells = <0>; @@ -4321,8 +4319,6 @@ dma-coherent; - usb-role-switch; - ports { #address-cells = <1>; #size-cells = <0>; @@ -4421,8 +4417,6 @@ dma-coherent; - usb-role-switch; - ports { #address-cells = <1>; #size-cells = <0>; diff --git a/arch/arm64/boot/dts/rockchip/rk3328.dtsi b/arch/arm64/boot/dts/rockchip/rk3328.dtsi index 0597de415fe0..7d992c3c01ce 100644 --- a/arch/arm64/boot/dts/rockchip/rk3328.dtsi +++ b/arch/arm64/boot/dts/rockchip/rk3328.dtsi @@ -333,6 +333,7 @@ power-domain@RK3328_PD_HEVC { reg = ; + clocks = <&cru SCLK_VENC_CORE>; #power-domain-cells = <0>; }; power-domain@RK3328_PD_VIDEO { diff --git a/arch/arm64/boot/dts/rockchip/rk3568.dtsi b/arch/arm64/boot/dts/rockchip/rk3568.dtsi 
index ecaefe208e3e..695cccbdab0f 100644 --- a/arch/arm64/boot/dts/rockchip/rk3568.dtsi +++ b/arch/arm64/boot/dts/rockchip/rk3568.dtsi @@ -350,6 +350,7 @@ assigned-clocks = <&pmucru CLK_PCIEPHY0_REF>; assigned-clock-rates = <100000000>; resets = <&cru SRST_PIPEPHY0>; + reset-names = "phy"; rockchip,pipe-grf = <&pipegrf>; rockchip,pipe-phy-grf = <&pipe_phy_grf0>; #phy-cells = <1>; diff --git a/arch/arm64/boot/dts/rockchip/rk356x-base.dtsi b/arch/arm64/boot/dts/rockchip/rk356x-base.dtsi index 62be06f3b863..e55390629114 100644 --- a/arch/arm64/boot/dts/rockchip/rk356x-base.dtsi +++ b/arch/arm64/boot/dts/rockchip/rk356x-base.dtsi @@ -1681,6 +1681,7 @@ assigned-clocks = <&pmucru CLK_PCIEPHY1_REF>; assigned-clock-rates = <100000000>; resets = <&cru SRST_PIPEPHY1>; + reset-names = "phy"; rockchip,pipe-grf = <&pipegrf>; rockchip,pipe-phy-grf = <&pipe_phy_grf1>; #phy-cells = <1>; @@ -1697,6 +1698,7 @@ assigned-clocks = <&pmucru CLK_PCIEPHY2_REF>; assigned-clock-rates = <100000000>; resets = <&cru SRST_PIPEPHY2>; + reset-names = "phy"; rockchip,pipe-grf = <&pipegrf>; rockchip,pipe-phy-grf = <&pipe_phy_grf2>; #phy-cells = <1>; diff --git a/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts b/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts index c44d001da169..d597112f1d5b 100644 --- a/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts +++ b/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts @@ -72,7 +72,7 @@ rfkill { compatible = "rfkill-gpio"; - label = "rfkill-pcie-wlan"; + label = "rfkill-m2-wlan"; radio-type = "wlan"; shutdown-gpios = <&gpio4 RK_PA2 GPIO_ACTIVE_HIGH>; }; diff --git a/arch/arm64/boot/dts/rockchip/rk3588s-nanopi-r6.dtsi b/arch/arm64/boot/dts/rockchip/rk3588s-nanopi-r6.dtsi index 76a6e8e517e9..c9749cb50076 100644 --- a/arch/arm64/boot/dts/rockchip/rk3588s-nanopi-r6.dtsi +++ b/arch/arm64/boot/dts/rockchip/rk3588s-nanopi-r6.dtsi @@ -434,6 +434,7 @@ &sdmmc { bus-width = <4>; cap-sd-highspeed; + cd-gpios = <&gpio0 RK_PA4 GPIO_ACTIVE_LOW>; disable-wp; max-frequency = 
<150000000>; no-mmc; diff --git a/arch/arm64/boot/dts/xilinx/zynqmp.dtsi b/arch/arm64/boot/dts/xilinx/zynqmp.dtsi index 467f084c6469..e11d282462bd 100644 --- a/arch/arm64/boot/dts/xilinx/zynqmp.dtsi +++ b/arch/arm64/boot/dts/xilinx/zynqmp.dtsi @@ -1306,11 +1306,14 @@ "dp_vtc_pixel_clk_in"; power-domains = <&zynqmp_firmware PD_DP>; resets = <&zynqmp_reset ZYNQMP_RESET_DP>; - dma-names = "vid0", "vid1", "vid2", "gfx0"; + dma-names = "vid0", "vid1", "vid2", "gfx0", + "aud0", "aud1"; dmas = <&zynqmp_dpdma ZYNQMP_DPDMA_VIDEO0>, <&zynqmp_dpdma ZYNQMP_DPDMA_VIDEO1>, <&zynqmp_dpdma ZYNQMP_DPDMA_VIDEO2>, - <&zynqmp_dpdma ZYNQMP_DPDMA_GRAPHICS>; + <&zynqmp_dpdma ZYNQMP_DPDMA_GRAPHICS>, + <&zynqmp_dpdma ZYNQMP_DPDMA_AUDIO0>, + <&zynqmp_dpdma ZYNQMP_DPDMA_AUDIO1>; ports { #address-cells = <1>; diff --git a/arch/arm64/crypto/aes-ce-ccm-glue.c b/arch/arm64/crypto/aes-ce-ccm-glue.c index a523b519700f..a2b5d6f20f4d 100644 --- a/arch/arm64/crypto/aes-ce-ccm-glue.c +++ b/arch/arm64/crypto/aes-ce-ccm-glue.c @@ -18,7 +18,7 @@ #include "aes-ce-setkey.h" -MODULE_IMPORT_NS(CRYPTO_INTERNAL); +MODULE_IMPORT_NS("CRYPTO_INTERNAL"); static int num_rounds(struct crypto_aes_ctx *ctx) { diff --git a/arch/arm64/crypto/aes-glue.c b/arch/arm64/crypto/aes-glue.c index a147e847a5a1..b0150999743f 100644 --- a/arch/arm64/crypto/aes-glue.c +++ b/arch/arm64/crypto/aes-glue.c @@ -1048,7 +1048,7 @@ unregister_ciphers: #ifdef USE_V8_CRYPTO_EXTENSIONS module_cpu_feature_match(AES, aes_init); -EXPORT_SYMBOL_NS(ce_aes_mac_update, CRYPTO_INTERNAL); +EXPORT_SYMBOL_NS(ce_aes_mac_update, "CRYPTO_INTERNAL"); #else module_init(aes_init); EXPORT_SYMBOL(neon_aes_ecb_encrypt); diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h index 201a46efd918..cbbf70e0f204 100644 --- a/arch/arm64/include/asm/cpucaps.h +++ b/arch/arm64/include/asm/cpucaps.h @@ -44,6 +44,8 @@ cpucap_is_possible(const unsigned int cap) return IS_ENABLED(CONFIG_ARM64_TLB_RANGE); case ARM64_HAS_S1POE: return 
IS_ENABLED(CONFIG_ARM64_POE); + case ARM64_HAS_GCS: + return IS_ENABLED(CONFIG_ARM64_GCS); case ARM64_UNMAP_KERNEL_AT_EL0: return IS_ENABLED(CONFIG_UNMAP_KERNEL_AT_EL0); case ARM64_WORKAROUND_843419: diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h index b64e49bd9d10..8b4e5a3cd24c 100644 --- a/arch/arm64/include/asm/cpufeature.h +++ b/arch/arm64/include/asm/cpufeature.h @@ -847,8 +847,7 @@ static inline bool system_supports_poe(void) static inline bool system_supports_gcs(void) { - return IS_ENABLED(CONFIG_ARM64_GCS) && - alternative_has_cap_unlikely(ARM64_HAS_GCS); + return alternative_has_cap_unlikely(ARM64_HAS_GCS); } static inline bool system_supports_haft(void) diff --git a/arch/arm64/include/asm/el2_setup.h b/arch/arm64/include/asm/el2_setup.h index 85ef966c08cd..4ef52d7245bb 100644 --- a/arch/arm64/include/asm/el2_setup.h +++ b/arch/arm64/include/asm/el2_setup.h @@ -87,7 +87,7 @@ 1 << PMSCR_EL2_PA_SHIFT) msr_s SYS_PMSCR_EL2, x0 // addresses and physical counter .Lskip_spe_el2_\@: - mov x0, #(MDCR_EL2_E2PB_MASK << MDCR_EL2_E2PB_SHIFT) + mov x0, #MDCR_EL2_E2PB_MASK orr x2, x2, x0 // If we don't have VHE, then // use EL1&0 translation. @@ -100,7 +100,7 @@ and x0, x0, TRBIDR_EL1_P cbnz x0, .Lskip_trace_\@ // If TRBE is available at EL2 - mov x0, #(MDCR_EL2_E2TB_MASK << MDCR_EL2_E2TB_SHIFT) + mov x0, #MDCR_EL2_E2TB_MASK orr x2, x2, x0 // allow the EL1&0 translation // to own it. 
diff --git a/arch/arm64/include/asm/mman.h b/arch/arm64/include/asm/mman.h index 1d53022fc7e1..21df8bbd2668 100644 --- a/arch/arm64/include/asm/mman.h +++ b/arch/arm64/include/asm/mman.h @@ -7,6 +7,7 @@ #ifndef BUILD_VDSO #include #include +#include #include #include @@ -44,7 +45,7 @@ static inline unsigned long arch_calc_vm_flag_bits(struct file *file, if (system_supports_mte()) { if (flags & (MAP_ANONYMOUS | MAP_HUGETLB)) return VM_MTE_ALLOWED; - if (shmem_file(file)) + if (shmem_file(file) || is_file_hugepages(file)) return VM_MTE_ALLOWED; } diff --git a/arch/arm64/kernel/hyp-stub.S b/arch/arm64/kernel/hyp-stub.S index 65f76064c86b..ae990da1eae5 100644 --- a/arch/arm64/kernel/hyp-stub.S +++ b/arch/arm64/kernel/hyp-stub.S @@ -114,8 +114,8 @@ SYM_CODE_START_LOCAL(__finalise_el2) // Use EL2 translations for SPE & TRBE and disable access from EL1 mrs x0, mdcr_el2 - bic x0, x0, #(MDCR_EL2_E2PB_MASK << MDCR_EL2_E2PB_SHIFT) - bic x0, x0, #(MDCR_EL2_E2TB_MASK << MDCR_EL2_E2TB_SHIFT) + bic x0, x0, #MDCR_EL2_E2PB_MASK + bic x0, x0, #MDCR_EL2_E2TB_MASK msr mdcr_el2, x0 // Transfer the MM state from EL1 to EL2 diff --git a/arch/arm64/kernel/patching.c b/arch/arm64/kernel/patching.c index 7f99723fbb8c..1041bc67a3ee 100644 --- a/arch/arm64/kernel/patching.c +++ b/arch/arm64/kernel/patching.c @@ -30,20 +30,17 @@ static bool is_image_text(unsigned long addr) static void __kprobes *patch_map(void *addr, int fixmap) { - unsigned long uintaddr = (uintptr_t) addr; - bool image = is_image_text(uintaddr); - struct page *page; + phys_addr_t phys; - if (image) - page = phys_to_page(__pa_symbol(addr)); - else if (IS_ENABLED(CONFIG_EXECMEM)) - page = vmalloc_to_page(addr); - else - return addr; + if (is_image_text((unsigned long)addr)) { + phys = __pa_symbol(addr); + } else { + struct page *page = vmalloc_to_page(addr); + BUG_ON(!page); + phys = page_to_phys(page) + offset_in_page(addr); + } - BUG_ON(!page); - return (void *)set_fixmap_offset(fixmap, page_to_phys(page) + - (uintaddr & 
~PAGE_MASK)); + return (void *)set_fixmap_offset(fixmap, phys); } static void __kprobes patch_unmap(int fixmap) diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c index e4437f62a2cd..f79b0d5f71ac 100644 --- a/arch/arm64/kernel/ptrace.c +++ b/arch/arm64/kernel/ptrace.c @@ -720,6 +720,8 @@ static int fpmr_set(struct task_struct *target, const struct user_regset *regset if (!system_supports_fpmr()) return -EINVAL; + fpmr = target->thread.uw.fpmr; + ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, &fpmr, 0, count); if (ret) return ret; @@ -1427,7 +1429,7 @@ static int tagged_addr_ctrl_get(struct task_struct *target, { long ctrl = get_tagged_addr_ctrl(target); - if (IS_ERR_VALUE(ctrl)) + if (WARN_ON_ONCE(IS_ERR_VALUE(ctrl))) return ctrl; return membuf_write(&to, &ctrl, sizeof(ctrl)); @@ -1441,6 +1443,10 @@ static int tagged_addr_ctrl_set(struct task_struct *target, const struct int ret; long ctrl; + ctrl = get_tagged_addr_ctrl(target); + if (WARN_ON_ONCE(IS_ERR_VALUE(ctrl))) + return ctrl; + ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, &ctrl, 0, -1); if (ret) return ret; @@ -1472,6 +1478,8 @@ static int poe_set(struct task_struct *target, const struct if (!system_supports_poe()) return -EINVAL; + ctrl = target->thread.por_el0; + ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, &ctrl, 0, -1); if (ret) return ret; @@ -1483,6 +1491,22 @@ static int poe_set(struct task_struct *target, const struct #endif #ifdef CONFIG_ARM64_GCS +static void task_gcs_to_user(struct user_gcs *user_gcs, + const struct task_struct *target) +{ + user_gcs->features_enabled = target->thread.gcs_el0_mode; + user_gcs->features_locked = target->thread.gcs_el0_locked; + user_gcs->gcspr_el0 = target->thread.gcspr_el0; +} + +static void task_gcs_from_user(struct task_struct *target, + const struct user_gcs *user_gcs) +{ + target->thread.gcs_el0_mode = user_gcs->features_enabled; + target->thread.gcs_el0_locked = user_gcs->features_locked; + target->thread.gcspr_el0 = 
user_gcs->gcspr_el0; +} + static int gcs_get(struct task_struct *target, const struct user_regset *regset, struct membuf to) @@ -1495,9 +1519,7 @@ static int gcs_get(struct task_struct *target, if (target == current) gcs_preserve_current_state(); - user_gcs.features_enabled = target->thread.gcs_el0_mode; - user_gcs.features_locked = target->thread.gcs_el0_locked; - user_gcs.gcspr_el0 = target->thread.gcspr_el0; + task_gcs_to_user(&user_gcs, target); return membuf_write(&to, &user_gcs, sizeof(user_gcs)); } @@ -1513,6 +1535,8 @@ static int gcs_set(struct task_struct *target, const struct if (!system_supports_gcs()) return -EINVAL; + task_gcs_to_user(&user_gcs, target); + ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, &user_gcs, 0, -1); if (ret) return ret; @@ -1520,9 +1544,7 @@ static int gcs_set(struct task_struct *target, const struct if (user_gcs.features_enabled & ~PR_SHADOW_STACK_SUPPORTED_STATUS_MASK) return -EINVAL; - target->thread.gcs_el0_mode = user_gcs.features_enabled; - target->thread.gcs_el0_locked = user_gcs.features_locked; - target->thread.gcspr_el0 = user_gcs.gcspr_el0; + task_gcs_from_user(target, &user_gcs); return 0; } diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c index 14ac6fdb872b..99ea26d400ff 100644 --- a/arch/arm64/kernel/signal.c +++ b/arch/arm64/kernel/signal.c @@ -36,15 +36,8 @@ #include #include -#ifdef CONFIG_ARM64_GCS #define GCS_SIGNAL_CAP(addr) (((unsigned long)addr) & GCS_CAP_ADDR_MASK) -static bool gcs_signal_cap_valid(u64 addr, u64 val) -{ - return val == GCS_SIGNAL_CAP(addr); -} -#endif - /* * Do a signal return; undo the signal stack. These are aligned to 128-bit. 
*/ @@ -1062,8 +1055,7 @@ static int restore_sigframe(struct pt_regs *regs, #ifdef CONFIG_ARM64_GCS static int gcs_restore_signal(void) { - unsigned long __user *gcspr_el0; - u64 cap; + u64 gcspr_el0, cap; int ret; if (!system_supports_gcs()) @@ -1072,7 +1064,7 @@ static int gcs_restore_signal(void) if (!(current->thread.gcs_el0_mode & PR_SHADOW_STACK_ENABLE)) return 0; - gcspr_el0 = (unsigned long __user *)read_sysreg_s(SYS_GCSPR_EL0); + gcspr_el0 = read_sysreg_s(SYS_GCSPR_EL0); /* * Ensure that any changes to the GCS done via GCS operations @@ -1087,22 +1079,23 @@ static int gcs_restore_signal(void) * then faults will be generated on GCS operations - the main * concern is to protect GCS pages. */ - ret = copy_from_user(&cap, gcspr_el0, sizeof(cap)); + ret = copy_from_user(&cap, (unsigned long __user *)gcspr_el0, + sizeof(cap)); if (ret) return -EFAULT; /* * Check that the cap is the actual GCS before replacing it. */ - if (!gcs_signal_cap_valid((u64)gcspr_el0, cap)) + if (cap != GCS_SIGNAL_CAP(gcspr_el0)) return -EINVAL; /* Invalidate the token to prevent reuse */ - put_user_gcs(0, (__user void*)gcspr_el0, &ret); + put_user_gcs(0, (unsigned long __user *)gcspr_el0, &ret); if (ret != 0) return -EFAULT; - write_sysreg_s(gcspr_el0 + 1, SYS_GCSPR_EL0); + write_sysreg_s(gcspr_el0 + 8, SYS_GCSPR_EL0); return 0; } @@ -1421,7 +1414,7 @@ static int get_sigframe(struct rt_sigframe_user_layout *user, static int gcs_signal_entry(__sigrestore_t sigtramp, struct ksignal *ksig) { - unsigned long __user *gcspr_el0; + u64 gcspr_el0; int ret = 0; if (!system_supports_gcs()) @@ -1434,18 +1427,20 @@ static int gcs_signal_entry(__sigrestore_t sigtramp, struct ksignal *ksig) * We are entering a signal handler, current register state is * active. */ - gcspr_el0 = (unsigned long __user *)read_sysreg_s(SYS_GCSPR_EL0); + gcspr_el0 = read_sysreg_s(SYS_GCSPR_EL0); /* * Push a cap and the GCS entry for the trampoline onto the GCS. 
*/ - put_user_gcs((unsigned long)sigtramp, gcspr_el0 - 2, &ret); - put_user_gcs(GCS_SIGNAL_CAP(gcspr_el0 - 1), gcspr_el0 - 1, &ret); + put_user_gcs((unsigned long)sigtramp, + (unsigned long __user *)(gcspr_el0 - 16), &ret); + put_user_gcs(GCS_SIGNAL_CAP(gcspr_el0 - 8), + (unsigned long __user *)(gcspr_el0 - 8), &ret); if (ret != 0) return ret; - gcspr_el0 -= 2; - write_sysreg_s((unsigned long)gcspr_el0, SYS_GCSPR_EL0); + gcspr_el0 -= 16; + write_sysreg_s(gcspr_el0, SYS_GCSPR_EL0); return 0; } @@ -1462,10 +1457,33 @@ static int setup_return(struct pt_regs *regs, struct ksignal *ksig, struct rt_sigframe_user_layout *user, int usig) { __sigrestore_t sigtramp; + int err; + + if (ksig->ka.sa.sa_flags & SA_RESTORER) + sigtramp = ksig->ka.sa.sa_restorer; + else + sigtramp = VDSO_SYMBOL(current->mm->context.vdso, sigtramp); + + err = gcs_signal_entry(sigtramp, ksig); + if (err) + return err; + + /* + * We must not fail from this point onwards. We are going to update + * registers, including SP, in order to invoke the signal handler. If + * we failed and attempted to deliver a nested SIGSEGV to a handler + * after that point, the subsequent sigreturn would end up restoring + * the (partial) state for the original signal handler. 
+ */ regs->regs[0] = usig; + if (ksig->ka.sa.sa_flags & SA_SIGINFO) { + regs->regs[1] = (unsigned long)&user->sigframe->info; + regs->regs[2] = (unsigned long)&user->sigframe->uc; + } regs->sp = (unsigned long)user->sigframe; regs->regs[29] = (unsigned long)&user->next_frame->fp; + regs->regs[30] = (unsigned long)sigtramp; regs->pc = (unsigned long)ksig->ka.sa.sa_handler; /* @@ -1506,14 +1524,7 @@ static int setup_return(struct pt_regs *regs, struct ksignal *ksig, sme_smstop(); } - if (ksig->ka.sa.sa_flags & SA_RESTORER) - sigtramp = ksig->ka.sa.sa_restorer; - else - sigtramp = VDSO_SYMBOL(current->mm->context.vdso, sigtramp); - - regs->regs[30] = (unsigned long)sigtramp; - - return gcs_signal_entry(sigtramp, ksig); + return 0; } static int setup_rt_frame(int usig, struct ksignal *ksig, sigset_t *set, @@ -1537,14 +1548,16 @@ static int setup_rt_frame(int usig, struct ksignal *ksig, sigset_t *set, err |= __save_altstack(&frame->uc.uc_stack, regs->sp); err |= setup_sigframe(&user, regs, set, &ua_state); - if (err == 0) { + if (ksig->ka.sa.sa_flags & SA_SIGINFO) + err |= copy_siginfo_to_user(&frame->info, &ksig->info); + + if (err == 0) err = setup_return(regs, ksig, &user, usig); - if (ksig->ka.sa.sa_flags & SA_SIGINFO) { - err |= copy_siginfo_to_user(&frame->info, &ksig->info); - regs->regs[1] = (unsigned long)&frame->info; - regs->regs[2] = (unsigned long)&frame->uc; - } - } + + /* + * We must not fail if setup_return() succeeded - see comment at the + * beginning of setup_return(). 
+ */ if (err == 0) set_handler_user_access_state(); diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c index caef85462acb..1d9d51d7627f 100644 --- a/arch/arm64/kernel/stacktrace.c +++ b/arch/arm64/kernel/stacktrace.c @@ -26,7 +26,6 @@ enum kunwind_source { KUNWIND_SOURCE_CALLER, KUNWIND_SOURCE_TASK, KUNWIND_SOURCE_REGS_PC, - KUNWIND_SOURCE_REGS_LR, }; union unwind_flags { @@ -138,8 +137,10 @@ kunwind_recover_return_address(struct kunwind_state *state) orig_pc = ftrace_graph_ret_addr(state->task, &state->graph_idx, state->common.pc, (void *)state->common.fp); - if (WARN_ON_ONCE(state->common.pc == orig_pc)) + if (state->common.pc == orig_pc) { + WARN_ON_ONCE(state->task == current); return -EINVAL; + } state->common.pc = orig_pc; state->flags.fgraph = 1; } @@ -178,23 +179,8 @@ int kunwind_next_regs_pc(struct kunwind_state *state) state->regs = regs; state->common.pc = regs->pc; state->common.fp = regs->regs[29]; - state->source = KUNWIND_SOURCE_REGS_PC; - return 0; -} - -static __always_inline int -kunwind_next_regs_lr(struct kunwind_state *state) -{ - /* - * The stack for the regs was consumed by kunwind_next_regs_pc(), so we - * cannot consume that again here, but we know the regs are safe to - * access. 
- */ - state->common.pc = state->regs->regs[30]; - state->common.fp = state->regs->regs[29]; state->regs = NULL; - state->source = KUNWIND_SOURCE_REGS_LR; - + state->source = KUNWIND_SOURCE_REGS_PC; return 0; } @@ -215,12 +201,12 @@ kunwind_next_frame_record_meta(struct kunwind_state *state) case FRAME_META_TYPE_FINAL: if (meta == &task_pt_regs(tsk)->stackframe) return -ENOENT; - WARN_ON_ONCE(1); + WARN_ON_ONCE(tsk == current); return -EINVAL; case FRAME_META_TYPE_PT_REGS: return kunwind_next_regs_pc(state); default: - WARN_ON_ONCE(1); + WARN_ON_ONCE(tsk == current); return -EINVAL; } } @@ -274,11 +260,8 @@ kunwind_next(struct kunwind_state *state) case KUNWIND_SOURCE_FRAME: case KUNWIND_SOURCE_CALLER: case KUNWIND_SOURCE_TASK: - case KUNWIND_SOURCE_REGS_LR: - err = kunwind_next_frame_record(state); - break; case KUNWIND_SOURCE_REGS_PC: - err = kunwind_next_regs_lr(state); + err = kunwind_next_frame_record(state); break; default: err = -EINVAL; @@ -436,7 +419,6 @@ static const char *state_source_string(const struct kunwind_state *state) case KUNWIND_SOURCE_CALLER: return "C"; case KUNWIND_SOURCE_TASK: return "T"; case KUNWIND_SOURCE_REGS_PC: return "P"; - case KUNWIND_SOURCE_REGS_LR: return "L"; default: return "U"; } } diff --git a/arch/arm64/kvm/at.c b/arch/arm64/kvm/at.c index 8c5d7990e5b3..3d7eb395e33d 100644 --- a/arch/arm64/kvm/at.c +++ b/arch/arm64/kvm/at.c @@ -739,8 +739,15 @@ static u64 compute_par_s12(struct kvm_vcpu *vcpu, u64 s1_par, final_attr = s1_parattr; break; default: - /* MemAttr[2]=0, Device from S2 */ - final_attr = s2_memattr & GENMASK(1,0) << 2; + /* + * MemAttr[2]=0, Device from S2. + * + * FWB does not influence the way that stage 1 + * memory types and attributes are combined + * with stage 2 Device type and attributes. 
+ */ + final_attr = min(s2_memattr_to_attr(s2_memattr), + s1_parattr); } } else { /* Combination of R_HMNDG, R_TNHFM and R_GQFSF */ diff --git a/arch/arm64/kvm/hyp/nvhe/mem_protect.c b/arch/arm64/kvm/hyp/nvhe/mem_protect.c index caba3e4bd09e..e75374d682f4 100644 --- a/arch/arm64/kvm/hyp/nvhe/mem_protect.c +++ b/arch/arm64/kvm/hyp/nvhe/mem_protect.c @@ -783,9 +783,6 @@ static int hyp_ack_unshare(u64 addr, const struct pkvm_mem_transition *tx) if (tx->initiator.id == PKVM_ID_HOST && hyp_page_count((void *)addr)) return -EBUSY; - if (__hyp_ack_skip_pgtable_check(tx)) - return 0; - return __hyp_check_page_state_range(addr, size, PKVM_PAGE_SHARED_BORROWED); } diff --git a/arch/arm64/kvm/hyp/nvhe/pkvm.c b/arch/arm64/kvm/hyp/nvhe/pkvm.c index 01616c39a810..071993c16de8 100644 --- a/arch/arm64/kvm/hyp/nvhe/pkvm.c +++ b/arch/arm64/kvm/hyp/nvhe/pkvm.c @@ -126,7 +126,7 @@ static void pvm_init_traps_aa64dfr0(struct kvm_vcpu *vcpu) /* Trap SPE */ if (!FIELD_GET(ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMSVer), feature_ids)) { mdcr_set |= MDCR_EL2_TPMS; - mdcr_clear |= MDCR_EL2_E2PB_MASK << MDCR_EL2_E2PB_SHIFT; + mdcr_clear |= MDCR_EL2_E2PB_MASK; } /* Trap Trace Filter */ @@ -143,7 +143,7 @@ static void pvm_init_traps_aa64dfr0(struct kvm_vcpu *vcpu) /* Trap External Trace */ if (!FIELD_GET(ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_ExtTrcBuff), feature_ids)) - mdcr_clear |= MDCR_EL2_E2TB_MASK << MDCR_EL2_E2TB_SHIFT; + mdcr_clear |= MDCR_EL2_E2TB_MASK; vcpu->arch.mdcr_el2 |= mdcr_set; vcpu->arch.mdcr_el2 &= ~mdcr_clear; diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c index 456102bc0b55..6c5950b9ceac 100644 --- a/arch/arm64/kvm/pmu-emul.c +++ b/arch/arm64/kvm/pmu-emul.c @@ -24,6 +24,7 @@ static DEFINE_MUTEX(arm_pmus_lock); static void kvm_pmu_create_perf_event(struct kvm_pmc *pmc); static void kvm_pmu_release_perf_event(struct kvm_pmc *pmc); +static bool kvm_pmu_counter_is_enabled(struct kvm_pmc *pmc); static struct kvm_vcpu *kvm_pmc_to_vcpu(const struct kvm_pmc *pmc) { @@ 
-327,48 +328,25 @@ u64 kvm_pmu_implemented_counter_mask(struct kvm_vcpu *vcpu) return GENMASK(val - 1, 0) | BIT(ARMV8_PMU_CYCLE_IDX); } -/** - * kvm_pmu_enable_counter_mask - enable selected PMU counters - * @vcpu: The vcpu pointer - * @val: the value guest writes to PMCNTENSET register - * - * Call perf_event_enable to start counting the perf event - */ -void kvm_pmu_enable_counter_mask(struct kvm_vcpu *vcpu, u64 val) +static void kvm_pmc_enable_perf_event(struct kvm_pmc *pmc) { - int i; - if (!kvm_vcpu_has_pmu(vcpu)) + if (!pmc->perf_event) { + kvm_pmu_create_perf_event(pmc); return; - - if (!(kvm_vcpu_read_pmcr(vcpu) & ARMV8_PMU_PMCR_E) || !val) - return; - - for (i = 0; i < KVM_ARMV8_PMU_MAX_COUNTERS; i++) { - struct kvm_pmc *pmc; - - if (!(val & BIT(i))) - continue; - - pmc = kvm_vcpu_idx_to_pmc(vcpu, i); - - if (!pmc->perf_event) { - kvm_pmu_create_perf_event(pmc); - } else { - perf_event_enable(pmc->perf_event); - if (pmc->perf_event->state != PERF_EVENT_STATE_ACTIVE) - kvm_debug("fail to enable perf event\n"); - } } + + perf_event_enable(pmc->perf_event); + if (pmc->perf_event->state != PERF_EVENT_STATE_ACTIVE) + kvm_debug("fail to enable perf event\n"); } -/** - * kvm_pmu_disable_counter_mask - disable selected PMU counters - * @vcpu: The vcpu pointer - * @val: the value guest writes to PMCNTENCLR register - * - * Call perf_event_disable to stop counting the perf event - */ -void kvm_pmu_disable_counter_mask(struct kvm_vcpu *vcpu, u64 val) +static void kvm_pmc_disable_perf_event(struct kvm_pmc *pmc) +{ + if (pmc->perf_event) + perf_event_disable(pmc->perf_event); +} + +void kvm_pmu_reprogram_counter_mask(struct kvm_vcpu *vcpu, u64 val) { int i; @@ -376,16 +354,18 @@ void kvm_pmu_disable_counter_mask(struct kvm_vcpu *vcpu, u64 val) return; for (i = 0; i < KVM_ARMV8_PMU_MAX_COUNTERS; i++) { - struct kvm_pmc *pmc; + struct kvm_pmc *pmc = kvm_vcpu_idx_to_pmc(vcpu, i); if (!(val & BIT(i))) continue; - pmc = kvm_vcpu_idx_to_pmc(vcpu, i); - - if (pmc->perf_event) 
- perf_event_disable(pmc->perf_event); + if (kvm_pmu_counter_is_enabled(pmc)) + kvm_pmc_enable_perf_event(pmc); + else + kvm_pmc_disable_perf_event(pmc); } + + kvm_vcpu_pmu_restore_guest(vcpu); } /* @@ -626,27 +606,28 @@ void kvm_pmu_handle_pmcr(struct kvm_vcpu *vcpu, u64 val) if (!kvm_has_feat(vcpu->kvm, ID_AA64DFR0_EL1, PMUVer, V3P5)) val &= ~ARMV8_PMU_PMCR_LP; + /* Request a reload of the PMU to enable/disable affected counters */ + if ((__vcpu_sys_reg(vcpu, PMCR_EL0) ^ val) & ARMV8_PMU_PMCR_E) + kvm_make_request(KVM_REQ_RELOAD_PMU, vcpu); + /* The reset bits don't indicate any state, and shouldn't be saved. */ __vcpu_sys_reg(vcpu, PMCR_EL0) = val & ~(ARMV8_PMU_PMCR_C | ARMV8_PMU_PMCR_P); - if (val & ARMV8_PMU_PMCR_E) { - kvm_pmu_enable_counter_mask(vcpu, - __vcpu_sys_reg(vcpu, PMCNTENSET_EL0)); - } else { - kvm_pmu_disable_counter_mask(vcpu, - __vcpu_sys_reg(vcpu, PMCNTENSET_EL0)); - } - if (val & ARMV8_PMU_PMCR_C) kvm_pmu_set_counter_value(vcpu, ARMV8_PMU_CYCLE_IDX, 0); if (val & ARMV8_PMU_PMCR_P) { - unsigned long mask = kvm_pmu_accessible_counter_mask(vcpu); - mask &= ~BIT(ARMV8_PMU_CYCLE_IDX); + /* + * Unlike other PMU sysregs, the controls in PMCR_EL0 always apply + * to the 'guest' range of counters and never the 'hyp' range. 
+ */ + unsigned long mask = kvm_pmu_implemented_counter_mask(vcpu) & + ~kvm_pmu_hyp_counter_mask(vcpu) & + ~BIT(ARMV8_PMU_CYCLE_IDX); + for_each_set_bit(i, &mask, 32) kvm_pmu_set_pmc_value(kvm_vcpu_idx_to_pmc(vcpu, i), 0, true); } - kvm_vcpu_pmu_restore_guest(vcpu); } static bool kvm_pmu_counter_is_enabled(struct kvm_pmc *pmc) @@ -910,11 +891,11 @@ void kvm_vcpu_reload_pmu(struct kvm_vcpu *vcpu) { u64 mask = kvm_pmu_implemented_counter_mask(vcpu); - kvm_pmu_handle_pmcr(vcpu, kvm_vcpu_read_pmcr(vcpu)); - __vcpu_sys_reg(vcpu, PMOVSSET_EL0) &= mask; __vcpu_sys_reg(vcpu, PMINTENSET_EL1) &= mask; __vcpu_sys_reg(vcpu, PMCNTENSET_EL0) &= mask; + + kvm_pmu_reprogram_counter_mask(vcpu, mask); } int kvm_arm_pmu_v3_enable(struct kvm_vcpu *vcpu) diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c index 83c6b4a07ef5..634ff18a59a1 100644 --- a/arch/arm64/kvm/sys_regs.c +++ b/arch/arm64/kvm/sys_regs.c @@ -1208,16 +1208,14 @@ static bool access_pmcnten(struct kvm_vcpu *vcpu, struct sys_reg_params *p, mask = kvm_pmu_accessible_counter_mask(vcpu); if (p->is_write) { val = p->regval & mask; - if (r->Op2 & 0x1) { + if (r->Op2 & 0x1) /* accessing PMCNTENSET_EL0 */ __vcpu_sys_reg(vcpu, PMCNTENSET_EL0) |= val; - kvm_pmu_enable_counter_mask(vcpu, val); - kvm_vcpu_pmu_restore_guest(vcpu); - } else { + else /* accessing PMCNTENCLR_EL0 */ __vcpu_sys_reg(vcpu, PMCNTENSET_EL0) &= ~val; - kvm_pmu_disable_counter_mask(vcpu, val); - } + + kvm_pmu_reprogram_counter_mask(vcpu, val); } else { p->regval = __vcpu_sys_reg(vcpu, PMCNTENSET_EL0); } @@ -2450,6 +2448,26 @@ static unsigned int s1pie_el2_visibility(const struct kvm_vcpu *vcpu, return __el2_visibility(vcpu, rd, s1pie_visibility); } +static bool access_mdcr(struct kvm_vcpu *vcpu, + struct sys_reg_params *p, + const struct sys_reg_desc *r) +{ + u64 old = __vcpu_sys_reg(vcpu, MDCR_EL2); + + if (!access_rw(vcpu, p, r)) + return false; + + /* + * Request a reload of the PMU to enable/disable the counters affected + * by HPME. 
+ */ + if ((old ^ __vcpu_sys_reg(vcpu, MDCR_EL2)) & MDCR_EL2_HPME) + kvm_make_request(KVM_REQ_RELOAD_PMU, vcpu); + + return true; +} + + /* * Architected system registers. * Important: Must be sorted ascending by Op0, Op1, CRn, CRm, Op2 @@ -2618,7 +2636,8 @@ static const struct sys_reg_desc sys_reg_descs[] = { ID_WRITABLE(ID_AA64MMFR0_EL1, ~(ID_AA64MMFR0_EL1_RES0 | ID_AA64MMFR0_EL1_TGRAN4_2 | ID_AA64MMFR0_EL1_TGRAN64_2 | - ID_AA64MMFR0_EL1_TGRAN16_2)), + ID_AA64MMFR0_EL1_TGRAN16_2 | + ID_AA64MMFR0_EL1_ASIDBITS)), ID_WRITABLE(ID_AA64MMFR1_EL1, ~(ID_AA64MMFR1_EL1_RES0 | ID_AA64MMFR1_EL1_HCX | ID_AA64MMFR1_EL1_TWED | @@ -2982,7 +3001,7 @@ static const struct sys_reg_desc sys_reg_descs[] = { EL2_REG(SCTLR_EL2, access_rw, reset_val, SCTLR_EL2_RES1), EL2_REG(ACTLR_EL2, access_rw, reset_val, 0), EL2_REG_VNCR(HCR_EL2, reset_hcr, 0), - EL2_REG(MDCR_EL2, access_rw, reset_val, 0), + EL2_REG(MDCR_EL2, access_mdcr, reset_val, 0), EL2_REG(CPTR_EL2, access_rw, reset_val, CPTR_NVHE_EL2_RES1), EL2_REG_VNCR(HSTR_EL2, reset_val, 0), EL2_REG_VNCR(HFGRTR_EL2, reset_val, 0), diff --git a/arch/arm64/kvm/vgic/vgic-its.c b/arch/arm64/kvm/vgic/vgic-its.c index f4c4494645c3..fb96802799c6 100644 --- a/arch/arm64/kvm/vgic/vgic-its.c +++ b/arch/arm64/kvm/vgic/vgic-its.c @@ -608,12 +608,22 @@ static void vgic_its_cache_translation(struct kvm *kvm, struct vgic_its *its, lockdep_assert_held(&its->its_lock); vgic_get_irq_kref(irq); + old = xa_store(&its->translation_cache, cache_key, irq, GFP_KERNEL_ACCOUNT); + + /* + * Put the reference taken on @irq if the store fails. Intentionally do + * not return the error as the translation cache is best effort. + */ + if (xa_is_err(old)) { + vgic_put_irq(kvm, irq); + return; + } + /* * We could have raced with another CPU caching the same * translation behind our back, ensure we don't leak a * reference if that is the case. 
*/ - old = xa_store(&its->translation_cache, cache_key, irq, GFP_KERNEL_ACCOUNT); if (old) vgic_put_irq(kvm, old); } diff --git a/arch/arm64/mm/context.c b/arch/arm64/mm/context.c index 188197590fc9..b2ac06246327 100644 --- a/arch/arm64/mm/context.c +++ b/arch/arm64/mm/context.c @@ -32,9 +32,9 @@ static unsigned long nr_pinned_asids; static unsigned long *pinned_asid_map; #define ASID_MASK (~GENMASK(asid_bits - 1, 0)) -#define ASID_FIRST_VERSION (1UL << asid_bits) +#define ASID_FIRST_VERSION (1UL << 16) -#define NUM_USER_ASIDS ASID_FIRST_VERSION +#define NUM_USER_ASIDS (1UL << asid_bits) #define ctxid2asid(asid) ((asid) & ~ASID_MASK) #define asid2ctxid(asid, genid) ((asid) | (genid)) diff --git a/arch/arm64/mm/copypage.c b/arch/arm64/mm/copypage.c index 87b3f1a25535..a86c897017df 100644 --- a/arch/arm64/mm/copypage.c +++ b/arch/arm64/mm/copypage.c @@ -30,11 +30,13 @@ void copy_highpage(struct page *to, struct page *from) if (!system_supports_mte()) return; - if (folio_test_hugetlb(src) && - folio_test_hugetlb_mte_tagged(src)) { - if (!folio_try_hugetlb_mte_tagging(dst)) + if (folio_test_hugetlb(src)) { + if (!folio_test_hugetlb_mte_tagged(src) || + from != folio_page(src, 0)) return; + WARN_ON_ONCE(!folio_try_hugetlb_mte_tagging(dst)); + /* * Populate tags for all subpages. * diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c index d21f67d67cf5..ccdef53872a0 100644 --- a/arch/arm64/mm/init.c +++ b/arch/arm64/mm/init.c @@ -117,15 +117,6 @@ static void __init arch_reserve_crashkernel(void) static phys_addr_t __init max_zone_phys(phys_addr_t zone_limit) { - /** - * Information we get from firmware (e.g. DT dma-ranges) describe DMA - * bus constraints. Devices using DMA might have their own limitations. - * Some of them rely on DMA zone in low 32-bit memory. Keep low RAM - * DMA zone on platforms that have RAM there. 
- */ - if (memblock_start_of_DRAM() < U32_MAX) - zone_limit = min(zone_limit, U32_MAX); - return min(zone_limit, memblock_end_of_DRAM() - 1) + 1; } @@ -141,6 +132,14 @@ static void __init zone_sizes_init(void) acpi_zone_dma_limit = acpi_iort_dma_get_max_cpu_address(); dt_zone_dma_limit = of_dma_get_max_cpu_address(NULL); zone_dma_limit = min(dt_zone_dma_limit, acpi_zone_dma_limit); + /* + * Information we get from firmware (e.g. DT dma-ranges) describe DMA + * bus constraints. Devices using DMA might have their own limitations. + * Some of them rely on DMA zone in low 32-bit memory. Keep low RAM + * DMA zone on platforms that have RAM there. + */ + if (memblock_start_of_DRAM() < U32_MAX) + zone_dma_limit = min(zone_dma_limit, U32_MAX); arm64_dma_phys_limit = max_zone_phys(zone_dma_limit); max_zone_pfns[ZONE_DMA] = PFN_DOWN(arm64_dma_phys_limit); #endif diff --git a/arch/hexagon/Makefile b/arch/hexagon/Makefile index 92d005958dfb..ff172cbe5881 100644 --- a/arch/hexagon/Makefile +++ b/arch/hexagon/Makefile @@ -32,3 +32,9 @@ KBUILD_LDFLAGS += $(ldflags-y) TIR_NAME := r19 KBUILD_CFLAGS += -ffixed-$(TIR_NAME) -DTHREADINFO_REG=$(TIR_NAME) -D__linux__ KBUILD_AFLAGS += -DTHREADINFO_REG=$(TIR_NAME) + +# Disable HexagonConstExtenders pass for LLVM versions prior to 19.1.0 +# https://github.com/llvm/llvm-project/issues/99714 +ifneq ($(call clang-min-version, 190100),y) +KBUILD_CFLAGS += -mllvm -hexagon-cext=false +endif diff --git a/arch/loongarch/include/asm/hugetlb.h b/arch/loongarch/include/asm/hugetlb.h index b837c65a4894..c8e4057734d0 100644 --- a/arch/loongarch/include/asm/hugetlb.h +++ b/arch/loongarch/include/asm/hugetlb.h @@ -24,6 +24,16 @@ static inline int prepare_hugepage_range(struct file *file, return 0; } +#define __HAVE_ARCH_HUGE_PTE_CLEAR +static inline void huge_pte_clear(struct mm_struct *mm, unsigned long addr, + pte_t *ptep, unsigned long sz) +{ + pte_t clear; + + pte_val(clear) = (unsigned long)invalid_pte_table; + set_pte_at(mm, addr, ptep, clear); +} + 
#define __HAVE_ARCH_HUGE_PTEP_GET_AND_CLEAR static inline pte_t huge_ptep_get_and_clear(struct mm_struct *mm, unsigned long addr, pte_t *ptep) diff --git a/arch/loongarch/include/asm/inst.h b/arch/loongarch/include/asm/inst.h index 944482063f14..3089785ca97e 100644 --- a/arch/loongarch/include/asm/inst.h +++ b/arch/loongarch/include/asm/inst.h @@ -683,7 +683,17 @@ DEF_EMIT_REG2I16_FORMAT(blt, blt_op) DEF_EMIT_REG2I16_FORMAT(bge, bge_op) DEF_EMIT_REG2I16_FORMAT(bltu, bltu_op) DEF_EMIT_REG2I16_FORMAT(bgeu, bgeu_op) -DEF_EMIT_REG2I16_FORMAT(jirl, jirl_op) + +static inline void emit_jirl(union loongarch_instruction *insn, + enum loongarch_gpr rd, + enum loongarch_gpr rj, + int offset) +{ + insn->reg2i16_format.opcode = jirl_op; + insn->reg2i16_format.immediate = offset; + insn->reg2i16_format.rd = rd; + insn->reg2i16_format.rj = rj; +} #define DEF_EMIT_REG2BSTRD_FORMAT(NAME, OP) \ static inline void emit_##NAME(union loongarch_instruction *insn, \ diff --git a/arch/loongarch/kernel/efi.c b/arch/loongarch/kernel/efi.c index 2bf86aeda874..de21e72759ee 100644 --- a/arch/loongarch/kernel/efi.c +++ b/arch/loongarch/kernel/efi.c @@ -95,7 +95,7 @@ static void __init init_screen_info(void) memset(si, 0, sizeof(*si)); early_memunmap(si, sizeof(*si)); - memblock_reserve(screen_info.lfb_base, screen_info.lfb_size); + memblock_reserve(__screen_info_lfb_base(&screen_info), screen_info.lfb_size); } void __init efi_init(void) diff --git a/arch/loongarch/kernel/inst.c b/arch/loongarch/kernel/inst.c index 3050329556d1..14d7d700bcb9 100644 --- a/arch/loongarch/kernel/inst.c +++ b/arch/loongarch/kernel/inst.c @@ -332,7 +332,7 @@ u32 larch_insn_gen_jirl(enum loongarch_gpr rd, enum loongarch_gpr rj, int imm) return INSN_BREAK; } - emit_jirl(&insn, rj, rd, imm >> 2); + emit_jirl(&insn, rd, rj, imm >> 2); return insn.word; } diff --git a/arch/loongarch/kernel/smp.c b/arch/loongarch/kernel/smp.c index 5d59e9ce2772..fbf747447f13 100644 --- a/arch/loongarch/kernel/smp.c +++ 
b/arch/loongarch/kernel/smp.c @@ -82,7 +82,7 @@ void show_ipi_list(struct seq_file *p, int prec) for (i = 0; i < NR_IPI; i++) { seq_printf(p, "%*s%u:%s", prec - 1, "IPI", i, prec >= 4 ? " " : ""); for_each_online_cpu(cpu) - seq_printf(p, "%10u ", per_cpu(irq_stat, cpu).ipi_irqs[i]); + seq_put_decimal_ull_width(p, " ", per_cpu(irq_stat, cpu).ipi_irqs[i], 10); seq_printf(p, " LoongArch %d %s\n", i + 1, ipi_types[i]); } } diff --git a/arch/loongarch/kvm/exit.c b/arch/loongarch/kvm/exit.c index 69f3e3782cc9..a7893bd01e73 100644 --- a/arch/loongarch/kvm/exit.c +++ b/arch/loongarch/kvm/exit.c @@ -156,7 +156,7 @@ static int kvm_handle_csr(struct kvm_vcpu *vcpu, larch_inst inst) int kvm_emu_iocsr(larch_inst inst, struct kvm_run *run, struct kvm_vcpu *vcpu) { - int ret; + int idx, ret; unsigned long *val; u32 addr, rd, rj, opcode; @@ -167,7 +167,6 @@ int kvm_emu_iocsr(larch_inst inst, struct kvm_run *run, struct kvm_vcpu *vcpu) rj = inst.reg2_format.rj; opcode = inst.reg2_format.opcode; addr = vcpu->arch.gprs[rj]; - ret = EMULATE_DO_IOCSR; run->iocsr_io.phys_addr = addr; run->iocsr_io.is_write = 0; val = &vcpu->arch.gprs[rd]; @@ -207,20 +206,28 @@ int kvm_emu_iocsr(larch_inst inst, struct kvm_run *run, struct kvm_vcpu *vcpu) } if (run->iocsr_io.is_write) { - if (!kvm_io_bus_write(vcpu, KVM_IOCSR_BUS, addr, run->iocsr_io.len, val)) + idx = srcu_read_lock(&vcpu->kvm->srcu); + ret = kvm_io_bus_write(vcpu, KVM_IOCSR_BUS, addr, run->iocsr_io.len, val); + srcu_read_unlock(&vcpu->kvm->srcu, idx); + if (ret == 0) ret = EMULATE_DONE; - else + else { + ret = EMULATE_DO_IOCSR; /* Save data and let user space to write it */ memcpy(run->iocsr_io.data, val, run->iocsr_io.len); - + } trace_kvm_iocsr(KVM_TRACE_IOCSR_WRITE, run->iocsr_io.len, addr, val); } else { - if (!kvm_io_bus_read(vcpu, KVM_IOCSR_BUS, addr, run->iocsr_io.len, val)) + idx = srcu_read_lock(&vcpu->kvm->srcu); + ret = kvm_io_bus_read(vcpu, KVM_IOCSR_BUS, addr, run->iocsr_io.len, val); + srcu_read_unlock(&vcpu->kvm->srcu, 
idx); + if (ret == 0) ret = EMULATE_DONE; - else + else { + ret = EMULATE_DO_IOCSR; /* Save register id for iocsr read completion */ vcpu->arch.io_gpr = rd; - + } trace_kvm_iocsr(KVM_TRACE_IOCSR_READ, run->iocsr_io.len, addr, NULL); } @@ -359,7 +366,7 @@ static int kvm_handle_gspr(struct kvm_vcpu *vcpu) int kvm_emu_mmio_read(struct kvm_vcpu *vcpu, larch_inst inst) { - int ret; + int idx, ret; unsigned int op8, opcode, rd; struct kvm_run *run = vcpu->run; @@ -464,8 +471,10 @@ int kvm_emu_mmio_read(struct kvm_vcpu *vcpu, larch_inst inst) * it need not return to user space to handle the mmio * exception. */ + idx = srcu_read_lock(&vcpu->kvm->srcu); ret = kvm_io_bus_read(vcpu, KVM_MMIO_BUS, vcpu->arch.badv, run->mmio.len, &vcpu->arch.gprs[rd]); + srcu_read_unlock(&vcpu->kvm->srcu, idx); if (!ret) { update_pc(&vcpu->arch); vcpu->mmio_needed = 0; @@ -531,7 +540,7 @@ int kvm_complete_mmio_read(struct kvm_vcpu *vcpu, struct kvm_run *run) int kvm_emu_mmio_write(struct kvm_vcpu *vcpu, larch_inst inst) { - int ret; + int idx, ret; unsigned int rd, op8, opcode; unsigned long curr_pc, rd_val = 0; struct kvm_run *run = vcpu->run; @@ -631,7 +640,9 @@ int kvm_emu_mmio_write(struct kvm_vcpu *vcpu, larch_inst inst) * it need not return to user space to handle the mmio * exception. 
*/ + idx = srcu_read_lock(&vcpu->kvm->srcu); ret = kvm_io_bus_write(vcpu, KVM_MMIO_BUS, vcpu->arch.badv, run->mmio.len, data); + srcu_read_unlock(&vcpu->kvm->srcu, idx); if (!ret) return EMULATE_DONE; diff --git a/arch/loongarch/kvm/intc/ipi.c b/arch/loongarch/kvm/intc/ipi.c index a233a323e295..93f4acd44523 100644 --- a/arch/loongarch/kvm/intc/ipi.c +++ b/arch/loongarch/kvm/intc/ipi.c @@ -98,7 +98,7 @@ static void write_mailbox(struct kvm_vcpu *vcpu, int offset, uint64_t data, int static int send_ipi_data(struct kvm_vcpu *vcpu, gpa_t addr, uint64_t data) { - int i, ret; + int i, idx, ret; uint32_t val = 0, mask = 0; /* @@ -107,7 +107,9 @@ static int send_ipi_data(struct kvm_vcpu *vcpu, gpa_t addr, uint64_t data) */ if ((data >> 27) & 0xf) { /* Read the old val */ + idx = srcu_read_lock(&vcpu->kvm->srcu); ret = kvm_io_bus_read(vcpu, KVM_IOCSR_BUS, addr, sizeof(val), &val); + srcu_read_unlock(&vcpu->kvm->srcu, idx); if (unlikely(ret)) { kvm_err("%s: : read date from addr %llx failed\n", __func__, addr); return ret; @@ -121,7 +123,9 @@ static int send_ipi_data(struct kvm_vcpu *vcpu, gpa_t addr, uint64_t data) val &= mask; } val |= ((uint32_t)(data >> 32) & ~mask); + idx = srcu_read_lock(&vcpu->kvm->srcu); ret = kvm_io_bus_write(vcpu, KVM_IOCSR_BUS, addr, sizeof(val), &val); + srcu_read_unlock(&vcpu->kvm->srcu, idx); if (unlikely(ret)) kvm_err("%s: : write date to addr %llx failed\n", __func__, addr); diff --git a/arch/loongarch/kvm/vcpu.c b/arch/loongarch/kvm/vcpu.c index cab1818be68d..d18a4a270415 100644 --- a/arch/loongarch/kvm/vcpu.c +++ b/arch/loongarch/kvm/vcpu.c @@ -240,7 +240,7 @@ static void kvm_late_check_requests(struct kvm_vcpu *vcpu) */ static int kvm_enter_guest_check(struct kvm_vcpu *vcpu) { - int ret; + int idx, ret; /* * Check conditions before entering the guest @@ -249,7 +249,9 @@ static int kvm_enter_guest_check(struct kvm_vcpu *vcpu) if (ret < 0) return ret; + idx = srcu_read_lock(&vcpu->kvm->srcu); ret = kvm_check_requests(vcpu); + 
srcu_read_unlock(&vcpu->kvm->srcu, idx); return ret; } diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c index dd350cba1252..ea357a3edc09 100644 --- a/arch/loongarch/net/bpf_jit.c +++ b/arch/loongarch/net/bpf_jit.c @@ -181,13 +181,13 @@ static void __build_epilogue(struct jit_ctx *ctx, bool is_tail_call) /* Set return value */ emit_insn(ctx, addiw, LOONGARCH_GPR_A0, regmap[BPF_REG_0], 0); /* Return to the caller */ - emit_insn(ctx, jirl, LOONGARCH_GPR_RA, LOONGARCH_GPR_ZERO, 0); + emit_insn(ctx, jirl, LOONGARCH_GPR_ZERO, LOONGARCH_GPR_RA, 0); } else { /* * Call the next bpf prog and skip the first instruction * of TCC initialization. */ - emit_insn(ctx, jirl, LOONGARCH_GPR_T3, LOONGARCH_GPR_ZERO, 1); + emit_insn(ctx, jirl, LOONGARCH_GPR_ZERO, LOONGARCH_GPR_T3, 1); } } @@ -904,7 +904,7 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx, bool ext return ret; move_addr(ctx, t1, func_addr); - emit_insn(ctx, jirl, t1, LOONGARCH_GPR_RA, 0); + emit_insn(ctx, jirl, LOONGARCH_GPR_RA, t1, 0); move_reg(ctx, regmap[BPF_REG_0], LOONGARCH_GPR_A0); break; diff --git a/arch/mips/pci/pci-xtalk-bridge.c b/arch/mips/pci/pci-xtalk-bridge.c index 45ddbaa6c123..dae856fb3e5b 100644 --- a/arch/mips/pci/pci-xtalk-bridge.c +++ b/arch/mips/pci/pci-xtalk-bridge.c @@ -749,7 +749,7 @@ static void bridge_remove(struct platform_device *pdev) static struct platform_driver bridge_driver = { .probe = bridge_probe, - .remove_new = bridge_remove, + .remove = bridge_remove, .driver = { .name = "xtalk-bridge", } diff --git a/arch/nios2/kernel/cpuinfo.c b/arch/nios2/kernel/cpuinfo.c index 338849c430a5..7b1e8f9128e9 100644 --- a/arch/nios2/kernel/cpuinfo.c +++ b/arch/nios2/kernel/cpuinfo.c @@ -143,11 +143,11 @@ static int show_cpuinfo(struct seq_file *m, void *v) " DIV:\t\t%s\n" " BMX:\t\t%s\n" " CDX:\t\t%s\n", - cpuinfo.has_mul ? "yes" : "no", - cpuinfo.has_mulx ? "yes" : "no", - cpuinfo.has_div ? "yes" : "no", - cpuinfo.has_bmx ? 
"yes" : "no", - cpuinfo.has_cdx ? "yes" : "no"); + str_yes_no(cpuinfo.has_mul), + str_yes_no(cpuinfo.has_mulx), + str_yes_no(cpuinfo.has_div), + str_yes_no(cpuinfo.has_bmx), + str_yes_no(cpuinfo.has_cdx)); seq_printf(m, "Icache:\t\t%ukB, line length: %u\n", diff --git a/arch/openrisc/kernel/entry.S b/arch/openrisc/kernel/entry.S index 440711d7bf40..ce6f2b08a35e 100644 --- a/arch/openrisc/kernel/entry.S +++ b/arch/openrisc/kernel/entry.S @@ -239,6 +239,8 @@ handler: ;\ /* =====================================================[ exceptions] === */ + __REF + /* ---[ 0x100: RESET exception ]----------------------------------------- */ EXCEPTION_ENTRY(_tng_kernel_start) diff --git a/arch/openrisc/kernel/head.S b/arch/openrisc/kernel/head.S index 439e00f81e5d..bd760066f1cd 100644 --- a/arch/openrisc/kernel/head.S +++ b/arch/openrisc/kernel/head.S @@ -26,15 +26,15 @@ #include #include -#define tophys(rd,rs) \ - l.movhi rd,hi(-KERNELBASE) ;\ +#define tophys(rd,rs) \ + l.movhi rd,hi(-KERNELBASE) ;\ l.add rd,rd,rs -#define CLEAR_GPR(gpr) \ +#define CLEAR_GPR(gpr) \ l.movhi gpr,0x0 -#define LOAD_SYMBOL_2_GPR(gpr,symbol) \ - l.movhi gpr,hi(symbol) ;\ +#define LOAD_SYMBOL_2_GPR(gpr,symbol) \ + l.movhi gpr,hi(symbol) ;\ l.ori gpr,gpr,lo(symbol) @@ -326,21 +326,21 @@ l.addi r1,r1,-(INT_FRAME_SIZE) ;\ /* r1 is KSP, r30 is __pa(KSP) */ ;\ tophys (r30,r1) ;\ - l.sw PT_GPR12(r30),r12 ;\ + l.sw PT_GPR12(r30),r12 ;\ l.mfspr r12,r0,SPR_EPCR_BASE ;\ l.sw PT_PC(r30),r12 ;\ l.mfspr r12,r0,SPR_ESR_BASE ;\ l.sw PT_SR(r30),r12 ;\ /* save r31 */ ;\ EXCEPTION_T_LOAD_GPR30(r12) ;\ - l.sw PT_GPR30(r30),r12 ;\ + l.sw PT_GPR30(r30),r12 ;\ /* save r10 as was prior to exception */ ;\ EXCEPTION_T_LOAD_GPR10(r12) ;\ - l.sw PT_GPR10(r30),r12 ;\ - /* save PT_SP as was prior to exception */ ;\ + l.sw PT_GPR10(r30),r12 ;\ + /* save PT_SP as was prior to exception */ ;\ EXCEPTION_T_LOAD_SP(r12) ;\ l.sw PT_SP(r30),r12 ;\ - l.sw PT_GPR13(r30),r13 ;\ + l.sw PT_GPR13(r30),r13 ;\ /* --> */ ;\ /* save exception r4, 
set r4 = EA */ ;\ l.sw PT_GPR4(r30),r4 ;\ @@ -357,6 +357,8 @@ /* =====================================================[ exceptions] === */ + __HEAD + /* ---[ 0x100: RESET exception ]----------------------------------------- */ .org 0x100 /* Jump to .init code at _start which lives in the .head section @@ -394,7 +396,7 @@ _dispatch_do_ipage_fault: .org 0x500 EXCEPTION_HANDLE(_timer_handler) -/* ---[ 0x600: Alignment exception ]-------------------------------------- */ +/* ---[ 0x600: Alignment exception ]------------------------------------- */ .org 0x600 EXCEPTION_HANDLE(_alignment_handler) @@ -424,7 +426,7 @@ _dispatch_do_ipage_fault: .org 0xc00 EXCEPTION_HANDLE(_sys_call_handler) -/* ---[ 0xd00: Floating point exception ]--------------------------------- */ +/* ---[ 0xd00: Floating point exception ]-------------------------------- */ .org 0xd00 EXCEPTION_HANDLE(_fpe_trap_handler) @@ -506,10 +508,10 @@ _dispatch_do_ipage_fault: /* .text*/ -/* This early stuff belongs in HEAD, but some of the functions below definitely +/* This early stuff belongs in the .init.text section, but some of the functions below definitely * don't... */ - __HEAD + __INIT .global _start _start: /* Init r0 to zero as per spec */ @@ -816,7 +818,7 @@ secondary_start: #endif -/* ========================================[ cache ]=== */ +/* ==========================================================[ cache ]=== */ /* alignment here so we don't change memory offsets with * memory controller defined diff --git a/arch/openrisc/kernel/vmlinux.lds.S b/arch/openrisc/kernel/vmlinux.lds.S index bc1306047837..049bff45f612 100644 --- a/arch/openrisc/kernel/vmlinux.lds.S +++ b/arch/openrisc/kernel/vmlinux.lds.S @@ -50,6 +50,7 @@ SECTIONS .text : AT(ADDR(.text) - LOAD_OFFSET) { _stext = .; + HEAD_TEXT TEXT_TEXT SCHED_TEXT LOCK_TEXT @@ -83,8 +84,6 @@ SECTIONS . 
= ALIGN(PAGE_SIZE); __init_begin = .; - HEAD_TEXT_SECTION - /* Page aligned */ INIT_TEXT_SECTION(PAGE_SIZE) diff --git a/arch/powerpc/crypto/vmx.c b/arch/powerpc/crypto/vmx.c index 7eb713cc87c8..0b725e826388 100644 --- a/arch/powerpc/crypto/vmx.c +++ b/arch/powerpc/crypto/vmx.c @@ -74,4 +74,4 @@ MODULE_DESCRIPTION("IBM VMX cryptographic acceleration instructions " "support on Power 8"); MODULE_LICENSE("GPL"); MODULE_VERSION("1.0.0"); -MODULE_IMPORT_NS(CRYPTO_INTERNAL); +MODULE_IMPORT_NS("CRYPTO_INTERNAL"); diff --git a/arch/powerpc/kvm/e500.h b/arch/powerpc/kvm/e500.h index 6d0d329cbb35..f9acf866c709 100644 --- a/arch/powerpc/kvm/e500.h +++ b/arch/powerpc/kvm/e500.h @@ -34,6 +34,8 @@ enum vcpu_ftr { #define E500_TLB_BITMAP (1 << 30) /* TLB1 entry is mapped by host TLB0 */ #define E500_TLB_TLB0 (1 << 29) +/* entry is writable on the host */ +#define E500_TLB_WRITABLE (1 << 28) /* bits [6-5] MAS2_X1 and MAS2_X0 and [4-0] bits for WIMGE */ #define E500_TLB_MAS2_ATTR (0x7f) diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c index e5a145b578a4..06caf8bbbe2b 100644 --- a/arch/powerpc/kvm/e500_mmu_host.c +++ b/arch/powerpc/kvm/e500_mmu_host.c @@ -45,11 +45,14 @@ static inline unsigned int tlb1_max_shadow_size(void) return host_tlb_params[1].entries - tlbcam_index - 1; } -static inline u32 e500_shadow_mas3_attrib(u32 mas3, int usermode) +static inline u32 e500_shadow_mas3_attrib(u32 mas3, bool writable, int usermode) { /* Mask off reserved bits. 
*/ mas3 &= MAS3_ATTRIB_MASK; + if (!writable) + mas3 &= ~(MAS3_UW|MAS3_SW); + #ifndef CONFIG_KVM_BOOKE_HV if (!usermode) { /* Guest is in supervisor mode, @@ -242,17 +245,18 @@ static inline int tlbe_is_writable(struct kvm_book3e_206_tlb_entry *tlbe) return tlbe->mas7_3 & (MAS3_SW|MAS3_UW); } -static inline bool kvmppc_e500_ref_setup(struct tlbe_ref *ref, +static inline void kvmppc_e500_ref_setup(struct tlbe_ref *ref, struct kvm_book3e_206_tlb_entry *gtlbe, - kvm_pfn_t pfn, unsigned int wimg) + kvm_pfn_t pfn, unsigned int wimg, + bool writable) { ref->pfn = pfn; ref->flags = E500_TLB_VALID; + if (writable) + ref->flags |= E500_TLB_WRITABLE; /* Use guest supplied MAS2_G and MAS2_E */ ref->flags |= (gtlbe->mas2 & MAS2_ATTRIB_MASK) | wimg; - - return tlbe_is_writable(gtlbe); } static inline void kvmppc_e500_ref_release(struct tlbe_ref *ref) @@ -305,6 +309,7 @@ static void kvmppc_e500_setup_stlbe( { kvm_pfn_t pfn = ref->pfn; u32 pr = vcpu->arch.shared->msr & MSR_PR; + bool writable = !!(ref->flags & E500_TLB_WRITABLE); BUG_ON(!(ref->flags & E500_TLB_VALID)); @@ -312,7 +317,7 @@ static void kvmppc_e500_setup_stlbe( stlbe->mas1 = MAS1_TSIZE(tsize) | get_tlb_sts(gtlbe) | MAS1_VALID; stlbe->mas2 = (gvaddr & MAS2_EPN) | (ref->flags & E500_TLB_MAS2_ATTR); stlbe->mas7_3 = ((u64)pfn << PAGE_SHIFT) | - e500_shadow_mas3_attrib(gtlbe->mas7_3, pr); + e500_shadow_mas3_attrib(gtlbe->mas7_3, writable, pr); } static inline int kvmppc_e500_shadow_map(struct kvmppc_vcpu_e500 *vcpu_e500, @@ -321,15 +326,14 @@ static inline int kvmppc_e500_shadow_map(struct kvmppc_vcpu_e500 *vcpu_e500, struct tlbe_ref *ref) { struct kvm_memory_slot *slot; - unsigned long pfn = 0; /* silence GCC warning */ + unsigned int psize; + unsigned long pfn; struct page *page = NULL; unsigned long hva; - int pfnmap = 0; int tsize = BOOK3E_PAGESZ_4K; int ret = 0; unsigned long mmu_seq; struct kvm *kvm = vcpu_e500->vcpu.kvm; - unsigned long tsize_pages = 0; pte_t *ptep; unsigned int wimg = 0; pgd_t *pgdir; @@ -351,110 
+355,12 @@ static inline int kvmppc_e500_shadow_map(struct kvmppc_vcpu_e500 *vcpu_e500, slot = gfn_to_memslot(vcpu_e500->vcpu.kvm, gfn); hva = gfn_to_hva_memslot(slot, gfn); - if (tlbsel == 1) { - struct vm_area_struct *vma; - mmap_read_lock(kvm->mm); - - vma = find_vma(kvm->mm, hva); - if (vma && hva >= vma->vm_start && - (vma->vm_flags & VM_PFNMAP)) { - /* - * This VMA is a physically contiguous region (e.g. - * /dev/mem) that bypasses normal Linux page - * management. Find the overlap between the - * vma and the memslot. - */ - - unsigned long start, end; - unsigned long slot_start, slot_end; - - pfnmap = 1; - - start = vma->vm_pgoff; - end = start + - vma_pages(vma); - - pfn = start + ((hva - vma->vm_start) >> PAGE_SHIFT); - - slot_start = pfn - (gfn - slot->base_gfn); - slot_end = slot_start + slot->npages; - - if (start < slot_start) - start = slot_start; - if (end > slot_end) - end = slot_end; - - tsize = (gtlbe->mas1 & MAS1_TSIZE_MASK) >> - MAS1_TSIZE_SHIFT; - - /* - * e500 doesn't implement the lowest tsize bit, - * or 1K pages. - */ - tsize = max(BOOK3E_PAGESZ_4K, tsize & ~1); - - /* - * Now find the largest tsize (up to what the guest - * requested) that will cover gfn, stay within the - * range, and for which gfn and pfn are mutually - * aligned. 
- */ - - for (; tsize > BOOK3E_PAGESZ_4K; tsize -= 2) { - unsigned long gfn_start, gfn_end; - tsize_pages = 1UL << (tsize - 2); - - gfn_start = gfn & ~(tsize_pages - 1); - gfn_end = gfn_start + tsize_pages; - - if (gfn_start + pfn - gfn < start) - continue; - if (gfn_end + pfn - gfn > end) - continue; - if ((gfn & (tsize_pages - 1)) != - (pfn & (tsize_pages - 1))) - continue; - - gvaddr &= ~((tsize_pages << PAGE_SHIFT) - 1); - pfn &= ~(tsize_pages - 1); - break; - } - } else if (vma && hva >= vma->vm_start && - is_vm_hugetlb_page(vma)) { - unsigned long psize = vma_kernel_pagesize(vma); - - tsize = (gtlbe->mas1 & MAS1_TSIZE_MASK) >> - MAS1_TSIZE_SHIFT; - - /* - * Take the largest page size that satisfies both host - * and guest mapping - */ - tsize = min(__ilog2(psize) - 10, tsize); - - /* - * e500 doesn't implement the lowest tsize bit, - * or 1K pages. - */ - tsize = max(BOOK3E_PAGESZ_4K, tsize & ~1); - } - - mmap_read_unlock(kvm->mm); - } - - if (likely(!pfnmap)) { - tsize_pages = 1UL << (tsize + 10 - PAGE_SHIFT); - pfn = __kvm_faultin_pfn(slot, gfn, FOLL_WRITE, NULL, &page); - if (is_error_noslot_pfn(pfn)) { - if (printk_ratelimit()) - pr_err("%s: real page not found for gfn %lx\n", - __func__, (long)gfn); - return -EINVAL; - } - - /* Align guest and physical address to page map boundaries */ - pfn &= ~(tsize_pages - 1); - gvaddr &= ~((tsize_pages << PAGE_SHIFT) - 1); + pfn = __kvm_faultin_pfn(slot, gfn, FOLL_WRITE, &writable, &page); + if (is_error_noslot_pfn(pfn)) { + if (printk_ratelimit()) + pr_err("%s: real page not found for gfn %lx\n", + __func__, (long)gfn); + return -EINVAL; } spin_lock(&kvm->mmu_lock); @@ -472,14 +378,13 @@ static inline int kvmppc_e500_shadow_map(struct kvmppc_vcpu_e500 *vcpu_e500, * can't run hence pfn won't change. 
*/ local_irq_save(flags); - ptep = find_linux_pte(pgdir, hva, NULL, NULL); + ptep = find_linux_pte(pgdir, hva, NULL, &psize); if (ptep) { pte_t pte = READ_ONCE(*ptep); if (pte_present(pte)) { wimg = (pte_val(pte) >> PTE_WIMGE_SHIFT) & MAS2_WIMGE_MASK; - local_irq_restore(flags); } else { local_irq_restore(flags); pr_err_ratelimited("%s: pte not present: gfn %lx,pfn %lx\n", @@ -488,10 +393,72 @@ static inline int kvmppc_e500_shadow_map(struct kvmppc_vcpu_e500 *vcpu_e500, goto out; } } - writable = kvmppc_e500_ref_setup(ref, gtlbe, pfn, wimg); + local_irq_restore(flags); + if (psize && tlbsel == 1) { + unsigned long psize_pages, tsize_pages; + unsigned long start, end; + unsigned long slot_start, slot_end; + + psize_pages = 1UL << (psize - PAGE_SHIFT); + start = pfn & ~(psize_pages - 1); + end = start + psize_pages; + + slot_start = pfn - (gfn - slot->base_gfn); + slot_end = slot_start + slot->npages; + + if (start < slot_start) + start = slot_start; + if (end > slot_end) + end = slot_end; + + tsize = (gtlbe->mas1 & MAS1_TSIZE_MASK) >> + MAS1_TSIZE_SHIFT; + + /* + * Any page size that doesn't satisfy the host mapping + * will fail the start and end tests. + */ + tsize = min(psize - PAGE_SHIFT + BOOK3E_PAGESZ_4K, tsize); + + /* + * e500 doesn't implement the lowest tsize bit, + * or 1K pages. + */ + tsize = max(BOOK3E_PAGESZ_4K, tsize & ~1); + + /* + * Now find the largest tsize (up to what the guest + * requested) that will cover gfn, stay within the + * range, and for which gfn and pfn are mutually + * aligned. 
+ */ + + for (; tsize > BOOK3E_PAGESZ_4K; tsize -= 2) { + unsigned long gfn_start, gfn_end; + tsize_pages = 1UL << (tsize - 2); + + gfn_start = gfn & ~(tsize_pages - 1); + gfn_end = gfn_start + tsize_pages; + + if (gfn_start + pfn - gfn < start) + continue; + if (gfn_end + pfn - gfn > end) + continue; + if ((gfn & (tsize_pages - 1)) != + (pfn & (tsize_pages - 1))) + continue; + + gvaddr &= ~((tsize_pages << PAGE_SHIFT) - 1); + pfn &= ~(tsize_pages - 1); + break; + } + } + + kvmppc_e500_ref_setup(ref, gtlbe, pfn, wimg, writable); kvmppc_e500_setup_stlbe(&vcpu_e500->vcpu, gtlbe, tsize, ref, gvaddr, stlbe); + writable = tlbe_is_writable(stlbe); /* Clear i-cache for new pages */ kvmppc_mmu_flush_icache(pfn); diff --git a/arch/powerpc/platforms/book3s/vas-api.c b/arch/powerpc/platforms/book3s/vas-api.c index f381b177ea06..0b6365d85d11 100644 --- a/arch/powerpc/platforms/book3s/vas-api.c +++ b/arch/powerpc/platforms/book3s/vas-api.c @@ -464,7 +464,43 @@ static vm_fault_t vas_mmap_fault(struct vm_fault *vmf) return VM_FAULT_SIGBUS; } +/* + * During mmap() paste address, mapping VMA is saved in VAS window + * struct which is used to unmap during migration if the window is + * still open. But the user space can remove this mapping with + * munmap() before closing the window and the VMA address will + * be invalid. Set VAS window VMA to NULL in this function which + * is called before VMA free. + */ +static void vas_mmap_close(struct vm_area_struct *vma) +{ + struct file *fp = vma->vm_file; + struct coproc_instance *cp_inst = fp->private_data; + struct vas_window *txwin; + + /* Should not happen */ + if (!cp_inst || !cp_inst->txwin) { + pr_err("No attached VAS window for the paste address mmap\n"); + return; + } + + txwin = cp_inst->txwin; + /* + * task_ref.vma is set in coproc_mmap() during mmap paste + * address. So it has to be the same VMA that is getting freed. 
+ */ + if (WARN_ON(txwin->task_ref.vma != vma)) { + pr_err("Invalid paste address mmaping\n"); + return; + } + + mutex_lock(&txwin->task_ref.mmap_mutex); + txwin->task_ref.vma = NULL; + mutex_unlock(&txwin->task_ref.mmap_mutex); +} + static const struct vm_operations_struct vas_vm_ops = { + .close = vas_mmap_close, .fault = vas_mmap_fault, }; diff --git a/arch/riscv/include/asm/kfence.h b/arch/riscv/include/asm/kfence.h index 7388edd88986..d08bf7fb3aee 100644 --- a/arch/riscv/include/asm/kfence.h +++ b/arch/riscv/include/asm/kfence.h @@ -22,7 +22,9 @@ static inline bool kfence_protect_page(unsigned long addr, bool protect) else set_pte(pte, __pte(pte_val(ptep_get(pte)) | _PAGE_PRESENT)); - flush_tlb_kernel_range(addr, addr + PAGE_SIZE); + preempt_disable(); + local_flush_tlb_kernel_range(addr, addr + PAGE_SIZE); + preempt_enable(); return true; } diff --git a/arch/riscv/include/asm/page.h b/arch/riscv/include/asm/page.h index 71aabc5c6713..125f5ecd9565 100644 --- a/arch/riscv/include/asm/page.h +++ b/arch/riscv/include/asm/page.h @@ -122,6 +122,7 @@ struct kernel_mapping { extern struct kernel_mapping kernel_map; extern phys_addr_t phys_ram_base; +extern unsigned long vmemmap_start_pfn; #define is_kernel_mapping(x) \ ((x) >= kernel_map.virt_addr && (x) < (kernel_map.virt_addr + kernel_map.size)) diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h index d4e99eef90ac..050fdc49b5ad 100644 --- a/arch/riscv/include/asm/pgtable.h +++ b/arch/riscv/include/asm/pgtable.h @@ -87,7 +87,7 @@ * Define vmemmap for pfn_to_page & page_to_pfn calls. Needed if kernel * is configured with CONFIG_SPARSEMEM_VMEMMAP enabled. 
*/ -#define vmemmap ((struct page *)VMEMMAP_START - (phys_ram_base >> PAGE_SHIFT)) +#define vmemmap ((struct page *)VMEMMAP_START - vmemmap_start_pfn) #define PCI_IO_SIZE SZ_16M #define PCI_IO_END VMEMMAP_START diff --git a/arch/riscv/include/asm/sbi.h b/arch/riscv/include/asm/sbi.h index 6c82318065cf..3d250824178b 100644 --- a/arch/riscv/include/asm/sbi.h +++ b/arch/riscv/include/asm/sbi.h @@ -159,6 +159,7 @@ struct riscv_pmu_snapshot_data { }; #define RISCV_PMU_RAW_EVENT_MASK GENMASK_ULL(47, 0) +#define RISCV_PMU_PLAT_FW_EVENT_MASK GENMASK_ULL(61, 0) #define RISCV_PMU_RAW_EVENT_IDX 0x20000 #define RISCV_PLAT_FW_EVENT 0xFFFF diff --git a/arch/riscv/include/asm/spinlock.h b/arch/riscv/include/asm/spinlock.h index e5121b89acea..52f11bfd0079 100644 --- a/arch/riscv/include/asm/spinlock.h +++ b/arch/riscv/include/asm/spinlock.h @@ -3,8 +3,11 @@ #ifndef __ASM_RISCV_SPINLOCK_H #define __ASM_RISCV_SPINLOCK_H -#ifdef CONFIG_RISCV_COMBO_SPINLOCKS +#ifdef CONFIG_QUEUED_SPINLOCKS #define _Q_PENDING_LOOPS (1 << 9) +#endif + +#ifdef CONFIG_RISCV_COMBO_SPINLOCKS #define __no_arch_spinlock_redefine #include diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S index c200d329d4bd..33a5a9f2a0d4 100644 --- a/arch/riscv/kernel/entry.S +++ b/arch/riscv/kernel/entry.S @@ -23,21 +23,21 @@ REG_S a0, TASK_TI_A0(tp) csrr a0, CSR_CAUSE /* Exclude IRQs */ - blt a0, zero, _new_vmalloc_restore_context_a0 + blt a0, zero, .Lnew_vmalloc_restore_context_a0 REG_S a1, TASK_TI_A1(tp) /* Only check new_vmalloc if we are in page/protection fault */ li a1, EXC_LOAD_PAGE_FAULT - beq a0, a1, _new_vmalloc_kernel_address + beq a0, a1, .Lnew_vmalloc_kernel_address li a1, EXC_STORE_PAGE_FAULT - beq a0, a1, _new_vmalloc_kernel_address + beq a0, a1, .Lnew_vmalloc_kernel_address li a1, EXC_INST_PAGE_FAULT - bne a0, a1, _new_vmalloc_restore_context_a1 + bne a0, a1, .Lnew_vmalloc_restore_context_a1 -_new_vmalloc_kernel_address: +.Lnew_vmalloc_kernel_address: /* Is it a kernel address? 
*/ csrr a0, CSR_TVAL - bge a0, zero, _new_vmalloc_restore_context_a1 + bge a0, zero, .Lnew_vmalloc_restore_context_a1 /* Check if a new vmalloc mapping appeared that could explain the trap */ REG_S a2, TASK_TI_A2(tp) @@ -69,7 +69,7 @@ _new_vmalloc_kernel_address: /* Check the value of new_vmalloc for this cpu */ REG_L a2, 0(a0) and a2, a2, a1 - beq a2, zero, _new_vmalloc_restore_context + beq a2, zero, .Lnew_vmalloc_restore_context /* Atomically reset the current cpu bit in new_vmalloc */ amoxor.d a0, a1, (a0) @@ -83,11 +83,11 @@ _new_vmalloc_kernel_address: csrw CSR_SCRATCH, x0 sret -_new_vmalloc_restore_context: +.Lnew_vmalloc_restore_context: REG_L a2, TASK_TI_A2(tp) -_new_vmalloc_restore_context_a1: +.Lnew_vmalloc_restore_context_a1: REG_L a1, TASK_TI_A1(tp) -_new_vmalloc_restore_context_a0: +.Lnew_vmalloc_restore_context_a0: REG_L a0, TASK_TI_A0(tp) .endm @@ -278,6 +278,7 @@ SYM_CODE_START_NOALIGN(ret_from_exception) #else sret #endif +SYM_INNER_LABEL(ret_from_exception_end, SYM_L_GLOBAL) SYM_CODE_END(ret_from_exception) ASM_NOKPROBE(ret_from_exception) diff --git a/arch/riscv/kernel/jump_label.c b/arch/riscv/kernel/jump_label.c index 6eee6f736f68..654ed159c830 100644 --- a/arch/riscv/kernel/jump_label.c +++ b/arch/riscv/kernel/jump_label.c @@ -36,9 +36,15 @@ bool arch_jump_label_transform_queue(struct jump_entry *entry, insn = RISCV_INSN_NOP; } - mutex_lock(&text_mutex); - patch_insn_write(addr, &insn, sizeof(insn)); - mutex_unlock(&text_mutex); + if (early_boot_irqs_disabled) { + riscv_patch_in_stop_machine = 1; + patch_insn_write(addr, &insn, sizeof(insn)); + riscv_patch_in_stop_machine = 0; + } else { + mutex_lock(&text_mutex); + patch_insn_write(addr, &insn, sizeof(insn)); + mutex_unlock(&text_mutex); + } return true; } diff --git a/arch/riscv/kernel/module.c b/arch/riscv/kernel/module.c index 1cd461f3d872..47d0ebeec93c 100644 --- a/arch/riscv/kernel/module.c +++ b/arch/riscv/kernel/module.c @@ -23,7 +23,7 @@ struct used_bucket { struct relocation_head { 
struct hlist_node node; - struct list_head *rel_entry; + struct list_head rel_entry; void *location; }; @@ -634,7 +634,7 @@ process_accumulated_relocations(struct module *me, location = rel_head_iter->location; list_for_each_entry_safe(rel_entry_iter, rel_entry_iter_tmp, - rel_head_iter->rel_entry, + &rel_head_iter->rel_entry, head) { curr_type = rel_entry_iter->type; reloc_handlers[curr_type].reloc_handler( @@ -704,16 +704,7 @@ static int add_relocation_to_accumulate(struct module *me, int type, return -ENOMEM; } - rel_head->rel_entry = - kmalloc(sizeof(struct list_head), GFP_KERNEL); - - if (!rel_head->rel_entry) { - kfree(entry); - kfree(rel_head); - return -ENOMEM; - } - - INIT_LIST_HEAD(rel_head->rel_entry); + INIT_LIST_HEAD(&rel_head->rel_entry); rel_head->location = location; INIT_HLIST_NODE(&rel_head->node); if (!current_head->first) { @@ -722,7 +713,6 @@ static int add_relocation_to_accumulate(struct module *me, int type, if (!bucket) { kfree(entry); - kfree(rel_head->rel_entry); kfree(rel_head); return -ENOMEM; } @@ -735,7 +725,7 @@ static int add_relocation_to_accumulate(struct module *me, int type, } /* Add relocation to head of discovered rel_head */ - list_add_tail(&entry->head, rel_head->rel_entry); + list_add_tail(&entry->head, &rel_head->rel_entry); return 0; } diff --git a/arch/riscv/kernel/probes/kprobes.c b/arch/riscv/kernel/probes/kprobes.c index 380a0e8cecc0..c0738d6c6498 100644 --- a/arch/riscv/kernel/probes/kprobes.c +++ b/arch/riscv/kernel/probes/kprobes.c @@ -30,7 +30,7 @@ static void __kprobes arch_prepare_ss_slot(struct kprobe *p) p->ainsn.api.restore = (unsigned long)p->addr + len; patch_text_nosync(p->ainsn.api.insn, &p->opcode, len); - patch_text_nosync(p->ainsn.api.insn + len, &insn, GET_INSN_LENGTH(insn)); + patch_text_nosync((void *)p->ainsn.api.insn + len, &insn, GET_INSN_LENGTH(insn)); } static void __kprobes arch_prepare_simulate(struct kprobe *p) diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c index 
016b48fcd6f2..45010e71df86 100644 --- a/arch/riscv/kernel/setup.c +++ b/arch/riscv/kernel/setup.c @@ -227,7 +227,7 @@ static void __init init_resources(void) static void __init parse_dtb(void) { /* Early scan of device tree from init memory */ - if (early_init_dt_scan(dtb_early_va, __pa(dtb_early_va))) { + if (early_init_dt_scan(dtb_early_va, dtb_early_pa)) { const char *name = of_flat_dt_get_machine_name(); if (name) { diff --git a/arch/riscv/kernel/stacktrace.c b/arch/riscv/kernel/stacktrace.c index 153a2db4c5fa..d4355c770c36 100644 --- a/arch/riscv/kernel/stacktrace.c +++ b/arch/riscv/kernel/stacktrace.c @@ -17,6 +17,7 @@ #ifdef CONFIG_FRAME_POINTER extern asmlinkage void handle_exception(void); +extern unsigned long ret_from_exception_end; static inline int fp_is_valid(unsigned long fp, unsigned long sp) { @@ -71,7 +72,8 @@ void notrace walk_stackframe(struct task_struct *task, struct pt_regs *regs, fp = frame->fp; pc = ftrace_graph_ret_addr(current, &graph_idx, frame->ra, &frame->ra); - if (pc == (unsigned long)handle_exception) { + if (pc >= (unsigned long)handle_exception && + pc < (unsigned long)&ret_from_exception_end) { if (unlikely(!__kernel_text_address(pc) || !fn(arg, pc))) break; diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c index 51ebfd23e007..8ff8e8b36524 100644 --- a/arch/riscv/kernel/traps.c +++ b/arch/riscv/kernel/traps.c @@ -35,7 +35,7 @@ int show_unhandled_signals = 1; -static DEFINE_SPINLOCK(die_lock); +static DEFINE_RAW_SPINLOCK(die_lock); static int copy_code(struct pt_regs *regs, u16 *val, const u16 *insns) { @@ -81,7 +81,7 @@ void die(struct pt_regs *regs, const char *str) oops_enter(); - spin_lock_irqsave(&die_lock, flags); + raw_spin_lock_irqsave(&die_lock, flags); console_verbose(); bust_spinlocks(1); @@ -100,7 +100,7 @@ void die(struct pt_regs *regs, const char *str) bust_spinlocks(0); add_taint(TAINT_DIE, LOCKDEP_NOW_UNRELIABLE); - spin_unlock_irqrestore(&die_lock, flags); + raw_spin_unlock_irqrestore(&die_lock, 
flags); oops_exit(); if (in_interrupt()) diff --git a/arch/riscv/kvm/aia.c b/arch/riscv/kvm/aia.c index dcced4db7fe8..19afd1f23537 100644 --- a/arch/riscv/kvm/aia.c +++ b/arch/riscv/kvm/aia.c @@ -590,7 +590,7 @@ void kvm_riscv_aia_enable(void) csr_set(CSR_HIE, BIT(IRQ_S_GEXT)); /* Enable IRQ filtering for overflow interrupt only if sscofpmf is present */ if (__riscv_isa_extension_available(NULL, RISCV_ISA_EXT_SSCOFPMF)) - csr_write(CSR_HVIEN, BIT(IRQ_PMU_OVF)); + csr_set(CSR_HVIEN, BIT(IRQ_PMU_OVF)); } void kvm_riscv_aia_disable(void) diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c index 0e8c20adcd98..8d167e09f1fe 100644 --- a/arch/riscv/mm/init.c +++ b/arch/riscv/mm/init.c @@ -33,6 +33,7 @@ #include #include #include +#include #include #include "../kernel/head.h" @@ -62,6 +63,13 @@ EXPORT_SYMBOL(pgtable_l5_enabled); phys_addr_t phys_ram_base __ro_after_init; EXPORT_SYMBOL(phys_ram_base); +#ifdef CONFIG_SPARSEMEM_VMEMMAP +#define VMEMMAP_ADDR_ALIGN (1ULL << SECTION_SIZE_BITS) + +unsigned long vmemmap_start_pfn __ro_after_init; +EXPORT_SYMBOL(vmemmap_start_pfn); +#endif + unsigned long empty_zero_page[PAGE_SIZE / sizeof(unsigned long)] __page_aligned_bss; EXPORT_SYMBOL(empty_zero_page); @@ -240,8 +248,12 @@ static void __init setup_bootmem(void) * Make sure we align the start of the memory on a PMD boundary so that * at worst, we map the linear mapping with PMD mappings. 
*/ - if (!IS_ENABLED(CONFIG_XIP_KERNEL)) + if (!IS_ENABLED(CONFIG_XIP_KERNEL)) { phys_ram_base = memblock_start_of_DRAM() & PMD_MASK; +#ifdef CONFIG_SPARSEMEM_VMEMMAP + vmemmap_start_pfn = round_down(phys_ram_base, VMEMMAP_ADDR_ALIGN) >> PAGE_SHIFT; +#endif + } /* * In 64-bit, any use of __va/__pa before this point is wrong as we @@ -1101,6 +1113,9 @@ asmlinkage void __init setup_vm(uintptr_t dtb_pa) kernel_map.xiprom_sz = (uintptr_t)(&_exiprom) - (uintptr_t)(&_xiprom); phys_ram_base = CONFIG_PHYS_RAM_BASE; +#ifdef CONFIG_SPARSEMEM_VMEMMAP + vmemmap_start_pfn = round_down(phys_ram_base, VMEMMAP_ADDR_ALIGN) >> PAGE_SHIFT; +#endif kernel_map.phys_addr = (uintptr_t)CONFIG_PHYS_RAM_BASE; kernel_map.size = (uintptr_t)(&_end) - (uintptr_t)(&_start); @@ -1566,7 +1581,7 @@ static void __meminit free_pte_table(pte_t *pte_start, pmd_t *pmd) pmd_clear(pmd); } -static void __meminit free_pmd_table(pmd_t *pmd_start, pud_t *pud) +static void __meminit free_pmd_table(pmd_t *pmd_start, pud_t *pud, bool is_vmemmap) { struct page *page = pud_page(*pud); struct ptdesc *ptdesc = page_ptdesc(page); @@ -1579,7 +1594,8 @@ static void __meminit free_pmd_table(pmd_t *pmd_start, pud_t *pud) return; } - pagetable_pmd_dtor(ptdesc); + if (!is_vmemmap) + pagetable_pmd_dtor(ptdesc); if (PageReserved(page)) free_reserved_page(page); else @@ -1703,7 +1719,7 @@ static void __meminit remove_pud_mapping(pud_t *pud_base, unsigned long addr, un remove_pmd_mapping(pmd_base, addr, next, is_vmemmap, altmap); if (pgtable_l4_enabled) - free_pmd_table(pmd_base, pudp); + free_pmd_table(pmd_base, pudp, is_vmemmap); } } diff --git a/arch/s390/boot/startup.c b/arch/s390/boot/startup.c index abe6e6c0ab98..6087d38c7235 100644 --- a/arch/s390/boot/startup.c +++ b/arch/s390/boot/startup.c @@ -234,6 +234,8 @@ static unsigned long get_vmem_size(unsigned long identity_size, vsize = round_up(SZ_2G + max_mappable, rte_size) + round_up(vmemmap_size, rte_size) + FIXMAP_SIZE + MODULES_LEN + KASLR_LEN; + if 
(IS_ENABLED(CONFIG_KMSAN)) + vsize += MODULES_LEN * 2; return size_add(vsize, vmalloc_size); } diff --git a/arch/s390/boot/vmem.c b/arch/s390/boot/vmem.c index 145035f84a0e..3fa28db2fe59 100644 --- a/arch/s390/boot/vmem.c +++ b/arch/s390/boot/vmem.c @@ -306,7 +306,7 @@ static void pgtable_pte_populate(pmd_t *pmd, unsigned long addr, unsigned long e pages++; } } - if (mode == POPULATE_DIRECT) + if (mode == POPULATE_IDENTITY) update_page_count(PG_DIRECT_MAP_4K, pages); } @@ -339,7 +339,7 @@ static void pgtable_pmd_populate(pud_t *pud, unsigned long addr, unsigned long e } pgtable_pte_populate(pmd, addr, next, mode); } - if (mode == POPULATE_DIRECT) + if (mode == POPULATE_IDENTITY) update_page_count(PG_DIRECT_MAP_1M, pages); } @@ -372,7 +372,7 @@ static void pgtable_pud_populate(p4d_t *p4d, unsigned long addr, unsigned long e } pgtable_pmd_populate(pud, addr, next, mode); } - if (mode == POPULATE_DIRECT) + if (mode == POPULATE_IDENTITY) update_page_count(PG_DIRECT_MAP_2G, pages); } diff --git a/arch/s390/crypto/aes_s390.c b/arch/s390/crypto/aes_s390.c index 8cc02d6e0d0f..9c46b1b630b1 100644 --- a/arch/s390/crypto/aes_s390.c +++ b/arch/s390/crypto/aes_s390.c @@ -1168,4 +1168,4 @@ MODULE_ALIAS_CRYPTO("aes-all"); MODULE_DESCRIPTION("Rijndael (AES) Cipher Algorithm"); MODULE_LICENSE("GPL"); -MODULE_IMPORT_NS(CRYPTO_INTERNAL); +MODULE_IMPORT_NS("CRYPTO_INTERNAL"); diff --git a/arch/s390/kernel/ipl.c b/arch/s390/kernel/ipl.c index edbb52ce3f1e..7d12a1305fc9 100644 --- a/arch/s390/kernel/ipl.c +++ b/arch/s390/kernel/ipl.c @@ -270,7 +270,7 @@ static ssize_t sys_##_prefix##_##_name##_store(struct kobject *kobj, \ if (len >= sizeof(_value)) \ return -E2BIG; \ len = strscpy(_value, buf, sizeof(_value)); \ - if (len < 0) \ + if ((ssize_t)len < 0) \ return len; \ strim(_value); \ return len; \ diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c index ea8dce299954..d4f031e086fc 100644 --- a/arch/s390/kvm/interrupt.c +++ b/arch/s390/kvm/interrupt.c @@ -2678,9 +2678,13 
@@ static int flic_set_attr(struct kvm_device *dev, struct kvm_device_attr *attr) kvm_s390_clear_float_irqs(dev->kvm); break; case KVM_DEV_FLIC_APF_ENABLE: + if (kvm_is_ucontrol(dev->kvm)) + return -EINVAL; dev->kvm->arch.gmap->pfault_enabled = 1; break; case KVM_DEV_FLIC_APF_DISABLE_WAIT: + if (kvm_is_ucontrol(dev->kvm)) + return -EINVAL; dev->kvm->arch.gmap->pfault_enabled = 0; /* * Make sure no async faults are in transition when @@ -2894,6 +2898,8 @@ int kvm_set_routing_entry(struct kvm *kvm, switch (ue->type) { /* we store the userspace addresses instead of the guest addresses */ case KVM_IRQ_ROUTING_S390_ADAPTER: + if (kvm_is_ucontrol(kvm)) + return -EINVAL; e->set = set_adapter_int; uaddr = gmap_translate(kvm->arch.gmap, ue->u.adapter.summary_addr); if (uaddr == -EFAULT) diff --git a/arch/s390/kvm/vsie.c b/arch/s390/kvm/vsie.c index 150b9387860a..a687695d8f68 100644 --- a/arch/s390/kvm/vsie.c +++ b/arch/s390/kvm/vsie.c @@ -854,7 +854,7 @@ unpin: static void unpin_scb(struct kvm_vcpu *vcpu, struct vsie_page *vsie_page, gpa_t gpa) { - hpa_t hpa = (hpa_t) vsie_page->scb_o; + hpa_t hpa = virt_to_phys(vsie_page->scb_o); if (hpa) unpin_guest_page(vcpu->kvm, gpa, hpa); diff --git a/arch/sh/drivers/push-switch.c b/arch/sh/drivers/push-switch.c index 1dea43381b5a..2b51ad9d5586 100644 --- a/arch/sh/drivers/push-switch.c +++ b/arch/sh/drivers/push-switch.c @@ -110,7 +110,7 @@ static void switch_drv_remove(struct platform_device *pdev) static struct platform_driver switch_driver = { .probe = switch_drv_probe, - .remove_new = switch_drv_remove, + .remove = switch_drv_remove, .driver = { .name = DRV_NAME, }, diff --git a/arch/sparc/include/asm/parport_64.h b/arch/sparc/include/asm/parport_64.h index 4f530a270760..3068809ef9ad 100644 --- a/arch/sparc/include/asm/parport_64.h +++ b/arch/sparc/include/asm/parport_64.h @@ -243,7 +243,7 @@ static struct platform_driver ecpp_driver = { .of_match_table = ecpp_match, }, .probe = ecpp_probe, - .remove_new = ecpp_remove, + .remove 
= ecpp_remove, }; static int parport_pc_find_nonpci_ports(int autoirq, int autodma) diff --git a/arch/sparc/kernel/chmc.c b/arch/sparc/kernel/chmc.c index e02074062001..d4c74d6b2e1b 100644 --- a/arch/sparc/kernel/chmc.c +++ b/arch/sparc/kernel/chmc.c @@ -814,7 +814,7 @@ static struct platform_driver us3mc_driver = { .of_match_table = us3mc_match, }, .probe = us3mc_probe, - .remove_new = us3mc_remove, + .remove = us3mc_remove, }; static inline bool us3mc_platform(void) diff --git a/arch/um/drivers/rtc_kern.c b/arch/um/drivers/rtc_kern.c index 3a1582219c4b..134a58f93c85 100644 --- a/arch/um/drivers/rtc_kern.c +++ b/arch/um/drivers/rtc_kern.c @@ -176,7 +176,7 @@ static void uml_rtc_remove(struct platform_device *pdev) static struct platform_driver uml_rtc_driver = { .probe = uml_rtc_probe, - .remove_new = uml_rtc_remove, + .remove = uml_rtc_remove, .driver = { .name = "uml-rtc", }, diff --git a/arch/um/drivers/virtio_uml.c b/arch/um/drivers/virtio_uml.c index cc3be48a9d6e..65df43fa9be5 100644 --- a/arch/um/drivers/virtio_uml.c +++ b/arch/um/drivers/virtio_uml.c @@ -1465,7 +1465,7 @@ static int virtio_uml_resume(struct platform_device *pdev) static struct platform_driver virtio_uml_driver = { .probe = virtio_uml_probe, - .remove_new = virtio_uml_remove, + .remove = virtio_uml_remove, .driver = { .name = "virtio-uml", .of_match_table = virtio_uml_match, diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 9d7bd0ae48c4..ef6cfea9df73 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -83,7 +83,6 @@ config X86 select ARCH_HAS_DMA_OPS if GART_IOMMU || XEN select ARCH_HAS_EARLY_DEBUG if KGDB select ARCH_HAS_ELF_RANDOMIZE - select ARCH_HAS_EXECMEM_ROX if X86_64 select ARCH_HAS_FAST_MULTIPLIER select ARCH_HAS_FORTIFY_SOURCE select ARCH_HAS_GCOV_PROFILE_ALL diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c index bb284aff7bfd..99c590da0ae2 100644 --- a/arch/x86/events/intel/core.c +++ b/arch/x86/events/intel/core.c @@ -429,6 +429,16 @@ static struct 
event_constraint intel_lnc_event_constraints[] = { EVENT_CONSTRAINT_END }; +static struct extra_reg intel_lnc_extra_regs[] __read_mostly = { + INTEL_UEVENT_EXTRA_REG(0x012a, MSR_OFFCORE_RSP_0, 0xfffffffffffull, RSP_0), + INTEL_UEVENT_EXTRA_REG(0x012b, MSR_OFFCORE_RSP_1, 0xfffffffffffull, RSP_1), + INTEL_UEVENT_PEBS_LDLAT_EXTRA_REG(0x01cd), + INTEL_UEVENT_EXTRA_REG(0x02c6, MSR_PEBS_FRONTEND, 0x9, FE), + INTEL_UEVENT_EXTRA_REG(0x03c6, MSR_PEBS_FRONTEND, 0x7fff1f, FE), + INTEL_UEVENT_EXTRA_REG(0x40ad, MSR_PEBS_FRONTEND, 0xf, FE), + INTEL_UEVENT_EXTRA_REG(0x04c2, MSR_PEBS_FRONTEND, 0x8, FE), + EVENT_EXTRA_END +}; EVENT_ATTR_STR(mem-loads, mem_ld_nhm, "event=0x0b,umask=0x10,ldlat=3"); EVENT_ATTR_STR(mem-loads, mem_ld_snb, "event=0xcd,umask=0x1,ldlat=3"); @@ -6422,7 +6432,7 @@ static __always_inline void intel_pmu_init_lnc(struct pmu *pmu) intel_pmu_init_glc(pmu); hybrid(pmu, event_constraints) = intel_lnc_event_constraints; hybrid(pmu, pebs_constraints) = intel_lnc_pebs_event_constraints; - hybrid(pmu, extra_regs) = intel_rwc_extra_regs; + hybrid(pmu, extra_regs) = intel_lnc_extra_regs; } static __always_inline void intel_pmu_init_skt(struct pmu *pmu) @@ -7135,6 +7145,7 @@ __init int intel_pmu_init(void) case INTEL_METEORLAKE: case INTEL_METEORLAKE_L: + case INTEL_ARROWLAKE_U: intel_pmu_init_hybrid(hybrid_big_small); x86_pmu.pebs_latency_data = cmt_latency_data; diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c index 8afc4ad3cd16..6ba6549f26fa 100644 --- a/arch/x86/events/intel/ds.c +++ b/arch/x86/events/intel/ds.c @@ -1489,7 +1489,7 @@ void intel_pmu_pebs_enable(struct perf_event *event) * hence we need to drain when changing said * size. 
*/ - intel_pmu_drain_large_pebs(cpuc); + intel_pmu_drain_pebs_buffer(); adaptive_pebs_record_size_update(); wrmsrl(MSR_PEBS_DATA_CFG, pebs_data_cfg); cpuc->active_pebs_data_cfg = pebs_data_cfg; @@ -2517,6 +2517,7 @@ void __init intel_ds_init(void) x86_pmu.large_pebs_flags |= PERF_SAMPLE_TIME; break; + case 6: case 5: x86_pmu.pebs_ept = 1; fallthrough; diff --git a/arch/x86/events/intel/uncore.c b/arch/x86/events/intel/uncore.c index d98fac567684..e7aba7349231 100644 --- a/arch/x86/events/intel/uncore.c +++ b/arch/x86/events/intel/uncore.c @@ -1910,6 +1910,7 @@ static const struct x86_cpu_id intel_uncore_match[] __initconst = { X86_MATCH_VFM(INTEL_ATOM_GRACEMONT, &adl_uncore_init), X86_MATCH_VFM(INTEL_ATOM_CRESTMONT_X, &gnr_uncore_init), X86_MATCH_VFM(INTEL_ATOM_CRESTMONT, &gnr_uncore_init), + X86_MATCH_VFM(INTEL_ATOM_DARKMONT_X, &gnr_uncore_init), {}, }; MODULE_DEVICE_TABLE(x86cpu, intel_uncore_match); diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h index 17b6590748c0..645aa360628d 100644 --- a/arch/x86/include/asm/cpufeatures.h +++ b/arch/x86/include/asm/cpufeatures.h @@ -452,6 +452,7 @@ #define X86_FEATURE_SME_COHERENT (19*32+10) /* AMD hardware-enforced cache coherency */ #define X86_FEATURE_DEBUG_SWAP (19*32+14) /* "debug_swap" AMD SEV-ES full debug state swap support */ #define X86_FEATURE_SVSM (19*32+28) /* "svsm" SVSM present */ +#define X86_FEATURE_HV_INUSE_WR_ALLOWED (19*32+30) /* Allow Write to in-use hypervisor-owned pages */ /* AMD-defined Extended Feature 2 EAX, CPUID level 0x80000021 (EAX), word 20 */ #define X86_FEATURE_NO_NESTED_DATA_BP (20*32+ 0) /* No Nested Data Breakpoints */ diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h index 6f82e75b6149..4b804531b03c 100644 --- a/arch/x86/include/asm/pgtable_types.h +++ b/arch/x86/include/asm/pgtable_types.h @@ -36,10 +36,12 @@ #define _PAGE_BIT_DEVMAP _PAGE_BIT_SOFTW4 #ifdef CONFIG_X86_64 -#define _PAGE_BIT_SAVED_DIRTY 
_PAGE_BIT_SOFTW5 /* Saved Dirty bit */ +#define _PAGE_BIT_SAVED_DIRTY _PAGE_BIT_SOFTW5 /* Saved Dirty bit (leaf) */ +#define _PAGE_BIT_NOPTISHADOW _PAGE_BIT_SOFTW5 /* No PTI shadow (root PGD) */ #else /* Shared with _PAGE_BIT_UFFD_WP which is not supported on 32 bit */ -#define _PAGE_BIT_SAVED_DIRTY _PAGE_BIT_SOFTW2 /* Saved Dirty bit */ +#define _PAGE_BIT_SAVED_DIRTY _PAGE_BIT_SOFTW2 /* Saved Dirty bit (leaf) */ +#define _PAGE_BIT_NOPTISHADOW _PAGE_BIT_SOFTW2 /* No PTI shadow (root PGD) */ #endif /* If _PAGE_BIT_PRESENT is clear, we use these: */ @@ -139,6 +141,8 @@ #define _PAGE_PROTNONE (_AT(pteval_t, 1) << _PAGE_BIT_PROTNONE) +#define _PAGE_NOPTISHADOW (_AT(pteval_t, 1) << _PAGE_BIT_NOPTISHADOW) + /* * Set of bits not changed in pte_modify. The pte's * protection key is treated like _PAGE_RW, for diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h index c0975815980c..20e6009381ed 100644 --- a/arch/x86/include/asm/processor.h +++ b/arch/x86/include/asm/processor.h @@ -230,6 +230,8 @@ static inline unsigned long long l1tf_pfn_limit(void) return BIT_ULL(boot_cpu_data.x86_cache_bits - 1 - PAGE_SHIFT); } +void init_cpu_devs(void); +void get_cpu_vendor(struct cpuinfo_x86 *c); extern void early_cpu_init(void); extern void identify_secondary_cpu(struct cpuinfo_x86 *); extern void print_cpu_info(struct cpuinfo_x86 *); diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h index aec6e2d3aa1d..98bfc097389c 100644 --- a/arch/x86/include/asm/special_insns.h +++ b/arch/x86/include/asm/special_insns.h @@ -217,7 +217,7 @@ fail: #define nop() asm volatile ("nop") -static inline void serialize(void) +static __always_inline void serialize(void) { /* Instruction opcode for SERIALIZE; supported in binutils >= 2.35. 
*/ asm volatile(".byte 0xf, 0x1, 0xe8" ::: "memory"); diff --git a/arch/x86/include/asm/static_call.h b/arch/x86/include/asm/static_call.h index 125c407e2abe..41502bd2afd6 100644 --- a/arch/x86/include/asm/static_call.h +++ b/arch/x86/include/asm/static_call.h @@ -65,4 +65,19 @@ extern bool __static_call_fixup(void *tramp, u8 op, void *dest); +extern void __static_call_update_early(void *tramp, void *func); + +#define static_call_update_early(name, _func) \ +({ \ + typeof(&STATIC_CALL_TRAMP(name)) __F = (_func); \ + if (static_call_initialized) { \ + __static_call_update(&STATIC_CALL_KEY(name), \ + STATIC_CALL_TRAMP_ADDR(name), __F);\ + } else { \ + WRITE_ONCE(STATIC_CALL_KEY(name).func, _func); \ + __static_call_update_early(STATIC_CALL_TRAMP_ADDR(name),\ + __F); \ + } \ +}) + #endif /* _ASM_STATIC_CALL_H */ diff --git a/arch/x86/include/asm/sync_core.h b/arch/x86/include/asm/sync_core.h index ab7382f92aff..96bda43538ee 100644 --- a/arch/x86/include/asm/sync_core.h +++ b/arch/x86/include/asm/sync_core.h @@ -8,7 +8,7 @@ #include #ifdef CONFIG_X86_32 -static inline void iret_to_self(void) +static __always_inline void iret_to_self(void) { asm volatile ( "pushfl\n\t" @@ -19,7 +19,7 @@ static inline void iret_to_self(void) : ASM_CALL_CONSTRAINT : : "memory"); } #else -static inline void iret_to_self(void) +static __always_inline void iret_to_self(void) { unsigned int tmp; @@ -55,7 +55,7 @@ static inline void iret_to_self(void) * Like all of Linux's memory ordering operations, this is a * compiler barrier as well. 
*/ -static inline void sync_core(void) +static __always_inline void sync_core(void) { /* * The SERIALIZE instruction is the most straightforward way to diff --git a/arch/x86/include/asm/xen/hypercall.h b/arch/x86/include/asm/xen/hypercall.h index a2dd24947eb8..97771b9d33af 100644 --- a/arch/x86/include/asm/xen/hypercall.h +++ b/arch/x86/include/asm/xen/hypercall.h @@ -39,9 +39,11 @@ #include #include #include +#include #include +#include #include #include #include @@ -86,11 +88,20 @@ struct xen_dm_op_buf; * there aren't more than 5 arguments...) */ -extern struct { char _entry[32]; } hypercall_page[]; +void xen_hypercall_func(void); +DECLARE_STATIC_CALL(xen_hypercall, xen_hypercall_func); -#define __HYPERCALL "call hypercall_page+%c[offset]" -#define __HYPERCALL_ENTRY(x) \ - [offset] "i" (__HYPERVISOR_##x * sizeof(hypercall_page[0])) +#ifdef MODULE +#define __ADDRESSABLE_xen_hypercall +#else +#define __ADDRESSABLE_xen_hypercall __ADDRESSABLE_ASM_STR(__SCK__xen_hypercall) +#endif + +#define __HYPERCALL \ + __ADDRESSABLE_xen_hypercall \ + "call __SCT__xen_hypercall" + +#define __HYPERCALL_ENTRY(x) "a" (x) #ifdef CONFIG_X86_32 #define __HYPERCALL_RETREG "eax" @@ -148,7 +159,7 @@ extern struct { char _entry[32]; } hypercall_page[]; __HYPERCALL_0ARG(); \ asm volatile (__HYPERCALL \ : __HYPERCALL_0PARAM \ - : __HYPERCALL_ENTRY(name) \ + : __HYPERCALL_ENTRY(__HYPERVISOR_ ## name) \ : __HYPERCALL_CLOBBER0); \ (type)__res; \ }) @@ -159,7 +170,7 @@ extern struct { char _entry[32]; } hypercall_page[]; __HYPERCALL_1ARG(a1); \ asm volatile (__HYPERCALL \ : __HYPERCALL_1PARAM \ - : __HYPERCALL_ENTRY(name) \ + : __HYPERCALL_ENTRY(__HYPERVISOR_ ## name) \ : __HYPERCALL_CLOBBER1); \ (type)__res; \ }) @@ -170,7 +181,7 @@ extern struct { char _entry[32]; } hypercall_page[]; __HYPERCALL_2ARG(a1, a2); \ asm volatile (__HYPERCALL \ : __HYPERCALL_2PARAM \ - : __HYPERCALL_ENTRY(name) \ + : __HYPERCALL_ENTRY(__HYPERVISOR_ ## name) \ : __HYPERCALL_CLOBBER2); \ (type)__res; \ }) @@ -181,7 
+192,7 @@ extern struct { char _entry[32]; } hypercall_page[]; __HYPERCALL_3ARG(a1, a2, a3); \ asm volatile (__HYPERCALL \ : __HYPERCALL_3PARAM \ - : __HYPERCALL_ENTRY(name) \ + : __HYPERCALL_ENTRY(__HYPERVISOR_ ## name) \ : __HYPERCALL_CLOBBER3); \ (type)__res; \ }) @@ -192,7 +203,7 @@ extern struct { char _entry[32]; } hypercall_page[]; __HYPERCALL_4ARG(a1, a2, a3, a4); \ asm volatile (__HYPERCALL \ : __HYPERCALL_4PARAM \ - : __HYPERCALL_ENTRY(name) \ + : __HYPERCALL_ENTRY(__HYPERVISOR_ ## name) \ : __HYPERCALL_CLOBBER4); \ (type)__res; \ }) @@ -206,12 +217,9 @@ xen_single_call(unsigned int call, __HYPERCALL_DECLS; __HYPERCALL_5ARG(a1, a2, a3, a4, a5); - if (call >= PAGE_SIZE / sizeof(hypercall_page[0])) - return -EINVAL; - - asm volatile(CALL_NOSPEC + asm volatile(__HYPERCALL : __HYPERCALL_5PARAM - : [thunk_target] "a" (&hypercall_page[call]) + : __HYPERCALL_ENTRY(call) : __HYPERCALL_CLOBBER5); return (long)__res; diff --git a/arch/x86/kernel/callthunks.c b/arch/x86/kernel/callthunks.c index 465647456753..f17d16607882 100644 --- a/arch/x86/kernel/callthunks.c +++ b/arch/x86/kernel/callthunks.c @@ -142,11 +142,6 @@ static bool skip_addr(void *dest) if (dest >= (void *)relocate_kernel && dest < (void*)relocate_kernel + KEXEC_CONTROL_CODE_MAX_SIZE) return true; -#endif -#ifdef CONFIG_XEN - if (dest >= (void *)hypercall_page && - dest < (void*)hypercall_page + PAGE_SIZE) - return true; #endif return false; } diff --git a/arch/x86/kernel/cet.c b/arch/x86/kernel/cet.c index d2c732a34e5d..303bf74d175b 100644 --- a/arch/x86/kernel/cet.c +++ b/arch/x86/kernel/cet.c @@ -81,6 +81,34 @@ static void do_user_cp_fault(struct pt_regs *regs, unsigned long error_code) static __ro_after_init bool ibt_fatal = true; +/* + * By definition, all missing-ENDBRANCH #CPs are a result of WFE && !ENDBR. + * + * For the kernel IBT no ENDBR selftest where #CPs are deliberately triggered, + * the WFE state of the interrupted context needs to be cleared to let execution + * continue. 
Otherwise when the CPU resumes from the instruction that just + * caused the previous #CP, another missing-ENDBRANCH #CP is raised and the CPU + * enters a dead loop. + * + * This is not a problem with IDT because it doesn't preserve WFE and IRET doesn't + * set WFE. But FRED provides space on the entry stack (in an expanded CS area) + * to save and restore the WFE state, thus the WFE state is no longer clobbered, + * so software must clear it. + */ +static void ibt_clear_fred_wfe(struct pt_regs *regs) +{ + /* + * No need to do any FRED checks. + * + * For IDT event delivery, the high-order 48 bits of CS are pushed + * as 0s into the stack, and later IRET ignores these bits. + * + * For FRED, a test to check if fred_cs.wfe is set would be dropped + * by compilers. + */ + regs->fred_cs.wfe = 0; +} + static void do_kernel_cp_fault(struct pt_regs *regs, unsigned long error_code) { if ((error_code & CP_EC) != CP_ENDBR) { @@ -90,6 +118,7 @@ static void do_kernel_cp_fault(struct pt_regs *regs, unsigned long error_code) if (unlikely(regs->ip == (unsigned long)&ibt_selftest_noendbr)) { regs->ax = 0; + ibt_clear_fred_wfe(regs); return; } @@ -97,6 +126,7 @@ static void do_kernel_cp_fault(struct pt_regs *regs, unsigned long error_code) if (!ibt_fatal) { printk(KERN_DEFAULT CUT_HERE); __warn(__FILE__, __LINE__, (void *)regs->ip, TAINT_WARN, regs, NULL); + ibt_clear_fred_wfe(regs); return; } BUG(); diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c index d8408aafeed9..79d2e17f6582 100644 --- a/arch/x86/kernel/cpu/amd.c +++ b/arch/x86/kernel/cpu/amd.c @@ -1065,7 +1065,7 @@ static void init_amd(struct cpuinfo_x86 *c) */ if (spectre_v2_in_eibrs_mode(spectre_v2_enabled) && cpu_has(c, X86_FEATURE_AUTOIBRS)) - WARN_ON_ONCE(msr_set_bit(MSR_EFER, _EFER_AUTOIBRS)); + WARN_ON_ONCE(msr_set_bit(MSR_EFER, _EFER_AUTOIBRS) < 0); /* AMD CPUs don't need fencing after x2APIC/TSC_DEADLINE MSR writes. 
*/ clear_cpu_cap(c, X86_FEATURE_APIC_MSRS_FENCE); diff --git a/arch/x86/kernel/cpu/cacheinfo.c b/arch/x86/kernel/cpu/cacheinfo.c index 392d09c936d6..e6fa03ed9172 100644 --- a/arch/x86/kernel/cpu/cacheinfo.c +++ b/arch/x86/kernel/cpu/cacheinfo.c @@ -178,8 +178,6 @@ struct _cpuid4_info_regs { struct amd_northbridge *nb; }; -static unsigned short num_cache_leaves; - /* AMD doesn't have CPUID4. Emulate it here to report the same information to the user. This makes some assumptions about the machine: L2 not shared, no SMT etc. that is currently true on AMD CPUs. @@ -717,20 +715,23 @@ void cacheinfo_hygon_init_llc_id(struct cpuinfo_x86 *c) void init_amd_cacheinfo(struct cpuinfo_x86 *c) { + struct cpu_cacheinfo *ci = get_cpu_cacheinfo(c->cpu_index); if (boot_cpu_has(X86_FEATURE_TOPOEXT)) { - num_cache_leaves = find_num_cache_leaves(c); + ci->num_leaves = find_num_cache_leaves(c); } else if (c->extended_cpuid_level >= 0x80000006) { if (cpuid_edx(0x80000006) & 0xf000) - num_cache_leaves = 4; + ci->num_leaves = 4; else - num_cache_leaves = 3; + ci->num_leaves = 3; } } void init_hygon_cacheinfo(struct cpuinfo_x86 *c) { - num_cache_leaves = find_num_cache_leaves(c); + struct cpu_cacheinfo *ci = get_cpu_cacheinfo(c->cpu_index); + + ci->num_leaves = find_num_cache_leaves(c); } void init_intel_cacheinfo(struct cpuinfo_x86 *c) @@ -740,21 +741,21 @@ void init_intel_cacheinfo(struct cpuinfo_x86 *c) unsigned int new_l1d = 0, new_l1i = 0; /* Cache sizes from cpuid(4) */ unsigned int new_l2 = 0, new_l3 = 0, i; /* Cache sizes from cpuid(4) */ unsigned int l2_id = 0, l3_id = 0, num_threads_sharing, index_msb; + struct cpu_cacheinfo *ci = get_cpu_cacheinfo(c->cpu_index); if (c->cpuid_level > 3) { - static int is_initialized; - - if (is_initialized == 0) { - /* Init num_cache_leaves from boot CPU */ - num_cache_leaves = find_num_cache_leaves(c); - is_initialized++; - } + /* + * There should be at least one leaf. A non-zero value means + * that the number of leaves has been initialized. 
+ */ + if (!ci->num_leaves) + ci->num_leaves = find_num_cache_leaves(c); /* * Whenever possible use cpuid(4), deterministic cache * parameters cpuid leaf to find the cache details */ - for (i = 0; i < num_cache_leaves; i++) { + for (i = 0; i < ci->num_leaves; i++) { struct _cpuid4_info_regs this_leaf = {}; int retval; @@ -790,14 +791,14 @@ void init_intel_cacheinfo(struct cpuinfo_x86 *c) * Don't use cpuid2 if cpuid4 is supported. For P4, we use cpuid2 for * trace cache */ - if ((num_cache_leaves == 0 || c->x86 == 15) && c->cpuid_level > 1) { + if ((!ci->num_leaves || c->x86 == 15) && c->cpuid_level > 1) { /* supports eax=2 call */ int j, n; unsigned int regs[4]; unsigned char *dp = (unsigned char *)regs; int only_trace = 0; - if (num_cache_leaves != 0 && c->x86 == 15) + if (ci->num_leaves && c->x86 == 15) only_trace = 1; /* Number of times to iterate */ @@ -991,14 +992,12 @@ static void ci_leaf_init(struct cacheinfo *this_leaf, int init_cache_level(unsigned int cpu) { - struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu); + struct cpu_cacheinfo *ci = get_cpu_cacheinfo(cpu); - if (!num_cache_leaves) + /* There should be at least one leaf. 
*/ + if (!ci->num_leaves) return -ENOENT; - if (!this_cpu_ci) - return -EINVAL; - this_cpu_ci->num_levels = 3; - this_cpu_ci->num_leaves = num_cache_leaves; + return 0; } diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index a5c28975c608..3e9037690814 100644 --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -867,7 +867,7 @@ static void cpu_detect_tlb(struct cpuinfo_x86 *c) tlb_lld_4m[ENTRIES], tlb_lld_1g[ENTRIES]); } -static void get_cpu_vendor(struct cpuinfo_x86 *c) +void get_cpu_vendor(struct cpuinfo_x86 *c) { char *v = c->x86_vendor_id; int i; @@ -1649,15 +1649,11 @@ static void __init early_identify_cpu(struct cpuinfo_x86 *c) detect_nopl(); } -void __init early_cpu_init(void) +void __init init_cpu_devs(void) { const struct cpu_dev *const *cdev; int count = 0; -#ifdef CONFIG_PROCESSOR_SELECT - pr_info("KERNEL supported cpus:\n"); -#endif - for (cdev = __x86_cpu_dev_start; cdev < __x86_cpu_dev_end; cdev++) { const struct cpu_dev *cpudev = *cdev; @@ -1665,20 +1661,30 @@ void __init early_cpu_init(void) break; cpu_devs[count] = cpudev; count++; + } +} + +void __init early_cpu_init(void) +{ +#ifdef CONFIG_PROCESSOR_SELECT + unsigned int i, j; + + pr_info("KERNEL supported cpus:\n"); +#endif + + init_cpu_devs(); #ifdef CONFIG_PROCESSOR_SELECT - { - unsigned int j; - - for (j = 0; j < 2; j++) { - if (!cpudev->c_ident[j]) - continue; - pr_info(" %s %s\n", cpudev->c_vendor, - cpudev->c_ident[j]); - } + for (i = 0; i < X86_VENDOR_NUM && cpu_devs[i]; i++) { + for (j = 0; j < 2; j++) { + if (!cpu_devs[i]->c_ident[j]) + continue; + pr_info(" %s %s\n", cpu_devs[i]->c_vendor, + cpu_devs[i]->c_ident[j]); } -#endif } +#endif + early_identify_cpu(&boot_cpu_data); } diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c index d1de300af173..8ded9f859a3a 100644 --- a/arch/x86/kernel/cpu/intel.c +++ b/arch/x86/kernel/cpu/intel.c @@ -555,7 +555,9 @@ static void init_intel(struct cpuinfo_x86 *c) c->x86_vfm == 
INTEL_WESTMERE_EX)) set_cpu_bug(c, X86_BUG_CLFLUSH_MONITOR); - if (boot_cpu_has(X86_FEATURE_MWAIT) && c->x86_vfm == INTEL_ATOM_GOLDMONT) + if (boot_cpu_has(X86_FEATURE_MWAIT) && + (c->x86_vfm == INTEL_ATOM_GOLDMONT || + c->x86_vfm == INTEL_LUNARLAKE_M)) set_cpu_bug(c, X86_BUG_MONITOR); #ifdef CONFIG_X86_64 diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c index d18078834ded..dc12fe5ef3ca 100644 --- a/arch/x86/kernel/cpu/mshyperv.c +++ b/arch/x86/kernel/cpu/mshyperv.c @@ -223,6 +223,63 @@ static void hv_machine_crash_shutdown(struct pt_regs *regs) hyperv_cleanup(); } #endif /* CONFIG_CRASH_DUMP */ + +static u64 hv_ref_counter_at_suspend; +static void (*old_save_sched_clock_state)(void); +static void (*old_restore_sched_clock_state)(void); + +/* + * Hyper-V clock counter resets during hibernation. Save and restore clock + * offset during suspend/resume, while also considering the time passed + * before suspend. This is to make sure that sched_clock using hv tsc page + * based clocksource, proceeds from where it left off during suspend and + * it shows correct time for the timestamps of kernel messages after resume. + */ +static void save_hv_clock_tsc_state(void) +{ + hv_ref_counter_at_suspend = hv_read_reference_counter(); +} + +static void restore_hv_clock_tsc_state(void) +{ + /* + * Adjust the offsets used by hv tsc clocksource to + * account for the time spent before hibernation. + * adjusted value = reference counter (time) at suspend + * - reference counter (time) now. + */ + hv_adj_sched_clock_offset(hv_ref_counter_at_suspend - hv_read_reference_counter()); +} + +/* + * Functions to override save_sched_clock_state and restore_sched_clock_state + * functions of x86_platform. The Hyper-V clock counter is reset during + * suspend-resume and the offset used to measure time needs to be + * corrected, post resume. 
+ */ +static void hv_save_sched_clock_state(void) +{ + old_save_sched_clock_state(); + save_hv_clock_tsc_state(); +} + +static void hv_restore_sched_clock_state(void) +{ + restore_hv_clock_tsc_state(); + old_restore_sched_clock_state(); +} + +static void __init x86_setup_ops_for_tsc_pg_clock(void) +{ + if (!(ms_hyperv.features & HV_MSR_REFERENCE_TSC_AVAILABLE)) + return; + + old_save_sched_clock_state = x86_platform.save_sched_clock_state; + x86_platform.save_sched_clock_state = hv_save_sched_clock_state; + + old_restore_sched_clock_state = x86_platform.restore_sched_clock_state; + x86_platform.restore_sched_clock_state = hv_restore_sched_clock_state; +} #endif /* CONFIG_HYPERV */ static uint32_t __init ms_hyperv_platform(void) @@ -579,6 +636,7 @@ static void __init ms_hyperv_init_platform(void) /* Register Hyper-V specific clocksource */ hv_init_clocksource(); + x86_setup_ops_for_tsc_pg_clock(); hv_vtl_init_platform(); #endif /* diff --git a/arch/x86/kernel/cpu/topology.c b/arch/x86/kernel/cpu/topology.c index 621a151ccf7d..b2e313ea17bf 100644 --- a/arch/x86/kernel/cpu/topology.c +++ b/arch/x86/kernel/cpu/topology.c @@ -428,8 +428,8 @@ void __init topology_apply_cmdline_limits_early(void) { unsigned int possible = nr_cpu_ids; - /* 'maxcpus=0' 'nosmp' 'nolapic' 'disableapic' 'noapic' */ - if (!setup_max_cpus || ioapic_is_disabled || apic_is_disabled) + /* 'maxcpus=0' 'nosmp' 'nolapic' 'disableapic' */ + if (!setup_max_cpus || apic_is_disabled) possible = 1; /* 'possible_cpus=N' */ @@ -443,7 +443,7 @@ void __init topology_apply_cmdline_limits_early(void) static __init bool restrict_to_up(void) { - if (!smp_found_config || ioapic_is_disabled) + if (!smp_found_config) return true; /* * XEN PV is special as it does not advertise the local APIC diff --git a/arch/x86/kernel/fpu/regset.c b/arch/x86/kernel/fpu/regset.c index 6bc1eb2a21bd..887b0b8e21e3 100644 --- a/arch/x86/kernel/fpu/regset.c +++ b/arch/x86/kernel/fpu/regset.c @@ -190,7 +190,8 @@ int ssp_get(struct 
task_struct *target, const struct user_regset *regset, struct fpu *fpu = &target->thread.fpu; struct cet_user_state *cetregs; - if (!cpu_feature_enabled(X86_FEATURE_USER_SHSTK)) + if (!cpu_feature_enabled(X86_FEATURE_USER_SHSTK) || + !ssp_active(target, regset)) return -ENODEV; sync_fpstate(fpu); diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c index 1065ab995305..8f62e0666dea 100644 --- a/arch/x86/kernel/fpu/signal.c +++ b/arch/x86/kernel/fpu/signal.c @@ -63,16 +63,6 @@ setfx: return true; } -/* - * Update the value of PKRU register that was already pushed onto the signal frame. - */ -static inline int update_pkru_in_sigframe(struct xregs_state __user *buf, u32 pkru) -{ - if (unlikely(!cpu_feature_enabled(X86_FEATURE_OSPKE))) - return 0; - return __put_user(pkru, (unsigned int __user *)get_xsave_addr_user(buf, XFEATURE_PKRU)); -} - /* * Signal frame handlers. */ @@ -168,14 +158,8 @@ static inline bool save_xstate_epilog(void __user *buf, int ia32_frame, static inline int copy_fpregs_to_sigframe(struct xregs_state __user *buf, u32 pkru) { - int err = 0; - - if (use_xsave()) { - err = xsave_to_user_sigframe(buf); - if (!err) - err = update_pkru_in_sigframe(buf, pkru); - return err; - } + if (use_xsave()) + return xsave_to_user_sigframe(buf, pkru); if (use_fxsr()) return fxsave_to_user_sigframe((struct fxregs_state __user *) buf); diff --git a/arch/x86/kernel/fpu/xstate.h b/arch/x86/kernel/fpu/xstate.h index 0b86a5002c84..aa16f1a1bbcf 100644 --- a/arch/x86/kernel/fpu/xstate.h +++ b/arch/x86/kernel/fpu/xstate.h @@ -69,6 +69,28 @@ static inline u64 xfeatures_mask_independent(void) return fpu_kernel_cfg.independent_features; } +/* + * Update the value of PKRU register that was already pushed onto the signal frame. 
+ */ +static inline int update_pkru_in_sigframe(struct xregs_state __user *buf, u64 mask, u32 pkru) +{ + u64 xstate_bv; + int err; + + if (unlikely(!cpu_feature_enabled(X86_FEATURE_OSPKE))) + return 0; + + /* Mark PKRU as in-use so that it is restored correctly. */ + xstate_bv = (mask & xfeatures_in_use()) | XFEATURE_MASK_PKRU; + + err = __put_user(xstate_bv, &buf->header.xfeatures); + if (err) + return err; + + /* Update PKRU value in the userspace xsave buffer. */ + return __put_user(pkru, (unsigned int __user *)get_xsave_addr_user(buf, XFEATURE_PKRU)); +} + /* XSAVE/XRSTOR wrapper functions */ #ifdef CONFIG_X86_64 @@ -256,7 +278,7 @@ static inline u64 xfeatures_need_sigframe_write(void) * The caller has to zero buf::header before calling this because XSAVE* * does not touch the reserved fields in the header. */ -static inline int xsave_to_user_sigframe(struct xregs_state __user *buf) +static inline int xsave_to_user_sigframe(struct xregs_state __user *buf, u32 pkru) { /* * Include the features which are not xsaved/rstored by the kernel @@ -281,6 +303,9 @@ static inline int xsave_to_user_sigframe(struct xregs_state __user *buf) XSTATE_OP(XSAVE, buf, lmask, hmask, err); clac(); + if (!err) + err = update_pkru_in_sigframe(buf, mask, pkru); + return err; } diff --git a/arch/x86/kernel/fred.c b/arch/x86/kernel/fred.c index 8d32c3f48abc..5e2cd1004980 100644 --- a/arch/x86/kernel/fred.c +++ b/arch/x86/kernel/fred.c @@ -50,7 +50,13 @@ void cpu_init_fred_exceptions(void) FRED_CONFIG_ENTRYPOINT(asm_fred_entrypoint_user)); wrmsrl(MSR_IA32_FRED_STKLVLS, 0); - wrmsrl(MSR_IA32_FRED_RSP0, 0); + + /* + * After a CPU offline/online cycle, the FRED RSP0 MSR should be + * resynchronized with its per-CPU cache.
+ */ + wrmsrl(MSR_IA32_FRED_RSP0, __this_cpu_read(fred_rsp0)); + wrmsrl(MSR_IA32_FRED_RSP1, 0); wrmsrl(MSR_IA32_FRED_RSP2, 0); wrmsrl(MSR_IA32_FRED_RSP3, 0); diff --git a/arch/x86/kernel/relocate_kernel_64.S b/arch/x86/kernel/relocate_kernel_64.S index e9e88c342f75..540443d699e3 100644 --- a/arch/x86/kernel/relocate_kernel_64.S +++ b/arch/x86/kernel/relocate_kernel_64.S @@ -13,6 +13,7 @@ #include #include #include +#include /* * Must be relocatable PIC code callable as a C function, in particular @@ -242,6 +243,13 @@ SYM_CODE_START_LOCAL_NOALIGN(virtual_mapped) movq CR0(%r8), %r8 movq %rax, %cr3 movq %r8, %cr0 + +#ifdef CONFIG_KEXEC_JUMP + /* Saved in save_processor_state. */ + movq $saved_context, %rax + lgdt saved_context_gdt_desc(%rax) +#endif + movq %rbp, %rax popf diff --git a/arch/x86/kernel/static_call.c b/arch/x86/kernel/static_call.c index 4eefaac64c6c..9e51242ed125 100644 --- a/arch/x86/kernel/static_call.c +++ b/arch/x86/kernel/static_call.c @@ -172,6 +172,14 @@ void arch_static_call_transform(void *site, void *tramp, void *func, bool tail) } EXPORT_SYMBOL_GPL(arch_static_call_transform); +noinstr void __static_call_update_early(void *tramp, void *func) +{ + BUG_ON(system_state != SYSTEM_BOOTING); + BUG_ON(static_call_initialized); + __text_gen_insn(tramp, JMP32_INSN_OPCODE, tramp, func, JMP32_INSN_SIZE); + sync_core(); +} + #ifdef CONFIG_MITIGATION_RETHUNK /* * This is called by apply_returns() to fix up static call trampolines, diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S index fab3ac9a4574..6a17396c8174 100644 --- a/arch/x86/kernel/vmlinux.lds.S +++ b/arch/x86/kernel/vmlinux.lds.S @@ -519,14 +519,10 @@ INIT_PER_CPU(irq_stack_backing_store); * linker will never mark as relocatable. (Using just ABSOLUTE() is not * sufficient for that). 
*/ -#ifdef CONFIG_XEN #ifdef CONFIG_XEN_PV xen_elfnote_entry_value = ABSOLUTE(xen_elfnote_entry) + ABSOLUTE(startup_xen); #endif -xen_elfnote_hypercall_page_value = - ABSOLUTE(xen_elfnote_hypercall_page) + ABSOLUTE(hypercall_page); -#endif #ifdef CONFIG_PVH xen_elfnote_phys32_entry_value = ABSOLUTE(xen_elfnote_phys32_entry) + ABSOLUTE(pvh_start_xen - LOAD_OFFSET); diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c index 097bdc022d0f..ae0b438a2c99 100644 --- a/arch/x86/kvm/cpuid.c +++ b/arch/x86/kvm/cpuid.c @@ -36,6 +36,26 @@ u32 kvm_cpu_caps[NR_KVM_CPU_CAPS] __read_mostly; EXPORT_SYMBOL_GPL(kvm_cpu_caps); +struct cpuid_xstate_sizes { + u32 eax; + u32 ebx; + u32 ecx; +}; + +static struct cpuid_xstate_sizes xstate_sizes[XFEATURE_MAX] __ro_after_init; + +void __init kvm_init_xstate_sizes(void) +{ + u32 ign; + int i; + + for (i = XFEATURE_YMM; i < ARRAY_SIZE(xstate_sizes); i++) { + struct cpuid_xstate_sizes *xs = &xstate_sizes[i]; + + cpuid_count(0xD, i, &xs->eax, &xs->ebx, &xs->ecx, &ign); + } +} + u32 xstate_required_size(u64 xstate_bv, bool compacted) { int feature_bit = 0; @@ -44,14 +64,15 @@ u32 xstate_required_size(u64 xstate_bv, bool compacted) xstate_bv &= XFEATURE_MASK_EXTEND; while (xstate_bv) { if (xstate_bv & 0x1) { - u32 eax, ebx, ecx, edx, offset; - cpuid_count(0xD, feature_bit, &eax, &ebx, &ecx, &edx); + struct cpuid_xstate_sizes *xs = &xstate_sizes[feature_bit]; + u32 offset; + /* ECX[1]: 64B alignment in compacted form */ if (compacted) - offset = (ecx & 0x2) ? ALIGN(ret, 64) : ret; + offset = (xs->ecx & 0x2) ? 
ALIGN(ret, 64) : ret; else - offset = ebx; - ret = max(ret, offset + eax); + offset = xs->ebx; + ret = max(ret, offset + xs->eax); } xstate_bv >>= 1; diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h index c8dc66eddefd..f16a7b2c2adc 100644 --- a/arch/x86/kvm/cpuid.h +++ b/arch/x86/kvm/cpuid.h @@ -31,6 +31,7 @@ int kvm_vcpu_ioctl_get_cpuid2(struct kvm_vcpu *vcpu, bool kvm_cpuid(struct kvm_vcpu *vcpu, u32 *eax, u32 *ebx, u32 *ecx, u32 *edx, bool exact_only); +void __init kvm_init_xstate_sizes(void); u32 xstate_required_size(u64 xstate_bv, bool compacted); int cpuid_query_maxphyaddr(struct kvm_vcpu *vcpu); diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 22e7ad235123..2401606db260 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -3364,18 +3364,6 @@ static bool fast_pf_fix_direct_spte(struct kvm_vcpu *vcpu, return true; } -static bool is_access_allowed(struct kvm_page_fault *fault, u64 spte) -{ - if (fault->exec) - return is_executable_pte(spte); - - if (fault->write) - return is_writable_pte(spte); - - /* Fault was on Read access */ - return spte & PT_PRESENT_MASK; -} - /* * Returns the last level spte pointer of the shadow page walk for the given * gpa, and sets *spte to the spte value. This spte may be non-preset. If no diff --git a/arch/x86/kvm/mmu/spte.h b/arch/x86/kvm/mmu/spte.h index f332b33bc817..af10bc0380a3 100644 --- a/arch/x86/kvm/mmu/spte.h +++ b/arch/x86/kvm/mmu/spte.h @@ -461,6 +461,23 @@ static inline bool is_mmu_writable_spte(u64 spte) return spte & shadow_mmu_writable_mask; } +/* + * Returns true if the access indicated by @fault is allowed by the existing + * SPTE protections. Note, the caller is responsible for checking that the + * SPTE is a shadow-present, leaf SPTE (either before or after). 
+ */ +static inline bool is_access_allowed(struct kvm_page_fault *fault, u64 spte) +{ + if (fault->exec) + return is_executable_pte(spte); + + if (fault->write) + return is_writable_pte(spte); + + /* Fault was on Read access */ + return spte & PT_PRESENT_MASK; +} + /* * If the MMU-writable flag is cleared, i.e. the SPTE is write-protected for * write-tracking, remote TLBs must be flushed, even if the SPTE was read-only, diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c index 4508d868f1cd..2f15e0e33903 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.c +++ b/arch/x86/kvm/mmu/tdp_mmu.c @@ -985,6 +985,11 @@ static int tdp_mmu_map_handle_target_level(struct kvm_vcpu *vcpu, if (fault->prefetch && is_shadow_present_pte(iter->old_spte)) return RET_PF_SPURIOUS; + if (is_shadow_present_pte(iter->old_spte) && + is_access_allowed(fault, iter->old_spte) && + is_last_spte(iter->old_spte, iter->level)) + return RET_PF_SPURIOUS; + if (unlikely(!fault->slot)) new_spte = make_mmio_spte(vcpu, iter->gfn, ACC_ALL); else diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c index 4b74ea91f4e6..65fd245a9953 100644 --- a/arch/x86/kvm/svm/avic.c +++ b/arch/x86/kvm/svm/avic.c @@ -1199,6 +1199,12 @@ bool avic_hardware_setup(void) return false; } + if (cc_platform_has(CC_ATTR_HOST_SEV_SNP) && + !boot_cpu_has(X86_FEATURE_HV_INUSE_WR_ALLOWED)) { + pr_warn("AVIC disabled: missing HvInUseWrAllowed on SNP-enabled system\n"); + return false; + } + if (boot_cpu_has(X86_FEATURE_AVIC)) { pr_info("AVIC enabled\n"); } else if (force_avic) { diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index dd15cc635655..21dacd312779 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -3201,15 +3201,6 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr) if (data & ~supported_de_cfg) return 1; - /* - * Don't let the guest change the host-programmed value. The - * MSR is very model specific, i.e. 
contains multiple bits that - * are completely unknown to KVM, and the one bit known to KVM - * is simply a reflection of hardware capabilities. - */ - if (!msr->host_initiated && data != svm->msr_decfg) - return 1; - svm->msr_decfg = data; break; } diff --git a/arch/x86/kvm/vmx/posted_intr.h b/arch/x86/kvm/vmx/posted_intr.h index 1715d2ab07be..ad9116a99bcc 100644 --- a/arch/x86/kvm/vmx/posted_intr.h +++ b/arch/x86/kvm/vmx/posted_intr.h @@ -2,7 +2,7 @@ #ifndef __KVM_X86_VMX_POSTED_INTR_H #define __KVM_X86_VMX_POSTED_INTR_H -#include +#include #include void vmx_vcpu_pi_load(struct kvm_vcpu *vcpu, int cpu); diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 2e713480933a..c79a8cc57ba4 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -9976,7 +9976,7 @@ static int complete_hypercall_exit(struct kvm_vcpu *vcpu) { u64 ret = vcpu->run->hypercall.ret; - if (!is_64_bit_mode(vcpu)) + if (!is_64_bit_hypercall(vcpu)) ret = (u32)ret; kvm_rax_write(vcpu, ret); ++vcpu->stat.hypercalls; @@ -12724,6 +12724,13 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type) kvm_hv_init_vm(kvm); kvm_xen_init_vm(kvm); + if (ignore_msrs && !report_ignored_msrs) { + pr_warn_once("Running KVM with ignore_msrs=1 and report_ignored_msrs=0 is not a\n" + "a supported configuration. Lying to the guest about the existence of MSRs\n" + "may cause the guest operating system to hang or produce errors. 
If a guest\n" + "does not run without ignore_msrs=1, please report it to kvm@vger.kernel.org.\n"); + } + return 0; out_uninit_mmu: @@ -13997,6 +14004,8 @@ EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_rmp_fault); static int __init kvm_x86_init(void) { + kvm_init_xstate_sizes(); + kvm_mmu_x86_module_init(); mitigate_smt_rsb &= boot_cpu_has_bug(X86_BUG_SMT_RSB) && cpu_smt_possible(); return 0; diff --git a/arch/x86/mm/ident_map.c b/arch/x86/mm/ident_map.c index 437e96fb4977..5ab7bd2f1983 100644 --- a/arch/x86/mm/ident_map.c +++ b/arch/x86/mm/ident_map.c @@ -174,7 +174,7 @@ static int ident_p4d_init(struct x86_mapping_info *info, p4d_t *p4d_page, if (result) return result; - set_p4d(p4d, __p4d(__pa(pud) | info->kernpg_flag)); + set_p4d(p4d, __p4d(__pa(pud) | info->kernpg_flag | _PAGE_NOPTISHADOW)); } return 0; @@ -218,14 +218,14 @@ int kernel_ident_mapping_init(struct x86_mapping_info *info, pgd_t *pgd_page, if (result) return result; if (pgtable_l5_enabled()) { - set_pgd(pgd, __pgd(__pa(p4d) | info->kernpg_flag)); + set_pgd(pgd, __pgd(__pa(p4d) | info->kernpg_flag | _PAGE_NOPTISHADOW)); } else { /* * With p4d folded, pgd is equal to p4d. * The pgd entry has to point to the pud page table in this case. 
*/ pud_t *pud = pud_offset(p4d, 0); - set_pgd(pgd, __pgd(__pa(pud) | info->kernpg_flag)); + set_pgd(pgd, __pgd(__pa(pud) | info->kernpg_flag | _PAGE_NOPTISHADOW)); } } diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c index c6d29f283001..62aa4d66a032 100644 --- a/arch/x86/mm/init.c +++ b/arch/x86/mm/init.c @@ -1080,7 +1080,8 @@ struct execmem_info __init *execmem_arch_setup(void) start = MODULES_VADDR + offset; - if (IS_ENABLED(CONFIG_ARCH_HAS_EXECMEM_ROX)) { + if (IS_ENABLED(CONFIG_ARCH_HAS_EXECMEM_ROX) && + cpu_feature_enabled(X86_FEATURE_PSE)) { pgprot = PAGE_KERNEL_ROX; flags = EXECMEM_KASAN_SHADOW | EXECMEM_ROX_CACHE; } else { diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c index 069e421c2247..95bc50a8541c 100644 --- a/arch/x86/mm/pat/set_memory.c +++ b/arch/x86/mm/pat/set_memory.c @@ -354,7 +354,7 @@ bool cpu_cache_has_invalidate_memregion(void) { return !cpu_feature_enabled(X86_FEATURE_HYPERVISOR); } -EXPORT_SYMBOL_NS_GPL(cpu_cache_has_invalidate_memregion, DEVMEM); +EXPORT_SYMBOL_NS_GPL(cpu_cache_has_invalidate_memregion, "DEVMEM"); int cpu_cache_invalidate_memregion(int res_desc) { @@ -363,7 +363,7 @@ int cpu_cache_invalidate_memregion(int res_desc) wbinvd_on_all_cpus(); return 0; } -EXPORT_SYMBOL_NS_GPL(cpu_cache_invalidate_memregion, DEVMEM); +EXPORT_SYMBOL_NS_GPL(cpu_cache_invalidate_memregion, "DEVMEM"); #endif static void __cpa_flush_all(void *arg) diff --git a/arch/x86/mm/pti.c b/arch/x86/mm/pti.c index 851ec8f1363a..5f0d579932c6 100644 --- a/arch/x86/mm/pti.c +++ b/arch/x86/mm/pti.c @@ -132,7 +132,7 @@ pgd_t __pti_set_user_pgtbl(pgd_t *pgdp, pgd_t pgd) * Top-level entries added to init_mm's usermode pgd after boot * will not be automatically propagated to other mms. 
*/ - if (!pgdp_maps_userspace(pgdp)) + if (!pgdp_maps_userspace(pgdp) || (pgd.pgd & _PAGE_NOPTISHADOW)) return pgd; /* diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c index 84e5adbd0925..43dcd8c7badc 100644 --- a/arch/x86/xen/enlighten.c +++ b/arch/x86/xen/enlighten.c @@ -2,6 +2,7 @@ #include #include +#include #include #include #include @@ -21,7 +22,8 @@ #include "xen-ops.h" -EXPORT_SYMBOL_GPL(hypercall_page); +DEFINE_STATIC_CALL(xen_hypercall, xen_hypercall_hvm); +EXPORT_STATIC_CALL_TRAMP(xen_hypercall); /* * Pointer to the xen_vcpu_info structure or @@ -68,6 +70,67 @@ EXPORT_SYMBOL(xen_start_flags); */ struct shared_info *HYPERVISOR_shared_info = &xen_dummy_shared_info; +static __ref void xen_get_vendor(void) +{ + init_cpu_devs(); + cpu_detect(&boot_cpu_data); + get_cpu_vendor(&boot_cpu_data); +} + +void xen_hypercall_setfunc(void) +{ + if (static_call_query(xen_hypercall) != xen_hypercall_hvm) + return; + + if ((boot_cpu_data.x86_vendor == X86_VENDOR_AMD || + boot_cpu_data.x86_vendor == X86_VENDOR_HYGON)) + static_call_update(xen_hypercall, xen_hypercall_amd); + else + static_call_update(xen_hypercall, xen_hypercall_intel); +} + +/* + * Evaluate processor vendor in order to select the correct hypercall + * function for HVM/PVH guests. + * Might be called very early in boot before vendor has been set by + * early_cpu_init(). + */ +noinstr void *__xen_hypercall_setfunc(void) +{ + void (*func)(void); + + /* + * Xen is supported only on CPUs with CPUID, so testing for + * X86_FEATURE_CPUID is a test for early_cpu_init() having been + * run. + * + * Note that __xen_hypercall_setfunc() is noinstr only due to a nasty + * dependency chain: it is being called via the xen_hypercall static + * call when running as a PVH or HVM guest. Hypercalls need to be + * noinstr due to PV guests using hypercalls in noinstr code. 
So we + * can safely tag the function body as "instrumentation ok", since + * the PV guest requirement is not of interest here (xen_get_vendor() + * calls noinstr functions, and static_call_update_early() might do + * so, too). + */ + instrumentation_begin(); + + if (!boot_cpu_has(X86_FEATURE_CPUID)) + xen_get_vendor(); + + if ((boot_cpu_data.x86_vendor == X86_VENDOR_AMD || + boot_cpu_data.x86_vendor == X86_VENDOR_HYGON)) + func = xen_hypercall_amd; + else + func = xen_hypercall_intel; + + static_call_update_early(xen_hypercall, func); + + instrumentation_end(); + + return func; +} + static int xen_cpu_up_online(unsigned int cpu) { xen_init_lock_cpu(cpu); diff --git a/arch/x86/xen/enlighten_hvm.c b/arch/x86/xen/enlighten_hvm.c index 24d2957a4726..fe57ff85d004 100644 --- a/arch/x86/xen/enlighten_hvm.c +++ b/arch/x86/xen/enlighten_hvm.c @@ -106,15 +106,8 @@ static void __init init_hvm_pv_info(void) /* PVH set up hypercall page in xen_prepare_pvh(). */ if (xen_pvh_domain()) pv_info.name = "Xen PVH"; - else { - u64 pfn; - uint32_t msr; - + else pv_info.name = "Xen HVM"; - msr = cpuid_ebx(base + 2); - pfn = __pa(hypercall_page); - wrmsr_safe(msr, (u32)pfn, (u32)(pfn >> 32)); - } xen_setup_features(); @@ -300,6 +293,10 @@ static uint32_t __init xen_platform_hvm(void) if (xen_pv_domain()) return 0; + /* Set correct hypercall function. */ + if (xen_domain) + xen_hypercall_setfunc(); + if (xen_pvh_domain() && nopv) { /* Guest booting via the Xen-PVH boot entry goes here */ pr_info("\"nopv\" parameter is ignored in PVH guest\n"); diff --git a/arch/x86/xen/enlighten_pv.c b/arch/x86/xen/enlighten_pv.c index d6818c6cafda..a8eb7e0c473c 100644 --- a/arch/x86/xen/enlighten_pv.c +++ b/arch/x86/xen/enlighten_pv.c @@ -1341,6 +1341,9 @@ asmlinkage __visible void __init xen_start_kernel(struct start_info *si) xen_domain_type = XEN_PV_DOMAIN; xen_start_flags = xen_start_info->flags; + /* Interrupts are guaranteed to be off initially. 
*/ + early_boot_irqs_disabled = true; + static_call_update_early(xen_hypercall, xen_hypercall_pv); xen_setup_features(); @@ -1431,7 +1434,6 @@ asmlinkage __visible void __init xen_start_kernel(struct start_info *si) WARN_ON(xen_cpuhp_setup(xen_cpu_up_prepare_pv, xen_cpu_dead_pv)); local_irq_disable(); - early_boot_irqs_disabled = true; xen_raw_console_write("mapping kernel into physical memory\n"); xen_setup_kernel_pagetable((pgd_t *)xen_start_info->pt_base, diff --git a/arch/x86/xen/enlighten_pvh.c b/arch/x86/xen/enlighten_pvh.c index bf68c329fc01..0e3d930bcb89 100644 --- a/arch/x86/xen/enlighten_pvh.c +++ b/arch/x86/xen/enlighten_pvh.c @@ -129,17 +129,10 @@ static void __init pvh_arch_setup(void) void __init xen_pvh_init(struct boot_params *boot_params) { - u32 msr; - u64 pfn; - xen_pvh = 1; xen_domain_type = XEN_HVM_DOMAIN; xen_start_flags = pvh_start_info.flags; - msr = cpuid_ebx(xen_cpuid_base() + 2); - pfn = __pa(hypercall_page); - wrmsr_safe(msr, (u32)pfn, (u32)(pfn >> 32)); - x86_init.oem.arch_setup = pvh_arch_setup; x86_init.oem.banner = xen_banner; diff --git a/arch/x86/xen/xen-asm.S b/arch/x86/xen/xen-asm.S index 83189cf5cdce..b518f36d1ca2 100644 --- a/arch/x86/xen/xen-asm.S +++ b/arch/x86/xen/xen-asm.S @@ -20,9 +20,32 @@ #include #include +#include #include <../entry/calling.h> .pushsection .noinstr.text, "ax" +/* + * PV hypercall interface to the hypervisor. + * + * Called via inline asm(), so better preserve %rcx and %r11. + * + * Input: + * %eax: hypercall number + * %rdi, %rsi, %rdx, %r10, %r8: args 1..5 for the hypercall + * Output: %rax + */ +SYM_FUNC_START(xen_hypercall_pv) + ANNOTATE_NOENDBR + push %rcx + push %r11 + UNWIND_HINT_SAVE + syscall + UNWIND_HINT_RESTORE + pop %r11 + pop %rcx + RET +SYM_FUNC_END(xen_hypercall_pv) + /* * Disabling events is simply a matter of making the event mask * non-zero. 
@@ -176,7 +199,6 @@ SYM_CODE_START(xen_early_idt_handler_array) SYM_CODE_END(xen_early_idt_handler_array) __FINIT -hypercall_iret = hypercall_page + __HYPERVISOR_iret * 32 /* * Xen64 iret frame: * @@ -186,17 +208,28 @@ hypercall_iret = hypercall_page + __HYPERVISOR_iret * 32 * cs * rip <-- standard iret frame * - * flags + * flags <-- xen_iret must push from here on * - * rcx } - * r11 }<-- pushed by hypercall page - * rsp->rax } + * rcx + * r11 + * rsp->rax */ +.macro xen_hypercall_iret + pushq $0 /* Flags */ + push %rcx + push %r11 + push %rax + mov $__HYPERVISOR_iret, %eax + syscall /* Do the IRET. */ +#ifdef CONFIG_MITIGATION_SLS + int3 +#endif +.endm + SYM_CODE_START(xen_iret) UNWIND_HINT_UNDEFINED ANNOTATE_NOENDBR - pushq $0 - jmp hypercall_iret + xen_hypercall_iret SYM_CODE_END(xen_iret) /* @@ -301,8 +334,7 @@ SYM_CODE_START(xen_entry_SYSENTER_compat) ENDBR lea 16(%rsp), %rsp /* strip %rcx, %r11 */ mov $-ENOSYS, %rax - pushq $0 - jmp hypercall_iret + xen_hypercall_iret SYM_CODE_END(xen_entry_SYSENTER_compat) SYM_CODE_END(xen_entry_SYSCALL_compat) diff --git a/arch/x86/xen/xen-head.S b/arch/x86/xen/xen-head.S index 7f6c69dbb816..9252652afe59 100644 --- a/arch/x86/xen/xen-head.S +++ b/arch/x86/xen/xen-head.S @@ -6,9 +6,11 @@ #include #include +#include #include #include +#include #include #include #include @@ -20,28 +22,6 @@ #include #include -.pushsection .noinstr.text, "ax" - .balign PAGE_SIZE -SYM_CODE_START(hypercall_page) - .rept (PAGE_SIZE / 32) - UNWIND_HINT_FUNC - ANNOTATE_NOENDBR - ANNOTATE_UNRET_SAFE - ret - /* - * Xen will write the hypercall page, and sort out ENDBR. 
- */ - .skip 31, 0xcc - .endr - -#define HYPERCALL(n) \ - .equ xen_hypercall_##n, hypercall_page + __HYPERVISOR_##n * 32; \ - .type xen_hypercall_##n, @function; .size xen_hypercall_##n, 32 -#include -#undef HYPERCALL -SYM_CODE_END(hypercall_page) -.popsection - #ifdef CONFIG_XEN_PV __INIT SYM_CODE_START(startup_xen) @@ -87,6 +67,87 @@ SYM_CODE_END(xen_cpu_bringup_again) #endif #endif + .pushsection .noinstr.text, "ax" +/* + * Xen hypercall interface to the hypervisor. + * + * Input: + * %eax: hypercall number + * 32-bit: + * %ebx, %ecx, %edx, %esi, %edi: args 1..5 for the hypercall + * 64-bit: + * %rdi, %rsi, %rdx, %r10, %r8: args 1..5 for the hypercall + * Output: %[er]ax + */ +SYM_FUNC_START(xen_hypercall_hvm) + ENDBR + FRAME_BEGIN + /* Save all relevant registers (caller save and arguments). */ +#ifdef CONFIG_X86_32 + push %eax + push %ebx + push %ecx + push %edx + push %esi + push %edi +#else + push %rax + push %rcx + push %rdx + push %rdi + push %rsi + push %r11 + push %r10 + push %r9 + push %r8 +#ifdef CONFIG_FRAME_POINTER + pushq $0 /* Dummy push for stack alignment. */ +#endif +#endif + /* Set the vendor specific function. */ + call __xen_hypercall_setfunc + /* Set ZF = 1 if AMD, Restore saved registers. */ +#ifdef CONFIG_X86_32 + lea xen_hypercall_amd, %ebx + cmp %eax, %ebx + pop %edi + pop %esi + pop %edx + pop %ecx + pop %ebx + pop %eax +#else + lea xen_hypercall_amd(%rip), %rbx + cmp %rax, %rbx +#ifdef CONFIG_FRAME_POINTER + pop %rax /* Dummy pop. */ +#endif + pop %r8 + pop %r9 + pop %r10 + pop %r11 + pop %rsi + pop %rdi + pop %rdx + pop %rcx + pop %rax +#endif + /* Use correct hypercall function. 
*/ + jz xen_hypercall_amd + jmp xen_hypercall_intel +SYM_FUNC_END(xen_hypercall_hvm) + +SYM_FUNC_START(xen_hypercall_amd) + vmmcall + RET +SYM_FUNC_END(xen_hypercall_amd) + +SYM_FUNC_START(xen_hypercall_intel) + vmcall + RET +SYM_FUNC_END(xen_hypercall_intel) + .popsection + ELFNOTE(Xen, XEN_ELFNOTE_GUEST_OS, .asciz "linux") ELFNOTE(Xen, XEN_ELFNOTE_GUEST_VERSION, .asciz "2.6") ELFNOTE(Xen, XEN_ELFNOTE_XEN_VERSION, .asciz "xen-3.0") @@ -116,8 +177,6 @@ SYM_CODE_END(xen_cpu_bringup_again) #else # define FEATURES_DOM0 0 #endif - ELFNOTE(Xen, XEN_ELFNOTE_HYPERCALL_PAGE, .globl xen_elfnote_hypercall_page; - xen_elfnote_hypercall_page: _ASM_PTR xen_elfnote_hypercall_page_value - .) ELFNOTE(Xen, XEN_ELFNOTE_SUPPORTED_FEATURES, .long FEATURES_PV | FEATURES_PVH | FEATURES_DOM0) ELFNOTE(Xen, XEN_ELFNOTE_LOADER, .asciz "generic") diff --git a/arch/x86/xen/xen-ops.h b/arch/x86/xen/xen-ops.h index e1b782e823e6..63c13a2ccf55 100644 --- a/arch/x86/xen/xen-ops.h +++ b/arch/x86/xen/xen-ops.h @@ -326,4 +326,13 @@ static inline void xen_smp_intr_free_pv(unsigned int cpu) {} static inline void xen_smp_count_cpus(void) { } #endif /* CONFIG_SMP */ +#ifdef CONFIG_XEN_PV +void xen_hypercall_pv(void); +#endif +void xen_hypercall_hvm(void); +void xen_hypercall_amd(void); +void xen_hypercall_intel(void); +void xen_hypercall_setfunc(void); +void *__xen_hypercall_setfunc(void); + #endif /* XEN_OPS_H */ diff --git a/block/bdev.c b/block/bdev.c index 738e3c8457e7..9d73a8fbf7f9 100644 --- a/block/bdev.c +++ b/block/bdev.c @@ -155,8 +155,7 @@ int set_blocksize(struct file *file, int size) struct inode *inode = file->f_mapping->host; struct block_device *bdev = I_BDEV(inode); - /* Size must be a power of two, and between 512 and PAGE_SIZE */ - if (size > PAGE_SIZE || size < 512 || !is_power_of_2(size)) + if (blk_validate_block_size(size)) return -EINVAL; /* Size cannot be smaller than the size supported by the device */ diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c index 
95dd7b795935..cad16c163611 100644 --- a/block/bfq-iosched.c +++ b/block/bfq-iosched.c @@ -6844,16 +6844,24 @@ static struct bfq_queue *bfq_waker_bfqq(struct bfq_queue *bfqq) if (new_bfqq == waker_bfqq) { /* * If waker_bfqq is in the merge chain, and current - * is the only procress. + * is the only process, waker_bfqq can be freed. */ if (bfqq_process_refs(waker_bfqq) == 1) return NULL; - break; + + return waker_bfqq; } new_bfqq = new_bfqq->new_bfqq; } + /* + * If waker_bfqq is not in the merge chain, and it's procress reference + * is 0, waker_bfqq can be freed. + */ + if (bfqq_process_refs(waker_bfqq) == 0) + return NULL; + return waker_bfqq; } diff --git a/block/bio.c b/block/bio.c index 699a78c85c75..d5bdc31d88d3 100644 --- a/block/bio.c +++ b/block/bio.c @@ -1171,7 +1171,7 @@ void __bio_release_pages(struct bio *bio, bool mark_dirty) } EXPORT_SYMBOL_GPL(__bio_release_pages); -void bio_iov_bvec_set(struct bio *bio, struct iov_iter *iter) +void bio_iov_bvec_set(struct bio *bio, const struct iov_iter *iter) { WARN_ON_ONCE(bio->bi_max_vecs); diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c index e68c725cf8d9..45a395862fbc 100644 --- a/block/blk-cgroup.c +++ b/block/blk-cgroup.c @@ -1324,10 +1324,14 @@ void blkcg_unpin_online(struct cgroup_subsys_state *blkcg_css) struct blkcg *blkcg = css_to_blkcg(blkcg_css); do { + struct blkcg *parent; + if (!refcount_dec_and_test(&blkcg->online_pin)) break; + + parent = blkcg_parent(blkcg); blkcg_destroy_blkgs(blkcg); - blkcg = blkcg_parent(blkcg); + blkcg = parent; } while (blkcg); } diff --git a/block/blk-iocost.c b/block/blk-iocost.c index 384aa15e8260..a5894ec9696e 100644 --- a/block/blk-iocost.c +++ b/block/blk-iocost.c @@ -1098,7 +1098,14 @@ static void __propagate_weights(struct ioc_gq *iocg, u32 active, u32 inuse, inuse = DIV64_U64_ROUND_UP(active * iocg->child_inuse_sum, iocg->child_active_sum); } else { - inuse = clamp_t(u32, inuse, 1, active); + /* + * It may be tempting to turn this into a clamp expression with 
+ * a lower limit of 1 but active may be 0, which cannot be used + * as an upper limit in that situation. This expression allows + * active to clamp inuse unless it is 0, in which case inuse + * becomes 1. + */ + inuse = min(inuse, active) ?: 1; } iocg->last_inuse = iocg->inuse; diff --git a/block/blk-map.c b/block/blk-map.c index b5fd1d857461..894009b2d881 100644 --- a/block/blk-map.c +++ b/block/blk-map.c @@ -574,7 +574,7 @@ static int blk_rq_map_user_bvec(struct request *rq, const struct iov_iter *iter) bio = blk_rq_map_bio_alloc(rq, 0, GFP_KERNEL); if (!bio) return -ENOMEM; - bio_iov_bvec_set(bio, (struct iov_iter *)iter); + bio_iov_bvec_set(bio, iter); /* check that the data layout matches the hardware restrictions */ ret = bio_split_rw_at(bio, lim, &nsegs, max_bytes); diff --git a/block/blk-mq.c b/block/blk-mq.c index 424239c075e2..8ac19d4ae3c0 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -43,6 +43,7 @@ static DEFINE_PER_CPU(struct llist_head, blk_cpu_done); static DEFINE_PER_CPU(call_single_data_t, blk_cpu_csd); +static DEFINE_MUTEX(blk_mq_cpuhp_lock); static void blk_mq_insert_request(struct request *rq, blk_insert_t flags); static void blk_mq_request_bypass_insert(struct request *rq, @@ -1543,19 +1544,17 @@ static void blk_mq_requeue_work(struct work_struct *work) while (!list_empty(&rq_list)) { rq = list_entry(rq_list.next, struct request, queuelist); + list_del_init(&rq->queuelist); /* - * If RQF_DONTPREP ist set, the request has been started by the + * If RQF_DONTPREP is set, the request has been started by the * driver already and might have driver-specific data allocated * already. Insert it into the hctx dispatch list to avoid * block layer merges for the request. 
*/ - if (rq->rq_flags & RQF_DONTPREP) { - list_del_init(&rq->queuelist); + if (rq->rq_flags & RQF_DONTPREP) blk_mq_request_bypass_insert(rq, 0); - } else { - list_del_init(&rq->queuelist); + else blk_mq_insert_request(rq, BLK_MQ_INSERT_AT_HEAD); - } } while (!list_empty(&flush_list)) { @@ -3739,13 +3738,91 @@ static int blk_mq_hctx_notify_dead(unsigned int cpu, struct hlist_node *node) return 0; } -static void blk_mq_remove_cpuhp(struct blk_mq_hw_ctx *hctx) +static void __blk_mq_remove_cpuhp(struct blk_mq_hw_ctx *hctx) { - if (!(hctx->flags & BLK_MQ_F_STACKING)) + lockdep_assert_held(&blk_mq_cpuhp_lock); + + if (!(hctx->flags & BLK_MQ_F_STACKING) && + !hlist_unhashed(&hctx->cpuhp_online)) { cpuhp_state_remove_instance_nocalls(CPUHP_AP_BLK_MQ_ONLINE, &hctx->cpuhp_online); - cpuhp_state_remove_instance_nocalls(CPUHP_BLK_MQ_DEAD, - &hctx->cpuhp_dead); + INIT_HLIST_NODE(&hctx->cpuhp_online); + } + + if (!hlist_unhashed(&hctx->cpuhp_dead)) { + cpuhp_state_remove_instance_nocalls(CPUHP_BLK_MQ_DEAD, + &hctx->cpuhp_dead); + INIT_HLIST_NODE(&hctx->cpuhp_dead); + } +} + +static void blk_mq_remove_cpuhp(struct blk_mq_hw_ctx *hctx) +{ + mutex_lock(&blk_mq_cpuhp_lock); + __blk_mq_remove_cpuhp(hctx); + mutex_unlock(&blk_mq_cpuhp_lock); +} + +static void __blk_mq_add_cpuhp(struct blk_mq_hw_ctx *hctx) +{ + lockdep_assert_held(&blk_mq_cpuhp_lock); + + if (!(hctx->flags & BLK_MQ_F_STACKING) && + hlist_unhashed(&hctx->cpuhp_online)) + cpuhp_state_add_instance_nocalls(CPUHP_AP_BLK_MQ_ONLINE, + &hctx->cpuhp_online); + + if (hlist_unhashed(&hctx->cpuhp_dead)) + cpuhp_state_add_instance_nocalls(CPUHP_BLK_MQ_DEAD, + &hctx->cpuhp_dead); +} + +static void __blk_mq_remove_cpuhp_list(struct list_head *head) +{ + struct blk_mq_hw_ctx *hctx; + + lockdep_assert_held(&blk_mq_cpuhp_lock); + + list_for_each_entry(hctx, head, hctx_list) + __blk_mq_remove_cpuhp(hctx); +} + +/* + * Unregister cpuhp callbacks from exited hw queues + * + * Safe to call if this `request_queue` is live + */ +static void 
blk_mq_remove_hw_queues_cpuhp(struct request_queue *q) +{ + LIST_HEAD(hctx_list); + + spin_lock(&q->unused_hctx_lock); + list_splice_init(&q->unused_hctx_list, &hctx_list); + spin_unlock(&q->unused_hctx_lock); + + mutex_lock(&blk_mq_cpuhp_lock); + __blk_mq_remove_cpuhp_list(&hctx_list); + mutex_unlock(&blk_mq_cpuhp_lock); + + spin_lock(&q->unused_hctx_lock); + list_splice(&hctx_list, &q->unused_hctx_list); + spin_unlock(&q->unused_hctx_lock); +} + +/* + * Register cpuhp callbacks from all hw queues + * + * Safe to call if this `request_queue` is live + */ +static void blk_mq_add_hw_queues_cpuhp(struct request_queue *q) +{ + struct blk_mq_hw_ctx *hctx; + unsigned long i; + + mutex_lock(&blk_mq_cpuhp_lock); + queue_for_each_hw_ctx(q, hctx, i) + __blk_mq_add_cpuhp(hctx); + mutex_unlock(&blk_mq_cpuhp_lock); } /* @@ -3796,8 +3873,6 @@ static void blk_mq_exit_hctx(struct request_queue *q, if (set->ops->exit_hctx) set->ops->exit_hctx(hctx, hctx_idx); - blk_mq_remove_cpuhp(hctx); - xa_erase(&q->hctx_table, hctx_idx); spin_lock(&q->unused_hctx_lock); @@ -3814,6 +3889,7 @@ static void blk_mq_exit_hw_queues(struct request_queue *q, queue_for_each_hw_ctx(q, hctx, i) { if (i == nr_queue) break; + blk_mq_remove_cpuhp(hctx); blk_mq_exit_hctx(q, set, hctx, i); } } @@ -3824,16 +3900,11 @@ static int blk_mq_init_hctx(struct request_queue *q, { hctx->queue_num = hctx_idx; - if (!(hctx->flags & BLK_MQ_F_STACKING)) - cpuhp_state_add_instance_nocalls(CPUHP_AP_BLK_MQ_ONLINE, - &hctx->cpuhp_online); - cpuhp_state_add_instance_nocalls(CPUHP_BLK_MQ_DEAD, &hctx->cpuhp_dead); - hctx->tags = set->tags[hctx_idx]; if (set->ops->init_hctx && set->ops->init_hctx(hctx, set->driver_data, hctx_idx)) - goto unregister_cpu_notifier; + goto fail; if (blk_mq_init_request(set, hctx->fq->flush_rq, hctx_idx, hctx->numa_node)) @@ -3850,8 +3921,7 @@ static int blk_mq_init_hctx(struct request_queue *q, exit_hctx: if (set->ops->exit_hctx) set->ops->exit_hctx(hctx, hctx_idx); - unregister_cpu_notifier: - 
blk_mq_remove_cpuhp(hctx); + fail: return -1; } @@ -3877,6 +3947,8 @@ blk_mq_alloc_hctx(struct request_queue *q, struct blk_mq_tag_set *set, INIT_DELAYED_WORK(&hctx->run_work, blk_mq_run_work_fn); spin_lock_init(&hctx->lock); INIT_LIST_HEAD(&hctx->dispatch); + INIT_HLIST_NODE(&hctx->cpuhp_dead); + INIT_HLIST_NODE(&hctx->cpuhp_online); hctx->queue = q; hctx->flags = set->flags & ~BLK_MQ_F_TAG_QUEUE_SHARED; @@ -4340,6 +4412,15 @@ struct gendisk *blk_mq_alloc_disk_for_queue(struct request_queue *q, } EXPORT_SYMBOL(blk_mq_alloc_disk_for_queue); +/* + * Only hctx removed from cpuhp list can be reused + */ +static bool blk_mq_hctx_is_reusable(struct blk_mq_hw_ctx *hctx) +{ + return hlist_unhashed(&hctx->cpuhp_online) && + hlist_unhashed(&hctx->cpuhp_dead); +} + static struct blk_mq_hw_ctx *blk_mq_alloc_and_init_hctx( struct blk_mq_tag_set *set, struct request_queue *q, int hctx_idx, int node) @@ -4349,7 +4430,7 @@ static struct blk_mq_hw_ctx *blk_mq_alloc_and_init_hctx( /* reuse dead hctx first */ spin_lock(&q->unused_hctx_lock); list_for_each_entry(tmp, &q->unused_hctx_list, hctx_list) { - if (tmp->numa_node == node) { + if (tmp->numa_node == node && blk_mq_hctx_is_reusable(tmp)) { hctx = tmp; break; } @@ -4415,6 +4496,12 @@ static void blk_mq_realloc_hw_ctxs(struct blk_mq_tag_set *set, xa_for_each_start(&q->hctx_table, j, hctx, j) blk_mq_exit_hctx(q, set, hctx, j); mutex_unlock(&q->sysfs_lock); + + /* unregister cpuhp callbacks for exited hctxs */ + blk_mq_remove_hw_queues_cpuhp(q); + + /* register cpuhp for new initialized hctxs */ + blk_mq_add_hw_queues_cpuhp(q); } int blk_mq_init_allocated_queue(struct blk_mq_tag_set *set, diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c index 4241aea84161..767598e719ab 100644 --- a/block/blk-sysfs.c +++ b/block/blk-sysfs.c @@ -263,7 +263,7 @@ static ssize_t queue_nr_zones_show(struct gendisk *disk, char *page) static ssize_t queue_iostats_passthrough_show(struct gendisk *disk, char *page) { - return 
queue_var_show(blk_queue_passthrough_stat(disk->queue), page); + return queue_var_show(!!blk_queue_passthrough_stat(disk->queue), page); } static ssize_t queue_iostats_passthrough_store(struct gendisk *disk, diff --git a/block/blk-zoned.c b/block/blk-zoned.c index 263e28b72053..84da1eadff64 100644 --- a/block/blk-zoned.c +++ b/block/blk-zoned.c @@ -41,7 +41,6 @@ static const char *const zone_cond_name[] = { /* * Per-zone write plug. * @node: hlist_node structure for managing the plug using a hash table. - * @link: To list the plug in the zone write plug error list of the disk. * @ref: Zone write plug reference counter. A zone write plug reference is * always at least 1 when the plug is hashed in the disk plug hash table. * The reference is incremented whenever a new BIO needing plugging is @@ -63,7 +62,6 @@ static const char *const zone_cond_name[] = { */ struct blk_zone_wplug { struct hlist_node node; - struct list_head link; refcount_t ref; spinlock_t lock; unsigned int flags; @@ -80,8 +78,8 @@ struct blk_zone_wplug { * - BLK_ZONE_WPLUG_PLUGGED: Indicates that the zone write plug is plugged, * that is, that write BIOs are being throttled due to a write BIO already * being executed or the zone write plug bio list is not empty. - * - BLK_ZONE_WPLUG_ERROR: Indicates that a write error happened which will be - * recovered with a report zone to update the zone write pointer offset. + * - BLK_ZONE_WPLUG_NEED_WP_UPDATE: Indicates that we lost track of a zone + * write pointer offset and need to update it. * - BLK_ZONE_WPLUG_UNHASHED: Indicates that the zone write plug was removed * from the disk hash table and that the initial reference to the zone * write plug set when the plug was first added to the hash table has been @@ -91,11 +89,9 @@ struct blk_zone_wplug { * freed once all remaining references from BIOs or functions are dropped. 
*/ #define BLK_ZONE_WPLUG_PLUGGED (1U << 0) -#define BLK_ZONE_WPLUG_ERROR (1U << 1) +#define BLK_ZONE_WPLUG_NEED_WP_UPDATE (1U << 1) #define BLK_ZONE_WPLUG_UNHASHED (1U << 2) -#define BLK_ZONE_WPLUG_BUSY (BLK_ZONE_WPLUG_PLUGGED | BLK_ZONE_WPLUG_ERROR) - /** * blk_zone_cond_str - Return string XXX in BLK_ZONE_COND_XXX. * @zone_cond: BLK_ZONE_COND_XXX. @@ -115,6 +111,30 @@ const char *blk_zone_cond_str(enum blk_zone_cond zone_cond) } EXPORT_SYMBOL_GPL(blk_zone_cond_str); +struct disk_report_zones_cb_args { + struct gendisk *disk; + report_zones_cb user_cb; + void *user_data; +}; + +static void disk_zone_wplug_sync_wp_offset(struct gendisk *disk, + struct blk_zone *zone); + +static int disk_report_zones_cb(struct blk_zone *zone, unsigned int idx, + void *data) +{ + struct disk_report_zones_cb_args *args = data; + struct gendisk *disk = args->disk; + + if (disk->zone_wplugs_hash) + disk_zone_wplug_sync_wp_offset(disk, zone); + + if (!args->user_cb) + return 0; + + return args->user_cb(zone, idx, args->user_data); +} + /** * blkdev_report_zones - Get zones information * @bdev: Target block device @@ -139,6 +159,11 @@ int blkdev_report_zones(struct block_device *bdev, sector_t sector, { struct gendisk *disk = bdev->bd_disk; sector_t capacity = get_capacity(disk); + struct disk_report_zones_cb_args args = { + .disk = disk, + .user_cb = cb, + .user_data = data, + }; if (!bdev_is_zoned(bdev) || WARN_ON_ONCE(!disk->fops->report_zones)) return -EOPNOTSUPP; @@ -146,7 +171,8 @@ int blkdev_report_zones(struct block_device *bdev, sector_t sector, if (!nr_zones || sector >= capacity) return 0; - return disk->fops->report_zones(disk, sector, nr_zones, cb, data); + return disk->fops->report_zones(disk, sector, nr_zones, + disk_report_zones_cb, &args); } EXPORT_SYMBOL_GPL(blkdev_report_zones); @@ -427,7 +453,7 @@ static inline void disk_put_zone_wplug(struct blk_zone_wplug *zwplug) { if (refcount_dec_and_test(&zwplug->ref)) { WARN_ON_ONCE(!bio_list_empty(&zwplug->bio_list)); - 
WARN_ON_ONCE(!list_empty(&zwplug->link)); + WARN_ON_ONCE(zwplug->flags & BLK_ZONE_WPLUG_PLUGGED); WARN_ON_ONCE(!(zwplug->flags & BLK_ZONE_WPLUG_UNHASHED)); call_rcu(&zwplug->rcu_head, disk_free_zone_wplug_rcu); @@ -441,8 +467,8 @@ static inline bool disk_should_remove_zone_wplug(struct gendisk *disk, if (zwplug->flags & BLK_ZONE_WPLUG_UNHASHED) return false; - /* If the zone write plug is still busy, it cannot be removed. */ - if (zwplug->flags & BLK_ZONE_WPLUG_BUSY) + /* If the zone write plug is still plugged, it cannot be removed. */ + if (zwplug->flags & BLK_ZONE_WPLUG_PLUGGED) return false; /* @@ -525,12 +551,11 @@ again: return NULL; INIT_HLIST_NODE(&zwplug->node); - INIT_LIST_HEAD(&zwplug->link); refcount_set(&zwplug->ref, 2); spin_lock_init(&zwplug->lock); zwplug->flags = 0; zwplug->zone_no = zno; - zwplug->wp_offset = sector & (disk->queue->limits.chunk_sectors - 1); + zwplug->wp_offset = bdev_offset_from_zone_start(disk->part0, sector); bio_list_init(&zwplug->bio_list); INIT_WORK(&zwplug->bio_work, blk_zone_wplug_bio_work); zwplug->disk = disk; @@ -574,115 +599,22 @@ static void disk_zone_wplug_abort(struct blk_zone_wplug *zwplug) } /* - * Abort (fail) all plugged BIOs of a zone write plug that are not aligned - * with the assumed write pointer location of the zone when the BIO will - * be unplugged. 
- */ -static void disk_zone_wplug_abort_unaligned(struct gendisk *disk, - struct blk_zone_wplug *zwplug) -{ - unsigned int wp_offset = zwplug->wp_offset; - struct bio_list bl = BIO_EMPTY_LIST; - struct bio *bio; - - while ((bio = bio_list_pop(&zwplug->bio_list))) { - if (disk_zone_is_full(disk, zwplug->zone_no, wp_offset) || - (bio_op(bio) != REQ_OP_ZONE_APPEND && - bio_offset_from_zone_start(bio) != wp_offset)) { - blk_zone_wplug_bio_io_error(zwplug, bio); - continue; - } - - wp_offset += bio_sectors(bio); - bio_list_add(&bl, bio); - } - - bio_list_merge(&zwplug->bio_list, &bl); -} - -static inline void disk_zone_wplug_set_error(struct gendisk *disk, - struct blk_zone_wplug *zwplug) -{ - unsigned long flags; - - if (zwplug->flags & BLK_ZONE_WPLUG_ERROR) - return; - - /* - * At this point, we already have a reference on the zone write plug. - * However, since we are going to add the plug to the disk zone write - * plugs work list, increase its reference count. This reference will - * be dropped in disk_zone_wplugs_work() once the error state is - * handled, or in disk_zone_wplug_clear_error() if the zone is reset or - * finished. - */ - zwplug->flags |= BLK_ZONE_WPLUG_ERROR; - refcount_inc(&zwplug->ref); - - spin_lock_irqsave(&disk->zone_wplugs_lock, flags); - list_add_tail(&zwplug->link, &disk->zone_wplugs_err_list); - spin_unlock_irqrestore(&disk->zone_wplugs_lock, flags); -} - -static inline void disk_zone_wplug_clear_error(struct gendisk *disk, - struct blk_zone_wplug *zwplug) -{ - unsigned long flags; - - if (!(zwplug->flags & BLK_ZONE_WPLUG_ERROR)) - return; - - /* - * We are racing with the error handling work which drops the reference - * on the zone write plug after handling the error state. So remove the - * plug from the error list and drop its reference count only if the - * error handling has not yet started, that is, if the zone write plug - * is still listed. 
- */ - spin_lock_irqsave(&disk->zone_wplugs_lock, flags); - if (!list_empty(&zwplug->link)) { - list_del_init(&zwplug->link); - zwplug->flags &= ~BLK_ZONE_WPLUG_ERROR; - disk_put_zone_wplug(zwplug); - } - spin_unlock_irqrestore(&disk->zone_wplugs_lock, flags); -} - -/* - * Set a zone write plug write pointer offset to either 0 (zone reset case) - * or to the zone size (zone finish case). This aborts all plugged BIOs, which - * is fine to do as doing a zone reset or zone finish while writes are in-flight - * is a mistake from the user which will most likely cause all plugged BIOs to - * fail anyway. + * Set a zone write plug write pointer offset to the specified value. + * This aborts all plugged BIOs, which is fine as this function is called for + * a zone reset operation, a zone finish operation or if the zone needs a wp + * update from a report zone after a write error. */ static void disk_zone_wplug_set_wp_offset(struct gendisk *disk, struct blk_zone_wplug *zwplug, unsigned int wp_offset) { - unsigned long flags; - - spin_lock_irqsave(&zwplug->lock, flags); - - /* - * Make sure that a BIO completion or another zone reset or finish - * operation has not already removed the plug from the hash table. - */ - if (zwplug->flags & BLK_ZONE_WPLUG_UNHASHED) { - spin_unlock_irqrestore(&zwplug->lock, flags); - return; - } + lockdep_assert_held(&zwplug->lock); /* Update the zone write pointer and abort all plugged BIOs. */ + zwplug->flags &= ~BLK_ZONE_WPLUG_NEED_WP_UPDATE; zwplug->wp_offset = wp_offset; disk_zone_wplug_abort(zwplug); - /* - * Updating the write pointer offset puts back the zone - * in a good state. So clear the error flag and decrement the - * error count if we were in error state. - */ - disk_zone_wplug_clear_error(disk, zwplug); - /* * The zone write plug now has no BIO plugged: remove it from the * hash table so that it cannot be seen. 
The plug will be freed @@ -690,8 +622,58 @@ static void disk_zone_wplug_set_wp_offset(struct gendisk *disk, */ if (disk_should_remove_zone_wplug(disk, zwplug)) disk_remove_zone_wplug(disk, zwplug); +} +static unsigned int blk_zone_wp_offset(struct blk_zone *zone) +{ + switch (zone->cond) { + case BLK_ZONE_COND_IMP_OPEN: + case BLK_ZONE_COND_EXP_OPEN: + case BLK_ZONE_COND_CLOSED: + return zone->wp - zone->start; + case BLK_ZONE_COND_FULL: + return zone->len; + case BLK_ZONE_COND_EMPTY: + return 0; + case BLK_ZONE_COND_NOT_WP: + case BLK_ZONE_COND_OFFLINE: + case BLK_ZONE_COND_READONLY: + default: + /* + * Conventional, offline and read-only zones do not have a valid + * write pointer. + */ + return UINT_MAX; + } +} + +static void disk_zone_wplug_sync_wp_offset(struct gendisk *disk, + struct blk_zone *zone) +{ + struct blk_zone_wplug *zwplug; + unsigned long flags; + + zwplug = disk_get_zone_wplug(disk, zone->start); + if (!zwplug) + return; + + spin_lock_irqsave(&zwplug->lock, flags); + if (zwplug->flags & BLK_ZONE_WPLUG_NEED_WP_UPDATE) + disk_zone_wplug_set_wp_offset(disk, zwplug, + blk_zone_wp_offset(zone)); spin_unlock_irqrestore(&zwplug->lock, flags); + + disk_put_zone_wplug(zwplug); +} + +static int disk_zone_sync_wp_offset(struct gendisk *disk, sector_t sector) +{ + struct disk_report_zones_cb_args args = { + .disk = disk, + }; + + return disk->fops->report_zones(disk, sector, 1, + disk_report_zones_cb, &args); } static bool blk_zone_wplug_handle_reset_or_finish(struct bio *bio, @@ -700,6 +682,7 @@ static bool blk_zone_wplug_handle_reset_or_finish(struct bio *bio, struct gendisk *disk = bio->bi_bdev->bd_disk; sector_t sector = bio->bi_iter.bi_sector; struct blk_zone_wplug *zwplug; + unsigned long flags; /* Conventional zones cannot be reset nor finished. 
*/ if (!bdev_zone_is_seq(bio->bi_bdev, sector)) { @@ -707,6 +690,15 @@ static bool blk_zone_wplug_handle_reset_or_finish(struct bio *bio, return true; } + /* + * No-wait reset or finish BIOs do not make much sense as the callers + * issue these as blocking operations in most cases. To avoid issues + * the BIO execution potentially failing with BLK_STS_AGAIN, warn about + * REQ_NOWAIT being set and ignore that flag. + */ + if (WARN_ON_ONCE(bio->bi_opf & REQ_NOWAIT)) + bio->bi_opf &= ~REQ_NOWAIT; + /* * If we have a zone write plug, set its write pointer offset to 0 * (reset case) or to the zone size (finish case). This will abort all @@ -716,7 +708,9 @@ static bool blk_zone_wplug_handle_reset_or_finish(struct bio *bio, */ zwplug = disk_get_zone_wplug(disk, sector); if (zwplug) { + spin_lock_irqsave(&zwplug->lock, flags); disk_zone_wplug_set_wp_offset(disk, zwplug, wp_offset); + spin_unlock_irqrestore(&zwplug->lock, flags); disk_put_zone_wplug(zwplug); } @@ -727,6 +721,7 @@ static bool blk_zone_wplug_handle_reset_all(struct bio *bio) { struct gendisk *disk = bio->bi_bdev->bd_disk; struct blk_zone_wplug *zwplug; + unsigned long flags; sector_t sector; /* @@ -738,7 +733,9 @@ static bool blk_zone_wplug_handle_reset_all(struct bio *bio) sector += disk->queue->limits.chunk_sectors) { zwplug = disk_get_zone_wplug(disk, sector); if (zwplug) { + spin_lock_irqsave(&zwplug->lock, flags); disk_zone_wplug_set_wp_offset(disk, zwplug, 0); + spin_unlock_irqrestore(&zwplug->lock, flags); disk_put_zone_wplug(zwplug); } } @@ -746,9 +743,25 @@ static bool blk_zone_wplug_handle_reset_all(struct bio *bio) return false; } -static inline void blk_zone_wplug_add_bio(struct blk_zone_wplug *zwplug, - struct bio *bio, unsigned int nr_segs) +static void disk_zone_wplug_schedule_bio_work(struct gendisk *disk, + struct blk_zone_wplug *zwplug) { + /* + * Take a reference on the zone write plug and schedule the submission + * of the next plugged BIO. 
blk_zone_wplug_bio_work() will release the + * reference we take here. + */ + WARN_ON_ONCE(!(zwplug->flags & BLK_ZONE_WPLUG_PLUGGED)); + refcount_inc(&zwplug->ref); + queue_work(disk->zone_wplugs_wq, &zwplug->bio_work); +} + +static inline void disk_zone_wplug_add_bio(struct gendisk *disk, + struct blk_zone_wplug *zwplug, + struct bio *bio, unsigned int nr_segs) +{ + bool schedule_bio_work = false; + /* * Grab an extra reference on the BIO request queue usage counter. * This reference will be reused to submit a request for the BIO for @@ -764,6 +777,16 @@ static inline void blk_zone_wplug_add_bio(struct blk_zone_wplug *zwplug, */ bio_clear_polled(bio); + /* + * REQ_NOWAIT BIOs are always handled using the zone write plug BIO + * work, which can block. So clear the REQ_NOWAIT flag and schedule the + * work if this is the first BIO we are plugging. + */ + if (bio->bi_opf & REQ_NOWAIT) { + schedule_bio_work = !(zwplug->flags & BLK_ZONE_WPLUG_PLUGGED); + bio->bi_opf &= ~REQ_NOWAIT; + } + /* * Reuse the poll cookie field to store the number of segments when * split to the hardware limits. @@ -777,6 +800,11 @@ static inline void blk_zone_wplug_add_bio(struct blk_zone_wplug *zwplug, * at the tail of the list to preserve the sequential write order. */ bio_list_add(&zwplug->bio_list, bio); + + zwplug->flags |= BLK_ZONE_WPLUG_PLUGGED; + + if (schedule_bio_work) + disk_zone_wplug_schedule_bio_work(disk, zwplug); } /* @@ -889,13 +917,23 @@ static bool blk_zone_wplug_prepare_bio(struct blk_zone_wplug *zwplug, { struct gendisk *disk = bio->bi_bdev->bd_disk; + /* + * If we lost track of the zone write pointer due to a write error, + * the user must either execute a report zones, reset the zone or finish + * the to recover a reliable write pointer position. Fail BIOs if the + * user did not do that as we cannot handle emulated zone append + * otherwise. 
+ */ + if (zwplug->flags & BLK_ZONE_WPLUG_NEED_WP_UPDATE) + return false; + /* * Check that the user is not attempting to write to a full zone. * We know such BIO will fail, and that would potentially overflow our * write pointer offset beyond the end of the zone. */ if (disk_zone_wplug_is_full(disk, zwplug)) - goto err; + return false; if (bio_op(bio) == REQ_OP_ZONE_APPEND) { /* @@ -914,24 +952,18 @@ static bool blk_zone_wplug_prepare_bio(struct blk_zone_wplug *zwplug, bio_set_flag(bio, BIO_EMULATES_ZONE_APPEND); } else { /* - * Check for non-sequential writes early because we avoid a - * whole lot of error handling trouble if we don't send it off - * to the driver. + * Check for non-sequential writes early as we know that BIOs + * with a start sector not unaligned to the zone write pointer + * will fail. */ if (bio_offset_from_zone_start(bio) != zwplug->wp_offset) - goto err; + return false; } /* Advance the zone write pointer offset. */ zwplug->wp_offset += bio_sectors(bio); return true; - -err: - /* We detected an invalid write BIO: schedule error recovery. */ - disk_zone_wplug_set_error(disk, zwplug); - kblockd_schedule_work(&disk->zone_wplugs_work); - return false; } static bool blk_zone_wplug_handle_write(struct bio *bio, unsigned int nr_segs) @@ -970,7 +1002,10 @@ static bool blk_zone_wplug_handle_write(struct bio *bio, unsigned int nr_segs) zwplug = disk_get_and_lock_zone_wplug(disk, sector, gfp_mask, &flags); if (!zwplug) { - bio_io_error(bio); + if (bio->bi_opf & REQ_NOWAIT) + bio_wouldblock_error(bio); + else + bio_io_error(bio); return true; } @@ -978,18 +1013,20 @@ static bool blk_zone_wplug_handle_write(struct bio *bio, unsigned int nr_segs) bio_set_flag(bio, BIO_ZONE_WRITE_PLUGGING); /* - * If the zone is already plugged or has a pending error, add the BIO - * to the plug BIO list. Otherwise, plug and let the BIO execute. + * If the zone is already plugged, add the BIO to the plug BIO list. 
+ * Do the same for REQ_NOWAIT BIOs to ensure that we will not see a + * BLK_STS_AGAIN failure if we let the BIO execute. + * Otherwise, plug and let the BIO execute. */ - if (zwplug->flags & BLK_ZONE_WPLUG_BUSY) + if ((zwplug->flags & BLK_ZONE_WPLUG_PLUGGED) || + (bio->bi_opf & REQ_NOWAIT)) goto plug; - /* - * If an error is detected when preparing the BIO, add it to the BIO - * list so that error recovery can deal with it. - */ - if (!blk_zone_wplug_prepare_bio(zwplug, bio)) - goto plug; + if (!blk_zone_wplug_prepare_bio(zwplug, bio)) { + spin_unlock_irqrestore(&zwplug->lock, flags); + bio_io_error(bio); + return true; + } zwplug->flags |= BLK_ZONE_WPLUG_PLUGGED; @@ -998,8 +1035,7 @@ static bool blk_zone_wplug_handle_write(struct bio *bio, unsigned int nr_segs) return false; plug: - zwplug->flags |= BLK_ZONE_WPLUG_PLUGGED; - blk_zone_wplug_add_bio(zwplug, bio, nr_segs); + disk_zone_wplug_add_bio(disk, zwplug, bio, nr_segs); spin_unlock_irqrestore(&zwplug->lock, flags); @@ -1083,19 +1119,6 @@ bool blk_zone_plug_bio(struct bio *bio, unsigned int nr_segs) } EXPORT_SYMBOL_GPL(blk_zone_plug_bio); -static void disk_zone_wplug_schedule_bio_work(struct gendisk *disk, - struct blk_zone_wplug *zwplug) -{ - /* - * Take a reference on the zone write plug and schedule the submission - * of the next plugged BIO. blk_zone_wplug_bio_work() will release the - * reference we take here. - */ - WARN_ON_ONCE(!(zwplug->flags & BLK_ZONE_WPLUG_PLUGGED)); - refcount_inc(&zwplug->ref); - queue_work(disk->zone_wplugs_wq, &zwplug->bio_work); -} - static void disk_zone_wplug_unplug_bio(struct gendisk *disk, struct blk_zone_wplug *zwplug) { @@ -1103,16 +1126,6 @@ static void disk_zone_wplug_unplug_bio(struct gendisk *disk, spin_lock_irqsave(&zwplug->lock, flags); - /* - * If we had an error, schedule error recovery. The recovery work - * will restart submission of plugged BIOs. 
- */ - if (zwplug->flags & BLK_ZONE_WPLUG_ERROR) { - spin_unlock_irqrestore(&zwplug->lock, flags); - kblockd_schedule_work(&disk->zone_wplugs_work); - return; - } - /* Schedule submission of the next plugged BIO if we have one. */ if (!bio_list_empty(&zwplug->bio_list)) { disk_zone_wplug_schedule_bio_work(disk, zwplug); @@ -1155,12 +1168,13 @@ void blk_zone_write_plug_bio_endio(struct bio *bio) } /* - * If the BIO failed, mark the plug as having an error to trigger - * recovery. + * If the BIO failed, abort all plugged BIOs and mark the plug as + * needing a write pointer update. */ if (bio->bi_status != BLK_STS_OK) { spin_lock_irqsave(&zwplug->lock, flags); - disk_zone_wplug_set_error(disk, zwplug); + disk_zone_wplug_abort(zwplug); + zwplug->flags |= BLK_ZONE_WPLUG_NEED_WP_UPDATE; spin_unlock_irqrestore(&zwplug->lock, flags); } @@ -1216,6 +1230,7 @@ static void blk_zone_wplug_bio_work(struct work_struct *work) */ spin_lock_irqsave(&zwplug->lock, flags); +again: bio = bio_list_pop(&zwplug->bio_list); if (!bio) { zwplug->flags &= ~BLK_ZONE_WPLUG_PLUGGED; @@ -1224,10 +1239,8 @@ static void blk_zone_wplug_bio_work(struct work_struct *work) } if (!blk_zone_wplug_prepare_bio(zwplug, bio)) { - /* Error recovery will decide what to do with the BIO. 
*/ - bio_list_add_head(&zwplug->bio_list, bio); - spin_unlock_irqrestore(&zwplug->lock, flags); - goto put_zwplug; + blk_zone_wplug_bio_io_error(zwplug, bio); + goto again; } spin_unlock_irqrestore(&zwplug->lock, flags); @@ -1249,120 +1262,6 @@ put_zwplug: disk_put_zone_wplug(zwplug); } -static unsigned int blk_zone_wp_offset(struct blk_zone *zone) -{ - switch (zone->cond) { - case BLK_ZONE_COND_IMP_OPEN: - case BLK_ZONE_COND_EXP_OPEN: - case BLK_ZONE_COND_CLOSED: - return zone->wp - zone->start; - case BLK_ZONE_COND_FULL: - return zone->len; - case BLK_ZONE_COND_EMPTY: - return 0; - case BLK_ZONE_COND_NOT_WP: - case BLK_ZONE_COND_OFFLINE: - case BLK_ZONE_COND_READONLY: - default: - /* - * Conventional, offline and read-only zones do not have a valid - * write pointer. - */ - return UINT_MAX; - } -} - -static int blk_zone_wplug_report_zone_cb(struct blk_zone *zone, - unsigned int idx, void *data) -{ - struct blk_zone *zonep = data; - - *zonep = *zone; - return 0; -} - -static void disk_zone_wplug_handle_error(struct gendisk *disk, - struct blk_zone_wplug *zwplug) -{ - sector_t zone_start_sector = - bdev_zone_sectors(disk->part0) * zwplug->zone_no; - unsigned int noio_flag; - struct blk_zone zone; - unsigned long flags; - int ret; - - /* Get the current zone information from the device. */ - noio_flag = memalloc_noio_save(); - ret = disk->fops->report_zones(disk, zone_start_sector, 1, - blk_zone_wplug_report_zone_cb, &zone); - memalloc_noio_restore(noio_flag); - - spin_lock_irqsave(&zwplug->lock, flags); - - /* - * A zone reset or finish may have cleared the error already. In such - * case, do nothing as the report zones may have seen the "old" write - * pointer value before the reset/finish operation completed. - */ - if (!(zwplug->flags & BLK_ZONE_WPLUG_ERROR)) - goto unlock; - - zwplug->flags &= ~BLK_ZONE_WPLUG_ERROR; - - if (ret != 1) { - /* - * We failed to get the zone information, meaning that something - * is likely really wrong with the device. 
Abort all remaining - * plugged BIOs as otherwise we could endup waiting forever on - * plugged BIOs to complete if there is a queue freeze on-going. - */ - disk_zone_wplug_abort(zwplug); - goto unplug; - } - - /* Update the zone write pointer offset. */ - zwplug->wp_offset = blk_zone_wp_offset(&zone); - disk_zone_wplug_abort_unaligned(disk, zwplug); - - /* Restart BIO submission if we still have any BIO left. */ - if (!bio_list_empty(&zwplug->bio_list)) { - disk_zone_wplug_schedule_bio_work(disk, zwplug); - goto unlock; - } - -unplug: - zwplug->flags &= ~BLK_ZONE_WPLUG_PLUGGED; - if (disk_should_remove_zone_wplug(disk, zwplug)) - disk_remove_zone_wplug(disk, zwplug); - -unlock: - spin_unlock_irqrestore(&zwplug->lock, flags); -} - -static void disk_zone_wplugs_work(struct work_struct *work) -{ - struct gendisk *disk = - container_of(work, struct gendisk, zone_wplugs_work); - struct blk_zone_wplug *zwplug; - unsigned long flags; - - spin_lock_irqsave(&disk->zone_wplugs_lock, flags); - - while (!list_empty(&disk->zone_wplugs_err_list)) { - zwplug = list_first_entry(&disk->zone_wplugs_err_list, - struct blk_zone_wplug, link); - list_del_init(&zwplug->link); - spin_unlock_irqrestore(&disk->zone_wplugs_lock, flags); - - disk_zone_wplug_handle_error(disk, zwplug); - disk_put_zone_wplug(zwplug); - - spin_lock_irqsave(&disk->zone_wplugs_lock, flags); - } - - spin_unlock_irqrestore(&disk->zone_wplugs_lock, flags); -} - static inline unsigned int disk_zone_wplugs_hash_size(struct gendisk *disk) { return 1U << disk->zone_wplugs_hash_bits; @@ -1371,8 +1270,6 @@ static inline unsigned int disk_zone_wplugs_hash_size(struct gendisk *disk) void disk_init_zone_resources(struct gendisk *disk) { spin_lock_init(&disk->zone_wplugs_lock); - INIT_LIST_HEAD(&disk->zone_wplugs_err_list); - INIT_WORK(&disk->zone_wplugs_work, disk_zone_wplugs_work); } /* @@ -1471,8 +1368,6 @@ void disk_free_zone_resources(struct gendisk *disk) if (!disk->zone_wplugs_pool) return; - 
cancel_work_sync(&disk->zone_wplugs_work); - if (disk->zone_wplugs_wq) { destroy_workqueue(disk->zone_wplugs_wq); disk->zone_wplugs_wq = NULL; @@ -1669,6 +1564,8 @@ static int blk_revalidate_seq_zone(struct blk_zone *zone, unsigned int idx, if (!disk->zone_wplugs_hash) return 0; + disk_zone_wplug_sync_wp_offset(disk, zone); + wp_offset = blk_zone_wp_offset(zone); if (!wp_offset || wp_offset >= zone->capacity) return 0; @@ -1799,6 +1696,7 @@ int blk_revalidate_disk_zones(struct gendisk *disk) memalloc_noio_restore(noio_flag); return ret; } + ret = disk->fops->report_zones(disk, 0, UINT_MAX, blk_revalidate_zone_cb, &args); if (!ret) { @@ -1835,6 +1733,48 @@ int blk_revalidate_disk_zones(struct gendisk *disk) } EXPORT_SYMBOL_GPL(blk_revalidate_disk_zones); +/** + * blk_zone_issue_zeroout - zero-fill a block range in a zone + * @bdev: blockdev to write + * @sector: start sector + * @nr_sects: number of sectors to write + * @gfp_mask: memory allocation flags (for bio_alloc) + * + * Description: + * Zero-fill a block range in a zone (@sector must be equal to the zone write + * pointer), handling potential errors due to the (initially unknown) lack of + * hardware offload (See blkdev_issue_zeroout()). + */ +int blk_zone_issue_zeroout(struct block_device *bdev, sector_t sector, + sector_t nr_sects, gfp_t gfp_mask) +{ + int ret; + + if (WARN_ON_ONCE(!bdev_is_zoned(bdev))) + return -EIO; + + ret = blkdev_issue_zeroout(bdev, sector, nr_sects, gfp_mask, + BLKDEV_ZERO_NOFALLBACK); + if (ret != -EOPNOTSUPP) + return ret; + + /* + * The failed call to blkdev_issue_zeroout() advanced the zone write + * pointer. Undo this using a report zone to update the zone write + * pointer to the correct current value. + */ + ret = disk_zone_sync_wp_offset(bdev->bd_disk, sector); + if (ret != 1) + return ret < 0 ? ret : -EIO; + + /* + * Retry without BLKDEV_ZERO_NOFALLBACK to force the fallback to a + * regular write with zero-pages. 
+ */ + return blkdev_issue_zeroout(bdev, sector, nr_sects, gfp_mask, 0); +} +EXPORT_SYMBOL_GPL(blk_zone_issue_zeroout); + #ifdef CONFIG_BLK_DEBUG_FS int queue_zone_wplugs_show(void *data, struct seq_file *m) diff --git a/block/mq-deadline.c b/block/mq-deadline.c index 91b3789f710e..5528347b5fcf 100644 --- a/block/mq-deadline.c +++ b/block/mq-deadline.c @@ -698,8 +698,6 @@ static void dd_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq, list_add(&rq->queuelist, &per_prio->dispatch); rq->fifo_time = jiffies; } else { - struct list_head *insert_before; - deadline_add_rq_rb(per_prio, rq); if (rq_mergeable(rq)) { @@ -712,8 +710,7 @@ static void dd_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq, * set expire time and add to fifo list */ rq->fifo_time = jiffies + dd->fifo_expire[data_dir]; - insert_before = &per_prio->fifo_list[data_dir]; - list_add_tail(&rq->queuelist, insert_before); + list_add_tail(&rq->queuelist, &per_prio->fifo_list[data_dir]); } } diff --git a/crypto/adiantum.c b/crypto/adiantum.c index 60f3883b736a..c3ef583598b4 100644 --- a/crypto/adiantum.c +++ b/crypto/adiantum.c @@ -646,4 +646,4 @@ MODULE_DESCRIPTION("Adiantum length-preserving encryption mode"); MODULE_LICENSE("GPL v2"); MODULE_AUTHOR("Eric Biggers "); MODULE_ALIAS_CRYPTO("adiantum"); -MODULE_IMPORT_NS(CRYPTO_INTERNAL); +MODULE_IMPORT_NS("CRYPTO_INTERNAL"); diff --git a/crypto/ansi_cprng.c b/crypto/ansi_cprng.c index 3f512efaba3a..64f57c4c4b06 100644 --- a/crypto/ansi_cprng.c +++ b/crypto/ansi_cprng.c @@ -471,4 +471,4 @@ subsys_initcall(prng_mod_init); module_exit(prng_mod_fini); MODULE_ALIAS_CRYPTO("stdrng"); MODULE_ALIAS_CRYPTO("ansi_cprng"); -MODULE_IMPORT_NS(CRYPTO_INTERNAL); +MODULE_IMPORT_NS("CRYPTO_INTERNAL"); diff --git a/crypto/ccm.c b/crypto/ccm.c index 36f0acec32e1..06476b53b491 100644 --- a/crypto/ccm.c +++ b/crypto/ccm.c @@ -949,4 +949,4 @@ MODULE_ALIAS_CRYPTO("ccm_base"); MODULE_ALIAS_CRYPTO("rfc4309"); MODULE_ALIAS_CRYPTO("ccm"); 
MODULE_ALIAS_CRYPTO("cbcmac"); -MODULE_IMPORT_NS(CRYPTO_INTERNAL); +MODULE_IMPORT_NS("CRYPTO_INTERNAL"); diff --git a/crypto/cipher.c b/crypto/cipher.c index 40cae908788e..1fe62bf79656 100644 --- a/crypto/cipher.c +++ b/crypto/cipher.c @@ -53,7 +53,7 @@ int crypto_cipher_setkey(struct crypto_cipher *tfm, return cia->cia_setkey(crypto_cipher_tfm(tfm), key, keylen); } -EXPORT_SYMBOL_NS_GPL(crypto_cipher_setkey, CRYPTO_INTERNAL); +EXPORT_SYMBOL_NS_GPL(crypto_cipher_setkey, "CRYPTO_INTERNAL"); static inline void cipher_crypt_one(struct crypto_cipher *tfm, u8 *dst, const u8 *src, bool enc) @@ -81,14 +81,14 @@ void crypto_cipher_encrypt_one(struct crypto_cipher *tfm, { cipher_crypt_one(tfm, dst, src, true); } -EXPORT_SYMBOL_NS_GPL(crypto_cipher_encrypt_one, CRYPTO_INTERNAL); +EXPORT_SYMBOL_NS_GPL(crypto_cipher_encrypt_one, "CRYPTO_INTERNAL"); void crypto_cipher_decrypt_one(struct crypto_cipher *tfm, u8 *dst, const u8 *src) { cipher_crypt_one(tfm, dst, src, false); } -EXPORT_SYMBOL_NS_GPL(crypto_cipher_decrypt_one, CRYPTO_INTERNAL); +EXPORT_SYMBOL_NS_GPL(crypto_cipher_decrypt_one, "CRYPTO_INTERNAL"); struct crypto_cipher *crypto_clone_cipher(struct crypto_cipher *cipher) { diff --git a/crypto/cmac.c b/crypto/cmac.c index c7aa3665b076..c66a0f4d8808 100644 --- a/crypto/cmac.c +++ b/crypto/cmac.c @@ -313,4 +313,4 @@ module_exit(crypto_cmac_module_exit); MODULE_LICENSE("GPL"); MODULE_DESCRIPTION("CMAC keyed hash algorithm"); MODULE_ALIAS_CRYPTO("cmac"); -MODULE_IMPORT_NS(CRYPTO_INTERNAL); +MODULE_IMPORT_NS("CRYPTO_INTERNAL"); diff --git a/crypto/ctr.c b/crypto/ctr.c index 1420496062d5..73c0d6e53b2f 100644 --- a/crypto/ctr.c +++ b/crypto/ctr.c @@ -357,4 +357,4 @@ MODULE_LICENSE("GPL"); MODULE_DESCRIPTION("CTR block cipher mode of operation"); MODULE_ALIAS_CRYPTO("rfc3686"); MODULE_ALIAS_CRYPTO("ctr"); -MODULE_IMPORT_NS(CRYPTO_INTERNAL); +MODULE_IMPORT_NS("CRYPTO_INTERNAL"); diff --git a/crypto/drbg.c b/crypto/drbg.c index c323f40bed4f..f28dfc2511a2 100644 --- a/crypto/drbg.c 
+++ b/crypto/drbg.c @@ -2151,4 +2151,4 @@ MODULE_DESCRIPTION("NIST SP800-90A Deterministic Random Bit Generator (DRBG) " CRYPTO_DRBG_HMAC_STRING CRYPTO_DRBG_CTR_STRING); MODULE_ALIAS_CRYPTO("stdrng"); -MODULE_IMPORT_NS(CRYPTO_INTERNAL); +MODULE_IMPORT_NS("CRYPTO_INTERNAL"); diff --git a/crypto/ecb.c b/crypto/ecb.c index e3a67789050e..95d7e972865a 100644 --- a/crypto/ecb.c +++ b/crypto/ecb.c @@ -225,4 +225,4 @@ module_exit(crypto_ecb_module_exit); MODULE_LICENSE("GPL"); MODULE_DESCRIPTION("ECB block cipher mode of operation"); MODULE_ALIAS_CRYPTO("ecb"); -MODULE_IMPORT_NS(CRYPTO_INTERNAL); +MODULE_IMPORT_NS("CRYPTO_INTERNAL"); diff --git a/crypto/essiv.c b/crypto/essiv.c index e63fc6442e32..1c00c3324058 100644 --- a/crypto/essiv.c +++ b/crypto/essiv.c @@ -649,4 +649,4 @@ module_exit(essiv_module_exit); MODULE_DESCRIPTION("ESSIV skcipher/aead wrapper for block encryption"); MODULE_LICENSE("GPL v2"); MODULE_ALIAS_CRYPTO("essiv"); -MODULE_IMPORT_NS(CRYPTO_INTERNAL); +MODULE_IMPORT_NS("CRYPTO_INTERNAL"); diff --git a/crypto/hctr2.c b/crypto/hctr2.c index 87e7547ad186..cbcd673be481 100644 --- a/crypto/hctr2.c +++ b/crypto/hctr2.c @@ -576,4 +576,4 @@ module_exit(hctr2_module_exit); MODULE_DESCRIPTION("HCTR2 length-preserving encryption mode"); MODULE_LICENSE("GPL v2"); MODULE_ALIAS_CRYPTO("hctr2"); -MODULE_IMPORT_NS(CRYPTO_INTERNAL); +MODULE_IMPORT_NS("CRYPTO_INTERNAL"); diff --git a/crypto/keywrap.c b/crypto/keywrap.c index 054d9a216fc9..385ffdfd5a9b 100644 --- a/crypto/keywrap.c +++ b/crypto/keywrap.c @@ -317,4 +317,4 @@ MODULE_LICENSE("Dual BSD/GPL"); MODULE_AUTHOR("Stephan Mueller "); MODULE_DESCRIPTION("Key Wrapping (RFC3394 / NIST SP800-38F)"); MODULE_ALIAS_CRYPTO("kw"); -MODULE_IMPORT_NS(CRYPTO_INTERNAL); +MODULE_IMPORT_NS("CRYPTO_INTERNAL"); diff --git a/crypto/pcbc.c b/crypto/pcbc.c index ab469ba50c13..cbfb3ac14b3a 100644 --- a/crypto/pcbc.c +++ b/crypto/pcbc.c @@ -192,4 +192,4 @@ module_exit(crypto_pcbc_module_exit); MODULE_LICENSE("GPL"); 
MODULE_DESCRIPTION("PCBC block cipher mode of operation"); MODULE_ALIAS_CRYPTO("pcbc"); -MODULE_IMPORT_NS(CRYPTO_INTERNAL); +MODULE_IMPORT_NS("CRYPTO_INTERNAL"); diff --git a/crypto/rsassa-pkcs1.c b/crypto/rsassa-pkcs1.c index 4d077fc96076..f68ffd338f48 100644 --- a/crypto/rsassa-pkcs1.c +++ b/crypto/rsassa-pkcs1.c @@ -163,10 +163,6 @@ static int rsassa_pkcs1_sign(struct crypto_sig *tfm, struct rsassa_pkcs1_inst_ctx *ictx = sig_instance_ctx(inst); const struct hash_prefix *hash_prefix = ictx->hash_prefix; struct rsassa_pkcs1_ctx *ctx = crypto_sig_ctx(tfm); - unsigned int child_reqsize = crypto_akcipher_reqsize(ctx->child); - struct akcipher_request *child_req __free(kfree_sensitive) = NULL; - struct scatterlist in_sg[3], out_sg; - struct crypto_wait cwait; unsigned int pad_len; unsigned int ps_end; unsigned int len; @@ -187,37 +183,25 @@ static int rsassa_pkcs1_sign(struct crypto_sig *tfm, pad_len = ctx->key_size - slen - hash_prefix->size - 1; - child_req = kmalloc(sizeof(*child_req) + child_reqsize + pad_len, - GFP_KERNEL); - if (!child_req) - return -ENOMEM; - /* RFC 8017 sec 8.2.1 step 1 - EMSA-PKCS1-v1_5 encoding generation */ - in_buf = (u8 *)(child_req + 1) + child_reqsize; + in_buf = dst; + memmove(in_buf + pad_len + hash_prefix->size, src, slen); + memcpy(in_buf + pad_len, hash_prefix->data, hash_prefix->size); + ps_end = pad_len - 1; in_buf[0] = 0x01; memset(in_buf + 1, 0xff, ps_end - 1); in_buf[ps_end] = 0x00; - /* RFC 8017 sec 8.2.1 step 2 - RSA signature */ - crypto_init_wait(&cwait); - sg_init_table(in_sg, 3); - sg_set_buf(&in_sg[0], in_buf, pad_len); - sg_set_buf(&in_sg[1], hash_prefix->data, hash_prefix->size); - sg_set_buf(&in_sg[2], src, slen); - sg_init_one(&out_sg, dst, dlen); - akcipher_request_set_tfm(child_req, ctx->child); - akcipher_request_set_crypt(child_req, in_sg, &out_sg, - ctx->key_size - 1, dlen); - akcipher_request_set_callback(child_req, CRYPTO_TFM_REQ_MAY_SLEEP, - crypto_req_done, &cwait); - err = 
crypto_akcipher_decrypt(child_req); - err = crypto_wait_req(err, &cwait); - if (err) + /* RFC 8017 sec 8.2.1 step 2 - RSA signature */ + err = crypto_akcipher_sync_decrypt(ctx->child, in_buf, + ctx->key_size - 1, in_buf, + ctx->key_size); + if (err < 0) return err; - len = child_req->dst_len; + len = err; pad_len = ctx->key_size - len; /* Four billion to one */ @@ -239,8 +223,8 @@ static int rsassa_pkcs1_verify(struct crypto_sig *tfm, struct rsassa_pkcs1_ctx *ctx = crypto_sig_ctx(tfm); unsigned int child_reqsize = crypto_akcipher_reqsize(ctx->child); struct akcipher_request *child_req __free(kfree_sensitive) = NULL; - struct scatterlist in_sg, out_sg; struct crypto_wait cwait; + struct scatterlist sg; unsigned int dst_len; unsigned int pos; u8 *out_buf; @@ -259,13 +243,12 @@ static int rsassa_pkcs1_verify(struct crypto_sig *tfm, return -ENOMEM; out_buf = (u8 *)(child_req + 1) + child_reqsize; + memcpy(out_buf, src, slen); crypto_init_wait(&cwait); - sg_init_one(&in_sg, src, slen); - sg_init_one(&out_sg, out_buf, ctx->key_size); + sg_init_one(&sg, out_buf, slen); akcipher_request_set_tfm(child_req, ctx->child); - akcipher_request_set_crypt(child_req, &in_sg, &out_sg, - slen, ctx->key_size); + akcipher_request_set_crypt(child_req, &sg, &sg, slen, slen); akcipher_request_set_callback(child_req, CRYPTO_TFM_REQ_MAY_SLEEP, crypto_req_done, &cwait); diff --git a/crypto/skcipher.c b/crypto/skcipher.c index ceed7f33a67b..f74e4d0d87a2 100644 --- a/crypto/skcipher.c +++ b/crypto/skcipher.c @@ -1085,4 +1085,4 @@ EXPORT_SYMBOL_GPL(skcipher_alloc_instance_simple); MODULE_LICENSE("GPL"); MODULE_DESCRIPTION("Symmetric key cipher type"); -MODULE_IMPORT_NS(CRYPTO_INTERNAL); +MODULE_IMPORT_NS("CRYPTO_INTERNAL"); diff --git a/crypto/testmgr.c b/crypto/testmgr.c index 3fc908bac21a..1f5f48ab18c7 100644 --- a/crypto/testmgr.c +++ b/crypto/testmgr.c @@ -39,7 +39,7 @@ #include "internal.h" -MODULE_IMPORT_NS(CRYPTO_INTERNAL); +MODULE_IMPORT_NS("CRYPTO_INTERNAL"); static bool notests; 
module_param(notests, bool, 0644); diff --git a/crypto/vmac.c b/crypto/vmac.c index bd9d70eac22e..2ea384645ecf 100644 --- a/crypto/vmac.c +++ b/crypto/vmac.c @@ -693,4 +693,4 @@ module_exit(vmac_module_exit); MODULE_LICENSE("GPL"); MODULE_DESCRIPTION("VMAC hash algorithm"); MODULE_ALIAS_CRYPTO("vmac64"); -MODULE_IMPORT_NS(CRYPTO_INTERNAL); +MODULE_IMPORT_NS("CRYPTO_INTERNAL"); diff --git a/crypto/xcbc.c b/crypto/xcbc.c index a9e8ee9c1949..fc785667b134 100644 --- a/crypto/xcbc.c +++ b/crypto/xcbc.c @@ -261,4 +261,4 @@ module_exit(crypto_xcbc_module_exit); MODULE_LICENSE("GPL"); MODULE_DESCRIPTION("XCBC keyed hash algorithm"); MODULE_ALIAS_CRYPTO("xcbc"); -MODULE_IMPORT_NS(CRYPTO_INTERNAL); +MODULE_IMPORT_NS("CRYPTO_INTERNAL"); diff --git a/crypto/xctr.c b/crypto/xctr.c index 5c00147e8ec4..6ed9c85ededa 100644 --- a/crypto/xctr.c +++ b/crypto/xctr.c @@ -188,4 +188,4 @@ module_exit(crypto_xctr_module_exit); MODULE_LICENSE("GPL"); MODULE_DESCRIPTION("XCTR block cipher mode of operation"); MODULE_ALIAS_CRYPTO("xctr"); -MODULE_IMPORT_NS(CRYPTO_INTERNAL); +MODULE_IMPORT_NS("CRYPTO_INTERNAL"); diff --git a/crypto/xts.c b/crypto/xts.c index 672e1a3f0b0c..821060ede2cf 100644 --- a/crypto/xts.c +++ b/crypto/xts.c @@ -472,5 +472,5 @@ module_exit(xts_module_exit); MODULE_LICENSE("GPL"); MODULE_DESCRIPTION("XTS block cipher mode"); MODULE_ALIAS_CRYPTO("xts"); -MODULE_IMPORT_NS(CRYPTO_INTERNAL); +MODULE_IMPORT_NS("CRYPTO_INTERNAL"); MODULE_SOFTDEP("pre: ecb"); diff --git a/drivers/accel/Kconfig b/drivers/accel/Kconfig index 64065fb8922b..5b9490367a39 100644 --- a/drivers/accel/Kconfig +++ b/drivers/accel/Kconfig @@ -24,6 +24,7 @@ menuconfig DRM_ACCEL different device files, called accel/accel* (in /dev, sysfs and debugfs). 
+source "drivers/accel/amdxdna/Kconfig" source "drivers/accel/habanalabs/Kconfig" source "drivers/accel/ivpu/Kconfig" source "drivers/accel/qaic/Kconfig" diff --git a/drivers/accel/Makefile b/drivers/accel/Makefile index ab3df932937f..a301fb6089d4 100644 --- a/drivers/accel/Makefile +++ b/drivers/accel/Makefile @@ -1,5 +1,6 @@ # SPDX-License-Identifier: GPL-2.0-only +obj-$(CONFIG_DRM_ACCEL_AMDXDNA) += amdxdna/ obj-$(CONFIG_DRM_ACCEL_HABANALABS) += habanalabs/ obj-$(CONFIG_DRM_ACCEL_IVPU) += ivpu/ obj-$(CONFIG_DRM_ACCEL_QAIC) += qaic/ diff --git a/drivers/accel/amdxdna/Kconfig b/drivers/accel/amdxdna/Kconfig new file mode 100644 index 000000000000..f39d7a87296c --- /dev/null +++ b/drivers/accel/amdxdna/Kconfig @@ -0,0 +1,18 @@ +# SPDX-License-Identifier: GPL-2.0-only + +config DRM_ACCEL_AMDXDNA + tristate "AMD AI Engine" + depends on AMD_IOMMU + depends on DRM_ACCEL + depends on PCI && HAS_IOMEM + depends on X86_64 + select DRM_SCHED + select DRM_GEM_SHMEM_HELPER + select FW_LOADER + select HMM_MIRROR + help + Choose this option to enable support for NPU integrated into AMD + client CPUs like AMD Ryzen AI 300 Series. AMD NPU can be used to + accelerate machine learning applications. + + If "M" is selected, the driver module will be amdxdna. 
diff --git a/drivers/accel/amdxdna/Makefile b/drivers/accel/amdxdna/Makefile new file mode 100644 index 000000000000..0e9adf6890a0 --- /dev/null +++ b/drivers/accel/amdxdna/Makefile @@ -0,0 +1,23 @@ +# SPDX-License-Identifier: GPL-2.0-only + +amdxdna-y := \ + aie2_ctx.o \ + aie2_error.o \ + aie2_message.o \ + aie2_pci.o \ + aie2_pm.o \ + aie2_psp.o \ + aie2_smu.o \ + aie2_solver.o \ + amdxdna_ctx.o \ + amdxdna_gem.o \ + amdxdna_mailbox.o \ + amdxdna_mailbox_helper.o \ + amdxdna_pci_drv.o \ + amdxdna_sysfs.o \ + npu1_regs.o \ + npu2_regs.o \ + npu4_regs.o \ + npu5_regs.o \ + npu6_regs.o +obj-$(CONFIG_DRM_ACCEL_AMDXDNA) = amdxdna.o diff --git a/drivers/accel/amdxdna/TODO b/drivers/accel/amdxdna/TODO new file mode 100644 index 000000000000..5119bccd1917 --- /dev/null +++ b/drivers/accel/amdxdna/TODO @@ -0,0 +1,3 @@ +- Add import and export BO support +- Add debugfs support +- Add debug BO support diff --git a/drivers/accel/amdxdna/aie2_ctx.c b/drivers/accel/amdxdna/aie2_ctx.c new file mode 100644 index 000000000000..5f43db02b240 --- /dev/null +++ b/drivers/accel/amdxdna/aie2_ctx.c @@ -0,0 +1,910 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2024, Advanced Micro Devices, Inc. 
+ */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "aie2_msg_priv.h" +#include "aie2_pci.h" +#include "aie2_solver.h" +#include "amdxdna_ctx.h" +#include "amdxdna_gem.h" +#include "amdxdna_mailbox.h" +#include "amdxdna_pci_drv.h" + +static bool force_cmdlist; +module_param(force_cmdlist, bool, 0600); +MODULE_PARM_DESC(force_cmdlist, "Force use command list (Default false)"); + +#define HWCTX_MAX_TIMEOUT 60000 /* milliseconds */ + +static void aie2_job_release(struct kref *ref) +{ + struct amdxdna_sched_job *job; + + job = container_of(ref, struct amdxdna_sched_job, refcnt); + amdxdna_sched_job_cleanup(job); + if (job->out_fence) + dma_fence_put(job->out_fence); + kfree(job); +} + +static void aie2_job_put(struct amdxdna_sched_job *job) +{ + kref_put(&job->refcnt, aie2_job_release); +} + +/* The bad_job is used in aie2_sched_job_timedout, otherwise, set it to NULL */ +static void aie2_hwctx_stop(struct amdxdna_dev *xdna, struct amdxdna_hwctx *hwctx, + struct drm_sched_job *bad_job) +{ + drm_sched_stop(&hwctx->priv->sched, bad_job); + aie2_destroy_context(xdna->dev_handle, hwctx); +} + +static int aie2_hwctx_restart(struct amdxdna_dev *xdna, struct amdxdna_hwctx *hwctx) +{ + struct amdxdna_gem_obj *heap = hwctx->priv->heap; + int ret; + + ret = aie2_create_context(xdna->dev_handle, hwctx); + if (ret) { + XDNA_ERR(xdna, "Create hwctx failed, ret %d", ret); + goto out; + } + + ret = aie2_map_host_buf(xdna->dev_handle, hwctx->fw_ctx_id, + heap->mem.userptr, heap->mem.size); + if (ret) { + XDNA_ERR(xdna, "Map host buf failed, ret %d", ret); + goto out; + } + + if (hwctx->status != HWCTX_STAT_READY) { + XDNA_DBG(xdna, "hwctx is not ready, status %d", hwctx->status); + goto out; + } + + ret = aie2_config_cu(hwctx); + if (ret) { + XDNA_ERR(xdna, "Config cu failed, ret %d", ret); + goto out; + } + +out: + drm_sched_start(&hwctx->priv->sched, 0); + XDNA_DBG(xdna, "%s restarted, ret %d", hwctx->name, ret); + 
return ret; +} + +void aie2_restart_ctx(struct amdxdna_client *client) +{ + struct amdxdna_dev *xdna = client->xdna; + struct amdxdna_hwctx *hwctx; + unsigned long hwctx_id; + + drm_WARN_ON(&xdna->ddev, !mutex_is_locked(&xdna->dev_lock)); + mutex_lock(&client->hwctx_lock); + amdxdna_for_each_hwctx(client, hwctx_id, hwctx) { + if (hwctx->status != HWCTX_STAT_STOP) + continue; + + hwctx->status = hwctx->old_status; + XDNA_DBG(xdna, "Resetting %s", hwctx->name); + aie2_hwctx_restart(xdna, hwctx); + } + mutex_unlock(&client->hwctx_lock); +} + +static struct dma_fence *aie2_cmd_get_out_fence(struct amdxdna_hwctx *hwctx, u64 seq) +{ + struct dma_fence *fence, *out_fence = NULL; + int ret; + + fence = drm_syncobj_fence_get(hwctx->priv->syncobj); + if (!fence) + return NULL; + + ret = dma_fence_chain_find_seqno(&fence, seq); + if (ret) + goto out; + + out_fence = dma_fence_get(dma_fence_chain_contained(fence)); + +out: + dma_fence_put(fence); + return out_fence; +} + +static void aie2_hwctx_wait_for_idle(struct amdxdna_hwctx *hwctx) +{ + struct dma_fence *fence; + + fence = aie2_cmd_get_out_fence(hwctx, hwctx->priv->seq - 1); + if (!fence) + return; + + dma_fence_wait(fence, false); + dma_fence_put(fence); +} + +void aie2_hwctx_suspend(struct amdxdna_hwctx *hwctx) +{ + struct amdxdna_dev *xdna = hwctx->client->xdna; + + /* + * Command timeout is unlikely. But if it happens, it doesn't + * break the system. aie2_hwctx_stop() will destroy mailbox + * and abort all commands. + */ + drm_WARN_ON(&xdna->ddev, !mutex_is_locked(&xdna->dev_lock)); + aie2_hwctx_wait_for_idle(hwctx); + aie2_hwctx_stop(xdna, hwctx, NULL); + hwctx->old_status = hwctx->status; + hwctx->status = HWCTX_STAT_STOP; +} + +void aie2_hwctx_resume(struct amdxdna_hwctx *hwctx) +{ + struct amdxdna_dev *xdna = hwctx->client->xdna; + + /* + * The resume path cannot guarantee that mailbox channel can be + * regenerated. If this happen, when submit message to this + * mailbox channel, error will return. 
+ */ + drm_WARN_ON(&xdna->ddev, !mutex_is_locked(&xdna->dev_lock)); + hwctx->status = hwctx->old_status; + aie2_hwctx_restart(xdna, hwctx); +} + +static void +aie2_sched_notify(struct amdxdna_sched_job *job) +{ + struct dma_fence *fence = job->fence; + + trace_xdna_job(&job->base, job->hwctx->name, "signaled fence", job->seq); + job->hwctx->priv->completed++; + dma_fence_signal(fence); + + up(&job->hwctx->priv->job_sem); + job->job_done = true; + dma_fence_put(fence); + mmput_async(job->mm); + aie2_job_put(job); +} + +static int +aie2_sched_resp_handler(void *handle, const u32 *data, size_t size) +{ + struct amdxdna_sched_job *job = handle; + struct amdxdna_gem_obj *cmd_abo; + u32 ret = 0; + u32 status; + + cmd_abo = job->cmd_bo; + + if (unlikely(!data)) + goto out; + + if (unlikely(size != sizeof(u32))) { + amdxdna_cmd_set_state(cmd_abo, ERT_CMD_STATE_ABORT); + ret = -EINVAL; + goto out; + } + + status = *data; + XDNA_DBG(job->hwctx->client->xdna, "Resp status 0x%x", status); + if (status == AIE2_STATUS_SUCCESS) + amdxdna_cmd_set_state(cmd_abo, ERT_CMD_STATE_COMPLETED); + else + amdxdna_cmd_set_state(cmd_abo, ERT_CMD_STATE_ERROR); + +out: + aie2_sched_notify(job); + return ret; +} + +static int +aie2_sched_nocmd_resp_handler(void *handle, const u32 *data, size_t size) +{ + struct amdxdna_sched_job *job = handle; + u32 ret = 0; + u32 status; + + if (unlikely(!data)) + goto out; + + if (unlikely(size != sizeof(u32))) { + ret = -EINVAL; + goto out; + } + + status = *data; + XDNA_DBG(job->hwctx->client->xdna, "Resp status 0x%x", status); + +out: + aie2_sched_notify(job); + return ret; +} + +static int +aie2_sched_cmdlist_resp_handler(void *handle, const u32 *data, size_t size) +{ + struct amdxdna_sched_job *job = handle; + struct amdxdna_gem_obj *cmd_abo; + struct cmd_chain_resp *resp; + struct amdxdna_dev *xdna; + u32 fail_cmd_status; + u32 fail_cmd_idx; + u32 ret = 0; + + cmd_abo = job->cmd_bo; + if (unlikely(!data) || unlikely(size != sizeof(u32) * 3)) { + 
amdxdna_cmd_set_state(cmd_abo, ERT_CMD_STATE_ABORT); + ret = -EINVAL; + goto out; + } + + resp = (struct cmd_chain_resp *)data; + xdna = job->hwctx->client->xdna; + XDNA_DBG(xdna, "Status 0x%x", resp->status); + if (resp->status == AIE2_STATUS_SUCCESS) { + amdxdna_cmd_set_state(cmd_abo, ERT_CMD_STATE_COMPLETED); + goto out; + } + + /* Slow path to handle error, read from ringbuf on BAR */ + fail_cmd_idx = resp->fail_cmd_idx; + fail_cmd_status = resp->fail_cmd_status; + XDNA_DBG(xdna, "Failed cmd idx %d, status 0x%x", + fail_cmd_idx, fail_cmd_status); + + if (fail_cmd_status == AIE2_STATUS_SUCCESS) { + amdxdna_cmd_set_state(cmd_abo, ERT_CMD_STATE_ABORT); + ret = -EINVAL; + goto out; + } + amdxdna_cmd_set_state(cmd_abo, fail_cmd_status); + + if (amdxdna_cmd_get_op(cmd_abo) == ERT_CMD_CHAIN) { + struct amdxdna_cmd_chain *cc = amdxdna_cmd_get_payload(cmd_abo, NULL); + + cc->error_index = fail_cmd_idx; + if (cc->error_index >= cc->command_count) + cc->error_index = 0; + } +out: + aie2_sched_notify(job); + return ret; +} + +static struct dma_fence * +aie2_sched_job_run(struct drm_sched_job *sched_job) +{ + struct amdxdna_sched_job *job = drm_job_to_xdna_job(sched_job); + struct amdxdna_gem_obj *cmd_abo = job->cmd_bo; + struct amdxdna_hwctx *hwctx = job->hwctx; + struct dma_fence *fence; + int ret; + + if (!mmget_not_zero(job->mm)) + return ERR_PTR(-ESRCH); + + kref_get(&job->refcnt); + fence = dma_fence_get(job->fence); + + if (unlikely(!cmd_abo)) { + ret = aie2_sync_bo(hwctx, job, aie2_sched_nocmd_resp_handler); + goto out; + } + + amdxdna_cmd_set_state(cmd_abo, ERT_CMD_STATE_NEW); + + if (amdxdna_cmd_get_op(cmd_abo) == ERT_CMD_CHAIN) + ret = aie2_cmdlist_multi_execbuf(hwctx, job, aie2_sched_cmdlist_resp_handler); + else if (force_cmdlist) + ret = aie2_cmdlist_single_execbuf(hwctx, job, aie2_sched_cmdlist_resp_handler); + else + ret = aie2_execbuf(hwctx, job, aie2_sched_resp_handler); + +out: + if (ret) { + dma_fence_put(job->fence); + aie2_job_put(job); + 
mmput(job->mm); + fence = ERR_PTR(ret); + } + trace_xdna_job(sched_job, hwctx->name, "sent to device", job->seq); + + return fence; +} + +static void aie2_sched_job_free(struct drm_sched_job *sched_job) +{ + struct amdxdna_sched_job *job = drm_job_to_xdna_job(sched_job); + struct amdxdna_hwctx *hwctx = job->hwctx; + + trace_xdna_job(sched_job, hwctx->name, "job free", job->seq); + if (!job->job_done) + up(&hwctx->priv->job_sem); + + drm_sched_job_cleanup(sched_job); + aie2_job_put(job); +} + +static enum drm_gpu_sched_stat +aie2_sched_job_timedout(struct drm_sched_job *sched_job) +{ + struct amdxdna_sched_job *job = drm_job_to_xdna_job(sched_job); + struct amdxdna_hwctx *hwctx = job->hwctx; + struct amdxdna_dev *xdna; + + xdna = hwctx->client->xdna; + trace_xdna_job(sched_job, hwctx->name, "job timedout", job->seq); + mutex_lock(&xdna->dev_lock); + aie2_hwctx_stop(xdna, hwctx, sched_job); + + aie2_hwctx_restart(xdna, hwctx); + mutex_unlock(&xdna->dev_lock); + + return DRM_GPU_SCHED_STAT_NOMINAL; +} + +const struct drm_sched_backend_ops sched_ops = { + .run_job = aie2_sched_job_run, + .free_job = aie2_sched_job_free, + .timedout_job = aie2_sched_job_timedout, +}; + +static int aie2_hwctx_col_list(struct amdxdna_hwctx *hwctx) +{ + struct amdxdna_dev *xdna = hwctx->client->xdna; + struct amdxdna_dev_hdl *ndev; + int start, end, first, last; + u32 width = 1, entries = 0; + int i; + + if (!hwctx->num_tiles) { + XDNA_ERR(xdna, "Number of tiles is zero"); + return -EINVAL; + } + + ndev = xdna->dev_handle; + if (unlikely(!ndev->metadata.core.row_count)) { + XDNA_WARN(xdna, "Core tile row count is zero"); + return -EINVAL; + } + + hwctx->num_col = hwctx->num_tiles / ndev->metadata.core.row_count; + if (!hwctx->num_col || hwctx->num_col > ndev->total_col) { + XDNA_ERR(xdna, "Invalid num_col %d", hwctx->num_col); + return -EINVAL; + } + + if (ndev->priv->col_align == COL_ALIGN_NATURE) + width = hwctx->num_col; + + /* + * In range [start, end], find out columns that is 
multiple of width. + * 'first' is the first column, + * 'last' is the last column, + * 'entries' is the total number of columns. + */ + start = xdna->dev_info->first_col; + end = ndev->total_col - hwctx->num_col; + if (start > 0 && end == 0) { + XDNA_DBG(xdna, "Force start from col 0"); + start = 0; + } + first = start + (width - start % width) % width; + last = end - end % width; + if (last >= first) + entries = (last - first) / width + 1; + XDNA_DBG(xdna, "start %d end %d first %d last %d", + start, end, first, last); + + if (unlikely(!entries)) { + XDNA_ERR(xdna, "Start %d end %d width %d", + start, end, width); + return -EINVAL; + } + + hwctx->col_list = kmalloc_array(entries, sizeof(*hwctx->col_list), GFP_KERNEL); + if (!hwctx->col_list) + return -ENOMEM; + + hwctx->col_list_len = entries; + hwctx->col_list[0] = first; + for (i = 1; i < entries; i++) + hwctx->col_list[i] = hwctx->col_list[i - 1] + width; + + print_hex_dump_debug("col_list: ", DUMP_PREFIX_OFFSET, 16, 4, hwctx->col_list, + entries * sizeof(*hwctx->col_list), false); + return 0; +} + +static int aie2_alloc_resource(struct amdxdna_hwctx *hwctx) +{ + struct amdxdna_dev *xdna = hwctx->client->xdna; + struct alloc_requests *xrs_req; + int ret; + + xrs_req = kzalloc(sizeof(*xrs_req), GFP_KERNEL); + if (!xrs_req) + return -ENOMEM; + + xrs_req->cdo.start_cols = hwctx->col_list; + xrs_req->cdo.cols_len = hwctx->col_list_len; + xrs_req->cdo.ncols = hwctx->num_col; + xrs_req->cdo.qos_cap.opc = hwctx->max_opc; + + xrs_req->rqos.gops = hwctx->qos.gops; + xrs_req->rqos.fps = hwctx->qos.fps; + xrs_req->rqos.dma_bw = hwctx->qos.dma_bandwidth; + xrs_req->rqos.latency = hwctx->qos.latency; + xrs_req->rqos.exec_time = hwctx->qos.frame_exec_time; + xrs_req->rqos.priority = hwctx->qos.priority; + + xrs_req->rid = (uintptr_t)hwctx; + + ret = xrs_allocate_resource(xdna->xrs_hdl, xrs_req, hwctx); + if (ret) + XDNA_ERR(xdna, "Allocate AIE resource failed, ret %d", ret); + + kfree(xrs_req); + return ret; +} + +static 
void aie2_release_resource(struct amdxdna_hwctx *hwctx) +{ + struct amdxdna_dev *xdna = hwctx->client->xdna; + int ret; + + ret = xrs_release_resource(xdna->xrs_hdl, (uintptr_t)hwctx); + if (ret) + XDNA_ERR(xdna, "Release AIE resource failed, ret %d", ret); +} + +static int aie2_ctx_syncobj_create(struct amdxdna_hwctx *hwctx) +{ + struct amdxdna_dev *xdna = hwctx->client->xdna; + struct drm_file *filp = hwctx->client->filp; + struct drm_syncobj *syncobj; + u32 hdl; + int ret; + + hwctx->syncobj_hdl = AMDXDNA_INVALID_FENCE_HANDLE; + + ret = drm_syncobj_create(&syncobj, 0, NULL); + if (ret) { + XDNA_ERR(xdna, "Create ctx syncobj failed, ret %d", ret); + return ret; + } + ret = drm_syncobj_get_handle(filp, syncobj, &hdl); + if (ret) { + drm_syncobj_put(syncobj); + XDNA_ERR(xdna, "Create ctx syncobj handle failed, ret %d", ret); + return ret; + } + hwctx->priv->syncobj = syncobj; + hwctx->syncobj_hdl = hdl; + + return 0; +} + +static void aie2_ctx_syncobj_destroy(struct amdxdna_hwctx *hwctx) +{ + /* + * The syncobj_hdl is owned by user space and will be cleaned up + * separately. 
+ */ + drm_syncobj_put(hwctx->priv->syncobj); +} + +int aie2_hwctx_init(struct amdxdna_hwctx *hwctx) +{ + struct amdxdna_client *client = hwctx->client; + struct amdxdna_dev *xdna = client->xdna; + struct drm_gpu_scheduler *sched; + struct amdxdna_hwctx_priv *priv; + struct amdxdna_gem_obj *heap; + struct amdxdna_dev_hdl *ndev; + int i, ret; + + priv = kzalloc(sizeof(*hwctx->priv), GFP_KERNEL); + if (!priv) + return -ENOMEM; + hwctx->priv = priv; + + mutex_lock(&client->mm_lock); + heap = client->dev_heap; + if (!heap) { + XDNA_ERR(xdna, "The client dev heap object not exist"); + mutex_unlock(&client->mm_lock); + ret = -ENOENT; + goto free_priv; + } + drm_gem_object_get(to_gobj(heap)); + mutex_unlock(&client->mm_lock); + priv->heap = heap; + sema_init(&priv->job_sem, HWCTX_MAX_CMDS); + + ret = amdxdna_gem_pin(heap); + if (ret) { + XDNA_ERR(xdna, "Dev heap pin failed, ret %d", ret); + goto put_heap; + } + + for (i = 0; i < ARRAY_SIZE(priv->cmd_buf); i++) { + struct amdxdna_gem_obj *abo; + struct amdxdna_drm_create_bo args = { + .flags = 0, + .type = AMDXDNA_BO_DEV, + .vaddr = 0, + .size = MAX_CHAIN_CMDBUF_SIZE, + }; + + abo = amdxdna_drm_alloc_dev_bo(&xdna->ddev, &args, client->filp, true); + if (IS_ERR(abo)) { + ret = PTR_ERR(abo); + goto free_cmd_bufs; + } + + XDNA_DBG(xdna, "Command buf %d addr 0x%llx size 0x%lx", + i, abo->mem.dev_addr, abo->mem.size); + priv->cmd_buf[i] = abo; + } + + sched = &priv->sched; + mutex_init(&priv->io_lock); + + fs_reclaim_acquire(GFP_KERNEL); + might_lock(&priv->io_lock); + fs_reclaim_release(GFP_KERNEL); + + ret = drm_sched_init(sched, &sched_ops, NULL, DRM_SCHED_PRIORITY_COUNT, + HWCTX_MAX_CMDS, 0, msecs_to_jiffies(HWCTX_MAX_TIMEOUT), + NULL, NULL, hwctx->name, xdna->ddev.dev); + if (ret) { + XDNA_ERR(xdna, "Failed to init DRM scheduler. 
ret %d", ret); + goto free_cmd_bufs; + } + + ret = drm_sched_entity_init(&priv->entity, DRM_SCHED_PRIORITY_NORMAL, + &sched, 1, NULL); + if (ret) { + XDNA_ERR(xdna, "Failed to initial sched entiry. ret %d", ret); + goto free_sched; + } + + ret = aie2_hwctx_col_list(hwctx); + if (ret) { + XDNA_ERR(xdna, "Create col list failed, ret %d", ret); + goto free_entity; + } + + ret = aie2_alloc_resource(hwctx); + if (ret) { + XDNA_ERR(xdna, "Alloc hw resource failed, ret %d", ret); + goto free_col_list; + } + + ret = aie2_map_host_buf(xdna->dev_handle, hwctx->fw_ctx_id, + heap->mem.userptr, heap->mem.size); + if (ret) { + XDNA_ERR(xdna, "Map host buffer failed, ret %d", ret); + goto release_resource; + } + + ret = aie2_ctx_syncobj_create(hwctx); + if (ret) { + XDNA_ERR(xdna, "Create syncobj failed, ret %d", ret); + goto release_resource; + } + + hwctx->status = HWCTX_STAT_INIT; + ndev = xdna->dev_handle; + ndev->hwctx_num++; + + XDNA_DBG(xdna, "hwctx %s init completed", hwctx->name); + + return 0; + +release_resource: + aie2_release_resource(hwctx); +free_col_list: + kfree(hwctx->col_list); +free_entity: + drm_sched_entity_destroy(&priv->entity); +free_sched: + drm_sched_fini(&priv->sched); +free_cmd_bufs: + for (i = 0; i < ARRAY_SIZE(priv->cmd_buf); i++) { + if (!priv->cmd_buf[i]) + continue; + drm_gem_object_put(to_gobj(priv->cmd_buf[i])); + } + amdxdna_gem_unpin(heap); +put_heap: + drm_gem_object_put(to_gobj(heap)); +free_priv: + kfree(priv); + return ret; +} + +void aie2_hwctx_fini(struct amdxdna_hwctx *hwctx) +{ + struct amdxdna_dev_hdl *ndev; + struct amdxdna_dev *xdna; + int idx; + + xdna = hwctx->client->xdna; + ndev = xdna->dev_handle; + ndev->hwctx_num--; + drm_sched_wqueue_stop(&hwctx->priv->sched); + + /* Now, scheduler will not send command to device. */ + aie2_release_resource(hwctx); + + /* + * All submitted commands are aborted. + * Restart scheduler queues to cleanup jobs. The amdxdna_sched_job_run() + * will return NODEV if it is called. 
+ */ + drm_sched_wqueue_start(&hwctx->priv->sched); + + aie2_hwctx_wait_for_idle(hwctx); + drm_sched_entity_destroy(&hwctx->priv->entity); + drm_sched_fini(&hwctx->priv->sched); + aie2_ctx_syncobj_destroy(hwctx); + + XDNA_DBG(xdna, "%s sequence number %lld", hwctx->name, hwctx->priv->seq); + + for (idx = 0; idx < ARRAY_SIZE(hwctx->priv->cmd_buf); idx++) + drm_gem_object_put(to_gobj(hwctx->priv->cmd_buf[idx])); + amdxdna_gem_unpin(hwctx->priv->heap); + drm_gem_object_put(to_gobj(hwctx->priv->heap)); + + mutex_destroy(&hwctx->priv->io_lock); + kfree(hwctx->col_list); + kfree(hwctx->priv); + kfree(hwctx->cus); +} + +static int aie2_hwctx_cu_config(struct amdxdna_hwctx *hwctx, void *buf, u32 size) +{ + struct amdxdna_hwctx_param_config_cu *config = buf; + struct amdxdna_dev *xdna = hwctx->client->xdna; + u32 total_size; + int ret; + + XDNA_DBG(xdna, "Config %d CU to %s", config->num_cus, hwctx->name); + if (XDNA_MBZ_DBG(xdna, config->pad, sizeof(config->pad))) + return -EINVAL; + + if (hwctx->status != HWCTX_STAT_INIT) { + XDNA_ERR(xdna, "Not support re-config CU"); + return -EINVAL; + } + + if (!config->num_cus) { + XDNA_ERR(xdna, "Number of CU is zero"); + return -EINVAL; + } + + total_size = struct_size(config, cu_configs, config->num_cus); + if (total_size > size) { + XDNA_ERR(xdna, "CU config larger than size"); + return -EINVAL; + } + + hwctx->cus = kmemdup(config, total_size, GFP_KERNEL); + if (!hwctx->cus) + return -ENOMEM; + + ret = aie2_config_cu(hwctx); + if (ret) { + XDNA_ERR(xdna, "Config CU to firmware failed, ret %d", ret); + goto free_cus; + } + + wmb(); /* To avoid locking in command submit when check status */ + hwctx->status = HWCTX_STAT_READY; + + return 0; + +free_cus: + kfree(hwctx->cus); + hwctx->cus = NULL; + return ret; +} + +int aie2_hwctx_config(struct amdxdna_hwctx *hwctx, u32 type, u64 value, void *buf, u32 size) +{ + struct amdxdna_dev *xdna = hwctx->client->xdna; + + drm_WARN_ON(&xdna->ddev, !mutex_is_locked(&xdna->dev_lock)); + switch 
(type) { + case DRM_AMDXDNA_HWCTX_CONFIG_CU: + return aie2_hwctx_cu_config(hwctx, buf, size); + case DRM_AMDXDNA_HWCTX_ASSIGN_DBG_BUF: + case DRM_AMDXDNA_HWCTX_REMOVE_DBG_BUF: + return -EOPNOTSUPP; + default: + XDNA_DBG(xdna, "Not supported type %d", type); + return -EOPNOTSUPP; + } +} + +static int aie2_populate_range(struct amdxdna_gem_obj *abo) +{ + struct amdxdna_dev *xdna = to_xdna_dev(to_gobj(abo)->dev); + struct mm_struct *mm = abo->mem.notifier.mm; + struct hmm_range range = { 0 }; + unsigned long timeout; + int ret; + + XDNA_INFO_ONCE(xdna, "populate memory range %llx size %lx", + abo->mem.userptr, abo->mem.size); + range.notifier = &abo->mem.notifier; + range.start = abo->mem.userptr; + range.end = abo->mem.userptr + abo->mem.size; + range.hmm_pfns = abo->mem.pfns; + range.default_flags = HMM_PFN_REQ_FAULT; + + if (!mmget_not_zero(mm)) + return -EFAULT; + + timeout = jiffies + msecs_to_jiffies(HMM_RANGE_DEFAULT_TIMEOUT); +again: + range.notifier_seq = mmu_interval_read_begin(&abo->mem.notifier); + mmap_read_lock(mm); + ret = hmm_range_fault(&range); + mmap_read_unlock(mm); + if (ret) { + if (time_after(jiffies, timeout)) { + ret = -ETIME; + goto put_mm; + } + + if (ret == -EBUSY) + goto again; + + goto put_mm; + } + + down_read(&xdna->notifier_lock); + if (mmu_interval_read_retry(&abo->mem.notifier, range.notifier_seq)) { + up_read(&xdna->notifier_lock); + goto again; + } + abo->mem.map_invalid = false; + up_read(&xdna->notifier_lock); + +put_mm: + mmput(mm); + return ret; +} + +int aie2_cmd_submit(struct amdxdna_hwctx *hwctx, struct amdxdna_sched_job *job, u64 *seq) +{ + struct amdxdna_dev *xdna = hwctx->client->xdna; + struct ww_acquire_ctx acquire_ctx; + struct dma_fence_chain *chain; + struct amdxdna_gem_obj *abo; + unsigned long timeout = 0; + int ret, i; + + ret = down_interruptible(&hwctx->priv->job_sem); + if (ret) { + XDNA_ERR(xdna, "Grab job sem failed, ret %d", ret); + return ret; + } + + chain = dma_fence_chain_alloc(); + if (!chain) { + 
XDNA_ERR(xdna, "Alloc fence chain failed"); + ret = -ENOMEM; + goto up_sem; + } + + ret = drm_sched_job_init(&job->base, &hwctx->priv->entity, 1, hwctx); + if (ret) { + XDNA_ERR(xdna, "DRM job init failed, ret %d", ret); + goto free_chain; + } + +retry: + ret = drm_gem_lock_reservations(job->bos, job->bo_cnt, &acquire_ctx); + if (ret) { + XDNA_WARN(xdna, "Failed to lock BOs, ret %d", ret); + goto cleanup_job; + } + + for (i = 0; i < job->bo_cnt; i++) { + ret = dma_resv_reserve_fences(job->bos[i]->resv, 1); + if (ret) { + XDNA_WARN(xdna, "Failed to reserve fences %d", ret); + drm_gem_unlock_reservations(job->bos, job->bo_cnt, &acquire_ctx); + goto cleanup_job; + } + } + + down_read(&xdna->notifier_lock); + for (i = 0; i < job->bo_cnt; i++) { + abo = to_xdna_obj(job->bos[i]); + if (abo->mem.map_invalid) { + up_read(&xdna->notifier_lock); + drm_gem_unlock_reservations(job->bos, job->bo_cnt, &acquire_ctx); + if (!timeout) { + timeout = jiffies + + msecs_to_jiffies(HMM_RANGE_DEFAULT_TIMEOUT); + } else if (time_after(jiffies, timeout)) { + ret = -ETIME; + goto cleanup_job; + } + + ret = aie2_populate_range(abo); + if (ret) + goto cleanup_job; + goto retry; + } + } + + mutex_lock(&hwctx->priv->io_lock); + drm_sched_job_arm(&job->base); + job->out_fence = dma_fence_get(&job->base.s_fence->finished); + for (i = 0; i < job->bo_cnt; i++) + dma_resv_add_fence(job->bos[i]->resv, job->out_fence, DMA_RESV_USAGE_WRITE); + job->seq = hwctx->priv->seq++; + kref_get(&job->refcnt); + drm_sched_entity_push_job(&job->base); + + *seq = job->seq; + drm_syncobj_add_point(hwctx->priv->syncobj, chain, job->out_fence, *seq); + mutex_unlock(&hwctx->priv->io_lock); + + up_read(&xdna->notifier_lock); + drm_gem_unlock_reservations(job->bos, job->bo_cnt, &acquire_ctx); + + aie2_job_put(job); + + return 0; + +cleanup_job: + drm_sched_job_cleanup(&job->base); +free_chain: + dma_fence_chain_free(chain); +up_sem: + up(&hwctx->priv->job_sem); + job->job_done = true; + return ret; +} + +void 
aie2_hmm_invalidate(struct amdxdna_gem_obj *abo, + unsigned long cur_seq) +{ + struct amdxdna_dev *xdna = to_xdna_dev(to_gobj(abo)->dev); + struct drm_gem_object *gobj = to_gobj(abo); + long ret; + + down_write(&xdna->notifier_lock); + abo->mem.map_invalid = true; + mmu_interval_set_seq(&abo->mem.notifier, cur_seq); + up_write(&xdna->notifier_lock); + ret = dma_resv_wait_timeout(gobj->resv, DMA_RESV_USAGE_BOOKKEEP, + true, MAX_SCHEDULE_TIMEOUT); + if (!ret || ret == -ERESTARTSYS) + XDNA_ERR(xdna, "Failed to wait for bo, ret %ld", ret); +} diff --git a/drivers/accel/amdxdna/aie2_error.c b/drivers/accel/amdxdna/aie2_error.c new file mode 100644 index 000000000000..b1defaa8513b --- /dev/null +++ b/drivers/accel/amdxdna/aie2_error.c @@ -0,0 +1,360 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2023-2024, Advanced Micro Devices, Inc. + */ + +#include +#include +#include +#include +#include +#include +#include + +#include "aie2_msg_priv.h" +#include "aie2_pci.h" +#include "amdxdna_mailbox.h" +#include "amdxdna_pci_drv.h" + +struct async_event { + struct amdxdna_dev_hdl *ndev; + struct async_event_msg_resp resp; + struct workqueue_struct *wq; + struct work_struct work; + u8 *buf; + dma_addr_t addr; + u32 size; +}; + +struct async_events { + struct workqueue_struct *wq; + u8 *buf; + dma_addr_t addr; + u32 size; + u32 event_cnt; + struct async_event event[] __counted_by(event_cnt); +}; + +/* + * Below enum, struct and lookup tables are porting from XAIE util header file. + * + * Below data is defined by AIE device and it is used for decode error message + * from the device. 
+ */ + +enum aie_module_type { + AIE_MEM_MOD = 0, + AIE_CORE_MOD, + AIE_PL_MOD, +}; + +enum aie_error_category { + AIE_ERROR_SATURATION = 0, + AIE_ERROR_FP, + AIE_ERROR_STREAM, + AIE_ERROR_ACCESS, + AIE_ERROR_BUS, + AIE_ERROR_INSTRUCTION, + AIE_ERROR_ECC, + AIE_ERROR_LOCK, + AIE_ERROR_DMA, + AIE_ERROR_MEM_PARITY, + /* Unknown is not from XAIE, added for better category */ + AIE_ERROR_UNKNOWN, +}; + +/* Don't pack, unless XAIE side changed */ +struct aie_error { + __u8 row; + __u8 col; + __u32 mod_type; + __u8 event_id; +}; + +struct aie_err_info { + u32 err_cnt; + u32 ret_code; + u32 rsvd; + struct aie_error payload[] __counted_by(err_cnt); +}; + +struct aie_event_category { + u8 event_id; + enum aie_error_category category; +}; + +#define EVENT_CATEGORY(id, cat) { id, cat } +static const struct aie_event_category aie_ml_mem_event_cat[] = { + EVENT_CATEGORY(88U, AIE_ERROR_ECC), + EVENT_CATEGORY(90U, AIE_ERROR_ECC), + EVENT_CATEGORY(91U, AIE_ERROR_MEM_PARITY), + EVENT_CATEGORY(92U, AIE_ERROR_MEM_PARITY), + EVENT_CATEGORY(93U, AIE_ERROR_MEM_PARITY), + EVENT_CATEGORY(94U, AIE_ERROR_MEM_PARITY), + EVENT_CATEGORY(95U, AIE_ERROR_MEM_PARITY), + EVENT_CATEGORY(96U, AIE_ERROR_MEM_PARITY), + EVENT_CATEGORY(97U, AIE_ERROR_DMA), + EVENT_CATEGORY(98U, AIE_ERROR_DMA), + EVENT_CATEGORY(99U, AIE_ERROR_DMA), + EVENT_CATEGORY(100U, AIE_ERROR_DMA), + EVENT_CATEGORY(101U, AIE_ERROR_LOCK), +}; + +static const struct aie_event_category aie_ml_core_event_cat[] = { + EVENT_CATEGORY(55U, AIE_ERROR_ACCESS), + EVENT_CATEGORY(56U, AIE_ERROR_STREAM), + EVENT_CATEGORY(57U, AIE_ERROR_STREAM), + EVENT_CATEGORY(58U, AIE_ERROR_BUS), + EVENT_CATEGORY(59U, AIE_ERROR_INSTRUCTION), + EVENT_CATEGORY(60U, AIE_ERROR_ACCESS), + EVENT_CATEGORY(62U, AIE_ERROR_ECC), + EVENT_CATEGORY(64U, AIE_ERROR_ECC), + EVENT_CATEGORY(65U, AIE_ERROR_ACCESS), + EVENT_CATEGORY(66U, AIE_ERROR_ACCESS), + EVENT_CATEGORY(67U, AIE_ERROR_LOCK), + EVENT_CATEGORY(70U, AIE_ERROR_INSTRUCTION), + EVENT_CATEGORY(71U, AIE_ERROR_STREAM), + 
EVENT_CATEGORY(72U, AIE_ERROR_BUS), +}; + +static const struct aie_event_category aie_ml_mem_tile_event_cat[] = { + EVENT_CATEGORY(130U, AIE_ERROR_ECC), + EVENT_CATEGORY(132U, AIE_ERROR_ECC), + EVENT_CATEGORY(133U, AIE_ERROR_DMA), + EVENT_CATEGORY(134U, AIE_ERROR_DMA), + EVENT_CATEGORY(135U, AIE_ERROR_STREAM), + EVENT_CATEGORY(136U, AIE_ERROR_STREAM), + EVENT_CATEGORY(137U, AIE_ERROR_STREAM), + EVENT_CATEGORY(138U, AIE_ERROR_BUS), + EVENT_CATEGORY(139U, AIE_ERROR_LOCK), +}; + +static const struct aie_event_category aie_ml_shim_tile_event_cat[] = { + EVENT_CATEGORY(64U, AIE_ERROR_BUS), + EVENT_CATEGORY(65U, AIE_ERROR_STREAM), + EVENT_CATEGORY(66U, AIE_ERROR_STREAM), + EVENT_CATEGORY(67U, AIE_ERROR_BUS), + EVENT_CATEGORY(68U, AIE_ERROR_BUS), + EVENT_CATEGORY(69U, AIE_ERROR_BUS), + EVENT_CATEGORY(70U, AIE_ERROR_BUS), + EVENT_CATEGORY(71U, AIE_ERROR_BUS), + EVENT_CATEGORY(72U, AIE_ERROR_DMA), + EVENT_CATEGORY(73U, AIE_ERROR_DMA), + EVENT_CATEGORY(74U, AIE_ERROR_LOCK), +}; + +static enum aie_error_category +aie_get_error_category(u8 row, u8 event_id, enum aie_module_type mod_type) +{ + const struct aie_event_category *lut; + int num_entry; + int i; + + switch (mod_type) { + case AIE_PL_MOD: + lut = aie_ml_shim_tile_event_cat; + num_entry = ARRAY_SIZE(aie_ml_shim_tile_event_cat); + break; + case AIE_CORE_MOD: + lut = aie_ml_core_event_cat; + num_entry = ARRAY_SIZE(aie_ml_core_event_cat); + break; + case AIE_MEM_MOD: + if (row == 1) { + lut = aie_ml_mem_tile_event_cat; + num_entry = ARRAY_SIZE(aie_ml_mem_tile_event_cat); + } else { + lut = aie_ml_mem_event_cat; + num_entry = ARRAY_SIZE(aie_ml_mem_event_cat); + } + break; + default: + return AIE_ERROR_UNKNOWN; + } + + for (i = 0; i < num_entry; i++) { + if (event_id != lut[i].event_id) + continue; + + return lut[i].category; + } + + return AIE_ERROR_UNKNOWN; +} + +static u32 aie2_error_backtrack(struct amdxdna_dev_hdl *ndev, void *err_info, u32 num_err) +{ + struct aie_error *errs = err_info; + u32 err_col = 0; /* assume 
that AIE has less than 32 columns */ + int i; + + /* Get err column bitmap */ + for (i = 0; i < num_err; i++) { + struct aie_error *err = &errs[i]; + enum aie_error_category cat; + + cat = aie_get_error_category(err->row, err->event_id, err->mod_type); + XDNA_ERR(ndev->xdna, "Row: %d, Col: %d, module %d, event ID %d, category %d", + err->row, err->col, err->mod_type, + err->event_id, cat); + + if (err->col >= 32) { + XDNA_WARN(ndev->xdna, "Invalid column number"); + break; + } + + err_col |= (1 << err->col); + } + + return err_col; +} + +static int aie2_error_async_cb(void *handle, const u32 *data, size_t size) +{ + struct async_event_msg_resp *resp; + struct async_event *e = handle; + + if (data) { + resp = (struct async_event_msg_resp *)data; + e->resp.type = resp->type; + wmb(); /* Update status in the end, so that no lock for here */ + e->resp.status = resp->status; + } + queue_work(e->wq, &e->work); + return 0; +} + +static int aie2_error_event_send(struct async_event *e) +{ + drm_clflush_virt_range(e->buf, e->size); /* device can access */ + return aie2_register_asyn_event_msg(e->ndev, e->addr, e->size, e, + aie2_error_async_cb); +} + +static void aie2_error_worker(struct work_struct *err_work) +{ + struct aie_err_info *info; + struct amdxdna_dev *xdna; + struct async_event *e; + u32 max_err; + u32 err_col; + + e = container_of(err_work, struct async_event, work); + + xdna = e->ndev->xdna; + + if (e->resp.status == MAX_AIE2_STATUS_CODE) + return; + + e->resp.status = MAX_AIE2_STATUS_CODE; + + print_hex_dump_debug("AIE error: ", DUMP_PREFIX_OFFSET, 16, 4, + e->buf, 0x100, false); + + info = (struct aie_err_info *)e->buf; + XDNA_DBG(xdna, "Error count %d return code %d", info->err_cnt, info->ret_code); + + max_err = (e->size - sizeof(*info)) / sizeof(struct aie_error); + if (unlikely(info->err_cnt > max_err)) { + WARN_ONCE(1, "Error count too large %d\n", info->err_cnt); + return; + } + err_col = aie2_error_backtrack(e->ndev, info->payload, info->err_cnt); + if 
(!err_col) { + XDNA_WARN(xdna, "Did not get error column"); + return; + } + + mutex_lock(&xdna->dev_lock); + /* Re-send this event to firmware */ + if (aie2_error_event_send(e)) + XDNA_WARN(xdna, "Unable to register async event"); + mutex_unlock(&xdna->dev_lock); +} + +int aie2_error_async_events_send(struct amdxdna_dev_hdl *ndev) +{ + struct amdxdna_dev *xdna = ndev->xdna; + struct async_event *e; + int i, ret; + + drm_WARN_ON(&xdna->ddev, !mutex_is_locked(&xdna->dev_lock)); + for (i = 0; i < ndev->async_events->event_cnt; i++) { + e = &ndev->async_events->event[i]; + ret = aie2_error_event_send(e); + if (ret) + return ret; + } + + return 0; +} + +void aie2_error_async_events_free(struct amdxdna_dev_hdl *ndev) +{ + struct amdxdna_dev *xdna = ndev->xdna; + struct async_events *events; + + events = ndev->async_events; + + mutex_unlock(&xdna->dev_lock); + destroy_workqueue(events->wq); + mutex_lock(&xdna->dev_lock); + + dma_free_noncoherent(xdna->ddev.dev, events->size, events->buf, + events->addr, DMA_FROM_DEVICE); + kfree(events); +} + +int aie2_error_async_events_alloc(struct amdxdna_dev_hdl *ndev) +{ + struct amdxdna_dev *xdna = ndev->xdna; + u32 total_col = ndev->total_col; + u32 total_size = ASYNC_BUF_SIZE * total_col; + struct async_events *events; + int i, ret; + + events = kzalloc(struct_size(events, event, total_col), GFP_KERNEL); + if (!events) + return -ENOMEM; + + events->buf = dma_alloc_noncoherent(xdna->ddev.dev, total_size, &events->addr, + DMA_FROM_DEVICE, GFP_KERNEL); + if (!events->buf) { + ret = -ENOMEM; + goto free_events; + } + events->size = total_size; + events->event_cnt = total_col; + + events->wq = alloc_ordered_workqueue("async_wq", 0); + if (!events->wq) { + ret = -ENOMEM; + goto free_buf; + } + + for (i = 0; i < events->event_cnt; i++) { + struct async_event *e = &events->event[i]; + u32 offset = i * ASYNC_BUF_SIZE; + + e->ndev = ndev; + e->wq = events->wq; + e->buf = &events->buf[offset]; + e->addr = events->addr + offset; + e->size = 
ASYNC_BUF_SIZE; + e->resp.status = MAX_AIE2_STATUS_CODE; + INIT_WORK(&e->work, aie2_error_worker); + } + + ndev->async_events = events; + + XDNA_DBG(xdna, "Async event count %d, buf total size 0x%x", + events->event_cnt, events->size); + return 0; + +free_buf: + dma_free_noncoherent(xdna->ddev.dev, events->size, events->buf, + events->addr, DMA_FROM_DEVICE); +free_events: + kfree(events); + return ret; +} diff --git a/drivers/accel/amdxdna/aie2_message.c b/drivers/accel/amdxdna/aie2_message.c new file mode 100644 index 000000000000..9e2c9a44f76a --- /dev/null +++ b/drivers/accel/amdxdna/aie2_message.c @@ -0,0 +1,776 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2023-2024, Advanced Micro Devices, Inc. + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "aie2_msg_priv.h" +#include "aie2_pci.h" +#include "amdxdna_ctx.h" +#include "amdxdna_gem.h" +#include "amdxdna_mailbox.h" +#include "amdxdna_mailbox_helper.h" +#include "amdxdna_pci_drv.h" + +#define DECLARE_AIE2_MSG(name, op) \ + DECLARE_XDNA_MSG_COMMON(name, op, MAX_AIE2_STATUS_CODE) + +static int aie2_send_mgmt_msg_wait(struct amdxdna_dev_hdl *ndev, + struct xdna_mailbox_msg *msg) +{ + struct amdxdna_dev *xdna = ndev->xdna; + struct xdna_notify *hdl = msg->handle; + int ret; + + if (!ndev->mgmt_chann) + return -ENODEV; + + drm_WARN_ON(&xdna->ddev, !mutex_is_locked(&xdna->dev_lock)); + ret = xdna_send_msg_wait(xdna, ndev->mgmt_chann, msg); + if (ret == -ETIME) { + xdna_mailbox_stop_channel(ndev->mgmt_chann); + xdna_mailbox_destroy_channel(ndev->mgmt_chann); + ndev->mgmt_chann = NULL; + } + + if (!ret && *hdl->data != AIE2_STATUS_SUCCESS) { + XDNA_ERR(xdna, "command opcode 0x%x failed, status 0x%x", + msg->opcode, *hdl->data); + ret = -EINVAL; + } + + return ret; +} + +int aie2_suspend_fw(struct amdxdna_dev_hdl *ndev) +{ + DECLARE_AIE2_MSG(suspend, MSG_OP_SUSPEND); + + return aie2_send_mgmt_msg_wait(ndev, &msg); +} 
+ +int aie2_resume_fw(struct amdxdna_dev_hdl *ndev) +{ + DECLARE_AIE2_MSG(suspend, MSG_OP_RESUME); + + return aie2_send_mgmt_msg_wait(ndev, &msg); +} + +int aie2_set_runtime_cfg(struct amdxdna_dev_hdl *ndev, u32 type, u64 value) +{ + DECLARE_AIE2_MSG(set_runtime_cfg, MSG_OP_SET_RUNTIME_CONFIG); + int ret; + + req.type = type; + req.value = value; + + ret = aie2_send_mgmt_msg_wait(ndev, &msg); + if (ret) { + XDNA_ERR(ndev->xdna, "Failed to set runtime config, ret %d", ret); + return ret; + } + + return 0; +} + +int aie2_get_runtime_cfg(struct amdxdna_dev_hdl *ndev, u32 type, u64 *value) +{ + DECLARE_AIE2_MSG(get_runtime_cfg, MSG_OP_GET_RUNTIME_CONFIG); + int ret; + + req.type = type; + ret = aie2_send_mgmt_msg_wait(ndev, &msg); + if (ret) { + XDNA_ERR(ndev->xdna, "Failed to get runtime config, ret %d", ret); + return ret; + } + + *value = resp.value; + return 0; +} + +int aie2_assign_mgmt_pasid(struct amdxdna_dev_hdl *ndev, u16 pasid) +{ + DECLARE_AIE2_MSG(assign_mgmt_pasid, MSG_OP_ASSIGN_MGMT_PASID); + + req.pasid = pasid; + + return aie2_send_mgmt_msg_wait(ndev, &msg); +} + +int aie2_query_aie_version(struct amdxdna_dev_hdl *ndev, struct aie_version *version) +{ + DECLARE_AIE2_MSG(aie_version_info, MSG_OP_QUERY_AIE_VERSION); + struct amdxdna_dev *xdna = ndev->xdna; + int ret; + + ret = aie2_send_mgmt_msg_wait(ndev, &msg); + if (ret) + return ret; + + XDNA_DBG(xdna, "Query AIE version - major: %u minor: %u completed", + resp.major, resp.minor); + + version->major = resp.major; + version->minor = resp.minor; + + return 0; +} + +int aie2_query_aie_metadata(struct amdxdna_dev_hdl *ndev, struct aie_metadata *metadata) +{ + DECLARE_AIE2_MSG(aie_tile_info, MSG_OP_QUERY_AIE_TILE_INFO); + int ret; + + ret = aie2_send_mgmt_msg_wait(ndev, &msg); + if (ret) + return ret; + + metadata->size = resp.info.size; + metadata->cols = resp.info.cols; + metadata->rows = resp.info.rows; + + metadata->version.major = resp.info.major; + metadata->version.minor = resp.info.minor; + + 
metadata->core.row_count = resp.info.core_rows; + metadata->core.row_start = resp.info.core_row_start; + metadata->core.dma_channel_count = resp.info.core_dma_channels; + metadata->core.lock_count = resp.info.core_locks; + metadata->core.event_reg_count = resp.info.core_events; + + metadata->mem.row_count = resp.info.mem_rows; + metadata->mem.row_start = resp.info.mem_row_start; + metadata->mem.dma_channel_count = resp.info.mem_dma_channels; + metadata->mem.lock_count = resp.info.mem_locks; + metadata->mem.event_reg_count = resp.info.mem_events; + + metadata->shim.row_count = resp.info.shim_rows; + metadata->shim.row_start = resp.info.shim_row_start; + metadata->shim.dma_channel_count = resp.info.shim_dma_channels; + metadata->shim.lock_count = resp.info.shim_locks; + metadata->shim.event_reg_count = resp.info.shim_events; + + return 0; +} + +int aie2_query_firmware_version(struct amdxdna_dev_hdl *ndev, + struct amdxdna_fw_ver *fw_ver) +{ + DECLARE_AIE2_MSG(firmware_version, MSG_OP_GET_FIRMWARE_VERSION); + int ret; + + ret = aie2_send_mgmt_msg_wait(ndev, &msg); + if (ret) + return ret; + + fw_ver->major = resp.major; + fw_ver->minor = resp.minor; + fw_ver->sub = resp.sub; + fw_ver->build = resp.build; + + return 0; +} + +int aie2_create_context(struct amdxdna_dev_hdl *ndev, struct amdxdna_hwctx *hwctx) +{ + DECLARE_AIE2_MSG(create_ctx, MSG_OP_CREATE_CONTEXT); + struct amdxdna_dev *xdna = ndev->xdna; + struct xdna_mailbox_chann_res x2i; + struct xdna_mailbox_chann_res i2x; + struct cq_pair *cq_pair; + u32 intr_reg; + int ret; + + req.aie_type = 1; + req.start_col = hwctx->start_col; + req.num_col = hwctx->num_col; + req.num_cq_pairs_requested = 1; + req.pasid = hwctx->client->pasid; + req.context_priority = 2; + + ret = aie2_send_mgmt_msg_wait(ndev, &msg); + if (ret) + return ret; + + hwctx->fw_ctx_id = resp.context_id; + WARN_ONCE(hwctx->fw_ctx_id == -1, "Unexpected context id"); + + cq_pair = &resp.cq_pair[0]; + x2i.mb_head_ptr_reg = AIE2_MBOX_OFF(ndev, 
cq_pair->x2i_q.head_addr); + x2i.mb_tail_ptr_reg = AIE2_MBOX_OFF(ndev, cq_pair->x2i_q.tail_addr); + x2i.rb_start_addr = AIE2_SRAM_OFF(ndev, cq_pair->x2i_q.buf_addr); + x2i.rb_size = cq_pair->x2i_q.buf_size; + + i2x.mb_head_ptr_reg = AIE2_MBOX_OFF(ndev, cq_pair->i2x_q.head_addr); + i2x.mb_tail_ptr_reg = AIE2_MBOX_OFF(ndev, cq_pair->i2x_q.tail_addr); + i2x.rb_start_addr = AIE2_SRAM_OFF(ndev, cq_pair->i2x_q.buf_addr); + i2x.rb_size = cq_pair->i2x_q.buf_size; + + ret = pci_irq_vector(to_pci_dev(xdna->ddev.dev), resp.msix_id); + if (ret == -EINVAL) { + XDNA_ERR(xdna, "not able to create channel"); + goto out_destroy_context; + } + + intr_reg = i2x.mb_head_ptr_reg + 4; + hwctx->priv->mbox_chann = xdna_mailbox_create_channel(ndev->mbox, &x2i, &i2x, + intr_reg, ret); + if (!hwctx->priv->mbox_chann) { + XDNA_ERR(xdna, "not able to create channel"); + ret = -EINVAL; + goto out_destroy_context; + } + + XDNA_DBG(xdna, "%s mailbox channel irq: %d, msix_id: %d", + hwctx->name, ret, resp.msix_id); + XDNA_DBG(xdna, "%s created fw ctx %d pasid %d", hwctx->name, + hwctx->fw_ctx_id, hwctx->client->pasid); + + return 0; + +out_destroy_context: + aie2_destroy_context(ndev, hwctx); + return ret; +} + +int aie2_destroy_context(struct amdxdna_dev_hdl *ndev, struct amdxdna_hwctx *hwctx) +{ + DECLARE_AIE2_MSG(destroy_ctx, MSG_OP_DESTROY_CONTEXT); + struct amdxdna_dev *xdna = ndev->xdna; + int ret; + + if (hwctx->fw_ctx_id == -1) + return 0; + + xdna_mailbox_stop_channel(hwctx->priv->mbox_chann); + + req.context_id = hwctx->fw_ctx_id; + ret = aie2_send_mgmt_msg_wait(ndev, &msg); + if (ret) + XDNA_WARN(xdna, "%s destroy context failed, ret %d", hwctx->name, ret); + + xdna_mailbox_destroy_channel(hwctx->priv->mbox_chann); + XDNA_DBG(xdna, "%s destroyed fw ctx %d", hwctx->name, + hwctx->fw_ctx_id); + hwctx->priv->mbox_chann = NULL; + hwctx->fw_ctx_id = -1; + + return ret; +} + +int aie2_map_host_buf(struct amdxdna_dev_hdl *ndev, u32 context_id, u64 addr, u64 size) +{ + 
DECLARE_AIE2_MSG(map_host_buffer, MSG_OP_MAP_HOST_BUFFER); + struct amdxdna_dev *xdna = ndev->xdna; + int ret; + + req.context_id = context_id; + req.buf_addr = addr; + req.buf_size = size; + ret = aie2_send_mgmt_msg_wait(ndev, &msg); + if (ret) + return ret; + + XDNA_DBG(xdna, "fw ctx %d map host buf addr 0x%llx size 0x%llx", + context_id, addr, size); + + return 0; +} + +int aie2_query_status(struct amdxdna_dev_hdl *ndev, char __user *buf, + u32 size, u32 *cols_filled) +{ + DECLARE_AIE2_MSG(aie_column_info, MSG_OP_QUERY_COL_STATUS); + struct amdxdna_dev *xdna = ndev->xdna; + struct amdxdna_client *client; + struct amdxdna_hwctx *hwctx; + unsigned long hwctx_id; + dma_addr_t dma_addr; + u32 aie_bitmap = 0; + u8 *buff_addr; + int ret, idx; + + buff_addr = dma_alloc_noncoherent(xdna->ddev.dev, size, &dma_addr, + DMA_FROM_DEVICE, GFP_KERNEL); + if (!buff_addr) + return -ENOMEM; + + /* Go through each hardware context and mark the AIE columns that are active */ + list_for_each_entry(client, &xdna->client_list, node) { + idx = srcu_read_lock(&client->hwctx_srcu); + amdxdna_for_each_hwctx(client, hwctx_id, hwctx) + aie_bitmap |= amdxdna_hwctx_col_map(hwctx); + srcu_read_unlock(&client->hwctx_srcu, idx); + } + + *cols_filled = 0; + req.dump_buff_addr = dma_addr; + req.dump_buff_size = size; + req.num_cols = hweight32(aie_bitmap); + req.aie_bitmap = aie_bitmap; + + drm_clflush_virt_range(buff_addr, size); /* device can access */ + ret = aie2_send_mgmt_msg_wait(ndev, &msg); + if (ret) { + XDNA_ERR(xdna, "Error during NPU query, status %d", ret); + goto fail; + } + + if (resp.status != AIE2_STATUS_SUCCESS) { + XDNA_ERR(xdna, "Query NPU status failed, status 0x%x", resp.status); + ret = -EINVAL; + goto fail; + } + XDNA_DBG(xdna, "Query NPU status completed"); + + if (size < resp.size) { + ret = -EINVAL; + XDNA_ERR(xdna, "Bad buffer size. Available: %u. 
Needs: %u", size, resp.size); + goto fail; + } + + if (copy_to_user(buf, buff_addr, resp.size)) { + ret = -EFAULT; + XDNA_ERR(xdna, "Failed to copy NPU status to user space"); + goto fail; + } + + *cols_filled = aie_bitmap; + +fail: + dma_free_noncoherent(xdna->ddev.dev, size, buff_addr, dma_addr, DMA_FROM_DEVICE); + return ret; +} + +int aie2_register_asyn_event_msg(struct amdxdna_dev_hdl *ndev, dma_addr_t addr, u32 size, + void *handle, int (*cb)(void*, const u32 *, size_t)) +{ + struct async_event_msg_req req = { 0 }; + struct xdna_mailbox_msg msg = { + .send_data = (u8 *)&req, + .send_size = sizeof(req), + .handle = handle, + .opcode = MSG_OP_REGISTER_ASYNC_EVENT_MSG, + .notify_cb = cb, + }; + + req.buf_addr = addr; + req.buf_size = size; + + XDNA_DBG(ndev->xdna, "Register addr 0x%llx size 0x%x", addr, size); + return xdna_mailbox_send_msg(ndev->mgmt_chann, &msg, TX_TIMEOUT); +} + +int aie2_config_cu(struct amdxdna_hwctx *hwctx) +{ + struct mailbox_channel *chann = hwctx->priv->mbox_chann; + struct amdxdna_dev *xdna = hwctx->client->xdna; + u32 shift = xdna->dev_info->dev_mem_buf_shift; + DECLARE_AIE2_MSG(config_cu, MSG_OP_CONFIG_CU); + struct drm_gem_object *gobj; + struct amdxdna_gem_obj *abo; + int ret, i; + + if (!chann) + return -ENODEV; + + if (hwctx->cus->num_cus > MAX_NUM_CUS) { + XDNA_DBG(xdna, "Exceed maximum CU %d", MAX_NUM_CUS); + return -EINVAL; + } + + for (i = 0; i < hwctx->cus->num_cus; i++) { + struct amdxdna_cu_config *cu = &hwctx->cus->cu_configs[i]; + + if (XDNA_MBZ_DBG(xdna, cu->pad, sizeof(cu->pad))) + return -EINVAL; + + gobj = drm_gem_object_lookup(hwctx->client->filp, cu->cu_bo); + if (!gobj) { + XDNA_ERR(xdna, "Lookup GEM object failed"); + return -EINVAL; + } + abo = to_xdna_obj(gobj); + + if (abo->type != AMDXDNA_BO_DEV) { + drm_gem_object_put(gobj); + XDNA_ERR(xdna, "Invalid BO type"); + return -EINVAL; + } + + req.cfgs[i] = FIELD_PREP(AIE2_MSG_CFG_CU_PDI_ADDR, + abo->mem.dev_addr >> shift); + req.cfgs[i] |= 
FIELD_PREP(AIE2_MSG_CFG_CU_FUNC, cu->cu_func); + XDNA_DBG(xdna, "CU %d full addr 0x%llx, cfg 0x%x", i, + abo->mem.dev_addr, req.cfgs[i]); + drm_gem_object_put(gobj); + } + req.num_cus = hwctx->cus->num_cus; + + ret = xdna_send_msg_wait(xdna, chann, &msg); + if (ret == -ETIME) + aie2_destroy_context(xdna->dev_handle, hwctx); + + if (resp.status == AIE2_STATUS_SUCCESS) { + XDNA_DBG(xdna, "Configure %d CUs, ret %d", req.num_cus, ret); + return 0; + } + + XDNA_ERR(xdna, "Command opcode 0x%x failed, status 0x%x ret %d", + msg.opcode, resp.status, ret); + return ret; +} + +int aie2_execbuf(struct amdxdna_hwctx *hwctx, struct amdxdna_sched_job *job, + int (*notify_cb)(void *, const u32 *, size_t)) +{ + struct mailbox_channel *chann = hwctx->priv->mbox_chann; + struct amdxdna_dev *xdna = hwctx->client->xdna; + struct amdxdna_gem_obj *cmd_abo = job->cmd_bo; + union { + struct execute_buffer_req ebuf; + struct exec_dpu_req dpu; + } req; + struct xdna_mailbox_msg msg; + u32 payload_len; + void *payload; + int cu_idx; + int ret; + u32 op; + + if (!chann) + return -ENODEV; + + payload = amdxdna_cmd_get_payload(cmd_abo, &payload_len); + if (!payload) { + XDNA_ERR(xdna, "Invalid command, cannot get payload"); + return -EINVAL; + } + + cu_idx = amdxdna_cmd_get_cu_idx(cmd_abo); + if (cu_idx < 0) { + XDNA_DBG(xdna, "Invalid cu idx"); + return -EINVAL; + } + + op = amdxdna_cmd_get_op(cmd_abo); + switch (op) { + case ERT_START_CU: + if (unlikely(payload_len > sizeof(req.ebuf.payload))) + XDNA_DBG(xdna, "Invalid ebuf payload len: %d", payload_len); + req.ebuf.cu_idx = cu_idx; + memcpy(req.ebuf.payload, payload, sizeof(req.ebuf.payload)); + msg.send_size = sizeof(req.ebuf); + msg.opcode = MSG_OP_EXECUTE_BUFFER_CF; + break; + case ERT_START_NPU: { + struct amdxdna_cmd_start_npu *sn = payload; + + if (unlikely(payload_len - sizeof(*sn) > sizeof(req.dpu.payload))) + XDNA_DBG(xdna, "Invalid dpu payload len: %d", payload_len); + req.dpu.inst_buf_addr = sn->buffer; + req.dpu.inst_size = 
sn->buffer_size; + req.dpu.inst_prop_cnt = sn->prop_count; + req.dpu.cu_idx = cu_idx; + memcpy(req.dpu.payload, sn->prop_args, sizeof(req.dpu.payload)); + msg.send_size = sizeof(req.dpu); + msg.opcode = MSG_OP_EXEC_DPU; + break; + } + default: + XDNA_DBG(xdna, "Invalid ERT cmd op code: %d", op); + return -EINVAL; + } + msg.handle = job; + msg.notify_cb = notify_cb; + msg.send_data = (u8 *)&req; + print_hex_dump_debug("cmd: ", DUMP_PREFIX_OFFSET, 16, 4, &req, + 0x40, false); + + ret = xdna_mailbox_send_msg(chann, &msg, TX_TIMEOUT); + if (ret) { + XDNA_ERR(xdna, "Send message failed"); + return ret; + } + + return 0; +} + +static int +aie2_cmdlist_fill_one_slot_cf(void *cmd_buf, u32 offset, + struct amdxdna_gem_obj *abo, u32 *size) +{ + struct cmd_chain_slot_execbuf_cf *buf = cmd_buf + offset; + int cu_idx = amdxdna_cmd_get_cu_idx(abo); + u32 payload_len; + void *payload; + + if (cu_idx < 0) + return -EINVAL; + + payload = amdxdna_cmd_get_payload(abo, &payload_len); + if (!payload) + return -EINVAL; + + if (!slot_cf_has_space(offset, payload_len)) + return -ENOSPC; + + buf->cu_idx = cu_idx; + buf->arg_cnt = payload_len / sizeof(u32); + memcpy(buf->args, payload, payload_len); + /* Accurate buf size to hint firmware to do necessary copy */ + *size = sizeof(*buf) + payload_len; + return 0; +} + +static int +aie2_cmdlist_fill_one_slot_dpu(void *cmd_buf, u32 offset, + struct amdxdna_gem_obj *abo, u32 *size) +{ + struct cmd_chain_slot_dpu *buf = cmd_buf + offset; + int cu_idx = amdxdna_cmd_get_cu_idx(abo); + struct amdxdna_cmd_start_npu *sn; + u32 payload_len; + void *payload; + u32 arg_sz; + + if (cu_idx < 0) + return -EINVAL; + + payload = amdxdna_cmd_get_payload(abo, &payload_len); + if (!payload) + return -EINVAL; + sn = payload; + arg_sz = payload_len - sizeof(*sn); + if (payload_len < sizeof(*sn) || arg_sz > MAX_DPU_ARGS_SIZE) + return -EINVAL; + + if (!slot_dpu_has_space(offset, arg_sz)) + return -ENOSPC; + + buf->inst_buf_addr = sn->buffer; + buf->inst_size = 
sn->buffer_size; + buf->inst_prop_cnt = sn->prop_count; + buf->cu_idx = cu_idx; + buf->arg_cnt = arg_sz / sizeof(u32); + memcpy(buf->args, sn->prop_args, arg_sz); + + /* Accurate buf size to hint firmware to do necessary copy */ + *size = sizeof(*buf) + arg_sz; + return 0; +} + +static int +aie2_cmdlist_fill_one_slot(u32 op, struct amdxdna_gem_obj *cmdbuf_abo, u32 offset, + struct amdxdna_gem_obj *abo, u32 *size) +{ + u32 this_op = amdxdna_cmd_get_op(abo); + void *cmd_buf = cmdbuf_abo->mem.kva; + int ret; + + if (this_op != op) { + ret = -EINVAL; + goto done; + } + + switch (op) { + case ERT_START_CU: + ret = aie2_cmdlist_fill_one_slot_cf(cmd_buf, offset, abo, size); + break; + case ERT_START_NPU: + ret = aie2_cmdlist_fill_one_slot_dpu(cmd_buf, offset, abo, size); + break; + default: + ret = -EOPNOTSUPP; + } + +done: + if (ret) { + XDNA_ERR(abo->client->xdna, "Can't fill slot for cmd op %d ret %d", + op, ret); + } + return ret; +} + +static inline struct amdxdna_gem_obj * +aie2_cmdlist_get_cmd_buf(struct amdxdna_sched_job *job) +{ + int idx = get_job_idx(job->seq); + + return job->hwctx->priv->cmd_buf[idx]; +} + +static void +aie2_cmdlist_prepare_request(struct cmd_chain_req *req, + struct amdxdna_gem_obj *cmdbuf_abo, u32 size, u32 cnt) +{ + req->buf_addr = cmdbuf_abo->mem.dev_addr; + req->buf_size = size; + req->count = cnt; + drm_clflush_virt_range(cmdbuf_abo->mem.kva, size); + XDNA_DBG(cmdbuf_abo->client->xdna, "Command buf addr 0x%llx size 0x%x count %d", + req->buf_addr, size, cnt); +} + +static inline u32 +aie2_cmd_op_to_msg_op(u32 op) +{ + switch (op) { + case ERT_START_CU: + return MSG_OP_CHAIN_EXEC_BUFFER_CF; + case ERT_START_NPU: + return MSG_OP_CHAIN_EXEC_DPU; + default: + return MSG_OP_MAX_OPCODE; + } +} + +int aie2_cmdlist_multi_execbuf(struct amdxdna_hwctx *hwctx, + struct amdxdna_sched_job *job, + int (*notify_cb)(void *, const u32 *, size_t)) +{ + struct amdxdna_gem_obj *cmdbuf_abo = aie2_cmdlist_get_cmd_buf(job); + struct mailbox_channel *chann = 
hwctx->priv->mbox_chann; + struct amdxdna_client *client = hwctx->client; + struct amdxdna_gem_obj *cmd_abo = job->cmd_bo; + struct amdxdna_cmd_chain *payload; + struct xdna_mailbox_msg msg; + struct cmd_chain_req req; + u32 payload_len; + u32 offset = 0; + u32 size; + int ret; + u32 op; + u32 i; + + op = amdxdna_cmd_get_op(cmd_abo); + payload = amdxdna_cmd_get_payload(cmd_abo, &payload_len); + if (op != ERT_CMD_CHAIN || !payload || + payload_len < struct_size(payload, data, payload->command_count)) + return -EINVAL; + + for (i = 0; i < payload->command_count; i++) { + u32 boh = (u32)(payload->data[i]); + struct amdxdna_gem_obj *abo; + + abo = amdxdna_gem_get_obj(client, boh, AMDXDNA_BO_CMD); + if (!abo) { + XDNA_ERR(client->xdna, "Failed to find cmd BO %d", boh); + return -ENOENT; + } + + /* All sub-cmd should have same op, use the first one. */ + if (i == 0) + op = amdxdna_cmd_get_op(abo); + + ret = aie2_cmdlist_fill_one_slot(op, cmdbuf_abo, offset, abo, &size); + amdxdna_gem_put_obj(abo); + if (ret) + return -EINVAL; + + offset += size; + } + + /* The offset is the accumulated total size of the cmd buffer */ + aie2_cmdlist_prepare_request(&req, cmdbuf_abo, offset, payload->command_count); + + msg.opcode = aie2_cmd_op_to_msg_op(op); + if (msg.opcode == MSG_OP_MAX_OPCODE) + return -EOPNOTSUPP; + msg.handle = job; + msg.notify_cb = notify_cb; + msg.send_data = (u8 *)&req; + msg.send_size = sizeof(req); + ret = xdna_mailbox_send_msg(chann, &msg, TX_TIMEOUT); + if (ret) { + XDNA_ERR(hwctx->client->xdna, "Send message failed"); + return ret; + } + + return 0; +} + +int aie2_cmdlist_single_execbuf(struct amdxdna_hwctx *hwctx, + struct amdxdna_sched_job *job, + int (*notify_cb)(void *, const u32 *, size_t)) +{ + struct amdxdna_gem_obj *cmdbuf_abo = aie2_cmdlist_get_cmd_buf(job); + struct mailbox_channel *chann = hwctx->priv->mbox_chann; + struct amdxdna_gem_obj *cmd_abo = job->cmd_bo; + struct xdna_mailbox_msg msg; + struct cmd_chain_req req; + u32 size; + int ret; + 
u32 op; + + op = amdxdna_cmd_get_op(cmd_abo); + ret = aie2_cmdlist_fill_one_slot(op, cmdbuf_abo, 0, cmd_abo, &size); + if (ret) + return ret; + + aie2_cmdlist_prepare_request(&req, cmdbuf_abo, size, 1); + + msg.opcode = aie2_cmd_op_to_msg_op(op); + if (msg.opcode == MSG_OP_MAX_OPCODE) + return -EOPNOTSUPP; + msg.handle = job; + msg.notify_cb = notify_cb; + msg.send_data = (u8 *)&req; + msg.send_size = sizeof(req); + ret = xdna_mailbox_send_msg(chann, &msg, TX_TIMEOUT); + if (ret) { + XDNA_ERR(hwctx->client->xdna, "Send message failed"); + return ret; + } + + return 0; +} + +int aie2_sync_bo(struct amdxdna_hwctx *hwctx, struct amdxdna_sched_job *job, + int (*notify_cb)(void *, const u32 *, size_t)) +{ + struct mailbox_channel *chann = hwctx->priv->mbox_chann; + struct amdxdna_gem_obj *abo = to_xdna_obj(job->bos[0]); + struct amdxdna_dev *xdna = hwctx->client->xdna; + struct xdna_mailbox_msg msg; + struct sync_bo_req req; + int ret = 0; + + req.src_addr = 0; + req.dst_addr = abo->mem.dev_addr - hwctx->client->dev_heap->mem.dev_addr; + req.size = abo->mem.size; + + /* Device to Host */ + req.type = FIELD_PREP(AIE2_MSG_SYNC_BO_SRC_TYPE, SYNC_BO_DEV_MEM) | + FIELD_PREP(AIE2_MSG_SYNC_BO_DST_TYPE, SYNC_BO_HOST_MEM); + + XDNA_DBG(xdna, "sync %d bytes src(0x%llx) to dst(0x%llx) completed", + req.size, req.src_addr, req.dst_addr); + + msg.handle = job; + msg.notify_cb = notify_cb; + msg.send_data = (u8 *)&req; + msg.send_size = sizeof(req); + msg.opcode = MSG_OP_SYNC_BO; + + ret = xdna_mailbox_send_msg(chann, &msg, TX_TIMEOUT); + if (ret) { + XDNA_ERR(xdna, "Send message failed"); + return ret; + } + + return 0; +} diff --git a/drivers/accel/amdxdna/aie2_msg_priv.h b/drivers/accel/amdxdna/aie2_msg_priv.h new file mode 100644 index 000000000000..4e02e744b470 --- /dev/null +++ b/drivers/accel/amdxdna/aie2_msg_priv.h @@ -0,0 +1,370 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright (C) 2022-2024, Advanced Micro Devices, Inc. 
+ */ + +#ifndef _AIE2_MSG_PRIV_H_ +#define _AIE2_MSG_PRIV_H_ + +enum aie2_msg_opcode { + MSG_OP_CREATE_CONTEXT = 0x2, + MSG_OP_DESTROY_CONTEXT = 0x3, + MSG_OP_SYNC_BO = 0x7, + MSG_OP_EXECUTE_BUFFER_CF = 0xC, + MSG_OP_QUERY_COL_STATUS = 0xD, + MSG_OP_QUERY_AIE_TILE_INFO = 0xE, + MSG_OP_QUERY_AIE_VERSION = 0xF, + MSG_OP_EXEC_DPU = 0x10, + MSG_OP_CONFIG_CU = 0x11, + MSG_OP_CHAIN_EXEC_BUFFER_CF = 0x12, + MSG_OP_CHAIN_EXEC_DPU = 0x13, + MSG_OP_MAX_XRT_OPCODE, + MSG_OP_SUSPEND = 0x101, + MSG_OP_RESUME = 0x102, + MSG_OP_ASSIGN_MGMT_PASID = 0x103, + MSG_OP_INVOKE_SELF_TEST = 0x104, + MSG_OP_MAP_HOST_BUFFER = 0x106, + MSG_OP_GET_FIRMWARE_VERSION = 0x108, + MSG_OP_SET_RUNTIME_CONFIG = 0x10A, + MSG_OP_GET_RUNTIME_CONFIG = 0x10B, + MSG_OP_REGISTER_ASYNC_EVENT_MSG = 0x10C, + MSG_OP_MAX_DRV_OPCODE, + MSG_OP_GET_PROTOCOL_VERSION = 0x301, + MSG_OP_MAX_OPCODE +}; + +enum aie2_msg_status { + AIE2_STATUS_SUCCESS = 0x0, + /* AIE Error codes */ + AIE2_STATUS_AIE_SATURATION_ERROR = 0x1000001, + AIE2_STATUS_AIE_FP_ERROR = 0x1000002, + AIE2_STATUS_AIE_STREAM_ERROR = 0x1000003, + AIE2_STATUS_AIE_ACCESS_ERROR = 0x1000004, + AIE2_STATUS_AIE_BUS_ERROR = 0x1000005, + AIE2_STATUS_AIE_INSTRUCTION_ERROR = 0x1000006, + AIE2_STATUS_AIE_ECC_ERROR = 0x1000007, + AIE2_STATUS_AIE_LOCK_ERROR = 0x1000008, + AIE2_STATUS_AIE_DMA_ERROR = 0x1000009, + AIE2_STATUS_AIE_MEM_PARITY_ERROR = 0x100000a, + AIE2_STATUS_AIE_PWR_CFG_ERROR = 0x100000b, + AIE2_STATUS_AIE_BACKTRACK_ERROR = 0x100000c, + AIE2_STATUS_MAX_AIE_STATUS_CODE, + /* MGMT ERT Error codes */ + AIE2_STATUS_MGMT_ERT_SELF_TEST_FAILURE = 0x2000001, + AIE2_STATUS_MGMT_ERT_HASH_MISMATCH, + AIE2_STATUS_MGMT_ERT_NOAVAIL, + AIE2_STATUS_MGMT_ERT_INVALID_PARAM, + AIE2_STATUS_MGMT_ERT_ENTER_SUSPEND_FAILURE, + AIE2_STATUS_MGMT_ERT_BUSY, + AIE2_STATUS_MGMT_ERT_APPLICATION_ACTIVE, + MAX_MGMT_ERT_STATUS_CODE, + /* APP ERT Error codes */ + AIE2_STATUS_APP_ERT_FIRST_ERROR = 0x3000001, + AIE2_STATUS_APP_INVALID_INSTR, + AIE2_STATUS_APP_LOAD_PDI_FAIL, + 
MAX_APP_ERT_STATUS_CODE, + /* NPU RTOS Error Codes */ + AIE2_STATUS_INVALID_INPUT_BUFFER = 0x4000001, + AIE2_STATUS_INVALID_COMMAND, + AIE2_STATUS_INVALID_PARAM, + AIE2_STATUS_INVALID_OPERATION = 0x4000006, + AIE2_STATUS_ASYNC_EVENT_MSGS_FULL, + AIE2_STATUS_MAX_RTOS_STATUS_CODE, + MAX_AIE2_STATUS_CODE +}; + +struct assign_mgmt_pasid_req { + __u16 pasid; + __u16 reserved; +} __packed; + +struct assign_mgmt_pasid_resp { + enum aie2_msg_status status; +} __packed; + +struct map_host_buffer_req { + __u32 context_id; + __u64 buf_addr; + __u64 buf_size; +} __packed; + +struct map_host_buffer_resp { + enum aie2_msg_status status; +} __packed; + +#define MAX_CQ_PAIRS 2 +struct cq_info { + __u32 head_addr; + __u32 tail_addr; + __u32 buf_addr; + __u32 buf_size; +}; + +struct cq_pair { + struct cq_info x2i_q; + struct cq_info i2x_q; +}; + +struct create_ctx_req { + __u32 aie_type; + __u8 start_col; + __u8 num_col; + __u16 reserved; + __u8 num_cq_pairs_requested; + __u8 reserved1; + __u16 pasid; + __u32 pad[2]; + __u32 sec_comm_target_type; + __u32 context_priority; +} __packed; + +struct create_ctx_resp { + enum aie2_msg_status status; + __u32 context_id; + __u16 msix_id; + __u8 num_cq_pairs_allocated; + __u8 reserved; + struct cq_pair cq_pair[MAX_CQ_PAIRS]; +} __packed; + +struct destroy_ctx_req { + __u32 context_id; +} __packed; + +struct destroy_ctx_resp { + enum aie2_msg_status status; +} __packed; + +struct execute_buffer_req { + __u32 cu_idx; + __u32 payload[19]; +} __packed; + +struct exec_dpu_req { + __u64 inst_buf_addr; + __u32 inst_size; + __u32 inst_prop_cnt; + __u32 cu_idx; + __u32 payload[35]; +} __packed; + +struct execute_buffer_resp { + enum aie2_msg_status status; +} __packed; + +struct aie_tile_info { + __u32 size; + __u16 major; + __u16 minor; + __u16 cols; + __u16 rows; + __u16 core_rows; + __u16 mem_rows; + __u16 shim_rows; + __u16 core_row_start; + __u16 mem_row_start; + __u16 shim_row_start; + __u16 core_dma_channels; + __u16 mem_dma_channels; + __u16 
shim_dma_channels; + __u16 core_locks; + __u16 mem_locks; + __u16 shim_locks; + __u16 core_events; + __u16 mem_events; + __u16 shim_events; + __u16 reserved; +}; + +struct aie_tile_info_req { + __u32 reserved; +} __packed; + +struct aie_tile_info_resp { + enum aie2_msg_status status; + struct aie_tile_info info; +} __packed; + +struct aie_version_info_req { + __u32 reserved; +} __packed; + +struct aie_version_info_resp { + enum aie2_msg_status status; + __u16 major; + __u16 minor; +} __packed; + +struct aie_column_info_req { + __u64 dump_buff_addr; + __u32 dump_buff_size; + __u32 num_cols; + __u32 aie_bitmap; +} __packed; + +struct aie_column_info_resp { + enum aie2_msg_status status; + __u32 size; +} __packed; + +struct suspend_req { + __u32 place_holder; +} __packed; + +struct suspend_resp { + enum aie2_msg_status status; +} __packed; + +struct resume_req { + __u32 place_holder; +} __packed; + +struct resume_resp { + enum aie2_msg_status status; +} __packed; + +struct check_header_hash_req { + __u64 hash_high; + __u64 hash_low; +} __packed; + +struct check_header_hash_resp { + enum aie2_msg_status status; +} __packed; + +struct query_error_req { + __u64 buf_addr; + __u32 buf_size; + __u32 next_row; + __u32 next_column; + __u32 next_module; +} __packed; + +struct query_error_resp { + enum aie2_msg_status status; + __u32 num_err; + __u32 has_next_err; + __u32 next_row; + __u32 next_column; + __u32 next_module; +} __packed; + +struct protocol_version_req { + __u32 reserved; +} __packed; + +struct protocol_version_resp { + enum aie2_msg_status status; + __u32 major; + __u32 minor; +} __packed; + +struct firmware_version_req { + __u32 reserved; +} __packed; + +struct firmware_version_resp { + enum aie2_msg_status status; + __u32 major; + __u32 minor; + __u32 sub; + __u32 build; +} __packed; + +#define MAX_NUM_CUS 32 +#define AIE2_MSG_CFG_CU_PDI_ADDR GENMASK(16, 0) +#define AIE2_MSG_CFG_CU_FUNC GENMASK(24, 17) +struct config_cu_req { + __u32 num_cus; + __u32 
cfgs[MAX_NUM_CUS]; +} __packed; + +struct config_cu_resp { + enum aie2_msg_status status; +} __packed; + +struct set_runtime_cfg_req { + __u32 type; + __u64 value; +} __packed; + +struct set_runtime_cfg_resp { + enum aie2_msg_status status; +} __packed; + +struct get_runtime_cfg_req { + __u32 type; +} __packed; + +struct get_runtime_cfg_resp { + enum aie2_msg_status status; + __u64 value; +} __packed; + +enum async_event_type { + ASYNC_EVENT_TYPE_AIE_ERROR, + ASYNC_EVENT_TYPE_EXCEPTION, + MAX_ASYNC_EVENT_TYPE +}; + +#define ASYNC_BUF_SIZE SZ_8K +struct async_event_msg_req { + __u64 buf_addr; + __u32 buf_size; +} __packed; + +struct async_event_msg_resp { + enum aie2_msg_status status; + enum async_event_type type; +} __packed; + +#define MAX_CHAIN_CMDBUF_SIZE SZ_4K +#define slot_cf_has_space(offset, payload_size) \ + (MAX_CHAIN_CMDBUF_SIZE - ((offset) + (payload_size)) > \ + offsetof(struct cmd_chain_slot_execbuf_cf, args[0])) +struct cmd_chain_slot_execbuf_cf { + __u32 cu_idx; + __u32 arg_cnt; + __u32 args[] __counted_by(arg_cnt); +}; + +#define slot_dpu_has_space(offset, payload_size) \ + (MAX_CHAIN_CMDBUF_SIZE - ((offset) + (payload_size)) > \ + offsetof(struct cmd_chain_slot_dpu, args[0])) +struct cmd_chain_slot_dpu { + __u64 inst_buf_addr; + __u32 inst_size; + __u32 inst_prop_cnt; + __u32 cu_idx; + __u32 arg_cnt; +#define MAX_DPU_ARGS_SIZE (34 * sizeof(__u32)) + __u32 args[] __counted_by(arg_cnt); +}; + +struct cmd_chain_req { + __u64 buf_addr; + __u32 buf_size; + __u32 count; +} __packed; + +struct cmd_chain_resp { + enum aie2_msg_status status; + __u32 fail_cmd_idx; + enum aie2_msg_status fail_cmd_status; +} __packed; + +#define AIE2_MSG_SYNC_BO_SRC_TYPE GENMASK(3, 0) +#define AIE2_MSG_SYNC_BO_DST_TYPE GENMASK(7, 4) +struct sync_bo_req { + __u64 src_addr; + __u64 dst_addr; + __u32 size; +#define SYNC_BO_DEV_MEM 0 +#define SYNC_BO_HOST_MEM 2 + __u32 type; +} __packed; + +struct sync_bo_resp { + enum aie2_msg_status status; +} __packed; +#endif /* 
_AIE2_MSG_PRIV_H_ */ diff --git a/drivers/accel/amdxdna/aie2_pci.c b/drivers/accel/amdxdna/aie2_pci.c new file mode 100644 index 000000000000..5a058e565b01 --- /dev/null +++ b/drivers/accel/amdxdna/aie2_pci.c @@ -0,0 +1,928 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2023-2024, Advanced Micro Devices, Inc. + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "aie2_msg_priv.h" +#include "aie2_pci.h" +#include "aie2_solver.h" +#include "amdxdna_ctx.h" +#include "amdxdna_gem.h" +#include "amdxdna_mailbox.h" +#include "amdxdna_pci_drv.h" + +static int aie2_max_col = XRS_MAX_COL; +module_param(aie2_max_col, uint, 0600); +MODULE_PARM_DESC(aie2_max_col, "Maximum column could be used"); + +/* + * The management mailbox channel is allocated by firmware. + * The related register and ring buffer information is on SRAM BAR. + * This struct is the register layout. + */ +#define MGMT_MBOX_MAGIC 0x55504e5f /* _NPU */ +struct mgmt_mbox_chann_info { + __u32 x2i_tail; + __u32 x2i_head; + __u32 x2i_buf; + __u32 x2i_buf_sz; + __u32 i2x_tail; + __u32 i2x_head; + __u32 i2x_buf; + __u32 i2x_buf_sz; + __u32 magic; + __u32 msi_id; + __u32 prot_major; + __u32 prot_minor; + __u32 rsvd[4]; +}; + +static int aie2_check_protocol(struct amdxdna_dev_hdl *ndev, u32 fw_major, u32 fw_minor) +{ + struct amdxdna_dev *xdna = ndev->xdna; + + /* + * The driver supported mailbox behavior is defined by + * ndev->priv->protocol_major and protocol_minor. + * + * When protocol_major and fw_major are different, it means driver + * and firmware are incompatible. + */ + if (ndev->priv->protocol_major != fw_major) { + XDNA_ERR(xdna, "Incompatible firmware protocol major %d minor %d", + fw_major, fw_minor); + return -EINVAL; + } + + /* + * When protocol_minor is greater than fw_minor, that means driver + * relies on operation the installed firmware does not support. 
+ */ + if (ndev->priv->protocol_minor > fw_minor) { + XDNA_ERR(xdna, "Firmware minor version smaller than supported"); + return -EINVAL; + } + return 0; +} + +static void aie2_dump_chann_info_debug(struct amdxdna_dev_hdl *ndev) +{ + struct amdxdna_dev *xdna = ndev->xdna; + + XDNA_DBG(xdna, "i2x tail 0x%x", ndev->mgmt_i2x.mb_tail_ptr_reg); + XDNA_DBG(xdna, "i2x head 0x%x", ndev->mgmt_i2x.mb_head_ptr_reg); + XDNA_DBG(xdna, "i2x ringbuf 0x%x", ndev->mgmt_i2x.rb_start_addr); + XDNA_DBG(xdna, "i2x rsize 0x%x", ndev->mgmt_i2x.rb_size); + XDNA_DBG(xdna, "x2i tail 0x%x", ndev->mgmt_x2i.mb_tail_ptr_reg); + XDNA_DBG(xdna, "x2i head 0x%x", ndev->mgmt_x2i.mb_head_ptr_reg); + XDNA_DBG(xdna, "x2i ringbuf 0x%x", ndev->mgmt_x2i.rb_start_addr); + XDNA_DBG(xdna, "x2i rsize 0x%x", ndev->mgmt_x2i.rb_size); + XDNA_DBG(xdna, "x2i chann index 0x%x", ndev->mgmt_chan_idx); + XDNA_DBG(xdna, "mailbox protocol major 0x%x", ndev->mgmt_prot_major); + XDNA_DBG(xdna, "mailbox protocol minor 0x%x", ndev->mgmt_prot_minor); +} + +static int aie2_get_mgmt_chann_info(struct amdxdna_dev_hdl *ndev) +{ + struct mgmt_mbox_chann_info info_regs; + struct xdna_mailbox_chann_res *i2x; + struct xdna_mailbox_chann_res *x2i; + u32 addr, off; + u32 *reg; + int ret; + int i; + + /* + * Once firmware is alive, it will write management channel + * information in SRAM BAR and write the address of that information + * at FW_ALIVE_OFF offset in SRAM BAR. + * + * Reading a non-zero value from FW_ALIVE_OFF implies that firmware + * is alive. 
+ */ + ret = readx_poll_timeout(readl, SRAM_GET_ADDR(ndev, FW_ALIVE_OFF), + addr, addr, AIE2_INTERVAL, AIE2_TIMEOUT); + if (ret || !addr) + return -ETIME; + + off = AIE2_SRAM_OFF(ndev, addr); + reg = (u32 *)&info_regs; + for (i = 0; i < sizeof(info_regs) / sizeof(u32); i++) + reg[i] = readl(ndev->sram_base + off + i * sizeof(u32)); + + if (info_regs.magic != MGMT_MBOX_MAGIC) { + XDNA_ERR(ndev->xdna, "Invalid mbox magic 0x%x", info_regs.magic); + ret = -EINVAL; + goto done; + } + + i2x = &ndev->mgmt_i2x; + x2i = &ndev->mgmt_x2i; + + i2x->mb_head_ptr_reg = AIE2_MBOX_OFF(ndev, info_regs.i2x_head); + i2x->mb_tail_ptr_reg = AIE2_MBOX_OFF(ndev, info_regs.i2x_tail); + i2x->rb_start_addr = AIE2_SRAM_OFF(ndev, info_regs.i2x_buf); + i2x->rb_size = info_regs.i2x_buf_sz; + + x2i->mb_head_ptr_reg = AIE2_MBOX_OFF(ndev, info_regs.x2i_head); + x2i->mb_tail_ptr_reg = AIE2_MBOX_OFF(ndev, info_regs.x2i_tail); + x2i->rb_start_addr = AIE2_SRAM_OFF(ndev, info_regs.x2i_buf); + x2i->rb_size = info_regs.x2i_buf_sz; + + ndev->mgmt_chan_idx = info_regs.msi_id; + ndev->mgmt_prot_major = info_regs.prot_major; + ndev->mgmt_prot_minor = info_regs.prot_minor; + + ret = aie2_check_protocol(ndev, ndev->mgmt_prot_major, ndev->mgmt_prot_minor); + +done: + aie2_dump_chann_info_debug(ndev); + + /* Must clear address at FW_ALIVE_OFF */ + writel(0, SRAM_GET_ADDR(ndev, FW_ALIVE_OFF)); + + return ret; +} + +int aie2_runtime_cfg(struct amdxdna_dev_hdl *ndev, + enum rt_config_category category, u32 *val) +{ + const struct rt_config *cfg; + u32 value; + int ret; + + for (cfg = ndev->priv->rt_config; cfg->type; cfg++) { + if (cfg->category != category) + continue; + + value = val ? 
*val : cfg->value; + ret = aie2_set_runtime_cfg(ndev, cfg->type, value); + if (ret) { + XDNA_ERR(ndev->xdna, "Set type %d value %d failed", + cfg->type, value); + return ret; + } + } + + return 0; +} + +static int aie2_xdna_reset(struct amdxdna_dev_hdl *ndev) +{ + int ret; + + ret = aie2_suspend_fw(ndev); + if (ret) { + XDNA_ERR(ndev->xdna, "Suspend firmware failed"); + return ret; + } + + ret = aie2_resume_fw(ndev); + if (ret) { + XDNA_ERR(ndev->xdna, "Resume firmware failed"); + return ret; + } + + return 0; +} + +static int aie2_mgmt_fw_init(struct amdxdna_dev_hdl *ndev) +{ + int ret; + + ret = aie2_runtime_cfg(ndev, AIE2_RT_CFG_INIT, NULL); + if (ret) { + XDNA_ERR(ndev->xdna, "Runtime config failed"); + return ret; + } + + ret = aie2_assign_mgmt_pasid(ndev, 0); + if (ret) { + XDNA_ERR(ndev->xdna, "Can not assign PASID"); + return ret; + } + + ret = aie2_xdna_reset(ndev); + if (ret) { + XDNA_ERR(ndev->xdna, "Reset firmware failed"); + return ret; + } + + if (!ndev->async_events) + return 0; + + ret = aie2_error_async_events_send(ndev); + if (ret) { + XDNA_ERR(ndev->xdna, "Send async events failed"); + return ret; + } + + return 0; +} + +static int aie2_mgmt_fw_query(struct amdxdna_dev_hdl *ndev) +{ + int ret; + + ret = aie2_query_firmware_version(ndev, &ndev->xdna->fw_ver); + if (ret) { + XDNA_ERR(ndev->xdna, "query firmware version failed"); + return ret; + } + + ret = aie2_query_aie_version(ndev, &ndev->version); + if (ret) { + XDNA_ERR(ndev->xdna, "Query AIE version failed"); + return ret; + } + + ret = aie2_query_aie_metadata(ndev, &ndev->metadata); + if (ret) { + XDNA_ERR(ndev->xdna, "Query AIE metadata failed"); + return ret; + } + + return 0; +} + +static void aie2_mgmt_fw_fini(struct amdxdna_dev_hdl *ndev) +{ + if (aie2_suspend_fw(ndev)) + XDNA_ERR(ndev->xdna, "Suspend_fw failed"); + XDNA_DBG(ndev->xdna, "Firmware suspended"); +} + +static int aie2_xrs_load(void *cb_arg, struct xrs_action_load *action) +{ + struct amdxdna_hwctx *hwctx = cb_arg; + struct 
amdxdna_dev *xdna; + int ret; + + xdna = hwctx->client->xdna; + + hwctx->start_col = action->part.start_col; + hwctx->num_col = action->part.ncols; + ret = aie2_create_context(xdna->dev_handle, hwctx); + if (ret) + XDNA_ERR(xdna, "create context failed, ret %d", ret); + + return ret; +} + +static int aie2_xrs_unload(void *cb_arg) +{ + struct amdxdna_hwctx *hwctx = cb_arg; + struct amdxdna_dev *xdna; + int ret; + + xdna = hwctx->client->xdna; + + ret = aie2_destroy_context(xdna->dev_handle, hwctx); + if (ret) + XDNA_ERR(xdna, "destroy context failed, ret %d", ret); + + return ret; +} + +static int aie2_xrs_set_dft_dpm_level(struct drm_device *ddev, u32 dpm_level) +{ + struct amdxdna_dev *xdna = to_xdna_dev(ddev); + struct amdxdna_dev_hdl *ndev; + + drm_WARN_ON(&xdna->ddev, !mutex_is_locked(&xdna->dev_lock)); + + ndev = xdna->dev_handle; + ndev->dft_dpm_level = dpm_level; + if (ndev->pw_mode != POWER_MODE_DEFAULT || ndev->dpm_level == dpm_level) + return 0; + + return ndev->priv->hw_ops.set_dpm(ndev, dpm_level); +} + +static struct xrs_action_ops aie2_xrs_actions = { + .load = aie2_xrs_load, + .unload = aie2_xrs_unload, + .set_dft_dpm_level = aie2_xrs_set_dft_dpm_level, +}; + +static void aie2_hw_stop(struct amdxdna_dev *xdna) +{ + struct pci_dev *pdev = to_pci_dev(xdna->ddev.dev); + struct amdxdna_dev_hdl *ndev = xdna->dev_handle; + + if (ndev->dev_status <= AIE2_DEV_INIT) { + XDNA_ERR(xdna, "device is already stopped"); + return; + } + + aie2_mgmt_fw_fini(ndev); + xdna_mailbox_stop_channel(ndev->mgmt_chann); + xdna_mailbox_destroy_channel(ndev->mgmt_chann); + ndev->mgmt_chann = NULL; + drmm_kfree(&xdna->ddev, ndev->mbox); + ndev->mbox = NULL; + aie2_psp_stop(ndev->psp_hdl); + aie2_smu_fini(ndev); + pci_disable_device(pdev); + + ndev->dev_status = AIE2_DEV_INIT; +} + +static int aie2_hw_start(struct amdxdna_dev *xdna) +{ + struct pci_dev *pdev = to_pci_dev(xdna->ddev.dev); + struct amdxdna_dev_hdl *ndev = xdna->dev_handle; + struct xdna_mailbox_res mbox_res; + u32 
xdna_mailbox_intr_reg; + int mgmt_mb_irq, ret; + + if (ndev->dev_status >= AIE2_DEV_START) { + XDNA_INFO(xdna, "device is already started"); + return 0; + } + + ret = pci_enable_device(pdev); + if (ret) { + XDNA_ERR(xdna, "failed to enable device, ret %d", ret); + return ret; + } + pci_set_master(pdev); + + ret = aie2_smu_init(ndev); + if (ret) { + XDNA_ERR(xdna, "failed to init smu, ret %d", ret); + goto disable_dev; + } + + ret = aie2_psp_start(ndev->psp_hdl); + if (ret) { + XDNA_ERR(xdna, "failed to start psp, ret %d", ret); + goto fini_smu; + } + + ret = aie2_get_mgmt_chann_info(ndev); + if (ret) { + XDNA_ERR(xdna, "firmware is not alive"); + goto stop_psp; + } + + mbox_res.ringbuf_base = ndev->sram_base; + mbox_res.ringbuf_size = pci_resource_len(pdev, xdna->dev_info->sram_bar); + mbox_res.mbox_base = ndev->mbox_base; + mbox_res.mbox_size = MBOX_SIZE(ndev); + mbox_res.name = "xdna_mailbox"; + ndev->mbox = xdnam_mailbox_create(&xdna->ddev, &mbox_res); + if (!ndev->mbox) { + XDNA_ERR(xdna, "failed to create mailbox device"); + ret = -ENODEV; + goto stop_psp; + } + + mgmt_mb_irq = pci_irq_vector(pdev, ndev->mgmt_chan_idx); + if (mgmt_mb_irq < 0) { + ret = mgmt_mb_irq; + XDNA_ERR(xdna, "failed to alloc irq vector, ret %d", ret); + goto stop_psp; + } + + xdna_mailbox_intr_reg = ndev->mgmt_i2x.mb_head_ptr_reg + 4; + ndev->mgmt_chann = xdna_mailbox_create_channel(ndev->mbox, + &ndev->mgmt_x2i, + &ndev->mgmt_i2x, + xdna_mailbox_intr_reg, + mgmt_mb_irq); + if (!ndev->mgmt_chann) { + XDNA_ERR(xdna, "failed to create management mailbox channel"); + ret = -EINVAL; + goto stop_psp; + } + + ret = aie2_pm_init(ndev); + if (ret) { + XDNA_ERR(xdna, "failed to init pm, ret %d", ret); + goto destroy_mgmt_chann; + } + + ret = aie2_mgmt_fw_init(ndev); + if (ret) { + XDNA_ERR(xdna, "initial mgmt firmware failed, ret %d", ret); + goto destroy_mgmt_chann; + } + + ndev->dev_status = AIE2_DEV_START; + + return 0; + +destroy_mgmt_chann: + xdna_mailbox_stop_channel(ndev->mgmt_chann); + 
xdna_mailbox_destroy_channel(ndev->mgmt_chann); +stop_psp: + aie2_psp_stop(ndev->psp_hdl); +fini_smu: + aie2_smu_fini(ndev); +disable_dev: + pci_disable_device(pdev); + + return ret; +} + +static int aie2_init(struct amdxdna_dev *xdna) +{ + struct pci_dev *pdev = to_pci_dev(xdna->ddev.dev); + void __iomem *tbl[PCI_NUM_RESOURCES] = {0}; + struct init_config xrs_cfg = { 0 }; + struct amdxdna_dev_hdl *ndev; + struct psp_config psp_conf; + const struct firmware *fw; + unsigned long bars = 0; + int i, nvec, ret; + + ndev = drmm_kzalloc(&xdna->ddev, sizeof(*ndev), GFP_KERNEL); + if (!ndev) + return -ENOMEM; + + ndev->priv = xdna->dev_info->dev_priv; + ndev->xdna = xdna; + + ret = request_firmware(&fw, ndev->priv->fw_path, &pdev->dev); + if (ret) { + XDNA_ERR(xdna, "failed to request_firmware %s, ret %d", + ndev->priv->fw_path, ret); + return ret; + } + + ret = pcim_enable_device(pdev); + if (ret) { + XDNA_ERR(xdna, "pcim enable device failed, ret %d", ret); + goto release_fw; + } + + for (i = 0; i < PSP_MAX_REGS; i++) + set_bit(PSP_REG_BAR(ndev, i), &bars); + + set_bit(xdna->dev_info->sram_bar, &bars); + set_bit(xdna->dev_info->smu_bar, &bars); + set_bit(xdna->dev_info->mbox_bar, &bars); + + for (i = 0; i < PCI_NUM_RESOURCES; i++) { + if (!test_bit(i, &bars)) + continue; + tbl[i] = pcim_iomap(pdev, i, 0); + if (!tbl[i]) { + XDNA_ERR(xdna, "map bar %d failed", i); + ret = -ENOMEM; + goto release_fw; + } + } + + ndev->sram_base = tbl[xdna->dev_info->sram_bar]; + ndev->smu_base = tbl[xdna->dev_info->smu_bar]; + ndev->mbox_base = tbl[xdna->dev_info->mbox_bar]; + + ret = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(64)); + if (ret) { + XDNA_ERR(xdna, "Failed to set DMA mask: %d", ret); + goto release_fw; + } + + nvec = pci_msix_vec_count(pdev); + if (nvec <= 0) { + XDNA_ERR(xdna, "does not get number of interrupt vector"); + ret = -EINVAL; + goto release_fw; + } + + ret = pci_alloc_irq_vectors(pdev, nvec, nvec, PCI_IRQ_MSIX); + if (ret < 0) { + XDNA_ERR(xdna, "failed to 
alloc irq vectors, ret %d", ret); + goto release_fw; + } + + ret = iommu_dev_enable_feature(&pdev->dev, IOMMU_DEV_FEAT_SVA); + if (ret) { + XDNA_ERR(xdna, "Enable PASID failed, ret %d", ret); + goto free_irq; + } + + psp_conf.fw_size = fw->size; + psp_conf.fw_buf = fw->data; + for (i = 0; i < PSP_MAX_REGS; i++) + psp_conf.psp_regs[i] = tbl[PSP_REG_BAR(ndev, i)] + PSP_REG_OFF(ndev, i); + ndev->psp_hdl = aie2m_psp_create(&xdna->ddev, &psp_conf); + if (!ndev->psp_hdl) { + XDNA_ERR(xdna, "failed to create psp"); + ret = -ENOMEM; + goto disable_sva; + } + xdna->dev_handle = ndev; + + ret = aie2_hw_start(xdna); + if (ret) { + XDNA_ERR(xdna, "start npu failed, ret %d", ret); + goto disable_sva; + } + + ret = aie2_mgmt_fw_query(ndev); + if (ret) { + XDNA_ERR(xdna, "Query firmware failed, ret %d", ret); + goto stop_hw; + } + ndev->total_col = min(aie2_max_col, ndev->metadata.cols); + + xrs_cfg.clk_list.num_levels = ndev->max_dpm_level + 1; + for (i = 0; i < xrs_cfg.clk_list.num_levels; i++) + xrs_cfg.clk_list.cu_clk_list[i] = ndev->priv->dpm_clk_tbl[i].hclk; + xrs_cfg.sys_eff_factor = 1; + xrs_cfg.ddev = &xdna->ddev; + xrs_cfg.actions = &aie2_xrs_actions; + xrs_cfg.total_col = ndev->total_col; + + xdna->xrs_hdl = xrsm_init(&xrs_cfg); + if (!xdna->xrs_hdl) { + XDNA_ERR(xdna, "Initialize resolver failed"); + ret = -EINVAL; + goto stop_hw; + } + + ret = aie2_error_async_events_alloc(ndev); + if (ret) { + XDNA_ERR(xdna, "Allocate async events failed, ret %d", ret); + goto stop_hw; + } + + ret = aie2_error_async_events_send(ndev); + if (ret) { + XDNA_ERR(xdna, "Send async events failed, ret %d", ret); + goto async_event_free; + } + + /* Issue a command to make sure firmware handled async events */ + ret = aie2_query_firmware_version(ndev, &ndev->xdna->fw_ver); + if (ret) { + XDNA_ERR(xdna, "Re-query firmware version failed"); + goto async_event_free; + } + + release_firmware(fw); + return 0; + +async_event_free: + aie2_error_async_events_free(ndev); +stop_hw: + 
aie2_hw_stop(xdna); +disable_sva: + iommu_dev_disable_feature(&pdev->dev, IOMMU_DEV_FEAT_SVA); +free_irq: + pci_free_irq_vectors(pdev); +release_fw: + release_firmware(fw); + + return ret; +} + +static void aie2_fini(struct amdxdna_dev *xdna) +{ + struct pci_dev *pdev = to_pci_dev(xdna->ddev.dev); + struct amdxdna_dev_hdl *ndev = xdna->dev_handle; + + aie2_hw_stop(xdna); + aie2_error_async_events_free(ndev); + iommu_dev_disable_feature(&pdev->dev, IOMMU_DEV_FEAT_SVA); + pci_free_irq_vectors(pdev); +} + +static int aie2_get_aie_status(struct amdxdna_client *client, + struct amdxdna_drm_get_info *args) +{ + struct amdxdna_drm_query_aie_status status; + struct amdxdna_dev *xdna = client->xdna; + struct amdxdna_dev_hdl *ndev; + int ret; + + ndev = xdna->dev_handle; + if (copy_from_user(&status, u64_to_user_ptr(args->buffer), sizeof(status))) { + XDNA_ERR(xdna, "Failed to copy AIE request into kernel"); + return -EFAULT; + } + + if (ndev->metadata.cols * ndev->metadata.size < status.buffer_size) { + XDNA_ERR(xdna, "Invalid buffer size. Given Size: %u. Need Size: %u.", + status.buffer_size, ndev->metadata.cols * ndev->metadata.size); + return -EINVAL; + } + + ret = aie2_query_status(ndev, u64_to_user_ptr(status.buffer), + status.buffer_size, &status.cols_filled); + if (ret) { + XDNA_ERR(xdna, "Failed to get AIE status info. 
Ret: %d", ret); + return ret; + } + + if (copy_to_user(u64_to_user_ptr(args->buffer), &status, sizeof(status))) { + XDNA_ERR(xdna, "Failed to copy AIE request info to user space"); + return -EFAULT; + } + + return 0; +} + +static int aie2_get_aie_metadata(struct amdxdna_client *client, + struct amdxdna_drm_get_info *args) +{ + struct amdxdna_drm_query_aie_metadata *meta; + struct amdxdna_dev *xdna = client->xdna; + struct amdxdna_dev_hdl *ndev; + int ret = 0; + + ndev = xdna->dev_handle; + meta = kzalloc(sizeof(*meta), GFP_KERNEL); + if (!meta) + return -ENOMEM; + + meta->col_size = ndev->metadata.size; + meta->cols = ndev->metadata.cols; + meta->rows = ndev->metadata.rows; + + meta->version.major = ndev->metadata.version.major; + meta->version.minor = ndev->metadata.version.minor; + + meta->core.row_count = ndev->metadata.core.row_count; + meta->core.row_start = ndev->metadata.core.row_start; + meta->core.dma_channel_count = ndev->metadata.core.dma_channel_count; + meta->core.lock_count = ndev->metadata.core.lock_count; + meta->core.event_reg_count = ndev->metadata.core.event_reg_count; + + meta->mem.row_count = ndev->metadata.mem.row_count; + meta->mem.row_start = ndev->metadata.mem.row_start; + meta->mem.dma_channel_count = ndev->metadata.mem.dma_channel_count; + meta->mem.lock_count = ndev->metadata.mem.lock_count; + meta->mem.event_reg_count = ndev->metadata.mem.event_reg_count; + + meta->shim.row_count = ndev->metadata.shim.row_count; + meta->shim.row_start = ndev->metadata.shim.row_start; + meta->shim.dma_channel_count = ndev->metadata.shim.dma_channel_count; + meta->shim.lock_count = ndev->metadata.shim.lock_count; + meta->shim.event_reg_count = ndev->metadata.shim.event_reg_count; + + if (copy_to_user(u64_to_user_ptr(args->buffer), meta, sizeof(*meta))) + ret = -EFAULT; + + kfree(meta); + return ret; +} + +static int aie2_get_aie_version(struct amdxdna_client *client, + struct amdxdna_drm_get_info *args) +{ + struct amdxdna_drm_query_aie_version version; + 
struct amdxdna_dev *xdna = client->xdna; + struct amdxdna_dev_hdl *ndev; + + ndev = xdna->dev_handle; + version.major = ndev->version.major; + version.minor = ndev->version.minor; + + if (copy_to_user(u64_to_user_ptr(args->buffer), &version, sizeof(version))) + return -EFAULT; + + return 0; +} + +static int aie2_get_firmware_version(struct amdxdna_client *client, + struct amdxdna_drm_get_info *args) +{ + struct amdxdna_drm_query_firmware_version version; + struct amdxdna_dev *xdna = client->xdna; + + version.major = xdna->fw_ver.major; + version.minor = xdna->fw_ver.minor; + version.patch = xdna->fw_ver.sub; + version.build = xdna->fw_ver.build; + + if (copy_to_user(u64_to_user_ptr(args->buffer), &version, sizeof(version))) + return -EFAULT; + + return 0; +} + +static int aie2_get_power_mode(struct amdxdna_client *client, + struct amdxdna_drm_get_info *args) +{ + struct amdxdna_drm_get_power_mode mode = {}; + struct amdxdna_dev *xdna = client->xdna; + struct amdxdna_dev_hdl *ndev; + + ndev = xdna->dev_handle; + mode.power_mode = ndev->pw_mode; + + if (copy_to_user(u64_to_user_ptr(args->buffer), &mode, sizeof(mode))) + return -EFAULT; + + return 0; +} + +static int aie2_get_clock_metadata(struct amdxdna_client *client, + struct amdxdna_drm_get_info *args) +{ + struct amdxdna_drm_query_clock_metadata *clock; + struct amdxdna_dev *xdna = client->xdna; + struct amdxdna_dev_hdl *ndev; + int ret = 0; + + ndev = xdna->dev_handle; + clock = kzalloc(sizeof(*clock), GFP_KERNEL); + if (!clock) + return -ENOMEM; + + snprintf(clock->mp_npu_clock.name, sizeof(clock->mp_npu_clock.name), + "MP-NPU Clock"); + clock->mp_npu_clock.freq_mhz = ndev->npuclk_freq; + snprintf(clock->h_clock.name, sizeof(clock->h_clock.name), "H Clock"); + clock->h_clock.freq_mhz = ndev->hclk_freq; + + if (copy_to_user(u64_to_user_ptr(args->buffer), clock, sizeof(*clock))) + ret = -EFAULT; + + kfree(clock); + return ret; +} + +static int aie2_get_hwctx_status(struct amdxdna_client *client, + struct 
amdxdna_drm_get_info *args) +{ + struct amdxdna_drm_query_hwctx __user *buf; + struct amdxdna_dev *xdna = client->xdna; + struct amdxdna_drm_query_hwctx *tmp; + struct amdxdna_client *tmp_client; + struct amdxdna_hwctx *hwctx; + unsigned long hwctx_id; + bool overflow = false; + u32 req_bytes = 0; + u32 hw_i = 0; + int ret = 0; + int idx; + + drm_WARN_ON(&xdna->ddev, !mutex_is_locked(&xdna->dev_lock)); + + tmp = kzalloc(sizeof(*tmp), GFP_KERNEL); + if (!tmp) + return -ENOMEM; + + buf = u64_to_user_ptr(args->buffer); + list_for_each_entry(tmp_client, &xdna->client_list, node) { + idx = srcu_read_lock(&tmp_client->hwctx_srcu); + amdxdna_for_each_hwctx(tmp_client, hwctx_id, hwctx) { + req_bytes += sizeof(*tmp); + if (args->buffer_size < req_bytes) { + /* Continue iterating to get the required size */ + overflow = true; + continue; + } + + memset(tmp, 0, sizeof(*tmp)); + tmp->pid = tmp_client->pid; + tmp->context_id = hwctx->id; + tmp->start_col = hwctx->start_col; + tmp->num_col = hwctx->num_col; + tmp->command_submissions = hwctx->priv->seq; + tmp->command_completions = hwctx->priv->completed; + + if (copy_to_user(&buf[hw_i], tmp, sizeof(*tmp))) { + ret = -EFAULT; + srcu_read_unlock(&tmp_client->hwctx_srcu, idx); + goto out; + } + hw_i++; + } + srcu_read_unlock(&tmp_client->hwctx_srcu, idx); + } + + if (overflow) { + XDNA_ERR(xdna, "Invalid buffer size. 
Given: %u Need: %u.", + args->buffer_size, req_bytes); + ret = -EINVAL; + } + +out: + kfree(tmp); + args->buffer_size = req_bytes; + return ret; +} + +static int aie2_get_info(struct amdxdna_client *client, struct amdxdna_drm_get_info *args) +{ + struct amdxdna_dev *xdna = client->xdna; + int ret, idx; + + if (!drm_dev_enter(&xdna->ddev, &idx)) + return -ENODEV; + + switch (args->param) { + case DRM_AMDXDNA_QUERY_AIE_STATUS: + ret = aie2_get_aie_status(client, args); + break; + case DRM_AMDXDNA_QUERY_AIE_METADATA: + ret = aie2_get_aie_metadata(client, args); + break; + case DRM_AMDXDNA_QUERY_AIE_VERSION: + ret = aie2_get_aie_version(client, args); + break; + case DRM_AMDXDNA_QUERY_CLOCK_METADATA: + ret = aie2_get_clock_metadata(client, args); + break; + case DRM_AMDXDNA_QUERY_HW_CONTEXTS: + ret = aie2_get_hwctx_status(client, args); + break; + case DRM_AMDXDNA_QUERY_FIRMWARE_VERSION: + ret = aie2_get_firmware_version(client, args); + break; + case DRM_AMDXDNA_GET_POWER_MODE: + ret = aie2_get_power_mode(client, args); + break; + default: + XDNA_ERR(xdna, "Not supported request parameter %u", args->param); + ret = -EOPNOTSUPP; + } + XDNA_DBG(xdna, "Got param %d", args->param); + + drm_dev_exit(idx); + return ret; +} + +static int aie2_set_power_mode(struct amdxdna_client *client, + struct amdxdna_drm_set_state *args) +{ + struct amdxdna_drm_set_power_mode power_state; + enum amdxdna_power_mode_type power_mode; + struct amdxdna_dev *xdna = client->xdna; + + if (copy_from_user(&power_state, u64_to_user_ptr(args->buffer), + sizeof(power_state))) { + XDNA_ERR(xdna, "Failed to copy power mode request into kernel"); + return -EFAULT; + } + + if (XDNA_MBZ_DBG(xdna, power_state.pad, sizeof(power_state.pad))) + return -EINVAL; + + power_mode = power_state.power_mode; + if (power_mode > POWER_MODE_TURBO) { + XDNA_ERR(xdna, "Invalid power mode %d", power_mode); + return -EINVAL; + } + + return aie2_pm_set_mode(xdna->dev_handle, power_mode); +} + +static int 
aie2_set_state(struct amdxdna_client *client, + struct amdxdna_drm_set_state *args) +{ + struct amdxdna_dev *xdna = client->xdna; + int ret, idx; + + if (!drm_dev_enter(&xdna->ddev, &idx)) + return -ENODEV; + + switch (args->param) { + case DRM_AMDXDNA_SET_POWER_MODE: + ret = aie2_set_power_mode(client, args); + break; + default: + XDNA_ERR(xdna, "Not supported request parameter %u", args->param); + ret = -EOPNOTSUPP; + break; + } + + drm_dev_exit(idx); + return ret; +} + +const struct amdxdna_dev_ops aie2_ops = { + .init = aie2_init, + .fini = aie2_fini, + .resume = aie2_hw_start, + .suspend = aie2_hw_stop, + .get_aie_info = aie2_get_info, + .set_aie_state = aie2_set_state, + .hwctx_init = aie2_hwctx_init, + .hwctx_fini = aie2_hwctx_fini, + .hwctx_config = aie2_hwctx_config, + .cmd_submit = aie2_cmd_submit, + .hmm_invalidate = aie2_hmm_invalidate, + .hwctx_suspend = aie2_hwctx_suspend, + .hwctx_resume = aie2_hwctx_resume, +}; diff --git a/drivers/accel/amdxdna/aie2_pci.h b/drivers/accel/amdxdna/aie2_pci.h new file mode 100644 index 000000000000..f2d95531ddc2 --- /dev/null +++ b/drivers/accel/amdxdna/aie2_pci.h @@ -0,0 +1,297 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright (C) 2023-2024, Advanced Micro Devices, Inc. 
+ */ + +#ifndef _AIE2_PCI_H_ +#define _AIE2_PCI_H_ + +#include +#include + +#include "amdxdna_mailbox.h" + +#define AIE2_INTERVAL 20000 /* us */ +#define AIE2_TIMEOUT 1000000 /* us */ + +/* Firmware determines device memory base address and size */ +#define AIE2_DEVM_BASE 0x4000000 +#define AIE2_DEVM_SIZE SZ_64M + +#define NDEV2PDEV(ndev) (to_pci_dev((ndev)->xdna->ddev.dev)) + +#define AIE2_SRAM_OFF(ndev, addr) ((addr) - (ndev)->priv->sram_dev_addr) +#define AIE2_MBOX_OFF(ndev, addr) ((addr) - (ndev)->priv->mbox_dev_addr) + +#define PSP_REG_BAR(ndev, idx) ((ndev)->priv->psp_regs_off[(idx)].bar_idx) +#define PSP_REG_OFF(ndev, idx) ((ndev)->priv->psp_regs_off[(idx)].offset) +#define SRAM_REG_OFF(ndev, idx) ((ndev)->priv->sram_offs[(idx)].offset) + +#define SMU_REG(ndev, idx) \ +({ \ + typeof(ndev) _ndev = ndev; \ + ((_ndev)->smu_base + (_ndev)->priv->smu_regs_off[(idx)].offset); \ +}) +#define SRAM_GET_ADDR(ndev, idx) \ +({ \ + typeof(ndev) _ndev = ndev; \ + ((_ndev)->sram_base + SRAM_REG_OFF((_ndev), (idx))); \ +}) + +#define CHAN_SLOT_SZ SZ_8K +#define MBOX_SIZE(ndev) \ +({ \ + typeof(ndev) _ndev = (ndev); \ + ((_ndev)->priv->mbox_size) ? 
(_ndev)->priv->mbox_size : \ + pci_resource_len(NDEV2PDEV(_ndev), (_ndev)->xdna->dev_info->mbox_bar); \ +}) + +enum aie2_smu_reg_idx { + SMU_CMD_REG = 0, + SMU_ARG_REG, + SMU_INTR_REG, + SMU_RESP_REG, + SMU_OUT_REG, + SMU_MAX_REGS /* Keep this at the end */ +}; + +enum aie2_sram_reg_idx { + MBOX_CHANN_OFF = 0, + FW_ALIVE_OFF, + SRAM_MAX_INDEX /* Keep this at the end */ +}; + +enum psp_reg_idx { + PSP_CMD_REG = 0, + PSP_ARG0_REG, + PSP_ARG1_REG, + PSP_ARG2_REG, + PSP_NUM_IN_REGS, /* number of input registers */ + PSP_INTR_REG = PSP_NUM_IN_REGS, + PSP_STATUS_REG, + PSP_RESP_REG, + PSP_MAX_REGS /* Keep this at the end */ +}; + +struct amdxdna_client; +struct amdxdna_fw_ver; +struct amdxdna_hwctx; +struct amdxdna_sched_job; + +struct psp_config { + const void *fw_buf; + u32 fw_size; + void __iomem *psp_regs[PSP_MAX_REGS]; +}; + +struct aie_version { + u16 major; + u16 minor; +}; + +struct aie_tile_metadata { + u16 row_count; + u16 row_start; + u16 dma_channel_count; + u16 lock_count; + u16 event_reg_count; +}; + +struct aie_metadata { + u32 size; + u16 cols; + u16 rows; + struct aie_version version; + struct aie_tile_metadata core; + struct aie_tile_metadata mem; + struct aie_tile_metadata shim; +}; + +enum rt_config_category { + AIE2_RT_CFG_INIT, + AIE2_RT_CFG_CLK_GATING, +}; + +struct rt_config { + u32 type; + u32 value; + u32 category; +}; + +struct dpm_clk_freq { + u32 npuclk; + u32 hclk; +}; + +/* + * Define the maximum number of pending commands in a hardware context. + * Must be power of 2! 
+ */ +#define HWCTX_MAX_CMDS 4 +#define get_job_idx(seq) ((seq) & (HWCTX_MAX_CMDS - 1)) +struct amdxdna_hwctx_priv { + struct amdxdna_gem_obj *heap; + void *mbox_chann; + + struct drm_gpu_scheduler sched; + struct drm_sched_entity entity; + + struct mutex io_lock; /* protect seq and cmd order */ + struct wait_queue_head job_free_wq; + u32 num_pending; + u64 seq; + struct semaphore job_sem; + bool job_done; + + /* Completed job counter */ + u64 completed; + + struct amdxdna_gem_obj *cmd_buf[HWCTX_MAX_CMDS]; + struct drm_syncobj *syncobj; +}; + +enum aie2_dev_status { + AIE2_DEV_UNINIT, + AIE2_DEV_INIT, + AIE2_DEV_START, +}; + +struct amdxdna_dev_hdl { + struct amdxdna_dev *xdna; + const struct amdxdna_dev_priv *priv; + void __iomem *sram_base; + void __iomem *smu_base; + void __iomem *mbox_base; + struct psp_device *psp_hdl; + + struct xdna_mailbox_chann_res mgmt_x2i; + struct xdna_mailbox_chann_res mgmt_i2x; + u32 mgmt_chan_idx; + u32 mgmt_prot_major; + u32 mgmt_prot_minor; + + u32 total_col; + struct aie_version version; + struct aie_metadata metadata; + + /* power management and clock*/ + enum amdxdna_power_mode_type pw_mode; + u32 dpm_level; + u32 dft_dpm_level; + u32 max_dpm_level; + u32 clk_gating; + u32 npuclk_freq; + u32 hclk_freq; + + /* Mailbox and the management channel */ + struct mailbox *mbox; + struct mailbox_channel *mgmt_chann; + struct async_events *async_events; + + enum aie2_dev_status dev_status; + u32 hwctx_num; +}; + +#define DEFINE_BAR_OFFSET(reg_name, bar, reg_addr) \ + [reg_name] = {bar##_BAR_INDEX, (reg_addr) - bar##_BAR_BASE} + +struct aie2_bar_off_pair { + int bar_idx; + u32 offset; +}; + +struct aie2_hw_ops { + int (*set_dpm)(struct amdxdna_dev_hdl *ndev, u32 dpm_level); +}; + +struct amdxdna_dev_priv { + const char *fw_path; + u64 protocol_major; + u64 protocol_minor; + const struct rt_config *rt_config; + const struct dpm_clk_freq *dpm_clk_tbl; + +#define COL_ALIGN_NONE 0 +#define COL_ALIGN_NATURE 1 + u32 col_align; + u32 
mbox_dev_addr; + /* If mbox_size is 0, use BAR size. See MBOX_SIZE macro */ + u32 mbox_size; + u32 sram_dev_addr; + struct aie2_bar_off_pair sram_offs[SRAM_MAX_INDEX]; + struct aie2_bar_off_pair psp_regs_off[PSP_MAX_REGS]; + struct aie2_bar_off_pair smu_regs_off[SMU_MAX_REGS]; + struct aie2_hw_ops hw_ops; +}; + +extern const struct amdxdna_dev_ops aie2_ops; + +int aie2_runtime_cfg(struct amdxdna_dev_hdl *ndev, + enum rt_config_category category, u32 *val); + +/* aie2 npu hw config */ +extern const struct dpm_clk_freq npu1_dpm_clk_table[]; +extern const struct dpm_clk_freq npu4_dpm_clk_table[]; +extern const struct rt_config npu1_default_rt_cfg[]; +extern const struct rt_config npu4_default_rt_cfg[]; + +/* aie2_smu.c */ +int aie2_smu_init(struct amdxdna_dev_hdl *ndev); +void aie2_smu_fini(struct amdxdna_dev_hdl *ndev); +int npu1_set_dpm(struct amdxdna_dev_hdl *ndev, u32 dpm_level); +int npu4_set_dpm(struct amdxdna_dev_hdl *ndev, u32 dpm_level); + +/* aie2_pm.c */ +int aie2_pm_init(struct amdxdna_dev_hdl *ndev); +int aie2_pm_set_mode(struct amdxdna_dev_hdl *ndev, enum amdxdna_power_mode_type target); + +/* aie2_psp.c */ +struct psp_device *aie2m_psp_create(struct drm_device *ddev, struct psp_config *conf); +int aie2_psp_start(struct psp_device *psp); +void aie2_psp_stop(struct psp_device *psp); + +/* aie2_error.c */ +int aie2_error_async_events_alloc(struct amdxdna_dev_hdl *ndev); +void aie2_error_async_events_free(struct amdxdna_dev_hdl *ndev); +int aie2_error_async_events_send(struct amdxdna_dev_hdl *ndev); +int aie2_error_async_msg_thread(void *data); + +/* aie2_message.c */ +int aie2_suspend_fw(struct amdxdna_dev_hdl *ndev); +int aie2_resume_fw(struct amdxdna_dev_hdl *ndev); +int aie2_set_runtime_cfg(struct amdxdna_dev_hdl *ndev, u32 type, u64 value); +int aie2_get_runtime_cfg(struct amdxdna_dev_hdl *ndev, u32 type, u64 *value); +int aie2_assign_mgmt_pasid(struct amdxdna_dev_hdl *ndev, u16 pasid); +int aie2_query_aie_version(struct amdxdna_dev_hdl *ndev, struct 
aie_version *version); +int aie2_query_aie_metadata(struct amdxdna_dev_hdl *ndev, struct aie_metadata *metadata); +int aie2_query_firmware_version(struct amdxdna_dev_hdl *ndev, + struct amdxdna_fw_ver *fw_ver); +int aie2_create_context(struct amdxdna_dev_hdl *ndev, struct amdxdna_hwctx *hwctx); +int aie2_destroy_context(struct amdxdna_dev_hdl *ndev, struct amdxdna_hwctx *hwctx); +int aie2_map_host_buf(struct amdxdna_dev_hdl *ndev, u32 context_id, u64 addr, u64 size); +int aie2_query_status(struct amdxdna_dev_hdl *ndev, char __user *buf, u32 size, u32 *cols_filled); +int aie2_register_asyn_event_msg(struct amdxdna_dev_hdl *ndev, dma_addr_t addr, u32 size, + void *handle, int (*cb)(void*, const u32 *, size_t)); +int aie2_config_cu(struct amdxdna_hwctx *hwctx); +int aie2_execbuf(struct amdxdna_hwctx *hwctx, struct amdxdna_sched_job *job, + int (*notify_cb)(void *, const u32 *, size_t)); +int aie2_cmdlist_single_execbuf(struct amdxdna_hwctx *hwctx, + struct amdxdna_sched_job *job, + int (*notify_cb)(void *, const u32 *, size_t)); +int aie2_cmdlist_multi_execbuf(struct amdxdna_hwctx *hwctx, + struct amdxdna_sched_job *job, + int (*notify_cb)(void *, const u32 *, size_t)); +int aie2_sync_bo(struct amdxdna_hwctx *hwctx, struct amdxdna_sched_job *job, + int (*notify_cb)(void *, const u32 *, size_t)); + +/* aie2_hwctx.c */ +int aie2_hwctx_init(struct amdxdna_hwctx *hwctx); +void aie2_hwctx_fini(struct amdxdna_hwctx *hwctx); +int aie2_hwctx_config(struct amdxdna_hwctx *hwctx, u32 type, u64 value, void *buf, u32 size); +void aie2_hwctx_suspend(struct amdxdna_hwctx *hwctx); +void aie2_hwctx_resume(struct amdxdna_hwctx *hwctx); +int aie2_cmd_submit(struct amdxdna_hwctx *hwctx, struct amdxdna_sched_job *job, u64 *seq); +void aie2_hmm_invalidate(struct amdxdna_gem_obj *abo, unsigned long cur_seq); +void aie2_restart_ctx(struct amdxdna_client *client); + +#endif /* _AIE2_PCI_H_ */ diff --git a/drivers/accel/amdxdna/aie2_pm.c b/drivers/accel/amdxdna/aie2_pm.c new file mode 100644 
index 000000000000..426c38fce848 --- /dev/null +++ b/drivers/accel/amdxdna/aie2_pm.c @@ -0,0 +1,108 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2024, Advanced Micro Devices, Inc. + */ + +#include +#include +#include +#include + +#include "aie2_pci.h" +#include "amdxdna_pci_drv.h" + +#define AIE2_CLK_GATING_ENABLE 1 +#define AIE2_CLK_GATING_DISABLE 0 + +static int aie2_pm_set_clk_gating(struct amdxdna_dev_hdl *ndev, u32 val) +{ + int ret; + + ret = aie2_runtime_cfg(ndev, AIE2_RT_CFG_CLK_GATING, &val); + if (ret) + return ret; + + ndev->clk_gating = val; + return 0; +} + +int aie2_pm_init(struct amdxdna_dev_hdl *ndev) +{ + int ret; + + if (ndev->dev_status != AIE2_DEV_UNINIT) { + /* Resume device */ + ret = ndev->priv->hw_ops.set_dpm(ndev, ndev->dpm_level); + if (ret) + return ret; + + ret = aie2_pm_set_clk_gating(ndev, ndev->clk_gating); + if (ret) + return ret; + + return 0; + } + + while (ndev->priv->dpm_clk_tbl[ndev->max_dpm_level].hclk) + ndev->max_dpm_level++; + ndev->max_dpm_level--; + + ret = ndev->priv->hw_ops.set_dpm(ndev, ndev->max_dpm_level); + if (ret) + return ret; + + ret = aie2_pm_set_clk_gating(ndev, AIE2_CLK_GATING_ENABLE); + if (ret) + return ret; + + ndev->pw_mode = POWER_MODE_DEFAULT; + ndev->dft_dpm_level = ndev->max_dpm_level; + + return 0; +} + +int aie2_pm_set_mode(struct amdxdna_dev_hdl *ndev, enum amdxdna_power_mode_type target) +{ + struct amdxdna_dev *xdna = ndev->xdna; + u32 clk_gating, dpm_level; + int ret; + + drm_WARN_ON(&xdna->ddev, !mutex_is_locked(&xdna->dev_lock)); + + if (ndev->pw_mode == target) + return 0; + + switch (target) { + case POWER_MODE_TURBO: + if (ndev->hwctx_num) { + XDNA_ERR(xdna, "Can not set turbo when there is active hwctx"); + return -EINVAL; + } + + clk_gating = AIE2_CLK_GATING_DISABLE; + dpm_level = ndev->max_dpm_level; + break; + case POWER_MODE_HIGH: + clk_gating = AIE2_CLK_GATING_ENABLE; + dpm_level = ndev->max_dpm_level; + break; + case POWER_MODE_DEFAULT: + clk_gating = 
AIE2_CLK_GATING_ENABLE; + dpm_level = ndev->dft_dpm_level; + break; + default: + return -EOPNOTSUPP; + } + + ret = ndev->priv->hw_ops.set_dpm(ndev, dpm_level); + if (ret) + return ret; + + ret = aie2_pm_set_clk_gating(ndev, clk_gating); + if (ret) + return ret; + + ndev->pw_mode = target; + + return 0; +} diff --git a/drivers/accel/amdxdna/aie2_psp.c b/drivers/accel/amdxdna/aie2_psp.c new file mode 100644 index 000000000000..dc3a072ce3b6 --- /dev/null +++ b/drivers/accel/amdxdna/aie2_psp.c @@ -0,0 +1,146 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2022-2024, Advanced Micro Devices, Inc. + */ + +#include +#include +#include +#include +#include +#include +#include + +#include "aie2_pci.h" +#include "amdxdna_mailbox.h" +#include "amdxdna_pci_drv.h" + +#define PSP_STATUS_READY BIT(31) + +/* PSP commands */ +#define PSP_VALIDATE 1 +#define PSP_START 2 +#define PSP_RELEASE_TMR 3 + +/* PSP special arguments */ +#define PSP_START_COPY_FW 1 + +/* PSP response error code */ +#define PSP_ERROR_CANCEL 0xFFFF0002 +#define PSP_ERROR_BAD_STATE 0xFFFF0007 + +#define PSP_FW_ALIGN 0x10000 +#define PSP_POLL_INTERVAL 20000 /* us */ +#define PSP_POLL_TIMEOUT 1000000 /* us */ + +#define PSP_REG(p, reg) ((p)->psp_regs[reg]) + +struct psp_device { + struct drm_device *ddev; + struct psp_config conf; + u32 fw_buf_sz; + u64 fw_paddr; + void *fw_buffer; + void __iomem *psp_regs[PSP_MAX_REGS]; +}; + +static int psp_exec(struct psp_device *psp, u32 *reg_vals) +{ + u32 resp_code; + int ret, i; + u32 ready; + + /* Write command and argument registers */ + for (i = 0; i < PSP_NUM_IN_REGS; i++) + writel(reg_vals[i], PSP_REG(psp, i)); + + /* clear and set PSP INTR register to kick off */ + writel(0, PSP_REG(psp, PSP_INTR_REG)); + writel(1, PSP_REG(psp, PSP_INTR_REG)); + + /* PSP should be busy. Wait for ready, so we know task is done. 
*/ + ret = readx_poll_timeout(readl, PSP_REG(psp, PSP_STATUS_REG), ready, + FIELD_GET(PSP_STATUS_READY, ready), + PSP_POLL_INTERVAL, PSP_POLL_TIMEOUT); + if (ret) { + drm_err(psp->ddev, "PSP is not ready, ret 0x%x", ret); + return ret; + } + + resp_code = readl(PSP_REG(psp, PSP_RESP_REG)); + if (resp_code) { + drm_err(psp->ddev, "fw return error 0x%x", resp_code); + return -EIO; + } + + return 0; +} + +void aie2_psp_stop(struct psp_device *psp) +{ + u32 reg_vals[PSP_NUM_IN_REGS] = { PSP_RELEASE_TMR, }; + int ret; + + ret = psp_exec(psp, reg_vals); + if (ret) + drm_err(psp->ddev, "release tmr failed, ret %d", ret); +} + +int aie2_psp_start(struct psp_device *psp) +{ + u32 reg_vals[PSP_NUM_IN_REGS]; + int ret; + + reg_vals[0] = PSP_VALIDATE; + reg_vals[1] = lower_32_bits(psp->fw_paddr); + reg_vals[2] = upper_32_bits(psp->fw_paddr); + reg_vals[3] = psp->fw_buf_sz; + + ret = psp_exec(psp, reg_vals); + if (ret) { + drm_err(psp->ddev, "failed to validate fw, ret %d", ret); + return ret; + } + + memset(reg_vals, 0, sizeof(reg_vals)); + reg_vals[0] = PSP_START; + reg_vals[1] = PSP_START_COPY_FW; + ret = psp_exec(psp, reg_vals); + if (ret) { + drm_err(psp->ddev, "failed to start fw, ret %d", ret); + return ret; + } + + return 0; +} + +struct psp_device *aie2m_psp_create(struct drm_device *ddev, struct psp_config *conf) +{ + struct psp_device *psp; + u64 offset; + + psp = drmm_kzalloc(ddev, sizeof(*psp), GFP_KERNEL); + if (!psp) + return NULL; + + psp->ddev = ddev; + memcpy(psp->psp_regs, conf->psp_regs, sizeof(psp->psp_regs)); + + psp->fw_buf_sz = ALIGN(conf->fw_size, PSP_FW_ALIGN) + PSP_FW_ALIGN; + psp->fw_buffer = drmm_kmalloc(ddev, psp->fw_buf_sz, GFP_KERNEL); + if (!psp->fw_buffer) { + drm_err(ddev, "no memory for fw buffer"); + return NULL; + } + + /* + * AMD Platform Security Processor(PSP) requires host physical + * address to load NPU firmware. 
+ */ + psp->fw_paddr = virt_to_phys(psp->fw_buffer); + offset = ALIGN(psp->fw_paddr, PSP_FW_ALIGN) - psp->fw_paddr; + psp->fw_paddr += offset; + memcpy(psp->fw_buffer + offset, conf->fw_buf, conf->fw_size); + + return psp; +} diff --git a/drivers/accel/amdxdna/aie2_smu.c b/drivers/accel/amdxdna/aie2_smu.c new file mode 100644 index 000000000000..73388443c676 --- /dev/null +++ b/drivers/accel/amdxdna/aie2_smu.c @@ -0,0 +1,134 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2022-2024, Advanced Micro Devices, Inc. + */ + +#include +#include +#include +#include +#include + +#include "aie2_pci.h" +#include "amdxdna_pci_drv.h" + +#define SMU_RESULT_OK 1 + +/* SMU commands */ +#define AIE2_SMU_POWER_ON 0x3 +#define AIE2_SMU_POWER_OFF 0x4 +#define AIE2_SMU_SET_MPNPUCLK_FREQ 0x5 +#define AIE2_SMU_SET_HCLK_FREQ 0x6 +#define AIE2_SMU_SET_SOFT_DPMLEVEL 0x7 +#define AIE2_SMU_SET_HARD_DPMLEVEL 0x8 + +static int aie2_smu_exec(struct amdxdna_dev_hdl *ndev, u32 reg_cmd, + u32 reg_arg, u32 *out) +{ + u32 resp; + int ret; + + writel(0, SMU_REG(ndev, SMU_RESP_REG)); + writel(reg_arg, SMU_REG(ndev, SMU_ARG_REG)); + writel(reg_cmd, SMU_REG(ndev, SMU_CMD_REG)); + + /* Clear and set SMU_INTR_REG to kick off */ + writel(0, SMU_REG(ndev, SMU_INTR_REG)); + writel(1, SMU_REG(ndev, SMU_INTR_REG)); + + ret = readx_poll_timeout(readl, SMU_REG(ndev, SMU_RESP_REG), resp, + resp, AIE2_INTERVAL, AIE2_TIMEOUT); + if (ret) { + XDNA_ERR(ndev->xdna, "smu cmd %d timed out", reg_cmd); + return ret; + } + + if (out) + *out = readl(SMU_REG(ndev, SMU_OUT_REG)); + + if (resp != SMU_RESULT_OK) { + XDNA_ERR(ndev->xdna, "smu cmd %d failed, 0x%x", reg_cmd, resp); + return -EINVAL; + } + + return 0; +} + +int npu1_set_dpm(struct amdxdna_dev_hdl *ndev, u32 dpm_level) +{ + u32 freq; + int ret; + + ret = aie2_smu_exec(ndev, AIE2_SMU_SET_MPNPUCLK_FREQ, + ndev->priv->dpm_clk_tbl[dpm_level].npuclk, &freq); + if (ret) { + XDNA_ERR(ndev->xdna, "Set npu clock to %d failed, ret %d\n", + 
ndev->priv->dpm_clk_tbl[dpm_level].npuclk, ret); + } + ndev->npuclk_freq = freq; + + ret = aie2_smu_exec(ndev, AIE2_SMU_SET_HCLK_FREQ, + ndev->priv->dpm_clk_tbl[dpm_level].hclk, &freq); + if (ret) { + XDNA_ERR(ndev->xdna, "Set h clock to %d failed, ret %d\n", + ndev->priv->dpm_clk_tbl[dpm_level].hclk, ret); + } + ndev->hclk_freq = freq; + ndev->dpm_level = dpm_level; + + XDNA_DBG(ndev->xdna, "MP-NPU clock %d, H clock %d\n", + ndev->npuclk_freq, ndev->hclk_freq); + + return 0; +} + +int npu4_set_dpm(struct amdxdna_dev_hdl *ndev, u32 dpm_level) +{ + int ret; + + ret = aie2_smu_exec(ndev, AIE2_SMU_SET_HARD_DPMLEVEL, dpm_level, NULL); + if (ret) { + XDNA_ERR(ndev->xdna, "Set hard dpm level %d failed, ret %d ", + dpm_level, ret); + return ret; + } + + ret = aie2_smu_exec(ndev, AIE2_SMU_SET_SOFT_DPMLEVEL, dpm_level, NULL); + if (ret) { + XDNA_ERR(ndev->xdna, "Set soft dpm level %d failed, ret %d", + dpm_level, ret); + return ret; + } + + ndev->npuclk_freq = ndev->priv->dpm_clk_tbl[dpm_level].npuclk; + ndev->hclk_freq = ndev->priv->dpm_clk_tbl[dpm_level].hclk; + ndev->dpm_level = dpm_level; + + XDNA_DBG(ndev->xdna, "MP-NPU clock %d, H clock %d\n", + ndev->npuclk_freq, ndev->hclk_freq); + + return 0; +} + +int aie2_smu_init(struct amdxdna_dev_hdl *ndev) +{ + int ret; + + ret = aie2_smu_exec(ndev, AIE2_SMU_POWER_ON, 0, NULL); + if (ret) { + XDNA_ERR(ndev->xdna, "Power on failed, ret %d", ret); + return ret; + } + + return 0; +} + +void aie2_smu_fini(struct amdxdna_dev_hdl *ndev) +{ + int ret; + + ndev->priv->hw_ops.set_dpm(ndev, 0); + ret = aie2_smu_exec(ndev, AIE2_SMU_POWER_OFF, 0, NULL); + if (ret) + XDNA_ERR(ndev->xdna, "Power off failed, ret %d", ret); +} diff --git a/drivers/accel/amdxdna/aie2_solver.c b/drivers/accel/amdxdna/aie2_solver.c new file mode 100644 index 000000000000..2013d1f13aae --- /dev/null +++ b/drivers/accel/amdxdna/aie2_solver.c @@ -0,0 +1,380 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2022-2024, Advanced Micro Devices, Inc. 
+ */ + +#include +#include +#include +#include +#include +#include + +#include "aie2_solver.h" + +struct partition_node { + struct list_head list; + u32 nshared; /* # shared requests */ + u32 start_col; /* start column */ + u32 ncols; /* # columns */ + bool exclusive; /* can not be shared if set */ +}; + +struct solver_node { + struct list_head list; + u64 rid; /* Request ID from consumer */ + + struct partition_node *pt_node; + void *cb_arg; + u32 dpm_level; + u32 cols_len; + u32 start_cols[] __counted_by(cols_len); +}; + +struct solver_rgroup { + u32 rgid; + u32 nnode; + u32 npartition_node; + + DECLARE_BITMAP(resbit, XRS_MAX_COL); + struct list_head node_list; + struct list_head pt_node_list; +}; + +struct solver_state { + struct solver_rgroup rgp; + struct init_config cfg; + struct xrs_action_ops *actions; +}; + +static u32 calculate_gops(struct aie_qos *rqos) +{ + u32 service_rate = 0; + + if (rqos->latency) + service_rate = (1000 / rqos->latency); + + if (rqos->fps > service_rate) + return rqos->fps * rqos->gops; + + return service_rate * rqos->gops; +} + +/* + * qos_meet() - Check the QOS request can be met. + */ +static int qos_meet(struct solver_state *xrs, struct aie_qos *rqos, u32 cgops) +{ + u32 request_gops = calculate_gops(rqos) * xrs->cfg.sys_eff_factor; + + if (request_gops <= cgops) + return 0; + + return -EINVAL; +} + +/* + * sanity_check() - Do a basic sanity check on allocation request. + */ +static int sanity_check(struct solver_state *xrs, struct alloc_requests *req) +{ + struct cdo_parts *cdop = &req->cdo; + struct aie_qos *rqos = &req->rqos; + u32 cu_clk_freq; + + if (cdop->ncols > xrs->cfg.total_col) + return -EINVAL; + + /* + * We can find at least one CDOs groups that meet the + * GOPs requirement. 
+ */ + cu_clk_freq = xrs->cfg.clk_list.cu_clk_list[xrs->cfg.clk_list.num_levels - 1]; + + if (qos_meet(xrs, rqos, cdop->qos_cap.opc * cu_clk_freq / 1000)) + return -EINVAL; + + return 0; +} + +static bool is_valid_qos_dpm_params(struct aie_qos *rqos) +{ + /* + * gops is retrieved from the xmodel, so it's always set + * fps and latency are the configurable params from the application + */ + if (rqos->gops > 0 && (rqos->fps > 0 || rqos->latency > 0)) + return true; + + return false; +} + +static int set_dpm_level(struct solver_state *xrs, struct alloc_requests *req, u32 *dpm_level) +{ + struct solver_rgroup *rgp = &xrs->rgp; + struct cdo_parts *cdop = &req->cdo; + struct aie_qos *rqos = &req->rqos; + u32 freq, max_dpm_level, level; + struct solver_node *node; + + max_dpm_level = xrs->cfg.clk_list.num_levels - 1; + /* If no QoS parameters are passed, set it to the max DPM level */ + if (!is_valid_qos_dpm_params(rqos)) { + level = max_dpm_level; + goto set_dpm; + } + + /* Find one CDO group that meet the GOPs requirement. 
*/ + for (level = 0; level < max_dpm_level; level++) { + freq = xrs->cfg.clk_list.cu_clk_list[level]; + if (!qos_meet(xrs, rqos, cdop->qos_cap.opc * freq / 1000)) + break; + } + + /* set the dpm level which fits all the sessions */ + list_for_each_entry(node, &rgp->node_list, list) { + if (node->dpm_level > level) + level = node->dpm_level; + } + +set_dpm: + *dpm_level = level; + return xrs->cfg.actions->set_dft_dpm_level(xrs->cfg.ddev, level); +} + +static struct solver_node *rg_search_node(struct solver_rgroup *rgp, u64 rid) +{ + struct solver_node *node; + + list_for_each_entry(node, &rgp->node_list, list) { + if (node->rid == rid) + return node; + } + + return NULL; +} + +static void remove_partition_node(struct solver_rgroup *rgp, + struct partition_node *pt_node) +{ + pt_node->nshared--; + if (pt_node->nshared > 0) + return; + + list_del(&pt_node->list); + rgp->npartition_node--; + + bitmap_clear(rgp->resbit, pt_node->start_col, pt_node->ncols); + kfree(pt_node); +} + +static void remove_solver_node(struct solver_rgroup *rgp, + struct solver_node *node) +{ + list_del(&node->list); + rgp->nnode--; + + if (node->pt_node) + remove_partition_node(rgp, node->pt_node); + + kfree(node); +} + +static int get_free_partition(struct solver_state *xrs, + struct solver_node *snode, + struct alloc_requests *req) +{ + struct partition_node *pt_node; + u32 ncols = req->cdo.ncols; + u32 col, i; + + for (i = 0; i < snode->cols_len; i++) { + col = snode->start_cols[i]; + if (find_next_bit(xrs->rgp.resbit, XRS_MAX_COL, col) >= col + ncols) + break; + } + + if (i == snode->cols_len) + return -ENODEV; + + pt_node = kzalloc(sizeof(*pt_node), GFP_KERNEL); + if (!pt_node) + return -ENOMEM; + + pt_node->nshared = 1; + pt_node->start_col = col; + pt_node->ncols = ncols; + + /* + * Always set exclusive to false for now. 
+ */ + pt_node->exclusive = false; + + list_add_tail(&pt_node->list, &xrs->rgp.pt_node_list); + xrs->rgp.npartition_node++; + bitmap_set(xrs->rgp.resbit, pt_node->start_col, pt_node->ncols); + + snode->pt_node = pt_node; + + return 0; +} + +static int allocate_partition(struct solver_state *xrs, + struct solver_node *snode, + struct alloc_requests *req) +{ + struct partition_node *pt_node, *rpt_node = NULL; + int idx, ret; + + ret = get_free_partition(xrs, snode, req); + if (!ret) + return ret; + + /* try to get a share-able partition */ + list_for_each_entry(pt_node, &xrs->rgp.pt_node_list, list) { + if (pt_node->exclusive) + continue; + + if (rpt_node && pt_node->nshared >= rpt_node->nshared) + continue; + + for (idx = 0; idx < snode->cols_len; idx++) { + if (snode->start_cols[idx] != pt_node->start_col) + continue; + + if (req->cdo.ncols != pt_node->ncols) + continue; + + rpt_node = pt_node; + break; + } + } + + if (!rpt_node) + return -ENODEV; + + rpt_node->nshared++; + snode->pt_node = rpt_node; + + return 0; +} + +static struct solver_node *create_solver_node(struct solver_state *xrs, + struct alloc_requests *req) +{ + struct cdo_parts *cdop = &req->cdo; + struct solver_node *node; + int ret; + + node = kzalloc(struct_size(node, start_cols, cdop->cols_len), GFP_KERNEL); + if (!node) + return ERR_PTR(-ENOMEM); + + node->rid = req->rid; + node->cols_len = cdop->cols_len; + memcpy(node->start_cols, cdop->start_cols, cdop->cols_len * sizeof(u32)); + + ret = allocate_partition(xrs, node, req); + if (ret) + goto free_node; + + list_add_tail(&node->list, &xrs->rgp.node_list); + xrs->rgp.nnode++; + return node; + +free_node: + kfree(node); + return ERR_PTR(ret); +} + +static void fill_load_action(struct solver_state *xrs, + struct solver_node *snode, + struct xrs_action_load *action) +{ + action->rid = snode->rid; + action->part.start_col = snode->pt_node->start_col; + action->part.ncols = snode->pt_node->ncols; +} + +int xrs_allocate_resource(void *hdl, struct 
alloc_requests *req, void *cb_arg) +{ + struct xrs_action_load load_act; + struct solver_node *snode; + struct solver_state *xrs; + u32 dpm_level; + int ret; + + xrs = (struct solver_state *)hdl; + + ret = sanity_check(xrs, req); + if (ret) { + drm_err(xrs->cfg.ddev, "invalid request"); + return ret; + } + + if (rg_search_node(&xrs->rgp, req->rid)) { + drm_err(xrs->cfg.ddev, "rid %lld is in-use", req->rid); + return -EEXIST; + } + + snode = create_solver_node(xrs, req); + if (IS_ERR(snode)) + return PTR_ERR(snode); + + fill_load_action(xrs, snode, &load_act); + ret = xrs->cfg.actions->load(cb_arg, &load_act); + if (ret) + goto free_node; + + ret = set_dpm_level(xrs, req, &dpm_level); + if (ret) + goto free_node; + + snode->dpm_level = dpm_level; + snode->cb_arg = cb_arg; + + drm_dbg(xrs->cfg.ddev, "start col %d ncols %d\n", + snode->pt_node->start_col, snode->pt_node->ncols); + + return 0; + +free_node: + remove_solver_node(&xrs->rgp, snode); + + return ret; +} + +int xrs_release_resource(void *hdl, u64 rid) +{ + struct solver_state *xrs = hdl; + struct solver_node *node; + + node = rg_search_node(&xrs->rgp, rid); + if (!node) { + drm_err(xrs->cfg.ddev, "node not exist"); + return -ENODEV; + } + + xrs->cfg.actions->unload(node->cb_arg); + remove_solver_node(&xrs->rgp, node); + + return 0; +} + +void *xrsm_init(struct init_config *cfg) +{ + struct solver_rgroup *rgp; + struct solver_state *xrs; + + xrs = drmm_kzalloc(cfg->ddev, sizeof(*xrs), GFP_KERNEL); + if (!xrs) + return NULL; + + memcpy(&xrs->cfg, cfg, sizeof(*cfg)); + + rgp = &xrs->rgp; + INIT_LIST_HEAD(&rgp->node_list); + INIT_LIST_HEAD(&rgp->pt_node_list); + + return xrs; +} diff --git a/drivers/accel/amdxdna/aie2_solver.h b/drivers/accel/amdxdna/aie2_solver.h new file mode 100644 index 000000000000..a2e3c52229e9 --- /dev/null +++ b/drivers/accel/amdxdna/aie2_solver.h @@ -0,0 +1,155 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright (C) 2023-2024, Advanced Micro Devices, Inc. 
+ */ + +#ifndef _AIE2_SOLVER_H +#define _AIE2_SOLVER_H + +#define XRS_MAX_COL 128 + +/* + * Structure used to describe a partition. A partition is column based + * allocation unit described by its start column and number of columns. + */ +struct aie_part { + u32 start_col; + u32 ncols; +}; + +/* + * The QoS capabilities of a given AIE partition. + */ +struct aie_qos_cap { + u32 opc; /* operations per cycle */ + u32 dma_bw; /* DMA bandwidth */ +}; + +/* + * QoS requirement of a resource allocation. + */ +struct aie_qos { + u32 gops; /* Giga operations */ + u32 fps; /* Frames per second */ + u32 dma_bw; /* DMA bandwidth */ + u32 latency; /* Frame response latency */ + u32 exec_time; /* Frame execution time */ + u32 priority; /* Request priority */ +}; + +/* + * Structure used to describe a relocatable CDO (Configuration Data Object). + */ +struct cdo_parts { + u32 *start_cols; /* Start column array */ + u32 cols_len; /* Length of start column array */ + u32 ncols; /* # of column */ + struct aie_qos_cap qos_cap; /* CDO QoS capabilities */ +}; + +/* + * Structure used to describe a request to allocate. + */ +struct alloc_requests { + u64 rid; + struct cdo_parts cdo; + struct aie_qos rqos; /* Requested QoS */ +}; + +/* + * Load callback argument + */ +struct xrs_action_load { + u32 rid; + struct aie_part part; +}; + +/* + * Define the power level available + * + * POWER_LEVEL_MIN: + * Lowest power level. Usually set when all actions are unloaded. + * + * POWER_LEVEL_n + * Power levels 0 - n, is a step increase in system frequencies + */ +enum power_level { + POWER_LEVEL_MIN = 0x0, + POWER_LEVEL_0 = 0x1, + POWER_LEVEL_1 = 0x2, + POWER_LEVEL_2 = 0x3, + POWER_LEVEL_3 = 0x4, + POWER_LEVEL_4 = 0x5, + POWER_LEVEL_5 = 0x6, + POWER_LEVEL_6 = 0x7, + POWER_LEVEL_7 = 0x8, + POWER_LEVEL_NUM, +}; + +/* + * Structure used to describe the frequency table. + * Resource solver chooses the frequency from the table + * to meet the QOS requirements. 
+ */ +struct clk_list_info { + u32 num_levels; /* available power levels */ + u32 cu_clk_list[POWER_LEVEL_NUM]; /* available aie clock frequencies in MHz */ +}; + +struct xrs_action_ops { + int (*load)(void *cb_arg, struct xrs_action_load *action); + int (*unload)(void *cb_arg); + int (*set_dft_dpm_level)(struct drm_device *ddev, u32 level); +}; + +/* + * Structure used to describe information for solver during initialization. + */ +struct init_config { + u32 total_col; + u32 sys_eff_factor; /* system efficiency factor */ + u32 latency_adj; /* latency adjustment in ms */ + struct clk_list_info clk_list; /* List of frequencies available in system */ + struct drm_device *ddev; + struct xrs_action_ops *actions; +}; + +/* + * xrsm_init() - Register resource solver. Resource solver client needs + * to call this function to register itself. + * + * @cfg: The system metrics for resource solver to use + * + * Return: A resource solver handle + * + * Note: We should only create one handle per AIE array to be managed. + */ +void *xrsm_init(struct init_config *cfg); + +/* + * xrs_allocate_resource() - Request to allocate resources for a given context + * and a partition metadata. (See struct part_meta) + * + * @hdl: Resource solver handle obtained from xrsm_init() + * @req: Input to the Resource solver including request id + * and partition metadata. + * @cb_arg: callback argument pointer + * + * Return: 0 when successful. + * Or standard error number when failing + * + * Note: + * There is no lock mechanism inside resource solver. So it is + * the caller's responsibility to lock down XCLBINs and grab + * necessary lock. + */ +int xrs_allocate_resource(void *hdl, struct alloc_requests *req, void *cb_arg); + +/* + * xrs_release_resource() - Request to free resources for a given context.
+ * + * @hdl: Resource solver handle obtained from xrsm_init() + * @rid: The Request ID to identify the requesting context + */ +int xrs_release_resource(void *hdl, u64 rid); +#endif /* _AIE2_SOLVER_H */ diff --git a/drivers/accel/amdxdna/amdxdna_ctx.c b/drivers/accel/amdxdna/amdxdna_ctx.c new file mode 100644 index 000000000000..d11b1c83d9c3 --- /dev/null +++ b/drivers/accel/amdxdna/amdxdna_ctx.c @@ -0,0 +1,550 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2022-2024, Advanced Micro Devices, Inc. + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "amdxdna_ctx.h" +#include "amdxdna_gem.h" +#include "amdxdna_pci_drv.h" + +#define MAX_HWCTX_ID 255 +#define MAX_ARG_COUNT 4095 + +struct amdxdna_fence { + struct dma_fence base; + spinlock_t lock; /* for base */ + struct amdxdna_hwctx *hwctx; +}; + +static const char *amdxdna_fence_get_driver_name(struct dma_fence *fence) +{ + return KBUILD_MODNAME; +} + +static const char *amdxdna_fence_get_timeline_name(struct dma_fence *fence) +{ + struct amdxdna_fence *xdna_fence; + + xdna_fence = container_of(fence, struct amdxdna_fence, base); + + return xdna_fence->hwctx->name; +} + +static const struct dma_fence_ops fence_ops = { + .get_driver_name = amdxdna_fence_get_driver_name, + .get_timeline_name = amdxdna_fence_get_timeline_name, +}; + +static struct dma_fence *amdxdna_fence_create(struct amdxdna_hwctx *hwctx) +{ + struct amdxdna_fence *fence; + + fence = kzalloc(sizeof(*fence), GFP_KERNEL); + if (!fence) + return NULL; + + fence->hwctx = hwctx; + spin_lock_init(&fence->lock); + dma_fence_init(&fence->base, &fence_ops, &fence->lock, hwctx->id, 0); + return &fence->base; +} + +void amdxdna_hwctx_suspend(struct amdxdna_client *client) +{ + struct amdxdna_dev *xdna = client->xdna; + struct amdxdna_hwctx *hwctx; + unsigned long hwctx_id; + + drm_WARN_ON(&xdna->ddev, !mutex_is_locked(&xdna->dev_lock)); + mutex_lock(&client->hwctx_lock); +
amdxdna_for_each_hwctx(client, hwctx_id, hwctx) + xdna->dev_info->ops->hwctx_suspend(hwctx); + mutex_unlock(&client->hwctx_lock); +} + +void amdxdna_hwctx_resume(struct amdxdna_client *client) +{ + struct amdxdna_dev *xdna = client->xdna; + struct amdxdna_hwctx *hwctx; + unsigned long hwctx_id; + + drm_WARN_ON(&xdna->ddev, !mutex_is_locked(&xdna->dev_lock)); + mutex_lock(&client->hwctx_lock); + amdxdna_for_each_hwctx(client, hwctx_id, hwctx) + xdna->dev_info->ops->hwctx_resume(hwctx); + mutex_unlock(&client->hwctx_lock); +} + +static void amdxdna_hwctx_destroy_rcu(struct amdxdna_hwctx *hwctx, + struct srcu_struct *ss) +{ + struct amdxdna_dev *xdna = hwctx->client->xdna; + + synchronize_srcu(ss); + + /* At this point, user is not able to submit new commands */ + mutex_lock(&xdna->dev_lock); + xdna->dev_info->ops->hwctx_fini(hwctx); + mutex_unlock(&xdna->dev_lock); + + kfree(hwctx->name); + kfree(hwctx); +} + +void *amdxdna_cmd_get_payload(struct amdxdna_gem_obj *abo, u32 *size) +{ + struct amdxdna_cmd *cmd = abo->mem.kva; + u32 num_masks, count; + + if (amdxdna_cmd_get_op(abo) == ERT_CMD_CHAIN) + num_masks = 0; + else + num_masks = 1 + FIELD_GET(AMDXDNA_CMD_EXTRA_CU_MASK, cmd->header); + + if (size) { + count = FIELD_GET(AMDXDNA_CMD_COUNT, cmd->header); + if (unlikely(count <= num_masks)) { + *size = 0; + return NULL; + } + *size = (count - num_masks) * sizeof(u32); + } + return &cmd->data[num_masks]; +} + +int amdxdna_cmd_get_cu_idx(struct amdxdna_gem_obj *abo) +{ + struct amdxdna_cmd *cmd = abo->mem.kva; + u32 num_masks, i; + u32 *cu_mask; + + if (amdxdna_cmd_get_op(abo) == ERT_CMD_CHAIN) + return -1; + + num_masks = 1 + FIELD_GET(AMDXDNA_CMD_EXTRA_CU_MASK, cmd->header); + cu_mask = cmd->data; + for (i = 0; i < num_masks; i++) { + if (cu_mask[i]) + return ffs(cu_mask[i]) - 1; + } + + return -1; +} + +/* + * This should be called in close() and remove(). DO NOT call in other syscalls. 
+ * This guarantees that the hwctx and its resources are released even if the + * user doesn't call amdxdna_drm_destroy_hwctx_ioctl. + */ +void amdxdna_hwctx_remove_all(struct amdxdna_client *client) +{ + struct amdxdna_hwctx *hwctx; + unsigned long hwctx_id; + + mutex_lock(&client->hwctx_lock); + amdxdna_for_each_hwctx(client, hwctx_id, hwctx) { + XDNA_DBG(client->xdna, "PID %d close HW context %d", + client->pid, hwctx->id); + xa_erase(&client->hwctx_xa, hwctx->id); + mutex_unlock(&client->hwctx_lock); + amdxdna_hwctx_destroy_rcu(hwctx, &client->hwctx_srcu); + mutex_lock(&client->hwctx_lock); + } + mutex_unlock(&client->hwctx_lock); +} + +int amdxdna_drm_create_hwctx_ioctl(struct drm_device *dev, void *data, struct drm_file *filp) +{ + struct amdxdna_client *client = filp->driver_priv; + struct amdxdna_drm_create_hwctx *args = data; + struct amdxdna_dev *xdna = to_xdna_dev(dev); + struct amdxdna_hwctx *hwctx; + int ret, idx; + + if (args->ext || args->ext_flags) + return -EINVAL; + + if (!drm_dev_enter(dev, &idx)) + return -ENODEV; + + hwctx = kzalloc(sizeof(*hwctx), GFP_KERNEL); + if (!hwctx) { + ret = -ENOMEM; + goto exit; + } + + if (copy_from_user(&hwctx->qos, u64_to_user_ptr(args->qos_p), sizeof(hwctx->qos))) { + XDNA_ERR(xdna, "Access QoS info failed"); + ret = -EFAULT; + goto free_hwctx; + } + + hwctx->client = client; + hwctx->fw_ctx_id = -1; + hwctx->num_tiles = args->num_tiles; + hwctx->mem_size = args->mem_size; + hwctx->max_opc = args->max_opc; + ret = xa_alloc_cyclic(&client->hwctx_xa, &hwctx->id, hwctx, + XA_LIMIT(AMDXDNA_INVALID_CTX_HANDLE + 1, MAX_HWCTX_ID), + &client->next_hwctxid, GFP_KERNEL); + if (ret < 0) { + XDNA_ERR(xdna, "Allocate hwctx ID failed, ret %d", ret); + goto free_hwctx; + } + + hwctx->name = kasprintf(GFP_KERNEL, "hwctx.%d.%d", client->pid, hwctx->id); + if (!hwctx->name) { + ret = -ENOMEM; + goto rm_id; + } + + mutex_lock(&xdna->dev_lock); + ret = xdna->dev_info->ops->hwctx_init(hwctx); + if (ret) { + mutex_unlock(&xdna->dev_lock); +
XDNA_ERR(xdna, "Init hwctx failed, ret %d", ret); + goto free_name; + } + args->handle = hwctx->id; + args->syncobj_handle = hwctx->syncobj_hdl; + mutex_unlock(&xdna->dev_lock); + + XDNA_DBG(xdna, "PID %d create HW context %d, ret %d", client->pid, args->handle, ret); + drm_dev_exit(idx); + return 0; + +free_name: + kfree(hwctx->name); +rm_id: + xa_erase(&client->hwctx_xa, hwctx->id); +free_hwctx: + kfree(hwctx); +exit: + drm_dev_exit(idx); + return ret; +} + +int amdxdna_drm_destroy_hwctx_ioctl(struct drm_device *dev, void *data, struct drm_file *filp) +{ + struct amdxdna_client *client = filp->driver_priv; + struct amdxdna_drm_destroy_hwctx *args = data; + struct amdxdna_dev *xdna = to_xdna_dev(dev); + struct amdxdna_hwctx *hwctx; + int ret = 0, idx; + + if (XDNA_MBZ_DBG(xdna, &args->pad, sizeof(args->pad))) + return -EINVAL; + + if (!drm_dev_enter(dev, &idx)) + return -ENODEV; + + hwctx = xa_erase(&client->hwctx_xa, args->handle); + if (!hwctx) { + ret = -EINVAL; + XDNA_DBG(xdna, "PID %d HW context %d not exist", + client->pid, args->handle); + goto out; + } + + /* + * The pushed jobs are handled by DRM scheduler during destroy. + * SRCU to synchronize with exec command ioctls. 
+ */ + amdxdna_hwctx_destroy_rcu(hwctx, &client->hwctx_srcu); + + XDNA_DBG(xdna, "PID %d destroyed HW context %d", client->pid, args->handle); +out: + drm_dev_exit(idx); + return ret; +} + +int amdxdna_drm_config_hwctx_ioctl(struct drm_device *dev, void *data, struct drm_file *filp) +{ + struct amdxdna_client *client = filp->driver_priv; + struct amdxdna_drm_config_hwctx *args = data; + struct amdxdna_dev *xdna = to_xdna_dev(dev); + struct amdxdna_hwctx *hwctx; + int ret, idx; + u32 buf_size; + void *buf; + u64 val; + + if (XDNA_MBZ_DBG(xdna, &args->pad, sizeof(args->pad))) + return -EINVAL; + + if (!xdna->dev_info->ops->hwctx_config) + return -EOPNOTSUPP; + + val = args->param_val; + buf_size = args->param_val_size; + + switch (args->param_type) { + case DRM_AMDXDNA_HWCTX_CONFIG_CU: + /* For those types that param_val is pointer */ + if (buf_size > PAGE_SIZE) { + XDNA_ERR(xdna, "Config CU param buffer too large"); + return -E2BIG; + } + + /* Hwctx needs to keep buf */ + buf = kzalloc(PAGE_SIZE, GFP_KERNEL); + if (!buf) + return -ENOMEM; + + if (copy_from_user(buf, u64_to_user_ptr(val), buf_size)) { + kfree(buf); + return -EFAULT; + } + + break; + case DRM_AMDXDNA_HWCTX_ASSIGN_DBG_BUF: + case DRM_AMDXDNA_HWCTX_REMOVE_DBG_BUF: + /* For those types that param_val is a value */ + buf = NULL; + buf_size = 0; + break; + default: + XDNA_DBG(xdna, "Unknown HW context config type %d", args->param_type); + return -EINVAL; + } + + mutex_lock(&xdna->dev_lock); + idx = srcu_read_lock(&client->hwctx_srcu); + hwctx = xa_load(&client->hwctx_xa, args->handle); + if (!hwctx) { + XDNA_DBG(xdna, "PID %d failed to get hwctx %d", client->pid, args->handle); + ret = -EINVAL; + goto unlock_srcu; + } + + ret = xdna->dev_info->ops->hwctx_config(hwctx, args->param_type, val, buf, buf_size); + +unlock_srcu: + srcu_read_unlock(&client->hwctx_srcu, idx); + mutex_unlock(&xdna->dev_lock); + kfree(buf); + return ret; +} + +static void +amdxdna_arg_bos_put(struct amdxdna_sched_job *job) +{ + int 
i; + + for (i = 0; i < job->bo_cnt; i++) { + if (!job->bos[i]) + break; + drm_gem_object_put(job->bos[i]); + } +} + +static int +amdxdna_arg_bos_lookup(struct amdxdna_client *client, + struct amdxdna_sched_job *job, + u32 *bo_hdls, u32 bo_cnt) +{ + struct drm_gem_object *gobj; + int i, ret; + + job->bo_cnt = bo_cnt; + for (i = 0; i < job->bo_cnt; i++) { + struct amdxdna_gem_obj *abo; + + gobj = drm_gem_object_lookup(client->filp, bo_hdls[i]); + if (!gobj) { + ret = -ENOENT; + goto put_shmem_bo; + } + abo = to_xdna_obj(gobj); + + mutex_lock(&abo->lock); + if (abo->pinned) { + mutex_unlock(&abo->lock); + job->bos[i] = gobj; + continue; + } + + ret = amdxdna_gem_pin_nolock(abo); + if (ret) { + mutex_unlock(&abo->lock); + drm_gem_object_put(gobj); + goto put_shmem_bo; + } + abo->pinned = true; + mutex_unlock(&abo->lock); + + job->bos[i] = gobj; + } + + return 0; + +put_shmem_bo: + amdxdna_arg_bos_put(job); + return ret; +} + +void amdxdna_sched_job_cleanup(struct amdxdna_sched_job *job) +{ + trace_amdxdna_debug_point(job->hwctx->name, job->seq, "job release"); + amdxdna_arg_bos_put(job); + amdxdna_gem_put_obj(job->cmd_bo); +} + +int amdxdna_cmd_submit(struct amdxdna_client *client, + u32 cmd_bo_hdl, u32 *arg_bo_hdls, u32 arg_bo_cnt, + u32 hwctx_hdl, u64 *seq) +{ + struct amdxdna_dev *xdna = client->xdna; + struct amdxdna_sched_job *job; + struct amdxdna_hwctx *hwctx; + int ret, idx; + + XDNA_DBG(xdna, "Command BO hdl %d, Arg BO count %d", cmd_bo_hdl, arg_bo_cnt); + job = kzalloc(struct_size(job, bos, arg_bo_cnt), GFP_KERNEL); + if (!job) + return -ENOMEM; + + if (cmd_bo_hdl != AMDXDNA_INVALID_BO_HANDLE) { + job->cmd_bo = amdxdna_gem_get_obj(client, cmd_bo_hdl, AMDXDNA_BO_CMD); + if (!job->cmd_bo) { + XDNA_ERR(xdna, "Failed to get cmd bo from %d", cmd_bo_hdl); + ret = -EINVAL; + goto free_job; + } + } else { + job->cmd_bo = NULL; + } + + ret = amdxdna_arg_bos_lookup(client, job, arg_bo_hdls, arg_bo_cnt); + if (ret) { + XDNA_ERR(xdna, "Argument BOs lookup failed, ret 
%d", ret); + goto cmd_put; + } + + idx = srcu_read_lock(&client->hwctx_srcu); + hwctx = xa_load(&client->hwctx_xa, hwctx_hdl); + if (!hwctx) { + XDNA_DBG(xdna, "PID %d failed to get hwctx %d", + client->pid, hwctx_hdl); + ret = -EINVAL; + goto unlock_srcu; + } + + if (hwctx->status != HWCTX_STAT_READY) { + XDNA_ERR(xdna, "HW Context is not ready"); + ret = -EINVAL; + goto unlock_srcu; + } + + job->hwctx = hwctx; + job->mm = current->mm; + + job->fence = amdxdna_fence_create(hwctx); + if (!job->fence) { + XDNA_ERR(xdna, "Failed to create fence"); + ret = -ENOMEM; + goto unlock_srcu; + } + kref_init(&job->refcnt); + + ret = xdna->dev_info->ops->cmd_submit(hwctx, job, seq); + if (ret) + goto put_fence; + + /* + * The amdxdna_hwctx_destroy_rcu() will release hwctx and associated + * resource after synchronize_srcu(). The submitted jobs should be + * handled by the queue, for example DRM scheduler, in device layer. + * For here we can unlock SRCU. + */ + srcu_read_unlock(&client->hwctx_srcu, idx); + trace_amdxdna_debug_point(hwctx->name, *seq, "job pushed"); + + return 0; + +put_fence: + dma_fence_put(job->fence); +unlock_srcu: + srcu_read_unlock(&client->hwctx_srcu, idx); + amdxdna_arg_bos_put(job); +cmd_put: + amdxdna_gem_put_obj(job->cmd_bo); +free_job: + kfree(job); + return ret; +} + +/* + * The submit command ioctl submits a command to firmware. One firmware command + * may contain multiple command BOs for processing as a whole. + * The command sequence number is returned which can be used for wait command ioctl. + */ +static int amdxdna_drm_submit_execbuf(struct amdxdna_client *client, + struct amdxdna_drm_exec_cmd *args) +{ + struct amdxdna_dev *xdna = client->xdna; + u32 *arg_bo_hdls; + u32 cmd_bo_hdl; + int ret; + + if (!args->arg_count || args->arg_count > MAX_ARG_COUNT) { + XDNA_ERR(xdna, "Invalid arg bo count %d", args->arg_count); + return -EINVAL; + } + + /* Only support single command for now. 
*/ + if (args->cmd_count != 1) { + XDNA_ERR(xdna, "Invalid cmd bo count %d", args->cmd_count); + return -EINVAL; + } + + cmd_bo_hdl = (u32)args->cmd_handles; + arg_bo_hdls = kcalloc(args->arg_count, sizeof(u32), GFP_KERNEL); + if (!arg_bo_hdls) + return -ENOMEM; + ret = copy_from_user(arg_bo_hdls, u64_to_user_ptr(args->args), + args->arg_count * sizeof(u32)); + if (ret) { + ret = -EFAULT; + goto free_cmd_bo_hdls; + } + + ret = amdxdna_cmd_submit(client, cmd_bo_hdl, arg_bo_hdls, + args->arg_count, args->hwctx, &args->seq); + if (ret) + XDNA_DBG(xdna, "Submit cmds failed, ret %d", ret); + +free_cmd_bo_hdls: + kfree(arg_bo_hdls); + if (!ret) + XDNA_DBG(xdna, "Pushed cmd %lld to scheduler", args->seq); + return ret; +} + +int amdxdna_drm_submit_cmd_ioctl(struct drm_device *dev, void *data, struct drm_file *filp) +{ + struct amdxdna_client *client = filp->driver_priv; + struct amdxdna_drm_exec_cmd *args = data; + + if (args->ext || args->ext_flags) + return -EINVAL; + + switch (args->type) { + case AMDXDNA_CMD_SUBMIT_EXEC_BUF: + return amdxdna_drm_submit_execbuf(client, args); + } + + XDNA_ERR(client->xdna, "Invalid command type %d", args->type); + return -EINVAL; +} diff --git a/drivers/accel/amdxdna/amdxdna_ctx.h b/drivers/accel/amdxdna/amdxdna_ctx.h new file mode 100644 index 000000000000..80b0304193ec --- /dev/null +++ b/drivers/accel/amdxdna/amdxdna_ctx.h @@ -0,0 +1,162 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright (C) 2022-2024, Advanced Micro Devices, Inc. 
+ */ + +#ifndef _AMDXDNA_CTX_H_ +#define _AMDXDNA_CTX_H_ + +#include + +#include "amdxdna_gem.h" + +struct amdxdna_hwctx_priv; + +enum ert_cmd_opcode { + ERT_START_CU = 0, + ERT_CMD_CHAIN = 19, + ERT_START_NPU = 20, +}; + +enum ert_cmd_state { + ERT_CMD_STATE_INVALID, + ERT_CMD_STATE_NEW, + ERT_CMD_STATE_QUEUED, + ERT_CMD_STATE_RUNNING, + ERT_CMD_STATE_COMPLETED, + ERT_CMD_STATE_ERROR, + ERT_CMD_STATE_ABORT, + ERT_CMD_STATE_SUBMITTED, + ERT_CMD_STATE_TIMEOUT, + ERT_CMD_STATE_NORESPONSE, +}; + +/* + * Interpretation of the beginning of data payload for ERT_START_NPU in + * amdxdna_cmd. The rest of the payload in amdxdna_cmd is regular kernel args. + */ +struct amdxdna_cmd_start_npu { + u64 buffer; /* instruction buffer address */ + u32 buffer_size; /* size of buffer in bytes */ + u32 prop_count; /* properties count */ + u32 prop_args[]; /* properties and regular kernel arguments */ +}; + +/* + * Interpretation of the beginning of data payload for ERT_CMD_CHAIN in + * amdxdna_cmd. The rest of the payload in amdxdna_cmd is cmd BO handles. 
+ */ +struct amdxdna_cmd_chain { + u32 command_count; + u32 submit_index; + u32 error_index; + u32 reserved[3]; + u64 data[] __counted_by(command_count); +}; + +/* Exec buffer command header format */ +#define AMDXDNA_CMD_STATE GENMASK(3, 0) +#define AMDXDNA_CMD_EXTRA_CU_MASK GENMASK(11, 10) +#define AMDXDNA_CMD_COUNT GENMASK(22, 12) +#define AMDXDNA_CMD_OPCODE GENMASK(27, 23) +struct amdxdna_cmd { + u32 header; + u32 data[]; +}; + +struct amdxdna_hwctx { + struct amdxdna_client *client; + struct amdxdna_hwctx_priv *priv; + char *name; + + u32 id; + u32 max_opc; + u32 num_tiles; + u32 mem_size; + u32 fw_ctx_id; + u32 col_list_len; + u32 *col_list; + u32 start_col; + u32 num_col; +#define HWCTX_STAT_INIT 0 +#define HWCTX_STAT_READY 1 +#define HWCTX_STAT_STOP 2 + u32 status; + u32 old_status; + + struct amdxdna_qos_info qos; + struct amdxdna_hwctx_param_config_cu *cus; + u32 syncobj_hdl; +}; + +#define drm_job_to_xdna_job(j) \ + container_of(j, struct amdxdna_sched_job, base) + +struct amdxdna_sched_job { + struct drm_sched_job base; + struct kref refcnt; + struct amdxdna_hwctx *hwctx; + struct mm_struct *mm; + /* The fence to notice DRM scheduler that job is done by hardware */ + struct dma_fence *fence; + /* user can wait on this fence */ + struct dma_fence *out_fence; + bool job_done; + u64 seq; + struct amdxdna_gem_obj *cmd_bo; + size_t bo_cnt; + struct drm_gem_object *bos[] __counted_by(bo_cnt); +}; + +static inline u32 +amdxdna_cmd_get_op(struct amdxdna_gem_obj *abo) +{ + struct amdxdna_cmd *cmd = abo->mem.kva; + + return FIELD_GET(AMDXDNA_CMD_OPCODE, cmd->header); +} + +static inline void +amdxdna_cmd_set_state(struct amdxdna_gem_obj *abo, enum ert_cmd_state s) +{ + struct amdxdna_cmd *cmd = abo->mem.kva; + + cmd->header &= ~AMDXDNA_CMD_STATE; + cmd->header |= FIELD_PREP(AMDXDNA_CMD_STATE, s); +} + +static inline enum ert_cmd_state +amdxdna_cmd_get_state(struct amdxdna_gem_obj *abo) +{ + struct amdxdna_cmd *cmd = abo->mem.kva; + + return 
FIELD_GET(AMDXDNA_CMD_STATE, cmd->header); +} + +void *amdxdna_cmd_get_payload(struct amdxdna_gem_obj *abo, u32 *size); +int amdxdna_cmd_get_cu_idx(struct amdxdna_gem_obj *abo); + +static inline u32 amdxdna_hwctx_col_map(struct amdxdna_hwctx *hwctx) +{ + return GENMASK(hwctx->start_col + hwctx->num_col - 1, + hwctx->start_col); +} + +void amdxdna_sched_job_cleanup(struct amdxdna_sched_job *job); +void amdxdna_hwctx_remove_all(struct amdxdna_client *client); +void amdxdna_hwctx_suspend(struct amdxdna_client *client); +void amdxdna_hwctx_resume(struct amdxdna_client *client); + +int amdxdna_cmd_submit(struct amdxdna_client *client, + u32 cmd_bo_hdls, u32 *arg_bo_hdls, u32 arg_bo_cnt, + u32 hwctx_hdl, u64 *seq); + +int amdxdna_cmd_wait(struct amdxdna_client *client, u32 hwctx_hdl, + u64 seq, u32 timeout); + +int amdxdna_drm_create_hwctx_ioctl(struct drm_device *dev, void *data, struct drm_file *filp); +int amdxdna_drm_config_hwctx_ioctl(struct drm_device *dev, void *data, struct drm_file *filp); +int amdxdna_drm_destroy_hwctx_ioctl(struct drm_device *dev, void *data, struct drm_file *filp); +int amdxdna_drm_submit_cmd_ioctl(struct drm_device *dev, void *data, struct drm_file *filp); + +#endif /* _AMDXDNA_CTX_H_ */ diff --git a/drivers/accel/amdxdna/amdxdna_gem.c b/drivers/accel/amdxdna/amdxdna_gem.c new file mode 100644 index 000000000000..606433d73236 --- /dev/null +++ b/drivers/accel/amdxdna/amdxdna_gem.c @@ -0,0 +1,622 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2024, Advanced Micro Devices, Inc. 
+ */ + +#include +#include +#include +#include +#include +#include +#include +#include + +#include "amdxdna_ctx.h" +#include "amdxdna_gem.h" +#include "amdxdna_pci_drv.h" + +#define XDNA_MAX_CMD_BO_SIZE SZ_32K + +static int +amdxdna_gem_insert_node_locked(struct amdxdna_gem_obj *abo, bool use_vmap) +{ + struct amdxdna_client *client = abo->client; + struct amdxdna_dev *xdna = client->xdna; + struct amdxdna_mem *mem = &abo->mem; + u64 offset; + u32 align; + int ret; + + align = 1 << max(PAGE_SHIFT, xdna->dev_info->dev_mem_buf_shift); + ret = drm_mm_insert_node_generic(&abo->dev_heap->mm, &abo->mm_node, + mem->size, align, + 0, DRM_MM_INSERT_BEST); + if (ret) { + XDNA_ERR(xdna, "Failed to alloc dev bo memory, ret %d", ret); + return ret; + } + + mem->dev_addr = abo->mm_node.start; + offset = mem->dev_addr - abo->dev_heap->mem.dev_addr; + mem->userptr = abo->dev_heap->mem.userptr + offset; + mem->pages = &abo->dev_heap->base.pages[offset >> PAGE_SHIFT]; + mem->nr_pages = mem->size >> PAGE_SHIFT; + + if (use_vmap) { + mem->kva = vmap(mem->pages, mem->nr_pages, VM_MAP, PAGE_KERNEL); + if (!mem->kva) { + XDNA_ERR(xdna, "Failed to vmap"); + drm_mm_remove_node(&abo->mm_node); + return -EFAULT; + } + } + + return 0; +} + +static void amdxdna_gem_obj_free(struct drm_gem_object *gobj) +{ + struct amdxdna_dev *xdna = to_xdna_dev(gobj->dev); + struct amdxdna_gem_obj *abo = to_xdna_obj(gobj); + struct iosys_map map = IOSYS_MAP_INIT_VADDR(abo->mem.kva); + + XDNA_DBG(xdna, "BO type %d xdna_addr 0x%llx", abo->type, abo->mem.dev_addr); + if (abo->pinned) + amdxdna_gem_unpin(abo); + + if (abo->type == AMDXDNA_BO_DEV) { + mutex_lock(&abo->client->mm_lock); + drm_mm_remove_node(&abo->mm_node); + mutex_unlock(&abo->client->mm_lock); + + vunmap(abo->mem.kva); + drm_gem_object_put(to_gobj(abo->dev_heap)); + drm_gem_object_release(gobj); + mutex_destroy(&abo->lock); + kfree(abo); + return; + } + + if (abo->type == AMDXDNA_BO_DEV_HEAP) + drm_mm_takedown(&abo->mm); + + 
drm_gem_vunmap_unlocked(gobj, &map); + mutex_destroy(&abo->lock); + drm_gem_shmem_free(&abo->base); +} + +static const struct drm_gem_object_funcs amdxdna_gem_dev_obj_funcs = { + .free = amdxdna_gem_obj_free, +}; + +static bool amdxdna_hmm_invalidate(struct mmu_interval_notifier *mni, + const struct mmu_notifier_range *range, + unsigned long cur_seq) +{ + struct amdxdna_gem_obj *abo = container_of(mni, struct amdxdna_gem_obj, + mem.notifier); + struct amdxdna_dev *xdna = to_xdna_dev(to_gobj(abo)->dev); + + XDNA_DBG(xdna, "Invalid range 0x%llx, 0x%lx, type %d", + abo->mem.userptr, abo->mem.size, abo->type); + + if (!mmu_notifier_range_blockable(range)) + return false; + + xdna->dev_info->ops->hmm_invalidate(abo, cur_seq); + + return true; +} + +static const struct mmu_interval_notifier_ops amdxdna_hmm_ops = { + .invalidate = amdxdna_hmm_invalidate, +}; + +static void amdxdna_hmm_unregister(struct amdxdna_gem_obj *abo) +{ + struct amdxdna_dev *xdna = to_xdna_dev(to_gobj(abo)->dev); + + if (!xdna->dev_info->ops->hmm_invalidate) + return; + + mmu_interval_notifier_remove(&abo->mem.notifier); + kvfree(abo->mem.pfns); + abo->mem.pfns = NULL; +} + +static int amdxdna_hmm_register(struct amdxdna_gem_obj *abo, unsigned long addr, + size_t len) +{ + struct amdxdna_dev *xdna = to_xdna_dev(to_gobj(abo)->dev); + u32 nr_pages; + int ret; + + if (!xdna->dev_info->ops->hmm_invalidate) + return 0; + + if (abo->mem.pfns) + return -EEXIST; + + nr_pages = (PAGE_ALIGN(addr + len) - (addr & PAGE_MASK)) >> PAGE_SHIFT; + abo->mem.pfns = kvcalloc(nr_pages, sizeof(*abo->mem.pfns), + GFP_KERNEL); + if (!abo->mem.pfns) + return -ENOMEM; + + ret = mmu_interval_notifier_insert_locked(&abo->mem.notifier, + current->mm, + addr, + len, + &amdxdna_hmm_ops); + if (ret) { + XDNA_ERR(xdna, "Insert mmu notifier failed, ret %d", ret); + kvfree(abo->mem.pfns); + } + abo->mem.userptr = addr; + + return ret; +} + +static int amdxdna_gem_obj_mmap(struct drm_gem_object *gobj, + struct vm_area_struct *vma) +{ 
+ struct amdxdna_gem_obj *abo = to_xdna_obj(gobj); + unsigned long num_pages; + int ret; + + ret = amdxdna_hmm_register(abo, vma->vm_start, gobj->size); + if (ret) + return ret; + + ret = drm_gem_shmem_mmap(&abo->base, vma); + if (ret) + goto hmm_unreg; + + num_pages = gobj->size >> PAGE_SHIFT; + /* Try to insert the pages */ + vm_flags_mod(vma, VM_MIXEDMAP, VM_PFNMAP); + ret = vm_insert_pages(vma, vma->vm_start, abo->base.pages, &num_pages); + if (ret) + XDNA_ERR(abo->client->xdna, "Failed insert pages, ret %d", ret); + + return 0; + +hmm_unreg: + amdxdna_hmm_unregister(abo); + return ret; +} + +static vm_fault_t amdxdna_gem_vm_fault(struct vm_fault *vmf) +{ + return drm_gem_shmem_vm_ops.fault(vmf); +} + +static void amdxdna_gem_vm_open(struct vm_area_struct *vma) +{ + drm_gem_shmem_vm_ops.open(vma); +} + +static void amdxdna_gem_vm_close(struct vm_area_struct *vma) +{ + struct drm_gem_object *gobj = vma->vm_private_data; + + amdxdna_hmm_unregister(to_xdna_obj(gobj)); + drm_gem_shmem_vm_ops.close(vma); +} + +static const struct vm_operations_struct amdxdna_gem_vm_ops = { + .fault = amdxdna_gem_vm_fault, + .open = amdxdna_gem_vm_open, + .close = amdxdna_gem_vm_close, +}; + +static const struct drm_gem_object_funcs amdxdna_gem_shmem_funcs = { + .free = amdxdna_gem_obj_free, + .print_info = drm_gem_shmem_object_print_info, + .pin = drm_gem_shmem_object_pin, + .unpin = drm_gem_shmem_object_unpin, + .get_sg_table = drm_gem_shmem_object_get_sg_table, + .vmap = drm_gem_shmem_object_vmap, + .vunmap = drm_gem_shmem_object_vunmap, + .mmap = amdxdna_gem_obj_mmap, + .vm_ops = &amdxdna_gem_vm_ops, +}; + +static struct amdxdna_gem_obj * +amdxdna_gem_create_obj(struct drm_device *dev, size_t size) +{ + struct amdxdna_gem_obj *abo; + + abo = kzalloc(sizeof(*abo), GFP_KERNEL); + if (!abo) + return ERR_PTR(-ENOMEM); + + abo->pinned = false; + abo->assigned_hwctx = AMDXDNA_INVALID_CTX_HANDLE; + mutex_init(&abo->lock); + + abo->mem.userptr = AMDXDNA_INVALID_ADDR; + abo->mem.dev_addr 
= AMDXDNA_INVALID_ADDR; + abo->mem.size = size; + + return abo; +} + +/* For drm_driver->gem_create_object callback */ +struct drm_gem_object * +amdxdna_gem_create_object_cb(struct drm_device *dev, size_t size) +{ + struct amdxdna_gem_obj *abo; + + abo = amdxdna_gem_create_obj(dev, size); + if (IS_ERR(abo)) + return ERR_CAST(abo); + + to_gobj(abo)->funcs = &amdxdna_gem_shmem_funcs; + + return to_gobj(abo); +} + +static struct amdxdna_gem_obj * +amdxdna_drm_alloc_shmem(struct drm_device *dev, + struct amdxdna_drm_create_bo *args, + struct drm_file *filp) +{ + struct amdxdna_client *client = filp->driver_priv; + struct drm_gem_shmem_object *shmem; + struct amdxdna_gem_obj *abo; + + shmem = drm_gem_shmem_create(dev, args->size); + if (IS_ERR(shmem)) + return ERR_CAST(shmem); + + shmem->map_wc = false; + + abo = to_xdna_obj(&shmem->base); + abo->client = client; + abo->type = AMDXDNA_BO_SHMEM; + + return abo; +} + +static struct amdxdna_gem_obj * +amdxdna_drm_create_dev_heap(struct drm_device *dev, + struct amdxdna_drm_create_bo *args, + struct drm_file *filp) +{ + struct amdxdna_client *client = filp->driver_priv; + struct amdxdna_dev *xdna = to_xdna_dev(dev); + struct drm_gem_shmem_object *shmem; + struct amdxdna_gem_obj *abo; + int ret; + + if (args->size > xdna->dev_info->dev_mem_size) { + XDNA_DBG(xdna, "Invalid dev heap size 0x%llx, limit 0x%lx", + args->size, xdna->dev_info->dev_mem_size); + return ERR_PTR(-EINVAL); + } + + mutex_lock(&client->mm_lock); + if (client->dev_heap) { + XDNA_DBG(client->xdna, "dev heap is already created"); + ret = -EBUSY; + goto mm_unlock; + } + + shmem = drm_gem_shmem_create(dev, args->size); + if (IS_ERR(shmem)) { + ret = PTR_ERR(shmem); + goto mm_unlock; + } + + shmem->map_wc = false; + abo = to_xdna_obj(&shmem->base); + + abo->type = AMDXDNA_BO_DEV_HEAP; + abo->client = client; + abo->mem.dev_addr = client->xdna->dev_info->dev_mem_base; + drm_mm_init(&abo->mm, abo->mem.dev_addr, abo->mem.size); + + client->dev_heap = abo; + 
drm_gem_object_get(to_gobj(abo)); + mutex_unlock(&client->mm_lock); + + return abo; + +mm_unlock: + mutex_unlock(&client->mm_lock); + return ERR_PTR(ret); +} + +struct amdxdna_gem_obj * +amdxdna_drm_alloc_dev_bo(struct drm_device *dev, + struct amdxdna_drm_create_bo *args, + struct drm_file *filp, bool use_vmap) +{ + struct amdxdna_client *client = filp->driver_priv; + struct amdxdna_dev *xdna = to_xdna_dev(dev); + size_t aligned_sz = PAGE_ALIGN(args->size); + struct amdxdna_gem_obj *abo, *heap; + int ret; + + mutex_lock(&client->mm_lock); + heap = client->dev_heap; + if (!heap) { + ret = -EINVAL; + goto mm_unlock; + } + + if (heap->mem.userptr == AMDXDNA_INVALID_ADDR) { + XDNA_ERR(xdna, "Invalid dev heap userptr"); + ret = -EINVAL; + goto mm_unlock; + } + + if (args->size > heap->mem.size) { + XDNA_ERR(xdna, "Invalid dev bo size 0x%llx, limit 0x%lx", + args->size, heap->mem.size); + ret = -EINVAL; + goto mm_unlock; + } + + abo = amdxdna_gem_create_obj(&xdna->ddev, aligned_sz); + if (IS_ERR(abo)) { + ret = PTR_ERR(abo); + goto mm_unlock; + } + to_gobj(abo)->funcs = &amdxdna_gem_dev_obj_funcs; + abo->type = AMDXDNA_BO_DEV; + abo->client = client; + abo->dev_heap = heap; + ret = amdxdna_gem_insert_node_locked(abo, use_vmap); + if (ret) { + XDNA_ERR(xdna, "Failed to alloc dev bo memory, ret %d", ret); + goto mm_unlock; + } + + drm_gem_object_get(to_gobj(heap)); + drm_gem_private_object_init(&xdna->ddev, to_gobj(abo), aligned_sz); + + mutex_unlock(&client->mm_lock); + return abo; + +mm_unlock: + mutex_unlock(&client->mm_lock); + return ERR_PTR(ret); +} + +static struct amdxdna_gem_obj * +amdxdna_drm_create_cmd_bo(struct drm_device *dev, + struct amdxdna_drm_create_bo *args, + struct drm_file *filp) +{ + struct amdxdna_dev *xdna = to_xdna_dev(dev); + struct drm_gem_shmem_object *shmem; + struct amdxdna_gem_obj *abo; + struct iosys_map map; + int ret; + + if (args->size > XDNA_MAX_CMD_BO_SIZE) { + XDNA_ERR(xdna, "Command bo size 0x%llx too large", args->size); + return 
ERR_PTR(-EINVAL); + } + + if (args->size < sizeof(struct amdxdna_cmd)) { + XDNA_DBG(xdna, "Command BO size 0x%llx too small", args->size); + return ERR_PTR(-EINVAL); + } + + shmem = drm_gem_shmem_create(dev, args->size); + if (IS_ERR(shmem)) + return ERR_CAST(shmem); + + shmem->map_wc = false; + abo = to_xdna_obj(&shmem->base); + + abo->type = AMDXDNA_BO_CMD; + abo->client = filp->driver_priv; + + ret = drm_gem_vmap_unlocked(to_gobj(abo), &map); + if (ret) { + XDNA_ERR(xdna, "Vmap cmd bo failed, ret %d", ret); + goto release_obj; + } + abo->mem.kva = map.vaddr; + + return abo; + +release_obj: + drm_gem_shmem_free(shmem); + return ERR_PTR(ret); +} + +int amdxdna_drm_create_bo_ioctl(struct drm_device *dev, void *data, struct drm_file *filp) +{ + struct amdxdna_dev *xdna = to_xdna_dev(dev); + struct amdxdna_drm_create_bo *args = data; + struct amdxdna_gem_obj *abo; + int ret; + + if (args->flags || args->vaddr || !args->size) + return -EINVAL; + + XDNA_DBG(xdna, "BO arg type %d vaddr 0x%llx size 0x%llx flags 0x%llx", + args->type, args->vaddr, args->size, args->flags); + switch (args->type) { + case AMDXDNA_BO_SHMEM: + abo = amdxdna_drm_alloc_shmem(dev, args, filp); + break; + case AMDXDNA_BO_DEV_HEAP: + abo = amdxdna_drm_create_dev_heap(dev, args, filp); + break; + case AMDXDNA_BO_DEV: + abo = amdxdna_drm_alloc_dev_bo(dev, args, filp, false); + break; + case AMDXDNA_BO_CMD: + abo = amdxdna_drm_create_cmd_bo(dev, args, filp); + break; + default: + return -EINVAL; + } + if (IS_ERR(abo)) + return PTR_ERR(abo); + + /* ready to publish object to userspace */ + ret = drm_gem_handle_create(filp, to_gobj(abo), &args->handle); + if (ret) { + XDNA_ERR(xdna, "Create handle failed"); + goto put_obj; + } + + XDNA_DBG(xdna, "BO hdl %d type %d userptr 0x%llx xdna_addr 0x%llx size 0x%lx", + args->handle, args->type, abo->mem.userptr, + abo->mem.dev_addr, abo->mem.size); +put_obj: + /* Dereference object reference. Handle holds it now. 
*/ + drm_gem_object_put(to_gobj(abo)); + return ret; +} + +int amdxdna_gem_pin_nolock(struct amdxdna_gem_obj *abo) +{ + struct amdxdna_dev *xdna = to_xdna_dev(to_gobj(abo)->dev); + int ret; + + switch (abo->type) { + case AMDXDNA_BO_SHMEM: + case AMDXDNA_BO_DEV_HEAP: + ret = drm_gem_shmem_pin(&abo->base); + break; + case AMDXDNA_BO_DEV: + ret = drm_gem_shmem_pin(&abo->dev_heap->base); + break; + default: + ret = -EOPNOTSUPP; + } + + XDNA_DBG(xdna, "BO type %d ret %d", abo->type, ret); + return ret; +} + +int amdxdna_gem_pin(struct amdxdna_gem_obj *abo) +{ + int ret; + + if (abo->type == AMDXDNA_BO_DEV) + abo = abo->dev_heap; + + mutex_lock(&abo->lock); + ret = amdxdna_gem_pin_nolock(abo); + mutex_unlock(&abo->lock); + + return ret; +} + +void amdxdna_gem_unpin(struct amdxdna_gem_obj *abo) +{ + if (abo->type == AMDXDNA_BO_DEV) + abo = abo->dev_heap; + + mutex_lock(&abo->lock); + drm_gem_shmem_unpin(&abo->base); + mutex_unlock(&abo->lock); +} + +struct amdxdna_gem_obj *amdxdna_gem_get_obj(struct amdxdna_client *client, + u32 bo_hdl, u8 bo_type) +{ + struct amdxdna_dev *xdna = client->xdna; + struct amdxdna_gem_obj *abo; + struct drm_gem_object *gobj; + + gobj = drm_gem_object_lookup(client->filp, bo_hdl); + if (!gobj) { + XDNA_DBG(xdna, "Can not find bo %d", bo_hdl); + return NULL; + } + + abo = to_xdna_obj(gobj); + if (bo_type == AMDXDNA_BO_INVALID || abo->type == bo_type) + return abo; + + drm_gem_object_put(gobj); + return NULL; +} + +int amdxdna_drm_get_bo_info_ioctl(struct drm_device *dev, void *data, struct drm_file *filp) +{ + struct amdxdna_drm_get_bo_info *args = data; + struct amdxdna_dev *xdna = to_xdna_dev(dev); + struct amdxdna_gem_obj *abo; + struct drm_gem_object *gobj; + int ret = 0; + + if (args->ext || args->ext_flags || args->pad) + return -EINVAL; + + gobj = drm_gem_object_lookup(filp, args->handle); + if (!gobj) { + XDNA_DBG(xdna, "Lookup GEM object %d failed", args->handle); + return -ENOENT; + } + + abo = to_xdna_obj(gobj); + args->vaddr = 
abo->mem.userptr; + args->xdna_addr = abo->mem.dev_addr; + + if (abo->type != AMDXDNA_BO_DEV) + args->map_offset = drm_vma_node_offset_addr(&gobj->vma_node); + else + args->map_offset = AMDXDNA_INVALID_ADDR; + + XDNA_DBG(xdna, "BO hdl %d map_offset 0x%llx vaddr 0x%llx xdna_addr 0x%llx", + args->handle, args->map_offset, args->vaddr, args->xdna_addr); + + drm_gem_object_put(gobj); + return ret; +} + +/* + * The sync bo ioctl is to make sure the CPU cache is in sync with memory. + * This is required because NPU is not cache coherent device. CPU cache + * flushing/invalidation is expensive so it is best to handle this outside + * of the command submission path. This ioctl allows explicit cache + * flushing/invalidation outside of the critical path. + */ +int amdxdna_drm_sync_bo_ioctl(struct drm_device *dev, + void *data, struct drm_file *filp) +{ + struct amdxdna_dev *xdna = to_xdna_dev(dev); + struct amdxdna_drm_sync_bo *args = data; + struct amdxdna_gem_obj *abo; + struct drm_gem_object *gobj; + int ret; + + gobj = drm_gem_object_lookup(filp, args->handle); + if (!gobj) { + XDNA_ERR(xdna, "Lookup GEM object failed"); + return -ENOENT; + } + abo = to_xdna_obj(gobj); + + ret = amdxdna_gem_pin(abo); + if (ret) { + XDNA_ERR(xdna, "Pin BO %d failed, ret %d", args->handle, ret); + goto put_obj; + } + + if (abo->type == AMDXDNA_BO_DEV) + drm_clflush_pages(abo->mem.pages, abo->mem.nr_pages); + else + drm_clflush_pages(abo->base.pages, gobj->size >> PAGE_SHIFT); + + amdxdna_gem_unpin(abo); + + XDNA_DBG(xdna, "Sync bo %d offset 0x%llx, size 0x%llx\n", + args->handle, args->offset, args->size); + +put_obj: + drm_gem_object_put(gobj); + return ret; +} diff --git a/drivers/accel/amdxdna/amdxdna_gem.h b/drivers/accel/amdxdna/amdxdna_gem.h new file mode 100644 index 000000000000..8ccc0375dd9d --- /dev/null +++ b/drivers/accel/amdxdna/amdxdna_gem.h @@ -0,0 +1,65 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright (C) 2024, Advanced Micro Devices, Inc. 
+ */ + +#ifndef _AMDXDNA_GEM_H_ +#define _AMDXDNA_GEM_H_ + +struct amdxdna_mem { + u64 userptr; + void *kva; + u64 dev_addr; + size_t size; + struct page **pages; + u32 nr_pages; + struct mmu_interval_notifier notifier; + unsigned long *pfns; + bool map_invalid; +}; + +struct amdxdna_gem_obj { + struct drm_gem_shmem_object base; + struct amdxdna_client *client; + u8 type; + bool pinned; + struct mutex lock; /* Protects: pinned */ + struct amdxdna_mem mem; + + /* The members below are initialized only when needed */ + struct drm_mm mm; /* For AMDXDNA_BO_DEV_HEAP */ + struct amdxdna_gem_obj *dev_heap; /* For AMDXDNA_BO_DEV */ + struct drm_mm_node mm_node; /* For AMDXDNA_BO_DEV */ + u32 assigned_hwctx; +}; + +#define to_gobj(obj) (&(obj)->base.base) + +static inline struct amdxdna_gem_obj *to_xdna_obj(struct drm_gem_object *gobj) +{ + return container_of(gobj, struct amdxdna_gem_obj, base.base); +} + +struct amdxdna_gem_obj *amdxdna_gem_get_obj(struct amdxdna_client *client, + u32 bo_hdl, u8 bo_type); +static inline void amdxdna_gem_put_obj(struct amdxdna_gem_obj *abo) +{ + drm_gem_object_put(to_gobj(abo)); +} + +struct drm_gem_object * +amdxdna_gem_create_object_cb(struct drm_device *dev, size_t size); +struct amdxdna_gem_obj * +amdxdna_drm_alloc_dev_bo(struct drm_device *dev, + struct amdxdna_drm_create_bo *args, + struct drm_file *filp, bool use_vmap); + +int amdxdna_gem_pin_nolock(struct amdxdna_gem_obj *abo); +int amdxdna_gem_pin(struct amdxdna_gem_obj *abo); +void amdxdna_gem_unpin(struct amdxdna_gem_obj *abo); + +int amdxdna_drm_create_bo_ioctl(struct drm_device *dev, void *data, struct drm_file *filp); +int amdxdna_drm_get_bo_info_ioctl(struct drm_device *dev, void *data, struct drm_file *filp); +int amdxdna_drm_sync_bo_ioctl(struct drm_device *dev, void *data, struct drm_file *filp); + +#endif /* _AMDXDNA_GEM_H_ */ diff --git a/drivers/accel/amdxdna/amdxdna_mailbox.c b/drivers/accel/amdxdna/amdxdna_mailbox.c new file mode 100644 index 000000000000..814b16bb1953 --- 
/dev/null +++ b/drivers/accel/amdxdna/amdxdna_mailbox.c @@ -0,0 +1,561 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2022-2024, Advanced Micro Devices, Inc. + */ + +#include +#include +#include +#include +#include +#include + +#define CREATE_TRACE_POINTS +#include + +#include "amdxdna_mailbox.h" + +#define MB_ERR(chann, fmt, args...) \ +({ \ + typeof(chann) _chann = chann; \ + dev_err((_chann)->mb->dev, "xdna_mailbox.%d: "fmt, \ + (_chann)->msix_irq, ##args); \ +}) +#define MB_DBG(chann, fmt, args...) \ +({ \ + typeof(chann) _chann = chann; \ + dev_dbg((_chann)->mb->dev, "xdna_mailbox.%d: "fmt, \ + (_chann)->msix_irq, ##args); \ +}) +#define MB_WARN_ONCE(chann, fmt, args...) \ +({ \ + typeof(chann) _chann = chann; \ + dev_warn_once((_chann)->mb->dev, "xdna_mailbox.%d: "fmt, \ + (_chann)->msix_irq, ##args); \ +}) + +#define MAGIC_VAL 0x1D000000U +#define MAGIC_VAL_MASK 0xFF000000 +#define MAX_MSG_ID_ENTRIES 256 +#define MSG_RX_TIMER 200 /* milliseconds */ +#define MAILBOX_NAME "xdna_mailbox" + +enum channel_res_type { + CHAN_RES_X2I, + CHAN_RES_I2X, + CHAN_RES_NUM +}; + +struct mailbox { + struct device *dev; + struct xdna_mailbox_res res; +}; + +struct mailbox_channel { + struct mailbox *mb; + struct xdna_mailbox_chann_res res[CHAN_RES_NUM]; + int msix_irq; + u32 iohub_int_addr; + struct xarray chan_xa; + u32 next_msgid; + u32 x2i_tail; + + /* Received msg related fields */ + struct workqueue_struct *work_q; + struct work_struct rx_work; + u32 i2x_head; + bool bad_state; +}; + +#define MSG_BODY_SZ GENMASK(10, 0) +#define MSG_PROTO_VER GENMASK(23, 16) +struct xdna_msg_header { + __u32 total_size; + __u32 sz_ver; + __u32 id; + __u32 opcode; +} __packed; + +static_assert(sizeof(struct xdna_msg_header) == 16); + +struct mailbox_pkg { + struct xdna_msg_header header; + __u32 payload[]; +}; + +/* The protocol version. */ +#define MSG_PROTOCOL_VERSION 0x1 +/* The tombstone value. 
*/ +#define TOMBSTONE 0xDEADFACE + +struct mailbox_msg { + void *handle; + int (*notify_cb)(void *handle, const u32 *data, size_t size); + size_t pkg_size; /* package size in bytes */ + struct mailbox_pkg pkg; +}; + +static void mailbox_reg_write(struct mailbox_channel *mb_chann, u32 mbox_reg, u32 data) +{ + struct xdna_mailbox_res *mb_res = &mb_chann->mb->res; + void __iomem *ringbuf_addr = mb_res->mbox_base + mbox_reg; + + writel(data, ringbuf_addr); +} + +static u32 mailbox_reg_read(struct mailbox_channel *mb_chann, u32 mbox_reg) +{ + struct xdna_mailbox_res *mb_res = &mb_chann->mb->res; + void __iomem *ringbuf_addr = mb_res->mbox_base + mbox_reg; + + return readl(ringbuf_addr); +} + +static int mailbox_reg_read_non_zero(struct mailbox_channel *mb_chann, u32 mbox_reg, u32 *val) +{ + struct xdna_mailbox_res *mb_res = &mb_chann->mb->res; + void __iomem *ringbuf_addr = mb_res->mbox_base + mbox_reg; + int ret, value; + + /* Poll till value is not zero */ + ret = readx_poll_timeout(readl, ringbuf_addr, value, + value, 1 /* us */, 100); + if (ret < 0) + return ret; + + *val = value; + return 0; +} + +static inline void +mailbox_set_headptr(struct mailbox_channel *mb_chann, u32 headptr_val) +{ + mailbox_reg_write(mb_chann, mb_chann->res[CHAN_RES_I2X].mb_head_ptr_reg, headptr_val); + mb_chann->i2x_head = headptr_val; +} + +static inline void +mailbox_set_tailptr(struct mailbox_channel *mb_chann, u32 tailptr_val) +{ + mailbox_reg_write(mb_chann, mb_chann->res[CHAN_RES_X2I].mb_tail_ptr_reg, tailptr_val); + mb_chann->x2i_tail = tailptr_val; +} + +static inline u32 +mailbox_get_headptr(struct mailbox_channel *mb_chann, enum channel_res_type type) +{ + return mailbox_reg_read(mb_chann, mb_chann->res[type].mb_head_ptr_reg); +} + +static inline u32 +mailbox_get_tailptr(struct mailbox_channel *mb_chann, enum channel_res_type type) +{ + return mailbox_reg_read(mb_chann, mb_chann->res[type].mb_tail_ptr_reg); +} + +static inline u32 +mailbox_get_ringbuf_size(struct mailbox_channel 
*mb_chann, enum channel_res_type type) +{ + return mb_chann->res[type].rb_size; +} + +static inline int mailbox_validate_msgid(int msg_id) +{ + return (msg_id & MAGIC_VAL_MASK) == MAGIC_VAL; +} + +static int mailbox_acquire_msgid(struct mailbox_channel *mb_chann, struct mailbox_msg *mb_msg) +{ + u32 msg_id; + int ret; + + ret = xa_alloc_cyclic_irq(&mb_chann->chan_xa, &msg_id, mb_msg, + XA_LIMIT(0, MAX_MSG_ID_ENTRIES - 1), + &mb_chann->next_msgid, GFP_NOWAIT); + if (ret < 0) + return ret; + + /* + * Add MAGIC_VAL to the higher bits. + */ + msg_id |= MAGIC_VAL; + return msg_id; +} + +static void mailbox_release_msgid(struct mailbox_channel *mb_chann, int msg_id) +{ + msg_id &= ~MAGIC_VAL_MASK; + xa_erase_irq(&mb_chann->chan_xa, msg_id); +} + +static void mailbox_release_msg(struct mailbox_channel *mb_chann, + struct mailbox_msg *mb_msg) +{ + MB_DBG(mb_chann, "msg_id 0x%x msg opcode 0x%x", + mb_msg->pkg.header.id, mb_msg->pkg.header.opcode); + mb_msg->notify_cb(mb_msg->handle, NULL, 0); + kfree(mb_msg); +} + +static int +mailbox_send_msg(struct mailbox_channel *mb_chann, struct mailbox_msg *mb_msg) +{ + void __iomem *write_addr; + u32 ringbuf_size; + u32 head, tail; + u32 start_addr; + u32 tmp_tail; + + head = mailbox_get_headptr(mb_chann, CHAN_RES_X2I); + tail = mb_chann->x2i_tail; + ringbuf_size = mailbox_get_ringbuf_size(mb_chann, CHAN_RES_X2I); + start_addr = mb_chann->res[CHAN_RES_X2I].rb_start_addr; + tmp_tail = tail + mb_msg->pkg_size; + + if (tail < head && tmp_tail >= head) + goto no_space; + + if (tail >= head && (tmp_tail > ringbuf_size - sizeof(u32) && + mb_msg->pkg_size >= head)) + goto no_space; + + if (tail >= head && tmp_tail > ringbuf_size - sizeof(u32)) { + write_addr = mb_chann->mb->res.ringbuf_base + start_addr + tail; + writel(TOMBSTONE, write_addr); + + /* tombstone is set. 
Write from the start of the ringbuf */ + tail = 0; + } + + write_addr = mb_chann->mb->res.ringbuf_base + start_addr + tail; + memcpy_toio(write_addr, &mb_msg->pkg, mb_msg->pkg_size); + mailbox_set_tailptr(mb_chann, tail + mb_msg->pkg_size); + + trace_mbox_set_tail(MAILBOX_NAME, mb_chann->msix_irq, + mb_msg->pkg.header.opcode, + mb_msg->pkg.header.id); + + return 0; + +no_space: + return -ENOSPC; +} + +static int +mailbox_get_resp(struct mailbox_channel *mb_chann, struct xdna_msg_header *header, + void *data) +{ + struct mailbox_msg *mb_msg; + int msg_id; + int ret; + + msg_id = header->id; + if (!mailbox_validate_msgid(msg_id)) { + MB_ERR(mb_chann, "Bad message ID 0x%x", msg_id); + return -EINVAL; + } + + msg_id &= ~MAGIC_VAL_MASK; + mb_msg = xa_erase_irq(&mb_chann->chan_xa, msg_id); + if (!mb_msg) { + MB_ERR(mb_chann, "Cannot find msg 0x%x", msg_id); + return -EINVAL; + } + + MB_DBG(mb_chann, "opcode 0x%x size %d id 0x%x", + header->opcode, header->total_size, header->id); + ret = mb_msg->notify_cb(mb_msg->handle, data, header->total_size); + if (unlikely(ret)) + MB_ERR(mb_chann, "Message callback ret %d", ret); + + kfree(mb_msg); + return ret; +} + +static int mailbox_get_msg(struct mailbox_channel *mb_chann) +{ + struct xdna_msg_header header; + void __iomem *read_addr; + u32 msg_size, rest; + u32 ringbuf_size; + u32 head, tail; + u32 start_addr; + int ret; + + if (mailbox_reg_read_non_zero(mb_chann, mb_chann->res[CHAN_RES_I2X].mb_tail_ptr_reg, &tail)) + return -EINVAL; + head = mb_chann->i2x_head; + ringbuf_size = mailbox_get_ringbuf_size(mb_chann, CHAN_RES_I2X); + start_addr = mb_chann->res[CHAN_RES_I2X].rb_start_addr; + + if (unlikely(tail > ringbuf_size || !IS_ALIGNED(tail, 4))) { + MB_WARN_ONCE(mb_chann, "Invalid tail 0x%x", tail); + return -EINVAL; + } + + /* ringbuf empty */ + if (head == tail) + return -ENOENT; + + if (head == ringbuf_size) + head = 0; + + /* Peek size of the message or TOMBSTONE */ + read_addr = mb_chann->mb->res.ringbuf_base + 
start_addr + head; + header.total_size = readl(read_addr); + /* size is TOMBSTONE, set next read from 0 */ + if (header.total_size == TOMBSTONE) { + if (head < tail) { + MB_WARN_ONCE(mb_chann, "Tombstone, head 0x%x tail 0x%x", + head, tail); + return -EINVAL; + } + mailbox_set_headptr(mb_chann, 0); + return 0; + } + + if (unlikely(!header.total_size || !IS_ALIGNED(header.total_size, 4))) { + MB_WARN_ONCE(mb_chann, "Invalid total size 0x%x", header.total_size); + return -EINVAL; + } + msg_size = sizeof(header) + header.total_size; + + if (msg_size > ringbuf_size - head || msg_size > tail - head) { + MB_WARN_ONCE(mb_chann, "Invalid message size %d, tail %d, head %d", + msg_size, tail, head); + return -EINVAL; + } + + rest = sizeof(header) - sizeof(u32); + read_addr += sizeof(u32); + memcpy_fromio((u32 *)&header + 1, read_addr, rest); + read_addr += rest; + + ret = mailbox_get_resp(mb_chann, &header, (u32 *)read_addr); + + mailbox_set_headptr(mb_chann, head + msg_size); + /* After update head, it can equal to ringbuf_size. This is expected. */ + trace_mbox_set_head(MAILBOX_NAME, mb_chann->msix_irq, + header.opcode, header.id); + + return ret; +} + +static irqreturn_t mailbox_irq_handler(int irq, void *p) +{ + struct mailbox_channel *mb_chann = p; + + trace_mbox_irq_handle(MAILBOX_NAME, irq); + /* Schedule a rx_work to call the callback functions */ + queue_work(mb_chann->work_q, &mb_chann->rx_work); + /* Clear IOHUB register */ + mailbox_reg_write(mb_chann, mb_chann->iohub_int_addr, 0); + + return IRQ_HANDLED; +} + +static void mailbox_rx_worker(struct work_struct *rx_work) +{ + struct mailbox_channel *mb_chann; + int ret; + + mb_chann = container_of(rx_work, struct mailbox_channel, rx_work); + + if (READ_ONCE(mb_chann->bad_state)) { + MB_ERR(mb_chann, "Channel in bad state, work aborted"); + return; + } + + while (1) { + /* + * If return is 0, keep consuming next message, until there is + * no messages or an error happened. 
+ */ + ret = mailbox_get_msg(mb_chann); + if (ret == -ENOENT) + break; + + /* Other error means device doesn't look good, disable irq. */ + if (unlikely(ret)) { + MB_ERR(mb_chann, "Unexpected ret %d, disable irq", ret); + WRITE_ONCE(mb_chann->bad_state, true); + disable_irq(mb_chann->msix_irq); + break; + } + } +} + +int xdna_mailbox_send_msg(struct mailbox_channel *mb_chann, + const struct xdna_mailbox_msg *msg, u64 tx_timeout) +{ + struct xdna_msg_header *header; + struct mailbox_msg *mb_msg; + size_t pkg_size; + int ret; + + pkg_size = sizeof(*header) + msg->send_size; + if (pkg_size > mailbox_get_ringbuf_size(mb_chann, CHAN_RES_X2I)) { + MB_ERR(mb_chann, "Message size larger than ringbuf size"); + return -EINVAL; + } + + if (unlikely(!IS_ALIGNED(msg->send_size, 4))) { + MB_ERR(mb_chann, "Message must be 4 bytes align"); + return -EINVAL; + } + + /* The first word in payload can NOT be TOMBSTONE */ + if (unlikely(((u32 *)msg->send_data)[0] == TOMBSTONE)) { + MB_ERR(mb_chann, "Tomb stone in data"); + return -EINVAL; + } + + if (READ_ONCE(mb_chann->bad_state)) { + MB_ERR(mb_chann, "Channel in bad state"); + return -EPIPE; + } + + mb_msg = kzalloc(sizeof(*mb_msg) + pkg_size, GFP_KERNEL); + if (!mb_msg) + return -ENOMEM; + + mb_msg->handle = msg->handle; + mb_msg->notify_cb = msg->notify_cb; + mb_msg->pkg_size = pkg_size; + + header = &mb_msg->pkg.header; + /* + * Hardware uses total_size and size to split huge messages. + * We do not support it here. Thus the values are the same. 
+ */ + header->total_size = msg->send_size; + header->sz_ver = FIELD_PREP(MSG_BODY_SZ, msg->send_size) | + FIELD_PREP(MSG_PROTO_VER, MSG_PROTOCOL_VERSION); + header->opcode = msg->opcode; + memcpy(mb_msg->pkg.payload, msg->send_data, msg->send_size); + + ret = mailbox_acquire_msgid(mb_chann, mb_msg); + if (unlikely(ret < 0)) { + MB_ERR(mb_chann, "mailbox_acquire_msgid failed"); + goto msg_id_failed; + } + header->id = ret; + + MB_DBG(mb_chann, "opcode 0x%x size %d id 0x%x", + header->opcode, header->total_size, header->id); + + ret = mailbox_send_msg(mb_chann, mb_msg); + if (ret) { + MB_DBG(mb_chann, "Error in mailbox send msg, ret %d", ret); + goto release_id; + } + + return 0; + +release_id: + mailbox_release_msgid(mb_chann, header->id); +msg_id_failed: + kfree(mb_msg); + return ret; +} + +struct mailbox_channel * +xdna_mailbox_create_channel(struct mailbox *mb, + const struct xdna_mailbox_chann_res *x2i, + const struct xdna_mailbox_chann_res *i2x, + u32 iohub_int_addr, + int mb_irq) +{ + struct mailbox_channel *mb_chann; + int ret; + + if (!is_power_of_2(x2i->rb_size) || !is_power_of_2(i2x->rb_size)) { + pr_err("Ring buf size must be power of 2"); + return NULL; + } + + mb_chann = kzalloc(sizeof(*mb_chann), GFP_KERNEL); + if (!mb_chann) + return NULL; + + mb_chann->mb = mb; + mb_chann->msix_irq = mb_irq; + mb_chann->iohub_int_addr = iohub_int_addr; + memcpy(&mb_chann->res[CHAN_RES_X2I], x2i, sizeof(*x2i)); + memcpy(&mb_chann->res[CHAN_RES_I2X], i2x, sizeof(*i2x)); + + xa_init_flags(&mb_chann->chan_xa, XA_FLAGS_ALLOC | XA_FLAGS_LOCK_IRQ); + mb_chann->x2i_tail = mailbox_get_tailptr(mb_chann, CHAN_RES_X2I); + mb_chann->i2x_head = mailbox_get_headptr(mb_chann, CHAN_RES_I2X); + + INIT_WORK(&mb_chann->rx_work, mailbox_rx_worker); + mb_chann->work_q = create_singlethread_workqueue(MAILBOX_NAME); + if (!mb_chann->work_q) { + MB_ERR(mb_chann, "Create workqueue failed"); + goto free_and_out; + } + + /* Everything looks good. 
Time to enable irq handler */ + ret = request_irq(mb_irq, mailbox_irq_handler, 0, MAILBOX_NAME, mb_chann); + if (ret) { + MB_ERR(mb_chann, "Failed to request irq %d ret %d", mb_irq, ret); + goto destroy_wq; + } + + mb_chann->bad_state = false; + + MB_DBG(mb_chann, "Mailbox channel created (irq: %d)", mb_chann->msix_irq); + return mb_chann; + +destroy_wq: + destroy_workqueue(mb_chann->work_q); +free_and_out: + kfree(mb_chann); + return NULL; +} + +int xdna_mailbox_destroy_channel(struct mailbox_channel *mb_chann) +{ + struct mailbox_msg *mb_msg; + unsigned long msg_id; + + MB_DBG(mb_chann, "IRQ disabled and RX work cancelled"); + free_irq(mb_chann->msix_irq, mb_chann); + destroy_workqueue(mb_chann->work_q); + /* We can clean up and release resources */ + + xa_for_each(&mb_chann->chan_xa, msg_id, mb_msg) + mailbox_release_msg(mb_chann, mb_msg); + + xa_destroy(&mb_chann->chan_xa); + + MB_DBG(mb_chann, "Mailbox channel destroyed, irq: %d", mb_chann->msix_irq); + kfree(mb_chann); + return 0; +} + +void xdna_mailbox_stop_channel(struct mailbox_channel *mb_chann) +{ + /* Disable an irq and wait. This might sleep. */ + disable_irq(mb_chann->msix_irq); + + /* Cancel RX work and wait for it to finish */ + cancel_work_sync(&mb_chann->rx_work); + MB_DBG(mb_chann, "IRQ disabled and RX work cancelled"); +} + +struct mailbox *xdnam_mailbox_create(struct drm_device *ddev, + const struct xdna_mailbox_res *res) +{ + struct mailbox *mb; + + mb = drmm_kzalloc(ddev, sizeof(*mb), GFP_KERNEL); + if (!mb) + return NULL; + mb->dev = ddev->dev; + + /* mailbox and ring buf base and size information */ + memcpy(&mb->res, res, sizeof(*res)); + + return mb; +} diff --git a/drivers/accel/amdxdna/amdxdna_mailbox.h b/drivers/accel/amdxdna/amdxdna_mailbox.h new file mode 100644 index 000000000000..57954c303bdd --- /dev/null +++ b/drivers/accel/amdxdna/amdxdna_mailbox.h @@ -0,0 +1,124 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright (C) 2022-2024, Advanced Micro Devices, Inc. 
+ */ + +#ifndef _AIE2_MAILBOX_H_ +#define _AIE2_MAILBOX_H_ + +struct mailbox; +struct mailbox_channel; + +/* + * xdna_mailbox_msg - message struct + * + * @opcode: opcode for firmware + * @handle: handle used for the notify callback + * @notify_cb: callback function to notify the sender when there is a response + * @send_data: pointer to the data to send + * @send_size: size of the data to send + * + * The mailbox will split the sending data into multiple firmware messages if + * the size of the data is too big. This is transparent to the sender. The + * sender will receive one notification. + */ +struct xdna_mailbox_msg { + u32 opcode; + void *handle; + int (*notify_cb)(void *handle, const u32 *data, size_t size); + u8 *send_data; + size_t send_size; +}; + +/* + * xdna_mailbox_res - mailbox hardware resource + * + * @ringbuf_base: ring buffer base address + * @ringbuf_size: ring buffer size + * @mbox_base: mailbox base address + * @mbox_size: mailbox size + */ +struct xdna_mailbox_res { + void __iomem *ringbuf_base; + size_t ringbuf_size; + void __iomem *mbox_base; + size_t mbox_size; + const char *name; +}; + +/* + * xdna_mailbox_chann_res - resources + * + * @rb_start_addr: ring buffer start address + * @rb_size: ring buffer size + * @mb_head_ptr_reg: mailbox head pointer register + * @mb_tail_ptr_reg: mailbox tail pointer register + */ +struct xdna_mailbox_chann_res { + u32 rb_start_addr; + u32 rb_size; + u32 mb_head_ptr_reg; + u32 mb_tail_ptr_reg; +}; + +/* + * xdnam_mailbox_create() -- create mailbox subsystem and initialize + * + * @ddev: device pointer + * @res: SRAM and mailbox resources + * + * Return: On success, return a handle to the mailbox subsystem. + * Otherwise, return NULL pointer. 
+ */ +struct mailbox *xdnam_mailbox_create(struct drm_device *ddev, + const struct xdna_mailbox_res *res); + +/* + * xdna_mailbox_create_channel() -- Create a mailbox channel instance + * + * @mailbox: the handle returned from xdnam_mailbox_create() + * @x2i: host to firmware mailbox resources + * @i2x: firmware to host mailbox resources + * @xdna_mailbox_intr_reg: register addr of MSI-X interrupt + * @mb_irq: Linux IRQ number associated with mailbox MSI-X interrupt vector index + * + * Return: On success, return a handle to the mailbox channel. Otherwise, return NULL. + */ +struct mailbox_channel * +xdna_mailbox_create_channel(struct mailbox *mailbox, + const struct xdna_mailbox_chann_res *x2i, + const struct xdna_mailbox_chann_res *i2x, + u32 xdna_mailbox_intr_reg, + int mb_irq); + +/* + * xdna_mailbox_destroy_channel() -- destroy mailbox channel + * + * @mailbox_chann: the handle returned from xdna_mailbox_create_channel() + * + * Return: On success, return 0. Otherwise, return an error code. + */ +int xdna_mailbox_destroy_channel(struct mailbox_channel *mailbox_chann); + +/* + * xdna_mailbox_stop_channel() -- stop mailbox channel + * + * @mailbox_chann: the handle returned from xdna_mailbox_create_channel() + * + * Disables the channel IRQ and cancels pending RX work. No return value. + */ +void xdna_mailbox_stop_channel(struct mailbox_channel *mailbox_chann); + +/* + * xdna_mailbox_send_msg() -- Send a message + * + * @mailbox_chann: Mailbox channel handle + * @msg: message struct for message information + * @tx_timeout: the timeout value for sending the message in ms. 
+ * + * Return: On success, return 0. Otherwise, return an error code. + */ +int xdna_mailbox_send_msg(struct mailbox_channel *mailbox_chann, + const struct xdna_mailbox_msg *msg, u64 tx_timeout); + +#endif /* _AIE2_MAILBOX_H_ */ diff --git a/drivers/accel/amdxdna/amdxdna_mailbox_helper.c b/drivers/accel/amdxdna/amdxdna_mailbox_helper.c new file mode 100644 index 000000000000..5139a9c96a91 --- /dev/null +++ b/drivers/accel/amdxdna/amdxdna_mailbox_helper.c @@ -0,0 +1,61 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2024, Advanced Micro Devices, Inc. + */ + +#include +#include +#include +#include +#include +#include +#include + +#include "amdxdna_gem.h" +#include "amdxdna_mailbox.h" +#include "amdxdna_mailbox_helper.h" +#include "amdxdna_pci_drv.h" + +int xdna_msg_cb(void *handle, const u32 *data, size_t size) +{ + struct xdna_notify *cb_arg = handle; + int ret; + + if (unlikely(!data)) + goto out; + + if (unlikely(cb_arg->size != size)) { + cb_arg->error = -EINVAL; + goto out; + } + + print_hex_dump_debug("resp data: ", DUMP_PREFIX_OFFSET, + 16, 4, data, cb_arg->size, true); + memcpy(cb_arg->data, data, cb_arg->size); +out: + ret = cb_arg->error; + complete(&cb_arg->comp); + return ret; +} + +int xdna_send_msg_wait(struct amdxdna_dev *xdna, struct mailbox_channel *chann, + struct xdna_mailbox_msg *msg) +{ + struct xdna_notify *hdl = msg->handle; + int ret; + + ret = xdna_mailbox_send_msg(chann, msg, TX_TIMEOUT); + if (ret) { + XDNA_ERR(xdna, "Send message failed, ret %d", ret); + return ret; + } + + ret = wait_for_completion_timeout(&hdl->comp, + msecs_to_jiffies(RX_TIMEOUT)); + if (!ret) { + XDNA_ERR(xdna, "Wait for completion timeout"); + return -ETIME; + } + + return hdl->error; +} diff --git a/drivers/accel/amdxdna/amdxdna_mailbox_helper.h b/drivers/accel/amdxdna/amdxdna_mailbox_helper.h new file mode 100644 index 000000000000..23e1317b79fe --- /dev/null +++ b/drivers/accel/amdxdna/amdxdna_mailbox_helper.h @@ -0,0 +1,42 @@ +/* SPDX-License-Identifier: 
GPL-2.0 */ +/* + * Copyright (C) 2023-2024, Advanced Micro Devices, Inc. + */ + +#ifndef _AMDXDNA_MAILBOX_HELPER_H +#define _AMDXDNA_MAILBOX_HELPER_H + +#define TX_TIMEOUT 2000 /* milliseconds */ +#define RX_TIMEOUT 5000 /* milliseconds */ + +struct amdxdna_dev; + +struct xdna_notify { + struct completion comp; + u32 *data; + size_t size; + int error; +}; + +#define DECLARE_XDNA_MSG_COMMON(name, op, status) \ + struct name##_req req = { 0 }; \ + struct name##_resp resp = { status }; \ + struct xdna_notify hdl = { \ + .error = 0, \ + .data = (u32 *)&resp, \ + .size = sizeof(resp), \ + .comp = COMPLETION_INITIALIZER_ONSTACK(hdl.comp), \ + }; \ + struct xdna_mailbox_msg msg = { \ + .send_data = (u8 *)&req, \ + .send_size = sizeof(req), \ + .handle = &hdl, \ + .opcode = op, \ + .notify_cb = xdna_msg_cb, \ + } + +int xdna_msg_cb(void *handle, const u32 *data, size_t size); +int xdna_send_msg_wait(struct amdxdna_dev *xdna, struct mailbox_channel *chann, + struct xdna_mailbox_msg *msg); + +#endif /* _AMDXDNA_MAILBOX_HELPER_H */ diff --git a/drivers/accel/amdxdna/amdxdna_pci_drv.c b/drivers/accel/amdxdna/amdxdna_pci_drv.c new file mode 100644 index 000000000000..f5b8497cf5ad --- /dev/null +++ b/drivers/accel/amdxdna/amdxdna_pci_drv.c @@ -0,0 +1,434 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2022-2024, Advanced Micro Devices, Inc. + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "amdxdna_ctx.h" +#include "amdxdna_gem.h" +#include "amdxdna_pci_drv.h" + +#define AMDXDNA_AUTOSUSPEND_DELAY 5000 /* milliseconds */ + +MODULE_FIRMWARE("amdnpu/1502_00/npu.sbin"); +MODULE_FIRMWARE("amdnpu/17f0_10/npu.sbin"); +MODULE_FIRMWARE("amdnpu/17f0_11/npu.sbin"); +MODULE_FIRMWARE("amdnpu/17f0_20/npu.sbin"); + +/* + * Bind the driver based on the (vendor_id, device_id) pair and later use the + * (device_id, rev_id) pair as a key to select the devices. 
The devices with + * same device_id have very similar interface to host driver. + */ +static const struct pci_device_id pci_ids[] = { + { PCI_DEVICE(PCI_VENDOR_ID_AMD, 0x1502) }, + { PCI_DEVICE(PCI_VENDOR_ID_AMD, 0x17f0) }, + {0} +}; + +MODULE_DEVICE_TABLE(pci, pci_ids); + +static const struct amdxdna_device_id amdxdna_ids[] = { + { 0x1502, 0x0, &dev_npu1_info }, + { 0x17f0, 0x0, &dev_npu2_info }, + { 0x17f0, 0x10, &dev_npu4_info }, + { 0x17f0, 0x11, &dev_npu5_info }, + { 0x17f0, 0x20, &dev_npu6_info }, + {0} +}; + +static int amdxdna_drm_open(struct drm_device *ddev, struct drm_file *filp) +{ + struct amdxdna_dev *xdna = to_xdna_dev(ddev); + struct amdxdna_client *client; + int ret; + + ret = pm_runtime_resume_and_get(ddev->dev); + if (ret) { + XDNA_ERR(xdna, "Failed to get rpm, ret %d", ret); + return ret; + } + + client = kzalloc(sizeof(*client), GFP_KERNEL); + if (!client) { + ret = -ENOMEM; + goto put_rpm; + } + + client->pid = pid_nr(rcu_access_pointer(filp->pid)); + client->xdna = xdna; + + client->sva = iommu_sva_bind_device(xdna->ddev.dev, current->mm); + if (IS_ERR(client->sva)) { + ret = PTR_ERR(client->sva); + XDNA_ERR(xdna, "SVA bind device failed, ret %d", ret); + goto failed; + } + client->pasid = iommu_sva_get_pasid(client->sva); + if (client->pasid == IOMMU_PASID_INVALID) { + XDNA_ERR(xdna, "SVA get pasid failed"); + ret = -ENODEV; + goto unbind_sva; + } + mutex_init(&client->hwctx_lock); + init_srcu_struct(&client->hwctx_srcu); + xa_init_flags(&client->hwctx_xa, XA_FLAGS_ALLOC); + mutex_init(&client->mm_lock); + + mutex_lock(&xdna->dev_lock); + list_add_tail(&client->node, &xdna->client_list); + mutex_unlock(&xdna->dev_lock); + + filp->driver_priv = client; + client->filp = filp; + + XDNA_DBG(xdna, "pid %d opened", client->pid); + return 0; + +unbind_sva: + iommu_sva_unbind_device(client->sva); +failed: + kfree(client); +put_rpm: + pm_runtime_mark_last_busy(ddev->dev); + pm_runtime_put_autosuspend(ddev->dev); + + return ret; +} + +static void 
amdxdna_drm_close(struct drm_device *ddev, struct drm_file *filp) +{ + struct amdxdna_client *client = filp->driver_priv; + struct amdxdna_dev *xdna = to_xdna_dev(ddev); + + XDNA_DBG(xdna, "closing pid %d", client->pid); + + xa_destroy(&client->hwctx_xa); + cleanup_srcu_struct(&client->hwctx_srcu); + mutex_destroy(&client->hwctx_lock); + mutex_destroy(&client->mm_lock); + if (client->dev_heap) + drm_gem_object_put(to_gobj(client->dev_heap)); + + iommu_sva_unbind_device(client->sva); + + XDNA_DBG(xdna, "pid %d closed", client->pid); + kfree(client); + pm_runtime_mark_last_busy(ddev->dev); + pm_runtime_put_autosuspend(ddev->dev); +} + +static int amdxdna_flush(struct file *f, fl_owner_t id) +{ + struct drm_file *filp = f->private_data; + struct amdxdna_client *client = filp->driver_priv; + struct amdxdna_dev *xdna = client->xdna; + int idx; + + XDNA_DBG(xdna, "PID %d flushing...", client->pid); + if (!drm_dev_enter(&xdna->ddev, &idx)) + return 0; + + mutex_lock(&xdna->dev_lock); + list_del_init(&client->node); + mutex_unlock(&xdna->dev_lock); + amdxdna_hwctx_remove_all(client); + + drm_dev_exit(idx); + return 0; +} + +static int amdxdna_drm_get_info_ioctl(struct drm_device *dev, void *data, struct drm_file *filp) +{ + struct amdxdna_client *client = filp->driver_priv; + struct amdxdna_dev *xdna = to_xdna_dev(dev); + struct amdxdna_drm_get_info *args = data; + int ret; + + if (!xdna->dev_info->ops->get_aie_info) + return -EOPNOTSUPP; + + XDNA_DBG(xdna, "Request parameter %u", args->param); + mutex_lock(&xdna->dev_lock); + ret = xdna->dev_info->ops->get_aie_info(client, args); + mutex_unlock(&xdna->dev_lock); + return ret; +} + +static int amdxdna_drm_set_state_ioctl(struct drm_device *dev, void *data, struct drm_file *filp) +{ + struct amdxdna_client *client = filp->driver_priv; + struct amdxdna_dev *xdna = to_xdna_dev(dev); + struct amdxdna_drm_set_state *args = data; + int ret; + + if (!xdna->dev_info->ops->set_aie_state) + return -EOPNOTSUPP; + + XDNA_DBG(xdna, 
"Request parameter %u", args->param); + mutex_lock(&xdna->dev_lock); + ret = xdna->dev_info->ops->set_aie_state(client, args); + mutex_unlock(&xdna->dev_lock); + + return ret; +} + +static const struct drm_ioctl_desc amdxdna_drm_ioctls[] = { + /* Context */ + DRM_IOCTL_DEF_DRV(AMDXDNA_CREATE_HWCTX, amdxdna_drm_create_hwctx_ioctl, 0), + DRM_IOCTL_DEF_DRV(AMDXDNA_DESTROY_HWCTX, amdxdna_drm_destroy_hwctx_ioctl, 0), + DRM_IOCTL_DEF_DRV(AMDXDNA_CONFIG_HWCTX, amdxdna_drm_config_hwctx_ioctl, 0), + /* BO */ + DRM_IOCTL_DEF_DRV(AMDXDNA_CREATE_BO, amdxdna_drm_create_bo_ioctl, 0), + DRM_IOCTL_DEF_DRV(AMDXDNA_GET_BO_INFO, amdxdna_drm_get_bo_info_ioctl, 0), + DRM_IOCTL_DEF_DRV(AMDXDNA_SYNC_BO, amdxdna_drm_sync_bo_ioctl, 0), + /* Execution */ + DRM_IOCTL_DEF_DRV(AMDXDNA_EXEC_CMD, amdxdna_drm_submit_cmd_ioctl, 0), + /* AIE hardware */ + DRM_IOCTL_DEF_DRV(AMDXDNA_GET_INFO, amdxdna_drm_get_info_ioctl, 0), + DRM_IOCTL_DEF_DRV(AMDXDNA_SET_STATE, amdxdna_drm_set_state_ioctl, DRM_ROOT_ONLY), +}; + +static const struct file_operations amdxdna_fops = { + .owner = THIS_MODULE, + .open = accel_open, + .release = drm_release, + .flush = amdxdna_flush, + .unlocked_ioctl = drm_ioctl, + .compat_ioctl = drm_compat_ioctl, + .poll = drm_poll, + .read = drm_read, + .llseek = noop_llseek, + .mmap = drm_gem_mmap, + .fop_flags = FOP_UNSIGNED_OFFSET, +}; + +const struct drm_driver amdxdna_drm_drv = { + .driver_features = DRIVER_GEM | DRIVER_COMPUTE_ACCEL | + DRIVER_SYNCOBJ | DRIVER_SYNCOBJ_TIMELINE, + .fops = &amdxdna_fops, + .name = "amdxdna_accel_driver", + .desc = "AMD XDNA DRM implementation", + .open = amdxdna_drm_open, + .postclose = amdxdna_drm_close, + .ioctls = amdxdna_drm_ioctls, + .num_ioctls = ARRAY_SIZE(amdxdna_drm_ioctls), + + .gem_create_object = amdxdna_gem_create_object_cb, +}; + +static const struct amdxdna_dev_info * +amdxdna_get_dev_info(struct pci_dev *pdev) +{ + int i; + + for (i = 0; i < ARRAY_SIZE(amdxdna_ids); i++) { + if (pdev->device == amdxdna_ids[i].device && + 
pdev->revision == amdxdna_ids[i].revision) + return amdxdna_ids[i].dev_info; + } + return NULL; +} + +static int amdxdna_probe(struct pci_dev *pdev, const struct pci_device_id *id) +{ + struct device *dev = &pdev->dev; + struct amdxdna_dev *xdna; + int ret; + + xdna = devm_drm_dev_alloc(dev, &amdxdna_drm_drv, typeof(*xdna), ddev); + if (IS_ERR(xdna)) + return PTR_ERR(xdna); + + xdna->dev_info = amdxdna_get_dev_info(pdev); + if (!xdna->dev_info) + return -ENODEV; + + drmm_mutex_init(&xdna->ddev, &xdna->dev_lock); + init_rwsem(&xdna->notifier_lock); + INIT_LIST_HEAD(&xdna->client_list); + pci_set_drvdata(pdev, xdna); + + if (IS_ENABLED(CONFIG_LOCKDEP)) { + fs_reclaim_acquire(GFP_KERNEL); + might_lock(&xdna->notifier_lock); + fs_reclaim_release(GFP_KERNEL); + } + + mutex_lock(&xdna->dev_lock); + ret = xdna->dev_info->ops->init(xdna); + mutex_unlock(&xdna->dev_lock); + if (ret) { + XDNA_ERR(xdna, "Hardware init failed, ret %d", ret); + return ret; + } + + ret = amdxdna_sysfs_init(xdna); + if (ret) { + XDNA_ERR(xdna, "Create amdxdna attrs failed: %d", ret); + goto failed_dev_fini; + } + + pm_runtime_set_autosuspend_delay(dev, AMDXDNA_AUTOSUSPEND_DELAY); + pm_runtime_use_autosuspend(dev); + pm_runtime_allow(dev); + + ret = drm_dev_register(&xdna->ddev, 0); + if (ret) { + XDNA_ERR(xdna, "DRM register failed, ret %d", ret); + pm_runtime_forbid(dev); + goto failed_sysfs_fini; + } + + pm_runtime_mark_last_busy(dev); + pm_runtime_put_autosuspend(dev); + return 0; + +failed_sysfs_fini: + amdxdna_sysfs_fini(xdna); +failed_dev_fini: + mutex_lock(&xdna->dev_lock); + xdna->dev_info->ops->fini(xdna); + mutex_unlock(&xdna->dev_lock); + return ret; +} + +static void amdxdna_remove(struct pci_dev *pdev) +{ + struct amdxdna_dev *xdna = pci_get_drvdata(pdev); + struct device *dev = &pdev->dev; + struct amdxdna_client *client; + + pm_runtime_get_noresume(dev); + pm_runtime_forbid(dev); + + drm_dev_unplug(&xdna->ddev); + amdxdna_sysfs_fini(xdna); + + mutex_lock(&xdna->dev_lock); + client 
= list_first_entry_or_null(&xdna->client_list, + struct amdxdna_client, node); + while (client) { + list_del_init(&client->node); + mutex_unlock(&xdna->dev_lock); + + amdxdna_hwctx_remove_all(client); + + mutex_lock(&xdna->dev_lock); + client = list_first_entry_or_null(&xdna->client_list, + struct amdxdna_client, node); + } + + xdna->dev_info->ops->fini(xdna); + mutex_unlock(&xdna->dev_lock); +} + +static int amdxdna_dev_suspend_nolock(struct amdxdna_dev *xdna) +{ + if (xdna->dev_info->ops->suspend) + xdna->dev_info->ops->suspend(xdna); + + return 0; +} + +static int amdxdna_dev_resume_nolock(struct amdxdna_dev *xdna) +{ + if (xdna->dev_info->ops->resume) + return xdna->dev_info->ops->resume(xdna); + + return 0; +} + +static int amdxdna_pmops_suspend(struct device *dev) +{ + struct amdxdna_dev *xdna = pci_get_drvdata(to_pci_dev(dev)); + struct amdxdna_client *client; + + mutex_lock(&xdna->dev_lock); + list_for_each_entry(client, &xdna->client_list, node) + amdxdna_hwctx_suspend(client); + + amdxdna_dev_suspend_nolock(xdna); + mutex_unlock(&xdna->dev_lock); + + return 0; +} + +static int amdxdna_pmops_resume(struct device *dev) +{ + struct amdxdna_dev *xdna = pci_get_drvdata(to_pci_dev(dev)); + struct amdxdna_client *client; + int ret; + + XDNA_INFO(xdna, "firmware resuming..."); + mutex_lock(&xdna->dev_lock); + ret = amdxdna_dev_resume_nolock(xdna); + if (ret) { + XDNA_ERR(xdna, "resume NPU firmware failed"); + mutex_unlock(&xdna->dev_lock); + return ret; + } + + XDNA_INFO(xdna, "hardware context resuming..."); + list_for_each_entry(client, &xdna->client_list, node) + amdxdna_hwctx_resume(client); + mutex_unlock(&xdna->dev_lock); + + return 0; +} + +static int amdxdna_rpmops_suspend(struct device *dev) +{ + struct amdxdna_dev *xdna = pci_get_drvdata(to_pci_dev(dev)); + int ret; + + mutex_lock(&xdna->dev_lock); + ret = amdxdna_dev_suspend_nolock(xdna); + mutex_unlock(&xdna->dev_lock); + + XDNA_DBG(xdna, "Runtime suspend done ret: %d", ret); + return ret; +} + 
+static int amdxdna_rpmops_resume(struct device *dev) +{ + struct amdxdna_dev *xdna = pci_get_drvdata(to_pci_dev(dev)); + int ret; + + mutex_lock(&xdna->dev_lock); + ret = amdxdna_dev_resume_nolock(xdna); + mutex_unlock(&xdna->dev_lock); + + XDNA_DBG(xdna, "Runtime resume done ret: %d", ret); + return ret; +} + +static const struct dev_pm_ops amdxdna_pm_ops = { + SYSTEM_SLEEP_PM_OPS(amdxdna_pmops_suspend, amdxdna_pmops_resume) + RUNTIME_PM_OPS(amdxdna_rpmops_suspend, amdxdna_rpmops_resume, NULL) +}; + +static struct pci_driver amdxdna_pci_driver = { + .name = KBUILD_MODNAME, + .id_table = pci_ids, + .probe = amdxdna_probe, + .remove = amdxdna_remove, + .driver.pm = &amdxdna_pm_ops, +}; + +module_pci_driver(amdxdna_pci_driver); + +MODULE_LICENSE("GPL"); +MODULE_AUTHOR("XRT Team "); +MODULE_DESCRIPTION("amdxdna driver"); diff --git a/drivers/accel/amdxdna/amdxdna_pci_drv.h b/drivers/accel/amdxdna/amdxdna_pci_drv.h new file mode 100644 index 000000000000..37848a8d8031 --- /dev/null +++ b/drivers/accel/amdxdna/amdxdna_pci_drv.h @@ -0,0 +1,147 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright (C) 2022-2024, Advanced Micro Devices, Inc. + */ + +#ifndef _AMDXDNA_PCI_DRV_H_ +#define _AMDXDNA_PCI_DRV_H_ + +#include + +#define XDNA_INFO(xdna, fmt, args...) drm_info(&(xdna)->ddev, fmt, ##args) +#define XDNA_WARN(xdna, fmt, args...) drm_warn(&(xdna)->ddev, "%s: "fmt, __func__, ##args) +#define XDNA_ERR(xdna, fmt, args...) drm_err(&(xdna)->ddev, "%s: "fmt, __func__, ##args) +#define XDNA_DBG(xdna, fmt, args...) drm_dbg(&(xdna)->ddev, fmt, ##args) +#define XDNA_INFO_ONCE(xdna, fmt, args...) 
drm_info_once(&(xdna)->ddev, fmt, ##args) + +#define XDNA_MBZ_DBG(xdna, ptr, sz) \ + ({ \ + int __i; \ + int __ret = 0; \ + u8 *__ptr = (u8 *)(ptr); \ + for (__i = 0; __i < (sz); __i++) { \ + if (__ptr[__i]) { \ + XDNA_DBG(xdna, "MBZ check failed"); \ + __ret = -EINVAL; \ + break; \ + } \ + } \ + __ret; \ + }) + +#define to_xdna_dev(drm_dev) \ + ((struct amdxdna_dev *)container_of(drm_dev, struct amdxdna_dev, ddev)) + +extern const struct drm_driver amdxdna_drm_drv; + +struct amdxdna_client; +struct amdxdna_dev; +struct amdxdna_drm_get_info; +struct amdxdna_drm_set_state; +struct amdxdna_gem_obj; +struct amdxdna_hwctx; +struct amdxdna_sched_job; + +/* + * struct amdxdna_dev_ops - Device hardware operation callbacks + */ +struct amdxdna_dev_ops { + int (*init)(struct amdxdna_dev *xdna); + void (*fini)(struct amdxdna_dev *xdna); + int (*resume)(struct amdxdna_dev *xdna); + void (*suspend)(struct amdxdna_dev *xdna); + int (*hwctx_init)(struct amdxdna_hwctx *hwctx); + void (*hwctx_fini)(struct amdxdna_hwctx *hwctx); + int (*hwctx_config)(struct amdxdna_hwctx *hwctx, u32 type, u64 value, void *buf, u32 size); + void (*hmm_invalidate)(struct amdxdna_gem_obj *abo, unsigned long cur_seq); + void (*hwctx_suspend)(struct amdxdna_hwctx *hwctx); + void (*hwctx_resume)(struct amdxdna_hwctx *hwctx); + int (*cmd_submit)(struct amdxdna_hwctx *hwctx, struct amdxdna_sched_job *job, u64 *seq); + int (*get_aie_info)(struct amdxdna_client *client, struct amdxdna_drm_get_info *args); + int (*set_aie_state)(struct amdxdna_client *client, struct amdxdna_drm_set_state *args); +}; + +/* + * struct amdxdna_dev_info - Device hardware information + * Record device static information, like reg, mbox, PSP, SMU bar index + */ +struct amdxdna_dev_info { + int reg_bar; + int mbox_bar; + int sram_bar; + int psp_bar; + int smu_bar; + int device_type; + int first_col; + u32 dev_mem_buf_shift; + u64 dev_mem_base; + size_t dev_mem_size; + char *vbnv; + const struct amdxdna_dev_priv *dev_priv; + const 
struct amdxdna_dev_ops *ops; +}; + +struct amdxdna_fw_ver { + u32 major; + u32 minor; + u32 sub; + u32 build; +}; + +struct amdxdna_dev { + struct drm_device ddev; + struct amdxdna_dev_hdl *dev_handle; + const struct amdxdna_dev_info *dev_info; + void *xrs_hdl; + + struct mutex dev_lock; /* per device lock */ + struct list_head client_list; + struct amdxdna_fw_ver fw_ver; + struct rw_semaphore notifier_lock; /* for mmu notifier */ +}; + +/* + * struct amdxdna_device_id - PCI device info + */ +struct amdxdna_device_id { + unsigned short device; + u8 revision; + const struct amdxdna_dev_info *dev_info; +}; + +/* + * struct amdxdna_client - amdxdna client + * A per-fd data structure for managing contexts and other per-process state. + */ +struct amdxdna_client { + struct list_head node; + pid_t pid; + struct mutex hwctx_lock; /* protect hwctx */ + /* do NOT wait on this srcu while hwctx_lock is held */ + struct srcu_struct hwctx_srcu; + struct xarray hwctx_xa; + u32 next_hwctxid; + struct amdxdna_dev *xdna; + struct drm_file *filp; + + struct mutex mm_lock; /* protect memory related */ + struct amdxdna_gem_obj *dev_heap; + + struct iommu_sva *sva; + int pasid; +}; + +#define amdxdna_for_each_hwctx(client, hwctx_id, entry) \ + xa_for_each(&(client)->hwctx_xa, hwctx_id, entry) + +/* Add device info below */ +extern const struct amdxdna_dev_info dev_npu1_info; +extern const struct amdxdna_dev_info dev_npu2_info; +extern const struct amdxdna_dev_info dev_npu4_info; +extern const struct amdxdna_dev_info dev_npu5_info; +extern const struct amdxdna_dev_info dev_npu6_info; + +int amdxdna_sysfs_init(struct amdxdna_dev *xdna); +void amdxdna_sysfs_fini(struct amdxdna_dev *xdna); + +#endif /* _AMDXDNA_PCI_DRV_H_ */ diff --git a/drivers/accel/amdxdna/amdxdna_sysfs.c b/drivers/accel/amdxdna/amdxdna_sysfs.c new file mode 100644 index 000000000000..f27e4ee960a0 --- /dev/null +++ b/drivers/accel/amdxdna/amdxdna_sysfs.c @@ -0,0 +1,67 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * 
Copyright (C) 2023-2024, Advanced Micro Devices, Inc. + */ + +#include +#include +#include +#include +#include +#include + +#include "amdxdna_gem.h" +#include "amdxdna_pci_drv.h" + +static ssize_t vbnv_show(struct device *dev, struct device_attribute *attr, char *buf) +{ + struct amdxdna_dev *xdna = dev_get_drvdata(dev); + + return sprintf(buf, "%s\n", xdna->dev_info->vbnv); +} +static DEVICE_ATTR_RO(vbnv); + +static ssize_t device_type_show(struct device *dev, struct device_attribute *attr, char *buf) +{ + struct amdxdna_dev *xdna = dev_get_drvdata(dev); + + return sprintf(buf, "%d\n", xdna->dev_info->device_type); +} +static DEVICE_ATTR_RO(device_type); + +static ssize_t fw_version_show(struct device *dev, struct device_attribute *attr, char *buf) +{ + struct amdxdna_dev *xdna = dev_get_drvdata(dev); + + return sprintf(buf, "%d.%d.%d.%d\n", xdna->fw_ver.major, + xdna->fw_ver.minor, xdna->fw_ver.sub, + xdna->fw_ver.build); +} +static DEVICE_ATTR_RO(fw_version); + +static struct attribute *amdxdna_attrs[] = { + &dev_attr_device_type.attr, + &dev_attr_vbnv.attr, + &dev_attr_fw_version.attr, + NULL, +}; + +static struct attribute_group amdxdna_attr_group = { + .attrs = amdxdna_attrs, +}; + +int amdxdna_sysfs_init(struct amdxdna_dev *xdna) +{ + int ret; + + ret = sysfs_create_group(&xdna->ddev.dev->kobj, &amdxdna_attr_group); + if (ret) + XDNA_ERR(xdna, "Create attr group failed"); + + return ret; +} + +void amdxdna_sysfs_fini(struct amdxdna_dev *xdna) +{ + sysfs_remove_group(&xdna->ddev.dev->kobj, &amdxdna_attr_group); +} diff --git a/drivers/accel/amdxdna/npu1_regs.c b/drivers/accel/amdxdna/npu1_regs.c new file mode 100644 index 000000000000..e4f6dac7d00f --- /dev/null +++ b/drivers/accel/amdxdna/npu1_regs.c @@ -0,0 +1,114 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2023-2024, Advanced Micro Devices, Inc. 
+ */ + +#include +#include +#include +#include + +#include "aie2_pci.h" +#include "amdxdna_mailbox.h" +#include "amdxdna_pci_drv.h" + +/* Address definition from NPU1 docs */ +#define MPNPU_PUB_SEC_INTR 0x3010090 +#define MPNPU_PUB_PWRMGMT_INTR 0x3010094 +#define MPNPU_PUB_SCRATCH2 0x30100A0 +#define MPNPU_PUB_SCRATCH3 0x30100A4 +#define MPNPU_PUB_SCRATCH4 0x30100A8 +#define MPNPU_PUB_SCRATCH5 0x30100AC +#define MPNPU_PUB_SCRATCH6 0x30100B0 +#define MPNPU_PUB_SCRATCH7 0x30100B4 +#define MPNPU_PUB_SCRATCH9 0x30100BC + +#define MPNPU_SRAM_X2I_MAILBOX_0 0x30A0000 +#define MPNPU_SRAM_X2I_MAILBOX_1 0x30A2000 +#define MPNPU_SRAM_I2X_MAILBOX_15 0x30BF000 + +#define MPNPU_APERTURE0_BASE 0x3000000 +#define MPNPU_APERTURE1_BASE 0x3080000 +#define MPNPU_APERTURE2_BASE 0x30C0000 + +/* PCIe BAR Index for NPU1 */ +#define NPU1_REG_BAR_INDEX 0 +#define NPU1_MBOX_BAR_INDEX 4 +#define NPU1_PSP_BAR_INDEX 0 +#define NPU1_SMU_BAR_INDEX 0 +#define NPU1_SRAM_BAR_INDEX 2 +/* Associated BARs and Apertures */ +#define NPU1_REG_BAR_BASE MPNPU_APERTURE0_BASE +#define NPU1_MBOX_BAR_BASE MPNPU_APERTURE2_BASE +#define NPU1_PSP_BAR_BASE MPNPU_APERTURE0_BASE +#define NPU1_SMU_BAR_BASE MPNPU_APERTURE0_BASE +#define NPU1_SRAM_BAR_BASE MPNPU_APERTURE1_BASE + +const struct rt_config npu1_default_rt_cfg[] = { + { 2, 1, AIE2_RT_CFG_INIT }, /* PDI APP LOAD MODE */ + { 1, 1, AIE2_RT_CFG_CLK_GATING }, /* Clock gating on */ + { 0 }, +}; + +const struct dpm_clk_freq npu1_dpm_clk_table[] = { + {400, 800}, + {600, 1024}, + {600, 1024}, + {600, 1024}, + {600, 1024}, + {720, 1309}, + {720, 1309}, + {847, 1600}, + { 0 } +}; + +static const struct amdxdna_dev_priv npu1_dev_priv = { + .fw_path = "amdnpu/1502_00/npu.sbin", + .protocol_major = 0x5, + .protocol_minor = 0x7, + .rt_config = npu1_default_rt_cfg, + .dpm_clk_tbl = npu1_dpm_clk_table, + .col_align = COL_ALIGN_NONE, + .mbox_dev_addr = NPU1_MBOX_BAR_BASE, + .mbox_size = 0, /* Use BAR size */ + .sram_dev_addr = NPU1_SRAM_BAR_BASE, + .sram_offs = { + 
DEFINE_BAR_OFFSET(MBOX_CHANN_OFF, NPU1_SRAM, MPNPU_SRAM_X2I_MAILBOX_0), + DEFINE_BAR_OFFSET(FW_ALIVE_OFF, NPU1_SRAM, MPNPU_SRAM_I2X_MAILBOX_15), + }, + .psp_regs_off = { + DEFINE_BAR_OFFSET(PSP_CMD_REG, NPU1_PSP, MPNPU_PUB_SCRATCH2), + DEFINE_BAR_OFFSET(PSP_ARG0_REG, NPU1_PSP, MPNPU_PUB_SCRATCH3), + DEFINE_BAR_OFFSET(PSP_ARG1_REG, NPU1_PSP, MPNPU_PUB_SCRATCH4), + DEFINE_BAR_OFFSET(PSP_ARG2_REG, NPU1_PSP, MPNPU_PUB_SCRATCH9), + DEFINE_BAR_OFFSET(PSP_INTR_REG, NPU1_PSP, MPNPU_PUB_SEC_INTR), + DEFINE_BAR_OFFSET(PSP_STATUS_REG, NPU1_PSP, MPNPU_PUB_SCRATCH2), + DEFINE_BAR_OFFSET(PSP_RESP_REG, NPU1_PSP, MPNPU_PUB_SCRATCH3), + }, + .smu_regs_off = { + DEFINE_BAR_OFFSET(SMU_CMD_REG, NPU1_SMU, MPNPU_PUB_SCRATCH5), + DEFINE_BAR_OFFSET(SMU_ARG_REG, NPU1_SMU, MPNPU_PUB_SCRATCH7), + DEFINE_BAR_OFFSET(SMU_INTR_REG, NPU1_SMU, MPNPU_PUB_PWRMGMT_INTR), + DEFINE_BAR_OFFSET(SMU_RESP_REG, NPU1_SMU, MPNPU_PUB_SCRATCH6), + DEFINE_BAR_OFFSET(SMU_OUT_REG, NPU1_SMU, MPNPU_PUB_SCRATCH7), + }, + .hw_ops = { + .set_dpm = npu1_set_dpm, + }, +}; + +const struct amdxdna_dev_info dev_npu1_info = { + .reg_bar = NPU1_REG_BAR_INDEX, + .mbox_bar = NPU1_MBOX_BAR_INDEX, + .sram_bar = NPU1_SRAM_BAR_INDEX, + .psp_bar = NPU1_PSP_BAR_INDEX, + .smu_bar = NPU1_SMU_BAR_INDEX, + .first_col = 1, + .dev_mem_buf_shift = 15, /* 32 KiB aligned */ + .dev_mem_base = AIE2_DEVM_BASE, + .dev_mem_size = AIE2_DEVM_SIZE, + .vbnv = "RyzenAI-npu1", + .device_type = AMDXDNA_DEV_TYPE_KMQ, + .dev_priv = &npu1_dev_priv, + .ops = &aie2_ops, +}; diff --git a/drivers/accel/amdxdna/npu2_regs.c b/drivers/accel/amdxdna/npu2_regs.c new file mode 100644 index 000000000000..a081cac75ee0 --- /dev/null +++ b/drivers/accel/amdxdna/npu2_regs.c @@ -0,0 +1,113 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2023-2024, Advanced Micro Devices, Inc. 
+ */ + +#include +#include +#include +#include + +#include "aie2_pci.h" +#include "amdxdna_mailbox.h" +#include "amdxdna_pci_drv.h" + +/* NPU Public Registers on MpNPUAxiXbar (refer to Diag npu_registers.h) */ +#define MPNPU_PUB_SEC_INTR 0x3010060 +#define MPNPU_PUB_PWRMGMT_INTR 0x3010064 +#define MPNPU_PUB_SCRATCH0 0x301006C +#define MPNPU_PUB_SCRATCH1 0x3010070 +#define MPNPU_PUB_SCRATCH2 0x3010074 +#define MPNPU_PUB_SCRATCH3 0x3010078 +#define MPNPU_PUB_SCRATCH4 0x301007C +#define MPNPU_PUB_SCRATCH5 0x3010080 +#define MPNPU_PUB_SCRATCH6 0x3010084 +#define MPNPU_PUB_SCRATCH7 0x3010088 +#define MPNPU_PUB_SCRATCH8 0x301008C +#define MPNPU_PUB_SCRATCH9 0x3010090 +#define MPNPU_PUB_SCRATCH10 0x3010094 +#define MPNPU_PUB_SCRATCH11 0x3010098 +#define MPNPU_PUB_SCRATCH12 0x301009C +#define MPNPU_PUB_SCRATCH13 0x30100A0 +#define MPNPU_PUB_SCRATCH14 0x30100A4 +#define MPNPU_PUB_SCRATCH15 0x30100A8 +#define MP0_C2PMSG_73 0x3810A24 +#define MP0_C2PMSG_123 0x3810AEC + +#define MP1_C2PMSG_0 0x3B10900 +#define MP1_C2PMSG_60 0x3B109F0 +#define MP1_C2PMSG_61 0x3B109F4 + +#define MPNPU_SRAM_X2I_MAILBOX_0 0x3600000 +#define MPNPU_SRAM_X2I_MAILBOX_15 0x361E000 +#define MPNPU_SRAM_X2I_MAILBOX_31 0x363E000 +#define MPNPU_SRAM_I2X_MAILBOX_31 0x363F000 + +#define MMNPU_APERTURE0_BASE 0x3000000 +#define MMNPU_APERTURE1_BASE 0x3600000 +#define MMNPU_APERTURE3_BASE 0x3810000 +#define MMNPU_APERTURE4_BASE 0x3B10000 + +/* PCIe BAR Index for NPU2 */ +#define NPU2_REG_BAR_INDEX 0 +#define NPU2_MBOX_BAR_INDEX 0 +#define NPU2_PSP_BAR_INDEX 4 +#define NPU2_SMU_BAR_INDEX 5 +#define NPU2_SRAM_BAR_INDEX 2 +/* Associated BARs and Apertures */ +#define NPU2_REG_BAR_BASE MMNPU_APERTURE0_BASE +#define NPU2_MBOX_BAR_BASE MMNPU_APERTURE0_BASE +#define NPU2_PSP_BAR_BASE MMNPU_APERTURE3_BASE +#define NPU2_SMU_BAR_BASE MMNPU_APERTURE4_BASE +#define NPU2_SRAM_BAR_BASE MMNPU_APERTURE1_BASE + +static const struct amdxdna_dev_priv npu2_dev_priv = { + .fw_path = "amdnpu/17f0_00/npu.sbin", + .protocol_major = 
0x6, + .protocol_minor = 0x6, + .rt_config = npu4_default_rt_cfg, + .dpm_clk_tbl = npu4_dpm_clk_table, + .col_align = COL_ALIGN_NATURE, + .mbox_dev_addr = NPU2_MBOX_BAR_BASE, + .mbox_size = 0, /* Use BAR size */ + .sram_dev_addr = NPU2_SRAM_BAR_BASE, + .sram_offs = { + DEFINE_BAR_OFFSET(MBOX_CHANN_OFF, NPU2_SRAM, MPNPU_SRAM_X2I_MAILBOX_0), + DEFINE_BAR_OFFSET(FW_ALIVE_OFF, NPU2_SRAM, MPNPU_SRAM_X2I_MAILBOX_15), + }, + .psp_regs_off = { + DEFINE_BAR_OFFSET(PSP_CMD_REG, NPU2_PSP, MP0_C2PMSG_123), + DEFINE_BAR_OFFSET(PSP_ARG0_REG, NPU2_REG, MPNPU_PUB_SCRATCH3), + DEFINE_BAR_OFFSET(PSP_ARG1_REG, NPU2_REG, MPNPU_PUB_SCRATCH4), + DEFINE_BAR_OFFSET(PSP_ARG2_REG, NPU2_REG, MPNPU_PUB_SCRATCH9), + DEFINE_BAR_OFFSET(PSP_INTR_REG, NPU2_PSP, MP0_C2PMSG_73), + DEFINE_BAR_OFFSET(PSP_STATUS_REG, NPU2_PSP, MP0_C2PMSG_123), + DEFINE_BAR_OFFSET(PSP_RESP_REG, NPU2_REG, MPNPU_PUB_SCRATCH3), + }, + .smu_regs_off = { + DEFINE_BAR_OFFSET(SMU_CMD_REG, NPU2_SMU, MP1_C2PMSG_0), + DEFINE_BAR_OFFSET(SMU_ARG_REG, NPU2_SMU, MP1_C2PMSG_60), + DEFINE_BAR_OFFSET(SMU_INTR_REG, NPU2_SMU, MMNPU_APERTURE4_BASE), + DEFINE_BAR_OFFSET(SMU_RESP_REG, NPU2_SMU, MP1_C2PMSG_61), + DEFINE_BAR_OFFSET(SMU_OUT_REG, NPU2_SMU, MP1_C2PMSG_60), + }, + .hw_ops = { + .set_dpm = npu4_set_dpm, + }, +}; + +const struct amdxdna_dev_info dev_npu2_info = { + .reg_bar = NPU2_REG_BAR_INDEX, + .mbox_bar = NPU2_MBOX_BAR_INDEX, + .sram_bar = NPU2_SRAM_BAR_INDEX, + .psp_bar = NPU2_PSP_BAR_INDEX, + .smu_bar = NPU2_SMU_BAR_INDEX, + .first_col = 0, + .dev_mem_buf_shift = 15, /* 32 KiB aligned */ + .dev_mem_base = AIE2_DEVM_BASE, + .dev_mem_size = AIE2_DEVM_SIZE, + .vbnv = "RyzenAI-npu2", + .device_type = AMDXDNA_DEV_TYPE_KMQ, + .dev_priv = &npu2_dev_priv, + .ops = &aie2_ops, /* NPU2 can share NPU1's callback */ +}; diff --git a/drivers/accel/amdxdna/npu4_regs.c b/drivers/accel/amdxdna/npu4_regs.c new file mode 100644 index 000000000000..9f2e33182ec6 --- /dev/null +++ b/drivers/accel/amdxdna/npu4_regs.c @@ -0,0 +1,134 @@ +// 
SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2023-2024, Advanced Micro Devices, Inc. + */ + +#include +#include +#include +#include + +#include "aie2_pci.h" +#include "amdxdna_mailbox.h" +#include "amdxdna_pci_drv.h" + +/* NPU Public Registers on MpNPUAxiXbar (refer to Diag npu_registers.h) */ +#define MPNPU_PUB_SEC_INTR 0x3010060 +#define MPNPU_PUB_PWRMGMT_INTR 0x3010064 +#define MPNPU_PUB_SCRATCH0 0x301006C +#define MPNPU_PUB_SCRATCH1 0x3010070 +#define MPNPU_PUB_SCRATCH2 0x3010074 +#define MPNPU_PUB_SCRATCH3 0x3010078 +#define MPNPU_PUB_SCRATCH4 0x301007C +#define MPNPU_PUB_SCRATCH5 0x3010080 +#define MPNPU_PUB_SCRATCH6 0x3010084 +#define MPNPU_PUB_SCRATCH7 0x3010088 +#define MPNPU_PUB_SCRATCH8 0x301008C +#define MPNPU_PUB_SCRATCH9 0x3010090 +#define MPNPU_PUB_SCRATCH10 0x3010094 +#define MPNPU_PUB_SCRATCH11 0x3010098 +#define MPNPU_PUB_SCRATCH12 0x301009C +#define MPNPU_PUB_SCRATCH13 0x30100A0 +#define MPNPU_PUB_SCRATCH14 0x30100A4 +#define MPNPU_PUB_SCRATCH15 0x30100A8 +#define MP0_C2PMSG_73 0x3810A24 +#define MP0_C2PMSG_123 0x3810AEC + +#define MP1_C2PMSG_0 0x3B10900 +#define MP1_C2PMSG_60 0x3B109F0 +#define MP1_C2PMSG_61 0x3B109F4 + +#define MPNPU_SRAM_X2I_MAILBOX_0 0x3600000 +#define MPNPU_SRAM_X2I_MAILBOX_15 0x361E000 +#define MPNPU_SRAM_X2I_MAILBOX_31 0x363E000 +#define MPNPU_SRAM_I2X_MAILBOX_31 0x363F000 + +#define MMNPU_APERTURE0_BASE 0x3000000 +#define MMNPU_APERTURE1_BASE 0x3600000 +#define MMNPU_APERTURE3_BASE 0x3810000 +#define MMNPU_APERTURE4_BASE 0x3B10000 + +/* PCIe BAR Index for NPU4 */ +#define NPU4_REG_BAR_INDEX 0 +#define NPU4_MBOX_BAR_INDEX 0 +#define NPU4_PSP_BAR_INDEX 4 +#define NPU4_SMU_BAR_INDEX 5 +#define NPU4_SRAM_BAR_INDEX 2 +/* Associated BARs and Apertures */ +#define NPU4_REG_BAR_BASE MMNPU_APERTURE0_BASE +#define NPU4_MBOX_BAR_BASE MMNPU_APERTURE0_BASE +#define NPU4_PSP_BAR_BASE MMNPU_APERTURE3_BASE +#define NPU4_SMU_BAR_BASE MMNPU_APERTURE4_BASE +#define NPU4_SRAM_BAR_BASE MMNPU_APERTURE1_BASE + +const struct rt_config 
npu4_default_rt_cfg[] = { + { 5, 1, AIE2_RT_CFG_INIT }, /* PDI APP LOAD MODE */ + { 1, 1, AIE2_RT_CFG_CLK_GATING }, /* Clock gating on */ + { 2, 1, AIE2_RT_CFG_CLK_GATING }, /* Clock gating on */ + { 3, 1, AIE2_RT_CFG_CLK_GATING }, /* Clock gating on */ + { 4, 1, AIE2_RT_CFG_CLK_GATING }, /* Clock gating on */ + { 0 }, +}; + +const struct dpm_clk_freq npu4_dpm_clk_table[] = { + {396, 792}, + {600, 1056}, + {792, 1152}, + {975, 1267}, + {975, 1267}, + {1056, 1408}, + {1152, 1584}, + {1267, 1800}, + { 0 } +}; + +static const struct amdxdna_dev_priv npu4_dev_priv = { + .fw_path = "amdnpu/17f0_10/npu.sbin", + .protocol_major = 0x6, + .protocol_minor = 12, + .rt_config = npu4_default_rt_cfg, + .dpm_clk_tbl = npu4_dpm_clk_table, + .col_align = COL_ALIGN_NATURE, + .mbox_dev_addr = NPU4_MBOX_BAR_BASE, + .mbox_size = 0, /* Use BAR size */ + .sram_dev_addr = NPU4_SRAM_BAR_BASE, + .sram_offs = { + DEFINE_BAR_OFFSET(MBOX_CHANN_OFF, NPU4_SRAM, MPNPU_SRAM_X2I_MAILBOX_0), + DEFINE_BAR_OFFSET(FW_ALIVE_OFF, NPU4_SRAM, MPNPU_SRAM_X2I_MAILBOX_15), + }, + .psp_regs_off = { + DEFINE_BAR_OFFSET(PSP_CMD_REG, NPU4_PSP, MP0_C2PMSG_123), + DEFINE_BAR_OFFSET(PSP_ARG0_REG, NPU4_REG, MPNPU_PUB_SCRATCH3), + DEFINE_BAR_OFFSET(PSP_ARG1_REG, NPU4_REG, MPNPU_PUB_SCRATCH4), + DEFINE_BAR_OFFSET(PSP_ARG2_REG, NPU4_REG, MPNPU_PUB_SCRATCH9), + DEFINE_BAR_OFFSET(PSP_INTR_REG, NPU4_PSP, MP0_C2PMSG_73), + DEFINE_BAR_OFFSET(PSP_STATUS_REG, NPU4_PSP, MP0_C2PMSG_123), + DEFINE_BAR_OFFSET(PSP_RESP_REG, NPU4_REG, MPNPU_PUB_SCRATCH3), + }, + .smu_regs_off = { + DEFINE_BAR_OFFSET(SMU_CMD_REG, NPU4_SMU, MP1_C2PMSG_0), + DEFINE_BAR_OFFSET(SMU_ARG_REG, NPU4_SMU, MP1_C2PMSG_60), + DEFINE_BAR_OFFSET(SMU_INTR_REG, NPU4_SMU, MMNPU_APERTURE4_BASE), + DEFINE_BAR_OFFSET(SMU_RESP_REG, NPU4_SMU, MP1_C2PMSG_61), + DEFINE_BAR_OFFSET(SMU_OUT_REG, NPU4_SMU, MP1_C2PMSG_60), + }, + .hw_ops = { + .set_dpm = npu4_set_dpm, + }, +}; + +const struct amdxdna_dev_info dev_npu4_info = { + .reg_bar = NPU4_REG_BAR_INDEX, + .mbox_bar = 
NPU4_MBOX_BAR_INDEX, + .sram_bar = NPU4_SRAM_BAR_INDEX, + .psp_bar = NPU4_PSP_BAR_INDEX, + .smu_bar = NPU4_SMU_BAR_INDEX, + .first_col = 0, + .dev_mem_buf_shift = 15, /* 32 KiB aligned */ + .dev_mem_base = AIE2_DEVM_BASE, + .dev_mem_size = AIE2_DEVM_SIZE, + .vbnv = "RyzenAI-npu4", + .device_type = AMDXDNA_DEV_TYPE_KMQ, + .dev_priv = &npu4_dev_priv, + .ops = &aie2_ops, /* NPU4 can share NPU1's callback */ +}; diff --git a/drivers/accel/amdxdna/npu5_regs.c b/drivers/accel/amdxdna/npu5_regs.c new file mode 100644 index 000000000000..5f1cf83461c4 --- /dev/null +++ b/drivers/accel/amdxdna/npu5_regs.c @@ -0,0 +1,113 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2024, Advanced Micro Devices, Inc. + */ + +#include +#include +#include +#include + +#include "aie2_pci.h" +#include "amdxdna_mailbox.h" +#include "amdxdna_pci_drv.h" + +/* NPU Public Registers on MpNPUAxiXbar (refer to Diag npu_registers.h) */ +#define MPNPU_PUB_SEC_INTR 0x3010060 +#define MPNPU_PUB_PWRMGMT_INTR 0x3010064 +#define MPNPU_PUB_SCRATCH0 0x301006C +#define MPNPU_PUB_SCRATCH1 0x3010070 +#define MPNPU_PUB_SCRATCH2 0x3010074 +#define MPNPU_PUB_SCRATCH3 0x3010078 +#define MPNPU_PUB_SCRATCH4 0x301007C +#define MPNPU_PUB_SCRATCH5 0x3010080 +#define MPNPU_PUB_SCRATCH6 0x3010084 +#define MPNPU_PUB_SCRATCH7 0x3010088 +#define MPNPU_PUB_SCRATCH8 0x301008C +#define MPNPU_PUB_SCRATCH9 0x3010090 +#define MPNPU_PUB_SCRATCH10 0x3010094 +#define MPNPU_PUB_SCRATCH11 0x3010098 +#define MPNPU_PUB_SCRATCH12 0x301009C +#define MPNPU_PUB_SCRATCH13 0x30100A0 +#define MPNPU_PUB_SCRATCH14 0x30100A4 +#define MPNPU_PUB_SCRATCH15 0x30100A8 +#define MP0_C2PMSG_73 0x3810A24 +#define MP0_C2PMSG_123 0x3810AEC + +#define MP1_C2PMSG_0 0x3B10900 +#define MP1_C2PMSG_60 0x3B109F0 +#define MP1_C2PMSG_61 0x3B109F4 + +#define MPNPU_SRAM_X2I_MAILBOX_0 0x3600000 +#define MPNPU_SRAM_X2I_MAILBOX_15 0x361E000 +#define MPNPU_SRAM_X2I_MAILBOX_31 0x363E000 +#define MPNPU_SRAM_I2X_MAILBOX_31 0x363F000 + +#define MMNPU_APERTURE0_BASE 
0x3000000 +#define MMNPU_APERTURE1_BASE 0x3600000 +#define MMNPU_APERTURE3_BASE 0x3810000 +#define MMNPU_APERTURE4_BASE 0x3B10000 + +/* PCIe BAR Index for NPU5 */ +#define NPU5_REG_BAR_INDEX 0 +#define NPU5_MBOX_BAR_INDEX 0 +#define NPU5_PSP_BAR_INDEX 4 +#define NPU5_SMU_BAR_INDEX 5 +#define NPU5_SRAM_BAR_INDEX 2 +/* Associated BARs and Apertures */ +#define NPU5_REG_BAR_BASE MMNPU_APERTURE0_BASE +#define NPU5_MBOX_BAR_BASE MMNPU_APERTURE0_BASE +#define NPU5_PSP_BAR_BASE MMNPU_APERTURE3_BASE +#define NPU5_SMU_BAR_BASE MMNPU_APERTURE4_BASE +#define NPU5_SRAM_BAR_BASE MMNPU_APERTURE1_BASE + +static const struct amdxdna_dev_priv npu5_dev_priv = { + .fw_path = "amdnpu/17f0_11/npu.sbin", + .protocol_major = 0x6, + .protocol_minor = 12, + .rt_config = npu4_default_rt_cfg, + .dpm_clk_tbl = npu4_dpm_clk_table, + .col_align = COL_ALIGN_NATURE, + .mbox_dev_addr = NPU5_MBOX_BAR_BASE, + .mbox_size = 0, /* Use BAR size */ + .sram_dev_addr = NPU5_SRAM_BAR_BASE, + .sram_offs = { + DEFINE_BAR_OFFSET(MBOX_CHANN_OFF, NPU5_SRAM, MPNPU_SRAM_X2I_MAILBOX_0), + DEFINE_BAR_OFFSET(FW_ALIVE_OFF, NPU5_SRAM, MPNPU_SRAM_X2I_MAILBOX_15), + }, + .psp_regs_off = { + DEFINE_BAR_OFFSET(PSP_CMD_REG, NPU5_PSP, MP0_C2PMSG_123), + DEFINE_BAR_OFFSET(PSP_ARG0_REG, NPU5_REG, MPNPU_PUB_SCRATCH3), + DEFINE_BAR_OFFSET(PSP_ARG1_REG, NPU5_REG, MPNPU_PUB_SCRATCH4), + DEFINE_BAR_OFFSET(PSP_ARG2_REG, NPU5_REG, MPNPU_PUB_SCRATCH9), + DEFINE_BAR_OFFSET(PSP_INTR_REG, NPU5_PSP, MP0_C2PMSG_73), + DEFINE_BAR_OFFSET(PSP_STATUS_REG, NPU5_PSP, MP0_C2PMSG_123), + DEFINE_BAR_OFFSET(PSP_RESP_REG, NPU5_REG, MPNPU_PUB_SCRATCH3), + }, + .smu_regs_off = { + DEFINE_BAR_OFFSET(SMU_CMD_REG, NPU5_SMU, MP1_C2PMSG_0), + DEFINE_BAR_OFFSET(SMU_ARG_REG, NPU5_SMU, MP1_C2PMSG_60), + DEFINE_BAR_OFFSET(SMU_INTR_REG, NPU5_SMU, MMNPU_APERTURE4_BASE), + DEFINE_BAR_OFFSET(SMU_RESP_REG, NPU5_SMU, MP1_C2PMSG_61), + DEFINE_BAR_OFFSET(SMU_OUT_REG, NPU5_SMU, MP1_C2PMSG_60), + }, + .hw_ops = { + .set_dpm = npu4_set_dpm, + }, +}; + +const struct 
amdxdna_dev_info dev_npu5_info = { + .reg_bar = NPU5_REG_BAR_INDEX, + .mbox_bar = NPU5_MBOX_BAR_INDEX, + .sram_bar = NPU5_SRAM_BAR_INDEX, + .psp_bar = NPU5_PSP_BAR_INDEX, + .smu_bar = NPU5_SMU_BAR_INDEX, + .first_col = 0, + .dev_mem_buf_shift = 15, /* 32 KiB aligned */ + .dev_mem_base = AIE2_DEVM_BASE, + .dev_mem_size = AIE2_DEVM_SIZE, + .vbnv = "RyzenAI-npu5", + .device_type = AMDXDNA_DEV_TYPE_KMQ, + .dev_priv = &npu5_dev_priv, + .ops = &aie2_ops, +}; diff --git a/drivers/accel/amdxdna/npu6_regs.c b/drivers/accel/amdxdna/npu6_regs.c new file mode 100644 index 000000000000..94a7005685a7 --- /dev/null +++ b/drivers/accel/amdxdna/npu6_regs.c @@ -0,0 +1,114 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2024, Advanced Micro Devices, Inc. + */ + +#include +#include +#include +#include + +#include "aie2_pci.h" +#include "amdxdna_mailbox.h" +#include "amdxdna_pci_drv.h" + +/* NPU Public Registers on MpNPUAxiXbar (refer to Diag npu_registers.h) */ +#define MPNPU_PUB_SEC_INTR 0x3010060 +#define MPNPU_PUB_PWRMGMT_INTR 0x3010064 +#define MPNPU_PUB_SCRATCH0 0x301006C +#define MPNPU_PUB_SCRATCH1 0x3010070 +#define MPNPU_PUB_SCRATCH2 0x3010074 +#define MPNPU_PUB_SCRATCH3 0x3010078 +#define MPNPU_PUB_SCRATCH4 0x301007C +#define MPNPU_PUB_SCRATCH5 0x3010080 +#define MPNPU_PUB_SCRATCH6 0x3010084 +#define MPNPU_PUB_SCRATCH7 0x3010088 +#define MPNPU_PUB_SCRATCH8 0x301008C +#define MPNPU_PUB_SCRATCH9 0x3010090 +#define MPNPU_PUB_SCRATCH10 0x3010094 +#define MPNPU_PUB_SCRATCH11 0x3010098 +#define MPNPU_PUB_SCRATCH12 0x301009C +#define MPNPU_PUB_SCRATCH13 0x30100A0 +#define MPNPU_PUB_SCRATCH14 0x30100A4 +#define MPNPU_PUB_SCRATCH15 0x30100A8 +#define MP0_C2PMSG_73 0x3810A24 +#define MP0_C2PMSG_123 0x3810AEC + +#define MP1_C2PMSG_0 0x3B10900 +#define MP1_C2PMSG_60 0x3B109F0 +#define MP1_C2PMSG_61 0x3B109F4 + +#define MPNPU_SRAM_X2I_MAILBOX_0 0x3600000 +#define MPNPU_SRAM_X2I_MAILBOX_15 0x361E000 +#define MPNPU_SRAM_X2I_MAILBOX_31 0x363E000 +#define 
MPNPU_SRAM_I2X_MAILBOX_31 0x363F000 + +#define MMNPU_APERTURE0_BASE 0x3000000 +#define MMNPU_APERTURE1_BASE 0x3600000 +#define MMNPU_APERTURE3_BASE 0x3810000 +#define MMNPU_APERTURE4_BASE 0x3B10000 + +/* PCIe BAR Index for NPU6 */ +#define NPU6_REG_BAR_INDEX 0 +#define NPU6_MBOX_BAR_INDEX 0 +#define NPU6_PSP_BAR_INDEX 4 +#define NPU6_SMU_BAR_INDEX 5 +#define NPU6_SRAM_BAR_INDEX 2 +/* Associated BARs and Apertures */ +#define NPU6_REG_BAR_BASE MMNPU_APERTURE0_BASE +#define NPU6_MBOX_BAR_BASE MMNPU_APERTURE0_BASE +#define NPU6_PSP_BAR_BASE MMNPU_APERTURE3_BASE +#define NPU6_SMU_BAR_BASE MMNPU_APERTURE4_BASE +#define NPU6_SRAM_BAR_BASE MMNPU_APERTURE1_BASE + +static const struct amdxdna_dev_priv npu6_dev_priv = { + .fw_path = "amdnpu/17f0_10/npu.sbin", + .protocol_major = 0x6, + .protocol_minor = 12, + .rt_config = npu4_default_rt_cfg, + .dpm_clk_tbl = npu4_dpm_clk_table, + .col_align = COL_ALIGN_NATURE, + .mbox_dev_addr = NPU6_MBOX_BAR_BASE, + .mbox_size = 0, /* Use BAR size */ + .sram_dev_addr = NPU6_SRAM_BAR_BASE, + .sram_offs = { + DEFINE_BAR_OFFSET(MBOX_CHANN_OFF, NPU6_SRAM, MPNPU_SRAM_X2I_MAILBOX_0), + DEFINE_BAR_OFFSET(FW_ALIVE_OFF, NPU6_SRAM, MPNPU_SRAM_X2I_MAILBOX_15), + }, + .psp_regs_off = { + DEFINE_BAR_OFFSET(PSP_CMD_REG, NPU6_PSP, MP0_C2PMSG_123), + DEFINE_BAR_OFFSET(PSP_ARG0_REG, NPU6_REG, MPNPU_PUB_SCRATCH3), + DEFINE_BAR_OFFSET(PSP_ARG1_REG, NPU6_REG, MPNPU_PUB_SCRATCH4), + DEFINE_BAR_OFFSET(PSP_ARG2_REG, NPU6_REG, MPNPU_PUB_SCRATCH9), + DEFINE_BAR_OFFSET(PSP_INTR_REG, NPU6_PSP, MP0_C2PMSG_73), + DEFINE_BAR_OFFSET(PSP_STATUS_REG, NPU6_PSP, MP0_C2PMSG_123), + DEFINE_BAR_OFFSET(PSP_RESP_REG, NPU6_REG, MPNPU_PUB_SCRATCH3), + }, + .smu_regs_off = { + DEFINE_BAR_OFFSET(SMU_CMD_REG, NPU6_SMU, MP1_C2PMSG_0), + DEFINE_BAR_OFFSET(SMU_ARG_REG, NPU6_SMU, MP1_C2PMSG_60), + DEFINE_BAR_OFFSET(SMU_INTR_REG, NPU6_SMU, MMNPU_APERTURE4_BASE), + DEFINE_BAR_OFFSET(SMU_RESP_REG, NPU6_SMU, MP1_C2PMSG_61), + DEFINE_BAR_OFFSET(SMU_OUT_REG, NPU6_SMU, MP1_C2PMSG_60), + }, + 
.hw_ops = { + .set_dpm = npu4_set_dpm, + }, + +}; + +const struct amdxdna_dev_info dev_npu6_info = { + .reg_bar = NPU6_REG_BAR_INDEX, + .mbox_bar = NPU6_MBOX_BAR_INDEX, + .sram_bar = NPU6_SRAM_BAR_INDEX, + .psp_bar = NPU6_PSP_BAR_INDEX, + .smu_bar = NPU6_SMU_BAR_INDEX, + .first_col = 0, + .dev_mem_buf_shift = 15, /* 32 KiB aligned */ + .dev_mem_base = AIE2_DEVM_BASE, + .dev_mem_size = AIE2_DEVM_SIZE, + .vbnv = "RyzenAI-npu6", + .device_type = AMDXDNA_DEV_TYPE_KMQ, + .dev_priv = &npu6_dev_priv, + .ops = &aie2_ops, +}; diff --git a/drivers/accel/habanalabs/common/habanalabs_drv.c b/drivers/accel/habanalabs/common/habanalabs_drv.c index 708dfd10f39c..5409b2c656c8 100644 --- a/drivers/accel/habanalabs/common/habanalabs_drv.c +++ b/drivers/accel/habanalabs/common/habanalabs_drv.c @@ -101,7 +101,6 @@ static const struct drm_driver hl_driver = { .major = LINUX_VERSION_MAJOR, .minor = LINUX_VERSION_PATCHLEVEL, .patchlevel = LINUX_VERSION_SUBLEVEL, - .date = "20190505", .fops = &hl_fops, .open = hl_device_open, diff --git a/drivers/accel/habanalabs/common/memory.c b/drivers/accel/habanalabs/common/memory.c index 3348ad12c237..601fdbe70179 100644 --- a/drivers/accel/habanalabs/common/memory.c +++ b/drivers/accel/habanalabs/common/memory.c @@ -14,7 +14,7 @@ #include #include -MODULE_IMPORT_NS(DMA_BUF); +MODULE_IMPORT_NS("DMA_BUF"); #define HL_MMU_DEBUG 0 diff --git a/drivers/accel/ivpu/ivpu_drv.c b/drivers/accel/ivpu/ivpu_drv.c index 0c4a82271c26..38cf1c342c72 100644 --- a/drivers/accel/ivpu/ivpu_drv.c +++ b/drivers/accel/ivpu/ivpu_drv.c @@ -462,15 +462,7 @@ static const struct drm_driver driver = { .name = DRIVER_NAME, .desc = DRIVER_DESC, -#ifdef DRIVER_DATE - .date = DRIVER_DATE, - .major = DRIVER_MAJOR, - .minor = DRIVER_MINOR, - .patchlevel = DRIVER_PATCHLEVEL, -#else - .date = UTS_RELEASE, .major = 1, -#endif }; static void ivpu_context_abort_invalid(struct ivpu_device *vdev) diff --git a/drivers/accel/ivpu/ivpu_pm.c b/drivers/accel/ivpu/ivpu_pm.c index 
8b2b050cc41a..5060c5dd40d1 100644 --- a/drivers/accel/ivpu/ivpu_pm.c +++ b/drivers/accel/ivpu/ivpu_pm.c @@ -78,8 +78,8 @@ static int ivpu_resume(struct ivpu_device *vdev) int ret; retry: - pci_restore_state(to_pci_dev(vdev->drm.dev)); pci_set_power_state(to_pci_dev(vdev->drm.dev), PCI_D0); + pci_restore_state(to_pci_dev(vdev->drm.dev)); ret = ivpu_hw_power_up(vdev); if (ret) { diff --git a/drivers/accel/qaic/qaic_drv.c b/drivers/accel/qaic/qaic_drv.c index 3575e0c984d6..81819b9ef8d4 100644 --- a/drivers/accel/qaic/qaic_drv.c +++ b/drivers/accel/qaic/qaic_drv.c @@ -32,7 +32,7 @@ #include "qaic_timesync.h" #include "sahara.h" -MODULE_IMPORT_NS(DMA_BUF); +MODULE_IMPORT_NS("DMA_BUF"); #define PCI_DEV_AIC080 0xa080 #define PCI_DEV_AIC100 0xa100 @@ -208,7 +208,6 @@ static const struct drm_driver qaic_accel_driver = { .name = QAIC_NAME, .desc = QAIC_DESC, - .date = "20190618", .fops = &qaic_accel_fops, .open = qaic_open, diff --git a/drivers/accel/qaic/sahara.c b/drivers/accel/qaic/sahara.c index 6d772143d612..21d58aed0deb 100644 --- a/drivers/accel/qaic/sahara.c +++ b/drivers/accel/qaic/sahara.c @@ -772,8 +772,7 @@ static void sahara_mhi_remove(struct mhi_device *mhi_dev) cancel_work_sync(&context->fw_work); cancel_work_sync(&context->dump_work); - if (context->mem_dump) - vfree(context->mem_dump); + vfree(context->mem_dump); sahara_release_image(context); mhi_unprepare_from_transfer(mhi_dev); } diff --git a/drivers/acpi/Kconfig b/drivers/acpi/Kconfig index d65cd08ba8e1..d81b55f5068c 100644 --- a/drivers/acpi/Kconfig +++ b/drivers/acpi/Kconfig @@ -135,10 +135,10 @@ config ACPI_REV_OVERRIDE_POSSIBLE config ACPI_EC bool "Embedded Controller" depends on HAS_IOPORT - default X86 + default X86 || LOONGARCH help This driver handles communication with the microcontroller - on many x86 laptops and other machines. + on many x86/LoongArch laptops and other machines. 
config ACPI_EC_DEBUGFS tristate "EC read/write access through /sys/kernel/debug/ec" diff --git a/drivers/acpi/acpi_video.c b/drivers/acpi/acpi_video.c index 8274a17872ed..a972831dbd66 100644 --- a/drivers/acpi/acpi_video.c +++ b/drivers/acpi/acpi_video.c @@ -610,16 +610,28 @@ acpi_video_device_lcd_get_level_current(struct acpi_video_device *device, return 0; } +/** + * acpi_video_device_EDID() - Get EDID from ACPI _DDC + * @device: video output device (LCD, CRT, ..) + * @edid: address for returned EDID pointer + * @length: _DDC length to request (must be a multiple of 128) + * + * Get EDID from ACPI _DDC. On success, a pointer to the EDID data is written + * to the @edid address, and the length of the EDID is returned. The caller is + * responsible for freeing the edid pointer. + * + * Return the length of EDID (positive value) on success or error (negative + * value). + */ static int -acpi_video_device_EDID(struct acpi_video_device *device, - union acpi_object **edid, int length) +acpi_video_device_EDID(struct acpi_video_device *device, void **edid, int length) { - int status; + acpi_status status; struct acpi_buffer buffer = { ACPI_ALLOCATE_BUFFER, NULL }; union acpi_object *obj; union acpi_object arg0 = { ACPI_TYPE_INTEGER }; struct acpi_object_list args = { 1, &arg0 }; - + int ret; *edid = NULL; @@ -636,16 +648,17 @@ acpi_video_device_EDID(struct acpi_video_device *device, obj = buffer.pointer; - if (obj && obj->type == ACPI_TYPE_BUFFER) - *edid = obj; - else { + if (obj && obj->type == ACPI_TYPE_BUFFER) { + *edid = kmemdup(obj->buffer.pointer, obj->buffer.length, GFP_KERNEL); + ret = *edid ? 
obj->buffer.length : -ENOMEM; + } else { acpi_handle_debug(device->dev->handle, "Invalid _DDC data for length %d\n", length); - status = -EFAULT; - kfree(obj); + ret = -EFAULT; } - return status; + kfree(obj); + return ret; } /* bus */ @@ -1435,9 +1448,7 @@ int acpi_video_get_edid(struct acpi_device *device, int type, int device_id, { struct acpi_video_bus *video; struct acpi_video_device *video_device; - union acpi_object *buffer = NULL; - acpi_status status; - int i, length; + int i, length, ret; if (!device || !acpi_driver_data(device)) return -EINVAL; @@ -1477,16 +1488,10 @@ int acpi_video_get_edid(struct acpi_device *device, int type, int device_id, } for (length = 512; length > 0; length -= 128) { - status = acpi_video_device_EDID(video_device, &buffer, - length); - if (ACPI_SUCCESS(status)) - break; + ret = acpi_video_device_EDID(video_device, edid, length); + if (ret > 0) + return ret; } - if (!length) - continue; - - *edid = buffer->buffer.pointer; - return length; } return -ENODEV; diff --git a/drivers/acpi/acpica/evxfregn.c b/drivers/acpi/acpica/evxfregn.c index 95f78383bbdb..bff2d099f469 100644 --- a/drivers/acpi/acpica/evxfregn.c +++ b/drivers/acpi/acpica/evxfregn.c @@ -232,8 +232,6 @@ acpi_remove_address_space_handler(acpi_handle device, /* Now we can delete the handler object */ - acpi_os_release_mutex(handler_obj->address_space. 
- context_mutex); acpi_ut_remove_reference(handler_obj); goto unlock_and_exit; } diff --git a/drivers/acpi/apei/einj-cxl.c b/drivers/acpi/apei/einj-cxl.c index a4e709937236..78da9ae543a2 100644 --- a/drivers/acpi/apei/einj-cxl.c +++ b/drivers/acpi/apei/einj-cxl.c @@ -45,7 +45,7 @@ int einj_cxl_available_error_type_show(struct seq_file *m, void *v) return 0; } -EXPORT_SYMBOL_NS_GPL(einj_cxl_available_error_type_show, CXL); +EXPORT_SYMBOL_NS_GPL(einj_cxl_available_error_type_show, "CXL"); static int cxl_dport_get_sbdf(struct pci_dev *dport_dev, u64 *sbdf) { @@ -83,7 +83,7 @@ int einj_cxl_inject_rch_error(u64 rcrb, u64 type) return einj_cxl_rch_error_inject(type, 0x2, rcrb, GENMASK_ULL(63, 0), 0, 0); } -EXPORT_SYMBOL_NS_GPL(einj_cxl_inject_rch_error, CXL); +EXPORT_SYMBOL_NS_GPL(einj_cxl_inject_rch_error, "CXL"); int einj_cxl_inject_error(struct pci_dev *dport, u64 type) { @@ -104,10 +104,10 @@ int einj_cxl_inject_error(struct pci_dev *dport, u64 type) return einj_error_inject(type, 0x4, 0, 0, 0, param4); } -EXPORT_SYMBOL_NS_GPL(einj_cxl_inject_error, CXL); +EXPORT_SYMBOL_NS_GPL(einj_cxl_inject_error, "CXL"); bool einj_cxl_is_initialized(void) { return einj_initialized; } -EXPORT_SYMBOL_NS_GPL(einj_cxl_is_initialized, CXL); +EXPORT_SYMBOL_NS_GPL(einj_cxl_is_initialized, "CXL"); diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c index a2491905f165..07789f0b59bc 100644 --- a/drivers/acpi/apei/ghes.c +++ b/drivers/acpi/apei/ghes.c @@ -726,7 +726,7 @@ int cxl_cper_register_work(struct work_struct *work) cxl_cper_work = work; return 0; } -EXPORT_SYMBOL_NS_GPL(cxl_cper_register_work, CXL); +EXPORT_SYMBOL_NS_GPL(cxl_cper_register_work, "CXL"); int cxl_cper_unregister_work(struct work_struct *work) { @@ -737,13 +737,13 @@ int cxl_cper_unregister_work(struct work_struct *work) cxl_cper_work = NULL; return 0; } -EXPORT_SYMBOL_NS_GPL(cxl_cper_unregister_work, CXL); +EXPORT_SYMBOL_NS_GPL(cxl_cper_unregister_work, "CXL"); int cxl_cper_kfifo_get(struct 
cxl_cper_work_data *wd) { return kfifo_get(&cxl_cper_fifo, wd); } -EXPORT_SYMBOL_NS_GPL(cxl_cper_kfifo_get, CXL); +EXPORT_SYMBOL_NS_GPL(cxl_cper_kfifo_get, "CXL"); static bool ghes_do_proc(struct ghes *ghes, const struct acpi_hest_generic_status *estatus) diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c index 1f7e4c691d9e..98759d6199d3 100644 --- a/drivers/acpi/arm64/iort.c +++ b/drivers/acpi/arm64/iort.c @@ -1716,6 +1716,8 @@ static struct acpi_platform_list pmcg_plat_info[] __initdata = { /* HiSilicon Hip09 Platform */ {"HISI ", "HIP09 ", 0, ACPI_SIG_IORT, greater_than_or_equal, "Erratum #162001900", IORT_SMMU_V3_PMCG_HISI_HIP09}, + {"HISI ", "HIP09A ", 0, ACPI_SIG_IORT, greater_than_or_equal, + "Erratum #162001900", IORT_SMMU_V3_PMCG_HISI_HIP09}, /* HiSilicon Hip10/11 Platform uses the same SMMU IP with Hip09 */ {"HISI ", "HIP10 ", 0, ACPI_SIG_IORT, greater_than_or_equal, "Erratum #162001900", IORT_SMMU_V3_PMCG_HISI_HIP09}, diff --git a/drivers/acpi/nfit/core.c b/drivers/acpi/nfit/core.c index 5429ec9ef06f..a5d47819b3a4 100644 --- a/drivers/acpi/nfit/core.c +++ b/drivers/acpi/nfit/core.c @@ -454,8 +454,13 @@ int acpi_nfit_ctl(struct nvdimm_bus_descriptor *nd_desc, struct nvdimm *nvdimm, if (cmd_rc) *cmd_rc = -EINVAL; - if (cmd == ND_CMD_CALL) + if (cmd == ND_CMD_CALL) { + if (!buf || buf_len < sizeof(*call_pkg)) + return -EINVAL; + call_pkg = buf; + } + func = cmd_to_func(nfit_mem, cmd, call_pkg, &family); if (func < 0) return func; diff --git a/drivers/acpi/numa/hmat.c b/drivers/acpi/numa/hmat.c index 1a902a02390f..80a3481c0470 100644 --- a/drivers/acpi/numa/hmat.c +++ b/drivers/acpi/numa/hmat.c @@ -151,7 +151,7 @@ int acpi_get_genport_coordinates(u32 uid, return 0; } -EXPORT_SYMBOL_NS_GPL(acpi_get_genport_coordinates, CXL); +EXPORT_SYMBOL_NS_GPL(acpi_get_genport_coordinates, "CXL"); static __init void alloc_memory_initiator(unsigned int cpu_pxm) { diff --git a/drivers/acpi/resource.c b/drivers/acpi/resource.c index 7fe842dae1ec..90aaec923889 
100644 --- a/drivers/acpi/resource.c +++ b/drivers/acpi/resource.c @@ -250,6 +250,9 @@ static bool acpi_decode_space(struct resource_win *win, switch (addr->resource_type) { case ACPI_MEMORY_RANGE: acpi_dev_memresource_flags(res, len, wp); + + if (addr->info.mem.caching == ACPI_PREFETCHABLE_MEMORY) + res->flags |= IORESOURCE_PREFETCH; break; case ACPI_IO_RANGE: acpi_dev_ioresource_flags(res, len, iodec, @@ -265,9 +268,6 @@ static bool acpi_decode_space(struct resource_win *win, if (addr->producer_consumer == ACPI_PRODUCER) res->flags |= IORESOURCE_WINDOW; - if (addr->info.mem.caching == ACPI_PREFETCHABLE_MEMORY) - res->flags |= IORESOURCE_PREFETCH; - return !(res->flags & IORESOURCE_DISABLED); } @@ -440,6 +440,13 @@ static const struct dmi_system_id irq1_level_low_skip_override[] = { DMI_MATCH(DMI_BOARD_NAME, "S5602ZA"), }, }, + { + /* Asus Vivobook X1504VAP */ + .matches = { + DMI_MATCH(DMI_SYS_VENDOR, "ASUSTeK COMPUTER INC."), + DMI_MATCH(DMI_BOARD_NAME, "X1504VAP"), + }, + }, { /* Asus Vivobook X1704VAP */ .matches = { @@ -646,6 +653,17 @@ static const struct dmi_system_id irq1_edge_low_force_override[] = { DMI_MATCH(DMI_BOARD_NAME, "GMxHGxx"), }, }, + { + /* + * TongFang GM5HG0A in case of the SKIKK Vanaheim relabel the + * board-name is changed, so check OEM strings instead. Note + * OEM string matches are always exact matches. 
+ * https://bugzilla.kernel.org/show_bug.cgi?id=219614 + */ + .matches = { + DMI_EXACT_MATCH(DMI_OEM_STRING, "GM5HG0A"), + }, + }, { } }; @@ -671,11 +689,11 @@ static bool acpi_dev_irq_override(u32 gsi, u8 triggering, u8 polarity, for (i = 0; i < ARRAY_SIZE(override_table); i++) { const struct irq_override_cmp *entry = &override_table[i]; - if (dmi_check_system(entry->system) && - entry->irq == gsi && + if (entry->irq == gsi && entry->triggering == triggering && entry->polarity == polarity && - entry->shareable == shareable) + entry->shareable == shareable && + dmi_check_system(entry->system)) return entry->override; } diff --git a/drivers/acpi/thermal.c b/drivers/acpi/thermal.c index 6671537cb4b7..95982c098d5b 100644 --- a/drivers/acpi/thermal.c +++ b/drivers/acpi/thermal.c @@ -1082,7 +1082,7 @@ static void __exit acpi_thermal_exit(void) module_init(acpi_thermal_init); module_exit(acpi_thermal_exit); -MODULE_IMPORT_NS(ACPI_THERMAL); +MODULE_IMPORT_NS("ACPI_THERMAL"); MODULE_AUTHOR("Paul Diefenbaugh"); MODULE_DESCRIPTION("ACPI Thermal Zone Driver"); MODULE_LICENSE("GPL"); diff --git a/drivers/acpi/thermal_lib.c b/drivers/acpi/thermal_lib.c index 6214d6ebe1fa..f81591927e86 100644 --- a/drivers/acpi/thermal_lib.c +++ b/drivers/acpi/thermal_lib.c @@ -53,25 +53,25 @@ int acpi_active_trip_temp(struct acpi_device *adev, int id, int *ret_temp) return acpi_trip_temp(adev, obj_name, ret_temp); } -EXPORT_SYMBOL_NS_GPL(acpi_active_trip_temp, ACPI_THERMAL); +EXPORT_SYMBOL_NS_GPL(acpi_active_trip_temp, "ACPI_THERMAL"); int acpi_passive_trip_temp(struct acpi_device *adev, int *ret_temp) { return acpi_trip_temp(adev, "_PSV", ret_temp); } -EXPORT_SYMBOL_NS_GPL(acpi_passive_trip_temp, ACPI_THERMAL); +EXPORT_SYMBOL_NS_GPL(acpi_passive_trip_temp, "ACPI_THERMAL"); int acpi_hot_trip_temp(struct acpi_device *adev, int *ret_temp) { return acpi_trip_temp(adev, "_HOT", ret_temp); } -EXPORT_SYMBOL_NS_GPL(acpi_hot_trip_temp, ACPI_THERMAL); +EXPORT_SYMBOL_NS_GPL(acpi_hot_trip_temp, 
"ACPI_THERMAL"); int acpi_critical_trip_temp(struct acpi_device *adev, int *ret_temp) { return acpi_trip_temp(adev, "_CRT", ret_temp); } -EXPORT_SYMBOL_NS_GPL(acpi_critical_trip_temp, ACPI_THERMAL); +EXPORT_SYMBOL_NS_GPL(acpi_critical_trip_temp, "ACPI_THERMAL"); static int thermal_temp(int error, int temp_decik, int *ret_temp) { diff --git a/drivers/ata/sata_highbank.c b/drivers/ata/sata_highbank.c index b1b40e9551de..c8c817c51230 100644 --- a/drivers/ata/sata_highbank.c +++ b/drivers/ata/sata_highbank.c @@ -348,6 +348,7 @@ static int highbank_initialize_phys(struct device *dev, void __iomem *addr) phy_nodes[phy] = phy_data.np; cphy_base[phy] = of_iomap(phy_nodes[phy], 0); if (cphy_base[phy] == NULL) { + of_node_put(phy_data.np); return 0; } phy_count += 1; diff --git a/drivers/atm/fore200e.c b/drivers/atm/fore200e.c index cb00f8244e41..4fea1149e003 100644 --- a/drivers/atm/fore200e.c +++ b/drivers/atm/fore200e.c @@ -2569,7 +2569,7 @@ static struct platform_driver fore200e_sba_driver = { .of_match_table = fore200e_sba_match, }, .probe = fore200e_sba_probe, - .remove_new = fore200e_sba_remove, + .remove = fore200e_sba_remove, }; #endif diff --git a/drivers/auxdisplay/cfag12864bfb.c b/drivers/auxdisplay/cfag12864bfb.c index 2b74dabe7e17..24baf6b2c587 100644 --- a/drivers/auxdisplay/cfag12864bfb.c +++ b/drivers/auxdisplay/cfag12864bfb.c @@ -108,7 +108,7 @@ static void cfag12864bfb_remove(struct platform_device *device) static struct platform_driver cfag12864bfb_driver = { .probe = cfag12864bfb_probe, - .remove_new = cfag12864bfb_remove, + .remove = cfag12864bfb_remove, .driver = { .name = CFAG12864BFB_NAME, }, diff --git a/drivers/auxdisplay/hd44780.c b/drivers/auxdisplay/hd44780.c index 025dc6855cb2..0526f0d90a79 100644 --- a/drivers/auxdisplay/hd44780.c +++ b/drivers/auxdisplay/hd44780.c @@ -339,7 +339,7 @@ MODULE_DEVICE_TABLE(of, hd44780_of_match); static struct platform_driver hd44780_driver = { .probe = hd44780_probe, - .remove_new = hd44780_remove, + .remove = 
hd44780_remove, .driver = { .name = "hd44780", .of_match_table = hd44780_of_match, diff --git a/drivers/auxdisplay/ht16k33.c b/drivers/auxdisplay/ht16k33.c index 09deb864b27a..0b8ba754b343 100644 --- a/drivers/auxdisplay/ht16k33.c +++ b/drivers/auxdisplay/ht16k33.c @@ -780,5 +780,5 @@ module_i2c_driver(ht16k33_driver); MODULE_DESCRIPTION("Holtek HT16K33 driver"); MODULE_LICENSE("GPL"); -MODULE_IMPORT_NS(LINEDISP); +MODULE_IMPORT_NS("LINEDISP"); MODULE_AUTHOR("Robin van der Gracht "); diff --git a/drivers/auxdisplay/img-ascii-lcd.c b/drivers/auxdisplay/img-ascii-lcd.c index 9ba132dc6143..a802678a6f74 100644 --- a/drivers/auxdisplay/img-ascii-lcd.c +++ b/drivers/auxdisplay/img-ascii-lcd.c @@ -291,11 +291,11 @@ static struct platform_driver img_ascii_lcd_driver = { .of_match_table = img_ascii_lcd_matches, }, .probe = img_ascii_lcd_probe, - .remove_new = img_ascii_lcd_remove, + .remove = img_ascii_lcd_remove, }; module_platform_driver(img_ascii_lcd_driver); MODULE_DESCRIPTION("Imagination Technologies ASCII LCD Display"); MODULE_AUTHOR("Paul Burton "); MODULE_LICENSE("GPL"); -MODULE_IMPORT_NS(LINEDISP); +MODULE_IMPORT_NS("LINEDISP"); diff --git a/drivers/auxdisplay/line-display.c b/drivers/auxdisplay/line-display.c index 731ffdfafc4e..fcec77f100ce 100644 --- a/drivers/auxdisplay/line-display.c +++ b/drivers/auxdisplay/line-display.c @@ -381,7 +381,7 @@ out_put_device: put_device(&linedisp->dev); return err; } -EXPORT_SYMBOL_NS_GPL(linedisp_register, LINEDISP); +EXPORT_SYMBOL_NS_GPL(linedisp_register, "LINEDISP"); /** * linedisp_unregister - unregister a character line display @@ -394,7 +394,7 @@ void linedisp_unregister(struct linedisp *linedisp) del_timer_sync(&linedisp->timer); put_device(&linedisp->dev); } -EXPORT_SYMBOL_NS_GPL(linedisp_unregister, LINEDISP); +EXPORT_SYMBOL_NS_GPL(linedisp_unregister, "LINEDISP"); MODULE_DESCRIPTION("Character line display core support"); MODULE_LICENSE("GPL"); diff --git a/drivers/auxdisplay/max6959.c b/drivers/auxdisplay/max6959.c 
index 5519c014bd29..962488197b9e 100644 --- a/drivers/auxdisplay/max6959.c +++ b/drivers/auxdisplay/max6959.c @@ -191,4 +191,4 @@ module_i2c_driver(max6959_i2c_driver); MODULE_DESCRIPTION("MAX6958/6959 7-segment LED controller"); MODULE_AUTHOR("Andy Shevchenko "); MODULE_LICENSE("GPL"); -MODULE_IMPORT_NS(LINEDISP); +MODULE_IMPORT_NS("LINEDISP"); diff --git a/drivers/auxdisplay/seg-led-gpio.c b/drivers/auxdisplay/seg-led-gpio.c index 183ab3011cbb..f10c25e6bf12 100644 --- a/drivers/auxdisplay/seg-led-gpio.c +++ b/drivers/auxdisplay/seg-led-gpio.c @@ -97,7 +97,7 @@ MODULE_DEVICE_TABLE(of, seg_led_of_match); static struct platform_driver seg_led_driver = { .probe = seg_led_probe, - .remove_new = seg_led_remove, + .remove = seg_led_remove, .driver = { .name = "seg-led-gpio", .of_match_table = seg_led_of_match, @@ -108,4 +108,4 @@ module_platform_driver(seg_led_driver); MODULE_AUTHOR("Chris Packham "); MODULE_DESCRIPTION("7 segment LED driver"); MODULE_LICENSE("GPL"); -MODULE_IMPORT_NS(LINEDISP); +MODULE_IMPORT_NS("LINEDISP"); diff --git a/drivers/base/arch_numa.c b/drivers/base/arch_numa.c index e18701676426..c99f2ab105e5 100644 --- a/drivers/base/arch_numa.c +++ b/drivers/base/arch_numa.c @@ -208,6 +208,10 @@ static int __init numa_register_nodes(void) { int nid; + /* Check the validity of the memblock/node mapping */ + if (!memblock_validate_numa_coverage(0)) + return -EINVAL; + /* Finally register nodes. 
*/ for_each_node_mask(nid, numa_nodes_parsed) { unsigned long start_pfn, end_pfn; diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c index 609935ad5091..cf0d455209d7 100644 --- a/drivers/base/cacheinfo.c +++ b/drivers/base/cacheinfo.c @@ -58,7 +58,7 @@ bool last_level_cache_is_valid(unsigned int cpu) { struct cacheinfo *llc; - if (!cache_leaves(cpu)) + if (!cache_leaves(cpu) || !per_cpu_cacheinfo(cpu)) return false; llc = per_cpu_cacheinfo_idx(cpu, cache_leaves(cpu) - 1); @@ -458,11 +458,9 @@ int __weak populate_cache_leaves(unsigned int cpu) return -ENOENT; } -static inline -int allocate_cache_info(int cpu) +static inline int allocate_cache_info(int cpu) { - per_cpu_cacheinfo(cpu) = kcalloc(cache_leaves(cpu), - sizeof(struct cacheinfo), GFP_ATOMIC); + per_cpu_cacheinfo(cpu) = kcalloc(cache_leaves(cpu), sizeof(struct cacheinfo), GFP_ATOMIC); if (!per_cpu_cacheinfo(cpu)) { cache_leaves(cpu) = 0; return -ENOMEM; @@ -534,7 +532,11 @@ static inline int init_level_allocate_ci(unsigned int cpu) */ ci_cacheinfo(cpu)->early_ci_levels = false; - if (cache_leaves(cpu) <= early_leaves) + /* + * Some architectures (e.g., x86) do not use early initialization. + * Allocate memory now in such case. 
+ */ + if (cache_leaves(cpu) <= early_leaves && per_cpu_cacheinfo(cpu)) return 0; kfree(per_cpu_cacheinfo(cpu)); diff --git a/drivers/base/firmware_loader/builtin/main.c b/drivers/base/firmware_loader/builtin/main.c index a065c3150897..d36befebb1b9 100644 --- a/drivers/base/firmware_loader/builtin/main.c +++ b/drivers/base/firmware_loader/builtin/main.c @@ -61,7 +61,7 @@ bool firmware_request_builtin(struct firmware *fw, const char *name) return false; } -EXPORT_SYMBOL_NS_GPL(firmware_request_builtin, TEST_FIRMWARE); +EXPORT_SYMBOL_NS_GPL(firmware_request_builtin, "TEST_FIRMWARE"); /** * firmware_request_builtin_buf() - load builtin firmware into optional buffer diff --git a/drivers/base/firmware_loader/fallback_table.c b/drivers/base/firmware_loader/fallback_table.c index 8432ab2c3b3c..ddb70e29eb42 100644 --- a/drivers/base/firmware_loader/fallback_table.c +++ b/drivers/base/firmware_loader/fallback_table.c @@ -22,7 +22,7 @@ struct firmware_fallback_config fw_fallback_config = { .loading_timeout = 60, .old_timeout = 60, }; -EXPORT_SYMBOL_NS_GPL(fw_fallback_config, FIRMWARE_LOADER_PRIVATE); +EXPORT_SYMBOL_NS_GPL(fw_fallback_config, "FIRMWARE_LOADER_PRIVATE"); #ifdef CONFIG_SYSCTL static struct ctl_table firmware_config_table[] = { @@ -56,13 +56,13 @@ int register_firmware_config_sysctl(void) return -ENOMEM; return 0; } -EXPORT_SYMBOL_NS_GPL(register_firmware_config_sysctl, FIRMWARE_LOADER_PRIVATE); +EXPORT_SYMBOL_NS_GPL(register_firmware_config_sysctl, "FIRMWARE_LOADER_PRIVATE"); void unregister_firmware_config_sysctl(void) { unregister_sysctl_table(firmware_config_sysct_table_header); firmware_config_sysct_table_header = NULL; } -EXPORT_SYMBOL_NS_GPL(unregister_firmware_config_sysctl, FIRMWARE_LOADER_PRIVATE); +EXPORT_SYMBOL_NS_GPL(unregister_firmware_config_sysctl, "FIRMWARE_LOADER_PRIVATE"); #endif /* CONFIG_SYSCTL */ diff --git a/drivers/base/firmware_loader/sysfs.h b/drivers/base/firmware_loader/sysfs.h index 2060add8ef81..1cabea544a40 100644 --- 
a/drivers/base/firmware_loader/sysfs.h +++ b/drivers/base/firmware_loader/sysfs.h @@ -6,7 +6,7 @@ #include "firmware.h" -MODULE_IMPORT_NS(FIRMWARE_LOADER_PRIVATE); +MODULE_IMPORT_NS("FIRMWARE_LOADER_PRIVATE"); extern struct firmware_fallback_config fw_fallback_config; extern struct device_attribute dev_attr_loading; diff --git a/drivers/base/regmap/regmap.c b/drivers/base/regmap/regmap.c index 53131a7ede0a..5962ea1230a1 100644 --- a/drivers/base/regmap/regmap.c +++ b/drivers/base/regmap/regmap.c @@ -598,6 +598,17 @@ int regmap_attach_dev(struct device *dev, struct regmap *map, } EXPORT_SYMBOL_GPL(regmap_attach_dev); +static int dev_get_regmap_match(struct device *dev, void *res, void *data); + +static int regmap_detach_dev(struct device *dev, struct regmap *map) +{ + if (!dev) + return 0; + + return devres_release(dev, dev_get_regmap_release, + dev_get_regmap_match, (void *)map->name); +} + static enum regmap_endian regmap_get_reg_endian(const struct regmap_bus *bus, const struct regmap_config *config) { @@ -1052,13 +1063,13 @@ skip_format_initialization: /* Sanity check */ if (range_cfg->range_max < range_cfg->range_min) { - dev_err(map->dev, "Invalid range %d: %d < %d\n", i, + dev_err(map->dev, "Invalid range %d: %u < %u\n", i, range_cfg->range_max, range_cfg->range_min); goto err_range; } if (range_cfg->range_max > map->max_register) { - dev_err(map->dev, "Invalid range %d: %d > %d\n", i, + dev_err(map->dev, "Invalid range %d: %u > %u\n", i, range_cfg->range_max, map->max_register); goto err_range; } @@ -1445,6 +1456,7 @@ void regmap_exit(struct regmap *map) { struct regmap_async *async; + regmap_detach_dev(map->dev, map); regcache_exit(map); regmap_debugfs_exit(map); diff --git a/drivers/base/topology.c b/drivers/base/topology.c index cf160dd2c27b..b962da263eee 100644 --- a/drivers/base/topology.c +++ b/drivers/base/topology.c @@ -27,9 +27,17 @@ static ssize_t name##_read(struct file *file, struct kobject *kobj, \ loff_t off, size_t count) \ { \ struct device 
*dev = kobj_to_dev(kobj); \ + cpumask_var_t mask; \ + ssize_t n; \ \ - return cpumap_print_bitmask_to_buf(buf, topology_##mask(dev->id), \ - off, count); \ + if (!alloc_cpumask_var(&mask, GFP_KERNEL)) \ + return -ENOMEM; \ + \ + cpumask_copy(mask, topology_##mask(dev->id)); \ + n = cpumap_print_bitmask_to_buf(buf, mask, off, count); \ + free_cpumask_var(mask); \ + \ + return n; \ } \ \ static ssize_t name##_list_read(struct file *file, struct kobject *kobj, \ @@ -37,9 +45,17 @@ static ssize_t name##_list_read(struct file *file, struct kobject *kobj, \ loff_t off, size_t count) \ { \ struct device *dev = kobj_to_dev(kobj); \ + cpumask_var_t mask; \ + ssize_t n; \ \ - return cpumap_print_list_to_buf(buf, topology_##mask(dev->id), \ - off, count); \ + if (!alloc_cpumask_var(&mask, GFP_KERNEL)) \ + return -ENOMEM; \ + \ + cpumask_copy(mask, topology_##mask(dev->id)); \ + n = cpumap_print_list_to_buf(buf, mask, off, count); \ + free_cpumask_var(mask); \ + \ + return n; \ } define_id_show_func(physical_package_id, "%d"); diff --git a/drivers/bcma/host_soc.c b/drivers/bcma/host_soc.c index 8ae0b918e740..20b1816c570b 100644 --- a/drivers/bcma/host_soc.c +++ b/drivers/bcma/host_soc.c @@ -261,7 +261,7 @@ static struct platform_driver bcma_host_soc_driver = { .of_match_table = bcma_host_soc_of_match, }, .probe = bcma_host_soc_probe, - .remove_new = bcma_host_soc_remove, + .remove = bcma_host_soc_remove, }; int __init bcma_host_soc_register_driver(void) diff --git a/drivers/block/rnull.rs b/drivers/block/rnull.rs index 5de7223beb4d..9cca05dcf772 100644 --- a/drivers/block/rnull.rs +++ b/drivers/block/rnull.rs @@ -28,6 +28,7 @@ module! 
{ type: NullBlkModule, name: "rnull_mod", author: "Andreas Hindborg", + description: "Rust implementation of the C null block driver", license: "GPL v2", } diff --git a/drivers/block/swim.c b/drivers/block/swim.c index 126f151c4f2c..be4ac58afe41 100644 --- a/drivers/block/swim.c +++ b/drivers/block/swim.c @@ -944,7 +944,7 @@ static void swim_remove(struct platform_device *dev) static struct platform_driver swim_driver = { .probe = swim_probe, - .remove_new = swim_remove, + .remove = swim_remove, .driver = { .name = CARDNAME, }, diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c index d4aed12dd436..934ab9332c80 100644 --- a/drivers/block/ublk_drv.c +++ b/drivers/block/ublk_drv.c @@ -1618,6 +1618,21 @@ static void ublk_unquiesce_dev(struct ublk_device *ub) blk_mq_kick_requeue_list(ub->ub_disk->queue); } +static struct gendisk *ublk_detach_disk(struct ublk_device *ub) +{ + struct gendisk *disk; + + /* Sync with ublk_abort_queue() by holding the lock */ + spin_lock(&ub->lock); + disk = ub->ub_disk; + ub->dev_info.state = UBLK_S_DEV_DEAD; + ub->dev_info.ublksrv_pid = -1; + ub->ub_disk = NULL; + spin_unlock(&ub->lock); + + return disk; +} + static void ublk_stop_dev(struct ublk_device *ub) { struct gendisk *disk; @@ -1631,14 +1646,7 @@ static void ublk_stop_dev(struct ublk_device *ub) ublk_unquiesce_dev(ub); } del_gendisk(ub->ub_disk); - - /* Sync with ublk_abort_queue() by holding the lock */ - spin_lock(&ub->lock); - disk = ub->ub_disk; - ub->dev_info.state = UBLK_S_DEV_DEAD; - ub->dev_info.ublksrv_pid = -1; - ub->ub_disk = NULL; - spin_unlock(&ub->lock); + disk = ublk_detach_disk(ub); put_disk(disk); unlock: mutex_unlock(&ub->mutex); @@ -2336,7 +2344,7 @@ static int ublk_ctrl_start_dev(struct ublk_device *ub, struct io_uring_cmd *cmd) out_put_cdev: if (ret) { - ub->dev_info.state = UBLK_S_DEV_DEAD; + ublk_detach_disk(ub); ublk_put_device(ub); } if (ret) diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c index c0cdba71f436..3efe378f1386 
100644 --- a/drivers/block/virtio_blk.c +++ b/drivers/block/virtio_blk.c @@ -1586,9 +1586,12 @@ static void virtblk_remove(struct virtio_device *vdev) static int virtblk_freeze(struct virtio_device *vdev) { struct virtio_blk *vblk = vdev->priv; + struct request_queue *q = vblk->disk->queue; /* Ensure no requests in virtqueues before deleting vqs. */ - blk_mq_freeze_queue(vblk->disk->queue); + blk_mq_freeze_queue(q); + blk_mq_quiesce_queue_nowait(q); + blk_mq_unfreeze_queue(q); /* Ensure we don't receive any more interrupts */ virtio_reset_device(vdev); @@ -1612,8 +1615,8 @@ static int virtblk_restore(struct virtio_device *vdev) return ret; virtio_device_ready(vdev); + blk_mq_unquiesce_queue(vblk->disk->queue); - blk_mq_unfreeze_queue(vblk->disk->queue); return 0; } #endif diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c index 3dee026988dc..7903a4da40ac 100644 --- a/drivers/block/zram/zram_drv.c +++ b/drivers/block/zram/zram_drv.c @@ -614,6 +614,12 @@ static ssize_t backing_dev_store(struct device *dev, } nr_pages = i_size_read(inode) >> PAGE_SHIFT; + /* Refuse to use zero sized device (also prevents self reference) */ + if (!nr_pages) { + err = -EINVAL; + goto out; + } + bitmap_sz = BITS_TO_LONGS(nr_pages) * sizeof(long); bitmap = kvzalloc(bitmap_sz, GFP_KERNEL); if (!bitmap) { @@ -1438,12 +1444,16 @@ static void zram_meta_free(struct zram *zram, u64 disksize) size_t num_pages = disksize >> PAGE_SHIFT; size_t index; + if (!zram->table) + return; + /* Free all pages that are still in this zram device */ for (index = 0; index < num_pages; index++) zram_free_page(zram, index); zs_destroy_pool(zram->mem_pool); vfree(zram->table); + zram->table = NULL; } static bool zram_meta_alloc(struct zram *zram, u64 disksize) @@ -1458,6 +1468,7 @@ static bool zram_meta_alloc(struct zram *zram, u64 disksize) zram->mem_pool = zs_create_pool(zram->disk->disk_name); if (!zram->mem_pool) { vfree(zram->table); + zram->table = NULL; return false; } @@ -2320,11 
+2331,6 @@ static void zram_reset_device(struct zram *zram) zram->limit_pages = 0; - if (!init_done(zram)) { - up_write(&zram->init_lock); - return; - } - set_capacity_and_notify(zram->disk, 0); part_stat_set_all(zram->disk->part0, 0); diff --git a/drivers/bluetooth/btmtk.c b/drivers/bluetooth/btmtk.c index 8a3f7c3fcfec..224eafc27dbe 100644 --- a/drivers/bluetooth/btmtk.c +++ b/drivers/bluetooth/btmtk.c @@ -395,6 +395,7 @@ int btmtk_process_coredump(struct hci_dev *hdev, struct sk_buff *skb) { struct btmtk_data *data = hci_get_priv(hdev); int err; + bool complete = false; if (!IS_ENABLED(CONFIG_DEV_COREDUMP)) { kfree_skb(skb); @@ -416,19 +417,22 @@ int btmtk_process_coredump(struct hci_dev *hdev, struct sk_buff *skb) fallthrough; case HCI_DEVCOREDUMP_ACTIVE: default: + /* Mediatek coredump data would be more than MTK_COREDUMP_NUM */ + if (data->cd_info.cnt >= MTK_COREDUMP_NUM && + skb->len > MTK_COREDUMP_END_LEN) + if (!memcmp((char *)&skb->data[skb->len - MTK_COREDUMP_END_LEN], + MTK_COREDUMP_END, MTK_COREDUMP_END_LEN - 1)) + complete = true; + err = hci_devcd_append(hdev, skb); if (err < 0) break; data->cd_info.cnt++; - /* Mediatek coredump data would be more than MTK_COREDUMP_NUM */ - if (data->cd_info.cnt > MTK_COREDUMP_NUM && - skb->len > MTK_COREDUMP_END_LEN) - if (!memcmp((char *)&skb->data[skb->len - MTK_COREDUMP_END_LEN], - MTK_COREDUMP_END, MTK_COREDUMP_END_LEN - 1)) { - bt_dev_info(hdev, "Mediatek coredump end"); - hci_devcd_complete(hdev); - } + if (complete) { + bt_dev_info(hdev, "Mediatek coredump end"); + hci_devcd_complete(hdev); + } break; } @@ -1468,10 +1472,15 @@ EXPORT_SYMBOL_GPL(btmtk_usb_setup); int btmtk_usb_shutdown(struct hci_dev *hdev) { + struct btmtk_data *data = hci_get_priv(hdev); struct btmtk_hci_wmt_params wmt_params; u8 param = 0; int err; + err = usb_autopm_get_interface(data->intf); + if (err < 0) + return err; + /* Disable the device */ wmt_params.op = BTMTK_WMT_FUNC_CTRL; wmt_params.flag = 0; @@ -1482,9 +1491,11 @@ int 
btmtk_usb_shutdown(struct hci_dev *hdev) err = btmtk_usb_hci_wmt_sync(hdev, &wmt_params); if (err < 0) { bt_dev_err(hdev, "Failed to send wmt func ctrl (%d)", err); + usb_autopm_put_interface(data->intf); return err; } + usb_autopm_put_interface(data->intf); return 0; } EXPORT_SYMBOL_GPL(btmtk_usb_shutdown); diff --git a/drivers/bluetooth/btnxpuart.c b/drivers/bluetooth/btnxpuart.c index 569f5b7d6e46..1230045d78a5 100644 --- a/drivers/bluetooth/btnxpuart.c +++ b/drivers/bluetooth/btnxpuart.c @@ -1381,6 +1381,7 @@ static void btnxpuart_tx_work(struct work_struct *work) while ((skb = nxp_dequeue(nxpdev))) { len = serdev_device_write_buf(serdev, skb->data, skb->len); + serdev_device_wait_until_sent(serdev, 0); hdev->stat.byte_tx += len; skb_pull(skb, len); diff --git a/drivers/bluetooth/btqcomsmd.c b/drivers/bluetooth/btqcomsmd.c index 88dbb2f3fabf..c0eb71d6ffd3 100644 --- a/drivers/bluetooth/btqcomsmd.c +++ b/drivers/bluetooth/btqcomsmd.c @@ -216,7 +216,7 @@ MODULE_DEVICE_TABLE(of, btqcomsmd_of_match); static struct platform_driver btqcomsmd_driver = { .probe = btqcomsmd_probe, - .remove_new = btqcomsmd_remove, + .remove = btqcomsmd_remove, .driver = { .name = "btqcomsmd", .of_match_table = btqcomsmd_of_match, diff --git a/drivers/bluetooth/hci_bcm.c b/drivers/bluetooth/hci_bcm.c index 521b785f2908..9684eb16059b 100644 --- a/drivers/bluetooth/hci_bcm.c +++ b/drivers/bluetooth/hci_bcm.c @@ -1498,7 +1498,7 @@ static const struct dev_pm_ops bcm_pm_ops = { static struct platform_driver bcm_driver = { .probe = bcm_probe, - .remove_new = bcm_remove, + .remove = bcm_remove, .driver = { .name = "hci_bcm", .acpi_match_table = ACPI_PTR(bcm_acpi_match), diff --git a/drivers/bluetooth/hci_intel.c b/drivers/bluetooth/hci_intel.c index 999ccd5bb4f2..811f33701f84 100644 --- a/drivers/bluetooth/hci_intel.c +++ b/drivers/bluetooth/hci_intel.c @@ -1206,7 +1206,7 @@ static void intel_remove(struct platform_device *pdev) static struct platform_driver intel_driver = { .probe = 
intel_probe, - .remove_new = intel_remove, + .remove = intel_remove, .driver = { .name = "hci_intel", .acpi_match_table = ACPI_PTR(intel_acpi_match), diff --git a/drivers/bus/mhi/host/pci_generic.c b/drivers/bus/mhi/host/pci_generic.c index 07645ce2119a..56ba4192c89c 100644 --- a/drivers/bus/mhi/host/pci_generic.c +++ b/drivers/bus/mhi/host/pci_generic.c @@ -917,7 +917,7 @@ static int mhi_pci_claim(struct mhi_controller *mhi_cntrl, return err; } - mhi_cntrl->regs = pcim_iomap_region(pdev, 1 << bar_num, pci_name(pdev)); + mhi_cntrl->regs = pcim_iomap_region(pdev, bar_num, pci_name(pdev)); if (IS_ERR(mhi_cntrl->regs)) { err = PTR_ERR(mhi_cntrl->regs); dev_err(&pdev->dev, "failed to map pci region: %d\n", err); diff --git a/drivers/cdrom/cdrom.c b/drivers/cdrom/cdrom.c index 6a99a459b80b..51745ed1bbab 100644 --- a/drivers/cdrom/cdrom.c +++ b/drivers/cdrom/cdrom.c @@ -1106,7 +1106,7 @@ int open_for_data(struct cdrom_device_info *cdi) } } - cd_dbg(CD_OPEN, "all seems well, opening the devicen"); + cd_dbg(CD_OPEN, "all seems well, opening the device\n"); /* all seems well, we can open the device */ ret = cdo->open(cdi, 0); /* open for data */ diff --git a/drivers/cdrom/gdrom.c b/drivers/cdrom/gdrom.c index 71cfe7a85913..64b097e830d4 100644 --- a/drivers/cdrom/gdrom.c +++ b/drivers/cdrom/gdrom.c @@ -847,7 +847,7 @@ static void remove_gdrom(struct platform_device *devptr) static struct platform_driver gdrom_driver = { .probe = probe_gdrom, - .remove_new = remove_gdrom, + .remove = remove_gdrom, .driver = { .name = GDROM_DEV_NAME, }, diff --git a/drivers/cdx/Makefile b/drivers/cdx/Makefile index 749a3295c2bd..3ca7068a3052 100644 --- a/drivers/cdx/Makefile +++ b/drivers/cdx/Makefile @@ -5,7 +5,7 @@ # Copyright (C) 2022-2023, Advanced Micro Devices, Inc. 
# -ccflags-y += -DDEFAULT_SYMBOL_NAMESPACE=CDX_BUS +ccflags-y += -DDEFAULT_SYMBOL_NAMESPACE='"CDX_BUS"' obj-$(CONFIG_CDX_BUS) += cdx.o controller/ diff --git a/drivers/cdx/cdx.c b/drivers/cdx/cdx.c index 316bd89a95ca..76eac3653b1c 100644 --- a/drivers/cdx/cdx.c +++ b/drivers/cdx/cdx.c @@ -868,7 +868,7 @@ fail: return ret; } -EXPORT_SYMBOL_NS_GPL(cdx_device_add, CDX_BUS_CONTROLLER); +EXPORT_SYMBOL_NS_GPL(cdx_device_add, "CDX_BUS_CONTROLLER"); struct device *cdx_bus_add(struct cdx_controller *cdx, u8 bus_num) { @@ -915,7 +915,7 @@ device_add_fail: return NULL; } -EXPORT_SYMBOL_NS_GPL(cdx_bus_add, CDX_BUS_CONTROLLER); +EXPORT_SYMBOL_NS_GPL(cdx_bus_add, "CDX_BUS_CONTROLLER"); int cdx_register_controller(struct cdx_controller *cdx) { @@ -940,7 +940,7 @@ int cdx_register_controller(struct cdx_controller *cdx) return 0; } -EXPORT_SYMBOL_NS_GPL(cdx_register_controller, CDX_BUS_CONTROLLER); +EXPORT_SYMBOL_NS_GPL(cdx_register_controller, "CDX_BUS_CONTROLLER"); void cdx_unregister_controller(struct cdx_controller *cdx) { @@ -955,7 +955,7 @@ void cdx_unregister_controller(struct cdx_controller *cdx) mutex_unlock(&cdx_controller_lock); } -EXPORT_SYMBOL_NS_GPL(cdx_unregister_controller, CDX_BUS_CONTROLLER); +EXPORT_SYMBOL_NS_GPL(cdx_unregister_controller, "CDX_BUS_CONTROLLER"); static int __init cdx_bus_init(void) { diff --git a/drivers/cdx/cdx_msi.c b/drivers/cdx/cdx_msi.c index e55f1716cfcb..06d723978232 100644 --- a/drivers/cdx/cdx_msi.c +++ b/drivers/cdx/cdx_msi.c @@ -189,4 +189,4 @@ struct irq_domain *cdx_msi_domain_init(struct device *dev) return cdx_msi_domain; } -EXPORT_SYMBOL_NS_GPL(cdx_msi_domain_init, CDX_BUS_CONTROLLER); +EXPORT_SYMBOL_NS_GPL(cdx_msi_domain_init, "CDX_BUS_CONTROLLER"); diff --git a/drivers/cdx/controller/cdx_controller.c b/drivers/cdx/controller/cdx_controller.c index 201f9a6fbde7..d623f9c7517a 100644 --- a/drivers/cdx/controller/cdx_controller.c +++ b/drivers/cdx/controller/cdx_controller.c @@ -250,7 +250,7 @@ static struct platform_driver 
cdx_pdriver = { .of_match_table = cdx_match_table, }, .probe = xlnx_cdx_probe, - .remove_new = xlnx_cdx_remove, + .remove = xlnx_cdx_remove, }; static int __init cdx_controller_init(void) @@ -275,4 +275,4 @@ module_exit(cdx_controller_exit); MODULE_AUTHOR("AMD Inc."); MODULE_DESCRIPTION("CDX controller for AMD devices"); MODULE_LICENSE("GPL"); -MODULE_IMPORT_NS(CDX_BUS_CONTROLLER); +MODULE_IMPORT_NS("CDX_BUS_CONTROLLER"); diff --git a/drivers/char/ipmi/bt-bmc.c b/drivers/char/ipmi/bt-bmc.c index b8b9c07d3b5d..009e32033b17 100644 --- a/drivers/char/ipmi/bt-bmc.c +++ b/drivers/char/ipmi/bt-bmc.c @@ -481,7 +481,7 @@ static struct platform_driver bt_bmc_driver = { .of_match_table = bt_bmc_match, }, .probe = bt_bmc_probe, - .remove_new = bt_bmc_remove, + .remove = bt_bmc_remove, }; module_platform_driver(bt_bmc_driver); diff --git a/drivers/char/ipmi/ipmi_powernv.c b/drivers/char/ipmi/ipmi_powernv.c index c59a86eb58c7..4a2efafcd1f8 100644 --- a/drivers/char/ipmi/ipmi_powernv.c +++ b/drivers/char/ipmi/ipmi_powernv.c @@ -302,7 +302,7 @@ static struct platform_driver powernv_ipmi_driver = { .of_match_table = ipmi_powernv_match, }, .probe = ipmi_powernv_probe, - .remove_new = ipmi_powernv_remove, + .remove = ipmi_powernv_remove, }; diff --git a/drivers/char/ipmi/ipmi_si_platform.c b/drivers/char/ipmi/ipmi_si_platform.c index 96ba85648120..550cabd43ae6 100644 --- a/drivers/char/ipmi/ipmi_si_platform.c +++ b/drivers/char/ipmi/ipmi_si_platform.c @@ -445,7 +445,7 @@ struct platform_driver ipmi_platform_driver = { .acpi_match_table = ACPI_PTR(acpi_ipmi_match), }, .probe = ipmi_probe, - .remove_new = ipmi_remove, + .remove = ipmi_remove, .id_table = si_plat_ids }; diff --git a/drivers/char/ipmi/ipmi_ssif.c b/drivers/char/ipmi/ipmi_ssif.c index d04b391048fb..506d9988721e 100644 --- a/drivers/char/ipmi/ipmi_ssif.c +++ b/drivers/char/ipmi/ipmi_ssif.c @@ -2114,7 +2114,7 @@ static struct platform_driver ipmi_driver = { .name = DEVICE_NAME, }, .probe = ssif_platform_probe, - 
.remove_new = ssif_platform_remove, + .remove = ssif_platform_remove, .id_table = ssif_plat_ids }; diff --git a/drivers/char/ipmi/kcs_bmc_aspeed.c b/drivers/char/ipmi/kcs_bmc_aspeed.c index 227bf06c7ca4..c03bc1ec593a 100644 --- a/drivers/char/ipmi/kcs_bmc_aspeed.c +++ b/drivers/char/ipmi/kcs_bmc_aspeed.c @@ -672,7 +672,7 @@ static struct platform_driver ast_kcs_bmc_driver = { .of_match_table = ast_kcs_bmc_match, }, .probe = aspeed_kcs_probe, - .remove_new = aspeed_kcs_remove, + .remove = aspeed_kcs_remove, }; module_platform_driver(ast_kcs_bmc_driver); diff --git a/drivers/char/ipmi/kcs_bmc_npcm7xx.c b/drivers/char/ipmi/kcs_bmc_npcm7xx.c index 07710198233a..4808a61bf273 100644 --- a/drivers/char/ipmi/kcs_bmc_npcm7xx.c +++ b/drivers/char/ipmi/kcs_bmc_npcm7xx.c @@ -241,7 +241,7 @@ static struct platform_driver npcm_kcs_bmc_driver = { .of_match_table = npcm_kcs_bmc_match, }, .probe = npcm7xx_kcs_probe, - .remove_new = npcm7xx_kcs_remove, + .remove = npcm7xx_kcs_remove, }; module_platform_driver(npcm_kcs_bmc_driver); diff --git a/drivers/char/tpm/tpm_ftpm_tee.c b/drivers/char/tpm/tpm_ftpm_tee.c index 2ea4882251cf..139556b21cc6 100644 --- a/drivers/char/tpm/tpm_ftpm_tee.c +++ b/drivers/char/tpm/tpm_ftpm_tee.c @@ -366,7 +366,7 @@ static struct platform_driver ftpm_tee_plat_driver = { }, .shutdown = ftpm_plat_tee_shutdown, .probe = ftpm_plat_tee_probe, - .remove_new = ftpm_plat_tee_remove, + .remove = ftpm_plat_tee_remove, }; /* UUID of the fTPM TA */ diff --git a/drivers/char/tpm/tpm_tis.c b/drivers/char/tpm/tpm_tis.c index 2f7326d297ad..9aa230a63616 100644 --- a/drivers/char/tpm/tpm_tis.c +++ b/drivers/char/tpm/tpm_tis.c @@ -356,7 +356,7 @@ MODULE_DEVICE_TABLE(of, tis_of_platform_match); static struct platform_driver tis_drv = { .probe = tpm_tis_plat_probe, - .remove_new = tpm_tis_plat_remove, + .remove = tpm_tis_plat_remove, .driver = { .name = "tpm_tis", .pm = &tpm_tis_pm, diff --git a/drivers/char/tpm/tpm_tis_synquacer.c b/drivers/char/tpm/tpm_tis_synquacer.c index 
0621ebec530b..4927714d277a 100644 --- a/drivers/char/tpm/tpm_tis_synquacer.c +++ b/drivers/char/tpm/tpm_tis_synquacer.c @@ -152,7 +152,7 @@ MODULE_DEVICE_TABLE(acpi, tpm_synquacer_acpi_tbl); static struct platform_driver tis_synquacer_drv = { .probe = tpm_tis_synquacer_probe, - .remove_new = tpm_tis_synquacer_remove, + .remove = tpm_tis_synquacer_remove, .driver = { .name = "tpm_tis_synquacer", .pm = &tpm_tis_synquacer_pm, diff --git a/drivers/clk/clk-en7523.c b/drivers/clk/clk-en7523.c index e52c5460e927..495c0d607c7d 100644 --- a/drivers/clk/clk-en7523.c +++ b/drivers/clk/clk-en7523.c @@ -87,6 +87,7 @@ static const u32 slic_base[] = { 100000000, 3125000 }; static const u32 npu_base[] = { 333000000, 400000000, 500000000 }; /* EN7581 */ static const u32 emi7581_base[] = { 540000000, 480000000, 400000000, 300000000 }; +static const u32 bus7581_base[] = { 600000000, 540000000 }; static const u32 npu7581_base[] = { 800000000, 750000000, 720000000, 600000000 }; static const u32 crypto_base[] = { 540000000, 480000000 }; @@ -222,8 +223,8 @@ static const struct en_clk_desc en7581_base_clks[] = { .base_reg = REG_BUS_CLK_DIV_SEL, .base_bits = 1, .base_shift = 8, - .base_values = bus_base, - .n_base_values = ARRAY_SIZE(bus_base), + .base_values = bus7581_base, + .n_base_values = ARRAY_SIZE(bus7581_base), .div_bits = 3, .div_shift = 0, @@ -503,6 +504,8 @@ static void en7523_register_clocks(struct device *dev, struct clk_hw_onecell_dat u32 rate; int i; + clk_data->num = EN7523_NUM_CLOCKS; + for (i = 0; i < ARRAY_SIZE(en7523_base_clks); i++) { const struct en_clk_desc *desc = &en7523_base_clks[i]; u32 reg = desc->div_reg ? 
desc->div_reg : desc->base_reg; @@ -524,8 +527,6 @@ static void en7523_register_clocks(struct device *dev, struct clk_hw_onecell_dat hw = en7523_register_pcie_clk(dev, np_base); clk_data->hws[EN7523_CLK_PCIE] = hw; - - clk_data->num = EN7523_NUM_CLOCKS; } static int en7523_clk_hw_init(struct platform_device *pdev, diff --git a/drivers/clk/clk.c b/drivers/clk/clk.c index bdc6e5b90da5..9b45fa005030 100644 --- a/drivers/clk/clk.c +++ b/drivers/clk/clk.c @@ -2530,7 +2530,7 @@ static int clk_core_set_rate_nolock(struct clk_core *core, rate = clk_core_req_round_rate_nolock(core, req_rate); /* bail early if nothing to do */ - if (rate == clk_core_get_rate_recalc(core)) + if (rate == clk_core_get_rate_nolock(core)) return 0; /* fail on a direct rate set of a protected provider */ diff --git a/drivers/clk/imx/clk-imx8mp-audiomix.c b/drivers/clk/imx/clk-imx8mp-audiomix.c index b2cb157703c5..c409fc7e0618 100644 --- a/drivers/clk/imx/clk-imx8mp-audiomix.c +++ b/drivers/clk/imx/clk-imx8mp-audiomix.c @@ -278,7 +278,8 @@ static int clk_imx8mp_audiomix_reset_controller_register(struct device *dev, #else /* !CONFIG_RESET_CONTROLLER */ -static int clk_imx8mp_audiomix_reset_controller_register(struct clk_imx8mp_audiomix_priv *priv) +static int clk_imx8mp_audiomix_reset_controller_register(struct device *dev, + struct clk_imx8mp_audiomix_priv *priv) { return 0; } diff --git a/drivers/clk/meson/Kconfig b/drivers/clk/meson/Kconfig index febb5d7348ff..be2e3a5f8336 100644 --- a/drivers/clk/meson/Kconfig +++ b/drivers/clk/meson/Kconfig @@ -106,7 +106,7 @@ config COMMON_CLK_AXG_AUDIO select COMMON_CLK_MESON_SCLK_DIV select COMMON_CLK_MESON_CLKC_UTILS select REGMAP_MMIO - depends on RESET_MESON_AUX + select RESET_CONTROLLER help Support for the audio clock controller on AmLogic A113D devices, aka axg, Say Y if you want audio subsystem to work. 
diff --git a/drivers/clk/meson/a1-peripherals.c b/drivers/clk/meson/a1-peripherals.c index 7aa6abb2eb1f..36489e0f948a 100644 --- a/drivers/clk/meson/a1-peripherals.c +++ b/drivers/clk/meson/a1-peripherals.c @@ -2246,4 +2246,4 @@ MODULE_DESCRIPTION("Amlogic A1 Peripherals Clock Controller driver"); MODULE_AUTHOR("Jian Hu "); MODULE_AUTHOR("Dmitry Rokosov "); MODULE_LICENSE("GPL"); -MODULE_IMPORT_NS(CLK_MESON); +MODULE_IMPORT_NS("CLK_MESON"); diff --git a/drivers/clk/meson/a1-pll.c b/drivers/clk/meson/a1-pll.c index 8e5a42d1afbb..8d7c7b4493c4 100644 --- a/drivers/clk/meson/a1-pll.c +++ b/drivers/clk/meson/a1-pll.c @@ -360,4 +360,4 @@ MODULE_DESCRIPTION("Amlogic S4 PLL Clock Controller driver"); MODULE_AUTHOR("Jian Hu "); MODULE_AUTHOR("Dmitry Rokosov "); MODULE_LICENSE("GPL"); -MODULE_IMPORT_NS(CLK_MESON); +MODULE_IMPORT_NS("CLK_MESON"); diff --git a/drivers/clk/meson/axg-aoclk.c b/drivers/clk/meson/axg-aoclk.c index 1dabc81535a6..f44091ffb57d 100644 --- a/drivers/clk/meson/axg-aoclk.c +++ b/drivers/clk/meson/axg-aoclk.c @@ -342,4 +342,4 @@ module_platform_driver(axg_aoclkc_driver); MODULE_DESCRIPTION("Amlogic AXG Always-ON Clock Controller driver"); MODULE_LICENSE("GPL"); -MODULE_IMPORT_NS(CLK_MESON); +MODULE_IMPORT_NS("CLK_MESON"); diff --git a/drivers/clk/meson/axg-audio.c b/drivers/clk/meson/axg-audio.c index 7714bde5ffc0..9df627b142f8 100644 --- a/drivers/clk/meson/axg-audio.c +++ b/drivers/clk/meson/axg-audio.c @@ -15,8 +15,6 @@ #include #include -#include - #include "meson-clkc-utils.h" #include "axg-audio.h" #include "clk-regmap.h" @@ -1680,6 +1678,84 @@ static struct clk_regmap *const sm1_clk_regmaps[] = { &sm1_earcrx_dmac_clk, }; +struct axg_audio_reset_data { + struct reset_controller_dev rstc; + struct regmap *map; + unsigned int offset; +}; + +static void axg_audio_reset_reg_and_bit(struct axg_audio_reset_data *rst, + unsigned long id, + unsigned int *reg, + unsigned int *bit) +{ + unsigned int stride = regmap_get_reg_stride(rst->map); + + *reg = (id / 
(stride * BITS_PER_BYTE)) * stride; + *reg += rst->offset; + *bit = id % (stride * BITS_PER_BYTE); +} + +static int axg_audio_reset_update(struct reset_controller_dev *rcdev, + unsigned long id, bool assert) +{ + struct axg_audio_reset_data *rst = + container_of(rcdev, struct axg_audio_reset_data, rstc); + unsigned int offset, bit; + + axg_audio_reset_reg_and_bit(rst, id, &offset, &bit); + + regmap_update_bits(rst->map, offset, BIT(bit), + assert ? BIT(bit) : 0); + + return 0; +} + +static int axg_audio_reset_status(struct reset_controller_dev *rcdev, + unsigned long id) +{ + struct axg_audio_reset_data *rst = + container_of(rcdev, struct axg_audio_reset_data, rstc); + unsigned int val, offset, bit; + + axg_audio_reset_reg_and_bit(rst, id, &offset, &bit); + + regmap_read(rst->map, offset, &val); + + return !!(val & BIT(bit)); +} + +static int axg_audio_reset_assert(struct reset_controller_dev *rcdev, + unsigned long id) +{ + return axg_audio_reset_update(rcdev, id, true); +} + +static int axg_audio_reset_deassert(struct reset_controller_dev *rcdev, + unsigned long id) +{ + return axg_audio_reset_update(rcdev, id, false); +} + +static int axg_audio_reset_toggle(struct reset_controller_dev *rcdev, + unsigned long id) +{ + int ret; + + ret = axg_audio_reset_assert(rcdev, id); + if (ret) + return ret; + + return axg_audio_reset_deassert(rcdev, id); +} + +static const struct reset_control_ops axg_audio_rstc_ops = { + .assert = axg_audio_reset_assert, + .deassert = axg_audio_reset_deassert, + .reset = axg_audio_reset_toggle, + .status = axg_audio_reset_status, +}; + static struct regmap_config axg_audio_regmap_cfg = { .reg_bits = 32, .val_bits = 32, @@ -1690,14 +1766,16 @@ struct audioclk_data { struct clk_regmap *const *regmap_clks; unsigned int regmap_clk_num; struct meson_clk_hw_data hw_clks; + unsigned int reset_offset; + unsigned int reset_num; unsigned int max_register; - const char *rst_drvname; }; static int axg_audio_clkc_probe(struct platform_device *pdev) { 
struct device *dev = &pdev->dev; const struct audioclk_data *data; + struct axg_audio_reset_data *rst; struct regmap *map; void __iomem *regs; struct clk_hw *hw; @@ -1756,11 +1834,22 @@ static int axg_audio_clkc_probe(struct platform_device *pdev) if (ret) return ret; - /* Register auxiliary reset driver when applicable */ - if (data->rst_drvname) - ret = devm_meson_rst_aux_register(dev, map, data->rst_drvname); + /* Stop here if there is no reset */ + if (!data->reset_num) + return 0; - return ret; + rst = devm_kzalloc(dev, sizeof(*rst), GFP_KERNEL); + if (!rst) + return -ENOMEM; + + rst->map = map; + rst->offset = data->reset_offset; + rst->rstc.nr_resets = data->reset_num; + rst->rstc.ops = &axg_audio_rstc_ops; + rst->rstc.of_node = dev->of_node; + rst->rstc.owner = THIS_MODULE; + + return devm_reset_controller_register(dev, &rst->rstc); } static const struct audioclk_data axg_audioclk_data = { @@ -1780,8 +1869,9 @@ static const struct audioclk_data g12a_audioclk_data = { .hws = g12a_audio_hw_clks, .num = ARRAY_SIZE(g12a_audio_hw_clks), }, + .reset_offset = AUDIO_SW_RESET, + .reset_num = 26, .max_register = AUDIO_CLK_SPDIFOUT_B_CTRL, - .rst_drvname = "rst-g12a", }; static const struct audioclk_data sm1_audioclk_data = { @@ -1791,8 +1881,9 @@ static const struct audioclk_data sm1_audioclk_data = { .hws = sm1_audio_hw_clks, .num = ARRAY_SIZE(sm1_audio_hw_clks), }, + .reset_offset = AUDIO_SM1_SW_RESET0, + .reset_num = 39, .max_register = AUDIO_EARCRX_DMAC_CLK_CTRL, - .rst_drvname = "rst-sm1", }; static const struct of_device_id clkc_match_table[] = { @@ -1821,4 +1912,4 @@ module_platform_driver(axg_audio_driver); MODULE_DESCRIPTION("Amlogic AXG/G12A/SM1 Audio Clock driver"); MODULE_AUTHOR("Jerome Brunet "); MODULE_LICENSE("GPL"); -MODULE_IMPORT_NS(CLK_MESON); +MODULE_IMPORT_NS("CLK_MESON"); diff --git a/drivers/clk/meson/axg.c b/drivers/clk/meson/axg.c index 1b08daf579b2..448eece246ca 100644 --- a/drivers/clk/meson/axg.c +++ b/drivers/clk/meson/axg.c @@ -2181,4 
+2181,4 @@ module_platform_driver(axg_driver); MODULE_DESCRIPTION("Amlogic AXG Main Clock Controller driver"); MODULE_LICENSE("GPL"); -MODULE_IMPORT_NS(CLK_MESON); +MODULE_IMPORT_NS("CLK_MESON"); diff --git a/drivers/clk/meson/c3-peripherals.c b/drivers/clk/meson/c3-peripherals.c index 7dcbf4ebee07..2075668ed306 100644 --- a/drivers/clk/meson/c3-peripherals.c +++ b/drivers/clk/meson/c3-peripherals.c @@ -2364,4 +2364,4 @@ module_platform_driver(c3_peripherals_driver); MODULE_DESCRIPTION("Amlogic C3 Peripherals Clock Controller driver"); MODULE_AUTHOR("Chuan Liu "); MODULE_LICENSE("GPL"); -MODULE_IMPORT_NS(CLK_MESON); +MODULE_IMPORT_NS("CLK_MESON"); diff --git a/drivers/clk/meson/c3-pll.c b/drivers/clk/meson/c3-pll.c index 35fda31a19e2..ed4bc495862e 100644 --- a/drivers/clk/meson/c3-pll.c +++ b/drivers/clk/meson/c3-pll.c @@ -746,4 +746,4 @@ module_platform_driver(c3_pll_driver); MODULE_DESCRIPTION("Amlogic C3 PLL Clock Controller driver"); MODULE_AUTHOR("Chuan Liu "); MODULE_LICENSE("GPL"); -MODULE_IMPORT_NS(CLK_MESON); +MODULE_IMPORT_NS("CLK_MESON"); diff --git a/drivers/clk/meson/clk-cpu-dyndiv.c b/drivers/clk/meson/clk-cpu-dyndiv.c index 6c1f58826e24..cb043b52b65d 100644 --- a/drivers/clk/meson/clk-cpu-dyndiv.c +++ b/drivers/clk/meson/clk-cpu-dyndiv.c @@ -65,9 +65,9 @@ const struct clk_ops meson_clk_cpu_dyndiv_ops = { .determine_rate = meson_clk_cpu_dyndiv_determine_rate, .set_rate = meson_clk_cpu_dyndiv_set_rate, }; -EXPORT_SYMBOL_NS_GPL(meson_clk_cpu_dyndiv_ops, CLK_MESON); +EXPORT_SYMBOL_NS_GPL(meson_clk_cpu_dyndiv_ops, "CLK_MESON"); MODULE_DESCRIPTION("Amlogic CPU Dynamic Clock divider"); MODULE_AUTHOR("Neil Armstrong "); MODULE_LICENSE("GPL"); -MODULE_IMPORT_NS(CLK_MESON); +MODULE_IMPORT_NS("CLK_MESON"); diff --git a/drivers/clk/meson/clk-dualdiv.c b/drivers/clk/meson/clk-dualdiv.c index 913bf25d3771..c896cf29b318 100644 --- a/drivers/clk/meson/clk-dualdiv.c +++ b/drivers/clk/meson/clk-dualdiv.c @@ -130,15 +130,15 @@ const struct clk_ops meson_clk_dualdiv_ops 
= { .determine_rate = meson_clk_dualdiv_determine_rate, .set_rate = meson_clk_dualdiv_set_rate, }; -EXPORT_SYMBOL_NS_GPL(meson_clk_dualdiv_ops, CLK_MESON); +EXPORT_SYMBOL_NS_GPL(meson_clk_dualdiv_ops, "CLK_MESON"); const struct clk_ops meson_clk_dualdiv_ro_ops = { .recalc_rate = meson_clk_dualdiv_recalc_rate, }; -EXPORT_SYMBOL_NS_GPL(meson_clk_dualdiv_ro_ops, CLK_MESON); +EXPORT_SYMBOL_NS_GPL(meson_clk_dualdiv_ro_ops, "CLK_MESON"); MODULE_DESCRIPTION("Amlogic dual divider driver"); MODULE_AUTHOR("Neil Armstrong "); MODULE_AUTHOR("Jerome Brunet "); MODULE_LICENSE("GPL"); -MODULE_IMPORT_NS(CLK_MESON); +MODULE_IMPORT_NS("CLK_MESON"); diff --git a/drivers/clk/meson/clk-mpll.c b/drivers/clk/meson/clk-mpll.c index aa9abd06ae65..ee91e32b4050 100644 --- a/drivers/clk/meson/clk-mpll.c +++ b/drivers/clk/meson/clk-mpll.c @@ -154,7 +154,7 @@ const struct clk_ops meson_clk_mpll_ro_ops = { .recalc_rate = mpll_recalc_rate, .determine_rate = mpll_determine_rate, }; -EXPORT_SYMBOL_NS_GPL(meson_clk_mpll_ro_ops, CLK_MESON); +EXPORT_SYMBOL_NS_GPL(meson_clk_mpll_ro_ops, "CLK_MESON"); const struct clk_ops meson_clk_mpll_ops = { .recalc_rate = mpll_recalc_rate, @@ -162,9 +162,9 @@ const struct clk_ops meson_clk_mpll_ops = { .set_rate = mpll_set_rate, .init = mpll_init, }; -EXPORT_SYMBOL_NS_GPL(meson_clk_mpll_ops, CLK_MESON); +EXPORT_SYMBOL_NS_GPL(meson_clk_mpll_ops, "CLK_MESON"); MODULE_DESCRIPTION("Amlogic MPLL driver"); MODULE_AUTHOR("Michael Turquette "); MODULE_LICENSE("GPL"); -MODULE_IMPORT_NS(CLK_MESON); +MODULE_IMPORT_NS("CLK_MESON"); diff --git a/drivers/clk/meson/clk-phase.c b/drivers/clk/meson/clk-phase.c index c1526fbfb6c4..701211120610 100644 --- a/drivers/clk/meson/clk-phase.c +++ b/drivers/clk/meson/clk-phase.c @@ -61,7 +61,7 @@ const struct clk_ops meson_clk_phase_ops = { .get_phase = meson_clk_phase_get_phase, .set_phase = meson_clk_phase_set_phase, }; -EXPORT_SYMBOL_NS_GPL(meson_clk_phase_ops, CLK_MESON); +EXPORT_SYMBOL_NS_GPL(meson_clk_phase_ops, "CLK_MESON"); /* * This 
is a special clock for the audio controller. @@ -123,7 +123,7 @@ const struct clk_ops meson_clk_triphase_ops = { .get_phase = meson_clk_triphase_get_phase, .set_phase = meson_clk_triphase_set_phase, }; -EXPORT_SYMBOL_NS_GPL(meson_clk_triphase_ops, CLK_MESON); +EXPORT_SYMBOL_NS_GPL(meson_clk_triphase_ops, "CLK_MESON"); /* * This is a special clock for the audio controller. @@ -178,9 +178,9 @@ const struct clk_ops meson_sclk_ws_inv_ops = { .get_phase = meson_sclk_ws_inv_get_phase, .set_phase = meson_sclk_ws_inv_set_phase, }; -EXPORT_SYMBOL_NS_GPL(meson_sclk_ws_inv_ops, CLK_MESON); +EXPORT_SYMBOL_NS_GPL(meson_sclk_ws_inv_ops, "CLK_MESON"); MODULE_DESCRIPTION("Amlogic phase driver"); MODULE_AUTHOR("Jerome Brunet "); MODULE_LICENSE("GPL"); -MODULE_IMPORT_NS(CLK_MESON); +MODULE_IMPORT_NS("CLK_MESON"); diff --git a/drivers/clk/meson/clk-pll.c b/drivers/clk/meson/clk-pll.c index 89f0f04a16ab..e8e53855b00a 100644 --- a/drivers/clk/meson/clk-pll.c +++ b/drivers/clk/meson/clk-pll.c @@ -474,7 +474,7 @@ const struct clk_ops meson_clk_pcie_pll_ops = { .enable = meson_clk_pcie_pll_enable, .disable = meson_clk_pll_disable }; -EXPORT_SYMBOL_NS_GPL(meson_clk_pcie_pll_ops, CLK_MESON); +EXPORT_SYMBOL_NS_GPL(meson_clk_pcie_pll_ops, "CLK_MESON"); const struct clk_ops meson_clk_pll_ops = { .init = meson_clk_pll_init, @@ -485,16 +485,16 @@ const struct clk_ops meson_clk_pll_ops = { .enable = meson_clk_pll_enable, .disable = meson_clk_pll_disable }; -EXPORT_SYMBOL_NS_GPL(meson_clk_pll_ops, CLK_MESON); +EXPORT_SYMBOL_NS_GPL(meson_clk_pll_ops, "CLK_MESON"); const struct clk_ops meson_clk_pll_ro_ops = { .recalc_rate = meson_clk_pll_recalc_rate, .is_enabled = meson_clk_pll_is_enabled, }; -EXPORT_SYMBOL_NS_GPL(meson_clk_pll_ro_ops, CLK_MESON); +EXPORT_SYMBOL_NS_GPL(meson_clk_pll_ro_ops, "CLK_MESON"); MODULE_DESCRIPTION("Amlogic PLL driver"); MODULE_AUTHOR("Carlo Caione "); MODULE_AUTHOR("Jerome Brunet "); MODULE_LICENSE("GPL"); -MODULE_IMPORT_NS(CLK_MESON); +MODULE_IMPORT_NS("CLK_MESON"); diff 
--git a/drivers/clk/meson/clk-regmap.c b/drivers/clk/meson/clk-regmap.c index 07f7e441b916..f3e504f67571 100644 --- a/drivers/clk/meson/clk-regmap.c +++ b/drivers/clk/meson/clk-regmap.c @@ -49,12 +49,12 @@ const struct clk_ops clk_regmap_gate_ops = { .disable = clk_regmap_gate_disable, .is_enabled = clk_regmap_gate_is_enabled, }; -EXPORT_SYMBOL_NS_GPL(clk_regmap_gate_ops, CLK_MESON); +EXPORT_SYMBOL_NS_GPL(clk_regmap_gate_ops, "CLK_MESON"); const struct clk_ops clk_regmap_gate_ro_ops = { .is_enabled = clk_regmap_gate_is_enabled, }; -EXPORT_SYMBOL_NS_GPL(clk_regmap_gate_ro_ops, CLK_MESON); +EXPORT_SYMBOL_NS_GPL(clk_regmap_gate_ro_ops, "CLK_MESON"); static unsigned long clk_regmap_div_recalc_rate(struct clk_hw *hw, unsigned long prate) @@ -125,13 +125,13 @@ const struct clk_ops clk_regmap_divider_ops = { .determine_rate = clk_regmap_div_determine_rate, .set_rate = clk_regmap_div_set_rate, }; -EXPORT_SYMBOL_NS_GPL(clk_regmap_divider_ops, CLK_MESON); +EXPORT_SYMBOL_NS_GPL(clk_regmap_divider_ops, "CLK_MESON"); const struct clk_ops clk_regmap_divider_ro_ops = { .recalc_rate = clk_regmap_div_recalc_rate, .determine_rate = clk_regmap_div_determine_rate, }; -EXPORT_SYMBOL_NS_GPL(clk_regmap_divider_ro_ops, CLK_MESON); +EXPORT_SYMBOL_NS_GPL(clk_regmap_divider_ro_ops, "CLK_MESON"); static u8 clk_regmap_mux_get_parent(struct clk_hw *hw) { @@ -174,14 +174,14 @@ const struct clk_ops clk_regmap_mux_ops = { .set_parent = clk_regmap_mux_set_parent, .determine_rate = clk_regmap_mux_determine_rate, }; -EXPORT_SYMBOL_NS_GPL(clk_regmap_mux_ops, CLK_MESON); +EXPORT_SYMBOL_NS_GPL(clk_regmap_mux_ops, "CLK_MESON"); const struct clk_ops clk_regmap_mux_ro_ops = { .get_parent = clk_regmap_mux_get_parent, }; -EXPORT_SYMBOL_NS_GPL(clk_regmap_mux_ro_ops, CLK_MESON); +EXPORT_SYMBOL_NS_GPL(clk_regmap_mux_ro_ops, "CLK_MESON"); MODULE_DESCRIPTION("Amlogic regmap backed clock driver"); MODULE_AUTHOR("Jerome Brunet "); MODULE_LICENSE("GPL"); -MODULE_IMPORT_NS(CLK_MESON); +MODULE_IMPORT_NS("CLK_MESON"); 
diff --git a/drivers/clk/meson/g12a-aoclk.c b/drivers/clk/meson/g12a-aoclk.c index f0a18d8c9fc2..71c758ffa493 100644 --- a/drivers/clk/meson/g12a-aoclk.c +++ b/drivers/clk/meson/g12a-aoclk.c @@ -477,4 +477,4 @@ module_platform_driver(g12a_aoclkc_driver); MODULE_DESCRIPTION("Amlogic G12A Always-ON Clock Controller driver"); MODULE_LICENSE("GPL"); -MODULE_IMPORT_NS(CLK_MESON); +MODULE_IMPORT_NS("CLK_MESON"); diff --git a/drivers/clk/meson/g12a.c b/drivers/clk/meson/g12a.c index d3539fe9f7af..cfffd434e998 100644 --- a/drivers/clk/meson/g12a.c +++ b/drivers/clk/meson/g12a.c @@ -5610,4 +5610,4 @@ module_platform_driver(g12a_driver); MODULE_DESCRIPTION("Amlogic G12/SM1 Main Clock Controller driver"); MODULE_LICENSE("GPL"); -MODULE_IMPORT_NS(CLK_MESON); +MODULE_IMPORT_NS("CLK_MESON"); diff --git a/drivers/clk/meson/gxbb-aoclk.c b/drivers/clk/meson/gxbb-aoclk.c index 83b034157b35..43940232f718 100644 --- a/drivers/clk/meson/gxbb-aoclk.c +++ b/drivers/clk/meson/gxbb-aoclk.c @@ -303,4 +303,4 @@ module_platform_driver(gxbb_aoclkc_driver); MODULE_DESCRIPTION("Amlogic GXBB Always-ON Clock Controller driver"); MODULE_LICENSE("GPL"); -MODULE_IMPORT_NS(CLK_MESON); +MODULE_IMPORT_NS("CLK_MESON"); diff --git a/drivers/clk/meson/gxbb.c b/drivers/clk/meson/gxbb.c index 262c318edbd5..8575b8485385 100644 --- a/drivers/clk/meson/gxbb.c +++ b/drivers/clk/meson/gxbb.c @@ -3565,4 +3565,4 @@ module_platform_driver(gxbb_driver); MODULE_DESCRIPTION("Amlogic GXBB Main Clock Controller driver"); MODULE_LICENSE("GPL"); -MODULE_IMPORT_NS(CLK_MESON); +MODULE_IMPORT_NS("CLK_MESON"); diff --git a/drivers/clk/meson/meson-aoclk.c b/drivers/clk/meson/meson-aoclk.c index 053940ee8940..995be51987f4 100644 --- a/drivers/clk/meson/meson-aoclk.c +++ b/drivers/clk/meson/meson-aoclk.c @@ -88,8 +88,8 @@ int meson_aoclkc_probe(struct platform_device *pdev) return devm_of_clk_add_hw_provider(dev, meson_clk_hw_get, (void *)&data->hw_clks); } -EXPORT_SYMBOL_NS_GPL(meson_aoclkc_probe, CLK_MESON); 
+EXPORT_SYMBOL_NS_GPL(meson_aoclkc_probe, "CLK_MESON"); MODULE_DESCRIPTION("Amlogic Always-ON Clock Controller helpers"); MODULE_LICENSE("GPL"); -MODULE_IMPORT_NS(CLK_MESON); +MODULE_IMPORT_NS("CLK_MESON"); diff --git a/drivers/clk/meson/meson-clkc-utils.c b/drivers/clk/meson/meson-clkc-utils.c index a8cd2c21fab7..6937d1482719 100644 --- a/drivers/clk/meson/meson-clkc-utils.c +++ b/drivers/clk/meson/meson-clkc-utils.c @@ -20,8 +20,8 @@ struct clk_hw *meson_clk_hw_get(struct of_phandle_args *clkspec, void *clk_hw_da return data->hws[idx]; } -EXPORT_SYMBOL_NS_GPL(meson_clk_hw_get, CLK_MESON); +EXPORT_SYMBOL_NS_GPL(meson_clk_hw_get, "CLK_MESON"); MODULE_DESCRIPTION("Amlogic Clock Controller Utilities"); MODULE_LICENSE("GPL"); -MODULE_IMPORT_NS(CLK_MESON); +MODULE_IMPORT_NS("CLK_MESON"); diff --git a/drivers/clk/meson/meson-eeclk.c b/drivers/clk/meson/meson-eeclk.c index 66f79e384fe5..3053ee7425eb 100644 --- a/drivers/clk/meson/meson-eeclk.c +++ b/drivers/clk/meson/meson-eeclk.c @@ -57,8 +57,8 @@ int meson_eeclkc_probe(struct platform_device *pdev) return devm_of_clk_add_hw_provider(dev, meson_clk_hw_get, (void *)&data->hw_clks); } -EXPORT_SYMBOL_NS_GPL(meson_eeclkc_probe, CLK_MESON); +EXPORT_SYMBOL_NS_GPL(meson_eeclkc_probe, "CLK_MESON"); MODULE_DESCRIPTION("Amlogic Main Clock Controller Helpers"); MODULE_LICENSE("GPL"); -MODULE_IMPORT_NS(CLK_MESON); +MODULE_IMPORT_NS("CLK_MESON"); diff --git a/drivers/clk/meson/s4-peripherals.c b/drivers/clk/meson/s4-peripherals.c index c930cf0614a0..8a4037377787 100644 --- a/drivers/clk/meson/s4-peripherals.c +++ b/drivers/clk/meson/s4-peripherals.c @@ -3814,4 +3814,4 @@ module_platform_driver(s4_driver); MODULE_DESCRIPTION("Amlogic S4 Peripherals Clock Controller driver"); MODULE_AUTHOR("Yu Tu "); MODULE_LICENSE("GPL"); -MODULE_IMPORT_NS(CLK_MESON); +MODULE_IMPORT_NS("CLK_MESON"); diff --git a/drivers/clk/meson/s4-pll.c b/drivers/clk/meson/s4-pll.c index d8e621e79428..f9cc05a506e3 100644 --- a/drivers/clk/meson/s4-pll.c +++ 
b/drivers/clk/meson/s4-pll.c @@ -872,4 +872,4 @@ module_platform_driver(s4_driver); MODULE_DESCRIPTION("Amlogic S4 PLL Clock Controller driver"); MODULE_AUTHOR("Yu Tu "); MODULE_LICENSE("GPL"); -MODULE_IMPORT_NS(CLK_MESON); +MODULE_IMPORT_NS("CLK_MESON"); diff --git a/drivers/clk/meson/sclk-div.c b/drivers/clk/meson/sclk-div.c index ae03b048182f..9c4945234f26 100644 --- a/drivers/clk/meson/sclk-div.c +++ b/drivers/clk/meson/sclk-div.c @@ -247,9 +247,9 @@ const struct clk_ops meson_sclk_div_ops = { .set_duty_cycle = sclk_div_set_duty_cycle, .init = sclk_div_init, }; -EXPORT_SYMBOL_NS_GPL(meson_sclk_div_ops, CLK_MESON); +EXPORT_SYMBOL_NS_GPL(meson_sclk_div_ops, "CLK_MESON"); MODULE_DESCRIPTION("Amlogic Sample divider driver"); MODULE_AUTHOR("Jerome Brunet "); MODULE_LICENSE("GPL"); -MODULE_IMPORT_NS(CLK_MESON); +MODULE_IMPORT_NS("CLK_MESON"); diff --git a/drivers/clk/meson/vclk.c b/drivers/clk/meson/vclk.c index 36f637d2d01b..6a167ebdc8d7 100644 --- a/drivers/clk/meson/vclk.c +++ b/drivers/clk/meson/vclk.c @@ -49,7 +49,7 @@ const struct clk_ops meson_vclk_gate_ops = { .disable = meson_vclk_gate_disable, .is_enabled = meson_vclk_gate_is_enabled, }; -EXPORT_SYMBOL_NS_GPL(meson_vclk_gate_ops, CLK_MESON); +EXPORT_SYMBOL_NS_GPL(meson_vclk_gate_ops, "CLK_MESON"); /* The VCLK Divider has supplementary reset & enable bits */ @@ -134,9 +134,9 @@ const struct clk_ops meson_vclk_div_ops = { .disable = meson_vclk_div_disable, .is_enabled = meson_vclk_div_is_enabled, }; -EXPORT_SYMBOL_NS_GPL(meson_vclk_div_ops, CLK_MESON); +EXPORT_SYMBOL_NS_GPL(meson_vclk_div_ops, "CLK_MESON"); MODULE_DESCRIPTION("Amlogic vclk clock driver"); MODULE_AUTHOR("Neil Armstrong "); MODULE_LICENSE("GPL"); -MODULE_IMPORT_NS(CLK_MESON); +MODULE_IMPORT_NS("CLK_MESON"); diff --git a/drivers/clk/meson/vid-pll-div.c b/drivers/clk/meson/vid-pll-div.c index 486cf68fc97a..965ed7281f57 100644 --- a/drivers/clk/meson/vid-pll-div.c +++ b/drivers/clk/meson/vid-pll-div.c @@ -92,9 +92,9 @@ static unsigned long 
meson_vid_pll_div_recalc_rate(struct clk_hw *hw, const struct clk_ops meson_vid_pll_div_ro_ops = { .recalc_rate = meson_vid_pll_div_recalc_rate, }; -EXPORT_SYMBOL_NS_GPL(meson_vid_pll_div_ro_ops, CLK_MESON); +EXPORT_SYMBOL_NS_GPL(meson_vid_pll_div_ro_ops, "CLK_MESON"); MODULE_DESCRIPTION("Amlogic video pll divider driver"); MODULE_AUTHOR("Neil Armstrong "); MODULE_LICENSE("GPL"); -MODULE_IMPORT_NS(CLK_MESON); +MODULE_IMPORT_NS("CLK_MESON"); diff --git a/drivers/clk/microchip/clk-mpfs.c b/drivers/clk/microchip/clk-mpfs.c index 28ec0da88cb3..c22632a7439c 100644 --- a/drivers/clk/microchip/clk-mpfs.c +++ b/drivers/clk/microchip/clk-mpfs.c @@ -443,4 +443,4 @@ MODULE_DESCRIPTION("Microchip PolarFire SoC Clock Driver"); MODULE_AUTHOR("Padmarao Begari "); MODULE_AUTHOR("Daire McNamara "); MODULE_AUTHOR("Conor Dooley "); -MODULE_IMPORT_NS(MCHP_CLK_MPFS); +MODULE_IMPORT_NS("MCHP_CLK_MPFS"); diff --git a/drivers/clk/sunxi-ng/ccu-sun20i-d1-r.c b/drivers/clk/sunxi-ng/ccu-sun20i-d1-r.c index 4084714adb15..44b2ebdebdac 100644 --- a/drivers/clk/sunxi-ng/ccu-sun20i-d1-r.c +++ b/drivers/clk/sunxi-ng/ccu-sun20i-d1-r.c @@ -137,6 +137,6 @@ static struct platform_driver sun20i_d1_r_ccu_driver = { }; module_platform_driver(sun20i_d1_r_ccu_driver); -MODULE_IMPORT_NS(SUNXI_CCU); +MODULE_IMPORT_NS("SUNXI_CCU"); MODULE_DESCRIPTION("Support for the Allwinner D1/R528/T113 PRCM CCU"); MODULE_LICENSE("GPL"); diff --git a/drivers/clk/sunxi-ng/ccu-sun20i-d1.c b/drivers/clk/sunxi-ng/ccu-sun20i-d1.c index c80ac2dfbb60..bb66c906ebbb 100644 --- a/drivers/clk/sunxi-ng/ccu-sun20i-d1.c +++ b/drivers/clk/sunxi-ng/ccu-sun20i-d1.c @@ -1406,6 +1406,6 @@ static struct platform_driver sun20i_d1_ccu_driver = { }; module_platform_driver(sun20i_d1_ccu_driver); -MODULE_IMPORT_NS(SUNXI_CCU); +MODULE_IMPORT_NS("SUNXI_CCU"); MODULE_DESCRIPTION("Support for the Allwinner D1/R528/T113 CCU"); MODULE_LICENSE("GPL"); diff --git a/drivers/clk/sunxi-ng/ccu-sun4i-a10.c b/drivers/clk/sunxi-ng/ccu-sun4i-a10.c index 
54c794c50828..409feb085021 100644 --- a/drivers/clk/sunxi-ng/ccu-sun4i-a10.c +++ b/drivers/clk/sunxi-ng/ccu-sun4i-a10.c @@ -1493,6 +1493,6 @@ static struct platform_driver sun4i_a10_ccu_driver = { }; module_platform_driver(sun4i_a10_ccu_driver); -MODULE_IMPORT_NS(SUNXI_CCU); +MODULE_IMPORT_NS("SUNXI_CCU"); MODULE_DESCRIPTION("Support for the Allwinner A10/A20 CCU"); MODULE_LICENSE("GPL"); diff --git a/drivers/clk/sunxi-ng/ccu-sun50i-a100-r.c b/drivers/clk/sunxi-ng/ccu-sun50i-a100-r.c index cdd9721f9e7d..cb0f8d110c32 100644 --- a/drivers/clk/sunxi-ng/ccu-sun50i-a100-r.c +++ b/drivers/clk/sunxi-ng/ccu-sun50i-a100-r.c @@ -214,6 +214,6 @@ static struct platform_driver sun50i_a100_r_ccu_driver = { }; module_platform_driver(sun50i_a100_r_ccu_driver); -MODULE_IMPORT_NS(SUNXI_CCU); +MODULE_IMPORT_NS("SUNXI_CCU"); MODULE_DESCRIPTION("Support for the Allwinner A100 PRCM CCU"); MODULE_LICENSE("GPL"); diff --git a/drivers/clk/sunxi-ng/ccu-sun50i-a100.c b/drivers/clk/sunxi-ng/ccu-sun50i-a100.c index 1b6a49bc7184..7133377d4163 100644 --- a/drivers/clk/sunxi-ng/ccu-sun50i-a100.c +++ b/drivers/clk/sunxi-ng/ccu-sun50i-a100.c @@ -1276,6 +1276,6 @@ static struct platform_driver sun50i_a100_ccu_driver = { }; module_platform_driver(sun50i_a100_ccu_driver); -MODULE_IMPORT_NS(SUNXI_CCU); +MODULE_IMPORT_NS("SUNXI_CCU"); MODULE_DESCRIPTION("Support for the Allwinner A100 CCU"); MODULE_LICENSE("GPL"); diff --git a/drivers/clk/sunxi-ng/ccu-sun50i-a64.c b/drivers/clk/sunxi-ng/ccu-sun50i-a64.c index 82d7dcbca1cc..3a7d61c81667 100644 --- a/drivers/clk/sunxi-ng/ccu-sun50i-a64.c +++ b/drivers/clk/sunxi-ng/ccu-sun50i-a64.c @@ -994,6 +994,6 @@ static struct platform_driver sun50i_a64_ccu_driver = { }; module_platform_driver(sun50i_a64_ccu_driver); -MODULE_IMPORT_NS(SUNXI_CCU); +MODULE_IMPORT_NS("SUNXI_CCU"); MODULE_DESCRIPTION("Support for the Allwinner A64 CCU"); MODULE_LICENSE("GPL"); diff --git a/drivers/clk/sunxi-ng/ccu-sun50i-h6-r.c b/drivers/clk/sunxi-ng/ccu-sun50i-h6-r.c index 
d0ce2779c550..acb4e8b9b1ba 100644 --- a/drivers/clk/sunxi-ng/ccu-sun50i-h6-r.c +++ b/drivers/clk/sunxi-ng/ccu-sun50i-h6-r.c @@ -256,6 +256,6 @@ static struct platform_driver sun50i_h6_r_ccu_driver = { }; module_platform_driver(sun50i_h6_r_ccu_driver); -MODULE_IMPORT_NS(SUNXI_CCU); +MODULE_IMPORT_NS("SUNXI_CCU"); MODULE_DESCRIPTION("Support for the Allwinner H6 and H616 PRCM CCU"); MODULE_LICENSE("GPL"); diff --git a/drivers/clk/sunxi-ng/ccu-sun50i-h6.c b/drivers/clk/sunxi-ng/ccu-sun50i-h6.c index bd6fc3df911d..7fccda96d444 100644 --- a/drivers/clk/sunxi-ng/ccu-sun50i-h6.c +++ b/drivers/clk/sunxi-ng/ccu-sun50i-h6.c @@ -1286,6 +1286,6 @@ static struct platform_driver sun50i_h6_ccu_driver = { }; module_platform_driver(sun50i_h6_ccu_driver); -MODULE_IMPORT_NS(SUNXI_CCU); +MODULE_IMPORT_NS("SUNXI_CCU"); MODULE_DESCRIPTION("Support for the Allwinner H6 CCU"); MODULE_LICENSE("GPL"); diff --git a/drivers/clk/sunxi-ng/ccu-sun50i-h616.c b/drivers/clk/sunxi-ng/ccu-sun50i-h616.c index b001d0c03534..1086669b91da 100644 --- a/drivers/clk/sunxi-ng/ccu-sun50i-h616.c +++ b/drivers/clk/sunxi-ng/ccu-sun50i-h616.c @@ -1185,6 +1185,6 @@ static struct platform_driver sun50i_h616_ccu_driver = { }; module_platform_driver(sun50i_h616_ccu_driver); -MODULE_IMPORT_NS(SUNXI_CCU); +MODULE_IMPORT_NS("SUNXI_CCU"); MODULE_DESCRIPTION("Support for the Allwinner H616 CCU"); MODULE_LICENSE("GPL"); diff --git a/drivers/clk/sunxi-ng/ccu-sun6i-a31.c b/drivers/clk/sunxi-ng/ccu-sun6i-a31.c index c2ad1209633e..bab65cfe9501 100644 --- a/drivers/clk/sunxi-ng/ccu-sun6i-a31.c +++ b/drivers/clk/sunxi-ng/ccu-sun6i-a31.c @@ -1283,6 +1283,6 @@ static struct platform_driver sun6i_a31_ccu_driver = { }; module_platform_driver(sun6i_a31_ccu_driver); -MODULE_IMPORT_NS(SUNXI_CCU); +MODULE_IMPORT_NS("SUNXI_CCU"); MODULE_DESCRIPTION("Support for the Allwinner A31/A31s CCU"); MODULE_LICENSE("GPL"); diff --git a/drivers/clk/sunxi-ng/ccu-sun6i-rtc.c b/drivers/clk/sunxi-ng/ccu-sun6i-rtc.c index 724b202863a8..0536e880b80f 
100644 --- a/drivers/clk/sunxi-ng/ccu-sun6i-rtc.c +++ b/drivers/clk/sunxi-ng/ccu-sun6i-rtc.c @@ -381,6 +381,6 @@ int sun6i_rtc_ccu_probe(struct device *dev, void __iomem *reg) return devm_sunxi_ccu_probe(dev, reg, &sun6i_rtc_ccu_desc); } -MODULE_IMPORT_NS(SUNXI_CCU); +MODULE_IMPORT_NS("SUNXI_CCU"); MODULE_DESCRIPTION("Support for the Allwinner H616/R329 RTC CCU"); MODULE_LICENSE("GPL"); diff --git a/drivers/clk/sunxi-ng/ccu-sun8i-a23.c b/drivers/clk/sunxi-ng/ccu-sun8i-a23.c index 9433dbac038e..78cf3818ab09 100644 --- a/drivers/clk/sunxi-ng/ccu-sun8i-a23.c +++ b/drivers/clk/sunxi-ng/ccu-sun8i-a23.c @@ -763,6 +763,6 @@ static struct platform_driver sun8i_a23_ccu_driver = { }; module_platform_driver(sun8i_a23_ccu_driver); -MODULE_IMPORT_NS(SUNXI_CCU); +MODULE_IMPORT_NS("SUNXI_CCU"); MODULE_DESCRIPTION("Support for the Allwinner A23 CCU"); MODULE_LICENSE("GPL"); diff --git a/drivers/clk/sunxi-ng/ccu-sun8i-a33.c b/drivers/clk/sunxi-ng/ccu-sun8i-a33.c index 1ffc5ab9bc3c..b039d419512c 100644 --- a/drivers/clk/sunxi-ng/ccu-sun8i-a33.c +++ b/drivers/clk/sunxi-ng/ccu-sun8i-a33.c @@ -835,6 +835,6 @@ static struct platform_driver sun8i_a33_ccu_driver = { }; module_platform_driver(sun8i_a33_ccu_driver); -MODULE_IMPORT_NS(SUNXI_CCU); +MODULE_IMPORT_NS("SUNXI_CCU"); MODULE_DESCRIPTION("Support for the Allwinner A33 CCU"); MODULE_LICENSE("GPL"); diff --git a/drivers/clk/sunxi-ng/ccu-sun8i-a83t.c b/drivers/clk/sunxi-ng/ccu-sun8i-a83t.c index a51fb2c10c94..60e918965a72 100644 --- a/drivers/clk/sunxi-ng/ccu-sun8i-a83t.c +++ b/drivers/clk/sunxi-ng/ccu-sun8i-a83t.c @@ -923,6 +923,6 @@ static struct platform_driver sun8i_a83t_ccu_driver = { }; module_platform_driver(sun8i_a83t_ccu_driver); -MODULE_IMPORT_NS(SUNXI_CCU); +MODULE_IMPORT_NS("SUNXI_CCU"); MODULE_DESCRIPTION("Support for the Allwinner A83T CCU"); MODULE_LICENSE("GPL"); diff --git a/drivers/clk/sunxi-ng/ccu-sun8i-de2.c b/drivers/clk/sunxi-ng/ccu-sun8i-de2.c index a742f83746d1..f2aa71206bc2 100644 --- 
a/drivers/clk/sunxi-ng/ccu-sun8i-de2.c +++ b/drivers/clk/sunxi-ng/ccu-sun8i-de2.c @@ -348,6 +348,6 @@ static struct platform_driver sunxi_de2_clk_driver = { }; module_platform_driver(sunxi_de2_clk_driver); -MODULE_IMPORT_NS(SUNXI_CCU); +MODULE_IMPORT_NS("SUNXI_CCU"); MODULE_DESCRIPTION("Support for the Allwinner SoCs DE2 CCU"); MODULE_LICENSE("GPL"); diff --git a/drivers/clk/sunxi-ng/ccu-sun8i-h3.c b/drivers/clk/sunxi-ng/ccu-sun8i-h3.c index 74da5d27af72..740c4c97331c 100644 --- a/drivers/clk/sunxi-ng/ccu-sun8i-h3.c +++ b/drivers/clk/sunxi-ng/ccu-sun8i-h3.c @@ -1094,6 +1094,6 @@ static struct platform_driver sun8i_h3_ccu_driver = { }; module_platform_driver(sun8i_h3_ccu_driver); -MODULE_IMPORT_NS(SUNXI_CCU); +MODULE_IMPORT_NS("SUNXI_CCU"); MODULE_DESCRIPTION("Support for the Allwinner H3 CCU"); MODULE_LICENSE("GPL"); diff --git a/drivers/clk/sunxi-ng/ccu-sun8i-r.c b/drivers/clk/sunxi-ng/ccu-sun8i-r.c index 2b3e094a32cb..0e324344673b 100644 --- a/drivers/clk/sunxi-ng/ccu-sun8i-r.c +++ b/drivers/clk/sunxi-ng/ccu-sun8i-r.c @@ -274,6 +274,6 @@ static struct platform_driver sun8i_r_ccu_driver = { }; module_platform_driver(sun8i_r_ccu_driver); -MODULE_IMPORT_NS(SUNXI_CCU); +MODULE_IMPORT_NS("SUNXI_CCU"); MODULE_DESCRIPTION("Support for Allwinner SoCs' PRCM CCUs"); MODULE_LICENSE("GPL"); diff --git a/drivers/clk/sunxi-ng/ccu-sun8i-r40.c b/drivers/clk/sunxi-ng/ccu-sun8i-r40.c index a374aeeca3f4..8b729c9b3545 100644 --- a/drivers/clk/sunxi-ng/ccu-sun8i-r40.c +++ b/drivers/clk/sunxi-ng/ccu-sun8i-r40.c @@ -1375,6 +1375,6 @@ static struct platform_driver sun8i_r40_ccu_driver = { }; module_platform_driver(sun8i_r40_ccu_driver); -MODULE_IMPORT_NS(SUNXI_CCU); +MODULE_IMPORT_NS("SUNXI_CCU"); MODULE_DESCRIPTION("Support for the Allwinner R40 CCU"); MODULE_LICENSE("GPL"); diff --git a/drivers/clk/sunxi-ng/ccu-sun8i-v3s.c b/drivers/clk/sunxi-ng/ccu-sun8i-v3s.c index 00d04f7ad94d..579a81bb46df 100644 --- a/drivers/clk/sunxi-ng/ccu-sun8i-v3s.c +++ b/drivers/clk/sunxi-ng/ccu-sun8i-v3s.c 
@@ -780,6 +780,6 @@ static struct platform_driver sun8i_v3s_ccu_driver = { }; module_platform_driver(sun8i_v3s_ccu_driver); -MODULE_IMPORT_NS(SUNXI_CCU); +MODULE_IMPORT_NS("SUNXI_CCU"); MODULE_DESCRIPTION("Support for the Allwinner V3s CCU"); MODULE_LICENSE("GPL"); diff --git a/drivers/clk/sunxi-ng/ccu-sun9i-a80-de.c b/drivers/clk/sunxi-ng/ccu-sun9i-a80-de.c index d561c15f5122..91e5dc448bc0 100644 --- a/drivers/clk/sunxi-ng/ccu-sun9i-a80-de.c +++ b/drivers/clk/sunxi-ng/ccu-sun9i-a80-de.c @@ -266,6 +266,6 @@ static struct platform_driver sun9i_a80_de_clk_driver = { }; module_platform_driver(sun9i_a80_de_clk_driver); -MODULE_IMPORT_NS(SUNXI_CCU); +MODULE_IMPORT_NS("SUNXI_CCU"); MODULE_DESCRIPTION("Support for the Allwinner A80 Display Engine CCU"); MODULE_LICENSE("GPL"); diff --git a/drivers/clk/sunxi-ng/ccu-sun9i-a80-usb.c b/drivers/clk/sunxi-ng/ccu-sun9i-a80-usb.c index 9e2b8d47fc54..62063f525616 100644 --- a/drivers/clk/sunxi-ng/ccu-sun9i-a80-usb.c +++ b/drivers/clk/sunxi-ng/ccu-sun9i-a80-usb.c @@ -138,6 +138,6 @@ static struct platform_driver sun9i_a80_usb_clk_driver = { }; module_platform_driver(sun9i_a80_usb_clk_driver); -MODULE_IMPORT_NS(SUNXI_CCU); +MODULE_IMPORT_NS("SUNXI_CCU"); MODULE_DESCRIPTION("Support for the Allwinner A80 USB CCU"); MODULE_LICENSE("GPL"); diff --git a/drivers/clk/sunxi-ng/ccu-sun9i-a80.c b/drivers/clk/sunxi-ng/ccu-sun9i-a80.c index 5da9a16b4ec7..337751998005 100644 --- a/drivers/clk/sunxi-ng/ccu-sun9i-a80.c +++ b/drivers/clk/sunxi-ng/ccu-sun9i-a80.c @@ -1248,6 +1248,6 @@ static struct platform_driver sun9i_a80_ccu_driver = { }; module_platform_driver(sun9i_a80_ccu_driver); -MODULE_IMPORT_NS(SUNXI_CCU); +MODULE_IMPORT_NS("SUNXI_CCU"); MODULE_DESCRIPTION("Support for the Allwinner A80 CCU"); MODULE_LICENSE("GPL"); diff --git a/drivers/clk/sunxi-ng/ccu-suniv-f1c100s.c b/drivers/clk/sunxi-ng/ccu-suniv-f1c100s.c index fb37c0fc4fde..35935423145e 100644 --- a/drivers/clk/sunxi-ng/ccu-suniv-f1c100s.c +++ 
b/drivers/clk/sunxi-ng/ccu-suniv-f1c100s.c @@ -577,6 +577,6 @@ static struct platform_driver suniv_f1c100s_ccu_driver = { }; module_platform_driver(suniv_f1c100s_ccu_driver); -MODULE_IMPORT_NS(SUNXI_CCU); +MODULE_IMPORT_NS("SUNXI_CCU"); MODULE_DESCRIPTION("Support for the Allwinner newer F1C100s CCU"); MODULE_LICENSE("GPL"); diff --git a/drivers/clk/sunxi-ng/ccu_common.c b/drivers/clk/sunxi-ng/ccu_common.c index 4117b0bea267..88ed89658d45 100644 --- a/drivers/clk/sunxi-ng/ccu_common.c +++ b/drivers/clk/sunxi-ng/ccu_common.c @@ -37,7 +37,7 @@ void ccu_helper_wait_for_lock(struct ccu_common *common, u32 lock) WARN_ON(readl_relaxed_poll_timeout(addr, reg, reg & lock, 100, 70000)); } -EXPORT_SYMBOL_NS_GPL(ccu_helper_wait_for_lock, SUNXI_CCU); +EXPORT_SYMBOL_NS_GPL(ccu_helper_wait_for_lock, "SUNXI_CCU"); bool ccu_is_better_rate(struct ccu_common *common, unsigned long target_rate, @@ -59,7 +59,7 @@ bool ccu_is_better_rate(struct ccu_common *common, return current_rate <= target_rate && current_rate > best_rate; } -EXPORT_SYMBOL_NS_GPL(ccu_is_better_rate, SUNXI_CCU); +EXPORT_SYMBOL_NS_GPL(ccu_is_better_rate, "SUNXI_CCU"); /* * This clock notifier is called when the frequency of a PLL clock is @@ -107,7 +107,7 @@ int ccu_pll_notifier_register(struct ccu_pll_nb *pll_nb) return clk_notifier_register(pll_nb->common->hw.clk, &pll_nb->clk_nb); } -EXPORT_SYMBOL_NS_GPL(ccu_pll_notifier_register, SUNXI_CCU); +EXPORT_SYMBOL_NS_GPL(ccu_pll_notifier_register, "SUNXI_CCU"); static int sunxi_ccu_probe(struct sunxi_ccu *ccu, struct device *dev, struct device_node *node, void __iomem *reg, @@ -234,7 +234,7 @@ int devm_sunxi_ccu_probe(struct device *dev, void __iomem *reg, return 0; } -EXPORT_SYMBOL_NS_GPL(devm_sunxi_ccu_probe, SUNXI_CCU); +EXPORT_SYMBOL_NS_GPL(devm_sunxi_ccu_probe, "SUNXI_CCU"); void of_sunxi_ccu_probe(struct device_node *node, void __iomem *reg, const struct sunxi_ccu_desc *desc) diff --git a/drivers/clk/sunxi-ng/ccu_div.c b/drivers/clk/sunxi-ng/ccu_div.c index 
cb10a3ea23f9..7f4691f09e01 100644 --- a/drivers/clk/sunxi-ng/ccu_div.c +++ b/drivers/clk/sunxi-ng/ccu_div.c @@ -141,4 +141,4 @@ const struct clk_ops ccu_div_ops = { .recalc_rate = ccu_div_recalc_rate, .set_rate = ccu_div_set_rate, }; -EXPORT_SYMBOL_NS_GPL(ccu_div_ops, SUNXI_CCU); +EXPORT_SYMBOL_NS_GPL(ccu_div_ops, "SUNXI_CCU"); diff --git a/drivers/clk/sunxi-ng/ccu_frac.c b/drivers/clk/sunxi-ng/ccu_frac.c index b31f3ad946d6..75323912608a 100644 --- a/drivers/clk/sunxi-ng/ccu_frac.c +++ b/drivers/clk/sunxi-ng/ccu_frac.c @@ -18,7 +18,7 @@ bool ccu_frac_helper_is_enabled(struct ccu_common *common, return !(readl(common->base + common->reg) & cf->enable); } -EXPORT_SYMBOL_NS_GPL(ccu_frac_helper_is_enabled, SUNXI_CCU); +EXPORT_SYMBOL_NS_GPL(ccu_frac_helper_is_enabled, "SUNXI_CCU"); void ccu_frac_helper_enable(struct ccu_common *common, struct ccu_frac_internal *cf) @@ -34,7 +34,7 @@ void ccu_frac_helper_enable(struct ccu_common *common, writel(reg & ~cf->enable, common->base + common->reg); spin_unlock_irqrestore(common->lock, flags); } -EXPORT_SYMBOL_NS_GPL(ccu_frac_helper_enable, SUNXI_CCU); +EXPORT_SYMBOL_NS_GPL(ccu_frac_helper_enable, "SUNXI_CCU"); void ccu_frac_helper_disable(struct ccu_common *common, struct ccu_frac_internal *cf) @@ -50,7 +50,7 @@ void ccu_frac_helper_disable(struct ccu_common *common, writel(reg | cf->enable, common->base + common->reg); spin_unlock_irqrestore(common->lock, flags); } -EXPORT_SYMBOL_NS_GPL(ccu_frac_helper_disable, SUNXI_CCU); +EXPORT_SYMBOL_NS_GPL(ccu_frac_helper_disable, "SUNXI_CCU"); bool ccu_frac_helper_has_rate(struct ccu_common *common, struct ccu_frac_internal *cf, @@ -61,7 +61,7 @@ bool ccu_frac_helper_has_rate(struct ccu_common *common, return (cf->rates[0] == rate) || (cf->rates[1] == rate); } -EXPORT_SYMBOL_NS_GPL(ccu_frac_helper_has_rate, SUNXI_CCU); +EXPORT_SYMBOL_NS_GPL(ccu_frac_helper_has_rate, "SUNXI_CCU"); unsigned long ccu_frac_helper_read_rate(struct ccu_common *common, struct ccu_frac_internal *cf) @@ -83,7 
+83,7 @@ unsigned long ccu_frac_helper_read_rate(struct ccu_common *common, return (reg & cf->select) ? cf->rates[1] : cf->rates[0]; } -EXPORT_SYMBOL_NS_GPL(ccu_frac_helper_read_rate, SUNXI_CCU); +EXPORT_SYMBOL_NS_GPL(ccu_frac_helper_read_rate, "SUNXI_CCU"); int ccu_frac_helper_set_rate(struct ccu_common *common, struct ccu_frac_internal *cf, @@ -112,4 +112,4 @@ int ccu_frac_helper_set_rate(struct ccu_common *common, return 0; } -EXPORT_SYMBOL_NS_GPL(ccu_frac_helper_set_rate, SUNXI_CCU); +EXPORT_SYMBOL_NS_GPL(ccu_frac_helper_set_rate, "SUNXI_CCU"); diff --git a/drivers/clk/sunxi-ng/ccu_gate.c b/drivers/clk/sunxi-ng/ccu_gate.c index a2115a21807d..ac52fd6bff67 100644 --- a/drivers/clk/sunxi-ng/ccu_gate.c +++ b/drivers/clk/sunxi-ng/ccu_gate.c @@ -24,7 +24,7 @@ void ccu_gate_helper_disable(struct ccu_common *common, u32 gate) spin_unlock_irqrestore(common->lock, flags); } -EXPORT_SYMBOL_NS_GPL(ccu_gate_helper_disable, SUNXI_CCU); +EXPORT_SYMBOL_NS_GPL(ccu_gate_helper_disable, "SUNXI_CCU"); static void ccu_gate_disable(struct clk_hw *hw) { @@ -50,7 +50,7 @@ int ccu_gate_helper_enable(struct ccu_common *common, u32 gate) return 0; } -EXPORT_SYMBOL_NS_GPL(ccu_gate_helper_enable, SUNXI_CCU); +EXPORT_SYMBOL_NS_GPL(ccu_gate_helper_enable, "SUNXI_CCU"); static int ccu_gate_enable(struct clk_hw *hw) { @@ -66,7 +66,7 @@ int ccu_gate_helper_is_enabled(struct ccu_common *common, u32 gate) return readl(common->base + common->reg) & gate; } -EXPORT_SYMBOL_NS_GPL(ccu_gate_helper_is_enabled, SUNXI_CCU); +EXPORT_SYMBOL_NS_GPL(ccu_gate_helper_is_enabled, "SUNXI_CCU"); static int ccu_gate_is_enabled(struct clk_hw *hw) { @@ -127,4 +127,4 @@ const struct clk_ops ccu_gate_ops = { .set_rate = ccu_gate_set_rate, .recalc_rate = ccu_gate_recalc_rate, }; -EXPORT_SYMBOL_NS_GPL(ccu_gate_ops, SUNXI_CCU); +EXPORT_SYMBOL_NS_GPL(ccu_gate_ops, "SUNXI_CCU"); diff --git a/drivers/clk/sunxi-ng/ccu_mp.c b/drivers/clk/sunxi-ng/ccu_mp.c index cc94a694cb67..2bb8987ddcc2 100644 --- 
a/drivers/clk/sunxi-ng/ccu_mp.c +++ b/drivers/clk/sunxi-ng/ccu_mp.c @@ -246,7 +246,7 @@ const struct clk_ops ccu_mp_ops = { .recalc_rate = ccu_mp_recalc_rate, .set_rate = ccu_mp_set_rate, }; -EXPORT_SYMBOL_NS_GPL(ccu_mp_ops, SUNXI_CCU); +EXPORT_SYMBOL_NS_GPL(ccu_mp_ops, "SUNXI_CCU"); /* * Support for MMC timing mode switching @@ -327,4 +327,4 @@ const struct clk_ops ccu_mp_mmc_ops = { .recalc_rate = ccu_mp_mmc_recalc_rate, .set_rate = ccu_mp_mmc_set_rate, }; -EXPORT_SYMBOL_NS_GPL(ccu_mp_mmc_ops, SUNXI_CCU); +EXPORT_SYMBOL_NS_GPL(ccu_mp_mmc_ops, "SUNXI_CCU"); diff --git a/drivers/clk/sunxi-ng/ccu_mult.c b/drivers/clk/sunxi-ng/ccu_mult.c index 7bee217ef111..8d5720f3dec1 100644 --- a/drivers/clk/sunxi-ng/ccu_mult.c +++ b/drivers/clk/sunxi-ng/ccu_mult.c @@ -170,4 +170,4 @@ const struct clk_ops ccu_mult_ops = { .recalc_rate = ccu_mult_recalc_rate, .set_rate = ccu_mult_set_rate, }; -EXPORT_SYMBOL_NS_GPL(ccu_mult_ops, SUNXI_CCU); +EXPORT_SYMBOL_NS_GPL(ccu_mult_ops, "SUNXI_CCU"); diff --git a/drivers/clk/sunxi-ng/ccu_mux.c b/drivers/clk/sunxi-ng/ccu_mux.c index 5edc63b46651..d7ffbdeee9e0 100644 --- a/drivers/clk/sunxi-ng/ccu_mux.c +++ b/drivers/clk/sunxi-ng/ccu_mux.c @@ -66,7 +66,7 @@ unsigned long ccu_mux_helper_apply_prediv(struct ccu_common *common, { return parent_rate / ccu_mux_get_prediv(common, cm, parent_index); } -EXPORT_SYMBOL_NS_GPL(ccu_mux_helper_apply_prediv, SUNXI_CCU); +EXPORT_SYMBOL_NS_GPL(ccu_mux_helper_apply_prediv, "SUNXI_CCU"); static unsigned long ccu_mux_helper_unapply_prediv(struct ccu_common *common, struct ccu_mux_internal *cm, @@ -155,7 +155,7 @@ out: req->rate = best_rate; return 0; } -EXPORT_SYMBOL_NS_GPL(ccu_mux_helper_determine_rate, SUNXI_CCU); +EXPORT_SYMBOL_NS_GPL(ccu_mux_helper_determine_rate, "SUNXI_CCU"); u8 ccu_mux_helper_get_parent(struct ccu_common *common, struct ccu_mux_internal *cm) @@ -178,7 +178,7 @@ u8 ccu_mux_helper_get_parent(struct ccu_common *common, return parent; } -EXPORT_SYMBOL_NS_GPL(ccu_mux_helper_get_parent, 
SUNXI_CCU); +EXPORT_SYMBOL_NS_GPL(ccu_mux_helper_get_parent, "SUNXI_CCU"); int ccu_mux_helper_set_parent(struct ccu_common *common, struct ccu_mux_internal *cm, @@ -205,7 +205,7 @@ int ccu_mux_helper_set_parent(struct ccu_common *common, return 0; } -EXPORT_SYMBOL_NS_GPL(ccu_mux_helper_set_parent, SUNXI_CCU); +EXPORT_SYMBOL_NS_GPL(ccu_mux_helper_set_parent, "SUNXI_CCU"); static void ccu_mux_disable(struct clk_hw *hw) { @@ -273,7 +273,7 @@ const struct clk_ops ccu_mux_ops = { .determine_rate = ccu_mux_determine_rate, .recalc_rate = ccu_mux_recalc_rate, }; -EXPORT_SYMBOL_NS_GPL(ccu_mux_ops, SUNXI_CCU); +EXPORT_SYMBOL_NS_GPL(ccu_mux_ops, "SUNXI_CCU"); /* * This clock notifier is called when the frequency of the of the parent @@ -308,4 +308,4 @@ int ccu_mux_notifier_register(struct clk *clk, struct ccu_mux_nb *mux_nb) return clk_notifier_register(clk, &mux_nb->clk_nb); } -EXPORT_SYMBOL_NS_GPL(ccu_mux_notifier_register, SUNXI_CCU); +EXPORT_SYMBOL_NS_GPL(ccu_mux_notifier_register, "SUNXI_CCU"); diff --git a/drivers/clk/sunxi-ng/ccu_nk.c b/drivers/clk/sunxi-ng/ccu_nk.c index 8aa35d5804f3..555e99de2cc6 100644 --- a/drivers/clk/sunxi-ng/ccu_nk.c +++ b/drivers/clk/sunxi-ng/ccu_nk.c @@ -158,4 +158,4 @@ const struct clk_ops ccu_nk_ops = { .round_rate = ccu_nk_round_rate, .set_rate = ccu_nk_set_rate, }; -EXPORT_SYMBOL_NS_GPL(ccu_nk_ops, SUNXI_CCU); +EXPORT_SYMBOL_NS_GPL(ccu_nk_ops, "SUNXI_CCU"); diff --git a/drivers/clk/sunxi-ng/ccu_nkm.c b/drivers/clk/sunxi-ng/ccu_nkm.c index 1168d894d636..784eec9ac997 100644 --- a/drivers/clk/sunxi-ng/ccu_nkm.c +++ b/drivers/clk/sunxi-ng/ccu_nkm.c @@ -267,4 +267,4 @@ const struct clk_ops ccu_nkm_ops = { .recalc_rate = ccu_nkm_recalc_rate, .set_rate = ccu_nkm_set_rate, }; -EXPORT_SYMBOL_NS_GPL(ccu_nkm_ops, SUNXI_CCU); +EXPORT_SYMBOL_NS_GPL(ccu_nkm_ops, "SUNXI_CCU"); diff --git a/drivers/clk/sunxi-ng/ccu_nkmp.c b/drivers/clk/sunxi-ng/ccu_nkmp.c index 99359a06892d..6e03b69d4028 100644 --- a/drivers/clk/sunxi-ng/ccu_nkmp.c +++ 
b/drivers/clk/sunxi-ng/ccu_nkmp.c @@ -230,4 +230,4 @@ const struct clk_ops ccu_nkmp_ops = { .round_rate = ccu_nkmp_round_rate, .set_rate = ccu_nkmp_set_rate, }; -EXPORT_SYMBOL_NS_GPL(ccu_nkmp_ops, SUNXI_CCU); +EXPORT_SYMBOL_NS_GPL(ccu_nkmp_ops, "SUNXI_CCU"); diff --git a/drivers/clk/sunxi-ng/ccu_nm.c b/drivers/clk/sunxi-ng/ccu_nm.c index ffac3deb89d6..a4e2243b8d6b 100644 --- a/drivers/clk/sunxi-ng/ccu_nm.c +++ b/drivers/clk/sunxi-ng/ccu_nm.c @@ -236,4 +236,4 @@ const struct clk_ops ccu_nm_ops = { .round_rate = ccu_nm_round_rate, .set_rate = ccu_nm_set_rate, }; -EXPORT_SYMBOL_NS_GPL(ccu_nm_ops, SUNXI_CCU); +EXPORT_SYMBOL_NS_GPL(ccu_nm_ops, "SUNXI_CCU"); diff --git a/drivers/clk/sunxi-ng/ccu_phase.c b/drivers/clk/sunxi-ng/ccu_phase.c index e4cae2afe9db..ca43cf448666 100644 --- a/drivers/clk/sunxi-ng/ccu_phase.c +++ b/drivers/clk/sunxi-ng/ccu_phase.c @@ -121,4 +121,4 @@ const struct clk_ops ccu_phase_ops = { .get_phase = ccu_phase_get_phase, .set_phase = ccu_phase_set_phase, }; -EXPORT_SYMBOL_NS_GPL(ccu_phase_ops, SUNXI_CCU); +EXPORT_SYMBOL_NS_GPL(ccu_phase_ops, "SUNXI_CCU"); diff --git a/drivers/clk/sunxi-ng/ccu_reset.c b/drivers/clk/sunxi-ng/ccu_reset.c index 6577aa18cb01..55bc7c7cda0f 100644 --- a/drivers/clk/sunxi-ng/ccu_reset.c +++ b/drivers/clk/sunxi-ng/ccu_reset.c @@ -75,4 +75,4 @@ const struct reset_control_ops ccu_reset_ops = { .reset = ccu_reset_reset, .status = ccu_reset_status, }; -EXPORT_SYMBOL_NS_GPL(ccu_reset_ops, SUNXI_CCU); +EXPORT_SYMBOL_NS_GPL(ccu_reset_ops, "SUNXI_CCU"); diff --git a/drivers/clk/sunxi-ng/ccu_sdm.c b/drivers/clk/sunxi-ng/ccu_sdm.c index 41937ed0766d..c564e5f9e610 100644 --- a/drivers/clk/sunxi-ng/ccu_sdm.c +++ b/drivers/clk/sunxi-ng/ccu_sdm.c @@ -20,7 +20,7 @@ bool ccu_sdm_helper_is_enabled(struct ccu_common *common, return !!(readl(common->base + sdm->tuning_reg) & sdm->tuning_enable); } -EXPORT_SYMBOL_NS_GPL(ccu_sdm_helper_is_enabled, SUNXI_CCU); +EXPORT_SYMBOL_NS_GPL(ccu_sdm_helper_is_enabled, "SUNXI_CCU"); void 
ccu_sdm_helper_enable(struct ccu_common *common, struct ccu_sdm_internal *sdm, @@ -50,7 +50,7 @@ void ccu_sdm_helper_enable(struct ccu_common *common, writel(reg | sdm->enable, common->base + common->reg); spin_unlock_irqrestore(common->lock, flags); } -EXPORT_SYMBOL_NS_GPL(ccu_sdm_helper_enable, SUNXI_CCU); +EXPORT_SYMBOL_NS_GPL(ccu_sdm_helper_enable, "SUNXI_CCU"); void ccu_sdm_helper_disable(struct ccu_common *common, struct ccu_sdm_internal *sdm) @@ -71,7 +71,7 @@ void ccu_sdm_helper_disable(struct ccu_common *common, writel(reg & ~sdm->tuning_enable, common->base + sdm->tuning_reg); spin_unlock_irqrestore(common->lock, flags); } -EXPORT_SYMBOL_NS_GPL(ccu_sdm_helper_disable, SUNXI_CCU); +EXPORT_SYMBOL_NS_GPL(ccu_sdm_helper_disable, "SUNXI_CCU"); /* * Sigma delta modulation provides a way to do fractional-N frequency @@ -105,7 +105,7 @@ bool ccu_sdm_helper_has_rate(struct ccu_common *common, return false; } -EXPORT_SYMBOL_NS_GPL(ccu_sdm_helper_has_rate, SUNXI_CCU); +EXPORT_SYMBOL_NS_GPL(ccu_sdm_helper_has_rate, "SUNXI_CCU"); unsigned long ccu_sdm_helper_read_rate(struct ccu_common *common, struct ccu_sdm_internal *sdm, @@ -136,7 +136,7 @@ unsigned long ccu_sdm_helper_read_rate(struct ccu_common *common, /* We can't calculate the effective clock rate, so just fail. 
*/ return 0; } -EXPORT_SYMBOL_NS_GPL(ccu_sdm_helper_read_rate, SUNXI_CCU); +EXPORT_SYMBOL_NS_GPL(ccu_sdm_helper_read_rate, "SUNXI_CCU"); int ccu_sdm_helper_get_factors(struct ccu_common *common, struct ccu_sdm_internal *sdm, @@ -158,4 +158,4 @@ int ccu_sdm_helper_get_factors(struct ccu_common *common, /* nothing found */ return -EINVAL; } -EXPORT_SYMBOL_NS_GPL(ccu_sdm_helper_get_factors, SUNXI_CCU); +EXPORT_SYMBOL_NS_GPL(ccu_sdm_helper_get_factors, "SUNXI_CCU"); diff --git a/drivers/clk/thead/clk-th1520-ap.c b/drivers/clk/thead/clk-th1520-ap.c index 17e32ae08720..1015fab95251 100644 --- a/drivers/clk/thead/clk-th1520-ap.c +++ b/drivers/clk/thead/clk-th1520-ap.c @@ -779,6 +779,13 @@ static struct ccu_div dpu1_clk = { }, }; +static CLK_FIXED_FACTOR_HW(emmc_sdio_ref_clk, "emmc-sdio-ref", + &video_pll_clk.common.hw, 4, 1, 0); + +static const struct clk_parent_data emmc_sdio_ref_clk_pd[] = { + { .hw = &emmc_sdio_ref_clk.hw }, +}; + static CCU_GATE(CLK_BROM, brom_clk, "brom", ahb2_cpusys_hclk_pd, 0x100, BIT(4), 0); static CCU_GATE(CLK_BMU, bmu_clk, "bmu", axi4_cpusys2_aclk_pd, 0x100, BIT(5), 0); static CCU_GATE(CLK_AON2CPU_A2X, aon2cpu_a2x_clk, "aon2cpu-a2x", axi4_cpusys2_aclk_pd, @@ -798,7 +805,7 @@ static CCU_GATE(CLK_PERISYS_APB4_HCLK, perisys_apb4_hclk, "perisys-apb4-hclk", p 0x150, BIT(12), 0); static CCU_GATE(CLK_NPU_AXI, npu_axi_clk, "npu-axi", axi_aclk_pd, 0x1c8, BIT(5), 0); static CCU_GATE(CLK_CPU2VP, cpu2vp_clk, "cpu2vp", axi_aclk_pd, 0x1e0, BIT(13), 0); -static CCU_GATE(CLK_EMMC_SDIO, emmc_sdio_clk, "emmc-sdio", video_pll_clk_pd, 0x204, BIT(30), 0); +static CCU_GATE(CLK_EMMC_SDIO, emmc_sdio_clk, "emmc-sdio", emmc_sdio_ref_clk_pd, 0x204, BIT(30), 0); static CCU_GATE(CLK_GMAC1, gmac1_clk, "gmac1", gmac_pll_clk_pd, 0x204, BIT(26), 0); static CCU_GATE(CLK_PADCTRL1, padctrl1_clk, "padctrl1", perisys_apb_pclk_pd, 0x204, BIT(24), 0); static CCU_GATE(CLK_DSMART, dsmart_clk, "dsmart", perisys_apb_pclk_pd, 0x204, BIT(23), 0); @@ -1059,6 +1066,10 @@ static int 
th1520_clk_probe(struct platform_device *pdev) return ret; priv->hws[CLK_PLL_GMAC_100M] = &gmac_pll_clk_100m.hw; + ret = devm_clk_hw_register(dev, &emmc_sdio_ref_clk.hw); + if (ret) + return ret; + ret = devm_of_clk_add_hw_provider(dev, of_clk_hw_onecell_get, priv); if (ret) return ret; diff --git a/drivers/clocksource/hyperv_timer.c b/drivers/clocksource/hyperv_timer.c index 99177835cade..b39dee7b93af 100644 --- a/drivers/clocksource/hyperv_timer.c +++ b/drivers/clocksource/hyperv_timer.c @@ -27,7 +27,8 @@ #include static struct clock_event_device __percpu *hv_clock_event; -static u64 hv_sched_clock_offset __ro_after_init; +/* Note: offset can hold negative values after hibernation. */ +static u64 hv_sched_clock_offset __read_mostly; /* * If false, we're using the old mechanism for stimer0 interrupts @@ -470,6 +471,17 @@ static void resume_hv_clock_tsc(struct clocksource *arg) hv_set_msr(HV_MSR_REFERENCE_TSC, tsc_msr.as_uint64); } +/* + * Called during resume from hibernation, from overridden + * x86_platform.restore_sched_clock_state routine. This is to adjust offsets + * used to calculate time for hv tsc page based sched_clock, to account for + * time spent before hibernation. 
+ */ +void hv_adj_sched_clock_offset(u64 offset) +{ + hv_sched_clock_offset -= offset; +} + #ifdef HAVE_VDSO_CLOCKMODE_HVCLOCK static int hv_cs_enable(struct clocksource *cs) { diff --git a/drivers/clocksource/timer-sun5i.c b/drivers/clocksource/timer-sun5i.c index 0d229a9058da..6b48a9006444 100644 --- a/drivers/clocksource/timer-sun5i.c +++ b/drivers/clocksource/timer-sun5i.c @@ -318,7 +318,7 @@ MODULE_DEVICE_TABLE(of, sun5i_timer_of_match); static struct platform_driver sun5i_timer_driver = { .probe = sun5i_timer_probe, - .remove_new = sun5i_timer_remove, + .remove = sun5i_timer_remove, .driver = { .name = "sun5i-timer", .of_match_table = sun5i_timer_of_match, diff --git a/drivers/clocksource/timer-tegra186.c b/drivers/clocksource/timer-tegra186.c index 304537dadf2c..5d4cf5237a11 100644 --- a/drivers/clocksource/timer-tegra186.c +++ b/drivers/clocksource/timer-tegra186.c @@ -502,7 +502,7 @@ static struct platform_driver tegra186_wdt_driver = { .of_match_table = tegra186_timer_of_match, }, .probe = tegra186_timer_probe, - .remove_new = tegra186_timer_remove, + .remove = tegra186_timer_remove, }; module_platform_driver(tegra186_wdt_driver); diff --git a/drivers/clocksource/timer-ti-dm.c b/drivers/clocksource/timer-ti-dm.c index 3666d94cc8dd..e9e32df6b566 100644 --- a/drivers/clocksource/timer-ti-dm.c +++ b/drivers/clocksource/timer-ti-dm.c @@ -1295,7 +1295,7 @@ MODULE_DEVICE_TABLE(of, omap_timer_match); static struct platform_driver omap_dm_timer_driver = { .probe = omap_dm_timer_probe, - .remove_new = omap_dm_timer_remove, + .remove = omap_dm_timer_remove, .driver = { .name = "omap_timer", .of_match_table = omap_timer_match, diff --git a/drivers/counter/104-quad-8.c b/drivers/counter/104-quad-8.c index 4a6868b8f58b..ce81fc4e1ae7 100644 --- a/drivers/counter/104-quad-8.c +++ b/drivers/counter/104-quad-8.c @@ -1360,4 +1360,4 @@ module_isa_driver_with_irq(quad8_driver, num_quad8, num_irq); MODULE_AUTHOR("William Breathitt Gray "); MODULE_DESCRIPTION("ACCES 104-QUAD-8 
driver"); MODULE_LICENSE("GPL v2"); -MODULE_IMPORT_NS(COUNTER); +MODULE_IMPORT_NS("COUNTER"); diff --git a/drivers/counter/counter-chrdev.c b/drivers/counter/counter-chrdev.c index 3ee75e1a78cd..23fdf0caf712 100644 --- a/drivers/counter/counter-chrdev.c +++ b/drivers/counter/counter-chrdev.c @@ -672,4 +672,4 @@ exit_early: if (copied) wake_up_poll(&counter->events_wait, EPOLLIN); } -EXPORT_SYMBOL_NS_GPL(counter_push_event, COUNTER); +EXPORT_SYMBOL_NS_GPL(counter_push_event, "COUNTER"); diff --git a/drivers/counter/counter-core.c b/drivers/counter/counter-core.c index 893b4f0726d2..50bd30ba3d03 100644 --- a/drivers/counter/counter-core.c +++ b/drivers/counter/counter-core.c @@ -74,7 +74,7 @@ void *counter_priv(const struct counter_device *const counter) return &ch->privdata; } -EXPORT_SYMBOL_NS_GPL(counter_priv, COUNTER); +EXPORT_SYMBOL_NS_GPL(counter_priv, "COUNTER"); /** * counter_alloc - allocate a counter_device @@ -134,13 +134,13 @@ err_ida_alloc: return NULL; } -EXPORT_SYMBOL_NS_GPL(counter_alloc, COUNTER); +EXPORT_SYMBOL_NS_GPL(counter_alloc, "COUNTER"); void counter_put(struct counter_device *counter) { put_device(&counter->dev); } -EXPORT_SYMBOL_NS_GPL(counter_put, COUNTER); +EXPORT_SYMBOL_NS_GPL(counter_put, "COUNTER"); /** * counter_add - complete registration of a counter @@ -167,7 +167,7 @@ int counter_add(struct counter_device *counter) /* implies device_add(dev) */ return cdev_device_add(&counter->chrdev, dev); } -EXPORT_SYMBOL_NS_GPL(counter_add, COUNTER); +EXPORT_SYMBOL_NS_GPL(counter_add, "COUNTER"); /** * counter_unregister - unregister Counter from the system @@ -189,7 +189,7 @@ void counter_unregister(struct counter_device *const counter) mutex_unlock(&counter->ops_exist_lock); } -EXPORT_SYMBOL_NS_GPL(counter_unregister, COUNTER); +EXPORT_SYMBOL_NS_GPL(counter_unregister, "COUNTER"); static void devm_counter_release(void *counter) { @@ -224,7 +224,7 @@ struct counter_device *devm_counter_alloc(struct device *dev, size_t sizeof_priv return 
counter; } -EXPORT_SYMBOL_NS_GPL(devm_counter_alloc, COUNTER); +EXPORT_SYMBOL_NS_GPL(devm_counter_alloc, "COUNTER"); /** * devm_counter_add - complete registration of a counter @@ -245,7 +245,7 @@ int devm_counter_add(struct device *dev, return devm_add_action_or_reset(dev, devm_counter_release, counter); } -EXPORT_SYMBOL_NS_GPL(devm_counter_add, COUNTER); +EXPORT_SYMBOL_NS_GPL(devm_counter_add, "COUNTER"); #define COUNTER_DEV_MAX 256 diff --git a/drivers/counter/ftm-quaddec.c b/drivers/counter/ftm-quaddec.c index 6ac4efb5658b..c47741292ae1 100644 --- a/drivers/counter/ftm-quaddec.c +++ b/drivers/counter/ftm-quaddec.c @@ -327,4 +327,4 @@ MODULE_DESCRIPTION("Flex Timer Module Quadrature decoder"); MODULE_LICENSE("GPL"); MODULE_AUTHOR("Kjeld Flarup "); MODULE_AUTHOR("Patrick Havelange "); -MODULE_IMPORT_NS(COUNTER); +MODULE_IMPORT_NS("COUNTER"); diff --git a/drivers/counter/i8254.c b/drivers/counter/i8254.c index 6d74e8ef92f0..95ad928725ec 100644 --- a/drivers/counter/i8254.c +++ b/drivers/counter/i8254.c @@ -439,9 +439,9 @@ int devm_i8254_regmap_register(struct device *const dev, return 0; } -EXPORT_SYMBOL_NS_GPL(devm_i8254_regmap_register, I8254); +EXPORT_SYMBOL_NS_GPL(devm_i8254_regmap_register, "I8254"); MODULE_AUTHOR("William Breathitt Gray"); MODULE_DESCRIPTION("Intel 8254 Programmable Interval Timer"); MODULE_LICENSE("GPL"); -MODULE_IMPORT_NS(COUNTER); +MODULE_IMPORT_NS("COUNTER"); diff --git a/drivers/counter/intel-qep.c b/drivers/counter/intel-qep.c index ee2bae27b728..c49c178056f4 100644 --- a/drivers/counter/intel-qep.c +++ b/drivers/counter/intel-qep.c @@ -519,4 +519,4 @@ MODULE_AUTHOR("Jarkko Nikula "); MODULE_AUTHOR("Raymond Tan "); MODULE_LICENSE("GPL"); MODULE_DESCRIPTION("Intel Quadrature Encoder Peripheral driver"); -MODULE_IMPORT_NS(COUNTER); +MODULE_IMPORT_NS("COUNTER"); diff --git a/drivers/counter/interrupt-cnt.c b/drivers/counter/interrupt-cnt.c index 229473855c5b..949598d51575 100644 --- a/drivers/counter/interrupt-cnt.c +++ 
b/drivers/counter/interrupt-cnt.c @@ -253,4 +253,4 @@ MODULE_ALIAS("platform:interrupt-counter"); MODULE_AUTHOR("Oleksij Rempel "); MODULE_DESCRIPTION("Interrupt counter driver"); MODULE_LICENSE("GPL v2"); -MODULE_IMPORT_NS(COUNTER); +MODULE_IMPORT_NS("COUNTER"); diff --git a/drivers/counter/microchip-tcb-capture.c b/drivers/counter/microchip-tcb-capture.c index b3e615cbd2ca..2f096a5b973d 100644 --- a/drivers/counter/microchip-tcb-capture.c +++ b/drivers/counter/microchip-tcb-capture.c @@ -403,4 +403,4 @@ module_platform_driver(mchp_tc_driver); MODULE_AUTHOR("Kamel Bouhara "); MODULE_DESCRIPTION("Microchip TCB Capture driver"); MODULE_LICENSE("GPL v2"); -MODULE_IMPORT_NS(COUNTER); +MODULE_IMPORT_NS("COUNTER"); diff --git a/drivers/counter/rz-mtu3-cnt.c b/drivers/counter/rz-mtu3-cnt.c index ee821493b166..e755d54dfece 100644 --- a/drivers/counter/rz-mtu3-cnt.c +++ b/drivers/counter/rz-mtu3-cnt.c @@ -903,4 +903,4 @@ MODULE_AUTHOR("Biju Das "); MODULE_ALIAS("platform:rz-mtu3-counter"); MODULE_DESCRIPTION("Renesas RZ/G2L MTU3a counter driver"); MODULE_LICENSE("GPL"); -MODULE_IMPORT_NS(COUNTER); +MODULE_IMPORT_NS("COUNTER"); diff --git a/drivers/counter/stm32-lptimer-cnt.c b/drivers/counter/stm32-lptimer-cnt.c index 8439755559b2..cf73f65baf60 100644 --- a/drivers/counter/stm32-lptimer-cnt.c +++ b/drivers/counter/stm32-lptimer-cnt.c @@ -520,4 +520,4 @@ MODULE_AUTHOR("Fabrice Gasnier "); MODULE_ALIAS("platform:stm32-lptimer-counter"); MODULE_DESCRIPTION("STMicroelectronics STM32 LPTIM counter driver"); MODULE_LICENSE("GPL v2"); -MODULE_IMPORT_NS(COUNTER); +MODULE_IMPORT_NS("COUNTER"); diff --git a/drivers/counter/stm32-timer-cnt.c b/drivers/counter/stm32-timer-cnt.c index 87b6ec567b54..e75b69476a00 100644 --- a/drivers/counter/stm32-timer-cnt.c +++ b/drivers/counter/stm32-timer-cnt.c @@ -864,4 +864,4 @@ MODULE_AUTHOR("Benjamin Gaignard "); MODULE_ALIAS("platform:stm32-timer-counter"); MODULE_DESCRIPTION("STMicroelectronics STM32 TIMER counter driver"); MODULE_LICENSE("GPL 
v2"); -MODULE_IMPORT_NS(COUNTER); +MODULE_IMPORT_NS("COUNTER"); diff --git a/drivers/counter/ti-ecap-capture.c b/drivers/counter/ti-ecap-capture.c index b119aeede693..3faaf7f60539 100644 --- a/drivers/counter/ti-ecap-capture.c +++ b/drivers/counter/ti-ecap-capture.c @@ -603,7 +603,7 @@ MODULE_DEVICE_TABLE(of, ecap_cnt_of_match); static struct platform_driver ecap_cnt_driver = { .probe = ecap_cnt_probe, - .remove_new = ecap_cnt_remove, + .remove = ecap_cnt_remove, .driver = { .name = "ecap-capture", .of_match_table = ecap_cnt_of_match, @@ -615,4 +615,4 @@ module_platform_driver(ecap_cnt_driver); MODULE_DESCRIPTION("ECAP Capture driver"); MODULE_AUTHOR("Julien Panis "); MODULE_LICENSE("GPL"); -MODULE_IMPORT_NS(COUNTER); +MODULE_IMPORT_NS("COUNTER"); diff --git a/drivers/counter/ti-eqep.c b/drivers/counter/ti-eqep.c index 313b91456f26..bc586eff0dae 100644 --- a/drivers/counter/ti-eqep.c +++ b/drivers/counter/ti-eqep.c @@ -548,7 +548,7 @@ MODULE_DEVICE_TABLE(of, ti_eqep_of_match); static struct platform_driver ti_eqep_driver = { .probe = ti_eqep_probe, - .remove_new = ti_eqep_remove, + .remove = ti_eqep_remove, .driver = { .name = "ti-eqep-cnt", .of_match_table = ti_eqep_of_match, @@ -559,4 +559,4 @@ module_platform_driver(ti_eqep_driver); MODULE_AUTHOR("David Lechner "); MODULE_DESCRIPTION("TI eQEP counter driver"); MODULE_LICENSE("GPL v2"); -MODULE_IMPORT_NS(COUNTER); +MODULE_IMPORT_NS("COUNTER"); diff --git a/drivers/cpufreq/Kconfig b/drivers/cpufreq/Kconfig index 92a83a9bb2e1..26e98fea991a 100644 --- a/drivers/cpufreq/Kconfig +++ b/drivers/cpufreq/Kconfig @@ -325,8 +325,6 @@ config QORIQ_CPUFREQ This adds the CPUFreq driver support for Freescale QorIQ SoCs which are capable of changing the CPU's frequency dynamically. -endif - config ACPI_CPPC_CPUFREQ tristate "CPUFreq driver based on the ACPI CPPC spec" depends on ACPI_PROCESSOR @@ -355,4 +353,6 @@ config ACPI_CPPC_CPUFREQ_FIE If in doubt, say N. 
+endif + endmenu diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c index d7630bab2516..66e5dfc711c0 100644 --- a/drivers/cpufreq/amd-pstate.c +++ b/drivers/cpufreq/amd-pstate.c @@ -374,15 +374,19 @@ static inline int amd_pstate_cppc_enable(bool enable) static int msr_init_perf(struct amd_cpudata *cpudata) { - u64 cap1; + u64 cap1, numerator; int ret = rdmsrl_safe_on_cpu(cpudata->cpu, MSR_AMD_CPPC_CAP1, &cap1); if (ret) return ret; - WRITE_ONCE(cpudata->highest_perf, AMD_CPPC_HIGHEST_PERF(cap1)); - WRITE_ONCE(cpudata->max_limit_perf, AMD_CPPC_HIGHEST_PERF(cap1)); + ret = amd_get_boost_ratio_numerator(cpudata->cpu, &numerator); + if (ret) + return ret; + + WRITE_ONCE(cpudata->highest_perf, numerator); + WRITE_ONCE(cpudata->max_limit_perf, numerator); WRITE_ONCE(cpudata->nominal_perf, AMD_CPPC_NOMINAL_PERF(cap1)); WRITE_ONCE(cpudata->lowest_nonlinear_perf, AMD_CPPC_LOWNONLIN_PERF(cap1)); WRITE_ONCE(cpudata->lowest_perf, AMD_CPPC_LOWEST_PERF(cap1)); @@ -394,13 +398,18 @@ static int msr_init_perf(struct amd_cpudata *cpudata) static int shmem_init_perf(struct amd_cpudata *cpudata) { struct cppc_perf_caps cppc_perf; + u64 numerator; int ret = cppc_get_perf_caps(cpudata->cpu, &cppc_perf); if (ret) return ret; - WRITE_ONCE(cpudata->highest_perf, cppc_perf.highest_perf); - WRITE_ONCE(cpudata->max_limit_perf, cppc_perf.highest_perf); + ret = amd_get_boost_ratio_numerator(cpudata->cpu, &numerator); + if (ret) + return ret; + + WRITE_ONCE(cpudata->highest_perf, numerator); + WRITE_ONCE(cpudata->max_limit_perf, numerator); WRITE_ONCE(cpudata->nominal_perf, cppc_perf.nominal_perf); WRITE_ONCE(cpudata->lowest_nonlinear_perf, cppc_perf.lowest_nonlinear_perf); @@ -561,16 +570,13 @@ static int amd_pstate_verify(struct cpufreq_policy_data *policy_data) static int amd_pstate_update_min_max_limit(struct cpufreq_policy *policy) { - u32 max_limit_perf, min_limit_perf, lowest_perf, max_perf; + u32 max_limit_perf, min_limit_perf, lowest_perf, max_perf, max_freq; 
struct amd_cpudata *cpudata = policy->driver_data; - if (cpudata->boost_supported && !policy->boost_enabled) - max_perf = READ_ONCE(cpudata->nominal_perf); - else - max_perf = READ_ONCE(cpudata->highest_perf); - - max_limit_perf = div_u64(policy->max * max_perf, policy->cpuinfo.max_freq); - min_limit_perf = div_u64(policy->min * max_perf, policy->cpuinfo.max_freq); + max_perf = READ_ONCE(cpudata->highest_perf); + max_freq = READ_ONCE(cpudata->max_freq); + max_limit_perf = div_u64(policy->max * max_perf, max_freq); + min_limit_perf = div_u64(policy->min * max_perf, max_freq); lowest_perf = READ_ONCE(cpudata->lowest_perf); if (min_limit_perf < lowest_perf) @@ -889,7 +895,6 @@ static int amd_pstate_init_freq(struct amd_cpudata *cpudata) { int ret; u32 min_freq, max_freq; - u64 numerator; u32 nominal_perf, nominal_freq; u32 lowest_nonlinear_perf, lowest_nonlinear_freq; u32 boost_ratio, lowest_nonlinear_ratio; @@ -911,10 +916,7 @@ static int amd_pstate_init_freq(struct amd_cpudata *cpudata) nominal_perf = READ_ONCE(cpudata->nominal_perf); - ret = amd_get_boost_ratio_numerator(cpudata->cpu, &numerator); - if (ret) - return ret; - boost_ratio = div_u64(numerator << SCHED_CAPACITY_SHIFT, nominal_perf); + boost_ratio = div_u64(cpudata->highest_perf << SCHED_CAPACITY_SHIFT, nominal_perf); max_freq = (nominal_freq * boost_ratio >> SCHED_CAPACITY_SHIFT) * 1000; lowest_nonlinear_perf = READ_ONCE(cpudata->lowest_nonlinear_perf); @@ -1869,18 +1871,18 @@ static int __init amd_pstate_init(void) static_call_update(amd_pstate_update_perf, shmem_update_perf); } - ret = amd_pstate_register_driver(cppc_state); - if (ret) { - pr_err("failed to register with return %d\n", ret); - return ret; - } - if (amd_pstate_prefcore) { ret = amd_detect_prefcore(&amd_pstate_prefcore); if (ret) return ret; } + ret = amd_pstate_register_driver(cppc_state); + if (ret) { + pr_err("failed to register with return %d\n", ret); + return ret; + } + dev_root = bus_get_dev_root(&cpu_subsys); if (dev_root) { ret 
= sysfs_create_group(&dev_root->kobj, &amd_pstate_global_attr_group); diff --git a/drivers/cpuidle/cpuidle-kirkwood.c b/drivers/cpuidle/cpuidle-kirkwood.c index 602c4dfdd7e2..5235e6e8f360 100644 --- a/drivers/cpuidle/cpuidle-kirkwood.c +++ b/drivers/cpuidle/cpuidle-kirkwood.c @@ -66,7 +66,7 @@ static void kirkwood_cpuidle_remove(struct platform_device *pdev) static struct platform_driver kirkwood_cpuidle_driver = { .probe = kirkwood_cpuidle_probe, - .remove_new = kirkwood_cpuidle_remove, + .remove = kirkwood_cpuidle_remove, .driver = { .name = "kirkwood_cpuidle", }, diff --git a/drivers/cpuidle/cpuidle-riscv-sbi.c b/drivers/cpuidle/cpuidle-riscv-sbi.c index 14462c092039..0c92a628bbd4 100644 --- a/drivers/cpuidle/cpuidle-riscv-sbi.c +++ b/drivers/cpuidle/cpuidle-riscv-sbi.c @@ -504,12 +504,12 @@ static int sbi_cpuidle_probe(struct platform_device *pdev) int cpu, ret; struct cpuidle_driver *drv; struct cpuidle_device *dev; - struct device_node *np, *pds_node; + struct device_node *pds_node; /* Detect OSI support based on CPU DT nodes */ sbi_cpuidle_use_osi = true; for_each_possible_cpu(cpu) { - np = of_cpu_device_node_get(cpu); + struct device_node *np __free(device_node) = of_cpu_device_node_get(cpu); if (np && of_property_present(np, "power-domains") && of_property_present(np, "power-domain-names")) { diff --git a/drivers/cpuidle/governors/teo.c b/drivers/cpuidle/governors/teo.c index f2992f92d8db..173ddcac540a 100644 --- a/drivers/cpuidle/governors/teo.c +++ b/drivers/cpuidle/governors/teo.c @@ -10,25 +10,27 @@ * DOC: teo-description * * The idea of this governor is based on the observation that on many systems - * timer events are two or more orders of magnitude more frequent than any - * other interrupts, so they are likely to be the most significant cause of CPU - * wakeups from idle states. 
Moreover, information about what happened in the - * (relatively recent) past can be used to estimate whether or not the deepest - * idle state with target residency within the (known) time till the closest - * timer event, referred to as the sleep length, is likely to be suitable for - * the upcoming CPU idle period and, if not, then which of the shallower idle - * states to choose instead of it. + * timer interrupts are two or more orders of magnitude more frequent than any + * other interrupt types, so they are likely to dominate CPU wakeup patterns. + * Moreover, in principle, the time when the next timer event is going to occur + * can be determined at the idle state selection time, although doing that may + * be costly, so it can be regarded as the most reliable source of information + * for idle state selection. * - * Of course, non-timer wakeup sources are more important in some use cases - * which can be covered by taking a few most recent idle time intervals of the - * CPU into account. However, even in that context it is not necessary to - * consider idle duration values greater than the sleep length, because the - * closest timer will ultimately wake up the CPU anyway unless it is woken up - * earlier. + * Of course, non-timer wakeup sources are more important in some use cases, + * but even then it is generally unnecessary to consider idle duration values + * greater than the time time till the next timer event, referred as the sleep + * length in what follows, because the closest timer will ultimately wake up the + * CPU anyway unless it is woken up earlier. * - * Thus this governor estimates whether or not the prospective idle duration of - * a CPU is likely to be significantly shorter than the sleep length and selects - * an idle state for it accordingly. 
+ * However, since obtaining the sleep length may be costly, the governor first + * checks if it can select a shallow idle state using wakeup pattern information + * from recent times, in which case it can do without knowing the sleep length + * at all. For this purpose, it counts CPU wakeup events and looks for an idle + * state whose target residency has not exceeded the idle duration (measured + * after wakeup) in the majority of relevant recent cases. If the target + * residency of that state is small enough, it may be used right away and the + * sleep length need not be determined. * * The computations carried out by this governor are based on using bins whose * boundaries are aligned with the target residency parameter values of the CPU @@ -39,7 +41,11 @@ * idle state 2, the third bin spans from the target residency of idle state 2 * up to, but not including, the target residency of idle state 3 and so on. * The last bin spans from the target residency of the deepest idle state - * supplied by the driver to infinity. + * supplied by the driver to the scheduler tick period length or to infinity if + * the tick period length is less than the target residency of that state. In + * the latter case, the governor also counts events with the measured idle + * duration between the tick period length and the target residency of the + * deepest idle state. * * Two metrics called "hits" and "intercepts" are associated with each bin. * They are updated every time before selecting an idle state for the given CPU @@ -49,47 +55,46 @@ * sleep length and the idle duration measured after CPU wakeup fall into the * same bin (that is, the CPU appears to wake up "on time" relative to the sleep * length). 
In turn, the "intercepts" metric reflects the relative frequency of - * situations in which the measured idle duration is so much shorter than the - * sleep length that the bin it falls into corresponds to an idle state - * shallower than the one whose bin is fallen into by the sleep length (these - * situations are referred to as "intercepts" below). + * non-timer wakeup events for which the measured idle duration falls into a bin + * that corresponds to an idle state shallower than the one whose bin is fallen + * into by the sleep length (these events are also referred to as "intercepts" + * below). * * In order to select an idle state for a CPU, the governor takes the following * steps (modulo the possible latency constraint that must be taken into account * too): * - * 1. Find the deepest CPU idle state whose target residency does not exceed - * the current sleep length (the candidate idle state) and compute 2 sums as - * follows: + * 1. Find the deepest enabled CPU idle state (the candidate idle state) and + * compute 2 sums as follows: * - * - The sum of the "hits" and "intercepts" metrics for the candidate state - * and all of the deeper idle states (it represents the cases in which the - * CPU was idle long enough to avoid being intercepted if the sleep length - * had been equal to the current one). + * - The sum of the "hits" metric for all of the idle states shallower than + * the candidate one (it represents the cases in which the CPU was likely + * woken up by a timer). * - * - The sum of the "intercepts" metrics for all of the idle states shallower - * than the candidate one (it represents the cases in which the CPU was not - * idle long enough to avoid being intercepted if the sleep length had been - * equal to the current one). + * - The sum of the "intercepts" metric for all of the idle states shallower + * than the candidate one (it represents the cases in which the CPU was + * likely woken up by a non-timer wakeup source). * - * 2. 
If the second sum is greater than the first one the CPU is likely to wake - * up early, so look for an alternative idle state to select. + * 2. If the second sum computed in step 1 is greater than a half of the sum of + * both metrics for the candidate state bin and all subsequent bins(if any), + * a shallower idle state is likely to be more suitable, so look for it. * - * - Traverse the idle states shallower than the candidate one in the + * - Traverse the enabled idle states shallower than the candidate one in the * descending order. * * - For each of them compute the sum of the "intercepts" metrics over all * of the idle states between it and the candidate one (including the * former and excluding the latter). * - * - If each of these sums that needs to be taken into account (because the - * check related to it has indicated that the CPU is likely to wake up - * early) is greater than a half of the corresponding sum computed in step - * 1 (which means that the target residency of the state in question had - * not exceeded the idle duration in over a half of the relevant cases), - * select the given idle state instead of the candidate one. + * - If this sum is greater than a half of the second sum computed in step 1, + * use the given idle state as the new candidate one. * - * 3. By default, select the candidate state. + * 3. If the current candidate state is state 0 or its target residency is short + * enough, return it and prevent the scheduler tick from being stopped. + * + * 4. Obtain the sleep length value and check if it is below the target + * residency of the current candidate state, in which case a new shallower + * candidate state needs to be found, so look for it. 
*/ #include diff --git a/drivers/crypto/geode-aes.c b/drivers/crypto/geode-aes.c index fa5a9f207bc9..d933f26aeb3a 100644 --- a/drivers/crypto/geode-aes.c +++ b/drivers/crypto/geode-aes.c @@ -433,4 +433,4 @@ module_pci_driver(geode_aes_driver); MODULE_AUTHOR("Advanced Micro Devices, Inc."); MODULE_DESCRIPTION("Geode LX Hardware AES driver"); MODULE_LICENSE("GPL"); -MODULE_IMPORT_NS(CRYPTO_INTERNAL); +MODULE_IMPORT_NS("CRYPTO_INTERNAL"); diff --git a/drivers/crypto/hisilicon/debugfs.c b/drivers/crypto/hisilicon/debugfs.c index 1b9b7bccdeff..45e130b901eb 100644 --- a/drivers/crypto/hisilicon/debugfs.c +++ b/drivers/crypto/hisilicon/debugfs.c @@ -192,7 +192,7 @@ static int qm_sqc_dump(struct hisi_qm *qm, char *s, char *name) down_read(&qm->qps_lock); if (qm->sqc) { - memcpy(&sqc, qm->sqc + qp_id * sizeof(struct qm_sqc), sizeof(struct qm_sqc)); + memcpy(&sqc, qm->sqc + qp_id, sizeof(struct qm_sqc)); sqc.base_h = cpu_to_le32(QM_XQC_ADDR_MASK); sqc.base_l = cpu_to_le32(QM_XQC_ADDR_MASK); dump_show(qm, &sqc, sizeof(struct qm_sqc), "SOFT SQC"); @@ -229,7 +229,7 @@ static int qm_cqc_dump(struct hisi_qm *qm, char *s, char *name) down_read(&qm->qps_lock); if (qm->cqc) { - memcpy(&cqc, qm->cqc + qp_id * sizeof(struct qm_cqc), sizeof(struct qm_cqc)); + memcpy(&cqc, qm->cqc + qp_id, sizeof(struct qm_cqc)); cqc.base_h = cpu_to_le32(QM_XQC_ADDR_MASK); cqc.base_l = cpu_to_le32(QM_XQC_ADDR_MASK); dump_show(qm, &cqc, sizeof(struct qm_cqc), "SOFT CQC"); diff --git a/drivers/crypto/inside-secure/safexcel.c b/drivers/crypto/inside-secure/safexcel.c index 45758c7aa80e..9ca80d082c4f 100644 --- a/drivers/crypto/inside-secure/safexcel.c +++ b/drivers/crypto/inside-secure/safexcel.c @@ -2031,7 +2031,7 @@ MODULE_AUTHOR("Ofer Heifetz "); MODULE_AUTHOR("Igal Liberman "); MODULE_DESCRIPTION("Support for SafeXcel cryptographic engines: EIP97 & EIP197"); MODULE_LICENSE("GPL v2"); -MODULE_IMPORT_NS(CRYPTO_INTERNAL); +MODULE_IMPORT_NS("CRYPTO_INTERNAL"); MODULE_FIRMWARE("ifpp.bin"); 
MODULE_FIRMWARE("ipue.bin"); diff --git a/drivers/crypto/intel/iaa/Makefile b/drivers/crypto/intel/iaa/Makefile index b64b208d2344..55bda7770fac 100644 --- a/drivers/crypto/intel/iaa/Makefile +++ b/drivers/crypto/intel/iaa/Makefile @@ -3,7 +3,7 @@ # Makefile for IAA crypto device drivers # -ccflags-y += -I $(srctree)/drivers/dma/idxd -DDEFAULT_SYMBOL_NAMESPACE=IDXD +ccflags-y += -I $(srctree)/drivers/dma/idxd -DDEFAULT_SYMBOL_NAMESPACE='"IDXD"' obj-$(CONFIG_CRYPTO_DEV_IAA_CRYPTO) := iaa_crypto.o diff --git a/drivers/crypto/intel/iaa/iaa_crypto_main.c b/drivers/crypto/intel/iaa/iaa_crypto_main.c index 8fced88d3d06..9e557649e5d0 100644 --- a/drivers/crypto/intel/iaa/iaa_crypto_main.c +++ b/drivers/crypto/intel/iaa/iaa_crypto_main.c @@ -2094,7 +2094,7 @@ static void __exit iaa_crypto_cleanup_module(void) pr_debug("cleaned up\n"); } -MODULE_IMPORT_NS(IDXD); +MODULE_IMPORT_NS("IDXD"); MODULE_LICENSE("GPL"); MODULE_ALIAS_IDXD_DEVICE(0); MODULE_AUTHOR("Intel Corporation"); diff --git a/drivers/crypto/intel/qat/qat_420xx/adf_drv.c b/drivers/crypto/intel/qat/qat_420xx/adf_drv.c index 788a11cdb34b..9589d60fb281 100644 --- a/drivers/crypto/intel/qat/qat_420xx/adf_drv.c +++ b/drivers/crypto/intel/qat/qat_420xx/adf_drv.c @@ -204,4 +204,4 @@ MODULE_FIRMWARE(ADF_420XX_MMP); MODULE_DESCRIPTION("Intel(R) QuickAssist Technology"); MODULE_VERSION(ADF_DRV_VERSION); MODULE_SOFTDEP("pre: crypto-intel_qat"); -MODULE_IMPORT_NS(CRYPTO_QAT); +MODULE_IMPORT_NS("CRYPTO_QAT"); diff --git a/drivers/crypto/intel/qat/qat_4xxx/adf_drv.c b/drivers/crypto/intel/qat/qat_4xxx/adf_drv.c index 115eabfd1f6b..d7de1cad1335 100644 --- a/drivers/crypto/intel/qat/qat_4xxx/adf_drv.c +++ b/drivers/crypto/intel/qat/qat_4xxx/adf_drv.c @@ -208,4 +208,4 @@ MODULE_FIRMWARE(ADF_402XX_MMP); MODULE_DESCRIPTION("Intel(R) QuickAssist Technology"); MODULE_VERSION(ADF_DRV_VERSION); MODULE_SOFTDEP("pre: crypto-intel_qat"); -MODULE_IMPORT_NS(CRYPTO_QAT); +MODULE_IMPORT_NS("CRYPTO_QAT"); diff --git 
a/drivers/crypto/intel/qat/qat_c3xxx/adf_drv.c b/drivers/crypto/intel/qat/qat_c3xxx/adf_drv.c index 4d18057745d4..caa53882fda6 100644 --- a/drivers/crypto/intel/qat/qat_c3xxx/adf_drv.c +++ b/drivers/crypto/intel/qat/qat_c3xxx/adf_drv.c @@ -252,4 +252,4 @@ MODULE_FIRMWARE(ADF_C3XXX_FW); MODULE_FIRMWARE(ADF_C3XXX_MMP); MODULE_DESCRIPTION("Intel(R) QuickAssist Technology"); MODULE_VERSION(ADF_DRV_VERSION); -MODULE_IMPORT_NS(CRYPTO_QAT); +MODULE_IMPORT_NS("CRYPTO_QAT"); diff --git a/drivers/crypto/intel/qat/qat_c3xxxvf/adf_drv.c b/drivers/crypto/intel/qat/qat_c3xxxvf/adf_drv.c index f0023cfb234c..c622793e94a8 100644 --- a/drivers/crypto/intel/qat/qat_c3xxxvf/adf_drv.c +++ b/drivers/crypto/intel/qat/qat_c3xxxvf/adf_drv.c @@ -226,4 +226,4 @@ MODULE_LICENSE("Dual BSD/GPL"); MODULE_AUTHOR("Intel"); MODULE_DESCRIPTION("Intel(R) QuickAssist Technology"); MODULE_VERSION(ADF_DRV_VERSION); -MODULE_IMPORT_NS(CRYPTO_QAT); +MODULE_IMPORT_NS("CRYPTO_QAT"); diff --git a/drivers/crypto/intel/qat/qat_c62x/adf_drv.c b/drivers/crypto/intel/qat/qat_c62x/adf_drv.c index e6b5de55434e..b7398fee19ed 100644 --- a/drivers/crypto/intel/qat/qat_c62x/adf_drv.c +++ b/drivers/crypto/intel/qat/qat_c62x/adf_drv.c @@ -252,4 +252,4 @@ MODULE_FIRMWARE(ADF_C62X_FW); MODULE_FIRMWARE(ADF_C62X_MMP); MODULE_DESCRIPTION("Intel(R) QuickAssist Technology"); MODULE_VERSION(ADF_DRV_VERSION); -MODULE_IMPORT_NS(CRYPTO_QAT); +MODULE_IMPORT_NS("CRYPTO_QAT"); diff --git a/drivers/crypto/intel/qat/qat_c62xvf/adf_drv.c b/drivers/crypto/intel/qat/qat_c62xvf/adf_drv.c index 2bd5b0ff00e3..4840d44bbd5b 100644 --- a/drivers/crypto/intel/qat/qat_c62xvf/adf_drv.c +++ b/drivers/crypto/intel/qat/qat_c62xvf/adf_drv.c @@ -226,4 +226,4 @@ MODULE_LICENSE("Dual BSD/GPL"); MODULE_AUTHOR("Intel"); MODULE_DESCRIPTION("Intel(R) QuickAssist Technology"); MODULE_VERSION(ADF_DRV_VERSION); -MODULE_IMPORT_NS(CRYPTO_QAT); +MODULE_IMPORT_NS("CRYPTO_QAT"); diff --git a/drivers/crypto/intel/qat/qat_common/Makefile 
b/drivers/crypto/intel/qat/qat_common/Makefile index eac73cbfdd38..7acf9c576149 100644 --- a/drivers/crypto/intel/qat/qat_common/Makefile +++ b/drivers/crypto/intel/qat/qat_common/Makefile @@ -1,6 +1,6 @@ # SPDX-License-Identifier: GPL-2.0 obj-$(CONFIG_CRYPTO_DEV_QAT) += intel_qat.o -ccflags-y += -DDEFAULT_SYMBOL_NAMESPACE=CRYPTO_QAT +ccflags-y += -DDEFAULT_SYMBOL_NAMESPACE='"CRYPTO_QAT"' intel_qat-objs := adf_cfg.o \ adf_isr.o \ adf_ctl_drv.o \ diff --git a/drivers/crypto/intel/qat/qat_common/adf_ctl_drv.c b/drivers/crypto/intel/qat/qat_common/adf_ctl_drv.c index 70fa0f6497a9..48c62a14a6a7 100644 --- a/drivers/crypto/intel/qat/qat_common/adf_ctl_drv.c +++ b/drivers/crypto/intel/qat/qat_common/adf_ctl_drv.c @@ -475,4 +475,4 @@ MODULE_AUTHOR("Intel"); MODULE_DESCRIPTION("Intel(R) QuickAssist Technology"); MODULE_ALIAS_CRYPTO("intel_qat"); MODULE_VERSION(ADF_DRV_VERSION); -MODULE_IMPORT_NS(CRYPTO_INTERNAL); +MODULE_IMPORT_NS("CRYPTO_INTERNAL"); diff --git a/drivers/crypto/intel/qat/qat_dh895xcc/adf_drv.c b/drivers/crypto/intel/qat/qat_dh895xcc/adf_drv.c index 2a50cce41515..3137fc3b5cf6 100644 --- a/drivers/crypto/intel/qat/qat_dh895xcc/adf_drv.c +++ b/drivers/crypto/intel/qat/qat_dh895xcc/adf_drv.c @@ -252,4 +252,4 @@ MODULE_FIRMWARE(ADF_DH895XCC_FW); MODULE_FIRMWARE(ADF_DH895XCC_MMP); MODULE_DESCRIPTION("Intel(R) QuickAssist Technology"); MODULE_VERSION(ADF_DRV_VERSION); -MODULE_IMPORT_NS(CRYPTO_QAT); +MODULE_IMPORT_NS("CRYPTO_QAT"); diff --git a/drivers/crypto/intel/qat/qat_dh895xccvf/adf_drv.c b/drivers/crypto/intel/qat/qat_dh895xccvf/adf_drv.c index 7cb015b55122..7cd528ee31e7 100644 --- a/drivers/crypto/intel/qat/qat_dh895xccvf/adf_drv.c +++ b/drivers/crypto/intel/qat/qat_dh895xccvf/adf_drv.c @@ -226,4 +226,4 @@ MODULE_LICENSE("Dual BSD/GPL"); MODULE_AUTHOR("Intel"); MODULE_DESCRIPTION("Intel(R) QuickAssist Technology"); MODULE_VERSION(ADF_DRV_VERSION); -MODULE_IMPORT_NS(CRYPTO_QAT); +MODULE_IMPORT_NS("CRYPTO_QAT"); diff --git 
a/drivers/crypto/marvell/octeontx2/cn10k_cpt.c b/drivers/crypto/marvell/octeontx2/cn10k_cpt.c index 6bfc59e67747..5cae8fafa151 100644 --- a/drivers/crypto/marvell/octeontx2/cn10k_cpt.c +++ b/drivers/crypto/marvell/octeontx2/cn10k_cpt.c @@ -73,7 +73,7 @@ int cn10k_cptpf_lmtst_init(struct otx2_cptpf_dev *cptpf) return 0; } -EXPORT_SYMBOL_NS_GPL(cn10k_cptpf_lmtst_init, CRYPTO_DEV_OCTEONTX2_CPT); +EXPORT_SYMBOL_NS_GPL(cn10k_cptpf_lmtst_init, "CRYPTO_DEV_OCTEONTX2_CPT"); int cn10k_cptvf_lmtst_init(struct otx2_cptvf_dev *cptvf) { @@ -94,7 +94,7 @@ int cn10k_cptvf_lmtst_init(struct otx2_cptvf_dev *cptvf) return 0; } -EXPORT_SYMBOL_NS_GPL(cn10k_cptvf_lmtst_init, CRYPTO_DEV_OCTEONTX2_CPT); +EXPORT_SYMBOL_NS_GPL(cn10k_cptvf_lmtst_init, "CRYPTO_DEV_OCTEONTX2_CPT"); void cn10k_cpt_hw_ctx_clear(struct pci_dev *pdev, struct cn10k_cpt_errata_ctx *er_ctx) @@ -110,7 +110,7 @@ void cn10k_cpt_hw_ctx_clear(struct pci_dev *pdev, DMA_BIDIRECTIONAL); kfree(er_ctx->hw_ctx); } -EXPORT_SYMBOL_NS_GPL(cn10k_cpt_hw_ctx_clear, CRYPTO_DEV_OCTEONTX2_CPT); +EXPORT_SYMBOL_NS_GPL(cn10k_cpt_hw_ctx_clear, "CRYPTO_DEV_OCTEONTX2_CPT"); void cn10k_cpt_hw_ctx_set(union cn10k_cpt_hw_ctx *hctx, u16 ctx_sz) { @@ -119,7 +119,7 @@ void cn10k_cpt_hw_ctx_set(union cn10k_cpt_hw_ctx *hctx, u16 ctx_sz) hctx->w0.ctx_sz = ctx_sz; hctx->w0.ctx_push_sz = 1; } -EXPORT_SYMBOL_NS_GPL(cn10k_cpt_hw_ctx_set, CRYPTO_DEV_OCTEONTX2_CPT); +EXPORT_SYMBOL_NS_GPL(cn10k_cpt_hw_ctx_set, "CRYPTO_DEV_OCTEONTX2_CPT"); int cn10k_cpt_hw_ctx_init(struct pci_dev *pdev, struct cn10k_cpt_errata_ctx *er_ctx) @@ -149,7 +149,7 @@ int cn10k_cpt_hw_ctx_init(struct pci_dev *pdev, return 0; } -EXPORT_SYMBOL_NS_GPL(cn10k_cpt_hw_ctx_init, CRYPTO_DEV_OCTEONTX2_CPT); +EXPORT_SYMBOL_NS_GPL(cn10k_cpt_hw_ctx_init, "CRYPTO_DEV_OCTEONTX2_CPT"); void cn10k_cpt_ctx_flush(struct pci_dev *pdev, u64 cptr, bool inval) { @@ -168,7 +168,7 @@ void cn10k_cpt_ctx_flush(struct pci_dev *pdev, u64 cptr, bool inval) otx2_cpt_read64(lfs->reg_base, lfs->blkaddr, 
lfs->lf[0].slot, OTX2_CPT_LF_CTX_ERR); } -EXPORT_SYMBOL_NS_GPL(cn10k_cpt_ctx_flush, CRYPTO_DEV_OCTEONTX2_CPT); +EXPORT_SYMBOL_NS_GPL(cn10k_cpt_ctx_flush, "CRYPTO_DEV_OCTEONTX2_CPT"); void cptvf_hw_ops_get(struct otx2_cptvf_dev *cptvf) { @@ -177,4 +177,4 @@ void cptvf_hw_ops_get(struct otx2_cptvf_dev *cptvf) else cptvf->lfs.ops = &otx2_hw_ops; } -EXPORT_SYMBOL_NS_GPL(cptvf_hw_ops_get, CRYPTO_DEV_OCTEONTX2_CPT); +EXPORT_SYMBOL_NS_GPL(cptvf_hw_ops_get, "CRYPTO_DEV_OCTEONTX2_CPT"); diff --git a/drivers/crypto/marvell/octeontx2/otx2_cpt_mbox_common.c b/drivers/crypto/marvell/octeontx2/otx2_cpt_mbox_common.c index 5be0103c1fb8..b8b7c8a3c0ca 100644 --- a/drivers/crypto/marvell/octeontx2/otx2_cpt_mbox_common.c +++ b/drivers/crypto/marvell/octeontx2/otx2_cpt_mbox_common.c @@ -19,7 +19,7 @@ int otx2_cpt_send_mbox_msg(struct otx2_mbox *mbox, struct pci_dev *pdev) } return ret; } -EXPORT_SYMBOL_NS_GPL(otx2_cpt_send_mbox_msg, CRYPTO_DEV_OCTEONTX2_CPT); +EXPORT_SYMBOL_NS_GPL(otx2_cpt_send_mbox_msg, "CRYPTO_DEV_OCTEONTX2_CPT"); int otx2_cpt_send_ready_msg(struct otx2_mbox *mbox, struct pci_dev *pdev) { @@ -37,13 +37,13 @@ int otx2_cpt_send_ready_msg(struct otx2_mbox *mbox, struct pci_dev *pdev) return otx2_cpt_send_mbox_msg(mbox, pdev); } -EXPORT_SYMBOL_NS_GPL(otx2_cpt_send_ready_msg, CRYPTO_DEV_OCTEONTX2_CPT); +EXPORT_SYMBOL_NS_GPL(otx2_cpt_send_ready_msg, "CRYPTO_DEV_OCTEONTX2_CPT"); int otx2_cpt_send_af_reg_requests(struct otx2_mbox *mbox, struct pci_dev *pdev) { return otx2_cpt_send_mbox_msg(mbox, pdev); } -EXPORT_SYMBOL_NS_GPL(otx2_cpt_send_af_reg_requests, CRYPTO_DEV_OCTEONTX2_CPT); +EXPORT_SYMBOL_NS_GPL(otx2_cpt_send_af_reg_requests, "CRYPTO_DEV_OCTEONTX2_CPT"); static int otx2_cpt_add_read_af_reg(struct otx2_mbox *mbox, struct pci_dev *pdev, u64 reg, @@ -95,7 +95,7 @@ int otx2_cpt_add_write_af_reg(struct otx2_mbox *mbox, struct pci_dev *pdev, return 0; } -EXPORT_SYMBOL_NS_GPL(otx2_cpt_add_write_af_reg, CRYPTO_DEV_OCTEONTX2_CPT); 
+EXPORT_SYMBOL_NS_GPL(otx2_cpt_add_write_af_reg, "CRYPTO_DEV_OCTEONTX2_CPT"); int otx2_cpt_read_af_reg(struct otx2_mbox *mbox, struct pci_dev *pdev, u64 reg, u64 *val, int blkaddr) @@ -108,7 +108,7 @@ int otx2_cpt_read_af_reg(struct otx2_mbox *mbox, struct pci_dev *pdev, return otx2_cpt_send_mbox_msg(mbox, pdev); } -EXPORT_SYMBOL_NS_GPL(otx2_cpt_read_af_reg, CRYPTO_DEV_OCTEONTX2_CPT); +EXPORT_SYMBOL_NS_GPL(otx2_cpt_read_af_reg, "CRYPTO_DEV_OCTEONTX2_CPT"); int otx2_cpt_write_af_reg(struct otx2_mbox *mbox, struct pci_dev *pdev, u64 reg, u64 val, int blkaddr) @@ -121,7 +121,7 @@ int otx2_cpt_write_af_reg(struct otx2_mbox *mbox, struct pci_dev *pdev, return otx2_cpt_send_mbox_msg(mbox, pdev); } -EXPORT_SYMBOL_NS_GPL(otx2_cpt_write_af_reg, CRYPTO_DEV_OCTEONTX2_CPT); +EXPORT_SYMBOL_NS_GPL(otx2_cpt_write_af_reg, "CRYPTO_DEV_OCTEONTX2_CPT"); int otx2_cpt_attach_rscrs_msg(struct otx2_cptlfs_info *lfs) { @@ -180,7 +180,7 @@ int otx2_cpt_detach_rsrcs_msg(struct otx2_cptlfs_info *lfs) return ret; } -EXPORT_SYMBOL_NS_GPL(otx2_cpt_detach_rsrcs_msg, CRYPTO_DEV_OCTEONTX2_CPT); +EXPORT_SYMBOL_NS_GPL(otx2_cpt_detach_rsrcs_msg, "CRYPTO_DEV_OCTEONTX2_CPT"); int otx2_cpt_msix_offset_msg(struct otx2_cptlfs_info *lfs) { @@ -213,7 +213,7 @@ int otx2_cpt_msix_offset_msg(struct otx2_cptlfs_info *lfs) } return ret; } -EXPORT_SYMBOL_NS_GPL(otx2_cpt_msix_offset_msg, CRYPTO_DEV_OCTEONTX2_CPT); +EXPORT_SYMBOL_NS_GPL(otx2_cpt_msix_offset_msg, "CRYPTO_DEV_OCTEONTX2_CPT"); int otx2_cpt_sync_mbox_msg(struct otx2_mbox *mbox) { @@ -228,7 +228,7 @@ int otx2_cpt_sync_mbox_msg(struct otx2_mbox *mbox) return otx2_mbox_check_rsp_msgs(mbox, 0); } -EXPORT_SYMBOL_NS_GPL(otx2_cpt_sync_mbox_msg, CRYPTO_DEV_OCTEONTX2_CPT); +EXPORT_SYMBOL_NS_GPL(otx2_cpt_sync_mbox_msg, "CRYPTO_DEV_OCTEONTX2_CPT"); int otx2_cpt_lf_reset_msg(struct otx2_cptlfs_info *lfs, int slot) { @@ -254,4 +254,4 @@ int otx2_cpt_lf_reset_msg(struct otx2_cptlfs_info *lfs, int slot) return ret; } -EXPORT_SYMBOL_NS_GPL(otx2_cpt_lf_reset_msg, 
CRYPTO_DEV_OCTEONTX2_CPT); +EXPORT_SYMBOL_NS_GPL(otx2_cpt_lf_reset_msg, "CRYPTO_DEV_OCTEONTX2_CPT"); diff --git a/drivers/crypto/marvell/octeontx2/otx2_cptlf.c b/drivers/crypto/marvell/octeontx2/otx2_cptlf.c index b52728e3c0d1..b5d66afcc030 100644 --- a/drivers/crypto/marvell/octeontx2/otx2_cptlf.c +++ b/drivers/crypto/marvell/octeontx2/otx2_cptlf.c @@ -288,8 +288,7 @@ void otx2_cptlf_unregister_misc_interrupts(struct otx2_cptlfs_info *lfs) cptlf_set_misc_intrs(lfs, false); } -EXPORT_SYMBOL_NS_GPL(otx2_cptlf_unregister_misc_interrupts, - CRYPTO_DEV_OCTEONTX2_CPT); +EXPORT_SYMBOL_NS_GPL(otx2_cptlf_unregister_misc_interrupts, "CRYPTO_DEV_OCTEONTX2_CPT"); void otx2_cptlf_unregister_done_interrupts(struct otx2_cptlfs_info *lfs) { @@ -308,8 +307,7 @@ void otx2_cptlf_unregister_done_interrupts(struct otx2_cptlfs_info *lfs) cptlf_set_done_intrs(lfs, false); } -EXPORT_SYMBOL_NS_GPL(otx2_cptlf_unregister_done_interrupts, - CRYPTO_DEV_OCTEONTX2_CPT); +EXPORT_SYMBOL_NS_GPL(otx2_cptlf_unregister_done_interrupts, "CRYPTO_DEV_OCTEONTX2_CPT"); static int cptlf_do_register_interrrupts(struct otx2_cptlfs_info *lfs, int lf_num, int irq_offset, @@ -351,8 +349,7 @@ free_irq: otx2_cptlf_unregister_misc_interrupts(lfs); return ret; } -EXPORT_SYMBOL_NS_GPL(otx2_cptlf_register_misc_interrupts, - CRYPTO_DEV_OCTEONTX2_CPT); +EXPORT_SYMBOL_NS_GPL(otx2_cptlf_register_misc_interrupts, "CRYPTO_DEV_OCTEONTX2_CPT"); int otx2_cptlf_register_done_interrupts(struct otx2_cptlfs_info *lfs) { @@ -375,8 +372,7 @@ free_irq: otx2_cptlf_unregister_done_interrupts(lfs); return ret; } -EXPORT_SYMBOL_NS_GPL(otx2_cptlf_register_done_interrupts, - CRYPTO_DEV_OCTEONTX2_CPT); +EXPORT_SYMBOL_NS_GPL(otx2_cptlf_register_done_interrupts, "CRYPTO_DEV_OCTEONTX2_CPT"); void otx2_cptlf_free_irqs_affinity(struct otx2_cptlfs_info *lfs) { @@ -390,7 +386,7 @@ void otx2_cptlf_free_irqs_affinity(struct otx2_cptlfs_info *lfs) free_cpumask_var(lfs->lf[slot].affinity_mask); } } -EXPORT_SYMBOL_NS_GPL(otx2_cptlf_free_irqs_affinity, 
CRYPTO_DEV_OCTEONTX2_CPT); +EXPORT_SYMBOL_NS_GPL(otx2_cptlf_free_irqs_affinity, "CRYPTO_DEV_OCTEONTX2_CPT"); int otx2_cptlf_set_irqs_affinity(struct otx2_cptlfs_info *lfs) { @@ -423,7 +419,7 @@ free_affinity_mask: otx2_cptlf_free_irqs_affinity(lfs); return ret; } -EXPORT_SYMBOL_NS_GPL(otx2_cptlf_set_irqs_affinity, CRYPTO_DEV_OCTEONTX2_CPT); +EXPORT_SYMBOL_NS_GPL(otx2_cptlf_set_irqs_affinity, "CRYPTO_DEV_OCTEONTX2_CPT"); int otx2_cptlf_init(struct otx2_cptlfs_info *lfs, u8 eng_grp_mask, int pri, int lfs_num) @@ -486,7 +482,7 @@ clear_lfs_num: lfs->lfs_num = 0; return ret; } -EXPORT_SYMBOL_NS_GPL(otx2_cptlf_init, CRYPTO_DEV_OCTEONTX2_CPT); +EXPORT_SYMBOL_NS_GPL(otx2_cptlf_init, "CRYPTO_DEV_OCTEONTX2_CPT"); void otx2_cptlf_shutdown(struct otx2_cptlfs_info *lfs) { @@ -498,7 +494,7 @@ void otx2_cptlf_shutdown(struct otx2_cptlfs_info *lfs) otx2_cpt_detach_rsrcs_msg(lfs); lfs->lfs_num = 0; } -EXPORT_SYMBOL_NS_GPL(otx2_cptlf_shutdown, CRYPTO_DEV_OCTEONTX2_CPT); +EXPORT_SYMBOL_NS_GPL(otx2_cptlf_shutdown, "CRYPTO_DEV_OCTEONTX2_CPT"); MODULE_AUTHOR("Marvell"); MODULE_DESCRIPTION("Marvell RVU CPT Common module"); diff --git a/drivers/crypto/marvell/octeontx2/otx2_cptpf_main.c b/drivers/crypto/marvell/octeontx2/otx2_cptpf_main.c index 94d0e73e42de..12971300296d 100644 --- a/drivers/crypto/marvell/octeontx2/otx2_cptpf_main.c +++ b/drivers/crypto/marvell/octeontx2/otx2_cptpf_main.c @@ -868,7 +868,7 @@ static struct pci_driver otx2_cpt_pci_driver = { module_pci_driver(otx2_cpt_pci_driver); -MODULE_IMPORT_NS(CRYPTO_DEV_OCTEONTX2_CPT); +MODULE_IMPORT_NS("CRYPTO_DEV_OCTEONTX2_CPT"); MODULE_AUTHOR("Marvell"); MODULE_DESCRIPTION(OTX2_CPT_DRV_STRING); diff --git a/drivers/crypto/marvell/octeontx2/otx2_cptvf_main.c b/drivers/crypto/marvell/octeontx2/otx2_cptvf_main.c index d0b6ee901f62..d84eebdf2fa8 100644 --- a/drivers/crypto/marvell/octeontx2/otx2_cptvf_main.c +++ b/drivers/crypto/marvell/octeontx2/otx2_cptvf_main.c @@ -453,7 +453,7 @@ static struct pci_driver otx2_cptvf_pci_driver = { 
module_pci_driver(otx2_cptvf_pci_driver); -MODULE_IMPORT_NS(CRYPTO_DEV_OCTEONTX2_CPT); +MODULE_IMPORT_NS("CRYPTO_DEV_OCTEONTX2_CPT"); MODULE_AUTHOR("Marvell"); MODULE_DESCRIPTION("Marvell RVU CPT Virtual Function Driver"); diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c index 432b7cfd12a8..cb14829bb9be 100644 --- a/drivers/cxl/acpi.c +++ b/drivers/cxl/acpi.c @@ -934,5 +934,5 @@ MODULE_SOFTDEP("pre: cxl_port"); module_exit(cxl_acpi_exit); MODULE_DESCRIPTION("CXL ACPI: Platform Support"); MODULE_LICENSE("GPL v2"); -MODULE_IMPORT_NS(CXL); -MODULE_IMPORT_NS(ACPI); +MODULE_IMPORT_NS("CXL"); +MODULE_IMPORT_NS("ACPI"); diff --git a/drivers/cxl/core/cdat.c b/drivers/cxl/core/cdat.c index 2a1f164db98e..8153f8d83a16 100644 --- a/drivers/cxl/core/cdat.c +++ b/drivers/cxl/core/cdat.c @@ -416,7 +416,7 @@ void cxl_endpoint_parse_cdat(struct cxl_port *port) cxl_qos_class_verify(cxlmd); cxl_memdev_update_perf(cxlmd); } -EXPORT_SYMBOL_NS_GPL(cxl_endpoint_parse_cdat, CXL); +EXPORT_SYMBOL_NS_GPL(cxl_endpoint_parse_cdat, "CXL"); static int cdat_sslbis_handler(union acpi_subtable_headers *header, void *arg, const unsigned long end) @@ -513,7 +513,7 @@ void cxl_switch_parse_cdat(struct cxl_port *port) if (rc) dev_dbg(&port->dev, "Failed to parse SSLBIS: %d\n", rc); } -EXPORT_SYMBOL_NS_GPL(cxl_switch_parse_cdat, CXL); +EXPORT_SYMBOL_NS_GPL(cxl_switch_parse_cdat, "CXL"); static void __cxl_coordinates_combine(struct access_coordinate *out, struct access_coordinate *c1, @@ -545,7 +545,7 @@ void cxl_coordinates_combine(struct access_coordinate *out, __cxl_coordinates_combine(&out[i], &c1[i], &c2[i]); } -MODULE_IMPORT_NS(CXL); +MODULE_IMPORT_NS("CXL"); static void cxl_bandwidth_add(struct access_coordinate *coord, struct access_coordinate *c1, diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c index ff0c96ade241..28edd5822486 100644 --- a/drivers/cxl/core/hdm.c +++ b/drivers/cxl/core/hdm.c @@ -73,7 +73,7 @@ int devm_cxl_add_passthrough_decoder(struct cxl_port *port) return 
add_hdm_decoder(port, &cxlsd->cxld, single_port_map); } -EXPORT_SYMBOL_NS_GPL(devm_cxl_add_passthrough_decoder, CXL); +EXPORT_SYMBOL_NS_GPL(devm_cxl_add_passthrough_decoder, "CXL"); static void parse_hdm_decoder_caps(struct cxl_hdm *cxlhdm) { @@ -199,7 +199,7 @@ struct cxl_hdm *devm_cxl_setup_hdm(struct cxl_port *port, return cxlhdm; } -EXPORT_SYMBOL_NS_GPL(devm_cxl_setup_hdm, CXL); +EXPORT_SYMBOL_NS_GPL(devm_cxl_setup_hdm, "CXL"); static void __cxl_dpa_debug(struct seq_file *file, struct resource *r, int depth) { @@ -221,7 +221,7 @@ void cxl_dpa_debug(struct seq_file *file, struct cxl_dev_state *cxlds) } up_read(&cxl_dpa_rwsem); } -EXPORT_SYMBOL_NS_GPL(cxl_dpa_debug, CXL); +EXPORT_SYMBOL_NS_GPL(cxl_dpa_debug, "CXL"); /* * Must be called in a context that synchronizes against this decoder's @@ -358,7 +358,7 @@ int devm_cxl_dpa_reserve(struct cxl_endpoint_decoder *cxled, return devm_add_action_or_reset(&port->dev, cxl_dpa_release, cxled); } -EXPORT_SYMBOL_NS_GPL(devm_cxl_dpa_reserve, CXL); +EXPORT_SYMBOL_NS_GPL(devm_cxl_dpa_reserve, "CXL"); resource_size_t cxl_dpa_size(struct cxl_endpoint_decoder *cxled) { @@ -738,7 +738,7 @@ void cxl_port_commit_reap(struct cxl_decoder *cxld) device_for_each_child_reverse_from(&port->dev, &cxld->dev, NULL, commit_reap); } -EXPORT_SYMBOL_NS_GPL(cxl_port_commit_reap, CXL); +EXPORT_SYMBOL_NS_GPL(cxl_port_commit_reap, "CXL"); static void cxl_decoder_reset(struct cxl_decoder *cxld) { @@ -1064,4 +1064,4 @@ int devm_cxl_enumerate_decoders(struct cxl_hdm *cxlhdm, return 0; } -EXPORT_SYMBOL_NS_GPL(devm_cxl_enumerate_decoders, CXL); +EXPORT_SYMBOL_NS_GPL(devm_cxl_enumerate_decoders, "CXL"); diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c index 5175138c4fb7..548564c770c0 100644 --- a/drivers/cxl/core/mbox.c +++ b/drivers/cxl/core/mbox.c @@ -281,7 +281,7 @@ int cxl_internal_send_cmd(struct cxl_mailbox *cxl_mbox, return -EIO; return 0; } -EXPORT_SYMBOL_NS_GPL(cxl_internal_send_cmd, CXL); 
+EXPORT_SYMBOL_NS_GPL(cxl_internal_send_cmd, "CXL"); static bool cxl_mem_raw_command_allowed(u16 opcode) { @@ -854,7 +854,7 @@ out: kvfree(gsl); return rc; } -EXPORT_SYMBOL_NS_GPL(cxl_enumerate_cmds, CXL); +EXPORT_SYMBOL_NS_GPL(cxl_enumerate_cmds, "CXL"); void cxl_event_trace_record(const struct cxl_memdev *cxlmd, enum cxl_event_log_type type, @@ -894,7 +894,7 @@ void cxl_event_trace_record(const struct cxl_memdev *cxlmd, trace_cxl_dram(cxlmd, type, cxlr, hpa, &evt->dram); } } -EXPORT_SYMBOL_NS_GPL(cxl_event_trace_record, CXL); +EXPORT_SYMBOL_NS_GPL(cxl_event_trace_record, "CXL"); static void __cxl_event_trace_record(const struct cxl_memdev *cxlmd, enum cxl_event_log_type type, @@ -1063,7 +1063,7 @@ void cxl_mem_get_event_records(struct cxl_memdev_state *mds, u32 status) if (status & CXLDEV_EVENT_STATUS_INFO) cxl_mem_get_records_log(mds, CXL_EVENT_TYPE_INFO); } -EXPORT_SYMBOL_NS_GPL(cxl_mem_get_event_records, CXL); +EXPORT_SYMBOL_NS_GPL(cxl_mem_get_event_records, "CXL"); /** * cxl_mem_get_partition_info - Get partition info @@ -1155,7 +1155,7 @@ int cxl_dev_state_identify(struct cxl_memdev_state *mds) return 0; } -EXPORT_SYMBOL_NS_GPL(cxl_dev_state_identify, CXL); +EXPORT_SYMBOL_NS_GPL(cxl_dev_state_identify, "CXL"); static int __cxl_mem_sanitize(struct cxl_memdev_state *mds, u16 cmd) { @@ -1306,7 +1306,7 @@ int cxl_mem_create_range_info(struct cxl_memdev_state *mds) mds->active_volatile_bytes, mds->active_persistent_bytes, "pmem"); } -EXPORT_SYMBOL_NS_GPL(cxl_mem_create_range_info, CXL); +EXPORT_SYMBOL_NS_GPL(cxl_mem_create_range_info, "CXL"); int cxl_set_timestamp(struct cxl_memdev_state *mds) { @@ -1333,7 +1333,7 @@ int cxl_set_timestamp(struct cxl_memdev_state *mds) return 0; } -EXPORT_SYMBOL_NS_GPL(cxl_set_timestamp, CXL); +EXPORT_SYMBOL_NS_GPL(cxl_set_timestamp, "CXL"); int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len, struct cxl_region *cxlr) @@ -1384,7 +1384,7 @@ int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len, 
mutex_unlock(&mds->poison.lock); return rc; } -EXPORT_SYMBOL_NS_GPL(cxl_mem_get_poison, CXL); +EXPORT_SYMBOL_NS_GPL(cxl_mem_get_poison, "CXL"); static void free_poison_buf(void *buf) { @@ -1420,7 +1420,7 @@ int cxl_poison_state_init(struct cxl_memdev_state *mds) mutex_init(&mds->poison.lock); return 0; } -EXPORT_SYMBOL_NS_GPL(cxl_poison_state_init, CXL); +EXPORT_SYMBOL_NS_GPL(cxl_poison_state_init, "CXL"); int cxl_mailbox_init(struct cxl_mailbox *cxl_mbox, struct device *host) { @@ -1433,7 +1433,7 @@ int cxl_mailbox_init(struct cxl_mailbox *cxl_mbox, struct device *host) return 0; } -EXPORT_SYMBOL_NS_GPL(cxl_mailbox_init, CXL); +EXPORT_SYMBOL_NS_GPL(cxl_mailbox_init, "CXL"); struct cxl_memdev_state *cxl_memdev_state_create(struct device *dev) { @@ -1455,7 +1455,7 @@ struct cxl_memdev_state *cxl_memdev_state_create(struct device *dev) return mds; } -EXPORT_SYMBOL_NS_GPL(cxl_memdev_state_create, CXL); +EXPORT_SYMBOL_NS_GPL(cxl_memdev_state_create, "CXL"); void __init cxl_mbox_init(void) { diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c index 84fefb76dafa..ae3dfcbe8938 100644 --- a/drivers/cxl/core/memdev.c +++ b/drivers/cxl/core/memdev.c @@ -250,7 +250,7 @@ int cxl_trigger_poison_list(struct cxl_memdev *cxlmd) return rc; } -EXPORT_SYMBOL_NS_GPL(cxl_trigger_poison_list, CXL); +EXPORT_SYMBOL_NS_GPL(cxl_trigger_poison_list, "CXL"); static int cxl_validate_poison_dpa(struct cxl_memdev *cxlmd, u64 dpa) { @@ -329,7 +329,7 @@ out: return rc; } -EXPORT_SYMBOL_NS_GPL(cxl_inject_poison, CXL); +EXPORT_SYMBOL_NS_GPL(cxl_inject_poison, "CXL"); int cxl_clear_poison(struct cxl_memdev *cxlmd, u64 dpa) { @@ -393,7 +393,7 @@ out: return rc; } -EXPORT_SYMBOL_NS_GPL(cxl_clear_poison, CXL); +EXPORT_SYMBOL_NS_GPL(cxl_clear_poison, "CXL"); static struct attribute *cxl_memdev_attributes[] = { &dev_attr_serial.attr, @@ -537,7 +537,7 @@ void cxl_memdev_update_perf(struct cxl_memdev *cxlmd) sysfs_update_group(&cxlmd->dev.kobj, &cxl_memdev_ram_attribute_group); 
sysfs_update_group(&cxlmd->dev.kobj, &cxl_memdev_pmem_attribute_group); } -EXPORT_SYMBOL_NS_GPL(cxl_memdev_update_perf, CXL); +EXPORT_SYMBOL_NS_GPL(cxl_memdev_update_perf, "CXL"); static const struct device_type cxl_memdev_type = { .name = "cxl_memdev", @@ -550,7 +550,7 @@ bool is_cxl_memdev(const struct device *dev) { return dev->type == &cxl_memdev_type; } -EXPORT_SYMBOL_NS_GPL(is_cxl_memdev, CXL); +EXPORT_SYMBOL_NS_GPL(is_cxl_memdev, "CXL"); /** * set_exclusive_cxl_commands() - atomically disable user cxl commands @@ -569,7 +569,7 @@ void set_exclusive_cxl_commands(struct cxl_memdev_state *mds, CXL_MEM_COMMAND_ID_MAX); up_write(&cxl_memdev_rwsem); } -EXPORT_SYMBOL_NS_GPL(set_exclusive_cxl_commands, CXL); +EXPORT_SYMBOL_NS_GPL(set_exclusive_cxl_commands, "CXL"); /** * clear_exclusive_cxl_commands() - atomically enable user cxl commands @@ -584,7 +584,7 @@ void clear_exclusive_cxl_commands(struct cxl_memdev_state *mds, CXL_MEM_COMMAND_ID_MAX); up_write(&cxl_memdev_rwsem); } -EXPORT_SYMBOL_NS_GPL(clear_exclusive_cxl_commands, CXL); +EXPORT_SYMBOL_NS_GPL(clear_exclusive_cxl_commands, "CXL"); static void cxl_memdev_shutdown(struct device *dev) { @@ -1006,7 +1006,7 @@ int devm_cxl_setup_fw_upload(struct device *host, struct cxl_memdev_state *mds) return PTR_ERR(fwl); return devm_add_action_or_reset(host, cxl_remove_fw_upload, fwl); } -EXPORT_SYMBOL_NS_GPL(devm_cxl_setup_fw_upload, CXL); +EXPORT_SYMBOL_NS_GPL(devm_cxl_setup_fw_upload, "CXL"); static const struct file_operations cxl_memdev_fops = { .owner = THIS_MODULE, @@ -1060,7 +1060,7 @@ err: put_device(dev); return ERR_PTR(rc); } -EXPORT_SYMBOL_NS_GPL(devm_cxl_add_memdev, CXL); +EXPORT_SYMBOL_NS_GPL(devm_cxl_add_memdev, "CXL"); static void sanitize_teardown_notifier(void *data) { @@ -1105,7 +1105,7 @@ int devm_cxl_sanitize_setup_notifier(struct device *host, return devm_add_action_or_reset(host, sanitize_teardown_notifier, mds); } -EXPORT_SYMBOL_NS_GPL(devm_cxl_sanitize_setup_notifier, CXL); 
+EXPORT_SYMBOL_NS_GPL(devm_cxl_sanitize_setup_notifier, "CXL"); __init int cxl_memdev_init(void) { diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c index 5b46bc46aaa9..9d58ab9d33c5 100644 --- a/drivers/cxl/core/pci.c +++ b/drivers/cxl/core/pci.c @@ -101,7 +101,7 @@ int devm_cxl_port_enumerate_dports(struct cxl_port *port) return ctx.error; return ctx.count; } -EXPORT_SYMBOL_NS_GPL(devm_cxl_port_enumerate_dports, CXL); +EXPORT_SYMBOL_NS_GPL(devm_cxl_port_enumerate_dports, "CXL"); static int cxl_dvsec_mem_range_valid(struct cxl_dev_state *cxlds, int id) { @@ -209,7 +209,7 @@ int cxl_await_media_ready(struct cxl_dev_state *cxlds) return 0; } -EXPORT_SYMBOL_NS_GPL(cxl_await_media_ready, CXL); +EXPORT_SYMBOL_NS_GPL(cxl_await_media_ready, "CXL"); static int cxl_set_mem_enable(struct cxl_dev_state *cxlds, u16 val) { @@ -386,7 +386,7 @@ int cxl_dvsec_rr_decode(struct device *dev, struct cxl_port *port, return 0; } -EXPORT_SYMBOL_NS_GPL(cxl_dvsec_rr_decode, CXL); +EXPORT_SYMBOL_NS_GPL(cxl_dvsec_rr_decode, "CXL"); /** * cxl_hdm_decode_init() - Setup HDM decoding for the endpoint @@ -464,7 +464,7 @@ int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm, */ return 0; } -EXPORT_SYMBOL_NS_GPL(cxl_hdm_decode_init, CXL); +EXPORT_SYMBOL_NS_GPL(cxl_hdm_decode_init, "CXL"); #define CXL_DOE_TABLE_ACCESS_REQ_CODE 0x000000ff #define CXL_DOE_TABLE_ACCESS_REQ_CODE_READ 0 @@ -648,7 +648,7 @@ err: devm_kfree(dev, buf); dev_err(dev, "Failed to read/validate CDAT.\n"); } -EXPORT_SYMBOL_NS_GPL(read_cdat_data, CXL); +EXPORT_SYMBOL_NS_GPL(read_cdat_data, "CXL"); static void __cxl_handle_cor_ras(struct cxl_dev_state *cxlds, void __iomem *ras_base) @@ -805,7 +805,7 @@ void cxl_dport_init_ras_reporting(struct cxl_dport *dport, struct device *host) cxl_disable_rch_root_ints(dport); } } -EXPORT_SYMBOL_NS_GPL(cxl_dport_init_ras_reporting, CXL); +EXPORT_SYMBOL_NS_GPL(cxl_dport_init_ras_reporting, "CXL"); static void cxl_handle_rdport_cor_ras(struct cxl_dev_state 
*cxlds, struct cxl_dport *dport) @@ -916,7 +916,7 @@ void cxl_cor_error_detected(struct pci_dev *pdev) cxl_handle_endpoint_cor_ras(cxlds); } } -EXPORT_SYMBOL_NS_GPL(cxl_cor_error_detected, CXL); +EXPORT_SYMBOL_NS_GPL(cxl_cor_error_detected, "CXL"); pci_ers_result_t cxl_error_detected(struct pci_dev *pdev, pci_channel_state_t state) @@ -966,7 +966,7 @@ pci_ers_result_t cxl_error_detected(struct pci_dev *pdev, } return PCI_ERS_RESULT_NEED_RESET; } -EXPORT_SYMBOL_NS_GPL(cxl_error_detected, CXL); +EXPORT_SYMBOL_NS_GPL(cxl_error_detected, "CXL"); static int cxl_flit_size(struct pci_dev *pdev) { @@ -1030,7 +1030,7 @@ bool cxl_endpoint_decoder_reset_detected(struct cxl_port *port) return device_for_each_child(&port->dev, port, __cxl_endpoint_decoder_reset_detected); } -EXPORT_SYMBOL_NS_GPL(cxl_endpoint_decoder_reset_detected, CXL); +EXPORT_SYMBOL_NS_GPL(cxl_endpoint_decoder_reset_detected, "CXL"); int cxl_pci_get_bandwidth(struct pci_dev *pdev, struct access_coordinate *c) { diff --git a/drivers/cxl/core/pmem.c b/drivers/cxl/core/pmem.c index c00f3a933164..b3378d3f6acb 100644 --- a/drivers/cxl/core/pmem.c +++ b/drivers/cxl/core/pmem.c @@ -49,13 +49,13 @@ struct cxl_nvdimm_bridge *to_cxl_nvdimm_bridge(struct device *dev) return NULL; return container_of(dev, struct cxl_nvdimm_bridge, dev); } -EXPORT_SYMBOL_NS_GPL(to_cxl_nvdimm_bridge, CXL); +EXPORT_SYMBOL_NS_GPL(to_cxl_nvdimm_bridge, "CXL"); bool is_cxl_nvdimm_bridge(struct device *dev) { return dev->type == &cxl_nvdimm_bridge_type; } -EXPORT_SYMBOL_NS_GPL(is_cxl_nvdimm_bridge, CXL); +EXPORT_SYMBOL_NS_GPL(is_cxl_nvdimm_bridge, "CXL"); static int match_nvdimm_bridge(struct device *dev, void *data) { @@ -82,7 +82,7 @@ struct cxl_nvdimm_bridge *cxl_find_nvdimm_bridge(struct cxl_port *port) return to_cxl_nvdimm_bridge(dev); } -EXPORT_SYMBOL_NS_GPL(cxl_find_nvdimm_bridge, CXL); +EXPORT_SYMBOL_NS_GPL(cxl_find_nvdimm_bridge, "CXL"); static struct lock_class_key cxl_nvdimm_bridge_key; @@ -164,7 +164,7 @@ err: put_device(dev); 
return ERR_PTR(rc); } -EXPORT_SYMBOL_NS_GPL(devm_cxl_add_nvdimm_bridge, CXL); +EXPORT_SYMBOL_NS_GPL(devm_cxl_add_nvdimm_bridge, "CXL"); static void cxl_nvdimm_release(struct device *dev) { @@ -188,7 +188,7 @@ bool is_cxl_nvdimm(struct device *dev) { return dev->type == &cxl_nvdimm_type; } -EXPORT_SYMBOL_NS_GPL(is_cxl_nvdimm, CXL); +EXPORT_SYMBOL_NS_GPL(is_cxl_nvdimm, "CXL"); struct cxl_nvdimm *to_cxl_nvdimm(struct device *dev) { @@ -197,7 +197,7 @@ struct cxl_nvdimm *to_cxl_nvdimm(struct device *dev) return NULL; return container_of(dev, struct cxl_nvdimm, dev); } -EXPORT_SYMBOL_NS_GPL(to_cxl_nvdimm, CXL); +EXPORT_SYMBOL_NS_GPL(to_cxl_nvdimm, "CXL"); static struct lock_class_key cxl_nvdimm_key; @@ -293,4 +293,4 @@ err_alloc: return rc; } -EXPORT_SYMBOL_NS_GPL(devm_cxl_add_nvdimm, CXL); +EXPORT_SYMBOL_NS_GPL(devm_cxl_add_nvdimm, "CXL"); diff --git a/drivers/cxl/core/pmu.c b/drivers/cxl/core/pmu.c index 5d8e06b0ba6e..b3136d7664ab 100644 --- a/drivers/cxl/core/pmu.c +++ b/drivers/cxl/core/pmu.c @@ -65,4 +65,4 @@ err: put_device(&pmu->dev); return rc; } -EXPORT_SYMBOL_NS_GPL(devm_cxl_pmu_add, CXL); +EXPORT_SYMBOL_NS_GPL(devm_cxl_pmu_add, "CXL"); diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c index af92c67bc954..78a5c2c25982 100644 --- a/drivers/cxl/core/port.c +++ b/drivers/cxl/core/port.c @@ -437,7 +437,7 @@ struct cxl_root_decoder *to_cxl_root_decoder(struct device *dev) return NULL; return container_of(dev, struct cxl_root_decoder, cxlsd.cxld.dev); } -EXPORT_SYMBOL_NS_GPL(to_cxl_root_decoder, CXL); +EXPORT_SYMBOL_NS_GPL(to_cxl_root_decoder, "CXL"); static void cxl_root_decoder_release(struct device *dev) { @@ -471,19 +471,19 @@ bool is_endpoint_decoder(struct device *dev) { return dev->type == &cxl_decoder_endpoint_type; } -EXPORT_SYMBOL_NS_GPL(is_endpoint_decoder, CXL); +EXPORT_SYMBOL_NS_GPL(is_endpoint_decoder, "CXL"); bool is_root_decoder(struct device *dev) { return dev->type == &cxl_decoder_root_type; } -EXPORT_SYMBOL_NS_GPL(is_root_decoder, 
CXL); +EXPORT_SYMBOL_NS_GPL(is_root_decoder, "CXL"); bool is_switch_decoder(struct device *dev) { return is_root_decoder(dev) || dev->type == &cxl_decoder_switch_type; } -EXPORT_SYMBOL_NS_GPL(is_switch_decoder, CXL); +EXPORT_SYMBOL_NS_GPL(is_switch_decoder, "CXL"); struct cxl_decoder *to_cxl_decoder(struct device *dev) { @@ -493,7 +493,7 @@ struct cxl_decoder *to_cxl_decoder(struct device *dev) return NULL; return container_of(dev, struct cxl_decoder, dev); } -EXPORT_SYMBOL_NS_GPL(to_cxl_decoder, CXL); +EXPORT_SYMBOL_NS_GPL(to_cxl_decoder, "CXL"); struct cxl_endpoint_decoder *to_cxl_endpoint_decoder(struct device *dev) { @@ -502,7 +502,7 @@ struct cxl_endpoint_decoder *to_cxl_endpoint_decoder(struct device *dev) return NULL; return container_of(dev, struct cxl_endpoint_decoder, cxld.dev); } -EXPORT_SYMBOL_NS_GPL(to_cxl_endpoint_decoder, CXL); +EXPORT_SYMBOL_NS_GPL(to_cxl_endpoint_decoder, "CXL"); struct cxl_switch_decoder *to_cxl_switch_decoder(struct device *dev) { @@ -511,7 +511,7 @@ struct cxl_switch_decoder *to_cxl_switch_decoder(struct device *dev) return NULL; return container_of(dev, struct cxl_switch_decoder, cxld.dev); } -EXPORT_SYMBOL_NS_GPL(to_cxl_switch_decoder, CXL); +EXPORT_SYMBOL_NS_GPL(to_cxl_switch_decoder, "CXL"); static void cxl_ep_release(struct cxl_ep *ep) { @@ -585,7 +585,7 @@ bool is_cxl_port(const struct device *dev) { return dev->type == &cxl_port_type; } -EXPORT_SYMBOL_NS_GPL(is_cxl_port, CXL); +EXPORT_SYMBOL_NS_GPL(is_cxl_port, "CXL"); struct cxl_port *to_cxl_port(const struct device *dev) { @@ -594,7 +594,7 @@ struct cxl_port *to_cxl_port(const struct device *dev) return NULL; return container_of(dev, struct cxl_port, dev); } -EXPORT_SYMBOL_NS_GPL(to_cxl_port, CXL); +EXPORT_SYMBOL_NS_GPL(to_cxl_port, "CXL"); static void unregister_port(void *_port) { @@ -942,7 +942,7 @@ struct cxl_port *devm_cxl_add_port(struct device *host, return port; } -EXPORT_SYMBOL_NS_GPL(devm_cxl_add_port, CXL); +EXPORT_SYMBOL_NS_GPL(devm_cxl_add_port, "CXL"); 
struct cxl_root *devm_cxl_add_root(struct device *host, const struct cxl_root_ops *ops) @@ -958,7 +958,7 @@ struct cxl_root *devm_cxl_add_root(struct device *host, cxl_root->ops = ops; return cxl_root; } -EXPORT_SYMBOL_NS_GPL(devm_cxl_add_root, CXL); +EXPORT_SYMBOL_NS_GPL(devm_cxl_add_root, "CXL"); struct pci_bus *cxl_port_to_pci_bus(struct cxl_port *port) { @@ -974,7 +974,7 @@ struct pci_bus *cxl_port_to_pci_bus(struct cxl_port *port) return xa_load(&cxl_root_buses, (unsigned long)port->uport_dev); } -EXPORT_SYMBOL_NS_GPL(cxl_port_to_pci_bus, CXL); +EXPORT_SYMBOL_NS_GPL(cxl_port_to_pci_bus, "CXL"); static void unregister_pci_bus(void *uport_dev) { @@ -995,7 +995,7 @@ int devm_cxl_register_pci_bus(struct device *host, struct device *uport_dev, return rc; return devm_add_action_or_reset(host, unregister_pci_bus, uport_dev); } -EXPORT_SYMBOL_NS_GPL(devm_cxl_register_pci_bus, CXL); +EXPORT_SYMBOL_NS_GPL(devm_cxl_register_pci_bus, "CXL"); static bool dev_is_cxl_root_child(struct device *dev) { @@ -1027,7 +1027,7 @@ struct cxl_root *find_cxl_root(struct cxl_port *port) get_device(&iter->dev); return to_cxl_root(iter); } -EXPORT_SYMBOL_NS_GPL(find_cxl_root, CXL); +EXPORT_SYMBOL_NS_GPL(find_cxl_root, "CXL"); void put_cxl_root(struct cxl_root *cxl_root) { @@ -1036,7 +1036,7 @@ void put_cxl_root(struct cxl_root *cxl_root) put_device(&cxl_root->port.dev); } -EXPORT_SYMBOL_NS_GPL(put_cxl_root, CXL); +EXPORT_SYMBOL_NS_GPL(put_cxl_root, "CXL"); static struct cxl_dport *find_dport(struct cxl_port *port, int id) { @@ -1230,7 +1230,7 @@ struct cxl_dport *devm_cxl_add_dport(struct cxl_port *port, return dport; } -EXPORT_SYMBOL_NS_GPL(devm_cxl_add_dport, CXL); +EXPORT_SYMBOL_NS_GPL(devm_cxl_add_dport, "CXL"); /** * devm_cxl_add_rch_dport - append RCH downstream port data to a cxl_port @@ -1264,7 +1264,7 @@ struct cxl_dport *devm_cxl_add_rch_dport(struct cxl_port *port, return dport; } -EXPORT_SYMBOL_NS_GPL(devm_cxl_add_rch_dport, CXL); +EXPORT_SYMBOL_NS_GPL(devm_cxl_add_rch_dport, 
"CXL"); static int add_ep(struct cxl_ep *new) { @@ -1421,7 +1421,7 @@ int cxl_endpoint_autoremove(struct cxl_memdev *cxlmd, struct cxl_port *endpoint) cxlmd->depth = endpoint->depth; return devm_add_action_or_reset(dev, delete_endpoint, cxlmd); } -EXPORT_SYMBOL_NS_GPL(cxl_endpoint_autoremove, CXL); +EXPORT_SYMBOL_NS_GPL(cxl_endpoint_autoremove, "CXL"); /* * The natural end of life of a non-root 'cxl_port' is when its parent port goes @@ -1692,21 +1692,21 @@ retry: return 0; } -EXPORT_SYMBOL_NS_GPL(devm_cxl_enumerate_ports, CXL); +EXPORT_SYMBOL_NS_GPL(devm_cxl_enumerate_ports, "CXL"); struct cxl_port *cxl_pci_find_port(struct pci_dev *pdev, struct cxl_dport **dport) { return find_cxl_port(pdev->dev.parent, dport); } -EXPORT_SYMBOL_NS_GPL(cxl_pci_find_port, CXL); +EXPORT_SYMBOL_NS_GPL(cxl_pci_find_port, "CXL"); struct cxl_port *cxl_mem_find_port(struct cxl_memdev *cxlmd, struct cxl_dport **dport) { return find_cxl_port(grandparent(&cxlmd->dev), dport); } -EXPORT_SYMBOL_NS_GPL(cxl_mem_find_port, CXL); +EXPORT_SYMBOL_NS_GPL(cxl_mem_find_port, "CXL"); static int decoder_populate_targets(struct cxl_switch_decoder *cxlsd, struct cxl_port *port, int *target_map) @@ -1840,7 +1840,7 @@ struct cxl_root_decoder *cxl_root_decoder_alloc(struct cxl_port *port, cxlrd->qos_class = CXL_QOS_CLASS_INVALID; return cxlrd; } -EXPORT_SYMBOL_NS_GPL(cxl_root_decoder_alloc, CXL); +EXPORT_SYMBOL_NS_GPL(cxl_root_decoder_alloc, "CXL"); /** * cxl_switch_decoder_alloc - Allocate a switch level decoder @@ -1877,7 +1877,7 @@ struct cxl_switch_decoder *cxl_switch_decoder_alloc(struct cxl_port *port, cxld->dev.type = &cxl_decoder_switch_type; return cxlsd; } -EXPORT_SYMBOL_NS_GPL(cxl_switch_decoder_alloc, CXL); +EXPORT_SYMBOL_NS_GPL(cxl_switch_decoder_alloc, "CXL"); /** * cxl_endpoint_decoder_alloc - Allocate an endpoint decoder @@ -1909,7 +1909,7 @@ struct cxl_endpoint_decoder *cxl_endpoint_decoder_alloc(struct cxl_port *port) cxld->dev.type = &cxl_decoder_endpoint_type; return cxled; } 
-EXPORT_SYMBOL_NS_GPL(cxl_endpoint_decoder_alloc, CXL); +EXPORT_SYMBOL_NS_GPL(cxl_endpoint_decoder_alloc, "CXL"); /** * cxl_decoder_add_locked - Add a decoder with targets @@ -1965,7 +1965,7 @@ int cxl_decoder_add_locked(struct cxl_decoder *cxld, int *target_map) return device_add(dev); } -EXPORT_SYMBOL_NS_GPL(cxl_decoder_add_locked, CXL); +EXPORT_SYMBOL_NS_GPL(cxl_decoder_add_locked, "CXL"); /** * cxl_decoder_add - Add a decoder with targets @@ -1995,7 +1995,7 @@ int cxl_decoder_add(struct cxl_decoder *cxld, int *target_map) guard(device)(&port->dev); return cxl_decoder_add_locked(cxld, target_map); } -EXPORT_SYMBOL_NS_GPL(cxl_decoder_add, CXL); +EXPORT_SYMBOL_NS_GPL(cxl_decoder_add, "CXL"); static void cxld_unregister(void *dev) { @@ -2013,7 +2013,7 @@ int cxl_decoder_autoremove(struct device *host, struct cxl_decoder *cxld) { return devm_add_action_or_reset(host, cxld_unregister, &cxld->dev); } -EXPORT_SYMBOL_NS_GPL(cxl_decoder_autoremove, CXL); +EXPORT_SYMBOL_NS_GPL(cxl_decoder_autoremove, "CXL"); /** * __cxl_driver_register - register a driver for the cxl bus @@ -2046,13 +2046,13 @@ int __cxl_driver_register(struct cxl_driver *cxl_drv, struct module *owner, return driver_register(&cxl_drv->drv); } -EXPORT_SYMBOL_NS_GPL(__cxl_driver_register, CXL); +EXPORT_SYMBOL_NS_GPL(__cxl_driver_register, "CXL"); void cxl_driver_unregister(struct cxl_driver *cxl_drv) { driver_unregister(&cxl_drv->drv); } -EXPORT_SYMBOL_NS_GPL(cxl_driver_unregister, CXL); +EXPORT_SYMBOL_NS_GPL(cxl_driver_unregister, "CXL"); static int cxl_bus_uevent(const struct device *dev, struct kobj_uevent_env *env) { @@ -2104,19 +2104,19 @@ void cxl_bus_rescan(void) queue_work(cxl_bus_wq, &rescan_work); } -EXPORT_SYMBOL_NS_GPL(cxl_bus_rescan, CXL); +EXPORT_SYMBOL_NS_GPL(cxl_bus_rescan, "CXL"); void cxl_bus_drain(void) { drain_workqueue(cxl_bus_wq); } -EXPORT_SYMBOL_NS_GPL(cxl_bus_drain, CXL); +EXPORT_SYMBOL_NS_GPL(cxl_bus_drain, "CXL"); bool schedule_cxl_memdev_detach(struct cxl_memdev *cxlmd) { return 
queue_work(cxl_bus_wq, &cxlmd->detach_work); } -EXPORT_SYMBOL_NS_GPL(schedule_cxl_memdev_detach, CXL); +EXPORT_SYMBOL_NS_GPL(schedule_cxl_memdev_detach, "CXL"); static void add_latency(struct access_coordinate *c, long latency) { @@ -2242,7 +2242,7 @@ int cxl_endpoint_get_perf_coordinates(struct cxl_port *port, return 0; } -EXPORT_SYMBOL_NS_GPL(cxl_endpoint_get_perf_coordinates, CXL); +EXPORT_SYMBOL_NS_GPL(cxl_endpoint_get_perf_coordinates, "CXL"); int cxl_port_get_switch_dport_bandwidth(struct cxl_port *port, struct access_coordinate *c) @@ -2299,7 +2299,7 @@ struct bus_type cxl_bus_type = { .remove = cxl_bus_remove, .bus_groups = cxl_bus_attribute_groups, }; -EXPORT_SYMBOL_NS_GPL(cxl_bus_type, CXL); +EXPORT_SYMBOL_NS_GPL(cxl_bus_type, "CXL"); static struct dentry *cxl_debugfs; @@ -2307,7 +2307,7 @@ struct dentry *cxl_debugfs_create_dir(const char *dir) { return debugfs_create_dir(dir, cxl_debugfs); } -EXPORT_SYMBOL_NS_GPL(cxl_debugfs_create_dir, CXL); +EXPORT_SYMBOL_NS_GPL(cxl_debugfs_create_dir, "CXL"); static __init int cxl_core_init(void) { @@ -2363,4 +2363,4 @@ subsys_initcall(cxl_core_init); module_exit(cxl_core_exit); MODULE_DESCRIPTION("CXL: Core Compute Express Link support"); MODULE_LICENSE("GPL v2"); -MODULE_IMPORT_NS(CXL); +MODULE_IMPORT_NS("CXL"); diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c index 70d0a017e99c..b98b1ccffd1c 100644 --- a/drivers/cxl/core/region.c +++ b/drivers/cxl/core/region.c @@ -1295,6 +1295,7 @@ static int cxl_port_setup_targets(struct cxl_port *port, struct cxl_region_params *p = &cxlr->params; struct cxl_decoder *cxld = cxl_rr->decoder; struct cxl_switch_decoder *cxlsd; + struct cxl_port *iter = port; u16 eig, peig; u8 eiw, peiw; @@ -1311,16 +1312,26 @@ static int cxl_port_setup_targets(struct cxl_port *port, cxlsd = to_cxl_switch_decoder(&cxld->dev); if (cxl_rr->nr_targets_set) { - int i, distance; + int i, distance = 1; + struct cxl_region_ref *cxl_rr_iter; /* - * Passthrough decoders impose no distance 
requirements between - * peers + * The "distance" between peer downstream ports represents which + * endpoint positions in the region interleave a given port can + * host. + * + * For example, at the root of a hierarchy the distance is + * always 1 as every index targets a different host-bridge. At + * each subsequent switch level those ports map every Nth region + * position where N is the width of the switch == distance. */ - if (cxl_rr->nr_targets == 1) - distance = 0; - else - distance = p->nr_targets / cxl_rr->nr_targets; + do { + cxl_rr_iter = cxl_rr_load(iter, cxlr); + distance *= cxl_rr_iter->nr_targets; + iter = to_cxl_port(iter->dev.parent); + } while (!is_cxl_root(iter)); + distance *= cxlrd->cxlsd.cxld.interleave_ways; + for (i = 0; i < cxl_rr->nr_targets_set; i++) if (ep->dport == cxlsd->target[i]) { rc = check_last_peer(cxled, ep, cxl_rr, @@ -2299,7 +2310,7 @@ bool is_cxl_region(struct device *dev) { return dev->type == &cxl_region_type; } -EXPORT_SYMBOL_NS_GPL(is_cxl_region, CXL); +EXPORT_SYMBOL_NS_GPL(is_cxl_region, "CXL"); static struct cxl_region *to_cxl_region(struct device *dev) { @@ -2652,7 +2663,7 @@ bool is_cxl_pmem_region(struct device *dev) { return dev->type == &cxl_pmem_region_type; } -EXPORT_SYMBOL_NS_GPL(is_cxl_pmem_region, CXL); +EXPORT_SYMBOL_NS_GPL(is_cxl_pmem_region, "CXL"); struct cxl_pmem_region *to_cxl_pmem_region(struct device *dev) { @@ -2661,7 +2672,7 @@ struct cxl_pmem_region *to_cxl_pmem_region(struct device *dev) return NULL; return container_of(dev, struct cxl_pmem_region, dev); } -EXPORT_SYMBOL_NS_GPL(to_cxl_pmem_region, CXL); +EXPORT_SYMBOL_NS_GPL(to_cxl_pmem_region, "CXL"); struct cxl_poison_context { struct cxl_port *port; @@ -3015,7 +3026,7 @@ struct cxl_dax_region *to_cxl_dax_region(struct device *dev) return NULL; return container_of(dev, struct cxl_dax_region, dev); } -EXPORT_SYMBOL_NS_GPL(to_cxl_dax_region, CXL); +EXPORT_SYMBOL_NS_GPL(to_cxl_dax_region, "CXL"); static struct lock_class_key cxl_dax_region_key; @@ 
-3359,7 +3370,7 @@ out: put_device(cxlrd_dev); return rc; } -EXPORT_SYMBOL_NS_GPL(cxl_add_to_region, CXL); +EXPORT_SYMBOL_NS_GPL(cxl_add_to_region, "CXL"); static int is_system_ram(struct resource *res, void *arg) { @@ -3462,6 +3473,6 @@ void cxl_region_exit(void) cxl_driver_unregister(&cxl_region_driver); } -MODULE_IMPORT_NS(CXL); -MODULE_IMPORT_NS(DEVMEM); +MODULE_IMPORT_NS("CXL"); +MODULE_IMPORT_NS("DEVMEM"); MODULE_ALIAS_CXL(CXL_DEVICE_REGION); diff --git a/drivers/cxl/core/regs.c b/drivers/cxl/core/regs.c index 429973a2165b..59cb35b40c7e 100644 --- a/drivers/cxl/core/regs.c +++ b/drivers/cxl/core/regs.c @@ -106,7 +106,7 @@ void cxl_probe_component_regs(struct device *dev, void __iomem *base, rmap->size = length; } } -EXPORT_SYMBOL_NS_GPL(cxl_probe_component_regs, CXL); +EXPORT_SYMBOL_NS_GPL(cxl_probe_component_regs, "CXL"); /** * cxl_probe_device_regs() - Detect CXL Device register blocks @@ -174,7 +174,7 @@ void cxl_probe_device_regs(struct device *dev, void __iomem *base, rmap->size = length; } } -EXPORT_SYMBOL_NS_GPL(cxl_probe_device_regs, CXL); +EXPORT_SYMBOL_NS_GPL(cxl_probe_device_regs, "CXL"); void __iomem *devm_cxl_iomap_block(struct device *dev, resource_size_t addr, resource_size_t length) @@ -232,7 +232,7 @@ int cxl_map_component_regs(const struct cxl_register_map *map, return 0; } -EXPORT_SYMBOL_NS_GPL(cxl_map_component_regs, CXL); +EXPORT_SYMBOL_NS_GPL(cxl_map_component_regs, "CXL"); int cxl_map_device_regs(const struct cxl_register_map *map, struct cxl_device_regs *regs) @@ -266,7 +266,7 @@ int cxl_map_device_regs(const struct cxl_register_map *map, return 0; } -EXPORT_SYMBOL_NS_GPL(cxl_map_device_regs, CXL); +EXPORT_SYMBOL_NS_GPL(cxl_map_device_regs, "CXL"); static bool cxl_decode_regblock(struct pci_dev *pdev, u32 reg_lo, u32 reg_hi, struct cxl_register_map *map) @@ -344,7 +344,7 @@ int cxl_find_regblock_instance(struct pci_dev *pdev, enum cxl_regloc_type type, map->resource = CXL_RESOURCE_NONE; return -ENODEV; } 
-EXPORT_SYMBOL_NS_GPL(cxl_find_regblock_instance, CXL); +EXPORT_SYMBOL_NS_GPL(cxl_find_regblock_instance, "CXL"); /** * cxl_find_regblock() - Locate register blocks by type @@ -362,7 +362,7 @@ int cxl_find_regblock(struct pci_dev *pdev, enum cxl_regloc_type type, { return cxl_find_regblock_instance(pdev, type, map, 0); } -EXPORT_SYMBOL_NS_GPL(cxl_find_regblock, CXL); +EXPORT_SYMBOL_NS_GPL(cxl_find_regblock, "CXL"); /** * cxl_count_regblock() - Count instances of a given regblock type. @@ -385,7 +385,7 @@ int cxl_count_regblock(struct pci_dev *pdev, enum cxl_regloc_type type) count++; } } -EXPORT_SYMBOL_NS_GPL(cxl_count_regblock, CXL); +EXPORT_SYMBOL_NS_GPL(cxl_count_regblock, "CXL"); int cxl_map_pmu_regs(struct cxl_register_map *map, struct cxl_pmu_regs *regs) { @@ -399,7 +399,7 @@ int cxl_map_pmu_regs(struct cxl_register_map *map, struct cxl_pmu_regs *regs) return 0; } -EXPORT_SYMBOL_NS_GPL(cxl_map_pmu_regs, CXL); +EXPORT_SYMBOL_NS_GPL(cxl_map_pmu_regs, "CXL"); static int cxl_map_regblock(struct cxl_register_map *map) { @@ -468,7 +468,7 @@ int cxl_setup_regs(struct cxl_register_map *map) return rc; } -EXPORT_SYMBOL_NS_GPL(cxl_setup_regs, CXL); +EXPORT_SYMBOL_NS_GPL(cxl_setup_regs, "CXL"); u16 cxl_rcrb_to_aer(struct device *dev, resource_size_t rcrb) { @@ -560,7 +560,7 @@ int cxl_dport_map_rcd_linkcap(struct pci_dev *pdev, struct cxl_dport *dport) return 0; } -EXPORT_SYMBOL_NS_GPL(cxl_dport_map_rcd_linkcap, CXL); +EXPORT_SYMBOL_NS_GPL(cxl_dport_map_rcd_linkcap, "CXL"); resource_size_t __rcrb_to_component(struct device *dev, struct cxl_rcrb_info *ri, enum cxl_rcrb which) @@ -633,4 +633,4 @@ resource_size_t cxl_rcd_component_reg_phys(struct device *dev, return CXL_RESOURCE_NONE; return __rcrb_to_component(dev, &dport->rcrb, CXL_RCRB_UPSTREAM); } -EXPORT_SYMBOL_NS_GPL(cxl_rcd_component_reg_phys, CXL); +EXPORT_SYMBOL_NS_GPL(cxl_rcd_component_reg_phys, "CXL"); diff --git a/drivers/cxl/core/suspend.c b/drivers/cxl/core/suspend.c index a5984d96ea1d..29aa5cc5e565 100644 
--- a/drivers/cxl/core/suspend.c +++ b/drivers/cxl/core/suspend.c @@ -15,10 +15,10 @@ void cxl_mem_active_inc(void) { atomic_inc(&mem_active); } -EXPORT_SYMBOL_NS_GPL(cxl_mem_active_inc, CXL); +EXPORT_SYMBOL_NS_GPL(cxl_mem_active_inc, "CXL"); void cxl_mem_active_dec(void) { atomic_dec(&mem_active); } -EXPORT_SYMBOL_NS_GPL(cxl_mem_active_dec, CXL); +EXPORT_SYMBOL_NS_GPL(cxl_mem_active_dec, "CXL"); diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c index a9fd5cd5a0d2..2f03a4d5606e 100644 --- a/drivers/cxl/mem.c +++ b/drivers/cxl/mem.c @@ -252,7 +252,7 @@ module_cxl_driver(cxl_mem_driver); MODULE_DESCRIPTION("CXL: Memory Expansion"); MODULE_LICENSE("GPL v2"); -MODULE_IMPORT_NS(CXL); +MODULE_IMPORT_NS("CXL"); MODULE_ALIAS_CXL(CXL_DEVICE_MEMORY_EXPANDER); /* * create_endpoint() wants to validate port driver attach immediately after diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c index b2cb81f6d9e7..6d94ff4a4f1a 100644 --- a/drivers/cxl/pci.c +++ b/drivers/cxl/pci.c @@ -836,6 +836,9 @@ static ssize_t rcd_pcie_cap_emit(struct device *dev, u16 offset, char *buf, size if (!root_dev) return -ENXIO; + if (!dport->regs.rcd_pcie_cap) + return -ENXIO; + guard(device)(root_dev); if (!root_dev->driver) return -ENXIO; @@ -1032,8 +1035,7 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id) if (rc) return rc; - rc = cxl_pci_ras_unmask(pdev); - if (rc) + if (cxl_pci_ras_unmask(pdev)) dev_dbg(&pdev->dev, "No RAS reporting unmasked\n"); pci_save_state(pdev); @@ -1184,4 +1186,4 @@ module_init(cxl_pci_driver_init); module_exit(cxl_pci_driver_exit); MODULE_DESCRIPTION("CXL: PCI manageability"); MODULE_LICENSE("GPL v2"); -MODULE_IMPORT_NS(CXL); +MODULE_IMPORT_NS("CXL"); diff --git a/drivers/cxl/pmem.c b/drivers/cxl/pmem.c index d2d43a4fc053..f9c95996e937 100644 --- a/drivers/cxl/pmem.c +++ b/drivers/cxl/pmem.c @@ -459,7 +459,7 @@ MODULE_DESCRIPTION("CXL PMEM: Persistent Memory Support"); MODULE_LICENSE("GPL v2"); module_init(cxl_pmem_init); 
module_exit(cxl_pmem_exit); -MODULE_IMPORT_NS(CXL); +MODULE_IMPORT_NS("CXL"); MODULE_ALIAS_CXL(CXL_DEVICE_NVDIMM_BRIDGE); MODULE_ALIAS_CXL(CXL_DEVICE_NVDIMM); MODULE_ALIAS_CXL(CXL_DEVICE_PMEM_REGION); diff --git a/drivers/cxl/port.c b/drivers/cxl/port.c index 24041cf85cfb..4c83f6a22e58 100644 --- a/drivers/cxl/port.c +++ b/drivers/cxl/port.c @@ -226,5 +226,5 @@ module_exit(cxl_port_exit); MODULE_DESCRIPTION("CXL: Port enumeration and services"); MODULE_LICENSE("GPL v2"); -MODULE_IMPORT_NS(CXL); +MODULE_IMPORT_NS("CXL"); MODULE_ALIAS_CXL(CXL_DEVICE_PORT); diff --git a/drivers/dax/cxl.c b/drivers/dax/cxl.c index 9b29e732b39a..13cd94d32ff7 100644 --- a/drivers/dax/cxl.c +++ b/drivers/dax/cxl.c @@ -46,4 +46,4 @@ MODULE_ALIAS_CXL(CXL_DEVICE_DAX_REGION); MODULE_DESCRIPTION("CXL DAX: direct access to CXL regions"); MODULE_LICENSE("GPL"); MODULE_AUTHOR("Intel Corporation"); -MODULE_IMPORT_NS(CXL); +MODULE_IMPORT_NS("CXL"); diff --git a/drivers/devfreq/event/exynos-nocp.c b/drivers/devfreq/event/exynos-nocp.c index 5edc522f715c..6a3efd782ad0 100644 --- a/drivers/devfreq/event/exynos-nocp.c +++ b/drivers/devfreq/event/exynos-nocp.c @@ -284,7 +284,7 @@ static void exynos_nocp_remove(struct platform_device *pdev) static struct platform_driver exynos_nocp_driver = { .probe = exynos_nocp_probe, - .remove_new = exynos_nocp_remove, + .remove = exynos_nocp_remove, .driver = { .name = "exynos-nocp", .of_match_table = exynos_nocp_id_match, diff --git a/drivers/devfreq/event/exynos-ppmu.c b/drivers/devfreq/event/exynos-ppmu.c index 7002df20a49e..88cd4dfe87e1 100644 --- a/drivers/devfreq/event/exynos-ppmu.c +++ b/drivers/devfreq/event/exynos-ppmu.c @@ -701,7 +701,7 @@ static void exynos_ppmu_remove(struct platform_device *pdev) static struct platform_driver exynos_ppmu_driver = { .probe = exynos_ppmu_probe, - .remove_new = exynos_ppmu_remove, + .remove = exynos_ppmu_remove, .driver = { .name = "exynos-ppmu", .of_match_table = exynos_ppmu_id_match, diff --git 
a/drivers/devfreq/mtk-cci-devfreq.c b/drivers/devfreq/mtk-cci-devfreq.c index 7ad5225b0381..22fe9e631f8a 100644 --- a/drivers/devfreq/mtk-cci-devfreq.c +++ b/drivers/devfreq/mtk-cci-devfreq.c @@ -430,7 +430,7 @@ MODULE_DEVICE_TABLE(of, mtk_ccifreq_machines); static struct platform_driver mtk_ccifreq_platdrv = { .probe = mtk_ccifreq_probe, - .remove_new = mtk_ccifreq_remove, + .remove = mtk_ccifreq_remove, .driver = { .name = "mtk-ccifreq", .of_match_table = mtk_ccifreq_machines, diff --git a/drivers/devfreq/rk3399_dmc.c b/drivers/devfreq/rk3399_dmc.c index d405cee92c25..dbdce7636ca5 100644 --- a/drivers/devfreq/rk3399_dmc.c +++ b/drivers/devfreq/rk3399_dmc.c @@ -474,7 +474,7 @@ MODULE_DEVICE_TABLE(of, rk3399dmc_devfreq_of_match); static struct platform_driver rk3399_dmcfreq_driver = { .probe = rk3399_dmcfreq_probe, - .remove_new = rk3399_dmcfreq_remove, + .remove = rk3399_dmcfreq_remove, .driver = { .name = "rk3399-dmc-freq", .pm = &rk3399_dmcfreq_pm, diff --git a/drivers/devfreq/sun8i-a33-mbus.c b/drivers/devfreq/sun8i-a33-mbus.c index bcf654f4ff96..7c6ae91ede1f 100644 --- a/drivers/devfreq/sun8i-a33-mbus.c +++ b/drivers/devfreq/sun8i-a33-mbus.c @@ -495,7 +495,7 @@ static SIMPLE_DEV_PM_OPS(sun8i_a33_mbus_pm_ops, static struct platform_driver sun8i_a33_mbus_driver = { .probe = sun8i_a33_mbus_probe, - .remove_new = sun8i_a33_mbus_remove, + .remove = sun8i_a33_mbus_remove, .driver = { .name = "sun8i-a33-mbus", .of_match_table = sun8i_a33_mbus_of_match, diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c index 84bc32134862..5baa83b85515 100644 --- a/drivers/dma-buf/dma-buf.c +++ b/drivers/dma-buf/dma-buf.c @@ -703,7 +703,7 @@ err_module: module_put(exp_info->owner); return ERR_PTR(ret); } -EXPORT_SYMBOL_NS_GPL(dma_buf_export, DMA_BUF); +EXPORT_SYMBOL_NS_GPL(dma_buf_export, "DMA_BUF"); /** * dma_buf_fd - returns a file descriptor for the given struct dma_buf @@ -727,7 +727,7 @@ int dma_buf_fd(struct dma_buf *dmabuf, int flags) return fd; } 
-EXPORT_SYMBOL_NS_GPL(dma_buf_fd, DMA_BUF); +EXPORT_SYMBOL_NS_GPL(dma_buf_fd, "DMA_BUF"); /** * dma_buf_get - returns the struct dma_buf related to an fd @@ -753,7 +753,7 @@ struct dma_buf *dma_buf_get(int fd) return file->private_data; } -EXPORT_SYMBOL_NS_GPL(dma_buf_get, DMA_BUF); +EXPORT_SYMBOL_NS_GPL(dma_buf_get, "DMA_BUF"); /** * dma_buf_put - decreases refcount of the buffer @@ -772,7 +772,7 @@ void dma_buf_put(struct dma_buf *dmabuf) fput(dmabuf->file); } -EXPORT_SYMBOL_NS_GPL(dma_buf_put, DMA_BUF); +EXPORT_SYMBOL_NS_GPL(dma_buf_put, "DMA_BUF"); static void mangle_sg_table(struct sg_table *sg_table) { @@ -978,7 +978,7 @@ err_unlock: dma_buf_detach(dmabuf, attach); return ERR_PTR(ret); } -EXPORT_SYMBOL_NS_GPL(dma_buf_dynamic_attach, DMA_BUF); +EXPORT_SYMBOL_NS_GPL(dma_buf_dynamic_attach, "DMA_BUF"); /** * dma_buf_attach - Wrapper for dma_buf_dynamic_attach @@ -993,7 +993,7 @@ struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf, { return dma_buf_dynamic_attach(dmabuf, dev, NULL, NULL); } -EXPORT_SYMBOL_NS_GPL(dma_buf_attach, DMA_BUF); +EXPORT_SYMBOL_NS_GPL(dma_buf_attach, "DMA_BUF"); static void __unmap_dma_buf(struct dma_buf_attachment *attach, struct sg_table *sg_table, @@ -1037,7 +1037,7 @@ void dma_buf_detach(struct dma_buf *dmabuf, struct dma_buf_attachment *attach) kfree(attach); } -EXPORT_SYMBOL_NS_GPL(dma_buf_detach, DMA_BUF); +EXPORT_SYMBOL_NS_GPL(dma_buf_detach, "DMA_BUF"); /** * dma_buf_pin - Lock down the DMA-buf @@ -1067,7 +1067,7 @@ int dma_buf_pin(struct dma_buf_attachment *attach) return ret; } -EXPORT_SYMBOL_NS_GPL(dma_buf_pin, DMA_BUF); +EXPORT_SYMBOL_NS_GPL(dma_buf_pin, "DMA_BUF"); /** * dma_buf_unpin - Unpin a DMA-buf @@ -1088,7 +1088,7 @@ void dma_buf_unpin(struct dma_buf_attachment *attach) if (dmabuf->ops->unpin) dmabuf->ops->unpin(attach); } -EXPORT_SYMBOL_NS_GPL(dma_buf_unpin, DMA_BUF); +EXPORT_SYMBOL_NS_GPL(dma_buf_unpin, "DMA_BUF"); /** * dma_buf_map_attachment - Returns the scatterlist table of the attachment; @@ 
-1176,7 +1176,7 @@ struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *attach, #endif /* CONFIG_DMA_API_DEBUG */ return sg_table; } -EXPORT_SYMBOL_NS_GPL(dma_buf_map_attachment, DMA_BUF); +EXPORT_SYMBOL_NS_GPL(dma_buf_map_attachment, "DMA_BUF"); /** * dma_buf_map_attachment_unlocked - Returns the scatterlist table of the attachment; @@ -1204,7 +1204,7 @@ dma_buf_map_attachment_unlocked(struct dma_buf_attachment *attach, return sg_table; } -EXPORT_SYMBOL_NS_GPL(dma_buf_map_attachment_unlocked, DMA_BUF); +EXPORT_SYMBOL_NS_GPL(dma_buf_map_attachment_unlocked, "DMA_BUF"); /** * dma_buf_unmap_attachment - unmaps and decreases usecount of the buffer;might @@ -1236,7 +1236,7 @@ void dma_buf_unmap_attachment(struct dma_buf_attachment *attach, !IS_ENABLED(CONFIG_DMABUF_MOVE_NOTIFY)) dma_buf_unpin(attach); } -EXPORT_SYMBOL_NS_GPL(dma_buf_unmap_attachment, DMA_BUF); +EXPORT_SYMBOL_NS_GPL(dma_buf_unmap_attachment, "DMA_BUF"); /** * dma_buf_unmap_attachment_unlocked - unmaps and decreases usecount of the buffer;might @@ -1261,7 +1261,7 @@ void dma_buf_unmap_attachment_unlocked(struct dma_buf_attachment *attach, dma_buf_unmap_attachment(attach, sg_table, direction); dma_resv_unlock(attach->dmabuf->resv); } -EXPORT_SYMBOL_NS_GPL(dma_buf_unmap_attachment_unlocked, DMA_BUF); +EXPORT_SYMBOL_NS_GPL(dma_buf_unmap_attachment_unlocked, "DMA_BUF"); /** * dma_buf_move_notify - notify attachments that DMA-buf is moving @@ -1281,7 +1281,7 @@ void dma_buf_move_notify(struct dma_buf *dmabuf) if (attach->importer_ops) attach->importer_ops->move_notify(attach); } -EXPORT_SYMBOL_NS_GPL(dma_buf_move_notify, DMA_BUF); +EXPORT_SYMBOL_NS_GPL(dma_buf_move_notify, "DMA_BUF"); /** * DOC: cpu access @@ -1429,7 +1429,7 @@ int dma_buf_begin_cpu_access(struct dma_buf *dmabuf, return ret; } -EXPORT_SYMBOL_NS_GPL(dma_buf_begin_cpu_access, DMA_BUF); +EXPORT_SYMBOL_NS_GPL(dma_buf_begin_cpu_access, "DMA_BUF"); /** * dma_buf_end_cpu_access - Must be called after accessing a dma_buf from the @@ 
-1457,7 +1457,7 @@ int dma_buf_end_cpu_access(struct dma_buf *dmabuf, return ret; } -EXPORT_SYMBOL_NS_GPL(dma_buf_end_cpu_access, DMA_BUF); +EXPORT_SYMBOL_NS_GPL(dma_buf_end_cpu_access, "DMA_BUF"); /** @@ -1499,7 +1499,7 @@ int dma_buf_mmap(struct dma_buf *dmabuf, struct vm_area_struct *vma, return dmabuf->ops->mmap(dmabuf, vma); } -EXPORT_SYMBOL_NS_GPL(dma_buf_mmap, DMA_BUF); +EXPORT_SYMBOL_NS_GPL(dma_buf_mmap, "DMA_BUF"); /** * dma_buf_vmap - Create virtual mapping for the buffer object into kernel @@ -1552,7 +1552,7 @@ int dma_buf_vmap(struct dma_buf *dmabuf, struct iosys_map *map) return 0; } -EXPORT_SYMBOL_NS_GPL(dma_buf_vmap, DMA_BUF); +EXPORT_SYMBOL_NS_GPL(dma_buf_vmap, "DMA_BUF"); /** * dma_buf_vmap_unlocked - Create virtual mapping for the buffer object into kernel @@ -1579,7 +1579,7 @@ int dma_buf_vmap_unlocked(struct dma_buf *dmabuf, struct iosys_map *map) return ret; } -EXPORT_SYMBOL_NS_GPL(dma_buf_vmap_unlocked, DMA_BUF); +EXPORT_SYMBOL_NS_GPL(dma_buf_vmap_unlocked, "DMA_BUF"); /** * dma_buf_vunmap - Unmap a vmap obtained by dma_buf_vmap. @@ -1603,7 +1603,7 @@ void dma_buf_vunmap(struct dma_buf *dmabuf, struct iosys_map *map) iosys_map_clear(&dmabuf->vmap_ptr); } } -EXPORT_SYMBOL_NS_GPL(dma_buf_vunmap, DMA_BUF); +EXPORT_SYMBOL_NS_GPL(dma_buf_vunmap, "DMA_BUF"); /** * dma_buf_vunmap_unlocked - Unmap a vmap obtained by dma_buf_vmap. 
@@ -1619,7 +1619,7 @@ void dma_buf_vunmap_unlocked(struct dma_buf *dmabuf, struct iosys_map *map) dma_buf_vunmap(dmabuf, map); dma_resv_unlock(dmabuf->resv); } -EXPORT_SYMBOL_NS_GPL(dma_buf_vunmap_unlocked, DMA_BUF); +EXPORT_SYMBOL_NS_GPL(dma_buf_vunmap_unlocked, "DMA_BUF"); #ifdef CONFIG_DEBUG_FS static int dma_buf_debug_show(struct seq_file *s, void *unused) diff --git a/drivers/dma/amd/qdma/qdma.c b/drivers/dma/amd/qdma/qdma.c index 6d9079458fe9..66f00ad67351 100644 --- a/drivers/dma/amd/qdma/qdma.c +++ b/drivers/dma/amd/qdma/qdma.c @@ -7,9 +7,9 @@ #include #include #include +#include #include #include -#include #include #include #include @@ -492,18 +492,9 @@ static int qdma_device_verify(struct qdma_device *qdev) static int qdma_device_setup(struct qdma_device *qdev) { - struct device *dev = &qdev->pdev->dev; u32 ring_sz = QDMA_DEFAULT_RING_SIZE; int ret = 0; - while (dev && get_dma_ops(dev)) - dev = dev->parent; - if (!dev) { - qdma_err(qdev, "dma device not found"); - return -EINVAL; - } - set_dma_ops(&qdev->pdev->dev, get_dma_ops(dev)); - ret = qdma_setup_fmap_context(qdev); if (ret) { qdma_err(qdev, "Failed setup fmap context"); @@ -548,11 +539,12 @@ static void qdma_free_queue_resources(struct dma_chan *chan) { struct qdma_queue *queue = to_qdma_queue(chan); struct qdma_device *qdev = queue->qdev; - struct device *dev = qdev->dma_dev.dev; + struct qdma_platdata *pdata; qdma_clear_queue_context(queue); vchan_free_chan_resources(&queue->vchan); - dma_free_coherent(dev, queue->ring_size * QDMA_MM_DESC_SIZE, + pdata = dev_get_platdata(&qdev->pdev->dev); + dma_free_coherent(pdata->dma_dev, queue->ring_size * QDMA_MM_DESC_SIZE, queue->desc_base, queue->dma_desc_base); } @@ -565,6 +557,7 @@ static int qdma_alloc_queue_resources(struct dma_chan *chan) struct qdma_queue *queue = to_qdma_queue(chan); struct qdma_device *qdev = queue->qdev; struct qdma_ctxt_sw_desc desc; + struct qdma_platdata *pdata; size_t size; int ret; @@ -572,8 +565,9 @@ static int 
qdma_alloc_queue_resources(struct dma_chan *chan) if (ret) return ret; + pdata = dev_get_platdata(&qdev->pdev->dev); size = queue->ring_size * QDMA_MM_DESC_SIZE; - queue->desc_base = dma_alloc_coherent(qdev->dma_dev.dev, size, + queue->desc_base = dma_alloc_coherent(pdata->dma_dev, size, &queue->dma_desc_base, GFP_KERNEL); if (!queue->desc_base) { @@ -588,7 +582,7 @@ static int qdma_alloc_queue_resources(struct dma_chan *chan) if (ret) { qdma_err(qdev, "Failed to setup SW desc ctxt for %s", chan->name); - dma_free_coherent(qdev->dma_dev.dev, size, queue->desc_base, + dma_free_coherent(pdata->dma_dev, size, queue->desc_base, queue->dma_desc_base); return ret; } @@ -948,8 +942,9 @@ static int qdma_init_error_irq(struct qdma_device *qdev) static int qdmam_alloc_qintr_rings(struct qdma_device *qdev) { - u32 ctxt[QDMA_CTXT_REGMAP_LEN]; + struct qdma_platdata *pdata = dev_get_platdata(&qdev->pdev->dev); struct device *dev = &qdev->pdev->dev; + u32 ctxt[QDMA_CTXT_REGMAP_LEN]; struct qdma_intr_ring *ring; struct qdma_ctxt_intr intr_ctxt; u32 vector; @@ -969,7 +964,8 @@ static int qdmam_alloc_qintr_rings(struct qdma_device *qdev) ring->msix_id = qdev->err_irq_idx + i + 1; ring->ridx = i; ring->color = 1; - ring->base = dmam_alloc_coherent(dev, QDMA_INTR_RING_SIZE, + ring->base = dmam_alloc_coherent(pdata->dma_dev, + QDMA_INTR_RING_SIZE, &ring->dev_base, GFP_KERNEL); if (!ring->base) { qdma_err(qdev, "Failed to alloc intr ring %d", i); diff --git a/drivers/dma/apple-admac.c b/drivers/dma/apple-admac.c index c499173d80b2..bd49f0374291 100644 --- a/drivers/dma/apple-admac.c +++ b/drivers/dma/apple-admac.c @@ -153,6 +153,8 @@ static int admac_alloc_sram_carveout(struct admac_data *ad, { struct admac_sram *sram; int i, ret = 0, nblocks; + ad->txcache.size = readl_relaxed(ad->base + REG_TX_SRAM_SIZE); + ad->rxcache.size = readl_relaxed(ad->base + REG_RX_SRAM_SIZE); if (dir == DMA_MEM_TO_DEV) sram = &ad->txcache; @@ -912,12 +914,7 @@ static int admac_probe(struct platform_device 
*pdev) goto free_irq; } - ad->txcache.size = readl_relaxed(ad->base + REG_TX_SRAM_SIZE); - ad->rxcache.size = readl_relaxed(ad->base + REG_RX_SRAM_SIZE); - dev_info(&pdev->dev, "Audio DMA Controller\n"); - dev_info(&pdev->dev, "imprint %x TX cache %u RX cache %u\n", - readl_relaxed(ad->base + REG_IMPRINT), ad->txcache.size, ad->rxcache.size); return 0; diff --git a/drivers/dma/at_xdmac.c b/drivers/dma/at_xdmac.c index 9c7b40220004..ba25c23164e7 100644 --- a/drivers/dma/at_xdmac.c +++ b/drivers/dma/at_xdmac.c @@ -1363,6 +1363,8 @@ at_xdmac_prep_dma_memset(struct dma_chan *chan, dma_addr_t dest, int value, return NULL; desc = at_xdmac_memset_create_desc(chan, atchan, dest, len, value); + if (!desc) + return NULL; list_add_tail(&desc->desc_node, &desc->descs_list); desc->tx_dma_desc.cookie = -EBUSY; diff --git a/drivers/dma/dw/acpi.c b/drivers/dma/dw/acpi.c index c510c109d2c3..b6452fffa657 100644 --- a/drivers/dma/dw/acpi.c +++ b/drivers/dma/dw/acpi.c @@ -8,13 +8,15 @@ static bool dw_dma_acpi_filter(struct dma_chan *chan, void *param) { + struct dw_dma *dw = to_dw_dma(chan->device); + struct dw_dma_chip_pdata *data = dev_get_drvdata(dw->dma.dev); struct acpi_dma_spec *dma_spec = param; struct dw_dma_slave slave = { .dma_dev = dma_spec->dev, .src_id = dma_spec->slave_id, .dst_id = dma_spec->slave_id, - .m_master = 0, - .p_master = 1, + .m_master = data->m_master, + .p_master = data->p_master, }; return dw_dma_filter(chan, &slave); diff --git a/drivers/dma/dw/internal.h b/drivers/dma/dw/internal.h index 563ce73488db..f1bd06a20cd6 100644 --- a/drivers/dma/dw/internal.h +++ b/drivers/dma/dw/internal.h @@ -51,11 +51,15 @@ struct dw_dma_chip_pdata { int (*probe)(struct dw_dma_chip *chip); int (*remove)(struct dw_dma_chip *chip); struct dw_dma_chip *chip; + u8 m_master; + u8 p_master; }; static __maybe_unused const struct dw_dma_chip_pdata dw_dma_chip_pdata = { .probe = dw_dma_probe, .remove = dw_dma_remove, + .m_master = 0, + .p_master = 1, }; static const struct 
dw_dma_platform_data idma32_pdata = { @@ -72,6 +76,8 @@ static __maybe_unused const struct dw_dma_chip_pdata idma32_chip_pdata = { .pdata = &idma32_pdata, .probe = idma32_dma_probe, .remove = idma32_dma_remove, + .m_master = 0, + .p_master = 0, }; static const struct dw_dma_platform_data xbar_pdata = { @@ -88,6 +94,8 @@ static __maybe_unused const struct dw_dma_chip_pdata xbar_chip_pdata = { .pdata = &xbar_pdata, .probe = idma32_dma_probe, .remove = idma32_dma_remove, + .m_master = 0, + .p_master = 0, }; #endif /* _DMA_DW_INTERNAL_H */ diff --git a/drivers/dma/dw/pci.c b/drivers/dma/dw/pci.c index ad2d4d012cf7..e8a0eb81726a 100644 --- a/drivers/dma/dw/pci.c +++ b/drivers/dma/dw/pci.c @@ -56,10 +56,10 @@ static int dw_pci_probe(struct pci_dev *pdev, const struct pci_device_id *pid) if (ret) return ret; - dw_dma_acpi_controller_register(chip->dw); - pci_set_drvdata(pdev, data); + dw_dma_acpi_controller_register(chip->dw); + return 0; } diff --git a/drivers/dma/fsl-edma-common.h b/drivers/dma/fsl-edma-common.h index ce37e1ee9c46..fe8f103d4a63 100644 --- a/drivers/dma/fsl-edma-common.h +++ b/drivers/dma/fsl-edma-common.h @@ -166,6 +166,7 @@ struct fsl_edma_chan { struct work_struct issue_worker; struct platform_device *pdev; struct device *pd_dev; + struct device_link *pd_dev_link; u32 srcid; struct clk *clk; int priority; diff --git a/drivers/dma/fsl-edma-main.c b/drivers/dma/fsl-edma-main.c index 60de1003193a..1a613236b3e4 100644 --- a/drivers/dma/fsl-edma-main.c +++ b/drivers/dma/fsl-edma-main.c @@ -417,10 +417,33 @@ static const struct of_device_id fsl_edma_dt_ids[] = { }; MODULE_DEVICE_TABLE(of, fsl_edma_dt_ids); +static void fsl_edma3_detach_pd(struct fsl_edma_engine *fsl_edma) +{ + struct fsl_edma_chan *fsl_chan; + int i; + + for (i = 0; i < fsl_edma->n_chans; i++) { + if (fsl_edma->chan_masked & BIT(i)) + continue; + fsl_chan = &fsl_edma->chans[i]; + if (fsl_chan->pd_dev_link) + device_link_del(fsl_chan->pd_dev_link); + if (fsl_chan->pd_dev) { + 
dev_pm_domain_detach(fsl_chan->pd_dev, false); + pm_runtime_dont_use_autosuspend(fsl_chan->pd_dev); + pm_runtime_set_suspended(fsl_chan->pd_dev); + } + } +} + +static void devm_fsl_edma3_detach_pd(void *data) +{ + fsl_edma3_detach_pd(data); +} + static int fsl_edma3_attach_pd(struct platform_device *pdev, struct fsl_edma_engine *fsl_edma) { struct fsl_edma_chan *fsl_chan; - struct device_link *link; struct device *pd_chan; struct device *dev; int i; @@ -436,15 +459,16 @@ static int fsl_edma3_attach_pd(struct platform_device *pdev, struct fsl_edma_eng pd_chan = dev_pm_domain_attach_by_id(dev, i); if (IS_ERR_OR_NULL(pd_chan)) { dev_err(dev, "Failed attach pd %d\n", i); - return -EINVAL; + goto detach; } - link = device_link_add(dev, pd_chan, DL_FLAG_STATELESS | + fsl_chan->pd_dev_link = device_link_add(dev, pd_chan, DL_FLAG_STATELESS | DL_FLAG_PM_RUNTIME | DL_FLAG_RPM_ACTIVE); - if (!link) { + if (!fsl_chan->pd_dev_link) { dev_err(dev, "Failed to add device_link to %d\n", i); - return -EINVAL; + dev_pm_domain_detach(pd_chan, false); + goto detach; } fsl_chan->pd_dev = pd_chan; @@ -455,6 +479,10 @@ static int fsl_edma3_attach_pd(struct platform_device *pdev, struct fsl_edma_eng } return 0; + +detach: + fsl_edma3_detach_pd(fsl_edma); + return -EINVAL; } static int fsl_edma_probe(struct platform_device *pdev) @@ -544,6 +572,9 @@ static int fsl_edma_probe(struct platform_device *pdev) ret = fsl_edma3_attach_pd(pdev, fsl_edma); if (ret) return ret; + ret = devm_add_action_or_reset(&pdev->dev, devm_fsl_edma3_detach_pd, fsl_edma); + if (ret) + return ret; } if (drvdata->flags & FSL_EDMA_DRV_TCD64) diff --git a/drivers/dma/idxd/Makefile b/drivers/dma/idxd/Makefile index 2b4a0d406e1e..9ff9d7b87b64 100644 --- a/drivers/dma/idxd/Makefile +++ b/drivers/dma/idxd/Makefile @@ -1,4 +1,4 @@ -ccflags-y += -DDEFAULT_SYMBOL_NAMESPACE=IDXD +ccflags-y += -DDEFAULT_SYMBOL_NAMESPACE='"IDXD"' obj-$(CONFIG_INTEL_IDXD_BUS) += idxd_bus.o idxd_bus-y := bus.o diff --git 
a/drivers/dma/idxd/compat.c b/drivers/dma/idxd/compat.c index a4adb0c17995..eff9943f1a42 100644 --- a/drivers/dma/idxd/compat.c +++ b/drivers/dma/idxd/compat.c @@ -103,4 +103,4 @@ struct idxd_device_driver dsa_drv = { }; module_idxd_driver(dsa_drv); -MODULE_IMPORT_NS(IDXD); +MODULE_IMPORT_NS("IDXD"); diff --git a/drivers/dma/idxd/device.c b/drivers/dma/idxd/device.c index c41ef195eeb9..5cf419fe6b46 100644 --- a/drivers/dma/idxd/device.c +++ b/drivers/dma/idxd/device.c @@ -161,7 +161,7 @@ int idxd_wq_alloc_resources(struct idxd_wq *wq) free_hw_descs(wq); return rc; } -EXPORT_SYMBOL_NS_GPL(idxd_wq_alloc_resources, IDXD); +EXPORT_SYMBOL_NS_GPL(idxd_wq_alloc_resources, "IDXD"); void idxd_wq_free_resources(struct idxd_wq *wq) { @@ -175,7 +175,7 @@ void idxd_wq_free_resources(struct idxd_wq *wq) dma_free_coherent(dev, wq->compls_size, wq->compls, wq->compls_addr); sbitmap_queue_free(&wq->sbq); } -EXPORT_SYMBOL_NS_GPL(idxd_wq_free_resources, IDXD); +EXPORT_SYMBOL_NS_GPL(idxd_wq_free_resources, "IDXD"); int idxd_wq_enable(struct idxd_wq *wq) { @@ -407,7 +407,7 @@ int idxd_wq_init_percpu_ref(struct idxd_wq *wq) reinit_completion(&wq->wq_resurrect); return 0; } -EXPORT_SYMBOL_NS_GPL(idxd_wq_init_percpu_ref, IDXD); +EXPORT_SYMBOL_NS_GPL(idxd_wq_init_percpu_ref, "IDXD"); void __idxd_wq_quiesce(struct idxd_wq *wq) { @@ -417,7 +417,7 @@ void __idxd_wq_quiesce(struct idxd_wq *wq) complete_all(&wq->wq_resurrect); wait_for_completion(&wq->wq_dead); } -EXPORT_SYMBOL_NS_GPL(__idxd_wq_quiesce, IDXD); +EXPORT_SYMBOL_NS_GPL(__idxd_wq_quiesce, "IDXD"); void idxd_wq_quiesce(struct idxd_wq *wq) { @@ -425,7 +425,7 @@ void idxd_wq_quiesce(struct idxd_wq *wq) __idxd_wq_quiesce(wq); mutex_unlock(&wq->wq_lock); } -EXPORT_SYMBOL_NS_GPL(idxd_wq_quiesce, IDXD); +EXPORT_SYMBOL_NS_GPL(idxd_wq_quiesce, "IDXD"); /* Device control bits */ static inline bool idxd_is_enabled(struct idxd_device *idxd) @@ -1494,7 +1494,7 @@ err_map_portal: err: return rc; } -EXPORT_SYMBOL_NS_GPL(idxd_drv_enable_wq, IDXD); 
+EXPORT_SYMBOL_NS_GPL(idxd_drv_enable_wq, "IDXD"); void idxd_drv_disable_wq(struct idxd_wq *wq) { @@ -1516,7 +1516,7 @@ void idxd_drv_disable_wq(struct idxd_wq *wq) wq->type = IDXD_WQT_NONE; wq->client_count = 0; } -EXPORT_SYMBOL_NS_GPL(idxd_drv_disable_wq, IDXD); +EXPORT_SYMBOL_NS_GPL(idxd_drv_disable_wq, "IDXD"); int idxd_device_drv_probe(struct idxd_dev *idxd_dev) { diff --git a/drivers/dma/idxd/init.c b/drivers/dma/idxd/init.c index 234c1c658ec7..140f8d772bee 100644 --- a/drivers/dma/idxd/init.c +++ b/drivers/dma/idxd/init.c @@ -25,7 +25,7 @@ MODULE_VERSION(IDXD_DRIVER_VERSION); MODULE_DESCRIPTION("Intel Data Streaming Accelerator and In-Memory Analytics Accelerator common driver"); MODULE_LICENSE("GPL v2"); MODULE_AUTHOR("Intel Corporation"); -MODULE_IMPORT_NS(IDXD); +MODULE_IMPORT_NS("IDXD"); static bool sva = true; module_param(sva, bool, 0644); diff --git a/drivers/dma/idxd/submit.c b/drivers/dma/idxd/submit.c index 94eca25ae9b9..6db1c5fcedc5 100644 --- a/drivers/dma/idxd/submit.c +++ b/drivers/dma/idxd/submit.c @@ -61,7 +61,7 @@ struct idxd_desc *idxd_alloc_desc(struct idxd_wq *wq, enum idxd_op_type optype) return __get_desc(wq, idx, cpu); } -EXPORT_SYMBOL_NS_GPL(idxd_alloc_desc, IDXD); +EXPORT_SYMBOL_NS_GPL(idxd_alloc_desc, "IDXD"); void idxd_free_desc(struct idxd_wq *wq, struct idxd_desc *desc) { @@ -70,7 +70,7 @@ void idxd_free_desc(struct idxd_wq *wq, struct idxd_desc *desc) desc->cpu = -1; sbitmap_queue_clear(&wq->sbq, desc->id, cpu); } -EXPORT_SYMBOL_NS_GPL(idxd_free_desc, IDXD); +EXPORT_SYMBOL_NS_GPL(idxd_free_desc, "IDXD"); static struct idxd_desc *list_abort_desc(struct idxd_wq *wq, struct idxd_irq_entry *ie, struct idxd_desc *desc) @@ -219,4 +219,4 @@ int idxd_submit_desc(struct idxd_wq *wq, struct idxd_desc *desc) percpu_ref_put(&wq->wq_active); return 0; } -EXPORT_SYMBOL_NS_GPL(idxd_submit_desc, IDXD); +EXPORT_SYMBOL_NS_GPL(idxd_submit_desc, "IDXD"); diff --git a/drivers/dma/loongson2-apb-dma.c b/drivers/dma/loongson2-apb-dma.c index 
367ed34ce4da..c528f02b9f84 100644 --- a/drivers/dma/loongson2-apb-dma.c +++ b/drivers/dma/loongson2-apb-dma.c @@ -31,7 +31,7 @@ #define LDMA_ASK_VALID BIT(2) #define LDMA_START BIT(3) /* DMA start operation */ #define LDMA_STOP BIT(4) /* DMA stop operation */ -#define LDMA_CONFIG_MASK GENMASK(4, 0) /* DMA controller config bits mask */ +#define LDMA_CONFIG_MASK GENMASK_ULL(4, 0) /* DMA controller config bits mask */ /* Bitfields in ndesc_addr field of HW descriptor */ #define LDMA_DESC_EN BIT(0) /*1: The next descriptor is valid */ diff --git a/drivers/dma/mv_xor.c b/drivers/dma/mv_xor.c index 43efce77bb57..40b76b40bc30 100644 --- a/drivers/dma/mv_xor.c +++ b/drivers/dma/mv_xor.c @@ -1388,6 +1388,7 @@ static int mv_xor_probe(struct platform_device *pdev) irq = irq_of_parse_and_map(np, 0); if (!irq) { ret = -ENODEV; + of_node_put(np); goto err_channel_add; } @@ -1396,6 +1397,7 @@ static int mv_xor_probe(struct platform_device *pdev) if (IS_ERR(chan)) { ret = PTR_ERR(chan); irq_dispose_mapping(irq); + of_node_put(np); goto err_channel_add; } diff --git a/drivers/dma/tegra186-gpc-dma.c b/drivers/dma/tegra186-gpc-dma.c index cacf3757adc2..4d6fe0efa76e 100644 --- a/drivers/dma/tegra186-gpc-dma.c +++ b/drivers/dma/tegra186-gpc-dma.c @@ -231,6 +231,7 @@ struct tegra_dma_channel { bool config_init; char name[30]; enum dma_transfer_direction sid_dir; + enum dma_status status; int id; int irq; int slave_id; @@ -393,6 +394,8 @@ static int tegra_dma_pause(struct tegra_dma_channel *tdc) tegra_dma_dump_chan_regs(tdc); } + tdc->status = DMA_PAUSED; + return ret; } @@ -419,6 +422,8 @@ static void tegra_dma_resume(struct tegra_dma_channel *tdc) val = tdc_read(tdc, TEGRA_GPCDMA_CHAN_CSRE); val &= ~TEGRA_GPCDMA_CHAN_CSRE_PAUSE; tdc_write(tdc, TEGRA_GPCDMA_CHAN_CSRE, val); + + tdc->status = DMA_IN_PROGRESS; } static int tegra_dma_device_resume(struct dma_chan *dc) @@ -544,6 +549,7 @@ static void tegra_dma_xfer_complete(struct tegra_dma_channel *tdc) tegra_dma_sid_free(tdc); 
tdc->dma_desc = NULL; + tdc->status = DMA_COMPLETE; } static void tegra_dma_chan_decode_error(struct tegra_dma_channel *tdc, @@ -716,6 +722,7 @@ static int tegra_dma_terminate_all(struct dma_chan *dc) tdc->dma_desc = NULL; } + tdc->status = DMA_COMPLETE; tegra_dma_sid_free(tdc); vchan_get_all_descriptors(&tdc->vc, &head); spin_unlock_irqrestore(&tdc->vc.lock, flags); @@ -769,6 +776,9 @@ static enum dma_status tegra_dma_tx_status(struct dma_chan *dc, if (ret == DMA_COMPLETE) return ret; + if (tdc->status == DMA_PAUSED) + ret = DMA_PAUSED; + spin_lock_irqsave(&tdc->vc.lock, flags); vd = vchan_find_desc(&tdc->vc, cookie); if (vd) { diff --git a/drivers/edac/altera_edac.c b/drivers/edac/altera_edac.c index fe89f5c4837f..3e971f902363 100644 --- a/drivers/edac/altera_edac.c +++ b/drivers/edac/altera_edac.c @@ -482,7 +482,7 @@ static const struct dev_pm_ops altr_sdram_pm_ops = { static struct platform_driver altr_sdram_edac_driver = { .probe = altr_sdram_probe, - .remove_new = altr_sdram_remove, + .remove = altr_sdram_remove, .driver = { .name = "altr_sdram_edac", #ifdef CONFIG_PM @@ -816,7 +816,7 @@ static void altr_edac_device_remove(struct platform_device *pdev) static struct platform_driver altr_edac_device_driver = { .probe = altr_edac_device_probe, - .remove_new = altr_edac_device_remove, + .remove = altr_edac_device_remove, .driver = { .name = "altr_edac_device", .of_match_table = altr_edac_device_of_match, diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c index ddfbdb66b794..5d356b7c4589 100644 --- a/drivers/edac/amd64_edac.c +++ b/drivers/edac/amd64_edac.c @@ -3362,36 +3362,24 @@ static bool dct_ecc_enabled(struct amd64_pvt *pvt) static bool umc_ecc_enabled(struct amd64_pvt *pvt) { - u8 umc_en_mask = 0, ecc_en_mask = 0; - u16 nid = pvt->mc_node_id; struct amd64_umc *umc; - u8 ecc_en = 0, i; + bool ecc_en = false; + int i; + /* Check whether at least one UMC is enabled: */ for_each_umc(i) { umc = &pvt->umc[i]; - /* Only check enabled UMCs. 
*/ - if (!(umc->sdp_ctrl & UMC_SDP_INIT)) - continue; - - umc_en_mask |= BIT(i); - - if (umc->umc_cap_hi & UMC_ECC_ENABLED) - ecc_en_mask |= BIT(i); + if (umc->sdp_ctrl & UMC_SDP_INIT && + umc->umc_cap_hi & UMC_ECC_ENABLED) { + ecc_en = true; + break; + } } - /* Check whether at least one UMC is enabled: */ - if (umc_en_mask) - ecc_en = umc_en_mask == ecc_en_mask; - else - edac_dbg(0, "Node %d: No enabled UMCs.\n", nid); + edac_dbg(3, "Node %d: DRAM ECC %s.\n", pvt->mc_node_id, (ecc_en ? "enabled" : "disabled")); - edac_dbg(3, "Node %d: DRAM ECC %s.\n", nid, (ecc_en ? "enabled" : "disabled")); - - if (!ecc_en) - return false; - else - return true; + return ecc_en; } static inline void diff --git a/drivers/edac/armada_xp_edac.c b/drivers/edac/armada_xp_edac.c index 589bc81f1249..d64248fcf4c0 100644 --- a/drivers/edac/armada_xp_edac.c +++ b/drivers/edac/armada_xp_edac.c @@ -364,7 +364,7 @@ static void axp_mc_remove(struct platform_device *pdev) static struct platform_driver axp_mc_driver = { .probe = axp_mc_probe, - .remove_new = axp_mc_remove, + .remove = axp_mc_remove, .driver = { .name = "armada_xp_mc_edac", .of_match_table = of_match_ptr(axp_mc_of_match), @@ -579,7 +579,7 @@ static void aurora_l2_remove(struct platform_device *pdev) static struct platform_driver aurora_l2_driver = { .probe = aurora_l2_probe, - .remove_new = aurora_l2_remove, + .remove = aurora_l2_remove, .driver = { .name = "aurora_l2_edac", .of_match_table = of_match_ptr(aurora_l2_of_match), diff --git a/drivers/edac/aspeed_edac.c b/drivers/edac/aspeed_edac.c index 157a480eb761..dadb8acbee3d 100644 --- a/drivers/edac/aspeed_edac.c +++ b/drivers/edac/aspeed_edac.c @@ -387,7 +387,7 @@ static struct platform_driver aspeed_driver = { .of_match_table = aspeed_of_match }, .probe = aspeed_probe, - .remove_new = aspeed_remove + .remove = aspeed_remove }; module_platform_driver(aspeed_driver); diff --git a/drivers/edac/bluefield_edac.c b/drivers/edac/bluefield_edac.c index 739132e5ed8a..4942a240c30f 
100644 --- a/drivers/edac/bluefield_edac.c +++ b/drivers/edac/bluefield_edac.c @@ -474,7 +474,7 @@ static struct platform_driver bluefield_edac_mc_driver = { .acpi_match_table = bluefield_mc_acpi_ids, }, .probe = bluefield_edac_mc_probe, - .remove_new = bluefield_edac_mc_remove, + .remove = bluefield_edac_mc_remove, }; module_platform_driver(bluefield_edac_mc_driver); diff --git a/drivers/edac/cell_edac.c b/drivers/edac/cell_edac.c index 2000f66fbf5c..c2420e2287ff 100644 --- a/drivers/edac/cell_edac.c +++ b/drivers/edac/cell_edac.c @@ -246,7 +246,7 @@ static struct platform_driver cell_edac_driver = { .name = "cbe-mic", }, .probe = cell_edac_probe, - .remove_new = cell_edac_remove, + .remove = cell_edac_remove, }; static int __init cell_edac_init(void) diff --git a/drivers/edac/cpc925_edac.c b/drivers/edac/cpc925_edac.c index eb702bc3aa29..9c9e4369c041 100644 --- a/drivers/edac/cpc925_edac.c +++ b/drivers/edac/cpc925_edac.c @@ -1027,7 +1027,7 @@ static void cpc925_remove(struct platform_device *pdev) static struct platform_driver cpc925_edac_driver = { .probe = cpc925_probe, - .remove_new = cpc925_remove, + .remove = cpc925_remove, .driver = { .name = "cpc925_edac", } diff --git a/drivers/edac/dmc520_edac.c b/drivers/edac/dmc520_edac.c index 5e52d31db3b8..64a4d0a07032 100644 --- a/drivers/edac/dmc520_edac.c +++ b/drivers/edac/dmc520_edac.c @@ -640,7 +640,7 @@ static struct platform_driver dmc520_edac_driver = { }, .probe = dmc520_edac_probe, - .remove_new = dmc520_edac_remove + .remove = dmc520_edac_remove }; module_platform_driver(dmc520_edac_driver); diff --git a/drivers/edac/highbank_l2_edac.c b/drivers/edac/highbank_l2_edac.c index 282ca6535f8f..24f163ff323f 100644 --- a/drivers/edac/highbank_l2_edac.c +++ b/drivers/edac/highbank_l2_edac.c @@ -128,7 +128,7 @@ static void highbank_l2_err_remove(struct platform_device *pdev) static struct platform_driver highbank_l2_edac_driver = { .probe = highbank_l2_err_probe, - .remove_new = highbank_l2_err_remove, + .remove 
= highbank_l2_err_remove, .driver = { .name = "hb_l2_edac", .of_match_table = hb_l2_err_of_match, diff --git a/drivers/edac/highbank_mc_edac.c b/drivers/edac/highbank_mc_edac.c index 1c5b888ab11d..a8879d72d064 100644 --- a/drivers/edac/highbank_mc_edac.c +++ b/drivers/edac/highbank_mc_edac.c @@ -261,7 +261,7 @@ static void highbank_mc_remove(struct platform_device *pdev) static struct platform_driver highbank_mc_edac_driver = { .probe = highbank_mc_probe, - .remove_new = highbank_mc_remove, + .remove = highbank_mc_remove, .driver = { .name = "hb_mc_edac", .of_match_table = hb_ddr_ctrl_of_match, diff --git a/drivers/edac/layerscape_edac.c b/drivers/edac/layerscape_edac.c index 9a0c92ebbc3c..a2caa7fc5412 100644 --- a/drivers/edac/layerscape_edac.c +++ b/drivers/edac/layerscape_edac.c @@ -28,7 +28,7 @@ MODULE_DEVICE_TABLE(of, fsl_ddr_mc_err_of_match); static struct platform_driver fsl_ddr_mc_err_driver = { .probe = fsl_mc_err_probe, - .remove_new = fsl_mc_err_remove, + .remove = fsl_mc_err_remove, .driver = { .name = "fsl_ddr_mc_err", .of_match_table = fsl_ddr_mc_err_of_match, diff --git a/drivers/edac/mpc85xx_edac.c b/drivers/edac/mpc85xx_edac.c index d0266cbcbeda..a45dc6b35ede 100644 --- a/drivers/edac/mpc85xx_edac.c +++ b/drivers/edac/mpc85xx_edac.c @@ -323,7 +323,7 @@ static const struct platform_device_id mpc85xx_pci_err_match[] = { static struct platform_driver mpc85xx_pci_err_driver = { .probe = mpc85xx_pci_err_probe, - .remove_new = mpc85xx_pci_err_remove, + .remove = mpc85xx_pci_err_remove, .id_table = mpc85xx_pci_err_match, .driver = { .name = "mpc85xx_pci_err", @@ -627,7 +627,7 @@ MODULE_DEVICE_TABLE(of, mpc85xx_l2_err_of_match); static struct platform_driver mpc85xx_l2_err_driver = { .probe = mpc85xx_l2_err_probe, - .remove_new = mpc85xx_l2_err_remove, + .remove = mpc85xx_l2_err_remove, .driver = { .name = "mpc85xx_l2_err", .of_match_table = mpc85xx_l2_err_of_match, @@ -656,7 +656,7 @@ MODULE_DEVICE_TABLE(of, mpc85xx_mc_err_of_match); static struct 
platform_driver mpc85xx_mc_err_driver = { .probe = fsl_mc_err_probe, - .remove_new = fsl_mc_err_remove, + .remove = fsl_mc_err_remove, .driver = { .name = "mpc85xx_mc_err", .of_match_table = mpc85xx_mc_err_of_match, diff --git a/drivers/edac/npcm_edac.c b/drivers/edac/npcm_edac.c index 2e2133b784e9..e60a99eb8cfb 100644 --- a/drivers/edac/npcm_edac.c +++ b/drivers/edac/npcm_edac.c @@ -531,7 +531,7 @@ static struct platform_driver npcm_edac_driver = { .of_match_table = npcm_edac_of_match, }, .probe = edac_probe, - .remove_new = edac_remove, + .remove = edac_remove, }; module_platform_driver(npcm_edac_driver); diff --git a/drivers/edac/octeon_edac-l2c.c b/drivers/edac/octeon_edac-l2c.c index 2adb9c8093f8..e6b1595a3cb5 100644 --- a/drivers/edac/octeon_edac-l2c.c +++ b/drivers/edac/octeon_edac-l2c.c @@ -194,7 +194,7 @@ static void octeon_l2c_remove(struct platform_device *pdev) static struct platform_driver octeon_l2c_driver = { .probe = octeon_l2c_probe, - .remove_new = octeon_l2c_remove, + .remove = octeon_l2c_remove, .driver = { .name = "octeon_l2c_edac", } diff --git a/drivers/edac/octeon_edac-lmc.c b/drivers/edac/octeon_edac-lmc.c index 4112c2ee34b8..f7176b95b4fe 100644 --- a/drivers/edac/octeon_edac-lmc.c +++ b/drivers/edac/octeon_edac-lmc.c @@ -312,7 +312,7 @@ static void octeon_lmc_edac_remove(struct platform_device *pdev) static struct platform_driver octeon_lmc_edac_driver = { .probe = octeon_lmc_edac_probe, - .remove_new = octeon_lmc_edac_remove, + .remove = octeon_lmc_edac_remove, .driver = { .name = "octeon_lmc_edac", } diff --git a/drivers/edac/octeon_edac-pc.c b/drivers/edac/octeon_edac-pc.c index d9eeb40d2784..aa1219db0b17 100644 --- a/drivers/edac/octeon_edac-pc.c +++ b/drivers/edac/octeon_edac-pc.c @@ -130,7 +130,7 @@ static void co_cache_error_remove(struct platform_device *pdev) static struct platform_driver co_cache_error_driver = { .probe = co_cache_error_probe, - .remove_new = co_cache_error_remove, + .remove = co_cache_error_remove, .driver = { 
.name = "octeon_pc_edac", } diff --git a/drivers/edac/octeon_edac-pci.c b/drivers/edac/octeon_edac-pci.c index 4d368af2c5f0..c4f3bc33a971 100644 --- a/drivers/edac/octeon_edac-pci.c +++ b/drivers/edac/octeon_edac-pci.c @@ -97,7 +97,7 @@ static void octeon_pci_remove(struct platform_device *pdev) static struct platform_driver octeon_pci_driver = { .probe = octeon_pci_probe, - .remove_new = octeon_pci_remove, + .remove = octeon_pci_remove, .driver = { .name = "octeon_pci_edac", } diff --git a/drivers/edac/qcom_edac.c b/drivers/edac/qcom_edac.c index a9a8ba067007..04c42c83a2ba 100644 --- a/drivers/edac/qcom_edac.c +++ b/drivers/edac/qcom_edac.c @@ -407,7 +407,7 @@ MODULE_DEVICE_TABLE(platform, qcom_llcc_edac_id_table); static struct platform_driver qcom_llcc_edac_driver = { .probe = qcom_llcc_edac_probe, - .remove_new = qcom_llcc_edac_remove, + .remove = qcom_llcc_edac_remove, .driver = { .name = "qcom_llcc_edac", }, diff --git a/drivers/edac/synopsys_edac.c b/drivers/edac/synopsys_edac.c index d7416166fd8a..5ed32a3299c4 100644 --- a/drivers/edac/synopsys_edac.c +++ b/drivers/edac/synopsys_edac.c @@ -1488,7 +1488,7 @@ static struct platform_driver synps_edac_mc_driver = { .of_match_table = synps_edac_match, }, .probe = mc_probe, - .remove_new = mc_remove, + .remove = mc_remove, }; module_platform_driver(synps_edac_mc_driver); diff --git a/drivers/edac/ti_edac.c b/drivers/edac/ti_edac.c index 29723c9592f7..39cc2ef9cac4 100644 --- a/drivers/edac/ti_edac.c +++ b/drivers/edac/ti_edac.c @@ -322,7 +322,7 @@ static void ti_edac_remove(struct platform_device *pdev) static struct platform_driver ti_edac_driver = { .probe = ti_edac_probe, - .remove_new = ti_edac_remove, + .remove = ti_edac_remove, .driver = { .name = EDAC_MOD_NAME, .of_match_table = ti_edac_of_match, diff --git a/drivers/edac/versal_edac.c b/drivers/edac/versal_edac.c index a556d23e8261..5a43b5d43ca2 100644 --- a/drivers/edac/versal_edac.c +++ b/drivers/edac/versal_edac.c @@ -1186,7 +1186,7 @@ static struct 
platform_driver xilinx_ddr_edac_mc_driver = { .of_match_table = xlnx_edac_match, }, .probe = mc_probe, - .remove_new = mc_remove, + .remove = mc_remove, }; module_platform_driver(xilinx_ddr_edac_mc_driver); diff --git a/drivers/edac/xgene_edac.c b/drivers/edac/xgene_edac.c index fd87f1b2c145..699c7d29d80c 100644 --- a/drivers/edac/xgene_edac.c +++ b/drivers/edac/xgene_edac.c @@ -1989,7 +1989,7 @@ MODULE_DEVICE_TABLE(of, xgene_edac_of_match); static struct platform_driver xgene_edac_driver = { .probe = xgene_edac_probe, - .remove_new = xgene_edac_remove, + .remove = xgene_edac_remove, .driver = { .name = "xgene-edac", .of_match_table = xgene_edac_of_match, diff --git a/drivers/edac/zynqmp_edac.c b/drivers/edac/zynqmp_edac.c index c9dc78d8c824..cdffc9e4194d 100644 --- a/drivers/edac/zynqmp_edac.c +++ b/drivers/edac/zynqmp_edac.c @@ -455,7 +455,7 @@ static struct platform_driver zynqmp_ocm_edac_driver = { .of_match_table = zynqmp_ocm_edac_match, }, .probe = edac_probe, - .remove_new = edac_remove, + .remove = edac_remove, }; module_platform_driver(zynqmp_ocm_edac_driver); diff --git a/drivers/extcon/extcon-adc-jack.c b/drivers/extcon/extcon-adc-jack.c index 125016da7fde..46c40d85c2ac 100644 --- a/drivers/extcon/extcon-adc-jack.c +++ b/drivers/extcon/extcon-adc-jack.c @@ -196,7 +196,7 @@ static SIMPLE_DEV_PM_OPS(adc_jack_pm_ops, static struct platform_driver adc_jack_driver = { .probe = adc_jack_probe, - .remove_new = adc_jack_remove, + .remove = adc_jack_remove, .driver = { .name = "adc-jack", .pm = &adc_jack_pm_ops, diff --git a/drivers/extcon/extcon-intel-cht-wc.c b/drivers/extcon/extcon-intel-cht-wc.c index 93552dc3c895..8131a3d7d562 100644 --- a/drivers/extcon/extcon-intel-cht-wc.c +++ b/drivers/extcon/extcon-intel-cht-wc.c @@ -627,7 +627,7 @@ MODULE_DEVICE_TABLE(platform, cht_wc_extcon_table); static struct platform_driver cht_wc_extcon_driver = { .probe = cht_wc_extcon_probe, - .remove_new = cht_wc_extcon_remove, + .remove = cht_wc_extcon_remove, .id_table = 
cht_wc_extcon_table, .driver = { .name = "cht_wcove_pwrsrc", diff --git a/drivers/extcon/extcon-intel-mrfld.c b/drivers/extcon/extcon-intel-mrfld.c index a1f737f13d49..9219f4328d70 100644 --- a/drivers/extcon/extcon-intel-mrfld.c +++ b/drivers/extcon/extcon-intel-mrfld.c @@ -275,7 +275,7 @@ static struct platform_driver mrfld_extcon_driver = { .name = "mrfld_bcove_pwrsrc", }, .probe = mrfld_extcon_probe, - .remove_new = mrfld_extcon_remove, + .remove = mrfld_extcon_remove, .id_table = mrfld_extcon_id_table, }; module_platform_driver(mrfld_extcon_driver); diff --git a/drivers/extcon/extcon-max3355.c b/drivers/extcon/extcon-max3355.c index e62ce7a8d131..b2ee4ff8b04d 100644 --- a/drivers/extcon/extcon-max3355.c +++ b/drivers/extcon/extcon-max3355.c @@ -127,7 +127,7 @@ MODULE_DEVICE_TABLE(of, max3355_match_table); static struct platform_driver max3355_driver = { .probe = max3355_probe, - .remove_new = max3355_remove, + .remove = max3355_remove, .driver = { .name = "extcon-max3355", .of_match_table = max3355_match_table, diff --git a/drivers/extcon/extcon-max77843.c b/drivers/extcon/extcon-max77843.c index 9849e3b8327e..2ae9f7f1a67f 100644 --- a/drivers/extcon/extcon-max77843.c +++ b/drivers/extcon/extcon-max77843.c @@ -956,7 +956,7 @@ static struct platform_driver max77843_muic_driver = { .of_match_table = of_max77843_muic_dt_match, }, .probe = max77843_muic_probe, - .remove_new = max77843_muic_remove, + .remove = max77843_muic_remove, .id_table = max77843_muic_id, }; diff --git a/drivers/extcon/extcon-rtk-type-c.c b/drivers/extcon/extcon-rtk-type-c.c index 19a01e663733..bdc2b7b3a246 100644 --- a/drivers/extcon/extcon-rtk-type-c.c +++ b/drivers/extcon/extcon-rtk-type-c.c @@ -1778,7 +1778,7 @@ static const struct dev_pm_ops extcon_rtk_type_c_pm_ops = { static struct platform_driver extcon_rtk_type_c_driver = { .probe = extcon_rtk_type_c_probe, - .remove_new = extcon_rtk_type_c_remove, + .remove = extcon_rtk_type_c_remove, .driver = { .name = "extcon-rtk-type_c", 
.of_match_table = extcon_rtk_type_c_match, diff --git a/drivers/extcon/extcon-usb-gpio.c b/drivers/extcon/extcon-usb-gpio.c index 9b61eb99b7dc..5e8ad21ad206 100644 --- a/drivers/extcon/extcon-usb-gpio.c +++ b/drivers/extcon/extcon-usb-gpio.c @@ -279,7 +279,7 @@ MODULE_DEVICE_TABLE(platform, usb_extcon_platform_ids); static struct platform_driver usb_extcon_driver = { .probe = usb_extcon_probe, - .remove_new = usb_extcon_remove, + .remove = usb_extcon_remove, .driver = { .name = "extcon-usb-gpio", .pm = &usb_extcon_pm_ops, diff --git a/drivers/extcon/extcon-usbc-cros-ec.c b/drivers/extcon/extcon-usbc-cros-ec.c index 805a47230689..1fb627ea8b50 100644 --- a/drivers/extcon/extcon-usbc-cros-ec.c +++ b/drivers/extcon/extcon-usbc-cros-ec.c @@ -529,7 +529,7 @@ static struct platform_driver extcon_cros_ec_driver = { .of_match_table = of_match_ptr(extcon_cros_ec_of_match), .pm = DEV_PM_OPS, }, - .remove_new = extcon_cros_ec_remove, + .remove = extcon_cros_ec_remove, .probe = extcon_cros_ec_probe, }; diff --git a/drivers/firmware/arm_ffa/bus.c b/drivers/firmware/arm_ffa/bus.c index eb17d03b66fe..dfda5ffc14db 100644 --- a/drivers/firmware/arm_ffa/bus.c +++ b/drivers/firmware/arm_ffa/bus.c @@ -187,13 +187,18 @@ bool ffa_device_is_valid(struct ffa_device *ffa_dev) return valid; } -struct ffa_device *ffa_device_register(const uuid_t *uuid, int vm_id, - const struct ffa_ops *ops) +struct ffa_device * +ffa_device_register(const struct ffa_partition_info *part_info, + const struct ffa_ops *ops) { int id, ret; + uuid_t uuid; struct device *dev; struct ffa_device *ffa_dev; + if (!part_info) + return NULL; + id = ida_alloc_min(&ffa_bus_id, 1, GFP_KERNEL); if (id < 0) return NULL; @@ -210,9 +215,11 @@ struct ffa_device *ffa_device_register(const uuid_t *uuid, int vm_id, dev_set_name(&ffa_dev->dev, "arm-ffa-%d", id); ffa_dev->id = id; - ffa_dev->vm_id = vm_id; + ffa_dev->vm_id = part_info->id; + ffa_dev->properties = part_info->properties; ffa_dev->ops = ops; - uuid_copy(&ffa_dev->uuid, 
uuid); + import_uuid(&uuid, (u8 *)part_info->uuid); + uuid_copy(&ffa_dev->uuid, &uuid); ret = device_register(&ffa_dev->dev); if (ret) { diff --git a/drivers/firmware/arm_ffa/driver.c b/drivers/firmware/arm_ffa/driver.c index b14cbdae94e8..2c2ec3c35f15 100644 --- a/drivers/firmware/arm_ffa/driver.c +++ b/drivers/firmware/arm_ffa/driver.c @@ -1387,7 +1387,6 @@ static struct notifier_block ffa_bus_nb = { static int ffa_setup_partitions(void) { int count, idx, ret; - uuid_t uuid; struct ffa_device *ffa_dev; struct ffa_dev_part_info *info; struct ffa_partition_info *pbuf, *tpbuf; @@ -1406,23 +1405,19 @@ static int ffa_setup_partitions(void) xa_init(&drv_info->partition_info); for (idx = 0, tpbuf = pbuf; idx < count; idx++, tpbuf++) { - import_uuid(&uuid, (u8 *)tpbuf->uuid); - /* Note that if the UUID will be uuid_null, that will require * ffa_bus_notifier() to find the UUID of this partition id * with help of ffa_device_match_uuid(). FF-A v1.1 and above * provides UUID here for each partition as part of the * discovery API and the same is passed. 
*/ - ffa_dev = ffa_device_register(&uuid, tpbuf->id, &ffa_drv_ops); + ffa_dev = ffa_device_register(tpbuf, &ffa_drv_ops); if (!ffa_dev) { pr_err("%s: failed to register partition ID 0x%x\n", __func__, tpbuf->id); continue; } - ffa_dev->properties = tpbuf->properties; - if (drv_info->version > FFA_VERSION_1_0 && !(tpbuf->properties & FFA_PARTITION_AARCH64_EXEC)) ffa_mode_32bit_set(ffa_dev); diff --git a/drivers/firmware/arm_scmi/vendors/imx/Kconfig b/drivers/firmware/arm_scmi/vendors/imx/Kconfig index 2883ed24a84d..a01bf5e47301 100644 --- a/drivers/firmware/arm_scmi/vendors/imx/Kconfig +++ b/drivers/firmware/arm_scmi/vendors/imx/Kconfig @@ -15,6 +15,7 @@ config IMX_SCMI_BBM_EXT config IMX_SCMI_MISC_EXT tristate "i.MX SCMI MISC EXTENSION" depends on ARM_SCMI_PROTOCOL || (COMPILE_TEST && OF) + depends on IMX_SCMI_MISC_DRV default y if ARCH_MXC help This enables i.MX System MISC control logic such as gpio expander diff --git a/drivers/firmware/cirrus/cs_dsp.c b/drivers/firmware/cirrus/cs_dsp.c index 419220fa42fd..5365e9a43000 100644 --- a/drivers/firmware/cirrus/cs_dsp.c +++ b/drivers/firmware/cirrus/cs_dsp.c @@ -378,7 +378,7 @@ const char *cs_dsp_mem_region_name(unsigned int type) return NULL; } } -EXPORT_SYMBOL_NS_GPL(cs_dsp_mem_region_name, FW_CS_DSP); +EXPORT_SYMBOL_NS_GPL(cs_dsp_mem_region_name, "FW_CS_DSP"); #ifdef CONFIG_DEBUG_FS static void cs_dsp_debugfs_save_wmfwname(struct cs_dsp *dsp, const char *s) @@ -519,7 +519,7 @@ void cs_dsp_init_debugfs(struct cs_dsp *dsp, struct dentry *debugfs_root) dsp->debugfs_root = root; } -EXPORT_SYMBOL_NS_GPL(cs_dsp_init_debugfs, FW_CS_DSP); +EXPORT_SYMBOL_NS_GPL(cs_dsp_init_debugfs, "FW_CS_DSP"); /** * cs_dsp_cleanup_debugfs() - Removes DSP representation from debugfs @@ -531,17 +531,17 @@ void cs_dsp_cleanup_debugfs(struct cs_dsp *dsp) debugfs_remove_recursive(dsp->debugfs_root); dsp->debugfs_root = ERR_PTR(-ENODEV); } -EXPORT_SYMBOL_NS_GPL(cs_dsp_cleanup_debugfs, FW_CS_DSP); +EXPORT_SYMBOL_NS_GPL(cs_dsp_cleanup_debugfs, 
"FW_CS_DSP"); #else void cs_dsp_init_debugfs(struct cs_dsp *dsp, struct dentry *debugfs_root) { } -EXPORT_SYMBOL_NS_GPL(cs_dsp_init_debugfs, FW_CS_DSP); +EXPORT_SYMBOL_NS_GPL(cs_dsp_init_debugfs, "FW_CS_DSP"); void cs_dsp_cleanup_debugfs(struct cs_dsp *dsp) { } -EXPORT_SYMBOL_NS_GPL(cs_dsp_cleanup_debugfs, FW_CS_DSP); +EXPORT_SYMBOL_NS_GPL(cs_dsp_cleanup_debugfs, "FW_CS_DSP"); static inline void cs_dsp_debugfs_save_wmfwname(struct cs_dsp *dsp, const char *s) @@ -749,7 +749,7 @@ int cs_dsp_coeff_write_acked_control(struct cs_dsp_coeff_ctl *ctl, unsigned int return -ETIMEDOUT; } -EXPORT_SYMBOL_NS_GPL(cs_dsp_coeff_write_acked_control, FW_CS_DSP); +EXPORT_SYMBOL_NS_GPL(cs_dsp_coeff_write_acked_control, "FW_CS_DSP"); static int cs_dsp_coeff_write_ctrl_raw(struct cs_dsp_coeff_ctl *ctl, unsigned int off, const void *buf, size_t len) @@ -827,7 +827,7 @@ int cs_dsp_coeff_write_ctrl(struct cs_dsp_coeff_ctl *ctl, return 1; } -EXPORT_SYMBOL_NS_GPL(cs_dsp_coeff_write_ctrl, FW_CS_DSP); +EXPORT_SYMBOL_NS_GPL(cs_dsp_coeff_write_ctrl, "FW_CS_DSP"); /** * cs_dsp_coeff_lock_and_write_ctrl() - Writes the given buffer to the given coefficient control @@ -926,7 +926,7 @@ int cs_dsp_coeff_read_ctrl(struct cs_dsp_coeff_ctl *ctl, return ret; } -EXPORT_SYMBOL_NS_GPL(cs_dsp_coeff_read_ctrl, FW_CS_DSP); +EXPORT_SYMBOL_NS_GPL(cs_dsp_coeff_read_ctrl, "FW_CS_DSP"); /** * cs_dsp_coeff_lock_and_read_ctrl() - Reads the given coefficient control into the given buffer @@ -1679,7 +1679,7 @@ struct cs_dsp_coeff_ctl *cs_dsp_get_ctl(struct cs_dsp *dsp, const char *name, in return rslt; } -EXPORT_SYMBOL_NS_GPL(cs_dsp_get_ctl, FW_CS_DSP); +EXPORT_SYMBOL_NS_GPL(cs_dsp_get_ctl, "FW_CS_DSP"); static void cs_dsp_ctl_fixup_base(struct cs_dsp *dsp, const struct cs_dsp_alg_region *alg_region) @@ -1769,7 +1769,7 @@ struct cs_dsp_alg_region *cs_dsp_find_alg_region(struct cs_dsp *dsp, return NULL; } -EXPORT_SYMBOL_NS_GPL(cs_dsp_find_alg_region, FW_CS_DSP); +EXPORT_SYMBOL_NS_GPL(cs_dsp_find_alg_region, "FW_CS_DSP"); 
static struct cs_dsp_alg_region *cs_dsp_create_region(struct cs_dsp *dsp, int type, __be32 id, @@ -2404,7 +2404,7 @@ int cs_dsp_adsp1_init(struct cs_dsp *dsp) return cs_dsp_common_init(dsp); } -EXPORT_SYMBOL_NS_GPL(cs_dsp_adsp1_init, FW_CS_DSP); +EXPORT_SYMBOL_NS_GPL(cs_dsp_adsp1_init, "FW_CS_DSP"); /** * cs_dsp_adsp1_power_up() - Load and start the named firmware @@ -2496,7 +2496,7 @@ err_mutex: mutex_unlock(&dsp->pwr_lock); return ret; } -EXPORT_SYMBOL_NS_GPL(cs_dsp_adsp1_power_up, FW_CS_DSP); +EXPORT_SYMBOL_NS_GPL(cs_dsp_adsp1_power_up, "FW_CS_DSP"); /** * cs_dsp_adsp1_power_down() - Halts the DSP @@ -2528,7 +2528,7 @@ void cs_dsp_adsp1_power_down(struct cs_dsp *dsp) mutex_unlock(&dsp->pwr_lock); } -EXPORT_SYMBOL_NS_GPL(cs_dsp_adsp1_power_down, FW_CS_DSP); +EXPORT_SYMBOL_NS_GPL(cs_dsp_adsp1_power_down, "FW_CS_DSP"); static int cs_dsp_adsp2v2_enable_core(struct cs_dsp *dsp) { @@ -2680,7 +2680,7 @@ int cs_dsp_set_dspclk(struct cs_dsp *dsp, unsigned int freq) return ret; } -EXPORT_SYMBOL_NS_GPL(cs_dsp_set_dspclk, FW_CS_DSP); +EXPORT_SYMBOL_NS_GPL(cs_dsp_set_dspclk, "FW_CS_DSP"); static void cs_dsp_stop_watchdog(struct cs_dsp *dsp) { @@ -2770,7 +2770,7 @@ err_mutex: return ret; } -EXPORT_SYMBOL_NS_GPL(cs_dsp_power_up, FW_CS_DSP); +EXPORT_SYMBOL_NS_GPL(cs_dsp_power_up, "FW_CS_DSP"); /** * cs_dsp_power_down() - Powers-down the DSP @@ -2804,7 +2804,7 @@ void cs_dsp_power_down(struct cs_dsp *dsp) cs_dsp_dbg(dsp, "Shutdown complete\n"); } -EXPORT_SYMBOL_NS_GPL(cs_dsp_power_down, FW_CS_DSP); +EXPORT_SYMBOL_NS_GPL(cs_dsp_power_down, "FW_CS_DSP"); static int cs_dsp_adsp2_start_core(struct cs_dsp *dsp) { @@ -2890,7 +2890,7 @@ err: return ret; } -EXPORT_SYMBOL_NS_GPL(cs_dsp_run, FW_CS_DSP); +EXPORT_SYMBOL_NS_GPL(cs_dsp_run, "FW_CS_DSP"); /** * cs_dsp_stop() - Stops the firmware @@ -2929,7 +2929,7 @@ void cs_dsp_stop(struct cs_dsp *dsp) cs_dsp_dbg(dsp, "Execution stopped\n"); } -EXPORT_SYMBOL_NS_GPL(cs_dsp_stop, FW_CS_DSP); +EXPORT_SYMBOL_NS_GPL(cs_dsp_stop, "FW_CS_DSP"); 
static int cs_dsp_halo_start_core(struct cs_dsp *dsp) { @@ -2991,7 +2991,7 @@ int cs_dsp_adsp2_init(struct cs_dsp *dsp) return cs_dsp_common_init(dsp); } -EXPORT_SYMBOL_NS_GPL(cs_dsp_adsp2_init, FW_CS_DSP); +EXPORT_SYMBOL_NS_GPL(cs_dsp_adsp2_init, "FW_CS_DSP"); /** * cs_dsp_halo_init() - Initialise a cs_dsp structure representing a HALO Core DSP @@ -3008,7 +3008,7 @@ int cs_dsp_halo_init(struct cs_dsp *dsp) return cs_dsp_common_init(dsp); } -EXPORT_SYMBOL_NS_GPL(cs_dsp_halo_init, FW_CS_DSP); +EXPORT_SYMBOL_NS_GPL(cs_dsp_halo_init, "FW_CS_DSP"); /** * cs_dsp_remove() - Clean a cs_dsp before deletion @@ -3028,7 +3028,7 @@ void cs_dsp_remove(struct cs_dsp *dsp) cs_dsp_free_ctl_blk(ctl); } } -EXPORT_SYMBOL_NS_GPL(cs_dsp_remove, FW_CS_DSP); +EXPORT_SYMBOL_NS_GPL(cs_dsp_remove, "FW_CS_DSP"); /** * cs_dsp_read_raw_data_block() - Reads a block of data from DSP memory @@ -3065,7 +3065,7 @@ int cs_dsp_read_raw_data_block(struct cs_dsp *dsp, int mem_type, unsigned int me return 0; } -EXPORT_SYMBOL_NS_GPL(cs_dsp_read_raw_data_block, FW_CS_DSP); +EXPORT_SYMBOL_NS_GPL(cs_dsp_read_raw_data_block, "FW_CS_DSP"); /** * cs_dsp_read_data_word() - Reads a word from DSP memory @@ -3089,7 +3089,7 @@ int cs_dsp_read_data_word(struct cs_dsp *dsp, int mem_type, unsigned int mem_add return 0; } -EXPORT_SYMBOL_NS_GPL(cs_dsp_read_data_word, FW_CS_DSP); +EXPORT_SYMBOL_NS_GPL(cs_dsp_read_data_word, "FW_CS_DSP"); /** * cs_dsp_write_data_word() - Writes a word to DSP memory @@ -3115,7 +3115,7 @@ int cs_dsp_write_data_word(struct cs_dsp *dsp, int mem_type, unsigned int mem_ad return regmap_raw_write(dsp->regmap, reg, &val, sizeof(val)); } -EXPORT_SYMBOL_NS_GPL(cs_dsp_write_data_word, FW_CS_DSP); +EXPORT_SYMBOL_NS_GPL(cs_dsp_write_data_word, "FW_CS_DSP"); /** * cs_dsp_remove_padding() - Convert unpacked words to packed bytes @@ -3139,7 +3139,7 @@ void cs_dsp_remove_padding(u32 *buf, int nwords) *pack_out++ = (u8)(word >> 16); } } -EXPORT_SYMBOL_NS_GPL(cs_dsp_remove_padding, FW_CS_DSP); 
+EXPORT_SYMBOL_NS_GPL(cs_dsp_remove_padding, "FW_CS_DSP"); /** * cs_dsp_adsp2_bus_error() - Handle a DSP bus error interrupt @@ -3209,7 +3209,7 @@ void cs_dsp_adsp2_bus_error(struct cs_dsp *dsp) error: mutex_unlock(&dsp->pwr_lock); } -EXPORT_SYMBOL_NS_GPL(cs_dsp_adsp2_bus_error, FW_CS_DSP); +EXPORT_SYMBOL_NS_GPL(cs_dsp_adsp2_bus_error, "FW_CS_DSP"); /** * cs_dsp_halo_bus_error() - Handle a DSP bus error interrupt @@ -3269,7 +3269,7 @@ void cs_dsp_halo_bus_error(struct cs_dsp *dsp) exit_unlock: mutex_unlock(&dsp->pwr_lock); } -EXPORT_SYMBOL_NS_GPL(cs_dsp_halo_bus_error, FW_CS_DSP); +EXPORT_SYMBOL_NS_GPL(cs_dsp_halo_bus_error, "FW_CS_DSP"); /** * cs_dsp_halo_wdt_expire() - Handle DSP watchdog expiry @@ -3289,7 +3289,7 @@ void cs_dsp_halo_wdt_expire(struct cs_dsp *dsp) mutex_unlock(&dsp->pwr_lock); } -EXPORT_SYMBOL_NS_GPL(cs_dsp_halo_wdt_expire, FW_CS_DSP); +EXPORT_SYMBOL_NS_GPL(cs_dsp_halo_wdt_expire, "FW_CS_DSP"); static const struct cs_dsp_ops cs_dsp_adsp1_ops = { .validate_version = cs_dsp_validate_version, @@ -3419,7 +3419,7 @@ int cs_dsp_chunk_write(struct cs_dsp_chunk *ch, int nbits, u32 val) return 0; } -EXPORT_SYMBOL_NS_GPL(cs_dsp_chunk_write, FW_CS_DSP); +EXPORT_SYMBOL_NS_GPL(cs_dsp_chunk_write, "FW_CS_DSP"); /** * cs_dsp_chunk_flush() - Pad remaining data with zero and commit to chunk @@ -3438,7 +3438,7 @@ int cs_dsp_chunk_flush(struct cs_dsp_chunk *ch) return cs_dsp_chunk_write(ch, CS_DSP_DATA_WORD_BITS - ch->cachebits, 0); } -EXPORT_SYMBOL_NS_GPL(cs_dsp_chunk_flush, FW_CS_DSP); +EXPORT_SYMBOL_NS_GPL(cs_dsp_chunk_flush, "FW_CS_DSP"); /** * cs_dsp_chunk_read() - Parse data from a DSP memory chunk @@ -3480,7 +3480,7 @@ int cs_dsp_chunk_read(struct cs_dsp_chunk *ch, int nbits) return result; } -EXPORT_SYMBOL_NS_GPL(cs_dsp_chunk_read, FW_CS_DSP); +EXPORT_SYMBOL_NS_GPL(cs_dsp_chunk_read, "FW_CS_DSP"); struct cs_dsp_wseq_op { @@ -3605,7 +3605,7 @@ int cs_dsp_wseq_init(struct cs_dsp *dsp, struct cs_dsp_wseq *wseqs, unsigned int return 0; } 
-EXPORT_SYMBOL_NS_GPL(cs_dsp_wseq_init, FW_CS_DSP); +EXPORT_SYMBOL_NS_GPL(cs_dsp_wseq_init, "FW_CS_DSP"); static struct cs_dsp_wseq_op *cs_dsp_wseq_find_op(u32 addr, u8 op_code, struct list_head *wseq_ops) @@ -3720,7 +3720,7 @@ op_new_free: return ret; } -EXPORT_SYMBOL_NS_GPL(cs_dsp_wseq_write, FW_CS_DSP); +EXPORT_SYMBOL_NS_GPL(cs_dsp_wseq_write, "FW_CS_DSP"); /** * cs_dsp_wseq_multi_write() - Add or update multiple entries in a write sequence @@ -3752,7 +3752,7 @@ int cs_dsp_wseq_multi_write(struct cs_dsp *dsp, struct cs_dsp_wseq *wseq, return 0; } -EXPORT_SYMBOL_NS_GPL(cs_dsp_wseq_multi_write, FW_CS_DSP); +EXPORT_SYMBOL_NS_GPL(cs_dsp_wseq_multi_write, "FW_CS_DSP"); MODULE_DESCRIPTION("Cirrus Logic DSP Support"); MODULE_AUTHOR("Simon Trimmer "); diff --git a/drivers/firmware/efi/Kconfig b/drivers/firmware/efi/Kconfig index e312d731f4a3..5fe61b9ab5f9 100644 --- a/drivers/firmware/efi/Kconfig +++ b/drivers/firmware/efi/Kconfig @@ -76,10 +76,6 @@ config EFI_ZBOOT bool "Enable the generic EFI decompressor" depends on EFI_GENERIC_STUB && !ARM select HAVE_KERNEL_GZIP - select HAVE_KERNEL_LZ4 - select HAVE_KERNEL_LZMA - select HAVE_KERNEL_LZO - select HAVE_KERNEL_XZ select HAVE_KERNEL_ZSTD help Create the bootable image as an EFI application that carries the diff --git a/drivers/firmware/efi/efi-pstore.c b/drivers/firmware/efi/efi-pstore.c index 552c78f5f059..a253b6144945 100644 --- a/drivers/firmware/efi/efi-pstore.c +++ b/drivers/firmware/efi/efi-pstore.c @@ -6,7 +6,7 @@ #include #include -MODULE_IMPORT_NS(EFIVAR); +MODULE_IMPORT_NS("EFIVAR"); #define DUMP_NAME_LEN 66 diff --git a/drivers/firmware/efi/embedded-firmware.c b/drivers/firmware/efi/embedded-firmware.c index f5be8e22305b..b49a09d7e665 100644 --- a/drivers/firmware/efi/embedded-firmware.c +++ b/drivers/firmware/efi/embedded-firmware.c @@ -16,9 +16,9 @@ /* Exported for use by lib/test_firmware.c only */ LIST_HEAD(efi_embedded_fw_list); -EXPORT_SYMBOL_NS_GPL(efi_embedded_fw_list, TEST_FIRMWARE); 
+EXPORT_SYMBOL_NS_GPL(efi_embedded_fw_list, "TEST_FIRMWARE"); bool efi_embedded_fw_checked; -EXPORT_SYMBOL_NS_GPL(efi_embedded_fw_checked, TEST_FIRMWARE); +EXPORT_SYMBOL_NS_GPL(efi_embedded_fw_checked, "TEST_FIRMWARE"); static const struct dmi_system_id * const embedded_fw_table[] = { #ifdef CONFIG_TOUCHSCREEN_DMI diff --git a/drivers/firmware/efi/esrt.c b/drivers/firmware/efi/esrt.c index 7a81c0ce4780..4bb7b0584bc9 100644 --- a/drivers/firmware/efi/esrt.c +++ b/drivers/firmware/efi/esrt.c @@ -75,8 +75,6 @@ static LIST_HEAD(entry_list); struct esre_attribute { struct attribute attr; ssize_t (*show)(struct esre_entry *entry, char *buf); - ssize_t (*store)(struct esre_entry *entry, - const char *buf, size_t count); }; static struct esre_entry *to_entry(struct kobject *kobj) diff --git a/drivers/firmware/efi/libstub/Makefile.zboot b/drivers/firmware/efi/libstub/Makefile.zboot index 65ffd0b760b2..48842b5c106b 100644 --- a/drivers/firmware/efi/libstub/Makefile.zboot +++ b/drivers/firmware/efi/libstub/Makefile.zboot @@ -12,22 +12,16 @@ quiet_cmd_copy_and_pad = PAD $@ $(obj)/vmlinux.bin: $(obj)/$(EFI_ZBOOT_PAYLOAD) FORCE $(call if_changed,copy_and_pad) -comp-type-$(CONFIG_KERNEL_GZIP) := gzip -comp-type-$(CONFIG_KERNEL_LZ4) := lz4 -comp-type-$(CONFIG_KERNEL_LZMA) := lzma -comp-type-$(CONFIG_KERNEL_LZO) := lzo -comp-type-$(CONFIG_KERNEL_XZ) := xzkern -comp-type-$(CONFIG_KERNEL_ZSTD) := zstd22 - # in GZIP, the appended le32 carrying the uncompressed size is part of the # format, but in other cases, we just append it at the end for convenience, # causing the original tools to complain when checking image integrity. -# So disregard it when calculating the payload size in the zimage header. 
-zboot-method-y := $(comp-type-y)_with_size -zboot-size-len-y := 4 +comp-type-y := gzip +zboot-method-y := gzip +zboot-size-len-y := 0 -zboot-method-$(CONFIG_KERNEL_GZIP) := gzip -zboot-size-len-$(CONFIG_KERNEL_GZIP) := 0 +comp-type-$(CONFIG_KERNEL_ZSTD) := zstd +zboot-method-$(CONFIG_KERNEL_ZSTD) := zstd22_with_size +zboot-size-len-$(CONFIG_KERNEL_ZSTD) := 4 $(obj)/vmlinuz: $(obj)/vmlinux.bin FORCE $(call if_changed,$(zboot-method-y)) diff --git a/drivers/firmware/efi/vars.c b/drivers/firmware/efi/vars.c index 4056ba7f3440..3700e9869767 100644 --- a/drivers/firmware/efi/vars.c +++ b/drivers/firmware/efi/vars.c @@ -149,7 +149,7 @@ int efivar_lock(void) } return 0; } -EXPORT_SYMBOL_NS_GPL(efivar_lock, EFIVAR); +EXPORT_SYMBOL_NS_GPL(efivar_lock, "EFIVAR"); /* * efivar_lock() - obtain the efivar lock if it is free @@ -165,7 +165,7 @@ int efivar_trylock(void) } return 0; } -EXPORT_SYMBOL_NS_GPL(efivar_trylock, EFIVAR); +EXPORT_SYMBOL_NS_GPL(efivar_trylock, "EFIVAR"); /* * efivar_unlock() - release the efivar lock @@ -174,7 +174,7 @@ void efivar_unlock(void) { up(&efivars_lock); } -EXPORT_SYMBOL_NS_GPL(efivar_unlock, EFIVAR); +EXPORT_SYMBOL_NS_GPL(efivar_unlock, "EFIVAR"); /* * efivar_get_variable() - retrieve a variable identified by name/vendor @@ -186,7 +186,7 @@ efi_status_t efivar_get_variable(efi_char16_t *name, efi_guid_t *vendor, { return __efivars->ops->get_variable(name, vendor, attr, size, data); } -EXPORT_SYMBOL_NS_GPL(efivar_get_variable, EFIVAR); +EXPORT_SYMBOL_NS_GPL(efivar_get_variable, "EFIVAR"); /* * efivar_get_next_variable() - enumerate the next name/vendor pair @@ -198,7 +198,7 @@ efi_status_t efivar_get_next_variable(unsigned long *name_size, { return __efivars->ops->get_next_variable(name_size, name, vendor); } -EXPORT_SYMBOL_NS_GPL(efivar_get_next_variable, EFIVAR); +EXPORT_SYMBOL_NS_GPL(efivar_get_next_variable, "EFIVAR"); /* * efivar_set_variable_locked() - set a variable identified by name/vendor @@ -230,7 +230,7 @@ efi_status_t 
efivar_set_variable_locked(efi_char16_t *name, efi_guid_t *vendor, return setvar(name, vendor, attr, data_size, data); } -EXPORT_SYMBOL_NS_GPL(efivar_set_variable_locked, EFIVAR); +EXPORT_SYMBOL_NS_GPL(efivar_set_variable_locked, "EFIVAR"); /* * efivar_set_variable() - set a variable identified by name/vendor @@ -252,7 +252,7 @@ efi_status_t efivar_set_variable(efi_char16_t *name, efi_guid_t *vendor, efivar_unlock(); return status; } -EXPORT_SYMBOL_NS_GPL(efivar_set_variable, EFIVAR); +EXPORT_SYMBOL_NS_GPL(efivar_set_variable, "EFIVAR"); efi_status_t efivar_query_variable_info(u32 attr, u64 *storage_space, @@ -264,4 +264,4 @@ efi_status_t efivar_query_variable_info(u32 attr, return __efivars->ops->query_variable_info(attr, storage_space, remaining_space, max_variable_size); } -EXPORT_SYMBOL_NS_GPL(efivar_query_variable_info, EFIVAR); +EXPORT_SYMBOL_NS_GPL(efivar_query_variable_info, "EFIVAR"); diff --git a/drivers/firmware/imx/Kconfig b/drivers/firmware/imx/Kconfig index 477d3f32d99a..907cd149c40a 100644 --- a/drivers/firmware/imx/Kconfig +++ b/drivers/firmware/imx/Kconfig @@ -25,7 +25,6 @@ config IMX_SCU config IMX_SCMI_MISC_DRV tristate "IMX SCMI MISC Protocol driver" - depends on IMX_SCMI_MISC_EXT || COMPILE_TEST default y if ARCH_MXC help The System Controller Management Interface firmware (SCMI FW) is diff --git a/drivers/firmware/microchip/mpfs-auto-update.c b/drivers/firmware/microchip/mpfs-auto-update.c index 38a03698cec9..e194f7acb2a9 100644 --- a/drivers/firmware/microchip/mpfs-auto-update.c +++ b/drivers/firmware/microchip/mpfs-auto-update.c @@ -402,10 +402,10 @@ static int mpfs_auto_update_available(struct mpfs_auto_update_priv *priv) return -EIO; /* - * Bit 5 of byte 1 is "UL_Auto Update" & if it is set, Auto Update is + * Bit 5 of byte 1 is "UL_IAP" & if it is set, Auto Update is * not possible. 
*/ - if (response_msg[1] & AUTO_UPDATE_FEATURE_ENABLED) + if ((((u8 *)response_msg)[1] & AUTO_UPDATE_FEATURE_ENABLED)) return -EPERM; return 0; diff --git a/drivers/fpga/intel-m10-bmc-sec-update.c b/drivers/fpga/intel-m10-bmc-sec-update.c index dd515083bbdd..10f678b9ed36 100644 --- a/drivers/fpga/intel-m10-bmc-sec-update.c +++ b/drivers/fpga/intel-m10-bmc-sec-update.c @@ -771,4 +771,4 @@ module_platform_driver(intel_m10bmc_sec_driver); MODULE_AUTHOR("Intel Corporation"); MODULE_DESCRIPTION("Intel MAX10 BMC Secure Update"); MODULE_LICENSE("GPL"); -MODULE_IMPORT_NS(INTEL_M10_BMC_CORE); +MODULE_IMPORT_NS("INTEL_M10_BMC_CORE"); diff --git a/drivers/fsi/fsi-master-aspeed.c b/drivers/fsi/fsi-master-aspeed.c index 6f5e1bdf7e40..bff897f77fe5 100644 --- a/drivers/fsi/fsi-master-aspeed.c +++ b/drivers/fsi/fsi-master-aspeed.c @@ -666,7 +666,7 @@ static struct platform_driver fsi_master_aspeed_driver = { .of_match_table = fsi_master_aspeed_match, }, .probe = fsi_master_aspeed_probe, - .remove_new = fsi_master_aspeed_remove, + .remove = fsi_master_aspeed_remove, }; module_platform_driver(fsi_master_aspeed_driver); diff --git a/drivers/fsi/fsi-master-ast-cf.c b/drivers/fsi/fsi-master-ast-cf.c index a4c37ff8edd6..9f2fd444ceb6 100644 --- a/drivers/fsi/fsi-master-ast-cf.c +++ b/drivers/fsi/fsi-master-ast-cf.c @@ -1434,7 +1434,7 @@ static struct platform_driver fsi_master_acf = { .of_match_table = fsi_master_acf_match, }, .probe = fsi_master_acf_probe, - .remove_new = fsi_master_acf_remove, + .remove = fsi_master_acf_remove, }; module_platform_driver(fsi_master_acf); diff --git a/drivers/fsi/fsi-master-gpio.c b/drivers/fsi/fsi-master-gpio.c index f761344f4873..69de0b5b9cbd 100644 --- a/drivers/fsi/fsi-master-gpio.c +++ b/drivers/fsi/fsi-master-gpio.c @@ -888,7 +888,7 @@ static struct platform_driver fsi_master_gpio_driver = { .of_match_table = fsi_master_gpio_match, }, .probe = fsi_master_gpio_probe, - .remove_new = fsi_master_gpio_remove, + .remove = fsi_master_gpio_remove, }; 
module_platform_driver(fsi_master_gpio_driver); diff --git a/drivers/fsi/fsi-occ.c b/drivers/fsi/fsi-occ.c index a6d4c8f123a5..d3e6bf37878a 100644 --- a/drivers/fsi/fsi-occ.c +++ b/drivers/fsi/fsi-occ.c @@ -740,7 +740,7 @@ static struct platform_driver occ_driver = { .of_match_table = occ_match, }, .probe = occ_probe, - .remove_new = occ_remove, + .remove = occ_remove, }; static int occ_init(void) diff --git a/drivers/gpio/Kconfig b/drivers/gpio/Kconfig index 56fee58e281e..93ee3aa092f8 100644 --- a/drivers/gpio/Kconfig +++ b/drivers/gpio/Kconfig @@ -482,8 +482,9 @@ config GPIO_MT7621 Say yes here to support the Mediatek MT7621 SoC GPIO device. config GPIO_MVEBU - def_bool y + bool "Marvell Orion and EBU GPIO support" if COMPILE_TEST depends on PLAT_ORION || ARCH_MVEBU || COMPILE_TEST + default PLAT_ORION || ARCH_MVEBU select GENERIC_IRQ_CHIP select REGMAP_MMIO diff --git a/drivers/gpio/gpio-104-dio-48e.c b/drivers/gpio/gpio-104-dio-48e.c index 4df9becaf349..cf5a50102d49 100644 --- a/drivers/gpio/gpio-104-dio-48e.c +++ b/drivers/gpio/gpio-104-dio-48e.c @@ -22,7 +22,7 @@ #include "gpio-i8255.h" -MODULE_IMPORT_NS(I8255); +MODULE_IMPORT_NS("I8255"); #define DIO48E_EXTENT 16 #define MAX_NUM_DIO48E max_num_isa_dev(DIO48E_EXTENT) @@ -339,4 +339,4 @@ module_isa_driver_with_irq(dio48e_driver, num_dio48e, num_irq); MODULE_AUTHOR("William Breathitt Gray "); MODULE_DESCRIPTION("ACCES 104-DIO-48E GPIO driver"); MODULE_LICENSE("GPL v2"); -MODULE_IMPORT_NS(I8254); +MODULE_IMPORT_NS("I8254"); diff --git a/drivers/gpio/gpio-104-idio-16.c b/drivers/gpio/gpio-104-idio-16.c index f03ccd0f534c..ffe7e1cb6b23 100644 --- a/drivers/gpio/gpio-104-idio-16.c +++ b/drivers/gpio/gpio-104-idio-16.c @@ -126,4 +126,4 @@ module_isa_driver_with_irq(idio_16_driver, num_idio_16, num_irq); MODULE_AUTHOR("William Breathitt Gray "); MODULE_DESCRIPTION("ACCES 104-IDIO-16 GPIO driver"); MODULE_LICENSE("GPL v2"); -MODULE_IMPORT_NS(GPIO_IDIO_16); +MODULE_IMPORT_NS("GPIO_IDIO_16"); diff --git 
a/drivers/gpio/gpio-elkhartlake.c b/drivers/gpio/gpio-elkhartlake.c index 887c0fe99d39..95de52d2cc63 100644 --- a/drivers/gpio/gpio-elkhartlake.c +++ b/drivers/gpio/gpio-elkhartlake.c @@ -75,4 +75,4 @@ MODULE_AUTHOR("Pandith N "); MODULE_AUTHOR("Raag Jadav "); MODULE_DESCRIPTION("Intel Elkhart Lake PSE GPIO driver"); MODULE_LICENSE("GPL"); -MODULE_IMPORT_NS(GPIO_TANGIER); +MODULE_IMPORT_NS("GPIO_TANGIER"); diff --git a/drivers/gpio/gpio-gpio-mm.c b/drivers/gpio/gpio-gpio-mm.c index 43d823a56e59..fb7c510bf2fa 100644 --- a/drivers/gpio/gpio-gpio-mm.c +++ b/drivers/gpio/gpio-gpio-mm.c @@ -18,7 +18,7 @@ #include "gpio-i8255.h" -MODULE_IMPORT_NS(I8255); +MODULE_IMPORT_NS("I8255"); #define GPIOMM_EXTENT 8 #define MAX_NUM_GPIOMM max_num_isa_dev(GPIOMM_EXTENT) diff --git a/drivers/gpio/gpio-graniterapids.c b/drivers/gpio/gpio-graniterapids.c index f2e911a3d2ca..ad6a045fd3d2 100644 --- a/drivers/gpio/gpio-graniterapids.c +++ b/drivers/gpio/gpio-graniterapids.c @@ -32,12 +32,14 @@ #define GNR_PINS_PER_REG 32 #define GNR_NUM_REGS DIV_ROUND_UP(GNR_NUM_PINS, GNR_PINS_PER_REG) -#define GNR_CFG_BAR 0x00 +#define GNR_CFG_PADBAR 0x00 #define GNR_CFG_LOCK_OFFSET 0x04 -#define GNR_GPI_STATUS_OFFSET 0x20 +#define GNR_GPI_STATUS_OFFSET 0x14 #define GNR_GPI_ENABLE_OFFSET 0x24 -#define GNR_CFG_DW_RX_MASK GENMASK(25, 22) +#define GNR_CFG_DW_HOSTSW_MODE BIT(27) +#define GNR_CFG_DW_RX_MASK GENMASK(23, 22) +#define GNR_CFG_DW_INTSEL_MASK GENMASK(21, 14) #define GNR_CFG_DW_RX_DISABLE FIELD_PREP(GNR_CFG_DW_RX_MASK, 2) #define GNR_CFG_DW_RX_EDGE FIELD_PREP(GNR_CFG_DW_RX_MASK, 1) #define GNR_CFG_DW_RX_LEVEL FIELD_PREP(GNR_CFG_DW_RX_MASK, 0) @@ -50,6 +52,7 @@ * struct gnr_gpio - Intel Granite Rapids-D vGPIO driver state * @gc: GPIO controller interface * @reg_base: base address of the GPIO registers + * @pad_base: base address of the vGPIO pad configuration registers * @ro_bitmap: bitmap of read-only pins * @lock: guard the registers * @pad_backup: backup of the register state for suspend @@ 
-57,6 +60,7 @@ struct gnr_gpio { struct gpio_chip gc; void __iomem *reg_base; + void __iomem *pad_base; DECLARE_BITMAP(ro_bitmap, GNR_NUM_PINS); raw_spinlock_t lock; u32 pad_backup[]; @@ -65,7 +69,7 @@ struct gnr_gpio { static void __iomem *gnr_gpio_get_padcfg_addr(const struct gnr_gpio *priv, unsigned int gpio) { - return priv->reg_base + gpio * sizeof(u32); + return priv->pad_base + gpio * sizeof(u32); } static int gnr_gpio_configure_line(struct gpio_chip *gc, unsigned int gpio, @@ -88,6 +92,20 @@ static int gnr_gpio_configure_line(struct gpio_chip *gc, unsigned int gpio, return 0; } +static int gnr_gpio_request(struct gpio_chip *gc, unsigned int gpio) +{ + struct gnr_gpio *priv = gpiochip_get_data(gc); + u32 dw; + + dw = readl(gnr_gpio_get_padcfg_addr(priv, gpio)); + if (!(dw & GNR_CFG_DW_HOSTSW_MODE)) { + dev_warn(gc->parent, "GPIO %u is not owned by host", gpio); + return -EBUSY; + } + + return 0; +} + static int gnr_gpio_get(struct gpio_chip *gc, unsigned int gpio) { const struct gnr_gpio *priv = gpiochip_get_data(gc); @@ -139,6 +157,7 @@ static int gnr_gpio_direction_output(struct gpio_chip *gc, unsigned int gpio, in static const struct gpio_chip gnr_gpio_chip = { .owner = THIS_MODULE, + .request = gnr_gpio_request, .get = gnr_gpio_get, .set = gnr_gpio_set, .get_direction = gnr_gpio_get_direction, @@ -166,7 +185,7 @@ static void gnr_gpio_irq_ack(struct irq_data *d) guard(raw_spinlock_irqsave)(&priv->lock); reg = readl(addr); - reg &= ~BIT(bit_idx); + reg |= BIT(bit_idx); writel(reg, addr); } @@ -209,10 +228,18 @@ static void gnr_gpio_irq_unmask(struct irq_data *d) static int gnr_gpio_irq_set_type(struct irq_data *d, unsigned int type) { struct gpio_chip *gc = irq_data_get_irq_chip_data(d); - irq_hw_number_t pin = irqd_to_hwirq(d); - u32 mask = GNR_CFG_DW_RX_MASK; + struct gnr_gpio *priv = gpiochip_get_data(gc); + irq_hw_number_t hwirq = irqd_to_hwirq(d); + u32 reg; u32 set; + /* Allow interrupts only if Interrupt Select field is non-zero */ + reg = 
readl(gnr_gpio_get_padcfg_addr(priv, hwirq)); + if (!(reg & GNR_CFG_DW_INTSEL_MASK)) { + dev_dbg(gc->parent, "GPIO %lu cannot be used as IRQ", hwirq); + return -EPERM; + } + /* Falling edge and level low triggers not supported by the GPIO controller */ switch (type) { case IRQ_TYPE_NONE: @@ -230,10 +257,11 @@ static int gnr_gpio_irq_set_type(struct irq_data *d, unsigned int type) return -EINVAL; } - return gnr_gpio_configure_line(gc, pin, mask, set); + return gnr_gpio_configure_line(gc, hwirq, GNR_CFG_DW_RX_MASK, set); } static const struct irq_chip gnr_gpio_irq_chip = { + .name = "gpio-graniterapids", .irq_ack = gnr_gpio_irq_ack, .irq_mask = gnr_gpio_irq_mask, .irq_unmask = gnr_gpio_irq_unmask, @@ -291,6 +319,7 @@ static int gnr_gpio_probe(struct platform_device *pdev) struct gnr_gpio *priv; void __iomem *regs; int irq, ret; + u32 offset; priv = devm_kzalloc(dev, struct_size(priv, pad_backup, num_backup_pins), GFP_KERNEL); if (!priv) @@ -302,6 +331,10 @@ static int gnr_gpio_probe(struct platform_device *pdev) if (IS_ERR(regs)) return PTR_ERR(regs); + priv->reg_base = regs; + offset = readl(priv->reg_base + GNR_CFG_PADBAR); + priv->pad_base = priv->reg_base + offset; + irq = platform_get_irq(pdev, 0); if (irq < 0) return irq; @@ -311,8 +344,6 @@ static int gnr_gpio_probe(struct platform_device *pdev) if (ret) return dev_err_probe(dev, ret, "failed to request interrupt\n"); - priv->reg_base = regs + readl(regs + GNR_CFG_BAR); - gnr_gpio_init_pin_ro_bits(dev, priv->reg_base + GNR_CFG_LOCK_OFFSET, priv->ro_bitmap); @@ -324,7 +355,6 @@ static int gnr_gpio_probe(struct platform_device *pdev) girq = &priv->gc.irq; gpio_irq_chip_set_chip(girq, &gnr_gpio_irq_chip); - girq->chip->name = dev_name(dev); girq->parent_handler = NULL; girq->num_parents = 0; girq->parents = NULL; diff --git a/drivers/gpio/gpio-i8255.c b/drivers/gpio/gpio-i8255.c index 64ab80fc4a1e..953018bfa2b1 100644 --- a/drivers/gpio/gpio-i8255.c +++ b/drivers/gpio/gpio-i8255.c @@ -134,7 +134,7 @@ int 
devm_i8255_regmap_register(struct device *const dev, return PTR_ERR_OR_ZERO(devm_gpio_regmap_register(dev, &gpio_config)); } -EXPORT_SYMBOL_NS_GPL(devm_i8255_regmap_register, I8255); +EXPORT_SYMBOL_NS_GPL(devm_i8255_regmap_register, "I8255"); MODULE_AUTHOR("William Breathitt Gray"); MODULE_DESCRIPTION("Intel 8255 Programmable Peripheral Interface"); diff --git a/drivers/gpio/gpio-idio-16.c b/drivers/gpio/gpio-idio-16.c index 53b1eb876a12..0103be977c66 100644 --- a/drivers/gpio/gpio-idio-16.c +++ b/drivers/gpio/gpio-idio-16.c @@ -3,6 +3,9 @@ * GPIO library for the ACCES IDIO-16 family * Copyright (C) 2022 William Breathitt Gray */ + +#define DEFAULT_SYMBOL_NAMESPACE "GPIO_IDIO_16" + #include #include #include @@ -14,8 +17,6 @@ #include "gpio-idio-16.h" -#define DEFAULT_SYMBOL_NAMESPACE GPIO_IDIO_16 - #define IDIO_16_DAT_BASE 0x0 #define IDIO_16_OUT_BASE IDIO_16_DAT_BASE #define IDIO_16_IN_BASE (IDIO_16_DAT_BASE + 1) diff --git a/drivers/gpio/gpio-ljca.c b/drivers/gpio/gpio-ljca.c index d67b912d884d..817ecb12d550 100644 --- a/drivers/gpio/gpio-ljca.c +++ b/drivers/gpio/gpio-ljca.c @@ -82,9 +82,9 @@ static int ljca_gpio_config(struct ljca_gpio_dev *ljca_gpio, u8 gpio_id, int ret; mutex_lock(&ljca_gpio->trans_lock); + packet->num = 1; packet->item[0].index = gpio_id; packet->item[0].value = config | ljca_gpio->connect_mode[gpio_id]; - packet->num = 1; ret = ljca_transfer(ljca_gpio->ljca, LJCA_GPIO_CONFIG, (u8 *)packet, struct_size(packet, item, packet->num), NULL, 0); @@ -492,4 +492,4 @@ MODULE_AUTHOR("Zhifeng Wang "); MODULE_AUTHOR("Lixu Zhang "); MODULE_DESCRIPTION("Intel La Jolla Cove Adapter USB-GPIO driver"); MODULE_LICENSE("GPL"); -MODULE_IMPORT_NS(LJCA); +MODULE_IMPORT_NS("LJCA"); diff --git a/drivers/gpio/gpio-loongson-64bit.c b/drivers/gpio/gpio-loongson-64bit.c index 6749d4dd6d64..7f4d78fd800e 100644 --- a/drivers/gpio/gpio-loongson-64bit.c +++ b/drivers/gpio/gpio-loongson-64bit.c @@ -237,9 +237,9 @@ static const struct loongson_gpio_chip_data 
loongson_gpio_ls2k2000_data1 = { static const struct loongson_gpio_chip_data loongson_gpio_ls2k2000_data2 = { .label = "ls2k2000_gpio", .mode = BIT_CTRL_MODE, - .conf_offset = 0x84, - .in_offset = 0x88, - .out_offset = 0x80, + .conf_offset = 0x4, + .in_offset = 0x8, + .out_offset = 0x0, }; static const struct loongson_gpio_chip_data loongson_gpio_ls3a5000_data = { diff --git a/drivers/gpio/gpio-menz127.c b/drivers/gpio/gpio-menz127.c index 3ccd2cb35b9c..ebe5da4933bc 100644 --- a/drivers/gpio/gpio-menz127.c +++ b/drivers/gpio/gpio-menz127.c @@ -201,4 +201,4 @@ MODULE_AUTHOR("Andreas Werner "); MODULE_DESCRIPTION("MEN 16z127 GPIO Controller"); MODULE_LICENSE("GPL v2"); MODULE_ALIAS("mcb:16z127"); -MODULE_IMPORT_NS(MCB); +MODULE_IMPORT_NS("MCB"); diff --git a/drivers/gpio/gpio-merrifield.c b/drivers/gpio/gpio-merrifield.c index cd20604f26de..4335a5d8e4f6 100644 --- a/drivers/gpio/gpio-merrifield.c +++ b/drivers/gpio/gpio-merrifield.c @@ -142,4 +142,4 @@ module_pci_driver(mrfld_gpio_driver); MODULE_AUTHOR("Andy Shevchenko "); MODULE_DESCRIPTION("Intel Merrifield SoC GPIO driver"); MODULE_LICENSE("GPL v2"); -MODULE_IMPORT_NS(GPIO_TANGIER); +MODULE_IMPORT_NS("GPIO_TANGIER"); diff --git a/drivers/gpio/gpio-pci-idio-16.c b/drivers/gpio/gpio-pci-idio-16.c index 64f332c80550..476cea1b5ed7 100644 --- a/drivers/gpio/gpio-pci-idio-16.c +++ b/drivers/gpio/gpio-pci-idio-16.c @@ -112,4 +112,4 @@ module_pci_driver(idio_16_driver); MODULE_AUTHOR("William Breathitt Gray "); MODULE_DESCRIPTION("ACCES PCI-IDIO-16 GPIO driver"); MODULE_LICENSE("GPL v2"); -MODULE_IMPORT_NS(GPIO_IDIO_16); +MODULE_IMPORT_NS("GPIO_IDIO_16"); diff --git a/drivers/gpio/gpio-sim.c b/drivers/gpio/gpio-sim.c index f387dad81f29..686ae3d11ba3 100644 --- a/drivers/gpio/gpio-sim.c +++ b/drivers/gpio/gpio-sim.c @@ -1027,6 +1027,30 @@ static void gpio_sim_device_deactivate(struct gpio_sim_device *dev) dev->pdev = NULL; } +static void +gpio_sim_device_lockup_configfs(struct gpio_sim_device *dev, bool lock) +{ + struct 
configfs_subsystem *subsys = dev->group.cg_subsys; + struct gpio_sim_bank *bank; + struct gpio_sim_line *line; + + /* + * The device only needs to depend on leaf line entries. This is + * sufficient to lock up all the configfs entries that the + * instantiated, alive device depends on. + */ + list_for_each_entry(bank, &dev->bank_list, siblings) { + list_for_each_entry(line, &bank->line_list, siblings) { + if (lock) + WARN_ON(configfs_depend_item_unlocked( + subsys, &line->group.cg_item)); + else + configfs_undepend_item_unlocked( + &line->group.cg_item); + } + } +} + static ssize_t gpio_sim_device_config_live_store(struct config_item *item, const char *page, size_t count) @@ -1039,14 +1063,24 @@ gpio_sim_device_config_live_store(struct config_item *item, if (ret) return ret; - guard(mutex)(&dev->lock); + if (live) + gpio_sim_device_lockup_configfs(dev, true); - if (live == gpio_sim_device_is_live(dev)) - ret = -EPERM; - else if (live) - ret = gpio_sim_device_activate(dev); - else - gpio_sim_device_deactivate(dev); + scoped_guard(mutex, &dev->lock) { + if (live == gpio_sim_device_is_live(dev)) + ret = -EPERM; + else if (live) + ret = gpio_sim_device_activate(dev); + else + gpio_sim_device_deactivate(dev); + } + + /* + * Undepend is required only if device disablement (live == 0) + * succeeds or if device enablement (live == 1) fails. 
+ */ + if (live == !!ret) + gpio_sim_device_lockup_configfs(dev, false); return ret ?: count; } diff --git a/drivers/gpio/gpio-tangier.c b/drivers/gpio/gpio-tangier.c index 4b29abafecf6..a415e6d36173 100644 --- a/drivers/gpio/gpio-tangier.c +++ b/drivers/gpio/gpio-tangier.c @@ -459,7 +459,7 @@ int devm_tng_gpio_probe(struct device *dev, struct tng_gpio *gpio) return 0; } -EXPORT_SYMBOL_NS_GPL(devm_tng_gpio_probe, GPIO_TANGIER); +EXPORT_SYMBOL_NS_GPL(devm_tng_gpio_probe, "GPIO_TANGIER"); static int tng_gpio_suspend(struct device *dev) { diff --git a/drivers/gpio/gpio-virtuser.c b/drivers/gpio/gpio-virtuser.c index 91b6352c957c..e89f299f2140 100644 --- a/drivers/gpio/gpio-virtuser.c +++ b/drivers/gpio/gpio-virtuser.c @@ -1410,7 +1410,7 @@ gpio_virtuser_make_lookup_table(struct gpio_virtuser_device *dev) size_t num_entries = gpio_virtuser_get_lookup_count(dev); struct gpio_virtuser_lookup_entry *entry; struct gpio_virtuser_lookup *lookup; - unsigned int i = 0; + unsigned int i = 0, idx; lockdep_assert_held(&dev->lock); @@ -1424,12 +1424,12 @@ gpio_virtuser_make_lookup_table(struct gpio_virtuser_device *dev) return -ENOMEM; list_for_each_entry(lookup, &dev->lookup_list, siblings) { + idx = 0; list_for_each_entry(entry, &lookup->entry_list, siblings) { - table->table[i] = + table->table[i++] = GPIO_LOOKUP_IDX(entry->key, entry->offset < 0 ? 
U16_MAX : entry->offset, - lookup->con_id, i, entry->flags); - i++; + lookup->con_id, idx++, entry->flags); } } @@ -1439,6 +1439,15 @@ gpio_virtuser_make_lookup_table(struct gpio_virtuser_device *dev) return 0; } +static void +gpio_virtuser_remove_lookup_table(struct gpio_virtuser_device *dev) +{ + gpiod_remove_lookup_table(dev->lookup_table); + kfree(dev->lookup_table->dev_id); + kfree(dev->lookup_table); + dev->lookup_table = NULL; +} + static struct fwnode_handle * gpio_virtuser_make_device_swnode(struct gpio_virtuser_device *dev) { @@ -1487,10 +1496,8 @@ gpio_virtuser_device_activate(struct gpio_virtuser_device *dev) pdevinfo.fwnode = swnode; ret = gpio_virtuser_make_lookup_table(dev); - if (ret) { - fwnode_remove_software_node(swnode); - return ret; - } + if (ret) + goto err_remove_swnode; reinit_completion(&dev->probe_completion); dev->driver_bound = false; @@ -1498,23 +1505,31 @@ gpio_virtuser_device_activate(struct gpio_virtuser_device *dev) pdev = platform_device_register_full(&pdevinfo); if (IS_ERR(pdev)) { + ret = PTR_ERR(pdev); bus_unregister_notifier(&platform_bus_type, &dev->bus_notifier); - fwnode_remove_software_node(swnode); - return PTR_ERR(pdev); + goto err_remove_lookup_table; } wait_for_completion(&dev->probe_completion); bus_unregister_notifier(&platform_bus_type, &dev->bus_notifier); if (!dev->driver_bound) { - platform_device_unregister(pdev); - fwnode_remove_software_node(swnode); - return -ENXIO; + ret = -ENXIO; + goto err_unregister_pdev; } dev->pdev = pdev; return 0; + +err_unregister_pdev: + platform_device_unregister(pdev); +err_remove_lookup_table: + gpio_virtuser_remove_lookup_table(dev); +err_remove_swnode: + fwnode_remove_software_node(swnode); + + return ret; } static void @@ -1526,10 +1541,33 @@ gpio_virtuser_device_deactivate(struct gpio_virtuser_device *dev) swnode = dev_fwnode(&dev->pdev->dev); platform_device_unregister(dev->pdev); + gpio_virtuser_remove_lookup_table(dev); fwnode_remove_software_node(swnode); dev->pdev = 
NULL; - gpiod_remove_lookup_table(dev->lookup_table); - kfree(dev->lookup_table); +} + +static void +gpio_virtuser_device_lockup_configfs(struct gpio_virtuser_device *dev, bool lock) +{ + struct configfs_subsystem *subsys = dev->group.cg_subsys; + struct gpio_virtuser_lookup_entry *entry; + struct gpio_virtuser_lookup *lookup; + + /* + * The device only needs to depend on leaf lookup entries. This is + * sufficient to lock up all the configfs entries that the + * instantiated, alive device depends on. + */ + list_for_each_entry(lookup, &dev->lookup_list, siblings) { + list_for_each_entry(entry, &lookup->entry_list, siblings) { + if (lock) + WARN_ON(configfs_depend_item_unlocked( + subsys, &entry->group.cg_item)); + else + configfs_undepend_item_unlocked( + &entry->group.cg_item); + } + } } static ssize_t @@ -1544,15 +1582,24 @@ gpio_virtuser_device_config_live_store(struct config_item *item, if (ret) return ret; - guard(mutex)(&dev->lock); - - if (live == gpio_virtuser_device_is_live(dev)) - return -EPERM; - if (live) - ret = gpio_virtuser_device_activate(dev); - else - gpio_virtuser_device_deactivate(dev); + gpio_virtuser_device_lockup_configfs(dev, true); + + scoped_guard(mutex, &dev->lock) { + if (live == gpio_virtuser_device_is_live(dev)) + ret = -EPERM; + else if (live) + ret = gpio_virtuser_device_activate(dev); + else + gpio_virtuser_device_deactivate(dev); + } + + /* + * Undepend is required only if device disablement (live == 0) + * succeeds or if device enablement (live == 1) fails. 
+ */ + if (live == !!ret) + gpio_virtuser_device_lockup_configfs(dev, false); return ret ?: count; } diff --git a/drivers/gpio/gpio-xilinx.c b/drivers/gpio/gpio-xilinx.c index c6a8f2c82680..792d94c49077 100644 --- a/drivers/gpio/gpio-xilinx.c +++ b/drivers/gpio/gpio-xilinx.c @@ -65,7 +65,7 @@ struct xgpio_instance { DECLARE_BITMAP(state, 64); DECLARE_BITMAP(last_irq_read, 64); DECLARE_BITMAP(dir, 64); - spinlock_t gpio_lock; /* For serializing operations */ + raw_spinlock_t gpio_lock; /* For serializing operations */ int irq; DECLARE_BITMAP(enable, 64); DECLARE_BITMAP(rising_edge, 64); @@ -179,14 +179,14 @@ static void xgpio_set(struct gpio_chip *gc, unsigned int gpio, int val) struct xgpio_instance *chip = gpiochip_get_data(gc); int bit = xgpio_to_bit(chip, gpio); - spin_lock_irqsave(&chip->gpio_lock, flags); + raw_spin_lock_irqsave(&chip->gpio_lock, flags); /* Write to GPIO signal and set its direction to output */ __assign_bit(bit, chip->state, val); xgpio_write_ch(chip, XGPIO_DATA_OFFSET, bit, chip->state); - spin_unlock_irqrestore(&chip->gpio_lock, flags); + raw_spin_unlock_irqrestore(&chip->gpio_lock, flags); } /** @@ -210,7 +210,7 @@ static void xgpio_set_multiple(struct gpio_chip *gc, unsigned long *mask, bitmap_remap(hw_mask, mask, chip->sw_map, chip->hw_map, 64); bitmap_remap(hw_bits, bits, chip->sw_map, chip->hw_map, 64); - spin_lock_irqsave(&chip->gpio_lock, flags); + raw_spin_lock_irqsave(&chip->gpio_lock, flags); bitmap_replace(state, chip->state, hw_bits, hw_mask, 64); @@ -218,7 +218,7 @@ static void xgpio_set_multiple(struct gpio_chip *gc, unsigned long *mask, bitmap_copy(chip->state, state, 64); - spin_unlock_irqrestore(&chip->gpio_lock, flags); + raw_spin_unlock_irqrestore(&chip->gpio_lock, flags); } /** @@ -236,13 +236,13 @@ static int xgpio_dir_in(struct gpio_chip *gc, unsigned int gpio) struct xgpio_instance *chip = gpiochip_get_data(gc); int bit = xgpio_to_bit(chip, gpio); - spin_lock_irqsave(&chip->gpio_lock, flags); + 
raw_spin_lock_irqsave(&chip->gpio_lock, flags); /* Set the GPIO bit in shadow register and set direction as input */ __set_bit(bit, chip->dir); xgpio_write_ch(chip, XGPIO_TRI_OFFSET, bit, chip->dir); - spin_unlock_irqrestore(&chip->gpio_lock, flags); + raw_spin_unlock_irqrestore(&chip->gpio_lock, flags); return 0; } @@ -265,7 +265,7 @@ static int xgpio_dir_out(struct gpio_chip *gc, unsigned int gpio, int val) struct xgpio_instance *chip = gpiochip_get_data(gc); int bit = xgpio_to_bit(chip, gpio); - spin_lock_irqsave(&chip->gpio_lock, flags); + raw_spin_lock_irqsave(&chip->gpio_lock, flags); /* Write state of GPIO signal */ __assign_bit(bit, chip->state, val); @@ -275,7 +275,7 @@ static int xgpio_dir_out(struct gpio_chip *gc, unsigned int gpio, int val) __clear_bit(bit, chip->dir); xgpio_write_ch(chip, XGPIO_TRI_OFFSET, bit, chip->dir); - spin_unlock_irqrestore(&chip->gpio_lock, flags); + raw_spin_unlock_irqrestore(&chip->gpio_lock, flags); return 0; } @@ -398,7 +398,7 @@ static void xgpio_irq_mask(struct irq_data *irq_data) int bit = xgpio_to_bit(chip, irq_offset); u32 mask = BIT(bit / 32), temp; - spin_lock_irqsave(&chip->gpio_lock, flags); + raw_spin_lock_irqsave(&chip->gpio_lock, flags); __clear_bit(bit, chip->enable); @@ -408,7 +408,7 @@ static void xgpio_irq_mask(struct irq_data *irq_data) temp &= ~mask; xgpio_writereg(chip->regs + XGPIO_IPIER_OFFSET, temp); } - spin_unlock_irqrestore(&chip->gpio_lock, flags); + raw_spin_unlock_irqrestore(&chip->gpio_lock, flags); gpiochip_disable_irq(&chip->gc, irq_offset); } @@ -428,7 +428,7 @@ static void xgpio_irq_unmask(struct irq_data *irq_data) gpiochip_enable_irq(&chip->gc, irq_offset); - spin_lock_irqsave(&chip->gpio_lock, flags); + raw_spin_lock_irqsave(&chip->gpio_lock, flags); __set_bit(bit, chip->enable); @@ -447,7 +447,7 @@ static void xgpio_irq_unmask(struct irq_data *irq_data) xgpio_writereg(chip->regs + XGPIO_IPIER_OFFSET, val); } - spin_unlock_irqrestore(&chip->gpio_lock, flags); + 
raw_spin_unlock_irqrestore(&chip->gpio_lock, flags); } /** @@ -512,7 +512,7 @@ static void xgpio_irqhandler(struct irq_desc *desc) chained_irq_enter(irqchip, desc); - spin_lock(&chip->gpio_lock); + raw_spin_lock(&chip->gpio_lock); xgpio_read_ch_all(chip, XGPIO_DATA_OFFSET, all); @@ -529,7 +529,7 @@ static void xgpio_irqhandler(struct irq_desc *desc) bitmap_copy(chip->last_irq_read, all, 64); bitmap_or(all, rising, falling, 64); - spin_unlock(&chip->gpio_lock); + raw_spin_unlock(&chip->gpio_lock); dev_dbg(gc->parent, "IRQ rising %*pb falling %*pb\n", 64, rising, 64, falling); @@ -620,7 +620,7 @@ static int xgpio_probe(struct platform_device *pdev) bitmap_set(chip->hw_map, 0, width[0]); bitmap_set(chip->hw_map, 32, width[1]); - spin_lock_init(&chip->gpio_lock); + raw_spin_lock_init(&chip->gpio_lock); chip->gc.base = -1; chip->gc.ngpio = bitmap_weight(chip->hw_map, 64); diff --git a/drivers/gpio/gpiolib-swnode.c b/drivers/gpio/gpiolib-swnode.c index 51d2475c05c5..f21dbc28cf2c 100644 --- a/drivers/gpio/gpiolib-swnode.c +++ b/drivers/gpio/gpiolib-swnode.c @@ -141,7 +141,7 @@ int swnode_gpio_count(const struct fwnode_handle *fwnode, const char *con_id) const struct software_node swnode_gpio_undefined = { .name = GPIOLIB_SWNODE_UNDEFINED_NAME, }; -EXPORT_SYMBOL_NS_GPL(swnode_gpio_undefined, GPIO_SWNODE); +EXPORT_SYMBOL_NS_GPL(swnode_gpio_undefined, "GPIO_SWNODE"); static int __init swnode_gpio_init(void) { diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig index 1850e68f1b61..fbef3f471bd0 100644 --- a/drivers/gpu/drm/Kconfig +++ b/drivers/gpu/drm/Kconfig @@ -103,10 +103,15 @@ config DRM_KMS_HELPER help CRTC helpers for KMS drivers. +config DRM_DRAW + bool + depends on DRM + config DRM_PANIC bool "Display a user-friendly message when a kernel panic occurs" depends on DRM select FONT_SUPPORT + select DRM_DRAW help Enable a drm panic handler, which will display a user-friendly message when a kernel panic occurs. 
It's useful when using a user-space @@ -218,77 +223,7 @@ config DRM_CLIENT option. Drivers that support the default clients should select DRM_CLIENT_SELECTION instead. -config DRM_CLIENT_LIB - tristate - depends on DRM - select DRM_KMS_HELPER if DRM_FBDEV_EMULATION - select FB_CORE if DRM_FBDEV_EMULATION - help - This option enables the DRM client library and selects all - modules and components according to the enabled clients. - -config DRM_CLIENT_SELECTION - tristate - depends on DRM - select DRM_CLIENT_LIB if DRM_FBDEV_EMULATION - help - Drivers that support in-kernel DRM clients have to select this - option. - -config DRM_CLIENT_SETUP - bool - depends on DRM_CLIENT_SELECTION - help - Enables the DRM client selection. DRM drivers that support the - default clients should select DRM_CLIENT_SELECTION instead. - -menu "Supported DRM clients" - depends on DRM_CLIENT_SELECTION - -config DRM_FBDEV_EMULATION - bool "Enable legacy fbdev support for your modesetting driver" - depends on DRM_CLIENT_SELECTION - select DRM_CLIENT - select DRM_CLIENT_SETUP - select FRAMEBUFFER_CONSOLE_DETECT_PRIMARY if FRAMEBUFFER_CONSOLE - default FB - help - Choose this option if you have a need for the legacy fbdev - support. Note that this support also provides the linux console - support on top of your modesetting driver. - - If in doubt, say "Y". - -config DRM_FBDEV_OVERALLOC - int "Overallocation of the fbdev buffer" - depends on DRM_FBDEV_EMULATION - default 100 - help - Defines the fbdev buffer overallocation in percent. Default - is 100. Typical values for double buffering will be 200, - triple buffering 300. - -config DRM_FBDEV_LEAK_PHYS_SMEM - bool "Shamelessly allow leaking of fbdev physical address (DANGEROUS)" - depends on DRM_FBDEV_EMULATION && EXPERT - default n - help - In order to keep user-space compatibility, we want in certain - use-cases to keep leaking the fbdev physical address to the - user-space program handling the fbdev buffer. 
- This affects, not only, Amlogic, Allwinner or Rockchip devices - with ARM Mali GPUs using an userspace Blob. - This option is not supported by upstream developers and should be - removed as soon as possible and be considered as a broken and - legacy behaviour from a modern fbdev device driver. - - Please send any bug reports when using this to your proprietary - software vendor that requires this. - - If in doubt, say "N" or spread the word to your closed source - library vendor. - -endmenu +source "drivers/gpu/drm/clients/Kconfig" config DRM_LOAD_EDID_FIRMWARE bool "Allow to specify an EDID data set instead of probing for it" @@ -533,6 +468,10 @@ config DRM_HYPERV config DRM_EXPORT_FOR_TESTS bool +# Separate option as not all DRM drivers use it +config DRM_PANEL_BACKLIGHT_QUIRKS + tristate + config DRM_LIB_RANDOM bool default n diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile index 463afad1b5ca..19fb370fbc56 100644 --- a/drivers/gpu/drm/Makefile +++ b/drivers/gpu/drm/Makefile @@ -91,10 +91,12 @@ drm-$(CONFIG_DRM_PRIVACY_SCREEN) += \ drm_privacy_screen_x86.o drm-$(CONFIG_DRM_ACCEL) += ../../accel/drm_accel.o drm-$(CONFIG_DRM_PANIC) += drm_panic.o +drm-$(CONFIG_DRM_DRAW) += drm_draw.o drm-$(CONFIG_DRM_PANIC_SCREEN_QR_CODE) += drm_panic_qr.o obj-$(CONFIG_DRM) += drm.o obj-$(CONFIG_DRM_PANEL_ORIENTATION_QUIRKS) += drm_panel_orientation_quirks.o +obj-$(CONFIG_DRM_PANEL_BACKLIGHT_QUIRKS) += drm_panel_backlight_quirks.o # # Memory-management helpers @@ -148,14 +150,6 @@ drm_kms_helper-$(CONFIG_DRM_PANEL_BRIDGE) += bridge/panel.o drm_kms_helper-$(CONFIG_DRM_FBDEV_EMULATION) += drm_fb_helper.o obj-$(CONFIG_DRM_KMS_HELPER) += drm_kms_helper.o -# -# DRM clients -# - -drm_client_lib-y := drm_client_setup.o -drm_client_lib-$(CONFIG_DRM_FBDEV_EMULATION) += drm_fbdev_client.o -obj-$(CONFIG_DRM_CLIENT_LIB) += drm_client_lib.o - # # Drivers and the rest # @@ -165,6 +159,7 @@ obj-y += tests/ obj-$(CONFIG_DRM_MIPI_DBI) += drm_mipi_dbi.o obj-$(CONFIG_DRM_MIPI_DSI) 
+= drm_mipi_dsi.o obj-y += arm/ +obj-y += clients/ obj-y += display/ obj-$(CONFIG_DRM_TTM) += ttm/ obj-$(CONFIG_DRM_SCHED) += scheduler/ diff --git a/drivers/gpu/drm/amd/amdgpu/Kconfig b/drivers/gpu/drm/amd/amdgpu/Kconfig index 41fa3377d9cf..1a11cab741ac 100644 --- a/drivers/gpu/drm/amd/amdgpu/Kconfig +++ b/drivers/gpu/drm/amd/amdgpu/Kconfig @@ -26,6 +26,7 @@ config DRM_AMDGPU select DRM_BUDDY select DRM_SUBALLOC_HELPER select DRM_EXEC + select DRM_PANEL_BACKLIGHT_QUIRKS # amdgpu depends on ACPI_VIDEO when ACPI is enabled, for select to work # ACPI_VIDEO's dependencies must also be selected. select INPUT if ACPI diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile b/drivers/gpu/drm/amd/amdgpu/Makefile index c7b18c52825d..5b21674b07fb 100644 --- a/drivers/gpu/drm/amd/amdgpu/Makefile +++ b/drivers/gpu/drm/amd/amdgpu/Makefile @@ -1,5 +1,5 @@ # -# Copyright 2017 Advanced Micro Devices, Inc. +# Copyright 2017-2024 Advanced Micro Devices, Inc. All rights reserved. # # Permission is hereby granted, free of charge, to any person obtaining a # copy of this software and associated documentation files (the "Software"), @@ -105,7 +105,7 @@ amdgpu-y += \ # add UMC block amdgpu-y += \ - umc_v6_0.o umc_v6_1.o umc_v6_7.o umc_v8_7.o umc_v8_10.o umc_v12_0.o + umc_v6_0.o umc_v6_1.o umc_v6_7.o umc_v8_7.o umc_v8_10.o umc_v12_0.o umc_v8_14.o # add IH block amdgpu-y += \ @@ -200,6 +200,7 @@ amdgpu-y += \ vcn_v4_0_3.o \ vcn_v4_0_5.o \ vcn_v5_0_0.o \ + vcn_v5_0_1.o \ amdgpu_jpeg.o \ jpeg_v1_0.o \ jpeg_v2_0.o \ @@ -208,7 +209,8 @@ amdgpu-y += \ jpeg_v4_0.o \ jpeg_v4_0_3.o \ jpeg_v4_0_5.o \ - jpeg_v5_0_0.o + jpeg_v5_0_0.o \ + jpeg_v5_0_1.o # add VPE block amdgpu-y += \ diff --git a/drivers/gpu/drm/amd/amdgpu/aldebaran.c b/drivers/gpu/drm/amd/amdgpu/aldebaran.c index f44de9d4b6a1..e13fbd974141 100644 --- a/drivers/gpu/drm/amd/amdgpu/aldebaran.c +++ b/drivers/gpu/drm/amd/amdgpu/aldebaran.c @@ -334,6 +334,8 @@ aldebaran_mode2_restore_hwcontext(struct amdgpu_reset_control *reset_ctl, 
AMDGPU_INIT_LEVEL_RESET_RECOVERY); dev_info(tmp_adev->dev, "GPU reset succeeded, trying to resume\n"); + /*TBD: Ideally should clear only GFX, SDMA blocks*/ + amdgpu_ras_clear_err_state(tmp_adev); r = aldebaran_mode2_restore_ip(tmp_adev); if (r) goto end; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h index 4653a8d2823a..69895fccb474 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h @@ -880,6 +880,7 @@ struct amdgpu_device { bool need_swiotlb; bool accel_working; struct notifier_block acpi_nb; + struct notifier_block pm_nb; struct amdgpu_i2c_chan *i2c_bus[AMDGPU_MAX_I2C_BUS]; struct debugfs_blob_wrapper debugfs_vbios_blob; struct debugfs_blob_wrapper debugfs_discovery_blob; @@ -1174,7 +1175,6 @@ struct amdgpu_device { struct work_struct reset_work; - bool job_hang; bool dc_enabled; /* Mask of active clusters */ uint32_t aid_mask; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_aca.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_aca.h index 5ef6b745f222..f3289d289913 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_aca.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_aca.h @@ -71,6 +71,11 @@ struct ras_query_context; #define ACA_ERROR_CE_MASK BIT_MASK(ACA_ERROR_TYPE_CE) #define ACA_ERROR_DEFERRED_MASK BIT_MASK(ACA_ERROR_TYPE_DEFERRED) +#define mmSMNAID_AID0_MCA_SMU 0x03b30400 /* SMN AID AID0 */ +#define mmSMNAID_XCD0_MCA_SMU 0x36430400 /* SMN AID XCD0 */ +#define mmSMNAID_XCD1_MCA_SMU 0x38430400 /* SMN AID XCD1 */ +#define mmSMNXCD_XCD0_MCA_SMU 0x40430400 /* SMN XCD XCD0 */ + enum aca_reg_idx { ACA_REG_IDX_CTL = 0, ACA_REG_IDX_STATUS = 1, diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_acp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_acp.c index ec5e0dcf8613..deb0785350e8 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_acp.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_acp.c @@ -140,7 +140,7 @@ static int acp_poweroff(struct generic_pm_domain *genpd) * 2. power off the acp tiles * 3. 
check and enter ulv state */ - amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_ACP, true); + amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_ACP, true, 0); return 0; } @@ -157,7 +157,7 @@ static int acp_poweron(struct generic_pm_domain *genpd) * 2. turn on acp clock * 3. power on acp tiles */ - amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_ACP, false); + amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_ACP, false, 0); return 0; } @@ -236,7 +236,7 @@ static int acp_hw_init(struct amdgpu_ip_block *ip_block) ip_block->version->major, ip_block->version->minor); /* -ENODEV means board uses AZ rather than ACP */ if (r == -ENODEV) { - amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_ACP, true); + amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_ACP, true, 0); return 0; } else if (r) { return r; @@ -508,7 +508,7 @@ static int acp_hw_fini(struct amdgpu_ip_block *ip_block) /* return early if no ACP */ if (!adev->acp.acp_genpd) { - amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_ACP, false); + amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_ACP, false, 0); return 0; } @@ -565,7 +565,7 @@ static int acp_suspend(struct amdgpu_ip_block *ip_block) /* power up on suspend */ if (!adev->acp.acp_cell) - amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_ACP, false); + amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_ACP, false, 0); return 0; } @@ -575,7 +575,7 @@ static int acp_resume(struct amdgpu_ip_block *ip_block) /* power down again on resume */ if (!adev->acp.acp_cell) - amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_ACP, true); + amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_ACP, true, 0); return 0; } @@ -584,19 +584,19 @@ static bool acp_is_idle(void *handle) return true; } -static int acp_set_clockgating_state(void *handle, +static int acp_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state) { return 0; } -static int 
acp_set_powergating_state(void *handle, +static int acp_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state) { - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; bool enable = (state == AMD_PG_STATE_GATE); - amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_ACP, enable); + amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_ACP, enable, 0); return 0; } diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c index 3afcd1e8aa54..2c1b38c5cfc6 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c @@ -368,7 +368,7 @@ void amdgpu_amdkfd_free_gtt_mem(struct amdgpu_device *adev, void **mem_obj) { struct amdgpu_bo **bo = (struct amdgpu_bo **) mem_obj; - amdgpu_bo_reserve(*bo, true); + (void)amdgpu_bo_reserve(*bo, true); amdgpu_bo_kunmap(*bo); amdgpu_bo_unpin(*bo); amdgpu_bo_unreserve(*bo); @@ -715,8 +715,9 @@ err: void amdgpu_amdkfd_set_compute_idle(struct amdgpu_device *adev, bool idle) { enum amd_powergating_state state = idle ? AMD_PG_STATE_GATE : AMD_PG_STATE_UNGATE; - if (IP_VERSION_MAJ(amdgpu_ip_version(adev, GC_HWIP, 0)) == 11 && - ((adev->mes.kiq_version & AMDGPU_MES_VERSION_MASK) <= 64)) { + if ((IP_VERSION_MAJ(amdgpu_ip_version(adev, GC_HWIP, 0)) == 11 && + ((adev->mes.kiq_version & AMDGPU_MES_VERSION_MASK) <= 64)) || + (IP_VERSION_MAJ(amdgpu_ip_version(adev, GC_HWIP, 0)) == 12)) { pr_debug("GFXOFF is %s\n", idle ? "enabled" : "disabled"); amdgpu_gfx_off_ctrl(adev, idle); } else if ((IP_VERSION_MAJ(amdgpu_ip_version(adev, GC_HWIP, 0)) == 9) && @@ -724,7 +725,9 @@ void amdgpu_amdkfd_set_compute_idle(struct amdgpu_device *adev, bool idle) /* Disable GFXOFF and PG. Temporary workaround * to fix some compute applications issue on GFX9. 
*/ - adev->ip_blocks[AMD_IP_BLOCK_TYPE_GFX].version->funcs->set_powergating_state((void *)adev, state); + struct amdgpu_ip_block *gfx_block = amdgpu_device_ip_get_ip_block(adev, AMD_IP_BLOCK_TYPE_GFX); + if (gfx_block != NULL) + gfx_block->version->funcs->set_powergating_state((void *)gfx_block, state); } amdgpu_dpm_switch_power_profile(adev, PP_SMC_POWER_PROFILE_COMPUTE, @@ -834,7 +837,7 @@ int amdgpu_amdkfd_unmap_hiq(struct amdgpu_device *adev, u32 doorbell_off, if (!kiq->pmf || !kiq->pmf->kiq_unmap_queues) return -EINVAL; - if (!kiq_ring->sched.ready || adev->job_hang) + if (!kiq_ring->sched.ready || amdgpu_in_reset(adev)) return 0; ring_funcs = kzalloc(sizeof(*ring_funcs), GFP_KERNEL); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h index 4b80ad860639..8af67f18500a 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h @@ -433,6 +433,9 @@ void kgd2kfd_unlock_kfd(void); int kgd2kfd_start_sched(struct kfd_dev *kfd, uint32_t node_id); int kgd2kfd_stop_sched(struct kfd_dev *kfd, uint32_t node_id); bool kgd2kfd_compute_active(struct kfd_dev *kfd, uint32_t node_id); +bool kgd2kfd_vmfault_fast_path(struct amdgpu_device *adev, struct amdgpu_iv_entry *entry, + bool retry_fault); + #else static inline int kgd2kfd_init(void) { @@ -518,5 +521,12 @@ static inline bool kgd2kfd_compute_active(struct kfd_dev *kfd, uint32_t node_id) { return false; } + +static inline bool kgd2kfd_vmfault_fast_path(struct amdgpu_device *adev, struct amdgpu_iv_entry *entry, + bool retry_fault) +{ + return false; +} + #endif #endif /* AMDGPU_AMDKFD_H_INCLUDED */ diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c index cc66ebb7bae1..441568163e20 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c @@ -1131,6 +1131,9 @@ uint64_t kgd_gfx_v9_hqd_get_pq_addr(struct 
amdgpu_device *adev, uint32_t low, high; uint64_t queue_addr = 0; + if (!amdgpu_gpu_recovery) + return 0; + kgd_gfx_v9_acquire_queue(adev, pipe_id, queue_id, inst); amdgpu_gfx_rlc_enter_safe_mode(adev, inst); @@ -1179,6 +1182,9 @@ uint64_t kgd_gfx_v9_hqd_reset(struct amdgpu_device *adev, uint32_t low, high, pipe_reset_data = 0; uint64_t queue_addr = 0; + if (!amdgpu_gpu_recovery) + return 0; + kgd_gfx_v9_acquire_queue(adev, pipe_id, queue_id, inst); amdgpu_gfx_rlc_enter_safe_mode(adev, inst); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c index f30548f4c3b3..1e998f972c30 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c @@ -730,7 +730,7 @@ kfd_mem_dmaunmap_userptr(struct kgd_mem *mem, return; amdgpu_bo_placement_from_domain(bo, AMDGPU_GEM_DOMAIN_CPU); - ttm_bo_validate(&bo->tbo, &bo->placement, &ctx); + (void)ttm_bo_validate(&bo->tbo, &bo->placement, &ctx); dma_unmap_sgtable(adev->dev, ttm->sg, direction, 0); sg_free_table(ttm->sg); @@ -779,7 +779,7 @@ kfd_mem_dmaunmap_sg_bo(struct kgd_mem *mem, } amdgpu_bo_placement_from_domain(bo, AMDGPU_GEM_DOMAIN_CPU); - ttm_bo_validate(&bo->tbo, &bo->placement, &ctx); + (void)ttm_bo_validate(&bo->tbo, &bo->placement, &ctx); dir = mem->alloc_flags & KFD_IOC_ALLOC_MEM_FLAGS_WRITABLE ? 
DMA_BIDIRECTIONAL : DMA_TO_DEVICE; @@ -989,7 +989,7 @@ unwind: if (!attachment[i]) continue; if (attachment[i]->bo_va) { - amdgpu_bo_reserve(bo[i], true); + (void)amdgpu_bo_reserve(bo[i], true); if (--attachment[i]->bo_va->ref_count == 0) amdgpu_vm_bo_del(adev, attachment[i]->bo_va); amdgpu_bo_unreserve(bo[i]); @@ -1259,11 +1259,11 @@ static int unmap_bo_from_gpuvm(struct kgd_mem *mem, return -EBUSY; } - amdgpu_vm_bo_unmap(adev, bo_va, entry->va); + (void)amdgpu_vm_bo_unmap(adev, bo_va, entry->va); - amdgpu_vm_clear_freed(adev, vm, &bo_va->last_pt_update); + (void)amdgpu_vm_clear_freed(adev, vm, &bo_va->last_pt_update); - amdgpu_sync_fence(sync, bo_va->last_pt_update); + (void)amdgpu_sync_fence(sync, bo_va->last_pt_update); return 0; } @@ -2352,7 +2352,7 @@ void amdgpu_amdkfd_gpuvm_unmap_gtt_bo_from_kernel(struct kgd_mem *mem) { struct amdgpu_bo *bo = mem->bo; - amdgpu_bo_reserve(bo, true); + (void)amdgpu_bo_reserve(bo, true); amdgpu_bo_kunmap(bo); amdgpu_bo_unpin(bo); amdgpu_bo_unreserve(bo); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c index 45affc02548c..423fd2eebe1e 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c @@ -47,35 +47,37 @@ /* Check if current bios is an ATOM BIOS. * Return true if it is ATOM BIOS. Otherwise, return false. 
*/ -static bool check_atom_bios(uint8_t *bios, size_t size) +static bool check_atom_bios(struct amdgpu_device *adev, size_t size) { uint16_t tmp, bios_header_start; + uint8_t *bios = adev->bios; if (!bios || size < 0x49) { - DRM_INFO("vbios mem is null or mem size is wrong\n"); + dev_dbg(adev->dev, "VBIOS mem is null or mem size is wrong\n"); return false; } if (!AMD_IS_VALID_VBIOS(bios)) { - DRM_INFO("BIOS signature incorrect %x %x\n", bios[0], bios[1]); + dev_dbg(adev->dev, "VBIOS signature incorrect %x %x\n", bios[0], + bios[1]); return false; } bios_header_start = bios[0x48] | (bios[0x49] << 8); if (!bios_header_start) { - DRM_INFO("Can't locate bios header\n"); + dev_dbg(adev->dev, "Can't locate VBIOS header\n"); return false; } tmp = bios_header_start + 4; if (size < tmp) { - DRM_INFO("BIOS header is broken\n"); + dev_dbg(adev->dev, "VBIOS header is broken\n"); return false; } if (!memcmp(bios + tmp, "ATOM", 4) || !memcmp(bios + tmp, "MOTA", 4)) { - DRM_DEBUG("ATOMBIOS detected\n"); + dev_dbg(adev->dev, "ATOMBIOS detected\n"); return true; } @@ -118,7 +120,7 @@ static bool amdgpu_read_bios_from_vram(struct amdgpu_device *adev) memcpy_fromio(adev->bios, bios, size); iounmap(bios); - if (!check_atom_bios(adev->bios, size)) { + if (!check_atom_bios(adev, size)) { kfree(adev->bios); return false; } @@ -146,7 +148,7 @@ bool amdgpu_read_bios(struct amdgpu_device *adev) memcpy_fromio(adev->bios, bios, size); pci_unmap_rom(adev->pdev, bios); - if (!check_atom_bios(adev->bios, size)) { + if (!check_atom_bios(adev, size)) { kfree(adev->bios); return false; } @@ -186,7 +188,7 @@ static bool amdgpu_read_bios_from_rom(struct amdgpu_device *adev) /* read complete BIOS */ amdgpu_asic_read_bios_from_rom(adev, adev->bios, len); - if (!check_atom_bios(adev->bios, len)) { + if (!check_atom_bios(adev, len)) { kfree(adev->bios); return false; } @@ -216,7 +218,7 @@ static bool amdgpu_read_platform_bios(struct amdgpu_device *adev) memcpy_fromio(adev->bios, bios, romlen); 
iounmap(bios); - if (!check_atom_bios(adev->bios, romlen)) + if (!check_atom_bios(adev, romlen)) goto free_bios; adev->bios_size = romlen; @@ -324,7 +326,7 @@ static bool amdgpu_atrm_get_bios(struct amdgpu_device *adev) break; } - if (!check_atom_bios(adev->bios, size)) { + if (!check_atom_bios(adev, size)) { kfree(adev->bios); return false; } @@ -389,7 +391,7 @@ static bool amdgpu_acpi_vfct_bios(struct amdgpu_device *adev) vhdr->ImageLength, GFP_KERNEL); - if (!check_atom_bios(adev->bios, vhdr->ImageLength)) { + if (!check_atom_bios(adev, vhdr->ImageLength)) { kfree(adev->bios); return false; } diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c index 16153d275d7a..68bce6a6d09d 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c @@ -414,7 +414,9 @@ static int amdgpu_cgs_get_firmware_info(struct cgs_device *cgs_device, return -EINVAL; } - err = amdgpu_ucode_request(adev, &adev->pm.fw, "%s", fw_name); + err = amdgpu_ucode_request(adev, &adev->pm.fw, + AMDGPU_UCODE_REQUIRED, + "%s", fw_name); if (err) { DRM_ERROR("Failed to load firmware \"%s\"", fw_name); amdgpu_ucode_release(&adev->pm.fw); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c index d891ab779ca7..5cc5f59e3018 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c @@ -1105,7 +1105,7 @@ static int amdgpu_cs_vm_handling(struct amdgpu_cs_parser *p) * We can't use gang submit on with reserved VMIDs when the VM changes * can't be invalidated by more than one engine at the same time. 
*/ - if (p->gang_size > 1 && !p->adev->vm_manager.concurrent_flush) { + if (p->gang_size > 1 && !adev->vm_manager.concurrent_flush) { for (i = 0; i < p->gang_size; ++i) { struct drm_sched_entity *entity = p->entities[i]; struct drm_gpu_scheduler *sched = entity->rq->sched; @@ -1189,7 +1189,7 @@ static int amdgpu_cs_vm_handling(struct amdgpu_cs_parser *p) if (!bo) continue; - amdgpu_vm_bo_invalidate(adev, bo, false); + amdgpu_vm_bo_invalidate(bo, false); } } @@ -1801,13 +1801,18 @@ int amdgpu_cs_find_mapping(struct amdgpu_cs_parser *parser, if (dma_resv_locking_ctx((*bo)->tbo.base.resv) != &parser->exec.ticket) return -EINVAL; + /* Make sure VRAM is allocated contiguously */ (*bo)->flags |= AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS; - amdgpu_bo_placement_from_domain(*bo, (*bo)->allowed_domains); - for (i = 0; i < (*bo)->placement.num_placement; i++) - (*bo)->placements[i].flags |= TTM_PL_FLAG_CONTIGUOUS; - r = ttm_bo_validate(&(*bo)->tbo, &(*bo)->placement, &ctx); - if (r) - return r; + if ((*bo)->tbo.resource->mem_type == TTM_PL_VRAM && + !((*bo)->tbo.resource->placement & TTM_PL_FLAG_CONTIGUOUS)) { + + amdgpu_bo_placement_from_domain(*bo, (*bo)->allowed_domains); + for (i = 0; i < (*bo)->placement.num_placement; i++) + (*bo)->placements[i].flags |= TTM_PL_FLAG_CONTIGUOUS; + r = ttm_bo_validate(&(*bo)->tbo, &(*bo)->placement, &ctx); + if (r) + return r; + } return amdgpu_ttm_alloc_gart(&(*bo)->tbo); } diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c index a68338cb7b4a..49ca8c814455 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c @@ -2095,6 +2095,7 @@ int amdgpu_debugfs_init(struct amdgpu_device *adev) if (amdgpu_umsch_mm & amdgpu_umsch_mm_fwlog) amdgpu_debugfs_umsch_fwlog_init(adev, &adev->umsch_mm); + amdgpu_debugfs_vcn_sched_mask_init(adev); amdgpu_debugfs_jpeg_sched_mask_init(adev); amdgpu_debugfs_gfx_sched_mask_init(adev);
amdgpu_debugfs_compute_sched_mask_init(adev); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dev_coredump.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_dev_coredump.c index 946c48829f19..824f9da5b6ce 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dev_coredump.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dev_coredump.c @@ -343,11 +343,10 @@ void amdgpu_coredump(struct amdgpu_device *adev, bool skip_vram_check, coredump->skip_vram_check = skip_vram_check; coredump->reset_vram_lost = vram_lost; - if (job && job->vm) { - struct amdgpu_vm *vm = job->vm; + if (job && job->pasid) { struct amdgpu_task_info *ti; - ti = amdgpu_vm_get_task_info_vm(vm); + ti = amdgpu_vm_get_task_info_pasid(adev, job->pasid); if (ti) { coredump->reset_task_info = *ti; amdgpu_vm_put_task_info(ti); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index 9095c05e0269..36053b3d48b3 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -145,7 +145,7 @@ const char *amdgpu_asic_name[] = { "LAST", }; -#define AMDGPU_IP_BLK_MASK_ALL GENMASK(AMDGPU_MAX_IP_NUM - 1, 0) +#define AMDGPU_IP_BLK_MASK_ALL GENMASK(AMD_IP_BLOCK_TYPE_NUM - 1, 0) /* * Default init level where all blocks are expected to be initialized. This is * the level of initialization expected by default and also after a full reset @@ -199,14 +199,16 @@ void amdgpu_set_init_level(struct amdgpu_device *adev, } static inline void amdgpu_device_stop_pending_resets(struct amdgpu_device *adev); +static int amdgpu_device_pm_notifier(struct notifier_block *nb, unsigned long mode, + void *data); /** * DOC: pcie_replay_count * * The amdgpu driver provides a sysfs API for reporting the total number - * of PCIe replays (NAKs) + * of PCIe replays (NAKs). * The file pcie_replay_count is used for this and returns the total - * number of replays as a sum of the NAKs generated and NAKs received + * number of replays as a sum of the NAKs generated and NAKs received. 
*/ static ssize_t amdgpu_device_get_pcie_replay_count(struct device *dev, @@ -417,6 +419,9 @@ bool amdgpu_device_supports_boco(struct drm_device *dev) { struct amdgpu_device *adev = drm_to_adev(dev); + if (!IS_ENABLED(CONFIG_HOTPLUG_PCI_PCIE)) + return false; + if (adev->has_pr3 || ((adev->flags & AMD_IS_PX) && amdgpu_is_atpx_hybrid())) return true; @@ -429,8 +434,8 @@ bool amdgpu_device_supports_boco(struct drm_device *dev) * @dev: drm_device pointer * * Return: - * 1 if the device supporte BACO; - * 3 if the device support MACO (only works if BACO is supported) + * 1 if the device supports BACO; + * 3 if the device supports MACO (only works if BACO is supported) * otherwise return 0. */ int amdgpu_device_supports_baco(struct drm_device *dev) @@ -577,7 +582,7 @@ void amdgpu_device_mm_access(struct amdgpu_device *adev, loff_t pos, } /** - * amdgpu_device_aper_access - access vram by vram aperature + * amdgpu_device_aper_access - access vram by vram aperture * * @adev: amdgpu_device pointer * @pos: offset of the buffer in vram @@ -668,7 +673,7 @@ bool amdgpu_device_skip_hw_access(struct amdgpu_device *adev) * here is that the GPU reset is not running on another thread in parallel. * * For this we trylock the read side of the reset semaphore, if that succeeds - * we know that the reset is not running in paralell. + * we know that the reset is not running in parallel. 
* * If the trylock fails we assert that we are either already holding the read * side of the lock or are the reset thread itself and hold the write side of @@ -1399,6 +1404,7 @@ static int amdgpu_device_asic_init(struct amdgpu_device *adev) if (amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(9, 4, 3) || amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(9, 4, 4) || + amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(9, 5, 0) || amdgpu_ip_version(adev, GC_HWIP, 0) >= IP_VERSION(11, 0, 0)) { amdgpu_psp_wait_for_bootloader(adev); ret = amdgpu_atomfirmware_asic_init(adev, true); @@ -1733,7 +1739,7 @@ bool amdgpu_device_need_post(struct amdgpu_device *adev) uint32_t fw_ver; err = request_firmware(&adev->pm.fw, "amdgpu/fiji_smc.bin", adev->dev); - /* force vPost if error occured */ + /* force vPost if error occurred */ if (err) return true; @@ -2165,7 +2171,7 @@ int amdgpu_device_ip_set_clockgating_state(void *dev, if (!adev->ip_blocks[i].version->funcs->set_clockgating_state) continue; r = adev->ip_blocks[i].version->funcs->set_clockgating_state( - (void *)adev, state); + &adev->ip_blocks[i], state); if (r) DRM_ERROR("set_clockgating_state of IP block <%s> failed %d\n", adev->ip_blocks[i].version->funcs->name, r); @@ -2199,7 +2205,7 @@ int amdgpu_device_ip_set_powergating_state(void *dev, if (!adev->ip_blocks[i].version->funcs->set_powergating_state) continue; r = adev->ip_blocks[i].version->funcs->set_powergating_state( - (void *)adev, state); + &adev->ip_blocks[i], state); if (r) DRM_ERROR("set_powergating_state of IP block <%s> failed %d\n", adev->ip_blocks[i].version->funcs->name, r); @@ -2378,7 +2384,7 @@ int amdgpu_device_ip_block_add(struct amdgpu_device *adev, * the module parameter virtual_display. This feature provides a virtual * display hardware on headless boards or in virtualized environments. 
* This function parses and validates the configuration string specified by - * the user and configues the virtual display configuration (number of + * the user and configures the virtual display configuration (number of * virtual connectors, crtcs, etc.) specified. */ static void amdgpu_device_enable_virtual_display(struct amdgpu_device *adev) @@ -2441,7 +2447,7 @@ void amdgpu_device_set_sriov_virtual_display(struct amdgpu_device *adev) * @adev: amdgpu_device pointer * * Parses the asic configuration parameters specified in the gpu info - * firmware and makes them availale to the driver for use in configuring + * firmware and makes them available to the driver for use in configuring * the asic. * Returns 0 on success, -EINVAL on failure. */ @@ -2482,6 +2488,7 @@ static int amdgpu_device_parse_gpu_info_fw(struct amdgpu_device *adev) } err = amdgpu_ucode_request(adev, &adev->firmware.gpu_info_fw, + AMDGPU_UCODE_OPTIONAL, "amdgpu/%s_gpu_info.bin", chip_name); if (err) { dev_err(adev->dev, @@ -2501,7 +2508,7 @@ static int amdgpu_device_parse_gpu_info_fw(struct amdgpu_device *adev) le32_to_cpu(hdr->header.ucode_array_offset_bytes)); /* - * Should be droped when DAL no longer needs it. + * Should be dropped when DAL no longer needs it. */ if (adev->asic_type == CHIP_NAVI12) goto parse_soc_bounding_box; @@ -3061,7 +3068,7 @@ init_failed: * * Writes a reset magic value to the gart pointer in VRAM. The driver calls * this function before a GPU reset. If the value is retained after a - * GPU reset, VRAM has not been lost. Some GPU resets may destry VRAM contents. + * GPU reset, VRAM has not been lost. Some GPU resets may destroy VRAM contents. 
*/ static void amdgpu_device_fill_reset_magic(struct amdgpu_device *adev) { @@ -3137,7 +3144,7 @@ int amdgpu_device_set_cg_state(struct amdgpu_device *adev, adev->ip_blocks[i].version->type != AMD_IP_BLOCK_TYPE_JPEG && adev->ip_blocks[i].version->funcs->set_clockgating_state) { /* enable clockgating to save power */ - r = adev->ip_blocks[i].version->funcs->set_clockgating_state((void *)adev, + r = adev->ip_blocks[i].version->funcs->set_clockgating_state(&adev->ip_blocks[i], state); if (r) { DRM_ERROR("set_clockgating_state(gate) of IP block <%s> failed %d\n", @@ -3174,7 +3181,7 @@ int amdgpu_device_set_pg_state(struct amdgpu_device *adev, adev->ip_blocks[i].version->type != AMD_IP_BLOCK_TYPE_JPEG && adev->ip_blocks[i].version->funcs->set_powergating_state) { /* enable powergating to save power */ - r = adev->ip_blocks[i].version->funcs->set_powergating_state((void *)adev, + r = adev->ip_blocks[i].version->funcs->set_powergating_state(&adev->ip_blocks[i], state); if (r) { DRM_ERROR("set_powergating_state(gate) of IP block <%s> failed %d\n", @@ -3376,7 +3383,7 @@ static int amdgpu_device_ip_fini_early(struct amdgpu_device *adev) amdgpu_amdkfd_suspend(adev, false); - /* Workaroud for ASICs need to disable SMC first */ + /* Workaround for ASICs need to disable SMC first */ amdgpu_device_smu_fini_early(adev); for (i = adev->num_ip_blocks - 1; i >= 0; i--) { @@ -3478,7 +3485,7 @@ static void amdgpu_device_delay_enable_gfx_off(struct work_struct *work) WARN_ON_ONCE(adev->gfx.gfx_off_state); WARN_ON_ONCE(adev->gfx.gfx_off_req_count); - if (!amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_GFX, true)) + if (!amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_GFX, true, 0)) adev->gfx.gfx_off_state = true; } @@ -3670,9 +3677,11 @@ static int amdgpu_device_ip_reinit_early_sriov(struct amdgpu_device *adev) continue; r = block->version->funcs->hw_init(&adev->ip_blocks[i]); - DRM_INFO("RE-INIT-early: %s %s\n", block->version->funcs->name, r?"failed":"succeeded"); 
- if (r) + if (r) { + dev_err(adev->dev, "RE-INIT-early: %s failed\n", + block->version->funcs->name); return r; + } block->status.hw = true; } } @@ -3682,7 +3691,8 @@ static int amdgpu_device_ip_reinit_early_sriov(struct amdgpu_device *adev) static int amdgpu_device_ip_reinit_late_sriov(struct amdgpu_device *adev) { - int i, r; + struct amdgpu_ip_block *block; + int i, r = 0; static enum amd_ip_block_type ip_order[] = { AMD_IP_BLOCK_TYPE_SMC, @@ -3697,34 +3707,28 @@ static int amdgpu_device_ip_reinit_late_sriov(struct amdgpu_device *adev) }; for (i = 0; i < ARRAY_SIZE(ip_order); i++) { - int j; - struct amdgpu_ip_block *block; + block = amdgpu_device_ip_get_ip_block(adev, ip_order[i]); - for (j = 0; j < adev->num_ip_blocks; j++) { - block = &adev->ip_blocks[j]; - - if (block->version->type != ip_order[i] || - !block->status.valid || - block->status.hw) - continue; + if (!block) + continue; + if (block->status.valid && !block->status.hw) { if (block->version->type == AMD_IP_BLOCK_TYPE_SMC) { - r = amdgpu_ip_block_resume(&adev->ip_blocks[i]); - if (r) - return r; + r = amdgpu_ip_block_resume(block); } else { - r = block->version->funcs->hw_init(&adev->ip_blocks[i]); - if (r) { - DRM_ERROR("hw_init of IP block <%s> failed %d\n", - adev->ip_blocks[i].version->funcs->name, r); - return r; - } - block->status.hw = true; + r = block->version->funcs->hw_init(block); } + + if (r) { + dev_err(adev->dev, "RE-INIT-late: %s failed\n", + block->version->funcs->name); + break; + } + block->status.hw = true; } } - return 0; + return r; } /** @@ -3765,7 +3769,7 @@ static int amdgpu_device_ip_resume_phase1(struct amdgpu_device *adev) * * @adev: amdgpu_device pointer * - * First resume function for hardware IPs. The list of all the hardware + * Second resume function for hardware IPs. The list of all the hardware * IPs that make up the asic is walked and the resume callbacks are run for * all blocks except COMMON, GMC, and IH. 
resume puts the hardware into a * functional state after a suspend and updates the software state as @@ -3783,6 +3787,7 @@ static int amdgpu_device_ip_resume_phase2(struct amdgpu_device *adev) if (adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_COMMON || adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_GMC || adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_IH || + adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_DCE || adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_PSP) continue; r = amdgpu_ip_block_resume(&adev->ip_blocks[i]); @@ -3793,6 +3798,36 @@ static int amdgpu_device_ip_resume_phase2(struct amdgpu_device *adev) return 0; } +/** + * amdgpu_device_ip_resume_phase3 - run resume for hardware IPs + * + * @adev: amdgpu_device pointer + * + * Third resume function for hardware IPs. The list of all the hardware + * IPs that make up the asic is walked and the resume callbacks are run for + * all DCE. resume puts the hardware into a functional state after a suspend + * and updates the software state as necessary. This function is also used + * for restoring the GPU after a GPU reset. + * + * Returns 0 on success, negative error code on failure. 
+ */ +static int amdgpu_device_ip_resume_phase3(struct amdgpu_device *adev) +{ + int i, r; + + for (i = 0; i < adev->num_ip_blocks; i++) { + if (!adev->ip_blocks[i].status.valid || adev->ip_blocks[i].status.hw) + continue; + if (adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_DCE) { + r = amdgpu_ip_block_resume(&adev->ip_blocks[i]); + if (r) + return r; + } + } + + return 0; +} + /** * amdgpu_device_ip_resume - run resume for hardware IPs * @@ -3822,6 +3857,13 @@ static int amdgpu_device_ip_resume(struct amdgpu_device *adev) if (adev->mman.buffer_funcs_ring->sched.ready) amdgpu_ttm_set_buffer_funcs_status(adev, true); + if (r) + return r; + + amdgpu_fence_driver_hw_init(adev); + + r = amdgpu_device_ip_resume_phase3(adev); + return r; } @@ -4271,7 +4313,7 @@ int amdgpu_device_init(struct amdgpu_device *adev, /* * Reset domain needs to be present early, before XGMI hive discovered - * (if any) and intitialized to use reset sem and in_gpu reset flag + * (if any) and initialized to use reset sem and in_gpu reset flag * early on during init and before calling to RREG32. 
*/ adev->reset_domain = amdgpu_reset_create_reset_domain(SINGLE_DEVICE, "amdgpu-reset-dev"); @@ -4561,6 +4603,11 @@ fence_driver_init: amdgpu_device_check_iommu_direct_map(adev); + adev->pm_nb.notifier_call = amdgpu_device_pm_notifier; + r = register_pm_notifier(&adev->pm_nb); + if (r) + goto failed; + return 0; release_ras_con: @@ -4625,6 +4672,8 @@ void amdgpu_device_fini_hw(struct amdgpu_device *adev) drain_workqueue(adev->mman.bdev.wq); adev->shutdown = true; + unregister_pm_notifier(&adev->pm_nb); + /* make sure IB test finished before entering exclusive mode * to avoid preemption on IB test */ @@ -4743,8 +4792,8 @@ static int amdgpu_device_evict_resources(struct amdgpu_device *adev) { int ret; - /* No need to evict vram on APUs for suspend to ram or s2idle */ - if ((adev->in_s3 || adev->in_s0ix) && (adev->flags & AMD_IS_APU)) + /* No need to evict vram on APUs unless going to S4 */ + if (!adev->in_s4 && (adev->flags & AMD_IS_APU)) return 0; ret = amdgpu_ttm_evict_resources(adev, TTM_PL_VRAM); @@ -4756,6 +4805,41 @@ static int amdgpu_device_evict_resources(struct amdgpu_device *adev) /* * Suspend & resume. */ +/** + * amdgpu_device_pm_notifier - Notification block for Suspend/Hibernate events + * @nb: notifier block + * @mode: suspend mode + * @data: data + * + * This function is called when the system is about to suspend or hibernate. + * It is used to evict resources from the device before the system goes to + * sleep while there is still access to swap. + */ +static int amdgpu_device_pm_notifier(struct notifier_block *nb, unsigned long mode, + void *data) +{ + struct amdgpu_device *adev = container_of(nb, struct amdgpu_device, pm_nb); + int r; + + switch (mode) { + case PM_HIBERNATION_PREPARE: + adev->in_s4 = true; + fallthrough; + case PM_SUSPEND_PREPARE: + r = amdgpu_device_evict_resources(adev); + /* + * This is considered non-fatal at this time because + * amdgpu_device_prepare() will also fatally evict resources. 
+ * See https://gitlab.freedesktop.org/drm/amd/-/issues/3781 + */ + if (r) + drm_warn(adev_to_drm(adev), "Failed to evict resources, freeze active processes if problems occur: %d\n", r); + break; + } + + return NOTIFY_DONE; +} + /** * amdgpu_device_prepare - prepare for device suspend * @@ -4795,7 +4879,7 @@ int amdgpu_device_prepare(struct drm_device *dev) return 0; unprepare: - adev->in_s0ix = adev->in_s3 = false; + adev->in_s0ix = adev->in_s3 = adev->in_s4 = false; return r; } @@ -4902,7 +4986,6 @@ int amdgpu_device_resume(struct drm_device *dev, bool notify_clients) dev_err(adev->dev, "amdgpu_device_ip_resume failed (%d).\n", r); goto exit; } - amdgpu_fence_driver_hw_init(adev); if (!adev->in_s0ix) { r = amdgpu_amdkfd_resume(adev, adev->in_runpm); @@ -5147,7 +5230,7 @@ static int amdgpu_device_reset_sriov(struct amdgpu_device *adev, if (r) return r; - amdgpu_ras_set_fed(adev, false); + amdgpu_ras_clear_err_state(adev); amdgpu_irq_gpu_reset_resume_helper(adev); /* some sw clean up VF needs to do before recover */ @@ -5204,16 +5287,18 @@ static int amdgpu_device_reset_sriov(struct amdgpu_device *adev, } /** - * amdgpu_device_has_job_running - check if there is any job in mirror list + * amdgpu_device_has_job_running - check if there is any unfinished job * * @adev: amdgpu_device pointer * - * check if there is any job in mirror list + * check if there is any job running on the device when guest driver receives + * FLR notification from host driver. If there are still jobs running, then + * the guest driver will not respond the FLR reset. Instead, let the job hit + * the timeout and guest driver then issue the reset request. 
*/ bool amdgpu_device_has_job_running(struct amdgpu_device *adev) { int i; - struct drm_sched_job *job; for (i = 0; i < AMDGPU_MAX_RINGS; ++i) { struct amdgpu_ring *ring = adev->rings[i]; @@ -5221,11 +5306,7 @@ bool amdgpu_device_has_job_running(struct amdgpu_device *adev) if (!amdgpu_ring_sched_ready(ring)) continue; - spin_lock(&ring->sched.job_list_lock); - job = list_first_entry_or_null(&ring->sched.pending_list, - struct drm_sched_job, list); - spin_unlock(&ring->sched.job_list_lock); - if (job) + if (amdgpu_fence_count_emitted(ring)) return true; } return false; @@ -5450,7 +5531,7 @@ int amdgpu_device_reinit_after_reset(struct amdgpu_reset_context *reset_context) amdgpu_set_init_level(tmp_adev, init_level); if (full_reset) { /* post card */ - amdgpu_ras_set_fed(tmp_adev, false); + amdgpu_ras_clear_err_state(tmp_adev); r = amdgpu_device_asic_init(tmp_adev); if (r) { dev_warn(tmp_adev->dev, "asic atom init failed!"); @@ -5487,6 +5568,10 @@ int amdgpu_device_reinit_after_reset(struct amdgpu_reset_context *reset_context) if (tmp_adev->mman.buffer_funcs_ring->sched.ready) amdgpu_ttm_set_buffer_funcs_status(tmp_adev, true); + r = amdgpu_device_ip_resume_phase3(tmp_adev); + if (r) + goto out; + if (vram_lost) amdgpu_device_fill_reset_magic(tmp_adev); @@ -5779,6 +5864,18 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev, bool audio_suspended = false; int retry_limit = AMDGPU_MAX_RETRY_LIMIT; + /* + * If it reaches here because of hang/timeout and a RAS error is + * detected at the same time, let RAS recovery take care of it. 
+ */ + if (amdgpu_ras_is_err_state(adev, AMDGPU_RAS_BLOCK__ANY) && + !amdgpu_sriov_vf(adev) && + reset_context->src != AMDGPU_RESET_SRC_RAS) { + dev_dbg(adev->dev, + "Gpu recovery from source: %d yielding to RAS error recovery handling", + reset_context->src); + return 0; + } /* * Special case: RAS triggered and full reset isn't supported */ @@ -5862,7 +5959,7 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev, amdgpu_amdkfd_pre_reset(tmp_adev, reset_context); /* - * Mark these ASICs to be reseted as untracked first + * Mark these ASICs to be reset as untracked first * And add them back after reset completed */ amdgpu_unregister_gpu_instance(tmp_adev); @@ -6065,7 +6162,7 @@ static void amdgpu_device_partner_bandwidth(struct amdgpu_device *adev, * * @adev: amdgpu_device pointer * - * Fetchs and stores in the driver the PCIE capabilities (gen speed + * Fetches and stores in the driver the PCIE capabilities (gen speed * and lanes) of the slot the device is in. Handles APUs and * virtualized environments where PCIE config space may not be available. */ diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c index 1040204ac8b9..949d74eff294 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c @@ -1,5 +1,5 @@ /* - * Copyright 2018 Advanced Micro Devices, Inc. + * Copyright 2018-2024 Advanced Micro Devices, Inc. All rights reserved. 
* * Permission is hereby granted, free of charge, to any person obtaining a * copy of this software and associated documentation files (the "Software"), @@ -104,7 +104,9 @@ #include "smuio_v13_0_6.h" #include "smuio_v14_0_2.h" #include "vcn_v5_0_0.h" +#include "vcn_v5_0_1.h" #include "jpeg_v5_0_0.h" +#include "jpeg_v5_0_1.h" #include "amdgpu_vpe.h" #if defined(CONFIG_DRM_AMD_ISP) @@ -1340,7 +1342,7 @@ static int amdgpu_discovery_reg_base_init(struct amdgpu_device *adev) */ if (adev->vcn.num_vcn_inst < AMDGPU_MAX_VCN_INSTANCES) { - adev->vcn.vcn_config[adev->vcn.num_vcn_inst] = + adev->vcn.inst[adev->vcn.num_vcn_inst].vcn_config = ip->revision & 0xc0; adev->vcn.num_vcn_inst++; adev->vcn.inst_mask |= @@ -1705,7 +1707,7 @@ static int amdgpu_discovery_get_vcn_info(struct amdgpu_device *adev) * so this won't overflow. */ for (v = 0; v < adev->vcn.num_vcn_inst; v++) { - adev->vcn.vcn_codec_disable_mask[v] = + adev->vcn.inst[v].vcn_codec_disable_mask = le32_to_cpu(vcn_info->v1.instance_info[v].fuse_data.all_bits); } break; @@ -1836,6 +1838,7 @@ static int amdgpu_discovery_set_common_ip_blocks(struct amdgpu_device *adev) case IP_VERSION(9, 4, 2): case IP_VERSION(9, 4, 3): case IP_VERSION(9, 4, 4): + case IP_VERSION(9, 5, 0): amdgpu_device_ip_block_add(adev, &vega10_common_ip_block); break; case IP_VERSION(10, 1, 10): @@ -1890,6 +1893,7 @@ static int amdgpu_discovery_set_gmc_ip_blocks(struct amdgpu_device *adev) case IP_VERSION(9, 4, 2): case IP_VERSION(9, 4, 3): case IP_VERSION(9, 4, 4): + case IP_VERSION(9, 5, 0): amdgpu_device_ip_block_add(adev, &gmc_v9_0_ip_block); break; case IP_VERSION(10, 1, 10): @@ -2013,6 +2017,7 @@ static int amdgpu_discovery_set_psp_ip_blocks(struct amdgpu_device *adev) case IP_VERSION(13, 0, 8): case IP_VERSION(13, 0, 10): case IP_VERSION(13, 0, 11): + case IP_VERSION(13, 0, 12): case IP_VERSION(13, 0, 14): case IP_VERSION(14, 0, 0): case IP_VERSION(14, 0, 1): @@ -2184,6 +2189,7 @@ static int amdgpu_discovery_set_gc_ip_blocks(struct 
amdgpu_device *adev) break; case IP_VERSION(9, 4, 3): case IP_VERSION(9, 4, 4): + case IP_VERSION(9, 5, 0): amdgpu_device_ip_block_add(adev, &gfx_v9_4_3_ip_block); break; case IP_VERSION(10, 1, 10): @@ -2238,6 +2244,7 @@ static int amdgpu_discovery_set_sdma_ip_blocks(struct amdgpu_device *adev) break; case IP_VERSION(4, 4, 2): case IP_VERSION(4, 4, 5): + case IP_VERSION(4, 4, 4): amdgpu_device_ip_block_add(adev, &sdma_v4_4_2_ip_block); break; case IP_VERSION(5, 0, 0): @@ -2361,6 +2368,10 @@ static int amdgpu_discovery_set_mm_ip_blocks(struct amdgpu_device *adev) amdgpu_device_ip_block_add(adev, &vcn_v5_0_0_ip_block); amdgpu_device_ip_block_add(adev, &jpeg_v5_0_0_ip_block); break; + case IP_VERSION(5, 0, 1): + amdgpu_device_ip_block_add(adev, &vcn_v5_0_1_ip_block); + amdgpu_device_ip_block_add(adev, &jpeg_v5_0_1_ip_block); + break; default: dev_err(adev->dev, "Failed to add vcn/jpeg ip block(UVD_HWIP:0x%x)\n", @@ -2405,6 +2416,7 @@ static void amdgpu_discovery_init_soc_config(struct amdgpu_device *adev) switch (amdgpu_ip_version(adev, GC_HWIP, 0)) { case IP_VERSION(9, 4, 3): case IP_VERSION(9, 4, 4): + case IP_VERSION(9, 5, 0): aqua_vanjaram_init_soc_config(adev); break; default: @@ -2652,6 +2664,7 @@ int amdgpu_discovery_set_ip_blocks(struct amdgpu_device *adev) case IP_VERSION(9, 4, 2): case IP_VERSION(9, 4, 3): case IP_VERSION(9, 4, 4): + case IP_VERSION(9, 5, 0): adev->family = AMDGPU_FAMILY_AI; break; case IP_VERSION(9, 1, 0): diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c index b119d27271c1..35c778426a7c 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c @@ -33,6 +33,7 @@ #include "soc15_common.h" #include "gc/gc_11_0_0_offset.h" #include "gc/gc_11_0_0_sh_mask.h" +#include "bif/bif_4_1_d.h" #include #include @@ -1788,3 +1789,82 @@ int amdgpu_display_resume_helper(struct amdgpu_device *adev) return 0; } +/* panic_bo is set in 
amdgpu_dm_plane_get_scanout_buffer() and only used in amdgpu_dm_set_pixel() + * they are called from the panic handler, and protected by the drm_panic spinlock. + */ +static struct amdgpu_bo *panic_abo; + +/* Use the indirect MMIO to write each pixel to the GPU VRAM, + * This is a simplified version of amdgpu_device_mm_access() + */ +static void amdgpu_display_set_pixel(struct drm_scanout_buffer *sb, + unsigned int x, + unsigned int y, + u32 color) +{ + struct amdgpu_res_cursor cursor; + unsigned long offset; + struct amdgpu_bo *abo = panic_abo; + struct amdgpu_device *adev = amdgpu_ttm_adev(abo->tbo.bdev); + uint32_t tmp; + + offset = x * 4 + y * sb->pitch[0]; + amdgpu_res_first(abo->tbo.resource, offset, 4, &cursor); + + tmp = cursor.start >> 31; + WREG32_NO_KIQ(mmMM_INDEX, ((uint32_t) cursor.start) | 0x80000000); + if (tmp != 0xffffffff) + WREG32_NO_KIQ(mmMM_INDEX_HI, tmp); + WREG32_NO_KIQ(mmMM_DATA, color); +} + +int amdgpu_display_get_scanout_buffer(struct drm_plane *plane, + struct drm_scanout_buffer *sb) +{ + struct amdgpu_bo *abo; + struct drm_framebuffer *fb = plane->state->fb; + + if (!fb) + return -EINVAL; + + DRM_DEBUG_KMS("Framebuffer %dx%d %p4cc\n", fb->width, fb->height, &fb->format->format); + + abo = gem_to_amdgpu_bo(fb->obj[0]); + if (!abo) + return -EINVAL; + + sb->width = fb->width; + sb->height = fb->height; + /* Use the generic linear format, because tiling will be disabled in panic_flush() */ + sb->format = drm_format_info(fb->format->format); + if (!sb->format) + return -EINVAL; + + sb->pitch[0] = fb->pitches[0]; + + if (abo->flags & AMDGPU_GEM_CREATE_NO_CPU_ACCESS) { + if (abo->tbo.resource->mem_type != TTM_PL_VRAM) { + drm_warn(plane->dev, "amdgpu panic, framebuffer not in VRAM\n"); + return -EINVAL; + } + /* Only handle 32bits format, to simplify mmio access */ + if (fb->format->cpp[0] != 4) { + drm_warn(plane->dev, "amdgpu panic, pixel format is not 32bits\n"); + return -EINVAL; + } + sb->set_pixel = amdgpu_display_set_pixel; + panic_abo 
= abo; + return 0; + } + if (!abo->kmap.virtual && + ttm_bo_kmap(&abo->tbo, 0, PFN_UP(abo->tbo.base.size), &abo->kmap)) { + drm_warn(plane->dev, "amdgpu bo map failed, panic won't be displayed\n"); + return -ENOMEM; + } + if (abo->kmap.bo_kmap_type & TTM_BO_MAP_IOMEM_MASK) + iosys_map_set_vaddr_iomem(&sb->map[0], abo->kmap.virtual); + else + iosys_map_set_vaddr(&sb->map[0], abo->kmap.virtual); + + return 0; +} diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.h index 9d19940f73c8..dfa0d642ac16 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.h @@ -23,6 +23,8 @@ #ifndef __AMDGPU_DISPLAY_H__ #define __AMDGPU_DISPLAY_H__ +#include + #define amdgpu_display_vblank_get_counter(adev, crtc) (adev)->mode_info.funcs->vblank_get_counter((adev), (crtc)) #define amdgpu_display_backlight_set_level(adev, e, l) (adev)->mode_info.funcs->backlight_set_level((e), (l)) #define amdgpu_display_backlight_get_level(adev, e) (adev)->mode_info.funcs->backlight_get_level((e)) @@ -49,4 +51,7 @@ amdgpu_lookup_format_info(u32 format, uint64_t modifier); int amdgpu_display_suspend_helper(struct amdgpu_device *adev); int amdgpu_display_resume_helper(struct amdgpu_device *adev); +int amdgpu_display_get_scanout_buffer(struct drm_plane *plane, + struct drm_scanout_buffer *sb); + #endif diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c index 8e81a83d37d8..9f627caedc3f 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c @@ -36,6 +36,7 @@ #include "amdgpu_gem.h" #include "amdgpu_dma_buf.h" #include "amdgpu_xgmi.h" +#include "amdgpu_vm.h" #include #include #include @@ -60,6 +61,8 @@ static int amdgpu_dma_buf_attach(struct dma_buf *dmabuf, if (pci_p2pdma_distance(adev->pdev, attach->dev, false) < 0) attach->peer2peer = false; + amdgpu_vm_bo_update_shared(bo); + return 0; } @@ -345,7 +348,7 @@ 
amdgpu_dma_buf_move_notify(struct dma_buf_attachment *attach) /* FIXME: This should be after the "if", but needs a fix to make sure * DMABuf imports are initialized in the right VM list. */ - amdgpu_vm_bo_invalidate(adev, bo, false); + amdgpu_vm_bo_invalidate(bo, false); if (!bo->tbo.resource || bo->tbo.resource->mem_type == TTM_PL_SYSTEM) return; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c index 38686203bea6..492b09d84571 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c @@ -23,7 +23,7 @@ */ #include -#include +#include #include #include #include @@ -2552,7 +2552,6 @@ static int amdgpu_pmops_freeze(struct device *dev) struct amdgpu_device *adev = drm_to_adev(drm_dev); int r; - adev->in_s4 = true; r = amdgpu_device_suspend(drm_dev, true); adev->in_s4 = false; if (r) @@ -2916,7 +2915,6 @@ static const struct drm_driver amdgpu_kms_driver = { .name = DRIVER_NAME, .desc = DRIVER_DESC, - .date = DRIVER_DATE, .major = KMS_DRIVER_MAJOR, .minor = KMS_DRIVER_MINOR, .patchlevel = KMS_DRIVER_PATCHLEVEL, @@ -2940,7 +2938,6 @@ const struct drm_driver amdgpu_partition_driver = { .name = DRIVER_NAME, .desc = DRIVER_DESC, - .date = DRIVER_DATE, .major = KMS_DRIVER_MAJOR, .minor = KMS_DRIVER_MINOR, .patchlevel = KMS_DRIVER_PATCHLEVEL, diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.h index 5bc2cb661af7..2d86cc6f7f4d 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.h @@ -40,7 +40,6 @@ #define DRIVER_NAME "amdgpu" #define DRIVER_DESC "AMD GPU" -#define DRIVER_DATE "20150101" extern const struct drm_driver amdgpu_partition_driver; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c index df2cf5c33925..91d638098889 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c @@ -60,7 +60,7 @@ void 
amdgpu_show_fdinfo(struct drm_printer *p, struct drm_file *file) struct amdgpu_fpriv *fpriv = file->driver_priv; struct amdgpu_vm *vm = &fpriv->vm; - struct amdgpu_mem_stats stats[__AMDGPU_PL_LAST + 1] = { }; + struct amdgpu_mem_stats stats[__AMDGPU_PL_NUM]; ktime_t usage[AMDGPU_HW_IP_NUM]; const char *pl_name[] = { [TTM_PL_VRAM] = "vram", @@ -72,15 +72,8 @@ void amdgpu_show_fdinfo(struct drm_printer *p, struct drm_file *file) [AMDGPU_PL_DOORBELL] = "doorbell", }; unsigned int hw_ip, i; - int ret; - - ret = amdgpu_bo_reserve(vm->root.bo, false); - if (ret) - return; - - amdgpu_vm_get_memory(vm, stats, ARRAY_SIZE(stats)); - amdgpu_bo_unreserve(vm->root.bo); + amdgpu_vm_get_memory(vm, stats); amdgpu_ctx_mgr_usage(&fpriv->ctx_mgr, usage); /* @@ -114,9 +107,11 @@ void amdgpu_show_fdinfo(struct drm_printer *p, struct drm_file *file) drm_printf(p, "amd-evicted-vram:\t%llu KiB\n", stats[TTM_PL_VRAM].evicted/1024UL); drm_printf(p, "amd-requested-vram:\t%llu KiB\n", - stats[TTM_PL_VRAM].requested/1024UL); + (stats[TTM_PL_VRAM].drm.shared + + stats[TTM_PL_VRAM].drm.private) / 1024UL); drm_printf(p, "amd-requested-gtt:\t%llu KiB\n", - stats[TTM_PL_TT].requested/1024UL); + (stats[TTM_PL_TT].drm.shared + + stats[TTM_PL_TT].drm.private) / 1024UL); for (hw_ip = 0; hw_ip < AMDGPU_HW_IP_NUM; ++hw_ip) { if (!usage[hw_ip]) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c index ceb5163480f4..09c9194d5bd5 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c @@ -384,7 +384,7 @@ int amdgpu_fru_sysfs_init(struct amdgpu_device *adev) void amdgpu_fru_sysfs_fini(struct amdgpu_device *adev) { - if (!is_fru_eeprom_supported(adev, NULL) || !adev->fru_info) + if (!adev->fru_info) return; sysfs_remove_files(&adev->dev->kobj, amdgpu_fru_attributes); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.h index 
bc58dca18035..98f3196599ef 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.h @@ -32,7 +32,7 @@ struct amdgpu_fru_info { char product_name[AMDGPU_PRODUCT_NAME_LEN]; char serial[20]; char manufacturer_name[32]; - char fru_id[32]; + char fru_id[50]; }; int amdgpu_fru_get_product_info(struct amdgpu_device *adev); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fw_attestation.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fw_attestation.c index 2d4b67175b55..328a1b963548 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fw_attestation.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fw_attestation.c @@ -122,6 +122,10 @@ static int amdgpu_is_fw_attestation_supported(struct amdgpu_device *adev) if (adev->flags & AMD_IS_APU) return 0; + if (amdgpu_ip_version(adev, MP0_HWIP, 0) == IP_VERSION(14, 0, 2) || + amdgpu_ip_version(adev, MP0_HWIP, 0) == IP_VERSION(14, 0, 3)) + return 0; + if (adev->asic_type >= CHIP_SIENNA_CICHLID) return 1; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c index 1a5df8b94661..69429df09477 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c @@ -42,6 +42,7 @@ #include "amdgpu_dma_buf.h" #include "amdgpu_hmm.h" #include "amdgpu_xgmi.h" +#include "amdgpu_vm.h" static vm_fault_t amdgpu_gem_fault(struct vm_fault *vmf) { @@ -87,10 +88,8 @@ static void amdgpu_gem_object_free(struct drm_gem_object *gobj) { struct amdgpu_bo *aobj = gem_to_amdgpu_bo(gobj); - if (aobj) { - amdgpu_hmm_unregister(aobj); - ttm_bo_put(&aobj->tbo); - } + amdgpu_hmm_unregister(aobj); + ttm_bo_put(&aobj->tbo); } int amdgpu_gem_object_create(struct amdgpu_device *adev, unsigned long size, @@ -179,6 +178,7 @@ static int amdgpu_gem_object_open(struct drm_gem_object *obj, if (r) return r; + amdgpu_vm_bo_update_shared(abo); bo_va = amdgpu_vm_bo_find(vm, abo); if (!bo_va) bo_va = amdgpu_vm_bo_add(adev, vm, abo); @@ -252,6 +252,7 @@ static void 
amdgpu_gem_object_close(struct drm_gem_object *obj, goto out_unlock; amdgpu_vm_bo_del(adev, bo_va); + amdgpu_vm_bo_update_shared(bo); if (!amdgpu_vm_ready(vm)) goto out_unlock; @@ -839,7 +840,6 @@ error: int amdgpu_gem_op_ioctl(struct drm_device *dev, void *data, struct drm_file *filp) { - struct amdgpu_device *adev = drm_to_adev(dev); struct drm_amdgpu_gem_op *args = data; struct drm_gem_object *gobj; struct amdgpu_vm_bo_base *base; @@ -899,7 +899,7 @@ int amdgpu_gem_op_ioctl(struct drm_device *dev, void *data, robj->allowed_domains |= AMDGPU_GEM_DOMAIN_GTT; if (robj->flags & AMDGPU_GEM_CREATE_VM_ALWAYS_VALID) - amdgpu_vm_bo_invalidate(adev, robj, true); + amdgpu_vm_bo_invalidate(robj, true); amdgpu_bo_unreserve(robj); break; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c index 69a6b6dba0a5..784b03abb3a4 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c @@ -515,7 +515,7 @@ int amdgpu_gfx_disable_kcq(struct amdgpu_device *adev, int xcc_id) if (!kiq->pmf || !kiq->pmf->kiq_unmap_queues) return -EINVAL; - if (!kiq_ring->sched.ready || adev->job_hang || amdgpu_in_reset(adev)) + if (!kiq_ring->sched.ready || amdgpu_in_reset(adev)) return 0; spin_lock(&kiq->ring_lock); @@ -567,7 +567,7 @@ int amdgpu_gfx_disable_kgq(struct amdgpu_device *adev, int xcc_id) if (!kiq->pmf || !kiq->pmf->kiq_unmap_queues) return -EINVAL; - if (!adev->gfx.kiq[0].ring.sched.ready || adev->job_hang) + if (!adev->gfx.kiq[0].ring.sched.ready || amdgpu_in_reset(adev)) return 0; if (amdgpu_gfx_is_master_xcc(adev, xcc_id)) { @@ -806,7 +806,7 @@ void amdgpu_gfx_off_ctrl(struct amdgpu_device *adev, bool enable) /* If going to s2idle, no need to wait */ if (adev->in_s0ix) { if (!amdgpu_dpm_set_powergating_by_smu(adev, - AMD_IP_BLOCK_TYPE_GFX, true)) + AMD_IP_BLOCK_TYPE_GFX, true, 0)) adev->gfx.gfx_off_state = true; } else { schedule_delayed_work(&adev->gfx.gfx_off_delay_work, @@ -818,7 +818,7 @@ void 
amdgpu_gfx_off_ctrl(struct amdgpu_device *adev, bool enable) cancel_delayed_work_sync(&adev->gfx.gfx_off_delay_work); if (adev->gfx.gfx_off_state && - !amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_GFX, false)) { + !amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_GFX, false, 0)) { adev->gfx.gfx_off_state = false; if (adev->gfx.funcs->init_spm_golden) { @@ -1484,6 +1484,24 @@ static int amdgpu_gfx_run_cleaner_shader(struct amdgpu_device *adev, int xcp_id) return 0; } +/** + * amdgpu_gfx_set_run_cleaner_shader - Execute the AMDGPU GFX Cleaner Shader + * @dev: The device structure + * @attr: The device attribute structure + * @buf: The buffer containing the input data + * @count: The size of the input data + * + * Provides the sysfs interface to manually run a cleaner shader, which is + * used to clear the GPU state between different tasks. Writing a value to the + * 'run_cleaner_shader' sysfs file triggers the cleaner shader execution. + * The value written corresponds to the partition index on multi-partition + * devices. On single-partition devices, the value should be '0'. + * + * The cleaner shader clears the Local Data Store (LDS) and General Purpose + * Registers (GPRs) to ensure data isolation between GPU workloads. + * + * Return: The number of bytes written to the sysfs file. + */ static ssize_t amdgpu_gfx_set_run_cleaner_shader(struct device *dev, struct device_attribute *attr, const char *buf, @@ -1532,6 +1550,19 @@ static ssize_t amdgpu_gfx_set_run_cleaner_shader(struct device *dev, return count; } +/** + * amdgpu_gfx_get_enforce_isolation - Query AMDGPU GFX Enforce Isolation Settings + * @dev: The device structure + * @attr: The device attribute structure + * @buf: The buffer to store the output data + * + * Provides the sysfs read interface to get the current settings of the 'enforce_isolation' + * feature for each GPU partition. 
Reading from the 'enforce_isolation' + * sysfs file returns the isolation settings for all partitions, where '0' + * indicates disabled and '1' indicates enabled. + * + * Return: The number of bytes read from the sysfs file. + */ static ssize_t amdgpu_gfx_get_enforce_isolation(struct device *dev, struct device_attribute *attr, char *buf) @@ -1555,6 +1586,20 @@ static ssize_t amdgpu_gfx_get_enforce_isolation(struct device *dev, return size; } +/** + * amdgpu_gfx_set_enforce_isolation - Control AMDGPU GFX Enforce Isolation + * @dev: The device structure + * @attr: The device attribute structure + * @buf: The buffer containing the input data + * @count: The size of the input data + * + * This function allows control over the 'enforce_isolation' feature, which + * serializes access to the graphics engine. Writing '1' or '0' to the + * 'enforce_isolation' sysfs file enables or disables process isolation for + * each partition. The input should specify the setting for all partitions. + * + * Return: The number of bytes written to the sysfs file. + */ static ssize_t amdgpu_gfx_set_enforce_isolation(struct device *dev, struct device_attribute *attr, const char *buf, size_t count) @@ -1940,6 +1985,17 @@ void amdgpu_gfx_enforce_isolation_handler(struct work_struct *work) mutex_unlock(&adev->enforce_isolation_mutex); } +/** + * amdgpu_gfx_enforce_isolation_wait_for_kfd - Manage KFD wait period for process isolation + * @adev: amdgpu_device pointer + * @idx: Index of the GPU partition + * + * When kernel submissions come in, the jobs are given a time slice and once + * that time slice is up, if there are KFD user queues active, kernel + * submissions are blocked until KFD has had its time slice. Once the KFD time + * slice is up, KFD user queues are preempted and kernel submissions are + * unblocked and allowed to run again. 
+ */ static void amdgpu_gfx_enforce_isolation_wait_for_kfd(struct amdgpu_device *adev, u32 idx) @@ -1985,10 +2041,20 @@ amdgpu_gfx_enforce_isolation_wait_for_kfd(struct amdgpu_device *adev, msleep(GFX_SLICE_PERIOD_MS); } +/** + * amdgpu_gfx_enforce_isolation_ring_begin_use - Begin use of a ring with enforced isolation + * @ring: Pointer to the amdgpu_ring structure + * + * Ring begin_use helper implementation for gfx which serializes access to the + * gfx IP between kernel submission IOCTLs and KFD user queues when isolation + * enforcement is enabled. The kernel submission IOCTLs and KFD user queues + * each get a time slice when both are active. + */ void amdgpu_gfx_enforce_isolation_ring_begin_use(struct amdgpu_ring *ring) { struct amdgpu_device *adev = ring->adev; u32 idx; + bool sched_work = false; if (!adev->gfx.enable_cleaner_shader) return; @@ -2007,15 +2073,28 @@ void amdgpu_gfx_enforce_isolation_ring_begin_use(struct amdgpu_ring *ring) mutex_lock(&adev->enforce_isolation_mutex); if (adev->enforce_isolation[idx]) { if (adev->kfd.init_complete) - amdgpu_gfx_kfd_sch_ctrl(adev, idx, false); + sched_work = true; } mutex_unlock(&adev->enforce_isolation_mutex); + + if (sched_work) + amdgpu_gfx_kfd_sch_ctrl(adev, idx, false); } +/** + * amdgpu_gfx_enforce_isolation_ring_end_use - End use of a ring with enforced isolation + * @ring: Pointer to the amdgpu_ring structure + * + * Ring end_use helper implementation for gfx which serializes access to the + * gfx IP between kernel submission IOCTLs and KFD user queues when isolation + * enforcement is enabled. The kernel submission IOCTLs and KFD user queues + * each get a time slice when both are active. 
+ */ void amdgpu_gfx_enforce_isolation_ring_end_use(struct amdgpu_ring *ring) { struct amdgpu_device *adev = ring->adev; u32 idx; + bool sched_work = false; if (!adev->gfx.enable_cleaner_shader) return; @@ -2031,9 +2110,12 @@ void amdgpu_gfx_enforce_isolation_ring_end_use(struct amdgpu_ring *ring) mutex_lock(&adev->enforce_isolation_mutex); if (adev->enforce_isolation[idx]) { if (adev->kfd.init_complete) - amdgpu_gfx_kfd_sch_ctrl(adev, idx, true); + sched_work = true; } mutex_unlock(&adev->enforce_isolation_mutex); + + if (sched_work) + amdgpu_gfx_kfd_sch_ctrl(adev, idx, true); } /* @@ -2050,7 +2132,7 @@ static int amdgpu_debugfs_gfx_sched_mask_set(void *data, u64 val) if (!adev) return -ENODEV; - mask = (1 << adev->gfx.num_gfx_rings) - 1; + mask = (1ULL << adev->gfx.num_gfx_rings) - 1; if ((val & mask) == 0) return -EINVAL; @@ -2078,7 +2160,7 @@ static int amdgpu_debugfs_gfx_sched_mask_get(void *data, u64 *val) for (i = 0; i < adev->gfx.num_gfx_rings; ++i) { ring = &adev->gfx.gfx_ring[i]; if (ring->sched.ready) - mask |= 1 << i; + mask |= 1ULL << i; } *val = mask; @@ -2120,7 +2202,7 @@ static int amdgpu_debugfs_compute_sched_mask_set(void *data, u64 val) if (!adev) return -ENODEV; - mask = (1 << adev->gfx.num_compute_rings) - 1; + mask = (1ULL << adev->gfx.num_compute_rings) - 1; if ((val & mask) == 0) return -EINVAL; @@ -2149,7 +2231,7 @@ static int amdgpu_debugfs_compute_sched_mask_get(void *data, u64 *val) for (i = 0; i < adev->gfx.num_compute_rings; ++i) { ring = &adev->gfx.compute_ring[i]; if (ring->sched.ready) - mask |= 1 << i; + mask |= 1ULL << i; } *val = mask; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c index 8b512dc28df8..e0bc37557d2c 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c @@ -89,16 +89,14 @@ int amdgpu_ib_get(struct amdgpu_device *adev, struct amdgpu_vm *vm, /** * amdgpu_ib_free - free an IB (Indirect Buffer) * - * @adev: amdgpu_device pointer * 
@ib: IB object to free * @f: the fence SA bo need wait on for the ib alloation * * Free an IB (all asics). */ -void amdgpu_ib_free(struct amdgpu_device *adev, struct amdgpu_ib *ib, - struct dma_fence *f) +void amdgpu_ib_free(struct amdgpu_ib *ib, struct dma_fence *f) { - amdgpu_sa_bo_free(adev, &ib->sa_bo, f); + amdgpu_sa_bo_free(&ib->sa_bo, f); } /** @@ -193,8 +191,8 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned int num_ibs, need_ctx_switch = ring->current_ctx != fence_ctx; if (ring->funcs->emit_pipeline_sync && job && ((tmp = amdgpu_sync_get_fence(&job->explicit_sync)) || - (amdgpu_sriov_vf(adev) && need_ctx_switch) || - amdgpu_vm_need_pipeline_sync(ring, job))) { + need_ctx_switch || amdgpu_vm_need_pipeline_sync(ring, job))) { + need_pipe_sync = true; if (tmp) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c index f3b0aaf3ebc6..901f8b12c672 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c @@ -298,3 +298,9 @@ uint64_t amdgpu_ih_decode_iv_ts_helper(struct amdgpu_ih_ring *ih, u32 rptr, dw2 = le32_to_cpu(ih->ring[ring_index + 2]); return dw1 | ((u64)(dw2 & 0xffff) << 32); } + +const char *amdgpu_ih_ring_name(struct amdgpu_device *adev, struct amdgpu_ih_ring *ih) +{ + return ih == &adev->irq.ih ? "ih" : ih == &adev->irq.ih_soft ? "sw ih" : + ih == &adev->irq.ih1 ? "ih1" : ih == &adev->irq.ih2 ? 
"ih2" : "unknown"; +} diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h index 508f02eb0cf8..7d4395a5d8ac 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h @@ -110,4 +110,5 @@ void amdgpu_ih_decode_iv_helper(struct amdgpu_device *adev, struct amdgpu_iv_entry *entry); uint64_t amdgpu_ih_decode_iv_ts_helper(struct amdgpu_ih_ring *ih, u32 rptr, signed int offset); +const char *amdgpu_ih_ring_name(struct amdgpu_device *adev, struct amdgpu_ih_ring *ih); #endif diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_isp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_isp.c index 263ce1811cc8..732744488b03 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_isp.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_isp.c @@ -77,7 +77,8 @@ static int isp_load_fw_by_psp(struct amdgpu_device *adev) sizeof(ucode_prefix)); /* read isp fw */ - r = amdgpu_ucode_request(adev, &adev->isp.fw, "amdgpu/%s.bin", ucode_prefix); + r = amdgpu_ucode_request(adev, &adev->isp.fw, AMDGPU_UCODE_OPTIONAL, + "amdgpu/%s.bin", ucode_prefix); if (r) { amdgpu_ucode_release(&adev->isp.fw); return r; @@ -128,13 +129,13 @@ static bool isp_is_idle(void *handle) return true; } -static int isp_set_clockgating_state(void *handle, +static int isp_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state) { return 0; } -static int isp_set_powergating_state(void *handle, +static int isp_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state) { return 0; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c index b9d08bc96581..100f04475943 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c @@ -102,8 +102,6 @@ static enum drm_gpu_sched_stat amdgpu_job_timedout(struct drm_sched_job *s_job) return DRM_GPU_SCHED_STAT_ENODEV; } - adev->job_hang = true; - /* * Do the coredump immediately after a job timeout to 
get a very * close dump/snapshot/representation of GPU's current error status @@ -181,7 +179,6 @@ static enum drm_gpu_sched_stat amdgpu_job_timedout(struct drm_sched_job *s_job) } exit: - adev->job_hang = false; drm_dev_exit(idx); return DRM_GPU_SCHED_STAT_NOMINAL; } @@ -197,11 +194,6 @@ int amdgpu_job_alloc(struct amdgpu_device *adev, struct amdgpu_vm *vm, if (!*job) return -ENOMEM; - /* - * Initialize the scheduler to at least some ring so that we always - * have a pointer to adev. - */ - (*job)->base.sched = &adev->rings[0]->sched; (*job)->vm = vm; amdgpu_sync_create(&(*job)->explicit_sync); @@ -255,7 +247,6 @@ void amdgpu_job_set_resources(struct amdgpu_job *job, struct amdgpu_bo *gds, void amdgpu_job_free_resources(struct amdgpu_job *job) { - struct amdgpu_ring *ring = to_amdgpu_ring(job->base.sched); struct dma_fence *f; unsigned i; @@ -268,7 +259,7 @@ void amdgpu_job_free_resources(struct amdgpu_job *job) f = NULL; for (i = 0; i < job->num_ibs; ++i) - amdgpu_ib_free(ring->adev, &job->ibs[i], f); + amdgpu_ib_free(&job->ibs[i], f); } static void amdgpu_job_free_cb(struct drm_sched_job *s_job) @@ -367,6 +358,13 @@ amdgpu_job_prepare_job(struct drm_sched_job *sched_job, dev_err(ring->adev->dev, "Error getting VM ID (%d)\n", r); goto error; } + /* + * The VM structure might be released after the VMID is + * assigned, we had multiple problems with people trying to use + * the VM pointer so better set it to NULL. 
+ */ + if (!fence) + job->vm = NULL; } return fence; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_jpeg.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_jpeg.h index 3eb4a4653fce..d9cb343a8708 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_jpeg.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_jpeg.h @@ -27,7 +27,8 @@ #include "amdgpu_ras.h" #define AMDGPU_MAX_JPEG_INSTANCES 4 -#define AMDGPU_MAX_JPEG_RINGS 8 +#define AMDGPU_MAX_JPEG_RINGS 10 +#define AMDGPU_MAX_JPEG_RINGS_4_0_3 8 #define AMDGPU_JPEG_HARVEST_JPEG0 (1 << 0) #define AMDGPU_JPEG_HARVEST_JPEG1 (1 << 1) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c index 59ec20b07a6a..32b27a1658e7 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c @@ -1610,10 +1610,12 @@ int amdgpu_mes_init_microcode(struct amdgpu_device *adev, int pipe) pipe == AMDGPU_MES_SCHED_PIPE ? "" : "1"); } - r = amdgpu_ucode_request(adev, &adev->mes.fw[pipe], "%s", fw_name); + r = amdgpu_ucode_request(adev, &adev->mes.fw[pipe], AMDGPU_UCODE_REQUIRED, + "%s", fw_name); if (r && need_retry && pipe == AMDGPU_MES_SCHED_PIPE) { dev_info(adev->dev, "try to fall back to %s_mes.bin\n", ucode_prefix); r = amdgpu_ucode_request(adev, &adev->mes.fw[pipe], + AMDGPU_UCODE_REQUIRED, "amdgpu/%s_mes.bin", ucode_prefix); } diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c index 6852d50caa89..96f4b8904e9a 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c @@ -41,6 +41,7 @@ #include "amdgpu_amdkfd.h" #include "amdgpu_vram_mgr.h" #include "amdgpu_vm.h" +#include "amdgpu_dma_buf.h" /** * DOC: amdgpu_object @@ -324,6 +325,9 @@ error_free: * * Allocates and pins a BO for kernel internal use. * + * This function is exported to allow the V4L2 isp device + * external to drm device to create and access the kernel BO. + * * Note: For bo_ptr new BO is only created if bo_ptr points to NULL. 
* * Returns: @@ -347,6 +351,76 @@ int amdgpu_bo_create_kernel(struct amdgpu_device *adev, return 0; } +EXPORT_SYMBOL(amdgpu_bo_create_kernel); + +/** + * amdgpu_bo_create_isp_user - create user BO for isp + * + * @adev: amdgpu device object + * @dma_buf: DMABUF handle for isp buffer + * @domain: where to place it + * @bo: used to initialize BOs in structures + * @gpu_addr: GPU addr of the pinned BO + * + * Imports isp DMABUF to allocate and pin a user BO for isp internal use. It does + * GART alloc to generate gpu_addr for BO to make it accessible through the + * GART aperture for ISP HW. + * + * This function is exported to allow the V4L2 isp device external to drm device + * to create and access the isp user BO. + * + * Returns: + * 0 on success, negative error code otherwise. + */ +int amdgpu_bo_create_isp_user(struct amdgpu_device *adev, + struct dma_buf *dma_buf, u32 domain, struct amdgpu_bo **bo, + u64 *gpu_addr) + +{ + struct drm_gem_object *gem_obj; + int r; + + gem_obj = amdgpu_gem_prime_import(&adev->ddev, dma_buf); + *bo = gem_to_amdgpu_bo(gem_obj); + if (!(*bo)) { + dev_err(adev->dev, "failed to get valid isp user bo\n"); + return -EINVAL; + } + + r = amdgpu_bo_reserve(*bo, false); + if (r) { + dev_err(adev->dev, "(%d) failed to reserve isp user bo\n", r); + return r; + } + + r = amdgpu_bo_pin(*bo, domain); + if (r) { + dev_err(adev->dev, "(%d) isp user bo pin failed\n", r); + goto error_unreserve; + } + + r = amdgpu_ttm_alloc_gart(&(*bo)->tbo); + if (r) { + dev_err(adev->dev, "%p bind failed\n", *bo); + goto error_unpin; + } + + if (!WARN_ON(!gpu_addr)) + *gpu_addr = amdgpu_bo_gpu_offset(*bo); + + amdgpu_bo_unreserve(*bo); + + return 0; + +error_unpin: + amdgpu_bo_unpin(*bo); +error_unreserve: + amdgpu_bo_unreserve(*bo); + amdgpu_bo_unref(bo); + + return r; +} +EXPORT_SYMBOL(amdgpu_bo_create_isp_user); /** * amdgpu_bo_create_kernel_at - create BO for kernel use at specific location @@ -423,6 +497,9 @@ error: * @cpu_addr: pointer to where the BO's CPU 
memory space address was stored * * unmaps and unpin a BO for kernel internal use. + * + * This function is exported to allow the V4L2 isp device + * external to drm device to free the kernel BO. */ void amdgpu_bo_free_kernel(struct amdgpu_bo **bo, u64 *gpu_addr, void **cpu_addr) @@ -447,6 +524,30 @@ void amdgpu_bo_free_kernel(struct amdgpu_bo **bo, u64 *gpu_addr, if (cpu_addr) *cpu_addr = NULL; } +EXPORT_SYMBOL(amdgpu_bo_free_kernel); + +/** + * amdgpu_bo_free_isp_user - free BO for isp use + * + * @bo: amdgpu isp user BO to free + * + * unpin and unref BO for isp internal use. + * + * This function is exported to allow the V4L2 isp device + * external to drm device to free the isp user BO. + */ +void amdgpu_bo_free_isp_user(struct amdgpu_bo *bo) +{ + if (bo == NULL) + return; + + if (amdgpu_bo_reserve(bo, true) == 0) { + amdgpu_bo_unpin(bo); + amdgpu_bo_unreserve(bo); + } + amdgpu_bo_unref(&bo); +} +EXPORT_SYMBOL(amdgpu_bo_free_isp_user); /* Validate bo size is bit bigger than the request domain */ static bool amdgpu_bo_validate_size(struct amdgpu_device *adev, @@ -1150,7 +1251,6 @@ void amdgpu_bo_move_notify(struct ttm_buffer_object *bo, bool evict, struct ttm_resource *new_mem) { - struct amdgpu_device *adev = amdgpu_ttm_adev(bo->bdev); struct ttm_resource *old_mem = bo->resource; struct amdgpu_bo *abo; @@ -1158,7 +1258,7 @@ void amdgpu_bo_move_notify(struct ttm_buffer_object *bo, return; abo = ttm_to_amdgpu_bo(bo); - amdgpu_vm_bo_invalidate(adev, abo, evict); + amdgpu_vm_bo_move(abo, new_mem, evict); amdgpu_bo_kunmap(abo); @@ -1171,75 +1271,6 @@ void amdgpu_bo_move_notify(struct ttm_buffer_object *bo, old_mem ? 
old_mem->mem_type : -1); } -void amdgpu_bo_get_memory(struct amdgpu_bo *bo, - struct amdgpu_mem_stats *stats, - unsigned int sz) -{ - const unsigned int domain_to_pl[] = { - [ilog2(AMDGPU_GEM_DOMAIN_CPU)] = TTM_PL_SYSTEM, - [ilog2(AMDGPU_GEM_DOMAIN_GTT)] = TTM_PL_TT, - [ilog2(AMDGPU_GEM_DOMAIN_VRAM)] = TTM_PL_VRAM, - [ilog2(AMDGPU_GEM_DOMAIN_GDS)] = AMDGPU_PL_GDS, - [ilog2(AMDGPU_GEM_DOMAIN_GWS)] = AMDGPU_PL_GWS, - [ilog2(AMDGPU_GEM_DOMAIN_OA)] = AMDGPU_PL_OA, - [ilog2(AMDGPU_GEM_DOMAIN_DOORBELL)] = AMDGPU_PL_DOORBELL, - }; - struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev); - struct ttm_resource *res = bo->tbo.resource; - struct drm_gem_object *obj = &bo->tbo.base; - uint64_t size = amdgpu_bo_size(bo); - unsigned int type; - - if (!res) { - /* - * If no backing store use one of the preferred domain for basic - * stats. We take the MSB since that should give a reasonable - * view. - */ - BUILD_BUG_ON(TTM_PL_VRAM < TTM_PL_TT || - TTM_PL_VRAM < TTM_PL_SYSTEM); - type = fls(bo->preferred_domains & AMDGPU_GEM_DOMAIN_MASK); - if (!type) - return; - type--; - if (drm_WARN_ON_ONCE(&adev->ddev, - type >= ARRAY_SIZE(domain_to_pl))) - return; - type = domain_to_pl[type]; - } else { - type = res->mem_type; - } - - if (drm_WARN_ON_ONCE(&adev->ddev, type >= sz)) - return; - - /* DRM stats common fields: */ - - if (drm_gem_object_is_shared_for_memory_stats(obj)) - stats[type].drm.shared += size; - else - stats[type].drm.private += size; - - if (res) { - stats[type].drm.resident += size; - - if (!dma_resv_test_signaled(obj->resv, DMA_RESV_USAGE_BOOKKEEP)) - stats[type].drm.active += size; - else if (bo->flags & AMDGPU_GEM_CREATE_DISCARDABLE) - stats[type].drm.purgeable += size; - } - - /* amdgpu specific stats: */ - - if (bo->preferred_domains & AMDGPU_GEM_DOMAIN_VRAM) { - stats[TTM_PL_VRAM].requested += size; - if (type != TTM_PL_VRAM) - stats[TTM_PL_VRAM].evicted += size; - } else if (bo->preferred_domains & AMDGPU_GEM_DOMAIN_GTT) { - stats[TTM_PL_TT].requested += 
size; - } -} - /** * amdgpu_bo_release_notify - notification about a BO being released * @bo: pointer to a buffer object @@ -1454,6 +1485,45 @@ u64 amdgpu_bo_gpu_offset_no_check(struct amdgpu_bo *bo) return amdgpu_gmc_sign_extend(offset); } +/** + * amdgpu_bo_mem_stats_placement - bo placement for memory accounting + * @bo: the buffer object we should look at + * + * BO can have multiple preferred placements, to avoid double counting we want + * to file it under a single placement for memory stats. + * Luckily, if we take the highest set bit in preferred_domains the result is + * quite sensible. + * + * Returns: + * Which of the placements should the BO be accounted under. + */ +uint32_t amdgpu_bo_mem_stats_placement(struct amdgpu_bo *bo) +{ + uint32_t domain = bo->preferred_domains & AMDGPU_GEM_DOMAIN_MASK; + + if (!domain) + return TTM_PL_SYSTEM; + + switch (rounddown_pow_of_two(domain)) { + case AMDGPU_GEM_DOMAIN_CPU: + return TTM_PL_SYSTEM; + case AMDGPU_GEM_DOMAIN_GTT: + return TTM_PL_TT; + case AMDGPU_GEM_DOMAIN_VRAM: + return TTM_PL_VRAM; + case AMDGPU_GEM_DOMAIN_GDS: + return AMDGPU_PL_GDS; + case AMDGPU_GEM_DOMAIN_GWS: + return AMDGPU_PL_GWS; + case AMDGPU_GEM_DOMAIN_OA: + return AMDGPU_PL_OA; + case AMDGPU_GEM_DOMAIN_DOORBELL: + return AMDGPU_PL_DOORBELL; + default: + return TTM_PL_SYSTEM; + } +} + /** * amdgpu_bo_get_preferred_domain - get preferred domain * @adev: amdgpu device object diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h index be6769852ece..375448627f7b 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h @@ -260,6 +260,10 @@ int amdgpu_bo_create_kernel(struct amdgpu_device *adev, unsigned long size, int align, u32 domain, struct amdgpu_bo **bo_ptr, u64 *gpu_addr, void **cpu_addr); +int amdgpu_bo_create_isp_user(struct amdgpu_device *adev, + struct dma_buf *dbuf, u32 domain, + struct amdgpu_bo **bo, + u64 *gpu_addr); int 
amdgpu_bo_create_kernel_at(struct amdgpu_device *adev, uint64_t offset, uint64_t size, struct amdgpu_bo **bo_ptr, void **cpu_addr); @@ -271,6 +275,7 @@ int amdgpu_bo_create_vm(struct amdgpu_device *adev, struct amdgpu_bo_vm **ubo_ptr); void amdgpu_bo_free_kernel(struct amdgpu_bo **bo, u64 *gpu_addr, void **cpu_addr); +void amdgpu_bo_free_isp_user(struct amdgpu_bo *bo); int amdgpu_bo_kmap(struct amdgpu_bo *bo, void **ptr); void *amdgpu_bo_kptr(struct amdgpu_bo *bo); void amdgpu_bo_kunmap(struct amdgpu_bo *bo); @@ -300,9 +305,7 @@ int amdgpu_bo_sync_wait_resv(struct amdgpu_device *adev, struct dma_resv *resv, int amdgpu_bo_sync_wait(struct amdgpu_bo *bo, void *owner, bool intr); u64 amdgpu_bo_gpu_offset(struct amdgpu_bo *bo); u64 amdgpu_bo_gpu_offset_no_check(struct amdgpu_bo *bo); -void amdgpu_bo_get_memory(struct amdgpu_bo *bo, - struct amdgpu_mem_stats *stats, - unsigned int size); +uint32_t amdgpu_bo_mem_stats_placement(struct amdgpu_bo *bo); uint32_t amdgpu_bo_get_preferred_domain(struct amdgpu_device *adev, uint32_t domain); @@ -337,8 +340,7 @@ int amdgpu_sa_bo_manager_start(struct amdgpu_device *adev, int amdgpu_sa_bo_new(struct amdgpu_sa_manager *sa_manager, struct drm_suballoc **sa_bo, unsigned int size); -void amdgpu_sa_bo_free(struct amdgpu_device *adev, - struct drm_suballoc **sa_bo, +void amdgpu_sa_bo_free(struct drm_suballoc **sa_bo, struct dma_fence *fence); #if defined(CONFIG_DEBUG_FS) void amdgpu_sa_bo_dump_debug_info(struct amdgpu_sa_manager *sa_manager, diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c index 448f9e742983..babe94ade247 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c @@ -208,6 +208,7 @@ static int psp_early_init(struct amdgpu_ip_block *ip_block) psp->boot_time_tmr = false; fallthrough; case IP_VERSION(13, 0, 6): + case IP_VERSION(13, 0, 12): case IP_VERSION(13, 0, 14): psp_v13_0_set_psp_funcs(psp); psp->autoload_supported = false; @@ 
-359,6 +360,7 @@ static bool psp_get_runtime_db_entry(struct amdgpu_device *adev, int i; if (amdgpu_ip_version(adev, MP0_HWIP, 0) == IP_VERSION(13, 0, 6) || + amdgpu_ip_version(adev, MP0_HWIP, 0) == IP_VERSION(13, 0, 12) || amdgpu_ip_version(adev, MP0_HWIP, 0) == IP_VERSION(13, 0, 14)) return false; @@ -870,6 +872,7 @@ static bool psp_skip_tmr(struct psp_context *psp) case IP_VERSION(13, 0, 2): case IP_VERSION(13, 0, 6): case IP_VERSION(13, 0, 10): + case IP_VERSION(13, 0, 12): case IP_VERSION(13, 0, 14): return true; default: @@ -2264,7 +2267,8 @@ int psp_securedisplay_invoke(struct psp_context *psp, uint32_t ta_cmd_id) return -EINVAL; if (ta_cmd_id != TA_SECUREDISPLAY_COMMAND__QUERY_TA && - ta_cmd_id != TA_SECUREDISPLAY_COMMAND__SEND_ROI_CRC) + ta_cmd_id != TA_SECUREDISPLAY_COMMAND__SEND_ROI_CRC && + ta_cmd_id != TA_SECUREDISPLAY_COMMAND__SEND_ROI_CRC_V2) return -EINVAL; ret = psp_ta_invoke(psp, ta_cmd_id, &psp->securedisplay_context.context); @@ -2385,6 +2389,15 @@ static int psp_hw_start(struct psp_context *psp) } } + if ((is_psp_fw_valid(psp->spdm_drv)) && + (psp->funcs->bootloader_load_spdm_drv != NULL)) { + ret = psp_bootloader_load_spdm_drv(psp); + if (ret) { + dev_err(adev->dev, "PSP load spdm_drv failed!\n"); + return ret; + } + } + if ((is_psp_fw_valid(psp->sos)) && (psp->funcs->bootloader_load_sos != NULL)) { ret = psp_bootloader_load_sos(psp); @@ -3007,10 +3020,7 @@ static int psp_hw_init(struct amdgpu_ip_block *ip_block) struct amdgpu_device *adev = ip_block->adev; mutex_lock(&adev->firmware.mutex); - /* - * This sequence is just used on hw_init only once, no need on - * resume. 
- */ + ret = amdgpu_ucode_init_bo(adev); if (ret) goto failed; @@ -3135,6 +3145,10 @@ static int psp_resume(struct amdgpu_ip_block *ip_block) mutex_lock(&adev->firmware.mutex); + ret = amdgpu_ucode_init_bo(adev); + if (ret) + goto failed; + ret = psp_hw_start(psp); if (ret) goto failed; @@ -3289,7 +3303,8 @@ int psp_init_asd_microcode(struct psp_context *psp, const char *chip_name) const struct psp_firmware_header_v1_0 *asd_hdr; int err = 0; - err = amdgpu_ucode_request(adev, &adev->psp.asd_fw, "amdgpu/%s_asd.bin", chip_name); + err = amdgpu_ucode_request(adev, &adev->psp.asd_fw, AMDGPU_UCODE_REQUIRED, + "amdgpu/%s_asd.bin", chip_name); if (err) goto out; @@ -3311,7 +3326,8 @@ int psp_init_toc_microcode(struct psp_context *psp, const char *chip_name) const struct psp_firmware_header_v1_0 *toc_hdr; int err = 0; - err = amdgpu_ucode_request(adev, &adev->psp.toc_fw, "amdgpu/%s_toc.bin", chip_name); + err = amdgpu_ucode_request(adev, &adev->psp.toc_fw, AMDGPU_UCODE_REQUIRED, + "amdgpu/%s_toc.bin", chip_name); if (err) goto out; @@ -3407,6 +3423,12 @@ static int parse_sos_bin_descriptor(struct psp_context *psp, psp->ipkeymgr_drv.size_bytes = le32_to_cpu(desc->size_bytes); psp->ipkeymgr_drv.start_addr = ucode_start_addr; break; + case PSP_FW_TYPE_PSP_SPDM_DRV: + psp->spdm_drv.fw_version = le32_to_cpu(desc->fw_version); + psp->spdm_drv.feature_version = le32_to_cpu(desc->fw_version); + psp->spdm_drv.size_bytes = le32_to_cpu(desc->size_bytes); + psp->spdm_drv.start_addr = ucode_start_addr; + break; default: dev_warn(psp->adev->dev, "Unsupported PSP FW type: %d\n", desc->fw_type); break; @@ -3474,7 +3496,8 @@ int psp_init_sos_microcode(struct psp_context *psp, const char *chip_name) uint8_t *ucode_array_start_addr; int err = 0; - err = amdgpu_ucode_request(adev, &adev->psp.sos_fw, "amdgpu/%s_sos.bin", chip_name); + err = amdgpu_ucode_request(adev, &adev->psp.sos_fw, AMDGPU_UCODE_REQUIRED, + "amdgpu/%s_sos.bin", chip_name); if (err) goto out; @@ -3750,7 +3773,8 @@ int 
psp_init_ta_microcode(struct psp_context *psp, const char *chip_name) struct amdgpu_device *adev = psp->adev; int err; - err = amdgpu_ucode_request(adev, &adev->psp.ta_fw, "amdgpu/%s_ta.bin", chip_name); + err = amdgpu_ucode_request(adev, &adev->psp.ta_fw, AMDGPU_UCODE_REQUIRED, + "amdgpu/%s_ta.bin", chip_name); if (err) return err; @@ -3785,7 +3809,8 @@ int psp_init_cap_microcode(struct psp_context *psp, const char *chip_name) return -EINVAL; } - err = amdgpu_ucode_request(adev, &adev->psp.cap_fw, "amdgpu/%s_cap.bin", chip_name); + err = amdgpu_ucode_request(adev, &adev->psp.cap_fw, AMDGPU_UCODE_OPTIONAL, + "amdgpu/%s_cap.bin", chip_name); if (err) { if (err == -ENODEV) { dev_warn(adev->dev, "cap microcode does not exist, skip\n"); @@ -3849,13 +3874,13 @@ int psp_config_sq_perfmon(struct psp_context *psp, return ret; } -static int psp_set_clockgating_state(void *handle, +static int psp_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state) { return 0; } -static int psp_set_powergating_state(void *handle, +static int psp_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state) { return 0; @@ -3867,10 +3892,12 @@ static ssize_t psp_usbc_pd_fw_sysfs_read(struct device *dev, { struct drm_device *ddev = dev_get_drvdata(dev); struct amdgpu_device *adev = drm_to_adev(ddev); + struct amdgpu_ip_block *ip_block; uint32_t fw_ver; int ret; - if (!adev->ip_blocks[AMD_IP_BLOCK_TYPE_PSP].status.late_initialized) { + ip_block = amdgpu_device_ip_get_ip_block(adev, AMD_IP_BLOCK_TYPE_PSP); + if (!ip_block || !ip_block->status.late_initialized) { dev_info(adev->dev, "PSP block is not ready yet\n."); return -EBUSY; } @@ -3899,8 +3926,10 @@ static ssize_t psp_usbc_pd_fw_sysfs_write(struct device *dev, struct amdgpu_bo *fw_buf_bo = NULL; uint64_t fw_pri_mc_addr; void *fw_pri_cpu_addr; + struct amdgpu_ip_block *ip_block; - if (!adev->ip_blocks[AMD_IP_BLOCK_TYPE_PSP].status.late_initialized) { + ip_block = 
amdgpu_device_ip_get_ip_block(adev, AMD_IP_BLOCK_TYPE_PSP); + if (!ip_block || !ip_block->status.late_initialized) { dev_err(adev->dev, "PSP block is not ready yet."); return -EBUSY; } @@ -3908,7 +3937,8 @@ static ssize_t psp_usbc_pd_fw_sysfs_write(struct device *dev, if (!drm_dev_enter(ddev, &idx)) return -ENODEV; - ret = amdgpu_ucode_request(adev, &usbc_pd_fw, "amdgpu/%s", buf); + ret = amdgpu_ucode_request(adev, &usbc_pd_fw, AMDGPU_UCODE_REQUIRED, + "amdgpu/%s", buf); if (ret) goto fail; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h index 567cb1f924ca..8d5acc415d38 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h @@ -80,6 +80,7 @@ enum psp_bootloader_cmd { PSP_BL__DRAM_LONG_TRAIN = 0x100000, PSP_BL__DRAM_SHORT_TRAIN = 0x200000, PSP_BL__LOAD_TOS_SPL_TABLE = 0x10000000, + PSP_BL__LOAD_SPDMDRV = 0x20000000, }; enum psp_ring_type { @@ -120,6 +121,7 @@ struct psp_funcs { int (*bootloader_load_dbg_drv)(struct psp_context *psp); int (*bootloader_load_ras_drv)(struct psp_context *psp); int (*bootloader_load_ipkeymgr_drv)(struct psp_context *psp); + int (*bootloader_load_spdm_drv)(struct psp_context *psp); int (*bootloader_load_sos)(struct psp_context *psp); int (*ring_create)(struct psp_context *psp, enum psp_ring_type ring_type); @@ -343,6 +345,7 @@ struct psp_context { struct psp_bin_desc dbg_drv; struct psp_bin_desc ras_drv; struct psp_bin_desc ipkeymgr_drv; + struct psp_bin_desc spdm_drv; /* tmr buffer */ struct amdgpu_bo *tmr_bo; @@ -434,6 +437,9 @@ struct amdgpu_psp_funcs { #define psp_bootloader_load_ipkeymgr_drv(psp) \ ((psp)->funcs->bootloader_load_ipkeymgr_drv ? \ (psp)->funcs->bootloader_load_ipkeymgr_drv((psp)) : 0) +#define psp_bootloader_load_spdm_drv(psp) \ + ((psp)->funcs->bootloader_load_spdm_drv ? \ + (psp)->funcs->bootloader_load_spdm_drv((psp)) : 0) #define psp_bootloader_load_sos(psp) \ ((psp)->funcs->bootloader_load_sos ? 
(psp)->funcs->bootloader_load_sos((psp)) : 0) #define psp_smu_reload_quirk(psp) \ diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c index 4c9fa24dd972..f0924aa3f4e4 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c @@ -36,6 +36,7 @@ #include "amdgpu_xgmi.h" #include "ivsrcid/nbio/irqsrcs_nbif_7_4.h" #include "nbio_v4_3.h" +#include "nbif_v6_3_1.h" #include "nbio_v7_9.h" #include "atom.h" #include "amdgpu_reset.h" @@ -192,7 +193,7 @@ static int amdgpu_reserve_page_direct(struct amdgpu_device *adev, uint64_t addre if (amdgpu_bad_page_threshold != 0) { amdgpu_ras_add_bad_pages(adev, err_data.err_addr, - err_data.err_addr_cnt); + err_data.err_addr_cnt, false); amdgpu_ras_save_bad_pages(adev, NULL); } @@ -2015,6 +2016,7 @@ static bool amdgpu_ras_aca_is_supported(struct amdgpu_device *adev) switch (amdgpu_ip_version(adev, MP0_HWIP, 0)) { case IP_VERSION(13, 0, 6): + case IP_VERSION(13, 0, 12): case IP_VERSION(13, 0, 14): ret = true; break; @@ -2156,6 +2158,16 @@ void amdgpu_ras_interrupt_fatal_error_handler(struct amdgpu_device *adev) /* Fatal error events are handled on host side */ if (amdgpu_sriov_vf(adev)) return; + /** + * If the current interrupt is caused by a non-fatal RAS error, skip + * check for fatal error. For fatal errors, FED status of all devices + * in XGMI hive gets set when the first device gets fatal error + * interrupt. The error gets propagated to other devices as well, so + * make sure to ack the interrupt regardless of FED status. 
+ */ + if (!amdgpu_ras_get_fed_status(adev) && + amdgpu_ras_is_err_state(adev, AMDGPU_RAS_BLOCK__ANY)) + return; if (adev->nbio.ras && adev->nbio.ras->handle_ras_controller_intr_no_bifring) @@ -2185,6 +2197,7 @@ static void amdgpu_ras_interrupt_poison_consumption_handler(struct ras_manager * if (ret) return; + amdgpu_ras_set_err_poison(adev, block_obj->ras_comm.block); /* both query_poison_status and handle_poison_consumption are optional, * but at least one of them should be implemented if we need poison * consumption handler @@ -2717,40 +2730,203 @@ static int amdgpu_ras_realloc_eh_data_space(struct amdgpu_device *adev, return 0; } +static int amdgpu_ras_mca2pa_by_idx(struct amdgpu_device *adev, + struct eeprom_table_record *bps, + struct ras_err_data *err_data) +{ + struct ta_ras_query_address_input addr_in; + uint32_t socket = 0; + int ret = 0; + + if (adev->smuio.funcs && adev->smuio.funcs->get_socket_id) + socket = adev->smuio.funcs->get_socket_id(adev); + + /* reinit err_data */ + err_data->err_addr_cnt = 0; + err_data->err_addr_len = adev->umc.retire_unit; + + memset(&addr_in, 0, sizeof(addr_in)); + addr_in.ma.err_addr = bps->address; + addr_in.ma.socket_id = socket; + addr_in.ma.ch_inst = bps->mem_channel; + /* tell RAS TA the node instance is not used */ + addr_in.ma.node_inst = TA_RAS_INV_NODE; + + if (adev->umc.ras && adev->umc.ras->convert_ras_err_addr) + ret = adev->umc.ras->convert_ras_err_addr(adev, err_data, + &addr_in, NULL, false); + + return ret; +} + +static int amdgpu_ras_mca2pa(struct amdgpu_device *adev, + struct eeprom_table_record *bps, + struct ras_err_data *err_data) +{ + struct ta_ras_query_address_input addr_in; + uint32_t die_id, socket = 0; + + if (adev->smuio.funcs && adev->smuio.funcs->get_socket_id) + socket = adev->smuio.funcs->get_socket_id(adev); + + /* although the die id is obtained from the PA in nps1 mode, the id is + * suitable for any nps mode + */ + if (adev->umc.ras && adev->umc.ras->get_die_id_from_pa) + die_id =
adev->umc.ras->get_die_id_from_pa(adev, bps->address, + bps->retired_page << AMDGPU_GPU_PAGE_SHIFT); + else + return -EINVAL; + + /* reinit err_data */ + err_data->err_addr_cnt = 0; + err_data->err_addr_len = adev->umc.retire_unit; + + memset(&addr_in, 0, sizeof(addr_in)); + addr_in.ma.err_addr = bps->address; + addr_in.ma.ch_inst = bps->mem_channel; + addr_in.ma.umc_inst = bps->mcumc_id; + addr_in.ma.node_inst = die_id; + addr_in.ma.socket_id = socket; + + if (adev->umc.ras && adev->umc.ras->convert_ras_err_addr) + return adev->umc.ras->convert_ras_err_addr(adev, err_data, + &addr_in, NULL, false); + else + return -EINVAL; +} + /* it deal with vram only. */ int amdgpu_ras_add_bad_pages(struct amdgpu_device *adev, - struct eeprom_table_record *bps, int pages) + struct eeprom_table_record *bps, int pages, bool from_rom) { struct amdgpu_ras *con = amdgpu_ras_get_context(adev); struct ras_err_handler_data *data; + struct ras_err_data err_data; + struct eeprom_table_record *err_rec; + struct amdgpu_ras_eeprom_control *control = + &adev->psp.ras_context.ras->eeprom_control; + enum amdgpu_memory_partition nps = AMDGPU_NPS1_PARTITION_MODE; int ret = 0; - uint32_t i; + uint32_t i, j, loop_cnt = 1; + bool find_pages_per_pa = false; if (!con || !con->eh_data || !bps || pages <= 0) return 0; - mutex_lock(&con->recovery_lock); - data = con->eh_data; - if (!data) - goto out; - - for (i = 0; i < pages; i++) { - if (amdgpu_ras_check_bad_page_unlock(con, - bps[i].retired_page << AMDGPU_GPU_PAGE_SHIFT)) - continue; - - if (!data->space_left && - amdgpu_ras_realloc_eh_data_space(adev, data, 256)) { + if (from_rom) { + err_data.err_addr = + kcalloc(adev->umc.retire_unit, + sizeof(struct eeprom_table_record), GFP_KERNEL); + if (!err_data.err_addr) { + dev_warn(adev->dev, "Failed to alloc UMC error address record in mca2pa conversion!\n"); ret = -ENOMEM; goto out; } - amdgpu_ras_reserve_page(adev, bps[i].retired_page); - - memcpy(&data->bps[data->count], &bps[i], sizeof(*data->bps)); - 
data->count++; - data->space_left--; + err_rec = err_data.err_addr; + loop_cnt = adev->umc.retire_unit; + if (adev->gmc.gmc_funcs->query_mem_partition_mode) + nps = adev->gmc.gmc_funcs->query_mem_partition_mode(adev); } + + mutex_lock(&con->recovery_lock); + data = con->eh_data; + if (!data) { + /* Returning 0 as the absence of eh_data is acceptable */ + goto free; + } + + for (i = 0; i < pages; i++) { + if (from_rom && + control->rec_type == AMDGPU_RAS_EEPROM_REC_MCA) { + if (!find_pages_per_pa) { + if (amdgpu_ras_mca2pa_by_idx(adev, &bps[i], &err_data)) { + if (!i && nps == AMDGPU_NPS1_PARTITION_MODE) { + /* may use old RAS TA, use PA to find pages in + * one row + */ + if (amdgpu_umc_pages_in_a_row(adev, &err_data, + bps[i].retired_page << + AMDGPU_GPU_PAGE_SHIFT)) { + ret = -EINVAL; + goto free; + } else { + find_pages_per_pa = true; + } + } else { + /* unsupported cases */ + ret = -EOPNOTSUPP; + goto free; + } + } + } else { + if (amdgpu_umc_pages_in_a_row(adev, &err_data, + bps[i].retired_page << AMDGPU_GPU_PAGE_SHIFT)) { + ret = -EINVAL; + goto free; + } + } + } else { + if (from_rom && !find_pages_per_pa) { + if (bps[i].retired_page & UMC_CHANNEL_IDX_V2) { + /* bad page in any NPS mode in eeprom */ + if (amdgpu_ras_mca2pa_by_idx(adev, &bps[i], &err_data)) { + ret = -EINVAL; + goto free; + } + } else { + /* legacy bad page in eeprom, generated only in + * NPS1 mode + */ + if (amdgpu_ras_mca2pa(adev, &bps[i], &err_data)) { + /* old RAS TA or ASICs which don't support + * converting the address via mca address + */ + if (!i && nps == AMDGPU_NPS1_PARTITION_MODE) { + find_pages_per_pa = true; + err_rec = &bps[i]; + loop_cnt = 1; + } else { + /* non-nps1 mode, old RAS TA + * can't support it + */ + ret = -EOPNOTSUPP; + goto free; + } + } + } + + if (!find_pages_per_pa) + i += (adev->umc.retire_unit - 1); + } else { + err_rec = &bps[i]; + } + } + + for (j = 0; j < loop_cnt; j++) { + if (amdgpu_ras_check_bad_page_unlock(con, + err_rec[j].retired_page <<
AMDGPU_GPU_PAGE_SHIFT)) + continue; + + if (!data->space_left && + amdgpu_ras_realloc_eh_data_space(adev, data, 256)) { + ret = -ENOMEM; + goto free; + } + + amdgpu_ras_reserve_page(adev, err_rec[j].retired_page); + + memcpy(&data->bps[data->count], &(err_rec[j]), + sizeof(struct eeprom_table_record)); + data->count++; + data->space_left--; + } + } + +free: + if (from_rom) + kfree(err_data.err_addr); out: mutex_unlock(&con->recovery_lock); @@ -2768,7 +2944,7 @@ int amdgpu_ras_save_bad_pages(struct amdgpu_device *adev, struct amdgpu_ras *con = amdgpu_ras_get_context(adev); struct ras_err_handler_data *data; struct amdgpu_ras_eeprom_control *control; - int save_count; + int save_count, unit_num, bad_page_num, i; if (!con || !con->eh_data) { if (new_cnt) @@ -2780,19 +2956,32 @@ int amdgpu_ras_save_bad_pages(struct amdgpu_device *adev, mutex_lock(&con->recovery_lock); control = &con->eeprom_control; data = con->eh_data; - save_count = data->count - control->ras_num_recs; + bad_page_num = control->ras_num_bad_pages; + save_count = data->count - bad_page_num; mutex_unlock(&con->recovery_lock); + unit_num = save_count / adev->umc.retire_unit; if (new_cnt) - *new_cnt = save_count / adev->umc.retire_unit; + *new_cnt = unit_num; /* only new entries are saved */ if (save_count > 0) { - if (amdgpu_ras_eeprom_append(control, - &data->bps[control->ras_num_recs], - save_count)) { - dev_err(adev->dev, "Failed to save EEPROM table data!"); - return -EIO; + if (control->rec_type == AMDGPU_RAS_EEPROM_REC_PA) { + if (amdgpu_ras_eeprom_append(control, + &data->bps[control->ras_num_recs], + save_count)) { + dev_err(adev->dev, "Failed to save EEPROM table data!"); + return -EIO; + } + } else { + for (i = 0; i < unit_num; i++) { + if (amdgpu_ras_eeprom_append(control, + &data->bps[bad_page_num + i * adev->umc.retire_unit], + 1)) { + dev_err(adev->dev, "Failed to save EEPROM table data!"); + return -EIO; + } + } } dev_info(adev->dev, "Saved %d pages to EEPROM table.\n", save_count); @@ 
-2821,11 +3010,32 @@ static int amdgpu_ras_load_bad_pages(struct amdgpu_device *adev) return -ENOMEM; ret = amdgpu_ras_eeprom_read(control, bps, control->ras_num_recs); - if (ret) + if (ret) { dev_err(adev->dev, "Failed to load EEPROM table records!"); - else - ret = amdgpu_ras_add_bad_pages(adev, bps, control->ras_num_recs); + } else { + if (control->ras_num_recs > 1 && + adev->umc.ras && adev->umc.ras->convert_ras_err_addr) { + if ((bps[0].address == bps[1].address) && + (bps[0].mem_channel == bps[1].mem_channel)) + control->rec_type = AMDGPU_RAS_EEPROM_REC_PA; + else + control->rec_type = AMDGPU_RAS_EEPROM_REC_MCA; + } + ret = amdgpu_ras_eeprom_check(control); + if (ret) + goto out; + + /* HW not usable */ + if (amdgpu_ras_is_rma(adev)) { + ret = -EHWPOISON; + goto out; + } + + ret = amdgpu_ras_add_bad_pages(adev, bps, control->ras_num_recs, true); + } + +out: kfree(bps); return ret; } @@ -3205,31 +3415,36 @@ static int amdgpu_ras_page_retirement_thread(void *param) int amdgpu_ras_init_badpage_info(struct amdgpu_device *adev) { struct amdgpu_ras *con = amdgpu_ras_get_context(adev); + struct amdgpu_ras_eeprom_control *control; int ret; if (!con || amdgpu_sriov_vf(adev)) return 0; - ret = amdgpu_ras_eeprom_init(&con->eeprom_control); - + control = &con->eeprom_control; + ret = amdgpu_ras_eeprom_init(control); if (ret) return ret; - /* HW not usable */ - if (amdgpu_ras_is_rma(adev)) - return -EHWPOISON; + if (!adev->umc.ras || !adev->umc.ras->convert_ras_err_addr) + control->rec_type = AMDGPU_RAS_EEPROM_REC_PA; - if (con->eeprom_control.ras_num_recs) { + /* default status is MCA storage */ + if (control->ras_num_recs <= 1 && + adev->umc.ras && adev->umc.ras->convert_ras_err_addr) + control->rec_type = AMDGPU_RAS_EEPROM_REC_MCA; + + if (control->ras_num_recs) { ret = amdgpu_ras_load_bad_pages(adev); if (ret) return ret; amdgpu_dpm_send_hbm_bad_pages_num( - adev, con->eeprom_control.ras_num_recs); + adev, control->ras_num_bad_pages); if (con->update_channel_flag == 
true) { amdgpu_dpm_send_hbm_bad_channel_flag( - adev, con->eeprom_control.bad_channel_bitmap); + adev, control->bad_channel_bitmap); con->update_channel_flag = false; } } @@ -3366,6 +3581,7 @@ static bool amdgpu_ras_asic_supported(struct amdgpu_device *adev) switch (amdgpu_ip_version(adev, MP0_HWIP, 0)) { case IP_VERSION(13, 0, 2): case IP_VERSION(13, 0, 6): + case IP_VERSION(13, 0, 12): case IP_VERSION(13, 0, 14): return true; default: @@ -3378,7 +3594,9 @@ static bool amdgpu_ras_asic_supported(struct amdgpu_device *adev) case IP_VERSION(13, 0, 0): case IP_VERSION(13, 0, 6): case IP_VERSION(13, 0, 10): + case IP_VERSION(13, 0, 12): case IP_VERSION(13, 0, 14): + case IP_VERSION(14, 0, 3): return true; default: return false; @@ -3629,6 +3847,7 @@ static void amdgpu_ras_init_reserved_vram_size(struct amdgpu_device *adev) switch (amdgpu_ip_version(adev, MP0_HWIP, 0)) { case IP_VERSION(13, 0, 2): case IP_VERSION(13, 0, 6): + case IP_VERSION(13, 0, 12): case IP_VERSION(13, 0, 14): con->reserved_pages_in_bytes = AMDGPU_RAS_RESERVED_VRAM_SIZE; break; @@ -3704,7 +3923,19 @@ int amdgpu_ras_init(struct amdgpu_device *adev) * check DF RAS */ adev->nbio.ras = &nbio_v4_3_ras; break; + case IP_VERSION(6, 3, 1): + if (adev->ras_hw_enabled & (1 << AMDGPU_RAS_BLOCK__DF)) + /* unlike other generations of nbio ras, + * nbif v6_3_1 only supports fatal error interrupt + * to inform software that DF is frozen due to + * system fatal error event. driver should not + * enable nbio ras in such case.
Instead, + * check DF RAS + */ + adev->nbio.ras = &nbif_v6_3_1_ras; + break; case IP_VERSION(7, 9, 0): + case IP_VERSION(7, 9, 1): if (!adev->gmc.is_app_apu) adev->nbio.ras = &nbio_v7_9_ras; break; @@ -4083,16 +4314,56 @@ bool amdgpu_ras_get_fed_status(struct amdgpu_device *adev) if (!ras) return false; - return atomic_read(&ras->fed); + return test_bit(AMDGPU_RAS_BLOCK__LAST, &ras->ras_err_state); } void amdgpu_ras_set_fed(struct amdgpu_device *adev, bool status) { struct amdgpu_ras *ras; + ras = amdgpu_ras_get_context(adev); + if (ras) { + if (status) + set_bit(AMDGPU_RAS_BLOCK__LAST, &ras->ras_err_state); + else + clear_bit(AMDGPU_RAS_BLOCK__LAST, &ras->ras_err_state); + } +} + +void amdgpu_ras_clear_err_state(struct amdgpu_device *adev) +{ + struct amdgpu_ras *ras; + ras = amdgpu_ras_get_context(adev); if (ras) - atomic_set(&ras->fed, !!status); + ras->ras_err_state = 0; +} + +void amdgpu_ras_set_err_poison(struct amdgpu_device *adev, + enum amdgpu_ras_block block) +{ + struct amdgpu_ras *ras; + + ras = amdgpu_ras_get_context(adev); + if (ras) + set_bit(block, &ras->ras_err_state); +} + +bool amdgpu_ras_is_err_state(struct amdgpu_device *adev, int block) +{ + struct amdgpu_ras *ras; + + ras = amdgpu_ras_get_context(adev); + if (ras) { + if (block == AMDGPU_RAS_BLOCK__ANY) + return (ras->ras_err_state != 0); + else + return test_bit(block, &ras->ras_err_state) || + test_bit(AMDGPU_RAS_BLOCK__LAST, + &ras->ras_err_state); + } + + return false; } static struct ras_event_manager *__get_ras_event_mgr(struct amdgpu_device *adev) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h index 6db772ecfee4..82db986c36a0 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h @@ -99,7 +99,8 @@ enum amdgpu_ras_block { AMDGPU_RAS_BLOCK__IH, AMDGPU_RAS_BLOCK__MPIO, - AMDGPU_RAS_BLOCK__LAST + AMDGPU_RAS_BLOCK__LAST, + AMDGPU_RAS_BLOCK__ANY = -1 }; enum amdgpu_ras_mca_block { @@ -482,6 +483,8 @@ 
struct ras_ecc_err { uint64_t ipid; uint64_t addr; uint64_t pa_pfn; + /* save global channel index across all UMC instances */ + uint32_t channel_idx; struct ras_err_pages err_pages; }; @@ -558,8 +561,8 @@ struct amdgpu_ras { struct ras_ecc_log_info umc_ecc_log; struct delayed_work page_retirement_dwork; - /* Fatal error detected flag */ - atomic_t fed; + /* ras errors detected */ + unsigned long ras_err_state; /* RAS event manager */ struct ras_event_manager __event_mgr; @@ -750,7 +753,7 @@ int amdgpu_ras_query_error_count(struct amdgpu_device *adev, /* error handling functions */ int amdgpu_ras_add_bad_pages(struct amdgpu_device *adev, - struct eeprom_table_record *bps, int pages); + struct eeprom_table_record *bps, int pages, bool from_rom); int amdgpu_ras_save_bad_pages(struct amdgpu_device *adev, unsigned long *new_cnt); @@ -952,6 +955,10 @@ ssize_t amdgpu_ras_aca_sysfs_read(struct device *dev, struct device_attribute *a void amdgpu_ras_set_fed(struct amdgpu_device *adev, bool status); bool amdgpu_ras_get_fed_status(struct amdgpu_device *adev); +void amdgpu_ras_set_err_poison(struct amdgpu_device *adev, + enum amdgpu_ras_block block); +void amdgpu_ras_clear_err_state(struct amdgpu_device *adev); +bool amdgpu_ras_is_err_state(struct amdgpu_device *adev, int block); u64 amdgpu_ras_acquire_event_id(struct amdgpu_device *adev, enum ras_event_type type); int amdgpu_ras_mark_ras_event_caller(struct amdgpu_device *adev, enum ras_event_type type, diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c index f28f6b4ba765..52c16bfeccaa 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c @@ -470,9 +470,10 @@ int amdgpu_ras_eeprom_reset_table(struct amdgpu_ras_eeprom_control *control) res = __write_table_ras_info(control); control->ras_num_recs = 0; + control->ras_num_bad_pages = 0; control->ras_fri = 0; - amdgpu_dpm_send_hbm_bad_pages_num(adev, 
control->ras_num_recs); + amdgpu_dpm_send_hbm_bad_pages_num(adev, control->ras_num_bad_pages); control->bad_channel_bitmap = 0; amdgpu_dpm_send_hbm_bad_channel_flag(adev, control->bad_channel_bitmap); @@ -559,7 +560,7 @@ bool amdgpu_ras_eeprom_check_err_threshold(struct amdgpu_device *adev) if (con->eeprom_control.tbl_hdr.header == RAS_TABLE_HDR_BAD) { if (amdgpu_bad_page_threshold == -1) { dev_warn(adev->dev, "RAS records:%d exceed threshold:%d", - con->eeprom_control.ras_num_recs, con->bad_page_cnt_threshold); + con->eeprom_control.ras_num_bad_pages, con->bad_page_cnt_threshold); dev_warn(adev->dev, "But GPU can be operated due to bad_page_threshold = -1.\n"); return false; @@ -621,6 +622,7 @@ amdgpu_ras_eeprom_append_table(struct amdgpu_ras_eeprom_control *control, const u32 num) { struct amdgpu_ras *con = amdgpu_ras_get_context(to_amdgpu_device(control)); + struct amdgpu_device *adev = to_amdgpu_device(control); u32 a, b, i; u8 *buf, *pp; int res; @@ -723,6 +725,12 @@ amdgpu_ras_eeprom_append_table(struct amdgpu_ras_eeprom_control *control, control->ras_num_recs = 1 + (control->ras_max_record_count + b - control->ras_fri) % control->ras_max_record_count; + + if (control->rec_type == AMDGPU_RAS_EEPROM_REC_PA) + control->ras_num_bad_pages = control->ras_num_recs; + else + control->ras_num_bad_pages = + control->ras_num_recs * adev->umc.retire_unit; Out: kfree(buf); return res; @@ -740,10 +748,10 @@ amdgpu_ras_eeprom_update_header(struct amdgpu_ras_eeprom_control *control) /* Modify the header if it exceeds. 
*/ if (amdgpu_bad_page_threshold != 0 && - control->ras_num_recs >= ras->bad_page_cnt_threshold) { + control->ras_num_bad_pages >= ras->bad_page_cnt_threshold) { dev_warn(adev->dev, "Saved bad pages %d reaches threshold value %d\n", - control->ras_num_recs, ras->bad_page_cnt_threshold); + control->ras_num_bad_pages, ras->bad_page_cnt_threshold); control->tbl_hdr.header = RAS_TABLE_HDR_BAD; if (control->tbl_hdr.version == RAS_TABLE_VER_V2_1) { control->tbl_rai.rma_status = GPU_RETIRED__ECC_REACH_THRESHOLD; @@ -798,9 +806,9 @@ amdgpu_ras_eeprom_update_header(struct amdgpu_ras_eeprom_control *control) */ if (amdgpu_bad_page_threshold != 0 && control->tbl_hdr.version == RAS_TABLE_VER_V2_1 && - control->ras_num_recs < ras->bad_page_cnt_threshold) + control->ras_num_bad_pages < ras->bad_page_cnt_threshold) control->tbl_rai.health_percent = ((ras->bad_page_cnt_threshold - - control->ras_num_recs) * 100) / + control->ras_num_bad_pages) * 100) / ras->bad_page_cnt_threshold; /* Recalc the checksum. 
@@ -841,7 +849,7 @@ int amdgpu_ras_eeprom_append(struct amdgpu_ras_eeprom_control *control, const u32 num) { struct amdgpu_device *adev = to_amdgpu_device(control); - int res; + int res, i; if (!__is_ras_eeprom_supported(adev)) return 0; @@ -855,6 +863,10 @@ int amdgpu_ras_eeprom_append(struct amdgpu_ras_eeprom_control *control, return -EINVAL; } + /* set the new channel index flag */ + for (i = 0; i < num; i++) + record[i].retired_page |= UMC_CHANNEL_IDX_V2; + mutex_lock(&control->ras_tbl_mutex); res = amdgpu_ras_eeprom_append_table(control, record, num); @@ -864,6 +876,11 @@ int amdgpu_ras_eeprom_append(struct amdgpu_ras_eeprom_control *control, amdgpu_ras_debugfs_set_ret_size(control); mutex_unlock(&control->ras_tbl_mutex); + + /* clear channel index flag, the flag is only saved on eeprom */ + for (i = 0; i < num; i++) + record[i].retired_page &= ~UMC_CHANNEL_IDX_V2; + return res; } @@ -1373,9 +1390,35 @@ int amdgpu_ras_eeprom_init(struct amdgpu_ras_eeprom_control *control) } control->ras_fri = RAS_OFFSET_TO_INDEX(control, hdr->first_rec_offset); + return 0; +} + +int amdgpu_ras_eeprom_check(struct amdgpu_ras_eeprom_control *control) +{ + struct amdgpu_device *adev = to_amdgpu_device(control); + struct amdgpu_ras_eeprom_table_header *hdr = &control->tbl_hdr; + struct amdgpu_ras *ras = amdgpu_ras_get_context(adev); + int res; + + if (!__is_ras_eeprom_supported(adev)) + return 0; + + /* Verify i2c adapter is initialized */ + if (!adev->pm.ras_eeprom_i2c_bus || !adev->pm.ras_eeprom_i2c_bus->algo) + return -ENOENT; + + if (!__get_eeprom_i2c_addr(adev, control)) + return -EINVAL; + + if (control->rec_type == AMDGPU_RAS_EEPROM_REC_PA) + control->ras_num_bad_pages = control->ras_num_recs; + else + control->ras_num_bad_pages = + control->ras_num_recs * adev->umc.retire_unit; + if (hdr->header == RAS_TABLE_HDR_VAL) { DRM_DEBUG_DRIVER("Found existing EEPROM table with %d records", - control->ras_num_recs); + control->ras_num_bad_pages); if (hdr->version == 
RAS_TABLE_VER_V2_1) { res = __read_table_ras_info(control); @@ -1390,9 +1433,9 @@ int amdgpu_ras_eeprom_init(struct amdgpu_ras_eeprom_control *control) /* Warn if we are at 90% of the threshold or above */ - if (10 * control->ras_num_recs >= 9 * ras->bad_page_cnt_threshold) + if (10 * control->ras_num_bad_pages >= 9 * ras->bad_page_cnt_threshold) dev_warn(adev->dev, "RAS records:%u exceeds 90%% of threshold:%d", - control->ras_num_recs, + control->ras_num_bad_pages, ras->bad_page_cnt_threshold); } else if (hdr->header == RAS_TABLE_HDR_BAD && amdgpu_bad_page_threshold != 0) { @@ -1403,10 +1446,12 @@ int amdgpu_ras_eeprom_init(struct amdgpu_ras_eeprom_control *control) } res = __verify_ras_table_checksum(control); - if (res) - DRM_ERROR("RAS Table incorrect checksum or error:%d\n", + if (res) { + dev_err(adev->dev, "RAS Table incorrect checksum or error:%d\n", res); - if (ras->bad_page_cnt_threshold > control->ras_num_recs) { + return -EINVAL; + } + if (ras->bad_page_cnt_threshold > control->ras_num_bad_pages) { /* This means that, the threshold was increased since * the last time the system was booted, and now, * ras->bad_page_cnt_threshold - control->num_recs > 0, @@ -1416,13 +1461,13 @@ int amdgpu_ras_eeprom_init(struct amdgpu_ras_eeprom_control *control) dev_info(adev->dev, "records:%d threshold:%d, resetting " "RAS table header signature", - control->ras_num_recs, + control->ras_num_bad_pages, ras->bad_page_cnt_threshold); res = amdgpu_ras_eeprom_correct_header_tag(control, RAS_TABLE_HDR_VAL); } else { dev_err(adev->dev, "RAS records:%d exceed threshold:%d", - control->ras_num_recs, ras->bad_page_cnt_threshold); + control->ras_num_bad_pages, ras->bad_page_cnt_threshold); if (amdgpu_bad_page_threshold == -1) { dev_warn(adev->dev, "GPU will be initialized due to bad_page_threshold = -1."); res = 0; @@ -1431,7 +1476,7 @@ int amdgpu_ras_eeprom_init(struct amdgpu_ras_eeprom_control *control) dev_err(adev->dev, "RAS records:%d exceed threshold:%d, " "GPU will not be 
initialized. Replace this GPU or increase the threshold", - control->ras_num_recs, ras->bad_page_cnt_threshold); + control->ras_num_bad_pages, ras->bad_page_cnt_threshold); } } } else { diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h index b9ebda577797..81d55cb7b397 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h @@ -43,6 +43,19 @@ enum amdgpu_ras_eeprom_err_type { AMDGPU_RAS_EEPROM_ERR_COUNT, }; +/* + * one UMC MCA address could map to multiple physical addresses (PA), + * such as 1:16, we use eeprom_table_record.address to store MCA + * address and use eeprom_table_record.retired_page to save PA. + * + * AMDGPU_RAS_EEPROM_REC_PA: one record stores one PA + * AMDGPU_RAS_EEPROM_REC_MCA: one record stores one MCA address + */ +enum amdgpu_ras_eeprom_rec_type { + AMDGPU_RAS_EEPROM_REC_PA, + AMDGPU_RAS_EEPROM_REC_MCA, +}; + struct amdgpu_ras_eeprom_table_header { uint32_t header; uint32_t version; @@ -82,6 +95,11 @@ struct amdgpu_ras_eeprom_control { */ u32 ras_num_recs; + /* the bad page number is ras_num_recs or + * ras_num_recs * umc.retire_unit + */ + u32 ras_num_bad_pages; + /* First record index to read, 0-based. * Range is [0, num_recs-1].
This is * an absolute index, starting right after @@ -102,6 +120,7 @@ struct amdgpu_ras_eeprom_control { /* Record channel info which occurred bad pages */ u32 bad_channel_bitmap; + enum amdgpu_ras_eeprom_rec_type rec_type; }; /* @@ -145,6 +164,8 @@ uint32_t amdgpu_ras_eeprom_max_record_count(struct amdgpu_ras_eeprom_control *co void amdgpu_ras_debugfs_set_ret_size(struct amdgpu_ras_eeprom_control *control); +int amdgpu_ras_eeprom_check(struct amdgpu_ras_eeprom_control *control); + extern const struct file_operations amdgpu_ras_debugfs_eeprom_size_ops; extern const struct file_operations amdgpu_ras_debugfs_eeprom_table_ops; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c index a0acb65f4b40..dabfbdf6f1ce 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c @@ -183,6 +183,7 @@ int amdgpu_reset_init(struct amdgpu_device *adev) switch (amdgpu_ip_version(adev, MP1_HWIP, 0)) { case IP_VERSION(13, 0, 2): case IP_VERSION(13, 0, 6): + case IP_VERSION(13, 0, 12): case IP_VERSION(13, 0, 14): ret = aldebaran_reset_init(adev); break; @@ -206,6 +207,7 @@ int amdgpu_reset_fini(struct amdgpu_device *adev) switch (amdgpu_ip_version(adev, MP1_HWIP, 0)) { case IP_VERSION(13, 0, 2): case IP_VERSION(13, 0, 6): + case IP_VERSION(13, 0, 12): case IP_VERSION(13, 0, 14): ret = aldebaran_reset_fini(adev); break; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h index 36fc9578c53c..dee5a1b4e572 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h @@ -462,8 +462,7 @@ int amdgpu_ib_get(struct amdgpu_device *adev, struct amdgpu_vm *vm, unsigned size, enum amdgpu_ib_pool_type pool, struct amdgpu_ib *ib); -void amdgpu_ib_free(struct amdgpu_device *adev, struct amdgpu_ib *ib, - struct dma_fence *f); +void amdgpu_ib_free(struct amdgpu_ib *ib, struct dma_fence *f); int amdgpu_ib_schedule(struct amdgpu_ring 
*ring, unsigned num_ibs, struct amdgpu_ib *ibs, struct amdgpu_job *job, struct dma_fence **f); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sa.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_sa.c index 10df731998b2..39070b2a4c04 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sa.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sa.c @@ -93,8 +93,7 @@ int amdgpu_sa_bo_new(struct amdgpu_sa_manager *sa_manager, return 0; } -void amdgpu_sa_bo_free(struct amdgpu_device *adev, struct drm_suballoc **sa_bo, - struct dma_fence *fence) +void amdgpu_sa_bo_free(struct drm_suballoc **sa_bo, struct dma_fence *fence) { if (sa_bo == NULL || *sa_bo == NULL) { return; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c index 113f0d242618..174badca27e7 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c @@ -219,9 +219,11 @@ int amdgpu_sdma_init_microcode(struct amdgpu_device *adev, amdgpu_ucode_ip_version_decode(adev, SDMA0_HWIP, ucode_prefix, sizeof(ucode_prefix)); if (instance == 0) err = amdgpu_ucode_request(adev, &adev->sdma.instance[instance].fw, + AMDGPU_UCODE_REQUIRED, "amdgpu/%s.bin", ucode_prefix); else err = amdgpu_ucode_request(adev, &adev->sdma.instance[instance].fw, + AMDGPU_UCODE_REQUIRED, "amdgpu/%s%d.bin", ucode_prefix, instance); if (err) goto out; @@ -260,6 +262,8 @@ int amdgpu_sdma_init_microcode(struct amdgpu_device *adev, * groups of SDMAs */ if ((amdgpu_ip_version(adev, SDMA0_HWIP, 0) == IP_VERSION(4, 4, 2) || + amdgpu_ip_version(adev, SDMA0_HWIP, 0) == + IP_VERSION(4, 4, 4) || amdgpu_ip_version(adev, SDMA0_HWIP, 0) == IP_VERSION(4, 4, 5)) && adev->firmware.load_type == @@ -358,13 +362,13 @@ static int amdgpu_debugfs_sdma_sched_mask_set(void *data, u64 val) if (!adev) return -ENODEV; - mask = (1 << adev->sdma.num_instances) - 1; + mask = BIT_ULL(adev->sdma.num_instances) - 1; if ((val & mask) == 0) return -EINVAL; for (i = 0; i < adev->sdma.num_instances; ++i) { ring = 
&adev->sdma.instance[i].ring; - if (val & (1 << i)) + if (val & BIT_ULL(i)) ring->sched.ready = true; else ring->sched.ready = false; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c index 9f922ec50ea2..ff286940ab43 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c @@ -61,7 +61,7 @@ #include "amdgpu_res_cursor.h" #include "bif/bif_4_1_d.h" -MODULE_IMPORT_NS(DMA_BUF); +MODULE_IMPORT_NS("DMA_BUF"); #define AMDGPU_TTM_VRAM_MAX_DW_READ ((size_t)128) @@ -1762,7 +1762,8 @@ static int amdgpu_ttm_reserve_tmr(struct amdgpu_device *adev) if (!adev->bios && (amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(9, 4, 3) || - amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(9, 4, 4))) + amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(9, 4, 4) || + amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(9, 5, 0))) reserve_size = max(reserve_size, (uint32_t)280 << 20); else if (!reserve_size) reserve_size = DISCOVERY_TMR_OFFSET; @@ -2065,6 +2066,7 @@ void amdgpu_ttm_fini(struct amdgpu_device *adev) ttm_range_man_fini(&adev->mman.bdev, AMDGPU_PL_GDS); ttm_range_man_fini(&adev->mman.bdev, AMDGPU_PL_GWS); ttm_range_man_fini(&adev->mman.bdev, AMDGPU_PL_OA); + ttm_range_man_fini(&adev->mman.bdev, AMDGPU_PL_DOORBELL); ttm_device_fini(&adev->mman.bdev); adev->mman.initialized = false; DRM_INFO("amdgpu: ttm finalized\n"); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h index 2852a6064c9a..461fb8090ae0 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h @@ -26,15 +26,15 @@ #include #include +#include #include "amdgpu_vram_mgr.h" -#include "amdgpu.h" #define AMDGPU_PL_GDS (TTM_PL_PRIV + 0) #define AMDGPU_PL_GWS (TTM_PL_PRIV + 1) #define AMDGPU_PL_OA (TTM_PL_PRIV + 2) #define AMDGPU_PL_PREEMPT (TTM_PL_PRIV + 3) #define AMDGPU_PL_DOORBELL (TTM_PL_PRIV + 4) -#define __AMDGPU_PL_LAST (TTM_PL_PRIV + 4) +#define 
__AMDGPU_PL_NUM (TTM_PL_PRIV + 5) #define AMDGPU_GTT_MAX_TRANSFER_SIZE 512 #define AMDGPU_GTT_NUM_TRANSFER_WINDOWS 2 diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c index 4c7b53648a50..cf700824b960 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c @@ -1434,6 +1434,7 @@ void amdgpu_ucode_ip_version_decode(struct amdgpu_device *adev, int block_type, * * @adev: amdgpu device * @fw: pointer to load firmware to + * @required: whether the firmware is required * @fmt: firmware name format string * @...: variable arguments * @@ -1442,7 +1443,7 @@ void amdgpu_ucode_ip_version_decode(struct amdgpu_device *adev, int block_type, * the error code to -ENODEV, so that early_init functions will fail to load. */ int amdgpu_ucode_request(struct amdgpu_device *adev, const struct firmware **fw, - const char *fmt, ...) + enum amdgpu_ucode_required required, const char *fmt, ...) { char fname[AMDGPU_UCODE_NAME_MAX]; va_list ap; @@ -1456,16 +1457,24 @@ int amdgpu_ucode_request(struct amdgpu_device *adev, const struct firmware **fw, return -EOVERFLOW; } - r = request_firmware(fw, fname, adev->dev); + if (required == AMDGPU_UCODE_REQUIRED) + r = request_firmware(fw, fname, adev->dev); + else { + r = firmware_request_nowarn(fw, fname, adev->dev); + if (r) + drm_info(&adev->ddev, "Optional firmware \"%s\" was not found\n", fname); + } if (r) return -ENODEV; r = amdgpu_ucode_validate(*fw); - if (r) { + if (r) + /* + * The amdgpu_ucode_request() should be paired with amdgpu_ucode_release() + * regardless of success/failure, and the amdgpu_ucode_release() takes care of + * firmware release and need to avoid redundant release FW operation here. 
+ */ dev_dbg(adev->dev, "\"%s\" failed to validate\n", fname); - release_firmware(*fw); - *fw = NULL; - } return r; } diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.h index 4150ec0aa10d..4eedd92f000b 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.h @@ -126,6 +126,7 @@ enum psp_fw_type { PSP_FW_TYPE_PSP_DBG_DRV, PSP_FW_TYPE_PSP_RAS_DRV, PSP_FW_TYPE_PSP_IPKEYMGR_DRV, + PSP_FW_TYPE_PSP_SPDM_DRV, PSP_FW_TYPE_MAX_INDEX, }; @@ -551,6 +552,11 @@ enum amdgpu_firmware_load_type { AMDGPU_FW_LOAD_RLC_BACKDOOR_AUTO, }; +enum amdgpu_ucode_required { + AMDGPU_UCODE_OPTIONAL, + AMDGPU_UCODE_REQUIRED, +}; + /* conform to smu_ucode_xfer_cz.h */ #define AMDGPU_SDMA0_UCODE_LOADED 0x00000001 #define AMDGPU_SDMA1_UCODE_LOADED 0x00000002 @@ -604,9 +610,9 @@ void amdgpu_ucode_print_rlc_hdr(const struct common_firmware_header *hdr); void amdgpu_ucode_print_sdma_hdr(const struct common_firmware_header *hdr); void amdgpu_ucode_print_psp_hdr(const struct common_firmware_header *hdr); void amdgpu_ucode_print_gpu_info_hdr(const struct common_firmware_header *hdr); -__printf(3, 4) +__printf(4, 5) int amdgpu_ucode_request(struct amdgpu_device *adev, const struct firmware **fw, - const char *fmt, ...); + enum amdgpu_ucode_required required, const char *fmt, ...); void amdgpu_ucode_release(const struct firmware **fw); bool amdgpu_ucode_hdr_version(union amdgpu_firmware_header *hdr, uint16_t hdr_major, uint16_t hdr_minor); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_umc.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_umc.c index 896f3609b0ee..eafe20d8fe0b 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_umc.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_umc.c @@ -78,7 +78,7 @@ int amdgpu_umc_page_retirement_mca(struct amdgpu_device *adev, if (amdgpu_bad_page_threshold != 0) { amdgpu_ras_add_bad_pages(adev, err_data.err_addr, - err_data.err_addr_cnt); + err_data.err_addr_cnt, false); 
amdgpu_ras_save_bad_pages(adev, NULL); } @@ -166,10 +166,11 @@ void amdgpu_umc_handle_bad_pages(struct amdgpu_device *adev, if ((amdgpu_bad_page_threshold != 0) && err_data->err_addr_cnt) { amdgpu_ras_add_bad_pages(adev, err_data->err_addr, - err_data->err_addr_cnt); + err_data->err_addr_cnt, false); amdgpu_ras_save_bad_pages(adev, &err_count); - amdgpu_dpm_send_hbm_bad_pages_num(adev, con->eeprom_control.ras_num_recs); + amdgpu_dpm_send_hbm_bad_pages_num(adev, + con->eeprom_control.ras_num_bad_pages); if (con->update_channel_flag == true) { amdgpu_dpm_send_hbm_bad_channel_flag(adev, con->eeprom_control.bad_channel_bitmap); @@ -444,3 +445,77 @@ int amdgpu_umc_logs_ecc_err(struct amdgpu_device *adev, return ret; } + +int amdgpu_umc_pages_in_a_row(struct amdgpu_device *adev, + struct ras_err_data *err_data, uint64_t pa_addr) +{ + struct ta_ras_query_address_output addr_out; + + /* reinit err_data */ + err_data->err_addr_cnt = 0; + err_data->err_addr_len = adev->umc.retire_unit; + + addr_out.pa.pa = pa_addr; + if (adev->umc.ras && adev->umc.ras->convert_ras_err_addr) + return adev->umc.ras->convert_ras_err_addr(adev, err_data, NULL, + &addr_out, false); + else + return -EINVAL; +} + +int amdgpu_umc_lookup_bad_pages_in_a_row(struct amdgpu_device *adev, + uint64_t pa_addr, uint64_t *pfns, int len) +{ + int i, ret; + struct ras_err_data err_data; + + err_data.err_addr = kcalloc(adev->umc.retire_unit, + sizeof(struct eeprom_table_record), GFP_KERNEL); + if (!err_data.err_addr) { + dev_warn(adev->dev, "Failed to alloc memory in bad page lookup!\n"); + return 0; + } + + ret = amdgpu_umc_pages_in_a_row(adev, &err_data, pa_addr); + if (ret) + goto out; + + for (i = 0; i < adev->umc.retire_unit; i++) { + if (i >= len) + goto out; + + pfns[i] = err_data.err_addr[i].retired_page; + } + ret = i; + +out: + kfree(err_data.err_addr); + return ret; +} + +int amdgpu_umc_mca_to_addr(struct amdgpu_device *adev, + uint64_t err_addr, uint32_t ch, uint32_t umc, + uint32_t node, uint32_t 
socket, + struct ta_ras_query_address_output *addr_out, bool dump_addr) +{ + struct ta_ras_query_address_input addr_in; + int ret; + + memset(&addr_in, 0, sizeof(addr_in)); + addr_in.ma.err_addr = err_addr; + addr_in.ma.ch_inst = ch; + addr_in.ma.umc_inst = umc; + addr_in.ma.node_inst = node; + addr_in.ma.socket_id = socket; + + if (adev->umc.ras && adev->umc.ras->convert_ras_err_addr) { + ret = adev->umc.ras->convert_ras_err_addr(adev, NULL, &addr_in, + addr_out, dump_addr); + if (ret) + return ret; + } else { + return 0; + } + + return 0; +} diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_umc.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_umc.h index ce4179db2a6d..a4a7e61817aa 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_umc.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_umc.h @@ -54,6 +54,22 @@ /* Page retirement tag */ #define UMC_ECC_NEW_DETECTED_TAG 0x1 +/* + * a flag to indicate v2 of channel index stored in eeprom + * + * v1 (legacy way): store channel index within a umc instance in eeprom + * range in UMC v12: 0 ~ 7 + * v2: store global channel index in eeprom + * range in UMC v12: 0 ~ 127 + * + * NOTE: it's better to store it in eeprom_table_record.mem_channel, + * but there is only 8 bits in mem_channel, and the channel number may + * increase in the future, we decide to save it in + * eeprom_table_record.retired_page. retired_page is useless in v2, + * we depend on eeprom_table_record.address instead of retired_page in v2. + * Only 48 bits are saved on eeprom, use bit 47 here. 
+ */ +#define UMC_CHANNEL_IDX_V2 BIT_ULL(47) typedef int (*umc_func)(struct amdgpu_device *adev, uint32_t node_inst, uint32_t umc_inst, uint32_t ch_inst, void *data); @@ -70,6 +86,13 @@ struct amdgpu_umc_ras { enum amdgpu_mca_error_type type, void *ras_error_status); int (*update_ecc_status)(struct amdgpu_device *adev, uint64_t status, uint64_t ipid, uint64_t addr); + int (*convert_ras_err_addr)(struct amdgpu_device *adev, + struct ras_err_data *err_data, + struct ta_ras_query_address_input *addr_in, + struct ta_ras_query_address_output *addr_out, + bool dump_addr); + uint32_t (*get_die_id_from_pa)(struct amdgpu_device *adev, + uint64_t mca_addr, uint64_t retired_page); }; struct amdgpu_umc_funcs { @@ -134,4 +157,12 @@ int amdgpu_umc_logs_ecc_err(struct amdgpu_device *adev, void amdgpu_umc_handle_bad_pages(struct amdgpu_device *adev, void *ras_error_status); +int amdgpu_umc_pages_in_a_row(struct amdgpu_device *adev, + struct ras_err_data *err_data, uint64_t pa_addr); +int amdgpu_umc_lookup_bad_pages_in_a_row(struct amdgpu_device *adev, + uint64_t pa_addr, uint64_t *pfns, int len); +int amdgpu_umc_mca_to_addr(struct amdgpu_device *adev, + uint64_t err_addr, uint32_t ch, uint32_t umc, + uint32_t node, uint32_t socket, + struct ta_ras_query_address_output *addr_out, bool dump_addr); #endif diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_umsch_mm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_umsch_mm.c index bd2d3863c3ed..dde15c6a96e1 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_umsch_mm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_umsch_mm.c @@ -587,7 +587,8 @@ int amdgpu_umsch_mm_init_microcode(struct amdgpu_umsch_mm *umsch) break; } - r = amdgpu_ucode_request(adev, &adev->umsch_mm.fw, "%s", fw_name); + r = amdgpu_ucode_request(adev, &adev->umsch_mm.fw, AMDGPU_UCODE_REQUIRED, + "%s", fw_name); if (r) { release_firmware(adev->umsch_mm.fw); adev->umsch_mm.fw = NULL; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c index 
31fd30dcd593..74758b5ffc6c 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c @@ -260,7 +260,7 @@ int amdgpu_uvd_sw_init(struct amdgpu_device *adev) return -EINVAL; } - r = amdgpu_ucode_request(adev, &adev->uvd.fw, "%s", fw_name); + r = amdgpu_ucode_request(adev, &adev->uvd.fw, AMDGPU_UCODE_REQUIRED, "%s", fw_name); if (r) { dev_err(adev->dev, "amdgpu_uvd: Can't validate firmware \"%s\"\n", fw_name); @@ -551,6 +551,8 @@ static void amdgpu_uvd_force_into_uvd_segment(struct amdgpu_bo *abo) for (i = 0; i < abo->placement.num_placement; ++i) { abo->placements[i].fpfn = 0 >> PAGE_SHIFT; abo->placements[i].lpfn = (256 * 1024 * 1024) >> PAGE_SHIFT; + if (abo->placements[i].mem_type == TTM_PL_VRAM) + abo->placements[i].flags |= TTM_PL_FLAG_CONTIGUOUS; } } diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c index 599d3ca4e0ef..b9060bcd4806 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c @@ -158,7 +158,7 @@ int amdgpu_vce_sw_init(struct amdgpu_device *adev, unsigned long size) return -EINVAL; } - r = amdgpu_ucode_request(adev, &adev->vce.fw, "%s", fw_name); + r = amdgpu_ucode_request(adev, &adev->vce.fw, AMDGPU_UCODE_REQUIRED, "%s", fw_name); if (r) { dev_err(adev->dev, "amdgpu_vce: Can't validate firmware \"%s\"\n", fw_name); @@ -503,7 +503,7 @@ static int amdgpu_vce_get_create_msg(struct amdgpu_ring *ring, uint32_t handle, ib->ptr[i] = 0x0; r = amdgpu_job_submit_direct(job, ring, &f); - amdgpu_ib_free(ring->adev, &ib_msg, f); + amdgpu_ib_free(&ib_msg, f); if (r) goto err; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c index 3e94c3ba1ba2..83faf6e6788a 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c @@ -1,5 +1,5 @@ /* - * Copyright 2016 Advanced Micro Devices, Inc. + * Copyright 2016-2024 Advanced Micro Devices, Inc. 
* All Rights Reserved. * * Permission is hereby granted, free of charge, to any person obtaining a @@ -62,6 +62,7 @@ #define FIRMWARE_VCN4_0_6 "amdgpu/vcn_4_0_6.bin" #define FIRMWARE_VCN4_0_6_1 "amdgpu/vcn_4_0_6_1.bin" #define FIRMWARE_VCN5_0_0 "amdgpu/vcn_5_0_0.bin" +#define FIRMWARE_VCN5_0_1 "amdgpu/vcn_5_0_1.bin" MODULE_FIRMWARE(FIRMWARE_RAVEN); MODULE_FIRMWARE(FIRMWARE_PICASSO); @@ -88,6 +89,7 @@ MODULE_FIRMWARE(FIRMWARE_VCN4_0_5); MODULE_FIRMWARE(FIRMWARE_VCN4_0_6); MODULE_FIRMWARE(FIRMWARE_VCN4_0_6_1); MODULE_FIRMWARE(FIRMWARE_VCN5_0_0); +MODULE_FIRMWARE(FIRMWARE_VCN5_0_1); static void amdgpu_vcn_idle_work_handler(struct work_struct *work); @@ -99,11 +101,15 @@ int amdgpu_vcn_early_init(struct amdgpu_device *adev) amdgpu_ucode_ip_version_decode(adev, UVD_HWIP, ucode_prefix, sizeof(ucode_prefix)); for (i = 0; i < adev->vcn.num_vcn_inst; i++) { if (i == 1 && amdgpu_ip_version(adev, UVD_HWIP, 0) == IP_VERSION(4, 0, 6)) - r = amdgpu_ucode_request(adev, &adev->vcn.fw[i], "amdgpu/%s_%d.bin", ucode_prefix, i); + r = amdgpu_ucode_request(adev, &adev->vcn.inst[i].fw, + AMDGPU_UCODE_REQUIRED, + "amdgpu/%s_%d.bin", ucode_prefix, i); else - r = amdgpu_ucode_request(adev, &adev->vcn.fw[i], "amdgpu/%s.bin", ucode_prefix); + r = amdgpu_ucode_request(adev, &adev->vcn.inst[i].fw, + AMDGPU_UCODE_REQUIRED, + "amdgpu/%s.bin", ucode_prefix); if (r) { - amdgpu_ucode_release(&adev->vcn.fw[i]); + amdgpu_ucode_release(&adev->vcn.inst[i].fw); return r; } } @@ -151,7 +157,7 @@ int amdgpu_vcn_sw_init(struct amdgpu_device *adev) adev->vcn.using_unified_queue = amdgpu_ip_version(adev, UVD_HWIP, 0) >= IP_VERSION(4, 0, 0); - hdr = (const struct common_firmware_header *)adev->vcn.fw[0]->data; + hdr = (const struct common_firmware_header *)adev->vcn.inst[0].fw->data; adev->vcn.fw_version = le32_to_cpu(hdr->ucode_version); /* Bit 20-23, it is encode major and non-zero for new naming convention. 
@@ -270,7 +276,7 @@ int amdgpu_vcn_sw_fini(struct amdgpu_device *adev) for (i = 0; i < adev->vcn.num_enc_rings; ++i) amdgpu_ring_fini(&adev->vcn.inst[j].ring_enc[i]); - amdgpu_ucode_release(&adev->vcn.fw[j]); + amdgpu_ucode_release(&adev->vcn.inst[j].fw); } mutex_destroy(&adev->vcn.vcn1_jpeg1_workaround); @@ -282,7 +288,7 @@ int amdgpu_vcn_sw_fini(struct amdgpu_device *adev) bool amdgpu_vcn_is_disabled_vcn(struct amdgpu_device *adev, enum vcn_ring_type type, uint32_t vcn_instance) { bool ret = false; - int vcn_config = adev->vcn.vcn_config[vcn_instance]; + int vcn_config = adev->vcn.inst[vcn_instance].vcn_config; if ((type == VCN_ENCODE_RING) && (vcn_config & VCN_BLOCK_ENCODE_DISABLE_MASK)) ret = true; @@ -362,12 +368,12 @@ int amdgpu_vcn_resume(struct amdgpu_device *adev) const struct common_firmware_header *hdr; unsigned int offset; - hdr = (const struct common_firmware_header *)adev->vcn.fw[i]->data; + hdr = (const struct common_firmware_header *)adev->vcn.inst[i].fw->data; if (adev->firmware.load_type != AMDGPU_FW_LOAD_PSP) { offset = le32_to_cpu(hdr->ucode_array_offset_bytes); if (drm_dev_enter(adev_to_drm(adev), &idx)) { memcpy_toio(adev->vcn.inst[i].cpu_addr, - adev->vcn.fw[i]->data + offset, + adev->vcn.inst[i].fw->data + offset, le32_to_cpu(hdr->ucode_size_bytes)); drm_dev_exit(idx); } @@ -580,7 +586,7 @@ static int amdgpu_vcn_dec_send_msg(struct amdgpu_ring *ring, if (r) goto err_free; - amdgpu_ib_free(adev, ib_msg, f); + amdgpu_ib_free(ib_msg, f); if (fence) *fence = dma_fence_get(f); @@ -591,7 +597,7 @@ static int amdgpu_vcn_dec_send_msg(struct amdgpu_ring *ring, err_free: amdgpu_job_free(job); err: - amdgpu_ib_free(adev, ib_msg, f); + amdgpu_ib_free(ib_msg, f); return r; } @@ -773,7 +779,7 @@ static int amdgpu_vcn_dec_sw_send_msg(struct amdgpu_ring *ring, if (r) goto err_free; - amdgpu_ib_free(adev, ib_msg, f); + amdgpu_ib_free(ib_msg, f); if (fence) *fence = dma_fence_get(f); @@ -784,7 +790,7 @@ static int amdgpu_vcn_dec_sw_send_msg(struct amdgpu_ring 
*ring, err_free: amdgpu_job_free(job); err: - amdgpu_ib_free(adev, ib_msg, f); + amdgpu_ib_free(ib_msg, f); return r; } @@ -1014,7 +1020,7 @@ int amdgpu_vcn_enc_ring_test_ib(struct amdgpu_ring *ring, long timeout) r = 0; error: - amdgpu_ib_free(adev, &ib, fence); + amdgpu_ib_free(&ib, fence); dma_fence_put(fence); return r; @@ -1025,7 +1031,8 @@ int amdgpu_vcn_unified_ring_test_ib(struct amdgpu_ring *ring, long timeout) struct amdgpu_device *adev = ring->adev; long r; - if (amdgpu_ip_version(adev, UVD_HWIP, 0) != IP_VERSION(4, 0, 3)) { + if ((amdgpu_ip_version(adev, UVD_HWIP, 0) != IP_VERSION(4, 0, 3)) && + (amdgpu_ip_version(adev, UVD_HWIP, 0) != IP_VERSION(5, 0, 1))) { r = amdgpu_vcn_enc_ring_test_ib(ring, timeout); if (r) goto error; @@ -1063,7 +1070,7 @@ void amdgpu_vcn_setup_ucode(struct amdgpu_device *adev) if (adev->vcn.harvest_config & (1 << i)) continue; - hdr = (const struct common_firmware_header *)adev->vcn.fw[i]->data; + hdr = (const struct common_firmware_header *)adev->vcn.inst[i].fw->data; /* currently only support 2 FW instances */ if (i >= 2) { dev_info(adev->dev, "More then 2 VCN FW instances!\n"); @@ -1071,12 +1078,14 @@ void amdgpu_vcn_setup_ucode(struct amdgpu_device *adev) } idx = AMDGPU_UCODE_ID_VCN + i; adev->firmware.ucode[idx].ucode_id = idx; - adev->firmware.ucode[idx].fw = adev->vcn.fw[i]; + adev->firmware.ucode[idx].fw = adev->vcn.inst[i].fw; adev->firmware.fw_size += ALIGN(le32_to_cpu(hdr->ucode_size_bytes), PAGE_SIZE); if (amdgpu_ip_version(adev, UVD_HWIP, 0) == - IP_VERSION(4, 0, 3)) + IP_VERSION(4, 0, 3) || + amdgpu_ip_version(adev, UVD_HWIP, 0) == + IP_VERSION(5, 0, 1)) break; } } @@ -1320,3 +1329,71 @@ void amdgpu_vcn_sysfs_reset_mask_fini(struct amdgpu_device *adev) device_remove_file(adev->dev, &dev_attr_vcn_reset_mask); } } + +/* + * debugfs to enable/disable vcn job submission to specific core or + * instance. It is created only if the queue type is unified. 
+ */ +#if defined(CONFIG_DEBUG_FS) +static int amdgpu_debugfs_vcn_sched_mask_set(void *data, u64 val) +{ + struct amdgpu_device *adev = (struct amdgpu_device *)data; + u32 i; + u64 mask; + struct amdgpu_ring *ring; + + if (!adev) + return -ENODEV; + + mask = (1ULL << adev->vcn.num_vcn_inst) - 1; + if ((val & mask) == 0) + return -EINVAL; + for (i = 0; i < adev->vcn.num_vcn_inst; ++i) { + ring = &adev->vcn.inst[i].ring_enc[0]; + if (val & (1ULL << i)) + ring->sched.ready = true; + else + ring->sched.ready = false; + } + /* publish sched.ready flag update effective immediately across smp */ + smp_rmb(); + return 0; +} + +static int amdgpu_debugfs_vcn_sched_mask_get(void *data, u64 *val) +{ + struct amdgpu_device *adev = (struct amdgpu_device *)data; + u32 i; + u64 mask = 0; + struct amdgpu_ring *ring; + + if (!adev) + return -ENODEV; + for (i = 0; i < adev->vcn.num_vcn_inst; ++i) { + ring = &adev->vcn.inst[i].ring_enc[0]; + if (ring->sched.ready) + mask |= 1ULL << i; + } + *val = mask; + return 0; +} + +DEFINE_DEBUGFS_ATTRIBUTE(amdgpu_debugfs_vcn_sched_mask_fops, + amdgpu_debugfs_vcn_sched_mask_get, + amdgpu_debugfs_vcn_sched_mask_set, "%llx\n"); +#endif + +void amdgpu_debugfs_vcn_sched_mask_init(struct amdgpu_device *adev) +{ +#if defined(CONFIG_DEBUG_FS) + struct drm_minor *minor = adev_to_drm(adev)->primary; + struct dentry *root = minor->debugfs_root; + char name[32]; + + if (adev->vcn.num_vcn_inst <= 1 || !adev->vcn.using_unified_queue) + return; + sprintf(name, "amdgpu_vcn_sched_mask"); + debugfs_create_file(name, 0600, root, adev, + &amdgpu_debugfs_vcn_sched_mask_fops); +#endif +} diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h index 1e32311c1dff..adaf4388ad28 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h @@ -1,5 +1,5 @@ /* - * Copyright 2016 Advanced Micro Devices, Inc. + * Copyright 2016-2024 Advanced Micro Devices, Inc. All rights reserved. 
* * Permission is hereby granted, free of charge, to any person obtaining a * copy of this software and associated documentation files (the "Software"), @@ -163,20 +163,30 @@ #define SOC24_DPG_MODE_OFFSET(ip, inst_idx, reg) \ ({ \ uint32_t internal_reg_offset, addr; \ - bool video_range, aon_range; \ + bool video_range, video1_range, aon_range, aon1_range; \ \ addr = (adev->reg_offset[ip##_HWIP][inst_idx][reg##_BASE_IDX] + reg); \ addr <<= 2; \ video_range = ((((0xFFFFF & addr) >= (VCN_VID_SOC_ADDRESS)) && \ ((0xFFFFF & addr) < ((VCN_VID_SOC_ADDRESS + 0x2600))))); \ + video1_range = ((((0xFFFFF & addr) >= (VCN1_VID_SOC_ADDRESS)) && \ + ((0xFFFFF & addr) < ((VCN1_VID_SOC_ADDRESS + 0x2600))))); \ aon_range = ((((0xFFFFF & addr) >= (VCN_AON_SOC_ADDRESS)) && \ ((0xFFFFF & addr) < ((VCN_AON_SOC_ADDRESS + 0x600))))); \ + aon1_range = ((((0xFFFFF & addr) >= (VCN1_AON_SOC_ADDRESS)) && \ + ((0xFFFFF & addr) < ((VCN1_AON_SOC_ADDRESS + 0x600))))); \ if (video_range) \ internal_reg_offset = ((0xFFFFF & addr) - (VCN_VID_SOC_ADDRESS) + \ (VCN_VID_IP_ADDRESS)); \ else if (aon_range) \ internal_reg_offset = ((0xFFFFF & addr) - (VCN_AON_SOC_ADDRESS) + \ (VCN_AON_IP_ADDRESS)); \ + else if (video1_range) \ + internal_reg_offset = ((0xFFFFF & addr) - (VCN1_VID_SOC_ADDRESS) + \ + (VCN_VID_IP_ADDRESS)); \ + else if (aon1_range) \ + internal_reg_offset = ((0xFFFFF & addr) - (VCN1_AON_SOC_ADDRESS) + \ + (VCN_AON_IP_ADDRESS)); \ else \ internal_reg_offset = (0xFFFFF & addr); \ \ @@ -297,6 +307,9 @@ struct amdgpu_vcn_inst { atomic_t dpg_enc_submission_cnt; struct amdgpu_vcn_fw_shared fw_shared; uint8_t aid_id; + const struct firmware *fw; /* VCN firmware */ + uint8_t vcn_config; + uint32_t vcn_codec_disable_mask; }; struct amdgpu_vcn_ras { @@ -306,15 +319,12 @@ struct amdgpu_vcn_ras { struct amdgpu_vcn { unsigned fw_version; struct delayed_work idle_work; - const struct firmware *fw[AMDGPU_MAX_VCN_INSTANCES]; /* VCN firmware */ unsigned num_enc_rings; enum amd_powergating_state cur_state; 
bool indirect_sram; uint8_t num_vcn_inst; struct amdgpu_vcn_inst inst[AMDGPU_MAX_VCN_INSTANCES]; - uint8_t vcn_config[AMDGPU_MAX_VCN_INSTANCES]; - uint32_t vcn_codec_disable_mask[AMDGPU_MAX_VCN_INSTANCES]; struct amdgpu_vcn_reg internal; struct mutex vcn_pg_lock; struct mutex vcn1_jpeg1_workaround; @@ -523,5 +533,6 @@ int amdgpu_vcn_psp_update_sram(struct amdgpu_device *adev, int inst_idx, int amdgpu_vcn_save_vcpu_bo(struct amdgpu_device *adev); int amdgpu_vcn_sysfs_reset_mask_init(struct amdgpu_device *adev); void amdgpu_vcn_sysfs_reset_mask_fini(struct amdgpu_device *adev); +void amdgpu_debugfs_vcn_sched_mask_init(struct amdgpu_device *adev); #endif diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c index c704e9803e11..0af469ec6fcc 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c @@ -1263,12 +1263,10 @@ static int amdgpu_virt_cache_host_error_counts(struct amdgpu_device *adev, if (used_size > (AMD_SRIOV_RAS_TELEMETRY_SIZE_KB << 10)) return 0; - tmp = kmalloc(used_size, GFP_KERNEL); + tmp = kmemdup(&host_telemetry->body.error_count, used_size, GFP_KERNEL); if (!tmp) return -ENOMEM; - memcpy(tmp, &host_telemetry->body.error_count, used_size); - if (checksum != amd_sriov_msg_checksum(tmp, used_size, 0, 0)) goto out; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c index 8bf28d336807..03308261f894 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c @@ -632,13 +632,13 @@ static bool amdgpu_vkms_is_idle(void *handle) return true; } -static int amdgpu_vkms_set_clockgating_state(void *handle, +static int amdgpu_vkms_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state) { return 0; } -static int amdgpu_vkms_set_powergating_state(void *handle, +static int amdgpu_vkms_set_powergating_state(struct amdgpu_ip_block *ip_block, enum 
amd_powergating_state state) { return 0; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c index 8d9bf7a0857f..5c07777d3239 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c @@ -36,6 +36,7 @@ #include #include #include "amdgpu.h" +#include "amdgpu_vm.h" #include "amdgpu_trace.h" #include "amdgpu_amdkfd.h" #include "amdgpu_gmc.h" @@ -310,6 +311,111 @@ static void amdgpu_vm_bo_reset_state_machine(struct amdgpu_vm *vm) spin_unlock(&vm->status_lock); } +/** + * amdgpu_vm_update_shared - helper to update shared memory stat + * @base: base structure for tracking BO usage in a VM + * + * Takes the vm status_lock and updates the shared memory stat. If the basic + * stat changed (e.g. buffer was moved) amdgpu_vm_update_stats need to be called + * as well. + */ +static void amdgpu_vm_update_shared(struct amdgpu_vm_bo_base *base) +{ + struct amdgpu_vm *vm = base->vm; + struct amdgpu_bo *bo = base->bo; + uint64_t size = amdgpu_bo_size(bo); + uint32_t bo_memtype = amdgpu_bo_mem_stats_placement(bo); + bool shared; + + spin_lock(&vm->status_lock); + shared = drm_gem_object_is_shared_for_memory_stats(&bo->tbo.base); + if (base->shared != shared) { + base->shared = shared; + if (shared) { + vm->stats[bo_memtype].drm.shared += size; + vm->stats[bo_memtype].drm.private -= size; + } else { + vm->stats[bo_memtype].drm.shared -= size; + vm->stats[bo_memtype].drm.private += size; + } + } + spin_unlock(&vm->status_lock); +} + +/** + * amdgpu_vm_bo_update_shared - callback when bo gets shared/unshared + * @bo: amdgpu buffer object + * + * Update the per VM stats for all the vm if needed from private to shared or + * vice versa. 
+ */ +void amdgpu_vm_bo_update_shared(struct amdgpu_bo *bo) +{ + struct amdgpu_vm_bo_base *base; + + for (base = bo->vm_bo; base; base = base->next) + amdgpu_vm_update_shared(base); +} + +/** + * amdgpu_vm_update_stats_locked - helper to update normal memory stat + * @base: base structure for tracking BO usage in a VM + * @res: the ttm_resource to use for the purpose of accounting, may or may not + * be bo->tbo.resource + * @sign: if we should add (+1) or subtract (-1) from the stat + * + * Caller need to have the vm status_lock held. Useful for when multiple update + * need to happen at the same time. + */ +static void amdgpu_vm_update_stats_locked(struct amdgpu_vm_bo_base *base, + struct ttm_resource *res, int sign) +{ + struct amdgpu_vm *vm = base->vm; + struct amdgpu_bo *bo = base->bo; + int64_t size = sign * amdgpu_bo_size(bo); + uint32_t bo_memtype = amdgpu_bo_mem_stats_placement(bo); + + /* For drm-total- and drm-shared-, BO are accounted by their preferred + * placement, see also amdgpu_bo_mem_stats_placement. + */ + if (base->shared) + vm->stats[bo_memtype].drm.shared += size; + else + vm->stats[bo_memtype].drm.private += size; + + if (res && res->mem_type < __AMDGPU_PL_NUM) { + uint32_t res_memtype = res->mem_type; + + vm->stats[res_memtype].drm.resident += size; + /* BO only count as purgeable if it is resident, + * since otherwise there's nothing to purge. + */ + if (bo->flags & AMDGPU_GEM_CREATE_DISCARDABLE) + vm->stats[res_memtype].drm.purgeable += size; + if (!(bo->preferred_domains & amdgpu_mem_type_to_domain(res_memtype))) + vm->stats[bo_memtype].evicted += size; + } +} + +/** + * amdgpu_vm_update_stats - helper to update normal memory stat + * @base: base structure for tracking BO usage in a VM + * @res: the ttm_resource to use for the purpose of accounting, may or may not + * be bo->tbo.resource + * @sign: if we should add (+1) or subtract (-1) from the stat + * + * Updates the basic memory stat when bo is added/deleted/moved. 
+ */ +void amdgpu_vm_update_stats(struct amdgpu_vm_bo_base *base, + struct ttm_resource *res, int sign) +{ + struct amdgpu_vm *vm = base->vm; + + spin_lock(&vm->status_lock); + amdgpu_vm_update_stats_locked(base, res, sign); + spin_unlock(&vm->status_lock); +} + /** * amdgpu_vm_bo_base_init - Adds bo to the list of bos associated with the vm * @@ -333,6 +439,11 @@ void amdgpu_vm_bo_base_init(struct amdgpu_vm_bo_base *base, base->next = bo->vm_bo; bo->vm_bo = base; + spin_lock(&vm->status_lock); + base->shared = drm_gem_object_is_shared_for_memory_stats(&bo->tbo.base); + amdgpu_vm_update_stats_locked(base, bo->tbo.resource, +1); + spin_unlock(&vm->status_lock); + if (!amdgpu_vm_is_bo_always_valid(vm, bo)) return; @@ -674,12 +785,8 @@ int amdgpu_vm_flush(struct amdgpu_ring *ring, struct amdgpu_job *job, pasid_mapping_needed &= adev->gmc.gmc_funcs->emit_pasid_mapping && ring->funcs->emit_wreg; - if (adev->gfx.enable_cleaner_shader && - ring->funcs->emit_cleaner_shader && - job->enforce_isolation) - ring->funcs->emit_cleaner_shader(ring); - - if (!vm_flush_needed && !gds_switch_needed && !need_pipe_sync) + if (!vm_flush_needed && !gds_switch_needed && !need_pipe_sync && + !(job->enforce_isolation && !job->vmid)) return 0; amdgpu_ring_ib_begin(ring); @@ -690,6 +797,11 @@ int amdgpu_vm_flush(struct amdgpu_ring *ring, struct amdgpu_job *job, if (need_pipe_sync) amdgpu_ring_emit_pipeline_sync(ring); + if (adev->gfx.enable_cleaner_shader && + ring->funcs->emit_cleaner_shader && + job->enforce_isolation) + ring->funcs->emit_cleaner_shader(ring); + if (vm_flush_needed) { trace_amdgpu_vm_flush(ring, job->vmid, job->vm_pd_addr); amdgpu_ring_emit_vm_flush(ring, job->vmid, job->vm_pd_addr); @@ -1082,53 +1194,11 @@ error_free: return r; } -static void amdgpu_vm_bo_get_memory(struct amdgpu_bo_va *bo_va, - struct amdgpu_mem_stats *stats, - unsigned int size) -{ - struct amdgpu_vm *vm = bo_va->base.vm; - struct amdgpu_bo *bo = bo_va->base.bo; - - if (!bo) - return; - - /* - * For now 
ignore BOs which are currently locked and potentially - * changing their location. - */ - if (!amdgpu_vm_is_bo_always_valid(vm, bo) && - !dma_resv_trylock(bo->tbo.base.resv)) - return; - - amdgpu_bo_get_memory(bo, stats, size); - if (!amdgpu_vm_is_bo_always_valid(vm, bo)) - dma_resv_unlock(bo->tbo.base.resv); -} - void amdgpu_vm_get_memory(struct amdgpu_vm *vm, - struct amdgpu_mem_stats *stats, - unsigned int size) + struct amdgpu_mem_stats stats[__AMDGPU_PL_NUM]) { - struct amdgpu_bo_va *bo_va, *tmp; - spin_lock(&vm->status_lock); - list_for_each_entry_safe(bo_va, tmp, &vm->idle, base.vm_status) - amdgpu_vm_bo_get_memory(bo_va, stats, size); - - list_for_each_entry_safe(bo_va, tmp, &vm->evicted, base.vm_status) - amdgpu_vm_bo_get_memory(bo_va, stats, size); - - list_for_each_entry_safe(bo_va, tmp, &vm->relocated, base.vm_status) - amdgpu_vm_bo_get_memory(bo_va, stats, size); - - list_for_each_entry_safe(bo_va, tmp, &vm->moved, base.vm_status) - amdgpu_vm_bo_get_memory(bo_va, stats, size); - - list_for_each_entry_safe(bo_va, tmp, &vm->invalidated, base.vm_status) - amdgpu_vm_bo_get_memory(bo_va, stats, size); - - list_for_each_entry_safe(bo_va, tmp, &vm->done, base.vm_status) - amdgpu_vm_bo_get_memory(bo_va, stats, size); + memcpy(stats, vm->stats, sizeof(*stats) * __AMDGPU_PL_NUM); spin_unlock(&vm->status_lock); } @@ -1265,10 +1335,9 @@ int amdgpu_vm_bo_update(struct amdgpu_device *adev, struct amdgpu_bo_va *bo_va, * next command submission. 
*/ if (amdgpu_vm_is_bo_always_valid(vm, bo)) { - uint32_t mem_type = bo->tbo.resource->mem_type; - - if (!(bo->preferred_domains & - amdgpu_mem_type_to_domain(mem_type))) + if (bo->tbo.resource && + !(bo->preferred_domains & + amdgpu_mem_type_to_domain(bo->tbo.resource->mem_type))) amdgpu_vm_bo_evicted(&bo_va->base); else amdgpu_vm_bo_idle(&bo_va->base); @@ -2075,6 +2144,7 @@ void amdgpu_vm_bo_del(struct amdgpu_device *adev, if (*base != &bo_va->base) continue; + amdgpu_vm_update_stats(*base, bo->tbo.resource, -1); *base = bo_va->base.next; break; } @@ -2143,14 +2213,12 @@ bool amdgpu_vm_evictable(struct amdgpu_bo *bo) /** * amdgpu_vm_bo_invalidate - mark the bo as invalid * - * @adev: amdgpu_device pointer * @bo: amdgpu buffer object * @evicted: is the BO evicted * * Mark @bo as invalid. */ -void amdgpu_vm_bo_invalidate(struct amdgpu_device *adev, - struct amdgpu_bo *bo, bool evicted) +void amdgpu_vm_bo_invalidate(struct amdgpu_bo *bo, bool evicted) { struct amdgpu_vm_bo_base *bo_base; @@ -2175,6 +2243,32 @@ void amdgpu_vm_bo_invalidate(struct amdgpu_device *adev, } } +/** + * amdgpu_vm_bo_move - handle BO move + * + * @bo: amdgpu buffer object + * @new_mem: the new placement of the BO move + * @evicted: is the BO evicted + * + * Update the memory stats for the new placement and mark @bo as invalid. 
+ */ +void amdgpu_vm_bo_move(struct amdgpu_bo *bo, struct ttm_resource *new_mem, + bool evicted) +{ + struct amdgpu_vm_bo_base *bo_base; + + for (bo_base = bo->vm_bo; bo_base; bo_base = bo_base->next) { + struct amdgpu_vm *vm = bo_base->vm; + + spin_lock(&vm->status_lock); + amdgpu_vm_update_stats_locked(bo_base, bo->tbo.resource, -1); + amdgpu_vm_update_stats_locked(bo_base, new_mem, +1); + spin_unlock(&vm->status_lock); + } + + amdgpu_vm_bo_invalidate(bo, evicted); +} + /** * amdgpu_vm_get_block_size - calculate VM page table size as power of two * @@ -2594,6 +2688,16 @@ void amdgpu_vm_release_compute(struct amdgpu_device *adev, struct amdgpu_vm *vm) vm->is_compute_context = false; } +static int amdgpu_vm_stats_is_zero(struct amdgpu_vm *vm) +{ + for (int i = 0; i < __AMDGPU_PL_NUM; ++i) { + if (!(drm_memory_stats_is_zero(&vm->stats[i].drm) && + vm->stats[i].evicted == 0)) + return false; + } + return true; +} + /** * amdgpu_vm_fini - tear down a vm instance * @@ -2617,7 +2721,6 @@ void amdgpu_vm_fini(struct amdgpu_device *adev, struct amdgpu_vm *vm) root = amdgpu_bo_ref(vm->root.bo); amdgpu_bo_reserve(root, true); - amdgpu_vm_put_task_info(vm->task_info); amdgpu_vm_set_pasid(adev, vm, 0); dma_fence_wait(vm->last_unlocked, false); dma_fence_put(vm->last_unlocked); @@ -2666,6 +2769,16 @@ void amdgpu_vm_fini(struct amdgpu_device *adev, struct amdgpu_vm *vm) } ttm_lru_bulk_move_fini(&adev->mman.bdev, &vm->lru_bulk_move); + + if (!amdgpu_vm_stats_is_zero(vm)) { + struct amdgpu_task_info *ti = vm->task_info; + + dev_warn(adev->dev, + "VM memory stats for proc %s(%d) task %s(%d) is non-zero when fini\n", + ti->process_name, ti->pid, ti->task_name, ti->tgid); + } + + amdgpu_vm_put_task_info(vm->task_info); } /** diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h index 5d119ac26c4f..a3e128e373bc 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h @@ -35,6 +35,7 @@ #include 
"amdgpu_sync.h" #include "amdgpu_ring.h" #include "amdgpu_ids.h" +#include "amdgpu_ttm.h" struct drm_exec; @@ -202,9 +203,13 @@ struct amdgpu_vm_bo_base { /* protected by bo being reserved */ struct amdgpu_vm_bo_base *next; - /* protected by spinlock */ + /* protected by vm status_lock */ struct list_head vm_status; + /* if the bo is counted as shared in mem stats + * protected by vm status_lock */ + bool shared; + /* protected by the BO being reserved */ bool moved; }; @@ -324,10 +329,7 @@ struct amdgpu_vm_fault_info { struct amdgpu_mem_stats { struct drm_memory_stats drm; - /* buffers that requested this placement */ - uint64_t requested; - /* buffers that requested this placement - * but are currently evicted */ + /* buffers that requested this placement but are currently evicted */ uint64_t evicted; }; @@ -345,6 +347,9 @@ struct amdgpu_vm { /* Lock to protect vm_bo add/del/move on all lists of vm */ spinlock_t status_lock; + /* Memory statistics for this vm, protected by status_lock */ + struct amdgpu_mem_stats stats[__AMDGPU_PL_NUM]; + /* Per-VM and PT BOs who needs a validation */ struct list_head evicted; @@ -524,8 +529,12 @@ int amdgpu_vm_bo_update(struct amdgpu_device *adev, struct amdgpu_bo_va *bo_va, bool clear); bool amdgpu_vm_evictable(struct amdgpu_bo *bo); -void amdgpu_vm_bo_invalidate(struct amdgpu_device *adev, - struct amdgpu_bo *bo, bool evicted); +void amdgpu_vm_bo_invalidate(struct amdgpu_bo *bo, bool evicted); +void amdgpu_vm_update_stats(struct amdgpu_vm_bo_base *base, + struct ttm_resource *new_res, int sign); +void amdgpu_vm_bo_update_shared(struct amdgpu_bo *bo); +void amdgpu_vm_bo_move(struct amdgpu_bo *bo, struct ttm_resource *new_mem, + bool evicted); uint64_t amdgpu_vm_map_gart(const dma_addr_t *pages_addr, uint64_t addr); struct amdgpu_bo_va *amdgpu_vm_bo_find(struct amdgpu_vm *vm, struct amdgpu_bo *bo); @@ -576,8 +585,7 @@ void amdgpu_vm_set_task_info(struct amdgpu_vm *vm); void amdgpu_vm_move_to_lru_tail(struct amdgpu_device *adev, 
struct amdgpu_vm *vm); void amdgpu_vm_get_memory(struct amdgpu_vm *vm, - struct amdgpu_mem_stats *stats, - unsigned int size); + struct amdgpu_mem_stats stats[__AMDGPU_PL_NUM]); int amdgpu_vm_pt_clear(struct amdgpu_device *adev, struct amdgpu_vm *vm, struct amdgpu_bo_vm *vmbo, bool immediate); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c index f78a0434a48f..b0bf21682115 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c @@ -537,6 +537,7 @@ static void amdgpu_vm_pt_free(struct amdgpu_vm_bo_base *entry) if (!entry->bo) return; + amdgpu_vm_update_stats(entry, entry->bo->tbo.resource, -1); entry->bo->vm_bo = NULL; ttm_bo_set_bulk_move(&entry->bo->tbo, NULL); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vpe.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vpe.c index 110b120d7375..121ee17b522b 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vpe.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vpe.c @@ -236,7 +236,8 @@ int amdgpu_vpe_init_microcode(struct amdgpu_vpe *vpe) int ret; amdgpu_ucode_ip_version_decode(adev, VPE_HWIP, fw_prefix, sizeof(fw_prefix)); - ret = amdgpu_ucode_request(adev, &adev->vpe.fw, "amdgpu/%s.bin", fw_prefix); + ret = amdgpu_ucode_request(adev, &adev->vpe.fw, AMDGPU_UCODE_REQUIRED, + "amdgpu/%s.bin", fw_prefix); if (ret) goto out; @@ -646,16 +647,16 @@ static int vpe_ring_preempt_ib(struct amdgpu_ring *ring) return r; } -static int vpe_set_clockgating_state(void *handle, +static int vpe_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state) { return 0; } -static int vpe_set_powergating_state(void *handle, +static int vpe_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state) { - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; struct amdgpu_vpe *vpe = &adev->vpe; if (!adev->pm.dpm_enabled) @@ -833,7 +834,7 @@ static int 
vpe_ring_test_ib(struct amdgpu_ring *ring, long timeout) ret = (le32_to_cpu(adev->wb.wb[index]) == test_pattern) ? 0 : -EINVAL; err1: - amdgpu_ib_free(adev, &ib, NULL); + amdgpu_ib_free(&ib, NULL); dma_fence_put(f); err0: amdgpu_device_wb_free(adev, index); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c index 7d26a962f811..ff5e52025266 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c @@ -567,7 +567,6 @@ static int amdgpu_vram_mgr_new(struct ttm_resource_manager *man, else remaining_size -= size; } - mutex_unlock(&mgr->lock); if (bo->flags & AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS && adjust_dcc_size) { struct drm_buddy_block *dcc_block; @@ -584,6 +583,7 @@ static int amdgpu_vram_mgr_new(struct ttm_resource_manager *man, (u64)vres->base.size, &vres->blocks); } + mutex_unlock(&mgr->lock); vres->base.start = 0; size = max_t(u64, amdgpu_vram_mgr_blocks_size(&vres->blocks), diff --git a/drivers/gpu/drm/amd/amdgpu/cik.c b/drivers/gpu/drm/amd/amdgpu/cik.c index e2cb1f080e88..08d6787893b3 100644 --- a/drivers/gpu/drm/amd/amdgpu/cik.c +++ b/drivers/gpu/drm/amd/amdgpu/cik.c @@ -2161,13 +2161,13 @@ static int cik_common_soft_reset(struct amdgpu_ip_block *ip_block) return 0; } -static int cik_common_set_clockgating_state(void *handle, +static int cik_common_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state) { return 0; } -static int cik_common_set_powergating_state(void *handle, +static int cik_common_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state) { return 0; diff --git a/drivers/gpu/drm/amd/amdgpu/cik_ih.c b/drivers/gpu/drm/amd/amdgpu/cik_ih.c index 1da17755ad53..444563486769 100644 --- a/drivers/gpu/drm/amd/amdgpu/cik_ih.c +++ b/drivers/gpu/drm/amd/amdgpu/cik_ih.c @@ -402,13 +402,13 @@ static int cik_ih_soft_reset(struct amdgpu_ip_block *ip_block) return 0; } -static int 
cik_ih_set_clockgating_state(void *handle, +static int cik_ih_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state) { return 0; } -static int cik_ih_set_powergating_state(void *handle, +static int cik_ih_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state) { return 0; diff --git a/drivers/gpu/drm/amd/amdgpu/cik_sdma.c b/drivers/gpu/drm/amd/amdgpu/cik_sdma.c index ede1a028d48d..d9bd8f3f17e2 100644 --- a/drivers/gpu/drm/amd/amdgpu/cik_sdma.c +++ b/drivers/gpu/drm/amd/amdgpu/cik_sdma.c @@ -133,9 +133,11 @@ static int cik_sdma_init_microcode(struct amdgpu_device *adev) for (i = 0; i < adev->sdma.num_instances; i++) { if (i == 0) err = amdgpu_ucode_request(adev, &adev->sdma.instance[i].fw, + AMDGPU_UCODE_REQUIRED, "amdgpu/%s_sdma.bin", chip_name); else err = amdgpu_ucode_request(adev, &adev->sdma.instance[i].fw, + AMDGPU_UCODE_REQUIRED, "amdgpu/%s_sdma1.bin", chip_name); if (err) goto out; @@ -696,7 +698,7 @@ static int cik_sdma_ring_test_ib(struct amdgpu_ring *ring, long timeout) r = -EINVAL; err1: - amdgpu_ib_free(adev, &ib, NULL); + amdgpu_ib_free(&ib, NULL); dma_fence_put(f); err0: amdgpu_device_wb_free(adev, index); @@ -1189,11 +1191,11 @@ static int cik_sdma_process_illegal_inst_irq(struct amdgpu_device *adev, return 0; } -static int cik_sdma_set_clockgating_state(void *handle, +static int cik_sdma_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state) { bool gate = false; - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; if (state == AMD_CG_STATE_GATE) gate = true; @@ -1204,7 +1206,7 @@ static int cik_sdma_set_clockgating_state(void *handle, return 0; } -static int cik_sdma_set_powergating_state(void *handle, +static int cik_sdma_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state) { return 0; diff --git a/drivers/gpu/drm/amd/amdgpu/cz_ih.c 
b/drivers/gpu/drm/amd/amdgpu/cz_ih.c index d72973bd570d..82586b76aeda 100644 --- a/drivers/gpu/drm/amd/amdgpu/cz_ih.c +++ b/drivers/gpu/drm/amd/amdgpu/cz_ih.c @@ -398,14 +398,14 @@ static int cz_ih_soft_reset(struct amdgpu_ip_block *ip_block) return 0; } -static int cz_ih_set_clockgating_state(void *handle, +static int cz_ih_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state) { // TODO return 0; } -static int cz_ih_set_powergating_state(void *handle, +static int cz_ih_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state) { // TODO diff --git a/drivers/gpu/drm/amd/amdgpu/dce_v10_0.c b/drivers/gpu/drm/amd/amdgpu/dce_v10_0.c index 5098c50d54c8..c5e3d2251b18 100644 --- a/drivers/gpu/drm/amd/amdgpu/dce_v10_0.c +++ b/drivers/gpu/drm/amd/amdgpu/dce_v10_0.c @@ -2687,6 +2687,32 @@ static const struct drm_crtc_helper_funcs dce_v10_0_crtc_helper_funcs = { .get_scanout_position = amdgpu_crtc_get_scanout_position, }; +static void dce_v10_0_panic_flush(struct drm_plane *plane) +{ + struct drm_framebuffer *fb; + struct amdgpu_crtc *amdgpu_crtc; + struct amdgpu_device *adev; + uint32_t fb_format; + + if (!plane->fb) + return; + + fb = plane->fb; + amdgpu_crtc = to_amdgpu_crtc(plane->crtc); + adev = drm_to_adev(fb->dev); + + /* Disable DC tiling */ + fb_format = RREG32(mmGRPH_CONTROL + amdgpu_crtc->crtc_offset); + fb_format &= ~GRPH_CONTROL__GRPH_ARRAY_MODE_MASK; + WREG32(mmGRPH_CONTROL + amdgpu_crtc->crtc_offset, fb_format); + +} + +static const struct drm_plane_helper_funcs dce_v10_0_drm_primary_plane_helper_funcs = { + .get_scanout_buffer = amdgpu_display_get_scanout_buffer, + .panic_flush = dce_v10_0_panic_flush, +}; + static int dce_v10_0_crtc_init(struct amdgpu_device *adev, int index) { struct amdgpu_crtc *amdgpu_crtc; @@ -2734,6 +2760,7 @@ static int dce_v10_0_crtc_init(struct amdgpu_device *adev, int index) amdgpu_crtc->encoder = NULL; amdgpu_crtc->connector = NULL; 
drm_crtc_helper_add(&amdgpu_crtc->base, &dce_v10_0_crtc_helper_funcs); + drm_plane_helper_add(amdgpu_crtc->base.primary, &dce_v10_0_drm_primary_plane_helper_funcs); return 0; } @@ -3302,13 +3329,13 @@ static int dce_v10_0_hpd_irq(struct amdgpu_device *adev, return 0; } -static int dce_v10_0_set_clockgating_state(void *handle, +static int dce_v10_0_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state) { return 0; } -static int dce_v10_0_set_powergating_state(void *handle, +static int dce_v10_0_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state) { return 0; diff --git a/drivers/gpu/drm/amd/amdgpu/dce_v11_0.c b/drivers/gpu/drm/amd/amdgpu/dce_v11_0.c index c5680ff4ab9f..ea42a4472bf6 100644 --- a/drivers/gpu/drm/amd/amdgpu/dce_v11_0.c +++ b/drivers/gpu/drm/amd/amdgpu/dce_v11_0.c @@ -2800,6 +2800,32 @@ static const struct drm_crtc_helper_funcs dce_v11_0_crtc_helper_funcs = { .get_scanout_position = amdgpu_crtc_get_scanout_position, }; +static void dce_v11_0_panic_flush(struct drm_plane *plane) +{ + struct drm_framebuffer *fb; + struct amdgpu_crtc *amdgpu_crtc; + struct amdgpu_device *adev; + uint32_t fb_format; + + if (!plane->fb) + return; + + fb = plane->fb; + amdgpu_crtc = to_amdgpu_crtc(plane->crtc); + adev = drm_to_adev(fb->dev); + + /* Disable DC tiling */ + fb_format = RREG32(mmGRPH_CONTROL + amdgpu_crtc->crtc_offset); + fb_format &= ~GRPH_CONTROL__GRPH_ARRAY_MODE_MASK; + WREG32(mmGRPH_CONTROL + amdgpu_crtc->crtc_offset, fb_format); + +} + +static const struct drm_plane_helper_funcs dce_v11_0_drm_primary_plane_helper_funcs = { + .get_scanout_buffer = amdgpu_display_get_scanout_buffer, + .panic_flush = dce_v11_0_panic_flush, +}; + static int dce_v11_0_crtc_init(struct amdgpu_device *adev, int index) { struct amdgpu_crtc *amdgpu_crtc; @@ -2847,6 +2873,7 @@ static int dce_v11_0_crtc_init(struct amdgpu_device *adev, int index) amdgpu_crtc->encoder = NULL; amdgpu_crtc->connector = NULL; 
drm_crtc_helper_add(&amdgpu_crtc->base, &dce_v11_0_crtc_helper_funcs); + drm_plane_helper_add(amdgpu_crtc->base.primary, &dce_v11_0_drm_primary_plane_helper_funcs); return 0; } @@ -3434,13 +3461,13 @@ static int dce_v11_0_hpd_irq(struct amdgpu_device *adev, return 0; } -static int dce_v11_0_set_clockgating_state(void *handle, +static int dce_v11_0_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state) { return 0; } -static int dce_v11_0_set_powergating_state(void *handle, +static int dce_v11_0_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state) { return 0; diff --git a/drivers/gpu/drm/amd/amdgpu/dce_v6_0.c b/drivers/gpu/drm/amd/amdgpu/dce_v6_0.c index eb7de9122d99..915804a6a1d7 100644 --- a/drivers/gpu/drm/amd/amdgpu/dce_v6_0.c +++ b/drivers/gpu/drm/amd/amdgpu/dce_v6_0.c @@ -2602,6 +2602,32 @@ static const struct drm_crtc_helper_funcs dce_v6_0_crtc_helper_funcs = { .get_scanout_position = amdgpu_crtc_get_scanout_position, }; +static void dce_v6_0_panic_flush(struct drm_plane *plane) +{ + struct drm_framebuffer *fb; + struct amdgpu_crtc *amdgpu_crtc; + struct amdgpu_device *adev; + uint32_t fb_format; + + if (!plane->fb) + return; + + fb = plane->fb; + amdgpu_crtc = to_amdgpu_crtc(plane->crtc); + adev = drm_to_adev(fb->dev); + + /* Disable DC tiling */ + fb_format = RREG32(mmGRPH_CONTROL + amdgpu_crtc->crtc_offset); + fb_format &= ~GRPH_ARRAY_MODE(0x7); + WREG32(mmGRPH_CONTROL + amdgpu_crtc->crtc_offset, fb_format); + +} + +static const struct drm_plane_helper_funcs dce_v6_0_drm_primary_plane_helper_funcs = { + .get_scanout_buffer = amdgpu_display_get_scanout_buffer, + .panic_flush = dce_v6_0_panic_flush, +}; + static int dce_v6_0_crtc_init(struct amdgpu_device *adev, int index) { struct amdgpu_crtc *amdgpu_crtc; @@ -2629,6 +2655,7 @@ static int dce_v6_0_crtc_init(struct amdgpu_device *adev, int index) amdgpu_crtc->encoder = NULL; amdgpu_crtc->connector = NULL; 
drm_crtc_helper_add(&amdgpu_crtc->base, &dce_v6_0_crtc_helper_funcs); + drm_plane_helper_add(amdgpu_crtc->base.primary, &dce_v6_0_drm_primary_plane_helper_funcs); return 0; } @@ -3124,13 +3151,13 @@ static int dce_v6_0_hpd_irq(struct amdgpu_device *adev, } -static int dce_v6_0_set_clockgating_state(void *handle, +static int dce_v6_0_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state) { return 0; } -static int dce_v6_0_set_powergating_state(void *handle, +static int dce_v6_0_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state) { return 0; diff --git a/drivers/gpu/drm/amd/amdgpu/dce_v8_0.c b/drivers/gpu/drm/amd/amdgpu/dce_v8_0.c index 04b79ff87f75..f2edc0fece5b 100644 --- a/drivers/gpu/drm/amd/amdgpu/dce_v8_0.c +++ b/drivers/gpu/drm/amd/amdgpu/dce_v8_0.c @@ -2613,6 +2613,31 @@ static const struct drm_crtc_helper_funcs dce_v8_0_crtc_helper_funcs = { .get_scanout_position = amdgpu_crtc_get_scanout_position, }; +static void dce_v8_0_panic_flush(struct drm_plane *plane) +{ + struct drm_framebuffer *fb; + struct amdgpu_crtc *amdgpu_crtc; + struct amdgpu_device *adev; + uint32_t fb_format; + + if (!plane->fb) + return; + + fb = plane->fb; + amdgpu_crtc = to_amdgpu_crtc(plane->crtc); + adev = drm_to_adev(fb->dev); + + /* Disable DC tiling */ + fb_format = RREG32(mmGRPH_CONTROL + amdgpu_crtc->crtc_offset); + fb_format &= ~GRPH_CONTROL__GRPH_ARRAY_MODE_MASK; + WREG32(mmGRPH_CONTROL + amdgpu_crtc->crtc_offset, fb_format); +} + +static const struct drm_plane_helper_funcs dce_v8_0_drm_primary_plane_helper_funcs = { + .get_scanout_buffer = amdgpu_display_get_scanout_buffer, + .panic_flush = dce_v8_0_panic_flush, +}; + static int dce_v8_0_crtc_init(struct amdgpu_device *adev, int index) { struct amdgpu_crtc *amdgpu_crtc; @@ -2640,6 +2665,7 @@ static int dce_v8_0_crtc_init(struct amdgpu_device *adev, int index) amdgpu_crtc->encoder = NULL; amdgpu_crtc->connector = NULL; 
drm_crtc_helper_add(&amdgpu_crtc->base, &dce_v8_0_crtc_helper_funcs); + drm_plane_helper_add(amdgpu_crtc->base.primary, &dce_v8_0_drm_primary_plane_helper_funcs); return 0; } @@ -3212,13 +3238,13 @@ static int dce_v8_0_hpd_irq(struct amdgpu_device *adev, } -static int dce_v8_0_set_clockgating_state(void *handle, +static int dce_v8_0_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state) { return 0; } -static int dce_v8_0_set_powergating_state(void *handle, +static int dce_v8_0_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state) { return 0; diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c index 24dce803a829..5ba263fe5512 100644 --- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c @@ -45,6 +45,7 @@ #include "clearstate_gfx10.h" #include "v10_structs.h" #include "gfx_v10_0.h" +#include "gfx_v10_0_cleaner_shader.h" #include "nbio_v2_3.h" /* @@ -3673,7 +3674,7 @@ static void gfx_v10_0_ring_invalidate_tlbs(struct amdgpu_ring *ring, static void gfx_v10_0_update_spm_vmid_internal(struct amdgpu_device *adev, unsigned int vmid); -static int gfx_v10_0_set_powergating_state(void *handle, +static int gfx_v10_0_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state); static void gfx10_kiq_set_resources(struct amdgpu_ring *kiq_ring, uint64_t queue_mask) { @@ -4036,7 +4037,7 @@ static int gfx_v10_0_ring_test_ib(struct amdgpu_ring *ring, long timeout) else r = -EINVAL; err2: - amdgpu_ib_free(adev, &ib, NULL); + amdgpu_ib_free(&ib, NULL); dma_fence_put(f); err1: amdgpu_device_wb_free(adev, index); @@ -4138,18 +4139,21 @@ static int gfx_v10_0_init_microcode(struct amdgpu_device *adev) amdgpu_ucode_ip_version_decode(adev, GC_HWIP, ucode_prefix, sizeof(ucode_prefix)); err = amdgpu_ucode_request(adev, &adev->gfx.pfp_fw, + AMDGPU_UCODE_REQUIRED, "amdgpu/%s_pfp%s.bin", ucode_prefix, wks); if (err) goto 
out; amdgpu_gfx_cp_init_microcode(adev, AMDGPU_UCODE_ID_CP_PFP); err = amdgpu_ucode_request(adev, &adev->gfx.me_fw, + AMDGPU_UCODE_REQUIRED, "amdgpu/%s_me%s.bin", ucode_prefix, wks); if (err) goto out; amdgpu_gfx_cp_init_microcode(adev, AMDGPU_UCODE_ID_CP_ME); err = amdgpu_ucode_request(adev, &adev->gfx.ce_fw, + AMDGPU_UCODE_REQUIRED, "amdgpu/%s_ce%s.bin", ucode_prefix, wks); if (err) goto out; @@ -4173,6 +4177,7 @@ static int gfx_v10_0_init_microcode(struct amdgpu_device *adev) } err = amdgpu_ucode_request(adev, &adev->gfx.mec_fw, + AMDGPU_UCODE_REQUIRED, "amdgpu/%s_mec%s.bin", ucode_prefix, wks); if (err) goto out; @@ -4180,6 +4185,7 @@ static int gfx_v10_0_init_microcode(struct amdgpu_device *adev) amdgpu_gfx_cp_init_microcode(adev, AMDGPU_UCODE_ID_CP_MEC1_JT); err = amdgpu_ucode_request(adev, &adev->gfx.mec2_fw, + AMDGPU_UCODE_REQUIRED, "amdgpu/%s_mec2%s.bin", ucode_prefix, wks); if (!err) { amdgpu_gfx_cp_init_microcode(adev, AMDGPU_UCODE_ID_CP_MEC2); @@ -4733,6 +4739,23 @@ static int gfx_v10_0_sw_init(struct amdgpu_ip_block *ip_block) break; } switch (amdgpu_ip_version(adev, GC_HWIP, 0)) { + case IP_VERSION(10, 3, 0): + case IP_VERSION(10, 3, 2): + case IP_VERSION(10, 3, 4): + case IP_VERSION(10, 3, 5): + adev->gfx.cleaner_shader_ptr = gfx_10_3_0_cleaner_shader_hex; + adev->gfx.cleaner_shader_size = sizeof(gfx_10_3_0_cleaner_shader_hex); + if (adev->gfx.me_fw_version >= 64 && + adev->gfx.pfp_fw_version >= 100 && + adev->gfx.mec_fw_version >= 122) { + adev->gfx.enable_cleaner_shader = true; + r = amdgpu_gfx_cleaner_shader_sw_init(adev, adev->gfx.cleaner_shader_size); + if (r) { + adev->gfx.enable_cleaner_shader = false; + dev_err(adev->dev, "Failed to initialize cleaner shader\n"); + } + } + break; default: adev->gfx.enable_cleaner_shader = false; break; @@ -5952,7 +5975,7 @@ static int gfx_v10_0_cp_gfx_enable(struct amdgpu_device *adev, bool enable) else WREG32_SOC15(GC, 0, mmCP_ME_CNTL, tmp); - if (adev->job_hang && !enable) + if (amdgpu_in_reset(adev) && 
!enable) return 0; for (i = 0; i < adev->usec_timeout; i++) { @@ -6599,17 +6622,13 @@ static void gfx_v10_0_kiq_setting(struct amdgpu_ring *ring) tmp = RREG32_SOC15(GC, 0, mmRLC_CP_SCHEDULERS_Sienna_Cichlid); tmp &= 0xffffff00; tmp |= (ring->me << 5) | (ring->pipe << 3) | (ring->queue); - WREG32_SOC15(GC, 0, mmRLC_CP_SCHEDULERS_Sienna_Cichlid, tmp); - tmp |= 0x80; - WREG32_SOC15(GC, 0, mmRLC_CP_SCHEDULERS_Sienna_Cichlid, tmp); + WREG32_SOC15(GC, 0, mmRLC_CP_SCHEDULERS_Sienna_Cichlid, tmp | 0x80); break; default: tmp = RREG32_SOC15(GC, 0, mmRLC_CP_SCHEDULERS); tmp &= 0xffffff00; tmp |= (ring->me << 5) | (ring->pipe << 3) | (ring->queue); - WREG32_SOC15(GC, 0, mmRLC_CP_SCHEDULERS, tmp); - tmp |= 0x80; - WREG32_SOC15(GC, 0, mmRLC_CP_SCHEDULERS, tmp); + WREG32_SOC15(GC, 0, mmRLC_CP_SCHEDULERS, tmp | 0x80); break; } } @@ -7457,7 +7476,7 @@ static int gfx_v10_0_hw_fini(struct amdgpu_ip_block *ip_block) * otherwise the gfxoff disallowing will be failed to set. */ if (amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(10, 3, 1)) - gfx_v10_0_set_powergating_state(ip_block->adev, AMD_PG_STATE_UNGATE); + gfx_v10_0_set_powergating_state(ip_block, AMD_PG_STATE_UNGATE); if (!adev->no_hw_access) { if (amdgpu_async_gfx_ring) { @@ -8345,10 +8364,10 @@ static const struct amdgpu_rlc_funcs gfx_v10_0_rlc_funcs_sriov = { .is_rlcg_access_range = gfx_v10_0_is_rlcg_access_range, }; -static int gfx_v10_0_set_powergating_state(void *handle, +static int gfx_v10_0_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state) { - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; bool enable = (state == AMD_PG_STATE_GATE); if (amdgpu_sriov_vf(adev)) @@ -8383,10 +8402,10 @@ static int gfx_v10_0_set_powergating_state(void *handle, return 0; } -static int gfx_v10_0_set_clockgating_state(void *handle, +static int gfx_v10_0_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state) { - 
struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; if (amdgpu_sriov_vf(adev)) return 0; diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0_cleaner_shader.h b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0_cleaner_shader.h new file mode 100644 index 000000000000..663c2572d440 --- /dev/null +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0_cleaner_shader.h @@ -0,0 +1,56 @@ +/* SPDX-License-Identifier: MIT */ +/* + * Copyright 2025 Advanced Micro Devices, Inc. + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * OTHER DEALINGS IN THE SOFTWARE. 
+ */ + +/* Define the cleaner shader gfx_10_3_0 */ +static const u32 gfx_10_3_0_cleaner_shader_hex[] = { + 0xb0804004, 0xbf8a0000, + 0xbe8203b8, 0xbefc0380, + 0x7e008480, 0x7e028480, + 0x7e048480, 0x7e068480, + 0x7e088480, 0x7e0a8480, + 0x7e0c8480, 0x7e0e8480, + 0xbefc0302, 0x80828802, + 0xbf84fff5, 0xbe8203ff, + 0x80000000, 0x87020002, + 0xbf840012, 0xbefe03c1, + 0xbeff03c1, 0xd7650001, + 0x0001007f, 0xd7660001, + 0x0002027e, 0x16020288, + 0xbe8203bf, 0xbefc03c1, + 0xd9382000, 0x00020201, + 0xd9386040, 0x00040401, + 0xd70f6a01, 0x000202ff, + 0x00000400, 0x80828102, + 0xbf84fff7, 0xbefc03ff, + 0x00000068, 0xbe803080, + 0xbe813080, 0xbe823080, + 0xbe833080, 0x80fc847c, + 0xbf84fffa, 0xbeea0480, + 0xbeec0480, 0xbeee0480, + 0xbef00480, 0xbef20480, + 0xbef40480, 0xbef60480, + 0xbef80480, 0xbefa0480, + 0xbf810000, 0xbf9f0000, + 0xbf9f0000, 0xbf9f0000, + 0xbf9f0000, 0xbf9f0000, +}; diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_3_0_cleaner_shader.asm b/drivers/gpu/drm/amd/amdgpu/gfx_v10_3_0_cleaner_shader.asm new file mode 100644 index 000000000000..0e1c246166c0 --- /dev/null +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_3_0_cleaner_shader.asm @@ -0,0 +1,124 @@ +/* SPDX-License-Identifier: MIT */ +/* + * Copyright 2025 Advanced Micro Devices, Inc. + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. 
+ * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * OTHER DEALINGS IN THE SOFTWARE. + */ + +// This shader is to clean LDS, SGPRs and VGPRs. It is first 64 Dwords or 256 bytes of 192 Dwords cleaner shader. +//To turn this shader program on for compilation change this to main and lower shader main to main_1 + +// GFX10.3 : Clear SGPRs, VGPRs and LDS +// Launch 32 waves per CU (16 per SIMD) as a workgroup (threadgroup) to fill every wave slot +// Waves are "wave32" and have 64 VGPRs each, which uses all 1024 VGPRs per SIMD +// Waves are launched in "CU" mode, and the workgroup shares 64KB of LDS (half of the WGP's LDS) +// It takes 2 workgroups to use all of LDS: one on each CU of the WGP +// Each wave clears SGPRs 0 - 107 +// Each wave clears VGPRs 0 - 63 +// The first wave of the workgroup clears its 64KB of LDS +// The shader starts with "S_BARRIER" to ensure SPI has launched all waves of the workgroup +// before any wave in the workgroup could end. Without this, it is possible not all SGPRs get cleared. + + +shader main + asic(GFX10) + type(CS) + wave_size(32) +// Note: original source code from SQ team + +// +// Create 32 waves in a threadgroup (CS waves) +// Each allocates 64 VGPRs +// The workgroup allocates all of LDS (64kbytes) +// +// Takes about 2500 clocks to run. 
+// (theoretical fastest = 1024clks vgpr + 640lds = 1660 clks) +// + S_BARRIER + s_mov_b32 s2, 0x00000038 // Loop 64/8=8 times (loop unrolled for performance) + s_mov_b32 m0, 0 + // + // CLEAR VGPRs + // +label_0005: + v_movreld_b32 v0, 0 + v_movreld_b32 v1, 0 + v_movreld_b32 v2, 0 + v_movreld_b32 v3, 0 + v_movreld_b32 v4, 0 + v_movreld_b32 v5, 0 + v_movreld_b32 v6, 0 + v_movreld_b32 v7, 0 + s_mov_b32 m0, s2 + s_sub_u32 s2, s2, 8 + s_cbranch_scc0 label_0005 + // + s_mov_b32 s2, 0x80000000 // Bit31 is first_wave + s_and_b32 s2, s2, s0 // sgpr0 has tg_size (first_wave) term as in ucode only COMPUTE_PGM_RSRC2.tg_size_en is set + s_cbranch_scc0 label_0023 // Clean LDS if its first wave of ThreadGroup/WorkGroup + // CLEAR LDS + // + s_mov_b32 exec_lo, 0xffffffff + s_mov_b32 exec_hi, 0xffffffff + v_mbcnt_lo_u32_b32 v1, exec_hi, 0 // Set V1 to thread-ID (0..63) + v_mbcnt_hi_u32_b32 v1, exec_lo, v1 // Set V1 to thread-ID (0..63) + v_mul_u32_u24 v1, 0x00000008, v1 // * 8, so each thread is a double-dword address (8byte) + s_mov_b32 s2, 0x00000003f // 64 loop iterations + s_mov_b32 m0, 0xffffffff + // Clear all of LDS space + // Each FirstWave of WorkGroup clears 64kbyte block + +label_001F: + ds_write2_b64 v1, v[2:3], v[2:3] offset1:32 + ds_write2_b64 v1, v[4:5], v[4:5] offset0:64 offset1:96 + v_add_co_u32 v1, vcc, 0x00000400, v1 + s_sub_u32 s2, s2, 1 + s_cbranch_scc0 label_001F + + // + // CLEAR SGPRs + // +label_0023: + s_mov_b32 m0, 0x00000068 // Loop 108/4=27 times (loop unrolled for performance) +label_sgpr_loop: + s_movreld_b32 s0, 0 + s_movreld_b32 s1, 0 + s_movreld_b32 s2, 0 + s_movreld_b32 s3, 0 + s_sub_u32 m0, m0, 4 + s_cbranch_scc0 label_sgpr_loop + + //clear vcc + s_mov_b32 flat_scratch_lo, 0 //clear flat scratch lo SGPR + s_mov_b32 flat_scratch_hi, 0 //clear flat scratch hi SGPR + s_mov_b64 vcc, 0 //clear vcc + s_mov_b64 ttmp0, 0 //Clear ttmp0 and ttmp1 + s_mov_b64 ttmp2, 0 //Clear ttmp2 and ttmp3 + s_mov_b64 ttmp4, 0 //Clear ttmp4 and ttmp5 + s_mov_b64 ttmp6, 
0 //Clear ttmp6 and ttmp7 + s_mov_b64 ttmp8, 0 //Clear ttmp8 and ttmp9 + s_mov_b64 ttmp10, 0 //Clear ttmp10 and ttmp11 + s_mov_b64 ttmp12, 0 //Clear ttmp12 and ttmp13 + s_mov_b64 ttmp14, 0 //Clear ttmp14 and ttmp15 + + s_endpgm + +end + + diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c index 2ae058a224f4..56c06b72a70a 100644 --- a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c @@ -615,7 +615,7 @@ static int gfx_v11_0_ring_test_ib(struct amdgpu_ring *ring, long timeout) r = -EINVAL; err2: if (!ring->is_mes_queue) - amdgpu_ib_free(adev, &ib, NULL); + amdgpu_ib_free(&ib, NULL); dma_fence_put(f); err1: if (!ring->is_mes_queue) @@ -639,6 +639,7 @@ static int gfx_v11_0_init_toc_microcode(struct amdgpu_device *adev, const char * int err = 0; err = amdgpu_ucode_request(adev, &adev->psp.toc_fw, + AMDGPU_UCODE_REQUIRED, "amdgpu/%s_toc.bin", ucode_prefix); if (err) goto out; @@ -688,6 +689,7 @@ static int gfx_v11_0_init_microcode(struct amdgpu_device *adev) amdgpu_ucode_ip_version_decode(adev, GC_HWIP, ucode_prefix, sizeof(ucode_prefix)); err = amdgpu_ucode_request(adev, &adev->gfx.pfp_fw, + AMDGPU_UCODE_REQUIRED, "amdgpu/%s_pfp.bin", ucode_prefix); if (err) goto out; @@ -705,6 +707,7 @@ static int gfx_v11_0_init_microcode(struct amdgpu_device *adev) } err = amdgpu_ucode_request(adev, &adev->gfx.me_fw, + AMDGPU_UCODE_REQUIRED, "amdgpu/%s_me.bin", ucode_prefix); if (err) goto out; @@ -720,9 +723,11 @@ static int gfx_v11_0_init_microcode(struct amdgpu_device *adev) if (amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(11, 0, 0) && adev->pdev->revision == 0xCE) err = amdgpu_ucode_request(adev, &adev->gfx.rlc_fw, + AMDGPU_UCODE_REQUIRED, "amdgpu/gc_11_0_0_rlc_1.bin"); else err = amdgpu_ucode_request(adev, &adev->gfx.rlc_fw, + AMDGPU_UCODE_REQUIRED, "amdgpu/%s_rlc.bin", ucode_prefix); if (err) goto out; @@ -735,6 +740,7 @@ static int gfx_v11_0_init_microcode(struct amdgpu_device *adev) } err = 
amdgpu_ucode_request(adev, &adev->gfx.mec_fw, + AMDGPU_UCODE_REQUIRED, "amdgpu/%s_mec.bin", ucode_prefix); if (err) goto out; @@ -1885,6 +1891,7 @@ static u32 gfx_v11_0_get_rb_active_bitmap(struct amdgpu_device *adev) static void gfx_v11_0_setup_rb(struct amdgpu_device *adev) { + u32 rb_bitmap_per_sa; u32 rb_bitmap_width_per_sa; u32 max_sa; u32 active_sa_bitmap; @@ -1902,9 +1909,11 @@ static void gfx_v11_0_setup_rb(struct amdgpu_device *adev) adev->gfx.config.max_sh_per_se; rb_bitmap_width_per_sa = adev->gfx.config.max_backends_per_se / adev->gfx.config.max_sh_per_se; + rb_bitmap_per_sa = amdgpu_gfx_create_bitmask(rb_bitmap_width_per_sa); + for (i = 0; i < max_sa; i++) { if (active_sa_bitmap & (1 << i)) - active_rb_bitmap |= (0x3 << (i * rb_bitmap_width_per_sa)); + active_rb_bitmap |= (rb_bitmap_per_sa << (i * rb_bitmap_width_per_sa)); } active_rb_bitmap &= global_active_rb_bitmap; @@ -3918,9 +3927,7 @@ static void gfx_v11_0_kiq_setting(struct amdgpu_ring *ring) tmp = RREG32_SOC15(GC, 0, regRLC_CP_SCHEDULERS); tmp &= 0xffffff00; tmp |= (ring->me << 5) | (ring->pipe << 3) | (ring->queue); - WREG32_SOC15(GC, 0, regRLC_CP_SCHEDULERS, tmp); - tmp |= 0x80; - WREG32_SOC15(GC, 0, regRLC_CP_SCHEDULERS, tmp); + WREG32_SOC15(GC, 0, regRLC_CP_SCHEDULERS, tmp | 0x80); } static void gfx_v11_0_cp_set_doorbell_range(struct amdgpu_device *adev) @@ -5458,10 +5465,10 @@ static void gfx_v11_cntl_pg(struct amdgpu_device *adev, bool enable) amdgpu_gfx_rlc_exit_safe_mode(adev, 0); } -static int gfx_v11_0_set_powergating_state(void *handle, +static int gfx_v11_0_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state) { - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; bool enable = (state == AMD_PG_STATE_GATE); if (amdgpu_sriov_vf(adev)) @@ -5494,10 +5501,10 @@ static int gfx_v11_0_set_powergating_state(void *handle, return 0; } -static int gfx_v11_0_set_clockgating_state(void *handle, +static 
int gfx_v11_0_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state) { - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; if (amdgpu_sriov_vf(adev)) return 0; @@ -6646,30 +6653,14 @@ static int gfx_v11_0_reset_kgq(struct amdgpu_ring *ring, unsigned int vmid) static int gfx_v11_0_reset_kcq(struct amdgpu_ring *ring, unsigned int vmid) { struct amdgpu_device *adev = ring->adev; - int i, r = 0; + int r = 0; if (amdgpu_sriov_vf(adev)) return -EINVAL; - amdgpu_gfx_rlc_enter_safe_mode(adev, 0); - mutex_lock(&adev->srbm_mutex); - soc21_grbm_select(adev, ring->me, ring->pipe, ring->queue, 0); - WREG32_SOC15(GC, 0, regCP_HQD_DEQUEUE_REQUEST, 0x2); - WREG32_SOC15(GC, 0, regSPI_COMPUTE_QUEUE_RESET, 0x1); - - /* make sure dequeue is complete*/ - for (i = 0; i < adev->usec_timeout; i++) { - if (!(RREG32_SOC15(GC, 0, regCP_HQD_ACTIVE) & 1)) - break; - udelay(1); - } - if (i >= adev->usec_timeout) - r = -ETIMEDOUT; - soc21_grbm_select(adev, 0, 0, 0, 0); - mutex_unlock(&adev->srbm_mutex); - amdgpu_gfx_rlc_exit_safe_mode(adev, 0); + r = amdgpu_mes_reset_legacy_queue(ring->adev, ring, vmid, true); if (r) { - dev_err(adev->dev, "fail to wait on hqd deactivate\n"); + dev_err(adev->dev, "reset via MMIO failed %d\n", r); return r; } diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v12_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v12_0.c index fe7c48f2fb2a..4b6e05750654 100644 --- a/drivers/gpu/drm/amd/amdgpu/gfx_v12_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v12_0.c @@ -513,7 +513,7 @@ static int gfx_v12_0_ring_test_ib(struct amdgpu_ring *ring, long timeout) r = -EINVAL; err2: if (!ring->is_mes_queue) - amdgpu_ib_free(adev, &ib, NULL); + amdgpu_ib_free(&ib, NULL); dma_fence_put(f); err1: if (!ring->is_mes_queue) @@ -537,6 +537,7 @@ static int gfx_v12_0_init_toc_microcode(struct amdgpu_device *adev, const char * int err = 0; err = amdgpu_ucode_request(adev, &adev->psp.toc_fw, + AMDGPU_UCODE_REQUIRED, 
"amdgpu/%s_toc.bin", ucode_prefix); if (err) goto out; @@ -566,6 +567,7 @@ static int gfx_v12_0_init_microcode(struct amdgpu_device *adev) amdgpu_ucode_ip_version_decode(adev, GC_HWIP, ucode_prefix, sizeof(ucode_prefix)); err = amdgpu_ucode_request(adev, &adev->gfx.pfp_fw, + AMDGPU_UCODE_REQUIRED, "amdgpu/%s_pfp.bin", ucode_prefix); if (err) goto out; @@ -573,6 +575,7 @@ static int gfx_v12_0_init_microcode(struct amdgpu_device *adev) amdgpu_gfx_cp_init_microcode(adev, AMDGPU_UCODE_ID_CP_RS64_PFP_P0_STACK); err = amdgpu_ucode_request(adev, &adev->gfx.me_fw, + AMDGPU_UCODE_REQUIRED, "amdgpu/%s_me.bin", ucode_prefix); if (err) goto out; @@ -581,6 +584,7 @@ static int gfx_v12_0_init_microcode(struct amdgpu_device *adev) if (!amdgpu_sriov_vf(adev)) { err = amdgpu_ucode_request(adev, &adev->gfx.rlc_fw, + AMDGPU_UCODE_REQUIRED, "amdgpu/%s_rlc.bin", ucode_prefix); if (err) goto out; @@ -593,6 +597,7 @@ static int gfx_v12_0_init_microcode(struct amdgpu_device *adev) } err = amdgpu_ucode_request(adev, &adev->gfx.mec_fw, + AMDGPU_UCODE_REQUIRED, "amdgpu/%s_mec.bin", ucode_prefix); if (err) goto out; @@ -1437,11 +1442,19 @@ static int gfx_v12_0_sw_init(struct amdgpu_ip_block *ip_block) } } - /* TODO: Add queue reset mask when FW fully supports it */ adev->gfx.gfx_supported_reset = amdgpu_get_soft_full_reset_mask(&adev->gfx.gfx_ring[0]); adev->gfx.compute_supported_reset = amdgpu_get_soft_full_reset_mask(&adev->gfx.compute_ring[0]); + switch (amdgpu_ip_version(adev, GC_HWIP, 0)) { + case IP_VERSION(12, 0, 0): + case IP_VERSION(12, 0, 1): + if ((adev->gfx.me_fw_version >= 2660) && + (adev->gfx.mec_fw_version >= 2920)) { + adev->gfx.compute_supported_reset |= AMDGPU_RESET_TYPE_PER_QUEUE; + adev->gfx.gfx_supported_reset |= AMDGPU_RESET_TYPE_PER_QUEUE; + } + } if (!adev->enable_mes_kiq) { r = amdgpu_gfx_kiq_init(adev, GFX12_MEC_HPD_SIZE, 0); @@ -1610,6 +1623,7 @@ static u32 gfx_v12_0_get_rb_active_bitmap(struct amdgpu_device *adev) static void gfx_v12_0_setup_rb(struct 
amdgpu_device *adev) { + u32 rb_bitmap_per_sa; u32 rb_bitmap_width_per_sa; u32 max_sa; u32 active_sa_bitmap; @@ -1627,12 +1641,14 @@ static void gfx_v12_0_setup_rb(struct amdgpu_device *adev) adev->gfx.config.max_sh_per_se; rb_bitmap_width_per_sa = adev->gfx.config.max_backends_per_se / adev->gfx.config.max_sh_per_se; + rb_bitmap_per_sa = amdgpu_gfx_create_bitmask(rb_bitmap_width_per_sa); + for (i = 0; i < max_sa; i++) { if (active_sa_bitmap & (1 << i)) - active_rb_bitmap |= (0x3 << (i * rb_bitmap_width_per_sa)); + active_rb_bitmap |= (rb_bitmap_per_sa << (i * rb_bitmap_width_per_sa)); } - active_rb_bitmap |= global_active_rb_bitmap; + active_rb_bitmap &= global_active_rb_bitmap; adev->gfx.config.backend_enable_mask = active_rb_bitmap; adev->gfx.config.num_rbs = hweight32(active_rb_bitmap); } @@ -2832,9 +2848,7 @@ static void gfx_v12_0_kiq_setting(struct amdgpu_ring *ring) tmp = RREG32_SOC15(GC, 0, regRLC_CP_SCHEDULERS); tmp &= 0xffffff00; tmp |= (ring->me << 5) | (ring->pipe << 3) | (ring->queue); - WREG32_SOC15(GC, 0, regRLC_CP_SCHEDULERS, tmp); - tmp |= 0x80; - WREG32_SOC15(GC, 0, regRLC_CP_SCHEDULERS, tmp); + WREG32_SOC15(GC, 0, regRLC_CP_SCHEDULERS, tmp | 0x80); } static void gfx_v12_0_cp_set_doorbell_range(struct amdgpu_device *adev) @@ -3864,10 +3878,10 @@ static void gfx_v12_cntl_pg(struct amdgpu_device *adev, bool enable) } #endif -static int gfx_v12_0_set_powergating_state(void *handle, +static int gfx_v12_0_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state) { - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; bool enable = (state == AMD_PG_STATE_GATE); if (amdgpu_sriov_vf(adev)) @@ -4115,15 +4129,15 @@ static int gfx_v12_0_update_gfx_clock_gating(struct amdgpu_device *adev, return 0; } -static int gfx_v12_0_set_clockgating_state(void *handle, +static int gfx_v12_0_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state) { - 
struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; if (amdgpu_sriov_vf(adev)) return 0; - switch (adev->ip_versions[GC_HWIP][0]) { + switch (amdgpu_ip_version(adev, GC_HWIP, 0)) { case IP_VERSION(12, 0, 0): case IP_VERSION(12, 0, 1): gfx_v12_0_update_gfx_clock_gating(adev, @@ -5233,24 +5247,16 @@ static int gfx_v12_0_reset_kgq(struct amdgpu_ring *ring, unsigned int vmid) static int gfx_v12_0_reset_kcq(struct amdgpu_ring *ring, unsigned int vmid) { struct amdgpu_device *adev = ring->adev; - int r, i; + int r; if (amdgpu_sriov_vf(adev)) return -EINVAL; - amdgpu_gfx_rlc_enter_safe_mode(adev, 0); - mutex_lock(&adev->srbm_mutex); - soc24_grbm_select(adev, ring->me, ring->pipe, ring->queue, 0); - WREG32_SOC15(GC, 0, regCP_HQD_DEQUEUE_REQUEST, 0x2); - WREG32_SOC15(GC, 0, regSPI_COMPUTE_QUEUE_RESET, 0x1); - for (i = 0; i < adev->usec_timeout; i++) { - if (!(RREG32_SOC15(GC, 0, regCP_HQD_ACTIVE) & 1)) - break; - udelay(1); + r = amdgpu_mes_reset_legacy_queue(ring->adev, ring, vmid, true); + if (r) { + dev_err(adev->dev, "reset via MMIO failed %d\n", r); + return r; } - soc24_grbm_select(adev, 0, 0, 0, 0); - mutex_unlock(&adev->srbm_mutex); - amdgpu_gfx_rlc_exit_safe_mode(adev, 0); r = amdgpu_bo_reserve(ring->mqd_obj, false); if (unlikely(r != 0)) { diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v12_0.h b/drivers/gpu/drm/amd/amdgpu/gfx_v12_0.h index bcc9c72ccbde..f7184b2dc4e8 100644 --- a/drivers/gpu/drm/amd/amdgpu/gfx_v12_0.h +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v12_0.h @@ -26,4 +26,6 @@ extern const struct amdgpu_ip_block_version gfx_v12_0_ip_block; +int gfx_v12_0_request_gfx_index_mutex(struct amdgpu_device *adev, + bool req); #endif diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v6_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v6_0.c index 41f50bf380c4..f26e2cdec07a 100644 --- a/drivers/gpu/drm/amd/amdgpu/gfx_v6_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v6_0.c @@ -337,6 +337,7 @@ static int gfx_v6_0_init_microcode(struct 
amdgpu_device *adev) } err = amdgpu_ucode_request(adev, &adev->gfx.pfp_fw, + AMDGPU_UCODE_REQUIRED, "amdgpu/%s_pfp.bin", chip_name); if (err) goto out; @@ -345,6 +346,7 @@ static int gfx_v6_0_init_microcode(struct amdgpu_device *adev) adev->gfx.pfp_feature_version = le32_to_cpu(cp_hdr->ucode_feature_version); err = amdgpu_ucode_request(adev, &adev->gfx.me_fw, + AMDGPU_UCODE_REQUIRED, "amdgpu/%s_me.bin", chip_name); if (err) goto out; @@ -353,6 +355,7 @@ static int gfx_v6_0_init_microcode(struct amdgpu_device *adev) adev->gfx.me_feature_version = le32_to_cpu(cp_hdr->ucode_feature_version); err = amdgpu_ucode_request(adev, &adev->gfx.ce_fw, + AMDGPU_UCODE_REQUIRED, "amdgpu/%s_ce.bin", chip_name); if (err) goto out; @@ -361,6 +364,7 @@ static int gfx_v6_0_init_microcode(struct amdgpu_device *adev) adev->gfx.ce_feature_version = le32_to_cpu(cp_hdr->ucode_feature_version); err = amdgpu_ucode_request(adev, &adev->gfx.rlc_fw, + AMDGPU_UCODE_REQUIRED, "amdgpu/%s_rlc.bin", chip_name); if (err) goto out; @@ -1906,7 +1910,7 @@ static int gfx_v6_0_ring_test_ib(struct amdgpu_ring *ring, long timeout) r = -EINVAL; error: - amdgpu_ib_free(adev, &ib, NULL); + amdgpu_ib_free(&ib, NULL); dma_fence_put(f); return r; } @@ -3373,11 +3377,11 @@ static int gfx_v6_0_priv_inst_irq(struct amdgpu_device *adev, return 0; } -static int gfx_v6_0_set_clockgating_state(void *handle, +static int gfx_v6_0_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state) { bool gate = false; - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; if (state == AMD_CG_STATE_GATE) gate = true; @@ -3395,11 +3399,11 @@ static int gfx_v6_0_set_clockgating_state(void *handle, return 0; } -static int gfx_v6_0_set_powergating_state(void *handle, +static int gfx_v6_0_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state) { bool gate = false; - struct amdgpu_device *adev = (struct amdgpu_device 
*)handle; + struct amdgpu_device *adev = ip_block->adev; if (state == AMD_PG_STATE_GATE) gate = true; diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c index 824d5913103b..84745b2453ab 100644 --- a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c @@ -934,33 +934,39 @@ static int gfx_v7_0_init_microcode(struct amdgpu_device *adev) } err = amdgpu_ucode_request(adev, &adev->gfx.pfp_fw, + AMDGPU_UCODE_REQUIRED, "amdgpu/%s_pfp.bin", chip_name); if (err) goto out; err = amdgpu_ucode_request(adev, &adev->gfx.me_fw, + AMDGPU_UCODE_REQUIRED, "amdgpu/%s_me.bin", chip_name); if (err) goto out; err = amdgpu_ucode_request(adev, &adev->gfx.ce_fw, + AMDGPU_UCODE_REQUIRED, "amdgpu/%s_ce.bin", chip_name); if (err) goto out; err = amdgpu_ucode_request(adev, &adev->gfx.mec_fw, + AMDGPU_UCODE_REQUIRED, "amdgpu/%s_mec.bin", chip_name); if (err) goto out; if (adev->asic_type == CHIP_KAVERI) { err = amdgpu_ucode_request(adev, &adev->gfx.mec2_fw, + AMDGPU_UCODE_REQUIRED, "amdgpu/%s_mec2.bin", chip_name); if (err) goto out; } err = amdgpu_ucode_request(adev, &adev->gfx.rlc_fw, + AMDGPU_UCODE_REQUIRED, "amdgpu/%s_rlc.bin", chip_name); out: if (err) { @@ -2324,7 +2330,7 @@ static int gfx_v7_0_ring_test_ib(struct amdgpu_ring *ring, long timeout) r = -EINVAL; error: - amdgpu_ib_free(adev, &ib, NULL); + amdgpu_ib_free(&ib, NULL); dma_fence_put(f); return r; } @@ -4846,11 +4852,11 @@ static int gfx_v7_0_priv_inst_irq(struct amdgpu_device *adev, return 0; } -static int gfx_v7_0_set_clockgating_state(void *handle, +static int gfx_v7_0_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state) { bool gate = false; - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; if (state == AMD_CG_STATE_GATE) gate = true; @@ -4869,11 +4875,11 @@ static int gfx_v7_0_set_clockgating_state(void *handle, return 0; } -static int 
gfx_v7_0_set_powergating_state(void *handle, +static int gfx_v7_0_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state) { bool gate = false; - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; if (state == AMD_PG_STATE_GATE) gate = true; diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c index b7006c41e270..6a025438f9d0 100644 --- a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c @@ -914,7 +914,7 @@ static int gfx_v8_0_ring_test_ib(struct amdgpu_ring *ring, long timeout) r = -EINVAL; err2: - amdgpu_ib_free(adev, &ib, NULL); + amdgpu_ib_free(&ib, NULL); dma_fence_put(f); err1: amdgpu_device_wb_free(adev, index); @@ -982,13 +982,16 @@ static int gfx_v8_0_init_microcode(struct amdgpu_device *adev) if (adev->asic_type >= CHIP_POLARIS10 && adev->asic_type <= CHIP_POLARIS12) { err = amdgpu_ucode_request(adev, &adev->gfx.pfp_fw, + AMDGPU_UCODE_OPTIONAL, "amdgpu/%s_pfp_2.bin", chip_name); if (err == -ENODEV) { err = amdgpu_ucode_request(adev, &adev->gfx.pfp_fw, + AMDGPU_UCODE_REQUIRED, "amdgpu/%s_pfp.bin", chip_name); } } else { err = amdgpu_ucode_request(adev, &adev->gfx.pfp_fw, + AMDGPU_UCODE_REQUIRED, "amdgpu/%s_pfp.bin", chip_name); } if (err) @@ -999,13 +1002,16 @@ static int gfx_v8_0_init_microcode(struct amdgpu_device *adev) if (adev->asic_type >= CHIP_POLARIS10 && adev->asic_type <= CHIP_POLARIS12) { err = amdgpu_ucode_request(adev, &adev->gfx.me_fw, + AMDGPU_UCODE_OPTIONAL, "amdgpu/%s_me_2.bin", chip_name); if (err == -ENODEV) { err = amdgpu_ucode_request(adev, &adev->gfx.me_fw, + AMDGPU_UCODE_REQUIRED, "amdgpu/%s_me.bin", chip_name); } } else { err = amdgpu_ucode_request(adev, &adev->gfx.me_fw, + AMDGPU_UCODE_REQUIRED, "amdgpu/%s_me.bin", chip_name); } if (err) @@ -1017,13 +1023,16 @@ static int gfx_v8_0_init_microcode(struct amdgpu_device *adev) if (adev->asic_type >= CHIP_POLARIS10 && 
adev->asic_type <= CHIP_POLARIS12) { err = amdgpu_ucode_request(adev, &adev->gfx.ce_fw, + AMDGPU_UCODE_OPTIONAL, "amdgpu/%s_ce_2.bin", chip_name); if (err == -ENODEV) { err = amdgpu_ucode_request(adev, &adev->gfx.ce_fw, + AMDGPU_UCODE_REQUIRED, "amdgpu/%s_ce.bin", chip_name); } } else { err = amdgpu_ucode_request(adev, &adev->gfx.ce_fw, + AMDGPU_UCODE_REQUIRED, "amdgpu/%s_ce.bin", chip_name); } if (err) @@ -1044,6 +1053,7 @@ static int gfx_v8_0_init_microcode(struct amdgpu_device *adev) adev->virt.chained_ib_support = false; err = amdgpu_ucode_request(adev, &adev->gfx.rlc_fw, + AMDGPU_UCODE_REQUIRED, "amdgpu/%s_rlc.bin", chip_name); if (err) goto out; @@ -1093,13 +1103,16 @@ static int gfx_v8_0_init_microcode(struct amdgpu_device *adev) if (adev->asic_type >= CHIP_POLARIS10 && adev->asic_type <= CHIP_POLARIS12) { err = amdgpu_ucode_request(adev, &adev->gfx.mec_fw, + AMDGPU_UCODE_OPTIONAL, "amdgpu/%s_mec_2.bin", chip_name); if (err == -ENODEV) { err = amdgpu_ucode_request(adev, &adev->gfx.mec_fw, + AMDGPU_UCODE_REQUIRED, "amdgpu/%s_mec.bin", chip_name); } } else { err = amdgpu_ucode_request(adev, &adev->gfx.mec_fw, + AMDGPU_UCODE_REQUIRED, "amdgpu/%s_mec.bin", chip_name); } if (err) @@ -1112,13 +1125,16 @@ static int gfx_v8_0_init_microcode(struct amdgpu_device *adev) (adev->asic_type != CHIP_TOPAZ)) { if (adev->asic_type >= CHIP_POLARIS10 && adev->asic_type <= CHIP_POLARIS12) { err = amdgpu_ucode_request(adev, &adev->gfx.mec2_fw, + AMDGPU_UCODE_OPTIONAL, "amdgpu/%s_mec2_2.bin", chip_name); if (err == -ENODEV) { err = amdgpu_ucode_request(adev, &adev->gfx.mec2_fw, + AMDGPU_UCODE_REQUIRED, "amdgpu/%s_mec2.bin", chip_name); } } else { err = amdgpu_ucode_request(adev, &adev->gfx.mec2_fw, + AMDGPU_UCODE_REQUIRED, "amdgpu/%s_mec2.bin", chip_name); } if (!err) { @@ -1640,7 +1656,7 @@ static int gfx_v8_0_do_edc_gpr_workarounds(struct amdgpu_device *adev) RREG32(sec_ded_counter_registers[i]); fail: - amdgpu_ib_free(adev, &ib, NULL); + amdgpu_ib_free(&ib, NULL); 
dma_fence_put(f); return r; @@ -4304,9 +4320,7 @@ static void gfx_v8_0_kiq_setting(struct amdgpu_ring *ring) tmp = RREG32(mmRLC_CP_SCHEDULERS); tmp &= 0xffffff00; tmp |= (ring->me << 5) | (ring->pipe << 3) | (ring->queue); - WREG32(mmRLC_CP_SCHEDULERS, tmp); - tmp |= 0x80; - WREG32(mmRLC_CP_SCHEDULERS, tmp); + WREG32(mmRLC_CP_SCHEDULERS, tmp | 0x80); } static int gfx_v8_0_kiq_kcq_enable(struct amdgpu_device *adev) @@ -5321,7 +5335,7 @@ static void gfx_v8_0_enable_gfx_static_mg_power_gating(struct amdgpu_device *ade (adev->asic_type == CHIP_POLARIS12) || (adev->asic_type == CHIP_VEGAM)) /* Send msg to SMU via Powerplay */ - amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_GFX, enable); + amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_GFX, enable, 0); WREG32_FIELD(RLC_PG_CNTL, STATIC_PER_CU_PG_ENABLE, enable ? 1 : 0); } @@ -5367,10 +5381,10 @@ static void cz_update_gfx_cg_power_gating(struct amdgpu_device *adev, } } -static int gfx_v8_0_set_powergating_state(void *handle, +static int gfx_v8_0_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state) { - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; bool enable = (state == AMD_PG_STATE_GATE); if (amdgpu_sriov_vf(adev)) @@ -5625,8 +5639,6 @@ static void gfx_v8_0_update_medium_grain_clock_gating(struct amdgpu_device *adev { uint32_t temp, data; - amdgpu_gfx_rlc_enter_safe_mode(adev, 0); - /* It is disabled by HW by default */ if (enable && (adev->cg_flags & AMD_CG_SUPPORT_GFX_MGCG)) { if (adev->cg_flags & AMD_CG_SUPPORT_GFX_MGLS) { @@ -5720,8 +5732,6 @@ static void gfx_v8_0_update_medium_grain_clock_gating(struct amdgpu_device *adev /* 7- wait for RLC_SERDES_CU_MASTER & RLC_SERDES_NONCU_MASTER idle */ gfx_v8_0_wait_for_rlc_serdes(adev); } - - amdgpu_gfx_rlc_exit_safe_mode(adev, 0); } static void gfx_v8_0_update_coarse_grain_clock_gating(struct amdgpu_device *adev, @@ -5731,8 +5741,6 @@ static void 
gfx_v8_0_update_coarse_grain_clock_gating(struct amdgpu_device *adev temp = data = RREG32(mmRLC_CGCG_CGLS_CTRL); - amdgpu_gfx_rlc_enter_safe_mode(adev, 0); - if (enable && (adev->cg_flags & AMD_CG_SUPPORT_GFX_CGCG)) { temp1 = data1 = RREG32(mmRLC_CGTT_MGCG_OVERRIDE); data1 &= ~RLC_CGTT_MGCG_OVERRIDE__CGCG_MASK; @@ -5813,12 +5821,12 @@ static void gfx_v8_0_update_coarse_grain_clock_gating(struct amdgpu_device *adev } gfx_v8_0_wait_for_rlc_serdes(adev); - - amdgpu_gfx_rlc_exit_safe_mode(adev, 0); } static int gfx_v8_0_update_gfx_clock_gating(struct amdgpu_device *adev, bool enable) { + amdgpu_gfx_rlc_enter_safe_mode(adev, 0); + if (enable) { /* CGCG/CGLS should be enabled after MGCG/MGLS/TS(CG/LS) * === MGCG + MGLS + TS(CG/LS) === @@ -5832,6 +5840,8 @@ static int gfx_v8_0_update_gfx_clock_gating(struct amdgpu_device *adev, gfx_v8_0_update_coarse_grain_clock_gating(adev, enable); gfx_v8_0_update_medium_grain_clock_gating(adev, enable); } + + amdgpu_gfx_rlc_exit_safe_mode(adev, 0); return 0; } @@ -5982,10 +5992,10 @@ static int gfx_v8_0_polaris_update_gfx_clock_gating(struct amdgpu_device *adev, return 0; } -static int gfx_v8_0_set_clockgating_state(void *handle, +static int gfx_v8_0_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state) { - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; if (amdgpu_sriov_vf(adev)) return 0; diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c index 0b6f09f2cc9b..fa572b40989e 100644 --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c @@ -1243,7 +1243,7 @@ static int gfx_v9_0_ring_test_ib(struct amdgpu_ring *ring, long timeout) r = -EINVAL; err2: - amdgpu_ib_free(adev, &ib, NULL); + amdgpu_ib_free(&ib, NULL); dma_fence_put(f); err1: amdgpu_device_wb_free(adev, index); @@ -1429,18 +1429,21 @@ static int gfx_v9_0_init_cp_gfx_microcode(struct amdgpu_device *adev, int err; 
err = amdgpu_ucode_request(adev, &adev->gfx.pfp_fw, + AMDGPU_UCODE_REQUIRED, "amdgpu/%s_pfp.bin", chip_name); if (err) goto out; amdgpu_gfx_cp_init_microcode(adev, AMDGPU_UCODE_ID_CP_PFP); err = amdgpu_ucode_request(adev, &adev->gfx.me_fw, + AMDGPU_UCODE_REQUIRED, "amdgpu/%s_me.bin", chip_name); if (err) goto out; amdgpu_gfx_cp_init_microcode(adev, AMDGPU_UCODE_ID_CP_ME); err = amdgpu_ucode_request(adev, &adev->gfx.ce_fw, + AMDGPU_UCODE_REQUIRED, "amdgpu/%s_ce.bin", chip_name); if (err) goto out; @@ -1476,6 +1479,7 @@ static int gfx_v9_0_init_rlc_microcode(struct amdgpu_device *adev, (((adev->pdev->revision >= 0xC8) && (adev->pdev->revision <= 0xCF)) || ((adev->pdev->revision >= 0xD8) && (adev->pdev->revision <= 0xDF)))) err = amdgpu_ucode_request(adev, &adev->gfx.rlc_fw, + AMDGPU_UCODE_REQUIRED, "amdgpu/%s_rlc_am4.bin", chip_name); else if (!strcmp(chip_name, "raven") && (amdgpu_pm_load_smu_firmware(adev, &smu_version) == 0) && (smu_version >= 0x41e2b)) @@ -1483,9 +1487,11 @@ static int gfx_v9_0_init_rlc_microcode(struct amdgpu_device *adev, *SMC is loaded by SBIOS on APU and it's able to get the SMU version directly. 
*/ err = amdgpu_ucode_request(adev, &adev->gfx.rlc_fw, + AMDGPU_UCODE_REQUIRED, "amdgpu/%s_kicker_rlc.bin", chip_name); else err = amdgpu_ucode_request(adev, &adev->gfx.rlc_fw, + AMDGPU_UCODE_REQUIRED, "amdgpu/%s_rlc.bin", chip_name); if (err) goto out; @@ -1518,9 +1524,11 @@ static int gfx_v9_0_init_cp_compute_microcode(struct amdgpu_device *adev, if (amdgpu_sriov_vf(adev) && (adev->asic_type == CHIP_ALDEBARAN)) err = amdgpu_ucode_request(adev, &adev->gfx.mec_fw, - "amdgpu/%s_sjt_mec.bin", chip_name); + AMDGPU_UCODE_REQUIRED, + "amdgpu/%s_sjt_mec.bin", chip_name); else err = amdgpu_ucode_request(adev, &adev->gfx.mec_fw, + AMDGPU_UCODE_REQUIRED, "amdgpu/%s_mec.bin", chip_name); if (err) goto out; @@ -1531,9 +1539,11 @@ static int gfx_v9_0_init_cp_compute_microcode(struct amdgpu_device *adev, if (gfx_v9_0_load_mec2_fw_bin_support(adev)) { if (amdgpu_sriov_vf(adev) && (adev->asic_type == CHIP_ALDEBARAN)) err = amdgpu_ucode_request(adev, &adev->gfx.mec2_fw, + AMDGPU_UCODE_REQUIRED, "amdgpu/%s_sjt_mec2.bin", chip_name); else err = amdgpu_ucode_request(adev, &adev->gfx.mec2_fw, + AMDGPU_UCODE_REQUIRED, "amdgpu/%s_mec2.bin", chip_name); if (!err) { amdgpu_gfx_cp_init_microcode(adev, AMDGPU_UCODE_ID_CP_MEC2); @@ -3488,9 +3498,7 @@ static void gfx_v9_0_kiq_setting(struct amdgpu_ring *ring) tmp = RREG32_SOC15(GC, 0, mmRLC_CP_SCHEDULERS); tmp &= 0xffffff00; tmp |= (ring->me << 5) | (ring->pipe << 3) | (ring->queue); - WREG32_SOC15_RLC(GC, 0, mmRLC_CP_SCHEDULERS, tmp); - tmp |= 0x80; - WREG32_SOC15_RLC(GC, 0, mmRLC_CP_SCHEDULERS, tmp); + WREG32_SOC15_RLC(GC, 0, mmRLC_CP_SCHEDULERS, tmp | 0x80); } static void gfx_v9_0_mqd_set_priority(struct amdgpu_ring *ring, struct v9_mqd *mqd) @@ -4780,7 +4788,7 @@ static int gfx_v9_0_do_edc_gpr_workarounds(struct amdgpu_device *adev) } fail: - amdgpu_ib_free(adev, &ib, NULL); + amdgpu_ib_free(&ib, NULL); dma_fence_put(f); return r; @@ -4956,8 +4964,6 @@ static void gfx_v9_0_update_medium_grain_clock_gating(struct amdgpu_device *adev { 
uint32_t data, def; - amdgpu_gfx_rlc_enter_safe_mode(adev, 0); - /* It is disabled by HW by default */ if (enable && (adev->cg_flags & AMD_CG_SUPPORT_GFX_MGCG)) { /* 1 - RLC_CGTT_MGCG_OVERRIDE */ @@ -5022,8 +5028,6 @@ static void gfx_v9_0_update_medium_grain_clock_gating(struct amdgpu_device *adev WREG32_SOC15(GC, 0, mmCP_MEM_SLP_CNTL, data); } } - - amdgpu_gfx_rlc_exit_safe_mode(adev, 0); } static void gfx_v9_0_update_3d_clock_gating(struct amdgpu_device *adev, @@ -5034,8 +5038,6 @@ static void gfx_v9_0_update_3d_clock_gating(struct amdgpu_device *adev, if (!adev->gfx.num_gfx_rings) return; - amdgpu_gfx_rlc_enter_safe_mode(adev, 0); - /* Enable 3D CGCG/CGLS */ if (enable) { /* write cmd to clear cgcg/cgls ov */ @@ -5077,8 +5079,6 @@ static void gfx_v9_0_update_3d_clock_gating(struct amdgpu_device *adev, if (def != data) WREG32_SOC15(GC, 0, mmRLC_CGCG_CGLS_CTRL_3D, data); } - - amdgpu_gfx_rlc_exit_safe_mode(adev, 0); } static void gfx_v9_0_update_coarse_grain_clock_gating(struct amdgpu_device *adev, @@ -5086,8 +5086,6 @@ static void gfx_v9_0_update_coarse_grain_clock_gating(struct amdgpu_device *adev { uint32_t def, data; - amdgpu_gfx_rlc_enter_safe_mode(adev, 0); - if (enable && (adev->cg_flags & AMD_CG_SUPPORT_GFX_CGCG)) { def = data = RREG32_SOC15(GC, 0, mmRLC_CGTT_MGCG_OVERRIDE); /* unset CGCG override */ @@ -5129,13 +5127,12 @@ static void gfx_v9_0_update_coarse_grain_clock_gating(struct amdgpu_device *adev if (def != data) WREG32_SOC15(GC, 0, mmRLC_CGCG_CGLS_CTRL, data); } - - amdgpu_gfx_rlc_exit_safe_mode(adev, 0); } static int gfx_v9_0_update_gfx_clock_gating(struct amdgpu_device *adev, bool enable) { + amdgpu_gfx_rlc_enter_safe_mode(adev, 0); if (enable) { /* CGCG/CGLS should be enabled after MGCG/MGLS * === MGCG + MGLS === @@ -5155,6 +5152,7 @@ static int gfx_v9_0_update_gfx_clock_gating(struct amdgpu_device *adev, /* === MGCG + MGLS === */ gfx_v9_0_update_medium_grain_clock_gating(adev, enable); } + amdgpu_gfx_rlc_exit_safe_mode(adev, 0); return 0; } @@ 
-5232,10 +5230,10 @@ static const struct amdgpu_rlc_funcs gfx_v9_0_rlc_funcs = { .is_rlcg_access_range = gfx_v9_0_is_rlcg_access_range, }; -static int gfx_v9_0_set_powergating_state(void *handle, +static int gfx_v9_0_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state) { - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; bool enable = (state == AMD_PG_STATE_GATE); switch (amdgpu_ip_version(adev, GC_HWIP, 0)) { @@ -5277,10 +5275,10 @@ static int gfx_v9_0_set_powergating_state(void *handle, return 0; } -static int gfx_v9_0_set_clockgating_state(void *handle, +static int gfx_v9_0_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state) { - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; if (amdgpu_sriov_vf(adev)) return 0; diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.c b/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.c index 3f4fd2f08163..d81449f9d822 100644 --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.c +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.c @@ -412,7 +412,7 @@ static int gfx_v9_4_2_run_shader(struct amdgpu_device *adev, r = amdgpu_ib_schedule(ring, 1, ib, NULL, fence_ptr); if (r) { dev_err(adev->dev, "ib submit failed (%d).\n", r); - amdgpu_ib_free(adev, ib, NULL); + amdgpu_ib_free(ib, NULL); } return r; } @@ -611,16 +611,16 @@ static int gfx_v9_4_2_do_sgprs_init(struct amdgpu_device *adev) } disp2_failed: - amdgpu_ib_free(adev, &disp_ibs[2], NULL); + amdgpu_ib_free(&disp_ibs[2], NULL); dma_fence_put(fences[2]); disp1_failed: - amdgpu_ib_free(adev, &disp_ibs[1], NULL); + amdgpu_ib_free(&disp_ibs[1], NULL); dma_fence_put(fences[1]); disp0_failed: - amdgpu_ib_free(adev, &disp_ibs[0], NULL); + amdgpu_ib_free(&disp_ibs[0], NULL); dma_fence_put(fences[0]); pro_end: - amdgpu_ib_free(adev, &wb_ib, NULL); + amdgpu_ib_free(&wb_ib, NULL); if (r) dev_info(adev->dev, "Init SGPRS 
Failed\n"); @@ -687,10 +687,10 @@ static int gfx_v9_4_2_do_vgprs_init(struct amdgpu_device *adev) } disp_failed: - amdgpu_ib_free(adev, &disp_ib, NULL); + amdgpu_ib_free(&disp_ib, NULL); dma_fence_put(fence); pro_end: - amdgpu_ib_free(adev, &wb_ib, NULL); + amdgpu_ib_free(&wb_ib, NULL); if (r) dev_info(adev->dev, "Init VGPRS Failed\n"); diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c b/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c index e2b3dda57030..2ba185875baa 100644 --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c @@ -43,8 +43,12 @@ MODULE_FIRMWARE("amdgpu/gc_9_4_3_mec.bin"); MODULE_FIRMWARE("amdgpu/gc_9_4_4_mec.bin"); +MODULE_FIRMWARE("amdgpu/gc_9_5_0_mec.bin"); MODULE_FIRMWARE("amdgpu/gc_9_4_3_rlc.bin"); MODULE_FIRMWARE("amdgpu/gc_9_4_4_rlc.bin"); +MODULE_FIRMWARE("amdgpu/gc_9_5_0_rlc.bin"); +MODULE_FIRMWARE("amdgpu/gc_9_4_3_sjt_mec.bin"); +MODULE_FIRMWARE("amdgpu/gc_9_4_4_sjt_mec.bin"); #define GFX9_MEC_HPD_SIZE 4096 #define RLCG_UCODE_LOADING_START_ADDRESS 0x00002000L @@ -52,10 +56,6 @@ MODULE_FIRMWARE("amdgpu/gc_9_4_4_rlc.bin"); #define GOLDEN_GB_ADDR_CONFIG 0x2a114042 #define CP_HQD_PERSISTENT_STATE_DEFAULT 0xbe05301 -#define mmSMNAID_XCD0_MCA_SMU 0x36430400 /* SMN AID XCD0 */ -#define mmSMNAID_XCD1_MCA_SMU 0x38430400 /* SMN AID XCD1 */ -#define mmSMNXCD_XCD0_MCA_SMU 0x40430400 /* SMN XCD XCD0 */ - #define XCC_REG_RANGE_0_LOW 0x2000 /* XCC gfxdec0 lower Bound */ #define XCC_REG_RANGE_0_HIGH 0x3400 /* XCC gfxdec0 upper Bound */ #define XCC_REG_RANGE_1_LOW 0xA000 /* XCC gfxdec1 lower Bound */ @@ -349,13 +349,17 @@ static void gfx_v9_4_3_init_golden_registers(struct amdgpu_device *adev) WREG32_SOC15(GC, dev_inst, regGB_ADDR_CONFIG, GOLDEN_GB_ADDR_CONFIG); - /* Golden settings applied by driver for ASIC with rev_id 0 */ - if (adev->rev_id == 0) { - WREG32_FIELD15_PREREG(GC, dev_inst, TCP_UTCL1_CNTL1, - REDUCE_FIFO_DEPTH_BY_2, 2); + if (amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(9, 5, 0)) { + 
WREG32_FIELD15_PREREG(GC, dev_inst, TCP_UTCL1_CNTL2, SPARE, 0x1); } else { - WREG32_FIELD15_PREREG(GC, dev_inst, TCP_UTCL1_CNTL2, - SPARE, 0x1); + /* Golden settings applied by driver for ASIC with rev_id 0 */ + if (adev->rev_id == 0) { + WREG32_FIELD15_PREREG(GC, dev_inst, TCP_UTCL1_CNTL1, + REDUCE_FIFO_DEPTH_BY_2, 2); + } else { + WREG32_FIELD15_PREREG(GC, dev_inst, TCP_UTCL1_CNTL2, + SPARE, 0x1); + } } } } @@ -499,7 +503,7 @@ static int gfx_v9_4_3_ring_test_ib(struct amdgpu_ring *ring, long timeout) r = -EINVAL; err2: - amdgpu_ib_free(adev, &ib, NULL); + amdgpu_ib_free(&ib, NULL); dma_fence_put(f); err1: amdgpu_device_wb_free(adev, index); @@ -543,6 +547,7 @@ static int gfx_v9_4_3_init_rlc_microcode(struct amdgpu_device *adev, err = amdgpu_ucode_request(adev, &adev->gfx.rlc_fw, + AMDGPU_UCODE_REQUIRED, "amdgpu/%s_rlc.bin", chip_name); if (err) goto out; @@ -574,8 +579,19 @@ static int gfx_v9_4_3_init_cp_compute_microcode(struct amdgpu_device *adev, { int err; - err = amdgpu_ucode_request(adev, &adev->gfx.mec_fw, - "amdgpu/%s_mec.bin", chip_name); + if (amdgpu_sriov_vf(adev)) { + err = amdgpu_ucode_request(adev, &adev->gfx.mec_fw, + AMDGPU_UCODE_REQUIRED, + "amdgpu/%s_sjt_mec.bin", chip_name); + + if (err) + err = amdgpu_ucode_request(adev, &adev->gfx.mec_fw, + AMDGPU_UCODE_REQUIRED, + "amdgpu/%s_mec.bin", chip_name); + } else + err = amdgpu_ucode_request(adev, &adev->gfx.mec_fw, + AMDGPU_UCODE_REQUIRED, + "amdgpu/%s_mec.bin", chip_name); if (err) goto out; amdgpu_gfx_cp_init_microcode(adev, AMDGPU_UCODE_ID_CP_MEC1); @@ -929,6 +945,7 @@ static int gfx_v9_4_3_gpu_early_init(struct amdgpu_device *adev) switch (amdgpu_ip_version(adev, GC_HWIP, 0)) { case IP_VERSION(9, 4, 3): case IP_VERSION(9, 4, 4): + case IP_VERSION(9, 5, 0): adev->gfx.config.max_hw_contexts = 8; adev->gfx.config.sc_prim_fifo_size_frontend = 0x20; adev->gfx.config.sc_prim_fifo_size_backend = 0x100; @@ -1779,9 +1796,7 @@ static void gfx_v9_4_3_xcc_kiq_setting(struct amdgpu_ring *ring, int xcc_id) 
tmp = RREG32_SOC15(GC, GET_INST(GC, xcc_id), regRLC_CP_SCHEDULERS); tmp &= 0xffffff00; tmp |= (ring->me << 5) | (ring->pipe << 3) | (ring->queue); - WREG32_SOC15_RLC(GC, GET_INST(GC, xcc_id), regRLC_CP_SCHEDULERS, tmp); - tmp |= 0x80; - WREG32_SOC15_RLC(GC, GET_INST(GC, xcc_id), regRLC_CP_SCHEDULERS, tmp); + WREG32_SOC15_RLC(GC, GET_INST(GC, xcc_id), regRLC_CP_SCHEDULERS, tmp | 0x80); } static void gfx_v9_4_3_mqd_set_priority(struct amdgpu_ring *ring, struct v9_mqd *mqd) @@ -2764,16 +2779,16 @@ static const struct amdgpu_rlc_funcs gfx_v9_4_3_rlc_funcs = { .is_rlcg_access_range = gfx_v9_4_3_is_rlcg_access_range, }; -static int gfx_v9_4_3_set_powergating_state(void *handle, +static int gfx_v9_4_3_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state) { return 0; } -static int gfx_v9_4_3_set_clockgating_state(void *handle, +static int gfx_v9_4_3_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state) { - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; int i, num_xcc; if (amdgpu_sriov_vf(adev)) @@ -4653,7 +4668,6 @@ static void gfx_v9_4_3_ip_dump(struct amdgpu_ip_block *ip_block) num_xcc = NUM_XCC(adev->gfx.xcc_mask); - amdgpu_gfx_off_ctrl(adev, false); for (xcc_id = 0; xcc_id < num_xcc; xcc_id++) { xcc_offset = xcc_id * reg_count; for (i = 0; i < reg_count; i++) @@ -4661,7 +4675,6 @@ static void gfx_v9_4_3_ip_dump(struct amdgpu_ip_block *ip_block) RREG32(SOC15_REG_ENTRY_OFFSET_INST(gc_reg_list_9_4_3[i], GET_INST(GC, xcc_id))); } - amdgpu_gfx_off_ctrl(adev, true); /* dump compute queue registers for all instances */ if (!adev->gfx.ip_dump_compute_queues) @@ -4670,7 +4683,6 @@ static void gfx_v9_4_3_ip_dump(struct amdgpu_ip_block *ip_block) num_inst = adev->gfx.mec.num_mec * adev->gfx.mec.num_pipe_per_mec * adev->gfx.mec.num_queue_per_pipe; reg_count = ARRAY_SIZE(gc_cp_reg_list_9_4_3); - amdgpu_gfx_off_ctrl(adev, false); 
mutex_lock(&adev->srbm_mutex); for (xcc_id = 0; xcc_id < num_xcc; xcc_id++) { xcc_offset = xcc_id * reg_count * num_inst; @@ -4697,7 +4709,6 @@ static void gfx_v9_4_3_ip_dump(struct amdgpu_ip_block *ip_block) } soc15_grbm_select(adev, 0, 0, 0, 0, 0); mutex_unlock(&adev->srbm_mutex); - amdgpu_gfx_off_ctrl(adev, true); } static void gfx_v9_4_3_ring_emit_cleaner_shader(struct amdgpu_ring *ring) @@ -4860,6 +4871,7 @@ static void gfx_v9_4_3_set_gds_init(struct amdgpu_device *adev) switch (amdgpu_ip_version(adev, GC_HWIP, 0)) { case IP_VERSION(9, 4, 3): case IP_VERSION(9, 4, 4): + case IP_VERSION(9, 5, 0): /* 9.4.3 removed all the GDS internal memory, * only support GWS opcode in kernel, like barrier * semaphore.etc */ @@ -4873,6 +4885,7 @@ static void gfx_v9_4_3_set_gds_init(struct amdgpu_device *adev) switch (amdgpu_ip_version(adev, GC_HWIP, 0)) { case IP_VERSION(9, 4, 3): case IP_VERSION(9, 4, 4): + case IP_VERSION(9, 5, 0): /* deprecated for 9.4.3, no usage at all */ adev->gds.gds_compute_max_wave_id = 0; break; diff --git a/drivers/gpu/drm/amd/amdgpu/gfxhub_v1_2.c b/drivers/gpu/drm/amd/amdgpu/gfxhub_v1_2.c index ed8e130c7d19..5470cef7e9bd 100644 --- a/drivers/gpu/drm/amd/amdgpu/gfxhub_v1_2.c +++ b/drivers/gpu/drm/amd/amdgpu/gfxhub_v1_2.c @@ -368,7 +368,9 @@ static void gfxhub_v1_2_xcc_setup_vmid_config(struct amdgpu_device *adev, amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(9, 4, 3) || amdgpu_ip_version(adev, GC_HWIP, 0) == - IP_VERSION(9, 4, 4)); + IP_VERSION(9, 4, 4) || + amdgpu_ip_version(adev, GC_HWIP, 0) == + IP_VERSION(9, 5, 0)); WREG32_SOC15_OFFSET(GC, GET_INST(GC, j), regVM_CONTEXT1_CNTL, i * hub->ctx_distance, tmp); WREG32_SOC15_OFFSET(GC, GET_INST(GC, j), diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c index 697599c46240..9bedca9a79c6 100644 --- a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c @@ -1088,11 +1088,11 @@ static int gmc_v10_0_wait_for_idle(struct 
amdgpu_ip_block *ip_block) return 0; } -static int gmc_v10_0_set_clockgating_state(void *handle, +static int gmc_v10_0_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state) { int r; - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; /* * The issue mmhub can't disconnect from DF with MMHUB clock gating being disabled @@ -1131,7 +1131,7 @@ static void gmc_v10_0_get_clockgating_state(void *handle, u64 *flags) athub_v2_0_get_clockgating(adev, flags); } -static int gmc_v10_0_set_powergating_state(void *handle, +static int gmc_v10_0_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state) { return 0; diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c index f893ab4c14df..72751ab4c766 100644 --- a/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c @@ -996,11 +996,11 @@ static int gmc_v11_0_wait_for_idle(struct amdgpu_ip_block *ip_block) return 0; } -static int gmc_v11_0_set_clockgating_state(void *handle, +static int gmc_v11_0_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state) { int r; - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; r = adev->mmhub.funcs->set_clockgating(adev, state); if (r) @@ -1018,7 +1018,7 @@ static void gmc_v11_0_get_clockgating_state(void *handle, u64 *flags) athub_v3_0_get_clockgating(adev, flags); } -static int gmc_v11_0_set_powergating_state(void *handle, +static int gmc_v11_0_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state) { return 0; diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v12_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v12_0.c index d22b027fd0bb..b749f1c3f6a9 100644 --- a/drivers/gpu/drm/amd/amdgpu/gmc_v12_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v12_0.c @@ -40,7 +40,7 @@ #include "gfxhub_v12_0.h" #include 
"mmhub_v4_1_0.h" #include "athub_v4_1_0.h" - +#include "umc_v8_14.h" static int gmc_v12_0_ecc_interrupt_state(struct amdgpu_device *adev, struct amdgpu_irq_src *src, @@ -581,6 +581,18 @@ static void gmc_v12_0_set_gmc_funcs(struct amdgpu_device *adev) static void gmc_v12_0_set_umc_funcs(struct amdgpu_device *adev) { + switch (amdgpu_ip_version(adev, UMC_HWIP, 0)) { + case IP_VERSION(8, 14, 0): + adev->umc.channel_inst_num = UMC_V8_14_CHANNEL_INSTANCE_NUM; + adev->umc.umc_inst_num = UMC_V8_14_UMC_INSTANCE_NUM(adev); + adev->umc.node_inst_num = 0; + adev->umc.max_ras_err_cnt_per_query = UMC_V8_14_TOTAL_CHANNEL_NUM(adev); + adev->umc.channel_offs = UMC_V8_14_PER_CHANNEL_OFFSET; + adev->umc.ras = &umc_v8_14_ras; + break; + default: + break; + } } @@ -829,6 +841,10 @@ static int gmc_v12_0_sw_init(struct amdgpu_ip_block *ip_block) amdgpu_vm_manager_init(adev); + r = amdgpu_gmc_ras_sw_init(adev); + if (r) + return r; + return 0; } @@ -980,11 +996,11 @@ static int gmc_v12_0_wait_for_idle(struct amdgpu_ip_block *ip_block) return 0; } -static int gmc_v12_0_set_clockgating_state(void *handle, +static int gmc_v12_0_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state) { int r; - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; r = adev->mmhub.funcs->set_clockgating(adev, state); if (r) @@ -1002,7 +1018,7 @@ static void gmc_v12_0_get_clockgating_state(void *handle, u64 *flags) athub_v4_1_0_get_clockgating(adev, flags); } -static int gmc_v12_0_set_powergating_state(void *handle, +static int gmc_v12_0_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state) { return 0; diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c index ca000b3d1afc..2245dda92021 100644 --- a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c @@ -131,7 +131,8 @@ static int gmc_v6_0_init_microcode(struct 
amdgpu_device *adev) if (((RREG32(mmMC_SEQ_MISC0) & 0xff000000) >> 24) == 0x58) chip_name = "si58"; - err = amdgpu_ucode_request(adev, &adev->gmc.fw, "amdgpu/%s_mc.bin", chip_name); + err = amdgpu_ucode_request(adev, &adev->gmc.fw, AMDGPU_UCODE_REQUIRED, + "amdgpu/%s_mc.bin", chip_name); if (err) { dev_err(adev->dev, "si_mc: Failed to load firmware \"%s_mc.bin\"\n", @@ -1094,13 +1095,13 @@ static int gmc_v6_0_process_interrupt(struct amdgpu_device *adev, return 0; } -static int gmc_v6_0_set_clockgating_state(void *handle, +static int gmc_v6_0_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state) { return 0; } -static int gmc_v6_0_set_powergating_state(void *handle, +static int gmc_v6_0_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state) { return 0; diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c index b6016f11956e..9aac4b1101e3 100644 --- a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c @@ -157,7 +157,8 @@ static int gmc_v7_0_init_microcode(struct amdgpu_device *adev) return -EINVAL; } - err = amdgpu_ucode_request(adev, &adev->gmc.fw, "amdgpu/%s_mc.bin", chip_name); + err = amdgpu_ucode_request(adev, &adev->gmc.fw, AMDGPU_UCODE_REQUIRED, + "amdgpu/%s_mc.bin", chip_name); if (err) { pr_err("cik_mc: Failed to load firmware \"%s_mc.bin\"\n", chip_name); amdgpu_ucode_release(&adev->gmc.fw); @@ -1317,11 +1318,11 @@ static int gmc_v7_0_process_interrupt(struct amdgpu_device *adev, return 0; } -static int gmc_v7_0_set_clockgating_state(void *handle, +static int gmc_v7_0_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state) { bool gate = false; - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; if (state == AMD_CG_STATE_GATE) gate = true; @@ -1337,7 +1338,7 @@ static int gmc_v7_0_set_clockgating_state(void *handle, return 0; } 
-static int gmc_v7_0_set_powergating_state(void *handle, +static int gmc_v7_0_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state) { return 0; diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c index 12d5967ecd45..d06585207c33 100644 --- a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c @@ -259,7 +259,8 @@ static int gmc_v8_0_init_microcode(struct amdgpu_device *adev) return -EINVAL; } - err = amdgpu_ucode_request(adev, &adev->gmc.fw, "amdgpu/%s_mc.bin", chip_name); + err = amdgpu_ucode_request(adev, &adev->gmc.fw, AMDGPU_UCODE_REQUIRED, + "amdgpu/%s_mc.bin", chip_name); if (err) { pr_err("mc: Failed to load firmware \"%s_mc.bin\"\n", chip_name); amdgpu_ucode_release(&adev->gmc.fw); @@ -1658,10 +1659,10 @@ static void fiji_update_mc_light_sleep(struct amdgpu_device *adev, } } -static int gmc_v8_0_set_clockgating_state(void *handle, +static int gmc_v8_0_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state) { - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; if (amdgpu_sriov_vf(adev)) return 0; @@ -1679,7 +1680,7 @@ static int gmc_v8_0_set_clockgating_state(void *handle, return 0; } -static int gmc_v8_0_set_powergating_state(void *handle, +static int gmc_v8_0_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state) { return 0; diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c index 50c5da3020cb..291549765c38 100644 --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c @@ -623,6 +623,9 @@ static int gmc_v9_0_process_interrupt(struct amdgpu_device *adev, } } + if (kgd2kfd_vmfault_fast_path(adev, entry, retry_fault)) + return 1; + if (!printk_ratelimit()) return 0; @@ -645,7 +648,8 @@ static int gmc_v9_0_process_interrupt(struct amdgpu_device *adev, 
soc15_ih_clientid_name[entry->client_id]); if (amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(9, 4, 3) || - amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(9, 4, 4)) + amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(9, 4, 4) || + amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(9, 5, 0)) dev_err(adev->dev, " cookie node_id %d fault from die %s%d%s\n", node_id, node_id % 4 == 3 ? "RSV" : "AID", node_id / 4, node_id % 4 == 1 ? ".XCD0" : node_id % 4 == 2 ? ".XCD1" : ""); @@ -795,7 +799,8 @@ static bool gmc_v9_0_use_invalidate_semaphore(struct amdgpu_device *adev, { if (amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(9, 4, 2) || amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(9, 4, 3) || - amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(9, 4, 4)) + amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(9, 4, 4) || + amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(9, 5, 0)) return false; return ((vmhub == AMDGPU_MMHUB0(0) || @@ -1138,12 +1143,13 @@ static void gmc_v9_0_get_coherence_flags(struct amdgpu_device *adev, bool uncached = bo->flags & AMDGPU_GEM_CREATE_UNCACHED; struct amdgpu_vm *vm = mapping->bo_va->base.vm; unsigned int mtype_local, mtype; + uint32_t gc_ip_version = amdgpu_ip_version(adev, GC_HWIP, 0); bool snoop = false; bool is_local; dma_resv_assert_held(bo->tbo.base.resv); - switch (amdgpu_ip_version(adev, GC_HWIP, 0)) { + switch (gc_ip_version) { case IP_VERSION(9, 4, 1): case IP_VERSION(9, 4, 2): if (is_vram) { @@ -1157,10 +1163,7 @@ static void gmc_v9_0_get_coherence_flags(struct amdgpu_device *adev, /* FIXME: is this still needed? Or does * amdgpu_ttm_tt_pde_flags already handle this? 
*/ - if ((amdgpu_ip_version(adev, GC_HWIP, 0) == - IP_VERSION(9, 4, 2) || - amdgpu_ip_version(adev, GC_HWIP, 0) == - IP_VERSION(9, 4, 3)) && + if (gc_ip_version == IP_VERSION(9, 4, 2) && adev->gmc.xgmi.connected_to_cpu) snoop = true; } else { @@ -1184,6 +1187,7 @@ static void gmc_v9_0_get_coherence_flags(struct amdgpu_device *adev, break; case IP_VERSION(9, 4, 3): case IP_VERSION(9, 4, 4): + case IP_VERSION(9, 5, 0): /* Only local VRAM BOs or system memory on non-NUMA APUs * can be assumed to be local in their entirety. Choose * MTYPE_NC as safe fallback for all system memory BOs on @@ -1208,7 +1212,7 @@ static void gmc_v9_0_get_coherence_flags(struct amdgpu_device *adev, if (uncached) { mtype = MTYPE_UC; } else if (ext_coherent) { - if (adev->rev_id) + if (gc_ip_version == IP_VERSION(9, 5, 0) || adev->rev_id) mtype = is_local ? MTYPE_CC : MTYPE_UC; else mtype = MTYPE_UC; @@ -1218,10 +1222,10 @@ static void gmc_v9_0_get_coherence_flags(struct amdgpu_device *adev, /* dGPU */ if (is_local) mtype = mtype_local; - else if (is_vram) - mtype = MTYPE_NC; - else + else if (gc_ip_version < IP_VERSION(9, 5, 0) && !is_vram) mtype = MTYPE_UC; + else + mtype = MTYPE_NC; } break; @@ -1275,7 +1279,8 @@ static void gmc_v9_0_override_vm_pte_flags(struct amdgpu_device *adev, * memory can use more efficient MTYPEs. 
*/ if (amdgpu_ip_version(adev, GC_HWIP, 0) != IP_VERSION(9, 4, 3) && - amdgpu_ip_version(adev, GC_HWIP, 0) != IP_VERSION(9, 4, 4)) + amdgpu_ip_version(adev, GC_HWIP, 0) != IP_VERSION(9, 4, 4) && + amdgpu_ip_version(adev, GC_HWIP, 0) != IP_VERSION(9, 5, 0)) return; /* Only direct-mapped memory allows us to determine the NUMA node from @@ -1540,6 +1545,7 @@ static void gmc_v9_0_set_mmhub_ras_funcs(struct amdgpu_device *adev) adev->mmhub.ras = &mmhub_v1_7_ras; break; case IP_VERSION(1, 8, 0): + case IP_VERSION(1, 8, 1): adev->mmhub.ras = &mmhub_v1_8_ras; break; default: @@ -1551,7 +1557,8 @@ static void gmc_v9_0_set_mmhub_ras_funcs(struct amdgpu_device *adev) static void gmc_v9_0_set_gfxhub_funcs(struct amdgpu_device *adev) { if (amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(9, 4, 3) || - amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(9, 4, 4)) + amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(9, 4, 4) || + amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(9, 5, 0)) adev->gfxhub.funcs = &gfxhub_v1_2_funcs; else adev->gfxhub.funcs = &gfxhub_v1_0_funcs; @@ -1619,7 +1626,8 @@ static int gmc_v9_0_early_init(struct amdgpu_ip_block *ip_block) if (amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(9, 4, 0) || amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(9, 4, 1) || amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(9, 4, 3) || - amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(9, 4, 4)) + amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(9, 4, 4) || + amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(9, 5, 0)) adev->gmc.xgmi.supported = true; if (amdgpu_ip_version(adev, XGMI_HWIP, 0) == IP_VERSION(6, 1, 0)) { @@ -1792,6 +1800,7 @@ static int gmc_v9_0_mc_init(struct amdgpu_device *adev) case IP_VERSION(9, 4, 2): case IP_VERSION(9, 4, 3): case IP_VERSION(9, 4, 4): + case IP_VERSION(9, 5, 0): default: adev->gmc.gart_size = 512ULL << 20; break; @@ -2070,7 +2079,8 @@ static int gmc_v9_0_sw_init(struct amdgpu_ip_block *ip_block) 
spin_lock_init(&adev->gmc.invalidate_lock); if (amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(9, 4, 3) || - amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(9, 4, 4)) { + amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(9, 4, 4) || + amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(9, 5, 0)) { gmc_v9_4_3_init_vram_info(adev); } else if (!adev->bios) { if (adev->flags & AMD_IS_APU) { @@ -2154,6 +2164,7 @@ static int gmc_v9_0_sw_init(struct amdgpu_ip_block *ip_block) break; case IP_VERSION(9, 4, 3): case IP_VERSION(9, 4, 4): + case IP_VERSION(9, 5, 0): bitmap_set(adev->vmhubs_mask, AMDGPU_GFXHUB(0), NUM_XCC(adev->gfx.xcc_mask)); @@ -2220,7 +2231,8 @@ static int gmc_v9_0_sw_init(struct amdgpu_ip_block *ip_block) amdgpu_gmc_get_vbios_allocations(adev); if (amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(9, 4, 3) || - amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(9, 4, 4)) { + amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(9, 4, 4) || + amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(9, 5, 0)) { r = gmc_v9_0_init_mem_ranges(adev); if (r) return r; @@ -2250,7 +2262,8 @@ static int gmc_v9_0_sw_init(struct amdgpu_ip_block *ip_block) (amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(9, 4, 1) || amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(9, 4, 2) || amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(9, 4, 3) || - amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(9, 4, 4)) ? + amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(9, 4, 4) || + amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(9, 5, 0)) ? 
3 : 8; @@ -2263,7 +2276,8 @@ static int gmc_v9_0_sw_init(struct amdgpu_ip_block *ip_block) return r; if (amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(9, 4, 3) || - amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(9, 4, 4)) + amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(9, 4, 4) || + amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(9, 5, 0)) amdgpu_gmc_sysfs_init(adev); return 0; @@ -2274,7 +2288,8 @@ static int gmc_v9_0_sw_fini(struct amdgpu_ip_block *ip_block) struct amdgpu_device *adev = ip_block->adev; if (amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(9, 4, 3) || - amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(9, 4, 4)) + amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(9, 4, 4) || + amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(9, 5, 0)) amdgpu_gmc_sysfs_fini(adev); amdgpu_gmc_ras_fini(adev); @@ -2544,10 +2559,10 @@ static int gmc_v9_0_soft_reset(struct amdgpu_ip_block *ip_block) return 0; } -static int gmc_v9_0_set_clockgating_state(void *handle, +static int gmc_v9_0_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state) { - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; adev->mmhub.funcs->set_clockgating(adev, state); @@ -2565,7 +2580,7 @@ static void gmc_v9_0_get_clockgating_state(void *handle, u64 *flags) athub_v1_0_get_clockgating(adev, flags); } -static int gmc_v9_0_set_powergating_state(void *handle, +static int gmc_v9_0_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state) { return 0; diff --git a/drivers/gpu/drm/amd/amdgpu/hdp_v4_0.c b/drivers/gpu/drm/amd/amdgpu/hdp_v4_0.c index e019249883fb..194026e9be33 100644 --- a/drivers/gpu/drm/amd/amdgpu/hdp_v4_0.c +++ b/drivers/gpu/drm/amd/amdgpu/hdp_v4_0.c @@ -40,10 +40,12 @@ static void hdp_v4_0_flush_hdp(struct amdgpu_device *adev, struct amdgpu_ring *ring) { - if (!ring || !ring->funcs->emit_wreg) + if (!ring || !ring->funcs->emit_wreg) { 
WREG32((adev->rmmio_remap.reg_offset + KFD_MMIO_REMAP_HDP_MEM_FLUSH_CNTL) >> 2, 0); - else + RREG32((adev->rmmio_remap.reg_offset + KFD_MMIO_REMAP_HDP_MEM_FLUSH_CNTL) >> 2); + } else { amdgpu_ring_emit_wreg(ring, (adev->rmmio_remap.reg_offset + KFD_MMIO_REMAP_HDP_MEM_FLUSH_CNTL) >> 2, 0); + } } static void hdp_v4_0_invalidate_hdp(struct amdgpu_device *adev, @@ -54,11 +56,13 @@ static void hdp_v4_0_invalidate_hdp(struct amdgpu_device *adev, amdgpu_ip_version(adev, HDP_HWIP, 0) == IP_VERSION(4, 4, 5)) return; - if (!ring || !ring->funcs->emit_wreg) + if (!ring || !ring->funcs->emit_wreg) { WREG32_SOC15_NO_KIQ(HDP, 0, mmHDP_READ_CACHE_INVALIDATE, 1); - else + RREG32_SOC15_NO_KIQ(HDP, 0, mmHDP_READ_CACHE_INVALIDATE); + } else { amdgpu_ring_emit_wreg(ring, SOC15_REG_OFFSET( HDP, 0, mmHDP_READ_CACHE_INVALIDATE), 1); + } } static void hdp_v4_0_query_ras_error_count(struct amdgpu_device *adev, diff --git a/drivers/gpu/drm/amd/amdgpu/hdp_v5_0.c b/drivers/gpu/drm/amd/amdgpu/hdp_v5_0.c index ed7facacf2fe..d3962d469088 100644 --- a/drivers/gpu/drm/amd/amdgpu/hdp_v5_0.c +++ b/drivers/gpu/drm/amd/amdgpu/hdp_v5_0.c @@ -31,10 +31,12 @@ static void hdp_v5_0_flush_hdp(struct amdgpu_device *adev, struct amdgpu_ring *ring) { - if (!ring || !ring->funcs->emit_wreg) + if (!ring || !ring->funcs->emit_wreg) { WREG32((adev->rmmio_remap.reg_offset + KFD_MMIO_REMAP_HDP_MEM_FLUSH_CNTL) >> 2, 0); - else + RREG32((adev->rmmio_remap.reg_offset + KFD_MMIO_REMAP_HDP_MEM_FLUSH_CNTL) >> 2); + } else { amdgpu_ring_emit_wreg(ring, (adev->rmmio_remap.reg_offset + KFD_MMIO_REMAP_HDP_MEM_FLUSH_CNTL) >> 2, 0); + } } static void hdp_v5_0_invalidate_hdp(struct amdgpu_device *adev, @@ -42,6 +44,7 @@ static void hdp_v5_0_invalidate_hdp(struct amdgpu_device *adev, { if (!ring || !ring->funcs->emit_wreg) { WREG32_SOC15_NO_KIQ(HDP, 0, mmHDP_READ_CACHE_INVALIDATE, 1); + RREG32_SOC15_NO_KIQ(HDP, 0, mmHDP_READ_CACHE_INVALIDATE); } else { amdgpu_ring_emit_wreg(ring, SOC15_REG_OFFSET( HDP, 0, 
mmHDP_READ_CACHE_INVALIDATE), 1); diff --git a/drivers/gpu/drm/amd/amdgpu/hdp_v5_2.c b/drivers/gpu/drm/amd/amdgpu/hdp_v5_2.c index 29c3484ae1f1..f52552c5fa27 100644 --- a/drivers/gpu/drm/amd/amdgpu/hdp_v5_2.c +++ b/drivers/gpu/drm/amd/amdgpu/hdp_v5_2.c @@ -31,13 +31,15 @@ static void hdp_v5_2_flush_hdp(struct amdgpu_device *adev, struct amdgpu_ring *ring) { - if (!ring || !ring->funcs->emit_wreg) + if (!ring || !ring->funcs->emit_wreg) { WREG32_NO_KIQ((adev->rmmio_remap.reg_offset + KFD_MMIO_REMAP_HDP_MEM_FLUSH_CNTL) >> 2, 0); - else + RREG32_NO_KIQ((adev->rmmio_remap.reg_offset + KFD_MMIO_REMAP_HDP_MEM_FLUSH_CNTL) >> 2); + } else { amdgpu_ring_emit_wreg(ring, (adev->rmmio_remap.reg_offset + KFD_MMIO_REMAP_HDP_MEM_FLUSH_CNTL) >> 2, 0); + } } static void hdp_v5_2_update_mem_power_gating(struct amdgpu_device *adev, diff --git a/drivers/gpu/drm/amd/amdgpu/hdp_v6_0.c b/drivers/gpu/drm/amd/amdgpu/hdp_v6_0.c index 33736d361dd0..6948fe9956ce 100644 --- a/drivers/gpu/drm/amd/amdgpu/hdp_v6_0.c +++ b/drivers/gpu/drm/amd/amdgpu/hdp_v6_0.c @@ -34,10 +34,12 @@ static void hdp_v6_0_flush_hdp(struct amdgpu_device *adev, struct amdgpu_ring *ring) { - if (!ring || !ring->funcs->emit_wreg) + if (!ring || !ring->funcs->emit_wreg) { WREG32((adev->rmmio_remap.reg_offset + KFD_MMIO_REMAP_HDP_MEM_FLUSH_CNTL) >> 2, 0); - else + RREG32((adev->rmmio_remap.reg_offset + KFD_MMIO_REMAP_HDP_MEM_FLUSH_CNTL) >> 2); + } else { amdgpu_ring_emit_wreg(ring, (adev->rmmio_remap.reg_offset + KFD_MMIO_REMAP_HDP_MEM_FLUSH_CNTL) >> 2, 0); + } } static void hdp_v6_0_update_clock_gating(struct amdgpu_device *adev, diff --git a/drivers/gpu/drm/amd/amdgpu/hdp_v7_0.c b/drivers/gpu/drm/amd/amdgpu/hdp_v7_0.c index 1c99bb09e2a1..63820329f67e 100644 --- a/drivers/gpu/drm/amd/amdgpu/hdp_v7_0.c +++ b/drivers/gpu/drm/amd/amdgpu/hdp_v7_0.c @@ -31,10 +31,12 @@ static void hdp_v7_0_flush_hdp(struct amdgpu_device *adev, struct amdgpu_ring *ring) { - if (!ring || !ring->funcs->emit_wreg) + if (!ring || 
!ring->funcs->emit_wreg) { WREG32((adev->rmmio_remap.reg_offset + KFD_MMIO_REMAP_HDP_MEM_FLUSH_CNTL) >> 2, 0); - else + RREG32((adev->rmmio_remap.reg_offset + KFD_MMIO_REMAP_HDP_MEM_FLUSH_CNTL) >> 2); + } else { amdgpu_ring_emit_wreg(ring, (adev->rmmio_remap.reg_offset + KFD_MMIO_REMAP_HDP_MEM_FLUSH_CNTL) >> 2, 0); + } } static void hdp_v7_0_update_clock_gating(struct amdgpu_device *adev, diff --git a/drivers/gpu/drm/amd/amdgpu/iceland_ih.c b/drivers/gpu/drm/amd/amdgpu/iceland_ih.c index 7f45e93c0397..8ac3d3282268 100644 --- a/drivers/gpu/drm/amd/amdgpu/iceland_ih.c +++ b/drivers/gpu/drm/amd/amdgpu/iceland_ih.c @@ -392,13 +392,13 @@ static int iceland_ih_soft_reset(struct amdgpu_ip_block *ip_block) return 0; } -static int iceland_ih_set_clockgating_state(void *handle, +static int iceland_ih_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state) { return 0; } -static int iceland_ih_set_powergating_state(void *handle, +static int iceland_ih_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state) { return 0; diff --git a/drivers/gpu/drm/amd/amdgpu/ih_v6_0.c b/drivers/gpu/drm/amd/amdgpu/ih_v6_0.c index 38f953fd65d9..f8a485164437 100644 --- a/drivers/gpu/drm/amd/amdgpu/ih_v6_0.c +++ b/drivers/gpu/drm/amd/amdgpu/ih_v6_0.c @@ -693,10 +693,10 @@ static void ih_v6_0_update_clockgating_state(struct amdgpu_device *adev, } } -static int ih_v6_0_set_clockgating_state(void *handle, +static int ih_v6_0_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state) { - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; ih_v6_0_update_clockgating_state(adev, state == AMD_CG_STATE_GATE); @@ -756,10 +756,10 @@ static void ih_v6_0_update_ih_mem_power_gating(struct amdgpu_device *adev, WREG32_SOC15(OSSSYS, 0, regIH_MEM_POWER_CTRL, ih_mem_pwr_cntl); } -static int ih_v6_0_set_powergating_state(void *handle, +static int 
ih_v6_0_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state) { - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; bool enable = (state == AMD_PG_STATE_GATE); if (adev->pg_flags & AMD_PG_SUPPORT_IH_SRAM_PG) diff --git a/drivers/gpu/drm/amd/amdgpu/ih_v6_1.c b/drivers/gpu/drm/amd/amdgpu/ih_v6_1.c index 61381e0c3795..dd0042efceec 100644 --- a/drivers/gpu/drm/amd/amdgpu/ih_v6_1.c +++ b/drivers/gpu/drm/amd/amdgpu/ih_v6_1.c @@ -674,10 +674,10 @@ static void ih_v6_1_update_clockgating_state(struct amdgpu_device *adev, return; } -static int ih_v6_1_set_clockgating_state(void *handle, +static int ih_v6_1_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state) { - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; ih_v6_1_update_clockgating_state(adev, state == AMD_CG_STATE_GATE); @@ -737,10 +737,10 @@ static void ih_v6_1_update_ih_mem_power_gating(struct amdgpu_device *adev, WREG32_SOC15(OSSSYS, 0, regIH_MEM_POWER_CTRL, ih_mem_pwr_cntl); } -static int ih_v6_1_set_powergating_state(void *handle, +static int ih_v6_1_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state) { - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; bool enable = (state == AMD_PG_STATE_GATE); if (adev->pg_flags & AMD_PG_SUPPORT_IH_SRAM_PG) diff --git a/drivers/gpu/drm/amd/amdgpu/ih_v7_0.c b/drivers/gpu/drm/amd/amdgpu/ih_v7_0.c index d2428cf5d385..8f9b15c171f3 100644 --- a/drivers/gpu/drm/amd/amdgpu/ih_v7_0.c +++ b/drivers/gpu/drm/amd/amdgpu/ih_v7_0.c @@ -664,10 +664,10 @@ static void ih_v7_0_update_clockgating_state(struct amdgpu_device *adev, return; } -static int ih_v7_0_set_clockgating_state(void *handle, +static int ih_v7_0_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state) { - struct 
amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; ih_v7_0_update_clockgating_state(adev, state == AMD_CG_STATE_GATE); @@ -727,10 +727,10 @@ static void ih_v7_0_update_ih_mem_power_gating(struct amdgpu_device *adev, WREG32_SOC15(OSSSYS, 0, regIH_MEM_POWER_CTRL, ih_mem_pwr_cntl); } -static int ih_v7_0_set_powergating_state(void *handle, +static int ih_v7_0_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state) { - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; bool enable = (state == AMD_PG_STATE_GATE); if (adev->pg_flags & AMD_PG_SUPPORT_IH_SRAM_PG) diff --git a/drivers/gpu/drm/amd/amdgpu/imu_v11_0.c b/drivers/gpu/drm/amd/amdgpu/imu_v11_0.c index d4f72e47ae9e..aeca5c08ea2f 100644 --- a/drivers/gpu/drm/amd/amdgpu/imu_v11_0.c +++ b/drivers/gpu/drm/amd/amdgpu/imu_v11_0.c @@ -50,7 +50,8 @@ static int imu_v11_0_init_microcode(struct amdgpu_device *adev) DRM_DEBUG("\n"); amdgpu_ucode_ip_version_decode(adev, GC_HWIP, ucode_prefix, sizeof(ucode_prefix)); - err = amdgpu_ucode_request(adev, &adev->gfx.imu_fw, "amdgpu/%s_imu.bin", ucode_prefix); + err = amdgpu_ucode_request(adev, &adev->gfx.imu_fw, AMDGPU_UCODE_REQUIRED, + "amdgpu/%s_imu.bin", ucode_prefix); if (err) goto out; diff --git a/drivers/gpu/drm/amd/amdgpu/imu_v12_0.c b/drivers/gpu/drm/amd/amdgpu/imu_v12_0.c index 1341f0292031..df898dbb746e 100644 --- a/drivers/gpu/drm/amd/amdgpu/imu_v12_0.c +++ b/drivers/gpu/drm/amd/amdgpu/imu_v12_0.c @@ -47,7 +47,8 @@ static int imu_v12_0_init_microcode(struct amdgpu_device *adev) DRM_DEBUG("\n"); amdgpu_ucode_ip_version_decode(adev, GC_HWIP, ucode_prefix, sizeof(ucode_prefix)); - err = amdgpu_ucode_request(adev, &adev->gfx.imu_fw, "amdgpu/%s_imu.bin", ucode_prefix); + err = amdgpu_ucode_request(adev, &adev->gfx.imu_fw, AMDGPU_UCODE_REQUIRED, + "amdgpu/%s_imu.bin", ucode_prefix); if (err) goto out; diff --git 
a/drivers/gpu/drm/amd/amdgpu/jpeg_v1_0.c b/drivers/gpu/drm/amd/amdgpu/jpeg_v1_0.c index 7319299f25ae..03b8b7cd5229 100644 --- a/drivers/gpu/drm/amd/amdgpu/jpeg_v1_0.c +++ b/drivers/gpu/drm/amd/amdgpu/jpeg_v1_0.c @@ -604,7 +604,7 @@ static void jpeg_v1_0_set_irq_funcs(struct amdgpu_device *adev) static void jpeg_v1_0_ring_begin_use(struct amdgpu_ring *ring) { struct amdgpu_device *adev = ring->adev; - bool set_clocks = !cancel_delayed_work_sync(&adev->jpeg.idle_work); + bool set_clocks = !cancel_delayed_work_sync(&adev->vcn.idle_work); int cnt = 0; mutex_lock(&adev->vcn.vcn1_jpeg1_workaround); diff --git a/drivers/gpu/drm/amd/amdgpu/jpeg_v2_0.c b/drivers/gpu/drm/amd/amdgpu/jpeg_v2_0.c index 6e29b69894a5..7c9251c03815 100644 --- a/drivers/gpu/drm/amd/amdgpu/jpeg_v2_0.c +++ b/drivers/gpu/drm/amd/amdgpu/jpeg_v2_0.c @@ -35,7 +35,7 @@ static void jpeg_v2_0_set_dec_ring_funcs(struct amdgpu_device *adev); static void jpeg_v2_0_set_irq_funcs(struct amdgpu_device *adev); -static int jpeg_v2_0_set_powergating_state(void *handle, +static int jpeg_v2_0_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state); /** @@ -154,7 +154,7 @@ static int jpeg_v2_0_hw_fini(struct amdgpu_ip_block *ip_block) if (adev->jpeg.cur_state != AMD_PG_STATE_GATE && RREG32_SOC15(JPEG, 0, mmUVD_JRBC_STATUS)) - jpeg_v2_0_set_powergating_state(adev, AMD_PG_STATE_GATE); + jpeg_v2_0_set_powergating_state(ip_block, AMD_PG_STATE_GATE); return 0; } @@ -675,14 +675,14 @@ static int jpeg_v2_0_wait_for_idle(struct amdgpu_ip_block *ip_block) return ret; } -static int jpeg_v2_0_set_clockgating_state(void *handle, +static int jpeg_v2_0_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state) { - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; bool enable = (state == AMD_CG_STATE_GATE); if (enable) { - if (!jpeg_v2_0_is_idle(handle)) + if (!jpeg_v2_0_is_idle(adev)) return -EBUSY; 
jpeg_v2_0_enable_clock_gating(adev); } else { @@ -692,10 +692,10 @@ static int jpeg_v2_0_set_clockgating_state(void *handle, return 0; } -static int jpeg_v2_0_set_powergating_state(void *handle, +static int jpeg_v2_0_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state) { - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; int ret; if (state == adev->jpeg.cur_state) diff --git a/drivers/gpu/drm/amd/amdgpu/jpeg_v2_5.c b/drivers/gpu/drm/amd/amdgpu/jpeg_v2_5.c index 9ac421486f05..11f6af2646e7 100644 --- a/drivers/gpu/drm/amd/amdgpu/jpeg_v2_5.c +++ b/drivers/gpu/drm/amd/amdgpu/jpeg_v2_5.c @@ -38,7 +38,7 @@ static void jpeg_v2_5_set_dec_ring_funcs(struct amdgpu_device *adev); static void jpeg_v2_5_set_irq_funcs(struct amdgpu_device *adev); -static int jpeg_v2_5_set_powergating_state(void *handle, +static int jpeg_v2_5_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state); static void jpeg_v2_5_set_ras_funcs(struct amdgpu_device *adev); @@ -219,7 +219,7 @@ static int jpeg_v2_5_hw_fini(struct amdgpu_ip_block *ip_block) if (adev->jpeg.cur_state != AMD_PG_STATE_GATE && RREG32_SOC15(JPEG, i, mmUVD_JRBC_STATUS)) - jpeg_v2_5_set_powergating_state(adev, AMD_PG_STATE_GATE); + jpeg_v2_5_set_powergating_state(ip_block, AMD_PG_STATE_GATE); if (amdgpu_ras_is_supported(adev, AMDGPU_RAS_BLOCK__JPEG)) amdgpu_irq_put(adev, &adev->jpeg.inst[i].ras_poison_irq, 0); @@ -518,10 +518,10 @@ static int jpeg_v2_5_wait_for_idle(struct amdgpu_ip_block *ip_block) return 0; } -static int jpeg_v2_5_set_clockgating_state(void *handle, +static int jpeg_v2_5_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state) { - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; bool enable = (state == AMD_CG_STATE_GATE); int i; @@ -530,7 +530,7 @@ static int jpeg_v2_5_set_clockgating_state(void 
*handle, continue; if (enable) { - if (!jpeg_v2_5_is_idle(handle)) + if (!jpeg_v2_5_is_idle(adev)) return -EBUSY; jpeg_v2_5_enable_clock_gating(adev, i); } else { @@ -541,10 +541,10 @@ static int jpeg_v2_5_set_clockgating_state(void *handle, return 0; } -static int jpeg_v2_5_set_powergating_state(void *handle, +static int jpeg_v2_5_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state) { - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; int ret; if (state == adev->jpeg.cur_state) diff --git a/drivers/gpu/drm/amd/amdgpu/jpeg_v3_0.c b/drivers/gpu/drm/amd/amdgpu/jpeg_v3_0.c index e0df6800502c..4eca65ea9053 100644 --- a/drivers/gpu/drm/amd/amdgpu/jpeg_v3_0.c +++ b/drivers/gpu/drm/amd/amdgpu/jpeg_v3_0.c @@ -36,7 +36,7 @@ static void jpeg_v3_0_set_dec_ring_funcs(struct amdgpu_device *adev); static void jpeg_v3_0_set_irq_funcs(struct amdgpu_device *adev); -static int jpeg_v3_0_set_powergating_state(void *handle, +static int jpeg_v3_0_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state); /** @@ -168,7 +168,7 @@ static int jpeg_v3_0_hw_fini(struct amdgpu_ip_block *ip_block) if (adev->jpeg.cur_state != AMD_PG_STATE_GATE && RREG32_SOC15(JPEG, 0, mmUVD_JRBC_STATUS)) - jpeg_v3_0_set_powergating_state(adev, AMD_PG_STATE_GATE); + jpeg_v3_0_set_powergating_state(ip_block, AMD_PG_STATE_GATE); return 0; } @@ -466,14 +466,14 @@ static int jpeg_v3_0_wait_for_idle(struct amdgpu_ip_block *ip_block) UVD_JRBC_STATUS__RB_JOB_DONE_MASK); } -static int jpeg_v3_0_set_clockgating_state(void *handle, +static int jpeg_v3_0_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state) { - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; bool enable = state == AMD_CG_STATE_GATE; if (enable) { - if (!jpeg_v3_0_is_idle(handle)) + if (!jpeg_v3_0_is_idle(adev)) return -EBUSY; 
jpeg_v3_0_enable_clock_gating(adev); } else { @@ -483,10 +483,10 @@ static int jpeg_v3_0_set_clockgating_state(void *handle, return 0; } -static int jpeg_v3_0_set_powergating_state(void *handle, +static int jpeg_v3_0_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state) { - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; int ret; if(state == adev->jpeg.cur_state) diff --git a/drivers/gpu/drm/amd/amdgpu/jpeg_v4_0.c b/drivers/gpu/drm/amd/amdgpu/jpeg_v4_0.c index eca1963c33b6..0aef1f64afd0 100644 --- a/drivers/gpu/drm/amd/amdgpu/jpeg_v4_0.c +++ b/drivers/gpu/drm/amd/amdgpu/jpeg_v4_0.c @@ -39,7 +39,7 @@ static int jpeg_v4_0_start_sriov(struct amdgpu_device *adev); static void jpeg_v4_0_set_dec_ring_funcs(struct amdgpu_device *adev); static void jpeg_v4_0_set_irq_funcs(struct amdgpu_device *adev); -static int jpeg_v4_0_set_powergating_state(void *handle, +static int jpeg_v4_0_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state); static void jpeg_v4_0_set_ras_funcs(struct amdgpu_device *adev); @@ -206,7 +206,7 @@ static int jpeg_v4_0_hw_fini(struct amdgpu_ip_block *ip_block) if (!amdgpu_sriov_vf(adev)) { if (adev->jpeg.cur_state != AMD_PG_STATE_GATE && RREG32_SOC15(JPEG, 0, regUVD_JRBC_STATUS)) - jpeg_v4_0_set_powergating_state(adev, AMD_PG_STATE_GATE); + jpeg_v4_0_set_powergating_state(ip_block, AMD_PG_STATE_GATE); } if (amdgpu_ras_is_supported(adev, AMDGPU_RAS_BLOCK__JPEG)) amdgpu_irq_put(adev, &adev->jpeg.inst->ras_poison_irq, 0); @@ -635,14 +635,14 @@ static int jpeg_v4_0_wait_for_idle(struct amdgpu_ip_block *ip_block) UVD_JRBC_STATUS__RB_JOB_DONE_MASK); } -static int jpeg_v4_0_set_clockgating_state(void *handle, +static int jpeg_v4_0_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state) { - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; 
bool enable = state == AMD_CG_STATE_GATE; if (enable) { - if (!jpeg_v4_0_is_idle(handle)) + if (!jpeg_v4_0_is_idle(adev)) return -EBUSY; jpeg_v4_0_enable_clock_gating(adev); } else { @@ -652,10 +652,10 @@ static int jpeg_v4_0_set_clockgating_state(void *handle, return 0; } -static int jpeg_v4_0_set_powergating_state(void *handle, +static int jpeg_v4_0_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state) { - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; int ret; if (amdgpu_sriov_vf(adev)) { diff --git a/drivers/gpu/drm/amd/amdgpu/jpeg_v4_0_3.c b/drivers/gpu/drm/amd/amdgpu/jpeg_v4_0_3.c index 67b51bcbacd1..88f9771c1686 100644 --- a/drivers/gpu/drm/amd/amdgpu/jpeg_v4_0_3.c +++ b/drivers/gpu/drm/amd/amdgpu/jpeg_v4_0_3.c @@ -43,7 +43,7 @@ enum jpeg_engin_status { static void jpeg_v4_0_3_set_dec_ring_funcs(struct amdgpu_device *adev); static void jpeg_v4_0_3_set_irq_funcs(struct amdgpu_device *adev); -static int jpeg_v4_0_3_set_powergating_state(void *handle, +static int jpeg_v4_0_3_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state); static void jpeg_v4_0_3_set_ras_funcs(struct amdgpu_device *adev); static void jpeg_v4_0_3_dec_ring_set_wptr(struct amdgpu_ring *ring); @@ -76,7 +76,7 @@ static int jpeg_v4_0_3_early_init(struct amdgpu_ip_block *ip_block) { struct amdgpu_device *adev = ip_block->adev; - adev->jpeg.num_jpeg_rings = AMDGPU_MAX_JPEG_RINGS; + adev->jpeg.num_jpeg_rings = AMDGPU_MAX_JPEG_RINGS_4_0_3; jpeg_v4_0_3_set_dec_ring_funcs(adev); jpeg_v4_0_3_set_irq_funcs(adev); @@ -321,7 +321,7 @@ static int jpeg_v4_0_3_hw_init(struct amdgpu_ip_block *ip_block) if (r) return r; - for (i = 0; i < adev->vcn.num_vcn_inst; ++i) { + for (i = 0; i < adev->jpeg.num_jpeg_inst; ++i) { for (j = 0; j < adev->jpeg.num_jpeg_rings; ++j) { ring = &adev->jpeg.inst[i].ring_dec[j]; ring->wptr = 0; @@ -379,7 +379,7 @@ static int jpeg_v4_0_3_hw_fini(struct 
amdgpu_ip_block *ip_block) if (!amdgpu_sriov_vf(adev)) { if (adev->jpeg.cur_state != AMD_PG_STATE_GATE) - ret = jpeg_v4_0_3_set_powergating_state(adev, AMD_PG_STATE_GATE); + ret = jpeg_v4_0_3_set_powergating_state(ip_block, AMD_PG_STATE_GATE); } return ret; @@ -949,16 +949,16 @@ static int jpeg_v4_0_3_wait_for_idle(struct amdgpu_ip_block *ip_block) return ret; } -static int jpeg_v4_0_3_set_clockgating_state(void *handle, +static int jpeg_v4_0_3_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state) { - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; bool enable = state == AMD_CG_STATE_GATE; int i; for (i = 0; i < adev->jpeg.num_jpeg_inst; ++i) { if (enable) { - if (!jpeg_v4_0_3_is_idle(handle)) + if (!jpeg_v4_0_3_is_idle(adev)) return -EBUSY; jpeg_v4_0_3_enable_clock_gating(adev, i); } else { @@ -968,10 +968,10 @@ static int jpeg_v4_0_3_set_clockgating_state(void *handle, return 0; } -static int jpeg_v4_0_3_set_powergating_state(void *handle, +static int jpeg_v4_0_3_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state) { - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; int ret; if (amdgpu_sriov_vf(adev)) { @@ -1231,9 +1231,95 @@ static const struct amdgpu_ras_block_hw_ops jpeg_v4_0_3_ras_hw_ops = { .reset_ras_error_count = jpeg_v4_0_3_reset_ras_error_count, }; +static int jpeg_v4_0_3_aca_bank_parser(struct aca_handle *handle, struct aca_bank *bank, + enum aca_smu_type type, void *data) +{ + struct aca_bank_info info; + u64 misc0; + int ret; + + ret = aca_bank_info_decode(bank, &info); + if (ret) + return ret; + + misc0 = bank->regs[ACA_REG_IDX_MISC0]; + switch (type) { + case ACA_SMU_TYPE_UE: + ret = aca_error_cache_log_bank_error(handle, &info, ACA_ERROR_TYPE_UE, + 1ULL); + break; + case ACA_SMU_TYPE_CE: + ret = aca_error_cache_log_bank_error(handle, &info, ACA_ERROR_TYPE_CE, 
+ ACA_REG__MISC0__ERRCNT(misc0)); + break; + default: + return -EINVAL; + } + + return ret; +} + +/* reference to smu driver if header file */ +static int jpeg_v4_0_3_err_codes[] = { + 16, 17, 18, 19, 20, 21, 22, 23, /* JPEG[0-7][S|D] */ + 24, 25, 26, 27, 28, 29, 30, 31 +}; + +static bool jpeg_v4_0_3_aca_bank_is_valid(struct aca_handle *handle, struct aca_bank *bank, + enum aca_smu_type type, void *data) +{ + u32 instlo; + + instlo = ACA_REG__IPID__INSTANCEIDLO(bank->regs[ACA_REG_IDX_IPID]); + instlo &= GENMASK(31, 1); + + if (instlo != mmSMNAID_AID0_MCA_SMU) + return false; + + if (aca_bank_check_error_codes(handle->adev, bank, + jpeg_v4_0_3_err_codes, + ARRAY_SIZE(jpeg_v4_0_3_err_codes))) + return false; + + return true; +} + +static const struct aca_bank_ops jpeg_v4_0_3_aca_bank_ops = { + .aca_bank_parser = jpeg_v4_0_3_aca_bank_parser, + .aca_bank_is_valid = jpeg_v4_0_3_aca_bank_is_valid, +}; + +static const struct aca_info jpeg_v4_0_3_aca_info = { + .hwip = ACA_HWIP_TYPE_SMU, + .mask = ACA_ERROR_UE_MASK, + .bank_ops = &jpeg_v4_0_3_aca_bank_ops, +}; + +static int jpeg_v4_0_3_ras_late_init(struct amdgpu_device *adev, struct ras_common_if *ras_block) +{ + int r; + + r = amdgpu_ras_block_late_init(adev, ras_block); + if (r) + return r; + + r = amdgpu_ras_bind_aca(adev, AMDGPU_RAS_BLOCK__JPEG, + &jpeg_v4_0_3_aca_info, NULL); + if (r) + goto late_fini; + + return 0; + +late_fini: + amdgpu_ras_block_late_fini(adev, ras_block); + + return r; +} + static struct amdgpu_jpeg_ras jpeg_v4_0_3_ras = { .ras_block = { .hw_ops = &jpeg_v4_0_3_ras_hw_ops, + .ras_late_init = jpeg_v4_0_3_ras_late_init, }, }; diff --git a/drivers/gpu/drm/amd/amdgpu/jpeg_v4_0_5.c b/drivers/gpu/drm/amd/amdgpu/jpeg_v4_0_5.c index 1d9e3b101c3a..6b3656984957 100644 --- a/drivers/gpu/drm/amd/amdgpu/jpeg_v4_0_5.c +++ b/drivers/gpu/drm/amd/amdgpu/jpeg_v4_0_5.c @@ -48,7 +48,7 @@ static void jpeg_v4_0_5_set_dec_ring_funcs(struct amdgpu_device *adev); static void jpeg_v4_0_5_set_irq_funcs(struct amdgpu_device 
*adev); -static int jpeg_v4_0_5_set_powergating_state(void *handle, +static int jpeg_v4_0_5_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state); static void jpeg_v4_0_5_dec_ring_set_wptr(struct amdgpu_ring *ring); @@ -236,7 +236,7 @@ static int jpeg_v4_0_5_hw_fini(struct amdgpu_ip_block *ip_block) if (!amdgpu_sriov_vf(adev)) { if (adev->jpeg.cur_state != AMD_PG_STATE_GATE && RREG32_SOC15(JPEG, i, regUVD_JRBC_STATUS)) - jpeg_v4_0_5_set_powergating_state(adev, AMD_PG_STATE_GATE); + jpeg_v4_0_5_set_powergating_state(ip_block, AMD_PG_STATE_GATE); } } return 0; @@ -660,10 +660,10 @@ static int jpeg_v4_0_5_wait_for_idle(struct amdgpu_ip_block *ip_block) return 0; } -static int jpeg_v4_0_5_set_clockgating_state(void *handle, +static int jpeg_v4_0_5_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state) { - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; bool enable = (state == AMD_CG_STATE_GATE) ? 
true : false; int i; @@ -672,7 +672,7 @@ static int jpeg_v4_0_5_set_clockgating_state(void *handle, continue; if (enable) { - if (!jpeg_v4_0_5_is_idle(handle)) + if (!jpeg_v4_0_5_is_idle(adev)) return -EBUSY; jpeg_v4_0_5_enable_clock_gating(adev, i); @@ -684,10 +684,10 @@ static int jpeg_v4_0_5_set_clockgating_state(void *handle, return 0; } -static int jpeg_v4_0_5_set_powergating_state(void *handle, +static int jpeg_v4_0_5_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state) { - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; int ret; if (amdgpu_sriov_vf(adev)) { diff --git a/drivers/gpu/drm/amd/amdgpu/jpeg_v5_0_0.c b/drivers/gpu/drm/amd/amdgpu/jpeg_v5_0_0.c index 58fb1e5fa89c..d5cf0f2799d4 100644 --- a/drivers/gpu/drm/amd/amdgpu/jpeg_v5_0_0.c +++ b/drivers/gpu/drm/amd/amdgpu/jpeg_v5_0_0.c @@ -31,12 +31,12 @@ #include "vcn/vcn_5_0_0_offset.h" #include "vcn/vcn_5_0_0_sh_mask.h" -#include "ivsrcid/vcn/irqsrcs_vcn_4_0.h" +#include "ivsrcid/vcn/irqsrcs_vcn_5_0.h" #include "jpeg_v5_0_0.h" static void jpeg_v5_0_0_set_dec_ring_funcs(struct amdgpu_device *adev); static void jpeg_v5_0_0_set_irq_funcs(struct amdgpu_device *adev); -static int jpeg_v5_0_0_set_powergating_state(void *handle, +static int jpeg_v5_0_0_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state); /** @@ -74,7 +74,7 @@ static int jpeg_v5_0_0_sw_init(struct amdgpu_ip_block *ip_block) /* JPEG TRAP */ r = amdgpu_irq_add_id(adev, SOC15_IH_CLIENTID_VCN, - VCN_4_0__SRCID__JPEG_DECODE, &adev->jpeg.inst->irq); + VCN_5_0__SRCID__JPEG_DECODE, &adev->jpeg.inst->irq); if (r) return r; @@ -172,7 +172,7 @@ static int jpeg_v5_0_0_hw_fini(struct amdgpu_ip_block *ip_block) if (adev->jpeg.cur_state != AMD_PG_STATE_GATE && RREG32_SOC15(JPEG, 0, regUVD_JRBC_STATUS)) - jpeg_v5_0_0_set_powergating_state(adev, AMD_PG_STATE_GATE); + jpeg_v5_0_0_set_powergating_state(ip_block, AMD_PG_STATE_GATE); 
return 0; } @@ -560,14 +560,14 @@ static int jpeg_v5_0_0_wait_for_idle(struct amdgpu_ip_block *ip_block) UVD_JRBC_STATUS__RB_JOB_DONE_MASK); } -static int jpeg_v5_0_0_set_clockgating_state(void *handle, +static int jpeg_v5_0_0_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state) { - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; bool enable = (state == AMD_CG_STATE_GATE) ? true : false; if (enable) { - if (!jpeg_v5_0_0_is_idle(handle)) + if (!jpeg_v5_0_0_is_idle(adev)) return -EBUSY; jpeg_v5_0_0_enable_clock_gating(adev); } else { @@ -577,10 +577,10 @@ static int jpeg_v5_0_0_set_clockgating_state(void *handle, return 0; } -static int jpeg_v5_0_0_set_powergating_state(void *handle, +static int jpeg_v5_0_0_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state) { - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; int ret; if (state == adev->jpeg.cur_state) @@ -612,7 +612,7 @@ static int jpeg_v5_0_0_process_interrupt(struct amdgpu_device *adev, DRM_DEBUG("IH: JPEG TRAP\n"); switch (entry->src_id) { - case VCN_4_0__SRCID__JPEG_DECODE: + case VCN_5_0__SRCID__JPEG_DECODE: amdgpu_fence_process(adev->jpeg.inst->ring_dec); break; default: diff --git a/drivers/gpu/drm/amd/amdgpu/jpeg_v5_0_1.c b/drivers/gpu/drm/amd/amdgpu/jpeg_v5_0_1.c new file mode 100644 index 000000000000..40d4c32a8c2a --- /dev/null +++ b/drivers/gpu/drm/amd/amdgpu/jpeg_v5_0_1.c @@ -0,0 +1,708 @@ +// SPDX-License-Identifier: GPL-2.0 OR MIT +/* + * Copyright 2014-2024 Advanced Micro Devices, Inc. All rights reserved. 
+ * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * OTHER DEALINGS IN THE SOFTWARE. 
+ */ + +#include "amdgpu.h" +#include "amdgpu_jpeg.h" +#include "amdgpu_pm.h" +#include "soc15.h" +#include "soc15d.h" +#include "jpeg_v4_0_3.h" +#include "jpeg_v5_0_1.h" + +#include "vcn/vcn_5_0_0_offset.h" +#include "vcn/vcn_5_0_0_sh_mask.h" +#include "ivsrcid/vcn/irqsrcs_vcn_5_0.h" + +static void jpeg_v5_0_1_set_dec_ring_funcs(struct amdgpu_device *adev); +static void jpeg_v5_0_1_set_irq_funcs(struct amdgpu_device *adev); +static int jpeg_v5_0_1_set_powergating_state(struct amdgpu_ip_block *ip_block, + enum amd_powergating_state state); +static void jpeg_v5_0_1_dec_ring_set_wptr(struct amdgpu_ring *ring); + +static int amdgpu_ih_srcid_jpeg[] = { + VCN_5_0__SRCID__JPEG_DECODE, + VCN_5_0__SRCID__JPEG1_DECODE, + VCN_5_0__SRCID__JPEG2_DECODE, + VCN_5_0__SRCID__JPEG3_DECODE, + VCN_5_0__SRCID__JPEG4_DECODE, + VCN_5_0__SRCID__JPEG5_DECODE, + VCN_5_0__SRCID__JPEG6_DECODE, + VCN_5_0__SRCID__JPEG7_DECODE, + VCN_5_0__SRCID__JPEG8_DECODE, + VCN_5_0__SRCID__JPEG9_DECODE, +}; + +static int jpeg_v5_0_1_core_reg_offset(u32 pipe) +{ + if (pipe <= AMDGPU_MAX_JPEG_RINGS_4_0_3) + return ((0x40 * pipe) - 0xc80); + else + return ((0x40 * pipe) - 0x440); +} + +/** + * jpeg_v5_0_1_early_init - set function pointers + * + * @ip_block: Pointer to the amdgpu_ip_block for this hw instance. + * + * Set ring and irq function pointers + */ +static int jpeg_v5_0_1_early_init(struct amdgpu_ip_block *ip_block) +{ + struct amdgpu_device *adev = ip_block->adev; + + if (!adev->jpeg.num_jpeg_inst || adev->jpeg.num_jpeg_inst > AMDGPU_MAX_JPEG_INSTANCES) + return -ENOENT; + + adev->jpeg.num_jpeg_rings = AMDGPU_MAX_JPEG_RINGS; + jpeg_v5_0_1_set_dec_ring_funcs(adev); + jpeg_v5_0_1_set_irq_funcs(adev); + + return 0; +} + +/** + * jpeg_v5_0_1_sw_init - sw init for JPEG block + * + * @ip_block: Pointer to the amdgpu_ip_block for this hw instance. 
+ * + * Load firmware and sw initialization + */ +static int jpeg_v5_0_1_sw_init(struct amdgpu_ip_block *ip_block) +{ + struct amdgpu_device *adev = ip_block->adev; + struct amdgpu_ring *ring; + int i, j, r, jpeg_inst; + + for (j = 0; j < adev->jpeg.num_jpeg_rings; ++j) { + /* JPEG TRAP */ + r = amdgpu_irq_add_id(adev, SOC15_IH_CLIENTID_VCN, + amdgpu_ih_srcid_jpeg[j], &adev->jpeg.inst->irq); + if (r) + return r; + } + + r = amdgpu_jpeg_sw_init(adev); + if (r) + return r; + + r = amdgpu_jpeg_resume(adev); + if (r) + return r; + + for (i = 0; i < adev->jpeg.num_jpeg_inst; ++i) { + jpeg_inst = GET_INST(JPEG, i); + + for (j = 0; j < adev->jpeg.num_jpeg_rings; ++j) { + ring = &adev->jpeg.inst[i].ring_dec[j]; + ring->use_doorbell = false; + ring->vm_hub = AMDGPU_MMHUB0(adev->jpeg.inst[i].aid_id); + if (!amdgpu_sriov_vf(adev)) { + ring->doorbell_index = + (adev->doorbell_index.vcn.vcn_ring0_1 << 1) + + 1 + j + 11 * jpeg_inst; + } else { + if (j < 4) + ring->doorbell_index = + (adev->doorbell_index.vcn.vcn_ring0_1 << 1) + + 4 + j + 32 * jpeg_inst; + else + ring->doorbell_index = + (adev->doorbell_index.vcn.vcn_ring0_1 << 1) + + 8 + j + 32 * jpeg_inst; + } + sprintf(ring->name, "jpeg_dec_%d.%d", adev->jpeg.inst[i].aid_id, j); + r = amdgpu_ring_init(adev, ring, 512, &adev->jpeg.inst->irq, 0, + AMDGPU_RING_PRIO_DEFAULT, NULL); + if (r) + return r; + + adev->jpeg.internal.jpeg_pitch[j] = + regUVD_JRBC0_UVD_JRBC_SCRATCH0_INTERNAL_OFFSET; + adev->jpeg.inst[i].external.jpeg_pitch[j] = + SOC15_REG_OFFSET1(JPEG, jpeg_inst, regUVD_JRBC_SCRATCH0, + (j ? jpeg_v5_0_1_core_reg_offset(j) : 0)); + } + } + + return 0; +} + +/** + * jpeg_v5_0_1_sw_fini - sw fini for JPEG block + * + * @ip_block: Pointer to the amdgpu_ip_block for this hw instance. 
+ * + * JPEG suspend and free up sw allocation + */ +static int jpeg_v5_0_1_sw_fini(struct amdgpu_ip_block *ip_block) +{ + struct amdgpu_device *adev = ip_block->adev; + int r; + + r = amdgpu_jpeg_suspend(adev); + if (r) + return r; + + r = amdgpu_jpeg_sw_fini(adev); + + return r; +} + +/** + * jpeg_v5_0_1_hw_init - start and test JPEG block + * + * @ip_block: Pointer to the amdgpu_ip_block for this hw instance. + * + */ +static int jpeg_v5_0_1_hw_init(struct amdgpu_ip_block *ip_block) +{ + struct amdgpu_device *adev = ip_block->adev; + struct amdgpu_ring *ring; + int i, j, r, jpeg_inst; + + if (amdgpu_sriov_vf(adev)) { + /* jpeg_v5_0_1_start_sriov(adev); */ + for (i = 0; i < adev->jpeg.num_jpeg_inst; ++i) { + for (j = 0; j < adev->jpeg.num_jpeg_rings; ++j) { + ring = &adev->jpeg.inst[i].ring_dec[j]; + ring->wptr = 0; + ring->wptr_old = 0; + jpeg_v5_0_1_dec_ring_set_wptr(ring); + ring->sched.ready = true; + } + } + return 0; + } + for (i = 0; i < adev->jpeg.num_jpeg_inst; ++i) { + jpeg_inst = GET_INST(JPEG, i); + ring = adev->jpeg.inst[i].ring_dec; + if (ring->use_doorbell) + adev->nbio.funcs->vcn_doorbell_range(adev, ring->use_doorbell, + (adev->doorbell_index.vcn.vcn_ring0_1 << 1) + 11 * jpeg_inst, + adev->jpeg.inst[i].aid_id); + + for (j = 0; j < adev->jpeg.num_jpeg_rings; ++j) { + ring = &adev->jpeg.inst[i].ring_dec[j]; + if (ring->use_doorbell) + WREG32_SOC15_OFFSET(VCN, GET_INST(VCN, i), regVCN_JPEG_DB_CTRL, + (ring->pipe ? (ring->pipe - 0x15) : 0), + ring->doorbell_index << + VCN_JPEG_DB_CTRL__OFFSET__SHIFT | + VCN_JPEG_DB_CTRL__EN_MASK); + r = amdgpu_ring_test_helper(ring); + if (r) + return r; + } + } + + return 0; +} + +/** + * jpeg_v5_0_1_hw_fini - stop the hardware block + * + * @ip_block: Pointer to the amdgpu_ip_block for this hw instance. 
+ * + * Stop the JPEG block, mark ring as not ready any more + */ +static int jpeg_v5_0_1_hw_fini(struct amdgpu_ip_block *ip_block) +{ + struct amdgpu_device *adev = ip_block->adev; + int ret = 0; + + cancel_delayed_work_sync(&adev->jpeg.idle_work); + + if (adev->jpeg.cur_state != AMD_PG_STATE_GATE) + ret = jpeg_v5_0_1_set_powergating_state(ip_block, AMD_PG_STATE_GATE); + + return ret; +} + +/** + * jpeg_v5_0_1_suspend - suspend JPEG block + * + * @ip_block: Pointer to the amdgpu_ip_block for this hw instance. + * + * HW fini and suspend JPEG block + */ +static int jpeg_v5_0_1_suspend(struct amdgpu_ip_block *ip_block) +{ + struct amdgpu_device *adev = ip_block->adev; + int r; + + r = jpeg_v5_0_1_hw_fini(ip_block); + if (r) + return r; + + r = amdgpu_jpeg_suspend(adev); + + return r; +} + +/** + * jpeg_v5_0_1_resume - resume JPEG block + * + * @ip_block: Pointer to the amdgpu_ip_block for this hw instance. + * + * Resume firmware and hw init JPEG block + */ +static int jpeg_v5_0_1_resume(struct amdgpu_ip_block *ip_block) +{ + struct amdgpu_device *adev = ip_block->adev; + int r; + + r = amdgpu_jpeg_resume(adev); + if (r) + return r; + + r = jpeg_v5_0_1_hw_init(ip_block); + + return r; +} + +static int jpeg_v5_0_1_disable_antihang(struct amdgpu_device *adev, int inst_idx) +{ + int jpeg_inst; + + jpeg_inst = GET_INST(JPEG, inst_idx); + /* disable anti hang mechanism */ + WREG32_P(SOC15_REG_OFFSET(JPEG, jpeg_inst, regUVD_JPEG_POWER_STATUS), 0, + ~UVD_JPEG_POWER_STATUS__JPEG_POWER_STATUS_MASK); + + /* keep the JPEG in static PG mode */ + WREG32_P(SOC15_REG_OFFSET(JPEG, jpeg_inst, regUVD_JPEG_POWER_STATUS), 0, + ~UVD_JPEG_POWER_STATUS__JPEG_PG_MODE_MASK); + + return 0; +} + +static int jpeg_v5_0_1_enable_antihang(struct amdgpu_device *adev, int inst_idx) +{ + int jpeg_inst; + + jpeg_inst = GET_INST(JPEG, inst_idx); + /* enable anti hang mechanism */ + WREG32_P(SOC15_REG_OFFSET(JPEG, jpeg_inst, regUVD_JPEG_POWER_STATUS), + UVD_JPEG_POWER_STATUS__JPEG_POWER_STATUS_MASK, + 
~UVD_JPEG_POWER_STATUS__JPEG_POWER_STATUS_MASK); + + return 0; +} + +/** + * jpeg_v5_0_1_start - start JPEG block + * + * @adev: amdgpu_device pointer + * + * Setup and start the JPEG block + */ +static int jpeg_v5_0_1_start(struct amdgpu_device *adev) +{ + struct amdgpu_ring *ring; + int i, j, jpeg_inst, r; + + for (i = 0; i < adev->jpeg.num_jpeg_inst; ++i) { + jpeg_inst = GET_INST(JPEG, i); + + /* disable antihang */ + r = jpeg_v5_0_1_disable_antihang(adev, i); + if (r) + return r; + + /* MJPEG global tiling registers */ + WREG32_SOC15(JPEG, 0, regJPEG_DEC_GFX10_ADDR_CONFIG, + adev->gfx.config.gb_addr_config); + + /* enable JMI channel */ + WREG32_P(SOC15_REG_OFFSET(JPEG, jpeg_inst, regUVD_JMI_CNTL), 0, + ~UVD_JMI_CNTL__SOFT_RESET_MASK); + + for (j = 0; j < adev->jpeg.num_jpeg_rings; ++j) { + int reg_offset = (j ? jpeg_v5_0_1_core_reg_offset(j) : 0); + u32 reg, data, mask; + + ring = &adev->jpeg.inst[i].ring_dec[j]; + + /* enable System Interrupt for JRBC */ + reg = SOC15_REG_OFFSET(JPEG, jpeg_inst, regJPEG_SYS_INT_EN); + if (j < AMDGPU_MAX_JPEG_RINGS_4_0_3) { + data = JPEG_SYS_INT_EN__DJRBC0_MASK << j; + mask = ~(JPEG_SYS_INT_EN__DJRBC0_MASK << j); + WREG32_P(reg, data, mask); + } else { + data = JPEG_SYS_INT_EN__DJRBC0_MASK << (j+12); + mask = ~(JPEG_SYS_INT_EN__DJRBC0_MASK << (j+12)); + WREG32_P(reg, data, mask); + } + + WREG32_SOC15_OFFSET(JPEG, jpeg_inst, + regUVD_LMI_JRBC_RB_VMID, + reg_offset, 0); + WREG32_SOC15_OFFSET(JPEG, jpeg_inst, + regUVD_JRBC_RB_CNTL, + reg_offset, + (0x00000001L | 0x00000002L)); + WREG32_SOC15_OFFSET(JPEG, jpeg_inst, + regUVD_LMI_JRBC_RB_64BIT_BAR_LOW, + reg_offset, lower_32_bits(ring->gpu_addr)); + WREG32_SOC15_OFFSET(JPEG, jpeg_inst, + regUVD_LMI_JRBC_RB_64BIT_BAR_HIGH, + reg_offset, upper_32_bits(ring->gpu_addr)); + WREG32_SOC15_OFFSET(JPEG, jpeg_inst, + regUVD_JRBC_RB_RPTR, + reg_offset, 0); + WREG32_SOC15_OFFSET(JPEG, jpeg_inst, + regUVD_JRBC_RB_WPTR, + reg_offset, 0); + WREG32_SOC15_OFFSET(JPEG, jpeg_inst, + 
regUVD_JRBC_RB_CNTL, + reg_offset, 0x00000002L); + WREG32_SOC15_OFFSET(JPEG, jpeg_inst, + regUVD_JRBC_RB_SIZE, + reg_offset, ring->ring_size / 4); + ring->wptr = RREG32_SOC15_OFFSET(JPEG, jpeg_inst, regUVD_JRBC_RB_WPTR, + reg_offset); + } + } + + return 0; +} + +/** + * jpeg_v5_0_1_stop - stop JPEG block + * + * @adev: amdgpu_device pointer + * + * stop the JPEG block + */ +static int jpeg_v5_0_1_stop(struct amdgpu_device *adev) +{ + int i, jpeg_inst, r; + + for (i = 0; i < adev->jpeg.num_jpeg_inst; ++i) { + jpeg_inst = GET_INST(JPEG, i); + /* reset JMI */ + WREG32_P(SOC15_REG_OFFSET(JPEG, jpeg_inst, regUVD_JMI_CNTL), + UVD_JMI_CNTL__SOFT_RESET_MASK, + ~UVD_JMI_CNTL__SOFT_RESET_MASK); + + /* enable antihang */ + r = jpeg_v5_0_1_enable_antihang(adev, i); + if (r) + return r; + } + + return 0; +} + +/** + * jpeg_v5_0_1_dec_ring_get_rptr - get read pointer + * + * @ring: amdgpu_ring pointer + * + * Returns the current hardware read pointer + */ +static uint64_t jpeg_v5_0_1_dec_ring_get_rptr(struct amdgpu_ring *ring) +{ + struct amdgpu_device *adev = ring->adev; + + return RREG32_SOC15_OFFSET(JPEG, GET_INST(JPEG, ring->me), regUVD_JRBC_RB_RPTR, + ring->pipe ? jpeg_v5_0_1_core_reg_offset(ring->pipe) : 0); +} + +/** + * jpeg_v5_0_1_dec_ring_get_wptr - get write pointer + * + * @ring: amdgpu_ring pointer + * + * Returns the current hardware write pointer + */ +static uint64_t jpeg_v5_0_1_dec_ring_get_wptr(struct amdgpu_ring *ring) +{ + struct amdgpu_device *adev = ring->adev; + + if (ring->use_doorbell) + return adev->wb.wb[ring->wptr_offs]; + + return RREG32_SOC15_OFFSET(JPEG, GET_INST(JPEG, ring->me), regUVD_JRBC_RB_WPTR, + ring->pipe ? 
jpeg_v5_0_1_core_reg_offset(ring->pipe) : 0); +} + +/** + * jpeg_v5_0_1_dec_ring_set_wptr - set write pointer + * + * @ring: amdgpu_ring pointer + * + * Commits the write pointer to the hardware + */ +static void jpeg_v5_0_1_dec_ring_set_wptr(struct amdgpu_ring *ring) +{ + struct amdgpu_device *adev = ring->adev; + + if (ring->use_doorbell) { + adev->wb.wb[ring->wptr_offs] = lower_32_bits(ring->wptr); + WDOORBELL32(ring->doorbell_index, lower_32_bits(ring->wptr)); + } else { + WREG32_SOC15_OFFSET(JPEG, GET_INST(JPEG, ring->me), + regUVD_JRBC_RB_WPTR, + (ring->pipe ? jpeg_v5_0_1_core_reg_offset(ring->pipe) : 0), + lower_32_bits(ring->wptr)); + } +} + +static bool jpeg_v5_0_1_is_idle(void *handle) +{ + struct amdgpu_device *adev = (struct amdgpu_device *)handle; + bool ret = true; + int i, j; + + for (i = 0; i < adev->jpeg.num_jpeg_inst; ++i) { + for (j = 0; j < adev->jpeg.num_jpeg_rings; ++j) { + int reg_offset = (j ? jpeg_v5_0_1_core_reg_offset(j) : 0); + + ret &= ((RREG32_SOC15_OFFSET(JPEG, GET_INST(JPEG, i), + regUVD_JRBC_STATUS, reg_offset) & + UVD_JRBC_STATUS__RB_JOB_DONE_MASK) == + UVD_JRBC_STATUS__RB_JOB_DONE_MASK); + } + } + + return ret; +} + +static int jpeg_v5_0_1_wait_for_idle(struct amdgpu_ip_block *ip_block) +{ + struct amdgpu_device *adev = ip_block->adev; + int ret = 0; + int i, j; + + for (i = 0; i < adev->jpeg.num_jpeg_inst; ++i) { + for (j = 0; j < adev->jpeg.num_jpeg_rings; ++j) { + int reg_offset = (j ? jpeg_v5_0_1_core_reg_offset(j) : 0); + + ret &= SOC15_WAIT_ON_RREG_OFFSET(JPEG, GET_INST(JPEG, i), + regUVD_JRBC_STATUS, reg_offset, + UVD_JRBC_STATUS__RB_JOB_DONE_MASK, + UVD_JRBC_STATUS__RB_JOB_DONE_MASK); + } + } + return ret; +} + +static int jpeg_v5_0_1_set_clockgating_state(struct amdgpu_ip_block *ip_block, + enum amd_clockgating_state state) +{ + struct amdgpu_device *adev = ip_block->adev; + bool enable = (state == AMD_CG_STATE_GATE) ? 
true : false; + + int i; + + if (!enable) + return 0; + + for (i = 0; i < adev->jpeg.num_jpeg_inst; ++i) { + if (!jpeg_v5_0_1_is_idle(adev)) + return -EBUSY; + } + + return 0; +} + +static int jpeg_v5_0_1_set_powergating_state(struct amdgpu_ip_block *ip_block, + enum amd_powergating_state state) +{ + struct amdgpu_device *adev = ip_block->adev; + int ret; + + if (state == adev->jpeg.cur_state) + return 0; + + if (state == AMD_PG_STATE_GATE) + ret = jpeg_v5_0_1_stop(adev); + else + ret = jpeg_v5_0_1_start(adev); + + if (!ret) + adev->jpeg.cur_state = state; + + return ret; +} + +static int jpeg_v5_0_1_set_interrupt_state(struct amdgpu_device *adev, + struct amdgpu_irq_src *source, + unsigned int type, + enum amdgpu_interrupt_state state) +{ + return 0; +} + +static int jpeg_v5_0_1_process_interrupt(struct amdgpu_device *adev, + struct amdgpu_irq_src *source, + struct amdgpu_iv_entry *entry) +{ + u32 i, inst; + + i = node_id_to_phys_map[entry->node_id]; + DRM_DEV_DEBUG(adev->dev, "IH: JPEG TRAP\n"); + + for (inst = 0; inst < adev->jpeg.num_jpeg_inst; ++inst) + if (adev->jpeg.inst[inst].aid_id == i) + break; + + if (inst >= adev->jpeg.num_jpeg_inst) { + dev_WARN_ONCE(adev->dev, 1, + "Interrupt received for unknown JPEG instance %d", + entry->node_id); + return 0; + } + + switch (entry->src_id) { + case VCN_5_0__SRCID__JPEG_DECODE: + amdgpu_fence_process(&adev->jpeg.inst[inst].ring_dec[0]); + break; + case VCN_5_0__SRCID__JPEG1_DECODE: + amdgpu_fence_process(&adev->jpeg.inst[inst].ring_dec[1]); + break; + case VCN_5_0__SRCID__JPEG2_DECODE: + amdgpu_fence_process(&adev->jpeg.inst[inst].ring_dec[2]); + break; + case VCN_5_0__SRCID__JPEG3_DECODE: + amdgpu_fence_process(&adev->jpeg.inst[inst].ring_dec[3]); + break; + case VCN_5_0__SRCID__JPEG4_DECODE: + amdgpu_fence_process(&adev->jpeg.inst[inst].ring_dec[4]); + break; + case VCN_5_0__SRCID__JPEG5_DECODE: + amdgpu_fence_process(&adev->jpeg.inst[inst].ring_dec[5]); + break; + case VCN_5_0__SRCID__JPEG6_DECODE: + 
amdgpu_fence_process(&adev->jpeg.inst[inst].ring_dec[6]); + break; + case VCN_5_0__SRCID__JPEG7_DECODE: + amdgpu_fence_process(&adev->jpeg.inst[inst].ring_dec[7]); + break; + case VCN_5_0__SRCID__JPEG8_DECODE: + amdgpu_fence_process(&adev->jpeg.inst[inst].ring_dec[8]); + break; + case VCN_5_0__SRCID__JPEG9_DECODE: + amdgpu_fence_process(&adev->jpeg.inst[inst].ring_dec[9]); + break; + default: + DRM_DEV_ERROR(adev->dev, "Unhandled interrupt: %d %d\n", + entry->src_id, entry->src_data[0]); + break; + } + + return 0; +} + +static const struct amd_ip_funcs jpeg_v5_0_1_ip_funcs = { + .name = "jpeg_v5_0_1", + .early_init = jpeg_v5_0_1_early_init, + .late_init = NULL, + .sw_init = jpeg_v5_0_1_sw_init, + .sw_fini = jpeg_v5_0_1_sw_fini, + .hw_init = jpeg_v5_0_1_hw_init, + .hw_fini = jpeg_v5_0_1_hw_fini, + .suspend = jpeg_v5_0_1_suspend, + .resume = jpeg_v5_0_1_resume, + .is_idle = jpeg_v5_0_1_is_idle, + .wait_for_idle = jpeg_v5_0_1_wait_for_idle, + .check_soft_reset = NULL, + .pre_soft_reset = NULL, + .soft_reset = NULL, + .post_soft_reset = NULL, + .set_clockgating_state = jpeg_v5_0_1_set_clockgating_state, + .set_powergating_state = jpeg_v5_0_1_set_powergating_state, + .dump_ip_state = NULL, + .print_ip_state = NULL, +}; + +static const struct amdgpu_ring_funcs jpeg_v5_0_1_dec_ring_vm_funcs = { + .type = AMDGPU_RING_TYPE_VCN_JPEG, + .align_mask = 0xf, + .get_rptr = jpeg_v5_0_1_dec_ring_get_rptr, + .get_wptr = jpeg_v5_0_1_dec_ring_get_wptr, + .set_wptr = jpeg_v5_0_1_dec_ring_set_wptr, + .emit_frame_size = + SOC15_FLUSH_GPU_TLB_NUM_WREG * 6 + + SOC15_FLUSH_GPU_TLB_NUM_REG_WAIT * 8 + + 8 + /* jpeg_v5_0_1_dec_ring_emit_vm_flush */ + 22 + 22 + /* jpeg_v5_0_1_dec_ring_emit_fence x2 vm fence */ + 8 + 16, + .emit_ib_size = 22, /* jpeg_v5_0_1_dec_ring_emit_ib */ + .emit_ib = jpeg_v4_0_3_dec_ring_emit_ib, + .emit_fence = jpeg_v4_0_3_dec_ring_emit_fence, + .emit_vm_flush = jpeg_v4_0_3_dec_ring_emit_vm_flush, + .test_ring = amdgpu_jpeg_dec_ring_test_ring, + .test_ib = 
amdgpu_jpeg_dec_ring_test_ib, + .insert_nop = jpeg_v4_0_3_dec_ring_nop, + .insert_start = jpeg_v4_0_3_dec_ring_insert_start, + .insert_end = jpeg_v4_0_3_dec_ring_insert_end, + .pad_ib = amdgpu_ring_generic_pad_ib, + .begin_use = amdgpu_jpeg_ring_begin_use, + .end_use = amdgpu_jpeg_ring_end_use, + .emit_wreg = jpeg_v4_0_3_dec_ring_emit_wreg, + .emit_reg_wait = jpeg_v4_0_3_dec_ring_emit_reg_wait, + .emit_reg_write_reg_wait = amdgpu_ring_emit_reg_write_reg_wait_helper, +}; + +static void jpeg_v5_0_1_set_dec_ring_funcs(struct amdgpu_device *adev) +{ + int i, j, jpeg_inst; + + for (i = 0; i < adev->jpeg.num_jpeg_inst; ++i) { + for (j = 0; j < adev->jpeg.num_jpeg_rings; ++j) { + adev->jpeg.inst[i].ring_dec[j].funcs = &jpeg_v5_0_1_dec_ring_vm_funcs; + adev->jpeg.inst[i].ring_dec[j].me = i; + adev->jpeg.inst[i].ring_dec[j].pipe = j; + } + jpeg_inst = GET_INST(JPEG, i); + adev->jpeg.inst[i].aid_id = + jpeg_inst / adev->jpeg.num_inst_per_aid; + } +} + +static const struct amdgpu_irq_src_funcs jpeg_v5_0_1_irq_funcs = { + .set = jpeg_v5_0_1_set_interrupt_state, + .process = jpeg_v5_0_1_process_interrupt, +}; + +static void jpeg_v5_0_1_set_irq_funcs(struct amdgpu_device *adev) +{ + int i; + + for (i = 0; i < adev->jpeg.num_jpeg_inst; ++i) + adev->jpeg.inst->irq.num_types += adev->jpeg.num_jpeg_rings; + + adev->jpeg.inst->irq.funcs = &jpeg_v5_0_1_irq_funcs; +} + +const struct amdgpu_ip_block_version jpeg_v5_0_1_ip_block = { + .type = AMD_IP_BLOCK_TYPE_JPEG, + .major = 5, + .minor = 0, + .rev = 1, + .funcs = &jpeg_v5_0_1_ip_funcs, +}; diff --git a/drivers/gpu/drm/amd/amdgpu/jpeg_v5_0_1.h b/drivers/gpu/drm/amd/amdgpu/jpeg_v5_0_1.h new file mode 100644 index 000000000000..8ce146c00bb6 --- /dev/null +++ b/drivers/gpu/drm/amd/amdgpu/jpeg_v5_0_1.h @@ -0,0 +1,29 @@ +/* + * Copyright 2024 Advanced Micro Devices, Inc. 
+ * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * OTHER DEALINGS IN THE SOFTWARE. 
+ * + */ + +#ifndef __JPEG_V5_0_1_H__ +#define __JPEG_V5_0_1_H__ + +extern const struct amdgpu_ip_block_version jpeg_v5_0_1_ip_block; + +#endif /* __JPEG_V5_0_1_H__ */ diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c index 9c905b9e9376..65f389eb65e5 100644 --- a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c +++ b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c @@ -1505,9 +1505,7 @@ static void mes_v11_0_kiq_setting(struct amdgpu_ring *ring) tmp = RREG32_SOC15(GC, 0, regRLC_CP_SCHEDULERS); tmp &= 0xffffff00; tmp |= (ring->me << 5) | (ring->pipe << 3) | (ring->queue); - WREG32_SOC15(GC, 0, regRLC_CP_SCHEDULERS, tmp); - tmp |= 0x80; - WREG32_SOC15(GC, 0, regRLC_CP_SCHEDULERS, tmp); + WREG32_SOC15(GC, 0, regRLC_CP_SCHEDULERS, tmp | 0x80); } static void mes_v11_0_kiq_clear(struct amdgpu_device *adev) diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v12_0.c b/drivers/gpu/drm/amd/amdgpu/mes_v12_0.c index 9ecc5d61e49b..5b537806b4da 100644 --- a/drivers/gpu/drm/amd/amdgpu/mes_v12_0.c +++ b/drivers/gpu/drm/amd/amdgpu/mes_v12_0.c @@ -24,6 +24,7 @@ #include #include #include "amdgpu.h" +#include "gfx_v12_0.h" #include "soc15_common.h" #include "soc21.h" #include "gc/gc_12_0_0_offset.h" @@ -350,6 +351,132 @@ static int mes_v12_0_remove_hw_queue(struct amdgpu_mes *mes, offsetof(union MESAPI__REMOVE_QUEUE, api_status)); } +int gfx_v12_0_request_gfx_index_mutex(struct amdgpu_device *adev, + bool req) +{ + u32 i, tmp, val; + + for (i = 0; i < adev->usec_timeout; i++) { + /* Request with MeId=2, PipeId=0 */ + tmp = REG_SET_FIELD(0, CP_GFX_INDEX_MUTEX, REQUEST, req); + tmp = REG_SET_FIELD(tmp, CP_GFX_INDEX_MUTEX, CLIENTID, 4); + WREG32_SOC15(GC, 0, regCP_GFX_INDEX_MUTEX, tmp); + + val = RREG32_SOC15(GC, 0, regCP_GFX_INDEX_MUTEX); + if (req) { + if (val == tmp) + break; + } else { + tmp = REG_SET_FIELD(tmp, CP_GFX_INDEX_MUTEX, + REQUEST, 1); + + /* unlocked or locked by firmware */ + if (val != tmp) + break; + } + udelay(1); + } + + if (i >= adev->usec_timeout) 
+ return -EINVAL; + + return 0; +} + +static int mes_v12_0_reset_queue_mmio(struct amdgpu_mes *mes, uint32_t queue_type, + uint32_t me_id, uint32_t pipe_id, + uint32_t queue_id, uint32_t vmid) +{ + struct amdgpu_device *adev = mes->adev; + uint32_t value, reg; + int i, r = 0; + + amdgpu_gfx_rlc_enter_safe_mode(adev, 0); + + if (queue_type == AMDGPU_RING_TYPE_GFX) { + dev_info(adev->dev, "reset gfx queue (%d:%d:%d: vmid:%d)\n", + me_id, pipe_id, queue_id, vmid); + + mutex_lock(&adev->gfx.reset_sem_mutex); + gfx_v12_0_request_gfx_index_mutex(adev, true); + /* all se allow writes */ + WREG32_SOC15(GC, 0, regGRBM_GFX_INDEX, + (uint32_t)(0x1 << GRBM_GFX_INDEX__SE_BROADCAST_WRITES__SHIFT)); + value = REG_SET_FIELD(0, CP_VMID_RESET, RESET_REQUEST, 1 << vmid); + if (pipe_id == 0) + value = REG_SET_FIELD(value, CP_VMID_RESET, PIPE0_QUEUES, 1 << queue_id); + else + value = REG_SET_FIELD(value, CP_VMID_RESET, PIPE1_QUEUES, 1 << queue_id); + WREG32_SOC15(GC, 0, regCP_VMID_RESET, value); + gfx_v12_0_request_gfx_index_mutex(adev, false); + mutex_unlock(&adev->gfx.reset_sem_mutex); + + mutex_lock(&adev->srbm_mutex); + soc21_grbm_select(adev, me_id, pipe_id, queue_id, 0); + /* wait till dequeue take effects */ + for (i = 0; i < adev->usec_timeout; i++) { + if (!(RREG32_SOC15(GC, 0, regCP_GFX_HQD_ACTIVE) & 1)) + break; + udelay(1); + } + if (i >= adev->usec_timeout) { + dev_err(adev->dev, "failed to wait on gfx hqd deactivate\n"); + r = -ETIMEDOUT; + } + + soc21_grbm_select(adev, 0, 0, 0, 0); + mutex_unlock(&adev->srbm_mutex); + } else if (queue_type == AMDGPU_RING_TYPE_COMPUTE) { + dev_info(adev->dev, "reset compute queue (%d:%d:%d)\n", + me_id, pipe_id, queue_id); + mutex_lock(&adev->srbm_mutex); + soc21_grbm_select(adev, me_id, pipe_id, queue_id, 0); + WREG32_SOC15(GC, 0, regCP_HQD_DEQUEUE_REQUEST, 0x2); + WREG32_SOC15(GC, 0, regSPI_COMPUTE_QUEUE_RESET, 0x1); + + /* wait till dequeue take effects */ + for (i = 0; i < adev->usec_timeout; i++) { + if (!(RREG32_SOC15(GC, 0, 
regCP_HQD_ACTIVE) & 1)) + break; + udelay(1); + } + if (i >= adev->usec_timeout) { + dev_err(adev->dev, "failed to wait on hqd deactivate\n"); + r = -ETIMEDOUT; + } + soc21_grbm_select(adev, 0, 0, 0, 0); + mutex_unlock(&adev->srbm_mutex); + } else if (queue_type == AMDGPU_RING_TYPE_SDMA) { + dev_info(adev->dev, "reset sdma queue (%d:%d:%d)\n", + me_id, pipe_id, queue_id); + switch (me_id) { + case 1: + reg = SOC15_REG_OFFSET(GC, 0, regSDMA1_QUEUE_RESET_REQ); + break; + case 0: + default: + reg = SOC15_REG_OFFSET(GC, 0, regSDMA0_QUEUE_RESET_REQ); + break; + } + + value = 1 << queue_id; + WREG32(reg, value); + /* wait for queue reset done */ + for (i = 0; i < adev->usec_timeout; i++) { + if (!(RREG32(reg) & value)) + break; + udelay(1); + } + if (i >= adev->usec_timeout) { + dev_err(adev->dev, "failed to wait on sdma queue reset done\n"); + r = -ETIMEDOUT; + } + } + + amdgpu_gfx_rlc_exit_safe_mode(adev, 0); + return r; +} + static int mes_v12_0_reset_hw_queue(struct amdgpu_mes *mes, struct mes_reset_queue_input *input) { @@ -721,6 +848,11 @@ static int mes_v12_0_reset_legacy_queue(struct amdgpu_mes *mes, union MESAPI__RESET mes_reset_queue_pkt; int pipe; + if (input->use_mmio) + return mes_v12_0_reset_queue_mmio(mes, input->queue_type, + input->me_id, input->pipe_id, + input->queue_id, input->vmid); + memset(&mes_reset_queue_pkt, 0, sizeof(mes_reset_queue_pkt)); mes_reset_queue_pkt.header.type = MES_API_TYPE_SCHEDULER; @@ -1455,9 +1587,7 @@ static void mes_v12_0_kiq_setting(struct amdgpu_ring *ring) tmp = RREG32_SOC15(GC, 0, regRLC_CP_SCHEDULERS); tmp &= 0xffffff00; tmp |= (ring->me << 5) | (ring->pipe << 3) | (ring->queue); - WREG32_SOC15(GC, 0, regRLC_CP_SCHEDULERS, tmp); - tmp |= 0x80; - WREG32_SOC15(GC, 0, regRLC_CP_SCHEDULERS, tmp); + WREG32_SOC15(GC, 0, regRLC_CP_SCHEDULERS, tmp | 0x80); } static int mes_v12_0_kiq_hw_init(struct amdgpu_device *adev) diff --git a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c index 
e9a6f33ca710..243eabda0607 100644 --- a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c +++ b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c @@ -356,7 +356,7 @@ static void mmhub_v1_0_update_power_gating(struct amdgpu_device *adev, if (adev->pg_flags & AMD_PG_SUPPORT_MMHUB) amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_GMC, - enable); + enable, 0); } static int mmhub_v1_0_gart_enable(struct amdgpu_device *adev) diff --git a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c index b01bb759d0f4..e646e5cef0a2 100644 --- a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c +++ b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c @@ -33,7 +33,6 @@ #define regVM_L2_CNTL3_DEFAULT 0x80100007 #define regVM_L2_CNTL4_DEFAULT 0x000000c1 -#define mmSMNAID_AID0_MCA_SMU 0x03b30400 static u64 mmhub_v1_8_get_fb_location(struct amdgpu_device *adev) { diff --git a/drivers/gpu/drm/amd/amdgpu/mmhub_v4_1_0.c b/drivers/gpu/drm/amd/amdgpu/mmhub_v4_1_0.c index 0fbc3be81f14..f2ab5001b492 100644 --- a/drivers/gpu/drm/amd/amdgpu/mmhub_v4_1_0.c +++ b/drivers/gpu/drm/amd/amdgpu/mmhub_v4_1_0.c @@ -108,7 +108,7 @@ mmhub_v4_1_0_print_l2_protection_fault_status(struct amdgpu_device *adev, dev_err(adev->dev, "MMVM_L2_PROTECTION_FAULT_STATUS_LO32:0x%08X\n", status); - switch (adev->ip_versions[MMHUB_HWIP][0]) { + switch (amdgpu_ip_version(adev, MMHUB_HWIP, 0)) { case IP_VERSION(4, 1, 0): mmhub_cid = mmhub_client_ids_v4_1_0[cid][rw]; break; diff --git a/drivers/gpu/drm/amd/amdgpu/navi10_ih.c b/drivers/gpu/drm/amd/amdgpu/navi10_ih.c index 0820ed62e2e8..62cdfe10e6f4 100644 --- a/drivers/gpu/drm/amd/amdgpu/navi10_ih.c +++ b/drivers/gpu/drm/amd/amdgpu/navi10_ih.c @@ -434,9 +434,8 @@ static u32 navi10_ih_get_wptr(struct amdgpu_device *adev, * this should allow us to catch up. 
*/ tmp = (wptr + 32) & ih->ptr_mask; - dev_warn(adev->dev, "IH ring buffer overflow " - "(0x%08X, 0x%08X, 0x%08X)\n", - wptr, ih->rptr, tmp); + dev_warn(adev->dev, "%s ring buffer overflow (0x%08X, 0x%08X, 0x%08X)\n", + amdgpu_ih_ring_name(adev, ih), wptr, ih->rptr, tmp); ih->rptr = tmp; tmp = RREG32_NO_KIQ(ih_regs->ih_rb_cntl); @@ -667,17 +666,17 @@ static void navi10_ih_update_clockgating_state(struct amdgpu_device *adev, } } -static int navi10_ih_set_clockgating_state(void *handle, +static int navi10_ih_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state) { - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; navi10_ih_update_clockgating_state(adev, state == AMD_CG_STATE_GATE); return 0; } -static int navi10_ih_set_powergating_state(void *handle, +static int navi10_ih_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state) { return 0; diff --git a/drivers/gpu/drm/amd/amdgpu/nbif_v6_3_1.c b/drivers/gpu/drm/amd/amdgpu/nbif_v6_3_1.c index 39919e0892c1..c92875ceb31f 100644 --- a/drivers/gpu/drm/amd/amdgpu/nbif_v6_3_1.c +++ b/drivers/gpu/drm/amd/amdgpu/nbif_v6_3_1.c @@ -28,6 +28,7 @@ #include "nbif/nbif_6_3_1_sh_mask.h" #include "pcie/pcie_6_1_0_offset.h" #include "pcie/pcie_6_1_0_sh_mask.h" +#include "ivsrcid/nbio/irqsrcs_nbif_7_4.h" #include static void nbif_v6_3_1_remap_hdp_registers(struct amdgpu_device *adev) @@ -518,3 +519,83 @@ const struct amdgpu_nbio_funcs nbif_v6_3_1_sriov_funcs = { .get_rom_offset = nbif_v6_3_1_get_rom_offset, .set_reg_remap = nbif_v6_3_1_set_reg_remap, }; + +static int nbif_v6_3_1_set_ras_err_event_athub_irq_state(struct amdgpu_device *adev, + struct amdgpu_irq_src *src, + unsigned type, + enum amdgpu_interrupt_state state) +{ + /* The ras_controller_irq enablement should be done in psp bl when it + * tries to enable ras feature. 
Driver only need to set the correct interrupt + * vector for bare-metal and sriov use case respectively + */ + uint32_t bif_doorbell_int_cntl; + + bif_doorbell_int_cntl = RREG32_SOC15(NBIO, 0, regBIF_BX0_BIF_DOORBELL_INT_CNTL); + bif_doorbell_int_cntl = REG_SET_FIELD(bif_doorbell_int_cntl, + BIF_BX0_BIF_DOORBELL_INT_CNTL, + RAS_ATHUB_ERR_EVENT_INTERRUPT_DISABLE, + (state == AMDGPU_IRQ_STATE_ENABLE) ? 0 : 1); + WREG32_SOC15(NBIO, 0, regBIF_BX0_BIF_DOORBELL_INT_CNTL, bif_doorbell_int_cntl); + + return 0; +} + +static int nbif_v6_3_1_process_err_event_athub_irq(struct amdgpu_device *adev, + struct amdgpu_irq_src *source, + struct amdgpu_iv_entry *entry) +{ + /* By design, the ih cookie for err_event_athub_irq should be written + * to bif ring. since bif ring is not enabled, just leave process callback + * as a dummy one. + */ + return 0; +} + +static const struct amdgpu_irq_src_funcs nbif_v6_3_1_ras_err_event_athub_irq_funcs = { + .set = nbif_v6_3_1_set_ras_err_event_athub_irq_state, + .process = nbif_v6_3_1_process_err_event_athub_irq, +}; + +static void nbif_v6_3_1_handle_ras_err_event_athub_intr_no_bifring(struct amdgpu_device *adev) +{ + uint32_t bif_doorbell_int_cntl; + + bif_doorbell_int_cntl = RREG32_SOC15(NBIO, 0, regBIF_BX0_BIF_DOORBELL_INT_CNTL); + if (REG_GET_FIELD(bif_doorbell_int_cntl, + BIF_BX0_BIF_DOORBELL_INT_CNTL, + RAS_ATHUB_ERR_EVENT_INTERRUPT_STATUS)) { + /* driver has to clear the interrupt status when bif ring is disabled */ + bif_doorbell_int_cntl = REG_SET_FIELD(bif_doorbell_int_cntl, + BIF_BX0_BIF_DOORBELL_INT_CNTL, + RAS_ATHUB_ERR_EVENT_INTERRUPT_CLEAR, 1); + WREG32_SOC15(NBIO, 0, regBIF_BX0_BIF_DOORBELL_INT_CNTL, bif_doorbell_int_cntl); + amdgpu_ras_global_ras_isr(adev); + } +} + +static int nbif_v6_3_1_init_ras_err_event_athub_interrupt(struct amdgpu_device *adev) +{ + int r; + + /* init the irq funcs */ + adev->nbio.ras_err_event_athub_irq.funcs = + &nbif_v6_3_1_ras_err_event_athub_irq_funcs; + adev->nbio.ras_err_event_athub_irq.num_types 
= 1; + + /* register ras err event athub interrupt + * nbif v6_3_1 uses the same irq source as nbio v7_4 + */ + r = amdgpu_irq_add_id(adev, SOC21_IH_CLIENTID_BIF, + NBIF_7_4__SRCID__ERREVENT_ATHUB_INTERRUPT, + &adev->nbio.ras_err_event_athub_irq); + + return r; +} + +struct amdgpu_nbio_ras nbif_v6_3_1_ras = { + .handle_ras_err_event_athub_intr_no_bifring = + nbif_v6_3_1_handle_ras_err_event_athub_intr_no_bifring, + .init_ras_err_event_athub_interrupt = + nbif_v6_3_1_init_ras_err_event_athub_interrupt, +}; diff --git a/drivers/gpu/drm/amd/amdgpu/nbif_v6_3_1.h b/drivers/gpu/drm/amd/amdgpu/nbif_v6_3_1.h index b7f2e0d88905..9ac4831d39e1 100644 --- a/drivers/gpu/drm/amd/amdgpu/nbif_v6_3_1.h +++ b/drivers/gpu/drm/amd/amdgpu/nbif_v6_3_1.h @@ -29,5 +29,6 @@ extern const struct nbio_hdp_flush_reg nbif_v6_3_1_hdp_flush_reg; extern const struct amdgpu_nbio_funcs nbif_v6_3_1_funcs; extern const struct amdgpu_nbio_funcs nbif_v6_3_1_sriov_funcs; +extern struct amdgpu_nbio_ras nbif_v6_3_1_ras; #endif diff --git a/drivers/gpu/drm/amd/amdgpu/nbio_v7_0.c b/drivers/gpu/drm/amd/amdgpu/nbio_v7_0.c index b1b57dcc5a73..d1032e9992b4 100644 --- a/drivers/gpu/drm/amd/amdgpu/nbio_v7_0.c +++ b/drivers/gpu/drm/amd/amdgpu/nbio_v7_0.c @@ -271,8 +271,19 @@ const struct nbio_hdp_flush_reg nbio_v7_0_hdp_flush_reg = { .ref_and_mask_sdma1 = GPU_HDP_FLUSH_DONE__SDMA1_MASK, }; +#define regRCC_DEV0_EPF6_STRAP4 0xd304 +#define regRCC_DEV0_EPF6_STRAP4_BASE_IDX 5 + static void nbio_v7_0_init_registers(struct amdgpu_device *adev) { + uint32_t data; + + switch (amdgpu_ip_version(adev, NBIO_HWIP, 0)) { + case IP_VERSION(2, 5, 0): + data = RREG32_SOC15(NBIO, 0, regRCC_DEV0_EPF6_STRAP4) & ~BIT(23); + WREG32_SOC15(NBIO, 0, regRCC_DEV0_EPF6_STRAP4, data); + break; + } } #define MMIO_REG_HOLE_OFFSET (0x80000 - PAGE_SIZE) diff --git a/drivers/gpu/drm/amd/amdgpu/nbio_v7_11.c b/drivers/gpu/drm/amd/amdgpu/nbio_v7_11.c index 814ab59fdd4a..41421da63a08 100644 --- a/drivers/gpu/drm/amd/amdgpu/nbio_v7_11.c +++ 
b/drivers/gpu/drm/amd/amdgpu/nbio_v7_11.c @@ -275,7 +275,7 @@ static void nbio_v7_11_init_registers(struct amdgpu_device *adev) if (def != data) WREG32_SOC15(NBIO, 0, regBIF_BIF256_CI256_RC3X4_USB4_PCIE_MST_CTRL_3, data); - switch (adev->ip_versions[NBIO_HWIP][0]) { + switch (amdgpu_ip_version(adev, NBIO_HWIP, 0)) { case IP_VERSION(7, 11, 0): case IP_VERSION(7, 11, 1): case IP_VERSION(7, 11, 2): diff --git a/drivers/gpu/drm/amd/amdgpu/nbio_v7_7.c b/drivers/gpu/drm/amd/amdgpu/nbio_v7_7.c index 1ac730328516..3fb6d2aa7e3b 100644 --- a/drivers/gpu/drm/amd/amdgpu/nbio_v7_7.c +++ b/drivers/gpu/drm/amd/amdgpu/nbio_v7_7.c @@ -247,7 +247,7 @@ static void nbio_v7_7_init_registers(struct amdgpu_device *adev) if (def != data) WREG32_SOC15(NBIO, 0, regBIF0_PCIE_MST_CTRL_3, data); - switch (adev->ip_versions[NBIO_HWIP][0]) { + switch (amdgpu_ip_version(adev, NBIO_HWIP, 0)) { case IP_VERSION(7, 7, 0): data = RREG32_SOC15(NBIO, 0, regRCC_DEV0_EPF5_STRAP4) & ~BIT(23); WREG32_SOC15(NBIO, 0, regRCC_DEV0_EPF5_STRAP4, data); diff --git a/drivers/gpu/drm/amd/amdgpu/nv.c b/drivers/gpu/drm/amd/amdgpu/nv.c index 3bad565ded73..47db483c3516 100644 --- a/drivers/gpu/drm/amd/amdgpu/nv.c +++ b/drivers/gpu/drm/amd/amdgpu/nv.c @@ -1039,10 +1039,10 @@ static bool nv_common_is_idle(void *handle) return true; } -static int nv_common_set_clockgating_state(void *handle, +static int nv_common_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state) { - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; if (amdgpu_sriov_vf(adev)) return 0; @@ -1070,7 +1070,7 @@ static int nv_common_set_clockgating_state(void *handle, return 0; } -static int nv_common_set_powergating_state(void *handle, +static int nv_common_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state) { /* TODO */ diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v13_0.c b/drivers/gpu/drm/amd/amdgpu/psp_v13_0.c index 
c4b775aaee9f..cc621064610f 100644 --- a/drivers/gpu/drm/amd/amdgpu/psp_v13_0.c +++ b/drivers/gpu/drm/amd/amdgpu/psp_v13_0.c @@ -51,6 +51,8 @@ MODULE_FIRMWARE("amdgpu/psp_13_0_11_toc.bin"); MODULE_FIRMWARE("amdgpu/psp_13_0_11_ta.bin"); MODULE_FIRMWARE("amdgpu/psp_13_0_6_sos.bin"); MODULE_FIRMWARE("amdgpu/psp_13_0_6_ta.bin"); +MODULE_FIRMWARE("amdgpu/psp_13_0_12_sos.bin"); +MODULE_FIRMWARE("amdgpu/psp_13_0_12_ta.bin"); MODULE_FIRMWARE("amdgpu/psp_13_0_14_sos.bin"); MODULE_FIRMWARE("amdgpu/psp_13_0_14_ta.bin"); MODULE_FIRMWARE("amdgpu/psp_14_0_0_toc.bin"); @@ -122,6 +124,7 @@ static int psp_v13_0_init_microcode(struct psp_context *psp) case IP_VERSION(13, 0, 6): case IP_VERSION(13, 0, 7): case IP_VERSION(13, 0, 10): + case IP_VERSION(13, 0, 12): case IP_VERSION(13, 0, 14): err = psp_init_sos_microcode(psp, ucode_prefix); if (err) @@ -177,6 +180,7 @@ static int psp_v13_0_wait_for_bootloader(struct psp_context *psp) retry_cnt = ((amdgpu_ip_version(adev, MP0_HWIP, 0) == IP_VERSION(13, 0, 6) || + amdgpu_ip_version(adev, MP0_HWIP, 0) == IP_VERSION(13, 0, 12) || amdgpu_ip_version(adev, MP0_HWIP, 0) == IP_VERSION(13, 0, 14))) ? 
PSP_VMBX_POLLING_LIMIT : 10; @@ -203,6 +207,7 @@ static int psp_v13_0_wait_for_bootloader_steady_state(struct psp_context *psp) int ret; if (amdgpu_ip_version(adev, MP0_HWIP, 0) == IP_VERSION(13, 0, 6) || + amdgpu_ip_version(adev, MP0_HWIP, 0) == IP_VERSION(13, 0, 12) || amdgpu_ip_version(adev, MP0_HWIP, 0) == IP_VERSION(13, 0, 14)) { ret = psp_v13_0_wait_for_vmbx_ready(psp); if (ret) @@ -288,6 +293,11 @@ static int psp_v13_0_bootloader_load_ras_drv(struct psp_context *psp) return psp_v13_0_bootloader_load_component(psp, &psp->ras_drv, PSP_BL__LOAD_RASDRV); } +static int psp_v13_0_bootloader_load_spdm_drv(struct psp_context *psp) +{ + return psp_v13_0_bootloader_load_component(psp, &psp->spdm_drv, PSP_BL__LOAD_SPDMDRV); +} + static inline void psp_v13_0_init_sos_version(struct psp_context *psp) { struct amdgpu_device *adev = psp->adev; @@ -798,6 +808,7 @@ static bool psp_v13_0_get_ras_capability(struct psp_context *psp) return false; if ((amdgpu_ip_version(adev, MP0_HWIP, 0) == IP_VERSION(13, 0, 6) || + amdgpu_ip_version(adev, MP0_HWIP, 0) == IP_VERSION(13, 0, 12) || amdgpu_ip_version(adev, MP0_HWIP, 0) == IP_VERSION(13, 0, 14)) && (!(adev->flags & AMD_IS_APU))) { reg_data = RREG32_SOC15(MP0, 0, regMP0_SMN_C2PMSG_127); @@ -857,6 +868,7 @@ static const struct psp_funcs psp_v13_0_funcs = { .bootloader_load_intf_drv = psp_v13_0_bootloader_load_intf_drv, .bootloader_load_dbg_drv = psp_v13_0_bootloader_load_dbg_drv, .bootloader_load_ras_drv = psp_v13_0_bootloader_load_ras_drv, + .bootloader_load_spdm_drv = psp_v13_0_bootloader_load_spdm_drv, .bootloader_load_sos = psp_v13_0_bootloader_load_sos, .ring_create = psp_v13_0_ring_create, .ring_stop = psp_v13_0_ring_stop, diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c b/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c index 7948d74f8722..135c5099bfb8 100644 --- a/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c @@ -145,9 +145,11 @@ static int sdma_v2_4_init_microcode(struct amdgpu_device *adev) 
for (i = 0; i < adev->sdma.num_instances; i++) { if (i == 0) err = amdgpu_ucode_request(adev, &adev->sdma.instance[i].fw, + AMDGPU_UCODE_REQUIRED, "amdgpu/%s_sdma.bin", chip_name); else err = amdgpu_ucode_request(adev, &adev->sdma.instance[i].fw, + AMDGPU_UCODE_REQUIRED, "amdgpu/%s_sdma1.bin", chip_name); if (err) goto out; @@ -631,7 +633,7 @@ static int sdma_v2_4_ring_test_ib(struct amdgpu_ring *ring, long timeout) r = -EINVAL; err1: - amdgpu_ib_free(adev, &ib, NULL); + amdgpu_ib_free(&ib, NULL); dma_fence_put(f); err0: amdgpu_device_wb_free(adev, index); @@ -1080,14 +1082,14 @@ static int sdma_v2_4_process_illegal_inst_irq(struct amdgpu_device *adev, return 0; } -static int sdma_v2_4_set_clockgating_state(void *handle, +static int sdma_v2_4_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state) { /* XXX handled via the smc on VI */ return 0; } -static int sdma_v2_4_set_powergating_state(void *handle, +static int sdma_v2_4_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state) { return 0; diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c index 9a3d729545a7..c611328671ed 100644 --- a/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c @@ -305,9 +305,11 @@ static int sdma_v3_0_init_microcode(struct amdgpu_device *adev) for (i = 0; i < adev->sdma.num_instances; i++) { if (i == 0) err = amdgpu_ucode_request(adev, &adev->sdma.instance[i].fw, + AMDGPU_UCODE_REQUIRED, "amdgpu/%s_sdma.bin", chip_name); else err = amdgpu_ucode_request(adev, &adev->sdma.instance[i].fw, + AMDGPU_UCODE_REQUIRED, "amdgpu/%s_sdma1.bin", chip_name); if (err) goto out; @@ -904,7 +906,7 @@ static int sdma_v3_0_ring_test_ib(struct amdgpu_ring *ring, long timeout) else r = -EINVAL; err1: - amdgpu_ib_free(adev, &ib, NULL); + amdgpu_ib_free(&ib, NULL); dma_fence_put(f); err0: amdgpu_device_wb_free(adev, index); @@ -1483,10 +1485,10 @@ static void 
sdma_v3_0_update_sdma_medium_grain_light_sleep( } } -static int sdma_v3_0_set_clockgating_state(void *handle, +static int sdma_v3_0_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state) { - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; if (amdgpu_sriov_vf(adev)) return 0; @@ -1506,7 +1508,7 @@ static int sdma_v3_0_set_clockgating_state(void *handle, return 0; } -static int sdma_v3_0_set_powergating_state(void *handle, +static int sdma_v3_0_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state) { return 0; diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c index c1f98f6cf20d..b48d9c0b2e1c 100644 --- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c @@ -1565,7 +1565,7 @@ static int sdma_v4_0_ring_test_ib(struct amdgpu_ring *ring, long timeout) r = -EINVAL; err1: - amdgpu_ib_free(adev, &ib, NULL); + amdgpu_ib_free(&ib, NULL); dma_fence_put(f); err0: amdgpu_device_wb_free(adev, index); @@ -1956,7 +1956,7 @@ static int sdma_v4_0_hw_init(struct amdgpu_ip_block *ip_block) struct amdgpu_device *adev = ip_block->adev; if (adev->flags & AMD_IS_APU) - amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_SDMA, false); + amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_SDMA, false, 0); if (!amdgpu_sriov_vf(adev)) sdma_v4_0_init_golden_registers(adev); @@ -1983,7 +1983,7 @@ static int sdma_v4_0_hw_fini(struct amdgpu_ip_block *ip_block) sdma_v4_0_enable(adev, false); if (adev->flags & AMD_IS_APU) - amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_SDMA, true); + amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_SDMA, true, 0); return 0; } @@ -2297,10 +2297,10 @@ static void sdma_v4_0_update_medium_grain_light_sleep( } } -static int sdma_v4_0_set_clockgating_state(void *handle, +static int sdma_v4_0_set_clockgating_state(struct amdgpu_ip_block 
*ip_block, enum amd_clockgating_state state) { - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; if (amdgpu_sriov_vf(adev)) return 0; @@ -2312,10 +2312,10 @@ static int sdma_v4_0_set_clockgating_state(void *handle, return 0; } -static int sdma_v4_0_set_powergating_state(void *handle, +static int sdma_v4_0_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state) { - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; switch (amdgpu_ip_version(adev, SDMA0_HWIP, 0)) { case IP_VERSION(4, 1, 0): diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c b/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c index a38553f38fdc..48537eba225d 100644 --- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c @@ -189,6 +189,7 @@ static int sdma_v4_4_2_init_microcode(struct amdgpu_device *adev) for (i = 0; i < adev->sdma.num_instances; i++) { if (amdgpu_ip_version(adev, SDMA0_HWIP, 0) == IP_VERSION(4, 4, 2) || + amdgpu_ip_version(adev, SDMA0_HWIP, 0) == IP_VERSION(4, 4, 4) || amdgpu_ip_version(adev, SDMA0_HWIP, 0) == IP_VERSION(4, 4, 5)) { ret = amdgpu_sdma_init_microcode(adev, 0, true); break; @@ -667,11 +668,12 @@ static uint32_t sdma_v4_4_2_rb_cntl(struct amdgpu_ring *ring, uint32_t rb_cntl) * * @adev: amdgpu_device pointer * @i: instance to resume + * @restore: used to restore wptr when restart * * Set up the gfx DMA ring buffers and enable them. * Returns 0 for success, error for failure. 
*/ -static void sdma_v4_4_2_gfx_resume(struct amdgpu_device *adev, unsigned int i) +static void sdma_v4_4_2_gfx_resume(struct amdgpu_device *adev, unsigned int i, bool restore) { struct amdgpu_ring *ring = &adev->sdma.instance[i].ring; u32 rb_cntl, ib_cntl, wptr_poll_cntl; @@ -698,16 +700,24 @@ static void sdma_v4_4_2_gfx_resume(struct amdgpu_device *adev, unsigned int i) WREG32_SDMA(i, regSDMA_GFX_RB_BASE, ring->gpu_addr >> 8); WREG32_SDMA(i, regSDMA_GFX_RB_BASE_HI, ring->gpu_addr >> 40); - ring->wptr = 0; + if (!restore) + ring->wptr = 0; /* before programing wptr to a less value, need set minor_ptr_update first */ WREG32_SDMA(i, regSDMA_GFX_MINOR_PTR_UPDATE, 1); /* Initialize the ring buffer's read and write pointers */ - WREG32_SDMA(i, regSDMA_GFX_RB_RPTR, 0); - WREG32_SDMA(i, regSDMA_GFX_RB_RPTR_HI, 0); - WREG32_SDMA(i, regSDMA_GFX_RB_WPTR, 0); - WREG32_SDMA(i, regSDMA_GFX_RB_WPTR_HI, 0); + if (restore) { + WREG32_SDMA(i, regSDMA_GFX_RB_RPTR, lower_32_bits(ring->wptr << 2)); + WREG32_SDMA(i, regSDMA_GFX_RB_RPTR_HI, upper_32_bits(ring->wptr << 2)); + WREG32_SDMA(i, regSDMA_GFX_RB_WPTR, lower_32_bits(ring->wptr << 2)); + WREG32_SDMA(i, regSDMA_GFX_RB_WPTR_HI, upper_32_bits(ring->wptr << 2)); + } else { + WREG32_SDMA(i, regSDMA_GFX_RB_RPTR, 0); + WREG32_SDMA(i, regSDMA_GFX_RB_RPTR_HI, 0); + WREG32_SDMA(i, regSDMA_GFX_RB_WPTR, 0); + WREG32_SDMA(i, regSDMA_GFX_RB_WPTR_HI, 0); + } doorbell = RREG32_SDMA(i, regSDMA_GFX_DOORBELL); doorbell_offset = RREG32_SDMA(i, regSDMA_GFX_DOORBELL_OFFSET); @@ -755,11 +765,12 @@ static void sdma_v4_4_2_gfx_resume(struct amdgpu_device *adev, unsigned int i) * * @adev: amdgpu_device pointer * @i: instance to resume + * @restore: boolean to say restore needed or not * * Set up the page DMA ring buffers and enable them. * Returns 0 for success, error for failure. 
*/ -static void sdma_v4_4_2_page_resume(struct amdgpu_device *adev, unsigned int i) +static void sdma_v4_4_2_page_resume(struct amdgpu_device *adev, unsigned int i, bool restore) { struct amdgpu_ring *ring = &adev->sdma.instance[i].page; u32 rb_cntl, ib_cntl, wptr_poll_cntl; @@ -775,10 +786,17 @@ static void sdma_v4_4_2_page_resume(struct amdgpu_device *adev, unsigned int i) WREG32_SDMA(i, regSDMA_PAGE_RB_CNTL, rb_cntl); /* Initialize the ring buffer's read and write pointers */ - WREG32_SDMA(i, regSDMA_PAGE_RB_RPTR, 0); - WREG32_SDMA(i, regSDMA_PAGE_RB_RPTR_HI, 0); - WREG32_SDMA(i, regSDMA_PAGE_RB_WPTR, 0); - WREG32_SDMA(i, regSDMA_PAGE_RB_WPTR_HI, 0); + if (restore) { + WREG32_SDMA(i, regSDMA_GFX_RB_RPTR, lower_32_bits(ring->wptr << 2)); + WREG32_SDMA(i, regSDMA_GFX_RB_RPTR_HI, upper_32_bits(ring->wptr << 2)); + WREG32_SDMA(i, regSDMA_GFX_RB_WPTR, lower_32_bits(ring->wptr << 2)); + WREG32_SDMA(i, regSDMA_GFX_RB_WPTR_HI, upper_32_bits(ring->wptr << 2)); + } else { + WREG32_SDMA(i, regSDMA_PAGE_RB_RPTR, 0); + WREG32_SDMA(i, regSDMA_PAGE_RB_RPTR_HI, 0); + WREG32_SDMA(i, regSDMA_PAGE_RB_WPTR, 0); + WREG32_SDMA(i, regSDMA_PAGE_RB_WPTR_HI, 0); + } /* set the wb address whether it's enabled or not */ WREG32_SDMA(i, regSDMA_PAGE_RB_RPTR_ADDR_HI, @@ -792,7 +810,8 @@ static void sdma_v4_4_2_page_resume(struct amdgpu_device *adev, unsigned int i) WREG32_SDMA(i, regSDMA_PAGE_RB_BASE, ring->gpu_addr >> 8); WREG32_SDMA(i, regSDMA_PAGE_RB_BASE_HI, ring->gpu_addr >> 40); - ring->wptr = 0; + if (!restore) + ring->wptr = 0; /* before programing wptr to a less value, need set minor_ptr_update first */ WREG32_SDMA(i, regSDMA_PAGE_MINOR_PTR_UPDATE, 1); @@ -911,12 +930,13 @@ static int sdma_v4_4_2_inst_load_microcode(struct amdgpu_device *adev, * * @adev: amdgpu_device pointer * @inst_mask: mask of dma engine instances to be enabled + * @restore: boolean to say restore needed or not * * Set up the DMA engines and enable them. * Returns 0 for success, error for failure. 
*/ static int sdma_v4_4_2_inst_start(struct amdgpu_device *adev, - uint32_t inst_mask) + uint32_t inst_mask, bool restore) { struct amdgpu_ring *ring; uint32_t tmp_mask; @@ -927,7 +947,7 @@ static int sdma_v4_4_2_inst_start(struct amdgpu_device *adev, sdma_v4_4_2_inst_enable(adev, false, inst_mask); } else { /* bypass sdma microcode loading on Gopher */ - if (adev->firmware.load_type != AMDGPU_FW_LOAD_PSP && + if (!restore && adev->firmware.load_type != AMDGPU_FW_LOAD_PSP && adev->sdma.instance[0].fw) { r = sdma_v4_4_2_inst_load_microcode(adev, inst_mask); if (r) @@ -946,17 +966,19 @@ static int sdma_v4_4_2_inst_start(struct amdgpu_device *adev, uint32_t temp; WREG32_SDMA(i, regSDMA_SEM_WAIT_FAIL_TIMER_CNTL, 0); - sdma_v4_4_2_gfx_resume(adev, i); + sdma_v4_4_2_gfx_resume(adev, i, restore); if (adev->sdma.has_page_queue) - sdma_v4_4_2_page_resume(adev, i); + sdma_v4_4_2_page_resume(adev, i, restore); /* set utc l1 enable flag always to 1 */ temp = RREG32_SDMA(i, regSDMA_CNTL); temp = REG_SET_FIELD(temp, SDMA_CNTL, UTC_L1_ENABLE, 1); - /* enable context empty interrupt during initialization */ - temp = REG_SET_FIELD(temp, SDMA_CNTL, CTXEMPTY_INT_ENABLE, 1); - WREG32_SDMA(i, regSDMA_CNTL, temp); + if (amdgpu_ip_version(adev, SDMA0_HWIP, 0) < IP_VERSION(4, 4, 5)) { + /* enable context empty interrupt during initialization */ + temp = REG_SET_FIELD(temp, SDMA_CNTL, CTXEMPTY_INT_ENABLE, 1); + WREG32_SDMA(i, regSDMA_CNTL, temp); + } if (!amdgpu_sriov_vf(adev)) { if (adev->firmware.load_type != AMDGPU_FW_LOAD_PSP) { /* unhalt engine */ @@ -1110,7 +1132,7 @@ static int sdma_v4_4_2_ring_test_ib(struct amdgpu_ring *ring, long timeout) r = -EINVAL; err1: - amdgpu_ib_free(adev, &ib, NULL); + amdgpu_ib_free(&ib, NULL); dma_fence_put(f); err0: amdgpu_device_wb_free(adev, index); @@ -1466,6 +1488,7 @@ static int sdma_v4_4_2_sw_fini(struct amdgpu_ip_block *ip_block) amdgpu_sdma_sysfs_reset_mask_fini(adev); if (amdgpu_ip_version(adev, SDMA0_HWIP, 0) == IP_VERSION(4, 4, 2) || + 
amdgpu_ip_version(adev, SDMA0_HWIP, 0) == IP_VERSION(4, 4, 4) || amdgpu_ip_version(adev, SDMA0_HWIP, 0) == IP_VERSION(4, 4, 5)) amdgpu_sdma_destroy_inst_ctx(adev, true); else @@ -1486,7 +1509,7 @@ static int sdma_v4_4_2_hw_init(struct amdgpu_ip_block *ip_block) if (!amdgpu_sriov_vf(adev)) sdma_v4_4_2_inst_init_golden_registers(adev, inst_mask); - r = sdma_v4_4_2_inst_start(adev, inst_mask); + r = sdma_v4_4_2_inst_start(adev, inst_mask, false); return r; } @@ -1514,7 +1537,7 @@ static int sdma_v4_4_2_hw_fini(struct amdgpu_ip_block *ip_block) return 0; } -static int sdma_v4_4_2_set_clockgating_state(void *handle, +static int sdma_v4_4_2_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state); static int sdma_v4_4_2_suspend(struct amdgpu_ip_block *ip_block) @@ -1522,7 +1545,7 @@ static int sdma_v4_4_2_suspend(struct amdgpu_ip_block *ip_block) struct amdgpu_device *adev = ip_block->adev; if (amdgpu_in_reset(adev)) - sdma_v4_4_2_set_clockgating_state(adev, AMD_CG_STATE_UNGATE); + sdma_v4_4_2_set_clockgating_state(ip_block, AMD_CG_STATE_UNGATE); return sdma_v4_4_2_hw_fini(ip_block); } @@ -1573,6 +1596,42 @@ static int sdma_v4_4_2_soft_reset(struct amdgpu_ip_block *ip_block) return 0; } +static int sdma_v4_4_2_reset_queue(struct amdgpu_ring *ring, unsigned int vmid) +{ + struct amdgpu_device *adev = ring->adev; + int i, r; + u32 inst_mask; + + if (amdgpu_sriov_vf(adev)) + return -EINVAL; + + /* stop queue */ + inst_mask = 1 << ring->me; + sdma_v4_4_2_inst_gfx_stop(adev, inst_mask); + if (adev->sdma.has_page_queue) + sdma_v4_4_2_inst_page_stop(adev, inst_mask); + + r = amdgpu_dpm_reset_sdma(adev, 1 << GET_INST(SDMA0, ring->me)); + if (r) + return r; + + udelay(50); + + for (i = 0; i < adev->usec_timeout; i++) { + if (!REG_GET_FIELD(RREG32_SDMA(ring->me, regSDMA_F32_CNTL), SDMA_F32_CNTL, HALT)) + break; + udelay(1); + } + + if (i == adev->usec_timeout) { + dev_err(adev->dev, "timed out waiting for SDMA%d unhalt after reset\n", + ring->me); 
+ return -ETIMEDOUT; + } + + return sdma_v4_4_2_inst_start(adev, inst_mask, true); +} + static int sdma_v4_4_2_set_trap_irq_state(struct amdgpu_device *adev, struct amdgpu_irq_src *source, unsigned type, @@ -1821,10 +1880,10 @@ static void sdma_v4_4_2_inst_update_medium_grain_clock_gating( } } -static int sdma_v4_4_2_set_clockgating_state(void *handle, +static int sdma_v4_4_2_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state) { - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; uint32_t inst_mask; if (amdgpu_sriov_vf(adev)) @@ -1839,7 +1898,7 @@ static int sdma_v4_4_2_set_clockgating_state(void *handle, return 0; } -static int sdma_v4_4_2_set_powergating_state(void *handle, +static int sdma_v4_4_2_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state) { return 0; @@ -1895,7 +1954,6 @@ static void sdma_v4_4_2_dump_ip_state(struct amdgpu_ip_block *ip_block) if (!adev->sdma.ip_dump) return; - amdgpu_gfx_off_ctrl(adev, false); for (i = 0; i < adev->sdma.num_instances; i++) { instance_offset = i * reg_count; for (j = 0; j < reg_count; j++) @@ -1903,7 +1961,6 @@ static void sdma_v4_4_2_dump_ip_state(struct amdgpu_ip_block *ip_block) RREG32(sdma_v4_4_2_get_reg_offset(adev, i, sdma_reg_list_4_4_2[j].reg_offset)); } - amdgpu_gfx_off_ctrl(adev, true); } const struct amd_ip_funcs sdma_v4_4_2_ip_funcs = { @@ -1955,6 +2012,7 @@ static const struct amdgpu_ring_funcs sdma_v4_4_2_ring_funcs = { .emit_wreg = sdma_v4_4_2_ring_emit_wreg, .emit_reg_wait = sdma_v4_4_2_ring_emit_reg_wait, .emit_reg_write_reg_wait = amdgpu_ring_emit_reg_write_reg_wait_helper, + .reset = sdma_v4_4_2_reset_queue, }; static const struct amdgpu_ring_funcs sdma_v4_4_2_page_ring_funcs = { @@ -2167,7 +2225,7 @@ static int sdma_v4_4_2_xcp_resume(void *handle, uint32_t inst_mask) if (!amdgpu_sriov_vf(adev)) sdma_v4_4_2_inst_init_golden_registers(adev, inst_mask); - r = 
sdma_v4_4_2_inst_start(adev, inst_mask); + r = sdma_v4_4_2_inst_start(adev, inst_mask, false); return r; } diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c index fa9b40934957..b764550834a0 100644 --- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c @@ -1194,7 +1194,7 @@ static int sdma_v5_0_ring_test_ib(struct amdgpu_ring *ring, long timeout) r = -EINVAL; err1: - amdgpu_ib_free(adev, &ib, NULL); + amdgpu_ib_free(&ib, NULL); dma_fence_put(f); err0: if (!ring->is_mes_queue) @@ -1853,10 +1853,10 @@ static void sdma_v5_0_update_medium_grain_light_sleep(struct amdgpu_device *adev } } -static int sdma_v5_0_set_clockgating_state(void *handle, +static int sdma_v5_0_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state) { - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; if (amdgpu_sriov_vf(adev)) return 0; @@ -1877,7 +1877,7 @@ static int sdma_v5_0_set_clockgating_state(void *handle, return 0; } -static int sdma_v5_0_set_powergating_state(void *handle, +static int sdma_v5_0_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state) { return 0; diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c index ba5160399ab2..b1818e87889a 100644 --- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c @@ -1050,7 +1050,7 @@ static int sdma_v5_2_ring_test_ib(struct amdgpu_ring *ring, long timeout) r = -EINVAL; err1: - amdgpu_ib_free(adev, &ib, NULL); + amdgpu_ib_free(&ib, NULL); dma_fence_put(f); err0: if (!ring->is_mes_queue) @@ -1812,10 +1812,10 @@ static void sdma_v5_2_update_medium_grain_light_sleep(struct amdgpu_device *adev } } -static int sdma_v5_2_set_clockgating_state(void *handle, +static int sdma_v5_2_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state) { - struct 
amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; if (amdgpu_sriov_vf(adev)) return 0; @@ -1841,7 +1841,7 @@ static int sdma_v5_2_set_clockgating_state(void *handle, return 0; } -static int sdma_v5_2_set_powergating_state(void *handle, +static int sdma_v5_2_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state) { return 0; diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c index d46128b0ec92..1a023b45f0be 100644 --- a/drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c @@ -1063,7 +1063,7 @@ static int sdma_v6_0_ring_test_ib(struct amdgpu_ring *ring, long timeout) r = -EINVAL; err1: - amdgpu_ib_free(adev, &ib, NULL); + amdgpu_ib_free(&ib, NULL); dma_fence_put(f); err0: if (!ring->is_mes_queue) @@ -1601,13 +1601,13 @@ static int sdma_v6_0_process_illegal_inst_irq(struct amdgpu_device *adev, return 0; } -static int sdma_v6_0_set_clockgating_state(void *handle, +static int sdma_v6_0_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state) { return 0; } -static int sdma_v6_0_set_powergating_state(void *handle, +static int sdma_v6_0_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state) { return 0; diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c index d2ce6b6a7ff6..9c17df2cf37b 100644 --- a/drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c @@ -489,6 +489,166 @@ static void sdma_v7_0_enable(struct amdgpu_device *adev, bool enable) } } +/** + * sdma_v7_0_gfx_resume_instance - start/restart a certain sdma engine + * + * @adev: amdgpu_device pointer + * @i: instance + * @restore: used to restore wptr when restart + * + * Set up the gfx DMA ring buffers and enable them. On restart, we will restore wptr and rptr. + * Return 0 for success. 
+ */ +static int sdma_v7_0_gfx_resume_instance(struct amdgpu_device *adev, int i, bool restore) +{ + struct amdgpu_ring *ring; + u32 rb_cntl, ib_cntl; + u32 rb_bufsz; + u32 doorbell; + u32 doorbell_offset; + u32 temp; + u64 wptr_gpu_addr; + int r; + + ring = &adev->sdma.instance[i].ring; + + /* Set ring buffer size in dwords */ + rb_bufsz = order_base_2(ring->ring_size / 4); + rb_cntl = RREG32_SOC15_IP(GC, sdma_v7_0_get_reg_offset(adev, i, regSDMA0_QUEUE0_RB_CNTL)); + rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_QUEUE0_RB_CNTL, RB_SIZE, rb_bufsz); +#ifdef __BIG_ENDIAN + rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_QUEUE0_RB_CNTL, RB_SWAP_ENABLE, 1); + rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_QUEUE0_RB_CNTL, + RPTR_WRITEBACK_SWAP_ENABLE, 1); +#endif + rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_QUEUE0_RB_CNTL, RB_PRIV, 1); + WREG32_SOC15_IP(GC, sdma_v7_0_get_reg_offset(adev, i, regSDMA0_QUEUE0_RB_CNTL), rb_cntl); + + /* Initialize the ring buffer's read and write pointers */ + if (restore) { + WREG32_SOC15_IP(GC, sdma_v7_0_get_reg_offset(adev, i, regSDMA0_QUEUE0_RB_RPTR), lower_32_bits(ring->wptr << 2)); + WREG32_SOC15_IP(GC, sdma_v7_0_get_reg_offset(adev, i, regSDMA0_QUEUE0_RB_RPTR_HI), upper_32_bits(ring->wptr << 2)); + WREG32_SOC15_IP(GC, sdma_v7_0_get_reg_offset(adev, i, regSDMA0_QUEUE0_RB_WPTR), lower_32_bits(ring->wptr << 2)); + WREG32_SOC15_IP(GC, sdma_v7_0_get_reg_offset(adev, i, regSDMA0_QUEUE0_RB_WPTR_HI), upper_32_bits(ring->wptr << 2)); + } else { + WREG32_SOC15_IP(GC, sdma_v7_0_get_reg_offset(adev, i, regSDMA0_QUEUE0_RB_RPTR), 0); + WREG32_SOC15_IP(GC, sdma_v7_0_get_reg_offset(adev, i, regSDMA0_QUEUE0_RB_RPTR_HI), 0); + WREG32_SOC15_IP(GC, sdma_v7_0_get_reg_offset(adev, i, regSDMA0_QUEUE0_RB_WPTR), 0); + WREG32_SOC15_IP(GC, sdma_v7_0_get_reg_offset(adev, i, regSDMA0_QUEUE0_RB_WPTR_HI), 0); + } + /* setup the wptr shadow polling */ + wptr_gpu_addr = ring->wptr_gpu_addr; + WREG32_SOC15_IP(GC, sdma_v7_0_get_reg_offset(adev, i, regSDMA0_QUEUE0_RB_WPTR_POLL_ADDR_LO), + 
lower_32_bits(wptr_gpu_addr)); + WREG32_SOC15_IP(GC, sdma_v7_0_get_reg_offset(adev, i, regSDMA0_QUEUE0_RB_WPTR_POLL_ADDR_HI), + upper_32_bits(wptr_gpu_addr)); + + /* set the wb address whether it's enabled or not */ + WREG32_SOC15_IP(GC, sdma_v7_0_get_reg_offset(adev, i, regSDMA0_QUEUE0_RB_RPTR_ADDR_HI), + upper_32_bits(ring->rptr_gpu_addr) & 0xFFFFFFFF); + WREG32_SOC15_IP(GC, sdma_v7_0_get_reg_offset(adev, i, regSDMA0_QUEUE0_RB_RPTR_ADDR_LO), + lower_32_bits(ring->rptr_gpu_addr) & 0xFFFFFFFC); + + rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_QUEUE0_RB_CNTL, RPTR_WRITEBACK_ENABLE, 1); + if (amdgpu_sriov_vf(adev)) + rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_QUEUE0_RB_CNTL, WPTR_POLL_ENABLE, 1); + else + rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_QUEUE0_RB_CNTL, WPTR_POLL_ENABLE, 0); + + rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_QUEUE0_RB_CNTL, MCU_WPTR_POLL_ENABLE, 1); + + WREG32_SOC15_IP(GC, sdma_v7_0_get_reg_offset(adev, i, regSDMA0_QUEUE0_RB_BASE), ring->gpu_addr >> 8); + WREG32_SOC15_IP(GC, sdma_v7_0_get_reg_offset(adev, i, regSDMA0_QUEUE0_RB_BASE_HI), ring->gpu_addr >> 40); + + if (!restore) + ring->wptr = 0; + + /* before programing wptr to a less value, need set minor_ptr_update first */ + WREG32_SOC15_IP(GC, sdma_v7_0_get_reg_offset(adev, i, regSDMA0_QUEUE0_MINOR_PTR_UPDATE), 1); + + if (!amdgpu_sriov_vf(adev)) { /* only bare-metal use register write for wptr */ + WREG32_SOC15_IP(GC, sdma_v7_0_get_reg_offset(adev, i, regSDMA0_QUEUE0_RB_WPTR), lower_32_bits(ring->wptr) << 2); + WREG32_SOC15_IP(GC, sdma_v7_0_get_reg_offset(adev, i, regSDMA0_QUEUE0_RB_WPTR_HI), upper_32_bits(ring->wptr) << 2); + } + + doorbell = RREG32_SOC15_IP(GC, sdma_v7_0_get_reg_offset(adev, i, regSDMA0_QUEUE0_DOORBELL)); + doorbell_offset = RREG32_SOC15_IP(GC, sdma_v7_0_get_reg_offset(adev, i, regSDMA0_QUEUE0_DOORBELL_OFFSET)); + + if (ring->use_doorbell) { + doorbell = REG_SET_FIELD(doorbell, SDMA0_QUEUE0_DOORBELL, ENABLE, 1); + doorbell_offset = REG_SET_FIELD(doorbell_offset, 
SDMA0_QUEUE0_DOORBELL_OFFSET, + OFFSET, ring->doorbell_index); + } else { + doorbell = REG_SET_FIELD(doorbell, SDMA0_QUEUE0_DOORBELL, ENABLE, 0); + } + WREG32_SOC15_IP(GC, sdma_v7_0_get_reg_offset(adev, i, regSDMA0_QUEUE0_DOORBELL), doorbell); + WREG32_SOC15_IP(GC, sdma_v7_0_get_reg_offset(adev, i, regSDMA0_QUEUE0_DOORBELL_OFFSET), doorbell_offset); + + if (i == 0) + adev->nbio.funcs->sdma_doorbell_range(adev, i, ring->use_doorbell, + ring->doorbell_index, + adev->doorbell_index.sdma_doorbell_range * adev->sdma.num_instances); + + if (amdgpu_sriov_vf(adev)) + sdma_v7_0_ring_set_wptr(ring); + + /* set minor_ptr_update to 0 after wptr programed */ + WREG32_SOC15_IP(GC, sdma_v7_0_get_reg_offset(adev, i, regSDMA0_QUEUE0_MINOR_PTR_UPDATE), 0); + + /* Set up sdma hang watchdog */ + temp = RREG32_SOC15_IP(GC, sdma_v7_0_get_reg_offset(adev, i, regSDMA0_WATCHDOG_CNTL)); + /* 100ms per unit */ + temp = REG_SET_FIELD(temp, SDMA0_WATCHDOG_CNTL, QUEUE_HANG_COUNT, + max(adev->usec_timeout/100000, 1)); + WREG32_SOC15_IP(GC, sdma_v7_0_get_reg_offset(adev, i, regSDMA0_WATCHDOG_CNTL), temp); + + /* Set up RESP_MODE to non-copy addresses */ + temp = RREG32_SOC15_IP(GC, sdma_v7_0_get_reg_offset(adev, i, regSDMA0_UTCL1_CNTL)); + temp = REG_SET_FIELD(temp, SDMA0_UTCL1_CNTL, RESP_MODE, 3); + temp = REG_SET_FIELD(temp, SDMA0_UTCL1_CNTL, REDO_DELAY, 9); + WREG32_SOC15_IP(GC, sdma_v7_0_get_reg_offset(adev, i, regSDMA0_UTCL1_CNTL), temp); + + /* program default cache read and write policy */ + temp = RREG32_SOC15_IP(GC, sdma_v7_0_get_reg_offset(adev, i, regSDMA0_UTCL1_PAGE)); + /* clean read policy and write policy bits */ + temp &= 0xFF0FFF; + temp |= ((CACHE_READ_POLICY_L2__DEFAULT << 12) | + (CACHE_WRITE_POLICY_L2__DEFAULT << 14)); + WREG32_SOC15_IP(GC, sdma_v7_0_get_reg_offset(adev, i, regSDMA0_UTCL1_PAGE), temp); + + if (!amdgpu_sriov_vf(adev)) { + /* unhalt engine */ + temp = RREG32_SOC15_IP(GC, sdma_v7_0_get_reg_offset(adev, i, regSDMA0_MCU_CNTL)); + temp = REG_SET_FIELD(temp, 
SDMA0_MCU_CNTL, HALT, 0); + temp = REG_SET_FIELD(temp, SDMA0_MCU_CNTL, RESET, 0); + WREG32_SOC15_IP(GC, sdma_v7_0_get_reg_offset(adev, i, regSDMA0_MCU_CNTL), temp); + } + + /* enable DMA RB */ + rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_QUEUE0_RB_CNTL, RB_ENABLE, 1); + WREG32_SOC15_IP(GC, sdma_v7_0_get_reg_offset(adev, i, regSDMA0_QUEUE0_RB_CNTL), rb_cntl); + + ib_cntl = RREG32_SOC15_IP(GC, sdma_v7_0_get_reg_offset(adev, i, regSDMA0_QUEUE0_IB_CNTL)); + ib_cntl = REG_SET_FIELD(ib_cntl, SDMA0_QUEUE0_IB_CNTL, IB_ENABLE, 1); +#ifdef __BIG_ENDIAN + ib_cntl = REG_SET_FIELD(ib_cntl, SDMA0_QUEUE0_IB_CNTL, IB_SWAP_ENABLE, 1); +#endif + /* enable DMA IBs */ + WREG32_SOC15_IP(GC, sdma_v7_0_get_reg_offset(adev, i, regSDMA0_QUEUE0_IB_CNTL), ib_cntl); + ring->sched.ready = true; + + if (amdgpu_sriov_vf(adev)) { /* bare-metal sequence doesn't need below to lines */ + sdma_v7_0_ctx_switch_enable(adev, true); + sdma_v7_0_enable(adev, true); + } + + r = amdgpu_ring_test_helper(ring); + if (r) + ring->sched.ready = false; + + return r; +} + /** * sdma_v7_0_gfx_resume - setup and start the async dma engines * @@ -499,153 +659,16 @@ static void sdma_v7_0_enable(struct amdgpu_device *adev, bool enable) */ static int sdma_v7_0_gfx_resume(struct amdgpu_device *adev) { - struct amdgpu_ring *ring; - u32 rb_cntl, ib_cntl; - u32 rb_bufsz; - u32 doorbell; - u32 doorbell_offset; - u32 tmp; - u64 wptr_gpu_addr; int i, r; for (i = 0; i < adev->sdma.num_instances; i++) { - ring = &adev->sdma.instance[i].ring; - - //if (!amdgpu_sriov_vf(adev)) - // WREG32_SOC15_IP(GC, sdma_v7_0_get_reg_offset(adev, i, regSDMA0_SEM_WAIT_FAIL_TIMER_CNTL), 0); - - /* Set ring buffer size in dwords */ - rb_bufsz = order_base_2(ring->ring_size / 4); - rb_cntl = RREG32_SOC15_IP(GC, sdma_v7_0_get_reg_offset(adev, i, regSDMA0_QUEUE0_RB_CNTL)); - rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_QUEUE0_RB_CNTL, RB_SIZE, rb_bufsz); -#ifdef __BIG_ENDIAN - rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_QUEUE0_RB_CNTL, RB_SWAP_ENABLE, 1); - rb_cntl 
= REG_SET_FIELD(rb_cntl, SDMA0_QUEUE0_RB_CNTL, - RPTR_WRITEBACK_SWAP_ENABLE, 1); -#endif - rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_QUEUE0_RB_CNTL, RB_PRIV, 1); - WREG32_SOC15_IP(GC, sdma_v7_0_get_reg_offset(adev, i, regSDMA0_QUEUE0_RB_CNTL), rb_cntl); - - /* Initialize the ring buffer's read and write pointers */ - WREG32_SOC15_IP(GC, sdma_v7_0_get_reg_offset(adev, i, regSDMA0_QUEUE0_RB_RPTR), 0); - WREG32_SOC15_IP(GC, sdma_v7_0_get_reg_offset(adev, i, regSDMA0_QUEUE0_RB_RPTR_HI), 0); - WREG32_SOC15_IP(GC, sdma_v7_0_get_reg_offset(adev, i, regSDMA0_QUEUE0_RB_WPTR), 0); - WREG32_SOC15_IP(GC, sdma_v7_0_get_reg_offset(adev, i, regSDMA0_QUEUE0_RB_WPTR_HI), 0); - - /* setup the wptr shadow polling */ - wptr_gpu_addr = ring->wptr_gpu_addr; - WREG32_SOC15_IP(GC, sdma_v7_0_get_reg_offset(adev, i, regSDMA0_QUEUE0_RB_WPTR_POLL_ADDR_LO), - lower_32_bits(wptr_gpu_addr)); - WREG32_SOC15_IP(GC, sdma_v7_0_get_reg_offset(adev, i, regSDMA0_QUEUE0_RB_WPTR_POLL_ADDR_HI), - upper_32_bits(wptr_gpu_addr)); - - /* set the wb address whether it's enabled or not */ - WREG32_SOC15_IP(GC, sdma_v7_0_get_reg_offset(adev, i, regSDMA0_QUEUE0_RB_RPTR_ADDR_HI), - upper_32_bits(ring->rptr_gpu_addr) & 0xFFFFFFFF); - WREG32_SOC15_IP(GC, sdma_v7_0_get_reg_offset(adev, i, regSDMA0_QUEUE0_RB_RPTR_ADDR_LO), - lower_32_bits(ring->rptr_gpu_addr) & 0xFFFFFFFC); - - rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_QUEUE0_RB_CNTL, RPTR_WRITEBACK_ENABLE, 1); - if (amdgpu_sriov_vf(adev)) - rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_QUEUE0_RB_CNTL, WPTR_POLL_ENABLE, 1); - else - rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_QUEUE0_RB_CNTL, WPTR_POLL_ENABLE, 0); - rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_QUEUE0_RB_CNTL, MCU_WPTR_POLL_ENABLE, 1); - - WREG32_SOC15_IP(GC, sdma_v7_0_get_reg_offset(adev, i, regSDMA0_QUEUE0_RB_BASE), ring->gpu_addr >> 8); - WREG32_SOC15_IP(GC, sdma_v7_0_get_reg_offset(adev, i, regSDMA0_QUEUE0_RB_BASE_HI), ring->gpu_addr >> 40); - - ring->wptr = 0; - - /* before programing wptr to a less value, need set 
minor_ptr_update first */ - WREG32_SOC15_IP(GC, sdma_v7_0_get_reg_offset(adev, i, regSDMA0_QUEUE0_MINOR_PTR_UPDATE), 1); - - if (!amdgpu_sriov_vf(adev)) { /* only bare-metal use register write for wptr */ - WREG32_SOC15_IP(GC, sdma_v7_0_get_reg_offset(adev, i, regSDMA0_QUEUE0_RB_WPTR), lower_32_bits(ring->wptr) << 2); - WREG32_SOC15_IP(GC, sdma_v7_0_get_reg_offset(adev, i, regSDMA0_QUEUE0_RB_WPTR_HI), upper_32_bits(ring->wptr) << 2); - } - - doorbell = RREG32_SOC15_IP(GC, sdma_v7_0_get_reg_offset(adev, i, regSDMA0_QUEUE0_DOORBELL)); - doorbell_offset = RREG32_SOC15_IP(GC, sdma_v7_0_get_reg_offset(adev, i, regSDMA0_QUEUE0_DOORBELL_OFFSET)); - - if (ring->use_doorbell) { - doorbell = REG_SET_FIELD(doorbell, SDMA0_QUEUE0_DOORBELL, ENABLE, 1); - doorbell_offset = REG_SET_FIELD(doorbell_offset, SDMA0_QUEUE0_DOORBELL_OFFSET, - OFFSET, ring->doorbell_index); - } else { - doorbell = REG_SET_FIELD(doorbell, SDMA0_QUEUE0_DOORBELL, ENABLE, 0); - } - WREG32_SOC15_IP(GC, sdma_v7_0_get_reg_offset(adev, i, regSDMA0_QUEUE0_DOORBELL), doorbell); - WREG32_SOC15_IP(GC, sdma_v7_0_get_reg_offset(adev, i, regSDMA0_QUEUE0_DOORBELL_OFFSET), doorbell_offset); - - if (i == 0) - adev->nbio.funcs->sdma_doorbell_range(adev, i, ring->use_doorbell, - ring->doorbell_index, - adev->doorbell_index.sdma_doorbell_range * adev->sdma.num_instances); - - if (amdgpu_sriov_vf(adev)) - sdma_v7_0_ring_set_wptr(ring); - - /* set minor_ptr_update to 0 after wptr programed */ - WREG32_SOC15_IP(GC, sdma_v7_0_get_reg_offset(adev, i, regSDMA0_QUEUE0_MINOR_PTR_UPDATE), 0); - - /* Set up sdma hang watchdog */ - tmp = RREG32_SOC15_IP(GC, sdma_v7_0_get_reg_offset(adev, i, regSDMA0_WATCHDOG_CNTL)); - /* 100ms per unit */ - tmp = REG_SET_FIELD(tmp, SDMA0_WATCHDOG_CNTL, QUEUE_HANG_COUNT, - max(adev->usec_timeout/100000, 1)); - WREG32_SOC15_IP(GC, sdma_v7_0_get_reg_offset(adev, i, regSDMA0_WATCHDOG_CNTL), tmp); - - /* Set up RESP_MODE to non-copy addresses */ - tmp = RREG32_SOC15_IP(GC, sdma_v7_0_get_reg_offset(adev, i, 
regSDMA0_UTCL1_CNTL)); - tmp = REG_SET_FIELD(tmp, SDMA0_UTCL1_CNTL, RESP_MODE, 3); - tmp = REG_SET_FIELD(tmp, SDMA0_UTCL1_CNTL, REDO_DELAY, 9); - WREG32_SOC15_IP(GC, sdma_v7_0_get_reg_offset(adev, i, regSDMA0_UTCL1_CNTL), tmp); - - /* program default cache read and write policy */ - tmp = RREG32_SOC15_IP(GC, sdma_v7_0_get_reg_offset(adev, i, regSDMA0_UTCL1_PAGE)); - /* clean read policy and write policy bits */ - tmp &= 0xFF0FFF; - tmp |= ((CACHE_READ_POLICY_L2__DEFAULT << 12) | - (CACHE_WRITE_POLICY_L2__DEFAULT << 14)); - WREG32_SOC15_IP(GC, sdma_v7_0_get_reg_offset(adev, i, regSDMA0_UTCL1_PAGE), tmp); - - if (!amdgpu_sriov_vf(adev)) { - /* unhalt engine */ - tmp = RREG32_SOC15_IP(GC, sdma_v7_0_get_reg_offset(adev, i, regSDMA0_MCU_CNTL)); - tmp = REG_SET_FIELD(tmp, SDMA0_MCU_CNTL, HALT, 0); - tmp = REG_SET_FIELD(tmp, SDMA0_MCU_CNTL, RESET, 0); - WREG32_SOC15_IP(GC, sdma_v7_0_get_reg_offset(adev, i, regSDMA0_MCU_CNTL), tmp); - } - - /* enable DMA RB */ - rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_QUEUE0_RB_CNTL, RB_ENABLE, 1); - WREG32_SOC15_IP(GC, sdma_v7_0_get_reg_offset(adev, i, regSDMA0_QUEUE0_RB_CNTL), rb_cntl); - - ib_cntl = RREG32_SOC15_IP(GC, sdma_v7_0_get_reg_offset(adev, i, regSDMA0_QUEUE0_IB_CNTL)); - ib_cntl = REG_SET_FIELD(ib_cntl, SDMA0_QUEUE0_IB_CNTL, IB_ENABLE, 1); -#ifdef __BIG_ENDIAN - ib_cntl = REG_SET_FIELD(ib_cntl, SDMA0_QUEUE0_IB_CNTL, IB_SWAP_ENABLE, 1); -#endif - /* enable DMA IBs */ - WREG32_SOC15_IP(GC, sdma_v7_0_get_reg_offset(adev, i, regSDMA0_QUEUE0_IB_CNTL), ib_cntl); - - ring->sched.ready = true; - - if (amdgpu_sriov_vf(adev)) { /* bare-metal sequence doesn't need below to lines */ - sdma_v7_0_ctx_switch_enable(adev, true); - sdma_v7_0_enable(adev, true); - } - - r = amdgpu_ring_test_helper(ring); - if (r) { - ring->sched.ready = false; + r = sdma_v7_0_gfx_resume_instance(adev, i, false); + if (r) return r; - } - } return 0; + } /** @@ -806,6 +829,31 @@ static bool sdma_v7_0_check_soft_reset(struct amdgpu_ip_block *ip_block) return false; 
} +static int sdma_v7_0_reset_queue(struct amdgpu_ring *ring, unsigned int vmid) +{ + struct amdgpu_device *adev = ring->adev; + int i, r; + + if (amdgpu_sriov_vf(adev)) + return -EINVAL; + + for (i = 0; i < adev->sdma.num_instances; i++) { + if (ring == &adev->sdma.instance[i].ring) + break; + } + + if (i == adev->sdma.num_instances) { + DRM_ERROR("sdma instance not found\n"); + return -EINVAL; + } + + r = amdgpu_mes_reset_legacy_queue(adev, ring, vmid, true); + if (r) + return r; + + return sdma_v7_0_gfx_resume_instance(adev, i, true); +} + /** * sdma_v7_0_start - setup and start the async dma engines * @@ -1060,7 +1108,7 @@ static int sdma_v7_0_ring_test_ib(struct amdgpu_ring *ring, long timeout) r = -EINVAL; err1: - amdgpu_ib_free(adev, &ib, NULL); + amdgpu_ib_free(&ib, NULL); dma_fence_put(f); err0: if (!ring->is_mes_queue) @@ -1316,6 +1364,13 @@ static int sdma_v7_0_sw_init(struct amdgpu_ip_block *ip_block) return r; } + adev->sdma.supported_reset = + amdgpu_get_soft_full_reset_mask(&adev->sdma.instance[0].ring); + adev->sdma.supported_reset |= AMDGPU_RESET_TYPE_PER_QUEUE; + + r = amdgpu_sdma_sysfs_reset_mask_init(adev); + if (r) + return r; /* Allocate memory for SDMA IP Dump buffer */ ptr = kcalloc(adev->sdma.num_instances * reg_count, sizeof(uint32_t), GFP_KERNEL); if (ptr) @@ -1334,6 +1389,7 @@ static int sdma_v7_0_sw_fini(struct amdgpu_ip_block *ip_block) for (i = 0; i < adev->sdma.num_instances; i++) amdgpu_ring_fini(&adev->sdma.instance[i].ring); + amdgpu_sdma_sysfs_reset_mask_fini(adev); amdgpu_sdma_destroy_inst_ctx(adev, true); if (adev->firmware.load_type == AMDGPU_FW_LOAD_DIRECT) @@ -1524,13 +1580,13 @@ static int sdma_v7_0_process_illegal_inst_irq(struct amdgpu_device *adev, return 0; } -static int sdma_v7_0_set_clockgating_state(void *handle, +static int sdma_v7_0_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state) { return 0; } -static int sdma_v7_0_set_powergating_state(void *handle, +static int 
sdma_v7_0_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state) { return 0; @@ -1636,6 +1692,7 @@ static const struct amdgpu_ring_funcs sdma_v7_0_ring_funcs = { .emit_reg_write_reg_wait = sdma_v7_0_ring_emit_reg_write_reg_wait, .init_cond_exec = sdma_v7_0_ring_init_cond_exec, .preempt_ib = sdma_v7_0_ring_preempt_ib, + .reset = sdma_v7_0_reset_queue, }; static void sdma_v7_0_set_ring_funcs(struct amdgpu_device *adev) diff --git a/drivers/gpu/drm/amd/amdgpu/si.c b/drivers/gpu/drm/amd/amdgpu/si.c index 00f63d3fbea7..77ef7da2e4fe 100644 --- a/drivers/gpu/drm/amd/amdgpu/si.c +++ b/drivers/gpu/drm/amd/amdgpu/si.c @@ -2649,13 +2649,13 @@ static bool si_common_is_idle(void *handle) return true; } -static int si_common_set_clockgating_state(void *handle, +static int si_common_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state) { return 0; } -static int si_common_set_powergating_state(void *handle, +static int si_common_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state) { return 0; diff --git a/drivers/gpu/drm/amd/amdgpu/si_dma.c b/drivers/gpu/drm/amd/amdgpu/si_dma.c index 47647a6083e8..dbd78d5345a4 100644 --- a/drivers/gpu/drm/amd/amdgpu/si_dma.c +++ b/drivers/gpu/drm/amd/amdgpu/si_dma.c @@ -286,7 +286,7 @@ static int si_dma_ring_test_ib(struct amdgpu_ring *ring, long timeout) r = -EINVAL; err1: - amdgpu_ib_free(adev, &ib, NULL); + amdgpu_ib_free(&ib, NULL); dma_fence_put(f); err0: amdgpu_device_wb_free(adev, index); @@ -629,13 +629,13 @@ static int si_dma_process_trap_irq(struct amdgpu_device *adev, return 0; } -static int si_dma_set_clockgating_state(void *handle, +static int si_dma_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state) { u32 orig, data, offset; int i; bool enable; - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; enable = (state == 
AMD_CG_STATE_GATE); @@ -672,12 +672,12 @@ static int si_dma_set_clockgating_state(void *handle, return 0; } -static int si_dma_set_powergating_state(void *handle, +static int si_dma_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state) { u32 tmp; - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; WREG32(DMA_PGFSM_WRITE, 0x00002000); WREG32(DMA_PGFSM_CONFIG, 0x100010ff); diff --git a/drivers/gpu/drm/amd/amdgpu/si_ih.c b/drivers/gpu/drm/amd/amdgpu/si_ih.c index 2ec1ebe4db11..a32b6243c1f8 100644 --- a/drivers/gpu/drm/amd/amdgpu/si_ih.c +++ b/drivers/gpu/drm/amd/amdgpu/si_ih.c @@ -263,13 +263,13 @@ static int si_ih_soft_reset(struct amdgpu_ip_block *ip_block) return 0; } -static int si_ih_set_clockgating_state(void *handle, +static int si_ih_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state) { return 0; } -static int si_ih_set_powergating_state(void *handle, +static int si_ih_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state) { return 0; diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c b/drivers/gpu/drm/amd/amdgpu/soc15.c index ede072758dab..a59b4c36cad7 100644 --- a/drivers/gpu/drm/amd/amdgpu/soc15.c +++ b/drivers/gpu/drm/amd/amdgpu/soc15.c @@ -171,6 +171,24 @@ static const struct amdgpu_video_codecs vcn_4_0_3_video_codecs_encode = { .codec_array = NULL, }; +static const struct amdgpu_video_codecs vcn_5_0_1_video_codecs_encode_vcn0 = { + .codec_count = 0, + .codec_array = NULL, +}; + +static const struct amdgpu_video_codec_info vcn_5_0_1_video_codecs_decode_array_vcn0[] = { + {codec_info_build(AMDGPU_INFO_VIDEO_CAPS_CODEC_IDX_MPEG4_AVC, 4096, 4096, 52)}, + {codec_info_build(AMDGPU_INFO_VIDEO_CAPS_CODEC_IDX_HEVC, 8192, 4352, 186)}, + {codec_info_build(AMDGPU_INFO_VIDEO_CAPS_CODEC_IDX_JPEG, 16384, 16384, 0)}, + {codec_info_build(AMDGPU_INFO_VIDEO_CAPS_CODEC_IDX_VP9, 8192, 4352, 0)}, + 
{codec_info_build(AMDGPU_INFO_VIDEO_CAPS_CODEC_IDX_AV1, 8192, 4352, 0)}, +}; + +static const struct amdgpu_video_codecs vcn_5_0_1_video_codecs_decode_vcn0 = { + .codec_count = ARRAY_SIZE(vcn_5_0_1_video_codecs_decode_array_vcn0), + .codec_array = vcn_5_0_1_video_codecs_decode_array_vcn0, +}; + static int soc15_query_video_codecs(struct amdgpu_device *adev, bool encode, const struct amdgpu_video_codecs **codecs) { @@ -209,6 +227,12 @@ static int soc15_query_video_codecs(struct amdgpu_device *adev, bool encode, else *codecs = &vcn_4_0_3_video_codecs_decode; return 0; + case IP_VERSION(5, 0, 1): + if (encode) + *codecs = &vcn_5_0_1_video_codecs_encode_vcn0; + else + *codecs = &vcn_5_0_1_video_codecs_decode_vcn0; + return 0; default: return -EINVAL; } @@ -327,6 +351,7 @@ static u32 soc15_get_xclk(struct amdgpu_device *adev) if (amdgpu_ip_version(adev, MP1_HWIP, 0) == IP_VERSION(12, 0, 0) || amdgpu_ip_version(adev, MP1_HWIP, 0) == IP_VERSION(12, 0, 1) || amdgpu_ip_version(adev, MP1_HWIP, 0) == IP_VERSION(13, 0, 6) || + amdgpu_ip_version(adev, MP1_HWIP, 0) == IP_VERSION(13, 0, 12) || amdgpu_ip_version(adev, MP1_HWIP, 0) == IP_VERSION(13, 0, 14)) return 10000; if (amdgpu_ip_version(adev, MP1_HWIP, 0) == IP_VERSION(10, 0, 0) || @@ -556,6 +581,7 @@ soc15_asic_reset_method(struct amdgpu_device *adev) break; case IP_VERSION(13, 0, 6): case IP_VERSION(13, 0, 14): + case IP_VERSION(13, 0, 12): /* Use gpu_recovery param to target a reset method. * Enable triggering of GPU reset only if specified * by module parameter. 
@@ -1177,6 +1203,7 @@ static int soc15_common_early_init(struct amdgpu_ip_block *ip_block) break; case IP_VERSION(9, 4, 3): case IP_VERSION(9, 4, 4): + case IP_VERSION(9, 5, 0): adev->asic_funcs = &aqua_vanjaram_asic_funcs; adev->cg_flags = AMD_CG_SUPPORT_GFX_MGCG | AMD_CG_SUPPORT_GFX_CGCG | @@ -1385,10 +1412,10 @@ static void soc15_update_drm_light_sleep(struct amdgpu_device *adev, bool enable WREG32(SOC15_REG_OFFSET(MP0, 0, mmMP0_MISC_LIGHT_SLEEP_CTRL), data); } -static int soc15_common_set_clockgating_state(void *handle, +static int soc15_common_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state) { - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; if (amdgpu_sriov_vf(adev)) return 0; @@ -1453,6 +1480,7 @@ static void soc15_common_get_clockgating_state(void *handle, u64 *flags) if ((amdgpu_ip_version(adev, MP0_HWIP, 0) != IP_VERSION(13, 0, 2)) && (amdgpu_ip_version(adev, MP0_HWIP, 0) != IP_VERSION(13, 0, 6)) && + (amdgpu_ip_version(adev, MP0_HWIP, 0) != IP_VERSION(13, 0, 12)) && (amdgpu_ip_version(adev, MP0_HWIP, 0) != IP_VERSION(13, 0, 14))) { /* AMD_CG_SUPPORT_DRM_MGCG */ data = RREG32(SOC15_REG_OFFSET(MP0, 0, mmMP0_MISC_CGTT_CTRL0)); @@ -1473,7 +1501,7 @@ static void soc15_common_get_clockgating_state(void *handle, u64 *flags) adev->df.funcs->get_clockgating_state(adev, flags); } -static int soc15_common_set_powergating_state(void *handle, +static int soc15_common_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state) { /* todo */ diff --git a/drivers/gpu/drm/amd/amdgpu/soc21.c b/drivers/gpu/drm/amd/amdgpu/soc21.c index d6999835918f..62ad67d0b598 100644 --- a/drivers/gpu/drm/amd/amdgpu/soc21.c +++ b/drivers/gpu/drm/amd/amdgpu/soc21.c @@ -928,10 +928,10 @@ static bool soc21_common_is_idle(void *handle) return true; } -static int soc21_common_set_clockgating_state(void *handle, +static int soc21_common_set_clockgating_state(struct 
amdgpu_ip_block *ip_block, enum amd_clockgating_state state) { - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; switch (amdgpu_ip_version(adev, NBIO_HWIP, 0)) { case IP_VERSION(4, 3, 0): @@ -954,10 +954,10 @@ static int soc21_common_set_clockgating_state(void *handle, return 0; } -static int soc21_common_set_powergating_state(void *handle, +static int soc21_common_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state) { - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; switch (amdgpu_ip_version(adev, LSDMA_HWIP, 0)) { case IP_VERSION(6, 0, 0): diff --git a/drivers/gpu/drm/amd/amdgpu/soc24.c b/drivers/gpu/drm/amd/amdgpu/soc24.c index be96de92b2f5..6b8e078ee7c7 100644 --- a/drivers/gpu/drm/amd/amdgpu/soc24.c +++ b/drivers/gpu/drm/amd/amdgpu/soc24.c @@ -444,8 +444,18 @@ static int soc24_common_late_init(struct amdgpu_ip_block *ip_block) { struct amdgpu_device *adev = ip_block->adev; - if (amdgpu_sriov_vf(adev)) + if (amdgpu_sriov_vf(adev)) { xgpu_nv_mailbox_get_irq(adev); + } else { + if (adev->nbio.ras && + adev->nbio.ras_err_event_athub_irq.funcs) + /* don't need to fail gpu late init + * if enabling athub_err_event interrupt failed + * nbif v6_3_1 only support fatal error hanlding + * just enable the interrupt directly + */ + amdgpu_irq_get(adev, &adev->nbio.ras_err_event_athub_irq, 0); + } /* Enable selfring doorbell aperture late because doorbell BAR * aperture will change if resize BAR successfully in gmc sw_init. 
@@ -501,8 +511,13 @@ static int soc24_common_hw_fini(struct amdgpu_ip_block *ip_block) adev->nbio.funcs->enable_doorbell_aperture(adev, false); adev->nbio.funcs->enable_doorbell_selfring_aperture(adev, false); - if (amdgpu_sriov_vf(adev)) + if (amdgpu_sriov_vf(adev)) { xgpu_nv_mailbox_put_irq(adev); + } else { + if (adev->nbio.ras && + adev->nbio.ras_err_event_athub_irq.funcs) + amdgpu_irq_put(adev, &adev->nbio.ras_err_event_athub_irq, 0); + } return 0; } @@ -522,10 +537,10 @@ static bool soc24_common_is_idle(void *handle) return true; } -static int soc24_common_set_clockgating_state(void *handle, +static int soc24_common_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state) { - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; switch (amdgpu_ip_version(adev, NBIO_HWIP, 0)) { case IP_VERSION(6, 3, 1): @@ -542,10 +557,10 @@ static int soc24_common_set_clockgating_state(void *handle, return 0; } -static int soc24_common_set_powergating_state(void *handle, +static int soc24_common_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state) { - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; switch (amdgpu_ip_version(adev, LSDMA_HWIP, 0)) { case IP_VERSION(7, 0, 0): diff --git a/drivers/gpu/drm/amd/amdgpu/ta_ras_if.h b/drivers/gpu/drm/amd/amdgpu/ta_ras_if.h index 21b71a427b1f..64891f099366 100644 --- a/drivers/gpu/drm/amd/amdgpu/ta_ras_if.h +++ b/drivers/gpu/drm/amd/amdgpu/ta_ras_if.h @@ -30,6 +30,9 @@ #define RSP_ID_MASK (1U << 31) #define RSP_ID(cmdId) (((uint32_t)(cmdId)) | RSP_ID_MASK) +/* invalid node instance value */ +#define TA_RAS_INV_NODE 0xffff + /* RAS related enumerations */ /**********************************************************/ enum ras_command { diff --git a/drivers/gpu/drm/amd/amdgpu/ta_secureDisplay_if.h b/drivers/gpu/drm/amd/amdgpu/ta_secureDisplay_if.h index 
00d8bdb8254f..9ec2e03d41c7 100644 --- a/drivers/gpu/drm/amd/amdgpu/ta_secureDisplay_if.h +++ b/drivers/gpu/drm/amd/amdgpu/ta_secureDisplay_if.h @@ -31,10 +31,12 @@ * Secure Display Command ID */ enum ta_securedisplay_command { - /* Query whether TA is responding used only for validation purpose */ + /* Query whether TA is responding. It is used only for validation purpose */ TA_SECUREDISPLAY_COMMAND__QUERY_TA = 1, /* Send region of Interest and CRC value to I2C */ TA_SECUREDISPLAY_COMMAND__SEND_ROI_CRC = 2, + /* V2 to send multiple regions of Interest and CRC value to I2C */ + TA_SECUREDISPLAY_COMMAND__SEND_ROI_CRC_V2 = 3, /* Maximum Command ID */ TA_SECUREDISPLAY_COMMAND__MAX_ID = 0x7FFFFFFF, }; @@ -83,6 +85,8 @@ enum ta_securedisplay_ta_query_cmd_ret { enum ta_securedisplay_buffer_size { /* 15 bytes = 8 byte (ROI) + 6 byte(CRC) + 1 byte(phy_id) */ TA_SECUREDISPLAY_I2C_BUFFER_SIZE = 15, + /* 16 bytes = 8 byte (ROI) + 6 byte(CRC) + 1 byte(phy_id) + 1 byte(roi_idx) */ + TA_SECUREDISPLAY_V2_I2C_BUFFER_SIZE = 16, }; /** Input/output structures for Secure Display commands */ @@ -95,7 +99,15 @@ enum ta_securedisplay_buffer_size { * Physical ID to determine which DIO scratch register should be used to get ROI */ struct ta_securedisplay_send_roi_crc_input { - uint32_t phy_id; /* Physical ID */ + /* Physical ID */ + uint32_t phy_id; +}; + +struct ta_securedisplay_send_roi_crc_v2_input { + /* Physical ID */ + uint32_t phy_id; + /* Region of interest index */ + uint8_t roi_idx; }; /** @union ta_securedisplay_cmd_input @@ -104,6 +116,8 @@ struct ta_securedisplay_send_roi_crc_input { union ta_securedisplay_cmd_input { /* send ROI and CRC input buffer format */ struct ta_securedisplay_send_roi_crc_input send_roi_crc; + /* send ROI and CRC input buffer format, v2 adds a ROI index */ + struct ta_securedisplay_send_roi_crc_v2_input send_roi_crc_v2; uint32_t reserved[4]; }; @@ -128,6 +142,10 @@ struct ta_securedisplay_send_roi_crc_output { uint8_t reserved; }; +struct 
ta_securedisplay_send_roi_crc_v2_output { + uint8_t i2c_buf[TA_SECUREDISPLAY_V2_I2C_BUFFER_SIZE]; /* I2C buffer */ +}; + /** @union ta_securedisplay_cmd_output * Output buffer */ @@ -136,6 +154,8 @@ union ta_securedisplay_cmd_output { struct ta_securedisplay_query_ta_output query_ta; /* Send ROI CRC output buffer format used only for validation purpose */ struct ta_securedisplay_send_roi_crc_output send_roi_crc; + /* Send ROI CRC output buffer format used only for validation purpose */ + struct ta_securedisplay_send_roi_crc_v2_output send_roi_crc_v2; uint32_t reserved[4]; }; diff --git a/drivers/gpu/drm/amd/amdgpu/tonga_ih.c b/drivers/gpu/drm/amd/amdgpu/tonga_ih.c index 5a04a6770138..0968e551f7b5 100644 --- a/drivers/gpu/drm/amd/amdgpu/tonga_ih.c +++ b/drivers/gpu/drm/amd/amdgpu/tonga_ih.c @@ -448,13 +448,13 @@ static int tonga_ih_soft_reset(struct amdgpu_ip_block *ip_block) return 0; } -static int tonga_ih_set_clockgating_state(void *handle, +static int tonga_ih_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state) { return 0; } -static int tonga_ih_set_powergating_state(void *handle, +static int tonga_ih_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state) { return 0; diff --git a/drivers/gpu/drm/amd/amdgpu/umc_v12_0.c b/drivers/gpu/drm/amd/amdgpu/umc_v12_0.c index 1a8ea834efa6..a7b9c358a2d4 100644 --- a/drivers/gpu/drm/amd/amdgpu/umc_v12_0.c +++ b/drivers/gpu/drm/amd/amdgpu/umc_v12_0.c @@ -173,156 +173,96 @@ static void umc_v12_0_query_ras_error_count(struct amdgpu_device *adev, umc_v12_0_reset_error_count(adev); } -static void umc_v12_0_convert_error_address(struct amdgpu_device *adev, +static int umc_v12_0_convert_error_address(struct amdgpu_device *adev, struct ras_err_data *err_data, - struct ta_ras_query_address_input *addr_in) + struct ta_ras_query_address_input *addr_in, + struct ta_ras_query_address_output *addr_out, + bool dump_addr) { - uint32_t col, row, row_xor, bank, 
channel_index; - uint64_t soc_pa, retired_page, column, err_addr; - struct ta_ras_query_address_output addr_out; + uint32_t col, col_lower, row, row_lower, bank; + uint32_t channel_index = 0, umc_inst = 0; + uint32_t i, loop_bits[UMC_V12_0_RETIRE_LOOP_BITS]; + uint64_t soc_pa, column, err_addr; + struct ta_ras_query_address_output addr_out_tmp; + struct ta_ras_query_address_output *paddr_out; + enum amdgpu_memory_partition nps = AMDGPU_NPS1_PARTITION_MODE; + int ret = 0; - err_addr = addr_in->ma.err_addr; - addr_in->addr_type = TA_RAS_MCA_TO_PA; - if (psp_ras_query_address(&adev->psp, addr_in, &addr_out)) { - dev_warn(adev->dev, "Failed to query RAS physical address for 0x%llx", - err_addr); + if (!addr_out) + paddr_out = &addr_out_tmp; + else + paddr_out = addr_out; - return; + err_addr = bank = 0; + if (addr_in) { + err_addr = addr_in->ma.err_addr; + addr_in->addr_type = TA_RAS_MCA_TO_PA; + ret = psp_ras_query_address(&adev->psp, addr_in, paddr_out); + if (ret) { + dev_warn(adev->dev, "Failed to query RAS physical address for 0x%llx", + err_addr); + + goto out; + } + + bank = paddr_out->pa.bank; + /* no need to care about umc inst if addr_in is NULL */ + umc_inst = addr_in->ma.umc_inst; } - soc_pa = addr_out.pa.pa; - bank = addr_out.pa.bank; - channel_index = addr_out.pa.channel_idx; + loop_bits[0] = UMC_V12_0_PA_C2_BIT; + loop_bits[1] = UMC_V12_0_PA_C3_BIT; + loop_bits[2] = UMC_V12_0_PA_C4_BIT; + loop_bits[3] = UMC_V12_0_PA_R13_BIT; - col = (err_addr >> 1) & 0x1fULL; - row = (err_addr >> 10) & 0x3fffULL; - row_xor = row ^ (0x1ULL << 13); - /* clear [C3 C2] in soc physical address */ - soc_pa &= ~(0x3ULL << UMC_V12_0_PA_C2_BIT); - /* clear [C4] in soc physical address */ - soc_pa &= ~(0x1ULL << UMC_V12_0_PA_C4_BIT); - - /* loop for all possibilities of [C4 C3 C2] */ - for (column = 0; column < UMC_V12_0_NA_MAP_PA_NUM; column++) { - retired_page = soc_pa | ((column & 0x3) << UMC_V12_0_PA_C2_BIT); - retired_page |= (((column & 0x4) >> 2) << UMC_V12_0_PA_C4_BIT); - 
/* include column bit 0 and 1 */ - col &= 0x3; - col |= (column << 2); - dev_info(adev->dev, - "Error Address(PA):0x%-10llx Row:0x%-4x Col:0x%-2x Bank:0x%x Channel:0x%x\n", - retired_page, row, col, bank, channel_index); - amdgpu_umc_fill_error_record(err_data, err_addr, - retired_page, channel_index, addr_in->ma.umc_inst); - - /* shift R13 bit */ - retired_page ^= (0x1ULL << UMC_V12_0_PA_R13_BIT); - dev_info(adev->dev, - "Error Address(PA):0x%-10llx Row:0x%-4x Col:0x%-2x Bank:0x%x Channel:0x%x\n", - retired_page, row_xor, col, bank, channel_index); - amdgpu_umc_fill_error_record(err_data, err_addr, - retired_page, channel_index, addr_in->ma.umc_inst); - } -} - -static void umc_v12_0_dump_addr_info(struct amdgpu_device *adev, - struct ta_ras_query_address_output *addr_out, - uint64_t err_addr) -{ - uint32_t col, row, row_xor, bank, channel_index; - uint64_t soc_pa, retired_page, column; - - soc_pa = addr_out->pa.pa; - bank = addr_out->pa.bank; - channel_index = addr_out->pa.channel_idx; - - col = (err_addr >> 1) & 0x1fULL; - row = (err_addr >> 10) & 0x3fffULL; - row_xor = row ^ (0x1ULL << 13); - /* clear [C3 C2] in soc physical address */ - soc_pa &= ~(0x3ULL << UMC_V12_0_PA_C2_BIT); - /* clear [C4] in soc physical address */ - soc_pa &= ~(0x1ULL << UMC_V12_0_PA_C4_BIT); - - /* loop for all possibilities of [C4 C3 C2] */ - for (column = 0; column < UMC_V12_0_NA_MAP_PA_NUM; column++) { - retired_page = soc_pa | ((column & 0x3) << UMC_V12_0_PA_C2_BIT); - retired_page |= (((column & 0x4) >> 2) << UMC_V12_0_PA_C4_BIT); - /* include column bit 0 and 1 */ - col &= 0x3; - col |= (column << 2); - dev_info(adev->dev, - "Error Address(PA):0x%-10llx Row:0x%-4x Col:0x%-2x Bank:0x%x Channel:0x%x\n", - retired_page, row, col, bank, channel_index); - - /* shift R13 bit */ - retired_page ^= (0x1ULL << UMC_V12_0_PA_R13_BIT); - dev_info(adev->dev, - "Error Address(PA):0x%-10llx Row:0x%-4x Col:0x%-2x Bank:0x%x Channel:0x%x\n", - retired_page, row_xor, col, bank, channel_index); - } 
-} - -static int umc_v12_0_lookup_bad_pages_in_a_row(struct amdgpu_device *adev, - uint64_t pa_addr, uint64_t *pfns, int len) -{ - uint64_t soc_pa, retired_page, column; - uint32_t pos = 0; - - soc_pa = pa_addr; - /* clear [C3 C2] in soc physical address */ - soc_pa &= ~(0x3ULL << UMC_V12_0_PA_C2_BIT); - /* clear [C4] in soc physical address */ - soc_pa &= ~(0x1ULL << UMC_V12_0_PA_C4_BIT); - - /* loop for all possibilities of [C4 C3 C2] */ - for (column = 0; column < UMC_V12_0_NA_MAP_PA_NUM; column++) { - retired_page = soc_pa | ((column & 0x3) << UMC_V12_0_PA_C2_BIT); - retired_page |= (((column & 0x4) >> 2) << UMC_V12_0_PA_C4_BIT); - - if (pos >= len) - return 0; - pfns[pos++] = retired_page >> AMDGPU_GPU_PAGE_SHIFT; - - /* shift R13 bit */ - retired_page ^= (0x1ULL << UMC_V12_0_PA_R13_BIT); - - if (pos >= len) - return 0; - pfns[pos++] = retired_page >> AMDGPU_GPU_PAGE_SHIFT; + if (adev->gmc.gmc_funcs->query_mem_partition_mode) + nps = adev->gmc.gmc_funcs->query_mem_partition_mode(adev); + /* other nps modes are taken as nps1 */ + if (nps == AMDGPU_NPS4_PARTITION_MODE) { + loop_bits[0] = UMC_V12_0_PA_CH4_BIT; + loop_bits[1] = UMC_V12_0_PA_CH5_BIT; + loop_bits[2] = UMC_V12_0_PA_B0_BIT; + loop_bits[3] = UMC_V12_0_PA_R11_BIT; } - return pos; -} + soc_pa = paddr_out->pa.pa; + channel_index = paddr_out->pa.channel_idx; + /* clear loop bits in soc physical address */ + for (i = 0; i < UMC_V12_0_RETIRE_LOOP_BITS; i++) + soc_pa &= ~BIT_ULL(loop_bits[i]); -static int umc_v12_0_convert_mca_to_addr(struct amdgpu_device *adev, - uint64_t err_addr, uint32_t ch, uint32_t umc, - uint32_t node, uint32_t socket, - uint64_t *addr, bool dump_addr) -{ - struct ta_ras_query_address_input addr_in; - struct ta_ras_query_address_output addr_out; + paddr_out->pa.pa = soc_pa; + /* get column bit 0 and 1 in mca address */ + col_lower = (err_addr >> 1) & 0x3ULL; + /* MA_R13_BIT will be handled later */ + row_lower = (err_addr >> UMC_V12_0_MA_R0_BIT) & 0x1fffULL; - memset(&addr_in, 0, 
sizeof(addr_in)); - addr_in.ma.err_addr = err_addr; - addr_in.ma.ch_inst = ch; - addr_in.ma.umc_inst = umc; - addr_in.ma.node_inst = node; - addr_in.ma.socket_id = socket; - addr_in.addr_type = TA_RAS_MCA_TO_PA; - if (psp_ras_query_address(&adev->psp, &addr_in, &addr_out)) { - dev_warn(adev->dev, "Failed to query RAS physical address for 0x%llx", - err_addr); - return -EINVAL; + if (!err_data && !dump_addr) + goto out; + + /* loop for all possibilities of retired bits */ + for (column = 0; column < UMC_V12_0_BAD_PAGE_NUM_PER_CHANNEL; column++) { + soc_pa = paddr_out->pa.pa; + for (i = 0; i < UMC_V12_0_RETIRE_LOOP_BITS; i++) + soc_pa |= (((column >> i) & 0x1ULL) << loop_bits[i]); + + col = ((column & 0x7) << 2) | col_lower; + /* add row bit 13 */ + row = ((column >> 3) << 13) | row_lower; + + if (dump_addr) + dev_info(adev->dev, + "Error Address(PA):0x%-10llx Row:0x%-4x Col:0x%-2x Bank:0x%x Channel:0x%x\n", + soc_pa, row, col, bank, channel_index); + + if (err_data) + amdgpu_umc_fill_error_record(err_data, err_addr, + soc_pa, channel_index, umc_inst); } - if (dump_addr) - umc_v12_0_dump_addr_info(adev, &addr_out, err_addr); - - *addr = addr_out.pa.pa; - - return 0; +out: + return ret; } static int umc_v12_0_query_error_address(struct amdgpu_device *adev, @@ -374,7 +314,7 @@ static int umc_v12_0_query_error_address(struct amdgpu_device *adev, addr_in.ma.umc_inst = umc_inst; addr_in.ma.node_inst = node_inst; - umc_v12_0_convert_error_address(adev, err_data, &addr_in); + umc_v12_0_convert_error_address(adev, err_data, &addr_in, NULL, true); } /* clear umc status */ @@ -526,6 +466,9 @@ static int umc_v12_0_update_ecc_status(struct amdgpu_device *adev, uint64_t page_pfn[UMC_V12_0_BAD_PAGE_NUM_PER_CHANNEL]; uint64_t err_addr, pa_addr = 0; struct ras_ecc_err *ecc_err; + struct ta_ras_query_address_output addr_out; + enum amdgpu_memory_partition nps = AMDGPU_NPS1_PARTITION_MODE; + uint32_t shift_bit = UMC_V12_0_PA_C4_BIT; int count, ret, i; hwid = REG_GET_FIELD(ipid, 
MCMP1_IPIDT0, HardwareID); @@ -552,10 +495,10 @@ static int umc_v12_0_update_ecc_status(struct amdgpu_device *adev, MCA_IPID_2_UMC_CH(ipid), err_addr); - ret = umc_v12_0_convert_mca_to_addr(adev, + ret = amdgpu_umc_mca_to_addr(adev, err_addr, MCA_IPID_2_UMC_CH(ipid), MCA_IPID_2_UMC_INST(ipid), MCA_IPID_2_DIE_ID(ipid), - MCA_IPID_2_SOCKET_ID(ipid), &pa_addr, true); + MCA_IPID_2_SOCKET_ID(ipid), &addr_out, true); if (ret) return ret; @@ -563,14 +506,21 @@ static int umc_v12_0_update_ecc_status(struct amdgpu_device *adev, if (!ecc_err) return -ENOMEM; + pa_addr = addr_out.pa.pa; ecc_err->status = status; ecc_err->ipid = ipid; ecc_err->addr = addr; - ecc_err->pa_pfn = UMC_V12_ADDR_MASK_BAD_COLS(pa_addr) >> AMDGPU_GPU_PAGE_SHIFT; + ecc_err->pa_pfn = pa_addr >> AMDGPU_GPU_PAGE_SHIFT; + ecc_err->channel_idx = addr_out.pa.channel_idx; + + if (adev->gmc.gmc_funcs->query_mem_partition_mode) + nps = adev->gmc.gmc_funcs->query_mem_partition_mode(adev); + if (nps == AMDGPU_NPS4_PARTITION_MODE) + shift_bit = UMC_V12_0_PA_B0_BIT; /* If converted pa_pfn is 0, use pa C4 pfn. 
*/ if (!ecc_err->pa_pfn) - ecc_err->pa_pfn = BIT_ULL(UMC_V12_0_PA_C4_BIT) >> AMDGPU_GPU_PAGE_SHIFT; + ecc_err->pa_pfn = BIT_ULL(shift_bit) >> AMDGPU_GPU_PAGE_SHIFT; ret = amdgpu_umc_logs_ecc_err(adev, &con->umc_ecc_log.de_page_tree, ecc_err); if (ret) { @@ -586,7 +536,7 @@ static int umc_v12_0_update_ecc_status(struct amdgpu_device *adev, con->umc_ecc_log.de_queried_count++; memset(page_pfn, 0, sizeof(page_pfn)); - count = umc_v12_0_lookup_bad_pages_in_a_row(adev, + count = amdgpu_umc_lookup_bad_pages_in_a_row(adev, pa_addr, page_pfn, ARRAY_SIZE(page_pfn)); if (count <= 0) { @@ -629,7 +579,7 @@ static int umc_v12_0_fill_error_record(struct amdgpu_device *adev, return -EINVAL; memset(page_pfn, 0, sizeof(page_pfn)); - count = umc_v12_0_lookup_bad_pages_in_a_row(adev, + count = amdgpu_umc_lookup_bad_pages_in_a_row(adev, ecc_err->pa_pfn << AMDGPU_GPU_PAGE_SHIFT, page_pfn, ARRAY_SIZE(page_pfn)); @@ -637,7 +587,7 @@ static int umc_v12_0_fill_error_record(struct amdgpu_device *adev, ret = amdgpu_umc_fill_error_record(err_data, ecc_err->addr, page_pfn[i] << AMDGPU_GPU_PAGE_SHIFT, - MCA_IPID_2_UMC_CH(ecc_err->ipid), + ecc_err->channel_idx, MCA_IPID_2_UMC_INST(ecc_err->ipid)); if (ret) break; @@ -676,6 +626,31 @@ static void umc_v12_0_query_ras_ecc_err_addr(struct amdgpu_device *adev, mutex_unlock(&con->umc_ecc_log.lock); } +static uint32_t umc_v12_0_get_die_id(struct amdgpu_device *adev, + uint64_t mca_addr, uint64_t retired_page) +{ + uint32_t die = 0; + + /* we only calculate die id for nps1 mode right now */ + die += ((((retired_page >> 12) & 0x1ULL)^ + ((retired_page >> 20) & 0x1ULL) ^ + ((retired_page >> 27) & 0x1ULL) ^ + ((retired_page >> 34) & 0x1ULL) ^ + ((retired_page >> 41) & 0x1ULL)) << 0); + + /* the original PA_C4 and PA_R13 may be cleared in retired_page, so + * get them from mca_addr. 
+ */ + die += ((((retired_page >> 13) & 0x1ULL) ^ + ((mca_addr >> 5) & 0x1ULL) ^ + ((retired_page >> 28) & 0x1ULL) ^ + ((mca_addr >> 23) & 0x1ULL) ^ + ((retired_page >> 42) & 0x1ULL)) << 1); + die &= 3; + + return die; +} + struct amdgpu_umc_ras umc_v12_0_ras = { .ras_block = { .hw_ops = &umc_v12_0_ras_hw_ops, @@ -686,5 +661,7 @@ struct amdgpu_umc_ras umc_v12_0_ras = { .ecc_info_query_ras_error_address = umc_v12_0_query_ras_ecc_err_addr, .check_ecc_err_status = umc_v12_0_check_ecc_err_status, .update_ecc_status = umc_v12_0_update_ecc_status, + .convert_ras_err_addr = umc_v12_0_convert_error_address, + .get_die_id_from_pa = umc_v12_0_get_die_id, }; diff --git a/drivers/gpu/drm/amd/amdgpu/umc_v12_0.h b/drivers/gpu/drm/amd/amdgpu/umc_v12_0.h index be5598d76c1d..9298018d938f 100644 --- a/drivers/gpu/drm/amd/amdgpu/umc_v12_0.h +++ b/drivers/gpu/drm/amd/amdgpu/umc_v12_0.h @@ -55,12 +55,24 @@ #define UMC_V12_0_NA_MAP_PA_NUM 8 /* R13 bit shift should be considered, double the number */ #define UMC_V12_0_BAD_PAGE_NUM_PER_CHANNEL (UMC_V12_0_NA_MAP_PA_NUM * 2) +/* C2, C3, C4, R13, four bits in MCA address are looped in retirement */ +#define UMC_V12_0_RETIRE_LOOP_BITS 4 /* column bits in SOC physical address */ #define UMC_V12_0_PA_C2_BIT 15 +#define UMC_V12_0_PA_C3_BIT 16 #define UMC_V12_0_PA_C4_BIT 21 /* row bits in SOC physical address */ +#define UMC_V12_0_PA_R0_BIT 22 +#define UMC_V12_0_PA_R11_BIT 33 #define UMC_V12_0_PA_R13_BIT 35 +/* channel bit in SOC physical address */ +#define UMC_V12_0_PA_CH4_BIT 12 +#define UMC_V12_0_PA_CH5_BIT 13 +/* bank bit in SOC physical address */ +#define UMC_V12_0_PA_B0_BIT 19 +/* row bits in MCA address */ +#define UMC_V12_0_MA_R0_BIT 10 #define MCA_UMC_HWID_V12_0 0x96 #define MCA_UMC_MCATYPE_V12_0 0x0 @@ -81,11 +93,6 @@ (((REG_GET_FIELD(ipid, MCMP1_IPIDT0, InstanceIdLo) & 0x1) << 2) | \ (REG_GET_FIELD(ipid, MCMP1_IPIDT0, InstanceIdHi) & 0x03)) -#define UMC_V12_ADDR_MASK_BAD_COLS(addr) \ - ((addr) & ~((0x3ULL << UMC_V12_0_PA_C2_BIT) | \ 
- (0x1ULL << UMC_V12_0_PA_C4_BIT) | \ - (0x1ULL << UMC_V12_0_PA_R13_BIT))) - bool umc_v12_0_is_deferred_error(struct amdgpu_device *adev, uint64_t mc_umc_status); bool umc_v12_0_is_uncorrectable_error(struct amdgpu_device *adev, uint64_t mc_umc_status); bool umc_v12_0_is_correctable_error(struct amdgpu_device *adev, uint64_t mc_umc_status); diff --git a/drivers/gpu/drm/amd/amdgpu/umc_v8_14.c b/drivers/gpu/drm/amd/amdgpu/umc_v8_14.c new file mode 100644 index 000000000000..eaca10a3c4a9 --- /dev/null +++ b/drivers/gpu/drm/amd/amdgpu/umc_v8_14.c @@ -0,0 +1,160 @@ +/* + * Copyright 2024 Advanced Micro Devices, Inc. + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * OTHER DEALINGS IN THE SOFTWARE. 
+ * + */ +#include "umc_v8_14.h" +#include "amdgpu_ras.h" +#include "amdgpu_umc.h" +#include "amdgpu.h" +#include "umc/umc_8_14_0_offset.h" +#include "umc/umc_8_14_0_sh_mask.h" + +static inline uint32_t get_umc_v8_14_reg_offset(struct amdgpu_device *adev, + uint32_t umc_inst, + uint32_t ch_inst) +{ + return adev->umc.channel_offs * ch_inst + UMC_V8_14_INST_DIST * umc_inst; +} + +static int umc_v8_14_clear_error_count_per_channel(struct amdgpu_device *adev, + uint32_t node_inst, uint32_t umc_inst, + uint32_t ch_inst, void *data) +{ + uint32_t ecc_err_cnt_addr; + uint32_t umc_reg_offset = + get_umc_v8_14_reg_offset(adev, umc_inst, ch_inst); + + ecc_err_cnt_addr = + SOC15_REG_OFFSET(UMC, 0, regUMCCH0_GeccErrCnt); + + /* clear error count */ + WREG32_PCIE((ecc_err_cnt_addr + umc_reg_offset) * 4, + UMC_V8_14_CE_CNT_INIT); + + return 0; +} + +static void umc_v8_14_clear_error_count(struct amdgpu_device *adev) +{ + amdgpu_umc_loop_channels(adev, + umc_v8_14_clear_error_count_per_channel, NULL); +} + +static void umc_v8_14_query_correctable_error_count(struct amdgpu_device *adev, + uint32_t umc_reg_offset, + unsigned long *error_count) +{ + uint32_t ecc_err_cnt, ecc_err_cnt_addr; + + /* UMC 8_14 registers */ + ecc_err_cnt_addr = + SOC15_REG_OFFSET(UMC, 0, regUMCCH0_GeccErrCnt); + + ecc_err_cnt = RREG32_PCIE((ecc_err_cnt_addr + umc_reg_offset) * 4); + *error_count += + (REG_GET_FIELD(ecc_err_cnt, UMCCH0_GeccErrCnt, GeccErrCnt) - + UMC_V8_14_CE_CNT_INIT); +} + +static void umc_v8_14_query_uncorrectable_error_count(struct amdgpu_device *adev, + uint32_t umc_reg_offset, + unsigned long *error_count) +{ + uint32_t ecc_err_cnt, ecc_err_cnt_addr; + /* UMC 8_14 registers */ + ecc_err_cnt_addr = + SOC15_REG_OFFSET(UMC, 0, regUMCCH0_GeccErrCnt); + + ecc_err_cnt = RREG32_PCIE((ecc_err_cnt_addr + umc_reg_offset) * 4); + *error_count += + (REG_GET_FIELD(ecc_err_cnt, UMCCH0_GeccErrCnt, GeccUnCorrErrCnt) - + UMC_V8_14_CE_CNT_INIT); +} + +static int 
umc_v8_14_query_error_count_per_channel(struct amdgpu_device *adev, + uint32_t node_inst, uint32_t umc_inst, + uint32_t ch_inst, void *data) +{ + struct ras_err_data *err_data = (struct ras_err_data *)data; + uint32_t umc_reg_offset = + get_umc_v8_14_reg_offset(adev, umc_inst, ch_inst); + + umc_v8_14_query_correctable_error_count(adev, + umc_reg_offset, + &(err_data->ce_count)); + umc_v8_14_query_uncorrectable_error_count(adev, + umc_reg_offset, + &(err_data->ue_count)); + + return 0; +} + +static void umc_v8_14_query_ras_error_count(struct amdgpu_device *adev, + void *ras_error_status) +{ + amdgpu_umc_loop_channels(adev, + umc_v8_14_query_error_count_per_channel, ras_error_status); + + umc_v8_14_clear_error_count(adev); +} + +static int umc_v8_14_err_cnt_init_per_channel(struct amdgpu_device *adev, + uint32_t node_inst, uint32_t umc_inst, + uint32_t ch_inst, void *data) +{ + uint32_t ecc_err_cnt_sel, ecc_err_cnt_sel_addr; + uint32_t ecc_err_cnt_addr; + uint32_t umc_reg_offset = + get_umc_v8_14_reg_offset(adev, umc_inst, ch_inst); + + ecc_err_cnt_sel_addr = + SOC15_REG_OFFSET(UMC, 0, regUMCCH0_GeccErrCntSel); + ecc_err_cnt_addr = + SOC15_REG_OFFSET(UMC, 0, regUMCCH0_GeccErrCnt); + + ecc_err_cnt_sel = RREG32_PCIE((ecc_err_cnt_sel_addr + umc_reg_offset) * 4); + + /* set ce error interrupt type to APIC based interrupt */ + ecc_err_cnt_sel = REG_SET_FIELD(ecc_err_cnt_sel, UMCCH0_GeccErrCntSel, + GeccErrInt, 0x1); + WREG32_PCIE((ecc_err_cnt_sel_addr + umc_reg_offset) * 4, ecc_err_cnt_sel); + /* set error count to initial value */ + WREG32_PCIE((ecc_err_cnt_addr + umc_reg_offset) * 4, UMC_V8_14_CE_CNT_INIT); + + return 0; +} + +static void umc_v8_14_err_cnt_init(struct amdgpu_device *adev) +{ + amdgpu_umc_loop_channels(adev, + umc_v8_14_err_cnt_init_per_channel, NULL); +} + +const struct amdgpu_ras_block_hw_ops umc_v8_14_ras_hw_ops = { + .query_ras_error_count = umc_v8_14_query_ras_error_count, +}; + +struct amdgpu_umc_ras umc_v8_14_ras = { + .ras_block = { + .hw_ops = 
&umc_v8_14_ras_hw_ops, + }, + .err_cnt_init = umc_v8_14_err_cnt_init, +}; diff --git a/drivers/gpu/drm/amd/amdgpu/umc_v8_14.h b/drivers/gpu/drm/amd/amdgpu/umc_v8_14.h new file mode 100644 index 000000000000..20a258f0017a --- /dev/null +++ b/drivers/gpu/drm/amd/amdgpu/umc_v8_14.h @@ -0,0 +1,51 @@ +/* + * Copyright 2024 Advanced Micro Devices, Inc. + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * OTHER DEALINGS IN THE SOFTWARE. 
+ * + */ +#ifndef __UMC_V8_14_H__ +#define __UMC_V8_14_H__ + +#include "soc15_common.h" +#include "amdgpu.h" + +/* number of umc channel instance with memory map register access */ +#define UMC_V8_14_CHANNEL_INSTANCE_NUM 2 +/* number of umc instance with memory map register access */ +#define UMC_V8_14_UMC_INSTANCE_NUM(adev) ((adev)->umc.node_inst_num) + +/* Total channel instances for all available umc nodes */ +#define UMC_V8_14_TOTAL_CHANNEL_NUM(adev) \ + (UMC_V8_14_CHANNEL_INSTANCE_NUM * (adev)->gmc.num_umc) + +/* UMC register per channel offset */ +#define UMC_V8_14_PER_CHANNEL_OFFSET 0x400 + +#define UMC_V8_14_INST_DIST 0x40000 + +/* EccErrCnt max value */ +#define UMC_V8_14_CE_CNT_MAX 0xffff +/* umc ce interrupt threshold */ +#define UMC_V8_14_CE_INT_THRESHOLD 0xffff +/* umc ce count initial value */ +#define UMC_V8_14_CE_CNT_INIT (UMC_V8_14_CE_CNT_MAX - UMC_V8_14_CE_INT_THRESHOLD) + +extern struct amdgpu_umc_ras umc_v8_14_ras; +#endif diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v3_1.c b/drivers/gpu/drm/amd/amdgpu/uvd_v3_1.c index bdbca25d80c4..5830e799c0a3 100644 --- a/drivers/gpu/drm/amd/amdgpu/uvd_v3_1.c +++ b/drivers/gpu/drm/amd/amdgpu/uvd_v3_1.c @@ -790,13 +790,13 @@ static int uvd_v3_1_soft_reset(struct amdgpu_ip_block *ip_block) return uvd_v3_1_start(adev); } -static int uvd_v3_1_set_clockgating_state(void *handle, +static int uvd_v3_1_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state) { return 0; } -static int uvd_v3_1_set_powergating_state(void *handle, +static int uvd_v3_1_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state) { return 0; diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v4_2.c b/drivers/gpu/drm/amd/amdgpu/uvd_v4_2.c index a836dc9cfcad..f93079e09215 100644 --- a/drivers/gpu/drm/amd/amdgpu/uvd_v4_2.c +++ b/drivers/gpu/drm/amd/amdgpu/uvd_v4_2.c @@ -44,7 +44,7 @@ static void uvd_v4_2_set_ring_funcs(struct amdgpu_device *adev); static void uvd_v4_2_set_irq_funcs(struct 
amdgpu_device *adev); static int uvd_v4_2_start(struct amdgpu_device *adev); static void uvd_v4_2_stop(struct amdgpu_device *adev); -static int uvd_v4_2_set_clockgating_state(void *handle, +static int uvd_v4_2_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state); static void uvd_v4_2_set_dcm(struct amdgpu_device *adev, bool sw_mode); @@ -708,13 +708,13 @@ static int uvd_v4_2_process_interrupt(struct amdgpu_device *adev, return 0; } -static int uvd_v4_2_set_clockgating_state(void *handle, +static int uvd_v4_2_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state) { return 0; } -static int uvd_v4_2_set_powergating_state(void *handle, +static int uvd_v4_2_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state) { /* This doesn't actually powergate the UVD block. @@ -724,7 +724,7 @@ static int uvd_v4_2_set_powergating_state(void *handle, * revisit this when there is a cleaner line between * the smc and the hw blocks */ - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; if (state == AMD_PG_STATE_GATE) { uvd_v4_2_stop(adev); diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v5_0.c b/drivers/gpu/drm/amd/amdgpu/uvd_v5_0.c index ab55fae3569e..050a0f309390 100644 --- a/drivers/gpu/drm/amd/amdgpu/uvd_v5_0.c +++ b/drivers/gpu/drm/amd/amdgpu/uvd_v5_0.c @@ -42,7 +42,7 @@ static void uvd_v5_0_set_ring_funcs(struct amdgpu_device *adev); static void uvd_v5_0_set_irq_funcs(struct amdgpu_device *adev); static int uvd_v5_0_start(struct amdgpu_device *adev); static void uvd_v5_0_stop(struct amdgpu_device *adev); -static int uvd_v5_0_set_clockgating_state(void *handle, +static int uvd_v5_0_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state); static void uvd_v5_0_enable_mgcg(struct amdgpu_device *adev, bool enable); @@ -155,7 +155,7 @@ static int uvd_v5_0_hw_init(struct amdgpu_ip_block 
*ip_block) int r; amdgpu_asic_set_uvd_clocks(adev, 10000, 10000); - uvd_v5_0_set_clockgating_state(adev, AMD_CG_STATE_UNGATE); + uvd_v5_0_set_clockgating_state(ip_block, AMD_CG_STATE_UNGATE); uvd_v5_0_enable_mgcg(adev, true); r = amdgpu_ring_test_helper(ring); @@ -790,16 +790,11 @@ static void uvd_v5_0_enable_mgcg(struct amdgpu_device *adev, } } -static int uvd_v5_0_set_clockgating_state(void *handle, +static int uvd_v5_0_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state) { - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; bool enable = (state == AMD_CG_STATE_GATE); - struct amdgpu_ip_block *ip_block; - - ip_block = amdgpu_device_ip_get_ip_block(adev, AMD_IP_BLOCK_TYPE_UVD); - if (!ip_block) - return -EINVAL; if (enable) { /* wait for STATUS to clear */ @@ -817,7 +812,7 @@ static int uvd_v5_0_set_clockgating_state(void *handle, return 0; } -static int uvd_v5_0_set_powergating_state(void *handle, +static int uvd_v5_0_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state) { /* This doesn't actually powergate the UVD block. 
@@ -827,7 +822,7 @@ static int uvd_v5_0_set_powergating_state(void *handle, * revisit this when there is a cleaner line between * the smc and the hw blocks */ - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; int ret = 0; if (state == AMD_PG_STATE_GATE) { diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c b/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c index 39f8c3d3a135..d9d036ee51fb 100644 --- a/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c +++ b/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c @@ -48,7 +48,7 @@ static void uvd_v6_0_set_irq_funcs(struct amdgpu_device *adev); static int uvd_v6_0_start(struct amdgpu_device *adev); static void uvd_v6_0_stop(struct amdgpu_device *adev); static void uvd_v6_0_set_sw_clock_gating(struct amdgpu_device *adev); -static int uvd_v6_0_set_clockgating_state(void *handle, +static int uvd_v6_0_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state); static void uvd_v6_0_enable_mgcg(struct amdgpu_device *adev, bool enable); @@ -467,7 +467,7 @@ static int uvd_v6_0_hw_init(struct amdgpu_ip_block *ip_block) int i, r; amdgpu_asic_set_uvd_clocks(adev, 10000, 10000); - uvd_v6_0_set_clockgating_state(adev, AMD_CG_STATE_UNGATE); + uvd_v6_0_set_clockgating_state(ip_block, AMD_CG_STATE_UNGATE); uvd_v6_0_enable_mgcg(adev, true); r = amdgpu_ring_test_helper(ring); @@ -1450,17 +1450,12 @@ static void uvd_v6_0_enable_mgcg(struct amdgpu_device *adev, } } -static int uvd_v6_0_set_clockgating_state(void *handle, +static int uvd_v6_0_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state) { - struct amdgpu_device *adev = (struct amdgpu_device *)handle; - struct amdgpu_ip_block *ip_block; + struct amdgpu_device *adev = ip_block->adev; bool enable = (state == AMD_CG_STATE_GATE); - ip_block = amdgpu_device_ip_get_ip_block(adev, AMD_IP_BLOCK_TYPE_UVD); - if (!ip_block) - return -EINVAL; - if (enable) { /* wait for STATUS to clear */ if 
(uvd_v6_0_wait_for_idle(ip_block)) @@ -1476,7 +1471,7 @@ static int uvd_v6_0_set_clockgating_state(void *handle, return 0; } -static int uvd_v6_0_set_powergating_state(void *handle, +static int uvd_v6_0_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state) { /* This doesn't actually powergate the UVD block. @@ -1486,7 +1481,7 @@ static int uvd_v6_0_set_powergating_state(void *handle, * revisit this when there is a cleaner line between * the smc and the hw blocks */ - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; int ret = 0; WREG32(mmUVD_POWER_STATUS, UVD_POWER_STATUS__UVD_PG_EN_MASK); diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c b/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c index 079131aeb2f7..9d237b5937fb 100644 --- a/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c +++ b/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c @@ -1288,7 +1288,7 @@ static int uvd_v7_0_ring_patch_cs_in_place(struct amdgpu_cs_parser *p, struct amdgpu_job *job, struct amdgpu_ib *ib) { - struct amdgpu_ring *ring = to_amdgpu_ring(job->base.sched); + struct amdgpu_ring *ring = amdgpu_job_ring(job); unsigned i; /* No patching necessary for the first instance */ @@ -1511,7 +1511,7 @@ static int uvd_v7_0_process_interrupt(struct amdgpu_device *adev, return 0; } -static int uvd_v7_0_set_clockgating_state(void *handle, +static int uvd_v7_0_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state) { /* needed for driver unload*/ diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v2_0.c b/drivers/gpu/drm/amd/amdgpu/vce_v2_0.c index c1ed91b39415..c633b7ff2943 100644 --- a/drivers/gpu/drm/amd/amdgpu/vce_v2_0.c +++ b/drivers/gpu/drm/amd/amdgpu/vce_v2_0.c @@ -578,13 +578,13 @@ static int vce_v2_0_process_interrupt(struct amdgpu_device *adev, return 0; } -static int vce_v2_0_set_clockgating_state(void *handle, +static int vce_v2_0_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum 
amd_clockgating_state state) { bool gate = false; bool sw_cg = false; - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; if (state == AMD_CG_STATE_GATE) { gate = true; @@ -596,7 +596,7 @@ static int vce_v2_0_set_clockgating_state(void *handle, return 0; } -static int vce_v2_0_set_powergating_state(void *handle, +static int vce_v2_0_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state) { /* This doesn't actually powergate the VCE block. @@ -606,7 +606,7 @@ static int vce_v2_0_set_powergating_state(void *handle, * revisit this when there is a cleaner line between * the smc and the hw blocks */ - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; if (state == AMD_PG_STATE_GATE) return vce_v2_0_stop(adev); diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c b/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c index 6bb318a06f19..f8bddcd19b68 100644 --- a/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c +++ b/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c @@ -65,7 +65,7 @@ static void vce_v3_0_mc_resume(struct amdgpu_device *adev, int idx); static void vce_v3_0_set_ring_funcs(struct amdgpu_device *adev); static void vce_v3_0_set_irq_funcs(struct amdgpu_device *adev); static int vce_v3_0_wait_for_idle(struct amdgpu_ip_block *ip_block); -static int vce_v3_0_set_clockgating_state(void *handle, +static int vce_v3_0_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state); /** * vce_v3_0_ring_get_rptr - get read pointer @@ -497,7 +497,7 @@ static int vce_v3_0_hw_fini(struct amdgpu_ip_block *ip_block) return r; vce_v3_0_stop(adev); - return vce_v3_0_set_clockgating_state(adev, AMD_CG_STATE_GATE); + return vce_v3_0_set_clockgating_state(ip_block, AMD_CG_STATE_GATE); } static int vce_v3_0_suspend(struct amdgpu_ip_block *ip_block) @@ -760,10 +760,10 @@ static int vce_v3_0_process_interrupt(struct amdgpu_device *adev, return 0; } 
-static int vce_v3_0_set_clockgating_state(void *handle, +static int vce_v3_0_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state) { - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; bool enable = (state == AMD_CG_STATE_GATE); int i; @@ -801,7 +801,7 @@ static int vce_v3_0_set_clockgating_state(void *handle, return 0; } -static int vce_v3_0_set_powergating_state(void *handle, +static int vce_v3_0_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state) { /* This doesn't actually powergate the VCE block. @@ -811,7 +811,7 @@ static int vce_v3_0_set_powergating_state(void *handle, * revisit this when there is a cleaner line between * the smc and the hw blocks */ - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; int ret = 0; if (state == AMD_PG_STATE_GATE) { diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c index 79ee555768a5..335bda64ff5b 100644 --- a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c +++ b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c @@ -684,14 +684,14 @@ static void vce_v4_0_mc_resume(struct amdgpu_device *adev) ~VCE_SYS_INT_EN__VCE_SYS_INT_TRAP_INTERRUPT_EN_MASK); } -static int vce_v4_0_set_clockgating_state(void *handle, +static int vce_v4_0_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state) { /* needed for driver unload*/ return 0; } -static int vce_v4_0_set_powergating_state(void *handle, +static int vce_v4_0_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state) { /* This doesn't actually powergate the VCE block. 
@@ -701,7 +701,7 @@ static int vce_v4_0_set_powergating_state(void *handle, * revisit this when there is a cleaner line between * the smc and the hw blocks */ - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; if (state == AMD_PG_STATE_GATE) return vce_v4_0_stop(adev); diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c b/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c index 10e99c926fb8..5ea96c983517 100644 --- a/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c @@ -85,7 +85,8 @@ static int vcn_v1_0_stop(struct amdgpu_device *adev); static void vcn_v1_0_set_dec_ring_funcs(struct amdgpu_device *adev); static void vcn_v1_0_set_enc_ring_funcs(struct amdgpu_device *adev); static void vcn_v1_0_set_irq_funcs(struct amdgpu_device *adev); -static int vcn_v1_0_set_powergating_state(void *handle, enum amd_powergating_state state); +static int vcn_v1_0_set_powergating_state(struct amdgpu_ip_block *ip_block, + enum amd_powergating_state state); static int vcn_v1_0_pause_dpg_mode(struct amdgpu_device *adev, int inst_idx, struct dpg_pause_state *new_state); @@ -281,7 +282,7 @@ static int vcn_v1_0_hw_fini(struct amdgpu_ip_block *ip_block) if ((adev->pg_flags & AMD_PG_SUPPORT_VCN_DPG) || (adev->vcn.cur_state != AMD_PG_STATE_GATE && RREG32_SOC15(VCN, 0, mmUVD_STATUS))) { - vcn_v1_0_set_powergating_state(adev, AMD_PG_STATE_GATE); + vcn_v1_0_set_powergating_state(ip_block, AMD_PG_STATE_GATE); } return 0; @@ -303,7 +304,7 @@ static int vcn_v1_0_suspend(struct amdgpu_ip_block *ip_block) idle_work_unexecuted = cancel_delayed_work_sync(&adev->vcn.idle_work); if (idle_work_unexecuted) { if (adev->pm.dpm_enabled) - amdgpu_dpm_enable_uvd(adev, false); + amdgpu_dpm_enable_vcn(adev, false, 0); } r = vcn_v1_0_hw_fini(ip_block); @@ -344,7 +345,7 @@ static int vcn_v1_0_resume(struct amdgpu_ip_block *ip_block) */ static void vcn_v1_0_mc_resume_spg_mode(struct amdgpu_device *adev) { - uint32_t size = 
AMDGPU_GPU_PAGE_ALIGN(adev->vcn.fw[0]->size + 4); + uint32_t size = AMDGPU_GPU_PAGE_ALIGN(adev->vcn.inst[0].fw->size + 4); uint32_t offset; /* cache window 0: fw */ @@ -411,7 +412,7 @@ static void vcn_v1_0_mc_resume_spg_mode(struct amdgpu_device *adev) static void vcn_v1_0_mc_resume_dpg_mode(struct amdgpu_device *adev) { - uint32_t size = AMDGPU_GPU_PAGE_ALIGN(adev->vcn.fw[0]->size + 4); + uint32_t size = AMDGPU_GPU_PAGE_ALIGN(adev->vcn.inst[0].fw->size + 4); uint32_t offset; /* cache window 0: fw */ @@ -1394,15 +1395,15 @@ static int vcn_v1_0_wait_for_idle(struct amdgpu_ip_block *ip_block) return ret; } -static int vcn_v1_0_set_clockgating_state(void *handle, +static int vcn_v1_0_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state) { - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; bool enable = (state == AMD_CG_STATE_GATE); if (enable) { /* wait for STATUS to clear */ - if (!vcn_v1_0_is_idle(handle)) + if (!vcn_v1_0_is_idle(adev)) return -EBUSY; vcn_v1_0_enable_clock_gating(adev); } else { @@ -1799,7 +1800,7 @@ static void vcn_v1_0_dec_ring_insert_nop(struct amdgpu_ring *ring, uint32_t coun } } -static int vcn_v1_0_set_powergating_state(void *handle, +static int vcn_v1_0_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state) { /* This doesn't actually powergate the VCN block. 
@@ -1810,7 +1811,7 @@ static int vcn_v1_0_set_powergating_state(void *handle, * the smc and the hw blocks */ int ret; - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; if (state == adev->vcn.cur_state) return 0; @@ -1856,7 +1857,7 @@ static void vcn_v1_0_idle_work_handler(struct work_struct *work) if (fences == 0) { amdgpu_gfx_off_ctrl(adev, true); if (adev->pm.dpm_enabled) - amdgpu_dpm_enable_uvd(adev, false); + amdgpu_dpm_enable_vcn(adev, false, 0); else amdgpu_device_ip_set_powergating_state(adev, AMD_IP_BLOCK_TYPE_VCN, AMD_PG_STATE_GATE); @@ -1886,7 +1887,7 @@ void vcn_v1_0_set_pg_for_begin_use(struct amdgpu_ring *ring, bool set_clocks) if (set_clocks) { amdgpu_gfx_off_ctrl(adev, false); if (adev->pm.dpm_enabled) - amdgpu_dpm_enable_uvd(adev, true); + amdgpu_dpm_enable_vcn(adev, true, 0); else amdgpu_device_ip_set_powergating_state(adev, AMD_IP_BLOCK_TYPE_VCN, AMD_PG_STATE_UNGATE); diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c b/drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c index e0322cbca3ec..e42cfc731ad8 100644 --- a/drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c @@ -92,7 +92,7 @@ static const struct amdgpu_hwip_reg_entry vcn_reg_list_2_0[] = { static void vcn_v2_0_set_dec_ring_funcs(struct amdgpu_device *adev); static void vcn_v2_0_set_enc_ring_funcs(struct amdgpu_device *adev); static void vcn_v2_0_set_irq_funcs(struct amdgpu_device *adev); -static int vcn_v2_0_set_powergating_state(void *handle, +static int vcn_v2_0_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state); static int vcn_v2_0_pause_dpg_mode(struct amdgpu_device *adev, int inst_idx, struct dpg_pause_state *new_state); @@ -318,7 +318,7 @@ static int vcn_v2_0_hw_fini(struct amdgpu_ip_block *ip_block) if ((adev->pg_flags & AMD_PG_SUPPORT_VCN_DPG) || (adev->vcn.cur_state != AMD_PG_STATE_GATE && RREG32_SOC15(VCN, 0, mmUVD_STATUS))) - vcn_v2_0_set_powergating_state(adev, 
AMD_PG_STATE_GATE); + vcn_v2_0_set_powergating_state(ip_block, AMD_PG_STATE_GATE); return 0; } @@ -372,7 +372,7 @@ static int vcn_v2_0_resume(struct amdgpu_ip_block *ip_block) */ static void vcn_v2_0_mc_resume(struct amdgpu_device *adev) { - uint32_t size = AMDGPU_GPU_PAGE_ALIGN(adev->vcn.fw[0]->size + 4); + uint32_t size = AMDGPU_GPU_PAGE_ALIGN(adev->vcn.inst[0].fw->size + 4); uint32_t offset; if (amdgpu_sriov_vf(adev)) @@ -428,7 +428,7 @@ static void vcn_v2_0_mc_resume(struct amdgpu_device *adev) static void vcn_v2_0_mc_resume_dpg_mode(struct amdgpu_device *adev, bool indirect) { - uint32_t size = AMDGPU_GPU_PAGE_ALIGN(adev->vcn.fw[0]->size + 4); + uint32_t size = AMDGPU_GPU_PAGE_ALIGN(adev->vcn.inst[0].fw->size + 4); uint32_t offset; /* cache window 0: fw */ @@ -978,7 +978,7 @@ static int vcn_v2_0_start(struct amdgpu_device *adev) int i, j, r; if (adev->pm.dpm_enabled) - amdgpu_dpm_enable_uvd(adev, true); + amdgpu_dpm_enable_vcn(adev, true, 0); if (adev->pg_flags & AMD_PG_SUPPORT_VCN_DPG) return vcn_v2_0_start_dpg_mode(adev, adev->vcn.indirect_sram); @@ -1235,7 +1235,7 @@ static int vcn_v2_0_stop(struct amdgpu_device *adev) power_off: if (adev->pm.dpm_enabled) - amdgpu_dpm_enable_uvd(adev, false); + amdgpu_dpm_enable_vcn(adev, false, 0); return 0; } @@ -1335,10 +1335,10 @@ static int vcn_v2_0_wait_for_idle(struct amdgpu_ip_block *ip_block) return ret; } -static int vcn_v2_0_set_clockgating_state(void *handle, +static int vcn_v2_0_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state) { - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; bool enable = (state == AMD_CG_STATE_GATE); if (amdgpu_sriov_vf(adev)) @@ -1346,7 +1346,7 @@ static int vcn_v2_0_set_clockgating_state(void *handle, if (enable) { /* wait for STATUS to clear */ - if (!vcn_v2_0_is_idle(handle)) + if (!vcn_v2_0_is_idle(adev)) return -EBUSY; vcn_v2_0_enable_clock_gating(adev); } else { @@ -1796,7 +1796,7 @@ 
int vcn_v2_0_dec_ring_test_ring(struct amdgpu_ring *ring) } -static int vcn_v2_0_set_powergating_state(void *handle, +static int vcn_v2_0_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state) { /* This doesn't actually powergate the VCN block. @@ -1807,7 +1807,7 @@ static int vcn_v2_0_set_powergating_state(void *handle, * the smc and the hw blocks */ int ret; - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; if (amdgpu_sriov_vf(adev)) { adev->vcn.cur_state = AMD_PG_STATE_UNGATE; @@ -1920,7 +1920,7 @@ static int vcn_v2_0_start_sriov(struct amdgpu_device *adev) init_table += header->vcn_table_offset; - size = AMDGPU_GPU_PAGE_ALIGN(adev->vcn.fw[0]->size + 4); + size = AMDGPU_GPU_PAGE_ALIGN(adev->vcn.inst[0].fw->size + 4); MMSCH_V2_0_INSERT_DIRECT_RD_MOD_WT( SOC15_REG_OFFSET(UVD, i, mmUVD_STATUS), diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c b/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c index 6aa08281d094..b518202955ca 100644 --- a/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c @@ -95,7 +95,7 @@ static const struct amdgpu_hwip_reg_entry vcn_reg_list_2_5[] = { static void vcn_v2_5_set_dec_ring_funcs(struct amdgpu_device *adev); static void vcn_v2_5_set_enc_ring_funcs(struct amdgpu_device *adev); static void vcn_v2_5_set_irq_funcs(struct amdgpu_device *adev); -static int vcn_v2_5_set_powergating_state(void *handle, +static int vcn_v2_5_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state); static int vcn_v2_5_pause_dpg_mode(struct amdgpu_device *adev, int inst_idx, struct dpg_pause_state *new_state); @@ -399,7 +399,7 @@ static int vcn_v2_5_hw_fini(struct amdgpu_ip_block *ip_block) if ((adev->pg_flags & AMD_PG_SUPPORT_VCN_DPG) || (adev->vcn.cur_state != AMD_PG_STATE_GATE && RREG32_SOC15(VCN, i, mmUVD_STATUS))) - vcn_v2_5_set_powergating_state(adev, AMD_PG_STATE_GATE); + 
vcn_v2_5_set_powergating_state(ip_block, AMD_PG_STATE_GATE); if (amdgpu_ras_is_supported(adev, AMDGPU_RAS_BLOCK__VCN)) amdgpu_irq_put(adev, &adev->vcn.inst[i].ras_poison_irq, 0); @@ -465,7 +465,7 @@ static void vcn_v2_5_mc_resume(struct amdgpu_device *adev) if (adev->vcn.harvest_config & (1 << i)) continue; - size = AMDGPU_GPU_PAGE_ALIGN(adev->vcn.fw[i]->size + 4); + size = AMDGPU_GPU_PAGE_ALIGN(adev->vcn.inst[i].fw->size + 4); /* cache window 0: fw */ if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) { WREG32_SOC15(VCN, i, mmUVD_LMI_VCPU_CACHE_64BIT_BAR_LOW, @@ -514,7 +514,7 @@ static void vcn_v2_5_mc_resume(struct amdgpu_device *adev) static void vcn_v2_5_mc_resume_dpg_mode(struct amdgpu_device *adev, int inst_idx, bool indirect) { - uint32_t size = AMDGPU_GPU_PAGE_ALIGN(adev->vcn.fw[inst_idx]->size + 4); + uint32_t size = AMDGPU_GPU_PAGE_ALIGN(adev->vcn.inst[inst_idx].fw->size + 4); uint32_t offset; /* cache window 0: fw */ @@ -1012,8 +1012,10 @@ static int vcn_v2_5_start(struct amdgpu_device *adev) uint32_t rb_bufsz, tmp; int i, j, k, r; - if (adev->pm.dpm_enabled) - amdgpu_dpm_enable_uvd(adev, true); + for (i = 0; i < adev->vcn.num_vcn_inst; ++i) { + if (adev->pm.dpm_enabled) + amdgpu_dpm_enable_vcn(adev, true, i); + } for (i = 0; i < adev->vcn.num_vcn_inst; ++i) { if (adev->vcn.harvest_config & (1 << i)) @@ -1285,7 +1287,7 @@ static int vcn_v2_5_sriov_start(struct amdgpu_device *adev) SOC15_REG_OFFSET(VCN, i, mmUVD_STATUS), ~UVD_STATUS__UVD_BUSY, UVD_STATUS__UVD_BUSY); - size = AMDGPU_GPU_PAGE_ALIGN(adev->vcn.fw[i]->size + 4); + size = AMDGPU_GPU_PAGE_ALIGN(adev->vcn.inst[i].fw->size + 4); /* mc resume*/ if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) { MMSCH_V1_0_INSERT_DIRECT_WT( @@ -1485,8 +1487,10 @@ static int vcn_v2_5_stop(struct amdgpu_device *adev) ~UVD_POWER_STATUS__UVD_POWER_STATUS_MASK); } - if (adev->pm.dpm_enabled) - amdgpu_dpm_enable_uvd(adev, false); + for (i = 0; i < adev->vcn.num_vcn_inst; ++i) { + if (adev->pm.dpm_enabled) + 
amdgpu_dpm_enable_vcn(adev, false, i); + } return 0; } @@ -1778,6 +1782,7 @@ static bool vcn_v2_5_is_idle(void *handle) for (i = 0; i < adev->vcn.num_vcn_inst; ++i) { if (adev->vcn.harvest_config & (1 << i)) continue; + ret &= (RREG32_SOC15(VCN, i, mmUVD_STATUS) == UVD_STATUS__IDLE); } @@ -1801,17 +1806,17 @@ static int vcn_v2_5_wait_for_idle(struct amdgpu_ip_block *ip_block) return ret; } -static int vcn_v2_5_set_clockgating_state(void *handle, +static int vcn_v2_5_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state) { - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; bool enable = (state == AMD_CG_STATE_GATE); if (amdgpu_sriov_vf(adev)) return 0; if (enable) { - if (!vcn_v2_5_is_idle(handle)) + if (!vcn_v2_5_is_idle(adev)) return -EBUSY; vcn_v2_5_enable_clock_gating(adev); } else { @@ -1821,10 +1826,10 @@ static int vcn_v2_5_set_clockgating_state(void *handle, return 0; } -static int vcn_v2_5_set_powergating_state(void *handle, +static int vcn_v2_5_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state) { - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; int ret; if (amdgpu_sriov_vf(adev)) diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c index 6732ad7f16f5..63ddd4cca910 100644 --- a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c @@ -105,7 +105,7 @@ static int vcn_v3_0_start_sriov(struct amdgpu_device *adev); static void vcn_v3_0_set_dec_ring_funcs(struct amdgpu_device *adev); static void vcn_v3_0_set_enc_ring_funcs(struct amdgpu_device *adev); static void vcn_v3_0_set_irq_funcs(struct amdgpu_device *adev); -static int vcn_v3_0_set_powergating_state(void *handle, +static int vcn_v3_0_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state); static int 
vcn_v3_0_pause_dpg_mode(struct amdgpu_device *adev, int inst_idx, struct dpg_pause_state *new_state); @@ -430,9 +430,9 @@ static int vcn_v3_0_hw_fini(struct amdgpu_ip_block *ip_block) if (!amdgpu_sriov_vf(adev)) { if ((adev->pg_flags & AMD_PG_SUPPORT_VCN_DPG) || - (adev->vcn.cur_state != AMD_PG_STATE_GATE && - RREG32_SOC15(VCN, i, mmUVD_STATUS))) { - vcn_v3_0_set_powergating_state(adev, AMD_PG_STATE_GATE); + (adev->vcn.cur_state != AMD_PG_STATE_GATE && + RREG32_SOC15(VCN, i, mmUVD_STATUS))) { + vcn_v3_0_set_powergating_state(ip_block, AMD_PG_STATE_GATE); } } } @@ -490,7 +490,7 @@ static int vcn_v3_0_resume(struct amdgpu_ip_block *ip_block) */ static void vcn_v3_0_mc_resume(struct amdgpu_device *adev, int inst) { - uint32_t size = AMDGPU_GPU_PAGE_ALIGN(adev->vcn.fw[inst]->size + 4); + uint32_t size = AMDGPU_GPU_PAGE_ALIGN(adev->vcn.inst[inst].fw->size + 4); uint32_t offset; /* cache window 0: fw */ @@ -540,7 +540,7 @@ static void vcn_v3_0_mc_resume(struct amdgpu_device *adev, int inst) static void vcn_v3_0_mc_resume_dpg_mode(struct amdgpu_device *adev, int inst_idx, bool indirect) { - uint32_t size = AMDGPU_GPU_PAGE_ALIGN(adev->vcn.fw[inst_idx]->size + 4); + uint32_t size = AMDGPU_GPU_PAGE_ALIGN(adev->vcn.inst[inst_idx].fw->size + 4); uint32_t offset; /* cache window 0: fw */ @@ -1141,8 +1141,10 @@ static int vcn_v3_0_start(struct amdgpu_device *adev) uint32_t rb_bufsz, tmp; int i, j, k, r; - if (adev->pm.dpm_enabled) - amdgpu_dpm_enable_uvd(adev, true); + for (i = 0; i < adev->vcn.num_vcn_inst; ++i) { + if (adev->pm.dpm_enabled) + amdgpu_dpm_enable_vcn(adev, true, i); + } for (i = 0; i < adev->vcn.num_vcn_inst; ++i) { if (adev->vcn.harvest_config & (1 << i)) @@ -1373,7 +1375,7 @@ static int vcn_v3_0_start_sriov(struct amdgpu_device *adev) mmUVD_STATUS), ~UVD_STATUS__UVD_BUSY, UVD_STATUS__UVD_BUSY); - cache_size = AMDGPU_GPU_PAGE_ALIGN(adev->vcn.fw[i]->size + 4); + cache_size = AMDGPU_GPU_PAGE_ALIGN(adev->vcn.inst[i].fw->size + 4); if (adev->firmware.load_type == 
AMDGPU_FW_LOAD_PSP) { MMSCH_V3_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCN, i, @@ -1632,8 +1634,10 @@ static int vcn_v3_0_stop(struct amdgpu_device *adev) vcn_v3_0_enable_static_power_gating(adev, i); } - if (adev->pm.dpm_enabled) - amdgpu_dpm_enable_uvd(adev, false); + for (i = 0; i < adev->vcn.num_vcn_inst; ++i) { + if (adev->pm.dpm_enabled) + amdgpu_dpm_enable_vcn(adev, false, i); + } return 0; } @@ -2132,10 +2136,10 @@ static int vcn_v3_0_wait_for_idle(struct amdgpu_ip_block *ip_block) return ret; } -static int vcn_v3_0_set_clockgating_state(void *handle, +static int vcn_v3_0_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state) { - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; bool enable = state == AMD_CG_STATE_GATE; int i; @@ -2155,10 +2159,10 @@ static int vcn_v3_0_set_clockgating_state(void *handle, return 0; } -static int vcn_v3_0_set_powergating_state(void *handle, +static int vcn_v3_0_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state) { - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; int ret; /* for SRIOV, guest should not control VCN Power-gating diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c b/drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c index fcc8511e91ee..00551d6f0370 100644 --- a/drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c @@ -96,7 +96,7 @@ static int amdgpu_ih_clientid_vcns[] = { static int vcn_v4_0_start_sriov(struct amdgpu_device *adev); static void vcn_v4_0_set_unified_ring_funcs(struct amdgpu_device *adev); static void vcn_v4_0_set_irq_funcs(struct amdgpu_device *adev); -static int vcn_v4_0_set_powergating_state(void *handle, +static int vcn_v4_0_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state); static int vcn_v4_0_pause_dpg_mode(struct amdgpu_device *adev, int inst_idx, struct 
dpg_pause_state *new_state); @@ -366,9 +366,9 @@ static int vcn_v4_0_hw_fini(struct amdgpu_ip_block *ip_block) continue; if (!amdgpu_sriov_vf(adev)) { if ((adev->pg_flags & AMD_PG_SUPPORT_VCN_DPG) || - (adev->vcn.cur_state != AMD_PG_STATE_GATE && - RREG32_SOC15(VCN, i, regUVD_STATUS))) { - vcn_v4_0_set_powergating_state(adev, AMD_PG_STATE_GATE); + (adev->vcn.cur_state != AMD_PG_STATE_GATE && + RREG32_SOC15(VCN, i, regUVD_STATUS))) { + vcn_v4_0_set_powergating_state(ip_block, AMD_PG_STATE_GATE); } } if (amdgpu_ras_is_supported(adev, AMDGPU_RAS_BLOCK__VCN)) @@ -431,7 +431,7 @@ static void vcn_v4_0_mc_resume(struct amdgpu_device *adev, int inst) uint32_t offset, size; const struct common_firmware_header *hdr; - hdr = (const struct common_firmware_header *)adev->vcn.fw[inst]->data; + hdr = (const struct common_firmware_header *)adev->vcn.inst[inst].fw->data; size = AMDGPU_GPU_PAGE_ALIGN(le32_to_cpu(hdr->ucode_size_bytes) + 8); /* cache window 0: fw */ @@ -491,7 +491,7 @@ static void vcn_v4_0_mc_resume_dpg_mode(struct amdgpu_device *adev, int inst_idx { uint32_t offset, size; const struct common_firmware_header *hdr; - hdr = (const struct common_firmware_header *)adev->vcn.fw[inst_idx]->data; + hdr = (const struct common_firmware_header *)adev->vcn.inst[inst_idx].fw->data; size = AMDGPU_GPU_PAGE_ALIGN(le32_to_cpu(hdr->ucode_size_bytes) + 8); /* cache window 0: fw */ @@ -1097,8 +1097,10 @@ static int vcn_v4_0_start(struct amdgpu_device *adev) uint32_t tmp; int i, j, k, r; - if (adev->pm.dpm_enabled) - amdgpu_dpm_enable_uvd(adev, true); + for (i = 0; i < adev->vcn.num_vcn_inst; ++i) { + if (adev->pm.dpm_enabled) + amdgpu_dpm_enable_vcn(adev, true, i); + } for (i = 0; i < adev->vcn.num_vcn_inst; ++i) { if (adev->vcn.harvest_config & (1 << i)) @@ -1341,7 +1343,7 @@ static int vcn_v4_0_start_sriov(struct amdgpu_device *adev) regUVD_STATUS), ~UVD_STATUS__UVD_BUSY, UVD_STATUS__UVD_BUSY); - cache_size = AMDGPU_GPU_PAGE_ALIGN(adev->vcn.fw[i]->size + 4); + cache_size = 
AMDGPU_GPU_PAGE_ALIGN(adev->vcn.inst[i].fw->size + 4); if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) { MMSCH_V4_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCN, i, @@ -1623,8 +1625,10 @@ static int vcn_v4_0_stop(struct amdgpu_device *adev) vcn_v4_0_enable_static_power_gating(adev, i); } - if (adev->pm.dpm_enabled) - amdgpu_dpm_enable_uvd(adev, false); + for (i = 0; i < adev->vcn.num_vcn_inst; ++i) { + if (adev->pm.dpm_enabled) + amdgpu_dpm_enable_vcn(adev, false, i); + } return 0; } @@ -2007,14 +2011,15 @@ static int vcn_v4_0_wait_for_idle(struct amdgpu_ip_block *ip_block) /** * vcn_v4_0_set_clockgating_state - set VCN block clockgating state * - * @handle: amdgpu_device pointer + * @ip_block: amdgpu_ip_block pointer * @state: clock gating state * * Set VCN block clockgating state */ -static int vcn_v4_0_set_clockgating_state(void *handle, enum amd_clockgating_state state) +static int vcn_v4_0_set_clockgating_state(struct amdgpu_ip_block *ip_block, + enum amd_clockgating_state state) { - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; bool enable = state == AMD_CG_STATE_GATE; int i; @@ -2037,14 +2042,15 @@ static int vcn_v4_0_set_clockgating_state(void *handle, enum amd_clockgating_sta /** * vcn_v4_0_set_powergating_state - set VCN block powergating state * - * @handle: amdgpu_device pointer + * @ip_block: amdgpu_ip_block pointer * @state: power gating state * * Set VCN block powergating state */ -static int vcn_v4_0_set_powergating_state(void *handle, enum amd_powergating_state state) +static int vcn_v4_0_set_powergating_state(struct amdgpu_ip_block *ip_block, + enum amd_powergating_state state) { - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; int ret; /* for SRIOV, guest should not control VCN Power-gating diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c b/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c index 3f69b9b2bcd0..ecdc027f8220 100644 --- 
a/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c @@ -87,7 +87,7 @@ static const struct amdgpu_hwip_reg_entry vcn_reg_list_4_0_3[] = { static int vcn_v4_0_3_start_sriov(struct amdgpu_device *adev); static void vcn_v4_0_3_set_unified_ring_funcs(struct amdgpu_device *adev); static void vcn_v4_0_3_set_irq_funcs(struct amdgpu_device *adev); -static int vcn_v4_0_3_set_powergating_state(void *handle, +static int vcn_v4_0_3_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state); static int vcn_v4_0_3_pause_dpg_mode(struct amdgpu_device *adev, int inst_idx, struct dpg_pause_state *new_state); @@ -349,7 +349,7 @@ static int vcn_v4_0_3_hw_fini(struct amdgpu_ip_block *ip_block) cancel_delayed_work_sync(&adev->vcn.idle_work); if (adev->vcn.cur_state != AMD_PG_STATE_GATE) - vcn_v4_0_3_set_powergating_state(adev, AMD_PG_STATE_GATE); + vcn_v4_0_3_set_powergating_state(ip_block, AMD_PG_STATE_GATE); return 0; } @@ -407,7 +407,7 @@ static void vcn_v4_0_3_mc_resume(struct amdgpu_device *adev, int inst_idx) uint32_t offset, size, vcn_inst; const struct common_firmware_header *hdr; - hdr = (const struct common_firmware_header *)adev->vcn.fw[inst_idx]->data; + hdr = (const struct common_firmware_header *)adev->vcn.inst[inst_idx].fw->data; size = AMDGPU_GPU_PAGE_ALIGN(le32_to_cpu(hdr->ucode_size_bytes) + 8); vcn_inst = GET_INST(VCN, inst_idx); @@ -482,7 +482,7 @@ static void vcn_v4_0_3_mc_resume_dpg_mode(struct amdgpu_device *adev, int inst_i uint32_t offset, size; const struct common_firmware_header *hdr; - hdr = (const struct common_firmware_header *)adev->vcn.fw[inst_idx]->data; + hdr = (const struct common_firmware_header *)adev->vcn.inst[inst_idx].fw->data; size = AMDGPU_GPU_PAGE_ALIGN(le32_to_cpu(hdr->ucode_size_bytes) + 8); /* cache window 0: fw */ @@ -957,6 +957,8 @@ static int vcn_v4_0_3_start_sriov(struct amdgpu_device *adev) for (i = 0; i < adev->vcn.num_vcn_inst; i++) { vcn_inst = GET_INST(VCN, i); + 
vcn_v4_0_3_fw_shared_init(adev, vcn_inst); + memset(&header, 0, sizeof(struct mmsch_v4_0_3_init_header)); header.version = MMSCH_VERSION; header.total_size = sizeof(struct mmsch_v4_0_3_init_header) >> 2; @@ -969,7 +971,7 @@ static int vcn_v4_0_3_start_sriov(struct amdgpu_device *adev) MMSCH_V4_0_INSERT_DIRECT_RD_MOD_WT(SOC15_REG_OFFSET(VCN, 0, regUVD_STATUS), ~UVD_STATUS__UVD_BUSY, UVD_STATUS__UVD_BUSY); - cache_size = AMDGPU_GPU_PAGE_ALIGN(adev->vcn.fw[i]->size + 4); + cache_size = AMDGPU_GPU_PAGE_ALIGN(adev->vcn.inst[i].fw->size + 4); if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) { MMSCH_V4_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCN, 0, @@ -1121,8 +1123,10 @@ static int vcn_v4_0_3_start(struct amdgpu_device *adev) int i, j, k, r, vcn_inst; uint32_t tmp; - if (adev->pm.dpm_enabled) - amdgpu_dpm_enable_uvd(adev, true); + for (i = 0; i < adev->vcn.num_vcn_inst; ++i) { + if (adev->pm.dpm_enabled) + amdgpu_dpm_enable_vcn(adev, true, i); + } for (i = 0; i < adev->vcn.num_vcn_inst; ++i) { if (adev->pg_flags & AMD_PG_SUPPORT_VCN_DPG) { @@ -1395,8 +1399,10 @@ static int vcn_v4_0_3_stop(struct amdgpu_device *adev) vcn_v4_0_3_enable_clock_gating(adev, i); } Done: - if (adev->pm.dpm_enabled) - amdgpu_dpm_enable_uvd(adev, false); + for (i = 0; i < adev->vcn.num_vcn_inst; ++i) { + if (adev->pm.dpm_enabled) + amdgpu_dpm_enable_vcn(adev, false, i); + } return 0; } @@ -1616,15 +1622,15 @@ static int vcn_v4_0_3_wait_for_idle(struct amdgpu_ip_block *ip_block) /* vcn_v4_0_3_set_clockgating_state - set VCN block clockgating state * - * @handle: amdgpu_device pointer + * @ip_block: amdgpu_ip_block pointer * @state: clock gating state * * Set VCN block clockgating state */ -static int vcn_v4_0_3_set_clockgating_state(void *handle, +static int vcn_v4_0_3_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state) { - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; bool enable = state == 
AMD_CG_STATE_GATE; int i; @@ -1644,15 +1650,15 @@ static int vcn_v4_0_3_set_clockgating_state(void *handle, /** * vcn_v4_0_3_set_powergating_state - set VCN block powergating state * - * @handle: amdgpu_device pointer + * @ip_block: amdgpu_ip_block pointer * @state: power gating state * * Set VCN block powergating state */ -static int vcn_v4_0_3_set_powergating_state(void *handle, +static int vcn_v4_0_3_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state) { - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; int ret; /* for SRIOV, guest should not control VCN Power-gating @@ -1911,9 +1917,94 @@ static const struct amdgpu_ras_block_hw_ops vcn_v4_0_3_ras_hw_ops = { .reset_ras_error_count = vcn_v4_0_3_reset_ras_error_count, }; +static int vcn_v4_0_3_aca_bank_parser(struct aca_handle *handle, struct aca_bank *bank, + enum aca_smu_type type, void *data) +{ + struct aca_bank_info info; + u64 misc0; + int ret; + + ret = aca_bank_info_decode(bank, &info); + if (ret) + return ret; + + misc0 = bank->regs[ACA_REG_IDX_MISC0]; + switch (type) { + case ACA_SMU_TYPE_UE: + ret = aca_error_cache_log_bank_error(handle, &info, ACA_ERROR_TYPE_UE, + 1ULL); + break; + case ACA_SMU_TYPE_CE: + ret = aca_error_cache_log_bank_error(handle, &info, ACA_ERROR_TYPE_CE, + ACA_REG__MISC0__ERRCNT(misc0)); + break; + default: + return -EINVAL; + } + + return ret; +} + +/* reference to smu driver if header file */ +static int vcn_v4_0_3_err_codes[] = { + 14, 15, /* VCN */ +}; + +static bool vcn_v4_0_3_aca_bank_is_valid(struct aca_handle *handle, struct aca_bank *bank, + enum aca_smu_type type, void *data) +{ + u32 instlo; + + instlo = ACA_REG__IPID__INSTANCEIDLO(bank->regs[ACA_REG_IDX_IPID]); + instlo &= GENMASK(31, 1); + + if (instlo != mmSMNAID_AID0_MCA_SMU) + return false; + + if (aca_bank_check_error_codes(handle->adev, bank, + vcn_v4_0_3_err_codes, + ARRAY_SIZE(vcn_v4_0_3_err_codes))) + return false; 
+ + return true; +} + +static const struct aca_bank_ops vcn_v4_0_3_aca_bank_ops = { + .aca_bank_parser = vcn_v4_0_3_aca_bank_parser, + .aca_bank_is_valid = vcn_v4_0_3_aca_bank_is_valid, +}; + +static const struct aca_info vcn_v4_0_3_aca_info = { + .hwip = ACA_HWIP_TYPE_SMU, + .mask = ACA_ERROR_UE_MASK, + .bank_ops = &vcn_v4_0_3_aca_bank_ops, +}; + +static int vcn_v4_0_3_ras_late_init(struct amdgpu_device *adev, struct ras_common_if *ras_block) +{ + int r; + + r = amdgpu_ras_block_late_init(adev, ras_block); + if (r) + return r; + + r = amdgpu_ras_bind_aca(adev, AMDGPU_RAS_BLOCK__VCN, + &vcn_v4_0_3_aca_info, NULL); + if (r) + goto late_fini; + + return 0; + +late_fini: + amdgpu_ras_block_late_fini(adev, ras_block); + + return r; +} + static struct amdgpu_vcn_ras vcn_v4_0_3_ras = { .ras_block = { .hw_ops = &vcn_v4_0_3_ras_hw_ops, + .ras_late_init = vcn_v4_0_3_ras_late_init, }, }; diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_5.c b/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_5.c index 71961fb3f7ff..23d3c16c9d9f 100644 --- a/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_5.c +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_5.c @@ -95,7 +95,7 @@ static int amdgpu_ih_clientid_vcns[] = { static void vcn_v4_0_5_set_unified_ring_funcs(struct amdgpu_device *adev); static void vcn_v4_0_5_set_irq_funcs(struct amdgpu_device *adev); -static int vcn_v4_0_5_set_powergating_state(void *handle, +static int vcn_v4_0_5_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state); static int vcn_v4_0_5_pause_dpg_mode(struct amdgpu_device *adev, int inst_idx, struct dpg_pause_state *new_state); @@ -309,7 +309,7 @@ static int vcn_v4_0_5_hw_fini(struct amdgpu_ip_block *ip_block) if ((adev->pg_flags & AMD_PG_SUPPORT_VCN_DPG) || (adev->vcn.cur_state != AMD_PG_STATE_GATE && RREG32_SOC15(VCN, i, regUVD_STATUS))) { - vcn_v4_0_5_set_powergating_state(adev, AMD_PG_STATE_GATE); + vcn_v4_0_5_set_powergating_state(ip_block, AMD_PG_STATE_GATE); } } } @@ -370,7 +370,7 @@ static void 
vcn_v4_0_5_mc_resume(struct amdgpu_device *adev, int inst) uint32_t offset, size; const struct common_firmware_header *hdr; - hdr = (const struct common_firmware_header *)adev->vcn.fw[inst]->data; + hdr = (const struct common_firmware_header *)adev->vcn.inst[inst].fw->data; size = AMDGPU_GPU_PAGE_ALIGN(le32_to_cpu(hdr->ucode_size_bytes) + 8); /* cache window 0: fw */ @@ -431,7 +431,7 @@ static void vcn_v4_0_5_mc_resume_dpg_mode(struct amdgpu_device *adev, int inst_i uint32_t offset, size; const struct common_firmware_header *hdr; - hdr = (const struct common_firmware_header *)adev->vcn.fw[inst_idx]->data; + hdr = (const struct common_firmware_header *)adev->vcn.inst[inst_idx].fw->data; size = AMDGPU_GPU_PAGE_ALIGN(le32_to_cpu(hdr->ucode_size_bytes) + 8); /* cache window 0: fw */ @@ -1000,8 +1000,10 @@ static int vcn_v4_0_5_start(struct amdgpu_device *adev) uint32_t tmp; int i, j, k, r; - if (adev->pm.dpm_enabled) - amdgpu_dpm_enable_uvd(adev, true); + for (i = 0; i < adev->vcn.num_vcn_inst; ++i) { + if (adev->pm.dpm_enabled) + amdgpu_dpm_enable_vcn(adev, true, i); + } for (i = 0; i < adev->vcn.num_vcn_inst; ++i) { if (adev->vcn.harvest_config & (1 << i)) @@ -1277,8 +1279,10 @@ static int vcn_v4_0_5_stop(struct amdgpu_device *adev) vcn_v4_0_5_enable_static_power_gating(adev, i); } - if (adev->pm.dpm_enabled) - amdgpu_dpm_enable_uvd(adev, false); + for (i = 0; i < adev->vcn.num_vcn_inst; ++i) { + if (adev->pm.dpm_enabled) + amdgpu_dpm_enable_vcn(adev, false, i); + } return 0; } @@ -1492,14 +1496,15 @@ static int vcn_v4_0_5_wait_for_idle(struct amdgpu_ip_block *ip_block) /** * vcn_v4_0_5_set_clockgating_state - set VCN block clockgating state * - * @handle: amdgpu_device pointer + * @ip_block: amdgpu_ip_block pointer * @state: clock gating state * * Set VCN block clockgating state */ -static int vcn_v4_0_5_set_clockgating_state(void *handle, enum amd_clockgating_state state) +static int vcn_v4_0_5_set_clockgating_state(struct amdgpu_ip_block *ip_block, + enum 
amd_clockgating_state state) { - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; bool enable = (state == AMD_CG_STATE_GATE) ? true : false; int i; @@ -1522,14 +1527,15 @@ static int vcn_v4_0_5_set_clockgating_state(void *handle, enum amd_clockgating_s /** * vcn_v4_0_5_set_powergating_state - set VCN block powergating state * - * @handle: amdgpu_device pointer + * @ip_block: amdgpu_ip_block pointer * @state: power gating state * * Set VCN block powergating state */ -static int vcn_v4_0_5_set_powergating_state(void *handle, enum amd_powergating_state state) +static int vcn_v4_0_5_set_powergating_state(struct amdgpu_ip_block *ip_block, + enum amd_powergating_state state) { - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; int ret; if (state == adev->vcn.cur_state) diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v5_0_0.c b/drivers/gpu/drm/amd/amdgpu/vcn_v5_0_0.c index bd3d2bbdc16b..b6d78381ebfb 100644 --- a/drivers/gpu/drm/amd/amdgpu/vcn_v5_0_0.c +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v5_0_0.c @@ -32,7 +32,7 @@ #include "vcn/vcn_5_0_0_offset.h" #include "vcn/vcn_5_0_0_sh_mask.h" -#include "ivsrcid/vcn/irqsrcs_vcn_4_0.h" +#include "ivsrcid/vcn/irqsrcs_vcn_5_0.h" #include "vcn_v5_0_0.h" #include @@ -78,7 +78,7 @@ static int amdgpu_ih_clientid_vcns[] = { static void vcn_v5_0_0_set_unified_ring_funcs(struct amdgpu_device *adev); static void vcn_v5_0_0_set_irq_funcs(struct amdgpu_device *adev); -static int vcn_v5_0_0_set_powergating_state(void *handle, +static int vcn_v5_0_0_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state); static int vcn_v5_0_0_pause_dpg_mode(struct amdgpu_device *adev, int inst_idx, struct dpg_pause_state *new_state); @@ -105,6 +105,21 @@ static int vcn_v5_0_0_early_init(struct amdgpu_ip_block *ip_block) return amdgpu_vcn_early_init(adev); } +void vcn_v5_0_0_alloc_ip_dump(struct amdgpu_device *adev) +{ 
+ uint32_t reg_count = ARRAY_SIZE(vcn_reg_list_5_0); + uint32_t *ptr; + + /* Allocate memory for VCN IP Dump buffer */ + ptr = kcalloc(adev->vcn.num_vcn_inst * reg_count, sizeof(uint32_t), GFP_KERNEL); + if (!ptr) { + DRM_ERROR("Failed to allocate memory for VCN IP Dump\n"); + adev->vcn.ip_dump = NULL; + } else { + adev->vcn.ip_dump = ptr; + } +} + /** * vcn_v5_0_0_sw_init - sw init for VCN block * @@ -117,8 +132,6 @@ static int vcn_v5_0_0_sw_init(struct amdgpu_ip_block *ip_block) struct amdgpu_ring *ring; struct amdgpu_device *adev = ip_block->adev; int i, r; - uint32_t reg_count = ARRAY_SIZE(vcn_reg_list_5_0); - uint32_t *ptr; r = amdgpu_vcn_sw_init(adev); if (r) @@ -140,13 +153,13 @@ static int vcn_v5_0_0_sw_init(struct amdgpu_ip_block *ip_block) /* VCN UNIFIED TRAP */ r = amdgpu_irq_add_id(adev, amdgpu_ih_clientid_vcns[i], - VCN_4_0__SRCID__UVD_ENC_GENERAL_PURPOSE, &adev->vcn.inst[i].irq); + VCN_5_0__SRCID__UVD_ENC_GENERAL_PURPOSE, &adev->vcn.inst[i].irq); if (r) return r; /* VCN POISON TRAP */ r = amdgpu_irq_add_id(adev, amdgpu_ih_clientid_vcns[i], - VCN_4_0__SRCID_UVD_POISON, &adev->vcn.inst[i].irq); + VCN_5_0__SRCID_UVD_POISON, &adev->vcn.inst[i].irq); if (r) return r; @@ -177,14 +190,7 @@ static int vcn_v5_0_0_sw_init(struct amdgpu_ip_block *ip_block) if (adev->pg_flags & AMD_PG_SUPPORT_VCN_DPG) adev->vcn.pause_dpg_mode = vcn_v5_0_0_pause_dpg_mode; - /* Allocate memory for VCN IP Dump buffer */ - ptr = kcalloc(adev->vcn.num_vcn_inst * reg_count, sizeof(uint32_t), GFP_KERNEL); - if (!ptr) { - DRM_ERROR("Failed to allocate memory for VCN IP Dump\n"); - adev->vcn.ip_dump = NULL; - } else { - adev->vcn.ip_dump = ptr; - } + vcn_v5_0_0_alloc_ip_dump(adev); r = amdgpu_vcn_sysfs_reset_mask_init(adev); if (r) @@ -283,7 +289,7 @@ static int vcn_v5_0_0_hw_fini(struct amdgpu_ip_block *ip_block) if ((adev->pg_flags & AMD_PG_SUPPORT_VCN_DPG) || (adev->vcn.cur_state != AMD_PG_STATE_GATE && RREG32_SOC15(VCN, i, regUVD_STATUS))) { - vcn_v5_0_0_set_powergating_state(adev, 
AMD_PG_STATE_GATE); + vcn_v5_0_0_set_powergating_state(ip_block, AMD_PG_STATE_GATE); } } } @@ -344,7 +350,7 @@ static void vcn_v5_0_0_mc_resume(struct amdgpu_device *adev, int inst) uint32_t offset, size; const struct common_firmware_header *hdr; - hdr = (const struct common_firmware_header *)adev->vcn.fw[inst]->data; + hdr = (const struct common_firmware_header *)adev->vcn.inst[inst].fw->data; size = AMDGPU_GPU_PAGE_ALIGN(le32_to_cpu(hdr->ucode_size_bytes) + 8); /* cache window 0: fw */ @@ -405,7 +411,7 @@ static void vcn_v5_0_0_mc_resume_dpg_mode(struct amdgpu_device *adev, int inst_i uint32_t offset, size; const struct common_firmware_header *hdr; - hdr = (const struct common_firmware_header *)adev->vcn.fw[inst_idx]->data; + hdr = (const struct common_firmware_header *)adev->vcn.inst[inst_idx].fw->data; size = AMDGPU_GPU_PAGE_ALIGN(le32_to_cpu(hdr->ucode_size_bytes) + 8); /* cache window 0: fw */ @@ -771,8 +777,10 @@ static int vcn_v5_0_0_start(struct amdgpu_device *adev) uint32_t tmp; int i, j, k, r; - if (adev->pm.dpm_enabled) - amdgpu_dpm_enable_uvd(adev, true); + for (i = 0; i < adev->vcn.num_vcn_inst; ++i) { + if (adev->pm.dpm_enabled) + amdgpu_dpm_enable_vcn(adev, true, i); + } for (i = 0; i < adev->vcn.num_vcn_inst; ++i) { if (adev->vcn.harvest_config & (1 << i)) @@ -1018,8 +1026,10 @@ static int vcn_v5_0_0_stop(struct amdgpu_device *adev) vcn_v5_0_0_enable_static_power_gating(adev, i); } - if (adev->pm.dpm_enabled) - amdgpu_dpm_enable_uvd(adev, false); + for (i = 0; i < adev->vcn.num_vcn_inst; ++i) { + if (adev->pm.dpm_enabled) + amdgpu_dpm_enable_vcn(adev, false, i); + } return 0; } @@ -1229,14 +1239,15 @@ static int vcn_v5_0_0_wait_for_idle(struct amdgpu_ip_block *ip_block) /** * vcn_v5_0_0_set_clockgating_state - set VCN block clockgating state * - * @handle: amdgpu_device pointer + * @ip_block: amdgpu_ip_block pointer * @state: clock gating state * * Set VCN block clockgating state */ -static int vcn_v5_0_0_set_clockgating_state(void *handle, enum 
amd_clockgating_state state) +static int vcn_v5_0_0_set_clockgating_state(struct amdgpu_ip_block *ip_block, + enum amd_clockgating_state state) { - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; bool enable = (state == AMD_CG_STATE_GATE) ? true : false; int i; @@ -1259,14 +1270,15 @@ static int vcn_v5_0_0_set_clockgating_state(void *handle, enum amd_clockgating_s /** * vcn_v5_0_0_set_powergating_state - set VCN block powergating state * - * @handle: amdgpu_device pointer + * @ip_block: amdgpu_ip_block pointer * @state: power gating state * * Set VCN block powergating state */ -static int vcn_v5_0_0_set_powergating_state(void *handle, enum amd_powergating_state state) +static int vcn_v5_0_0_set_powergating_state(struct amdgpu_ip_block *ip_block, + enum amd_powergating_state state) { - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; int ret; if (state == adev->vcn.cur_state) @@ -1312,10 +1324,10 @@ static int vcn_v5_0_0_process_interrupt(struct amdgpu_device *adev, struct amdgp DRM_DEBUG("IH: VCN TRAP\n"); switch (entry->src_id) { - case VCN_4_0__SRCID__UVD_ENC_GENERAL_PURPOSE: + case VCN_5_0__SRCID__UVD_ENC_GENERAL_PURPOSE: amdgpu_fence_process(&adev->vcn.inst[ip_instance].ring_enc[0]); break; - case VCN_4_0__SRCID_UVD_POISON: + case VCN_5_0__SRCID_UVD_POISON: amdgpu_vcn_process_poison_irq(adev, source, entry); break; default: @@ -1351,7 +1363,8 @@ static void vcn_v5_0_0_set_irq_funcs(struct amdgpu_device *adev) } } -static void vcn_v5_0_print_ip_state(struct amdgpu_ip_block *ip_block, struct drm_printer *p) +void vcn_v5_0_0_print_ip_state(struct amdgpu_ip_block *ip_block, + struct drm_printer *p) { struct amdgpu_device *adev = ip_block->adev; int i, j; @@ -1383,7 +1396,7 @@ static void vcn_v5_0_print_ip_state(struct amdgpu_ip_block *ip_block, struct drm } } -static void vcn_v5_0_dump_ip_state(struct amdgpu_ip_block *ip_block) +void 
vcn_v5_0_0_dump_ip_state(struct amdgpu_ip_block *ip_block) { struct amdgpu_device *adev = ip_block->adev; int i, j; @@ -1424,8 +1437,8 @@ static const struct amd_ip_funcs vcn_v5_0_0_ip_funcs = { .wait_for_idle = vcn_v5_0_0_wait_for_idle, .set_clockgating_state = vcn_v5_0_0_set_clockgating_state, .set_powergating_state = vcn_v5_0_0_set_powergating_state, - .dump_ip_state = vcn_v5_0_dump_ip_state, - .print_ip_state = vcn_v5_0_print_ip_state, + .dump_ip_state = vcn_v5_0_0_dump_ip_state, + .print_ip_state = vcn_v5_0_0_print_ip_state, }; const struct amdgpu_ip_block_version vcn_v5_0_0_ip_block = { diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v5_0_0.h b/drivers/gpu/drm/amd/amdgpu/vcn_v5_0_0.h index 51bbccd4360f..b8927652bc50 100644 --- a/drivers/gpu/drm/amd/amdgpu/vcn_v5_0_0.h +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v5_0_0.h @@ -32,6 +32,11 @@ #define VCN_VID_IP_ADDRESS 0x0 #define VCN_AON_IP_ADDRESS 0x30000 +void vcn_v5_0_0_alloc_ip_dump(struct amdgpu_device *adev); +void vcn_v5_0_0_print_ip_state(struct amdgpu_ip_block *ip_block, + struct drm_printer *p); +void vcn_v5_0_0_dump_ip_state(struct amdgpu_ip_block *ip_block); + extern const struct amdgpu_ip_block_version vcn_v5_0_0_ip_block; #endif /* __VCN_V5_0_0_H__ */ diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v5_0_1.c b/drivers/gpu/drm/amd/amdgpu/vcn_v5_0_1.c new file mode 100644 index 000000000000..8b463c977d08 --- /dev/null +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v5_0_1.c @@ -0,0 +1,1118 @@ +/* + * Copyright 2024 Advanced Micro Devices, Inc. All rights reserved. 
+ * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * OTHER DEALINGS IN THE SOFTWARE. + * + */ + +#include +#include "amdgpu.h" +#include "amdgpu_vcn.h" +#include "amdgpu_pm.h" +#include "soc15.h" +#include "soc15d.h" +#include "soc15_hw_ip.h" +#include "vcn_v2_0.h" + +#include "vcn/vcn_5_0_0_offset.h" +#include "vcn/vcn_5_0_0_sh_mask.h" +#include "ivsrcid/vcn/irqsrcs_vcn_5_0.h" +#include "vcn_v5_0_0.h" +#include "vcn_v5_0_1.h" + +#include + +static void vcn_v5_0_1_set_unified_ring_funcs(struct amdgpu_device *adev); +static void vcn_v5_0_1_set_irq_funcs(struct amdgpu_device *adev); +static int vcn_v5_0_1_set_powergating_state(struct amdgpu_ip_block *ip_block, + enum amd_powergating_state state); +static void vcn_v5_0_1_unified_ring_set_wptr(struct amdgpu_ring *ring); + +/** + * vcn_v5_0_1_early_init - set function pointers and load microcode + * + * @ip_block: Pointer to the amdgpu_ip_block for this hw instance. 
+ * + * Set ring and irq function pointers + * Load microcode from filesystem + */ +static int vcn_v5_0_1_early_init(struct amdgpu_ip_block *ip_block) +{ + struct amdgpu_device *adev = ip_block->adev; + + /* re-use enc ring as unified ring */ + adev->vcn.num_enc_rings = 1; + + vcn_v5_0_1_set_unified_ring_funcs(adev); + vcn_v5_0_1_set_irq_funcs(adev); + + return amdgpu_vcn_early_init(adev); +} + +/** + * vcn_v5_0_1_sw_init - sw init for VCN block + * + * @ip_block: Pointer to the amdgpu_ip_block for this hw instance. + * + * Load firmware and sw initialization + */ +static int vcn_v5_0_1_sw_init(struct amdgpu_ip_block *ip_block) +{ + struct amdgpu_device *adev = ip_block->adev; + struct amdgpu_ring *ring; + int i, r, vcn_inst; + + r = amdgpu_vcn_sw_init(adev); + if (r) + return r; + + amdgpu_vcn_setup_ucode(adev); + + r = amdgpu_vcn_resume(adev); + if (r) + return r; + + /* VCN UNIFIED TRAP */ + r = amdgpu_irq_add_id(adev, SOC15_IH_CLIENTID_VCN, + VCN_5_0__SRCID__UVD_ENC_GENERAL_PURPOSE, &adev->vcn.inst->irq); + if (r) + return r; + + for (i = 0; i < adev->vcn.num_vcn_inst; i++) { + volatile struct amdgpu_vcn5_fw_shared *fw_shared; + + vcn_inst = GET_INST(VCN, i); + + ring = &adev->vcn.inst[i].ring_enc[0]; + ring->use_doorbell = true; + ring->doorbell_index = (adev->doorbell_index.vcn.vcn_ring0_1 << 1) + 9 * vcn_inst; + + ring->vm_hub = AMDGPU_MMHUB0(adev->vcn.inst[i].aid_id); + sprintf(ring->name, "vcn_unified_%d", adev->vcn.inst[i].aid_id); + + r = amdgpu_ring_init(adev, ring, 512, &adev->vcn.inst[i].irq, 0, + AMDGPU_RING_PRIO_DEFAULT, &adev->vcn.inst[i].sched_score); + if (r) + return r; + + fw_shared = adev->vcn.inst[i].fw_shared.cpu_addr; + fw_shared->present_flag_0 = cpu_to_le32(AMDGPU_FW_SHARED_FLAG_0_UNIFIED_QUEUE); + fw_shared->sq.is_enabled = true; + + if (amdgpu_vcnfw_log) + amdgpu_vcn_fwlog_init(&adev->vcn.inst[i]); + } + + /* TODO: Add queue reset mask when FW fully supports it */ + adev->vcn.supported_reset = + 
amdgpu_get_soft_full_reset_mask(&adev->vcn.inst[0].ring_enc[0]); + + vcn_v5_0_0_alloc_ip_dump(adev); + + return amdgpu_vcn_sysfs_reset_mask_init(adev); +} + +/** + * vcn_v5_0_1_sw_fini - sw fini for VCN block + * + * @ip_block: Pointer to the amdgpu_ip_block for this hw instance. + * + * VCN suspend and free up sw allocation + */ +static int vcn_v5_0_1_sw_fini(struct amdgpu_ip_block *ip_block) +{ + struct amdgpu_device *adev = ip_block->adev; + int i, r, idx; + + if (drm_dev_enter(adev_to_drm(adev), &idx)) { + for (i = 0; i < adev->vcn.num_vcn_inst; i++) { + volatile struct amdgpu_vcn4_fw_shared *fw_shared; + + fw_shared = adev->vcn.inst[i].fw_shared.cpu_addr; + fw_shared->present_flag_0 = 0; + fw_shared->sq.is_enabled = 0; + } + + drm_dev_exit(idx); + } + + r = amdgpu_vcn_suspend(adev); + if (r) + return r; + + r = amdgpu_vcn_sw_fini(adev); + + amdgpu_vcn_sysfs_reset_mask_fini(adev); + + kfree(adev->vcn.ip_dump); + + return r; +} + +/** + * vcn_v5_0_1_hw_init - start and test VCN block + * + * @ip_block: Pointer to the amdgpu_ip_block for this hw instance. + * + * Initialize the hardware, boot up the VCPU and do some testing + */ +static int vcn_v5_0_1_hw_init(struct amdgpu_ip_block *ip_block) +{ + struct amdgpu_device *adev = ip_block->adev; + struct amdgpu_ring *ring; + int i, r, vcn_inst; + + for (i = 0; i < adev->vcn.num_vcn_inst; ++i) { + vcn_inst = GET_INST(VCN, i); + ring = &adev->vcn.inst[i].ring_enc[0]; + + if (ring->use_doorbell) + adev->nbio.funcs->vcn_doorbell_range(adev, ring->use_doorbell, + ((adev->doorbell_index.vcn.vcn_ring0_1 << 1) + + 9 * vcn_inst), + adev->vcn.inst[i].aid_id); + + r = amdgpu_ring_test_helper(ring); + if (r) + return r; + } + + return 0; +} + +/** + * vcn_v5_0_1_hw_fini - stop the hardware block + * + * @ip_block: Pointer to the amdgpu_ip_block for this hw instance. 
+ * + * Stop the VCN block, mark ring as not ready any more + */ +static int vcn_v5_0_1_hw_fini(struct amdgpu_ip_block *ip_block) +{ + struct amdgpu_device *adev = ip_block->adev; + + cancel_delayed_work_sync(&adev->vcn.idle_work); + + return 0; +} + +/** + * vcn_v5_0_1_suspend - suspend VCN block + * + * @ip_block: Pointer to the amdgpu_ip_block for this hw instance. + * + * HW fini and suspend VCN block + */ +static int vcn_v5_0_1_suspend(struct amdgpu_ip_block *ip_block) +{ + struct amdgpu_device *adev = ip_block->adev; + int r; + + r = vcn_v5_0_1_hw_fini(ip_block); + if (r) + return r; + + r = amdgpu_vcn_suspend(adev); + + return r; +} + +/** + * vcn_v5_0_1_resume - resume VCN block + * + * @ip_block: Pointer to the amdgpu_ip_block for this hw instance. + * + * Resume firmware and hw init VCN block + */ +static int vcn_v5_0_1_resume(struct amdgpu_ip_block *ip_block) +{ + struct amdgpu_device *adev = ip_block->adev; + int r; + + r = amdgpu_vcn_resume(adev); + if (r) + return r; + + r = vcn_v5_0_1_hw_init(ip_block); + + return r; +} + +/** + * vcn_v5_0_1_mc_resume - memory controller programming + * + * @adev: amdgpu_device pointer + * @inst: instance number + * + * Let the VCN memory controller know it's offsets + */ +static void vcn_v5_0_1_mc_resume(struct amdgpu_device *adev, int inst) +{ + uint32_t offset, size, vcn_inst; + const struct common_firmware_header *hdr; + + hdr = (const struct common_firmware_header *)adev->vcn.inst[inst].fw->data; + size = AMDGPU_GPU_PAGE_ALIGN(le32_to_cpu(hdr->ucode_size_bytes) + 8); + + vcn_inst = GET_INST(VCN, inst); + /* cache window 0: fw */ + if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) { + WREG32_SOC15(VCN, vcn_inst, regUVD_LMI_VCPU_CACHE_64BIT_BAR_LOW, + (adev->firmware.ucode[AMDGPU_UCODE_ID_VCN + inst].tmr_mc_addr_lo)); + WREG32_SOC15(VCN, vcn_inst, regUVD_LMI_VCPU_CACHE_64BIT_BAR_HIGH, + (adev->firmware.ucode[AMDGPU_UCODE_ID_VCN + inst].tmr_mc_addr_hi)); + WREG32_SOC15(VCN, vcn_inst, regUVD_VCPU_CACHE_OFFSET0, 
0); + offset = 0; + } else { + WREG32_SOC15(VCN, vcn_inst, regUVD_LMI_VCPU_CACHE_64BIT_BAR_LOW, + lower_32_bits(adev->vcn.inst[inst].gpu_addr)); + WREG32_SOC15(VCN, vcn_inst, regUVD_LMI_VCPU_CACHE_64BIT_BAR_HIGH, + upper_32_bits(adev->vcn.inst[inst].gpu_addr)); + offset = size; + WREG32_SOC15(VCN, vcn_inst, regUVD_VCPU_CACHE_OFFSET0, + AMDGPU_UVD_FIRMWARE_OFFSET >> 3); + } + WREG32_SOC15(VCN, vcn_inst, regUVD_VCPU_CACHE_SIZE0, size); + + /* cache window 1: stack */ + WREG32_SOC15(VCN, vcn_inst, regUVD_LMI_VCPU_CACHE1_64BIT_BAR_LOW, + lower_32_bits(adev->vcn.inst[inst].gpu_addr + offset)); + WREG32_SOC15(VCN, vcn_inst, regUVD_LMI_VCPU_CACHE1_64BIT_BAR_HIGH, + upper_32_bits(adev->vcn.inst[inst].gpu_addr + offset)); + WREG32_SOC15(VCN, vcn_inst, regUVD_VCPU_CACHE_OFFSET1, 0); + WREG32_SOC15(VCN, vcn_inst, regUVD_VCPU_CACHE_SIZE1, AMDGPU_VCN_STACK_SIZE); + + /* cache window 2: context */ + WREG32_SOC15(VCN, vcn_inst, regUVD_LMI_VCPU_CACHE2_64BIT_BAR_LOW, + lower_32_bits(adev->vcn.inst[inst].gpu_addr + offset + AMDGPU_VCN_STACK_SIZE)); + WREG32_SOC15(VCN, vcn_inst, regUVD_LMI_VCPU_CACHE2_64BIT_BAR_HIGH, + upper_32_bits(adev->vcn.inst[inst].gpu_addr + offset + AMDGPU_VCN_STACK_SIZE)); + WREG32_SOC15(VCN, vcn_inst, regUVD_VCPU_CACHE_OFFSET2, 0); + WREG32_SOC15(VCN, vcn_inst, regUVD_VCPU_CACHE_SIZE2, AMDGPU_VCN_CONTEXT_SIZE); + + /* non-cache window */ + WREG32_SOC15(VCN, vcn_inst, regUVD_LMI_VCPU_NC0_64BIT_BAR_LOW, + lower_32_bits(adev->vcn.inst[inst].fw_shared.gpu_addr)); + WREG32_SOC15(VCN, vcn_inst, regUVD_LMI_VCPU_NC0_64BIT_BAR_HIGH, + upper_32_bits(adev->vcn.inst[inst].fw_shared.gpu_addr)); + WREG32_SOC15(VCN, vcn_inst, regUVD_VCPU_NONCACHE_OFFSET0, 0); + WREG32_SOC15(VCN, vcn_inst, regUVD_VCPU_NONCACHE_SIZE0, + AMDGPU_GPU_PAGE_ALIGN(sizeof(struct amdgpu_vcn4_fw_shared))); +} + +/** + * vcn_v5_0_1_mc_resume_dpg_mode - memory controller programming for dpg mode + * + * @adev: amdgpu_device pointer + * @inst_idx: instance number index + * @indirect: indirectly write 
sram + * + * Let the VCN memory controller know it's offsets with dpg mode + */ +static void vcn_v5_0_1_mc_resume_dpg_mode(struct amdgpu_device *adev, int inst_idx, bool indirect) +{ + uint32_t offset, size; + const struct common_firmware_header *hdr; + + hdr = (const struct common_firmware_header *)adev->vcn.inst[inst_idx].fw->data; + size = AMDGPU_GPU_PAGE_ALIGN(le32_to_cpu(hdr->ucode_size_bytes) + 8); + + /* cache window 0: fw */ + if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) { + if (!indirect) { + WREG32_SOC24_DPG_MODE(inst_idx, SOC24_DPG_MODE_OFFSET( + VCN, 0, regUVD_LMI_VCPU_CACHE_64BIT_BAR_LOW), + (adev->firmware.ucode[AMDGPU_UCODE_ID_VCN + + inst_idx].tmr_mc_addr_lo), 0, indirect); + WREG32_SOC24_DPG_MODE(inst_idx, SOC24_DPG_MODE_OFFSET( + VCN, 0, regUVD_LMI_VCPU_CACHE_64BIT_BAR_HIGH), + (adev->firmware.ucode[AMDGPU_UCODE_ID_VCN + + inst_idx].tmr_mc_addr_hi), 0, indirect); + WREG32_SOC24_DPG_MODE(inst_idx, SOC24_DPG_MODE_OFFSET( + VCN, 0, regUVD_VCPU_CACHE_OFFSET0), 0, 0, indirect); + } else { + WREG32_SOC24_DPG_MODE(inst_idx, SOC24_DPG_MODE_OFFSET( + VCN, 0, regUVD_LMI_VCPU_CACHE_64BIT_BAR_LOW), 0, 0, indirect); + WREG32_SOC24_DPG_MODE(inst_idx, SOC24_DPG_MODE_OFFSET( + VCN, 0, regUVD_LMI_VCPU_CACHE_64BIT_BAR_HIGH), 0, 0, indirect); + WREG32_SOC24_DPG_MODE(inst_idx, SOC24_DPG_MODE_OFFSET( + VCN, 0, regUVD_VCPU_CACHE_OFFSET0), 0, 0, indirect); + } + offset = 0; + } else { + WREG32_SOC24_DPG_MODE(inst_idx, SOC24_DPG_MODE_OFFSET( + VCN, 0, regUVD_LMI_VCPU_CACHE_64BIT_BAR_LOW), + lower_32_bits(adev->vcn.inst[inst_idx].gpu_addr), 0, indirect); + WREG32_SOC24_DPG_MODE(inst_idx, SOC24_DPG_MODE_OFFSET( + VCN, 0, regUVD_LMI_VCPU_CACHE_64BIT_BAR_HIGH), + upper_32_bits(adev->vcn.inst[inst_idx].gpu_addr), 0, indirect); + offset = size; + WREG32_SOC24_DPG_MODE(inst_idx, SOC24_DPG_MODE_OFFSET( + VCN, 0, regUVD_VCPU_CACHE_OFFSET0), + AMDGPU_UVD_FIRMWARE_OFFSET >> 3, 0, indirect); + } + + if (!indirect) + WREG32_SOC24_DPG_MODE(inst_idx, SOC24_DPG_MODE_OFFSET( + 
VCN, 0, regUVD_VCPU_CACHE_SIZE0), size, 0, indirect); + else + WREG32_SOC24_DPG_MODE(inst_idx, SOC24_DPG_MODE_OFFSET( + VCN, 0, regUVD_VCPU_CACHE_SIZE0), 0, 0, indirect); + + /* cache window 1: stack */ + if (!indirect) { + WREG32_SOC24_DPG_MODE(inst_idx, SOC24_DPG_MODE_OFFSET( + VCN, 0, regUVD_LMI_VCPU_CACHE1_64BIT_BAR_LOW), + lower_32_bits(adev->vcn.inst[inst_idx].gpu_addr + offset), 0, indirect); + WREG32_SOC24_DPG_MODE(inst_idx, SOC24_DPG_MODE_OFFSET( + VCN, 0, regUVD_LMI_VCPU_CACHE1_64BIT_BAR_HIGH), + upper_32_bits(adev->vcn.inst[inst_idx].gpu_addr + offset), 0, indirect); + WREG32_SOC24_DPG_MODE(inst_idx, SOC24_DPG_MODE_OFFSET( + VCN, 0, regUVD_VCPU_CACHE_OFFSET1), 0, 0, indirect); + } else { + WREG32_SOC24_DPG_MODE(inst_idx, SOC24_DPG_MODE_OFFSET( + VCN, 0, regUVD_LMI_VCPU_CACHE1_64BIT_BAR_LOW), 0, 0, indirect); + WREG32_SOC24_DPG_MODE(inst_idx, SOC24_DPG_MODE_OFFSET( + VCN, 0, regUVD_LMI_VCPU_CACHE1_64BIT_BAR_HIGH), 0, 0, indirect); + WREG32_SOC24_DPG_MODE(inst_idx, SOC24_DPG_MODE_OFFSET( + VCN, 0, regUVD_VCPU_CACHE_OFFSET1), 0, 0, indirect); + } + WREG32_SOC24_DPG_MODE(inst_idx, SOC24_DPG_MODE_OFFSET( + VCN, 0, regUVD_VCPU_CACHE_SIZE1), AMDGPU_VCN_STACK_SIZE, 0, indirect); + + /* cache window 2: context */ + WREG32_SOC24_DPG_MODE(inst_idx, SOC24_DPG_MODE_OFFSET( + VCN, 0, regUVD_LMI_VCPU_CACHE2_64BIT_BAR_LOW), + lower_32_bits(adev->vcn.inst[inst_idx].gpu_addr + offset + + AMDGPU_VCN_STACK_SIZE), 0, indirect); + WREG32_SOC24_DPG_MODE(inst_idx, SOC24_DPG_MODE_OFFSET( + VCN, 0, regUVD_LMI_VCPU_CACHE2_64BIT_BAR_HIGH), + upper_32_bits(adev->vcn.inst[inst_idx].gpu_addr + offset + + AMDGPU_VCN_STACK_SIZE), 0, indirect); + WREG32_SOC24_DPG_MODE(inst_idx, SOC24_DPG_MODE_OFFSET( + VCN, 0, regUVD_VCPU_CACHE_OFFSET2), 0, 0, indirect); + WREG32_SOC24_DPG_MODE(inst_idx, SOC24_DPG_MODE_OFFSET( + VCN, 0, regUVD_VCPU_CACHE_SIZE2), AMDGPU_VCN_CONTEXT_SIZE, 0, indirect); + + /* non-cache window */ + WREG32_SOC24_DPG_MODE(inst_idx, SOC24_DPG_MODE_OFFSET( + VCN, 0, 
regUVD_LMI_VCPU_NC0_64BIT_BAR_LOW), + lower_32_bits(adev->vcn.inst[inst_idx].fw_shared.gpu_addr), 0, indirect); + WREG32_SOC24_DPG_MODE(inst_idx, SOC24_DPG_MODE_OFFSET( + VCN, 0, regUVD_LMI_VCPU_NC0_64BIT_BAR_HIGH), + upper_32_bits(adev->vcn.inst[inst_idx].fw_shared.gpu_addr), 0, indirect); + WREG32_SOC24_DPG_MODE(inst_idx, SOC24_DPG_MODE_OFFSET( + VCN, 0, regUVD_VCPU_NONCACHE_OFFSET0), 0, 0, indirect); + WREG32_SOC24_DPG_MODE(inst_idx, SOC24_DPG_MODE_OFFSET( + VCN, 0, regUVD_VCPU_NONCACHE_SIZE0), + AMDGPU_GPU_PAGE_ALIGN(sizeof(struct amdgpu_vcn4_fw_shared)), 0, indirect); + + /* VCN global tiling registers */ + WREG32_SOC24_DPG_MODE(inst_idx, SOC24_DPG_MODE_OFFSET( + VCN, 0, regUVD_GFX10_ADDR_CONFIG), adev->gfx.config.gb_addr_config, 0, indirect); +} + +/** + * vcn_v5_0_1_disable_clock_gating - disable VCN clock gating + * + * @adev: amdgpu_device pointer + * @inst: instance number + * + * Disable clock gating for VCN block + */ +static void vcn_v5_0_1_disable_clock_gating(struct amdgpu_device *adev, int inst) +{ +} + +/** + * vcn_v5_0_1_enable_clock_gating - enable VCN clock gating + * + * @adev: amdgpu_device pointer + * @inst: instance number + * + * Enable clock gating for VCN block + */ +static void vcn_v5_0_1_enable_clock_gating(struct amdgpu_device *adev, int inst) +{ +} + +/** + * vcn_v5_0_1_start_dpg_mode - VCN start with dpg mode + * + * @adev: amdgpu_device pointer + * @inst_idx: instance number index + * @indirect: indirectly write sram + * + * Start VCN block with dpg mode + */ +static int vcn_v5_0_1_start_dpg_mode(struct amdgpu_device *adev, int inst_idx, bool indirect) +{ + volatile struct amdgpu_vcn4_fw_shared *fw_shared = + adev->vcn.inst[inst_idx].fw_shared.cpu_addr; + struct amdgpu_ring *ring; + int vcn_inst; + uint32_t tmp; + + vcn_inst = GET_INST(VCN, inst_idx); + + /* disable register anti-hang mechanism */ + WREG32_P(SOC15_REG_OFFSET(VCN, vcn_inst, regUVD_POWER_STATUS), 1, + ~UVD_POWER_STATUS__UVD_POWER_STATUS_MASK); + + /* enable dynamic 
power gating mode */ + tmp = RREG32_SOC15(VCN, vcn_inst, regUVD_POWER_STATUS); + tmp |= UVD_POWER_STATUS__UVD_PG_MODE_MASK; + WREG32_SOC15(VCN, vcn_inst, regUVD_POWER_STATUS, tmp); + + if (indirect) { + adev->vcn.inst[inst_idx].dpg_sram_curr_addr = + (uint32_t *)adev->vcn.inst[inst_idx].dpg_sram_cpu_addr; + /* Use dummy register 0xDEADBEEF passing AID selection to PSP FW */ + WREG32_SOC24_DPG_MODE(inst_idx, 0xDEADBEEF, + adev->vcn.inst[inst_idx].aid_id, 0, true); + } + + /* enable VCPU clock */ + tmp = (0xFF << UVD_VCPU_CNTL__PRB_TIMEOUT_VAL__SHIFT); + tmp |= UVD_VCPU_CNTL__CLK_EN_MASK | UVD_VCPU_CNTL__BLK_RST_MASK; + WREG32_SOC24_DPG_MODE(inst_idx, SOC24_DPG_MODE_OFFSET( + VCN, 0, regUVD_VCPU_CNTL), tmp, 0, indirect); + + /* disable master interrupt */ + WREG32_SOC24_DPG_MODE(inst_idx, SOC24_DPG_MODE_OFFSET( + VCN, 0, regUVD_MASTINT_EN), 0, 0, indirect); + + /* setup regUVD_LMI_CTRL */ + tmp = (UVD_LMI_CTRL__WRITE_CLEAN_TIMER_EN_MASK | + UVD_LMI_CTRL__REQ_MODE_MASK | + UVD_LMI_CTRL__CRC_RESET_MASK | + UVD_LMI_CTRL__MASK_MC_URGENT_MASK | + UVD_LMI_CTRL__DATA_COHERENCY_EN_MASK | + UVD_LMI_CTRL__VCPU_DATA_COHERENCY_EN_MASK | + (8 << UVD_LMI_CTRL__WRITE_CLEAN_TIMER__SHIFT) | + 0x00100000L); + WREG32_SOC24_DPG_MODE(inst_idx, SOC24_DPG_MODE_OFFSET( + VCN, 0, regUVD_LMI_CTRL), tmp, 0, indirect); + + vcn_v5_0_1_mc_resume_dpg_mode(adev, inst_idx, indirect); + + tmp = (0xFF << UVD_VCPU_CNTL__PRB_TIMEOUT_VAL__SHIFT); + tmp |= UVD_VCPU_CNTL__CLK_EN_MASK; + WREG32_SOC24_DPG_MODE(inst_idx, SOC24_DPG_MODE_OFFSET( + VCN, 0, regUVD_VCPU_CNTL), tmp, 0, indirect); + + /* enable LMI MC and UMC channels */ + tmp = 0x1f << UVD_LMI_CTRL2__RE_OFLD_MIF_WR_REQ_NUM__SHIFT; + WREG32_SOC24_DPG_MODE(inst_idx, SOC24_DPG_MODE_OFFSET( + VCN, 0, regUVD_LMI_CTRL2), tmp, 0, indirect); + + /* enable master interrupt */ + WREG32_SOC24_DPG_MODE(inst_idx, SOC24_DPG_MODE_OFFSET( + VCN, 0, regUVD_MASTINT_EN), + UVD_MASTINT_EN__VCPU_EN_MASK, 0, indirect); + + if (indirect) + 
amdgpu_vcn_psp_update_sram(adev, inst_idx, AMDGPU_UCODE_ID_VCN0_RAM); + + ring = &adev->vcn.inst[inst_idx].ring_enc[0]; + + WREG32_SOC15(VCN, vcn_inst, regUVD_RB_BASE_LO, lower_32_bits(ring->gpu_addr)); + WREG32_SOC15(VCN, vcn_inst, regUVD_RB_BASE_HI, upper_32_bits(ring->gpu_addr)); + WREG32_SOC15(VCN, vcn_inst, regUVD_RB_SIZE, ring->ring_size / sizeof(uint32_t)); + + tmp = RREG32_SOC15(VCN, vcn_inst, regVCN_RB_ENABLE); + tmp &= ~(VCN_RB_ENABLE__RB1_EN_MASK); + WREG32_SOC15(VCN, vcn_inst, regVCN_RB_ENABLE, tmp); + fw_shared->sq.queue_mode |= FW_QUEUE_RING_RESET; + WREG32_SOC15(VCN, vcn_inst, regUVD_RB_RPTR, 0); + WREG32_SOC15(VCN, vcn_inst, regUVD_RB_WPTR, 0); + + tmp = RREG32_SOC15(VCN, vcn_inst, regUVD_RB_RPTR); + WREG32_SOC15(VCN, vcn_inst, regUVD_RB_WPTR, tmp); + ring->wptr = RREG32_SOC15(VCN, vcn_inst, regUVD_RB_WPTR); + + tmp = RREG32_SOC15(VCN, vcn_inst, regVCN_RB_ENABLE); + tmp |= VCN_RB_ENABLE__RB1_EN_MASK; + WREG32_SOC15(VCN, vcn_inst, regVCN_RB_ENABLE, tmp); + fw_shared->sq.queue_mode &= ~(FW_QUEUE_RING_RESET | FW_QUEUE_DPG_HOLD_OFF); + + WREG32_SOC15(VCN, vcn_inst, regVCN_RB1_DB_CTRL, + ring->doorbell_index << VCN_RB1_DB_CTRL__OFFSET__SHIFT | + VCN_RB1_DB_CTRL__EN_MASK); + /* Read DB_CTRL to flush the write DB_CTRL command. 
*/ + RREG32_SOC15(VCN, vcn_inst, regVCN_RB1_DB_CTRL); + + return 0; +} + +/** + * vcn_v5_0_1_start - VCN start + * + * @adev: amdgpu_device pointer + * + * Start VCN block + */ +static int vcn_v5_0_1_start(struct amdgpu_device *adev) +{ + volatile struct amdgpu_vcn4_fw_shared *fw_shared; + struct amdgpu_ring *ring; + uint32_t tmp; + int i, j, k, r, vcn_inst; + + if (adev->pm.dpm_enabled) + amdgpu_dpm_enable_uvd(adev, true); + + for (i = 0; i < adev->vcn.num_vcn_inst; ++i) { + fw_shared = adev->vcn.inst[i].fw_shared.cpu_addr; + + if (adev->pg_flags & AMD_PG_SUPPORT_VCN_DPG) { + r = vcn_v5_0_1_start_dpg_mode(adev, i, adev->vcn.indirect_sram); + continue; + } + + vcn_inst = GET_INST(VCN, i); + + /* set VCN status busy */ + tmp = RREG32_SOC15(VCN, vcn_inst, regUVD_STATUS) | UVD_STATUS__UVD_BUSY; + WREG32_SOC15(VCN, vcn_inst, regUVD_STATUS, tmp); + + /* enable VCPU clock */ + WREG32_P(SOC15_REG_OFFSET(VCN, vcn_inst, regUVD_VCPU_CNTL), + UVD_VCPU_CNTL__CLK_EN_MASK, ~UVD_VCPU_CNTL__CLK_EN_MASK); + + /* disable master interrupt */ + WREG32_P(SOC15_REG_OFFSET(VCN, vcn_inst, regUVD_MASTINT_EN), 0, + ~UVD_MASTINT_EN__VCPU_EN_MASK); + + /* enable LMI MC and UMC channels */ + WREG32_P(SOC15_REG_OFFSET(VCN, vcn_inst, regUVD_LMI_CTRL2), 0, + ~UVD_LMI_CTRL2__STALL_ARB_UMC_MASK); + + tmp = RREG32_SOC15(VCN, vcn_inst, regUVD_SOFT_RESET); + tmp &= ~UVD_SOFT_RESET__LMI_SOFT_RESET_MASK; + tmp &= ~UVD_SOFT_RESET__LMI_UMC_SOFT_RESET_MASK; + WREG32_SOC15(VCN, vcn_inst, regUVD_SOFT_RESET, tmp); + + /* setup regUVD_LMI_CTRL */ + tmp = RREG32_SOC15(VCN, vcn_inst, regUVD_LMI_CTRL); + WREG32_SOC15(VCN, vcn_inst, regUVD_LMI_CTRL, tmp | + UVD_LMI_CTRL__WRITE_CLEAN_TIMER_EN_MASK | + UVD_LMI_CTRL__MASK_MC_URGENT_MASK | + UVD_LMI_CTRL__DATA_COHERENCY_EN_MASK | + UVD_LMI_CTRL__VCPU_DATA_COHERENCY_EN_MASK); + + vcn_v5_0_1_mc_resume(adev, i); + + /* VCN global tiling registers */ + WREG32_SOC15(VCN, vcn_inst, regUVD_GFX10_ADDR_CONFIG, + adev->gfx.config.gb_addr_config); + + /* unblock VCPU register 
access */ + WREG32_P(SOC15_REG_OFFSET(VCN, vcn_inst, regUVD_RB_ARB_CTRL), 0, + ~UVD_RB_ARB_CTRL__VCPU_DIS_MASK); + + /* release VCPU reset to boot */ + WREG32_P(SOC15_REG_OFFSET(VCN, vcn_inst, regUVD_VCPU_CNTL), 0, + ~UVD_VCPU_CNTL__BLK_RST_MASK); + + for (j = 0; j < 10; ++j) { + uint32_t status; + + for (k = 0; k < 100; ++k) { + status = RREG32_SOC15(VCN, vcn_inst, regUVD_STATUS); + if (status & 2) + break; + mdelay(100); + if (amdgpu_emu_mode == 1) + msleep(20); + } + + if (amdgpu_emu_mode == 1) { + r = -1; + if (status & 2) { + r = 0; + break; + } + } else { + r = 0; + if (status & 2) + break; + + dev_err(adev->dev, + "VCN[%d] is not responding, trying to reset the VCPU!!!\n", i); + WREG32_P(SOC15_REG_OFFSET(VCN, vcn_inst, regUVD_VCPU_CNTL), + UVD_VCPU_CNTL__BLK_RST_MASK, + ~UVD_VCPU_CNTL__BLK_RST_MASK); + mdelay(10); + WREG32_P(SOC15_REG_OFFSET(VCN, vcn_inst, regUVD_VCPU_CNTL), 0, + ~UVD_VCPU_CNTL__BLK_RST_MASK); + + mdelay(10); + r = -1; + } + } + + if (r) { + dev_err(adev->dev, "VCN[%d] is not responding, giving up!!!\n", i); + return r; + } + + /* enable master interrupt */ + WREG32_P(SOC15_REG_OFFSET(VCN, vcn_inst, regUVD_MASTINT_EN), + UVD_MASTINT_EN__VCPU_EN_MASK, + ~UVD_MASTINT_EN__VCPU_EN_MASK); + + /* clear the busy bit of VCN_STATUS */ + WREG32_P(SOC15_REG_OFFSET(VCN, vcn_inst, regUVD_STATUS), 0, + ~(2 << UVD_STATUS__VCPU_REPORT__SHIFT)); + + ring = &adev->vcn.inst[i].ring_enc[0]; + + WREG32_SOC15(VCN, vcn_inst, regVCN_RB1_DB_CTRL, + ring->doorbell_index << VCN_RB1_DB_CTRL__OFFSET__SHIFT | + VCN_RB1_DB_CTRL__EN_MASK); + + /* Read DB_CTRL to flush the write DB_CTRL command. 
*/ + RREG32_SOC15(VCN, vcn_inst, regVCN_RB1_DB_CTRL); + + WREG32_SOC15(VCN, vcn_inst, regUVD_RB_BASE_LO, ring->gpu_addr); + WREG32_SOC15(VCN, vcn_inst, regUVD_RB_BASE_HI, upper_32_bits(ring->gpu_addr)); + WREG32_SOC15(VCN, vcn_inst, regUVD_RB_SIZE, ring->ring_size / 4); + + tmp = RREG32_SOC15(VCN, vcn_inst, regVCN_RB_ENABLE); + tmp &= ~(VCN_RB_ENABLE__RB1_EN_MASK); + WREG32_SOC15(VCN, vcn_inst, regVCN_RB_ENABLE, tmp); + fw_shared->sq.queue_mode |= FW_QUEUE_RING_RESET; + WREG32_SOC15(VCN, vcn_inst, regUVD_RB_RPTR, 0); + WREG32_SOC15(VCN, vcn_inst, regUVD_RB_WPTR, 0); + + tmp = RREG32_SOC15(VCN, vcn_inst, regUVD_RB_RPTR); + WREG32_SOC15(VCN, vcn_inst, regUVD_RB_WPTR, tmp); + ring->wptr = RREG32_SOC15(VCN, vcn_inst, regUVD_RB_WPTR); + + tmp = RREG32_SOC15(VCN, vcn_inst, regVCN_RB_ENABLE); + tmp |= VCN_RB_ENABLE__RB1_EN_MASK; + WREG32_SOC15(VCN, vcn_inst, regVCN_RB_ENABLE, tmp); + fw_shared->sq.queue_mode &= ~(FW_QUEUE_RING_RESET | FW_QUEUE_DPG_HOLD_OFF); + } + + return 0; +} + +/** + * vcn_v5_0_1_stop_dpg_mode - VCN stop with dpg mode + * + * @adev: amdgpu_device pointer + * @inst_idx: instance number index + * + * Stop VCN block with dpg mode + */ +static void vcn_v5_0_1_stop_dpg_mode(struct amdgpu_device *adev, int inst_idx) +{ + uint32_t tmp; + int vcn_inst; + + vcn_inst = GET_INST(VCN, inst_idx); + + /* Wait for power status to be 1 */ + SOC15_WAIT_ON_RREG(VCN, vcn_inst, regUVD_POWER_STATUS, 1, + UVD_POWER_STATUS__UVD_POWER_STATUS_MASK); + + /* wait for read ptr to be equal to write ptr */ + tmp = RREG32_SOC15(VCN, vcn_inst, regUVD_RB_WPTR); + SOC15_WAIT_ON_RREG(VCN, vcn_inst, regUVD_RB_RPTR, tmp, 0xFFFFFFFF); + + /* disable dynamic power gating mode */ + WREG32_P(SOC15_REG_OFFSET(VCN, vcn_inst, regUVD_POWER_STATUS), 0, + ~UVD_POWER_STATUS__UVD_PG_MODE_MASK); +} + +/** + * vcn_v5_0_1_stop - VCN stop + * + * @adev: amdgpu_device pointer + * + * Stop VCN block + */ +static int vcn_v5_0_1_stop(struct amdgpu_device *adev) +{ + volatile struct amdgpu_vcn4_fw_shared 
*fw_shared; + uint32_t tmp; + int i, r = 0, vcn_inst; + + for (i = 0; i < adev->vcn.num_vcn_inst; ++i) { + vcn_inst = GET_INST(VCN, i); + + fw_shared = adev->vcn.inst[i].fw_shared.cpu_addr; + fw_shared->sq.queue_mode |= FW_QUEUE_DPG_HOLD_OFF; + + if (adev->pg_flags & AMD_PG_SUPPORT_VCN_DPG) { + vcn_v5_0_1_stop_dpg_mode(adev, i); + continue; + } + + /* wait for vcn idle */ + r = SOC15_WAIT_ON_RREG(VCN, vcn_inst, regUVD_STATUS, UVD_STATUS__IDLE, 0x7); + if (r) + return r; + + tmp = UVD_LMI_STATUS__VCPU_LMI_WRITE_CLEAN_MASK | + UVD_LMI_STATUS__READ_CLEAN_MASK | + UVD_LMI_STATUS__WRITE_CLEAN_MASK | + UVD_LMI_STATUS__WRITE_CLEAN_RAW_MASK; + r = SOC15_WAIT_ON_RREG(VCN, vcn_inst, regUVD_LMI_STATUS, tmp, tmp); + if (r) + return r; + + /* disable LMI UMC channel */ + tmp = RREG32_SOC15(VCN, vcn_inst, regUVD_LMI_CTRL2); + tmp |= UVD_LMI_CTRL2__STALL_ARB_UMC_MASK; + WREG32_SOC15(VCN, vcn_inst, regUVD_LMI_CTRL2, tmp); + tmp = UVD_LMI_STATUS__UMC_READ_CLEAN_RAW_MASK | + UVD_LMI_STATUS__UMC_WRITE_CLEAN_RAW_MASK; + r = SOC15_WAIT_ON_RREG(VCN, vcn_inst, regUVD_LMI_STATUS, tmp, tmp); + if (r) + return r; + + /* block VCPU register access */ + WREG32_P(SOC15_REG_OFFSET(VCN, vcn_inst, regUVD_RB_ARB_CTRL), + UVD_RB_ARB_CTRL__VCPU_DIS_MASK, + ~UVD_RB_ARB_CTRL__VCPU_DIS_MASK); + + /* reset VCPU */ + WREG32_P(SOC15_REG_OFFSET(VCN, vcn_inst, regUVD_VCPU_CNTL), + UVD_VCPU_CNTL__BLK_RST_MASK, + ~UVD_VCPU_CNTL__BLK_RST_MASK); + + /* disable VCPU clock */ + WREG32_P(SOC15_REG_OFFSET(VCN, vcn_inst, regUVD_VCPU_CNTL), 0, + ~(UVD_VCPU_CNTL__CLK_EN_MASK)); + + /* apply soft reset */ + tmp = RREG32_SOC15(VCN, vcn_inst, regUVD_SOFT_RESET); + tmp |= UVD_SOFT_RESET__LMI_UMC_SOFT_RESET_MASK; + WREG32_SOC15(VCN, vcn_inst, regUVD_SOFT_RESET, tmp); + tmp = RREG32_SOC15(VCN, vcn_inst, regUVD_SOFT_RESET); + tmp |= UVD_SOFT_RESET__LMI_SOFT_RESET_MASK; + WREG32_SOC15(VCN, vcn_inst, regUVD_SOFT_RESET, tmp); + + /* clear status */ + WREG32_SOC15(VCN, vcn_inst, regUVD_STATUS, 0); + } + + if 
(adev->pm.dpm_enabled) + amdgpu_dpm_enable_uvd(adev, false); + + return 0; +} + +/** + * vcn_v5_0_1_unified_ring_get_rptr - get unified read pointer + * + * @ring: amdgpu_ring pointer + * + * Returns the current hardware unified read pointer + */ +static uint64_t vcn_v5_0_1_unified_ring_get_rptr(struct amdgpu_ring *ring) +{ + struct amdgpu_device *adev = ring->adev; + + if (ring != &adev->vcn.inst[ring->me].ring_enc[0]) + DRM_ERROR("wrong ring id is identified in %s", __func__); + + return RREG32_SOC15(VCN, GET_INST(VCN, ring->me), regUVD_RB_RPTR); +} + +/** + * vcn_v5_0_1_unified_ring_get_wptr - get unified write pointer + * + * @ring: amdgpu_ring pointer + * + * Returns the current hardware unified write pointer + */ +static uint64_t vcn_v5_0_1_unified_ring_get_wptr(struct amdgpu_ring *ring) +{ + struct amdgpu_device *adev = ring->adev; + + if (ring != &adev->vcn.inst[ring->me].ring_enc[0]) + DRM_ERROR("wrong ring id is identified in %s", __func__); + + if (ring->use_doorbell) + return *ring->wptr_cpu_addr; + else + return RREG32_SOC15(VCN, GET_INST(VCN, ring->me), regUVD_RB_WPTR); +} + +/** + * vcn_v5_0_1_unified_ring_set_wptr - set enc write pointer + * + * @ring: amdgpu_ring pointer + * + * Commits the enc write pointer to the hardware + */ +static void vcn_v5_0_1_unified_ring_set_wptr(struct amdgpu_ring *ring) +{ + struct amdgpu_device *adev = ring->adev; + + if (ring != &adev->vcn.inst[ring->me].ring_enc[0]) + DRM_ERROR("wrong ring id is identified in %s", __func__); + + if (ring->use_doorbell) { + *ring->wptr_cpu_addr = lower_32_bits(ring->wptr); + WDOORBELL32(ring->doorbell_index, lower_32_bits(ring->wptr)); + } else { + WREG32_SOC15(VCN, GET_INST(VCN, ring->me), regUVD_RB_WPTR, + lower_32_bits(ring->wptr)); + } +} + +static const struct amdgpu_ring_funcs vcn_v5_0_1_unified_ring_vm_funcs = { + .type = AMDGPU_RING_TYPE_VCN_ENC, + .align_mask = 0x3f, + .nop = VCN_ENC_CMD_NO_OP, + .get_rptr = vcn_v5_0_1_unified_ring_get_rptr, + .get_wptr = 
vcn_v5_0_1_unified_ring_get_wptr, + .set_wptr = vcn_v5_0_1_unified_ring_set_wptr, + .emit_frame_size = + SOC15_FLUSH_GPU_TLB_NUM_WREG * 3 + + SOC15_FLUSH_GPU_TLB_NUM_REG_WAIT * 4 + + 4 + /* vcn_v2_0_enc_ring_emit_vm_flush */ + 5 + 5 + /* vcn_v2_0_enc_ring_emit_fence x2 vm fence */ + 1, /* vcn_v2_0_enc_ring_insert_end */ + .emit_ib_size = 5, /* vcn_v2_0_enc_ring_emit_ib */ + .emit_ib = vcn_v2_0_enc_ring_emit_ib, + .emit_fence = vcn_v2_0_enc_ring_emit_fence, + .emit_vm_flush = vcn_v2_0_enc_ring_emit_vm_flush, + .test_ring = amdgpu_vcn_enc_ring_test_ring, + .test_ib = amdgpu_vcn_unified_ring_test_ib, + .insert_nop = amdgpu_ring_insert_nop, + .insert_end = vcn_v2_0_enc_ring_insert_end, + .pad_ib = amdgpu_ring_generic_pad_ib, + .begin_use = amdgpu_vcn_ring_begin_use, + .end_use = amdgpu_vcn_ring_end_use, + .emit_wreg = vcn_v2_0_enc_ring_emit_wreg, + .emit_reg_wait = vcn_v2_0_enc_ring_emit_reg_wait, + .emit_reg_write_reg_wait = amdgpu_ring_emit_reg_write_reg_wait_helper, +}; + +/** + * vcn_v5_0_1_set_unified_ring_funcs - set unified ring functions + * + * @adev: amdgpu_device pointer + * + * Set unified ring functions + */ +static void vcn_v5_0_1_set_unified_ring_funcs(struct amdgpu_device *adev) +{ + int i, vcn_inst; + + for (i = 0; i < adev->vcn.num_vcn_inst; ++i) { + adev->vcn.inst[i].ring_enc[0].funcs = &vcn_v5_0_1_unified_ring_vm_funcs; + adev->vcn.inst[i].ring_enc[0].me = i; + vcn_inst = GET_INST(VCN, i); + adev->vcn.inst[i].aid_id = vcn_inst / adev->vcn.num_inst_per_aid; + } +} + +/** + * vcn_v5_0_1_is_idle - check VCN block is idle + * + * @handle: amdgpu_device pointer + * + * Check whether VCN block is idle + */ +static bool vcn_v5_0_1_is_idle(void *handle) +{ + struct amdgpu_device *adev = (struct amdgpu_device *)handle; + int i, ret = 1; + + for (i = 0; i < adev->vcn.num_vcn_inst; ++i) + ret &= (RREG32_SOC15(VCN, GET_INST(VCN, i), regUVD_STATUS) == UVD_STATUS__IDLE); + + return ret; +} + +/** + * vcn_v5_0_1_wait_for_idle - wait for VCN block idle + * + * 
@ip_block: Pointer to the amdgpu_ip_block for this hw instance. + * + * Wait for VCN block idle + */ +static int vcn_v5_0_1_wait_for_idle(struct amdgpu_ip_block *ip_block) +{ + struct amdgpu_device *adev = ip_block->adev; + int i, ret = 0; + + for (i = 0; i < adev->vcn.num_vcn_inst; ++i) { + ret = SOC15_WAIT_ON_RREG(VCN, GET_INST(VCN, i), regUVD_STATUS, UVD_STATUS__IDLE, + UVD_STATUS__IDLE); + if (ret) + return ret; + } + + return ret; +} + +/** + * vcn_v5_0_1_set_clockgating_state - set VCN block clockgating state + * + * @ip_block: Pointer to the amdgpu_ip_block for this hw instance. + * @state: clock gating state + * + * Set VCN block clockgating state + */ +static int vcn_v5_0_1_set_clockgating_state(struct amdgpu_ip_block *ip_block, + enum amd_clockgating_state state) +{ + struct amdgpu_device *adev = ip_block->adev; + bool enable = state == AMD_CG_STATE_GATE; + int i; + + for (i = 0; i < adev->vcn.num_vcn_inst; ++i) { + if (enable) { + if (RREG32_SOC15(VCN, GET_INST(VCN, i), regUVD_STATUS) != UVD_STATUS__IDLE) + return -EBUSY; + vcn_v5_0_1_enable_clock_gating(adev, i); + } else { + vcn_v5_0_1_disable_clock_gating(adev, i); + } + } + + return 0; +} + +/** + * vcn_v5_0_1_set_powergating_state - set VCN block powergating state + * + * @ip_block: Pointer to the amdgpu_ip_block for this hw instance. 
+ * @state: power gating state + * + * Set VCN block powergating state + */ +static int vcn_v5_0_1_set_powergating_state(struct amdgpu_ip_block *ip_block, + enum amd_powergating_state state) +{ + struct amdgpu_device *adev = ip_block->adev; + int ret; + + if (state == adev->vcn.cur_state) + return 0; + + if (state == AMD_PG_STATE_GATE) + ret = vcn_v5_0_1_stop(adev); + else + ret = vcn_v5_0_1_start(adev); + + if (!ret) + adev->vcn.cur_state = state; + + return ret; +} + +/** + * vcn_v5_0_1_process_interrupt - process VCN block interrupt + * + * @adev: amdgpu_device pointer + * @source: interrupt sources + * @entry: interrupt entry from clients and sources + * + * Process VCN block interrupt + */ +static int vcn_v5_0_1_process_interrupt(struct amdgpu_device *adev, struct amdgpu_irq_src *source, + struct amdgpu_iv_entry *entry) +{ + uint32_t i, inst; + + i = node_id_to_phys_map[entry->node_id]; + + DRM_DEV_DEBUG(adev->dev, "IH: VCN TRAP\n"); + + for (inst = 0; inst < adev->vcn.num_vcn_inst; ++inst) + if (adev->vcn.inst[inst].aid_id == i) + break; + if (inst >= adev->vcn.num_vcn_inst) { + dev_WARN_ONCE(adev->dev, 1, + "Interrupt received for unknown VCN instance %d", + entry->node_id); + return 0; + } + + switch (entry->src_id) { + case VCN_5_0__SRCID__UVD_ENC_GENERAL_PURPOSE: + amdgpu_fence_process(&adev->vcn.inst[inst].ring_enc[0]); + break; + default: + DRM_DEV_ERROR(adev->dev, "Unhandled interrupt: %d %d\n", + entry->src_id, entry->src_data[0]); + break; + } + + return 0; +} + +static const struct amdgpu_irq_src_funcs vcn_v5_0_1_irq_funcs = { + .process = vcn_v5_0_1_process_interrupt, +}; + +/** + * vcn_v5_0_1_set_irq_funcs - set VCN block interrupt irq functions + * + * @adev: amdgpu_device pointer + * + * Set VCN block interrupt irq functions + */ +static void vcn_v5_0_1_set_irq_funcs(struct amdgpu_device *adev) +{ + int i; + + for (i = 0; i < adev->vcn.num_vcn_inst; ++i) + adev->vcn.inst->irq.num_types++; + adev->vcn.inst->irq.funcs = &vcn_v5_0_1_irq_funcs; +} + 
+static const struct amd_ip_funcs vcn_v5_0_1_ip_funcs = { + .name = "vcn_v5_0_1", + .early_init = vcn_v5_0_1_early_init, + .late_init = NULL, + .sw_init = vcn_v5_0_1_sw_init, + .sw_fini = vcn_v5_0_1_sw_fini, + .hw_init = vcn_v5_0_1_hw_init, + .hw_fini = vcn_v5_0_1_hw_fini, + .suspend = vcn_v5_0_1_suspend, + .resume = vcn_v5_0_1_resume, + .is_idle = vcn_v5_0_1_is_idle, + .wait_for_idle = vcn_v5_0_1_wait_for_idle, + .check_soft_reset = NULL, + .pre_soft_reset = NULL, + .soft_reset = NULL, + .post_soft_reset = NULL, + .set_clockgating_state = vcn_v5_0_1_set_clockgating_state, + .set_powergating_state = vcn_v5_0_1_set_powergating_state, + .dump_ip_state = vcn_v5_0_0_dump_ip_state, + .print_ip_state = vcn_v5_0_0_print_ip_state, +}; + +const struct amdgpu_ip_block_version vcn_v5_0_1_ip_block = { + .type = AMD_IP_BLOCK_TYPE_VCN, + .major = 5, + .minor = 0, + .rev = 1, + .funcs = &vcn_v5_0_1_ip_funcs, +}; diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v5_0_1.h b/drivers/gpu/drm/amd/amdgpu/vcn_v5_0_1.h new file mode 100644 index 000000000000..82ac709f44bf --- /dev/null +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v5_0_1.h @@ -0,0 +1,29 @@ +/* + * Copyright 2024 Advanced Micro Devices, Inc. All rights reserved. + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. 
+ * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * OTHER DEALINGS IN THE SOFTWARE. + * + */ + +#ifndef __VCN_v5_0_1_H__ +#define __VCN_v5_0_1_H__ + +extern const struct amdgpu_ip_block_version vcn_v5_0_1_ip_block; + +#endif /* __VCN_v5_0_1_H__ */ diff --git a/drivers/gpu/drm/amd/amdgpu/vega10_ih.c b/drivers/gpu/drm/amd/amdgpu/vega10_ih.c index 0fedadd0a6a4..98fc6941159e 100644 --- a/drivers/gpu/drm/amd/amdgpu/vega10_ih.c +++ b/drivers/gpu/drm/amd/amdgpu/vega10_ih.c @@ -364,9 +364,8 @@ static u32 vega10_ih_get_wptr(struct amdgpu_device *adev, * this should allow us to catchup. */ tmp = (wptr + 32) & ih->ptr_mask; - dev_warn(adev->dev, "IH ring buffer overflow " - "(0x%08X, 0x%08X, 0x%08X)\n", - wptr, ih->rptr, tmp); + dev_warn_ratelimited(adev->dev, "%s ring buffer overflow (0x%08X, 0x%08X, 0x%08X)\n", + amdgpu_ih_ring_name(adev, ih), wptr, ih->rptr, tmp); ih->rptr = tmp; tmp = RREG32_NO_KIQ(ih_regs->ih_rb_cntl); @@ -605,10 +604,10 @@ static void vega10_ih_update_clockgating_state(struct amdgpu_device *adev, } } -static int vega10_ih_set_clockgating_state(void *handle, +static int vega10_ih_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state) { - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; vega10_ih_update_clockgating_state(adev, state == AMD_CG_STATE_GATE); @@ -616,7 +615,7 @@ static int vega10_ih_set_clockgating_state(void *handle, } -static int vega10_ih_set_powergating_state(void *handle, +static int vega10_ih_set_powergating_state(struct amdgpu_ip_block 
*ip_block, enum amd_powergating_state state) { return 0; diff --git a/drivers/gpu/drm/amd/amdgpu/vega20_ih.c b/drivers/gpu/drm/amd/amdgpu/vega20_ih.c index 1c9aff742e43..e9e3b2ed4b7b 100644 --- a/drivers/gpu/drm/amd/amdgpu/vega20_ih.c +++ b/drivers/gpu/drm/amd/amdgpu/vega20_ih.c @@ -366,6 +366,7 @@ static int vega20_ih_irq_init(struct amdgpu_device *adev) /* Enable IH Retry CAM */ if (amdgpu_ip_version(adev, OSSSYS_HWIP, 0) == IP_VERSION(4, 4, 0) || amdgpu_ip_version(adev, OSSSYS_HWIP, 0) == IP_VERSION(4, 4, 2) || + amdgpu_ip_version(adev, OSSSYS_HWIP, 0) == IP_VERSION(4, 4, 4) || amdgpu_ip_version(adev, OSSSYS_HWIP, 0) == IP_VERSION(4, 4, 5)) WREG32_FIELD15(OSSSYS, 0, IH_RETRY_INT_CAM_CNTL_ALDEBARAN, ENABLE, 1); @@ -443,9 +444,8 @@ static u32 vega20_ih_get_wptr(struct amdgpu_device *adev, * this should allow us to catchup. */ tmp = (wptr + 32) & ih->ptr_mask; - dev_warn(adev->dev, "IH ring buffer overflow " - "(0x%08X, 0x%08X, 0x%08X)\n", - wptr, ih->rptr, tmp); + dev_warn_ratelimited(adev->dev, "%s ring buffer overflow (0x%08X, 0x%08X, 0x%08X)\n", + amdgpu_ih_ring_name(adev, ih), wptr, ih->rptr, tmp); ih->rptr = tmp; tmp = RREG32_NO_KIQ(ih_regs->ih_rb_cntl); @@ -697,10 +697,10 @@ static void vega20_ih_update_clockgating_state(struct amdgpu_device *adev, } } -static int vega20_ih_set_clockgating_state(void *handle, +static int vega20_ih_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state) { - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; vega20_ih_update_clockgating_state(adev, state == AMD_CG_STATE_GATE); @@ -708,7 +708,7 @@ static int vega20_ih_set_clockgating_state(void *handle, } -static int vega20_ih_set_powergating_state(void *handle, +static int vega20_ih_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state) { return 0; diff --git a/drivers/gpu/drm/amd/amdgpu/vi.c b/drivers/gpu/drm/amd/amdgpu/vi.c index 
a83505815d39..06615f160331 100644 --- a/drivers/gpu/drm/amd/amdgpu/vi.c +++ b/drivers/gpu/drm/amd/amdgpu/vi.c @@ -1945,10 +1945,10 @@ static int vi_common_set_clockgating_state_by_smu(void *handle, return 0; } -static int vi_common_set_clockgating_state(void *handle, +static int vi_common_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state) { - struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_device *adev = ip_block->adev; if (amdgpu_sriov_vf(adev)) return 0; @@ -1988,7 +1988,7 @@ static int vi_common_set_clockgating_state(void *handle, return 0; } -static int vi_common_set_powergating_state(void *handle, +static int vi_common_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state) { return 0; diff --git a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h index 02f7ba8c93cd..388b44ed5928 100644 --- a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h +++ b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h @@ -4122,3 +4122,494 @@ static const uint32_t cwsr_trap_gfx12_hex[] = { 0xbf9f0000, 0xbf9f0000, 0xbf9f0000, 0x00000000, }; + +static const uint32_t cwsr_trap_gfx9_5_0_hex[] = { + 0xbf820001, 0xbf8202d8, + 0xb8f8f802, 0x8978ff78, + 0x00020006, 0xb8fbf803, + 0x866eff78, 0x00002000, + 0xbf840008, 0xbf0d986d, + 0xbf85001f, 0x866eff7b, + 0x00000400, 0xbf850061, + 0xbf8e0010, 0xb8fbf803, + 0xbf82fffa, 0x866eff7b, + 0x03800900, 0xbf850015, + 0x866eff7b, 0x000071ff, + 0xbf840008, 0x866fff7b, + 0x00007080, 0xbf840001, + 0xbeee1a87, 0xb8eff801, + 0x8e6e8c6e, 0x866e6f6e, + 0xbf85000a, 0xbf0d986d, + 0xbf850003, 0x866eff6d, + 0x00ff0000, 0xbf850005, + 0xbf0d986d, 0xbf850004, + 0x866eff7b, 0x00000400, + 0xbf850046, 0xbeed1a9d, + 0xb8faf807, 0x867aff7a, + 0x001f8000, 0x8e7a8b7a, + 0x8979ff79, 0xfc000000, + 0x87797a79, 0xba7ff807, + 0x00000000, 0xb8faf812, + 0xb8fbf813, 0x8efa887a, + 0xbf0d8f7b, 0xbf840002, + 0x877bff7b, 0xffff0000, + 
0xc0031cfd, 0x00000010, + 0xc0071bbd, 0x00000000, + 0xc0071ebd, 0x00000008, + 0xbf8cc07f, 0x8e739773, + 0x8979ff79, 0x01800000, + 0x87797379, 0xbf0d986d, + 0xbf840009, 0xbf0d9879, + 0xbf850007, 0x896dff6d, + 0x01ff0000, 0xba7f0583, + 0x00000000, 0xbf0d9d6d, + 0xbeed189d, 0xbf840012, + 0xbef91898, 0xbeed189d, + 0x86ee6e6e, 0xbf840001, + 0xbe801d6e, 0x866eff6d, + 0x01ff0000, 0xbf850005, + 0x8778ff78, 0x00002000, + 0x80ec886c, 0x82ed806d, + 0xbf820005, 0x866eff6d, + 0x01000000, 0xbf850002, + 0x806c846c, 0x826d806d, + 0x866dff6d, 0x0000ffff, + 0x8f7a8b79, 0x867aff7a, + 0x001f8000, 0xb97af807, + 0x86fe7e7e, 0x86ea6a6a, + 0x8f6e8378, 0xb96ee0c2, + 0xbf800002, 0xb9780002, + 0xbe801f6c, 0x866dff6d, + 0x0000ffff, 0xbefa0080, + 0xb97a0283, 0xb8faf807, + 0x867aff7a, 0x001f8000, + 0x8e7a8b7a, 0x8979ff79, + 0xfc000000, 0x87797a79, + 0xba7ff807, 0x00000000, + 0xbeee007e, 0xbeef007f, + 0xbefe0180, 0xbf900004, + 0x877a8478, 0xb97af802, + 0xbf8e0002, 0xbf88fffe, + 0xb8fa2985, 0x807a817a, + 0x8e7a8a7a, 0x8e7a817a, + 0xb8fb1605, 0x807b817b, + 0x8e7b867b, 0x807a7b7a, + 0x807a7e7a, 0x827b807f, + 0x867bff7b, 0x0000ffff, + 0xc04b1c3d, 0x00000050, + 0xbf8cc07f, 0xc04b1d3d, + 0x00000060, 0xbf8cc07f, + 0xc0431e7d, 0x00000074, + 0xbf8cc07f, 0xbef4007e, + 0x8675ff7f, 0x0000ffff, + 0x8775ff75, 0x00040000, + 0xbef60080, 0xbef700ff, + 0x00807fac, 0xbef1007c, + 0xbef00080, 0xb8f02985, + 0x80708170, 0x8e708a70, + 0x8e708170, 0xb8fa1605, + 0x807a817a, 0x8e7a867a, + 0x80707a70, 0xbef60084, + 0xbef600ff, 0x01000000, + 0xbefe007c, 0xbefc0070, + 0xc0611c7a, 0x0000007c, + 0xbf8cc07f, 0x80708470, + 0xbefc007e, 0xbefe007c, + 0xbefc0070, 0xc0611b3a, + 0x0000007c, 0xbf8cc07f, + 0x80708470, 0xbefc007e, + 0xbefe007c, 0xbefc0070, + 0xc0611b7a, 0x0000007c, + 0xbf8cc07f, 0x80708470, + 0xbefc007e, 0xbefe007c, + 0xbefc0070, 0xc0611bba, + 0x0000007c, 0xbf8cc07f, + 0x80708470, 0xbefc007e, + 0xbefe007c, 0xbefc0070, + 0xc0611bfa, 0x0000007c, + 0xbf8cc07f, 0x80708470, + 0xbefc007e, 0xbefe007c, + 0xbefc0070, 0xc0611e3a, 
+ 0x0000007c, 0xbf8cc07f, + 0x80708470, 0xbefc007e, + 0xb8fbf803, 0xbefe007c, + 0xbefc0070, 0xc0611efa, + 0x0000007c, 0xbf8cc07f, + 0x80708470, 0xbefc007e, + 0xbefe007c, 0xbefc0070, + 0xc0611a3a, 0x0000007c, + 0xbf8cc07f, 0x80708470, + 0xbefc007e, 0xbefe007c, + 0xbefc0070, 0xc0611a7a, + 0x0000007c, 0xbf8cc07f, + 0x80708470, 0xbefc007e, + 0xb8f1f801, 0xbefe007c, + 0xbefc0070, 0xc0611c7a, + 0x0000007c, 0xbf8cc07f, + 0x80708470, 0xbefc007e, + 0x867aff7f, 0x04000000, + 0xbeef0080, 0x876f6f7a, + 0xb8f02985, 0x80708170, + 0x8e708a70, 0x8e708170, + 0xb8fb1605, 0x807b817b, + 0x8e7b847b, 0x8e76827b, + 0xbef600ff, 0x01000000, + 0xbef20174, 0x80747074, + 0x82758075, 0xbefc0080, + 0xbf800000, 0xbe802b00, + 0xbe822b02, 0xbe842b04, + 0xbe862b06, 0xbe882b08, + 0xbe8a2b0a, 0xbe8c2b0c, + 0xbe8e2b0e, 0xc06b003a, + 0x00000000, 0xbf8cc07f, + 0xc06b013a, 0x00000010, + 0xbf8cc07f, 0xc06b023a, + 0x00000020, 0xbf8cc07f, + 0xc06b033a, 0x00000030, + 0xbf8cc07f, 0x8074c074, + 0x82758075, 0x807c907c, + 0xbf0a7b7c, 0xbf85ffe7, + 0xbef40172, 0xbef00080, + 0xbefe00c1, 0xbeff00c1, + 0xbee80080, 0xbee90080, + 0xbef600ff, 0x01000000, + 0x867aff78, 0x00400000, + 0xbf850003, 0xb8faf803, + 0x897a7aff, 0x10000000, + 0xbf85004d, 0xbe840080, + 0xd2890000, 0x00000900, + 0x80048104, 0xd2890001, + 0x00000900, 0x80048104, + 0xd2890002, 0x00000900, + 0x80048104, 0xd2890003, + 0x00000900, 0x80048104, + 0xc069003a, 0x00000070, + 0xbf8cc07f, 0x80709070, + 0xbf06c004, 0xbf84ffee, + 0xbe840080, 0xd2890000, + 0x00000901, 0x80048104, + 0xd2890001, 0x00000901, + 0x80048104, 0xd2890002, + 0x00000901, 0x80048104, + 0xd2890003, 0x00000901, + 0x80048104, 0xc069003a, + 0x00000070, 0xbf8cc07f, + 0x80709070, 0xbf06c004, + 0xbf84ffee, 0xbe840080, + 0xd2890000, 0x00000902, + 0x80048104, 0xd2890001, + 0x00000902, 0x80048104, + 0xd2890002, 0x00000902, + 0x80048104, 0xd2890003, + 0x00000902, 0x80048104, + 0xc069003a, 0x00000070, + 0xbf8cc07f, 0x80709070, + 0xbf06c004, 0xbf84ffee, + 0xbe840080, 0xd2890000, + 0x00000903, 
0x80048104, + 0xd2890001, 0x00000903, + 0x80048104, 0xd2890002, + 0x00000903, 0x80048104, + 0xd2890003, 0x00000903, + 0x80048104, 0xc069003a, + 0x00000070, 0xbf8cc07f, + 0x80709070, 0xbf06c004, + 0xbf84ffee, 0xbf820008, + 0xe0724000, 0x701d0000, + 0xe0724100, 0x701d0100, + 0xe0724200, 0x701d0200, + 0xe0724300, 0x701d0300, + 0xbefe00c1, 0xbeff00c1, + 0xb8fb5306, 0x867bc17b, + 0xbf840052, 0xbf8a0000, + 0x867aff6f, 0x04000000, + 0xbf84004e, 0x8e7b867b, + 0x8e7b827b, 0xbef6007b, + 0xb8f02985, 0x80708170, + 0x8e708a70, 0x8e708170, + 0xb8fa1605, 0x807a817a, + 0x8e7a867a, 0x80707a70, + 0x8070ff70, 0x00000080, + 0xbef600ff, 0x01000000, + 0xbefc0080, 0xd28c0002, + 0x000100c1, 0xd28d0003, + 0x000204c1, 0x867aff78, + 0x00400000, 0xbf850003, + 0xb8faf803, 0x897a7aff, + 0x10000000, 0xbf85001d, + 0x24040682, 0xd86c0000, + 0x00000002, 0xbf8cc07f, + 0xbe840080, 0xd2890000, + 0x00000900, 0x80048104, + 0xd2890001, 0x00000900, + 0x80048104, 0xd2890002, + 0x00000900, 0x80048104, + 0xd2890003, 0x00000900, + 0x80048104, 0xc069003a, + 0x00000070, 0xbf8cc07f, + 0x80709070, 0xbf06c004, + 0xbf84ffee, 0x680404ff, + 0x00000100, 0xd0c9006a, + 0x0000f702, 0xbf87ffe5, + 0xbf820016, 0xd1060002, + 0x00011103, 0x7e0602ff, + 0x00000200, 0xbefc00ff, + 0x00010000, 0xbe800077, + 0x8677ff77, 0xff7fffff, + 0x8777ff77, 0x00058000, + 0xd8ec0000, 0x00000002, + 0xbf8cc07f, 0xe0765000, + 0x701d0002, 0x68040702, + 0xd0c9006a, 0x0000f702, + 0xbefe016a, 0xbf87fff6, + 0xbef70000, 0xbef000ff, + 0x00000400, 0xbefe00c1, + 0xbeff00c1, 0xb8fb2b05, + 0x807b817b, 0x8e7b827b, + 0xbef600ff, 0x01000000, + 0xbefc0084, 0xbf0a7b7c, + 0xbf84006d, 0xbf11017c, + 0x807bff7b, 0x00001000, + 0x867aff78, 0x00400000, + 0xbf850003, 0xb8faf803, + 0x897a7aff, 0x10000000, + 0xbf850051, 0xbe840080, + 0xd2890000, 0x00000900, + 0x80048104, 0xd2890001, + 0x00000900, 0x80048104, + 0xd2890002, 0x00000900, + 0x80048104, 0xd2890003, + 0x00000900, 0x80048104, + 0xc069003a, 0x00000070, + 0xbf8cc07f, 0x80709070, + 0xbf06c004, 0xbf84ffee, + 
0xbe840080, 0xd2890000, + 0x00000901, 0x80048104, + 0xd2890001, 0x00000901, + 0x80048104, 0xd2890002, + 0x00000901, 0x80048104, + 0xd2890003, 0x00000901, + 0x80048104, 0xc069003a, + 0x00000070, 0xbf8cc07f, + 0x80709070, 0xbf06c004, + 0xbf84ffee, 0xbe840080, + 0xd2890000, 0x00000902, + 0x80048104, 0xd2890001, + 0x00000902, 0x80048104, + 0xd2890002, 0x00000902, + 0x80048104, 0xd2890003, + 0x00000902, 0x80048104, + 0xc069003a, 0x00000070, + 0xbf8cc07f, 0x80709070, + 0xbf06c004, 0xbf84ffee, + 0xbe840080, 0xd2890000, + 0x00000903, 0x80048104, + 0xd2890001, 0x00000903, + 0x80048104, 0xd2890002, + 0x00000903, 0x80048104, + 0xd2890003, 0x00000903, + 0x80048104, 0xc069003a, + 0x00000070, 0xbf8cc07f, + 0x80709070, 0xbf06c004, + 0xbf84ffee, 0x807c847c, + 0xbf0a7b7c, 0xbf85ffb1, + 0xbf9c0000, 0xbf820012, + 0x7e000300, 0x7e020301, + 0x7e040302, 0x7e060303, + 0xe0724000, 0x701d0000, + 0xe0724100, 0x701d0100, + 0xe0724200, 0x701d0200, + 0xe0724300, 0x701d0300, + 0x807c847c, 0x8070ff70, + 0x00000400, 0xbf0a7b7c, + 0xbf85ffef, 0xbf9c0000, + 0xb8fb2985, 0x807b817b, + 0x8e7b837b, 0xb8fa2b05, + 0x807a817a, 0x8e7a827a, + 0x80fb7a7b, 0x867b7b7b, + 0xbf84007a, 0x807bff7b, + 0x00001000, 0xbefc0080, + 0xbf11017c, 0x867aff78, + 0x00400000, 0xbf850003, + 0xb8faf803, 0x897a7aff, + 0x10000000, 0xbf850059, + 0xd3d84000, 0x18000100, + 0xd3d84001, 0x18000101, + 0xd3d84002, 0x18000102, + 0xd3d84003, 0x18000103, + 0xbe840080, 0xd2890000, + 0x00000900, 0x80048104, + 0xd2890001, 0x00000900, + 0x80048104, 0xd2890002, + 0x00000900, 0x80048104, + 0xd2890003, 0x00000900, + 0x80048104, 0xc069003a, + 0x00000070, 0xbf8cc07f, + 0x80709070, 0xbf06c004, + 0xbf84ffee, 0xbe840080, + 0xd2890000, 0x00000901, + 0x80048104, 0xd2890001, + 0x00000901, 0x80048104, + 0xd2890002, 0x00000901, + 0x80048104, 0xd2890003, + 0x00000901, 0x80048104, + 0xc069003a, 0x00000070, + 0xbf8cc07f, 0x80709070, + 0xbf06c004, 0xbf84ffee, + 0xbe840080, 0xd2890000, + 0x00000902, 0x80048104, + 0xd2890001, 0x00000902, + 0x80048104, 0xd2890002, 
+ 0x00000902, 0x80048104, + 0xd2890003, 0x00000902, + 0x80048104, 0xc069003a, + 0x00000070, 0xbf8cc07f, + 0x80709070, 0xbf06c004, + 0xbf84ffee, 0xbe840080, + 0xd2890000, 0x00000903, + 0x80048104, 0xd2890001, + 0x00000903, 0x80048104, + 0xd2890002, 0x00000903, + 0x80048104, 0xd2890003, + 0x00000903, 0x80048104, + 0xc069003a, 0x00000070, + 0xbf8cc07f, 0x80709070, + 0xbf06c004, 0xbf84ffee, + 0x807c847c, 0xbf0a7b7c, + 0xbf85ffa9, 0xbf9c0000, + 0xbf820016, 0xd3d84000, + 0x18000100, 0xd3d84001, + 0x18000101, 0xd3d84002, + 0x18000102, 0xd3d84003, + 0x18000103, 0xe0724000, + 0x701d0000, 0xe0724100, + 0x701d0100, 0xe0724200, + 0x701d0200, 0xe0724300, + 0x701d0300, 0x807c847c, + 0x8070ff70, 0x00000400, + 0xbf0a7b7c, 0xbf85ffeb, + 0xbf9c0000, 0xbf8200f4, + 0xbef4007e, 0x8675ff7f, + 0x0000ffff, 0x8775ff75, + 0x00040000, 0xbef60080, + 0xbef700ff, 0x00807fac, + 0x866eff7f, 0x04000000, + 0xbf840025, 0xbefe00c1, + 0xbeff00c1, 0xb8ef5306, + 0x866fc16f, 0xbf840020, + 0x8e6f866f, 0x8e6f826f, + 0xbef6006f, 0xb8f82985, + 0x80788178, 0x8e788a78, + 0x8e788178, 0xb8ee1605, + 0x806e816e, 0x8e6e866e, + 0x80786e78, 0x8078ff78, + 0x00000080, 0xbef600ff, + 0x01000000, 0xbefc0080, + 0xe0510000, 0x781d0000, + 0xe0510100, 0x781d0000, + 0xe0510200, 0x781d0000, + 0xe0510300, 0x781d0000, + 0xe0510400, 0x781d0000, + 0x807cff7c, 0x00000500, + 0x8078ff78, 0x00000500, + 0xbf0a6f7c, 0xbf85fff0, + 0xbefe00c1, 0xbeff00c1, + 0xbef600ff, 0x01000000, + 0xb8ef2b05, 0x806f816f, + 0x8e6f826f, 0x806fff6f, + 0x00008000, 0xbef80080, + 0xbeee0078, 0x8078ff78, + 0x00000400, 0xbefc0084, + 0xbf11087c, 0xe0524000, + 0x781d0000, 0xe0524100, + 0x781d0100, 0xe0524200, + 0x781d0200, 0xe0524300, + 0x781d0300, 0xbf8c0f70, + 0x7e000300, 0x7e020301, + 0x7e040302, 0x7e060303, + 0x807c847c, 0x8078ff78, + 0x00000400, 0xbf0a6f7c, + 0xbf85ffee, 0xb8ef2985, + 0x806f816f, 0x8e6f836f, + 0xb8f92b05, 0x80798179, + 0x8e798279, 0x80ef796f, + 0x866f6f6f, 0xbf84001a, + 0x806fff6f, 0x00008000, + 0xbefc0080, 0xbf11087c, + 0xe0524000, 
0x781d0000, + 0xe0524100, 0x781d0100, + 0xe0524200, 0x781d0200, + 0xe0524300, 0x781d0300, + 0xbf8c0f70, 0xd3d94000, + 0x18000100, 0xd3d94001, + 0x18000101, 0xd3d94002, + 0x18000102, 0xd3d94003, + 0x18000103, 0x807c847c, + 0x8078ff78, 0x00000400, + 0xbf0a6f7c, 0xbf85ffea, + 0xbf9c0000, 0xe0524000, + 0x6e1d0000, 0xe0524100, + 0x6e1d0100, 0xe0524200, + 0x6e1d0200, 0xe0524300, + 0x6e1d0300, 0xbf8c0f70, + 0xb8f82985, 0x80788178, + 0x8e788a78, 0x8e788178, + 0xb8ee1605, 0x806e816e, + 0x8e6e866e, 0x80786e78, + 0x80f8c078, 0xb8ef1605, + 0x806f816f, 0x8e6f846f, + 0x8e76826f, 0xbef600ff, + 0x01000000, 0xbefc006f, + 0xc031003a, 0x00000078, + 0x80f8c078, 0xbf8cc07f, + 0x80fc907c, 0xbf800000, + 0xbe802d00, 0xbe822d02, + 0xbe842d04, 0xbe862d06, + 0xbe882d08, 0xbe8a2d0a, + 0xbe8c2d0c, 0xbe8e2d0e, + 0xbf06807c, 0xbf84fff0, + 0xb8f82985, 0x80788178, + 0x8e788a78, 0x8e788178, + 0xb8ee1605, 0x806e816e, + 0x8e6e866e, 0x80786e78, + 0xbef60084, 0xbef600ff, + 0x01000000, 0xc0211bfa, + 0x00000078, 0x80788478, + 0xc0211b3a, 0x00000078, + 0x80788478, 0xc0211b7a, + 0x00000078, 0x80788478, + 0xc0211c3a, 0x00000078, + 0x80788478, 0xc0211c7a, + 0x00000078, 0x80788478, + 0xc0211eba, 0x00000078, + 0x80788478, 0xc0211efa, + 0x00000078, 0x80788478, + 0xc0211a3a, 0x00000078, + 0x80788478, 0xc0211a7a, + 0x00000078, 0x80788478, + 0xc0211cfa, 0x00000078, + 0x80788478, 0xbf8cc07f, + 0xbefc006f, 0xbefe0070, + 0xbeff0071, 0x866f7bff, + 0x000003ff, 0xb96f4803, + 0x866f7bff, 0xfffff800, + 0x8f6f8b6f, 0xb96fa2c3, + 0xb973f801, 0xb8ee2985, + 0x806e816e, 0x8e6e8a6e, + 0x8e6e816e, 0xb8ef1605, + 0x806f816f, 0x8e6f866f, + 0x806e6f6e, 0x806e746e, + 0x826f8075, 0x866fff6f, + 0x0000ffff, 0xc00b1c37, + 0x00000050, 0xc00b1d37, + 0x00000060, 0xc0031e77, + 0x00000074, 0xbf8cc07f, + 0x8f6e8b79, 0x866eff6e, + 0x001f8000, 0xb96ef807, + 0x866dff6d, 0x0000ffff, + 0x86fe7e7e, 0x86ea6a6a, + 0x8f6e837a, 0xb96ee0c2, + 0xbf800002, 0xb97a0002, + 0xbf8a0000, 0xbe801f6c, + 0xbf9b0000, 0x00000000, +}; diff --git 
a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx10.asm b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx10.asm index 44772eec9ef4..96fbb16ceb21 100644 --- a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx10.asm +++ b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx10.asm @@ -34,41 +34,24 @@ * cpp -DASIC_FAMILY=CHIP_PLUM_BONITO cwsr_trap_handler_gfx10.asm -P -o gfx11.sp3 * sp3 gfx11.sp3 -hex gfx11.hex * - * gfx12: - * cpp -DASIC_FAMILY=CHIP_GFX12 cwsr_trap_handler_gfx10.asm -P -o gfx12.sp3 - * sp3 gfx12.sp3 -hex gfx12.hex */ #define CHIP_NAVI10 26 #define CHIP_SIENNA_CICHLID 30 #define CHIP_PLUM_BONITO 36 -#define CHIP_GFX12 37 #define NO_SQC_STORE (ASIC_FAMILY >= CHIP_SIENNA_CICHLID) #define HAVE_XNACK (ASIC_FAMILY < CHIP_SIENNA_CICHLID) #define HAVE_SENDMSG_RTN (ASIC_FAMILY >= CHIP_PLUM_BONITO) #define HAVE_BUFFER_LDS_LOAD (ASIC_FAMILY < CHIP_PLUM_BONITO) -#define SW_SA_TRAP (ASIC_FAMILY >= CHIP_PLUM_BONITO && ASIC_FAMILY < CHIP_GFX12) +#define SW_SA_TRAP (ASIC_FAMILY == CHIP_PLUM_BONITO) #define SAVE_AFTER_XNACK_ERROR (HAVE_XNACK && !NO_SQC_STORE) // workaround for TCP store failure after XNACK error when ALLOW_REPLAY=0, for debugger #define SINGLE_STEP_MISSED_WORKAROUND 1 //workaround for lost MODE.DEBUG_EN exception when SAVECTX raised -#if ASIC_FAMILY < CHIP_GFX12 #define S_COHERENCE glc:1 #define V_COHERENCE slc:1 glc:1 #define S_WAITCNT_0 s_waitcnt 0 -#else -#define S_COHERENCE scope:SCOPE_SYS -#define V_COHERENCE scope:SCOPE_SYS -#define S_WAITCNT_0 s_wait_idle -#define HW_REG_SHADER_FLAT_SCRATCH_LO HW_REG_WAVE_SCRATCH_BASE_LO -#define HW_REG_SHADER_FLAT_SCRATCH_HI HW_REG_WAVE_SCRATCH_BASE_HI -#define HW_REG_GPR_ALLOC HW_REG_WAVE_GPR_ALLOC -#define HW_REG_LDS_ALLOC HW_REG_WAVE_LDS_ALLOC -#define HW_REG_MODE HW_REG_WAVE_MODE -#endif - -#if ASIC_FAMILY < CHIP_GFX12 var SQ_WAVE_STATUS_SPI_PRIO_MASK = 0x00000006 var SQ_WAVE_STATUS_HALT_MASK = 0x2000 var SQ_WAVE_STATUS_ECC_ERR_MASK = 0x20000 @@ -81,21 +64,6 @@ var S_STATUS_ALWAYS_CLEAR_MASK = 
SQ_WAVE_STATUS_SPI_PRIO_MASK|SQ_WAVE_STATUS_E var S_STATUS_HALT_MASK = SQ_WAVE_STATUS_HALT_MASK var S_SAVE_PC_HI_TRAP_ID_MASK = 0x00FF0000 var S_SAVE_PC_HI_HT_MASK = 0x01000000 -#else -var SQ_WAVE_STATE_PRIV_BARRIER_COMPLETE_MASK = 0x4 -var SQ_WAVE_STATE_PRIV_SCC_SHIFT = 9 -var SQ_WAVE_STATE_PRIV_SYS_PRIO_MASK = 0xC00 -var SQ_WAVE_STATE_PRIV_HALT_MASK = 0x4000 -var SQ_WAVE_STATE_PRIV_POISON_ERR_MASK = 0x8000 -var SQ_WAVE_STATE_PRIV_POISON_ERR_SHIFT = 15 -var SQ_WAVE_STATUS_WAVE64_SHIFT = 29 -var SQ_WAVE_STATUS_WAVE64_SIZE = 1 -var SQ_WAVE_LDS_ALLOC_GRANULARITY = 9 -var S_STATUS_HWREG = HW_REG_WAVE_STATE_PRIV -var S_STATUS_ALWAYS_CLEAR_MASK = SQ_WAVE_STATE_PRIV_SYS_PRIO_MASK|SQ_WAVE_STATE_PRIV_POISON_ERR_MASK -var S_STATUS_HALT_MASK = SQ_WAVE_STATE_PRIV_HALT_MASK -var S_SAVE_PC_HI_TRAP_ID_MASK = 0xF0000000 -#endif var SQ_WAVE_STATUS_NO_VGPRS_SHIFT = 24 var SQ_WAVE_LDS_ALLOC_LDS_SIZE_SHIFT = 12 @@ -110,7 +78,6 @@ var SQ_WAVE_GPR_ALLOC_VGPR_SIZE_SHIFT = 8 var SQ_WAVE_GPR_ALLOC_VGPR_SIZE_SHIFT = 12 #endif -#if ASIC_FAMILY < CHIP_GFX12 var SQ_WAVE_TRAPSTS_SAVECTX_MASK = 0x400 var SQ_WAVE_TRAPSTS_EXCP_MASK = 0x1FF var SQ_WAVE_TRAPSTS_SAVECTX_SHIFT = 10 @@ -161,39 +128,6 @@ var S_TRAPSTS_RESTORE_PART_3_SIZE = 32 - S_TRAPSTS_RESTORE_PART_3_SHIFT var S_TRAPSTS_HWREG = HW_REG_TRAPSTS var S_TRAPSTS_SAVE_CONTEXT_MASK = SQ_WAVE_TRAPSTS_SAVECTX_MASK var S_TRAPSTS_SAVE_CONTEXT_SHIFT = SQ_WAVE_TRAPSTS_SAVECTX_SHIFT -#else -var SQ_WAVE_EXCP_FLAG_PRIV_ADDR_WATCH_MASK = 0xF -var SQ_WAVE_EXCP_FLAG_PRIV_MEM_VIOL_MASK = 0x10 -var SQ_WAVE_EXCP_FLAG_PRIV_SAVE_CONTEXT_SHIFT = 5 -var SQ_WAVE_EXCP_FLAG_PRIV_SAVE_CONTEXT_MASK = 0x20 -var SQ_WAVE_EXCP_FLAG_PRIV_ILLEGAL_INST_MASK = 0x40 -var SQ_WAVE_EXCP_FLAG_PRIV_ILLEGAL_INST_SHIFT = 6 -var SQ_WAVE_EXCP_FLAG_PRIV_HOST_TRAP_MASK = 0x80 -var SQ_WAVE_EXCP_FLAG_PRIV_HOST_TRAP_SHIFT = 7 -var SQ_WAVE_EXCP_FLAG_PRIV_WAVE_START_MASK = 0x100 -var SQ_WAVE_EXCP_FLAG_PRIV_WAVE_START_SHIFT = 8 -var SQ_WAVE_EXCP_FLAG_PRIV_WAVE_END_MASK = 0x200 -var 
SQ_WAVE_EXCP_FLAG_PRIV_TRAP_AFTER_INST_MASK = 0x800 -var SQ_WAVE_TRAP_CTRL_ADDR_WATCH_MASK = 0x80 -var SQ_WAVE_TRAP_CTRL_TRAP_AFTER_INST_MASK = 0x200 - -var S_TRAPSTS_HWREG = HW_REG_WAVE_EXCP_FLAG_PRIV -var S_TRAPSTS_SAVE_CONTEXT_MASK = SQ_WAVE_EXCP_FLAG_PRIV_SAVE_CONTEXT_MASK -var S_TRAPSTS_SAVE_CONTEXT_SHIFT = SQ_WAVE_EXCP_FLAG_PRIV_SAVE_CONTEXT_SHIFT -var S_TRAPSTS_NON_MASKABLE_EXCP_MASK = SQ_WAVE_EXCP_FLAG_PRIV_MEM_VIOL_MASK |\ - SQ_WAVE_EXCP_FLAG_PRIV_ILLEGAL_INST_MASK |\ - SQ_WAVE_EXCP_FLAG_PRIV_HOST_TRAP_MASK |\ - SQ_WAVE_EXCP_FLAG_PRIV_WAVE_START_MASK |\ - SQ_WAVE_EXCP_FLAG_PRIV_WAVE_END_MASK |\ - SQ_WAVE_EXCP_FLAG_PRIV_TRAP_AFTER_INST_MASK -var S_TRAPSTS_RESTORE_PART_1_SIZE = SQ_WAVE_EXCP_FLAG_PRIV_SAVE_CONTEXT_SHIFT -var S_TRAPSTS_RESTORE_PART_2_SHIFT = SQ_WAVE_EXCP_FLAG_PRIV_ILLEGAL_INST_SHIFT -var S_TRAPSTS_RESTORE_PART_2_SIZE = SQ_WAVE_EXCP_FLAG_PRIV_HOST_TRAP_SHIFT - SQ_WAVE_EXCP_FLAG_PRIV_ILLEGAL_INST_SHIFT -var S_TRAPSTS_RESTORE_PART_3_SHIFT = SQ_WAVE_EXCP_FLAG_PRIV_WAVE_START_SHIFT -var S_TRAPSTS_RESTORE_PART_3_SIZE = 32 - S_TRAPSTS_RESTORE_PART_3_SHIFT -var BARRIER_STATE_SIGNAL_OFFSET = 16 -var BARRIER_STATE_VALID_OFFSET = 0 -#endif // bits [31:24] unused by SPI debug data var TTMP11_SAVE_REPLAY_W64H_SHIFT = 31 @@ -305,11 +239,7 @@ L_TRAP_NO_BARRIER: L_HALTED: // Host trap may occur while wave is halted. -#if ASIC_FAMILY < CHIP_GFX12 s_and_b32 ttmp2, s_save_pc_hi, S_SAVE_PC_HI_TRAP_ID_MASK -#else - s_and_b32 ttmp2, s_save_trapsts, SQ_WAVE_EXCP_FLAG_PRIV_HOST_TRAP_MASK -#endif s_cbranch_scc1 L_FETCH_2ND_TRAP L_CHECK_SAVE: @@ -336,7 +266,6 @@ L_NOT_HALTED: // Check for maskable exceptions in trapsts.excp and trapsts.excp_hi. // Maskable exceptions only cause the wave to enter the trap handler if // their respective bit in mode.excp_en is set. 
-#if ASIC_FAMILY < CHIP_GFX12 s_and_b32 ttmp2, s_save_trapsts, SQ_WAVE_TRAPSTS_EXCP_MASK|SQ_WAVE_TRAPSTS_EXCP_HI_MASK s_cbranch_scc0 L_CHECK_TRAP_ID @@ -349,17 +278,6 @@ L_NOT_ADDR_WATCH: s_lshl_b32 ttmp2, ttmp2, SQ_WAVE_MODE_EXCP_EN_SHIFT s_and_b32 ttmp2, ttmp2, ttmp3 s_cbranch_scc1 L_FETCH_2ND_TRAP -#else - s_getreg_b32 ttmp2, hwreg(HW_REG_WAVE_EXCP_FLAG_USER) - s_and_b32 ttmp3, s_save_trapsts, SQ_WAVE_EXCP_FLAG_PRIV_ADDR_WATCH_MASK - s_cbranch_scc0 L_NOT_ADDR_WATCH - s_or_b32 ttmp2, ttmp2, SQ_WAVE_TRAP_CTRL_ADDR_WATCH_MASK - -L_NOT_ADDR_WATCH: - s_getreg_b32 ttmp3, hwreg(HW_REG_WAVE_TRAP_CTRL) - s_and_b32 ttmp2, ttmp3, ttmp2 - s_cbranch_scc1 L_FETCH_2ND_TRAP -#endif L_CHECK_TRAP_ID: // Check trap_id != 0 @@ -369,13 +287,8 @@ L_CHECK_TRAP_ID: #if SINGLE_STEP_MISSED_WORKAROUND // Prioritize single step exception over context save. // Second-level trap will halt wave and RFE, re-entering for SAVECTX. -#if ASIC_FAMILY < CHIP_GFX12 s_getreg_b32 ttmp2, hwreg(HW_REG_MODE) s_and_b32 ttmp2, ttmp2, SQ_WAVE_MODE_DEBUG_EN_MASK -#else - // WAVE_TRAP_CTRL is already in ttmp3. - s_and_b32 ttmp3, ttmp3, SQ_WAVE_TRAP_CTRL_TRAP_AFTER_INST_MASK -#endif s_cbranch_scc1 L_FETCH_2ND_TRAP #endif @@ -425,12 +338,7 @@ L_NO_NEXT_TRAP: s_cbranch_scc1 L_TRAP_CASE // Host trap will not cause trap re-entry. -#if ASIC_FAMILY < CHIP_GFX12 s_and_b32 ttmp2, s_save_pc_hi, S_SAVE_PC_HI_HT_MASK -#else - s_getreg_b32 ttmp2, hwreg(HW_REG_WAVE_EXCP_FLAG_PRIV) - s_and_b32 ttmp2, ttmp2, SQ_WAVE_EXCP_FLAG_PRIV_HOST_TRAP_MASK -#endif s_cbranch_scc1 L_EXIT_TRAP s_or_b32 s_save_status, s_save_status, S_STATUS_HALT_MASK @@ -457,16 +365,7 @@ L_EXIT_TRAP: s_and_b64 exec, exec, exec // Restore STATUS.EXECZ, not writable by s_setreg_b32 s_and_b64 vcc, vcc, vcc // Restore STATUS.VCCZ, not writable by s_setreg_b32 -#if ASIC_FAMILY < CHIP_GFX12 s_setreg_b32 hwreg(S_STATUS_HWREG), s_save_status -#else - // STATE_PRIV.BARRIER_COMPLETE may have changed since we read it. 
- // Only restore fields which the trap handler changes. - s_lshr_b32 s_save_status, s_save_status, SQ_WAVE_STATE_PRIV_SCC_SHIFT - s_setreg_b32 hwreg(S_STATUS_HWREG, SQ_WAVE_STATE_PRIV_SCC_SHIFT, \ - SQ_WAVE_STATE_PRIV_POISON_ERR_SHIFT - SQ_WAVE_STATE_PRIV_SCC_SHIFT + 1), s_save_status -#endif - s_rfe_b64 [ttmp0, ttmp1] L_SAVE: @@ -478,14 +377,6 @@ L_SAVE: s_endpgm L_HAVE_VGPRS: #endif -#if ASIC_FAMILY >= CHIP_GFX12 - s_getreg_b32 s_save_tmp, hwreg(HW_REG_WAVE_STATUS) - s_bitcmp1_b32 s_save_tmp, SQ_WAVE_STATUS_NO_VGPRS_SHIFT - s_cbranch_scc0 L_HAVE_VGPRS - s_endpgm -L_HAVE_VGPRS: -#endif - s_and_b32 s_save_pc_hi, s_save_pc_hi, 0x0000ffff //pc[47:32] s_mov_b32 s_save_tmp, 0 s_setreg_b32 hwreg(S_TRAPSTS_HWREG, S_TRAPSTS_SAVE_CONTEXT_SHIFT, 1), s_save_tmp //clear saveCtx bit @@ -671,19 +562,6 @@ L_SAVE_HWREG: s_mov_b32 m0, 0x0 //Next lane of v2 to write to #endif -#if ASIC_FAMILY >= CHIP_GFX12 - // Ensure no further changes to barrier or LDS state. - // STATE_PRIV.BARRIER_COMPLETE may change up to this point. - s_barrier_signal -2 - s_barrier_wait -2 - - // Re-read final state of BARRIER_COMPLETE field for save. 
- s_getreg_b32 s_save_tmp, hwreg(S_STATUS_HWREG) - s_and_b32 s_save_tmp, s_save_tmp, SQ_WAVE_STATE_PRIV_BARRIER_COMPLETE_MASK - s_andn2_b32 s_save_status, s_save_status, SQ_WAVE_STATE_PRIV_BARRIER_COMPLETE_MASK - s_or_b32 s_save_status, s_save_status, s_save_tmp -#endif - write_hwreg_to_mem(s_save_m0, s_save_buf_rsrc0, s_save_mem_offset) write_hwreg_to_mem(s_save_pc_lo, s_save_buf_rsrc0, s_save_mem_offset) s_andn2_b32 s_save_tmp, s_save_pc_hi, S_SAVE_PC_HI_FIRST_WAVE_MASK @@ -707,21 +585,6 @@ L_SAVE_HWREG: s_getreg_b32 s_save_m0, hwreg(HW_REG_SHADER_FLAT_SCRATCH_HI) write_hwreg_to_mem(s_save_m0, s_save_buf_rsrc0, s_save_mem_offset) -#if ASIC_FAMILY >= CHIP_GFX12 - s_getreg_b32 s_save_m0, hwreg(HW_REG_WAVE_EXCP_FLAG_USER) - write_hwreg_to_mem(s_save_m0, s_save_buf_rsrc0, s_save_mem_offset) - - s_getreg_b32 s_save_m0, hwreg(HW_REG_WAVE_TRAP_CTRL) - write_hwreg_to_mem(s_save_m0, s_save_buf_rsrc0, s_save_mem_offset) - - s_getreg_b32 s_save_tmp, hwreg(HW_REG_WAVE_STATUS) - write_hwreg_to_mem(s_save_tmp, s_save_buf_rsrc0, s_save_mem_offset) - - s_get_barrier_state s_save_tmp, -1 - s_wait_kmcnt (0) - write_hwreg_to_mem(s_save_tmp, s_save_buf_rsrc0, s_save_mem_offset) -#endif - #if NO_SQC_STORE // Write HWREGs with 16 VGPR lanes. TTMPs occupy space after this. s_mov_b32 exec_lo, 0xFFFF @@ -814,9 +677,7 @@ L_SAVE_LDS_NORMAL: s_and_b32 s_save_alloc_size, s_save_alloc_size, 0xFFFFFFFF //lds_size is zero? s_cbranch_scc0 L_SAVE_LDS_DONE //no lds used? jump to L_SAVE_DONE -#if ASIC_FAMILY < CHIP_GFX12 s_barrier //LDS is used? wait for other waves in the same TG -#endif s_and_b32 s_save_tmp, s_save_pc_hi, S_SAVE_PC_HI_FIRST_WAVE_MASK s_cbranch_scc0 L_SAVE_LDS_DONE @@ -1081,11 +942,6 @@ L_RESTORE: s_mov_b32 s_restore_buf_rsrc2, 0 //NUM_RECORDS initial value = 0 (in bytes) s_mov_b32 s_restore_buf_rsrc3, S_RESTORE_BUF_RSRC_WORD3_MISC -#if ASIC_FAMILY >= CHIP_GFX12 - // Save s_restore_spi_init_hi for later use. 
- s_mov_b32 s_restore_spi_init_hi_save, s_restore_spi_init_hi -#endif - //determine it is wave32 or wave64 get_wave_size2(s_restore_size) @@ -1320,9 +1176,7 @@ L_RESTORE_SGPR: // s_barrier with MODE.DEBUG_EN=1, STATUS.PRIV=1 incorrectly asserts debug exception. // Clear DEBUG_EN before and restore MODE after the barrier. s_setreg_imm32_b32 hwreg(HW_REG_MODE), 0 -#if ASIC_FAMILY < CHIP_GFX12 s_barrier //barrier to ensure the readiness of LDS before access attemps from any other wave in the same TG -#endif /* restore HW registers */ L_RESTORE_HWREG: @@ -1334,11 +1188,6 @@ L_RESTORE_HWREG: s_mov_b32 s_restore_buf_rsrc2, 0x1000000 //NUM_RECORDS in bytes -#if ASIC_FAMILY >= CHIP_GFX12 - // Restore s_restore_spi_init_hi before the saved value gets clobbered. - s_mov_b32 s_restore_spi_init_hi, s_restore_spi_init_hi_save -#endif - read_hwreg_from_mem(s_restore_m0, s_restore_buf_rsrc0, s_restore_mem_offset) read_hwreg_from_mem(s_restore_pc_lo, s_restore_buf_rsrc0, s_restore_mem_offset) read_hwreg_from_mem(s_restore_pc_hi, s_restore_buf_rsrc0, s_restore_mem_offset) @@ -1358,44 +1207,6 @@ L_RESTORE_HWREG: s_setreg_b32 hwreg(HW_REG_SHADER_FLAT_SCRATCH_HI), s_restore_flat_scratch -#if ASIC_FAMILY >= CHIP_GFX12 - read_hwreg_from_mem(s_restore_tmp, s_restore_buf_rsrc0, s_restore_mem_offset) - S_WAITCNT_0 - s_setreg_b32 hwreg(HW_REG_WAVE_EXCP_FLAG_USER), s_restore_tmp - - read_hwreg_from_mem(s_restore_tmp, s_restore_buf_rsrc0, s_restore_mem_offset) - S_WAITCNT_0 - s_setreg_b32 hwreg(HW_REG_WAVE_TRAP_CTRL), s_restore_tmp - - // Only the first wave needs to restore the workgroup barrier. 
- s_and_b32 s_restore_tmp, s_restore_spi_init_hi, S_RESTORE_SPI_INIT_FIRST_WAVE_MASK - s_cbranch_scc0 L_SKIP_BARRIER_RESTORE - - // Skip over WAVE_STATUS, since there is no state to restore from it - s_add_u32 s_restore_mem_offset, s_restore_mem_offset, 4 - - read_hwreg_from_mem(s_restore_tmp, s_restore_buf_rsrc0, s_restore_mem_offset) - S_WAITCNT_0 - - s_bitcmp1_b32 s_restore_tmp, BARRIER_STATE_VALID_OFFSET - s_cbranch_scc0 L_SKIP_BARRIER_RESTORE - - // extract the saved signal count from s_restore_tmp - s_lshr_b32 s_restore_tmp, s_restore_tmp, BARRIER_STATE_SIGNAL_OFFSET - - // We need to call s_barrier_signal repeatedly to restore the signal - // count of the work group barrier. The member count is already - // initialized with the number of waves in the work group. -L_BARRIER_RESTORE_LOOP: - s_and_b32 s_restore_tmp, s_restore_tmp, s_restore_tmp - s_cbranch_scc0 L_SKIP_BARRIER_RESTORE - s_barrier_signal -1 - s_add_i32 s_restore_tmp, s_restore_tmp, -1 - s_branch L_BARRIER_RESTORE_LOOP - -L_SKIP_BARRIER_RESTORE: -#endif - s_mov_b32 m0, s_restore_m0 s_mov_b32 exec_lo, s_restore_exec_lo s_mov_b32 exec_hi, s_restore_exec_hi @@ -1453,13 +1264,6 @@ L_RETURN_WITHOUT_PRIV: s_setreg_b32 hwreg(S_STATUS_HWREG), s_restore_status // SCC is included, which is changed by previous salu -#if ASIC_FAMILY >= CHIP_GFX12 - // Make barrier and LDS state visible to all waves in the group. - // STATE_PRIV.BARRIER_COMPLETE may change after this point. 
- s_barrier_signal -2 - s_barrier_wait -2 -#endif - s_rfe_b64 s_restore_pc_lo //Return to the main shader program and resume execution L_END_PGM: @@ -1598,11 +1402,7 @@ function get_hwreg_size_bytes end function get_wave_size2(s_reg) -#if ASIC_FAMILY < CHIP_GFX12 s_getreg_b32 s_reg, hwreg(HW_REG_IB_STS2,SQ_WAVE_IB_STS2_WAVE64_SHIFT,SQ_WAVE_IB_STS2_WAVE64_SIZE) -#else - s_getreg_b32 s_reg, hwreg(HW_REG_WAVE_STATUS,SQ_WAVE_STATUS_WAVE64_SHIFT,SQ_WAVE_STATUS_WAVE64_SIZE) -#endif s_lshl_b32 s_reg, s_reg, S_WAVE_SIZE end diff --git a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx12.asm b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx12.asm new file mode 100644 index 000000000000..1740e98c6719 --- /dev/null +++ b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx12.asm @@ -0,0 +1,1126 @@ +/* + * Copyright 2018 Advanced Micro Devices, Inc. + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * OTHER DEALINGS IN THE SOFTWARE. 
+ */ + +/* To compile this assembly code: + * + * gfx12: + * cpp -DASIC_FAMILY=CHIP_GFX12 cwsr_trap_handler_gfx12.asm -P -o gfx12.sp3 + * sp3 gfx12.sp3 -hex gfx12.hex + */ + +#define CHIP_GFX12 37 + +#define SINGLE_STEP_MISSED_WORKAROUND 1 //workaround for lost TRAP_AFTER_INST exception when SAVECTX raised + +var SQ_WAVE_STATE_PRIV_BARRIER_COMPLETE_MASK = 0x4 +var SQ_WAVE_STATE_PRIV_SCC_SHIFT = 9 +var SQ_WAVE_STATE_PRIV_SYS_PRIO_MASK = 0xC00 +var SQ_WAVE_STATE_PRIV_HALT_MASK = 0x4000 +var SQ_WAVE_STATE_PRIV_POISON_ERR_MASK = 0x8000 +var SQ_WAVE_STATE_PRIV_POISON_ERR_SHIFT = 15 +var SQ_WAVE_STATUS_WAVE64_SHIFT = 29 +var SQ_WAVE_STATUS_WAVE64_SIZE = 1 +var SQ_WAVE_STATUS_NO_VGPRS_SHIFT = 24 +var SQ_WAVE_STATE_PRIV_ALWAYS_CLEAR_MASK = SQ_WAVE_STATE_PRIV_SYS_PRIO_MASK|SQ_WAVE_STATE_PRIV_POISON_ERR_MASK +var S_SAVE_PC_HI_TRAP_ID_MASK = 0xF0000000 + +var SQ_WAVE_LDS_ALLOC_LDS_SIZE_SHIFT = 12 +var SQ_WAVE_LDS_ALLOC_LDS_SIZE_SIZE = 9 +var SQ_WAVE_GPR_ALLOC_VGPR_SIZE_SIZE = 8 +var SQ_WAVE_GPR_ALLOC_VGPR_SIZE_SHIFT = 12 +var SQ_WAVE_LDS_ALLOC_VGPR_SHARED_SIZE_SHIFT = 24 +var SQ_WAVE_LDS_ALLOC_VGPR_SHARED_SIZE_SIZE = 4 +var SQ_WAVE_LDS_ALLOC_GRANULARITY = 9 + +var SQ_WAVE_EXCP_FLAG_PRIV_ADDR_WATCH_MASK = 0xF +var SQ_WAVE_EXCP_FLAG_PRIV_MEM_VIOL_MASK = 0x10 +var SQ_WAVE_EXCP_FLAG_PRIV_SAVE_CONTEXT_SHIFT = 5 +var SQ_WAVE_EXCP_FLAG_PRIV_SAVE_CONTEXT_MASK = 0x20 +var SQ_WAVE_EXCP_FLAG_PRIV_ILLEGAL_INST_MASK = 0x40 +var SQ_WAVE_EXCP_FLAG_PRIV_ILLEGAL_INST_SHIFT = 6 +var SQ_WAVE_EXCP_FLAG_PRIV_HOST_TRAP_MASK = 0x80 +var SQ_WAVE_EXCP_FLAG_PRIV_HOST_TRAP_SHIFT = 7 +var SQ_WAVE_EXCP_FLAG_PRIV_WAVE_START_MASK = 0x100 +var SQ_WAVE_EXCP_FLAG_PRIV_WAVE_START_SHIFT = 8 +var SQ_WAVE_EXCP_FLAG_PRIV_WAVE_END_MASK = 0x200 +var SQ_WAVE_EXCP_FLAG_PRIV_TRAP_AFTER_INST_MASK = 0x800 +var SQ_WAVE_TRAP_CTRL_ADDR_WATCH_MASK = 0x80 +var SQ_WAVE_TRAP_CTRL_TRAP_AFTER_INST_MASK = 0x200 + +var SQ_WAVE_EXCP_FLAG_PRIV_NON_MASKABLE_EXCP_MASK= SQ_WAVE_EXCP_FLAG_PRIV_MEM_VIOL_MASK |\ + 
SQ_WAVE_EXCP_FLAG_PRIV_ILLEGAL_INST_MASK |\ + SQ_WAVE_EXCP_FLAG_PRIV_HOST_TRAP_MASK |\ + SQ_WAVE_EXCP_FLAG_PRIV_WAVE_START_MASK |\ + SQ_WAVE_EXCP_FLAG_PRIV_WAVE_END_MASK |\ + SQ_WAVE_EXCP_FLAG_PRIV_TRAP_AFTER_INST_MASK +var SQ_WAVE_EXCP_FLAG_PRIV_RESTORE_PART_1_SIZE = SQ_WAVE_EXCP_FLAG_PRIV_SAVE_CONTEXT_SHIFT +var SQ_WAVE_EXCP_FLAG_PRIV_RESTORE_PART_2_SHIFT = SQ_WAVE_EXCP_FLAG_PRIV_ILLEGAL_INST_SHIFT +var SQ_WAVE_EXCP_FLAG_PRIV_RESTORE_PART_2_SIZE = SQ_WAVE_EXCP_FLAG_PRIV_HOST_TRAP_SHIFT - SQ_WAVE_EXCP_FLAG_PRIV_ILLEGAL_INST_SHIFT +var SQ_WAVE_EXCP_FLAG_PRIV_RESTORE_PART_3_SHIFT = SQ_WAVE_EXCP_FLAG_PRIV_WAVE_START_SHIFT +var SQ_WAVE_EXCP_FLAG_PRIV_RESTORE_PART_3_SIZE = 32 - SQ_WAVE_EXCP_FLAG_PRIV_RESTORE_PART_3_SHIFT +var BARRIER_STATE_SIGNAL_OFFSET = 16 +var BARRIER_STATE_VALID_OFFSET = 0 + +var TTMP11_DEBUG_TRAP_ENABLED_SHIFT = 23 +var TTMP11_DEBUG_TRAP_ENABLED_MASK = 0x800000 + +// SQ_SEL_X/Y/Z/W, BUF_NUM_FORMAT_FLOAT, (0 for MUBUF stride[17:14] +// when ADD_TID_ENABLE and BUF_DATA_FORMAT_32 for MTBUF), ADD_TID_ENABLE +var S_SAVE_BUF_RSRC_WORD1_STRIDE = 0x00040000 +var S_SAVE_BUF_RSRC_WORD3_MISC = 0x10807FAC +var S_SAVE_SPI_INIT_FIRST_WAVE_MASK = 0x04000000 +var S_SAVE_SPI_INIT_FIRST_WAVE_SHIFT = 26 + +var S_SAVE_PC_HI_FIRST_WAVE_MASK = 0x80000000 +var S_SAVE_PC_HI_FIRST_WAVE_SHIFT = 31 + +var s_sgpr_save_num = 108 + +var s_save_spi_init_lo = exec_lo +var s_save_spi_init_hi = exec_hi +var s_save_pc_lo = ttmp0 +var s_save_pc_hi = ttmp1 +var s_save_exec_lo = ttmp2 +var s_save_exec_hi = ttmp3 +var s_save_state_priv = ttmp12 +var s_save_excp_flag_priv = ttmp15 +var s_save_xnack_mask = s_save_excp_flag_priv +var s_wave_size = ttmp7 +var s_save_buf_rsrc0 = ttmp8 +var s_save_buf_rsrc1 = ttmp9 +var s_save_buf_rsrc2 = ttmp10 +var s_save_buf_rsrc3 = ttmp11 +var s_save_mem_offset = ttmp4 +var s_save_alloc_size = s_save_excp_flag_priv +var s_save_tmp = ttmp14 +var s_save_m0 = ttmp5 +var s_save_ttmps_lo = s_save_tmp +var s_save_ttmps_hi = s_save_excp_flag_priv + +var 
S_RESTORE_BUF_RSRC_WORD1_STRIDE = S_SAVE_BUF_RSRC_WORD1_STRIDE +var S_RESTORE_BUF_RSRC_WORD3_MISC = S_SAVE_BUF_RSRC_WORD3_MISC + +var S_RESTORE_SPI_INIT_FIRST_WAVE_MASK = 0x04000000 +var S_RESTORE_SPI_INIT_FIRST_WAVE_SHIFT = 26 +var S_WAVE_SIZE = 25 + +var s_restore_spi_init_lo = exec_lo +var s_restore_spi_init_hi = exec_hi +var s_restore_mem_offset = ttmp12 +var s_restore_alloc_size = ttmp3 +var s_restore_tmp = ttmp2 +var s_restore_mem_offset_save = s_restore_tmp +var s_restore_m0 = s_restore_alloc_size +var s_restore_mode = ttmp7 +var s_restore_flat_scratch = s_restore_tmp +var s_restore_pc_lo = ttmp0 +var s_restore_pc_hi = ttmp1 +var s_restore_exec_lo = ttmp4 +var s_restore_exec_hi = ttmp5 +var s_restore_state_priv = ttmp14 +var s_restore_excp_flag_priv = ttmp15 +var s_restore_xnack_mask = ttmp13 +var s_restore_buf_rsrc0 = ttmp8 +var s_restore_buf_rsrc1 = ttmp9 +var s_restore_buf_rsrc2 = ttmp10 +var s_restore_buf_rsrc3 = ttmp11 +var s_restore_size = ttmp6 +var s_restore_ttmps_lo = s_restore_tmp +var s_restore_ttmps_hi = s_restore_alloc_size +var s_restore_spi_init_hi_save = s_restore_exec_hi + +shader main + asic(DEFAULT) + type(CS) + wave_size(32) + + s_branch L_SKIP_RESTORE //NOT restore. might be a regular trap or save + +L_JUMP_TO_RESTORE: + s_branch L_RESTORE + +L_SKIP_RESTORE: + s_getreg_b32 s_save_state_priv, hwreg(HW_REG_WAVE_STATE_PRIV) //save STATUS since we will change SCC + + // Clear SPI_PRIO: do not save with elevated priority. + // Clear ECC_ERR: prevents SQC store and triggers FATAL_HALT if setreg'd. + s_andn2_b32 s_save_state_priv, s_save_state_priv, SQ_WAVE_STATE_PRIV_ALWAYS_CLEAR_MASK + + s_getreg_b32 s_save_excp_flag_priv, hwreg(HW_REG_WAVE_EXCP_FLAG_PRIV) + + s_and_b32 ttmp2, s_save_state_priv, SQ_WAVE_STATE_PRIV_HALT_MASK + s_cbranch_scc0 L_NOT_HALTED + +L_HALTED: + // Host trap may occur while wave is halted. 
+ s_and_b32 ttmp2, s_save_excp_flag_priv, SQ_WAVE_EXCP_FLAG_PRIV_HOST_TRAP_MASK + s_cbranch_scc1 L_FETCH_2ND_TRAP + +L_CHECK_SAVE: + s_and_b32 ttmp2, s_save_excp_flag_priv, SQ_WAVE_EXCP_FLAG_PRIV_SAVE_CONTEXT_MASK + s_cbranch_scc1 L_SAVE + + // Wave is halted but neither host trap nor SAVECTX is raised. + // Caused by instruction fetch memory violation. + // Spin wait until context saved to prevent interrupt storm. + s_sleep 0x10 + s_getreg_b32 s_save_excp_flag_priv, hwreg(HW_REG_WAVE_EXCP_FLAG_PRIV) + s_branch L_CHECK_SAVE + +L_NOT_HALTED: + // Let second-level handle non-SAVECTX exception or trap. + // Any concurrent SAVECTX will be handled upon re-entry once halted. + + // Check non-maskable exceptions. memory_violation, illegal_instruction + // and xnack_error exceptions always cause the wave to enter the trap + // handler. + s_and_b32 ttmp2, s_save_excp_flag_priv, SQ_WAVE_EXCP_FLAG_PRIV_NON_MASKABLE_EXCP_MASK + s_cbranch_scc1 L_FETCH_2ND_TRAP + + // Check for maskable exceptions in trapsts.excp and trapsts.excp_hi. + // Maskable exceptions only cause the wave to enter the trap handler if + // their respective bit in mode.excp_en is set. + s_getreg_b32 ttmp2, hwreg(HW_REG_WAVE_EXCP_FLAG_USER) + s_and_b32 ttmp3, s_save_excp_flag_priv, SQ_WAVE_EXCP_FLAG_PRIV_ADDR_WATCH_MASK + s_cbranch_scc0 L_NOT_ADDR_WATCH + s_or_b32 ttmp2, ttmp2, SQ_WAVE_TRAP_CTRL_ADDR_WATCH_MASK + +L_NOT_ADDR_WATCH: + s_getreg_b32 ttmp3, hwreg(HW_REG_WAVE_TRAP_CTRL) + s_and_b32 ttmp2, ttmp3, ttmp2 + s_cbranch_scc1 L_FETCH_2ND_TRAP + +L_CHECK_TRAP_ID: + // Check trap_id != 0 + s_and_b32 ttmp2, s_save_pc_hi, S_SAVE_PC_HI_TRAP_ID_MASK + s_cbranch_scc1 L_FETCH_2ND_TRAP + +#if SINGLE_STEP_MISSED_WORKAROUND + // Prioritize single step exception over context save. + // Second-level trap will halt wave and RFE, re-entering for SAVECTX. + // WAVE_TRAP_CTRL is already in ttmp3. 
+ s_and_b32 ttmp3, ttmp3, SQ_WAVE_TRAP_CTRL_TRAP_AFTER_INST_MASK + s_cbranch_scc1 L_FETCH_2ND_TRAP +#endif + + s_and_b32 ttmp2, s_save_excp_flag_priv, SQ_WAVE_EXCP_FLAG_PRIV_SAVE_CONTEXT_MASK + s_cbranch_scc1 L_SAVE + +L_FETCH_2ND_TRAP: + // Read second-level TBA/TMA from first-level TMA and jump if available. + // ttmp[2:5] and ttmp12 can be used (others hold SPI-initialized debug data) + // ttmp12 holds SQ_WAVE_STATUS + s_sendmsg_rtn_b64 [ttmp14, ttmp15], sendmsg(MSG_RTN_GET_TMA) + s_wait_idle + s_lshl_b64 [ttmp14, ttmp15], [ttmp14, ttmp15], 0x8 + + s_bitcmp1_b32 ttmp15, 0xF + s_cbranch_scc0 L_NO_SIGN_EXTEND_TMA + s_or_b32 ttmp15, ttmp15, 0xFFFF0000 +L_NO_SIGN_EXTEND_TMA: + + s_load_dword ttmp2, [ttmp14, ttmp15], 0x10 scope:SCOPE_SYS // debug trap enabled flag + s_wait_idle + s_lshl_b32 ttmp2, ttmp2, TTMP11_DEBUG_TRAP_ENABLED_SHIFT + s_andn2_b32 ttmp11, ttmp11, TTMP11_DEBUG_TRAP_ENABLED_MASK + s_or_b32 ttmp11, ttmp11, ttmp2 + + s_load_dwordx2 [ttmp2, ttmp3], [ttmp14, ttmp15], 0x0 scope:SCOPE_SYS // second-level TBA + s_wait_idle + s_load_dwordx2 [ttmp14, ttmp15], [ttmp14, ttmp15], 0x8 scope:SCOPE_SYS // second-level TMA + s_wait_idle + + s_and_b64 [ttmp2, ttmp3], [ttmp2, ttmp3], [ttmp2, ttmp3] + s_cbranch_scc0 L_NO_NEXT_TRAP // second-level trap handler not been set + s_setpc_b64 [ttmp2, ttmp3] // jump to second-level trap handler + +L_NO_NEXT_TRAP: + // If not caused by trap then halt wave to prevent re-entry. + s_and_b32 ttmp2, s_save_pc_hi, S_SAVE_PC_HI_TRAP_ID_MASK + s_cbranch_scc1 L_TRAP_CASE + + // Host trap will not cause trap re-entry. + s_getreg_b32 ttmp2, hwreg(HW_REG_WAVE_EXCP_FLAG_PRIV) + s_and_b32 ttmp2, ttmp2, SQ_WAVE_EXCP_FLAG_PRIV_HOST_TRAP_MASK + s_cbranch_scc1 L_EXIT_TRAP + s_or_b32 s_save_state_priv, s_save_state_priv, SQ_WAVE_STATE_PRIV_HALT_MASK + + // If the PC points to S_ENDPGM then context save will fail if STATE_PRIV.HALT is set. + // Rewind the PC to prevent this from occurring. 
+ s_sub_u32 ttmp0, ttmp0, 0x8 + s_subb_u32 ttmp1, ttmp1, 0x0 + + s_branch L_EXIT_TRAP + +L_TRAP_CASE: + // Advance past trap instruction to prevent re-entry. + s_add_u32 ttmp0, ttmp0, 0x4 + s_addc_u32 ttmp1, ttmp1, 0x0 + +L_EXIT_TRAP: + s_and_b32 ttmp1, ttmp1, 0xFFFF + + // Restore SQ_WAVE_STATUS. + s_and_b64 exec, exec, exec // Restore STATUS.EXECZ, not writable by s_setreg_b32 + s_and_b64 vcc, vcc, vcc // Restore STATUS.VCCZ, not writable by s_setreg_b32 + + // STATE_PRIV.BARRIER_COMPLETE may have changed since we read it. + // Only restore fields which the trap handler changes. + s_lshr_b32 s_save_state_priv, s_save_state_priv, SQ_WAVE_STATE_PRIV_SCC_SHIFT + s_setreg_b32 hwreg(HW_REG_WAVE_STATE_PRIV, SQ_WAVE_STATE_PRIV_SCC_SHIFT, \ + SQ_WAVE_STATE_PRIV_POISON_ERR_SHIFT - SQ_WAVE_STATE_PRIV_SCC_SHIFT + 1), s_save_state_priv + + s_rfe_b64 [ttmp0, ttmp1] + +L_SAVE: + // If VGPRs have been deallocated then terminate the wavefront. + // It has no remaining program to run and cannot save without VGPRs. + s_getreg_b32 s_save_tmp, hwreg(HW_REG_WAVE_STATUS) + s_bitcmp1_b32 s_save_tmp, SQ_WAVE_STATUS_NO_VGPRS_SHIFT + s_cbranch_scc0 L_HAVE_VGPRS + s_endpgm +L_HAVE_VGPRS: + + s_and_b32 s_save_pc_hi, s_save_pc_hi, 0x0000ffff //pc[47:32] + s_mov_b32 s_save_tmp, 0 + s_setreg_b32 hwreg(HW_REG_WAVE_EXCP_FLAG_PRIV, SQ_WAVE_EXCP_FLAG_PRIV_SAVE_CONTEXT_SHIFT, 1), s_save_tmp //clear saveCtx bit + + /* inform SPI the readiness and wait for SPI's go signal */ + s_mov_b32 s_save_exec_lo, exec_lo //save EXEC and use EXEC for the go signal from SPI + s_mov_b32 s_save_exec_hi, exec_hi + s_mov_b64 exec, 0x0 //clear EXEC to get ready to receive + + s_sendmsg_rtn_b64 [exec_lo, exec_hi], sendmsg(MSG_RTN_SAVE_WAVE) + s_wait_idle + + // Save first_wave flag so we can clear high bits of save address. 
+ s_and_b32 s_save_tmp, s_save_spi_init_hi, S_SAVE_SPI_INIT_FIRST_WAVE_MASK + s_lshl_b32 s_save_tmp, s_save_tmp, (S_SAVE_PC_HI_FIRST_WAVE_SHIFT - S_SAVE_SPI_INIT_FIRST_WAVE_SHIFT) + s_or_b32 s_save_pc_hi, s_save_pc_hi, s_save_tmp + + // Trap temporaries must be saved via VGPR but all VGPRs are in use. + // There is no ttmp space to hold the resource constant for VGPR save. + // Save v0 by itself since it requires only two SGPRs. + s_mov_b32 s_save_ttmps_lo, exec_lo + s_and_b32 s_save_ttmps_hi, exec_hi, 0xFFFF + s_mov_b32 exec_lo, 0xFFFFFFFF + s_mov_b32 exec_hi, 0xFFFFFFFF + global_store_dword_addtid v0, [s_save_ttmps_lo, s_save_ttmps_hi] scope:SCOPE_SYS + v_mov_b32 v0, 0x0 + s_mov_b32 exec_lo, s_save_ttmps_lo + s_mov_b32 exec_hi, s_save_ttmps_hi + + // Save trap temporaries 4-11, 13 initialized by SPI debug dispatch logic + // ttmp SR memory offset : size(VGPR)+size(SVGPR)+size(SGPR)+0x40 + get_wave_size2(s_save_ttmps_hi) + get_vgpr_size_bytes(s_save_ttmps_lo, s_save_ttmps_hi) + get_svgpr_size_bytes(s_save_ttmps_hi) + s_add_u32 s_save_ttmps_lo, s_save_ttmps_lo, s_save_ttmps_hi + s_and_b32 s_save_ttmps_hi, s_save_spi_init_hi, 0xFFFF + s_add_u32 s_save_ttmps_lo, s_save_ttmps_lo, get_sgpr_size_bytes() + s_add_u32 s_save_ttmps_lo, s_save_ttmps_lo, s_save_spi_init_lo + s_addc_u32 s_save_ttmps_hi, s_save_ttmps_hi, 0x0 + + v_writelane_b32 v0, ttmp4, 0x4 + v_writelane_b32 v0, ttmp5, 0x5 + v_writelane_b32 v0, ttmp6, 0x6 + v_writelane_b32 v0, ttmp7, 0x7 + v_writelane_b32 v0, ttmp8, 0x8 + v_writelane_b32 v0, ttmp9, 0x9 + v_writelane_b32 v0, ttmp10, 0xA + v_writelane_b32 v0, ttmp11, 0xB + v_writelane_b32 v0, ttmp13, 0xD + v_writelane_b32 v0, exec_lo, 0xE + v_writelane_b32 v0, exec_hi, 0xF + + s_mov_b32 exec_lo, 0x3FFF + s_mov_b32 exec_hi, 0x0 + global_store_dword_addtid v0, [s_save_ttmps_lo, s_save_ttmps_hi] offset:0x40 scope:SCOPE_SYS + v_readlane_b32 ttmp14, v0, 0xE + v_readlane_b32 ttmp15, v0, 0xF + s_mov_b32 exec_lo, ttmp14 + s_mov_b32 exec_hi, ttmp15 + + /* setup Resource 
Contants */ + s_mov_b32 s_save_buf_rsrc0, s_save_spi_init_lo //base_addr_lo + s_and_b32 s_save_buf_rsrc1, s_save_spi_init_hi, 0x0000FFFF //base_addr_hi + s_or_b32 s_save_buf_rsrc1, s_save_buf_rsrc1, S_SAVE_BUF_RSRC_WORD1_STRIDE + s_mov_b32 s_save_buf_rsrc2, 0 //NUM_RECORDS initial value = 0 (in bytes) although not neccessarily inited + s_mov_b32 s_save_buf_rsrc3, S_SAVE_BUF_RSRC_WORD3_MISC + + s_mov_b32 s_save_m0, m0 + + /* global mem offset */ + s_mov_b32 s_save_mem_offset, 0x0 + get_wave_size2(s_wave_size) + + /* save first 4 VGPRs, needed for SGPR save */ + s_mov_b32 exec_lo, 0xFFFFFFFF //need every thread from now on + s_lshr_b32 m0, s_wave_size, S_WAVE_SIZE + s_and_b32 m0, m0, 1 + s_cmp_eq_u32 m0, 1 + s_cbranch_scc1 L_ENABLE_SAVE_4VGPR_EXEC_HI + s_mov_b32 exec_hi, 0x00000000 + s_branch L_SAVE_4VGPR_WAVE32 +L_ENABLE_SAVE_4VGPR_EXEC_HI: + s_mov_b32 exec_hi, 0xFFFFFFFF + s_branch L_SAVE_4VGPR_WAVE64 +L_SAVE_4VGPR_WAVE32: + s_mov_b32 s_save_buf_rsrc2, 0x1000000 //NUM_RECORDS in bytes + + // VGPR Allocated in 4-GPR granularity + + buffer_store_dword v1, v0, s_save_buf_rsrc0, s_save_mem_offset scope:SCOPE_SYS offset:128 + buffer_store_dword v2, v0, s_save_buf_rsrc0, s_save_mem_offset scope:SCOPE_SYS offset:128*2 + buffer_store_dword v3, v0, s_save_buf_rsrc0, s_save_mem_offset scope:SCOPE_SYS offset:128*3 + s_branch L_SAVE_HWREG + +L_SAVE_4VGPR_WAVE64: + s_mov_b32 s_save_buf_rsrc2, 0x1000000 //NUM_RECORDS in bytes + + // VGPR Allocated in 4-GPR granularity + + buffer_store_dword v1, v0, s_save_buf_rsrc0, s_save_mem_offset scope:SCOPE_SYS offset:256 + buffer_store_dword v2, v0, s_save_buf_rsrc0, s_save_mem_offset scope:SCOPE_SYS offset:256*2 + buffer_store_dword v3, v0, s_save_buf_rsrc0, s_save_mem_offset scope:SCOPE_SYS offset:256*3 + + /* save HW registers */ + +L_SAVE_HWREG: + // HWREG SR memory offset : size(VGPR)+size(SVGPR)+size(SGPR) + get_vgpr_size_bytes(s_save_mem_offset, s_wave_size) + get_svgpr_size_bytes(s_save_tmp) + s_add_u32 s_save_mem_offset, 
s_save_mem_offset, s_save_tmp + s_add_u32 s_save_mem_offset, s_save_mem_offset, get_sgpr_size_bytes() + + s_mov_b32 s_save_buf_rsrc2, 0x1000000 //NUM_RECORDS in bytes + + v_mov_b32 v0, 0x0 //Offset[31:0] from buffer resource + v_mov_b32 v1, 0x0 //Offset[63:32] from buffer resource + v_mov_b32 v2, 0x0 //Set of SGPRs for TCP store + s_mov_b32 m0, 0x0 //Next lane of v2 to write to + + // Ensure no further changes to barrier or LDS state. + // STATE_PRIV.BARRIER_COMPLETE may change up to this point. + s_barrier_signal -2 + s_barrier_wait -2 + + // Re-read final state of BARRIER_COMPLETE field for save. + s_getreg_b32 s_save_tmp, hwreg(HW_REG_WAVE_STATE_PRIV) + s_and_b32 s_save_tmp, s_save_tmp, SQ_WAVE_STATE_PRIV_BARRIER_COMPLETE_MASK + s_andn2_b32 s_save_state_priv, s_save_state_priv, SQ_WAVE_STATE_PRIV_BARRIER_COMPLETE_MASK + s_or_b32 s_save_state_priv, s_save_state_priv, s_save_tmp + + write_hwreg_to_v2(s_save_m0) + write_hwreg_to_v2(s_save_pc_lo) + s_andn2_b32 s_save_tmp, s_save_pc_hi, S_SAVE_PC_HI_FIRST_WAVE_MASK + write_hwreg_to_v2(s_save_tmp) + write_hwreg_to_v2(s_save_exec_lo) + write_hwreg_to_v2(s_save_exec_hi) + write_hwreg_to_v2(s_save_state_priv) + + s_getreg_b32 s_save_tmp, hwreg(HW_REG_WAVE_EXCP_FLAG_PRIV) + write_hwreg_to_v2(s_save_tmp) + + write_hwreg_to_v2(s_save_xnack_mask) + + s_getreg_b32 s_save_m0, hwreg(HW_REG_WAVE_MODE) + write_hwreg_to_v2(s_save_m0) + + s_getreg_b32 s_save_m0, hwreg(HW_REG_WAVE_SCRATCH_BASE_LO) + write_hwreg_to_v2(s_save_m0) + + s_getreg_b32 s_save_m0, hwreg(HW_REG_WAVE_SCRATCH_BASE_HI) + write_hwreg_to_v2(s_save_m0) + + s_getreg_b32 s_save_m0, hwreg(HW_REG_WAVE_EXCP_FLAG_USER) + write_hwreg_to_v2(s_save_m0) + + s_getreg_b32 s_save_m0, hwreg(HW_REG_WAVE_TRAP_CTRL) + write_hwreg_to_v2(s_save_m0) + + s_getreg_b32 s_save_tmp, hwreg(HW_REG_WAVE_STATUS) + write_hwreg_to_v2(s_save_tmp) + + s_get_barrier_state s_save_tmp, -1 + s_wait_kmcnt (0) + write_hwreg_to_v2(s_save_tmp) + + // Write HWREGs with 16 VGPR lanes. 
TTMPs occupy space after this. + s_mov_b32 exec_lo, 0xFFFF + s_mov_b32 exec_hi, 0x0 + buffer_store_dword v2, v0, s_save_buf_rsrc0, s_save_mem_offset scope:SCOPE_SYS + + // Write SGPRs with 32 VGPR lanes. This works in wave32 and wave64 mode. + s_mov_b32 exec_lo, 0xFFFFFFFF + + /* save SGPRs */ + // Save SGPR before LDS save, then the s0 to s4 can be used during LDS save... + + // SGPR SR memory offset : size(VGPR)+size(SVGPR) + get_vgpr_size_bytes(s_save_mem_offset, s_wave_size) + get_svgpr_size_bytes(s_save_tmp) + s_add_u32 s_save_mem_offset, s_save_mem_offset, s_save_tmp + s_mov_b32 s_save_buf_rsrc2, 0x1000000 //NUM_RECORDS in bytes + + s_mov_b32 ttmp13, 0x0 //next VGPR lane to copy SGPR into + + s_mov_b32 m0, 0x0 //SGPR initial index value =0 + s_nop 0x0 //Manually inserted wait states +L_SAVE_SGPR_LOOP: + // SGPR is allocated in 16 SGPR granularity + s_movrels_b64 s0, s0 //s0 = s[0+m0], s1 = s[1+m0] + s_movrels_b64 s2, s2 //s2 = s[2+m0], s3 = s[3+m0] + s_movrels_b64 s4, s4 //s4 = s[4+m0], s5 = s[5+m0] + s_movrels_b64 s6, s6 //s6 = s[6+m0], s7 = s[7+m0] + s_movrels_b64 s8, s8 //s8 = s[8+m0], s9 = s[9+m0] + s_movrels_b64 s10, s10 //s10 = s[10+m0], s11 = s[11+m0] + s_movrels_b64 s12, s12 //s12 = s[12+m0], s13 = s[13+m0] + s_movrels_b64 s14, s14 //s14 = s[14+m0], s15 = s[15+m0] + + write_16sgpr_to_v2(s0) + + s_cmp_eq_u32 ttmp13, 0x20 //have 32 VGPR lanes filled? + s_cbranch_scc0 L_SAVE_SGPR_SKIP_TCP_STORE + + buffer_store_dword v2, v0, s_save_buf_rsrc0, s_save_mem_offset scope:SCOPE_SYS + s_add_u32 s_save_mem_offset, s_save_mem_offset, 0x80 + s_mov_b32 ttmp13, 0x0 + v_mov_b32 v2, 0x0 +L_SAVE_SGPR_SKIP_TCP_STORE: + + s_add_u32 m0, m0, 16 //next sgpr index + s_cmp_lt_u32 m0, 96 //scc = (m0 < first 96 SGPR) ? 1 : 0 + s_cbranch_scc1 L_SAVE_SGPR_LOOP //first 96 SGPR save is complete? 
+ + //save the rest 12 SGPR + s_movrels_b64 s0, s0 //s0 = s[0+m0], s1 = s[1+m0] + s_movrels_b64 s2, s2 //s2 = s[2+m0], s3 = s[3+m0] + s_movrels_b64 s4, s4 //s4 = s[4+m0], s5 = s[5+m0] + s_movrels_b64 s6, s6 //s6 = s[6+m0], s7 = s[7+m0] + s_movrels_b64 s8, s8 //s8 = s[8+m0], s9 = s[9+m0] + s_movrels_b64 s10, s10 //s10 = s[10+m0], s11 = s[11+m0] + write_12sgpr_to_v2(s0) + + buffer_store_dword v2, v0, s_save_buf_rsrc0, s_save_mem_offset scope:SCOPE_SYS + + /* save LDS */ + +L_SAVE_LDS: + // Change EXEC to all threads... + s_mov_b32 exec_lo, 0xFFFFFFFF //need every thread from now on + s_lshr_b32 m0, s_wave_size, S_WAVE_SIZE + s_and_b32 m0, m0, 1 + s_cmp_eq_u32 m0, 1 + s_cbranch_scc1 L_ENABLE_SAVE_LDS_EXEC_HI + s_mov_b32 exec_hi, 0x00000000 + s_branch L_SAVE_LDS_NORMAL +L_ENABLE_SAVE_LDS_EXEC_HI: + s_mov_b32 exec_hi, 0xFFFFFFFF +L_SAVE_LDS_NORMAL: + s_getreg_b32 s_save_alloc_size, hwreg(HW_REG_WAVE_LDS_ALLOC,SQ_WAVE_LDS_ALLOC_LDS_SIZE_SHIFT,SQ_WAVE_LDS_ALLOC_LDS_SIZE_SIZE) + s_and_b32 s_save_alloc_size, s_save_alloc_size, 0xFFFFFFFF //lds_size is zero? + s_cbranch_scc0 L_SAVE_LDS_DONE //no lds used? 
jump to L_SAVE_DONE + + s_and_b32 s_save_tmp, s_save_pc_hi, S_SAVE_PC_HI_FIRST_WAVE_MASK + s_cbranch_scc0 L_SAVE_LDS_DONE + + // first wave do LDS save; + + s_lshl_b32 s_save_alloc_size, s_save_alloc_size, SQ_WAVE_LDS_ALLOC_GRANULARITY + s_mov_b32 s_save_buf_rsrc2, s_save_alloc_size //NUM_RECORDS in bytes + + // LDS at offset: size(VGPR)+size(SVGPR)+SIZE(SGPR)+SIZE(HWREG) + // + get_vgpr_size_bytes(s_save_mem_offset, s_wave_size) + get_svgpr_size_bytes(s_save_tmp) + s_add_u32 s_save_mem_offset, s_save_mem_offset, s_save_tmp + s_add_u32 s_save_mem_offset, s_save_mem_offset, get_sgpr_size_bytes() + s_add_u32 s_save_mem_offset, s_save_mem_offset, get_hwreg_size_bytes() + + s_mov_b32 s_save_buf_rsrc2, 0x1000000 //NUM_RECORDS in bytes + + //load 0~63*4(byte address) to vgpr v0 + v_mbcnt_lo_u32_b32 v0, -1, 0 + v_mbcnt_hi_u32_b32 v0, -1, v0 + v_mul_u32_u24 v0, 4, v0 + + s_lshr_b32 m0, s_wave_size, S_WAVE_SIZE + s_and_b32 m0, m0, 1 + s_cmp_eq_u32 m0, 1 + s_mov_b32 m0, 0x0 + s_cbranch_scc1 L_SAVE_LDS_W64 + +L_SAVE_LDS_W32: + s_mov_b32 s3, 128 + s_nop 0 + s_nop 0 + s_nop 0 +L_SAVE_LDS_LOOP_W32: + ds_read_b32 v1, v0 + s_wait_idle + buffer_store_dword v1, v0, s_save_buf_rsrc0, s_save_mem_offset scope:SCOPE_SYS + + s_add_u32 m0, m0, s3 //every buffer_store_lds does 128 bytes + s_add_u32 s_save_mem_offset, s_save_mem_offset, s3 + v_add_nc_u32 v0, v0, 128 //mem offset increased by 128 bytes + s_cmp_lt_u32 m0, s_save_alloc_size //scc=(m0 < s_save_alloc_size) ? 1 : 0 + s_cbranch_scc1 L_SAVE_LDS_LOOP_W32 //LDS save is complete? 
+ + s_branch L_SAVE_LDS_DONE + +L_SAVE_LDS_W64: + s_mov_b32 s3, 256 + s_nop 0 + s_nop 0 + s_nop 0 +L_SAVE_LDS_LOOP_W64: + ds_read_b32 v1, v0 + s_wait_idle + buffer_store_dword v1, v0, s_save_buf_rsrc0, s_save_mem_offset scope:SCOPE_SYS + + s_add_u32 m0, m0, s3 //every buffer_store_lds does 256 bytes + s_add_u32 s_save_mem_offset, s_save_mem_offset, s3 + v_add_nc_u32 v0, v0, 256 //mem offset increased by 256 bytes + s_cmp_lt_u32 m0, s_save_alloc_size //scc=(m0 < s_save_alloc_size) ? 1 : 0 + s_cbranch_scc1 L_SAVE_LDS_LOOP_W64 //LDS save is complete? + +L_SAVE_LDS_DONE: + /* save VGPRs - set the Rest VGPRs */ +L_SAVE_VGPR: + // VGPR SR memory offset: 0 + s_mov_b32 exec_lo, 0xFFFFFFFF //need every thread from now on + s_lshr_b32 m0, s_wave_size, S_WAVE_SIZE + s_and_b32 m0, m0, 1 + s_cmp_eq_u32 m0, 1 + s_cbranch_scc1 L_ENABLE_SAVE_VGPR_EXEC_HI + s_mov_b32 s_save_mem_offset, (0+128*4) // for the rest VGPRs + s_mov_b32 exec_hi, 0x00000000 + s_branch L_SAVE_VGPR_NORMAL +L_ENABLE_SAVE_VGPR_EXEC_HI: + s_mov_b32 s_save_mem_offset, (0+256*4) // for the rest VGPRs + s_mov_b32 exec_hi, 0xFFFFFFFF +L_SAVE_VGPR_NORMAL: + s_getreg_b32 s_save_alloc_size, hwreg(HW_REG_WAVE_GPR_ALLOC,SQ_WAVE_GPR_ALLOC_VGPR_SIZE_SHIFT,SQ_WAVE_GPR_ALLOC_VGPR_SIZE_SIZE) + s_add_u32 s_save_alloc_size, s_save_alloc_size, 1 + s_lshl_b32 s_save_alloc_size, s_save_alloc_size, 2 //Number of VGPRs = (vgpr_size + 1) * 4 (non-zero value) + //determine it is wave32 or wave64 + s_lshr_b32 m0, s_wave_size, S_WAVE_SIZE + s_and_b32 m0, m0, 1 + s_cmp_eq_u32 m0, 1 + s_cbranch_scc1 L_SAVE_VGPR_WAVE64 + + s_mov_b32 s_save_buf_rsrc2, 0x1000000 //NUM_RECORDS in bytes + + // VGPR Allocated in 4-GPR granularity + + // VGPR store using dw burst + s_mov_b32 m0, 0x4 //VGPR initial index value =4 + s_cmp_lt_u32 m0, s_save_alloc_size + s_cbranch_scc0 L_SAVE_VGPR_END + +L_SAVE_VGPR_W32_LOOP: + v_movrels_b32 v0, v0 //v0 = v[0+m0] + v_movrels_b32 v1, v1 //v1 = v[1+m0] + v_movrels_b32 v2, v2 //v2 = v[2+m0] + v_movrels_b32 v3, v3 //v3 
= v[3+m0] + + buffer_store_dword v0, v0, s_save_buf_rsrc0, s_save_mem_offset scope:SCOPE_SYS + buffer_store_dword v1, v0, s_save_buf_rsrc0, s_save_mem_offset scope:SCOPE_SYS offset:128 + buffer_store_dword v2, v0, s_save_buf_rsrc0, s_save_mem_offset scope:SCOPE_SYS offset:128*2 + buffer_store_dword v3, v0, s_save_buf_rsrc0, s_save_mem_offset scope:SCOPE_SYS offset:128*3 + + s_add_u32 m0, m0, 4 //next vgpr index + s_add_u32 s_save_mem_offset, s_save_mem_offset, 128*4 //every buffer_store_dword does 128 bytes + s_cmp_lt_u32 m0, s_save_alloc_size //scc = (m0 < s_save_alloc_size) ? 1 : 0 + s_cbranch_scc1 L_SAVE_VGPR_W32_LOOP //VGPR save is complete? + + s_branch L_SAVE_VGPR_END + +L_SAVE_VGPR_WAVE64: + s_mov_b32 s_save_buf_rsrc2, 0x1000000 //NUM_RECORDS in bytes + + // VGPR store using dw burst + s_mov_b32 m0, 0x4 //VGPR initial index value =4 + s_cmp_lt_u32 m0, s_save_alloc_size + s_cbranch_scc0 L_SAVE_SHARED_VGPR + +L_SAVE_VGPR_W64_LOOP: + v_movrels_b32 v0, v0 //v0 = v[0+m0] + v_movrels_b32 v1, v1 //v1 = v[1+m0] + v_movrels_b32 v2, v2 //v2 = v[2+m0] + v_movrels_b32 v3, v3 //v3 = v[3+m0] + + buffer_store_dword v0, v0, s_save_buf_rsrc0, s_save_mem_offset scope:SCOPE_SYS + buffer_store_dword v1, v0, s_save_buf_rsrc0, s_save_mem_offset scope:SCOPE_SYS offset:256 + buffer_store_dword v2, v0, s_save_buf_rsrc0, s_save_mem_offset scope:SCOPE_SYS offset:256*2 + buffer_store_dword v3, v0, s_save_buf_rsrc0, s_save_mem_offset scope:SCOPE_SYS offset:256*3 + + s_add_u32 m0, m0, 4 //next vgpr index + s_add_u32 s_save_mem_offset, s_save_mem_offset, 256*4 //every buffer_store_dword does 256 bytes + s_cmp_lt_u32 m0, s_save_alloc_size //scc = (m0 < s_save_alloc_size) ? 1 : 0 + s_cbranch_scc1 L_SAVE_VGPR_W64_LOOP //VGPR save is complete? 
+ +L_SAVE_SHARED_VGPR: + s_getreg_b32 s_save_alloc_size, hwreg(HW_REG_WAVE_LDS_ALLOC,SQ_WAVE_LDS_ALLOC_VGPR_SHARED_SIZE_SHIFT,SQ_WAVE_LDS_ALLOC_VGPR_SHARED_SIZE_SIZE) + s_and_b32 s_save_alloc_size, s_save_alloc_size, 0xFFFFFFFF //shared_vgpr_size is zero? + s_cbranch_scc0 L_SAVE_VGPR_END //no shared_vgpr used? jump to L_SAVE_LDS + s_lshl_b32 s_save_alloc_size, s_save_alloc_size, 3 //Number of SHARED_VGPRs = shared_vgpr_size * 8 (non-zero value) + //m0 now has the value of normal vgpr count, just add the m0 with shared_vgpr count to get the total count. + //save shared_vgpr will start from the index of m0 + s_add_u32 s_save_alloc_size, s_save_alloc_size, m0 + s_mov_b32 exec_lo, 0xFFFFFFFF + s_mov_b32 exec_hi, 0x00000000 + +L_SAVE_SHARED_VGPR_WAVE64_LOOP: + v_movrels_b32 v0, v0 //v0 = v[0+m0] + buffer_store_dword v0, v0, s_save_buf_rsrc0, s_save_mem_offset scope:SCOPE_SYS + s_add_u32 m0, m0, 1 //next vgpr index + s_add_u32 s_save_mem_offset, s_save_mem_offset, 128 + s_cmp_lt_u32 m0, s_save_alloc_size //scc = (m0 < s_save_alloc_size) ? 1 : 0 + s_cbranch_scc1 L_SAVE_SHARED_VGPR_WAVE64_LOOP //SHARED_VGPR save is complete? + +L_SAVE_VGPR_END: + s_branch L_END_PGM + +L_RESTORE: + /* Setup Resource Contants */ + s_mov_b32 s_restore_buf_rsrc0, s_restore_spi_init_lo //base_addr_lo + s_and_b32 s_restore_buf_rsrc1, s_restore_spi_init_hi, 0x0000FFFF //base_addr_hi + s_or_b32 s_restore_buf_rsrc1, s_restore_buf_rsrc1, S_RESTORE_BUF_RSRC_WORD1_STRIDE + s_mov_b32 s_restore_buf_rsrc2, 0 //NUM_RECORDS initial value = 0 (in bytes) + s_mov_b32 s_restore_buf_rsrc3, S_RESTORE_BUF_RSRC_WORD3_MISC + + // Save s_restore_spi_init_hi for later use. 
+ s_mov_b32 s_restore_spi_init_hi_save, s_restore_spi_init_hi + + //determine it is wave32 or wave64 + get_wave_size2(s_restore_size) + + s_and_b32 s_restore_tmp, s_restore_spi_init_hi, S_RESTORE_SPI_INIT_FIRST_WAVE_MASK + s_cbranch_scc0 L_RESTORE_VGPR + + /* restore LDS */ +L_RESTORE_LDS: + s_mov_b32 exec_lo, 0xFFFFFFFF //need every thread from now on + s_lshr_b32 m0, s_restore_size, S_WAVE_SIZE + s_and_b32 m0, m0, 1 + s_cmp_eq_u32 m0, 1 + s_cbranch_scc1 L_ENABLE_RESTORE_LDS_EXEC_HI + s_mov_b32 exec_hi, 0x00000000 + s_branch L_RESTORE_LDS_NORMAL +L_ENABLE_RESTORE_LDS_EXEC_HI: + s_mov_b32 exec_hi, 0xFFFFFFFF +L_RESTORE_LDS_NORMAL: + s_getreg_b32 s_restore_alloc_size, hwreg(HW_REG_WAVE_LDS_ALLOC,SQ_WAVE_LDS_ALLOC_LDS_SIZE_SHIFT,SQ_WAVE_LDS_ALLOC_LDS_SIZE_SIZE) + s_and_b32 s_restore_alloc_size, s_restore_alloc_size, 0xFFFFFFFF //lds_size is zero? + s_cbranch_scc0 L_RESTORE_VGPR //no lds used? jump to L_RESTORE_VGPR + s_lshl_b32 s_restore_alloc_size, s_restore_alloc_size, SQ_WAVE_LDS_ALLOC_GRANULARITY + s_mov_b32 s_restore_buf_rsrc2, s_restore_alloc_size //NUM_RECORDS in bytes + + // LDS at offset: size(VGPR)+size(SVGPR)+SIZE(SGPR)+SIZE(HWREG) + // + get_vgpr_size_bytes(s_restore_mem_offset, s_restore_size) + get_svgpr_size_bytes(s_restore_tmp) + s_add_u32 s_restore_mem_offset, s_restore_mem_offset, s_restore_tmp + s_add_u32 s_restore_mem_offset, s_restore_mem_offset, get_sgpr_size_bytes() + s_add_u32 s_restore_mem_offset, s_restore_mem_offset, get_hwreg_size_bytes() + + s_mov_b32 s_restore_buf_rsrc2, 0x1000000 //NUM_RECORDS in bytes + + s_lshr_b32 m0, s_restore_size, S_WAVE_SIZE + s_and_b32 m0, m0, 1 + s_cmp_eq_u32 m0, 1 + s_mov_b32 m0, 0x0 + s_cbranch_scc1 L_RESTORE_LDS_LOOP_W64 + +L_RESTORE_LDS_LOOP_W32: + buffer_load_dword v0, v0, s_restore_buf_rsrc0, s_restore_mem_offset + s_wait_idle + ds_store_addtid_b32 v0 + s_add_u32 m0, m0, 128 // 128 DW + s_add_u32 s_restore_mem_offset, s_restore_mem_offset, 128 //mem offset increased by 128DW + s_cmp_lt_u32 m0, 
s_restore_alloc_size //scc=(m0 < s_restore_alloc_size) ? 1 : 0 + s_cbranch_scc1 L_RESTORE_LDS_LOOP_W32 //LDS restore is complete? + s_branch L_RESTORE_VGPR + +L_RESTORE_LDS_LOOP_W64: + buffer_load_dword v0, v0, s_restore_buf_rsrc0, s_restore_mem_offset + s_wait_idle + ds_store_addtid_b32 v0 + s_add_u32 m0, m0, 256 // 256 DW + s_add_u32 s_restore_mem_offset, s_restore_mem_offset, 256 //mem offset increased by 256DW + s_cmp_lt_u32 m0, s_restore_alloc_size //scc=(m0 < s_restore_alloc_size) ? 1 : 0 + s_cbranch_scc1 L_RESTORE_LDS_LOOP_W64 //LDS restore is complete? + + /* restore VGPRs */ +L_RESTORE_VGPR: + // VGPR SR memory offset : 0 + s_mov_b32 s_restore_mem_offset, 0x0 + s_mov_b32 exec_lo, 0xFFFFFFFF //need every thread from now on + s_lshr_b32 m0, s_restore_size, S_WAVE_SIZE + s_and_b32 m0, m0, 1 + s_cmp_eq_u32 m0, 1 + s_cbranch_scc1 L_ENABLE_RESTORE_VGPR_EXEC_HI + s_mov_b32 exec_hi, 0x00000000 + s_branch L_RESTORE_VGPR_NORMAL +L_ENABLE_RESTORE_VGPR_EXEC_HI: + s_mov_b32 exec_hi, 0xFFFFFFFF +L_RESTORE_VGPR_NORMAL: + s_getreg_b32 s_restore_alloc_size, hwreg(HW_REG_WAVE_GPR_ALLOC,SQ_WAVE_GPR_ALLOC_VGPR_SIZE_SHIFT,SQ_WAVE_GPR_ALLOC_VGPR_SIZE_SIZE) + s_add_u32 s_restore_alloc_size, s_restore_alloc_size, 1 + s_lshl_b32 s_restore_alloc_size, s_restore_alloc_size, 2 //Number of VGPRs = (vgpr_size + 1) * 4 (non-zero value) + //determine it is wave32 or wave64 + s_lshr_b32 m0, s_restore_size, S_WAVE_SIZE + s_and_b32 m0, m0, 1 + s_cmp_eq_u32 m0, 1 + s_cbranch_scc1 L_RESTORE_VGPR_WAVE64 + + s_mov_b32 s_restore_buf_rsrc2, 0x1000000 //NUM_RECORDS in bytes + + // VGPR load using dw burst + s_mov_b32 s_restore_mem_offset_save, s_restore_mem_offset // restore start with v1, v0 will be the last + s_add_u32 s_restore_mem_offset, s_restore_mem_offset, 128*4 + s_mov_b32 m0, 4 //VGPR initial index value = 4 + s_cmp_lt_u32 m0, s_restore_alloc_size + s_cbranch_scc0 L_RESTORE_SGPR + +L_RESTORE_VGPR_WAVE32_LOOP: + buffer_load_dword v0, v0, s_restore_buf_rsrc0, s_restore_mem_offset 
scope:SCOPE_SYS + buffer_load_dword v1, v0, s_restore_buf_rsrc0, s_restore_mem_offset scope:SCOPE_SYS offset:128 + buffer_load_dword v2, v0, s_restore_buf_rsrc0, s_restore_mem_offset scope:SCOPE_SYS offset:128*2 + buffer_load_dword v3, v0, s_restore_buf_rsrc0, s_restore_mem_offset scope:SCOPE_SYS offset:128*3 + s_wait_idle + v_movreld_b32 v0, v0 //v[0+m0] = v0 + v_movreld_b32 v1, v1 + v_movreld_b32 v2, v2 + v_movreld_b32 v3, v3 + s_add_u32 m0, m0, 4 //next vgpr index + s_add_u32 s_restore_mem_offset, s_restore_mem_offset, 128*4 //every buffer_load_dword does 128 bytes + s_cmp_lt_u32 m0, s_restore_alloc_size //scc = (m0 < s_restore_alloc_size) ? 1 : 0 + s_cbranch_scc1 L_RESTORE_VGPR_WAVE32_LOOP //VGPR restore (except v0) is complete? + + /* VGPR restore on v0 */ + buffer_load_dword v0, v0, s_restore_buf_rsrc0, s_restore_mem_offset_save scope:SCOPE_SYS + buffer_load_dword v1, v0, s_restore_buf_rsrc0, s_restore_mem_offset_save scope:SCOPE_SYS offset:128 + buffer_load_dword v2, v0, s_restore_buf_rsrc0, s_restore_mem_offset_save scope:SCOPE_SYS offset:128*2 + buffer_load_dword v3, v0, s_restore_buf_rsrc0, s_restore_mem_offset_save scope:SCOPE_SYS offset:128*3 + s_wait_idle + + s_branch L_RESTORE_SGPR + +L_RESTORE_VGPR_WAVE64: + s_mov_b32 s_restore_buf_rsrc2, 0x1000000 //NUM_RECORDS in bytes + + // VGPR load using dw burst + s_mov_b32 s_restore_mem_offset_save, s_restore_mem_offset // restore start with v4, v0 will be the last + s_add_u32 s_restore_mem_offset, s_restore_mem_offset, 256*4 + s_mov_b32 m0, 4 //VGPR initial index value = 4 + s_cmp_lt_u32 m0, s_restore_alloc_size + s_cbranch_scc0 L_RESTORE_SHARED_VGPR + +L_RESTORE_VGPR_WAVE64_LOOP: + buffer_load_dword v0, v0, s_restore_buf_rsrc0, s_restore_mem_offset scope:SCOPE_SYS + buffer_load_dword v1, v0, s_restore_buf_rsrc0, s_restore_mem_offset scope:SCOPE_SYS offset:256 + buffer_load_dword v2, v0, s_restore_buf_rsrc0, s_restore_mem_offset scope:SCOPE_SYS offset:256*2 + buffer_load_dword v3, v0, s_restore_buf_rsrc0, 
s_restore_mem_offset scope:SCOPE_SYS offset:256*3 + s_wait_idle + v_movreld_b32 v0, v0 //v[0+m0] = v0 + v_movreld_b32 v1, v1 + v_movreld_b32 v2, v2 + v_movreld_b32 v3, v3 + s_add_u32 m0, m0, 4 //next vgpr index + s_add_u32 s_restore_mem_offset, s_restore_mem_offset, 256*4 //every buffer_load_dword does 256 bytes + s_cmp_lt_u32 m0, s_restore_alloc_size //scc = (m0 < s_restore_alloc_size) ? 1 : 0 + s_cbranch_scc1 L_RESTORE_VGPR_WAVE64_LOOP //VGPR restore (except v0) is complete? + +L_RESTORE_SHARED_VGPR: + s_getreg_b32 s_restore_alloc_size, hwreg(HW_REG_WAVE_LDS_ALLOC,SQ_WAVE_LDS_ALLOC_VGPR_SHARED_SIZE_SHIFT,SQ_WAVE_LDS_ALLOC_VGPR_SHARED_SIZE_SIZE) //shared_vgpr_size + s_and_b32 s_restore_alloc_size, s_restore_alloc_size, 0xFFFFFFFF //shared_vgpr_size is zero? + s_cbranch_scc0 L_RESTORE_V0 //no shared_vgpr used? + s_lshl_b32 s_restore_alloc_size, s_restore_alloc_size, 3 //Number of SHARED_VGPRs = shared_vgpr_size * 8 (non-zero value) + //m0 now has the value of normal vgpr count, just add the m0 with shared_vgpr count to get the total count. + //restore shared_vgpr will start from the index of m0 + s_add_u32 s_restore_alloc_size, s_restore_alloc_size, m0 + s_mov_b32 exec_lo, 0xFFFFFFFF + s_mov_b32 exec_hi, 0x00000000 +L_RESTORE_SHARED_VGPR_WAVE64_LOOP: + buffer_load_dword v0, v0, s_restore_buf_rsrc0, s_restore_mem_offset scope:SCOPE_SYS + s_wait_idle + v_movreld_b32 v0, v0 //v[0+m0] = v0 + s_add_u32 m0, m0, 1 //next vgpr index + s_add_u32 s_restore_mem_offset, s_restore_mem_offset, 128 + s_cmp_lt_u32 m0, s_restore_alloc_size //scc = (m0 < s_restore_alloc_size) ? 1 : 0 + s_cbranch_scc1 L_RESTORE_SHARED_VGPR_WAVE64_LOOP //VGPR restore (except v0) is complete? + + s_mov_b32 exec_hi, 0xFFFFFFFF //restore back exec_hi before restoring V0!! 
+ + /* VGPR restore on v0 */ +L_RESTORE_V0: + buffer_load_dword v0, v0, s_restore_buf_rsrc0, s_restore_mem_offset_save scope:SCOPE_SYS + buffer_load_dword v1, v0, s_restore_buf_rsrc0, s_restore_mem_offset_save scope:SCOPE_SYS offset:256 + buffer_load_dword v2, v0, s_restore_buf_rsrc0, s_restore_mem_offset_save scope:SCOPE_SYS offset:256*2 + buffer_load_dword v3, v0, s_restore_buf_rsrc0, s_restore_mem_offset_save scope:SCOPE_SYS offset:256*3 + s_wait_idle + + /* restore SGPRs */ + //will be 2+8+16*6 + // SGPR SR memory offset : size(VGPR)+size(SVGPR) +L_RESTORE_SGPR: + get_vgpr_size_bytes(s_restore_mem_offset, s_restore_size) + get_svgpr_size_bytes(s_restore_tmp) + s_add_u32 s_restore_mem_offset, s_restore_mem_offset, s_restore_tmp + s_add_u32 s_restore_mem_offset, s_restore_mem_offset, get_sgpr_size_bytes() + s_sub_u32 s_restore_mem_offset, s_restore_mem_offset, 20*4 //s108~s127 is not saved + + s_mov_b32 s_restore_buf_rsrc2, 0x1000000 //NUM_RECORDS in bytes + + s_mov_b32 m0, s_sgpr_save_num + + read_4sgpr_from_mem(s0, s_restore_buf_rsrc0, s_restore_mem_offset) + s_wait_idle + + s_sub_u32 m0, m0, 4 // Restore from S[0] to S[104] + s_nop 0 // hazard SALU M0=> S_MOVREL + + s_movreld_b64 s0, s0 //s[0+m0] = s0 + s_movreld_b64 s2, s2 + + read_8sgpr_from_mem(s0, s_restore_buf_rsrc0, s_restore_mem_offset) + s_wait_idle + + s_sub_u32 m0, m0, 8 // Restore from S[0] to S[96] + s_nop 0 // hazard SALU M0=> S_MOVREL + + s_movreld_b64 s0, s0 //s[0+m0] = s0 + s_movreld_b64 s2, s2 + s_movreld_b64 s4, s4 + s_movreld_b64 s6, s6 + + L_RESTORE_SGPR_LOOP: + read_16sgpr_from_mem(s0, s_restore_buf_rsrc0, s_restore_mem_offset) + s_wait_idle + + s_sub_u32 m0, m0, 16 // Restore from S[n] to S[0] + s_nop 0 // hazard SALU M0=> S_MOVREL + + s_movreld_b64 s0, s0 //s[0+m0] = s0 + s_movreld_b64 s2, s2 + s_movreld_b64 s4, s4 + s_movreld_b64 s6, s6 + s_movreld_b64 s8, s8 + s_movreld_b64 s10, s10 + s_movreld_b64 s12, s12 + s_movreld_b64 s14, s14 + + s_cmp_eq_u32 m0, 0 //scc = (m0 < s_sgpr_save_num) 
? 1 : 0 + s_cbranch_scc0 L_RESTORE_SGPR_LOOP + + // s_barrier with STATE_PRIV.TRAP_AFTER_INST=1, STATUS.PRIV=1 incorrectly asserts debug exception. + // Clear DEBUG_EN before and restore MODE after the barrier. + s_setreg_imm32_b32 hwreg(HW_REG_WAVE_MODE), 0 + + /* restore HW registers */ +L_RESTORE_HWREG: + // HWREG SR memory offset : size(VGPR)+size(SVGPR)+size(SGPR) + get_vgpr_size_bytes(s_restore_mem_offset, s_restore_size) + get_svgpr_size_bytes(s_restore_tmp) + s_add_u32 s_restore_mem_offset, s_restore_mem_offset, s_restore_tmp + s_add_u32 s_restore_mem_offset, s_restore_mem_offset, get_sgpr_size_bytes() + + s_mov_b32 s_restore_buf_rsrc2, 0x1000000 //NUM_RECORDS in bytes + + // Restore s_restore_spi_init_hi before the saved value gets clobbered. + s_mov_b32 s_restore_spi_init_hi, s_restore_spi_init_hi_save + + read_hwreg_from_mem(s_restore_m0, s_restore_buf_rsrc0, s_restore_mem_offset) + read_hwreg_from_mem(s_restore_pc_lo, s_restore_buf_rsrc0, s_restore_mem_offset) + read_hwreg_from_mem(s_restore_pc_hi, s_restore_buf_rsrc0, s_restore_mem_offset) + read_hwreg_from_mem(s_restore_exec_lo, s_restore_buf_rsrc0, s_restore_mem_offset) + read_hwreg_from_mem(s_restore_exec_hi, s_restore_buf_rsrc0, s_restore_mem_offset) + read_hwreg_from_mem(s_restore_state_priv, s_restore_buf_rsrc0, s_restore_mem_offset) + read_hwreg_from_mem(s_restore_excp_flag_priv, s_restore_buf_rsrc0, s_restore_mem_offset) + read_hwreg_from_mem(s_restore_xnack_mask, s_restore_buf_rsrc0, s_restore_mem_offset) + read_hwreg_from_mem(s_restore_mode, s_restore_buf_rsrc0, s_restore_mem_offset) + read_hwreg_from_mem(s_restore_flat_scratch, s_restore_buf_rsrc0, s_restore_mem_offset) + s_wait_idle + + s_setreg_b32 hwreg(HW_REG_WAVE_SCRATCH_BASE_LO), s_restore_flat_scratch + + read_hwreg_from_mem(s_restore_flat_scratch, s_restore_buf_rsrc0, s_restore_mem_offset) + s_wait_idle + + s_setreg_b32 hwreg(HW_REG_WAVE_SCRATCH_BASE_HI), s_restore_flat_scratch + + read_hwreg_from_mem(s_restore_tmp, 
s_restore_buf_rsrc0, s_restore_mem_offset) + s_wait_idle + s_setreg_b32 hwreg(HW_REG_WAVE_EXCP_FLAG_USER), s_restore_tmp + + read_hwreg_from_mem(s_restore_tmp, s_restore_buf_rsrc0, s_restore_mem_offset) + s_wait_idle + s_setreg_b32 hwreg(HW_REG_WAVE_TRAP_CTRL), s_restore_tmp + + // Only the first wave needs to restore the workgroup barrier. + s_and_b32 s_restore_tmp, s_restore_spi_init_hi, S_RESTORE_SPI_INIT_FIRST_WAVE_MASK + s_cbranch_scc0 L_SKIP_BARRIER_RESTORE + + // Skip over WAVE_STATUS, since there is no state to restore from it + s_add_u32 s_restore_mem_offset, s_restore_mem_offset, 4 + + read_hwreg_from_mem(s_restore_tmp, s_restore_buf_rsrc0, s_restore_mem_offset) + s_wait_idle + + s_bitcmp1_b32 s_restore_tmp, BARRIER_STATE_VALID_OFFSET + s_cbranch_scc0 L_SKIP_BARRIER_RESTORE + + // extract the saved signal count from s_restore_tmp + s_lshr_b32 s_restore_tmp, s_restore_tmp, BARRIER_STATE_SIGNAL_OFFSET + + // We need to call s_barrier_signal repeatedly to restore the signal + // count of the work group barrier. The member count is already + // initialized with the number of waves in the work group. +L_BARRIER_RESTORE_LOOP: + s_and_b32 s_restore_tmp, s_restore_tmp, s_restore_tmp + s_cbranch_scc0 L_SKIP_BARRIER_RESTORE + s_barrier_signal -1 + s_add_i32 s_restore_tmp, s_restore_tmp, -1 + s_branch L_BARRIER_RESTORE_LOOP + +L_SKIP_BARRIER_RESTORE: + + s_mov_b32 m0, s_restore_m0 + s_mov_b32 exec_lo, s_restore_exec_lo + s_mov_b32 exec_hi, s_restore_exec_hi + + // EXCP_FLAG_PRIV.SAVE_CONTEXT and HOST_TRAP may have changed. + // Only restore the other fields to avoid clobbering them. 
+ s_setreg_b32 hwreg(HW_REG_WAVE_EXCP_FLAG_PRIV, 0, SQ_WAVE_EXCP_FLAG_PRIV_RESTORE_PART_1_SIZE), s_restore_excp_flag_priv + s_lshr_b32 s_restore_excp_flag_priv, s_restore_excp_flag_priv, SQ_WAVE_EXCP_FLAG_PRIV_RESTORE_PART_2_SHIFT + s_setreg_b32 hwreg(HW_REG_WAVE_EXCP_FLAG_PRIV, SQ_WAVE_EXCP_FLAG_PRIV_RESTORE_PART_2_SHIFT, SQ_WAVE_EXCP_FLAG_PRIV_RESTORE_PART_2_SIZE), s_restore_excp_flag_priv + s_lshr_b32 s_restore_excp_flag_priv, s_restore_excp_flag_priv, SQ_WAVE_EXCP_FLAG_PRIV_RESTORE_PART_3_SHIFT - SQ_WAVE_EXCP_FLAG_PRIV_RESTORE_PART_2_SHIFT + s_setreg_b32 hwreg(HW_REG_WAVE_EXCP_FLAG_PRIV, SQ_WAVE_EXCP_FLAG_PRIV_RESTORE_PART_3_SHIFT, SQ_WAVE_EXCP_FLAG_PRIV_RESTORE_PART_3_SIZE), s_restore_excp_flag_priv + + s_setreg_b32 hwreg(HW_REG_WAVE_MODE), s_restore_mode + + // Restore trap temporaries 4-11, 13 initialized by SPI debug dispatch logic + // ttmp SR memory offset : size(VGPR)+size(SVGPR)+size(SGPR)+0x40 + get_vgpr_size_bytes(s_restore_ttmps_lo, s_restore_size) + get_svgpr_size_bytes(s_restore_ttmps_hi) + s_add_u32 s_restore_ttmps_lo, s_restore_ttmps_lo, s_restore_ttmps_hi + s_add_u32 s_restore_ttmps_lo, s_restore_ttmps_lo, get_sgpr_size_bytes() + s_add_u32 s_restore_ttmps_lo, s_restore_ttmps_lo, s_restore_buf_rsrc0 + s_addc_u32 s_restore_ttmps_hi, s_restore_buf_rsrc1, 0x0 + s_and_b32 s_restore_ttmps_hi, s_restore_ttmps_hi, 0xFFFF + s_load_dwordx4 [ttmp4, ttmp5, ttmp6, ttmp7], [s_restore_ttmps_lo, s_restore_ttmps_hi], 0x50 scope:SCOPE_SYS + s_load_dwordx4 [ttmp8, ttmp9, ttmp10, ttmp11], [s_restore_ttmps_lo, s_restore_ttmps_hi], 0x60 scope:SCOPE_SYS + s_load_dword ttmp13, [s_restore_ttmps_lo, s_restore_ttmps_hi], 0x74 scope:SCOPE_SYS + s_wait_idle + + s_and_b32 s_restore_pc_hi, s_restore_pc_hi, 0x0000ffff //pc[47:32] //Do it here in order not to affect STATUS + s_and_b64 exec, exec, exec // Restore STATUS.EXECZ, not writable by s_setreg_b32 + s_and_b64 vcc, vcc, vcc // Restore STATUS.VCCZ, not writable by s_setreg_b32 + + s_setreg_b32 
hwreg(HW_REG_WAVE_STATE_PRIV), s_restore_state_priv // SCC is included, which is changed by previous salu + + // Make barrier and LDS state visible to all waves in the group. + // STATE_PRIV.BARRIER_COMPLETE may change after this point. + s_barrier_signal -2 + s_barrier_wait -2 + + s_rfe_b64 s_restore_pc_lo //Return to the main shader program and resume execution + +L_END_PGM: + s_endpgm_saved +end + +function write_hwreg_to_v2(s) + // Copy into VGPR for later TCP store. + v_writelane_b32 v2, s, m0 + s_add_u32 m0, m0, 0x1 +end + + +function write_16sgpr_to_v2(s) + // Copy into VGPR for later TCP store. + for var sgpr_idx = 0; sgpr_idx < 16; sgpr_idx ++ + v_writelane_b32 v2, s[sgpr_idx], ttmp13 + s_add_u32 ttmp13, ttmp13, 0x1 + end +end + +function write_12sgpr_to_v2(s) + // Copy into VGPR for later TCP store. + for var sgpr_idx = 0; sgpr_idx < 12; sgpr_idx ++ + v_writelane_b32 v2, s[sgpr_idx], ttmp13 + s_add_u32 ttmp13, ttmp13, 0x1 + end +end + +function read_hwreg_from_mem(s, s_rsrc, s_mem_offset) + s_buffer_load_dword s, s_rsrc, s_mem_offset scope:SCOPE_SYS + s_add_u32 s_mem_offset, s_mem_offset, 4 +end + +function read_16sgpr_from_mem(s, s_rsrc, s_mem_offset) + s_sub_u32 s_mem_offset, s_mem_offset, 4*16 + s_buffer_load_dwordx16 s, s_rsrc, s_mem_offset scope:SCOPE_SYS +end + +function read_8sgpr_from_mem(s, s_rsrc, s_mem_offset) + s_sub_u32 s_mem_offset, s_mem_offset, 4*8 + s_buffer_load_dwordx8 s, s_rsrc, s_mem_offset scope:SCOPE_SYS +end + +function read_4sgpr_from_mem(s, s_rsrc, s_mem_offset) + s_sub_u32 s_mem_offset, s_mem_offset, 4*4 + s_buffer_load_dwordx4 s, s_rsrc, s_mem_offset scope:SCOPE_SYS +end + +function get_vgpr_size_bytes(s_vgpr_size_byte, s_size) + s_getreg_b32 s_vgpr_size_byte, hwreg(HW_REG_WAVE_GPR_ALLOC,SQ_WAVE_GPR_ALLOC_VGPR_SIZE_SHIFT,SQ_WAVE_GPR_ALLOC_VGPR_SIZE_SIZE) + s_add_u32 s_vgpr_size_byte, s_vgpr_size_byte, 1 + s_bitcmp1_b32 s_size, S_WAVE_SIZE + s_cbranch_scc1 L_ENABLE_SHIFT_W64 + s_lshl_b32 s_vgpr_size_byte, s_vgpr_size_byte, (2+7) 
//Number of VGPRs = (vgpr_size + 1) * 4 * 32 * 4 (non-zero value) + s_branch L_SHIFT_DONE +L_ENABLE_SHIFT_W64: + s_lshl_b32 s_vgpr_size_byte, s_vgpr_size_byte, (2+8) //Number of VGPRs = (vgpr_size + 1) * 4 * 64 * 4 (non-zero value) +L_SHIFT_DONE: +end + +function get_svgpr_size_bytes(s_svgpr_size_byte) + s_getreg_b32 s_svgpr_size_byte, hwreg(HW_REG_WAVE_LDS_ALLOC,SQ_WAVE_LDS_ALLOC_VGPR_SHARED_SIZE_SHIFT,SQ_WAVE_LDS_ALLOC_VGPR_SHARED_SIZE_SIZE) + s_lshl_b32 s_svgpr_size_byte, s_svgpr_size_byte, (3+7) +end + +function get_sgpr_size_bytes + return 512 +end + +function get_hwreg_size_bytes + return 128 +end + +function get_wave_size2(s_reg) + s_getreg_b32 s_reg, hwreg(HW_REG_WAVE_STATUS,SQ_WAVE_STATUS_WAVE64_SHIFT,SQ_WAVE_STATUS_WAVE64_SIZE) + s_lshl_b32 s_reg, s_reg, S_WAVE_SIZE +end diff --git a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx9.asm b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx9.asm index bb26338204f4..0eabb7a8cab9 100644 --- a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx9.asm +++ b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx9.asm @@ -37,17 +37,28 @@ * gc_9_4_3: * cpp -DASIC_FAMILY=GC_9_4_3 cwsr_trap_handler_gfx9.asm -P -o gc_9_4_3.sp3 * sp3 gc_9_4_3.sp3 -hex gc_9_4_3.hex + * + * gc_9_5_0: + * cpp -DASIC_FAMILY=GC_9_5_0 cwsr_trap_handler_gfx9.asm -P -o gc_9_5_0.sp3 + * sp3 gc_9_5_0.sp3 -hex gc_9_5_0.hex */ #define CHIP_VEGAM 18 #define CHIP_ARCTURUS 23 #define CHIP_ALDEBARAN 25 #define CHIP_GC_9_4_3 26 +#define CHIP_GC_9_5_0 27 var ACK_SQC_STORE = 1 //workaround for suspected SQC store bug causing incorrect stores under concurrency var SAVE_AFTER_XNACK_ERROR = 1 //workaround for TCP store failure after XNACK error when ALLOW_REPLAY=0, for debugger var SINGLE_STEP_MISSED_WORKAROUND = (ASIC_FAMILY <= CHIP_ALDEBARAN) //workaround for lost MODE.DEBUG_EN exception when SAVECTX raised +#if ASIC_FAMILY < CHIP_GC_9_4_3 +#define VMEM_MODIFIERS slc:1 glc:1 +#else +#define VMEM_MODIFIERS sc0:1 nt:1 +#endif + 
/**************************************************************************/ /* variables */ /**************************************************************************/ @@ -62,7 +73,13 @@ var SQ_WAVE_STATUS_ALLOW_REPLAY_MASK = 0x400000 var SQ_WAVE_STATUS_ECC_ERR_MASK = 0x20000 var SQ_WAVE_LDS_ALLOC_LDS_SIZE_SHIFT = 12 +#if ASIC_FAMILY >= CHIP_GC_9_5_0 +var SQ_WAVE_LDS_ALLOC_LDS_SIZE_SIZE = 11 +var LDS_RESTORE_GRANULARITY_BYTES = 1280 +#else var SQ_WAVE_LDS_ALLOC_LDS_SIZE_SIZE = 9 +var LDS_RESTORE_GRANULARITY_BYTES = 512 +#endif var SQ_WAVE_GPR_ALLOC_VGPR_SIZE_SIZE = 6 var SQ_WAVE_GPR_ALLOC_SGPR_SIZE_SIZE = 3 //FIXME sq.blk still has 4 bits at this time while SQ programming guide has 3 bits var SQ_WAVE_GPR_ALLOC_SGPR_SIZE_SHIFT = 24 @@ -557,12 +574,21 @@ if SAVE_AFTER_XNACK_ERROR v_lshlrev_b32 v2, 2, v3 L_SAVE_LDS_LOOP_SQC: +#if ASIC_FAMILY < CHIP_GC_9_5_0 ds_read2_b32 v[0:1], v2 offset0:0 offset1:0x40 s_waitcnt lgkmcnt(0) - write_vgprs_to_mem_with_sqc(v0, 2, s_save_buf_rsrc0, s_save_mem_offset) v_add_u32 v2, 0x200, v2 +#else + // gfx950 needs to save in multiple of 256 bytes. 
+ ds_read_b32 v0, v2 + s_waitcnt lgkmcnt(0) + write_vgprs_to_mem_with_sqc(v0, 1, s_save_buf_rsrc0, s_save_mem_offset) + + v_add_u32 v2, 0x100, v2 +#endif + v_cmp_lt_u32 vcc[0:1], v2, s_save_alloc_size s_cbranch_vccnz L_SAVE_LDS_LOOP_SQC @@ -581,11 +607,14 @@ end L_SAVE_LDS_LOOP_VECTOR: ds_read_b64 v[0:1], v2 //x =LDS[a], byte address s_waitcnt lgkmcnt(0) - buffer_store_dwordx2 v[0:1], v2, s_save_buf_rsrc0, s_save_mem_offset offen:1 glc:1 slc:1 + buffer_store_dwordx2 v[0:1], v2, s_save_buf_rsrc0, s_save_mem_offset VMEM_MODIFIERS offen:1 // s_waitcnt vmcnt(0) // v_add_u32 v2, vcc[0:1], v2, v3 v_add_u32 v2, v2, v3 v_cmp_lt_u32 vcc[0:1], v2, s_save_alloc_size +#if ASIC_FAMILY >= CHIP_GC_9_5_0 + s_mov_b64 exec, vcc +#endif s_cbranch_vccnz L_SAVE_LDS_LOOP_VECTOR // restore rsrc3 @@ -748,8 +777,13 @@ L_RESTORE: L_RESTORE_LDS_LOOP: buffer_load_dword v0, v0, s_restore_buf_rsrc0, s_restore_mem_offset lds:1 // first 64DW buffer_load_dword v0, v0, s_restore_buf_rsrc0, s_restore_mem_offset lds:1 offset:256 // second 64DW - s_add_u32 m0, m0, 256*2 // 128 DW - s_add_u32 s_restore_mem_offset, s_restore_mem_offset, 256*2 //mem offset increased by 128DW +#if ASIC_FAMILY >= CHIP_GC_9_5_0 + buffer_load_dword v0, v0, s_restore_buf_rsrc0, s_restore_mem_offset lds:1 offset:512 // third 64DW + buffer_load_dword v0, v0, s_restore_buf_rsrc0, s_restore_mem_offset lds:1 offset:768 // fourth 64DW + buffer_load_dword v0, v0, s_restore_buf_rsrc0, s_restore_mem_offset lds:1 offset:1024 // fifth 64DW +#endif + s_add_u32 m0, m0, LDS_RESTORE_GRANULARITY_BYTES // 128/320 DW + s_add_u32 s_restore_mem_offset, s_restore_mem_offset, LDS_RESTORE_GRANULARITY_BYTES //mem offset increased by 128/320 DW s_cmp_lt_u32 m0, s_restore_alloc_size //scc=(m0 < s_restore_alloc_size) ? 1 : 0 s_cbranch_scc1 L_RESTORE_LDS_LOOP //LDS restore is complete?
@@ -979,17 +1013,17 @@ L_TCP_STORE_CHECK_DONE: end function write_4vgprs_to_mem(s_rsrc, s_mem_offset) - buffer_store_dword v0, v0, s_rsrc, s_mem_offset slc:1 glc:1 - buffer_store_dword v1, v0, s_rsrc, s_mem_offset slc:1 glc:1 offset:256 - buffer_store_dword v2, v0, s_rsrc, s_mem_offset slc:1 glc:1 offset:256*2 - buffer_store_dword v3, v0, s_rsrc, s_mem_offset slc:1 glc:1 offset:256*3 + buffer_store_dword v0, v0, s_rsrc, s_mem_offset VMEM_MODIFIERS + buffer_store_dword v1, v0, s_rsrc, s_mem_offset VMEM_MODIFIERS offset:256 + buffer_store_dword v2, v0, s_rsrc, s_mem_offset VMEM_MODIFIERS offset:256*2 + buffer_store_dword v3, v0, s_rsrc, s_mem_offset VMEM_MODIFIERS offset:256*3 end function read_4vgprs_from_mem(s_rsrc, s_mem_offset) - buffer_load_dword v0, v0, s_rsrc, s_mem_offset slc:1 glc:1 - buffer_load_dword v1, v0, s_rsrc, s_mem_offset slc:1 glc:1 offset:256 - buffer_load_dword v2, v0, s_rsrc, s_mem_offset slc:1 glc:1 offset:256*2 - buffer_load_dword v3, v0, s_rsrc, s_mem_offset slc:1 glc:1 offset:256*3 + buffer_load_dword v0, v0, s_rsrc, s_mem_offset VMEM_MODIFIERS + buffer_load_dword v1, v0, s_rsrc, s_mem_offset VMEM_MODIFIERS offset:256 + buffer_load_dword v2, v0, s_rsrc, s_mem_offset VMEM_MODIFIERS offset:256*2 + buffer_load_dword v3, v0, s_rsrc, s_mem_offset VMEM_MODIFIERS offset:256*3 s_waitcnt vmcnt(0) end diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c index 723f1220e1cc..693469c18c60 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c @@ -1423,6 +1423,7 @@ err: static int kfd_fill_gpu_cache_info_from_gfx_config(struct kfd_dev *kdev, + bool cache_line_size_missing, struct kfd_gpu_cache_info *pcache_info) { struct amdgpu_device *adev = kdev->adev; @@ -1437,6 +1438,8 @@ static int kfd_fill_gpu_cache_info_from_gfx_config(struct kfd_dev *kdev, CRAT_CACHE_FLAGS_SIMD_CACHE); pcache_info[i].num_cu_shared = adev->gfx.config.gc_num_tcp_per_wpg / 2; pcache_info[i].cache_line_size 
= adev->gfx.config.gc_tcp_cache_line_size; + if (cache_line_size_missing && !pcache_info[i].cache_line_size) + pcache_info[i].cache_line_size = 128; i++; } /* Scalar L1 Instruction Cache per SQC */ @@ -1449,6 +1452,8 @@ static int kfd_fill_gpu_cache_info_from_gfx_config(struct kfd_dev *kdev, CRAT_CACHE_FLAGS_SIMD_CACHE); pcache_info[i].num_cu_shared = adev->gfx.config.gc_num_sqc_per_wgp * 2; pcache_info[i].cache_line_size = adev->gfx.config.gc_instruction_cache_line_size; + if (cache_line_size_missing && !pcache_info[i].cache_line_size) + pcache_info[i].cache_line_size = 128; i++; } /* Scalar L1 Data Cache per SQC */ @@ -1460,6 +1465,8 @@ static int kfd_fill_gpu_cache_info_from_gfx_config(struct kfd_dev *kdev, CRAT_CACHE_FLAGS_SIMD_CACHE); pcache_info[i].num_cu_shared = adev->gfx.config.gc_num_sqc_per_wgp * 2; pcache_info[i].cache_line_size = adev->gfx.config.gc_scalar_data_cache_line_size; + if (cache_line_size_missing && !pcache_info[i].cache_line_size) + pcache_info[i].cache_line_size = 64; i++; } /* GL1 Data Cache per SA */ @@ -1472,7 +1479,8 @@ static int kfd_fill_gpu_cache_info_from_gfx_config(struct kfd_dev *kdev, CRAT_CACHE_FLAGS_DATA_CACHE | CRAT_CACHE_FLAGS_SIMD_CACHE); pcache_info[i].num_cu_shared = adev->gfx.config.max_cu_per_sh; - pcache_info[i].cache_line_size = 0; + if (cache_line_size_missing) + pcache_info[i].cache_line_size = 128; i++; } /* L2 Data Cache per GPU (Total Tex Cache) */ @@ -1484,6 +1492,8 @@ static int kfd_fill_gpu_cache_info_from_gfx_config(struct kfd_dev *kdev, CRAT_CACHE_FLAGS_SIMD_CACHE); pcache_info[i].num_cu_shared = adev->gfx.config.max_cu_per_sh; pcache_info[i].cache_line_size = adev->gfx.config.gc_tcc_cache_line_size; + if (cache_line_size_missing && !pcache_info[i].cache_line_size) + pcache_info[i].cache_line_size = 128; i++; } /* L3 Data Cache per GPU */ @@ -1494,7 +1504,7 @@ static int kfd_fill_gpu_cache_info_from_gfx_config(struct kfd_dev *kdev, CRAT_CACHE_FLAGS_DATA_CACHE | CRAT_CACHE_FLAGS_SIMD_CACHE); 
pcache_info[i].num_cu_shared = adev->gfx.config.max_cu_per_sh; - pcache_info[i].cache_line_size = 0; + pcache_info[i].cache_line_size = 64; i++; } return i; @@ -1510,6 +1520,8 @@ static int kfd_fill_gpu_cache_info_from_gfx_config_v2(struct kfd_dev *kdev, if (adev->gfx.config.gc_tcp_size_per_cu) { pcache_info[i].cache_size = adev->gfx.config.gc_tcp_size_per_cu; pcache_info[i].cache_level = 1; + /* Cacheline size not available in IP discovery for gc943,gc944 */ + pcache_info[i].cache_line_size = 128; pcache_info[i].flags = (CRAT_CACHE_FLAGS_ENABLED | CRAT_CACHE_FLAGS_DATA_CACHE | CRAT_CACHE_FLAGS_SIMD_CACHE); @@ -1521,6 +1533,7 @@ static int kfd_fill_gpu_cache_info_from_gfx_config_v2(struct kfd_dev *kdev, pcache_info[i].cache_size = adev->gfx.config.gc_l1_instruction_cache_size_per_sqc; pcache_info[i].cache_level = 1; + pcache_info[i].cache_line_size = 64; pcache_info[i].flags = (CRAT_CACHE_FLAGS_ENABLED | CRAT_CACHE_FLAGS_INST_CACHE | CRAT_CACHE_FLAGS_SIMD_CACHE); @@ -1531,6 +1544,7 @@ static int kfd_fill_gpu_cache_info_from_gfx_config_v2(struct kfd_dev *kdev, if (adev->gfx.config.gc_l1_data_cache_size_per_sqc) { pcache_info[i].cache_size = adev->gfx.config.gc_l1_data_cache_size_per_sqc; pcache_info[i].cache_level = 1; + pcache_info[i].cache_line_size = 64; pcache_info[i].flags = (CRAT_CACHE_FLAGS_ENABLED | CRAT_CACHE_FLAGS_DATA_CACHE | CRAT_CACHE_FLAGS_SIMD_CACHE); @@ -1541,6 +1555,7 @@ static int kfd_fill_gpu_cache_info_from_gfx_config_v2(struct kfd_dev *kdev, if (adev->gfx.config.gc_tcc_size) { pcache_info[i].cache_size = adev->gfx.config.gc_tcc_size; pcache_info[i].cache_level = 2; + pcache_info[i].cache_line_size = 128; pcache_info[i].flags = (CRAT_CACHE_FLAGS_ENABLED | CRAT_CACHE_FLAGS_DATA_CACHE | CRAT_CACHE_FLAGS_SIMD_CACHE); @@ -1551,6 +1566,7 @@ static int kfd_fill_gpu_cache_info_from_gfx_config_v2(struct kfd_dev *kdev, if (adev->gmc.mall_size) { pcache_info[i].cache_size = adev->gmc.mall_size / 1024; pcache_info[i].cache_level = 3; + 
pcache_info[i].cache_line_size = 64; pcache_info[i].flags = (CRAT_CACHE_FLAGS_ENABLED | CRAT_CACHE_FLAGS_DATA_CACHE | CRAT_CACHE_FLAGS_SIMD_CACHE); @@ -1563,6 +1579,7 @@ static int kfd_fill_gpu_cache_info_from_gfx_config_v2(struct kfd_dev *kdev, int kfd_get_gpu_cache_info(struct kfd_node *kdev, struct kfd_gpu_cache_info **pcache_info) { int num_of_cache_types = 0; + bool cache_line_size_missing = false; switch (kdev->adev->asic_type) { case CHIP_KAVERI: @@ -1622,6 +1639,7 @@ int kfd_get_gpu_cache_info(struct kfd_node *kdev, struct kfd_gpu_cache_info **pc break; case IP_VERSION(9, 4, 3): case IP_VERSION(9, 4, 4): + case IP_VERSION(9, 5, 0): num_of_cache_types = kfd_fill_gpu_cache_info_from_gfx_config_v2(kdev->kfd, *pcache_info); @@ -1686,10 +1704,17 @@ int kfd_get_gpu_cache_info(struct kfd_node *kdev, struct kfd_gpu_cache_info **pc case IP_VERSION(11, 5, 0): case IP_VERSION(11, 5, 1): case IP_VERSION(11, 5, 2): + /* Cacheline size not available in IP discovery for gc11. + * kfd_fill_gpu_cache_info_from_gfx_config to hard code it + */ + cache_line_size_missing = true; + fallthrough; case IP_VERSION(12, 0, 0): case IP_VERSION(12, 0, 1): num_of_cache_types = - kfd_fill_gpu_cache_info_from_gfx_config(kdev->kfd, *pcache_info); + kfd_fill_gpu_cache_info_from_gfx_config(kdev->kfd, + cache_line_size_missing, + *pcache_info); break; default: *pcache_info = dummy_cache_info; diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_debug.c b/drivers/gpu/drm/amd/amdkfd/kfd_debug.c index 312dfa84f29f..a8abc3091801 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_debug.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_debug.c @@ -350,10 +350,27 @@ int kfd_dbg_set_mes_debug_mode(struct kfd_process_device *pdd, bool sq_trap_en) { uint32_t spi_dbg_cntl = pdd->spi_dbg_override | pdd->spi_dbg_launch_mode; uint32_t flags = pdd->process->dbg_flags; + struct amdgpu_device *adev = pdd->dev->adev; + int r; if (!kfd_dbg_is_per_vmid_supported(pdd->dev)) return 0; + if (!pdd->proc_ctx_cpu_ptr) { + r = 
amdgpu_amdkfd_alloc_gtt_mem(adev, + AMDGPU_MES_PROC_CTX_SIZE, + &pdd->proc_ctx_bo, + &pdd->proc_ctx_gpu_addr, + &pdd->proc_ctx_cpu_ptr, + false); + if (r) { + dev_err(adev->dev, + "failed to allocate process context bo\n"); + return r; + } + memset(pdd->proc_ctx_cpu_ptr, 0, AMDGPU_MES_PROC_CTX_SIZE); + } + return amdgpu_mes_set_shader_debugger(pdd->dev->adev, pdd->proc_ctx_gpu_addr, spi_dbg_cntl, pdd->watch_points, flags, sq_trap_en); } diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_debug.h b/drivers/gpu/drm/amd/amdkfd/kfd_debug.h index 924d0fd85dfb..27aa1a5b120f 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_debug.h +++ b/drivers/gpu/drm/amd/amdkfd/kfd_debug.h @@ -79,6 +79,7 @@ static inline bool kfd_dbg_is_per_vmid_supported(struct kfd_node *dev) return (KFD_GC_VERSION(dev) == IP_VERSION(9, 4, 2) || KFD_GC_VERSION(dev) == IP_VERSION(9, 4, 3) || KFD_GC_VERSION(dev) == IP_VERSION(9, 4, 4) || + KFD_GC_VERSION(dev) == IP_VERSION(9, 5, 0) || KFD_GC_VERSION(dev) >= IP_VERSION(11, 0, 0)); } diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c index 956198da7859..a29374c86405 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c @@ -85,6 +85,7 @@ static void kfd_device_info_set_sdma_info(struct kfd_dev *kfd) case IP_VERSION(4, 4, 0):/* ALDEBARAN */ case IP_VERSION(4, 4, 2): case IP_VERSION(4, 4, 5): + case IP_VERSION(4, 4, 4): case IP_VERSION(5, 0, 0):/* NAVI10 */ case IP_VERSION(5, 0, 1):/* CYAN_SKILLFISH */ case IP_VERSION(5, 0, 2):/* NAVI14 */ @@ -152,6 +153,7 @@ static void kfd_device_info_set_event_interrupt_class(struct kfd_dev *kfd) break; case IP_VERSION(9, 4, 3): /* GC 9.4.3 */ case IP_VERSION(9, 4, 4): /* GC 9.4.4 */ + case IP_VERSION(9, 5, 0): /* GC 9.5.0 */ kfd->device_info.event_interrupt_class = &event_interrupt_class_v9_4_3; break; @@ -235,6 +237,9 @@ static void kfd_device_info_init(struct kfd_dev *kfd, */ kfd->device_info.needs_pci_atomics = true; 
kfd->device_info.no_atomic_fw_version = kfd->adev->gfx.rs64_enable ? 509 : 0; + } else if (gc_version < IP_VERSION(13, 0, 0)) { + kfd->device_info.needs_pci_atomics = true; + kfd->device_info.no_atomic_fw_version = 2090; } else { kfd->device_info.needs_pci_atomics = true; } @@ -353,6 +358,10 @@ struct kfd_dev *kgd2kfd_probe(struct amdgpu_device *adev, bool vf) gfx_target_version = 90402; f2g = &gc_9_4_3_kfd2kgd; break; + case IP_VERSION(9, 5, 0): + gfx_target_version = 90500; + f2g = &gc_9_4_3_kfd2kgd; + break; /* Navi10 */ case IP_VERSION(10, 1, 10): gfx_target_version = 100100; @@ -512,6 +521,10 @@ static void kfd_cwsr_init(struct kfd_dev *kfd) > KFD_CWSR_TMA_OFFSET); kfd->cwsr_isa = cwsr_trap_gfx9_4_3_hex; kfd->cwsr_isa_size = sizeof(cwsr_trap_gfx9_4_3_hex); + } else if (KFD_GC_VERSION(kfd) == IP_VERSION(9, 5, 0)) { + BUILD_BUG_ON(sizeof(cwsr_trap_gfx9_5_0_hex) > PAGE_SIZE); + kfd->cwsr_isa = cwsr_trap_gfx9_5_0_hex; + kfd->cwsr_isa_size = sizeof(cwsr_trap_gfx9_5_0_hex); } else if (KFD_GC_VERSION(kfd) < IP_VERSION(10, 1, 1)) { BUILD_BUG_ON(sizeof(cwsr_trap_gfx9_hex) > KFD_CWSR_TMA_OFFSET); @@ -564,6 +577,7 @@ static int kfd_gws_init(struct kfd_node *node) && kfd->mec2_fw_version >= 0x28) || (KFD_GC_VERSION(node) == IP_VERSION(9, 4, 3) || KFD_GC_VERSION(node) == IP_VERSION(9, 4, 4)) || + (KFD_GC_VERSION(node) == IP_VERSION(9, 5, 0)) || (KFD_GC_VERSION(node) >= IP_VERSION(10, 3, 0) && KFD_GC_VERSION(node) < IP_VERSION(11, 0, 0) && kfd->mec2_fw_version >= 0x6b) || @@ -635,6 +649,14 @@ static void kfd_cleanup_nodes(struct kfd_dev *kfd, unsigned int num_nodes) struct kfd_node *knode; unsigned int i; + /* + * flush_work ensures that there are no outstanding + * work-queue items that will access interrupt_ring. New work items + * can't be created because we stopped interrupt handling above. 
+ */ + flush_workqueue(kfd->ih_wq); + destroy_workqueue(kfd->ih_wq); + for (i = 0; i < num_nodes; i++) { knode = kfd->nodes[i]; device_queue_manager_uninit(knode->dqm); @@ -730,14 +752,14 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd, last_vmid_kfd = fls(gpu_resources->compute_vmid_bitmap)-1; vmid_num_kfd = last_vmid_kfd - first_vmid_kfd + 1; - /* For GFX9.4.3, we need special handling for VMIDs depending on - * partition mode. + /* For multi-partition capable GPUs, we need special handling for VMIDs + * depending on partition mode. * In CPX mode, the VMID range needs to be shared between XCDs. * Additionally, there are 13 VMIDs (3-15) available for KFD. To * divide them equally, we change starting VMID to 4 and not use * VMID 3. - * If the VMID range changes for GFX9.4.3, then this code MUST be - * revisited. + * If the VMID range changes for multi-partition capable GPUs, then + * this code MUST be revisited. */ if (kfd->adev->xcp_mgr) { partition_mode = amdgpu_xcp_query_partition_mode(kfd->adev->xcp_mgr, @@ -802,14 +824,12 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd, kfd->hive_id = kfd->adev->gmc.xgmi.hive_id; /* - * For GFX9.4.3, the KFD abstracts all partitions within a socket as - * xGMI connected in the topology so assign a unique hive id per - * device based on the pci device location if device is in PCIe mode. + * For multi-partition capable GPUs, the KFD abstracts all partitions + * within a socket as xGMI connected in the topology so assign a unique + * hive id per device based on the pci device location if device is in + * PCIe mode. 
*/ - if (!kfd->hive_id && - (KFD_GC_VERSION(kfd) == IP_VERSION(9, 4, 3) || - KFD_GC_VERSION(kfd) == IP_VERSION(9, 4, 4)) && - kfd->num_nodes > 1) + if (!kfd->hive_id && kfd->num_nodes > 1) kfd->hive_id = pci_dev_id(kfd->adev->pdev); kfd->noretry = kfd->adev->gmc.noretry; @@ -847,12 +867,11 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd, KFD_XCP_MEMORY_SIZE(node->adev, node->node_id) >> 20); } - if ((KFD_GC_VERSION(kfd) == IP_VERSION(9, 4, 3) || - KFD_GC_VERSION(kfd) == IP_VERSION(9, 4, 4)) && - partition_mode == AMDGPU_CPX_PARTITION_MODE && + if (partition_mode == AMDGPU_CPX_PARTITION_MODE && kfd->num_nodes != 1) { - /* For GFX9.4.3 and CPX mode, first XCD gets VMID range - * 4-9 and second XCD gets VMID range 10-15. + /* For multi-partition capable GPUs and CPX mode, first + * XCD gets VMID range 4-9 and second XCD gets VMID + * range 10-15. */ node->vm_info.first_vmid_kfd = (i%2 == 0) ? @@ -876,8 +895,7 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd, amdgpu_amdkfd_get_local_mem_info(kfd->adev, &node->local_mem_info, node->xcp); - if (KFD_GC_VERSION(kfd) == IP_VERSION(9, 4, 3) || - KFD_GC_VERSION(kfd) == IP_VERSION(9, 4, 4)) + if (kfd->adev->xcp_mgr) kfd_setup_interrupt_bitmap(node, i); /* Initialize the KFD node */ @@ -1056,21 +1074,6 @@ static int kfd_resume(struct kfd_node *node) return err; } -static inline void kfd_queue_work(struct workqueue_struct *wq, - struct work_struct *work) -{ - int cpu, new_cpu; - - cpu = new_cpu = smp_processor_id(); - do { - new_cpu = cpumask_next(new_cpu, cpu_online_mask) % nr_cpu_ids; - if (cpu_to_node(new_cpu) == numa_node_id()) - break; - } while (cpu != new_cpu); - - queue_work_on(new_cpu, wq, work); -} - /* This is called directly from KGD at ISR. */ void kgd2kfd_interrupt(struct kfd_dev *kfd, const void *ih_ring_entry) { @@ -1096,7 +1099,7 @@ void kgd2kfd_interrupt(struct kfd_dev *kfd, const void *ih_ring_entry) patched_ihre, &is_patched) && enqueue_ih_ring_entry(node, is_patched ? 
patched_ihre : ih_ring_entry)) { - kfd_queue_work(node->ih_wq, &node->interrupt_work); + queue_work(node->kfd->ih_wq, &node->interrupt_work); spin_unlock_irqrestore(&node->interrupt_lock, flags); return; } @@ -1511,6 +1514,73 @@ bool kgd2kfd_compute_active(struct kfd_dev *kfd, uint32_t node_id) return kfd_compute_active(node); } +/** + * kgd2kfd_vmfault_fast_path() - KFD vm page fault interrupt handling fast path for gmc v9 + * @adev: amdgpu device + * @entry: vm fault interrupt vector + * @retry_fault: if this is retry fault + * + * retry fault - + * with CAM enabled, adev primary ring + * | gmc_v9_0_process_interrupt() + * adev soft_ring + * | gmc_v9_0_process_interrupt() worker failed to recover page fault + * KFD node ih_fifo + * | KFD interrupt_wq worker + * kfd_signal_vm_fault_event + * + * without CAM, adev primary ring1 + * | gmc_v9_0_process_interrupt worker failed to recvoer page fault + * KFD node ih_fifo + * | KFD interrupt_wq worker + * kfd_signal_vm_fault_event + * + * no-retry fault - + * adev primary ring + * | gmc_v9_0_process_interrupt() + * KFD node ih_fifo + * | KFD interrupt_wq worker + * kfd_signal_vm_fault_event + * + * fast path - After kfd_signal_vm_fault_event, gmc_v9_0_process_interrupt drop the page fault + * of same process, don't copy interrupt to KFD node ih_fifo. + * With gdb debugger enabled, need convert the retry fault to no-retry fault for + * debugger, cannot use the fast path. 
+ * + * Return: + * true - use the fast path to handle this fault + * false - use normal path to handle it + */ +bool kgd2kfd_vmfault_fast_path(struct amdgpu_device *adev, struct amdgpu_iv_entry *entry, + bool retry_fault) +{ + struct kfd_process *p; + u32 cam_index; + + if (entry->ih == &adev->irq.ih_soft || entry->ih == &adev->irq.ih1) { + p = kfd_lookup_process_by_pasid(entry->pasid); + if (!p) + return true; + + if (p->gpu_page_fault && !p->debug_trap_enabled) { + if (retry_fault && adev->irq.retry_cam_enabled) { + cam_index = entry->src_data[2] & 0x3ff; + WDOORBELL32(adev->irq.retry_cam_doorbell_index, cam_index); + } + + kfd_unref_process(p); + return true; + } + + /* + * This is the first page fault, set flag and then signal user space + */ + p->gpu_page_fault = true; + kfd_unref_process(p); + } + return false; +} + #if defined(CONFIG_DEBUG_FS) /* This function will send a package to HIQ to hang the HWS diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c index c79fe9069e22..1405e8affd48 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c @@ -207,6 +207,21 @@ static int add_queue_mes(struct device_queue_manager *dqm, struct queue *q, if (!down_read_trylock(&adev->reset_domain->sem)) return -EIO; + if (!pdd->proc_ctx_cpu_ptr) { + r = amdgpu_amdkfd_alloc_gtt_mem(adev, + AMDGPU_MES_PROC_CTX_SIZE, + &pdd->proc_ctx_bo, + &pdd->proc_ctx_gpu_addr, + &pdd->proc_ctx_cpu_ptr, + false); + if (r) { + dev_err(adev->dev, + "failed to allocate process context bo\n"); + return r; + } + memset(pdd->proc_ctx_cpu_ptr, 0, AMDGPU_MES_PROC_CTX_SIZE); + } + memset(&queue_input, 0x0, sizeof(struct mes_add_queue_input)); queue_input.process_id = qpd->pqm->process->pasid; queue_input.page_table_base_addr = qpd->page_table_base; @@ -2373,6 +2388,9 @@ static int wait_on_destroy_queue(struct device_queue_manager *dqm, q->process); int ret = 0; 
+ if (WARN_ON(!pdd)) + return ret; + if (pdd->qpd.is_debug) return ret; diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_v9.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_v9.c index 210bcc048f4c..67137e674f1d 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_v9.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_v9.c @@ -64,7 +64,8 @@ static int update_qpd_v9(struct device_queue_manager *dqm, qpd->sh_mem_config |= 1 << SH_MEM_CONFIG__RETRY_DISABLE__SHIFT; if (KFD_GC_VERSION(dqm->dev->kfd) == IP_VERSION(9, 4, 3) || - KFD_GC_VERSION(dqm->dev->kfd) == IP_VERSION(9, 4, 4)) + KFD_GC_VERSION(dqm->dev->kfd) == IP_VERSION(9, 4, 4) || + KFD_GC_VERSION(dqm->dev->kfd) == IP_VERSION(9, 5, 0)) qpd->sh_mem_config |= (1 << SH_MEM_CONFIG__F8_MODE__SHIFT); diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_events.c b/drivers/gpu/drm/amd/amdkfd/kfd_events.c index ea3792249209..d075f24e5f9f 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_events.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_events.c @@ -748,6 +748,16 @@ void kfd_signal_event_interrupt(u32 pasid, uint32_t partial_id, uint64_t *slots = page_slots(p->signal_page); uint32_t id; + /* + * If id is valid but slot is not signaled, GPU may signal the same event twice + * before driver have chance to process the first interrupt, then signal slot is + * auto-reset after set_event wakeup the user space, just drop the second event as + * the application only need wakeup once. 
+ */ + if ((valid_id_bits > 31 || (1U << valid_id_bits) >= KFD_SIGNAL_EVENT_LIMIT) && + partial_id < KFD_SIGNAL_EVENT_LIMIT && slots[partial_id] == UNSIGNALED_EVENT_SLOT) + goto out_unlock; + if (valid_id_bits) pr_debug_ratelimited("Partial ID invalid: %u (%u valid bits)\n", partial_id, valid_id_bits); @@ -776,6 +786,7 @@ void kfd_signal_event_interrupt(u32 pasid, uint32_t partial_id, } } +out_unlock: rcu_read_unlock(); kfd_unref_process(p); } diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v9.c b/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v9.c index d46a13156ee9..0cb5c582ce7d 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v9.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v9.c @@ -184,6 +184,7 @@ static void event_interrupt_poison_consumption_v9(struct kfd_node *dev, } else { reset = AMDGPU_RAS_GPU_RESET_MODE2_RESET; } + amdgpu_ras_set_err_poison(dev->adev, AMDGPU_RAS_BLOCK__GFX); break; case SOC15_IH_CLIENTID_VMC: case SOC15_IH_CLIENTID_VMC1: @@ -213,6 +214,7 @@ static void event_interrupt_poison_consumption_v9(struct kfd_node *dev, } else { reset = AMDGPU_RAS_GPU_RESET_MODE2_RESET; } + amdgpu_ras_set_err_poison(dev->adev, AMDGPU_RAS_BLOCK__SDMA); break; default: dev_warn(dev->adev->dev, diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_interrupt.c b/drivers/gpu/drm/amd/amdkfd/kfd_interrupt.c index 9b6b6e882593..783c2f5a04e4 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_interrupt.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_interrupt.c @@ -46,7 +46,7 @@ #include #include "kfd_priv.h" -#define KFD_IH_NUM_ENTRIES 8192 +#define KFD_IH_NUM_ENTRIES 16384 static void interrupt_wq(struct work_struct *); @@ -62,11 +62,14 @@ int kfd_interrupt_init(struct kfd_node *node) return r; } - node->ih_wq = alloc_workqueue("KFD IH", WQ_HIGHPRI, 1); - if (unlikely(!node->ih_wq)) { - kfifo_free(&node->ih_fifo); - dev_err(node->adev->dev, "Failed to allocate KFD IH workqueue\n"); - return -ENOMEM; + if (!node->kfd->ih_wq) { + node->kfd->ih_wq = alloc_workqueue("KFD IH", 
WQ_HIGHPRI | WQ_UNBOUND, + node->kfd->num_nodes); + if (unlikely(!node->kfd->ih_wq)) { + kfifo_free(&node->ih_fifo); + dev_err(node->adev->dev, "Failed to allocate KFD IH workqueue\n"); + return -ENOMEM; + } } spin_lock_init(&node->interrupt_lock); @@ -96,16 +99,6 @@ void kfd_interrupt_exit(struct kfd_node *node) spin_lock_irqsave(&node->interrupt_lock, flags); node->interrupts_active = false; spin_unlock_irqrestore(&node->interrupt_lock, flags); - - /* - * flush_work ensures that there are no outstanding - * work-queue items that will access interrupt_ring. New work items - * can't be created because we stopped interrupt handling above. - */ - flush_workqueue(node->ih_wq); - - destroy_workqueue(node->ih_wq); - kfifo_free(&node->ih_fifo); } @@ -114,55 +107,48 @@ void kfd_interrupt_exit(struct kfd_node *node) */ bool enqueue_ih_ring_entry(struct kfd_node *node, const void *ih_ring_entry) { - int count; - - count = kfifo_in(&node->ih_fifo, ih_ring_entry, - node->kfd->device_info.ih_ring_entry_size); - if (count != node->kfd->device_info.ih_ring_entry_size) { - dev_dbg_ratelimited(node->adev->dev, - "Interrupt ring overflow, dropping interrupt %d\n", - count); + if (kfifo_is_full(&node->ih_fifo)) { + dev_warn_ratelimited(node->adev->dev, "KFD node %d ih_fifo overflow\n", + node->node_id); return false; } + kfifo_in(&node->ih_fifo, ih_ring_entry, node->kfd->device_info.ih_ring_entry_size); return true; } /* * Assumption: single reader/writer. 
This function is not re-entrant */ -static bool dequeue_ih_ring_entry(struct kfd_node *node, void *ih_ring_entry) +static bool dequeue_ih_ring_entry(struct kfd_node *node, u32 **ih_ring_entry) { int count; - count = kfifo_out(&node->ih_fifo, ih_ring_entry, - node->kfd->device_info.ih_ring_entry_size); - - WARN_ON(count && count != node->kfd->device_info.ih_ring_entry_size); + if (kfifo_is_empty(&node->ih_fifo)) + return false; + count = kfifo_out_linear_ptr(&node->ih_fifo, ih_ring_entry, + node->kfd->device_info.ih_ring_entry_size); + WARN_ON(count != node->kfd->device_info.ih_ring_entry_size); return count == node->kfd->device_info.ih_ring_entry_size; } static void interrupt_wq(struct work_struct *work) { - struct kfd_node *dev = container_of(work, struct kfd_node, - interrupt_work); - uint32_t ih_ring_entry[KFD_MAX_RING_ENTRY_SIZE]; + struct kfd_node *dev = container_of(work, struct kfd_node, interrupt_work); + uint32_t *ih_ring_entry; unsigned long start_jiffies = jiffies; - if (dev->kfd->device_info.ih_ring_entry_size > sizeof(ih_ring_entry)) { - dev_err_once(dev->adev->dev, "Ring entry too small\n"); - return; - } - - while (dequeue_ih_ring_entry(dev, ih_ring_entry)) { + while (dequeue_ih_ring_entry(dev, &ih_ring_entry)) { dev->kfd->device_info.event_interrupt_class->interrupt_wq(dev, ih_ring_entry); + kfifo_skip_count(&dev->ih_fifo, dev->kfd->device_info.ih_ring_entry_size); + if (time_is_before_jiffies(start_jiffies + HZ)) { /* If we spent more than a second processing signals, * reschedule the worker to avoid soft-lockup warnings */ - queue_work(dev->ih_wq, &dev->interrupt_work); + queue_work(dev->kfd->ih_wq, &dev->interrupt_work); break; } } diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c index eacfeb32f35d..4b275937d05e 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c @@ -306,7 +306,7 @@ svm_migrate_copy_to_vram(struct kfd_node *node, struct svm_range 
*prange, spage = migrate_pfn_to_page(migrate->src[i]); if (spage && !is_zone_device_page(spage)) { src[i] = dma_map_page(dev, spage, 0, PAGE_SIZE, - DMA_TO_DEVICE); + DMA_BIDIRECTIONAL); r = dma_mapping_error(dev, src[i]); if (r) { dev_err(dev, "%s: fail %d dma_map_page\n", @@ -629,7 +629,7 @@ svm_migrate_copy_to_ram(struct amdgpu_device *adev, struct svm_range *prange, goto out_oom; } - dst[i] = dma_map_page(dev, dpage, 0, PAGE_SIZE, DMA_FROM_DEVICE); + dst[i] = dma_map_page(dev, dpage, 0, PAGE_SIZE, DMA_BIDIRECTIONAL); r = dma_mapping_error(dev, dst[i]); if (r) { dev_err(adev->dev, "%s: fail %d dma_map_page\n", __func__, r); diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c index 84e8ea3a8a0c..ff417d5361c4 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c @@ -78,7 +78,8 @@ static void update_cu_mask(struct mqd_manager *mm, void *mqd, m->compute_static_thread_mgmt_se2 = se_mask[2]; m->compute_static_thread_mgmt_se3 = se_mask[3]; if (KFD_GC_VERSION(mm->dev) != IP_VERSION(9, 4, 3) && - KFD_GC_VERSION(mm->dev) != IP_VERSION(9, 4, 4)) { + KFD_GC_VERSION(mm->dev) != IP_VERSION(9, 4, 4) && + KFD_GC_VERSION(mm->dev) != IP_VERSION(9, 5, 0)) { m->compute_static_thread_mgmt_se4 = se_mask[4]; m->compute_static_thread_mgmt_se5 = se_mask[5]; m->compute_static_thread_mgmt_se6 = se_mask[6]; @@ -301,7 +302,8 @@ static void update_mqd(struct mqd_manager *mm, void *mqd, m->cp_hqd_ctx_save_control = 0; if (KFD_GC_VERSION(mm->dev) != IP_VERSION(9, 4, 3) && - KFD_GC_VERSION(mm->dev) != IP_VERSION(9, 4, 4)) + KFD_GC_VERSION(mm->dev) != IP_VERSION(9, 4, 4) && + KFD_GC_VERSION(mm->dev) != IP_VERSION(9, 5, 0)) update_cu_mask(mm, mqd, minfo, 0); set_priority(m, q); @@ -885,7 +887,8 @@ struct mqd_manager *mqd_manager_init_v9(enum KFD_MQD_TYPE type, mqd->debugfs_show_mqd = debugfs_show_mqd; #endif if (KFD_GC_VERSION(dev) == IP_VERSION(9, 4, 3) || - KFD_GC_VERSION(dev) 
== IP_VERSION(9, 4, 4)) { + KFD_GC_VERSION(dev) == IP_VERSION(9, 4, 4) || + KFD_GC_VERSION(dev) == IP_VERSION(9, 5, 0)) { mqd->init_mqd = init_mqd_v9_4_3; mqd->load_mqd = load_mqd_v9_4_3; mqd->update_mqd = update_mqd_v9_4_3; @@ -909,8 +912,10 @@ struct mqd_manager *mqd_manager_init_v9(enum KFD_MQD_TYPE type, #if defined(CONFIG_DEBUG_FS) mqd->debugfs_show_mqd = debugfs_show_mqd; #endif + mqd->check_preemption_failed = check_preemption_failed; if (KFD_GC_VERSION(dev) == IP_VERSION(9, 4, 3) || - KFD_GC_VERSION(dev) == IP_VERSION(9, 4, 4)) { + KFD_GC_VERSION(dev) == IP_VERSION(9, 4, 4) || + KFD_GC_VERSION(dev) == IP_VERSION(9, 5, 0)) { mqd->init_mqd = init_mqd_hiq_v9_4_3; mqd->load_mqd = hiq_load_mqd_kiq_v9_4_3; mqd->destroy_mqd = destroy_hiq_mqd_v9_4_3; diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c index 37930629edc5..4984b41cd372 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c @@ -28,6 +28,10 @@ #include "kfd_kernel_queue.h" #include "kfd_priv.h" +#define OVER_SUBSCRIPTION_PROCESS_COUNT (1 << 0) +#define OVER_SUBSCRIPTION_COMPUTE_QUEUE_COUNT (1 << 1) +#define OVER_SUBSCRIPTION_GWS_QUEUE_COUNT (1 << 2) + static inline void inc_wptr(unsigned int *wptr, unsigned int increment_bytes, unsigned int buffer_size_bytes) { @@ -40,7 +44,7 @@ static inline void inc_wptr(unsigned int *wptr, unsigned int increment_bytes, static void pm_calc_rlib_size(struct packet_manager *pm, unsigned int *rlib_size, - bool *over_subscription) + int *over_subscription) { unsigned int process_count, queue_count, compute_queue_count, gws_queue_count; unsigned int map_queue_size; @@ -58,17 +62,20 @@ static void pm_calc_rlib_size(struct packet_manager *pm, * hws_max_conc_proc has been done in * kgd2kfd_device_init(). 
*/ - *over_subscription = false; + *over_subscription = 0; if (node->max_proc_per_quantum > 1) max_proc_per_quantum = node->max_proc_per_quantum; - if ((process_count > max_proc_per_quantum) || - compute_queue_count > get_cp_queues_num(pm->dqm) || - gws_queue_count > 1) { - *over_subscription = true; + if (process_count > max_proc_per_quantum) + *over_subscription |= OVER_SUBSCRIPTION_PROCESS_COUNT; + if (compute_queue_count > get_cp_queues_num(pm->dqm)) + *over_subscription |= OVER_SUBSCRIPTION_COMPUTE_QUEUE_COUNT; + if (gws_queue_count > 1) + *over_subscription |= OVER_SUBSCRIPTION_GWS_QUEUE_COUNT; + + if (*over_subscription) dev_dbg(dev, "Over subscribed runlist\n"); - } map_queue_size = pm->pmf->map_queues_size; /* calculate run list ib allocation size */ @@ -89,7 +96,7 @@ static int pm_allocate_runlist_ib(struct packet_manager *pm, unsigned int **rl_buffer, uint64_t *rl_gpu_buffer, unsigned int *rl_buffer_size, - bool *is_over_subscription) + int *is_over_subscription) { struct kfd_node *node = pm->dqm->dev; struct device *dev = node->adev->dev; @@ -134,7 +141,7 @@ static int pm_create_runlist_ib(struct packet_manager *pm, struct qcm_process_device *qpd; struct queue *q; struct kernel_queue *kq; - bool is_over_subscription; + int is_over_subscription; rl_wptr = retval = processes_mapped = 0; @@ -213,15 +220,20 @@ static int pm_create_runlist_ib(struct packet_manager *pm, if (is_over_subscription) { if (!pm->is_over_subscription) - dev_warn( - dev, - "Runlist is getting oversubscribed. Expect reduced ROCm performance.\n"); + dev_warn(dev, "Runlist is getting oversubscribed due to%s%s%s. Expect reduced ROCm performance.\n", + is_over_subscription & OVER_SUBSCRIPTION_PROCESS_COUNT ? + " too many processes." : "", + is_over_subscription & OVER_SUBSCRIPTION_COMPUTE_QUEUE_COUNT ? + " too many queues." : "", + is_over_subscription & OVER_SUBSCRIPTION_GWS_QUEUE_COUNT ? + " multiple processes using cooperative launch." 
: ""); + retval = pm->pmf->runlist(pm, &rl_buffer[rl_wptr], *rl_gpu_addr, alloc_size_bytes / sizeof(uint32_t), true); } - pm->is_over_subscription = is_over_subscription; + pm->is_over_subscription = !!is_over_subscription; for (i = 0; i < alloc_size_bytes / sizeof(uint32_t); i++) pr_debug("0x%2X ", rl_buffer[i]); @@ -248,7 +260,8 @@ int pm_init(struct packet_manager *pm, struct device_queue_manager *dqm) default: if (KFD_GC_VERSION(dqm->dev) == IP_VERSION(9, 4, 2) || KFD_GC_VERSION(dqm->dev) == IP_VERSION(9, 4, 3) || - KFD_GC_VERSION(dqm->dev) == IP_VERSION(9, 4, 4)) + KFD_GC_VERSION(dqm->dev) == IP_VERSION(9, 4, 4) || + KFD_GC_VERSION(dqm->dev) == IP_VERSION(9, 5, 0)) pm->pmf = &kfd_aldebaran_pm_funcs; else if (KFD_GC_VERSION(dqm->dev) >= IP_VERSION(9, 0, 1)) pm->pmf = &kfd_v9_pm_funcs; diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h index 9e5ca0b93b2a..d8cd913aa772 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h @@ -32,7 +32,7 @@ #include #include #include -#include +#include #include #include #include @@ -207,7 +207,8 @@ enum cache_policy { #define KFD_SUPPORT_XNACK_PER_PROCESS(dev)\ ((KFD_GC_VERSION(dev) == IP_VERSION(9, 4, 2)) || \ (KFD_GC_VERSION(dev) == IP_VERSION(9, 4, 3)) || \ - (KFD_GC_VERSION(dev) == IP_VERSION(9, 4, 4))) + (KFD_GC_VERSION(dev) == IP_VERSION(9, 4, 4)) || \ + (KFD_GC_VERSION(dev) == IP_VERSION(9, 5, 0))) struct kfd_node; @@ -273,7 +274,6 @@ struct kfd_node { /* Interrupts */ struct kfifo ih_fifo; - struct workqueue_struct *ih_wq; struct work_struct interrupt_work; spinlock_t interrupt_lock; @@ -366,6 +366,8 @@ struct kfd_dev { struct kfd_node *nodes[MAX_KFD_NODES]; unsigned int num_nodes; + struct workqueue_struct *ih_wq; + /* Kernel doorbells for KFD device */ struct amdgpu_bo *doorbells; @@ -1002,6 +1004,9 @@ struct kfd_process { struct semaphore runtime_enable_sema; bool is_runtime_retry; struct kfd_runtime_info runtime_info; + + /* if gpu page 
fault sent to KFD */ + bool gpu_page_fault; }; #define KFD_PROCESS_TABLE_SIZE 5 /* bits: 32 entries */ @@ -1150,7 +1155,8 @@ static inline struct kfd_node *kfd_node_by_irq_ids(struct amdgpu_device *adev, uint32_t i; if (KFD_GC_VERSION(dev) != IP_VERSION(9, 4, 3) && - KFD_GC_VERSION(dev) != IP_VERSION(9, 4, 4)) + KFD_GC_VERSION(dev) != IP_VERSION(9, 4, 4) && + KFD_GC_VERSION(dev) != IP_VERSION(9, 5, 0)) return dev->nodes[0]; for (i = 0; i < dev->num_nodes; i++) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c index 87cd52cf4ee9..083f83c94531 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c @@ -1076,7 +1076,8 @@ static void kfd_process_destroy_pdds(struct kfd_process *p) kfd_free_process_doorbells(pdd->dev->kfd, pdd); - if (pdd->dev->kfd->shared_resources.enable_mes) + if (pdd->dev->kfd->shared_resources.enable_mes && + pdd->proc_ctx_cpu_ptr) amdgpu_amdkfd_free_gtt_mem(pdd->dev->adev, &pdd->proc_ctx_bo); /* @@ -1159,7 +1160,8 @@ static void kfd_process_wq_release(struct work_struct *work) */ synchronize_rcu(); ef = rcu_access_pointer(p->ef); - dma_fence_signal(ef); + if (ef) + dma_fence_signal(ef); kfd_process_remove_sysfs(p); @@ -1608,7 +1610,6 @@ struct kfd_process_device *kfd_create_process_device_data(struct kfd_node *dev, struct kfd_process *p) { struct kfd_process_device *pdd = NULL; - int retval = 0; if (WARN_ON_ONCE(p->n_pdds >= MAX_GPU_INSTANCE)) return NULL; @@ -1632,21 +1633,6 @@ struct kfd_process_device *kfd_create_process_device_data(struct kfd_node *dev, pdd->user_gpu_id = dev->id; atomic64_set(&pdd->evict_duration_counter, 0); - if (dev->kfd->shared_resources.enable_mes) { - retval = amdgpu_amdkfd_alloc_gtt_mem(dev->adev, - AMDGPU_MES_PROC_CTX_SIZE, - &pdd->proc_ctx_bo, - &pdd->proc_ctx_gpu_addr, - &pdd->proc_ctx_cpu_ptr, - false); - if (retval) { - dev_err(dev->adev->dev, - "failed to allocate process context bo\n"); - goto err_free_pdd; - } - 
memset(pdd->proc_ctx_cpu_ptr, 0, AMDGPU_MES_PROC_CTX_SIZE); - } - p->pdds[p->n_pdds++] = pdd; if (kfd_dbg_is_per_vmid_supported(pdd->dev)) pdd->spi_dbg_override = pdd->dev->kfd2kgd->disable_debug_trap( @@ -1658,10 +1644,6 @@ struct kfd_process_device *kfd_create_process_device_data(struct kfd_node *dev, idr_init(&pdd->alloc_idr); return pdd; - -err_free_pdd: - kfree(pdd); - return NULL; } /** @@ -2146,10 +2128,11 @@ int kfd_process_drain_interrupts(struct kfd_process_device *pdd) irq_drain_fence[3] = pdd->process->pasid; /* - * For GFX 9.4.3, send the NodeId also in IH cookie DW[3] + * For GFX 9.4.3/9.5.0, send the NodeId also in IH cookie DW[3] */ if (KFD_GC_VERSION(pdd->dev->kfd) == IP_VERSION(9, 4, 3) || - KFD_GC_VERSION(pdd->dev->kfd) == IP_VERSION(9, 4, 4)) { + KFD_GC_VERSION(pdd->dev->kfd) == IP_VERSION(9, 4, 4) || + KFD_GC_VERSION(pdd->dev->kfd) == IP_VERSION(9, 5, 0)) { node_id = ffs(pdd->dev->interrupt_bitmap) - 1; irq_drain_fence[3] |= node_id << 16; } diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c index c76db22a1000..9df56f8e09f9 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c @@ -131,8 +131,9 @@ int pqm_set_gws(struct process_queue_manager *pqm, unsigned int qid, if (!gws && pdd->qpd.num_gws == 0) return -EINVAL; - if (KFD_GC_VERSION(dev) != IP_VERSION(9, 4, 3) && - KFD_GC_VERSION(dev) != IP_VERSION(9, 4, 4) && + if ((KFD_GC_VERSION(dev) != IP_VERSION(9, 4, 3) && + KFD_GC_VERSION(dev) != IP_VERSION(9, 4, 4) && + KFD_GC_VERSION(dev) != IP_VERSION(9, 5, 0)) && !dev->kfd->shared_resources.enable_mes) { if (gws) ret = amdgpu_amdkfd_add_gws_to_process(pdd->process->kgd_process_info, @@ -197,6 +198,7 @@ static void pqm_clean_queue_resource(struct process_queue_manager *pqm, if (pqn->q->gws) { if (KFD_GC_VERSION(pqn->q->device) != IP_VERSION(9, 4, 3) && KFD_GC_VERSION(pqn->q->device) != IP_VERSION(9, 4, 
4) && + KFD_GC_VERSION(pqn->q->device) != IP_VERSION(9, 5, 0) && !dev->kfd->shared_resources.enable_mes) amdgpu_amdkfd_remove_gws_from_process( pqm->process->kgd_process_info, pqn->q->gws); @@ -212,13 +214,17 @@ static void pqm_clean_queue_resource(struct process_queue_manager *pqm, void pqm_uninit(struct process_queue_manager *pqm) { struct process_queue_node *pqn, *next; - struct kfd_process_device *pdd; list_for_each_entry_safe(pqn, next, &pqm->queues, process_queue_list) { if (pqn->q) { - pdd = kfd_get_process_device_data(pqn->q->device, pqm->process); - kfd_queue_unref_bo_vas(pdd, &pqn->q->properties); - kfd_queue_release_buffers(pdd, &pqn->q->properties); + struct kfd_process_device *pdd = kfd_get_process_device_data(pqn->q->device, + pqm->process); + if (pdd) { + kfd_queue_unref_bo_vas(pdd, &pqn->q->properties); + kfd_queue_release_buffers(pdd, &pqn->q->properties); + } else { + WARN_ON(!pdd); + } pqm_clean_queue_resource(pqm, pqn); } @@ -316,11 +322,12 @@ int pqm_create_queue(struct process_queue_manager *pqm, unsigned int max_queues = 127; /* HWS limit */ /* - * On GFX 9.4.3, increase the number of queues that - * can be created to 255. No HWS limit on GFX 9.4.3. + * On GFX 9.4.3/9.5.0, increase the number of queues that + * can be created to 255. No HWS limit on GFX 9.4.3/9.5.0. 
*/ if (KFD_GC_VERSION(dev) == IP_VERSION(9, 4, 3) || - KFD_GC_VERSION(dev) == IP_VERSION(9, 4, 4)) + KFD_GC_VERSION(dev) == IP_VERSION(9, 4, 4) || + KFD_GC_VERSION(dev) == IP_VERSION(9, 5, 0)) max_queues = 255; q = NULL; diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_queue.c b/drivers/gpu/drm/amd/amdkfd/kfd_queue.c index ad29634f8b44..ecccd7adbab4 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_queue.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_queue.c @@ -394,7 +394,8 @@ static u32 kfd_get_vgpr_size_per_cu(u32 gfxv) if ((gfxv / 100 * 100) == 90400 || /* GFX_VERSION_AQUA_VANJARAM */ gfxv == 90010 || /* GFX_VERSION_ALDEBARAN */ - gfxv == 90008) /* GFX_VERSION_ARCTURUS */ + gfxv == 90008 || /* GFX_VERSION_ARCTURUS */ + gfxv == 90500) vgpr_size = 0x80000; else if (gfxv == 110000 || /* GFX_VERSION_PLUM_BONITO */ gfxv == 110001 || /* GFX_VERSION_WHEAT_NAS */ @@ -405,9 +406,10 @@ static u32 kfd_get_vgpr_size_per_cu(u32 gfxv) return vgpr_size; } -#define WG_CONTEXT_DATA_SIZE_PER_CU(gfxv) \ +#define WG_CONTEXT_DATA_SIZE_PER_CU(gfxv, props) \ (kfd_get_vgpr_size_per_cu(gfxv) + SGPR_SIZE_PER_CU +\ - LDS_SIZE_PER_CU + HWREG_SIZE_PER_CU) + (((gfxv) == 90500) ? (props->lds_size_in_kb << 10) : LDS_SIZE_PER_CU) +\ + HWREG_SIZE_PER_CU) #define CNTL_STACK_BYTES_PER_WAVE(gfxv) \ ((gfxv) >= 100100 ? 
12 : 8) /* GFX_VERSION_NAVI10*/ @@ -431,7 +433,7 @@ void kfd_queue_ctx_save_restore_size(struct kfd_topology_device *dev) min(cu_num * 40, props->array_count / props->simd_arrays_per_engine * 512) : cu_num * 32; - wg_data_size = ALIGN(cu_num * WG_CONTEXT_DATA_SIZE_PER_CU(gfxv), PAGE_SIZE); + wg_data_size = ALIGN(cu_num * WG_CONTEXT_DATA_SIZE_PER_CU(gfxv, props), PAGE_SIZE); ctl_stack_size = wave_num * CNTL_STACK_BYTES_PER_WAVE(gfxv) + 8; ctl_stack_size = ALIGN(SIZEOF_HSA_USER_CONTEXT_SAVE_AREA_HEADER + ctl_stack_size, PAGE_SIZE); diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c index 3e2911895c74..bd3e20d981e0 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c @@ -1195,6 +1195,7 @@ svm_range_get_pte_flags(struct kfd_node *node, struct kfd_node *bo_node; uint32_t flags = prange->flags; uint32_t mapping_flags = 0; + uint32_t gc_ip_version = KFD_GC_VERSION(node); uint64_t pte_flags; bool snoop = (domain != SVM_RANGE_VRAM_DOMAIN); bool coherent = flags & (KFD_IOCTL_SVM_FLAG_COHERENT | KFD_IOCTL_SVM_FLAG_EXT_COHERENT); @@ -1204,7 +1205,7 @@ svm_range_get_pte_flags(struct kfd_node *node, if (domain == SVM_RANGE_VRAM_DOMAIN) bo_node = prange->svm_bo->node; - switch (amdgpu_ip_version(node->adev, GC_HWIP, 0)) { + switch (gc_ip_version) { case IP_VERSION(9, 4, 1): if (domain == SVM_RANGE_VRAM_DOMAIN) { if (bo_node == node) { @@ -1241,8 +1242,10 @@ svm_range_get_pte_flags(struct kfd_node *node, break; case IP_VERSION(9, 4, 3): case IP_VERSION(9, 4, 4): + case IP_VERSION(9, 5, 0): if (ext_coherent) - mtype_local = node->adev->rev_id ? AMDGPU_VM_MTYPE_CC : AMDGPU_VM_MTYPE_UC; + mtype_local = (gc_ip_version < IP_VERSION(9, 5, 0) && !node->adev->rev_id) ? + AMDGPU_VM_MTYPE_UC : AMDGPU_VM_MTYPE_CC; else mtype_local = amdgpu_mtype_local == 1 ? AMDGPU_VM_MTYPE_NC : amdgpu_mtype_local == 2 ? 
AMDGPU_VM_MTYPE_CC : AMDGPU_VM_MTYPE_RW; @@ -1257,9 +1260,13 @@ svm_range_get_pte_flags(struct kfd_node *node, */ else if (svm_nodes_in_same_hive(bo_node, node) && !ext_coherent) mapping_flags |= AMDGPU_VM_MTYPE_NC; - /* PCIe P2P or extended system scope coherence */ - else + /* PCIe P2P on GPUs pre-9.5.0 */ + else if (gc_ip_version < IP_VERSION(9, 5, 0) && + !svm_nodes_in_same_hive(bo_node, node)) mapping_flags |= AMDGPU_VM_MTYPE_UC; + /* Other remote memory */ + else + mapping_flags |= ext_coherent ? AMDGPU_VM_MTYPE_UC : AMDGPU_VM_MTYPE_NC; /* system memory accessed by the APU */ } else if (node->adev->flags & AMD_IS_APU) { /* On NUMA systems, locality is determined per-page @@ -1271,7 +1278,10 @@ svm_range_get_pte_flags(struct kfd_node *node, mapping_flags |= ext_coherent ? AMDGPU_VM_MTYPE_UC : AMDGPU_VM_MTYPE_NC; /* system memory accessed by the dGPU */ } else { - mapping_flags |= AMDGPU_VM_MTYPE_UC; + if (gc_ip_version < IP_VERSION(9, 5, 0)) + mapping_flags |= AMDGPU_VM_MTYPE_UC; + else + mapping_flags |= AMDGPU_VM_MTYPE_NC; } break; case IP_VERSION(12, 0, 0): @@ -1299,7 +1309,7 @@ svm_range_get_pte_flags(struct kfd_node *node, pte_flags = AMDGPU_PTE_VALID; pte_flags |= (domain == SVM_RANGE_VRAM_DOMAIN) ? 0 : AMDGPU_PTE_SYSTEM; pte_flags |= snoop ? 
AMDGPU_PTE_SNOOPED : 0; - if (KFD_GC_VERSION(node) >= IP_VERSION(12, 0, 0)) + if (gc_ip_version >= IP_VERSION(12, 0, 0)) pte_flags |= AMDGPU_PTE_IS_PTE; pte_flags |= amdgpu_gem_va_map_flags(node->adev, mapping_flags); diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c index 9476e30d6baa..ceb9fb475ef1 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c @@ -1714,7 +1714,8 @@ static int fill_in_l2_l3_pcache(struct kfd_cache_properties **props_ext, pcache->cacheline_size = pcache_info[cache_type].cache_line_size; if (KFD_GC_VERSION(knode) == IP_VERSION(9, 4, 3) || - KFD_GC_VERSION(knode) == IP_VERSION(9, 4, 4)) + KFD_GC_VERSION(knode) == IP_VERSION(9, 4, 4) || + KFD_GC_VERSION(knode) == IP_VERSION(9, 5, 0)) mode = adev->gmc.gmc_funcs->query_mem_partition_mode(adev); else mode = UNKNOWN_MEMORY_PARTITION_MODE; @@ -1776,7 +1777,7 @@ static void kfd_fill_cache_non_crat_info(struct kfd_topology_device *dev, struct struct amdgpu_cu_info *cu_info = &kdev->adev->gfx.cu_info; struct amdgpu_gfx_config *gfx_info = &kdev->adev->gfx.config; int gpu_processor_id; - struct kfd_cache_properties *props_ext; + struct kfd_cache_properties *props_ext = NULL; int num_of_entries = 0; int num_of_cache_types = 0; struct kfd_gpu_cache_info cache_info[KFD_MAX_CACHE_TYPES]; diff --git a/drivers/gpu/drm/amd/display/Kconfig b/drivers/gpu/drm/amd/display/Kconfig index 11e3f2f3b174..abd3b6564373 100644 --- a/drivers/gpu/drm/amd/display/Kconfig +++ b/drivers/gpu/drm/amd/display/Kconfig @@ -8,6 +8,8 @@ config DRM_AMD_DC bool "AMD DC - Enable new display engine" default y depends on BROKEN || !CC_IS_CLANG || ARM64 || LOONGARCH || RISCV || SPARC64 || X86_64 + select CEC_CORE + select CEC_NOTIFIER select SND_HDA_COMPONENT if SND_HDA_CORE # !CC_IS_CLANG: https://github.com/ClangBuiltLinux/linux/issues/1752 select DRM_AMD_DC_FP if ARCH_HAS_KERNEL_FPU_SUPPORT && !(CC_IS_CLANG && (ARM64 || LOONGARCH || RISCV)) 
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c index f0a6816709ca..0ec178ca7434 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -93,10 +93,12 @@ #include #include #include +#include #include #include #include +#include #include #include "ivsrcid/dcn/irqsrcs_dcn_1_0.h" @@ -955,13 +957,13 @@ static void dm_dmub_outbox1_low_irq(void *interrupt_params) } } -static int dm_set_clockgating_state(void *handle, +static int dm_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state) { return 0; } -static int dm_set_powergating_state(void *handle, +static int dm_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state) { return 0; @@ -1036,8 +1038,10 @@ static int amdgpu_dm_audio_component_get_eld(struct device *kdev, int port, continue; *enabled = true; + mutex_lock(&connector->eld_mutex); ret = drm_eld_size(connector->eld); memcpy(buf, connector->eld, min(max_bytes, ret)); + mutex_unlock(&connector->eld_mutex); break; } @@ -2152,9 +2156,13 @@ static int amdgpu_dm_init(struct amdgpu_device *adev) } #if defined(CONFIG_DRM_AMD_SECURE_DISPLAY) - adev->dm.secure_display_ctxs = amdgpu_dm_crtc_secure_display_create_contexts(adev); - if (!adev->dm.secure_display_ctxs) + amdgpu_dm_crtc_secure_display_create_contexts(adev); + if (!adev->dm.secure_display_ctx.crtc_ctx) DRM_ERROR("amdgpu: failed to initialize secure display contexts.\n"); + + if (amdgpu_ip_version(adev, DCE_HWIP, 0) >= IP_VERSION(4, 0, 1)) + adev->dm.secure_display_ctx.support_mul_roi = true; + #endif DRM_DEBUG_DRIVER("KMS initialized.\n"); @@ -2197,15 +2205,15 @@ static void amdgpu_dm_fini(struct amdgpu_device *adev) amdgpu_dm_destroy_drm_device(&adev->dm); #if defined(CONFIG_DRM_AMD_SECURE_DISPLAY) - if (adev->dm.secure_display_ctxs) { + if (adev->dm.secure_display_ctx.crtc_ctx) { for (i = 0; i < 
adev->mode_info.num_crtc; i++) { - if (adev->dm.secure_display_ctxs[i].crtc) { - flush_work(&adev->dm.secure_display_ctxs[i].notify_ta_work); - flush_work(&adev->dm.secure_display_ctxs[i].forward_roi_work); + if (adev->dm.secure_display_ctx.crtc_ctx[i].crtc) { + flush_work(&adev->dm.secure_display_ctx.crtc_ctx[i].notify_ta_work); + flush_work(&adev->dm.secure_display_ctx.crtc_ctx[i].forward_roi_work); } } - kfree(adev->dm.secure_display_ctxs); - adev->dm.secure_display_ctxs = NULL; + kfree(adev->dm.secure_display_ctx.crtc_ctx); + adev->dm.secure_display_ctx.crtc_ctx = NULL; } #endif if (adev->dm.hdcp_workqueue) { @@ -2338,7 +2346,8 @@ static int load_dmcu_fw(struct amdgpu_device *adev) return 0; } - r = amdgpu_ucode_request(adev, &adev->dm.fw_dmcu, "%s", fw_name_dmcu); + r = amdgpu_ucode_request(adev, &adev->dm.fw_dmcu, AMDGPU_UCODE_REQUIRED, + "%s", fw_name_dmcu); if (r == -ENODEV) { /* DMCU firmware is not necessary, so don't raise a fuss if it's missing */ DRM_DEBUG_KMS("dm: DMCU firmware not found\n"); @@ -2746,6 +2755,48 @@ out_fail: mutex_unlock(&mgr->lock); } +void hdmi_cec_unset_edid(struct amdgpu_dm_connector *aconnector) +{ + struct cec_notifier *n = aconnector->notifier; + + if (!n) + return; + + cec_notifier_phys_addr_invalidate(n); +} + +void hdmi_cec_set_edid(struct amdgpu_dm_connector *aconnector) +{ + struct drm_connector *connector = &aconnector->base; + struct cec_notifier *n = aconnector->notifier; + + if (!n) + return; + + cec_notifier_set_phys_addr(n, + connector->display_info.source_physical_address); +} + +static void s3_handle_hdmi_cec(struct drm_device *ddev, bool suspend) +{ + struct amdgpu_dm_connector *aconnector; + struct drm_connector *connector; + struct drm_connector_list_iter conn_iter; + + drm_connector_list_iter_begin(ddev, &conn_iter); + drm_for_each_connector_iter(connector, &conn_iter) { + if (connector->connector_type == DRM_MODE_CONNECTOR_WRITEBACK) + continue; + + aconnector = to_amdgpu_dm_connector(connector); + if 
(suspend) + hdmi_cec_unset_edid(aconnector); + else + hdmi_cec_set_edid(aconnector); + } + drm_connector_list_iter_end(&conn_iter); +} + static void s3_handle_mst(struct drm_device *dev, bool suspend) { struct amdgpu_dm_connector *aconnector; @@ -3017,6 +3068,8 @@ static int dm_suspend(struct amdgpu_ip_block *ip_block) if (IS_ERR(adev->dm.cached_state)) return PTR_ERR(adev->dm.cached_state); + s3_handle_hdmi_cec(adev_to_drm(adev), true); + s3_handle_mst(adev_to_drm(adev), true); amdgpu_dm_irq_suspend(adev); @@ -3289,6 +3342,8 @@ static int dm_resume(struct amdgpu_ip_block *ip_block) */ amdgpu_dm_irq_resume_early(adev); + s3_handle_hdmi_cec(ddev, false); + /* On resume we need to rewrite the MSTM control bits to enable MST*/ s3_handle_mst(ddev, false); @@ -3457,6 +3512,7 @@ static void update_connector_ext_caps(struct amdgpu_dm_connector *aconnector) struct drm_connector *conn_base; struct amdgpu_device *adev; struct drm_luminance_range_info *luminance_range; + int min_input_signal_override; if (aconnector->bl_idx == -1 || aconnector->dc_link->connector_signal != SIGNAL_TYPE_EDP) @@ -3481,6 +3537,8 @@ static void update_connector_ext_caps(struct amdgpu_dm_connector *aconnector) caps->aux_support = false; else if (amdgpu_backlight == 1) caps->aux_support = true; + if (caps->aux_support) + aconnector->dc_link->backlight_control_type = BACKLIGHT_CONTROL_AMD_AUX; luminance_range = &conn_base->display_info.luminance_range; @@ -3491,6 +3549,10 @@ static void update_connector_ext_caps(struct amdgpu_dm_connector *aconnector) caps->aux_min_input_signal = 0; caps->aux_max_input_signal = 512; } + + min_input_signal_override = drm_get_panel_min_brightness_quirk(aconnector->drm_edid); + if (min_input_signal_override >= 0) + caps->min_input_signal = min_input_signal_override; } void amdgpu_dm_update_connector_after_detect( @@ -3596,6 +3658,7 @@ void amdgpu_dm_update_connector_after_detect( dc_sink_retain(aconnector->dc_sink); if (sink->dc_edid.length == 0) { aconnector->drm_edid 
= NULL; + hdmi_cec_unset_edid(aconnector); if (aconnector->dc_link->aux_mode) { drm_dp_cec_unset_edid(&aconnector->dm_dp_aux.aux); } @@ -3605,6 +3668,7 @@ void amdgpu_dm_update_connector_after_detect( aconnector->drm_edid = drm_edid_alloc(edid, sink->dc_edid.length); drm_edid_connector_update(connector, aconnector->drm_edid); + hdmi_cec_set_edid(aconnector); if (aconnector->dc_link->aux_mode) drm_dp_cec_attach(&aconnector->dm_dp_aux.aux, connector->display_info.source_physical_address); @@ -3621,6 +3685,7 @@ void amdgpu_dm_update_connector_after_detect( amdgpu_dm_update_freesync_caps(connector, aconnector->drm_edid); update_connector_ext_caps(aconnector); } else { + hdmi_cec_unset_edid(aconnector); drm_dp_cec_unset_edid(&aconnector->dm_dp_aux.aux); amdgpu_dm_update_freesync_caps(connector, NULL); aconnector->num_modes = 0; @@ -5304,7 +5369,8 @@ static int dm_init_microcode(struct amdgpu_device *adev) /* ASIC doesn't support DMUB. */ return 0; } - r = amdgpu_ucode_request(adev, &adev->dm.dmub_fw, "%s", fw_name_dmub); + r = amdgpu_ucode_request(adev, &adev->dm.dmub_fw, AMDGPU_UCODE_REQUIRED, + "%s", fw_name_dmub); return r; } @@ -5520,8 +5586,7 @@ fill_dc_plane_info_and_addr(struct amdgpu_device *adev, const u64 tiling_flags, struct dc_plane_info *plane_info, struct dc_plane_address *address, - bool tmz_surface, - bool force_disable_dcc) + bool tmz_surface) { const struct drm_framebuffer *fb = plane_state->fb; const struct amdgpu_framebuffer *afb = @@ -5620,7 +5685,7 @@ fill_dc_plane_info_and_addr(struct amdgpu_device *adev, &plane_info->tiling_info, &plane_info->plane_size, &plane_info->dcc, address, - tmz_surface, force_disable_dcc); + tmz_surface); if (ret) return ret; @@ -5641,7 +5706,6 @@ static int fill_dc_plane_attributes(struct amdgpu_device *adev, struct dc_scaling_info scaling_info; struct dc_plane_info plane_info; int ret; - bool force_disable_dcc = false; ret = amdgpu_dm_plane_fill_dc_scaling_info(adev, plane_state, &scaling_info); if (ret) @@ -5652,13 
+5716,11 @@ static int fill_dc_plane_attributes(struct amdgpu_device *adev, dc_plane_state->clip_rect = scaling_info.clip_rect; dc_plane_state->scaling_quality = scaling_info.scaling_quality; - force_disable_dcc = adev->asic_type == CHIP_RAVEN && adev->in_suspend; ret = fill_dc_plane_info_and_addr(adev, plane_state, afb->tiling_flags, &plane_info, &dc_plane_state->address, - afb->tmz_surface, - force_disable_dcc); + afb->tmz_surface); if (ret) return ret; @@ -7040,6 +7102,7 @@ static void amdgpu_dm_connector_unregister(struct drm_connector *connector) if (amdgpu_dm_should_create_sysfs(amdgpu_dm_connector)) sysfs_remove_group(&connector->kdev->kobj, &amdgpu_group); + cec_notifier_conn_unregister(amdgpu_dm_connector->notifier); drm_dp_aux_unregister(&amdgpu_dm_connector->dm_dp_aux.aux); } @@ -8276,6 +8339,27 @@ create_i2c(struct ddc_service *ddc_service, return i2c; } +int amdgpu_dm_initialize_hdmi_connector(struct amdgpu_dm_connector *aconnector) +{ + struct cec_connector_info conn_info; + struct drm_device *ddev = aconnector->base.dev; + struct device *hdmi_dev = ddev->dev; + + if (amdgpu_dc_debug_mask & DC_DISABLE_HDMI_CEC) { + drm_info(ddev, "HDMI-CEC feature masked\n"); + return -EINVAL; + } + + cec_fill_conn_info_from_drm(&conn_info, &aconnector->base); + aconnector->notifier = + cec_notifier_conn_register(hdmi_dev, NULL, &conn_info); + if (!aconnector->notifier) { + drm_err(ddev, "Failed to create cec notifier\n"); + return -ENOMEM; + } + + return 0; +} /* * Note: this function assumes that dc_link_detect() was called for the @@ -8339,6 +8423,10 @@ static int amdgpu_dm_connector_init(struct amdgpu_display_manager *dm, drm_connector_attach_encoder( &aconnector->base, &aencoder->base); + if (connector_type == DRM_MODE_CONNECTOR_HDMIA || + connector_type == DRM_MODE_CONNECTOR_HDMIB) + amdgpu_dm_initialize_hdmi_connector(aconnector); + if (connector_type == DRM_MODE_CONNECTOR_DisplayPort || connector_type == DRM_MODE_CONNECTOR_eDP) 
amdgpu_dm_initialize_dp_connector(dm, aconnector, link->link_index); @@ -8398,16 +8486,6 @@ static void manage_dm_interrupts(struct amdgpu_device *adev, struct amdgpu_crtc *acrtc, struct dm_crtc_state *acrtc_state) { - /* - * We have no guarantee that the frontend index maps to the same - * backend index - some even map to more than one. - * - * TODO: Use a different interrupt or check DC itself for the mapping. - */ - int irq_type = - amdgpu_display_crtc_idx_to_irq_type( - adev, - acrtc->crtc_id); struct drm_vblank_crtc_config config = {0}; struct dc_crtc_timing *timing; int offdelay; @@ -8433,28 +8511,7 @@ static void manage_dm_interrupts(struct amdgpu_device *adev, drm_crtc_vblank_on_config(&acrtc->base, &config); - - amdgpu_irq_get( - adev, - &adev->pageflip_irq, - irq_type); -#if defined(CONFIG_DRM_AMD_SECURE_DISPLAY) - amdgpu_irq_get( - adev, - &adev->vline0_irq, - irq_type); -#endif } else { -#if defined(CONFIG_DRM_AMD_SECURE_DISPLAY) - amdgpu_irq_put( - adev, - &adev->vline0_irq, - irq_type); -#endif - amdgpu_irq_put( - adev, - &adev->pageflip_irq, - irq_type); drm_crtc_vblank_off(&acrtc->base); } } @@ -8925,6 +8982,7 @@ static void amdgpu_dm_enable_self_refresh(struct amdgpu_crtc *acrtc_attach, struct replay_settings *pr = &acrtc_state->stream->link->replay_settings; struct amdgpu_dm_connector *aconn = (struct amdgpu_dm_connector *)acrtc_state->stream->dm_stream_context; + bool vrr_active = amdgpu_dm_crtc_vrr_active(acrtc_state); if (acrtc_state->update_type > UPDATE_TYPE_FAST) { if (pr->config.replay_supported && !pr->replay_feature_enabled) @@ -8951,14 +9009,15 @@ static void amdgpu_dm_enable_self_refresh(struct amdgpu_crtc *acrtc_attach, * adequate number of fast atomic commits to notify KMD * of update events. See `vblank_control_worker()`. 
*/ - if (acrtc_attach->dm_irq_params.allow_sr_entry && + if (!vrr_active && + acrtc_attach->dm_irq_params.allow_sr_entry && #ifdef CONFIG_DRM_AMD_SECURE_DISPLAY !amdgpu_dm_crc_window_is_activated(acrtc_state->base.crtc) && #endif (current_ts - psr->psr_dirty_rects_change_timestamp_ns) > 500000000) { if (pr->replay_feature_enabled && !pr->replay_allow_active) amdgpu_dm_replay_enable(acrtc_state->stream, true); - if (psr->psr_version >= DC_PSR_VERSION_SU_1 && + if (psr->psr_version == DC_PSR_VERSION_SU_1 && !psr->psr_allow_active && !aconn->disallow_edp_enter_psr) amdgpu_dm_psr_enable(acrtc_state->stream); } @@ -9095,7 +9154,7 @@ static void amdgpu_dm_commit_planes(struct drm_atomic_state *state, afb->tiling_flags, &bundle->plane_infos[planes_count], &bundle->flip_addrs[planes_count].address, - afb->tmz_surface, false); + afb->tmz_surface); drm_dbg_state(state->dev, "plane: id=%d dcc_en=%d\n", new_plane_state->plane->index, @@ -9129,7 +9188,7 @@ static void amdgpu_dm_commit_planes(struct drm_atomic_state *state, acrtc_state->stream->link->psr_settings.psr_dirty_rects_change_timestamp_ns = timestamp_ns; if (acrtc_state->stream->link->psr_settings.psr_allow_active) - amdgpu_dm_psr_disable(acrtc_state->stream); + amdgpu_dm_psr_disable(acrtc_state->stream, true); mutex_unlock(&dm->dc_lock); } } @@ -9295,11 +9354,11 @@ static void amdgpu_dm_commit_planes(struct drm_atomic_state *state, bundle->stream_update.abm_level = &acrtc_state->abm_level; mutex_lock(&dm->dc_lock); - if (acrtc_state->update_type > UPDATE_TYPE_FAST) { + if ((acrtc_state->update_type > UPDATE_TYPE_FAST) || vrr_active) { if (acrtc_state->stream->link->replay_settings.replay_allow_active) amdgpu_dm_replay_disable(acrtc_state->stream); if (acrtc_state->stream->link->psr_settings.psr_allow_active) - amdgpu_dm_psr_disable(acrtc_state->stream); + amdgpu_dm_psr_disable(acrtc_state->stream, true); } mutex_unlock(&dm->dc_lock); @@ -10058,14 +10117,19 @@ static void amdgpu_dm_atomic_commit_tail(struct 
drm_atomic_state *state) if (amdgpu_dm_is_valid_crc_source(cur_crc_src)) { #if defined(CONFIG_DRM_AMD_SECURE_DISPLAY) if (amdgpu_dm_crc_window_is_activated(crtc)) { + uint8_t cnt; spin_lock_irqsave(&adev_to_drm(adev)->event_lock, flags); - acrtc->dm_irq_params.window_param.update_win = true; + for (cnt = 0; cnt < MAX_CRC_WINDOW_NUM; cnt++) { + if (acrtc->dm_irq_params.window_param[cnt].enable) { + acrtc->dm_irq_params.window_param[cnt].update_win = true; - /** - * It takes 2 frames for HW to stably generate CRC when - * resuming from suspend, so we set skip_frame_cnt 2. - */ - acrtc->dm_irq_params.window_param.skip_frame_cnt = 2; + /** + * It takes 2 frames for HW to stably generate CRC when + * resuming from suspend, so we set skip_frame_cnt 2. + */ + acrtc->dm_irq_params.window_param[cnt].skip_frame_cnt = 2; + } + } spin_unlock_irqrestore(&adev_to_drm(adev)->event_lock, flags); } #endif @@ -11153,8 +11217,8 @@ dm_get_plane_scale(struct drm_plane_state *plane_state, int plane_src_w, plane_src_h; dm_get_oriented_plane_size(plane_state, &plane_src_w, &plane_src_h); - *out_plane_scale_w = plane_state->crtc_w * 1000 / plane_src_w; - *out_plane_scale_h = plane_state->crtc_h * 1000 / plane_src_h; + *out_plane_scale_w = plane_src_w ? plane_state->crtc_w * 1000 / plane_src_w : 0; + *out_plane_scale_h = plane_src_h ? 
plane_state->crtc_h * 1000 / plane_src_h : 0; } /* @@ -11408,6 +11472,25 @@ static int dm_crtc_get_cursor_mode(struct amdgpu_device *adev, return 0; } +static bool amdgpu_dm_crtc_mem_type_changed(struct drm_device *dev, + struct drm_atomic_state *state, + struct drm_crtc_state *crtc_state) +{ + struct drm_plane *plane; + struct drm_plane_state *new_plane_state, *old_plane_state; + + drm_for_each_plane_mask(plane, dev, crtc_state->plane_mask) { + new_plane_state = drm_atomic_get_plane_state(state, plane); + old_plane_state = drm_atomic_get_plane_state(state, plane); + + if (old_plane_state->fb && new_plane_state->fb && + get_mem_type(old_plane_state->fb) != get_mem_type(new_plane_state->fb)) + return true; + } + + return false; +} + /** * amdgpu_dm_atomic_check() - Atomic check implementation for AMDgpu DM. * @@ -11605,10 +11688,6 @@ static int amdgpu_dm_atomic_check(struct drm_device *dev, /* Remove exiting planes if they are modified */ for_each_oldnew_plane_in_descending_zpos(state, plane, old_plane_state, new_plane_state) { - if (old_plane_state->fb && new_plane_state->fb && - get_mem_type(old_plane_state->fb) != - get_mem_type(new_plane_state->fb)) - lock_and_validation_needed = true; ret = dm_update_plane_state(dc, state, plane, old_plane_state, @@ -11903,9 +11982,11 @@ static int amdgpu_dm_atomic_check(struct drm_device *dev, /* * Only allow async flips for fast updates that don't change - * the FB pitch, the DCC state, rotation, etc. + * the FB pitch, the DCC state, rotation, mem_type, etc. 
*/ - if (new_crtc_state->async_flip && lock_and_validation_needed) { + if (new_crtc_state->async_flip && + (lock_and_validation_needed || + amdgpu_dm_crtc_mem_type_changed(dev, state, new_crtc_state))) { drm_dbg_atomic(crtc->dev, "[CRTC:%d:%s] async flips are only supported for fast updates\n", crtc->base.id, crtc->name); diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h index 6464a8378387..d2703ca7dff3 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h @@ -541,12 +541,12 @@ struct amdgpu_display_manager { #if defined(CONFIG_DRM_AMD_SECURE_DISPLAY) /** - * @secure_display_ctxs: + * @secure_display_ctx: * - * Store the ROI information and the work_struct to command dmub and psp for - * all crtcs. + * Store secure display relevant info. e.g. the ROI information + * , the work_struct to command dmub, etc. */ - struct secure_display_context *secure_display_ctxs; + struct secure_display_context secure_display_ctx; #endif /** * @hpd_rx_offload_wq: @@ -671,6 +671,8 @@ struct amdgpu_dm_connector { uint32_t connector_id; int bl_idx; + struct cec_notifier *notifier; + /* we need to mind the EDID between detect and get modes due to analog/digital/tvencoder */ const struct drm_edid *drm_edid; @@ -697,6 +699,8 @@ struct amdgpu_dm_connector { struct drm_dp_mst_port *mst_output_port; struct amdgpu_dm_connector *mst_root; struct drm_dp_aux *dsc_aux; + uint32_t mst_local_bw; + uint16_t vc_full_pbn; struct mutex handle_mst_msg_ready; /* TODO see if we can merge with ddc_bus or make a dm_connector */ @@ -1010,4 +1014,8 @@ void dm_free_gpu_mem(struct amdgpu_device *adev, bool amdgpu_dm_is_headless(struct amdgpu_device *adev); +void hdmi_cec_set_edid(struct amdgpu_dm_connector *aconnector); +void hdmi_cec_unset_edid(struct amdgpu_dm_connector *aconnector); +int amdgpu_dm_initialize_hdmi_connector(struct amdgpu_dm_connector *aconnector); + #endif /* 
__AMDGPU_DM_H__ */ diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crc.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crc.c index f936a35fa9eb..033bd817d871 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crc.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crc.c @@ -30,6 +30,7 @@ #include "amdgpu_dm.h" #include "dc.h" #include "amdgpu_securedisplay.h" +#include "amdgpu_dm_psr.h" static const char *const pipe_crc_sources[] = { "none", @@ -83,45 +84,274 @@ const char *const *amdgpu_dm_crtc_get_crc_sources(struct drm_crtc *crtc, } #ifdef CONFIG_DRM_AMD_SECURE_DISPLAY +static void update_phy_id_mapping(struct amdgpu_device *adev) +{ + struct drm_device *ddev = adev_to_drm(adev); + struct amdgpu_display_manager *dm = &adev->dm; + struct drm_connector *connector; + struct amdgpu_dm_connector *aconnector; + struct amdgpu_dm_connector *sort_connector[AMDGPU_DM_MAX_CRTC] = {NULL}; + struct drm_connector_list_iter iter; + uint8_t idx = 0, idx_2 = 0, connector_cnt = 0; + + dm->secure_display_ctx.phy_mapping_updated = false; + + mutex_lock(&ddev->mode_config.mutex); + drm_connector_list_iter_begin(ddev, &iter); + drm_for_each_connector_iter(connector, &iter) { + + if (connector->status != connector_status_connected) + continue; + + if (idx >= AMDGPU_DM_MAX_CRTC) { + DRM_WARN("%s connected connectors exceed max crtc\n", __func__); + mutex_unlock(&ddev->mode_config.mutex); + return; + } + + aconnector = to_amdgpu_dm_connector(connector); + + sort_connector[idx] = aconnector; + idx++; + connector_cnt++; + } + drm_connector_list_iter_end(&iter); + + /* sort connectors by link_enc_hw_instance first */ + for (idx = connector_cnt; idx > 1 ; idx--) { + for (idx_2 = 0; idx_2 < (idx - 1); idx_2++) { + if (sort_connector[idx_2]->dc_link->link_enc_hw_inst > + sort_connector[idx_2 + 1]->dc_link->link_enc_hw_inst) + swap(sort_connector[idx_2], sort_connector[idx_2 + 1]); + } + } + + /* + * Sort mst connectors by RAD. 
mst connectors with the same enc_hw_instance are already + * sorted together above. + */ + for (idx = 0; idx < connector_cnt; /*Do nothing*/) { + if (sort_connector[idx]->mst_root) { + uint8_t i, j, k; + uint8_t mst_con_cnt = 1; + + for (idx_2 = (idx + 1); idx_2 < connector_cnt; idx_2++) { + if (sort_connector[idx_2]->mst_root == sort_connector[idx]->mst_root) + mst_con_cnt++; + else + break; + } + + for (i = mst_con_cnt; i > 1; i--) { + for (j = idx; j < (idx + i - 2); j++) { + int mstb_lct = sort_connector[j]->mst_output_port->parent->lct; + int next_mstb_lct = sort_connector[j + 1]->mst_output_port->parent->lct; + u8 *rad; + u8 *next_rad; + bool swap = false; + + /* Sort by mst tree depth first. Then compare RAD if depth is the same*/ + if (mstb_lct > next_mstb_lct) { + swap = true; + } else if (mstb_lct == next_mstb_lct) { + if (mstb_lct == 1) { + if (sort_connector[j]->mst_output_port->port_num > sort_connector[j + 1]->mst_output_port->port_num) + swap = true; + } else if (mstb_lct > 1) { + rad = sort_connector[j]->mst_output_port->parent->rad; + next_rad = sort_connector[j + 1]->mst_output_port->parent->rad; + + for (k = 0; k < mstb_lct - 1; k++) { + int shift = (k % 2) ? 0 : 4; + int port_num = (rad[k / 2] >> shift) & 0xf; + int next_port_num = (next_rad[k / 2] >> shift) & 0xf; + + if (port_num > next_port_num) { + swap = true; + break; + } + } + } else { + DRM_ERROR("MST LCT shouldn't be set as < 1"); + mutex_unlock(&ddev->mode_config.mutex); + return; + } + } + + if (swap) + swap(sort_connector[j], sort_connector[j + 1]); + } + } + + idx += mst_con_cnt; + } else { + idx++; + } + } + + /* Complete sorting. 
Assign relavant result to dm->secure_display_ctx.phy_id_mapping[]*/ + memset(dm->secure_display_ctx.phy_id_mapping, 0, sizeof(dm->secure_display_ctx.phy_id_mapping)); + for (idx = 0; idx < connector_cnt; idx++) { + aconnector = sort_connector[idx]; + + dm->secure_display_ctx.phy_id_mapping[idx].assigned = true; + dm->secure_display_ctx.phy_id_mapping[idx].is_mst = false; + dm->secure_display_ctx.phy_id_mapping[idx].enc_hw_inst = aconnector->dc_link->link_enc_hw_inst; + + if (sort_connector[idx]->mst_root) { + dm->secure_display_ctx.phy_id_mapping[idx].is_mst = true; + dm->secure_display_ctx.phy_id_mapping[idx].lct = aconnector->mst_output_port->parent->lct; + dm->secure_display_ctx.phy_id_mapping[idx].port_num = aconnector->mst_output_port->port_num; + memcpy(dm->secure_display_ctx.phy_id_mapping[idx].rad, + aconnector->mst_output_port->parent->rad, sizeof(aconnector->mst_output_port->parent->rad)); + } + } + mutex_unlock(&ddev->mode_config.mutex); + + dm->secure_display_ctx.phy_id_mapping_cnt = connector_cnt; + dm->secure_display_ctx.phy_mapping_updated = true; +} + +static bool get_phy_id(struct amdgpu_display_manager *dm, + struct amdgpu_dm_connector *aconnector, uint8_t *phy_id) +{ + int idx, idx_2; + bool found = false; + + /* + * Assume secure display start after all connectors are probed. 
The connection + * config is static as well + */ + if (!dm->secure_display_ctx.phy_mapping_updated) { + DRM_WARN("%s Should update the phy id table before get it's value", __func__); + return false; + } + + for (idx = 0; idx < dm->secure_display_ctx.phy_id_mapping_cnt; idx++) { + if (!dm->secure_display_ctx.phy_id_mapping[idx].assigned) { + DRM_ERROR("phy_id_mapping[%d] should be assigned", idx); + return false; + } + + if (aconnector->dc_link->link_enc_hw_inst == + dm->secure_display_ctx.phy_id_mapping[idx].enc_hw_inst) { + if (!dm->secure_display_ctx.phy_id_mapping[idx].is_mst) { + found = true; + goto out; + } else { + /* Could caused by wrongly pass mst root connector */ + if (!aconnector->mst_output_port) { + DRM_ERROR("%s Check mst case but connector without a port assigned", __func__); + return false; + } + + if (aconnector->mst_root && + aconnector->mst_root->mst_mgr.mst_primary == NULL) { + DRM_WARN("%s pass in a stale mst connector", __func__); + } + + if (aconnector->mst_output_port->parent->lct == dm->secure_display_ctx.phy_id_mapping[idx].lct && + aconnector->mst_output_port->port_num == dm->secure_display_ctx.phy_id_mapping[idx].port_num) { + if (aconnector->mst_output_port->parent->lct == 1) { + found = true; + goto out; + } else if (aconnector->mst_output_port->parent->lct > 1) { + /* Check RAD */ + for (idx_2 = 0; idx_2 < aconnector->mst_output_port->parent->lct - 1; idx_2++) { + int shift = (idx_2 % 2) ? 
0 : 4; + int port_num = (aconnector->mst_output_port->parent->rad[idx_2 / 2] >> shift) & 0xf; + int port_num2 = (dm->secure_display_ctx.phy_id_mapping[idx].rad[idx_2 / 2] >> shift) & 0xf; + + if (port_num != port_num2) + break; + } + + if (idx_2 == aconnector->mst_output_port->parent->lct - 1) { + found = true; + goto out; + } + } else { + DRM_ERROR("lCT should be >= 1"); + return false; + } + } + } + } + } + +out: + if (found) { + DRM_DEBUG_DRIVER("Associated secure display PHY ID as %d", idx); + *phy_id = idx; + } else { + DRM_WARN("Can't find associated phy ID"); + return false; + } + + return true; +} + static void amdgpu_dm_set_crc_window_default(struct drm_crtc *crtc, struct dc_stream_state *stream) { struct drm_device *drm_dev = crtc->dev; struct amdgpu_display_manager *dm = &drm_to_adev(drm_dev)->dm; struct amdgpu_crtc *acrtc = to_amdgpu_crtc(crtc); + struct amdgpu_dm_connector *aconnector; bool was_activated; + uint8_t phy_id; + unsigned long flags; + int i; - spin_lock_irq(&drm_dev->event_lock); - was_activated = acrtc->dm_irq_params.window_param.activated; - acrtc->dm_irq_params.window_param.x_start = 0; - acrtc->dm_irq_params.window_param.y_start = 0; - acrtc->dm_irq_params.window_param.x_end = 0; - acrtc->dm_irq_params.window_param.y_end = 0; - acrtc->dm_irq_params.window_param.activated = false; - acrtc->dm_irq_params.window_param.update_win = false; - acrtc->dm_irq_params.window_param.skip_frame_cnt = 0; - spin_unlock_irq(&drm_dev->event_lock); + spin_lock_irqsave(&drm_dev->event_lock, flags); + was_activated = acrtc->dm_irq_params.crc_window_activated; + for (i = 0; i < MAX_CRC_WINDOW_NUM; i++) { + acrtc->dm_irq_params.window_param[i].x_start = 0; + acrtc->dm_irq_params.window_param[i].y_start = 0; + acrtc->dm_irq_params.window_param[i].x_end = 0; + acrtc->dm_irq_params.window_param[i].y_end = 0; + acrtc->dm_irq_params.window_param[i].enable = false; + acrtc->dm_irq_params.window_param[i].update_win = false; + 
acrtc->dm_irq_params.window_param[i].skip_frame_cnt = 0; + } + acrtc->dm_irq_params.crc_window_activated = false; + spin_unlock_irqrestore(&drm_dev->event_lock, flags); /* Disable secure_display if it was enabled */ - if (was_activated) { + if (was_activated && dm->secure_display_ctx.op_mode == LEGACY_MODE) { /* stop ROI update on this crtc */ - flush_work(&dm->secure_display_ctxs[crtc->index].notify_ta_work); - flush_work(&dm->secure_display_ctxs[crtc->index].forward_roi_work); - dc_stream_forward_crc_window(stream, NULL, true); + flush_work(&dm->secure_display_ctx.crtc_ctx[crtc->index].notify_ta_work); + flush_work(&dm->secure_display_ctx.crtc_ctx[crtc->index].forward_roi_work); + aconnector = (struct amdgpu_dm_connector *)stream->dm_stream_context; + + if (aconnector && get_phy_id(dm, aconnector, &phy_id)) { + if (dm->secure_display_ctx.support_mul_roi) + dc_stream_forward_multiple_crc_window(stream, NULL, phy_id, true); + else + dc_stream_forward_crc_window(stream, NULL, phy_id, true); + } else { + DRM_DEBUG_DRIVER("%s Can't find matching phy id", __func__); + } } } static void amdgpu_dm_crtc_notify_ta_to_read(struct work_struct *work) { - struct secure_display_context *secure_display_ctx; + struct secure_display_crtc_context *crtc_ctx; struct psp_context *psp; struct ta_securedisplay_cmd *securedisplay_cmd; struct drm_crtc *crtc; struct dc_stream_state *stream; + struct amdgpu_dm_connector *aconnector; uint8_t phy_inst; + struct amdgpu_display_manager *dm; + struct crc_data crc_cpy[MAX_CRC_WINDOW_NUM]; + unsigned long flags; + uint8_t roi_idx = 0; int ret; + int i; - secure_display_ctx = container_of(work, struct secure_display_context, notify_ta_work); - crtc = secure_display_ctx->crtc; + crtc_ctx = container_of(work, struct secure_display_crtc_context, notify_ta_work); + crtc = crtc_ctx->crtc; if (!crtc) return; @@ -133,21 +363,50 @@ static void amdgpu_dm_crtc_notify_ta_to_read(struct work_struct *work) return; } + dm = &drm_to_adev(crtc->dev)->dm; stream = 
to_amdgpu_crtc(crtc)->dm_irq_params.stream; - phy_inst = stream->link->link_enc_hw_inst; + aconnector = (struct amdgpu_dm_connector *)stream->dm_stream_context; + if (!aconnector) + return; + + mutex_lock(&crtc->dev->mode_config.mutex); + if (!get_phy_id(dm, aconnector, &phy_inst)) { + DRM_WARN("%s Can't find mapping phy id!", __func__); + mutex_unlock(&crtc->dev->mode_config.mutex); + return; + } + mutex_unlock(&crtc->dev->mode_config.mutex); + + spin_lock_irqsave(&crtc->dev->event_lock, flags); + memcpy(crc_cpy, crtc_ctx->crc_info.crc, sizeof(struct crc_data) * MAX_CRC_WINDOW_NUM); + spin_unlock_irqrestore(&crtc->dev->event_lock, flags); /* need lock for multiple crtcs to use the command buffer */ mutex_lock(&psp->securedisplay_context.mutex); - - psp_prep_securedisplay_cmd_buf(psp, &securedisplay_cmd, - TA_SECUREDISPLAY_COMMAND__SEND_ROI_CRC); - - securedisplay_cmd->securedisplay_in_message.send_roi_crc.phy_id = phy_inst; - /* PSP TA is expected to finish data transmission over I2C within current frame, * even there are up to 4 crtcs request to send in this frame. 
*/ - ret = psp_securedisplay_invoke(psp, TA_SECUREDISPLAY_COMMAND__SEND_ROI_CRC); + if (dm->secure_display_ctx.support_mul_roi) { + psp_prep_securedisplay_cmd_buf(psp, &securedisplay_cmd, + TA_SECUREDISPLAY_COMMAND__SEND_ROI_CRC_V2); + + securedisplay_cmd->securedisplay_in_message.send_roi_crc_v2.phy_id = phy_inst; + + for (i = 0; i < MAX_CRC_WINDOW_NUM; i++) { + if (crc_cpy[i].crc_ready) + roi_idx |= 1 << i; + } + securedisplay_cmd->securedisplay_in_message.send_roi_crc_v2.roi_idx = roi_idx; + + ret = psp_securedisplay_invoke(psp, TA_SECUREDISPLAY_COMMAND__SEND_ROI_CRC_V2); + } else { + psp_prep_securedisplay_cmd_buf(psp, &securedisplay_cmd, + TA_SECUREDISPLAY_COMMAND__SEND_ROI_CRC); + + securedisplay_cmd->securedisplay_in_message.send_roi_crc.phy_id = phy_inst; + + ret = psp_securedisplay_invoke(psp, TA_SECUREDISPLAY_COMMAND__SEND_ROI_CRC); + } if (!ret) { if (securedisplay_cmd->status != TA_SECUREDISPLAY_STATUS__SUCCESS) @@ -160,22 +419,47 @@ static void amdgpu_dm_crtc_notify_ta_to_read(struct work_struct *work) static void amdgpu_dm_forward_crc_window(struct work_struct *work) { - struct secure_display_context *secure_display_ctx; + struct secure_display_crtc_context *crtc_ctx; struct amdgpu_display_manager *dm; struct drm_crtc *crtc; struct dc_stream_state *stream; + struct amdgpu_dm_connector *aconnector; + struct crc_window roi_cpy[MAX_CRC_WINDOW_NUM]; + unsigned long flags; + uint8_t phy_id; - secure_display_ctx = container_of(work, struct secure_display_context, forward_roi_work); - crtc = secure_display_ctx->crtc; + crtc_ctx = container_of(work, struct secure_display_crtc_context, forward_roi_work); + crtc = crtc_ctx->crtc; if (!crtc) return; dm = &drm_to_adev(crtc->dev)->dm; stream = to_amdgpu_crtc(crtc)->dm_irq_params.stream; + aconnector = (struct amdgpu_dm_connector *)stream->dm_stream_context; + + if (!aconnector) + return; + + mutex_lock(&crtc->dev->mode_config.mutex); + if (!get_phy_id(dm, aconnector, &phy_id)) { + DRM_WARN("%s Can't find mapping 
phy id!", __func__); + mutex_unlock(&crtc->dev->mode_config.mutex); + return; + } + mutex_unlock(&crtc->dev->mode_config.mutex); + + spin_lock_irqsave(&crtc->dev->event_lock, flags); + memcpy(roi_cpy, crtc_ctx->roi, sizeof(struct crc_window) * MAX_CRC_WINDOW_NUM); + spin_unlock_irqrestore(&crtc->dev->event_lock, flags); mutex_lock(&dm->dc_lock); - dc_stream_forward_crc_window(stream, &secure_display_ctx->rect, false); + if (dm->secure_display_ctx.support_mul_roi) + dc_stream_forward_multiple_crc_window(stream, roi_cpy, + phy_id, false); + else + dc_stream_forward_crc_window(stream, &roi_cpy[0].rect, + phy_id, false); mutex_unlock(&dm->dc_lock); } @@ -186,7 +470,7 @@ bool amdgpu_dm_crc_window_is_activated(struct drm_crtc *crtc) bool ret = false; spin_lock_irq(&drm_dev->event_lock); - ret = acrtc->dm_irq_params.window_param.activated; + ret = acrtc->dm_irq_params.crc_window_activated; spin_unlock_irq(&drm_dev->event_lock); return ret; @@ -224,10 +508,14 @@ int amdgpu_dm_crtc_configure_crc_source(struct drm_crtc *crtc, mutex_lock(&adev->dm.dc_lock); + /* For PSR1, check that the panel has exited PSR */ + if (stream_state->link->psr_settings.psr_version < DC_PSR_VERSION_SU_1) + amdgpu_dm_psr_wait_disable(stream_state); + /* Enable or disable CRTC CRC generation */ if (dm_is_crc_source_crtc(source) || source == AMDGPU_DM_PIPE_CRC_SOURCE_NONE) { if (!dc_stream_configure_crc(stream_state->ctx->dc, - stream_state, NULL, enable, enable)) { + stream_state, NULL, enable, enable, 0, true)) { ret = -EINVAL; goto unlock; } @@ -258,6 +546,10 @@ int amdgpu_dm_crtc_set_crc_source(struct drm_crtc *crtc, const char *src_name) struct drm_crtc_commit *commit; struct dm_crtc_state *crtc_state; struct drm_device *drm_dev = crtc->dev; +#if defined(CONFIG_DRM_AMD_SECURE_DISPLAY) + struct amdgpu_device *adev = drm_to_adev(drm_dev); + struct amdgpu_display_manager *dm = &adev->dm; +#endif struct amdgpu_crtc *acrtc = to_amdgpu_crtc(crtc); struct drm_dp_aux *aux = NULL; bool enable = false; @@ 
-357,6 +649,17 @@ int amdgpu_dm_crtc_set_crc_source(struct drm_crtc *crtc, const char *src_name) } + /* + * Reading the CRC requires the vblank interrupt handler to be + * enabled. Keep a reference until CRC capture stops. + */ + enabled = amdgpu_dm_is_valid_crc_source(cur_crc_src); + if (!enabled && enable) { + ret = drm_crtc_vblank_get(crtc); + if (ret) + goto cleanup; + } + #if defined(CONFIG_DRM_AMD_SECURE_DISPLAY) /* Reset secure_display when we change crc source from debugfs */ amdgpu_dm_set_crc_window_default(crtc, crtc_state->stream); @@ -367,16 +670,7 @@ int amdgpu_dm_crtc_set_crc_source(struct drm_crtc *crtc, const char *src_name) goto cleanup; } - /* - * Reading the CRC requires the vblank interrupt handler to be - * enabled. Keep a reference until CRC capture stops. - */ - enabled = amdgpu_dm_is_valid_crc_source(cur_crc_src); if (!enabled && enable) { - ret = drm_crtc_vblank_get(crtc); - if (ret) - goto cleanup; - if (dm_is_crc_source_dprx(source)) { if (drm_dp_start_crc(aux, crtc)) { DRM_DEBUG_DRIVER("dp start crc failed\n"); @@ -402,6 +696,13 @@ int amdgpu_dm_crtc_set_crc_source(struct drm_crtc *crtc, const char *src_name) /* Reset crc_skipped on dm state */ crtc_state->crc_skip_count = 0; +#if defined(CONFIG_DRM_AMD_SECURE_DISPLAY) + /* Initialize phy id mapping table for secure display*/ + if (dm->secure_display_ctx.op_mode == LEGACY_MODE && + !dm->secure_display_ctx.phy_mapping_updated) + update_phy_id_mapping(adev); +#endif + cleanup: if (commit) drm_crtc_commit_put(commit); @@ -456,7 +757,7 @@ void amdgpu_dm_crtc_handle_crc_irq(struct drm_crtc *crtc) } if (dm_is_crc_source_crtc(cur_crc_src)) { - if (!dc_stream_get_crc(stream_state->ctx->dc, stream_state, + if (!dc_stream_get_crc(stream_state->ctx->dc, stream_state, 0, &crcs[0], &crcs[1], &crcs[2])) return; @@ -472,8 +773,17 @@ void amdgpu_dm_crtc_handle_crc_window_irq(struct drm_crtc *crtc) enum amdgpu_dm_pipe_crc_source cur_crc_src; struct amdgpu_crtc *acrtc = NULL; struct amdgpu_device *adev = 
NULL; - struct secure_display_context *secure_display_ctx = NULL; + struct secure_display_crtc_context *crtc_ctx = NULL; + bool reset_crc_frame_count[MAX_CRC_WINDOW_NUM] = {false}; + uint32_t crc_r[MAX_CRC_WINDOW_NUM] = {0}; + uint32_t crc_g[MAX_CRC_WINDOW_NUM] = {0}; + uint32_t crc_b[MAX_CRC_WINDOW_NUM] = {0}; unsigned long flags1; + bool forward_roi_change = false; + bool notify_ta = false; + bool all_crc_ready = true; + struct dc_stream_state *stream_state; + int i; if (crtc == NULL) return; @@ -481,78 +791,160 @@ void amdgpu_dm_crtc_handle_crc_window_irq(struct drm_crtc *crtc) acrtc = to_amdgpu_crtc(crtc); adev = drm_to_adev(crtc->dev); drm_dev = crtc->dev; + stream_state = to_dm_crtc_state(crtc->state)->stream; spin_lock_irqsave(&drm_dev->event_lock, flags1); cur_crc_src = acrtc->dm_irq_params.crc_src; /* Early return if CRC capture is not enabled. */ if (!amdgpu_dm_is_valid_crc_source(cur_crc_src) || - !dm_is_crc_source_crtc(cur_crc_src)) - goto cleanup; - - if (!acrtc->dm_irq_params.window_param.activated) - goto cleanup; - - if (acrtc->dm_irq_params.window_param.skip_frame_cnt) { - acrtc->dm_irq_params.window_param.skip_frame_cnt -= 1; - goto cleanup; + !dm_is_crc_source_crtc(cur_crc_src)) { + spin_unlock_irqrestore(&drm_dev->event_lock, flags1); + return; } - secure_display_ctx = &adev->dm.secure_display_ctxs[acrtc->crtc_id]; - if (WARN_ON(secure_display_ctx->crtc != crtc)) { - /* We have set the crtc when creating secure_display_context, + if (!acrtc->dm_irq_params.crc_window_activated) { + spin_unlock_irqrestore(&drm_dev->event_lock, flags1); + return; + } + + crtc_ctx = &adev->dm.secure_display_ctx.crtc_ctx[acrtc->crtc_id]; + if (WARN_ON(crtc_ctx->crtc != crtc)) { + /* We have set the crtc when creating secure_display_crtc_context, * don't expect it to be changed here. 
*/ - secure_display_ctx->crtc = crtc; + crtc_ctx->crtc = crtc; } - if (acrtc->dm_irq_params.window_param.update_win) { - /* prepare work for dmub to update ROI */ - secure_display_ctx->rect.x = acrtc->dm_irq_params.window_param.x_start; - secure_display_ctx->rect.y = acrtc->dm_irq_params.window_param.y_start; - secure_display_ctx->rect.width = acrtc->dm_irq_params.window_param.x_end - - acrtc->dm_irq_params.window_param.x_start; - secure_display_ctx->rect.height = acrtc->dm_irq_params.window_param.y_end - - acrtc->dm_irq_params.window_param.y_start; - schedule_work(&secure_display_ctx->forward_roi_work); + for (i = 0; i < MAX_CRC_WINDOW_NUM; i++) { + struct crc_params crc_window = { + .windowa_x_start = acrtc->dm_irq_params.window_param[i].x_start, + .windowa_y_start = acrtc->dm_irq_params.window_param[i].y_start, + .windowa_x_end = acrtc->dm_irq_params.window_param[i].x_end, + .windowa_y_end = acrtc->dm_irq_params.window_param[i].y_end, + .windowb_x_start = acrtc->dm_irq_params.window_param[i].x_start, + .windowb_y_start = acrtc->dm_irq_params.window_param[i].y_start, + .windowb_x_end = acrtc->dm_irq_params.window_param[i].x_end, + .windowb_y_end = acrtc->dm_irq_params.window_param[i].y_end, + }; - acrtc->dm_irq_params.window_param.update_win = false; + crtc_ctx->roi[i].enable = acrtc->dm_irq_params.window_param[i].enable; - /* Statically skip 1 frame, because we may need to wait below things - * before sending ROI to dmub: - * 1. We defer the work by using system workqueue. - * 2. We may need to wait for dc_lock before accessing dmub. 
- */ - acrtc->dm_irq_params.window_param.skip_frame_cnt = 1; + if (!acrtc->dm_irq_params.window_param[i].enable) { + crtc_ctx->crc_info.crc[i].crc_ready = false; + continue; + } - } else { - /* prepare work for psp to read ROI/CRC and send to I2C */ - schedule_work(&secure_display_ctx->notify_ta_work); + if (acrtc->dm_irq_params.window_param[i].skip_frame_cnt) { + acrtc->dm_irq_params.window_param[i].skip_frame_cnt -= 1; + crtc_ctx->crc_info.crc[i].crc_ready = false; + continue; + } + + if (acrtc->dm_irq_params.window_param[i].update_win) { + crtc_ctx->roi[i].rect.x = crc_window.windowa_x_start; + crtc_ctx->roi[i].rect.y = crc_window.windowa_y_start; + crtc_ctx->roi[i].rect.width = crc_window.windowa_x_end - + crc_window.windowa_x_start; + crtc_ctx->roi[i].rect.height = crc_window.windowa_y_end - + crc_window.windowa_y_start; + + if (adev->dm.secure_display_ctx.op_mode == LEGACY_MODE) + /* forward task to dmub to update ROI */ + forward_roi_change = true; + else if (adev->dm.secure_display_ctx.op_mode == DISPLAY_CRC_MODE) + /* update ROI via dm*/ + dc_stream_configure_crc(stream_state->ctx->dc, stream_state, + &crc_window, true, true, i, false); + + reset_crc_frame_count[i] = true; + + acrtc->dm_irq_params.window_param[i].update_win = false; + + /* Statically skip 1 frame, because we may need to wait below things + * before sending ROI to dmub: + * 1. We defer the work by using system workqueue. + * 2. We may need to wait for dc_lock before accessing dmub. 
+ */ + acrtc->dm_irq_params.window_param[i].skip_frame_cnt = 1; + crtc_ctx->crc_info.crc[i].crc_ready = false; + } else { + if (!dc_stream_get_crc(stream_state->ctx->dc, stream_state, i, + &crc_r[i], &crc_g[i], &crc_b[i])) + DRM_ERROR("Secure Display: fail to get crc from engine %d\n", i); + + if (adev->dm.secure_display_ctx.op_mode == LEGACY_MODE) + /* forward task to psp to read ROI/CRC and output via I2C */ + notify_ta = true; + else if (adev->dm.secure_display_ctx.op_mode == DISPLAY_CRC_MODE) + /* Avoid ROI window get changed, keep overwriting. */ + dc_stream_configure_crc(stream_state->ctx->dc, stream_state, + &crc_window, true, true, i, false); + + /* crc ready for psp to read out */ + crtc_ctx->crc_info.crc[i].crc_ready = true; + } } -cleanup: spin_unlock_irqrestore(&drm_dev->event_lock, flags1); + + if (forward_roi_change) + schedule_work(&crtc_ctx->forward_roi_work); + + if (notify_ta) + schedule_work(&crtc_ctx->notify_ta_work); + + spin_lock_irqsave(&crtc_ctx->crc_info.lock, flags1); + for (i = 0; i < MAX_CRC_WINDOW_NUM; i++) { + crtc_ctx->crc_info.crc[i].crc_R = crc_r[i]; + crtc_ctx->crc_info.crc[i].crc_G = crc_g[i]; + crtc_ctx->crc_info.crc[i].crc_B = crc_b[i]; + + if (!crtc_ctx->roi[i].enable) { + crtc_ctx->crc_info.crc[i].frame_count = 0; + continue; + } + + if (!crtc_ctx->crc_info.crc[i].crc_ready) + all_crc_ready = false; + + if (reset_crc_frame_count[i] || crtc_ctx->crc_info.crc[i].frame_count == UINT_MAX) + /* Reset the reference frame count after user update the ROI + * or it reaches the maximum value. 
+ */ + crtc_ctx->crc_info.crc[i].frame_count = 0; + else + crtc_ctx->crc_info.crc[i].frame_count += 1; + } + spin_unlock_irqrestore(&crtc_ctx->crc_info.lock, flags1); + + if (all_crc_ready) + complete_all(&crtc_ctx->crc_info.completion); } -struct secure_display_context * -amdgpu_dm_crtc_secure_display_create_contexts(struct amdgpu_device *adev) +void amdgpu_dm_crtc_secure_display_create_contexts(struct amdgpu_device *adev) { - struct secure_display_context *secure_display_ctxs = NULL; + struct secure_display_crtc_context *crtc_ctx = NULL; int i; - secure_display_ctxs = kcalloc(adev->mode_info.num_crtc, - sizeof(struct secure_display_context), + crtc_ctx = kcalloc(adev->mode_info.num_crtc, + sizeof(struct secure_display_crtc_context), GFP_KERNEL); - if (!secure_display_ctxs) - return NULL; - - for (i = 0; i < adev->mode_info.num_crtc; i++) { - INIT_WORK(&secure_display_ctxs[i].forward_roi_work, amdgpu_dm_forward_crc_window); - INIT_WORK(&secure_display_ctxs[i].notify_ta_work, amdgpu_dm_crtc_notify_ta_to_read); - secure_display_ctxs[i].crtc = &adev->mode_info.crtcs[i]->base; + if (!crtc_ctx) { + adev->dm.secure_display_ctx.crtc_ctx = NULL; + return; } - return secure_display_ctxs; + for (i = 0; i < adev->mode_info.num_crtc; i++) { + INIT_WORK(&crtc_ctx[i].forward_roi_work, amdgpu_dm_forward_crc_window); + INIT_WORK(&crtc_ctx[i].notify_ta_work, amdgpu_dm_crtc_notify_ta_to_read); + crtc_ctx[i].crtc = &adev->mode_info.crtcs[i]->base; + spin_lock_init(&crtc_ctx[i].crc_info.lock); + } + + adev->dm.secure_display_ctx.crtc_ctx = crtc_ctx; + + adev->dm.secure_display_ctx.op_mode = DISPLAY_CRC_MODE; } #endif diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crc.h b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crc.h index 748e80ef40d0..3da056c8d20b 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crc.h +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crc.h @@ -40,20 +40,53 @@ enum amdgpu_dm_pipe_crc_source { }; #ifdef 
CONFIG_DRM_AMD_SECURE_DISPLAY +#define MAX_CRTC 6 + +enum secure_display_mode { + /* via dmub + psp */ + LEGACY_MODE = 0, + /* driver directly */ + DISPLAY_CRC_MODE, + SECURE_DISPLAY_MODE_MAX, +}; + +struct phy_id_mapping { + bool assigned; + bool is_mst; + uint8_t enc_hw_inst; + u8 lct; + u8 port_num; + u8 rad[8]; +}; + +struct crc_data { + uint32_t crc_R; + uint32_t crc_G; + uint32_t crc_B; + uint32_t frame_count; + bool crc_ready; +}; + +struct crc_info { + struct crc_data crc[MAX_CRC_WINDOW_NUM]; + struct completion completion; + spinlock_t lock; +}; + struct crc_window_param { uint16_t x_start; uint16_t y_start; uint16_t x_end; uint16_t y_end; /* CRC window is activated or not*/ - bool activated; + bool enable; /* Update crc window during vertical blank or not */ bool update_win; /* skip reading/writing for few frames */ int skip_frame_cnt; }; -struct secure_display_context { +struct secure_display_crtc_context { /* work to notify PSP TA*/ struct work_struct notify_ta_work; @@ -63,7 +96,20 @@ struct secure_display_context { struct drm_crtc *crtc; /* Region of Interest (ROI) */ - struct rect rect; + struct crc_window roi[MAX_CRC_WINDOW_NUM]; + + struct crc_info crc_info; +}; + +struct secure_display_context { + + struct secure_display_crtc_context *crtc_ctx; + /* Whether dmub support multiple ROI setting */ + bool support_mul_roi; + enum secure_display_mode op_mode; + bool phy_mapping_updated; + int phy_id_mapping_cnt; + struct phy_id_mapping phy_id_mapping[MAX_CRTC]; }; #endif @@ -95,8 +141,7 @@ void amdgpu_dm_crtc_handle_crc_irq(struct drm_crtc *crtc); #ifdef CONFIG_DRM_AMD_SECURE_DISPLAY bool amdgpu_dm_crc_window_is_activated(struct drm_crtc *crtc); void amdgpu_dm_crtc_handle_crc_window_irq(struct drm_crtc *crtc); -struct secure_display_context *amdgpu_dm_crtc_secure_display_create_contexts( - struct amdgpu_device *adev); +void amdgpu_dm_crtc_secure_display_create_contexts(struct amdgpu_device *adev); #else #define amdgpu_dm_crc_window_is_activated(x) 
#define amdgpu_dm_crtc_handle_crc_window_irq(x) diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crtc.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crtc.c index 64a041c2af05..36a830a7440f 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crtc.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crtc.c @@ -93,7 +93,7 @@ int amdgpu_dm_crtc_set_vupdate_irq(struct drm_crtc *crtc, bool enable) return rc; } -bool amdgpu_dm_crtc_vrr_active(struct dm_crtc_state *dm_state) +bool amdgpu_dm_crtc_vrr_active(const struct dm_crtc_state *dm_state) { return dm_state->freesync_config.state == VRR_STATE_ACTIVE_VARIABLE || dm_state->freesync_config.state == VRR_STATE_ACTIVE_FIXED; @@ -142,7 +142,7 @@ static void amdgpu_dm_crtc_set_panel_sr_feature( amdgpu_dm_replay_enable(vblank_work->stream, true); } else if (vblank_enabled) { if (link->psr_settings.psr_version < DC_PSR_VERSION_SU_1 && is_sr_active) - amdgpu_dm_psr_disable(vblank_work->stream); + amdgpu_dm_psr_disable(vblank_work->stream, false); } else if (link->psr_settings.psr_feature_enabled && allow_sr_entry && !is_sr_active && !is_crc_window_active) { diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crtc.h b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crtc.h index 17e948753f59..c1212947a77b 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crtc.h +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crtc.h @@ -37,7 +37,7 @@ int amdgpu_dm_crtc_set_vupdate_irq(struct drm_crtc *crtc, bool enable); bool amdgpu_dm_crtc_vrr_active_irq(struct amdgpu_crtc *acrtc); -bool amdgpu_dm_crtc_vrr_active(struct dm_crtc_state *dm_state); +bool amdgpu_dm_crtc_vrr_active(const struct dm_crtc_state *dm_state); int amdgpu_dm_crtc_enable_vblank(struct drm_crtc *crtc); diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c index 6a97bb2d9160..049046c60462 100644 --- 
a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c @@ -25,6 +25,7 @@ #include #include +#include #include "dc.h" #include "amdgpu.h" @@ -258,7 +259,7 @@ static ssize_t dp_link_settings_write(struct file *f, const char __user *buf, struct dc_link *link = connector->dc_link; struct amdgpu_device *adev = drm_to_adev(connector->base.dev); struct dc *dc = (struct dc *)link->dc; - struct dc_link_settings prefer_link_settings; + struct dc_link_settings prefer_link_settings = {0}; char *wr_buf = NULL; const uint32_t wr_buf_size = 40; /* 0: lane_count; 1: link_rate */ @@ -389,7 +390,7 @@ static ssize_t dp_mst_link_setting(struct file *f, const char __user *buf, struct dc_link *link = aconnector->dc_link; struct amdgpu_device *adev = drm_to_adev(aconnector->base.dev); struct dc *dc = (struct dc *)link->dc; - struct dc_link_settings prefer_link_settings; + struct dc_link_settings prefer_link_settings = {0}; char *wr_buf = NULL; const uint32_t wr_buf_size = 40; /* 0: lane_count; 1: link_rate */ @@ -613,7 +614,7 @@ static ssize_t dp_phy_settings_write(struct file *f, const char __user *buf, uint32_t wr_buf_size = 40; long param[3]; bool use_prefer_link_setting; - struct link_training_settings link_lane_settings; + struct link_training_settings link_lane_settings = {0}; int max_param_num = 3; uint8_t param_nums = 0; int r = 0; @@ -768,7 +769,7 @@ static ssize_t dp_phy_test_pattern_debugfs_write(struct file *f, const char __us LINK_RATE_UNKNOWN, LINK_SPREAD_DISABLED}; struct dc_link_settings cur_link_settings = {LANE_COUNT_UNKNOWN, LINK_RATE_UNKNOWN, LINK_SPREAD_DISABLED}; - struct link_training_settings link_training_settings; + struct link_training_settings link_training_settings = {0}; int i; if (size == 0) @@ -902,9 +903,10 @@ static int dmub_tracebuffer_show(struct seq_file *m, void *data) { struct amdgpu_device *adev = m->private; struct dmub_srv_fb_info *fb_info = adev->dm.dmub_fb_info; + struct 
dmub_fw_meta_info *fw_meta_info = NULL; struct dmub_debugfs_trace_entry *entries; uint8_t *tbuf_base; - uint32_t tbuf_size, max_entries, num_entries, i; + uint32_t tbuf_size, max_entries, num_entries, first_entry, i; if (!fb_info) return 0; @@ -913,20 +915,42 @@ static int dmub_tracebuffer_show(struct seq_file *m, void *data) if (!tbuf_base) return 0; - tbuf_size = fb_info->fb[DMUB_WINDOW_5_TRACEBUFF].size; + if (adev->dm.dmub_srv) + fw_meta_info = &adev->dm.dmub_srv->meta_info; + + tbuf_size = fw_meta_info ? fw_meta_info->trace_buffer_size : + DMUB_TRACE_BUFFER_SIZE; max_entries = (tbuf_size - sizeof(struct dmub_debugfs_trace_header)) / sizeof(struct dmub_debugfs_trace_entry); num_entries = ((struct dmub_debugfs_trace_header *)tbuf_base)->entry_count; + /* DMCUB tracebuffer is a ring. If it rolled over, print a hint that + * entries are being overwritten. + */ + if (num_entries > max_entries) + seq_printf(m, "...\n"); + + first_entry = num_entries % max_entries; num_entries = min(num_entries, max_entries); entries = (struct dmub_debugfs_trace_entry *)(tbuf_base + sizeof(struct dmub_debugfs_trace_header)); - for (i = 0; i < num_entries; ++i) { + /* To print entries chronologically, start from the first entry till the + * top of buffer, then from base of buffer to first entry. + */ + for (i = first_entry; i < num_entries; ++i) { + struct dmub_debugfs_trace_entry *entry = &entries[i]; + + seq_printf(m, + "trace_code=%u tick_count=%u param0=%u param1=%u\n", + entry->trace_code, entry->tick_count, entry->param0, + entry->param1); + } + for (i = 0; i < first_entry; ++i) { struct dmub_debugfs_trace_entry *entry = &entries[i]; seq_printf(m, @@ -2825,6 +2849,67 @@ static int is_dpia_link_show(struct seq_file *m, void *data) return 0; } +/** + * hdmi_cec_state_show - Read out the HDMI-CEC feature status + * @m: sequence file. + * @data: unused. 
+ * + * Return 0 on success + */ +static int hdmi_cec_state_show(struct seq_file *m, void *data) +{ + struct drm_connector *connector = m->private; + struct amdgpu_dm_connector *aconnector = to_amdgpu_dm_connector(connector); + + seq_printf(m, "%s:%d\n", connector->name, connector->base.id); + seq_printf(m, "HDMI-CEC status: %d\n", aconnector->notifier ? 1 : 0); + + return 0; +} + +/** + * hdmi_cec_state_write - Enable/Disable HDMI-CEC feature from driver side + * @f: file structure. + * @buf: userspace buffer. set to '1' to enable; '0' to disable cec feature. + * @size: size of buffer from userpsace. + * @pos: unused. + * + * Return size on success, error code on failure + */ +static ssize_t hdmi_cec_state_write(struct file *f, const char __user *buf, + size_t size, loff_t *pos) +{ + int ret; + bool enable; + struct amdgpu_dm_connector *aconnector = file_inode(f)->i_private; + struct drm_device *ddev = aconnector->base.dev; + + if (size == 0) + return -EINVAL; + + ret = kstrtobool_from_user(buf, size, &enable); + if (ret) { + drm_dbg_driver(ddev, "invalid user data !\n"); + return ret; + } + + if (enable) { + if (aconnector->notifier) + return -EINVAL; + ret = amdgpu_dm_initialize_hdmi_connector(aconnector); + if (ret) + return ret; + hdmi_cec_set_edid(aconnector); + } else { + if (!aconnector->notifier) + return -EINVAL; + cec_notifier_conn_unregister(aconnector->notifier); + aconnector->notifier = NULL; + } + + return size; +} + DEFINE_SHOW_ATTRIBUTE(dp_dsc_fec_support); DEFINE_SHOW_ATTRIBUTE(dmub_fw_state); DEFINE_SHOW_ATTRIBUTE(dmub_tracebuffer); @@ -2837,6 +2922,7 @@ DEFINE_SHOW_ATTRIBUTE(psr_capability); DEFINE_SHOW_ATTRIBUTE(dp_is_mst_connector); DEFINE_SHOW_ATTRIBUTE(dp_mst_progress_status); DEFINE_SHOW_ATTRIBUTE(is_dpia_link); +DEFINE_SHOW_STORE_ATTRIBUTE(hdmi_cec_state); static const struct file_operations dp_dsc_clock_en_debugfs_fops = { .owner = THIS_MODULE, @@ -2972,7 +3058,8 @@ static const struct { char *name; const struct file_operations *fops; } 
hdmi_debugfs_entries[] = { - {"hdcp_sink_capability", &hdcp_sink_capability_fops} + {"hdcp_sink_capability", &hdcp_sink_capability_fops}, + {"hdmi_cec_state", &hdmi_cec_state_fops} }; /* @@ -3457,8 +3544,8 @@ static int crc_win_x_start_set(void *data, u64 val) struct amdgpu_crtc *acrtc = to_amdgpu_crtc(crtc); spin_lock_irq(&drm_dev->event_lock); - acrtc->dm_irq_params.window_param.x_start = (uint16_t) val; - acrtc->dm_irq_params.window_param.update_win = false; + acrtc->dm_irq_params.window_param[0].x_start = (uint16_t) val; + acrtc->dm_irq_params.window_param[0].update_win = false; spin_unlock_irq(&drm_dev->event_lock); return 0; @@ -3474,7 +3561,7 @@ static int crc_win_x_start_get(void *data, u64 *val) struct amdgpu_crtc *acrtc = to_amdgpu_crtc(crtc); spin_lock_irq(&drm_dev->event_lock); - *val = acrtc->dm_irq_params.window_param.x_start; + *val = acrtc->dm_irq_params.window_param[0].x_start; spin_unlock_irq(&drm_dev->event_lock); return 0; @@ -3494,8 +3581,8 @@ static int crc_win_y_start_set(void *data, u64 val) struct amdgpu_crtc *acrtc = to_amdgpu_crtc(crtc); spin_lock_irq(&drm_dev->event_lock); - acrtc->dm_irq_params.window_param.y_start = (uint16_t) val; - acrtc->dm_irq_params.window_param.update_win = false; + acrtc->dm_irq_params.window_param[0].y_start = (uint16_t) val; + acrtc->dm_irq_params.window_param[0].update_win = false; spin_unlock_irq(&drm_dev->event_lock); return 0; @@ -3511,7 +3598,7 @@ static int crc_win_y_start_get(void *data, u64 *val) struct amdgpu_crtc *acrtc = to_amdgpu_crtc(crtc); spin_lock_irq(&drm_dev->event_lock); - *val = acrtc->dm_irq_params.window_param.y_start; + *val = acrtc->dm_irq_params.window_param[0].y_start; spin_unlock_irq(&drm_dev->event_lock); return 0; @@ -3530,8 +3617,8 @@ static int crc_win_x_end_set(void *data, u64 val) struct amdgpu_crtc *acrtc = to_amdgpu_crtc(crtc); spin_lock_irq(&drm_dev->event_lock); - acrtc->dm_irq_params.window_param.x_end = (uint16_t) val; - acrtc->dm_irq_params.window_param.update_win = 
false; + acrtc->dm_irq_params.window_param[0].x_end = (uint16_t) val; + acrtc->dm_irq_params.window_param[0].update_win = false; spin_unlock_irq(&drm_dev->event_lock); return 0; @@ -3547,7 +3634,7 @@ static int crc_win_x_end_get(void *data, u64 *val) struct amdgpu_crtc *acrtc = to_amdgpu_crtc(crtc); spin_lock_irq(&drm_dev->event_lock); - *val = acrtc->dm_irq_params.window_param.x_end; + *val = acrtc->dm_irq_params.window_param[0].x_end; spin_unlock_irq(&drm_dev->event_lock); return 0; @@ -3566,8 +3653,8 @@ static int crc_win_y_end_set(void *data, u64 val) struct amdgpu_crtc *acrtc = to_amdgpu_crtc(crtc); spin_lock_irq(&drm_dev->event_lock); - acrtc->dm_irq_params.window_param.y_end = (uint16_t) val; - acrtc->dm_irq_params.window_param.update_win = false; + acrtc->dm_irq_params.window_param[0].y_end = (uint16_t) val; + acrtc->dm_irq_params.window_param[0].update_win = false; spin_unlock_irq(&drm_dev->event_lock); return 0; @@ -3583,7 +3670,7 @@ static int crc_win_y_end_get(void *data, u64 *val) struct amdgpu_crtc *acrtc = to_amdgpu_crtc(crtc); spin_lock_irq(&drm_dev->event_lock); - *val = acrtc->dm_irq_params.window_param.y_end; + *val = acrtc->dm_irq_params.window_param[0].y_end; spin_unlock_irq(&drm_dev->event_lock); return 0; @@ -3606,13 +3693,14 @@ static int crc_win_update_set(void *data, u64 val) /* PSR may write to OTG CRC window control register, * so close it before starting secure_display. 
*/ - amdgpu_dm_psr_disable(acrtc->dm_irq_params.stream); + amdgpu_dm_psr_disable(acrtc->dm_irq_params.stream, true); spin_lock_irq(&adev_to_drm(adev)->event_lock); - acrtc->dm_irq_params.window_param.activated = true; - acrtc->dm_irq_params.window_param.update_win = true; - acrtc->dm_irq_params.window_param.skip_frame_cnt = 0; + acrtc->dm_irq_params.window_param[0].enable = true; + acrtc->dm_irq_params.window_param[0].update_win = true; + acrtc->dm_irq_params.window_param[0].skip_frame_cnt = 0; + acrtc->dm_irq_params.crc_window_activated = true; spin_unlock_irq(&adev_to_drm(adev)->event_lock); mutex_unlock(&adev->dm.dc_lock); diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c index b0fea0856866..fbd80d8545a8 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c @@ -885,6 +885,12 @@ bool dm_helpers_dp_write_dsc_enable( return ret; } +bool dm_helpers_dp_write_hblank_reduction(struct dc_context *ctx, const struct dc_stream_state *stream) +{ + // TODO + return false; +} + bool dm_helpers_is_dp_sink_present(struct dc_link *link) { bool dp_sink_present; @@ -907,14 +913,14 @@ dm_helpers_probe_acpi_edid(void *data, u8 *buf, unsigned int block, size_t len) struct drm_connector *connector = data; struct acpi_device *acpidev = ACPI_COMPANION(connector->dev->dev); unsigned char start = block * EDID_LENGTH; - void *edid; + struct edid *edid; int r; if (!acpidev) return -ENODEV; /* fetch the entire edid from BIOS */ - r = acpi_video_get_edid(acpidev, ACPI_VIDEO_DISPLAY_LCD, -1, &edid); + r = acpi_video_get_edid(acpidev, ACPI_VIDEO_DISPLAY_LCD, -1, (void *)&edid); if (r < 0) { drm_dbg(connector->dev, "Failed to get EDID from ACPI: %d\n", r); return r; @@ -924,7 +930,14 @@ dm_helpers_probe_acpi_edid(void *data, u8 *buf, unsigned int block, size_t len) goto cleanup; } - memcpy(buf, edid + start, len); + /* sanity check */ + 
if (edid->revision < 4 || !(edid->input & DRM_EDID_INPUT_DIGITAL) || + (edid->input & DRM_EDID_DIGITAL_TYPE_MASK) == DRM_EDID_DIGITAL_TYPE_UNDEF) { + r = -EINVAL; + goto cleanup; + } + + memcpy(buf, (void *)edid + start, len); r = 0; cleanup: diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_irq_params.h b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_irq_params.h index 6a7ecc1e4602..6c9de834455b 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_irq_params.h +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_irq_params.h @@ -39,7 +39,9 @@ struct dm_irq_params { #ifdef CONFIG_DEBUG_FS enum amdgpu_dm_pipe_crc_source crc_src; #ifdef CONFIG_DRM_AMD_SECURE_DISPLAY - struct crc_window_param window_param; + struct crc_window_param window_param[MAX_CRC_WINDOW_NUM]; + /* At least one CRC window is activated or not*/ + bool crc_window_activated; #endif #endif }; diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c index 6e4359490613..07e744da7bf4 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c @@ -155,6 +155,17 @@ amdgpu_dm_mst_connector_late_register(struct drm_connector *connector) return 0; } + +static inline void +amdgpu_dm_mst_reset_mst_connector_setting(struct amdgpu_dm_connector *aconnector) +{ + aconnector->drm_edid = NULL; + aconnector->dsc_aux = NULL; + aconnector->mst_output_port->passthrough_aux = NULL; + aconnector->mst_local_bw = 0; + aconnector->vc_full_pbn = 0; +} + static void amdgpu_dm_mst_connector_early_unregister(struct drm_connector *connector) { @@ -182,9 +193,7 @@ amdgpu_dm_mst_connector_early_unregister(struct drm_connector *connector) dc_sink_release(dc_sink); aconnector->dc_sink = NULL; - aconnector->drm_edid = NULL; - aconnector->dsc_aux = NULL; - port->passthrough_aux = NULL; + amdgpu_dm_mst_reset_mst_connector_setting(aconnector); } 
aconnector->mst_status = MST_STATUS_DEFAULT; @@ -504,9 +513,7 @@ dm_dp_mst_detect(struct drm_connector *connector, dc_sink_release(aconnector->dc_sink); aconnector->dc_sink = NULL; - aconnector->drm_edid = NULL; - aconnector->dsc_aux = NULL; - port->passthrough_aux = NULL; + amdgpu_dm_mst_reset_mst_connector_setting(aconnector); amdgpu_dm_set_mst_status(&aconnector->mst_status, MST_REMOTE_EDID | MST_ALLOCATE_NEW_PAYLOAD | MST_CLEAR_ALLOCATED_PAYLOAD, @@ -590,11 +597,12 @@ dm_dp_add_mst_connector(struct drm_dp_mst_topology_mgr *mgr, amdgpu_dm_set_mst_status(&aconnector->mst_status, MST_PROBE, true); - if (drm_connector_init( + if (drm_connector_dynamic_init( dev, connector, &dm_dp_mst_connector_funcs, - DRM_MODE_CONNECTOR_DisplayPort)) { + DRM_MODE_CONNECTOR_DisplayPort, + NULL)) { kfree(aconnector); return NULL; } @@ -1688,16 +1696,16 @@ clean_exit: return ret; } -static unsigned int kbps_from_pbn(unsigned int pbn) +static uint32_t kbps_from_pbn(unsigned int pbn) { - unsigned int kbps = pbn; + uint64_t kbps = (uint64_t)pbn; kbps *= (1000000 / PEAK_FACTOR_X1000); kbps *= 8; kbps *= 54; kbps /= 64; - return kbps; + return (uint32_t)kbps; } static bool is_dsc_common_config_possible(struct dc_stream_state *stream, @@ -1819,9 +1827,18 @@ enum dc_status dm_dp_mst_is_port_support_mode( struct drm_dp_mst_port *immediate_upstream_port = NULL; uint32_t end_link_bw = 0; - /*Get last DP link BW capability*/ - if (dp_get_link_current_set_bw(&aconnector->mst_output_port->aux, &end_link_bw)) { - if (stream_kbps > end_link_bw) { + /*Get last DP link BW capability. 
Mode shall be supported by Legacy peer*/ + if (aconnector->mst_output_port->pdt != DP_PEER_DEVICE_DP_LEGACY_CONV && + aconnector->mst_output_port->pdt != DP_PEER_DEVICE_NONE) { + if (aconnector->vc_full_pbn != aconnector->mst_output_port->full_pbn) { + dp_get_link_current_set_bw(&aconnector->mst_output_port->aux, &end_link_bw); + aconnector->vc_full_pbn = aconnector->mst_output_port->full_pbn; + aconnector->mst_local_bw = end_link_bw; + } else { + end_link_bw = aconnector->mst_local_bw; + } + + if (end_link_bw > 0 && stream_kbps > end_link_bw) { DRM_DEBUG_DRIVER("MST_DSC dsc decode at last link." "Mode required bw can't fit into last link\n"); return DC_FAIL_BANDWIDTH_VALIDATE; @@ -1835,11 +1852,15 @@ enum dc_status dm_dp_mst_is_port_support_mode( if (immediate_upstream_port) { virtual_channel_bw_in_kbps = kbps_from_pbn(immediate_upstream_port->full_pbn); virtual_channel_bw_in_kbps = min(root_link_bw_in_kbps, virtual_channel_bw_in_kbps); - if (bw_range.min_kbps > virtual_channel_bw_in_kbps) { - DRM_DEBUG_DRIVER("MST_DSC dsc decode at last link." - "Max dsc compression can't fit into MST available bw\n"); - return DC_FAIL_BANDWIDTH_VALIDATE; - } + } else { + /* For topology LCT 1 case - only one mstb*/ + virtual_channel_bw_in_kbps = root_link_bw_in_kbps; + } + + if (bw_range.min_kbps > virtual_channel_bw_in_kbps) { + DRM_DEBUG_DRIVER("MST_DSC dsc decode at last link." 
+ "Max dsc compression can't fit into MST available bw\n"); + return DC_FAIL_BANDWIDTH_VALIDATE; } } diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c index 495e3cd70426..774cc3f4f3fd 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c @@ -26,6 +26,7 @@ #include #include +#include "drm/drm_framebuffer.h" #include #include #include @@ -176,7 +177,7 @@ static unsigned int amdgpu_dm_plane_modifier_gfx9_swizzle_mode(uint64_t modifier return AMD_FMT_MOD_GET(TILE, modifier); } -static void amdgpu_dm_plane_fill_gfx8_tiling_info_from_flags(union dc_tiling_info *tiling_info, +static void amdgpu_dm_plane_fill_gfx8_tiling_info_from_flags(struct dc_tiling_info *tiling_info, uint64_t tiling_flags) { /* Fill GFX8 params */ @@ -189,6 +190,7 @@ static void amdgpu_dm_plane_fill_gfx8_tiling_info_from_flags(union dc_tiling_inf tile_split = AMDGPU_TILING_GET(tiling_flags, TILE_SPLIT); num_banks = AMDGPU_TILING_GET(tiling_flags, NUM_BANKS); + tiling_info->gfxversion = DcGfxVersion8; /* XXX fix me for VI */ tiling_info->gfx8.num_banks = num_banks; tiling_info->gfx8.array_mode = @@ -209,7 +211,7 @@ static void amdgpu_dm_plane_fill_gfx8_tiling_info_from_flags(union dc_tiling_inf } static void amdgpu_dm_plane_fill_gfx9_tiling_info_from_device(const struct amdgpu_device *adev, - union dc_tiling_info *tiling_info) + struct dc_tiling_info *tiling_info) { /* Fill GFX9 params */ tiling_info->gfx9.num_pipes = @@ -230,7 +232,7 @@ static void amdgpu_dm_plane_fill_gfx9_tiling_info_from_device(const struct amdgp } static void amdgpu_dm_plane_fill_gfx9_tiling_info_from_modifier(const struct amdgpu_device *adev, - union dc_tiling_info *tiling_info, + struct dc_tiling_info *tiling_info, uint64_t modifier) { unsigned int mod_bank_xor_bits = AMD_FMT_MOD_GET(BANK_XOR_BITS, modifier); @@ -260,7 +262,7 @@ static void 
amdgpu_dm_plane_fill_gfx9_tiling_info_from_modifier(const struct amd static int amdgpu_dm_plane_validate_dcc(struct amdgpu_device *adev, const enum surface_pixel_format format, const enum dc_rotation_angle rotation, - const union dc_tiling_info *tiling_info, + const struct dc_tiling_info *tiling_info, const struct dc_plane_dcc_param *dcc, const struct dc_plane_address *address, const struct plane_size *plane_size) @@ -307,18 +309,18 @@ static int amdgpu_dm_plane_fill_gfx9_plane_attributes_from_modifiers(struct amdg const enum surface_pixel_format format, const enum dc_rotation_angle rotation, const struct plane_size *plane_size, - union dc_tiling_info *tiling_info, + struct dc_tiling_info *tiling_info, struct dc_plane_dcc_param *dcc, - struct dc_plane_address *address, - const bool force_disable_dcc) + struct dc_plane_address *address) { const uint64_t modifier = afb->base.modifier; int ret = 0; amdgpu_dm_plane_fill_gfx9_tiling_info_from_modifier(adev, tiling_info, modifier); tiling_info->gfx9.swizzle = amdgpu_dm_plane_modifier_gfx9_swizzle_mode(modifier); + tiling_info->gfxversion = DcGfxVersion9; - if (amdgpu_dm_plane_modifier_has_dcc(modifier) && !force_disable_dcc) { + if (amdgpu_dm_plane_modifier_has_dcc(modifier)) { uint64_t dcc_address = afb->address + afb->base.offsets[1]; bool independent_64b_blks = AMD_FMT_MOD_GET(DCC_INDEPENDENT_64B, modifier); bool independent_128b_blks = AMD_FMT_MOD_GET(DCC_INDEPENDENT_128B, modifier); @@ -358,10 +360,9 @@ static int amdgpu_dm_plane_fill_gfx12_plane_attributes_from_modifiers(struct amd const enum surface_pixel_format format, const enum dc_rotation_angle rotation, const struct plane_size *plane_size, - union dc_tiling_info *tiling_info, + struct dc_tiling_info *tiling_info, struct dc_plane_dcc_param *dcc, - struct dc_plane_address *address, - const bool force_disable_dcc) + struct dc_plane_address *address) { const uint64_t modifier = afb->base.modifier; int ret = 0; @@ -370,8 +371,9 @@ static int 
amdgpu_dm_plane_fill_gfx12_plane_attributes_from_modifiers(struct amd amdgpu_dm_plane_fill_gfx9_tiling_info_from_device(adev, tiling_info); tiling_info->gfx9.swizzle = amdgpu_dm_plane_modifier_gfx9_swizzle_mode(modifier); + tiling_info->gfxversion = DcGfxAddr3; - if (amdgpu_dm_plane_modifier_has_dcc(modifier) && !force_disable_dcc) { + if (amdgpu_dm_plane_modifier_has_dcc(modifier)) { int max_compressed_block = AMD_FMT_MOD_GET(DCC_MAX_COMPRESSED_BLOCK, modifier); dcc->enable = 1; @@ -835,12 +837,11 @@ int amdgpu_dm_plane_fill_plane_buffer_attributes(struct amdgpu_device *adev, const enum surface_pixel_format format, const enum dc_rotation_angle rotation, const uint64_t tiling_flags, - union dc_tiling_info *tiling_info, + struct dc_tiling_info *tiling_info, struct plane_size *plane_size, struct dc_plane_dcc_param *dcc, struct dc_plane_address *address, - bool tmz_surface, - bool force_disable_dcc) + bool tmz_surface) { const struct drm_framebuffer *fb = &afb->base; int ret; @@ -900,16 +901,14 @@ int amdgpu_dm_plane_fill_plane_buffer_attributes(struct amdgpu_device *adev, ret = amdgpu_dm_plane_fill_gfx12_plane_attributes_from_modifiers(adev, afb, format, rotation, plane_size, tiling_info, dcc, - address, - force_disable_dcc); + address); if (ret) return ret; } else if (adev->family >= AMDGPU_FAMILY_AI) { ret = amdgpu_dm_plane_fill_gfx9_plane_attributes_from_modifiers(adev, afb, format, rotation, plane_size, tiling_info, dcc, - address, - force_disable_dcc); + address); if (ret) return ret; } else { @@ -1000,14 +999,13 @@ static int amdgpu_dm_plane_helper_prepare_fb(struct drm_plane *plane, dm_plane_state_old->dc_state != dm_plane_state_new->dc_state) { struct dc_plane_state *plane_state = dm_plane_state_new->dc_state; - bool force_disable_dcc = !plane_state->dcc.enable; amdgpu_dm_plane_fill_plane_buffer_attributes( adev, afb, plane_state->format, plane_state->rotation, afb->tiling_flags, &plane_state->tiling_info, &plane_state->plane_size, &plane_state->dcc, 
&plane_state->address, - afb->tmz_surface, force_disable_dcc); + afb->tmz_surface); } return 0; @@ -1421,6 +1419,20 @@ static void amdgpu_dm_plane_atomic_async_update(struct drm_plane *plane, amdgpu_dm_plane_handle_cursor_update(plane, old_state); } +static void amdgpu_dm_plane_panic_flush(struct drm_plane *plane) +{ + struct dm_plane_state *dm_plane_state = to_dm_plane_state(plane->state); + struct drm_framebuffer *fb = plane->state->fb; + struct dc_plane_state *dc_plane_state; + + if (!dm_plane_state || !dm_plane_state->dc_state) + return; + + dc_plane_state = dm_plane_state->dc_state; + + dc_plane_force_update_for_panic(dc_plane_state, fb->modifier ? true : false); +} + static const struct drm_plane_helper_funcs dm_plane_helper_funcs = { .prepare_fb = amdgpu_dm_plane_helper_prepare_fb, .cleanup_fb = amdgpu_dm_plane_helper_cleanup_fb, @@ -1429,6 +1441,16 @@ static const struct drm_plane_helper_funcs dm_plane_helper_funcs = { .atomic_async_update = amdgpu_dm_plane_atomic_async_update }; +static const struct drm_plane_helper_funcs dm_primary_plane_helper_funcs = { + .prepare_fb = amdgpu_dm_plane_helper_prepare_fb, + .cleanup_fb = amdgpu_dm_plane_helper_cleanup_fb, + .atomic_check = amdgpu_dm_plane_atomic_check, + .atomic_async_check = amdgpu_dm_plane_atomic_async_check, + .atomic_async_update = amdgpu_dm_plane_atomic_async_update, + .get_scanout_buffer = amdgpu_display_get_scanout_buffer, + .panic_flush = amdgpu_dm_plane_panic_flush, +}; + static void amdgpu_dm_plane_drm_plane_reset(struct drm_plane *plane) { struct dm_plane_state *amdgpu_state = NULL; @@ -1855,7 +1877,10 @@ int amdgpu_dm_plane_init(struct amdgpu_display_manager *dm, plane->type != DRM_PLANE_TYPE_CURSOR) drm_plane_enable_fb_damage_clips(plane); - drm_plane_helper_add(plane, &dm_plane_helper_funcs); + if (plane->type == DRM_PLANE_TYPE_PRIMARY) + drm_plane_helper_add(plane, &dm_primary_plane_helper_funcs); + else + drm_plane_helper_add(plane, &dm_plane_helper_funcs); #ifdef AMD_PRIVATE_COLOR 
dm_atomic_plane_attach_color_mgmt_properties(dm, plane); diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.h b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.h index 6498359bff6f..615d2ab2b803 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.h +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.h @@ -47,12 +47,11 @@ int amdgpu_dm_plane_fill_plane_buffer_attributes(struct amdgpu_device *adev, const enum surface_pixel_format format, const enum dc_rotation_angle rotation, const uint64_t tiling_flags, - union dc_tiling_info *tiling_info, + struct dc_tiling_info *tiling_info, struct plane_size *plane_size, struct dc_plane_dcc_param *dcc, struct dc_plane_address *address, - bool tmz_surface, - bool force_disable_dcc); + bool tmz_surface); int amdgpu_dm_plane_init(struct amdgpu_display_manager *dm, struct drm_plane *plane, diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_psr.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_psr.c index f40240aafe98..45858bf1523d 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_psr.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_psr.c @@ -201,14 +201,13 @@ void amdgpu_dm_psr_enable(struct dc_stream_state *stream) * * Return: true if success */ -bool amdgpu_dm_psr_disable(struct dc_stream_state *stream) +bool amdgpu_dm_psr_disable(struct dc_stream_state *stream, bool wait) { - unsigned int power_opt = 0; bool psr_enable = false; DRM_DEBUG_DRIVER("Disabling psr...\n"); - return dc_link_set_psr_allow_active(stream->link, &psr_enable, true, false, &power_opt); + return dc_link_set_psr_allow_active(stream->link, &psr_enable, wait, false, NULL); } /* @@ -251,3 +250,33 @@ bool amdgpu_dm_psr_is_active_allowed(struct amdgpu_display_manager *dm) return allow_active; } + +/** + * amdgpu_dm_psr_wait_disable() - Wait for eDP panel to exit PSR + * @stream: stream state attached to the eDP link + * + * Waits for a max of 500ms for the eDP panel to exit PSR. 
+ * + * Return: true if panel exited PSR, false otherwise. + */ +bool amdgpu_dm_psr_wait_disable(struct dc_stream_state *stream) +{ + enum dc_psr_state psr_state = PSR_STATE0; + struct dc_link *link = stream->link; + int retry_count; + + if (link == NULL) + return false; + + for (retry_count = 0; retry_count < 1000; retry_count++) { + dc_link_get_psr_state(link, &psr_state); + if (psr_state == PSR_STATE0) + break; + udelay(500); + } + + if (retry_count == 1000) + return false; + + return true; +} diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_psr.h b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_psr.h index cd2d45c2b5ef..e2366321a3c1 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_psr.h +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_psr.h @@ -34,8 +34,9 @@ void amdgpu_dm_set_psr_caps(struct dc_link *link); void amdgpu_dm_psr_enable(struct dc_stream_state *stream); bool amdgpu_dm_link_setup_psr(struct dc_stream_state *stream); -bool amdgpu_dm_psr_disable(struct dc_stream_state *stream); +bool amdgpu_dm_psr_disable(struct dc_stream_state *stream, bool wait); bool amdgpu_dm_psr_disable_all(struct amdgpu_display_manager *dm); bool amdgpu_dm_psr_is_active_allowed(struct amdgpu_display_manager *dm); +bool amdgpu_dm_psr_wait_disable(struct dc_stream_state *stream); #endif /* AMDGPU_DM_AMDGPU_DM_PSR_H_ */ diff --git a/drivers/gpu/drm/amd/display/dc/bios/bios_parser2.c b/drivers/gpu/drm/amd/display/dc/bios/bios_parser2.c index c9a6de110b74..a62f6c51301c 100644 --- a/drivers/gpu/drm/amd/display/dc/bios/bios_parser2.c +++ b/drivers/gpu/drm/amd/display/dc/bios/bios_parser2.c @@ -3088,11 +3088,12 @@ static enum bp_result construct_integrated_info( info->ext_disp_conn_info.path[i].ext_encoder_obj_id.id, info->ext_disp_conn_info.path[i].caps ); - if (info->ext_disp_conn_info.path[i].caps & EXT_DISPLAY_PATH_CAPS__DP_FIXED_VS_EN) - DC_LOG_BIOS("BIOS EXT_DISPLAY_PATH_CAPS__DP_FIXED_VS_EN on path %d\n", i); + if
((info->ext_disp_conn_info.path[i].caps & AMD_EXT_DISPLAY_PATH_CAPS__EXT_CHIP_MASK) == AMD_EXT_DISPLAY_PATH_CAPS__DP_FIXED_VS_EN) + DC_LOG_BIOS("BIOS AMD_EXT_DISPLAY_PATH_CAPS__DP_FIXED_VS_EN on path %d\n", i); else if (bp->base.ctx->dc->config.force_bios_fixed_vs) { - info->ext_disp_conn_info.path[i].caps |= EXT_DISPLAY_PATH_CAPS__DP_FIXED_VS_EN; - DC_LOG_BIOS("driver forced EXT_DISPLAY_PATH_CAPS__DP_FIXED_VS_EN on path %d\n", i); + info->ext_disp_conn_info.path[i].caps &= ~AMD_EXT_DISPLAY_PATH_CAPS__EXT_CHIP_MASK; + info->ext_disp_conn_info.path[i].caps |= AMD_EXT_DISPLAY_PATH_CAPS__DP_FIXED_VS_EN; + DC_LOG_BIOS("driver forced AMD_EXT_DISPLAY_PATH_CAPS__DP_FIXED_VS_EN on path %d\n", i); } } // Log the Checksum and Voltage Swing diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile b/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile index ab1132bc896a..d9955c5d2e5e 100644 --- a/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile +++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile @@ -174,7 +174,7 @@ AMD_DISPLAY_FILES += $(AMD_DAL_CLK_MGR_DCN32) ############################################################################### # DCN35 ############################################################################### -CLK_MGR_DCN35 = dcn35_smu.o dcn35_clk_mgr.o +CLK_MGR_DCN35 = dcn35_smu.o dcn351_clk_mgr.o dcn35_clk_mgr.o AMD_DAL_CLK_MGR_DCN35 = $(addprefix $(AMDDALPATH)/dc/clk_mgr/dcn35/,$(CLK_MGR_DCN35)) diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/clk_mgr.c b/drivers/gpu/drm/amd/display/dc/clk_mgr/clk_mgr.c index 0e243f4344d0..4c3e58c730b1 100644 --- a/drivers/gpu/drm/amd/display/dc/clk_mgr/clk_mgr.c +++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/clk_mgr.c @@ -355,8 +355,11 @@ struct clk_mgr *dc_clk_mgr_create(struct dc_context *ctx, struct pp_smu_funcs *p BREAK_TO_DEBUGGER(); return NULL; } + if (ctx->dce_version == DCN_VERSION_3_51) + dcn351_clk_mgr_construct(ctx, clk_mgr, pp_smu, dccg); + else + dcn35_clk_mgr_construct(ctx, clk_mgr, pp_smu, dccg); - 
dcn35_clk_mgr_construct(ctx, clk_mgr, pp_smu, dccg); return &clk_mgr->base.base; } break; diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn201/dcn201_clk_mgr.c b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn201/dcn201_clk_mgr.c index 7920f6f1aa62..76c612ecfe3c 100644 --- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn201/dcn201_clk_mgr.c +++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn201/dcn201_clk_mgr.c @@ -34,8 +34,8 @@ #include "dm_services.h" #include "cyan_skillfish_ip_offset.h" -#include "dcn/dcn_2_0_3_offset.h" -#include "dcn/dcn_2_0_3_sh_mask.h" +#include "dcn/dcn_2_0_1_offset.h" +#include "dcn/dcn_2_0_1_sh_mask.h" #include "clk/clk_11_0_1_offset.h" #include "clk/clk_11_0_1_sh_mask.h" diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn35/dcn351_clk_mgr.c b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn35/dcn351_clk_mgr.c new file mode 100644 index 000000000000..6a6ae618650b --- /dev/null +++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn35/dcn351_clk_mgr.c @@ -0,0 +1,140 @@ +/* + * Copyright 2024 Advanced Micro Devices, Inc. + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * OTHER DEALINGS IN THE SOFTWARE. + * + * Authors: AMD + * + */ + +#include "core_types.h" +#include "dcn35_clk_mgr.h" + +#define DCN_BASE__INST0_SEG1 0x000000C0 +#define mmCLK1_CLK_PLL_REQ 0x16E37 + +#define mmCLK1_CLK0_DFS_CNTL 0x16E69 +#define mmCLK1_CLK1_DFS_CNTL 0x16E6C +#define mmCLK1_CLK2_DFS_CNTL 0x16E6F +#define mmCLK1_CLK3_DFS_CNTL 0x16E72 +#define mmCLK1_CLK4_DFS_CNTL 0x16E75 +#define mmCLK1_CLK5_DFS_CNTL 0x16E78 + +#define mmCLK1_CLK0_CURRENT_CNT 0x16EFC +#define mmCLK1_CLK1_CURRENT_CNT 0x16EFD +#define mmCLK1_CLK2_CURRENT_CNT 0x16EFE +#define mmCLK1_CLK3_CURRENT_CNT 0x16EFF +#define mmCLK1_CLK4_CURRENT_CNT 0x16F00 +#define mmCLK1_CLK5_CURRENT_CNT 0x16F01 + +#define mmCLK1_CLK0_BYPASS_CNTL 0x16E8A +#define mmCLK1_CLK1_BYPASS_CNTL 0x16E93 +#define mmCLK1_CLK2_BYPASS_CNTL 0x16E9C +#define mmCLK1_CLK3_BYPASS_CNTL 0x16EA5 +#define mmCLK1_CLK4_BYPASS_CNTL 0x16EAE +#define mmCLK1_CLK5_BYPASS_CNTL 0x16EB7 + +#define mmCLK1_CLK0_DS_CNTL 0x16E83 +#define mmCLK1_CLK1_DS_CNTL 0x16E8C +#define mmCLK1_CLK2_DS_CNTL 0x16E95 +#define mmCLK1_CLK3_DS_CNTL 0x16E9E +#define mmCLK1_CLK4_DS_CNTL 0x16EA7 +#define mmCLK1_CLK5_DS_CNTL 0x16EB0 + +#define mmCLK1_CLK0_ALLOW_DS 0x16E84 +#define mmCLK1_CLK1_ALLOW_DS 0x16E8D +#define mmCLK1_CLK2_ALLOW_DS 0x16E96 +#define mmCLK1_CLK3_ALLOW_DS 0x16E9F +#define mmCLK1_CLK4_ALLOW_DS 0x16EA8 +#define mmCLK1_CLK5_ALLOW_DS 0x16EB1 + +#define mmCLK5_spll_field_8 0x1B04B +#define mmDENTIST_DISPCLK_CNTL 0x0124 +#define regDENTIST_DISPCLK_CNTL 0x0064 +#define regDENTIST_DISPCLK_CNTL_BASE_IDX 1 + +#define CLK1_CLK_PLL_REQ__FbMult_int__SHIFT 0x0 +#define CLK1_CLK_PLL_REQ__PllSpineDiv__SHIFT 0xc +#define CLK1_CLK_PLL_REQ__FbMult_frac__SHIFT 0x10 +#define CLK1_CLK_PLL_REQ__FbMult_int_MASK 0x000001FFL 
+#define CLK1_CLK_PLL_REQ__PllSpineDiv_MASK 0x0000F000L +#define CLK1_CLK_PLL_REQ__FbMult_frac_MASK 0xFFFF0000L + +#define CLK1_CLK2_BYPASS_CNTL__CLK2_BYPASS_SEL_MASK 0x00000007L + +// DENTIST_DISPCLK_CNTL +#define DENTIST_DISPCLK_CNTL__DENTIST_DISPCLK_WDIVIDER__SHIFT 0x0 +#define DENTIST_DISPCLK_CNTL__DENTIST_DISPCLK_RDIVIDER__SHIFT 0x8 +#define DENTIST_DISPCLK_CNTL__DENTIST_DISPCLK_CHG_DONE__SHIFT 0x13 +#define DENTIST_DISPCLK_CNTL__DENTIST_DPPCLK_CHG_DONE__SHIFT 0x14 +#define DENTIST_DISPCLK_CNTL__DENTIST_DPPCLK_WDIVIDER__SHIFT 0x18 +#define DENTIST_DISPCLK_CNTL__DENTIST_DISPCLK_WDIVIDER_MASK 0x0000007FL +#define DENTIST_DISPCLK_CNTL__DENTIST_DISPCLK_RDIVIDER_MASK 0x00007F00L +#define DENTIST_DISPCLK_CNTL__DENTIST_DISPCLK_CHG_DONE_MASK 0x00080000L +#define DENTIST_DISPCLK_CNTL__DENTIST_DPPCLK_CHG_DONE_MASK 0x00100000L +#define DENTIST_DISPCLK_CNTL__DENTIST_DPPCLK_WDIVIDER_MASK 0x7F000000L + +#define CLK5_spll_field_8__spll_ssc_en_MASK 0x00002000L + +#define REG(reg) \ + (clk_mgr->regs->reg) + +#define BASE_INNER(seg) DCN_BASE__INST0_SEG ## seg + +#define BASE(seg) BASE_INNER(seg) + +#define SR(reg_name)\ + .reg_name = BASE(reg ## reg_name ## _BASE_IDX) + \ + reg ## reg_name + +#define CLK_SR_DCN35(reg_name)\ + .reg_name = mm ## reg_name + +static const struct clk_mgr_registers clk_mgr_regs_dcn351 = { + CLK_REG_LIST_DCN35() +}; + +static const struct clk_mgr_shift clk_mgr_shift_dcn351 = { + CLK_COMMON_MASK_SH_LIST_DCN32(__SHIFT) +}; + +static const struct clk_mgr_mask clk_mgr_mask_dcn351 = { + CLK_COMMON_MASK_SH_LIST_DCN32(_MASK) +}; + +#define TO_CLK_MGR_DCN35(clk_mgr)\ + container_of(clk_mgr, struct clk_mgr_dcn35, base) + + +void dcn351_clk_mgr_construct( + struct dc_context *ctx, + struct clk_mgr_dcn35 *clk_mgr, + struct pp_smu_funcs *pp_smu, + struct dccg *dccg) +{ + /*register offset changed*/ + clk_mgr->base.regs = &clk_mgr_regs_dcn351; + clk_mgr->base.clk_mgr_shift = &clk_mgr_shift_dcn351; + clk_mgr->base.clk_mgr_mask = &clk_mgr_mask_dcn351; + + 
dcn35_clk_mgr_construct(ctx, clk_mgr, pp_smu, dccg); + +} + + diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn35/dcn35_clk_mgr.c b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn35/dcn35_clk_mgr.c index b77333817f18..1f974ea3b0c6 100644 --- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn35/dcn35_clk_mgr.c +++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn35/dcn35_clk_mgr.c @@ -36,15 +36,11 @@ #include "dcn20/dcn20_clk_mgr.h" - - #include "reg_helper.h" #include "core_types.h" #include "dcn35_smu.h" #include "dm_helpers.h" -/* TODO: remove this include once we ported over remaining clk mgr functions*/ -#include "dcn30/dcn30_clk_mgr.h" #include "dcn31/dcn31_clk_mgr.h" #include "dc_dmub_srv.h" @@ -55,35 +51,102 @@ #define DC_LOGGER \ clk_mgr->base.base.ctx->logger +#define DCN_BASE__INST0_SEG1 0x000000C0 +#define mmCLK1_CLK_PLL_REQ 0x16E37 -#define regCLK1_CLK_PLL_REQ 0x0237 -#define regCLK1_CLK_PLL_REQ_BASE_IDX 0 +#define mmCLK1_CLK0_DFS_CNTL 0x16E69 +#define mmCLK1_CLK1_DFS_CNTL 0x16E6C +#define mmCLK1_CLK2_DFS_CNTL 0x16E6F +#define mmCLK1_CLK3_DFS_CNTL 0x16E72 +#define mmCLK1_CLK4_DFS_CNTL 0x16E75 +#define mmCLK1_CLK5_DFS_CNTL 0x16E78 -#define CLK1_CLK_PLL_REQ__FbMult_int__SHIFT 0x0 -#define CLK1_CLK_PLL_REQ__PllSpineDiv__SHIFT 0xc -#define CLK1_CLK_PLL_REQ__FbMult_frac__SHIFT 0x10 -#define CLK1_CLK_PLL_REQ__FbMult_int_MASK 0x000001FFL -#define CLK1_CLK_PLL_REQ__PllSpineDiv_MASK 0x0000F000L -#define CLK1_CLK_PLL_REQ__FbMult_frac_MASK 0xFFFF0000L +#define mmCLK1_CLK0_CURRENT_CNT 0x16EFB +#define mmCLK1_CLK1_CURRENT_CNT 0x16EFC +#define mmCLK1_CLK2_CURRENT_CNT 0x16EFD +#define mmCLK1_CLK3_CURRENT_CNT 0x16EFE +#define mmCLK1_CLK4_CURRENT_CNT 0x16EFF +#define mmCLK1_CLK5_CURRENT_CNT 0x16F00 -#define regCLK1_CLK2_BYPASS_CNTL 0x029c -#define regCLK1_CLK2_BYPASS_CNTL_BASE_IDX 0 +#define mmCLK1_CLK0_BYPASS_CNTL 0x16E8A +#define mmCLK1_CLK1_BYPASS_CNTL 0x16E93 +#define mmCLK1_CLK2_BYPASS_CNTL 0x16E9C +#define mmCLK1_CLK3_BYPASS_CNTL 0x16EA5 +#define mmCLK1_CLK4_BYPASS_CNTL 
0x16EAE +#define mmCLK1_CLK5_BYPASS_CNTL 0x16EB7 -#define CLK1_CLK2_BYPASS_CNTL__CLK2_BYPASS_SEL__SHIFT 0x0 -#define CLK1_CLK2_BYPASS_CNTL__CLK2_BYPASS_DIV__SHIFT 0x10 -#define CLK1_CLK2_BYPASS_CNTL__CLK2_BYPASS_SEL_MASK 0x00000007L -#define CLK1_CLK2_BYPASS_CNTL__CLK2_BYPASS_DIV_MASK 0x000F0000L +#define mmCLK1_CLK0_DS_CNTL 0x16E83 +#define mmCLK1_CLK1_DS_CNTL 0x16E8C +#define mmCLK1_CLK2_DS_CNTL 0x16E95 +#define mmCLK1_CLK3_DS_CNTL 0x16E9E +#define mmCLK1_CLK4_DS_CNTL 0x16EA7 +#define mmCLK1_CLK5_DS_CNTL 0x16EB0 -#define regCLK5_0_CLK5_spll_field_8 0x464b -#define regCLK5_0_CLK5_spll_field_8_BASE_IDX 0 +#define mmCLK1_CLK0_ALLOW_DS 0x16E84 +#define mmCLK1_CLK1_ALLOW_DS 0x16E8D +#define mmCLK1_CLK2_ALLOW_DS 0x16E96 +#define mmCLK1_CLK3_ALLOW_DS 0x16E9F +#define mmCLK1_CLK4_ALLOW_DS 0x16EA8 +#define mmCLK1_CLK5_ALLOW_DS 0x16EB1 -#define CLK5_0_CLK5_spll_field_8__spll_ssc_en__SHIFT 0xd -#define CLK5_0_CLK5_spll_field_8__spll_ssc_en_MASK 0x00002000L +#define mmCLK5_spll_field_8 0x1B04B +#define mmDENTIST_DISPCLK_CNTL 0x0124 +#define regDENTIST_DISPCLK_CNTL 0x0064 +#define regDENTIST_DISPCLK_CNTL_BASE_IDX 1 + +#define CLK1_CLK_PLL_REQ__FbMult_int__SHIFT 0x0 +#define CLK1_CLK_PLL_REQ__PllSpineDiv__SHIFT 0xc +#define CLK1_CLK_PLL_REQ__FbMult_frac__SHIFT 0x10 +#define CLK1_CLK_PLL_REQ__FbMult_int_MASK 0x000001FFL +#define CLK1_CLK_PLL_REQ__PllSpineDiv_MASK 0x0000F000L +#define CLK1_CLK_PLL_REQ__FbMult_frac_MASK 0xFFFF0000L + +#define CLK1_CLK2_BYPASS_CNTL__CLK2_BYPASS_SEL_MASK 0x00000007L +#define CLK1_CLK2_BYPASS_CNTL__CLK2_BYPASS_DIV_MASK 0x000F0000L +// DENTIST_DISPCLK_CNTL +#define DENTIST_DISPCLK_CNTL__DENTIST_DISPCLK_WDIVIDER__SHIFT 0x0 +#define DENTIST_DISPCLK_CNTL__DENTIST_DISPCLK_RDIVIDER__SHIFT 0x8 +#define DENTIST_DISPCLK_CNTL__DENTIST_DISPCLK_CHG_DONE__SHIFT 0x13 +#define DENTIST_DISPCLK_CNTL__DENTIST_DPPCLK_CHG_DONE__SHIFT 0x14 +#define DENTIST_DISPCLK_CNTL__DENTIST_DPPCLK_WDIVIDER__SHIFT 0x18 +#define DENTIST_DISPCLK_CNTL__DENTIST_DISPCLK_WDIVIDER_MASK 
0x0000007FL +#define DENTIST_DISPCLK_CNTL__DENTIST_DISPCLK_RDIVIDER_MASK 0x00007F00L +#define DENTIST_DISPCLK_CNTL__DENTIST_DISPCLK_CHG_DONE_MASK 0x00080000L +#define DENTIST_DISPCLK_CNTL__DENTIST_DPPCLK_CHG_DONE_MASK 0x00100000L +#define DENTIST_DISPCLK_CNTL__DENTIST_DPPCLK_WDIVIDER_MASK 0x7F000000L + +#define CLK5_spll_field_8__spll_ssc_en_MASK 0x00002000L #define SMU_VER_THRESHOLD 0x5D4A00 //93.74.0 +#undef FN +#define FN(reg_name, field_name) \ + clk_mgr->clk_mgr_shift->field_name, clk_mgr->clk_mgr_mask->field_name -#define REG(reg_name) \ - (ctx->clk_reg_offsets[reg ## reg_name ## _BASE_IDX] + reg ## reg_name) +#define REG(reg) \ + (clk_mgr->regs->reg) + +#define BASE_INNER(seg) DCN_BASE__INST0_SEG ## seg + +#define BASE(seg) BASE_INNER(seg) + +#define SR(reg_name)\ + .reg_name = BASE(reg ## reg_name ## _BASE_IDX) + \ + reg ## reg_name + +#define CLK_SR_DCN35(reg_name)\ + .reg_name = mm ## reg_name + +static const struct clk_mgr_registers clk_mgr_regs_dcn35 = { + CLK_REG_LIST_DCN35() +}; + +static const struct clk_mgr_shift clk_mgr_shift_dcn35 = { + CLK_COMMON_MASK_SH_LIST_DCN32(__SHIFT) +}; + +static const struct clk_mgr_mask clk_mgr_mask_dcn35 = { + CLK_COMMON_MASK_SH_LIST_DCN32(_MASK) +}; #define TO_CLK_MGR_DCN35(clk_mgr)\ container_of(clk_mgr, struct clk_mgr_dcn35, base) @@ -338,6 +401,7 @@ void dcn35_update_clocks(struct clk_mgr *clk_mgr_base, if (clk_mgr_base->clks.dtbclk_en && !new_clocks->dtbclk_en) { if (clk_mgr->base.ctx->dc->config.allow_0_dtb_clk) dcn35_smu_set_dtbclk(clk_mgr, false); + clk_mgr_base->clks.dtbclk_en = new_clocks->dtbclk_en; } /* check that we're not already in lower */ @@ -355,11 +419,17 @@ void dcn35_update_clocks(struct clk_mgr *clk_mgr_base, } if (!clk_mgr_base->clks.dtbclk_en && new_clocks->dtbclk_en) { - dcn35_smu_set_dtbclk(clk_mgr, true); - clk_mgr_base->clks.dtbclk_en = new_clocks->dtbclk_en; + int actual_dtbclk = 0; dcn35_update_clocks_update_dtb_dto(clk_mgr, context, new_clocks->ref_dtbclk_khz); - 
clk_mgr_base->clks.ref_dtbclk_khz = new_clocks->ref_dtbclk_khz; + dcn35_smu_set_dtbclk(clk_mgr, true); + + actual_dtbclk = REG_READ(CLK1_CLK4_CURRENT_CNT); + + if (actual_dtbclk) { + clk_mgr_base->clks.ref_dtbclk_khz = new_clocks->ref_dtbclk_khz; + clk_mgr_base->clks.dtbclk_en = new_clocks->dtbclk_en; + } } /* check that we're not already in D0 */ @@ -452,7 +522,6 @@ static int get_vco_frequency_from_reg(struct clk_mgr_internal *clk_mgr) struct fixed31_32 pll_req; unsigned int fbmult_frac_val = 0; unsigned int fbmult_int_val = 0; - struct dc_context *ctx = clk_mgr->base.ctx; /* * Register value of fbmult is in 8.16 format, we are converting to 314.32 @@ -512,22 +581,20 @@ static void dcn35_dump_clk_registers(struct clk_state_registers_and_bypass *regs static bool dcn35_is_spll_ssc_enabled(struct clk_mgr *clk_mgr_base) { struct clk_mgr_internal *clk_mgr = TO_CLK_MGR_INTERNAL(clk_mgr_base); - struct dc_context *ctx = clk_mgr->base.ctx; + uint32_t ssc_enable; - REG_GET(CLK5_0_CLK5_spll_field_8, spll_ssc_en, &ssc_enable); + ssc_enable = REG_READ(CLK5_spll_field_8) & CLK5_spll_field_8__spll_ssc_en_MASK; - return ssc_enable == 1; + return ssc_enable != 0; } static void init_clk_states(struct clk_mgr *clk_mgr) { - struct clk_mgr_internal *clk_mgr_int = TO_CLK_MGR_INTERNAL(clk_mgr); uint32_t ref_dtbclk = clk_mgr->clks.ref_dtbclk_khz; + memset(&(clk_mgr->clks), 0, sizeof(struct dc_clocks)); - if (clk_mgr_int->smu_ver >= SMU_VER_THRESHOLD) - clk_mgr->clks.dtbclk_en = true; // request DTBCLK disable on first commit clk_mgr->clks.ref_dtbclk_khz = ref_dtbclk; // restore ref_dtbclk clk_mgr->clks.p_state_change_support = true; clk_mgr->clks.prev_p_state_change_support = true; @@ -538,6 +605,7 @@ static void init_clk_states(struct clk_mgr *clk_mgr) void dcn35_init_clocks(struct clk_mgr *clk_mgr) { struct clk_mgr_internal *clk_mgr_int = TO_CLK_MGR_INTERNAL(clk_mgr); + init_clk_states(clk_mgr); // to adjust dp_dto reference clock if ssc is enable otherwise to apply dprefclk @@ 
-632,6 +700,7 @@ static struct wm_table lpddr5_wm_table = { }; static DpmClocks_t_dcn35 dummy_clocks; +static DpmClocks_t_dcn351 dummy_clocks_dcn351; static struct dcn35_watermarks dummy_wms = { 0 }; @@ -642,10 +711,10 @@ static struct dcn35_ss_info_table ss_info_table = { static void dcn35_read_ss_info_from_lut(struct clk_mgr_internal *clk_mgr) { - struct dc_context *ctx = clk_mgr->base.ctx; - uint32_t clock_source; + uint32_t clock_source = 0; + + clock_source = REG_READ(CLK1_CLK2_BYPASS_CNTL) & CLK1_CLK2_BYPASS_CNTL__CLK2_BYPASS_SEL_MASK; - REG_GET(CLK1_CLK2_BYPASS_CNTL, CLK2_BYPASS_SEL, &clock_source); // If it's DFS mode, clock_source is 0. if (dcn35_is_spll_ssc_enabled(&clk_mgr->base) && (clock_source < ARRAY_SIZE(ss_info_table.ss_percentage))) { clk_mgr->dprefclk_ss_percentage = ss_info_table.ss_percentage[clock_source]; @@ -755,6 +824,22 @@ static void dcn35_get_dpm_table_from_smu(struct clk_mgr_internal *clk_mgr, dcn35_smu_transfer_dpm_table_smu_2_dram(clk_mgr); } +static void dcn351_get_dpm_table_from_smu(struct clk_mgr_internal *clk_mgr, + struct dcn351_smu_dpm_clks *smu_dpm_clks) +{ + DpmClocks_t_dcn351 *table = smu_dpm_clks->dpm_clks; + + if (!clk_mgr->smu_ver) + return; + if (!table || smu_dpm_clks->mc_address.quad_part == 0) + return; + memset(table, 0, sizeof(*table)); + dcn35_smu_set_dram_addr_high(clk_mgr, + smu_dpm_clks->mc_address.high_part); + dcn35_smu_set_dram_addr_low(clk_mgr, + smu_dpm_clks->mc_address.low_part); + dcn35_smu_transfer_dpm_table_smu_2_dram(clk_mgr); +} static uint32_t find_max_clk_value(const uint32_t clocks[], uint32_t num_clocks) { uint32_t max = 0; @@ -1093,6 +1178,57 @@ struct clk_mgr_funcs dcn35_fpga_funcs = { .get_dtb_ref_clk_frequency = dcn31_get_dtb_ref_freq_khz, }; +static void translate_to_DpmClocks_t_dcn35(struct dcn351_smu_dpm_clks *smu_dpm_clks_a, + struct dcn35_smu_dpm_clks *smu_dpm_clks_b) +{ + /* translate between the two structures, copying only the needed clock tables */ + uint8_t i; + + if (smu_dpm_clks_a == NULL ||
smu_dpm_clks_b == NULL || + smu_dpm_clks_a->dpm_clks == NULL || smu_dpm_clks_b->dpm_clks == NULL) + return; + + for (i = 0; i < NUM_DCFCLK_DPM_LEVELS; i++) + smu_dpm_clks_b->dpm_clks->DcfClocks[i] = smu_dpm_clks_a->dpm_clks->DcfClocks[i]; + + for (i = 0; i < NUM_DISPCLK_DPM_LEVELS; i++) + smu_dpm_clks_b->dpm_clks->DispClocks[i] = smu_dpm_clks_a->dpm_clks->DispClocks[i]; + + for (i = 0; i < NUM_DPPCLK_DPM_LEVELS; i++) + smu_dpm_clks_b->dpm_clks->DppClocks[i] = smu_dpm_clks_a->dpm_clks->DppClocks[i]; + + for (i = 0; i < NUM_FCLK_DPM_LEVELS; i++) { + smu_dpm_clks_b->dpm_clks->FclkClocks_Freq[i] = smu_dpm_clks_a->dpm_clks->FclkClocks_Freq[i]; + smu_dpm_clks_b->dpm_clks->FclkClocks_Voltage[i] = smu_dpm_clks_a->dpm_clks->FclkClocks_Voltage[i]; + } + for (i = 0; i < NUM_MEM_PSTATE_LEVELS; i++) { + smu_dpm_clks_b->dpm_clks->MemPstateTable[i].MemClk = + smu_dpm_clks_a->dpm_clks->MemPstateTable[i].MemClk; + smu_dpm_clks_b->dpm_clks->MemPstateTable[i].UClk = + smu_dpm_clks_a->dpm_clks->MemPstateTable[i].UClk; + smu_dpm_clks_b->dpm_clks->MemPstateTable[i].Voltage = + smu_dpm_clks_a->dpm_clks->MemPstateTable[i].Voltage; + smu_dpm_clks_b->dpm_clks->MemPstateTable[i].WckRatio = + smu_dpm_clks_a->dpm_clks->MemPstateTable[i].WckRatio; + } + smu_dpm_clks_b->dpm_clks->MaxGfxClk = smu_dpm_clks_a->dpm_clks->MaxGfxClk; + smu_dpm_clks_b->dpm_clks->MinGfxClk = smu_dpm_clks_a->dpm_clks->MinGfxClk; + smu_dpm_clks_b->dpm_clks->NumDcfClkLevelsEnabled = + smu_dpm_clks_a->dpm_clks->NumDcfClkLevelsEnabled; + smu_dpm_clks_b->dpm_clks->NumDispClkLevelsEnabled = + smu_dpm_clks_a->dpm_clks->NumDispClkLevelsEnabled; + smu_dpm_clks_b->dpm_clks->NumFclkLevelsEnabled = + smu_dpm_clks_a->dpm_clks->NumFclkLevelsEnabled; + smu_dpm_clks_b->dpm_clks->NumMemPstatesEnabled = + smu_dpm_clks_a->dpm_clks->NumMemPstatesEnabled; + smu_dpm_clks_b->dpm_clks->NumSocClkLevelsEnabled = + smu_dpm_clks_a->dpm_clks->NumSocClkLevelsEnabled; + + for (i = 0; i < NUM_SOC_VOLTAGE_LEVELS; i++) { + 
smu_dpm_clks_b->dpm_clks->SocClocks[i] = smu_dpm_clks_a->dpm_clks->SocClocks[i]; + smu_dpm_clks_b->dpm_clks->SocVoltage[i] = smu_dpm_clks_a->dpm_clks->SocVoltage[i]; + } +} void dcn35_clk_mgr_construct( struct dc_context *ctx, struct clk_mgr_dcn35 *clk_mgr, @@ -1100,6 +1236,7 @@ void dcn35_clk_mgr_construct( struct dccg *dccg) { struct dcn35_smu_dpm_clks smu_dpm_clks = { 0 }; + struct dcn351_smu_dpm_clks smu_dpm_clks_dcn351 = { 0 }; clk_mgr->base.base.ctx = ctx; clk_mgr->base.base.funcs = &dcn35_funcs; @@ -1112,6 +1249,12 @@ void dcn35_clk_mgr_construct( clk_mgr->base.dprefclk_ss_divider = 1000; clk_mgr->base.ss_on_dprefclk = false; clk_mgr->base.dfs_ref_freq_khz = 48000; + if (ctx->dce_version == DCN_VERSION_3_5) { + clk_mgr->base.regs = &clk_mgr_regs_dcn35; + clk_mgr->base.clk_mgr_shift = &clk_mgr_shift_dcn35; + clk_mgr->base.clk_mgr_mask = &clk_mgr_mask_dcn35; + } + clk_mgr->smu_wm_set.wm_set = (struct dcn35_watermarks *)dm_helpers_allocate_gpu_mem( clk_mgr->base.base.ctx, @@ -1130,14 +1273,24 @@ void dcn35_clk_mgr_construct( DC_MEM_ALLOC_TYPE_GART, sizeof(DpmClocks_t_dcn35), &smu_dpm_clks.mc_address.quad_part); - if (smu_dpm_clks.dpm_clks == NULL) { smu_dpm_clks.dpm_clks = &dummy_clocks; smu_dpm_clks.mc_address.quad_part = 0; } - ASSERT(smu_dpm_clks.dpm_clks); + if (ctx->dce_version == DCN_VERSION_3_51) { + smu_dpm_clks_dcn351.dpm_clks = (DpmClocks_t_dcn351 *)dm_helpers_allocate_gpu_mem( + clk_mgr->base.base.ctx, + DC_MEM_ALLOC_TYPE_GART, + sizeof(DpmClocks_t_dcn351), + &smu_dpm_clks_dcn351.mc_address.quad_part); + if (smu_dpm_clks_dcn351.dpm_clks == NULL) { + smu_dpm_clks_dcn351.dpm_clks = &dummy_clocks_dcn351; + smu_dpm_clks_dcn351.mc_address.quad_part = 0; + } + } + clk_mgr->base.smu_ver = dcn35_smu_get_smu_version(&clk_mgr->base); if (clk_mgr->base.smu_ver) @@ -1166,7 +1319,11 @@ void dcn35_clk_mgr_construct( if (clk_mgr->base.base.ctx->dc->debug.pstate_enabled) { int i; - dcn35_get_dpm_table_from_smu(&clk_mgr->base, &smu_dpm_clks); + if (ctx->dce_version 
== DCN_VERSION_3_51) { + dcn351_get_dpm_table_from_smu(&clk_mgr->base, &smu_dpm_clks_dcn351); + translate_to_DpmClocks_t_dcn35(&smu_dpm_clks_dcn351, &smu_dpm_clks); + } else + dcn35_get_dpm_table_from_smu(&clk_mgr->base, &smu_dpm_clks); DC_LOG_SMU("NumDcfClkLevelsEnabled: %d\n" "NumDispClkLevelsEnabled: %d\n" "NumSocClkLevelsEnabled: %d\n" @@ -1227,6 +1384,10 @@ void dcn35_clk_mgr_construct( dm_helpers_free_gpu_mem(clk_mgr->base.base.ctx, DC_MEM_ALLOC_TYPE_GART, smu_dpm_clks.dpm_clks); + if (smu_dpm_clks_dcn351.dpm_clks && smu_dpm_clks_dcn351.mc_address.quad_part != 0) + dm_helpers_free_gpu_mem(clk_mgr->base.base.ctx, DC_MEM_ALLOC_TYPE_GART, + smu_dpm_clks_dcn351.dpm_clks); + if (ctx->dc->config.disable_ips != DMUB_IPS_DISABLE_ALL) { bool ips_support = false; diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn35/dcn35_clk_mgr.h b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn35/dcn35_clk_mgr.h index 1203dc605b12..a12a9bf90806 100644 --- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn35/dcn35_clk_mgr.h +++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn35/dcn35_clk_mgr.h @@ -60,4 +60,8 @@ void dcn35_clk_mgr_construct(struct dc_context *ctx, void dcn35_clk_mgr_destroy(struct clk_mgr_internal *clk_mgr_int); +void dcn351_clk_mgr_construct(struct dc_context *ctx, + struct clk_mgr_dcn35 *clk_mgr, + struct pp_smu_funcs *pp_smu, + struct dccg *dccg); #endif //__DCN35_CLK_MGR_H__ diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn35/dcn35_smu.h b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn35/dcn35_smu.h index 3fae13c73934..ab9d21ba0c43 100644 --- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn35/dcn35_smu.h +++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn35/dcn35_smu.h @@ -126,18 +126,31 @@ typedef struct { uint32_t MaxGfxClk; } DpmClocks_t_dcn35; - -// Throttler Status Bitmask - - - - - - - - - - +typedef struct { + uint32_t DcfClocks[NUM_DCFCLK_DPM_LEVELS]; + uint32_t DispClocks[NUM_DISPCLK_DPM_LEVELS]; + uint32_t DppClocks[NUM_DPPCLK_DPM_LEVELS]; + uint32_t 
SocClocks[NUM_SOCCLK_DPM_LEVELS]; + uint32_t VClocks0[NUM_VCN_DPM_LEVELS]; + uint32_t VClocks1[NUM_VCN_DPM_LEVELS]; + uint32_t DClocks0[NUM_VCN_DPM_LEVELS]; + uint32_t DClocks1[NUM_VCN_DPM_LEVELS]; + uint32_t VPEClocks[NUM_VPE_DPM_LEVELS]; + uint32_t FclkClocks_Freq[NUM_FCLK_DPM_LEVELS]; + uint32_t FclkClocks_Voltage[NUM_FCLK_DPM_LEVELS]; + uint32_t SocVoltage[NUM_SOC_VOLTAGE_LEVELS]; + MemPstateTable_t MemPstateTable[NUM_MEM_PSTATE_LEVELS]; + uint8_t NumDcfClkLevelsEnabled; + uint8_t NumDispClkLevelsEnabled; // Applies to both Dispclk and Dppclk + uint8_t NumSocClkLevelsEnabled; + uint8_t Vcn0ClkLevelsEnabled; // Applies to both Vclk0 and Dclk0 + uint8_t Vcn1ClkLevelsEnabled; // Applies to both Vclk1 and Dclk1 + uint8_t VpeClkLevelsEnabled; + uint8_t NumMemPstatesEnabled; + uint8_t NumFclkLevelsEnabled; + uint32_t MinGfxClk; + uint32_t MaxGfxClk; +} DpmClocks_t_dcn351; #define TABLE_BIOS_IF 0 // Called by BIOS #define TABLE_WATERMARKS 1 // Called by DAL through VBIOS @@ -163,6 +176,10 @@ struct dcn35_smu_dpm_clks { union large_integer mc_address; }; +struct dcn351_smu_dpm_clks { + DpmClocks_t_dcn351 *dpm_clks; + union large_integer mc_address; +}; /* TODO: taken from vgh, may not be correct */ struct display_idle_optimization { unsigned int df_request_disabled : 1; diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn401/dalsmc.h b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn401/dalsmc.h index dbfdd3487da5..2e0d34fd7512 100644 --- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn401/dalsmc.h +++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn401/dalsmc.h @@ -43,7 +43,9 @@ #define DALSMC_MSG_ActiveUclkFclk 0x18 #define DALSMC_MSG_IdleUclkFclk 0x19 #define DALSMC_MSG_SetUclkPstateAllow 0x1A -#define DALSMC_Message_Count 0x1B +#define DALSMC_MSG_SubvpUclkFclk 0x1B +#define DALSMC_MSG_GetNumUmcChannels 0x1C +#define DALSMC_Message_Count 0x1D typedef enum { FCLK_SWITCH_DISALLOW, diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn401/dcn401_clk_mgr.c 
b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn401/dcn401_clk_mgr.c index 8cfc5f435937..8082bb877611 100644 --- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn401/dcn401_clk_mgr.c +++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn401/dcn401_clk_mgr.c @@ -141,6 +141,20 @@ static bool dcn401_is_ppclk_idle_dpm_enabled(struct clk_mgr_internal *clk_mgr, P return ppclk_idle_dpm_enabled; } +static bool dcn401_is_df_throttle_opt_enabled(struct clk_mgr_internal *clk_mgr) +{ + bool is_df_throttle_opt_enabled = false; + + if (ASICREV_IS_GC_12_0_1_A0(clk_mgr->base.ctx->asic_id.hw_internal_rev) && + clk_mgr->smu_ver >= 0x663500) { + is_df_throttle_opt_enabled = !clk_mgr->base.ctx->dc->debug.force_subvp_df_throttle; + } + + is_df_throttle_opt_enabled &= clk_mgr->smu_present; + + return is_df_throttle_opt_enabled; +} + /* Query SMU for all clock states for a particular clock */ static void dcn401_init_single_clock(struct clk_mgr_internal *clk_mgr, PPCLK_e clk, unsigned int *entry_0, unsigned int *num_levels) @@ -614,207 +628,6 @@ static void dcn401_update_clocks_update_dentist( } -static void dcn401_update_clocks_legacy(struct clk_mgr *clk_mgr_base, - struct dc_state *context, - bool safe_to_lower) -{ - struct clk_mgr_internal *clk_mgr = TO_CLK_MGR_INTERNAL(clk_mgr_base); - struct dc_clocks *new_clocks = &context->bw_ctx.bw.dcn.clk; - struct dc *dc = clk_mgr_base->ctx->dc; - int display_count; - bool update_dppclk = false; - bool update_dispclk = false; - bool enter_display_off = false; - bool dpp_clock_lowered = false; - struct dmcu *dmcu = clk_mgr_base->ctx->dc->res_pool->dmcu; - bool force_reset = false; - bool update_uclk = false, update_fclk = false; - bool p_state_change_support; - bool fclk_p_state_change_support; - int total_plane_count; - - if (dc->work_arounds.skip_clock_update) - return; - - if (clk_mgr_base->clks.dispclk_khz == 0 || - (dc->debug.force_clock_mode & 0x1)) { - /* This is from resume or boot up, if forced_clock cfg option used, - * we bypass program dispclk and 
DPPCLK, but need set them for S3. - */ - force_reset = true; - - dcn2_read_clocks_from_hw_dentist(clk_mgr_base); - - /* Force_clock_mode 0x1: force reset the clock even it is the same clock - * as long as it is in Passive level. - */ - } - display_count = clk_mgr_helper_get_active_display_cnt(dc, context); - - if (display_count == 0) - enter_display_off = true; - - if (clk_mgr->smu_present) { - if (enter_display_off == safe_to_lower) - dcn401_smu_set_num_of_displays(clk_mgr, display_count); - - clk_mgr_base->clks.fclk_prev_p_state_change_support = clk_mgr_base->clks.fclk_p_state_change_support; - - total_plane_count = clk_mgr_helper_get_active_plane_cnt(dc, context); - fclk_p_state_change_support = new_clocks->fclk_p_state_change_support || (total_plane_count == 0); - - if (should_update_pstate_support(safe_to_lower, fclk_p_state_change_support, clk_mgr_base->clks.fclk_p_state_change_support)) { - clk_mgr_base->clks.fclk_p_state_change_support = fclk_p_state_change_support; - - /* To enable FCLK P-state switching, send PSTATE_SUPPORTED message to PMFW */ - if (clk_mgr_base->clks.fclk_p_state_change_support) { - /* Handle the code for sending a message to PMFW that FCLK P-state change is supported */ - dcn401_smu_send_fclk_pstate_message(clk_mgr, true); - } - } - - if (dc->debug.force_min_dcfclk_mhz > 0) - new_clocks->dcfclk_khz = (new_clocks->dcfclk_khz > (dc->debug.force_min_dcfclk_mhz * 1000)) ? 
- new_clocks->dcfclk_khz : (dc->debug.force_min_dcfclk_mhz * 1000); - - if (should_set_clock(safe_to_lower, new_clocks->dcfclk_khz, clk_mgr_base->clks.dcfclk_khz)) { - clk_mgr_base->clks.dcfclk_khz = new_clocks->dcfclk_khz; - if (dcn401_is_ppclk_dpm_enabled(clk_mgr, PPCLK_DCFCLK)) - dcn401_smu_set_hard_min_by_freq(clk_mgr, PPCLK_DCFCLK, khz_to_mhz_ceil(clk_mgr_base->clks.dcfclk_khz)); - } - - if (should_set_clock(safe_to_lower, new_clocks->dcfclk_deep_sleep_khz, clk_mgr_base->clks.dcfclk_deep_sleep_khz)) { - clk_mgr_base->clks.dcfclk_deep_sleep_khz = new_clocks->dcfclk_deep_sleep_khz; - if (dcn401_is_ppclk_dpm_enabled(clk_mgr, PPCLK_DCFCLK)) - dcn401_smu_set_min_deep_sleep_dcef_clk(clk_mgr, khz_to_mhz_ceil(clk_mgr_base->clks.dcfclk_deep_sleep_khz)); - } - - if (should_set_clock(safe_to_lower, new_clocks->socclk_khz, clk_mgr_base->clks.socclk_khz)) - /* We don't actually care about socclk, don't notify SMU of hard min */ - clk_mgr_base->clks.socclk_khz = new_clocks->socclk_khz; - - clk_mgr_base->clks.prev_p_state_change_support = clk_mgr_base->clks.p_state_change_support; - clk_mgr_base->clks.prev_num_ways = clk_mgr_base->clks.num_ways; - - if (clk_mgr_base->clks.num_ways != new_clocks->num_ways && - clk_mgr_base->clks.num_ways < new_clocks->num_ways) { - clk_mgr_base->clks.num_ways = new_clocks->num_ways; - if (dcn401_is_ppclk_dpm_enabled(clk_mgr, PPCLK_UCLK)) - dcn401_smu_send_cab_for_uclk_message(clk_mgr, clk_mgr_base->clks.num_ways); - } - - - p_state_change_support = new_clocks->p_state_change_support || (total_plane_count == 0); - if (should_update_pstate_support(safe_to_lower, p_state_change_support, clk_mgr_base->clks.prev_p_state_change_support)) { - clk_mgr_base->clks.p_state_change_support = p_state_change_support; - clk_mgr_base->clks.fw_based_mclk_switching = p_state_change_support && new_clocks->fw_based_mclk_switching; - - /* to disable P-State switching, set UCLK min = max */ - if (!clk_mgr_base->clks.p_state_change_support && 
dcn401_is_ppclk_dpm_enabled(clk_mgr, PPCLK_UCLK)) - dcn401_smu_set_hard_min_by_freq(clk_mgr, PPCLK_UCLK, - clk_mgr_base->bw_params->clk_table.entries[clk_mgr_base->bw_params->clk_table.num_entries_per_clk.num_memclk_levels - 1].memclk_mhz); - } - - /* Always update saved value, even if new value not set due to P-State switching unsupported. Also check safe_to_lower for FCLK */ - if (safe_to_lower && (clk_mgr_base->clks.fclk_p_state_change_support != clk_mgr_base->clks.fclk_prev_p_state_change_support)) { - update_fclk = true; - } - - if (!clk_mgr_base->clks.fclk_p_state_change_support && - update_fclk && - dcn401_is_ppclk_dpm_enabled(clk_mgr, PPCLK_FCLK)) { - /* Handle code for sending a message to PMFW that FCLK P-state change is not supported */ - dcn401_smu_send_fclk_pstate_message(clk_mgr, false); - } - - /* Always update saved value, even if new value not set due to P-State switching unsupported */ - if (should_set_clock(safe_to_lower, new_clocks->dramclk_khz, clk_mgr_base->clks.dramclk_khz)) { - clk_mgr_base->clks.dramclk_khz = new_clocks->dramclk_khz; - update_uclk = true; - } - - /* set UCLK to requested value if P-State switching is supported, or to re-enable P-State switching */ - if (clk_mgr_base->clks.p_state_change_support && - (update_uclk || !clk_mgr_base->clks.prev_p_state_change_support) && - dcn401_is_ppclk_dpm_enabled(clk_mgr, PPCLK_UCLK)) - dcn401_smu_set_hard_min_by_freq(clk_mgr, PPCLK_UCLK, khz_to_mhz_ceil(clk_mgr_base->clks.dramclk_khz)); - - if (clk_mgr_base->clks.num_ways != new_clocks->num_ways && - clk_mgr_base->clks.num_ways > new_clocks->num_ways) { - clk_mgr_base->clks.num_ways = new_clocks->num_ways; - if (dcn401_is_ppclk_dpm_enabled(clk_mgr, PPCLK_UCLK)) - dcn401_smu_send_cab_for_uclk_message(clk_mgr, clk_mgr_base->clks.num_ways); - } - } - - if (should_set_clock(safe_to_lower, new_clocks->dppclk_khz, clk_mgr_base->clks.dppclk_khz)) { - if (clk_mgr_base->clks.dppclk_khz > new_clocks->dppclk_khz) - dpp_clock_lowered = true; - - 
clk_mgr_base->clks.dppclk_khz = new_clocks->dppclk_khz; - clk_mgr_base->clks.actual_dppclk_khz = new_clocks->dppclk_khz; - - if (clk_mgr->smu_present && !dpp_clock_lowered && dcn401_is_ppclk_dpm_enabled(clk_mgr, PPCLK_DPPCLK)) - clk_mgr_base->clks.actual_dppclk_khz = dcn401_set_hard_min_by_freq_optimized(clk_mgr, PPCLK_DPPCLK, clk_mgr_base->clks.dppclk_khz); - update_dppclk = true; - } - - if (should_set_clock(safe_to_lower, new_clocks->dispclk_khz, clk_mgr_base->clks.dispclk_khz)) { - clk_mgr_base->clks.dispclk_khz = new_clocks->dispclk_khz; - - if (clk_mgr->smu_present && dcn401_is_ppclk_dpm_enabled(clk_mgr, PPCLK_DISPCLK)) - clk_mgr_base->clks.actual_dispclk_khz = dcn401_set_hard_min_by_freq_optimized(clk_mgr, PPCLK_DISPCLK, clk_mgr_base->clks.dispclk_khz); - - update_dispclk = true; - } - - if (!new_clocks->dtbclk_en && dcn401_is_ppclk_dpm_enabled(clk_mgr, PPCLK_DTBCLK)) { - new_clocks->ref_dtbclk_khz = clk_mgr_base->bw_params->clk_table.entries[0].dtbclk_mhz * 1000; - } - - /* clock limits are received with MHz precision, divide by 1000 to prevent setting clocks at every call */ - if (!dc->debug.disable_dtb_ref_clk_switch && - should_set_clock(safe_to_lower, new_clocks->ref_dtbclk_khz / 1000, clk_mgr_base->clks.ref_dtbclk_khz / 1000) && - dcn401_is_ppclk_dpm_enabled(clk_mgr, PPCLK_DTBCLK)) { - /* DCCG requires KHz precision for DTBCLK */ - clk_mgr_base->clks.ref_dtbclk_khz = - dcn401_smu_set_hard_min_by_freq(clk_mgr, PPCLK_DTBCLK, khz_to_mhz_ceil(new_clocks->ref_dtbclk_khz)); - - dcn401_update_clocks_update_dtb_dto(clk_mgr, context, clk_mgr_base->clks.ref_dtbclk_khz); - } - - if (dc->config.forced_clocks == false || (force_reset && safe_to_lower)) { - if (dpp_clock_lowered) { - /* if clock is being lowered, increase DTO before lowering refclk */ - dcn401_update_clocks_update_dpp_dto(clk_mgr, context, - safe_to_lower, clk_mgr_base->clks.dppclk_khz); - dcn401_update_clocks_update_dentist(clk_mgr, context); - if (clk_mgr->smu_present && 
dcn401_is_ppclk_dpm_enabled(clk_mgr, PPCLK_DPPCLK)) { - clk_mgr_base->clks.actual_dppclk_khz = dcn401_set_hard_min_by_freq_optimized(clk_mgr, PPCLK_DPPCLK, - clk_mgr_base->clks.dppclk_khz); - dcn401_update_clocks_update_dpp_dto(clk_mgr, context, safe_to_lower, - clk_mgr_base->clks.actual_dppclk_khz); - } - - } else { - /* if clock is being raised, increase refclk before lowering DTO */ - if (update_dppclk || update_dispclk) - dcn401_update_clocks_update_dentist(clk_mgr, context); - /* There is a check inside dcn20_update_clocks_update_dpp_dto which ensures - * that we do not lower dto when it is not safe to lower. We do not need to - * compare the current and new dppclk before calling this function. - */ - dcn401_update_clocks_update_dpp_dto(clk_mgr, context, - safe_to_lower, clk_mgr_base->clks.actual_dppclk_khz); - } - } - - if (update_dispclk && dmcu && dmcu->funcs->is_dmcu_initialized(dmcu)) - /*update dmcu for wait_loop count*/ - dmcu->funcs->set_psr_wait_loop(dmcu, - clk_mgr_base->clks.dispclk_khz / 1000 / 7); -} - static void dcn401_execute_block_sequence(struct clk_mgr *clk_mgr_base, unsigned int num_steps) { struct clk_mgr_internal *clk_mgr_internal = TO_CLK_MGR_INTERNAL(clk_mgr_base); @@ -869,6 +682,12 @@ static void dcn401_execute_block_sequence(struct clk_mgr *clk_mgr_base, unsigned params->update_idle_hardmin_params.uclk_mhz, params->update_idle_hardmin_params.fclk_mhz); break; + case CLK_MGR401_UPDATE_SUBVP_HARDMINS: + dcn401_smu_set_subvp_uclk_fclk_hardmin( + clk_mgr_internal, + params->update_idle_hardmin_params.uclk_mhz, + params->update_idle_hardmin_params.fclk_mhz); + break; case CLK_MGR401_UPDATE_DEEP_SLEEP_DCFCLK: dcn401_smu_set_min_deep_sleep_dcef_clk( clk_mgr_internal, @@ -945,15 +764,21 @@ static unsigned int dcn401_build_update_bandwidth_clocks_sequence( bool update_active_uclk = false; bool update_idle_fclk = false; bool update_idle_uclk = false; + bool update_subvp_prefetch_dramclk = false; + bool update_subvp_prefetch_fclk = false; bool 
is_idle_dpm_enabled = dcn401_is_ppclk_dpm_enabled(clk_mgr_internal, PPCLK_UCLK) && dcn401_is_ppclk_dpm_enabled(clk_mgr_internal, PPCLK_FCLK) && dcn401_is_ppclk_idle_dpm_enabled(clk_mgr_internal, PPCLK_UCLK) && dcn401_is_ppclk_idle_dpm_enabled(clk_mgr_internal, PPCLK_FCLK); + bool is_df_throttle_opt_enabled = is_idle_dpm_enabled && + dcn401_is_df_throttle_opt_enabled(clk_mgr_internal); int total_plane_count = clk_mgr_helper_get_active_plane_cnt(dc, context); int active_uclk_mhz = khz_to_mhz_ceil(clk_mgr_base->clks.dramclk_khz); int active_fclk_mhz = khz_to_mhz_ceil(clk_mgr_base->clks.fclk_khz); int idle_uclk_mhz = khz_to_mhz_ceil(clk_mgr_base->clks.idle_dramclk_khz); int idle_fclk_mhz = khz_to_mhz_ceil(clk_mgr_base->clks.idle_fclk_khz); + int subvp_prefetch_dramclk_mhz = khz_to_mhz_ceil(clk_mgr_base->clks.subvp_prefetch_dramclk_khz); + int subvp_prefetch_fclk_mhz = khz_to_mhz_ceil(clk_mgr_base->clks.subvp_prefetch_fclk_khz); unsigned int num_steps = 0; @@ -982,15 +807,15 @@ static unsigned int dcn401_build_update_bandwidth_clocks_sequence( update_active_fclk = true; update_idle_fclk = true; - /* To enable FCLK P-state switching, send PSTATE_SUPPORTED message to PMFW */ - if (clk_mgr_base->clks.fclk_p_state_change_support) { - /* Handle the code for sending a message to PMFW that FCLK P-state change is supported */ - if (dcn401_is_ppclk_dpm_enabled(clk_mgr_internal, PPCLK_FCLK)) { - block_sequence[num_steps].params.update_pstate_support_params.support = true; - block_sequence[num_steps].func = CLK_MGR401_UPDATE_FCLK_PSTATE_SUPPORT; - num_steps++; - } - } + /* To enable FCLK P-state switching, send PSTATE_SUPPORTED message to PMFW (message not supported on DCN401)*/ + // if (clk_mgr_base->clks.fclk_p_state_change_support) { + // /* Handle the code for sending a message to PMFW that FCLK P-state change is supported */ + // if (dcn401_is_ppclk_dpm_enabled(clk_mgr_internal, PPCLK_FCLK)) { + // block_sequence[num_steps].params.update_pstate_support_params.support = true; 
+ // block_sequence[num_steps].func = CLK_MGR401_UPDATE_FCLK_PSTATE_SUPPORT; + // num_steps++; + // } + // } } if (!clk_mgr_base->clks.fclk_p_state_change_support && dcn401_is_ppclk_dpm_enabled(clk_mgr_internal, PPCLK_FCLK)) { @@ -1109,6 +934,12 @@ static unsigned int dcn401_build_update_bandwidth_clocks_sequence( } } + if (should_set_clock(safe_to_lower, new_clocks->subvp_prefetch_dramclk_khz, clk_mgr_base->clks.subvp_prefetch_dramclk_khz)) { + clk_mgr_base->clks.subvp_prefetch_dramclk_khz = new_clocks->subvp_prefetch_dramclk_khz; + update_subvp_prefetch_dramclk = true; + subvp_prefetch_dramclk_mhz = khz_to_mhz_ceil(clk_mgr_base->clks.subvp_prefetch_dramclk_khz); + } + /* FCLK */ /* Always update saved value, even if new value not set due to P-State switching unsupported */ if (should_set_clock(safe_to_lower, new_clocks->fclk_khz, clk_mgr_base->clks.fclk_khz)) { @@ -1129,6 +960,12 @@ static unsigned int dcn401_build_update_bandwidth_clocks_sequence( } } + if (should_set_clock(safe_to_lower, new_clocks->subvp_prefetch_fclk_khz, clk_mgr_base->clks.subvp_prefetch_fclk_khz)) { + clk_mgr_base->clks.subvp_prefetch_fclk_khz = new_clocks->subvp_prefetch_fclk_khz; + update_subvp_prefetch_fclk = true; + subvp_prefetch_fclk_mhz = khz_to_mhz_ceil(clk_mgr_base->clks.subvp_prefetch_fclk_khz); + } + /* When idle DPM is enabled, need to send active and idle hardmins separately */ /* CLK_MGR401_UPDATE_ACTIVE_HARDMINS */ if ((update_active_uclk || update_active_fclk) && is_idle_dpm_enabled) { @@ -1146,6 +983,14 @@ static unsigned int dcn401_build_update_bandwidth_clocks_sequence( num_steps++; } + /* CLK_MGR401_UPDATE_SUBVP_HARDMINS */ + if ((update_subvp_prefetch_dramclk || update_subvp_prefetch_fclk) && is_df_throttle_opt_enabled) { + block_sequence[num_steps].params.update_idle_hardmin_params.uclk_mhz = subvp_prefetch_dramclk_mhz; + block_sequence[num_steps].params.update_idle_hardmin_params.fclk_mhz = subvp_prefetch_fclk_mhz; + block_sequence[num_steps].func = 
CLK_MGR401_UPDATE_SUBVP_HARDMINS; + num_steps++; + } + /* set UCLK to requested value if P-State switching is supported, or to re-enable P-State switching */ if (update_active_uclk || update_idle_uclk) { if (!is_idle_dpm_enabled) { @@ -1178,14 +1023,14 @@ static unsigned int dcn401_build_update_bandwidth_clocks_sequence( // (*num_steps)++; // } - /* disable FCLK P-State support if needed */ - if (!fclk_p_state_change_support && - should_update_pstate_support(safe_to_lower, fclk_p_state_change_support, clk_mgr_base->clks.fclk_prev_p_state_change_support) && - dcn401_is_ppclk_dpm_enabled(clk_mgr_internal, PPCLK_FCLK)) { - block_sequence[num_steps].params.update_pstate_support_params.support = false; - block_sequence[num_steps].func = CLK_MGR401_UPDATE_FCLK_PSTATE_SUPPORT; - num_steps++; - } + /* disable FCLK P-State support if needed (message not supported on DCN401)*/ + // if (!fclk_p_state_change_support && + // should_update_pstate_support(safe_to_lower, fclk_p_state_change_support, clk_mgr_base->clks.fclk_prev_p_state_change_support) && + // dcn401_is_ppclk_dpm_enabled(clk_mgr_internal, PPCLK_FCLK)) { + // block_sequence[num_steps].params.update_pstate_support_params.support = false; + // block_sequence[num_steps].func = CLK_MGR401_UPDATE_FCLK_PSTATE_SUPPORT; + // num_steps++; + // } } if (new_clocks->fw_based_mclk_switching != clk_mgr_base->clks.fw_based_mclk_switching && @@ -1366,11 +1211,6 @@ static void dcn401_update_clocks(struct clk_mgr *clk_mgr_base, unsigned int num_steps = 0; - if (dc->debug.enable_legacy_clock_update) { - dcn401_update_clocks_legacy(clk_mgr_base, context, safe_to_lower); - return; - } - /* build bandwidth related clocks update sequence */ num_steps = dcn401_build_update_bandwidth_clocks_sequence(clk_mgr_base, context, @@ -1505,6 +1345,20 @@ static void dcn401_set_hard_min_memclk(struct clk_mgr *clk_mgr_base, bool curren dcn401_execute_block_sequence(clk_mgr_base, num_steps); } +static int dcn401_get_hard_min_memclk(struct clk_mgr 
*clk_mgr_base) +{ + struct clk_mgr_internal *clk_mgr = TO_CLK_MGR_INTERNAL(clk_mgr_base); + + return clk_mgr->base.ctx->dc->current_state->bw_ctx.bw.dcn.clk.dramclk_khz; +} + +static int dcn401_get_hard_min_fclk(struct clk_mgr *clk_mgr_base) +{ + struct clk_mgr_internal *clk_mgr = TO_CLK_MGR_INTERNAL(clk_mgr_base); + + return clk_mgr->base.ctx->dc->current_state->bw_ctx.bw.dcn.clk.fclk_khz; +} + /* Get current memclk states, update bounding box */ static void dcn401_get_memclk_states_from_smu(struct clk_mgr *clk_mgr_base) { @@ -1549,6 +1403,15 @@ static void dcn401_get_memclk_states_from_smu(struct clk_mgr *clk_mgr_base) if (clk_mgr->dpm_present && !num_levels) clk_mgr->dpm_present = false; + clk_mgr_base->bw_params->num_channels = dcn401_smu_get_num_of_umc_channels(clk_mgr); + if (clk_mgr_base->ctx->dc_bios) { + /* use BIOS values if none provided by PMFW */ + if (clk_mgr_base->bw_params->num_channels == 0) { + clk_mgr_base->bw_params->num_channels = clk_mgr_base->ctx->dc_bios->vram_info.num_chans; + } + clk_mgr_base->bw_params->dram_channel_width_bytes = clk_mgr_base->ctx->dc_bios->vram_info.dram_channel_width_bytes; + } + /* Refresh bounding box */ clk_mgr_base->ctx->dc->res_pool->funcs->update_bw_bounding_box( clk_mgr->base.ctx->dc, clk_mgr_base->bw_params); @@ -1638,6 +1501,8 @@ static struct clk_mgr_funcs dcn401_funcs = { .enable_pme_wa = dcn401_enable_pme_wa, .is_smu_present = dcn401_is_smu_present, .get_dispclk_from_dentist = dcn401_get_dispclk_from_dentist, + .get_hard_min_memclk = dcn401_get_hard_min_memclk, + .get_hard_min_fclk = dcn401_get_hard_min_fclk, }; struct clk_mgr_internal *dcn401_clk_mgr_construct( diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn401/dcn401_clk_mgr.h b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn401/dcn401_clk_mgr.h index 8b0461992b22..6c9ae5ca2c7e 100644 --- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn401/dcn401_clk_mgr.h +++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn401/dcn401_clk_mgr.h @@ -90,6 +90,7 @@ enum 
dcn401_clk_mgr_block_sequence_func { CLK_MGR401_UPDATE_DTBCLK_DTO, CLK_MGR401_UPDATE_DENTIST, CLK_MGR401_UPDATE_PSR_WAIT_LOOP, + CLK_MGR401_UPDATE_SUBVP_HARDMINS, }; struct dcn401_clk_mgr_block_sequence { diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn401/dcn401_clk_mgr_smu_msg.c b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn401/dcn401_clk_mgr_smu_msg.c index 7700477d019b..21c35528f61f 100644 --- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn401/dcn401_clk_mgr_smu_msg.c +++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn401/dcn401_clk_mgr_smu_msg.c @@ -21,6 +21,14 @@ #define smu_print(str, ...) {DC_LOG_SMU(str, ##__VA_ARGS__); } +/* temporary define */ +#ifndef DALSMC_MSG_SubvpUclkFclk +#define DALSMC_MSG_SubvpUclkFclk 0x1B +#endif +#ifndef DALSMC_MSG_GetNumUmcChannels +#define DALSMC_MSG_GetNumUmcChannels 0x1C +#endif + /* * Function to be used instead of REG_WAIT macro because the wait ends when * the register is NOT EQUAL to zero, and because the translation in msg_if.h @@ -296,6 +304,24 @@ bool dcn401_smu_set_active_uclk_fclk_hardmin(struct clk_mgr_internal *clk_mgr, return success; } +bool dcn401_smu_set_subvp_uclk_fclk_hardmin(struct clk_mgr_internal *clk_mgr, + uint16_t uclk_freq_mhz, + uint16_t fclk_freq_mhz) +{ + uint32_t response = 0; + bool success; + + /* 15:0 for uclk, 31:16 for fclk */ + uint32_t param = (fclk_freq_mhz << 16) | uclk_freq_mhz; + + smu_print("SMU Set subvp hardmin by freq: uclk_freq_mhz = %d MHz, fclk_freq_mhz = %d MHz\n", uclk_freq_mhz, fclk_freq_mhz); + + success = dcn401_smu_send_msg_with_param(clk_mgr, + DALSMC_MSG_SubvpUclkFclk, param, &response); + + return success; +} + void dcn401_smu_set_min_deep_sleep_dcef_clk(struct clk_mgr_internal *clk_mgr, uint32_t freq_mhz) { smu_print("SMU Set min deep sleep dcef clk: freq_mhz = %d MHz\n", freq_mhz); @@ -311,3 +337,14 @@ void dcn401_smu_set_num_of_displays(struct clk_mgr_internal *clk_mgr, uint32_t n dcn401_smu_send_msg_with_param(clk_mgr, DALSMC_MSG_NumOfDisplays, num_displays, NULL);
} + +unsigned int dcn401_smu_get_num_of_umc_channels(struct clk_mgr_internal *clk_mgr) +{ + unsigned int response = 0; + + dcn401_smu_send_msg_with_param(clk_mgr, DALSMC_MSG_GetNumUmcChannels, 0, &response); + + smu_print("SMU Get Num UMC Channels: num_umc_channels = %d\n", response); + + return response; +} diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn401/dcn401_clk_mgr_smu_msg.h b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn401/dcn401_clk_mgr_smu_msg.h index 651fb8d62864..e02eb1294b37 100644 --- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn401/dcn401_clk_mgr_smu_msg.h +++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn401/dcn401_clk_mgr_smu_msg.h @@ -23,7 +23,11 @@ bool dcn401_smu_set_idle_uclk_fclk_hardmin(struct clk_mgr_internal *clk_mgr, bool dcn401_smu_set_active_uclk_fclk_hardmin(struct clk_mgr_internal *clk_mgr, uint16_t uclk_freq_mhz, uint16_t fclk_freq_mhz); +bool dcn401_smu_set_subvp_uclk_fclk_hardmin(struct clk_mgr_internal *clk_mgr, + uint16_t uclk_freq_mhz, + uint16_t fclk_freq_mhz); void dcn401_smu_set_min_deep_sleep_dcef_clk(struct clk_mgr_internal *clk_mgr, uint32_t freq_mhz); void dcn401_smu_set_num_of_displays(struct clk_mgr_internal *clk_mgr, uint32_t num_displays); +unsigned int dcn401_smu_get_num_of_umc_channels(struct clk_mgr_internal *clk_mgr); #endif /* __DCN401_CLK_MGR_SMU_MSG_H_ */ diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c b/drivers/gpu/drm/amd/display/dc/core/dc.c index 1dd26d5df6b9..cecaadf741ad 100644 --- a/drivers/gpu/drm/amd/display/dc/core/dc.c +++ b/drivers/gpu/drm/amd/display/dc/core/dc.c @@ -579,7 +579,7 @@ dc_stream_forward_dmcu_crc_window(struct dmcu *dmcu, bool dc_stream_forward_crc_window(struct dc_stream_state *stream, - struct rect *rect, bool is_stop) + struct rect *rect, uint8_t phy_id, bool is_stop) { struct dmcu *dmcu; struct dc_dmub_srv *dmub_srv; @@ -598,7 +598,7 @@ dc_stream_forward_crc_window(struct dc_stream_state *stream, if (i == MAX_PIPES) return false; - mux_mapping.phy_output_num = 
stream->link->link_enc_hw_inst; + mux_mapping.phy_output_num = phy_id; mux_mapping.otg_output_num = pipe->stream_res.tg->inst; dmcu = dc->res_pool->dmcu; @@ -615,6 +615,68 @@ dc_stream_forward_crc_window(struct dc_stream_state *stream, return true; } + +static void +dc_stream_forward_dmub_multiple_crc_window(struct dc_dmub_srv *dmub_srv, + struct crc_window *window, struct otg_phy_mux *mux_mapping, bool stop) +{ + int i; + union dmub_rb_cmd cmd = {0}; + + cmd.secure_display.mul_roi_ctl.phy_id = mux_mapping->phy_output_num; + cmd.secure_display.mul_roi_ctl.otg_id = mux_mapping->otg_output_num; + + cmd.secure_display.header.type = DMUB_CMD__SECURE_DISPLAY; + + if (stop) { + cmd.secure_display.header.sub_type = DMUB_CMD__SECURE_DISPLAY_MULTIPLE_CRC_STOP_UPDATE; + } else { + cmd.secure_display.header.sub_type = DMUB_CMD__SECURE_DISPLAY_MULTIPLE_CRC_WIN_NOTIFY; + for (i = 0; i < MAX_CRC_WINDOW_NUM; i++) { + cmd.secure_display.mul_roi_ctl.roi_ctl[i].x_start = window[i].rect.x; + cmd.secure_display.mul_roi_ctl.roi_ctl[i].y_start = window[i].rect.y; + cmd.secure_display.mul_roi_ctl.roi_ctl[i].x_end = window[i].rect.x + window[i].rect.width; + cmd.secure_display.mul_roi_ctl.roi_ctl[i].y_end = window[i].rect.y + window[i].rect.height; + cmd.secure_display.mul_roi_ctl.roi_ctl[i].enable = window[i].enable; + } + } + + dc_wake_and_execute_dmub_cmd(dmub_srv->ctx, &cmd, DM_DMUB_WAIT_TYPE_NO_WAIT); +} + +bool +dc_stream_forward_multiple_crc_window(struct dc_stream_state *stream, + struct crc_window *window, uint8_t phy_id, bool stop) +{ + struct dc_dmub_srv *dmub_srv; + struct otg_phy_mux mux_mapping; + struct pipe_ctx *pipe; + int i; + struct dc *dc = stream->ctx->dc; + + for (i = 0; i < MAX_PIPES; i++) { + pipe = &dc->current_state->res_ctx.pipe_ctx[i]; + if (pipe->stream == stream && !pipe->top_pipe && !pipe->prev_odm_pipe) + break; + } + + /* Stream not found */ + if (i == MAX_PIPES) + return false; + + mux_mapping.phy_output_num = phy_id; + mux_mapping.otg_output_num = 
pipe->stream_res.tg->inst; + + dmub_srv = dc->ctx->dmub_srv; + + /* forward to dmub only. no dmcu support*/ + if (dmub_srv) + dc_stream_forward_dmub_multiple_crc_window(dmub_srv, window, &mux_mapping, stop); + else + return false; + + return true; +} #endif /* CONFIG_DRM_AMD_SECURE_DISPLAY */ /** @@ -625,15 +687,17 @@ dc_stream_forward_crc_window(struct dc_stream_state *stream, * @enable: Enable CRC if true, disable otherwise. * @continuous: Capture CRC on every frame if true. Otherwise, only capture * once. + * @idx: Capture CRC on which CRC engine instance + * @reset: Reset CRC engine before the configuration * - * By default, only CRC0 is configured, and the entire frame is used to - * calculate the CRC. + * By default, the entire frame is used to calculate the CRC. * * Return: %false if the stream is not found or CRC capture is not supported; * %true if the stream has been configured. */ bool dc_stream_configure_crc(struct dc *dc, struct dc_stream_state *stream, - struct crc_params *crc_window, bool enable, bool continuous) + struct crc_params *crc_window, bool enable, bool continuous, + uint8_t idx, bool reset) { struct pipe_ctx *pipe; struct crc_params param; @@ -677,6 +741,9 @@ bool dc_stream_configure_crc(struct dc *dc, struct dc_stream_state *stream, param.continuous_mode = continuous; param.enable = enable; + param.crc_eng_inst = idx; + param.reset = reset; + tg = pipe->stream_res.tg; /* Only call if supported */ @@ -691,6 +758,7 @@ bool dc_stream_configure_crc(struct dc *dc, struct dc_stream_state *stream, * * @dc: DC object. * @stream: The DC stream state of the stream to get CRCs from. + * @idx: index of crc engine to get CRC from * @r_cr: CRC value for the red component. * @g_y: CRC value for the green component. * @b_cb: CRC value for the blue component. @@ -700,7 +768,7 @@ bool dc_stream_configure_crc(struct dc *dc, struct dc_stream_state *stream, * Return: * %false if stream is not found, or if CRCs are not enabled. 
*/ -bool dc_stream_get_crc(struct dc *dc, struct dc_stream_state *stream, +bool dc_stream_get_crc(struct dc *dc, struct dc_stream_state *stream, uint8_t idx, uint32_t *r_cr, uint32_t *g_y, uint32_t *b_cb) { int i; @@ -721,7 +789,7 @@ bool dc_stream_get_crc(struct dc *dc, struct dc_stream_state *stream, tg = pipe->stream_res.tg; if (tg->funcs->get_crc) - return tg->funcs->get_crc(tg, r_cr, g_y, b_cb); + return tg->funcs->get_crc(tg, idx, r_cr, g_y, b_cb); DC_LOG_WARNING("CRC capture not supported."); return false; } @@ -1173,6 +1241,8 @@ static void dc_update_visual_confirm_color(struct dc *dc, struct dc_state *conte get_mclk_switch_visual_confirm_color(pipe_ctx, &(pipe_ctx->visual_confirm_color)); else if (dc->debug.visual_confirm == VISUAL_CONFIRM_FAMS2) get_fams2_visual_confirm_color(dc, context, pipe_ctx, &(pipe_ctx->visual_confirm_color)); + else if (dc->debug.visual_confirm == VISUAL_CONFIRM_VABC) + get_vabc_visual_confirm_color(pipe_ctx, &(pipe_ctx->visual_confirm_color)); } } } @@ -2153,6 +2223,11 @@ enum dc_status dc_commit_streams(struct dc *dc, struct dc_commit_streams_params struct dc_stream_state *stream = params->streams[i]; struct dc_stream_status *status = dc_stream_get_status(stream); + /* revalidate streams */ + res = dc_validate_stream(dc, stream); + if (res != DC_OK) + return res; + dc_stream_log(dc, stream); set[i].stream = stream; @@ -2487,7 +2562,7 @@ static enum surface_update_type get_plane_info_update_type(const struct dc *dc, if (memcmp(&u->plane_info->tiling_info, &u->surface->tiling_info, - sizeof(union dc_tiling_info)) != 0) { + sizeof(struct dc_tiling_info)) != 0) { update_flags->bits.swizzle_change = 1; elevate_update_type(&update_type, UPDATE_TYPE_MED); @@ -2982,6 +3057,10 @@ static void copy_surface_update_to_plane( if (srf_update->cursor_csc_color_matrix) surface->cursor_csc_color_matrix = *srf_update->cursor_csc_color_matrix; + + if (srf_update->bias_and_scale.bias_and_scale_valid) + surface->bias_and_scale = + 
srf_update->bias_and_scale; } static void copy_stream_update_to_stream(struct dc *dc, @@ -4510,7 +4589,7 @@ static bool commit_minimal_transition_based_on_current_context(struct dc *dc, struct pipe_split_policy_backup policy; struct dc_state *intermediate_context; struct dc_state *old_current_state = dc->current_state; - struct dc_surface_update srf_updates[MAX_SURFACE_NUM] = {0}; + struct dc_surface_update srf_updates[MAX_SURFACES] = {0}; int surface_count; /* @@ -5307,11 +5386,9 @@ void dc_set_power_state(struct dc *dc, enum dc_acpi_cm_power_state power_state) dc->vm_pa_config.valid) { dc->hwss.init_sys_ctx(dc->hwseq, dc, &dc->vm_pa_config); } - break; default: ASSERT(dc->current_state->stream_count == 0); - dc_dmub_srv_notify_fw_dc_power_state(dc->ctx->dmub_srv, power_state); dc_state_destruct(dc->current_state); @@ -5435,6 +5512,11 @@ bool dc_set_ips_disable(struct dc *dc, unsigned int disable_ips) void dc_allow_idle_optimizations_internal(struct dc *dc, bool allow, char const *caller_name) { + int idle_fclk_khz = 0, idle_dramclk_khz = 0, i = 0; + enum mall_stream_type subvp_pipe_type[MAX_PIPES] = {0}; + struct pipe_ctx *pipe = NULL; + struct dc_state *context = dc->current_state; + if (dc->debug.disable_idle_power_optimizations) { DC_LOG_DEBUG("%s: disabled\n", __func__); return; @@ -5459,6 +5541,23 @@ void dc_allow_idle_optimizations_internal(struct dc *dc, bool allow, char const dc->idle_optimizations_allowed = allow; DC_LOG_DEBUG("%s: %s\n", __func__, allow ? 
"enabled" : "disabled"); } + + // log idle clocks and sub vp pipe types at idle optimization time + if (dc->clk_mgr != NULL && dc->clk_mgr->funcs->get_hard_min_fclk) + idle_fclk_khz = dc->clk_mgr->funcs->get_hard_min_fclk(dc->clk_mgr); + + if (dc->clk_mgr != NULL && dc->clk_mgr->funcs->get_hard_min_memclk) + idle_dramclk_khz = dc->clk_mgr->funcs->get_hard_min_memclk(dc->clk_mgr); + + for (i = 0; i < dc->res_pool->pipe_count; i++) { + pipe = &context->res_ctx.pipe_ctx[i]; + subvp_pipe_type[i] = dc_state_get_pipe_subvp_type(context, pipe); + } + + DC_LOG_DC("%s: allow_idle=%d\n HardMinUClk_Khz=%d HardMinDramclk_Khz=%d\n Pipe_0=%d Pipe_1=%d Pipe_2=%d Pipe_3=%d Pipe_4=%d Pipe_5=%d (caller=%s)\n", + __func__, allow, idle_fclk_khz, idle_dramclk_khz, subvp_pipe_type[0], subvp_pipe_type[1], subvp_pipe_type[2], + subvp_pipe_type[3], subvp_pipe_type[4], subvp_pipe_type[5], caller_name); + } void dc_exit_ips_for_hw_access_internal(struct dc *dc, const char *caller_name) @@ -6056,7 +6155,7 @@ void dc_query_current_properties(struct dc *dc, struct dc_current_properties *pr bool subvp_sw_cursor_req = false; for (i = 0; i < dc->current_state->stream_count; i++) { - if (check_subvp_sw_cursor_fallback_req(dc, dc->current_state->streams[i])) { + if (check_subvp_sw_cursor_fallback_req(dc, dc->current_state->streams[i]) && !dc->current_state->streams[i]->hw_cursor_req) { subvp_sw_cursor_req = true; break; } @@ -6109,3 +6208,21 @@ struct dc_power_profile dc_get_power_profile_for_dc_state(const struct dc_state profile.power_level = dc->res_pool->funcs->get_power_profile(context); return profile; } + +/* + ********************************************************************************** + * dc_get_det_buffer_size_from_state() - extracts detile buffer size from dc state + * + * Called when DM wants to log detile buffer size from dc_state + * + ********************************************************************************** + */ +unsigned int dc_get_det_buffer_size_from_state(const 
struct dc_state *context) +{ + struct dc *dc = context->clk_mgr->ctx->dc; + + if (dc->res_pool->funcs->get_det_buffer_size) + return dc->res_pool->funcs->get_det_buffer_size(context); + else + return 0; +} diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_hw_sequencer.c b/drivers/gpu/drm/amd/display/dc/core/dc_hw_sequencer.c index 252af83e34a5..6eb9bae3af91 100644 --- a/drivers/gpu/drm/amd/display/dc/core/dc_hw_sequencer.c +++ b/drivers/gpu/drm/amd/display/dc/core/dc_hw_sequencer.c @@ -425,6 +425,44 @@ void get_hdr_visual_confirm_color( } } +/* Visual Confirm color definition for VABC */ +void get_vabc_visual_confirm_color( + struct pipe_ctx *pipe_ctx, + struct tg_color *color) +{ + uint32_t color_value = MAX_TG_COLOR_VALUE; + struct dc_link *edp_link = NULL; + + if (pipe_ctx && pipe_ctx->stream && pipe_ctx->stream->link) { + if (pipe_ctx->stream->link->connector_signal == SIGNAL_TYPE_EDP) + edp_link = pipe_ctx->stream->link; + } + + if (edp_link) { + switch (edp_link->backlight_control_type) { + case BACKLIGHT_CONTROL_PWM: + color->color_r_cr = color_value; + color->color_g_y = 0; + color->color_b_cb = 0; + break; + case BACKLIGHT_CONTROL_AMD_AUX: + color->color_r_cr = 0; + color->color_g_y = color_value; + color->color_b_cb = 0; + break; + case BACKLIGHT_CONTROL_VESA_AUX: + color->color_r_cr = 0; + color->color_g_y = 0; + color->color_b_cb = color_value; + break; + } + } else { + color->color_r_cr = 0; + color->color_g_y = 0; + color->color_b_cb = 0; + } +} + void get_subvp_visual_confirm_color( struct pipe_ctx *pipe_ctx, struct tg_color *color) diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link_exports.c b/drivers/gpu/drm/amd/display/dc/core/dc_link_exports.c index 457d60eeb486..c1b79b379447 100644 --- a/drivers/gpu/drm/amd/display/dc/core/dc_link_exports.c +++ b/drivers/gpu/drm/amd/display/dc/core/dc_link_exports.c @@ -125,6 +125,14 @@ uint32_t dc_link_bandwidth_kbps( return link->dc->link_srv->dp_link_bandwidth_kbps(link, link_settings); } +uint32_t 
dc_link_required_hblank_size_bytes( + const struct dc_link *link, + struct dp_audio_bandwidth_params *audio_params) +{ + return link->dc->link_srv->dp_required_hblank_size_bytes(link, + audio_params); +} + void dc_get_cur_link_res_map(const struct dc *dc, uint32_t *map) { dc->link_srv->get_cur_res_map(dc, map); diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_resource.c b/drivers/gpu/drm/amd/display/dc/core/dc_resource.c index 619fad17de55..520a34a42827 100644 --- a/drivers/gpu/drm/amd/display/dc/core/dc_resource.c +++ b/drivers/gpu/drm/amd/display/dc/core/dc_resource.c @@ -2094,7 +2094,8 @@ int resource_get_odm_slice_dst_width(struct pipe_ctx *otg_master, count = resource_get_odm_slice_count(otg_master); h_active = timing->h_addressable + timing->h_border_left + - timing->h_border_right; + timing->h_border_right + + otg_master->hblank_borrow; width = h_active / count; if (otg_master->stream_res.tg) @@ -4026,6 +4027,41 @@ fail: return res; } +/** + * decide_hblank_borrow - Decides the horizontal blanking borrow value for a given pipe context. + * @pipe_ctx: Pointer to the pipe context structure. + * + * This function calculates the horizontal blanking borrow value for a given pipe context based on the + * display stream compression (DSC) configuration. If the horizontal active pixels (hactive) are not + * evenly divisible by the number of DSC slices, it sets the hblank_borrow value to the padding needed to + * reach the total width of the DSC slices. If the horizontal blanking left after the borrow (h_total + * minus hactive minus hblank_borrow) is less than 32, it resets the hblank_borrow + * value to 0.
+ */ +static void decide_hblank_borrow(struct pipe_ctx *pipe_ctx) +{ + uint32_t hactive; + uint32_t ceil_slice_width; + struct dc_stream_state *stream = NULL; + + if (!pipe_ctx) + return; + + stream = pipe_ctx->stream; + + if (stream->timing.flags.DSC) { + hactive = stream->timing.h_addressable + stream->timing.h_border_left + stream->timing.h_border_right; + + /* Assume if determined slices does not divide Hactive evenly, Hborrow is needed for padding*/ + if (hactive % stream->timing.dsc_cfg.num_slices_h != 0) { + ceil_slice_width = (hactive / stream->timing.dsc_cfg.num_slices_h) + 1; + pipe_ctx->hblank_borrow = ceil_slice_width * stream->timing.dsc_cfg.num_slices_h - hactive; + + if (stream->timing.h_total - hactive - pipe_ctx->hblank_borrow < 32) + pipe_ctx->hblank_borrow = 0; + } + } +} + /** * dc_validate_global_state() - Determine if hardware can support a given state * @@ -4064,6 +4100,10 @@ enum dc_status dc_validate_global_state( if (pipe_ctx->stream != stream) continue; + /* Decide whether hblank borrow is needed and save it in pipe_ctx */ + if (dc->debug.enable_hblank_borrow) + decide_hblank_borrow(pipe_ctx); + if (dc->res_pool->funcs->patch_unknown_plane_state && pipe_ctx->plane_state && pipe_ctx->plane_state->tiling_info.gfx9.swizzle == DC_SW_UNKNOWN) { @@ -4438,7 +4478,7 @@ static void set_hfvs_info_packet( static void adaptive_sync_override_dp_info_packets_sdp_line_num( const struct dc_crtc_timing *timing, struct enc_sdp_line_num *sdp_line_num, - struct _vcs_dpi_display_pipe_dest_params_st *pipe_dlg_param) + unsigned int vstartup_start) { uint32_t asic_blank_start = 0; uint32_t asic_blank_end = 0; @@ -4453,8 +4493,8 @@ static void adaptive_sync_override_dp_info_packets_sdp_line_num( asic_blank_end = (asic_blank_start - tg->v_border_bottom - tg->v_addressable - tg->v_border_top); - if (pipe_dlg_param->vstartup_start > asic_blank_end) { - v_update = (tg->v_total - (pipe_dlg_param->vstartup_start - asic_blank_end)); + if (vstartup_start > 
asic_blank_end) { + v_update = (tg->v_total - (vstartup_start - asic_blank_end)); sdp_line_num->adaptive_sync_line_num_valid = true; sdp_line_num->adaptive_sync_line_num = (tg->v_total - v_update - 1); } else { @@ -4467,7 +4507,7 @@ static void set_adaptive_sync_info_packet( struct dc_info_packet *info_packet, const struct dc_stream_state *stream, struct encoder_info_frame *info_frame, - struct _vcs_dpi_display_pipe_dest_params_st *pipe_dlg_param) + unsigned int vstartup_start) { if (!stream->adaptive_sync_infopacket.valid) return; @@ -4475,7 +4515,7 @@ static void set_adaptive_sync_info_packet( adaptive_sync_override_dp_info_packets_sdp_line_num( &stream->timing, &info_frame->sdp_line_num, - pipe_dlg_param); + vstartup_start); *info_packet = stream->adaptive_sync_infopacket; } @@ -4508,6 +4548,7 @@ void resource_build_info_frame(struct pipe_ctx *pipe_ctx) { enum signal_type signal = SIGNAL_TYPE_NONE; struct encoder_info_frame *info = &pipe_ctx->stream_res.encoder_info_frame; + unsigned int vstartup_start = 0; /* default all packets to invalid */ info->avi.valid = false; @@ -4521,6 +4562,9 @@ void resource_build_info_frame(struct pipe_ctx *pipe_ctx) info->adaptive_sync.valid = false; signal = pipe_ctx->stream->signal; + if (pipe_ctx->stream->ctx->dc->res_pool->funcs->get_vstartup_for_pipe) + vstartup_start = pipe_ctx->stream->ctx->dc->res_pool->funcs->get_vstartup_for_pipe(pipe_ctx); + /* HDMi and DP have different info packets*/ if (dc_is_hdmi_signal(signal)) { set_avi_info_frame(&info->avi, pipe_ctx); @@ -4542,7 +4586,7 @@ void resource_build_info_frame(struct pipe_ctx *pipe_ctx) set_adaptive_sync_info_packet(&info->adaptive_sync, pipe_ctx->stream, info, - &pipe_ctx->pipe_dlg_param); + vstartup_start); } patch_gamut_packet_checksum(&info->gamut); diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_state.c b/drivers/gpu/drm/amd/display/dc/core/dc_state.c index e006f816ff2f..1b2cce127981 100644 --- a/drivers/gpu/drm/amd/display/dc/core/dc_state.c +++ 
b/drivers/gpu/drm/amd/display/dc/core/dc_state.c @@ -483,9 +483,9 @@ bool dc_state_add_plane( if (stream_status == NULL) { dm_error("Existing stream not found; failed to attach surface!\n"); goto out; - } else if (stream_status->plane_count == MAX_SURFACE_NUM) { + } else if (stream_status->plane_count == MAX_SURFACES) { dm_error("Surface: can not attach plane_state %p! Maximum is: %d\n", - plane_state, MAX_SURFACE_NUM); + plane_state, MAX_SURFACES); goto out; } else if (!otg_master_pipe) { goto out; @@ -600,7 +600,7 @@ bool dc_state_rem_all_planes_for_stream( { int i, old_plane_count; struct dc_stream_status *stream_status = NULL; - struct dc_plane_state *del_planes[MAX_SURFACE_NUM] = { 0 }; + struct dc_plane_state *del_planes[MAX_SURFACES] = { 0 }; for (i = 0; i < state->stream_count; i++) if (state->streams[i] == stream) { @@ -875,7 +875,7 @@ bool dc_state_rem_all_phantom_planes_for_stream( { int i, old_plane_count; struct dc_stream_status *stream_status = NULL; - struct dc_plane_state *del_planes[MAX_SURFACE_NUM] = { 0 }; + struct dc_plane_state *del_planes[MAX_SURFACES] = { 0 }; for (i = 0; i < state->stream_count; i++) if (state->streams[i] == phantom_stream) { diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_stream.c b/drivers/gpu/drm/amd/display/dc/core/dc_stream.c index 55dc482d9b36..e8134c47fe0d 100644 --- a/drivers/gpu/drm/amd/display/dc/core/dc_stream.c +++ b/drivers/gpu/drm/amd/display/dc/core/dc_stream.c @@ -37,6 +37,8 @@ #define DC_LOGGER dc->ctx->logger #ifndef MIN #define MIN(X, Y) ((X) < (Y) ? (X) : (Y)) +#endif +#ifndef MAX #define MAX(x, y) ((x > y) ? 
x : y) #endif @@ -605,17 +607,6 @@ bool dc_stream_remove_writeback(struct dc *dc, return true; } -bool dc_stream_warmup_writeback(struct dc *dc, - int num_dwb, - struct dc_writeback_info *wb_info) -{ - dc_exit_ips_for_hw_access(dc); - - if (dc->hwss.mmhubbub_warmup) - return dc->hwss.mmhubbub_warmup(dc, num_dwb, wb_info); - else - return false; -} uint32_t dc_stream_get_vblank_counter(const struct dc_stream_state *stream) { uint8_t i; diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_surface.c b/drivers/gpu/drm/amd/display/dc/core/dc_surface.c index ccbb15f1638c..f3471d45b312 100644 --- a/drivers/gpu/drm/amd/display/dc/core/dc_surface.c +++ b/drivers/gpu/drm/amd/display/dc/core/dc_surface.c @@ -83,13 +83,6 @@ uint8_t dc_plane_get_pipe_mask(struct dc_state *dc_state, const struct dc_plane /******************************************************************************* * Public functions ******************************************************************************/ -void enable_surface_flip_reporting(struct dc_plane_state *plane_state, - uint32_t controller_id) -{ - plane_state->irq_source = controller_id + DC_IRQ_SOURCE_PFLIP1 - 1; - /*register_flip_interrupt(surface);*/ -} - struct dc_plane_state *dc_create_plane_state(const struct dc *dc) { struct dc_plane_state *plane_state = kvzalloc(sizeof(*plane_state), @@ -277,4 +270,50 @@ void dc_3dlut_func_retain(struct dc_3dlut *lut) kref_get(&lut->refcount); } +void dc_plane_force_update_for_panic(struct dc_plane_state *plane_state, + bool clear_tiling) +{ + struct dc *dc; + int i; + if (!plane_state) + return; + + dc = plane_state->ctx->dc; + + if (!dc || !dc->current_state) + return; + + for (i = 0; i < dc->res_pool->pipe_count; i++) { + struct pipe_ctx *pipe_ctx = &dc->current_state->res_ctx.pipe_ctx[i]; + + if (!pipe_ctx) + continue; + + if (dc->ctx->dce_version >= DCE_VERSION_MAX) { + struct hubp *hubp = pipe_ctx->plane_res.hubp; + if (!hubp) + continue; + /* if framebuffer is tiled, disable tiling */ + if 
(clear_tiling && hubp->funcs->hubp_clear_tiling) + hubp->funcs->hubp_clear_tiling(hubp); + + /* force page flip to see the new content of the framebuffer */ + hubp->funcs->hubp_program_surface_flip_and_addr(hubp, + &plane_state->address, + true); + } else { + struct mem_input *mi = pipe_ctx->plane_res.mi; + if (!mi) + continue; + /* if framebuffer is tiled, disable tiling */ + if (clear_tiling && mi->funcs->mem_input_clear_tiling) + mi->funcs->mem_input_clear_tiling(mi); + + /* force page flip to see the new content of the framebuffer */ + mi->funcs->mem_input_program_surface_flip_and_addr(mi, + &plane_state->address, + true); + } + } +} diff --git a/drivers/gpu/drm/amd/display/dc/dc.h b/drivers/gpu/drm/amd/display/dc/dc.h index 104051935884..053481ab69ef 100644 --- a/drivers/gpu/drm/amd/display/dc/dc.h +++ b/drivers/gpu/drm/amd/display/dc/dc.h @@ -55,9 +55,9 @@ struct aux_payload; struct set_config_cmd_payload; struct dmub_notification; -#define DC_VER "3.2.310" +#define DC_VER "3.2.316" -#define MAX_SURFACES 3 +#define MAX_SURFACES 4 #define MAX_PLANES 6 #define MAX_STREAMS 6 #define MIN_VIEWPORT_SIZE 12 @@ -290,6 +290,7 @@ struct dc_caps { uint16_t subvp_vertical_int_margin_us; bool seamless_odm; uint32_t max_v_total; + bool vtotal_limited_by_fp2; uint32_t max_disp_clock_khz_at_vmin; uint8_t subvp_drr_vblank_start_margin_us; bool cursor_not_scaled; @@ -462,6 +463,7 @@ struct dc_config { bool enable_auto_dpm_test_logs; unsigned int disable_ips; unsigned int disable_ips_in_vpb; + bool disable_ips_in_dpms_off; bool usb4_bw_alloc_support; bool allow_0_dtb_clk; bool use_assr_psp_message; @@ -470,6 +472,7 @@ struct dc_config { bool disable_hbr_audio_dp2; bool consolidated_dpia_dp_lt; bool set_pipe_unlock_order; + bool enable_dpia_pre_training; }; enum visual_confirm { @@ -486,6 +489,7 @@ enum visual_confirm { VISUAL_CONFIRM_MCLK_SWITCH = 16, VISUAL_CONFIRM_FAMS2 = 19, VISUAL_CONFIRM_HW_CURSOR = 20, + VISUAL_CONFIRM_VABC = 21, }; enum dc_psr_power_opts { @@ -627,6 
+631,8 @@ struct dc_clocks { int bw_dispclk_khz; int idle_dramclk_khz; int idle_fclk_khz; + int subvp_prefetch_dramclk_khz; + int subvp_prefetch_fclk_khz; }; struct dc_bw_validation_profile { @@ -771,7 +777,8 @@ union dpia_debug_options { uint32_t enable_force_tbt3_work_around:1; /* bit 4 */ uint32_t disable_usb4_pm_support:1; /* bit 5 */ uint32_t enable_consolidated_dpia_dp_lt:1; /* bit 6 */ - uint32_t reserved:25; + uint32_t enable_dpia_pre_training:1; /* bit 7 */ + uint32_t reserved:24; } bits; uint32_t raw; }; @@ -1054,8 +1061,8 @@ struct dc_debug_options { bool dml21_force_pstate_method; uint32_t dml21_force_pstate_method_values[MAX_PIPES]; uint32_t dml21_disable_pstate_method_mask; + union fw_assisted_mclk_switch_version fams_version; union dmub_fams2_global_feature_config fams2_config; - bool enable_legacy_clock_update; unsigned int force_cositing; unsigned int disable_spl; unsigned int force_easf; @@ -1068,6 +1075,8 @@ struct dc_debug_options { unsigned int scale_to_sharpness_policy; bool skip_full_updated_if_possible; unsigned int enable_oled_edp_power_up_opt; + bool enable_hblank_borrow; + bool force_subvp_df_throttle; }; @@ -1298,7 +1307,7 @@ struct dc_plane_state { struct rect clip_rect; struct plane_size plane_size; - union dc_tiling_info tiling_info; + struct dc_tiling_info tiling_info; struct dc_plane_dcc_param dcc; @@ -1369,7 +1378,7 @@ struct dc_plane_state { struct dc_plane_info { struct plane_size plane_size; - union dc_tiling_info tiling_info; + struct dc_tiling_info tiling_info; struct dc_plane_dcc_param dcc; enum surface_pixel_format format; enum dc_rotation_angle rotation; @@ -1396,7 +1405,7 @@ struct dc_scratch_space { * store current value in plane states so we can still recover * a valid current state during dc update. 
*/ - struct dc_plane_state plane_states[MAX_SURFACE_NUM]; + struct dc_plane_state plane_states[MAX_SURFACES]; struct dc_stream_state stream_state; }; @@ -1524,6 +1533,7 @@ struct dc_surface_update { const struct dc_cm2_parameters *cm2_params; const struct dc_csc_transform *cursor_csc_color_matrix; unsigned int sdr_white_level_nits; + struct dc_bias_and_scale bias_and_scale; }; /* @@ -2017,6 +2027,24 @@ uint32_t dc_link_bandwidth_kbps( const struct dc_link *link, const struct dc_link_settings *link_setting); +struct dp_audio_bandwidth_params { + const struct dc_crtc_timing *crtc_timing; + enum dp_link_encoding link_encoding; + uint32_t channel_count; + uint32_t sample_rate_hz; +}; + +/* The function calculates the minimum size of hblank (in bytes) needed to + * support the specified channel count and sample rate combination, given the + * link encoding and timing to be used. This calculation is not supported + * for 8b/10b SST. + * + * return - min hblank size in bytes, 0 if 8b/10b SST. + */ +uint32_t dc_link_required_hblank_size_bytes( + const struct dc_link *link, + struct dp_audio_bandwidth_params *audio_params); + /* The function takes a snapshot of current link resource allocation state * @dc: pointer to dc of the dm calling this * @map: a dc link resource snapshot defined internally to dc. 
@@ -2376,6 +2404,13 @@ struct dc_sink_dsc_caps { struct dsc_dec_dpcd_caps dsc_dec_caps; }; +struct dc_sink_hblank_expansion_caps { + // 'true' if these are virtual DPCD's HBlank expansion caps (immediately upstream of sink in MST topology), + // 'false' if they are sink's HBlank expansion caps + bool is_virtual_dpcd_hblank_expansion; + struct hblank_expansion_dpcd_caps dpcd_caps; +}; + struct dc_sink_fec_caps { bool is_rx_fec_supported; bool is_topology_fec_supported; @@ -2402,6 +2437,7 @@ struct dc_sink { struct scdc_caps scdc_caps; struct dc_sink_dsc_caps dsc_caps; struct dc_sink_fec_caps fec_caps; + struct dc_sink_hblank_expansion_caps hblank_expansion_caps; bool is_vsc_sdp_colorimetry_supported; @@ -2550,6 +2586,8 @@ struct dc_power_profile { struct dc_power_profile dc_get_power_profile_for_dc_state(const struct dc_state *context); +unsigned int dc_get_det_buffer_size_from_state(const struct dc_state *context); + /* DSC Interfaces */ #include "dc_dsc.h" diff --git a/drivers/gpu/drm/amd/display/dc/dc_dmub_srv.c b/drivers/gpu/drm/amd/display/dc/dc_dmub_srv.c index f90fc154549a..44ff9abe2880 100644 --- a/drivers/gpu/drm/amd/display/dc/dc_dmub_srv.c +++ b/drivers/gpu/drm/amd/display/dc/dc_dmub_srv.c @@ -1245,7 +1245,7 @@ static int count_active_streams(const struct dc *dc) for (i = 0; i < dc->current_state->stream_count; ++i) { struct dc_stream_state *stream = dc->current_state->streams[i]; - if (stream && !stream->dpms_off) + if (stream && (!stream->dpms_off || dc->config.disable_ips_in_dpms_off)) count += 1; } @@ -1694,10 +1694,10 @@ void dc_dmub_srv_fams2_update_config(struct dc *dc, { uint8_t num_cmds = 1; uint32_t i; - union dmub_rb_cmd cmd[MAX_STREAMS + 1]; + union dmub_rb_cmd cmd[2 * MAX_STREAMS + 1]; struct dmub_rb_cmd_fams2 *global_cmd = &cmd[0].fams2_config; - memset(cmd, 0, sizeof(union dmub_rb_cmd) * (MAX_STREAMS + 1)); + memset(cmd, 0, sizeof(union dmub_rb_cmd) * (2 * MAX_STREAMS + 1)); /* fill in generic command header */ global_cmd->header.type = 
DMUB_CMD__FW_ASSISTED_MCLK_SWITCH; global_cmd->header.sub_type = DMUB_CMD__FAMS2_CONFIG; @@ -1714,17 +1714,26 @@ void dc_dmub_srv_fams2_update_config(struct dc *dc, /* construct per-stream configs */ for (i = 0; i < context->bw_ctx.bw.dcn.fams2_global_config.num_streams; i++) { - struct dmub_rb_cmd_fams2 *stream_cmd = &cmd[i+1].fams2_config; + struct dmub_rb_cmd_fams2 *stream_base_cmd = &cmd[i+1].fams2_config; + struct dmub_rb_cmd_fams2 *stream_sub_state_cmd = &cmd[i+1+context->bw_ctx.bw.dcn.fams2_global_config.num_streams].fams2_config; /* configure command header */ - stream_cmd->header.type = DMUB_CMD__FW_ASSISTED_MCLK_SWITCH; - stream_cmd->header.sub_type = DMUB_CMD__FAMS2_CONFIG; - stream_cmd->header.payload_bytes = sizeof(struct dmub_rb_cmd_fams2) - sizeof(struct dmub_cmd_header); - stream_cmd->header.multi_cmd_pending = 1; - /* copy stream static state */ - memcpy(&stream_cmd->config.stream, - &context->bw_ctx.bw.dcn.fams2_stream_params[i], - sizeof(struct dmub_fams2_stream_static_state)); + stream_base_cmd->header.type = DMUB_CMD__FW_ASSISTED_MCLK_SWITCH; + stream_base_cmd->header.sub_type = DMUB_CMD__FAMS2_CONFIG; + stream_base_cmd->header.payload_bytes = sizeof(struct dmub_rb_cmd_fams2) - sizeof(struct dmub_cmd_header); + stream_base_cmd->header.multi_cmd_pending = 1; + stream_sub_state_cmd->header.type = DMUB_CMD__FW_ASSISTED_MCLK_SWITCH; + stream_sub_state_cmd->header.sub_type = DMUB_CMD__FAMS2_CONFIG; + stream_sub_state_cmd->header.payload_bytes = sizeof(struct dmub_rb_cmd_fams2) - sizeof(struct dmub_cmd_header); + stream_sub_state_cmd->header.multi_cmd_pending = 1; + /* copy stream static base state */ + memcpy(&stream_base_cmd->config, + &context->bw_ctx.bw.dcn.fams2_stream_base_params[i], + sizeof(union dmub_cmd_fams2_config)); + /* copy stream static sub state */ + memcpy(&stream_sub_state_cmd->config, + &context->bw_ctx.bw.dcn.fams2_stream_sub_params[i], + sizeof(union dmub_cmd_fams2_config)); } } @@ -1735,8 +1744,8 @@ void 
dc_dmub_srv_fams2_update_config(struct dc *dc, if (enable && context->bw_ctx.bw.dcn.fams2_global_config.features.bits.enable) { /* set multi pending for global, and unset for last stream cmd */ global_cmd->header.multi_cmd_pending = 1; - cmd[context->bw_ctx.bw.dcn.fams2_global_config.num_streams].fams2_config.header.multi_cmd_pending = 0; - num_cmds += context->bw_ctx.bw.dcn.fams2_global_config.num_streams; + cmd[2 * context->bw_ctx.bw.dcn.fams2_global_config.num_streams].fams2_config.header.multi_cmd_pending = 0; + num_cmds += 2 * context->bw_ctx.bw.dcn.fams2_global_config.num_streams; } dm_execute_dmub_cmd_list(dc->ctx, num_cmds, cmd, DM_DMUB_WAIT_TYPE_WAIT); diff --git a/drivers/gpu/drm/amd/display/dc/dc_dp_types.h b/drivers/gpu/drm/amd/display/dc/dc_dp_types.h index 8dd6eb044829..94ce8fe74481 100644 --- a/drivers/gpu/drm/amd/display/dc/dc_dp_types.h +++ b/drivers/gpu/drm/amd/display/dc/dc_dp_types.h @@ -969,6 +969,21 @@ union dp_sink_video_fallback_formats { uint8_t raw; }; +union dp_receive_port0_cap { + struct { + uint8_t RESERVED :1; + uint8_t LOCAL_EDID_PRESENT :1; + uint8_t ASSOCIATED_TO_PRECEDING_PORT:1; + uint8_t HBLANK_EXPANSION_CAPABLE :1; + uint8_t BUFFER_SIZE_UNIT :1; + uint8_t BUFFER_SIZE_PER_PORT :1; + uint8_t HBLANK_REDUCTION_CAPABLE :1; + uint8_t RESERVED2:1; + uint8_t BUFFER_SIZE:8; + } bits; + uint8_t raw[2]; +}; + union dpcd_max_uncompressed_pixel_rate_cap { struct { uint16_t max_uncompressed_pixel_rate_cap :15; @@ -1193,6 +1208,7 @@ struct dpcd_caps { struct replay_info pr_info; uint16_t edp_oled_emission_rate; + union dp_receive_port0_cap receive_port0_cap; }; union dpcd_sink_ext_caps { diff --git a/drivers/gpu/drm/amd/display/dc/dc_dsc.h b/drivers/gpu/drm/amd/display/dc/dc_dsc.h index 9014c2409817..9d18f1c08079 100644 --- a/drivers/gpu/drm/amd/display/dc/dc_dsc.h +++ b/drivers/gpu/drm/amd/display/dc/dc_dsc.h @@ -94,6 +94,11 @@ uint32_t dc_dsc_stream_bandwidth_overhead_in_kbps( const int num_slices_h, const bool is_dp); +void 
dc_dsc_dump_decoder_caps(const struct display_stream_compressor *dsc, + const struct dsc_dec_dpcd_caps *dsc_sink_caps); +void dc_dsc_dump_encoder_caps(const struct display_stream_compressor *dsc, + const struct dc_crtc_timing *timing); + /* TODO - Hardware/specs limitation should be owned by dc dsc and returned to DM, * and DM can choose to OVERRIDE the limitation on CASE BY CASE basis. * Hardware/specs limitation should not be writable by DM. diff --git a/drivers/gpu/drm/amd/display/dc/dc_hw_types.h b/drivers/gpu/drm/amd/display/dc/dc_hw_types.h index c10567ec1c81..5ac55601a6da 100644 --- a/drivers/gpu/drm/amd/display/dc/dc_hw_types.h +++ b/drivers/gpu/drm/amd/display/dc/dc_hw_types.h @@ -341,89 +341,101 @@ enum swizzle_mode_addr3_values { DC_ADDR3_SW_UNKNOWN = DC_ADDR3_SW_MAX }; -union dc_tiling_info { +enum dc_gfxversion { + DcGfxVersion7 = 0, + DcGfxVersion8, + DcGfxVersion9, + DcGfxVersion10, + DcGfxVersion11, + DcGfxAddr3, + DcGfxVersionUnknown +}; - struct { - /* Specifies the number of memory banks for tiling - * purposes. - * Only applies to 2D and 3D tiling modes. - * POSSIBLE VALUES: 2,4,8,16 - */ - unsigned int num_banks; - /* Specifies the number of tiles in the x direction - * to be incorporated into the same bank. - * Only applies to 2D and 3D tiling modes. - * POSSIBLE VALUES: 1,2,4,8 - */ - unsigned int bank_width; - unsigned int bank_width_c; - /* Specifies the number of tiles in the y direction to - * be incorporated into the same bank. - * Only applies to 2D and 3D tiling modes. - * POSSIBLE VALUES: 1,2,4,8 - */ - unsigned int bank_height; - unsigned int bank_height_c; - /* Specifies the macro tile aspect ratio. Only applies - * to 2D and 3D tiling modes. - */ - unsigned int tile_aspect; - unsigned int tile_aspect_c; - /* Specifies the number of bytes that will be stored - * contiguously for each tile. - * If the tile data requires more storage than this - * amount, it is split into multiple slices. 
- * This field must not be larger than - * GB_ADDR_CONFIG.DRAM_ROW_SIZE. - * Only applies to 2D and 3D tiling modes. - * For color render targets, TILE_SPLIT >= 256B. - */ - enum tile_split_values tile_split; - enum tile_split_values tile_split_c; - /* Specifies the addressing within a tile. - * 0x0 - DISPLAY_MICRO_TILING - * 0x1 - THIN_MICRO_TILING - * 0x2 - DEPTH_MICRO_TILING - * 0x3 - ROTATED_MICRO_TILING - */ - enum tile_mode_values tile_mode; - enum tile_mode_values tile_mode_c; - /* Specifies the number of pipes and how they are - * interleaved in the surface. - * Refer to memory addressing document for complete - * details and constraints. - */ - unsigned int pipe_config; - /* Specifies the tiling mode of the surface. - * THIN tiles use an 8x8x1 tile size. - * THICK tiles use an 8x8x4 tile size. - * 2D tiling modes rotate banks for successive Z slices - * 3D tiling modes rotate pipes and banks for Z slices - * Refer to memory addressing document for complete - * details and constraints. - */ - enum array_mode_values array_mode; - } gfx8; + struct dc_tiling_info { + unsigned int gfxversion; // Specifies which part of the union to use. Must use DalGfxVersion enum + union { + struct { + /* Specifies the number of memory banks for tiling + * purposes. + * Only applies to 2D and 3D tiling modes. + * POSSIBLE VALUES: 2,4,8,16 + */ + unsigned int num_banks; + /* Specifies the number of tiles in the x direction + * to be incorporated into the same bank. + * Only applies to 2D and 3D tiling modes. + * POSSIBLE VALUES: 1,2,4,8 + */ + unsigned int bank_width; + unsigned int bank_width_c; + /* Specifies the number of tiles in the y direction to + * be incorporated into the same bank. + * Only applies to 2D and 3D tiling modes. + * POSSIBLE VALUES: 1,2,4,8 + */ + unsigned int bank_height; + unsigned int bank_height_c; + /* Specifies the macro tile aspect ratio. Only applies + * to 2D and 3D tiling modes. 
+ */ + unsigned int tile_aspect; + unsigned int tile_aspect_c; + /* Specifies the number of bytes that will be stored + * contiguously for each tile. + * If the tile data requires more storage than this + * amount, it is split into multiple slices. + * This field must not be larger than + * GB_ADDR_CONFIG.DRAM_ROW_SIZE. + * Only applies to 2D and 3D tiling modes. + * For color render targets, TILE_SPLIT >= 256B. + */ + enum tile_split_values tile_split; + enum tile_split_values tile_split_c; + /* Specifies the addressing within a tile. + * 0x0 - DISPLAY_MICRO_TILING + * 0x1 - THIN_MICRO_TILING + * 0x2 - DEPTH_MICRO_TILING + * 0x3 - ROTATED_MICRO_TILING + */ + enum tile_mode_values tile_mode; + enum tile_mode_values tile_mode_c; + /* Specifies the number of pipes and how they are + * interleaved in the surface. + * Refer to memory addressing document for complete + * details and constraints. + */ + unsigned int pipe_config; + /* Specifies the tiling mode of the surface. + * THIN tiles use an 8x8x1 tile size. + * THICK tiles use an 8x8x4 tile size. + * 2D tiling modes rotate banks for successive Z slices + * 3D tiling modes rotate pipes and banks for Z slices + * Refer to memory addressing document for complete + * details and constraints. 
+ */ + enum array_mode_values array_mode; + } gfx8; - struct { - enum swizzle_mode_values swizzle; - unsigned int num_pipes; - unsigned int max_compressed_frags; - unsigned int pipe_interleave; + struct { + enum swizzle_mode_values swizzle; + unsigned int num_pipes; + unsigned int max_compressed_frags; + unsigned int pipe_interleave; - unsigned int num_banks; - unsigned int num_shader_engines; - unsigned int num_rb_per_se; - bool shaderEnable; + unsigned int num_banks; + unsigned int num_shader_engines; + unsigned int num_rb_per_se; + bool shaderEnable; - bool meta_linear; - bool rb_aligned; - bool pipe_aligned; - unsigned int num_pkrs; - } gfx9;/*gfx9, gfx10 and above*/ - struct { - enum swizzle_mode_addr3_values swizzle; - } gfx_addr3;/*gfx with addr3 and above*/ + bool meta_linear; + bool rb_aligned; + bool pipe_aligned; + unsigned int num_pkrs; + } gfx9;/*gfx9, gfx10 and above*/ + struct { + enum swizzle_mode_addr3_values swizzle; + } gfx_addr3;/*gfx with addr3 and above*/ + }; }; /* Rotation angle */ @@ -975,6 +987,9 @@ struct dc_crtc_timing { struct dc_crtc_timing_flags flags; uint32_t dsc_fixed_bits_per_pixel_x16; /* DSC target bitrate in 1/16 of bpp (e.g. 128 -> 8bpp) */ struct dc_dsc_config dsc_cfg; + + /* The number of pixels that HBlank has been expanded by from the original EDID timing. 
*/ + uint32_t expanded_hblank; }; enum trigger_delay { diff --git a/drivers/gpu/drm/amd/display/dc/dc_plane.h b/drivers/gpu/drm/amd/display/dc/dc_plane.h index bd37ec82b42d..fabcefeda288 100644 --- a/drivers/gpu/drm/amd/display/dc/dc_plane.h +++ b/drivers/gpu/drm/amd/display/dc/dc_plane.h @@ -34,4 +34,7 @@ const struct dc_plane_status *dc_plane_get_status( void dc_plane_state_retain(struct dc_plane_state *plane_state); void dc_plane_state_release(struct dc_plane_state *plane_state); +void dc_plane_force_update_for_panic(struct dc_plane_state *plane_state, + bool clear_tiling); + #endif /* _DC_PLANE_H_ */ diff --git a/drivers/gpu/drm/amd/display/dc/dc_spl_translate.c b/drivers/gpu/drm/amd/display/dc/dc_spl_translate.c index c8d8e335fa37..3518eb1b8cd1 100644 --- a/drivers/gpu/drm/amd/display/dc/dc_spl_translate.c +++ b/drivers/gpu/drm/amd/display/dc/dc_spl_translate.c @@ -64,6 +64,13 @@ static void populate_inits_from_splinits(struct scl_inits *inits, inits->h_c = dc_fixpt_from_int_dy(spl_inits->h_filter_init_int_c, spl_inits->h_filter_init_frac_c >> 5, 0, 19); inits->v_c = dc_fixpt_from_int_dy(spl_inits->v_filter_init_int_c, spl_inits->v_filter_init_frac_c >> 5, 0, 19); } +static void populate_splformat_from_format(enum spl_pixel_format *spl_pixel_format, const enum pixel_format pixel_format) +{ + if (pixel_format < PIXEL_FORMAT_INVALID) + *spl_pixel_format = (enum spl_pixel_format)pixel_format; + else + *spl_pixel_format = SPL_PIXEL_FORMAT_INVALID; +} /// @brief Translate SPL input parameters from pipe context /// @param pipe_ctx /// @param spl_in @@ -89,7 +96,7 @@ void translate_SPL_in_params_from_pipe_ctx(struct pipe_ctx *pipe_ctx, struct spl spl_in->callbacks = dcn2_spl_callbacks; } // Make format field from spl_in point to plane_res scl_data format - spl_in->basic_in.format = (enum spl_pixel_format)pipe_ctx->plane_res.scl_data.format; + populate_splformat_from_format(&spl_in->basic_in.format, pipe_ctx->plane_res.scl_data.format); // Make view_format from 
basic_out point to view_format from stream spl_in->basic_out.view_format = (enum spl_view_3d)stream->view_format; // Populate spl input basic input clip rect from plane state clip rect @@ -108,19 +115,21 @@ void translate_SPL_in_params_from_pipe_ctx(struct pipe_ctx *pipe_ctx, struct spl spl_in->basic_in.horizontal_mirror = plane_state->horizontal_mirror; // Calculate horizontal splits and split index - spl_in->basic_in.mpc_combine_h = resource_get_mpc_slice_count(pipe_ctx); + spl_in->basic_in.num_h_slices_recout_width_align.use_recout_width_aligned = false; + spl_in->basic_in.num_h_slices_recout_width_align.num_slices_recout_width.mpc_num_h_slices = + resource_get_mpc_slice_count(pipe_ctx); if (stream->view_format == VIEW_3D_FORMAT_SIDE_BY_SIDE) - spl_in->basic_in.mpc_combine_v = 0; + spl_in->basic_in.mpc_h_slice_index = 0; else - spl_in->basic_in.mpc_combine_v = resource_get_mpc_slice_index(pipe_ctx); + spl_in->basic_in.mpc_h_slice_index = resource_get_mpc_slice_index(pipe_ctx); populate_splrect_from_rect(&spl_in->basic_out.odm_slice_rect, &odm_slice_src); spl_in->basic_out.odm_combine_factor = 0; spl_in->odm_slice_index = resource_get_odm_slice_index(pipe_ctx); // Make spl input basic out info output_size width point to stream h active spl_in->basic_out.output_size.width = - stream->timing.h_addressable + stream->timing.h_border_left + stream->timing.h_border_right; + stream->timing.h_addressable + stream->timing.h_border_left + stream->timing.h_border_right + pipe_ctx->hblank_borrow; // Make spl input basic out info output_size height point to v active spl_in->basic_out.output_size.height = stream->timing.v_addressable + stream->timing.v_border_bottom + stream->timing.v_border_top; diff --git a/drivers/gpu/drm/amd/display/dc/dc_stream.h b/drivers/gpu/drm/amd/display/dc/dc_stream.h index 413970588a26..3e303c7808fb 100644 --- a/drivers/gpu/drm/amd/display/dc/dc_stream.h +++ b/drivers/gpu/drm/amd/display/dc/dc_stream.h @@ -56,7 +56,7 @@ struct dc_stream_status { 
int plane_count; int audio_inst; struct timing_sync_info timing_sync_info; - struct dc_plane_state *plane_states[MAX_SURFACE_NUM]; + struct dc_plane_state *plane_states[MAX_SURFACES]; bool is_abm_supported; struct mall_stream_config mall_stream_config; bool fpo_in_use; @@ -447,10 +447,6 @@ enum dc_status dc_stream_add_dsc_to_resource(struct dc *dc, struct dc_state *state, struct dc_stream_state *stream); -bool dc_stream_warmup_writeback(struct dc *dc, - int num_dwb, - struct dc_writeback_info *wb_info); - bool dc_stream_dmdata_status_done(struct dc *dc, struct dc_stream_state *stream); bool dc_stream_set_dynamic_metadata(struct dc *dc, @@ -541,17 +537,26 @@ bool dc_stream_get_crtc_position(struct dc *dc, #if defined(CONFIG_DRM_AMD_SECURE_DISPLAY) bool dc_stream_forward_crc_window(struct dc_stream_state *stream, struct rect *rect, + uint8_t phy_id, bool is_stop); + +bool dc_stream_forward_multiple_crc_window(struct dc_stream_state *stream, + struct crc_window *window, + uint8_t phy_id, + bool stop); #endif bool dc_stream_configure_crc(struct dc *dc, struct dc_stream_state *stream, struct crc_params *crc_window, bool enable, - bool continuous); + bool continuous, + uint8_t idx, + bool reset); bool dc_stream_get_crc(struct dc *dc, struct dc_stream_state *stream, + uint8_t idx, uint32_t *r_cr, uint32_t *g_y, uint32_t *b_cb); diff --git a/drivers/gpu/drm/amd/display/dc/dc_types.h b/drivers/gpu/drm/amd/display/dc/dc_types.h index edf4df1d03b5..0c2aa91f0a11 100644 --- a/drivers/gpu/drm/amd/display/dc/dc_types.h +++ b/drivers/gpu/drm/amd/display/dc/dc_types.h @@ -76,7 +76,6 @@ struct dc_perf_trace { unsigned long last_entry_write; }; -#define MAX_SURFACE_NUM 6 #define NUM_PIXEL_FORMATS 10 enum tiling_mode { @@ -875,6 +874,14 @@ struct dsc_dec_dpcd_caps { bool is_dp; /* Decoded format */ }; +struct hblank_expansion_dpcd_caps { + bool expansion_supported; + bool reduction_supported; + bool buffer_unit_bytes; /* True: buffer size in bytes. 
False: buffer size in pixels*/ + bool buffer_per_port; /* True: buffer size per port. False: buffer size per lane*/ + uint32_t buffer_size; /* Add 1 to value and multiply by 32 */ +}; + struct dc_golden_table { uint16_t dc_golden_table_ver; uint32_t aux_dphy_rx_control0_val; @@ -932,10 +939,17 @@ enum backlight_control_type { }; #if defined(CONFIG_DRM_AMD_SECURE_DISPLAY) +#define MAX_CRC_WINDOW_NUM 2 + struct otg_phy_mux { uint8_t phy_output_num; uint8_t otg_output_num; }; + +struct crc_window { + struct rect rect; + bool enable; +}; #endif enum dc_detect_reason { @@ -1052,10 +1066,13 @@ enum replay_FW_Message_type { union replay_error_status { struct { - unsigned char STATE_TRANSITION_ERROR :1; - unsigned char LINK_CRC_ERROR :1; - unsigned char DESYNC_ERROR :1; - unsigned char RESERVED :5; + unsigned int STATE_TRANSITION_ERROR :1; + unsigned int LINK_CRC_ERROR :1; + unsigned int DESYNC_ERROR :1; + unsigned int RESERVED_3 :1; + unsigned int LOW_RR_INCORRECT_VTOTAL :1; + unsigned int NO_DOUBLED_RR :1; + unsigned int RESERVED_6_7 :2; } bits; unsigned char raw; }; @@ -1102,6 +1119,8 @@ struct replay_config { union replay_error_status replay_error_status; /* Replay Low Hz enable Options */ union replay_low_refresh_rate_enable_options low_rr_enable_options; + /* Replay coasting vtotal is within low refresh rate range. 
*/ + bool low_rr_activated; }; /* Replay feature flags*/ @@ -1126,10 +1145,12 @@ struct replay_settings { uint32_t defer_update_coasting_vtotal_table[PR_COASTING_TYPE_NUM]; /* Maximum link off frame count */ uint32_t link_off_frame_count; - /* Replay pseudo vtotal for abm + ips on full screen video which can improve ips residency */ - uint16_t abm_with_ips_on_full_screen_video_pseudo_vtotal; + /* Replay pseudo vtotal for low refresh rate*/ + uint16_t low_rr_full_screen_video_pseudo_vtotal; /* Replay last pseudo vtotal set to DMUB */ uint16_t last_pseudo_vtotal; + /* Replay desync error */ + uint32_t replay_desync_error_fail_count; }; /* To split out "global" and "per-panel" config settings. diff --git a/drivers/gpu/drm/amd/display/dc/dce/dce_clock_source.c b/drivers/gpu/drm/amd/display/dc/dce/dce_clock_source.c index b700608e4240..077337698e0a 100644 --- a/drivers/gpu/drm/amd/display/dc/dce/dce_clock_source.c +++ b/drivers/gpu/drm/amd/display/dc/dce/dce_clock_source.c @@ -1105,6 +1105,9 @@ static bool dcn401_program_pix_clk( &dto_params); } else { + if (pll_settings->actual_pix_clk_100hz > 6000000UL) + return false; + /* disables DP DTO when provided with TMDS signal type */ clock_source->ctx->dc->res_pool->dccg->funcs->set_dp_dto( clock_source->ctx->dc->res_pool->dccg, diff --git a/drivers/gpu/drm/amd/display/dc/dce/dce_mem_input.c b/drivers/gpu/drm/amd/display/dc/dce/dce_mem_input.c index f5e1d9caee4c..1c2009e38aa1 100644 --- a/drivers/gpu/drm/amd/display/dc/dce/dce_mem_input.c +++ b/drivers/gpu/drm/amd/display/dc/dce/dce_mem_input.c @@ -98,7 +98,7 @@ static enum mi_bits_per_pixel get_mi_bpp( } static enum mi_tiling_format get_mi_tiling( - union dc_tiling_info *tiling_info) + struct dc_tiling_info *tiling_info) { switch (tiling_info->gfx8.array_mode) { case DC_ARRAY_1D_TILED_THIN1: @@ -133,7 +133,7 @@ static bool is_vert_scan(enum dc_rotation_angle rotation) static void dce_mi_program_pte_vm( struct mem_input *mi, enum surface_pixel_format format, - union 
dc_tiling_info *tiling_info, + struct dc_tiling_info *tiling_info, enum dc_rotation_angle rotation) { struct dce_mem_input *dce_mi = TO_DCE_MEM_INPUT(mi); @@ -430,7 +430,7 @@ static void dce120_mi_program_display_marks(struct mem_input *mi, } static void program_tiling( - struct dce_mem_input *dce_mi, const union dc_tiling_info *info) + struct dce_mem_input *dce_mi, const struct dc_tiling_info *info) { if (dce_mi->masks->GRPH_SW_MODE) { /* GFX9 */ REG_UPDATE_6(GRPH_CONTROL, @@ -481,7 +481,6 @@ static void program_tiling( } } - static void program_size_and_rotation( struct dce_mem_input *dce_mi, enum dc_rotation_angle rotation, @@ -627,10 +626,31 @@ static void program_grph_pixel_format( GRPH_PRESCALE_B_SIGN, sign); } +static void dce_mi_clear_tiling( + struct mem_input *mi) +{ + struct dce_mem_input *dce_mi = TO_DCE_MEM_INPUT(mi); + + if (dce_mi->masks->GRPH_SW_MODE) { /* GFX9 */ + REG_UPDATE(GRPH_CONTROL, + GRPH_SW_MODE, DC_SW_LINEAR); + } + + if (dce_mi->masks->GRPH_MICRO_TILE_MODE) { /* GFX8 */ + REG_UPDATE(GRPH_CONTROL, + GRPH_ARRAY_MODE, DC_SW_LINEAR); + } + + if (dce_mi->masks->GRPH_ARRAY_MODE) { /* GFX6 but reuses gfx8 struct */ + REG_UPDATE(GRPH_CONTROL, + GRPH_ARRAY_MODE, DC_SW_LINEAR); + } +} + static void dce_mi_program_surface_config( struct mem_input *mi, enum surface_pixel_format format, - union dc_tiling_info *tiling_info, + struct dc_tiling_info *tiling_info, struct plane_size *plane_size, enum dc_rotation_angle rotation, struct dc_plane_dcc_param *dcc, @@ -650,7 +670,7 @@ static void dce_mi_program_surface_config( static void dce60_mi_program_surface_config( struct mem_input *mi, enum surface_pixel_format format, - union dc_tiling_info *tiling_info, + struct dc_tiling_info *tiling_info, struct plane_size *plane_size, enum dc_rotation_angle rotation, /* not used in DCE6 */ struct dc_plane_dcc_param *dcc, @@ -884,7 +904,8 @@ static const struct mem_input_funcs dce_mi_funcs = { .mem_input_program_pte_vm = dce_mi_program_pte_vm, 
.mem_input_program_surface_config = dce_mi_program_surface_config, - .mem_input_is_flip_pending = dce_mi_is_flip_pending + .mem_input_is_flip_pending = dce_mi_is_flip_pending, + .mem_input_clear_tiling = dce_mi_clear_tiling, }; #if defined(CONFIG_DRM_AMD_DC_SI) @@ -897,7 +918,8 @@ static const struct mem_input_funcs dce60_mi_funcs = { .mem_input_program_pte_vm = dce_mi_program_pte_vm, .mem_input_program_surface_config = dce60_mi_program_surface_config, - .mem_input_is_flip_pending = dce_mi_is_flip_pending + .mem_input_is_flip_pending = dce_mi_is_flip_pending, + .mem_input_clear_tiling = dce_mi_clear_tiling, }; #endif @@ -910,7 +932,8 @@ static const struct mem_input_funcs dce112_mi_funcs = { .mem_input_program_pte_vm = dce_mi_program_pte_vm, .mem_input_program_surface_config = dce_mi_program_surface_config, - .mem_input_is_flip_pending = dce_mi_is_flip_pending + .mem_input_is_flip_pending = dce_mi_is_flip_pending, + .mem_input_clear_tiling = dce_mi_clear_tiling, }; static const struct mem_input_funcs dce120_mi_funcs = { @@ -922,7 +945,8 @@ static const struct mem_input_funcs dce120_mi_funcs = { .mem_input_program_pte_vm = dce_mi_program_pte_vm, .mem_input_program_surface_config = dce_mi_program_surface_config, - .mem_input_is_flip_pending = dce_mi_is_flip_pending + .mem_input_is_flip_pending = dce_mi_is_flip_pending, + .mem_input_clear_tiling = dce_mi_clear_tiling, }; void dce_mem_input_construct( diff --git a/drivers/gpu/drm/amd/display/dc/dce/dmub_hw_lock_mgr.c b/drivers/gpu/drm/amd/display/dc/dce/dmub_hw_lock_mgr.c index bf636b28e3e1..5bb8b78bf250 100644 --- a/drivers/gpu/drm/amd/display/dc/dce/dmub_hw_lock_mgr.c +++ b/drivers/gpu/drm/amd/display/dc/dce/dmub_hw_lock_mgr.c @@ -63,7 +63,8 @@ void dmub_hw_lock_mgr_inbox0_cmd(struct dc_dmub_srv *dmub_srv, bool should_use_dmub_lock(struct dc_link *link) { - if (link->psr_settings.psr_version == DC_PSR_VERSION_SU_1) + if (link->psr_settings.psr_version == DC_PSR_VERSION_SU_1 || + link->psr_settings.psr_version == 
DC_PSR_VERSION_1) return true; if (link->replay_settings.replay_feature_enabled) diff --git a/drivers/gpu/drm/amd/display/dc/dce/dmub_psr.c b/drivers/gpu/drm/amd/display/dc/dce/dmub_psr.c index cae18f8c1c9a..88c75c243bf8 100644 --- a/drivers/gpu/drm/amd/display/dc/dce/dmub_psr.c +++ b/drivers/gpu/drm/amd/display/dc/dce/dmub_psr.c @@ -390,8 +390,7 @@ static bool dmub_psr_copy_settings(struct dmub_psr *dmub, !memcmp(link->dpcd_caps.sink_dev_id_str, DP_SINK_DEVICE_STR_ID_1, sizeof(DP_SINK_DEVICE_STR_ID_1))) link->psr_settings.force_ffu_mode = 1; - else - link->psr_settings.force_ffu_mode = 0; + copy_settings_data->force_ffu_mode = link->psr_settings.force_ffu_mode; if (((link->dpcd_caps.fec_cap.bits.FEC_CAPABLE && diff --git a/drivers/gpu/drm/amd/display/dc/dce110/dce110_mem_input_v.c b/drivers/gpu/drm/amd/display/dc/dce110/dce110_mem_input_v.c index 8a3fbf95c48f..2c43c2422638 100644 --- a/drivers/gpu/drm/amd/display/dc/dce110/dce110_mem_input_v.c +++ b/drivers/gpu/drm/amd/display/dc/dce110/dce110_mem_input_v.c @@ -162,7 +162,7 @@ static void enable(struct dce_mem_input *mem_input110) static void program_tiling( struct dce_mem_input *mem_input110, - const union dc_tiling_info *info, + const struct dc_tiling_info *info, const enum surface_pixel_format pixel_format) { uint32_t value = 0; @@ -523,7 +523,7 @@ static const unsigned int dvmm_Hw_Setting_Linear[4][9] = { /* Helper to get table entry from surface info */ static const unsigned int *get_dvmm_hw_setting( - union dc_tiling_info *tiling_info, + struct dc_tiling_info *tiling_info, enum surface_pixel_format format, bool chroma) { @@ -563,7 +563,7 @@ static const unsigned int *get_dvmm_hw_setting( static void dce_mem_input_v_program_pte_vm( struct mem_input *mem_input, enum surface_pixel_format format, - union dc_tiling_info *tiling_info, + struct dc_tiling_info *tiling_info, enum dc_rotation_angle rotation) { struct dce_mem_input *mem_input110 = TO_DCE_MEM_INPUT(mem_input); @@ -636,7 +636,7 @@ static void 
dce_mem_input_v_program_pte_vm( static void dce_mem_input_v_program_surface_config( struct mem_input *mem_input, enum surface_pixel_format format, - union dc_tiling_info *tiling_info, + struct dc_tiling_info *tiling_info, struct plane_size *plane_size, enum dc_rotation_angle rotation, struct dc_plane_dcc_param *dcc, diff --git a/drivers/gpu/drm/amd/display/dc/dce110/dce110_timing_generator.c b/drivers/gpu/drm/amd/display/dc/dce110/dce110_timing_generator.c index fa422a8cbced..61b0807693fb 100644 --- a/drivers/gpu/drm/amd/display/dc/dce110/dce110_timing_generator.c +++ b/drivers/gpu/drm/amd/display/dc/dce110/dce110_timing_generator.c @@ -2127,70 +2127,131 @@ bool dce110_configure_crc(struct timing_generator *tg, cntl_addr = CRTC_REG(mmCRTC_CRC_CNTL); - /* First, disable CRC before we configure it. */ - dm_write_reg(tg->ctx, cntl_addr, 0); + if (!params->enable || params->reset) + /* First, disable CRC before we configure it. */ + dm_write_reg(tg->ctx, cntl_addr, 0); if (!params->enable) return true; /* Program frame boundaries */ - /* Window A x axis start and end. */ - value = 0; - addr = CRTC_REG(mmCRTC_CRC0_WINDOWA_X_CONTROL); - set_reg_field_value(value, params->windowa_x_start, - CRTC_CRC0_WINDOWA_X_CONTROL, - CRTC_CRC0_WINDOWA_X_START); - set_reg_field_value(value, params->windowa_x_end, - CRTC_CRC0_WINDOWA_X_CONTROL, - CRTC_CRC0_WINDOWA_X_END); - dm_write_reg(tg->ctx, addr, value); + switch (params->crc_eng_inst) { + case 0: + /* Window A x axis start and end. */ + value = 0; + addr = CRTC_REG(mmCRTC_CRC0_WINDOWA_X_CONTROL); + set_reg_field_value(value, params->windowa_x_start, + CRTC_CRC0_WINDOWA_X_CONTROL, + CRTC_CRC0_WINDOWA_X_START); + set_reg_field_value(value, params->windowa_x_end, + CRTC_CRC0_WINDOWA_X_CONTROL, + CRTC_CRC0_WINDOWA_X_END); + dm_write_reg(tg->ctx, addr, value); - /* Window A y axis start and end. 
*/ - value = 0; - addr = CRTC_REG(mmCRTC_CRC0_WINDOWA_Y_CONTROL); - set_reg_field_value(value, params->windowa_y_start, - CRTC_CRC0_WINDOWA_Y_CONTROL, - CRTC_CRC0_WINDOWA_Y_START); - set_reg_field_value(value, params->windowa_y_end, - CRTC_CRC0_WINDOWA_Y_CONTROL, - CRTC_CRC0_WINDOWA_Y_END); - dm_write_reg(tg->ctx, addr, value); + /* Window A y axis start and end. */ + value = 0; + addr = CRTC_REG(mmCRTC_CRC0_WINDOWA_Y_CONTROL); + set_reg_field_value(value, params->windowa_y_start, + CRTC_CRC0_WINDOWA_Y_CONTROL, + CRTC_CRC0_WINDOWA_Y_START); + set_reg_field_value(value, params->windowa_y_end, + CRTC_CRC0_WINDOWA_Y_CONTROL, + CRTC_CRC0_WINDOWA_Y_END); + dm_write_reg(tg->ctx, addr, value); - /* Window B x axis start and end. */ - value = 0; - addr = CRTC_REG(mmCRTC_CRC0_WINDOWB_X_CONTROL); - set_reg_field_value(value, params->windowb_x_start, - CRTC_CRC0_WINDOWB_X_CONTROL, - CRTC_CRC0_WINDOWB_X_START); - set_reg_field_value(value, params->windowb_x_end, - CRTC_CRC0_WINDOWB_X_CONTROL, - CRTC_CRC0_WINDOWB_X_END); - dm_write_reg(tg->ctx, addr, value); + /* Window B x axis start and end. */ + value = 0; + addr = CRTC_REG(mmCRTC_CRC0_WINDOWB_X_CONTROL); + set_reg_field_value(value, params->windowb_x_start, + CRTC_CRC0_WINDOWB_X_CONTROL, + CRTC_CRC0_WINDOWB_X_START); + set_reg_field_value(value, params->windowb_x_end, + CRTC_CRC0_WINDOWB_X_CONTROL, + CRTC_CRC0_WINDOWB_X_END); + dm_write_reg(tg->ctx, addr, value); - /* Window B y axis start and end. */ - value = 0; - addr = CRTC_REG(mmCRTC_CRC0_WINDOWB_Y_CONTROL); - set_reg_field_value(value, params->windowb_y_start, - CRTC_CRC0_WINDOWB_Y_CONTROL, - CRTC_CRC0_WINDOWB_Y_START); - set_reg_field_value(value, params->windowb_y_end, - CRTC_CRC0_WINDOWB_Y_CONTROL, - CRTC_CRC0_WINDOWB_Y_END); - dm_write_reg(tg->ctx, addr, value); + /* Window B y axis start and end. 
*/ + value = 0; + addr = CRTC_REG(mmCRTC_CRC0_WINDOWB_Y_CONTROL); + set_reg_field_value(value, params->windowb_y_start, + CRTC_CRC0_WINDOWB_Y_CONTROL, + CRTC_CRC0_WINDOWB_Y_START); + set_reg_field_value(value, params->windowb_y_end, + CRTC_CRC0_WINDOWB_Y_CONTROL, + CRTC_CRC0_WINDOWB_Y_END); + dm_write_reg(tg->ctx, addr, value); - /* Set crc mode and selection, and enable. Only using CRC0*/ - value = 0; - set_reg_field_value(value, params->continuous_mode ? 1 : 0, - CRTC_CRC_CNTL, CRTC_CRC_CONT_EN); - set_reg_field_value(value, params->selection, - CRTC_CRC_CNTL, CRTC_CRC0_SELECT); - set_reg_field_value(value, 1, CRTC_CRC_CNTL, CRTC_CRC_EN); - dm_write_reg(tg->ctx, cntl_addr, value); + /* Set crc mode and selection, and enable.*/ + value = 0; + set_reg_field_value(value, params->continuous_mode ? 1 : 0, + CRTC_CRC_CNTL, CRTC_CRC_CONT_EN); + set_reg_field_value(value, params->selection, + CRTC_CRC_CNTL, CRTC_CRC0_SELECT); + set_reg_field_value(value, 1, CRTC_CRC_CNTL, CRTC_CRC_EN); + dm_write_reg(tg->ctx, cntl_addr, value); + break; + case 1: + /* Window A x axis start and end. */ + value = 0; + addr = CRTC_REG(mmCRTC_CRC1_WINDOWA_X_CONTROL); + set_reg_field_value(value, params->windowa_x_start, + CRTC_CRC1_WINDOWA_X_CONTROL, + CRTC_CRC1_WINDOWA_X_START); + set_reg_field_value(value, params->windowa_x_end, + CRTC_CRC1_WINDOWA_X_CONTROL, + CRTC_CRC1_WINDOWA_X_END); + dm_write_reg(tg->ctx, addr, value); + + /* Window A y axis start and end. */ + value = 0; + addr = CRTC_REG(mmCRTC_CRC1_WINDOWA_Y_CONTROL); + set_reg_field_value(value, params->windowa_y_start, + CRTC_CRC1_WINDOWA_Y_CONTROL, + CRTC_CRC1_WINDOWA_Y_START); + set_reg_field_value(value, params->windowa_y_end, + CRTC_CRC1_WINDOWA_Y_CONTROL, + CRTC_CRC1_WINDOWA_Y_END); + dm_write_reg(tg->ctx, addr, value); + + /* Window B x axis start and end. 
*/ + value = 0; + addr = CRTC_REG(mmCRTC_CRC1_WINDOWB_X_CONTROL); + set_reg_field_value(value, params->windowb_x_start, + CRTC_CRC1_WINDOWB_X_CONTROL, + CRTC_CRC1_WINDOWB_X_START); + set_reg_field_value(value, params->windowb_x_end, + CRTC_CRC1_WINDOWB_X_CONTROL, + CRTC_CRC1_WINDOWB_X_END); + dm_write_reg(tg->ctx, addr, value); + + /* Window B y axis start and end. */ + value = 0; + addr = CRTC_REG(mmCRTC_CRC1_WINDOWB_Y_CONTROL); + set_reg_field_value(value, params->windowb_y_start, + CRTC_CRC1_WINDOWB_Y_CONTROL, + CRTC_CRC1_WINDOWB_Y_START); + set_reg_field_value(value, params->windowb_y_end, + CRTC_CRC1_WINDOWB_Y_CONTROL, + CRTC_CRC1_WINDOWB_Y_END); + dm_write_reg(tg->ctx, addr, value); + + /* Set crc mode and selection, and enable.*/ + value = 0; + set_reg_field_value(value, params->continuous_mode ? 1 : 0, + CRTC_CRC_CNTL, CRTC_CRC_CONT_EN); + set_reg_field_value(value, params->selection, + CRTC_CRC_CNTL, CRTC_CRC1_SELECT); + set_reg_field_value(value, 1, CRTC_CRC_CNTL, CRTC_CRC_EN); + dm_write_reg(tg->ctx, cntl_addr, value); + break; + default: + return false; + } return true; } -bool dce110_get_crc(struct timing_generator *tg, +bool dce110_get_crc(struct timing_generator *tg, uint8_t idx, uint32_t *r_cr, uint32_t *g_y, uint32_t *b_cb) { uint32_t addr = 0; @@ -2206,14 +2267,30 @@ bool dce110_get_crc(struct timing_generator *tg, if (!field) return false; - addr = CRTC_REG(mmCRTC_CRC0_DATA_RG); - value = dm_read_reg(tg->ctx, addr); - *r_cr = get_reg_field_value(value, CRTC_CRC0_DATA_RG, CRC0_R_CR); - *g_y = get_reg_field_value(value, CRTC_CRC0_DATA_RG, CRC0_G_Y); + switch (idx) { + case 0: + addr = CRTC_REG(mmCRTC_CRC0_DATA_RG); + value = dm_read_reg(tg->ctx, addr); + *r_cr = get_reg_field_value(value, CRTC_CRC0_DATA_RG, CRC0_R_CR); + *g_y = get_reg_field_value(value, CRTC_CRC0_DATA_RG, CRC0_G_Y); - addr = CRTC_REG(mmCRTC_CRC0_DATA_B); - value = dm_read_reg(tg->ctx, addr); - *b_cb = get_reg_field_value(value, CRTC_CRC0_DATA_B, CRC0_B_CB); + addr = 
CRTC_REG(mmCRTC_CRC0_DATA_B); + value = dm_read_reg(tg->ctx, addr); + *b_cb = get_reg_field_value(value, CRTC_CRC0_DATA_B, CRC0_B_CB); + break; + case 1: + addr = CRTC_REG(mmCRTC_CRC1_DATA_RG); + value = dm_read_reg(tg->ctx, addr); + *r_cr = get_reg_field_value(value, CRTC_CRC1_DATA_RG, CRC1_R_CR); + *g_y = get_reg_field_value(value, CRTC_CRC1_DATA_RG, CRC1_G_Y); + + addr = CRTC_REG(mmCRTC_CRC1_DATA_B); + value = dm_read_reg(tg->ctx, addr); + *b_cb = get_reg_field_value(value, CRTC_CRC1_DATA_B, CRC1_B_CB); + break; + default: + return false; + } return true; } diff --git a/drivers/gpu/drm/amd/display/dc/dce110/dce110_timing_generator.h b/drivers/gpu/drm/amd/display/dc/dce110/dce110_timing_generator.h index ee4de740aceb..e4f5cad64f32 100644 --- a/drivers/gpu/drm/amd/display/dc/dce110/dce110_timing_generator.h +++ b/drivers/gpu/drm/amd/display/dc/dce110/dce110_timing_generator.h @@ -286,7 +286,7 @@ bool dce110_arm_vert_intr( bool dce110_configure_crc(struct timing_generator *tg, const struct crc_params *params); -bool dce110_get_crc(struct timing_generator *tg, +bool dce110_get_crc(struct timing_generator *tg, uint8_t idx, uint32_t *r_cr, uint32_t *g_y, uint32_t *b_cb); bool dce110_is_two_pixels_per_container(const struct dc_crtc_timing *timing); diff --git a/drivers/gpu/drm/amd/display/dc/dce120/dce120_timing_generator.c b/drivers/gpu/drm/amd/display/dc/dce120/dce120_timing_generator.c index fcf59348eb62..31c4f44ceaac 100644 --- a/drivers/gpu/drm/amd/display/dc/dce120/dce120_timing_generator.c +++ b/drivers/gpu/drm/amd/display/dc/dce120/dce120_timing_generator.c @@ -1100,45 +1100,79 @@ static bool dce120_configure_crc(struct timing_generator *tg, if (!dce120_is_tg_enabled(tg)) return false; - /* First, disable CRC before we configure it. */ - dm_write_reg_soc15(tg->ctx, mmCRTC0_CRTC_CRC_CNTL, - tg110->offsets.crtc, 0); + if (!params->enable || params->reset) + /* First, disable CRC before we configure it. 
*/ + dm_write_reg_soc15(tg->ctx, mmCRTC0_CRTC_CRC_CNTL, + tg110->offsets.crtc, 0); if (!params->enable) return true; /* Program frame boundaries */ - /* Window A x axis start and end. */ - CRTC_REG_UPDATE_2(CRTC0_CRTC_CRC0_WINDOWA_X_CONTROL, - CRTC_CRC0_WINDOWA_X_START, params->windowa_x_start, - CRTC_CRC0_WINDOWA_X_END, params->windowa_x_end); + switch (params->crc_eng_inst) { + case 0: + /* Window A x axis start and end. */ + CRTC_REG_UPDATE_2(CRTC0_CRTC_CRC0_WINDOWA_X_CONTROL, + CRTC_CRC0_WINDOWA_X_START, params->windowa_x_start, + CRTC_CRC0_WINDOWA_X_END, params->windowa_x_end); - /* Window A y axis start and end. */ - CRTC_REG_UPDATE_2(CRTC0_CRTC_CRC0_WINDOWA_Y_CONTROL, - CRTC_CRC0_WINDOWA_Y_START, params->windowa_y_start, - CRTC_CRC0_WINDOWA_Y_END, params->windowa_y_end); + /* Window A y axis start and end. */ + CRTC_REG_UPDATE_2(CRTC0_CRTC_CRC0_WINDOWA_Y_CONTROL, + CRTC_CRC0_WINDOWA_Y_START, params->windowa_y_start, + CRTC_CRC0_WINDOWA_Y_END, params->windowa_y_end); - /* Window B x axis start and end. */ - CRTC_REG_UPDATE_2(CRTC0_CRTC_CRC0_WINDOWB_X_CONTROL, - CRTC_CRC0_WINDOWB_X_START, params->windowb_x_start, - CRTC_CRC0_WINDOWB_X_END, params->windowb_x_end); + /* Window B x axis start and end. */ + CRTC_REG_UPDATE_2(CRTC0_CRTC_CRC0_WINDOWB_X_CONTROL, + CRTC_CRC0_WINDOWB_X_START, params->windowb_x_start, + CRTC_CRC0_WINDOWB_X_END, params->windowb_x_end); - /* Window B y axis start and end. */ - CRTC_REG_UPDATE_2(CRTC0_CRTC_CRC0_WINDOWB_Y_CONTROL, - CRTC_CRC0_WINDOWB_Y_START, params->windowb_y_start, - CRTC_CRC0_WINDOWB_Y_END, params->windowb_y_end); + /* Window B y axis start and end. */ + CRTC_REG_UPDATE_2(CRTC0_CRTC_CRC0_WINDOWB_Y_CONTROL, + CRTC_CRC0_WINDOWB_Y_START, params->windowb_y_start, + CRTC_CRC0_WINDOWB_Y_END, params->windowb_y_end); - /* Set crc mode and selection, and enable. Only using CRC0*/ - CRTC_REG_UPDATE_3(CRTC0_CRTC_CRC_CNTL, - CRTC_CRC_EN, params->continuous_mode ? 
1 : 0, - CRTC_CRC0_SELECT, params->selection, - CRTC_CRC_EN, 1); + /* Set crc mode and selection, and enable.*/ + CRTC_REG_UPDATE_3(CRTC0_CRTC_CRC_CNTL, + CRTC_CRC_CONT_EN, params->continuous_mode ? 1 : 0, + CRTC_CRC0_SELECT, params->selection, + CRTC_CRC_EN, 1); + break; + case 1: + /* Window A x axis start and end. */ + CRTC_REG_UPDATE_2(CRTC0_CRTC_CRC1_WINDOWA_X_CONTROL, + CRTC_CRC1_WINDOWA_X_START, params->windowa_x_start, + CRTC_CRC1_WINDOWA_X_END, params->windowa_x_end); + + /* Window A y axis start and end. */ + CRTC_REG_UPDATE_2(CRTC0_CRTC_CRC1_WINDOWA_Y_CONTROL, + CRTC_CRC1_WINDOWA_Y_START, params->windowa_y_start, + CRTC_CRC1_WINDOWA_Y_END, params->windowa_y_end); + + /* Window B x axis start and end. */ + CRTC_REG_UPDATE_2(CRTC0_CRTC_CRC1_WINDOWB_X_CONTROL, + CRTC_CRC1_WINDOWB_X_START, params->windowb_x_start, + CRTC_CRC1_WINDOWB_X_END, params->windowb_x_end); + + /* Window B y axis start and end. */ + CRTC_REG_UPDATE_2(CRTC0_CRTC_CRC1_WINDOWB_Y_CONTROL, + CRTC_CRC1_WINDOWB_Y_START, params->windowb_y_start, + CRTC_CRC1_WINDOWB_Y_END, params->windowb_y_end); + + /* Set crc mode and selection, and enable */ + CRTC_REG_UPDATE_3(CRTC0_CRTC_CRC_CNTL, + CRTC_CRC_CONT_EN, params->continuous_mode ? 
1 : 0, + CRTC_CRC1_SELECT, params->selection, + CRTC_CRC_EN, 1); + break; + default: + return false; + } return true; } -static bool dce120_get_crc(struct timing_generator *tg, uint32_t *r_cr, - uint32_t *g_y, uint32_t *b_cb) +static bool dce120_get_crc(struct timing_generator *tg, uint8_t idx, + uint32_t *r_cr, uint32_t *g_y, uint32_t *b_cb) { struct dce110_timing_generator *tg110 = DCE110TG_FROM_TG(tg); uint32_t value, field; @@ -1151,14 +1185,30 @@ static bool dce120_get_crc(struct timing_generator *tg, uint32_t *r_cr, if (!field) return false; - value = dm_read_reg_soc15(tg->ctx, mmCRTC0_CRTC_CRC0_DATA_RG, - tg110->offsets.crtc); - *r_cr = get_reg_field_value(value, CRTC0_CRTC_CRC0_DATA_RG, CRC0_R_CR); - *g_y = get_reg_field_value(value, CRTC0_CRTC_CRC0_DATA_RG, CRC0_G_Y); + switch (idx) { + case 0: + value = dm_read_reg_soc15(tg->ctx, mmCRTC0_CRTC_CRC0_DATA_RG, + tg110->offsets.crtc); + *r_cr = get_reg_field_value(value, CRTC0_CRTC_CRC0_DATA_RG, CRC0_R_CR); + *g_y = get_reg_field_value(value, CRTC0_CRTC_CRC0_DATA_RG, CRC0_G_Y); - value = dm_read_reg_soc15(tg->ctx, mmCRTC0_CRTC_CRC0_DATA_B, - tg110->offsets.crtc); - *b_cb = get_reg_field_value(value, CRTC0_CRTC_CRC0_DATA_B, CRC0_B_CB); + value = dm_read_reg_soc15(tg->ctx, mmCRTC0_CRTC_CRC0_DATA_B, + tg110->offsets.crtc); + *b_cb = get_reg_field_value(value, CRTC0_CRTC_CRC0_DATA_B, CRC0_B_CB); + break; + case 1: + value = dm_read_reg_soc15(tg->ctx, mmCRTC0_CRTC_CRC1_DATA_RG, + tg110->offsets.crtc); + *r_cr = get_reg_field_value(value, CRTC0_CRTC_CRC1_DATA_RG, CRC1_R_CR); + *g_y = get_reg_field_value(value, CRTC0_CRTC_CRC1_DATA_RG, CRC1_G_Y); + + value = dm_read_reg_soc15(tg->ctx, mmCRTC0_CRTC_CRC1_DATA_B, + tg110->offsets.crtc); + *b_cb = get_reg_field_value(value, CRTC0_CRTC_CRC1_DATA_B, CRC1_B_CB); + break; + default: + return false; + } return true; } diff --git a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_panel_cntl.c b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_panel_cntl.c index 573898984726..f9961a6446f3 
100644 --- a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_panel_cntl.c +++ b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_panel_cntl.c @@ -168,31 +168,33 @@ void dcn31_panel_cntl_construct( struct dcn31_panel_cntl *dcn31_panel_cntl, const struct panel_cntl_init_data *init_data) { - uint8_t pwrseq_inst = 0xF; dcn31_panel_cntl->base.funcs = &dcn31_link_panel_cntl_funcs; dcn31_panel_cntl->base.ctx = init_data->ctx; dcn31_panel_cntl->base.inst = init_data->inst; - switch (init_data->eng_id) { - case ENGINE_ID_DIGA: - pwrseq_inst = 0; - break; - case ENGINE_ID_DIGB: - pwrseq_inst = 1; - break; - default: - DC_LOG_WARNING("Unsupported pwrseq engine id: %d!\n", init_data->eng_id); - ASSERT(false); - break; - } - - if (dcn31_panel_cntl->base.ctx->dc->config.support_edp0_on_dp1) + if (dcn31_panel_cntl->base.ctx->dc->config.support_edp0_on_dp1) { //If supported, power sequencer mapping shall follow the DIG instance + uint8_t pwrseq_inst = 0xF; + + switch (init_data->eng_id) { + case ENGINE_ID_DIGA: + pwrseq_inst = 0; + break; + case ENGINE_ID_DIGB: + pwrseq_inst = 1; + break; + default: + DC_LOG_WARNING("Unsupported pwrseq engine id: %d!\n", init_data->eng_id); + ASSERT(false); + break; + } + dcn31_panel_cntl->base.pwrseq_inst = pwrseq_inst; - else + } else { /* If not supported, pwrseq will be assigned in order, * so first pwrseq will be assigned to first panel instance (legacy behavior) */ dcn31_panel_cntl->base.pwrseq_inst = dcn31_panel_cntl->base.inst; + } } diff --git a/drivers/gpu/drm/amd/display/dc/dio/dcn31/dcn31_dio_link_encoder.c b/drivers/gpu/drm/amd/display/dc/dio/dcn31/dcn31_dio_link_encoder.c index b2cea59ba5d4..9a92f73d5b7f 100644 --- a/drivers/gpu/drm/amd/display/dc/dio/dcn31/dcn31_dio_link_encoder.c +++ b/drivers/gpu/drm/amd/display/dc/dio/dcn31/dcn31_dio_link_encoder.c @@ -653,8 +653,9 @@ void dcn31_link_encoder_get_max_link_cap(struct link_encoder *enc, struct dc_lin if (!query_dp_alt_from_dmub(enc, &cmd)) return; - if (cmd.query_dp_alt.data.is_usb && - 
cmd.query_dp_alt.data.is_dp4 == 0) + if (cmd.query_dp_alt.data.is_dp_alt_disable == 0 && + cmd.query_dp_alt.data.is_usb && + cmd.query_dp_alt.data.is_dp4 == 0) link_settings->lane_count = MIN(LANE_COUNT_TWO, link_settings->lane_count); return; diff --git a/drivers/gpu/drm/amd/display/dc/dio/dcn35/dcn35_dio_link_encoder.c b/drivers/gpu/drm/amd/display/dc/dio/dcn35/dcn35_dio_link_encoder.c index d4a3e811aa39..ea0c9a9d0bd6 100644 --- a/drivers/gpu/drm/amd/display/dc/dio/dcn35/dcn35_dio_link_encoder.c +++ b/drivers/gpu/drm/amd/display/dc/dio/dcn35/dcn35_dio_link_encoder.c @@ -28,6 +28,7 @@ #include "link_encoder.h" #include "dcn31/dcn31_dio_link_encoder.h" #include "dcn35_dio_link_encoder.h" +#include "dc_dmub_srv.h" #define CTX \ enc10->base.ctx #define DC_LOGGER \ @@ -159,6 +160,8 @@ static const struct link_encoder_funcs dcn35_link_enc_funcs = { .is_in_alt_mode = dcn31_link_encoder_is_in_alt_mode, .get_max_link_cap = dcn31_link_encoder_get_max_link_cap, .set_dio_phy_mux = dcn31_link_encoder_set_dio_phy_mux, + .enable_dpia_output = dcn35_link_encoder_enable_dpia_output, + .disable_dpia_output = dcn35_link_encoder_disable_dpia_output, }; void dcn35_link_encoder_construct( @@ -265,3 +268,80 @@ void dcn35_link_encoder_construct( enc10->base.features.flags.bits.HDMI_6GB_EN = 0; } + +/* DPIA equivalent of link_transmitter_control. 
*/ +static bool link_dpia_control(struct dc_context *dc_ctx, + struct dmub_cmd_dig_dpia_control_data *dpia_control) +{ + union dmub_rb_cmd cmd; + + memset(&cmd, 0, sizeof(cmd)); + + cmd.dig1_dpia_control.header.type = DMUB_CMD__DPIA; + cmd.dig1_dpia_control.header.sub_type = + DMUB_CMD__DPIA_DIG1_DPIA_CONTROL; + cmd.dig1_dpia_control.header.payload_bytes = + sizeof(cmd.dig1_dpia_control) - + sizeof(cmd.dig1_dpia_control.header); + + cmd.dig1_dpia_control.dpia_control = *dpia_control; + + dc_wake_and_execute_dmub_cmd(dc_ctx, &cmd, DM_DMUB_WAIT_TYPE_WAIT); + + return true; +} + +static void link_encoder_disable(struct dcn10_link_encoder *enc10) +{ + /* reset training complete */ + REG_UPDATE(DP_LINK_CNTL, DP_LINK_TRAINING_COMPLETE, 0); +} + +void dcn35_link_encoder_enable_dpia_output( + struct link_encoder *enc, + const struct dc_link_settings *link_settings, + uint8_t dpia_id, + uint8_t digmode, + uint8_t fec_rdy) +{ + struct dcn10_link_encoder *enc10 = TO_DCN10_LINK_ENC(enc); + struct dmub_cmd_dig_dpia_control_data dpia_control = { 0 }; + + enc1_configure_encoder(enc10, link_settings); + + dpia_control.action = (uint8_t)TRANSMITTER_CONTROL_ENABLE; + dpia_control.enc_id = enc->preferred_engine; + dpia_control.mode_laneset.digmode = digmode; + dpia_control.lanenum = (uint8_t)link_settings->lane_count; + dpia_control.symclk_10khz = link_settings->link_rate * + LINK_RATE_REF_FREQ_IN_KHZ / 10; + /* DIG_BE_CNTL.DIG_HPD_SELECT set to 5 (hpdsel - 1) to indicate HPD pin unused by DPIA. 
*/ + dpia_control.hpdsel = 6; + dpia_control.dpia_id = dpia_id; + dpia_control.fec_rdy = fec_rdy; + + DC_LOG_DEBUG("%s: DPIA(%d) - enc_id(%d)\n", __func__, dpia_control.dpia_id, dpia_control.enc_id); + link_dpia_control(enc->ctx, &dpia_control); +} + +void dcn35_link_encoder_disable_dpia_output( + struct link_encoder *enc, + uint8_t dpia_id, + uint8_t digmode) +{ + struct dcn10_link_encoder *enc10 = TO_DCN10_LINK_ENC(enc); + struct dmub_cmd_dig_dpia_control_data dpia_control = { 0 }; + + if (enc->funcs->is_dig_enabled && !enc->funcs->is_dig_enabled(enc)) + return; + + dpia_control.action = (uint8_t)TRANSMITTER_CONTROL_DISABLE; + dpia_control.enc_id = enc->preferred_engine; + dpia_control.mode_laneset.digmode = digmode; + dpia_control.dpia_id = dpia_id; + + DC_LOG_DEBUG("%s: DPIA(%d) - enc_id(%d)\n", __func__, dpia_control.dpia_id, dpia_control.enc_id); + link_dpia_control(enc->ctx, &dpia_control); + + link_encoder_disable(enc10); +} diff --git a/drivers/gpu/drm/amd/display/dc/dio/dcn35/dcn35_dio_link_encoder.h b/drivers/gpu/drm/amd/display/dc/dio/dcn35/dcn35_dio_link_encoder.h index d546a3676304..f9d4221f4b43 100644 --- a/drivers/gpu/drm/amd/display/dc/dio/dcn35/dcn35_dio_link_encoder.h +++ b/drivers/gpu/drm/amd/display/dc/dio/dcn35/dcn35_dio_link_encoder.h @@ -144,4 +144,22 @@ bool dcn35_is_dig_enabled(struct link_encoder *enc); enum signal_type dcn35_get_dig_mode(struct link_encoder *enc); void dcn35_link_encoder_setup(struct link_encoder *enc, enum signal_type signal); +/* + * Enable DP transmitter and its encoder for dpia port. + */ +void dcn35_link_encoder_enable_dpia_output( + struct link_encoder *enc, + const struct dc_link_settings *link_settings, + uint8_t dpia_id, + uint8_t digmode, + uint8_t fec_rdy); + +/* + * Disable transmitter and its encoder for dpia port. 
+ */ +void dcn35_link_encoder_disable_dpia_output( + struct link_encoder *enc, + uint8_t dpia_id, + uint8_t digmode); + #endif /* __DC_LINK_ENCODER__DCN35_H__ */ diff --git a/drivers/gpu/drm/amd/display/dc/dm_helpers.h b/drivers/gpu/drm/amd/display/dc/dm_helpers.h index 2e4a46f1b499..5efddd48d5c5 100644 --- a/drivers/gpu/drm/amd/display/dc/dm_helpers.h +++ b/drivers/gpu/drm/amd/display/dc/dm_helpers.h @@ -158,6 +158,11 @@ bool dm_helpers_dp_write_dsc_enable( const struct dc_stream_state *stream, bool enable ); + +bool dm_helpers_dp_write_hblank_reduction( + struct dc_context *ctx, + const struct dc_stream_state *stream); + bool dm_helpers_is_dp_sink_present( struct dc_link *link); diff --git a/drivers/gpu/drm/amd/display/dc/dml/calcs/dcn_calcs.c b/drivers/gpu/drm/amd/display/dc/dml/calcs/dcn_calcs.c index 39525721c976..f1235bf9a596 100644 --- a/drivers/gpu/drm/amd/display/dc/dml/calcs/dcn_calcs.c +++ b/drivers/gpu/drm/amd/display/dc/dml/calcs/dcn_calcs.c @@ -1312,138 +1312,6 @@ bool dcn_validate_bandwidth( return false; } -static unsigned int dcn_find_normalized_clock_vdd_Level( - const struct dc *dc, - enum dm_pp_clock_type clocks_type, - int clocks_in_khz) -{ - int vdd_level = dcn_bw_v_min0p65; - - if (clocks_in_khz == 0)/*todo some clock not in the considerations*/ - return vdd_level; - - switch (clocks_type) { - case DM_PP_CLOCK_TYPE_DISPLAY_CLK: - if (clocks_in_khz > dc->dcn_soc->max_dispclk_vmax0p9*1000) { - vdd_level = dcn_bw_v_max0p91; - BREAK_TO_DEBUGGER(); - } else if (clocks_in_khz > dc->dcn_soc->max_dispclk_vnom0p8*1000) { - vdd_level = dcn_bw_v_max0p9; - } else if (clocks_in_khz > dc->dcn_soc->max_dispclk_vmid0p72*1000) { - vdd_level = dcn_bw_v_nom0p8; - } else if (clocks_in_khz > dc->dcn_soc->max_dispclk_vmin0p65*1000) { - vdd_level = dcn_bw_v_mid0p72; - } else - vdd_level = dcn_bw_v_min0p65; - break; - case DM_PP_CLOCK_TYPE_DISPLAYPHYCLK: - if (clocks_in_khz > dc->dcn_soc->phyclkv_max0p9*1000) { - vdd_level = dcn_bw_v_max0p91; - BREAK_TO_DEBUGGER(); 
- } else if (clocks_in_khz > dc->dcn_soc->phyclkv_nom0p8*1000) { - vdd_level = dcn_bw_v_max0p9; - } else if (clocks_in_khz > dc->dcn_soc->phyclkv_mid0p72*1000) { - vdd_level = dcn_bw_v_nom0p8; - } else if (clocks_in_khz > dc->dcn_soc->phyclkv_min0p65*1000) { - vdd_level = dcn_bw_v_mid0p72; - } else - vdd_level = dcn_bw_v_min0p65; - break; - - case DM_PP_CLOCK_TYPE_DPPCLK: - if (clocks_in_khz > dc->dcn_soc->max_dppclk_vmax0p9*1000) { - vdd_level = dcn_bw_v_max0p91; - BREAK_TO_DEBUGGER(); - } else if (clocks_in_khz > dc->dcn_soc->max_dppclk_vnom0p8*1000) { - vdd_level = dcn_bw_v_max0p9; - } else if (clocks_in_khz > dc->dcn_soc->max_dppclk_vmid0p72*1000) { - vdd_level = dcn_bw_v_nom0p8; - } else if (clocks_in_khz > dc->dcn_soc->max_dppclk_vmin0p65*1000) { - vdd_level = dcn_bw_v_mid0p72; - } else - vdd_level = dcn_bw_v_min0p65; - break; - - case DM_PP_CLOCK_TYPE_MEMORY_CLK: - { - unsigned factor = (ddr4_dram_factor_single_Channel * dc->dcn_soc->number_of_channels); - - if (clocks_in_khz > dc->dcn_soc->fabric_and_dram_bandwidth_vmax0p9*1000000/factor) { - vdd_level = dcn_bw_v_max0p91; - BREAK_TO_DEBUGGER(); - } else if (clocks_in_khz > dc->dcn_soc->fabric_and_dram_bandwidth_vnom0p8*1000000/factor) { - vdd_level = dcn_bw_v_max0p9; - } else if (clocks_in_khz > dc->dcn_soc->fabric_and_dram_bandwidth_vmid0p72*1000000/factor) { - vdd_level = dcn_bw_v_nom0p8; - } else if (clocks_in_khz > dc->dcn_soc->fabric_and_dram_bandwidth_vmin0p65*1000000/factor) { - vdd_level = dcn_bw_v_mid0p72; - } else - vdd_level = dcn_bw_v_min0p65; - } - break; - - case DM_PP_CLOCK_TYPE_DCFCLK: - if (clocks_in_khz > dc->dcn_soc->dcfclkv_max0p9*1000) { - vdd_level = dcn_bw_v_max0p91; - BREAK_TO_DEBUGGER(); - } else if (clocks_in_khz > dc->dcn_soc->dcfclkv_nom0p8*1000) { - vdd_level = dcn_bw_v_max0p9; - } else if (clocks_in_khz > dc->dcn_soc->dcfclkv_mid0p72*1000) { - vdd_level = dcn_bw_v_nom0p8; - } else if (clocks_in_khz > dc->dcn_soc->dcfclkv_min0p65*1000) { - vdd_level = dcn_bw_v_mid0p72; - } else 
- vdd_level = dcn_bw_v_min0p65; - break; - - default: - break; - } - return vdd_level; -} - -unsigned int dcn_find_dcfclk_suits_all( - const struct dc *dc, - struct dc_clocks *clocks) -{ - unsigned vdd_level, vdd_level_temp; - unsigned dcf_clk; - - /*find a common supported voltage level*/ - vdd_level = dcn_find_normalized_clock_vdd_Level( - dc, DM_PP_CLOCK_TYPE_DISPLAY_CLK, clocks->dispclk_khz); - vdd_level_temp = dcn_find_normalized_clock_vdd_Level( - dc, DM_PP_CLOCK_TYPE_DISPLAYPHYCLK, clocks->phyclk_khz); - - vdd_level = dcn_bw_max(vdd_level, vdd_level_temp); - vdd_level_temp = dcn_find_normalized_clock_vdd_Level( - dc, DM_PP_CLOCK_TYPE_DPPCLK, clocks->dppclk_khz); - vdd_level = dcn_bw_max(vdd_level, vdd_level_temp); - - vdd_level_temp = dcn_find_normalized_clock_vdd_Level( - dc, DM_PP_CLOCK_TYPE_MEMORY_CLK, clocks->fclk_khz); - vdd_level = dcn_bw_max(vdd_level, vdd_level_temp); - vdd_level_temp = dcn_find_normalized_clock_vdd_Level( - dc, DM_PP_CLOCK_TYPE_DCFCLK, clocks->dcfclk_khz); - - /*find that level conresponding dcfclk*/ - vdd_level = dcn_bw_max(vdd_level, vdd_level_temp); - if (vdd_level == dcn_bw_v_max0p91) { - BREAK_TO_DEBUGGER(); - dcf_clk = dc->dcn_soc->dcfclkv_max0p9*1000; - } else if (vdd_level == dcn_bw_v_max0p9) - dcf_clk = dc->dcn_soc->dcfclkv_max0p9*1000; - else if (vdd_level == dcn_bw_v_nom0p8) - dcf_clk = dc->dcn_soc->dcfclkv_nom0p8*1000; - else if (vdd_level == dcn_bw_v_mid0p72) - dcf_clk = dc->dcn_soc->dcfclkv_mid0p72*1000; - else - dcf_clk = dc->dcn_soc->dcfclkv_min0p65*1000; - - DC_LOG_BANDWIDTH_CALCS("\tdcf_clk for voltage = %d\n", dcf_clk); - return dcf_clk; -} - void dcn_bw_update_from_pplib_fclks( struct dc *dc, struct dm_pp_clock_levels_with_voltage *fclks) diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_rq_dlg_calc_30.c b/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_rq_dlg_calc_30.c index 76d3bb3c9155..8d4873f80df0 100644 --- a/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_rq_dlg_calc_30.c +++ 
b/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_rq_dlg_calc_30.c @@ -1562,6 +1562,7 @@ static void dml_rq_dlg_get_dlg_params(struct display_mode_lib *mode_lib, dml_print("DML_DLG: %s: disp_dlg_regs->dst_y_per_row_vblank = 0x%x\n", __func__, disp_dlg_regs->dst_y_per_row_vblank); dml_print("DML_DLG: %s: disp_dlg_regs->dst_y_per_vm_flip = 0x%x\n", __func__, disp_dlg_regs->dst_y_per_vm_flip); dml_print("DML_DLG: %s: disp_dlg_regs->dst_y_per_row_flip = 0x%x\n", __func__, disp_dlg_regs->dst_y_per_row_flip); + disp_dlg_regs->refcyc_per_pte_group_vblank_l = (unsigned int)(dst_y_per_row_vblank * (double)htotal * ref_freq_to_pix_freq / (double)dpte_groups_per_row_ub_l); diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_util_32.c b/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_util_32.c index 86ac7d59fd32..0748ef36a16a 100644 --- a/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_util_32.c +++ b/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_util_32.c @@ -1595,6 +1595,7 @@ double dml32_TruncToValidBPP( unsigned int NonDSCBPP0; unsigned int NonDSCBPP1; unsigned int NonDSCBPP2; + unsigned int NonDSCBPP3 = BPP_INVALID; if (Format == dm_420) { NonDSCBPP0 = 12; @@ -1603,6 +1604,7 @@ double dml32_TruncToValidBPP( MinDSCBPP = 6; MaxDSCBPP = 1.5 * DSCInputBitPerComponent - 1.0 / 16; } else if (Format == dm_444) { + NonDSCBPP3 = 18; NonDSCBPP0 = 24; NonDSCBPP1 = 30; NonDSCBPP2 = 36; @@ -1667,6 +1669,8 @@ double dml32_TruncToValidBPP( return NonDSCBPP1; else if (MaxLinkBPP >= NonDSCBPP0) return 16.0; + else if ((Output == dm_dp2p0 || Output == dm_dp) && NonDSCBPP3 != BPP_INVALID && MaxLinkBPP >= NonDSCBPP3) + return NonDSCBPP3; // Special case to allow 6bpc RGB for DP connections. 
else return BPP_INVALID; } diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn35/dcn35_fpu.c b/drivers/gpu/drm/amd/display/dc/dml/dcn35/dcn35_fpu.c index beed7adbbd43..47d785204f29 100644 --- a/drivers/gpu/drm/amd/display/dc/dml/dcn35/dcn35_fpu.c +++ b/drivers/gpu/drm/amd/display/dc/dml/dcn35/dcn35_fpu.c @@ -195,9 +195,9 @@ struct _vcs_dpi_soc_bounding_box_st dcn3_5_soc = { .dcn_downspread_percent = 0.5, .gpuvm_min_page_size_bytes = 4096, .hostvm_min_page_size_bytes = 4096, - .do_urgent_latency_adjustment = 1, + .do_urgent_latency_adjustment = 0, .urgent_latency_adjustment_fabric_clock_component_us = 0, - .urgent_latency_adjustment_fabric_clock_reference_mhz = 3000, + .urgent_latency_adjustment_fabric_clock_reference_mhz = 0, }; void dcn35_build_wm_range_table_fpu(struct clk_mgr *clk_mgr) diff --git a/drivers/gpu/drm/amd/display/dc/dml/dml_inline_defs.h b/drivers/gpu/drm/amd/display/dc/dml/dml_inline_defs.h index 072bd0539605..6b2ab4ec2b5f 100644 --- a/drivers/gpu/drm/amd/display/dc/dml/dml_inline_defs.h +++ b/drivers/gpu/drm/amd/display/dc/dml/dml_inline_defs.h @@ -66,11 +66,15 @@ static inline double dml_max5(double a, double b, double c, double d, double e) static inline double dml_ceil(double a, double granularity) { + if (granularity == 0) + return 0; return (double) dcn_bw_ceil2(a, granularity); } static inline double dml_floor(double a, double granularity) { + if (granularity == 0) + return 0; return (double) dcn_bw_floor2(a, granularity); } @@ -114,11 +118,15 @@ static inline double dml_ceil_2(double f) static inline double dml_ceil_ex(double x, double granularity) { + if (granularity == 0) + return 0; return (double) dcn_bw_ceil2(x, granularity); } static inline double dml_floor_ex(double x, double granularity) { + if (granularity == 0) + return 0; return (double) dcn_bw_floor2(x, granularity); } diff --git a/drivers/gpu/drm/amd/display/dc/dml2/Makefile b/drivers/gpu/drm/amd/display/dc/dml2/Makefile index c4378e620cbf..91c4f3b4bd5f 100644 --- 
a/drivers/gpu/drm/amd/display/dc/dml2/Makefile +++ b/drivers/gpu/drm/amd/display/dc/dml2/Makefile @@ -29,7 +29,11 @@ dml2_rcflags := $(CC_FLAGS_NO_FPU) ifneq ($(CONFIG_FRAME_WARN),0) ifeq ($(filter y,$(CONFIG_KASAN)$(CONFIG_KCSAN)),y) +ifeq ($(CONFIG_CC_IS_CLANG)$(CONFIG_COMPILE_TEST),yy) +frame_warn_flag := -Wframe-larger-than=4096 +else frame_warn_flag := -Wframe-larger-than=3072 +endif else frame_warn_flag := -Wframe-larger-than=2048 endif @@ -73,9 +77,8 @@ AMD_DAL_DML2 = $(addprefix $(AMDDALPATH)/dc/dml2/,$(DML2)) AMD_DISPLAY_FILES += $(AMD_DAL_DML2) -CFLAGS_$(AMDDALPATH)/dc/dml2/dml21/src/dml2_top/dml_top.o := $(dml2_ccflags) -CFLAGS_$(AMDDALPATH)/dc/dml2/dml21/src/dml2_top/dml_top_mcache.o := $(dml2_ccflags) -CFLAGS_$(AMDDALPATH)/dc/dml2/dml21/src/dml2_top/dml2_top_optimization := $(dml2_ccflags) +CFLAGS_$(AMDDALPATH)/dc/dml2/dml21/src/dml2_top/dml2_top_interfaces.o := $(dml2_ccflags) +CFLAGS_$(AMDDALPATH)/dc/dml2/dml21/src/dml2_top/dml2_top_soc15.o := $(dml2_ccflags) CFLAGS_$(AMDDALPATH)/dc/dml2/dml21/src/dml2_core/dml2_core_dcn4.o := $(dml2_ccflags) CFLAGS_$(AMDDALPATH)/dc/dml2/dml21/src/dml2_core/dml2_core_dcn4_calcs.o := $(dml2_ccflags) $(frame_warn_flag) CFLAGS_$(AMDDALPATH)/dc/dml2/dml21/src/dml2_core/dml2_core_factory.o := $(dml2_ccflags) @@ -94,9 +97,8 @@ CFLAGS_$(AMDDALPATH)/dc/dml2/dml21/dml21_translation_helper.o := $(dml2_ccflags) CFLAGS_$(AMDDALPATH)/dc/dml2/dml21/dml21_utils.o := $(dml2_ccflags) CFLAGS_$(AMDDALPATH)/dc/dml2/dml21/inc/dml2_debug.o := $(dml2_ccflags) -CFLAGS_REMOVE_$(AMDDALPATH)/dc/dml2/dml21/src/dml2_top/dml_top.o := $(dml2_rcflags) -CFLAGS_REMOVE_$(AMDDALPATH)/dc/dml2/dml21/src/dml2_top/dml_top_mcache.o := $(dml2_rcflags) -CFLAGS_REMOVE_$(AMDDALPATH)/dc/dml2/dml21/src/dml2_top/dml2_top_optimization.o := $(dml2_rcflags) +CFLAGS_REMOVE_$(AMDDALPATH)/dc/dml2/dml21/src/dml2_top/dml2_top_interfaces.o := $(dml2_rcflags) +CFLAGS_REMOVE_$(AMDDALPATH)/dc/dml2/dml21/src/dml2_top/dml2_top_soc15.o := $(dml2_rcflags) 
CFLAGS_REMOVE_$(AMDDALPATH)/dc/dml2/dml21/src/dml2_core/dml2_core_dcn4.o := $(dml2_rcflags) CFLAGS_REMOVE_$(AMDDALPATH)/dc/dml2/dml21/src/dml2_core/dml2_core_dcn4_calcs.o := $(dml2_rcflags) CFLAGS_REMOVE_$(AMDDALPATH)/dc/dml2/dml21/src/dml2_core/dml2_core_factory.o := $(dml2_rcflags) @@ -113,9 +115,8 @@ CFLAGS_REMOVE_$(AMDDALPATH)/dc/dml2/dml21/dml21_translation_helper.o := $(dml2_r CFLAGS_REMOVE_$(AMDDALPATH)/dc/dml2/dml21/dml21_utils.o := $(dml2_rcflags) CFLAGS_REMOVE_$(AMDDALPATH)/dc/dml2/dml21/inc/dml2_debug.o := $(dml2_rcflags) -DML21 := src/dml2_top/dml_top.o -DML21 += src/dml2_top/dml_top_mcache.o -DML21 += src/dml2_top/dml2_top_optimization.o +DML21 := src/dml2_top/dml2_top_interfaces.o +DML21 += src/dml2_top/dml2_top_soc15.o DML21 += src/inc/dml2_debug.o DML21 += src/dml2_core/dml2_core_dcn4.o DML21 += src/dml2_core/dml2_core_factory.o diff --git a/drivers/gpu/drm/amd/display/dc/dml2/display_mode_core.c b/drivers/gpu/drm/amd/display/dc/dml2/display_mode_core.c index d851c081e376..35bc917631ae 100644 --- a/drivers/gpu/drm/amd/display/dc/dml2/display_mode_core.c +++ b/drivers/gpu/drm/amd/display/dc/dml2/display_mode_core.c @@ -1222,6 +1222,7 @@ static dml_bool_t CalculatePrefetchSchedule(struct display_mode_lib_scratch_st * s->dst_y_prefetch_oto = s->Tvm_oto_lines + 2 * s->Tr0_oto_lines + s->Lsw_oto; s->dst_y_prefetch_equ = p->VStartup - (*p->TSetup + dml_max(p->TWait + p->TCalc, *p->Tdmdl)) / s->LineTime - (*p->DSTYAfterScaler + (dml_float_t) *p->DSTXAfterScaler / (dml_float_t)p->myPipe->HTotal); + s->dst_y_prefetch_equ = dml_min(s->dst_y_prefetch_equ, 63.75); // limit to the reg limit of U6.2 for DST_Y_PREFETCH #ifdef __DML_VBA_DEBUG__ dml_print("DML::%s: HTotal = %u\n", __func__, p->myPipe->HTotal); @@ -6300,9 +6301,9 @@ static void dml_prefetch_check(struct display_mode_lib_st *mode_lib) mode_lib->ms.meta_row_bandwidth_this_state, mode_lib->ms.dpte_row_bandwidth_this_state, mode_lib->ms.NoOfDPPThisState, - mode_lib->ms.UrgentBurstFactorLuma, - 
mode_lib->ms.UrgentBurstFactorChroma, - mode_lib->ms.UrgentBurstFactorCursor); + mode_lib->ms.UrgentBurstFactorLuma[j], + mode_lib->ms.UrgentBurstFactorChroma[j], + mode_lib->ms.UrgentBurstFactorCursor[j]); s->VMDataOnlyReturnBWPerState = dml_get_return_bw_mbps_vm_only( &mode_lib->ms.soc, @@ -6433,7 +6434,7 @@ static void dml_prefetch_check(struct display_mode_lib_st *mode_lib) /* Output */ &mode_lib->ms.UrgentBurstFactorCursorPre[k], &mode_lib->ms.UrgentBurstFactorLumaPre[k], - &mode_lib->ms.UrgentBurstFactorChroma[k], + &mode_lib->ms.UrgentBurstFactorChromaPre[k], &mode_lib->ms.NotUrgentLatencyHidingPre[k]); mode_lib->ms.cursor_bw_pre[k] = mode_lib->ms.cache_display_cfg.plane.NumberOfCursors[k] * mode_lib->ms.cache_display_cfg.plane.CursorWidth[k] * @@ -6457,9 +6458,9 @@ static void dml_prefetch_check(struct display_mode_lib_st *mode_lib) mode_lib->ms.cursor_bw_pre, mode_lib->ms.prefetch_vmrow_bw, mode_lib->ms.NoOfDPPThisState, - mode_lib->ms.UrgentBurstFactorLuma, - mode_lib->ms.UrgentBurstFactorChroma, - mode_lib->ms.UrgentBurstFactorCursor, + mode_lib->ms.UrgentBurstFactorLuma[j], + mode_lib->ms.UrgentBurstFactorChroma[j], + mode_lib->ms.UrgentBurstFactorCursor[j], mode_lib->ms.UrgentBurstFactorLumaPre, mode_lib->ms.UrgentBurstFactorChromaPre, mode_lib->ms.UrgentBurstFactorCursorPre, @@ -6516,9 +6517,9 @@ static void dml_prefetch_check(struct display_mode_lib_st *mode_lib) mode_lib->ms.cursor_bw, mode_lib->ms.cursor_bw_pre, mode_lib->ms.NoOfDPPThisState, - mode_lib->ms.UrgentBurstFactorLuma, - mode_lib->ms.UrgentBurstFactorChroma, - mode_lib->ms.UrgentBurstFactorCursor, + mode_lib->ms.UrgentBurstFactorLuma[j], + mode_lib->ms.UrgentBurstFactorChroma[j], + mode_lib->ms.UrgentBurstFactorCursor[j], mode_lib->ms.UrgentBurstFactorLumaPre, mode_lib->ms.UrgentBurstFactorChromaPre, mode_lib->ms.UrgentBurstFactorCursorPre); @@ -6585,9 +6586,9 @@ static void dml_prefetch_check(struct display_mode_lib_st *mode_lib) mode_lib->ms.cursor_bw_pre, 
mode_lib->ms.prefetch_vmrow_bw, mode_lib->ms.NoOfDPP[j], // VBA_ERROR DPPPerSurface is not assigned at this point, should use NoOfDpp here - mode_lib->ms.UrgentBurstFactorLuma, - mode_lib->ms.UrgentBurstFactorChroma, - mode_lib->ms.UrgentBurstFactorCursor, + mode_lib->ms.UrgentBurstFactorLuma[j], + mode_lib->ms.UrgentBurstFactorChroma[j], + mode_lib->ms.UrgentBurstFactorCursor[j], mode_lib->ms.UrgentBurstFactorLumaPre, mode_lib->ms.UrgentBurstFactorChromaPre, mode_lib->ms.UrgentBurstFactorCursorPre, @@ -7808,9 +7809,9 @@ dml_bool_t dml_core_mode_support(struct display_mode_lib_st *mode_lib) mode_lib->ms.DETBufferSizeYThisState[k], mode_lib->ms.DETBufferSizeCThisState[k], /* Output */ - &mode_lib->ms.UrgentBurstFactorCursor[k], - &mode_lib->ms.UrgentBurstFactorLuma[k], - &mode_lib->ms.UrgentBurstFactorChroma[k], + &mode_lib->ms.UrgentBurstFactorCursor[j][k], + &mode_lib->ms.UrgentBurstFactorLuma[j][k], + &mode_lib->ms.UrgentBurstFactorChroma[j][k], &mode_lib->ms.NotUrgentLatencyHiding[k]); } @@ -8317,7 +8318,7 @@ void dml_core_mode_programming(struct display_mode_lib_st *mode_lib, const struc if (clk_cfg->dcfclk_option != dml_use_override_freq) locals->Dcfclk = mode_lib->ms.DCFCLK; else - locals->Dcfclk = clk_cfg->dcfclk_freq_mhz; + locals->Dcfclk = clk_cfg->dcfclk_mhz; #ifdef __DML_VBA_DEBUG__ dml_print_dml_policy(&mode_lib->ms.policy); @@ -8370,7 +8371,7 @@ void dml_core_mode_programming(struct display_mode_lib_st *mode_lib, const struc if (clk_cfg->dispclk_option == dml_use_required_freq) locals->Dispclk = locals->Dispclk_calculated; else if (clk_cfg->dispclk_option == dml_use_override_freq) - locals->Dispclk = clk_cfg->dispclk_freq_mhz; + locals->Dispclk = clk_cfg->dispclk_mhz; else locals->Dispclk = mode_lib->ms.state.dispclk_mhz; #ifdef __DML_VBA_DEBUG__ @@ -8411,7 +8412,7 @@ void dml_core_mode_programming(struct display_mode_lib_st *mode_lib, const struc if (clk_cfg->dppclk_option[k] == dml_use_required_freq) locals->Dppclk[k] = locals->Dppclk_calculated[k]; 
else if (clk_cfg->dppclk_option[k] == dml_use_override_freq) - locals->Dppclk[k] = clk_cfg->dppclk_freq_mhz[k]; + locals->Dppclk[k] = clk_cfg->dppclk_mhz[k]; else locals->Dppclk[k] = mode_lib->ms.state.dppclk_mhz; #ifdef __DML_VBA_DEBUG__ @@ -9189,6 +9190,8 @@ void dml_core_mode_programming(struct display_mode_lib_st *mode_lib, const struc &locals->FractionOfUrgentBandwidth, &s->dummy_boolean[0]); // dml_bool_t *PrefetchBandwidthSupport + + if (s->VRatioPrefetchMoreThanMax != false || s->DestinationLineTimesForPrefetchLessThan2 != false) { dml_print("DML::%s: VRatioPrefetchMoreThanMax = %u\n", __func__, s->VRatioPrefetchMoreThanMax); dml_print("DML::%s: DestinationLineTimesForPrefetchLessThan2 = %u\n", __func__, s->DestinationLineTimesForPrefetchLessThan2); @@ -9203,6 +9206,7 @@ void dml_core_mode_programming(struct display_mode_lib_st *mode_lib, const struc } } + if (locals->PrefetchModeSupported == true && mode_lib->ms.support.ImmediateFlipSupport == true) { locals->BandwidthAvailableForImmediateFlip = CalculateBandwidthAvailableForImmediateFlip( mode_lib->ms.num_active_planes, diff --git a/drivers/gpu/drm/amd/display/dc/dml2/display_mode_core_structs.h b/drivers/gpu/drm/amd/display/dc/dml2/display_mode_core_structs.h index f951936bb579..dd3f43181a6e 100644 --- a/drivers/gpu/drm/amd/display/dc/dml2/display_mode_core_structs.h +++ b/drivers/gpu/drm/amd/display/dc/dml2/display_mode_core_structs.h @@ -28,6 +28,7 @@ #define __DISPLAY_MODE_CORE_STRUCT_H__ #include "display_mode_lib_defines.h" +#include "dml_top_display_cfg_types.h" enum dml_project_id { dml_project_invalid = 0, @@ -49,7 +50,9 @@ enum dml_use_mall_for_pstate_change_mode { dml_use_mall_pstate_change_disable = 0, dml_use_mall_pstate_change_full_frame = 1, dml_use_mall_pstate_change_sub_viewport = 2, - dml_use_mall_pstate_change_phantom_pipe = 3 + dml_use_mall_pstate_change_phantom_pipe = 3, + dml_use_mall_pstate_change_phantom_pipe_no_data_return = 4, + dml_use_mall_pstate_change_imall = 5 }; enum 
dml_use_mall_for_static_screen_mode { dml_use_mall_static_screen_disable = 0, @@ -171,7 +174,11 @@ enum dml_swizzle_mode { dml_sw_256kb_z_x = 28, dml_sw_256kb_s_x = 29, dml_sw_256kb_d_x = 30, - dml_sw_256kb_r_x = 31 + dml_sw_256kb_r_x = 31, + dml_sw_256b_2d = 32, + dml_sw_4kb_2d = 33, + dml_sw_64kb_2d = 34, + dml_sw_256kb_2d = 35 }; enum dml_lb_depth { dml_lb_6 = 0, @@ -223,24 +230,28 @@ enum dml_mpc_use_policy { dml_mpc_disabled = 0, dml_mpc_as_possible = 1, dml_mpc_as_needed_for_voltage = 2, - dml_mpc_as_needed_for_pstate_and_voltage = 3 + dml_mpc_as_needed_for_pstate_and_voltage = 3, + dml_mpc_as_needed = 4, + dml_mpc_2to1 = 5 }; enum dml_odm_use_policy { dml_odm_use_policy_bypass = 0, dml_odm_use_policy_combine_as_needed = 1, dml_odm_use_policy_combine_2to1 = 2, - dml_odm_use_policy_combine_4to1 = 3, - dml_odm_use_policy_split_1to2 = 4, - dml_odm_use_policy_mso_1to2 = 5, - dml_odm_use_policy_mso_1to4 = 6 + dml_odm_use_policy_combine_3to1 = 3, + dml_odm_use_policy_combine_4to1 = 4, + dml_odm_use_policy_split_1to2 = 5, + dml_odm_use_policy_mso_1to2 = 6, + dml_odm_use_policy_mso_1to4 = 7 }; enum dml_odm_mode { dml_odm_mode_bypass = 0, dml_odm_mode_combine_2to1 = 1, - dml_odm_mode_combine_4to1 = 2, - dml_odm_mode_split_1to2 = 3, - dml_odm_mode_mso_1to2 = 4, - dml_odm_mode_mso_1to4 = 5 + dml_odm_mode_combine_3to1 = 2, + dml_odm_mode_combine_4to1 = 3, + dml_odm_mode_split_1to2 = 4, + dml_odm_mode_mso_1to2 = 5, + dml_odm_mode_mso_1to4 = 6 }; enum dml_writeback_configuration { dml_whole_buffer_for_single_stream_no_interleave = 0, @@ -289,6 +300,17 @@ struct soc_state_bounding_box_st { dml_float_t fclk_change_latency_us; dml_float_t usr_retraining_latency_us; dml_bool_t use_ideal_dram_bw_strobe; + dml_float_t g6_temp_read_blackout_us; + + struct { + dml_uint_t urgent_ramp_uclk_cycles; + dml_uint_t trip_to_memory_uclk_cycles; + dml_uint_t meta_trip_to_memory_uclk_cycles; + dml_uint_t maximum_latency_when_urgent_uclk_cycles; + dml_uint_t 
average_latency_when_urgent_uclk_cycles; + dml_uint_t maximum_latency_when_non_urgent_uclk_cycles; + dml_uint_t average_latency_when_non_urgent_uclk_cycles; + } dml_dcn401_uclk_dpm_dependent_soc_qos_params; }; struct soc_bounding_box_st { @@ -297,7 +319,7 @@ struct soc_bounding_box_st { dml_float_t pcierefclk_mhz; dml_float_t refclk_mhz; dml_float_t amclk_mhz; - dml_float_t max_outstanding_reqs; + dml_uint_t max_outstanding_reqs; dml_float_t pct_ideal_sdp_bw_after_urgent; dml_float_t pct_ideal_fabric_bw_after_urgent; dml_float_t pct_ideal_dram_bw_after_urgent_pixel_only; @@ -308,6 +330,16 @@ struct soc_bounding_box_st { dml_float_t max_avg_fabric_bw_use_normal_percent; dml_float_t max_avg_dram_bw_use_normal_percent; dml_float_t max_avg_dram_bw_use_normal_strobe_percent; + + dml_float_t svp_prefetch_pct_ideal_sdp_bw_after_urgent; + dml_float_t svp_prefetch_pct_ideal_fabric_bw_after_urgent; + dml_float_t svp_prefetch_pct_ideal_dram_bw_after_urgent_pixel_only; + dml_float_t svp_prefetch_pct_ideal_dram_bw_after_urgent_pixel_and_vm; + dml_float_t svp_prefetch_pct_ideal_dram_bw_after_urgent_vm_only; + dml_float_t svp_prefetch_max_avg_sdp_bw_use_normal_percent; + dml_float_t svp_prefetch_max_avg_fabric_bw_use_normal_percent; + dml_float_t svp_prefetch_max_avg_dram_bw_use_normal_percent; + dml_uint_t round_trip_ping_latency_dcfclk_cycles; dml_uint_t urgent_out_of_order_return_per_channel_pixel_only_bytes; dml_uint_t urgent_out_of_order_return_per_channel_pixel_and_vm_bytes; @@ -324,6 +356,26 @@ struct soc_bounding_box_st { dml_uint_t mall_allocated_for_dcn_mbytes; dml_float_t dispclk_dppclk_vco_speed_mhz; dml_bool_t do_urgent_latency_adjustment; + + dml_uint_t mem_word_bytes; + dml_uint_t num_dcc_mcaches; + dml_uint_t mcache_size_bytes; + dml_uint_t mcache_line_size_bytes; + + struct { + dml_bool_t UseNewDCN401SOCParameters; + dml_uint_t df_qos_response_time_fclk_cycles; + dml_uint_t max_round_trip_to_furthest_cs_fclk_cycles; + dml_uint_t mall_overhead_fclk_cycles; + 
dml_uint_t meta_trip_adder_fclk_cycles; + dml_uint_t average_transport_distance_fclk_cycles; + dml_float_t umc_urgent_ramp_latency_margin; + dml_float_t umc_max_latency_margin; + dml_float_t umc_average_latency_margin; + dml_float_t fabric_max_transport_latency_margin; + dml_float_t fabric_average_transport_latency_margin; + } dml_dcn401_soc_qos_params; + }; struct ip_params_st { @@ -515,6 +567,10 @@ struct dml_plane_cfg_st { dml_uint_t CursorWidth[__DML_NUM_PLANES__]; dml_uint_t CursorBPP[__DML_NUM_PLANES__]; + dml_bool_t setup_for_tdlut[__DML_NUM_PLANES__]; + enum dml2_tdlut_addressing_mode tdlut_addressing_mode[__DML_NUM_PLANES__]; + enum dml2_tdlut_width_mode tdlut_width_mode[__DML_NUM_PLANES__]; + enum dml_use_mall_for_static_screen_mode UseMALLForStaticScreen[__DML_NUM_PLANES__]; enum dml_use_mall_for_pstate_change_mode UseMALLForPStateChange[__DML_NUM_PLANES__]; @@ -604,6 +660,17 @@ struct dml_hw_resource_st { dml_float_t DLGRefClkFreqMHz; /// dcfclk_option); dml_print("DML: clk_cfg: dispclk_option = %d\n", clk_cfg->dispclk_option); - dml_print("DML: clk_cfg: dcfclk_freq_mhz = %f\n", clk_cfg->dcfclk_freq_mhz); - dml_print("DML: clk_cfg: dispclk_freq_mhz = %f\n", clk_cfg->dispclk_freq_mhz); + dml_print("DML: clk_cfg: dcfclk_mhz = %f\n", clk_cfg->dcfclk_mhz); + dml_print("DML: clk_cfg: dispclk_mhz = %f\n", clk_cfg->dispclk_mhz); for (dml_uint_t i = 0; i < DCN_DML__NUM_PLANE; i++) { dml_print("DML: clk_cfg: i=%d, dppclk_option = %d\n", i, clk_cfg->dppclk_option[i]); - dml_print("DML: clk_cfg: i=%d, dppclk_freq_mhz = %f\n", i, clk_cfg->dppclk_freq_mhz[i]); + dml_print("DML: clk_cfg: i=%d, dppclk_mhz = %f\n", i, clk_cfg->dppclk_mhz[i]); } } diff --git a/drivers/gpu/drm/amd/display/dc/dml2/dml21/dml21_translation_helper.c b/drivers/gpu/drm/amd/display/dc/dml2/dml21/dml21_translation_helper.c index 138b4b1e42ed..b9c6b45f6872 100644 --- a/drivers/gpu/drm/amd/display/dc/dml2/dml21/dml21_translation_helper.c +++ 
b/drivers/gpu/drm/amd/display/dc/dml2/dml21/dml21_translation_helper.c @@ -10,7 +10,6 @@ #include "dml21_utils.h" #include "dml21_translation_helper.h" #include "bounding_boxes/dcn4_soc_bb.h" -#include "bounding_boxes/dcn3_soc_bb.h" static void dml21_init_socbb_params(struct dml2_initialize_instance_in_out *dml_init, const struct dml2_configuration_options *config, @@ -20,10 +19,6 @@ static void dml21_init_socbb_params(struct dml2_initialize_instance_in_out *dml_ const struct dml2_soc_qos_parameters *qos_params; switch (in_dc->ctx->dce_version) { - case DCN_VERSION_3_2: // TODO : Temporary for N-1 validation. Remove this after N-1 validation phase is complete. - soc_bb = &dml2_socbb_dcn31; - qos_params = &dml_dcn31_soc_qos_params; - break; case DCN_VERSION_4_01: default: if (config->bb_from_dmub) @@ -60,9 +55,6 @@ static void dml21_init_ip_params(struct dml2_initialize_instance_in_out *dml_ini const struct dml2_ip_capabilities *ip_caps; switch (in_dc->ctx->dce_version) { - case DCN_VERSION_3_2: // TODO : Temporary for N-1 validation. Remove this after N-1 validation phase is complete. 
- ip_caps = &dml2_dcn31_max_ip_caps; - break; case DCN_VERSION_4_01: default: ip_caps = &dml2_dcn401_max_ip_caps; @@ -302,12 +294,17 @@ void dml21_apply_soc_bb_overrides(struct dml2_initialize_instance_in_out *dml_in dml_soc_bb->power_management_parameters.stutter_exit_latency_us = (in_dc->ctx->dc_bios->bb_info.dram_sr_exit_latency_100ns + 9) / 10; - if (in_dc->ctx->dc_bios->vram_info.num_chans) { + if (dc_bw_params->num_channels) { + dml_clk_table->dram_config.channel_count = dc_bw_params->num_channels; + dml_soc_bb->mall_allocated_for_dcn_mbytes = in_dc->caps.mall_size_total / 1048576; + } else if (in_dc->ctx->dc_bios->vram_info.num_chans) { dml_clk_table->dram_config.channel_count = in_dc->ctx->dc_bios->vram_info.num_chans; dml_soc_bb->mall_allocated_for_dcn_mbytes = in_dc->caps.mall_size_total / 1048576; } - if (in_dc->ctx->dc_bios->vram_info.dram_channel_width_bytes) { + if (dc_bw_params->dram_channel_width_bytes) { + dml_clk_table->dram_config.channel_width_bytes = dc_bw_params->dram_channel_width_bytes; + } else if (in_dc->ctx->dc_bios->vram_info.dram_channel_width_bytes) { dml_clk_table->dram_config.channel_width_bytes = in_dc->ctx->dc_bios->vram_info.dram_channel_width_bytes; } @@ -339,11 +336,22 @@ void dml21_apply_soc_bb_overrides(struct dml2_initialize_instance_in_out *dml_in // } } +static unsigned int calc_max_hardware_v_total(const struct dc_stream_state *stream) +{ + unsigned int max_hw_v_total = stream->ctx->dc->caps.max_v_total; + + if (stream->ctx->dc->caps.vtotal_limited_by_fp2) { + max_hw_v_total -= stream->timing.v_front_porch + 1; + } + + return max_hw_v_total; +} + static void populate_dml21_timing_config_from_stream_state(struct dml2_timing_cfg *timing, struct dc_stream_state *stream, struct dml2_context *dml_ctx) { - unsigned int hblank_start, vblank_start; + unsigned int hblank_start, vblank_start, min_hardware_refresh_in_uhz; timing->h_active = stream->timing.h_addressable + stream->timing.h_border_left + stream->timing.h_border_right; 
timing->v_active = stream->timing.v_addressable + stream->timing.v_border_bottom + stream->timing.v_border_top; @@ -371,11 +379,23 @@ static void populate_dml21_timing_config_from_stream_state(struct dml2_timing_cf - stream->timing.v_border_top - stream->timing.v_border_bottom; timing->drr_config.enabled = stream->ignore_msa_timing_param; - timing->drr_config.min_refresh_uhz = stream->timing.min_refresh_in_uhz; timing->drr_config.drr_active_variable = stream->vrr_active_variable; timing->drr_config.drr_active_fixed = stream->vrr_active_fixed; timing->drr_config.disallowed = !stream->allow_freesync; + /* limit min refresh rate to DC cap */ + min_hardware_refresh_in_uhz = stream->timing.min_refresh_in_uhz; + if (stream->ctx->dc->caps.max_v_total != 0) { + min_hardware_refresh_in_uhz = div64_u64((stream->timing.pix_clk_100hz * 100000000ULL), + (stream->timing.h_total * (long long)calc_max_hardware_v_total(stream))); + } + + if (stream->timing.min_refresh_in_uhz > min_hardware_refresh_in_uhz) { + timing->drr_config.min_refresh_uhz = stream->timing.min_refresh_in_uhz; + } else { + timing->drr_config.min_refresh_uhz = min_hardware_refresh_in_uhz; + } + if (dml_ctx->config.callbacks.get_max_flickerless_instant_vtotal_increase && stream->ctx->dc->config.enable_fpo_flicker_detection == 1) timing->drr_config.max_instant_vtotal_delta = dml_ctx->config.callbacks.get_max_flickerless_instant_vtotal_increase(stream, false); @@ -422,6 +442,21 @@ static void populate_dml21_timing_config_from_stream_state(struct dml2_timing_cf timing->vblank_nom = timing->v_total - timing->v_active; } +/** + * adjust_dml21_hblank_timing_config_from_pipe_ctx - Adjusts the horizontal blanking timing configuration + * based on the pipe context. + * @timing: Pointer to the dml2_timing_cfg structure to be adjusted. + * @pipe: Pointer to the pipe_ctx structure containing the horizontal blanking borrow value. 
+ * + * This function modifies the horizontal active and blank end timings by adding and subtracting + * the horizontal blanking borrow value from the pipe context, respectively. + */ +static void adjust_dml21_hblank_timing_config_from_pipe_ctx(struct dml2_timing_cfg *timing, struct pipe_ctx *pipe) +{ + timing->h_active += pipe->hblank_borrow; + timing->h_blank_end -= pipe->hblank_borrow; +} + static void populate_dml21_output_config_from_stream_state(struct dml2_link_output_cfg *output, struct dc_stream_state *stream, const struct pipe_ctx *pipe) { @@ -683,11 +718,21 @@ static void populate_dml21_surface_config_from_plane_state( surface->dcc.informative.fraction_of_zero_size_request_plane1 = plane_state->dcc.independent_64b_blks_c; surface->dcc.plane0.pitch = plane_state->dcc.meta_pitch; surface->dcc.plane1.pitch = plane_state->dcc.meta_pitch_c; - if (in_dc->ctx->dce_version < DCN_VERSION_4_01) { - /* needed for N-1 testing */ + + // Update swizzle / array mode based on the gfx_format + switch (plane_state->tiling_info.gfxversion) { + case DcGfxVersion7: + case DcGfxVersion8: + // Placeholder for programming the array_mode + break; + case DcGfxVersion9: + case DcGfxVersion10: + case DcGfxVersion11: surface->tiling = gfx9_to_dml2_swizzle_mode(plane_state->tiling_info.gfx9.swizzle); - } else { + break; + case DcGfxAddr3: surface->tiling = gfx_addr3_to_dml2_swizzle_mode(plane_state->tiling_info.gfx_addr3.swizzle); + break; } } @@ -709,6 +754,7 @@ static const struct scaler_data *get_scaler_data_for_plane( temp_pipe->plane_state = pipe->plane_state; temp_pipe->plane_res.scl_data.taps = pipe->plane_res.scl_data.taps; temp_pipe->stream_res = pipe->stream_res; + temp_pipe->hblank_borrow = pipe->hblank_borrow; dml_ctx->config.callbacks.build_scaling_params(temp_pipe); break; } @@ -973,6 +1019,7 @@ bool dml21_map_dc_state_into_dml_display_cfg(const struct dc *in_dc, struct dc_s ASSERT(disp_cfg_stream_location >= 0 && disp_cfg_stream_location <= 
__DML2_WRAPPER_MAX_STREAMS_PLANES__); populate_dml21_timing_config_from_stream_state(&dml_dispcfg->stream_descriptors[disp_cfg_stream_location].timing, context->streams[stream_index], dml_ctx); + adjust_dml21_hblank_timing_config_from_pipe_ctx(&dml_dispcfg->stream_descriptors[disp_cfg_stream_location].timing, &context->res_ctx.pipe_ctx[stream_index]); populate_dml21_output_config_from_stream_state(&dml_dispcfg->stream_descriptors[disp_cfg_stream_location].output, context->streams[stream_index], &context->res_ctx.pipe_ctx[stream_index]); populate_dml21_stream_overrides_from_stream_state(&dml_dispcfg->stream_descriptors[disp_cfg_stream_location], context->streams[stream_index]); @@ -1037,28 +1084,8 @@ void dml21_copy_clocks_to_dc_state(struct dml2_context *in_ctx, struct dc_state context->bw_ctx.bw.dcn.clk.dtbclk_en = in_ctx->v21.mode_programming.programming->min_clocks.dcn4x.dtbrefclk_khz > 0; context->bw_ctx.bw.dcn.clk.ref_dtbclk_khz = in_ctx->v21.mode_programming.programming->min_clocks.dcn4x.dtbrefclk_khz; context->bw_ctx.bw.dcn.clk.socclk_khz = in_ctx->v21.mode_programming.programming->min_clocks.dcn4x.socclk_khz; -} - -void dml21_extract_legacy_watermark_set(const struct dc *in_dc, struct dcn_watermarks *watermark, enum dml2_dchub_watermark_reg_set_index reg_set_idx, struct dml2_context *in_ctx) -{ - struct dml2_core_internal_display_mode_lib *mode_lib = &in_ctx->v21.dml_init.dml2_instance->core_instance.clean_me_up.mode_lib; - double refclk_freq_in_mhz = (in_ctx->v21.display_config.overrides.hw.dlg_ref_clk_mhz > 0) ? 
(double)in_ctx->v21.display_config.overrides.hw.dlg_ref_clk_mhz : mode_lib->soc.dchub_refclk_mhz; - - if (reg_set_idx >= DML2_DCHUB_WATERMARK_SET_NUM) { - /* invalid register set index */ - return; - } - - /* convert to legacy format (time in ns) */ - watermark->urgent_ns = ((double)in_ctx->v21.mode_programming.programming->global_regs.wm_regs[reg_set_idx].urgent / refclk_freq_in_mhz) * 1000.0; - watermark->pte_meta_urgent_ns = ((double)in_ctx->v21.mode_programming.programming->global_regs.wm_regs[reg_set_idx].urgent / refclk_freq_in_mhz) * 1000.0; - watermark->cstate_pstate.cstate_enter_plus_exit_ns = ((double)in_ctx->v21.mode_programming.programming->global_regs.wm_regs[reg_set_idx].sr_enter / refclk_freq_in_mhz) * 1000.0; - watermark->cstate_pstate.cstate_exit_ns = ((double)in_ctx->v21.mode_programming.programming->global_regs.wm_regs[reg_set_idx].sr_exit / refclk_freq_in_mhz) * 1000.0; - watermark->cstate_pstate.pstate_change_ns = ((double)in_ctx->v21.mode_programming.programming->global_regs.wm_regs[reg_set_idx].uclk_pstate / refclk_freq_in_mhz) * 1000.0; - watermark->urgent_latency_ns = ((double)in_ctx->v21.mode_programming.programming->global_regs.wm_regs[reg_set_idx].urgent / refclk_freq_in_mhz) * 1000.0; - watermark->cstate_pstate.fclk_pstate_change_ns = ((double)in_ctx->v21.mode_programming.programming->global_regs.wm_regs[reg_set_idx].fclk_pstate / refclk_freq_in_mhz) * 1000.0; - watermark->frac_urg_bw_flip = in_ctx->v21.mode_programming.programming->global_regs.wm_regs[reg_set_idx].frac_urg_bw_flip; - watermark->frac_urg_bw_nom = in_ctx->v21.mode_programming.programming->global_regs.wm_regs[reg_set_idx].frac_urg_bw_nom; + context->bw_ctx.bw.dcn.clk.subvp_prefetch_dramclk_khz = in_ctx->v21.mode_programming.programming->min_clocks.dcn4x.svp_prefetch_no_throttle.uclk_khz; + context->bw_ctx.bw.dcn.clk.subvp_prefetch_fclk_khz = in_ctx->v21.mode_programming.programming->min_clocks.dcn4x.svp_prefetch_no_throttle.fclk_khz; } static struct 
dml2_dchub_watermark_regs *wm_set_index_to_dc_wm_set(union dcn_watermark_set *watermarks, const enum dml2_dchub_watermark_reg_set_index wm_index) @@ -1104,53 +1131,6 @@ void dml21_extract_watermark_sets(const struct dc *in_dc, union dcn_watermark_se } } - -void dml21_populate_pipe_ctx_dlg_params(struct dml2_context *dml_ctx, struct dc_state *context, struct pipe_ctx *pipe_ctx, struct dml2_per_stream_programming *stream_programming) -{ - unsigned int hactive, vactive, hblank_start, vblank_start, hblank_end, vblank_end; - struct dc_crtc_timing *timing = &pipe_ctx->stream->timing; - union dml2_global_sync_programming *global_sync = &stream_programming->global_sync; - - hactive = timing->h_addressable + timing->h_border_left + timing->h_border_right; - vactive = timing->v_addressable + timing->v_border_bottom + timing->v_border_top; - hblank_start = pipe_ctx->stream->timing.h_total - pipe_ctx->stream->timing.h_front_porch; - vblank_start = pipe_ctx->stream->timing.v_total - pipe_ctx->stream->timing.v_front_porch; - - hblank_end = hblank_start - timing->h_addressable - timing->h_border_left - timing->h_border_right; - vblank_end = vblank_start - timing->v_addressable - timing->v_border_top - timing->v_border_bottom; - - if (dml_ctx->config.svp_pstate.callbacks.get_pipe_subvp_type(context, pipe_ctx) == SUBVP_PHANTOM) { - /* phantom has its own global sync */ - global_sync = &stream_programming->phantom_stream.global_sync; - } - - pipe_ctx->pipe_dlg_param.vstartup_start = global_sync->dcn4x.vstartup_lines; - pipe_ctx->pipe_dlg_param.vupdate_offset = global_sync->dcn4x.vupdate_offset_pixels; - pipe_ctx->pipe_dlg_param.vupdate_width = global_sync->dcn4x.vupdate_vupdate_width_pixels; - pipe_ctx->pipe_dlg_param.vready_offset = global_sync->dcn4x.vready_offset_pixels; - pipe_ctx->pipe_dlg_param.pstate_keepout = global_sync->dcn4x.pstate_keepout_start_lines; - - pipe_ctx->pipe_dlg_param.otg_inst = pipe_ctx->stream_res.tg->inst; - - pipe_ctx->pipe_dlg_param.hactive = hactive; - 
pipe_ctx->pipe_dlg_param.vactive = vactive; - pipe_ctx->pipe_dlg_param.htotal = pipe_ctx->stream->timing.h_total; - pipe_ctx->pipe_dlg_param.vtotal = pipe_ctx->stream->timing.v_total; - pipe_ctx->pipe_dlg_param.hblank_end = hblank_end; - pipe_ctx->pipe_dlg_param.vblank_end = vblank_end; - pipe_ctx->pipe_dlg_param.hblank_start = hblank_start; - pipe_ctx->pipe_dlg_param.vblank_start = vblank_start; - pipe_ctx->pipe_dlg_param.vfront_porch = pipe_ctx->stream->timing.v_front_porch; - pipe_ctx->pipe_dlg_param.pixel_rate_mhz = pipe_ctx->stream->timing.pix_clk_100hz / 10000.00; - pipe_ctx->pipe_dlg_param.refresh_rate = ((timing->pix_clk_100hz * 100) / timing->h_total) / timing->v_total; - pipe_ctx->pipe_dlg_param.vtotal_max = pipe_ctx->stream->adjust.v_total_max; - pipe_ctx->pipe_dlg_param.vtotal_min = pipe_ctx->stream->adjust.v_total_min; - pipe_ctx->pipe_dlg_param.recout_height = pipe_ctx->plane_res.scl_data.recout.height; - pipe_ctx->pipe_dlg_param.recout_width = pipe_ctx->plane_res.scl_data.recout.width; - pipe_ctx->pipe_dlg_param.full_recout_height = pipe_ctx->plane_res.scl_data.recout.height; - pipe_ctx->pipe_dlg_param.full_recout_width = pipe_ctx->plane_res.scl_data.recout.width; -} - void dml21_map_hw_resources(struct dml2_context *dml_ctx) { unsigned int i = 0; @@ -1186,22 +1166,22 @@ void dml21_set_dc_p_state_type( bool sub_vp_enabled) { switch (stream_programming->uclk_pstate_method) { - case dml2_uclk_pstate_support_method_vactive: - case dml2_uclk_pstate_support_method_fw_vactive_drr: + case dml2_pstate_method_vactive: + case dml2_pstate_method_fw_vactive_drr: pipe_ctx->p_state_type = P_STATE_V_ACTIVE; break; - case dml2_uclk_pstate_support_method_vblank: - case dml2_uclk_pstate_support_method_fw_vblank_drr: + case dml2_pstate_method_vblank: + case dml2_pstate_method_fw_vblank_drr: if (sub_vp_enabled) pipe_ctx->p_state_type = P_STATE_V_BLANK_SUB_VP; else pipe_ctx->p_state_type = P_STATE_V_BLANK; break; - case dml2_uclk_pstate_support_method_fw_subvp_phantom: - 
case dml2_uclk_pstate_support_method_fw_subvp_phantom_drr: + case dml2_pstate_method_fw_svp: + case dml2_pstate_method_fw_svp_drr: pipe_ctx->p_state_type = P_STATE_SUB_VP; break; - case dml2_uclk_pstate_support_method_fw_drr: + case dml2_pstate_method_fw_drr: if (sub_vp_enabled) pipe_ctx->p_state_type = P_STATE_DRR_SUB_VP; else diff --git a/drivers/gpu/drm/amd/display/dc/dml2/dml21/dml21_translation_helper.h b/drivers/gpu/drm/amd/display/dc/dml2/dml21/dml21_translation_helper.h index 476a7f6e4875..069b939c672a 100644 --- a/drivers/gpu/drm/amd/display/dc/dml2/dml21/dml21_translation_helper.h +++ b/drivers/gpu/drm/amd/display/dc/dml2/dml21/dml21_translation_helper.h @@ -21,8 +21,6 @@ void dml21_initialize_soc_bb_params(struct dml2_initialize_instance_in_out *dml_ void dml21_initialize_ip_params(struct dml2_initialize_instance_in_out *dml_init, const struct dml2_configuration_options *config, const struct dc *in_dc); bool dml21_map_dc_state_into_dml_display_cfg(const struct dc *in_dc, struct dc_state *context, struct dml2_context *dml_ctx); void dml21_copy_clocks_to_dc_state(struct dml2_context *in_ctx, struct dc_state *context); -void dml21_populate_pipe_ctx_dlg_params(struct dml2_context *dml_ctx, struct dc_state *context, struct pipe_ctx *pipe_ctx, struct dml2_per_stream_programming *stream_programming); -void dml21_extract_legacy_watermark_set(const struct dc *in_dc, struct dcn_watermarks *watermark, enum dml2_dchub_watermark_reg_set_index reg_set_idx, struct dml2_context *in_ctx); void dml21_extract_watermark_sets(const struct dc *in_dc, union dcn_watermark_set *watermarks, struct dml2_context *in_ctx); void dml21_map_hw_resources(struct dml2_context *dml_ctx); void dml21_get_pipe_mcache_config(struct dc_state *context, struct pipe_ctx *pipe_ctx, struct dml2_per_plane_programming *pln_prog, struct dml2_pipe_configuration_descriptor *mcache_pipe_config); diff --git a/drivers/gpu/drm/amd/display/dc/dml2/dml21/dml21_utils.c 
b/drivers/gpu/drm/amd/display/dc/dml2/dml21/dml21_utils.c index 51d491bffa32..1e56d995cd0e 100644 --- a/drivers/gpu/drm/amd/display/dc/dml2/dml21/dml21_utils.c +++ b/drivers/gpu/drm/amd/display/dc/dml2/dml21/dml21_utils.c @@ -142,108 +142,21 @@ int dml21_find_dc_pipes_for_plane(const struct dc *in_dc, return num_pipes; } - -void dml21_update_pipe_ctx_dchub_regs(struct dml2_display_rq_regs *rq_regs, - struct dml2_display_dlg_regs *disp_dlg_regs, - struct dml2_display_ttu_regs *disp_ttu_regs, - struct pipe_ctx *out) +void dml21_pipe_populate_global_sync(struct dml2_context *dml_ctx, + struct dc_state *context, + struct pipe_ctx *pipe_ctx, + struct dml2_per_stream_programming *stream_programming) { - memset(&out->rq_regs, 0, sizeof(out->rq_regs)); - out->rq_regs.rq_regs_l.chunk_size = rq_regs->rq_regs_l.chunk_size; - out->rq_regs.rq_regs_l.min_chunk_size = rq_regs->rq_regs_l.min_chunk_size; - //out->rq_regs.rq_regs_l.meta_chunk_size = rq_regs->rq_regs_l.meta_chunk_size; - //out->rq_regs.rq_regs_l.min_meta_chunk_size = rq_regs->rq_regs_l.min_meta_chunk_size; - out->rq_regs.rq_regs_l.dpte_group_size = rq_regs->rq_regs_l.dpte_group_size; - out->rq_regs.rq_regs_l.mpte_group_size = rq_regs->rq_regs_l.mpte_group_size; - out->rq_regs.rq_regs_l.swath_height = rq_regs->rq_regs_l.swath_height; - out->rq_regs.rq_regs_l.pte_row_height_linear = rq_regs->rq_regs_l.pte_row_height_linear; + union dml2_global_sync_programming *global_sync = &stream_programming->global_sync; - out->rq_regs.rq_regs_c.chunk_size = rq_regs->rq_regs_c.chunk_size; - out->rq_regs.rq_regs_c.min_chunk_size = rq_regs->rq_regs_c.min_chunk_size; - //out->rq_regs.rq_regs_c.meta_chunk_size = rq_regs->rq_regs_c.meta_chunk_size; - //out->rq_regs.rq_regs_c.min_meta_chunk_size = rq_regs->rq_regs_c.min_meta_chunk_size; - out->rq_regs.rq_regs_c.dpte_group_size = rq_regs->rq_regs_c.dpte_group_size; - out->rq_regs.rq_regs_c.mpte_group_size = rq_regs->rq_regs_c.mpte_group_size; - out->rq_regs.rq_regs_c.swath_height = 
rq_regs->rq_regs_c.swath_height; - out->rq_regs.rq_regs_c.pte_row_height_linear = rq_regs->rq_regs_c.pte_row_height_linear; + if (dml_ctx->config.svp_pstate.callbacks.get_pipe_subvp_type(context, pipe_ctx) == SUBVP_PHANTOM) { + /* phantom has its own global sync */ + global_sync = &stream_programming->phantom_stream.global_sync; + } - out->rq_regs.drq_expansion_mode = rq_regs->drq_expansion_mode; - out->rq_regs.prq_expansion_mode = rq_regs->prq_expansion_mode; - //out->rq_regs.mrq_expansion_mode = rq_regs->mrq_expansion_mode; - out->rq_regs.crq_expansion_mode = rq_regs->crq_expansion_mode; - out->rq_regs.plane1_base_address = rq_regs->plane1_base_address; - out->unbounded_req = rq_regs->unbounded_request_enabled; - - memset(&out->dlg_regs, 0, sizeof(out->dlg_regs)); - out->dlg_regs.refcyc_h_blank_end = disp_dlg_regs->refcyc_h_blank_end; - out->dlg_regs.dlg_vblank_end = disp_dlg_regs->dlg_vblank_end; - out->dlg_regs.min_dst_y_next_start = disp_dlg_regs->min_dst_y_next_start; - out->dlg_regs.refcyc_per_htotal = disp_dlg_regs->refcyc_per_htotal; - out->dlg_regs.refcyc_x_after_scaler = disp_dlg_regs->refcyc_x_after_scaler; - out->dlg_regs.dst_y_after_scaler = disp_dlg_regs->dst_y_after_scaler; - out->dlg_regs.dst_y_prefetch = disp_dlg_regs->dst_y_prefetch; - out->dlg_regs.dst_y_per_vm_vblank = disp_dlg_regs->dst_y_per_vm_vblank; - out->dlg_regs.dst_y_per_row_vblank = disp_dlg_regs->dst_y_per_row_vblank; - out->dlg_regs.dst_y_per_vm_flip = disp_dlg_regs->dst_y_per_vm_flip; - out->dlg_regs.dst_y_per_row_flip = disp_dlg_regs->dst_y_per_row_flip; - out->dlg_regs.ref_freq_to_pix_freq = disp_dlg_regs->ref_freq_to_pix_freq; - out->dlg_regs.vratio_prefetch = disp_dlg_regs->vratio_prefetch; - out->dlg_regs.vratio_prefetch_c = disp_dlg_regs->vratio_prefetch_c; - out->dlg_regs.refcyc_per_tdlut_group = disp_dlg_regs->refcyc_per_tdlut_group; - out->dlg_regs.refcyc_per_pte_group_vblank_l = disp_dlg_regs->refcyc_per_pte_group_vblank_l; - out->dlg_regs.refcyc_per_pte_group_vblank_c = 
disp_dlg_regs->refcyc_per_pte_group_vblank_c; - //out->dlg_regs.refcyc_per_meta_chunk_vblank_l = disp_dlg_regs->refcyc_per_meta_chunk_vblank_l; - //out->dlg_regs.refcyc_per_meta_chunk_vblank_c = disp_dlg_regs->refcyc_per_meta_chunk_vblank_c; - out->dlg_regs.refcyc_per_pte_group_flip_l = disp_dlg_regs->refcyc_per_pte_group_flip_l; - out->dlg_regs.refcyc_per_pte_group_flip_c = disp_dlg_regs->refcyc_per_pte_group_flip_c; - //out->dlg_regs.refcyc_per_meta_chunk_flip_l = disp_dlg_regs->refcyc_per_meta_chunk_flip_l; - //out->dlg_regs.refcyc_per_meta_chunk_flip_c = disp_dlg_regs->refcyc_per_meta_chunk_flip_c; - out->dlg_regs.dst_y_per_pte_row_nom_l = disp_dlg_regs->dst_y_per_pte_row_nom_l; - out->dlg_regs.dst_y_per_pte_row_nom_c = disp_dlg_regs->dst_y_per_pte_row_nom_c; - out->dlg_regs.refcyc_per_pte_group_nom_l = disp_dlg_regs->refcyc_per_pte_group_nom_l; - out->dlg_regs.refcyc_per_pte_group_nom_c = disp_dlg_regs->refcyc_per_pte_group_nom_c; - //out->dlg_regs.dst_y_per_meta_row_nom_l = disp_dlg_regs->dst_y_per_meta_row_nom_l; - //out->dlg_regs.dst_y_per_meta_row_nom_c = disp_dlg_regs->dst_y_per_meta_row_nom_c; - //out->dlg_regs.refcyc_per_meta_chunk_nom_l = disp_dlg_regs->refcyc_per_meta_chunk_nom_l; - //out->dlg_regs.refcyc_per_meta_chunk_nom_c = disp_dlg_regs->refcyc_per_meta_chunk_nom_c; - out->dlg_regs.refcyc_per_line_delivery_pre_l = disp_dlg_regs->refcyc_per_line_delivery_pre_l; - out->dlg_regs.refcyc_per_line_delivery_pre_c = disp_dlg_regs->refcyc_per_line_delivery_pre_c; - out->dlg_regs.refcyc_per_line_delivery_l = disp_dlg_regs->refcyc_per_line_delivery_l; - out->dlg_regs.refcyc_per_line_delivery_c = disp_dlg_regs->refcyc_per_line_delivery_c; - out->dlg_regs.refcyc_per_vm_group_vblank = disp_dlg_regs->refcyc_per_vm_group_vblank; - out->dlg_regs.refcyc_per_vm_group_flip = disp_dlg_regs->refcyc_per_vm_group_flip; - out->dlg_regs.refcyc_per_vm_req_vblank = disp_dlg_regs->refcyc_per_vm_req_vblank; - out->dlg_regs.refcyc_per_vm_req_flip = 
disp_dlg_regs->refcyc_per_vm_req_flip; - out->dlg_regs.dst_y_offset_cur0 = disp_dlg_regs->dst_y_offset_cur0; - out->dlg_regs.chunk_hdl_adjust_cur0 = disp_dlg_regs->chunk_hdl_adjust_cur0; - //out->dlg_regs.dst_y_offset_cur1 = disp_dlg_regs->dst_y_offset_cur1; - //out->dlg_regs.chunk_hdl_adjust_cur1 = disp_dlg_regs->chunk_hdl_adjust_cur1; - out->dlg_regs.vready_after_vcount0 = disp_dlg_regs->vready_after_vcount0; - out->dlg_regs.dst_y_delta_drq_limit = disp_dlg_regs->dst_y_delta_drq_limit; - out->dlg_regs.refcyc_per_vm_dmdata = disp_dlg_regs->refcyc_per_vm_dmdata; - out->dlg_regs.dmdata_dl_delta = disp_dlg_regs->dmdata_dl_delta; - - memset(&out->ttu_regs, 0, sizeof(out->ttu_regs)); - out->ttu_regs.qos_level_low_wm = disp_ttu_regs->qos_level_low_wm; - out->ttu_regs.qos_level_high_wm = disp_ttu_regs->qos_level_high_wm; - out->ttu_regs.min_ttu_vblank = disp_ttu_regs->min_ttu_vblank; - out->ttu_regs.qos_level_flip = disp_ttu_regs->qos_level_flip; - out->ttu_regs.refcyc_per_req_delivery_l = disp_ttu_regs->refcyc_per_req_delivery_l; - out->ttu_regs.refcyc_per_req_delivery_c = disp_ttu_regs->refcyc_per_req_delivery_c; - out->ttu_regs.refcyc_per_req_delivery_cur0 = disp_ttu_regs->refcyc_per_req_delivery_cur0; - //out->ttu_regs.refcyc_per_req_delivery_cur1 = disp_ttu_regs->refcyc_per_req_delivery_cur1; - out->ttu_regs.refcyc_per_req_delivery_pre_l = disp_ttu_regs->refcyc_per_req_delivery_pre_l; - out->ttu_regs.refcyc_per_req_delivery_pre_c = disp_ttu_regs->refcyc_per_req_delivery_pre_c; - out->ttu_regs.refcyc_per_req_delivery_pre_cur0 = disp_ttu_regs->refcyc_per_req_delivery_pre_cur0; - //out->ttu_regs.refcyc_per_req_delivery_pre_cur1 = disp_ttu_regs->refcyc_per_req_delivery_pre_cur1; - out->ttu_regs.qos_level_fixed_l = disp_ttu_regs->qos_level_fixed_l; - out->ttu_regs.qos_level_fixed_c = disp_ttu_regs->qos_level_fixed_c; - out->ttu_regs.qos_level_fixed_cur0 = disp_ttu_regs->qos_level_fixed_cur0; - //out->ttu_regs.qos_level_fixed_cur1 = disp_ttu_regs->qos_level_fixed_cur1; - 
out->ttu_regs.qos_ramp_disable_l = disp_ttu_regs->qos_ramp_disable_l; - out->ttu_regs.qos_ramp_disable_c = disp_ttu_regs->qos_ramp_disable_c; - out->ttu_regs.qos_ramp_disable_cur0 = disp_ttu_regs->qos_ramp_disable_cur0; - //out->ttu_regs.qos_ramp_disable_cur1 = disp_ttu_regs->qos_ramp_disable_cur1; + memcpy(&pipe_ctx->global_sync, + global_sync, + sizeof(union dml2_global_sync_programming)); } void dml21_populate_mall_allocation_size(struct dc_state *context, @@ -301,28 +214,16 @@ void dml21_program_dc_pipe(struct dml2_context *dml_ctx, struct dc_state *contex { unsigned int pipe_reg_index = 0; - dml21_populate_pipe_ctx_dlg_params(dml_ctx, context, pipe_ctx, stream_prog); + dml21_pipe_populate_global_sync(dml_ctx, context, pipe_ctx, stream_prog); find_pipe_regs_idx(dml_ctx, pipe_ctx, &pipe_reg_index); if (dml_ctx->config.svp_pstate.callbacks.get_pipe_subvp_type(context, pipe_ctx) == SUBVP_PHANTOM) { memcpy(&pipe_ctx->hubp_regs, pln_prog->phantom_plane.pipe_regs[pipe_reg_index], sizeof(struct dml2_dchub_per_pipe_register_set)); pipe_ctx->unbounded_req = false; - - /* legacy only, should be removed later */ - dml21_update_pipe_ctx_dchub_regs(&pln_prog->phantom_plane.pipe_regs[pipe_reg_index]->rq_regs, - &pln_prog->phantom_plane.pipe_regs[pipe_reg_index]->dlg_regs, - &pln_prog->phantom_plane.pipe_regs[pipe_reg_index]->ttu_regs, pipe_ctx); - pipe_ctx->det_buffer_size_kb = 0; } else { memcpy(&pipe_ctx->hubp_regs, pln_prog->pipe_regs[pipe_reg_index], sizeof(struct dml2_dchub_per_pipe_register_set)); pipe_ctx->unbounded_req = pln_prog->pipe_regs[pipe_reg_index]->rq_regs.unbounded_request_enabled; - - /* legacy only, should be removed later */ - dml21_update_pipe_ctx_dchub_regs(&pln_prog->pipe_regs[pipe_reg_index]->rq_regs, - &pln_prog->pipe_regs[pipe_reg_index]->dlg_regs, - &pln_prog->pipe_regs[pipe_reg_index]->ttu_regs, pipe_ctx); - pipe_ctx->det_buffer_size_kb = pln_prog->pipe_regs[pipe_reg_index]->det_size * 64; } @@ -482,7 +383,8 @@ void 
dml21_build_fams2_programming(const struct dc *dc, unsigned int num_fams2_streams = 0; /* reset fams2 data */ - memset(&context->bw_ctx.bw.dcn.fams2_stream_params, 0, sizeof(struct dmub_fams2_stream_static_state) * DML2_MAX_PLANES); + memset(&context->bw_ctx.bw.dcn.fams2_stream_base_params, 0, sizeof(union dmub_cmd_fams2_config) * DML2_MAX_PLANES); + memset(&context->bw_ctx.bw.dcn.fams2_stream_sub_params, 0, sizeof(union dmub_cmd_fams2_config) * DML2_MAX_PLANES); memset(&context->bw_ctx.bw.dcn.fams2_global_config, 0, sizeof(struct dmub_cmd_fams2_global_config)); if (dml_ctx->v21.mode_programming.programming->fams2_required) { @@ -490,8 +392,10 @@ void dml21_build_fams2_programming(const struct dc *dc, int dml_stream_idx; struct dc_stream_state *phantom_stream; struct dc_stream_status *phantom_status; + enum fams2_stream_type type = 0; - struct dmub_fams2_stream_static_state *static_state = &context->bw_ctx.bw.dcn.fams2_stream_params[num_fams2_streams]; + union dmub_cmd_fams2_config *static_base_state = &context->bw_ctx.bw.dcn.fams2_stream_base_params[num_fams2_streams]; + union dmub_cmd_fams2_config *static_sub_state = &context->bw_ctx.bw.dcn.fams2_stream_sub_params[num_fams2_streams]; struct dc_stream_state *stream = context->streams[i]; @@ -508,28 +412,38 @@ void dml21_build_fams2_programming(const struct dc *dc, } /* copy static state from PMO */ - memcpy(static_state, - &dml_ctx->v21.mode_programming.programming->stream_programming[dml_stream_idx].fams2_params, - sizeof(struct dmub_fams2_stream_static_state)); + memcpy(static_base_state, + &dml_ctx->v21.mode_programming.programming->stream_programming[dml_stream_idx].fams2_base_params, + sizeof(union dmub_cmd_fams2_config)); + memcpy(static_sub_state, + &dml_ctx->v21.mode_programming.programming->stream_programming[dml_stream_idx].fams2_sub_params, + sizeof(union dmub_cmd_fams2_config)); - /* get information from context */ - static_state->num_planes = context->stream_status[i].plane_count; - 
static_state->otg_inst = context->stream_status[i].primary_otg_inst; + switch (dc->debug.fams_version.minor) { + case 1: + default: + type = static_base_state->stream_v1.base.type; - /* populate pipe masks for planes */ - for (j = 0; j < context->stream_status[i].plane_count; j++) { - for (k = 0; k < dc->res_pool->pipe_count; k++) { - if (context->res_ctx.pipe_ctx[k].stream && - context->res_ctx.pipe_ctx[k].stream->stream_id == stream->stream_id && - context->res_ctx.pipe_ctx[k].plane_state == context->stream_status[i].plane_states[j]) { - static_state->pipe_mask |= (1 << k); - static_state->plane_pipe_masks[j] |= (1 << k); + /* get information from context */ + static_base_state->stream_v1.base.num_planes = context->stream_status[i].plane_count; + static_base_state->stream_v1.base.otg_inst = context->stream_status[i].primary_otg_inst; + + /* populate pipe masks for planes */ + for (j = 0; j < context->stream_status[i].plane_count; j++) { + for (k = 0; k < dc->res_pool->pipe_count; k++) { + if (context->res_ctx.pipe_ctx[k].stream && + context->res_ctx.pipe_ctx[k].stream->stream_id == stream->stream_id && + context->res_ctx.pipe_ctx[k].plane_state == context->stream_status[i].plane_states[j]) { + static_base_state->stream_v1.base.pipe_mask |= (1 << k); + static_base_state->stream_v1.base.plane_pipe_masks[j] |= (1 << k); + } } } } + /* get per method programming */ - switch (static_state->type) { + switch (type) { case FAMS2_STREAM_TYPE_VBLANK: case FAMS2_STREAM_TYPE_VACTIVE: case FAMS2_STREAM_TYPE_DRR: @@ -543,16 +457,27 @@ void dml21_build_fams2_programming(const struct dc *dc, /* phantom status should always be present */ ASSERT(phantom_status); - static_state->sub_state.subvp.phantom_otg_inst = phantom_status->primary_otg_inst; + if (!phantom_status) + break; - /* populate pipe masks for phantom planes */ - for (j = 0; j < phantom_status->plane_count; j++) { - for (k = 0; k < dc->res_pool->pipe_count; k++) { - if (context->res_ctx.pipe_ctx[k].stream && - 
context->res_ctx.pipe_ctx[k].stream->stream_id == phantom_stream->stream_id && - context->res_ctx.pipe_ctx[k].plane_state == phantom_status->plane_states[j]) { - static_state->sub_state.subvp.phantom_pipe_mask |= (1 << k); - static_state->sub_state.subvp.phantom_plane_pipe_masks[j] |= (1 << k); + switch (dc->debug.fams_version.minor) { + case 1: + default: + static_sub_state->stream_v1.sub_state.subvp.phantom_otg_inst = phantom_status->primary_otg_inst; + + /* populate pipe masks for phantom planes */ + for (j = 0; j < phantom_status->plane_count; j++) { + for (k = 0; k < dc->res_pool->pipe_count; k++) { + if (context->res_ctx.pipe_ctx[k].stream && + context->res_ctx.pipe_ctx[k].stream->stream_id == phantom_stream->stream_id && + context->res_ctx.pipe_ctx[k].plane_state == phantom_status->plane_states[j]) { + switch (dc->debug.fams_version.minor) { + case 1: + default: + static_sub_state->stream_v1.sub_state.subvp.phantom_pipe_mask |= (1 << k); + static_sub_state->stream_v1.sub_state.subvp.phantom_plane_pipe_masks[j] |= (1 << k); + } + } } } } diff --git a/drivers/gpu/drm/amd/display/dc/dml2/dml21/dml21_utils.h b/drivers/gpu/drm/amd/display/dc/dml2/dml21/dml21_utils.h index d5153fbac921..4bff52eaaef8 100644 --- a/drivers/gpu/drm/amd/display/dc/dml2/dml21/dml21_utils.h +++ b/drivers/gpu/drm/amd/display/dc/dml2/dml21/dml21_utils.h @@ -18,10 +18,10 @@ struct dml2_display_ttu_regs; int dml21_helper_find_dml_pipe_idx_by_stream_id(struct dml2_context *ctx, unsigned int stream_id); int dml21_find_dml_pipe_idx_by_plane_id(struct dml2_context *ctx, unsigned int plane_id); bool dml21_get_plane_id(const struct dc_state *state, const struct dc_plane_state *plane, unsigned int *plane_id); -void dml21_update_pipe_ctx_dchub_regs(struct dml2_display_rq_regs *rq_regs, - struct dml2_display_dlg_regs *disp_dlg_regs, - struct dml2_display_ttu_regs *disp_ttu_regs, - struct pipe_ctx *out); +void dml21_pipe_populate_global_sync(struct dml2_context *dml_ctx, + struct dc_state *context, + 
struct pipe_ctx *pipe_ctx, + struct dml2_per_stream_programming *stream_programming); void dml21_populate_mall_allocation_size(struct dc_state *context, struct dml2_context *in_ctx, struct dml2_per_plane_programming *pln_prog, diff --git a/drivers/gpu/drm/amd/display/dc/dml2/dml21/dml21_wrapper.c b/drivers/gpu/drm/amd/display/dc/dml2/dml21/dml21_wrapper.c index bbc28b9a15a3..fb80ba9287b6 100644 --- a/drivers/gpu/drm/amd/display/dc/dml2/dml21/dml21_wrapper.c +++ b/drivers/gpu/drm/amd/display/dc/dml2/dml21/dml21_wrapper.c @@ -75,7 +75,6 @@ static void dml21_init(const struct dc *in_dc, struct dml2_context **dml_ctx, co { switch (in_dc->ctx->dce_version) { case DCN_VERSION_4_01: - case DCN_VERSION_3_2: // TODO : Temporary for N-1 validation. Remove this after N-1 validation phase is complete. (*dml_ctx)->v21.dml_init.options.project_id = dml2_project_dcn4x_stage2_auto_drr_svp; break; default: @@ -233,13 +232,6 @@ static bool dml21_mode_check_and_programming(const struct dc *in_dc, struct dc_s dml21_calculate_rq_and_dlg_params(in_dc, context, &context->res_ctx, dml_ctx, in_dc->res_pool->pipe_count); dml21_copy_clocks_to_dc_state(dml_ctx, context); dml21_extract_watermark_sets(in_dc, &context->bw_ctx.bw.dcn.watermarks, dml_ctx); - if (in_dc->ctx->dce_version == DCN_VERSION_3_2) { - dml21_extract_legacy_watermark_set(in_dc, &context->bw_ctx.bw.dcn.watermarks.a, DML2_DCHUB_WATERMARK_SET_A, dml_ctx); - dml21_extract_legacy_watermark_set(in_dc, &context->bw_ctx.bw.dcn.watermarks.b, DML2_DCHUB_WATERMARK_SET_A, dml_ctx); - dml21_extract_legacy_watermark_set(in_dc, &context->bw_ctx.bw.dcn.watermarks.c, DML2_DCHUB_WATERMARK_SET_A, dml_ctx); - dml21_extract_legacy_watermark_set(in_dc, &context->bw_ctx.bw.dcn.watermarks.d, DML2_DCHUB_WATERMARK_SET_A, dml_ctx); - } - dml21_build_fams2_programming(in_dc, context, dml_ctx); } diff --git a/drivers/gpu/drm/amd/display/dc/dml2/dml21/inc/bounding_boxes/dcn3_soc_bb.h 
b/drivers/gpu/drm/amd/display/dc/dml2/dml21/inc/bounding_boxes/dcn3_soc_bb.h deleted file mode 100644 index d82c681a5402..000000000000 --- a/drivers/gpu/drm/amd/display/dc/dml2/dml21/inc/bounding_boxes/dcn3_soc_bb.h +++ /dev/null @@ -1,401 +0,0 @@ -/* - * Copyright 2022 Advanced Micro Devices, Inc. - * - * Permission is hereby granted, free of charge, to any person obtaining a - * copy of this software and associated documentation files (the "Software"), - * to deal in the Software without restriction, including without limitation - * the rights to use, copy, modify, merge, publish, distribute, sublicense, - * and/or sell copies of the Software, and to permit persons to whom the - * Software is furnished to do so, subject to the following conditions: - * - * The above copyright notice and this permission notice shall be included in - * all copies or substantial portions of the Software. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR - * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, - * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL - * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR - * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, - * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR - * OTHER DEALINGS IN THE SOFTWARE. 
- * - * Authors: AMD - * - */ - -#ifndef __DML_DML_DCN3_SOC_BB__ -#define __DML_DML_DCN3_SOC_BB__ - -#include "dml_top_soc_parameter_types.h" - -static const struct dml2_soc_qos_parameters dml_dcn31_soc_qos_params = { - .derate_table = { - .system_active_urgent = { - .dram_derate_percent_pixel = 22, - .dram_derate_percent_vm = 0, - .dram_derate_percent_pixel_and_vm = 0, - .fclk_derate_percent = 76, - .dcfclk_derate_percent = 100, - }, - .system_active_average = { - .dram_derate_percent_pixel = 17, - .dram_derate_percent_vm = 0, - .dram_derate_percent_pixel_and_vm = 0, - .fclk_derate_percent = 57, - .dcfclk_derate_percent = 75, - }, - .dcn_mall_prefetch_urgent = { - .dram_derate_percent_pixel = 22, - .dram_derate_percent_vm = 0, - .dram_derate_percent_pixel_and_vm = 0, - .fclk_derate_percent = 76, - .dcfclk_derate_percent = 100, - }, - .dcn_mall_prefetch_average = { - .dram_derate_percent_pixel = 17, - .dram_derate_percent_vm = 0, - .dram_derate_percent_pixel_and_vm = 0, - .fclk_derate_percent = 57, - .dcfclk_derate_percent = 75, - }, - .system_idle_average = { - .dram_derate_percent_pixel = 17, - .dram_derate_percent_vm = 0, - .dram_derate_percent_pixel_and_vm = 0, - .fclk_derate_percent = 57, - .dcfclk_derate_percent = 100, - }, - }, - .writeback = { - .base_latency_us = 12, - .scaling_factor_us = 0, - .scaling_factor_mhz = 0, - }, - .qos_params = { - .dcn4x = { - .df_qos_response_time_fclk_cycles = 300, - .max_round_trip_to_furthest_cs_fclk_cycles = 350, - .mall_overhead_fclk_cycles = 50, - .meta_trip_adder_fclk_cycles = 36, - .average_transport_distance_fclk_cycles = 257, - .umc_urgent_ramp_latency_margin = 50, - .umc_max_latency_margin = 30, - .umc_average_latency_margin = 20, - .fabric_max_transport_latency_margin = 20, - .fabric_average_transport_latency_margin = 10, - - .per_uclk_dpm_params = { - { - .minimum_uclk_khz = 97, - .urgent_ramp_uclk_cycles = 472, - .trip_to_memory_uclk_cycles = 827, - .meta_trip_to_memory_uclk_cycles = 827, - 
.maximum_latency_when_urgent_uclk_cycles = 72, - .average_latency_when_urgent_uclk_cycles = 61, - .maximum_latency_when_non_urgent_uclk_cycles = 827, - .average_latency_when_non_urgent_uclk_cycles = 118, - }, - { - .minimum_uclk_khz = 435, - .urgent_ramp_uclk_cycles = 546, - .trip_to_memory_uclk_cycles = 848, - .meta_trip_to_memory_uclk_cycles = 848, - .maximum_latency_when_urgent_uclk_cycles = 146, - .average_latency_when_urgent_uclk_cycles = 90, - .maximum_latency_when_non_urgent_uclk_cycles = 848, - .average_latency_when_non_urgent_uclk_cycles = 135, - }, - { - .minimum_uclk_khz = 731, - .urgent_ramp_uclk_cycles = 632, - .trip_to_memory_uclk_cycles = 874, - .meta_trip_to_memory_uclk_cycles = 874, - .maximum_latency_when_urgent_uclk_cycles = 232, - .average_latency_when_urgent_uclk_cycles = 124, - .maximum_latency_when_non_urgent_uclk_cycles = 874, - .average_latency_when_non_urgent_uclk_cycles = 155, - }, - { - .minimum_uclk_khz = 1187, - .urgent_ramp_uclk_cycles = 716, - .trip_to_memory_uclk_cycles = 902, - .meta_trip_to_memory_uclk_cycles = 902, - .maximum_latency_when_urgent_uclk_cycles = 316, - .average_latency_when_urgent_uclk_cycles = 160, - .maximum_latency_when_non_urgent_uclk_cycles = 902, - .average_latency_when_non_urgent_uclk_cycles = 177, - }, - }, - }, - }, - .qos_type = dml2_qos_param_type_dcn4x, -}; - -static const struct dml2_soc_bb dml2_socbb_dcn31 = { - .clk_table = { - .uclk = { - .clk_values_khz = {97000, 435000, 731000, 1187000}, - .num_clk_values = 4, - }, - .fclk = { - .clk_values_khz = {300000, 2500000}, - .num_clk_values = 2, - }, - .dcfclk = { - .clk_values_khz = {200000, 1800000}, - .num_clk_values = 2, - }, - .dispclk = { - .clk_values_khz = {100000, 2000000}, - .num_clk_values = 2, - }, - .dppclk = { - .clk_values_khz = {100000, 2000000}, - .num_clk_values = 2, - }, - .dtbclk = { - .clk_values_khz = {100000, 2000000}, - .num_clk_values = 2, - }, - .phyclk = { - .clk_values_khz = {810000, 810000}, - .num_clk_values = 2, - }, - 
.socclk = { - .clk_values_khz = {300000, 1600000}, - .num_clk_values = 2, - }, - .dscclk = { - .clk_values_khz = {666667, 666667}, - .num_clk_values = 2, - }, - .phyclk_d18 = { - .clk_values_khz = {625000, 625000}, - .num_clk_values = 2, - }, - .phyclk_d32 = { - .clk_values_khz = {2000000, 2000000}, - .num_clk_values = 2, - }, - .dram_config = { - .channel_width_bytes = 2, - .channel_count = 16, - .transactions_per_clock = 16, - }, - }, - - .qos_parameters = { - .derate_table = { - .system_active_urgent = { - .dram_derate_percent_pixel = 22, - .dram_derate_percent_vm = 0, - .dram_derate_percent_pixel_and_vm = 0, - .fclk_derate_percent = 76, - .dcfclk_derate_percent = 100, - }, - .system_active_average = { - .dram_derate_percent_pixel = 17, - .dram_derate_percent_vm = 0, - .dram_derate_percent_pixel_and_vm = 0, - .fclk_derate_percent = 57, - .dcfclk_derate_percent = 75, - }, - .dcn_mall_prefetch_urgent = { - .dram_derate_percent_pixel = 22, - .dram_derate_percent_vm = 0, - .dram_derate_percent_pixel_and_vm = 0, - .fclk_derate_percent = 76, - .dcfclk_derate_percent = 100, - }, - .dcn_mall_prefetch_average = { - .dram_derate_percent_pixel = 17, - .dram_derate_percent_vm = 0, - .dram_derate_percent_pixel_and_vm = 0, - .fclk_derate_percent = 57, - .dcfclk_derate_percent = 75, - }, - .system_idle_average = { - .dram_derate_percent_pixel = 17, - .dram_derate_percent_vm = 0, - .dram_derate_percent_pixel_and_vm = 0, - .fclk_derate_percent = 57, - .dcfclk_derate_percent = 100, - }, - }, - .writeback = { - .base_latency_us = 0, - .scaling_factor_us = 0, - .scaling_factor_mhz = 0, - }, - .qos_params = { - .dcn4x = { - .df_qos_response_time_fclk_cycles = 300, - .max_round_trip_to_furthest_cs_fclk_cycles = 350, - .mall_overhead_fclk_cycles = 50, - .meta_trip_adder_fclk_cycles = 36, - .average_transport_distance_fclk_cycles = 260, - .umc_urgent_ramp_latency_margin = 50, - .umc_max_latency_margin = 30, - .umc_average_latency_margin = 20, - .fabric_max_transport_latency_margin = 
20, - .fabric_average_transport_latency_margin = 10, - - .per_uclk_dpm_params = { - { - // State 1 - .minimum_uclk_khz = 0, - .urgent_ramp_uclk_cycles = 472, - .trip_to_memory_uclk_cycles = 827, - .meta_trip_to_memory_uclk_cycles = 827, - .maximum_latency_when_urgent_uclk_cycles = 72, - .average_latency_when_urgent_uclk_cycles = 72, - .maximum_latency_when_non_urgent_uclk_cycles = 827, - .average_latency_when_non_urgent_uclk_cycles = 117, - }, - { - // State 2 - .minimum_uclk_khz = 0, - .urgent_ramp_uclk_cycles = 546, - .trip_to_memory_uclk_cycles = 848, - .meta_trip_to_memory_uclk_cycles = 848, - .maximum_latency_when_urgent_uclk_cycles = 146, - .average_latency_when_urgent_uclk_cycles = 146, - .maximum_latency_when_non_urgent_uclk_cycles = 848, - .average_latency_when_non_urgent_uclk_cycles = 133, - }, - { - // State 3 - .minimum_uclk_khz = 0, - .urgent_ramp_uclk_cycles = 564, - .trip_to_memory_uclk_cycles = 853, - .meta_trip_to_memory_uclk_cycles = 853, - .maximum_latency_when_urgent_uclk_cycles = 164, - .average_latency_when_urgent_uclk_cycles = 164, - .maximum_latency_when_non_urgent_uclk_cycles = 853, - .average_latency_when_non_urgent_uclk_cycles = 136, - }, - { - // State 4 - .minimum_uclk_khz = 0, - .urgent_ramp_uclk_cycles = 613, - .trip_to_memory_uclk_cycles = 869, - .meta_trip_to_memory_uclk_cycles = 869, - .maximum_latency_when_urgent_uclk_cycles = 213, - .average_latency_when_urgent_uclk_cycles = 213, - .maximum_latency_when_non_urgent_uclk_cycles = 869, - .average_latency_when_non_urgent_uclk_cycles = 149, - }, - { - // State 5 - .minimum_uclk_khz = 0, - .urgent_ramp_uclk_cycles = 632, - .trip_to_memory_uclk_cycles = 874, - .meta_trip_to_memory_uclk_cycles = 874, - .maximum_latency_when_urgent_uclk_cycles = 232, - .average_latency_when_urgent_uclk_cycles = 232, - .maximum_latency_when_non_urgent_uclk_cycles = 874, - .average_latency_when_non_urgent_uclk_cycles = 153, - }, - { - // State 6 - .minimum_uclk_khz = 0, - .urgent_ramp_uclk_cycles = 665, - 
.trip_to_memory_uclk_cycles = 885, - .meta_trip_to_memory_uclk_cycles = 885, - .maximum_latency_when_urgent_uclk_cycles = 265, - .average_latency_when_urgent_uclk_cycles = 265, - .maximum_latency_when_non_urgent_uclk_cycles = 885, - .average_latency_when_non_urgent_uclk_cycles = 161, - }, - { - // State 7 - .minimum_uclk_khz = 0, - .urgent_ramp_uclk_cycles = 689, - .trip_to_memory_uclk_cycles = 895, - .meta_trip_to_memory_uclk_cycles = 895, - .maximum_latency_when_urgent_uclk_cycles = 289, - .average_latency_when_urgent_uclk_cycles = 289, - .maximum_latency_when_non_urgent_uclk_cycles = 895, - .average_latency_when_non_urgent_uclk_cycles = 167, - }, - { - // State 8 - .minimum_uclk_khz = 0, - .urgent_ramp_uclk_cycles = 716, - .trip_to_memory_uclk_cycles = 902, - .meta_trip_to_memory_uclk_cycles = 902, - .maximum_latency_when_urgent_uclk_cycles = 316, - .average_latency_when_urgent_uclk_cycles = 316, - .maximum_latency_when_non_urgent_uclk_cycles = 902, - .average_latency_when_non_urgent_uclk_cycles = 174, - }, - }, - }, - }, - .qos_type = dml2_qos_param_type_dcn4x, - }, - - .power_management_parameters = { - .dram_clk_change_blackout_us = 400, - .fclk_change_blackout_us = 0, - .g7_ppt_blackout_us = 0, - .stutter_enter_plus_exit_latency_us = 50, - .stutter_exit_latency_us = 43, - .z8_stutter_enter_plus_exit_latency_us = 0, - .z8_stutter_exit_latency_us = 0, - }, - - .vmin_limit = { - .dispclk_khz = 600 * 1000, - }, - - .dprefclk_mhz = 700, - .xtalclk_mhz = 100, - .pcie_refclk_mhz = 100, - .dchub_refclk_mhz = 50, - .mall_allocated_for_dcn_mbytes = 64, - .max_outstanding_reqs = 512, - .fabric_datapath_to_dcn_data_return_bytes = 64, - .return_bus_width_bytes = 64, - .hostvm_min_page_size_kbytes = 0, - .gpuvm_min_page_size_kbytes = 256, - .phy_downspread_percent = 0, - .dcn_downspread_percent = 0, - .dispclk_dppclk_vco_speed_mhz = 4500, - .do_urgent_latency_adjustment = 0, - .mem_word_bytes = 32, - .num_dcc_mcaches = 8, - .mcache_size_bytes = 2048, - 
.mcache_line_size_bytes = 32, - .max_fclk_for_uclk_dpm_khz = 1250 * 1000, -}; - -static const struct dml2_ip_capabilities dml2_dcn31_max_ip_caps = { - .pipe_count = 4, - .otg_count = 4, - .num_dsc = 4, - .max_num_dp2p0_streams = 4, - .max_num_hdmi_frl_outputs = 1, - .max_num_dp2p0_outputs = 4, - .rob_buffer_size_kbytes = 192, - .config_return_buffer_size_in_kbytes = 1152, - .meta_fifo_size_in_kentries = 22, - .compressed_buffer_segment_size_in_kbytes = 64, - .subvp_drr_scheduling_margin_us = 100, - .subvp_prefetch_end_to_mall_start_us = 15, - .subvp_fw_processing_delay = 15, - - .fams2 = { - .max_allow_delay_us = 100 * 1000, - .scheduling_delay_us = 50, - .vertical_interrupt_ack_delay_us = 18, - .allow_programming_delay_us = 18, - .min_allow_width_us = 20, - .subvp_df_throttle_delay_us = 100, - .subvp_programming_delay_us = 18, - .subvp_prefetch_to_mall_delay_us = 18, - .drr_programming_delay_us = 18, - }, -}; - -#endif /* __DML_DML_DCN3_SOC_BB__ */ diff --git a/drivers/gpu/drm/amd/display/dc/dml2/dml21/inc/bounding_boxes/dcn4_soc_bb.h b/drivers/gpu/drm/amd/display/dc/dml2/dml21/inc/bounding_boxes/dcn4_soc_bb.h index 8ef7977841de..793e1c038efd 100644 --- a/drivers/gpu/drm/amd/display/dc/dml2/dml21/inc/bounding_boxes/dcn4_soc_bb.h +++ b/drivers/gpu/drm/amd/display/dc/dml2/dml21/inc/bounding_boxes/dcn4_soc_bb.h @@ -344,6 +344,7 @@ static const struct dml2_ip_capabilities dml2_dcn401_max_ip_caps = { .config_return_buffer_segment_size_in_kbytes = 64, .meta_fifo_size_in_kentries = 22, .compressed_buffer_segment_size_in_kbytes = 64, + .cursor_buffer_size = 24, .max_flip_time_us = 80, .max_flip_time_lines = 32, .hostvm_mode = 0, @@ -354,7 +355,7 @@ static const struct dml2_ip_capabilities dml2_dcn401_max_ip_caps = { .fams2 = { .max_allow_delay_us = 100 * 1000, - .scheduling_delay_us = 125, + .scheduling_delay_us = 550, .vertical_interrupt_ack_delay_us = 40, .allow_programming_delay_us = 18, .min_allow_width_us = 20, diff --git 
a/drivers/gpu/drm/amd/display/dc/dml2/dml21/inc/dml_top_display_cfg_types.h b/drivers/gpu/drm/amd/display/dc/dml2/dml21/inc/dml_top_display_cfg_types.h index b132f676a68d..5e1ab6d97640 100644 --- a/drivers/gpu/drm/amd/display/dc/dml2/dml21/inc/dml_top_display_cfg_types.h +++ b/drivers/gpu/drm/amd/display/dc/dml2/dml21/inc/dml_top_display_cfg_types.h @@ -10,9 +10,10 @@ #define DML2_MAX_PLANES 8 #define DML2_MAX_DCN_PIPES 8 #define DML2_MAX_MCACHES 8 // assume plane is going to be supported by a max of 8 mcaches +#define DML2_MAX_WRITEBACK 3 enum dml2_swizzle_mode { - dml2_sw_linear, + dml2_sw_linear, // SW_LINEAR accepts 256 byte aligned pitch and also 128 byte aligned pitch if DCC is not enabled dml2_sw_256b_2d, dml2_sw_4kb_2d, dml2_sw_64kb_2d, @@ -24,7 +25,8 @@ enum dml2_swizzle_mode { dml2_gfx11_sw_64kb_d_x, dml2_gfx11_sw_64kb_r_x, dml2_gfx11_sw_256kb_d_x, - dml2_gfx11_sw_256kb_r_x + dml2_gfx11_sw_256kb_r_x, + }; enum dml2_source_format_class { @@ -38,7 +40,13 @@ enum dml2_source_format_class { dml2_rgbe_alpha = 9, dml2_rgbe = 10, dml2_mono_8 = 11, - dml2_mono_16 = 12 + dml2_mono_16 = 12, + dml2_422_planar_8 = 13, + dml2_422_planar_10 = 14, + dml2_422_planar_12 = 15, + dml2_422_packed_8 = 16, + dml2_422_packed_10 = 17, + dml2_422_packed_12 = 18 }; enum dml2_rotation_angle { @@ -121,15 +129,6 @@ enum dml2_dsc_enable_option { dml2_dsc_enable_if_necessary = 2 }; -enum dml2_pstate_support_method { - dml2_pstate_method_uninitialized, - dml2_pstate_method_not_supported, - dml2_pstate_method_vactive, - dml2_pstate_method_vblank, - dml2_pstate_method_svp, - dml2_pstate_method_drr -}; - enum dml2_tdlut_addressing_mode { dml2_tdlut_sw_linear = 0, dml2_tdlut_simple_linear = 1 @@ -287,22 +286,23 @@ struct dml2_link_output_cfg { bool validate_output; // Do not validate the link configuration for this display stream. 
}; -struct dml2_writeback_cfg { - bool enable; +struct dml2_writeback_info { enum dml2_source_format_class pixel_format; - unsigned int active_writebacks_per_surface; + unsigned long input_width; + unsigned long input_height; + unsigned long output_width; + unsigned long output_height; + unsigned long v_taps; + unsigned long h_taps; + unsigned long v_taps_chroma; + unsigned long h_taps_chroma; + double h_ratio; + double v_ratio; +}; - struct { - bool enabled; - unsigned long input_width; - unsigned long input_height; - unsigned long output_width; - unsigned long output_height; - unsigned long v_taps; - unsigned long h_taps; - double h_ratio; - double v_ratio; - } scaling_info; +struct dml2_writeback_cfg { + unsigned int active_writebacks_per_stream; + struct dml2_writeback_info writeback_stream[DML2_MAX_WRITEBACK]; }; struct dml2_plane_parameters { diff --git a/drivers/gpu/drm/amd/display/dc/dml2/dml21/inc/dml_top_soc_parameter_types.h b/drivers/gpu/drm/amd/display/dc/dml2/dml21/inc/dml_top_soc_parameter_types.h index ebd8abe894a9..5f0bc42d1d2f 100644 --- a/drivers/gpu/drm/amd/display/dc/dml2/dml21/inc/dml_top_soc_parameter_types.h +++ b/drivers/gpu/drm/amd/display/dc/dml2/dml21/inc/dml_top_soc_parameter_types.h @@ -167,11 +167,13 @@ struct dml2_ip_capabilities { unsigned int max_num_dp2p0_streams; unsigned int max_num_hdmi_frl_outputs; unsigned int max_num_dp2p0_outputs; + unsigned int max_num_wb; unsigned int rob_buffer_size_kbytes; unsigned int config_return_buffer_size_in_kbytes; unsigned int config_return_buffer_segment_size_in_kbytes; unsigned int meta_fifo_size_in_kentries; unsigned int compressed_buffer_segment_size_in_kbytes; + unsigned int cursor_buffer_size; unsigned int max_flip_time_us; unsigned int max_flip_time_lines; unsigned int hostvm_mode; diff --git a/drivers/gpu/drm/amd/display/dc/dml2/dml21/inc/dml_top_types.h b/drivers/gpu/drm/amd/display/dc/dml2/dml21/inc/dml_top_types.h index eeb96c455658..d2d053f2354d 100644 --- 
a/drivers/gpu/drm/amd/display/dc/dml2/dml21/inc/dml_top_types.h +++ b/drivers/gpu/drm/amd/display/dc/dml2/dml21/inc/dml_top_types.h @@ -26,20 +26,14 @@ enum dml2_project_id { dml2_project_dcn4x_stage2_auto_drr_svp = 3, }; -enum dml2_dram_clock_change_support { - dml2_dram_clock_change_vactive = 0, - dml2_dram_clock_change_vblank = 1, - dml2_dram_clock_change_vblank_and_vactive = 2, - dml2_dram_clock_change_drr = 3, - dml2_dram_clock_change_mall_svp = 4, - dml2_dram_clock_change_mall_full_frame = 6, - dml2_dram_clock_change_unsupported = 7 -}; - -enum dml2_fclock_change_support { - dml2_fclock_change_vactive = 0, - dml2_fclock_change_vblank = 1, - dml2_fclock_change_unsupported = 2 +enum dml2_pstate_change_support { + dml2_pstate_change_vactive = 0, + dml2_pstate_change_vblank = 1, + dml2_pstate_change_vblank_and_vactive = 2, + dml2_pstate_change_drr = 3, + dml2_pstate_change_mall_svp = 4, + dml2_pstate_change_mall_full_frame = 6, + dml2_pstate_change_unsupported = 7 }; enum dml2_output_type_and_rate__type { @@ -202,24 +196,23 @@ struct dml2_mcache_surface_allocation { } informative; }; -enum dml2_uclk_pstate_support_method { - dml2_uclk_pstate_support_method_not_supported = 0, - /* hw */ - dml2_uclk_pstate_support_method_vactive = 1, - dml2_uclk_pstate_support_method_vblank = 2, - dml2_uclk_pstate_support_method_reserved_hw = 5, - /* fw */ - dml2_uclk_pstate_support_method_fw_subvp_phantom = 6, - dml2_uclk_pstate_support_method_reserved_fw = 10, - /* fw w/drr */ - dml2_uclk_pstate_support_method_fw_vactive_drr = 11, - dml2_uclk_pstate_support_method_fw_vblank_drr = 12, - dml2_uclk_pstate_support_method_fw_subvp_phantom_drr = 13, - dml2_uclk_pstate_support_method_reserved_fw_drr_fixed = 20, - dml2_uclk_pstate_support_method_fw_drr = 21, - dml2_uclk_pstate_support_method_reserved_fw_drr_var = 22, - - dml2_uclk_pstate_support_method_count +enum dml2_pstate_method { + dml2_pstate_method_na = 0, + /* hw exclusive modes */ + dml2_pstate_method_vactive = 1, + 
dml2_pstate_method_vblank = 2, + dml2_pstate_method_reserved_hw = 5, + /* fw assisted exclusive modes */ + dml2_pstate_method_fw_svp = 6, + dml2_pstate_method_reserved_fw = 10, + /* fw assisted modes requiring drr modulation */ + dml2_pstate_method_fw_vactive_drr = 11, + dml2_pstate_method_fw_vblank_drr = 12, + dml2_pstate_method_fw_svp_drr = 13, + dml2_pstate_method_reserved_fw_drr_clamped = 20, + dml2_pstate_method_fw_drr = 21, + dml2_pstate_method_reserved_fw_drr_var = 22, + dml2_pstate_method_count }; struct dml2_per_plane_programming { @@ -241,7 +234,7 @@ struct dml2_per_plane_programming { // If a stream is using odm split, then this value is always 1 unsigned int num_dpps_required; - enum dml2_uclk_pstate_support_method uclk_pstate_support_method; + enum dml2_pstate_method uclk_pstate_support_method; // MALL size requirements for MALL SS and SubVP unsigned int surface_size_mall_bytes; @@ -281,7 +274,7 @@ struct dml2_per_stream_programming { unsigned int num_odms_required; - enum dml2_uclk_pstate_support_method uclk_pstate_method; + enum dml2_pstate_method uclk_pstate_method; struct { bool enabled; @@ -289,7 +282,8 @@ struct dml2_per_stream_programming { union dml2_global_sync_programming global_sync; } phantom_stream; - struct dmub_fams2_stream_static_state fams2_params; + union dmub_cmd_fams2_config fams2_base_params; + union dmub_cmd_fams2_config fams2_sub_params; }; //----------------- @@ -339,7 +333,7 @@ struct dml2_mode_support_info { bool DCCMetaBufferSizeNotExceeded; bool TotalVerticalActiveBandwidthSupport; bool VActiveBandwidthSupport; - enum dml2_fclock_change_support FCLKChangeSupport[DML2_MAX_PLANES]; + enum dml2_pstate_change_support FCLKChangeSupport[DML2_MAX_PLANES]; bool USRRetrainingSupport; bool PrefetchSupported; bool DynamicMetadataSupported; @@ -361,6 +355,7 @@ struct dml2_mode_support_info { unsigned int AlignedYPitch[DML2_MAX_PLANES]; unsigned int AlignedCPitch[DML2_MAX_PLANES]; bool g6_temp_read_support; + bool 
temp_read_or_ppt_support; }; // dml2_mode_support_info struct dml2_display_cfg_programming { @@ -392,6 +387,11 @@ struct dml2_display_cfg_programming { unsigned long fclk_khz; unsigned long dcfclk_khz; } svp_prefetch; + struct { + unsigned long uclk_khz; + unsigned long fclk_khz; + unsigned long dcfclk_khz; + } svp_prefetch_no_throttle; unsigned long deepsleep_dcfclk_khz; unsigned long dispclk_khz; @@ -444,7 +444,7 @@ struct dml2_display_cfg_programming { double pstate_change_us; double fclk_pstate_change_us; double usr_retraining_us; - double g6_temp_read_watermark_us; + double temp_read_or_ppt_watermark_us; } watermarks; struct { @@ -653,6 +653,7 @@ struct dml2_display_cfg_programming { double DisplayPipeLineDeliveryTimeLumaPrefetch[DML2_MAX_PLANES]; double DisplayPipeLineDeliveryTimeChromaPrefetch[DML2_MAX_PLANES]; + double WritebackRequiredBandwidth; double WritebackAllowDRAMClockChangeEndPosition[DML2_MAX_PLANES]; double WritebackAllowFCLKChangeEndPosition[DML2_MAX_PLANES]; double DSCCLK_calculated[DML2_MAX_PLANES]; @@ -662,6 +663,7 @@ struct dml2_display_cfg_programming { double MaxActiveDRAMClockChangeLatencySupported[DML2_MAX_PLANES]; unsigned int PrefetchMode[DML2_MAX_PLANES]; // LEGACY_ONLY bool ROBUrgencyAvoidance; + double LowestPrefetchMargin; } misc; struct dml2_mode_support_info mode_support_info; @@ -675,6 +677,7 @@ struct dml2_display_cfg_programming { bool failed_mcache_validation; bool failed_dpmm; bool failed_mode_programming; + bool failed_map_watermarks; } informative; }; diff --git a/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_core/dml2_core_dcn4.c b/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_core/dml2_core_dcn4.c index 3d41ffde91c1..d68b4567e218 100644 --- a/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_core/dml2_core_dcn4.c +++ b/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_core/dml2_core_dcn4.c @@ -9,7 +9,7 @@ #include "dml2_debug.h" #include "lib_float_math.h" -static const struct dml2_core_ip_params 
core_dcn4_ip_caps_base = { +struct dml2_core_ip_params core_dcn4_ip_caps_base = { // Hardcoded values for DCN3x .vblank_nom_default_us = 668, .remote_iommu_outstanding_translations = 256, @@ -90,6 +90,7 @@ static void patch_ip_caps_with_explicit_ip_params(struct dml2_ip_capabilities *i ip_caps->config_return_buffer_segment_size_in_kbytes = ip_params->config_return_buffer_segment_size_in_kbytes; ip_caps->meta_fifo_size_in_kentries = ip_params->meta_fifo_size_in_kentries; ip_caps->compressed_buffer_segment_size_in_kbytes = ip_params->compressed_buffer_segment_size_in_kbytes; + ip_caps->cursor_buffer_size = ip_params->cursor_buffer_size; ip_caps->max_flip_time_us = ip_params->max_flip_time_us; ip_caps->max_flip_time_lines = ip_params->max_flip_time_lines; ip_caps->hostvm_mode = ip_params->hostvm_mode; @@ -114,6 +115,7 @@ static void patch_ip_params_with_ip_caps(struct dml2_core_ip_params *ip_params, ip_params->config_return_buffer_segment_size_in_kbytes = ip_caps->config_return_buffer_segment_size_in_kbytes; ip_params->meta_fifo_size_in_kentries = ip_caps->meta_fifo_size_in_kentries; ip_params->compressed_buffer_segment_size_in_kbytes = ip_caps->compressed_buffer_segment_size_in_kbytes; + ip_params->cursor_buffer_size = ip_caps->cursor_buffer_size; ip_params->max_flip_time_us = ip_caps->max_flip_time_us; ip_params->max_flip_time_lines = ip_caps->max_flip_time_lines; ip_params->hostvm_mode = ip_caps->hostvm_mode; @@ -316,28 +318,9 @@ static void pack_mode_programming_params_with_implicit_subvp(struct dml2_core_in // Setup the appropriate p-state strategy if (display_cfg->stage3.performed && display_cfg->stage3.success) { - switch (display_cfg->stage3.pstate_switch_modes[plane_index]) { - case dml2_uclk_pstate_support_method_vactive: - case dml2_uclk_pstate_support_method_vblank: - case dml2_uclk_pstate_support_method_fw_subvp_phantom: - case dml2_uclk_pstate_support_method_fw_drr: - case dml2_uclk_pstate_support_method_fw_vactive_drr: - case 
dml2_uclk_pstate_support_method_fw_vblank_drr: - case dml2_uclk_pstate_support_method_fw_subvp_phantom_drr: - programming->plane_programming[plane_index].uclk_pstate_support_method = display_cfg->stage3.pstate_switch_modes[plane_index]; - break; - case dml2_uclk_pstate_support_method_reserved_hw: - case dml2_uclk_pstate_support_method_reserved_fw: - case dml2_uclk_pstate_support_method_reserved_fw_drr_fixed: - case dml2_uclk_pstate_support_method_reserved_fw_drr_var: - case dml2_uclk_pstate_support_method_not_supported: - case dml2_uclk_pstate_support_method_count: - default: - programming->plane_programming[plane_index].uclk_pstate_support_method = dml2_uclk_pstate_support_method_not_supported; - break; - } + programming->plane_programming[plane_index].uclk_pstate_support_method = display_cfg->stage3.pstate_switch_modes[plane_index]; } else { - programming->plane_programming[plane_index].uclk_pstate_support_method = dml2_uclk_pstate_support_method_not_supported; + programming->plane_programming[plane_index].uclk_pstate_support_method = dml2_pstate_method_na; } dml2_core_calcs_get_mall_allocation(&core->clean_me_up.mode_lib, &programming->plane_programming[plane_index].surface_size_mall_bytes, dml_internal_pipe_index); @@ -360,7 +343,8 @@ static void pack_mode_programming_params_with_implicit_subvp(struct dml2_core_in /* unconditionally populate fams2 params */ dml2_core_calcs_get_stream_fams2_programming(&core->clean_me_up.mode_lib, display_cfg, - &programming->stream_programming[main_plane->stream_index].fams2_params, + &programming->stream_programming[main_plane->stream_index].fams2_base_params, + &programming->stream_programming[main_plane->stream_index].fams2_sub_params, programming->stream_programming[main_plane->stream_index].uclk_pstate_method, plane_index); @@ -572,18 +556,18 @@ bool core_dcn4_mode_programming(struct dml2_core_mode_programming_in_out *in_out in_out->programming->plane_programming[plane_index].num_dpps_required = 
core->clean_me_up.mode_lib.mp.NoOfDPP[plane_index]; if (in_out->programming->display_config.plane_descriptors[plane_index].overrides.legacy_svp_config == dml2_svp_mode_override_main_pipe) - in_out->programming->plane_programming[plane_index].uclk_pstate_support_method = dml2_uclk_pstate_support_method_fw_subvp_phantom; + in_out->programming->plane_programming[plane_index].uclk_pstate_support_method = dml2_pstate_method_fw_svp; else if (in_out->programming->display_config.plane_descriptors[plane_index].overrides.legacy_svp_config == dml2_svp_mode_override_phantom_pipe) - in_out->programming->plane_programming[plane_index].uclk_pstate_support_method = dml2_uclk_pstate_support_method_fw_subvp_phantom; + in_out->programming->plane_programming[plane_index].uclk_pstate_support_method = dml2_pstate_method_fw_svp; else if (in_out->programming->display_config.plane_descriptors[plane_index].overrides.legacy_svp_config == dml2_svp_mode_override_phantom_pipe_no_data_return) - in_out->programming->plane_programming[plane_index].uclk_pstate_support_method = dml2_uclk_pstate_support_method_fw_subvp_phantom; + in_out->programming->plane_programming[plane_index].uclk_pstate_support_method = dml2_pstate_method_fw_svp; else { if (core->clean_me_up.mode_lib.mp.MaxActiveDRAMClockChangeLatencySupported[plane_index] >= core->clean_me_up.mode_lib.soc.power_management_parameters.dram_clk_change_blackout_us) - in_out->programming->plane_programming[plane_index].uclk_pstate_support_method = dml2_uclk_pstate_support_method_vactive; + in_out->programming->plane_programming[plane_index].uclk_pstate_support_method = dml2_pstate_method_vactive; else if (core->clean_me_up.mode_lib.mp.TWait[plane_index] >= core->clean_me_up.mode_lib.soc.power_management_parameters.dram_clk_change_blackout_us) - in_out->programming->plane_programming[plane_index].uclk_pstate_support_method = dml2_uclk_pstate_support_method_vblank; + in_out->programming->plane_programming[plane_index].uclk_pstate_support_method = 
dml2_pstate_method_vblank; else - in_out->programming->plane_programming[plane_index].uclk_pstate_support_method = dml2_uclk_pstate_support_method_not_supported; + in_out->programming->plane_programming[plane_index].uclk_pstate_support_method = dml2_pstate_method_na; } dml2_core_calcs_get_mall_allocation(&core->clean_me_up.mode_lib, &in_out->programming->plane_programming[plane_index].surface_size_mall_bytes, dml_internal_pipe_index); diff --git a/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_core/dml2_core_dcn4_calcs.c b/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_core/dml2_core_dcn4_calcs.c index 601320b1be81..c4dbf27abaf8 100644 --- a/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_core/dml2_core_dcn4_calcs.c +++ b/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_core/dml2_core_dcn4_calcs.c @@ -11,6 +11,9 @@ #define DML2_MAX_FMT_420_BUFFER_WIDTH 4096 #define DML_MAX_NUM_OF_SLICES_PER_DSC 4 +#define DML_MAX_COMPRESSION_RATIO 4 +//#define DML_MODE_SUPPORT_USE_DPM_DRAM_BW +//#define DML_GLOBAL_PREFETCH_CHECK #define ALLOW_SDPIF_RATE_LIMIT_PRE_CSTATE const char *dml2_core_internal_bw_type_str(enum dml2_core_internal_bw_type bw_type) @@ -132,9 +135,9 @@ static void dml2_print_mode_support_info(const struct dml2_core_internal_mode_su dml2_printf("DML: support: DynamicMetadataSupported = %d\n", support->DynamicMetadataSupported); if (!fail_only || support->VRatioInPrefetchSupported == 0) dml2_printf("DML: support: VRatioInPrefetchSupported = %d\n", support->VRatioInPrefetchSupported); - if (!fail_only || support->PTEBufferSizeNotExceeded == 1) + if (!fail_only || support->PTEBufferSizeNotExceeded == 0) dml2_printf("DML: support: PTEBufferSizeNotExceeded = %d\n", support->PTEBufferSizeNotExceeded); - if (!fail_only || support->DCCMetaBufferSizeNotExceeded == 1) + if (!fail_only || support->DCCMetaBufferSizeNotExceeded == 0) dml2_printf("DML: support: DCCMetaBufferSizeNotExceeded = %d\n", support->DCCMetaBufferSizeNotExceeded); if (!fail_only || 
support->ExceededMALLSize == 1) dml2_printf("DML: support: ExceededMALLSize = %d\n", support->ExceededMALLSize); @@ -315,12 +318,11 @@ dml_get_var_func(meta_trip_memory_us, double, mode_lib->mp.MetaTripToMemory); dml_get_var_func(wm_fclk_change, double, mode_lib->mp.Watermark.FCLKChangeWatermark); dml_get_var_func(wm_usr_retraining, double, mode_lib->mp.Watermark.USRRetrainingWatermark); -dml_get_var_func(wm_g6_temp_read, double, mode_lib->mp.Watermark.g6_temp_read_watermark_us); +dml_get_var_func(wm_temp_read_or_ppt, double, mode_lib->mp.Watermark.temp_read_or_ppt_watermark_us); dml_get_var_func(wm_dram_clock_change, double, mode_lib->mp.Watermark.DRAMClockChangeWatermark); dml_get_var_func(fraction_of_urgent_bandwidth, double, mode_lib->mp.FractionOfUrgentBandwidth); dml_get_var_func(fraction_of_urgent_bandwidth_imm_flip, double, mode_lib->mp.FractionOfUrgentBandwidthImmediateFlip); dml_get_var_func(fraction_of_urgent_bandwidth_mall, double, mode_lib->mp.FractionOfUrgentBandwidthMALL); -dml_get_var_func(urgent_latency, double, mode_lib->mp.UrgentLatency); dml_get_var_func(wm_writeback_dram_clock_change, double, mode_lib->mp.Watermark.WritebackDRAMClockChangeWatermark); dml_get_var_func(wm_writeback_fclk_change, double, mode_lib->mp.Watermark.WritebackFCLKChangeWatermark); dml_get_var_func(stutter_efficiency, double, mode_lib->mp.StutterEfficiency); @@ -355,7 +357,9 @@ dml_get_var_func(svp_prefetch_urg_bw_available_sdp, double, mode_lib->mp.urg_ban dml_get_var_func(svp_prefetch_urg_bw_available_dram, double, mode_lib->mp.urg_bandwidth_available[dml2_core_internal_soc_state_svp_prefetch][dml2_core_internal_bw_dram]); dml_get_var_func(svp_prefetch_urg_bw_available_dram_vm_only, double, mode_lib->mp.urg_bandwidth_available_vm_only[dml2_core_internal_soc_state_svp_prefetch]); +dml_get_var_func(urgent_latency, double, mode_lib->mp.UrgentLatency); dml_get_var_func(max_urgent_latency_us, double, mode_lib->ms.support.max_urgent_latency_us); 
+dml_get_var_func(max_non_urgent_latency_us, double, mode_lib->ms.support.max_non_urgent_latency_us); dml_get_var_func(avg_non_urgent_latency_us, double, mode_lib->ms.support.avg_non_urgent_latency_us); dml_get_var_func(avg_urgent_latency_us, double, mode_lib->ms.support.avg_urgent_latency_us); @@ -466,6 +470,24 @@ static bool dml_is_420(enum dml2_source_format_class source_format) case dml2_420_12: val = 1; break; + case dml2_422_planar_8: + val = 0; + break; + case dml2_422_planar_10: + val = 0; + break; + case dml2_422_planar_12: + val = 0; + break; + case dml2_422_packed_8: + val = 0; + break; + case dml2_422_packed_10: + val = 0; + break; + case dml2_422_packed_12: + val = 0; + break; case dml2_rgbe_alpha: val = 0; break; @@ -487,32 +509,31 @@ static bool dml_is_420(enum dml2_source_format_class source_format) static unsigned int dml_get_tile_block_size_bytes(enum dml2_swizzle_mode sw_mode) { - switch (sw_mode) { - case (dml2_sw_linear): - return 256; break; - case (dml2_sw_256b_2d): - return 256; break; - case (dml2_sw_4kb_2d): - return 4096; break; - case (dml2_sw_64kb_2d): - return 65536; break; - case (dml2_sw_256kb_2d): - return 262144; break; - case (dml2_gfx11_sw_linear): - return 256; break; - case (dml2_gfx11_sw_64kb_d): - return 65536; break; - case (dml2_gfx11_sw_64kb_d_t): - return 65536; break; - case (dml2_gfx11_sw_64kb_d_x): - return 65536; break; - case (dml2_gfx11_sw_64kb_r_x): - return 65536; break; - case (dml2_gfx11_sw_256kb_d_x): - return 262144; break; - case (dml2_gfx11_sw_256kb_r_x): - return 262144; break; - default: + if (sw_mode == dml2_sw_linear) + return 256; + else if (sw_mode == dml2_sw_256b_2d) + return 256; + else if (sw_mode == dml2_sw_4kb_2d) + return 4096; + else if (sw_mode == dml2_sw_64kb_2d) + return 65536; + else if (sw_mode == dml2_sw_256kb_2d) + return 262144; + else if (sw_mode == dml2_gfx11_sw_linear) + return 256; + else if (sw_mode == dml2_gfx11_sw_64kb_d) + return 65536; + else if (sw_mode == 
dml2_gfx11_sw_64kb_d_t) + return 65536; + else if (sw_mode == dml2_gfx11_sw_64kb_d_x) + return 65536; + else if (sw_mode == dml2_gfx11_sw_64kb_r_x) + return 65536; + else if (sw_mode == dml2_gfx11_sw_256kb_d_x) + return 262144; + else if (sw_mode == dml2_gfx11_sw_256kb_r_x) + return 262144; + else { DML2_ASSERT(0); return 256; } @@ -579,8 +600,8 @@ static void CalculateBytePerPixelAndBlockSizes( { *BytePerPixelDETY = 0; *BytePerPixelDETC = 0; - *BytePerPixelY = 0; - *BytePerPixelC = 0; + *BytePerPixelY = 1; + *BytePerPixelC = 1; if (SourcePixelFormat == dml2_444_64) { *BytePerPixelDETY = 8; @@ -820,7 +841,7 @@ static void CalculateSwathWidth( // Output unsigned int req_per_swath_ub_l[], unsigned int req_per_swath_ub_c[], - unsigned int SwathWidthSingleDPPY[], + unsigned int SwathWidthSingleDPPY[], // post-rotated plane width unsigned int SwathWidthSingleDPPC[], unsigned int SwathWidthY[], // per-pipe unsigned int SwathWidthC[], // per-pipe @@ -1403,7 +1424,6 @@ static unsigned int dscceComputeDelay( // N422/N420 operate at 2 pixels per clock unsigned int pixelsPerClock, padding_pixels, ssm_group_priming_delay, ssm_pipeline_delay, obsm_pipeline_delay, slice_padded_pixels, ixd_plus_padding, ixd_plus_padding_groups, cycles_per_group, group_delay, pipeline_delay, pixels, additional_group_delay, lines_to_reach_ixd, groups_to_reach_ixd, slice_width_groups, initial_xmit_delay, number_of_lines_to_reach_ixd, slice_width_modified; - if (pixelFormat == dml2_420) pixelsPerClock = 2; // #all other modes operate at 1 pixel per clock @@ -1428,7 +1448,6 @@ static unsigned int dscceComputeDelay( } } - //sub-stream multiplexer balance fifo priming delay in groups as per dsc standard if (bpc == 8) ssm_group_priming_delay = 83; @@ -1447,9 +1466,6 @@ static unsigned int dscceComputeDelay( //determine number of padded pixels in the last group of a slice line, computed as slice_padded_pixels = 3 * slice_width_groups - slice_width_modified; - - - //determine integer number of complete 
slice lines required to reach initial transmit delay without ssm delay considered number_of_lines_to_reach_ixd = initial_xmit_delay / slice_width_modified; @@ -1463,7 +1479,6 @@ static unsigned int dscceComputeDelay( //number of groups required for a slice to reach initial transmit delay is the sum of the padded initial transmit delay plus the ssm group priming delay groups_to_reach_ixd = ixd_plus_padding_groups + ssm_group_priming_delay; - //number of lines required to reach padded initial transmit delay in groups in slices to the left of the last horizontal slice //needs to be rounded up as a complete slice lines are buffered prior to initial transmit delay being reached in the last horizontal slice lines_to_reach_ixd = (groups_to_reach_ixd + slice_width_groups - 1) / slice_width_groups; //round up lines to reach ixd to next @@ -1506,7 +1521,6 @@ static unsigned int dscceComputeDelay( return pixels; } - //updated in dcn4 static unsigned int dscComputeDelay(enum dml2_output_format_class pixelFormat, enum dml2_output_encoder_class Output) { @@ -2090,7 +2104,6 @@ static void CalculateDCCConfiguration( yuv420 = 1; else yuv420 = 0; - horz_div_l = 1; horz_div_c = 1; vert_div_l = 1; @@ -2561,8 +2574,7 @@ static void calculate_mcache_setting( if (*p->num_mcaches_l) { l->avg_mcache_element_size_l = l->meta_row_width_l / *p->num_mcaches_l; } - - if (l->is_dual_plane && *p->num_mcaches_c) { + if (l->is_dual_plane) { l->avg_mcache_element_size_c = l->meta_row_width_c / *p->num_mcaches_c; if (!p->imall_enable || (*p->mall_comb_mcache_l == *p->mall_comb_mcache_c)) { @@ -2682,12 +2694,12 @@ static double dml_get_return_bandwidth_available( bool is_avg_bw, bool is_hvm_en, bool is_hvm_only, - double dcflk_mhz, + double dcfclk_mhz, double fclk_mhz, double dram_bw_mbps) { double return_bw_mbps = 0.; - double ideal_sdp_bandwidth = (double)soc->return_bus_width_bytes * dcflk_mhz; + double ideal_sdp_bandwidth = (double)soc->return_bus_width_bytes * dcfclk_mhz; double 
ideal_fabric_bandwidth = fclk_mhz * (double)soc->fabric_datapath_to_dcn_data_return_bytes; double ideal_dram_bandwidth = dram_bw_mbps; //dram_speed_mts * soc->clk_table.dram_config.channel_count * soc->clk_table.dram_config.channel_width_bytes; @@ -2753,7 +2765,7 @@ static double dml_get_return_bandwidth_available( dml2_printf("DML::%s: is_hvm_only = %u\n", __func__, is_hvm_only); dml2_printf("DML::%s: state_type = %s\n", __func__, dml2_core_internal_soc_state_type_str(state_type)); dml2_printf("DML::%s: bw_type = %s\n", __func__, dml2_core_internal_bw_type_str(bw_type)); - dml2_printf("DML::%s: dcflk_mhz = %f\n", __func__, dcflk_mhz); + dml2_printf("DML::%s: dcfclk_mhz = %f\n", __func__, dcfclk_mhz); dml2_printf("DML::%s: fclk_mhz = %f\n", __func__, fclk_mhz); dml2_printf("DML::%s: ideal_sdp_bandwidth = %f\n", __func__, ideal_sdp_bandwidth); dml2_printf("DML::%s: ideal_fabric_bandwidth = %f\n", __func__, ideal_fabric_bandwidth); @@ -3516,10 +3528,9 @@ static void CalculateUrgentBurstFactor( dml2_printf("DML::%s: UrgentBurstFactorChroma = %f\n", __func__, *UrgentBurstFactorChroma); dml2_printf("DML::%s: NotEnoughUrgentLatencyHiding = %d\n", __func__, *NotEnoughUrgentLatencyHiding); #endif - } -static void CalculateDCFCLKDeepSleep( +static void CalculateDCFCLKDeepSleepTdlut( const struct dml2_display_cfg *display_cfg, unsigned int NumberOfActiveSurfaces, unsigned int BytePerPixelY[], @@ -3534,6 +3545,10 @@ static void CalculateDCFCLKDeepSleep( double ReadBandwidthChroma[], unsigned int ReturnBusWidth, + double dispclk, + unsigned int tdlut_bytes_to_deliver[], + double prefetch_swath_time_us[], + // Output double *DCFClkDeepSleep) { @@ -3568,6 +3583,22 @@ static void CalculateDCFCLKDeepSleep( } DCFClkDeepSleepPerSurface[k] = math_max2(DCFClkDeepSleepPerSurface[k], pixel_rate_mhz / 16); + // adjust for 3dlut delivery time + if (display_cfg->plane_descriptors[k].tdlut.setup_for_tdlut && tdlut_bytes_to_deliver[k] > 0) { + double tdlut_required_deepsleep_dcfclk = 
(double) tdlut_bytes_to_deliver[k] / 64.0 / prefetch_swath_time_us[k]; + + dml2_printf("DML::%s: k=%d, DCFClkDeepSleepPerSurface = %f\n", __func__, k, DCFClkDeepSleepPerSurface[k]); + dml2_printf("DML::%s: k=%d, tdlut_bytes_to_deliver = %d\n", __func__, k, tdlut_bytes_to_deliver[k]); + dml2_printf("DML::%s: k=%d, prefetch_swath_time_us = %f\n", __func__, k, prefetch_swath_time_us[k]); + dml2_printf("DML::%s: k=%d, tdlut_required_deepsleep_dcfclk = %f\n", __func__, k, tdlut_required_deepsleep_dcfclk); + + // increase the deepsleep dcfclk to match the original dispclk throughput rate + if (tdlut_required_deepsleep_dcfclk > DCFClkDeepSleepPerSurface[k]) { + DCFClkDeepSleepPerSurface[k] = math_max2(DCFClkDeepSleepPerSurface[k], tdlut_required_deepsleep_dcfclk); + DCFClkDeepSleepPerSurface[k] = math_max2(DCFClkDeepSleepPerSurface[k], dispclk / 4.0); + } + } + #ifdef __DML_VBA_DEBUG__ dml2_printf("DML::%s: k=%u, PixelClock = %f\n", __func__, k, pixel_rate_mhz); dml2_printf("DML::%s: k=%u, DCFClkDeepSleepPerSurface = %f\n", __func__, k, DCFClkDeepSleepPerSurface[k]); @@ -3590,9 +3621,56 @@ static void CalculateDCFCLKDeepSleep( for (unsigned int k = 0; k < NumberOfActiveSurfaces; ++k) { *DCFClkDeepSleep = math_max2(*DCFClkDeepSleep, DCFClkDeepSleepPerSurface[k]); } + dml2_printf("DML::%s: DCFClkDeepSleep = %f (final)\n", __func__, *DCFClkDeepSleep); } +static void CalculateDCFCLKDeepSleep( + const struct dml2_display_cfg *display_cfg, + unsigned int NumberOfActiveSurfaces, + unsigned int BytePerPixelY[], + unsigned int BytePerPixelC[], + unsigned int SwathWidthY[], + unsigned int SwathWidthC[], + unsigned int DPPPerSurface[], + double PSCL_THROUGHPUT[], + double PSCL_THROUGHPUT_CHROMA[], + double Dppclk[], + double ReadBandwidthLuma[], + double ReadBandwidthChroma[], + unsigned int ReturnBusWidth, + + // Output + double *DCFClkDeepSleep) +{ + double zero_double[DML2_MAX_PLANES]; + unsigned int zero_integer[DML2_MAX_PLANES]; + + memset(zero_double, 0, DML2_MAX_PLANES * 
sizeof(double)); + memset(zero_integer, 0, DML2_MAX_PLANES * sizeof(unsigned int)); + + CalculateDCFCLKDeepSleepTdlut( + display_cfg, + NumberOfActiveSurfaces, + BytePerPixelY, + BytePerPixelC, + SwathWidthY, + SwathWidthC, + DPPPerSurface, + PSCL_THROUGHPUT, + PSCL_THROUGHPUT_CHROMA, + Dppclk, + ReadBandwidthLuma, + ReadBandwidthChroma, + ReturnBusWidth, + 0, + zero_integer, //tdlut_bytes_to_deliver, + zero_double, //prefetch_swath_time_us, + + // Output + DCFClkDeepSleep); +} + static double CalculateWriteBackDelay( enum dml2_source_format_class WritebackPixelFormat, double WritebackHRatio, @@ -3816,8 +3894,8 @@ static void CalculateSwathAndDETConfiguration(struct dml2_core_internal_scratch p->SwathHeightC[k] = MaximumSwathHeightC[k] / 2; RoundedUpSwathSizeBytesY[k] = p->full_swath_bytes_l[k] / 2; RoundedUpSwathSizeBytesC[k] = p->full_swath_bytes_c[k] / 2; - p->request_size_bytes_luma[k] = ((p->BytePerPixY[k] == 2) == dml_is_vertical_rotation(p->display_cfg->plane_descriptors[k].composition.rotation_angle)) ? 128 : 64; - p->request_size_bytes_chroma[k] = ((p->BytePerPixC[k] == 2) == dml_is_vertical_rotation(p->display_cfg->plane_descriptors[k].composition.rotation_angle)) ? 128 : 64; + p->request_size_bytes_luma[k] = ((p->BytePerPixY[k] == 2) == dml_is_vertical_rotation(p->display_cfg->plane_descriptors[k].composition.rotation_angle)) ? 128 : 64;; + p->request_size_bytes_chroma[k] = ((p->BytePerPixC[k] == 2) == dml_is_vertical_rotation(p->display_cfg->plane_descriptors[k].composition.rotation_angle)) ? 
128 : 64;; } if (p->SwathHeightC[k] == 0) @@ -4592,6 +4670,7 @@ static void calculate_tdlut_setting( *p->tdlut_groups_per_2row_ub = 0; *p->tdlut_opt_time = 0; *p->tdlut_drain_time = 0; + *p->tdlut_bytes_to_deliver = 0; *p->tdlut_bytes_per_group = 0; *p->tdlut_pte_bytes_per_frame = 0; *p->tdlut_bytes_per_frame = 0; @@ -4660,6 +4739,7 @@ static void calculate_tdlut_setting( *p->tdlut_groups_per_2row_ub = (unsigned int)math_ceil2((double) *p->tdlut_bytes_per_frame / *p->tdlut_bytes_per_group, 1); *p->tdlut_opt_time = (*p->tdlut_bytes_per_frame - p->cursor_buffer_size * 1024) / tdlut_drain_rate; *p->tdlut_drain_time = p->cursor_buffer_size * 1024 / tdlut_drain_rate; + *p->tdlut_bytes_to_deliver = (unsigned int) (p->cursor_buffer_size * 1024.0); } #ifdef __DML_VBA_DEBUG__ @@ -4680,6 +4760,7 @@ static void calculate_tdlut_setting( dml2_printf("DML::%s: tdlut_delivery_cycles = %u\n", __func__, tdlut_delivery_cycles); dml2_printf("DML::%s: tdlut_opt_time = %f\n", __func__, *p->tdlut_opt_time); dml2_printf("DML::%s: tdlut_drain_time = %f\n", __func__, *p->tdlut_drain_time); + dml2_printf("DML::%s: tdlut_bytes_to_deliver = %d\n", __func__, *p->tdlut_bytes_to_deliver); dml2_printf("DML::%s: tdlut_groups_per_2row_ub = %d\n", __func__, *p->tdlut_groups_per_2row_ub); #endif } @@ -5069,20 +5150,18 @@ static bool CalculatePrefetchSchedule(struct dml2_core_internal_scratch *scratch s->trip_to_mem = 0.0; *p->Tvm_trips = 0.0; *p->Tr0_trips = 0.0; - s->Tvm_no_trip_oto = 0.0; - s->Tr0_no_trip_oto = 0.0; s->Tvm_trips_rounded = 0.0; s->Tr0_trips_rounded = 0.0; s->max_Tsw = 0.0; s->Lsw_oto = 0.0; - s->Tpre_rounded = 0.0; + *p->Tpre_rounded = 0.0; s->prefetch_bw_equ = 0.0; s->Tvm_equ = 0.0; s->Tr0_equ = 0.0; s->Tdmbf = 0.0; s->Tdmec = 0.0; s->Tdmsks = 0.0; - s->prefetch_sw_bytes = 0.0; + *p->prefetch_sw_bytes = 0.0; s->prefetch_bw_pr = 0.0; s->bytes_pp = 0.0; s->dep_bytes = 0.0; @@ -5207,6 +5286,7 @@ static bool CalculatePrefetchSchedule(struct dml2_core_internal_scratch *scratch 
dml2_printf("DML::%s: setup_for_tdlut = %u\n", __func__, p->setup_for_tdlut); dml2_printf("DML::%s: tdlut_opt_time = %f\n", __func__, p->tdlut_opt_time); dml2_printf("DML::%s: tdlut_pte_bytes_per_frame = %u\n", __func__, p->tdlut_pte_bytes_per_frame); + dml2_printf("DML::%s: tdlut_drain_time = %f\n", __func__, p->tdlut_drain_time); #endif if (p->OutputFormat == dml2_420 || (p->myPipe->InterlaceEnable && p->myPipe->ProgressiveToInterlaceUnitInOPP)) @@ -5277,23 +5357,8 @@ static bool CalculatePrefetchSchedule(struct dml2_core_internal_scratch *scratch s->bytes_pp = p->myPipe->BytePerPixelY + p->myPipe->BytePerPixelC; } - s->prefetch_bw_pr = s->bytes_pp * p->myPipe->PixelClock / (double)p->myPipe->DPPPerSurface; - if (p->myPipe->VRatio < 1.0) - s->prefetch_bw_pr = p->myPipe->VRatio * s->prefetch_bw_pr; - s->max_Tsw = (math_max2(p->PrefetchSourceLinesY, p->PrefetchSourceLinesC) * s->LineTime); - - s->prefetch_sw_bytes = p->PrefetchSourceLinesY * p->swath_width_luma_ub * p->myPipe->BytePerPixelY + p->PrefetchSourceLinesC * p->swath_width_chroma_ub * p->myPipe->BytePerPixelC; - s->prefetch_bw_pr = s->prefetch_bw_pr * p->mall_prefetch_sdp_overhead_factor; - s->prefetch_sw_bytes = s->prefetch_sw_bytes * p->mall_prefetch_sdp_overhead_factor; - s->prefetch_bw_oto = math_max2(s->prefetch_bw_pr, s->prefetch_sw_bytes / s->max_Tsw); - - s->min_Lsw_oto = math_max2(p->PrefetchSourceLinesY, p->PrefetchSourceLinesC) / __DML2_CALCS_MAX_VRATIO_PRE_OTO__; - s->min_Lsw_oto = math_max2(s->min_Lsw_oto, 2.0); - s->min_Lsw_oto = math_max2(s->min_Lsw_oto, p->tdlut_drain_time / s->LineTime); - - s->min_Lsw_equ = math_max2(p->PrefetchSourceLinesY, p->PrefetchSourceLinesC) / __DML2_CALCS_MAX_VRATIO_PRE_EQU__; - s->min_Lsw_equ = math_max2(s->min_Lsw_equ, 2.0); - s->min_Lsw_equ = math_max2(s->min_Lsw_equ, p->tdlut_drain_time / s->LineTime); + *p->prefetch_sw_bytes = p->PrefetchSourceLinesY * p->swath_width_luma_ub * p->myPipe->BytePerPixelY + p->PrefetchSourceLinesC * p->swath_width_chroma_ub * 
p->myPipe->BytePerPixelC; + *p->prefetch_sw_bytes = *p->prefetch_sw_bytes * p->mall_prefetch_sdp_overhead_factor; vm_bytes = p->vm_bytes; // vm_bytes is dpde0_bytes_per_frame_ub_l + dpde0_bytes_per_frame_ub_c + 2*extra_dpde_bytes; extra_tdpe_bytes = (unsigned int)math_max2(0, (p->display_cfg->gpuvm_max_page_table_levels - 1) * 128); @@ -5302,57 +5367,103 @@ static bool CalculatePrefetchSchedule(struct dml2_core_internal_scratch *scratch vm_bytes = vm_bytes + p->tdlut_pte_bytes_per_frame + (p->display_cfg->gpuvm_enable ? extra_tdpe_bytes : 0); tdlut_row_bytes = (unsigned long) math_ceil2(p->tdlut_bytes_per_frame/2.0, 1.0); + + s->min_Lsw_oto = math_max2(p->PrefetchSourceLinesY, p->PrefetchSourceLinesC) / __DML2_CALCS_MAX_VRATIO_PRE_OTO__; + s->min_Lsw_oto = math_max2(s->min_Lsw_oto, p->tdlut_drain_time / s->LineTime); + s->min_Lsw_oto = math_max2(s->min_Lsw_oto, 2.0); + + // use vactive swath bw for prefetch oto and also cap prefetch_bw_oto to max_vratio_oto + // Note: in prefetch calculation, accounting is done mostly per-pipe.
+ // vactive swath bw represents the per-surface (aka per dml plane) bw to move vratio_l/c lines of bytes_l/c per line time + s->per_pipe_vactive_sw_bw = p->vactive_sw_bw_l / (double)p->myPipe->DPPPerSurface; + + // one-to-one prefetch bw as one line of bytes per line time (as per vratio_pre_l/c = 1) + s->prefetch_bw_oto = (p->swath_width_luma_ub * p->myPipe->BytePerPixelY) / s->LineTime; + + if (p->myPipe->BytePerPixelC > 0) { + s->per_pipe_vactive_sw_bw += p->vactive_sw_bw_c / (double)p->myPipe->DPPPerSurface; + s->prefetch_bw_oto += (p->swath_width_chroma_ub * p->myPipe->BytePerPixelC) / s->LineTime; + } + + s->prefetch_bw_oto = math_max2(s->per_pipe_vactive_sw_bw, s->prefetch_bw_oto) * p->mall_prefetch_sdp_overhead_factor; + + s->prefetch_bw_oto = math_min2(s->prefetch_bw_oto, *p->prefetch_sw_bytes/(s->min_Lsw_oto*s->LineTime)); + + s->Lsw_oto = math_ceil2(4.0 * *p->prefetch_sw_bytes / s->prefetch_bw_oto / s->LineTime, 1.0) / 4.0; + s->prefetch_bw_oto = math_max3(s->prefetch_bw_oto, p->vm_bytes * p->HostVMInefficiencyFactor / (31 * s->LineTime) - *p->Tno_bw, (p->PixelPTEBytesPerRow * p->HostVMInefficiencyFactor + p->meta_row_bytes + tdlut_row_bytes) / (15 * s->LineTime)); - s->Lsw_oto = math_ceil2(4.0 * math_max2(s->prefetch_sw_bytes / s->prefetch_bw_oto / s->LineTime, s->min_Lsw_oto), 1.0) / 4.0; + +#ifdef __DML_VBA_DEBUG__ + dml2_printf("DML::%s: vactive_sw_bw_l = %f\n", __func__, p->vactive_sw_bw_l); + dml2_printf("DML::%s: vactive_sw_bw_c = %f\n", __func__, p->vactive_sw_bw_c); + dml2_printf("DML::%s: per_pipe_vactive_sw_bw = %f\n", __func__, s->per_pipe_vactive_sw_bw); +#endif if (p->display_cfg->gpuvm_enable == true) { - s->Tvm_no_trip_oto = math_max2( + s->Tvm_oto = math_max3( + *p->Tvm_trips, *p->Tno_bw + vm_bytes * p->HostVMInefficiencyFactor / s->prefetch_bw_oto, s->LineTime / 4.0); - s->Tvm_oto = math_max2( - *p->Tvm_trips, - s->Tvm_no_trip_oto); + #ifdef __DML_VBA_DEBUG__ dml2_printf("DML::%s: Tvm_oto max0 = %f\n", __func__, *p->Tvm_trips); 
dml2_printf("DML::%s: Tvm_oto max1 = %f\n", __func__, *p->Tno_bw + vm_bytes * p->HostVMInefficiencyFactor / s->prefetch_bw_oto); dml2_printf("DML::%s: Tvm_oto max2 = %f\n", __func__, s->LineTime / 4.0); #endif } else { - s->Tvm_no_trip_oto = s->Tvm_trips_rounded; s->Tvm_oto = s->Tvm_trips_rounded; } if ((p->display_cfg->gpuvm_enable == true || p->setup_for_tdlut || dcc_mrq_enable)) { - s->Tr0_no_trip_oto = math_max2( + s->Tr0_oto = math_max3( + *p->Tr0_trips, (p->PixelPTEBytesPerRow * p->HostVMInefficiencyFactor + p->meta_row_bytes + tdlut_row_bytes) / s->prefetch_bw_oto, s->LineTime / 4.0); - s->Tr0_oto = math_max2( - *p->Tr0_trips, - s->Tr0_no_trip_oto); #ifdef __DML_VBA_DEBUG__ dml2_printf("DML::%s: Tr0_oto max0 = %f\n", __func__, *p->Tr0_trips); dml2_printf("DML::%s: Tr0_oto max1 = %f\n", __func__, (p->PixelPTEBytesPerRow * p->HostVMInefficiencyFactor + p->meta_row_bytes + tdlut_row_bytes) / s->prefetch_bw_oto); dml2_printf("DML::%s: Tr0_oto max2 = %f\n", __func__, s->LineTime / 4); #endif - } else { - s->Tr0_no_trip_oto = (s->LineTime - s->Tvm_oto) / 4.0; - s->Tr0_oto = s->Tr0_no_trip_oto; - } + } else + s->Tr0_oto = s->LineTime / 4.0; s->Tvm_oto_lines = math_ceil2(4.0 * s->Tvm_oto / s->LineTime, 1) / 4.0; s->Tr0_oto_lines = math_ceil2(4.0 * s->Tr0_oto / s->LineTime, 1) / 4.0; s->dst_y_prefetch_oto = s->Tvm_oto_lines + 2 * s->Tr0_oto_lines + s->Lsw_oto; +#ifdef DML_GLOBAL_PREFETCH_CHECK + dml2_printf("DML::%s: impacted_Tpre = %f\n", __func__, p->impacted_dst_y_pre); + if (p->impacted_dst_y_pre > 0) { + dml2_printf("DML::%s: dst_y_prefetch_oto = %f\n", __func__, s->dst_y_prefetch_oto); + s->dst_y_prefetch_oto = math_max2(s->dst_y_prefetch_oto, p->impacted_dst_y_pre); + dml2_printf("DML::%s: dst_y_prefetch_oto = %f (impacted)\n", __func__, s->dst_y_prefetch_oto); + } +#endif + *p->Tpre_oto = s->dst_y_prefetch_oto * s->LineTime; + //To (time for delay after scaler) in line time Lo = (unsigned int)(*p->DSTYAfterScaler + (double)*p->DSTXAfterScaler / 
(double)p->myPipe->HTotal); + s->min_Lsw_equ = math_max2(p->PrefetchSourceLinesY, p->PrefetchSourceLinesC) / __DML2_CALCS_MAX_VRATIO_PRE_EQU__; + s->min_Lsw_equ = math_max2(s->min_Lsw_equ, p->tdlut_drain_time / s->LineTime); + s->min_Lsw_equ = math_max2(s->min_Lsw_equ, 2.0); //Tpre_equ in line time if (p->DynamicMetadataVMEnabled && p->DynamicMetadataEnable) s->dst_y_prefetch_equ = p->VStartup - (*p->TSetup + math_max2(p->TCalc, *p->Tvm_trips) + s->TWait_p) / s->LineTime - Lo; else s->dst_y_prefetch_equ = p->VStartup - (*p->TSetup + math_max2(p->TCalc, p->ExtraLatencyPrefetch) + s->TWait_p) / s->LineTime - Lo; + +#ifdef DML_GLOBAL_PREFETCH_CHECK + s->dst_y_prefetch_equ_impacted = math_max2(p->impacted_dst_y_pre, s->dst_y_prefetch_equ); + + s->dst_y_prefetch_equ_impacted = math_min2(s->dst_y_prefetch_equ_impacted, 63.75); // limit to the reg limit of U6.2 for DST_Y_PREFETCH + + if (s->dst_y_prefetch_equ_impacted > s->dst_y_prefetch_equ) + s->dst_y_prefetch_equ -= s->dst_y_prefetch_equ_impacted - s->dst_y_prefetch_equ; +#endif + s->dst_y_prefetch_equ = math_min2(s->dst_y_prefetch_equ, 63.75); // limit to the reg limit of U6.2 for DST_Y_PREFETCH #ifdef __DML_VBA_DEBUG__ @@ -5370,7 +5481,7 @@ static bool CalculatePrefetchSchedule(struct dml2_core_internal_scratch *scratch dml2_printf("DML::%s: BytePerPixelC = %u\n", __func__, p->myPipe->BytePerPixelC); dml2_printf("DML::%s: PrefetchSourceLinesC = %f\n", __func__, p->PrefetchSourceLinesC); dml2_printf("DML::%s: swath_width_chroma_ub = %u\n", __func__, p->swath_width_chroma_ub); - dml2_printf("DML::%s: prefetch_sw_bytes = %f\n", __func__, s->prefetch_sw_bytes); + dml2_printf("DML::%s: prefetch_sw_bytes = %f\n", __func__, *p->prefetch_sw_bytes); dml2_printf("DML::%s: max_Tsw = %f\n", __func__, s->max_Tsw); dml2_printf("DML::%s: bytes_pp = %f\n", __func__, s->bytes_pp); dml2_printf("DML::%s: vm_bytes = %u\n", __func__, vm_bytes); @@ -5394,7 +5505,7 @@ static bool CalculatePrefetchSchedule(struct dml2_core_internal_scratch 
*scratch #endif double Tpre = s->dst_y_prefetch_equ * s->LineTime; s->dst_y_prefetch_equ = math_floor2(4.0 * (s->dst_y_prefetch_equ + 0.125), 1) / 4.0; - s->Tpre_rounded = s->dst_y_prefetch_equ * s->LineTime; + *p->Tpre_rounded = s->dst_y_prefetch_equ * s->LineTime; #ifdef __DML_VBA_DEBUG__ dml2_printf("DML::%s: dst_y_prefetch_equ: %f (after round)\n", __func__, s->dst_y_prefetch_equ); @@ -5420,7 +5531,7 @@ static bool CalculatePrefetchSchedule(struct dml2_core_internal_scratch *scratch dml2_printf("DML::%s: vm_bytes: %f (hvm inefficiency scaled)\n", __func__, vm_bytes*p->HostVMInefficiencyFactor); dml2_printf("DML::%s: row_bytes: %f (hvm inefficiency scaled, 1 row)\n", __func__, p->PixelPTEBytesPerRow*p->HostVMInefficiencyFactor+p->meta_row_bytes+tdlut_row_bytes); dml2_printf("DML::%s: Tno_bw: %f\n", __func__, *p->Tno_bw); - dml2_printf("DML::%s: Tpre=%f Tpre_rounded: %f, delta=%f\n", __func__, Tpre, s->Tpre_rounded, (s->Tpre_rounded - Tpre)); + dml2_printf("DML::%s: Tpre=%f Tpre_rounded: %f, delta=%f\n", __func__, Tpre, *p->Tpre_rounded, (*p->Tpre_rounded - Tpre)); dml2_printf("DML::%s: Tvm_trips=%f Tvm_trips_rounded: %f, delta=%f\n", __func__, *p->Tvm_trips, s->Tvm_trips_rounded, (s->Tvm_trips_rounded - *p->Tvm_trips)); #endif @@ -5434,78 +5545,85 @@ static bool CalculatePrefetchSchedule(struct dml2_core_internal_scratch *scratch // Tpre_rounded is Tpre rounding to 2-bit fraction // Tvm_trips_rounded is Tvm_trips ceiling to 1/4 line time // Tr0_trips_rounded is Tr0_trips ceiling to 1/4 line time - // So that means prefetch bw calculated can be higher since the total time availabe for prefetch is less - bool min_Lsw_equ_ok = s->Tpre_rounded >= s->Tvm_trips_rounded + 2.0*s->Tr0_trips_rounded + s->min_Lsw_equ*s->LineTime; + // So that means prefetch bw calculated can be higher since the total time available for prefetch is less + bool min_Lsw_equ_ok = *p->Tpre_rounded >= s->Tvm_trips_rounded + 2.0*s->Tr0_trips_rounded + s->min_Lsw_equ*s->LineTime; + bool 
tpre_gt_req_latency = true; +#if 0 + // Check that Tpre_rounded is big enough if all of the stages of the prefetch are time constrained. + // The terms Tvm_trips_rounded and Tr0_trips_rounded represent the min time constraints for the VM and row stages. + // Normally, these terms cover the overall time constraint for Tpre >= (Tex + max{Ttrip, Turg}), but if these terms are at their minimum, an explicit check is necessary. + tpre_gt_req_latency = *p->Tpre_rounded > (math_max2(p->Turg, s->trip_to_mem) + p->ExtraLatencyPrefetch); +#endif - if (s->dst_y_prefetch_equ > 1 && min_Lsw_equ_ok) { + if (s->dst_y_prefetch_equ > 1 && min_Lsw_equ_ok && tpre_gt_req_latency) { s->prefetch_bw1 = 0.; s->prefetch_bw2 = 0.; s->prefetch_bw3 = 0.; s->prefetch_bw4 = 0.; // prefetch_bw1: VM + 2*R0 + SW - if (s->Tpre_rounded - *p->Tno_bw > 0) { + if (*p->Tpre_rounded - *p->Tno_bw > 0) { s->prefetch_bw1 = (vm_bytes * p->HostVMInefficiencyFactor + 2 * (p->PixelPTEBytesPerRow * p->HostVMInefficiencyFactor + p->meta_row_bytes + tdlut_row_bytes) - + s->prefetch_sw_bytes) - / (s->Tpre_rounded - *p->Tno_bw); - s->Tsw_est1 = s->prefetch_sw_bytes / s->prefetch_bw1; + + *p->prefetch_sw_bytes) + / (*p->Tpre_rounded - *p->Tno_bw); + s->Tsw_est1 = *p->prefetch_sw_bytes / s->prefetch_bw1; } else s->prefetch_bw1 = 0; dml2_printf("DML::%s: prefetch_bw1: %f\n", __func__, s->prefetch_bw1); - if ((s->Tsw_est1 < s->min_Lsw_equ * s->LineTime) && (s->Tpre_rounded - s->min_Lsw_equ * s->LineTime - 0.75 * s->LineTime - *p->Tno_bw > 0)) { + if ((s->Tsw_est1 < s->min_Lsw_equ * s->LineTime) && (*p->Tpre_rounded - s->min_Lsw_equ * s->LineTime - 0.75 * s->LineTime - *p->Tno_bw > 0)) { s->prefetch_bw1 = (vm_bytes * p->HostVMInefficiencyFactor + 2 * (p->PixelPTEBytesPerRow * p->HostVMInefficiencyFactor + p->meta_row_bytes + tdlut_row_bytes)) / - (s->Tpre_rounded - s->min_Lsw_equ * s->LineTime - 0.75 * s->LineTime - *p->Tno_bw); + (*p->Tpre_rounded - s->min_Lsw_equ * s->LineTime - 0.75 * s->LineTime - *p->Tno_bw); #ifdef 
__DML_VBA_DEBUG__ dml2_printf("DML::%s: vm and 2 rows bytes = %f\n", __func__, (vm_bytes * p->HostVMInefficiencyFactor + 2 * (p->PixelPTEBytesPerRow * p->HostVMInefficiencyFactor + p->meta_row_bytes + tdlut_row_bytes))); - dml2_printf("DML::%s: Tpre_rounded = %f\n", __func__, s->Tpre_rounded); + dml2_printf("DML::%s: Tpre_rounded = %f\n", __func__, *p->Tpre_rounded); dml2_printf("DML::%s: minus term = %f\n", __func__, s->min_Lsw_equ * s->LineTime + 0.75 * s->LineTime + *p->Tno_bw); dml2_printf("DML::%s: min_Lsw_equ = %f\n", __func__, s->min_Lsw_equ); dml2_printf("DML::%s: LineTime = %f\n", __func__, s->LineTime); dml2_printf("DML::%s: Tno_bw = %f\n", __func__, *p->Tno_bw); - dml2_printf("DML::%s: Time to fetch vm and 2 rows = %f\n", __func__, (s->Tpre_rounded - s->min_Lsw_equ * s->LineTime - 0.75 * s->LineTime - *p->Tno_bw)); + dml2_printf("DML::%s: Time to fetch vm and 2 rows = %f\n", __func__, (*p->Tpre_rounded - s->min_Lsw_equ * s->LineTime - 0.75 * s->LineTime - *p->Tno_bw)); dml2_printf("DML::%s: prefetch_bw1: %f (updated)\n", __func__, s->prefetch_bw1); #endif } // prefetch_bw2: VM + SW - if (s->Tpre_rounded - *p->Tno_bw - 2.0 * s->Tr0_trips_rounded > 0) { - s->prefetch_bw2 = (vm_bytes * p->HostVMInefficiencyFactor + s->prefetch_sw_bytes) / - (s->Tpre_rounded - *p->Tno_bw - 2.0 * s->Tr0_trips_rounded); - s->Tsw_est2 = s->prefetch_sw_bytes / s->prefetch_bw2; + if (*p->Tpre_rounded - *p->Tno_bw - 2.0 * s->Tr0_trips_rounded > 0) { + s->prefetch_bw2 = (vm_bytes * p->HostVMInefficiencyFactor + *p->prefetch_sw_bytes) / + (*p->Tpre_rounded - *p->Tno_bw - 2.0 * s->Tr0_trips_rounded); + s->Tsw_est2 = *p->prefetch_sw_bytes / s->prefetch_bw2; } else s->prefetch_bw2 = 0; dml2_printf("DML::%s: prefetch_bw2: %f\n", __func__, s->prefetch_bw2); - if ((s->Tsw_est2 < s->min_Lsw_equ * s->LineTime) && ((s->Tpre_rounded - *p->Tno_bw - 2.0 * s->Tr0_trips_rounded - s->min_Lsw_equ * s->LineTime - 0.25 * s->LineTime) > 0)) { - s->prefetch_bw2 = vm_bytes * p->HostVMInefficiencyFactor 
/ (s->Tpre_rounded - *p->Tno_bw - 2.0 * s->Tr0_trips_rounded - s->min_Lsw_equ * s->LineTime - 0.25 * s->LineTime); + if ((s->Tsw_est2 < s->min_Lsw_equ * s->LineTime) && ((*p->Tpre_rounded - *p->Tno_bw - 2.0 * s->Tr0_trips_rounded - s->min_Lsw_equ * s->LineTime - 0.25 * s->LineTime) > 0)) { + s->prefetch_bw2 = vm_bytes * p->HostVMInefficiencyFactor / (*p->Tpre_rounded - *p->Tno_bw - 2.0 * s->Tr0_trips_rounded - s->min_Lsw_equ * s->LineTime - 0.25 * s->LineTime); dml2_printf("DML::%s: prefetch_bw2: %f (updated)\n", __func__, s->prefetch_bw2); } // prefetch_bw3: 2*R0 + SW - if (s->Tpre_rounded - s->Tvm_trips_rounded > 0) { - s->prefetch_bw3 = (2 * (p->PixelPTEBytesPerRow * p->HostVMInefficiencyFactor + p->meta_row_bytes + tdlut_row_bytes) + s->prefetch_sw_bytes) / - (s->Tpre_rounded - s->Tvm_trips_rounded); - s->Tsw_est3 = s->prefetch_sw_bytes / s->prefetch_bw3; + if (*p->Tpre_rounded - s->Tvm_trips_rounded > 0) { + s->prefetch_bw3 = (2 * (p->PixelPTEBytesPerRow * p->HostVMInefficiencyFactor + p->meta_row_bytes + tdlut_row_bytes) + *p->prefetch_sw_bytes) / + (*p->Tpre_rounded - s->Tvm_trips_rounded); + s->Tsw_est3 = *p->prefetch_sw_bytes / s->prefetch_bw3; } else s->prefetch_bw3 = 0; dml2_printf("DML::%s: prefetch_bw3: %f\n", __func__, s->prefetch_bw3); - if ((s->Tsw_est3 < s->min_Lsw_equ * s->LineTime) && ((s->Tpre_rounded - s->min_Lsw_equ * s->LineTime - 0.5 * s->LineTime - s->Tvm_trips_rounded) > 0)) { - s->prefetch_bw3 = (2 * (p->PixelPTEBytesPerRow * p->HostVMInefficiencyFactor + p->meta_row_bytes + tdlut_row_bytes)) / (s->Tpre_rounded - s->min_Lsw_equ * s->LineTime - 0.5 * s->LineTime - s->Tvm_trips_rounded); + if ((s->Tsw_est3 < s->min_Lsw_equ * s->LineTime) && ((*p->Tpre_rounded - s->min_Lsw_equ * s->LineTime - 0.5 * s->LineTime - s->Tvm_trips_rounded) > 0)) { + s->prefetch_bw3 = (2 * (p->PixelPTEBytesPerRow * p->HostVMInefficiencyFactor + p->meta_row_bytes + tdlut_row_bytes)) / (*p->Tpre_rounded - s->min_Lsw_equ * s->LineTime - 0.5 * s->LineTime - 
s->Tvm_trips_rounded); dml2_printf("DML::%s: prefetch_bw3: %f (updated)\n", __func__, s->prefetch_bw3); } // prefetch_bw4: SW - if (s->Tpre_rounded - s->Tvm_trips_rounded - 2 * s->Tr0_trips_rounded > 0) - s->prefetch_bw4 = s->prefetch_sw_bytes / (s->Tpre_rounded - s->Tvm_trips_rounded - 2 * s->Tr0_trips_rounded); + if (*p->Tpre_rounded - s->Tvm_trips_rounded - 2 * s->Tr0_trips_rounded > 0) + s->prefetch_bw4 = *p->prefetch_sw_bytes / (*p->Tpre_rounded - s->Tvm_trips_rounded - 2 * s->Tr0_trips_rounded); else s->prefetch_bw4 = 0; #ifdef __DML_VBA_DEBUG__ dml2_printf("DML::%s: Tno_bw: %f\n", __func__, *p->Tno_bw); - dml2_printf("DML::%s: Tpre=%f Tpre_rounded: %f, delta=%f\n", __func__, Tpre, s->Tpre_rounded, (s->Tpre_rounded - Tpre)); + dml2_printf("DML::%s: Tpre=%f Tpre_rounded: %f, delta=%f\n", __func__, Tpre, *p->Tpre_rounded, (*p->Tpre_rounded - Tpre)); dml2_printf("DML::%s: Tvm_trips=%f Tvm_trips_rounded: %f, delta=%f\n", __func__, *p->Tvm_trips, s->Tvm_trips_rounded, (s->Tvm_trips_rounded - *p->Tvm_trips)); dml2_printf("DML::%s: Tr0_trips=%f Tr0_trips_rounded: %f, delta=%f\n", __func__, *p->Tr0_trips, s->Tr0_trips_rounded, (s->Tr0_trips_rounded - *p->Tr0_trips)); dml2_printf("DML::%s: Tsw_est1: %f\n", __func__, s->Tsw_est1); @@ -5617,9 +5735,6 @@ static bool CalculatePrefetchSchedule(struct dml2_core_internal_scratch *scratch dml2_printf("DML::%s: Tvm_equ = %f\n", __func__, s->Tvm_equ); dml2_printf("DML::%s: Tr0_equ = %f\n", __func__, s->Tr0_equ); #endif - // Lsw = dst_y_prefetch - (dst_y_per_vm_vblank + 2*dst_y_per_row_vblank) - s->Lsw_equ = s->dst_y_prefetch_equ - math_ceil2(4.0 * (s->Tvm_equ + 2 * s->Tr0_equ) / s->LineTime, 1.0) / 4.0; - // Use the more stressful prefetch schedule if (s->dst_y_prefetch_oto < s->dst_y_prefetch_equ) { *p->dst_y_prefetch = s->dst_y_prefetch_oto; @@ -5628,31 +5743,33 @@ static bool CalculatePrefetchSchedule(struct dml2_core_internal_scratch *scratch *p->dst_y_per_vm_vblank = math_ceil2(4.0 * s->TimeForFetchingVM / s->LineTime, 
1.0) / 4.0; *p->dst_y_per_row_vblank = math_ceil2(4.0 * s->TimeForFetchingRowInVBlank / s->LineTime, 1.0) / 4.0; - s->dst_y_per_vm_no_trip_vblank = math_ceil2(4.0 * s->Tvm_no_trip_oto / s->LineTime, 1.0) / 4.0; - s->dst_y_per_row_no_trip_vblank = math_ceil2(4.0 * s->Tr0_no_trip_oto / s->LineTime, 1.0) / 4.0; #ifdef __DML_VBA_DEBUG__ dml2_printf("DML::%s: Using oto scheduling for prefetch\n", __func__); #endif + } else { *p->dst_y_prefetch = s->dst_y_prefetch_equ; + + if (s->dst_y_prefetch_equ < s->dst_y_prefetch_equ_impacted) + *p->dst_y_prefetch = s->dst_y_prefetch_equ_impacted; + s->TimeForFetchingVM = s->Tvm_equ; s->TimeForFetchingRowInVBlank = s->Tr0_equ; - *p->dst_y_per_vm_vblank = math_ceil2(4.0 * s->TimeForFetchingVM / s->LineTime, 1.0) / 4.0; - *p->dst_y_per_row_vblank = math_ceil2(4.0 * s->TimeForFetchingRowInVBlank / s->LineTime, 1.0) / 4.0; - s->dst_y_per_vm_no_trip_vblank = *p->dst_y_per_vm_vblank; - s->dst_y_per_row_no_trip_vblank = *p->dst_y_per_row_vblank; + *p->dst_y_per_vm_vblank = math_ceil2(4.0 * s->TimeForFetchingVM / s->LineTime, 1.0) / 4.0; + *p->dst_y_per_row_vblank = math_ceil2(4.0 * s->TimeForFetchingRowInVBlank / s->LineTime, 1.0) / 4.0; #ifdef __DML_VBA_DEBUG__ dml2_printf("DML::%s: Using equ bw scheduling for prefetch\n", __func__); #endif } - /* take worst case Lsw to calculate bandwidth requirement regardless of schedule */ - s->LinesToRequestPrefetchPixelData = math_min2(s->Lsw_equ, s->Lsw_oto); // Lsw + // Lsw = dst_y_prefetch - (dst_y_per_vm_vblank + 2*dst_y_per_row_vblank) + s->LinesToRequestPrefetchPixelData = *p->dst_y_prefetch - *p->dst_y_per_vm_vblank - 2 * *p->dst_y_per_row_vblank; // Lsw s->cursor_prefetch_bytes = (unsigned int)math_max2(p->cursor_bytes_per_chunk, 4 * p->cursor_bytes_per_line); *p->prefetch_cursor_bw = p->num_cursors * s->cursor_prefetch_bytes / (s->LinesToRequestPrefetchPixelData * s->LineTime); + *p->prefetch_swath_time_us = (s->LinesToRequestPrefetchPixelData * s->LineTime); #ifdef __DML_VBA_DEBUG__ 
dml2_printf("DML::%s: TimeForFetchingVM = %f\n", __func__, s->TimeForFetchingVM); @@ -5663,6 +5780,7 @@ static bool CalculatePrefetchSchedule(struct dml2_core_internal_scratch *scratch dml2_printf("DML::%s: dst_y_per_row_vblank = %f\n", __func__, *p->dst_y_per_row_vblank); dml2_printf("DML::%s: LinesToRequestPrefetchPixelData = %f\n", __func__, s->LinesToRequestPrefetchPixelData); dml2_printf("DML::%s: PrefetchSourceLinesY = %f\n", __func__, p->PrefetchSourceLinesY); + dml2_printf("DML::%s: prefetch_swath_time_us = %f\n", __func__, *p->prefetch_swath_time_us); dml2_printf("DML::%s: cursor_bytes_per_chunk = %d\n", __func__, p->cursor_bytes_per_chunk); dml2_printf("DML::%s: cursor_bytes_per_line = %d\n", __func__, p->cursor_bytes_per_line); @@ -5749,8 +5867,10 @@ static bool CalculatePrefetchSchedule(struct dml2_core_internal_scratch *scratch } else { dml2_printf("DML::%s: No time to prefetch! dst_y_prefetch_equ = %f (should be > 1)\n", __func__, s->dst_y_prefetch_equ); - dml2_printf("DML::%s: No time to prefetch! min_Lsw_equ_ok = %d, Tpre_rounded (%f) should be >= Tvm_trips_rounded (%f) + 2.0*Tr0_trips_rounded (%f) + min_Tsw_equ (%f)\n", - __func__, min_Lsw_equ_ok, s->Tpre_rounded, s->Tvm_trips_rounded, 2.0*s->Tr0_trips_rounded, s->min_Lsw_equ*s->LineTime); + dml2_printf("DML::%s: No time to prefetch! min_Lsw_equ_ok = %d, Tpre_rounded (%f) should be >= Tvm_trips_rounded (%f) + 2.0*Tr0_trips_rounded (%f) + min_Tsw_equ (%f)\n", + __func__, min_Lsw_equ_ok, *p->Tpre_rounded, s->Tvm_trips_rounded, 2.0*s->Tr0_trips_rounded, s->min_Lsw_equ*s->LineTime); + dml2_printf("DML::%s: No time to prefetch! 
min_Lsw_equ_ok = %d, Tpre_rounded+Tvm_trips_rounded+2.0*Tr0_trips_rounded+min_Tsw_equ (%f) should be > \n", + __func__, tpre_gt_req_latency, (s->min_Lsw_equ*s->LineTime + s->Tvm_trips_rounded + 2.0*s->Tr0_trips_rounded), p->Turg, s->trip_to_mem, p->ExtraLatencyPrefetch); s->NoTimeToPrefetch = true; s->TimeForFetchingVM = 0; s->TimeForFetchingRowInVBlank = 0; @@ -5769,13 +5889,13 @@ static bool CalculatePrefetchSchedule(struct dml2_core_internal_scratch *scratch if (vm_bytes == 0) { prefetch_vm_bw = 0; - } else if (s->dst_y_per_vm_no_trip_vblank > 0) { + } else if (*p->dst_y_per_vm_vblank > 0) { #ifdef __DML_VBA_DEBUG__ dml2_printf("DML::%s: HostVMInefficiencyFactor = %f\n", __func__, p->HostVMInefficiencyFactor); dml2_printf("DML::%s: dst_y_per_vm_vblank = %f\n", __func__, *p->dst_y_per_vm_vblank); dml2_printf("DML::%s: LineTime = %f\n", __func__, s->LineTime); #endif - prefetch_vm_bw = vm_bytes * p->HostVMInefficiencyFactor / (s->dst_y_per_vm_no_trip_vblank * s->LineTime); + prefetch_vm_bw = vm_bytes * p->HostVMInefficiencyFactor / (*p->dst_y_per_vm_vblank * s->LineTime); #ifdef __DML_VBA_DEBUG__ dml2_printf("DML::%s: prefetch_vm_bw = %f\n", __func__, prefetch_vm_bw); #endif @@ -5787,8 +5907,8 @@ static bool CalculatePrefetchSchedule(struct dml2_core_internal_scratch *scratch if (p->PixelPTEBytesPerRow == 0 && tdlut_row_bytes == 0) { prefetch_row_bw = 0; - } else if (s->dst_y_per_row_no_trip_vblank > 0) { - prefetch_row_bw = (p->PixelPTEBytesPerRow * p->HostVMInefficiencyFactor + tdlut_row_bytes) / (s->dst_y_per_row_no_trip_vblank * s->LineTime); + } else if (*p->dst_y_per_row_vblank > 0) { + prefetch_row_bw = (p->PixelPTEBytesPerRow * p->HostVMInefficiencyFactor + tdlut_row_bytes) / (*p->dst_y_per_row_vblank * s->LineTime); #ifdef __DML_VBA_DEBUG__ dml2_printf("DML::%s: PixelPTEBytesPerRow = %u\n", __func__, p->PixelPTEBytesPerRow); @@ -5828,6 +5948,171 @@ static bool CalculatePrefetchSchedule(struct dml2_core_internal_scratch *scratch return s->NoTimeToPrefetch; 
} +static unsigned int get_num_lb_source_lines(unsigned int max_line_buffer_lines, + unsigned int line_buffer_size_bits, + unsigned int num_pipes, + unsigned int vp_width, + unsigned int vp_height, + double h_ratio, + enum dml2_rotation_angle rotation_angle) +{ + unsigned int num_lb_source_lines = 0; + double lb_bit_per_pixel = 57.0; + unsigned recin_width = vp_width/num_pipes; + + if (dml_is_vertical_rotation(rotation_angle)) + recin_width = vp_height/num_pipes; + + num_lb_source_lines = (unsigned int) math_min2((double) max_line_buffer_lines, + math_floor2(line_buffer_size_bits / lb_bit_per_pixel / (recin_width / math_max2(h_ratio, 1.0)), 1.0)); + + return num_lb_source_lines; +} + +static unsigned int find_max_impact_plane(unsigned int this_plane_idx, unsigned int num_planes, unsigned int Trpd_dcfclk_cycles[]) +{ + int max_value = -1; + int max_idx = -1; + for (unsigned int i = 0; i < num_planes; i++) { + if (i != this_plane_idx && (int) Trpd_dcfclk_cycles[i] > max_value) { + max_value = Trpd_dcfclk_cycles[i]; + max_idx = i; + } + } + if (max_idx <= 0) { + dml2_assert(max_idx >= 0); + max_idx = this_plane_idx; + } + + return max_idx; +} + +static double calculate_impacted_Tsw(unsigned int exclude_plane_idx, unsigned int num_planes, double *prefetch_swath_bytes, double bw_mbps) +{ + double sum = 0.; + for (unsigned int i = 0; i < num_planes; i++) { + if (i != exclude_plane_idx) { + sum += prefetch_swath_bytes[i]; + } + } + return sum / bw_mbps; +} + +// a global check against the aggregate effect of the per plane prefetch schedule +static bool CheckGlobalPrefetchAdmissibility(struct dml2_core_internal_scratch *scratch, + struct dml2_core_calcs_CheckGlobalPrefetchAdmissibility_params *p) +{ + struct dml2_core_calcs_CheckGlobalPrefetchAdmissibility_locals *s = &scratch->CheckGlobalPrefetchAdmissibility_locals; + unsigned int i, k; + + memset(s, 0, sizeof(struct dml2_core_calcs_CheckGlobalPrefetchAdmissibility_locals)); + + *p->recalc_prefetch_schedule = 0; + 
s->prefetch_global_check_passed = 1; + // worst case if the rob and cdb is fully hogged + s->max_Trpd_dcfclk_cycles = (unsigned int) math_ceil2((p->rob_buffer_size_kbytes*1024 + p->compressed_buffer_size_kbytes*DML_MAX_COMPRESSION_RATIO*1024)/64.0, 1.0); +#ifdef __DML_VBA_DEBUG__ + dml2_printf("DML::%s: num_active_planes = %d\n", __func__, p->num_active_planes); + dml2_printf("DML::%s: rob_buffer_size_kbytes = %d\n", __func__, p->rob_buffer_size_kbytes); + dml2_printf("DML::%s: compressed_buffer_size_kbytes = %d\n", __func__, p->compressed_buffer_size_kbytes); + dml2_printf("DML::%s: estimated_urg_bandwidth_required_mbps = %f\n", __func__, p->estimated_urg_bandwidth_required_mbps); + dml2_printf("DML::%s: estimated_dcfclk_mhz = %f\n", __func__, p->estimated_dcfclk_mhz); + dml2_printf("DML::%s: max_Trpd_dcfclk_cycles = %u\n", __func__, s->max_Trpd_dcfclk_cycles); +#endif + + // calculate the return impact from each plane, request is 256B per dcfclk + for (i = 0; i < p->num_active_planes; i++) { + s->src_detile_buf_size_bytes_l[i] = p->detile_buffer_size_bytes_l[i]; + s->src_detile_buf_size_bytes_c[i] = p->detile_buffer_size_bytes_c[i]; + s->src_swath_bytes_l[i] = p->full_swath_bytes_l[i]; + s->src_swath_bytes_c[i] = p->full_swath_bytes_c[i]; + + if (p->pixel_format[i] == dml2_420_10) { + s->src_detile_buf_size_bytes_l[i] = (unsigned int) (s->src_detile_buf_size_bytes_l[i] * 1.5); + s->src_detile_buf_size_bytes_c[i] = (unsigned int) (s->src_detile_buf_size_bytes_c[i] * 1.5); + s->src_swath_bytes_l[i] = (unsigned int) (s->src_swath_bytes_l[i] * 1.5); + s->src_swath_bytes_c[i] = (unsigned int) (s->src_swath_bytes_c[i] * 1.5); + } + + s->burst_bytes_to_fill_det = (unsigned int) (math_floor2(s->src_detile_buf_size_bytes_l[i] / p->chunk_bytes_l, 1) * p->chunk_bytes_l); + s->burst_bytes_to_fill_det += (unsigned int) (math_floor2(p->lb_source_lines_l[i] / p->swath_height_l[i], 1) * s->src_swath_bytes_l[i]); + +#ifdef __DML_VBA_DEBUG__ + dml2_printf("DML::%s: i=%u 
pixel_format = %d\n", __func__, i, p->pixel_format[i]); + dml2_printf("DML::%s: i=%u chunk_bytes_l = %d\n", __func__, i, p->chunk_bytes_l); + dml2_printf("DML::%s: i=%u lb_source_lines_l = %d\n", __func__, i, p->lb_source_lines_l[i]); + dml2_printf("DML::%s: i=%u src_detile_buf_size_bytes_l=%d\n", __func__, i, s->src_detile_buf_size_bytes_l[i]); + dml2_printf("DML::%s: i=%u src_swath_bytes_l=%d\n", __func__, i, s->src_swath_bytes_l[i]); + dml2_printf("DML::%s: i=%u burst_bytes_to_fill_det=%d (luma)\n", __func__, i, s->burst_bytes_to_fill_det); +#endif + + if (s->src_swath_bytes_c[i] > 0) { // dual_plane + s->burst_bytes_to_fill_det += (unsigned int) (math_floor2(s->src_detile_buf_size_bytes_c[i] / p->chunk_bytes_c, 1) * p->chunk_bytes_c); + + if (p->pixel_format[i] == dml2_422_planar_8 || p->pixel_format[i] == dml2_422_planar_10 || p->pixel_format[i] == dml2_422_planar_12) { + s->burst_bytes_to_fill_det += (unsigned int) (math_floor2(p->lb_source_lines_c[i] / p->swath_height_c[i], 1) * s->src_swath_bytes_c[i]); + } + +#ifdef __DML_VBA_DEBUG__ + dml2_printf("DML::%s: i=%u chunk_bytes_c = %d\n", __func__, i, p->chunk_bytes_c); + dml2_printf("DML::%s: i=%u lb_source_lines_c = %d\n", __func__, i, p->lb_source_lines_c[i]); + dml2_printf("DML::%s: i=%u src_detile_buf_size_bytes_c=%d\n", __func__, i, s->src_detile_buf_size_bytes_c[i]); + dml2_printf("DML::%s: i=%u src_swath_bytes_c=%d\n", __func__, i, s->src_swath_bytes_c[i]); +#endif + } + + s->time_to_fill_det_us = (double) s->burst_bytes_to_fill_det / (256 * p->estimated_dcfclk_mhz); // fill time assume full burst at request rate + s->accumulated_return_path_dcfclk_cycles[i] = (unsigned int) math_ceil2(((DML_MAX_COMPRESSION_RATIO-1) * 64 * p->estimated_dcfclk_mhz) * s->time_to_fill_det_us / 64.0, 1.0); //for 64B per DCFClk + +#ifdef __DML_VBA_DEBUG__ + dml2_printf("DML::%s: i=%u burst_bytes_to_fill_det=%d\n", __func__, i, s->burst_bytes_to_fill_det); + dml2_printf("DML::%s: i=%u time_to_fill_det_us=%f\n", __func__, i, 
s->time_to_fill_det_us); + dml2_printf("DML::%s: i=%u accumulated_return_path_dcfclk_cycles=%u\n", __func__, i, s->accumulated_return_path_dcfclk_cycles[i]); +#endif + // clamp to the worst-case delay, i.e. one that occupies the full rob+cdb + if (s->accumulated_return_path_dcfclk_cycles[i] > s->max_Trpd_dcfclk_cycles) + s->accumulated_return_path_dcfclk_cycles[i] = s->max_Trpd_dcfclk_cycles; + } + + // Figure out the impacted prefetch time for each plane + // if impacted_Tpre is > equ bw Tpre, we need to fail the prefetch schedule as we need a higher state to support the bw + for (i = 0; i < p->num_active_planes; i++) { + k = find_max_impact_plane(i, p->num_active_planes, s->accumulated_return_path_dcfclk_cycles); // plane k causes most impact to plane i + // the rest of planes (except for k) compete for bw + p->impacted_dst_y_pre[i] = s->accumulated_return_path_dcfclk_cycles[k]/p->estimated_dcfclk_mhz; + p->impacted_dst_y_pre[i] += calculate_impacted_Tsw(k, p->num_active_planes, p->prefetch_sw_bytes, p->estimated_urg_bandwidth_required_mbps); + p->impacted_dst_y_pre[i] = math_ceil2(p->impacted_dst_y_pre[i] / p->line_time[i], 0.25); + +#ifdef __DML_VBA_DEBUG__ + dml2_printf("DML::%s: i=%u impacted_Tpre=%f (k=%u)\n", __func__, i, p->impacted_dst_y_pre[i], k); +#endif + } + + if (p->Tpre_rounded != NULL && p->Tpre_oto != NULL) { + for (i = 0; i < p->num_active_planes; i++) { + if (p->impacted_dst_y_pre[i] > p->dst_y_prefetch[i]) { + s->prefetch_global_check_passed = 0; + *p->recalc_prefetch_schedule = 1; + } +#ifdef __DML_VBA_DEBUG__ + dml2_printf("DML::%s: i=%u Tpre_rounded=%f\n", __func__, i, p->Tpre_rounded[i]); + dml2_printf("DML::%s: i=%u Tpre_oto=%f\n", __func__, i, p->Tpre_oto[i]); +#endif + } + } else { + // likely a mode programming call; assume support and no recalc (not used anyway) + s->prefetch_global_check_passed = 1; + *p->recalc_prefetch_schedule = 0; + } + +#ifdef __DML_VBA_DEBUG__ + dml2_printf("DML::%s: prefetch_global_check_passed=%u\n", 
__func__, s->prefetch_global_check_passed); + dml2_printf("DML::%s: recalc_prefetch_schedule=%u\n", __func__, *p->recalc_prefetch_schedule); +#endif + + return s->prefetch_global_check_passed; +} + static void calculate_peak_bandwidth_required( struct dml2_core_internal_scratch *s, struct dml2_core_calcs_calculate_peak_bandwidth_required_params *p) @@ -6046,7 +6331,7 @@ static void check_urgent_bandwidth_support( double *frac_urg_bandwidth_nom, double *frac_urg_bandwidth_mall, bool *vactive_bandwidth_support_ok, // vactive ok - bool *bandwidth_support_ok, // max of vm, prefetch, vactive all ok + bool *bandwidth_support_ok,// max of vm, prefetch, vactive all ok unsigned int mall_allocated_for_dcn_mbytes, double non_urg_bandwidth_required[dml2_core_internal_soc_state_max][dml2_core_internal_bw_max], @@ -6116,7 +6401,6 @@ static void check_urgent_bandwidth_support( } } #endif - } static double get_bandwidth_available_for_immediate_flip(enum dml2_core_internal_soc_state_type eval_state, @@ -6438,7 +6722,7 @@ static void CalculateWatermarksMALLUseAndDRAMSpeedChangeSupport( p->Watermark->Z8StutterExitWatermark += p->mmSOCParameters.max_urgent_latency_us + p->mmSOCParameters.df_response_time_us; p->Watermark->Z8StutterEnterPlusExitWatermark += p->mmSOCParameters.max_urgent_latency_us + p->mmSOCParameters.df_response_time_us; } - p->Watermark->g6_temp_read_watermark_us = p->mmSOCParameters.g6_temp_read_blackout_us + p->Watermark->UrgentWatermark; + p->Watermark->temp_read_or_ppt_watermark_us = p->mmSOCParameters.g6_temp_read_blackout_us + p->Watermark->UrgentWatermark; #ifdef __DML_VBA_DEBUG__ dml2_printf("DML::%s: UrgentLatency = %f\n", __func__, p->mmSOCParameters.UrgentLatency); @@ -6454,12 +6738,12 @@ static void CalculateWatermarksMALLUseAndDRAMSpeedChangeSupport( dml2_printf("DML::%s: StutterEnterPlusExitWatermark = %f\n", __func__, p->Watermark->StutterEnterPlusExitWatermark); dml2_printf("DML::%s: Z8StutterExitWatermark = %f\n", __func__, 
p->Watermark->Z8StutterExitWatermark); dml2_printf("DML::%s: Z8StutterEnterPlusExitWatermark = %f\n", __func__, p->Watermark->Z8StutterEnterPlusExitWatermark); - dml2_printf("DML::%s: g6_temp_read_watermark_us = %f\n", __func__, p->Watermark->g6_temp_read_watermark_us); + dml2_printf("DML::%s: temp_read_or_ppt_watermark_us = %f\n", __func__, p->Watermark->temp_read_or_ppt_watermark_us); #endif s->TotalActiveWriteback = 0; for (unsigned int k = 0; k < p->NumberOfActiveSurfaces; ++k) { - if (p->display_cfg->stream_descriptors[p->display_cfg->plane_descriptors[k].stream_index].writeback.enable == true) { + if (p->display_cfg->stream_descriptors[p->display_cfg->plane_descriptors[k].stream_index].writeback.active_writebacks_per_stream > 0) { s->TotalActiveWriteback = s->TotalActiveWriteback + 1; } } @@ -6522,7 +6806,7 @@ static void CalculateWatermarksMALLUseAndDRAMSpeedChangeSupport( s->LBLatencyHidingSourceLinesC[k] = (unsigned int)(math_min2((double)p->MaxLineBufferLines, math_floor2((double)p->LineBufferSize / LBBitPerPixel / ((double)p->SwathWidthC[k] / math_max2(h_ratio_c, 1.0)), 1)) - (v_taps_c - 1)); #ifdef __DML_VBA_DEBUG__ - dml2_printf("DML::%s: k=%u, MaxLineBufferLines= %u\n", __func__, k, p->MaxLineBufferLines); + dml2_printf("DML::%s: k=%u, MaxLineBufferLines = %u\n", __func__, k, p->MaxLineBufferLines); dml2_printf("DML::%s: k=%u, LineBufferSize = %u\n", __func__, k, p->LineBufferSize); dml2_printf("DML::%s: k=%u, LBBitPerPixel = %u\n", __func__, k, LBBitPerPixel); dml2_printf("DML::%s: k=%u, HRatio = %f\n", __func__, k, h_ratio); @@ -6563,7 +6847,7 @@ static void CalculateWatermarksMALLUseAndDRAMSpeedChangeSupport( s->ActiveDRAMClockChangeLatencyMargin[k] = s->ActiveClockChangeLatencyHiding - p->Watermark->DRAMClockChangeWatermark; s->ActiveFCLKChangeLatencyMargin[k] = s->ActiveClockChangeLatencyHiding - p->Watermark->FCLKChangeWatermark; s->USRRetrainingLatencyMargin[k] = s->ActiveClockChangeLatencyHiding - p->Watermark->USRRetrainingWatermark; - 
s->g6_temp_read_latency_margin[k] = s->ActiveClockChangeLatencyHiding - p->Watermark->g6_temp_read_watermark_us; + s->g6_temp_read_latency_margin[k] = s->ActiveClockChangeLatencyHiding - p->Watermark->temp_read_or_ppt_watermark_us; if (p->VActiveLatencyHidingMargin) p->VActiveLatencyHidingMargin[k] = s->ActiveDRAMClockChangeLatencyMargin[k]; @@ -6571,9 +6855,12 @@ static void CalculateWatermarksMALLUseAndDRAMSpeedChangeSupport( if (p->VActiveLatencyHidingUs) p->VActiveLatencyHidingUs[k] = s->ActiveClockChangeLatencyHiding; - if (p->display_cfg->stream_descriptors[p->display_cfg->plane_descriptors[k].stream_index].writeback.enable) { - s->WritebackLatencyHiding = (double)p->WritebackInterfaceBufferSize * 1024.0 / ((double)p->display_cfg->stream_descriptors[p->display_cfg->plane_descriptors[k].stream_index].writeback.scaling_info.output_height * (double)p->display_cfg->stream_descriptors[p->display_cfg->plane_descriptors[k].stream_index].writeback.scaling_info.output_width / ((double)p->display_cfg->stream_descriptors[p->display_cfg->plane_descriptors[k].stream_index].writeback.scaling_info.input_height * (double)h_total / pixel_clock_mhz) * 4.0); - if (p->display_cfg->stream_descriptors[p->display_cfg->plane_descriptors[k].stream_index].writeback.pixel_format == dml2_444_64) { + if (p->display_cfg->stream_descriptors[p->display_cfg->plane_descriptors[k].stream_index].writeback.active_writebacks_per_stream > 0) { + s->WritebackLatencyHiding = (double)p->WritebackInterfaceBufferSize * 1024.0 + / ((double)p->display_cfg->stream_descriptors[p->display_cfg->plane_descriptors[k].stream_index].writeback.writeback_stream[0].output_height + * (double)p->display_cfg->stream_descriptors[p->display_cfg->plane_descriptors[k].stream_index].writeback.writeback_stream[0].output_width + / ((double)p->display_cfg->stream_descriptors[p->display_cfg->plane_descriptors[k].stream_index].writeback.writeback_stream[0].input_height * (double)h_total / pixel_clock_mhz) * 4.0); + if 
(p->display_cfg->stream_descriptors[p->display_cfg->plane_descriptors[k].stream_index].writeback.writeback_stream[0].pixel_format == dml2_444_64) { s->WritebackLatencyHiding = s->WritebackLatencyHiding / 2; } s->WritebackDRAMClockChangeLatencyMargin = s->WritebackLatencyHiding - p->Watermark->WritebackDRAMClockChangeWatermark; @@ -6588,36 +6875,36 @@ static void CalculateWatermarksMALLUseAndDRAMSpeedChangeSupport( uclk_pstate_change_strategy = p->display_cfg->plane_descriptors[k].overrides.uclk_pstate_change_strategy; reserved_vblank_time_us = (double)p->display_cfg->plane_descriptors[k].overrides.reserved_vblank_time_ns / 1000; - p->FCLKChangeSupport[k] = dml2_fclock_change_unsupported; + p->FCLKChangeSupport[k] = dml2_pstate_change_unsupported; if (s->ActiveFCLKChangeLatencyMargin[k] > 0) - p->FCLKChangeSupport[k] = dml2_fclock_change_vactive; + p->FCLKChangeSupport[k] = dml2_pstate_change_vactive; else if (reserved_vblank_time_us >= p->mmSOCParameters.FCLKChangeLatency) - p->FCLKChangeSupport[k] = dml2_fclock_change_vblank; + p->FCLKChangeSupport[k] = dml2_pstate_change_vblank; - if (p->FCLKChangeSupport[k] == dml2_fclock_change_unsupported) + if (p->FCLKChangeSupport[k] == dml2_pstate_change_unsupported) *p->global_fclk_change_supported = false; - p->DRAMClockChangeSupport[k] = dml2_dram_clock_change_unsupported; + p->DRAMClockChangeSupport[k] = dml2_pstate_change_unsupported; if (uclk_pstate_change_strategy == dml2_uclk_pstate_change_strategy_auto) { if (p->display_cfg->overrides.all_streams_blanked || (s->ActiveDRAMClockChangeLatencyMargin[k] > 0 && reserved_vblank_time_us >= p->mmSOCParameters.DRAMClockChangeLatency)) - p->DRAMClockChangeSupport[k] = dml2_dram_clock_change_vblank_and_vactive; + p->DRAMClockChangeSupport[k] = dml2_pstate_change_vblank_and_vactive; else if (s->ActiveDRAMClockChangeLatencyMargin[k] > 0) - p->DRAMClockChangeSupport[k] = dml2_dram_clock_change_vactive; + p->DRAMClockChangeSupport[k] = dml2_pstate_change_vactive; else if 
(reserved_vblank_time_us >= p->mmSOCParameters.DRAMClockChangeLatency) - p->DRAMClockChangeSupport[k] = dml2_dram_clock_change_vblank; + p->DRAMClockChangeSupport[k] = dml2_pstate_change_vblank; } else if (uclk_pstate_change_strategy == dml2_uclk_pstate_change_strategy_force_vactive && s->ActiveDRAMClockChangeLatencyMargin[k] > 0) - p->DRAMClockChangeSupport[k] = dml2_dram_clock_change_vactive; + p->DRAMClockChangeSupport[k] = dml2_pstate_change_vactive; else if (uclk_pstate_change_strategy == dml2_uclk_pstate_change_strategy_force_vblank && reserved_vblank_time_us >= p->mmSOCParameters.DRAMClockChangeLatency) - p->DRAMClockChangeSupport[k] = dml2_dram_clock_change_vblank; + p->DRAMClockChangeSupport[k] = dml2_pstate_change_vblank; else if (uclk_pstate_change_strategy == dml2_uclk_pstate_change_strategy_force_drr) - p->DRAMClockChangeSupport[k] = dml2_dram_clock_change_drr; + p->DRAMClockChangeSupport[k] = dml2_pstate_change_drr; else if (uclk_pstate_change_strategy == dml2_uclk_pstate_change_strategy_force_mall_svp) - p->DRAMClockChangeSupport[k] = dml2_dram_clock_change_mall_svp; + p->DRAMClockChangeSupport[k] = dml2_pstate_change_mall_svp; else if (uclk_pstate_change_strategy == dml2_uclk_pstate_change_strategy_force_mall_full_frame) - p->DRAMClockChangeSupport[k] = dml2_dram_clock_change_mall_full_frame; + p->DRAMClockChangeSupport[k] = dml2_pstate_change_mall_full_frame; - if (p->DRAMClockChangeSupport[k] == dml2_dram_clock_change_unsupported) + if (p->DRAMClockChangeSupport[k] == dml2_pstate_change_unsupported) *p->global_dram_clock_change_supported = false; s->dst_y_pstate = (unsigned int)(math_ceil2((p->mmSOCParameters.DRAMClockChangeLatency + p->mmSOCParameters.UrgentLatency) / (h_total / pixel_clock_mhz), 1)); @@ -6915,8 +7202,7 @@ struct dml2_core_internal_g6_temp_read_blackouts_table { } entries[DML_MAX_CLK_TABLE_SIZE]; }; -static const struct dml2_core_internal_g6_temp_read_blackouts_table - core_dcn4_g6_temp_read_blackout_table = { +struct 
dml2_core_internal_g6_temp_read_blackouts_table core_dcn4_g6_temp_read_blackout_table = { .entries = { { .uclk_khz = 96000, @@ -7036,6 +7322,9 @@ static bool dml_core_mode_support(struct dml2_core_calcs_mode_support_ex *in_out struct dml2_core_calcs_CalculateVMRowAndSwath_params *CalculateVMRowAndSwath_params = &mode_lib->scratch.CalculateVMRowAndSwath_params; struct dml2_core_calcs_CalculateSwathAndDETConfiguration_params *CalculateSwathAndDETConfiguration_params = &mode_lib->scratch.CalculateSwathAndDETConfiguration_params; struct dml2_core_calcs_CalculatePrefetchSchedule_params *CalculatePrefetchSchedule_params = &mode_lib->scratch.CalculatePrefetchSchedule_params; +#ifdef DML_GLOBAL_PREFETCH_CHECK + struct dml2_core_calcs_CheckGlobalPrefetchAdmissibility_params *CheckGlobalPrefetchAdmissibility_params = &mode_lib->scratch.CheckGlobalPrefetchAdmissibility_params; +#endif struct dml2_core_calcs_calculate_tdlut_setting_params *calculate_tdlut_setting_params = &mode_lib->scratch.calculate_tdlut_setting_params; struct dml2_core_calcs_calculate_mcache_setting_params *calculate_mcache_setting_params = &mode_lib->scratch.calculate_mcache_setting_params; struct dml2_core_calcs_calculate_peak_bandwidth_required_params *calculate_peak_bandwidth_params = &mode_lib->scratch.calculate_peak_bandwidth_params; @@ -7083,12 +7372,6 @@ static bool dml_core_mode_support(struct dml2_core_calcs_mode_support_ex *in_out for (k = 0; k < mode_lib->ms.num_active_planes; k++) dml2_printf("DML::%s: plane_%d: reserved_vblank_time_ns = %u\n", __func__, k, display_cfg->plane_descriptors[k].overrides.reserved_vblank_time_ns); - - // dml2_printf_dml_policy(&mode_lib->ms.policy); - // dml2_printf_dml_display_cfg_timing(&display_cfg->timing, mode_lib->ms.num_active_planes); - // dml2_printf_dml_display_cfg_plane(&display_cfg->plane, mode_lib->ms.num_active_planes); - // dml2_printf_dml_display_cfg_surface(&display_cfg->surface, mode_lib->ms.num_active_planes); - // 
dml2_printf_dml_display_cfg_output(&display_cfg->output, mode_lib->ms.num_active_planes); #endif CalculateMaxDETAndMinCompressedBufferSize( @@ -7183,8 +7466,8 @@ static bool dml_core_mode_support(struct dml2_core_calcs_mode_support_ex *in_out } for (k = 0; k <= mode_lib->ms.num_active_planes - 1; k++) { - mode_lib->ms.SurfaceReadBandwidthLuma[k] = mode_lib->ms.SwathWidthYSingleDPP[k] * math_ceil2(mode_lib->ms.BytePerPixelY[k], 1.0) / (display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].timing.h_total / ((double)display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].timing.pixel_clock_khz / 1000)) * display_cfg->plane_descriptors[k].composition.scaler_info.plane0.v_ratio; - mode_lib->ms.SurfaceReadBandwidthChroma[k] = mode_lib->ms.SwathWidthCSingleDPP[k] * math_ceil2(mode_lib->ms.BytePerPixelC[k], 2.0) / (display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].timing.h_total / ((double)display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].timing.pixel_clock_khz / 1000)) * display_cfg->plane_descriptors[k].composition.scaler_info.plane1.v_ratio; + mode_lib->ms.vactive_sw_bw_l[k] = mode_lib->ms.SwathWidthYSingleDPP[k] * math_ceil2(mode_lib->ms.BytePerPixelY[k], 1.0) / (display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].timing.h_total / ((double)display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].timing.pixel_clock_khz / 1000)) * display_cfg->plane_descriptors[k].composition.scaler_info.plane0.v_ratio; + mode_lib->ms.vactive_sw_bw_c[k] = mode_lib->ms.SwathWidthCSingleDPP[k] * math_ceil2(mode_lib->ms.BytePerPixelC[k], 2.0) / (display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].timing.h_total / ((double)display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].timing.pixel_clock_khz / 1000)) * 
display_cfg->plane_descriptors[k].composition.scaler_info.plane1.v_ratio; mode_lib->ms.cursor_bw[k] = display_cfg->plane_descriptors[k].cursor.num_cursors * display_cfg->plane_descriptors[k].cursor.cursor_width * display_cfg->plane_descriptors[k].cursor.cursor_bpp / 8.0 / (display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].timing.h_total / ((double)display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].timing.pixel_clock_khz / 1000)); @@ -7194,35 +7477,35 @@ static bool dml_core_mode_support(struct dml2_core_calcs_mode_support_ex *in_out old_ReadBandwidthChroma = mode_lib->ms.SwathWidthYSingleDPP[k] / 2 * math_ceil2(mode_lib->ms.BytePerPixelInDETC[k], 2.0) / (display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].timing.h_total / ((double)display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].timing.pixel_clock_khz / 1000)) * display_cfg->plane_descriptors[k].composition.scaler_info.plane0.v_ratio / 2.0; dml2_printf("DML::%s: k=%u, old_ReadBandwidthLuma = %f\n", __func__, k, old_ReadBandwidthLuma); dml2_printf("DML::%s: k=%u, old_ReadBandwidthChroma = %f\n", __func__, k, old_ReadBandwidthChroma); - dml2_printf("DML::%s: k=%u, ReadBandwidthLuma = %f\n", __func__, k, mode_lib->ms.SurfaceReadBandwidthLuma[k]); - dml2_printf("DML::%s: k=%u, ReadBandwidthChroma = %f\n", __func__, k, mode_lib->ms.SurfaceReadBandwidthChroma[k]); + dml2_printf("DML::%s: k=%u, vactive_sw_bw_l = %f\n", __func__, k, mode_lib->ms.vactive_sw_bw_l[k]); + dml2_printf("DML::%s: k=%u, vactive_sw_bw_c = %f\n", __func__, k, mode_lib->ms.vactive_sw_bw_c[k]); #endif } // Writeback bandwidth for (k = 0; k < mode_lib->ms.num_active_planes; k++) { - if (display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.enable == true && display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.pixel_format == dml2_444_64) { - 
mode_lib->ms.WriteBandwidth[k] = display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.scaling_info.output_height - * display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.scaling_info.output_width - / (display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.scaling_info.input_height + if (display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.active_writebacks_per_stream > 0 && display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.writeback_stream[0].pixel_format == dml2_444_64) { + mode_lib->ms.WriteBandwidth[k][0] = display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.writeback_stream[0].output_height + * display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.writeback_stream[0].output_width + / (display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.writeback_stream[0].input_height * display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].timing.h_total / ((double)display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].timing.pixel_clock_khz / 1000)) * 8.0; - } else if (display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.enable == true) { - mode_lib->ms.WriteBandwidth[k] = display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.scaling_info.output_height - * display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.scaling_info.output_width - / (display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.scaling_info.input_height + } else if (display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.active_writebacks_per_stream > 0) { + mode_lib->ms.WriteBandwidth[k][0] = 
display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.writeback_stream[0].output_height + * display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.writeback_stream[0].output_width + / (display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.writeback_stream[0].input_height * display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].timing.h_total / ((double)display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].timing.pixel_clock_khz / 1000)) * 4.0; } else { - mode_lib->ms.WriteBandwidth[k] = 0.0; + mode_lib->ms.WriteBandwidth[k][0] = 0.0; } } /*Writeback Latency support check*/ mode_lib->ms.support.WritebackLatencySupport = true; for (k = 0; k <= mode_lib->ms.num_active_planes - 1; k++) { - if (display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.enable == true && - (mode_lib->ms.WriteBandwidth[k] > mode_lib->ip.writeback_interface_buffer_size_kbytes * 1024 / ((double)mode_lib->soc.qos_parameters.writeback.base_latency_us))) { + if (display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.active_writebacks_per_stream > 0 && + (mode_lib->ms.WriteBandwidth[k][0] > mode_lib->ip.writeback_interface_buffer_size_kbytes * 1024 / ((double)mode_lib->soc.qos_parameters.writeback.base_latency_us))) { mode_lib->ms.support.WritebackLatencySupport = false; } } @@ -7231,19 +7514,19 @@ static bool dml_core_mode_support(struct dml2_core_calcs_mode_support_ex *in_out /* Writeback Scale Ratio and Taps Support Check */ mode_lib->ms.support.WritebackScaleRatioAndTapsSupport = true; for (k = 0; k <= mode_lib->ms.num_active_planes - 1; k++) { - if (display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.enable == true) { - if (display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.scaling_info.h_ratio 
> mode_lib->ip.writeback_max_hscl_ratio - || display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.scaling_info.v_ratio > mode_lib->ip.writeback_max_vscl_ratio - || display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.scaling_info.h_ratio < mode_lib->ip.writeback_min_hscl_ratio - || display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.scaling_info.v_ratio < mode_lib->ip.writeback_min_vscl_ratio - || display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.scaling_info.h_taps > (unsigned int) mode_lib->ip.writeback_max_hscl_taps - || display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.scaling_info.v_taps > (unsigned int) mode_lib->ip.writeback_max_vscl_taps - || display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.scaling_info.h_ratio > (unsigned int)display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.scaling_info.h_taps - || display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.scaling_info.v_ratio > (unsigned int)display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.scaling_info.v_taps - || (display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.scaling_info.h_taps > 2.0 && ((display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.scaling_info.h_taps % 2) == 1))) { + if (display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.active_writebacks_per_stream > 0) { + if (display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.writeback_stream[0].h_ratio > mode_lib->ip.writeback_max_hscl_ratio + || 
display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.writeback_stream[0].v_ratio > mode_lib->ip.writeback_max_vscl_ratio + || display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.writeback_stream[0].h_ratio < mode_lib->ip.writeback_min_hscl_ratio + || display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.writeback_stream[0].v_ratio < mode_lib->ip.writeback_min_vscl_ratio + || display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.writeback_stream[0].h_taps > (unsigned int) mode_lib->ip.writeback_max_hscl_taps + || display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.writeback_stream[0].v_taps > (unsigned int) mode_lib->ip.writeback_max_vscl_taps + || display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.writeback_stream[0].h_ratio > (unsigned int)display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.writeback_stream[0].h_taps + || display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.writeback_stream[0].v_ratio > (unsigned int)display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.writeback_stream[0].v_taps + || (display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.writeback_stream[0].h_taps > 2.0 && ((display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.writeback_stream[0].h_taps % 2) == 1))) { mode_lib->ms.support.WritebackScaleRatioAndTapsSupport = false; } - if (2.0 * display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.scaling_info.output_height * (display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.scaling_info.v_taps - 1) * 57 > mode_lib->ip.writeback_line_buffer_buffer_size) { + if (2.0 * 
display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.writeback_stream[0].output_height * (display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.writeback_stream[0].v_taps - 1) * 57 > mode_lib->ip.writeback_line_buffer_buffer_size) { mode_lib->ms.support.WritebackScaleRatioAndTapsSupport = false; } } @@ -7423,8 +7706,8 @@ static bool dml_core_mode_support(struct dml2_core_calcs_mode_support_ex *in_out CalculateSwathAndDETConfiguration_params->nomDETInKByte = mode_lib->ms.NomDETInKByte; CalculateSwathAndDETConfiguration_params->ConfigReturnBufferSegmentSizeInkByte = mode_lib->ip.config_return_buffer_segment_size_in_kbytes; CalculateSwathAndDETConfiguration_params->CompressedBufferSegmentSizeInkByte = mode_lib->ip.compressed_buffer_segment_size_in_kbytes; - CalculateSwathAndDETConfiguration_params->ReadBandwidthLuma = mode_lib->ms.SurfaceReadBandwidthLuma; - CalculateSwathAndDETConfiguration_params->ReadBandwidthChroma = mode_lib->ms.SurfaceReadBandwidthChroma; + CalculateSwathAndDETConfiguration_params->ReadBandwidthLuma = mode_lib->ms.vactive_sw_bw_l; + CalculateSwathAndDETConfiguration_params->ReadBandwidthChroma = mode_lib->ms.vactive_sw_bw_c; CalculateSwathAndDETConfiguration_params->MaximumSwathWidthLuma = mode_lib->ms.MaximumSwathWidthLuma; CalculateSwathAndDETConfiguration_params->MaximumSwathWidthChroma = mode_lib->ms.MaximumSwathWidthChroma; CalculateSwathAndDETConfiguration_params->Read256BytesBlockHeightY = mode_lib->ms.Read256BlockHeightY; @@ -7671,16 +7954,16 @@ static bool dml_core_mode_support(struct dml2_core_calcs_mode_support_ex *in_out //DISPCLK/DPPCLK mode_lib->ms.WritebackRequiredDISPCLK = 0; for (k = 0; k < mode_lib->ms.num_active_planes; ++k) { - if (display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.enable) { + if (display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.active_writebacks_per_stream > 
0) { mode_lib->ms.WritebackRequiredDISPCLK = math_max2(mode_lib->ms.WritebackRequiredDISPCLK, - CalculateWriteBackDISPCLK(display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.pixel_format, + CalculateWriteBackDISPCLK(display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.writeback_stream[0].pixel_format, ((double)display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].timing.pixel_clock_khz / 1000), - display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.scaling_info.h_ratio, - display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.scaling_info.v_ratio, - display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.scaling_info.h_taps, - display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.scaling_info.v_taps, - display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.scaling_info.input_width, - display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.scaling_info.output_width, + display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.writeback_stream[0].h_ratio, + display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.writeback_stream[0].v_ratio, + display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.writeback_stream[0].h_taps, + display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.writeback_stream[0].v_taps, + display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.writeback_stream[0].input_width, + display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.writeback_stream[0].output_width, 
display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].timing.h_total, mode_lib->ip.writeback_line_buffer_buffer_size)); } @@ -7712,7 +7995,7 @@ static bool dml_core_mode_support(struct dml2_core_calcs_mode_support_ex *in_out if (!s->stream_visited[display_cfg->plane_descriptors[k].stream_index]) { s->stream_visited[display_cfg->plane_descriptors[k].stream_index] = 1; - if (display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.enable == true) + if (display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.active_writebacks_per_stream > 0) s->TotalNumberOfActiveWriteback = s->TotalNumberOfActiveWriteback + 1; s->TotalNumberOfActiveOTG = s->TotalNumberOfActiveOTG + 1; @@ -8256,23 +8539,23 @@ static bool dml_core_mode_support(struct dml2_core_calcs_mode_support_ex *in_out mode_lib->ms.PSCL_FACTOR, mode_lib->ms.PSCL_FACTOR_CHROMA, mode_lib->ms.RequiredDPPCLK, - mode_lib->ms.SurfaceReadBandwidthLuma, - mode_lib->ms.SurfaceReadBandwidthChroma, + mode_lib->ms.vactive_sw_bw_l, + mode_lib->ms.vactive_sw_bw_c, mode_lib->soc.return_bus_width_bytes, /* Output */ &mode_lib->ms.dcfclk_deepsleep); for (k = 0; k <= mode_lib->ms.num_active_planes - 1; k++) { - if (display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.enable == true) { + if (display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.active_writebacks_per_stream > 0) { mode_lib->ms.WritebackDelayTime[k] = mode_lib->soc.qos_parameters.writeback.base_latency_us + CalculateWriteBackDelay( - display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.pixel_format, - display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.scaling_info.h_ratio, - display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.scaling_info.v_ratio, - 
display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.scaling_info.v_taps, - display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.scaling_info.output_width, - display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.scaling_info.output_height, - display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.scaling_info.input_height, + display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.writeback_stream[0].pixel_format, + display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.writeback_stream[0].h_ratio, + display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.writeback_stream[0].v_ratio, + display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.writeback_stream[0].v_taps, + display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.writeback_stream[0].output_width, + display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.writeback_stream[0].output_height, + display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.writeback_stream[0].input_height, display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].timing.h_total) / mode_lib->ms.RequiredDISPCLK; } else { mode_lib->ms.WritebackDelayTime[k] = 0.0; @@ -8349,7 +8632,7 @@ static bool dml_core_mode_support(struct dml2_core_calcs_mode_support_ex *in_out dml2_printf("DML::%s: mode_lib->ms.DCFCLK = %f\n", __func__, mode_lib->ms.DCFCLK); dml2_printf("DML::%s: mode_lib->ms.FabricClock = %f\n", __func__, mode_lib->ms.FabricClock); dml2_printf("DML::%s: mode_lib->ms.uclk_freq_mhz = %f\n", __func__, mode_lib->ms.uclk_freq_mhz); - dml2_printf("DML::%s: urgent latency tolerance = %f\n", __func__, ((mode_lib->ip.rob_buffer_size_kbytes - 
mode_lib->ip.pixel_chunk_size_kbytes) * 1024 / (mode_lib->ms.DCFCLK * mode_lib->soc.return_bus_width_bytes))); + dml2_printf("DML::%s: urgent latency tolarance = %f\n", __func__, ((mode_lib->ip.rob_buffer_size_kbytes - mode_lib->ip.pixel_chunk_size_kbytes) * 1024 / (mode_lib->ms.DCFCLK * mode_lib->soc.return_bus_width_bytes))); #endif mode_lib->ms.support.OutstandingRequestsSupport = true; @@ -8367,6 +8650,13 @@ static bool dml_core_mode_support(struct dml2_core_calcs_mode_support_ex *in_out + mode_lib->soc.qos_parameters.qos_params.dcn4x.average_transport_distance_fclk_cycles / mode_lib->ms.FabricClock) * (1 + mode_lib->soc.qos_parameters.qos_params.dcn4x.fabric_average_transport_latency_margin / 100.0); + mode_lib->ms.support.max_non_urgent_latency_us + = mode_lib->soc.qos_parameters.qos_params.dcn4x.per_uclk_dpm_params[mode_lib->ms.qos_param_index].maximum_latency_when_non_urgent_uclk_cycles + / mode_lib->ms.uclk_freq_mhz * (1 + mode_lib->soc.qos_parameters.qos_params.dcn4x.umc_max_latency_margin / 100.0) + + mode_lib->soc.qos_parameters.qos_params.dcn4x.mall_overhead_fclk_cycles / mode_lib->ms.FabricClock + + mode_lib->soc.qos_parameters.qos_params.dcn4x.max_round_trip_to_furthest_cs_fclk_cycles / mode_lib->ms.FabricClock + * (1 + mode_lib->soc.qos_parameters.qos_params.dcn4x.fabric_max_transport_latency_margin / 100.0); + for (k = 0; k < mode_lib->ms.num_active_planes; k++) { if (mode_lib->soc.qos_parameters.qos_type == dml2_qos_param_type_dcn4x) { @@ -8408,7 +8698,7 @@ static bool dml_core_mode_support(struct dml2_core_calcs_mode_support_ex *in_out } memset(calculate_mcache_setting_params, 0, sizeof(struct dml2_core_calcs_calculate_mcache_setting_params)); - if (mode_lib->soc.mall_allocated_for_dcn_mbytes == 0 || mode_lib->ip.dcn_mrq_present) { + if (mode_lib->soc.mcache_size_bytes == 0 || mode_lib->ip.dcn_mrq_present) { for (k = 0; k < mode_lib->ms.num_active_planes; k++) { mode_lib->ms.mall_prefetch_sdp_overhead_factor[k] = 1.0; 
mode_lib->ms.mall_prefetch_dram_overhead_factor[k] = 1.0; @@ -8515,8 +8805,11 @@ static bool dml_core_mode_support(struct dml2_core_calcs_mode_support_ex *in_out display_cfg->hostvm_enable, mode_lib->ms.MaxDCFCLK, mode_lib->ms.MaxFabricClock, +#ifdef DML_MODE_SUPPORT_USE_DPM_DRAM_BW + mode_lib->ms.dram_bw_mbps); +#else mode_lib->ms.max_dram_bw_mbps); - +#endif // Average BW support check calculate_avg_bandwidth_required( @@ -8524,8 +8817,8 @@ static bool dml_core_mode_support(struct dml2_core_calcs_mode_support_ex *in_out // input display_cfg, mode_lib->ms.num_active_planes, - mode_lib->ms.SurfaceReadBandwidthLuma, - mode_lib->ms.SurfaceReadBandwidthChroma, + mode_lib->ms.vactive_sw_bw_l, + mode_lib->ms.vactive_sw_bw_c, mode_lib->ms.cursor_bw, mode_lib->ms.dcc_dram_bw_nom_overhead_factor_p0, mode_lib->ms.dcc_dram_bw_nom_overhead_factor_p1, @@ -8595,6 +8888,7 @@ static bool dml_core_mode_support(struct dml2_core_calcs_mode_support_ex *in_out calculate_tdlut_setting_params->tdlut_groups_per_2row_ub = &s->tdlut_groups_per_2row_ub[k]; calculate_tdlut_setting_params->tdlut_opt_time = &s->tdlut_opt_time[k]; calculate_tdlut_setting_params->tdlut_drain_time = &s->tdlut_drain_time[k]; + calculate_tdlut_setting_params->tdlut_bytes_to_deliver = &s->tdlut_bytes_to_deliver[k]; calculate_tdlut_setting_params->tdlut_bytes_per_group = &s->tdlut_bytes_per_group[k]; calculate_tdlut_setting(&mode_lib->scratch, calculate_tdlut_setting_params); @@ -8638,9 +8932,32 @@ static bool dml_core_mode_support(struct dml2_core_calcs_mode_support_ex *in_out &mode_lib->ms.ExtraLatency_sr, &mode_lib->ms.ExtraLatencyPrefetch); - { + for (k = 0; k < mode_lib->ms.num_active_planes; k++) + s->impacted_dst_y_pre[k] = 0; + + s->recalc_prefetch_schedule = 0; + s->recalc_prefetch_done = 0; + do { mode_lib->ms.support.PrefetchSupported = true; + for (k = 0; k < mode_lib->ms.num_active_planes; k++) { + s->line_times[k] = 
display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].timing.h_total / ((double)display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].timing.pixel_clock_khz / 1000); + s->pixel_format[k] = display_cfg->plane_descriptors[k].pixel_format; + + s->lb_source_lines_l[k] = get_num_lb_source_lines(mode_lib->ip.max_line_buffer_lines, mode_lib->ip.line_buffer_size_bits, + mode_lib->ms.NoOfDPP[k], + display_cfg->plane_descriptors[k].composition.viewport.plane0.width, + display_cfg->plane_descriptors[k].composition.viewport.plane0.height, + display_cfg->plane_descriptors[k].composition.scaler_info.plane0.h_ratio, + display_cfg->plane_descriptors[k].composition.rotation_angle); + + s->lb_source_lines_c[k] = get_num_lb_source_lines(mode_lib->ip.max_line_buffer_lines, mode_lib->ip.line_buffer_size_bits, + mode_lib->ms.NoOfDPP[k], + display_cfg->plane_descriptors[k].composition.viewport.plane1.width, + display_cfg->plane_descriptors[k].composition.viewport.plane1.height, + display_cfg->plane_descriptors[k].composition.scaler_info.plane1.h_ratio, + display_cfg->plane_descriptors[k].composition.rotation_angle); + struct dml2_core_internal_DmlPipe *myPipe = &s->myPipe; mode_lib->ms.TWait[k] = CalculateTWait( @@ -8730,6 +9047,9 @@ static bool dml_core_mode_support(struct dml2_core_calcs_mode_support_ex *in_out CalculatePrefetchSchedule_params->mrq_present = mode_lib->ip.dcn_mrq_present; CalculatePrefetchSchedule_params->meta_row_bytes = mode_lib->ms.meta_row_bytes[k]; CalculatePrefetchSchedule_params->mall_prefetch_sdp_overhead_factor = mode_lib->ms.mall_prefetch_sdp_overhead_factor[k]; + CalculatePrefetchSchedule_params->impacted_dst_y_pre = s->impacted_dst_y_pre[k]; + CalculatePrefetchSchedule_params->vactive_sw_bw_l = mode_lib->ms.vactive_sw_bw_l[k]; + CalculatePrefetchSchedule_params->vactive_sw_bw_c = mode_lib->ms.vactive_sw_bw_c[k]; // output CalculatePrefetchSchedule_params->DSTXAfterScaler = &s->DSTXAfterScaler[k]; @@ -8758,6 
+9078,10 @@ static bool dml_core_mode_support(struct dml2_core_calcs_mode_support_ex *in_out CalculatePrefetchSchedule_params->VUpdateWidthPix = &s->dummy_integer[1]; CalculatePrefetchSchedule_params->VReadyOffsetPix = &s->dummy_integer[2]; CalculatePrefetchSchedule_params->prefetch_cursor_bw = &mode_lib->ms.prefetch_cursor_bw[k]; + CalculatePrefetchSchedule_params->prefetch_sw_bytes = &s->prefetch_sw_bytes[k]; + CalculatePrefetchSchedule_params->Tpre_rounded = &s->Tpre_rounded[k]; + CalculatePrefetchSchedule_params->Tpre_oto = &s->Tpre_oto[k]; + CalculatePrefetchSchedule_params->prefetch_swath_time_us = &s->prefetch_swath_time_us[k]; mode_lib->ms.NoTimeForPrefetch[k] = CalculatePrefetchSchedule(&mode_lib->scratch, CalculatePrefetchSchedule_params); @@ -8766,6 +9090,27 @@ static bool dml_core_mode_support(struct dml2_core_calcs_mode_support_ex *in_out dml2_printf("DML::%s: k=%d, dst_y_per_row_vblank = %f\n", __func__, k, *CalculatePrefetchSchedule_params->dst_y_per_row_vblank); } // for k num_planes + CalculateDCFCLKDeepSleepTdlut( + display_cfg, + mode_lib->ms.num_active_planes, + mode_lib->ms.BytePerPixelY, + mode_lib->ms.BytePerPixelC, + mode_lib->ms.SwathWidthY, + mode_lib->ms.SwathWidthC, + mode_lib->ms.NoOfDPP, + mode_lib->ms.PSCL_FACTOR, + mode_lib->ms.PSCL_FACTOR_CHROMA, + mode_lib->ms.RequiredDPPCLK, + mode_lib->ms.vactive_sw_bw_l, + mode_lib->ms.vactive_sw_bw_c, + mode_lib->soc.return_bus_width_bytes, + mode_lib->ms.RequiredDISPCLK, + s->tdlut_bytes_to_deliver, + s->prefetch_swath_time_us, + + /* Output */ + &mode_lib->ms.dcfclk_deepsleep); + for (k = 0; k < mode_lib->ms.num_active_planes; k++) { if (mode_lib->ms.dst_y_prefetch[k] < 2.0 || mode_lib->ms.LinesForVM[k] >= 32.0 @@ -8789,7 +9134,7 @@ static bool dml_core_mode_support(struct dml2_core_calcs_mode_support_ex *in_out } mode_lib->ms.support.VRatioInPrefetchSupported = true; - for (k = 0; k <= mode_lib->ms.num_active_planes - 1; k++) { + for (k = 0; k < mode_lib->ms.num_active_planes; k++) { if 
(mode_lib->ms.VRatioPreY[k] > __DML2_CALCS_MAX_VRATIO_PRE__ || mode_lib->ms.VRatioPreC[k] > __DML2_CALCS_MAX_VRATIO_PRE__) { mode_lib->ms.support.VRatioInPrefetchSupported = false; @@ -8799,10 +9144,14 @@ static bool dml_core_mode_support(struct dml2_core_calcs_mode_support_ex *in_out } } + mode_lib->ms.support.PrefetchSupported &= mode_lib->ms.support.VRatioInPrefetchSupported; + + // By default, do not recalc prefetch schedule + s->recalc_prefetch_schedule = 0; + // Only do urg vs prefetch bandwidth check, flip schedule check, power saving feature support check IF the Prefetch Schedule Check is ok if (mode_lib->ms.support.PrefetchSupported) { - for (k = 0; k <= mode_lib->ms.num_active_planes - 1; k++) { - double line_time_us = display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].timing.h_total / ((double)display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].timing.pixel_clock_khz / 1000); + for (k = 0; k < mode_lib->ms.num_active_planes; k++) { // Calculate Urgent burst factor for prefetch #ifdef __DML_VBA_DEBUG__ dml2_printf("DML::%s: k=%d, Calling CalculateUrgentBurstFactor (for prefetch)\n", __func__, k); @@ -8815,7 +9164,7 @@ static bool dml_core_mode_support(struct dml2_core_calcs_mode_support_ex *in_out mode_lib->ms.swath_width_chroma_ub[k], mode_lib->ms.SwathHeightY[k], mode_lib->ms.SwathHeightC[k], - line_time_us, + s->line_times[k], mode_lib->ms.UrgLatency, mode_lib->ms.VRatioPreY[k], mode_lib->ms.VRatioPreC[k], @@ -8852,8 +9201,8 @@ static bool dml_core_mode_support(struct dml2_core_calcs_mode_support_ex *in_out calculate_peak_bandwidth_params->mall_prefetch_sdp_overhead_factor = mode_lib->ms.mall_prefetch_sdp_overhead_factor; calculate_peak_bandwidth_params->mall_prefetch_dram_overhead_factor = mode_lib->ms.mall_prefetch_dram_overhead_factor; - calculate_peak_bandwidth_params->surface_read_bandwidth_l = mode_lib->ms.SurfaceReadBandwidthLuma; - calculate_peak_bandwidth_params->surface_read_bandwidth_c 
= mode_lib->ms.SurfaceReadBandwidthChroma; + calculate_peak_bandwidth_params->surface_read_bandwidth_l = mode_lib->ms.vactive_sw_bw_l; + calculate_peak_bandwidth_params->surface_read_bandwidth_c = mode_lib->ms.vactive_sw_bw_c; calculate_peak_bandwidth_params->prefetch_bandwidth_l = mode_lib->ms.RequiredPrefetchPixelDataBWLuma; calculate_peak_bandwidth_params->prefetch_bandwidth_c = mode_lib->ms.RequiredPrefetchPixelDataBWChroma; calculate_peak_bandwidth_params->excess_vactive_fill_bw_l = mode_lib->ms.excess_vactive_fill_bw_l; @@ -8899,127 +9248,164 @@ static bool dml_core_mode_support(struct dml2_core_calcs_mode_support_ex *in_out } } +#ifdef DML_GLOBAL_PREFETCH_CHECK + if (mode_lib->ms.support.PrefetchSupported && mode_lib->ms.num_active_planes > 1 && s->recalc_prefetch_done == 0) { + CheckGlobalPrefetchAdmissibility_params->num_active_planes = mode_lib->ms.num_active_planes; + CheckGlobalPrefetchAdmissibility_params->pixel_format = s->pixel_format; + CheckGlobalPrefetchAdmissibility_params->chunk_bytes_l = mode_lib->ip.pixel_chunk_size_kbytes * 1024; + CheckGlobalPrefetchAdmissibility_params->chunk_bytes_c = mode_lib->ip.pixel_chunk_size_kbytes * 1024; + CheckGlobalPrefetchAdmissibility_params->lb_source_lines_l = s->lb_source_lines_l; + CheckGlobalPrefetchAdmissibility_params->lb_source_lines_c = s->lb_source_lines_c; + CheckGlobalPrefetchAdmissibility_params->swath_height_l = mode_lib->ms.SwathHeightY; + CheckGlobalPrefetchAdmissibility_params->swath_height_c = mode_lib->ms.SwathHeightC; + CheckGlobalPrefetchAdmissibility_params->rob_buffer_size_kbytes = mode_lib->ip.rob_buffer_size_kbytes; + CheckGlobalPrefetchAdmissibility_params->compressed_buffer_size_kbytes = mode_lib->ms.CompressedBufferSizeInkByte; + CheckGlobalPrefetchAdmissibility_params->detile_buffer_size_bytes_l = mode_lib->ms.DETBufferSizeY; + CheckGlobalPrefetchAdmissibility_params->detile_buffer_size_bytes_c = mode_lib->ms.DETBufferSizeC; + 
CheckGlobalPrefetchAdmissibility_params->full_swath_bytes_l = s->full_swath_bytes_l; + CheckGlobalPrefetchAdmissibility_params->full_swath_bytes_c = s->full_swath_bytes_c; + CheckGlobalPrefetchAdmissibility_params->prefetch_sw_bytes = s->prefetch_sw_bytes; + CheckGlobalPrefetchAdmissibility_params->Tpre_rounded = s->Tpre_rounded; + CheckGlobalPrefetchAdmissibility_params->Tpre_oto = s->Tpre_oto; + CheckGlobalPrefetchAdmissibility_params->estimated_urg_bandwidth_required_mbps = mode_lib->ms.support.urg_bandwidth_required[dml2_core_internal_soc_state_sys_active][dml2_core_internal_bw_sdp]; + CheckGlobalPrefetchAdmissibility_params->line_time = s->line_times; + CheckGlobalPrefetchAdmissibility_params->dst_y_prefetch = mode_lib->ms.dst_y_prefetch; + if (CheckGlobalPrefetchAdmissibility_params->estimated_urg_bandwidth_required_mbps < 10 * 1024) + CheckGlobalPrefetchAdmissibility_params->estimated_urg_bandwidth_required_mbps = 10 * 1024; - // Both prefetch schedule and BW okay - if (mode_lib->ms.support.PrefetchSupported == true && mode_lib->ms.support.VRatioInPrefetchSupported == true) { - mode_lib->ms.BandwidthAvailableForImmediateFlip = - get_bandwidth_available_for_immediate_flip( - dml2_core_internal_soc_state_sys_active, - mode_lib->ms.support.urg_bandwidth_required_qual, // no flip - mode_lib->ms.support.urg_bandwidth_available); + CheckGlobalPrefetchAdmissibility_params->estimated_dcfclk_mhz = (CheckGlobalPrefetchAdmissibility_params->estimated_urg_bandwidth_required_mbps / (double) mode_lib->soc.return_bus_width_bytes) / + ((double)mode_lib->soc.qos_parameters.derate_table.system_active_urgent.dcfclk_derate_percent / 100.0); - mode_lib->ms.TotImmediateFlipBytes = 0; - for (k = 0; k < mode_lib->ms.num_active_planes; k++) { - if (display_cfg->plane_descriptors[k].immediate_flip) { - s->per_pipe_flip_bytes[k] = get_pipe_flip_bytes( - s->HostVMInefficiencyFactor, - mode_lib->ms.vm_bytes[k], - mode_lib->ms.DPTEBytesPerRow[k], - mode_lib->ms.meta_row_bytes[k]); - } 
else { - s->per_pipe_flip_bytes[k] = 0; - } - mode_lib->ms.TotImmediateFlipBytes += s->per_pipe_flip_bytes[k] * mode_lib->ms.NoOfDPP[k]; - - } - - for (k = 0; k <= mode_lib->ms.num_active_planes - 1; k++) { - CalculateFlipSchedule( - &mode_lib->scratch, - display_cfg->plane_descriptors[k].immediate_flip, - 1, // use_lb_flip_bw - s->HostVMInefficiencyFactor, - s->Tvm_trips_flip[k], - s->Tr0_trips_flip[k], - s->Tvm_trips_flip_rounded[k], - s->Tr0_trips_flip_rounded[k], - display_cfg->gpuvm_enable, - mode_lib->ms.vm_bytes[k], - mode_lib->ms.DPTEBytesPerRow[k], - mode_lib->ms.BandwidthAvailableForImmediateFlip, - mode_lib->ms.TotImmediateFlipBytes, - display_cfg->plane_descriptors[k].pixel_format, - (display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].timing.h_total / ((double)display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].timing.pixel_clock_khz / 1000)), - display_cfg->plane_descriptors[k].composition.scaler_info.plane0.v_ratio, - display_cfg->plane_descriptors[k].composition.scaler_info.plane1.v_ratio, - mode_lib->ms.Tno_bw_flip[k], - mode_lib->ms.dpte_row_height[k], - mode_lib->ms.dpte_row_height_chroma[k], - mode_lib->ms.use_one_row_for_frame_flip[k], - mode_lib->ip.max_flip_time_us, - mode_lib->ip.max_flip_time_lines, - s->per_pipe_flip_bytes[k], - mode_lib->ms.meta_row_bytes[k], - s->meta_row_height_luma[k], - s->meta_row_height_chroma[k], - mode_lib->ip.dcn_mrq_present && display_cfg->plane_descriptors[k].surface.dcc.enable, - - /* Output */ - &mode_lib->ms.dst_y_per_vm_flip[k], - &mode_lib->ms.dst_y_per_row_flip[k], - &mode_lib->ms.final_flip_bw[k], - &mode_lib->ms.ImmediateFlipSupportedForPipe[k]); - } - - calculate_peak_bandwidth_params->urg_vactive_bandwidth_required = s->dummy_bw; - calculate_peak_bandwidth_params->urg_bandwidth_required = mode_lib->ms.support.urg_bandwidth_required_flip; - calculate_peak_bandwidth_params->urg_bandwidth_required_qual = s->dummy_bw; - 
calculate_peak_bandwidth_params->non_urg_bandwidth_required = mode_lib->ms.support.non_urg_bandwidth_required_flip; - calculate_peak_bandwidth_params->surface_avg_vactive_required_bw = s->surface_dummy_bw; - calculate_peak_bandwidth_params->surface_peak_required_bw = mode_lib->ms.surface_peak_required_bw; - - calculate_peak_bandwidth_params->display_cfg = display_cfg; - calculate_peak_bandwidth_params->inc_flip_bw = 1; - calculate_peak_bandwidth_params->num_active_planes = mode_lib->ms.num_active_planes; - calculate_peak_bandwidth_params->num_of_dpp = mode_lib->ms.NoOfDPP; - calculate_peak_bandwidth_params->dcc_dram_bw_nom_overhead_factor_p0 = mode_lib->ms.dcc_dram_bw_nom_overhead_factor_p0; - calculate_peak_bandwidth_params->dcc_dram_bw_nom_overhead_factor_p1 = mode_lib->ms.dcc_dram_bw_nom_overhead_factor_p1; - calculate_peak_bandwidth_params->dcc_dram_bw_pref_overhead_factor_p0 = mode_lib->ms.dcc_dram_bw_pref_overhead_factor_p0; - calculate_peak_bandwidth_params->dcc_dram_bw_pref_overhead_factor_p1 = mode_lib->ms.dcc_dram_bw_pref_overhead_factor_p1; - calculate_peak_bandwidth_params->mall_prefetch_sdp_overhead_factor = mode_lib->ms.mall_prefetch_sdp_overhead_factor; - calculate_peak_bandwidth_params->mall_prefetch_dram_overhead_factor = mode_lib->ms.mall_prefetch_dram_overhead_factor; - - calculate_peak_bandwidth_params->surface_read_bandwidth_l = mode_lib->ms.SurfaceReadBandwidthLuma; - calculate_peak_bandwidth_params->surface_read_bandwidth_c = mode_lib->ms.SurfaceReadBandwidthChroma; - calculate_peak_bandwidth_params->prefetch_bandwidth_l = mode_lib->ms.RequiredPrefetchPixelDataBWLuma; - calculate_peak_bandwidth_params->prefetch_bandwidth_c = mode_lib->ms.RequiredPrefetchPixelDataBWChroma; - calculate_peak_bandwidth_params->excess_vactive_fill_bw_l = mode_lib->ms.excess_vactive_fill_bw_l; - calculate_peak_bandwidth_params->excess_vactive_fill_bw_c = mode_lib->ms.excess_vactive_fill_bw_c; - calculate_peak_bandwidth_params->cursor_bw = mode_lib->ms.cursor_bw; - 
calculate_peak_bandwidth_params->dpte_row_bw = mode_lib->ms.dpte_row_bw; - calculate_peak_bandwidth_params->meta_row_bw = mode_lib->ms.meta_row_bw; - calculate_peak_bandwidth_params->prefetch_cursor_bw = mode_lib->ms.prefetch_cursor_bw; - calculate_peak_bandwidth_params->prefetch_vmrow_bw = mode_lib->ms.prefetch_vmrow_bw; - calculate_peak_bandwidth_params->flip_bw = mode_lib->ms.final_flip_bw; - calculate_peak_bandwidth_params->urgent_burst_factor_l = mode_lib->ms.UrgentBurstFactorLuma; - calculate_peak_bandwidth_params->urgent_burst_factor_c = mode_lib->ms.UrgentBurstFactorChroma; - calculate_peak_bandwidth_params->urgent_burst_factor_cursor = mode_lib->ms.UrgentBurstFactorCursor; - calculate_peak_bandwidth_params->urgent_burst_factor_prefetch_l = mode_lib->ms.UrgentBurstFactorLumaPre; - calculate_peak_bandwidth_params->urgent_burst_factor_prefetch_c = mode_lib->ms.UrgentBurstFactorChromaPre; - calculate_peak_bandwidth_params->urgent_burst_factor_prefetch_cursor = mode_lib->ms.UrgentBurstFactorCursorPre; - - calculate_peak_bandwidth_required( - &mode_lib->scratch, - calculate_peak_bandwidth_params); - - calculate_immediate_flip_bandwidth_support( - &s->dummy_single[0], // double* frac_urg_bandwidth_flip - &mode_lib->ms.support.ImmediateFlipSupport, - - dml2_core_internal_soc_state_sys_active, - mode_lib->ms.support.urg_bandwidth_required_flip, - mode_lib->ms.support.non_urg_bandwidth_required_flip, - mode_lib->ms.support.urg_bandwidth_available); - - for (k = 0; k <= mode_lib->ms.num_active_planes - 1; k++) { - if (display_cfg->plane_descriptors[k].immediate_flip == true && mode_lib->ms.ImmediateFlipSupportedForPipe[k] == false) - mode_lib->ms.support.ImmediateFlipSupport = false; - } - - } else { // if prefetch not support, assume iflip is not supported too - mode_lib->ms.support.ImmediateFlipSupport = false; + // if recalc_prefetch_schedule is set, recalculate the prefetch schedule with the new impacted_Tpre, prefetch should be possible + 
CheckGlobalPrefetchAdmissibility_params->recalc_prefetch_schedule = &s->recalc_prefetch_schedule; + CheckGlobalPrefetchAdmissibility_params->impacted_dst_y_pre = s->impacted_dst_y_pre; + mode_lib->ms.support.PrefetchSupported = CheckGlobalPrefetchAdmissibility(&mode_lib->scratch, CheckGlobalPrefetchAdmissibility_params); + s->recalc_prefetch_done = 1; + s->recalc_prefetch_schedule = 1; } - } // prefetch schedule +#endif + } // prefetch schedule ok, do urg bw and flip schedule + } while (s->recalc_prefetch_schedule); + + // Flip Schedule + // Both prefetch schedule and BW okay + if (mode_lib->ms.support.PrefetchSupported == true) { + mode_lib->ms.BandwidthAvailableForImmediateFlip = + get_bandwidth_available_for_immediate_flip( + dml2_core_internal_soc_state_sys_active, + mode_lib->ms.support.urg_bandwidth_required_qual, // no flip + mode_lib->ms.support.urg_bandwidth_available); + + mode_lib->ms.TotImmediateFlipBytes = 0; + for (k = 0; k < mode_lib->ms.num_active_planes; k++) { + if (display_cfg->plane_descriptors[k].immediate_flip) { + s->per_pipe_flip_bytes[k] = get_pipe_flip_bytes( + s->HostVMInefficiencyFactor, + mode_lib->ms.vm_bytes[k], + mode_lib->ms.DPTEBytesPerRow[k], + mode_lib->ms.meta_row_bytes[k]); + } else { + s->per_pipe_flip_bytes[k] = 0; + } + mode_lib->ms.TotImmediateFlipBytes += s->per_pipe_flip_bytes[k] * mode_lib->ms.NoOfDPP[k]; + + } + + for (k = 0; k < mode_lib->ms.num_active_planes; k++) { + CalculateFlipSchedule( + &mode_lib->scratch, + display_cfg->plane_descriptors[k].immediate_flip, + 1, // use_lb_flip_bw + s->HostVMInefficiencyFactor, + s->Tvm_trips_flip[k], + s->Tr0_trips_flip[k], + s->Tvm_trips_flip_rounded[k], + s->Tr0_trips_flip_rounded[k], + display_cfg->gpuvm_enable, + mode_lib->ms.vm_bytes[k], + mode_lib->ms.DPTEBytesPerRow[k], + mode_lib->ms.BandwidthAvailableForImmediateFlip, + mode_lib->ms.TotImmediateFlipBytes, + display_cfg->plane_descriptors[k].pixel_format, + 
(display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].timing.h_total / ((double)display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].timing.pixel_clock_khz / 1000)), + display_cfg->plane_descriptors[k].composition.scaler_info.plane0.v_ratio, + display_cfg->plane_descriptors[k].composition.scaler_info.plane1.v_ratio, + mode_lib->ms.Tno_bw_flip[k], + mode_lib->ms.dpte_row_height[k], + mode_lib->ms.dpte_row_height_chroma[k], + mode_lib->ms.use_one_row_for_frame_flip[k], + mode_lib->ip.max_flip_time_us, + mode_lib->ip.max_flip_time_lines, + s->per_pipe_flip_bytes[k], + mode_lib->ms.meta_row_bytes[k], + s->meta_row_height_luma[k], + s->meta_row_height_chroma[k], + mode_lib->ip.dcn_mrq_present && display_cfg->plane_descriptors[k].surface.dcc.enable, + + /* Output */ + &mode_lib->ms.dst_y_per_vm_flip[k], + &mode_lib->ms.dst_y_per_row_flip[k], + &mode_lib->ms.final_flip_bw[k], + &mode_lib->ms.ImmediateFlipSupportedForPipe[k]); + } + + calculate_peak_bandwidth_params->urg_vactive_bandwidth_required = s->dummy_bw; + calculate_peak_bandwidth_params->urg_bandwidth_required = mode_lib->ms.support.urg_bandwidth_required_flip; + calculate_peak_bandwidth_params->urg_bandwidth_required_qual = s->dummy_bw; + calculate_peak_bandwidth_params->non_urg_bandwidth_required = mode_lib->ms.support.non_urg_bandwidth_required_flip; + calculate_peak_bandwidth_params->surface_avg_vactive_required_bw = s->surface_dummy_bw; + calculate_peak_bandwidth_params->surface_peak_required_bw = mode_lib->ms.surface_peak_required_bw; + + calculate_peak_bandwidth_params->display_cfg = display_cfg; + calculate_peak_bandwidth_params->inc_flip_bw = 1; + calculate_peak_bandwidth_params->num_active_planes = mode_lib->ms.num_active_planes; + calculate_peak_bandwidth_params->num_of_dpp = mode_lib->ms.NoOfDPP; + calculate_peak_bandwidth_params->dcc_dram_bw_nom_overhead_factor_p0 = mode_lib->ms.dcc_dram_bw_nom_overhead_factor_p0; + 
calculate_peak_bandwidth_params->dcc_dram_bw_nom_overhead_factor_p1 = mode_lib->ms.dcc_dram_bw_nom_overhead_factor_p1; + calculate_peak_bandwidth_params->dcc_dram_bw_pref_overhead_factor_p0 = mode_lib->ms.dcc_dram_bw_pref_overhead_factor_p0; + calculate_peak_bandwidth_params->dcc_dram_bw_pref_overhead_factor_p1 = mode_lib->ms.dcc_dram_bw_pref_overhead_factor_p1; + calculate_peak_bandwidth_params->mall_prefetch_sdp_overhead_factor = mode_lib->ms.mall_prefetch_sdp_overhead_factor; + calculate_peak_bandwidth_params->mall_prefetch_dram_overhead_factor = mode_lib->ms.mall_prefetch_dram_overhead_factor; + + calculate_peak_bandwidth_params->surface_read_bandwidth_l = mode_lib->ms.vactive_sw_bw_l; + calculate_peak_bandwidth_params->surface_read_bandwidth_c = mode_lib->ms.vactive_sw_bw_c; + calculate_peak_bandwidth_params->prefetch_bandwidth_l = mode_lib->ms.RequiredPrefetchPixelDataBWLuma; + calculate_peak_bandwidth_params->prefetch_bandwidth_c = mode_lib->ms.RequiredPrefetchPixelDataBWChroma; + calculate_peak_bandwidth_params->excess_vactive_fill_bw_l = mode_lib->ms.excess_vactive_fill_bw_l; + calculate_peak_bandwidth_params->excess_vactive_fill_bw_c = mode_lib->ms.excess_vactive_fill_bw_c; + calculate_peak_bandwidth_params->cursor_bw = mode_lib->ms.cursor_bw; + calculate_peak_bandwidth_params->dpte_row_bw = mode_lib->ms.dpte_row_bw; + calculate_peak_bandwidth_params->meta_row_bw = mode_lib->ms.meta_row_bw; + calculate_peak_bandwidth_params->prefetch_cursor_bw = mode_lib->ms.prefetch_cursor_bw; + calculate_peak_bandwidth_params->prefetch_vmrow_bw = mode_lib->ms.prefetch_vmrow_bw; + calculate_peak_bandwidth_params->flip_bw = mode_lib->ms.final_flip_bw; + calculate_peak_bandwidth_params->urgent_burst_factor_l = mode_lib->ms.UrgentBurstFactorLuma; + calculate_peak_bandwidth_params->urgent_burst_factor_c = mode_lib->ms.UrgentBurstFactorChroma; + calculate_peak_bandwidth_params->urgent_burst_factor_cursor = mode_lib->ms.UrgentBurstFactorCursor; + 
calculate_peak_bandwidth_params->urgent_burst_factor_prefetch_l = mode_lib->ms.UrgentBurstFactorLumaPre; + calculate_peak_bandwidth_params->urgent_burst_factor_prefetch_c = mode_lib->ms.UrgentBurstFactorChromaPre; + calculate_peak_bandwidth_params->urgent_burst_factor_prefetch_cursor = mode_lib->ms.UrgentBurstFactorCursorPre; + + calculate_peak_bandwidth_required( + &mode_lib->scratch, + calculate_peak_bandwidth_params); + + calculate_immediate_flip_bandwidth_support( + &s->dummy_single[0], // double* frac_urg_bandwidth_flip + &mode_lib->ms.support.ImmediateFlipSupport, + + dml2_core_internal_soc_state_sys_active, + mode_lib->ms.support.urg_bandwidth_required_flip, + mode_lib->ms.support.non_urg_bandwidth_required_flip, + mode_lib->ms.support.urg_bandwidth_available); + + for (k = 0; k <= mode_lib->ms.num_active_planes - 1; k++) { + if (display_cfg->plane_descriptors[k].immediate_flip == true && mode_lib->ms.ImmediateFlipSupportedForPipe[k] == false) + mode_lib->ms.support.ImmediateFlipSupport = false; + } + + } else { // if prefetch not support, assume iflip is not supported too + mode_lib->ms.support.ImmediateFlipSupport = false; } s->mSOCParameters.UrgentLatency = mode_lib->ms.UrgLatency; @@ -9116,8 +9502,8 @@ static bool dml_core_mode_support(struct dml2_core_calcs_mode_support_ex *in_out s->pstate_bytes_required_c, mode_lib->ms.dcc_dram_bw_nom_overhead_factor_p0, mode_lib->ms.dcc_dram_bw_nom_overhead_factor_p1, - mode_lib->ms.SurfaceReadBandwidthLuma, - mode_lib->ms.SurfaceReadBandwidthChroma, + mode_lib->ms.vactive_sw_bw_l, + mode_lib->ms.vactive_sw_bw_c, mode_lib->ms.surface_avg_vactive_required_bw, mode_lib->ms.surface_peak_required_bw, /* outputs */ @@ -9187,12 +9573,12 @@ static bool dml_core_mode_support(struct dml2_core_calcs_mode_support_ex *in_out dml2_printf("DML::%s: ModeSupport = %u\n", __func__, mode_lib->ms.support.ModeSupport); dml2_printf("DML::%s: ImmediateFlipSupport = %u\n", __func__, mode_lib->ms.support.ImmediateFlipSupport); - for (k = 0; 
k <= mode_lib->ms.num_active_planes - 1; k++) { + for (k = 0; k < mode_lib->ms.num_active_planes; k++) { mode_lib->ms.support.MPCCombineEnable[k] = mode_lib->ms.MPCCombine[k]; mode_lib->ms.support.DPPPerSurface[k] = mode_lib->ms.NoOfDPP[k]; } - for (k = 0; k <= mode_lib->ms.num_active_planes - 1; k++) { + for (k = 0; k < mode_lib->ms.num_active_planes; k++) { mode_lib->ms.support.ODMMode[k] = mode_lib->ms.ODMMode[k]; mode_lib->ms.support.DSCEnabled[k] = mode_lib->ms.RequiresDSC[k]; mode_lib->ms.support.FECEnabled[k] = mode_lib->ms.RequiresFEC[k]; @@ -9229,7 +9615,7 @@ unsigned int dml2_core_calcs_mode_support_ex(struct dml2_core_calcs_mode_support dml2_printf("DML::%s: is_mode_support = %u (min_clk_index=%d)\n", __func__, result, in_out_params->min_clk_index); for (unsigned int k = 0; k < in_out_params->in_display_cfg->num_planes; k++) - dml2_printf("DML::%s: plane_%d: reserved_vblank_time_ns = %u\n", __func__, k, in_out_params->in_display_cfg->plane_descriptors[k].overrides.reserved_vblank_time_ns); + dml2_printf("DML::%s: plane_%d: reserved_vblank_time_ns = %u\n", __func__, k, in_out_params->in_display_cfg->plane_descriptors[k].overrides.reserved_vblank_time_ns); dml2_printf("DML::%s: ------------- DONE ----------\n", __func__); @@ -9882,7 +10268,7 @@ static void CalculateStutterEfficiency(struct dml2_core_internal_scratch *scratc if (!dml_is_phantom_pipe(&p->display_cfg->plane_descriptors[k])) { if (!l->stream_visited[p->display_cfg->plane_descriptors[k].stream_index]) { - if (p->display_cfg->stream_descriptors[k].writeback.enable) + if (p->display_cfg->stream_descriptors[k].writeback.active_writebacks_per_stream > 0) l->TotalActiveWriteback = l->TotalActiveWriteback + 1; if (TotalNumberOfActiveOTG == 0) { // first otg @@ -9984,6 +10370,7 @@ static bool dml_core_mode_programming(struct dml2_core_calcs_mode_programming_ex struct dml2_core_calcs_CalculateSwathAndDETConfiguration_params *CalculateSwathAndDETConfiguration_params = 
&mode_lib->scratch.CalculateSwathAndDETConfiguration_params; struct dml2_core_calcs_CalculateStutterEfficiency_params *CalculateStutterEfficiency_params = &mode_lib->scratch.CalculateStutterEfficiency_params; struct dml2_core_calcs_CalculatePrefetchSchedule_params *CalculatePrefetchSchedule_params = &mode_lib->scratch.CalculatePrefetchSchedule_params; + struct dml2_core_calcs_CheckGlobalPrefetchAdmissibility_params *CheckGlobalPrefetchAdmissibility_params = &mode_lib->scratch.CheckGlobalPrefetchAdmissibility_params; struct dml2_core_calcs_calculate_mcache_setting_params *calculate_mcache_setting_params = &mode_lib->scratch.calculate_mcache_setting_params; struct dml2_core_calcs_calculate_tdlut_setting_params *calculate_tdlut_setting_params = &mode_lib->scratch.calculate_tdlut_setting_params; struct dml2_core_shared_CalculateMetaAndPTETimes_params *CalculateMetaAndPTETimes_params = &mode_lib->scratch.CalculateMetaAndPTETimes_params; @@ -10075,12 +10462,6 @@ static bool dml_core_mode_programming(struct dml2_core_calcs_mode_programming_ex dml2_assert(s->SOCCLK > 0); #ifdef __DML_VBA_DEBUG__ - // dml2_printf_dml_display_cfg_timing(&display_cfg->timing, s->num_active_planes); - // dml2_printf_dml_display_cfg_plane(&display_cfg->plane, s->num_active_planes); - // dml2_printf_dml_display_cfg_surface(&display_cfg->surface, s->num_active_planes); - // dml2_printf_dml_display_cfg_output(&display_cfg->output, s->num_active_planes); - // dml2_printf_dml_display_cfg_hw_resource(&display_cfg->hw, s->num_active_planes); - dml2_printf("DML::%s: num_active_planes = %u\n", __func__, s->num_active_planes); dml2_printf("DML::%s: num_active_pipes = %u\n", __func__, mode_lib->mp.num_active_pipes); dml2_printf("DML::%s: Dcfclk = %f\n", __func__, mode_lib->mp.Dcfclk); @@ -10198,10 +10579,10 @@ static bool dml_core_mode_programming(struct dml2_core_calcs_mode_programming_ex for (k = 0; k < s->num_active_planes; ++k) { mode_lib->mp.cursor_bw[k] = 
display_cfg->plane_descriptors[k].cursor.num_cursors * display_cfg->plane_descriptors[k].cursor.cursor_width * display_cfg->plane_descriptors[k].cursor.cursor_bpp / 8.0 / ((double)display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].timing.h_total / ((double)display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].timing.pixel_clock_khz / 1000)); - mode_lib->mp.SurfaceReadBandwidthLuma[k] = mode_lib->mp.SwathWidthSingleDPPY[k] * mode_lib->mp.BytePerPixelY[k] / (display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].timing.h_total / ((double)display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].timing.pixel_clock_khz / 1000)) * display_cfg->plane_descriptors[k].composition.scaler_info.plane0.v_ratio; - mode_lib->mp.SurfaceReadBandwidthChroma[k] = mode_lib->mp.SwathWidthSingleDPPC[k] * mode_lib->mp.BytePerPixelC[k] / (display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].timing.h_total / ((double)display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].timing.pixel_clock_khz / 1000)) * display_cfg->plane_descriptors[k].composition.scaler_info.plane1.v_ratio; - dml2_printf("DML::%s: ReadBandwidthSurfaceLuma[%i] = %fBps\n", __func__, k, mode_lib->mp.SurfaceReadBandwidthLuma[k]); - dml2_printf("DML::%s: ReadBandwidthSurfaceChroma[%i] = %fBps\n", __func__, k, mode_lib->mp.SurfaceReadBandwidthChroma[k]); + mode_lib->mp.vactive_sw_bw_l[k] = mode_lib->mp.SwathWidthSingleDPPY[k] * mode_lib->mp.BytePerPixelY[k] / (display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].timing.h_total / ((double)display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].timing.pixel_clock_khz / 1000)) * display_cfg->plane_descriptors[k].composition.scaler_info.plane0.v_ratio; + mode_lib->mp.vactive_sw_bw_c[k] = mode_lib->mp.SwathWidthSingleDPPC[k] * mode_lib->mp.BytePerPixelC[k] / 
(display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].timing.h_total / ((double)display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].timing.pixel_clock_khz / 1000)) * display_cfg->plane_descriptors[k].composition.scaler_info.plane1.v_ratio; + dml2_printf("DML::%s: vactive_sw_bw_l[%i] = %fBps\n", __func__, k, mode_lib->mp.vactive_sw_bw_l[k]); + dml2_printf("DML::%s: vactive_sw_bw_c[%i] = %fBps\n", __func__, k, mode_lib->mp.vactive_sw_bw_c[k]); } CalculateSwathAndDETConfiguration_params->display_cfg = display_cfg; @@ -10217,8 +10598,8 @@ static bool dml_core_mode_programming(struct dml2_core_calcs_mode_programming_ex CalculateSwathAndDETConfiguration_params->nomDETInKByte = s->NomDETInKByte; CalculateSwathAndDETConfiguration_params->ConfigReturnBufferSegmentSizeInkByte = mode_lib->ip.config_return_buffer_segment_size_in_kbytes; CalculateSwathAndDETConfiguration_params->CompressedBufferSegmentSizeInkByte = mode_lib->ip.compressed_buffer_segment_size_in_kbytes; - CalculateSwathAndDETConfiguration_params->ReadBandwidthLuma = mode_lib->mp.SurfaceReadBandwidthLuma; - CalculateSwathAndDETConfiguration_params->ReadBandwidthChroma = mode_lib->mp.SurfaceReadBandwidthChroma; + CalculateSwathAndDETConfiguration_params->ReadBandwidthLuma = mode_lib->mp.vactive_sw_bw_l; + CalculateSwathAndDETConfiguration_params->ReadBandwidthChroma = mode_lib->mp.vactive_sw_bw_c; CalculateSwathAndDETConfiguration_params->MaximumSwathWidthLuma = s->dummy_single_array[0]; CalculateSwathAndDETConfiguration_params->MaximumSwathWidthChroma = s->dummy_single_array[1]; CalculateSwathAndDETConfiguration_params->Read256BytesBlockHeightY = mode_lib->mp.Read256BlockHeightY; @@ -10539,8 +10920,8 @@ static bool dml_core_mode_programming(struct dml2_core_calcs_mode_programming_ex calculate_tdlut_setting_params->tdlut_groups_per_2row_ub = &s->tdlut_groups_per_2row_ub[k]; calculate_tdlut_setting_params->tdlut_opt_time = &s->tdlut_opt_time[k]; 
calculate_tdlut_setting_params->tdlut_drain_time = &s->tdlut_drain_time[k]; + calculate_tdlut_setting_params->tdlut_bytes_to_deliver = &s->tdlut_bytes_to_deliver[k]; calculate_tdlut_setting_params->tdlut_bytes_per_group = &s->tdlut_bytes_per_group[k]; - calculate_tdlut_setting(&mode_lib->scratch, calculate_tdlut_setting_params); } @@ -10583,17 +10964,17 @@ static bool dml_core_mode_programming(struct dml2_core_calcs_mode_programming_ex mode_lib->mp.TCalc = 24.0 / mode_lib->mp.DCFCLKDeepSleep; for (k = 0; k < s->num_active_planes; ++k) { - if (display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.enable == true) { + if (display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.active_writebacks_per_stream > 0) { mode_lib->mp.WritebackDelay[k] = mode_lib->soc.qos_parameters.writeback.base_latency_us + CalculateWriteBackDelay( - display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.pixel_format, - display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.scaling_info.h_ratio, - display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.scaling_info.v_ratio, - display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.scaling_info.v_taps, - display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.scaling_info.output_width, - display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.scaling_info.output_height, - display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.scaling_info.input_height, + display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.writeback_stream[0].pixel_format, + display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.writeback_stream[0].h_ratio, + 
display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.writeback_stream[0].v_ratio, + display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.writeback_stream[0].v_taps, + display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.writeback_stream[0].output_width, + display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.writeback_stream[0].output_height, + display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.writeback_stream[0].input_height, display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].timing.h_total) / mode_lib->mp.Dispclk; } else mode_lib->mp.WritebackDelay[k] = 0; @@ -10679,10 +11060,25 @@ static bool dml_core_mode_programming(struct dml2_core_calcs_mode_programming_ex for (k = 0; k < s->num_active_planes; ++k) { bool cursor_not_enough_urgent_latency_hiding = 0; - double line_time_us = 0.0; - - line_time_us = display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].timing.h_total / + s->line_times[k] = display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].timing.h_total / ((double)display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].timing.pixel_clock_khz / 1000); + + s->pixel_format[k] = display_cfg->plane_descriptors[k].pixel_format; + + s->lb_source_lines_l[k] = get_num_lb_source_lines(mode_lib->ip.max_line_buffer_lines, mode_lib->ip.line_buffer_size_bits, + mode_lib->mp.NoOfDPP[k], + display_cfg->plane_descriptors[k].composition.viewport.plane0.width, + display_cfg->plane_descriptors[k].composition.viewport.plane0.height, + display_cfg->plane_descriptors[k].composition.scaler_info.plane0.h_ratio, + display_cfg->plane_descriptors[k].composition.rotation_angle); + + s->lb_source_lines_c[k] = get_num_lb_source_lines(mode_lib->ip.max_line_buffer_lines, 
mode_lib->ip.line_buffer_size_bits, + mode_lib->mp.NoOfDPP[k], + display_cfg->plane_descriptors[k].composition.viewport.plane1.width, + display_cfg->plane_descriptors[k].composition.viewport.plane1.height, + display_cfg->plane_descriptors[k].composition.scaler_info.plane1.h_ratio, + display_cfg->plane_descriptors[k].composition.rotation_angle); + if (display_cfg->plane_descriptors[k].cursor.num_cursors > 0) { calculate_cursor_req_attributes( display_cfg->plane_descriptors[k].cursor.cursor_width, @@ -10699,7 +11095,7 @@ static bool dml_core_mode_programming(struct dml2_core_calcs_mode_programming_ex display_cfg->plane_descriptors[k].cursor.cursor_width, s->cursor_bytes_per_chunk[k], s->cursor_lines_per_chunk[k], - line_time_us, + s->line_times[k], mode_lib->mp.UrgentLatency, // output @@ -10714,7 +11110,7 @@ static bool dml_core_mode_programming(struct dml2_core_calcs_mode_programming_ex mode_lib->mp.swath_width_chroma_ub[k], mode_lib->mp.SwathHeightY[k], mode_lib->mp.SwathHeightC[k], - line_time_us, + s->line_times[k], mode_lib->mp.UrgentLatency, display_cfg->plane_descriptors[k].composition.scaler_info.plane0.v_ratio, display_cfg->plane_descriptors[k].composition.scaler_info.plane1.v_ratio, @@ -10752,6 +11148,35 @@ static bool dml_core_mode_programming(struct dml2_core_calcs_mode_programming_ex dml2_printf("DML::%s: immediate_flip_required = %u\n", __func__, s->immediate_flip_required); #endif + if (s->num_active_planes > 1) { + CheckGlobalPrefetchAdmissibility_params->num_active_planes = s->num_active_planes; + CheckGlobalPrefetchAdmissibility_params->pixel_format = s->pixel_format; + CheckGlobalPrefetchAdmissibility_params->chunk_bytes_l = mode_lib->ip.pixel_chunk_size_kbytes * 1024; + CheckGlobalPrefetchAdmissibility_params->chunk_bytes_c = mode_lib->ip.pixel_chunk_size_kbytes * 1024; + CheckGlobalPrefetchAdmissibility_params->lb_source_lines_l = s->lb_source_lines_l; + CheckGlobalPrefetchAdmissibility_params->lb_source_lines_c = s->lb_source_lines_c; + 
CheckGlobalPrefetchAdmissibility_params->swath_height_l = mode_lib->mp.SwathHeightY; + CheckGlobalPrefetchAdmissibility_params->swath_height_c = mode_lib->mp.SwathHeightC; + CheckGlobalPrefetchAdmissibility_params->rob_buffer_size_kbytes = mode_lib->ip.rob_buffer_size_kbytes; + CheckGlobalPrefetchAdmissibility_params->compressed_buffer_size_kbytes = mode_lib->mp.CompressedBufferSizeInkByte; + CheckGlobalPrefetchAdmissibility_params->detile_buffer_size_bytes_l = mode_lib->mp.DETBufferSizeY; + CheckGlobalPrefetchAdmissibility_params->detile_buffer_size_bytes_c = mode_lib->mp.DETBufferSizeC; + CheckGlobalPrefetchAdmissibility_params->full_swath_bytes_l = s->full_swath_bytes_l; + CheckGlobalPrefetchAdmissibility_params->full_swath_bytes_c = s->full_swath_bytes_c; + CheckGlobalPrefetchAdmissibility_params->prefetch_sw_bytes = s->prefetch_sw_bytes; + CheckGlobalPrefetchAdmissibility_params->Tpre_rounded = 0; // don't care + CheckGlobalPrefetchAdmissibility_params->Tpre_oto = 0; // don't care + CheckGlobalPrefetchAdmissibility_params->estimated_urg_bandwidth_required_mbps = mode_lib->mp.urg_bandwidth_available[dml2_core_internal_soc_state_sys_active][dml2_core_internal_bw_sdp]; + CheckGlobalPrefetchAdmissibility_params->estimated_dcfclk_mhz = mode_lib->mp.Dcfclk; + CheckGlobalPrefetchAdmissibility_params->line_time = s->line_times; + CheckGlobalPrefetchAdmissibility_params->dst_y_prefetch = mode_lib->mp.dst_y_prefetch; + + // if recalc_prefetch_schedule is set, recalculate the prefetch schedule with the new impacted_Tpre, prefetch should be possible + CheckGlobalPrefetchAdmissibility_params->recalc_prefetch_schedule = &s->dummy_boolean[0]; + CheckGlobalPrefetchAdmissibility_params->impacted_dst_y_pre = s->impacted_dst_y_pre; + CheckGlobalPrefetchAdmissibility(&mode_lib->scratch, CheckGlobalPrefetchAdmissibility_params); // dont care about the check output for mode programming + } + { s->DestinationLineTimesForPrefetchLessThan2 = false; s->VRatioPrefetchMoreThanMax = 
false; @@ -10763,11 +11188,11 @@ static bool dml_core_mode_programming(struct dml2_core_calcs_mode_programming_ex dml2_printf("DML::%s: k=%d MaxVStartupLines = %u\n", __func__, k, s->MaxVStartupLines[k]); mode_lib->mp.TWait[k] = CalculateTWait( - display_cfg->plane_descriptors[k].overrides.reserved_vblank_time_ns, - mode_lib->mp.UrgentLatency, - mode_lib->mp.TripToMemory, - !dml_is_phantom_pipe(&display_cfg->plane_descriptors[k]) && display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].timing.drr_config.enabled ? - get_g6_temp_read_blackout_us(&mode_lib->soc, (unsigned int)(mode_lib->mp.uclk_freq_mhz * 1000), in_out_params->min_clk_index) : 0.0); + display_cfg->plane_descriptors[k].overrides.reserved_vblank_time_ns, + mode_lib->mp.UrgentLatency, + mode_lib->mp.TripToMemory, + !dml_is_phantom_pipe(&display_cfg->plane_descriptors[k]) && display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].timing.drr_config.enabled ? + get_g6_temp_read_blackout_us(&mode_lib->soc, (unsigned int)(mode_lib->mp.uclk_freq_mhz * 1000), in_out_params->min_clk_index) : 0.0); myPipe->Dppclk = mode_lib->mp.Dppclk[k]; myPipe->Dispclk = mode_lib->mp.Dispclk; @@ -10848,6 +11273,9 @@ static bool dml_core_mode_programming(struct dml2_core_calcs_mode_programming_ex CalculatePrefetchSchedule_params->mrq_present = mode_lib->ip.dcn_mrq_present; CalculatePrefetchSchedule_params->meta_row_bytes = mode_lib->mp.meta_row_bytes[k]; CalculatePrefetchSchedule_params->mall_prefetch_sdp_overhead_factor = mode_lib->mp.mall_prefetch_sdp_overhead_factor[k]; + CalculatePrefetchSchedule_params->impacted_dst_y_pre = s->impacted_dst_y_pre[k]; + CalculatePrefetchSchedule_params->vactive_sw_bw_l = mode_lib->mp.vactive_sw_bw_l[k]; + CalculatePrefetchSchedule_params->vactive_sw_bw_c = mode_lib->mp.vactive_sw_bw_c[k]; // output CalculatePrefetchSchedule_params->DSTXAfterScaler = &mode_lib->mp.DSTXAfterScaler[k]; @@ -10876,9 +11304,18 @@ static bool 
dml_core_mode_programming(struct dml2_core_calcs_mode_programming_ex CalculatePrefetchSchedule_params->VUpdateWidthPix = &mode_lib->mp.VUpdateWidthPix[k]; CalculatePrefetchSchedule_params->VReadyOffsetPix = &mode_lib->mp.VReadyOffsetPix[k]; CalculatePrefetchSchedule_params->prefetch_cursor_bw = &mode_lib->mp.prefetch_cursor_bw[k]; + CalculatePrefetchSchedule_params->prefetch_sw_bytes = &s->prefetch_sw_bytes[k]; + CalculatePrefetchSchedule_params->Tpre_rounded = &s->Tpre_rounded[k]; + CalculatePrefetchSchedule_params->Tpre_oto = &s->Tpre_oto[k]; + CalculatePrefetchSchedule_params->prefetch_swath_time_us = &s->dummy_single[0]; mode_lib->mp.NoTimeToPrefetch[k] = CalculatePrefetchSchedule(&mode_lib->scratch, CalculatePrefetchSchedule_params); + if (s->impacted_dst_y_pre[k] > 0) + mode_lib->mp.impacted_prefetch_margin_us[k] = (mode_lib->mp.dst_y_prefetch[k] - s->impacted_dst_y_pre[k]) * s->line_times[k]; + else + mode_lib->mp.impacted_prefetch_margin_us[k] = 0; + #ifdef __DML_VBA_DEBUG__ dml2_printf("DML::%s: k=%0u NoTimeToPrefetch=%0d\n", __func__, k, mode_lib->mp.NoTimeToPrefetch[k]); #endif @@ -10956,8 +11393,8 @@ static bool dml_core_mode_programming(struct dml2_core_calcs_mode_programming_ex dml2_printf("DML::%s: k=%0u VRatioY=%f\n", __func__, k, display_cfg->plane_descriptors[k].composition.scaler_info.plane0.v_ratio); dml2_printf("DML::%s: k=%0u prefetch_vmrow_bw=%f\n", __func__, k, mode_lib->mp.prefetch_vmrow_bw[k]); - dml2_printf("DML::%s: k=%0u ReadBandwidthSurfaceLuma=%f\n", __func__, k, mode_lib->mp.SurfaceReadBandwidthLuma[k]); - dml2_printf("DML::%s: k=%0u ReadBandwidthSurfaceChroma=%f\n", __func__, k, mode_lib->mp.SurfaceReadBandwidthChroma[k]); + dml2_printf("DML::%s: k=%0u vactive_sw_bw_l=%f\n", __func__, k, mode_lib->mp.vactive_sw_bw_l[k]); + dml2_printf("DML::%s: k=%0u vactive_sw_bw_c=%f\n", __func__, k, mode_lib->mp.vactive_sw_bw_c[k]); dml2_printf("DML::%s: k=%0u cursor_bw=%f\n", __func__, k, mode_lib->mp.cursor_bw[k]); dml2_printf("DML::%s: k=%0u 
dpte_row_bw=%f\n", __func__, k, mode_lib->mp.dpte_row_bw[k]); dml2_printf("DML::%s: k=%0u meta_row_bw=%f\n", __func__, k, mode_lib->mp.meta_row_bw[k]); @@ -10988,8 +11425,8 @@ static bool dml_core_mode_programming(struct dml2_core_calcs_mode_programming_ex calculate_peak_bandwidth_params->mall_prefetch_sdp_overhead_factor = mode_lib->mp.mall_prefetch_sdp_overhead_factor; calculate_peak_bandwidth_params->mall_prefetch_dram_overhead_factor = mode_lib->mp.mall_prefetch_dram_overhead_factor; - calculate_peak_bandwidth_params->surface_read_bandwidth_l = mode_lib->mp.SurfaceReadBandwidthLuma; - calculate_peak_bandwidth_params->surface_read_bandwidth_c = mode_lib->mp.SurfaceReadBandwidthChroma; + calculate_peak_bandwidth_params->surface_read_bandwidth_l = mode_lib->mp.vactive_sw_bw_l; + calculate_peak_bandwidth_params->surface_read_bandwidth_c = mode_lib->mp.vactive_sw_bw_c; calculate_peak_bandwidth_params->prefetch_bandwidth_l = mode_lib->mp.RequiredPrefetchPixelDataBWLuma; calculate_peak_bandwidth_params->prefetch_bandwidth_c = mode_lib->mp.RequiredPrefetchPixelDataBWChroma; calculate_peak_bandwidth_params->excess_vactive_fill_bw_l = mode_lib->mp.excess_vactive_fill_bw_l; @@ -11120,8 +11557,8 @@ static bool dml_core_mode_programming(struct dml2_core_calcs_mode_programming_ex calculate_peak_bandwidth_params->mall_prefetch_sdp_overhead_factor = mode_lib->mp.mall_prefetch_sdp_overhead_factor; calculate_peak_bandwidth_params->mall_prefetch_dram_overhead_factor = mode_lib->mp.mall_prefetch_dram_overhead_factor; - calculate_peak_bandwidth_params->surface_read_bandwidth_l = mode_lib->mp.SurfaceReadBandwidthLuma; - calculate_peak_bandwidth_params->surface_read_bandwidth_c = mode_lib->mp.SurfaceReadBandwidthChroma; + calculate_peak_bandwidth_params->surface_read_bandwidth_l = mode_lib->mp.vactive_sw_bw_l; + calculate_peak_bandwidth_params->surface_read_bandwidth_c = mode_lib->mp.vactive_sw_bw_c; calculate_peak_bandwidth_params->prefetch_bandwidth_l = 
mode_lib->mp.RequiredPrefetchPixelDataBWLuma; calculate_peak_bandwidth_params->prefetch_bandwidth_c = mode_lib->mp.RequiredPrefetchPixelDataBWChroma; calculate_peak_bandwidth_params->excess_vactive_fill_bw_l = mode_lib->mp.excess_vactive_fill_bw_l; @@ -11238,8 +11675,8 @@ static bool dml_core_mode_programming(struct dml2_core_calcs_mode_programming_ex s->mmSOCParameters.USRRetrainingLatency = 0; s->mmSOCParameters.SMNLatency = 0; s->mmSOCParameters.g6_temp_read_blackout_us = get_g6_temp_read_blackout_us(&mode_lib->soc, (unsigned int)(mode_lib->mp.uclk_freq_mhz * 1000), in_out_params->min_clk_index); - s->mmSOCParameters.max_urgent_latency_us = get_max_urgent_latency_us(&mode_lib->soc.qos_parameters.qos_params.dcn4x, mode_lib->ms.uclk_freq_mhz, mode_lib->ms.FabricClock, in_out_params->min_clk_index); - s->mmSOCParameters.df_response_time_us = mode_lib->soc.qos_parameters.qos_params.dcn4x.df_qos_response_time_fclk_cycles / mode_lib->ms.FabricClock; + s->mmSOCParameters.max_urgent_latency_us = get_max_urgent_latency_us(&mode_lib->soc.qos_parameters.qos_params.dcn4x, mode_lib->mp.uclk_freq_mhz, mode_lib->mp.FabricClock, in_out_params->min_clk_index); + s->mmSOCParameters.df_response_time_us = mode_lib->soc.qos_parameters.qos_params.dcn4x.df_qos_response_time_fclk_cycles / mode_lib->mp.FabricClock; s->mmSOCParameters.qos_type = mode_lib->soc.qos_parameters.qos_type; CalculateWatermarks_params->display_cfg = display_cfg; @@ -11289,7 +11726,7 @@ static bool dml_core_mode_programming(struct dml2_core_calcs_mode_programming_ex CalculateWatermarksMALLUseAndDRAMSpeedChangeSupport(&mode_lib->scratch, CalculateWatermarks_params); for (k = 0; k < s->num_active_planes; ++k) { - if (display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.enable == true) { + if (display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.active_writebacks_per_stream > 0) { mode_lib->mp.WritebackAllowDRAMClockChangeEndPosition[k] = 
math_max2(0, mode_lib->mp.VStartupMin[k] * display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].timing.h_total / ((double)display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].timing.pixel_clock_khz / 1000) - mode_lib->mp.Watermark.WritebackDRAMClockChangeWatermark); mode_lib->mp.WritebackAllowFCLKChangeEndPosition[k] = math_max2(0, mode_lib->mp.VStartupMin[k] * display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].timing.h_total / @@ -11475,25 +11912,25 @@ static bool dml_core_mode_programming(struct dml2_core_calcs_mode_programming_ex //Maximum Bandwidth Used s->TotalWRBandwidth = 0; - s->WRBandwidth = 0; - for (k = 0; k < s->num_active_planes; ++k) { - if (display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.enable == true && display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.pixel_format == dml2_444_32) { - s->WRBandwidth = display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.scaling_info.output_height * display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.scaling_info.output_width / - (display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].timing.h_total * display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.scaling_info.input_height / ((double)display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].timing.pixel_clock_khz / 1000)) * 4; - } else if (display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.enable == true) { - s->WRBandwidth = display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.scaling_info.output_height * display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.scaling_info.output_width / - 
(display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].timing.h_total * display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].writeback.scaling_info.input_height / ((double)display_cfg->stream_descriptors[display_cfg->plane_descriptors[k].stream_index].timing.pixel_clock_khz / 1000)) * 8; + for (k = 0; k < display_cfg->num_streams; ++k) { + s->WRBandwidth = 0; + if (display_cfg->stream_descriptors[k].writeback.active_writebacks_per_stream > 0) { + s->WRBandwidth = display_cfg->stream_descriptors[k].writeback.writeback_stream[0].output_height + * display_cfg->stream_descriptors[k].writeback.writeback_stream[0].output_width / + (display_cfg->stream_descriptors[k].timing.h_total * display_cfg->stream_descriptors[k].writeback.writeback_stream[0].input_height + / ((double)display_cfg->stream_descriptors[k].timing.pixel_clock_khz / 1000)) + * (display_cfg->stream_descriptors[k].writeback.writeback_stream[0].pixel_format == dml2_444_32 ? 4.0 : 8.0); + s->TotalWRBandwidth = s->TotalWRBandwidth + s->WRBandwidth; } - s->TotalWRBandwidth = s->TotalWRBandwidth + s->WRBandwidth; } mode_lib->mp.TotalDataReadBandwidth = 0; for (k = 0; k < s->num_active_planes; ++k) { - mode_lib->mp.TotalDataReadBandwidth = mode_lib->mp.TotalDataReadBandwidth + mode_lib->mp.SurfaceReadBandwidthLuma[k] + mode_lib->mp.SurfaceReadBandwidthChroma[k]; + mode_lib->mp.TotalDataReadBandwidth = mode_lib->mp.TotalDataReadBandwidth + mode_lib->mp.vactive_sw_bw_l[k] + mode_lib->mp.vactive_sw_bw_c[k]; #ifdef __DML_VBA_DEBUG__ dml2_printf("DML::%s: k=%u, TotalDataReadBandwidth = %f\n", __func__, k, mode_lib->mp.TotalDataReadBandwidth); - dml2_printf("DML::%s: k=%u, ReadBandwidthSurfaceLuma = %f\n", __func__, k, mode_lib->mp.SurfaceReadBandwidthLuma[k]); - dml2_printf("DML::%s: k=%u, ReadBandwidthSurfaceChroma = %f\n", __func__, k, mode_lib->mp.SurfaceReadBandwidthChroma[k]); + dml2_printf("DML::%s: k=%u, vactive_sw_bw_l = %f\n", __func__, k, 
mode_lib->mp.vactive_sw_bw_l[k]); + dml2_printf("DML::%s: k=%u, vactive_sw_bw_c = %f\n", __func__, k, mode_lib->mp.vactive_sw_bw_c[k]); #endif } @@ -11530,8 +11967,8 @@ static bool dml_core_mode_programming(struct dml2_core_calcs_mode_programming_ex CalculateStutterEfficiency_params->BlockWidth256BytesC = mode_lib->mp.Read256BlockWidthC; CalculateStutterEfficiency_params->DCCYMaxUncompressedBlock = mode_lib->mp.DCCYMaxUncompressedBlock; CalculateStutterEfficiency_params->DCCCMaxUncompressedBlock = mode_lib->mp.DCCCMaxUncompressedBlock; - CalculateStutterEfficiency_params->ReadBandwidthSurfaceLuma = mode_lib->mp.SurfaceReadBandwidthLuma; - CalculateStutterEfficiency_params->ReadBandwidthSurfaceChroma = mode_lib->mp.SurfaceReadBandwidthChroma; + CalculateStutterEfficiency_params->ReadBandwidthSurfaceLuma = mode_lib->mp.vactive_sw_bw_l; + CalculateStutterEfficiency_params->ReadBandwidthSurfaceChroma = mode_lib->mp.vactive_sw_bw_c; CalculateStutterEfficiency_params->dpte_row_bw = mode_lib->mp.dpte_row_bw; CalculateStutterEfficiency_params->meta_row_bw = mode_lib->mp.meta_row_bw; CalculateStutterEfficiency_params->rob_alloc_compressed = mode_lib->ip.dcn_mrq_present; @@ -11742,7 +12179,7 @@ static void rq_dlg_get_wm_regs(const struct dml2_display_cfg *display_cfg, const wm_regs->fclk_pstate = (int unsigned)(mode_lib->mp.Watermark.FCLKChangeWatermark * refclk_freq_in_mhz); wm_regs->sr_enter = (int unsigned)(mode_lib->mp.Watermark.StutterEnterPlusExitWatermark * refclk_freq_in_mhz); wm_regs->sr_exit = (int unsigned)(mode_lib->mp.Watermark.StutterExitWatermark * refclk_freq_in_mhz); - wm_regs->temp_read_or_ppt = (int unsigned)(mode_lib->mp.Watermark.g6_temp_read_watermark_us * refclk_freq_in_mhz); + wm_regs->temp_read_or_ppt = (int unsigned)(mode_lib->mp.Watermark.temp_read_or_ppt_watermark_us * refclk_freq_in_mhz); wm_regs->uclk_pstate = (int unsigned)(mode_lib->mp.Watermark.DRAMClockChangeWatermark * refclk_freq_in_mhz); wm_regs->urgent = (int 
unsigned)(mode_lib->mp.Watermark.UrgentWatermark * refclk_freq_in_mhz); wm_regs->usr = (int unsigned)(mode_lib->mp.Watermark.USRRetrainingWatermark * refclk_freq_in_mhz); @@ -12321,14 +12758,18 @@ void dml2_core_calcs_get_global_fams2_programming(const struct dml2_core_interna void dml2_core_calcs_get_stream_fams2_programming(const struct dml2_core_internal_display_mode_lib *mode_lib, const struct display_configuation_with_meta *display_cfg, - struct dmub_fams2_stream_static_state *fams2_programming, - enum dml2_uclk_pstate_support_method pstate_method, + union dmub_cmd_fams2_config *fams2_base_programming, + union dmub_cmd_fams2_config *fams2_sub_programming, + enum dml2_pstate_method pstate_method, int plane_index) { const struct dml2_plane_parameters *plane_descriptor = &display_cfg->display_config.plane_descriptors[plane_index]; const struct dml2_stream_parameters *stream_descriptor = &display_cfg->display_config.stream_descriptors[plane_descriptor->stream_index]; const struct dml2_fams2_meta *stream_fams2_meta = &display_cfg->stage3.stream_fams2_meta[plane_descriptor->stream_index]; + struct dmub_fams2_cmd_stream_static_base_state *base_programming = &fams2_base_programming->stream_v1.base; + union dmub_fams2_cmd_stream_static_sub_state *sub_programming = &fams2_sub_programming->stream_v1.sub_state; + unsigned int i; if (display_cfg->display_config.overrides.all_streams_blanked) { @@ -12337,110 +12778,110 @@ void dml2_core_calcs_get_stream_fams2_programming(const struct dml2_core_interna } /* from display configuration */ - fams2_programming->htotal = (uint16_t)stream_descriptor->timing.h_total; - fams2_programming->vtotal = (uint16_t)stream_descriptor->timing.v_total; - fams2_programming->vblank_start = (uint16_t)(stream_fams2_meta->nom_vtotal - + base_programming->htotal = (uint16_t)stream_descriptor->timing.h_total; + base_programming->vtotal = (uint16_t)stream_descriptor->timing.v_total; + base_programming->vblank_start = 
(uint16_t)(stream_fams2_meta->nom_vtotal - stream_descriptor->timing.v_front_porch); - fams2_programming->vblank_end = (uint16_t)(stream_fams2_meta->nom_vtotal - + base_programming->vblank_end = (uint16_t)(stream_fams2_meta->nom_vtotal - stream_descriptor->timing.v_front_porch - stream_descriptor->timing.v_active); - fams2_programming->config.bits.is_drr = stream_descriptor->timing.drr_config.enabled; + base_programming->config.bits.is_drr = stream_descriptor->timing.drr_config.enabled; /* from meta */ - fams2_programming->otg_vline_time_ns = + base_programming->otg_vline_time_ns = (unsigned int)(stream_fams2_meta->otg_vline_time_us * 1000.0); - fams2_programming->scheduling_delay_otg_vlines = (uint8_t)stream_fams2_meta->scheduling_delay_otg_vlines; - fams2_programming->contention_delay_otg_vlines = (uint8_t)stream_fams2_meta->contention_delay_otg_vlines; - fams2_programming->vline_int_ack_delay_otg_vlines = (uint8_t)stream_fams2_meta->vertical_interrupt_ack_delay_otg_vlines; - fams2_programming->drr_keepout_otg_vline = (uint16_t)(stream_fams2_meta->nom_vtotal - + base_programming->scheduling_delay_otg_vlines = (uint8_t)stream_fams2_meta->scheduling_delay_otg_vlines; + base_programming->contention_delay_otg_vlines = (uint8_t)stream_fams2_meta->contention_delay_otg_vlines; + base_programming->vline_int_ack_delay_otg_vlines = (uint8_t)stream_fams2_meta->vertical_interrupt_ack_delay_otg_vlines; + base_programming->drr_keepout_otg_vline = (uint16_t)(stream_fams2_meta->nom_vtotal - stream_descriptor->timing.v_front_porch - stream_fams2_meta->method_drr.programming_delay_otg_vlines); - fams2_programming->allow_to_target_delay_otg_vlines = (uint8_t)stream_fams2_meta->allow_to_target_delay_otg_vlines; - fams2_programming->max_vtotal = (uint16_t)stream_fams2_meta->max_vtotal; + base_programming->allow_to_target_delay_otg_vlines = (uint8_t)stream_fams2_meta->allow_to_target_delay_otg_vlines; + base_programming->max_vtotal = (uint16_t)stream_fams2_meta->max_vtotal; /* from 
core */ - fams2_programming->config.bits.min_ttu_vblank_usable = true; + base_programming->config.bits.min_ttu_vblank_usable = true; for (i = 0; i < display_cfg->display_config.num_planes; i++) { /* check if all planes support p-state in blank */ if (display_cfg->display_config.plane_descriptors[i].stream_index == plane_descriptor->stream_index && mode_lib->mp.MinTTUVBlank[i] <= mode_lib->mp.Watermark.DRAMClockChangeWatermark) { - fams2_programming->config.bits.min_ttu_vblank_usable = false; + base_programming->config.bits.min_ttu_vblank_usable = false; break; } } switch (pstate_method) { - case dml2_uclk_pstate_support_method_vactive: - case dml2_uclk_pstate_support_method_fw_vactive_drr: + case dml2_pstate_method_vactive: + case dml2_pstate_method_fw_vactive_drr: /* legacy vactive */ - fams2_programming->type = FAMS2_STREAM_TYPE_VACTIVE; - fams2_programming->sub_state.legacy.vactive_det_fill_delay_otg_vlines = - (uint8_t)stream_fams2_meta->method_vactive.max_vactive_det_fill_delay_otg_vlines; - fams2_programming->allow_start_otg_vline = - (uint16_t)stream_fams2_meta->method_vactive.common.allow_start_otg_vline; - fams2_programming->allow_end_otg_vline = - (uint16_t)stream_fams2_meta->method_vactive.common.allow_end_otg_vline; - fams2_programming->config.bits.clamp_vtotal_min = true; + base_programming->type = FAMS2_STREAM_TYPE_VACTIVE; + sub_programming->legacy.vactive_det_fill_delay_otg_vlines = + (uint8_t)stream_fams2_meta->method_vactive.max_vactive_det_fill_delay_otg_vlines; + base_programming->allow_start_otg_vline = + (uint16_t)stream_fams2_meta->method_vactive.common.allow_start_otg_vline; + base_programming->allow_end_otg_vline = + (uint16_t)stream_fams2_meta->method_vactive.common.allow_end_otg_vline; + base_programming->config.bits.clamp_vtotal_min = true; break; - case dml2_uclk_pstate_support_method_vblank: - case dml2_uclk_pstate_support_method_fw_vblank_drr: + case dml2_pstate_method_vblank: + case dml2_pstate_method_fw_vblank_drr: /* legacy vblank 
*/ - fams2_programming->type = FAMS2_STREAM_TYPE_VBLANK; - fams2_programming->allow_start_otg_vline = - (uint16_t)stream_fams2_meta->method_vblank.common.allow_start_otg_vline; - fams2_programming->allow_end_otg_vline = - (uint16_t)stream_fams2_meta->method_vblank.common.allow_end_otg_vline; - fams2_programming->config.bits.clamp_vtotal_min = true; + base_programming->type = FAMS2_STREAM_TYPE_VBLANK; + base_programming->allow_start_otg_vline = + (uint16_t)stream_fams2_meta->method_vblank.common.allow_start_otg_vline; + base_programming->allow_end_otg_vline = + (uint16_t)stream_fams2_meta->method_vblank.common.allow_end_otg_vline; + base_programming->config.bits.clamp_vtotal_min = true; break; - case dml2_uclk_pstate_support_method_fw_drr: + case dml2_pstate_method_fw_drr: /* drr */ - fams2_programming->type = FAMS2_STREAM_TYPE_DRR; - fams2_programming->sub_state.drr.programming_delay_otg_vlines = - (uint8_t)stream_fams2_meta->method_drr.programming_delay_otg_vlines; - fams2_programming->sub_state.drr.nom_stretched_vtotal = - (uint16_t)stream_fams2_meta->method_drr.stretched_vtotal; - fams2_programming->allow_start_otg_vline = - (uint16_t)stream_fams2_meta->method_drr.common.allow_start_otg_vline; - fams2_programming->allow_end_otg_vline = - (uint16_t)stream_fams2_meta->method_drr.common.allow_end_otg_vline; + base_programming->type = FAMS2_STREAM_TYPE_DRR; + sub_programming->drr.programming_delay_otg_vlines = + (uint8_t)stream_fams2_meta->method_drr.programming_delay_otg_vlines; + sub_programming->drr.nom_stretched_vtotal = + (uint16_t)stream_fams2_meta->method_drr.stretched_vtotal; + base_programming->allow_start_otg_vline = + (uint16_t)stream_fams2_meta->method_drr.common.allow_start_otg_vline; + base_programming->allow_end_otg_vline = + (uint16_t)stream_fams2_meta->method_drr.common.allow_end_otg_vline; /* drr only clamps to vtotal min for single display */ - fams2_programming->config.bits.clamp_vtotal_min = display_cfg->display_config.num_streams == 1; - 
fams2_programming->sub_state.drr.only_stretch_if_required = true; + base_programming->config.bits.clamp_vtotal_min = display_cfg->display_config.num_streams == 1; + sub_programming->drr.only_stretch_if_required = true; break; - case dml2_uclk_pstate_support_method_fw_subvp_phantom: - case dml2_uclk_pstate_support_method_fw_subvp_phantom_drr: + case dml2_pstate_method_fw_svp: + case dml2_pstate_method_fw_svp_drr: /* subvp */ - fams2_programming->type = FAMS2_STREAM_TYPE_SUBVP; - fams2_programming->sub_state.subvp.vratio_numerator = - (uint16_t)(plane_descriptor->composition.scaler_info.plane0.v_ratio * 1000.0); - fams2_programming->sub_state.subvp.vratio_denominator = 1000; - fams2_programming->sub_state.subvp.programming_delay_otg_vlines = - (uint8_t)stream_fams2_meta->method_subvp.programming_delay_otg_vlines; - fams2_programming->sub_state.subvp.prefetch_to_mall_otg_vlines = - (uint8_t)stream_fams2_meta->method_subvp.prefetch_to_mall_delay_otg_vlines; - fams2_programming->sub_state.subvp.phantom_vtotal = - (uint16_t)stream_fams2_meta->method_subvp.phantom_vtotal; - fams2_programming->sub_state.subvp.phantom_vactive = - (uint16_t)stream_fams2_meta->method_subvp.phantom_vactive; - fams2_programming->sub_state.subvp.config.bits.is_multi_planar = - plane_descriptor->surface.plane1.height > 0; - fams2_programming->sub_state.subvp.config.bits.is_yuv420 = - plane_descriptor->pixel_format == dml2_420_8 || - plane_descriptor->pixel_format == dml2_420_10 || - plane_descriptor->pixel_format == dml2_420_12; + base_programming->type = FAMS2_STREAM_TYPE_SUBVP; + sub_programming->subvp.vratio_numerator = + (uint16_t)(plane_descriptor->composition.scaler_info.plane0.v_ratio * 1000.0); + sub_programming->subvp.vratio_denominator = 1000; + sub_programming->subvp.programming_delay_otg_vlines = + (uint8_t)stream_fams2_meta->method_subvp.programming_delay_otg_vlines; + sub_programming->subvp.prefetch_to_mall_otg_vlines = + 
(uint8_t)stream_fams2_meta->method_subvp.prefetch_to_mall_delay_otg_vlines; + sub_programming->subvp.phantom_vtotal = + (uint16_t)stream_fams2_meta->method_subvp.phantom_vtotal; + sub_programming->subvp.phantom_vactive = + (uint16_t)stream_fams2_meta->method_subvp.phantom_vactive; + sub_programming->subvp.config.bits.is_multi_planar = + plane_descriptor->surface.plane1.height > 0; + sub_programming->subvp.config.bits.is_yuv420 = + plane_descriptor->pixel_format == dml2_420_8 || + plane_descriptor->pixel_format == dml2_420_10 || + plane_descriptor->pixel_format == dml2_420_12; - fams2_programming->allow_start_otg_vline = - (uint16_t)stream_fams2_meta->method_subvp.common.allow_start_otg_vline; - fams2_programming->allow_end_otg_vline = - (uint16_t)stream_fams2_meta->method_subvp.common.allow_end_otg_vline; - fams2_programming->config.bits.clamp_vtotal_min = true; + base_programming->allow_start_otg_vline = + (uint16_t)stream_fams2_meta->method_subvp.common.allow_start_otg_vline; + base_programming->allow_end_otg_vline = + (uint16_t)stream_fams2_meta->method_subvp.common.allow_end_otg_vline; + base_programming->config.bits.clamp_vtotal_min = true; break; - case dml2_uclk_pstate_support_method_reserved_hw: - case dml2_uclk_pstate_support_method_reserved_fw: - case dml2_uclk_pstate_support_method_reserved_fw_drr_fixed: - case dml2_uclk_pstate_support_method_reserved_fw_drr_var: - case dml2_uclk_pstate_support_method_not_supported: - case dml2_uclk_pstate_support_method_count: + case dml2_pstate_method_reserved_hw: + case dml2_pstate_method_reserved_fw: + case dml2_pstate_method_reserved_fw_drr_clamped: + case dml2_pstate_method_reserved_fw_drr_var: + case dml2_pstate_method_na: + case dml2_pstate_method_count: default: /* this should never happen */ break; @@ -12569,6 +13010,8 @@ void dml2_core_calcs_get_informative(const struct dml2_core_internal_display_mod out->informative.mode_support_info.InvalidCombinationOfMALLUseForPState = 
mode_lib->ms.support.InvalidCombinationOfMALLUseForPState; out->informative.mode_support_info.ExceededMALLSize = mode_lib->ms.support.ExceededMALLSize; out->informative.mode_support_info.EnoughWritebackUnits = mode_lib->ms.support.EnoughWritebackUnits; + out->informative.mode_support_info.temp_read_or_ppt_support = mode_lib->ms.support.temp_read_or_ppt_support; + out->informative.mode_support_info.g6_temp_read_support = mode_lib->ms.support.g6_temp_read_support; out->informative.mode_support_info.ExceededMultistreamSlots = mode_lib->ms.support.ExceededMultistreamSlots; out->informative.mode_support_info.NotEnoughDSCUnits = mode_lib->ms.support.NotEnoughDSCUnits; @@ -12662,7 +13105,7 @@ void dml2_core_calcs_get_informative(const struct dml2_core_internal_display_mod out->informative.watermarks.pstate_change_us = dml_get_wm_dram_clock_change(mode_lib); out->informative.watermarks.fclk_pstate_change_us = dml_get_wm_fclk_change(mode_lib); out->informative.watermarks.usr_retraining_us = dml_get_wm_usr_retraining(mode_lib); - out->informative.watermarks.g6_temp_read_watermark_us = dml_get_wm_g6_temp_read(mode_lib); + out->informative.watermarks.temp_read_or_ppt_watermark_us = dml_get_wm_temp_read_or_ppt(mode_lib); out->informative.mall.total_surface_size_in_mall_bytes = 0; for (k = 0; k < out->display_config.num_planes; ++k) @@ -12745,6 +13188,8 @@ void dml2_core_calcs_get_informative(const struct dml2_core_internal_display_mod out->informative.qos.max_active_fclk_change_latency_supported = dml_get_fclk_change_latency(mode_lib); + out->informative.misc.LowestPrefetchMargin = 10 * 1000 * 1000; + for (k = 0; k < out->display_config.num_planes; k++) { if ((out->display_config.plane_descriptors->overrides.reserved_vblank_time_ns >= 1000.0 * mode_lib->soc.power_management_parameters.dram_clk_change_blackout_us) @@ -12824,6 +13269,7 @@ void dml2_core_calcs_get_informative(const struct dml2_core_internal_display_mod 
out->informative.misc.DisplayPipeLineDeliveryTimeLumaPrefetch[k] = mode_lib->mp.DisplayPipeLineDeliveryTimeLumaPrefetch[k]; out->informative.misc.DisplayPipeLineDeliveryTimeChromaPrefetch[k] = mode_lib->mp.DisplayPipeLineDeliveryTimeChromaPrefetch[k]; + out->informative.misc.WritebackRequiredBandwidth = mode_lib->scratch.dml_core_mode_programming_locals.TotalWRBandwidth / 1000.0; out->informative.misc.WritebackAllowDRAMClockChangeEndPosition[k] = mode_lib->mp.WritebackAllowDRAMClockChangeEndPosition[k]; out->informative.misc.WritebackAllowFCLKChangeEndPosition[k] = mode_lib->mp.WritebackAllowFCLKChangeEndPosition[k]; out->informative.misc.DSCCLK_calculated[k] = mode_lib->mp.DSCCLK[k]; @@ -12831,6 +13277,9 @@ void dml2_core_calcs_get_informative(const struct dml2_core_internal_display_mod out->informative.misc.PTE_BUFFER_MODE[k] = mode_lib->mp.PTE_BUFFER_MODE[k]; out->informative.misc.DSCDelay[k] = mode_lib->mp.DSCDelay[k]; out->informative.misc.MaxActiveDRAMClockChangeLatencySupported[k] = mode_lib->mp.MaxActiveDRAMClockChangeLatencySupported[k]; + + if (mode_lib->mp.impacted_prefetch_margin_us[k] < out->informative.misc.LowestPrefetchMargin) + out->informative.misc.LowestPrefetchMargin = mode_lib->mp.impacted_prefetch_margin_us[k]; } // For this DV informative layer, all pipes in the same planes will just use the same id @@ -12853,16 +13302,11 @@ void dml2_core_calcs_get_informative(const struct dml2_core_internal_display_mod out->informative.non_optimized_mcache_allocation[k].global_mcache_ids_plane1[n] = k; } } - - out->informative.qos.max_non_urgent_latency_us = mode_lib->soc.qos_parameters.qos_params.dcn4x.per_uclk_dpm_params[mode_lib->mp.qos_param_index].maximum_latency_when_non_urgent_uclk_cycles - / mode_lib->mp.uclk_freq_mhz * (1 + mode_lib->soc.qos_parameters.qos_params.dcn4x.umc_max_latency_margin / 100.0) - + mode_lib->soc.qos_parameters.qos_params.dcn4x.mall_overhead_fclk_cycles / mode_lib->mp.FabricClock - + 
mode_lib->soc.qos_parameters.qos_params.dcn4x.max_round_trip_to_furthest_cs_fclk_cycles / mode_lib->mp.FabricClock - * (1 + mode_lib->soc.qos_parameters.qos_params.dcn4x.fabric_max_transport_latency_margin / 100.0); + out->informative.qos.max_non_urgent_latency_us = dml_get_max_non_urgent_latency_us(mode_lib); if (mode_lib->soc.qos_parameters.qos_type == dml2_qos_param_type_dcn4x) { if (((mode_lib->ip.rob_buffer_size_kbytes - mode_lib->ip.pixel_chunk_size_kbytes) * 1024 - / mode_lib->mp.non_urg_bandwidth_required[dml2_core_internal_soc_state_sys_active][dml2_core_internal_bw_sdp]) >= out->informative.qos.max_non_urgent_latency_us) { + / mode_lib->ms.support.non_urg_bandwidth_required[dml2_core_internal_soc_state_sys_active][dml2_core_internal_bw_sdp]) >= out->informative.qos.max_non_urgent_latency_us) { out->informative.misc.ROBUrgencyAvoidance = true; } else { out->informative.misc.ROBUrgencyAvoidance = false; diff --git a/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_core/dml2_core_dcn4_calcs.h b/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_core/dml2_core_dcn4_calcs.h index df2d1550a14b..27ef0e096b25 100644 --- a/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_core/dml2_core_dcn4_calcs.h +++ b/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_core/dml2_core_dcn4_calcs.h @@ -28,7 +28,7 @@ void dml2_core_calcs_get_plane_support_info(const struct dml2_display_cfg *displ void dml2_core_calcs_get_informative(const struct dml2_core_internal_display_mode_lib *mode_lib, struct dml2_display_cfg_programming *out); void dml2_core_calcs_get_stream_support_info(const struct dml2_display_cfg *display_cfg, const struct dml2_core_internal_display_mode_lib *mode_lib, struct core_stream_support_info *out, int plane_index); void dml2_core_calcs_get_mall_allocation(struct dml2_core_internal_display_mode_lib *mode_lib, unsigned int *out, int pipe_index); -void dml2_core_calcs_get_stream_fams2_programming(const struct dml2_core_internal_display_mode_lib *mode_lib, const 
struct display_configuation_with_meta *display_cfg, struct dmub_fams2_stream_static_state *fams2_programming, enum dml2_uclk_pstate_support_method pstate_method, int plane_index); +void dml2_core_calcs_get_stream_fams2_programming(const struct dml2_core_internal_display_mode_lib *mode_lib, const struct display_configuation_with_meta *display_cfg, union dmub_cmd_fams2_config *fams2_base_programming, union dmub_cmd_fams2_config *fams2_sub_programming, enum dml2_pstate_method pstate_method, int plane_index); void dml2_core_calcs_get_global_fams2_programming(const struct dml2_core_internal_display_mode_lib *mode_lib, const struct display_configuation_with_meta *display_cfg, struct dmub_cmd_fams2_global_config *fams2_global_config); void dml2_core_calcs_get_dpte_row_height(unsigned int *dpte_row_height, struct dml2_core_internal_display_mode_lib *mode_lib, bool is_plane1, enum dml2_source_format_class SourcePixelFormat, enum dml2_swizzle_mode SurfaceTiling, enum dml2_rotation_angle ScanDirection, unsigned int pitch, unsigned int GPUVMMinPageSizeKBytes); diff --git a/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_core/dml2_core_shared_types.h b/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_core/dml2_core_shared_types.h index cbdfbd5a0bde..23c0fca5515f 100644 --- a/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_core/dml2_core_shared_types.h +++ b/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_core/dml2_core_shared_types.h @@ -201,7 +201,7 @@ struct dml2_core_internal_watermarks { double Z8StutterExitWatermark; double Z8StutterEnterPlusExitWatermark; double USRRetrainingWatermark; - double g6_temp_read_watermark_us; + double temp_read_or_ppt_watermark_us; }; struct dml2_core_internal_mode_support_info { @@ -252,8 +252,8 @@ struct dml2_core_internal_mode_support_info { bool PTEBufferSizeNotExceeded; bool DCCMetaBufferSizeNotExceeded; - enum dml2_dram_clock_change_support DRAMClockChangeSupport[DML2_MAX_PLANES]; - enum dml2_fclock_change_support 
FCLKChangeSupport[DML2_MAX_PLANES]; + enum dml2_pstate_change_support DRAMClockChangeSupport[DML2_MAX_PLANES]; + enum dml2_pstate_change_support FCLKChangeSupport[DML2_MAX_PLANES]; bool global_dram_clock_change_supported; bool global_fclk_change_supported; bool USRRetrainingSupport; @@ -318,12 +318,15 @@ struct dml2_core_internal_mode_support_info { bool avg_bandwidth_support_ok[dml2_core_internal_soc_state_max][dml2_core_internal_bw_max]; double max_urgent_latency_us; + double max_non_urgent_latency_us; double avg_non_urgent_latency_us; double avg_urgent_latency_us; + double df_response_time_us; bool incorrect_imall_usage; bool g6_temp_read_support; + bool temp_read_or_ppt_support; struct dml2_core_internal_watermarks watermarks; }; @@ -378,8 +381,8 @@ struct dml2_core_internal_mode_support { unsigned int DETBufferSizeC[DML2_MAX_PLANES]; unsigned int SwathHeightY[DML2_MAX_PLANES]; unsigned int SwathHeightC[DML2_MAX_PLANES]; - unsigned int SwathWidthY[DML2_MAX_PLANES]; - unsigned int SwathWidthC[DML2_MAX_PLANES]; + unsigned int SwathWidthY[DML2_MAX_PLANES]; // per-pipe + unsigned int SwathWidthC[DML2_MAX_PLANES]; // per-pipe // ---------------------------------- // Intermediates/Informational @@ -476,9 +479,9 @@ struct dml2_core_internal_mode_support { // Bandwidth Related Info double BandwidthAvailableForImmediateFlip; - double SurfaceReadBandwidthLuma[DML2_MAX_PLANES]; // no dcc overhead, for the plane - double SurfaceReadBandwidthChroma[DML2_MAX_PLANES]; - double WriteBandwidth[DML2_MAX_PLANES]; + double vactive_sw_bw_l[DML2_MAX_PLANES]; // no dcc overhead, for the plane + double vactive_sw_bw_c[DML2_MAX_PLANES]; + double WriteBandwidth[DML2_MAX_PLANES][DML2_MAX_WRITEBACK]; double RequiredPrefetchPixelDataBWLuma[DML2_MAX_PLANES]; double RequiredPrefetchPixelDataBWChroma[DML2_MAX_PLANES]; double cursor_bw[DML2_MAX_PLANES]; @@ -539,7 +542,7 @@ struct dml2_core_internal_mode_program { unsigned int qos_param_index; // to access the uclk dependent dpm table unsigned 
int active_min_uclk_dpm_index; // to access the min_clk table double FabricClock; /// DynamicMetadataSupported); if (!fail_only || support->VRatioInPrefetchSupported == 0) dml2_printf("DML: support: VRatioInPrefetchSupported = %d\n", support->VRatioInPrefetchSupported); - if (!fail_only || support->PTEBufferSizeNotExceeded == 1) + if (!fail_only || support->PTEBufferSizeNotExceeded == 0) dml2_printf("DML: support: PTEBufferSizeNotExceeded = %d\n", support->PTEBufferSizeNotExceeded); - if (!fail_only || support->DCCMetaBufferSizeNotExceeded == 1) + if (!fail_only || support->DCCMetaBufferSizeNotExceeded == 0) dml2_printf("DML: support: DCCMetaBufferSizeNotExceeded = %d\n", support->DCCMetaBufferSizeNotExceeded); if (!fail_only || support->ExceededMALLSize == 1) dml2_printf("DML: support: ExceededMALLSize = %d\n", support->ExceededMALLSize); @@ -280,39 +424,49 @@ bool dml2_core_utils_is_phantom_pipe(const struct dml2_plane_parameters *plane_c return is_phantom; } -unsigned int dml2_core_utils_get_tile_block_size_bytes(enum dml2_swizzle_mode sw_mode) +unsigned int dml2_core_utils_get_tile_block_size_bytes(enum dml2_swizzle_mode sw_mode, unsigned int byte_per_pixel) { - switch (sw_mode) { - case (dml2_sw_linear): - return 256; break; - case (dml2_sw_256b_2d): - return 256; break; - case (dml2_sw_4kb_2d): - return 4096; break; - case (dml2_sw_64kb_2d): - return 65536; break; - case (dml2_sw_256kb_2d): - return 262144; break; - case (dml2_gfx11_sw_linear): - return 256; break; - case (dml2_gfx11_sw_64kb_d): - return 65536; break; - case (dml2_gfx11_sw_64kb_d_t): - return 65536; break; - case (dml2_gfx11_sw_64kb_d_x): - return 65536; break; - case (dml2_gfx11_sw_64kb_r_x): - return 65536; break; - case (dml2_gfx11_sw_256kb_d_x): - return 262144; break; - case (dml2_gfx11_sw_256kb_r_x): - return 262144; break; - default: + + if (sw_mode == dml2_sw_linear) + return 256; + else if (sw_mode == dml2_sw_256b_2d) + return 256; + else if (sw_mode == dml2_sw_4kb_2d) + return 4096; 
+ else if (sw_mode == dml2_sw_64kb_2d) + return 65536; + else if (sw_mode == dml2_sw_256kb_2d) + return 262144; + else if (sw_mode == dml2_gfx11_sw_linear) + return 256; + else if (sw_mode == dml2_gfx11_sw_64kb_d) + return 65536; + else if (sw_mode == dml2_gfx11_sw_64kb_d_t) + return 65536; + else if (sw_mode == dml2_gfx11_sw_64kb_d_x) + return 65536; + else if (sw_mode == dml2_gfx11_sw_64kb_r_x) + return 65536; + else if (sw_mode == dml2_gfx11_sw_256kb_d_x) + return 262144; + else if (sw_mode == dml2_gfx11_sw_256kb_r_x) + return 262144; + else { DML2_ASSERT(0); return 256; }; } +bool dml2_core_utils_get_segment_horizontal_contiguous(enum dml2_swizzle_mode sw_mode, unsigned int byte_per_pixel) +{ + return (byte_per_pixel != 2); +} + +bool dml2_core_utils_is_linear(enum dml2_swizzle_mode sw_mode) +{ + return (sw_mode == dml2_sw_linear || sw_mode == dml2_sw_linear_256b || sw_mode == dml2_linear_64elements); +}; + bool dml2_core_utils_is_vertical_rotation(enum dml2_rotation_angle Scan) { @@ -325,7 +479,6 @@ bool dml2_core_utils_is_vertical_rotation(enum dml2_rotation_angle Scan) return is_vert; } - int unsigned dml2_core_utils_get_gfx_version(enum dml2_swizzle_mode sw_mode) { int unsigned version = 0; @@ -334,17 +487,17 @@ int unsigned dml2_core_utils_get_gfx_version(enum dml2_swizzle_mode sw_mode) sw_mode == dml2_sw_256b_2d || sw_mode == dml2_sw_4kb_2d || sw_mode == dml2_sw_64kb_2d || - sw_mode == dml2_sw_256kb_2d) { + sw_mode == dml2_sw_256kb_2d) version = 12; - } else if (sw_mode == dml2_gfx11_sw_linear || + else if (sw_mode == dml2_gfx11_sw_linear || sw_mode == dml2_gfx11_sw_64kb_d || sw_mode == dml2_gfx11_sw_64kb_d_t || sw_mode == dml2_gfx11_sw_64kb_d_x || sw_mode == dml2_gfx11_sw_64kb_r_x || sw_mode == dml2_gfx11_sw_256kb_d_x || - sw_mode == dml2_gfx11_sw_256kb_r_x) { + sw_mode == dml2_gfx11_sw_256kb_r_x) version = 11; - } else { + else { dml2_printf("ERROR: Invalid sw_mode setting! 
val=%u\n", sw_mode); DML2_ASSERT(0); } @@ -403,7 +556,7 @@ bool dml2_core_utils_is_dual_plane(enum dml2_source_format_class source_format) { bool ret_val = 0; - if ((source_format == dml2_420_12) || (source_format == dml2_420_8) || (source_format == dml2_420_10) || (source_format == dml2_rgbe_alpha)) + if (dml2_core_utils_is_420(source_format) || dml2_core_utils_is_422_planar(source_format) || (source_format == dml2_rgbe_alpha)) ret_val = 1; return ret_val; diff --git a/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_core/dml2_core_utils.h b/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_core/dml2_core_utils.h index a5cc6a07167a..95f0d017add4 100644 --- a/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_core/dml2_core_utils.h +++ b/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_core/dml2_core_utils.h @@ -11,6 +11,8 @@ double dml2_core_utils_div_rem(double dividend, unsigned int divisor, unsigned int *remainder); const char *dml2_core_utils_internal_bw_type_str(enum dml2_core_internal_bw_type bw_type); bool dml2_core_utils_is_420(enum dml2_source_format_class source_format); +bool dml2_core_utils_is_422_planar(enum dml2_source_format_class source_format); +bool dml2_core_utils_is_422_packed(enum dml2_source_format_class source_format); void dml2_core_utils_print_mode_support_info(const struct dml2_core_internal_mode_support_info *support, bool fail_only); const char *dml2_core_utils_internal_soc_state_type_str(enum dml2_core_internal_soc_state_type dml2_core_internal_soc_state_type); void dml2_core_utils_get_stream_output_bpp(double *out_bpp, const struct dml2_display_cfg *display_cfg); @@ -18,8 +20,10 @@ unsigned int dml2_core_utils_round_to_multiple(unsigned int num, unsigned int mu unsigned int dml2_core_util_get_num_active_pipes(int unsigned num_planes, const struct core_display_cfg_support_info *cfg_support_info); void dml2_core_utils_pipe_plane_mapping(const struct core_display_cfg_support_info *cfg_support_info, unsigned int *pipe_plane); bool 
dml2_core_utils_is_phantom_pipe(const struct dml2_plane_parameters *plane_cfg); -unsigned int dml2_core_utils_get_tile_block_size_bytes(enum dml2_swizzle_mode sw_mode); +unsigned int dml2_core_utils_get_tile_block_size_bytes(enum dml2_swizzle_mode sw_mode, unsigned int byte_per_pixel); +bool dml2_core_utils_get_segment_horizontal_contiguous(enum dml2_swizzle_mode sw_mode, unsigned int byte_per_pixel); bool dml2_core_utils_is_vertical_rotation(enum dml2_rotation_angle Scan); +bool dml2_core_utils_is_linear(enum dml2_swizzle_mode sw_mode); int unsigned dml2_core_utils_get_gfx_version(enum dml2_swizzle_mode sw_mode); unsigned int dml2_core_utils_get_qos_param_index(unsigned long uclk_freq_khz, const struct dml2_dcn4_uclk_dpm_dependent_qos_params *per_uclk_dpm_params); unsigned int dml2_core_utils_get_active_min_uclk_dpm_index(unsigned long uclk_freq_khz, const struct dml2_soc_state_table *clk_table); diff --git a/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_dpmm/dml2_dpmm_dcn4.c b/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_dpmm/dml2_dpmm_dcn4.c index 8869ea089312..fc77fb34a19a 100644 --- a/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_dpmm/dml2_dpmm_dcn4.c +++ b/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_dpmm/dml2_dpmm_dcn4.c @@ -96,6 +96,7 @@ static void calculate_svp_prefetch_minimums(struct dml2_dpmm_map_mode_to_soc_dpm double min_uclk_latency; const struct dml2_core_mode_support_result *mode_support_result = &in_out->display_cfg->mode_support_result; + /* assumes DF throttling is enabled */ min_uclk_avg = dram_bw_kbps_to_uclk_khz(mode_support_result->global.svp_prefetch.average_bw_dram_kbps, &in_out->soc_bb->clk_table.dram_config); min_uclk_avg = (double)min_uclk_avg / ((double)in_out->soc_bb->qos_parameters.derate_table.dcn_mall_prefetch_average.dram_derate_percent_pixel / 100); @@ -125,6 +126,37 @@ static void calculate_svp_prefetch_minimums(struct dml2_dpmm_map_mode_to_soc_dpm 
in_out->programming->min_clocks.dcn4x.svp_prefetch.uclk_khz = dml_round_up(min_uclk_bw > min_uclk_latency ? min_uclk_bw : min_uclk_latency); in_out->programming->min_clocks.dcn4x.svp_prefetch.fclk_khz = dml_round_up(min_fclk_bw > min_fclk_latency ? min_fclk_bw : min_fclk_latency); in_out->programming->min_clocks.dcn4x.svp_prefetch.dcfclk_khz = dml_round_up(min_dcfclk_bw > min_dcfclk_latency ? min_dcfclk_bw : min_dcfclk_latency); + + /* assumes DF throttling is disabled */ + min_uclk_avg = dram_bw_kbps_to_uclk_khz(mode_support_result->global.svp_prefetch.average_bw_dram_kbps, &in_out->soc_bb->clk_table.dram_config); + min_uclk_avg = (double)min_uclk_avg / ((double)in_out->soc_bb->qos_parameters.derate_table.system_active_average.dram_derate_percent_pixel / 100); + + min_uclk_urgent = dram_bw_kbps_to_uclk_khz(mode_support_result->global.svp_prefetch.urgent_bw_dram_kbps, &in_out->soc_bb->clk_table.dram_config); + min_uclk_urgent = (double)min_uclk_urgent / ((double)in_out->soc_bb->qos_parameters.derate_table.system_active_urgent.dram_derate_percent_pixel / 100); + + min_uclk_bw = min_uclk_urgent > min_uclk_avg ? min_uclk_urgent : min_uclk_avg; + + min_fclk_avg = (double)mode_support_result->global.svp_prefetch.average_bw_sdp_kbps / in_out->soc_bb->fabric_datapath_to_dcn_data_return_bytes; + min_fclk_avg = (double)min_fclk_avg / ((double)in_out->soc_bb->qos_parameters.derate_table.system_active_average.fclk_derate_percent / 100); + + min_fclk_urgent = (double)mode_support_result->global.svp_prefetch.urgent_bw_sdp_kbps / in_out->soc_bb->fabric_datapath_to_dcn_data_return_bytes; + min_fclk_urgent = (double)min_fclk_urgent / ((double)in_out->soc_bb->qos_parameters.derate_table.system_active_urgent.fclk_derate_percent / 100); + + min_fclk_bw = min_fclk_urgent > min_fclk_avg ? 
min_fclk_urgent : min_fclk_avg; + + min_dcfclk_avg = (double)mode_support_result->global.svp_prefetch.average_bw_sdp_kbps / in_out->soc_bb->return_bus_width_bytes; + min_dcfclk_avg = (double)min_dcfclk_avg / ((double)in_out->soc_bb->qos_parameters.derate_table.system_active_average.dcfclk_derate_percent / 100); + + min_dcfclk_urgent = (double)mode_support_result->global.svp_prefetch.urgent_bw_sdp_kbps / in_out->soc_bb->return_bus_width_bytes; + min_dcfclk_urgent = (double)min_dcfclk_urgent / ((double)in_out->soc_bb->qos_parameters.derate_table.system_active_urgent.dcfclk_derate_percent / 100); + + min_dcfclk_bw = min_dcfclk_urgent > min_dcfclk_avg ? min_dcfclk_urgent : min_dcfclk_avg; + + get_minimum_clocks_for_latency(in_out, &min_uclk_latency, &min_fclk_latency, &min_dcfclk_latency); + + in_out->programming->min_clocks.dcn4x.svp_prefetch_no_throttle.uclk_khz = dml_round_up(min_uclk_bw > min_uclk_latency ? min_uclk_bw : min_uclk_latency); + in_out->programming->min_clocks.dcn4x.svp_prefetch_no_throttle.fclk_khz = dml_round_up(min_fclk_bw > min_fclk_latency ? min_fclk_bw : min_fclk_latency); + in_out->programming->min_clocks.dcn4x.svp_prefetch_no_throttle.dcfclk_khz = dml_round_up(min_dcfclk_bw > min_dcfclk_latency ? 
min_dcfclk_bw : min_dcfclk_latency); } static void calculate_idle_minimums(struct dml2_dpmm_map_mode_to_soc_dpm_params_in_out *in_out) @@ -272,6 +304,17 @@ static bool map_soc_min_clocks_to_dpm_fine_grained(struct dml2_display_cfg_progr if (result) result = round_up_to_next_dpm(&display_cfg->min_clocks.dcn4x.idle.uclk_khz, &state_table->uclk); + /* these clocks are optional, so they can fail to map, in which case map all to 0 */ + if (result) { + if (!round_up_to_next_dpm(&display_cfg->min_clocks.dcn4x.svp_prefetch_no_throttle.dcfclk_khz, &state_table->dcfclk) || + !round_up_to_next_dpm(&display_cfg->min_clocks.dcn4x.svp_prefetch_no_throttle.fclk_khz, &state_table->fclk) || + !round_up_to_next_dpm(&display_cfg->min_clocks.dcn4x.svp_prefetch_no_throttle.uclk_khz, &state_table->uclk)) { + display_cfg->min_clocks.dcn4x.svp_prefetch_no_throttle.dcfclk_khz = 0; + display_cfg->min_clocks.dcn4x.svp_prefetch_no_throttle.fclk_khz = 0; + display_cfg->min_clocks.dcn4x.svp_prefetch_no_throttle.uclk_khz = 0; + } + } + return result; } @@ -374,11 +417,11 @@ static bool map_min_clocks_to_dpm(const struct dml2_core_mode_support_result *mo static bool are_timings_trivially_synchronizable(struct dml2_display_cfg *display_config, int mask) { - unsigned char i; + unsigned int i; bool identical = true; bool contains_drr = false; - unsigned char remap_array[DML2_MAX_PLANES]; - unsigned char remap_array_size = 0; + unsigned int remap_array[DML2_MAX_PLANES]; + unsigned int remap_array_size = 0; // Create a remap array to enable simple iteration through only masked stream indicies for (i = 0; i < display_config->num_streams; i++) { @@ -413,10 +456,10 @@ static bool are_timings_trivially_synchronizable(struct dml2_display_cfg *displa static int find_smallest_idle_time_in_vblank_us(struct dml2_dpmm_map_mode_to_soc_dpm_params_in_out *in_out, int mask) { - unsigned char i; + unsigned int i; int min_idle_us = 0; - unsigned char remap_array[DML2_MAX_PLANES]; - unsigned char remap_array_size = 0; 
+ unsigned int remap_array[DML2_MAX_PLANES]; + unsigned int remap_array_size = 0; const struct dml2_core_mode_support_result *mode_support_result = &in_out->display_cfg->mode_support_result; // Create a remap array to enable simple iteration through only masked stream indicies @@ -711,7 +754,7 @@ bool dpmm_dcn4_map_watermarks(struct dml2_dpmm_map_watermarks_params_in_out *in_ dchubbub_regs->wm_regs[DML2_DCHUB_WATERMARK_SET_A].fclk_pstate = (int unsigned)(mode_lib->mp.Watermark.FCLKChangeWatermark * refclk_freq_in_mhz); dchubbub_regs->wm_regs[DML2_DCHUB_WATERMARK_SET_A].sr_enter = (int unsigned)(mode_lib->mp.Watermark.StutterEnterPlusExitWatermark * refclk_freq_in_mhz); dchubbub_regs->wm_regs[DML2_DCHUB_WATERMARK_SET_A].sr_exit = (int unsigned)(mode_lib->mp.Watermark.StutterExitWatermark * refclk_freq_in_mhz); - dchubbub_regs->wm_regs[DML2_DCHUB_WATERMARK_SET_A].temp_read_or_ppt = (int unsigned)(mode_lib->mp.Watermark.g6_temp_read_watermark_us * refclk_freq_in_mhz); + dchubbub_regs->wm_regs[DML2_DCHUB_WATERMARK_SET_A].temp_read_or_ppt = (int unsigned)(mode_lib->mp.Watermark.temp_read_or_ppt_watermark_us * refclk_freq_in_mhz); dchubbub_regs->wm_regs[DML2_DCHUB_WATERMARK_SET_A].uclk_pstate = (int unsigned)(mode_lib->mp.Watermark.DRAMClockChangeWatermark * refclk_freq_in_mhz); dchubbub_regs->wm_regs[DML2_DCHUB_WATERMARK_SET_A].urgent = (int unsigned)(mode_lib->mp.Watermark.UrgentWatermark * refclk_freq_in_mhz); dchubbub_regs->wm_regs[DML2_DCHUB_WATERMARK_SET_A].usr = (int unsigned)(mode_lib->mp.Watermark.USRRetrainingWatermark * refclk_freq_in_mhz); @@ -725,7 +768,7 @@ bool dpmm_dcn4_map_watermarks(struct dml2_dpmm_map_watermarks_params_in_out *in_ dchubbub_regs->wm_regs[DML2_DCHUB_WATERMARK_SET_B].fclk_pstate = (int unsigned)(mode_lib->mp.Watermark.FCLKChangeWatermark * refclk_freq_in_mhz); dchubbub_regs->wm_regs[DML2_DCHUB_WATERMARK_SET_B].sr_enter = (int unsigned)(mode_lib->mp.Watermark.StutterEnterPlusExitWatermark * refclk_freq_in_mhz); 
dchubbub_regs->wm_regs[DML2_DCHUB_WATERMARK_SET_B].sr_exit = (int unsigned)(mode_lib->mp.Watermark.StutterExitWatermark * refclk_freq_in_mhz); - dchubbub_regs->wm_regs[DML2_DCHUB_WATERMARK_SET_B].temp_read_or_ppt = (int unsigned)(mode_lib->mp.Watermark.g6_temp_read_watermark_us * refclk_freq_in_mhz); + dchubbub_regs->wm_regs[DML2_DCHUB_WATERMARK_SET_B].temp_read_or_ppt = (int unsigned)(mode_lib->mp.Watermark.temp_read_or_ppt_watermark_us * refclk_freq_in_mhz); dchubbub_regs->wm_regs[DML2_DCHUB_WATERMARK_SET_B].uclk_pstate = (int unsigned)(mode_lib->mp.Watermark.DRAMClockChangeWatermark * refclk_freq_in_mhz); dchubbub_regs->wm_regs[DML2_DCHUB_WATERMARK_SET_B].urgent = (int unsigned)(mode_lib->mp.Watermark.UrgentWatermark * refclk_freq_in_mhz); dchubbub_regs->wm_regs[DML2_DCHUB_WATERMARK_SET_B].usr = (int unsigned)(mode_lib->mp.Watermark.USRRetrainingWatermark * refclk_freq_in_mhz); diff --git a/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_pmo/dml2_pmo_dcn3.c b/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_pmo/dml2_pmo_dcn3.c index a31db5742675..e763c8e45da8 100644 --- a/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_pmo/dml2_pmo_dcn3.c +++ b/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_pmo/dml2_pmo_dcn3.c @@ -195,11 +195,11 @@ static int count_planes_with_stream_index(const struct dml2_display_cfg *display static bool are_timings_trivially_synchronizable(struct display_configuation_with_meta *display_config, int mask) { - unsigned char i; + unsigned int i; bool identical = true; bool contains_drr = false; - unsigned char remap_array[DML2_MAX_PLANES]; - unsigned char remap_array_size = 0; + unsigned int remap_array[DML2_MAX_PLANES]; + unsigned int remap_array_size = 0; // Create a remap array to enable simple iteration through only masked stream indicies for (i = 0; i < display_config->display_config.num_streams; i++) { @@ -347,8 +347,12 @@ static int find_highest_odm_load_stream_index( int odm_load, highest_odm_load = -1, highest_odm_load_index 
= -1; for (i = 0; i < display_config->num_streams; i++) { - odm_load = display_config->stream_descriptors[i].timing.pixel_clock_khz + if (mode_support_result->cfg_support_info.stream_support_info[i].odms_used > 0) + odm_load = display_config->stream_descriptors[i].timing.pixel_clock_khz / mode_support_result->cfg_support_info.stream_support_info[i].odms_used; + else + odm_load = 0; + if (odm_load > highest_odm_load) { highest_odm_load_index = i; highest_odm_load = odm_load; diff --git a/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_pmo/dml2_pmo_dcn4_fams2.c b/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_pmo/dml2_pmo_dcn4_fams2.c index 92269f0e50ed..a3324f7b9ba6 100644 --- a/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_pmo/dml2_pmo_dcn4_fams2.c +++ b/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_pmo/dml2_pmo_dcn4_fams2.c @@ -13,32 +13,32 @@ static const double MIN_BLANK_STUTTER_FACTOR = 3.0; static const struct dml2_pmo_pstate_strategy base_strategy_list_1_display[] = { // VActive Preferred { - .per_stream_pstate_method = { dml2_pmo_pstate_strategy_vactive, dml2_pmo_pstate_strategy_na, dml2_pmo_pstate_strategy_na, dml2_pmo_pstate_strategy_na }, + .per_stream_pstate_method = { dml2_pstate_method_vactive, dml2_pstate_method_na, dml2_pstate_method_na, dml2_pstate_method_na }, .allow_state_increase = true, }, // Then SVP { - .per_stream_pstate_method = { dml2_pmo_pstate_strategy_fw_svp, dml2_pmo_pstate_strategy_na, dml2_pmo_pstate_strategy_na, dml2_pmo_pstate_strategy_na }, + .per_stream_pstate_method = { dml2_pstate_method_fw_svp, dml2_pstate_method_na, dml2_pstate_method_na, dml2_pstate_method_na }, .allow_state_increase = true, }, // Then VBlank { - .per_stream_pstate_method = { dml2_pmo_pstate_strategy_vblank, dml2_pmo_pstate_strategy_na, dml2_pmo_pstate_strategy_na, dml2_pmo_pstate_strategy_na }, + .per_stream_pstate_method = { dml2_pstate_method_vblank, dml2_pstate_method_na, dml2_pstate_method_na, dml2_pstate_method_na }, 
.allow_state_increase = false, }, // Then DRR { - .per_stream_pstate_method = { dml2_pmo_pstate_strategy_fw_drr, dml2_pmo_pstate_strategy_na, dml2_pmo_pstate_strategy_na, dml2_pmo_pstate_strategy_na }, + .per_stream_pstate_method = { dml2_pstate_method_fw_drr, dml2_pstate_method_na, dml2_pstate_method_na, dml2_pstate_method_na }, .allow_state_increase = true, }, // Finally VBlank, but allow base clocks for latency to increase /* { - .per_stream_pstate_method = { dml2_pmo_pstate_strategy_vblank, dml2_pmo_pstate_strategy_na, dml2_pmo_pstate_strategy_na, dml2_pmo_pstate_strategy_na }, + .per_stream_pstate_method = { dml2_pstate_method_vblank, dml2_pstate_method_na, dml2_pstate_method_na, dml2_pstate_method_na }, .allow_state_increase = true, }, */ @@ -49,56 +49,56 @@ static const int base_strategy_list_1_display_size = sizeof(base_strategy_list_1 static const struct dml2_pmo_pstate_strategy base_strategy_list_2_display[] = { // VActive only is preferred { - .per_stream_pstate_method = { dml2_pmo_pstate_strategy_vactive, dml2_pmo_pstate_strategy_vactive, dml2_pmo_pstate_strategy_na, dml2_pmo_pstate_strategy_na }, + .per_stream_pstate_method = { dml2_pstate_method_vactive, dml2_pstate_method_vactive, dml2_pstate_method_na, dml2_pstate_method_na }, .allow_state_increase = true, }, // Then VActive + VBlank { - .per_stream_pstate_method = { dml2_pmo_pstate_strategy_vactive, dml2_pmo_pstate_strategy_vblank, dml2_pmo_pstate_strategy_na, dml2_pmo_pstate_strategy_na }, + .per_stream_pstate_method = { dml2_pstate_method_vactive, dml2_pstate_method_vblank, dml2_pstate_method_na, dml2_pstate_method_na }, .allow_state_increase = false, }, // Then VBlank only { - .per_stream_pstate_method = { dml2_pmo_pstate_strategy_vblank, dml2_pmo_pstate_strategy_vblank, dml2_pmo_pstate_strategy_na, dml2_pmo_pstate_strategy_na }, + .per_stream_pstate_method = { dml2_pstate_method_vblank, dml2_pstate_method_vblank, dml2_pstate_method_na, dml2_pstate_method_na }, .allow_state_increase = false, }, 
// Then SVP + VBlank { - .per_stream_pstate_method = { dml2_pmo_pstate_strategy_fw_svp, dml2_pmo_pstate_strategy_vblank, dml2_pmo_pstate_strategy_na, dml2_pmo_pstate_strategy_na }, + .per_stream_pstate_method = { dml2_pstate_method_fw_svp, dml2_pstate_method_vblank, dml2_pstate_method_na, dml2_pstate_method_na }, .allow_state_increase = false, }, // Then SVP + DRR { - .per_stream_pstate_method = { dml2_pmo_pstate_strategy_fw_svp, dml2_pmo_pstate_strategy_fw_drr, dml2_pmo_pstate_strategy_na, dml2_pmo_pstate_strategy_na }, + .per_stream_pstate_method = { dml2_pstate_method_fw_svp, dml2_pstate_method_fw_drr, dml2_pstate_method_na, dml2_pstate_method_na }, .allow_state_increase = true, }, // Then SVP + SVP { - .per_stream_pstate_method = { dml2_pmo_pstate_strategy_fw_svp, dml2_pmo_pstate_strategy_fw_svp, dml2_pmo_pstate_strategy_na, dml2_pmo_pstate_strategy_na }, + .per_stream_pstate_method = { dml2_pstate_method_fw_svp, dml2_pstate_method_fw_svp, dml2_pstate_method_na, dml2_pstate_method_na }, .allow_state_increase = true, }, // Then DRR + VActive { - .per_stream_pstate_method = { dml2_pmo_pstate_strategy_vactive, dml2_pmo_pstate_strategy_fw_drr, dml2_pmo_pstate_strategy_na, dml2_pmo_pstate_strategy_na }, + .per_stream_pstate_method = { dml2_pstate_method_vactive, dml2_pstate_method_fw_drr, dml2_pstate_method_na, dml2_pstate_method_na }, .allow_state_increase = true, }, // Then DRR + DRR { - .per_stream_pstate_method = { dml2_pmo_pstate_strategy_fw_drr, dml2_pmo_pstate_strategy_fw_drr, dml2_pmo_pstate_strategy_na, dml2_pmo_pstate_strategy_na }, + .per_stream_pstate_method = { dml2_pstate_method_fw_drr, dml2_pstate_method_fw_drr, dml2_pstate_method_na, dml2_pstate_method_na }, .allow_state_increase = true, }, // Finally VBlank, but allow base clocks for latency to increase /* { - .per_stream_pstate_method = { dml2_pmo_pstate_strategy_vblank, dml2_pmo_pstate_strategy_vblank, dml2_pmo_pstate_strategy_na, dml2_pmo_pstate_strategy_na }, + .per_stream_pstate_method = { 
dml2_pstate_method_vblank, dml2_pstate_method_vblank, dml2_pstate_method_na, dml2_pstate_method_na }, .allow_state_increase = true, }, */ @@ -109,32 +109,32 @@ static const int base_strategy_list_2_display_size = sizeof(base_strategy_list_2 static const struct dml2_pmo_pstate_strategy base_strategy_list_3_display[] = { // All VActive { - .per_stream_pstate_method = { dml2_pmo_pstate_strategy_vactive, dml2_pmo_pstate_strategy_vactive, dml2_pmo_pstate_strategy_vactive, dml2_pmo_pstate_strategy_na }, + .per_stream_pstate_method = { dml2_pstate_method_vactive, dml2_pstate_method_vactive, dml2_pstate_method_vactive, dml2_pstate_method_na }, .allow_state_increase = true, }, // VActive + 1 VBlank { - .per_stream_pstate_method = { dml2_pmo_pstate_strategy_vactive, dml2_pmo_pstate_strategy_vactive, dml2_pmo_pstate_strategy_vblank, dml2_pmo_pstate_strategy_na }, + .per_stream_pstate_method = { dml2_pstate_method_vactive, dml2_pstate_method_vactive, dml2_pstate_method_vblank, dml2_pstate_method_na }, .allow_state_increase = false, }, // All VBlank { - .per_stream_pstate_method = { dml2_pmo_pstate_strategy_vblank, dml2_pmo_pstate_strategy_vblank, dml2_pmo_pstate_strategy_vblank, dml2_pmo_pstate_strategy_na }, + .per_stream_pstate_method = { dml2_pstate_method_vblank, dml2_pstate_method_vblank, dml2_pstate_method_vblank, dml2_pstate_method_na }, .allow_state_increase = false, }, // All DRR { - .per_stream_pstate_method = { dml2_pmo_pstate_strategy_fw_drr, dml2_pmo_pstate_strategy_fw_drr, dml2_pmo_pstate_strategy_fw_drr, dml2_pmo_pstate_strategy_na }, + .per_stream_pstate_method = { dml2_pstate_method_fw_drr, dml2_pstate_method_fw_drr, dml2_pstate_method_fw_drr, dml2_pstate_method_na }, .allow_state_increase = true, }, // All VBlank, with state increase allowed /* { - .per_stream_pstate_method = { dml2_pmo_pstate_strategy_vblank, dml2_pmo_pstate_strategy_vblank, dml2_pmo_pstate_strategy_vblank, dml2_pmo_pstate_strategy_na }, + .per_stream_pstate_method = { 
dml2_pstate_method_vblank, dml2_pstate_method_vblank, dml2_pstate_method_vblank, dml2_pstate_method_na }, .allow_state_increase = true, }, */ @@ -145,32 +145,32 @@ static const int base_strategy_list_3_display_size = sizeof(base_strategy_list_3 static const struct dml2_pmo_pstate_strategy base_strategy_list_4_display[] = { // All VActive { - .per_stream_pstate_method = { dml2_pmo_pstate_strategy_vactive, dml2_pmo_pstate_strategy_vactive, dml2_pmo_pstate_strategy_vactive, dml2_pmo_pstate_strategy_vactive }, + .per_stream_pstate_method = { dml2_pstate_method_vactive, dml2_pstate_method_vactive, dml2_pstate_method_vactive, dml2_pstate_method_vactive }, .allow_state_increase = true, }, // VActive + 1 VBlank { - .per_stream_pstate_method = { dml2_pmo_pstate_strategy_vactive, dml2_pmo_pstate_strategy_vactive, dml2_pmo_pstate_strategy_vactive, dml2_pmo_pstate_strategy_vblank }, + .per_stream_pstate_method = { dml2_pstate_method_vactive, dml2_pstate_method_vactive, dml2_pstate_method_vactive, dml2_pstate_method_vblank }, .allow_state_increase = false, }, // All Vblank { - .per_stream_pstate_method = { dml2_pmo_pstate_strategy_vblank, dml2_pmo_pstate_strategy_vblank, dml2_pmo_pstate_strategy_vblank, dml2_pmo_pstate_strategy_vblank }, + .per_stream_pstate_method = { dml2_pstate_method_vblank, dml2_pstate_method_vblank, dml2_pstate_method_vblank, dml2_pstate_method_vblank }, .allow_state_increase = false, }, // All DRR { - .per_stream_pstate_method = { dml2_pmo_pstate_strategy_fw_drr, dml2_pmo_pstate_strategy_fw_drr, dml2_pmo_pstate_strategy_fw_drr, dml2_pmo_pstate_strategy_fw_drr }, + .per_stream_pstate_method = { dml2_pstate_method_fw_drr, dml2_pstate_method_fw_drr, dml2_pstate_method_fw_drr, dml2_pstate_method_fw_drr }, .allow_state_increase = true, }, // All VBlank, with state increase allowed /* { - .per_stream_pstate_method = { dml2_pmo_pstate_strategy_vblank, dml2_pmo_pstate_strategy_vblank, dml2_pmo_pstate_strategy_vblank, dml2_pmo_pstate_strategy_vblank }, + 
.per_stream_pstate_method = { dml2_pstate_method_vblank, dml2_pstate_method_vblank, dml2_pstate_method_vblank, dml2_pstate_method_vblank }, .allow_state_increase = true, }, */ @@ -355,29 +355,30 @@ bool pmo_dcn4_fams2_optimize_dcc_mcache(struct dml2_pmo_optimize_dcc_mcache_in_o return result; } -static enum dml2_pmo_pstate_method convert_strategy_to_drr_variant(const enum dml2_pmo_pstate_method base_strategy) +static enum dml2_pstate_method convert_strategy_to_drr_variant(const enum dml2_pstate_method base_strategy) { - enum dml2_pmo_pstate_method variant_strategy = 0; + enum dml2_pstate_method variant_strategy = 0; switch (base_strategy) { - case dml2_pmo_pstate_strategy_vactive: - variant_strategy = dml2_pmo_pstate_strategy_fw_vactive_drr; + case dml2_pstate_method_vactive: + variant_strategy = dml2_pstate_method_fw_vactive_drr; break; - case dml2_pmo_pstate_strategy_vblank: - variant_strategy = dml2_pmo_pstate_strategy_fw_vblank_drr; + case dml2_pstate_method_vblank: + variant_strategy = dml2_pstate_method_fw_vblank_drr; break; - case dml2_pmo_pstate_strategy_fw_svp: - variant_strategy = dml2_pmo_pstate_strategy_fw_svp_drr; + case dml2_pstate_method_fw_svp: + variant_strategy = dml2_pstate_method_fw_svp_drr; break; - case dml2_pmo_pstate_strategy_fw_vactive_drr: - case dml2_pmo_pstate_strategy_fw_vblank_drr: - case dml2_pmo_pstate_strategy_fw_svp_drr: - case dml2_pmo_pstate_strategy_fw_drr: - case dml2_pmo_pstate_strategy_reserved_hw: - case dml2_pmo_pstate_strategy_reserved_fw: - case dml2_pmo_pstate_strategy_reserved_fw_drr_clamped: - case dml2_pmo_pstate_strategy_reserved_fw_drr_var: - case dml2_pmo_pstate_strategy_na: + case dml2_pstate_method_fw_vactive_drr: + case dml2_pstate_method_fw_vblank_drr: + case dml2_pstate_method_fw_svp_drr: + case dml2_pstate_method_fw_drr: + case dml2_pstate_method_reserved_hw: + case dml2_pstate_method_reserved_fw: + case dml2_pstate_method_reserved_fw_drr_clamped: + case dml2_pstate_method_reserved_fw_drr_var: + case 
dml2_pstate_method_count: + case dml2_pstate_method_na: default: /* no variant for this mode */ variant_strategy = base_strategy; @@ -419,23 +420,22 @@ static unsigned int get_num_expanded_strategies( static void insert_strategy_into_expanded_list( const struct dml2_pmo_pstate_strategy *per_stream_pstate_strategy, - int stream_count, - struct dml2_pmo_init_data *init_data) + const int stream_count, + struct dml2_pmo_pstate_strategy *expanded_strategy_list, + unsigned int *num_expanded_strategies) { - struct dml2_pmo_pstate_strategy *expanded_strategy_list = NULL; + if (expanded_strategy_list && num_expanded_strategies) { + memcpy(&expanded_strategy_list[*num_expanded_strategies], per_stream_pstate_strategy, sizeof(struct dml2_pmo_pstate_strategy)); - expanded_strategy_list = get_expanded_strategy_list(init_data, stream_count); - - if (expanded_strategy_list) { - memcpy(&expanded_strategy_list[init_data->pmo_dcn4.num_expanded_strategies_per_list[stream_count - 1]], per_stream_pstate_strategy, sizeof(struct dml2_pmo_pstate_strategy)); - - init_data->pmo_dcn4.num_expanded_strategies_per_list[stream_count - 1]++; + (*num_expanded_strategies)++; } } -static void expand_base_strategy(struct dml2_pmo_instance *pmo, +static void expand_base_strategy( const struct dml2_pmo_pstate_strategy *base_strategy, - unsigned int stream_count) + const unsigned int stream_count, + struct dml2_pmo_pstate_strategy *expanded_strategy_list, + unsigned int *num_expanded_strategies) { bool skip_to_next_stream; bool expanded_strategy_added; @@ -473,7 +473,7 @@ static void expand_base_strategy(struct dml2_pmo_instance *pmo, if (i >= stream_count - 1) { /* insert into strategy list */ - insert_strategy_into_expanded_list(&cur_strategy_list, stream_count, &pmo->init_data); + insert_strategy_into_expanded_list(&cur_strategy_list, stream_count, expanded_strategy_list, num_expanded_strategies); expanded_strategy_added = true; } else { /* skip to next stream */ @@ -512,9 +512,9 @@ static void 
expand_base_strategy(struct dml2_pmo_instance *pmo, static bool is_variant_method_valid(const struct dml2_pmo_pstate_strategy *base_strategy, const struct dml2_pmo_pstate_strategy *variant_strategy, - unsigned int num_streams_per_base_method[PMO_DCN4_MAX_DISPLAYS], - unsigned int num_streams_per_variant_method[PMO_DCN4_MAX_DISPLAYS], - unsigned int stream_count) + const unsigned int num_streams_per_base_method[PMO_DCN4_MAX_DISPLAYS], + const unsigned int num_streams_per_variant_method[PMO_DCN4_MAX_DISPLAYS], + const unsigned int stream_count) { bool valid = true; unsigned int i; @@ -522,7 +522,7 @@ static bool is_variant_method_valid(const struct dml2_pmo_pstate_strategy *base_ /* check all restrictions are met */ for (i = 0; i < stream_count; i++) { /* vblank + vblank_drr variants are invalid */ - if (base_strategy->per_stream_pstate_method[i] == dml2_pmo_pstate_strategy_vblank && + if (base_strategy->per_stream_pstate_method[i] == dml2_pstate_method_vblank && ((num_streams_per_base_method[i] > 0 && num_streams_per_variant_method[i] > 0) || num_streams_per_variant_method[i] > 1)) { valid = false; @@ -533,9 +533,12 @@ static bool is_variant_method_valid(const struct dml2_pmo_pstate_strategy *base_ return valid; } -static void expand_variant_strategy(struct dml2_pmo_instance *pmo, +static void expand_variant_strategy( const struct dml2_pmo_pstate_strategy *base_strategy, - unsigned int stream_count) + const unsigned int stream_count, + const bool should_permute, + struct dml2_pmo_pstate_strategy *expanded_strategy_list, + unsigned int *num_expanded_strategies) { bool variant_found; unsigned int i, j; @@ -544,7 +547,7 @@ static void expand_variant_strategy(struct dml2_pmo_instance *pmo, unsigned int num_streams_per_method[PMO_DCN4_MAX_DISPLAYS] = { 0 }; unsigned int num_streams_per_base_method[PMO_DCN4_MAX_DISPLAYS] = { 0 }; unsigned int num_streams_per_variant_method[PMO_DCN4_MAX_DISPLAYS] = { 0 }; - enum dml2_pmo_pstate_method 
per_stream_variant_method[DML2_MAX_PLANES]; + enum dml2_pstate_method per_stream_variant_method[DML2_MAX_PLANES]; struct dml2_pmo_pstate_strategy variant_strategy = { 0 }; /* determine number of displays per method */ @@ -585,7 +588,13 @@ static void expand_variant_strategy(struct dml2_pmo_instance *pmo, } if (variant_found && is_variant_method_valid(base_strategy, &variant_strategy, num_streams_per_base_method, num_streams_per_variant_method, stream_count)) { - expand_base_strategy(pmo, &variant_strategy, stream_count); + if (should_permute) { + /* permutations are permitted, proceed to expand */ + expand_base_strategy(&variant_strategy, stream_count, expanded_strategy_list, num_expanded_strategies); + } else { + /* no permutations allowed, so add to list now */ + insert_strategy_into_expanded_list(&variant_strategy, stream_count, expanded_strategy_list, num_expanded_strategies); + } } /* rollback to earliest method with bases remaining */ @@ -612,18 +621,19 @@ static void expand_variant_strategy(struct dml2_pmo_instance *pmo, } } -static void expand_base_strategies( - struct dml2_pmo_instance *pmo, - const struct dml2_pmo_pstate_strategy *base_strategies_list, - const unsigned int num_base_strategies, - unsigned int stream_count) +void pmo_dcn4_fams2_expand_base_pstate_strategies( + const struct dml2_pmo_pstate_strategy *base_strategies_list, + const unsigned int num_base_strategies, + const unsigned int stream_count, + struct dml2_pmo_pstate_strategy *expanded_strategy_list, + unsigned int *num_expanded_strategies) { unsigned int i; /* expand every explicit base strategy (except all DRR) */ for (i = 0; i < num_base_strategies; i++) { - expand_base_strategy(pmo, &base_strategies_list[i], stream_count); - expand_variant_strategy(pmo, &base_strategies_list[i], stream_count); + expand_base_strategy(&base_strategies_list[i], stream_count, expanded_strategy_list, num_expanded_strategies); + expand_variant_strategy(&base_strategies_list[i], stream_count, true, 
expanded_strategy_list, num_expanded_strategies); } } @@ -652,25 +662,45 @@ bool pmo_dcn4_fams2_initialize(struct dml2_pmo_initialize_in_out *in_out) DML2_ASSERT(base_strategy_list_1_display_size <= PMO_DCN4_MAX_BASE_STRATEGIES); /* populate list */ - expand_base_strategies(pmo, base_strategy_list_1_display, base_strategy_list_1_display_size, 1); + pmo_dcn4_fams2_expand_base_pstate_strategies( + base_strategy_list_1_display, + base_strategy_list_1_display_size, + i, + pmo->init_data.pmo_dcn4.expanded_strategy_list_1_display, + &pmo->init_data.pmo_dcn4.num_expanded_strategies_per_list[i - 1]); break; case 2: DML2_ASSERT(base_strategy_list_2_display_size <= PMO_DCN4_MAX_BASE_STRATEGIES); /* populate list */ - expand_base_strategies(pmo, base_strategy_list_2_display, base_strategy_list_2_display_size, 2); + pmo_dcn4_fams2_expand_base_pstate_strategies( + base_strategy_list_2_display, + base_strategy_list_2_display_size, + i, + pmo->init_data.pmo_dcn4.expanded_strategy_list_2_display, + &pmo->init_data.pmo_dcn4.num_expanded_strategies_per_list[i - 1]); break; case 3: DML2_ASSERT(base_strategy_list_3_display_size <= PMO_DCN4_MAX_BASE_STRATEGIES); /* populate list */ - expand_base_strategies(pmo, base_strategy_list_3_display, base_strategy_list_3_display_size, 3); + pmo_dcn4_fams2_expand_base_pstate_strategies( + base_strategy_list_3_display, + base_strategy_list_3_display_size, + i, + pmo->init_data.pmo_dcn4.expanded_strategy_list_3_display, + &pmo->init_data.pmo_dcn4.num_expanded_strategies_per_list[i - 1]); break; case 4: DML2_ASSERT(base_strategy_list_4_display_size <= PMO_DCN4_MAX_BASE_STRATEGIES); /* populate list */ - expand_base_strategies(pmo, base_strategy_list_4_display, base_strategy_list_4_display_size, 4); + pmo_dcn4_fams2_expand_base_pstate_strategies( + base_strategy_list_4_display, + base_strategy_list_4_display_size, + i, + pmo->init_data.pmo_dcn4.expanded_strategy_list_4_display, + &pmo->init_data.pmo_dcn4.num_expanded_strategies_per_list[i - 1]); 
break; } } @@ -783,8 +813,12 @@ static int find_highest_odm_load_stream_index( int odm_load, highest_odm_load = -1, highest_odm_load_index = -1; for (i = 0; i < display_config->num_streams; i++) { - odm_load = display_config->stream_descriptors[i].timing.pixel_clock_khz + if (mode_support_result->cfg_support_info.stream_support_info[i].odms_used > 0) + odm_load = display_config->stream_descriptors[i].timing.pixel_clock_khz / mode_support_result->cfg_support_info.stream_support_info[i].odms_used; + else + odm_load = 0; + if (odm_load > highest_odm_load) { highest_odm_load_index = i; highest_odm_load = odm_load; @@ -941,11 +975,8 @@ static void build_synchronized_timing_groups( /* find synchronizable timing groups */ for (j = i + 1; j < display_config->display_config.num_streams; j++) { if (memcmp(master_timing, - &display_config->display_config.stream_descriptors[j].timing, - sizeof(struct dml2_timing_cfg)) == 0 && - display_config->display_config.stream_descriptors[i].output.output_encoder == display_config->display_config.stream_descriptors[j].output.output_encoder && - (display_config->display_config.stream_descriptors[i].output.output_encoder != dml2_hdmi || //hdmi requires formats match - display_config->display_config.stream_descriptors[i].output.output_format == display_config->display_config.stream_descriptors[j].output.output_format)) { + &display_config->display_config.stream_descriptors[j].timing, + sizeof(struct dml2_timing_cfg)) == 0) { set_bit_in_bitfield(&pmo->scratch.pmo_dcn4.synchronized_timing_group_masks[timing_group_idx], j); set_bit_in_bitfield(&stream_mapped_mask, j); } @@ -959,7 +990,7 @@ static bool all_timings_support_vactive(const struct dml2_pmo_instance *pmo, const struct display_configuation_with_meta *display_config, unsigned int mask) { - unsigned char i; + unsigned int i; bool valid = true; // Create a remap array to enable simple iteration through only masked stream indicies @@ -1008,7 +1039,7 @@ static bool 
all_timings_support_drr(const struct dml2_pmo_instance *pmo, const struct display_configuation_with_meta *display_config, unsigned int mask) { - unsigned char i; + unsigned int i; for (i = 0; i < DML2_MAX_PLANES; i++) { const struct dml2_stream_parameters *stream_descriptor; const struct dml2_fams2_meta *stream_fams2_meta; @@ -1050,7 +1081,7 @@ static bool all_timings_support_svp(const struct dml2_pmo_instance *pmo, const struct dml2_plane_parameters *plane_descriptor; const struct dml2_fams2_meta *stream_fams2_meta; unsigned int microschedule_vlines; - unsigned char i; + unsigned int i; unsigned int num_planes_per_stream[DML2_MAX_PLANES] = { 0 }; @@ -1106,24 +1137,73 @@ static void insert_into_candidate_list(const struct dml2_pmo_pstate_strategy *ps scratch->pmo_dcn4.num_pstate_candidates++; } -static bool all_planes_match_method(const struct display_configuation_with_meta *display_cfg, int plane_mask, enum dml2_pmo_pstate_method method) +static enum dml2_pstate_method uclk_pstate_strategy_override_to_pstate_method(const enum dml2_uclk_pstate_change_strategy override_strategy) { - unsigned char i; - enum dml2_uclk_pstate_change_strategy matching_strategy = (enum dml2_uclk_pstate_change_strategy) dml2_pmo_pstate_strategy_na; + enum dml2_pstate_method method = dml2_pstate_method_na; - if (method == dml2_pmo_pstate_strategy_vactive || method == dml2_pmo_pstate_strategy_fw_vactive_drr) - matching_strategy = dml2_uclk_pstate_change_strategy_force_vactive; - else if (method == dml2_pmo_pstate_strategy_vblank || method == dml2_pmo_pstate_strategy_fw_vblank_drr) - matching_strategy = dml2_uclk_pstate_change_strategy_force_vblank; - else if (method == dml2_pmo_pstate_strategy_fw_svp) - matching_strategy = dml2_uclk_pstate_change_strategy_force_mall_svp; - else if (method == dml2_pmo_pstate_strategy_fw_drr) - matching_strategy = dml2_uclk_pstate_change_strategy_force_drr; + switch (override_strategy) { + case dml2_uclk_pstate_change_strategy_force_vactive: + method = 
dml2_pstate_method_vactive; + break; + case dml2_uclk_pstate_change_strategy_force_vblank: + method = dml2_pstate_method_vblank; + break; + case dml2_uclk_pstate_change_strategy_force_drr: + method = dml2_pstate_method_fw_drr; + break; + case dml2_uclk_pstate_change_strategy_force_mall_svp: + method = dml2_pstate_method_fw_svp; + break; + case dml2_uclk_pstate_change_strategy_force_mall_full_frame: + case dml2_uclk_pstate_change_strategy_auto: + default: + method = dml2_pstate_method_na; + } + + return method; +} + +static enum dml2_uclk_pstate_change_strategy pstate_method_to_uclk_pstate_strategy_override(const enum dml2_pstate_method method) +{ + enum dml2_uclk_pstate_change_strategy override_strategy = dml2_uclk_pstate_change_strategy_auto; + + switch (method) { + case dml2_pstate_method_vactive: + case dml2_pstate_method_fw_vactive_drr: + override_strategy = dml2_uclk_pstate_change_strategy_force_vactive; + break; + case dml2_pstate_method_vblank: + case dml2_pstate_method_fw_vblank_drr: + override_strategy = dml2_uclk_pstate_change_strategy_force_vblank; + break; + case dml2_pstate_method_fw_svp: + case dml2_pstate_method_fw_svp_drr: + override_strategy = dml2_uclk_pstate_change_strategy_force_mall_svp; + break; + case dml2_pstate_method_fw_drr: + override_strategy = dml2_uclk_pstate_change_strategy_force_drr; + break; + case dml2_pstate_method_reserved_hw: + case dml2_pstate_method_reserved_fw: + case dml2_pstate_method_reserved_fw_drr_clamped: + case dml2_pstate_method_reserved_fw_drr_var: + case dml2_pstate_method_count: + case dml2_pstate_method_na: + default: + override_strategy = dml2_uclk_pstate_change_strategy_auto; + } + + return override_strategy; +} + +static bool all_planes_match_method(const struct display_configuation_with_meta *display_cfg, int plane_mask, enum dml2_pstate_method method) +{ + unsigned int i; for (i = 0; i < DML2_MAX_PLANES; i++) { if (is_bit_set_in_bitfield(plane_mask, i)) { if 
(display_cfg->display_config.plane_descriptors[i].overrides.uclk_pstate_change_strategy != dml2_uclk_pstate_change_strategy_auto && - display_cfg->display_config.plane_descriptors[i].overrides.uclk_pstate_change_strategy != matching_strategy) + display_cfg->display_config.plane_descriptors[i].overrides.uclk_pstate_change_strategy != pstate_method_to_uclk_pstate_strategy_override(method)) return false; } } @@ -1149,32 +1229,33 @@ static void build_method_scheduling_params( static struct dml2_fams2_per_method_common_meta *get_per_method_common_meta( struct dml2_pmo_instance *pmo, - enum dml2_pmo_pstate_method stream_pstate_method, + enum dml2_pstate_method stream_pstate_method, int stream_idx) { struct dml2_fams2_per_method_common_meta *stream_method_fams2_meta = NULL; switch (stream_pstate_method) { - case dml2_pmo_pstate_strategy_vactive: - case dml2_pmo_pstate_strategy_fw_vactive_drr: + case dml2_pstate_method_vactive: + case dml2_pstate_method_fw_vactive_drr: stream_method_fams2_meta = &pmo->scratch.pmo_dcn4.stream_fams2_meta[stream_idx].method_vactive.common; break; - case dml2_pmo_pstate_strategy_vblank: - case dml2_pmo_pstate_strategy_fw_vblank_drr: + case dml2_pstate_method_vblank: + case dml2_pstate_method_fw_vblank_drr: stream_method_fams2_meta = &pmo->scratch.pmo_dcn4.stream_fams2_meta[stream_idx].method_vblank.common; break; - case dml2_pmo_pstate_strategy_fw_svp: - case dml2_pmo_pstate_strategy_fw_svp_drr: + case dml2_pstate_method_fw_svp: + case dml2_pstate_method_fw_svp_drr: stream_method_fams2_meta = &pmo->scratch.pmo_dcn4.stream_fams2_meta[stream_idx].method_subvp.common; break; - case dml2_pmo_pstate_strategy_fw_drr: + case dml2_pstate_method_fw_drr: stream_method_fams2_meta = &pmo->scratch.pmo_dcn4.stream_fams2_meta[stream_idx].method_drr.common; break; - case dml2_pmo_pstate_strategy_reserved_hw: - case dml2_pmo_pstate_strategy_reserved_fw: - case dml2_pmo_pstate_strategy_reserved_fw_drr_clamped: - case 
dml2_pmo_pstate_strategy_reserved_fw_drr_var: - case dml2_pmo_pstate_strategy_na: + case dml2_pstate_method_reserved_hw: + case dml2_pstate_method_reserved_fw: + case dml2_pstate_method_reserved_fw_drr_clamped: + case dml2_pstate_method_reserved_fw_drr_var: + case dml2_pstate_method_count: + case dml2_pstate_method_na: default: stream_method_fams2_meta = NULL; } @@ -1215,7 +1296,7 @@ static bool is_timing_group_schedulable( if (is_bit_set_in_bitfield(pmo->scratch.pmo_dcn4.synchronized_timing_group_masks[timing_group_idx], i)) { stream_method_fams2_meta = get_per_method_common_meta(pmo, pstate_strategy->per_stream_pstate_method[i], i); if (!stream_method_fams2_meta) - return false; + continue; if (group_fams2_meta->allow_start_otg_vline < stream_method_fams2_meta->allow_start_otg_vline) { /* set group allow start to larger otg vline */ @@ -1295,7 +1376,7 @@ static bool is_config_schedulable( if (j_disallow_us < jp1_disallow_us) { /* swap as A < B */ swap(s->pmo_dcn4.sorted_group_gtl_disallow_index[j], - s->pmo_dcn4.sorted_group_gtl_disallow_index[j+1]); + s->pmo_dcn4.sorted_group_gtl_disallow_index[j + 1]); swapped = true; } } @@ -1354,7 +1435,7 @@ static bool is_config_schedulable( if (j_period_us < jp1_period_us) { /* swap as A < B */ swap(s->pmo_dcn4.sorted_group_gtl_period_index[j], - s->pmo_dcn4.sorted_group_gtl_period_index[j+1]); + s->pmo_dcn4.sorted_group_gtl_period_index[j + 1]); swapped = true; } } @@ -1413,7 +1494,7 @@ static bool is_config_schedulable( static bool stream_matches_drr_policy(struct dml2_pmo_instance *pmo, const struct display_configuation_with_meta *display_cfg, - const enum dml2_pmo_pstate_method stream_pstate_method, + const enum dml2_pstate_method stream_pstate_method, unsigned int stream_index) { const struct dml2_stream_parameters *stream_descriptor = &display_cfg->display_config.stream_descriptors[stream_index]; @@ -1468,7 +1549,7 @@ static bool validate_pstate_support_strategy_cofunctionality(struct dml2_pmo_ins { struct 
dml2_pmo_scratch *s = &pmo->scratch; - unsigned char stream_index = 0; + unsigned int stream_index = 0; unsigned int svp_count = 0; unsigned int svp_stream_mask = 0; @@ -1494,19 +1575,19 @@ static bool validate_pstate_support_strategy_cofunctionality(struct dml2_pmo_ins strategy_matches_drr_requirements &= stream_matches_drr_policy(pmo, display_cfg, pstate_strategy->per_stream_pstate_method[stream_index], stream_index); - if (pstate_strategy->per_stream_pstate_method[stream_index] == dml2_pmo_pstate_strategy_fw_svp || - pstate_strategy->per_stream_pstate_method[stream_index] == dml2_pmo_pstate_strategy_fw_svp_drr) { + if (pstate_strategy->per_stream_pstate_method[stream_index] == dml2_pstate_method_fw_svp || + pstate_strategy->per_stream_pstate_method[stream_index] == dml2_pstate_method_fw_svp_drr) { svp_count++; set_bit_in_bitfield(&svp_stream_mask, stream_index); - } else if (pstate_strategy->per_stream_pstate_method[stream_index] == dml2_pmo_pstate_strategy_fw_drr) { + } else if (pstate_strategy->per_stream_pstate_method[stream_index] == dml2_pstate_method_fw_drr) { drr_count++; set_bit_in_bitfield(&drr_stream_mask, stream_index); - } else if (pstate_strategy->per_stream_pstate_method[stream_index] == dml2_pmo_pstate_strategy_vactive || - pstate_strategy->per_stream_pstate_method[stream_index] == dml2_pmo_pstate_strategy_fw_vactive_drr) { + } else if (pstate_strategy->per_stream_pstate_method[stream_index] == dml2_pstate_method_vactive || + pstate_strategy->per_stream_pstate_method[stream_index] == dml2_pstate_method_fw_vactive_drr) { vactive_count++; set_bit_in_bitfield(&vactive_stream_mask, stream_index); - } else if (pstate_strategy->per_stream_pstate_method[stream_index] == dml2_pmo_pstate_strategy_vblank || - pstate_strategy->per_stream_pstate_method[stream_index] == dml2_pmo_pstate_strategy_fw_vblank_drr) { + } else if (pstate_strategy->per_stream_pstate_method[stream_index] == dml2_pstate_method_vblank || + 
pstate_strategy->per_stream_pstate_method[stream_index] == dml2_pstate_method_fw_vblank_drr) { vblank_count++; set_bit_in_bitfield(&vblank_stream_mask, stream_index); } @@ -1532,7 +1613,7 @@ static bool validate_pstate_support_strategy_cofunctionality(struct dml2_pmo_ins static int get_vactive_pstate_margin(const struct display_configuation_with_meta *display_cfg, int plane_mask) { - unsigned char i; + unsigned int i; int min_vactive_margin_us = 0xFFFFFFF; for (i = 0; i < DML2_MAX_PLANES; i++) { @@ -1625,7 +1706,7 @@ static void build_fams2_meta_per_stream(struct dml2_pmo_instance *pmo, /* for single stream, guarantee at least an instant of allow */ stream_fams2_meta->method_vactive.max_vactive_det_fill_delay_otg_vlines = (unsigned int)math_floor( math_max2(0.0, - timing->v_active - stream_fams2_meta->min_allow_width_otg_vlines - stream_fams2_meta->dram_clk_change_blackout_otg_vlines)); + timing->v_active - math_max2(1.0, stream_fams2_meta->min_allow_width_otg_vlines) - stream_fams2_meta->dram_clk_change_blackout_otg_vlines)); } else { /* for multi stream, bound to a max fill time defined by IP caps */ stream_fams2_meta->method_vactive.max_vactive_det_fill_delay_otg_vlines = @@ -1738,8 +1819,10 @@ bool pmo_dcn4_fams2_init_for_pstate_support(struct dml2_pmo_init_for_pstate_supp struct display_configuation_with_meta *display_config; const struct dml2_plane_parameters *plane_descriptor; const struct dml2_pmo_pstate_strategy *strategy_list = NULL; + struct dml2_pmo_pstate_strategy override_base_strategy = { 0 }; unsigned int strategy_list_size = 0; - unsigned char plane_index, stream_index, i; + unsigned int plane_index, stream_index, i; + bool build_override_strategy = true; state->performed = true; in_out->base_display_config->stage3.min_clk_index_for_latency = in_out->base_display_config->stage1.min_clk_index_for_latency; @@ -1763,7 +1846,11 @@ bool pmo_dcn4_fams2_init_for_pstate_support(struct dml2_pmo_init_for_pstate_supp 
set_bit_in_bitfield(&s->pmo_dcn4.stream_plane_mask[plane_descriptor->stream_index], plane_index); - state->pstate_switch_modes[plane_index] = dml2_uclk_pstate_support_method_vactive; + state->pstate_switch_modes[plane_index] = dml2_pstate_method_vactive; + + build_override_strategy &= plane_descriptor->overrides.uclk_pstate_change_strategy != dml2_uclk_pstate_change_strategy_auto; + override_base_strategy.per_stream_pstate_method[plane_descriptor->stream_index] = + uclk_pstate_strategy_override_to_pstate_method(plane_descriptor->overrides.uclk_pstate_change_strategy); } // Figure out which streams can do vactive, and also build up implicit SVP and FAMS2 meta @@ -1781,13 +1868,30 @@ bool pmo_dcn4_fams2_init_for_pstate_support(struct dml2_pmo_init_for_pstate_supp /* get synchronized timing groups */ build_synchronized_timing_groups(pmo, display_config); - strategy_list = get_expanded_strategy_list(&pmo->init_data, display_config->display_config.num_streams); - if (!strategy_list) - return false; + if (build_override_strategy) { + /* build expanded override strategy list (no permutations) */ + override_base_strategy.allow_state_increase = true; + s->pmo_dcn4.num_expanded_override_strategies = 0; + insert_strategy_into_expanded_list(&override_base_strategy, + display_config->display_config.num_streams, + s->pmo_dcn4.expanded_override_strategy_list, + &s->pmo_dcn4.num_expanded_override_strategies); + expand_variant_strategy(&override_base_strategy, + display_config->display_config.num_streams, + false, + s->pmo_dcn4.expanded_override_strategy_list, + &s->pmo_dcn4.num_expanded_override_strategies); - strategy_list_size = get_num_expanded_strategies(&pmo->init_data, display_config->display_config.num_streams); + /* use override strategy list */ + strategy_list = s->pmo_dcn4.expanded_override_strategy_list; + strategy_list_size = s->pmo_dcn4.num_expanded_override_strategies; + } else { + /* use predefined strategy list */ + strategy_list = 
get_expanded_strategy_list(&pmo->init_data, display_config->display_config.num_streams); + strategy_list_size = get_num_expanded_strategies(&pmo->init_data, display_config->display_config.num_streams); + } - if (strategy_list_size == 0) + if (!strategy_list || strategy_list_size == 0) return false; s->pmo_dcn4.num_pstate_candidates = 0; @@ -1799,7 +1903,7 @@ bool pmo_dcn4_fams2_init_for_pstate_support(struct dml2_pmo_init_for_pstate_supp } if (s->pmo_dcn4.num_pstate_candidates > 0) { - s->pmo_dcn4.pstate_strategy_candidates[s->pmo_dcn4.num_pstate_candidates - 1].allow_state_increase = true; + s->pmo_dcn4.pstate_strategy_candidates[s->pmo_dcn4.num_pstate_candidates-1].allow_state_increase = true; s->pmo_dcn4.cur_pstate_candidate = -1; return true; } else { @@ -1832,7 +1936,7 @@ static void reset_display_configuration(struct display_configuation_with_meta *d // Reset strategy to auto plane->overrides.uclk_pstate_change_strategy = dml2_uclk_pstate_change_strategy_auto; - display_config->stage3.pstate_switch_modes[plane_index] = dml2_uclk_pstate_support_method_not_supported; + display_config->stage3.pstate_switch_modes[plane_index] = dml2_pstate_method_na; } } @@ -1840,7 +1944,7 @@ static void setup_planes_for_drr_by_mask(struct display_configuation_with_meta * struct dml2_pmo_instance *pmo, int plane_mask) { - unsigned char plane_index; + unsigned int plane_index; struct dml2_plane_parameters *plane; for (plane_index = 0; plane_index < display_config->display_config.num_planes; plane_index++) { @@ -1849,7 +1953,7 @@ static void setup_planes_for_drr_by_mask(struct display_configuation_with_meta * plane->overrides.uclk_pstate_change_strategy = dml2_uclk_pstate_change_strategy_force_drr; - display_config->stage3.pstate_switch_modes[plane_index] = dml2_uclk_pstate_support_method_fw_drr; + display_config->stage3.pstate_switch_modes[plane_index] = dml2_pstate_method_fw_drr; } } @@ -1861,13 +1965,13 @@ static void setup_planes_for_svp_by_mask(struct 
display_configuation_with_meta * { struct dml2_pmo_scratch *scratch = &pmo->scratch; - unsigned char plane_index; + unsigned int plane_index; int stream_index = -1; for (plane_index = 0; plane_index < display_config->display_config.num_planes; plane_index++) { if (is_bit_set_in_bitfield(plane_mask, plane_index)) { stream_index = (char)display_config->display_config.plane_descriptors[plane_index].stream_index; - display_config->stage3.pstate_switch_modes[plane_index] = dml2_uclk_pstate_support_method_fw_subvp_phantom; + display_config->stage3.pstate_switch_modes[plane_index] = dml2_pstate_method_fw_svp; } } @@ -1884,13 +1988,13 @@ static void setup_planes_for_svp_drr_by_mask(struct display_configuation_with_me { struct dml2_pmo_scratch *scratch = &pmo->scratch; - unsigned char plane_index; + unsigned int plane_index; int stream_index = -1; for (plane_index = 0; plane_index < display_config->display_config.num_planes; plane_index++) { if (is_bit_set_in_bitfield(plane_mask, plane_index)) { stream_index = (char)display_config->display_config.plane_descriptors[plane_index].stream_index; - display_config->stage3.pstate_switch_modes[plane_index] = dml2_uclk_pstate_support_method_fw_subvp_phantom_drr; + display_config->stage3.pstate_switch_modes[plane_index] = dml2_pstate_method_fw_svp_drr; } } @@ -1905,7 +2009,7 @@ static void setup_planes_for_vblank_by_mask(struct display_configuation_with_met struct dml2_pmo_instance *pmo, int plane_mask) { - unsigned char plane_index; + unsigned int plane_index; struct dml2_plane_parameters *plane; for (plane_index = 0; plane_index < display_config->display_config.num_planes; plane_index++) { @@ -1915,7 +2019,7 @@ static void setup_planes_for_vblank_by_mask(struct display_configuation_with_met plane->overrides.reserved_vblank_time_ns = (long)math_max2(pmo->soc_bb->power_management_parameters.dram_clk_change_blackout_us * 1000.0, plane->overrides.reserved_vblank_time_ns); - display_config->stage3.pstate_switch_modes[plane_index] = 
dml2_uclk_pstate_support_method_vblank; + display_config->stage3.pstate_switch_modes[plane_index] = dml2_pstate_method_vblank; } } @@ -1925,7 +2029,7 @@ static void setup_planes_for_vblank_drr_by_mask(struct display_configuation_with struct dml2_pmo_instance *pmo, int plane_mask) { - unsigned char plane_index; + unsigned int plane_index; struct dml2_plane_parameters *plane; for (plane_index = 0; plane_index < display_config->display_config.num_planes; plane_index++) { @@ -1933,7 +2037,7 @@ static void setup_planes_for_vblank_drr_by_mask(struct display_configuation_with plane = &display_config->display_config.plane_descriptors[plane_index]; plane->overrides.reserved_vblank_time_ns = (long)(pmo->soc_bb->power_management_parameters.dram_clk_change_blackout_us * 1000); - display_config->stage3.pstate_switch_modes[plane_index] = dml2_uclk_pstate_support_method_fw_vblank_drr; + display_config->stage3.pstate_switch_modes[plane_index] = dml2_pstate_method_fw_vblank_drr; } } } @@ -1942,14 +2046,14 @@ static void setup_planes_for_vactive_by_mask(struct display_configuation_with_me struct dml2_pmo_instance *pmo, int plane_mask) { - unsigned char plane_index; + unsigned int plane_index; unsigned int stream_index; for (plane_index = 0; plane_index < display_config->display_config.num_planes; plane_index++) { if (is_bit_set_in_bitfield(plane_mask, plane_index)) { stream_index = display_config->display_config.plane_descriptors[plane_index].stream_index; - display_config->stage3.pstate_switch_modes[plane_index] = dml2_uclk_pstate_support_method_vactive; + display_config->stage3.pstate_switch_modes[plane_index] = dml2_pstate_method_vactive; if (!pmo->options->disable_vactive_det_fill_bw_pad) { display_config->display_config.plane_descriptors[plane_index].overrides.max_vactive_det_fill_delay_us = @@ -1963,14 +2067,14 @@ static void setup_planes_for_vactive_drr_by_mask(struct display_configuation_wit struct dml2_pmo_instance *pmo, int plane_mask) { - unsigned char plane_index; + 
unsigned int plane_index; unsigned int stream_index; for (plane_index = 0; plane_index < display_config->display_config.num_planes; plane_index++) { if (is_bit_set_in_bitfield(plane_mask, plane_index)) { stream_index = display_config->display_config.plane_descriptors[plane_index].stream_index; - display_config->stage3.pstate_switch_modes[plane_index] = dml2_uclk_pstate_support_method_fw_vactive_drr; + display_config->stage3.pstate_switch_modes[plane_index] = dml2_pstate_method_fw_vactive_drr; if (!pmo->options->disable_vactive_det_fill_bw_pad) { display_config->display_config.plane_descriptors[plane_index].overrides.max_vactive_det_fill_delay_us = @@ -1992,26 +2096,26 @@ static bool setup_display_config(struct display_configuation_with_meta *display_ for (stream_index = 0; stream_index < display_config->display_config.num_streams; stream_index++) { - if (pmo->scratch.pmo_dcn4.pstate_strategy_candidates[strategy_index].per_stream_pstate_method[stream_index] == dml2_pmo_pstate_strategy_na) { + if (pmo->scratch.pmo_dcn4.pstate_strategy_candidates[strategy_index].per_stream_pstate_method[stream_index] == dml2_pstate_method_na) { success = false; break; - } else if (scratch->pmo_dcn4.pstate_strategy_candidates[strategy_index].per_stream_pstate_method[stream_index] == dml2_pmo_pstate_strategy_vactive) { + } else if (scratch->pmo_dcn4.pstate_strategy_candidates[strategy_index].per_stream_pstate_method[stream_index] == dml2_pstate_method_vactive) { setup_planes_for_vactive_by_mask(display_config, pmo, scratch->pmo_dcn4.stream_plane_mask[stream_index]); - } else if (scratch->pmo_dcn4.pstate_strategy_candidates[strategy_index].per_stream_pstate_method[stream_index] == dml2_pmo_pstate_strategy_vblank) { + } else if (scratch->pmo_dcn4.pstate_strategy_candidates[strategy_index].per_stream_pstate_method[stream_index] == dml2_pstate_method_vblank) { setup_planes_for_vblank_by_mask(display_config, pmo, scratch->pmo_dcn4.stream_plane_mask[stream_index]); - } else if 
(scratch->pmo_dcn4.pstate_strategy_candidates[strategy_index].per_stream_pstate_method[stream_index] == dml2_pmo_pstate_strategy_fw_svp) { + } else if (scratch->pmo_dcn4.pstate_strategy_candidates[strategy_index].per_stream_pstate_method[stream_index] == dml2_pstate_method_fw_svp) { fams2_required = true; setup_planes_for_svp_by_mask(display_config, pmo, scratch->pmo_dcn4.stream_plane_mask[stream_index]); - } else if (scratch->pmo_dcn4.pstate_strategy_candidates[strategy_index].per_stream_pstate_method[stream_index] == dml2_pmo_pstate_strategy_fw_vactive_drr) { + } else if (scratch->pmo_dcn4.pstate_strategy_candidates[strategy_index].per_stream_pstate_method[stream_index] == dml2_pstate_method_fw_vactive_drr) { fams2_required = true; setup_planes_for_vactive_drr_by_mask(display_config, pmo, scratch->pmo_dcn4.stream_plane_mask[stream_index]); - } else if (scratch->pmo_dcn4.pstate_strategy_candidates[strategy_index].per_stream_pstate_method[stream_index] == dml2_pmo_pstate_strategy_fw_vblank_drr) { + } else if (scratch->pmo_dcn4.pstate_strategy_candidates[strategy_index].per_stream_pstate_method[stream_index] == dml2_pstate_method_fw_vblank_drr) { fams2_required = true; setup_planes_for_vblank_drr_by_mask(display_config, pmo, scratch->pmo_dcn4.stream_plane_mask[stream_index]); - } else if (scratch->pmo_dcn4.pstate_strategy_candidates[strategy_index].per_stream_pstate_method[stream_index] == dml2_pmo_pstate_strategy_fw_svp_drr) { + } else if (scratch->pmo_dcn4.pstate_strategy_candidates[strategy_index].per_stream_pstate_method[stream_index] == dml2_pstate_method_fw_svp_drr) { fams2_required = true; setup_planes_for_svp_drr_by_mask(display_config, pmo, scratch->pmo_dcn4.stream_plane_mask[stream_index]); - } else if (scratch->pmo_dcn4.pstate_strategy_candidates[strategy_index].per_stream_pstate_method[stream_index] == dml2_pmo_pstate_strategy_fw_drr) { + } else if (scratch->pmo_dcn4.pstate_strategy_candidates[strategy_index].per_stream_pstate_method[stream_index] == 
dml2_pstate_method_fw_drr) { fams2_required = true; setup_planes_for_drr_by_mask(display_config, pmo, scratch->pmo_dcn4.stream_plane_mask[stream_index]); } @@ -2031,7 +2135,7 @@ static bool setup_display_config(struct display_configuation_with_meta *display_ static int get_minimum_reserved_time_us_for_planes(struct display_configuation_with_meta *display_config, int plane_mask) { int min_time_us = 0xFFFFFF; - unsigned char plane_index = 0; + unsigned int plane_index = 0; for (plane_index = 0; plane_index < display_config->display_config.num_planes; plane_index++) { if (is_bit_set_in_bitfield(plane_mask, plane_index)) { @@ -2066,34 +2170,34 @@ bool pmo_dcn4_fams2_test_for_pstate_support(struct dml2_pmo_test_for_pstate_supp for (stream_index = 0; stream_index < in_out->base_display_config->display_config.num_streams; stream_index++) { struct dml2_fams2_meta *stream_fams2_meta = &s->pmo_dcn4.stream_fams2_meta[stream_index]; - if (s->pmo_dcn4.pstate_strategy_candidates[s->pmo_dcn4.cur_pstate_candidate].per_stream_pstate_method[stream_index] == dml2_pmo_pstate_strategy_vactive || - s->pmo_dcn4.pstate_strategy_candidates[s->pmo_dcn4.cur_pstate_candidate].per_stream_pstate_method[stream_index] == dml2_pmo_pstate_strategy_fw_vactive_drr) { + if (s->pmo_dcn4.pstate_strategy_candidates[s->pmo_dcn4.cur_pstate_candidate].per_stream_pstate_method[stream_index] == dml2_pstate_method_vactive || + s->pmo_dcn4.pstate_strategy_candidates[s->pmo_dcn4.cur_pstate_candidate].per_stream_pstate_method[stream_index] == dml2_pstate_method_fw_vactive_drr) { if (get_vactive_pstate_margin(in_out->base_display_config, s->pmo_dcn4.stream_plane_mask[stream_index]) < (MIN_VACTIVE_MARGIN_PCT * in_out->instance->soc_bb->power_management_parameters.dram_clk_change_blackout_us) || get_vactive_det_fill_latency_delay_us(in_out->base_display_config, s->pmo_dcn4.stream_plane_mask[stream_index]) > stream_fams2_meta->method_vactive.max_vactive_det_fill_delay_us) { p_state_supported = false; break; } - } 
else if (s->pmo_dcn4.pstate_strategy_candidates[s->pmo_dcn4.cur_pstate_candidate].per_stream_pstate_method[stream_index] == dml2_pmo_pstate_strategy_vblank || - s->pmo_dcn4.pstate_strategy_candidates[s->pmo_dcn4.cur_pstate_candidate].per_stream_pstate_method[stream_index] == dml2_pmo_pstate_strategy_fw_vblank_drr) { + } else if (s->pmo_dcn4.pstate_strategy_candidates[s->pmo_dcn4.cur_pstate_candidate].per_stream_pstate_method[stream_index] == dml2_pstate_method_vblank || + s->pmo_dcn4.pstate_strategy_candidates[s->pmo_dcn4.cur_pstate_candidate].per_stream_pstate_method[stream_index] == dml2_pstate_method_fw_vblank_drr) { if (get_minimum_reserved_time_us_for_planes(in_out->base_display_config, s->pmo_dcn4.stream_plane_mask[stream_index]) < REQUIRED_RESERVED_TIME || get_vactive_pstate_margin(in_out->base_display_config, s->pmo_dcn4.stream_plane_mask[stream_index]) < MIN_VACTIVE_MARGIN_VBLANK) { p_state_supported = false; break; } - } else if (s->pmo_dcn4.pstate_strategy_candidates[s->pmo_dcn4.cur_pstate_candidate].per_stream_pstate_method[stream_index] == dml2_pmo_pstate_strategy_fw_svp || - s->pmo_dcn4.pstate_strategy_candidates[s->pmo_dcn4.cur_pstate_candidate].per_stream_pstate_method[stream_index] == dml2_pmo_pstate_strategy_fw_svp_drr) { + } else if (s->pmo_dcn4.pstate_strategy_candidates[s->pmo_dcn4.cur_pstate_candidate].per_stream_pstate_method[stream_index] == dml2_pstate_method_fw_svp || + s->pmo_dcn4.pstate_strategy_candidates[s->pmo_dcn4.cur_pstate_candidate].per_stream_pstate_method[stream_index] == dml2_pstate_method_fw_svp_drr) { if (in_out->base_display_config->stage3.stream_svp_meta[stream_index].valid == false) { p_state_supported = false; break; } - } else if (s->pmo_dcn4.pstate_strategy_candidates[s->pmo_dcn4.cur_pstate_candidate].per_stream_pstate_method[stream_index] == dml2_pmo_pstate_strategy_fw_drr) { - if (!all_planes_match_method(in_out->base_display_config, s->pmo_dcn4.stream_plane_mask[stream_index], dml2_pmo_pstate_strategy_fw_drr) || + } 
else if (s->pmo_dcn4.pstate_strategy_candidates[s->pmo_dcn4.cur_pstate_candidate].per_stream_pstate_method[stream_index] == dml2_pstate_method_fw_drr) { + if (!all_planes_match_method(in_out->base_display_config, s->pmo_dcn4.stream_plane_mask[stream_index], dml2_pstate_method_fw_drr) || get_vactive_pstate_margin(in_out->base_display_config, s->pmo_dcn4.stream_plane_mask[stream_index]) < MIN_VACTIVE_MARGIN_DRR) { p_state_supported = false; break; } - } else if (s->pmo_dcn4.pstate_strategy_candidates[s->pmo_dcn4.cur_pstate_candidate].per_stream_pstate_method[stream_index] == dml2_pmo_pstate_strategy_na) { + } else if (s->pmo_dcn4.pstate_strategy_candidates[s->pmo_dcn4.cur_pstate_candidate].per_stream_pstate_method[stream_index] == dml2_pstate_method_na) { p_state_supported = false; break; } diff --git a/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_pmo/dml2_pmo_dcn4_fams2.h b/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_pmo/dml2_pmo_dcn4_fams2.h index 0c25bd3e9ac0..6baab7ad6ecc 100644 --- a/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_pmo/dml2_pmo_dcn4_fams2.h +++ b/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_pmo/dml2_pmo_dcn4_fams2.h @@ -23,4 +23,11 @@ bool pmo_dcn4_fams2_init_for_stutter(struct dml2_pmo_init_for_stutter_in_out *in bool pmo_dcn4_fams2_test_for_stutter(struct dml2_pmo_test_for_stutter_in_out *in_out); bool pmo_dcn4_fams2_optimize_for_stutter(struct dml2_pmo_optimize_for_stutter_in_out *in_out); +void pmo_dcn4_fams2_expand_base_pstate_strategies( + const struct dml2_pmo_pstate_strategy *base_strategies_list, + const unsigned int num_base_strategies, + const unsigned int stream_count, + struct dml2_pmo_pstate_strategy *expanded_strategy_list, + unsigned int *num_expanded_strategies); + #endif diff --git a/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_pmo/dml2_pmo_factory.c b/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_pmo/dml2_pmo_factory.c index add51d41a515..7ed0242a4b33 100644 --- 
a/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_pmo/dml2_pmo_factory.c +++ b/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_pmo/dml2_pmo_factory.c @@ -72,7 +72,6 @@ bool dml2_pmo_create(enum dml2_project_id project_id, struct dml2_pmo_instance * out->init_for_stutter = pmo_dcn4_fams2_init_for_stutter; out->test_for_stutter = pmo_dcn4_fams2_test_for_stutter; out->optimize_for_stutter = pmo_dcn4_fams2_optimize_for_stutter; - result = true; break; case dml2_project_invalid: diff --git a/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_top/dml2_top_interfaces.c b/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_top/dml2_top_interfaces.c new file mode 100644 index 000000000000..f88931ccbc5e --- /dev/null +++ b/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_top/dml2_top_interfaces.c @@ -0,0 +1,50 @@ +// SPDX-License-Identifier: MIT +// +// Copyright 2024 Advanced Micro Devices, Inc. + +#include "dml_top.h" +#include "dml2_internal_shared_types.h" +#include "dml2_top_soc15.h" + +unsigned int dml2_get_instance_size_bytes(void) +{ + return sizeof(struct dml2_instance); +} + +bool dml2_initialize_instance(struct dml2_initialize_instance_in_out *in_out) +{ + switch (in_out->options.project_id) { + case dml2_project_dcn4x_stage1: + case dml2_project_dcn4x_stage2: + case dml2_project_dcn4x_stage2_auto_drr_svp: + return dml2_top_soc15_initialize_instance(in_out); + case dml2_project_invalid: + default: + return false; + } +} + +bool dml2_check_mode_supported(struct dml2_check_mode_supported_in_out *in_out) +{ + if (!in_out->dml2_instance->funcs.check_mode_supported) + return false; + + return in_out->dml2_instance->funcs.check_mode_supported(in_out); +} + +bool dml2_build_mode_programming(struct dml2_build_mode_programming_in_out *in_out) +{ + if (!in_out->dml2_instance->funcs.build_mode_programming) + return false; + + return in_out->dml2_instance->funcs.build_mode_programming(in_out); +} + +bool dml2_build_mcache_programming(struct 
dml2_build_mcache_programming_in_out *in_out) +{ + if (!in_out->dml2_instance->funcs.build_mcache_programming) + return false; + + return in_out->dml2_instance->funcs.build_mcache_programming(in_out); +} + diff --git a/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_top/dml2_top_legacy.c b/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_top/dml2_top_legacy.c new file mode 100644 index 000000000000..5e14d85821e2 --- /dev/null +++ b/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_top/dml2_top_legacy.c @@ -0,0 +1,10 @@ +// SPDX-License-Identifier: MIT +// +// Copyright 2024 Advanced Micro Devices, Inc. + +#include "dml2_top_legacy.h" +#include "dml2_top_soc15.h" +#include "dml2_core_factory.h" +#include "dml2_pmo_factory.h" +#include "display_mode_core_structs.h" + diff --git a/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_top/dml2_top_legacy.h b/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_top/dml2_top_legacy.h new file mode 100644 index 000000000000..14d0ae03dce6 --- /dev/null +++ b/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_top/dml2_top_legacy.h @@ -0,0 +1,9 @@ +// SPDX-License-Identifier: MIT +// +// Copyright 2024 Advanced Micro Devices, Inc. + +#ifndef __DML2_TOP_LEGACY_H__ +#define __DML2_TOP_LEGACY_H__ +#include "dml2_internal_shared_types.h" +bool dml2_top_legacy_initialize_instance(struct dml2_initialize_instance_in_out *in_out); +#endif /* __DML2_TOP_LEGACY_H__ */ diff --git a/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_top/dml2_top_optimization.c b/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_top/dml2_top_optimization.c deleted file mode 100644 index d0e026d981b5..000000000000 --- a/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_top/dml2_top_optimization.c +++ /dev/null @@ -1,307 +0,0 @@ -// SPDX-License-Identifier: MIT -// -// Copyright 2024 Advanced Micro Devices, Inc. 
- -#include "dml2_top_optimization.h" -#include "dml2_internal_shared_types.h" -#include "dml_top_mcache.h" - -static void copy_display_configuration_with_meta(struct display_configuation_with_meta *dst, const struct display_configuation_with_meta *src) -{ - memcpy(dst, src, sizeof(struct display_configuation_with_meta)); -} - -bool dml2_top_optimization_init_function_min_clk_for_latency(const struct optimization_init_function_params *params) -{ - struct dml2_optimization_stage1_state *state = &params->display_config->stage1; - - state->performed = true; - - return true; -} - -bool dml2_top_optimization_test_function_min_clk_for_latency(const struct optimization_test_function_params *params) -{ - struct dml2_optimization_stage1_state *state = &params->display_config->stage1; - - return state->min_clk_index_for_latency == 0; -} - -bool dml2_top_optimization_optimize_function_min_clk_for_latency(const struct optimization_optimize_function_params *params) -{ - bool result = false; - - if (params->display_config->stage1.min_clk_index_for_latency > 0) { - copy_display_configuration_with_meta(params->optimized_display_config, params->display_config); - params->optimized_display_config->stage1.min_clk_index_for_latency--; - result = true; - } - - return result; -} - -bool dml2_top_optimization_test_function_mcache(const struct optimization_test_function_params *params) -{ - struct dml2_optimization_test_function_locals *l = params->locals; - bool mcache_success = false; - bool result = false; - - memset(l, 0, sizeof(struct dml2_optimization_test_function_locals)); - - l->test_mcache.calc_mcache_count_params.dml2_instance = params->dml; - l->test_mcache.calc_mcache_count_params.display_config = &params->display_config->display_config; - l->test_mcache.calc_mcache_count_params.mcache_allocations = params->display_config->stage2.mcache_allocations; - - result = dml2_top_mcache_calc_mcache_count_and_offsets(&l->test_mcache.calc_mcache_count_params); // use core to get the basic 
mcache_allocations - - if (result) { - l->test_mcache.assign_global_mcache_ids_params.allocations = params->display_config->stage2.mcache_allocations; - l->test_mcache.assign_global_mcache_ids_params.num_allocations = params->display_config->display_config.num_planes; - - dml2_top_mcache_assign_global_mcache_ids(&l->test_mcache.assign_global_mcache_ids_params); - - l->test_mcache.validate_admissibility_params.dml2_instance = params->dml; - l->test_mcache.validate_admissibility_params.display_cfg = &params->display_config->display_config; - l->test_mcache.validate_admissibility_params.mcache_allocations = params->display_config->stage2.mcache_allocations; - l->test_mcache.validate_admissibility_params.cfg_support_info = &params->display_config->mode_support_result.cfg_support_info; - - mcache_success = dml2_top_mcache_validate_admissability(&l->test_mcache.validate_admissibility_params); // also find the shift to make mcache allocation works - - memcpy(params->display_config->stage2.per_plane_mcache_support, l->test_mcache.validate_admissibility_params.per_plane_status, sizeof(bool) * DML2_MAX_PLANES); - } - - return mcache_success; -} - -bool dml2_top_optimization_optimize_function_mcache(const struct optimization_optimize_function_params *params) -{ - struct dml2_optimization_optimize_function_locals *l = params->locals; - bool optimize_success = false; - - if (params->last_candidate_supported == false) - return false; - - copy_display_configuration_with_meta(params->optimized_display_config, params->display_config); - - l->optimize_mcache.optimize_mcache_params.instance = &params->dml->pmo_instance; - l->optimize_mcache.optimize_mcache_params.dcc_mcache_supported = params->display_config->stage2.per_plane_mcache_support; - l->optimize_mcache.optimize_mcache_params.display_config = &params->display_config->display_config; - l->optimize_mcache.optimize_mcache_params.optimized_display_cfg = &params->optimized_display_config->display_config; - 
l->optimize_mcache.optimize_mcache_params.cfg_support_info = &params->optimized_display_config->mode_support_result.cfg_support_info; - - optimize_success = params->dml->pmo_instance.optimize_dcc_mcache(&l->optimize_mcache.optimize_mcache_params); - - return optimize_success; -} - -bool dml2_top_optimization_init_function_vmin(const struct optimization_init_function_params *params) -{ - struct dml2_optimization_init_function_locals *l = params->locals; - - l->vmin.init_params.instance = &params->dml->pmo_instance; - l->vmin.init_params.base_display_config = params->display_config; - return params->dml->pmo_instance.init_for_vmin(&l->vmin.init_params); -} - -bool dml2_top_optimization_test_function_vmin(const struct optimization_test_function_params *params) -{ - struct dml2_optimization_test_function_locals *l = params->locals; - - l->test_vmin.pmo_test_vmin_params.instance = &params->dml->pmo_instance; - l->test_vmin.pmo_test_vmin_params.display_config = params->display_config; - l->test_vmin.pmo_test_vmin_params.vmin_limits = &params->dml->soc_bbox.vmin_limit; - return params->dml->pmo_instance.test_for_vmin(&l->test_vmin.pmo_test_vmin_params); -} - -bool dml2_top_optimization_optimize_function_vmin(const struct optimization_optimize_function_params *params) -{ - struct dml2_optimization_optimize_function_locals *l = params->locals; - - if (params->last_candidate_supported == false) - return false; - - l->optimize_vmin.pmo_optimize_vmin_params.instance = &params->dml->pmo_instance; - l->optimize_vmin.pmo_optimize_vmin_params.base_display_config = params->display_config; - l->optimize_vmin.pmo_optimize_vmin_params.optimized_display_config = params->optimized_display_config; - return params->dml->pmo_instance.optimize_for_vmin(&l->optimize_vmin.pmo_optimize_vmin_params); -} - -bool dml2_top_optimization_perform_optimization_phase(struct dml2_optimization_phase_locals *l, const struct optimization_phase_params *params) -{ - bool test_passed = false; - bool optimize_succeeded = true; - 
bool candidate_validation_passed = true; - struct optimization_init_function_params init_params = { 0 }; - struct optimization_test_function_params test_params = { 0 }; - struct optimization_optimize_function_params optimize_params = { 0 }; - - if (!params->dml || - !params->optimize_function || - !params->test_function || - !params->display_config || - !params->optimized_display_config) - return false; - - copy_display_configuration_with_meta(&l->cur_candidate_display_cfg, params->display_config); - - init_params.locals = &l->init_function_locals; - init_params.dml = params->dml; - init_params.display_config = &l->cur_candidate_display_cfg; - - if (params->init_function && !params->init_function(&init_params)) - return false; - - test_params.locals = &l->test_function_locals; - test_params.dml = params->dml; - test_params.display_config = &l->cur_candidate_display_cfg; - - test_passed = params->test_function(&test_params); - - while (!test_passed && optimize_succeeded) { - memset(&optimize_params, 0, sizeof(struct optimization_optimize_function_params)); - - optimize_params.locals = &l->optimize_function_locals; - optimize_params.dml = params->dml; - optimize_params.display_config = &l->cur_candidate_display_cfg; - optimize_params.optimized_display_config = &l->next_candidate_display_cfg; - optimize_params.last_candidate_supported = candidate_validation_passed; - - optimize_succeeded = params->optimize_function(&optimize_params); - - if (optimize_succeeded) { - l->mode_support_params.instance = &params->dml->core_instance; - l->mode_support_params.display_cfg = &l->next_candidate_display_cfg; - l->mode_support_params.min_clk_table = &params->dml->min_clk_table; - - if (l->next_candidate_display_cfg.stage3.performed) - l->mode_support_params.min_clk_index = l->next_candidate_display_cfg.stage3.min_clk_index_for_latency; - else - l->mode_support_params.min_clk_index = l->next_candidate_display_cfg.stage1.min_clk_index_for_latency; - - candidate_validation_passed = 
params->dml->core_instance.mode_support(&l->mode_support_params); - - l->next_candidate_display_cfg.mode_support_result = l->mode_support_params.mode_support_result; - } - - if (optimize_succeeded && candidate_validation_passed) { - memset(&test_params, 0, sizeof(struct optimization_test_function_params)); - test_params.locals = &l->test_function_locals; - test_params.dml = params->dml; - test_params.display_config = &l->next_candidate_display_cfg; - test_passed = params->test_function(&test_params); - - copy_display_configuration_with_meta(&l->cur_candidate_display_cfg, &l->next_candidate_display_cfg); - - // If optimization is not all or nothing, then store partial progress in output - if (!params->all_or_nothing) - copy_display_configuration_with_meta(params->optimized_display_config, &l->next_candidate_display_cfg); - } - } - - if (test_passed) - copy_display_configuration_with_meta(params->optimized_display_config, &l->cur_candidate_display_cfg); - - return test_passed; -} - -bool dml2_top_optimization_perform_optimization_phase_1(struct dml2_optimization_phase_locals *l, const struct optimization_phase_params *params) -{ - int highest_state, lowest_state, cur_state; - bool supported = false; - - if (!params->dml || - !params->optimize_function || - !params->test_function || - !params->display_config || - !params->optimized_display_config) - return false; - - copy_display_configuration_with_meta(&l->cur_candidate_display_cfg, params->display_config); - highest_state = l->cur_candidate_display_cfg.stage1.min_clk_index_for_latency; - lowest_state = 0; - - while (highest_state > lowest_state) { - cur_state = (highest_state + lowest_state) / 2; - - l->mode_support_params.instance = &params->dml->core_instance; - l->mode_support_params.display_cfg = &l->cur_candidate_display_cfg; - l->mode_support_params.min_clk_table = &params->dml->min_clk_table; - l->mode_support_params.min_clk_index = cur_state; - - supported = 
params->dml->core_instance.mode_support(&l->mode_support_params); - - if (supported) { - l->cur_candidate_display_cfg.mode_support_result = l->mode_support_params.mode_support_result; - highest_state = cur_state; - } else { - lowest_state = cur_state + 1; - } - } - l->cur_candidate_display_cfg.stage1.min_clk_index_for_latency = lowest_state; - - copy_display_configuration_with_meta(params->optimized_display_config, &l->cur_candidate_display_cfg); - - return true; -} - -bool dml2_top_optimization_init_function_uclk_pstate(const struct optimization_init_function_params *params) -{ - struct dml2_optimization_init_function_locals *l = params->locals; - - l->uclk_pstate.init_params.instance = &params->dml->pmo_instance; - l->uclk_pstate.init_params.base_display_config = params->display_config; - - return params->dml->pmo_instance.init_for_uclk_pstate(&l->uclk_pstate.init_params); -} - -bool dml2_top_optimization_test_function_uclk_pstate(const struct optimization_test_function_params *params) -{ - struct dml2_optimization_test_function_locals *l = params->locals; - - l->uclk_pstate.test_params.instance = &params->dml->pmo_instance; - l->uclk_pstate.test_params.base_display_config = params->display_config; - - return params->dml->pmo_instance.test_for_uclk_pstate(&l->uclk_pstate.test_params); -} - -bool dml2_top_optimization_optimize_function_uclk_pstate(const struct optimization_optimize_function_params *params) -{ - struct dml2_optimization_optimize_function_locals *l = params->locals; - - l->uclk_pstate.optimize_params.instance = &params->dml->pmo_instance; - l->uclk_pstate.optimize_params.base_display_config = params->display_config; - l->uclk_pstate.optimize_params.optimized_display_config = params->optimized_display_config; - l->uclk_pstate.optimize_params.last_candidate_failed = !params->last_candidate_supported; - - return params->dml->pmo_instance.optimize_for_uclk_pstate(&l->uclk_pstate.optimize_params); -} - -bool dml2_top_optimization_init_function_stutter(const struct 
optimization_init_function_params *params) -{ - struct dml2_optimization_init_function_locals *l = params->locals; - - l->uclk_pstate.init_params.instance = &params->dml->pmo_instance; - l->uclk_pstate.init_params.base_display_config = params->display_config; - - return params->dml->pmo_instance.init_for_stutter(&l->stutter.stutter_params); -} - -bool dml2_top_optimization_test_function_stutter(const struct optimization_test_function_params *params) -{ - struct dml2_optimization_test_function_locals *l = params->locals; - - l->stutter.stutter_params.instance = &params->dml->pmo_instance; - l->stutter.stutter_params.base_display_config = params->display_config; - return params->dml->pmo_instance.test_for_stutter(&l->stutter.stutter_params); -} - -bool dml2_top_optimization_optimize_function_stutter(const struct optimization_optimize_function_params *params) -{ - struct dml2_optimization_optimize_function_locals *l = params->locals; - - l->stutter.stutter_params.instance = &params->dml->pmo_instance; - l->stutter.stutter_params.base_display_config = params->display_config; - l->stutter.stutter_params.optimized_display_config = params->optimized_display_config; - l->stutter.stutter_params.last_candidate_failed = !params->last_candidate_supported; - return params->dml->pmo_instance.optimize_for_stutter(&l->stutter.stutter_params); -} diff --git a/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_top/dml2_top_optimization.h b/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_top/dml2_top_optimization.h deleted file mode 100644 index 9f22ab33eab1..000000000000 --- a/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_top/dml2_top_optimization.h +++ /dev/null @@ -1,33 +0,0 @@ -// SPDX-License-Identifier: MIT -// -// Copyright 2024 Advanced Micro Devices, Inc. 
- -#ifndef __DML2_TOP_OPTIMIZATION_H__ -#define __DML2_TOP_OPTIMIZATION_H__ - -#include "dml2_external_lib_deps.h" -#include "dml2_internal_shared_types.h" - -bool dml2_top_optimization_perform_optimization_phase(struct dml2_optimization_phase_locals *l, const struct optimization_phase_params *params); -bool dml2_top_optimization_perform_optimization_phase_1(struct dml2_optimization_phase_locals *l, const struct optimization_phase_params *params); - -bool dml2_top_optimization_init_function_min_clk_for_latency(const struct optimization_init_function_params *params); -bool dml2_top_optimization_test_function_min_clk_for_latency(const struct optimization_test_function_params *params); -bool dml2_top_optimization_optimize_function_min_clk_for_latency(const struct optimization_optimize_function_params *params); - -bool dml2_top_optimization_test_function_mcache(const struct optimization_test_function_params *params); -bool dml2_top_optimization_optimize_function_mcache(const struct optimization_optimize_function_params *params); - -bool dml2_top_optimization_init_function_uclk_pstate(const struct optimization_init_function_params *params); -bool dml2_top_optimization_test_function_uclk_pstate(const struct optimization_test_function_params *params); -bool dml2_top_optimization_optimize_function_uclk_pstate(const struct optimization_optimize_function_params *params); - -bool dml2_top_optimization_init_function_vmin(const struct optimization_init_function_params *params); -bool dml2_top_optimization_test_function_vmin(const struct optimization_test_function_params *params); -bool dml2_top_optimization_optimize_function_vmin(const struct optimization_optimize_function_params *params); - -bool dml2_top_optimization_init_function_stutter(const struct optimization_init_function_params *params); -bool dml2_top_optimization_test_function_stutter(const struct optimization_test_function_params *params); -bool dml2_top_optimization_optimize_function_stutter(const struct 
optimization_optimize_function_params *params); - -#endif diff --git a/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_top/dml2_top_soc15.c b/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_top/dml2_top_soc15.c new file mode 100644 index 000000000000..a8f58f8448e4 --- /dev/null +++ b/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_top/dml2_top_soc15.c @@ -0,0 +1,1178 @@ +// SPDX-License-Identifier: MIT +// +// Copyright 2024 Advanced Micro Devices, Inc. + +#include "dml2_top_soc15.h" +#include "dml2_mcg_factory.h" +#include "dml2_dpmm_factory.h" +#include "dml2_core_factory.h" +#include "dml2_pmo_factory.h" +#include "lib_float_math.h" +#include "dml2_debug.h" +static void setup_unoptimized_display_config_with_meta(const struct dml2_instance *dml, struct display_configuation_with_meta *out, const struct dml2_display_cfg *display_config) +{ + memcpy(&out->display_config, display_config, sizeof(struct dml2_display_cfg)); + out->stage1.min_clk_index_for_latency = dml->min_clk_table.dram_bw_table.num_entries - 1; //dml->min_clk_table.clean_me_up.soc_bb.num_states - 1; +} + +static void setup_speculative_display_config_with_meta(const struct dml2_instance *dml, struct display_configuation_with_meta *out, const struct dml2_display_cfg *display_config) +{ + memcpy(&out->display_config, display_config, sizeof(struct dml2_display_cfg)); + out->stage1.min_clk_index_for_latency = 0; +} + +static void copy_display_configuration_with_meta(struct display_configuation_with_meta *dst, const struct display_configuation_with_meta *src) +{ + memcpy(dst, src, sizeof(struct display_configuation_with_meta)); +} + +static bool dml2_top_optimization_init_function_min_clk_for_latency(const struct optimization_init_function_params *params) +{ + struct dml2_optimization_stage1_state *state = &params->display_config->stage1; + + state->performed = true; + + return true; +} + +static bool dml2_top_optimization_test_function_min_clk_for_latency(const struct 
optimization_test_function_params *params) +{ + struct dml2_optimization_stage1_state *state = &params->display_config->stage1; + + return state->min_clk_index_for_latency == 0; +} + +static bool dml2_top_optimization_optimize_function_min_clk_for_latency(const struct optimization_optimize_function_params *params) +{ + bool result = false; + + if (params->display_config->stage1.min_clk_index_for_latency > 0) { + copy_display_configuration_with_meta(params->optimized_display_config, params->display_config); + params->optimized_display_config->stage1.min_clk_index_for_latency--; + result = true; + } + + return result; +} + +static bool dml2_top_optimization_test_function_mcache(const struct optimization_test_function_params *params) +{ + struct dml2_optimization_test_function_locals *l = params->locals; + bool mcache_success = false; + bool result = false; + + memset(l, 0, sizeof(struct dml2_optimization_test_function_locals)); + + l->test_mcache.calc_mcache_count_params.dml2_instance = params->dml; + l->test_mcache.calc_mcache_count_params.display_config = &params->display_config->display_config; + l->test_mcache.calc_mcache_count_params.mcache_allocations = params->display_config->stage2.mcache_allocations; + + result = dml2_top_mcache_calc_mcache_count_and_offsets(&l->test_mcache.calc_mcache_count_params); // use core to get the basic mcache_allocations + + if (result) { + l->test_mcache.assign_global_mcache_ids_params.allocations = params->display_config->stage2.mcache_allocations; + l->test_mcache.assign_global_mcache_ids_params.num_allocations = params->display_config->display_config.num_planes; + + dml2_top_mcache_assign_global_mcache_ids(&l->test_mcache.assign_global_mcache_ids_params); + + l->test_mcache.validate_admissibility_params.dml2_instance = params->dml; + l->test_mcache.validate_admissibility_params.display_cfg = &params->display_config->display_config; + l->test_mcache.validate_admissibility_params.mcache_allocations = 
params->display_config->stage2.mcache_allocations; + l->test_mcache.validate_admissibility_params.cfg_support_info = &params->display_config->mode_support_result.cfg_support_info; + + mcache_success = dml2_top_mcache_validate_admissability(&l->test_mcache.validate_admissibility_params); // also find the shift to make mcache allocation works + + memcpy(params->display_config->stage2.per_plane_mcache_support, l->test_mcache.validate_admissibility_params.per_plane_status, sizeof(bool) * DML2_MAX_PLANES); + } + + return mcache_success; +} + +static bool dml2_top_optimization_optimize_function_mcache(const struct optimization_optimize_function_params *params) +{ + struct dml2_optimization_optimize_function_locals *l = params->locals; + bool optimize_success = false; + + if (params->last_candidate_supported == false) + return false; + + copy_display_configuration_with_meta(params->optimized_display_config, params->display_config); + + l->optimize_mcache.optimize_mcache_params.instance = &params->dml->pmo_instance; + l->optimize_mcache.optimize_mcache_params.dcc_mcache_supported = params->display_config->stage2.per_plane_mcache_support; + l->optimize_mcache.optimize_mcache_params.display_config = &params->display_config->display_config; + l->optimize_mcache.optimize_mcache_params.optimized_display_cfg = &params->optimized_display_config->display_config; + l->optimize_mcache.optimize_mcache_params.cfg_support_info = &params->optimized_display_config->mode_support_result.cfg_support_info; + + optimize_success = params->dml->pmo_instance.optimize_dcc_mcache(&l->optimize_mcache.optimize_mcache_params); + + return optimize_success; +} + +static bool dml2_top_optimization_init_function_vmin(const struct optimization_init_function_params *params) +{ + struct dml2_optimization_init_function_locals *l = params->locals; + + l->vmin.init_params.instance = &params->dml->pmo_instance; + l->vmin.init_params.base_display_config = params->display_config; + return 
params->dml->pmo_instance.init_for_vmin(&l->vmin.init_params); +} + +static bool dml2_top_optimization_test_function_vmin(const struct optimization_test_function_params *params) +{ + struct dml2_optimization_test_function_locals *l = params->locals; + + l->test_vmin.pmo_test_vmin_params.instance = &params->dml->pmo_instance; + l->test_vmin.pmo_test_vmin_params.display_config = params->display_config; + l->test_vmin.pmo_test_vmin_params.vmin_limits = &params->dml->soc_bbox.vmin_limit; + return params->dml->pmo_instance.test_for_vmin(&l->test_vmin.pmo_test_vmin_params); +} + +static bool dml2_top_optimization_optimize_function_vmin(const struct optimization_optimize_function_params *params) +{ + struct dml2_optimization_optimize_function_locals *l = params->locals; + + if (params->last_candidate_supported == false) + return false; + + l->optimize_vmin.pmo_optimize_vmin_params.instance = &params->dml->pmo_instance; + l->optimize_vmin.pmo_optimize_vmin_params.base_display_config = params->display_config; + l->optimize_vmin.pmo_optimize_vmin_params.optimized_display_config = params->optimized_display_config; + return params->dml->pmo_instance.optimize_for_vmin(&l->optimize_vmin.pmo_optimize_vmin_params); +} + +static bool dml2_top_optimization_init_function_uclk_pstate(const struct optimization_init_function_params *params) +{ + struct dml2_optimization_init_function_locals *l = params->locals; + + l->uclk_pstate.init_params.instance = &params->dml->pmo_instance; + l->uclk_pstate.init_params.base_display_config = params->display_config; + + return params->dml->pmo_instance.init_for_uclk_pstate(&l->uclk_pstate.init_params); +} + +static bool dml2_top_optimization_test_function_uclk_pstate(const struct optimization_test_function_params *params) +{ + struct dml2_optimization_test_function_locals *l = params->locals; + + l->uclk_pstate.test_params.instance = &params->dml->pmo_instance; + l->uclk_pstate.test_params.base_display_config = params->display_config; + + return 
params->dml->pmo_instance.test_for_uclk_pstate(&l->uclk_pstate.test_params); +} + +static bool dml2_top_optimization_optimize_function_uclk_pstate(const struct optimization_optimize_function_params *params) +{ + struct dml2_optimization_optimize_function_locals *l = params->locals; + + l->uclk_pstate.optimize_params.instance = &params->dml->pmo_instance; + l->uclk_pstate.optimize_params.base_display_config = params->display_config; + l->uclk_pstate.optimize_params.optimized_display_config = params->optimized_display_config; + l->uclk_pstate.optimize_params.last_candidate_failed = !params->last_candidate_supported; + + return params->dml->pmo_instance.optimize_for_uclk_pstate(&l->uclk_pstate.optimize_params); +} + +static bool dml2_top_optimization_init_function_stutter(const struct optimization_init_function_params *params) +{ + struct dml2_optimization_init_function_locals *l = params->locals; + + l->uclk_pstate.init_params.instance = &params->dml->pmo_instance; + l->uclk_pstate.init_params.base_display_config = params->display_config; + + return params->dml->pmo_instance.init_for_stutter(&l->stutter.stutter_params); +} + +static bool dml2_top_optimization_test_function_stutter(const struct optimization_test_function_params *params) +{ + struct dml2_optimization_test_function_locals *l = params->locals; + + l->stutter.stutter_params.instance = &params->dml->pmo_instance; + l->stutter.stutter_params.base_display_config = params->display_config; + return params->dml->pmo_instance.test_for_stutter(&l->stutter.stutter_params); +} + +static bool dml2_top_optimization_optimize_function_stutter(const struct optimization_optimize_function_params *params) +{ + struct dml2_optimization_optimize_function_locals *l = params->locals; + + l->stutter.stutter_params.instance = &params->dml->pmo_instance; + l->stutter.stutter_params.base_display_config = params->display_config; + l->stutter.stutter_params.optimized_display_config = params->optimized_display_config; + 
l->stutter.stutter_params.last_candidate_failed = !params->last_candidate_supported; + return params->dml->pmo_instance.optimize_for_stutter(&l->stutter.stutter_params); +} + +static bool dml2_top_optimization_perform_optimization_phase(struct dml2_optimization_phase_locals *l, const struct optimization_phase_params *params) +{ + bool test_passed = false; + bool optimize_succeeded = true; + bool candidate_validation_passed = true; + struct optimization_init_function_params init_params = { 0 }; + struct optimization_test_function_params test_params = { 0 }; + struct optimization_optimize_function_params optimize_params = { 0 }; + + if (!params->dml || + !params->optimize_function || + !params->test_function || + !params->display_config || + !params->optimized_display_config) + return false; + + copy_display_configuration_with_meta(&l->cur_candidate_display_cfg, params->display_config); + + init_params.locals = &l->init_function_locals; + init_params.dml = params->dml; + init_params.display_config = &l->cur_candidate_display_cfg; + + if (params->init_function && !params->init_function(&init_params)) + return false; + + test_params.locals = &l->test_function_locals; + test_params.dml = params->dml; + test_params.display_config = &l->cur_candidate_display_cfg; + + test_passed = params->test_function(&test_params); + + while (!test_passed && optimize_succeeded) { + memset(&optimize_params, 0, sizeof(struct optimization_optimize_function_params)); + + optimize_params.locals = &l->optimize_function_locals; + optimize_params.dml = params->dml; + optimize_params.display_config = &l->cur_candidate_display_cfg; + optimize_params.optimized_display_config = &l->next_candidate_display_cfg; + optimize_params.last_candidate_supported = candidate_validation_passed; + + optimize_succeeded = params->optimize_function(&optimize_params); + + if (optimize_succeeded) { + l->mode_support_params.instance = &params->dml->core_instance; + l->mode_support_params.display_cfg = 
&l->next_candidate_display_cfg; + l->mode_support_params.min_clk_table = &params->dml->min_clk_table; + + if (l->next_candidate_display_cfg.stage3.performed) + l->mode_support_params.min_clk_index = l->next_candidate_display_cfg.stage3.min_clk_index_for_latency; + else + l->mode_support_params.min_clk_index = l->next_candidate_display_cfg.stage1.min_clk_index_for_latency; + candidate_validation_passed = params->dml->core_instance.mode_support(&l->mode_support_params); + l->next_candidate_display_cfg.mode_support_result = l->mode_support_params.mode_support_result; + } + + if (optimize_succeeded && candidate_validation_passed) { + memset(&test_params, 0, sizeof(struct optimization_test_function_params)); + test_params.locals = &l->test_function_locals; + test_params.dml = params->dml; + test_params.display_config = &l->next_candidate_display_cfg; + test_passed = params->test_function(&test_params); + + copy_display_configuration_with_meta(&l->cur_candidate_display_cfg, &l->next_candidate_display_cfg); + + // If optimization is not all or nothing, then store partial progress in output + if (!params->all_or_nothing) + copy_display_configuration_with_meta(params->optimized_display_config, &l->next_candidate_display_cfg); + } + } + + if (test_passed) + copy_display_configuration_with_meta(params->optimized_display_config, &l->cur_candidate_display_cfg); + + return test_passed; +} + +static bool dml2_top_optimization_perform_optimization_phase_1(struct dml2_optimization_phase_locals *l, const struct optimization_phase_params *params) +{ + int highest_state, lowest_state, cur_state; + bool supported = false; + + if (!params->dml || + !params->optimize_function || + !params->test_function || + !params->display_config || + !params->optimized_display_config) + return false; + + copy_display_configuration_with_meta(&l->cur_candidate_display_cfg, params->display_config); + highest_state = l->cur_candidate_display_cfg.stage1.min_clk_index_for_latency; + lowest_state = 0; + + while 
(highest_state > lowest_state) { + cur_state = (highest_state + lowest_state) / 2; + + l->mode_support_params.instance = &params->dml->core_instance; + l->mode_support_params.display_cfg = &l->cur_candidate_display_cfg; + l->mode_support_params.min_clk_table = &params->dml->min_clk_table; + l->mode_support_params.min_clk_index = cur_state; + supported = params->dml->core_instance.mode_support(&l->mode_support_params); + + if (supported) { + l->cur_candidate_display_cfg.mode_support_result = l->mode_support_params.mode_support_result; + highest_state = cur_state; + } else { + lowest_state = cur_state + 1; + } + } + l->cur_candidate_display_cfg.stage1.min_clk_index_for_latency = lowest_state; + + copy_display_configuration_with_meta(params->optimized_display_config, &l->cur_candidate_display_cfg); + + return true; +} + +/* +* Takes an input set of mcache boundaries and finds the appropriate setting of cache programming. +* Returns true if a valid set of programming can be made, and false otherwise. "Valid" means +* that the horizontal viewport does not span more than 2 cache slices. +* +* It optionally also can apply a constant shift to all the cache boundaries. 
+*/ +static const uint32_t MCACHE_ID_UNASSIGNED = 0xF; +static const uint32_t SPLIT_LOCATION_UNDEFINED = 0xFFFF; + +static bool calculate_first_second_splitting(const int *mcache_boundaries, int num_boundaries, int shift, + int pipe_h_vp_start, int pipe_h_vp_end, int *first_offset, int *second_offset) +{ + const int MAX_VP = 0xFFFFFF; + int left_cache_id; + int right_cache_id; + int range_start; + int range_end; + bool success = false; + + if (num_boundaries <= 1) { + if (first_offset && second_offset) { + *first_offset = 0; + *second_offset = -1; + } + success = true; + return success; + } else { + range_start = 0; + for (left_cache_id = 0; left_cache_id < num_boundaries; left_cache_id++) { + range_end = mcache_boundaries[left_cache_id] - shift - 1; + + if (range_start <= pipe_h_vp_start && pipe_h_vp_start <= range_end) + break; + + range_start = range_end + 1; + } + + range_end = MAX_VP; + for (right_cache_id = num_boundaries - 1; right_cache_id >= -1; right_cache_id--) { + if (right_cache_id >= 0) + range_start = mcache_boundaries[right_cache_id] - shift; + else + range_start = 0; + + if (range_start <= pipe_h_vp_end && pipe_h_vp_end <= range_end) { + break; + } + range_end = range_start - 1; + } + right_cache_id = (right_cache_id + 1) % num_boundaries; + + if (right_cache_id == left_cache_id) { + if (first_offset && second_offset) { + *first_offset = left_cache_id; + *second_offset = -1; + } + success = true; + } else if (right_cache_id == (left_cache_id + 1) % num_boundaries) { + if (first_offset && second_offset) { + *first_offset = left_cache_id; + *second_offset = right_cache_id; + } + success = true; + } + } + + return success; +} + +/* +* For a given set of pipe start/end x positions, checks to see it can support the input mcache splitting. +* It also attempts to "optimize" by finding a shift if the default 0 shift does not work. 
+*/ +static bool find_shift_for_valid_cache_id_assignment(int *mcache_boundaries, unsigned int num_boundaries, + int *pipe_vp_startx, int *pipe_vp_endx, unsigned int pipe_count, int shift_granularity, int *shift) +{ + int max_shift = 0xFFFF; + unsigned int pipe_index; + unsigned int i, slice_width; + bool success = false; + + for (i = 0; i < num_boundaries; i++) { + if (i == 0) + slice_width = mcache_boundaries[i]; + else + slice_width = mcache_boundaries[i] - mcache_boundaries[i - 1]; + + if (max_shift > (int)slice_width) { + max_shift = slice_width; + } + } + + for (*shift = 0; *shift <= max_shift; *shift += shift_granularity) { + success = true; + for (pipe_index = 0; pipe_index < pipe_count; pipe_index++) { + if (!calculate_first_second_splitting(mcache_boundaries, num_boundaries, *shift, + pipe_vp_startx[pipe_index], pipe_vp_endx[pipe_index], 0, 0)) { + success = false; + break; + } + } + if (success) + break; + } + + return success; +} + +/* +* Counts the number of elements inside input array within the given span length. +* Formally, what is the size of the largest subset of the array where the largest and smallest element +* differ no more than the span. +*/ +static unsigned int count_elements_in_span(int *array, unsigned int array_size, unsigned int span) +{ + unsigned int i; + unsigned int span_start_value; + unsigned int span_start_index; + unsigned int greatest_element_count; + + if (array_size == 0) + return 1; + + if (span == 0) + return array_size > 0 ? 
1 : 0; + + span_start_value = 0; + span_start_index = 0; + greatest_element_count = 0; + + while (span_start_index < array_size) { + for (i = span_start_index; i < array_size; i++) { + if (array[i] - span_start_value <= span) { + if (i - span_start_index + 1 > greatest_element_count) { + greatest_element_count = i - span_start_index + 1; + } + } else + break; + } + + span_start_index++; + + if (span_start_index < array_size) { + span_start_value = array[span_start_index - 1] + 1; + } + } + + return greatest_element_count; +} + +static bool calculate_h_split_for_scaling_transform(int full_vp_width, int h_active, int num_pipes, + enum dml2_scaling_transform scaling_transform, int *pipe_vp_x_start, int *pipe_vp_x_end) +{ + int i, slice_width; + const char MAX_SCL_VP_OVERLAP = 3; + bool success = false; + + switch (scaling_transform) { + case dml2_scaling_transform_centered: + case dml2_scaling_transform_aspect_ratio: + case dml2_scaling_transform_fullscreen: + slice_width = full_vp_width / num_pipes; + for (i = 0; i < num_pipes; i++) { + pipe_vp_x_start[i] = i * slice_width; + pipe_vp_x_end[i] = (i + 1) * slice_width - 1; + + if (pipe_vp_x_start[i] < MAX_SCL_VP_OVERLAP) + pipe_vp_x_start[i] = 0; + else + pipe_vp_x_start[i] -= MAX_SCL_VP_OVERLAP; + + if (pipe_vp_x_end[i] > full_vp_width - MAX_SCL_VP_OVERLAP - 1) + pipe_vp_x_end[i] = full_vp_width - 1; + else + pipe_vp_x_end[i] += MAX_SCL_VP_OVERLAP; + } + break; + case dml2_scaling_transform_explicit: + default: + success = false; + break; + } + + return success; +} + +bool dml2_top_mcache_validate_admissability(struct top_mcache_validate_admissability_in_out *params) +{ + struct dml2_instance *dml = (struct dml2_instance *)params->dml2_instance; + struct dml2_top_mcache_validate_admissability_locals *l = &dml->scratch.mcache_validate_admissability_locals; + + const int MAX_PIXEL_OVERLAP = 6; + int max_per_pipe_vp_p0 = 0; + int max_per_pipe_vp_p1 = 0; + int temp, p0shift, p1shift; + unsigned int plane_index = 0; + 
unsigned int i; + unsigned int odm_combine_factor; + unsigned int mpc_combine_factor; + unsigned int num_dpps; + unsigned int num_boundaries; + enum dml2_scaling_transform scaling_transform; + const struct dml2_plane_parameters *plane; + const struct dml2_stream_parameters *stream; + + bool p0pass = false; + bool p1pass = false; + bool all_pass = true; + + for (plane_index = 0; plane_index < params->display_cfg->num_planes; plane_index++) { + if (!params->display_cfg->plane_descriptors[plane_index].surface.dcc.enable) + continue; + + plane = &params->display_cfg->plane_descriptors[plane_index]; + stream = &params->display_cfg->stream_descriptors[plane->stream_index]; + + num_dpps = odm_combine_factor = params->cfg_support_info->stream_support_info[plane->stream_index].odms_used; + + if (odm_combine_factor == 1) + num_dpps = mpc_combine_factor = (unsigned int)params->cfg_support_info->plane_support_info[plane_index].dpps_used; + else + mpc_combine_factor = 1; + + if (odm_combine_factor > 1) { + max_per_pipe_vp_p0 = plane->surface.plane0.width; + temp = (unsigned int)math_ceil(plane->composition.scaler_info.plane0.h_ratio * stream->timing.h_active / odm_combine_factor); + + if (temp < max_per_pipe_vp_p0) + max_per_pipe_vp_p0 = temp; + + max_per_pipe_vp_p1 = plane->surface.plane1.width; + temp = (unsigned int)math_ceil(plane->composition.scaler_info.plane1.h_ratio * stream->timing.h_active / odm_combine_factor); + + if (temp < max_per_pipe_vp_p1) + max_per_pipe_vp_p1 = temp; + } else { + max_per_pipe_vp_p0 = plane->surface.plane0.width / mpc_combine_factor; + max_per_pipe_vp_p1 = plane->surface.plane1.width / mpc_combine_factor; + } + + max_per_pipe_vp_p0 += 2 * MAX_PIXEL_OVERLAP; + max_per_pipe_vp_p1 += MAX_PIXEL_OVERLAP; + + p0shift = 0; + p1shift = 0; + + // The last element in the unshifted boundary array will always be the first pixel outside the + // plane, which means theres no mcache associated with it, so -1 + num_boundaries =
params->mcache_allocations[plane_index].num_mcaches_plane0 == 0 ? 0 : params->mcache_allocations[plane_index].num_mcaches_plane0 - 1; + if ((count_elements_in_span(params->mcache_allocations[plane_index].mcache_x_offsets_plane0, + num_boundaries, max_per_pipe_vp_p0) <= 1) && (num_boundaries <= num_dpps)) { + p0pass = true; + } + num_boundaries = params->mcache_allocations[plane_index].num_mcaches_plane1 == 0 ? 0 : params->mcache_allocations[plane_index].num_mcaches_plane1 - 1; + if ((count_elements_in_span(params->mcache_allocations[plane_index].mcache_x_offsets_plane1, + num_boundaries, max_per_pipe_vp_p1) <= 1) && (num_boundaries <= num_dpps)) { + p1pass = true; + } + + if (!p0pass || !p1pass) { + if (odm_combine_factor > 1) { + num_dpps = odm_combine_factor; + scaling_transform = plane->composition.scaling_transform; + } else { + num_dpps = mpc_combine_factor; + scaling_transform = dml2_scaling_transform_fullscreen; + } + + if (!p0pass) { + if (plane->composition.viewport.stationary) { + calculate_h_split_for_scaling_transform(plane->surface.plane0.width, + stream->timing.h_active, num_dpps, scaling_transform, + &l->plane0.pipe_vp_startx[plane_index], &l->plane0.pipe_vp_endx[plane_index]); + p0pass = find_shift_for_valid_cache_id_assignment(params->mcache_allocations[plane_index].mcache_x_offsets_plane0, + params->mcache_allocations[plane_index].num_mcaches_plane0, + &l->plane0.pipe_vp_startx[plane_index], &l->plane0.pipe_vp_endx[plane_index], num_dpps, + params->mcache_allocations[plane_index].shift_granularity.p0, &p0shift); + } + } + if (!p1pass) { + if (plane->composition.viewport.stationary) { + calculate_h_split_for_scaling_transform(plane->surface.plane1.width, + stream->timing.h_active, num_dpps, scaling_transform, + &l->plane0.pipe_vp_startx[plane_index], &l->plane0.pipe_vp_endx[plane_index]); + p1pass = find_shift_for_valid_cache_id_assignment(params->mcache_allocations[plane_index].mcache_x_offsets_plane1, + 
params->mcache_allocations[plane_index].num_mcaches_plane1, + &l->plane1.pipe_vp_startx[plane_index], &l->plane1.pipe_vp_endx[plane_index], num_dpps, + params->mcache_allocations[plane_index].shift_granularity.p1, &p1shift); + } + } + } + + if (p0pass && p1pass) { + for (i = 0; i < params->mcache_allocations[plane_index].num_mcaches_plane0; i++) { + params->mcache_allocations[plane_index].mcache_x_offsets_plane0[i] -= p0shift; + } + for (i = 0; i < params->mcache_allocations[plane_index].num_mcaches_plane1; i++) { + params->mcache_allocations[plane_index].mcache_x_offsets_plane1[i] -= p1shift; + } + } + + params->per_plane_status[plane_index] = p0pass && p1pass; + all_pass &= p0pass && p1pass; + } + + return all_pass; +} + +static void reset_mcache_allocations(struct dml2_hubp_pipe_mcache_regs *per_plane_pipe_mcache_regs) +{ + // Initialize all entries to special valid MCache ID and special valid split coordinate + per_plane_pipe_mcache_regs->main.p0.mcache_id_first = MCACHE_ID_UNASSIGNED; + per_plane_pipe_mcache_regs->main.p0.mcache_id_second = MCACHE_ID_UNASSIGNED; + per_plane_pipe_mcache_regs->main.p0.split_location = SPLIT_LOCATION_UNDEFINED; + + per_plane_pipe_mcache_regs->mall.p0.mcache_id_first = MCACHE_ID_UNASSIGNED; + per_plane_pipe_mcache_regs->mall.p0.mcache_id_second = MCACHE_ID_UNASSIGNED; + per_plane_pipe_mcache_regs->mall.p0.split_location = SPLIT_LOCATION_UNDEFINED; + + per_plane_pipe_mcache_regs->main.p1.mcache_id_first = MCACHE_ID_UNASSIGNED; + per_plane_pipe_mcache_regs->main.p1.mcache_id_second = MCACHE_ID_UNASSIGNED; + per_plane_pipe_mcache_regs->main.p1.split_location = SPLIT_LOCATION_UNDEFINED; + + per_plane_pipe_mcache_regs->mall.p1.mcache_id_first = MCACHE_ID_UNASSIGNED; + per_plane_pipe_mcache_regs->mall.p1.mcache_id_second = MCACHE_ID_UNASSIGNED; + per_plane_pipe_mcache_regs->mall.p1.split_location = SPLIT_LOCATION_UNDEFINED; +} + +void dml2_top_mcache_assign_global_mcache_ids(struct top_mcache_assign_global_mcache_ids_in_out *params) +{ 
+ int i; + unsigned int j; + int next_unused_cache_id = 0; + + for (i = 0; i < params->num_allocations; i++) { + if (!params->allocations[i].valid) + continue; + + for (j = 0; j < params->allocations[i].num_mcaches_plane0; j++) { + params->allocations[i].global_mcache_ids_plane0[j] = next_unused_cache_id++; + } + for (j = 0; j < params->allocations[i].num_mcaches_plane1; j++) { + params->allocations[i].global_mcache_ids_plane1[j] = next_unused_cache_id++; + } + + // The "psuedo-last" slice is always wrapped around + params->allocations[i].global_mcache_ids_plane0[params->allocations[i].num_mcaches_plane0] = + params->allocations[i].global_mcache_ids_plane0[0]; + params->allocations[i].global_mcache_ids_plane1[params->allocations[i].num_mcaches_plane1] = + params->allocations[i].global_mcache_ids_plane1[0]; + + // If we need dedicated caches for mall requesting, then we assign them here. + if (params->allocations[i].requires_dedicated_mall_mcache) { + for (j = 0; j < params->allocations[i].num_mcaches_plane0; j++) { + params->allocations[i].global_mcache_ids_mall_plane0[j] = next_unused_cache_id++; + } + for (j = 0; j < params->allocations[i].num_mcaches_plane1; j++) { + params->allocations[i].global_mcache_ids_mall_plane1[j] = next_unused_cache_id++; + } + + // The "psuedo-last" slice is always wrapped around + params->allocations[i].global_mcache_ids_mall_plane0[params->allocations[i].num_mcaches_plane0] = + params->allocations[i].global_mcache_ids_mall_plane0[0]; + params->allocations[i].global_mcache_ids_mall_plane1[params->allocations[i].num_mcaches_plane1] = + params->allocations[i].global_mcache_ids_mall_plane1[0]; + } + + // If P0 and P1 are sharing caches, then it means the largest mcache IDs for p0 and p1 can be the same + // since mcache IDs are always ascending, then it means the largest mcacheID of p1 should be the + // largest mcacheID of P0 + if (params->allocations[i].num_mcaches_plane0 > 0 && params->allocations[i].num_mcaches_plane1 > 0 && + 
params->allocations[i].last_slice_sharing.plane0_plane1) { + params->allocations[i].global_mcache_ids_plane1[params->allocations[i].num_mcaches_plane1 - 1] = + params->allocations[i].global_mcache_ids_plane0[params->allocations[i].num_mcaches_plane0 - 1]; + } + + // If we need dedicated caches handle last slice sharing + if (params->allocations[i].requires_dedicated_mall_mcache) { + if (params->allocations[i].num_mcaches_plane0 > 0 && params->allocations[i].num_mcaches_plane1 > 0 && + params->allocations[i].last_slice_sharing.plane0_plane1) { + params->allocations[i].global_mcache_ids_mall_plane1[params->allocations[i].num_mcaches_plane1 - 1] = + params->allocations[i].global_mcache_ids_mall_plane0[params->allocations[i].num_mcaches_plane0 - 1]; + } + // If mall_comb_mcache_l is set then it means that largest mcache ID for MALL p0 can be same as regular read p0 + if (params->allocations[i].num_mcaches_plane0 > 0 && params->allocations[i].last_slice_sharing.mall_comb_mcache_p0) { + params->allocations[i].global_mcache_ids_mall_plane0[params->allocations[i].num_mcaches_plane0 - 1] = + params->allocations[i].global_mcache_ids_plane0[params->allocations[i].num_mcaches_plane0 - 1]; + } + // If mall_comb_mcache_c is set then it means that largest mcache ID for MALL p1 can be same as regular + // read p1 (which can be same as regular read p0 if plane0_plane1 is also set) + if (params->allocations[i].num_mcaches_plane1 > 0 && params->allocations[i].last_slice_sharing.mall_comb_mcache_p1) { + params->allocations[i].global_mcache_ids_mall_plane1[params->allocations[i].num_mcaches_plane1 - 1] = + params->allocations[i].global_mcache_ids_plane1[params->allocations[i].num_mcaches_plane1 - 1]; + } + } + + // If you don't need dedicated mall mcaches, the mall mcache assignments are identical to the normal requesting + if (!params->allocations[i].requires_dedicated_mall_mcache) { + memcpy(params->allocations[i].global_mcache_ids_mall_plane0, 
params->allocations[i].global_mcache_ids_plane0, + sizeof(params->allocations[i].global_mcache_ids_mall_plane0)); + memcpy(params->allocations[i].global_mcache_ids_mall_plane1, params->allocations[i].global_mcache_ids_plane1, + sizeof(params->allocations[i].global_mcache_ids_mall_plane1)); + } + } +} + +bool dml2_top_mcache_calc_mcache_count_and_offsets(struct top_mcache_calc_mcache_count_and_offsets_in_out *params) +{ + struct dml2_instance *dml = (struct dml2_instance *)params->dml2_instance; + struct dml2_top_mcache_verify_mcache_size_locals *l = &dml->scratch.mcache_verify_mcache_size_locals; + + unsigned int total_mcaches_required; + unsigned int i; + bool result = false; + + if (dml->soc_bbox.num_dcc_mcaches == 0) { + return true; + } + + total_mcaches_required = 0; + l->calc_mcache_params.instance = &dml->core_instance; + for (i = 0; i < params->display_config->num_planes; i++) { + if (!params->display_config->plane_descriptors[i].surface.dcc.enable) { + memset(&params->mcache_allocations[i], 0, sizeof(struct dml2_mcache_surface_allocation)); + continue; + } + + l->calc_mcache_params.plane_descriptor = &params->display_config->plane_descriptors[i]; + l->calc_mcache_params.mcache_allocation = &params->mcache_allocations[i]; + l->calc_mcache_params.plane_index = i; + + if (!dml->core_instance.calculate_mcache_allocation(&l->calc_mcache_params)) { + result = false; + break; + } + + if (params->mcache_allocations[i].valid) { + total_mcaches_required += params->mcache_allocations[i].num_mcaches_plane0 + params->mcache_allocations[i].num_mcaches_plane1; + if (params->mcache_allocations[i].last_slice_sharing.plane0_plane1) + total_mcaches_required--; + } + } + dml2_printf("DML_CORE_DCN3::%s: plane_%d, total_mcaches_required=%d\n", __func__, i, total_mcaches_required); + + if (total_mcaches_required > dml->soc_bbox.num_dcc_mcaches) { + result = false; + } else { + result = true; + } + + return result; +} + +static bool dml2_top_soc15_check_mode_supported(struct
dml2_check_mode_supported_in_out *in_out) +{ + struct dml2_instance *dml = (struct dml2_instance *)in_out->dml2_instance; + struct dml2_check_mode_supported_locals *l = &dml->scratch.check_mode_supported_locals; + struct dml2_display_cfg_programming *dpmm_programming = &dml->dpmm_instance.dpmm_scratch.programming; + + bool result = false; + bool mcache_success = false; + memset(dpmm_programming, 0, sizeof(struct dml2_display_cfg_programming)); + + setup_unoptimized_display_config_with_meta(dml, &l->base_display_config_with_meta, in_out->display_config); + + l->mode_support_params.instance = &dml->core_instance; + l->mode_support_params.display_cfg = &l->base_display_config_with_meta; + l->mode_support_params.min_clk_table = &dml->min_clk_table; + l->mode_support_params.min_clk_index = l->base_display_config_with_meta.stage1.min_clk_index_for_latency; + result = dml->core_instance.mode_support(&l->mode_support_params); + l->base_display_config_with_meta.mode_support_result = l->mode_support_params.mode_support_result; + + if (result) { + struct optimization_phase_params mcache_phase = { + .dml = dml, + .display_config = &l->base_display_config_with_meta, + .test_function = dml2_top_optimization_test_function_mcache, + .optimize_function = dml2_top_optimization_optimize_function_mcache, + .optimized_display_config = &l->optimized_display_config_with_meta, + .all_or_nothing = false, + }; + mcache_success = dml2_top_optimization_perform_optimization_phase(&l->optimization_phase_locals, &mcache_phase); + } + + /* + * Call DPMM to map all requirements to minimum clock state + */ + if (result) { + l->dppm_map_mode_params.min_clk_table = &dml->min_clk_table; + l->dppm_map_mode_params.display_cfg = &l->base_display_config_with_meta; + l->dppm_map_mode_params.programming = dpmm_programming; + l->dppm_map_mode_params.soc_bb = &dml->soc_bbox; + l->dppm_map_mode_params.ip = &dml->core_instance.clean_me_up.mode_lib.ip; + result = 
dml->dpmm_instance.map_mode_to_soc_dpm(&l->dppm_map_mode_params); + } + + in_out->is_supported = mcache_success; + result = result && in_out->is_supported; + + return result; +} + +static bool dml2_top_soc15_build_mode_programming(struct dml2_build_mode_programming_in_out *in_out) +{ + struct dml2_instance *dml = (struct dml2_instance *)in_out->dml2_instance; + struct dml2_build_mode_programming_locals *l = &dml->scratch.build_mode_programming_locals; + + bool result = false; + bool mcache_success = false; + bool uclk_pstate_success = false; + bool vmin_success = false; + bool stutter_success = false; + unsigned int i; + + memset(l, 0, sizeof(struct dml2_build_mode_programming_locals)); + memset(in_out->programming, 0, sizeof(struct dml2_display_cfg_programming)); + + memcpy(&in_out->programming->display_config, in_out->display_config, sizeof(struct dml2_display_cfg)); + + setup_speculative_display_config_with_meta(dml, &l->base_display_config_with_meta, in_out->display_config); + + l->mode_support_params.instance = &dml->core_instance; + l->mode_support_params.display_cfg = &l->base_display_config_with_meta; + l->mode_support_params.min_clk_table = &dml->min_clk_table; + l->mode_support_params.min_clk_index = l->base_display_config_with_meta.stage1.min_clk_index_for_latency; + result = dml->core_instance.mode_support(&l->mode_support_params); + + l->base_display_config_with_meta.mode_support_result = l->mode_support_params.mode_support_result; + + if (!result) { + setup_unoptimized_display_config_with_meta(dml, &l->base_display_config_with_meta, in_out->display_config); + + l->mode_support_params.instance = &dml->core_instance; + l->mode_support_params.display_cfg = &l->base_display_config_with_meta; + l->mode_support_params.min_clk_table = &dml->min_clk_table; + l->mode_support_params.min_clk_index = l->base_display_config_with_meta.stage1.min_clk_index_for_latency; + result = dml->core_instance.mode_support(&l->mode_support_params); + 
l->base_display_config_with_meta.mode_support_result = l->mode_support_params.mode_support_result; + + if (!result) { + l->informative_params.instance = &dml->core_instance; + l->informative_params.programming = in_out->programming; + l->informative_params.mode_is_supported = false; + dml->core_instance.populate_informative(&l->informative_params); + + return false; + } + + /* + * Phase 1: Determine minimum clocks to satisfy latency requirements for this mode + */ + memset(&l->min_clock_for_latency_phase, 0, sizeof(struct optimization_phase_params)); + l->min_clock_for_latency_phase.dml = dml; + l->min_clock_for_latency_phase.display_config = &l->base_display_config_with_meta; + l->min_clock_for_latency_phase.init_function = dml2_top_optimization_init_function_min_clk_for_latency; + l->min_clock_for_latency_phase.test_function = dml2_top_optimization_test_function_min_clk_for_latency; + l->min_clock_for_latency_phase.optimize_function = dml2_top_optimization_optimize_function_min_clk_for_latency; + l->min_clock_for_latency_phase.optimized_display_config = &l->optimized_display_config_with_meta; + l->min_clock_for_latency_phase.all_or_nothing = false; + + dml2_top_optimization_perform_optimization_phase_1(&l->optimization_phase_locals, &l->min_clock_for_latency_phase); + + memcpy(&l->base_display_config_with_meta, &l->optimized_display_config_with_meta, sizeof(struct display_configuation_with_meta)); + } + + /* + * Phase 2: Satisfy DCC mcache requirements + */ + memset(&l->mcache_phase, 0, sizeof(struct optimization_phase_params)); + l->mcache_phase.dml = dml; + l->mcache_phase.display_config = &l->base_display_config_with_meta; + l->mcache_phase.test_function = dml2_top_optimization_test_function_mcache; + l->mcache_phase.optimize_function = dml2_top_optimization_optimize_function_mcache; + l->mcache_phase.optimized_display_config = &l->optimized_display_config_with_meta; + l->mcache_phase.all_or_nothing = true; + + mcache_success = 
dml2_top_optimization_perform_optimization_phase(&l->optimization_phase_locals, &l->mcache_phase); + + if (!mcache_success) { + l->informative_params.instance = &dml->core_instance; + l->informative_params.programming = in_out->programming; + l->informative_params.mode_is_supported = false; + + dml->core_instance.populate_informative(&l->informative_params); + + in_out->programming->informative.failed_mcache_validation = true; + return false; + } + + memcpy(&l->base_display_config_with_meta, &l->optimized_display_config_with_meta, sizeof(struct display_configuation_with_meta)); + + /* + * Phase 3: Optimize for Pstate + */ + memset(&l->uclk_pstate_phase, 0, sizeof(struct optimization_phase_params)); + l->uclk_pstate_phase.dml = dml; + l->uclk_pstate_phase.display_config = &l->base_display_config_with_meta; + l->uclk_pstate_phase.init_function = dml2_top_optimization_init_function_uclk_pstate; + l->uclk_pstate_phase.test_function = dml2_top_optimization_test_function_uclk_pstate; + l->uclk_pstate_phase.optimize_function = dml2_top_optimization_optimize_function_uclk_pstate; + l->uclk_pstate_phase.optimized_display_config = &l->optimized_display_config_with_meta; + l->uclk_pstate_phase.all_or_nothing = true; + + uclk_pstate_success = dml2_top_optimization_perform_optimization_phase(&l->optimization_phase_locals, &l->uclk_pstate_phase); + + if (uclk_pstate_success) { + memcpy(&l->base_display_config_with_meta, &l->optimized_display_config_with_meta, sizeof(struct display_configuation_with_meta)); + l->base_display_config_with_meta.stage3.success = true; + } + + /* + * Phase 4: Optimize for Vmin + */ + memset(&l->vmin_phase, 0, sizeof(struct optimization_phase_params)); + l->vmin_phase.dml = dml; + l->vmin_phase.display_config = &l->base_display_config_with_meta; + l->vmin_phase.init_function = dml2_top_optimization_init_function_vmin; + l->vmin_phase.test_function = dml2_top_optimization_test_function_vmin; + l->vmin_phase.optimize_function = 
dml2_top_optimization_optimize_function_vmin; + l->vmin_phase.optimized_display_config = &l->optimized_display_config_with_meta; + l->vmin_phase.all_or_nothing = false; + + vmin_success = dml2_top_optimization_perform_optimization_phase(&l->optimization_phase_locals, &l->vmin_phase); + + if (l->optimized_display_config_with_meta.stage4.performed) { + /* + * when performed is true, optimization has applied to + * optimized_display_config_with_meta and it has passed mode + * support. However it may or may not pass the test function to + * reach actual Vmin. As long as voltage is optimized even if it + * doesn't reach Vmin level, there is still power benefit so in + * this case we will still copy this optimization into base + * display config. + */ + memcpy(&l->base_display_config_with_meta, &l->optimized_display_config_with_meta, sizeof(struct display_configuation_with_meta)); + l->base_display_config_with_meta.stage4.success = vmin_success; + } + + /* + * Phase 5: Optimize for Stutter + */ + memset(&l->stutter_phase, 0, sizeof(struct optimization_phase_params)); + l->stutter_phase.dml = dml; + l->stutter_phase.display_config = &l->base_display_config_with_meta; + l->stutter_phase.init_function = dml2_top_optimization_init_function_stutter; + l->stutter_phase.test_function = dml2_top_optimization_test_function_stutter; + l->stutter_phase.optimize_function = dml2_top_optimization_optimize_function_stutter; + l->stutter_phase.optimized_display_config = &l->optimized_display_config_with_meta; + l->stutter_phase.all_or_nothing = true; + + stutter_success = dml2_top_optimization_perform_optimization_phase(&l->optimization_phase_locals, &l->stutter_phase); + + if (stutter_success) { + memcpy(&l->base_display_config_with_meta, &l->optimized_display_config_with_meta, sizeof(struct display_configuation_with_meta)); + l->base_display_config_with_meta.stage5.success = true; + } + + /* + * Populate mcache programming + */ + for (i = 0; i < in_out->display_config->num_planes; 
i++) { + in_out->programming->plane_programming[i].mcache_allocation = l->base_display_config_with_meta.stage2.mcache_allocations[i]; + } + + /* + * Call DPMM to map all requirements to minimum clock state + */ + if (result) { + l->dppm_map_mode_params.min_clk_table = &dml->min_clk_table; + l->dppm_map_mode_params.display_cfg = &l->base_display_config_with_meta; + l->dppm_map_mode_params.programming = in_out->programming; + l->dppm_map_mode_params.soc_bb = &dml->soc_bbox; + l->dppm_map_mode_params.ip = &dml->core_instance.clean_me_up.mode_lib.ip; + result = dml->dpmm_instance.map_mode_to_soc_dpm(&l->dppm_map_mode_params); + if (!result) + in_out->programming->informative.failed_dpmm = true; + } + + if (result) { + l->mode_programming_params.instance = &dml->core_instance; + l->mode_programming_params.display_cfg = &l->base_display_config_with_meta; + l->mode_programming_params.cfg_support_info = &l->base_display_config_with_meta.mode_support_result.cfg_support_info; + l->mode_programming_params.programming = in_out->programming; + result = dml->core_instance.mode_programming(&l->mode_programming_params); + if (!result) + in_out->programming->informative.failed_mode_programming = true; + } + + if (result) { + l->dppm_map_watermarks_params.core = &dml->core_instance; + l->dppm_map_watermarks_params.display_cfg = &l->base_display_config_with_meta; + l->dppm_map_watermarks_params.programming = in_out->programming; + result = dml->dpmm_instance.map_watermarks(&l->dppm_map_watermarks_params); + } + + l->informative_params.instance = &dml->core_instance; + l->informative_params.programming = in_out->programming; + l->informative_params.mode_is_supported = result; + + dml->core_instance.populate_informative(&l->informative_params); + + return result; +} + +bool dml2_top_soc15_build_mcache_programming(struct dml2_build_mcache_programming_in_out *params) +{ + bool success = true; + int config_index, pipe_index; + int first_offset, second_offset; + int 
free_per_plane_reg_index = 0; + + memset(params->per_plane_pipe_mcache_regs, 0, DML2_MAX_PLANES * DML2_MAX_DCN_PIPES * sizeof(struct dml2_hubp_pipe_mcache_regs *)); + + for (config_index = 0; config_index < params->num_configurations; config_index++) { + for (pipe_index = 0; pipe_index < params->mcache_configurations[config_index].num_pipes; pipe_index++) { + // Allocate storage for the mcache regs + params->per_plane_pipe_mcache_regs[config_index][pipe_index] = &params->mcache_regs_set[free_per_plane_reg_index++]; + + reset_mcache_allocations(params->per_plane_pipe_mcache_regs[config_index][pipe_index]); + + if (params->mcache_configurations[config_index].plane_descriptor->surface.dcc.enable) { + // P0 always enabled + if (!calculate_first_second_splitting(params->mcache_configurations[config_index].mcache_allocation->mcache_x_offsets_plane0, + params->mcache_configurations[config_index].mcache_allocation->num_mcaches_plane0, + 0, + params->mcache_configurations[config_index].pipe_configurations[pipe_index].plane0.viewport_x_start, + params->mcache_configurations[config_index].pipe_configurations[pipe_index].plane0.viewport_x_start + + params->mcache_configurations[config_index].pipe_configurations[pipe_index].plane0.viewport_width - 1, + &first_offset, &second_offset)) { + success = false; + break; + } + + params->per_plane_pipe_mcache_regs[config_index][pipe_index]->main.p0.mcache_id_first = + params->mcache_configurations[config_index].mcache_allocation->global_mcache_ids_plane0[first_offset]; + + params->per_plane_pipe_mcache_regs[config_index][pipe_index]->mall.p0.mcache_id_first = + params->mcache_configurations[config_index].mcache_allocation->global_mcache_ids_mall_plane0[first_offset]; + + if (second_offset >= 0) { + params->per_plane_pipe_mcache_regs[config_index][pipe_index]->main.p0.mcache_id_second = + params->mcache_configurations[config_index].mcache_allocation->global_mcache_ids_plane0[second_offset]; +
params->per_plane_pipe_mcache_regs[config_index][pipe_index]->main.p0.split_location = + params->mcache_configurations[config_index].mcache_allocation->mcache_x_offsets_plane0[first_offset] - 1; + + params->per_plane_pipe_mcache_regs[config_index][pipe_index]->mall.p0.mcache_id_second = + params->mcache_configurations[config_index].mcache_allocation->global_mcache_ids_mall_plane0[second_offset]; + params->per_plane_pipe_mcache_regs[config_index][pipe_index]->mall.p0.split_location = + params->mcache_configurations[config_index].mcache_allocation->mcache_x_offsets_plane0[first_offset] - 1; + } + + // Populate P1 if enabled + if (params->mcache_configurations[config_index].pipe_configurations[pipe_index].plane1_enabled) { + if (!calculate_first_second_splitting(params->mcache_configurations[config_index].mcache_allocation->mcache_x_offsets_plane1, + params->mcache_configurations[config_index].mcache_allocation->num_mcaches_plane1, + 0, + params->mcache_configurations[config_index].pipe_configurations[pipe_index].plane1.viewport_x_start, + params->mcache_configurations[config_index].pipe_configurations[pipe_index].plane1.viewport_x_start + + params->mcache_configurations[config_index].pipe_configurations[pipe_index].plane1.viewport_width - 1, + &first_offset, &second_offset)) { + success = false; + break; + } + + params->per_plane_pipe_mcache_regs[config_index][pipe_index]->main.p1.mcache_id_first = + params->mcache_configurations[config_index].mcache_allocation->global_mcache_ids_plane1[first_offset]; + + params->per_plane_pipe_mcache_regs[config_index][pipe_index]->mall.p1.mcache_id_first = + params->mcache_configurations[config_index].mcache_allocation->global_mcache_ids_mall_plane1[first_offset]; + + if (second_offset >= 0) { + params->per_plane_pipe_mcache_regs[config_index][pipe_index]->main.p1.mcache_id_second = + params->mcache_configurations[config_index].mcache_allocation->global_mcache_ids_plane1[second_offset]; + 
params->per_plane_pipe_mcache_regs[config_index][pipe_index]->main.p1.split_location = + params->mcache_configurations[config_index].mcache_allocation->mcache_x_offsets_plane1[first_offset] - 1; + + params->per_plane_pipe_mcache_regs[config_index][pipe_index]->mall.p1.mcache_id_second = + params->mcache_configurations[config_index].mcache_allocation->global_mcache_ids_mall_plane1[second_offset]; + params->per_plane_pipe_mcache_regs[config_index][pipe_index]->mall.p1.split_location = + params->mcache_configurations[config_index].mcache_allocation->mcache_x_offsets_plane1[first_offset] - 1; + } + } + } + } + } + + return success; +} + +static const struct dml2_top_funcs soc15_funcs = { + .check_mode_supported = dml2_top_soc15_check_mode_supported, + .build_mode_programming = dml2_top_soc15_build_mode_programming, + .build_mcache_programming = dml2_top_soc15_build_mcache_programming, +}; + +bool dml2_top_soc15_initialize_instance(struct dml2_initialize_instance_in_out *in_out) +{ + struct dml2_instance *dml = (struct dml2_instance *)in_out->dml2_instance; + struct dml2_initialize_instance_locals *l = &dml->scratch.initialize_instance_locals; + struct dml2_core_initialize_in_out core_init_params = { 0 }; + struct dml2_mcg_build_min_clock_table_params_in_out mcg_build_min_clk_params = { 0 }; + struct dml2_pmo_initialize_in_out pmo_init_params = { 0 }; + bool result = false; + + memset(l, 0, sizeof(struct dml2_initialize_instance_locals)); + memset(dml, 0, sizeof(struct dml2_instance)); + + memcpy(&dml->ip_caps, &in_out->ip_caps, sizeof(struct dml2_ip_capabilities)); + memcpy(&dml->soc_bbox, &in_out->soc_bb, sizeof(struct dml2_soc_bb)); + + dml->project_id = in_out->options.project_id; + dml->pmo_options = in_out->options.pmo_options; + + // Initialize All Components + result = dml2_mcg_create(in_out->options.project_id, &dml->mcg_instance); + + if (result) + result = dml2_dpmm_create(in_out->options.project_id, &dml->dpmm_instance); + + if (result) + result = 
dml2_core_create(in_out->options.project_id, &dml->core_instance); + + if (result) { + mcg_build_min_clk_params.soc_bb = &in_out->soc_bb; + mcg_build_min_clk_params.min_clk_table = &dml->min_clk_table; + result = dml->mcg_instance.build_min_clock_table(&mcg_build_min_clk_params); + } + + if (result) { + core_init_params.project_id = in_out->options.project_id; + core_init_params.instance = &dml->core_instance; + core_init_params.minimum_clock_table = &dml->min_clk_table; + core_init_params.explicit_ip_bb = in_out->overrides.explicit_ip_bb; + core_init_params.explicit_ip_bb_size = in_out->overrides.explicit_ip_bb_size; + core_init_params.ip_caps = &in_out->ip_caps; + core_init_params.soc_bb = &in_out->soc_bb; + result = dml->core_instance.initialize(&core_init_params); + + if (core_init_params.explicit_ip_bb && core_init_params.explicit_ip_bb_size > 0) { + memcpy(&dml->ip_caps, &in_out->ip_caps, sizeof(struct dml2_ip_capabilities)); + } + } + + if (result) + result = dml2_pmo_create(in_out->options.project_id, &dml->pmo_instance); + + if (result) { + pmo_init_params.instance = &dml->pmo_instance; + pmo_init_params.soc_bb = &dml->soc_bbox; + pmo_init_params.ip_caps = &dml->ip_caps; + pmo_init_params.mcg_clock_table_size = dml->min_clk_table.dram_bw_table.num_entries; + pmo_init_params.options = &dml->pmo_options; + dml->pmo_instance.initialize(&pmo_init_params); + } + dml->funcs = soc15_funcs; + return result; +} diff --git a/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_top/dml_top_mcache.h b/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_top/dml2_top_soc15.h similarity index 59% rename from drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_top/dml_top_mcache.h rename to drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_top/dml2_top_soc15.h index 7b1f6f7143d0..53bd8602f9ef 100644 --- a/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_top/dml_top_mcache.h +++ b/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_top/dml2_top_soc15.h @@ -2,22 +2,13 @@ // 
// Copyright 2024 Advanced Micro Devices, Inc. -#ifndef __DML_TOP_MCACHE_H__ -#define __DML_TOP_MCACHE_H__ - -#include "dml2_external_lib_deps.h" -#include "dml_top_display_cfg_types.h" -#include "dml_top_types.h" +#ifndef __DML2_TOP_SOC15_H__ +#define __DML2_TOP_SOC15_H__ #include "dml2_internal_shared_types.h" +bool dml2_top_soc15_initialize_instance(struct dml2_initialize_instance_in_out *in_out); bool dml2_top_mcache_calc_mcache_count_and_offsets(struct top_mcache_calc_mcache_count_and_offsets_in_out *params); - void dml2_top_mcache_assign_global_mcache_ids(struct top_mcache_assign_global_mcache_ids_in_out *params); - bool dml2_top_mcache_validate_admissability(struct top_mcache_validate_admissability_in_out *params); - -bool dml2_top_mcache_build_mcache_programming(struct dml2_build_mcache_programming_in_out *params); - -bool dml2_top_mcache_unit_test(void); - -#endif +bool dml2_top_soc15_build_mcache_programming(struct dml2_build_mcache_programming_in_out *params); +#endif /* __DML2_TOP_SOC15_H__ */ diff --git a/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_top/dml_top_mcache.c b/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_top/dml_top_mcache.c deleted file mode 100644 index a342ebfbe4e7..000000000000 --- a/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_top/dml_top_mcache.c +++ /dev/null @@ -1,549 +0,0 @@ -// SPDX-License-Identifier: MIT -// -// Copyright 2024 Advanced Micro Devices, Inc. - -#include "dml2_debug.h" - -#include "dml_top_mcache.h" -#include "lib_float_math.h" - -#include "dml2_internal_shared_types.h" - -/* -* Takes an input set of mcache boundaries and finds the appropriate setting of cache programming. -* Returns true if a valid set of programming can be made, and false otherwise. "Valid" means -* that the horizontal viewport does not span more than 2 cache slices. -* -* It optionally also can apply a constant shift to all the cache boundaries. 
-*/ -static const uint32_t MCACHE_ID_UNASSIGNED = 0xF; -static const uint32_t SPLIT_LOCATION_UNDEFINED = 0xFFFF; - -static bool calculate_first_second_splitting(const int *mcache_boundaries, int num_boundaries, int shift, - int pipe_h_vp_start, int pipe_h_vp_end, int *first_offset, int *second_offset) -{ - const int MAX_VP = 0xFFFFFF; - int left_cache_id; - int right_cache_id; - int range_start; - int range_end; - bool success = false; - - if (num_boundaries <= 1) { - if (first_offset && second_offset) { - *first_offset = 0; - *second_offset = -1; - } - success = true; - return success; - } else { - range_start = 0; - for (left_cache_id = 0; left_cache_id < num_boundaries; left_cache_id++) { - range_end = mcache_boundaries[left_cache_id] - shift - 1; - - if (range_start <= pipe_h_vp_start && pipe_h_vp_start <= range_end) - break; - - range_start = range_end + 1; - } - - range_end = MAX_VP; - for (right_cache_id = num_boundaries - 1; right_cache_id >= -1; right_cache_id--) { - if (right_cache_id >= 0) - range_start = mcache_boundaries[right_cache_id] - shift; - else - range_start = 0; - - if (range_start <= pipe_h_vp_end && pipe_h_vp_end <= range_end) { - break; - } - range_end = range_start - 1; - } - right_cache_id = (right_cache_id + 1) % num_boundaries; - - if (right_cache_id == left_cache_id) { - if (first_offset && second_offset) { - *first_offset = left_cache_id; - *second_offset = -1; - } - success = true; - } else if (right_cache_id == (left_cache_id + 1) % num_boundaries) { - if (first_offset && second_offset) { - *first_offset = left_cache_id; - *second_offset = right_cache_id; - } - success = true; - } - } - - return success; -} - -/* -* For a given set of pipe start/end x positions, checks to see it can support the input mcache splitting. -* It also attempts to "optimize" by finding a shift if the default 0 shift does not work. 
-*/ -static bool find_shift_for_valid_cache_id_assignment(int *mcache_boundaries, unsigned int num_boundaries, - int *pipe_vp_startx, int *pipe_vp_endx, unsigned int pipe_count, int shift_granularity, int *shift) -{ - int max_shift = 0xFFFF; - unsigned int pipe_index; - unsigned int i, slice_width; - bool success = false; - - for (i = 0; i < num_boundaries; i++) { - if (i == 0) - slice_width = mcache_boundaries[i]; - else - slice_width = mcache_boundaries[i] - mcache_boundaries[i - 1]; - - if (max_shift > (int)slice_width) { - max_shift = slice_width; - } - } - - for (*shift = 0; *shift <= max_shift; *shift += shift_granularity) { - success = true; - for (pipe_index = 0; pipe_index < pipe_count; pipe_index++) { - if (!calculate_first_second_splitting(mcache_boundaries, num_boundaries, *shift, - pipe_vp_startx[pipe_index], pipe_vp_endx[pipe_index], 0, 0)) { - success = false; - break; - } - } - if (success) - break; - } - - return success; -} - -/* -* Counts the number of elements inside input array within the given span length. -* Formally, what is the size of the largest subset of the array where the largest and smallest element -* differ no more than the span. -*/ -static unsigned int count_elements_in_span(int *array, unsigned int array_size, unsigned int span) -{ - unsigned int i; - unsigned int span_start_value; - unsigned int span_start_index; - unsigned int greatest_element_count; - - if (array_size == 0) - return 1; - - if (span == 0) - return array_size > 0 ? 
1 : 0; - - span_start_value = 0; - span_start_index = 0; - greatest_element_count = 0; - - while (span_start_index < array_size) { - for (i = span_start_index; i < array_size; i++) { - if (array[i] - span_start_value <= span) { - if (i - span_start_index + 1 > greatest_element_count) { - greatest_element_count = i - span_start_index + 1; - } - } else - break; - } - - span_start_index++; - - if (span_start_index < array_size) { - span_start_value = array[span_start_index - 1] + 1; - } - } - - return greatest_element_count; -} - -static bool calculate_h_split_for_scaling_transform(int full_vp_width, int h_active, int num_pipes, - enum dml2_scaling_transform scaling_transform, int *pipe_vp_x_start, int *pipe_vp_x_end) -{ - int i, slice_width; - const char MAX_SCL_VP_OVERLAP = 3; - bool success = false; - - switch (scaling_transform) { - case dml2_scaling_transform_centered: - case dml2_scaling_transform_aspect_ratio: - case dml2_scaling_transform_fullscreen: - slice_width = full_vp_width / num_pipes; - for (i = 0; i < num_pipes; i++) { - pipe_vp_x_start[i] = i * slice_width; - pipe_vp_x_end[i] = (i + 1) * slice_width - 1; - - if (pipe_vp_x_start[i] < MAX_SCL_VP_OVERLAP) - pipe_vp_x_start[i] = 0; - else - pipe_vp_x_start[i] -= MAX_SCL_VP_OVERLAP; - - if (pipe_vp_x_end[i] > full_vp_width - MAX_SCL_VP_OVERLAP - 1) - pipe_vp_x_end[i] = full_vp_width - 1; - else - pipe_vp_x_end[i] += MAX_SCL_VP_OVERLAP; - } - break; - case dml2_scaling_transform_explicit: - default: - success = false; - break; - } - - return success; -} - -bool dml2_top_mcache_validate_admissability(struct top_mcache_validate_admissability_in_out *params) -{ - struct dml2_instance *dml = (struct dml2_instance *)params->dml2_instance; - struct dml2_top_mcache_validate_admissability_locals *l = &dml->scratch.mcache_validate_admissability_locals; - - const int MAX_PIXEL_OVERLAP = 6; - int max_per_pipe_vp_p0 = 0; - int max_per_pipe_vp_p1 = 0; - int temp, p0shift, p1shift; - unsigned int plane_index = 0; - 
unsigned int i; - unsigned int odm_combine_factor; - unsigned int mpc_combine_factor; - unsigned int num_dpps; - unsigned int num_boundaries; - enum dml2_scaling_transform scaling_transform; - const struct dml2_plane_parameters *plane; - const struct dml2_stream_parameters *stream; - - bool p0pass = false; - bool p1pass = false; - bool all_pass = true; - - for (plane_index = 0; plane_index < params->display_cfg->num_planes; plane_index++) { - if (!params->display_cfg->plane_descriptors[plane_index].surface.dcc.enable) - continue; - - plane = &params->display_cfg->plane_descriptors[plane_index]; - stream = &params->display_cfg->stream_descriptors[plane->stream_index]; - - num_dpps = odm_combine_factor = params->cfg_support_info->stream_support_info[plane->stream_index].odms_used; - - if (odm_combine_factor == 1) - num_dpps = mpc_combine_factor = (unsigned int)params->cfg_support_info->plane_support_info[plane_index].dpps_used; - else - mpc_combine_factor = 1; - - if (odm_combine_factor > 1) { - max_per_pipe_vp_p0 = plane->surface.plane0.width; - temp = (unsigned int)math_ceil(plane->composition.scaler_info.plane0.h_ratio * stream->timing.h_active / odm_combine_factor); - - if (temp < max_per_pipe_vp_p0) - max_per_pipe_vp_p0 = temp; - - max_per_pipe_vp_p1 = plane->surface.plane1.width; - temp = (unsigned int)math_ceil(plane->composition.scaler_info.plane1.h_ratio * stream->timing.h_active / odm_combine_factor); - - if (temp < max_per_pipe_vp_p1) - max_per_pipe_vp_p1 = temp; - } else { - max_per_pipe_vp_p0 = plane->surface.plane0.width / mpc_combine_factor; - max_per_pipe_vp_p1 = plane->surface.plane1.width / mpc_combine_factor; - } - - max_per_pipe_vp_p0 += 2 * MAX_PIXEL_OVERLAP; - max_per_pipe_vp_p1 += MAX_PIXEL_OVERLAP; - - p0shift = 0; - p1shift = 0; - - // The last element in the unshifted boundary array will always be the first pixel outside the - // plane, which means theres no mcache associated with it, so -1 - num_boundaries = 
params->mcache_allocations[plane_index].num_mcaches_plane0 == 0 ? 0 : params->mcache_allocations[plane_index].num_mcaches_plane0 - 1; - if ((count_elements_in_span(params->mcache_allocations[plane_index].mcache_x_offsets_plane0, - num_boundaries, max_per_pipe_vp_p0) <= 1) && (num_boundaries <= num_dpps)) { - p0pass = true; - } - num_boundaries = params->mcache_allocations[plane_index].num_mcaches_plane1 == 0 ? 0 : params->mcache_allocations[plane_index].num_mcaches_plane1 - 1; - if ((count_elements_in_span(params->mcache_allocations[plane_index].mcache_x_offsets_plane1, - num_boundaries, max_per_pipe_vp_p1) <= 1) && (num_boundaries <= num_dpps)) { - p1pass = true; - } - - if (!p0pass || !p1pass) { - if (odm_combine_factor > 1) { - num_dpps = odm_combine_factor; - scaling_transform = plane->composition.scaling_transform; - } else { - num_dpps = mpc_combine_factor; - scaling_transform = dml2_scaling_transform_fullscreen; - } - - if (!p0pass) { - if (plane->composition.viewport.stationary) { - calculate_h_split_for_scaling_transform(plane->surface.plane0.width, - stream->timing.h_active, num_dpps, scaling_transform, - &l->plane0.pipe_vp_startx[plane_index], &l->plane0.pipe_vp_endx[plane_index]); - p0pass = find_shift_for_valid_cache_id_assignment(params->mcache_allocations[plane_index].mcache_x_offsets_plane0, - params->mcache_allocations[plane_index].num_mcaches_plane0, - &l->plane0.pipe_vp_startx[plane_index], &l->plane0.pipe_vp_endx[plane_index], num_dpps, - params->mcache_allocations[plane_index].shift_granularity.p0, &p0shift); - } - } - if (!p1pass) { - if (plane->composition.viewport.stationary) { - calculate_h_split_for_scaling_transform(plane->surface.plane1.width, - stream->timing.h_active, num_dpps, scaling_transform, - &l->plane0.pipe_vp_startx[plane_index], &l->plane0.pipe_vp_endx[plane_index]); - p1pass = find_shift_for_valid_cache_id_assignment(params->mcache_allocations[plane_index].mcache_x_offsets_plane1, - 
params->mcache_allocations[plane_index].num_mcaches_plane1, - &l->plane1.pipe_vp_startx[plane_index], &l->plane1.pipe_vp_endx[plane_index], num_dpps, - params->mcache_allocations[plane_index].shift_granularity.p1, &p1shift); - } - } - } - - if (p0pass && p1pass) { - for (i = 0; i < params->mcache_allocations[plane_index].num_mcaches_plane0; i++) { - params->mcache_allocations[plane_index].mcache_x_offsets_plane0[i] -= p0shift; - } - for (i = 0; i < params->mcache_allocations[plane_index].num_mcaches_plane1; i++) { - params->mcache_allocations[plane_index].mcache_x_offsets_plane1[i] -= p1shift; - } - } - - params->per_plane_status[plane_index] = p0pass && p1pass; - all_pass &= p0pass && p1pass; - } - - return all_pass; -} - -static void reset_mcache_allocations(struct dml2_hubp_pipe_mcache_regs *per_plane_pipe_mcache_regs) -{ - // Initialize all entries to special valid MCache ID and special valid split coordinate - per_plane_pipe_mcache_regs->main.p0.mcache_id_first = MCACHE_ID_UNASSIGNED; - per_plane_pipe_mcache_regs->main.p0.mcache_id_second = MCACHE_ID_UNASSIGNED; - per_plane_pipe_mcache_regs->main.p0.split_location = SPLIT_LOCATION_UNDEFINED; - - per_plane_pipe_mcache_regs->mall.p0.mcache_id_first = MCACHE_ID_UNASSIGNED; - per_plane_pipe_mcache_regs->mall.p0.mcache_id_second = MCACHE_ID_UNASSIGNED; - per_plane_pipe_mcache_regs->mall.p0.split_location = SPLIT_LOCATION_UNDEFINED; - - per_plane_pipe_mcache_regs->main.p1.mcache_id_first = MCACHE_ID_UNASSIGNED; - per_plane_pipe_mcache_regs->main.p1.mcache_id_second = MCACHE_ID_UNASSIGNED; - per_plane_pipe_mcache_regs->main.p1.split_location = SPLIT_LOCATION_UNDEFINED; - - per_plane_pipe_mcache_regs->mall.p1.mcache_id_first = MCACHE_ID_UNASSIGNED; - per_plane_pipe_mcache_regs->mall.p1.mcache_id_second = MCACHE_ID_UNASSIGNED; - per_plane_pipe_mcache_regs->mall.p1.split_location = SPLIT_LOCATION_UNDEFINED; -} - -bool dml2_top_mcache_build_mcache_programming(struct dml2_build_mcache_programming_in_out *params) -{ - bool 
success = true; - int config_index, pipe_index; - int first_offset, second_offset; - int free_per_plane_reg_index = 0; - - memset(params->per_plane_pipe_mcache_regs, 0, DML2_MAX_PLANES * DML2_MAX_DCN_PIPES * sizeof(struct dml2_hubp_pipe_mcache_regs *)); - - for (config_index = 0; config_index < params->num_configurations; config_index++) { - for (pipe_index = 0; pipe_index < params->mcache_configurations[config_index].num_pipes; pipe_index++) { - // Allocate storage for the mcache regs - params->per_plane_pipe_mcache_regs[config_index][pipe_index] = &params->mcache_regs_set[free_per_plane_reg_index++]; - - reset_mcache_allocations(params->per_plane_pipe_mcache_regs[config_index][pipe_index]); - - if (params->mcache_configurations[config_index].plane_descriptor->surface.dcc.enable) { - // P0 always enabled - if (!calculate_first_second_splitting(params->mcache_configurations[config_index].mcache_allocation->mcache_x_offsets_plane0, - params->mcache_configurations[config_index].mcache_allocation->num_mcaches_plane0, - 0, - params->mcache_configurations[config_index].pipe_configurations[pipe_index].plane0.viewport_x_start, - params->mcache_configurations[config_index].pipe_configurations[pipe_index].plane0.viewport_x_start + - params->mcache_configurations[config_index].pipe_configurations[pipe_index].plane0.viewport_width - 1, - &first_offset, &second_offset)) { - success = false; - break; - } - - params->per_plane_pipe_mcache_regs[config_index][pipe_index]->main.p0.mcache_id_first = - params->mcache_configurations[config_index].mcache_allocation->global_mcache_ids_plane0[first_offset]; - - params->per_plane_pipe_mcache_regs[config_index][pipe_index]->mall.p0.mcache_id_first = - params->mcache_configurations[config_index].mcache_allocation->global_mcache_ids_mall_plane0[first_offset]; - - if (second_offset >= 0) { - params->per_plane_pipe_mcache_regs[config_index][pipe_index]->main.p0.mcache_id_second = - 
params->mcache_configurations[config_index].mcache_allocation->global_mcache_ids_plane0[second_offset]; - params->per_plane_pipe_mcache_regs[config_index][pipe_index]->main.p0.split_location = - params->mcache_configurations[config_index].mcache_allocation->mcache_x_offsets_plane0[first_offset] - 1; - - params->per_plane_pipe_mcache_regs[config_index][pipe_index]->mall.p0.mcache_id_second = - params->mcache_configurations[config_index].mcache_allocation->global_mcache_ids_mall_plane0[second_offset]; - params->per_plane_pipe_mcache_regs[config_index][pipe_index]->mall.p0.split_location = - params->mcache_configurations[config_index].mcache_allocation->mcache_x_offsets_plane0[first_offset] - 1; - } - - // Populate P1 if enabled - if (params->mcache_configurations[config_index].pipe_configurations[pipe_index].plane1_enabled) { - if (!calculate_first_second_splitting(params->mcache_configurations[config_index].mcache_allocation->mcache_x_offsets_plane1, - params->mcache_configurations[config_index].mcache_allocation->num_mcaches_plane1, - 0, - params->mcache_configurations[config_index].pipe_configurations[pipe_index].plane1.viewport_x_start, - params->mcache_configurations[config_index].pipe_configurations[pipe_index].plane1.viewport_x_start + - params->mcache_configurations[config_index].pipe_configurations[pipe_index].plane1.viewport_width - 1, - &first_offset, &second_offset)) { - success = false; - break; - } - - params->per_plane_pipe_mcache_regs[config_index][pipe_index]->main.p1.mcache_id_first = - params->mcache_configurations[config_index].mcache_allocation->global_mcache_ids_plane1[first_offset]; - - params->per_plane_pipe_mcache_regs[config_index][pipe_index]->mall.p1.mcache_id_first = - params->mcache_configurations[config_index].mcache_allocation->global_mcache_ids_mall_plane1[first_offset]; - - if (second_offset >= 0) { - params->per_plane_pipe_mcache_regs[config_index][pipe_index]->main.p1.mcache_id_second = - 
params->mcache_configurations[config_index].mcache_allocation->global_mcache_ids_plane1[second_offset]; - params->per_plane_pipe_mcache_regs[config_index][pipe_index]->main.p1.split_location = - params->mcache_configurations[config_index].mcache_allocation->mcache_x_offsets_plane1[first_offset] - 1; - - params->per_plane_pipe_mcache_regs[config_index][pipe_index]->mall.p1.mcache_id_second = - params->mcache_configurations[config_index].mcache_allocation->global_mcache_ids_mall_plane1[second_offset]; - params->per_plane_pipe_mcache_regs[config_index][pipe_index]->mall.p1.split_location = - params->mcache_configurations[config_index].mcache_allocation->mcache_x_offsets_plane1[first_offset] - 1; - } - } - } - } - } - - return success; -} - -void dml2_top_mcache_assign_global_mcache_ids(struct top_mcache_assign_global_mcache_ids_in_out *params) -{ - int i; - unsigned int j; - int next_unused_cache_id = 0; - - for (i = 0; i < params->num_allocations; i++) { - if (!params->allocations[i].valid) - continue; - - for (j = 0; j < params->allocations[i].num_mcaches_plane0; j++) { - params->allocations[i].global_mcache_ids_plane0[j] = next_unused_cache_id++; - } - for (j = 0; j < params->allocations[i].num_mcaches_plane1; j++) { - params->allocations[i].global_mcache_ids_plane1[j] = next_unused_cache_id++; - } - - // The "psuedo-last" slice is always wrapped around - params->allocations[i].global_mcache_ids_plane0[params->allocations[i].num_mcaches_plane0] = - params->allocations[i].global_mcache_ids_plane0[0]; - params->allocations[i].global_mcache_ids_plane1[params->allocations[i].num_mcaches_plane1] = - params->allocations[i].global_mcache_ids_plane1[0]; - - // If we need dedicated caches for mall requesting, then we assign them here. 
- if (params->allocations[i].requires_dedicated_mall_mcache) { - for (j = 0; j < params->allocations[i].num_mcaches_plane0; j++) { - params->allocations[i].global_mcache_ids_mall_plane0[j] = next_unused_cache_id++; - } - for (j = 0; j < params->allocations[i].num_mcaches_plane1; j++) { - params->allocations[i].global_mcache_ids_mall_plane1[j] = next_unused_cache_id++; - } - - // The "psuedo-last" slice is always wrapped around - params->allocations[i].global_mcache_ids_mall_plane0[params->allocations[i].num_mcaches_plane0] = - params->allocations[i].global_mcache_ids_mall_plane0[0]; - params->allocations[i].global_mcache_ids_mall_plane1[params->allocations[i].num_mcaches_plane1] = - params->allocations[i].global_mcache_ids_mall_plane1[0]; - } - - // If P0 and P1 are sharing caches, then it means the largest mcache IDs for p0 and p1 can be the same - // since mcache IDs are always ascending, then it means the largest mcacheID of p1 should be the - // largest mcacheID of P0 - if (params->allocations[i].num_mcaches_plane0 > 0 && params->allocations[i].num_mcaches_plane1 > 0 && - params->allocations[i].last_slice_sharing.plane0_plane1) { - params->allocations[i].global_mcache_ids_plane1[params->allocations[i].num_mcaches_plane1 - 1] = - params->allocations[i].global_mcache_ids_plane0[params->allocations[i].num_mcaches_plane0 - 1]; - } - - // If we need dedicated caches handle last slice sharing - if (params->allocations[i].requires_dedicated_mall_mcache) { - if (params->allocations[i].num_mcaches_plane0 > 0 && params->allocations[i].num_mcaches_plane1 > 0 && - params->allocations[i].last_slice_sharing.plane0_plane1) { - params->allocations[i].global_mcache_ids_mall_plane1[params->allocations[i].num_mcaches_plane1 - 1] = - params->allocations[i].global_mcache_ids_mall_plane0[params->allocations[i].num_mcaches_plane0 - 1]; - } - // If mall_comb_mcache_l is set then it means that largest mcache ID for MALL p0 can be same as regular read p0 - if 
(params->allocations[i].num_mcaches_plane0 > 0 && params->allocations[i].last_slice_sharing.mall_comb_mcache_p0) { - params->allocations[i].global_mcache_ids_mall_plane0[params->allocations[i].num_mcaches_plane0 - 1] = - params->allocations[i].global_mcache_ids_plane0[params->allocations[i].num_mcaches_plane0 - 1]; - } - // If mall_comb_mcache_c is set then it means that largest mcache ID for MALL p1 can be same as regular - // read p1 (which can be same as regular read p0 if plane0_plane1 is also set) - if (params->allocations[i].num_mcaches_plane1 > 0 && params->allocations[i].last_slice_sharing.mall_comb_mcache_p1) { - params->allocations[i].global_mcache_ids_mall_plane1[params->allocations[i].num_mcaches_plane1 - 1] = - params->allocations[i].global_mcache_ids_plane1[params->allocations[i].num_mcaches_plane1 - 1]; - } - } - - // If you don't need dedicated mall mcaches, the mall mcache assignments are identical to the normal requesting - if (!params->allocations[i].requires_dedicated_mall_mcache) { - memcpy(params->allocations[i].global_mcache_ids_mall_plane0, params->allocations[i].global_mcache_ids_plane0, - sizeof(params->allocations[i].global_mcache_ids_mall_plane0)); - memcpy(params->allocations[i].global_mcache_ids_mall_plane1, params->allocations[i].global_mcache_ids_plane1, - sizeof(params->allocations[i].global_mcache_ids_mall_plane1)); - } - } -} - -bool dml2_top_mcache_calc_mcache_count_and_offsets(struct top_mcache_calc_mcache_count_and_offsets_in_out *params) -{ - struct dml2_instance *dml = (struct dml2_instance *)params->dml2_instance; - struct dml2_top_mcache_verify_mcache_size_locals *l = &dml->scratch.mcache_verify_mcache_size_locals; - - unsigned int total_mcaches_required; - unsigned int i; - bool result = false; - - if (dml->soc_bbox.num_dcc_mcaches == 0) { - return true; - } - - total_mcaches_required = 0; - l->calc_mcache_params.instance = &dml->core_instance; - for (i = 0; i < params->display_config->num_planes; i++) { - if 
(!params->display_config->plane_descriptors[i].surface.dcc.enable) { - memset(&params->mcache_allocations[i], 0, sizeof(struct dml2_mcache_surface_allocation)); - continue; - } - - l->calc_mcache_params.plane_descriptor = &params->display_config->plane_descriptors[i]; - l->calc_mcache_params.mcache_allocation = &params->mcache_allocations[i]; - l->calc_mcache_params.plane_index = i; - - if (!dml->core_instance.calculate_mcache_allocation(&l->calc_mcache_params)) { - result = false; - break; - } - - if (params->mcache_allocations[i].valid) { - total_mcaches_required += params->mcache_allocations[i].num_mcaches_plane0 + params->mcache_allocations[i].num_mcaches_plane1; - if (params->mcache_allocations[i].last_slice_sharing.plane0_plane1) - total_mcaches_required--; - } - } - dml2_printf("DML_CORE_DCN3::%s: plane_%d, total_mcaches_required=%d\n", __func__, i, total_mcaches_required); - - if (total_mcaches_required > dml->soc_bbox.num_dcc_mcaches) { - result = false; - } else { - result = true; - } - - return result; -} diff --git a/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/inc/dml2_debug.c b/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/inc/dml2_debug.c index e9b8e10695ae..f95c7ff56f15 100644 --- a/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/inc/dml2_debug.c +++ b/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/inc/dml2_debug.c @@ -4,6 +4,11 @@ #include "dml2_debug.h" +int dml2_log_internal(const char *format, ...) +{ + return 0; +} + int dml2_printf(const char *format, ...) 
{ #ifdef _DEBUG diff --git a/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/inc/dml2_debug.h b/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/inc/dml2_debug.h index d51a1b6c62f2..a27792b56f7e 100644 --- a/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/inc/dml2_debug.h +++ b/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/inc/dml2_debug.h @@ -8,9 +8,53 @@ #ifdef _DEBUG #define DML2_ASSERT(condition) dml2_assert(condition) #else -#define DML2_ASSERT(condition) +#define DML2_ASSERT(condition) ((void)0) +#endif +/* + * DML_LOG_FATAL - fatal errors for unrecoverable DML states until a restart. + * DML_LOG_ERROR - unexpected but recoverable failures inside DML + * DML_LOG_WARN - unexpected inputs or events to DML + * DML_LOG_INFO - high level tracing of DML interfaces + * DML_LOG_DEBUG - detailed tracing of DML internal components + * DML_LOG_VERBOSE - detailed tracing of DML calculation procedure + */ +#if !defined(DML_LOG_LEVEL) +#if defined(_DEBUG) && defined(_DEBUG_PRINTS) +/* for backward compatibility with old macros */ +#define DML_LOG_LEVEL 5 +#else +#define DML_LOG_LEVEL 0 +#endif #endif +#define DML_LOG_FATAL(fmt, ...) dml2_log_internal(fmt, ## __VA_ARGS__) +#if DML_LOG_LEVEL >= 1 +#define DML_LOG_ERROR(fmt, ...) dml2_log_internal(fmt, ## __VA_ARGS__) +#else +#define DML_LOG_ERROR(fmt, ...) ((void)0) +#endif +#if DML_LOG_LEVEL >= 2 +#define DML_LOG_WARN(fmt, ...) dml2_log_internal(fmt, ## __VA_ARGS__) +#else +#define DML_LOG_WARN(fmt, ...) ((void)0) +#endif +#if DML_LOG_LEVEL >= 3 +#define DML_LOG_INFO(fmt, ...) dml2_log_internal(fmt, ## __VA_ARGS__) +#else +#define DML_LOG_INFO(fmt, ...) ((void)0) +#endif +#if DML_LOG_LEVEL >= 4 +#define DML_LOG_DEBUG(fmt, ...) dml2_log_internal(fmt, ## __VA_ARGS__) +#else +#define DML_LOG_DEBUG(fmt, ...) ((void)0) +#endif +#if DML_LOG_LEVEL >= 5 +#define DML_LOG_VERBOSE(fmt, ...) dml2_log_internal(fmt, ## __VA_ARGS__) +#else +#define DML_LOG_VERBOSE(fmt, ...) 
((void)0) +#endif + +int dml2_log_internal(const char *format, ...); int dml2_printf(const char *format, ...); void dml2_assert(int condition); diff --git a/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/inc/dml2_internal_shared_types.h b/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/inc/dml2_internal_shared_types.h index aeac9f159fa5..7fb6026bcb49 100644 --- a/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/inc/dml2_internal_shared_types.h +++ b/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/inc/dml2_internal_shared_types.h @@ -8,7 +8,6 @@ #include "dml2_external_lib_deps.h" #include "dml_top_types.h" #include "dml2_core_shared_types.h" - /* * DML2 MCG Types and Interfaces */ @@ -63,7 +62,6 @@ struct dml2_mcg_build_min_clock_table_params_in_out { */ struct dml2_mcg_min_clock_table *min_clk_table; }; - struct dml2_mcg_instance { bool (*build_min_clock_table)(struct dml2_mcg_build_min_clock_table_params_in_out *in_out); bool (*unit_test)(void); @@ -81,7 +79,6 @@ struct dml2_dpmm_map_mode_to_soc_dpm_params_in_out { struct dml2_soc_bb *soc_bb; struct dml2_mcg_min_clock_table *min_clk_table; const struct display_configuation_with_meta *display_cfg; - struct { bool perform_pseudo_map; struct dml2_core_internal_soc_bb *soc_bb; @@ -309,7 +306,7 @@ struct dml2_optimization_stage3_state { // The pstate support mode for each plane // The number of valid elements == display_cfg.num_planes // The indexing of pstate_switch_modes matches plane_descriptors[] - enum dml2_uclk_pstate_support_method pstate_switch_modes[DML2_MAX_PLANES]; + enum dml2_pstate_method pstate_switch_modes[DML2_MAX_PLANES]; // Meta-data for implicit SVP generation, indexed by stream index struct dml2_implicit_svp_meta stream_svp_meta[DML2_MAX_PLANES]; @@ -356,6 +353,10 @@ struct display_configuation_with_meta { struct dml2_optimization_stage5_state stage5; }; +struct dml2_pmo_pstate_strategy { + enum dml2_pstate_method per_stream_pstate_method[DML2_MAX_PLANES]; + bool allow_state_increase; +}; struct 
dml2_core_mode_support_in_out { /* * Inputs @@ -365,7 +366,6 @@ struct dml2_core_mode_support_in_out { struct dml2_mcg_min_clock_table *min_clk_table; int min_clk_index; - /* * Outputs */ @@ -395,7 +395,6 @@ struct dml2_core_mode_programming_in_out { struct dml2_core_instance *instance; const struct display_configuation_with_meta *display_cfg; const struct core_display_cfg_support_info *cfg_support_info; - /* * Outputs (also Input the clk freq are also from programming struct) */ @@ -445,6 +444,7 @@ struct dml2_core_internal_state_intermediates { struct dml2_core_mode_support_locals { struct dml2_core_calcs_mode_support_ex mode_support_ex_params; struct dml2_display_cfg svp_expanded_display_cfg; + struct dml2_calculate_mcache_allocation_in_out calc_mcache_allocation_params; }; struct dml2_core_mode_programming_locals { @@ -600,34 +600,11 @@ struct dml2_pmo_optimize_for_stutter_in_out { struct display_configuation_with_meta *optimized_display_config; }; -enum dml2_pmo_pstate_method { - dml2_pmo_pstate_strategy_na = 0, - /* hw exclusive modes */ - dml2_pmo_pstate_strategy_vactive = 1, - dml2_pmo_pstate_strategy_vblank = 2, - dml2_pmo_pstate_strategy_reserved_hw = 5, - /* fw assisted exclusive modes */ - dml2_pmo_pstate_strategy_fw_svp = 6, - dml2_pmo_pstate_strategy_reserved_fw = 10, - /* fw assisted modes requiring drr modulation */ - dml2_pmo_pstate_strategy_fw_vactive_drr = 11, - dml2_pmo_pstate_strategy_fw_vblank_drr = 12, - dml2_pmo_pstate_strategy_fw_svp_drr = 13, - dml2_pmo_pstate_strategy_reserved_fw_drr_clamped = 20, - dml2_pmo_pstate_strategy_fw_drr = 21, - dml2_pmo_pstate_strategy_reserved_fw_drr_var = 22, -}; - -struct dml2_pmo_pstate_strategy { - enum dml2_pmo_pstate_method per_stream_pstate_method[DML2_MAX_PLANES]; - bool allow_state_increase; -}; - -#define PMO_NO_DRR_STRATEGY_MASK (((1 << (dml2_pmo_pstate_strategy_reserved_fw - dml2_pmo_pstate_strategy_na + 1)) - 1) << dml2_pmo_pstate_strategy_na) -#define PMO_DRR_STRATEGY_MASK (((1 << 
(dml2_pmo_pstate_strategy_reserved_fw_drr_var - dml2_pmo_pstate_strategy_fw_vactive_drr + 1)) - 1) << dml2_pmo_pstate_strategy_fw_vactive_drr) -#define PMO_DRR_CLAMPED_STRATEGY_MASK (((1 << (dml2_pmo_pstate_strategy_reserved_fw_drr_clamped - dml2_pmo_pstate_strategy_fw_vactive_drr + 1)) - 1) << dml2_pmo_pstate_strategy_fw_vactive_drr) -#define PMO_DRR_VAR_STRATEGY_MASK (((1 << (dml2_pmo_pstate_strategy_reserved_fw_drr_var - dml2_pmo_pstate_strategy_fw_drr + 1)) - 1) << dml2_pmo_pstate_strategy_fw_drr) -#define PMO_FW_STRATEGY_MASK (((1 << (dml2_pmo_pstate_strategy_reserved_fw_drr_var - dml2_pmo_pstate_strategy_fw_svp + 1)) - 1) << dml2_pmo_pstate_strategy_fw_svp) +#define PMO_NO_DRR_STRATEGY_MASK (((1 << (dml2_pstate_method_reserved_fw - dml2_pstate_method_na + 1)) - 1) << dml2_pstate_method_na) +#define PMO_DRR_STRATEGY_MASK (((1 << (dml2_pstate_method_reserved_fw_drr_var - dml2_pstate_method_fw_vactive_drr + 1)) - 1) << dml2_pstate_method_fw_vactive_drr) +#define PMO_DRR_CLAMPED_STRATEGY_MASK (((1 << (dml2_pstate_method_reserved_fw_drr_clamped - dml2_pstate_method_fw_vactive_drr + 1)) - 1) << dml2_pstate_method_fw_vactive_drr) +#define PMO_DRR_VAR_STRATEGY_MASK (((1 << (dml2_pstate_method_reserved_fw_drr_var - dml2_pstate_method_fw_drr + 1)) - 1) << dml2_pstate_method_fw_drr) +#define PMO_FW_STRATEGY_MASK (((1 << (dml2_pstate_method_reserved_fw_drr_var - dml2_pstate_method_fw_svp + 1)) - 1) << dml2_pstate_method_fw_svp) #define PMO_DCN4_MAX_DISPLAYS 4 #define PMO_DCN4_MAX_NUM_VARIANTS 2 @@ -645,6 +622,8 @@ struct dml2_pmo_scratch { int stream_mask; } pmo_dcn3; struct { + struct dml2_pmo_pstate_strategy expanded_override_strategy_list[2 * 2 * 2 * 2]; + unsigned int num_expanded_override_strategies; struct dml2_pmo_pstate_strategy pstate_strategy_candidates[DML2_PMO_PSTATE_CANDIDATE_LIST_SIZE]; int num_pstate_candidates; int cur_pstate_candidate; @@ -706,7 +685,6 @@ struct dml2_pmo_instance { int mpc_combine_limit; int odm_combine_limit; int mcg_clock_table_size; - 
union { struct { struct { @@ -963,7 +941,13 @@ struct dml2_top_mcache_validate_admissability_locals { struct dml2_top_display_cfg_support_info { const struct dml2_display_cfg *display_config; struct core_display_cfg_support_info core_info; - enum dml2_pstate_support_method per_plane_pstate_method[DML2_MAX_PLANES]; +}; + +struct dml2_top_funcs { + bool (*check_mode_supported)(struct dml2_check_mode_supported_in_out *in_out); + bool (*build_mode_programming)(struct dml2_build_mode_programming_in_out *in_out); + bool (*build_mcache_programming)(struct dml2_build_mcache_programming_in_out *in_out); + bool (*unit_test)(void); }; struct dml2_instance { @@ -978,8 +962,8 @@ struct dml2_instance { struct dml2_ip_capabilities ip_caps; struct dml2_mcg_min_clock_table min_clk_table; - struct dml2_pmo_options pmo_options; + struct dml2_top_funcs funcs; struct { struct dml2_initialize_instance_locals initialize_instance_locals; diff --git a/drivers/gpu/drm/amd/display/dc/dml2/dml2_mall_phantom.c b/drivers/gpu/drm/amd/display/dc/dml2/dml2_mall_phantom.c index 3d29169dd6bb..6b3b8803e0ae 100644 --- a/drivers/gpu/drm/amd/display/dc/dml2/dml2_mall_phantom.c +++ b/drivers/gpu/drm/amd/display/dc/dml2/dml2_mall_phantom.c @@ -813,7 +813,7 @@ static bool remove_all_phantom_planes_for_stream(struct dml2_context *ctx, struc { int i, old_plane_count; struct dc_stream_status *stream_status = NULL; - struct dc_plane_state *del_planes[MAX_SURFACE_NUM] = { 0 }; + struct dc_plane_state *del_planes[MAX_SURFACES] = { 0 }; for (i = 0; i < context->stream_count; i++) if (context->streams[i] == stream) { diff --git a/drivers/gpu/drm/amd/display/dc/dml2/dml2_translation_helper.c b/drivers/gpu/drm/amd/display/dc/dml2/dml2_translation_helper.c index bde4250853b1..b416320873e1 100644 --- a/drivers/gpu/drm/amd/display/dc/dml2/dml2_translation_helper.c +++ b/drivers/gpu/drm/amd/display/dc/dml2/dml2_translation_helper.c @@ -553,13 +553,53 @@ void dml2_init_soc_states(struct dml2_context *dml2, const struct 
dc *in_dc, } } - dml2_policy_build_synthetic_soc_states(s, p); - if (dml2->v20.dml_core_ctx.project == dml_project_dcn35) { - // Override last out_state with data from last in_state - // This will ensure that out_state contains max fclk - memcpy(&p->out_states->state_array[p->out_states->num_states - 1], - &p->in_states->state_array[p->in_states->num_states - 1], - sizeof(struct soc_state_bounding_box_st)); + if (dml2->v20.dml_core_ctx.project == dml_project_dcn35 || + dml2->v20.dml_core_ctx.project == dml_project_dcn351) { + int max_dcfclk_mhz = 0, max_dispclk_mhz = 0, max_dppclk_mhz = 0, max_phyclk_mhz = 0, + max_dtbclk_mhz = 0, max_fclk_mhz = 0, max_uclk_mhz = 0, max_socclk_mhz = 0; + + for (i = 0; i < p->in_states->num_states; i++) { + if (p->in_states->state_array[i].dcfclk_mhz > max_dcfclk_mhz) + max_dcfclk_mhz = (int)p->in_states->state_array[i].dcfclk_mhz; + if (p->in_states->state_array[i].fabricclk_mhz > max_fclk_mhz) + max_fclk_mhz = (int)p->in_states->state_array[i].fabricclk_mhz; + if (p->in_states->state_array[i].socclk_mhz > max_socclk_mhz) + max_socclk_mhz = (int)p->in_states->state_array[i].socclk_mhz; + if (p->in_states->state_array[i].dram_speed_mts > max_uclk_mhz) + max_uclk_mhz = (int)p->in_states->state_array[i].dram_speed_mts; + if (p->in_states->state_array[i].dispclk_mhz > max_dispclk_mhz) + max_dispclk_mhz = (int)p->in_states->state_array[i].dispclk_mhz; + if (p->in_states->state_array[i].dppclk_mhz > max_dppclk_mhz) + max_dppclk_mhz = (int)p->in_states->state_array[i].dppclk_mhz; + if (p->in_states->state_array[i].phyclk_mhz > max_phyclk_mhz) + max_phyclk_mhz = (int)p->in_states->state_array[i].phyclk_mhz; + if (p->in_states->state_array[i].dtbclk_mhz > max_dtbclk_mhz) + max_dtbclk_mhz = (int)p->in_states->state_array[i].dtbclk_mhz; + } + + for (i = 0; i < p->in_states->num_states; i++) { + /* Independent states - including base (unlisted) parameters from state 0. 
*/ + p->out_states->state_array[i] = p->in_states->state_array[0]; + + p->out_states->state_array[i].dispclk_mhz = max_dispclk_mhz; + p->out_states->state_array[i].dppclk_mhz = max_dppclk_mhz; + p->out_states->state_array[i].dtbclk_mhz = max_dtbclk_mhz; + p->out_states->state_array[i].phyclk_mhz = max_phyclk_mhz; + + p->out_states->state_array[i].dscclk_mhz = max_dispclk_mhz / 3.0; + p->out_states->state_array[i].phyclk_mhz = max_phyclk_mhz; + p->out_states->state_array[i].dtbclk_mhz = max_dtbclk_mhz; + + /* Dependent states. */ + p->out_states->state_array[i].dram_speed_mts = p->in_states->state_array[i].dram_speed_mts; + p->out_states->state_array[i].fabricclk_mhz = p->in_states->state_array[i].fabricclk_mhz; + p->out_states->state_array[i].socclk_mhz = p->in_states->state_array[i].socclk_mhz; + p->out_states->state_array[i].dcfclk_mhz = p->in_states->state_array[i].dcfclk_mhz; + } + + p->out_states->num_states = p->in_states->num_states; + } else { + dml2_policy_build_synthetic_soc_states(s, p); } } diff --git a/drivers/gpu/drm/amd/display/dc/dml2/dml2_wrapper.c b/drivers/gpu/drm/amd/display/dc/dml2/dml2_wrapper.c index 9190c1328d5b..68b882d28195 100644 --- a/drivers/gpu/drm/amd/display/dc/dml2/dml2_wrapper.c +++ b/drivers/gpu/drm/amd/display/dc/dml2/dml2_wrapper.c @@ -531,14 +531,21 @@ static bool optimize_pstate_with_svp_and_drr(struct dml2_context *dml2, struct d static bool call_dml_mode_support_and_programming(struct dc_state *context) { unsigned int result = 0; - unsigned int min_state; + unsigned int min_state = 0; int min_state_for_g6_temp_read = 0; + + + if (!context) + return false; + struct dml2_context *dml2 = context->bw_ctx.dml2; struct dml2_wrapper_scratch *s = &dml2->v20.scratch; - min_state_for_g6_temp_read = calculate_lowest_supported_state_for_temp_read(dml2, context); + if (!context->streams[0]->sink->link->dc->caps.is_apu) { + min_state_for_g6_temp_read = calculate_lowest_supported_state_for_temp_read(dml2, context); - 
ASSERT(min_state_for_g6_temp_read >= 0); + ASSERT(min_state_for_g6_temp_read >= 0); + } if (!dml2->config.use_native_pstate_optimization) { result = optimize_pstate_with_svp_and_drr(dml2, context); @@ -549,14 +556,20 @@ static bool call_dml_mode_support_and_programming(struct dc_state *context) /* Upon trying to sett certain frequencies in FRL, min_state_for_g6_temp_read is reported as -1. This leads to an invalid value of min_state causing crashes later on. * Use the default logic for min_state only when min_state_for_g6_temp_read is a valid value. In other cases, use the value calculated by the DML directly. */ - if (min_state_for_g6_temp_read >= 0) - min_state = min_state_for_g6_temp_read > s->mode_support_params.out_lowest_state_idx ? min_state_for_g6_temp_read : s->mode_support_params.out_lowest_state_idx; - else - min_state = s->mode_support_params.out_lowest_state_idx; - - if (result) - result = dml_mode_programming(&dml2->v20.dml_core_ctx, min_state, &s->cur_display_config, true); + if (!context->streams[0]->sink->link->dc->caps.is_apu) { + if (min_state_for_g6_temp_read >= 0) + min_state = min_state_for_g6_temp_read > s->mode_support_params.out_lowest_state_idx ? 
min_state_for_g6_temp_read : s->mode_support_params.out_lowest_state_idx; + else + min_state = s->mode_support_params.out_lowest_state_idx; + } + if (result) { + if (!context->streams[0]->sink->link->dc->caps.is_apu) { + result = dml_mode_programming(&dml2->v20.dml_core_ctx, min_state, &s->cur_display_config, true); + } else { + result = dml_mode_programming(&dml2->v20.dml_core_ctx, s->mode_support_params.out_lowest_state_idx, &s->cur_display_config, true); + } + } return result; } @@ -685,6 +698,8 @@ static bool dml2_validate_only(struct dc_state *context) build_unoptimized_policy_settings(dml2->v20.dml_core_ctx.project, &dml2->v20.dml_core_ctx.policy); map_dc_state_into_dml_display_cfg(dml2, context, &dml2->v20.scratch.cur_display_config); + if (!dml2->config.skip_hw_state_mapping) + dml2_apply_det_buffer_allocation_policy(dml2, &dml2->v20.scratch.cur_display_config); result = pack_and_call_dml_mode_support_ex(dml2, &dml2->v20.scratch.cur_display_config, @@ -732,11 +747,10 @@ static inline struct dml2_context *dml2_allocate_memory(void) static void dml2_init(const struct dc *in_dc, const struct dml2_configuration_options *config, struct dml2_context **dml2) { - // TODO : Temporarily add DCN_VERSION_3_2 for N-1 validation. Remove DCN_VERSION_3_2 after N-1 validation phase is complete. - if ((in_dc->debug.using_dml21) && (in_dc->ctx->dce_version == DCN_VERSION_4_01 || in_dc->ctx->dce_version == DCN_VERSION_3_2)) { - dml21_reinit(in_dc, dml2, config); + if ((in_dc->debug.using_dml21) && (in_dc->ctx->dce_version == DCN_VERSION_4_01)) { + dml21_reinit(in_dc, dml2, config); return; - } + } // Store config options (*dml2)->config = *config; @@ -771,10 +785,8 @@ static void dml2_init(const struct dc *in_dc, const struct dml2_configuration_op bool dml2_create(const struct dc *in_dc, const struct dml2_configuration_options *config, struct dml2_context **dml2) { - // TODO : Temporarily add DCN_VERSION_3_2 for N-1 validation. 
Remove DCN_VERSION_3_2 after N-1 validation phase is complete. - if ((in_dc->debug.using_dml21) && (in_dc->ctx->dce_version == DCN_VERSION_4_01 || in_dc->ctx->dce_version == DCN_VERSION_3_2)) { + if ((in_dc->debug.using_dml21) && (in_dc->ctx->dce_version == DCN_VERSION_4_01)) return dml21_create(in_dc, dml2, config); - } // Allocate Mode Lib Ctx *dml2 = dml2_allocate_memory(); @@ -842,8 +854,7 @@ void dml2_reinit(const struct dc *in_dc, const struct dml2_configuration_options *config, struct dml2_context **dml2) { - // TODO : Temporarily add DCN_VERSION_3_2 for N-1 validation. Remove DCN_VERSION_3_2 after N-1 validation phase is complete. - if ((in_dc->debug.using_dml21) && (in_dc->ctx->dce_version == DCN_VERSION_4_01 || in_dc->ctx->dce_version == DCN_VERSION_3_2)) { + if ((in_dc->debug.using_dml21) && (in_dc->ctx->dce_version == DCN_VERSION_4_01)) { dml21_reinit(in_dc, dml2, config); return; } diff --git a/drivers/gpu/drm/amd/display/dc/dml2/dml_display_rq_dlg_calc.c b/drivers/gpu/drm/amd/display/dc/dml2/dml_display_rq_dlg_calc.c index 377ef6d01ae5..00d22e542469 100644 --- a/drivers/gpu/drm/amd/display/dc/dml2/dml_display_rq_dlg_calc.c +++ b/drivers/gpu/drm/amd/display/dc/dml2/dml_display_rq_dlg_calc.c @@ -427,18 +427,6 @@ void dml_rq_dlg_get_dlg_reg(dml_display_dlg_regs_st *disp_dlg_regs, dml_print("DML_DLG: %s: disp_dlg_regs->dst_y_per_vm_flip = 0x%x\n", __func__, disp_dlg_regs->dst_y_per_vm_flip); dml_print("DML_DLG: %s: disp_dlg_regs->dst_y_per_row_flip = 0x%x\n", __func__, disp_dlg_regs->dst_y_per_row_flip); - // hack for FPGA - /* NOTE: We dont have getenv defined in driver and it does not make any sense in the driver */ - /*char* fpga_env = getenv("FPGA_FPDIV"); - if(fpga_env !=NULL) - { - if(disp_dlg_regs->vratio_prefetch >= (dml_uint_t)dml_pow(2, 22)) - { - disp_dlg_regs->vratio_prefetch = (dml_uint_t)dml_pow(2, 22)-1; - dml_print("FPGA msg: vratio_prefetch exceed the max value, the register field is [21:0]\n"); - } - }*/ - 
disp_dlg_regs->refcyc_per_vm_group_vblank = (dml_uint_t)(dml_get_refcyc_per_vm_group_vblank_in_us(mode_lib, pipe_idx) * refclk_freq_in_mhz); disp_dlg_regs->refcyc_per_vm_group_flip = (dml_uint_t)(dml_get_refcyc_per_vm_group_flip_in_us(mode_lib, pipe_idx) * refclk_freq_in_mhz); disp_dlg_regs->refcyc_per_vm_req_vblank = (dml_uint_t)(dml_get_refcyc_per_vm_req_vblank_in_us(mode_lib, pipe_idx) * refclk_freq_in_mhz * dml_pow(2, 10)); diff --git a/drivers/gpu/drm/amd/display/dc/dsc/dc_dsc.c b/drivers/gpu/drm/amd/display/dc/dsc/dc_dsc.c index d9aaebfa3a0a..11535922b5ff 100644 --- a/drivers/gpu/drm/amd/display/dc/dsc/dc_dsc.c +++ b/drivers/gpu/drm/amd/display/dc/dsc/dc_dsc.c @@ -30,6 +30,9 @@ #include "rc_calc.h" #include "fixed31_32.h" +#define DC_LOGGER \ + dsc->ctx->logger + /* This module's internal functions */ /* default DSC policy target bitrate limit is 16bpp */ @@ -480,6 +483,48 @@ bool dc_dsc_compute_bandwidth_range( return is_dsc_possible; } +void dc_dsc_dump_encoder_caps(const struct display_stream_compressor *dsc, + const struct dc_crtc_timing *timing) +{ + struct dsc_enc_caps dsc_enc_caps; + + get_dsc_enc_caps(dsc, &dsc_enc_caps, timing->pix_clk_100hz); + + DC_LOG_DSC("dsc encoder caps:"); + DC_LOG_DSC("\tdsc_version 0x%x", dsc_enc_caps.dsc_version); + DC_LOG_DSC("\tslice_caps 0x%x", dsc_enc_caps.slice_caps.raw); + DC_LOG_DSC("\tlb_bit_depth %d", dsc_enc_caps.lb_bit_depth); + DC_LOG_DSC("\tis_block_pred_supported %d", dsc_enc_caps.is_block_pred_supported); + DC_LOG_DSC("\tcolor_formats 0x%x", dsc_enc_caps.color_formats.raw); + DC_LOG_DSC("\tcolor_depth 0x%x", dsc_enc_caps.color_depth.raw); + DC_LOG_DSC("\tmax_total_throughput_mps %d", dsc_enc_caps.max_total_throughput_mps); + DC_LOG_DSC("\tmax_slice_width %d", dsc_enc_caps.max_slice_width); + DC_LOG_DSC("\tbpp_increment_div %d", dsc_enc_caps.bpp_increment_div); +} + +void dc_dsc_dump_decoder_caps(const struct display_stream_compressor *dsc, + const struct dsc_dec_dpcd_caps *dsc_sink_caps) +{ + DC_LOG_DSC("dsc 
decoder caps:"); + DC_LOG_DSC("\tis_dsc_supported %d", dsc_sink_caps->is_dsc_supported); + DC_LOG_DSC("\tdsc_version 0x%x", dsc_sink_caps->dsc_version); + DC_LOG_DSC("\trc_buffer_size %d", dsc_sink_caps->rc_buffer_size); + DC_LOG_DSC("\tslice_caps1 0x%x", dsc_sink_caps->slice_caps1.raw); + DC_LOG_DSC("\tslice_caps2 0x%x", dsc_sink_caps->slice_caps2.raw); + DC_LOG_DSC("\tlb_bit_depth %d", dsc_sink_caps->lb_bit_depth); + DC_LOG_DSC("\tis_block_pred_supported %d", dsc_sink_caps->is_block_pred_supported); + DC_LOG_DSC("\tedp_max_bits_per_pixel %d", dsc_sink_caps->edp_max_bits_per_pixel); + DC_LOG_DSC("\tcolor_formats 0x%x", dsc_sink_caps->color_formats.raw); + DC_LOG_DSC("\tthroughput_mode_0_mps %d", dsc_sink_caps->throughput_mode_0_mps); + DC_LOG_DSC("\tthroughput_mode_1_mps %d", dsc_sink_caps->throughput_mode_1_mps); + DC_LOG_DSC("\tmax_slice_width %d", dsc_sink_caps->max_slice_width); + DC_LOG_DSC("\tbpp_increment_div %d", dsc_sink_caps->bpp_increment_div); + DC_LOG_DSC("\tbranch_overall_throughput_0_mps %d", dsc_sink_caps->branch_overall_throughput_0_mps); + DC_LOG_DSC("\tbranch_overall_throughput_1_mps %d", dsc_sink_caps->branch_overall_throughput_1_mps); + DC_LOG_DSC("\tbranch_max_line_width %d", dsc_sink_caps->branch_max_line_width); + DC_LOG_DSC("\tis_dp %d", dsc_sink_caps->is_dp); +} + static void get_dsc_enc_caps( const struct display_stream_compressor *dsc, struct dsc_enc_caps *dsc_enc_caps, diff --git a/drivers/gpu/drm/amd/display/dc/dwb/dcn30/dcn30_dwb.c b/drivers/gpu/drm/amd/display/dc/dwb/dcn30/dcn30_dwb.c index fae98cf52020..bc058f682438 100644 --- a/drivers/gpu/drm/amd/display/dc/dwb/dcn30/dcn30_dwb.c +++ b/drivers/gpu/drm/amd/display/dc/dwb/dcn30/dcn30_dwb.c @@ -270,16 +270,3 @@ void dcn30_dwbc_construct(struct dcn30_dwbc *dwbc30, dwbc30->dwbc_shift = dwbc_shift; dwbc30->dwbc_mask = dwbc_mask; } - -void dwb3_set_host_read_rate_control(struct dwbc *dwbc, bool host_read_delay) -{ - struct dcn30_dwbc *dwbc30 = TO_DCN30_DWBC(dwbc); - - /* - * Set maximum 
delay of host read access to DWBSCL LUT or OGAM LUT if there are no - * idle cycles in HW pipeline (in number of clock cycles times 4) - */ - REG_UPDATE(DWB_HOST_READ_CONTROL, DWB_HOST_READ_RATE_CONTROL, host_read_delay); - - DC_LOG_DWB("%s dwb3_rate_control at inst = %d", __func__, dwbc->inst); -} diff --git a/drivers/gpu/drm/amd/display/dc/dwb/dcn30/dcn30_dwb.h b/drivers/gpu/drm/amd/display/dc/dwb/dcn30/dcn30_dwb.h index 0f3f7c5fbaec..7f053f49ec6a 100644 --- a/drivers/gpu/drm/amd/display/dc/dwb/dcn30/dcn30_dwb.h +++ b/drivers/gpu/drm/amd/display/dc/dwb/dcn30/dcn30_dwb.h @@ -914,7 +914,6 @@ bool dwb3_ogam_set_input_transfer_func( struct dwbc *dwbc, const struct dc_transfer_func *in_transfer_func_dwb_ogam); -void dwb3_set_host_read_rate_control(struct dwbc *dwbc, bool host_read_delay); #endif diff --git a/drivers/gpu/drm/amd/display/dc/hubp/dcn10/dcn10_hubp.c b/drivers/gpu/drm/amd/display/dc/hubp/dcn10/dcn10_hubp.c index 22ac2b7e49ae..8364c9f9231a 100644 --- a/drivers/gpu/drm/amd/display/dc/hubp/dcn10/dcn10_hubp.c +++ b/drivers/gpu/drm/amd/display/dc/hubp/dcn10/dcn10_hubp.c @@ -140,7 +140,7 @@ void hubp1_vready_workaround(struct hubp *hubp, void hubp1_program_tiling( struct hubp *hubp, - const union dc_tiling_info *info, + const struct dc_tiling_info *info, const enum surface_pixel_format pixel_format) { struct dcn10_hubp *hubp1 = TO_DCN10_HUBP(hubp); @@ -518,6 +518,20 @@ bool hubp1_program_surface_flip_and_addr( return true; } +void hubp1_clear_tiling(struct hubp *hubp) +{ + struct dcn10_hubp *hubp1 = TO_DCN10_HUBP(hubp); + + REG_UPDATE(DCHUBP_REQ_SIZE_CONFIG, SWATH_HEIGHT, 0); + REG_UPDATE(DCSURF_TILING_CONFIG, SW_MODE, DC_SW_LINEAR); + + REG_UPDATE_4(DCSURF_SURFACE_CONTROL, + PRIMARY_SURFACE_DCC_EN, 0, + PRIMARY_SURFACE_DCC_IND_64B_BLK, 0, + SECONDARY_SURFACE_DCC_EN, 0, + SECONDARY_SURFACE_DCC_IND_64B_BLK, 0); +} + void hubp1_dcc_control(struct hubp *hubp, bool enable, enum hubp_ind_block_size independent_64b_blks) { @@ -535,7 +549,7 @@ void 
hubp1_dcc_control(struct hubp *hubp, bool enable, void hubp1_program_surface_config( struct hubp *hubp, enum surface_pixel_format format, - union dc_tiling_info *tiling_info, + struct dc_tiling_info *tiling_info, struct plane_size *plane_size, enum dc_rotation_angle rotation, struct dc_plane_dcc_param *dcc, @@ -1363,6 +1377,7 @@ static const struct hubp_funcs dcn10_hubp_funcs = { .hubp_disable_control = hubp1_disable_control, .hubp_get_underflow_status = hubp1_get_underflow_status, .hubp_init = hubp1_init, + .hubp_clear_tiling = hubp1_clear_tiling, .dmdata_set_attributes = NULL, .dmdata_load = NULL, diff --git a/drivers/gpu/drm/amd/display/dc/hubp/dcn10/dcn10_hubp.h b/drivers/gpu/drm/amd/display/dc/hubp/dcn10/dcn10_hubp.h index 69119b2fdce2..a85dc3be786f 100644 --- a/drivers/gpu/drm/amd/display/dc/hubp/dcn10/dcn10_hubp.h +++ b/drivers/gpu/drm/amd/display/dc/hubp/dcn10/dcn10_hubp.h @@ -706,7 +706,7 @@ struct dcn10_hubp { void hubp1_program_surface_config( struct hubp *hubp, enum surface_pixel_format format, - union dc_tiling_info *tiling_info, + struct dc_tiling_info *tiling_info, struct plane_size *plane_size, enum dc_rotation_angle rotation, struct dc_plane_dcc_param *dcc, @@ -739,7 +739,7 @@ void hubp1_program_rotation( void hubp1_program_tiling( struct hubp *hubp, - const union dc_tiling_info *info, + const struct dc_tiling_info *info, const enum surface_pixel_format pixel_format); void hubp1_dcc_control(struct hubp *hubp, @@ -794,4 +794,6 @@ void hubp1_soft_reset(struct hubp *hubp, bool reset); void hubp1_set_flip_int(struct hubp *hubp); +void hubp1_clear_tiling(struct hubp *hubp); + #endif diff --git a/drivers/gpu/drm/amd/display/dc/hubp/dcn20/dcn20_hubp.c b/drivers/gpu/drm/amd/display/dc/hubp/dcn20/dcn20_hubp.c index 0637e4c552d8..c74f6a3313a2 100644 --- a/drivers/gpu/drm/amd/display/dc/hubp/dcn20/dcn20_hubp.c +++ b/drivers/gpu/drm/amd/display/dc/hubp/dcn20/dcn20_hubp.c @@ -310,7 +310,7 @@ void hubp2_setup_interdependent( */ static void hubp2_program_tiling( 
struct dcn20_hubp *hubp2, - const union dc_tiling_info *info, + const struct dc_tiling_info *info, const enum surface_pixel_format pixel_format) { REG_UPDATE_3(DCSURF_ADDR_CONFIG, @@ -406,6 +406,20 @@ void hubp2_program_rotation( H_MIRROR_EN, mirror); } +void hubp2_clear_tiling(struct hubp *hubp) +{ + struct dcn20_hubp *hubp2 = TO_DCN20_HUBP(hubp); + + REG_UPDATE(DCHUBP_REQ_SIZE_CONFIG, SWATH_HEIGHT, 0); + REG_UPDATE(DCSURF_TILING_CONFIG, SW_MODE, DC_SW_LINEAR); + + REG_UPDATE_4(DCSURF_SURFACE_CONTROL, + PRIMARY_SURFACE_DCC_EN, 0, + PRIMARY_SURFACE_DCC_IND_64B_BLK, 0, + SECONDARY_SURFACE_DCC_EN, 0, + SECONDARY_SURFACE_DCC_IND_64B_BLK, 0); +} + void hubp2_dcc_control(struct hubp *hubp, bool enable, enum hubp_ind_block_size independent_64b_blks) { @@ -536,7 +550,7 @@ void hubp2_program_pixel_format( void hubp2_program_surface_config( struct hubp *hubp, enum surface_pixel_format format, - union dc_tiling_info *tiling_info, + struct dc_tiling_info *tiling_info, struct plane_size *plane_size, enum dc_rotation_angle rotation, struct dc_plane_dcc_param *dcc, @@ -1676,6 +1690,7 @@ static struct hubp_funcs dcn20_hubp_funcs = { .hubp_in_blank = hubp1_in_blank, .hubp_soft_reset = hubp1_soft_reset, .hubp_set_flip_int = hubp1_set_flip_int, + .hubp_clear_tiling = hubp2_clear_tiling, }; diff --git a/drivers/gpu/drm/amd/display/dc/hubp/dcn20/dcn20_hubp.h b/drivers/gpu/drm/amd/display/dc/hubp/dcn20/dcn20_hubp.h index 18e194507e36..6968087a3605 100644 --- a/drivers/gpu/drm/amd/display/dc/hubp/dcn20/dcn20_hubp.h +++ b/drivers/gpu/drm/amd/display/dc/hubp/dcn20/dcn20_hubp.h @@ -382,7 +382,7 @@ void hubp2_program_pixel_format( void hubp2_program_surface_config( struct hubp *hubp, enum surface_pixel_format format, - union dc_tiling_info *tiling_info, + struct dc_tiling_info *tiling_info, struct plane_size *plane_size, enum dc_rotation_angle rotation, struct dc_plane_dcc_param *dcc, @@ -409,6 +409,8 @@ void hubp2_read_state_common(struct hubp *hubp); void hubp2_read_state(struct hubp 
*hubp); +void hubp2_clear_tiling(struct hubp *hubp); + #endif /* __DC_MEM_INPUT_DCN20_H__ */ diff --git a/drivers/gpu/drm/amd/display/dc/hubp/dcn201/dcn201_hubp.c b/drivers/gpu/drm/amd/display/dc/hubp/dcn201/dcn201_hubp.c index cd2bfcc51276..65c628078ca2 100644 --- a/drivers/gpu/drm/amd/display/dc/hubp/dcn201/dcn201_hubp.c +++ b/drivers/gpu/drm/amd/display/dc/hubp/dcn201/dcn201_hubp.c @@ -42,7 +42,7 @@ static void hubp201_program_surface_config( struct hubp *hubp, enum surface_pixel_format format, - union dc_tiling_info *tiling_info, + struct dc_tiling_info *tiling_info, struct plane_size *plane_size, enum dc_rotation_angle rotation, struct dc_plane_dcc_param *dcc, @@ -131,6 +131,7 @@ static struct hubp_funcs dcn201_hubp_funcs = { .hubp_clear_underflow = hubp1_clear_underflow, .hubp_set_flip_control_surface_gsl = hubp2_set_flip_control_surface_gsl, .hubp_init = hubp1_init, + .hubp_clear_tiling = hubp1_clear_tiling, }; bool dcn201_hubp_construct( diff --git a/drivers/gpu/drm/amd/display/dc/hubp/dcn21/dcn21_hubp.c b/drivers/gpu/drm/amd/display/dc/hubp/dcn21/dcn21_hubp.c index e13d69a22c1c..edbdb8c88d5c 100644 --- a/drivers/gpu/drm/amd/display/dc/hubp/dcn21/dcn21_hubp.c +++ b/drivers/gpu/drm/amd/display/dc/hubp/dcn21/dcn21_hubp.c @@ -837,6 +837,7 @@ static struct hubp_funcs dcn21_hubp_funcs = { .hubp_init = hubp21_init, .validate_dml_output = hubp21_validate_dml_output, .hubp_set_flip_int = hubp1_set_flip_int, + .hubp_clear_tiling = hubp1_clear_tiling, }; bool hubp21_construct( diff --git a/drivers/gpu/drm/amd/display/dc/hubp/dcn30/dcn30_hubp.c b/drivers/gpu/drm/amd/display/dc/hubp/dcn30/dcn30_hubp.c index 60a64d290352..12b282ed7067 100644 --- a/drivers/gpu/drm/amd/display/dc/hubp/dcn30/dcn30_hubp.c +++ b/drivers/gpu/drm/amd/display/dc/hubp/dcn30/dcn30_hubp.c @@ -318,7 +318,7 @@ bool hubp3_program_surface_flip_and_addr( void hubp3_program_tiling( struct dcn20_hubp *hubp2, - const union dc_tiling_info *info, + const struct dc_tiling_info *info, const enum 
surface_pixel_format pixel_format) { REG_UPDATE_4(DCSURF_ADDR_CONFIG, @@ -334,6 +334,22 @@ void hubp3_program_tiling( } +void hubp3_clear_tiling(struct hubp *hubp) +{ + struct dcn20_hubp *hubp2 = TO_DCN20_HUBP(hubp); + + REG_UPDATE(DCHUBP_REQ_SIZE_CONFIG, SWATH_HEIGHT, 0); + REG_UPDATE(DCSURF_TILING_CONFIG, SW_MODE, DC_SW_LINEAR); + + REG_UPDATE_6(DCSURF_SURFACE_CONTROL, + PRIMARY_SURFACE_DCC_EN, 0, + PRIMARY_SURFACE_DCC_IND_BLK, 0, + PRIMARY_SURFACE_DCC_IND_BLK_C, 0, + SECONDARY_SURFACE_DCC_EN, 0, + SECONDARY_SURFACE_DCC_IND_BLK, 0, + SECONDARY_SURFACE_DCC_IND_BLK_C, 0); +} + void hubp3_dcc_control(struct hubp *hubp, bool enable, enum hubp_ind_block_size blk_size) { @@ -395,7 +411,7 @@ void hubp3_dmdata_set_attributes( void hubp3_program_surface_config( struct hubp *hubp, enum surface_pixel_format format, - union dc_tiling_info *tiling_info, + struct dc_tiling_info *tiling_info, struct plane_size *plane_size, enum dc_rotation_angle rotation, struct dc_plane_dcc_param *dcc, @@ -512,6 +528,7 @@ static struct hubp_funcs dcn30_hubp_funcs = { .hubp_in_blank = hubp1_in_blank, .hubp_soft_reset = hubp1_soft_reset, .hubp_set_flip_int = hubp1_set_flip_int, + .hubp_clear_tiling = hubp3_clear_tiling, }; bool hubp3_construct( diff --git a/drivers/gpu/drm/amd/display/dc/hubp/dcn30/dcn30_hubp.h b/drivers/gpu/drm/amd/display/dc/hubp/dcn30/dcn30_hubp.h index b010531a7fe8..b7d7adf0b58c 100644 --- a/drivers/gpu/drm/amd/display/dc/hubp/dcn30/dcn30_hubp.h +++ b/drivers/gpu/drm/amd/display/dc/hubp/dcn30/dcn30_hubp.h @@ -264,7 +264,7 @@ bool hubp3_program_surface_flip_and_addr( void hubp3_program_surface_config( struct hubp *hubp, enum surface_pixel_format format, - union dc_tiling_info *tiling_info, + struct dc_tiling_info *tiling_info, struct plane_size *plane_size, enum dc_rotation_angle rotation, struct dc_plane_dcc_param *dcc, @@ -280,7 +280,7 @@ void hubp3_setup( void hubp3_program_tiling( struct dcn20_hubp *hubp2, - const union dc_tiling_info *info, + const struct dc_tiling_info 
*info, const enum surface_pixel_format pixel_format); void hubp3_dcc_control(struct hubp *hubp, bool enable, @@ -297,6 +297,8 @@ void hubp3_read_state(struct hubp *hubp); void hubp3_init(struct hubp *hubp); +void hubp3_clear_tiling(struct hubp *hubp); + #endif /* __DC_HUBP_DCN30_H__ */ diff --git a/drivers/gpu/drm/amd/display/dc/hubp/dcn31/dcn31_hubp.c b/drivers/gpu/drm/amd/display/dc/hubp/dcn31/dcn31_hubp.c index 8394e8c06919..46b804ed05fb 100644 --- a/drivers/gpu/drm/amd/display/dc/hubp/dcn31/dcn31_hubp.c +++ b/drivers/gpu/drm/amd/display/dc/hubp/dcn31/dcn31_hubp.c @@ -96,6 +96,7 @@ static struct hubp_funcs dcn31_hubp_funcs = { .hubp_set_flip_int = hubp1_set_flip_int, .hubp_in_blank = hubp1_in_blank, .program_extended_blank = hubp31_program_extended_blank, + .hubp_clear_tiling = hubp3_clear_tiling, }; bool hubp31_construct( diff --git a/drivers/gpu/drm/amd/display/dc/hubp/dcn32/dcn32_hubp.c b/drivers/gpu/drm/amd/display/dc/hubp/dcn32/dcn32_hubp.c index ca5b4b28a664..8b5bd73b8094 100644 --- a/drivers/gpu/drm/amd/display/dc/hubp/dcn32/dcn32_hubp.c +++ b/drivers/gpu/drm/amd/display/dc/hubp/dcn32/dcn32_hubp.c @@ -201,7 +201,8 @@ static struct hubp_funcs dcn32_hubp_funcs = { .hubp_update_force_cursor_pstate_disallow = hubp32_update_force_cursor_pstate_disallow, .phantom_hubp_post_enable = hubp32_phantom_hubp_post_enable, .hubp_update_mall_sel = hubp32_update_mall_sel, - .hubp_prepare_subvp_buffering = hubp32_prepare_subvp_buffering + .hubp_prepare_subvp_buffering = hubp32_prepare_subvp_buffering, + .hubp_clear_tiling = hubp3_clear_tiling, }; bool hubp32_construct( diff --git a/drivers/gpu/drm/amd/display/dc/hubp/dcn35/dcn35_hubp.c b/drivers/gpu/drm/amd/display/dc/hubp/dcn35/dcn35_hubp.c index d1f05b82b3dd..faf37febc6fb 100644 --- a/drivers/gpu/drm/amd/display/dc/hubp/dcn35/dcn35_hubp.c +++ b/drivers/gpu/drm/amd/display/dc/hubp/dcn35/dcn35_hubp.c @@ -172,7 +172,7 @@ void hubp35_program_pixel_format( void hubp35_program_surface_config( struct hubp *hubp, enum 
surface_pixel_format format, - union dc_tiling_info *tiling_info, + struct dc_tiling_info *tiling_info, struct plane_size *plane_size, enum dc_rotation_angle rotation, struct dc_plane_dcc_param *dcc, @@ -216,6 +216,7 @@ static struct hubp_funcs dcn35_hubp_funcs = { .hubp_set_flip_int = hubp1_set_flip_int, .hubp_in_blank = hubp1_in_blank, .program_extended_blank = hubp31_program_extended_blank_value, + .hubp_clear_tiling = hubp3_clear_tiling, }; bool hubp35_construct( diff --git a/drivers/gpu/drm/amd/display/dc/hubp/dcn35/dcn35_hubp.h b/drivers/gpu/drm/amd/display/dc/hubp/dcn35/dcn35_hubp.h index 586b43aa5834..d913f80b3130 100644 --- a/drivers/gpu/drm/amd/display/dc/hubp/dcn35/dcn35_hubp.h +++ b/drivers/gpu/drm/amd/display/dc/hubp/dcn35/dcn35_hubp.h @@ -65,7 +65,7 @@ void hubp35_program_pixel_format( void hubp35_program_surface_config( struct hubp *hubp, enum surface_pixel_format format, - union dc_tiling_info *tiling_info, + struct dc_tiling_info *tiling_info, struct plane_size *plane_size, enum dc_rotation_angle rotation, struct dc_plane_dcc_param *dcc, diff --git a/drivers/gpu/drm/amd/display/dc/hubp/dcn401/dcn401_hubp.c b/drivers/gpu/drm/amd/display/dc/hubp/dcn401/dcn401_hubp.c index b1ebf5053b4f..28ceceaf9e31 100644 --- a/drivers/gpu/drm/amd/display/dc/hubp/dcn401/dcn401_hubp.c +++ b/drivers/gpu/drm/amd/display/dc/hubp/dcn401/dcn401_hubp.c @@ -40,7 +40,7 @@ #define FN(reg_name, field_name) \ hubp2->hubp_shift->field_name, hubp2->hubp_mask->field_name -static void hubp401_program_3dlut_fl_addr(struct hubp *hubp, +void hubp401_program_3dlut_fl_addr(struct hubp *hubp, const struct dc_plane_address address) { struct dcn20_hubp *hubp2 = TO_DCN20_HUBP(hubp); @@ -49,14 +49,14 @@ static void hubp401_program_3dlut_fl_addr(struct hubp *hubp, REG_WRITE(HUBP_3DLUT_ADDRESS_LOW, address.lut3d.addr.low_part); } -static void hubp401_program_3dlut_fl_dlg_param(struct hubp *hubp, int refcyc_per_3dlut_group) +void hubp401_program_3dlut_fl_dlg_param(struct hubp *hubp, int 
refcyc_per_3dlut_group) { struct dcn20_hubp *hubp2 = TO_DCN20_HUBP(hubp); REG_UPDATE(HUBP_3DLUT_DLG_PARAM, REFCYC_PER_3DLUT_GROUP, refcyc_per_3dlut_group); } -static void hubp401_enable_3dlut_fl(struct hubp *hubp, bool enable) +void hubp401_enable_3dlut_fl(struct hubp *hubp, bool enable) { struct dcn20_hubp *hubp2 = TO_DCN20_HUBP(hubp); @@ -72,28 +72,28 @@ int hubp401_get_3dlut_fl_done(struct hubp *hubp) return ret; } -static void hubp401_program_3dlut_fl_addressing_mode(struct hubp *hubp, enum hubp_3dlut_fl_addressing_mode addr_mode) +void hubp401_program_3dlut_fl_addressing_mode(struct hubp *hubp, enum hubp_3dlut_fl_addressing_mode addr_mode) { struct dcn20_hubp *hubp2 = TO_DCN20_HUBP(hubp); REG_UPDATE(HUBP_3DLUT_CONTROL, HUBP_3DLUT_ADDRESSING_MODE, addr_mode); } -static void hubp401_program_3dlut_fl_width(struct hubp *hubp, enum hubp_3dlut_fl_width width) +void hubp401_program_3dlut_fl_width(struct hubp *hubp, enum hubp_3dlut_fl_width width) { struct dcn20_hubp *hubp2 = TO_DCN20_HUBP(hubp); REG_UPDATE(HUBP_3DLUT_CONTROL, HUBP_3DLUT_WIDTH, width); } -static void hubp401_program_3dlut_fl_tmz_protected(struct hubp *hubp, bool protection_enabled) +void hubp401_program_3dlut_fl_tmz_protected(struct hubp *hubp, bool protection_enabled) { struct dcn20_hubp *hubp2 = TO_DCN20_HUBP(hubp); REG_UPDATE(HUBP_3DLUT_CONTROL, HUBP_3DLUT_TMZ, protection_enabled ? 
1 : 0); } -static void hubp401_program_3dlut_fl_crossbar(struct hubp *hubp, +void hubp401_program_3dlut_fl_crossbar(struct hubp *hubp, enum hubp_3dlut_fl_crossbar_bit_slice bit_slice_y_g, enum hubp_3dlut_fl_crossbar_bit_slice bit_slice_cb_b, enum hubp_3dlut_fl_crossbar_bit_slice bit_slice_cr_r) @@ -106,21 +106,21 @@ static void hubp401_program_3dlut_fl_crossbar(struct hubp *hubp, HUBP_3DLUT_CROSSBAR_SELECT_CR_R, bit_slice_cr_r); } -static void hubp401_update_3dlut_fl_bias_scale(struct hubp *hubp, uint16_t bias, uint16_t scale) +void hubp401_update_3dlut_fl_bias_scale(struct hubp *hubp, uint16_t bias, uint16_t scale) { struct dcn20_hubp *hubp2 = TO_DCN20_HUBP(hubp); REG_UPDATE_2(_3DLUT_FL_BIAS_SCALE, HUBP0_3DLUT_FL_BIAS, bias, HUBP0_3DLUT_FL_SCALE, scale); } -static void hubp401_program_3dlut_fl_mode(struct hubp *hubp, enum hubp_3dlut_fl_mode mode) +void hubp401_program_3dlut_fl_mode(struct hubp *hubp, enum hubp_3dlut_fl_mode mode) { struct dcn20_hubp *hubp2 = TO_DCN20_HUBP(hubp); REG_UPDATE(_3DLUT_FL_CONFIG, HUBP0_3DLUT_FL_MODE, mode); } -static void hubp401_program_3dlut_fl_format(struct hubp *hubp, enum hubp_3dlut_fl_format format) +void hubp401_program_3dlut_fl_format(struct hubp *hubp, enum hubp_3dlut_fl_format format) { struct dcn20_hubp *hubp2 = TO_DCN20_HUBP(hubp); @@ -145,30 +145,44 @@ void hubp401_init(struct hubp *hubp) } void hubp401_vready_at_or_After_vsync(struct hubp *hubp, - struct _vcs_dpi_display_pipe_dest_params_st *pipe_dest) + union dml2_global_sync_programming *pipe_global_sync, + struct dc_crtc_timing *timing) { - uint32_t value = 0; + unsigned int vstartup_lines = pipe_global_sync->dcn4x.vstartup_lines; + unsigned int vupdate_offset_pixels = pipe_global_sync->dcn4x.vupdate_offset_pixels; + unsigned int vupdate_width_pixels = pipe_global_sync->dcn4x.vupdate_vupdate_width_pixels; + unsigned int vready_offset_pixels = pipe_global_sync->dcn4x.vready_offset_pixels; + unsigned int htotal = timing->h_total; + unsigned int vblank_start = 0; + 
unsigned int vblank_end = 0; + unsigned int pixel_width = 0; + uint32_t reg_value = 0; + bool is_vready_at_or_after_vsync = false; struct dcn20_hubp *hubp2 = TO_DCN20_HUBP(hubp); + /* * if (VSTARTUP_START - (VREADY_OFFSET+VUPDATE_WIDTH+VUPDATE_OFFSET)/htotal) <= OTG_V_BLANK_END * Set HUBP_VREADY_AT_OR_AFTER_VSYNC = 1 * else * Set HUBP_VREADY_AT_OR_AFTER_VSYNC = 0 */ - if (pipe_dest->htotal != 0) { - if ((pipe_dest->vstartup_start - (pipe_dest->vready_offset+pipe_dest->vupdate_width - + pipe_dest->vupdate_offset) / pipe_dest->htotal) <= pipe_dest->vblank_end) { - value = 1; - } else - value = 0; + if (htotal != 0) { + vblank_start = timing->v_total - timing->v_front_porch; + vblank_end = vblank_start - timing->v_addressable - timing->v_border_top - timing->v_border_bottom; + pixel_width = vready_offset_pixels + vupdate_width_pixels + vupdate_offset_pixels; + + is_vready_at_or_after_vsync = (vstartup_lines - pixel_width / htotal) <= vblank_end; + + if (is_vready_at_or_after_vsync) + reg_value = 1; } - REG_UPDATE(DCHUBP_CNTL, HUBP_VREADY_AT_OR_AFTER_VSYNC, value); + REG_UPDATE(DCHUBP_CNTL, HUBP_VREADY_AT_OR_AFTER_VSYNC, reg_value); } void hubp401_program_requestor( struct hubp *hubp, - struct _vcs_dpi_display_rq_regs_st *rq_regs) + struct dml2_display_rq_regs *rq_regs) { struct dcn20_hubp *hubp2 = TO_DCN20_HUBP(hubp); @@ -196,8 +210,8 @@ void hubp401_program_requestor( void hubp401_program_deadline( struct hubp *hubp, - struct _vcs_dpi_display_dlg_regs_st *dlg_attr, - struct _vcs_dpi_display_ttu_regs_st *ttu_attr) + struct dml2_display_dlg_regs *dlg_attr, + struct dml2_display_ttu_regs *ttu_attr) { struct dcn20_hubp *hubp2 = TO_DCN20_HUBP(hubp); @@ -294,66 +308,64 @@ void hubp401_program_deadline( void hubp401_setup( struct hubp *hubp, - struct _vcs_dpi_display_dlg_regs_st *dlg_attr, - struct _vcs_dpi_display_ttu_regs_st *ttu_attr, - struct _vcs_dpi_display_rq_regs_st *rq_regs, - struct _vcs_dpi_display_pipe_dest_params_st *pipe_dest) + struct 
dml2_dchub_per_pipe_register_set *pipe_regs, + union dml2_global_sync_programming *pipe_global_sync, + struct dc_crtc_timing *timing) { /* otg is locked when this func is called. Register are double buffered. * disable the requestors is not needed */ - hubp401_vready_at_or_After_vsync(hubp, pipe_dest); - hubp401_program_requestor(hubp, rq_regs); - hubp401_program_deadline(hubp, dlg_attr, ttu_attr); + hubp401_vready_at_or_After_vsync(hubp, pipe_global_sync, timing); + hubp401_program_requestor(hubp, &pipe_regs->rq_regs); + hubp401_program_deadline(hubp, &pipe_regs->dlg_regs, &pipe_regs->ttu_regs); } void hubp401_setup_interdependent( struct hubp *hubp, - struct _vcs_dpi_display_dlg_regs_st *dlg_attr, - struct _vcs_dpi_display_ttu_regs_st *ttu_attr) + struct dml2_dchub_per_pipe_register_set *pipe_regs) { struct dcn20_hubp *hubp2 = TO_DCN20_HUBP(hubp); REG_SET_2(PREFETCH_SETTINGS, 0, - DST_Y_PREFETCH, dlg_attr->dst_y_prefetch, - VRATIO_PREFETCH, dlg_attr->vratio_prefetch); + DST_Y_PREFETCH, pipe_regs->dlg_regs.dst_y_prefetch, + VRATIO_PREFETCH, pipe_regs->dlg_regs.vratio_prefetch); REG_SET(PREFETCH_SETTINGS_C, 0, - VRATIO_PREFETCH_C, dlg_attr->vratio_prefetch_c); + VRATIO_PREFETCH_C, pipe_regs->dlg_regs.vratio_prefetch_c); REG_SET_2(VBLANK_PARAMETERS_0, 0, - DST_Y_PER_VM_VBLANK, dlg_attr->dst_y_per_vm_vblank, - DST_Y_PER_ROW_VBLANK, dlg_attr->dst_y_per_row_vblank); + DST_Y_PER_VM_VBLANK, pipe_regs->dlg_regs.dst_y_per_vm_vblank, + DST_Y_PER_ROW_VBLANK, pipe_regs->dlg_regs.dst_y_per_row_vblank); REG_SET_2(FLIP_PARAMETERS_0, 0, - DST_Y_PER_VM_FLIP, dlg_attr->dst_y_per_vm_flip, - DST_Y_PER_ROW_FLIP, dlg_attr->dst_y_per_row_flip); + DST_Y_PER_VM_FLIP, pipe_regs->dlg_regs.dst_y_per_vm_flip, + DST_Y_PER_ROW_FLIP, pipe_regs->dlg_regs.dst_y_per_row_flip); REG_SET(VBLANK_PARAMETERS_3, 0, - REFCYC_PER_META_CHUNK_VBLANK_L, dlg_attr->refcyc_per_meta_chunk_vblank_l); + REFCYC_PER_META_CHUNK_VBLANK_L, pipe_regs->dlg_regs.refcyc_per_meta_chunk_vblank_l); REG_SET(VBLANK_PARAMETERS_4, 
0, - REFCYC_PER_META_CHUNK_VBLANK_C, dlg_attr->refcyc_per_meta_chunk_vblank_c); + REFCYC_PER_META_CHUNK_VBLANK_C, pipe_regs->dlg_regs.refcyc_per_meta_chunk_vblank_c); REG_SET(FLIP_PARAMETERS_2, 0, - REFCYC_PER_META_CHUNK_FLIP_L, dlg_attr->refcyc_per_meta_chunk_flip_l); + REFCYC_PER_META_CHUNK_FLIP_L, pipe_regs->dlg_regs.refcyc_per_meta_chunk_flip_l); REG_SET_2(PER_LINE_DELIVERY_PRE, 0, - REFCYC_PER_LINE_DELIVERY_PRE_L, dlg_attr->refcyc_per_line_delivery_pre_l, - REFCYC_PER_LINE_DELIVERY_PRE_C, dlg_attr->refcyc_per_line_delivery_pre_c); + REFCYC_PER_LINE_DELIVERY_PRE_L, pipe_regs->dlg_regs.refcyc_per_line_delivery_pre_l, + REFCYC_PER_LINE_DELIVERY_PRE_C, pipe_regs->dlg_regs.refcyc_per_line_delivery_pre_c); REG_SET(DCN_SURF0_TTU_CNTL1, 0, REFCYC_PER_REQ_DELIVERY_PRE, - ttu_attr->refcyc_per_req_delivery_pre_l); + pipe_regs->ttu_regs.refcyc_per_req_delivery_pre_l); REG_SET(DCN_SURF1_TTU_CNTL1, 0, REFCYC_PER_REQ_DELIVERY_PRE, - ttu_attr->refcyc_per_req_delivery_pre_c); + pipe_regs->ttu_regs.refcyc_per_req_delivery_pre_c); REG_SET(DCN_CUR0_TTU_CNTL1, 0, - REFCYC_PER_REQ_DELIVERY_PRE, ttu_attr->refcyc_per_req_delivery_pre_cur0); + REFCYC_PER_REQ_DELIVERY_PRE, pipe_regs->ttu_regs.refcyc_per_req_delivery_pre_cur0); REG_SET_2(DCN_GLOBAL_TTU_CNTL, 0, - MIN_TTU_VBLANK, ttu_attr->min_ttu_vblank, - QoS_LEVEL_FLIP, ttu_attr->qos_level_flip); + MIN_TTU_VBLANK, pipe_regs->ttu_regs.min_ttu_vblank, + QoS_LEVEL_FLIP, pipe_regs->ttu_regs.qos_level_flip); } @@ -508,6 +520,18 @@ bool hubp401_program_surface_flip_and_addr( return true; } +void hubp401_clear_tiling(struct hubp *hubp) +{ + struct dcn20_hubp *hubp2 = TO_DCN20_HUBP(hubp); + + REG_UPDATE(DCHUBP_REQ_SIZE_CONFIG, SWATH_HEIGHT, 0); + REG_UPDATE(DCSURF_TILING_CONFIG, SW_MODE, DC_SW_LINEAR); + + REG_UPDATE_2(DCSURF_SURFACE_CONTROL, + PRIMARY_SURFACE_DCC_EN, 0, + SECONDARY_SURFACE_DCC_EN, 0); +} + void hubp401_dcc_control(struct hubp *hubp, struct dc_plane_dcc_param *dcc) { @@ -520,7 +544,7 @@ void hubp401_dcc_control(struct hubp 
*hubp, void hubp401_program_tiling( struct dcn20_hubp *hubp2, - const union dc_tiling_info *info, + const struct dc_tiling_info *info, const enum surface_pixel_format pixel_format) { /* DCSURF_ADDR_CONFIG still shows up in reg spec, but does not need to be programmed for DCN4x @@ -568,7 +592,7 @@ void hubp401_program_size( void hubp401_program_surface_config( struct hubp *hubp, enum surface_pixel_format format, - union dc_tiling_info *tiling_info, + struct dc_tiling_info *tiling_info, struct plane_size *plane_size, enum dc_rotation_angle rotation, struct dc_plane_dcc_param *dcc, @@ -969,8 +993,8 @@ static struct hubp_funcs dcn401_hubp_funcs = { .hubp_program_surface_flip_and_addr = hubp401_program_surface_flip_and_addr, .hubp_program_surface_config = hubp401_program_surface_config, .hubp_is_flip_pending = hubp2_is_flip_pending, - .hubp_setup = hubp401_setup, - .hubp_setup_interdependent = hubp401_setup_interdependent, + .hubp_setup2 = hubp401_setup, + .hubp_setup_interdependent2 = hubp401_setup_interdependent, .hubp_set_vm_system_aperture_settings = hubp3_set_vm_system_aperture_settings, .set_blank = hubp2_set_blank, .set_blank_regs = hubp2_set_blank_regs, @@ -1004,7 +1028,8 @@ static struct hubp_funcs dcn401_hubp_funcs = { .hubp_program_3dlut_fl_width = hubp401_program_3dlut_fl_width, .hubp_program_3dlut_fl_tmz_protected = hubp401_program_3dlut_fl_tmz_protected, .hubp_program_3dlut_fl_crossbar = hubp401_program_3dlut_fl_crossbar, - .hubp_get_3dlut_fl_done = hubp401_get_3dlut_fl_done + .hubp_get_3dlut_fl_done = hubp401_get_3dlut_fl_done, + .hubp_clear_tiling = hubp2_clear_tiling, }; bool hubp401_construct( diff --git a/drivers/gpu/drm/amd/display/dc/hubp/dcn401/dcn401_hubp.h b/drivers/gpu/drm/amd/display/dc/hubp/dcn401/dcn401_hubp.h index e52fdb5b0cd0..6e1d4c90ddd4 100644 --- a/drivers/gpu/drm/amd/display/dc/hubp/dcn401/dcn401_hubp.h +++ b/drivers/gpu/drm/amd/display/dc/hubp/dcn401/dcn401_hubp.h @@ -256,29 +256,15 @@ void hubp401_update_mall_sel(struct hubp *hubp, 
uint32_t mall_sel, bool c_cursor); -void hubp401_vready_at_or_After_vsync(struct hubp *hubp, - struct _vcs_dpi_display_pipe_dest_params_st *pipe_dest); - -void hubp401_program_requestor( - struct hubp *hubp, - struct _vcs_dpi_display_rq_regs_st *rq_regs); - -void hubp401_program_deadline( - struct hubp *hubp, - struct _vcs_dpi_display_dlg_regs_st *dlg_attr, - struct _vcs_dpi_display_ttu_regs_st *ttu_attr); - void hubp401_setup( struct hubp *hubp, - struct _vcs_dpi_display_dlg_regs_st *dlg_attr, - struct _vcs_dpi_display_ttu_regs_st *ttu_attr, - struct _vcs_dpi_display_rq_regs_st *rq_regs, - struct _vcs_dpi_display_pipe_dest_params_st *pipe_dest); + struct dml2_dchub_per_pipe_register_set *pipe_regs, + union dml2_global_sync_programming *pipe_global_sync, + struct dc_crtc_timing *timing); void hubp401_setup_interdependent( struct hubp *hubp, - struct _vcs_dpi_display_dlg_regs_st *dlg_attr, - struct _vcs_dpi_display_ttu_regs_st *ttu_attr); + struct dml2_dchub_per_pipe_register_set *pipe_regs); bool hubp401_program_surface_flip_and_addr( struct hubp *hubp, @@ -290,7 +276,7 @@ void hubp401_dcc_control(struct hubp *hubp, void hubp401_program_tiling( struct dcn20_hubp *hubp2, - const union dc_tiling_info *info, + const struct dc_tiling_info *info, const enum surface_pixel_format pixel_format); void hubp401_program_size( @@ -302,7 +288,7 @@ void hubp401_program_size( void hubp401_program_surface_config( struct hubp *hubp, enum surface_pixel_format format, - union dc_tiling_info *tiling_info, + struct dc_tiling_info *tiling_info, struct plane_size *plane_size, enum dc_rotation_angle rotation, struct dc_plane_dcc_param *dcc, @@ -340,4 +326,42 @@ int hubp401_get_3dlut_fl_done(struct hubp *hubp); void hubp401_set_unbounded_requesting(struct hubp *hubp, bool enable); +void hubp401_update_3dlut_fl_bias_scale(struct hubp *hubp, uint16_t bias, uint16_t scale); + +void hubp401_program_3dlut_fl_crossbar(struct hubp *hubp, + enum hubp_3dlut_fl_crossbar_bit_slice bit_slice_y_g, + 
enum hubp_3dlut_fl_crossbar_bit_slice bit_slice_cb_b, + enum hubp_3dlut_fl_crossbar_bit_slice bit_slice_cr_r); + +void hubp401_program_3dlut_fl_tmz_protected(struct hubp *hubp, bool protection_enabled); + +void hubp401_program_3dlut_fl_width(struct hubp *hubp, enum hubp_3dlut_fl_width width); + +void hubp401_program_3dlut_fl_addressing_mode(struct hubp *hubp, enum hubp_3dlut_fl_addressing_mode addr_mode); + +void hubp401_enable_3dlut_fl(struct hubp *hubp, bool enable); + +void hubp401_program_3dlut_fl_dlg_param(struct hubp *hubp, int refcyc_per_3dlut_group); + +void hubp401_program_3dlut_fl_addr(struct hubp *hubp, const struct dc_plane_address address); + +void hubp401_program_3dlut_fl_format(struct hubp *hubp, enum hubp_3dlut_fl_format format); + +void hubp401_program_3dlut_fl_mode(struct hubp *hubp, enum hubp_3dlut_fl_mode mode); + +void hubp401_clear_tiling(struct hubp *hubp); + +void hubp401_vready_at_or_After_vsync(struct hubp *hubp, + union dml2_global_sync_programming *pipe_global_sync, + struct dc_crtc_timing *timing); + +void hubp401_program_requestor( + struct hubp *hubp, + struct dml2_display_rq_regs *rq_regs); + +void hubp401_program_deadline( + struct hubp *hubp, + struct dml2_display_dlg_regs *dlg_attr, + struct dml2_display_ttu_regs *ttu_attr); + #endif /* __DC_HUBP_DCN401_H__ */ diff --git a/drivers/gpu/drm/amd/display/dc/hwss/dcn20/dcn20_hwseq.c b/drivers/gpu/drm/amd/display/dc/hwss/dcn20/dcn20_hwseq.c index b029ec1b26d3..a5e18ab72394 100644 --- a/drivers/gpu/drm/amd/display/dc/hwss/dcn20/dcn20_hwseq.c +++ b/drivers/gpu/drm/amd/display/dc/hwss/dcn20/dcn20_hwseq.c @@ -1288,7 +1288,7 @@ static void dcn20_power_on_plane_resources( } } -static void dcn20_enable_plane(struct dc *dc, struct pipe_ctx *pipe_ctx, +void dcn20_enable_plane(struct dc *dc, struct pipe_ctx *pipe_ctx, struct dc_state *context) { //if (dc->debug.sanity_checks) { @@ -1467,7 +1467,7 @@ void dcn20_pipe_control_lock( } } -static void dcn20_detect_pipe_changes(struct dc_state 
*old_state, +void dcn20_detect_pipe_changes(struct dc_state *old_state, struct dc_state *new_state, struct pipe_ctx *old_pipe, struct pipe_ctx *new_pipe) @@ -1655,7 +1655,7 @@ static void dcn20_detect_pipe_changes(struct dc_state *old_state, } } -static void dcn20_update_dchubp_dpp( +void dcn20_update_dchubp_dpp( struct dc *dc, struct pipe_ctx *pipe_ctx, struct dc_state *context) @@ -1678,25 +1678,41 @@ static void dcn20_update_dchubp_dpp( * VTG is within DCHUBBUB which is commond block share by each pipe HUBP. * VTG is 1:1 mapping with OTG. Each pipe HUBP will select which VTG */ + if (pipe_ctx->update_flags.bits.hubp_rq_dlg_ttu) { hubp->funcs->hubp_vtg_sel(hubp, pipe_ctx->stream_res.tg->inst); - hubp->funcs->hubp_setup( - hubp, - &pipe_ctx->dlg_regs, - &pipe_ctx->ttu_regs, - &pipe_ctx->rq_regs, - &pipe_ctx->pipe_dlg_param); + if (hubp->funcs->hubp_setup2) { + hubp->funcs->hubp_setup2( + hubp, + &pipe_ctx->hubp_regs, + &pipe_ctx->global_sync, + &pipe_ctx->stream->timing); + } else { + hubp->funcs->hubp_setup( + hubp, + &pipe_ctx->dlg_regs, + &pipe_ctx->ttu_regs, + &pipe_ctx->rq_regs, + &pipe_ctx->pipe_dlg_param); + } } if (pipe_ctx->update_flags.bits.unbounded_req && hubp->funcs->set_unbounded_requesting) hubp->funcs->set_unbounded_requesting(hubp, pipe_ctx->unbounded_req); - if (pipe_ctx->update_flags.bits.hubp_interdependent) - hubp->funcs->hubp_setup_interdependent( - hubp, - &pipe_ctx->dlg_regs, - &pipe_ctx->ttu_regs); + if (pipe_ctx->update_flags.bits.hubp_interdependent) { + if (hubp->funcs->hubp_setup_interdependent2) { + hubp->funcs->hubp_setup_interdependent2( + hubp, + &pipe_ctx->hubp_regs); + } else { + hubp->funcs->hubp_setup_interdependent( + hubp, + &pipe_ctx->dlg_regs, + &pipe_ctx->ttu_regs); + } + } if (pipe_ctx->update_flags.bits.enable || pipe_ctx->update_flags.bits.plane_changed || @@ -1756,10 +1772,9 @@ static void dcn20_update_dchubp_dpp( &pipe_ctx->plane_res.scl_data.viewport_c); viewport_changed = true; } - if 
(hubp->funcs->hubp_program_mcache_id_and_split_coordinate) - hubp->funcs->hubp_program_mcache_id_and_split_coordinate( - hubp, - &pipe_ctx->mcache_regs); + + if (hubp->funcs->hubp_program_mcache_id_and_split_coordinate) + hubp->funcs->hubp_program_mcache_id_and_split_coordinate(hubp, &pipe_ctx->mcache_regs); /* Any updates are handled in dc interface, just need to apply existing for plane enable */ if ((pipe_ctx->update_flags.bits.enable || pipe_ctx->update_flags.bits.opp_changed || @@ -1838,7 +1853,7 @@ static void dcn20_update_dchubp_dpp( hubp->funcs->phantom_hubp_post_enable(hubp); } -static int calculate_vready_offset_for_group(struct pipe_ctx *pipe) +static int dcn20_calculate_vready_offset_for_group(struct pipe_ctx *pipe) { struct pipe_ctx *other_pipe; int vready_offset = pipe->pipe_dlg_param.vready_offset; @@ -1864,6 +1879,30 @@ static int calculate_vready_offset_for_group(struct pipe_ctx *pipe) return vready_offset; } +static void dcn20_program_tg( + struct dc *dc, + struct pipe_ctx *pipe_ctx, + struct dc_state *context, + struct dce_hwseq *hws) +{ + pipe_ctx->stream_res.tg->funcs->program_global_sync( + pipe_ctx->stream_res.tg, + dcn20_calculate_vready_offset_for_group(pipe_ctx), + pipe_ctx->pipe_dlg_param.vstartup_start, + pipe_ctx->pipe_dlg_param.vupdate_offset, + pipe_ctx->pipe_dlg_param.vupdate_width, + pipe_ctx->pipe_dlg_param.pstate_keepout); + + if (dc_state_get_pipe_subvp_type(context, pipe_ctx) != SUBVP_PHANTOM) + pipe_ctx->stream_res.tg->funcs->wait_for_state(pipe_ctx->stream_res.tg, CRTC_STATE_VACTIVE); + + pipe_ctx->stream_res.tg->funcs->set_vtg_params( + pipe_ctx->stream_res.tg, &pipe_ctx->stream->timing, true); + + if (hws->funcs.setup_vupdate_interrupt) + hws->funcs.setup_vupdate_interrupt(dc, pipe_ctx); +} + static void dcn20_program_pipe( struct dc *dc, struct pipe_ctx *pipe_ctx, @@ -1874,33 +1913,17 @@ static void dcn20_program_pipe( /* Only need to unblank on top pipe */ if (resource_is_pipe_type(pipe_ctx, OTG_MASTER)) { if 
(pipe_ctx->update_flags.bits.enable || - pipe_ctx->update_flags.bits.odm || - pipe_ctx->stream->update_flags.bits.abm_level) + pipe_ctx->update_flags.bits.odm || + pipe_ctx->stream->update_flags.bits.abm_level) hws->funcs.blank_pixel_data(dc, pipe_ctx, - !pipe_ctx->plane_state || - !pipe_ctx->plane_state->visible); + !pipe_ctx->plane_state || + !pipe_ctx->plane_state->visible); } /* Only update TG on top pipe */ if (pipe_ctx->update_flags.bits.global_sync && !pipe_ctx->top_pipe - && !pipe_ctx->prev_odm_pipe) { - pipe_ctx->stream_res.tg->funcs->program_global_sync( - pipe_ctx->stream_res.tg, - calculate_vready_offset_for_group(pipe_ctx), - pipe_ctx->pipe_dlg_param.vstartup_start, - pipe_ctx->pipe_dlg_param.vupdate_offset, - pipe_ctx->pipe_dlg_param.vupdate_width, - pipe_ctx->pipe_dlg_param.pstate_keepout); - - if (dc_state_get_pipe_subvp_type(context, pipe_ctx) != SUBVP_PHANTOM) - pipe_ctx->stream_res.tg->funcs->wait_for_state(pipe_ctx->stream_res.tg, CRTC_STATE_VACTIVE); - - pipe_ctx->stream_res.tg->funcs->set_vtg_params( - pipe_ctx->stream_res.tg, &pipe_ctx->stream->timing, true); - - if (hws->funcs.setup_vupdate_interrupt) - hws->funcs.setup_vupdate_interrupt(dc, pipe_ctx); - } + && !pipe_ctx->prev_odm_pipe) + dcn20_program_tg(dc, pipe_ctx, context, hws); if (pipe_ctx->update_flags.bits.odm) hws->funcs.update_odm(dc, context, pipe_ctx); @@ -1931,22 +1954,22 @@ static void dcn20_program_pipe( dcn20_update_dchubp_dpp(dc, pipe_ctx, context); if (pipe_ctx->plane_state && (pipe_ctx->update_flags.bits.enable || - pipe_ctx->plane_state->update_flags.bits.hdr_mult)) + pipe_ctx->plane_state->update_flags.bits.hdr_mult)) hws->funcs.set_hdr_multiplier(pipe_ctx); if (hws->funcs.populate_mcm_luts) { if (pipe_ctx->plane_state) { hws->funcs.populate_mcm_luts(dc, pipe_ctx, pipe_ctx->plane_state->mcm_luts, - pipe_ctx->plane_state->lut_bank_a); + pipe_ctx->plane_state->lut_bank_a); pipe_ctx->plane_state->lut_bank_a = !pipe_ctx->plane_state->lut_bank_a; } } if 
(pipe_ctx->plane_state && - (pipe_ctx->plane_state->update_flags.bits.in_transfer_func_change || - pipe_ctx->plane_state->update_flags.bits.gamma_change || - pipe_ctx->plane_state->update_flags.bits.lut_3d || - pipe_ctx->update_flags.bits.enable)) + (pipe_ctx->plane_state->update_flags.bits.in_transfer_func_change || + pipe_ctx->plane_state->update_flags.bits.gamma_change || + pipe_ctx->plane_state->update_flags.bits.lut_3d || + pipe_ctx->update_flags.bits.enable)) hws->funcs.set_input_transfer_func(dc, pipe_ctx, pipe_ctx->plane_state); /* dcn10_translate_regamma_to_hw_format takes 750us to finish @@ -1954,10 +1977,10 @@ static void dcn20_program_pipe( * updating on slave planes */ if (pipe_ctx->update_flags.bits.enable || - pipe_ctx->update_flags.bits.plane_changed || - pipe_ctx->stream->update_flags.bits.out_tf || - (pipe_ctx->plane_state && - pipe_ctx->plane_state->update_flags.bits.output_tf_change)) + pipe_ctx->update_flags.bits.plane_changed || + pipe_ctx->stream->update_flags.bits.out_tf || + (pipe_ctx->plane_state && + pipe_ctx->plane_state->update_flags.bits.output_tf_change)) hws->funcs.set_output_transfer_func(dc, pipe_ctx, pipe_ctx->stream); /* If the pipe has been enabled or has a different opp, we @@ -1966,7 +1989,7 @@ static void dcn20_program_pipe( * causes a different pipe to be chosen to odm combine with. 
*/ if (pipe_ctx->update_flags.bits.enable - || pipe_ctx->update_flags.bits.opp_changed) { + || pipe_ctx->update_flags.bits.opp_changed) { pipe_ctx->stream_res.opp->funcs->opp_set_dyn_expansion( pipe_ctx->stream_res.opp, @@ -1996,14 +2019,14 @@ static void dcn20_program_pipe( memset(&params, 0, sizeof(params)); odm_opp->funcs->opp_program_bit_depth_reduction(odm_opp, &params); dc->hwss.set_disp_pattern_generator(dc, - pipe_ctx, - pipe_ctx->stream_res.test_pattern_params.test_pattern, - pipe_ctx->stream_res.test_pattern_params.color_space, - pipe_ctx->stream_res.test_pattern_params.color_depth, - NULL, - pipe_ctx->stream_res.test_pattern_params.width, - pipe_ctx->stream_res.test_pattern_params.height, - pipe_ctx->stream_res.test_pattern_params.offset); + pipe_ctx, + pipe_ctx->stream_res.test_pattern_params.test_pattern, + pipe_ctx->stream_res.test_pattern_params.color_space, + pipe_ctx->stream_res.test_pattern_params.color_depth, + NULL, + pipe_ctx->stream_res.test_pattern_params.width, + pipe_ctx->stream_res.test_pattern_params.height, + pipe_ctx->stream_res.test_pattern_params.offset); } } @@ -2012,11 +2035,12 @@ void dcn20_program_front_end_for_ctx( struct dc_state *context) { int i; - struct dce_hwseq *hws = dc->hwseq; - DC_LOGGER_INIT(dc->ctx->logger); unsigned int prev_hubp_count = 0; unsigned int hubp_count = 0; - struct pipe_ctx *pipe; + struct dce_hwseq *hws = dc->hwseq; + struct pipe_ctx *pipe = NULL; + + DC_LOGGER_INIT(dc->ctx->logger); if (resource_is_pipe_topology_changed(dc->current_state, context)) resource_log_pipe_topology_update(dc, context); @@ -2029,7 +2053,7 @@ void dcn20_program_front_end_for_ctx( ASSERT(!pipe->plane_state->triplebuffer_flips); /*turn off triple buffer for full update*/ dc->hwss.program_triplebuffer( - dc, pipe, pipe->plane_state->triplebuffer_flips); + dc, pipe, pipe->plane_state->triplebuffer_flips); } } } @@ -2044,30 +2068,31 @@ void dcn20_program_front_end_for_ctx( if (prev_hubp_count == 0 && hubp_count > 0) { if 
(dc->res_pool->hubbub->funcs->force_pstate_change_control) dc->res_pool->hubbub->funcs->force_pstate_change_control( - dc->res_pool->hubbub, true, false); + dc->res_pool->hubbub, true, false); udelay(500); } /* Set pipe update flags and lock pipes */ for (i = 0; i < dc->res_pool->pipe_count; i++) dcn20_detect_pipe_changes(dc->current_state, context, &dc->current_state->res_ctx.pipe_ctx[i], - &context->res_ctx.pipe_ctx[i]); + &context->res_ctx.pipe_ctx[i]); /* When disabling phantom pipes, turn on phantom OTG first (so we can get double * buffer updates properly) */ for (i = 0; i < dc->res_pool->pipe_count; i++) { struct dc_stream_state *stream = dc->current_state->res_ctx.pipe_ctx[i].stream; + pipe = &dc->current_state->res_ctx.pipe_ctx[i]; if (context->res_ctx.pipe_ctx[i].update_flags.bits.disable && stream && - dc_state_get_pipe_subvp_type(dc->current_state, pipe) == SUBVP_PHANTOM) { + dc_state_get_pipe_subvp_type(dc->current_state, pipe) == SUBVP_PHANTOM) { struct timing_generator *tg = dc->current_state->res_ctx.pipe_ctx[i].stream_res.tg; if (tg->funcs->enable_crtc) { - if (dc->hwseq->funcs.blank_pixel_data) { + if (dc->hwseq->funcs.blank_pixel_data) dc->hwseq->funcs.blank_pixel_data(dc, pipe, true); - } + tg->funcs->enable_crtc(tg); } } @@ -2075,15 +2100,15 @@ void dcn20_program_front_end_for_ctx( /* OTG blank before disabling all front ends */ for (i = 0; i < dc->res_pool->pipe_count; i++) if (context->res_ctx.pipe_ctx[i].update_flags.bits.disable - && !context->res_ctx.pipe_ctx[i].top_pipe - && !context->res_ctx.pipe_ctx[i].prev_odm_pipe - && context->res_ctx.pipe_ctx[i].stream) + && !context->res_ctx.pipe_ctx[i].top_pipe + && !context->res_ctx.pipe_ctx[i].prev_odm_pipe + && context->res_ctx.pipe_ctx[i].stream) hws->funcs.blank_pixel_data(dc, &context->res_ctx.pipe_ctx[i], true); /* Disconnect mpcc */ for (i = 0; i < dc->res_pool->pipe_count; i++) if (context->res_ctx.pipe_ctx[i].update_flags.bits.disable - || 
context->res_ctx.pipe_ctx[i].update_flags.bits.opp_changed) { + || context->res_ctx.pipe_ctx[i].update_flags.bits.opp_changed) { struct hubbub *hubbub = dc->res_pool->hubbub; /* Phantom pipe DET should be 0, but if a pipe in use is being transitioned to phantom @@ -2093,13 +2118,18 @@ void dcn20_program_front_end_for_ctx( * DET allocation. */ if ((context->res_ctx.pipe_ctx[i].update_flags.bits.disable || - (context->res_ctx.pipe_ctx[i].plane_state && dc_state_get_pipe_subvp_type(context, &context->res_ctx.pipe_ctx[i]) == SUBVP_PHANTOM))) { + (context->res_ctx.pipe_ctx[i].plane_state && + dc_state_get_pipe_subvp_type(context, &context->res_ctx.pipe_ctx[i]) + == SUBVP_PHANTOM))) { if (hubbub->funcs->program_det_size) - hubbub->funcs->program_det_size(hubbub, dc->current_state->res_ctx.pipe_ctx[i].plane_res.hubp->inst, 0); + hubbub->funcs->program_det_size(hubbub, + dc->current_state->res_ctx.pipe_ctx[i].plane_res.hubp->inst, 0); if (dc->res_pool->hubbub->funcs->program_det_segments) - dc->res_pool->hubbub->funcs->program_det_segments(hubbub, dc->current_state->res_ctx.pipe_ctx[i].plane_res.hubp->inst, 0); + dc->res_pool->hubbub->funcs->program_det_segments( + hubbub, dc->current_state->res_ctx.pipe_ctx[i].plane_res.hubp->inst, 0); } - hws->funcs.plane_atomic_disconnect(dc, dc->current_state, &dc->current_state->res_ctx.pipe_ctx[i]); + hws->funcs.plane_atomic_disconnect(dc, dc->current_state, + &dc->current_state->res_ctx.pipe_ctx[i]); DC_LOG_DC("Reset mpcc for pipe %d\n", dc->current_state->res_ctx.pipe_ctx[i].pipe_idx); } @@ -2107,9 +2137,9 @@ void dcn20_program_front_end_for_ctx( for (i = 0; i < dc->res_pool->pipe_count; i++) { pipe = &context->res_ctx.pipe_ctx[i]; if (resource_is_pipe_type(pipe, OTG_MASTER) && - !resource_is_pipe_type(pipe, DPP_PIPE) && - pipe->update_flags.bits.odm && - hws->funcs.update_odm) + !resource_is_pipe_type(pipe, DPP_PIPE) && + pipe->update_flags.bits.odm && + hws->funcs.update_odm) hws->funcs.update_odm(dc, context, pipe); } @@ 
-2127,25 +2157,28 @@ void dcn20_program_front_end_for_ctx( else { /* Don't program phantom pipes in the regular front end programming sequence. * There is an MPO transition case where a pipe being used by a video plane is - * transitioned directly to be a phantom pipe when closing the MPO video. However - * the phantom pipe will program a new HUBP_VTG_SEL (update takes place right away), - * but the MPO still exists until the double buffered update of the main pipe so we - * will get a frame of underflow if the phantom pipe is programmed here. + * transitioned directly to be a phantom pipe when closing the MPO video. + * However the phantom pipe will program a new HUBP_VTG_SEL (update takes place + * right away) but the MPO still exists until the double buffered update of the + * main pipe so we will get a frame of underflow if the phantom pipe is + * programmed here. */ - if (pipe->stream && dc_state_get_pipe_subvp_type(context, pipe) != SUBVP_PHANTOM) + if (pipe->stream && + dc_state_get_pipe_subvp_type(context, pipe) != SUBVP_PHANTOM) dcn20_program_pipe(dc, pipe, context); } pipe = pipe->bottom_pipe; } } + /* Program secondary blending tree and writeback pipes */ pipe = &context->res_ctx.pipe_ctx[i]; if (!pipe->top_pipe && !pipe->prev_odm_pipe - && pipe->stream && pipe->stream->num_wb_info > 0 - && (pipe->update_flags.raw || (pipe->plane_state && pipe->plane_state->update_flags.raw) - || pipe->stream->update_flags.raw) - && hws->funcs.program_all_writeback_pipes_in_tree) + && pipe->stream && pipe->stream->num_wb_info > 0 + && (pipe->update_flags.raw || (pipe->plane_state && pipe->plane_state->update_flags.raw) + || pipe->stream->update_flags.raw) + && hws->funcs.program_all_writeback_pipes_in_tree) hws->funcs.program_all_writeback_pipes_in_tree(dc, pipe->stream, context); /* Avoid underflow by check of pipe line read when adding 2nd plane. 
*/ @@ -2164,7 +2197,7 @@ void dcn20_program_front_end_for_ctx( * buffered pending status clear and reset opp head pipe's none double buffered * registers to their initial state. */ -static void post_unlock_reset_opp(struct dc *dc, +void dcn20_post_unlock_reset_opp(struct dc *dc, struct pipe_ctx *opp_head) { struct display_stream_compressor *dsc = opp_head->stream_res.dsc; @@ -2201,16 +2234,17 @@ void dcn20_post_unlock_program_front_end( struct dc *dc, struct dc_state *context) { - int i; - const unsigned int TIMEOUT_FOR_PIPE_ENABLE_US = 100000; + // Timeout for pipe enable + unsigned int timeout_us = 100000; unsigned int polling_interval_us = 1; struct dce_hwseq *hwseq = dc->hwseq; + int i; for (i = 0; i < dc->res_pool->pipe_count; i++) if (resource_is_pipe_type(&dc->current_state->res_ctx.pipe_ctx[i], OPP_HEAD) && - !resource_is_pipe_type(&context->res_ctx.pipe_ctx[i], OPP_HEAD)) - post_unlock_reset_opp(dc, - &dc->current_state->res_ctx.pipe_ctx[i]); + !resource_is_pipe_type(&context->res_ctx.pipe_ctx[i], OPP_HEAD)) + dcn20_post_unlock_reset_opp(dc, + &dc->current_state->res_ctx.pipe_ctx[i]); for (i = 0; i < dc->res_pool->pipe_count; i++) if (context->res_ctx.pipe_ctx[i].update_flags.bits.disable) @@ -2226,11 +2260,12 @@ void dcn20_post_unlock_program_front_end( struct pipe_ctx *pipe = &context->res_ctx.pipe_ctx[i]; // Don't check flip pending on phantom pipes if (pipe->plane_state && !pipe->top_pipe && pipe->update_flags.bits.enable && - dc_state_get_pipe_subvp_type(context, pipe) != SUBVP_PHANTOM) { + dc_state_get_pipe_subvp_type(context, pipe) != SUBVP_PHANTOM) { struct hubp *hubp = pipe->plane_res.hubp; int j = 0; - for (j = 0; j < TIMEOUT_FOR_PIPE_ENABLE_US / polling_interval_us - && hubp->funcs->hubp_is_flip_pending(hubp); j++) + + for (j = 0; j < timeout_us / polling_interval_us + && hubp->funcs->hubp_is_flip_pending(hubp); j++) udelay(polling_interval_us); } } @@ -2244,15 +2279,14 @@ void dcn20_post_unlock_program_front_end( * before we've transitioned to 
2:1 or 4:1 */ if (resource_is_pipe_type(old_pipe, OTG_MASTER) && resource_is_pipe_type(pipe, OTG_MASTER) && - resource_get_odm_slice_count(old_pipe) < resource_get_odm_slice_count(pipe) && - dc_state_get_pipe_subvp_type(context, pipe) != SUBVP_PHANTOM) { + resource_get_odm_slice_count(old_pipe) < resource_get_odm_slice_count(pipe) && + dc_state_get_pipe_subvp_type(context, pipe) != SUBVP_PHANTOM) { int j = 0; struct timing_generator *tg = pipe->stream_res.tg; - if (tg->funcs->get_optc_double_buffer_pending) { - for (j = 0; j < TIMEOUT_FOR_PIPE_ENABLE_US / polling_interval_us - && tg->funcs->get_optc_double_buffer_pending(tg); j++) + for (j = 0; j < timeout_us / polling_interval_us + && tg->funcs->get_optc_double_buffer_pending(tg); j++) udelay(polling_interval_us); } } @@ -2260,7 +2294,7 @@ void dcn20_post_unlock_program_front_end( if (dc->res_pool->hubbub->funcs->force_pstate_change_control) dc->res_pool->hubbub->funcs->force_pstate_change_control( - dc->res_pool->hubbub, false, false); + dc->res_pool->hubbub, false, false); for (i = 0; i < dc->res_pool->pipe_count; i++) { struct pipe_ctx *pipe = &context->res_ctx.pipe_ctx[i]; @@ -2291,11 +2325,11 @@ void dcn20_post_unlock_program_front_end( return; /* P-State support transitions: - * Natural -> FPO: P-State disabled in prepare, force disallow anytime is safe - * FPO -> Natural: Unforce anytime after FW disable is safe (P-State will assert naturally) - * Unsupported -> FPO: P-State enabled in optimize, force disallow anytime is safe - * FPO -> Unsupported: P-State disabled in prepare, unforce disallow anytime is safe - * FPO <-> SubVP: Force disallow is maintained on the FPO / SubVP pipes + * Natural -> FPO: P-State disabled in prepare, force disallow anytime is safe + * FPO -> Natural: Unforce anytime after FW disable is safe (P-State will assert naturally) + * Unsupported -> FPO: P-State enabled in optimize, force disallow anytime is safe + * FPO -> Unsupported: P-State disabled in prepare, unforce disallow 
anytime is safe + * FPO <-> SubVP: Force disallow is maintained on the FPO / SubVP pipes */ if (hwseq->funcs.update_force_pstate) dc->hwseq->funcs.update_force_pstate(dc, context); @@ -2310,12 +2344,11 @@ void dcn20_post_unlock_program_front_end( if (hwseq->wa.DEGVIDCN21) dc->res_pool->hubbub->funcs->apply_DEDCN21_147_wa(dc->res_pool->hubbub); - /* WA for stutter underflow during MPO transitions when adding 2nd plane */ if (hwseq->wa.disallow_self_refresh_during_multi_plane_transition) { if (dc->current_state->stream_status[0].plane_count == 1 && - context->stream_status[0].plane_count > 1) { + context->stream_status[0].plane_count > 1) { struct timing_generator *tg = dc->res_pool->timing_generators[0]; @@ -2463,7 +2496,7 @@ bool dcn20_update_bandwidth( pipe_ctx->stream_res.tg->funcs->program_global_sync( pipe_ctx->stream_res.tg, - calculate_vready_offset_for_group(pipe_ctx), + dcn20_calculate_vready_offset_for_group(pipe_ctx), pipe_ctx->pipe_dlg_param.vstartup_start, pipe_ctx->pipe_dlg_param.vupdate_offset, pipe_ctx->pipe_dlg_param.vupdate_width, diff --git a/drivers/gpu/drm/amd/display/dc/hwss/dcn20/dcn20_hwseq.h b/drivers/gpu/drm/amd/display/dc/hwss/dcn20/dcn20_hwseq.h index 5c874f7b0683..9d1ad3b29ca5 100644 --- a/drivers/gpu/drm/amd/display/dc/hwss/dcn20/dcn20_hwseq.h +++ b/drivers/gpu/drm/amd/display/dc/hwss/dcn20/dcn20_hwseq.h @@ -154,6 +154,21 @@ void dcn20_setup_gsl_group_as_lock( const struct dc *dc, struct pipe_ctx *pipe_ctx, bool enable); - +void dcn20_detect_pipe_changes( + struct dc_state *old_state, + struct dc_state *new_state, + struct pipe_ctx *old_pipe, + struct pipe_ctx *new_pipe); +void dcn20_enable_plane( + struct dc *dc, + struct pipe_ctx *pipe_ctx, + struct dc_state *context); +void dcn20_update_dchubp_dpp( + struct dc *dc, + struct pipe_ctx *pipe_ctx, + struct dc_state *context); +void dcn20_post_unlock_reset_opp( + struct dc *dc, + struct pipe_ctx *opp_head); #endif /* __DC_HWSS_DCN20_H__ */ diff --git 
a/drivers/gpu/drm/amd/display/dc/hwss/dcn30/dcn30_init.c b/drivers/gpu/drm/amd/display/dc/hwss/dcn30/dcn30_init.c index 0e8d32e3dbae..c32764aef884 100644 --- a/drivers/gpu/drm/amd/display/dc/hwss/dcn30/dcn30_init.c +++ b/drivers/gpu/drm/amd/display/dc/hwss/dcn30/dcn30_init.c @@ -86,7 +86,6 @@ static const struct hw_sequencer_funcs dcn30_funcs = { .enable_writeback = dcn30_enable_writeback, .disable_writeback = dcn30_disable_writeback, .update_writeback = dcn30_update_writeback, - .mmhubbub_warmup = dcn30_mmhubbub_warmup, .dmdata_status_done = dcn20_dmdata_status_done, .program_dmdata_engine = dcn30_program_dmdata_engine, .set_dmdata_attributes = dcn20_set_dmdata_attributes, diff --git a/drivers/gpu/drm/amd/display/dc/hwss/dcn301/dcn301_init.c b/drivers/gpu/drm/amd/display/dc/hwss/dcn301/dcn301_init.c index 780ce4c064aa..dcb27cdbce73 100644 --- a/drivers/gpu/drm/amd/display/dc/hwss/dcn301/dcn301_init.c +++ b/drivers/gpu/drm/amd/display/dc/hwss/dcn301/dcn301_init.c @@ -86,7 +86,6 @@ static const struct hw_sequencer_funcs dcn301_funcs = { .enable_writeback = dcn30_enable_writeback, .disable_writeback = dcn30_disable_writeback, .update_writeback = dcn30_update_writeback, - .mmhubbub_warmup = dcn30_mmhubbub_warmup, .dmdata_status_done = dcn20_dmdata_status_done, .program_dmdata_engine = dcn30_program_dmdata_engine, .set_dmdata_attributes = dcn20_set_dmdata_attributes, diff --git a/drivers/gpu/drm/amd/display/dc/hwss/dcn31/dcn31_init.c b/drivers/gpu/drm/amd/display/dc/hwss/dcn31/dcn31_init.c index 5f8f45b48720..fb2ffb637931 100644 --- a/drivers/gpu/drm/amd/display/dc/hwss/dcn31/dcn31_init.c +++ b/drivers/gpu/drm/amd/display/dc/hwss/dcn31/dcn31_init.c @@ -89,7 +89,6 @@ static const struct hw_sequencer_funcs dcn31_funcs = { .enable_writeback = dcn30_enable_writeback, .disable_writeback = dcn30_disable_writeback, .update_writeback = dcn30_update_writeback, - .mmhubbub_warmup = dcn30_mmhubbub_warmup, .dmdata_status_done = dcn20_dmdata_status_done, .program_dmdata_engine = 
dcn30_program_dmdata_engine, .set_dmdata_attributes = dcn20_set_dmdata_attributes, @@ -98,7 +97,7 @@ static const struct hw_sequencer_funcs dcn31_funcs = { .set_flip_control_gsl = dcn20_set_flip_control_gsl, .get_vupdate_offset_from_vsync = dcn10_get_vupdate_offset_from_vsync, .calc_vupdate_position = dcn10_calc_vupdate_position, - .set_backlight_level = dcn31_set_backlight_level, + .set_backlight_level = dcn21_set_backlight_level, .set_abm_immediate_disable = dcn21_set_abm_immediate_disable, .set_pipe = dcn21_set_pipe, .enable_lvds_link_output = dce110_enable_lvds_link_output, diff --git a/drivers/gpu/drm/amd/display/dc/hwss/dcn314/dcn314_hwseq.c b/drivers/gpu/drm/amd/display/dc/hwss/dcn314/dcn314_hwseq.c index 9b88eb72086d..be26c925fdfa 100644 --- a/drivers/gpu/drm/amd/display/dc/hwss/dcn314/dcn314_hwseq.c +++ b/drivers/gpu/drm/amd/display/dc/hwss/dcn314/dcn314_hwseq.c @@ -162,6 +162,8 @@ void dcn314_update_odm(struct dc *dc, struct dc_state *context, struct pipe_ctx int opp_inst[MAX_PIPES] = {0}; int odm_slice_width = resource_get_odm_slice_dst_width(pipe_ctx, false); int last_odm_slice_width = resource_get_odm_slice_dst_width(pipe_ctx, true); + struct mpc *mpc = dc->res_pool->mpc; + int i; opp_cnt = get_odm_config(pipe_ctx, opp_inst); @@ -174,6 +176,16 @@ void dcn314_update_odm(struct dc *dc, struct dc_state *context, struct pipe_ctx pipe_ctx->stream_res.tg->funcs->set_odm_bypass( pipe_ctx->stream_res.tg, &pipe_ctx->stream->timing); + if (mpc->funcs->set_out_rate_control) { + for (i = 0; i < opp_cnt; ++i) { + mpc->funcs->set_out_rate_control( + mpc, opp_inst[i], + false, + 0, + NULL); + } + } + for (odm_pipe = pipe_ctx->next_odm_pipe; odm_pipe; odm_pipe = odm_pipe->next_odm_pipe) { odm_pipe->stream_res.opp->funcs->opp_pipe_clock_control( odm_pipe->stream_res.opp, diff --git a/drivers/gpu/drm/amd/display/dc/hwss/dcn314/dcn314_init.c b/drivers/gpu/drm/amd/display/dc/hwss/dcn314/dcn314_init.c index 6bdfbf22ce87..21ef03a76229 100644 --- 
a/drivers/gpu/drm/amd/display/dc/hwss/dcn314/dcn314_init.c +++ b/drivers/gpu/drm/amd/display/dc/hwss/dcn314/dcn314_init.c @@ -91,7 +91,6 @@ static const struct hw_sequencer_funcs dcn314_funcs = { .enable_writeback = dcn30_enable_writeback, .disable_writeback = dcn30_disable_writeback, .update_writeback = dcn30_update_writeback, - .mmhubbub_warmup = dcn30_mmhubbub_warmup, .dmdata_status_done = dcn20_dmdata_status_done, .program_dmdata_engine = dcn30_program_dmdata_engine, .set_dmdata_attributes = dcn20_set_dmdata_attributes, @@ -100,7 +99,7 @@ static const struct hw_sequencer_funcs dcn314_funcs = { .set_flip_control_gsl = dcn20_set_flip_control_gsl, .get_vupdate_offset_from_vsync = dcn10_get_vupdate_offset_from_vsync, .calc_vupdate_position = dcn10_calc_vupdate_position, - .set_backlight_level = dcn31_set_backlight_level, + .set_backlight_level = dcn21_set_backlight_level, .set_abm_immediate_disable = dcn21_set_abm_immediate_disable, .set_pipe = dcn21_set_pipe, .enable_lvds_link_output = dce110_enable_lvds_link_output, diff --git a/drivers/gpu/drm/amd/display/dc/hwss/dcn32/dcn32_hwseq.c b/drivers/gpu/drm/amd/display/dc/hwss/dcn32/dcn32_hwseq.c index d7f8b2dcaa6b..ee4de9ddfef4 100644 --- a/drivers/gpu/drm/amd/display/dc/hwss/dcn32/dcn32_hwseq.c +++ b/drivers/gpu/drm/amd/display/dc/hwss/dcn32/dcn32_hwseq.c @@ -985,6 +985,7 @@ void dcn32_init_hw(struct dc *dc) dc->caps.dmub_caps.subvp_psr = dc->ctx->dmub_srv->dmub->feature_caps.subvp_psr_support; dc->caps.dmub_caps.gecc_enable = dc->ctx->dmub_srv->dmub->feature_caps.gecc_enable; dc->caps.dmub_caps.mclk_sw = dc->ctx->dmub_srv->dmub->feature_caps.fw_assisted_mclk_switch_ver; + dc->caps.dmub_caps.aux_backlight_support = dc->ctx->dmub_srv->dmub->feature_caps.abm_aux_backlight_support; /* for DCN401 testing only */ dc->caps.dmub_caps.fams_ver = dc->ctx->dmub_srv->dmub->feature_caps.fw_assisted_mclk_switch_ver; @@ -1049,7 +1050,8 @@ void dcn32_update_dsc_on_stream(struct pipe_ctx *pipe_ctx, bool enable) } /* Enable DSC hw 
block */ - dsc_cfg.pic_width = (stream->timing.h_addressable + stream->timing.h_border_left + stream->timing.h_border_right) / opp_cnt; + dsc_cfg.pic_width = (stream->timing.h_addressable + pipe_ctx->hblank_borrow + + stream->timing.h_border_left + stream->timing.h_border_right) / opp_cnt; dsc_cfg.pic_height = stream->timing.v_addressable + stream->timing.v_border_top + stream->timing.v_border_bottom; dsc_cfg.pixel_encoding = stream->timing.pixel_encoding; dsc_cfg.color_depth = stream->timing.display_color_depth; @@ -1397,12 +1399,12 @@ void dcn32_disable_link_output(struct dc_link *link, link_hwss->disable_link_output(link, link_res, signal); link->phy_state.symclk_state = SYMCLK_OFF_TX_OFF; - - if (signal == SIGNAL_TYPE_EDP && - link->dc->hwss.edp_power_control && - !link->skip_implict_edp_power_control) - link->dc->hwss.edp_power_control(link, false); - else if (dmcu != NULL && dmcu->funcs->unlock_phy) + /* + * Add the logic to extract BOTH power up and power down sequences + * from enable/disable link output and only call edp panel control + * in enable_link_dp and disable_link_dp once. 
+ */ + if (dmcu != NULL && dmcu->funcs->unlock_phy) dmcu->funcs->unlock_phy(dmcu); dc->link_srv->dp_trace_source_sequence(link, DPCD_SOURCE_SEQ_AFTER_DISABLE_LINK_PHY); diff --git a/drivers/gpu/drm/amd/display/dc/hwss/dcn32/dcn32_init.c b/drivers/gpu/drm/amd/display/dc/hwss/dcn32/dcn32_init.c index 5ecee7e320da..e4d149eff10f 100644 --- a/drivers/gpu/drm/amd/display/dc/hwss/dcn32/dcn32_init.c +++ b/drivers/gpu/drm/amd/display/dc/hwss/dcn32/dcn32_init.c @@ -87,7 +87,6 @@ static const struct hw_sequencer_funcs dcn32_funcs = { .enable_writeback = dcn30_enable_writeback, .disable_writeback = dcn30_disable_writeback, .update_writeback = dcn30_update_writeback, - .mmhubbub_warmup = dcn30_mmhubbub_warmup, .dmdata_status_done = dcn20_dmdata_status_done, .program_dmdata_engine = dcn30_program_dmdata_engine, .set_dmdata_attributes = dcn20_set_dmdata_attributes, diff --git a/drivers/gpu/drm/amd/display/dc/hwss/dcn35/dcn35_hwseq.c b/drivers/gpu/drm/amd/display/dc/hwss/dcn35/dcn35_hwseq.c index e599cdc465bf..59fc1c114fbe 100644 --- a/drivers/gpu/drm/amd/display/dc/hwss/dcn35/dcn35_hwseq.c +++ b/drivers/gpu/drm/amd/display/dc/hwss/dcn35/dcn35_hwseq.c @@ -426,6 +426,8 @@ void dcn35_update_odm(struct dc *dc, struct dc_state *context, struct pipe_ctx * int opp_inst[MAX_PIPES] = {0}; int odm_slice_width = resource_get_odm_slice_dst_width(pipe_ctx, false); int last_odm_slice_width = resource_get_odm_slice_dst_width(pipe_ctx, true); + struct mpc *mpc = dc->res_pool->mpc; + int i; opp_cnt = get_odm_config(pipe_ctx, opp_inst); @@ -438,6 +440,16 @@ void dcn35_update_odm(struct dc *dc, struct dc_state *context, struct pipe_ctx * pipe_ctx->stream_res.tg->funcs->set_odm_bypass( pipe_ctx->stream_res.tg, &pipe_ctx->stream->timing); + if (mpc->funcs->set_out_rate_control) { + for (i = 0; i < opp_cnt; ++i) { + mpc->funcs->set_out_rate_control( + mpc, opp_inst[i], + false, + 0, + NULL); + } + } + for (odm_pipe = pipe_ctx->next_odm_pipe; odm_pipe; odm_pipe = odm_pipe->next_odm_pipe) { 
odm_pipe->stream_res.opp->funcs->opp_pipe_clock_control( odm_pipe->stream_res.opp, @@ -1020,8 +1032,13 @@ void dcn35_calc_blocks_to_gate(struct dc *dc, struct dc_state *context, if (pipe_ctx->plane_res.dpp || pipe_ctx->stream_res.opp) update_state->pg_pipe_res_update[PG_MPCC][pipe_ctx->plane_res.mpcc_inst] = false; - if (pipe_ctx->stream_res.dsc) + if (pipe_ctx->stream_res.dsc) { update_state->pg_pipe_res_update[PG_DSC][pipe_ctx->stream_res.dsc->inst] = false; + if (dc->caps.sequential_ono) { + update_state->pg_pipe_res_update[PG_HUBP][pipe_ctx->stream_res.dsc->inst] = false; + update_state->pg_pipe_res_update[PG_DPP][pipe_ctx->stream_res.dsc->inst] = false; + } + } if (pipe_ctx->stream_res.opp) update_state->pg_pipe_res_update[PG_OPP][pipe_ctx->stream_res.opp->inst] = false; @@ -1579,3 +1596,37 @@ bool dcn35_is_dp_dig_pixel_rate_div_policy(struct pipe_ctx *pipe_ctx) return false; } + +/* + * Set powerup to true for every pipe to match pre-OS configuration. + */ +static void dcn35_calc_blocks_to_ungate_for_hw_release(struct dc *dc, struct pg_block_update *update_state) +{ + int i = 0, j = 0; + + memset(update_state, 0, sizeof(struct pg_block_update)); + + for (i = 0; i < dc->res_pool->pipe_count; i++) + for (j = 0; j < PG_HW_PIPE_RESOURCES_NUM_ELEMENT; j++) + update_state->pg_pipe_res_update[j][i] = true; + + update_state->pg_res_update[PG_HPO] = true; + update_state->pg_res_update[PG_DWB] = true; +} + +/* + * The purpose is to power up all gatings to restore optimization to pre-OS env. + * Re-use hwss func and existing PG&RCG flags to decide powerup sequence. 
+ */ +void dcn35_hardware_release(struct dc *dc) +{ + struct pg_block_update pg_update_state; + + dcn35_calc_blocks_to_ungate_for_hw_release(dc, &pg_update_state); + + if (dc->hwss.root_clock_control) + dc->hwss.root_clock_control(dc, &pg_update_state, true); + /*power up required HW block*/ + if (dc->hwss.hw_block_power_up) + dc->hwss.hw_block_power_up(dc, &pg_update_state); +} diff --git a/drivers/gpu/drm/amd/display/dc/hwss/dcn35/dcn35_hwseq.h b/drivers/gpu/drm/amd/display/dc/hwss/dcn35/dcn35_hwseq.h index e27b3609020f..0b1d6f608edd 100644 --- a/drivers/gpu/drm/amd/display/dc/hwss/dcn35/dcn35_hwseq.h +++ b/drivers/gpu/drm/amd/display/dc/hwss/dcn35/dcn35_hwseq.h @@ -99,4 +99,6 @@ void dcn35_set_long_vblank(struct pipe_ctx **pipe_ctx, bool dcn35_is_dp_dig_pixel_rate_div_policy(struct pipe_ctx *pipe_ctx); +void dcn35_hardware_release(struct dc *dc); + #endif /* __DC_HWSS_DCN35_H__ */ diff --git a/drivers/gpu/drm/amd/display/dc/hwss/dcn35/dcn35_init.c b/drivers/gpu/drm/amd/display/dc/hwss/dcn35/dcn35_init.c index fd67779c27a9..c7acaf97974c 100644 --- a/drivers/gpu/drm/amd/display/dc/hwss/dcn35/dcn35_init.c +++ b/drivers/gpu/drm/amd/display/dc/hwss/dcn35/dcn35_init.c @@ -92,7 +92,6 @@ static const struct hw_sequencer_funcs dcn35_funcs = { .enable_writeback = dcn30_enable_writeback, .disable_writeback = dcn30_disable_writeback, .update_writeback = dcn30_update_writeback, - .mmhubbub_warmup = dcn30_mmhubbub_warmup, .dmdata_status_done = dcn20_dmdata_status_done, .program_dmdata_engine = dcn30_program_dmdata_engine, .set_dmdata_attributes = dcn20_set_dmdata_attributes, @@ -123,6 +122,11 @@ static const struct hw_sequencer_funcs dcn35_funcs = { .root_clock_control = dcn35_root_clock_control, .set_long_vtotal = dcn35_set_long_vblank, .calculate_pix_rate_divider = dcn32_calculate_pix_rate_divider, + .hardware_release = dcn35_hardware_release, + .detect_pipe_changes = dcn20_detect_pipe_changes, + .enable_plane = dcn20_enable_plane, + .update_dchubp_dpp = 
dcn20_update_dchubp_dpp, + .post_unlock_reset_opp = dcn20_post_unlock_reset_opp, }; static const struct hwseq_private_funcs dcn35_private_funcs = { diff --git a/drivers/gpu/drm/amd/display/dc/hwss/dcn351/dcn351_init.c b/drivers/gpu/drm/amd/display/dc/hwss/dcn351/dcn351_init.c index 3c275a1eff58..4f73e7f551ac 100644 --- a/drivers/gpu/drm/amd/display/dc/hwss/dcn351/dcn351_init.c +++ b/drivers/gpu/drm/amd/display/dc/hwss/dcn351/dcn351_init.c @@ -91,7 +91,6 @@ static const struct hw_sequencer_funcs dcn351_funcs = { .enable_writeback = dcn30_enable_writeback, .disable_writeback = dcn30_disable_writeback, .update_writeback = dcn30_update_writeback, - .mmhubbub_warmup = dcn30_mmhubbub_warmup, .dmdata_status_done = dcn20_dmdata_status_done, .program_dmdata_engine = dcn30_program_dmdata_engine, .set_dmdata_attributes = dcn20_set_dmdata_attributes, diff --git a/drivers/gpu/drm/amd/display/dc/hwss/dcn401/dcn401_hwseq.c b/drivers/gpu/drm/amd/display/dc/hwss/dcn401/dcn401_hwseq.c index 5de11e2837c0..555a9f590cd7 100644 --- a/drivers/gpu/drm/amd/display/dc/hwss/dcn401/dcn401_hwseq.c +++ b/drivers/gpu/drm/amd/display/dc/hwss/dcn401/dcn401_hwseq.c @@ -3,6 +3,7 @@ // Copyright 2024 Advanced Micro Devices, Inc. 
#include "dm_services.h" +#include "basics/dc_common.h" #include "dm_helpers.h" #include "core_types.h" #include "resource.h" @@ -126,91 +127,6 @@ void dcn401_program_gamut_remap(struct pipe_ctx *pipe_ctx) mpc->funcs->set_gamut_remap(mpc, mpcc_id, &mpc_adjust); } -struct ips_ono_region_state dcn401_read_ono_state(struct dc *dc, uint8_t region) -{ - struct dce_hwseq *hws = dc->hwseq; - struct ips_ono_region_state state = {0, 0}; - - switch (region) { - case 0: - /* dccg, dio, dcio */ - REG_GET_2(DOMAIN22_PG_STATUS, - DOMAIN_DESIRED_PWR_STATE, &state.desire_pwr_state, - DOMAIN_PGFSM_PWR_STATUS, &state.current_pwr_state); - break; - case 1: - /* dchubbub, dchvm, dchubbubmem */ - REG_GET_2(DOMAIN23_PG_STATUS, - DOMAIN_DESIRED_PWR_STATE, &state.desire_pwr_state, - DOMAIN_PGFSM_PWR_STATUS, &state.current_pwr_state); - break; - case 2: - /* mpc, opp, optc, dwb */ - REG_GET_2(DOMAIN24_PG_STATUS, - DOMAIN_DESIRED_PWR_STATE, &state.desire_pwr_state, - DOMAIN_PGFSM_PWR_STATUS, &state.current_pwr_state); - break; - case 3: - /* hpo */ - REG_GET_2(DOMAIN25_PG_STATUS, - DOMAIN_DESIRED_PWR_STATE, &state.desire_pwr_state, - DOMAIN_PGFSM_PWR_STATUS, &state.current_pwr_state); - break; - case 4: - /* dchubp0, dpp0 */ - REG_GET_2(DOMAIN0_PG_STATUS, - DOMAIN_DESIRED_PWR_STATE, &state.desire_pwr_state, - DOMAIN_PGFSM_PWR_STATUS, &state.current_pwr_state); - break; - case 5: - /* dsc0 */ - REG_GET_2(DOMAIN16_PG_STATUS, - DOMAIN_DESIRED_PWR_STATE, &state.desire_pwr_state, - DOMAIN_PGFSM_PWR_STATUS, &state.current_pwr_state); - break; - case 6: - /* dchubp1, dpp1 */ - REG_GET_2(DOMAIN1_PG_STATUS, - DOMAIN_DESIRED_PWR_STATE, &state.desire_pwr_state, - DOMAIN_PGFSM_PWR_STATUS, &state.current_pwr_state); - break; - case 7: - /* dsc1 */ - REG_GET_2(DOMAIN17_PG_STATUS, - DOMAIN_DESIRED_PWR_STATE, &state.desire_pwr_state, - DOMAIN_PGFSM_PWR_STATUS, &state.current_pwr_state); - break; - case 8: - /* dchubp2, dpp2 */ - REG_GET_2(DOMAIN2_PG_STATUS, - DOMAIN_DESIRED_PWR_STATE, 
&state.desire_pwr_state, - DOMAIN_PGFSM_PWR_STATUS, &state.current_pwr_state); - break; - case 9: - /* dsc2 */ - REG_GET_2(DOMAIN18_PG_STATUS, - DOMAIN_DESIRED_PWR_STATE, &state.desire_pwr_state, - DOMAIN_PGFSM_PWR_STATUS, &state.current_pwr_state); - break; - case 10: - /* dchubp3, dpp3 */ - REG_GET_2(DOMAIN3_PG_STATUS, - DOMAIN_DESIRED_PWR_STATE, &state.desire_pwr_state, - DOMAIN_PGFSM_PWR_STATUS, &state.current_pwr_state); - break; - case 11: - /* dsc3 */ - REG_GET_2(DOMAIN19_PG_STATUS, - DOMAIN_DESIRED_PWR_STATE, &state.desire_pwr_state, - DOMAIN_PGFSM_PWR_STATUS, &state.current_pwr_state); - break; - default: - break; - } - - return state; -} - void dcn401_init_hw(struct dc *dc) { struct abm **abms = dc->res_pool->multiple_abms; @@ -435,7 +351,8 @@ void dcn401_init_hw(struct dc *dc) dc->caps.dmub_caps.psr = dc->ctx->dmub_srv->dmub->feature_caps.psr; dc->caps.dmub_caps.mclk_sw = dc->ctx->dmub_srv->dmub->feature_caps.fw_assisted_mclk_switch_ver > 0; dc->caps.dmub_caps.fams_ver = dc->ctx->dmub_srv->dmub->feature_caps.fw_assisted_mclk_switch_ver; - dc->debug.fams2_config.bits.enable &= dc->ctx->dmub_srv->dmub->feature_caps.fw_assisted_mclk_switch_ver == 2; + dc->debug.fams2_config.bits.enable &= + dc->caps.dmub_caps.fams_ver == dc->debug.fams_version.ver; // sw & fw fams versions must match for support if ((!dc->debug.fams2_config.bits.enable && dc->res_pool->funcs->update_bw_bounding_box) || res_pool->ref_clocks.dchub_ref_clock_inKhz / 1000 != current_dchub_ref_freq) { /* update bounding box if FAMS2 disabled, or if dchub clk has changed */ @@ -820,7 +737,8 @@ enum dc_status dcn401_enable_stream_timing( int opp_cnt = 1; int opp_inst[MAX_PIPES] = {0}; struct pipe_ctx *opp_heads[MAX_PIPES] = {0}; - bool manual_mode; + struct dc_crtc_timing patched_crtc_timing = stream->timing; + bool manual_mode = false; unsigned int tmds_div = PIXEL_RATE_DIV_NA; unsigned int unused_div = PIXEL_RATE_DIV_NA; int odm_slice_width; @@ -874,16 +792,20 @@ enum dc_status 
dcn401_enable_stream_timing( if (dc->hwseq->funcs.PLAT_58856_wa && (!dc_is_dp_signal(stream->signal))) dc->hwseq->funcs.PLAT_58856_wa(context, pipe_ctx); + /* if we are borrowing from hblank, h_addressable needs to be adjusted */ + if (dc->debug.enable_hblank_borrow) + patched_crtc_timing.h_addressable = patched_crtc_timing.h_addressable + pipe_ctx->hblank_borrow; + pipe_ctx->stream_res.tg->funcs->program_timing( - pipe_ctx->stream_res.tg, - &stream->timing, - pipe_ctx->pipe_dlg_param.vready_offset, - pipe_ctx->pipe_dlg_param.vstartup_start, - pipe_ctx->pipe_dlg_param.vupdate_offset, - pipe_ctx->pipe_dlg_param.vupdate_width, - pipe_ctx->pipe_dlg_param.pstate_keepout, - pipe_ctx->stream->signal, - true); + pipe_ctx->stream_res.tg, + &patched_crtc_timing, + (unsigned int)pipe_ctx->global_sync.dcn4x.vready_offset_pixels, + (unsigned int)pipe_ctx->global_sync.dcn4x.vstartup_lines, + (unsigned int)pipe_ctx->global_sync.dcn4x.vupdate_offset_pixels, + (unsigned int)pipe_ctx->global_sync.dcn4x.vupdate_vupdate_width_pixels, + (unsigned int)pipe_ctx->global_sync.dcn4x.pstate_keepout_start_lines, + pipe_ctx->stream->signal, + true); for (i = 0; i < opp_cnt; i++) { opp_heads[i]->stream_res.opp->funcs->opp_pipe_clock_control( @@ -2007,3 +1929,730 @@ void dcn401_reset_hw_ctx_wrap( } } } + +static unsigned int dcn401_calculate_vready_offset_for_group(struct pipe_ctx *pipe) +{ + struct pipe_ctx *other_pipe; + unsigned int vready_offset = pipe->global_sync.dcn4x.vready_offset_pixels; + + /* Always use the largest vready_offset of all connected pipes */ + for (other_pipe = pipe->bottom_pipe; other_pipe != NULL; other_pipe = other_pipe->bottom_pipe) { + if (other_pipe->global_sync.dcn4x.vready_offset_pixels > vready_offset) + vready_offset = other_pipe->global_sync.dcn4x.vready_offset_pixels; + } + for (other_pipe = pipe->top_pipe; other_pipe != NULL; other_pipe = other_pipe->top_pipe) { + if (other_pipe->global_sync.dcn4x.vready_offset_pixels > vready_offset) + vready_offset = 
other_pipe->global_sync.dcn4x.vready_offset_pixels; + } + for (other_pipe = pipe->next_odm_pipe; other_pipe != NULL; other_pipe = other_pipe->next_odm_pipe) { + if (other_pipe->global_sync.dcn4x.vready_offset_pixels > vready_offset) + vready_offset = other_pipe->global_sync.dcn4x.vready_offset_pixels; + } + for (other_pipe = pipe->prev_odm_pipe; other_pipe != NULL; other_pipe = other_pipe->prev_odm_pipe) { + if (other_pipe->global_sync.dcn4x.vready_offset_pixels > vready_offset) + vready_offset = other_pipe->global_sync.dcn4x.vready_offset_pixels; + } + + return vready_offset; +} + +static void dcn401_program_tg( + struct dc *dc, + struct pipe_ctx *pipe_ctx, + struct dc_state *context, + struct dce_hwseq *hws) +{ + pipe_ctx->stream_res.tg->funcs->program_global_sync( + pipe_ctx->stream_res.tg, + dcn401_calculate_vready_offset_for_group(pipe_ctx), + (unsigned int)pipe_ctx->global_sync.dcn4x.vstartup_lines, + (unsigned int)pipe_ctx->global_sync.dcn4x.vupdate_offset_pixels, + (unsigned int)pipe_ctx->global_sync.dcn4x.vupdate_vupdate_width_pixels, + (unsigned int)pipe_ctx->global_sync.dcn4x.pstate_keepout_start_lines); + + if (dc_state_get_pipe_subvp_type(context, pipe_ctx) != SUBVP_PHANTOM) + pipe_ctx->stream_res.tg->funcs->wait_for_state(pipe_ctx->stream_res.tg, CRTC_STATE_VACTIVE); + + pipe_ctx->stream_res.tg->funcs->set_vtg_params( + pipe_ctx->stream_res.tg, &pipe_ctx->stream->timing, true); + + if (hws->funcs.setup_vupdate_interrupt) + hws->funcs.setup_vupdate_interrupt(dc, pipe_ctx); +} + +static void dcn401_program_pipe( + struct dc *dc, + struct pipe_ctx *pipe_ctx, + struct dc_state *context) +{ + struct dce_hwseq *hws = dc->hwseq; + + /* Only need to unblank on top pipe */ + if (resource_is_pipe_type(pipe_ctx, OTG_MASTER)) { + if (pipe_ctx->update_flags.bits.enable || + pipe_ctx->update_flags.bits.odm || + pipe_ctx->stream->update_flags.bits.abm_level) + hws->funcs.blank_pixel_data(dc, pipe_ctx, + !pipe_ctx->plane_state || + !pipe_ctx->plane_state->visible); + 
} + + /* Only update TG on top pipe */ + if (pipe_ctx->update_flags.bits.global_sync && !pipe_ctx->top_pipe + && !pipe_ctx->prev_odm_pipe) + dcn401_program_tg(dc, pipe_ctx, context, hws); + + if (pipe_ctx->update_flags.bits.odm) + hws->funcs.update_odm(dc, context, pipe_ctx); + + if (pipe_ctx->update_flags.bits.enable) { + if (hws->funcs.enable_plane) + hws->funcs.enable_plane(dc, pipe_ctx, context); + else + dc->hwss.enable_plane(dc, pipe_ctx, context); + + if (dc->res_pool->hubbub->funcs->force_wm_propagate_to_pipes) + dc->res_pool->hubbub->funcs->force_wm_propagate_to_pipes(dc->res_pool->hubbub); + } + + if (pipe_ctx->update_flags.bits.det_size) { + if (dc->res_pool->hubbub->funcs->program_det_size) + dc->res_pool->hubbub->funcs->program_det_size( + dc->res_pool->hubbub, pipe_ctx->plane_res.hubp->inst, pipe_ctx->det_buffer_size_kb); + if (dc->res_pool->hubbub->funcs->program_det_segments) + dc->res_pool->hubbub->funcs->program_det_segments( + dc->res_pool->hubbub, pipe_ctx->plane_res.hubp->inst, pipe_ctx->hubp_regs.det_size); + } + + if (pipe_ctx->update_flags.raw || + (pipe_ctx->plane_state && pipe_ctx->plane_state->update_flags.raw) || + pipe_ctx->stream->update_flags.raw) + dc->hwss.update_dchubp_dpp(dc, pipe_ctx, context); + + if (pipe_ctx->plane_state && (pipe_ctx->update_flags.bits.enable || + pipe_ctx->plane_state->update_flags.bits.hdr_mult)) + hws->funcs.set_hdr_multiplier(pipe_ctx); + + if (hws->funcs.populate_mcm_luts) { + if (pipe_ctx->plane_state) { + hws->funcs.populate_mcm_luts(dc, pipe_ctx, pipe_ctx->plane_state->mcm_luts, + pipe_ctx->plane_state->lut_bank_a); + pipe_ctx->plane_state->lut_bank_a = !pipe_ctx->plane_state->lut_bank_a; + } + } + + if (pipe_ctx->plane_state && + (pipe_ctx->plane_state->update_flags.bits.in_transfer_func_change || + pipe_ctx->plane_state->update_flags.bits.gamma_change || + pipe_ctx->plane_state->update_flags.bits.lut_3d || + pipe_ctx->update_flags.bits.enable)) + hws->funcs.set_input_transfer_func(dc, pipe_ctx, 
pipe_ctx->plane_state); + + /* dcn10_translate_regamma_to_hw_format takes 750us to finish + * only do gamma programming for powering on, internal memcmp to avoid + * updating on slave planes + */ + if (pipe_ctx->update_flags.bits.enable || + pipe_ctx->update_flags.bits.plane_changed || + pipe_ctx->stream->update_flags.bits.out_tf || + (pipe_ctx->plane_state && + pipe_ctx->plane_state->update_flags.bits.output_tf_change)) + hws->funcs.set_output_transfer_func(dc, pipe_ctx, pipe_ctx->stream); + + /* If the pipe has been enabled or has a different opp, we + * should reprogram the fmt. This deals with cases where + * interation between mpc and odm combine on different streams + * causes a different pipe to be chosen to odm combine with. + */ + if (pipe_ctx->update_flags.bits.enable + || pipe_ctx->update_flags.bits.opp_changed) { + + pipe_ctx->stream_res.opp->funcs->opp_set_dyn_expansion( + pipe_ctx->stream_res.opp, + COLOR_SPACE_YCBCR601, + pipe_ctx->stream->timing.display_color_depth, + pipe_ctx->stream->signal); + + pipe_ctx->stream_res.opp->funcs->opp_program_fmt( + pipe_ctx->stream_res.opp, + &pipe_ctx->stream->bit_depth_params, + &pipe_ctx->stream->clamping); + } + + /* Set ABM pipe after other pipe configurations done */ + if ((pipe_ctx->plane_state && pipe_ctx->plane_state->visible)) { + if (pipe_ctx->stream_res.abm) { + dc->hwss.set_pipe(pipe_ctx); + pipe_ctx->stream_res.abm->funcs->set_abm_level(pipe_ctx->stream_res.abm, + pipe_ctx->stream->abm_level); + } + } + + if (pipe_ctx->update_flags.bits.test_pattern_changed) { + struct output_pixel_processor *odm_opp = pipe_ctx->stream_res.opp; + struct bit_depth_reduction_params params; + + memset(&params, 0, sizeof(params)); + odm_opp->funcs->opp_program_bit_depth_reduction(odm_opp, &params); + dc->hwss.set_disp_pattern_generator(dc, + pipe_ctx, + pipe_ctx->stream_res.test_pattern_params.test_pattern, + pipe_ctx->stream_res.test_pattern_params.color_space, + pipe_ctx->stream_res.test_pattern_params.color_depth, + NULL, +
pipe_ctx->stream_res.test_pattern_params.width, + pipe_ctx->stream_res.test_pattern_params.height, + pipe_ctx->stream_res.test_pattern_params.offset); + } +} + +void dcn401_program_front_end_for_ctx( + struct dc *dc, + struct dc_state *context) +{ + int i; + unsigned int prev_hubp_count = 0; + unsigned int hubp_count = 0; + struct dce_hwseq *hws = dc->hwseq; + struct pipe_ctx *pipe = NULL; + + DC_LOGGER_INIT(dc->ctx->logger); + + if (resource_is_pipe_topology_changed(dc->current_state, context)) + resource_log_pipe_topology_update(dc, context); + + if (dc->hwss.program_triplebuffer != NULL && dc->debug.enable_tri_buf) { + for (i = 0; i < dc->res_pool->pipe_count; i++) { + pipe = &context->res_ctx.pipe_ctx[i]; + + if (!pipe->top_pipe && !pipe->prev_odm_pipe && pipe->plane_state) { + if (pipe->plane_state->triplebuffer_flips) + BREAK_TO_DEBUGGER(); + + /*turn off triple buffer for full update*/ + dc->hwss.program_triplebuffer( + dc, pipe, pipe->plane_state->triplebuffer_flips); + } + } + } + + for (i = 0; i < dc->res_pool->pipe_count; i++) { + if (dc->current_state->res_ctx.pipe_ctx[i].plane_state) + prev_hubp_count++; + if (context->res_ctx.pipe_ctx[i].plane_state) + hubp_count++; + } + + if (prev_hubp_count == 0 && hubp_count > 0) { + if (dc->res_pool->hubbub->funcs->force_pstate_change_control) + dc->res_pool->hubbub->funcs->force_pstate_change_control( + dc->res_pool->hubbub, true, false); + udelay(500); + } + + /* Set pipe update flags and lock pipes */ + for (i = 0; i < dc->res_pool->pipe_count; i++) + dc->hwss.detect_pipe_changes(dc->current_state, context, &dc->current_state->res_ctx.pipe_ctx[i], + &context->res_ctx.pipe_ctx[i]); + + /* When disabling phantom pipes, turn on phantom OTG first (so we can get double + * buffer updates properly) + */ + for (i = 0; i < dc->res_pool->pipe_count; i++) { + struct dc_stream_state *stream = dc->current_state->res_ctx.pipe_ctx[i].stream; + + pipe = &dc->current_state->res_ctx.pipe_ctx[i]; + + if 
(context->res_ctx.pipe_ctx[i].update_flags.bits.disable && stream && + dc_state_get_pipe_subvp_type(dc->current_state, pipe) == SUBVP_PHANTOM) { + struct timing_generator *tg = dc->current_state->res_ctx.pipe_ctx[i].stream_res.tg; + + if (tg->funcs->enable_crtc) { + if (dc->hwseq->funcs.blank_pixel_data) + dc->hwseq->funcs.blank_pixel_data(dc, pipe, true); + + tg->funcs->enable_crtc(tg); + } + } + } + /* OTG blank before disabling all front ends */ + for (i = 0; i < dc->res_pool->pipe_count; i++) + if (context->res_ctx.pipe_ctx[i].update_flags.bits.disable + && !context->res_ctx.pipe_ctx[i].top_pipe + && !context->res_ctx.pipe_ctx[i].prev_odm_pipe + && context->res_ctx.pipe_ctx[i].stream) + hws->funcs.blank_pixel_data(dc, &context->res_ctx.pipe_ctx[i], true); + + + /* Disconnect mpcc */ + for (i = 0; i < dc->res_pool->pipe_count; i++) + if (context->res_ctx.pipe_ctx[i].update_flags.bits.disable + || context->res_ctx.pipe_ctx[i].update_flags.bits.opp_changed) { + struct hubbub *hubbub = dc->res_pool->hubbub; + + /* Phantom pipe DET should be 0, but if a pipe in use is being transitioned to phantom + * then we want to do the programming here (effectively it's being disabled). If we do + * the programming later the DET won't be updated until the OTG for the phantom pipe is + * turned on (i.e. in an MCLK switch) which can come in too late and cause issues with + * DET allocation. 
+ */ + if ((context->res_ctx.pipe_ctx[i].update_flags.bits.disable || + (context->res_ctx.pipe_ctx[i].plane_state && + dc_state_get_pipe_subvp_type(context, &context->res_ctx.pipe_ctx[i]) == + SUBVP_PHANTOM))) { + if (hubbub->funcs->program_det_size) + hubbub->funcs->program_det_size(hubbub, + dc->current_state->res_ctx.pipe_ctx[i].plane_res.hubp->inst, 0); + if (dc->res_pool->hubbub->funcs->program_det_segments) + dc->res_pool->hubbub->funcs->program_det_segments( + hubbub, dc->current_state->res_ctx.pipe_ctx[i].plane_res.hubp->inst, 0); + } + hws->funcs.plane_atomic_disconnect(dc, dc->current_state, + &dc->current_state->res_ctx.pipe_ctx[i]); + DC_LOG_DC("Reset mpcc for pipe %d\n", dc->current_state->res_ctx.pipe_ctx[i].pipe_idx); + } + + /* update ODM for blanked OTG master pipes */ + for (i = 0; i < dc->res_pool->pipe_count; i++) { + pipe = &context->res_ctx.pipe_ctx[i]; + if (resource_is_pipe_type(pipe, OTG_MASTER) && + !resource_is_pipe_type(pipe, DPP_PIPE) && + pipe->update_flags.bits.odm && + hws->funcs.update_odm) + hws->funcs.update_odm(dc, context, pipe); + } + + /* + * Program all updated pipes, order matters for mpcc setup. Start with + * top pipe and program all pipes that follow in order + */ + for (i = 0; i < dc->res_pool->pipe_count; i++) { + pipe = &context->res_ctx.pipe_ctx[i]; + + if (pipe->plane_state && !pipe->top_pipe) { + while (pipe) { + if (hws->funcs.program_pipe) + hws->funcs.program_pipe(dc, pipe, context); + else { + /* Don't program phantom pipes in the regular front end programming sequence. + * There is an MPO transition case where a pipe being used by a video plane is + * transitioned directly to be a phantom pipe when closing the MPO video. + * However the phantom pipe will program a new HUBP_VTG_SEL (update takes place + * right away) but the MPO still exists until the double buffered update of the + * main pipe so we will get a frame of underflow if the phantom pipe is + * programmed here. 
+ */ + if (pipe->stream && + dc_state_get_pipe_subvp_type(context, pipe) != SUBVP_PHANTOM) + dcn401_program_pipe(dc, pipe, context); + } + + pipe = pipe->bottom_pipe; + } + } + + /* Program secondary blending tree and writeback pipes */ + pipe = &context->res_ctx.pipe_ctx[i]; + if (!pipe->top_pipe && !pipe->prev_odm_pipe + && pipe->stream && pipe->stream->num_wb_info > 0 + && (pipe->update_flags.raw || (pipe->plane_state && pipe->plane_state->update_flags.raw) + || pipe->stream->update_flags.raw) + && hws->funcs.program_all_writeback_pipes_in_tree) + hws->funcs.program_all_writeback_pipes_in_tree(dc, pipe->stream, context); + + /* Avoid underflow by check of pipe line read when adding 2nd plane. */ + if (hws->wa.wait_hubpret_read_start_during_mpo_transition && + !pipe->top_pipe && + pipe->stream && + pipe->plane_res.hubp->funcs->hubp_wait_pipe_read_start && + dc->current_state->stream_status[0].plane_count == 1 && + context->stream_status[0].plane_count > 1) { + pipe->plane_res.hubp->funcs->hubp_wait_pipe_read_start(pipe->plane_res.hubp); + } + } +} + +void dcn401_post_unlock_program_front_end( + struct dc *dc, + struct dc_state *context) +{ + // Timeout for pipe enable + unsigned int timeout_us = 100000; + unsigned int polling_interval_us = 1; + struct dce_hwseq *hwseq = dc->hwseq; + int i; + + DC_LOGGER_INIT(dc->ctx->logger); + + for (i = 0; i < dc->res_pool->pipe_count; i++) + if (resource_is_pipe_type(&dc->current_state->res_ctx.pipe_ctx[i], OPP_HEAD) && + !resource_is_pipe_type(&context->res_ctx.pipe_ctx[i], OPP_HEAD)) + dc->hwss.post_unlock_reset_opp(dc, + &dc->current_state->res_ctx.pipe_ctx[i]); + + for (i = 0; i < dc->res_pool->pipe_count; i++) + if (context->res_ctx.pipe_ctx[i].update_flags.bits.disable) + dc->hwss.disable_plane(dc, dc->current_state, &dc->current_state->res_ctx.pipe_ctx[i]); + + /* + * If we are enabling a pipe, we need to wait for pending clear as this is a critical + * part of the enable operation otherwise, DM may request an immediate 
flip which + * will cause HW to perform an "immediate enable" (as opposed to "vsync enable") which + * is unsupported on DCN. + */ + for (i = 0; i < dc->res_pool->pipe_count; i++) { + struct pipe_ctx *pipe = &context->res_ctx.pipe_ctx[i]; + // Don't check flip pending on phantom pipes + if (pipe->plane_state && !pipe->top_pipe && pipe->update_flags.bits.enable && + dc_state_get_pipe_subvp_type(context, pipe) != SUBVP_PHANTOM) { + struct hubp *hubp = pipe->plane_res.hubp; + int j = 0; + + for (j = 0; j < timeout_us / polling_interval_us + && hubp->funcs->hubp_is_flip_pending(hubp); j++) + udelay(polling_interval_us); + } + } + + for (i = 0; i < dc->res_pool->pipe_count; i++) { + struct pipe_ctx *pipe = &context->res_ctx.pipe_ctx[i]; + struct pipe_ctx *old_pipe = &dc->current_state->res_ctx.pipe_ctx[i]; + + /* When going from a smaller ODM slice count to larger, we must ensure double + * buffer update completes before we return to ensure we don't reduce DISPCLK + * before we've transitioned to 2:1 or 4:1 + */ + if (resource_is_pipe_type(old_pipe, OTG_MASTER) && resource_is_pipe_type(pipe, OTG_MASTER) && + resource_get_odm_slice_count(old_pipe) < resource_get_odm_slice_count(pipe) && + dc_state_get_pipe_subvp_type(context, pipe) != SUBVP_PHANTOM) { + int j = 0; + struct timing_generator *tg = pipe->stream_res.tg; + + if (tg->funcs->get_optc_double_buffer_pending) { + for (j = 0; j < timeout_us / polling_interval_us + && tg->funcs->get_optc_double_buffer_pending(tg); j++) + udelay(polling_interval_us); + } + } + } + + if (dc->res_pool->hubbub->funcs->force_pstate_change_control) + dc->res_pool->hubbub->funcs->force_pstate_change_control( + dc->res_pool->hubbub, false, false); + + + for (i = 0; i < dc->res_pool->pipe_count; i++) { + struct pipe_ctx *pipe = &context->res_ctx.pipe_ctx[i]; + + if (pipe->plane_state && !pipe->top_pipe) { + /* Program phantom pipe here to prevent a frame of underflow in the MPO transition + * case (if a pipe being used for a video plane 
transitions to a phantom pipe, it + * can underflow due to HUBP_VTG_SEL programming if done in the regular front end + * programming sequence). + */ + while (pipe) { + if (pipe->stream && dc_state_get_pipe_subvp_type(context, pipe) == SUBVP_PHANTOM) { + /* When turning on the phantom pipe we want to run through the + * entire enable sequence, so apply all the "enable" flags. + */ + if (dc->hwss.apply_update_flags_for_phantom) + dc->hwss.apply_update_flags_for_phantom(pipe); + if (dc->hwss.update_phantom_vp_position) + dc->hwss.update_phantom_vp_position(dc, context, pipe); + dcn401_program_pipe(dc, pipe, context); + } + pipe = pipe->bottom_pipe; + } + } + } + + if (!hwseq) + return; + + /* P-State support transitions: + * Natural -> FPO: P-State disabled in prepare, force disallow anytime is safe + * FPO -> Natural: Unforce anytime after FW disable is safe (P-State will assert naturally) + * Unsupported -> FPO: P-State enabled in optimize, force disallow anytime is safe + * FPO -> Unsupported: P-State disabled in prepare, unforce disallow anytime is safe + * FPO <-> SubVP: Force disallow is maintained on the FPO / SubVP pipes + */ + if (hwseq->funcs.update_force_pstate) + dc->hwseq->funcs.update_force_pstate(dc, context); + + /* Only program the MALL registers after all the main and phantom pipes + * are done programming. 
+ */ + if (hwseq->funcs.program_mall_pipe_config) + hwseq->funcs.program_mall_pipe_config(dc, context); + + /* WA to apply WM setting*/ + if (hwseq->wa.DEGVIDCN21) + dc->res_pool->hubbub->funcs->apply_DEDCN21_147_wa(dc->res_pool->hubbub); + + + /* WA for stutter underflow during MPO transitions when adding 2nd plane */ + if (hwseq->wa.disallow_self_refresh_during_multi_plane_transition) { + + if (dc->current_state->stream_status[0].plane_count == 1 && + context->stream_status[0].plane_count > 1) { + + struct timing_generator *tg = dc->res_pool->timing_generators[0]; + + dc->res_pool->hubbub->funcs->allow_self_refresh_control(dc->res_pool->hubbub, false); + + hwseq->wa_state.disallow_self_refresh_during_multi_plane_transition_applied = true; + hwseq->wa_state.disallow_self_refresh_during_multi_plane_transition_applied_on_frame = + tg->funcs->get_frame_count(tg); + } + } +} + +bool dcn401_update_bandwidth( + struct dc *dc, + struct dc_state *context) +{ + int i; + struct dce_hwseq *hws = dc->hwseq; + + /* recalculate DML parameters */ + if (!dc->res_pool->funcs->validate_bandwidth(dc, context, false)) + return false; + + /* apply updated bandwidth parameters */ + dc->hwss.prepare_bandwidth(dc, context); + + /* update hubp configs for all pipes */ + for (i = 0; i < dc->res_pool->pipe_count; i++) { + struct pipe_ctx *pipe_ctx = &context->res_ctx.pipe_ctx[i]; + + if (pipe_ctx->plane_state == NULL) + continue; + + if (pipe_ctx->top_pipe == NULL) { + bool blank = !is_pipe_tree_visible(pipe_ctx); + + pipe_ctx->stream_res.tg->funcs->program_global_sync( + pipe_ctx->stream_res.tg, + dcn401_calculate_vready_offset_for_group(pipe_ctx), + (unsigned int)pipe_ctx->global_sync.dcn4x.vstartup_lines, + (unsigned int)pipe_ctx->global_sync.dcn4x.vupdate_offset_pixels, + (unsigned int)pipe_ctx->global_sync.dcn4x.vupdate_vupdate_width_pixels, + (unsigned int)pipe_ctx->global_sync.dcn4x.pstate_keepout_start_lines); + + pipe_ctx->stream_res.tg->funcs->set_vtg_params( + 
pipe_ctx->stream_res.tg, &pipe_ctx->stream->timing, false); + + if (pipe_ctx->prev_odm_pipe == NULL) + hws->funcs.blank_pixel_data(dc, pipe_ctx, blank); + + if (hws->funcs.setup_vupdate_interrupt) + hws->funcs.setup_vupdate_interrupt(dc, pipe_ctx); + } + + if (pipe_ctx->plane_res.hubp->funcs->hubp_setup2) + pipe_ctx->plane_res.hubp->funcs->hubp_setup2( + pipe_ctx->plane_res.hubp, + &pipe_ctx->hubp_regs, + &pipe_ctx->global_sync, + &pipe_ctx->stream->timing); + } + + return true; +} + +void dcn401_detect_pipe_changes(struct dc_state *old_state, + struct dc_state *new_state, + struct pipe_ctx *old_pipe, + struct pipe_ctx *new_pipe) +{ + bool old_is_phantom = dc_state_get_pipe_subvp_type(old_state, old_pipe) == SUBVP_PHANTOM; + bool new_is_phantom = dc_state_get_pipe_subvp_type(new_state, new_pipe) == SUBVP_PHANTOM; + + unsigned int old_pipe_vready_offset_pixels = old_pipe->global_sync.dcn4x.vready_offset_pixels; + unsigned int new_pipe_vready_offset_pixels = new_pipe->global_sync.dcn4x.vready_offset_pixels; + unsigned int old_pipe_vstartup_lines = old_pipe->global_sync.dcn4x.vstartup_lines; + unsigned int new_pipe_vstartup_lines = new_pipe->global_sync.dcn4x.vstartup_lines; + unsigned int old_pipe_vupdate_offset_pixels = old_pipe->global_sync.dcn4x.vupdate_offset_pixels; + unsigned int new_pipe_vupdate_offset_pixels = new_pipe->global_sync.dcn4x.vupdate_offset_pixels; + unsigned int old_pipe_vupdate_width_pixels = old_pipe->global_sync.dcn4x.vupdate_vupdate_width_pixels; + unsigned int new_pipe_vupdate_width_pixels = new_pipe->global_sync.dcn4x.vupdate_vupdate_width_pixels; + + new_pipe->update_flags.raw = 0; + + /* If non-phantom pipe is being transitioned to a phantom pipe, + * set disable and return immediately. This is because the pipe + * that was previously in use must be fully disabled before we + * can "enable" it as a phantom pipe (since the OTG will certainly + * be different). 
The post_unlock sequence will set the correct + * update flags to enable the phantom pipe. + */ + if (old_pipe->plane_state && !old_is_phantom && + new_pipe->plane_state && new_is_phantom) { + new_pipe->update_flags.bits.disable = 1; + return; + } + + if (resource_is_pipe_type(new_pipe, OTG_MASTER) && + resource_is_odm_topology_changed(new_pipe, old_pipe)) + /* Detect odm changes */ + new_pipe->update_flags.bits.odm = 1; + + /* Exit on unchanged, unused pipe */ + if (!old_pipe->plane_state && !new_pipe->plane_state) + return; + /* Detect pipe enable/disable */ + if (!old_pipe->plane_state && new_pipe->plane_state) { + new_pipe->update_flags.bits.enable = 1; + new_pipe->update_flags.bits.mpcc = 1; + new_pipe->update_flags.bits.dppclk = 1; + new_pipe->update_flags.bits.hubp_interdependent = 1; + new_pipe->update_flags.bits.hubp_rq_dlg_ttu = 1; + new_pipe->update_flags.bits.unbounded_req = 1; + new_pipe->update_flags.bits.gamut_remap = 1; + new_pipe->update_flags.bits.scaler = 1; + new_pipe->update_flags.bits.viewport = 1; + new_pipe->update_flags.bits.det_size = 1; + if (new_pipe->stream->test_pattern.type != DP_TEST_PATTERN_VIDEO_MODE && + new_pipe->stream_res.test_pattern_params.width != 0 && + new_pipe->stream_res.test_pattern_params.height != 0) + new_pipe->update_flags.bits.test_pattern_changed = 1; + if (!new_pipe->top_pipe && !new_pipe->prev_odm_pipe) { + new_pipe->update_flags.bits.odm = 1; + new_pipe->update_flags.bits.global_sync = 1; + } + return; + } + + /* For SubVP we need to unconditionally enable because any phantom pipes are + * always removed then newly added for every full updates whenever SubVP is in use. + * The remove-add sequence of the phantom pipe always results in the pipe + * being blanked in enable_stream_timing (DPG). 
+ */ + if (new_pipe->stream && dc_state_get_pipe_subvp_type(new_state, new_pipe) == SUBVP_PHANTOM) + new_pipe->update_flags.bits.enable = 1; + + /* Phantom pipes are effectively disabled, if the pipe was previously phantom + * we have to enable + */ + if (old_pipe->plane_state && old_is_phantom && + new_pipe->plane_state && !new_is_phantom) + new_pipe->update_flags.bits.enable = 1; + + if (old_pipe->plane_state && !new_pipe->plane_state) { + new_pipe->update_flags.bits.disable = 1; + return; + } + + /* Detect plane change */ + if (old_pipe->plane_state != new_pipe->plane_state) + new_pipe->update_flags.bits.plane_changed = true; + + /* Detect top pipe only changes */ + if (resource_is_pipe_type(new_pipe, OTG_MASTER)) { + /* Detect global sync changes */ + if ((old_pipe_vready_offset_pixels != new_pipe_vready_offset_pixels) + || (old_pipe_vstartup_lines != new_pipe_vstartup_lines) + || (old_pipe_vupdate_offset_pixels != new_pipe_vupdate_offset_pixels) + || (old_pipe_vupdate_width_pixels != new_pipe_vupdate_width_pixels)) + new_pipe->update_flags.bits.global_sync = 1; + } + + if (old_pipe->det_buffer_size_kb != new_pipe->det_buffer_size_kb) + new_pipe->update_flags.bits.det_size = 1; + + /* + * Detect opp / tg change, only set on change, not on enable + * Assume mpcc inst = pipe index, if not this code needs to be updated + * since mpcc is what is affected by these. In fact all of our sequence + * makes this assumption at the moment with how hubp reset is matched to + * same index mpcc reset. 
+ */ + if (old_pipe->stream_res.opp != new_pipe->stream_res.opp) + new_pipe->update_flags.bits.opp_changed = 1; + if (old_pipe->stream_res.tg != new_pipe->stream_res.tg) + new_pipe->update_flags.bits.tg_changed = 1; + + /* + * Detect mpcc blending changes, only dpp inst and opp matter here, + * mpccs getting removed/inserted update connected ones during their own + * programming + */ + if (old_pipe->plane_res.dpp != new_pipe->plane_res.dpp + || old_pipe->stream_res.opp != new_pipe->stream_res.opp) + new_pipe->update_flags.bits.mpcc = 1; + + /* Detect dppclk change */ + if (old_pipe->plane_res.bw.dppclk_khz != new_pipe->plane_res.bw.dppclk_khz) + new_pipe->update_flags.bits.dppclk = 1; + + /* Check for scl update */ + if (memcmp(&old_pipe->plane_res.scl_data, &new_pipe->plane_res.scl_data, sizeof(struct scaler_data))) + new_pipe->update_flags.bits.scaler = 1; + /* Check for vp update */ + if (memcmp(&old_pipe->plane_res.scl_data.viewport, &new_pipe->plane_res.scl_data.viewport, sizeof(struct rect)) + || memcmp(&old_pipe->plane_res.scl_data.viewport_c, + &new_pipe->plane_res.scl_data.viewport_c, sizeof(struct rect))) + new_pipe->update_flags.bits.viewport = 1; + + /* Detect dlg/ttu/rq updates */ + { + struct dml2_display_dlg_regs old_dlg_regs = old_pipe->hubp_regs.dlg_regs; + struct dml2_display_ttu_regs old_ttu_regs = old_pipe->hubp_regs.ttu_regs; + struct dml2_display_rq_regs old_rq_regs = old_pipe->hubp_regs.rq_regs; + struct dml2_display_dlg_regs *new_dlg_regs = &new_pipe->hubp_regs.dlg_regs; + struct dml2_display_ttu_regs *new_ttu_regs = &new_pipe->hubp_regs.ttu_regs; + struct dml2_display_rq_regs *new_rq_regs = &new_pipe->hubp_regs.rq_regs; + + /* Detect pipe interdependent updates */ + if ((old_dlg_regs.dst_y_prefetch != new_dlg_regs->dst_y_prefetch) + || (old_dlg_regs.vratio_prefetch != new_dlg_regs->vratio_prefetch) + || (old_dlg_regs.vratio_prefetch_c != new_dlg_regs->vratio_prefetch_c) + || (old_dlg_regs.dst_y_per_vm_vblank != 
new_dlg_regs->dst_y_per_vm_vblank) + || (old_dlg_regs.dst_y_per_row_vblank != new_dlg_regs->dst_y_per_row_vblank) + || (old_dlg_regs.dst_y_per_vm_flip != new_dlg_regs->dst_y_per_vm_flip) + || (old_dlg_regs.dst_y_per_row_flip != new_dlg_regs->dst_y_per_row_flip) + || (old_dlg_regs.refcyc_per_meta_chunk_vblank_l != new_dlg_regs->refcyc_per_meta_chunk_vblank_l) + || (old_dlg_regs.refcyc_per_meta_chunk_vblank_c != new_dlg_regs->refcyc_per_meta_chunk_vblank_c) + || (old_dlg_regs.refcyc_per_meta_chunk_flip_l != new_dlg_regs->refcyc_per_meta_chunk_flip_l) + || (old_dlg_regs.refcyc_per_line_delivery_pre_l != new_dlg_regs->refcyc_per_line_delivery_pre_l) + || (old_dlg_regs.refcyc_per_line_delivery_pre_c != new_dlg_regs->refcyc_per_line_delivery_pre_c) + || (old_ttu_regs.refcyc_per_req_delivery_pre_l != new_ttu_regs->refcyc_per_req_delivery_pre_l) + || (old_ttu_regs.refcyc_per_req_delivery_pre_c != new_ttu_regs->refcyc_per_req_delivery_pre_c) + || (old_ttu_regs.refcyc_per_req_delivery_pre_cur0 != + new_ttu_regs->refcyc_per_req_delivery_pre_cur0) + || (old_ttu_regs.min_ttu_vblank != new_ttu_regs->min_ttu_vblank) + || (old_ttu_regs.qos_level_flip != new_ttu_regs->qos_level_flip)) { + old_dlg_regs.dst_y_prefetch = new_dlg_regs->dst_y_prefetch; + old_dlg_regs.vratio_prefetch = new_dlg_regs->vratio_prefetch; + old_dlg_regs.vratio_prefetch_c = new_dlg_regs->vratio_prefetch_c; + old_dlg_regs.dst_y_per_vm_vblank = new_dlg_regs->dst_y_per_vm_vblank; + old_dlg_regs.dst_y_per_row_vblank = new_dlg_regs->dst_y_per_row_vblank; + old_dlg_regs.dst_y_per_vm_flip = new_dlg_regs->dst_y_per_vm_flip; + old_dlg_regs.dst_y_per_row_flip = new_dlg_regs->dst_y_per_row_flip; + old_dlg_regs.refcyc_per_meta_chunk_vblank_l = new_dlg_regs->refcyc_per_meta_chunk_vblank_l; + old_dlg_regs.refcyc_per_meta_chunk_vblank_c = new_dlg_regs->refcyc_per_meta_chunk_vblank_c; + old_dlg_regs.refcyc_per_meta_chunk_flip_l = new_dlg_regs->refcyc_per_meta_chunk_flip_l; + old_dlg_regs.refcyc_per_line_delivery_pre_l = 
new_dlg_regs->refcyc_per_line_delivery_pre_l; + old_dlg_regs.refcyc_per_line_delivery_pre_c = new_dlg_regs->refcyc_per_line_delivery_pre_c; + old_ttu_regs.refcyc_per_req_delivery_pre_l = new_ttu_regs->refcyc_per_req_delivery_pre_l; + old_ttu_regs.refcyc_per_req_delivery_pre_c = new_ttu_regs->refcyc_per_req_delivery_pre_c; + old_ttu_regs.refcyc_per_req_delivery_pre_cur0 = new_ttu_regs->refcyc_per_req_delivery_pre_cur0; + old_ttu_regs.min_ttu_vblank = new_ttu_regs->min_ttu_vblank; + old_ttu_regs.qos_level_flip = new_ttu_regs->qos_level_flip; + new_pipe->update_flags.bits.hubp_interdependent = 1; + } + /* Detect any other updates to ttu/rq/dlg */ + if (memcmp(&old_dlg_regs, new_dlg_regs, sizeof(old_dlg_regs)) || + memcmp(&old_ttu_regs, new_ttu_regs, sizeof(old_ttu_regs)) || + memcmp(&old_rq_regs, new_rq_regs, sizeof(old_rq_regs))) + new_pipe->update_flags.bits.hubp_rq_dlg_ttu = 1; + } + + if (old_pipe->unbounded_req != new_pipe->unbounded_req) + new_pipe->update_flags.bits.unbounded_req = 1; + + if (memcmp(&old_pipe->stream_res.test_pattern_params, + &new_pipe->stream_res.test_pattern_params, sizeof(struct test_pattern_params))) { + new_pipe->update_flags.bits.test_pattern_changed = 1; + } +} diff --git a/drivers/gpu/drm/amd/display/dc/hwss/dcn401/dcn401_hwseq.h b/drivers/gpu/drm/amd/display/dc/hwss/dcn401/dcn401_hwseq.h index 28a513dfc005..17cea748789e 100644 --- a/drivers/gpu/drm/amd/display/dc/hwss/dcn401/dcn401_hwseq.h +++ b/drivers/gpu/drm/amd/display/dc/hwss/dcn401/dcn401_hwseq.h @@ -63,8 +63,6 @@ void dcn401_set_cursor_position(struct pipe_ctx *pipe_ctx); bool dcn401_apply_idle_power_optimizations(struct dc *dc, bool enable); -struct ips_ono_region_state dcn401_read_ono_state(struct dc *dc, - uint8_t region); void dcn401_wait_for_dcc_meta_propagation(const struct dc *dc, const struct pipe_ctx *top_pipe_to_program); @@ -96,5 +94,12 @@ void dcn401_reset_hw_ctx_wrap( struct dc *dc, struct dc_state *context); void dcn401_perform_3dlut_wa_unlock(struct pipe_ctx 
*pipe_ctx); - +void dcn401_program_front_end_for_ctx(struct dc *dc, struct dc_state *context); +void dcn401_post_unlock_program_front_end(struct dc *dc, struct dc_state *context); +bool dcn401_update_bandwidth(struct dc *dc, struct dc_state *context); +void dcn401_detect_pipe_changes( + struct dc_state *old_state, + struct dc_state *new_state, + struct pipe_ctx *old_pipe, + struct pipe_ctx *new_pipe); #endif /* __DC_HWSS_DCN401_H__ */ diff --git a/drivers/gpu/drm/amd/display/dc/hwss/dcn401/dcn401_init.c b/drivers/gpu/drm/amd/display/dc/hwss/dcn401/dcn401_init.c index 23e4f208152e..44cb376f97c1 100644 --- a/drivers/gpu/drm/amd/display/dc/hwss/dcn401/dcn401_init.c +++ b/drivers/gpu/drm/amd/display/dc/hwss/dcn401/dcn401_init.c @@ -17,9 +17,9 @@ static const struct hw_sequencer_funcs dcn401_funcs = { .init_hw = dcn401_init_hw, .apply_ctx_to_hw = dce110_apply_ctx_to_hw, .apply_ctx_for_surface = NULL, - .program_front_end_for_ctx = dcn20_program_front_end_for_ctx, + .program_front_end_for_ctx = dcn401_program_front_end_for_ctx, .wait_for_pending_cleared = dcn10_wait_for_pending_cleared, - .post_unlock_program_front_end = dcn20_post_unlock_program_front_end, + .post_unlock_program_front_end = dcn401_post_unlock_program_front_end, .update_plane_addr = dcn20_update_plane_addr, .update_dchub = dcn10_update_dchub, .update_pending_status = dcn10_update_pending_status, @@ -42,7 +42,7 @@ static const struct hw_sequencer_funcs dcn401_funcs = { .cursor_lock = dcn10_cursor_lock, .prepare_bandwidth = dcn401_prepare_bandwidth, .optimize_bandwidth = dcn401_optimize_bandwidth, - .update_bandwidth = dcn20_update_bandwidth, + .update_bandwidth = dcn401_update_bandwidth, .set_drr = dcn10_set_drr, .get_position = dcn10_get_position, .set_static_screen_control = dcn31_set_static_screen_control, @@ -66,7 +66,6 @@ static const struct hw_sequencer_funcs dcn401_funcs = { .enable_writeback = dcn30_enable_writeback, .disable_writeback = dcn30_disable_writeback, .update_writeback = 
dcn30_update_writeback, - .mmhubbub_warmup = dcn30_mmhubbub_warmup, .dmdata_status_done = dcn20_dmdata_status_done, .program_dmdata_engine = dcn30_program_dmdata_engine, .set_dmdata_attributes = dcn20_set_dmdata_attributes, @@ -100,6 +99,10 @@ static const struct hw_sequencer_funcs dcn401_funcs = { .fams2_global_control_lock_fast = dcn401_fams2_global_control_lock_fast, .program_outstanding_updates = dcn401_program_outstanding_updates, .wait_for_all_pending_updates = dcn30_wait_for_all_pending_updates, + .detect_pipe_changes = dcn401_detect_pipe_changes, + .enable_plane = dcn20_enable_plane, + .update_dchubp_dpp = dcn20_update_dchubp_dpp, + .post_unlock_reset_opp = dcn20_post_unlock_reset_opp, }; static const struct hwseq_private_funcs dcn401_private_funcs = { diff --git a/drivers/gpu/drm/amd/display/dc/hwss/hw_sequencer.h b/drivers/gpu/drm/amd/display/dc/hwss/hw_sequencer.h index 66fdc5805d0a..a7d66cfd93c9 100644 --- a/drivers/gpu/drm/amd/display/dc/hwss/hw_sequencer.h +++ b/drivers/gpu/drm/amd/display/dc/hwss/hw_sequencer.h @@ -194,7 +194,6 @@ enum block_sequence_func { DMUB_SUBVP_SAVE_SURF_ADDR, HUBP_WAIT_FOR_DCC_META_PROP, DMUB_FAMS2_GLOBAL_CONTROL_LOCK_FAST, - }; struct block_sequence { @@ -331,10 +330,6 @@ struct hw_sequencer_funcs { void (*disable_writeback)(struct dc *dc, unsigned int dwb_pipe_inst); - bool (*mmhubbub_warmup)(struct dc *dc, - unsigned int num_dwb, - struct dc_writeback_info *wb_info); - /* Clock Related */ enum dc_status (*set_clock)(struct dc *dc, enum dc_clock_type clock_type, @@ -462,6 +457,18 @@ struct hw_sequencer_funcs { struct dc_state *context); void (*setup_hpo_hw_control)(const struct dce_hwseq *hws, bool enable); void (*wait_for_all_pending_updates)(const struct pipe_ctx *pipe_ctx); + void (*detect_pipe_changes)(struct dc_state *old_state, + struct dc_state *new_state, + struct pipe_ctx *old_pipe, + struct pipe_ctx *new_pipe); + void (*enable_plane)(struct dc *dc, + struct pipe_ctx *pipe_ctx, + struct dc_state *context); + void 
(*update_dchubp_dpp)(struct dc *dc, + struct pipe_ctx *pipe_ctx, + struct dc_state *context); + void (*post_unlock_reset_opp)(struct dc *dc, + struct pipe_ctx *opp_head); }; void color_space_to_black_color( @@ -489,11 +496,12 @@ void get_hdr_visual_confirm_color( void get_mpctree_visual_confirm_color( struct pipe_ctx *pipe_ctx, struct tg_color *color); - +void get_vabc_visual_confirm_color( + struct pipe_ctx *pipe_ctx, + struct tg_color *color); void get_subvp_visual_confirm_color( struct pipe_ctx *pipe_ctx, struct tg_color *color); - void get_fams2_visual_confirm_color( struct dc *dc, struct dc_state *context, diff --git a/drivers/gpu/drm/amd/display/dc/inc/core_types.h b/drivers/gpu/drm/amd/display/dc/inc/core_types.h index 8597e866bfe6..d558efc6e12f 100644 --- a/drivers/gpu/drm/amd/display/dc/inc/core_types.h +++ b/drivers/gpu/drm/amd/display/dc/inc/core_types.h @@ -45,9 +45,6 @@ #define MAX_SVP_PHANTOM_STREAMS 2 #define MAX_SVP_PHANTOM_PLANES 2 -void enable_surface_flip_reporting(struct dc_plane_state *plane_state, - uint32_t controller_id); - #include "grph_object_id.h" #include "link_encoder.h" #include "stream_encoder.h" @@ -219,6 +216,8 @@ struct resource_funcs { * Get indicator of power from a context that went through full validation */ int (*get_power_profile)(const struct dc_state *context); + unsigned int (*get_det_buffer_size)(const struct dc_state *context); + unsigned int (*get_vstartup_for_pipe)(struct pipe_ctx *pipe_ctx); }; struct audio_support{ @@ -467,6 +466,7 @@ struct pipe_ctx { unsigned int surface_size_in_mall_bytes; struct dml2_dchub_per_pipe_register_set hubp_regs; struct dml2_hubp_pipe_mcache_regs mcache_regs; + union dml2_global_sync_programming global_sync; struct dwbc *dwbc; struct mcif_wb *mcif_wb; @@ -477,6 +477,8 @@ struct pipe_ctx { /* subvp_index: only valid if the pipe is a SUBVP_MAIN*/ uint8_t subvp_index; struct pixel_rate_divider pixel_rate_divider; + /* pixels borrowed from hblank to hactive */ + uint8_t hblank_borrow; }; /* 
Data used for dynamic link encoder assignment. @@ -539,7 +541,8 @@ struct dcn_bw_output { bool legacy_svp_drr_stream_index_valid; struct dml2_mcache_surface_allocation mcache_allocations[DML2_MAX_PLANES]; struct dmub_cmd_fams2_global_config fams2_global_config; - struct dmub_fams2_stream_static_state fams2_stream_params[DML2_MAX_PLANES]; + union dmub_cmd_fams2_config fams2_stream_base_params[DML2_MAX_PLANES]; + union dmub_cmd_fams2_config fams2_stream_sub_params[DML2_MAX_PLANES]; struct dml2_display_arb_regs arb_regs; }; diff --git a/drivers/gpu/drm/amd/display/dc/inc/dcn_calcs.h b/drivers/gpu/drm/amd/display/dc/inc/dcn_calcs.h index 55529c5f471c..d19a595c2be4 100644 --- a/drivers/gpu/drm/amd/display/dc/inc/dcn_calcs.h +++ b/drivers/gpu/drm/amd/display/dc/inc/dcn_calcs.h @@ -624,10 +624,6 @@ bool dcn_validate_bandwidth( struct dc_state *context, bool fast_validate); -unsigned int dcn_find_dcfclk_suits_all( - const struct dc *dc, - struct dc_clocks *clocks); - void dcn_get_soc_clks( struct dc *dc, int *min_fclk_khz, diff --git a/drivers/gpu/drm/amd/display/dc/inc/hw/clk_mgr.h b/drivers/gpu/drm/amd/display/dc/inc/hw/clk_mgr.h index 2d06067ff36d..c14d64687a3d 100644 --- a/drivers/gpu/drm/amd/display/dc/inc/hw/clk_mgr.h +++ b/drivers/gpu/drm/amd/display/dc/inc/hw/clk_mgr.h @@ -306,6 +306,9 @@ struct clk_mgr_funcs { */ void (*set_hard_min_memclk)(struct clk_mgr *clk_mgr, bool current_mode); + int (*get_hard_min_memclk)(struct clk_mgr *clk_mgr); + int (*get_hard_min_fclk)(struct clk_mgr *clk_mgr); + /* Send message to PMFW to set hard max memclk frequency to highest DPM */ void (*set_hard_max_memclk)(struct clk_mgr *clk_mgr); diff --git a/drivers/gpu/drm/amd/display/dc/inc/hw/clk_mgr_internal.h b/drivers/gpu/drm/amd/display/dc/inc/hw/clk_mgr_internal.h index c2dd061892f4..7a1ca1e98059 100644 --- a/drivers/gpu/drm/amd/display/dc/inc/hw/clk_mgr_internal.h +++ b/drivers/gpu/drm/amd/display/dc/inc/hw/clk_mgr_internal.h @@ -166,6 +166,41 @@ enum dentist_divider_range { 
CLK_SR_DCN32(CLK1_CLK4_CURRENT_CNT), \ CLK_SR_DCN32(CLK4_CLK0_CURRENT_CNT) +#define CLK_REG_LIST_DCN35() \ + CLK_SR_DCN35(CLK1_CLK_PLL_REQ), \ + CLK_SR_DCN35(CLK1_CLK0_DFS_CNTL), \ + CLK_SR_DCN35(CLK1_CLK1_DFS_CNTL), \ + CLK_SR_DCN35(CLK1_CLK2_DFS_CNTL), \ + CLK_SR_DCN35(CLK1_CLK3_DFS_CNTL), \ + CLK_SR_DCN35(CLK1_CLK4_DFS_CNTL), \ + CLK_SR_DCN35(CLK1_CLK5_DFS_CNTL), \ + CLK_SR_DCN35(CLK1_CLK0_CURRENT_CNT), \ + CLK_SR_DCN35(CLK1_CLK1_CURRENT_CNT), \ + CLK_SR_DCN35(CLK1_CLK2_CURRENT_CNT), \ + CLK_SR_DCN35(CLK1_CLK3_CURRENT_CNT), \ + CLK_SR_DCN35(CLK1_CLK4_CURRENT_CNT), \ + CLK_SR_DCN35(CLK1_CLK5_CURRENT_CNT), \ + CLK_SR_DCN35(CLK1_CLK0_BYPASS_CNTL), \ + CLK_SR_DCN35(CLK1_CLK1_BYPASS_CNTL), \ + CLK_SR_DCN35(CLK1_CLK2_BYPASS_CNTL), \ + CLK_SR_DCN35(CLK1_CLK3_BYPASS_CNTL), \ + CLK_SR_DCN35(CLK1_CLK4_BYPASS_CNTL),\ + CLK_SR_DCN35(CLK1_CLK5_BYPASS_CNTL), \ + CLK_SR_DCN35(CLK1_CLK0_DS_CNTL), \ + CLK_SR_DCN35(CLK1_CLK1_DS_CNTL), \ + CLK_SR_DCN35(CLK1_CLK2_DS_CNTL), \ + CLK_SR_DCN35(CLK1_CLK3_DS_CNTL), \ + CLK_SR_DCN35(CLK1_CLK4_DS_CNTL), \ + CLK_SR_DCN35(CLK1_CLK5_DS_CNTL), \ + CLK_SR_DCN35(CLK1_CLK0_ALLOW_DS), \ + CLK_SR_DCN35(CLK1_CLK1_ALLOW_DS), \ + CLK_SR_DCN35(CLK1_CLK2_ALLOW_DS), \ + CLK_SR_DCN35(CLK1_CLK3_ALLOW_DS), \ + CLK_SR_DCN35(CLK1_CLK4_ALLOW_DS), \ + CLK_SR_DCN35(CLK1_CLK5_ALLOW_DS), \ + CLK_SR_DCN35(CLK5_spll_field_8), \ + SR(DENTIST_DISPCLK_CNTL), \ + #define CLK_COMMON_MASK_SH_LIST_DCN32(mask_sh) \ CLK_COMMON_MASK_SH_LIST_DCN20_BASE(mask_sh),\ CLK_SF(CLK1_CLK_PLL_REQ, FbMult_int, mask_sh),\ @@ -236,6 +271,7 @@ struct clk_mgr_registers { uint32_t CLK1_CLK2_DFS_CNTL; uint32_t CLK1_CLK3_DFS_CNTL; uint32_t CLK1_CLK4_DFS_CNTL; + uint32_t CLK1_CLK5_DFS_CNTL; uint32_t CLK2_CLK2_DFS_CNTL; uint32_t CLK1_CLK0_CURRENT_CNT; @@ -243,11 +279,34 @@ struct clk_mgr_registers { uint32_t CLK1_CLK2_CURRENT_CNT; uint32_t CLK1_CLK3_CURRENT_CNT; uint32_t CLK1_CLK4_CURRENT_CNT; + uint32_t CLK1_CLK5_CURRENT_CNT; uint32_t CLK0_CLK0_DFS_CNTL; uint32_t CLK0_CLK1_DFS_CNTL; uint32_t 
CLK0_CLK3_DFS_CNTL; uint32_t CLK0_CLK4_DFS_CNTL; + uint32_t CLK1_CLK0_BYPASS_CNTL; + uint32_t CLK1_CLK1_BYPASS_CNTL; + uint32_t CLK1_CLK2_BYPASS_CNTL; + uint32_t CLK1_CLK3_BYPASS_CNTL; + uint32_t CLK1_CLK4_BYPASS_CNTL; + uint32_t CLK1_CLK5_BYPASS_CNTL; + + uint32_t CLK1_CLK0_DS_CNTL; + uint32_t CLK1_CLK1_DS_CNTL; + uint32_t CLK1_CLK2_DS_CNTL; + uint32_t CLK1_CLK3_DS_CNTL; + uint32_t CLK1_CLK4_DS_CNTL; + uint32_t CLK1_CLK5_DS_CNTL; + + uint32_t CLK1_CLK0_ALLOW_DS; + uint32_t CLK1_CLK1_ALLOW_DS; + uint32_t CLK1_CLK2_ALLOW_DS; + uint32_t CLK1_CLK3_ALLOW_DS; + uint32_t CLK1_CLK4_ALLOW_DS; + uint32_t CLK1_CLK5_ALLOW_DS; + uint32_t CLK5_spll_field_8; + }; struct clk_mgr_shift { diff --git a/drivers/gpu/drm/amd/display/dc/inc/hw/hubp.h b/drivers/gpu/drm/amd/display/dc/inc/hw/hubp.h index 16580d624278..2a530a4a39f7 100644 --- a/drivers/gpu/drm/amd/display/dc/inc/hw/hubp.h +++ b/drivers/gpu/drm/amd/display/dc/inc/hw/hubp.h @@ -42,6 +42,7 @@ #include "cursor_reg_cache.h" #include "dml2/dml21/inc/dml_top_dchub_registers.h" +#include "dml2/dml21/inc/dml_top_types.h" #define OPP_ID_INVALID 0xf #define MAX_TTU 0xffffff @@ -144,11 +145,21 @@ struct hubp_funcs { struct _vcs_dpi_display_rq_regs_st *rq_regs, struct _vcs_dpi_display_pipe_dest_params_st *pipe_dest); + void (*hubp_setup2)( + struct hubp *hubp, + struct dml2_dchub_per_pipe_register_set *pipe_regs, + union dml2_global_sync_programming *pipe_global_sync, + struct dc_crtc_timing *timing); + void (*hubp_setup_interdependent)( struct hubp *hubp, struct _vcs_dpi_display_dlg_regs_st *dlg_regs, struct _vcs_dpi_display_ttu_regs_st *ttu_regs); + void (*hubp_setup_interdependent2)( + struct hubp *hubp, + struct dml2_dchub_per_pipe_register_set *pipe_regs); + void (*dcc_control)(struct hubp *hubp, bool enable, enum hubp_ind_block_size blk_size); @@ -165,7 +176,7 @@ struct hubp_funcs { void (*hubp_program_pte_vm)( struct hubp *hubp, enum surface_pixel_format format, - union dc_tiling_info *tiling_info, + struct dc_tiling_info 
*tiling_info, enum dc_rotation_angle rotation); void (*hubp_set_vm_system_aperture_settings)( @@ -179,7 +190,7 @@ struct hubp_funcs { void (*hubp_program_surface_config)( struct hubp *hubp, enum surface_pixel_format format, - union dc_tiling_info *tiling_info, + struct dc_tiling_info *tiling_info, struct plane_size *plane_size, enum dc_rotation_angle rotation, struct dc_plane_dcc_param *dcc, @@ -275,6 +286,7 @@ struct hubp_funcs { enum hubp_3dlut_fl_crossbar_bit_slice bit_slice_cb_b, enum hubp_3dlut_fl_crossbar_bit_slice bit_slice_cr_r); int (*hubp_get_3dlut_fl_done)(struct hubp *hubp); + void (*hubp_clear_tiling)(struct hubp *hubp); }; #endif diff --git a/drivers/gpu/drm/amd/display/dc/inc/hw/link_encoder.h b/drivers/gpu/drm/amd/display/dc/inc/hw/link_encoder.h index af9183f5d69b..08c16ba52a51 100644 --- a/drivers/gpu/drm/amd/display/dc/inc/hw/link_encoder.h +++ b/drivers/gpu/drm/amd/display/dc/inc/hw/link_encoder.h @@ -168,6 +168,14 @@ struct link_encoder_funcs { struct link_encoder *enc, enum encoder_type_select sel, uint32_t hpo_inst); + void (*enable_dpia_output)(struct link_encoder *enc, + const struct dc_link_settings *link_settings, + uint8_t dpia_id, + uint8_t digmode, + uint8_t fec_rdy); + void (*disable_dpia_output)(struct link_encoder *link_enc, + uint8_t dpia_id, + uint8_t digmode); }; /* diff --git a/drivers/gpu/drm/amd/display/dc/inc/hw/mem_input.h b/drivers/gpu/drm/amd/display/dc/inc/hw/mem_input.h index a8b44f398ce6..42fbc70f7056 100644 --- a/drivers/gpu/drm/amd/display/dc/inc/hw/mem_input.h +++ b/drivers/gpu/drm/amd/display/dc/inc/hw/mem_input.h @@ -150,7 +150,7 @@ struct mem_input_funcs { void (*mem_input_program_pte_vm)( struct mem_input *mem_input, enum surface_pixel_format format, - union dc_tiling_info *tiling_info, + struct dc_tiling_info *tiling_info, enum dc_rotation_angle rotation); void (*mem_input_set_vm_system_aperture_settings)( @@ -164,7 +164,7 @@ struct mem_input_funcs { void (*mem_input_program_surface_config)( struct mem_input 
*mem_input, enum surface_pixel_format format, - union dc_tiling_info *tiling_info, + struct dc_tiling_info *tiling_info, struct plane_size *plane_size, enum dc_rotation_angle rotation, struct dc_plane_dcc_param *dcc, @@ -187,6 +187,8 @@ struct mem_input_funcs { const struct dc_cursor_position *pos, const struct dc_cursor_mi_param *param); + void (*mem_input_clear_tiling)( + struct mem_input *mem_input); }; #endif diff --git a/drivers/gpu/drm/amd/display/dc/inc/hw/optc.h b/drivers/gpu/drm/amd/display/dc/inc/hw/optc.h index 03cbcbb36f1c..6fdc9809280c 100644 --- a/drivers/gpu/drm/amd/display/dc/inc/hw/optc.h +++ b/drivers/gpu/drm/amd/display/dc/inc/hw/optc.h @@ -210,7 +210,7 @@ void optc1_enable_crtc_reset(struct timing_generator *optc, bool optc1_configure_crc(struct timing_generator *optc, const struct crc_params *params); -bool optc1_get_crc(struct timing_generator *optc, +bool optc1_get_crc(struct timing_generator *optc, uint8_t idx, uint32_t *r_cr, uint32_t *g_y, uint32_t *b_cb); diff --git a/drivers/gpu/drm/amd/display/dc/inc/hw/timing_generator.h b/drivers/gpu/drm/amd/display/dc/inc/hw/timing_generator.h index b74e18cc1e66..9885cb3c310f 100644 --- a/drivers/gpu/drm/amd/display/dc/inc/hw/timing_generator.h +++ b/drivers/gpu/drm/amd/display/dc/inc/hw/timing_generator.h @@ -141,6 +141,9 @@ struct crc_params { bool continuous_mode; bool enable; + + uint8_t crc_eng_inst; + bool reset; }; /** @@ -291,7 +294,7 @@ struct timing_generator_funcs { * @get_crc: Get CRCs for the given timing generator. Return false if * CRCs are not enabled (via configure_crc). 
*/ - bool (*get_crc)(struct timing_generator *tg, + bool (*get_crc)(struct timing_generator *tg, uint8_t idx, uint32_t *r_cr, uint32_t *g_y, uint32_t *b_cb); void (*program_manual_trigger)(struct timing_generator *optc); diff --git a/drivers/gpu/drm/amd/display/dc/inc/link.h b/drivers/gpu/drm/amd/display/dc/inc/link.h index f04292086c08..fd1f9d3db039 100644 --- a/drivers/gpu/drm/amd/display/dc/inc/link.h +++ b/drivers/gpu/drm/amd/display/dc/inc/link.h @@ -148,6 +148,10 @@ struct link_service { const struct dc_stream_state *stream, const unsigned int num_streams); + uint32_t (*dp_required_hblank_size_bytes)( + const struct dc_link *link, + struct dp_audio_bandwidth_params *audio_params); + /*************************** DPMS *************************************/ void (*set_dpms_on)(struct dc_state *state, struct pipe_ctx *pipe_ctx); diff --git a/drivers/gpu/drm/amd/display/dc/irq/dcn201/irq_service_dcn201.c b/drivers/gpu/drm/amd/display/dc/irq/dcn201/irq_service_dcn201.c index 4fb9cd6708d5..1d61d475d36f 100644 --- a/drivers/gpu/drm/amd/display/dc/irq/dcn201/irq_service_dcn201.c +++ b/drivers/gpu/drm/amd/display/dc/irq/dcn201/irq_service_dcn201.c @@ -30,8 +30,8 @@ #include "../dce110/irq_service_dce110.h" #include "irq_service_dcn201.h" -#include "dcn/dcn_2_0_3_offset.h" -#include "dcn/dcn_2_0_3_sh_mask.h" +#include "dcn/dcn_2_0_1_offset.h" +#include "dcn/dcn_2_0_1_sh_mask.h" #include "cyan_skillfish_ip_offset.h" #include "soc15_hw_ip.h" diff --git a/drivers/gpu/drm/amd/display/dc/link/accessories/link_dp_cts.c b/drivers/gpu/drm/amd/display/dc/link/accessories/link_dp_cts.c index ff8fe1a94965..96febabf464a 100644 --- a/drivers/gpu/drm/amd/display/dc/link/accessories/link_dp_cts.c +++ b/drivers/gpu/drm/amd/display/dc/link/accessories/link_dp_cts.c @@ -251,7 +251,7 @@ static void dp_test_send_phy_test_pattern(struct dc_link *link) link_training_settings.lttpr_mode = dp_decide_lttpr_mode(link, &link->cur_link_settings); - if ((link->chip_caps & 
EXT_DISPLAY_PATH_CAPS__DP_FIXED_VS_EN) && + if (((link->chip_caps & AMD_EXT_DISPLAY_PATH_CAPS__EXT_CHIP_MASK) == AMD_EXT_DISPLAY_PATH_CAPS__DP_FIXED_VS_EN) && link_training_settings.lttpr_mode == LTTPR_MODE_TRANSPARENT) dp_fixed_vs_pe_read_lane_adjust( link, @@ -646,7 +646,7 @@ bool dp_set_test_pattern( if (IS_DP_PHY_PATTERN(test_pattern)) { /* Set DPCD Lane Settings before running test pattern */ if (p_link_settings != NULL) { - if ((link->chip_caps & EXT_DISPLAY_PATH_CAPS__DP_FIXED_VS_EN) && + if (((link->chip_caps & AMD_EXT_DISPLAY_PATH_CAPS__EXT_CHIP_MASK) == AMD_EXT_DISPLAY_PATH_CAPS__DP_FIXED_VS_EN) && p_link_settings->lttpr_mode == LTTPR_MODE_TRANSPARENT) { dp_fixed_vs_pe_set_retimer_lane_settings( link, diff --git a/drivers/gpu/drm/amd/display/dc/link/hwss/link_hwss_dio.c b/drivers/gpu/drm/amd/display/dc/link/hwss/link_hwss_dio.c index 3e47a6735912..06faa461067b 100644 --- a/drivers/gpu/drm/amd/display/dc/link/hwss/link_hwss_dio.c +++ b/drivers/gpu/drm/amd/display/dc/link/hwss/link_hwss_dio.c @@ -164,7 +164,9 @@ void disable_dio_link_output(struct dc_link *link, { struct link_encoder *link_enc = link_enc_cfg_get_link_enc(link); - link_enc->funcs->disable_output(link_enc, signal); + if (link_enc != NULL) + link_enc->funcs->disable_output(link_enc, signal); + link->dc->link_srv->dp_trace_source_sequence(link, DPCD_SOURCE_SEQ_AFTER_DISABLE_LINK_PHY); } diff --git a/drivers/gpu/drm/amd/display/dc/link/hwss/link_hwss_dio_fixed_vs_pe_retimer.c b/drivers/gpu/drm/amd/display/dc/link/hwss/link_hwss_dio_fixed_vs_pe_retimer.c index 348ea4cb832d..a6d1d7641ab4 100644 --- a/drivers/gpu/drm/amd/display/dc/link/hwss/link_hwss_dio_fixed_vs_pe_retimer.c +++ b/drivers/gpu/drm/amd/display/dc/link/hwss/link_hwss_dio_fixed_vs_pe_retimer.c @@ -187,7 +187,7 @@ static const struct link_hwss dio_fixed_vs_pe_retimer_link_hwss = { bool requires_fixed_vs_pe_retimer_dio_link_hwss(const struct dc_link *link) { - return (link->chip_caps & EXT_DISPLAY_PATH_CAPS__DP_FIXED_VS_EN); + return 
((link->chip_caps & AMD_EXT_DISPLAY_PATH_CAPS__EXT_CHIP_MASK) == AMD_EXT_DISPLAY_PATH_CAPS__DP_FIXED_VS_EN); } const struct link_hwss *get_dio_fixed_vs_pe_retimer_link_hwss(void) diff --git a/drivers/gpu/drm/amd/display/dc/link/hwss/link_hwss_dpia.c b/drivers/gpu/drm/amd/display/dc/link/hwss/link_hwss_dpia.c index 6499807af72a..36adf95744fe 100644 --- a/drivers/gpu/drm/amd/display/dc/link/hwss/link_hwss_dpia.c +++ b/drivers/gpu/drm/amd/display/dc/link/hwss/link_hwss_dpia.c @@ -77,17 +77,74 @@ static void set_dio_dpia_lane_settings(struct dc_link *link, { } +static void enable_dpia_link_output(struct dc_link *link, + const struct link_resource *link_res, + enum signal_type signal, + enum clock_source_id clock_source, + const struct dc_link_settings *link_settings) +{ + struct link_encoder *link_enc = link_enc_cfg_get_link_enc(link); + + if (link_enc != NULL) { + if (link->dc->config.enable_dpia_pre_training && link_enc->funcs->enable_dpia_output) { + uint8_t fec_rdy = link->dc->link_srv->dp_should_enable_fec(link); + uint8_t digmode = dc_is_dp_sst_signal(signal) ? DIG_SST_MODE : DIG_MST_MODE; + + link_enc->funcs->enable_dpia_output( + link_enc, + link_settings, + link->ddc_hw_inst, + digmode, + fec_rdy); + } else { + if (dc_is_dp_sst_signal(signal)) + link_enc->funcs->enable_dp_output( + link_enc, + link_settings, + clock_source); + else + link_enc->funcs->enable_dp_mst_output( + link_enc, + link_settings, + clock_source); + } + + } + + link->dc->link_srv->dp_trace_source_sequence(link, + DPCD_SOURCE_SEQ_AFTER_ENABLE_LINK_PHY); +} + +static void disable_dpia_link_output(struct dc_link *link, + const struct link_resource *link_res, + enum signal_type signal) +{ + struct link_encoder *link_enc = link_enc_cfg_get_link_enc(link); + + if (link_enc != NULL) { + if (link->dc->config.enable_dpia_pre_training && link_enc->funcs->disable_dpia_output) { + uint8_t digmode = dc_is_dp_sst_signal(signal) ? 
DIG_SST_MODE : DIG_MST_MODE; + + link_enc->funcs->disable_dpia_output(link_enc, link->ddc_hw_inst, digmode); + } else + link_enc->funcs->disable_output(link_enc, signal); + } + + link->dc->link_srv->dp_trace_source_sequence(link, + DPCD_SOURCE_SEQ_AFTER_DISABLE_LINK_PHY); +} + static const struct link_hwss dpia_link_hwss = { .setup_stream_encoder = setup_dio_stream_encoder, .reset_stream_encoder = reset_dio_stream_encoder, .setup_stream_attribute = setup_dio_stream_attribute, - .disable_link_output = disable_dio_link_output, + .disable_link_output = disable_dpia_link_output, .setup_audio_output = setup_dio_audio_output, .enable_audio_packet = enable_dio_audio_packet, .disable_audio_packet = disable_dio_audio_packet, .ext = { .set_throttled_vcp_size = set_dio_throttled_vcp_size, - .enable_dp_link_output = enable_dio_dp_link_output, + .enable_dp_link_output = enable_dpia_link_output, .set_dp_link_test_pattern = set_dio_dpia_link_test_pattern, .set_dp_lane_settings = set_dio_dpia_lane_settings, .update_stream_allocation_table = update_dpia_stream_allocation_table, diff --git a/drivers/gpu/drm/amd/display/dc/link/hwss/link_hwss_dpia.h b/drivers/gpu/drm/amd/display/dc/link/hwss/link_hwss_dpia.h index ad16ec5d9bb7..259e0f4775e1 100644 --- a/drivers/gpu/drm/amd/display/dc/link/hwss/link_hwss_dpia.h +++ b/drivers/gpu/drm/amd/display/dc/link/hwss/link_hwss_dpia.h @@ -27,6 +27,9 @@ #include "link_hwss.h" +#define DIG_SST_MODE 0 +#define DIG_MST_MODE 5 + const struct link_hwss *get_dpia_link_hwss(void); bool can_use_dpia_link_hwss(const struct dc_link *link, const struct link_resource *link_res); diff --git a/drivers/gpu/drm/amd/display/dc/link/link_detection.c b/drivers/gpu/drm/amd/display/dc/link/link_detection.c index e026c728042a..550e1a098fa2 100644 --- a/drivers/gpu/drm/amd/display/dc/link/link_detection.c +++ b/drivers/gpu/drm/amd/display/dc/link/link_detection.c @@ -829,7 +829,8 @@ static bool should_verify_link_capability_destructively(struct dc_link *link, if 
(link->dc->debug.skip_detection_link_training || dc_is_embedded_signal(link->local_sink->sink_signal) || - link->ep_type == DISPLAY_ENDPOINT_USB4_DPIA) { + (link->ep_type == DISPLAY_ENDPOINT_USB4_DPIA && + !link->dc->config.enable_dpia_pre_training)) { destrictive = false; } else if (link_dp_get_encoding_format(&max_link_cap) == DP_8b_10b_ENCODING) { diff --git a/drivers/gpu/drm/amd/display/dc/link/link_dpms.c b/drivers/gpu/drm/amd/display/dc/link/link_dpms.c index 41cab9ad6885..ec7de9c01fab 100644 --- a/drivers/gpu/drm/amd/display/dc/link/link_dpms.c +++ b/drivers/gpu/drm/amd/display/dc/link/link_dpms.c @@ -772,6 +772,20 @@ static bool dp_set_dsc_on_rx(struct pipe_ctx *pipe_ctx, bool enable) return result; } +static bool dp_set_hblank_reduction_on_rx(struct pipe_ctx *pipe_ctx) +{ + struct dc *dc = pipe_ctx->stream->ctx->dc; + struct dc_stream_state *stream = pipe_ctx->stream; + bool result = false; + + if (dc_is_virtual_signal(stream->signal)) + result = true; + else + result = dm_helpers_dp_write_hblank_reduction(dc->ctx, stream); + return result; +} + + /* The stream with these settings can be sent (unblanked) only after DSC was enabled on RX first, * i.e. 
after dp_enable_dsc_on_rx() had been called */ @@ -808,7 +822,8 @@ void link_set_dsc_on_stream(struct pipe_ctx *pipe_ctx, bool enable) enum optc_dsc_mode optc_dsc_mode; /* Enable DSC hw block */ - dsc_cfg.pic_width = (stream->timing.h_addressable + stream->timing.h_border_left + stream->timing.h_border_right) / opp_cnt; + dsc_cfg.pic_width = (stream->timing.h_addressable + pipe_ctx->hblank_borrow + + stream->timing.h_border_left + stream->timing.h_border_right) / opp_cnt; dsc_cfg.pic_height = stream->timing.v_addressable + stream->timing.v_border_top + stream->timing.v_border_bottom; dsc_cfg.pixel_encoding = stream->timing.pixel_encoding; dsc_cfg.color_depth = stream->timing.display_color_depth; @@ -1952,11 +1967,15 @@ static void enable_link_hdmi(struct pipe_ctx *pipe_ctx) stream->phy_pix_clk = stream->timing.pix_clk_100hz / 10; if (stream->phy_pix_clk > 340000) is_over_340mhz = true; + if (dc_is_tmds_signal(stream->signal) && stream->phy_pix_clk > 6000000UL) { + ASSERT(false); + return; + } if (dc_is_hdmi_signal(pipe_ctx->stream->signal)) { unsigned short masked_chip_caps = pipe_ctx->stream->link->chip_caps & - EXT_DISPLAY_PATH_CAPS__EXT_CHIP_MASK; - if (masked_chip_caps == EXT_DISPLAY_PATH_CAPS__HDMI20_TISN65DP159RSBT) { + AMD_EXT_DISPLAY_PATH_CAPS__EXT_CHIP_MASK; + if (masked_chip_caps == AMD_EXT_DISPLAY_PATH_CAPS__HDMI20_TISN65DP159RSBT) { /* DP159, Retimer settings */ eng_id = pipe_ctx->stream_res.stream_enc->id; @@ -1967,7 +1986,7 @@ static void enable_link_hdmi(struct pipe_ctx *pipe_ctx) write_i2c_default_retimer_setting(pipe_ctx, is_vga_mode, is_over_340mhz); } - } else if (masked_chip_caps == EXT_DISPLAY_PATH_CAPS__HDMI20_PI3EQX1204) { + } else if (masked_chip_caps == AMD_EXT_DISPLAY_PATH_CAPS__HDMI20_PI3EQX1204) { /* PI3EQX1204, Redriver settings */ write_i2c_redriver_setting(pipe_ctx, is_over_340mhz); } @@ -2023,7 +2042,7 @@ static enum dc_status enable_link_dp(struct dc_state *state, int lt_attempts = LINK_TRAINING_ATTEMPTS; // Increase retry count if 
attempting DP1.x on FIXED_VS link - if ((link->chip_caps & EXT_DISPLAY_PATH_CAPS__DP_FIXED_VS_EN) && + if (((link->chip_caps & AMD_EXT_DISPLAY_PATH_CAPS__EXT_CHIP_MASK) == AMD_EXT_DISPLAY_PATH_CAPS__DP_FIXED_VS_EN) && link_dp_get_encoding_format(link_settings) == DP_8b_10b_ENCODING) lt_attempts = 10; @@ -2038,7 +2057,8 @@ static enum dc_status enable_link_dp(struct dc_state *state, /* Train with fallback when enabling DPIA link. Conventional links are * trained with fallback during sink detection. */ - if (link->ep_type == DISPLAY_ENDPOINT_USB4_DPIA) + if (link->ep_type == DISPLAY_ENDPOINT_USB4_DPIA && + !link->dc->config.enable_dpia_pre_training) do_fallback = true; /* @@ -2374,13 +2394,13 @@ void link_set_dpms_off(struct pipe_ctx *pipe_ctx) enum engine_id eng_id = pipe_ctx->stream_res.stream_enc->id; unsigned short masked_chip_caps = link->chip_caps & - EXT_DISPLAY_PATH_CAPS__EXT_CHIP_MASK; + AMD_EXT_DISPLAY_PATH_CAPS__EXT_CHIP_MASK; //Need to inform that sink is going to use legacy HDMI mode. write_scdc_data( link->ddc, 165000,//vbios only handles 165Mhz. false); - if (masked_chip_caps == EXT_DISPLAY_PATH_CAPS__HDMI20_TISN65DP159RSBT) { + if (masked_chip_caps == AMD_EXT_DISPLAY_PATH_CAPS__HDMI20_TISN65DP159RSBT) { /* DP159, Retimer settings */ if (get_ext_hdmi_settings(pipe_ctx, eng_id, &settings)) write_i2c_retimer_setting(pipe_ctx, @@ -2388,7 +2408,7 @@ void link_set_dpms_off(struct pipe_ctx *pipe_ctx) else write_i2c_default_retimer_setting(pipe_ctx, false, false); - } else if (masked_chip_caps == EXT_DISPLAY_PATH_CAPS__HDMI20_PI3EQX1204) { + } else if (masked_chip_caps == AMD_EXT_DISPLAY_PATH_CAPS__HDMI20_PI3EQX1204) { /* PI3EQX1204, Redriver settings */ write_i2c_redriver_setting(pipe_ctx, false); } @@ -2528,6 +2548,15 @@ void link_set_dpms_on( if (pipe_ctx->stream->dpms_off) return; + /* For Dp tunneling link, a pending HPD means that we have a race condition between processing + * current link and processing the pending HPD. 
If we enable the link now, we may end up with a + * link that is not actually connected to a sink. So we skip enabling the link in this case. + */ + if (link->ep_type == DISPLAY_ENDPOINT_USB4_DPIA && link->is_hpd_pending) { + DC_LOG_DEBUG("%s, Link%d HPD is pending, not enable it.\n", __func__, link->link_index); + return; + } + /* Have to setup DSC before DIG FE and BE are connected (which happens before the * link training). This is to make sure the bandwidth sent to DIG BE won't be * bigger than what the link and/or DIG BE can handle. VBID[6]/CompressedStream_flag @@ -2593,6 +2622,9 @@ void link_set_dpms_on( } } + if (dc_is_dp_signal(pipe_ctx->stream->signal)) + dp_set_hblank_reduction_on_rx(pipe_ctx); + if (pipe_ctx->stream->link->ep_type == DISPLAY_ENDPOINT_USB4_DPIA) allocate_usb4_bandwidth(pipe_ctx->stream); diff --git a/drivers/gpu/drm/amd/display/dc/link/link_factory.c b/drivers/gpu/drm/amd/display/dc/link/link_factory.c index 5e1b5ab9fbc6..a7877d57a00f 100644 --- a/drivers/gpu/drm/amd/display/dc/link/link_factory.c +++ b/drivers/gpu/drm/amd/display/dc/link/link_factory.c @@ -101,6 +101,7 @@ static void construct_link_service_validation(struct link_service *link_srv) link_srv->validate_mode_timing = link_validate_mode_timing; link_srv->dp_link_bandwidth_kbps = dp_link_bandwidth_kbps; link_srv->validate_dpia_bandwidth = link_validate_dpia_bandwidth; + link_srv->dp_required_hblank_size_bytes = dp_required_hblank_size_bytes; } /* link dpms owns the programming sequence of stream's dpms state associated @@ -698,7 +699,7 @@ static bool construct_phy(struct dc_link *link, link->chip_caps); } - if (link->chip_caps & EXT_DISPLAY_PATH_CAPS__DP_FIXED_VS_EN) { + if ((link->chip_caps & AMD_EXT_DISPLAY_PATH_CAPS__EXT_CHIP_MASK) == AMD_EXT_DISPLAY_PATH_CAPS__DP_FIXED_VS_EN) { link->bios_forced_drive_settings.VOLTAGE_SWING = (bios->integrated_info->ext_disp_conn_info.fixdpvoltageswing & 0x3); link->bios_forced_drive_settings.PRE_EMPHASIS = diff --git 
a/drivers/gpu/drm/amd/display/dc/link/link_validation.c b/drivers/gpu/drm/amd/display/dc/link/link_validation.c index 60f15a9ba7a5..29606fda029d 100644 --- a/drivers/gpu/drm/amd/display/dc/link/link_validation.c +++ b/drivers/gpu/drm/amd/display/dc/link/link_validation.c @@ -409,3 +409,182 @@ bool link_validate_dpia_bandwidth(const struct dc_stream_state *stream, const un return dpia_validate_usb4_bw(dpia_link, bw_needed, num_dpias); } + +struct dp_audio_layout_config { + uint8_t layouts_per_sample_denom; + uint8_t symbols_per_layout; + uint8_t max_layouts_per_audio_sdp; +}; + +static void get_audio_layout_config( + uint32_t channel_count, + enum dp_link_encoding encoding, + struct dp_audio_layout_config *output) +{ + memset(output, 0, sizeof(struct dp_audio_layout_config)); + + /* Assuming L-PCM audio. Current implementation uses max 1 layout per SDP, + * with each layout being the same size (8ch layout). + */ + if (encoding == DP_8b_10b_ENCODING) { + if (channel_count == 2) { + output->layouts_per_sample_denom = 4; + output->symbols_per_layout = 40; + output->max_layouts_per_audio_sdp = 1; + } else if (channel_count == 8 || channel_count == 6) { + output->layouts_per_sample_denom = 1; + output->symbols_per_layout = 40; + output->max_layouts_per_audio_sdp = 1; + } + } else if (encoding == DP_128b_132b_ENCODING) { + if (channel_count == 2) { + output->layouts_per_sample_denom = 4; + output->symbols_per_layout = 10; + output->max_layouts_per_audio_sdp = 1; + } else if (channel_count == 8 || channel_count == 6) { + output->layouts_per_sample_denom = 1; + output->symbols_per_layout = 10; + output->max_layouts_per_audio_sdp = 1; + } + } +} + +static uint32_t get_av_stream_map_lane_count( + enum dp_link_encoding encoding, + enum dc_lane_count lane_count, + bool is_mst) +{ + uint32_t av_stream_map_lane_count = 0; + + if (encoding == DP_8b_10b_ENCODING) { + if (!is_mst) + av_stream_map_lane_count = lane_count; + else + av_stream_map_lane_count = 4; + } else if (encoding 
== DP_128b_132b_ENCODING) { + av_stream_map_lane_count = 4; + } + + ASSERT(av_stream_map_lane_count != 0); + + return av_stream_map_lane_count; +} + +static uint32_t get_audio_sdp_overhead( + enum dp_link_encoding encoding, + enum dc_lane_count lane_count, + bool is_mst) +{ + uint32_t audio_sdp_overhead = 0; + + if (encoding == DP_8b_10b_ENCODING) { + if (is_mst) + audio_sdp_overhead = 16; /* 4 * 2 + 8 */ + else + audio_sdp_overhead = lane_count * 2 + 8; + } else if (encoding == DP_128b_132b_ENCODING) { + audio_sdp_overhead = 10; /* 4 x 2.5 */ + } + + ASSERT(audio_sdp_overhead != 0); + + return audio_sdp_overhead; +} + +/* Current calculation only applicable for 8b/10b MST and 128b/132b SST/MST. + */ +static uint32_t calculate_overhead_hblank_bw_in_symbols( + uint32_t max_slice_h) +{ + uint32_t overhead_hblank_bw = 0; /* in stream symbols */ + + overhead_hblank_bw += max_slice_h * 4; /* EOC overhead */ + overhead_hblank_bw += 12; /* Main link overhead (VBID, BS/BE) */ + + return overhead_hblank_bw; +} + +uint32_t dp_required_hblank_size_bytes( + const struct dc_link *link, + struct dp_audio_bandwidth_params *audio_params) +{ + /* Main logic from dce_audio is duplicated here, with the main + * difference being: + * - Pre-determined lane count of 4 + * - Assumed 16 dsc slices for worst case + * - Assumed SDP split disabled for worst case + * TODO: Unify logic from dce_audio to prevent duplicated logic. + */ + + const struct dc_crtc_timing *timing = audio_params->crtc_timing; + const uint32_t channel_count = audio_params->channel_count; + const uint32_t sample_rate_hz = audio_params->sample_rate_hz; + const enum dp_link_encoding link_encoding = audio_params->link_encoding; + + // 8b/10b MST and 128b/132b are always 4 logical lanes. 
+ const uint32_t lane_count = 4; + const bool is_mst = (link->connector_signal == SIGNAL_TYPE_DISPLAY_PORT); + // Maximum slice count is with ODM 4:1, 4 slices per DSC + const uint32_t max_slices_h = 16; + + const uint32_t av_stream_map_lane_count = get_av_stream_map_lane_count( + link_encoding, lane_count, is_mst); + const uint32_t audio_sdp_overhead = get_audio_sdp_overhead( + link_encoding, lane_count, is_mst); + struct dp_audio_layout_config layout_config; + + if (link_encoding == DP_8b_10b_ENCODING && link->connector_signal == SIGNAL_TYPE_DISPLAY_PORT) + return 0; + + get_audio_layout_config( + channel_count, link_encoding, &layout_config); + + /* DP spec recommends between 1.05 to 1.1 safety margin to prevent sample under-run */ + struct fixed31_32 audio_sdp_margin = dc_fixpt_from_fraction(110, 100); + struct fixed31_32 horizontal_line_freq_khz = dc_fixpt_from_fraction( + timing->pix_clk_100hz, (long long)timing->h_total * 10); + struct fixed31_32 samples_per_line; + struct fixed31_32 layouts_per_line; + struct fixed31_32 symbols_per_sdp_max_layout; + struct fixed31_32 remainder; + uint32_t num_sdp_with_max_layouts; + uint32_t required_symbols_per_hblank; + uint32_t required_bytes_per_hblank = 0; + + samples_per_line = dc_fixpt_from_fraction(sample_rate_hz, 1000); + samples_per_line = dc_fixpt_div(samples_per_line, horizontal_line_freq_khz); + layouts_per_line = dc_fixpt_div_int(samples_per_line, layout_config.layouts_per_sample_denom); + // HBlank expansion usage assumes SDP split disabled to allow for worst case. 
+ layouts_per_line = dc_fixpt_from_int(dc_fixpt_ceil(layouts_per_line)); + + num_sdp_with_max_layouts = dc_fixpt_floor( + dc_fixpt_div_int(layouts_per_line, layout_config.max_layouts_per_audio_sdp)); + symbols_per_sdp_max_layout = dc_fixpt_from_int( + layout_config.max_layouts_per_audio_sdp * layout_config.symbols_per_layout); + symbols_per_sdp_max_layout = dc_fixpt_add_int(symbols_per_sdp_max_layout, audio_sdp_overhead); + symbols_per_sdp_max_layout = dc_fixpt_mul(symbols_per_sdp_max_layout, audio_sdp_margin); + required_symbols_per_hblank = num_sdp_with_max_layouts; + required_symbols_per_hblank *= ((dc_fixpt_ceil(symbols_per_sdp_max_layout) + av_stream_map_lane_count) / + av_stream_map_lane_count) * av_stream_map_lane_count; + + if (num_sdp_with_max_layouts != dc_fixpt_ceil( + dc_fixpt_div_int(layouts_per_line, layout_config.max_layouts_per_audio_sdp))) { + remainder = dc_fixpt_sub_int(layouts_per_line, + num_sdp_with_max_layouts * layout_config.max_layouts_per_audio_sdp); + remainder = dc_fixpt_mul_int(remainder, layout_config.symbols_per_layout); + remainder = dc_fixpt_add_int(remainder, audio_sdp_overhead); + remainder = dc_fixpt_mul(remainder, audio_sdp_margin); + required_symbols_per_hblank += ((dc_fixpt_ceil(remainder) + av_stream_map_lane_count) / + av_stream_map_lane_count) * av_stream_map_lane_count; + } + + required_symbols_per_hblank += calculate_overhead_hblank_bw_in_symbols(max_slices_h); + + if (link_encoding == DP_8b_10b_ENCODING) + required_bytes_per_hblank = required_symbols_per_hblank; // 8 bits per 8b/10b symbol + else if (link_encoding == DP_128b_132b_ENCODING) + required_bytes_per_hblank = required_symbols_per_hblank * 4; // 32 bits per 128b/132b symbol + + return required_bytes_per_hblank; +} + diff --git a/drivers/gpu/drm/amd/display/dc/link/link_validation.h b/drivers/gpu/drm/amd/display/dc/link/link_validation.h index 595fb05946e9..bf398c49c3e8 100644 --- a/drivers/gpu/drm/amd/display/dc/link/link_validation.h +++ 
b/drivers/gpu/drm/amd/display/dc/link/link_validation.h @@ -37,4 +37,9 @@ uint32_t dp_link_bandwidth_kbps( const struct dc_link *link, const struct dc_link_settings *link_settings); + +uint32_t dp_required_hblank_size_bytes( + const struct dc_link *link, + struct dp_audio_bandwidth_params *audio_params); + #endif /* __LINK_VALIDATION_H__ */ diff --git a/drivers/gpu/drm/amd/display/dc/link/protocols/link_ddc.c b/drivers/gpu/drm/amd/display/dc/link/protocols/link_ddc.c index d6d5bbf2108c..267180e7bc48 100644 --- a/drivers/gpu/drm/amd/display/dc/link/protocols/link_ddc.c +++ b/drivers/gpu/drm/amd/display/dc/link/protocols/link_ddc.c @@ -505,7 +505,7 @@ bool try_to_configure_aux_timeout(struct ddc_service *ddc, bool result = false; struct ddc *ddc_pin = ddc->ddc_pin; - if ((ddc->link->chip_caps & EXT_DISPLAY_PATH_CAPS__DP_FIXED_VS_EN) && + if (((ddc->link->chip_caps & AMD_EXT_DISPLAY_PATH_CAPS__EXT_CHIP_MASK) == AMD_EXT_DISPLAY_PATH_CAPS__DP_FIXED_VS_EN) && !ddc->link->dc->debug.disable_fixed_vs_aux_timeout_wa && ddc->ctx->dce_version == DCN_VERSION_3_1) { /* Fixed VS workaround for AUX timeout */ diff --git a/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_capability.c b/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_capability.c index 9dabaf682171..44c3023a7731 100644 --- a/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_capability.c +++ b/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_capability.c @@ -1554,7 +1554,7 @@ enum dc_status dp_retrieve_lttpr_cap(struct dc_link *link) /* If this chip cap is set, at least one retimer must exist in the chain * Override count to 1 if we receive a known bad count (0 or an invalid value) */ - if ((link->chip_caps & EXT_DISPLAY_PATH_CAPS__DP_FIXED_VS_EN) && + if (((link->chip_caps & AMD_EXT_DISPLAY_PATH_CAPS__EXT_CHIP_MASK) == AMD_EXT_DISPLAY_PATH_CAPS__DP_FIXED_VS_EN) && (dp_parse_lttpr_repeater_count(link->dpcd_caps.lttpr_caps.phy_repeater_cnt) == 0)) { /* If you see this message consistently, either 
the host platform has FIXED_VS flag * incorrectly configured or the sink device is returning an invalid count. @@ -1632,13 +1632,6 @@ static bool retrieve_link_cap(struct dc_link *link) sizeof(link->dpcd_caps.lttpr_caps.phy_repeater_cnt)); } - /* Read DP tunneling information. */ - if (link->ep_type == DISPLAY_ENDPOINT_USB4_DPIA) { - status = dpcd_get_tunneling_device_data(link); - if (status != DC_OK) - dm_error("%s: Read tunneling device data failed.\n", __func__); - } - dpcd_set_source_specific_data(link); /* Sink may need to configure internals based on vendor, so allow some * time before proceeding with possibly vendor specific transactions @@ -1711,7 +1704,7 @@ static bool retrieve_link_cap(struct dc_link *link) link->dpcd_caps.dprx_feature.raw = dpcd_dprx_data; if (status != DC_OK) - dm_error("%s: Read DPRX caps data failed.\n", __func__); + dm_error("%s: Read DPRX feature list failed.\n", __func__); /* AdaptiveSyncCapability */ dpcd_dprx_data = 0; @@ -1726,15 +1719,13 @@ static bool retrieve_link_cap(struct dc_link *link) link->dpcd_caps.adaptive_sync_caps.dp_adap_sync_caps.raw = dpcd_dprx_data; if (status != DC_OK) - dm_error("%s: Read DPRX caps data failed. Addr:%#x\n", + dm_error("%s: Read DPRX feature list_1 failed. Addr:%#x\n", __func__, DP_DPRX_FEATURE_ENUMERATION_LIST_CONT_1); } - else { link->dpcd_caps.dprx_feature.raw = 0; } - /* Error condition checking... * It is impossible for Sink to report Max Lane Count = 0. 
* It is possible for Sink to report Max Link Rate = 0, if it is @@ -1788,6 +1779,11 @@ static bool retrieve_link_cap(struct dc_link *link) link->test_pattern_enabled = false; link->compliance_test_state.raw = 0; + link->dpcd_caps.receive_port0_cap.raw[0] = + dpcd_data[DP_RECEIVE_PORT_0_CAP_0 - DP_DPCD_REV]; + link->dpcd_caps.receive_port0_cap.raw[1] = + dpcd_data[DP_RECEIVE_PORT_0_BUFFER_SIZE - DP_DPCD_REV]; + /* read sink count */ core_link_read_dpcd(link, DP_SINK_COUNT, @@ -1918,6 +1914,7 @@ static bool retrieve_link_cap(struct dc_link *link) if (link->dpcd_caps.channel_coding_cap.bits.DP_128b_132b_SUPPORTED) { DC_LOG_DP2("128b/132b encoding is supported at link %d", link->link_index); + /* Read 128b/132b supported link rates */ core_link_read_dpcd(link, DP_128B132B_SUPPORTED_LINK_RATES, &link->dpcd_caps.dp_128b_132b_supported_link_rates.raw, @@ -1965,6 +1962,13 @@ static bool retrieve_link_cap(struct dc_link *link) link->dpcd_caps.max_uncompressed_pixel_rate_cap.raw, sizeof(link->dpcd_caps.max_uncompressed_pixel_rate_cap.raw)); + /* Read DP tunneling information. */ + if (link->ep_type == DISPLAY_ENDPOINT_USB4_DPIA) { + status = dpcd_get_tunneling_device_data(link); + if (status != DC_OK) + dm_error("%s: Read DP tunneling device data failed.\n", __func__); + } + retrieve_cable_id(link); dpcd_write_cable_id_to_dprx(link); @@ -2308,6 +2312,14 @@ bool dp_verify_link_cap_with_retries( } else { link->verified_link_cap = last_verified_link_cap; } + + /* For Dp tunneling link, a pending HPD means that we have a race condition between processing + * current link and processing the pending HPD. Since the training is failed, we should just break + * the loop so that we have chance to process the pending HPD.
+ */ + if (link->ep_type == DISPLAY_ENDPOINT_USB4_DPIA && link->is_hpd_pending) + break; + fsleep(10 * 1000); } diff --git a/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_irq_handler.c b/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_irq_handler.c index 48abeaa88678..a08403c022ea 100644 --- a/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_irq_handler.c +++ b/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_irq_handler.c @@ -226,6 +226,8 @@ static void handle_hpd_irq_replay_sink(struct dc_link *link) replay_configuration.bits.STATE_TRANSITION_ERROR_STATUS) { bool allow_active; + link->replay_settings.config.replay_error_status.raw |= replay_error_status.raw; + if (link->replay_settings.config.force_disable_desync_error_check) return; @@ -237,6 +239,9 @@ static void handle_hpd_irq_replay_sink(struct dc_link *link) &replay_configuration.raw, sizeof(replay_configuration.raw)); + /* Update desync error counter */ + link->replay_settings.replay_desync_error_fail_count++; + /* Acknowledge and clear error bits */ dm_helpers_dp_write_dpcd( link->ctx, @@ -408,7 +413,8 @@ bool dp_handle_hpd_rx_irq(struct dc_link *link, if (hpd_irq_dpcd_data.bytes.device_service_irq.bits.AUTOMATED_TEST) { // Workaround for DP 1.4a LL Compliance CTS as USB4 has to share encoders unlike DP and USBC - if (link->ep_type == DISPLAY_ENDPOINT_USB4_DPIA) + if (link->ep_type == DISPLAY_ENDPOINT_USB4_DPIA && + !link->dc->config.enable_dpia_pre_training) link->skip_fallback_on_link_loss = true; device_service_clear.bits.AUTOMATED_TEST = 1; diff --git a/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_phy.c b/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_phy.c index bafa52a0165a..2c73ac87cd66 100644 --- a/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_phy.c +++ b/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_phy.c @@ -104,7 +104,7 @@ void dp_set_hw_lane_settings( // Don't return here if using FIXED_VS link HWSS and encoding is 128b/132b if 
((link_settings->lttpr_mode == LTTPR_MODE_NON_TRANSPARENT) && !is_immediate_downstream(link, offset) && - (!(link->chip_caps & EXT_DISPLAY_PATH_CAPS__DP_FIXED_VS_EN) || + (!((link->chip_caps & AMD_EXT_DISPLAY_PATH_CAPS__EXT_CHIP_MASK) == AMD_EXT_DISPLAY_PATH_CAPS__DP_FIXED_VS_EN) || link_dp_get_encoding_format(&link_settings->link_settings) == DP_8b_10b_ENCODING)) return; diff --git a/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_training.c b/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_training.c index 754c895e1bfb..88d4288cde0f 100644 --- a/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_training.c +++ b/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_training.c @@ -739,7 +739,7 @@ void override_training_settings( if (overrides->ffe_preset != NULL) lt_settings->ffe_preset = overrides->ffe_preset; /* Override HW lane settings with BIOS forced values if present */ - if ((link->chip_caps & EXT_DISPLAY_PATH_CAPS__DP_FIXED_VS_EN) && + if ((link->chip_caps & AMD_EXT_DISPLAY_PATH_CAPS__DP_FIXED_VS_EN) && lt_settings->lttpr_mode == LTTPR_MODE_TRANSPARENT) { lt_settings->voltage_swing = &link->bios_forced_drive_settings.VOLTAGE_SWING; lt_settings->pre_emphasis = &link->bios_forced_drive_settings.PRE_EMPHASIS; @@ -1574,7 +1574,7 @@ enum link_training_result dp_perform_link_training( * Per DP specs starting from here, DPTX device shall not issue * Non-LT AUX transactions inside training mode. 
 */ - if ((link->chip_caps & EXT_DISPLAY_PATH_CAPS__DP_FIXED_VS_EN) && encoding == DP_8b_10b_ENCODING) + if (((link->chip_caps & AMD_EXT_DISPLAY_PATH_CAPS__EXT_CHIP_MASK) == AMD_EXT_DISPLAY_PATH_CAPS__DP_FIXED_VS_EN) && encoding == DP_8b_10b_ENCODING) status = dp_perform_fixed_vs_pe_training_sequence(link, link_res, &lt_settings); else if (encoding == DP_8b_10b_ENCODING) status = dp_perform_8b_10b_link_training(link, link_res, &lt_settings); diff --git a/drivers/gpu/drm/amd/display/dc/mpc/dcn30/dcn30_mpc.c b/drivers/gpu/drm/amd/display/dc/mpc/dcn30/dcn30_mpc.c index fe26fde12eeb..85298b8a1b5e 100644 --- a/drivers/gpu/drm/amd/display/dc/mpc/dcn30/dcn30_mpc.c +++ b/drivers/gpu/drm/amd/display/dc/mpc/dcn30/dcn30_mpc.c @@ -110,6 +110,23 @@ void mpc3_disable_dwb_mux( MPC_DWB0_MUX, 0xf); } +void mpc3_set_out_rate_control( + struct mpc *mpc, + int opp_id, + bool enable, + bool rate_2x_mode, + struct mpc_dwb_flow_control *flow_control) +{ + struct dcn30_mpc *mpc30 = TO_DCN30_MPC(mpc); + + /* Always disable mpc out rate and flow control. + * MPC flow rate control is not needed for DCN30 and above. 
+ */ + REG_UPDATE_2(MUX[opp_id], + MPC_OUT_RATE_CONTROL_DISABLE, 1, + MPC_OUT_RATE_CONTROL, 0); +} + enum dc_lut_mode mpc3_get_ogam_current(struct mpc *mpc, int mpcc_id) { /*Contrary to DCN2 and DCN1 wherein a single status register field holds this info; @@ -1519,6 +1536,7 @@ static const struct mpc_funcs dcn30_mpc_funcs = { .set_dwb_mux = mpc3_set_dwb_mux, .disable_dwb_mux = mpc3_disable_dwb_mux, .is_dwb_idle = mpc3_is_dwb_idle, + .set_out_rate_control = mpc3_set_out_rate_control, .set_gamut_remap = mpc3_set_gamut_remap, .program_shaper = mpc3_program_shaper, .acquire_rmu = mpcc3_acquire_rmu, diff --git a/drivers/gpu/drm/amd/display/dc/mpc/dcn30/dcn30_mpc.h b/drivers/gpu/drm/amd/display/dc/mpc/dcn30/dcn30_mpc.h index ce93003dae01..103f29900a2c 100644 --- a/drivers/gpu/drm/amd/display/dc/mpc/dcn30/dcn30_mpc.h +++ b/drivers/gpu/drm/amd/display/dc/mpc/dcn30/dcn30_mpc.h @@ -1085,6 +1085,13 @@ bool mpc3_is_dwb_idle( struct mpc *mpc, int dwb_id); +void mpc3_set_out_rate_control( + struct mpc *mpc, + int opp_id, + bool enable, + bool rate_2x_mode, + struct mpc_dwb_flow_control *flow_control); + void mpc3_power_on_ogam_lut( struct mpc *mpc, int mpcc_id, bool power_on); diff --git a/drivers/gpu/drm/amd/display/dc/optc/dcn10/dcn10_optc.c b/drivers/gpu/drm/amd/display/dc/optc/dcn10/dcn10_optc.c index 097d06023e64..19d5ebc6763c 100644 --- a/drivers/gpu/drm/amd/display/dc/optc/dcn10/dcn10_optc.c +++ b/drivers/gpu/drm/amd/display/dc/optc/dcn10/dcn10_optc.c @@ -302,7 +302,6 @@ void optc1_program_timing( /* Enable stereo - only when we need to pack 3D frame. 
Other types * of stereo handled in explicit call */ - if (optc->funcs->is_two_pixels_per_container(&patched_crtc_timing) || optc1->opp_count == 2) h_div = H_TIMING_DIV_BY2; @@ -1471,37 +1470,71 @@ bool optc1_configure_crc(struct timing_generator *optc, if (!optc1_is_tg_enabled(optc)) return false; - REG_WRITE(OTG_CRC_CNTL, 0); + if (!params->enable || params->reset) + REG_WRITE(OTG_CRC_CNTL, 0); if (!params->enable) return true; /* Program frame boundaries */ - /* Window A x axis start and end. */ - REG_UPDATE_2(OTG_CRC0_WINDOWA_X_CONTROL, - OTG_CRC0_WINDOWA_X_START, params->windowa_x_start, - OTG_CRC0_WINDOWA_X_END, params->windowa_x_end); + switch (params->crc_eng_inst) { + case 0: + /* Window A x axis start and end. */ + REG_UPDATE_2(OTG_CRC0_WINDOWA_X_CONTROL, + OTG_CRC0_WINDOWA_X_START, params->windowa_x_start, + OTG_CRC0_WINDOWA_X_END, params->windowa_x_end); - /* Window A y axis start and end. */ - REG_UPDATE_2(OTG_CRC0_WINDOWA_Y_CONTROL, - OTG_CRC0_WINDOWA_Y_START, params->windowa_y_start, - OTG_CRC0_WINDOWA_Y_END, params->windowa_y_end); + /* Window A y axis start and end. */ + REG_UPDATE_2(OTG_CRC0_WINDOWA_Y_CONTROL, + OTG_CRC0_WINDOWA_Y_START, params->windowa_y_start, + OTG_CRC0_WINDOWA_Y_END, params->windowa_y_end); - /* Window B x axis start and end. */ - REG_UPDATE_2(OTG_CRC0_WINDOWB_X_CONTROL, - OTG_CRC0_WINDOWB_X_START, params->windowb_x_start, - OTG_CRC0_WINDOWB_X_END, params->windowb_x_end); + /* Window B x axis start and end. */ + REG_UPDATE_2(OTG_CRC0_WINDOWB_X_CONTROL, + OTG_CRC0_WINDOWB_X_START, params->windowb_x_start, + OTG_CRC0_WINDOWB_X_END, params->windowb_x_end); - /* Window B y axis start and end. */ - REG_UPDATE_2(OTG_CRC0_WINDOWB_Y_CONTROL, - OTG_CRC0_WINDOWB_Y_START, params->windowb_y_start, - OTG_CRC0_WINDOWB_Y_END, params->windowb_y_end); + /* Window B y axis start and end. 
*/ + REG_UPDATE_2(OTG_CRC0_WINDOWB_Y_CONTROL, + OTG_CRC0_WINDOWB_Y_START, params->windowb_y_start, + OTG_CRC0_WINDOWB_Y_END, params->windowb_y_end); - /* Set crc mode and selection, and enable. Only using CRC0*/ - REG_UPDATE_3(OTG_CRC_CNTL, - OTG_CRC_CONT_EN, params->continuous_mode ? 1 : 0, - OTG_CRC0_SELECT, params->selection, - OTG_CRC_EN, 1); + /* Set crc mode and selection, and enable.*/ + REG_UPDATE_3(OTG_CRC_CNTL, + OTG_CRC_CONT_EN, params->continuous_mode ? 1 : 0, + OTG_CRC0_SELECT, params->selection, + OTG_CRC_EN, 1); + break; + case 1: + /* Window A x axis start and end. */ + REG_UPDATE_2(OTG_CRC1_WINDOWA_X_CONTROL, + OTG_CRC1_WINDOWA_X_START, params->windowa_x_start, + OTG_CRC1_WINDOWA_X_END, params->windowa_x_end); + + /* Window A y axis start and end. */ + REG_UPDATE_2(OTG_CRC1_WINDOWA_Y_CONTROL, + OTG_CRC1_WINDOWA_Y_START, params->windowa_y_start, + OTG_CRC1_WINDOWA_Y_END, params->windowa_y_end); + + /* Window B x axis start and end. */ + REG_UPDATE_2(OTG_CRC1_WINDOWB_X_CONTROL, + OTG_CRC1_WINDOWB_X_START, params->windowb_x_start, + OTG_CRC1_WINDOWB_X_END, params->windowb_x_end); + + /* Window B y axis start and end. */ + REG_UPDATE_2(OTG_CRC1_WINDOWB_Y_CONTROL, + OTG_CRC1_WINDOWB_Y_START, params->windowb_y_start, + OTG_CRC1_WINDOWB_Y_END, params->windowb_y_end); + + /* Set crc mode and selection, and enable.*/ + REG_UPDATE_3(OTG_CRC_CNTL, + OTG_CRC_CONT_EN, params->continuous_mode ? 1 : 0, + OTG_CRC1_SELECT, params->selection, + OTG_CRC_EN, 1); + break; + default: + return false; + } return true; } @@ -1510,6 +1543,7 @@ bool optc1_configure_crc(struct timing_generator *optc, * optc1_get_crc - Capture CRC result per component * * @optc: timing_generator instance. + * @idx: index of crc engine to get CRC from * @r_cr: 16-bit primary CRC signature for red data. * @g_y: 16-bit primary CRC signature for green data. * @b_cb: 16-bit primary CRC signature for blue data. 
@@ -1521,7 +1555,7 @@ bool optc1_configure_crc(struct timing_generator *optc, * If CRC is disabled, return false; otherwise, return true, and the CRC * results in the parameters. */ -bool optc1_get_crc(struct timing_generator *optc, +bool optc1_get_crc(struct timing_generator *optc, uint8_t idx, uint32_t *r_cr, uint32_t *g_y, uint32_t *b_cb) { uint32_t field = 0; @@ -1533,14 +1567,30 @@ bool optc1_get_crc(struct timing_generator *optc, if (!field) return false; - /* OTG_CRC0_DATA_RG has the CRC16 results for the red and green component */ - REG_GET_2(OTG_CRC0_DATA_RG, - CRC0_R_CR, r_cr, - CRC0_G_Y, g_y); + switch (idx) { + case 0: + /* OTG_CRC0_DATA_RG has the CRC16 results for the red and green component */ + REG_GET_2(OTG_CRC0_DATA_RG, + CRC0_R_CR, r_cr, + CRC0_G_Y, g_y); - /* OTG_CRC0_DATA_B has the CRC16 results for the blue component */ - REG_GET(OTG_CRC0_DATA_B, - CRC0_B_CB, b_cb); + /* OTG_CRC0_DATA_B has the CRC16 results for the blue component */ + REG_GET(OTG_CRC0_DATA_B, + CRC0_B_CB, b_cb); + break; + case 1: + /* OTG_CRC1_DATA_RG has the CRC16 results for the red and green component */ + REG_GET_2(OTG_CRC1_DATA_RG, + CRC1_R_CR, r_cr, + CRC1_G_Y, g_y); + + /* OTG_CRC1_DATA_B has the CRC16 results for the blue component */ + REG_GET(OTG_CRC1_DATA_B, + CRC1_B_CB, b_cb); + break; + default: + return false; + } return true; } diff --git a/drivers/gpu/drm/amd/display/dc/optc/dcn10/dcn10_optc.h b/drivers/gpu/drm/amd/display/dc/optc/dcn10/dcn10_optc.h index 40757f20d73f..159172178d51 100644 --- a/drivers/gpu/drm/amd/display/dc/optc/dcn10/dcn10_optc.h +++ b/drivers/gpu/drm/amd/display/dc/optc/dcn10/dcn10_optc.h @@ -86,6 +86,12 @@ SRI(OTG_CRC0_WINDOWA_Y_CONTROL, OTG, inst),\ SRI(OTG_CRC0_WINDOWB_X_CONTROL, OTG, inst),\ SRI(OTG_CRC0_WINDOWB_Y_CONTROL, OTG, inst),\ + SRI(OTG_CRC1_DATA_RG, OTG, inst),\ + SRI(OTG_CRC1_DATA_B, OTG, inst),\ + SRI(OTG_CRC1_WINDOWA_X_CONTROL, OTG, inst),\ + SRI(OTG_CRC1_WINDOWA_Y_CONTROL, OTG, inst),\ + SRI(OTG_CRC1_WINDOWB_X_CONTROL, 
OTG, inst),\ + SRI(OTG_CRC1_WINDOWB_Y_CONTROL, OTG, inst),\ SR(GSL_SOURCE_SELECT),\ SRI(OTG_GLOBAL_CONTROL2, OTG, inst),\ SRI(OTG_TRIGA_MANUAL_TRIG, OTG, inst) @@ -315,6 +321,7 @@ struct dcn_optc_registers { SF(OTG0_OTG_GSL_CONTROL, OTG_GSL_CHECK_ALL_FIELDS, mask_sh),\ SF(OTG0_OTG_CRC_CNTL, OTG_CRC_CONT_EN, mask_sh),\ SF(OTG0_OTG_CRC_CNTL, OTG_CRC0_SELECT, mask_sh),\ + SF(OTG0_OTG_CRC_CNTL, OTG_CRC1_SELECT, mask_sh),\ SF(OTG0_OTG_CRC_CNTL, OTG_CRC_EN, mask_sh),\ SF(OTG0_OTG_CRC0_DATA_RG, CRC0_R_CR, mask_sh),\ SF(OTG0_OTG_CRC0_DATA_RG, CRC0_G_Y, mask_sh),\ @@ -327,6 +334,17 @@ struct dcn_optc_registers { SF(OTG0_OTG_CRC0_WINDOWB_X_CONTROL, OTG_CRC0_WINDOWB_X_END, mask_sh),\ SF(OTG0_OTG_CRC0_WINDOWB_Y_CONTROL, OTG_CRC0_WINDOWB_Y_START, mask_sh),\ SF(OTG0_OTG_CRC0_WINDOWB_Y_CONTROL, OTG_CRC0_WINDOWB_Y_END, mask_sh),\ + SF(OTG0_OTG_CRC1_DATA_RG, CRC1_R_CR, mask_sh),\ + SF(OTG0_OTG_CRC1_DATA_RG, CRC1_G_Y, mask_sh),\ + SF(OTG0_OTG_CRC1_DATA_B, CRC1_B_CB, mask_sh),\ + SF(OTG0_OTG_CRC1_WINDOWA_X_CONTROL, OTG_CRC1_WINDOWA_X_START, mask_sh),\ + SF(OTG0_OTG_CRC1_WINDOWA_X_CONTROL, OTG_CRC1_WINDOWA_X_END, mask_sh),\ + SF(OTG0_OTG_CRC1_WINDOWA_Y_CONTROL, OTG_CRC1_WINDOWA_Y_START, mask_sh),\ + SF(OTG0_OTG_CRC1_WINDOWA_Y_CONTROL, OTG_CRC1_WINDOWA_Y_END, mask_sh),\ + SF(OTG0_OTG_CRC1_WINDOWB_X_CONTROL, OTG_CRC1_WINDOWB_X_START, mask_sh),\ + SF(OTG0_OTG_CRC1_WINDOWB_X_CONTROL, OTG_CRC1_WINDOWB_X_END, mask_sh),\ + SF(OTG0_OTG_CRC1_WINDOWB_Y_CONTROL, OTG_CRC1_WINDOWB_Y_START, mask_sh),\ + SF(OTG0_OTG_CRC1_WINDOWB_Y_CONTROL, OTG_CRC1_WINDOWB_Y_END, mask_sh),\ SF(GSL_SOURCE_SELECT, GSL0_READY_SOURCE_SEL, mask_sh),\ SF(GSL_SOURCE_SELECT, GSL1_READY_SOURCE_SEL, mask_sh),\ SF(GSL_SOURCE_SELECT, GSL2_READY_SOURCE_SEL, mask_sh),\ @@ -482,6 +500,7 @@ struct dcn_optc_registers { type OTG_MASTER_UPDATE_LOCK_VUPDATE_KEEPOUT_EN;\ type OTG_CRC_CONT_EN;\ type OTG_CRC0_SELECT;\ + type OTG_CRC1_SELECT;\ type OTG_CRC_EN;\ type CRC0_R_CR;\ type CRC0_G_Y;\ diff --git 
a/drivers/gpu/drm/amd/display/dc/optc/dcn35/dcn35_optc.c b/drivers/gpu/drm/amd/display/dc/optc/dcn35/dcn35_optc.c index dfa9364fe5a6..d21e82b927d0 100644 --- a/drivers/gpu/drm/amd/display/dc/optc/dcn35/dcn35_optc.c +++ b/drivers/gpu/drm/amd/display/dc/optc/dcn35/dcn35_optc.c @@ -183,34 +183,87 @@ static bool optc35_configure_crc(struct timing_generator *optc, { struct optc *optc1 = DCN10TG_FROM_TG(optc); + /* Cannot configure crc on a CRTC that is disabled */ if (!optc1_is_tg_enabled(optc)) return false; - REG_WRITE(OTG_CRC_CNTL, 0); + + if (!params->enable || params->reset) + REG_WRITE(OTG_CRC_CNTL, 0); + if (!params->enable) return true; - REG_UPDATE_2(OTG_CRC0_WINDOWA_X_CONTROL, - OTG_CRC0_WINDOWA_X_START, params->windowa_x_start, - OTG_CRC0_WINDOWA_X_END, params->windowa_x_end); - REG_UPDATE_2(OTG_CRC0_WINDOWA_Y_CONTROL, - OTG_CRC0_WINDOWA_Y_START, params->windowa_y_start, - OTG_CRC0_WINDOWA_Y_END, params->windowa_y_end); - REG_UPDATE_2(OTG_CRC0_WINDOWB_X_CONTROL, - OTG_CRC0_WINDOWB_X_START, params->windowb_x_start, - OTG_CRC0_WINDOWB_X_END, params->windowb_x_end); - REG_UPDATE_2(OTG_CRC0_WINDOWB_Y_CONTROL, - OTG_CRC0_WINDOWB_Y_START, params->windowb_y_start, - OTG_CRC0_WINDOWB_Y_END, params->windowb_y_end); - if (optc1->base.ctx->dc->debug.otg_crc_db && optc1->tg_mask->OTG_CRC_WINDOW_DB_EN != 0) { - REG_UPDATE_4(OTG_CRC_CNTL, - OTG_CRC_CONT_EN, params->continuous_mode ? 1 : 0, - OTG_CRC0_SELECT, params->selection, - OTG_CRC_EN, 1, - OTG_CRC_WINDOW_DB_EN, 1); - } else - REG_UPDATE_3(OTG_CRC_CNTL, - OTG_CRC_CONT_EN, params->continuous_mode ? 1 : 0, - OTG_CRC0_SELECT, params->selection, - OTG_CRC_EN, 1); + + /* Program frame boundaries */ + switch (params->crc_eng_inst) { + case 0: + /* Window A x axis start and end. */ + REG_UPDATE_2(OTG_CRC0_WINDOWA_X_CONTROL, + OTG_CRC0_WINDOWA_X_START, params->windowa_x_start, + OTG_CRC0_WINDOWA_X_END, params->windowa_x_end); + + /* Window A y axis start and end. 
*/ + REG_UPDATE_2(OTG_CRC0_WINDOWA_Y_CONTROL, + OTG_CRC0_WINDOWA_Y_START, params->windowa_y_start, + OTG_CRC0_WINDOWA_Y_END, params->windowa_y_end); + + /* Window B x axis start and end. */ + REG_UPDATE_2(OTG_CRC0_WINDOWB_X_CONTROL, + OTG_CRC0_WINDOWB_X_START, params->windowb_x_start, + OTG_CRC0_WINDOWB_X_END, params->windowb_x_end); + + /* Window B y axis start and end. */ + REG_UPDATE_2(OTG_CRC0_WINDOWB_Y_CONTROL, + OTG_CRC0_WINDOWB_Y_START, params->windowb_y_start, + OTG_CRC0_WINDOWB_Y_END, params->windowb_y_end); + + if (optc1->base.ctx->dc->debug.otg_crc_db && optc1->tg_mask->OTG_CRC_WINDOW_DB_EN != 0) + REG_UPDATE_4(OTG_CRC_CNTL, + OTG_CRC_CONT_EN, params->continuous_mode ? 1 : 0, + OTG_CRC0_SELECT, params->selection, + OTG_CRC_EN, 1, + OTG_CRC_WINDOW_DB_EN, 1); + else + REG_UPDATE_3(OTG_CRC_CNTL, + OTG_CRC_CONT_EN, params->continuous_mode ? 1 : 0, + OTG_CRC0_SELECT, params->selection, + OTG_CRC_EN, 1); + break; + case 1: + /* Window A x axis start and end. */ + REG_UPDATE_2(OTG_CRC1_WINDOWA_X_CONTROL, + OTG_CRC1_WINDOWA_X_START, params->windowa_x_start, + OTG_CRC1_WINDOWA_X_END, params->windowa_x_end); + + /* Window A y axis start and end. */ + REG_UPDATE_2(OTG_CRC1_WINDOWA_Y_CONTROL, + OTG_CRC1_WINDOWA_Y_START, params->windowa_y_start, + OTG_CRC1_WINDOWA_Y_END, params->windowa_y_end); + + /* Window B x axis start and end. */ + REG_UPDATE_2(OTG_CRC1_WINDOWB_X_CONTROL, + OTG_CRC1_WINDOWB_X_START, params->windowb_x_start, + OTG_CRC1_WINDOWB_X_END, params->windowb_x_end); + + /* Window B y axis start and end. */ + REG_UPDATE_2(OTG_CRC1_WINDOWB_Y_CONTROL, + OTG_CRC1_WINDOWB_Y_START, params->windowb_y_start, + OTG_CRC1_WINDOWB_Y_END, params->windowb_y_end); + + if (optc1->base.ctx->dc->debug.otg_crc_db && optc1->tg_mask->OTG_CRC_WINDOW_DB_EN != 0) + REG_UPDATE_4(OTG_CRC_CNTL, + OTG_CRC_CONT_EN, params->continuous_mode ? 
1 : 0, + OTG_CRC1_SELECT, params->selection, + OTG_CRC_EN, 1, + OTG_CRC_WINDOW_DB_EN, 1); + else + REG_UPDATE_3(OTG_CRC_CNTL, + OTG_CRC_CONT_EN, params->continuous_mode ? 1 : 0, + OTG_CRC1_SELECT, params->selection, + OTG_CRC_EN, 1); + break; + default: + return false; + } return true; } diff --git a/drivers/gpu/drm/amd/display/dc/optc/dcn401/dcn401_optc.c b/drivers/gpu/drm/amd/display/dc/optc/dcn401/dcn401_optc.c index 783ca9acc762..338a0cad23a5 100644 --- a/drivers/gpu/drm/amd/display/dc/optc/dcn401/dcn401_optc.c +++ b/drivers/gpu/drm/amd/display/dc/optc/dcn401/dcn401_optc.c @@ -315,7 +315,7 @@ void optc401_set_drr( struct drr_params amended_params = { 0 }; bool program_manual_trigger = false; - if (dc->caps.dmub_caps.fams_ver >= 2 && dc->debug.fams2_config.bits.enable) { + if (dc->caps.dmub_caps.fams_ver == dc->debug.fams_version.ver && dc->debug.fams2_config.bits.enable) { if (params != NULL && params->vertical_total_max > 0 && params->vertical_total_min > 0) { @@ -380,7 +380,7 @@ void optc401_set_vtotal_min_max(struct timing_generator *optc, int vtotal_min, i { struct dc *dc = optc->ctx->dc; - if (dc->caps.dmub_caps.fams_ver >= 2 && dc->debug.fams2_config.bits.enable) { + if (dc->caps.dmub_caps.fams_ver == dc->debug.fams_version.ver && dc->debug.fams2_config.bits.enable) { /* FAMS2 */ dc_dmub_srv_fams2_drr_update(dc, optc->inst, vtotal_min, diff --git a/drivers/gpu/drm/amd/display/dc/resource/dcn10/dcn10_resource.c b/drivers/gpu/drm/amd/display/dc/resource/dcn10/dcn10_resource.c index 770a380cc03d..e92f14d50adb 100644 --- a/drivers/gpu/drm/amd/display/dc/resource/dcn10/dcn10_resource.c +++ b/drivers/gpu/drm/amd/display/dc/resource/dcn10/dcn10_resource.c @@ -1258,6 +1258,11 @@ struct stream_encoder *dcn10_find_first_free_match_stream_enc_for_link( return NULL; } +unsigned int dcn10_get_vstartup_for_pipe(struct pipe_ctx *pipe_ctx) +{ + return pipe_ctx->pipe_dlg_param.vstartup_start; +} + static const struct dc_cap_funcs cap_funcs = { .get_dcc_compression_cap = 
dcn10_get_dcc_compression_cap }; @@ -1272,7 +1277,8 @@ static const struct resource_funcs dcn10_res_pool_funcs = { .validate_global = dcn10_validate_global, .add_stream_to_ctx = dcn10_add_stream_to_ctx, .patch_unknown_plane_state = dcn10_patch_unknown_plane_state, - .find_first_free_match_stream_enc_for_link = dcn10_find_first_free_match_stream_enc_for_link + .find_first_free_match_stream_enc_for_link = dcn10_find_first_free_match_stream_enc_for_link, + .get_vstartup_for_pipe = dcn10_get_vstartup_for_pipe }; static uint32_t read_pipe_fuses(struct dc_context *ctx) diff --git a/drivers/gpu/drm/amd/display/dc/resource/dcn10/dcn10_resource.h b/drivers/gpu/drm/amd/display/dc/resource/dcn10/dcn10_resource.h index bf8e33cd8147..7bc1be53e800 100644 --- a/drivers/gpu/drm/amd/display/dc/resource/dcn10/dcn10_resource.h +++ b/drivers/gpu/drm/amd/display/dc/resource/dcn10/dcn10_resource.h @@ -51,6 +51,7 @@ struct stream_encoder *dcn10_find_first_free_match_stream_enc_for_link( const struct resource_pool *pool, struct dc_stream_state *stream); +unsigned int dcn10_get_vstartup_for_pipe(struct pipe_ctx *pipe_ctx); #endif /* __DC_RESOURCE_DCN10_H__ */ diff --git a/drivers/gpu/drm/amd/display/dc/resource/dcn20/dcn20_resource.c b/drivers/gpu/drm/amd/display/dc/resource/dcn20/dcn20_resource.c index 189d0c85872e..5c6dc710e96c 100644 --- a/drivers/gpu/drm/amd/display/dc/resource/dcn20/dcn20_resource.c +++ b/drivers/gpu/drm/amd/display/dc/resource/dcn20/dcn20_resource.c @@ -1509,41 +1509,12 @@ bool dcn20_split_stream_for_odm( next_odm_pipe->prev_odm_pipe = prev_odm_pipe; if (prev_odm_pipe->plane_state) { - struct scaler_data *sd = &prev_odm_pipe->plane_res.scl_data; - int new_width; - - /* HACTIVE halved for odm combine */ - sd->h_active /= 2; - /* Calculate new vp and recout for left pipe */ - /* Need at least 16 pixels width per side */ - if (sd->recout.x + 16 >= sd->h_active) - return false; - new_width = sd->h_active - sd->recout.x; - sd->viewport.width -= 
dc_fixpt_floor(dc_fixpt_mul_int( - sd->ratios.horz, sd->recout.width - new_width)); - sd->viewport_c.width -= dc_fixpt_floor(dc_fixpt_mul_int( - sd->ratios.horz_c, sd->recout.width - new_width)); - sd->recout.width = new_width; - - /* Calculate new vp and recout for right pipe */ - sd = &next_odm_pipe->plane_res.scl_data; - /* HACTIVE halved for odm combine */ - sd->h_active /= 2; - /* Need at least 16 pixels width per side */ - if (new_width <= 16) - return false; - new_width = sd->recout.width + sd->recout.x - sd->h_active; - sd->viewport.width -= dc_fixpt_floor(dc_fixpt_mul_int( - sd->ratios.horz, sd->recout.width - new_width)); - sd->viewport_c.width -= dc_fixpt_floor(dc_fixpt_mul_int( - sd->ratios.horz_c, sd->recout.width - new_width)); - sd->recout.width = new_width; - sd->viewport.x += dc_fixpt_floor(dc_fixpt_mul_int( - sd->ratios.horz, sd->h_active - sd->recout.x)); - sd->viewport_c.x += dc_fixpt_floor(dc_fixpt_mul_int( - sd->ratios.horz_c, sd->h_active - sd->recout.x)); - sd->recout.x = 0; + if (!resource_build_scaling_params(prev_odm_pipe) || + !resource_build_scaling_params(next_odm_pipe)) { + return false; + } } + if (!next_odm_pipe->top_pipe) next_odm_pipe->stream_res.opp = pool->opps[next_odm_pipe->pipe_idx]; else @@ -2132,6 +2103,7 @@ bool dcn20_fast_validate_bw( ASSERT(0); } } + /* Actual dsc count per stream dsc validation*/ if (!dcn20_validate_dsc(dc, context)) { context->bw_ctx.dml.vba.ValidationStatus[context->bw_ctx.dml.vba.soc.num_states] = @@ -2257,7 +2229,8 @@ static const struct resource_funcs dcn20_res_pool_funcs = { .patch_unknown_plane_state = dcn20_patch_unknown_plane_state, .set_mcif_arb_params = dcn20_set_mcif_arb_params, .populate_dml_pipes = dcn20_populate_dml_pipes_from_context, - .find_first_free_match_stream_enc_for_link = dcn10_find_first_free_match_stream_enc_for_link + .find_first_free_match_stream_enc_for_link = dcn10_find_first_free_match_stream_enc_for_link, + .get_vstartup_for_pipe = dcn10_get_vstartup_for_pipe }; bool 
dcn20_dwbc_create(struct dc_context *ctx, struct resource_pool *pool) diff --git a/drivers/gpu/drm/amd/display/dc/resource/dcn201/dcn201_resource.c b/drivers/gpu/drm/amd/display/dc/resource/dcn201/dcn201_resource.c index d3d67d366523..43fa2cb117f3 100644 --- a/drivers/gpu/drm/amd/display/dc/resource/dcn201/dcn201_resource.c +++ b/drivers/gpu/drm/amd/display/dc/resource/dcn201/dcn201_resource.c @@ -59,8 +59,8 @@ #include "cyan_skillfish_ip_offset.h" -#include "dcn/dcn_2_0_3_offset.h" -#include "dcn/dcn_2_0_3_sh_mask.h" +#include "dcn/dcn_2_0_1_offset.h" +#include "dcn/dcn_2_0_1_sh_mask.h" #include "dpcs/dpcs_2_0_3_offset.h" #include "dpcs/dpcs_2_0_3_sh_mask.h" @@ -1079,7 +1079,8 @@ static struct resource_funcs dcn201_res_pool_funcs = { .populate_dml_writeback_from_context = dcn201_populate_dml_writeback_from_context, .patch_unknown_plane_state = dcn20_patch_unknown_plane_state, .set_mcif_arb_params = dcn20_set_mcif_arb_params, - .find_first_free_match_stream_enc_for_link = dcn10_find_first_free_match_stream_enc_for_link + .find_first_free_match_stream_enc_for_link = dcn10_find_first_free_match_stream_enc_for_link, + .get_vstartup_for_pipe = dcn10_get_vstartup_for_pipe }; static bool dcn201_resource_construct( diff --git a/drivers/gpu/drm/amd/display/dc/resource/dcn21/dcn21_resource.c b/drivers/gpu/drm/amd/display/dc/resource/dcn21/dcn21_resource.c index 021ba8ac5c8c..2615c36d5ffe 100644 --- a/drivers/gpu/drm/amd/display/dc/resource/dcn21/dcn21_resource.c +++ b/drivers/gpu/drm/amd/display/dc/resource/dcn21/dcn21_resource.c @@ -1378,6 +1378,7 @@ static const struct resource_funcs dcn21_res_pool_funcs = { .find_first_free_match_stream_enc_for_link = dcn10_find_first_free_match_stream_enc_for_link, .update_bw_bounding_box = dcn21_update_bw_bounding_box, .get_panel_config_defaults = dcn21_get_panel_config_defaults, + .get_vstartup_for_pipe = dcn10_get_vstartup_for_pipe }; static bool dcn21_resource_construct( diff --git 
a/drivers/gpu/drm/amd/display/dc/resource/dcn30/dcn30_resource.c b/drivers/gpu/drm/amd/display/dc/resource/dcn30/dcn30_resource.c index cd31e4f16c14..13202ce30d66 100644 --- a/drivers/gpu/drm/amd/display/dc/resource/dcn30/dcn30_resource.c +++ b/drivers/gpu/drm/amd/display/dc/resource/dcn30/dcn30_resource.c @@ -2250,6 +2250,7 @@ static const struct resource_funcs dcn30_res_pool_funcs = { .update_bw_bounding_box = dcn30_update_bw_bounding_box, .patch_unknown_plane_state = dcn20_patch_unknown_plane_state, .get_panel_config_defaults = dcn30_get_panel_config_defaults, + .get_vstartup_for_pipe = dcn10_get_vstartup_for_pipe }; #define CTX ctx @@ -2353,6 +2354,7 @@ static bool dcn30_resource_construct( dc->caps.dp_hdmi21_pcon_support = true; dc->caps.max_v_total = (1 << 15) - 1; + dc->caps.vtotal_limited_by_fp2 = true; /* read VBIOS LTTPR caps */ { diff --git a/drivers/gpu/drm/amd/display/dc/resource/dcn301/dcn301_resource.c b/drivers/gpu/drm/amd/display/dc/resource/dcn301/dcn301_resource.c index a9816affd312..121a86a59833 100644 --- a/drivers/gpu/drm/amd/display/dc/resource/dcn301/dcn301_resource.c +++ b/drivers/gpu/drm/amd/display/dc/resource/dcn301/dcn301_resource.c @@ -671,9 +671,9 @@ static const struct dc_plane_cap plane_cap = { /* 6:1 downscaling ratio: 1000/6 = 166.666 */ .max_downscale_factor = { - .argb8888 = 167, - .nv12 = 167, - .fp16 = 167 + .argb8888 = 358, + .nv12 = 358, + .fp16 = 358 }, 64, 64 @@ -693,7 +693,7 @@ static const struct dc_debug_options debug_defaults_drv = { .disable_dcc = DCC_ENABLE, .vsr_support = true, .performance_trace = false, - .max_downscale_src_width = 7680,/*upto 8K*/ + .max_downscale_src_width = 4096,/*upto true 4k*/ .scl_reset_length10 = true, .sanity_checks = false, .underflow_assert_delay_us = 0xFFFFFFFF, @@ -1400,7 +1400,8 @@ static struct resource_funcs dcn301_res_pool_funcs = { .acquire_post_bldn_3dlut = dcn30_acquire_post_bldn_3dlut, .release_post_bldn_3dlut = dcn30_release_post_bldn_3dlut, .update_bw_bounding_box = 
dcn301_update_bw_bounding_box, - .patch_unknown_plane_state = dcn20_patch_unknown_plane_state + .patch_unknown_plane_state = dcn20_patch_unknown_plane_state, + .get_vstartup_for_pipe = dcn10_get_vstartup_for_pipe }; static bool dcn301_resource_construct( diff --git a/drivers/gpu/drm/amd/display/dc/resource/dcn302/dcn302_resource.c b/drivers/gpu/drm/amd/display/dc/resource/dcn302/dcn302_resource.c index 02af8b8f4d27..012c5fd52cb1 100644 --- a/drivers/gpu/drm/amd/display/dc/resource/dcn302/dcn302_resource.c +++ b/drivers/gpu/drm/amd/display/dc/resource/dcn302/dcn302_resource.c @@ -1151,6 +1151,7 @@ static struct resource_funcs dcn302_res_pool_funcs = { .update_bw_bounding_box = dcn302_update_bw_bounding_box, .patch_unknown_plane_state = dcn20_patch_unknown_plane_state, .get_panel_config_defaults = dcn302_get_panel_config_defaults, + .get_vstartup_for_pipe = dcn10_get_vstartup_for_pipe }; static struct dc_cap_funcs cap_funcs = { @@ -1233,6 +1234,7 @@ static bool dcn302_resource_construct( dc->caps.extended_aux_timeout_support = true; dc->caps.dmcub_support = true; dc->caps.max_v_total = (1 << 15) - 1; + dc->caps.vtotal_limited_by_fp2 = true; /* Color pipeline capabilities */ dc->caps.color.dpp.dcn_arch = 1; diff --git a/drivers/gpu/drm/amd/display/dc/resource/dcn303/dcn303_resource.c b/drivers/gpu/drm/amd/display/dc/resource/dcn303/dcn303_resource.c index 7002a8dd358a..a8d0b4686f9a 100644 --- a/drivers/gpu/drm/amd/display/dc/resource/dcn303/dcn303_resource.c +++ b/drivers/gpu/drm/amd/display/dc/resource/dcn303/dcn303_resource.c @@ -1096,6 +1096,7 @@ static struct resource_funcs dcn303_res_pool_funcs = { .update_bw_bounding_box = dcn303_update_bw_bounding_box, .patch_unknown_plane_state = dcn20_patch_unknown_plane_state, .get_panel_config_defaults = dcn303_get_panel_config_defaults, + .get_vstartup_for_pipe = dcn10_get_vstartup_for_pipe }; static struct dc_cap_funcs cap_funcs = { @@ -1178,6 +1179,7 @@ static bool dcn303_resource_construct( 
dc->caps.extended_aux_timeout_support = true; dc->caps.dmcub_support = true; dc->caps.max_v_total = (1 << 15) - 1; + dc->caps.vtotal_limited_by_fp2 = true; /* Color pipeline capabilities */ dc->caps.color.dpp.dcn_arch = 1; diff --git a/drivers/gpu/drm/amd/display/dc/resource/dcn31/dcn31_resource.c b/drivers/gpu/drm/amd/display/dc/resource/dcn31/dcn31_resource.c index c16cf1c8f7f9..911bd60d4fbc 100644 --- a/drivers/gpu/drm/amd/display/dc/resource/dcn31/dcn31_resource.c +++ b/drivers/gpu/drm/amd/display/dc/resource/dcn31/dcn31_resource.c @@ -1720,6 +1720,12 @@ int dcn31_populate_dml_pipes_from_context( return pipe_cnt; } +unsigned int dcn31_get_det_buffer_size( + const struct dc_state *context) +{ + return context->bw_ctx.dml.ip.det_buffer_size_kbytes; +} + void dcn31_calculate_wm_and_dlg( struct dc *dc, struct dc_state *context, display_e2e_pipe_params_st *pipes, @@ -1842,6 +1848,8 @@ static struct resource_funcs dcn31_res_pool_funcs = { .update_bw_bounding_box = dcn31_update_bw_bounding_box, .patch_unknown_plane_state = dcn20_patch_unknown_plane_state, .get_panel_config_defaults = dcn31_get_panel_config_defaults, + .get_det_buffer_size = dcn31_get_det_buffer_size, + .get_vstartup_for_pipe = dcn10_get_vstartup_for_pipe }; static struct clock_source *dcn30_clock_source_create( diff --git a/drivers/gpu/drm/amd/display/dc/resource/dcn31/dcn31_resource.h b/drivers/gpu/drm/amd/display/dc/resource/dcn31/dcn31_resource.h index 901436591ed4..551ad912f7be 100644 --- a/drivers/gpu/drm/amd/display/dc/resource/dcn31/dcn31_resource.h +++ b/drivers/gpu/drm/amd/display/dc/resource/dcn31/dcn31_resource.h @@ -63,6 +63,9 @@ struct resource_pool *dcn31_create_resource_pool( const struct dc_init_data *init_data, struct dc *dc); +unsigned int dcn31_get_det_buffer_size( + const struct dc_state *context); + /*temp: B0 specific before switch to dcn313 headers*/ #ifndef regPHYPLLF_PIXCLK_RESYNC_CNTL #define regPHYPLLF_PIXCLK_RESYNC_CNTL 0x007e diff --git 
a/drivers/gpu/drm/amd/display/dc/resource/dcn314/dcn314_resource.c b/drivers/gpu/drm/amd/display/dc/resource/dcn314/dcn314_resource.c index c0f48c78e968..e3ba105034f8 100644 --- a/drivers/gpu/drm/amd/display/dc/resource/dcn314/dcn314_resource.c +++ b/drivers/gpu/drm/amd/display/dc/resource/dcn314/dcn314_resource.c @@ -1777,6 +1777,8 @@ static struct resource_funcs dcn314_res_pool_funcs = { .patch_unknown_plane_state = dcn20_patch_unknown_plane_state, .get_panel_config_defaults = dcn314_get_panel_config_defaults, .get_preferred_eng_id_dpia = dcn314_get_preferred_eng_id_dpia, + .get_det_buffer_size = dcn31_get_det_buffer_size, + .get_vstartup_for_pipe = dcn10_get_vstartup_for_pipe }; static struct clock_source *dcn30_clock_source_create( diff --git a/drivers/gpu/drm/amd/display/dc/resource/dcn315/dcn315_resource.c b/drivers/gpu/drm/amd/display/dc/resource/dcn315/dcn315_resource.c index 6c3295259a81..14acef036b5a 100644 --- a/drivers/gpu/drm/amd/display/dc/resource/dcn315/dcn315_resource.c +++ b/drivers/gpu/drm/amd/display/dc/resource/dcn315/dcn315_resource.c @@ -1845,6 +1845,8 @@ static struct resource_funcs dcn315_res_pool_funcs = { .patch_unknown_plane_state = dcn20_patch_unknown_plane_state, .get_panel_config_defaults = dcn315_get_panel_config_defaults, .get_power_profile = dcn315_get_power_profile, + .get_det_buffer_size = dcn31_get_det_buffer_size, + .get_vstartup_for_pipe = dcn10_get_vstartup_for_pipe }; static bool dcn315_resource_construct( diff --git a/drivers/gpu/drm/amd/display/dc/resource/dcn316/dcn316_resource.c b/drivers/gpu/drm/amd/display/dc/resource/dcn316/dcn316_resource.c index 6edaaadcb173..568094827212 100644 --- a/drivers/gpu/drm/amd/display/dc/resource/dcn316/dcn316_resource.c +++ b/drivers/gpu/drm/amd/display/dc/resource/dcn316/dcn316_resource.c @@ -1719,6 +1719,8 @@ static struct resource_funcs dcn316_res_pool_funcs = { .update_bw_bounding_box = dcn316_update_bw_bounding_box, .patch_unknown_plane_state = dcn20_patch_unknown_plane_state, 
.get_panel_config_defaults = dcn316_get_panel_config_defaults, + .get_det_buffer_size = dcn31_get_det_buffer_size, + .get_vstartup_for_pipe = dcn10_get_vstartup_for_pipe }; static bool dcn316_resource_construct( diff --git a/drivers/gpu/drm/amd/display/dc/resource/dcn32/dcn32_resource.c b/drivers/gpu/drm/amd/display/dc/resource/dcn32/dcn32_resource.c index 01d1a11d5545..664302876019 100644 --- a/drivers/gpu/drm/amd/display/dc/resource/dcn32/dcn32_resource.c +++ b/drivers/gpu/drm/amd/display/dc/resource/dcn32/dcn32_resource.c @@ -2066,6 +2066,7 @@ static struct resource_funcs dcn32_res_pool_funcs = { .add_phantom_pipes = dcn32_add_phantom_pipes, .build_pipe_pix_clk_params = dcn20_build_pipe_pix_clk_params, .calculate_mall_ways_from_bytes = dcn32_calculate_mall_ways_from_bytes, + .get_vstartup_for_pipe = dcn10_get_vstartup_for_pipe }; static uint32_t read_pipe_fuses(struct dc_context *ctx) @@ -2189,6 +2190,7 @@ static bool dcn32_resource_construct( dc->caps.dmcub_support = true; dc->caps.seamless_odm = true; dc->caps.max_v_total = (1 << 15) - 1; + dc->caps.vtotal_limited_by_fp2 = true; /* Color pipeline capabilities */ dc->caps.color.dpp.dcn_arch = 1; @@ -2803,6 +2805,7 @@ struct pipe_ctx *dcn32_acquire_free_pipe_as_secondary_opp_head( free_pipe->plane_res.xfm = pool->transforms[free_pipe_idx]; free_pipe->plane_res.dpp = pool->dpps[free_pipe_idx]; free_pipe->plane_res.mpcc_inst = pool->dpps[free_pipe_idx]->inst; + free_pipe->hblank_borrow = otg_master->hblank_borrow; if (free_pipe->stream->timing.flags.DSC == 1) { dcn20_acquire_dsc(free_pipe->stream->ctx->dc, &new_ctx->res_ctx, diff --git a/drivers/gpu/drm/amd/display/dc/resource/dcn321/dcn321_resource.c b/drivers/gpu/drm/amd/display/dc/resource/dcn321/dcn321_resource.c index 5cb74fd9cb7d..38d76434683e 100644 --- a/drivers/gpu/drm/amd/display/dc/resource/dcn321/dcn321_resource.c +++ b/drivers/gpu/drm/amd/display/dc/resource/dcn321/dcn321_resource.c @@ -1624,6 +1624,7 @@ static struct resource_funcs 
dcn321_res_pool_funcs = { .add_phantom_pipes = dcn32_add_phantom_pipes, .build_pipe_pix_clk_params = dcn20_build_pipe_pix_clk_params, .calculate_mall_ways_from_bytes = dcn32_calculate_mall_ways_from_bytes, + .get_vstartup_for_pipe = dcn10_get_vstartup_for_pipe }; static uint32_t read_pipe_fuses(struct dc_context *ctx) @@ -1742,6 +1743,7 @@ static bool dcn321_resource_construct( dc->caps.extended_aux_timeout_support = true; dc->caps.dmcub_support = true; dc->caps.max_v_total = (1 << 15) - 1; + dc->caps.vtotal_limited_by_fp2 = true; /* Color pipeline capabilities */ dc->caps.color.dpp.dcn_arch = 1; diff --git a/drivers/gpu/drm/amd/display/dc/resource/dcn35/dcn35_resource.c b/drivers/gpu/drm/amd/display/dc/resource/dcn35/dcn35_resource.c index 6cc2960b6104..8ee3d99ea2aa 100644 --- a/drivers/gpu/drm/amd/display/dc/resource/dcn35/dcn35_resource.c +++ b/drivers/gpu/drm/amd/display/dc/resource/dcn35/dcn35_resource.c @@ -1752,6 +1752,13 @@ static bool dcn35_validate_bandwidth(struct dc *dc, return out; } +enum dc_status dcn35_patch_unknown_plane_state(struct dc_plane_state *plane_state) +{ + plane_state->tiling_info.gfxversion = DcGfxVersion9; + dcn20_patch_unknown_plane_state(plane_state); + return DC_OK; +} + static struct resource_funcs dcn35_res_pool_funcs = { .destroy = dcn35_destroy_resource_pool, @@ -1775,9 +1782,11 @@ static struct resource_funcs dcn35_res_pool_funcs = { .acquire_post_bldn_3dlut = dcn30_acquire_post_bldn_3dlut, .release_post_bldn_3dlut = dcn30_release_post_bldn_3dlut, .update_bw_bounding_box = dcn35_update_bw_bounding_box_fpu, - .patch_unknown_plane_state = dcn20_patch_unknown_plane_state, + .patch_unknown_plane_state = dcn35_patch_unknown_plane_state, .get_panel_config_defaults = dcn35_get_panel_config_defaults, .get_preferred_eng_id_dpia = dcn35_get_preferred_eng_id_dpia, + .get_det_buffer_size = dcn31_get_det_buffer_size, + .get_vstartup_for_pipe = dcn10_get_vstartup_for_pipe }; static bool dcn35_resource_construct( @@ -1849,6 +1858,7 @@ static 
bool dcn35_resource_construct( dc->caps.zstate_support = true; dc->caps.ips_support = true; dc->caps.max_v_total = (1 << 15) - 1; + dc->caps.vtotal_limited_by_fp2 = true; /* Color pipeline capabilities */ dc->caps.color.dpp.dcn_arch = 1; diff --git a/drivers/gpu/drm/amd/display/dc/resource/dcn35/dcn35_resource.h b/drivers/gpu/drm/amd/display/dc/resource/dcn35/dcn35_resource.h index f97bb4cb3761..9d03a55d90cf 100644 --- a/drivers/gpu/drm/amd/display/dc/resource/dcn35/dcn35_resource.h +++ b/drivers/gpu/drm/amd/display/dc/resource/dcn35/dcn35_resource.h @@ -35,6 +35,7 @@ extern struct _vcs_dpi_ip_params_st dcn3_5_ip; extern struct _vcs_dpi_soc_bounding_box_st dcn3_5_soc; +enum dc_status dcn35_patch_unknown_plane_state(struct dc_plane_state *plane_state); struct dcn35_resource_pool { struct resource_pool base; diff --git a/drivers/gpu/drm/amd/display/dc/resource/dcn351/dcn351_resource.c b/drivers/gpu/drm/amd/display/dc/resource/dcn351/dcn351_resource.c index d87e2641cda1..14f7c3acdc96 100644 --- a/drivers/gpu/drm/amd/display/dc/resource/dcn351/dcn351_resource.c +++ b/drivers/gpu/drm/amd/display/dc/resource/dcn351/dcn351_resource.c @@ -1754,9 +1754,11 @@ static struct resource_funcs dcn351_res_pool_funcs = { .acquire_post_bldn_3dlut = dcn30_acquire_post_bldn_3dlut, .release_post_bldn_3dlut = dcn30_release_post_bldn_3dlut, .update_bw_bounding_box = dcn351_update_bw_bounding_box_fpu, - .patch_unknown_plane_state = dcn20_patch_unknown_plane_state, + .patch_unknown_plane_state = dcn35_patch_unknown_plane_state, .get_panel_config_defaults = dcn35_get_panel_config_defaults, .get_preferred_eng_id_dpia = dcn351_get_preferred_eng_id_dpia, + .get_det_buffer_size = dcn31_get_det_buffer_size, + .get_vstartup_for_pipe = dcn10_get_vstartup_for_pipe }; static bool dcn351_resource_construct( @@ -1828,6 +1830,7 @@ static bool dcn351_resource_construct( dc->caps.zstate_support = true; dc->caps.ips_support = true; dc->caps.max_v_total = (1 << 15) - 1; + dc->caps.vtotal_limited_by_fp2 = 
true; /* Color pipeline capabilities */ dc->caps.color.dpp.dcn_arch = 1; diff --git a/drivers/gpu/drm/amd/display/dc/resource/dcn401/dcn401_resource.c b/drivers/gpu/drm/amd/display/dc/resource/dcn401/dcn401_resource.c index db93bac247c0..c1ebc6b1c937 100644 --- a/drivers/gpu/drm/amd/display/dc/resource/dcn401/dcn401_resource.c +++ b/drivers/gpu/drm/amd/display/dc/resource/dcn401/dcn401_resource.c @@ -726,6 +726,10 @@ static const struct dc_debug_options debug_defaults_drv = { .disable_unbounded_requesting = false, .enable_legacy_fast_update = false, .dcc_meta_propagation_delay_us = 10, + .fams_version = { + .minor = 1, + .major = 2, + }, //v2.1 .fams2_config = { .bits = { .enable = true, @@ -733,7 +737,7 @@ static const struct dc_debug_options debug_defaults_drv = { .enable_stall_recovery = true, } }, - .force_cositing = CHROMA_COSITING_TOPLEFT + 1, + .force_cositing = CHROMA_COSITING_NONE + 1, }; static struct dce_aux *dcn401_aux_engine_create( @@ -1293,6 +1297,29 @@ static struct hpo_dp_link_encoder *dcn401_hpo_dp_link_encoder_create( return &hpo_dp_enc31->base; } +static unsigned int dcn401_calc_num_avail_chans_for_mall(struct dc *dc, unsigned int num_chans) +{ + unsigned int num_available_chans = 1; + + /* channels for MALL must be a power of 2 */ + while (num_chans > 1) { + num_available_chans = (num_available_chans << 1); + num_chans = (num_chans >> 1); + } + + /* cannot be odd */ + num_available_chans &= ~1; + + /* clamp to max available channels for MALL per ASIC */ + if (ASICREV_IS_GC_12_0_0_A0(dc->ctx->asic_id.hw_internal_rev)) { + num_available_chans = num_available_chans > 16 ? 16 : num_available_chans; + } else if (ASICREV_IS_GC_12_0_1_A0(dc->ctx->asic_id.hw_internal_rev)) { + num_available_chans = num_available_chans > 8 ? 
8 : num_available_chans; + } + + return num_available_chans; +} + static struct dce_hwseq *dcn401_hwseq_create( struct dc_context *ctx) { @@ -1588,6 +1615,14 @@ static void dcn401_update_bw_bounding_box(struct dc *dc, struct clk_bw_params *b memcpy(dml2_opt, &dc->dml2_options, sizeof(dc->dml2_options)); + /* re-calculate the available MALL size if required */ + if (bw_params->num_channels > 0) { + dc->caps.max_cab_allocation_bytes = dcn401_calc_num_avail_chans_for_mall( + dc, bw_params->num_channels) * + dc->caps.mall_size_per_mem_channel * 1024 * 1024; + dc->caps.mall_size_total = dc->caps.max_cab_allocation_bytes; + } + DC_FP_START(); dcn401_update_bw_bounding_box_fpu(dc, bw_params); @@ -1605,6 +1640,7 @@ static void dcn401_update_bw_bounding_box(struct dc *dc, struct clk_bw_params *b enum dc_status dcn401_patch_unknown_plane_state(struct dc_plane_state *plane_state) { + plane_state->tiling_info.gfxversion = DcGfxAddr3; plane_state->tiling_info.gfx_addr3.swizzle = DC_ADDR3_SW_64KB_2D; return DC_OK; } @@ -1704,27 +1740,9 @@ static int dcn401_get_power_profile(const struct dc_state *context) return dpm_level; } -static unsigned int dcn401_calc_num_avail_chans_for_mall(struct dc *dc, unsigned int num_chans) +static unsigned int dcn401_get_vstartup_for_pipe(struct pipe_ctx *pipe_ctx) { - unsigned int num_available_chans = 1; - - /* channels for MALL must be a power of 2 */ - while (num_chans > 1) { - num_available_chans = (num_available_chans << 1); - num_chans = (num_chans >> 1); - } - - /* cannot be odd */ - num_available_chans &= ~1; - - /* clamp to max available channels for MALL per ASIC */ - if (ASICREV_IS_GC_12_0_0_A0(dc->ctx->asic_id.hw_internal_rev)) { - num_available_chans = num_available_chans > 16 ? 16 : num_available_chans; - } else if (ASICREV_IS_GC_12_0_1_A0(dc->ctx->asic_id.hw_internal_rev)) { - num_available_chans = num_available_chans > 8 ? 
8 : num_available_chans; - } - - return num_available_chans; + return pipe_ctx->global_sync.dcn4x.vstartup_lines; } static struct resource_funcs dcn401_res_pool_funcs = { @@ -1754,6 +1772,7 @@ static struct resource_funcs dcn401_res_pool_funcs = { .build_pipe_pix_clk_params = dcn401_build_pipe_pix_clk_params, .calculate_mall_ways_from_bytes = dcn32_calculate_mall_ways_from_bytes, .get_power_profile = dcn401_get_power_profile, + .get_vstartup_for_pipe = dcn401_get_vstartup_for_pipe }; static uint32_t read_pipe_fuses(struct dc_context *ctx) @@ -1864,6 +1883,7 @@ static bool dcn401_resource_construct( dc->caps.extended_aux_timeout_support = true; dc->caps.dmcub_support = true; dc->caps.max_v_total = (1 << 15) - 1; + dc->caps.vtotal_limited_by_fp2 = true; if (ASICREV_IS_GC_12_0_1_A0(dc->ctx->asic_id.hw_internal_rev)) dc->caps.dcc_plane_width_limit = 7680; diff --git a/drivers/gpu/drm/amd/display/dc/spl/dc_spl.c b/drivers/gpu/drm/amd/display/dc/spl/dc_spl.c index 73a65913cb12..38a9a0d68058 100644 --- a/drivers/gpu/drm/amd/display/dc/spl/dc_spl.c +++ b/drivers/gpu/drm/amd/display/dc/spl/dc_spl.c @@ -11,6 +11,41 @@ #define IDENTITY_RATIO(ratio) (spl_fixpt_u2d19(ratio) == (1 << 19)) #define MIN_VIEWPORT_SIZE 12 +static bool spl_is_yuv420(enum spl_pixel_format format) +{ + if ((format >= SPL_PIXEL_FORMAT_420BPP8) && + (format <= SPL_PIXEL_FORMAT_420BPP10)) + return true; + + return false; +} + +static bool spl_is_rgb8(enum spl_pixel_format format) +{ + if (format == SPL_PIXEL_FORMAT_ARGB8888) + return true; + + return false; +} + +static bool spl_is_video_format(enum spl_pixel_format format) +{ + if (format >= SPL_PIXEL_FORMAT_VIDEO_BEGIN + && format <= SPL_PIXEL_FORMAT_VIDEO_END) + return true; + else + return false; +} + +static bool spl_is_subsampled_format(enum spl_pixel_format format) +{ + if (format >= SPL_PIXEL_FORMAT_SUBSAMPLED_BEGIN + && format <= SPL_PIXEL_FORMAT_SUBSAMPLED_END) + return true; + else + return false; +} + static struct spl_rect intersect_rec(const 
struct spl_rect *r0, const struct spl_rect *r1) { struct spl_rect rec; @@ -137,15 +172,32 @@ static struct spl_rect calculate_mpc_slice_in_timing_active( struct spl_in *spl_in, struct spl_rect *plane_clip_rec) { - int mpc_slice_count = spl_in->basic_in.mpc_combine_h; - int mpc_slice_idx = spl_in->basic_in.mpc_combine_v; + bool use_recout_width_aligned = + spl_in->basic_in.num_h_slices_recout_width_align.use_recout_width_aligned; + int mpc_slice_count = + spl_in->basic_in.num_h_slices_recout_width_align.num_slices_recout_width.mpc_num_h_slices; + int recout_width_align = + spl_in->basic_in.num_h_slices_recout_width_align.num_slices_recout_width.mpc_recout_width_align; + int mpc_slice_idx = spl_in->basic_in.mpc_h_slice_index; int epimo = mpc_slice_count - plane_clip_rec->width % mpc_slice_count - 1; struct spl_rect mpc_rec; - mpc_rec.width = plane_clip_rec->width / mpc_slice_count; - mpc_rec.x = plane_clip_rec->x + mpc_rec.width * mpc_slice_idx; - mpc_rec.height = plane_clip_rec->height; - mpc_rec.y = plane_clip_rec->y; + if (use_recout_width_aligned) { + mpc_rec.width = recout_width_align; + if ((mpc_rec.width * (mpc_slice_idx + 1)) > plane_clip_rec->width) { + mpc_rec.width = plane_clip_rec->width % recout_width_align; + mpc_rec.x = plane_clip_rec->x + recout_width_align * mpc_slice_idx; + } else + mpc_rec.x = plane_clip_rec->x + mpc_rec.width * mpc_slice_idx; + mpc_rec.height = plane_clip_rec->height; + mpc_rec.y = plane_clip_rec->y; + + } else { + mpc_rec.width = plane_clip_rec->width / mpc_slice_count; + mpc_rec.x = plane_clip_rec->x + mpc_rec.width * mpc_slice_idx; + mpc_rec.height = plane_clip_rec->height; + mpc_rec.y = plane_clip_rec->y; + } SPL_ASSERT(mpc_slice_count == 1 || spl_in->basic_out.view_format != SPL_VIEW_3D_SIDE_BY_SIDE || mpc_rec.width % 2 == 0); @@ -391,8 +443,7 @@ static void spl_calculate_scaling_ratios(struct spl_in *spl_in, spl_scratch->scl_data.ratios.horz_c = spl_scratch->scl_data.ratios.horz; spl_scratch->scl_data.ratios.vert_c = 
spl_scratch->scl_data.ratios.vert; - if (spl_in->basic_in.format == SPL_PIXEL_FORMAT_420BPP8 - || spl_in->basic_in.format == SPL_PIXEL_FORMAT_420BPP10) { + if (spl_is_yuv420(spl_in->basic_in.format)) { spl_scratch->scl_data.ratios.horz_c.value /= 2; spl_scratch->scl_data.ratios.vert_c.value /= 2; } @@ -529,23 +580,6 @@ static void spl_calculate_init_and_vp(bool flip_scan_dir, *vp_offset = src_size - *vp_offset - *vp_size; } -static bool spl_is_yuv420(enum spl_pixel_format format) -{ - if ((format >= SPL_PIXEL_FORMAT_420BPP8) && - (format <= SPL_PIXEL_FORMAT_420BPP10)) - return true; - - return false; -} - -static bool spl_is_rgb8(enum spl_pixel_format format) -{ - if (format == SPL_PIXEL_FORMAT_ARGB8888) - return true; - - return false; -} - /*Calculate inits and viewport */ static void spl_calculate_inits_and_viewports(struct spl_in *spl_in, struct spl_scratch *spl_scratch) @@ -556,8 +590,7 @@ static void spl_calculate_inits_and_viewports(struct spl_in *spl_in, struct spl_rect recout_clip_in_recout_dst; struct spl_rect overlap_in_active_timing; struct spl_rect odm_slice = calculate_odm_slice_in_timing_active(spl_in); - int vpc_div = (spl_in->basic_in.format == SPL_PIXEL_FORMAT_420BPP8 - || spl_in->basic_in.format == SPL_PIXEL_FORMAT_420BPP10) ? 2 : 1; + int vpc_div = spl_is_subsampled_format(spl_in->basic_in.format) ? 
2 : 1; bool orthogonal_rotation, flip_vert_scan_dir, flip_horz_scan_dir; struct spl_fixed31_32 init_adj_h = spl_fixpt_zero; struct spl_fixed31_32 init_adj_v = spl_fixpt_zero; @@ -585,12 +618,7 @@ static void spl_calculate_inits_and_viewports(struct spl_in *spl_in, &flip_vert_scan_dir, &flip_horz_scan_dir); - if (orthogonal_rotation) { - spl_swap(src.width, src.height); - spl_swap(flip_vert_scan_dir, flip_horz_scan_dir); - } - - if (spl_is_yuv420(spl_in->basic_in.format)) { + if (spl_is_subsampled_format(spl_in->basic_in.format)) { /* this gives the direction of the cositing (negative will move * left, right otherwise) */ @@ -598,15 +626,15 @@ static void spl_calculate_inits_and_viewports(struct spl_in *spl_in, switch (spl_in->basic_in.cositing) { - case CHROMA_COSITING_LEFT: - init_adj_h = spl_fixpt_zero; - init_adj_v = spl_fixpt_from_fraction(sign, 4); - break; - case CHROMA_COSITING_NONE: + case CHROMA_COSITING_TOPLEFT: init_adj_h = spl_fixpt_from_fraction(sign, 4); init_adj_v = spl_fixpt_from_fraction(sign, 4); break; - case CHROMA_COSITING_TOPLEFT: + case CHROMA_COSITING_LEFT: + init_adj_h = spl_fixpt_from_fraction(sign, 4); + init_adj_v = spl_fixpt_zero; + break; + case CHROMA_COSITING_NONE: default: init_adj_h = spl_fixpt_zero; init_adj_v = spl_fixpt_zero; @@ -614,6 +642,12 @@ static void spl_calculate_inits_and_viewports(struct spl_in *spl_in, } } + if (orthogonal_rotation) { + spl_swap(src.width, src.height); + spl_swap(flip_vert_scan_dir, flip_horz_scan_dir); + spl_swap(init_adj_h, init_adj_v); + } + spl_calculate_init_and_vp( flip_horz_scan_dir, recout_clip_in_recout_dst.x, @@ -678,7 +712,7 @@ static void spl_handle_3d_recout(struct spl_in *spl_in, struct spl_rect *recout) * since 3d is special and needs to calculate vp as if there is no recout offset * This may break with rotation, good thing we aren't mixing hw rotation and 3d */ - if (spl_in->basic_in.mpc_combine_v) { + if (spl_in->basic_in.mpc_h_slice_index) { SPL_ASSERT(spl_in->basic_in.rotation == 
SPL_ROTATION_ANGLE_0 || (spl_in->basic_out.view_format != SPL_VIEW_3D_TOP_AND_BOTTOM && spl_in->basic_out.view_format != SPL_VIEW_3D_SIDE_BY_SIDE)); @@ -698,24 +732,6 @@ static void spl_clamp_viewport(struct spl_rect *viewport) viewport->width = MIN_VIEWPORT_SIZE; } -static bool spl_dscl_is_420_format(enum spl_pixel_format format) -{ - if (format == SPL_PIXEL_FORMAT_420BPP8 || - format == SPL_PIXEL_FORMAT_420BPP10) - return true; - else - return false; -} - -static bool spl_dscl_is_video_format(enum spl_pixel_format format) -{ - if (format >= SPL_PIXEL_FORMAT_VIDEO_BEGIN - && format <= SPL_PIXEL_FORMAT_VIDEO_END) - return true; - else - return false; -} - static enum scl_mode spl_get_dscl_mode(const struct spl_in *spl_in, const struct spl_scaler_data *data, bool enable_isharp, bool enable_easf) @@ -732,8 +748,8 @@ static enum scl_mode spl_get_dscl_mode(const struct spl_in *spl_in, && !enable_isharp) return SCL_MODE_SCALING_444_BYPASS; - if (!spl_dscl_is_420_format(pixel_format)) { - if (spl_dscl_is_video_format(pixel_format)) + if (!spl_is_subsampled_format(pixel_format)) { + if (spl_is_video_format(pixel_format)) return SCL_MODE_SCALING_444_YCBCR_ENABLE; else return SCL_MODE_SCALING_444_RGB_ENABLE; @@ -756,7 +772,7 @@ static bool spl_choose_lls_policy(enum spl_pixel_format format, enum spl_transfer_func_predefined tf_predefined_type, enum linear_light_scaling *lls_pref) { - if (spl_is_yuv420(format)) { + if (spl_is_video_format(format)) { *lls_pref = LLS_PREF_NO; if ((tf_type == SPL_TF_TYPE_PREDEFINED) || (tf_type == SPL_TF_TYPE_DISTRIBUTED_POINTS)) @@ -815,7 +831,7 @@ static bool enable_easf(struct spl_in *spl_in, struct spl_scratch *spl_scratch) /* Check if video is in fullscreen mode */ static bool spl_is_video_fullscreen(struct spl_in *spl_in) { - if (spl_is_yuv420(spl_in->basic_in.format) && spl_in->is_fullscreen) + if (spl_is_video_format(spl_in->basic_in.format) && spl_in->is_fullscreen) return true; return false; } @@ -846,10 +862,10 @@ static bool 
spl_get_isharp_en(struct spl_in *spl_in, * Apply sharpness to RGB and YUV (NV12/P010) * surfaces based on policy setting */ - if (!spl_is_yuv420(spl_in->basic_in.format) && + if (!spl_is_video_format(spl_in->basic_in.format) && (spl_in->sharpen_policy == SHARPEN_YUV)) return enable_isharp; - else if ((spl_is_yuv420(spl_in->basic_in.format) && !fullscreen) && + else if ((spl_is_video_format(spl_in->basic_in.format) && !fullscreen) && (spl_in->sharpen_policy == SHARPEN_RGB_FULLSCREEN_YUV)) return enable_isharp; else if (!spl_in->is_fullscreen && @@ -882,8 +898,8 @@ static void spl_get_taps_non_adaptive_scaler( if (in_taps->v_taps == 0) { if (spl_fixpt_ceil(spl_scratch->scl_data.ratios.vert) > 1) - spl_scratch->scl_data.taps.v_taps = spl_min(spl_fixpt_ceil(spl_fixpt_mul_int( - spl_scratch->scl_data.ratios.vert, 2)), 8); + spl_scratch->scl_data.taps.v_taps = spl_min(2 * spl_fixpt_ceil( + spl_scratch->scl_data.ratios.vert), 8); else spl_scratch->scl_data.taps.v_taps = 4; } else @@ -891,8 +907,8 @@ static void spl_get_taps_non_adaptive_scaler( if (in_taps->v_taps_c == 0) { if (spl_fixpt_ceil(spl_scratch->scl_data.ratios.vert_c) > 1) - spl_scratch->scl_data.taps.v_taps_c = spl_min(spl_fixpt_ceil(spl_fixpt_mul_int( - spl_scratch->scl_data.ratios.vert_c, 2)), 8); + spl_scratch->scl_data.taps.v_taps_c = spl_min(2 * spl_fixpt_ceil( + spl_scratch->scl_data.ratios.vert_c), 8); else spl_scratch->scl_data.taps.v_taps_c = 4; } else @@ -932,7 +948,7 @@ static bool spl_get_optimal_number_of_taps( int min_taps_y, min_taps_c; enum lb_memory_config lb_config; bool skip_easf = false; - bool is_ycbcr = spl_dscl_is_video_format(spl_in->basic_in.format); + bool is_subsampled = spl_is_subsampled_format(spl_in->basic_in.format); if (spl_scratch->scl_data.viewport.width > spl_scratch->scl_data.h_active && max_downscale_src_width != 0 && @@ -964,7 +980,7 @@ static bool spl_get_optimal_number_of_taps( if (skip_easf) spl_get_taps_non_adaptive_scaler(spl_scratch, in_taps); else { - if 
(spl_is_yuv420(spl_in->basic_in.format)) { + if (spl_is_video_format(spl_in->basic_in.format)) { spl_scratch->scl_data.taps.h_taps = 6; spl_scratch->scl_data.taps.v_taps = 6; spl_scratch->scl_data.taps.h_taps_c = 4; @@ -982,8 +998,7 @@ static bool spl_get_optimal_number_of_taps( min_taps_c = spl_fixpt_ceil(spl_scratch->scl_data.ratios.vert_c); /* Use LB_MEMORY_CONFIG_3 for 4:2:0 */ - if ((spl_in->basic_in.format == SPL_PIXEL_FORMAT_420BPP8) - || (spl_in->basic_in.format == SPL_PIXEL_FORMAT_420BPP10)) + if (spl_is_yuv420(spl_in->basic_in.format)) lb_config = LB_MEMORY_CONFIG_3; else lb_config = LB_MEMORY_CONFIG_0; @@ -1039,13 +1054,11 @@ static bool spl_get_optimal_number_of_taps( if (spl_scratch->scl_data.taps.h_taps_c == 5) spl_scratch->scl_data.taps.h_taps_c = 4; - if (spl_is_yuv420(spl_in->basic_in.format)) { - if ((spl_scratch->scl_data.taps.h_taps <= 4) || - (spl_scratch->scl_data.taps.h_taps_c <= 3)) { + if (spl_is_video_format(spl_in->basic_in.format)) { + if (spl_scratch->scl_data.taps.h_taps <= 4) { *enable_easf_v = false; *enable_easf_h = false; - } else if ((spl_scratch->scl_data.taps.v_taps <= 3) || - (spl_scratch->scl_data.taps.v_taps_c <= 3)) { + } else if (spl_scratch->scl_data.taps.v_taps <= 3) { *enable_easf_v = false; *enable_easf_h = true; } else { @@ -1086,10 +1099,10 @@ static bool spl_get_optimal_number_of_taps( spl_scratch->scl_data.taps.h_taps = 1; spl_scratch->scl_data.taps.v_taps = 1; - if (IDENTITY_RATIO(spl_scratch->scl_data.ratios.horz_c) && !is_ycbcr) + if (IDENTITY_RATIO(spl_scratch->scl_data.ratios.horz_c) && !is_subsampled) spl_scratch->scl_data.taps.h_taps_c = 1; - if (IDENTITY_RATIO(spl_scratch->scl_data.ratios.vert_c) && !is_ycbcr) + if (IDENTITY_RATIO(spl_scratch->scl_data.ratios.vert_c) && !is_subsampled) spl_scratch->scl_data.taps.v_taps_c = 1; *enable_easf_v = false; @@ -1103,11 +1116,11 @@ static bool spl_get_optimal_number_of_taps( (IDENTITY_RATIO(spl_scratch->scl_data.ratios.vert))) spl_scratch->scl_data.taps.v_taps = 1; - 
if ((!*enable_easf_h) && !is_ycbcr && + if ((!*enable_easf_h) && !is_subsampled && (IDENTITY_RATIO(spl_scratch->scl_data.ratios.horz_c))) spl_scratch->scl_data.taps.h_taps_c = 1; - if ((!*enable_easf_v) && !is_ycbcr && + if ((!*enable_easf_v) && !is_subsampled && (IDENTITY_RATIO(spl_scratch->scl_data.ratios.vert_c))) spl_scratch->scl_data.taps.v_taps_c = 1; } @@ -1118,7 +1131,7 @@ static bool spl_get_optimal_number_of_taps( static void spl_set_black_color_data(enum spl_pixel_format format, struct scl_black_color *scl_black_color) { - bool ycbcr = spl_dscl_is_video_format(format); + bool ycbcr = spl_is_video_format(format); if (ycbcr) { scl_black_color->offset_rgb_y = BLACK_OFFSET_RGB_Y; scl_black_color->offset_rgb_cbcr = BLACK_OFFSET_CBCR; @@ -1585,7 +1598,7 @@ static void spl_set_easf_data(struct spl_scratch *spl_scratch, struct spl_out *s 0x0; // fp1.5.10, C3 coefficient } - if (spl_is_yuv420(format)) { /* TODO: 0 = RGB, 1 = YUV */ + if (spl_is_subsampled_format(format)) { /* TODO: 0 = RGB, 1 = YUV */ dscl_prog_data->easf_matrix_mode = 1; /* * 2-bit, BF3 chroma mode correction calculation mode diff --git a/drivers/gpu/drm/amd/display/dc/spl/dc_spl_types.h b/drivers/gpu/drm/amd/display/dc/spl/dc_spl_types.h index 55d557df4aa5..467af9dd90de 100644 --- a/drivers/gpu/drm/amd/display/dc/spl/dc_spl_types.h +++ b/drivers/gpu/drm/amd/display/dc/spl/dc_spl_types.h @@ -63,13 +63,13 @@ enum spl_pixel_format { SPL_PIXEL_FORMAT_420BPP8, SPL_PIXEL_FORMAT_420BPP10, /*end of pixel format definition*/ - SPL_PIXEL_FORMAT_INVALID, - SPL_PIXEL_FORMAT_422BPP8, - SPL_PIXEL_FORMAT_422BPP10, SPL_PIXEL_FORMAT_GRPH_BEGIN = SPL_PIXEL_FORMAT_INDEX8, SPL_PIXEL_FORMAT_GRPH_END = SPL_PIXEL_FORMAT_FP16, + SPL_PIXEL_FORMAT_SUBSAMPLED_BEGIN = SPL_PIXEL_FORMAT_420BPP8, + SPL_PIXEL_FORMAT_SUBSAMPLED_END = SPL_PIXEL_FORMAT_420BPP10, SPL_PIXEL_FORMAT_VIDEO_BEGIN = SPL_PIXEL_FORMAT_420BPP8, SPL_PIXEL_FORMAT_VIDEO_END = SPL_PIXEL_FORMAT_420BPP10, + SPL_PIXEL_FORMAT_INVALID, SPL_PIXEL_FORMAT_UNKNOWN }; 
@@ -436,8 +436,14 @@ struct basic_in { struct spl_rect clip_rect; // Clip rect enum spl_rotation_angle rotation; // Rotation bool horizontal_mirror; // Horizontal mirror - int mpc_combine_h; // MPC Horizontal Combine Factor (split_count) - int mpc_combine_v; // MPC Vertical Combine Factor (split_idx) + struct { // previous mpc_combine_h - split count + bool use_recout_width_aligned; + union { + int mpc_num_h_slices; + int mpc_recout_width_align; + } num_slices_recout_width; + } num_h_slices_recout_width_align; + int mpc_h_slice_index; // previous mpc_combine_v - split_idx // Inputs for adaptive scaler - TODO enum spl_transfer_func_type tf_type; /* Transfer function type */ enum spl_transfer_func_predefined tf_predefined_type; /* Transfer function predefined type */ diff --git a/drivers/gpu/drm/amd/display/dmub/dmub_srv.h b/drivers/gpu/drm/amd/display/dmub/dmub_srv.h index b353c4ceb60d..4b3ccbca0da2 100644 --- a/drivers/gpu/drm/amd/display/dmub/dmub_srv.h +++ b/drivers/gpu/drm/amd/display/dmub/dmub_srv.h @@ -69,6 +69,9 @@ #define DMUB_PC_SNAPSHOT_COUNT 10 +/* Default tracebuffer size if meta is absent. 
*/ +#define DMUB_TRACE_BUFFER_SIZE (64 * 1024) + /* Forward declarations */ struct dmub_srv; struct dmub_srv_common_regs; diff --git a/drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h b/drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h index b800a507d1e0..d0fe324cb537 100644 --- a/drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h +++ b/drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h @@ -431,7 +431,68 @@ union replay_debug_flags { */ uint32_t enable_ips_residency_profiling : 1; - uint32_t reserved : 20; + /** + * 0x1000 (bit 12) + * @enable_coasting_vtotal_check: Enable Coasting_vtotal_check + */ + uint32_t enable_coasting_vtotal_check : 1; + /** + * 0x2000 (bit 13) + * @enable_visual_confirm_debug: Enable Visual Confirm Debug + */ + uint32_t enable_visual_confirm_debug : 1; + + uint32_t reserved : 18; + } bitfields; + + uint32_t u32All; +}; + +/** + * Flags record error state. + */ +union replay_visual_confirm_error_state_flags { + struct { + /** + * 0x1 (bit 0) - Desync Error flag. + */ + uint32_t desync_error : 1; + + /** + * 0x2 (bit 1) - State Transition Error flag. + */ + uint32_t state_transition_error : 1; + + /** + * 0x4 (bit 2) - Crc Error flag + */ + uint32_t crc_error : 1; + + /** + * 0x8 (bit 3) - Reserved + */ + uint32_t reserved_3 : 1; + + /** + * 0x10 (bit 4) - Incorrect Coasting vtotal checking --> use debug flag to control DPCD write. + * Added new debug flag to control DPCD. + */ + uint32_t incorrect_vtotal_in_static_screen : 1; + + /** + * 0x20 (bit 5) - No doubled Refresh Rate. + */ + uint32_t no_double_rr : 1; + + /** + * Reserved bit 6-7 + */ + uint32_t reserved_6_7 : 2; + + /** + * Reserved bit 9-31 + */ + uint32_t reserved_9_31 : 24; } bitfields; uint32_t u32All; @@ -475,11 +536,23 @@ union replay_hw_flags { * Use TPS3 signal when restore main link. 
*/ uint32_t force_wakeup_by_tps3 : 1; + /** + * @is_alpm_initialized: Indicates whether ALPM is initialized + */ + uint32_t is_alpm_initialized : 1; } bitfields; uint32_t u32All; }; +union fw_assisted_mclk_switch_version { + struct { + uint8_t minor : 5; + uint8_t major : 3; + }; + uint8_t ver; +}; + /** * DMUB feature capabilities. * After DMUB init, driver will query FW capabilities prior to enabling certain features. @@ -1823,52 +1896,11 @@ enum fams2_stream_type { FAMS2_STREAM_TYPE_SUBVP = 4, }; -/* dynamic stream state */ -struct dmub_fams2_legacy_stream_dynamic_state { - uint8_t force_allow_at_vblank; - uint8_t pad[3]; -}; - -struct dmub_fams2_subvp_stream_dynamic_state { - uint16_t viewport_start_hubp_vline; - uint16_t viewport_height_hubp_vlines; - uint16_t viewport_start_c_hubp_vline; - uint16_t viewport_height_c_hubp_vlines; - uint16_t phantom_viewport_height_hubp_vlines; - uint16_t phantom_viewport_height_c_hubp_vlines; - uint16_t microschedule_start_otg_vline; - uint16_t mall_start_otg_vline; - uint16_t mall_start_hubp_vline; - uint16_t mall_start_c_hubp_vline; - uint8_t force_allow_at_vblank_only; - uint8_t pad[3]; -}; - -struct dmub_fams2_drr_stream_dynamic_state { - uint16_t stretched_vtotal; - uint8_t use_cur_vtotal; - uint8_t pad; -}; - -struct dmub_fams2_stream_dynamic_state { - uint64_t ref_tick; - uint32_t cur_vtotal; - uint16_t adjusted_allow_end_otg_vline; - uint8_t pad[2]; - struct dmub_optc_position ref_otg_pos; - struct dmub_optc_position target_otg_pos; - union { - struct dmub_fams2_legacy_stream_dynamic_state legacy; - struct dmub_fams2_subvp_stream_dynamic_state subvp; - struct dmub_fams2_drr_stream_dynamic_state drr; - } sub_state; -}; - /* static stream state */ struct dmub_fams2_legacy_stream_static_state { uint8_t vactive_det_fill_delay_otg_vlines; uint8_t programming_delay_otg_vlines; -}; +}; //v0 struct dmub_fams2_subvp_stream_static_state { uint16_t vratio_numerator; @@ -1887,14 +1919,59 @@ struct 
dmub_fams2_subvp_stream_static_state { uint8_t phantom_otg_inst; uint8_t phantom_pipe_mask; uint8_t phantom_plane_pipe_masks[DMUB_MAX_PHANTOM_PLANES]; // phantom pipe mask per plane (for flip passthrough) -}; +}; //v0 struct dmub_fams2_drr_stream_static_state { uint16_t nom_stretched_vtotal; uint8_t programming_delay_otg_vlines; uint8_t only_stretch_if_required; uint8_t pad[2]; -}; +}; //v0 + +struct dmub_fams2_cmd_legacy_stream_static_state { + uint16_t vactive_det_fill_delay_otg_vlines; + uint16_t programming_delay_otg_vlines; +}; //v1 + +struct dmub_fams2_cmd_subvp_stream_static_state { + uint16_t vratio_numerator; + uint16_t vratio_denominator; + uint16_t phantom_vtotal; + uint16_t phantom_vactive; + uint16_t programming_delay_otg_vlines; + uint16_t prefetch_to_mall_otg_vlines; + union { + struct { + uint8_t is_multi_planar : 1; + uint8_t is_yuv420 : 1; + } bits; + uint8_t all; + } config; + uint8_t phantom_otg_inst; + uint8_t phantom_pipe_mask; + uint8_t pad0; + uint8_t phantom_plane_pipe_masks[DMUB_MAX_PHANTOM_PLANES]; // phantom pipe mask per plane (for flip passthrough) + uint8_t pad1[4 - (DMUB_MAX_PHANTOM_PLANES % 4)]; +}; //v1 + +struct dmub_fams2_cmd_drr_stream_static_state { + uint16_t nom_stretched_vtotal; + uint16_t programming_delay_otg_vlines; + uint8_t only_stretch_if_required; + uint8_t pad[3]; +}; //v1 + +union dmub_fams2_stream_static_sub_state { + struct dmub_fams2_legacy_stream_static_state legacy; + struct dmub_fams2_subvp_stream_static_state subvp; + struct dmub_fams2_drr_stream_static_state drr; +}; //v0 + +union dmub_fams2_cmd_stream_static_sub_state { + struct dmub_fams2_cmd_legacy_stream_static_state legacy; + struct dmub_fams2_cmd_subvp_stream_static_state subvp; + struct dmub_fams2_cmd_drr_stream_static_state drr; +}; //v1 struct dmub_fams2_stream_static_state { enum fams2_stream_type type; @@ -1924,13 +2001,45 @@ struct dmub_fams2_stream_static_state { uint8_t pipe_mask; // pipe mask for the whole config uint8_t num_planes; uint8_t 
plane_pipe_masks[DMUB_MAX_PLANES]; // pipe mask per plane (for flip passthrough) - uint8_t pad[DMUB_MAX_PLANES % 4]; + uint8_t pad[4 - (DMUB_MAX_PLANES % 4)]; + union dmub_fams2_stream_static_sub_state sub_state; +}; //v0 + +struct dmub_fams2_cmd_stream_static_base_state { + enum fams2_stream_type type; + uint32_t otg_vline_time_ns; + uint32_t otg_vline_time_ticks; + uint16_t htotal; + uint16_t vtotal; // nominal vtotal + uint16_t vblank_start; + uint16_t vblank_end; + uint16_t max_vtotal; + uint16_t allow_start_otg_vline; + uint16_t allow_end_otg_vline; + uint16_t drr_keepout_otg_vline; // after this vline, vtotal cannot be changed + uint16_t scheduling_delay_otg_vlines; // min time to budget for ready to microschedule start + uint16_t contention_delay_otg_vlines; // time to budget for contention on execution + uint16_t vline_int_ack_delay_otg_vlines; // min time to budget for vertical interrupt firing + uint16_t allow_to_target_delay_otg_vlines; // time from allow vline to target vline union { - struct dmub_fams2_legacy_stream_static_state legacy; - struct dmub_fams2_subvp_stream_static_state subvp; - struct dmub_fams2_drr_stream_static_state drr; - } sub_state; -}; + struct { + uint8_t is_drr : 1; // stream is DRR enabled + uint8_t clamp_vtotal_min : 1; // clamp vtotal to min instead of nominal + uint8_t min_ttu_vblank_usable : 1; // if min ttu vblank is above wm, no force pstate is needed in blank + } bits; + uint8_t all; + } config; + uint8_t otg_inst; + uint8_t pipe_mask; // pipe mask for the whole config + uint8_t num_planes; + uint8_t plane_pipe_masks[DMUB_MAX_PLANES]; // pipe mask per plane (for flip passthrough) + uint8_t pad[4 - (DMUB_MAX_PLANES % 4)]; +}; //v1 + +struct dmub_fams2_stream_static_state_v1 { + struct dmub_fams2_cmd_stream_static_base_state base; + union dmub_fams2_cmd_stream_static_sub_state sub_state; +}; //v1 /** * enum dmub_fams2_allow_delay_check_mode - macroscheduler mode for breaking on excessive @@ -1970,7 +2079,11 @@ struct 
dmub_cmd_fams2_global_config { union dmub_cmd_fams2_config { struct dmub_cmd_fams2_global_config global; - struct dmub_fams2_stream_static_state stream; + struct dmub_fams2_stream_static_state stream; //v0 + union { + struct dmub_fams2_cmd_stream_static_base_state base; + union dmub_fams2_cmd_stream_static_sub_state sub_state; + } stream_v1; //v1 }; /** @@ -3592,6 +3705,8 @@ enum dmub_cmd_replay_general_subtype { */ REPLAY_GENERAL_CMD_DISABLED_ADAPTIVE_SYNC_SDP, REPLAY_GENERAL_CMD_DISABLED_DESYNC_ERROR_DETECTION, + REPLAY_GENERAL_CMD_UPDATE_ERROR_STATUS, + REPLAY_GENERAL_CMD_SET_LOW_RR_ACTIVATE, }; /** diff --git a/drivers/gpu/drm/amd/display/dmub/src/dmub_srv.c b/drivers/gpu/drm/amd/display/dmub/src/dmub_srv.c index a3f3ff5d49ac..15ea216e903d 100644 --- a/drivers/gpu/drm/amd/display/dmub/src/dmub_srv.c +++ b/drivers/gpu/drm/amd/display/dmub/src/dmub_srv.c @@ -61,10 +61,6 @@ /* Default state size if meta is absent. */ #define DMUB_FW_STATE_SIZE (64 * 1024) -/* Default tracebuffer size if meta is absent. */ -#define DMUB_TRACE_BUFFER_SIZE (64 * 1024) - - /* Default scratch mem size. 
*/ #define DMUB_SCRATCH_MEM_SIZE (1024) diff --git a/drivers/gpu/drm/amd/display/modules/freesync/freesync.c b/drivers/gpu/drm/amd/display/modules/freesync/freesync.c index f980a84dceef..2b3964529539 100644 --- a/drivers/gpu/drm/amd/display/modules/freesync/freesync.c +++ b/drivers/gpu/drm/amd/display/modules/freesync/freesync.c @@ -122,6 +122,17 @@ static unsigned int calc_duration_in_us_from_v_total( return duration_in_us; } +static unsigned int calc_max_hardware_v_total(const struct dc_stream_state *stream) +{ + unsigned int max_hw_v_total = stream->ctx->dc->caps.max_v_total; + + if (stream->ctx->dc->caps.vtotal_limited_by_fp2) { + max_hw_v_total -= stream->timing.v_front_porch + 1; + } + + return max_hw_v_total; +} + unsigned int mod_freesync_calc_v_total_from_refresh( const struct dc_stream_state *stream, unsigned int refresh_in_uhz) @@ -1016,7 +1027,7 @@ void mod_freesync_build_vrr_params(struct mod_freesync *mod_freesync, if (stream->ctx->dc->caps.max_v_total != 0 && stream->timing.h_total != 0) { min_hardware_refresh_in_uhz = div64_u64((stream->timing.pix_clk_100hz * 100000000ULL), - (stream->timing.h_total * (long long)stream->ctx->dc->caps.max_v_total)); + (stream->timing.h_total * (long long)calc_max_hardware_v_total(stream))); } /* Limit minimum refresh rate to what can be supported by hardware */ min_refresh_in_uhz = min_hardware_refresh_in_uhz > in_config->min_refresh_in_uhz ? 
diff --git a/drivers/gpu/drm/amd/display/modules/power/power_helpers.c b/drivers/gpu/drm/amd/display/modules/power/power_helpers.c index 95838c7ab054..29ccd3532d13 100644 --- a/drivers/gpu/drm/amd/display/modules/power/power_helpers.c +++ b/drivers/gpu/drm/amd/display/modules/power/power_helpers.c @@ -996,9 +996,9 @@ void set_replay_coasting_vtotal(struct dc_link *link, link->replay_settings.coasting_vtotal_table[type] = vtotal; } -void set_replay_ips_full_screen_video_src_vtotal(struct dc_link *link, uint16_t vtotal) +void set_replay_low_rr_full_screen_video_src_vtotal(struct dc_link *link, uint16_t vtotal) { - link->replay_settings.abm_with_ips_on_full_screen_video_pseudo_vtotal = vtotal; + link->replay_settings.low_rr_full_screen_video_pseudo_vtotal = vtotal; } void calculate_replay_link_off_frame_count(struct dc_link *link, @@ -1039,3 +1039,8 @@ bool fill_custom_backlight_caps(unsigned int config_no, struct dm_acpi_atif_back memcpy(caps->data_points, custom_backlight_profiles[config_no].data_points, data_points_size); return true; } + +void reset_replay_dsync_error_count(struct dc_link *link) +{ + link->replay_settings.replay_desync_error_fail_count = 0; +} diff --git a/drivers/gpu/drm/amd/display/modules/power/power_helpers.h b/drivers/gpu/drm/amd/display/modules/power/power_helpers.h index cac302e8fa10..758a8aa31fbe 100644 --- a/drivers/gpu/drm/amd/display/modules/power/power_helpers.h +++ b/drivers/gpu/drm/amd/display/modules/power/power_helpers.h @@ -62,7 +62,7 @@ void set_replay_defer_update_coasting_vtotal(struct dc_link *link, uint32_t vtotal); void update_replay_coasting_vtotal_from_defer(struct dc_link *link, enum replay_coasting_vtotal_type type); -void set_replay_ips_full_screen_video_src_vtotal(struct dc_link *link, uint16_t vtotal); +void set_replay_low_rr_full_screen_video_src_vtotal(struct dc_link *link, uint16_t vtotal); void calculate_replay_link_off_frame_count(struct dc_link *link, uint16_t vtotal, uint16_t htotal); @@ -78,4 +78,5 @@ bool 
psr_su_set_dsc_slice_height(struct dc *dc, struct dc_link *link, bool fill_custom_backlight_caps(unsigned int config_no, struct dm_acpi_atif_backlight_caps *caps); +void reset_replay_dsync_error_count(struct dc_link *link); #endif /* MODULES_POWER_POWER_HELPERS_H_ */ diff --git a/drivers/gpu/drm/amd/include/amd_shared.h b/drivers/gpu/drm/amd/include/amd_shared.h index 7eefcb0f5070..05bdb4e020ae 100644 --- a/drivers/gpu/drm/amd/include/amd_shared.h +++ b/drivers/gpu/drm/amd/include/amd_shared.h @@ -344,6 +344,11 @@ enum DC_DEBUG_MASK { * eDP display from ACPI _DDC method. */ DC_DISABLE_ACPI_EDID = 0x8000, + + /* + * @DC_DISABLE_HDMI_CEC: If set, disable HDMI-CEC feature in amdgpu driver. + */ + DC_DISABLE_HDMI_CEC = 0x10000, }; enum amd_dpm_forced_level; @@ -401,9 +406,9 @@ struct amd_ip_funcs { int (*pre_soft_reset)(struct amdgpu_ip_block *ip_block); int (*soft_reset)(struct amdgpu_ip_block *ip_block); int (*post_soft_reset)(struct amdgpu_ip_block *ip_block); - int (*set_clockgating_state)(void *handle, + int (*set_clockgating_state)(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state); - int (*set_powergating_state)(void *handle, + int (*set_powergating_state)(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state); void (*get_clockgating_state)(void *handle, u64 *flags); void (*dump_ip_state)(struct amdgpu_ip_block *ip_block); diff --git a/drivers/gpu/drm/amd/include/asic_reg/dcn/dcn_2_0_3_offset.h b/drivers/gpu/drm/amd/include/asic_reg/dcn/dcn_2_0_1_offset.h similarity index 99% rename from drivers/gpu/drm/amd/include/asic_reg/dcn/dcn_2_0_3_offset.h rename to drivers/gpu/drm/amd/include/asic_reg/dcn/dcn_2_0_1_offset.h index cae1a7e74323..73c5dd5e83d4 100644 --- a/drivers/gpu/drm/amd/include/asic_reg/dcn/dcn_2_0_3_offset.h +++ b/drivers/gpu/drm/amd/include/asic_reg/dcn/dcn_2_0_1_offset.h @@ -19,8 +19,8 @@ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ -#ifndef _dcn_2_0_3_OFFSET_HEADER -#define _dcn_2_0_3_OFFSET_HEADER +#ifndef _dcn_2_0_1_OFFSET_HEADER +#define _dcn_2_0_1_OFFSET_HEADER // addressBlock: dce_dc_dccg_dccg_dispdec diff --git a/drivers/gpu/drm/amd/include/asic_reg/dcn/dcn_2_0_3_sh_mask.h b/drivers/gpu/drm/amd/include/asic_reg/dcn/dcn_2_0_1_sh_mask.h similarity index 99% rename from drivers/gpu/drm/amd/include/asic_reg/dcn/dcn_2_0_3_sh_mask.h rename to drivers/gpu/drm/amd/include/asic_reg/dcn/dcn_2_0_1_sh_mask.h index ca1e1eb39256..290d807800a6 100644 --- a/drivers/gpu/drm/amd/include/asic_reg/dcn/dcn_2_0_3_sh_mask.h +++ b/drivers/gpu/drm/amd/include/asic_reg/dcn/dcn_2_0_1_sh_mask.h @@ -18,8 +18,8 @@ * AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ -#ifndef _dcn_2_0_3_SH_MASK_HEADER -#define _dcn_2_0_3_SH_MASK_HEADER +#ifndef _dcn_2_0_1_SH_MASK_HEADER +#define _dcn_2_0_1_SH_MASK_HEADER // addressBlock: dce_dc_dccg_dccg_dispdec diff --git a/drivers/gpu/drm/amd/include/asic_reg/umc/umc_8_14_0_offset.h b/drivers/gpu/drm/amd/include/asic_reg/umc/umc_8_14_0_offset.h new file mode 100644 index 000000000000..0e8f12728d5f --- /dev/null +++ b/drivers/gpu/drm/amd/include/asic_reg/umc/umc_8_14_0_offset.h @@ -0,0 +1,29 @@ +/* + * Copyright (C) 2024 Advanced Micro Devices, Inc. + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included + * in all copies or substantial portions of the Software. 
+ * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS + * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE COPYRIGHT HOLDER(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN + * AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + */ +#ifndef _umc_8_14_0_OFFSET_HEADER +#define _umc_8_14_0_OFFSET_HEADER + +#define regUMCCH0_GeccErrCntSel 0x0328 +#define regUMCCH0_GeccErrCntSel_BASE_IDX 0 +#define regUMCCH0_GeccErrCnt 0x0329 +#define regUMCCH0_GeccErrCnt_BASE_IDX 0 + +#endif diff --git a/drivers/gpu/drm/amd/include/asic_reg/umc/umc_8_14_0_sh_mask.h b/drivers/gpu/drm/amd/include/asic_reg/umc/umc_8_14_0_sh_mask.h new file mode 100644 index 000000000000..5d723b5d9b87 --- /dev/null +++ b/drivers/gpu/drm/amd/include/asic_reg/umc/umc_8_14_0_sh_mask.h @@ -0,0 +1,37 @@ +/* + * Copyright (C) 2024 Advanced Micro Devices, Inc. + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included + * in all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS + * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL + * THE COPYRIGHT HOLDER(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN + * AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + */ +#ifndef _umc_8_14_0_SH_MASK_HEADER +#define _umc_8_14_0_SH_MASK_HEADER + +//UMCCH0_GeccErrCntSel +#define UMCCH0_GeccErrCntSel__GeccErrInt__SHIFT 0xc +#define UMCCH0_GeccErrCntSel__GeccErrCntEn__SHIFT 0xf +#define UMCCH0_GeccErrCntSel__PoisonCntEn__SHIFT 0x10 +#define UMCCH0_GeccErrCntSel__GeccErrInt_MASK 0x00003000L +#define UMCCH0_GeccErrCntSel__GeccErrCntEn_MASK 0x00008000L +#define UMCCH0_GeccErrCntSel__PoisonCntEn_MASK 0x00030000L +//UMCCH0_GeccErrCnt +#define UMCCH0_GeccErrCnt__GeccErrCnt__SHIFT 0x0 +#define UMCCH0_GeccErrCnt__GeccUnCorrErrCnt__SHIFT 0x10 +#define UMCCH0_GeccErrCnt__GeccErrCnt_MASK 0x0000FFFFL +#define UMCCH0_GeccErrCnt__GeccUnCorrErrCnt_MASK 0xFFFF0000L + +#endif diff --git a/drivers/gpu/drm/amd/include/atomfirmware.h b/drivers/gpu/drm/amd/include/atomfirmware.h index b0fc22383e28..0160d65f3f5e 100644 --- a/drivers/gpu/drm/amd/include/atomfirmware.h +++ b/drivers/gpu/drm/amd/include/atomfirmware.h @@ -1300,12 +1300,17 @@ struct atom_ext_display_path //usCaps enum ext_display_path_cap_def { - EXT_DISPLAY_PATH_CAPS__HBR2_DISABLE = 0x0001, - EXT_DISPLAY_PATH_CAPS__DP_FIXED_VS_EN = 0x0002, - EXT_DISPLAY_PATH_CAPS__EXT_CHIP_MASK = 0x007C, - EXT_DISPLAY_PATH_CAPS__HDMI20_PI3EQX1204 = (0x01 << 2), //PI redriver chip - EXT_DISPLAY_PATH_CAPS__HDMI20_TISN65DP159RSBT = (0x02 << 2), //TI retimer chip - EXT_DISPLAY_PATH_CAPS__HDMI20_PARADE_PS175 = (0x03 << 2) //Parade DP->HDMI recoverter chip + EXT_DISPLAY_PATH_CAPS__EXT_CHIP_MASK = 0x007E, + AMD_EXT_DISPLAY_PATH_CAPS__EXT_CHIP_MASK = 0x007E, + AMD_EXT_DISPLAY_PATH_CAPS__DP_FIXED_VS_EN = (0x01 << 1), + AMD_EXT_DISPLAY_PATH_CAPS__HDMI20_PI3EQX1204 = (0x02 << 1), + AMD_EXT_DISPLAY_PATH_CAPS__DP_EARLY_8B10B_TPS2 = (0x03 << 1), + 
AMD_EXT_DISPLAY_PATH_CAPS__HDMI20_TISN65DP159RSBT = (0x04 << 1), + AMD_EXT_DISPLAY_PATH_CAPS__HDMI20_PARADE_PS175 = (0x06 << 1), + EXT_DISPLAY_PATH_CAPS__DP_FIXED_VS_EN = (0x07 << 1), + EXT_DISPLAY_PATH_CAPS__HDMI20_PI3EQX1204 = (0x08 << 1), //PI redriver chip + EXT_DISPLAY_PATH_CAPS__HDMI20_TISN65DP159RSBT = (0x09 << 1), //TI retimer chip + EXT_DISPLAY_PATH_CAPS__AMD_INTERNAL = (0x0a << 1), //AMD internal customer chip placeholder }; struct atom_external_display_connection_info diff --git a/drivers/gpu/drm/amd/include/ivsrcid/vcn/irqsrcs_vcn_5_0.h b/drivers/gpu/drm/amd/include/ivsrcid/vcn/irqsrcs_vcn_5_0.h new file mode 100644 index 000000000000..64b553e7de1a --- /dev/null +++ b/drivers/gpu/drm/amd/include/ivsrcid/vcn/irqsrcs_vcn_5_0.h @@ -0,0 +1,47 @@ +/* SPDX-License-Identifier: GPL-2.0 */ + +/* + * Copyright 2024 Advanced Micro Devices, Inc. All rights reserved. + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * OTHER DEALINGS IN THE SOFTWARE. 
+ */ + +#ifndef __IRQSRCS_VCN_5_0_H__ +#define __IRQSRCS_VCN_5_0_H__ + +#define VCN_5_0__SRCID__UVD_TRAP 114 // 0x72 UVD_TRAP +#define VCN_5_0__SRCID__UVD_ENC_GENERAL_PURPOSE 119 // 0x77 Encoder General Purpose +#define VCN_5_0__SRCID__UVD_ENC_LOW_LATENCY 120 // 0x78 Encoder Low Latency +#define VCN_5_0__SRCID__UVD_SYSTEM_MESSAGE_INTERRUPT 124 // 0x7c UVD system message interrupt +#define VCN_5_0__SRCID__JPEG_ENCODE 151 // 0x97 JRBC Encode interrupt +#define VCN_5_0__SRCID__JPEG_DECODE 153 // 0x99 JRBC Decode interrupt +#define VCN_5_0__SRCID__JPEG1_DECODE 149 // 0x95 JRBC1 Decode interrupt +#define VCN_5_0__SRCID__JPEG2_DECODE 151 // 0x97 JRBC2 Decode interrupt +#define VCN_5_0__SRCID__JPEG3_DECODE 171 // 0xab JRBC3 Decode interrupt +#define VCN_5_0__SRCID__JPEG4_DECODE 172 // 0xac JRBC4 Decode interrupt +#define VCN_5_0__SRCID__JPEG5_DECODE 173 // 0xad JRBC5 Decode interrupt +#define VCN_5_0__SRCID__JPEG6_DECODE 174 // 0xae JRBC6 Decode interrupt +#define VCN_5_0__SRCID__JPEG7_DECODE 175 // 0xaf JRBC7 Decode interrupt +#define VCN_5_0__SRCID__JPEG8_DECODE 177 // 0xb1 JRBC8 Decode interrupt +#define VCN_5_0__SRCID__JPEG9_DECODE 178 // 0xb2 JRBC9 Decode interrupt + +#define VCN_5_0__SRCID_UVD_POISON 160 +#define VCN_5_0__SRCID_DJPEG0_POISON 161 +#define VCN_5_0__SRCID_EJPEG0_POISON 162 +#endif diff --git a/drivers/gpu/drm/amd/include/kgd_pp_interface.h b/drivers/gpu/drm/amd/include/kgd_pp_interface.h index 67a5de573943..9189dcb65188 100644 --- a/drivers/gpu/drm/amd/include/kgd_pp_interface.h +++ b/drivers/gpu/drm/amd/include/kgd_pp_interface.h @@ -164,6 +164,7 @@ enum amd_pp_task { }; enum PP_SMC_POWER_PROFILE { + PP_SMC_POWER_PROFILE_UNKNOWN = -1, PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT = 0x0, PP_SMC_POWER_PROFILE_FULLSCREEN3D = 0x1, PP_SMC_POWER_PROFILE_POWERSAVING = 0x2, @@ -420,7 +421,9 @@ struct amd_pm_funcs { int (*load_firmware)(void *handle); int (*wait_for_fw_loading_complete)(void *handle); int (*set_powergating_by_smu)(void *handle, - uint32_t block_type, 
bool gate); + uint32_t block_type, + bool gate, + int inst); int (*set_clockgating_by_smu)(void *handle, uint32_t msg_id); int (*set_power_limit)(void *handle, uint32_t n); int (*get_power_limit)(void *handle, uint32_t *limit, diff --git a/drivers/gpu/drm/amd/pm/amdgpu_dpm.c b/drivers/gpu/drm/amd/pm/amdgpu_dpm.c index 9dc82f4d7c93..6a9e26905edf 100644 --- a/drivers/gpu/drm/amd/pm/amdgpu_dpm.c +++ b/drivers/gpu/drm/amd/pm/amdgpu_dpm.c @@ -70,13 +70,18 @@ int amdgpu_dpm_get_mclk(struct amdgpu_device *adev, bool low) return ret; } -int amdgpu_dpm_set_powergating_by_smu(struct amdgpu_device *adev, uint32_t block_type, bool gate) +int amdgpu_dpm_set_powergating_by_smu(struct amdgpu_device *adev, + uint32_t block_type, + bool gate, + int inst) { int ret = 0; const struct amd_pm_funcs *pp_funcs = adev->powerplay.pp_funcs; enum ip_power_state pwr_state = gate ? POWER_STATE_OFF : POWER_STATE_ON; + bool is_vcn = (block_type == AMD_IP_BLOCK_TYPE_UVD || block_type == AMD_IP_BLOCK_TYPE_VCN); - if (atomic_read(&adev->pm.pwr_state[block_type]) == pwr_state) { + if (atomic_read(&adev->pm.pwr_state[block_type]) == pwr_state && + (!is_vcn || adev->vcn.num_vcn_inst == 1)) { dev_dbg(adev->dev, "IP block%d already in the target %s state!", block_type, gate ? 
"gate" : "ungate"); return 0; @@ -88,7 +93,6 @@ int amdgpu_dpm_set_powergating_by_smu(struct amdgpu_device *adev, uint32_t block case AMD_IP_BLOCK_TYPE_UVD: case AMD_IP_BLOCK_TYPE_VCE: case AMD_IP_BLOCK_TYPE_GFX: - case AMD_IP_BLOCK_TYPE_VCN: case AMD_IP_BLOCK_TYPE_SDMA: case AMD_IP_BLOCK_TYPE_JPEG: case AMD_IP_BLOCK_TYPE_GMC: @@ -96,7 +100,12 @@ int amdgpu_dpm_set_powergating_by_smu(struct amdgpu_device *adev, uint32_t block case AMD_IP_BLOCK_TYPE_VPE: if (pp_funcs && pp_funcs->set_powergating_by_smu) ret = (pp_funcs->set_powergating_by_smu( - (adev)->powerplay.pp_handle, block_type, gate)); + (adev)->powerplay.pp_handle, block_type, gate, 0)); + break; + case AMD_IP_BLOCK_TYPE_VCN: + if (pp_funcs && pp_funcs->set_powergating_by_smu) + ret = (pp_funcs->set_powergating_by_smu( + (adev)->powerplay.pp_handle, block_type, gate, inst)); break; default: break; @@ -566,7 +575,17 @@ void amdgpu_dpm_enable_uvd(struct amdgpu_device *adev, bool enable) return; } - ret = amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_UVD, !enable); + ret = amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_UVD, !enable, 0); + if (ret) + DRM_ERROR("Dpm %s uvd failed, ret = %d. \n", + enable ? "enable" : "disable", ret); +} + +void amdgpu_dpm_enable_vcn(struct amdgpu_device *adev, bool enable, int inst) +{ + int ret = 0; + + ret = amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_VCN, !enable, inst); if (ret) DRM_ERROR("Dpm %s uvd failed, ret = %d. \n", enable ? "enable" : "disable", ret); @@ -591,7 +610,7 @@ void amdgpu_dpm_enable_vce(struct amdgpu_device *adev, bool enable) return; } - ret = amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_VCE, !enable); + ret = amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_VCE, !enable, 0); if (ret) DRM_ERROR("Dpm %s vce failed, ret = %d. \n", enable ? 
"enable" : "disable", ret); @@ -601,7 +620,7 @@ void amdgpu_dpm_enable_jpeg(struct amdgpu_device *adev, bool enable) { int ret = 0; - ret = amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_JPEG, !enable); + ret = amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_JPEG, !enable, 0); if (ret) DRM_ERROR("Dpm %s jpeg failed, ret = %d. \n", enable ? "enable" : "disable", ret); @@ -611,7 +630,7 @@ void amdgpu_dpm_enable_vpe(struct amdgpu_device *adev, bool enable) { int ret = 0; - ret = amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_VPE, !enable); + ret = amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_VPE, !enable, 0); if (ret) DRM_ERROR("Dpm %s vpe failed, ret = %d.\n", enable ? "enable" : "disable", ret); @@ -700,6 +719,21 @@ int amdgpu_dpm_send_rma_reason(struct amdgpu_device *adev) return ret; } +int amdgpu_dpm_reset_sdma(struct amdgpu_device *adev, uint32_t inst_mask) +{ + struct smu_context *smu = adev->powerplay.pp_handle; + int ret; + + if (!is_support_sw_smu(adev)) + return -EOPNOTSUPP; + + mutex_lock(&adev->pm.mutex); + ret = smu_reset_sdma(smu, inst_mask); + mutex_unlock(&adev->pm.mutex); + + return ret; +} + int amdgpu_dpm_get_dpm_freq_range(struct amdgpu_device *adev, enum pp_clock_type type, uint32_t *min, @@ -953,6 +987,24 @@ enum amd_dpm_forced_level amdgpu_dpm_get_performance_level(struct amdgpu_device return level; } +static void amdgpu_dpm_enter_umd_state(struct amdgpu_device *adev) +{ + /* enter UMD Pstate */ + amdgpu_device_ip_set_powergating_state(adev, AMD_IP_BLOCK_TYPE_GFX, + AMD_PG_STATE_UNGATE); + amdgpu_device_ip_set_clockgating_state(adev, AMD_IP_BLOCK_TYPE_GFX, + AMD_CG_STATE_UNGATE); +} + +static void amdgpu_dpm_exit_umd_state(struct amdgpu_device *adev) +{ + /* exit UMD Pstate */ + amdgpu_device_ip_set_clockgating_state(adev, AMD_IP_BLOCK_TYPE_GFX, + AMD_CG_STATE_GATE); + amdgpu_device_ip_set_powergating_state(adev, AMD_IP_BLOCK_TYPE_GFX, + AMD_PG_STATE_GATE); +} + int 
amdgpu_dpm_force_performance_level(struct amdgpu_device *adev, enum amd_dpm_forced_level level) { @@ -973,6 +1025,10 @@ int amdgpu_dpm_force_performance_level(struct amdgpu_device *adev, if (current_level == level) return 0; + if (!(current_level & profile_mode_mask) && + (level == AMD_DPM_FORCED_LEVEL_PROFILE_EXIT)) + return -EINVAL; + if (adev->asic_type == CHIP_RAVEN) { if (!(adev->apu_flags & AMD_APU_IS_RAVEN2)) { if (current_level != AMD_DPM_FORCED_LEVEL_MANUAL && @@ -984,35 +1040,25 @@ int amdgpu_dpm_force_performance_level(struct amdgpu_device *adev, } } - if (!(current_level & profile_mode_mask) && - (level == AMD_DPM_FORCED_LEVEL_PROFILE_EXIT)) - return -EINVAL; - - if (!(current_level & profile_mode_mask) && - (level & profile_mode_mask)) { - /* enter UMD Pstate */ - amdgpu_device_ip_set_powergating_state(adev, - AMD_IP_BLOCK_TYPE_GFX, - AMD_PG_STATE_UNGATE); - amdgpu_device_ip_set_clockgating_state(adev, - AMD_IP_BLOCK_TYPE_GFX, - AMD_CG_STATE_UNGATE); - } else if ((current_level & profile_mode_mask) && - !(level & profile_mode_mask)) { - /* exit UMD Pstate */ - amdgpu_device_ip_set_clockgating_state(adev, - AMD_IP_BLOCK_TYPE_GFX, - AMD_CG_STATE_GATE); - amdgpu_device_ip_set_powergating_state(adev, - AMD_IP_BLOCK_TYPE_GFX, - AMD_PG_STATE_GATE); - } + if (!(current_level & profile_mode_mask) && (level & profile_mode_mask)) + amdgpu_dpm_enter_umd_state(adev); + else if ((current_level & profile_mode_mask) && + !(level & profile_mode_mask)) + amdgpu_dpm_exit_umd_state(adev); mutex_lock(&adev->pm.mutex); if (pp_funcs->force_performance_level(adev->powerplay.pp_handle, level)) { mutex_unlock(&adev->pm.mutex); + /* If new level failed, retain the umd state as before */ + if (!(current_level & profile_mode_mask) && + (level & profile_mode_mask)) + amdgpu_dpm_exit_umd_state(adev); + else if ((current_level & profile_mode_mask) && + !(level & profile_mode_mask)) + amdgpu_dpm_enter_umd_state(adev); + return -EINVAL; } diff --git 
a/drivers/gpu/drm/amd/pm/amdgpu_pm.c b/drivers/gpu/drm/amd/pm/amdgpu_pm.c index 136e8193867c..e8ae7681bf0a 100644 --- a/drivers/gpu/drm/amd/pm/amdgpu_pm.c +++ b/drivers/gpu/drm/amd/pm/amdgpu_pm.c @@ -1361,7 +1361,11 @@ static ssize_t amdgpu_set_pp_mclk_od(struct device *dev, * create a custom set of heuristics, write a string of numbers to the file * starting with the number of the custom profile along with a setting * for each heuristic parameter. Due to differences across asic families - * the heuristic parameters vary from family to family. + * the heuristic parameters vary from family to family. Additionally, + * you can apply the custom heuristics to different clock domains. Each + * clock domain is considered a distinct operation so if you modify the + * gfxclk heuristics and then the memclk heuristics, all of the + * custom heuristics will be retained until you switch to another profile. * */ diff --git a/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h b/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h index 363af8990aa2..1f5ac7e0230d 100644 --- a/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h +++ b/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h @@ -397,7 +397,7 @@ int amdgpu_dpm_get_apu_thermal_limit(struct amdgpu_device *adev, uint32_t *limit int amdgpu_dpm_set_apu_thermal_limit(struct amdgpu_device *adev, uint32_t limit); int amdgpu_dpm_set_powergating_by_smu(struct amdgpu_device *adev, - uint32_t block_type, bool gate); + uint32_t block_type, bool gate, int inst); extern int amdgpu_dpm_get_sclk(struct amdgpu_device *adev, bool low); @@ -446,6 +446,7 @@ void amdgpu_pm_acpi_event_handler(struct amdgpu_device *adev); void amdgpu_dpm_compute_clocks(struct amdgpu_device *adev); void amdgpu_dpm_enable_uvd(struct amdgpu_device *adev, bool enable); +void amdgpu_dpm_enable_vcn(struct amdgpu_device *adev, bool enable, int inst); void amdgpu_dpm_enable_vce(struct amdgpu_device *adev, bool enable); void amdgpu_dpm_enable_jpeg(struct amdgpu_device *adev, bool enable); void
amdgpu_dpm_enable_vpe(struct amdgpu_device *adev, bool enable); @@ -601,5 +602,6 @@ int amdgpu_dpm_set_pm_policy(struct amdgpu_device *adev, int policy_type, int policy_level); ssize_t amdgpu_dpm_get_pm_policy_info(struct amdgpu_device *adev, enum pp_pm_policy p_type, char *buf); +int amdgpu_dpm_reset_sdma(struct amdgpu_device *adev, uint32_t inst_mask); #endif diff --git a/drivers/gpu/drm/amd/pm/legacy-dpm/kv_dpm.c b/drivers/gpu/drm/amd/pm/legacy-dpm/kv_dpm.c index 8908646ad620..67a8e22b1126 100644 --- a/drivers/gpu/drm/amd/pm/legacy-dpm/kv_dpm.c +++ b/drivers/gpu/drm/amd/pm/legacy-dpm/kv_dpm.c @@ -3177,13 +3177,13 @@ static int kv_dpm_process_interrupt(struct amdgpu_device *adev, return 0; } -static int kv_dpm_set_clockgating_state(void *handle, +static int kv_dpm_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state) { return 0; } -static int kv_dpm_set_powergating_state(void *handle, +static int kv_dpm_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state) { return 0; @@ -3276,7 +3276,9 @@ static int kv_dpm_read_sensor(void *handle, int idx, } static int kv_set_powergating_by_smu(void *handle, - uint32_t block_type, bool gate) + uint32_t block_type, + bool gate, + int inst) { switch (block_type) { case AMD_IP_BLOCK_TYPE_UVD: diff --git a/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c b/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c index ee23a0f897c5..a87dcf0974bc 100644 --- a/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c +++ b/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c @@ -7709,7 +7709,8 @@ static int si_dpm_init_microcode(struct amdgpu_device *adev) default: BUG(); } - err = amdgpu_ucode_request(adev, &adev->pm.fw, "amdgpu/%s_smc.bin", chip_name); + err = amdgpu_ucode_request(adev, &adev->pm.fw, AMDGPU_UCODE_REQUIRED, + "amdgpu/%s_smc.bin", chip_name); if (err) { DRM_ERROR("si_smc: Failed to load firmware. 
err = %d\"%s_smc.bin\"\n", err, chip_name); @@ -7849,13 +7850,13 @@ static int si_dpm_wait_for_idle(struct amdgpu_ip_block *ip_block) return 0; } -static int si_dpm_set_clockgating_state(void *handle, +static int si_dpm_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state) { return 0; } -static int si_dpm_set_powergating_state(void *handle, +static int si_dpm_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state) { return 0; diff --git a/drivers/gpu/drm/amd/pm/powerplay/amd_powerplay.c b/drivers/gpu/drm/amd/pm/powerplay/amd_powerplay.c index 26624a716fc6..686345f75f26 100644 --- a/drivers/gpu/drm/amd/pm/powerplay/amd_powerplay.c +++ b/drivers/gpu/drm/amd/pm/powerplay/amd_powerplay.c @@ -244,7 +244,7 @@ static bool pp_is_idle(void *handle) return false; } -static int pp_set_powergating_state(void *handle, +static int pp_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state) { return 0; @@ -267,7 +267,7 @@ static int pp_resume(struct amdgpu_ip_block *ip_block) return hwmgr_resume(hwmgr); } -static int pp_set_clockgating_state(void *handle, +static int pp_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state) { return 0; @@ -1227,7 +1227,9 @@ static void pp_dpm_powergate_sdma(void *handle, bool gate) } static int pp_set_powergating_by_smu(void *handle, - uint32_t block_type, bool gate) + uint32_t block_type, + bool gate, + int inst) { int ret = 0; diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/ppatomctrl.c b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/ppatomctrl.c index fe24219c3bf4..4bd92fd782be 100644 --- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/ppatomctrl.c +++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/ppatomctrl.c @@ -992,6 +992,8 @@ int atomctrl_get_smc_sclk_range_table(struct pp_hwmgr *hwmgr, struct pp_atom_ctr GetIndexIntoMasterTable(DATA, SMU_Info), &size, &frev, &crev); + if (!psmu_info) + return -EINVAL; for (i = 0; i < 
psmu_info->ucSclkEntryNum; i++) { table->entry[i].ucVco_setting = psmu_info->asSclkFcwRangeEntry[i].ucVco_setting; diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_powertune.c b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_powertune.c index 3007b054c873..776d58ea63ae 100644 --- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_powertune.c +++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_powertune.c @@ -1120,13 +1120,14 @@ static int vega10_enable_se_edc_force_stall_config(struct pp_hwmgr *hwmgr) result = vega10_program_didt_config_registers(hwmgr, SEEDCForceStallPatternConfig_Vega10, VEGA10_CONFIGREG_DIDT); result |= vega10_program_didt_config_registers(hwmgr, SEEDCCtrlForceStallConfig_Vega10, VEGA10_CONFIGREG_DIDT); if (0 != result) - return result; + goto exit_safe_mode; vega10_didt_set_mask(hwmgr, false); +exit_safe_mode: amdgpu_gfx_rlc_exit_safe_mode(adev, 0); - return 0; + return result; } static int vega10_disable_se_edc_force_stall_config(struct pp_hwmgr *hwmgr) diff --git a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c index b8355293518f..8ca793c222ff 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c +++ b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c @@ -72,6 +72,10 @@ static int smu_set_power_limit(void *handle, uint32_t limit); static int smu_set_fan_speed_rpm(void *handle, uint32_t speed); static int smu_set_gfx_cgpg(struct smu_context *smu, bool enabled); static int smu_set_mp1_state(void *handle, enum pp_mp1_state mp1_state); +static void smu_power_profile_mode_get(struct smu_context *smu, + enum PP_SMC_POWER_PROFILE profile_mode); +static void smu_power_profile_mode_put(struct smu_context *smu, + enum PP_SMC_POWER_PROFILE profile_mode); static int smu_sys_get_pp_feature_mask(void *handle, char *buf) @@ -234,7 +238,8 @@ static bool is_vcn_enabled(struct amdgpu_device *adev) } static int smu_dpm_set_vcn_enable(struct smu_context *smu, - bool enable) + bool enable, + int inst) { struct 
smu_power_context *smu_power = &smu->smu_power; struct smu_power_gate *power_gate = &smu_power->power_gate; @@ -249,12 +254,12 @@ static int smu_dpm_set_vcn_enable(struct smu_context *smu, if (!smu->ppt_funcs->dpm_set_vcn_enable) return 0; - if (atomic_read(&power_gate->vcn_gated) ^ enable) + if (atomic_read(&power_gate->vcn_gated[inst]) ^ enable) return 0; - ret = smu->ppt_funcs->dpm_set_vcn_enable(smu, enable, 0xff); + ret = smu->ppt_funcs->dpm_set_vcn_enable(smu, enable, inst); if (!ret) - atomic_set(&power_gate->vcn_gated, !enable); + atomic_set(&power_gate->vcn_gated[inst], !enable); return ret; } @@ -341,8 +346,9 @@ static int smu_set_mall_enable(struct smu_context *smu) * smu_dpm_set_power_gate - power gate/ungate the specific IP block * * @handle: smu_context pointer - * @block_type: the IP block to power gate/ungate - * @gate: to power gate if true, ungate otherwise + * @block_type: the IP block to power gate/ungate + * @gate: to power gate if true, ungate otherwise + * @inst: the instance of the IP block to power gate/ungate * * This API uses no smu->mutex lock protection due to: * 1. It is either called by other IP block(gfx/sdma/vcn/uvd/vce). @@ -353,7 +359,8 @@ static int smu_set_mall_enable(struct smu_context *smu) */ static int smu_dpm_set_power_gate(void *handle, uint32_t block_type, - bool gate) + bool gate, + int inst) { struct smu_context *smu = handle; int ret = 0; @@ -372,10 +379,10 @@ static int smu_dpm_set_power_gate(void *handle, */ case AMD_IP_BLOCK_TYPE_UVD: case AMD_IP_BLOCK_TYPE_VCN: - ret = smu_dpm_set_vcn_enable(smu, !gate); + ret = smu_dpm_set_vcn_enable(smu, !gate, inst); if (ret) - dev_err(smu->adev->dev, "Failed to power %s VCN!\n", - gate ? "gate" : "ungate"); + dev_err(smu->adev->dev, "Failed to power %s VCN instance %d!\n", + gate ? 
"gate" : "ungate", inst); break; case AMD_IP_BLOCK_TYPE_GFX: ret = smu_gfx_off_control(smu, gate); @@ -720,6 +727,7 @@ static int smu_set_funcs(struct amdgpu_device *adev) break; case IP_VERSION(13, 0, 6): case IP_VERSION(13, 0, 14): + case IP_VERSION(13, 0, 12): smu_v13_0_6_set_ppt_funcs(smu); /* Enable pp_od_clk_voltage node */ smu->od_enabled = true; @@ -760,6 +768,7 @@ static int smu_early_init(struct amdgpu_ip_block *ip_block) smu->smu_baco.platform_support = false; smu->smu_baco.maco_support = false; smu->user_dpm_profile.fan_mode = -1; + smu->power_profile_mode = PP_SMC_POWER_PROFILE_UNKNOWN; mutex_init(&smu->message_lock); @@ -777,21 +786,25 @@ static int smu_set_default_dpm_table(struct smu_context *smu) struct amdgpu_device *adev = smu->adev; struct smu_power_context *smu_power = &smu->smu_power; struct smu_power_gate *power_gate = &smu_power->power_gate; - int vcn_gate, jpeg_gate; + int vcn_gate[AMDGPU_MAX_VCN_INSTANCES], jpeg_gate, i; int ret = 0; if (!smu->ppt_funcs->set_default_dpm_table) return 0; - if (adev->pg_flags & AMD_PG_SUPPORT_VCN) - vcn_gate = atomic_read(&power_gate->vcn_gated); + if (adev->pg_flags & AMD_PG_SUPPORT_VCN) { + for (i = 0; i < adev->vcn.num_vcn_inst; i++) + vcn_gate[i] = atomic_read(&power_gate->vcn_gated[i]); + } if (adev->pg_flags & AMD_PG_SUPPORT_JPEG) jpeg_gate = atomic_read(&power_gate->jpeg_gated); if (adev->pg_flags & AMD_PG_SUPPORT_VCN) { - ret = smu_dpm_set_vcn_enable(smu, true); - if (ret) - return ret; + for (i = 0; i < adev->vcn.num_vcn_inst; i++) { + ret = smu_dpm_set_vcn_enable(smu, true, i); + if (ret) + return ret; + } } if (adev->pg_flags & AMD_PG_SUPPORT_JPEG) { @@ -808,8 +821,10 @@ static int smu_set_default_dpm_table(struct smu_context *smu) if (adev->pg_flags & AMD_PG_SUPPORT_JPEG) smu_dpm_set_jpeg_enable(smu, !jpeg_gate); err_out: - if (adev->pg_flags & AMD_PG_SUPPORT_VCN) - smu_dpm_set_vcn_enable(smu, !vcn_gate); + if (adev->pg_flags & AMD_PG_SUPPORT_VCN) { + for (i = 0; i < adev->vcn.num_vcn_inst; i++) 
+ smu_dpm_set_vcn_enable(smu, !vcn_gate[i], i); + } return ret; } @@ -1244,11 +1259,26 @@ static bool smu_is_workload_profile_available(struct smu_context *smu, return smu->workload_map && smu->workload_map[profile].valid_mapping; } +static void smu_init_power_profile(struct smu_context *smu) +{ + if (smu->power_profile_mode == PP_SMC_POWER_PROFILE_UNKNOWN) { + if (smu->is_apu || + !smu_is_workload_profile_available( + smu, PP_SMC_POWER_PROFILE_FULLSCREEN3D)) + smu->power_profile_mode = + PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT; + else + smu->power_profile_mode = + PP_SMC_POWER_PROFILE_FULLSCREEN3D; + } + smu_power_profile_mode_get(smu, smu->power_profile_mode); +} + static int smu_sw_init(struct amdgpu_ip_block *ip_block) { struct amdgpu_device *adev = ip_block->adev; struct smu_context *smu = adev->powerplay.pp_handle; - int ret; + int i, ret; smu->pool_size = adev->pm.smu_prv_buffer_size; smu->smu_feature.feature_num = SMU_FEATURE_MAX; @@ -1259,42 +1289,14 @@ static int smu_sw_init(struct amdgpu_ip_block *ip_block) INIT_WORK(&smu->interrupt_work, smu_interrupt_work_fn); atomic64_set(&smu->throttle_int_counter, 0); smu->watermarks_bitmap = 0; - smu->power_profile_mode = PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT; - smu->default_power_profile_mode = PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT; - smu->user_dpm_profile.user_workload_mask = 0; - atomic_set(&smu->smu_power.power_gate.vcn_gated, 1); + for (i = 0; i < adev->vcn.num_vcn_inst; i++) + atomic_set(&smu->smu_power.power_gate.vcn_gated[i], 1); atomic_set(&smu->smu_power.power_gate.jpeg_gated, 1); atomic_set(&smu->smu_power.power_gate.vpe_gated, 1); atomic_set(&smu->smu_power.power_gate.umsch_mm_gated, 1); - smu->workload_priority[PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT] = 0; - smu->workload_priority[PP_SMC_POWER_PROFILE_FULLSCREEN3D] = 1; - smu->workload_priority[PP_SMC_POWER_PROFILE_POWERSAVING] = 2; - smu->workload_priority[PP_SMC_POWER_PROFILE_VIDEO] = 3; - smu->workload_priority[PP_SMC_POWER_PROFILE_VR] = 4; - 
smu->workload_priority[PP_SMC_POWER_PROFILE_COMPUTE] = 5; - smu->workload_priority[PP_SMC_POWER_PROFILE_CUSTOM] = 6; - - if (smu->is_apu || - !smu_is_workload_profile_available(smu, PP_SMC_POWER_PROFILE_FULLSCREEN3D)) { - smu->driver_workload_mask = - 1 << smu->workload_priority[PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT]; - } else { - smu->driver_workload_mask = - 1 << smu->workload_priority[PP_SMC_POWER_PROFILE_FULLSCREEN3D]; - smu->default_power_profile_mode = PP_SMC_POWER_PROFILE_FULLSCREEN3D; - } - - smu->workload_mask = smu->driver_workload_mask | - smu->user_dpm_profile.user_workload_mask; - smu->workload_setting[0] = PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT; - smu->workload_setting[1] = PP_SMC_POWER_PROFILE_FULLSCREEN3D; - smu->workload_setting[2] = PP_SMC_POWER_PROFILE_POWERSAVING; - smu->workload_setting[3] = PP_SMC_POWER_PROFILE_VIDEO; - smu->workload_setting[4] = PP_SMC_POWER_PROFILE_VR; - smu->workload_setting[5] = PP_SMC_POWER_PROFILE_COMPUTE; - smu->workload_setting[6] = PP_SMC_POWER_PROFILE_CUSTOM; + smu_init_power_profile(smu); smu->display_config = &adev->pm.pm_display_cfg; smu->smu_dpm.dpm_level = AMD_DPM_FORCED_LEVEL_AUTO; @@ -1347,6 +1349,11 @@ static int smu_sw_fini(struct amdgpu_ip_block *ip_block) return ret; } + if (smu->custom_profile_params) { + kfree(smu->custom_profile_params); + smu->custom_profile_params = NULL; + } + smu_fini_microcode(smu); return 0; @@ -1814,7 +1821,7 @@ static int smu_start_smc_engine(struct smu_context *smu) static int smu_hw_init(struct amdgpu_ip_block *ip_block) { - int ret; + int i, ret; struct amdgpu_device *adev = ip_block->adev; struct smu_context *smu = adev->powerplay.pp_handle; @@ -1840,7 +1847,8 @@ static int smu_hw_init(struct amdgpu_ip_block *ip_block) ret = smu_set_gfx_imu_enable(smu); if (ret) return ret; - smu_dpm_set_vcn_enable(smu, true); + for (i = 0; i < adev->vcn.num_vcn_inst; i++) + smu_dpm_set_vcn_enable(smu, true, i); smu_dpm_set_jpeg_enable(smu, true); smu_dpm_set_vpe_enable(smu, true); 
smu_dpm_set_umsch_mm_enable(smu, true); @@ -2038,12 +2046,13 @@ static int smu_hw_fini(struct amdgpu_ip_block *ip_block) { struct amdgpu_device *adev = ip_block->adev; struct smu_context *smu = adev->powerplay.pp_handle; - int ret; + int i, ret; if (amdgpu_sriov_vf(adev) && !amdgpu_sriov_is_pp_one_vf(adev)) return 0; - smu_dpm_set_vcn_enable(smu, false); + for (i = 0; i < adev->vcn.num_vcn_inst; i++) + smu_dpm_set_vcn_enable(smu, false, i); smu_dpm_set_jpeg_enable(smu, false); smu_dpm_set_vpe_enable(smu, false); smu_dpm_set_umsch_mm_enable(smu, false); @@ -2131,6 +2140,9 @@ static int smu_suspend(struct amdgpu_ip_block *ip_block) if (!ret) adev->gfx.gfx_off_entrycount = count; + /* clear this on suspend so it will get reprogrammed on resume */ + smu->workload_mask = 0; + return 0; } @@ -2192,13 +2204,13 @@ static int smu_display_configuration_change(void *handle, return 0; } -static int smu_set_clockgating_state(void *handle, +static int smu_set_clockgating_state(struct amdgpu_ip_block *ip_block, enum amd_clockgating_state state) { return 0; } -static int smu_set_powergating_state(void *handle, +static int smu_set_powergating_state(struct amdgpu_ip_block *ip_block, enum amd_powergating_state state) { return 0; @@ -2243,25 +2255,49 @@ static int smu_enable_umd_pstate(void *handle, } static int smu_bump_power_profile_mode(struct smu_context *smu, - long *param, - uint32_t param_size) + long *custom_params, + u32 custom_params_max_idx) { - int ret = 0; + u32 workload_mask = 0; + int i, ret = 0; + + for (i = 0; i < PP_SMC_POWER_PROFILE_COUNT; i++) { + if (smu->workload_refcount[i]) + workload_mask |= 1 << i; + } + + if (smu->workload_mask == workload_mask) + return 0; if (smu->ppt_funcs->set_power_profile_mode) - ret = smu->ppt_funcs->set_power_profile_mode(smu, param, param_size); + ret = smu->ppt_funcs->set_power_profile_mode(smu, workload_mask, + custom_params, + custom_params_max_idx); + + if (!ret) + smu->workload_mask = workload_mask; return ret; } +static void 
smu_power_profile_mode_get(struct smu_context *smu, + enum PP_SMC_POWER_PROFILE profile_mode) +{ + smu->workload_refcount[profile_mode]++; +} + +static void smu_power_profile_mode_put(struct smu_context *smu, + enum PP_SMC_POWER_PROFILE profile_mode) +{ + if (smu->workload_refcount[profile_mode]) + smu->workload_refcount[profile_mode]--; +} + static int smu_adjust_power_state_dynamic(struct smu_context *smu, enum amd_dpm_forced_level level, - bool skip_display_settings, - bool init) + bool skip_display_settings) { int ret = 0; - int index = 0; - long workload[1]; struct smu_dpm_context *smu_dpm_ctx = &(smu->smu_dpm); if (!skip_display_settings) { @@ -2298,14 +2334,8 @@ static int smu_adjust_power_state_dynamic(struct smu_context *smu, } if (smu_dpm_ctx->dpm_level != AMD_DPM_FORCED_LEVEL_MANUAL && - smu_dpm_ctx->dpm_level != AMD_DPM_FORCED_LEVEL_PERF_DETERMINISM) { - index = fls(smu->workload_mask); - index = index > 0 && index <= WORKLOAD_POLICY_MAX ? index - 1 : 0; - workload[0] = smu->workload_setting[index]; - - if (init || smu->power_profile_mode != workload[0]) - smu_bump_power_profile_mode(smu, workload, 0); - } + smu_dpm_ctx->dpm_level != AMD_DPM_FORCED_LEVEL_PERF_DETERMINISM) + smu_bump_power_profile_mode(smu, NULL, 0); return ret; } @@ -2324,13 +2354,13 @@ static int smu_handle_task(struct smu_context *smu, ret = smu_pre_display_config_changed(smu); if (ret) return ret; - ret = smu_adjust_power_state_dynamic(smu, level, false, false); + ret = smu_adjust_power_state_dynamic(smu, level, false); break; case AMD_PP_TASK_COMPLETE_INIT: - ret = smu_adjust_power_state_dynamic(smu, level, true, true); + ret = smu_adjust_power_state_dynamic(smu, level, true); break; case AMD_PP_TASK_READJUST_POWER_STATE: - ret = smu_adjust_power_state_dynamic(smu, level, true, false); + ret = smu_adjust_power_state_dynamic(smu, level, true); break; default: break; @@ -2352,12 +2382,11 @@ static int smu_handle_dpm_task(void *handle, static int smu_switch_power_profile(void *handle, 
enum PP_SMC_POWER_PROFILE type, - bool en) + bool enable) { struct smu_context *smu = handle; struct smu_dpm_context *smu_dpm_ctx = &(smu->smu_dpm); - long workload[1]; - uint32_t index; + int ret; if (!smu->pm_enabled || !smu->adev->pm.dpm_enabled) return -EOPNOTSUPP; @@ -2365,24 +2394,21 @@ static int smu_switch_power_profile(void *handle, if (!(type < PP_SMC_POWER_PROFILE_CUSTOM)) return -EINVAL; - if (!en) { - smu->driver_workload_mask &= ~(1 << smu->workload_priority[type]); - index = fls(smu->workload_mask); - index = index > 0 && index <= WORKLOAD_POLICY_MAX ? index - 1 : 0; - workload[0] = smu->workload_setting[index]; - } else { - smu->driver_workload_mask |= (1 << smu->workload_priority[type]); - index = fls(smu->workload_mask); - index = index <= WORKLOAD_POLICY_MAX ? index - 1 : 0; - workload[0] = smu->workload_setting[index]; - } - - smu->workload_mask = smu->driver_workload_mask | - smu->user_dpm_profile.user_workload_mask; - if (smu_dpm_ctx->dpm_level != AMD_DPM_FORCED_LEVEL_MANUAL && - smu_dpm_ctx->dpm_level != AMD_DPM_FORCED_LEVEL_PERF_DETERMINISM) - smu_bump_power_profile_mode(smu, workload, 0); + smu_dpm_ctx->dpm_level != AMD_DPM_FORCED_LEVEL_PERF_DETERMINISM) { + if (enable) + smu_power_profile_mode_get(smu, type); + else + smu_power_profile_mode_put(smu, type); + ret = smu_bump_power_profile_mode(smu, NULL, 0); + if (ret) { + if (enable) + smu_power_profile_mode_put(smu, type); + else + smu_power_profile_mode_get(smu, type); + return ret; + } + } return 0; } @@ -2966,9 +2992,10 @@ static int smu_read_sensor(void *handle, int *size_arg) { struct smu_context *smu = handle; + struct amdgpu_device *adev = smu->adev; struct smu_umd_pstate_table *pstate_table = &smu->pstate_table; - int ret = 0; + int i, ret = 0; uint32_t *size, size_val; if (!smu->pm_enabled || !smu->adev->pm.dpm_enabled) @@ -3014,7 +3041,13 @@ static int smu_read_sensor(void *handle, *size = 4; break; case AMDGPU_PP_SENSOR_VCN_POWER_STATE: - *(uint32_t *)data = 
atomic_read(&smu->smu_power.power_gate.vcn_gated) ? 0 : 1; + *(uint32_t *)data = 0; + for (i = 0; i < adev->vcn.num_vcn_inst; i++) { + if (!atomic_read(&smu->smu_power.power_gate.vcn_gated[i])) { + *(uint32_t *)data = 1; + break; + } + } *size = 4; break; case AMDGPU_PP_SENSOR_MIN_FAN_RPM: @@ -3074,21 +3107,33 @@ static int smu_set_power_profile_mode(void *handle, uint32_t param_size) { struct smu_context *smu = handle; - int ret; + bool custom = false; + int ret = 0; if (!smu->pm_enabled || !smu->adev->pm.dpm_enabled || !smu->ppt_funcs->set_power_profile_mode) return -EOPNOTSUPP; - if (smu->user_dpm_profile.user_workload_mask & - (1 << smu->workload_priority[param[param_size]])) - return 0; + if (param[param_size] == PP_SMC_POWER_PROFILE_CUSTOM) { + custom = true; + /* clear frontend mask so custom changes propagate */ + smu->workload_mask = 0; + } - smu->user_dpm_profile.user_workload_mask = - (1 << smu->workload_priority[param[param_size]]); - smu->workload_mask = smu->user_dpm_profile.user_workload_mask | - smu->driver_workload_mask; - ret = smu_bump_power_profile_mode(smu, param, param_size); + if ((param[param_size] != smu->power_profile_mode) || custom) { + /* clear the old user preference */ + smu_power_profile_mode_put(smu, smu->power_profile_mode); + /* set the new user preference */ + smu_power_profile_mode_get(smu, param[param_size]); + ret = smu_bump_power_profile_mode(smu, + custom ? param : NULL, + custom ? 
param_size : 0); + if (ret) + smu_power_profile_mode_put(smu, param[param_size]); + else + /* store the user's preference */ + smu->power_profile_mode = param[param_size]; + } return ret; } @@ -3870,3 +3915,13 @@ int smu_send_rma_reason(struct smu_context *smu) return ret; } + +int smu_reset_sdma(struct smu_context *smu, uint32_t inst_mask) +{ + int ret = 0; + + if (smu->ppt_funcs && smu->ppt_funcs->reset_sdma) + ret = smu->ppt_funcs->reset_sdma(smu, inst_mask); + + return ret; +} diff --git a/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h b/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h index d665c47f19b7..3630593bce61 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h +++ b/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h @@ -240,7 +240,6 @@ struct smu_user_dpm_profile { /* user clock state information */ uint32_t clk_mask[SMU_CLK_COUNT]; uint32_t clk_dependency; - uint32_t user_workload_mask; }; #define SMU_TABLE_INIT(tables, table_id, s, a, d) \ @@ -400,7 +399,7 @@ struct smu_dpm_context { struct smu_power_gate { bool uvd_gated; bool vce_gated; - atomic_t vcn_gated; + atomic_t vcn_gated[AMDGPU_MAX_VCN_INSTANCES]; atomic_t jpeg_gated; atomic_t vpe_gated; atomic_t umsch_mm_gated; @@ -557,12 +556,13 @@ struct smu_context { uint32_t hard_min_uclk_req_from_dal; bool disable_uclk_switch; + /* asic agnostic workload mask */ uint32_t workload_mask; - uint32_t driver_workload_mask; - uint32_t workload_priority[WORKLOAD_POLICY_MAX]; - uint32_t workload_setting[WORKLOAD_POLICY_MAX]; + /* default/user workload preference */ uint32_t power_profile_mode; - uint32_t default_power_profile_mode; + uint32_t workload_refcount[PP_SMC_POWER_PROFILE_COUNT]; + /* backend specific custom workload settings */ + long *custom_profile_params; bool pm_enabled; bool is_apu; @@ -733,9 +733,12 @@ struct pptable_funcs { * @set_power_profile_mode: Set a power profile mode. Also used to * create/set custom power profile modes. * &input: Power profile mode parameters. 
- * &size: Size of &input. + * &workload_mask: mask of workloads to enable + * &custom_params: custom profile parameters + * &custom_params_max_idx: max valid idx into custom_params */ - int (*set_power_profile_mode)(struct smu_context *smu, long *input, uint32_t size); + int (*set_power_profile_mode)(struct smu_context *smu, u32 workload_mask, + long *custom_params, u32 custom_params_max_idx); /** * @dpm_set_vcn_enable: Enable/disable VCN engine dynamic power @@ -1369,6 +1372,11 @@ struct pptable_funcs { */ int (*send_rma_reason)(struct smu_context *smu); + /** + * @reset_sdma: message SMU to soft reset sdma instance. + */ + int (*reset_sdma)(struct smu_context *smu, uint32_t inst_mask); + /** * @get_ecc_table: message SMU to get ECC INFO table. */ @@ -1628,6 +1636,7 @@ void amdgpu_smu_stb_debug_fs_init(struct amdgpu_device *adev); int smu_send_hbm_bad_pages_num(struct smu_context *smu, uint32_t size); int smu_send_hbm_bad_channel_flag(struct smu_context *smu, uint32_t size); int smu_send_rma_reason(struct smu_context *smu); +int smu_reset_sdma(struct smu_context *smu, uint32_t inst_mask); int smu_set_pm_policy(struct smu_context *smu, enum pp_pm_policy p_type, int level); ssize_t smu_get_pm_policy_info(struct smu_context *smu, diff --git a/drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu_v13_0_6_pmfw.h b/drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu_v13_0_6_pmfw.h index 0f96b8c59a0e..274b3e1cc4fb 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu_v13_0_6_pmfw.h +++ b/drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu_v13_0_6_pmfw.h @@ -34,6 +34,8 @@ #define NUM_PCIE_BITRATES 4 #define NUM_XGMI_BITRATES 4 #define NUM_XGMI_WIDTHS 3 +#define NUM_SOC_P2S_TABLES 3 +#define NUM_TDP_GROUPS 4 typedef enum { /*0*/ FEATURE_DATA_CALCULATION = 0, @@ -80,8 +82,10 @@ typedef enum { /*41*/ FEATURE_CXL_QOS = 41, /*42*/ FEATURE_SOC_DC_RTC = 42, /*43*/ FEATURE_GFX_DC_RTC = 43, +/*44*/ FEATURE_DVM_MIN_PSM = 44, +/*45*/ FEATURE_PRC = 45, -/*44*/ NUM_FEATURES = 44 +/*46*/ NUM_FEATURES = 
46 } FEATURE_LIST_e; //enum for MPIO PCIe gen speed msgs @@ -123,7 +127,7 @@ typedef enum { VOLTAGE_GUARDBAND_COUNT } GFX_GUARDBAND_e; -#define SMU_METRICS_TABLE_VERSION 0xE +#define SMU_METRICS_TABLE_VERSION 0xF typedef struct __attribute__((packed, aligned(4))) { uint32_t AccumulationCounter; @@ -234,6 +238,9 @@ typedef struct __attribute__((packed, aligned(4))) { //PCIE BW Data and error count uint32_t PCIeOtherEndRecoveryAcc; // The Pcie counter itself is accumulated + + //Total App Clock Counter + uint64_t GfxclkBelowHostLimitAcc[8]; } MetricsTableX_t; typedef struct __attribute__((packed, aligned(4))) { @@ -328,13 +335,14 @@ typedef struct __attribute__((packed, aligned(4))) { uint32_t JpegBusy[32]; } MetricsTableA_t; -#define SMU_VF_METRICS_TABLE_VERSION 0x3 +#define SMU_VF_METRICS_TABLE_VERSION 0x5 typedef struct __attribute__((packed, aligned(4))) { uint32_t AccumulationCounter; uint32_t InstGfxclk_TargFreq; uint64_t AccGfxclk_TargFreq; uint64_t AccGfxRsmuDpm_Busy; + uint64_t AccGfxclkBelowHostLimit; } VfMetricsTable_t; #endif diff --git a/drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu_v13_0_6_ppsmc.h b/drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu_v13_0_6_ppsmc.h index 41cb681927e2..7b65a27fb302 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu_v13_0_6_ppsmc.h +++ b/drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu_v13_0_6_ppsmc.h @@ -93,7 +93,9 @@ #define PPSMC_MSG_SelectPLPDMode 0x40 #define PPSMC_MSG_RmaDueToBadPageThreshold 0x43 #define PPSMC_MSG_SelectPstatePolicy 0x44 -#define PPSMC_Message_Count 0x45 +#define PPSMC_MSG_ResetSDMA2 0x45 +#define PPSMC_MSG_ResetSDMA 0x4D +#define PPSMC_Message_Count 0x4E //PPSMC Reset Types for driver msg argument #define PPSMC_RESET_TYPE_DRIVER_MODE_1_RESET 0x1 diff --git a/drivers/gpu/drm/amd/pm/swsmu/inc/smu_types.h b/drivers/gpu/drm/amd/pm/swsmu/inc/smu_types.h index a299dc4a8071..b0dab9797c70 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/inc/smu_types.h +++ b/drivers/gpu/drm/amd/pm/swsmu/inc/smu_types.h @@ -275,7 
+275,9 @@ __SMU_DUMMY_MAP(RmaDueToBadPageThreshold), \ __SMU_DUMMY_MAP(SelectPstatePolicy), \ __SMU_DUMMY_MAP(MALLPowerController), \ - __SMU_DUMMY_MAP(MALLPowerState), + __SMU_DUMMY_MAP(MALLPowerState), \ + __SMU_DUMMY_MAP(ResetSDMA), \ + __SMU_DUMMY_MAP(ResetSDMA2), #undef __SMU_DUMMY_MAP #define __SMU_DUMMY_MAP(type) SMU_MSG_##type diff --git a/drivers/gpu/drm/amd/pm/swsmu/inc/smu_v13_0.h b/drivers/gpu/drm/amd/pm/swsmu/inc/smu_v13_0.h index ae3563d71fa0..356d9422b411 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/inc/smu_v13_0.h +++ b/drivers/gpu/drm/amd/pm/swsmu/inc/smu_v13_0.h @@ -303,5 +303,7 @@ int smu_v13_0_set_wbrf_exclusion_ranges(struct smu_context *smu, int smu_v13_0_get_boot_freq_by_index(struct smu_context *smu, enum smu_clk_type clk_type, uint32_t *value); + +void smu_v13_0_interrupt_work(struct smu_context *smu); #endif #endif diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c index 12125303bb79..8aa61a9f7778 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c @@ -1445,97 +1445,120 @@ static int arcturus_get_power_profile_mode(struct smu_context *smu, return size; } -static int arcturus_set_power_profile_mode(struct smu_context *smu, - long *input, - uint32_t size) +#define ARCTURUS_CUSTOM_PARAMS_COUNT 10 +#define ARCTURUS_CUSTOM_PARAMS_CLOCK_COUNT 2 +#define ARCTURUS_CUSTOM_PARAMS_SIZE (ARCTURUS_CUSTOM_PARAMS_CLOCK_COUNT * ARCTURUS_CUSTOM_PARAMS_COUNT * sizeof(long)) + +static int arcturus_set_power_profile_mode_coeff(struct smu_context *smu, + long *input) { DpmActivityMonitorCoeffInt_t activity_monitor; - int workload_type = 0; - uint32_t profile_mode = input[size]; - int ret = 0; + int ret, idx; - if (profile_mode > PP_SMC_POWER_PROFILE_CUSTOM) { - dev_err(smu->adev->dev, "Invalid power profile mode %d\n", profile_mode); - return -EINVAL; - } - - if ((profile_mode == PP_SMC_POWER_PROFILE_CUSTOM) && - (smu->smc_fw_version 
>= 0x360d00)) { - if (size != 10) - return -EINVAL; - - ret = smu_cmn_update_table(smu, - SMU_TABLE_ACTIVITY_MONITOR_COEFF, - WORKLOAD_PPLIB_CUSTOM_BIT, - (void *)(&activity_monitor), - false); - if (ret) { - dev_err(smu->adev->dev, "[%s] Failed to get activity monitor!", __func__); - return ret; - } - - switch (input[0]) { - case 0: /* Gfxclk */ - activity_monitor.Gfx_FPS = input[1]; - activity_monitor.Gfx_UseRlcBusy = input[2]; - activity_monitor.Gfx_MinActiveFreqType = input[3]; - activity_monitor.Gfx_MinActiveFreq = input[4]; - activity_monitor.Gfx_BoosterFreqType = input[5]; - activity_monitor.Gfx_BoosterFreq = input[6]; - activity_monitor.Gfx_PD_Data_limit_c = input[7]; - activity_monitor.Gfx_PD_Data_error_coeff = input[8]; - activity_monitor.Gfx_PD_Data_error_rate_coeff = input[9]; - break; - case 1: /* Uclk */ - activity_monitor.Mem_FPS = input[1]; - activity_monitor.Mem_UseRlcBusy = input[2]; - activity_monitor.Mem_MinActiveFreqType = input[3]; - activity_monitor.Mem_MinActiveFreq = input[4]; - activity_monitor.Mem_BoosterFreqType = input[5]; - activity_monitor.Mem_BoosterFreq = input[6]; - activity_monitor.Mem_PD_Data_limit_c = input[7]; - activity_monitor.Mem_PD_Data_error_coeff = input[8]; - activity_monitor.Mem_PD_Data_error_rate_coeff = input[9]; - break; - default: - return -EINVAL; - } - - ret = smu_cmn_update_table(smu, - SMU_TABLE_ACTIVITY_MONITOR_COEFF, - WORKLOAD_PPLIB_CUSTOM_BIT, - (void *)(&activity_monitor), - true); - if (ret) { - dev_err(smu->adev->dev, "[%s] Failed to set activity monitor!", __func__); - return ret; - } - } - - /* - * Conv PP_SMC_POWER_PROFILE* to WORKLOAD_PPLIB_*_BIT - * Not all profile modes are supported on arcturus. 
- */ - workload_type = smu_cmn_to_asic_specific_index(smu, - CMN2ASIC_MAPPING_WORKLOAD, - profile_mode); - if (workload_type < 0) { - dev_dbg(smu->adev->dev, "Unsupported power profile mode %d on arcturus\n", profile_mode); - return -EINVAL; - } - - ret = smu_cmn_send_smc_msg_with_param(smu, - SMU_MSG_SetWorkloadMask, - smu->workload_mask, - NULL); + ret = smu_cmn_update_table(smu, + SMU_TABLE_ACTIVITY_MONITOR_COEFF, + WORKLOAD_PPLIB_CUSTOM_BIT, + (void *)(&activity_monitor), + false); if (ret) { - dev_err(smu->adev->dev, "Fail to set workload type %d\n", workload_type); + dev_err(smu->adev->dev, "[%s] Failed to get activity monitor!", __func__); return ret; } - smu_cmn_assign_power_profile(smu); + idx = 0 * ARCTURUS_CUSTOM_PARAMS_COUNT; + if (input[idx]) { + /* Gfxclk */ + activity_monitor.Gfx_FPS = input[idx + 1]; + activity_monitor.Gfx_UseRlcBusy = input[idx + 2]; + activity_monitor.Gfx_MinActiveFreqType = input[idx + 3]; + activity_monitor.Gfx_MinActiveFreq = input[idx + 4]; + activity_monitor.Gfx_BoosterFreqType = input[idx + 5]; + activity_monitor.Gfx_BoosterFreq = input[idx + 6]; + activity_monitor.Gfx_PD_Data_limit_c = input[idx + 7]; + activity_monitor.Gfx_PD_Data_error_coeff = input[idx + 8]; + activity_monitor.Gfx_PD_Data_error_rate_coeff = input[idx + 9]; + } + idx = 1 * ARCTURUS_CUSTOM_PARAMS_COUNT; + if (input[idx]) { + /* Uclk */ + activity_monitor.Mem_FPS = input[idx + 1]; + activity_monitor.Mem_UseRlcBusy = input[idx + 2]; + activity_monitor.Mem_MinActiveFreqType = input[idx + 3]; + activity_monitor.Mem_MinActiveFreq = input[idx + 4]; + activity_monitor.Mem_BoosterFreqType = input[idx + 5]; + activity_monitor.Mem_BoosterFreq = input[idx + 6]; + activity_monitor.Mem_PD_Data_limit_c = input[idx + 7]; + activity_monitor.Mem_PD_Data_error_coeff = input[idx + 8]; + activity_monitor.Mem_PD_Data_error_rate_coeff = input[idx + 9]; + } - return 0; + ret = smu_cmn_update_table(smu, + SMU_TABLE_ACTIVITY_MONITOR_COEFF, + WORKLOAD_PPLIB_CUSTOM_BIT, + (void 
*)(&activity_monitor), + true); + if (ret) { + dev_err(smu->adev->dev, "[%s] Failed to set activity monitor!", __func__); + return ret; + } + + return ret; +} + +static int arcturus_set_power_profile_mode(struct smu_context *smu, + u32 workload_mask, + long *custom_params, + u32 custom_params_max_idx) +{ + u32 backend_workload_mask = 0; + int ret, idx = -1, i; + + smu_cmn_get_backend_workload_mask(smu, workload_mask, + &backend_workload_mask); + + if (workload_mask & (1 << PP_SMC_POWER_PROFILE_CUSTOM)) { + if (smu->smc_fw_version < 0x360d00) + return -EINVAL; + if (!smu->custom_profile_params) { + smu->custom_profile_params = + kzalloc(ARCTURUS_CUSTOM_PARAMS_SIZE, GFP_KERNEL); + if (!smu->custom_profile_params) + return -ENOMEM; + } + if (custom_params && custom_params_max_idx) { + if (custom_params_max_idx != ARCTURUS_CUSTOM_PARAMS_COUNT) + return -EINVAL; + if (custom_params[0] >= ARCTURUS_CUSTOM_PARAMS_CLOCK_COUNT) + return -EINVAL; + idx = custom_params[0] * ARCTURUS_CUSTOM_PARAMS_COUNT; + smu->custom_profile_params[idx] = 1; + for (i = 1; i < custom_params_max_idx; i++) + smu->custom_profile_params[idx + i] = custom_params[i]; + } + ret = arcturus_set_power_profile_mode_coeff(smu, + smu->custom_profile_params); + if (ret) { + if (idx != -1) + smu->custom_profile_params[idx] = 0; + return ret; + } + } else if (smu->custom_profile_params) { + memset(smu->custom_profile_params, 0, ARCTURUS_CUSTOM_PARAMS_SIZE); + } + + ret = smu_cmn_send_smc_msg_with_param(smu, + SMU_MSG_SetWorkloadMask, + backend_workload_mask, + NULL); + if (ret) { + dev_err(smu->adev->dev, "Failed to set workload mask 0x%08x\n", + workload_mask); + if (idx != -1) + smu->custom_profile_params[idx] = 0; + return ret; + } + + return ret; } static int arcturus_set_performance_level(struct smu_context *smu, diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c index 211635dabed8..7fad5dfb39c4 100644 --- 
a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c @@ -2006,90 +2006,122 @@ static int navi10_get_power_profile_mode(struct smu_context *smu, char *buf) return size; } -static int navi10_set_power_profile_mode(struct smu_context *smu, long *input, uint32_t size) +#define NAVI10_CUSTOM_PARAMS_COUNT 10 +#define NAVI10_CUSTOM_PARAMS_CLOCKS_COUNT 3 +#define NAVI10_CUSTOM_PARAMS_SIZE (NAVI10_CUSTOM_PARAMS_CLOCKS_COUNT * NAVI10_CUSTOM_PARAMS_COUNT * sizeof(long)) + +static int navi10_set_power_profile_mode_coeff(struct smu_context *smu, + long *input) { DpmActivityMonitorCoeffInt_t activity_monitor; - int workload_type, ret = 0; + int ret, idx; - smu->power_profile_mode = input[size]; - - if (smu->power_profile_mode > PP_SMC_POWER_PROFILE_CUSTOM) { - dev_err(smu->adev->dev, "Invalid power profile mode %d\n", smu->power_profile_mode); - return -EINVAL; + ret = smu_cmn_update_table(smu, + SMU_TABLE_ACTIVITY_MONITOR_COEFF, WORKLOAD_PPLIB_CUSTOM_BIT, + (void *)(&activity_monitor), false); + if (ret) { + dev_err(smu->adev->dev, "[%s] Failed to get activity monitor!", __func__); + return ret; } - if (smu->power_profile_mode == PP_SMC_POWER_PROFILE_CUSTOM) { - if (size != 10) - return -EINVAL; - - ret = smu_cmn_update_table(smu, - SMU_TABLE_ACTIVITY_MONITOR_COEFF, WORKLOAD_PPLIB_CUSTOM_BIT, - (void *)(&activity_monitor), false); - if (ret) { - dev_err(smu->adev->dev, "[%s] Failed to get activity monitor!", __func__); - return ret; - } - - switch (input[0]) { - case 0: /* Gfxclk */ - activity_monitor.Gfx_FPS = input[1]; - activity_monitor.Gfx_MinFreqStep = input[2]; - activity_monitor.Gfx_MinActiveFreqType = input[3]; - activity_monitor.Gfx_MinActiveFreq = input[4]; - activity_monitor.Gfx_BoosterFreqType = input[5]; - activity_monitor.Gfx_BoosterFreq = input[6]; - activity_monitor.Gfx_PD_Data_limit_c = input[7]; - activity_monitor.Gfx_PD_Data_error_coeff = input[8]; - activity_monitor.Gfx_PD_Data_error_rate_coeff = input[9]; - 
break; - case 1: /* Socclk */ - activity_monitor.Soc_FPS = input[1]; - activity_monitor.Soc_MinFreqStep = input[2]; - activity_monitor.Soc_MinActiveFreqType = input[3]; - activity_monitor.Soc_MinActiveFreq = input[4]; - activity_monitor.Soc_BoosterFreqType = input[5]; - activity_monitor.Soc_BoosterFreq = input[6]; - activity_monitor.Soc_PD_Data_limit_c = input[7]; - activity_monitor.Soc_PD_Data_error_coeff = input[8]; - activity_monitor.Soc_PD_Data_error_rate_coeff = input[9]; - break; - case 2: /* Memclk */ - activity_monitor.Mem_FPS = input[1]; - activity_monitor.Mem_MinFreqStep = input[2]; - activity_monitor.Mem_MinActiveFreqType = input[3]; - activity_monitor.Mem_MinActiveFreq = input[4]; - activity_monitor.Mem_BoosterFreqType = input[5]; - activity_monitor.Mem_BoosterFreq = input[6]; - activity_monitor.Mem_PD_Data_limit_c = input[7]; - activity_monitor.Mem_PD_Data_error_coeff = input[8]; - activity_monitor.Mem_PD_Data_error_rate_coeff = input[9]; - break; - default: - return -EINVAL; - } - - ret = smu_cmn_update_table(smu, - SMU_TABLE_ACTIVITY_MONITOR_COEFF, WORKLOAD_PPLIB_CUSTOM_BIT, - (void *)(&activity_monitor), true); - if (ret) { - dev_err(smu->adev->dev, "[%s] Failed to set activity monitor!", __func__); - return ret; - } + idx = 0 * NAVI10_CUSTOM_PARAMS_COUNT; + if (input[idx]) { + /* Gfxclk */ + activity_monitor.Gfx_FPS = input[idx + 1]; + activity_monitor.Gfx_MinFreqStep = input[idx + 2]; + activity_monitor.Gfx_MinActiveFreqType = input[idx + 3]; + activity_monitor.Gfx_MinActiveFreq = input[idx + 4]; + activity_monitor.Gfx_BoosterFreqType = input[idx + 5]; + activity_monitor.Gfx_BoosterFreq = input[idx + 6]; + activity_monitor.Gfx_PD_Data_limit_c = input[idx + 7]; + activity_monitor.Gfx_PD_Data_error_coeff = input[idx + 8]; + activity_monitor.Gfx_PD_Data_error_rate_coeff = input[idx + 9]; + } + idx = 1 * NAVI10_CUSTOM_PARAMS_COUNT; + if (input[idx]) { + /* Socclk */ + activity_monitor.Soc_FPS = input[idx + 1]; + activity_monitor.Soc_MinFreqStep = 
input[idx + 2]; + activity_monitor.Soc_MinActiveFreqType = input[idx + 3]; + activity_monitor.Soc_MinActiveFreq = input[idx + 4]; + activity_monitor.Soc_BoosterFreqType = input[idx + 5]; + activity_monitor.Soc_BoosterFreq = input[idx + 6]; + activity_monitor.Soc_PD_Data_limit_c = input[idx + 7]; + activity_monitor.Soc_PD_Data_error_coeff = input[idx + 8]; + activity_monitor.Soc_PD_Data_error_rate_coeff = input[idx + 9]; + } + idx = 2 * NAVI10_CUSTOM_PARAMS_COUNT; + if (input[idx]) { + /* Memclk */ + activity_monitor.Mem_FPS = input[idx + 1]; + activity_monitor.Mem_MinFreqStep = input[idx + 2]; + activity_monitor.Mem_MinActiveFreqType = input[idx + 3]; + activity_monitor.Mem_MinActiveFreq = input[idx + 4]; + activity_monitor.Mem_BoosterFreqType = input[idx + 5]; + activity_monitor.Mem_BoosterFreq = input[idx + 6]; + activity_monitor.Mem_PD_Data_limit_c = input[idx + 7]; + activity_monitor.Mem_PD_Data_error_coeff = input[idx + 8]; + activity_monitor.Mem_PD_Data_error_rate_coeff = input[idx + 9]; } - /* conv PP_SMC_POWER_PROFILE* to WORKLOAD_PPLIB_*_BIT */ - workload_type = smu_cmn_to_asic_specific_index(smu, - CMN2ASIC_MAPPING_WORKLOAD, - smu->power_profile_mode); - if (workload_type < 0) - return -EINVAL; + ret = smu_cmn_update_table(smu, + SMU_TABLE_ACTIVITY_MONITOR_COEFF, WORKLOAD_PPLIB_CUSTOM_BIT, + (void *)(&activity_monitor), true); + if (ret) { + dev_err(smu->adev->dev, "[%s] Failed to set activity monitor!", __func__); + return ret; + } + + return ret; +} + +static int navi10_set_power_profile_mode(struct smu_context *smu, + u32 workload_mask, + long *custom_params, + u32 custom_params_max_idx) +{ + u32 backend_workload_mask = 0; + int ret, idx = -1, i; + + smu_cmn_get_backend_workload_mask(smu, workload_mask, + &backend_workload_mask); + + if (workload_mask & (1 << PP_SMC_POWER_PROFILE_CUSTOM)) { + if (!smu->custom_profile_params) { + smu->custom_profile_params = kzalloc(NAVI10_CUSTOM_PARAMS_SIZE, GFP_KERNEL); + if (!smu->custom_profile_params) + return 
-ENOMEM; + } + if (custom_params && custom_params_max_idx) { + if (custom_params_max_idx != NAVI10_CUSTOM_PARAMS_COUNT) + return -EINVAL; + if (custom_params[0] >= NAVI10_CUSTOM_PARAMS_CLOCKS_COUNT) + return -EINVAL; + idx = custom_params[0] * NAVI10_CUSTOM_PARAMS_COUNT; + smu->custom_profile_params[idx] = 1; + for (i = 1; i < custom_params_max_idx; i++) + smu->custom_profile_params[idx + i] = custom_params[i]; + } + ret = navi10_set_power_profile_mode_coeff(smu, + smu->custom_profile_params); + if (ret) { + if (idx != -1) + smu->custom_profile_params[idx] = 0; + return ret; + } + } else if (smu->custom_profile_params) { + memset(smu->custom_profile_params, 0, NAVI10_CUSTOM_PARAMS_SIZE); + } ret = smu_cmn_send_smc_msg_with_param(smu, SMU_MSG_SetWorkloadMask, - smu->workload_mask, NULL); - if (ret) - dev_err(smu->adev->dev, "[%s] Failed to set work load mask!", __func__); - else - smu_cmn_assign_power_profile(smu); + backend_workload_mask, NULL); + if (ret) { + dev_err(smu->adev->dev, "Failed to set workload mask 0x%08x\n", + workload_mask); + if (idx != -1) + smu->custom_profile_params[idx] = 0; + return ret; + } return ret; } diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c index d0ed0d060a8a..19a25fdc2f5b 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c @@ -1157,19 +1157,15 @@ static int sienna_cichlid_dpm_set_vcn_enable(struct smu_context *smu, int inst) { struct amdgpu_device *adev = smu->adev; - int i, ret = 0; + int ret = 0; - for (i = 0; i < adev->vcn.num_vcn_inst; i++) { - if (adev->vcn.harvest_config & (1 << i)) - continue; - /* vcn dpm on is a prerequisite for vcn power gate messages */ - if (smu_cmn_feature_is_enabled(smu, SMU_FEATURE_MM_DPM_PG_BIT)) { - ret = smu_cmn_send_smc_msg_with_param(smu, enable ? 
- SMU_MSG_PowerUpVcn : SMU_MSG_PowerDownVcn, - 0x10000 * i, NULL); - if (ret) - return ret; - } + if (adev->vcn.harvest_config & (1 << inst)) + return ret; + /* vcn dpm on is a prerequisite for vcn power gate messages */ + if (smu_cmn_feature_is_enabled(smu, SMU_FEATURE_MM_DPM_PG_BIT)) { + ret = smu_cmn_send_smc_msg_with_param(smu, enable ? + SMU_MSG_PowerUpVcn : SMU_MSG_PowerDownVcn, + 0x10000 * inst, NULL); } return ret; @@ -1708,93 +1704,126 @@ static int sienna_cichlid_get_power_profile_mode(struct smu_context *smu, char * return size; } -static int sienna_cichlid_set_power_profile_mode(struct smu_context *smu, long *input, uint32_t size) +#define SIENNA_CICHLID_CUSTOM_PARAMS_COUNT 10 +#define SIENNA_CICHLID_CUSTOM_PARAMS_CLOCK_COUNT 3 +#define SIENNA_CICHLID_CUSTOM_PARAMS_SIZE (SIENNA_CICHLID_CUSTOM_PARAMS_CLOCK_COUNT * SIENNA_CICHLID_CUSTOM_PARAMS_COUNT * sizeof(long)) + +static int sienna_cichlid_set_power_profile_mode_coeff(struct smu_context *smu, + long *input) { DpmActivityMonitorCoeffIntExternal_t activity_monitor_external; DpmActivityMonitorCoeffInt_t *activity_monitor = &(activity_monitor_external.DpmActivityMonitorCoeffInt); - int workload_type, ret = 0; + int ret, idx; - smu->power_profile_mode = input[size]; - - if (smu->power_profile_mode > PP_SMC_POWER_PROFILE_CUSTOM) { - dev_err(smu->adev->dev, "Invalid power profile mode %d\n", smu->power_profile_mode); - return -EINVAL; + ret = smu_cmn_update_table(smu, + SMU_TABLE_ACTIVITY_MONITOR_COEFF, WORKLOAD_PPLIB_CUSTOM_BIT, + (void *)(&activity_monitor_external), false); + if (ret) { + dev_err(smu->adev->dev, "[%s] Failed to get activity monitor!", __func__); + return ret; } - if (smu->power_profile_mode == PP_SMC_POWER_PROFILE_CUSTOM) { - if (size != 10) - return -EINVAL; - - ret = smu_cmn_update_table(smu, - SMU_TABLE_ACTIVITY_MONITOR_COEFF, WORKLOAD_PPLIB_CUSTOM_BIT, - (void *)(&activity_monitor_external), false); - if (ret) { - dev_err(smu->adev->dev, "[%s] Failed to get activity monitor!", 
__func__); - return ret; - } - - switch (input[0]) { - case 0: /* Gfxclk */ - activity_monitor->Gfx_FPS = input[1]; - activity_monitor->Gfx_MinFreqStep = input[2]; - activity_monitor->Gfx_MinActiveFreqType = input[3]; - activity_monitor->Gfx_MinActiveFreq = input[4]; - activity_monitor->Gfx_BoosterFreqType = input[5]; - activity_monitor->Gfx_BoosterFreq = input[6]; - activity_monitor->Gfx_PD_Data_limit_c = input[7]; - activity_monitor->Gfx_PD_Data_error_coeff = input[8]; - activity_monitor->Gfx_PD_Data_error_rate_coeff = input[9]; - break; - case 1: /* Socclk */ - activity_monitor->Fclk_FPS = input[1]; - activity_monitor->Fclk_MinFreqStep = input[2]; - activity_monitor->Fclk_MinActiveFreqType = input[3]; - activity_monitor->Fclk_MinActiveFreq = input[4]; - activity_monitor->Fclk_BoosterFreqType = input[5]; - activity_monitor->Fclk_BoosterFreq = input[6]; - activity_monitor->Fclk_PD_Data_limit_c = input[7]; - activity_monitor->Fclk_PD_Data_error_coeff = input[8]; - activity_monitor->Fclk_PD_Data_error_rate_coeff = input[9]; - break; - case 2: /* Memclk */ - activity_monitor->Mem_FPS = input[1]; - activity_monitor->Mem_MinFreqStep = input[2]; - activity_monitor->Mem_MinActiveFreqType = input[3]; - activity_monitor->Mem_MinActiveFreq = input[4]; - activity_monitor->Mem_BoosterFreqType = input[5]; - activity_monitor->Mem_BoosterFreq = input[6]; - activity_monitor->Mem_PD_Data_limit_c = input[7]; - activity_monitor->Mem_PD_Data_error_coeff = input[8]; - activity_monitor->Mem_PD_Data_error_rate_coeff = input[9]; - break; - default: - return -EINVAL; - } - - ret = smu_cmn_update_table(smu, - SMU_TABLE_ACTIVITY_MONITOR_COEFF, WORKLOAD_PPLIB_CUSTOM_BIT, - (void *)(&activity_monitor_external), true); - if (ret) { - dev_err(smu->adev->dev, "[%s] Failed to set activity monitor!", __func__); - return ret; - } + idx = 0 * SIENNA_CICHLID_CUSTOM_PARAMS_COUNT; + if (input[idx]) { + /* Gfxclk */ + activity_monitor->Gfx_FPS = input[idx + 1]; + activity_monitor->Gfx_MinFreqStep = 
input[idx + 2]; + activity_monitor->Gfx_MinActiveFreqType = input[idx + 3]; + activity_monitor->Gfx_MinActiveFreq = input[idx + 4]; + activity_monitor->Gfx_BoosterFreqType = input[idx + 5]; + activity_monitor->Gfx_BoosterFreq = input[idx + 6]; + activity_monitor->Gfx_PD_Data_limit_c = input[idx + 7]; + activity_monitor->Gfx_PD_Data_error_coeff = input[idx + 8]; + activity_monitor->Gfx_PD_Data_error_rate_coeff = input[idx + 9]; + } + idx = 1 * SIENNA_CICHLID_CUSTOM_PARAMS_COUNT; + if (input[idx]) { + /* Socclk */ + activity_monitor->Fclk_FPS = input[idx + 1]; + activity_monitor->Fclk_MinFreqStep = input[idx + 2]; + activity_monitor->Fclk_MinActiveFreqType = input[idx + 3]; + activity_monitor->Fclk_MinActiveFreq = input[idx + 4]; + activity_monitor->Fclk_BoosterFreqType = input[idx + 5]; + activity_monitor->Fclk_BoosterFreq = input[idx + 6]; + activity_monitor->Fclk_PD_Data_limit_c = input[idx + 7]; + activity_monitor->Fclk_PD_Data_error_coeff = input[idx + 8]; + activity_monitor->Fclk_PD_Data_error_rate_coeff = input[idx + 9]; + } + idx = 2 * SIENNA_CICHLID_CUSTOM_PARAMS_COUNT; + if (input[idx]) { + /* Memclk */ + activity_monitor->Mem_FPS = input[idx + 1]; + activity_monitor->Mem_MinFreqStep = input[idx + 2]; + activity_monitor->Mem_MinActiveFreqType = input[idx + 3]; + activity_monitor->Mem_MinActiveFreq = input[idx + 4]; + activity_monitor->Mem_BoosterFreqType = input[idx + 5]; + activity_monitor->Mem_BoosterFreq = input[idx + 6]; + activity_monitor->Mem_PD_Data_limit_c = input[idx + 7]; + activity_monitor->Mem_PD_Data_error_coeff = input[idx + 8]; + activity_monitor->Mem_PD_Data_error_rate_coeff = input[idx + 9]; } - /* conv PP_SMC_POWER_PROFILE* to WORKLOAD_PPLIB_*_BIT */ - workload_type = smu_cmn_to_asic_specific_index(smu, - CMN2ASIC_MAPPING_WORKLOAD, - smu->power_profile_mode); - if (workload_type < 0) - return -EINVAL; + ret = smu_cmn_update_table(smu, + SMU_TABLE_ACTIVITY_MONITOR_COEFF, WORKLOAD_PPLIB_CUSTOM_BIT, + (void *)(&activity_monitor_external), 
true); + if (ret) { + dev_err(smu->adev->dev, "[%s] Failed to set activity monitor!", __func__); + return ret; + } + + return ret; +} + +static int sienna_cichlid_set_power_profile_mode(struct smu_context *smu, + u32 workload_mask, + long *custom_params, + u32 custom_params_max_idx) +{ + u32 backend_workload_mask = 0; + int ret, idx = -1, i; + + smu_cmn_get_backend_workload_mask(smu, workload_mask, + &backend_workload_mask); + + if (workload_mask & (1 << PP_SMC_POWER_PROFILE_CUSTOM)) { + if (!smu->custom_profile_params) { + smu->custom_profile_params = + kzalloc(SIENNA_CICHLID_CUSTOM_PARAMS_SIZE, GFP_KERNEL); + if (!smu->custom_profile_params) + return -ENOMEM; + } + if (custom_params && custom_params_max_idx) { + if (custom_params_max_idx != SIENNA_CICHLID_CUSTOM_PARAMS_COUNT) + return -EINVAL; + if (custom_params[0] >= SIENNA_CICHLID_CUSTOM_PARAMS_CLOCK_COUNT) + return -EINVAL; + idx = custom_params[0] * SIENNA_CICHLID_CUSTOM_PARAMS_COUNT; + smu->custom_profile_params[idx] = 1; + for (i = 1; i < custom_params_max_idx; i++) + smu->custom_profile_params[idx + i] = custom_params[i]; + } + ret = sienna_cichlid_set_power_profile_mode_coeff(smu, + smu->custom_profile_params); + if (ret) { + if (idx != -1) + smu->custom_profile_params[idx] = 0; + return ret; + } + } else if (smu->custom_profile_params) { + memset(smu->custom_profile_params, 0, SIENNA_CICHLID_CUSTOM_PARAMS_SIZE); + } ret = smu_cmn_send_smc_msg_with_param(smu, SMU_MSG_SetWorkloadMask, - smu->workload_mask, NULL); - if (ret) - dev_err(smu->adev->dev, "[%s] Failed to set work load mask!", __func__); - else - smu_cmn_assign_power_profile(smu); + backend_workload_mask, NULL); + if (ret) { + dev_err(smu->adev->dev, "Failed to set workload mask 0x%08x\n", + workload_mask); + if (idx != -1) + smu->custom_profile_params[idx] = 0; + return ret; + } return ret; } diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c index 480cf3cb204d..189c6a32b6bd 100644 --- 
a/drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c @@ -105,7 +105,8 @@ int smu_v11_0_init_microcode(struct smu_context *smu) return 0; amdgpu_ucode_ip_version_decode(adev, MP1_HWIP, ucode_prefix, sizeof(ucode_prefix)); - err = amdgpu_ucode_request(adev, &adev->pm.fw, "amdgpu/%s.bin", ucode_prefix); + err = amdgpu_ucode_request(adev, &adev->pm.fw, AMDGPU_UCODE_REQUIRED, + "amdgpu/%s.bin", ucode_prefix); if (err) goto out; diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c index f89c487dce72..a55ea76d7399 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c @@ -1056,42 +1056,27 @@ static int vangogh_get_power_profile_mode(struct smu_context *smu, return size; } -static int vangogh_set_power_profile_mode(struct smu_context *smu, long *input, uint32_t size) +static int vangogh_set_power_profile_mode(struct smu_context *smu, + u32 workload_mask, + long *custom_params, + u32 custom_params_max_idx) { - int workload_type, ret; - uint32_t profile_mode = input[size]; + u32 backend_workload_mask = 0; + int ret; - if (profile_mode >= PP_SMC_POWER_PROFILE_COUNT) { - dev_err(smu->adev->dev, "Invalid power profile mode %d\n", profile_mode); - return -EINVAL; - } - - if (profile_mode == PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT || - profile_mode == PP_SMC_POWER_PROFILE_POWERSAVING) - return 0; - - /* conv PP_SMC_POWER_PROFILE* to WORKLOAD_PPLIB_*_BIT */ - workload_type = smu_cmn_to_asic_specific_index(smu, - CMN2ASIC_MAPPING_WORKLOAD, - profile_mode); - if (workload_type < 0) { - dev_dbg(smu->adev->dev, "Unsupported power profile mode %d on VANGOGH\n", - profile_mode); - return -EINVAL; - } + smu_cmn_get_backend_workload_mask(smu, workload_mask, + &backend_workload_mask); ret = smu_cmn_send_smc_msg_with_param(smu, SMU_MSG_ActiveProcessNotify, - smu->workload_mask, - NULL); + backend_workload_mask, + NULL); if 
(ret) { - dev_err_once(smu->adev->dev, "Fail to set workload type %d\n", - workload_type); + dev_err_once(smu->adev->dev, "Fail to set workload mask 0x%08x\n", + workload_mask); return ret; } - smu_cmn_assign_power_profile(smu); - - return 0; + return ret; } static int vangogh_set_soft_freq_limited_range(struct smu_context *smu, diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu12/renoir_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu12/renoir_ppt.c index 75a9ea87f419..37d82a71a2d7 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/smu12/renoir_ppt.c +++ b/drivers/gpu/drm/amd/pm/swsmu/smu12/renoir_ppt.c @@ -864,44 +864,27 @@ static int renoir_force_clk_levels(struct smu_context *smu, return ret; } -static int renoir_set_power_profile_mode(struct smu_context *smu, long *input, uint32_t size) +static int renoir_set_power_profile_mode(struct smu_context *smu, + u32 workload_mask, + long *custom_params, + u32 custom_params_max_idx) { - int workload_type, ret; - uint32_t profile_mode = input[size]; + int ret; + u32 backend_workload_mask = 0; - if (profile_mode > PP_SMC_POWER_PROFILE_CUSTOM) { - dev_err(smu->adev->dev, "Invalid power profile mode %d\n", profile_mode); - return -EINVAL; - } - - if (profile_mode == PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT || - profile_mode == PP_SMC_POWER_PROFILE_POWERSAVING) - return 0; - - /* conv PP_SMC_POWER_PROFILE* to WORKLOAD_PPLIB_*_BIT */ - workload_type = smu_cmn_to_asic_specific_index(smu, - CMN2ASIC_MAPPING_WORKLOAD, - profile_mode); - if (workload_type < 0) { - /* - * TODO: If some case need switch to powersave/default power mode - * then can consider enter WORKLOAD_COMPUTE/WORKLOAD_CUSTOM for power saving. 
- */ - dev_dbg(smu->adev->dev, "Unsupported power profile mode %d on RENOIR\n", profile_mode); - return -EINVAL; - } + smu_cmn_get_backend_workload_mask(smu, workload_mask, + &backend_workload_mask); ret = smu_cmn_send_smc_msg_with_param(smu, SMU_MSG_ActiveProcessNotify, - smu->workload_mask, - NULL); + backend_workload_mask, + NULL); if (ret) { - dev_err_once(smu->adev->dev, "Fail to set workload type %d\n", workload_type); + dev_err_once(smu->adev->dev, "Failed to set workload mask 0x%08x\n", + workload_mask); return ret; } - smu_cmn_assign_power_profile(smu); - - return 0; + return ret; } static int renoir_set_peak_clock_by_device(struct smu_context *smu) diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c index 2bfea740dace..fbbdfa54f6a2 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c +++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c @@ -103,7 +103,8 @@ int smu_v13_0_init_microcode(struct smu_context *smu) return 0; amdgpu_ucode_ip_version_decode(adev, MP1_HWIP, ucode_prefix, sizeof(ucode_prefix)); - err = amdgpu_ucode_request(adev, &adev->pm.fw, "amdgpu/%s.bin", ucode_prefix); + err = amdgpu_ucode_request(adev, &adev->pm.fw, AMDGPU_UCODE_REQUIRED, + "amdgpu/%s.bin", ucode_prefix); if (err) goto out; @@ -1320,11 +1321,11 @@ static int smu_v13_0_set_irq_state(struct amdgpu_device *adev, return 0; } -static int smu_v13_0_ack_ac_dc_interrupt(struct smu_context *smu) +void smu_v13_0_interrupt_work(struct smu_context *smu) { - return smu_cmn_send_smc_msg(smu, - SMU_MSG_ReenableAcDcInterrupt, - NULL); + smu_cmn_send_smc_msg(smu, + SMU_MSG_ReenableAcDcInterrupt, + NULL); } #define THM_11_0__SRCID__THM_DIG_THERM_L2H 0 /* ASIC_TEMP > CG_THERMAL_INT.DIG_THERM_INTH */ @@ -1377,12 +1378,12 @@ static int smu_v13_0_irq_process(struct amdgpu_device *adev, switch (ctxid) { case SMU_IH_INTERRUPT_CONTEXT_ID_AC: dev_dbg(adev->dev, "Switched to AC mode!\n"); - smu_v13_0_ack_ac_dc_interrupt(smu); +
schedule_work(&smu->interrupt_work); adev->pm.ac_power = true; break; case SMU_IH_INTERRUPT_CONTEXT_ID_DC: dev_dbg(adev->dev, "Switched to DC mode!\n"); - smu_v13_0_ack_ac_dc_interrupt(smu); + schedule_work(&smu->interrupt_work); adev->pm.ac_power = false; break; case SMU_IH_INTERRUPT_CONTEXT_ID_THERMAL_THROTTLING: @@ -2108,18 +2109,14 @@ int smu_v13_0_set_vcn_enable(struct smu_context *smu, int inst) { struct amdgpu_device *adev = smu->adev; - int i, ret = 0; + int ret = 0; - for (i = 0; i < adev->vcn.num_vcn_inst; i++) { - if (adev->vcn.harvest_config & (1 << i)) - continue; + if (adev->vcn.harvest_config & (1 << inst)) + return ret; - ret = smu_cmn_send_smc_msg_with_param(smu, enable ? - SMU_MSG_PowerUpVcn : SMU_MSG_PowerDownVcn, - i << 16U, NULL); - if (ret) - return ret; - } + ret = smu_cmn_send_smc_msg_with_param(smu, enable ? + SMU_MSG_PowerUpVcn : SMU_MSG_PowerDownVcn, + inst << 16U, NULL); return ret; } diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c index 80c6b1e523aa..0551a3311217 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c +++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c @@ -2571,111 +2571,129 @@ static int smu_v13_0_0_get_power_profile_mode(struct smu_context *smu, return size; } -static int smu_v13_0_0_set_power_profile_mode(struct smu_context *smu, - long *input, - uint32_t size) +#define SMU_13_0_0_CUSTOM_PARAMS_COUNT 9 +#define SMU_13_0_0_CUSTOM_PARAMS_CLOCK_COUNT 2 +#define SMU_13_0_0_CUSTOM_PARAMS_SIZE (SMU_13_0_0_CUSTOM_PARAMS_CLOCK_COUNT * SMU_13_0_0_CUSTOM_PARAMS_COUNT * sizeof(long)) + +static int smu_v13_0_0_set_power_profile_mode_coeff(struct smu_context *smu, + long *input) { DpmActivityMonitorCoeffIntExternal_t activity_monitor_external; DpmActivityMonitorCoeffInt_t *activity_monitor = &(activity_monitor_external.DpmActivityMonitorCoeffInt); - int workload_type, ret = 0; - u32 workload_mask; + int ret, idx; - smu->power_profile_mode = 
input[size]; - - if (smu->power_profile_mode >= PP_SMC_POWER_PROFILE_COUNT) { - dev_err(smu->adev->dev, "Invalid power profile mode %d\n", smu->power_profile_mode); - return -EINVAL; + ret = smu_cmn_update_table(smu, + SMU_TABLE_ACTIVITY_MONITOR_COEFF, + WORKLOAD_PPLIB_CUSTOM_BIT, + (void *)(&activity_monitor_external), + false); + if (ret) { + dev_err(smu->adev->dev, "[%s] Failed to get activity monitor!", __func__); + return ret; } - if (smu->power_profile_mode == PP_SMC_POWER_PROFILE_CUSTOM) { - if (size != 9) - return -EINVAL; - - ret = smu_cmn_update_table(smu, - SMU_TABLE_ACTIVITY_MONITOR_COEFF, - WORKLOAD_PPLIB_CUSTOM_BIT, - (void *)(&activity_monitor_external), - false); - if (ret) { - dev_err(smu->adev->dev, "[%s] Failed to get activity monitor!", __func__); - return ret; - } - - switch (input[0]) { - case 0: /* Gfxclk */ - activity_monitor->Gfx_FPS = input[1]; - activity_monitor->Gfx_MinActiveFreqType = input[2]; - activity_monitor->Gfx_MinActiveFreq = input[3]; - activity_monitor->Gfx_BoosterFreqType = input[4]; - activity_monitor->Gfx_BoosterFreq = input[5]; - activity_monitor->Gfx_PD_Data_limit_c = input[6]; - activity_monitor->Gfx_PD_Data_error_coeff = input[7]; - activity_monitor->Gfx_PD_Data_error_rate_coeff = input[8]; - break; - case 1: /* Fclk */ - activity_monitor->Fclk_FPS = input[1]; - activity_monitor->Fclk_MinActiveFreqType = input[2]; - activity_monitor->Fclk_MinActiveFreq = input[3]; - activity_monitor->Fclk_BoosterFreqType = input[4]; - activity_monitor->Fclk_BoosterFreq = input[5]; - activity_monitor->Fclk_PD_Data_limit_c = input[6]; - activity_monitor->Fclk_PD_Data_error_coeff = input[7]; - activity_monitor->Fclk_PD_Data_error_rate_coeff = input[8]; - break; - default: - return -EINVAL; - } - - ret = smu_cmn_update_table(smu, - SMU_TABLE_ACTIVITY_MONITOR_COEFF, - WORKLOAD_PPLIB_CUSTOM_BIT, - (void *)(&activity_monitor_external), - true); - if (ret) { - dev_err(smu->adev->dev, "[%s] Failed to set activity monitor!", __func__); - return 
ret; - } + idx = 0 * SMU_13_0_0_CUSTOM_PARAMS_COUNT; + if (input[idx]) { + /* Gfxclk */ + activity_monitor->Gfx_FPS = input[idx + 1]; + activity_monitor->Gfx_MinActiveFreqType = input[idx + 2]; + activity_monitor->Gfx_MinActiveFreq = input[idx + 3]; + activity_monitor->Gfx_BoosterFreqType = input[idx + 4]; + activity_monitor->Gfx_BoosterFreq = input[idx + 5]; + activity_monitor->Gfx_PD_Data_limit_c = input[idx + 6]; + activity_monitor->Gfx_PD_Data_error_coeff = input[idx + 7]; + activity_monitor->Gfx_PD_Data_error_rate_coeff = input[idx + 8]; + } + idx = 1 * SMU_13_0_0_CUSTOM_PARAMS_COUNT; + if (input[idx]) { + /* Fclk */ + activity_monitor->Fclk_FPS = input[idx + 1]; + activity_monitor->Fclk_MinActiveFreqType = input[idx + 2]; + activity_monitor->Fclk_MinActiveFreq = input[idx + 3]; + activity_monitor->Fclk_BoosterFreqType = input[idx + 4]; + activity_monitor->Fclk_BoosterFreq = input[idx + 5]; + activity_monitor->Fclk_PD_Data_limit_c = input[idx + 6]; + activity_monitor->Fclk_PD_Data_error_coeff = input[idx + 7]; + activity_monitor->Fclk_PD_Data_error_rate_coeff = input[idx + 8]; } - /* conv PP_SMC_POWER_PROFILE* to WORKLOAD_PPLIB_*_BIT */ - workload_type = smu_cmn_to_asic_specific_index(smu, - CMN2ASIC_MAPPING_WORKLOAD, - smu->power_profile_mode); + ret = smu_cmn_update_table(smu, + SMU_TABLE_ACTIVITY_MONITOR_COEFF, + WORKLOAD_PPLIB_CUSTOM_BIT, + (void *)(&activity_monitor_external), + true); + if (ret) { + dev_err(smu->adev->dev, "[%s] Failed to set activity monitor!", __func__); + return ret; + } - if (workload_type < 0) - return -EINVAL; + return ret; +} - workload_mask = 1 << workload_type; +static int smu_v13_0_0_set_power_profile_mode(struct smu_context *smu, + u32 workload_mask, + long *custom_params, + u32 custom_params_max_idx) +{ + u32 backend_workload_mask = 0; + int workload_type, ret, idx = -1, i; + + smu_cmn_get_backend_workload_mask(smu, workload_mask, + &backend_workload_mask); /* Add optimizations for SMU13.0.0/10. 
Reuse the power saving profile */ - if ((amdgpu_ip_version(smu->adev, MP1_HWIP, 0) == IP_VERSION(13, 0, 0) && - ((smu->adev->pm.fw_version == 0x004e6601) || - (smu->adev->pm.fw_version >= 0x004e7300))) || - (amdgpu_ip_version(smu->adev, MP1_HWIP, 0) == IP_VERSION(13, 0, 10) && - smu->adev->pm.fw_version >= 0x00504500)) { + if ((workload_mask & (1 << PP_SMC_POWER_PROFILE_COMPUTE)) && + ((amdgpu_ip_version(smu->adev, MP1_HWIP, 0) == IP_VERSION(13, 0, 0) && + ((smu->adev->pm.fw_version == 0x004e6601) || + (smu->adev->pm.fw_version >= 0x004e7300))) || + (amdgpu_ip_version(smu->adev, MP1_HWIP, 0) == IP_VERSION(13, 0, 10) && + smu->adev->pm.fw_version >= 0x00504500))) { workload_type = smu_cmn_to_asic_specific_index(smu, CMN2ASIC_MAPPING_WORKLOAD, PP_SMC_POWER_PROFILE_POWERSAVING); if (workload_type >= 0) - workload_mask |= 1 << workload_type; + backend_workload_mask |= 1 << workload_type; } - smu->workload_mask |= workload_mask; - ret = smu_cmn_send_smc_msg_with_param(smu, - SMU_MSG_SetWorkloadMask, - smu->workload_mask, - NULL); - if (!ret) { - smu_cmn_assign_power_profile(smu); - if (smu->power_profile_mode == PP_SMC_POWER_PROFILE_POWERSAVING) { - workload_type = smu_cmn_to_asic_specific_index(smu, - CMN2ASIC_MAPPING_WORKLOAD, - PP_SMC_POWER_PROFILE_FULLSCREEN3D); - smu->power_profile_mode = smu->workload_mask & (1 << workload_type) - ? 
PP_SMC_POWER_PROFILE_FULLSCREEN3D - : PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT; + if (workload_mask & (1 << PP_SMC_POWER_PROFILE_CUSTOM)) { + if (!smu->custom_profile_params) { + smu->custom_profile_params = + kzalloc(SMU_13_0_0_CUSTOM_PARAMS_SIZE, GFP_KERNEL); + if (!smu->custom_profile_params) + return -ENOMEM; } + if (custom_params && custom_params_max_idx) { + if (custom_params_max_idx != SMU_13_0_0_CUSTOM_PARAMS_COUNT) + return -EINVAL; + if (custom_params[0] >= SMU_13_0_0_CUSTOM_PARAMS_CLOCK_COUNT) + return -EINVAL; + idx = custom_params[0] * SMU_13_0_0_CUSTOM_PARAMS_COUNT; + smu->custom_profile_params[idx] = 1; + for (i = 1; i < custom_params_max_idx; i++) + smu->custom_profile_params[idx + i] = custom_params[i]; + } + ret = smu_v13_0_0_set_power_profile_mode_coeff(smu, + smu->custom_profile_params); + if (ret) { + if (idx != -1) + smu->custom_profile_params[idx] = 0; + return ret; + } + } else if (smu->custom_profile_params) { + memset(smu->custom_profile_params, 0, SMU_13_0_0_CUSTOM_PARAMS_SIZE); + } + + ret = smu_cmn_send_smc_msg_with_param(smu, + SMU_MSG_SetWorkloadMask, + backend_workload_mask, + NULL); + if (ret) { + dev_err(smu->adev->dev, "Failed to set workload mask 0x%08x\n", + workload_mask); + if (idx != -1) + smu->custom_profile_params[idx] = 0; + return ret; } return ret; @@ -3202,6 +3220,7 @@ static const struct pptable_funcs smu_v13_0_0_ppt_funcs = { .is_asic_wbrf_supported = smu_v13_0_0_wbrf_support_check, .enable_uclk_shadow = smu_v13_0_enable_uclk_shadow, .set_wbrf_exclusion_ranges = smu_v13_0_set_wbrf_exclusion_ranges, + .interrupt_work = smu_v13_0_interrupt_work, }; void smu_v13_0_0_set_ppt_funcs(struct smu_context *smu) diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c index ab3c93ddce46..8ab30b2f7119 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c +++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c @@ -119,6 +119,21 @@ static inline bool 
smu_v13_0_6_is_other_end_count_available(struct smu_context * } } +static inline bool smu_v13_0_6_is_blw_host_limit_available(struct smu_context *smu) +{ + if (smu->adev->flags & AMD_IS_APU) + return smu->smc_fw_version >= 0x04556F00; + + switch (amdgpu_ip_version(smu->adev, MP1_HWIP, 0)) { + case IP_VERSION(13, 0, 6): + return smu->smc_fw_version >= 0x557900; + case IP_VERSION(13, 0, 14): + return smu->smc_fw_version >= 0x05551000; + default: + return false; + } +} + struct mca_bank_ipid { enum amdgpu_mca_ip ip; uint16_t hwid; @@ -193,6 +208,8 @@ static const struct cmn2asic_msg_mapping smu_v13_0_6_message_map[SMU_MSG_MAX_COU MSG_MAP(SelectPLPDMode, PPSMC_MSG_SelectPLPDMode, 0), MSG_MAP(RmaDueToBadPageThreshold, PPSMC_MSG_RmaDueToBadPageThreshold, 0), MSG_MAP(SelectPstatePolicy, PPSMC_MSG_SelectPstatePolicy, 0), + MSG_MAP(ResetSDMA, PPSMC_MSG_ResetSDMA, 0), + MSG_MAP(ResetSDMA2, PPSMC_MSG_ResetSDMA2, 0), }; // clang-format on @@ -304,7 +321,8 @@ static int smu_v13_0_6_init_microcode(struct smu_context *smu) amdgpu_ucode_ip_version_decode(adev, MP1_HWIP, ucode_prefix, sizeof(ucode_prefix)); - ret = amdgpu_ucode_request(adev, &adev->pm.fw, "amdgpu/%s.bin", ucode_prefix); + ret = amdgpu_ucode_request(adev, &adev->pm.fw, AMDGPU_UCODE_REQUIRED, + "amdgpu/%s.bin", ucode_prefix); if (ret) goto out; @@ -2356,6 +2374,9 @@ static ssize_t smu_v13_0_6_get_gpu_metrics(struct smu_context *smu, void **table gpu_metrics->average_umc_activity = SMUQ10_ROUND(GET_METRIC_FIELD(DramBandwidthUtilization, flag)); + gpu_metrics->mem_max_bandwidth = + SMUQ10_ROUND(GET_METRIC_FIELD(MaxDramBandwidth, flag)); + gpu_metrics->curr_socket_power = SMUQ10_ROUND(GET_METRIC_FIELD(SocketPower, flag)); /* Energy counter reported in 15.259uJ (2^-16) units */ @@ -2494,6 +2515,11 @@ static ssize_t smu_v13_0_6_get_gpu_metrics(struct smu_context *smu, void **table SMUQ10_ROUND(metrics_x->GfxBusy[inst]); gpu_metrics->xcp_stats[i].gfx_busy_acc[idx] = SMUQ10_ROUND(metrics_x->GfxBusyAcc[inst]); + + if 
(smu_v13_0_6_is_blw_host_limit_available(smu)) + gpu_metrics->xcp_stats[i].gfx_below_host_limit_acc[idx] = + SMUQ10_ROUND(metrics_x->GfxclkBelowHostLimitAcc + [inst]); idx++; } } @@ -2716,6 +2742,41 @@ static int smu_v13_0_6_send_rma_reason(struct smu_context *smu) return ret; } +static int smu_v13_0_6_reset_sdma(struct smu_context *smu, uint32_t inst_mask) +{ + uint32_t smu_program; + int ret = 0; + + smu_program = (smu->smc_fw_version >> 24) & 0xff; + switch (amdgpu_ip_version(smu->adev, MP1_HWIP, 0)) { + case IP_VERSION(13, 0, 6): + if (((smu_program == 7) && (smu->smc_fw_version > 0x07550700)) || + ((smu_program == 0) && (smu->smc_fw_version > 0x00557700))) + ret = smu_cmn_send_smc_msg_with_param(smu, + SMU_MSG_ResetSDMA, inst_mask, NULL); + else if ((smu_program == 4) && + (smu->smc_fw_version > 0x4556e6c)) + ret = smu_cmn_send_smc_msg_with_param(smu, + SMU_MSG_ResetSDMA2, inst_mask, NULL); + break; + case IP_VERSION(13, 0, 14): + if ((smu_program == 5) && + (smu->smc_fw_version > 0x05550f00)) + ret = smu_cmn_send_smc_msg_with_param(smu, + SMU_MSG_ResetSDMA2, inst_mask, NULL); + break; + default: + break; + } + + if (ret) + dev_err(smu->adev->dev, + "failed to send ResetSDMA event with mask 0x%x\n", + inst_mask); + + return ret; +} + static int mca_smu_set_debug_mode(struct amdgpu_device *adev, bool enable) { struct smu_context *smu = adev->powerplay.pp_handle; @@ -3385,6 +3446,7 @@ static const struct pptable_funcs smu_v13_0_6_ppt_funcs = { .i2c_fini = smu_v13_0_6_i2c_control_fini, .send_hbm_bad_pages_num = smu_v13_0_6_smu_send_hbm_bad_page_num, .send_rma_reason = smu_v13_0_6_send_rma_reason, + .reset_sdma = smu_v13_0_6_reset_sdma, }; void smu_v13_0_6_set_ppt_funcs(struct smu_context *smu) diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_7_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_7_ppt.c index 4fd0354bd312..55ef18517b0f 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_7_ppt.c +++ 
b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_7_ppt.c @@ -2530,79 +2530,110 @@ out: return result; } -static int smu_v13_0_7_set_power_profile_mode(struct smu_context *smu, long *input, uint32_t size) +#define SMU_13_0_7_CUSTOM_PARAMS_COUNT 8 +#define SMU_13_0_7_CUSTOM_PARAMS_CLOCK_COUNT 2 +#define SMU_13_0_7_CUSTOM_PARAMS_SIZE (SMU_13_0_7_CUSTOM_PARAMS_CLOCK_COUNT * SMU_13_0_7_CUSTOM_PARAMS_COUNT * sizeof(long)) + +static int smu_v13_0_7_set_power_profile_mode_coeff(struct smu_context *smu, + long *input) { DpmActivityMonitorCoeffIntExternal_t activity_monitor_external; DpmActivityMonitorCoeffInt_t *activity_monitor = &(activity_monitor_external.DpmActivityMonitorCoeffInt); - int workload_type, ret = 0; + int ret, idx; - smu->power_profile_mode = input[size]; - - if (smu->power_profile_mode > PP_SMC_POWER_PROFILE_WINDOW3D) { - dev_err(smu->adev->dev, "Invalid power profile mode %d\n", smu->power_profile_mode); - return -EINVAL; + ret = smu_cmn_update_table(smu, + SMU_TABLE_ACTIVITY_MONITOR_COEFF, WORKLOAD_PPLIB_CUSTOM_BIT, + (void *)(&activity_monitor_external), false); + if (ret) { + dev_err(smu->adev->dev, "[%s] Failed to get activity monitor!", __func__); + return ret; } - if (smu->power_profile_mode == PP_SMC_POWER_PROFILE_CUSTOM) { - if (size != 8) - return -EINVAL; - - ret = smu_cmn_update_table(smu, - SMU_TABLE_ACTIVITY_MONITOR_COEFF, WORKLOAD_PPLIB_CUSTOM_BIT, - (void *)(&activity_monitor_external), false); - if (ret) { - dev_err(smu->adev->dev, "[%s] Failed to get activity monitor!", __func__); - return ret; - } - - switch (input[0]) { - case 0: /* Gfxclk */ - activity_monitor->Gfx_ActiveHystLimit = input[1]; - activity_monitor->Gfx_IdleHystLimit = input[2]; - activity_monitor->Gfx_FPS = input[3]; - activity_monitor->Gfx_MinActiveFreqType = input[4]; - activity_monitor->Gfx_BoosterFreqType = input[5]; - activity_monitor->Gfx_MinActiveFreq = input[6]; - activity_monitor->Gfx_BoosterFreq = input[7]; - break; - case 1: /* Fclk */ - 
activity_monitor->Fclk_ActiveHystLimit = input[1]; - activity_monitor->Fclk_IdleHystLimit = input[2]; - activity_monitor->Fclk_FPS = input[3]; - activity_monitor->Fclk_MinActiveFreqType = input[4]; - activity_monitor->Fclk_BoosterFreqType = input[5]; - activity_monitor->Fclk_MinActiveFreq = input[6]; - activity_monitor->Fclk_BoosterFreq = input[7]; - break; - default: - return -EINVAL; - } - - ret = smu_cmn_update_table(smu, - SMU_TABLE_ACTIVITY_MONITOR_COEFF, WORKLOAD_PPLIB_CUSTOM_BIT, - (void *)(&activity_monitor_external), true); - if (ret) { - dev_err(smu->adev->dev, "[%s] Failed to set activity monitor!", __func__); - return ret; - } + idx = 0 * SMU_13_0_7_CUSTOM_PARAMS_COUNT; + if (input[idx]) { + /* Gfxclk */ + activity_monitor->Gfx_ActiveHystLimit = input[idx + 1]; + activity_monitor->Gfx_IdleHystLimit = input[idx + 2]; + activity_monitor->Gfx_FPS = input[idx + 3]; + activity_monitor->Gfx_MinActiveFreqType = input[idx + 4]; + activity_monitor->Gfx_BoosterFreqType = input[idx + 5]; + activity_monitor->Gfx_MinActiveFreq = input[idx + 6]; + activity_monitor->Gfx_BoosterFreq = input[idx + 7]; + } + idx = 1 * SMU_13_0_7_CUSTOM_PARAMS_COUNT; + if (input[idx]) { + /* Fclk */ + activity_monitor->Fclk_ActiveHystLimit = input[idx + 1]; + activity_monitor->Fclk_IdleHystLimit = input[idx + 2]; + activity_monitor->Fclk_FPS = input[idx + 3]; + activity_monitor->Fclk_MinActiveFreqType = input[idx + 4]; + activity_monitor->Fclk_BoosterFreqType = input[idx + 5]; + activity_monitor->Fclk_MinActiveFreq = input[idx + 6]; + activity_monitor->Fclk_BoosterFreq = input[idx + 7]; } - /* conv PP_SMC_POWER_PROFILE* to WORKLOAD_PPLIB_*_BIT */ - workload_type = smu_cmn_to_asic_specific_index(smu, - CMN2ASIC_MAPPING_WORKLOAD, - smu->power_profile_mode); - if (workload_type < 0) - return -EINVAL; + ret = smu_cmn_update_table(smu, + SMU_TABLE_ACTIVITY_MONITOR_COEFF, WORKLOAD_PPLIB_CUSTOM_BIT, + (void *)(&activity_monitor_external), true); + if (ret) { + dev_err(smu->adev->dev, "[%s] 
Failed to set activity monitor!", __func__); + return ret; + } + + return ret; +} + +static int smu_v13_0_7_set_power_profile_mode(struct smu_context *smu, + u32 workload_mask, + long *custom_params, + u32 custom_params_max_idx) +{ + u32 backend_workload_mask = 0; + int ret, idx = -1, i; + + smu_cmn_get_backend_workload_mask(smu, workload_mask, + &backend_workload_mask); + + if (workload_mask & (1 << PP_SMC_POWER_PROFILE_CUSTOM)) { + if (!smu->custom_profile_params) { + smu->custom_profile_params = + kzalloc(SMU_13_0_7_CUSTOM_PARAMS_SIZE, GFP_KERNEL); + if (!smu->custom_profile_params) + return -ENOMEM; + } + if (custom_params && custom_params_max_idx) { + if (custom_params_max_idx != SMU_13_0_7_CUSTOM_PARAMS_COUNT) + return -EINVAL; + if (custom_params[0] >= SMU_13_0_7_CUSTOM_PARAMS_CLOCK_COUNT) + return -EINVAL; + idx = custom_params[0] * SMU_13_0_7_CUSTOM_PARAMS_COUNT; + smu->custom_profile_params[idx] = 1; + for (i = 1; i < custom_params_max_idx; i++) + smu->custom_profile_params[idx + i] = custom_params[i]; + } + ret = smu_v13_0_7_set_power_profile_mode_coeff(smu, + smu->custom_profile_params); + if (ret) { + if (idx != -1) + smu->custom_profile_params[idx] = 0; + return ret; + } + } else if (smu->custom_profile_params) { + memset(smu->custom_profile_params, 0, SMU_13_0_7_CUSTOM_PARAMS_SIZE); + } ret = smu_cmn_send_smc_msg_with_param(smu, SMU_MSG_SetWorkloadMask, - smu->workload_mask, NULL); + backend_workload_mask, NULL); - if (ret) - dev_err(smu->adev->dev, "[%s] Failed to set work load mask!", __func__); - else - smu_cmn_assign_power_profile(smu); + if (ret) { + dev_err(smu->adev->dev, "Failed to set workload mask 0x%08x\n", + workload_mask); + if (idx != -1) + smu->custom_profile_params[idx] = 0; + return ret; + } return ret; } @@ -2766,6 +2797,7 @@ static const struct pptable_funcs smu_v13_0_7_ppt_funcs = { .is_asic_wbrf_supported = smu_v13_0_7_wbrf_support_check, .enable_uclk_shadow = smu_v13_0_enable_uclk_shadow, .set_wbrf_exclusion_ranges = 
smu_v13_0_set_wbrf_exclusion_ranges, + .interrupt_work = smu_v13_0_interrupt_work, }; void smu_v13_0_7_set_ppt_funcs(struct smu_context *smu) @@ -2779,4 +2811,5 @@ void smu_v13_0_7_set_ppt_funcs(struct smu_context *smu) smu->workload_map = smu_v13_0_7_workload_map; smu->smc_driver_if_version = SMU13_0_7_DRIVER_IF_VERSION; smu_v13_0_set_smu_mailbox_registers(smu); + smu->power_profile_mode = PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT; } diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu14/smu_v14_0.c b/drivers/gpu/drm/amd/pm/swsmu/smu14/smu_v14_0.c index a87040cb2f2e..9b2f4fe1578b 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/smu14/smu_v14_0.c +++ b/drivers/gpu/drm/amd/pm/swsmu/smu14/smu_v14_0.c @@ -79,7 +79,8 @@ int smu_v14_0_init_microcode(struct smu_context *smu) return 0; amdgpu_ucode_ip_version_decode(adev, MP1_HWIP, ucode_prefix, sizeof(ucode_prefix)); - err = amdgpu_ucode_request(adev, &adev->pm.fw, "amdgpu/%s.bin", ucode_prefix); + err = amdgpu_ucode_request(adev, &adev->pm.fw, AMDGPU_UCODE_REQUIRED, + "amdgpu/%s.bin", ucode_prefix); if (err) goto out; @@ -1511,29 +1512,24 @@ int smu_v14_0_set_vcn_enable(struct smu_context *smu, int inst) { struct amdgpu_device *adev = smu->adev; - int i, ret = 0; + int ret = 0; - for (i = 0; i < adev->vcn.num_vcn_inst; i++) { - if (adev->vcn.harvest_config & (1 << i)) - continue; + if (adev->vcn.harvest_config & (1 << inst)) + return ret; - if (smu->is_apu) { - if (i == 0) - ret = smu_cmn_send_smc_msg_with_param(smu, enable ? - SMU_MSG_PowerUpVcn0 : SMU_MSG_PowerDownVcn0, - i << 16U, NULL); - else if (i == 1) - ret = smu_cmn_send_smc_msg_with_param(smu, enable ? - SMU_MSG_PowerUpVcn1 : SMU_MSG_PowerDownVcn1, - i << 16U, NULL); - } else { + if (smu->is_apu) { + if (inst == 0) ret = smu_cmn_send_smc_msg_with_param(smu, enable ? 
- SMU_MSG_PowerUpVcn : SMU_MSG_PowerDownVcn, - i << 16U, NULL); - } - - if (ret) - return ret; + SMU_MSG_PowerUpVcn0 : SMU_MSG_PowerDownVcn0, + inst << 16U, NULL); + else if (inst == 1) + ret = smu_cmn_send_smc_msg_with_param(smu, enable ? + SMU_MSG_PowerUpVcn1 : SMU_MSG_PowerDownVcn1, + inst << 16U, NULL); + } else { + ret = smu_cmn_send_smc_msg_with_param(smu, enable ? + SMU_MSG_PowerUpVcn : SMU_MSG_PowerDownVcn, + inst << 16U, NULL); } return ret; diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu14/smu_v14_0_2_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu14/smu_v14_0_2_ppt.c index 687a0f5ac94f..5cad09c5f2ff 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/smu14/smu_v14_0_2_ppt.c +++ b/drivers/gpu/drm/amd/pm/swsmu/smu14/smu_v14_0_2_ppt.c @@ -1739,89 +1739,120 @@ static int smu_v14_0_2_get_power_profile_mode(struct smu_context *smu, return size; } -static int smu_v14_0_2_set_power_profile_mode(struct smu_context *smu, - long *input, - uint32_t size) +#define SMU_14_0_2_CUSTOM_PARAMS_COUNT 9 +#define SMU_14_0_2_CUSTOM_PARAMS_CLOCK_COUNT 2 +#define SMU_14_0_2_CUSTOM_PARAMS_SIZE (SMU_14_0_2_CUSTOM_PARAMS_CLOCK_COUNT * SMU_14_0_2_CUSTOM_PARAMS_COUNT * sizeof(long)) + +static int smu_v14_0_2_set_power_profile_mode_coeff(struct smu_context *smu, + long *input) { DpmActivityMonitorCoeffIntExternal_t activity_monitor_external; DpmActivityMonitorCoeffInt_t *activity_monitor = &(activity_monitor_external.DpmActivityMonitorCoeffInt); - int workload_type, ret = 0; - uint32_t current_profile_mode = smu->power_profile_mode; - smu->power_profile_mode = input[size]; + int ret, idx; - if (smu->power_profile_mode >= PP_SMC_POWER_PROFILE_COUNT) { - dev_err(smu->adev->dev, "Invalid power profile mode %d\n", smu->power_profile_mode); - return -EINVAL; + ret = smu_cmn_update_table(smu, + SMU_TABLE_ACTIVITY_MONITOR_COEFF, + WORKLOAD_PPLIB_CUSTOM_BIT, + (void *)(&activity_monitor_external), + false); + if (ret) { + dev_err(smu->adev->dev, "[%s] Failed to get activity monitor!", __func__); + return ret; 
} - if (smu->power_profile_mode == PP_SMC_POWER_PROFILE_CUSTOM) { - if (size != 9) - return -EINVAL; - - ret = smu_cmn_update_table(smu, - SMU_TABLE_ACTIVITY_MONITOR_COEFF, - WORKLOAD_PPLIB_CUSTOM_BIT, - (void *)(&activity_monitor_external), - false); - if (ret) { - dev_err(smu->adev->dev, "[%s] Failed to get activity monitor!", __func__); - return ret; - } - - switch (input[0]) { - case 0: /* Gfxclk */ - activity_monitor->Gfx_FPS = input[1]; - activity_monitor->Gfx_MinActiveFreqType = input[2]; - activity_monitor->Gfx_MinActiveFreq = input[3]; - activity_monitor->Gfx_BoosterFreqType = input[4]; - activity_monitor->Gfx_BoosterFreq = input[5]; - activity_monitor->Gfx_PD_Data_limit_c = input[6]; - activity_monitor->Gfx_PD_Data_error_coeff = input[7]; - activity_monitor->Gfx_PD_Data_error_rate_coeff = input[8]; - break; - case 1: /* Fclk */ - activity_monitor->Fclk_FPS = input[1]; - activity_monitor->Fclk_MinActiveFreqType = input[2]; - activity_monitor->Fclk_MinActiveFreq = input[3]; - activity_monitor->Fclk_BoosterFreqType = input[4]; - activity_monitor->Fclk_BoosterFreq = input[5]; - activity_monitor->Fclk_PD_Data_limit_c = input[6]; - activity_monitor->Fclk_PD_Data_error_coeff = input[7]; - activity_monitor->Fclk_PD_Data_error_rate_coeff = input[8]; - break; - default: - return -EINVAL; - } - - ret = smu_cmn_update_table(smu, - SMU_TABLE_ACTIVITY_MONITOR_COEFF, - WORKLOAD_PPLIB_CUSTOM_BIT, - (void *)(&activity_monitor_external), - true); - if (ret) { - dev_err(smu->adev->dev, "[%s] Failed to set activity monitor!", __func__); - return ret; - } + idx = 0 * SMU_14_0_2_CUSTOM_PARAMS_COUNT; + if (input[idx]) { + /* Gfxclk */ + activity_monitor->Gfx_FPS = input[idx + 1]; + activity_monitor->Gfx_MinActiveFreqType = input[idx + 2]; + activity_monitor->Gfx_MinActiveFreq = input[idx + 3]; + activity_monitor->Gfx_BoosterFreqType = input[idx + 4]; + activity_monitor->Gfx_BoosterFreq = input[idx + 5]; + activity_monitor->Gfx_PD_Data_limit_c = input[idx + 6]; + 
activity_monitor->Gfx_PD_Data_error_coeff = input[idx + 7]; + activity_monitor->Gfx_PD_Data_error_rate_coeff = input[idx + 8]; + } + idx = 1 * SMU_14_0_2_CUSTOM_PARAMS_COUNT; + if (input[idx]) { + /* Fclk */ + activity_monitor->Fclk_FPS = input[idx + 1]; + activity_monitor->Fclk_MinActiveFreqType = input[idx + 2]; + activity_monitor->Fclk_MinActiveFreq = input[idx + 3]; + activity_monitor->Fclk_BoosterFreqType = input[idx + 4]; + activity_monitor->Fclk_BoosterFreq = input[idx + 5]; + activity_monitor->Fclk_PD_Data_limit_c = input[idx + 6]; + activity_monitor->Fclk_PD_Data_error_coeff = input[idx + 7]; + activity_monitor->Fclk_PD_Data_error_rate_coeff = input[idx + 8]; } - if (smu->power_profile_mode == PP_SMC_POWER_PROFILE_COMPUTE) + ret = smu_cmn_update_table(smu, + SMU_TABLE_ACTIVITY_MONITOR_COEFF, + WORKLOAD_PPLIB_CUSTOM_BIT, + (void *)(&activity_monitor_external), + true); + if (ret) { + dev_err(smu->adev->dev, "[%s] Failed to set activity monitor!", __func__); + return ret; + } + + return ret; +} + +static int smu_v14_0_2_set_power_profile_mode(struct smu_context *smu, + u32 workload_mask, + long *custom_params, + u32 custom_params_max_idx) +{ + u32 backend_workload_mask = 0; + int ret, idx = -1, i; + + smu_cmn_get_backend_workload_mask(smu, workload_mask, + &backend_workload_mask); + + /* disable deep sleep if compute is enabled */ + if (workload_mask & (1 << PP_SMC_POWER_PROFILE_COMPUTE)) smu_v14_0_deep_sleep_control(smu, false); - else if (current_profile_mode == PP_SMC_POWER_PROFILE_COMPUTE) + else smu_v14_0_deep_sleep_control(smu, true); - /* conv PP_SMC_POWER_PROFILE* to WORKLOAD_PPLIB_*_BIT */ - workload_type = smu_cmn_to_asic_specific_index(smu, - CMN2ASIC_MAPPING_WORKLOAD, - smu->power_profile_mode); - if (workload_type < 0) - return -EINVAL; + if (workload_mask & (1 << PP_SMC_POWER_PROFILE_CUSTOM)) { + if (!smu->custom_profile_params) { + smu->custom_profile_params = + kzalloc(SMU_14_0_2_CUSTOM_PARAMS_SIZE, GFP_KERNEL); + if 
(!smu->custom_profile_params) + return -ENOMEM; + } + if (custom_params && custom_params_max_idx) { + if (custom_params_max_idx != SMU_14_0_2_CUSTOM_PARAMS_COUNT) + return -EINVAL; + if (custom_params[0] >= SMU_14_0_2_CUSTOM_PARAMS_CLOCK_COUNT) + return -EINVAL; + idx = custom_params[0] * SMU_14_0_2_CUSTOM_PARAMS_COUNT; + smu->custom_profile_params[idx] = 1; + for (i = 1; i < custom_params_max_idx; i++) + smu->custom_profile_params[idx + i] = custom_params[i]; + } + ret = smu_v14_0_2_set_power_profile_mode_coeff(smu, + smu->custom_profile_params); + if (ret) { + if (idx != -1) + smu->custom_profile_params[idx] = 0; + return ret; + } + } else if (smu->custom_profile_params) { + memset(smu->custom_profile_params, 0, SMU_14_0_2_CUSTOM_PARAMS_SIZE); + } ret = smu_cmn_send_smc_msg_with_param(smu, SMU_MSG_SetWorkloadMask, - smu->workload_mask, NULL); - - if (!ret) - smu_cmn_assign_power_profile(smu); + backend_workload_mask, NULL); + if (ret) { + dev_err(smu->adev->dev, "Failed to set workload mask 0x%08x\n", + workload_mask); + if (idx != -1) + smu->custom_profile_params[idx] = 0; + return ret; + } return ret; } @@ -2065,7 +2096,7 @@ static int smu_v14_0_2_enable_gfx_features(struct smu_context *smu) { struct amdgpu_device *adev = smu->adev; - if (adev->ip_versions[MP1_HWIP][0] == IP_VERSION(14, 0, 2)) + if (amdgpu_ip_version(adev, MP1_HWIP, 0) == IP_VERSION(14, 0, 2)) return smu_cmn_send_smc_msg_with_param(smu, SMU_MSG_EnableAllSmuFeatures, FEATURE_PWR_GFX, NULL); else diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c index dbbd3759bff3..9f55207ea9bc 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c +++ b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c @@ -1144,14 +1144,6 @@ int smu_cmn_set_mp1_state(struct smu_context *smu, return ret; } -void smu_cmn_assign_power_profile(struct smu_context *smu) -{ - uint32_t index; - index = fls(smu->workload_mask); - index = index > 0 && index <= WORKLOAD_POLICY_MAX ? 
index - 1 : 0; - smu->power_profile_mode = smu->workload_setting[index]; -} - bool smu_cmn_is_audio_func_enabled(struct amdgpu_device *adev) { struct pci_dev *p = NULL; @@ -1229,3 +1221,28 @@ void smu_cmn_generic_plpd_policy_desc(struct smu_dpm_policy *policy) { policy->desc = &xgmi_plpd_policy_desc; } + +void smu_cmn_get_backend_workload_mask(struct smu_context *smu, + u32 workload_mask, + u32 *backend_workload_mask) +{ + int workload_type; + u32 profile_mode; + + *backend_workload_mask = 0; + + for (profile_mode = 0; profile_mode < PP_SMC_POWER_PROFILE_COUNT; profile_mode++) { + if (!(workload_mask & (1 << profile_mode))) + continue; + + /* conv PP_SMC_POWER_PROFILE* to WORKLOAD_PPLIB_*_BIT */ + workload_type = smu_cmn_to_asic_specific_index(smu, + CMN2ASIC_MAPPING_WORKLOAD, + profile_mode); + + if (workload_type < 0) + continue; + + *backend_workload_mask |= 1 << workload_type; + } +} diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.h b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.h index 8a801e389659..a020277dec3e 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.h +++ b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.h @@ -130,8 +130,6 @@ void smu_cmn_init_soft_gpu_metrics(void *table, uint8_t frev, uint8_t crev); int smu_cmn_set_mp1_state(struct smu_context *smu, enum pp_mp1_state mp1_state); -void smu_cmn_assign_power_profile(struct smu_context *smu); - /* * Helper function to make sysfs_emit_at() happy. Align buf to * the current page boundary and record the offset. 
@@ -149,5 +147,9 @@ bool smu_cmn_is_audio_func_enabled(struct amdgpu_device *adev); void smu_cmn_generic_soc_policy_desc(struct smu_dpm_policy *policy); void smu_cmn_generic_plpd_policy_desc(struct smu_dpm_policy *policy); +void smu_cmn_get_backend_workload_mask(struct smu_context *smu, + u32 workload_mask, + u32 *backend_workload_mask); + #endif #endif diff --git a/drivers/gpu/drm/arm/display/komeda/komeda_drv.c b/drivers/gpu/drm/arm/display/komeda/komeda_drv.c index 6d475bb34002..358c1512b087 100644 --- a/drivers/gpu/drm/arm/display/komeda/komeda_drv.c +++ b/drivers/gpu/drm/arm/display/komeda/komeda_drv.c @@ -9,7 +9,7 @@ #include #include #include -#include +#include #include #include #include "komeda_dev.h" @@ -153,7 +153,7 @@ static const struct dev_pm_ops komeda_pm_ops = { static struct platform_driver komeda_platform_driver = { .probe = komeda_platform_probe, - .remove_new = komeda_platform_remove, + .remove = komeda_platform_remove, .shutdown = komeda_platform_shutdown, .driver = { .name = "komeda", diff --git a/drivers/gpu/drm/arm/display/komeda/komeda_kms.c b/drivers/gpu/drm/arm/display/komeda/komeda_kms.c index 1e7b1fcb2848..6ed504099188 100644 --- a/drivers/gpu/drm/arm/display/komeda/komeda_kms.c +++ b/drivers/gpu/drm/arm/display/komeda/komeda_kms.c @@ -63,7 +63,6 @@ static const struct drm_driver komeda_kms_driver = { .fops = &komeda_cma_fops, .name = "komeda", .desc = "Arm Komeda Display Processor driver", - .date = "20181101", .major = 0, .minor = 1, }; diff --git a/drivers/gpu/drm/arm/hdlcd_drv.c b/drivers/gpu/drm/arm/hdlcd_drv.c index cd4389809d42..c3179d74f3f5 100644 --- a/drivers/gpu/drm/arm/hdlcd_drv.c +++ b/drivers/gpu/drm/arm/hdlcd_drv.c @@ -22,8 +22,8 @@ #include #include +#include #include -#include #include #include #include @@ -233,7 +233,6 @@ static const struct drm_driver hdlcd_driver = { .fops = &fops, .name = "hdlcd", .desc = "ARM HDLCD Controller DRM", - .date = "20151021", .major = 1, .minor = 0, }; @@ -405,7 +404,7 @@ static 
SIMPLE_DEV_PM_OPS(hdlcd_pm_ops, hdlcd_pm_suspend, hdlcd_pm_resume); static struct platform_driver hdlcd_platform_driver = { .probe = hdlcd_probe, - .remove_new = hdlcd_remove, + .remove = hdlcd_remove, .shutdown = hdlcd_shutdown, .driver = { .name = "hdlcd", diff --git a/drivers/gpu/drm/arm/malidp_drv.c b/drivers/gpu/drm/arm/malidp_drv.c index 4cb25004b84f..e083021e9e99 100644 --- a/drivers/gpu/drm/arm/malidp_drv.c +++ b/drivers/gpu/drm/arm/malidp_drv.c @@ -16,9 +16,9 @@ #include #include +#include #include #include -#include #include #include #include @@ -570,7 +570,6 @@ static const struct drm_driver malidp_driver = { .fops = &fops, .name = "mali-dp", .desc = "ARM Mali Display Processor driver", - .date = "20160106", .major = 1, .minor = 0, }; @@ -988,7 +987,7 @@ static const struct dev_pm_ops malidp_pm_ops = { static struct platform_driver malidp_platform_driver = { .probe = malidp_platform_probe, - .remove_new = malidp_platform_remove, + .remove = malidp_platform_remove, .shutdown = malidp_platform_shutdown, .driver = { .name = "mali-dp", diff --git a/drivers/gpu/drm/armada/armada_crtc.c b/drivers/gpu/drm/armada/armada_crtc.c index c78687c755a8..0900e4466ffb 100644 --- a/drivers/gpu/drm/armada/armada_crtc.c +++ b/drivers/gpu/drm/armada/armada_crtc.c @@ -1084,7 +1084,7 @@ MODULE_DEVICE_TABLE(platform, armada_lcd_platform_ids); struct platform_driver armada_lcd_platform_driver = { .probe = armada_lcd_probe, - .remove_new = armada_lcd_remove, + .remove = armada_lcd_remove, .driver = { .name = "armada-lcd", .owner = THIS_MODULE, diff --git a/drivers/gpu/drm/armada/armada_drv.c b/drivers/gpu/drm/armada/armada_drv.c index 5c26f0409478..cae25ad66c74 100644 --- a/drivers/gpu/drm/armada/armada_drv.c +++ b/drivers/gpu/drm/armada/armada_drv.c @@ -11,8 +11,8 @@ #include #include +#include #include -#include #include #include #include @@ -45,7 +45,6 @@ static const struct drm_driver armada_drm_driver = { .minor = 0, .name = "armada-drm", .desc = "Armada SoC DRM", - .date = 
"20120730", .driver_features = DRIVER_GEM | DRIVER_MODESET | DRIVER_ATOMIC, .ioctls = armada_ioctls, .num_ioctls = ARRAY_SIZE(armada_ioctls), @@ -250,7 +249,7 @@ MODULE_DEVICE_TABLE(platform, armada_drm_platform_ids); static struct platform_driver armada_drm_platform_driver = { .probe = armada_drm_probe, - .remove_new = armada_drm_remove, + .remove = armada_drm_remove, .shutdown = armada_drm_shutdown, .driver = { .name = "armada-drm", diff --git a/drivers/gpu/drm/armada/armada_gem.c b/drivers/gpu/drm/armada/armada_gem.c index 26d10065d534..1a1680d71486 100644 --- a/drivers/gpu/drm/armada/armada_gem.c +++ b/drivers/gpu/drm/armada/armada_gem.c @@ -15,7 +15,7 @@ #include "armada_gem.h" #include "armada_ioctlP.h" -MODULE_IMPORT_NS(DMA_BUF); +MODULE_IMPORT_NS("DMA_BUF"); static vm_fault_t armada_gem_vm_fault(struct vm_fault *vmf) { diff --git a/drivers/gpu/drm/aspeed/aspeed_gfx_drv.c b/drivers/gpu/drm/aspeed/aspeed_gfx_drv.c index 109023815fa2..397e677a691c 100644 --- a/drivers/gpu/drm/aspeed/aspeed_gfx_drv.c +++ b/drivers/gpu/drm/aspeed/aspeed_gfx_drv.c @@ -13,8 +13,8 @@ #include #include +#include #include -#include #include #include #include @@ -252,7 +252,6 @@ static const struct drm_driver aspeed_gfx_driver = { .fops = &fops, .name = "aspeed-gfx-drm", .desc = "ASPEED GFX DRM", - .date = "20180319", .major = 1, .minor = 0, }; @@ -368,7 +367,7 @@ static void aspeed_gfx_shutdown(struct platform_device *pdev) static struct platform_driver aspeed_gfx_platform_driver = { .probe = aspeed_gfx_probe, - .remove_new = aspeed_gfx_remove, + .remove = aspeed_gfx_remove, .shutdown = aspeed_gfx_shutdown, .driver = { .name = "aspeed_gfx", diff --git a/drivers/gpu/drm/ast/ast_drv.c b/drivers/gpu/drm/ast/ast_drv.c index 4afe4be072ef..ff3bcdd1cff2 100644 --- a/drivers/gpu/drm/ast/ast_drv.c +++ b/drivers/gpu/drm/ast/ast_drv.c @@ -31,8 +31,8 @@ #include #include +#include #include -#include #include #include #include @@ -60,7 +60,6 @@ static const struct drm_driver ast_driver = { .fops 
= &ast_fops, .name = DRIVER_NAME, .desc = DRIVER_DESC, - .date = DRIVER_DATE, .major = DRIVER_MAJOR, .minor = DRIVER_MINOR, .patchlevel = DRIVER_PATCHLEVEL, diff --git a/drivers/gpu/drm/ast/ast_drv.h b/drivers/gpu/drm/ast/ast_drv.h index 21ce3769bf0d..6b4305ac07d4 100644 --- a/drivers/gpu/drm/ast/ast_drv.h +++ b/drivers/gpu/drm/ast/ast_drv.h @@ -43,7 +43,6 @@ #define DRIVER_NAME "ast" #define DRIVER_DESC "AST" -#define DRIVER_DATE "20120228" #define DRIVER_MAJOR 0 #define DRIVER_MINOR 1 diff --git a/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.c b/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.c index 792dcc19e8e7..fa8ad94e431a 100644 --- a/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.c +++ b/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.c @@ -16,9 +16,9 @@ #include #include +#include #include #include -#include #include #include #include @@ -846,7 +846,6 @@ static const struct drm_driver atmel_hlcdc_dc_driver = { .fops = &fops, .name = "atmel-hlcdc", .desc = "Atmel HLCD Controller DRM", - .date = "20141504", .major = 1, .minor = 0, }; @@ -937,7 +936,7 @@ static const struct of_device_id atmel_hlcdc_dc_of_match[] = { static struct platform_driver atmel_hlcdc_dc_platform_driver = { .probe = atmel_hlcdc_dc_drm_probe, - .remove_new = atmel_hlcdc_dc_drm_remove, + .remove = atmel_hlcdc_dc_drm_remove, .shutdown = atmel_hlcdc_dc_drm_shutdown, .driver = { .name = "atmel-hlcdc-display-controller", diff --git a/drivers/gpu/drm/bridge/adv7511/adv7511_audio.c b/drivers/gpu/drm/bridge/adv7511/adv7511_audio.c index 8f786592143b..657bc3dd18df 100644 --- a/drivers/gpu/drm/bridge/adv7511/adv7511_audio.c +++ b/drivers/gpu/drm/bridge/adv7511/adv7511_audio.c @@ -214,7 +214,8 @@ static void audio_shutdown(struct device *dev, void *data) } static int adv7511_hdmi_i2s_get_dai_id(struct snd_soc_component *component, - struct device_node *endpoint) + struct device_node *endpoint, + void *data) { struct of_endpoint of_ep; int ret; diff --git a/drivers/gpu/drm/bridge/analogix/analogix-anx6345.c 
b/drivers/gpu/drm/bridge/analogix/analogix-anx6345.c index b754947e3e00..83d711ee3a2e 100644 --- a/drivers/gpu/drm/bridge/analogix/analogix-anx6345.c +++ b/drivers/gpu/drm/bridge/analogix/analogix-anx6345.c @@ -793,7 +793,7 @@ static void anx6345_i2c_remove(struct i2c_client *client) } static const struct i2c_device_id anx6345_id[] = { - { "anx6345", 0 }, + { "anx6345" }, { /* sentinel */ } }; MODULE_DEVICE_TABLE(i2c, anx6345_id); diff --git a/drivers/gpu/drm/bridge/analogix/anx7625.c b/drivers/gpu/drm/bridge/analogix/anx7625.c index a2675b121fe4..4be34d5c7a3b 100644 --- a/drivers/gpu/drm/bridge/analogix/anx7625.c +++ b/drivers/gpu/drm/bridge/analogix/anx7625.c @@ -1952,7 +1952,8 @@ static void anx7625_audio_shutdown(struct device *dev, void *data) } static int anx7625_hdmi_i2s_get_dai_id(struct snd_soc_component *component, - struct device_node *endpoint) + struct device_node *endpoint, + void *data) { struct of_endpoint of_ep; int ret; @@ -2002,8 +2003,10 @@ static int anx7625_audio_get_eld(struct device *dev, void *data, memset(buf, 0, len); } else { dev_dbg(dev, "audio copy eld\n"); + mutex_lock(&ctx->connector->eld_mutex); memcpy(buf, ctx->connector->eld, min(sizeof(ctx->connector->eld), len)); + mutex_unlock(&ctx->connector->eld_mutex); } return 0; @@ -2137,49 +2140,6 @@ static void hdcp_check_work_func(struct work_struct *work) drm_modeset_unlock(&drm_dev->mode_config.connection_mutex); } -static int anx7625_connector_atomic_check(struct anx7625_data *ctx, - struct drm_connector_state *state) -{ - struct device *dev = ctx->dev; - int cp; - - dev_dbg(dev, "hdcp state check\n"); - cp = state->content_protection; - - if (cp == ctx->hdcp_cp) - return 0; - - if (cp == DRM_MODE_CONTENT_PROTECTION_DESIRED) { - if (ctx->dp_en) { - dev_dbg(dev, "enable HDCP\n"); - anx7625_hdcp_enable(ctx); - - queue_delayed_work(ctx->hdcp_workqueue, - &ctx->hdcp_work, - msecs_to_jiffies(2000)); - } - } - - if (cp == DRM_MODE_CONTENT_PROTECTION_UNDESIRED) { - if (ctx->hdcp_cp != 
DRM_MODE_CONTENT_PROTECTION_ENABLED) { - dev_err(dev, "current CP is not ENABLED\n"); - return -EINVAL; - } - anx7625_hdcp_disable(ctx); - ctx->hdcp_cp = DRM_MODE_CONTENT_PROTECTION_UNDESIRED; - drm_hdcp_update_content_protection(ctx->connector, - ctx->hdcp_cp); - dev_dbg(dev, "update CP to UNDESIRE\n"); - } - - if (cp == DRM_MODE_CONTENT_PROTECTION_ENABLED) { - dev_err(dev, "Userspace illegal set to PROTECTION ENABLE\n"); - return -EINVAL; - } - - return 0; -} - static int anx7625_bridge_attach(struct drm_bridge *bridge, enum drm_bridge_attach_flags flags) { @@ -2416,7 +2376,7 @@ static int anx7625_bridge_atomic_check(struct drm_bridge *bridge, anx7625_bridge_mode_fixup(bridge, &crtc_state->mode, &crtc_state->adjusted_mode); - return anx7625_connector_atomic_check(ctx, conn_state); + return 0; } static void anx7625_bridge_atomic_enable(struct drm_bridge *bridge, @@ -2425,6 +2385,7 @@ static void anx7625_bridge_atomic_enable(struct drm_bridge *bridge, struct anx7625_data *ctx = bridge_to_anx7625(bridge); struct device *dev = ctx->dev; struct drm_connector *connector; + struct drm_connector_state *conn_state; dev_dbg(dev, "drm atomic enable\n"); @@ -2439,6 +2400,22 @@ static void anx7625_bridge_atomic_enable(struct drm_bridge *bridge, _anx7625_hpd_polling(ctx, 5000 * 100); anx7625_dp_start(ctx); + + conn_state = drm_atomic_get_new_connector_state(state->base.state, connector); + + if (WARN_ON(!conn_state)) + return; + + if (conn_state->content_protection == DRM_MODE_CONTENT_PROTECTION_DESIRED) { + if (ctx->dp_en) { + dev_dbg(dev, "enable HDCP\n"); + anx7625_hdcp_enable(ctx); + + queue_delayed_work(ctx->hdcp_workqueue, + &ctx->hdcp_work, + msecs_to_jiffies(2000)); + } + } } static void anx7625_bridge_atomic_disable(struct drm_bridge *bridge, @@ -2449,6 +2426,17 @@ static void anx7625_bridge_atomic_disable(struct drm_bridge *bridge, dev_dbg(dev, "drm atomic disable\n"); + flush_workqueue(ctx->hdcp_workqueue); + + if (ctx->connector && + ctx->hdcp_cp == 
DRM_MODE_CONTENT_PROTECTION_ENABLED) { + anx7625_hdcp_disable(ctx); + ctx->hdcp_cp = DRM_MODE_CONTENT_PROTECTION_DESIRED; + drm_hdcp_update_content_protection(ctx->connector, + ctx->hdcp_cp); + dev_dbg(dev, "update CP to DESIRE\n"); + } + ctx->connector = NULL; anx7625_dp_stop(ctx); @@ -2795,7 +2783,7 @@ static void anx7625_i2c_remove(struct i2c_client *client) } static const struct i2c_device_id anx7625_id[] = { - {"anx7625", 0}, + { "anx7625" }, {} }; diff --git a/drivers/gpu/drm/bridge/cadence/cdns-dsi-core.c b/drivers/gpu/drm/bridge/cadence/cdns-dsi-core.c index 7457d38622b0..c7a0247e06ad 100644 --- a/drivers/gpu/drm/bridge/cadence/cdns-dsi-core.c +++ b/drivers/gpu/drm/bridge/cadence/cdns-dsi-core.c @@ -1300,7 +1300,7 @@ MODULE_DEVICE_TABLE(of, cdns_dsi_of_match); static struct platform_driver cdns_dsi_platform_driver = { .probe = cdns_dsi_drm_probe, - .remove_new = cdns_dsi_drm_remove, + .remove = cdns_dsi_drm_remove, .driver = { .name = "cdns-dsi", .of_match_table = cdns_dsi_of_match, diff --git a/drivers/gpu/drm/bridge/cadence/cdns-mhdp8546-core.c b/drivers/gpu/drm/bridge/cadence/cdns-mhdp8546-core.c index 41f72d458487..d081850e3c03 100644 --- a/drivers/gpu/drm/bridge/cadence/cdns-mhdp8546-core.c +++ b/drivers/gpu/drm/bridge/cadence/cdns-mhdp8546-core.c @@ -2656,7 +2656,7 @@ static struct platform_driver mhdp_driver = { .of_match_table = mhdp_ids, }, .probe = cdns_mhdp_probe, - .remove_new = cdns_mhdp_remove, + .remove = cdns_mhdp_remove, }; module_platform_driver(mhdp_driver); diff --git a/drivers/gpu/drm/bridge/cadence/cdns-mhdp8546-hdcp.c b/drivers/gpu/drm/bridge/cadence/cdns-mhdp8546-hdcp.c index 31832ba4017f..42248f179b69 100644 --- a/drivers/gpu/drm/bridge/cadence/cdns-mhdp8546-hdcp.c +++ b/drivers/gpu/drm/bridge/cadence/cdns-mhdp8546-hdcp.c @@ -500,34 +500,6 @@ static void cdns_mhdp_hdcp_prop_work(struct work_struct *work) drm_modeset_unlock(&dev->mode_config.connection_mutex); } -int cdns_mhdp_hdcp_set_lc(struct cdns_mhdp_device *mhdp, u8 *val) -{ - 
int ret; - - mutex_lock(&mhdp->mbox_mutex); - ret = cdns_mhdp_secure_mailbox_send(mhdp, MB_MODULE_ID_HDCP_GENERAL, - HDCP_GENERAL_SET_LC_128, - 16, val); - mutex_unlock(&mhdp->mbox_mutex); - - return ret; -} - -int -cdns_mhdp_hdcp_set_public_key_param(struct cdns_mhdp_device *mhdp, - struct cdns_hdcp_tx_public_key_param *val) -{ - int ret; - - mutex_lock(&mhdp->mbox_mutex); - ret = cdns_mhdp_secure_mailbox_send(mhdp, MB_MODULE_ID_HDCP_TX, - HDCP2X_TX_SET_PUBLIC_KEY_PARAMS, - sizeof(*val), (u8 *)val); - mutex_unlock(&mhdp->mbox_mutex); - - return ret; -} - int cdns_mhdp_hdcp_enable(struct cdns_mhdp_device *mhdp, u8 content_type) { int ret; diff --git a/drivers/gpu/drm/bridge/cadence/cdns-mhdp8546-hdcp.h b/drivers/gpu/drm/bridge/cadence/cdns-mhdp8546-hdcp.h index 334c0b8b0d4f..3b6ec9c3a8d8 100644 --- a/drivers/gpu/drm/bridge/cadence/cdns-mhdp8546-hdcp.h +++ b/drivers/gpu/drm/bridge/cadence/cdns-mhdp8546-hdcp.h @@ -82,9 +82,6 @@ struct cdns_hdcp_tx_public_key_param { u8 E[DLP_E]; }; -int cdns_mhdp_hdcp_set_public_key_param(struct cdns_mhdp_device *mhdp, - struct cdns_hdcp_tx_public_key_param *val); -int cdns_mhdp_hdcp_set_lc(struct cdns_mhdp_device *mhdp, u8 *val); int cdns_mhdp_hdcp_enable(struct cdns_mhdp_device *mhdp, u8 content_type); int cdns_mhdp_hdcp_disable(struct cdns_mhdp_device *mhdp); void cdns_mhdp_hdcp_init(struct cdns_mhdp_device *mhdp); diff --git a/drivers/gpu/drm/bridge/chipone-icn6211.c b/drivers/gpu/drm/bridge/chipone-icn6211.c index 9eecac457dcf..d47703559b0d 100644 --- a/drivers/gpu/drm/bridge/chipone-icn6211.c +++ b/drivers/gpu/drm/bridge/chipone-icn6211.c @@ -785,7 +785,7 @@ static struct mipi_dsi_driver chipone_dsi_driver = { }, }; -static struct i2c_device_id chipone_i2c_id[] = { +static const struct i2c_device_id chipone_i2c_id[] = { { "chipone,icn6211" }, {}, }; diff --git a/drivers/gpu/drm/bridge/chrontel-ch7033.c b/drivers/gpu/drm/bridge/chrontel-ch7033.c index c83486cf6b15..da17f0978a79 100644 --- 
a/drivers/gpu/drm/bridge/chrontel-ch7033.c +++ b/drivers/gpu/drm/bridge/chrontel-ch7033.c @@ -597,7 +597,7 @@ static const struct of_device_id ch7033_dt_ids[] = { MODULE_DEVICE_TABLE(of, ch7033_dt_ids); static const struct i2c_device_id ch7033_ids[] = { - { "ch7033", 0 }, + { "ch7033" }, { } }; MODULE_DEVICE_TABLE(i2c, ch7033_ids); diff --git a/drivers/gpu/drm/bridge/display-connector.c b/drivers/gpu/drm/bridge/display-connector.c index aab9ce7be94c..72bc508d4e6e 100644 --- a/drivers/gpu/drm/bridge/display-connector.c +++ b/drivers/gpu/drm/bridge/display-connector.c @@ -427,7 +427,7 @@ MODULE_DEVICE_TABLE(of, display_connector_match); static struct platform_driver display_connector_driver = { .probe = display_connector_probe, - .remove_new = display_connector_remove, + .remove = display_connector_remove, .driver = { .name = "display-connector", .of_match_table = display_connector_match, diff --git a/drivers/gpu/drm/bridge/fsl-ldb.c b/drivers/gpu/drm/bridge/fsl-ldb.c index 0e4bac7dd04f..0fc8a14fd800 100644 --- a/drivers/gpu/drm/bridge/fsl-ldb.c +++ b/drivers/gpu/drm/bridge/fsl-ldb.c @@ -393,7 +393,7 @@ MODULE_DEVICE_TABLE(of, fsl_ldb_match); static struct platform_driver fsl_ldb_driver = { .probe = fsl_ldb_probe, - .remove_new = fsl_ldb_remove, + .remove = fsl_ldb_remove, .driver = { .name = "fsl-ldb", .of_match_table = fsl_ldb_match, diff --git a/drivers/gpu/drm/bridge/imx/imx8mp-hdmi-pvi.c b/drivers/gpu/drm/bridge/imx/imx8mp-hdmi-pvi.c index 073e64dc200c..0d1ac3edcab4 100644 --- a/drivers/gpu/drm/bridge/imx/imx8mp-hdmi-pvi.c +++ b/drivers/gpu/drm/bridge/imx/imx8mp-hdmi-pvi.c @@ -193,7 +193,7 @@ MODULE_DEVICE_TABLE(of, imx8mp_hdmi_pvi_match); static struct platform_driver imx8mp_hdmi_pvi_driver = { .probe = imx8mp_hdmi_pvi_probe, - .remove_new = imx8mp_hdmi_pvi_remove, + .remove = imx8mp_hdmi_pvi_remove, .driver = { .name = "imx-hdmi-pvi", .of_match_table = imx8mp_hdmi_pvi_match, diff --git a/drivers/gpu/drm/bridge/imx/imx8mp-hdmi-tx.c 
b/drivers/gpu/drm/bridge/imx/imx8mp-hdmi-tx.c index 8fcc6d18f4ab..1e7a789ec289 100644 --- a/drivers/gpu/drm/bridge/imx/imx8mp-hdmi-tx.c +++ b/drivers/gpu/drm/bridge/imx/imx8mp-hdmi-tx.c @@ -144,7 +144,7 @@ MODULE_DEVICE_TABLE(of, imx8mp_dw_hdmi_of_table); static struct platform_driver imx8mp_dw_hdmi_platform_driver = { .probe = imx8mp_dw_hdmi_probe, - .remove_new = imx8mp_dw_hdmi_remove, + .remove = imx8mp_dw_hdmi_remove, .driver = { .name = "imx8mp-dw-hdmi-tx", .of_match_table = imx8mp_dw_hdmi_of_table, diff --git a/drivers/gpu/drm/bridge/imx/imx8qm-ldb.c b/drivers/gpu/drm/bridge/imx/imx8qm-ldb.c index c879e37f5811..dd5823f04c70 100644 --- a/drivers/gpu/drm/bridge/imx/imx8qm-ldb.c +++ b/drivers/gpu/drm/bridge/imx/imx8qm-ldb.c @@ -570,7 +570,7 @@ MODULE_DEVICE_TABLE(of, imx8qm_ldb_dt_ids); static struct platform_driver imx8qm_ldb_driver = { .probe = imx8qm_ldb_probe, - .remove_new = imx8qm_ldb_remove, + .remove = imx8qm_ldb_remove, .driver = { .pm = pm_ptr(&imx8qm_ldb_pm_ops), .name = DRIVER_NAME, diff --git a/drivers/gpu/drm/bridge/imx/imx8qxp-ldb.c b/drivers/gpu/drm/bridge/imx/imx8qxp-ldb.c index b33011f397f0..7bce2305d676 100644 --- a/drivers/gpu/drm/bridge/imx/imx8qxp-ldb.c +++ b/drivers/gpu/drm/bridge/imx/imx8qxp-ldb.c @@ -706,7 +706,7 @@ MODULE_DEVICE_TABLE(of, imx8qxp_ldb_dt_ids); static struct platform_driver imx8qxp_ldb_driver = { .probe = imx8qxp_ldb_probe, - .remove_new = imx8qxp_ldb_remove, + .remove = imx8qxp_ldb_remove, .driver = { .pm = pm_ptr(&imx8qxp_ldb_pm_ops), .name = DRIVER_NAME, diff --git a/drivers/gpu/drm/bridge/imx/imx8qxp-pixel-combiner.c b/drivers/gpu/drm/bridge/imx/imx8qxp-pixel-combiner.c index ce43e4069e21..1812bd106261 100644 --- a/drivers/gpu/drm/bridge/imx/imx8qxp-pixel-combiner.c +++ b/drivers/gpu/drm/bridge/imx/imx8qxp-pixel-combiner.c @@ -427,7 +427,7 @@ MODULE_DEVICE_TABLE(of, imx8qxp_pc_dt_ids); static struct platform_driver imx8qxp_pc_bridge_driver = { .probe = imx8qxp_pc_bridge_probe, - .remove_new = imx8qxp_pc_bridge_remove, 
+ .remove = imx8qxp_pc_bridge_remove, .driver = { .pm = pm_ptr(&imx8qxp_pc_pm_ops), .name = DRIVER_NAME, diff --git a/drivers/gpu/drm/bridge/imx/imx8qxp-pixel-link.c b/drivers/gpu/drm/bridge/imx/imx8qxp-pixel-link.c index 1d11cc1df43c..4b0715ed6f38 100644 --- a/drivers/gpu/drm/bridge/imx/imx8qxp-pixel-link.c +++ b/drivers/gpu/drm/bridge/imx/imx8qxp-pixel-link.c @@ -409,7 +409,7 @@ MODULE_DEVICE_TABLE(of, imx8qxp_pixel_link_dt_ids); static struct platform_driver imx8qxp_pixel_link_bridge_driver = { .probe = imx8qxp_pixel_link_bridge_probe, - .remove_new = imx8qxp_pixel_link_bridge_remove, + .remove = imx8qxp_pixel_link_bridge_remove, .driver = { .of_match_table = imx8qxp_pixel_link_dt_ids, .name = DRIVER_NAME, diff --git a/drivers/gpu/drm/bridge/imx/imx8qxp-pxl2dpi.c b/drivers/gpu/drm/bridge/imx/imx8qxp-pxl2dpi.c index fb7cf4369bb8..65cf3a6c8ec6 100644 --- a/drivers/gpu/drm/bridge/imx/imx8qxp-pxl2dpi.c +++ b/drivers/gpu/drm/bridge/imx/imx8qxp-pxl2dpi.c @@ -467,7 +467,7 @@ MODULE_DEVICE_TABLE(of, imx8qxp_pxl2dpi_dt_ids); static struct platform_driver imx8qxp_pxl2dpi_bridge_driver = { .probe = imx8qxp_pxl2dpi_bridge_probe, - .remove_new = imx8qxp_pxl2dpi_bridge_remove, + .remove = imx8qxp_pxl2dpi_bridge_remove, .driver = { .of_match_table = imx8qxp_pxl2dpi_dt_ids, .name = DRIVER_NAME, diff --git a/drivers/gpu/drm/bridge/imx/imx93-mipi-dsi.c b/drivers/gpu/drm/bridge/imx/imx93-mipi-dsi.c index 2347f8dd632f..bea8346515b8 100644 --- a/drivers/gpu/drm/bridge/imx/imx93-mipi-dsi.c +++ b/drivers/gpu/drm/bridge/imx/imx93-mipi-dsi.c @@ -904,7 +904,7 @@ MODULE_DEVICE_TABLE(of, imx93_dsi_dt_ids); static struct platform_driver imx93_dsi_driver = { .probe = imx93_dsi_probe, - .remove_new = imx93_dsi_remove, + .remove = imx93_dsi_remove, .driver = { .of_match_table = imx93_dsi_dt_ids, .name = "imx93_mipi_dsi", diff --git a/drivers/gpu/drm/bridge/ite-it6263.c b/drivers/gpu/drm/bridge/ite-it6263.c index 5f138a5692c7..306b5e374b9e 100644 --- a/drivers/gpu/drm/bridge/ite-it6263.c +++ 
b/drivers/gpu/drm/bridge/ite-it6263.c @@ -48,6 +48,7 @@ #define REG_COL_DEP GENMASK(1, 0) #define BIT8 FIELD_PREP(REG_COL_DEP, 1) #define OUT_MAP BIT(4) +#define VESA BIT(4) #define JEIDA 0 #define REG_DESSC_ENB BIT(6) #define DMODE BIT(7) @@ -428,12 +429,30 @@ static inline void it6263_lvds_reset(struct it6263 *it) fsleep(10000); } +static inline bool it6263_is_input_bus_fmt_valid(int input_fmt) +{ + switch (input_fmt) { + case MEDIA_BUS_FMT_RGB888_1X7X4_JEIDA: + case MEDIA_BUS_FMT_RGB888_1X7X4_SPWG: + return true; + } + return false; +} + static inline void it6263_lvds_set_interface(struct it6263 *it) { + u8 fmt; + /* color depth */ regmap_write_bits(it->lvds_regmap, LVDS_REG_2C, REG_COL_DEP, BIT8); + + if (it->lvds_data_mapping == MEDIA_BUS_FMT_RGB888_1X7X4_SPWG) + fmt = VESA; + else + fmt = JEIDA; + /* output mapping */ - regmap_write_bits(it->lvds_regmap, LVDS_REG_2C, OUT_MAP, JEIDA); + regmap_write_bits(it->lvds_regmap, LVDS_REG_2C, OUT_MAP, fmt); if (it->lvds_dual_link) { regmap_write_bits(it->lvds_regmap, LVDS_REG_2C, DMODE, DISO); @@ -550,15 +569,6 @@ static int it6263_read_edid(void *data, u8 *buf, unsigned int block, size_t len) return 0; } -static int it6263_bridge_atomic_check(struct drm_bridge *bridge, - struct drm_bridge_state *bridge_state, - struct drm_crtc_state *crtc_state, - struct drm_connector_state *conn_state) -{ - return drm_atomic_helper_connector_hdmi_check(conn_state->connector, - conn_state->state); -} - static void it6263_bridge_atomic_disable(struct drm_bridge *bridge, struct drm_bridge_state *old_bridge_state) @@ -714,14 +724,14 @@ it6263_bridge_atomic_get_input_bus_fmts(struct drm_bridge *bridge, *num_input_fmts = 0; - if (it->lvds_data_mapping != MEDIA_BUS_FMT_RGB888_1X7X4_JEIDA) + if (!it6263_is_input_bus_fmt_valid(it->lvds_data_mapping)) return NULL; input_fmts = kmalloc(sizeof(*input_fmts), GFP_KERNEL); if (!input_fmts) return NULL; - input_fmts[0] = MEDIA_BUS_FMT_RGB888_1X7X4_JEIDA; + input_fmts[0] = it->lvds_data_mapping; 
*num_input_fmts = 1; return input_fmts; @@ -793,7 +803,6 @@ static const struct drm_bridge_funcs it6263_bridge_funcs = { .mode_valid = it6263_bridge_mode_valid, .atomic_disable = it6263_bridge_atomic_disable, .atomic_enable = it6263_bridge_atomic_enable, - .atomic_check = it6263_bridge_atomic_check, .detect = it6263_bridge_detect, .edid_read = it6263_bridge_edid_read, .atomic_get_input_bus_fmts = it6263_bridge_atomic_get_input_bus_fmts, @@ -878,7 +887,7 @@ static const struct of_device_id it6263_of_match[] = { MODULE_DEVICE_TABLE(of, it6263_of_match); static const struct i2c_device_id it6263_i2c_ids[] = { - { "it6263", 0 }, + { "it6263" }, { } }; MODULE_DEVICE_TABLE(i2c, it6263_i2c_ids); diff --git a/drivers/gpu/drm/bridge/ite-it6505.c b/drivers/gpu/drm/bridge/ite-it6505.c index 008d86cc562a..88ef76a37fe6 100644 --- a/drivers/gpu/drm/bridge/ite-it6505.c +++ b/drivers/gpu/drm/bridge/ite-it6505.c @@ -19,6 +19,7 @@ #include #include #include +#include #include @@ -126,6 +127,7 @@ #define REG_AUX_OUT_DATA0 0x27 #define REG_AUX_CMD_REQ 0x2B +#define M_AUX_REQ_CMD 0x0F #define AUX_BUSY BIT(5) #define REG_AUX_DATA_0_7 0x2C @@ -266,6 +268,18 @@ #define REG_SSC_CTRL1 0x189 #define REG_SSC_CTRL2 0x18A +#define REG_AUX_USER_CTRL 0x190 +#define EN_USER_AUX BIT(0) +#define USER_AUX_DONE BIT(1) +#define AUX_EVENT BIT(4) + +#define REG_AUX_USER_DATA_REC 0x191 +#define M_AUX_IN_REC 0xF0 +#define M_AUX_OUT_REC 0x0F + +#define REG_AUX_USER_REPLY 0x19A +#define REG_AUX_USER_RXB(n) (n + 0x19B) + #define RBR DP_LINK_BW_1_62 #define HBR DP_LINK_BW_2_7 #define HBR2 DP_LINK_BW_5_4 @@ -296,11 +310,13 @@ #define MAX_LANE_COUNT 4 #define MAX_LINK_RATE HBR #define AUTO_TRAIN_RETRY 3 -#define MAX_HDCP_DOWN_STREAM_COUNT 10 +#define MAX_HDCP_DOWN_STREAM_COUNT 127 #define MAX_CR_LEVEL 0x03 #define MAX_EQ_LEVEL 0x03 #define AUX_WAIT_TIMEOUT_MS 15 -#define AUX_FIFO_MAX_SIZE 32 +#define AUX_FIFO_MAX_SIZE 16 +#define AUX_I2C_MAX_SIZE 4 +#define AUX_I2C_DEFER_RETRY 4 #define PIXEL_CLK_DELAY 1 #define 
PIXEL_CLK_INVERSE 0 #define ADJUST_PHASE_THRESHOLD 80000 @@ -323,7 +339,15 @@ enum aux_cmd_type { CMD_AUX_NATIVE_READ = 0x0, CMD_AUX_NATIVE_WRITE = 0x5, + CMD_AUX_GI2C_ADR = 0x08, + CMD_AUX_GI2C_READ = 0x09, + CMD_AUX_GI2C_WRITE = 0x0A, CMD_AUX_I2C_EDID_READ = 0xB, + CMD_AUX_I2C_READ = 0x0D, + CMD_AUX_I2C_WRITE = 0x0C, + + /* KSV read with AUX FIFO extend from CMD_AUX_NATIVE_READ*/ + CMD_AUX_GET_KSV_LIST = 0x10, }; enum aux_cmd_reply { @@ -965,7 +989,8 @@ static ssize_t it6505_aux_operation(struct it6505 *it6505, it6505_set_bits(it6505, REG_AUX_CTRL, AUX_USER_MODE, AUX_USER_MODE); aux_op_start: - if (cmd == CMD_AUX_I2C_EDID_READ) { + /* HW AUX FIFO supports only EDID and DCPD KSV FIFO area */ + if (cmd == CMD_AUX_I2C_EDID_READ || cmd == CMD_AUX_GET_KSV_LIST) { /* AUX EDID FIFO has max length of AUX_FIFO_MAX_SIZE bytes. */ size = min_t(size_t, size, AUX_FIFO_MAX_SIZE); /* Enable AUX FIFO read back and clear FIFO */ @@ -996,7 +1021,7 @@ aux_op_start: size); /* Aux Fire */ - it6505_write(it6505, REG_AUX_CMD_REQ, cmd); + it6505_write(it6505, REG_AUX_CMD_REQ, FIELD_GET(M_AUX_REQ_CMD, cmd)); ret = it6505_aux_wait(it6505); if (ret < 0) @@ -1030,7 +1055,7 @@ aux_op_start: goto aux_op_start; } - if (cmd == CMD_AUX_I2C_EDID_READ) { + if (cmd == CMD_AUX_I2C_EDID_READ || cmd == CMD_AUX_GET_KSV_LIST) { for (i = 0; i < size; i++) { ret = it6505_read(it6505, REG_AUX_DATA_FIFO); if (ret < 0) @@ -1055,7 +1080,7 @@ aux_op_start: ret = i; aux_op_err: - if (cmd == CMD_AUX_I2C_EDID_READ) { + if (cmd == CMD_AUX_I2C_EDID_READ || cmd == CMD_AUX_GET_KSV_LIST) { /* clear AUX FIFO */ it6505_set_bits(it6505, REG_AUX_CTRL, AUX_EN_FIFO_READ | CLR_EDID_FIFO, @@ -1076,10 +1101,14 @@ static ssize_t it6505_aux_do_transfer(struct it6505 *it6505, size_t size, enum aux_cmd_reply *reply) { int i, ret_size, ret = 0, request_size; + int fifo_max_size = (cmd == CMD_AUX_I2C_EDID_READ || cmd == CMD_AUX_GET_KSV_LIST) ? 
+ AUX_FIFO_MAX_SIZE : 4; mutex_lock(&it6505->aux_lock); - for (i = 0; i < size; i += 4) { - request_size = min((int)size - i, 4); + i = 0; + do { + request_size = min_t(int, (int)size - i, fifo_max_size); + ret_size = it6505_aux_operation(it6505, cmd, address + i, buffer + i, request_size, reply); @@ -1088,14 +1117,170 @@ static ssize_t it6505_aux_do_transfer(struct it6505 *it6505, goto aux_op_err; } + i += request_size; ret += ret_size; - } + } while (i < size); aux_op_err: mutex_unlock(&it6505->aux_lock); return ret; } +static bool it6505_aux_i2c_reply_defer(u8 reply) +{ + if (reply == DP_AUX_NATIVE_REPLY_DEFER || reply == DP_AUX_I2C_REPLY_DEFER) + return true; + return false; +} + +static bool it6505_aux_i2c_reply_nack(u8 reply) +{ + if (reply == DP_AUX_NATIVE_REPLY_NACK || reply == DP_AUX_I2C_REPLY_NACK) + return true; + return false; +} + +static int it6505_aux_i2c_wait(struct it6505 *it6505, u8 *reply) +{ + int err = 0; + unsigned long timeout; + struct device *dev = it6505->dev; + + timeout = jiffies + msecs_to_jiffies(AUX_WAIT_TIMEOUT_MS) + 1; + + do { + if (it6505_read(it6505, REG_AUX_USER_CTRL) & AUX_EVENT) + break; + if (time_after(jiffies, timeout)) { + dev_err(dev, "Timed out waiting AUX I2C, BUSY = %X\n", + it6505_aux_op_finished(it6505)); + err = -ETIMEDOUT; + goto end_aux_i2c_wait; + } + usleep_range(300, 800); + } while (!it6505_aux_op_finished(it6505)); + + *reply = it6505_read(it6505, REG_AUX_USER_REPLY) >> 4; + + if (*reply == 0) + goto end_aux_i2c_wait; + + if (it6505_aux_i2c_reply_defer(*reply)) + err = -EBUSY; + else if (it6505_aux_i2c_reply_nack(*reply)) + err = -ENXIO; + +end_aux_i2c_wait: + it6505_set_bits(it6505, REG_AUX_USER_CTRL, USER_AUX_DONE, USER_AUX_DONE); + return err; +} + +static int it6505_aux_i2c_readb(struct it6505 *it6505, u8 *buf, size_t size, u8 *reply) +{ + int ret, i; + int retry; + + for (retry = 0; retry < AUX_I2C_DEFER_RETRY; retry++) { + it6505_write(it6505, REG_AUX_CMD_REQ, CMD_AUX_GI2C_READ); + + ret = 
it6505_aux_i2c_wait(it6505, reply); + if (it6505_aux_i2c_reply_defer(*reply)) + continue; + if (ret >= 0) + break; + } + + for (i = 0; i < size; i++) + buf[i] = it6505_read(it6505, REG_AUX_USER_RXB(0 + i)); + + return size; +} + +static int it6505_aux_i2c_writeb(struct it6505 *it6505, u8 *buf, size_t size, u8 *reply) +{ + int i, ret; + int retry; + + for (i = 0; i < size; i++) + it6505_write(it6505, REG_AUX_OUT_DATA0 + i, buf[i]); + + for (retry = 0; retry < AUX_I2C_DEFER_RETRY; retry++) { + it6505_write(it6505, REG_AUX_CMD_REQ, CMD_AUX_GI2C_WRITE); + + ret = it6505_aux_i2c_wait(it6505, reply); + if (it6505_aux_i2c_reply_defer(*reply)) + continue; + if (ret >= 0) + break; + } + return size; +} + +static ssize_t it6505_aux_i2c_operation(struct it6505 *it6505, + struct drm_dp_aux_msg *msg) +{ + int ret; + ssize_t request_size, data_cnt = 0; + u8 *buffer = msg->buffer; + + /* set AUX user mode */ + it6505_set_bits(it6505, REG_AUX_CTRL, + AUX_USER_MODE | AUX_NO_SEGMENT_WR, AUX_USER_MODE); + it6505_set_bits(it6505, REG_AUX_USER_CTRL, EN_USER_AUX, EN_USER_AUX); + /* clear AUX FIFO */ + it6505_set_bits(it6505, REG_AUX_CTRL, + AUX_EN_FIFO_READ | CLR_EDID_FIFO, + AUX_EN_FIFO_READ | CLR_EDID_FIFO); + + it6505_set_bits(it6505, REG_AUX_CTRL, + AUX_EN_FIFO_READ | CLR_EDID_FIFO, 0x00); + + it6505_write(it6505, REG_AUX_ADR_0_7, 0x00); + it6505_write(it6505, REG_AUX_ADR_8_15, msg->address << 1); + + if (msg->size == 0) { + /* IIC Start/STOP dummy write */ + it6505_write(it6505, REG_AUX_ADR_16_19, msg->request); + it6505_write(it6505, REG_AUX_CMD_REQ, CMD_AUX_GI2C_ADR); + ret = it6505_aux_i2c_wait(it6505, &msg->reply); + goto end_aux_i2c_transfer; + } + + /* IIC data transfer */ + data_cnt = 0; + do { + request_size = min_t(ssize_t, msg->size - data_cnt, AUX_I2C_MAX_SIZE); + it6505_write(it6505, REG_AUX_ADR_16_19, + msg->request | ((request_size - 1) << 4)); + if ((msg->request & DP_AUX_I2C_READ) == DP_AUX_I2C_READ) + ret = it6505_aux_i2c_readb(it6505, &buffer[data_cnt], + 
request_size, &msg->reply); + else + ret = it6505_aux_i2c_writeb(it6505, &buffer[data_cnt], + request_size, &msg->reply); + + if (ret < 0) + goto end_aux_i2c_transfer; + + data_cnt += request_size; + } while (data_cnt < msg->size); + ret = data_cnt; +end_aux_i2c_transfer: + + it6505_set_bits(it6505, REG_AUX_USER_CTRL, EN_USER_AUX, 0); + it6505_set_bits(it6505, REG_AUX_CTRL, AUX_USER_MODE, 0); + return ret; +} + +static ssize_t it6505_aux_i2c_transfer(struct drm_dp_aux *aux, + struct drm_dp_aux_msg *msg) +{ + struct it6505 *it6505 = container_of(aux, struct it6505, aux); + + guard(mutex)(&it6505->aux_lock); + return it6505_aux_i2c_operation(it6505, msg); +} + static ssize_t it6505_aux_transfer(struct drm_dp_aux *aux, struct drm_dp_aux_msg *msg) { @@ -1105,9 +1290,8 @@ static ssize_t it6505_aux_transfer(struct drm_dp_aux *aux, int ret; enum aux_cmd_reply reply; - /* IT6505 doesn't support arbitrary I2C read / write. */ if (is_i2c) - return -EINVAL; + return it6505_aux_i2c_transfer(aux, msg); switch (msg->request) { case DP_AUX_NATIVE_READ: @@ -1178,6 +1362,37 @@ static int it6505_get_edid_block(void *data, u8 *buf, unsigned int block, return 0; } +static int it6505_get_ksvlist(struct it6505 *it6505, u8 *buf, size_t len) +{ + struct device *dev = it6505->dev; + enum aux_cmd_reply reply; + int request_size, ret; + int i = 0; + + do { + request_size = min_t(int, (int)len - i, 15); + + ret = it6505_aux_do_transfer(it6505, CMD_AUX_GET_KSV_LIST, + DP_AUX_HDCP_KSV_FIFO, + buf + i, request_size, &reply); + + DRM_DEV_DEBUG_DRIVER(dev, "request_size = %d, ret =%d", request_size, ret); + if (ret < 0) + return ret; + + i += request_size; + } while (i < len); + + DRM_DEV_DEBUG_DRIVER(dev, "ksv read cnt = %d down_stream_cnt=%d ", i, i / 5); + + for (i = 0 ; i < len; i += 5) { + DRM_DEV_DEBUG_DRIVER(dev, "ksv[%d] = %02X%02X%02X%02X%02X", + i / 5, buf[i], buf[i + 1], buf[i + 2], buf[i + 3], buf[i + 4]); + } + + return len; +} + static void it6505_variable_config(struct it6505 
*it6505) { it6505->link_rate_bw_code = HBR; @@ -1959,7 +2174,7 @@ static int it6505_setup_sha1_input(struct it6505 *it6505, u8 *sha1_input) { struct device *dev = it6505->dev; u8 binfo[2]; - int down_stream_count, i, err, msg_count = 0; + int down_stream_count, err, msg_count = 0; err = it6505_get_dpcd(it6505, DP_AUX_HDCP_BINFO, binfo, ARRAY_SIZE(binfo)); @@ -1984,18 +2199,11 @@ static int it6505_setup_sha1_input(struct it6505 *it6505, u8 *sha1_input) down_stream_count); return 0; } + err = it6505_get_ksvlist(it6505, sha1_input, down_stream_count * 5); + if (err < 0) + return err; - for (i = 0; i < down_stream_count; i++) { - err = it6505_get_dpcd(it6505, DP_AUX_HDCP_KSV_FIFO + - (i % 3) * DRM_HDCP_KSV_LEN, - sha1_input + msg_count, - DRM_HDCP_KSV_LEN); - - if (err < 0) - return err; - - msg_count += 5; - } + msg_count += down_stream_count * 5; it6505->hdcp_down_stream_count = down_stream_count; sha1_input[msg_count++] = binfo[0]; @@ -2023,7 +2231,7 @@ static bool it6505_hdcp_part2_ksvlist_check(struct it6505 *it6505) { struct device *dev = it6505->dev; u8 av[5][4], bv[5][4]; - int i, err; + int i, err, retry; i = it6505_setup_sha1_input(it6505, it6505->sha1_input); if (i <= 0) { @@ -2032,22 +2240,28 @@ static bool it6505_hdcp_part2_ksvlist_check(struct it6505 *it6505) } it6505_sha1_digest(it6505, it6505->sha1_input, i, (u8 *)av); + /*1B-05 V' must retry 3 times */ + for (retry = 0; retry < 3; retry++) { + err = it6505_get_dpcd(it6505, DP_AUX_HDCP_V_PRIME(0), (u8 *)bv, + sizeof(bv)); - err = it6505_get_dpcd(it6505, DP_AUX_HDCP_V_PRIME(0), (u8 *)bv, - sizeof(bv)); + if (err < 0) { + dev_err(dev, "Read V' value Fail %d", retry); + continue; + } - if (err < 0) { - dev_err(dev, "Read V' value Fail"); - return false; + for (i = 0; i < 5; i++) { + if (bv[i][3] != av[i][0] || bv[i][2] != av[i][1] || + av[i][1] != av[i][2] || bv[i][0] != av[i][3]) + break; + + DRM_DEV_DEBUG_DRIVER(dev, "V' all match!! 
%d, %d", retry, i); + return true; + } } - for (i = 0; i < 5; i++) - if (bv[i][3] != av[i][0] || bv[i][2] != av[i][1] || - bv[i][1] != av[i][2] || bv[i][0] != av[i][3]) - return false; - - DRM_DEV_DEBUG_DRIVER(dev, "V' all match!!"); - return true; + DRM_DEV_DEBUG_DRIVER(dev, "V' NOT match!! %d", retry); + return false; } static void it6505_hdcp_wait_ksv_list(struct work_struct *work) @@ -2055,12 +2269,13 @@ static void it6505_hdcp_wait_ksv_list(struct work_struct *work) struct it6505 *it6505 = container_of(work, struct it6505, hdcp_wait_ksv_list); struct device *dev = it6505->dev; - unsigned int timeout = 5000; - u8 bstatus = 0; + u8 bstatus; bool ksv_list_check; + /* 1B-04 wait ksv list for 5s */ + unsigned long timeout = jiffies + + msecs_to_jiffies(5000) + 1; - timeout /= 20; - while (timeout > 0) { + for (;;) { if (!it6505_get_sink_hpd_status(it6505)) return; @@ -2069,27 +2284,23 @@ static void it6505_hdcp_wait_ksv_list(struct work_struct *work) if (bstatus & DP_BSTATUS_READY) break; - msleep(20); - timeout--; - } + if (time_after(jiffies, timeout)) { + DRM_DEV_DEBUG_DRIVER(dev, "KSV list wait timeout"); + goto timeout; + } - if (timeout == 0) { - DRM_DEV_DEBUG_DRIVER(dev, "timeout and ksv list wait failed"); - goto timeout; + msleep(20); } ksv_list_check = it6505_hdcp_part2_ksvlist_check(it6505); DRM_DEV_DEBUG_DRIVER(dev, "ksv list ready, ksv list check %s", ksv_list_check ? 
"pass" : "fail"); - if (ksv_list_check) { - it6505_set_bits(it6505, REG_HDCP_TRIGGER, - HDCP_TRIGGER_KSV_DONE, HDCP_TRIGGER_KSV_DONE); + + if (ksv_list_check) return; - } + timeout: - it6505_set_bits(it6505, REG_HDCP_TRIGGER, - HDCP_TRIGGER_KSV_DONE | HDCP_TRIGGER_KSV_FAIL, - HDCP_TRIGGER_KSV_DONE | HDCP_TRIGGER_KSV_FAIL); + it6505_start_hdcp(it6505); } static void it6505_hdcp_work(struct work_struct *work) @@ -2312,14 +2523,20 @@ static int it6505_process_hpd_irq(struct it6505 *it6505) DRM_DEV_DEBUG_DRIVER(dev, "dp_irq_vector = 0x%02x", dp_irq_vector); if (dp_irq_vector & DP_CP_IRQ) { - it6505_set_bits(it6505, REG_HDCP_TRIGGER, HDCP_TRIGGER_CPIRQ, - HDCP_TRIGGER_CPIRQ); - bstatus = it6505_dpcd_read(it6505, DP_AUX_HDCP_BSTATUS); if (bstatus < 0) return bstatus; DRM_DEV_DEBUG_DRIVER(dev, "Bstatus = 0x%02x", bstatus); + + /*Check BSTATUS when recive CP_IRQ */ + if (bstatus & DP_BSTATUS_R0_PRIME_READY && + it6505->hdcp_status == HDCP_AUTH_GOING) + it6505_set_bits(it6505, REG_HDCP_TRIGGER, HDCP_TRIGGER_CPIRQ, + HDCP_TRIGGER_CPIRQ); + else if (bstatus & (DP_BSTATUS_REAUTH_REQ | DP_BSTATUS_LINK_FAILURE) && + it6505->hdcp_status == HDCP_AUTH_DONE) + it6505_start_hdcp(it6505); } ret = drm_dp_dpcd_read_link_status(&it6505->aux, link_status); @@ -2456,7 +2673,11 @@ static void it6505_irq_hdcp_ksv_check(struct it6505 *it6505) { struct device *dev = it6505->dev; - DRM_DEV_DEBUG_DRIVER(dev, "HDCP event Interrupt"); + DRM_DEV_DEBUG_DRIVER(dev, "HDCP repeater R0 event Interrupt"); + /* 1B01 HDCP encription should start when R0 is ready*/ + it6505_set_bits(it6505, REG_HDCP_TRIGGER, + HDCP_TRIGGER_KSV_DONE, HDCP_TRIGGER_KSV_DONE); + schedule_work(&it6505->hdcp_wait_ksv_list); } @@ -3497,7 +3718,7 @@ static void it6505_i2c_remove(struct i2c_client *client) } static const struct i2c_device_id it6505_id[] = { - { "it6505", 0 }, + { "it6505" }, { } }; diff --git a/drivers/gpu/drm/bridge/ite-it66121.c b/drivers/gpu/drm/bridge/ite-it66121.c index 35ae3f0e8f51..23edcde6b9a7 100644 --- 
a/drivers/gpu/drm/bridge/ite-it66121.c +++ b/drivers/gpu/drm/bridge/ite-it66121.c @@ -1450,8 +1450,10 @@ static int it66121_audio_get_eld(struct device *dev, void *data, dev_dbg(dev, "No connector present, passing empty EDID data"); memset(buf, 0, len); } else { + mutex_lock(&ctx->connector->eld_mutex); memcpy(buf, ctx->connector->eld, min(sizeof(ctx->connector->eld), len)); + mutex_unlock(&ctx->connector->eld_mutex); } mutex_unlock(&ctx->lock); @@ -1464,7 +1466,6 @@ static const struct hdmi_codec_ops it66121_audio_codec_ops = { .audio_shutdown = it66121_audio_shutdown, .mute_stream = it66121_audio_mute, .get_eld = it66121_audio_get_eld, - .no_capture_mute = 1, }; static int it66121_audio_codec_init(struct it66121_ctx *ctx, struct device *dev) @@ -1474,11 +1475,12 @@ static int it66121_audio_codec_init(struct it66121_ctx *ctx, struct device *dev) .i2s = 1, /* Only i2s support for now */ .spdif = 0, .max_i2s_channels = 8, + .no_capture_mute = 1, }; dev_dbg(dev, "%s\n", __func__); - if (!of_property_read_bool(dev->of_node, "#sound-dai-cells")) { + if (!of_property_present(dev->of_node, "#sound-dai-cells")) { dev_info(dev, "No \"#sound-dai-cells\", no audio\n"); return 0; } diff --git a/drivers/gpu/drm/bridge/lontium-lt8912b.c b/drivers/gpu/drm/bridge/lontium-lt8912b.c index e265ab3c8c92..52da204f5740 100644 --- a/drivers/gpu/drm/bridge/lontium-lt8912b.c +++ b/drivers/gpu/drm/bridge/lontium-lt8912b.c @@ -815,8 +815,8 @@ static const struct of_device_id lt8912_dt_match[] = { MODULE_DEVICE_TABLE(of, lt8912_dt_match); static const struct i2c_device_id lt8912_id[] = { - {"lt8912", 0}, - {}, + { "lt8912" }, + {} }; MODULE_DEVICE_TABLE(i2c, lt8912_id); diff --git a/drivers/gpu/drm/bridge/lontium-lt9211.c b/drivers/gpu/drm/bridge/lontium-lt9211.c index c8881796fba4..999ddebb832d 100644 --- a/drivers/gpu/drm/bridge/lontium-lt9211.c +++ b/drivers/gpu/drm/bridge/lontium-lt9211.c @@ -773,7 +773,7 @@ static void lt9211_remove(struct i2c_client *client) 
drm_bridge_remove(&ctx->bridge); } -static struct i2c_device_id lt9211_id[] = { +static const struct i2c_device_id lt9211_id[] = { { "lontium,lt9211" }, {}, }; diff --git a/drivers/gpu/drm/bridge/lontium-lt9611.c b/drivers/gpu/drm/bridge/lontium-lt9611.c index 1b31fdebe164..e650cd83fc8d 100644 --- a/drivers/gpu/drm/bridge/lontium-lt9611.c +++ b/drivers/gpu/drm/bridge/lontium-lt9611.c @@ -45,7 +45,6 @@ struct lt9611 { struct device_node *dsi1_node; struct mipi_dsi_device *dsi0; struct mipi_dsi_device *dsi1; - struct platform_device *audio_pdev; bool ac_mode; @@ -757,7 +756,6 @@ static enum drm_mode_status lt9611_bridge_mode_valid(struct drm_bridge *bridge, const struct drm_display_mode *mode) { struct lt9611 *lt9611 = bridge_to_lt9611(bridge); - unsigned long long rate; if (mode->hdisplay > 3840) return MODE_BAD_HVALUE; @@ -765,17 +763,7 @@ static enum drm_mode_status lt9611_bridge_mode_valid(struct drm_bridge *bridge, if (mode->hdisplay > 2000 && !lt9611->dsi1_node) return MODE_PANEL; - rate = drm_hdmi_compute_mode_clock(mode, 8, HDMI_COLORSPACE_RGB); - return bridge->funcs->hdmi_tmds_char_rate_valid(bridge, mode, rate); -} - -static int lt9611_bridge_atomic_check(struct drm_bridge *bridge, - struct drm_bridge_state *bridge_state, - struct drm_crtc_state *crtc_state, - struct drm_connector_state *conn_state) -{ - return drm_atomic_helper_connector_hdmi_check(conn_state->connector, - conn_state->state); + return MODE_OK; } static void lt9611_bridge_atomic_pre_enable(struct drm_bridge *bridge, @@ -866,6 +854,10 @@ static int lt9611_hdmi_clear_infoframe(struct drm_bridge *bridge, unsigned int mask; switch (type) { + case HDMI_INFOFRAME_TYPE_AUDIO: + mask = LT9611_INFOFRAME_AUDIO; + break; + case HDMI_INFOFRAME_TYPE_AVI: mask = LT9611_INFOFRAME_AVI; break; @@ -899,6 +891,11 @@ static int lt9611_hdmi_write_infoframe(struct drm_bridge *bridge, int i; switch (type) { + case HDMI_INFOFRAME_TYPE_AUDIO: + mask = LT9611_INFOFRAME_AUDIO; + addr = 0x84b2; + break; + case 
HDMI_INFOFRAME_TYPE_AVI: mask = LT9611_INFOFRAME_AVI; addr = 0x8440; @@ -942,6 +939,55 @@ lt9611_hdmi_tmds_char_rate_valid(const struct drm_bridge *bridge, return MODE_OK; } +static int lt9611_hdmi_audio_startup(struct drm_connector *connector, + struct drm_bridge *bridge) +{ + struct lt9611 *lt9611 = bridge_to_lt9611(bridge); + + regmap_write(lt9611->regmap, 0x82d6, 0x8c); + regmap_write(lt9611->regmap, 0x82d7, 0x04); + + regmap_write(lt9611->regmap, 0x8406, 0x08); + regmap_write(lt9611->regmap, 0x8407, 0x10); + + regmap_write(lt9611->regmap, 0x8434, 0xd5); + + return 0; +} + +static int lt9611_hdmi_audio_prepare(struct drm_connector *connector, + struct drm_bridge *bridge, + struct hdmi_codec_daifmt *fmt, + struct hdmi_codec_params *hparms) +{ + struct lt9611 *lt9611 = bridge_to_lt9611(bridge); + + if (hparms->sample_rate == 48000) + regmap_write(lt9611->regmap, 0x840f, 0x2b); + else if (hparms->sample_rate == 96000) + regmap_write(lt9611->regmap, 0x840f, 0xab); + else + return -EINVAL; + + regmap_write(lt9611->regmap, 0x8435, 0x00); + regmap_write(lt9611->regmap, 0x8436, 0x18); + regmap_write(lt9611->regmap, 0x8437, 0x00); + + return drm_atomic_helper_connector_hdmi_update_audio_infoframe(connector, + &hparms->cea); +} + +static void lt9611_hdmi_audio_shutdown(struct drm_connector *connector, + struct drm_bridge *bridge) +{ + struct lt9611 *lt9611 = bridge_to_lt9611(bridge); + + drm_atomic_helper_connector_hdmi_clear_audio_infoframe(connector); + + regmap_write(lt9611->regmap, 0x8406, 0x00); + regmap_write(lt9611->regmap, 0x8407, 0x00); +} + static const struct drm_bridge_funcs lt9611_bridge_funcs = { .attach = lt9611_bridge_attach, .mode_valid = lt9611_bridge_mode_valid, @@ -949,7 +995,6 @@ static const struct drm_bridge_funcs lt9611_bridge_funcs = { .edid_read = lt9611_bridge_edid_read, .hpd_enable = lt9611_bridge_hpd_enable, - .atomic_check = lt9611_bridge_atomic_check, .atomic_pre_enable = lt9611_bridge_atomic_pre_enable, .atomic_enable = 
lt9611_bridge_atomic_enable, .atomic_disable = lt9611_bridge_atomic_disable, @@ -962,6 +1007,10 @@ static const struct drm_bridge_funcs lt9611_bridge_funcs = { .hdmi_tmds_char_rate_valid = lt9611_hdmi_tmds_char_rate_valid, .hdmi_write_infoframe = lt9611_hdmi_write_infoframe, .hdmi_clear_infoframe = lt9611_hdmi_clear_infoframe, + + .hdmi_audio_startup = lt9611_hdmi_audio_startup, + .hdmi_audio_prepare = lt9611_hdmi_audio_prepare, + .hdmi_audio_shutdown = lt9611_hdmi_audio_shutdown, }; static int lt9611_parse_dt(struct device *dev, @@ -1015,101 +1064,6 @@ static int lt9611_read_device_rev(struct lt9611 *lt9611) return ret; } -static int lt9611_hdmi_hw_params(struct device *dev, void *data, - struct hdmi_codec_daifmt *fmt, - struct hdmi_codec_params *hparms) -{ - struct lt9611 *lt9611 = data; - - if (hparms->sample_rate == 48000) - regmap_write(lt9611->regmap, 0x840f, 0x2b); - else if (hparms->sample_rate == 96000) - regmap_write(lt9611->regmap, 0x840f, 0xab); - else - return -EINVAL; - - regmap_write(lt9611->regmap, 0x8435, 0x00); - regmap_write(lt9611->regmap, 0x8436, 0x18); - regmap_write(lt9611->regmap, 0x8437, 0x00); - - return 0; -} - -static int lt9611_audio_startup(struct device *dev, void *data) -{ - struct lt9611 *lt9611 = data; - - regmap_write(lt9611->regmap, 0x82d6, 0x8c); - regmap_write(lt9611->regmap, 0x82d7, 0x04); - - regmap_write(lt9611->regmap, 0x8406, 0x08); - regmap_write(lt9611->regmap, 0x8407, 0x10); - - regmap_write(lt9611->regmap, 0x8434, 0xd5); - - return 0; -} - -static void lt9611_audio_shutdown(struct device *dev, void *data) -{ - struct lt9611 *lt9611 = data; - - regmap_write(lt9611->regmap, 0x8406, 0x00); - regmap_write(lt9611->regmap, 0x8407, 0x00); -} - -static int lt9611_hdmi_i2s_get_dai_id(struct snd_soc_component *component, - struct device_node *endpoint) -{ - struct of_endpoint of_ep; - int ret; - - ret = of_graph_parse_endpoint(endpoint, &of_ep); - if (ret < 0) - return ret; - - /* - * HDMI sound should be located as reg = <2> - 
* Then, it is sound port 0 - */ - if (of_ep.port == 2) - return 0; - - return -EINVAL; -} - -static const struct hdmi_codec_ops lt9611_codec_ops = { - .hw_params = lt9611_hdmi_hw_params, - .audio_shutdown = lt9611_audio_shutdown, - .audio_startup = lt9611_audio_startup, - .get_dai_id = lt9611_hdmi_i2s_get_dai_id, -}; - -static struct hdmi_codec_pdata codec_data = { - .ops = &lt9611_codec_ops, - .max_i2s_channels = 8, - .i2s = 1, -}; - -static int lt9611_audio_init(struct device *dev, struct lt9611 *lt9611) -{ - codec_data.data = lt9611; - lt9611->audio_pdev = - platform_device_register_data(dev, HDMI_CODEC_DRV_NAME, - PLATFORM_DEVID_AUTO, - &codec_data, sizeof(codec_data)); - - return PTR_ERR_OR_ZERO(lt9611->audio_pdev); -} - -static void lt9611_audio_exit(struct lt9611 *lt9611) -{ - if (lt9611->audio_pdev) { - platform_device_unregister(lt9611->audio_pdev); - lt9611->audio_pdev = NULL; - } -} - static int lt9611_probe(struct i2c_client *client) { struct lt9611 *lt9611; @@ -1173,6 +1127,9 @@ static int lt9611_probe(struct i2c_client *client) i2c_set_clientdata(client, lt9611); + /* Disable Audio InfoFrame, enabled by default */ + regmap_update_bits(lt9611->regmap, 0x843d, LT9611_INFOFRAME_AUDIO, 0); + lt9611->bridge.funcs = &lt9611_bridge_funcs; lt9611->bridge.of_node = client->dev.of_node; lt9611->bridge.ops = DRM_BRIDGE_OP_DETECT | DRM_BRIDGE_OP_EDID | @@ -1181,6 +1138,9 @@ static int lt9611_probe(struct i2c_client *client) lt9611->bridge.type = DRM_MODE_CONNECTOR_HDMIA; lt9611->bridge.vendor = "Lontium"; lt9611->bridge.product = "LT9611"; + lt9611->bridge.hdmi_audio_dev = dev; + lt9611->bridge.hdmi_audio_max_i2s_playback_channels = 8; + lt9611->bridge.hdmi_audio_dai_port = 2; drm_bridge_add(&lt9611->bridge); @@ -1202,10 +1162,6 @@ static int lt9611_probe(struct i2c_client *client) lt9611_enable_hpd_interrupts(lt9611); - ret = lt9611_audio_init(dev, lt9611); - if (ret) - goto err_remove_bridge; - return 0; err_remove_bridge: @@ -1226,7 +1182,6 @@ static void
lt9611_remove(struct i2c_client *client) struct lt9611 *lt9611 = i2c_get_clientdata(client); disable_irq(client->irq); - lt9611_audio_exit(lt9611); drm_bridge_remove(&lt9611->bridge); regulator_bulk_disable(ARRAY_SIZE(lt9611->supplies), lt9611->supplies); @@ -1235,8 +1190,8 @@ static void lt9611_remove(struct i2c_client *client) of_node_put(lt9611->dsi0_node); } -static struct i2c_device_id lt9611_id[] = { - { "lontium,lt9611", 0 }, +static const struct i2c_device_id lt9611_id[] = { + { "lontium,lt9611" }, {} }; MODULE_DEVICE_TABLE(i2c, lt9611_id); diff --git a/drivers/gpu/drm/bridge/lontium-lt9611uxc.c b/drivers/gpu/drm/bridge/lontium-lt9611uxc.c index 4d1d40e1f1b4..f4c3ff1fdc69 100644 --- a/drivers/gpu/drm/bridge/lontium-lt9611uxc.c +++ b/drivers/gpu/drm/bridge/lontium-lt9611uxc.c @@ -522,7 +522,8 @@ static void lt9611uxc_audio_shutdown(struct device *dev, void *data) } static int lt9611uxc_hdmi_i2s_get_dai_id(struct snd_soc_component *component, - struct device_node *endpoint) + struct device_node *endpoint, + void *data) { struct of_endpoint of_ep; int ret; @@ -913,8 +914,8 @@ static void lt9611uxc_remove(struct i2c_client *client) of_node_put(lt9611uxc->dsi0_node); } -static struct i2c_device_id lt9611uxc_id[] = { - { "lontium,lt9611uxc", 0 }, +static const struct i2c_device_id lt9611uxc_id[] = { + { "lontium,lt9611uxc" }, { /* sentinel */ } }; diff --git a/drivers/gpu/drm/bridge/lvds-codec.c b/drivers/gpu/drm/bridge/lvds-codec.c index 991732c4b629..389af0233fcd 100644 --- a/drivers/gpu/drm/bridge/lvds-codec.c +++ b/drivers/gpu/drm/bridge/lvds-codec.c @@ -236,7 +236,7 @@ MODULE_DEVICE_TABLE(of, lvds_codec_match); static struct platform_driver lvds_codec_driver = { .probe = lvds_codec_probe, - .remove_new = lvds_codec_remove, + .remove = lvds_codec_remove, .driver = { .name = "lvds-codec", .of_match_table = lvds_codec_match, diff --git a/drivers/gpu/drm/bridge/megachips-stdpxxxx-ge-b850v3-fw.c b/drivers/gpu/drm/bridge/megachips-stdpxxxx-ge-b850v3-fw.c index
37f1acf5c0f8..a3dcee62e7a5 100644 --- a/drivers/gpu/drm/bridge/megachips-stdpxxxx-ge-b850v3-fw.c +++ b/drivers/gpu/drm/bridge/megachips-stdpxxxx-ge-b850v3-fw.c @@ -318,8 +318,8 @@ static void stdp4028_ge_b850v3_fw_remove(struct i2c_client *stdp4028_i2c) } static const struct i2c_device_id stdp4028_ge_b850v3_fw_i2c_table[] = { - {"stdp4028_ge_fw", 0}, - {}, + { "stdp4028_ge_fw" }, + {} }; MODULE_DEVICE_TABLE(i2c, stdp4028_ge_b850v3_fw_i2c_table); @@ -365,8 +365,8 @@ static void stdp2690_ge_b850v3_fw_remove(struct i2c_client *stdp2690_i2c) } static const struct i2c_device_id stdp2690_ge_b850v3_fw_i2c_table[] = { - {"stdp2690_ge_fw", 0}, - {}, + { "stdp2690_ge_fw" }, + {} }; MODULE_DEVICE_TABLE(i2c, stdp2690_ge_b850v3_fw_i2c_table); diff --git a/drivers/gpu/drm/bridge/nwl-dsi.c b/drivers/gpu/drm/bridge/nwl-dsi.c index 5f05647a3bea..1e5b2a37cb8c 100644 --- a/drivers/gpu/drm/bridge/nwl-dsi.c +++ b/drivers/gpu/drm/bridge/nwl-dsi.c @@ -1211,7 +1211,7 @@ static void nwl_dsi_remove(struct platform_device *pdev) static struct platform_driver nwl_dsi_driver = { .probe = nwl_dsi_probe, - .remove_new = nwl_dsi_remove, + .remove = nwl_dsi_remove, .driver = { .of_match_table = nwl_dsi_dt_ids, .name = DRV_NAME, diff --git a/drivers/gpu/drm/bridge/nxp-ptn3460.c b/drivers/gpu/drm/bridge/nxp-ptn3460.c index e77aab965fcf..44e36ae66db4 100644 --- a/drivers/gpu/drm/bridge/nxp-ptn3460.c +++ b/drivers/gpu/drm/bridge/nxp-ptn3460.c @@ -319,8 +319,8 @@ static void ptn3460_remove(struct i2c_client *client) } static const struct i2c_device_id ptn3460_i2c_table[] = { - {"ptn3460", 0}, - {}, + { "ptn3460" }, + {} }; MODULE_DEVICE_TABLE(i2c, ptn3460_i2c_table); diff --git a/drivers/gpu/drm/bridge/samsung-dsim.c b/drivers/gpu/drm/bridge/samsung-dsim.c index 4416d0be7272..f8b4fb835765 100644 --- a/drivers/gpu/drm/bridge/samsung-dsim.c +++ b/drivers/gpu/drm/bridge/samsung-dsim.c @@ -2139,7 +2139,7 @@ MODULE_DEVICE_TABLE(of, samsung_dsim_of_match); static struct platform_driver samsung_dsim_driver = 
{ .probe = samsung_dsim_probe, - .remove_new = samsung_dsim_remove, + .remove = samsung_dsim_remove, .driver = { .name = "samsung-dsim", .pm = pm_ptr(&samsung_dsim_pm_ops), diff --git a/drivers/gpu/drm/bridge/sii902x.c b/drivers/gpu/drm/bridge/sii902x.c index 9be9cc5b9025..bf2d1632b020 100644 --- a/drivers/gpu/drm/bridge/sii902x.c +++ b/drivers/gpu/drm/bridge/sii902x.c @@ -815,7 +815,8 @@ static int sii902x_audio_get_eld(struct device *dev, void *data, } static int sii902x_audio_get_dai_id(struct snd_soc_component *component, - struct device_node *endpoint) + struct device_node *endpoint, + void *data) { struct of_endpoint of_ep; int ret; @@ -840,7 +841,6 @@ static const struct hdmi_codec_ops sii902x_audio_codec_ops = { .mute_stream = sii902x_audio_mute, .get_eld = sii902x_audio_get_eld, .get_dai_id = sii902x_audio_get_dai_id, - .no_capture_mute = 1, }; static int sii902x_audio_codec_init(struct sii902x *sii902x, @@ -863,11 +863,12 @@ static int sii902x_audio_codec_init(struct sii902x *sii902x, .i2s = 1, /* Only i2s support for now. 
*/ .spdif = 0, .max_i2s_channels = 0, + .no_capture_mute = 1, }; u8 lanes[4]; int num_lanes, i; - if (!of_property_read_bool(dev->of_node, "#sound-dai-cells")) { + if (!of_property_present(dev->of_node, "#sound-dai-cells")) { dev_dbg(dev, "%s: No \"#sound-dai-cells\", no audio\n", __func__); return 0; @@ -1239,8 +1240,8 @@ static const struct of_device_id sii902x_dt_ids[] = { MODULE_DEVICE_TABLE(of, sii902x_dt_ids); static const struct i2c_device_id sii902x_i2c_ids[] = { - { "sii9022", 0 }, - { }, + { "sii9022" }, + { } }; MODULE_DEVICE_TABLE(i2c, sii902x_i2c_ids); diff --git a/drivers/gpu/drm/bridge/sii9234.c b/drivers/gpu/drm/bridge/sii9234.c index 0c74cdc07032..cd7837c9a6e0 100644 --- a/drivers/gpu/drm/bridge/sii9234.c +++ b/drivers/gpu/drm/bridge/sii9234.c @@ -945,8 +945,8 @@ static const struct of_device_id sii9234_dt_match[] = { MODULE_DEVICE_TABLE(of, sii9234_dt_match); static const struct i2c_device_id sii9234_id[] = { - { "SII9234", 0 }, - { }, + { "SII9234" }, + { } }; MODULE_DEVICE_TABLE(i2c, sii9234_id); diff --git a/drivers/gpu/drm/bridge/sil-sii8620.c b/drivers/gpu/drm/bridge/sil-sii8620.c index 26b8d137bce0..28a2e1ee04b2 100644 --- a/drivers/gpu/drm/bridge/sil-sii8620.c +++ b/drivers/gpu/drm/bridge/sil-sii8620.c @@ -2368,8 +2368,8 @@ static const struct of_device_id sii8620_dt_match[] = { MODULE_DEVICE_TABLE(of, sii8620_dt_match); static const struct i2c_device_id sii8620_id[] = { - { "sii8620", 0 }, - { }, + { "sii8620" }, + { } }; MODULE_DEVICE_TABLE(i2c, sii8620_id); diff --git a/drivers/gpu/drm/bridge/synopsys/Kconfig b/drivers/gpu/drm/bridge/synopsys/Kconfig index ca416dab156d..f3ab2f985f8c 100644 --- a/drivers/gpu/drm/bridge/synopsys/Kconfig +++ b/drivers/gpu/drm/bridge/synopsys/Kconfig @@ -59,3 +59,9 @@ config DRM_DW_MIPI_DSI select DRM_KMS_HELPER select DRM_MIPI_DSI select DRM_PANEL_BRIDGE + +config DRM_DW_MIPI_DSI2 + tristate + select DRM_KMS_HELPER + select DRM_MIPI_DSI + select DRM_PANEL_BRIDGE diff --git 
a/drivers/gpu/drm/bridge/synopsys/Makefile b/drivers/gpu/drm/bridge/synopsys/Makefile index 9869d9651ed1..9dc376d220ad 100644 --- a/drivers/gpu/drm/bridge/synopsys/Makefile +++ b/drivers/gpu/drm/bridge/synopsys/Makefile @@ -8,3 +8,4 @@ obj-$(CONFIG_DRM_DW_HDMI_CEC) += dw-hdmi-cec.o obj-$(CONFIG_DRM_DW_HDMI_QP) += dw-hdmi-qp.o obj-$(CONFIG_DRM_DW_MIPI_DSI) += dw-mipi-dsi.o +obj-$(CONFIG_DRM_DW_MIPI_DSI2) += dw-mipi-dsi2.o diff --git a/drivers/gpu/drm/bridge/synopsys/dw-hdmi-ahb-audio.c b/drivers/gpu/drm/bridge/synopsys/dw-hdmi-ahb-audio.c index 221e9a4edb40..cf1f66b7b192 100644 --- a/drivers/gpu/drm/bridge/synopsys/dw-hdmi-ahb-audio.c +++ b/drivers/gpu/drm/bridge/synopsys/dw-hdmi-ahb-audio.c @@ -645,7 +645,7 @@ static SIMPLE_DEV_PM_OPS(snd_dw_hdmi_pm, snd_dw_hdmi_suspend, static struct platform_driver snd_dw_hdmi_driver = { .probe = snd_dw_hdmi_probe, - .remove_new = snd_dw_hdmi_remove, + .remove = snd_dw_hdmi_remove, .driver = { .name = DRIVER_NAME, .pm = PM_OPS, diff --git a/drivers/gpu/drm/bridge/synopsys/dw-hdmi-cec.c b/drivers/gpu/drm/bridge/synopsys/dw-hdmi-cec.c index d4614de1ae1e..9549dabde941 100644 --- a/drivers/gpu/drm/bridge/synopsys/dw-hdmi-cec.c +++ b/drivers/gpu/drm/bridge/synopsys/dw-hdmi-cec.c @@ -346,7 +346,7 @@ static const struct dev_pm_ops dw_hdmi_cec_pm = { static struct platform_driver dw_hdmi_cec_driver = { .probe = dw_hdmi_cec_probe, - .remove_new = dw_hdmi_cec_remove, + .remove = dw_hdmi_cec_remove, .driver = { .name = "dw-hdmi-cec", .pm = pm_ptr(&dw_hdmi_cec_pm), diff --git a/drivers/gpu/drm/bridge/synopsys/dw-hdmi-gp-audio.c b/drivers/gpu/drm/bridge/synopsys/dw-hdmi-gp-audio.c index 423762da2ab4..ab18f9a3bf23 100644 --- a/drivers/gpu/drm/bridge/synopsys/dw-hdmi-gp-audio.c +++ b/drivers/gpu/drm/bridge/synopsys/dw-hdmi-gp-audio.c @@ -181,7 +181,7 @@ static void snd_dw_hdmi_remove(struct platform_device *pdev) static struct platform_driver snd_dw_hdmi_driver = { .probe = snd_dw_hdmi_probe, - .remove_new = snd_dw_hdmi_remove, + .remove = 
snd_dw_hdmi_remove, .driver = { .name = DRIVER_NAME, }, diff --git a/drivers/gpu/drm/bridge/synopsys/dw-hdmi-i2s-audio.c b/drivers/gpu/drm/bridge/synopsys/dw-hdmi-i2s-audio.c index 26c187d20d97..2c903c9fe805 100644 --- a/drivers/gpu/drm/bridge/synopsys/dw-hdmi-i2s-audio.c +++ b/drivers/gpu/drm/bridge/synopsys/dw-hdmi-i2s-audio.c @@ -148,7 +148,8 @@ static int dw_hdmi_i2s_get_eld(struct device *dev, void *data, uint8_t *buf, } static int dw_hdmi_i2s_get_dai_id(struct snd_soc_component *component, - struct device_node *endpoint) + struct device_node *endpoint, + void *data) { struct of_endpoint of_ep; int ret; @@ -225,7 +226,7 @@ static void snd_dw_hdmi_remove(struct platform_device *pdev) static struct platform_driver snd_dw_hdmi_driver = { .probe = snd_dw_hdmi_probe, - .remove_new = snd_dw_hdmi_remove, + .remove = snd_dw_hdmi_remove, .driver = { .name = DRIVER_NAME, }, diff --git a/drivers/gpu/drm/bridge/synopsys/dw-hdmi-qp.c b/drivers/gpu/drm/bridge/synopsys/dw-hdmi-qp.c index 181c5164b231..b281cabfe992 100644 --- a/drivers/gpu/drm/bridge/synopsys/dw-hdmi-qp.c +++ b/drivers/gpu/drm/bridge/synopsys/dw-hdmi-qp.c @@ -361,22 +361,6 @@ static int dw_hdmi_qp_config_drm_infoframe(struct dw_hdmi_qp *hdmi, return 0; } -static int dw_hdmi_qp_bridge_atomic_check(struct drm_bridge *bridge, - struct drm_bridge_state *bridge_state, - struct drm_crtc_state *crtc_state, - struct drm_connector_state *conn_state) -{ - struct dw_hdmi_qp *hdmi = bridge->driver_private; - int ret; - - ret = drm_atomic_helper_connector_hdmi_check(conn_state->connector, - conn_state->state); - if (ret) - dev_dbg(hdmi->dev, "%s failed: %d\n", __func__, ret); - - return ret; -} - static void dw_hdmi_qp_bridge_atomic_enable(struct drm_bridge *bridge, struct drm_bridge_state *old_state) { @@ -442,16 +426,14 @@ dw_hdmi_qp_bridge_edid_read(struct drm_bridge *bridge, } static enum drm_mode_status -dw_hdmi_qp_bridge_mode_valid(struct drm_bridge *bridge, - const struct drm_display_info *info, - const struct 
drm_display_mode *mode) +dw_hdmi_qp_bridge_tmds_char_rate_valid(const struct drm_bridge *bridge, + const struct drm_display_mode *mode, + unsigned long long rate) { struct dw_hdmi_qp *hdmi = bridge->driver_private; - unsigned long long rate; - rate = drm_hdmi_compute_mode_clock(mode, 8, HDMI_COLORSPACE_RGB); if (rate > HDMI14_MAX_TMDSCLK) { - dev_dbg(hdmi->dev, "Unsupported mode clock: %d\n", mode->clock); + dev_dbg(hdmi->dev, "Unsupported TMDS char rate: %lld\n", rate); return MODE_CLOCK_HIGH; } @@ -505,12 +487,11 @@ static const struct drm_bridge_funcs dw_hdmi_qp_bridge_funcs = { .atomic_duplicate_state = drm_atomic_helper_bridge_duplicate_state, .atomic_destroy_state = drm_atomic_helper_bridge_destroy_state, .atomic_reset = drm_atomic_helper_bridge_reset, - .atomic_check = dw_hdmi_qp_bridge_atomic_check, .atomic_enable = dw_hdmi_qp_bridge_atomic_enable, .atomic_disable = dw_hdmi_qp_bridge_atomic_disable, .detect = dw_hdmi_qp_bridge_detect, .edid_read = dw_hdmi_qp_bridge_edid_read, - .mode_valid = dw_hdmi_qp_bridge_mode_valid, + .hdmi_tmds_char_rate_valid = dw_hdmi_qp_bridge_tmds_char_rate_valid, .hdmi_clear_infoframe = dw_hdmi_qp_bridge_clear_infoframe, .hdmi_write_infoframe = dw_hdmi_qp_bridge_write_infoframe, }; diff --git a/drivers/gpu/drm/bridge/synopsys/dw-hdmi-qp.h b/drivers/gpu/drm/bridge/synopsys/dw-hdmi-qp.h index 2115b8ef0bd6..72987e6c4689 100644 --- a/drivers/gpu/drm/bridge/synopsys/dw-hdmi-qp.h +++ b/drivers/gpu/drm/bridge/synopsys/dw-hdmi-qp.h @@ -1,6 +1,6 @@ /* SPDX-License-Identifier: GPL-2.0 */ /* - * Copyright (C) Rockchip Electronics Co.Ltd + * Copyright (C) Rockchip Electronics Co., Ltd. 
* Author: * Algea Cao */ diff --git a/drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi2.c b/drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi2.c new file mode 100644 index 000000000000..d7569bf2d9c3 --- /dev/null +++ b/drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi2.c @@ -0,0 +1,1030 @@ +// SPDX-License-Identifier: GPL-2.0+ +/* + * Copyright (c) 2024, Fuzhou Rockchip Electronics Co., Ltd + * + * Modified by Heiko Stuebner + * This generic Synopsys DesignWare MIPI DSI2 host driver is based on the + * Rockchip version from rockchip/dw-mipi-dsi2.c converted to use bridge APIs. + */ + +#include +#include +#include +#include +#include +#include +#include +#include + +#include