1
0
Fork 0
mirror of synced 2025-03-06 20:59:54 +01:00
Commit graph

215 commits

Author SHA1 Message Date
Vinay Belgaumkar
cec82816d0 drm/i915/guc: Use context hints for GT frequency
Allow user to provide a low latency context hint. When set, KMD
sends a hint to GuC which results in special handling for this
context. SLPC will ramp the GT frequency aggressively every time
it switches to this context. The down freq threshold will also be
lower so GuC will ramp down the GT freq for this context more slowly.
We also disable waitboost for this context as that will interfere with
the strategy.

We need to enable the use of SLPC Compute strategy during init, but
it will apply only to contexts that set this bit during context
creation.

Userland can check whether this feature is supported using a new param-
I915_PARAM_HAS_CONTEXT_FREQ_HINT. This flag is true for all guc submission
enabled platforms as they use SLPC for frequency management.

The Mesa usage model for this flag is here -
https://gitlab.freedesktop.org/sushmave/mesa/-/commits/compute_hint

v2: Rename flags as per review suggestions (Rodrigo, Tvrtko).
Also, use flag bits in intel_context as it allows finer control for
toggling per engine if needed (Tvrtko).

v3: Minor review comments (Tvrtko)

v4: Update comment (Sushma)

Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Sushma Venkatesh Reddy <sushma.venkatesh.reddy@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Acked-by: Ivan Briano <ivan.briano@intel.com>
Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240306012759.204938-1-vinay.belgaumkar@intel.com
2024-03-07 10:25:06 -08:00
Tvrtko Ursulin
ca02a0119f drm/i915: Record which client owns a VM
To enable accounting of indirect client memory usage (such as page tables)
in the following patch, lets start recording the creator of each PPGTT.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231107101806.608990-2-tvrtko.ursulin@linux.intel.com
2023-11-10 11:48:54 +00:00
Kunwu Chan
27b086382c drm/i915: Fix potential spectre vulnerability
Fix smatch warning:
drivers/gpu/drm/i915/gem/i915_gem_context.c:847 set_proto_ctx_sseu()
warn: potential spectre issue 'pc->user_engines' [r] (local cap)

Fixes: d4433c7600 ("drm/i915/gem: Use the proto-context to handle create parameters (v5)")
Cc: <stable@vger.kernel.org> # v5.15+
Signed-off-by: Kunwu Chan <chentao@kylinos.cn>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231103110922.430122-1-tvrtko.ursulin@linux.intel.com
2023-11-06 08:59:34 +00:00
Chris Wilson
5945d8b9a8 drm/i915/gem: Use large rings for compute contexts
Allow compute contexts to submit the maximal amount of work without
blocking userspace.

The original size for user LRC ring's (SZ_16K) was chosen to minimise
memory consumption, without being so small as to frequently stall in the
middle of workloads. With the main consumers being GL / media pipelines
of 2 or 3 batches per frame, we want to support ~10 requests in flight
to allow for the application to control throttling without stalling
within a frame.

v2:
  - cover with else part

Signed-off-by: Chris Wilson <chris.p.wilson@intel.com>
Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230517135754.1110291-1-tejas.upadhyay@intel.com
2023-05-30 23:11:59 +02:00
Andi Shyti
badb302709 drm/i915: Use i915 instead of dev_priv insied the file_priv structure
In the process of renaming all instances of 'dev_priv' to 'i915',
start using 'i915' within the 'drm_i915_file_private' structure.

Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230322001611.632321-1-andi.shyti@linux.intel.com
2023-03-23 01:53:44 +01:00
Rob Clark
99343c46d4 drm/i915: Avoid potential vm use-after-free
Adding the vm to the vm_xa table makes it visible to userspace, which
could try to race with us to close the vm.  So we need to take our extra
reference before putting it in the table.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Fixes: 9ec8795e7d ("drm/i915: Drop __rcu from gem_context->vm")
Cc: <stable@vger.kernel.org> # v5.16+
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230119173321.2825472-1-robdclark@gmail.com
2023-01-27 13:44:27 +00:00
Tvrtko Ursulin
1ec23ed712 drm/i915: Use uabi engines for the default engine map
Default engine map is exactly about uabi engines so no excuse not to use
the appropriate iterator to populate it.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
[tursulin: Fixed up r-b tag spelling.]
Link: https://patchwork.freedesktop.org/patch/msgid/20230123185629.1593320-1-jonathan.cavitt@intel.com
2023-01-24 09:51:50 +00:00
Rob Clark
bed4b455cf drm/i915: Fix potential context UAFs
gem_context_register() makes the context visible to userspace, and which
point a separate thread can trigger the I915_GEM_CONTEXT_DESTROY ioctl.
So we need to ensure that nothing uses the ctx ptr after this.  And we
need to ensure that adding the ctx to the xarray is the *last* thing
that gem_context_register() does with the ctx pointer.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Fixes: eb4dedae92 ("drm/i915/gem: Delay tracking the GEM context until it is registered")
Fixes: a4c1cdd34e ("drm/i915/gem: Delay context creation (v3)")
Fixes: 49bd54b390 ("drm/i915: Track all user contexts per client")
Cc: <stable@vger.kernel.org> # v5.10+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
[tursulin: Stable and fixes tags add/tidy.]
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230103234948.1218393-1-robdclark@gmail.com
2023-01-06 10:10:47 +00:00
Alan Previn
f67986b011 drm/i915/pxp: Promote pxp subsystem to top-level of i915
Starting with MTL, there will be two GT-tiles, a render and media
tile. PXP as a service for supporting workloads with protected
contexts and protected buffers can be subscribed by process
workloads on any tile. However, depending on the platform,
only one of the tiles is used for control events pertaining to PXP
operation (such as creating the arbitration session and session
tear-down).

PXP as a global feature is accessible via batch buffer instructions
on any engine/tile and the coherency across tiles is handled implicitly
by the HW. In fact, for the foreseeable future, we are expecting this
single-control-tile for the PXP subsystem.

In MTL, it's the standalone media tile (not the root tile) because
it contains the VDBOX and KCR engine (among the assets PXP relies on
for those events).

Looking at the current code design, each tile is represented by the
intel_gt structure while the intel_pxp structure currently hangs off the
intel_gt structure.

Keeping the intel_pxp structure within the intel_gt structure makes some
internal functionalities more straight forward but adds code complexity to
code readability and maintainibility to many external-to-pxp subsystems
which may need to pick the correct intel_gt structure. An example of this
would be the intel_pxp_is_active or intel_pxp_is_enabled functionality
which should be viewed as a global level inquiry, not a per-gt inquiry.

That said, this series promotes the intel_pxp structure into the
drm_i915_private structure making it a top-level subsystem and the PXP
subsystem will select the control gt internally and keep a pointer to
it for internal reference.

This promotion comes with two noteworthy changes:

1. Exported pxp functions that are called by external subsystems
   (such as intel_pxp_enabled/active) will have to check implicitly
   if i915->pxp is valid as that structure will not be allocated
   for HW that doesn't support PXP.

2. Since GT is now considered a soft-dependency of PXP we are
   ensuring that GT init happens before PXP init and vice versa
   for fini. This causes a minor ordering change whereby we previously
   called intel_pxp_suspend after intel_uc_suspend but now is before
   i915_gem_suspend_late but the change is required for correct
   dependency flows. Additionally, this re-order change doesn't
   have any impact because at that point in either case, the top level
   entry to i915 won't observe any PXP events (since the GPU was
   quiesced during suspend_prepare). Also, any PXP event doesn't
   really matter when we disable the PXP HW (global GT irqs are
   already off anyway, so even if there was a bug that generated
   spurious events we wouldn't see it and we would just clean it
   up on resume which is okay since the default fallback action
   for PXP would be to keep the sessions off at this suspend stage).

Changes from prior revs:
  v11: - Reformat a comment (Tvrtko).
  v10: - Change the code flow for intel_pxp_init to make it more
         cleaner and readible with better comments explaining the
         difference between full-PXP-feature vs the partial-teelink
         inits depending on the platform. Additionally, only do
         the pxp allocation when we are certain the subsystem is
         needed. (Tvrtko).
   v9: - Cosmetic cleanups in supported/enabled/active. (Daniele).
       - Add comments for intel_pxp_init and pxp_get_ctrl_gt that
         explain the functional flow for when PXP is not supported
         but the backend-assets are needed for HuC authentication
         (Daniele and Tvrtko).
       - Fix two remaining functions that are accessible outside
         PXP that need to be checking pxp ptrs before using them:
         intel_pxp_irq_handler and intel_pxp_huc_load_and_auth
         (Tvrtko and Daniele).
       - User helper macro in pxp-debugfs (Tvrtko).
   v8: - Remove pxp_to_gt macro (Daniele).
       - Fix a bug in pxp_get_ctrl_gt for the case of MTL and we don't
         support GSC-FW on it. (Daniele).
       - Leave i915->pxp as NULL if we dont support PXP and in line
         with that, do additional validity check on i915->pxp for
         intel_pxp_is_supported/enabled/active (Daniele).
       - Remove unncessary include header from intel_gt_debugfs.c
         and check drm_minor i915->drm.primary (Daniele).
       - Other cosmetics / minor issues / more comments on suspend
         flow order change (Daniele).
   v7: - Drop i915_dev_to_pxp and in intel_pxp_init use 'i915->pxp'
         through out instead of local variable newpxp. (Rodrigo)
       - In the case intel_pxp_fini is called during driver unload but
         after i915 loading failed without pxp being allocated, check
         i915->pxp before referencing it. (Alan)
   v6: - Remove HAS_PXP macro and replace it with intel_pxp_is_supported
         because : [1] introduction of 'ctrl_gt' means we correct this
         for MTL's upcoming series now. [2] Also, this has little impact
         globally as its only used by PXP-internal callers at the moment.
       - Change intel_pxp_init/fini to take in i915 as its input to avoid
         ptr-to-ptr in init/fini calls.(Jani).
       - Remove the backpointer from pxp->i915 since we can use
         pxp->ctrl_gt->i915 if we need it. (Rodrigo).
   v5: - Switch from series to single patch (Rodrigo).
       - change function name from pxp_get_kcr_owner_gt to
         pxp_get_ctrl_gt.
       - Fix CI BAT failure by removing redundant call to intel_pxp_fini
         from driver-remove.
       - NOTE: remaining open still persists on using ptr-to-ptr
         and back-ptr.
   v4: - Instead of maintaining intel_pxp as an intel_gt structure member
         and creating a number of convoluted helpers that takes in i915 as
         input and redirects to the correct intel_gt or takes any intel_gt
         and internally replaces with the correct intel_gt, promote it to
         be a top-level i915 structure.
   v3: - Rename gt level helper functions to "intel_pxp_is_enabled/
         supported/ active_on_gt" (Daniele)
       - Upgrade _gt_supports_pxp to replace what was intel_gtpxp_is
         supported as the new intel_pxp_is_supported_on_gt to check for
         PXP feature support vs the tee support for huc authentication.
         Fix pxp-debugfs-registration to use only the former to decide
         support. (Daniele)
       - Couple minor optimizations.
   v2: - Avoid introduction of new device info or gt variables and use
         existing checks / macros to differentiate the correct GT->PXP
         control ownership (Daniele Ceraolo Spurio)
       - Don't reuse the updated global-checkers for per-GT callers (such
         as other files within PXP) to avoid unnecessary GT-reparsing,
         expose a replacement helper like the prior ones. (Daniele).
   v1: - Add one more patch to the series for the intel_pxp suspend/resume
         for similar refactoring

References: https://patchwork.freedesktop.org/patch/msgid/20221202011407.4068371-1-alan.previn.teres.alexis@intel.com
Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221208180542.998148-1-alan.previn.teres.alexis@intel.com
2022-12-09 08:36:30 -08:00
Tvrtko Ursulin
a10234fda4 drm/i915: Partial abandonment of legacy DRM logging macros
Convert some usages of legacy DRM logging macros into versions which tell
us on which device have the events occurred.

v2:
 * Don't have struct drm_device as local. (Jani, Ville)

v3:
 * Store gt, not i915, in workaround list. (John)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221109104633.2579245-1-tvrtko.ursulin@linux.intel.com
2022-11-10 12:35:46 +00:00
Karolina Drobnik
67f99e3447 i915/i915_gem_context: Remove debug message in i915_gem_context_create_ioctl
We know that as long as GEM context create ioctl succeeds, a context was
created. There is no need to write about it, especially when such a message
heavily pollutes dmesg and makes debugging actual errors harder.

Since commit baa89ba3f1 ("drm/i915/gem: initial conversion to new
logging macros using coccinelle"), the logging for creating a new user
context was moved under the driver debug output (for lack of a means for
per-user logs, and a lack of user-focused drm.debug parameter). This
only reveals how obnoxious having that spam be part of the driver debug
logs, so remove it. [ from Chris Wilson ]

Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Karolina Drobnik <karolina.drobnik@intel.com>
Cc: Andi Shyti <andi.shyti@linux.intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221025091903.986819-1-karolina.drobnik@intel.com
2022-10-27 12:28:17 +02:00
Matthew Brost
8332109430 drm/i915/guc: Delay disabling guc_id scheduling for better hysteresis
Add a delay, configurable via debugfs (default 34ms), to disable
scheduling of a context after the pin count goes to zero. Disable
scheduling is a costly operation as it requires synchronizing with
the GuC. So the idea is that a delay allows the user to resubmit
something before doing this operation. This delay is only done if
the context isn't closed and less than a given threshold
(default is 3/4) of the guc_ids are in use.

Alan Previn: Matt Brost first introduced this patch back in Oct 2021.
However no real world workload with measured performance impact was
available to prove the intended results. Today, this series is being
republished in response to a real world workload that benefited greatly
from it along with measured performance improvement.

Workload description: 36 containers were created on a DG2 device where
each container was performing a combination of 720p 3d game rendering
and 30fps video encoding. The workload density was configured in a way
that guaranteed each container to ALWAYS be able to render and
encode no less than 30fps with a predefined maximum render + encode
latency time. That means the totality of all 36 containers and their
workloads were not saturating the engines to their max (in order to
maintain just enough headrooom to meet the min fps and max latencies
of incoming container submissions).

Problem statement: It was observed that the CPU core processing the i915
soft IRQ work was experiencing severe load. Using tracelogs and an
instrumentation patch to count specific i915 IRQ events, it was confirmed
that the majority of the CPU cycles were caused by the
gen11_other_irq_handler() -> guc_irq_handler() code path. The vast
majority of the cycles was determined to be processing a specific G2H
IRQ: i.e. INTEL_GUC_ACTION_SCHED_CONTEXT_MODE_DONE. These IRQs are sent
by GuC in response to i915 KMD sending H2G requests:
INTEL_GUC_ACTION_SCHED_CONTEXT_MODE_SET. Those H2G requests are sent
whenever a context goes idle so that we can unpin the context from GuC.
The high CPU utilization % symptom was limiting density scaling.

Root Cause Analysis: Because the incoming execution buffers were spread
across 36 different containers (each with multiple contexts) but the
system in totality was NOT saturated to the max, it was assumed that each
context was constantly idling between submissions. This was causing
a thrashing of unpinning contexts from GuC at one moment, followed quickly
by repinning them due to incoming workload the very next moment. These
event-pairs were being triggered across multiple contexts per container,
across all containers at the rate of > 30 times per sec per context.

Metrics: When running this workload without this patch, we measured an
average of ~69K INTEL_GUC_ACTION_SCHED_CONTEXT_MODE_DONE events every 10
seconds or ~10 million times over ~25+ mins. With this patch, the count
reduced to ~480 every 10 seconds or about ~28K over ~10 mins. The
improvement observed is ~99% for the average counts per 10 seconds.

Design awareness: Selftest impact.
As temporary WA disable this feature for the selftests. Selftests are
very timing sensitive and any change in timing can cause failure. A
follow up patch will fixup the selftests to understand this delay.

Design awareness: Race between guc_request_alloc and guc_context_close.
If a context close is issued while there is a request submission in
flight and a delayed schedule disable is pending, guc_context_close
and guc_request_alloc will race to cancel the delayed disable.
To close the race, make sure that guc_request_alloc waits for
guc_context_close to finish running before checking any state.

Design awareness: GT Reset event.
If a gt reset is triggered, as preparation steps, add an additional step
to ensure all contexts that have a pending delay-disable-schedule task
be flushed of it. Move them directly into the closed state after cancelling
the worker. This is okay because the existing flow flushes all
yet-to-arrive G2H's dropping them anyway.

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221006225121.826257-2-alan.previn.teres.alexis@intel.com
2022-10-26 17:29:43 -07:00
Tvrtko Ursulin
0add082ceb drm/i915/guc: Fix revocation of non-persistent contexts
Patch which added graceful exit for non-persistent contexts missed the
fact it is not enough to set the exiting flag on a context and let the
backend handle it from there.

GuC backend cannot handle it because it runs independently in the
firmware and driver might not see the requests ever again. Patch also
missed the fact some usages of intel_context_is_banned in the GuC backend
needed replacing with newly introduced intel_context_is_schedulable.

Fix the first issue by calling into backend revoke when we know this is
the last chance to do it. Fix the second issue by replacing
intel_context_is_banned with intel_context_is_schedulable, which should
always be safe since latter is a superset of the former.

v2:
 * Just call ce->ops->revoke unconditionally. (Andrzej)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Fixes: 45c64ecf97 ("drm/i915: Improve user experience and driver robustness under SIGINT or similar")
Cc: Andrzej Hajda <andrzej.hajda@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: <stable@vger.kernel.org> # v6.0+
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Acked-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221003121630.694249-1-tvrtko.ursulin@linux.intel.com
2022-10-05 08:52:41 +01:00
Chris Wilson
ad3aa7c31e drm/i915/gem: Really move i915_gem_context.link under ref protection
i915_perf assumes that it can use the i915_gem_context reference to
protect its i915->gem.contexts.list iteration. However, this requires
that we do not remove the context from the list until after we drop the
final reference and release the struct. If, as currently, we remove the
context from the list during context_close(), the link.next pointer may
be poisoned while we are holding the context reference and cause a GPF:

[ 4070.573157] i915 0000:00:02.0: [drm:i915_perf_open_ioctl [i915]] filtering on ctx_id=0x1fffff ctx_id_mask=0x1fffff
[ 4070.574881] general protection fault, probably for non-canonical address 0xdead000000000100: 0000 [#1] PREEMPT SMP
[ 4070.574897] CPU: 1 PID: 284392 Comm: amd_performance Tainted: G            E     5.17.9 #180
[ 4070.574903] Hardware name: Intel Corporation NUC7i5BNK/NUC7i5BNB, BIOS BNKBL357.86A.0052.2017.0918.1346 09/18/2017
[ 4070.574907] RIP: 0010:oa_configure_all_contexts.isra.0+0x222/0x350 [i915]
[ 4070.574982] Code: 08 e8 32 6e 10 e1 4d 8b 6d 50 b8 ff ff ff ff 49 83 ed 50 f0 41 0f c1 04 24 83 f8 01 0f 84 e3 00 00 00 85 c0 0f 8e fa 00 00 00 <49> 8b 45 50 48 8d 70 b0 49 8d 45 50 48 39 44 24 10 0f 85 34 fe ff
[ 4070.574990] RSP: 0018:ffffc90002077b78 EFLAGS: 00010202
[ 4070.574995] RAX: 0000000000000002 RBX: 0000000000000002 RCX: 0000000000000000
[ 4070.575000] RDX: 0000000000000001 RSI: ffffc90002077b20 RDI: ffff88810ddc7c68
[ 4070.575004] RBP: 0000000000000001 R08: ffff888103242648 R09: fffffffffffffffc
[ 4070.575008] R10: ffffffff82c50bc0 R11: 0000000000025c80 R12: ffff888101bf1860
[ 4070.575012] R13: dead0000000000b0 R14: ffffc90002077c04 R15: ffff88810be5cabc
[ 4070.575016] FS:  00007f1ed50c0780(0000) GS:ffff88885ec80000(0000) knlGS:0000000000000000
[ 4070.575021] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 4070.575025] CR2: 00007f1ed5590280 CR3: 000000010ef6f005 CR4: 00000000003706e0
[ 4070.575029] Call Trace:
[ 4070.575033]  <TASK>
[ 4070.575037]  lrc_configure_all_contexts+0x13e/0x150 [i915]
[ 4070.575103]  gen8_enable_metric_set+0x4d/0x90 [i915]
[ 4070.575164]  i915_perf_open_ioctl+0xbc0/0x1500 [i915]
[ 4070.575224]  ? asm_common_interrupt+0x1e/0x40
[ 4070.575232]  ? i915_oa_init_reg_state+0x110/0x110 [i915]
[ 4070.575290]  drm_ioctl_kernel+0x85/0x110
[ 4070.575296]  ? update_load_avg+0x5f/0x5e0
[ 4070.575302]  drm_ioctl+0x1d3/0x370
[ 4070.575307]  ? i915_oa_init_reg_state+0x110/0x110 [i915]
[ 4070.575382]  ? gen8_gt_irq_handler+0x46/0x130 [i915]
[ 4070.575445]  __x64_sys_ioctl+0x3c4/0x8d0
[ 4070.575451]  ? __do_softirq+0xaa/0x1d2
[ 4070.575456]  do_syscall_64+0x35/0x80
[ 4070.575461]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 4070.575467] RIP: 0033:0x7f1ed5c10397
[ 4070.575471] Code: 3c 1c e8 1c ff ff ff 85 c0 79 87 49 c7 c4 ff ff ff ff 5b 5d 4c 89 e0 41 5c c3 66 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d a9 da 0d 00 f7 d8 64 89 01 48
[ 4070.575478] RSP: 002b:00007ffd65c8d7a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 4070.575484] RAX: ffffffffffffffda RBX: 0000000000000006 RCX: 00007f1ed5c10397
[ 4070.575488] RDX: 00007ffd65c8d7c0 RSI: 0000000040106476 RDI: 0000000000000006
[ 4070.575492] RBP: 00005620972f9c60 R08: 000000000000000a R09: 0000000000000005
[ 4070.575496] R10: 000000000000000d R11: 0000000000000246 R12: 000000000000000a
[ 4070.575500] R13: 000000000000000d R14: 0000000000000000 R15: 00007ffd65c8d7c0
[ 4070.575505]  </TASK>
[ 4070.575507] Modules linked in: nls_ascii(E) nls_cp437(E) vfat(E) fat(E) i915(E) x86_pkg_temp_thermal(E) intel_powerclamp(E) crct10dif_pclmul(E) crc32_pclmul(E) crc32c_intel(E) aesni_intel(E) crypto_simd(E) intel_gtt(E) cryptd(E) ttm(E) rapl(E) intel_cstate(E) drm_kms_helper(E) cfbfillrect(E) syscopyarea(E) cfbimgblt(E) intel_uncore(E) sysfillrect(E) mei_me(E) sysimgblt(E) i2c_i801(E) fb_sys_fops(E) mei(E) intel_pch_thermal(E) i2c_smbus(E) cfbcopyarea(E) video(E) button(E) efivarfs(E) autofs4(E)
[ 4070.575549] ---[ end trace 0000000000000000 ]---

v3: fix incorrect syntax of spin_lock() replacing spin_lock_irqsave()

v2: irqsave not required in a worker, neither conversion to irq safe
    elsewhere (Tvrtko),
  - perf: it's safe to call gen8_configure_context() even if context has
    been closed, no need to check,
  - drop unrelated cleanup (Andi, Tvrtko)

Reported-by: Mark Janes <mark.janes@intel.com>
Closes: https://gitlab.freedesktop.org/drm/intel/issues/6222
References: a4e7ccdac3 ("drm/i915: Move context management under GEM")
Fixes: f8246cf4d9 ("drm/i915/gem: Drop free_work for GEM contexts")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: <stable@vger.kernel.org> # v5.12+
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220916092403.201355-3-janusz.krzysztofik@linux.intel.com
2022-09-19 17:57:07 +02:00
Matthew Auld
54c204c522 Revert "drm/i915/guc: Add delay to disable scheduling after pin count goes to zero"
This reverts commit 6a07990384.

Everything in CI using GuC is now timing out[1], and killing the machine
with this change (perhaps a deadlock?). CI was recently on fire due to
some changes coming in from -rc1, so likely the pre-merge CI results for
this series were invalid? For now just revert, unless GuC experts
already have a fix in mind.

[1] https://intel-gfx-ci.01.org/tree/drm-tip/index.html?

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220819123904.913750-1-matthew.auld@intel.com
2022-08-20 09:41:56 -07:00
Matthew Brost
6a07990384 drm/i915/guc: Add delay to disable scheduling after pin count goes to zero
Add a delay, configurable via debugfs (default 34ms), to disable
scheduling of a context after the pin count goes to zero. Disable
scheduling is a costly operation as it requires synchronizing with
the GuC. So the idea is that a delay allows the user to resubmit
something before doing this operation. This delay is only done if
the context isn't closed and less than a given threshold
(default is 3/4) of the guc_ids are in use.

As temporary WA disable this feature for the selftests. Selftests are
very timing sensitive and any change in timing can cause failure. A
follow up patch will fixup the selftests to understand this delay.

Alan Previn: Matt Brost first introduced this series back in Oct 2021.
However no real world workload with measured performance impact was
available to prove the intended results. Today, this series is being
republished in response to a real world workload that benefited greatly
from it along with measured performance improvement.

Workload description: 36 containers were created on a DG2 device where
each container was performing a combination of 720p 3d game rendering
and 30fps video encoding. The workload density was configured in a way
that guaranteed each container to ALWAYS be able to render and
encode no less than 30fps with a predefined maximum render + encode
latency time. That means the totality of all 36 containers and their
workloads were not saturating the engines to their max (in order to
maintain just enough headrooom to meet the min fps and max latencies
of incoming container submissions).

Problem statement: It was observed that the CPU core processing the i915
soft IRQ work was experiencing severe load. Using tracelogs and an
instrumentation patch to count specific i915 IRQ events, it was confirmed
that the majority of the CPU cycles were caused by the
gen11_other_irq_handler() -> guc_irq_handler() code path. The vast
majority of the cycles was determined to be processing a specific G2H
IRQ: i.e. INTEL_GUC_ACTION_SCHED_CONTEXT_MODE_DONE. These IRQs are sent
by GuC in response to i915 KMD sending H2G requests:
INTEL_GUC_ACTION_SCHED_CONTEXT_MODE_SET. Those H2G requests are sent
whenever a context goes idle so that we can unpin the context from GuC.
The high CPU utilization % symptom was limiting density scaling.

Root Cause Analysis: Because the incoming execution buffers were spread
across 36 different containers (each with multiple contexts) but the
system in totality was NOT saturated to the max, it was assumed that each
context was constantly idling between submissions. This was causing
a thrashing of unpinning contexts from GuC at one moment, followed quickly
by repinning them due to incoming workload the very next moment. These
event-pairs were being triggered across multiple contexts per container,
across all containers at the rate of > 30 times per sec per context.

Metrics: When running this workload without this patch, we measured an
average of ~69K INTEL_GUC_ACTION_SCHED_CONTEXT_MODE_DONE events every 10
seconds or ~10 million times over ~25+ mins. With this patch, the count
reduced to ~480 every 10 seconds or about ~28K over ~10 mins. The
improvement observed is ~99% for the average counts per 10 seconds.

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220817020511.2180747-3-alan.previn.teres.alexis@intel.com
2022-08-18 15:23:47 -07:00
katrinzhou
7482a65664 drm/i915/gem: add missing else
Add missing else in set_proto_ctx_param() to fix coverity issue.

Addresses-Coverity: ("Unused value")
Fixes: d4433c7600 ("drm/i915/gem: Use the proto-context to handle create parameters (v5)")
Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: katrinzhou <katrinzhou@tencent.com>
[tursulin: fixup alignment]
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220621124926.615884-1-tvrtko.ursulin@linux.intel.com
2022-06-22 07:50:03 +01:00
Tvrtko Ursulin
45c64ecf97 drm/i915: Improve user experience and driver robustness under SIGINT or similar
We have long standing customer complaints that pressing Ctrl-C (or to the
effect of) causes engine resets with otherwise well behaving programs.

Not only is logging engine resets during normal operation not desirable
since it creates support incidents, but more fundamentally we should avoid
going the engine reset path when we can since any engine reset introduces
a chance of harming an innocent context.

Reason for this undesirable behaviour is that the driver currently does
not distinguish between banned contexts and non-persistent contexts which
have been closed.

To fix this we add the distinction between the two reasons for revoking
contexts, which then allows the strict timeout only be applied to banned,
while innocent contexts (well behaving) can preempt cleanly and exit
without triggering the engine reset path.

Note that the added context exiting category applies both to closed non-
persistent context, and any exiting context when hangcheck has been
disabled by the user.

At the same time we rename the backend operation from 'ban' to 'revoke'
which more accurately describes the actual semantics. (There is no ban at
the backend level since banning is a concept driven by the scheduling
frontend. Backends are simply able to revoke a running context so that
is the more appropriate name chosen.)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220527072452.2225610-1-tvrtko.ursulin@linux.intel.com
2022-06-17 09:03:11 +01:00
Matt Roper
b87d390196 drm/i915/sseu: Disassociate internal subslice mask representation from uapi
As with EU masks, it's easier to store subslice/DSS masks internally in
a format that's more natural for the driver to work with, and then only
covert into the u8[] uapi form when the query ioctl is invoked.  Since
the hardware design changed significantly with Xe_HP, we'll use a union
to choose between the old "hsw-style" subslice masks or the newer xehp
mask.  HSW-style masks will be stored in an array of u8's, indexed by
slice (there's never more than 6 subslices per slice on older
platforms).  For Xe_HP and beyond where slices no longer exist, we only
need a single bitmask.  However we already know that this mask is
eventually going to grow too large for a simple u64 to hold, so we'll
represent it in a manner that can be operated on by the utilities in
linux/bitmap.h.

v2:
 - Fix typo: BIT(s) -> BIT(ss) in gen9_sseu_device_status()

v3:
 - Eliminate sseu->ss_stride and just calculate the stride while
   specifically handling uapi.  (Tvrtko)
 - Use BITMAP_BITS() macro to refer to size of masks rather than
   passing I915_MAX_SS_FUSE_BITS directly.  (Tvrtko)
 - Report compute/geometry DSS masks separately when dumping Xe_HP SSEU
   info.  (Tvrtko)
 - Restore dropped range checks to intel_sseu_has_subslice().  (Tvrtko)

v4:
 - Make the bitmap size macro check the size of the .xehp field rather
   than the containing union.  (Tvrtko)
 - Don't add GEM_BUG_ON() intel_sseu_has_subslice()'s check for whether
   slice or subslice ID exceed sseu->max_[sub]slices; various loops
   in the driver are expected to exceed these, so we should just
   silently return 'false.'

v5:
 - Move XEHP_BITMAP_BITS() to the header so that we can also replace a
   usage of I915_MAX_SS_FUSE_BITS in one of the inline functions.
   (Bala)
 - Change the local variable in intel_slicemask_from_xehp_dssmask() from
   u16 to 'unsigned long' to make it a bit more future-proof.

Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220601150725.521468-6-matthew.d.roper@intel.com
2022-06-02 07:20:59 -07:00
Rodrigo Vivi
e1e1f4e325 Merge drm/drm-next into drm-intel-gt-next
In order to get the GSC Support merged on drm-intel-gt-next
in a clean fashion we needed this ATS-M patch to avoid
conflict in i915_pci.c:

commit 412c942bdf ("drm/i915/ats-m: add ATS-M platform info")

--

Fixing a silent conflict on drivers/gpu/drm/i915/gt/intel_gt_gmch.c:
-       if (!intel_vtd_active(i915))
+       if (!i915_vtd_active(i915))

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2022-04-21 13:48:26 -04:00
Jani Nikula
83970cd63b Merge drm/drm-next into drm-intel-next
Sync up with v5.18-rc1, in particular to get 5e3094cfd9
("drm/i915/xehpsdv: Add has_flat_ccs to device info").

Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2022-04-11 16:01:56 +03:00
Tvrtko Ursulin
49bd54b390 drm/i915: Track all user contexts per client
We soon want to start answering questions like how much GPU time is the
context belonging to a client which exited still using.

To enable this we start tracking all context belonging to a client on a
separate list.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220401142205.3123159-5-tvrtko.ursulin@linux.intel.com
2022-04-05 08:39:07 +01:00
Tvrtko Ursulin
8399eec8a1 drm/i915: Track runtime spent in closed and unreachable GEM contexts
As contexts are abandoned we want to remember how much GPU time they used
(per class) so later we can used it for smarter purposes.

As GEM contexts are closed we want to have the DRM client remember how
much GPU time they used (per class) so later we can used it for smarter
purposes.

v2:
 * Size past runtimes array by uabi class, not internal.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com> # v1
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> # v1
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220401142205.3123159-4-tvrtko.ursulin@linux.intel.com
2022-04-05 08:39:03 +01:00
Tvrtko Ursulin
43c504607d drm/i915: Make GEM contexts track DRM clients
Make GEM contexts keep a reference to i915_drm_client for the whole of
of their lifetime which will come handy in following patches.

v2: Don't bother supporting selftests contexts from debugfs. (Chris)
v3 (Lucas): Finish constructing ctx before adding it to the list
v4 (Ram): Rebase.
v5: Trivial rebase for proto ctx changes.
v6: Rebase after clients no longer track name and pid.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> # v5
Reviewed-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com> # v5
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220401142205.3123159-3-tvrtko.ursulin@linux.intel.com
2022-04-05 08:38:56 +01:00
Thomas Hellström
e1a7ab4fca drm/i915: Remove the vm open count
vms are not getting properly closed. Rather than fixing that,
Remove the vm open count and instead rely on the vm refcount.

The vm open count existed solely to break the strong references the
vmas had on the vms. Now instead make those references weak and
ensure vmas are destroyed when the vm is destroyed.

Unfortunately if the vm destructor and the object destructor both
wants to destroy a vma, that may lead to a race in that the vm
destructor just unbinds the vma and leaves the actual vma destruction
to the object destructor. However in order for the object destructor
to ensure the vma is unbound it needs to grab the vm mutex. In order
to keep the vm mutex alive until the object destructor is done with
it, somewhat hackishly grab a vm_resv refcount that is released late
in the vma destruction process, when the vm mutex is no longer needed.

v2: Address review-comments from Niranjana
- Clarify that the struct i915_address_space::skip_pte_rewrite is a hack
  and should ideally be replaced in an upcoming patch.
- Remove an unneeded continue in clear_vm_list and update comment.

v3:
- Documentation update
- Commit message formatting

Co-developed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220304082641.308069-2-thomas.hellstrom@linux.intel.com
2022-03-07 08:50:03 +01:00
Jani Nikula
e9b67ec2d3 drm/i915: include linux/highmem.h and linux/swap.h where needed
Include linux/highmem.h and linux/swap.h explicitly where needed so we
can drop the linux/i2c.h include from i915_drv.h where it pulled in the
dependencies implicitly.

Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220303181931.1661767-5-jani.nikula@intel.com
2022-03-04 11:15:25 +02:00
Matthew Brost
e393e2aa0a drm/i915/xehp: Don't support parallel submission on compute / render
A different emit breadcrumbs ring programming is required for compute /
render and we don't have UMD user so just reject parallel submission for
these engine classes.

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220301231549.1817978-11-matthew.d.roper@intel.com
2022-03-02 06:52:42 -08:00
Rodrigo Vivi
30424ebae8 Merge tag 'drm-intel-gt-next-2022-02-17' of git://anongit.freedesktop.org/drm/drm-intel into drm-intel-next
UAPI Changes:

- Weak parallel submission support for execlists

  Minimal implementation of the parallel submission support for
  execlists backend that was previously only implemented for GuC.
  Support one sibling non-virtual engine.

Core Changes:

- Two backmerges of drm/drm-next for header file renames/changes and
  i915_regs reorganization

Driver Changes:

- Add new DG2 subplatform: DG2-G12 (Matt R)
- Add new DG2 workarounds (Matt R, Ram, Bruce)
- Handle pre-programmed WOPCM registers for DG2+ (Daniele)
- Update guc shim control programming on XeHP SDV+ (Daniele)
- Add RPL-S C0/D0 stepping information (Anusha)
- Improve GuC ADS initialization to work on ARM64 on dGFX (Lucas)

- Fix KMD and GuC race on accessing PMU busyness (Umesh)
- Use PM timestamp instead of RING TIMESTAMP for reference in PMU with GuC (Umesh)
- Report error on invalid reset notification from GuC (John)
- Avoid WARN splat by holding RPM wakelock during PXP unbind (Juston)
- Fixes to parallel submission implementation (Matt B.)
- Improve GuC loading status check/error reports (John)
- Tweak TTM LRU priority hint selection (Matt A.)
- Align the plane_vma to min_page_size of stolen mem (Ram)

- Introduce vma resources and implement async unbinding (Thomas)
- Use struct vma_resource instead of struct vma_snapshot (Thomas)
- Return some TTM accel move errors instead of trying memcpy move (Thomas)
- Fix a race between vma / object destruction and unbinding (Thomas)
- Remove short-term pins from execbuf (Maarten)
- Update to GuC version 69.0.3 (John, Michal Wa.)
- Improvements to GT reset paths in GuC backend (Matt B.)
- Use shrinker_release_pages instead of writeback in shmem object hooks (Matt A., Tvrtko)
- Use trylock instead of blocking lock when freeing GEM objects (Maarten)
- Allocate intel_engine_coredump_alloc with ALLOW_FAIL (Matt B.)
- Fixes to object unmapping and purging (Matt A)
- Check for wedged device in GuC backend (John)
- Avoid lockdep splat by locking dpt_obj around set_cache_level (Maarten)
- Allow dead vm to unbind vma's without lock (Maarten)
- s/engine->i915/i915/ for DG2 engine workarounds (Matt R)

- Use to_gt() helper for GGTT accesses (Michal Wi.)
- Selftest improvements (Matt B., Thomas, Ram)
- Coding style and compiler warning fixes (Matt B., Jasmine, Andi, Colin, Gustavo, Dan)

From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/Yg4i2aCZvvee5Eai@jlahtine-mobl.ger.corp.intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
[Fixed conflicts while applying, using the fixups/drm-intel-gt-next.patch
from drm-rerere's 1f2b1742abdd ("2022y-02m-23d-16h-07m-57s UTC: drm-tip
rerere cache update")]
2022-02-23 15:03:51 -05:00
Jani Nikula
5f2ec9095c drm/i915: don't include drm_cache.h in i915_drv.h
Include it only in files that use it.

Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/14edab4a193ea3f73f387a88e3836c8555401871.1644507885.git.jani.nikula@intel.com
2022-02-14 13:19:37 +02:00
Jani Nikula
5472b3f2d9 drm/i915: split out i915_file_private.h from i915_drv.h
Limit the scope of struct drm_i915_file_private to the files that
actually need it.

Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/e375859dc1729a1b988036e4103e5b1bd48caa00.1644507885.git.jani.nikula@intel.com
2022-02-14 13:16:28 +02:00
Jani Nikula
d83d5298ba drm/i915: move i915_gem_vm_lookup() where it's used
Move the function next to the only user. Arguably it's perhaps not the
best place, but it's much better than having a static inline in a
header.

Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/a080e401840a8b9d45946ff33fd63c7939a623ae.1644507885.git.jani.nikula@intel.com
2022-02-14 12:36:48 +02:00
Tvrtko Ursulin
647bfd26bf Merge drm/drm-next into drm-intel-gt-next
Maarten needs backmerge to account for header file renames/changes which
landed via drm-intel-next and are interfering with his pinning work.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2022-01-18 10:54:02 +00:00
Linus Torvalds
8d0749b4f8 drm for 5.17-rc1
core:
 - add privacy screen support
 - move nomodeset option into drm subsystem
 - clean up nomodeset handling in drivers
 - make drm_irq.c legacy
 - fix stack_depot name conflicts
 - remove DMA_BUF_SET_NAME ioctl restrictions
 - sysfs: send hotplug event
 - replace several DRM_* logging macros with drm_*
 - move hashtable to legacy code
 - add error return from gem_create_object
 - cma-helper: improve interfaces, drop CONFIG_DRM_KMS_CMA_HELPER
 - kernel.h related include cleanups
 - support XRGB2101010 source buffers
 
 ttm:
 - don't include drm hashtable
 - stop pruning fences after wait
 - documentation updates
 
 dma-buf:
 - add dma_resv selftest
 - add debugfs helpers
 - remove dma_resv_get_excl_unlocked
 - documentation
 - make fences mandatory in dma_resv_add_excl_fence
 
 dp:
 - add link training delay helpers
 
 gem:
 - link shmem/cma helpers into separate modules
 - use dma_resv iteratior
 - import dma-buf namespace into gem helper modules
 
 scheduler:
 - fence grab fix
 - lockdep fixes
 
 bridge:
 - switch to managed MIPI DSI helpers
 - register and attach during probe fixes
 - convert to YAML in several places.
 
 panel:
 - add bunch of new panesl
 
 simpledrm:
 - support FB_DAMAGE_CLIPS
 - support virtual screen sizes
 - add Apple M1 support
 
 amdgpu:
 - enable seamless boot for DCN 3.01
 - runtime PM fixes
 - use drm_kms_helper_connector_hotplug_event
 - get all fences at once
 - use generic drm fb helpers
 - PSR/DPCD/LTTPR/DSC/PM/RAS/OLED/SRIOV fixes
 - add smart trace buffer (STB) for supported GPUs
 - display debugfs entries
 - new SMU debug option
 - Documentation update
 
 amdkfd:
 - IP discovery enumeration refactor
 - interface between driver fixes
 - SVM fixes
 - kfd uapi header to define some sysfs bitfields.
 
 i915:
 - support VESA panel backlights
 - enable ADL-P by default
 - add eDP privacy screen support
 - add Raptor Lake S (RPL-S) support
 - DG2 page table support
 - lots of GuC/HuC fw refactoring
 - refactored i915->gt interfaces
 - CD clock squashing support
 - enable 10-bit gamma support
 - update ADL-P DMC fw to v2.14
 - enable runtime PM autosuspend by default
 - ADL-P DSI support
 - per-lane DP drive settings for ICL+
 - add support for pipe C/D DMC firmware
 - Atomic gamma LUT updates
 - remove CCS FB stride restrictions on ADL-P
 - VRR platform support for display 11
 - add support for display audio codec keepalive
 - lots of display refactoring
 - fix runtime PM handling during PXP suspend
 - improved eviction performance with async TTM moves
 - async VMA unbinding improvements
 - VMA locking refactoring
 - improved error capture robustness
 - use per device iommu checks
 - drop bits stealing from i915_sw_fence function ptr
 - remove dma_resv_prune
 - add IC cache invalidation on DG2
 
 nouveau:
 - crc fixes
 - validate LUTs in atomic check
 - set HDMI AVI RGB quant to full
 
 tegra:
 - buffer objects reworks for dma-buf compat
 - NVDEC driver uAPI support
 - power management improvements
 
 etnaviv:
 - IOMMU enabled system support
 - fix > 4GB command buffer mapping
 - close a DoS vector
 - fix spurious GPU resets
 
 ast:
 - fix i2c initialization
 
 rcar-du:
 - DSI output support
 
 exynos:
 - replace legacy gpio interface
 - implement generic GEM object mmap
 
 msm:
 - dpu plane state cleanup in prep for multirect
 - dpu debugfs cleanups
 - dp support for sc7280
 - a506 support
 - removal of struct_mutex
 - remove old eDP sub-driver
 
 anx7625:
 - support MIPI DSI input
 - support HDMI audio
 - fix reading EDID
 
 lvds:
 - fix bridge DT bindings
 
 megachips:
 - probe both bridges before registering
 
 dw-hdmi:
 - allow interlace on bridge
 
 ps8640:
 - enable runtime PM
 - support aux-bus
 
 tx358768:
 - enable reference clock
 - add pulse mode support
 
 ti-sn65dsi86:
 - use regmap bulk write
 - add PWM support
 
 etnaviv:
 - get all fences at once
 
 gma500:
 - gem object cleanups
 
 kmb:
 - enable fb console
 
 radeon:
 - use dma_resv_wait_timeout
 
 rockchip:
 - add DSP hold timeout
 - suspend/resume fixes
 - PLL clock fixes
 - implement mmap in GEM object functions
 - use generic fbdev emulation
 
 sun4i:
 - use CMA helpers without vmap support
 
 vc4:
 - fix HDMI-CEC hang with display is off
 - power on HDMI controller while disabling
 - support 4K@60Hz modes
 - support 10-bit YUV 4:2:0 output
 
 vmwgfx:
 - fix leak on probe errors
 - fail probing on broken hosts
 - new placement for MOB page tables
 - hide internal BOs from userspace
 - implement GEM support
 - implement GL 4.3 support
 
 virtio:
 - overflow fixes
 
 xen:
 - implement mmap as GEM object function
 
 omapdrm:
 - fix scatterlist export
 - support virtual planes
 
 mediatek:
 - MT8192 support
 - CMDQ refinement
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEEKbZHaGwW9KfbeusDHTzWXnEhr4FAmHX1vMACgkQDHTzWXnE
 hr6rAw/9ES5RO5N3Ku9foFk1CI9bqy1Kh663KLkkEc+rDdhKpiZbBnAsrKkZ9sGu
 fNuHmWNN5nWXtDSOqHWuslt3F7Gh+qEBQtlkqC9mZsBm3bWB0aJK6E4QaJxfeSaK
 ta6AmyGx8DaV+C69i86dnemQurYSDVjROd7LDPKnCU0Fye/JxiXSXQmXksKMFVxd
 x5vmO9yfeDSg3EF+u1yB6nJNUYZBV0vhrAfjPqxPCRBXuQc7akuaglE/SFwlGnEk
 vn0GjVHEQcRTqYKrHr64xvQxIoKXcJP0pkDUyT7KYCsyj8GJkvxkb7/ls5pp5DvL
 SwyNg3J3vwUVP6w6GEvzf3ffG720qqUZvCbvLmE+A/t2DhGILiAm+HXSo43PTOW8
 uagT7Gxma8dy8EovjSxioS9HPX8Gcu+S+XYavgOsevOZ7oeEt4f4TLW7LXsw9d6y
 75FrMhiUpreab5hAh8Le0swuLYZHjdnJRdjSTqZJ/T6VdTdVftLT6IfwvSDx5CHy
 cWuufgcAjd7xVTXFquHWYXWLTQkiSMGf1M02jx9IWolTd4Cm41LNBhqMEDHZLHJD
 7ngGgoaREVDQ+MqjG90yfIwJFIpJPI3YOaHLi/Kznga+iDzFY6cyOQWW2vX7ZdY5
 7+LJWsgGT8Feb7/bzD5hX1mYqJLxh1pWUIaqIKMl+7LJL7gTVU8=
 =MByd
 -----END PGP SIGNATURE-----

Merge tag 'drm-next-2022-01-07' of git://anongit.freedesktop.org/drm/drm

Pull drm updates from Dave Airlie:
 "Highlights are support for privacy screens found in new laptops, a
  bunch of nomodeset refactoring, and i915 enables ADL-P systems by
  default, while starting to add RPL-S support.

  vmwgfx adds GEM and support for OpenGL 4.3 features in userspace.

  Lots of internal refactorings around dma reservations, and lots of
  driver refactoring as well.

  Summary:

  core:
   - add privacy screen support
   - move nomodeset option into drm subsystem
   - clean up nomodeset handling in drivers
   - make drm_irq.c legacy
   - fix stack_depot name conflicts
   - remove DMA_BUF_SET_NAME ioctl restrictions
   - sysfs: send hotplug event
   - replace several DRM_* logging macros with drm_*
   - move hashtable to legacy code
   - add error return from gem_create_object
   - cma-helper: improve interfaces, drop CONFIG_DRM_KMS_CMA_HELPER
   - kernel.h related include cleanups
   - support XRGB2101010 source buffers

  ttm:
   - don't include drm hashtable
   - stop pruning fences after wait
   - documentation updates

  dma-buf:
   - add dma_resv selftest
   - add debugfs helpers
   - remove dma_resv_get_excl_unlocked
   - documentation
   - make fences mandatory in dma_resv_add_excl_fence

  dp:
   - add link training delay helpers

  gem:
   - link shmem/cma helpers into separate modules
   - use dma_resv iteratior
   - import dma-buf namespace into gem helper modules

  scheduler:
   - fence grab fix
   - lockdep fixes

  bridge:
   - switch to managed MIPI DSI helpers
   - register and attach during probe fixes
   - convert to YAML in several places.

  panel:
   - add bunch of new panesl

  simpledrm:
   - support FB_DAMAGE_CLIPS
   - support virtual screen sizes
   - add Apple M1 support

  amdgpu:
   - enable seamless boot for DCN 3.01
   - runtime PM fixes
   - use drm_kms_helper_connector_hotplug_event
   - get all fences at once
   - use generic drm fb helpers
   - PSR/DPCD/LTTPR/DSC/PM/RAS/OLED/SRIOV fixes
   - add smart trace buffer (STB) for supported GPUs
   - display debugfs entries
   - new SMU debug option
   - Documentation update

  amdkfd:
   - IP discovery enumeration refactor
   - interface between driver fixes
   - SVM fixes
   - kfd uapi header to define some sysfs bitfields.

  i915:
   - support VESA panel backlights
   - enable ADL-P by default
   - add eDP privacy screen support
   - add Raptor Lake S (RPL-S) support
   - DG2 page table support
   - lots of GuC/HuC fw refactoring
   - refactored i915->gt interfaces
   - CD clock squashing support
   - enable 10-bit gamma support
   - update ADL-P DMC fw to v2.14
   - enable runtime PM autosuspend by default
   - ADL-P DSI support
   - per-lane DP drive settings for ICL+
   - add support for pipe C/D DMC firmware
   - Atomic gamma LUT updates
   - remove CCS FB stride restrictions on ADL-P
   - VRR platform support for display 11
   - add support for display audio codec keepalive
   - lots of display refactoring
   - fix runtime PM handling during PXP suspend
   - improved eviction performance with async TTM moves
   - async VMA unbinding improvements
   - VMA locking refactoring
   - improved error capture robustness
   - use per device iommu checks
   - drop bits stealing from i915_sw_fence function ptr
   - remove dma_resv_prune
   - add IC cache invalidation on DG2

  nouveau:
   - crc fixes
   - validate LUTs in atomic check
   - set HDMI AVI RGB quant to full

  tegra:
   - buffer objects reworks for dma-buf compat
   - NVDEC driver uAPI support
   - power management improvements

  etnaviv:
   - IOMMU enabled system support
   - fix > 4GB command buffer mapping
   - close a DoS vector
   - fix spurious GPU resets

  ast:
   - fix i2c initialization

  rcar-du:
   - DSI output support

  exynos:
   - replace legacy gpio interface
   - implement generic GEM object mmap

  msm:
   - dpu plane state cleanup in prep for multirect
   - dpu debugfs cleanups
   - dp support for sc7280
   - a506 support
   - removal of struct_mutex
   - remove old eDP sub-driver

  anx7625:
   - support MIPI DSI input
   - support HDMI audio
   - fix reading EDID

  lvds:
   - fix bridge DT bindings

  megachips:
   - probe both bridges before registering

  dw-hdmi:
   - allow interlace on bridge

  ps8640:
   - enable runtime PM
   - support aux-bus

  tx358768:
   - enable reference clock
   - add pulse mode support

  ti-sn65dsi86:
   - use regmap bulk write
   - add PWM support

  etnaviv:
   - get all fences at once

  gma500:
   - gem object cleanups

  kmb:
   - enable fb console

  radeon:
   - use dma_resv_wait_timeout

  rockchip:
   - add DSP hold timeout
   - suspend/resume fixes
   - PLL clock fixes
   - implement mmap in GEM object functions
   - use generic fbdev emulation

  sun4i:
   - use CMA helpers without vmap support

  vc4:
   - fix HDMI-CEC hang with display is off
   - power on HDMI controller while disabling
   - support 4K@60Hz modes
   - support 10-bit YUV 4:2:0 output

  vmwgfx:
   - fix leak on probe errors
   - fail probing on broken hosts
   - new placement for MOB page tables
   - hide internal BOs from userspace
   - implement GEM support
   - implement GL 4.3 support

  virtio:
   - overflow fixes

  xen:
   - implement mmap as GEM object function

  omapdrm:
   - fix scatterlist export
   - support virtual planes

  mediatek:
   - MT8192 support
   - CMDQ refinement"

* tag 'drm-next-2022-01-07' of git://anongit.freedesktop.org/drm/drm: (1241 commits)
  drm/amdgpu: no DC support for headless chips
  drm/amd/display: fix dereference before NULL check
  drm/amdgpu: always reset the asic in suspend (v2)
  drm/amdgpu: put SMU into proper state on runpm suspending for BOCO capable platform
  drm/amd/display: Fix the uninitialized variable in enable_stream_features()
  drm/amdgpu: fix runpm documentation
  amdgpu/pm: Make sysfs pm attributes as read-only for VFs
  drm/amdgpu: save error count in RAS poison handler
  drm/amdgpu: drop redundant semicolon
  drm/amd/display: get and restore link res map
  drm/amd/display: support dynamic HPO DP link encoder allocation
  drm/amd/display: access hpo dp link encoder only through link resource
  drm/amd/display: populate link res in both detection and validation
  drm/amd/display: define link res and make it accessible to all link interfaces
  drm/amd/display: 3.2.167
  drm/amd/display: [FW Promotion] Release 0.0.98
  drm/amd/display: Undo ODM combine
  drm/amd/display: Add reg defs for DCN303
  drm/amd/display: Changed pipe split policy to allow for multi-display pipe split
  drm/amd/display: Set optimize_pwr_state for DCN31
  ...
2022-01-10 12:58:46 -08:00
Matthew Brost
0f9d36af8f drm/i915: Fix possible uninitialized variable in parallel extension
'prev_engine' was declared inside the output loop and checked in the
inner after at least 1 pass of either loop. The variable should be
declared outside both loops as it needs to be persistent across the
entire loop structure.

Fixes: e5e32171a2 ("drm/i915/guc: Connect UAPI to GuC multi-lrc interface")
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211219001909.24348-1-matthew.brost@intel.com
(cherry picked from commit cbffbac9c1)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2021-12-27 11:33:34 +02:00
Matthew Brost
cbffbac9c1 drm/i915: Fix possible uninitialized variable in parallel extension
'prev_engine' was declared inside the output loop and checked in the
inner after at least 1 pass of either loop. The variable should be
declared outside both loops as it needs to be persistent across the
entire loop structure.

Fixes: e5e32171a2 ("drm/i915/guc: Connect UAPI to GuC multi-lrc interface")
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211219001909.24348-1-matthew.brost@intel.com
2021-12-23 12:53:06 -08:00
Dave Airlie
4817c37d71 Merge tag 'drm-intel-gt-next-2021-12-23' of git://anongit.freedesktop.org/drm/drm-intel into drm-next
Driver Changes:

- Added bits of DG2 support around page table handling (Stuart Summers, Matthew Auld)
- Fixed wakeref leak in PMU busyness during reset in GuC mode (Umesh Nerlige Ramappa)
- Fixed debugfs access crash if GuC failed to load (John Harrison)
- Bring back GuC error log to error capture, undoing accidental earlier breakage (Thomas Hellström)
- Fixed memory leak in error capture caused by earlier refactoring (Thomas Hellström)
- Exclude reserved stolen from driver use (Chris Wilson)
- Add memory region sanity checking and optional full test (Chris Wilson)
- Fixed buffer size truncation in TTM shmemfs backend (Robert Beckett)
- Use correct lock and don't overwrite internal data structures when stealing GuC context ids (Matthew Brost)
- Don't hog IRQs when destroying GuC contexts (John Harrison)
- Make GuC to Host communication more robust (Matthew Brost)
- Continuation of locking refactoring around VMA and backing store handling (Maarten Lankhorst)
- Improve performance of reading GuC log from debugfs (John Harrison)
- Log when GuC fails to reset an engine (John Harrison)
- Speed up GuC/HuC firmware loading by requesting RP0 (Vinay Belgaumkar)
- Further work on asynchronous VMA unbinding (Thomas Hellström, Christian König)

- Refactor GuC/HuC firmware handling to prepare for future platforms (John Harrison)
- Prepare for future different GuC/HuC firmware signing key sizes (Daniele Ceraolo Spurio, Michal Wajdeczko)
- Add noreclaim annotations (Matthew Auld)
- Remove racey GEM_BUG_ON between GPU reset and GuC communication handling (Matthew Brost)
- Refactor i915->gt with to_gt(i915) to prepare for future platforms (Michał Winiarski, Andi Shyti)
- Increase GuC log size for CONFIG_DEBUG_GEM (John Harrison)

- Fixed engine busyness in selftests when in GuC mode (Umesh Nerlige Ramappa)
- Make engine parking work with PREEMPT_RT (Sebastian Andrzej Siewior)
- Replace X86_FEATURE_PAT with pat_enabled() (Lucas De Marchi)
- Selftest for stealing of guc ids (Matthew Brost)

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/YcRvKO5cyPvIxVCi@tursulin-mobl2
2021-12-24 06:14:51 +10:00
Matthew Brost
a88afcfa25 drm/i915/execlists: Weak parallel submission support for execlists
A weak implementation of parallel submission (multi-bb execbuf IOCTL) for
execlists. Doing as little as possible to support this interface for
execlists - basically just passing submit fences between each request
generated and virtual engines are not allowed. This is on par with what
is there for the existing (hopefully soon deprecated) bonding interface.

We perma-pin these execlists contexts to align with GuC implementation.

v2:
 (John Harrison)
  - Drop siblings array as num_siblings must be 1
v3:
 (John Harrison)
  - Drop single submission
v4:
 (John Harrison)
  - Actually drop single submission
  - Use IS_ERR check on return value from intel_context_create
  - Set last request to NULL on unpin

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211222223532.28698-1-matthew.brost@intel.com
2021-12-23 11:14:23 -08:00
Michał Winiarski
1a9c4db4ca drm/i915/gem: Use to_gt() helper
Use to_gt() helper consistently throughout the codebase.
Pure mechanical s/i915->gt/to_gt(i915). No functional changes.

Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211214193346.21231-6-andi.shyti@linux.intel.com
2021-12-17 21:50:32 -08:00
Dave Airlie
211b4dbc07 Merge tag 'drm-intel-gt-next-2021-12-09' of git://anongit.freedesktop.org/drm/drm-intel into drm-next
Core Changes:

- Fix PENDING_ERROR leak in dma_fence_array_signaled() (Thomas Hellström)

Driver Changes:

- Fix runtime PM handling during PXP suspend (Tejas Upadhyay)
- Improve eviction performance on discrete by implementing async TTM moves (Thomas Hellström, Maarten Lankhorst)
- Improve robustness of error capture under memory pressure (Thomas Hellström)
- Fix GuC PMU versus GPU reset handling (Umesh Nerlige Ramappa)
- Use per device iommu check (Tvrtko Ursulin)
- Make error capture work with async migration (Thomas Hellström)
- Revert incorrect implementation of Wa_1508744258 causing hangs (José Roberto de Souza)
- Disable coarse power gating on some DG2 steppings workaround (Matt Roper)
- Add IC cache invalidation workaround on DG2 (Ramalingam C)
- Move two Icelake workarounds to the right place (Raviteja Goud Talla)
- Fix error pointer dereference in i915_gem_do_execbuffer() (Dan Carpenter)
- Fixup a couple of generic and DG2 specific issues in migration code (Matthew Auld)

- Fix kernel-doc warnings in i915_drm_object.c (Randy Dunlap)
- Drop stealing of bits from i915_sw_fence function pointer (Matthew Brost)
- Introduce new macros for i915 PTE (Michael Cheng)
- Prep work for engine reset by reset domain lookup (Tejas Upadhyay)

- Fixup drm-intel-gt-next build failure (Matthew Auld)
- Fix live_engine_busy_stats selftests in GuC mode (Umesh Nerlige Ramappa)
- Remove dma_resv_prune (Maarten Lankhorst)
- Preserve huge pages enablement after driver reload (Matthew Auld)
- Fix a NULL pointer dereference in igt_request_rewind() (selftests) (Zhou Qingyang)
- Add workaround numbers to GEN7_COMMON_SLICE_CHICKEN1 whitelisting (José Roberto de Souza)
- Increase timeouts in i915_gem_contexts selftests to handle GuC being slower (Bruce Chang)

Signed-off-by: Dave Airlie <airlied@redhat.com>

# Conflicts:
#	drivers/gpu/drm/i915/display/intel_fbc.c
From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/YbIBOeqhn+nPzaYD@tursulin-mobl2
2021-12-10 15:35:20 +10:00
Matthew Brost
44505168d7 drm/i915: Drop stealing of bits from i915_sw_fence function pointer
Rather than stealing bits from i915_sw_fence function pointer use
separate fields for function pointer and flags. If using two different
fields, the 4 byte alignment for the i915_sw_fence function pointer can
also be dropped.

v2:
 (CI)
  - Set new function field rather than flags in __i915_sw_fence_init
v3:
 (Tvrtko)
  - Remove BUG_ON(!fence->flags) in reinit as that will now blow up
  - Only define fence->flags if CONFIG_DRM_I915_SW_FENCE_CHECK_DAG is
    defined
v4:
  - Rebase, resend for CI

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211116194929.10211-1-matthew.brost@intel.com
2021-11-30 17:52:15 -08:00
Jani Nikula
c1bb3a463d Merge drm/drm-next into drm-intel-next
Backmerge to get the DP 2.0 MST changes merged to drm-next. This also
syncs us up to v5.15-rc7.

Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2021-10-29 13:40:45 +03:00
Rodrigo Vivi
2c85034db1 drm/i915: Clean-up bonding debug message.
We should stop using the gen name and the "+" to reference
the newer platforms.
And on this case specifically we can simplify the debug
message even further.

Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211015091129.83226-1-rodrigo.vivi@intel.com
2021-10-18 09:13:10 -04:00
Matthew Brost
4eb61ddc1b drm/i915: Enable multi-bb execbuf
Enable multi-bb execbuf by enabling the set_parallel extension.

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211014172005.27155-25-matthew.brost@intel.com
2021-10-15 10:45:51 -07:00
Matthew Brost
e5e32171a2 drm/i915/guc: Connect UAPI to GuC multi-lrc interface
Introduce 'set parallel submit' extension to connect UAPI to GuC
multi-lrc interface. Kernel doc in new uAPI should explain it all.

IGT: https://patchwork.freedesktop.org/patch/447008/?series=93071&rev=1
media UMD: https://github.com/intel/media-driver/pull/1252

v2:
 (Daniel Vetter)
  - Add IGT link and placeholder for media UMD link
v3:
 (Kernel test robot)
  - Fix warning in unpin engines call
 (John Harrison)
  - Reword a bunch of the kernel doc
v4:
 (John Harrison)
  - Add comment why perma-pin is done after setting gem context
  - Update some comments / docs for proto contexts
v5:
 (John Harrison)
  - Rework perma-pin comment
  - Add BUG_IN if context is pinned when setting gem context

Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211014172005.27155-17-matthew.brost@intel.com
2021-10-15 10:45:50 -07:00
Lucas De Marchi
1a839e016e drm/i915: remove IS_ACTIVE
When trying to bring IS_ACTIVE to linux/kconfig.h I thought it wouldn't
provide much value just encapsulating it in a boolean context. So I also
added the support for handling undefined macros as the IS_ENABLED()
counterpart. However the feedback received from Masahiro Yamada was that
it is too ugly, not providing much value. And just wrapping in a boolean
context is too dumb - we could simply open code it.

As detailed in commit babaab2f47 ("drm/i915: Encapsulate kconfig
constant values inside boolean predicates"), the IS_ACTIVE macro was
added to workaround a compilation warning. However after checking again
our current uses of IS_ACTIVE it turned out there is only
1 case in which it triggers a warning in clang (due
-Wconstant-logical-operand) and 2 in smatch. All the others
can simply use the shorter version, without wrapping it in any macro.

So here I'm dialing all the way back to simply removing the macro. That
single case hit by clang can be changed to make the constant come first,
so it doesn't think it's mask:

	-       if (context && CONFIG_DRM_I915_FENCE_TIMEOUT)
	+       if (CONFIG_DRM_I915_FENCE_TIMEOUT && context)

As talked with Dan Carpenter, that logic will be added in smatch as
well, so it will also stop warning about it.

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Masahiro Yamada <masahiroy@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20211005171728.3147094-1-lucas.demarchi@intel.com
2021-10-07 11:04:05 -07:00
Matthew Brost
84edf53776 drm/i915: Fix bug in user proto-context creation that leaked contexts
Set number of engines before attempting to create contexts so the
function free_engines can clean up properly. Also check return of
alloc_engines for NULL.

v2:
 (Tvrtko)
  - Send as stand alone patch
 (John Harrison)
  - Check for alloc_engines returning NULL
v3:
 (Checkpatch / Tvrtko)
  - Remove braces around single line if statement

Cc: Jason Ekstrand <jason@jlekstrand.net>
Fixes: d4433c7600 ("drm/i915/gem: Use the proto-context to handle create parameters (v5)")
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211001155825.6762-1-matthew.brost@intel.com
2021-10-05 10:19:39 +01:00
Daniele Ceraolo Spurio
32271ecd65 drm/i915/pxp: start the arb session on demand
Now that we can handle destruction and re-creation of the arb session,
we can postpone the start of the session to the first submission that
requires it, to avoid keeping it running with no user.

v10: increase timeout when waiting in intel_pxp_start as firmware
     session startup is slower right after boot.
v13: increase the same timeout by 50 milisec because previous timeout
     was not enough to cover two lower level 100 milisec timeouts
     in the session termination + creation steps.

Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210924191452.1539378-12-alan.previn.teres.alexis@intel.com
2021-10-04 13:11:06 -04:00
Daniele Ceraolo Spurio
d3ac8d4216 drm/i915/pxp: interfaces for using protected objects
This api allow user mode to create protected buffers and to mark
contexts as making use of such objects. Only when using contexts
marked in such a way is the execution guaranteed to work as expected.

Contexts can only be marked as using protected content at creation time
(i.e. the parameter is immutable) and they must be both bannable and not
recoverable. Given that the protected session gets invalidated on
suspend, contexts created this way hold a runtime pm wakeref until
they're either destroyed or invalidated.

All protected objects and contexts will be considered invalid when the
PXP session is destroyed and all new submissions using them will be
rejected. All intel contexts within the invalidated gem contexts will be
marked banned. Userspace can detect that an invalidation has occurred via
the RESET_STATS ioctl, where we report it the same way as a ban due to a
hang.

v5: squash patches, rebase on proto_ctx, update kerneldoc

v6: rebase on obj create_ext changes

v7: Use session counter to check if an object it valid, hold wakeref in
    context, don't add a new flag to RESET_STATS (Daniel)

v8: don't increase guilty count for contexts banned during pxp
    invalidation (Rodrigo)

v9: better comments, avoid wakeref put race between pxp_inval and
    context_close, add usage examples (Rodrigo)

v10: modify internal set/get-protected-context functions to not
     return -ENODEV when setting PXP param to false or getting param
     when running on pxp-unsupported hw or getting param when i915
     was built with CONFIG_PXP off

Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Bommu Krishnaiah <krishnaiah.bommu@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210924191452.1539378-11-alan.previn.teres.alexis@intel.com
2021-10-04 13:11:00 -04:00
Thomas Hellström
a259cc14ec drm/i915: Reduce the number of objects subject to memcpy recover
We really only need memcpy restore for objects that affect the
operability of the migrate context. That is, primarily the page-table
objects of the migrate VM.

Add an object flag, I915_BO_ALLOC_PM_EARLY for objects that need early
restores using memcpy and a way to assign LMEM page-table object flags
to be used by the vms.

Restore objects without this flag with the gpu blitter and only objects
carrying the flag using TTM memcpy.

Initially mark the migrate, gt, gtt and vgpu vms to use this flag, and
defer for a later audit which vms actually need it. Most importantly, user-
allocated vms with pinned page-table objects can be restored using the
blitter.

Performance-wise memcpy restore is probably as fast as gpu restore if not
faster, but using gpu restore will help tackling future restrictions in
mappable LMEM size.

v4:
- Don't mark the aliasing ppgtt page table flags for early resume, but
  rather the ggtt page table flags as intended. (Matthew Auld)
- The check for user buffer objects during early resume is pointless, since
  they are never marked I915_BO_ALLOC_PM_EARLY. (Matthew Auld)
v5:
- Mark GuC LMEM objects with I915_BO_ALLOC_PM_EARLY to have them restored
  before we fire up the migrate context.

Cc: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210922062527.865433-8-thomas.hellstrom@linux.intel.com
2021-09-24 08:19:16 +02:00
Daniel Vetter
9ec8795e7d drm/i915: Drop __rcu from gem_context->vm
It's been invariant since

    commit ccbc1b9794
    Author: Jason Ekstrand <jason@jlekstrand.net>
    Date:   Thu Jul 8 10:48:30 2021 -0500

        drm/i915/gem: Don't allow changing the VM on running contexts (v4)

this just completes the deed. I've tried to split out prep work for
more careful review as much as possible, this is what's left:

- get_ppgtt gets simplified since we don't need to grab a temporary
  reference - we can rely on the temporary reference for the gem_ctx
  while we inspect the vm. The new vm_id still needs a full
  i915_vm_open ofc. This also removes the final caller of context_get_vm_rcu

- A pile of selftests can now just look at ctx->vm instead of
  rcu_dereference_protected( , true) or similar things.

- All callers of i915_gem_context_vm also disappear.

- I've changed the hugepage selftest to set scrub_64K without any
  locking, because when we inspect that setting we're also not taking
  any locks either. It works because it's a selftests that's careful
  (single threaded gives you nice ordering) and not a live driver
  where races can happen from anywhere.

These can only be split up further if we have some intermediate state
with a bunch more rcu_dereference_protected(ctx->vm, true), just to
shut up lockdep and sparse.

The conversion to __rcu happened in

commit a4e7ccdac3
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Fri Oct 4 14:40:09 2019 +0100

    drm/i915: Move context management under GEM

Note that we're not breaking the actual bugfix in there: The real
bugfix is pushing the i915_vm_relase onto a separate worker, to avoid
locking inversion issues. The rcu conversion was just thrown in for
entertainment value on top (no vm lookup isn't even close to anything
that's a hotpath where removing the single spinlock can be measured).

v2: Rebase over the change to move the i915_vm_put() into
i915_gem_context_release().

v3: Trivial conflict against repainted shed.

Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20210902142057.929669-9-daniel.vetter@ffwll.ch
2021-09-06 11:11:32 +02:00