It is only being used to clear a struct and set the type, after which it
is overwritten. Since we no longer check the unset bits of the union,
skipping the clear is permissible.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170114002827.31315-6-chris@chris-wilson.co.uk
Reading the ggtt_views is much more pleasant without the extra
characters from specifying the union (i.e. ggtt_view.partial rather than
ggtt_view.params.partial). To make this work inside i915_vma_compare()
with only a single memcmp requires us to ensure that there are no
uninitialised bytes within each branch of the union (we make sure the
structs are packed) and we need to store the size of each branch.
v4: Rewrite changelog and add comments explaining the assert.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20170114002827.31315-5-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
In preparation for the next patch to convert to using an anonymous union
and leaving the excess bytes in the union uninitialised, we first need
to make sure we do not compare using those uninitialised bytes. We also
want to preserve the compactness of the code, avoiding a second call to
memcmp or introducing a switch, so we take advantage of using the type
as an encoded size (as well as a unique identifier for each type of view).
v2: Add the rationale for why we encode size into ggtt_view.type as a
comment before the memcmp()
v3: Use a switch to also assert that no two i915_ggtt_view_type have the same
value.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170114002827.31315-3-chris@chris-wilson.co.uk
In the next few patches, we will depend upon there being no
uninitialised bits inside the ggtt_view. To ensure this we add the
__packed attribute and double check with a build bug that gcc hasn't
expanded the struct to include some padding bytes.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170114002827.31315-2-chris@chris-wilson.co.uk
Reports live state of PSR2 form PSR2_STATUS register.
bit field 31:28 gives the live state of PSR2.
It can be used to check if system is in deep sleep,
selective update or selective update standby.
During video play back, we can use this to check
if system is entering SU mode or not.
when system is in idle state, DEEP_SLEEP(8) must be entered.
When video playback is happening, system must be in
SLEEP(3 / selective update) or SU_STANDBY( 6 / selective update standby)
v2: (Rodrigo)
- Remove EDP_PSR2_STATUS_TG_ON=a ,instead use ARRAY_SIZE
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Jim Bride <jim.bride@linux.intel.com>
Signed-off-by: Vathsala Nagaraju <vathsala.nagaraju@intel.com>
Signed-off-by: Patil Deepti <deepti.patil@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1483720352-24761-1-git-send-email-vathsala.nagaraju@intel.com
Psr2 is enabled only for y cordinate panels.Once GTC (global time code)
is implemented,this restriction is removed so that psr2
can work on panels without y cordinate support.
v2: (Rodrigo)
- Move the check to intel_psr_match_conditions
v3: (Rodrigo)
- add return false
v4: rebase
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Jim Bride <jim.bride@linux.intel.com>
Signed-off-by: Vathsala Nagaraju <vathsala.nagaraju@intel.com>
Signed-off-by: Patil Deepti <deepti.patil@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1484173710-3138-1-git-send-email-vathsala.nagaraju@intel.com
Program EDP_PSR_DEBUG_CTL (PSR_MASK) to enable system
to go to deep sleep while in psr2.PSR2_STATUS bit 31:28
should report value 8 , if system enters deep sleep state.
Also, EDP_FRAMES_BEFORE_SU_ENTRY is set 1 , if not set,
flickering is observed on psr2 panel.
v2: (Ilia Mirkin)
- Remove duplicate bit definition 25:27
v3: rebase
v4: rebase
v5: rebase
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Jim Bride <jim.bride@linux.intel.com>
Signed-off-by: Vathsala Nagaraju <vathsala.nagaraju@intel.com>
Signed-off-by: Patil Deepti <deepti.patil@intel.com>
Reviewed-by: Jim Bride <jim.bride@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1484267484-21843-1-git-send-email-vathsala.nagaraju@intel.com
As per bpsec, CHICKEN_TRANS_EDP bit 12 ,15 must be programmed in
psr2 enable sequence.
bit 12 : Program Transcoder EDP VSC DIP header with a valid setting for
PSR2 and Set CHICKEN_TRANS_EDP(0x420cc) bit 12 for programmable
header packet.
bit 15 : Set CHICKEN_TRANS_EDP(0x420cc) bit 15 if Y coordinate is supported
v2: (Rodrigo)
- move CHICKEN_TRANS_EDP bit set logic right after setup_vsc
v3:(Rodrigo)
- initialize chicken_trans to CHICKEN_TRANS_BIT12 instead of 0
v4:(chris wilson)
- use BIT(12), remove CHICKEN_TRANS_BIT12
- remove unnecessary comments
- update commit message
v5:
- rename bit 12 PSR2_VSC_ENABLE_PROG_HEADER
- rename bit 15 PSR2_ADD_VERTICAL_LINE_COUNT
v6:(Rodrigo)
- remove TRANS_EDP=3, use cpu_transcoder
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Jim Bride <jim.bride@linux.intel.com>
Signed-off-by: vathsala nagaraju <vathsala.nagaraju@intel.com>
Signed-off-by: Patil Deepti <deepti.patil@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1484247691-20930-1-git-send-email-vathsala.nagaraju@intel.com
This reverts commits 7c83d7abc9 and
a1f49cc179.
They caused the HW cursor to disappear under various circumstances in
the wild. I wasn't able to reproduce any of them, and I'm not sure
what's going on. But those changes aren't a big deal anyway, so let's
just revert for now.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=191291
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99143
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
We currently check after the slow path that the vma is bound correctly,
but we don't currently check after the fast path. This is important in
case we accidentally take the fast path and leave the vma misplaced.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170111210937.29252-27-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Fixes: 9cb07b099fb ("drm/msm: support multiple address spaces")
Reported-by: Riku Voipio <riku.voipio@linaro.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
It would race between userspace thread and commit worker. Ie. vblank
irq would trigger event and userspace could begin the next atomic
update, before the commit worker had a chance to clear the pending
flag.
If we do end up needing something to prevent userspace from trying
another pageflip before getting vblank event, it should probably be
implemented as a pending_planes bitmask, similar to pending_crtcs. See
start_atomic() and end_atomic().
Signed-off-by: Rob Clark <robdclark@gmail.com>
STANDALONE_UPDATE_F should be set if something changed in plane configurations,
including plane disable.
The patch fixes page-faults bugs, caused by decon still using framebuffers
of disabled planes.
v2: fixed clear-bit code (Thx Marek)
v3: use test_and_clear_bit (Thx Joonyoung)
Signed-off-by: Andrzej Hajda <a.hajda@samsung.com>
Tested-by: Joonyoung Shim <jy0922.shim@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Improper usage of DECON_UPDATE register leads to subtle errors.
If it set in decon_commit when there are no active windows it results
in slow registry updates - all subsequent shadow registry updates takes more
than full vblank. On the other side if it is not set when there are
active windows it results in garbage on the screen after suspend/resume of
FB console.
The patch hopefully fixes it.
Signed-off-by: Andrzej Hajda <a.hajda@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
GT reset and FLR share some operations and they are both implemented in
our new function intel_gvt_reset_vgpu_locked(). This patch rewrite the
gt reset handler using this new function.
Besides, this new implementation fixed the old issue in GT reset. The
old implementation reset GGTT entries which is illegal. We only clear
GGTT entries at PCI level reset.
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Our function tests found several issues related to reusing vGPU
instance. They are qemu reboot failure, guest tdr after reboot, host
hang when reboot guest. All these issues are caused by dirty status
inherited from last VM.
This patch fix all these issues by resetting a virtual GPU before VM
use it. The reset logical is put into a low level function
_intel_gvt_reset_vgpu(), which supports Device Model Level Reset, Full
GT Reset and Per-Engine Reset.
vGPU Device Model Level Reset (DMLR) simulates the PCI reset to reset
the whole vGPU to default state as when it is created, including GTT,
execlist, scratch pages, cfg space, mmio space, pvinfo page, scheduler
and fence registers. The ultimate goal of vGPU DMLR is that reuse a
vGPU instance by different virtual machines. When we reassign a vGPU
to a virtual machine we must issue such reset first.
Full GT Reset and Per-Engine GT Reset are soft reset flow for GPU engines
(Render, Blitter, Video, Video Enhancement). It is defined by GPU Spec.
Unlike the FLR, GT reset only reset particular resource of a vGPU per
the reset request. Guest driver can issue a GT reset by programming
the virtual GDRST register to reset specific virtual GPU engine or all
engines.
Since vGPU DMLR and GT reset can share some code so we implement both
these two into one single function intel_gvt_reset_vgpu_locked(). The
parameter dmlr is to identify if we will do FLR or GT reset. The
parameter engine_mask is to specific the engines that need to be
resetted. If value ALL_ENGINES is given for engine_mask, it means
the caller requests a full gt reset that we will reset all virtual
GPU engines.
Signed-off-by: Changbin Du <changbin.du@intel.com>
Reviewed-by: Jike Song <jike.song@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
This patch introduces a new function intel_vgpu_reset_mmio() to
reset vGPU MMIO space (virtual registers of the vGPU). The default
values are loaded as firmware during gvt inititiation.
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Move the mmio space inititation function setup_vgpu_mmio()
and cleanup function clean_vgpu_mmio() in vgpu.c to dedicated
source file mmio.c, and rename them as intel_vgpu_init_mmio()
and intel_vgpu_clean_mmio() respectively.
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
This patch introduces a new function intel_vgpu_reset_cfg_space()
to reset vGPU configuration space. This function will unmap gttmmio
and aperture if they are mapped before. Then entire cfg space will
be restored to default values.
Currently we only do such reset when vGPU is not owned by any VM
so we simply restore entire cfg space to default value, not following
the PCIe FLR spec that some fields should remain unchanged.
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Move the configuration space inititation function setup_vgpu_cfg_space()
in vgpu.c to dedicated source file cfg_space.c, and rename the function
as intel_vgpu_init_cfg_space().
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
This patch introduces a new function intel_vgpu_reset_gtt() to reset
the all GTT related status, including GGTT, PPGTT, scratch page. This
function can free all shadowed PPGTT, clear all GGTT entry, and clear
scratch page to all zero. After this, we can ensure no gtt related
information can be leakaged from one VM to anothor one when assign
vgpu instance across different VMs (not simultaneously).
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
This patch introudces a new function intel_vgpu_reset_resource() to
reset allocated vgpu resources by intel_vgpu_alloc_resource(). So far
we only need clear the fence registers. The function _clear_vgpu_fence()
will reset both virtual and physical fence registers to 0.
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
As per edp1.4 spec , alpm is required for psr2 operation as it's
used for all psr2 main link power down management and alpm enable
bit must be set for psr2 operation.
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Jim Bride <jim.bride@linux.intel.com>
Signed-off-by: vathsala nagaraju <vathsala.nagaraju@intel.com>
Signed-off-by: Patil Deepti <deepti.patil@intel.com>
Reviewed-by: Jim Bride <jim.bride@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1483356663-32668-6-git-send-email-vathsala.nagaraju@intel.com
Screen freeze observed if AUX_FRAME_SYNC is not disabled
on psr2 exit.AUX_FRAME_SYNC needed for psr2 is enabled during
psr2 entry. It must be disabled on psr2 exit.
v2: rebase
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Jim Bride <jim.bride@linux.intel.com>
Signed-off-by: Vathsala Nagaraju <vathsala.nagaraju@intel.com>
Signed-off-by: Patil Deepti <deepti.patil@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1484147673-2044-1-git-send-email-vathsala.nagaraju@intel.com
Psr1 and psr2 are mutually exclusive,ie when psr2 is enabled,
psr1 should be disabled.When psr2 is exited , bit 31 of reg
PSR2_CTL must be set to 0 but currently bit 31 of SRD_CTL
(psr1 control register)is set to 0.
Also ,PSR2_IDLE state is looked up from SRD_STATUS(psr1 register)
instead of PSR2_STATUS register, which has wrong data, resulting
in blankscreen.
hsw_enable_source is split into hsw_enable_source_psr1 and
hsw_enable_source_psr2 for easier code review and maintenance,
as suggested by rodrigo and jim.
v2: (Rodrigo)
- Rename hsw_enable_source_psr* to intel_enable_source_psr*
v3: (Rodrigo)
- In hsw_psr_disable ,
1) for psr active case, handle psr2 followed by psr1.
2) psr inactive case, handle psr2 followed by psr1
v4:(Rodrigo)
- move psr2 restriction(32X20) to match_conditions function
returning false and fully blocking PSR to a new patch before
this one.
v5: in source_psr2, removed val = EDP_PSR_ENABLE
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Jim Bride <jim.bride@linux.intel.com>
Signed-off-by: Vathsala Nagaraju <vathsala.nagaraju@intel.com>
Signed-off-by: Patil Deepti <deepti.patil@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1484244059-9201-1-git-send-email-vathsala.nagaraju@intel.com
Program HardMin based on the vce_arbiter.ecclk
if ecclk is 0, disable ECLK DPM 0. Otherwise VCE
could hang if switching SCLK from DPM 0 to 6/7
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Flora Cui <Flora.Cui@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
can fix Bug 191281: vce ib test failed.
when vce idle, set vce clock gate, so the clock
in vce domain will be disabled.
when need to encode, disable vce clock gate,
enable the clocks to vce engine.
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Special MC ucode is required for these memory configurations.
Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Special MC ucode is required for these memory configurations.
Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Parameter: good.
Parameter - bad.
One day I'll learn the syntax.
Fixes: 625d988acc ("drm/i915: Extract reserving space in the GTT to a helper")
Fixes: e007b19d7b ("drm/i915: Use the MRU stack search after evicting")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170112164559.27232-1-chris@chris-wilson.co.uk
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
It was only needed to protect the connector_list walking, see
commit 8c4ccc4ab6
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Thu Jul 9 23:44:26 2015 +0200
drm/probe-helper: Grab mode_config.mutex in poll_init/enable
Unfortunately the commit message of that patch fails to mention that
the new locking check was for the connector_list.
But that requirement disappeared in
commit c36a3254f7
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Thu Dec 15 16:58:43 2016 +0100
drm: Convert all helpers to drm_connector_list_iter
and so we can drop this again.
This fixes a locking inversion on nouveau, where the rpm code needs to
re-enable. But in other places the rpm_get() calls are nested within
the big modeset locks.
While at it, also improve the kerneldoc for these two functions a
notch.
v2: Update the kerneldoc even more to explain that these functions
can't be called concurrently, or bad things happen (Chris).
Cc: Dave Airlie <airlied@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Tested-by: Lyude <lyude@redhat.com>
Reviewed-by: Lyude <lyude@redhat.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170111090117.5134-1-daniel.vetter@ffwll.ch
When dumping the VMA, include the parameters of the different GGTT views
so that we can distinguish them.
v2: Contract output and add MISSING_CASE for any unknown types.
v3: Show both stride and offset for rotated planes.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170112112108.31632-1-chris@chris-wilson.co.uk
The internal object is a collection of struct pages and so is
intrinsically linked to the available physical memory on the machine,
and not an arbitrary type from the uabi. Use phys_addr_t so the link
between size and memory consumption is clear, and then double check that
we don't overflow the maximum object size.
v2: Also assert that size is not zero - a mistake I made a few times
while writing selftests.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170112130431.1844-1-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Move the GuC invalidation of its ggtt TLB to where we perform the ggtt
modification rather than proliferate it into all the callers of the
insert (which may or may not in fact have to do the insertion).
v2: Just do the guc invalidate unconditionally, (afaict) it has no impact
without the guc loaded on gen8+
v3: Conditionally invalidate the guc - just in case that register has
not been validated for other modes.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170112110050.25333-1-chris@chris-wilson.co.uk
The visible member used to be in intel_plane_state->visible,
but has been moved to drm_plane_state->visible. In the conversion
some casts were left in that are now useless.
to_intel_plane_state(x)->base.visible is the same as x->visible,
so use the latter to clear up the code a little.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1484214225-30328-1-git-send-email-maarten.lankhorst@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Since commit c033666a94 ("drm/i915: Store a i915 backpointer from
engine, and use it") i915_reset receives dev_priv, but the kerneldoc
was not updated.
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170112041817.1102-3-michel.thierry@intel.com
And before the function description.
Tidy up from commit 14bb2c1179 ("drm/i915: Fix a buch of kerneldoc
warnings"), all others kerneldoc blocks look ok.
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170112041817.1102-2-michel.thierry@intel.com
The WaDisableLSQCROPERFforOCL workaround has the side effect of
disabling an L3SQ optimization that has huge performance implications
and is unlikely to be necessary for the correct functioning of usual
graphic workloads. Userspace is free to re-enable the workaround on
demand, and is generally in a better position to determine whether the
workaround is necessary than the DRM is (e.g. only during the
execution of compute kernels that rely on both L3 fences and HDC R/W
requests).
The same workaround seems to apply to BDW (at least to production
stepping G1) and SKL as well (the internal workaround database claims
that it does for all steppings, while the BSpec workaround table only
mentions pre-production steppings), but the DRM doesn't do anything
beyond whitelisting the L3SQCREG4 register so userspace can enable it
when it sees fit. Do the same on KBL platforms.
Improves performance of the GFXBench4 gl_manhattan31 benchmark by 60%,
and gl_4 (AKA car chase) by 14% on a KBL GT2 running Mesa master --
This is followed by a regression of 35% and 10% respectively for the
same benchmarks and platform caused by my recent patch series
switching userspace to use the dataport constant cache instead of the
sampler to implement uniform pull constant loads, which caused us to
hit more heavily the L3 cache (and on platforms other than KBL had the
opposite effect of improving performance of the same two benchmarks).
The overall effect on KBL of this change combined with the recent
userspace change is respectively 4.6% and 2.6%. SynMark2 OglShMapPcf
was affected by the constant cache changes (though it improved as it
did on other platforms rather than regressing), but is not
significantly affected by this patch (with statistical significance of
5% and sample size 20).
v2: Drop some more code to avoid unused variable warning.
Fixes: 738fa1b312 ("drm/i915/kbl: Add WaDisableLSQCROPERFforOCL")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99256
Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Cc: Eero Tamminen <eero.t.tamminen@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: beignet@lists.freedesktop.org
Cc: <stable@vger.kernel.org> # v4.7+
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
[Removed double Fixes tag]
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1484217894-20505-1-git-send-email-mika.kuoppala@intel.com
Since commit 4741da925f ("drm/i915/guc: Assert that all GGTT offsets used
by the GuC are mappable"), we're asserting that GuC firmware is in the
GuC mappable range.
Except we're not pinning the object with bias, which means it's possible
to trigger this assert. Let's add a proper bias.
Fixes: 4741da925f ("drm/i915/guc: Assert that all GGTT offsets used by the GuC are mappable")
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Tested-by: Tomi Sarvela <tomi.p.sarvela@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170111151739.28965-1-michal.winiarski@intel.com
This is the deprecated function for when you embedded the framebuffer
somewhere else (which breaks refcounting). But cma helpers are using
drm_framebuffer_remove and a free-standing fb, so this is rendundant.
Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1482835765-12044-2-git-send-email-daniel.vetter@ffwll.ch