This reverts commit c802bdf42e and makes small adjustments.
Until we have a proper synchronization in place between emitting Cs triggered by the app thread, and fetching them from the queue, to measure the CsThread-caused delay, this config option is still useful for running some rare CsThread-limited games.
Stutters less this way because we increase the sensitivity to mark frames as outliers, so that they don't get used for predicting the next frame. The actual "optimal" threshold is still to be fine-tuned, but this one worked really well.
Possibly can be optimized more, but just changing these numbers already had a huge effect, especially for games having a small number of submissions to begin with.
In d3d9 there were situations where the first frameId was 22, although in d3d11 it always started at 17. This did cause issues especially when waiting for fences which didn't get signaled for these frameIds.
In practice, this change affects oversubscribed threading situations where waking up the "dxvk-queue" thread potentially can cause delays in the 100s of microseconds. For a lot of situations this change isn't affecting measurements in a meaningful way. Possibly affects AMD where vkQueueSubmit execution time is non-zero.
Only need to deal with common write-after-read scenarios, we can
ignore writes since those will add extra barriers anyway. Also
move this work out of the somewhat hot pipeline bind function.
This reverts commit efeb15edbd, because ordering guarantees were broken, that notifyGpuPresentEnd should happen after notifyGpuPresentBegin, which in turn lead to wrong latency measurements in case vkWaitForPresent was skipped.
Avoids keeping draw buffers alive when the app stops using indirect
draws. Unlikely to have caused issues in practice, but draw buffers
are not part of the API state to begin with.
Not like anybody uses this feature, but we need to both check for
hazards and make sure the SO counter actually gets tracked. Use
the existing draw buffer mechanism for this.