Karol's work which greatly improves volt/clock changes on a
heap of boards, nothing too exciting beyond a random collection of fixes.
* 'linux-4.9' of git://github.com/skeggsb/linux: (33 commits)
drm/nouveau/fb/nv50: defer DMA mapping of scratch page to oneinit() hook
drm/nouveau/fb/gf100: defer DMA mapping of scratch page to oneinit() hook
drm/nouveau/pci: set streaming DMA mask early
drm/nouveau/kms: add Maxwell to backlight initialization
drm/nouveau/bar/nv50: fix bar2 vm size
drm/nouveau/disp: remove unused function in sorg94.c
drm/nouveau/volt: use kernel's 64-bit signed division function
drm/nouveau/core: add missing header dependencies
drm/nouveau/gr/nv3x: add 0x0597 kelvin 3d class support
drm/nouveau/drm/nouveau: add a LED driver for the NVIDIA logo
drm/nouveau/fb/ram: Use Kepler implementation on Maxwell
drm/nouveau/volt: Make use of cvb coefficients
drm/nouveau/volt/gf100-: Add speedo
drm/nouveau/volt: Add implementation for gf100
drm/nouveau/bios/vmap: unk0 field is the mode
drm/nouveau/volt: Don't require perfect fit
drm/nouveau/clk: Allow boosting only when NvBoost is set
drm/nouveau/bios: Add parsing of VPSTATE table
drm/nouveau/clk: Respect voltage limits in nvkm_cstate_prog
drm/nouveau/clk: Fixup cstate selection
...
Some subdevices (i.e., fb/nv50.c and fb/gf100.c) map a scratch page using
dma_map_page() way before the TTM layer has had a chance to set the DMA
mask. This may prevent the driver from loading at all on platforms whose
system memory is not covered by the default DMA mask of 32-bit (i.e., when
all RAM is above 4 GB).
So set a preliminary DMA mask right after constructing the PCI device, and
base it on the .dma_bits member of the MMU subdevice, which is what the TTM
layer will base the DMA mask on as well.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
This reverts commit aff51175cd.
The commit caused fence timeouts within nvc0_screen_destroy and most likely
other places as well.
The most obvious effect is, that userspace processes take minutes to
actually quit.
Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
This flag's only remaining function is to ignore the uncached flag for
BOs on coherent architectures.
However the reason for allocating an object uncache on a non-coherent
architecture (namely because the cost of doing explicit flushes/
invalidations is higher than the benefit of caching the data because
accesses are few and far between) should also apply on architectures for
which coherency is maintained implicitly. Thus allocate coherent objects
as uncached on all architectures.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Without this buffer inconsistencies may appear between the CPU
and GPU when using a PCI GPU on an ARM64 board.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
NVIDIA have indicated that the workaround is required on all GK10[467]
boards that have the PGOB fuse set.
I've left the commandline option in place for now, as paranoia.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>