* devcoredump support for display errors
* dpu: irq cleanup/refactor
* dpu: dt bindings conversion to yaml
* dsi: dt bindings conversion to yaml
* mdp5: alpha/blend_mode/zpos support
* a6xx: cached coherent buffer support
* a660 support
* gpu iova fault improvements:
- info about which block triggered the fault, etc
- generation of gpu devcoredump on fault
* assortment of other cleanups and fixes
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rob Clark <robdclark@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGs4=qsGBBbyn-4JWqW4-YUSTKh67X3DsPQ=T2D9aXKqNA@mail.gmail.com
Add, via the adreno-smmu-priv interface, a way for the GPU to request
the SMMU to stall translation on faults, and then later resume the
translation, either retrying or terminating the current translation.
This will be used on the GPU side to "freeze" the GPU while we snapshot
useful state for devcoredump.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Jordan Crouse <jordan@cosmicpenguin.net>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20210610214431.539029-5-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>
Add a callback in adreno-smmu-priv to read interesting SMMU
registers to provide an opportunity for a richer debug experience
in the GPU driver.
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20210610214431.539029-3-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>
There shouldn't be any reason to ever use uncached over writecombine,
so just use writecombine for MSM_BO_UNCACHED.
Note: userspace never used MSM_BO_UNCACHED anyway
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Acked-by: Jordan Crouse <jordan@cosmicpenguin.net>
Link: https://lore.kernel.org/r/20210423190833.25319-6-jonathan@marek.ca
Signed-off-by: Rob Clark <robdclark@chromium.org>
Add a new cache mode for creating coherent host-cached BOs.
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>
Link: https://lore.kernel.org/r/20210423190833.25319-5-jonathan@marek.ca
Signed-off-by: Rob Clark <robdclark@chromium.org>
-----BEGIN PGP SIGNATURE-----
iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmDPuyMeHHRvcnZhbGRz
QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGvxgH/RKvSuRPwkJ2Jcp9
VLi5kCbqtJlYLq6tB6peSJ8otKgxkcRwC0pIY4LlYIAWYboktLQ5RKp/9nB2h2FN
aMZUMu6AI/lVJyFMI5MnKnJIUiUq+WXR3lSSlw68vwFLFdzqUZFNq+bqeiVvnIy1
yqA6naj24Tu/RbYffQoPvdSJcU2SLXRMxwD8HRGiU2d51RaFsOvsZvF+P5TVcsEV
ZmttJeER21CaI/A809eqaFmyGrUOcZZK9roZEbMwanTZOMw18biEsLu/UH4kBX01
JC4+RlGxcWjQ5YNZgChsgoOK/CHzc6ITztTntdeDWAvwZjQFzV7pCy4/3BWne3O+
5178yHM=
=o8cN
-----END PGP SIGNATURE-----
Backmerge tag 'v5.13-rc7' into drm-next
Backmerge Linux 5.13-rc7 to make some pulls from later bases apply,
and to bake in the conflicts so far.
bluetooth, netfilter and can.
Current release - regressions:
- mlxsw: spectrum_qdisc: Pass handle, not band number to find_class()
to fix modifying offloaded qdiscs
- lantiq: net: fix duplicated skb in rx descriptor ring
- rtnetlink: fix regression in bridge VLAN configuration, empty info
is not an error, bot-generated "fix" was not needed
- libbpf: s/rx/tx/ typo on umem->rx_ring_setup_done to fix
umem creation
Current release - new code bugs:
- ethtool: fix NULL pointer dereference during module EEPROM dump via
the new netlink API
- mlx5e: don't update netdev RQs with PTP-RQ, the special purpose queue
should not be visible to the stack
- mlx5e: select special PTP queue only for SKBTX_HW_TSTAMP skbs
- mlx5e: verify dev is present in get devlink port ndo, avoid a panic
Previous releases - regressions:
- neighbour: allow NUD_NOARP entries to be force GCed
- further fixes for fallout from reorg of WiFi locking
(staging: rtl8723bs, mac80211, cfg80211)
- skbuff: fix incorrect msg_zerocopy copy notifications
- mac80211: fix NULL ptr deref for injected rate info
- Revert "net/mlx5: Arm only EQs with EQEs" it may cause missed IRQs
Previous releases - always broken:
- bpf: more speculative execution fixes
- netfilter: nft_fib_ipv6: skip ipv6 packets from any to link-local
- udp: fix race between close() and udp_abort() resulting in a panic
- fix out of bounds when parsing TCP options before packets
are validated (in netfilter: synproxy, tc: sch_cake and mptcp)
- mptcp: improve operation under memory pressure, add missing wake-ups
- mptcp: fix double-lock/soft lookup in subflow_error_report()
- bridge: fix races (null pointer deref and UAF) in vlan tunnel egress
- ena: fix DMA mapping function issues in XDP
- rds: fix memory leak in rds_recvmsg
Misc:
- vrf: allow larger MTUs
- icmp: don't send out ICMP messages with a source address of 0.0.0.0
- cdc_ncm: switch to eth%d interface naming
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmDNP7EACgkQMUZtbf5S
IrvTmxAAgOAM9MdRl9wnYtqXKPXJ1JJtenozwt1yX6b6OG+Ns7cm6YYafU3KoZWR
KlzpvP90vRrER3RqksbMngHzvGjZKDS4LWRur7sRlJ1TBQoLrQCIbriAh07d7wlU
0nnS4J8mczTCKx78QCUYy1QBIX5TQrUbx0JQZDPoIPBjFeILW+Gx/Ghg5tUR4mhf
6icYqwIPocTXO37ZmWOzezZNVOXJF4kaQUZeuOHNe5hOtm6EeIpZbW1Xx3DIr5bd
80a/uNU7nVyos0n7jxnfVE/oelTnYbT5scZeV/PPVqZ4U113f7uex2QP23/XhGSX
lK1EhwPqPOyaNhQoihLM6Xzd4o7aZOcmF8NY96xqjC+DqdN+juvfJU+ClCZojGIj
H4bwCSaj3y2PiimfQdBiIKvYMc5d4zBdw/Dpk/gLDp4d5N638TAtuunK4Mj+TEuT
QF1qkBLIB4HFtLS0M35/twk93md/5GUdSTij2GB3fOkAWRu2m266P5m+4DigW/TB
Xm8FgKdetvxVP0Qv/p49nPEn24Ny8wCafH1x1wVTmoda2qi6j1EXMuSa0PlCdz70
Sl5FrlxdEkOpC4p+Aoc8APSoBXnOriAlpU+z/EVb8Co4JR/+Ge5zBWpsiZDVD0/K
Ay0FW3I87iyn9tw1H1Fzr9GBlVl5vWRauZFHjzl90fWakCrCzJE=
=xxUe
-----END PGP SIGNATURE-----
Merge tag 'net-5.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Networking fixes for 5.13-rc7, including fixes from wireless, bpf,
bluetooth, netfilter and can.
Current release - regressions:
- mlxsw: spectrum_qdisc: Pass handle, not band number to find_class()
to fix modifying offloaded qdiscs
- lantiq: net: fix duplicated skb in rx descriptor ring
- rtnetlink: fix regression in bridge VLAN configuration, empty info
is not an error, bot-generated "fix" was not needed
- libbpf: s/rx/tx/ typo on umem->rx_ring_setup_done to fix umem
creation
Current release - new code bugs:
- ethtool: fix NULL pointer dereference during module EEPROM dump via
the new netlink API
- mlx5e: don't update netdev RQs with PTP-RQ, the special purpose
queue should not be visible to the stack
- mlx5e: select special PTP queue only for SKBTX_HW_TSTAMP skbs
- mlx5e: verify dev is present in get devlink port ndo, avoid a panic
Previous releases - regressions:
- neighbour: allow NUD_NOARP entries to be force GCed
- further fixes for fallout from reorg of WiFi locking (staging:
rtl8723bs, mac80211, cfg80211)
- skbuff: fix incorrect msg_zerocopy copy notifications
- mac80211: fix NULL ptr deref for injected rate info
- Revert "net/mlx5: Arm only EQs with EQEs" it may cause missed IRQs
Previous releases - always broken:
- bpf: more speculative execution fixes
- netfilter: nft_fib_ipv6: skip ipv6 packets from any to link-local
- udp: fix race between close() and udp_abort() resulting in a panic
- fix out of bounds when parsing TCP options before packets are
validated (in netfilter: synproxy, tc: sch_cake and mptcp)
- mptcp: improve operation under memory pressure, add missing
wake-ups
- mptcp: fix double-lock/soft lookup in subflow_error_report()
- bridge: fix races (null pointer deref and UAF) in vlan tunnel
egress
- ena: fix DMA mapping function issues in XDP
- rds: fix memory leak in rds_recvmsg
Misc:
- vrf: allow larger MTUs
- icmp: don't send out ICMP messages with a source address of 0.0.0.0
- cdc_ncm: switch to eth%d interface naming"
* tag 'net-5.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (139 commits)
net: ethernet: fix potential use-after-free in ec_bhf_remove
selftests/net: Add icmp.sh for testing ICMP dummy address responses
icmp: don't send out ICMP messages with a source address of 0.0.0.0
net: ll_temac: Avoid ndo_start_xmit returning NETDEV_TX_BUSY
net: ll_temac: Fix TX BD buffer overwrite
net: ll_temac: Add memory-barriers for TX BD access
net: ll_temac: Make sure to free skb when it is completely used
MAINTAINERS: add Guvenc as SMC maintainer
bnxt_en: Call bnxt_ethtool_free() in bnxt_init_one() error path
bnxt_en: Fix TQM fastpath ring backing store computation
bnxt_en: Rediscover PHY capabilities after firmware reset
cxgb4: fix wrong shift.
mac80211: handle various extensible elements correctly
mac80211: reset profile_periodicity/ema_ap
cfg80211: avoid double free of PMSR request
cfg80211: make certificate generation more robust
mac80211: minstrel_ht: fix sample time check
net: qed: Fix memcpy() overflow of qed_dcbx_params()
net: cdc_eem: fix tx fixup skb leak
net: hamradio: fix memory leak in mkiss_close
...
When constructing ICMP response messages, the kernel will try to pick a
suitable source address for the outgoing packet. However, if no IPv4
addresses are configured on the system at all, this will fail and we end up
producing an ICMP message with a source address of 0.0.0.0. This can happen
on a box routing IPv4 traffic via v6 nexthops, for instance.
Since 0.0.0.0 is not generally routable on the internet, there's a good
chance that such ICMP messages will never make it back to the sender of the
original packet that the ICMP message was sent in response to. This, in
turn, can create connectivity and PMTUd problems for senders. Fortunately,
RFC7600 reserves a dummy address to be used as a source for ICMP
messages (192.0.0.8/32), so let's teach the kernel to substitute that
address as a last resort if the regular source address selection procedure
fails.
Below is a quick example reproducing this issue with network namespaces:
ip netns add ns0
ip l add type veth peer netns ns0
ip l set dev veth0 up
ip a add 10.0.0.1/24 dev veth0
ip a add fc00:dead:cafe:42::1/64 dev veth0
ip r add 10.1.0.0/24 via inet6 fc00:dead:cafe:42::2
ip -n ns0 l set dev veth0 up
ip -n ns0 a add fc00:dead:cafe:42::2/64 dev veth0
ip -n ns0 r add 10.0.0.0/24 via inet6 fc00:dead:cafe:42::1
ip netns exec ns0 sysctl -w net.ipv4.icmp_ratelimit=0
ip netns exec ns0 sysctl -w net.ipv4.ip_forward=1
tcpdump -tpni veth0 -c 2 icmp &
ping -w 1 10.1.0.1 > /dev/null
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on veth0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
IP 10.0.0.1 > 10.1.0.1: ICMP echo request, id 29, seq 1, length 64
IP 0.0.0.0 > 10.0.0.1: ICMP net 10.1.0.1 unreachable, length 92
2 packets captured
2 packets received by filter
0 packets dropped by kernel
With this patch the above capture changes to:
IP 10.0.0.1 > 10.1.0.1: ICMP echo request, id 31127, seq 1, length 64
IP 192.0.0.8 > 10.0.0.1: ICMP net 10.1.0.1 unreachable, length 92
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Reported-by: Juliusz Chroboczek <jch@irif.fr>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove recently added frequency invariance support from the CPPC
cpufreq driver, because it has turned out to be problematic and it
cannot be fixed properly on time for 5.13 (Viresh Kumar).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmDMzWkSHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRxCeYP/2ZvmDC+saX4jz6FCTvJ8J4jc7fGB3QB
k+3ek7tNvazYU470mBR68VFYGvqFkjFIP1yjJgNbh4u8fex1lPLFuWAmPurGic7o
xmuk2+uRG/aSOQPmf+tEVjcBSQn+g/SkvfC6ZIZvbODvCSWQ3VwKWORD+1BdajQJ
7VJCVOXSyqAbeDsijmuJ0j1Nzj6SZbmn7pQODXCcjuwG+jOBbrPGmjBlW24kAVRM
mF9k7xqbjcikzaW9SQZ1S38uzRDu5nK9ipcpdJtDUZ7N+xoa1qUjNLQY3H9gQDYQ
xuocdhpYbOi5ip8srYSJc16xGJPCR7oRxsFS4vuUZAUyuDg0IIvuS2bSLTM7NLBv
hCaibKQQr8fFW87Dnalv7Ld046g0bLAhan33V/8x9lSNOVDSAdFEHOtrXSSCgeEr
XiOlxqT1dCl5FU427fA2QpSnONOfl+JXuZ+zGzP99q1VVdK6xW1cIbIrCaEhHuZv
N9AODRSizrNurU1foBKgZRGj6EGjF/1EuHJ8GiTXQ31SWEdzUKY+J9VIwwPrMTJW
MAMGrgqBpajIry5hzpLaf5uFFfiK/7ijg2Nn3nSMCyd1mwG9hyh2EreomH6lEOxf
Iiu+OFn4cz+R6MDksLBMFCoxUPHBzNpN0tc9IiEQSU8R6xy3GpQhUu4l2JLnKRu+
dZ16oPbhZAnq
=tW38
-----END PGP SIGNATURE-----
Merge tag 'pm-5.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fix from Rafael Wysocki:
"Remove recently added frequency invariance support from the CPPC
cpufreq driver, because it has turned out to be problematic and it
cannot be fixed properly on time for 5.13 (Viresh Kumar)"
* tag 'pm-5.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
Revert "cpufreq: CPPC: Add support for frequency invariance"
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEq1nRK9aeMoq1VSgcnJ2qBz9kQNkFAmDLL74ACgkQnJ2qBz9k
QNleSAf/XikH+tsM6K9yDEeU93GGSqKUB71n9clSQBIiGZ7/UliG0wotrUjec9Rg
vBTZlh3JEdfboeBei+mG3hmOdAoYK4HMsJJikqRGPyWOTujh1eOZlT1LOXaY5zNM
631A9pWe8edlpr4Mq7Wb4nO4FToEZ91iXDLliFF371aV8kP/yuv5ZjHwIn5Pt5gI
DPnWwaJ+meW9KZ4gVKAfvZLVkKFat2xJ9r2LDpqbIkH9SBcfjBmeHOy0gFyCKx6l
yma5iANgtWLhesP6ZwSeaRb1+T9altSLCCFZrYdKH9PXTMFUqzrbiZ8tfVmllePZ
GaUOWcHYiLmvqvXnaAREiHnMFT6prg==
=kevs
-----END PGP SIGNATURE-----
Merge tag 'fixes_for_v5.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull quota and fanotify fixes from Jan Kara:
"A fixup finishing disabling of quotactl_path() syscall (I've missed
archs using different way to declare syscalls) and a fix of an fd leak
in error handling path of fanotify"
* tag 'fixes_for_v5.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
quota: finish disable quotactl_path syscall
fanotify: fix copy_event_to_user() fid error clean up
Running devlink reload command for port in switchdev mode cause
resources to corrupt: driver can't release allocated EQ and reclaim
memory pages, because "rdma" auxiliary device had add CQs which blocks
EQ from deletion.
Erroneous sequence happens during reload-down phase, and is following:
1. detach device - suspends auxiliary devices which support it, destroys
others. During this step "eth-rep" and "rdma-rep" are destroyed,
"eth" - suspended.
2. disable SRIOV - moves device to legacy mode; as part of disablement -
rescans drivers. This step adds "rdma" auxiliary device.
3. destroy EQ table - <failure>.
Driver shouldn't create any device during unload flows. To handle that
implement MLX5_PRIV_FLAGS_DETACH flag, set it on device detach and unset
on device attach. If flag is set do no-op on drivers rescan.
Fixes: a925b5e309 ("net/mlx5: Register mlx5 devices to auxiliary virtual bus")
Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
There is a race between THP unmapping and truncation, when truncate sees
pmd_none() and skips the entry, after munmap's zap_huge_pmd() cleared
it, but before its page_remove_rmap() gets to decrement
compound_mapcount: generating false "BUG: Bad page cache" reports that
the page is still mapped when deleted. This commit fixes that, but not
in the way I hoped.
The first attempt used try_to_unmap(page, TTU_SYNC|TTU_IGNORE_MLOCK)
instead of unmap_mapping_range() in truncate_cleanup_page(): it has
often been an annoyance that we usually call unmap_mapping_range() with
no pages locked, but there apply it to a single locked page.
try_to_unmap() looks more suitable for a single locked page.
However, try_to_unmap_one() contains a VM_BUG_ON_PAGE(!pvmw.pte,page):
it is used to insert THP migration entries, but not used to unmap THPs.
Copy zap_huge_pmd() and add THP handling now? Perhaps, but their TLB
needs are different, I'm too ignorant of the DAX cases, and couldn't
decide how far to go for anon+swap. Set that aside.
The second attempt took a different tack: make no change in truncate.c,
but modify zap_huge_pmd() to insert an invalidated huge pmd instead of
clearing it initially, then pmd_clear() between page_remove_rmap() and
unlocking at the end. Nice. But powerpc blows that approach out of the
water, with its serialize_against_pte_lookup(), and interesting pgtable
usage. It would need serious help to get working on powerpc (with a
minor optimization issue on s390 too). Set that aside.
Just add an "if (page_mapped(page)) synchronize_rcu();" or other such
delay, after unmapping in truncate_cleanup_page()? Perhaps, but though
that's likely to reduce or eliminate the number of incidents, it would
give less assurance of whether we had identified the problem correctly.
This successful iteration introduces "unmap_mapping_page(page)" instead
of try_to_unmap(), and goes the usual unmap_mapping_range_tree() route,
with an addition to details. Then zap_pmd_range() watches for this
case, and does spin_unlock(pmd_lock) if so - just like
page_vma_mapped_walk() now does in the PVMW_SYNC case. Not pretty, but
safe.
Note that unmap_mapping_page() is doing a VM_BUG_ON(!PageLocked) to
assert its interface; but currently that's only used to make sure that
page->mapping is stable, and zap_pmd_range() doesn't care if the page is
locked or not. Along these lines, in invalidate_inode_pages2_range()
move the initial unmap_mapping_range() out from under page lock, before
then calling unmap_mapping_page() under page lock if still mapped.
Link: https://lkml.kernel.org/r/a2a4a148-cdd8-942c-4ef8-51b77f643dbe@google.com
Fixes: fc127da085 ("truncate: handle file thp")
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Yang Shi <shy828301@gmail.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jue Wang <juew@google.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Wang Yugui <wangyugui@e16-tech.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Stressing huge tmpfs often crashed on unmap_page()'s VM_BUG_ON_PAGE
(!unmap_success): with dump_page() showing mapcount:1, but then its raw
struct page output showing _mapcount ffffffff i.e. mapcount 0.
And even if that particular VM_BUG_ON_PAGE(!unmap_success) is removed,
it is immediately followed by a VM_BUG_ON_PAGE(compound_mapcount(head)),
and further down an IS_ENABLED(CONFIG_DEBUG_VM) total_mapcount BUG():
all indicative of some mapcount difficulty in development here perhaps.
But the !CONFIG_DEBUG_VM path handles the failures correctly and
silently.
I believe the problem is that once a racing unmap has cleared pte or
pmd, try_to_unmap_one() may skip taking the page table lock, and emerge
from try_to_unmap() before the racing task has reached decrementing
mapcount.
Instead of abandoning the unsafe VM_BUG_ON_PAGE(), and the ones that
follow, use PVMW_SYNC in try_to_unmap_one() in this case: adding
TTU_SYNC to the options, and passing that from unmap_page().
When CONFIG_DEBUG_VM, or for non-debug too? Consensus is to do the same
for both: the slight overhead added should rarely matter, except perhaps
if splitting sparsely-populated multiply-mapped shmem. Once confident
that bugs are fixed, TTU_SYNC here can be removed, and the race
tolerated.
Link: https://lkml.kernel.org/r/c1e95853-8bcd-d8fd-55fa-e7f2488e78f@google.com
Fixes: fec89c109f ("thp: rewrite freeze_page()/unfreeze_page() with generic rmap walkers")
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jue Wang <juew@google.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Wang Yugui <wangyugui@e16-tech.com>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Most callers of is_huge_zero_pmd() supply a pmd already verified
present; but a few (notably zap_huge_pmd()) do not - it might be a pmd
migration entry, in which the pfn is encoded differently from a present
pmd: which might pass the is_huge_zero_pmd() test (though not on x86,
since L1TF forced us to protect against that); or perhaps even crash in
pmd_page() applied to a swap-like entry.
Make it safe by adding pmd_present() check into is_huge_zero_pmd()
itself; and make it quicker by saving huge_zero_pfn, so that
is_huge_zero_pmd() will not need to do that pmd_page() lookup each time.
__split_huge_pmd_locked() checked pmd_trans_huge() before: that worked,
but is unnecessary now that is_huge_zero_pmd() checks present.
Link: https://lkml.kernel.org/r/21ea9ca-a1f5-8b90-5e88-95fb1c49bbfa@google.com
Fixes: e71769ae52 ("mm: enable thp migration for shmem thp")
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Yang Shi <shy828301@gmail.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jue Wang <juew@google.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Wang Yugui <wangyugui@e16-tech.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The routine restore_reserve_on_error is called to restore reservation
information when an error occurs after page allocation. The routine
alloc_huge_page modifies the mapping reserve map and potentially the
reserve count during allocation. If code calling alloc_huge_page
encounters an error after allocation and needs to free the page, the
reservation information needs to be adjusted.
Currently, restore_reserve_on_error only takes action on pages for which
the reserve count was adjusted(HPageRestoreReserve flag). There is
nothing wrong with these adjustments. However, alloc_huge_page ALWAYS
modifies the reserve map during allocation even if the reserve count is
not adjusted. This can cause issues as observed during development of
this patch [1].
One specific series of operations causing an issue is:
- Create a shared hugetlb mapping
Reservations for all pages created by default
- Fault in a page in the mapping
Reservation exists so reservation count is decremented
- Punch a hole in the file/mapping at index previously faulted
Reservation and any associated pages will be removed
- Allocate a page to fill the hole
No reservation entry, so reserve count unmodified
Reservation entry added to map by alloc_huge_page
- Error after allocation and before instantiating the page
Reservation entry remains in map
- Allocate a page to fill the hole
Reservation entry exists, so decrement reservation count
This will cause a reservation count underflow as the reservation count
was decremented twice for the same index.
A user would observe a very large number for HugePages_Rsvd in
/proc/meminfo. This would also likely cause subsequent allocations of
hugetlb pages to fail as it would 'appear' that all pages are reserved.
This sequence of operations is unlikely to happen, however they were
easily reproduced and observed using hacked up code as described in [1].
Address the issue by having the routine restore_reserve_on_error take
action on pages where HPageRestoreReserve is not set. In this case, we
need to remove any reserve map entry created by alloc_huge_page. A new
helper routine vma_del_reservation assists with this operation.
There are three callers of alloc_huge_page which do not currently call
restore_reserve_on error before freeing a page on error paths. Add
those missing calls.
[1] https://lore.kernel.org/linux-mm/20210528005029.88088-1-almasrymina@google.com/
Link: https://lkml.kernel.org/r/20210607204510.22617-1-mike.kravetz@oracle.com
Fixes: 96b96a96dd ("mm/hugetlb: fix huge page reservation leak in private mapping error paths"
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Mina Almasry <almasrymina@google.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
I found it by pure code review, that pte_same_as_swp() of unuse_vma()
didn't take uffd-wp bit into account when comparing ptes.
pte_same_as_swp() returning false negative could cause failure to
swapoff swap ptes that was wr-protected by userfaultfd.
Link: https://lkml.kernel.org/r/20210603180546.9083-1-peterx@redhat.com
Fixes: f45ec5ff16 ("userfaultfd: wp: support swap and page migration")
Signed-off-by: Peter Xu <peterx@redhat.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: <stable@vger.kernel.org> [5.7+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When hugetlb page fault (under overcommitting situation) and
memory_failure() race, VM_BUG_ON_PAGE() is triggered by the following
race:
CPU0: CPU1:
gather_surplus_pages()
page = alloc_surplus_huge_page()
memory_failure_hugetlb()
get_hwpoison_page(page)
__get_hwpoison_page(page)
get_page_unless_zero(page)
zero = put_page_testzero(page)
VM_BUG_ON_PAGE(!zero, page)
enqueue_huge_page(h, page)
put_page(page)
__get_hwpoison_page() only checks the page refcount before taking an
additional one for memory error handling, which is not enough because
there's a time window where compound pages have non-zero refcount during
hugetlb page initialization.
So make __get_hwpoison_page() check page status a bit more for hugetlb
pages with get_hwpoison_huge_page(). Checking hugetlb-specific flags
under hugetlb_lock makes sure that the hugetlb page is not transitive.
It's notable that another new function, HWPoisonHandlable(), is helpful
to prevent a race against other transitive page states (like a generic
compound page just before PageHuge becomes true).
Link: https://lkml.kernel.org/r/20210603233632.2964832-2-nao.horiguchi@gmail.com
Fixes: ead07f6a86 ("mm/memory-failure: introduce get_hwpoison_page() for consistent refcount handling")
Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Reported-by: Muchun Song <songmuchun@bytedance.com>
Acked-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: <stable@vger.kernel.org> [5.12+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[WHY]
SCR for DP 2.0 spec says that multiple LTTPRs must not be accessed in a
single AUX transaction.
There may be other places in future where breaking up AUX accesses is
necessary.
[HOW]
Partition the entire DPCD address space into blocks. When an incoming AUX
request spans multiple blocks, break up the request into multiple requests.
Signed-off-by: Wesley Chalmers <Wesley.Chalmers@amd.com>
Reviewed-by: Jun Lei <Jun.Lei@amd.com>
Acked-by: Anson Jacob <Anson.Jacob@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The function get_net_ns_by_fd() could be inlined when NET_NS is not
enabled.
Signed-off-by: Changbin Du <changbin.du@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Scaled PPM conversion to PPB may (on 64bit systems) result
in a value larger than s32 can hold (freq/scaled_ppm is a long).
This means the kernel will not correctly reject unreasonably
high ->freq values (e.g. > 4294967295ppb, 281474976645 scaled PPM).
The conversion is equivalent to a division by ~66 (65.536),
so the value of ppb is always smaller than ppm, but not small
enough to assume narrowing the type from long -> s32 is okay.
Note that reasonable user space (e.g. ptp4l) will not use such
high values, anyway, 4289046510ppb ~= 4.3x, so the fix is
somewhat pedantic.
Fixes: d39a743511 ("ptp: validate the requested frequency adjustment.")
Fixes: d94ba80ebb ("ptp: Added a brand new class driver for ptp clocks.")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In commit 5b9fedb31e ("quota: Disable quotactl_path syscall") Jan Kara
disabled quotactl_path syscall on several architectures.
This commit disables it on all architectures using unified list of
system calls:
- arm64
- arc
- csky
- h8300
- hexagon
- nds32
- nios2
- openrisc
- riscv (32/64)
CC: Jan Kara <jack@suse.cz>
CC: Christian Brauner <christian.brauner@ubuntu.com>
CC: Sascha Hauer <s.hauer@pengutronix.de>
Link: https://lore.kernel.org/lkml/20210512153621.n5u43jsytbik4yze@wittgenstein
Link: https://lore.kernel.org/r/20210614153712.313707-1-marcin@juszkiewicz.com.pl
Fixes: 5b9fedb31e ("quota: Disable quotactl_path syscall")
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Marcin Juszkiewicz <marcin@juszkiewicz.com.pl>
Signed-off-by: Jan Kara <jack@suse.cz>
This reverts commit 4c38f2df71.
There are few races in the frequency invariance support for CPPC driver,
namely the driver doesn't stop the kthread_work and irq_work on policy
exit during suspend/resume or CPU hotplug.
A proper fix won't be possible for the 5.13-rc, as it requires a lot of
changes. Lets revert the patch instead for now.
Fixes: 4c38f2df71 ("cpufreq: CPPC: Add support for frequency invariance")
Reported-by: Qian Cai <quic_qiancai@quicinc.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
0day robot reported a 9.2% regression for will-it-scale mmap1 test
case[1], caused by commit 57efa1fe59 ("mm/gup: prevent gup_fast from
racing with COW during fork").
Further debug shows the regression is due to that commit changes the
offset of hot fields 'mmap_lock' inside structure 'mm_struct', thus some
cache alignment changes.
From the perf data, the contention for 'mmap_lock' is very severe and
takes around 95% cpu cycles, and it is a rw_semaphore
struct rw_semaphore {
atomic_long_t count; /* 8 bytes */
atomic_long_t owner; /* 8 bytes */
struct optimistic_spin_queue osq; /* spinner MCS lock */
...
Before commit 57efa1fe59 adds the 'write_protect_seq', it happens to
have a very optimal cache alignment layout, as Linus explained:
"and before the addition of the 'write_protect_seq' field, the
mmap_sem was at offset 120 in 'struct mm_struct'.
Which meant that count and owner were in two different cachelines,
and then when you have contention and spend time in
rwsem_down_write_slowpath(), this is probably *exactly* the kind
of layout you want.
Because first the rwsem_write_trylock() will do a cmpxchg on the
first cacheline (for the optimistic fast-path), and then in the
case of contention, rwsem_down_write_slowpath() will just access
the second cacheline.
Which is probably just optimal for a load that spends a lot of
time contended - new waiters touch that first cacheline, and then
they queue themselves up on the second cacheline."
After the commit, the rw_semaphore is at offset 128, which means the
'count' and 'owner' fields are now in the same cacheline, and causes
more cache bouncing.
Currently there are 3 "#ifdef CONFIG_XXX" before 'mmap_lock' which will
affect its offset:
CONFIG_MMU
CONFIG_MEMBARRIER
CONFIG_HAVE_ARCH_COMPAT_MMAP_BASES
The layout above is on 64 bits system with 0day's default kernel config
(similar to RHEL-8.3's config), in which all these 3 options are 'y'.
And the layout can vary with different kernel configs.
Relayouting a structure is usually a double-edged sword, as sometimes it
can helps one case, but hurt other cases. For this case, one solution
is, as the newly added 'write_protect_seq' is a 4 bytes long seqcount_t
(when CONFIG_DEBUG_LOCK_ALLOC=n), placing it into an existing 4 bytes
hole in 'mm_struct' will not change other fields' alignment, while
restoring the regression.
Link: https://lore.kernel.org/lkml/20210525031636.GB7744@xsang-OptiPlex-9020/ [1]
Reported-by: kernel test robot <oliver.sang@intel.com>
Signed-off-by: Feng Tang <feng.tang@intel.com>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Cc: Peter Xu <peterx@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There is a panic in socket ioctl cmd SIOCGSKNS when NET_NS is not enabled.
The reason is that nsfs tries to access ns->ops but the proc_ns_operations
is not implemented in this case.
[7.670023] Unable to handle kernel NULL pointer dereference at virtual address 00000010
[7.670268] pgd = 32b54000
[7.670544] [00000010] *pgd=00000000
[7.671861] Internal error: Oops: 5 [#1] SMP ARM
[7.672315] Modules linked in:
[7.672918] CPU: 0 PID: 1 Comm: systemd Not tainted 5.13.0-rc3-00375-g6799d4f2da49 #16
[7.673309] Hardware name: Generic DT based system
[7.673642] PC is at nsfs_evict+0x24/0x30
[7.674486] LR is at clear_inode+0x20/0x9c
The same to tun SIOCGSKNS command.
To fix this problem, we make get_net_ns() return -EINVAL when NET_NS is
disabled. Meanwhile move it to right place net/core/net_namespace.c.
Signed-off-by: Changbin Du <changbin.du@gmail.com>
Fixes: c62cce2cae ("net: add an ioctl to get a socket network namespace")
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: David Laight <David.Laight@ACULAB.COM>
Cc: Christian Brauner <christian.brauner@ubuntu.com>
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Here are a number of tiny USB fixes for 5.13-rc6.
There are more than I would normally like, but there's been a bunch of
people banging on the gadget and dwc3 and typec code recently for I
think an Android release, which has resulted in a number of small fixes.
It's nice to see companies send fixes upstream for this type of work, a
notable change from years ago.
Anyway, fixes in here are:
- usb-serial device id updates
- usb-serial cp210x driver fixes for broken firmware versions
- typec fixes for crazy charging devices and other reported
problems
- dwc3 fixes for reported problems found
- gadget fixes for reported problems
- tiny xhci fixes
- other small fixes for reported issues.
- revert of a problem fix found by linux-next testing
All of these have passed 0-day and linux-next testing with no reported
problems (the revert for the found linux-next build problem included).
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYMS6uA8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+yknTgCeKSmtJABTIcZ3LLRgArqUFEroUqEAni2ZRv/8
3mGzeXFKvvoEO/WoILmV
=WqGE
-----END PGP SIGNATURE-----
Merge tag 'usb-5.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here are a number of tiny USB fixes for 5.13-rc6.
There are more than I would normally like, but there's been a bunch of
people banging on the gadget and dwc3 and typec code recently for I
think an Android release, which has resulted in a number of small
fixes. It's nice to see companies send fixes upstream for this type of
work, a notable change from years ago.
Anyway, fixes in here are:
- usb-serial device id updates
- usb-serial cp210x driver fixes for broken firmware versions
- typec fixes for crazy charging devices and other reported problems
- dwc3 fixes for reported problems found
- gadget fixes for reported problems
- tiny xhci fixes
- other small fixes for reported issues.
- revert of a problem fix found by linux-next testing
All of these have passed 0-day and linux-next testing with no reported
problems (the revert for the found linux-next build problem included)"
* tag 'usb-5.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (44 commits)
Revert "usb: gadget: fsl: Re-enable driver for ARM SoCs"
usb: typec: mux: Fix copy-paste mistake in typec_mux_match
usb: typec: ucsi: Clear PPM capability data in ucsi_init() error path
usb: gadget: fsl: Re-enable driver for ARM SoCs
usb: typec: wcove: Use LE to CPU conversion when accessing msg->header
USB: serial: cp210x: fix CP2102N-A01 modem control
USB: serial: cp210x: fix alternate function for CP2102N QFN20
usb: misc: brcmstb-usb-pinmap: check return value after calling platform_get_resource()
usb: dwc3: ep0: fix NULL pointer exception
usb: gadget: eem: fix wrong eem header operation
usb: typec: intel_pmc_mux: Put ACPI device using acpi_dev_put()
usb: typec: intel_pmc_mux: Add missed error check for devm_ioremap_resource()
usb: typec: intel_pmc_mux: Put fwnode in error case during ->probe()
usb: typec: tcpm: Do not finish VDM AMS for retrying Responses
usb: fix various gadget panics on 10gbps cabling
usb: fix various gadgets null ptr deref on 10gbps cabling.
usb: pci-quirks: disable D3cold on xhci suspend for s2idle on AMD Renoir
usb: f_ncm: only first packet of aggregate needs to start timer
USB: f_ncm: ncm_bitrate (speed) is unsigned
MAINTAINERS: usb: add entry for isp1760
...
Here are some small misc driver fixes for 5.13-rc6 that fix some
reported problems:
- Tiny phy driver fixes for reported issues
- rtsx regression for when the device suspended
- mhi driver fix for a use-after-free
All of these have been in linux-next for a few days with no reported
issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYMTXVw8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ynzLgCgpTYm7sEdpzqTZtJXupSP05Kqm5gAn3PhELhK
ZPnUMX9OPV9DPnJ9camz
=ApcR
-----END PGP SIGNATURE-----
Merge tag 'char-misc-5.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver fixes from Greg KH:
"Here are some small misc driver fixes for 5.13-rc6 that fix some
reported problems:
- Tiny phy driver fixes for reported issues
- rtsx regression for when the device suspended
- mhi driver fix for a use-after-free
All of these have been in linux-next for a few days with no reported
issues"
* tag 'char-misc-5.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
misc: rtsx: separate aspm mode into MODE_REG and MODE_CFG
bus: mhi: pci-generic: Fix hibernation
bus: mhi: pci_generic: Fix possible use-after-free in mhi_pci_remove()
bus: mhi: pci_generic: T99W175: update channel name from AT to DUN
phy: Sparx5 Eth SerDes: check return value after calling platform_get_resource()
phy: ralink: phy-mt7621-pci: drop 'of_match_ptr' to fix -Wunused-const-variable
phy: ti: Fix an error code in wiz_probe()
phy: phy-mtk-tphy: Fix some resource leaks in mtk_phy_init()
phy: cadence: Sierra: Fix error return code in cdns_sierra_phy_probe()
phy: usb: Fix misuse of IS_ENABLED
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmDEwEEQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpu2uEACIZXc0e4Jz2tJmtlLzhm0T+YUXu88/n0Ki
3HsCfjyk0k2tvGjAmzLgBruR+0dxuoTlC8ZyLWkCgYFvRxCQMrjxB4+Q53WAAPud
ictv/5C992eWfmkk5lKWYh/SVUZU0nN/HlcITggFzH+/Ek4RgqBJK6rYPpN4YM6W
OifSZ22xwjZy9i8svzCPzGUbS5d5qbNeRSaacfADWFmzTqqzllWz/KkN633UFefR
tkqWy610P0O8fz3xe5HcECIOc3aNRZuk5zrNqCJPvxcOdYlqlL/HfsWMACEiC/g1
N3ahNGrUzJqhB1QNAIKATKAlh8hzAws9t/alLJQzSHZWRu7vso0qctoVJT3i6xRp
qD17EAQgrC0R0fQxdHmoMzRHEnKPCXQx36wb/mhZbG60/Q+scmSrFXp86XvbKZiI
uzHTsUL/80bRXHuVrKXT+JWTRCzpv1yk9ufIVzSOheVCl/H6bxZ29cabBL2/XvvI
d+OljDsy7oMH6rOBFi3XYmwZShEoUqeATeFoFf5isjkWfe7qdiMVu4apD8fBhIjX
8rNLjp0nIKN+5IjHwFkAXRwp8P1SJQ8c7Tl4I6xY82FsMQxUUgMhjSqrn58i2g9d
Lem9YHKaXIbw1yfWcaf8erA6d0S4rujG+j3miG0y248kOTb9FeMbfbRgjj8v99m1
XB7F9SIQUw==
=MbrN
-----END PGP SIGNATURE-----
Merge tag 'io_uring-5.13-2021-06-12' of git://git.kernel.dk/linux-block
Pull io_uring fixes from Jens Axboe:
"Just an API change for the registration changes that went into this
release. Better to get it sorted out now than before it's too late"
* tag 'io_uring-5.13-2021-06-12' of git://git.kernel.dk/linux-block:
io_uring: add feature flag for rsrc tags
io_uring: change registration/upd/rsrc tagging ABI
- Fix performance regression caused by lack of intended
batching of RCU callbacks by over-eager NOHZ-full code.
- Fix cgroups related corruption of load_avg and load_sum metrics.
- Three fixes to fix blocked load, util_sum/runnable_sum and
util_est tracking bugs.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmDErzERHG1pbmdvQGtl
cm5lbC5vcmcACgkQEnMQ0APhK1jxqA//UxByNNuRZy6h+ek9lDXu/l7wqpt9wq5B
7ywMAKhM80ytXGrxVLY1+JOIkRLuKvimS20JlAnrnp3g0k7Wh6SGtH6uiXEmObN7
o5g8Pzwz5TIEYArwjG6C/t8szVGo4n0L4UMmdEAMFj6hu68J7KuMJ4HC18LrNkFP
kSaDWeus5tnvMLMch+y5rpvztNTpuyQxqT2fjcelcoYvpAFuG4qcVDI2tWoZ8TnB
3qlTkM/hhEMHoOAPKXh94LzXjOkJ6DVORR/6VcfePzjqd50LThav6/e0bL2cMCC3
PrmBVj0XZLDHwbsfpp6yspNNDZBfCF0PmRVBnoHzjzY39cGoLY7NdgQMOzz9WDOD
mm8rcYLg0l6hWW2WxAy6Y3lzbkxLvDAVe+rdCCYsMLi0eiBERoAc5CCwhpyHwbUe
YtS/Hda+WFnBpBQLwO0awiJbkWuLFRKL8XRPNU0eWtqOaqZGydwRm2Y39BaXImEY
fGo3Q8Upk3ziEF986y2JzlHh3+pODnTNPz12l0fqW4iG7dQx1v8uQhtVzp9Egg/X
2m1IzvPIgw5zwjLTb1Oh4IZ/x91hzg1PXk5g2j8UeTTtXOVwryipRx/t5YjU0B1X
eGCrpquSyppUnjjajWRyco6DLG1YpwHVOTq2WdsbmxWVTOvRDZxlOUDk/wzWRINo
+c0DzRZe0aY=
=aqkp
-----END PGP SIGNATURE-----
Merge tag 'sched-urgent-2021-06-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Ingo Molnar:
"Misc fixes:
- Fix performance regression caused by lack of intended batching of
RCU callbacks by over-eager NOHZ-full code.
- Fix cgroups related corruption of load_avg and load_sum metrics.
- Three fixes to fix blocked load, util_sum/runnable_sum and util_est
tracking bugs"
* tag 'sched-urgent-2021-06-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/fair: Fix util_est UTIL_AVG_UNCHANGED handling
sched/pelt: Ensure that *_sum is always synced with *_avg
tick/nohz: Only check for RCU deferred wakeup on user/guest entry when needed
sched/fair: Make sure to update tg contrib for blocked load
sched/fair: Keep load_avg and load_sum synced
- remove redundant NULL checks by various people
- fix sparse checker warnings from Marc
- expose more GPU ID values to userspace from Christian
- add HWDB entry for GPU found on i.MX8MP from Sascha
- rework of the linear window calculation to better deal with
systems with large regions of reserved RAM
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Lucas Stach <l.stach@pengutronix.de>
Link: https://patchwork.freedesktop.org/patch/msgid/f27e1ec2c2fea310bfb6fe6c99174a54e9dfba83.camel@pengutronix.de
Add IORING_FEAT_RSRC_TAGS indicating that io_uring supports a bunch of
new IORING_REGISTER operations, in particular
IORING_REGISTER_[FILES[,UPDATE]2,BUFFERS[2,UPDATE]] that support rsrc
tagging, and also indicating implemented dynamic fixed buffer updates.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/9b995d4045b6c6b4ab7510ca124fd25ac2203af7.1623339162.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
There are ABI moments about recently added rsrc registration/update and
tagging that might become a nuisance in the future. First,
IORING_REGISTER_RSRC[_UPD] hide different types of resources under it,
so breaks fine control over them by restrictions. It works for now, but
once those are wanted under restrictions it would require a rework.
It was also inconvenient trying to fit a new resource not supporting
all the features (e.g. dynamic update) into the interface, so better
to return to IORING_REGISTER_* top level dispatching.
Second, register/update were considered to accept a type of resource,
however that's not a good idea because there might be several ways of
registration of a single resource type, e.g. we may want to add
non-contig buffers or anything more exquisite as dma mapped memory.
So, remove IORING_RSRC_[FILE,BUFFER] out of the ABI, and place them
internally for now to limit changes.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/9b554897a7c17ad6e3becc48dfed2f7af9f423d5.1623339162.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
UDP sendmsg() path can be lockless, it is possible for another
thread to re-connect an change sk->sk_txhash under us.
There is no serious impact, but we can use READ_ONCE()/WRITE_ONCE()
pair to document the race.
BUG: KCSAN: data-race in __ip4_datagram_connect / skb_set_owner_w
write to 0xffff88813397920c of 4 bytes by task 30997 on cpu 1:
sk_set_txhash include/net/sock.h:1937 [inline]
__ip4_datagram_connect+0x69e/0x710 net/ipv4/datagram.c:75
__ip6_datagram_connect+0x551/0x840 net/ipv6/datagram.c:189
ip6_datagram_connect+0x2a/0x40 net/ipv6/datagram.c:272
inet_dgram_connect+0xfd/0x180 net/ipv4/af_inet.c:580
__sys_connect_file net/socket.c:1837 [inline]
__sys_connect+0x245/0x280 net/socket.c:1854
__do_sys_connect net/socket.c:1864 [inline]
__se_sys_connect net/socket.c:1861 [inline]
__x64_sys_connect+0x3d/0x50 net/socket.c:1861
do_syscall_64+0x4a/0x90 arch/x86/entry/common.c:47
entry_SYSCALL_64_after_hwframe+0x44/0xae
read to 0xffff88813397920c of 4 bytes by task 31039 on cpu 0:
skb_set_hash_from_sk include/net/sock.h:2211 [inline]
skb_set_owner_w+0x118/0x220 net/core/sock.c:2101
sock_alloc_send_pskb+0x452/0x4e0 net/core/sock.c:2359
sock_alloc_send_skb+0x2d/0x40 net/core/sock.c:2373
__ip6_append_data+0x1743/0x21a0 net/ipv6/ip6_output.c:1621
ip6_make_skb+0x258/0x420 net/ipv6/ip6_output.c:1983
udpv6_sendmsg+0x160a/0x16b0 net/ipv6/udp.c:1527
inet6_sendmsg+0x5f/0x80 net/ipv6/af_inet6.c:642
sock_sendmsg_nosec net/socket.c:654 [inline]
sock_sendmsg net/socket.c:674 [inline]
____sys_sendmsg+0x360/0x4d0 net/socket.c:2350
___sys_sendmsg net/socket.c:2404 [inline]
__sys_sendmmsg+0x315/0x4b0 net/socket.c:2490
__do_sys_sendmmsg net/socket.c:2519 [inline]
__se_sys_sendmmsg net/socket.c:2516 [inline]
__x64_sys_sendmmsg+0x53/0x60 net/socket.c:2516
do_syscall_64+0x4a/0x90 arch/x86/entry/common.c:47
entry_SYSCALL_64_after_hwframe+0x44/0xae
value changed: 0xbca3c43d -> 0xfdb309e0
Reported by Kernel Concurrency Sanitizer on:
CPU: 0 PID: 31039 Comm: syz-executor.2 Not tainted 5.13.0-rc3-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A mixture of small bug fixes and a small security issue:
- WARN_ON when IPoIB is automatically moved between namespaces
- Long standing bug where mlx5 would use the wrong page for the doorbell
recovery memory if fork is used
- Security fix for mlx4 that disables the timestamp feature
- Several crashers for mlx5
- Plug a recent mlx5 memory leak for the sig_mr
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEfB7FMLh+8QxL+6i3OG33FX4gmxoFAmDCAxYACgkQOG33FX4g
mxqL4Q/9FOOS+Q0O2nOtkxzenqB931w46Q4kca1m6RZcdJI97P/tpF+SigQoUwV+
qiuJV4CThkidqWjxxfesX4uXyj6mc8yW4ux57c2JAMiS5iGIsKEPCavNvzcWRZKJ
rlMQg0yi7KeDwJ8XC2nw/Ajl1ujtxh569AkaqFVMMJer6jSa048TU14iulOOlcpZ
VGmF0/sCSY+PzyEOycr0LxGfUImCdD/spvF1RDbCNtQUcQwg41LUUkR+wvrqp8eR
KmuU7i+NLbcGyCZou16r6su9mMRYU5ZuFN5JMtjrmeqfdOi6deb7StyCgQFmRuac
Yw9Lgw91JUNphZp9v//sw6UDfyZaRMdsSW4796jiEPjnxZK7tzx+klhFLpO3WPkh
3VaZGY5nkcGcaRfqGD0PUHcHNjPr18rCXHz+JIovNLwIIJDmR4iUnZOs/JgOkvvd
bh4p4O/3xhXT57FoyBb/MhYgILAVHJ3Od6Dab3uJNx7ZaHAngtVHhzykm8PP4t/h
sHfd5W494jgec5RicJBQQfjZ4YUdSFMKjqLchKaSkdIsv/Wi+3idh+561ucmkMwI
JnIVZV/0739JUKeXhOJkxQkc1SKjr79e7+JUlrEgVFC0lJ8srzUD0f9a0L5txgt4
2MqQ9CSGljhiUpby0urFPb/vznQ3OQoZVwXOxj1TKtr0rrS3nuE=
=crsk
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull rdma fixes from Jason Gunthorpe:
"A mixture of small bug fixes and a small security issue:
- WARN_ON when IPoIB is automatically moved between namespaces
- Long standing bug where mlx5 would use the wrong page for the
doorbell recovery memory if fork is used
- Security fix for mlx4 that disables the timestamp feature
- Several crashers for mlx5
- Plug a recent mlx5 memory leak for the sig_mr"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
IB/mlx5: Fix initializing CQ fragments buffer
RDMA/mlx5: Delete right entry from MR signature database
RDMA: Verify port when creating flow rule
RDMA/mlx5: Block FDB rules when not in switchdev mode
RDMA/mlx4: Do not map the core_clock page to user space unless enabled
RDMA/mlx5: Use different doorbell memory for different processes
RDMA/ipoib: Fix warning caused by destroying non-initial netns
- Add continue in comment (from Wei Ming Chen)
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEPjU5OPd5QIZ9jqqOGXyLc2htIW0FAmDAxXcACgkQGXyLc2ht
IW2/PQ//aNoXZVRmuG+bFVkEdm/jbQycjEgqw/jbFzZ/ObA5ujunRej9zaeb2T2A
aQNXjXw/j0WWYUp+hxQkuQv0Fqaf5nHY7clht/KoOgeJf8dccu30dFZM/f7vvawB
wBJe53t+ItpUpp/RcRPcqSyM4RfVr+8E1TYI6wTYqB8p9J3pfkT8Sh8X3+cmXUf4
YXTezJ/rpMYu8DhvQI0CGYPPggvpwgbaP1O16aeAVpCka7OxnQQa3zbvSNqMOwzb
Hlz2NnJh1h235Q/HJYDpHH8cIzjnwATmODet9U8F5Nxx6PqCkj/6zJy2n1KEx6mh
nZv996oki3TVZvsFCoIBPqBWBD2xdd5gDXdSEssVxi0I0Z2WKUhjoFW5h3zKu2UG
V/akCpYEhoJltixlWY0jelAbpSzZKMblwQdHzbmUWEISOfkAZ9fvutvujBS00KPk
jxzggO5ZiwY5Kahkrj8wFlJ79wrnHsRkTWUbRg2LoXU6tu1b7o+k8dIPZjYxRR4G
S1uL98dsfctaoz8R33++Ntg7C1PKS1seN/AGrYE951hzI2tlIV22rqPQpgpwgWbW
ss7siPEmQ/yN7Fm9pNR2F77NBdOcgXJ+10CnvutbV2SGyPnnDvy6ImCcyR0e2z/h
zaJ2h3V1BYpE34aPhmyLQwjEdaXB0czMEzlfVHIErstjT/oFJCk=
=8kWi
-----END PGP SIGNATURE-----
Merge tag 'compiler-attributes-for-linus-v5.13-rc6' of git://github.com/ojeda/linux
Pull compiler attribute update from Miguel Ojeda:
"A trivial update to the compiler attributes: Add 'continue' keyword to
documentation in comment (from Wei Ming Chen)"
* tag 'compiler-attributes-for-linus-v5.13-rc6' of git://github.com/ojeda/linux:
Compiler Attributes: Add continue in comment
* fix more fallout from RTNL locking changes
* fixes for some of the bugs found by syzbot
* drop multicast fragments in mac80211 to align
with the spec and what drivers are doing now
* fix NULL-ptr deref in radiotap injection
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEH1e1rEeCd0AIMq6MB8qZga/fl8QFAmDAz24ACgkQB8qZga/f
l8S1LQ/8CVe2fweF6mps0gktCgAAiLhCCpcqCiGqPFe6cmDfSJp7bvCj9YNL7LaG
YvCwXlBAN7xpBwnGAXSvBpYC2Ru7KNfzTSqFfThbzPh4DLKxUfKsOK6Yel3yx6B3
gWDjT4zKpZ93k7DO1wdgO/MOvaOVbTe0F+wcLCvcZ3dpqHZuFqAK5FWlHUtlM2c3
Uc08O2WN2DrQR/Qnw0ErXK6pd8N87bnrTNd7vYf69Cmcp53GC4rQGRATQxEtm8LC
DdlAQ4ensIfrexlFG+oCSISufwlKYBNW9PY0L10qNUzB6DJyRDVz1UybWEPcTZEy
sS8nK4O98bGALWMi98Dqf/s/mQMrjs6THJIyJUQi+p2pHimDH43qwfcIAqoMcw0g
37aG67dEZDXkSYx+CPloBFPgELfDP726BFcVkRyUzdHEIZyGvIIEnEfr6LsIKXNS
pDRrDyJOaNoHjGq0VzYvZ+7ETo8rqJHDWkNjEQX13jfa2r3kDTUAvauXkNTmez5N
xTNN5XttlfNXvUgb+QWp35ZgfvwimLzVKGfPGBNl8vKaFc5tOGVnzaHU3WahOa1d
ttzGRuiNuvb0OWZqIlxG8U8FPtXXpSy/+oKdP4ZbFOLeZXRqpJ85dMSpUAIOwYT5
E0bdOpgbx5C5LFhK4GXUT/Mx6nLBr3c3Jj5flhrGx2wg9+z+PVU=
=evzy
-----END PGP SIGNATURE-----
Merge tag 'mac80211-for-net-2021-06-09' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
Johannes berg says:
====================
A fair number of fixes:
* fix more fallout from RTNL locking changes
* fixes for some of the bugs found by syzbot
* drop multicast fragments in mac80211 to align
with the spec and what drivers are doing now
* fix NULL-ptr deref in radiotap injection
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
without nested page tables.
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmDAVpQUHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroNkOgf9F97eFxAdod3/wbW9EbsUPR5bMTLE
+R6Hmvw+yCm/W2cycVGdCSh1BEKNuZN/XfHln2cYVfVr6ndog58A4Y0urFAhTROv
IHs8TCA5biQitoZ716l88ExOitnqJiSmMhGex969+zm1Lb9MQo1KA/zxERlqCi3s
Pfcxb6I8VbD9LEb6NaQdDgQoslJo1tzhe9gGYAYrpMOZujpj1RPeIOZIfeII0MP/
g14/JSar8cXc9QJ6zbiKn8HhpmzGJnaIsyFFL2RMIBlKvxsnpOU6VmisLTL9407o
P246Vq59BM8pdRCVUW9W9hLr2ho8lmi+ZYXASCm+qfn8cLaHyRCqSK56ZQ==
=nW43
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
"Bugfixes, including a TLB flush fix that affects processors without
nested page tables"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
kvm: fix previous commit for 32-bit builds
kvm: avoid speculation-based attacks from out-of-range memslot accesses
KVM: x86: Unload MMU on guest TLB flush if TDP disabled to force MMU sync
KVM: x86: Ensure liveliness of nested VM-Enter fail tracepoint message
selftests: kvm: Add support for customized slot0 memory size
KVM: selftests: introduce P47V64 for s390x
KVM: x86: Ensure PV TLB flush tracepoint reflects KVM behavior
KVM: X86: MMU: Use the correct inherited permissions to get shadow page
KVM: LAPIC: Write 0 to TMICT should also cancel vmx-preemption timer
KVM: SVM: Fix SEV SEND_START session length & SEND_UPDATE_DATA query length after commit 238eca821c
aspm (Active State Power Management)
rtsx_comm_set_aspm: this function is for driver to make sure
not enter power saving when processing of init and card_detcct
ASPM_MODE_CFG: 8411 5209 5227 5229 5249 5250
Change back to use original way to control aspm
ASPM_MODE_REG: 5227A 524A 5250A 5260 5261 5228
Keep the new way to control aspm
Fixes: 121e9c6b5c ("misc: rtsx: modify and fix init_hw function")
Reported-by: Chris Chiu <chris.chiu@canonical.com>
Tested-by: Gordon Lack <gordon.lack@dsl.pipex.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Ricky Wu <ricky_wu@realtek.com>
Link: https://lore.kernel.org/r/20210607101634.4948-1-ricky_wu@realtek.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
array_index_nospec does not work for uint64_t on 32-bit builds.
However, the size of a memory slot must be less than 20 bits wide
on those system, since the memory slot must fit in the user
address space. So just store it in an unsigned long.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
KVM's mechanism for accessing guest memory translates a guest physical
address (gpa) to a host virtual address using the right-shifted gpa
(also known as gfn) and a struct kvm_memory_slot. The translation is
performed in __gfn_to_hva_memslot using the following formula:
hva = slot->userspace_addr + (gfn - slot->base_gfn) * PAGE_SIZE
It is expected that gfn falls within the boundaries of the guest's
physical memory. However, a guest can access invalid physical addresses
in such a way that the gfn is invalid.
__gfn_to_hva_memslot is called from kvm_vcpu_gfn_to_hva_prot, which first
retrieves a memslot through __gfn_to_memslot. While __gfn_to_memslot
does check that the gfn falls within the boundaries of the guest's
physical memory or not, a CPU can speculate the result of the check and
continue execution speculatively using an illegal gfn. The speculation
can result in calculating an out-of-bounds hva. If the resulting host
virtual address is used to load another guest physical address, this
is effectively a Spectre gadget consisting of two consecutive reads,
the second of which is data dependent on the first.
Right now it's not clear if there are any cases in which this is
exploitable. One interesting case was reported by the original author
of this patch, and involves visiting guest page tables on x86. Right
now these are not vulnerable because the hva read goes through get_user(),
which contains an LFENCE speculation barrier. However, there are
patches in progress for x86 uaccess.h to mask kernel addresses instead of
using LFENCE; once these land, a guest could use speculation to read
from the VMM's ring 3 address space. Other architectures such as ARM
already use the address masking method, and would be susceptible to
this same kind of data-dependent access gadgets. Therefore, this patch
proactively protects from these attacks by masking out-of-bounds gfns
in __gfn_to_hva_memslot, which blocks speculation of invalid hvas.
Sean Christopherson noted that this patch does not cover
kvm_read_guest_offset_cached. This however is limited to a few bytes
past the end of the cache, and therefore it is unlikely to be useful in
the context of building a chain of data dependent accesses.
Reported-by: Artemiy Margaritov <artemiy.margaritov@gmail.com>
Co-developed-by: Artemiy Margaritov <artemiy.margaritov@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
- Avoid orphan section in ARM cpuidle (Arnd Bergmann)
- Avoid orphan section with !SMP (Nathan Chancellor)
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmC/pqgACgkQiXL039xt
wCbecA/+NbwT9NXdxtgQdCsSQxB2bv+LHHyhIhVcBHMtWgVO8Wc37s9bKRBH8Aiz
FGANuL7h5meXSZ6gDzfuoSO2SMGJeGGIXqb29jGCkwSb64H2Jy7wt9shYkwvSbnE
EAlznZkmg/GsCwrvSgKmWM5xYLIPTNfNj1ujlpp85f7x5LzZNQK0VD7DOFVUOp7x
3z+S+gcLK1HW9KlXUhmekqCl1ut0YSRrZ0NMLOROjdCy2EqID6nRkj8jvogNLqAq
30UMZe7QBZhd1cvRDvAkUxN/7jwhVXxdT6y1ybb9Rqom9krb1idRap6pS8VJZskB
fO9i0n5zpHKsNqLQMCWLyyWcTEFmZaQfrwoZ2kmxUhAkPus9Hw8GfJ2y1dFgKlq8
dGcn+EhyvdDwBErplcqugqFuqxQlkNNAhuqbEe9uSfc5JvIRjBsDUq4lovmFpH4H
RYc0nmDFb+Y0+wg5/KNKRNE4BUqWcYDKFg7dXhMAohpqycpyJkNkov0gV4Xkl2Tw
CBOQy9GjAKY6Wl1C1TLPsytc6ttbynSHaNHSrEcmv5pRQAY3WP8BT6wciJgQLZ5M
hHx20uLaw1cFanxzhkU5yS4uTplrFz6QlETr747YqEuiXGc/sK7qdmYusTUOaUj9
ziD8Lo00uEk+oQNIRuMKMdO/UrPNTlGp3C4B8B12lrOa3VM8fbg=
=ZvIL
-----END PGP SIGNATURE-----
Merge tag 'orphans-v5.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull orphan section fixes from Kees Cook:
"These two corner case fixes have been in -next for about a week:
- Avoid orphan section in ARM cpuidle (Arnd Bergmann)
- Avoid orphan section with !SMP (Nathan Chancellor)"
* tag 'orphans-v5.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
vmlinux.lds.h: Avoid orphan section with !SMP
ARM: cpuidle: Avoid orphan section warning
A collection of fixes for the regulator API that have come up since the
merge window, including a big batch of fixes from Axel Lin's usual
careful and detailed review. The one stand out fix here is Dmitry
Baryshkov's fix for an issue where we fail to power on the parents of
always on regulators during system startup if they weren't already
powered on.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmC/ao0ACgkQJNaLcl1U
h9AoqQf9GYtljFSON07MAOWwMgnjuMQ+rl0ZadqKKzq74QMMi4bxVKDWkftQ28/5
Ulk2M/mxRE6C1OEpOJl9ZnG9K0fWpOdnTURkYgW0FsJniEDiF7ZkdoFypwu93jOD
0r+3QCw/Ti9i08pOdlpFpUKU5rp/O9HYmouOTzBOCiM1SMb9TkkX5GBoDVw8+cWd
2PZqKQXEsaK1uNzeaXYw6UO8+IdSpVQRzSEILdtVWyHCNmXDDJWfI3vYeeqDbhYr
C+E3UdrO1ftNsOoJv33NqhscMnulkZDZ6H6lgbLX2FyFAe2M9N+AildGlkR5H1GD
xH3q3EvkOE2Y2X3zHteLdJ3MoI91cQ==
=poWJ
-----END PGP SIGNATURE-----
Merge tag 'regulator-fix-v5.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
Pull regulator fixes from Mark Brown:
"A collection of fixes for the regulator API that have come up since
the merge window, including a big batch of fixes from Axel Lin's usual
careful and detailed review.
The one stand out fix here is Dmitry Baryshkov's fix for an issue
where we fail to power on the parents of always on regulators during
system startup if they weren't already powered on"
* tag 'regulator-fix-v5.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: (21 commits)
regulator: rt4801: Fix NULL pointer dereference if priv->enable_gpios is NULL
regulator: hi6421v600: Fix .vsel_mask setting
regulator: bd718x7: Fix the BUCK7 voltage setting on BD71837
regulator: atc260x: Fix n_voltages and min_sel for pickable linear ranges
regulator: rtmv20: Fix to make regcache value first reading back from HW
regulator: mt6315: Fix function prototype for mt6315_map_mode
regulator: rtmv20: Add Richtek to Kconfig text
regulator: rtmv20: Fix .set_current_limit/.get_current_limit callbacks
regulator: hisilicon: use the correct HiSilicon copyright
regulator: bd71828: Fix .n_voltages settings
regulator: bd70528: Fix off-by-one for buck123 .n_voltages setting
regulator: max77620: Silence deferred probe error
regulator: max77620: Use device_set_of_node_from_dev()
regulator: scmi: Fix off-by-one for linear regulators .n_voltages setting
regulator: core: resolve supply for boot-on/always-on regulators
regulator: fixed: Ensure enable_counter is correct if reg_domain_disable fails
regulator: Check ramp_delay_table for regulator_set_ramp_delay_regmap
regulator: fan53880: Fix missing n_voltages setting
regulator: da9121: Return REGULATOR_MODE_INVALID for invalid mode
regulator: fan53555: fix TCS4525 voltage calulation
...
If the bo is idle when calling ttm_bo_pipeline_gutting(), we unnecessarily
create a ghost object and push it out to delayed destroy.
Fix this by adding a path for idle, and document the function.
Also avoid having the bo end up in a bad state vulnerable to user-space
triggered kernel BUGs if the call to ttm_tt_create() fails.
Finally reuse ttm_bo_pipeline_gutting() in ttm_bo_evict().
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://lore.kernel.org/r/20210602083818.241793-7-thomas.hellstrom@linux.intel.com
Reading out of write-combining mapped memory is typically very slow
since the CPU doesn't prefetch. However some archs have special
instructions to do this.
So add a best-effort memcpy_from_wc taking dma-buf-map pointer
arguments that attempts to use a fast prefetching memcpy and
otherwise falls back to ordinary memcopies, taking the iomem tagging
into account.
The code is largely copied from i915_memcpy_from_wc.
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Christian König <christian.koenig@amd.com>
Suggested-by: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Daniel Vetter <daniel@ffwll.ch>
Link: https://lore.kernel.org/r/20210602083818.241793-5-thomas.hellstrom@linux.intel.com
Link: https://patchwork.freedesktop.org/patch/msgid/20210602083818.241793-5-thomas.hellstrom@linux.intel.com
The internal ttm_bo_util memcpy uses ioremap functionality, and while it
probably might be possible to use it for copying in- and out of
sglist represented io memory, using io_mem_reserve() / io_mem_free()
callbacks, that would cause problems with fault().
Instead, implement a method mapping page-by-page using kmap_local()
semantics. As an additional benefit we then avoid the occasional global
TLB flushes of ioremap() and consuming ioremap space, elimination of a
critical point of failure and with a slight change of semantics we could
also push the memcpy out async for testing and async driver development
purposes.
A special linear iomem iterator is introduced internally to mimic the
old ioremap behaviour for code-paths that can't immediately be ported
over. This adds to the code size and should be considered a temporary
solution.
Looking at the code we have a lot of checks for iomap tagged pointers.
Ideally we should extend the core memremap functions to also accept
uncached memory and kmap_local functionality. Then we could strip a
lot of code.
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://lore.kernel.org/r/20210602083818.241793-4-thomas.hellstrom@linux.intel.com
A set of fixes that have been coming in over the last few weeks, the
usual mix of fixes:
- DT fixups for TI K3
- SATA drive detection fix for TI DRA7
- Power management fixes and a few build warning removals for OMAP
- OP-TEE fix to use standard API for UUID exporting
- DT fixes for a handful of i.MX boards
... plus a few other smaller items
-----BEGIN PGP SIGNATURE-----
iQJDBAABCgAtFiEElf+HevZ4QCAJmMQ+jBrnPN6EHHcFAmC9IRYPHG9sb2ZAbGl4
b20ubmV0AAoJEIwa5zzehBx3m/sQAI8JaXHnUBbXAAn0ugMLaTfXA1FKI+in+O43
xobyBwe6R4XNAr1GQvLYXF6vY3eaMHrcS/a1dU96uH7h3B8aq9ZR9DqrpJ5VNu37
q55t+brJkEI8+t5u08u6jrxfVARu4FbyZB4jIkbIwxnyEynx/8Bl7sLYImOhspkb
ipEkgPOSGcIDUcsI1eXIb6urQYkE0yssy20qNYFQ4neYS4Su4gy+LA2OTSqRHk/s
uQtfJZGN9LbxUFJ4mho349hebjM3rjXw2ox9Znk4bIhBbW38sppjstQ8lhi5YhbW
RlKDEIYcc/Fo/Hy1xoiAY2MZ7lMMUnXOEhMQmzQd1AZVNH0ysr8vuSEBy9mxF36r
Jx3eXWDZulDnhM/3eA8GJg3o5kcCZcymFHxo2X0bMeiMlXZ77QP8XV5Y5/C6LQf3
wJ4lncq0Q4wc9q7W2uGCq/JrMbJxOTjh6nbLlOIENHliRgwk48aoHY4RlOtKpws7
82S/vCAAj/08mpV2AzZtTtUxIrvgJcdKm9freXZOoUp53yvHbltLZXm2bBrfLdh1
qM0vT9+skUajxWXG0HpTNJSig3DeAMmfwC3As4kjibA8Jtr8y7249Z/xKPXJrZI+
5lvi4S2AT/QJbER0jUe6nIamS9RIFnDy0+J0BPvfJ6VpDMIM18rn6M4iev0am9v7
MDI6Mm2+
=TbGI
-----END PGP SIGNATURE-----
Merge tag 'arm-soc-fixes-v5.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull ARM SoC fixes from Olof Johansson:
"A set of fixes that have been coming in over the last few weeks, the
usual mix of fixes:
- DT fixups for TI K3
- SATA drive detection fix for TI DRA7
- Power management fixes and a few build warning removals for OMAP
- OP-TEE fix to use standard API for UUID exporting
- DT fixes for a handful of i.MX boards
And a few other smaller items"
* tag 'arm-soc-fixes-v5.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (29 commits)
arm64: meson: select COMMON_CLK
soc: amlogic: meson-clk-measure: remove redundant dev_err call in meson_msr_probe()
ARM: OMAP1: ams-delta: remove unused function ams_delta_camera_power
bus: ti-sysc: Fix flakey idling of uarts and stop using swsup_sidle_act
ARM: dts: imx: emcon-avari: Fix nxp,pca8574 #gpio-cells
ARM: dts: imx7d-pico: Fix the 'tuning-step' property
ARM: dts: imx7d-meerkat96: Fix the 'tuning-step' property
arm64: dts: freescale: sl28: var1: fix RGMII clock and voltage
arm64: dts: freescale: sl28: var4: fix RGMII clock and voltage
ARM: imx: pm-imx27: Include "common.h"
arm64: dts: zii-ultra: fix 12V_MAIN voltage
arm64: dts: zii-ultra: remove second GEN_3V3 regulator instance
arm64: dts: ls1028a: fix memory node
bus: ti-sysc: Fix am335x resume hang for usb otg module
ARM: OMAP2+: Fix build warning when mmc_omap is not built
ARM: OMAP1: isp1301-omap: Add missing gpiod_add_lookup_table function
ARM: OMAP1: Fix use of possibly uninitialized irq variable
optee: use export_uuid() to copy client UUID
arm64: dts: ti: k3*: Introduce reg definition for interrupt routers
arm64: dts: ti: k3-am65|j721e|am64: Map the dma / navigator subsystem via explicit ranges
...