1
0
Fork 0
mirror of synced 2025-03-06 20:59:54 +01:00
Commit graph

935158 commits

Author SHA1 Message Date
Andrey Ignatov
41c48f3a98 bpf: Support access to bpf map fields
There are multiple use-cases when it's convenient to have access to bpf
map fields, both `struct bpf_map` and map type specific struct-s such as
`struct bpf_array`, `struct bpf_htab`, etc.

For example while working with sock arrays it can be necessary to
calculate the key based on map->max_entries (some_hash % max_entries).
Currently this is solved by communicating max_entries via "out-of-band"
channel, e.g. via additional map with known key to get info about target
map. That works, but is not very convenient and error-prone while
working with many maps.

In other cases necessary data is dynamic (i.e. unknown at loading time)
and it's impossible to get it at all. For example while working with a
hash table it can be convenient to know how much capacity is already
used (bpf_htab.count.counter for BPF_F_NO_PREALLOC case).

At the same time kernel knows this info and can provide it to bpf
program.

Fill this gap by adding support to access bpf map fields from bpf
program for both `struct bpf_map` and map type specific fields.

Support is implemented via btf_struct_access() so that a user can define
their own `struct bpf_map` or map type specific struct in their program
with only necessary fields and preserve_access_index attribute, cast a
map to this struct and use a field.

For example:

	struct bpf_map {
		__u32 max_entries;
	} __attribute__((preserve_access_index));

	struct bpf_array {
		struct bpf_map map;
		__u32 elem_size;
	} __attribute__((preserve_access_index));

	struct {
		__uint(type, BPF_MAP_TYPE_ARRAY);
		__uint(max_entries, 4);
		__type(key, __u32);
		__type(value, __u32);
	} m_array SEC(".maps");

	SEC("cgroup_skb/egress")
	int cg_skb(void *ctx)
	{
		struct bpf_array *array = (struct bpf_array *)&m_array;
		struct bpf_map *map = (struct bpf_map *)&m_array;

		/* .. use map->max_entries or array->map.max_entries .. */
	}

Similarly to other btf_struct_access() use-cases (e.g. struct tcp_sock
in net/ipv4/bpf_tcp_ca.c) the patch allows access to any fields of
corresponding struct. Only reading from map fields is supported.

For btf_struct_access() to work there should be a way to know btf id of
a struct that corresponds to a map type. To get btf id there should be a
way to get a stringified name of map-specific struct, such as
"bpf_array", "bpf_htab", etc for a map type. Two new fields are added to
`struct bpf_map_ops` to handle it:
* .map_btf_name keeps a btf name of a struct returned by map_alloc();
* .map_btf_id is used to cache btf id of that struct.

To make btf ids calculation cheaper they're calculated once while
preparing btf_vmlinux and cached same way as it's done for btf_id field
of `struct bpf_func_proto`

While calculating btf ids, struct names are NOT checked for collision.
Collisions will be checked as a part of the work to prepare btf ids used
in verifier in compile time that should land soon. The only known
collision for `struct bpf_htab` (kernel/bpf/hashtab.c vs
net/core/sock_map.c) was fixed earlier.

Both new fields .map_btf_name and .map_btf_id must be set for a map type
for the feature to work. If neither is set for a map type, verifier will
return ENOTSUPP on a try to access map_ptr of corresponding type. If
just one of them set, it's verifier misconfiguration.

Only `struct bpf_array` for BPF_MAP_TYPE_ARRAY and `struct bpf_htab` for
BPF_MAP_TYPE_HASH are supported by this patch. Other map types will be
supported separately.

The feature is available only for CONFIG_DEBUG_INFO_BTF=y and gated by
perfmon_capable() so that unpriv programs won't have access to bpf map
fields.

Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/6479686a0cd1e9067993df57b4c3eef0e276fec9.1592600985.git.rdna@fb.com
2020-06-22 22:22:58 +02:00
Andrey Ignatov
032a6b3565 bpf: Rename bpf_htab to bpf_shtab in sock_map
There are two different `struct bpf_htab` in bpf code in the following
files:
- kernel/bpf/hashtab.c
- net/core/sock_map.c

It makes it impossible to find proper btf_id by name = "bpf_htab" and
kind = BTF_KIND_STRUCT what is needed to support access to map ptr so
that bpf program can access `struct bpf_htab` fields.

To make it possible one of the struct-s should be renamed, sock_map.c
looks like a better candidate for rename since it's specialized version
of hashtab.

Rename it to bpf_shtab ("sh" stands for Sock Hash).

Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/c006a639e03c64ca50fc87c4bb627e0bfba90f4e.1592600985.git.rdna@fb.com
2020-06-22 22:22:58 +02:00
Andrey Ignatov
a2d0d62f4d bpf: Switch btf_parse_vmlinux to btf_find_by_name_kind
btf_parse_vmlinux() implements manual search for struct bpf_ctx_convert
since at the time of implementing btf_find_by_name_kind() was not
available.

Later btf_find_by_name_kind() was introduced in 27ae7997a6 ("bpf:
Introduce BPF_PROG_TYPE_STRUCT_OPS"). It provides similar search
functionality and can be leveraged in btf_parse_vmlinux(). Do it.

Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/6e12d5c3e8a3d552925913ef73a695dd1bb27800.1592600985.git.rdna@fb.com
2020-06-22 22:22:58 +02:00
Felix Fietkau
b0c34bde72 MAINTAINERS: update email address for Felix Fietkau
The old address has been bouncing for a while now

Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 12:57:11 -07:00
Jordan Crouse
30480e6ed5 drm/msm: Fix up the rest of the messed up address sizes
msm_gem_address_space_create() changed to take a start/length instead
of a start/end for the iova space but all of the callers were just
cut and pasted from the old usage. Most of the mistakes have been fixed
up so just catch up the rest.

Fixes: ccac7ce373 ("drm/msm: Refactor address space initialization")
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>
2020-06-22 12:12:29 -07:00
Shay Drory
116a1b9f1c IB/mad: Fix use after free when destroying MAD agent
Currently, when RMPP MADs are processed while the MAD agent is destroyed,
it could result in use after free of rmpp_recv, as decribed below:

	cpu-0						cpu-1
	-----						-----
ib_mad_recv_done()
 ib_mad_complete_recv()
  ib_process_rmpp_recv_wc()
						unregister_mad_agent()
						 ib_cancel_rmpp_recvs()
						  cancel_delayed_work()
   process_rmpp_data()
    start_rmpp()
     queue_delayed_work(rmpp_recv->cleanup_work)
						  destroy_rmpp_recv()
						   free_rmpp_recv()
     cleanup_work()[1]
      spin_lock_irqsave(&rmpp_recv->agent->lock) <-- use after free

[1] cleanup_work() == recv_cleanup_handler

Fix it by waiting for the MAD agent reference count becoming zero before
calling to ib_cancel_rmpp_recvs().

Fixes: 9a41e38a46 ("IB/mad: Use IDR for agent IDs")
Link: https://lore.kernel.org/r/20200621104738.54850-2-leon@kernel.org
Signed-off-by: Shay Drory <shayd@mellanox.com>
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-06-22 14:57:44 -03:00
Leon Romanovsky
6eefa839c4 RDMA/mlx5: Protect from kernel crash if XRC_TGT doesn't have udata
Don't deref udata if it is NULL

  BUG: kernel NULL pointer dereference, address: 0000000000000030
  #PF: supervisor read access in kernel mode
  #PF: error_code(0x0000) - not-present page
  PGD 0 P4D 0
  Oops: 0000   SMP PTI
  CPU: 2 PID: 1592 Comm: python3 Not tainted 5.7.0-rc6+ #1
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
  RIP: 0010:create_qp+0x39e/0xae0 [mlx5_ib]
  Code: c0 0d 00 00 bf 10 01 00 00 e8 be a9 e4 e0 48 85 c0 49 89 c2 0f 84 0c 07 00 00 41 8b 85 74 63 01 00 0f c8 a9 00 00 00 10 74 0a <41> 8b 46 30 0f c8 41 89 42 14 41 8b 52 18 41 0f b6 4a 1c 0f ca 89
  RSP: 0018:ffffc9000067f8b0 EFLAGS: 00010206
  RAX: 0000000010170000 RBX: ffff888441313000 RCX: 0000000000000000
  RDX: 0000000000000200 RSI: 0000000000000000 RDI: ffff88845b1d4400
  RBP: ffffc9000067fa60 R08: 0000000000000200 R09: ffff88845b1d4200
  R10: ffff88845b1d4200 R11: ffff888441313000 R12: ffffc9000067f950
  R13: ffff88846ac00140 R14: 0000000000000000 R15: ffff88846c2bc000
  FS:  00007faa1a3c0540(0000) GS:ffff88846fd00000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000000000000030 CR3: 0000000446dca003 CR4: 0000000000760ea0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  PKRU: 55555554
  Call Trace:
   ? __switch_to_asm+0x40/0x70
   ? __switch_to_asm+0x34/0x70
   mlx5_ib_create_qp+0x897/0xfa0 [mlx5_ib]
   ib_create_qp+0x9e/0x300 [ib_core]
   create_qp+0x92d/0xb20 [ib_uverbs]
   ? ib_uverbs_cq_event_handler+0x30/0x30 [ib_uverbs]
   ? release_resource+0x30/0x30
   ib_uverbs_create_qp+0xc4/0xe0 [ib_uverbs]
   ib_uverbs_handler_UVERBS_METHOD_INVOKE_WRITE+0xc8/0xf0 [ib_uverbs]
   ib_uverbs_run_method+0x223/0x770 [ib_uverbs]
   ? track_pfn_remap+0xa7/0x100
   ? uverbs_disassociate_api+0xd0/0xd0 [ib_uverbs]
   ? remap_pfn_range+0x358/0x490
   ib_uverbs_cmd_verbs.isra.6+0x19b/0x370 [ib_uverbs]
   ? rdma_umap_priv_init+0x82/0xe0 [ib_core]
   ? vm_mmap_pgoff+0xec/0x120
   ib_uverbs_ioctl+0xc0/0x120 [ib_uverbs]
   ksys_ioctl+0x92/0xb0
   __x64_sys_ioctl+0x16/0x20
   do_syscall_64+0x48/0x130
   entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fixes: e383085c24 ("RDMA/mlx5: Set ECE options during QP create")
Link: https://lore.kernel.org/r/20200621115959.60126-1-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-06-22 14:40:53 -03:00
Vitaly Kuznetsov
312d16c7c0 KVM: x86/mmu: Avoid mixing gpa_t with gfn_t in walk_addr_generic()
translate_gpa() returns a GPA, assigning it to 'real_gfn' seems obviously
wrong. There is no real issue because both 'gpa_t' and 'gfn_t' are u64 and
we don't use the value in 'real_gfn' as a GFN, we do

 real_gfn = gpa_to_gfn(real_gfn);

instead. 'If you see a "buffalo" sign on an elephant's cage, do not trust
your eyes', but let's fix it for good.

No functional change intended.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20200622151435.752560-1-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-22 13:38:30 -04:00
Paolo Bonzini
44d5271707 KVM: LAPIC: ensure APIC map is up to date on concurrent update requests
The following race can cause lost map update events:

         cpu1                            cpu2

                                apic_map_dirty = true
  ------------------------------------------------------------
                                kvm_recalculate_apic_map:
                                     pass check
                                         mutex_lock(&kvm->arch.apic_map_lock);
                                         if (!kvm->arch.apic_map_dirty)
                                     and in process of updating map
  -------------------------------------------------------------
    other calls to
       apic_map_dirty = true         might be too late for affected cpu
  -------------------------------------------------------------
                                     apic_map_dirty = false
  -------------------------------------------------------------
    kvm_recalculate_apic_map:
    bail out on
      if (!kvm->arch.apic_map_dirty)

To fix it, record the beginning of an update of the APIC map in
apic_map_dirty.  If another APIC map change switches apic_map_dirty
back to DIRTY during the update, kvm_recalculate_apic_map should not
make it CLEAN, and the other caller will go through the slow path.

Reported-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-22 13:37:30 -04:00
Mark Zhang
c1d869d64a RDMA/counter: Query a counter before release
Query a dynamically-allocated counter before release it, to update it's
hwcounters and log all of them into history data. Otherwise all values of
these hwcounters will be lost.

Fixes: f34a55e497 ("RDMA/core: Get sum value of all counters when perform a sysfs stat read")
Link: https://lore.kernel.org/r/20200621110000.56059-1-leon@kernel.org
Signed-off-by: Mark Zhang <markz@mellanox.com>
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-06-22 14:36:56 -03:00
Sami Tolvanen
4bc799dcb6 security: fix the key_permission LSM hook function type
Commit 8c0637e950 ("keys: Make the KEY_NEED_* perms an enum rather than
a mask") changed the type of the key_permission callback functions, but
didn't change the type of the hook, which trips indirect call checking with
Control-Flow Integrity (CFI). This change fixes the issue by changing the
hook type to match the functions.

Fixes: 8c0637e950 ("keys: Make the KEY_NEED_* perms an enum rather than a mask")
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: James Morris <jmorris@namei.org>
2020-06-22 10:36:25 -07:00
Andy Shevchenko
5d8913504c gpio: pca953x: Fix GPIO resource leak on Intel Galileo Gen 2
When adding a quirk for IRQ on Intel Galileo Gen 2 the commit ba8c90c618
("gpio: pca953x: Override IRQ for one of the expanders on Galileo Gen 2")
missed GPIO resource release. We can safely do this in the same quirk, since
IRQ will be locked by GPIO framework when requested and unlocked on freeing.

Fixes: ba8c90c618 ("gpio: pca953x: Override IRQ for one of the expanders on Galileo Gen 2")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
2020-06-22 18:51:53 +02:00
Linus Torvalds
dd0d718152 spi: Fixes for v5.8
Quite a lot of fixes here for no single reason.  There's a collection of
 the usual sort of device specific fixes and also a bunch of people have
 been working on spidev and the userspace test program spidev_test so
 they've got an unusually large collection of small fixes.
 -----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCgAxFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAl7wl/8THGJyb29uaWVA
 a2VybmVsLm9yZwAKCRAk1otyXVSH0NJBB/4oTOxPIfkKQyPcSTQB8FC6AHJ5Z8FT
 /4dtK4U32rn5Sms/CHNXMMFxomBgB46Ud19KJ/bEgqugJfI9Vl0bl9fZT+sDyVU3
 rFPthVluIprFe1bITyfcgYmjRoznLRdzj9LMFsAWIe3Vmy584TdvXbl8n5COu5Wh
 6sR6SuQnEn65x1HC4++3IsG70+ogz81sFRapk/7jXwS+YdwzWIqtN2s2IlUcTmcy
 ylijl04XUSyo6e/v/sjZaaRmIz72mW+HbMKW4iC0mGZ6yuACoy7zop9BuwAg6S7z
 dFDyqTZ5gq72ZeJc7fuco1y7jqQoKUC1oTRghPlbq2XG3Huta80l415l
 =32lr
 -----END PGP SIGNATURE-----

Merge tag 'spi-fix-v5.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi

Pull spi fixes from Mark Brown:
 "Quite a lot of fixes here for no single reason.

  There's a collection of the usual sort of device specific fixes and
  also a bunch of people have been working on spidev and the userspace
  test program spidev_test so they've got an unusually large collection
  of small fixes"

* tag 'spi-fix-v5.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
  spi: spidev: fix a potential use-after-free in spidev_release()
  spi: spidev: fix a race between spidev_release and spidev_remove
  spi: stm32-qspi: Fix error path in case of -EPROBE_DEFER
  spi: uapi: spidev: Use TABs for alignment
  spi: spi-fsl-dspi: Free DMA memory with matching function
  spi: tools: Add macro definitions to fix build errors
  spi: tools: Make default_tx/rx and input_tx static
  spi: dt-bindings: amlogic, meson-gx-spicc: Fix schema for meson-g12a
  spi: rspi: Use requested instead of maximum bit rate
  spi: spidev_test: Use %u to format unsigned numbers
  spi: sprd: switch the sequence of setting WDG_LOAD_LOW and _HIGH
2020-06-22 09:49:59 -07:00
Igor Mammedov
af28dfacbe kvm: lapic: fix broken vcpu hotplug
Guest fails to online hotplugged CPU with error
  smpboot: do_boot_cpu failed(-1) to wakeup CPU#4

It's caused by the fact that kvm_apic_set_state(), which used to call
recalculate_apic_map() unconditionally and pulled hotplugged CPU into
apic map, is updating map conditionally on state changes.  In this case
the APIC map is not considered dirty and the is not updated.

Fix the issue by forcing unconditional update from kvm_apic_set_state(),
like it used to be.

Fixes: 4abaffce4d ("KVM: LAPIC: Recalculate apic map in batch")
Cc: stable@vger.kernel.org
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20200622160830.426022-1-imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-22 12:48:44 -04:00
Linus Torvalds
751645789f regulator: Fixes for v5.8
This has a fix for the refactoring out of the pickable ranges
 functionality, plus the removal of a BROKEN dependency on mt6358 now
 that the dependencies were merged in -rc1 and a couple of device
 specific fixes.
 -----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCgAxFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAl7wlugTHGJyb29uaWVA
 a2VybmVsLm9yZwAKCRAk1otyXVSH0FIuB/4z/Aso2z4DA+NELlz8BSqVbaBxPrJI
 nFdTBCz3Wac2rxNlKVI4xFNIO5mKUOKEqSZ5cPopJUPHTy0G3ML8in8GH6/we59u
 VkzvzsxJUMl0Cxrnp/lhJWoYR6wG10743Xc/VrJEfeSqCCPodveU5ME4ciE6EJG0
 PDrWEXvxJNxPqn0e94lS5eeBfFbHWha5cJPx6Vt8cjV+qJO1q5dlmyzi+u6Ea0lJ
 jko/Th3gqK2hgo5RKMwfp15FWliygAQYAeLv5MMAHtnD+NxzUJGbbk0edvQnX9lu
 tmFnDOoeAeWQqMJN8x3xMChrBX6sPEDhroHtLVt5AH19/huwin+wtuOM
 =tBiX
 -----END PGP SIGNATURE-----

Merge tag 'regulator-fix-v5.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator

Pull regulator fixes from Mark Brown:
 "This has a fix for the refactoring out of the pickable ranges
  functionality, plus the removal of a BROKEN dependency on mt6358 now
  that the dependencies were merged in -rc1 and a couple of device
  specific fixes"

* tag 'regulator-fix-v5.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
  regulator: mt6358: Remove BROKEN dependency
  regualtor: pfuze100: correct sw1a/sw2 on pfuze3000
  regulator: Fix pickable ranges mapping
  regulator: da9063: fix LDO9 suspend and warning.
2020-06-22 09:47:59 -07:00
Linus Torvalds
2a00087068 regmap: Fixes for v5.8
A few small fixes, none of which are likely to have any substantial
 impact here - the most substantial one is a fix for a long standing
 memory leak on devices that use register patching which will only have
 an impact if the device is removed and re-added.
 -----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCgAxFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAl7wllsTHGJyb29uaWVA
 a2VybmVsLm9yZwAKCRAk1otyXVSH0L5pB/sEQIAR3U1OY7Wh/fln5BvAVkjazr60
 PGclUzIWSOuNdMQBClibQhIJkTwLA3BjW1o0db3qB7ulZArGT5KqNdY0z0BkUkPz
 6/l297XHerHkNmQc16F9NqT6RX8Xori/mke0DuHkurxq+A1iS970Q3a9todz0ZWB
 UuMVfE6a3i13QxhahQmp86NNUH3BbWgyDRUuJELdHGyM7Ydit7Tt1kYLua1TTdBQ
 t7R00ukv0XSLG4vCIfSjFajVlzQZIM1YEzefY9TUAWaOc/2znj0syUGn1FehdV5J
 a01BWk91jNecHq6K8sIKV7N4z4OUKQqkwlyj+KkB0iMf11xezkJnWCZC
 =yvq/
 -----END PGP SIGNATURE-----

Merge tag 'regmap-fix-v5.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap

Pull regmap fixes from Mark Brown:
 "A few small fixes, none of which are likely to have any substantial
  impact here - the most substantial one is a fix for a long standing
  memory leak on devices that use register patching which will only have
  an impact if the device is removed and re-added"

* tag 'regmap-fix-v5.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
  regmap: Fix memory leak from regmap_register_patch
  regmap: fix the kerneldoc for regmap_test_bits()
  regmap: fix alignment issue
2020-06-22 09:46:43 -07:00
Eugenio Pérez
cb91909e48 tools/virtio: Use tools/include/list.h instead of stubs
It should not make any significant difference but reduce stub code.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Link: https://lore.kernel.org/r/20200418102217.32327-9-eperezma@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-06-22 12:34:22 -04:00
Eugenio Pérez
1d8bf5c3a3 tools/virtio: Reset index in virtio_test --reset.
This way behavior for vhost is more like a VM.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Link: https://lore.kernel.org/r/20200418102217.32327-8-eperezma@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-06-22 12:34:22 -04:00
Eugenio Pérez
6741239260 tools/virtio: Extract virtqueue initialization in vq_reset
So we can reset after that in the main loop.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Link: https://lore.kernel.org/r/20200418102217.32327-7-eperezma@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-06-22 12:34:22 -04:00
Eugenio Pérez
4cfb939353 tools/virtio: Use __vring_new_virtqueue in virtio_test.c
As updated in ("2a2d1382fe9d virtio: Add improved queue allocation API")

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Link: https://lore.kernel.org/r/20200418102217.32327-6-eperezma@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-06-22 12:34:22 -04:00
Eugenio Pérez
264ee5aa81 tools/virtio: Add --reset
Currently, it only removes and add backend, but it will reset vq
position in future commits.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Link: https://lore.kernel.org/r/20200418102217.32327-5-eperezma@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-06-22 12:34:21 -04:00
Eugenio Pérez
7add78b2a6 tools/virtio: Add --batch=random option
So we can test with non-deterministic batches in flight.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Link: https://lore.kernel.org/r/20200418102217.32327-4-eperezma@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-06-22 12:34:21 -04:00
Eugenio Pérez
633fae33d5 tools/virtio: Add --batch option
This allow to test vhost having >1 buffers in flight

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Link: https://lore.kernel.org/r/20200401183118.8334-5-eperezma@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://lore.kernel.org/r/20200418102217.32327-3-eperezma@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-06-22 12:34:21 -04:00
David Hildenbrand
b3562c6087 virtio-mem: add memory via add_memory_driver_managed()
Virtio-mem managed memory is always detected and added by the virtio-mem
driver, never using something like the firmware-provided memory map.
This is the case after an ordinary system reboot, and has to be guaranteed
after kexec. Especially, virtio-mem added memory resources can contain
inaccessible parts ("unblocked memory blocks"), blindly forwarding them
to a kexec kernel is dangerous, as unplugged memory will get accessed
(esp. written).

Let's use the new way of adding special driver-managed memory introduced
in commit 7b7b27214b ("mm/memory_hotplug: introduce
add_memory_driver_managed()").

This will result in no entries in /sys/firmware/memmap ("raw firmware-
provided memory map"), the memory resource will be flagged
IORESOURCE_MEM_DRIVER_MANAGED (esp., kexec_file_load() will not place
kexec images on this memory), and it is exposed as "System RAM
(virtio_mem)" in /proc/iomem, so esp. kexec-tools can properly handle it.

Example /proc/iomem before this change:
  [...]
  140000000-333ffffff : virtio0
    140000000-147ffffff : System RAM
  334000000-533ffffff : virtio1
    338000000-33fffffff : System RAM
    340000000-347ffffff : System RAM
    348000000-34fffffff : System RAM
  [...]

Example /proc/iomem after this change:
  [...]
  140000000-333ffffff : virtio0
    140000000-147ffffff : System RAM (virtio_mem)
  334000000-533ffffff : virtio1
    338000000-33fffffff : System RAM (virtio_mem)
    340000000-347ffffff : System RAM (virtio_mem)
    348000000-34fffffff : System RAM (virtio_mem)
  [...]

Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: teawater <teawaterz@linux.alibaba.com>
Fixes: 5f1f79bbc9 ("virtio-mem: Paravirtualized memory hotplug")
Signed-off-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20200611093518.5737-1-david@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
2020-06-22 12:34:21 -04:00
Dan Carpenter
1c3d69ab53 virtio-mem: silence a static checker warning
Smatch complains that "rc" can be uninitialized if we hit the "break;"
statement on the first iteration through the loop.  I suspect that this
can't happen in real life, but returning a zero literal is cleaner and
silence the static checker warning.

Fixes: 5f1f79bbc9 ("virtio-mem: Paravirtualized memory hotplug")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Link: https://lore.kernel.org/r/20200610085911.GC5439@mwanda
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-06-22 12:34:21 -04:00
Dan Carpenter
c09cc2c319 vhost_vdpa: Fix potential underflow in vhost_vdpa_mmap()
The "vma->vm_pgoff" variable is an unsigned long so if it's larger than
INT_MAX then "index" can be negative leading to an underflow.  Fix this
by changing the type of "index" to "unsigned long".

Fixes: ddd89d0a05 ("vhost_vdpa: support doorbell mapping via mmap")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Link: https://lore.kernel.org/r/20200610085852.GB5439@mwanda
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-06-22 12:34:21 -04:00
Jason Wang
24eae8ebfb vdpa: fix typos in the comments for __vdpa_alloc_device()
Fix two typos in the comments for __vdpa_alloc_device().

Signed-off-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20200527060528.9100-1-jasowang@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-06-22 12:34:21 -04:00
Andreas Gerstmayr
c42ad5d435 perf flamegraph: Explicitly set utf-8 encoding
On some platforms the default encoding is not utf-8, which causes an
UnicodeDecodeError when reading the flamegraph template and writing the
flamegraph

Signed-off-by: Andreas Gerstmayr <agerstmayr@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20200619153232.203537-1-agerstmayr@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-06-22 13:30:55 -03:00
Nathan Chancellor
e6d701dca9 ACPI: sysfs: Fix pm_profile_attr type
When running a kernel with Clang's Control Flow Integrity implemented,
there is a violation that happens when accessing
/sys/firmware/acpi/pm_profile:

$ cat /sys/firmware/acpi/pm_profile
0

$ dmesg
...
[   17.352564] ------------[ cut here ]------------
[   17.352568] CFI failure (target: acpi_show_profile+0x0/0x8):
[   17.352572] WARNING: CPU: 3 PID: 497 at kernel/cfi.c:29 __cfi_check_fail+0x33/0x40
[   17.352573] Modules linked in:
[   17.352575] CPU: 3 PID: 497 Comm: cat Tainted: G        W         5.7.0-microsoft-standard+ #1
[   17.352576] RIP: 0010:__cfi_check_fail+0x33/0x40
[   17.352577] Code: 48 c7 c7 50 b3 85 84 48 c7 c6 50 0a 4e 84 e8 a4 d8 60 00 85 c0 75 02 5b c3 48 c7 c7 dc 5e 49 84 48 89 de 31 c0 e8 7d 06 eb ff <0f> 0b 5b c3 00 00 cc cc 00 00 cc cc 00 85 f6 74 25 41 b9 ea ff ff
[   17.352577] RSP: 0018:ffffaa6dc3c53d30 EFLAGS: 00010246
[   17.352578] RAX: 331267e0c06cee00 RBX: ffffffff83d85890 RCX: ffffffff8483a6f8
[   17.352579] RDX: ffff9cceabbb37c0 RSI: 0000000000000082 RDI: ffffffff84bb9e1c
[   17.352579] RBP: ffffffff845b2bc8 R08: 0000000000000001 R09: ffff9cceabbba200
[   17.352579] R10: 000000000000019d R11: 0000000000000000 R12: ffff9cc947766f00
[   17.352580] R13: ffffffff83d6bd50 R14: ffff9ccc6fa80000 R15: ffffffff845bd328
[   17.352582] FS:  00007fdbc8d13580(0000) GS:ffff9cce91ac0000(0000) knlGS:0000000000000000
[   17.352582] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   17.352583] CR2: 00007fdbc858e000 CR3: 00000005174d0000 CR4: 0000000000340ea0
[   17.352584] Call Trace:
[   17.352586]  ? rev_id_show+0x8/0x8
[   17.352587]  ? __cfi_check+0x45bac/0x4b640
[   17.352589]  ? kobj_attr_show+0x73/0x80
[   17.352590]  ? sysfs_kf_seq_show+0xc1/0x140
[   17.352592]  ? ext4_seq_options_show.cfi_jt+0x8/0x8
[   17.352593]  ? seq_read+0x180/0x600
[   17.352595]  ? sysfs_create_file_ns.cfi_jt+0x10/0x10
[   17.352596]  ? tlbflush_read_file+0x8/0x8
[   17.352597]  ? __vfs_read+0x6b/0x220
[   17.352598]  ? handle_mm_fault+0xa23/0x11b0
[   17.352599]  ? vfs_read+0xa2/0x130
[   17.352599]  ? ksys_read+0x6a/0xd0
[   17.352601]  ? __do_sys_getpgrp+0x8/0x8
[   17.352602]  ? do_syscall_64+0x72/0x120
[   17.352603]  ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
[   17.352604] ---[ end trace 7b1fa81dc897e419 ]---

When /sys/firmware/acpi/pm_profile is read, sysfs_kf_seq_show is called,
which in turn calls kobj_attr_show, which gets the ->show callback
member by calling container_of on attr (casting it to struct
kobj_attribute) then calls it.

There is a CFI violation because pm_profile_attr is of type
struct device_attribute but kobj_attr_show calls ->show expecting it
to be from struct kobj_attribute. CFI checking ensures that function
pointer types match when doing indirect calls. Fix pm_profile_attr to
be defined in terms of kobj_attribute so there is no violation or
mismatch.

Fixes: 362b646062 ("ACPI: Export FADT pm_profile integer value to userspace")
Link: https://github.com/ClangBuiltLinux/linux/issues/1051
Reported-by: yuu ichii <byahu140@heisei.be>
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Cc: 3.10+ <stable@vger.kernel.org> # 3.10+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-06-22 17:13:38 +02:00
Jason A. Donenfeld
75b0cea7bf ACPI: configfs: Disallow loading ACPI tables when locked down
Like other vectors already patched, this one here allows the root
user to load ACPI tables, which enables arbitrary physical address
writes, which in turn makes it possible to disable lockdown.

Prevents this by checking the lockdown status before allowing a new
ACPI table to be installed. The link in the trailer shows a PoC of
how this might be used.

Link: https://git.zx2c4.com/american-unsigned-language/tree/american-unsigned-language-2.sh
Cc: 5.4+ <stable@vger.kernel.org> # 5.4+
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-06-22 16:41:27 +02:00
Andrew Jones
a25e91028a KVM: arm64: pvtime: Ensure task delay accounting is enabled
Ensure we're actually accounting run_delay before we claim that we'll
expose it to the guest. If we're not, then we just pretend like steal
time isn't supported in order to avoid any confusion.

Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200622142710.18677-1-drjones@redhat.com
2020-06-22 15:35:57 +01:00
Steven Price
66b7e05dc0 KVM: arm64: Fix kvm_reset_vcpu() return code being incorrect with SVE
If SVE is enabled then 'ret' can be assigned the return value of
kvm_vcpu_enable_sve() which may be 0 causing future "goto out" sites to
erroneously return 0 on failure rather than -EINVAL as expected.

Remove the initialisation of 'ret' and make setting the return value
explicit to avoid this situation in the future.

Fixes: 9a3cdf26e3 ("KVM: arm64/sve: Allow userspace to enable SVE for vcpus")
Cc: stable@vger.kernel.org
Reported-by: James Morse <james.morse@arm.com>
Signed-off-by: Steven Price <steven.price@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200617105456.28245-1-steven.price@arm.com
2020-06-22 14:39:57 +01:00
Alexandru Elisei
7733306bd5 KVM: arm64: Annotate hyp NMI-related functions as __always_inline
The "inline" keyword is a hint for the compiler to inline a function.  The
functions system_uses_irq_prio_masking() and gic_write_pmr() are used by
the code running at EL2 on a non-VHE system, so mark them as
__always_inline to make sure they'll always be part of the .hyp.text
section.

This fixes the following splat when trying to run a VM:

[   47.625273] Kernel panic - not syncing: HYP panic:
[   47.625273] PS:a00003c9 PC:0000ca0b42049fc4 ESR:86000006
[   47.625273] FAR:0000ca0b42049fc4 HPFAR:0000000010001000 PAR:0000000000000000
[   47.625273] VCPU:0000000000000000
[   47.647261] CPU: 1 PID: 217 Comm: kvm-vcpu-0 Not tainted 5.8.0-rc1-ARCH+ #61
[   47.654508] Hardware name: Globalscale Marvell ESPRESSOBin Board (DT)
[   47.661139] Call trace:
[   47.663659]  dump_backtrace+0x0/0x1cc
[   47.667413]  show_stack+0x18/0x24
[   47.670822]  dump_stack+0xb8/0x108
[   47.674312]  panic+0x124/0x2f4
[   47.677446]  panic+0x0/0x2f4
[   47.680407] SMP: stopping secondary CPUs
[   47.684439] Kernel Offset: disabled
[   47.688018] CPU features: 0x240402,20002008
[   47.692318] Memory Limit: none
[   47.695465] ---[ end Kernel panic - not syncing: HYP panic:
[   47.695465] PS:a00003c9 PC:0000ca0b42049fc4 ESR:86000006
[   47.695465] FAR:0000ca0b42049fc4 HPFAR:0000000010001000 PAR:0000000000000000
[   47.695465] VCPU:0000000000000000 ]---

The instruction abort was caused by the code running at EL2 trying to fetch
an instruction which wasn't mapped in the EL2 translation tables. Using
objdump showed the two functions as separate symbols in the .text section.

Fixes: 85738e05dc ("arm64: kvm: Unmask PMR before entering guest")
Cc: stable@vger.kernel.org
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: James Morse <james.morse@arm.com>
Link: https://lore.kernel.org/r/20200618171254.1596055-1-alexandru.elisei@arm.com
2020-06-22 14:39:45 +01:00
Chuck Lever
7b2182ec38 xprtrdma: Fix handling of RDMA_ERROR replies
The RPC client currently doesn't handle ERR_CHUNK replies correctly.
rpcrdma_complete_rqst() incorrectly passes a negative number to
xprt_complete_rqst() as the number of bytes copied. Instead, set
task->tk_status to the error value, and return zero bytes copied.

In these cases, return -EIO rather than -EREMOTEIO. The RPC client's
finite state machine doesn't know what to do with -EREMOTEIO.

Additional clean ups:
- Don't double-count RDMA_ERROR replies
- Remove a stale comment

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: <stable@kernel.vger.org>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-06-22 09:34:35 -04:00
Chuck Lever
c487eb7d8e xprtrdma: Clean up disconnect
1. Ensure that only rpcrdma_cm_event_handler() modifies
   ep->re_connect_status to avoid racy changes to that field.

2. Ensure that xprt_force_disconnect() is invoked only once as a
   transport is closed or destroyed.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-06-22 09:34:35 -04:00
Chuck Lever
f423f755f4 xprtrdma: Clean up synopsis of rpcrdma_flush_disconnect()
Refactor: Pass struct rpcrdma_xprt instead of an IB layer object.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-06-22 09:34:35 -04:00
Chuck Lever
2d97f46376 xprtrdma: Use re_connect_status safely in rpcrdma_xprt_connect()
Clean up: Sometimes creating a fresh rpcrdma_ep can fail. That's why
xprt_rdma_connect() always checks if the r_xprt->rx_ep pointer is
valid before dereferencing it. Instead, xprt_rdma_connect() can
simply check rpcrdma_xprt_connect()'s return value.

Also, there's no need to set re_connect_status to zero just after
the rpcrdma_ep is created, since it is allocated with kzalloc.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-06-22 09:34:35 -04:00
Chuck Lever
2acc5cae29 xprtrdma: Prevent dereferencing r_xprt->rx_ep after it is freed
r_xprt->rx_ep is known to be good while the transport's send lock is
held.  Otherwise additional references on rx_ep must be held when it
is used outside of that lock's critical sections.

For now, bump the rx_ep reference count once whenever there is at
least one outstanding Receive WR. This avoids the memory bandwidth
overhead of taking and releasing the reference count for every
ib_post_recv() and Receive completion.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-06-22 09:34:35 -04:00
Krzysztof Kozlowski
f148915f91
spi: spi-fsl-dspi: Initialize completion before possible interrupt
The interrupt handler calls completion and is IRQ requested before the
completion is initialized.  Logically it should be the other way.

Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Link: https://lore.kernel.org/r/20200622110543.5035-4-krzk@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
2020-06-22 13:50:29 +01:00
Krzysztof Kozlowski
3d87b613d6
spi: spi-fsl-dspi: Fix external abort on interrupt in resume or exit paths
If shared interrupt comes late, during probe error path or device remove
(could be triggered with CONFIG_DEBUG_SHIRQ), the interrupt handler
dspi_interrupt() will access registers with the clock being disabled.
This leads to external abort on non-linefetch on Toradex Colibri VF50
module (with Vybrid VF5xx):

    $ echo 4002d000.spi > /sys/devices/platform/soc/40000000.bus/4002d000.spi/driver/unbind

    Unhandled fault: external abort on non-linefetch (0x1008) at 0x8887f02c
    Internal error: : 1008 [#1] ARM
    Hardware name: Freescale Vybrid VF5xx/VF6xx (Device Tree)
    Backtrace:
      (regmap_mmio_read32le)
      (regmap_mmio_read)
      (_regmap_bus_reg_read)
      (_regmap_read)
      (regmap_read)
      (dspi_interrupt)
      (free_irq)
      (devm_irq_release)
      (release_nodes)
      (devres_release_all)
      (device_release_driver_internal)

The resource-managed framework should not be used for shared interrupt
handling, because the interrupt handler might be called after releasing
other resources and disabling clocks.

Similar bug could happen during suspend - the shared interrupt handler
could be invoked after suspending the device.  Each device sharing this
interrupt line should disable the IRQ during suspend so handler will be
invoked only in following cases:
1. None suspended,
2. All devices resumed.

Fixes: 349ad66c0a ("spi:Add Freescale DSPI driver for Vybrid VF610 platform")
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20200622110543.5035-3-krzk@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
2020-06-22 13:50:28 +01:00
Krzysztof Kozlowski
3c525b69e8
spi: spi-fsl-dspi: Fix lockup if device is shutdown during SPI transfer
During shutdown, the driver should unregister the SPI controller
and stop the hardware.  Otherwise the dspi_transfer_one_message() could
wait on completion infinitely.

Additionally, calling spi_unregister_controller() first in device
shutdown reverse-matches the probe function, where SPI controller is
registered at the end.

Fixes: dc23482599 ("spi: spi-fsl-dspi: Adding shutdown hook")
Reported-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20200622110543.5035-2-krzk@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
2020-06-22 13:50:27 +01:00
Krzysztof Kozlowski
7684580d45
spi: spi-fsl-dspi: Fix lockup if device is removed during SPI transfer
During device removal, the driver should unregister the SPI controller
and stop the hardware.  Otherwise the dspi_transfer_one_message() could
wait on completion infinitely.

Additionally, calling spi_unregister_controller() first in device
removal reverse-matches the probe function, where SPI controller is
registered at the end.

Fixes: 05209f4570 ("spi: fsl-dspi: add missing clk_disable_unprepare() in dspi_remove()")
Reported-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20200622110543.5035-1-krzk@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
2020-06-22 13:50:26 +01:00
Aneesh Kumar K.V
c1ed1754f2 powerpc/kvm/book3s64: Fix kernel crash with nested kvm & DEBUG_VIRTUAL
With CONFIG_DEBUG_VIRTUAL=y, __pa() checks for addr value and if it's
less than PAGE_OFFSET it leads to a BUG().

  #define __pa(x)
  ({
  	VIRTUAL_BUG_ON((unsigned long)(x) < PAGE_OFFSET);
  	(unsigned long)(x) & 0x0fffffffffffffffUL;
  })

  kernel BUG at arch/powerpc/kvm/book3s_64_mmu_radix.c:43!
  cpu 0x70: Vector: 700 (Program Check) at [c0000018a2187360]
      pc: c000000000161b30: __kvmhv_copy_tofrom_guest_radix+0x130/0x1f0
      lr: c000000000161d5c: kvmhv_copy_from_guest_radix+0x3c/0x80
  ...
  kvmhv_copy_from_guest_radix+0x3c/0x80
  kvmhv_load_from_eaddr+0x48/0xc0
  kvmppc_ld+0x98/0x1e0
  kvmppc_load_last_inst+0x50/0x90
  kvmppc_hv_emulate_mmio+0x288/0x2b0
  kvmppc_book3s_radix_page_fault+0xd8/0x2b0
  kvmppc_book3s_hv_page_fault+0x37c/0x1050
  kvmppc_vcpu_run_hv+0xbb8/0x1080
  kvmppc_vcpu_run+0x34/0x50
  kvm_arch_vcpu_ioctl_run+0x2fc/0x410
  kvm_vcpu_ioctl+0x2b4/0x8f0
  ksys_ioctl+0xf4/0x150
  sys_ioctl+0x28/0x80
  system_call_exception+0x104/0x1d0
  system_call_common+0xe8/0x214

kvmhv_copy_tofrom_guest_radix() uses a NULL value for to/from to
indicate direction of copy.

Avoid calling __pa() if the value is NULL to avoid the BUG().

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
[mpe: Massage change log a bit to mention CONFIG_DEBUG_VIRTUAL]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200611120159.680284-1-aneesh.kumar@linux.ibm.com
2020-06-22 21:55:45 +10:00
Takashi Iwai
91ef3d9f9f ASoC: Fixes for v5.8
This is a collection of mostly small fixes, mostly fixing fallout from
 some of the DPCM changes that went in last time around which shook out
 some issues on i.MX and Qualcomm platforms.  The addition of a managed
 version of snd_soc_register_dai() is to fix resource leaks.
 
 There's also a few new device IDs for x86 systems.
 -----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCgAxFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAl7wlcQTHGJyb29uaWVA
 a2VybmVsLm9yZwAKCRAk1otyXVSH0K9NB/0e4qOoGEyxmf6rbAnT1zvBu0vQu2qR
 ABkcCztt/tS/QJ1BSDHX/hALj9jr6RuAtn1pL4ynVGxJ86TByrrwLX1WhtKTk+DW
 TEgiPrFAKLlCwLVxgFdF0LORicv5zZEehi9+zP1HKt31Zsv2KNvTtwaF1WymIEd5
 LYVvYhtAKMmpjPN2WjlDSuita+fc/e5dm26n8hi3DJKIBVsLBbSxd1s1kwlqqnPh
 KbJjVrp5sWNhCyKu2hW8Sv1ovh40WbZ1TnzlDTiLinz0TCyqUyqEatuJ4RgRLmsW
 G3IAvm0rNjqyIF8TbQvYvvMZ0BZe/2SHdNXBtOXjXUYxhmnLvfAXocTf
 =FNgB
 -----END PGP SIGNATURE-----

Merge tag 'asoc-fix-v5.8-rc2' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus

ASoC: Fixes for v5.8

This is a collection of mostly small fixes, mostly fixing fallout from
some of the DPCM changes that went in last time around which shook out
some issues on i.MX and Qualcomm platforms.  The addition of a managed
version of snd_soc_register_dai() is to fix resource leaks.

There's also a few new device IDs for x86 systems.
2020-06-22 13:49:14 +02:00
Arseny Solokha
7e4773f73d powerpc/fsl_booke/32: Fix build with CONFIG_RANDOMIZE_BASE
Building the current 5.8 kernel for an e500 machine with
CONFIG_RANDOMIZE_BASE=y and CONFIG_BLOCK=n yields the following
failure:

  arch/powerpc/mm/nohash/kaslr_booke.c: In function 'kaslr_early_init':
  arch/powerpc/mm/nohash/kaslr_booke.c:387:2: error: implicit
  declaration of function 'flush_icache_range'; did you mean 'flush_tlb_range'?

Indeed, including asm/cacheflush.h into kaslr_booke.c fixes the build.

Fixes: 2b0e86cc5d ("powerpc/fsl_booke/32: implement KASLR infrastructure")
Cc: stable@vger.kernel.org # v5.5+
Signed-off-by: Arseny Solokha <asolokha@kb.kras.ru>
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Acked-by: Scott Wood <oss@buserror.net>
[mpe: Tweak change log to mention CONFIG_BLOCK=n]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200613162801.1946619-1-asolokha@kb.kras.ru
2020-06-22 20:41:52 +10:00
Jacky Hu
69339d083d pinctrl: amd: fix npins for uart0 in kerncz_groups
uart0_pins is defined as:
static const unsigned uart0_pins[] = {135, 136, 137, 138, 139};

which npins is wronly specified as 9 later
	{
		.name = "uart0",
		.pins = uart0_pins,
		.npins = 9,
	},

npins should be 5 instead of 9 according to the definition.

Signed-off-by: Jacky Hu <hengqing.hu@gmail.com>
Link: https://lore.kernel.org/r/20200616015024.287683-1-hengqing.hu@gmail.com
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2020-06-22 09:35:39 +02:00
Jason A. Donenfeld
625d344978 Revert "kernel/printk: add kmsg SEEK_CUR handling"
This reverts commit 8ece3b3eb5.

This commit broke userspace. Bash uses ESPIPE to determine whether or
not the file should be read using "unbuffered I/O", which means reading
1 byte at a time instead of 128 bytes at a time. I used to use bash to
read through kmsg in a really quite nasty way:

    while read -t 0.1 -r line 2>/dev/null || [[ $? -ne 142 ]]; do
       echo "SARU $line"
    done < /dev/kmsg

This will show all lines that can fit into the 128 byte buffer, and skip
lines that don't. That's pretty awful, but at least it worked.

With this change, bash now tries to do 1-byte reads, which means it
skips all the lines, which is worse than before.

Now, I don't really care very much about this, and I'm already look for
a workaround. But I did just spend an hour trying to figure out why my
scripts were broken. Either way, it makes no difference to me personally
whether this is reverted, but it might be something to consider. If you
declare that "trying to read /dev/kmsg with bash is terminally stupid
anyway," I might be inclined to agree with you. But do note that bash
uses lseek(fd, 0, SEEK_CUR)==>ESPIPE to determine whether or not it's
reading from a pipe.

Cc: Bruno Meneguele <bmeneg@redhat.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: David Laight <David.Laight@ACULAB.COM>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Petr Mladek <pmladek@suse.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-21 20:47:20 -07:00
Xiyu Yang
77577de641 cifs: Fix cached_fid refcnt leak in open_shroot
open_shroot() invokes kref_get(), which increases the refcount of the
"tcon->crfid" object. When open_shroot() returns not zero, it means the
open operation failed and close_shroot() will not be called to decrement
the refcount of the "tcon->crfid".

The reference counting issue happens in one normal path of
open_shroot(). When the cached root have been opened successfully in a
concurrent process, the function increases the refcount and jump to
"oshr_free" to return. However the current return value "rc" may not
equal to 0, thus the increased refcount will not be balanced outside the
function, causing a refcnt leak.

Fix this issue by setting the value of "rc" to 0 before jumping to
"oshr_free" label.

Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn>
Signed-off-by: Xin Tan <tanxin.ctf@gmail.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
CC: Stable <stable@vger.kernel.org>
2020-06-21 22:34:50 -05:00
Linus Torvalds
48778464bb Linux 5.8-rc2 2020-06-21 15:45:29 -07:00
Linus Torvalds
817d914d17 selinux/stable-5.8 PR 20200621
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAl7vxoUUHHBhdWxAcGF1
 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXOs3Q/+JSNYoKZiOJax5u1ePEwk5JRasRij
 JkKNjspXueVcoVkVF8lOiSOmQm/7FyfDUi2Qk8U5Gmx7Pr6vQJZgghHxdPMsemCz
 mNbbR8UMm6ssTim19ULBQ3S0Sc1QMQvQCLNDZcv24ne2K8d9HTBrFGenqlLU4UtZ
 JrMLircBt39fVonooMrf9ycGlM8tUZwm8te+Jp7KL18GUKZT8hr0HKzu2WE6/qT4
 WBGNaWxqnfbajnDb41ix2rL+lb8Snqn94cxCjp248rn7M5fJRSCKmYaumBh5ViJ2
 VuD/ZQsTX5SSnc9YDpkUDXya9M1wzFwf64ku6Avga1BXS6lNWB1wqWueSTMfggiL
 2B+LVANWkGFfHtVAVA5xsxXjeJnYmIj/g8qSiHS/RSFJazr1b/cXWedvyewll/Nv
 rFRBsVzktV6BBrlTclcrsu9FmlZRAThNC3uYs/s5vbAja+wHEhCLuacO+jiducRP
 fqQCP2iF6MqC6B2I8WzVp3jU8k2t02i6ySaXmXjzrwOZSLvnOdvDBzE7e95yNLRg
 WLeGd/o2PdLpVoSNVHelFrqm8VZKYSCkWty9WppklnrIVVydKMJ3bgihXY4pADyf
 1ABtKUZgySZKZOpr1pQBqIivHuvKqUGFynj6PSRsngQBoq6V3XpJ7ZCBhuG7cNAT
 9BfnUkhFW7lW70I=
 =nILH
 -----END PGP SIGNATURE-----

Merge tag 'selinux-pr-20200621' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux

Pull SELinux fixes from Paul Moore:
 "Three small patches to fix problems in the SELinux code, all found via
  clang.

  Two patches fix potential double-free conditions and one fixes an
  undefined return value"

* tag 'selinux-pr-20200621' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
  selinux: fix undefined return of cond_evaluate_expr
  selinux: fix a double free in cond_read_node()/cond_read_list()
  selinux: fix double free
2020-06-21 15:41:24 -07:00