1
0
Fork 0
mirror of synced 2025-03-06 20:59:54 +01:00
linux/kernel
Kumar Kartikeya Dwivedi dee872e124 bpf: Populate kfunc BTF ID sets in struct btf
This patch prepares the kernel to support putting all kinds of kfunc BTF
ID sets in the struct btf itself. The various kernel subsystems will
make register_btf_kfunc_id_set call in the initcalls (for built-in code
and modules).

The 'hook' is one of the many program types, e.g. XDP and TC/SCHED_CLS,
STRUCT_OPS, and 'types' are check (allowed or not), acquire, release,
and ret_null (with PTR_TO_BTF_ID_OR_NULL return type).

A maximum of BTF_KFUNC_SET_MAX_CNT (32) kfunc BTF IDs are permitted in a
set of certain hook and type for vmlinux sets, since they are allocated
on demand, and otherwise set as NULL. Module sets can only be registered
once per hook and type, hence they are directly assigned.

A new btf_kfunc_id_set_contains function is exposed for use in verifier,
this new method is faster than the existing list searching method, and
is also automatic. It also lets other code not care whether the set is
unallocated or not.

Note that module code can only do single register_btf_kfunc_id_set call
per hook. This is why sorting is only done for in-kernel vmlinux sets,
because there might be multiple sets for the same hook and type that
must be concatenated, hence sorting them is required to ensure bsearch
in btf_id_set_contains continues to work correctly.

Next commit will update the kernel users to make use of this
infrastructure.

Finally, add __maybe_unused annotation for BTF ID macros for the
!CONFIG_DEBUG_INFO_BTF case, so that they don't produce warnings during
build time.

The previous patch is also needed to provide synchronization against
initialization for module BTF's kfunc_set_tab introduced here, as
described below:

  The kfunc_set_tab pointer in struct btf is write-once (if we consider
  the registration phase (comprised of multiple register_btf_kfunc_id_set
  calls) as a single operation). In this sense, once it has been fully
  prepared, it isn't modified, only used for lookup (from the verifier
  context).

  For btf_vmlinux, it is initialized fully during the do_initcalls phase,
  which happens fairly early in the boot process, before any processes are
  present. This also eliminates the possibility of bpf_check being called
  at that point, thus relieving us of ensuring any synchronization between
  the registration and lookup function (btf_kfunc_id_set_contains).

  However, the case for module BTF is a bit tricky. The BTF is parsed,
  prepared, and published from the MODULE_STATE_COMING notifier callback.
  After this, the module initcalls are invoked, where our registration
  function will be called to populate the kfunc_set_tab for module BTF.

  At this point, BTF may be available to userspace while its corresponding
  module is still intializing. A BTF fd can then be passed to verifier
  using bpf syscall (e.g. for kfunc call insn).

  Hence, there is a race window where verifier may concurrently try to
  lookup the kfunc_set_tab. To prevent this race, we must ensure the
  operations are serialized, or waiting for the __init functions to
  complete.

  In the earlier registration API, this race was alleviated as verifier
  bpf_check_mod_kfunc_call didn't find the kfunc BTF ID until it was added
  by the registration function (called usually at the end of module __init
  function after all module resources have been initialized). If the
  verifier made the check_kfunc_call before kfunc BTF ID was added to the
  list, it would fail verification (saying call isn't allowed). The
  access to list was protected using a mutex.

  Now, it would still fail verification, but for a different reason
  (returning ENXIO due to the failed btf_try_get_module call in
  add_kfunc_call), because if the __init call is in progress the module
  will be in the middle of MODULE_STATE_COMING -> MODULE_STATE_LIVE
  transition, and the BTF_MODULE_LIVE flag for btf_module instance will
  not be set, so the btf_try_get_module call will fail.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20220114163953.1455836-3-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-01-18 14:26:41 -08:00
..
bpf bpf: Populate kfunc BTF ID sets in struct btf 2022-01-18 14:26:41 -08:00
cgroup Networking changes for 5.17. 2022-01-10 19:06:09 -08:00
configs drivers/char: remove /dev/kmem for good 2021-05-07 00:26:34 -07:00
debug kdb: Adopt scheduler's task classification 2021-11-03 17:21:37 +00:00
dma dma-mapping updates for Linux 5.16 2021-11-09 10:56:41 -08:00
entry entry: Snapshot thread flags 2021-12-01 00:06:43 +01:00
events perf: Add a counter for number of user access events in context 2021-12-14 11:30:54 +00:00
futex futex: Fix PREEMPT_RT build 2021-10-19 17:27:05 +02:00
gcov Kconfig: Introduce ARCH_WANTS_NO_INSTR and CC_HAS_NO_PROFILE_FN_ATTR 2021-06-22 11:07:18 -07:00
irq irq: remove unused flags argument from __handle_irq_event_percpu() 2022-01-07 00:25:25 +01:00
kcsan arm64: Enable KCSAN 2021-12-14 18:54:34 +00:00
livepatch Tracing updates for 5.16: 2021-11-01 20:05:19 -07:00
locking locking/rtmutex: Fix incorrect condition in rtmutex_spin_on_owner() 2021-12-18 10:55:51 +01:00
power PM: hibernate: Allow ACPI hardware signature to be honoured 2021-12-08 16:06:10 +01:00
printk printk fixup for 5.16 2021-11-18 10:50:45 -08:00
rcu RCU pull request for v5.16 2021-11-01 20:25:38 -07:00
sched - Add a set of thread_info.flags accessors which snapshot it before 2022-01-10 11:34:10 -08:00
time timekeeping: Really make sure wall_to_monotonic isn't positive 2021-12-17 23:06:22 +01:00
trace Networking changes for 5.17. 2022-01-10 19:06:09 -08:00
.gitignore .gitignore: prefix local generated files with a slash 2021-05-02 00:43:35 +09:00
acct.c kernel: remove spurious blkdev.h includes 2021-10-18 06:17:01 -06:00
async.c kernel/async.c: remove async_unregister_domain() 2021-05-07 00:26:33 -07:00
audit.c audit: improve robustness of the audit queue handling 2021-12-15 13:16:39 -05:00
audit.h audit/stable-5.16 PR 20211101 2021-11-01 21:17:39 -07:00
audit_fsnotify.c fsnotify: clarify contract for create event hooks 2021-10-27 12:32:34 +02:00
audit_tree.c audit/stable-5.16 PR 20211101 2021-11-01 21:17:39 -07:00
audit_watch.c \n 2021-11-06 16:43:20 -07:00
auditfilter.c audit: add filtering for io_uring records 2021-09-19 22:34:38 -04:00
auditsc.c audit/stable-5.16 PR 20211101 2021-11-01 21:17:39 -07:00
backtracetest.c treewide: Replace DECLARE_TASKLET() with DECLARE_TASKLET_OLD() 2020-07-30 11:15:58 -07:00
bounds.c
capability.c capability: handle idmapped mounts 2021-01-24 14:27:16 +01:00
cfi.c cfi: Use rcu_read_{un}lock_sched_notrace 2021-08-11 13:11:12 -07:00
compat.c arch: remove compat_alloc_user_space 2021-09-08 15:32:35 -07:00
configs.c
context_tracking.c
cpu.c sched/scs: Reset task stack state in bringup_cpu() 2021-11-24 12:20:27 +01:00
cpu_pm.c PM: cpu: Make notifier chain use a raw_spinlock_t 2021-08-16 18:55:32 +02:00
crash_core.c kernel/crash_core: suppress unknown crashkernel parameter warning 2021-12-25 12:20:55 -08:00
crash_dump.c
cred.c ucounts: In set_cred_ucounts assume new->ucounts is non-NULL 2021-10-20 10:45:34 -05:00
delayacct.c delayacct: Add sysctl to enable at runtime 2021-05-12 11:43:25 +02:00
dma.c
exec_domain.c
exit.c Merge branch 'per_signal_struct_coredumps-for-v5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2021-11-03 12:15:29 -07:00
extable.c extable: use is_kernel_text() helper 2021-11-09 10:02:51 -08:00
fail_function.c fault-injection: handle EI_ETYPE_TRUE 2020-12-15 22:46:19 -08:00
fork.c A single fix for POSIX CPU timers to address a problem where POSIX CPU 2021-11-14 10:43:38 -08:00
freezer.c sched: Add get_current_state() 2021-06-18 11:43:08 +02:00
gen_kheaders.sh kbuild: clean up ${quiet} checks in shell scripts 2021-05-27 04:01:50 +09:00
groups.c groups: simplify struct group_info allocation 2021-02-26 09:41:03 -08:00
hung_task.c Merge branch 'akpm' (patches from Andrew) 2021-07-02 12:08:10 -07:00
iomem.c
irq_work.c irq_work: Also rcuwait for !IRQ_WORK_HARD_IRQ on PREEMPT_RT 2021-10-15 11:25:18 +02:00
jump_label.c jump_label: Fix jump_label_text_reserved() vs __init 2021-07-05 10:46:20 +02:00
kallsyms.c kallsyms: strip LTO suffixes from static functions 2021-10-04 10:58:25 -07:00
kcmp.c Merge branch 'exec-update-lock-for-v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2020-12-15 19:36:48 -08:00
Kconfig.freezer
Kconfig.hz
Kconfig.locks locking/rwlock: Provide RT variant 2021-08-17 17:50:51 +02:00
Kconfig.preempt preempt: Restore preemption model selection configs 2021-11-11 13:09:33 +01:00
kcov.c kcov: replace local_irq_save() with a local_lock_t 2021-11-09 10:02:52 -08:00
kexec.c kexec: avoid compat_alloc_user_space 2021-09-08 15:32:34 -07:00
kexec_core.c Merge branch 'rework/printk_safe-removal' into for-linus 2021-08-30 16:36:10 +02:00
kexec_elf.c
kexec_file.c memblock: add MEMBLOCK_DRIVER_MANAGED to mimic IORESOURCE_SYSRAM_DRIVER_MANAGED 2021-11-06 13:30:42 -07:00
kexec_internal.h kexec: move machine_kexec_post_load() to public interface 2021-02-22 12:33:26 +00:00
kheaders.c
kmod.c modules: add CONFIG_MODPROBE_PATH 2021-05-07 00:26:33 -07:00
kprobes.c kprobes: Limit max data_size of the kretprobe instances 2021-12-01 21:04:34 -05:00
ksysfs.c
kthread.c Merge branch 'exit-cleanups-for-v5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2021-11-10 16:15:54 -08:00
latencytop.c
Makefile Tracing updates for 5.16: 2021-11-01 20:05:19 -07:00
module-internal.h
module.c module: change to print useful messages from elf_validity_check() 2021-11-05 15:13:10 -07:00
module_signature.c module: harden ELF info handling 2021-01-19 10:24:45 +01:00
module_signing.c module: harden ELF info handling 2021-01-19 10:24:45 +01:00
notifier.c notifier: Return an error when a callback has already been registered 2021-12-29 10:37:33 +01:00
nsproxy.c memcg: enable accounting for new namesapces and struct nsproxy 2021-09-03 09:58:12 -07:00
padata.c padata: Remove repeated verbose license text 2021-08-27 16:30:18 +08:00
panic.c Merge branch 'rework/printk_safe-removal' into for-linus 2021-08-30 16:36:10 +02:00
params.c params: lift param_set_uint_minmax to common code 2021-08-16 14:42:22 +02:00
pid.c pid: add pidfd_get_task() helper 2021-10-14 13:29:18 +02:00
pid_namespace.c memcg: enable accounting for new namesapces and struct nsproxy 2021-09-03 09:58:12 -07:00
profile.c profiling: fix shift-out-of-bounds bugs 2021-09-08 11:50:26 -07:00
ptrace.c sched: Change task_struct::state 2021-06-18 11:43:09 +02:00
range.c kernel.h: split out min()/max() et al. helpers 2020-10-16 11:11:19 -07:00
reboot.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input 2021-11-12 11:53:16 -08:00
regset.c regset: kill ->get() 2020-07-27 14:31:12 -04:00
relay.c relay: allow the use of const callback structs 2020-12-15 22:46:18 -08:00
resource.c kernel/resource: disallow access to exclusive system RAM regions 2021-11-09 10:02:52 -08:00
resource_kunit.c resource: provide meaningful MODULE_LICENSE() in test suite 2020-11-25 18:52:35 +01:00
rseq.c KVM: rseq: Update rseq when processing NOTIFY_RESUME on xfer to KVM guest 2021-09-22 10:24:01 -04:00
scftorture.c scftorture: Warn on individual scf_torture_init() error conditions 2021-09-16 10:27:48 -07:00
scs.c scs: Release kasan vmalloc poison in scs_free process 2021-09-30 09:37:27 +01:00
seccomp.c Merge branch 'exit-cleanups-for-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2021-09-01 14:52:05 -07:00
signal.c signal: Skip the altstack update when not needed 2021-12-14 13:08:36 -08:00
smp.c sched: Improve wake_up_all_idle_cpus() take #2 2021-10-22 15:32:46 +02:00
smpboot.c smpboot: Replace deprecated CPU-hotplug functions. 2021-08-10 14:57:42 +02:00
smpboot.h
softirq.c timers/nohz: Last resort update jiffies on nohz_full IRQ entry 2021-12-02 15:07:22 +01:00
stackleak.c stackleak: let stack_erasing_sysctl take a kernel pointer buffer 2020-09-19 13:13:39 -07:00
stacktrace.c stacktrace: move filter_irq_stacks() to kernel/stacktrace.c 2021-11-06 13:30:43 -07:00
static_call.c static_call: Fix static_call_text_reserved() vs __init 2021-07-05 10:46:33 +02:00
stop_machine.c stop_machine: Add caller debug info to queue_stop_cpus_work 2021-03-23 16:01:58 +01:00
sys.c Merge branch 'akpm' (patches from Andrew) 2021-09-08 12:55:35 -07:00
sys_ni.c futex: Implement sys_futex_waitv() 2021-10-07 13:51:11 +02:00
sysctl-test.c kernel/sysctl-test: Remove some casts which are no-longer required 2021-06-23 16:41:24 -06:00
sysctl.c net: Don't include filter.h from net/sock.h 2021-12-29 08:48:14 -08:00
task_work.c kasan: record task_work_add() call stack 2021-04-30 11:20:42 -07:00
taskstats.c treewide: rename nla_strlcpy to nla_strscpy. 2020-11-16 08:08:54 -08:00
torture.c torture: Replace deprecated CPU-hotplug functions. 2021-08-10 10:48:07 -07:00
tracepoint.c tracepoint: Fix kerneldoc comments 2021-08-16 11:39:51 -04:00
tsacct.c mm/mmap.c: fix a data race of mm->total_vm 2021-11-06 13:30:35 -07:00
ucount.c ucounts: Fix rlimit max values check 2021-12-09 15:37:18 -06:00
uid16.c
uid16.h
umh.c kernel/umh.c: fix some spelling mistakes 2021-05-07 00:26:34 -07:00
up.c A set of locking related fixes and updates: 2021-05-09 13:07:03 -07:00
user-return-notifier.c
user.c fs/epoll: use a per-cpu counter for user's watches count 2021-09-08 11:50:27 -07:00
user_namespace.c memcg: enable accounting for new namesapces and struct nsproxy 2021-09-03 09:58:12 -07:00
usermode_driver.c Merge branch 'work.namei' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2021-07-03 11:41:14 -07:00
utsname.c uts: Use generic ns_common::count 2020-08-19 14:13:20 +02:00
utsname_sysctl.c
watch_queue.c watch_queue: rectify kernel-doc for init_watch() 2021-01-26 11:16:34 +00:00
watchdog.c kernel: watchdog: modify the explanation related to watchdog thread 2021-06-29 10:53:46 -07:00
watchdog_hld.c
workqueue.c Merge branch 'akpm' (patches from Andrew) 2021-11-06 14:08:17 -07:00
workqueue_internal.h workqueue: Assign a color to barrier work items 2021-08-17 07:49:10 -10:00