linux

mirror of synced 2025-03-06 20:59:54 +01:00

Author	SHA1	Message	Date
Stanislav Fomichev	d6d418bd8f	libbpf: Cap retries in sys_bpf_prog_load I've seen a situation, where a process that's under pprof constantly generates SIGPROF which prevents program loading indefinitely. The right thing to do probably is to disable signals in the upper layers while loading, but it still would be nice to get some error from libbpf instead of an endless loop. Let's add some small retry limit to the program loading: try loading the program 5 (arbitrary) times and give up. v2: * 10 -> 5 retires (Andrii Nakryiko) Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20201202231332.3923644-1-sdf@google.com	2020-12-03 12:01:18 -08:00
Toke Høiland-Jørgensen	9cf309c56f	libbpf: Sanitise map names before pinning When we added sanitising of map names before loading programs to libbpf, we still allowed periods in the name. While the kernel will accept these for the map names themselves, they are not allowed in file names when pinning maps. This means that bpf_object__pin_maps() will fail if called on an object that contains internal maps (such as sections .rodata). Fix this by replacing periods with underscores when constructing map pin paths. This only affects the paths generated by libbpf when bpf_object__pin_maps() is called with a path argument. Any pin paths set by bpf_map__set_pin_path() are unaffected, and it will still be up to the caller to avoid invalid characters in those. Fixes: `113e6b7e15` ("libbpf: Sanitise internal map names so they are not rejected by the kernel") Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20201203093306.107676-1-toke@redhat.com	2020-12-03 11:57:42 -08:00
Andrei Matei	80b2b5c3a7	libbpf: Fail early when loading programs with unspecified type Before this patch, a program with unspecified type (BPF_PROG_TYPE_UNSPEC) would be passed to the BPF syscall, only to have the kernel reject it with an opaque invalid argument error. This patch makes libbpf reject such programs with a nicer error message - in particular libbpf now tries to diagnose bad ELF section names at both open time and load time. Signed-off-by: Andrei Matei <andreimatei1@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20201203043410.59699-1-andreimatei1@gmail.com	2020-12-03 11:37:05 -08:00
Mariusz Dudek	e459f49b43	libbpf: Separate XDP program load with xsk socket creation Add support for separation of eBPF program load and xsk socket creation. This is needed for use-case when you want to privide as little privileges as possible to the data plane application that will handle xsk socket creation and incoming traffic. With this patch the data entity container can be run with only CAP_NET_RAW capability to fulfill its purpose of creating xsk socket and handling packages. In case your umem is larger or equal process limit for MEMLOCK you need either increase the limit or CAP_IPC_LOCK capability. To resolve privileges issue two APIs are introduced: - xsk_setup_xdp_prog - loads the built in XDP program. It can also return xsks_map_fd which is needed by unprivileged process to update xsks_map with AF_XDP socket "fd" - xsk_socket__update_xskmap - inserts an AF_XDP socket into an xskmap for a particular xsk_socket Signed-off-by: Mariusz Dudek <mariuszx.dudek@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/bpf/20201203090546.11976-2-mariuszx.dudek@intel.com	2020-12-03 10:37:59 -08:00
Andrii Nakryiko	0cfdcd6378	libbpf: Add base BTF accessor Add ability to get base BTF. It can be also used to check if BTF is split BTF. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20201202065244.530571-3-andrii@kernel.org	2020-12-03 10:17:26 -08:00
Andrii Nakryiko	f6a8250ea1	libbpf: Fix ring_buffer__poll() to return number of consumed samples Fix ring_buffer__poll() to return the number of non-discarded records consumed, just like its documentation states. It's also consistent with ring_buffer__consume() return. Fix up selftests with wrong expected results. Fixes: `bf99c936f9` ("libbpf: Add BPF ring buffer support") Fixes: `cb1c9ddd55` ("selftests/bpf: Add BPF ringbuf selftests") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20201130223336.904192-1-andrii@kernel.org	2020-12-01 20:21:45 -08:00
Magnus Karlsson	105c4e75fe	libbpf: Replace size_t with __u32 in xsk interfaces Replace size_t with __u32 in the xsk interfaces that contain this. There is no reason to have size_t since the internal variable that is manipulated is a __u32. The following APIs are affected: __u32 xsk_ring_prod__reserve(struct xsk_ring_prod prod, __u32 nb, __u32 idx) void xsk_ring_prod__submit(struct xsk_ring_prod prod, __u32 nb) __u32 xsk_ring_cons__peek(struct xsk_ring_cons cons, __u32 nb, __u32 idx) void xsk_ring_cons__cancel(struct xsk_ring_cons cons, __u32 nb) void xsk_ring_cons__release(struct xsk_ring_cons *cons, __u32 nb) The "nb" variable and the return values have been changed from size_t to __u32. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/1606383455-8243-1-git-send-email-magnus.karlsson@gmail.com	2020-11-27 21:58:38 +01:00
Li RongQing	db13db9f67	libbpf: Add support for canceling cached_cons advance Add a new function for returning descriptors the user received after an xsk_ring_cons__peek call. After the application has gotten a number of descriptors from a ring, it might not be able to or want to process them all for various reasons. Therefore, it would be useful to have an interface for returning or cancelling a number of them so that they are returned to the ring. This patch adds a new function called xsk_ring_cons__cancel that performs this operation on nb descriptors counted from the end of the batch of descriptors that was received through the peek call. Signed-off-by: Li RongQing <lirongqing@baidu.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> [ Magnus Karlsson: rewrote changelog ] Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/bpf/1606202474-8119-1-git-send-email-lirongqing@baidu.com	2020-11-25 13:18:47 +01:00
Jakub Kicinski	56495a2442	Merge https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-11-19 19:08:46 -08:00
Jiri Olsa	1fd6cee127	libbpf: Fix VERSIONED_SYM_COUNT number parsing We remove "other info" from "readelf -s --wide" output when parsing GLOBAL_SYM_COUNT variable, which was added in [1]. But we don't do that for VERSIONED_SYM_COUNT and it's failing the check_abi target on powerpc Fedora 33. The extra "other info" wasn't problem for VERSIONED_SYM_COUNT parsing until commit [2] added awk in the pipe, which assumes that the last column is symbol, but it can be "other info". Adding "other info" removal for VERSIONED_SYM_COUNT the same way as we did for GLOBAL_SYM_COUNT parsing. [1] `aa915931ac` ("libbpf: Fix readelf output parsing for Fedora") [2] `746f534a48` ("tools/libbpf: Avoid counting local symbols in ABI check") Fixes: `746f534a48` ("tools/libbpf: Avoid counting local symbols in ABI check") Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20201118211350.1493421-1-jolsa@kernel.org	2020-11-19 08:45:12 -08:00
Alan Maguire	de91e631bd	libbpf: bpf__find_by_name[_kind] should use btf__get_nr_types() When operating on split BTF, btf__find_by_name[_kind] will not iterate over all types since they use btf->nr_types to show the number of types to iterate over. For split BTF this is the number of types _on top of base BTF_, so it will underestimate the number of types to iterate over, especially for vmlinux + module BTF, where the latter is much smaller. Use btf__get_nr_types() instead. Fixes: `ba451366bf` ("libbpf: Implement basic split BTF support") Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/1605437195-2175-1-git-send-email-alan.maguire@oracle.com	2020-11-16 20:51:34 -08:00
Jakub Kicinski	07cbce2e46	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2020-11-14 1) Add BTF generation for kernel modules and extend BTF infra in kernel e.g. support for split BTF loading and validation, from Andrii Nakryiko. 2) Support for pointers beyond pkt_end to recognize LLVM generated patterns on inlined branch conditions, from Alexei Starovoitov. 3) Implements bpf_local_storage for task_struct for BPF LSM, from KP Singh. 4) Enable FENTRY/FEXIT/RAW_TP tracing program to use the bpf_sk_storage infra, from Martin KaFai Lau. 5) Add XDP bulk APIs that introduce a defer/flush mechanism to optimize the XDP_REDIRECT path, from Lorenzo Bianconi. 6) Fix a potential (although rather theoretical) deadlock of hashtab in NMI context, from Song Liu. 7) Fixes for cross and out-of-tree build of bpftool and runqslower allowing build for different target archs on same source tree, from Jean-Philippe Brucker. 8) Fix error path in htab_map_alloc() triggered from syzbot, from Eric Dumazet. 9) Move functionality from test_tcpbpf_user into the test_progs framework so it can run in BPF CI, from Alexander Duyck. 10) Lift hashtab key_size limit to be larger than MAX_BPF_STACK, from Florian Lehner. Note that for the fix from Song we have seen a sparse report on context imbalance which requires changes in sparse itself for proper annotation detection where this is currently being discussed on linux-sparse among developers [0]. Once we have more clarification/guidance after their fix, Song will follow-up. [0] https://lore.kernel.org/linux-sparse/CAHk-=wh4bx8A8dHnX612MsDO13st6uzAz1mJ1PaHHVevJx_ZCw@mail.gmail.com/T/ https://lore.kernel.org/linux-sparse/20201109221345.uklbp3lzgq6g42zb@ltop.local/T/ * git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (66 commits) net: mlx5: Add xdp tx return bulking support net: mvpp2: Add xdp tx return bulking support net: mvneta: Add xdp tx return bulking support net: page_pool: Add bulk support for ptr_ring net: xdp: Introduce bulking for xdp tx return path bpf: Expose bpf_d_path helper to sleepable LSM hooks bpf: Augment the set of sleepable LSM hooks bpf: selftest: Use bpf_sk_storage in FENTRY/FEXIT/RAW_TP bpf: Allow using bpf_sk_storage in FENTRY/FEXIT/RAW_TP bpf: Rename some functions in bpf_sk_storage bpf: Folding omem_charge() into sk_storage_charge() selftests/bpf: Add asm tests for pkt vs pkt_end comparison. selftests/bpf: Add skb_pkt_end test bpf: Support for pointers beyond pkt_end. tools/bpf: Always run the *-clean recipes tools/bpf: Add bootstrap/ to .gitignore bpf: Fix NULL dereference in bpf_task_storage tools/bpftool: Fix build slowdown tools/runqslower: Build bpftool using HOSTCC tools/runqslower: Enable out-of-tree build ... ==================== Link: https://lore.kernel.org/r/20201114020819.29584-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-11-14 09:13:41 -08:00
Andrii Nakryiko	197afc6314	libbpf: Don't attempt to load unused subprog as an entry-point BPF program If BPF code contains unused BPF subprogram and there are no other subprogram calls (which can realistically happen in real-world applications given sufficiently smart Clang code optimizations), libbpf will erroneously assume that subprograms are entry-point programs and will attempt to load them with UNSPEC program type. Fix by not relying on subcall instructions and rather detect it based on the structure of BPF object's sections. Fixes: `9a94f277c4` ("tools: libbpf: restore the ability to load programs from .text section") Reported-by: Dmitrii Banshchikov <dbanschikov@fb.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20201107000251.256821-1-andrii@kernel.org	2020-11-09 22:15:23 +01:00
KP Singh	8885274d22	libbpf: Add support for task local storage Updates the bpf_probe_map_type API to also support BPF_MAP_TYPE_TASK_STORAGE similar to other local storage maps. Signed-off-by: KP Singh <kpsingh@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20201106103747.2780972-4-kpsingh@chromium.org	2020-11-06 08:08:37 -08:00
Andrii Nakryiko	6b6e6b1d09	libbpf: Accomodate DWARF/compiler bug with duplicated identical arrays In some cases compiler seems to generate distinct DWARF types for identical arrays within the same CU. That seems like a bug, but it's already out there and breaks type graph equivalence checks, so accommodate it anyway by checking for identical arrays, regardless of their type ID. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20201105043402.2530976-10-andrii@kernel.org	2020-11-05 18:37:30 -08:00
Andrii Nakryiko	f86524efcf	libbpf: Support BTF dedup of split BTFs Add support for deduplication split BTFs. When deduplicating split BTF, base BTF is considered to be immutable and can't be modified or adjusted. 99% of BTF deduplication logic is left intact (module some type numbering adjustments). There are only two differences. First, each type in base BTF gets hashed (expect VAR and DATASEC, of course, those are always considered to be self-canonical instances) and added into a table of canonical table candidates. Hashing is a shallow, fast operation, so mostly eliminates the overhead of having entire base BTF to be a part of BTF dedup. Second difference is very critical and subtle. While deduplicating split BTF types, it is possible to discover that one of immutable base BTF BTF_KIND_FWD types can and should be resolved to a full STRUCT/UNION type from the split BTF part. This is, obviously, can't happen because we can't modify the base BTF types anymore. So because of that, any type in split BTF that directly or indirectly references that newly-to-be-resolved FWD type can't be considered to be equivalent to the corresponding canonical types in base BTF, because that would result in a loss of type resolution information. So in such case, split BTF types will be deduplicated separately and will cause some duplication of type information, which is unavoidable. With those two changes, the rest of the algorithm manages to deduplicate split BTF correctly, pointing all the duplicates to their canonical counter-parts in base BTF, but also is deduplicating whatever unique types are present in split BTF on their own. Also, theoretically, split BTF after deduplication could end up with either empty type section or empty string section. This is handled by libbpf correctly in one of previous patches in the series. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20201105043402.2530976-9-andrii@kernel.org	2020-11-05 18:37:30 -08:00
Andrii Nakryiko	d812362450	libbpf: Fix BTF data layout checks and allow empty BTF Make data section layout checks stricter, disallowing overlap of types and strings data. Additionally, allow BTFs with no type data. There is nothing inherently wrong with having BTF with no types (put potentially with some strings). This could be a situation with kernel module BTFs, if module doesn't introduce any new type information. Also fix invalid offset alignment check for btf->hdr->type_off. Fixes: `8a138aed4a` ("bpf: btf: Add BTF support to libbpf") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20201105043402.2530976-8-andrii@kernel.org	2020-11-05 18:37:30 -08:00
Andrii Nakryiko	ba451366bf	libbpf: Implement basic split BTF support Support split BTF operation, in which one BTF (base BTF) provides basic set of types and strings, while another one (split BTF) builds on top of base's types and strings and adds its own new types and strings. From API standpoint, the fact that the split BTF is built on top of the base BTF is transparent. Type numeration is transparent. If the base BTF had last type ID #N, then all types in the split BTF start at type ID N+1. Any type in split BTF can reference base BTF types, but not vice versa. Programmatically construction of a split BTF on top of a base BTF is supported: one can create an empty split BTF with btf__new_empty_split() and pass base BTF as an input, or pass raw binary data to btf__new_split(), or use btf__parse_xxx_split() variants to get initial set of split types/strings from the ELF file with .BTF section. String offsets are similarly transparent and are a logical continuation of base BTF's strings. When building BTF programmatically and adding a new string (explicitly with btf__add_str() or implicitly through appending new types/members), string-to-be-added would first be looked up from the base BTF's string section and re-used if it's there. If not, it will be looked up and/or added to the split BTF string section. Similarly to type IDs, types in split BTF can refer to strings from base BTF absolutely transparently (but not vice versa, of course, because base BTF doesn't "know" about existence of split BTF). Internal type index is slightly adjusted to be zero-indexed, ignoring a fake [0] VOID type. This allows to handle split/base BTF type lookups transparently by using btf->start_id type ID offset, which is always 1 for base/non-split BTF and equals btf__get_nr_types(base_btf) + 1 for the split BTF. BTF deduplication is not yet supported for split BTF and support for it will be added in separate patch. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20201105043402.2530976-5-andrii@kernel.org	2020-11-05 18:37:30 -08:00
Andrii Nakryiko	88a82c2a9a	libbpf: Unify and speed up BTF string deduplication Revamp BTF dedup's string deduplication to match the approach of writable BTF string management. This allows to transfer deduplicated strings index back to BTF object after deduplication without expensive extra memory copying and hash map re-construction. It also simplifies the code and speeds it up, because hashmap-based string deduplication is faster than sort + unique approach. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20201105043402.2530976-4-andrii@kernel.org	2020-11-05 18:37:30 -08:00
Andrii Nakryiko	c81ed6d81e	libbpf: Factor out common operations in BTF writing APIs Factor out commiting of appended type data. Also extract fetching the very last type in the BTF (to append members to). These two operations are common across many APIs and will be easier to refactor with split BTF, if they are extracted into a single place. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20201105043402.2530976-2-andrii@kernel.org	2020-11-05 18:37:30 -08:00
Magnus Karlsson	25cf73b9ff	libbpf: Fix possible use after free in xsk_socket__delete Fix a possible use after free in xsk_socket__delete that will happen if xsk_put_ctx() frees the ctx. To fix, save the umem reference taken from the context and just use that instead. Fixes: `2f6324a393` ("libbpf: Support shared umems between queues and devices") Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/1604396490-12129-3-git-send-email-magnus.karlsson@gmail.com	2020-11-04 21:37:29 +01:00
Magnus Karlsson	f78331f74c	libbpf: Fix null dereference in xsk_socket__delete Fix a possible null pointer dereference in xsk_socket__delete that will occur if a null pointer is fed into the function. Fixes: `2f6324a393` ("libbpf: Support shared umems between queues and devices") Reported-by: Andrii Nakryiko <andrii.nakryiko@gmail.com> Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/1604396490-12129-2-git-send-email-magnus.karlsson@gmail.com	2020-11-04 21:37:28 +01:00
Ian Rogers	7a078d2d18	libbpf, hashmap: Fix undefined behavior in hash_bits If bits is 0, the case when the map is empty, then the >> is the size of the register which is undefined behavior - on x86 it is the same as a shift by 0. Fix by handling the 0 case explicitly and guarding calls to hash_bits for empty maps in hashmap__for_each_key_entry and hashmap__for_each_entry_safe. Fixes: `e3b9242240` ("libbpf: add resizable non-thread safe internal hashmap") Suggested-by: Andrii Nakryiko <andriin@fb.com>, Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20201029223707.494059-1-irogers@google.com	2020-11-02 23:33:51 +01:00
Daniel Borkmann	3652c9a1b1	bpf, libbpf: Guard bpf inline asm from bpf_tail_call_static Yaniv reported a compilation error after pulling latest libbpf: [...] ../libbpf/src/root/usr/include/bpf/bpf_helpers.h:99:10: error: unknown register name 'r0' in asm : "r0", "r1", "r2", "r3", "r4", "r5"); [...] The issue got triggered given Yaniv was compiling tracing programs with native target (e.g. x86) instead of BPF target, hence no BTF generated vmlinux.h nor CO-RE used, and later llc with -march=bpf was invoked to compile from LLVM IR to BPF object file. Given that clang was expecting x86 inline asm and not BPF one the error complained that these regs don't exist on the former. Guard bpf_tail_call_static() with defined(__bpf__) where BPF inline asm is valid to use. BPF tracing programs on more modern kernels use BPF target anyway and thus the bpf_tail_call_static() function will be available for them. BPF inline asm is supported since clang 7 (clang <= 6 otherwise throws same above error), and __bpf_unreachable() since clang 8, therefore include the latter condition in order to prevent compilation errors for older clang versions. Given even an old Ubuntu 18.04 LTS has official LLVM packages all the way up to llvm-10, I did not bother to special case the __bpf_unreachable() inside bpf_tail_call_static() further. Also, undo the sockex3_kern's use of bpf_tail_call_static() sample given they still have the old hacky way to even compile networking progs with native instead of BPF target so bpf_tail_call_static() won't be defined there anymore. Fixes: `0e9f6841f6` ("bpf, libbpf: Add bpf_tail_call_static helper for bpf programs") Reported-by: Yaniv Agman <yanivagman@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Tested-by: Yaniv Agman <yanivagman@gmail.com> Link: https://lore.kernel.org/bpf/CAMy7=ZUk08w5Gc2Z-EKi4JFtuUCaZYmE4yzhJjrExXpYKR4L8w@mail.gmail.com Link: https://lore.kernel.org/bpf/20201021203257.26223-1-daniel@iogearbox.net	2020-10-22 01:46:52 +02:00
Jakub Kicinski	ccdf7fae3a	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Alexei Starovoitov says: ==================== pull-request: bpf-next 2020-10-12 The main changes are: 1) The BPF verifier improvements to track register allocation pattern, from Alexei and Yonghong. 2) libbpf relocation support for different size load/store, from Andrii. 3) bpf_redirect_peer() helper and support for inner map array with different max_entries, from Daniel. 4) BPF support for per-cpu variables, form Hao. 5) sockmap improvements, from John. ==================== Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-10-12 16:16:50 -07:00
Andrii Nakryiko	2b7d88c2b5	libbpf: Allow specifying both ELF and raw BTF for CO-RE BTF override Use generalized BTF parsing logic, making it possible to parse BTF both from ELF file, as well as a raw BTF dump. This makes it easier to write custom tests with manually generated BTFs. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20201008001025.292064-4-andrii@kernel.org	2020-10-07 18:50:27 -07:00
Andrii Nakryiko	a66345bcbd	libbpf: Support safe subset of load/store instruction resizing with CO-RE Add support for patching instructions of the following form: - rX = (T )(rY + <off>); - (T )(rX + <off>) = rY; - (T )(rX + <off>) = <imm>, where T is one of {u8, u16, u32, u64}. For such instructions, if the actual kernel field recorded in CO-RE relocation has a different size than the one recorded locally (e.g., from vmlinux.h), then libbpf will adjust T to an appropriate 1-, 2-, 4-, or 8-byte loads. In general, such transformation is not always correct and could lead to invalid final value being loaded or stored. But two classes of cases are always safe: - if both local and target (kernel) types are unsigned integers, but of different sizes, then it's OK to adjust load/store instruction according to the necessary memory size. Zero-extending nature of such instructions and unsignedness make sure that the final value is always correct; - pointer size mismatch between BPF target architecture (which is always 64-bit) and 32-bit host kernel architecture can be similarly resolved automatically, because pointer is essentially an unsigned integer. Loading 32-bit pointer into 64-bit BPF register with zero extension will leave correct pointer in the register. Both cases are necessary to support CO-RE on 32-bit kernels, as `unsigned long` in vmlinux.h generated from 32-bit kernel is 32-bit, but when compiled with BPF program for BPF target it will be treated by compiler as 64-bit integer. Similarly, pointers in vmlinux.h are 32-bit for kernel, but treated as 64-bit values by compiler for BPF target. Both problems are now resolved by libbpf for direct memory reads. But similar transformations are useful in general when kernel fields are "resized" from, e.g., unsigned int to unsigned long (or vice versa). Now, similar transformations for signed integers are not safe to perform as they will result in incorrect sign extension of the value. If such situation is detected, libbpf will emit helpful message and will poison the instruction. Not failing immediately means that it's possible to guard the instruction based on kernel version (or other conditions) and make sure it's not reachable. If there is a need to read signed integers that change sizes between different kernels, it's possible to use BPF_CORE_READ_BITFIELD() macro, which works both with bitfields and non-bitfield integers of any signedness and handles sign-extension properly. Also, bpf_core_read() with proper size and/or use of bpf_core_field_size() relocation could allow to deal with such complicated situations explicitly, if not so conventiently as direct memory reads. Selftests added in a separate patch in progs/test_core_autosize.c demonstrate both direct memory and probed use cases. BPF_CORE_READ() is not changed and it won't deal with such situations as automatically as direct memory reads due to the signedness integer limitations, which are much harder to detect and control with compiler macro magic. So it's encouraged to utilize direct memory reads as much as possible. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20201008001025.292064-3-andrii@kernel.org	2020-10-07 18:50:27 -07:00
Andrii Nakryiko	47f7cf6325	libbpf: Skip CO-RE relocations for not loaded BPF programs Bypass CO-RE relocations step for BPF programs that are not going to be loaded. This allows to have BPF programs compiled in and disabled dynamically if kernel is not supposed to provide enough relocation information. In such case, there won't be unnecessary warnings about failed relocations. Fixes: `d929758101` ("libbpf: Support disabling auto-loading BPF programs") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20201008001025.292064-2-andrii@kernel.org	2020-10-07 18:50:27 -07:00
Magnus Karlsson	80348d8867	libbpf: Fix compatibility problem in xsk_socket__create Fix a compatibility problem when the old XDP_SHARED_UMEM mode is used together with the xsk_socket__create() call. In the old XDP_SHARED_UMEM mode, only sharing of the same device and queue id was allowed, and in this mode, the fill ring and completion ring were shared between the AF_XDP sockets. Therefore, it was perfectly fine to call the xsk_socket__create() API for each socket and not use the new xsk_socket__create_shared() API. This behavior was ruined by the commit introducing XDP_SHARED_UMEM support between different devices and/or queue ids. This patch restores the ability to use xsk_socket__create in these circumstances so that backward compatibility is not broken. Fixes: `2f6324a393` ("libbpf: Support shared umems between queues and devices") Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/1602070946-11154-1-git-send-email-magnus.karlsson@gmail.com	2020-10-07 22:28:43 +02:00
Luigi Rizzo	8cee9107e7	bpf, libbpf: Use valid btf in bpf_program__set_attach_target bpf_program__set_attach_target(prog, fd, ...) will always fail when fd = 0 (attach to a kernel symbol) because obj->btf_vmlinux is NULL and there is no way to set it (at the moment btf_vmlinux is meant to be temporary storage for use in bpf_object__load_xattr()). Fix this by using libbpf_find_vmlinux_btf_id(). At some point we may want to opportunistically cache btf_vmlinux so it can be reused with multiple programs. Signed-off-by: Luigi Rizzo <lrizzo@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Petar Penkov <ppenkov@google.com> Link: https://lore.kernel.org/bpf/20201005224528.389097-1-lrizzo@google.com	2020-10-06 11:36:10 -07:00
Hangbin Liu	2c193d32ca	libbpf: Check if pin_path was set even map fd exist Say a user reuse map fd after creating a map manually and set the pin_path, then load the object via libbpf. In libbpf bpf_object__create_maps(), bpf_object__reuse_map() will return 0 if there is no pinned map in map->pin_path. Then after checking if map fd exist, we should also check if pin_path was set and do bpf_map__pin() instead of continue the loop. Fix it by creating map if fd not exist and continue checking pin_path after that. Suggested-by: Andrii Nakryiko <andrii.nakryiko@gmail.com> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20201006021345.3817033-3-liuhangbin@gmail.com	2020-10-06 11:10:20 -07:00
Hangbin Liu	a0f2b7acb4	libbpf: Close map fd if init map slots failed Previously we forgot to close the map fd if bpf_map_update_elem() failed during map slot init, which will leak map fd. Let's move map slot initialization to new function init_map_slots() to simplify the code. And close the map fd if init slot failed. Reported-by: Andrii Nakryiko <andrii.nakryiko@gmail.com> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20201006021345.3817033-2-liuhangbin@gmail.com	2020-10-06 11:10:20 -07:00
David S. Miller	8b0308fe31	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Rejecting non-native endian BTF overlapped with the addition of support for it. The rest were more simple overlapping changes, except the renesas ravb binding update, which had to follow a file move as well as a YAML conversion. Signed-off-by: David S. Miller <davem@davemloft.net>	2020-10-05 18:40:01 -07:00
Hao Luo	d370bbe121	bpf/libbpf: BTF support for typed ksyms If a ksym is defined with a type, libbpf will try to find the ksym's btf information from kernel btf. If a valid btf entry for the ksym is found, libbpf can pass in the found btf id to the verifier, which validates the ksym's type and value. Typeless ksyms (i.e. those defined as 'void') will not have such btf_id, but it has the symbol's address (read from kallsyms) and its value is treated as a raw pointer. Signed-off-by: Hao Luo <haoluo@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200929235049.2533242-3-haoluo@google.com	2020-10-02 14:59:25 -07:00
Andrii Nakryiko	9c6c5c48d7	libbpf: Make btf_dump work with modifiable BTF Ensure that btf_dump can accommodate new BTF types being appended to BTF instance after struct btf_dump was created. This came up during attemp to use btf_dump for raw type dumping in selftests, but given changes are not excessive, it's good to not have any gotchas in API usage, so I decided to support such use case in general. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200929232843.1249318-2-andriin@fb.com	2020-09-30 12:30:22 -07:00
Daniel Borkmann	0e9f6841f6	bpf, libbpf: Add bpf_tail_call_static helper for bpf programs Port of tail_call_static() helper function from Cilium's BPF code base [0] to libbpf, so others can easily consume it as well. We've been using this in production code for some time now. The main idea is that we guarantee that the kernel's BPF infrastructure and JIT (here: x86_64) can patch the JITed BPF insns with direct jumps instead of having to fall back to using expensive retpolines. By using inline asm, we guarantee that the compiler won't merge the call from different paths with potentially different content of r2/r3. We're also using Cilium's __throw_build_bug() macro (here as: __bpf_unreachable()) in different places as a neat trick to trigger compilation errors when compiler does not remove code at compilation time. This works for the BPF back end as it does not implement the __builtin_trap(). [0] `f5537c2602` Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/1656a082e077552eb46642d513b4a6bde9a7dd01.1601477936.git.daniel@iogearbox.net	2020-09-30 11:50:35 -07:00
Andrii Nakryiko	b0efc216f5	libbpf: Compile in PIC mode only for shared library case Libbpf compiles .o's for static and shared library modes separately, so no need to specify -fPIC for both. Keep it only for shared library mode. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20200929220604.833631-3-andriin@fb.com	2020-09-29 17:05:31 -07:00
Andrii Nakryiko	0a62291d69	libbpf: Compile libbpf under -O2 level by default and catch extra warnings For some reason compiler doesn't complain about uninitialized variable, fixed in previous patch, if libbpf is compiled without -O2 optimization level. So do compile it with -O2 and never let similar issue slip by again. -Wall is added unconditionally, so no need to specify it again. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20200929220604.833631-2-andriin@fb.com	2020-09-29 17:05:31 -07:00
Andrii Nakryiko	3343391345	libbpf: Fix uninitialized variable in btf_parse_type_sec Fix obvious unitialized variable use that wasn't reported by compiler. libbpf Makefile changes to catch such errors are added separately. Fixes: `3289959b97` ("libbpf: Support BTF loading and raw data output in both endianness") Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20200929220604.833631-1-andriin@fb.com	2020-09-29 17:05:31 -07:00
Toke Høiland-Jørgensen	a535909142	libbpf: Add support for freplace attachment in bpf_link_create This adds support for supplying a target btf ID for the bpf_link_create() operation, and adds a new bpf_program__attach_freplace() high-level API for attaching freplace functions with a target. Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/160138355387.48470.18026176785351166890.stgit@toke.dk	2020-09-29 13:09:24 -07:00
Andrii Nakryiko	3289959b97	libbpf: Support BTF loading and raw data output in both endianness Teach BTF to recognized wrong endianness and transparently convert it internally to host endianness. Original endianness of BTF will be preserved and used during btf__get_raw_data() to convert resulting raw data to the same endianness and a source raw_data. This means that little-endian host can parse big-endian BTF with no issues, all the type data will be presented to the client application in native endianness, but when it's time for emitting BTF to persist it in a file (e.g., after BTF deduplication), original non-native endianness will be preserved and stored. It's possible to query original endianness of BTF data with new btf__endianness() API. It's also possible to override desired output endianness with btf__set_endianness(), so that if application needs to load, say, big-endian BTF and store it as little-endian BTF, it's possible to manually override this. If btf__set_endianness() was used to change endianness, btf__endianness() will reflect overridden endianness. Given there are no known use cases for supporting cross-endianness for .BTF.ext, loading .BTF.ext in non-native endianness is not supported. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200929043046.1324350-3-andriin@fb.com	2020-09-29 12:21:23 -07:00
Andrii Nakryiko	9141f75a32	selftests/bpf: Test BTF writing APIs Add selftests for BTF writer APIs. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20200929020533.711288-4-andriin@fb.com	2020-09-28 19:17:13 -07:00
Andrii Nakryiko	f86ed050bc	libbpf: Add btf__str_by_offset() as a more generic variant of name_by_offset BTF strings are used not just for names, they can be arbitrary strings used for CO-RE relocations, line/func infos, etc. Thus "name_by_offset" terminology is too specific and might be misleading. Instead, introduce btf__str_by_offset() API which uses generic string terminology. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20200929020533.711288-3-andriin@fb.com	2020-09-28 19:17:13 -07:00
Andrii Nakryiko	4a3b33f857	libbpf: Add BTF writing APIs Add APIs for appending new BTF types at the end of BTF object. Each BTF kind has either one API of the form btf__add_<kind>(). For types that have variable amount of additional items (struct/union, enum, func_proto, datasec), additional API is provided to emit each such item. E.g., for emitting a struct, one would use the following sequence of API calls: btf__add_struct(...); btf__add_field(...); ... btf__add_field(...); Each btf__add_field() will ensure that the last BTF type is of STRUCT or UNION kind and will automatically increment that type's vlen field. All the strings are provided as C strings (const char *), not a string offset. This significantly improves usability of BTF writer APIs. All such strings will be automatically appended to string section or existing string will be re-used, if such string was already added previously. Each API attempts to do all the reasonable validations, like enforcing non-empty names for entities with required names, proper value bounds, various bit offset restrictions, etc. Type ID validation is minimal because it's possible to emit a type that refers to type that will be emitted later, so libbpf has no way to enforce such cases. User must be careful to properly emit all the necessary types and specify type IDs that will be valid in the finally generated BTF. Each of btf__add_<kind>() APIs return new type ID on success or negative value on error. APIs like btf__add_field() that emit additional items return zero on success and negative value on error. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20200929020533.711288-2-andriin@fb.com	2020-09-28 19:17:13 -07:00
Andrii Nakryiko	a871b04310	libbpf: Add btf__new_empty() to create an empty BTF object Add an ability to create an empty BTF object from scratch. This is going to be used by pahole for BTF encoding. And also by selftest for convenient creation of BTF objects. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20200926011357.2366158-7-andriin@fb.com	2020-09-28 17:27:32 -07:00
Andrii Nakryiko	919d2b1dbb	libbpf: Allow modification of BTF and add btf__add_str API Allow internal BTF representation to switch from default read-only mode, in which raw BTF data is a single non-modifiable block of memory with BTF header, types, and strings layed out sequentially and contiguously in memory, into a writable representation with types and strings data split out into separate memory regions, that can be dynamically expanded. Such writable internal representation is transparent to users of libbpf APIs, but allows to append new types and strings at the end of BTF, which is a typical use case when generating BTF programmatically. All the basic guarantees of BTF types and strings layout is preserved, i.e., user can get `struct btf_type *` pointer and read it directly. Such btf_type pointers might be invalidated if BTF is modified, so some care is required in such mixed read/write scenarios. Switch from read-only to writable configuration happens automatically the first time when user attempts to modify BTF by either adding a new type or new string. It is still possible to get raw BTF data, which is a single piece of memory that can be persisted in ELF section or into a file as raw BTF. Such raw data memory is also still owned by BTF and will be freed either when BTF object is freed or if another modification to BTF happens, as any modification invalidates BTF raw representation. This patch adds the first two BTF manipulation APIs: btf__add_str(), which allows to add arbitrary strings to BTF string section, and btf__find_str() which allows to find existing string offset, but not add it if it's missing. All the added strings are automatically deduplicated. This is achieved by maintaining an additional string lookup index for all unique strings. Such index is built when BTF is switched to modifiable mode. If at that time BTF strings section contained duplicate strings, they are not de-duplicated. This is done specifically to not modify the existing content of BTF (types, their string offsets, etc), which can cause confusion and is especially important property if there is struct btf_ext associated with struct btf. By following this "imperfect deduplication" process, btf_ext is kept consitent and correct. If deduplication of strings is necessary, it can be forced by doing BTF deduplication, at which point all the strings will be eagerly deduplicated and all string offsets both in struct btf and struct btf_ext will be updated. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20200926011357.2366158-6-andriin@fb.com	2020-09-28 17:27:31 -07:00
Andrii Nakryiko	7d9c71e10b	libbpf: Extract generic string hashing function for reuse Calculating a hash of zero-terminated string is a common need when using hashmap, so extract it for reuse. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20200926011357.2366158-5-andriin@fb.com	2020-09-28 17:27:31 -07:00
Andrii Nakryiko	192f5a1fe6	libbpf: Generalize common logic for managing dynamically-sized arrays Managing dynamically-sized array is a common, but not trivial functionality, which significant amount of logic and code to implement properly. So instead of re-implementing it all the time, extract it into a helper function ans reuse. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20200926011357.2366158-4-andriin@fb.com	2020-09-28 17:27:31 -07:00
Andrii Nakryiko	b86042478f	libbpf: Remove assumption of single contiguous memory for BTF data Refactor internals of struct btf to remove assumptions that BTF header, type data, and string data are layed out contiguously in a memory in a single memory allocation. Now we have three separate pointers pointing to the start of each respective are: header, types, strings. In the next patches, these pointers will be re-assigned to point to independently allocated memory areas, if BTF needs to be modified. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20200926011357.2366158-3-andriin@fb.com	2020-09-28 17:27:31 -07:00
Andrii Nakryiko	740e69c3c5	libbpf: Refactor internals of BTF type index Refactor implementation of internal BTF type index to not use direct pointers. Instead it uses offset relative to the start of types data section. This allows for types data to be reallocatable, enabling implementation of modifiable BTF. As now getting type by ID has an extra indirection step, convert all internal type lookups to a new helper btf_type_id(), that returns non-const pointer to a type by its ID. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20200926011357.2366158-2-andriin@fb.com	2020-09-28 17:27:31 -07:00

... 12 13 14 15 16 ...

1410 commits