If no thread was created yet, __pthread_sigstate will not find our ss
because self->kernel_thread is still nul, and then change the global
sigstate instead of our sigstate! We can directly call __sigthreadmask and
skip the (bogus) lookup step.
As shown in
https://sourceware.org/bugzilla/show_bug.cgi?id=25237
linker may generate an empty PT_LOAD segments at offset 0:
Elf file type is EXEC (Executable file)
Entry point 0x4000e8
There are 3 program headers, starting at offset 64
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
LOAD 0x0000000000000000 0x0000000000400000 0x0000000000400000
0x00000000000000f0 0x00000000000000f0 R E 0x1000
LOAD 0x0000000000000000 0x0000000000410000 0x0000000000410000
0x0000000000000000 0x0000000000b5dce8 RW 0x10000
GNU_STACK 0x0000000000000000 0x0000000000000000 0x0000000000000000
0x0000000000000000 0x0000000000000000 RW 0x10
Section to Segment mapping:
Segment Sections...
00 .text
01 .bss
02
Skip the empty PT_LOAD segment at offset 0 to support such binaries.
This fixes BZ #32763.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
btrfs has a 65535 maximum link count. Include this value in pathconf to
give the real max link count for this filesystem.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
powerpc was the only architecture with arch-specific hooks for
LD_SHOW_AUXV, and with the information moved to ld diagnostics there
is no need to keep the _dl_procinfo hook.
Checked with a build for all affected ABIs.
Reviewed-by: Peter Bergner <bergner@linux.ibm.com>
The _dl_string_platform is moved to hwcapinfo.h, since it is only used
by hwcapinfo.c and test-get_hwcap internal test.
Checked on powerpc64le-linux-gnu.
Reviewed-by: Peter Bergner <bergner@linux.ibm.com>
The ld.so diagnostics already prints AT_HWCAP values, but only in
hexadecimal. To avoid duplicating the strings, consolidate the
hwcap_names from cpu-features.h on a new file, dl-hwcap-info.h
(and it also improves the hwcap string description with more
values).
For future AT_HWCAP3/AT_HWCAP4 extensions, it is just a matter
to add them on dl-hwcap-info.c so both ld diagnostics and
tunable filtering will parse the new values.
Checked on powerpc64le-linux-gnu.
Reviewed-by: Peter Bergner <bergner@linux.ibm.com>
Add a new randomized strlen test similar to bench-random-memcpy. Instead of
repeating the same call to strlen over and over again, it times a large number
of different strings. The distribution of the string length and alignment is
based on SPEC2017.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Adjust sizes between 64KB and 16MB and iterations based on length.
Remove incorrect uses of alloc_bufs since we're not interested in measuring
Linux clear_page time. Use getpagesize() - 1 instead of 4095 when
aligning within a page.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The manual entry for sched_yield mentions that the function call could
be a nop if there are no other tasks with the same absolute priority.
Expand the explanation to include example schedulers on Linux so that
it's clear that sched_yield may not always result in a different task
being scheduled.
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: Joseph Myers <josmyers@redhat.com>
When GNU Binutils is configured with --enable-error-execstack=yes, a handful
of our tests which rely on -Wl,-z,execstack fail. Pass --Wl,--no-error-execstack
to override the behaviour and get a warning instead.
Bug: https://sourceware.org/PR32717
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
If attacker overwrites the bk_nextsize link in the first chunk of a
largebin that later has a smaller chunk inserted into it, malloc will
write a heap pointer into an attacker-controlled address [0].
This patch adds an integrity check to mitigate this attack.
[0]: https://github.com/shellphish/how2heap/blob/master/glibc_2.39/large_bin_attack.c
Signed-off-by: Ben Kallus <benjamin.p.kallus.gr@dartmouth.edu>
Reviewed-by: DJ Delorie <dj@redhat.com>
Remove unused _dl_hwcap_string defines. As a result many dl-procinfo.h headers
can be removed. This also removes target specific _dl_procinfo implementations
which only printed HWCAP strings using dl_hwcap_string.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The code now looks like:
fclass.s $fa2, $fa0
movfr2gr.s $t0, $fa2
slli.w $t0, $t0, 0x0
fclass.s $fa2, $fa1
movfr2gr.s $t1, $fa2
or $t0, $t0, $t1
andi $t0, $t0, 0x3
bnez $t0, 1f
fmin.s $fa0, $fa0, $fa1
ret
1:
fmul.s $fa0, $fa0, $fa1
ret
This looks really bad, with expensive movfr2gr instructions, redundant
sign-extensions and masking (arguably it's a compiler
missed-optimzation), and a branch. Rewrite it with inline assembly:
fcmp.cor.s $fcc0, $fa0, $fa0
fcmp.cor.s $fcc1, $fa1, $fa1
fsel $fa2, $fa0, $fa1, $fcc0
fsel $fa0, $fa1, $fa0, $fcc1
fmax.s $fa0, $fa2, $fa0
ret
Note that we cannot make it more readable with
"double a = __builtin_isnanf (x) ? y : x" because this C statement only
happens to produce what we want with https://gcc.gnu.org/PR66462, if
this bug is fixed in the future the generated code may change.
Signed-off-by: Xi Ruoyao <xry111@xry111.site>
Single-precision remainderf() and quad-precision remainderl()
implementation derived from Sun is affected by an issue when the result
is +-0. IEEE754 requires that if remainder(x, y) = 0, its sign shall be
that of x regardless of the rounding direction.
The implementation seems to have assumed that x - x = +0 in all
rounding modes, which is not the case. When rounding direction is
roundTowardNegative the sign of an exact zero sum (or difference) is −0.
Regression tests that triggered this erroneous behavior are added to
math/libm-test-remainder.inc.
Tested for cross riscv64 and powerpc.
Original fix by: Bruce Evans <bde@FreeBSD.org> in FreeBSD's
a2ddfa5ea726c56dbf825763ad371c261b89b7c7.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
A number of fma tests started to fail on hppa when gcc was changed to
use Ranger rather than EVRP. Eventually I found that the value of
a1 + u.d in this is block of code was being computed in FE_TOWARDZERO
mode and not the original rounding mode:
if (TININESS_AFTER_ROUNDING)
{
w.d = a1 + u.d;
if (w.ieee.exponent == 109)
return w.d * 0x1p-108;
}
This caused the exponent value to be wrong and the wrong return path
to be used.
Here we add an optimization barrier after the rounding mode is reset
to ensure that the previous value of a1 + u.d is not reused.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
In some cases, an IFUNC resolver may need to access the gp pointer to
access global variables. Such an object may have l_relocated == 0 at
this time. In this case, an IFUNC resolver will fail to access a global
variable and cause a SIGSEGV.
This patch fixes this issue by relaxing the check of l_relocated in
elf_machine_runtime_setup, but added a check for SHARED case to avoid
using this code in static-linked executables. Such object have already
set up the gp pointer in load_gp function and l->l_scope will be NULL if
it is a pie object. So if we use these code to set up the gp pointer
again for static-pie, it will causing a SIGSEGV in glibc as original bug
on BZ #31317.
I have also reproduced and checked BZ #31317 using the mold commit
bed5b1731b ("illumos: Treat absolute symbols specially"), this patch can
fix the issue.
Also, we used the wrong gp pointer previously because ref->st_value is
not the relocated address but just the offset from the base address of
ELF. An edge case may happen if we reference gp pointer in a IFUNC
resolver in a PIE object, but it will not happen in compiler-generated
codes since -pie will disable relax to gp. In this case, the GP will be
initialized incorrectly since the ref->st_value is not the address after
relocation. This patch fixes this issue by adding the l->l_addr to
ref->st_value to get the relocated address for the gp pointer. We don't
use SYMBOL_ADDRESS macro here because __global_pointer$ is a special
symbol that has SHN_ABS type, but it will use PC-relative addressing in
the load_gp function using lla.
Closes: BZ #32269
Fixes: 96d1b9ac23 ("RISC-V: Fix the static-PIE non-relocated object check")
Co-authored-by: Vivian Wang <dramforever@live.com>
Signed-off-by: Yangyu Chen <cyy@cyyself.name>
This series removes various ILP32 defines that are now
no longer needed.
Remove PTR_ARG/SIZE_ARG.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Improve performance of rand() and __random() by adding a single-threaded
fast path. Bench-random-lock shows about 5x speedup on Neoverse V1.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The number of iterations and the length of the string are not high
enough on some systems causing the test to return false-positives.
Testcase stdio-common/tst-fwrite-bz29459.c was fixed in the same way in
1b6f868625
(Increase the amount of data tested in stdio-common/tst-fwrite-bz29459.c, 2025-02-14)
Testcases stdio-common/tst-fwrite-bz29459.c and stdio-common/tst-fwrite-pipe.c
were introcued in 596a61cf6b
(libio: Start to return errors when flushing fwrite's buffer [BZ #29459], 2025-01-28)
Rewriting the cpuset macros test to cover more use cases and port the
tests to the new test infrastructure.
The use cases include bad actor access attempts, before and after the
CPU set structure.
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@redhat.com>
Refactor the support_next_to_fault and add the
support_next_to_fault_before method returns a buffer with a protected
page before it, to be able to test buffer underflow accesses.
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@redhat.com>
When compiling a program that includes <bits/floatn.h> using a clang version
between 3.4 (included) and 3.8.1 (included), clang will fail with `unknown type
name '__float128'; did you mean '__cfloat128'?`. This changes fixes the clang
prerequirements macro call in floatn.h to check for clang 3.9 instead of 3.4,
since support for __float128 was actually enabled in 3.9 by:
commit 50f29e06a1b6a38f0bba9360cbff72c82d46cdd4
Author: Nemanja Ivanovic <nemanja.i.ibm@gmail.com>
Date: Wed Apr 13 09:49:45 2016 +0000
Enable support for __float128 in Clang
This fixes bug 32694.
Signed-off-by: koraynilay <koray.fra@gmail.com>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
Due to the extensible nature of the rseq area we can't explictly
initialize fields that are not part of the ABI yet. It was agreed with
upstream that all new fields will be documented as zero initialized by
userspace. Future kernels configured with CONFIG_DEBUG_RSEQ will
validate the content of all fields during registration.
Replace the explicit field initialization with a memset of the whole
rseq area which will cover fields as they are added to future kernels.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Test that when we return from a function that enabled GCS at runtime
we get SIGSEGV. Also test that ucontext contains GCS block with the
GCS pointer.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
These tests validate that GCS tunable works as expected depending
on the GCS markings in the test binaries.
Tests validate both static and dynamically linked binaries.
These new tests are AArch64 specific. Moreover, they are included only
if linker supports the "-z gcs=<value>" option. If built, these tests
will run on systems with and without HWCAP_GCS. In the latter case the
tests will be reported as UNSUPPORTED.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
- Add check that linker supports -z gcs=...
- Add checks that main and test compiler support
-mbranch-protection=gcs
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This fixes the check-safety.sh failure with commit
ad9c4c5361, and correctly marks
the function AS-unsafe and AC-unsafe due to the use of the
non-recursive lock.
Tested on x86_64 without regressions.
Reviewed-by: Frédéric Bérat <fberat@redhat.com>
Add SVE memset based on the generic memset with predicated load for sizes < 16.
Unaligned memsets of 128-1024 are improved by ~20% on average by using aligned
stores for the last 64 bytes. Performance of random memset benchmark improves
by ~2% on Neoverse V1.
Reviewed-by: Yury Khrustalev <yury.khrustalev@arm.com>
Intel compiler always defines __INTEL_LLVM_COMPILER. When SYCL is
enabled by -fsycl, it also defines SYCL_LANGUAGE_VERSION. Since Intel
SYCL compiler doesn't support _Float128:
https://github.com/intel/llvm/issues/16903
define __HAVE_FLOAT128 to 0 for Intel SYCL compiler.
This fixes BZ #32723.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
The syscall pkey_alloc can return ENOSPC to indicate either that all
keys are in use or that the system runs in a mode in which memory
protection keys are disabled. In such case the test should not fail and
just return unsupported.
This matches the behaviour of the generic tst-pkey.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
The number of iterations and the length of the string are not high
enough on some systems causing the test to return false-positives.
Fixes: 596a61cf6b (libio: Start to return errors when flushing fwrite's buffer [BZ #29459], 2025-01-28)
Reported-by: Florian Weimer <fweimer@redhat.com>
If an auditor loads many TLS-using modules during startup, it is
possible to trigger DTV resizing. Previously, the DTV was marked
as allocated by the main malloc afterwards, even if the minimal
malloc was still in use. With this change, _dl_resize_dtv marks
the resized DTV as allocated with the minimal malloc.
The new test reuses TLS-using modules from other auditing tests.
Reviewed-by: DJ Delorie <dj@redhat.com>
Move the initialization code to a general place instead of keeping it
specific to file-backed streams.
Fixes: 596a61cf6b (libio: Start to return errors when flushing fwrite's buffer [BZ #29459], 2025-01-28)
Reported-by: Florian Weimer <fweimer@redhat.com>
Reviewed-by: Arjun Shankar <arjun@redhat.com>
By overwriting a forward link in a fastbin chunk that is subsequently
moved into the tcache, it's possible to get malloc to return an
arbitrary address [0].
When a chunk is fetched from a fastbin, its size is checked against the
expected chunk size for that fastbin (see malloc.c:3991). This patch
adds a similar check for chunks being moved from a fastbin to tcache,
which renders obsolete the exploitation technique described above.
Now updated to use __glibc_unlikely instead of __builtin_expect, as
requested.
[0]: https://github.com/shellphish/how2heap/blob/master/glibc_2.39/fastbin_reverse_into_tcache.c
Signed-off-by: Ben Kallus <benjamin.p.kallus.gr@dartmouth.edu>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Make sure that numbers never overflow uint32_t in inet_network to
properly validate octets encountered in IPv4 addresses.
Avoid malloca in NSS networks file code because /etc/networks lines
can be arbitrarily long. Instead of handcrafting the input for
inet_network by adding ".0" octets if they are missing, just left shift
the result. Also, do not accept invalid entries, but ignore the line
instead.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Signed-off-by: Tobias Stoeckmann <tobias@stoeckmann.org>
Reduce number of MOV/MOVPRFXs and use unpredicated FMUL.
Replace MUL with LSL. Speedup on Neoverse V1: 6%.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Use unpredicted muls, and improve memory access.
7%, 3% and 1% improvement in throughput microbenchmark on Neoverse V1,
for exp, exp2 and cosh respectively.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Use unpredicated muls, use lanewise mla's and improve memory access.
1% regression in throughput microbenchmark on Neoverse V1.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
GCC aligns global data to 16 bytes if their size is >= 16 bytes. This patch
changes the exp_data struct slightly so that the fields are better aligned
and without gaps. As a result on targets that support them, more load-pair
instructions are used in exp. Exp10 is improved by moving invlog10_2N later
so that neglog10_2hiN and neglog10_2loN can be loaded using load-pair.
The exp benchmark improves 2.5%, "144bits" by 7.2%, "768bits" by 12.7% on
Neoverse V2. Exp10 improves by 1.5%.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Use the __progname symbol to override the program name to induce the
failure that CVE-2025-0395 describes.
This is related to BZ #32582
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The libm size improvement built with "--enable-stack-protector=strong
--enable-bind-now=yes --enable-fortify-source=2":
Before:
text data bss dec hex filename
585192 860 12 586064 8f150 aarch64-linux-gnu/math/libm.so
960775 1068 12 961855 ead3f x86_64-linux-gnu/math/libm.so
1189174 5544 368 1195086 123c4e powerpc64le-linux-gnu/math/libm.so
After:
text data bss dec hex filename
584952 860 12 585824 8f060 aarch64-linux-gnu/math/libm.so
960615 1068 12 961695 eac9f x86_64-linux-gnu/math/libm.so
1189078 5544 368 1194990 123bee powerpc64le-linux-gnu/math/libm.so
The are small code changes for x86_64 and powerpc64le, which do not
affect performance; but on aarch64 with gcc-14 I see a slight better
code generation due the usage of ldq for floating point constant loading.
Reviewed-by: Andreas K. Huettel <dilfridge@gentoo.org>
The libm size improvement built with "--enable-stack-protector=strong
--enable-bind-now=yes --enable-fortify-source=2":
Before:
text data bss dec hex filename
587304 860 12 588176 8f990 aarch64-linux-gnu-master/math/libm.so
962855 1068 12 963935 eb55f x86_64-linux-gnu-master/math/libm.so
1191222 5544 368 1197134 12444e powerpc64le-linux-gnu-master/math/libm.so
After:
text data bss dec hex filename
585192 860 12 586064 8f150 aarch64-linux-gnu/math/libm.so
960775 1068 12 961855 ead3f x86_64-linux-gnu/math/libm.so
1189174 5544 368 1195086 123c4e powerpc64le-linux-gnu/math/libm.so
The are small code changes for x86_64 and powerpc64le, which do not
affect performance; but on aarch64 with gcc-14 I see a slight better
code generation due the usage of ldq for floating point constant loading.
Reviewed-by: Andreas K. Huettel <dilfridge@gentoo.org>
Based on auditing all the signals and source trees for Hurd and
Linux...
SIGSYS - This is not used for a bad system call (ENOSYS is used
for that). This is used by SECCOMP and some cases where an invalid
sub-function was requested.
SIGSTKFLT - Note it used to be a coprocessor stack fault but is now
obsolete and available for general user use.
SIGLOST - Hurd only now; note that its original purpose as an NFS
lock lost signal is obsolete.
SIGPWR - Note this is for power lost *and* power restored, and is
more a user-mode signal than a kernel-generated signal.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
__LP64__ is a GCC extension and shouldn't be used in an installed
header.
Fixes: 596a61cf6b (libio: Start to return errors when flushing fwrite's buffer [BZ #29459], 2025-01-28)
Reported-by: Florian Weimer <fweimer@redhat.com>
Reviewed-by: Arjun Shankar <arjun@redhat.com>
Code used during early static startup in elf/dl-tls.c uses
__mempcpy.
Fixes commit cbd9fd2369 ("Consolidate
TLS block allocation for static binaries with ld.so").
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This is required when building for powerpc64le POWER8 with GCC 8
at least.
Fixes commit cbd9fd2369 ("Consolidate
TLS block allocation for static binaries with ld.so").
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Tweak the @manpageurl macro to customize the output for
each of html, info, and pdf output. HTML and PDF (at
least, these days) support clicking on the link title,
whereas info does not. Add text to the intro section
explaining which man pages are normative and which
aren't.
The functions serve very similar purposes. The advantage of
__rtld_libc_freeres is that it is located within ld.so, so it is
more natural to poke at link map internals there.
This slightly regresses cleanup capabilities for statically linked
binaries. If that becomes a problem, we should start calling
__rtld_libc_freeres from __libc_freeres (perhaps after renaming it).
It's not necessary to introduce temporaries because the compiler
is able to evaluate l_soname just once in constracts like:
l_soname (l) != NULL && strcmp (l_soname (l), LIBC_SO) != 0
This reduces code size and dependencies on ld.so internals from
libc.so.
Fixes commit f4c142bb9f
("arm: Use _dl_find_object on __gnu_Unwind_Find_exidx (BZ 31405)").
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
sysdeps/pthread/sem_open.c: call pthread_setcancelstate directely
since forward declaration is gone on hurd too
Message-ID: <20250201080202.494671-1-gfleury@disroot.org>
The POSIX Semaphores functions are currently undocumented in our info
pages. This commit adds links to the man-pages documentation for all
the `sem_*' functions (except `sem_clockwait') so that they refer to
some useful documentation instead of just being stubs. `sem_clockwait'
isn't documented by man-pages but thankfully already has a small useful
blurb in our own docs.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This commit moves the `sem_*' family of functions from the IPC chapter,
replacing them with a reference to their new location in the Threads
chapter. `sem_clockwait' is also moved out of the Non-POSIX Extensions
subsection since it is now included in the standard since Issue 8:
https://pubs.opengroup.org/onlinepubs/9799919799/functions/sem_clockwait.html
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Decorate BSS mappings with [anon: glibc: .bss <file>], for example
[anon: glibc: .bss /lib/libc.so.6]. The string ".bss" is already used
by bionic so use the same, but add the filename as well. If the name
would be longer than what the kernel allows, drop the directory part
of the path.
Refactor glibc.mem.decorate_maps check to a separate function and use
it to avoid assembling a name, which would not be used later.
Signed-off-by: Petr Malat <oss@malat.biz>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Linux 6.13 (662df3e5c3766) added a lightweight way to define guard areas
through madvise syscall. Instead of PROT_NONE the guard region through
mprotect, userland can madvise the same area with a special flag, and
the kernel ensures that accessing the area will trigger a SIGSEGV (as for
PROT_NONE mapping).
The madvise way has the advantage of less kernel memory consumption for
the process page-table (one less VMA per guard area), and slightly less
contention on kernel (also due to the fewer VMA areas being tracked).
The pthread_create allocates a new thread stack in two ways: if a guard
area is set (the default) it allocates the memory range required using
PROT_NONE and then mprotect the usable stack area. Otherwise, if a
guard page is not set it allocates the region with the required flags.
For the MADV_GUARD_INSTALL support, the stack area region is allocated
with required flags and then the guard region is installed. If the
kernel does not support it, the usual way is used instead (and
MADV_GUARD_INSTALL is disabled for future stack creations).
The stack allocation strategy is recorded on the pthread struct, and it
is used in case the guard region needs to be resized. To avoid needing
an extra field, the 'user_stack' is repurposed and renamed to 'stack_mode'.
This patch also adds a proper test for the pthread guard.
I checked on x86_64, aarch64, powerpc64le, and hppa with kernel 6.13.0-rc7.
Reviewed-by: DJ Delorie <dj@redhat.com>
Set stack size attribute to the size of the mmap'd region only
when the size of the remaining stack space is less than the size
of the mmap'd region.
This was reversed. As a result, the initial stack size was only
135168 bytes. On architectures where the stack grows down, the
initial stack size is approximately 8384512 bytes with the default
rlimit settings. The small main stack size on hppa broke
applications like ruby that check for stack overflows.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Operation systems which represent text files in a line-oriented
fashion (and not as byte streams with a character sequence reserved
for line termination) logically cannot flush a buffer without
also creating a terminated line.
Update this portability note and move it to the Binary Streams
section. Add another related compatibility concern, too.
I haven't exposed _pthread_mutex_lock, _pthread_mutex_trylock and
_pthread_mutex_unlock in GLIBC_PRIVATE since there aren't used in any
code in libpthread
Message-ID: <20250103103750.870897-3-gfleury@disroot.org>
Having fixed several bugs relating to flushing of FILE* streams (with
fflush and other operations) and their offsets (both the file position
indicator in the FILE*, and the offset in the underlying open file
description), especially after ungetc but not limited to that case,
add a test that more systematically covers different combinations of
cases for such issues, with 57220 separate scenarios tested (which
include examples of all the five separate fixed bugs), all of which
pass given the five previous bug fixes.
Tested for x86_64.
As discussed in bug 32535, fflush fails on files opened for reading
using mmap after ungetc. Fix the logic to handle this case and still
compute the file offset correctly.
Tested for x86_64.
As discussed in bug 32529, fseek fails on files opened for reading
using mmap after ungetc. The implementation of fseek for such files
has an offset computation that's also incorrect after fflush. A
combined fix addresses both problems (with tests for both included as
well) and it seems reasonable to consider them a single bug.
Tested for x86_64.
As discussed in bug 32369 and required by POSIX, the POSIX feature
fflush (NULL) should flush input files, not just output files. The
POSIX requirement is that "fflush() shall perform this flushing action
on all streams for which the behavior is defined above", and the
definition for input files is for "a stream open for reading with an
underlying file description, if the file is not already at EOF, and
the file is one capable of seeking".
Implement this requirement in glibc. (The underlying flushing
implementation is what deals with avoiding errors for seeking on an
unseekable file.)
Tested for x86_64.
As discussed in bug 12724 and required by POSIX, before an input file
(based on an underlying seekable file descriptor) is closed, fclose is
sometimes required to seek that file descriptor to the correct offset,
so that any other file descriptors sharing the underlying open file
description are left at that offset (as a motivating example, a script
could call a sequence of commands each of which processes some data
from (seekable) stdin using stdio; fclose needs to do this so that
each successive command can read exactly the data not handled by
previous commands), but glibc fails to do this.
The precise POSIX wording has changed a few times; in the 2024 edition
it's "If the file is not already at EOF, and the file is one capable
of seeking, the file offset of the underlying open file description
shall be set to the file position of the stream if the stream is the
active handle to the underlying file description.".
Add appropriate logic to _IO_new_file_close_it to handle this case. I
haven't made any attempt to test or change things in this area for the
"old" functions.
Note that there was a previous attempt to fix bug 12724, reverted in
commit eb6cbd249f. The fix version here
addresses the original test in that bug report without breaking the
one given in a subsequent comment in that bug report (which works with
glibc before the patch, but maybe was broken by the original fix that
was reverted).
The logic here tries to take care not to seek the file, even to its
newly computed current offset, if at EOF / possibly not the active
handle; even seeking to the current offset would be problematic
because of a potential race (fclose computes the current offset,
another thread or process with the active handle does its own seek,
fclose does a seek (not permitted by POSIX in this case) that loses
the effect of the seek on the active handle in another thread or
process). There are tests included for various cases of being or not
being the active handle, though there aren't tests for the potential
race condition.
Tested for x86_64.
As discussed in bug 5994 (plus duplicates), POSIX requires fflush
after ungetc to discard pushed-back characters but preserve the file
position indicator. For this purpose, each ungetc decrements the file
position indicator by 1; it is unspecified after ungetc at the start
of the file, and after ungetwc, so no special handling is needed for
either of those cases.
This is fixed with appropriate logic in _IO_new_file_sync. I haven't
made any attempt to test or change things in this area for the "old"
functions; the case of files using mmap is addressed in a subsequent
patch (and there seem to be no problems in this area with files opened
with fmemopen).
Tested for x86_64.
Test if the file-position is correctly updated when fwrite tries to
flush its internal cache but is not able to completely write all items.
Reviewed-by: DJ Delorie <dj@redhat.com>
When an error happens, fwrite is expected to return a value that is less
than nmemb. If this error happens while flushing its internal buffer,
fwrite is in a complex scenario: all the data might have been written to
the buffer, indicating a successful copy, but the buffer is expected to
be flushed and it was not.
POSIX.1-2024 states the following about errors on fwrite:
If an error occurs, the resulting value of the file-position indicator
for the stream is unspecified.
The fwrite() function shall return the number of elements successfully
written, which may be less than nitems if a write error is encountered.
With that in mind, this commit modifies _IO_new_file_write in order to
return the total number of bytes written via the file pointer. It also
modifies fwrite in order to use the new information and return the
correct number of bytes written even when sputn returns EOF.
Add 2 tests:
1. tst-fwrite-bz29459: This test is based on the reproducer attached to
bug 29459. In order to work, it requires to pipe stdout to another
process making it hard to reuse test-driver.c. This code is more
specific to the issue reported.
2. tst-fwrite-pipe: Recreates the issue by creating a pipe that is shared
with a child process. Reuses test-driver.c. Evaluates a more generic
scenario.
Co-authored-by: Florian Weimer <fweimer@redhat.com>
Reviewed-by: DJ Delorie <dj@redhat.com>
Adding some basic tests for fopen, testing different modes, stream
positioning and concurrent read/write operation on files.
Reviewed-by: DJ Delorie <dj@redhat.com>
When gawk was not built with MPFR, there's no mtrace output and those
tests FAIL. But we should make them UNSUPPORTED like other
tst-printf-format-* tests in the case.
Signed-off-by: Xi Ruoyao <xry111@xry111.site>
Reviewed-by: Sam James <sam@gentoo.org>
Reviewed-by: Andreas K Hüttel <dilfridge@gentoo.org>
Add a test for setenv with updated environ. Verify that BZ #32588 is
fixed.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Many more files from the CORE-MATH have been added. Also update the
authors and copyright years.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
posix/getaddrinfo.c got moved into nss/getaddrinfo.c in commit
7f602256ab ("Move getaddrinfo from 'posix' into 'nss'")
inet/getnameinfo.c got moved into nss/getnameinfo.c in commit
2f1c 6652 d7b3 ("Move getnameinfo from 'inet' to 'nss'")
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The corresponding files are gone with the IA64 removal in commit
460860f457 ("Remove ia64-linux-gnu").
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
For the originally failing application (userhelper from usermode),
it is not actually necessary to call realloc on the environ
pointer. Yes, there will be a memory leak because the application
assigns a heap-allocated pointer to environ that it never frees,
but this leak was always there: the old realloc-based setenv had
a hidden internal variable, last_environ, that was used in a similar
way to __environ_array_list. The application is not impacted by
the leak anyway because the relevant operations do not happen in
a loop.
The change here just uses a separte heap allocation and points
environ to that. This means that if an application calls
free (environ) and restores the environ pointer to the value
at process start, and does not modify the environment further,
nothing bad happens.
This change should not invalidate any previous testing that went into
the original getenv thread safety change, commit 7a61e7f557
("stdlib: Make getenv thread-safe in more cases").
The new test cases are modeled in part on the env -i use case from
bug 32588 (with !DO_MALLOC && !DO_EARLY_SETENV), and the previous
stdlib/tst-setenv-malloc test. The DO_MALLOC && !DO_EARLY_SETENV
case in the new test should approximate what userhelper from the
usermode package does.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Some applications set environ to a heap-allocated pointer, call
setenv (expecting it to call realloc), free environ, and then
restore the original environ pointer. This breaks after
commit 7a61e7f557 ("stdlib: Make
getenv thread-safe in more cases") because after the setenv call,
the environ pointer does not point to the start of a heap allocation.
Instead, setenv creates a separate allocation and changes environ
to point into that. This means that the free call in the application
results in heap corruption.
The interim approach was more compatible with other libcs because
it does not assume that the incoming environ pointer is allocated
as if by malloc (if it was written by the application). However,
it seems to be more important to stay compatible with previous
glibc version: assume the incoming pointer is heap allocated,
and preserve this property after setenv calls.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
be ca cs da de el eo es fi fr gl hr hu ia id it ja ka ko lt nb nl pl pt ro ru rw sk sl sr sv tr uk vi zh_CN zh_TW
Signed-off-by: Andreas K. Hüttel <dilfridge@gentoo.org>
Include the space needed to store the length of the message itself, in
addition to the message string. This resolves BZ #32582.
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed: Adhemerval Zanella <adhemerval.zanella@linaro.org>
As the test comment explains, this test is not quite valid, but
preserving the exact sequences helps distributions to port to
newer glibc versions. We can remove this test if we ever switch
to a different implementation.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
Commit d5bceac99d changed the sequence
of random numbers. This was completely unintended. The statistical
properties of the new sequences are unclear, so restore the old
behavior.
Fixes commit d5bceac99d ("stdlib:
random_r: fix unaligned access in initstate and initstate_r
[BZ #30584]").
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
Older versions such as binutils 2.35.2 do not recognize
PT_GNU_PROPERTY.
Fixes commit d3f2b71ef1
("aarch64: Fix tests not compatible with targets supporting GCS").
Linux 6.13 was released with a change that overwrites those bytes.
This means that the check_unused subtest fails.
Update the manual accordingly.
Tested-by: Xi Ruoyao <xry111@xry111.site>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
- Add GCS marking to some of the tests when target supports GCS
- Fix tst-ro-dynamic-mod.map linker script to avoid removing
GNU properties
- Add header with macros for GNU properties
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Allocate GCS based on the stack size, this can be used for coroutines
(makecontext) and thread creation (if the kernel allows user allocated
GCS).
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Unlike for BTI, the kernel does not process GCS properties so update
GL(dl_aarch64_gcs) before the GCS status is set.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
check_gcs is called for each dependency of a DSO, but the GNU property
of the ld.so is not processed so ldso->l_mach.gcs may not be correct.
Just assume ld.so is GCS compatible independently of the ELF marking.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
- Handle GCS marking
- Use l_searchlist.r_list for gcs (allows using the
same function for static exe)
Co-authored-by: Yury Khrustalev <yury.khrustalev@arm.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Use the dynamic linker start code to enable GCS in the dynamic linked
case after _dl_start returns and before _dl_start_user which marks
the point after which user code may run.
Like in the static linked case this ensures that GCS is enabled on a
top level stack frame.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Use the ARCH_SETUP_TLS hook to enable GCS in the static linked case.
The system call must be inlined and then GCS is enabled on a top
level stack frame that does not return and has no exception handlers
above it.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This tunable controls Guarded Control Stack (GCS) for the process.
0 = disabled: do not enable GCS
1 = enforced: check markings and fail if any binary is not marked
2 = optional: check markings but keep GCS off if a binary is unmarked
3 = override: enable GCS, markings are ignored
By default it is 0, so GCS is disabled, value 1 will enable GCS.
The status is stored into GL(dl_aarch64_gcs) early and only applied
later, since enabling GCS is tricky: it must happen on a top level
stack frame. Using GL instead of GLRO because it may need updates
depending on loaded libraries that happen after readonly protection
is applied, however library marking based GCS setting is not yet
implemented.
Describe new tunable in the manual.
Co-authored-by: Yury Khrustalev <yury.khrustalev@arm.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Changed the makecontext logic: previously the first setcontext jumped
straight to the user callback function and the return address is set
to __startcontext. This does not work when GCS is enabled as the
integrity of the return address is protected, so instead the context
is setup such that setcontext jumps to __startcontext which calls the
user callback (passed in x20).
The map_shadow_stack syscall is used to allocate a suitably sized GCS
(which includes some reserved area to account for altstack signal
handlers and otherwise supports maximum number of 16 byte aligned
stack frames on the given stack) however the GCS is never freed as
the lifetime of ucontext and related stack is user managed.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Userspace ucontext needs to store GCSPR, it does not have to be
compatible with the kernel ucontext. For now we use the linux
struct gcs_context layout but only use the gcspr field from it.
Similar implementation to the longjmp code, supports switching GCS
if the target GCS is capped, and unwinding a continuous GCS to a
previous state.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
This implementations ensures that longjmp across different stacks
works: it scans for GCS cap token and switches GCS if necessary
then the target GCSPR is restored with a GCSPOPM loop once the
current GCSPR is on the same GCS.
This makes longjmp linear time in the number of jumped over stack
frames when GCS is enabled.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
The target specific internal __longjmp is called with a __jmp_buf
argument which has its size exposed in the ABI. On aarch64 this has
no space left, so GCSPR cannot be restored in longjmp in the usual
way, which is needed for the Guarded Control Stack (GCS) extension.
setjmp is implemented via __sigsetjmp which has a jmp_buf argument
however it is also called with __pthread_unwind_buf_t argument cast
to jmp_buf (in cancellation cleanup code built with -fno-exception).
The two types, jmp_buf and __pthread_unwind_buf_t, have common bits
beyond the __jmp_buf field and there is unused space there which we
can use for saving GCSPR.
For this to work some bits of those two generic types have to be
reserved for target specific use and the generic code in glibc has
to ensure that __longjmp is always called with a __jmp_buf that is
embedded into one of those two types. Morally __longjmp should be
changed to take jmp_buf as argument, but that is an intrusive change
across targets.
Note: longjmp is never called with __pthread_unwind_buf_t from user
code, only the internal __libc_longjmp is called with that type and
thus the two types could have separate longjmp implementations on a
target. We don't rely on this now (but might in the future given that
cancellation unwind does not need to restore GCSPR).
Given the above this patch finds an unused slot for GCSPR. This
placement is not exposed in the ABI so it may change in the future.
This is also very target ABI specific so the generic types cannot
be easily changed to clearly mark the reserved fields.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
The Guarded Control Stack instructions can be present even if the
hardware does not support the extension (runtime checked feature),
so the asm code should be backward compatible with old assemblers.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
The LSB of g_signals was unused. The LSB of g1_start was used to indicate
which group is G2. This was used to always go to sleep in pthread_cond_wait
if a waiter is in G2. A comment earlier in the file says that this is not
correct to do:
"Waiters cannot determine whether they are currently in G2 or G1 -- but they
do not have to because all they are interested in is whether there are
available signals"
I either would have had to update the comment, or get rid of the check. I
chose to get rid of the check. In fact I don't quite know why it was there.
There will never be available signals for group G2, so we didn't need the
special case. Even if there were, this would just be a spurious wake. This
might have caught some cases where the count has wrapped around, but it
wouldn't reliably do that, (and even if it did, why would you want to force a
sleep in that case?) and we don't support that many concurrent waiters
anyway. Getting rid of it allows us to use one more bit, making us more
robust to wraparound.
Signed-off-by: Malte Skarupke <malteskarupke@fastmail.fm>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
This function no longer waits for threads to leave g1, so rename it to
__condvar_switch_g1
Signed-off-by: Malte Skarupke <malteskarupke@fastmail.fm>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
In my previous change I turned a nested loop into a simple loop. I'm doing
the resulting indentation changes in a separate commit to make the diff on
the previous commit easier to review.
Signed-off-by: Malte Skarupke <malteskarupke@fastmail.fm>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
The loop was a little more complicated than necessary. There was only one
break statement out of the inner loop, and the outer loop was nearly empty.
So just remove the outer loop, moving its code to the one break statement in
the inner loop. This allows us to replace all gotos with break statements.
Signed-off-by: Malte Skarupke <malteskarupke@fastmail.fm>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
This variable used to be needed to wait in group switching until all sleepers
have confirmed that they have woken. This is no longer needed. Nothing waits
on this variable so there is no need to track how many threads are currently
asleep in each group.
Signed-off-by: Malte Skarupke <malteskarupke@fastmail.fm>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
pthread_cond_wait was checking whether it was in a closed group no less than
four times. Checking once is enough. Here are the four checks:
1. While spin-waiting. This was dead code: maxspin is set to 0 and has been
for years.
2. Before deciding to go to sleep, and before incrementing grefs: I kept this
3. After incrementing grefs. There is no reason to think that the group would
close while we do an atomic increment. Obviously it could close at any
point, but that doesn't mean we have to recheck after every step. This
check was equally good as check 2, except it has to do more work.
4. When we find ourselves in a group that has a signal. We only get here after
we check that we're not in a closed group. There is no need to check again.
The check would only have helped in cases where the compare_exchange in the
next line would also have failed. Relying on the compare_exchange is fine.
Removing the duplicate checks clarifies the code.
Signed-off-by: Malte Skarupke <malteskarupke@fastmail.fm>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
This wake is unnecessary. We only switch groups after every sleeper in a group
has been woken. Sure, they may take a while to actually wake up and may still
hold a reference, but waking them a second time doesn't speed that up. Instead
this just makes the code more complicated and may hide problems.
In particular this safety wake wouldn't even have helped with the bug that was
fixed by Barrus' patch: The bug there was that pthread_cond_signal would not
switch g1 when it should, so we wouldn't even have entered this code path.
Signed-off-by: Malte Skarupke <malteskarupke@fastmail.fm>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Some comments were wrong after the most recent commit. This fixes that.
Also fixing indentation where it was using spaces instead of tabs.
Signed-off-by: Malte Skarupke <malteskarupke@fastmail.fm>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
This fixes the lost wakeup (from a bug in signal stealing) with a change
in the usage of g_signals[] in the condition variable internal state.
It also completely eliminates the concept and handling of signal stealing,
as well as the need for signalers to block to wait for waiters to wake
up every time there is a G1/G2 switch. This greatly reduces the average
and maximum latency for pthread_cond_signal.
The g_signals[] field now contains a signal count that is relative to
the current g1_start value. Since it is a 32-bit field, and the LSB is
still reserved (though not currently used anymore), it has a 31-bit value
that corresponds to the low 31 bits of the sequence number in g1_start.
(since g1_start also has an LSB flag, this means bits 31:1 in g_signals
correspond to bits 31:1 in g1_start, plus the current signal count)
By making the signal count relative to g1_start, there is no longer
any ambiguity or A/B/A issue, and thus any checks before blocking,
including the futex call itself, are guaranteed not to block if the G1/G2
switch occurs, even if the signal count remains the same. This allows
initially safely blocking in G2 until the switch to G1 occurs, and
then transitioning from G1 to a new G1 or G2, and always being able to
distinguish the state change. This removes the race condition and A/B/A
problems that otherwise ocurred if a late (pre-empted) waiter were to
resume just as the futex call attempted to block on g_signal since
otherwise there was no last opportunity to re-check things like whether
the current G1 group was already closed.
By fixing these issues, the signal stealing code can be eliminated,
since there is no concept of signal stealing anymore. The code to block
for all waiters to exit g_refs can also be removed, since any waiters
that are still in the g_refs region can be guaranteed to safely wake
up and exit. If there are still any left at this time, they are all
sent one final futex wakeup to ensure that they are not blocked any
longer, but there is no need for the signaller to block and wait for
them to wake up and exit the g_refs region.
The signal count is then effectively "zeroed" but since it is now
relative to g1_start, this is done by advancing it to a new value that
can be observed by any pending blocking waiters. Any late waiters can
always tell the difference, and can thus just cleanly exit if they are
in a stale G1 or G2. They can never steal a signal from the current
G1 if they are not in the current G1, since the signal value that has
to match in the cmpxchg has the low 31 bits of the g1_start value
contained in it, and that's first checked, and then it won't match if
there's a G1/G2 change.
Note: the 31-bit sequence number used in g_signals is designed to
handle wrap-around when checking the signal count, but if the entire
31-bit wraparound (2 billion signals) occurs while there is still a
late waiter that has not yet resumed, and it happens to then match
the current g1_start low bits, and the pre-emption occurs after the
normal "closed group" checks (which are 64-bit) but then hits the
futex syscall and signal consuming code, then an A/B/A issue could
still result and cause an incorrect assumption about whether it
should block. This particular scenario seems unlikely in practice.
Note that once awake from the futex, the waiter would notice the
closed group before consuming the signal (since that's still a 64-bit
check that would not be aliased in the wrap-around in g_signals),
so the biggest impact would be blocking on the futex until the next
full wakeup from a G1/G2 switch.
Signed-off-by: Frank Barrus <frankbarrus_sw@shaggy.cc>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
The new test elf/tst-rseq-tls-range-4096-static reliably detected
the extra TLS allocation problem (tcb_offset was dropped from
the allocation size) on aarch64. It also failed with a crash
in dlopen *before* the extra TLS changes, so TLS alignment with
static dlopen was already broken.
Reviewed-by: Michael Jeanson <mjeanson@efficios.com>
Use the same code to compute the TLS block size and its alignment.
The code in elf/dl-tls.c is linked in anyway for all binaries
due to the reference to _dl_tls_static_surplus_init.
It is not possible to call _dl_allocate_tls_storage directly
because malloc is not available in the static case. (The
dynamic linker uses the minimal malloc at this stage.) Therefore,
split _dl_tls_block_size_with_pre and _dl_tls_block_align from
_dl_allocate_tls_storage, and call those new functions from
__libc_setup_tls.
This fixes extra TLS allocation for the static case, and apparently
some pre-existing bugs as well (the independent recomputation of
TLS block sizes in init_static_tls looks rather suspect).
Fixes commit 0e411c5d30 ("Add generic
'extra TLS'").
The old code used the slotinfo array as a scratch area to pass the
list of TLS-using objects to _dl_determine_tlsoffset. All array
entries are subsequently overwritten by _dl_add_to_slotinfo,
except the first one. The link maps are usually not at their
right position for their module ID in the slotinfo array, so
the initial use of the slotinfo array would be incorrect if not
for scratch purposes only.
In _dl_tls_initial_modid_limit_setup, the old code relied that
some link map was written to the first slotinfo entry. After the
change, this no longer happens because TLS module ID zero is unused.
It's also necessary to move the call after the real initialization
of the slotinfo array.
This fixes an AArch64 build failure:
python3 -B ../sysdeps/aarch64/fpu/scripts/bench_libmvec_advsimd.py bench-float-advsimd-cospi > …/benchtests/bench-float-advsimd-cospi.c
Traceback (most recent call last):
File "…/sysdeps/aarch64/fpu/scripts/bench_libmvec_advsimd.py", line 106, in <module>
main(sys.argv[1])
~~~~^^^^^^^^^^^^^
File "…/sysdeps/aarch64/fpu/scripts/bench_libmvec_advsimd.py", line 81, in main
with open(f"../benchtests/libmvec/{input_filename}") as f:
~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: '../benchtests/libmvec/cospif-inputs'
Careful updates of grnd_alloc.len are required to ensure that
after fork, grnd_alloc.states does not contain entries that
are also encountered by __getrandom_reset_state in TCBs.
For the same reason, it is necessary to overwrite the TCB state
pointer with NULL before updating grnd_alloc.states in
__getrandom_vdso_release.
Before this change, different TCBs could share the same getrandom
state after multi-threaded fork. This would be a critical security
bug (predictable randomness) if not caught during development.
The additional check in stdlib/tst-arc4random-thread makes it more
likely that the test fails due to the bugs mentioned above.
Both __getrandom_reset_state and __getrandom_vdso_release could
put reserved NULL pointers into the states array. This is also
fixed with this commit. After these changes, no null pointers were
observed in the states array during testing.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Some kernels on S390 appear to return a CPU affinity mask based on
configured processors rather than the ones online. Overallocate the CPU
set to match that, but operate only on the ones online.
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Co-authored-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
If the bit is not 0, the operations FRCHG and FSCHG are
undefined and cause a trap; qemu now checks for this as
well, so we set it to 0 temporarily and restore the old
value in getcontext afterwards (setcontext/swapcontext
already do so).
From the discussion in the bugreport, this can probably
be optimised in one place but none of the people involved
are SH4 assembly experts, this patch is field-tested, and
it’s not a code path run often. The other question, what
happens if a signal occurs while the bit is temporarily 0,
is also still unsolved, but to fix that a kernel change is
most likely needed; this patch changes a certain trap on
many CPUs for a hard-to-get trap in a signal handler if a
signal is delivered during the few instructions the PR bit
is temporarily set to 0, so it’s not a regression for most
users.
See BZ and https://bugs.launchpad.net/qemu/+bug/1796520 for
related discussion, references and review comments.
Signed-off-by: mirabilos <tg@debian.org>
Reviewed-by: Oleg Endo <olegendo@gcc.gnu.org>
Tested-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Adds commonly used IPv6 packet header macros similar to what is available
on NetBSD and FreeBSD in sys/netinet/ip6.h and Android in
libc/include/netinet/ip6.h
Usage example IPV6_VERSION_MASK and IPV6_VERSION:
if ((ip6->ip6_vfc & IPV6_VERSION_MASK) == IPV6_VERSION)
return true;
Usage example IPV6_FLOWINFO_MASK:
ip6->ip6_flow = (flow & IPV6_FLOWINFO_MASK);
The relevant standard is RFC2460 (Internet Protocol, Version 6
Specification). It defines the Internet Protocol version (IPV6_VERSION)
and reduced the size of the flow label field from 24 to 20 bits
(IPV6_FLOWLABEL_MASK). The traffic class and flow label fields together
make up the flow information (IPV6_FLOWINFO_MASK).
Tested on x86_64 GNU/Linux
Signed-off-by: Dan Luedtke <danrl@google.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
clang-19 shows:
scanf13.c:28:40: error: 'sscanf' may overflow; destination buffer in argument 4 has size 8, but the corresponding specifier may require size 11 [-Werror,-Wfortify-source]
28 | "A%ms%10ms%4m[bcd]%4mcB", &sp1, &sp2, &sp3, &sp4) != 4)
| ^
scanf13.c:94:34: error: 'sscanf' may overflow; destination buffer in argument 3 has size 8, but the corresponding specifier may require size 2049 [-Werror,-Wfortify-source]
94 | if (sscanf (buf, "%2048ms%mc", &sp3, &sp4) != 2)
| ^
scanf13.c:110:61: error: 'sscanf' may overflow; destination buffer in argument 4 has size 8, but the corresponding specifier may require size 1501 [-Werror,-Wfortify-source]
110 | if (sscanf (buf, "%4mc%1500m[dr/]%548m[abc/d]%3mc", &sp1, &sp2, &sp3, &sp4)
| ^
scanf13.c:110:67: error: 'sscanf' may overflow; destination buffer in argument 5 has size 8, but the corresponding specifier may require size 549 [-Werror,-Wfortify-source]
110 | if (sscanf (buf, "%4mc%1500m[dr/]%548m[abc/d]%3mc", &sp1, &sp2, &sp3, &sp4)
clang does have some support to handle 'm' prefix for -Wformat; but it
lacks support for -Wfortify to understand that it is up to libc to
allocate the memory, and uses the pointer size instead to calculate
validity.
The __ifunc_resolver macro expands to:
extern __typeof (__redirect_name) name __attribute__ ((ifunc ("iname_ifunc")));
static __typeof (__redirect_name) *name_ifunc (void) { [...] };
And although NAME_IFUNC is and alias for NAME, clang-18 still emits
an 'unused function 'name_ifunc' [-Werror,-Wunused-function]'
warning.
clang issues:
error: value size does not match register size specified by the
constraint and modifier [-Werror,-Wasm-operand-widths]
while tryng to use 32 bit variables with 'mrs' to get/set the
fpsr, dczid_el0, and ctr.
The Mach RPC host_get_uptime64() is implemented. It returns the elapsed time
value since bootup. See
https://git.savannah.gnu.org/cgit/hurd/gnumach.git/commit/?id=fc494bfe3fb6363e1077dc035eb119970d84a9d1
In this patch, the RPC is used to implement the monotonic clock for
mach.
* config.h.in: Add HAVE_HOST_GET_UPTIME64 config entry
* sysdeps/mach/clock_gettime.c: Add CLOCK_MONOTONIC case
* sysdeps/mach/configure: Check the existence of host_get_uptime64 RPC
* sysdeps/mach/configure.ac: Check the existence of host_get_uptime64 RPC
Message-ID: <20250106043907.1046-1-zhmingluo@163.com>
commit 494d65129e
Author: Michael Jeanson <mjeanson@efficios.com>
Date: Thu Aug 1 10:35:34 2024 -0400
nptl: Introduce <rseq-access.h> for RSEQ_* accessors
added things like
asm volatile ("movl %%fs:%P1(%q2),%0" \
: "=r" (__value) \
: "i" (offsetof (struct rseq_area, member)), \
"r" (__rseq_offset)); \
But this doesn't work for x32 when __rseq_offset is negative since the
address is computed as
FS + 32-bit to 64-bit zero extension of __rseq_offset
+ offsetof (struct rseq_area, member)
Cast __rseq_offset to long long int
"r" ((long long int) __rseq_offset)); \
to sign-extend 32-bit __rseq_offset to 64-bit. This is a no-op for x86-64
since x86-64 __rseq_offset is 64-bit. This fixes BZ #32543.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Sync the internal copy of '<sys/rseq.h>' with the latest Linux kernel
'include/uapi/linux/rseq.h'.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
The rseq extensible ABI implementation moved the rseq area to the 'extra
TLS' block, remove the unused 'rseq_area' member of 'struct pthread'.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Move the rseq area to the newly added 'extra TLS' block, this is the
last step in adding support for the rseq extended ABI. The size of the
rseq area is now dynamic and depends on the rseq features reported by
the kernel through the elf auxiliary vector. This will allow
applications to use rseq features past the 32 bytes of the original rseq
ABI as they become available in future kernels.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
In preparation to move the rseq area to the 'extra TLS' block, we need
accessors based on the thread pointer and the rseq offset. The ONCE
variant of the accessors ensures single-copy atomicity for loads and
stores which is required for all fields once the registration is active.
A separate header is required to allow including <atomic.h> which
results in an include loop when added to <tcb-access.h>.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
This allows accessing the internal aliases of __rseq_size and
__rseq_offset from ld.so without ifdefs and avoids dynamic symbol
binding at run time for both variables.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Add the Linux implementation of 'extra TLS' which will allocate space
for the rseq area at the end of the TLS blocks in allocation order.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Add the logic to append an 'extra TLS' block in the TLS block allocator
with a generic stub implementation. The duplicated code in
'csu/libc-tls.c' and 'elf/dl-tls.c' is to handle both statically linked
applications and the ELF dynamic loader.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Get the rseq feature size and alignment requirement from the auxiliary
vector for use inside the dynamic loader. Use '__rseq_size' directly to
store the feature size. If the main thread registration fails or is
disabled by tunable, reset the value to 0.
This will be used in the TLS block allocator to compute the size and
alignment of the rseq area block for the extended ABI support.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Similar to a9944a52c9 and
f9493a15ea, we need to hide calloc use from
the compiler to accommodate GCC's r15-6566-g804e9d55d9e54c change.
First, include tst-malloc-aux.h, but then use `volatile` variables
for size.
The test passes without the tst-malloc-aux.h change but IMO we want
it there for consistency and to avoid future problems (possibly silent).
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
Clear libc_cv_cc_wimplicit_fallthrough if -Wimplicit-fallthrough isn't
supported. Tested with GCC 6.4.1 on x86-64.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
Add a couple of tests to verify that CPU affinity set using
sched_setaffinity and pthread_setaffinity_np are inherited by a child
process and child thread.
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
On 32-bit Arm, -fasynchronous-unwind-tables creates a reference
to the symbol __aeabi_unwind_cpp_pr0. Compile the tests without
this flag even if it is passed as part of CC, to avoid linker
failures.
htl's pt-alloc.c calls __mempcpy, which is #defined to
__builtin_mempcpy, but which does not happen to get inlined (the size is
dynamic), and then gcc emits a reference to mempcpy, thus violating
symbol exposition standard. We thus also have to redirect such
references to __mempcpy too.
Linux bogsucker 6.1.55-gentoo-dist-hardened #1 SMP Sun Oct 1 18:03:02 UTC 2023 ppc64le POWER9 (architected), altivec supported CHRP IBM pSeries (emulated by qemu) GNU/Linux
Signed-off-by: Andreas K. Hüttel <dilfridge@gentoo.org>
Commit 8f8dd904c4 (“elf:
rtld_multiple_ref is always true”) removed some code that happened
to enable compatibility with programs that do not link against
libc.so. Such programs cannot call dlopen or any dynamic linker
functions (except __tls_get_addr), so this is not really useful.
Still ld.so should not crash with a null-pointer dereference
or undefined symbol reference in these cases.
In the main relocation loop, call _dl_relocate_object unconditionally
because it already checks if the object has been relocated.
If libc.so was loaded, self-relocate ld.so against it and call
__rtld_mutex_init and __rtld_malloc_init_real to activate the full
implementations. Those are available only if libc.so is there,
so skip these initialization steps if libc.so is absent. Without
libc.so, the global scope can be completely empty. This can cause
ld.so self-relocation to fail because if it uses symbol-based
relocations, which is why the second ld.so self-relocation is not
performed if libc.so is missing.
The previous concern regarding GOT updates through self-relocation
no longer applies because function pointers are updated
explicitly through __rtld_mutex_init and __rtld_malloc_init_real,
and not through relocation. However, the second ld.so self-relocation
is still delayed, in case there are other symbols being used.
Fixes commit 8f8dd904c4 (“elf:
rtld_multiple_ref is always true”).
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This reverts commit 30d3fd7f4f.
The padding is required by Chromium's MaybeUpdateGlibcTidCache
in sandbox/linux/services/namespace_sandbox.cc.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This reverts commit 7c22dcda27.
The padding is required by Chromium's MaybeUpdateGlibcTidCache
in sandbox/linux/services/namespace_sandbox.cc.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
As documented in the glibc manual, “Some systems don’t define the d_name
element sufficiently long”, and it provides an example of using a union to
properly allocate the storage under the dirent.
This will be required by the rseq extensible ABI implementation on all
Linux architectures exposing the '__rseq_size' and '__rseq_offset'
symbols to set the initial value of the 'cpu_id' field which can be used
by applications to test if rseq is available and registered. As long as
the symbols are exposed it is valid for an application to perform this
test even if rseq is not yet implemented in libc for this architecture.
Compile tested with build-many-glibcs.py but I don't have access to any
hardware to run the tests.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
This will be required by the rseq extensible ABI implementation on all
Linux architectures exposing the '__rseq_size' and '__rseq_offset'
symbols to set the initial value of the 'cpu_id' field which can be used
by applications to test if rseq is available and registered. As long as
the symbols are exposed it is valid for an application to perform this
test even if rseq is not yet implemented in libc for this architecture.
Compile tested with build-many-glibcs.py but I don't have access to any
hardware to run the tests.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Linux waikiki 6.6.53-gentoo #1 SMP Wed Oct 2 13:21:27 CEST 2024 x86_64 AMD EPYC 7532 32-Core Processor AuthenticAMD GNU/Linux
Signed-off-by: Andreas K. Hüttel <dilfridge@gentoo.org>
Linux matoro-mipsdev 6.12.0-gentoo-mips #2 SMP Tue Nov 19 15:34:04 EST 2024 mips64 Cavium Octeon II V0.10 EBB6800 (CN6880p2.2-1200-AAP) GNU/Linux
Signed-off-by: Andreas K. Hüttel <dilfridge@gentoo.org>
Add "mtls_descriptor=desc" to preconfigure.ac and regenerate preconfigure.
Fix failure: elf/tst-gnu2-tls2.
Reported-by: Joseph S. Myers <josmyers@redhat.com>
Reported-by: Andreas K. Huettel <dilfridge@gentoo.org>
Linux matoro-alphadev 6.12.3-gentoo-alpha #1 Sun Dec 8 04:39:11 EST 2024 alpha EV68CB Titan GNU/Linux
Signed-off-by: Andreas K. Hüttel <dilfridge@gentoo.org>
There is no need for __GI_XXX symbols, like __GI___strcpy_aligned since
__strcpy_aligned is used directly.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
Linux matoro-loongdev 6.12.0-gentoo-loongarch64 #1 SMP PREEMPT Fri Nov 22 00:38:46 EST 2024 loongarch64 GNU/Linux
Signed-off-by: Andreas K. Hüttel <dilfridge@gentoo.org>
Use unpredicated muls, use absolute compare and improve memory access.
Expm1f, sinhf and tanhf show 7%, 5% and 1% improvement in throughput
microbenchmark on Neoverse V1.
More routines are to follow, some of which hit many failures in the
current testsuite due to wrong sign of zero (mathvec routines are not
required to get this right). Instead of disabling a large number of
tests, change the failure condition such that, for vector routines,
tests pass as long as computed == expected == 0.0, regardless of sign.
Affected tests (vector tests for expm1, log1p, sin, tan and tanh) all
still pass.
Reduce memory access by using lanewise MLA and reduce number of MOVPRFXs.
Move log1pf implementation to inline helper function.
Speedup on Neoverse V1 for log1pf (10%), acoshf (-1%), atanhf (2%), asinhf (2%).
Reduce memory access by using lanewise MLA and moving constants to struct
and reduce number of MOVPRFXs.
Update maximum ULP error for double log_sve from 1 to 2.
Speedup on Neoverse V1 for log (3%), log2 (5%), and log10 (4%).
Clang's <tgmath.h> doesn't support all C23 functions in glibc's <tgmath.h>:
https://github.com/llvm/llvm-project/issues/121536
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
Since have-mtls-descriptor is only used for glibc testing, rename it to
have-test-mtls-descriptor. Also enable tst-gnu2-tls2-amx only if
$(have-test-mtls-descriptor) == gnu2.
Tested with GCC 14 and Clang 19/18/17 on x86-64.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
Linux timberdoodle 6.1.60-gentoo-dist-hardened #1 SMP Fri Dec 1 22:10:49 UTC 2023 ppc64 POWER9 (architected), altivec supported CHRP IBM pSeries (emulated by qemu) GNU/Linux
Signed-off-by: Andreas K. Hüttel <dilfridge@gentoo.org>
Linux catbus 6.1.112 #1 SMP Sun Oct 13 10:52:08 PDT 2024 sparc64 sun4v UltraSparc T5 (Niagara5) GNU/Linux
Signed-off-by: Andreas K. Hüttel <dilfridge@gentoo.org>
clang does not define __SIG_ATOMIC_TYPE__, instead add another
directive ('size:') which instruct to use an integer type of
defined minimum size.
Reviewed-by: Sam James <sam@gentoo.org>
The mempcpy and stpcpy redirections to __mempcpy and __stpcpy were added
by
commit 939da41143
Author: Joseph Myers <joseph@codesourcery.com>
Date: Wed Nov 12 22:36:34 2014 +0000
Fix stpcpy / mempcpy namespace (bug 17573).
to fix the namespace bug since __mempcpy and __stpcpy were defined as
macros in <bits/string2.h>. These macros call __builtin_mempcpy and
__builtin_stpcpy which may end up calling the C functions mempcpy
and stpcpy. In libc.so, libc_hidden_builtin_proto ensures that calls
to mempcpy and stpcpy are in turn mapped to call __GI_mempcpy and
__GI_stpcpy. The redirections were applied outside of libc.so, including
libc.a, to map mempcpy and stpcpy to __mempcpy and __stpcpy. Since
commit 18b10de7ce
Author: Wilco Dijkstra <wdijkstr@arm.com>
Date: Mon Jun 12 15:19:38 2017 +0100
2017-06-12 Wilco Dijkstra <wdijkstr@arm.com>
There is no longer a need for string2.h, so remove it and all mention of it.
Move the redirect for __stpcpy to include/string.h since it is
still required
until all internal uses have been renamed.
This fixes several linknamespace/localplt failures when building with -Os.
removed the __mempcpy and __stpcpy macros from the public header file,
limit these redirections to libc.a to avoid Clang error:
In file included from tst-iconv-sticky-input-error.c:22:
In file included from ./gconv_int.h:24:
../include/string.h:182:44: error: attribute declaration must precede definition [-Werror,-Wignored-attributes]
182 | extern __typeof (mempcpy) mempcpy __asm__ ("__mempcpy");
| ^
../string/bits/string_fortified.h:42:8: note: previous definition is here
42 | __NTH (mempcpy (void *__restrict __dest, const void *__restrict __src,
| ^
when testing with Clang for fortify build.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
When Clang is used to test fortify glibc build configured with
--enable-fortify-source=N
clang issues errors like
In file included from tst-rfc3484.c:60:
In file included from ./getaddrinfo.c:81:
../sysdeps/unix/sysv/linux/not-cancel.h:36:10: error: reference to overloaded function could not be resolved; did you mean to call it?
36 | __typeof (open64) __open64_nocancel;
| ^~~~~~~~
../include/bits/../../io/bits/fcntl2.h:127:1: note: possible target for call
127 | open64 (__fortify_clang_overload_arg (const char *, ,__path), int __oflag,
| ^
../include/bits/../../io/bits/fcntl2.h:118:1: note: possible target for call
118 | open64 (__fortify_clang_overload_arg (const char *, ,__path), int __oflag)
| ^
../include/bits/../../io/bits/fcntl2.h:114:1: note: possible target for call
114 | open64 (const char *__path, int __oflag, mode_t __mode, ...)
| ^
../io/fcntl.h:219:12: note: possible target for call
219 | extern int open64 (const char *__file, int __oflag, ...) __nonnull ((1));
| ^
because clang fortify support for functions with variable arguments relies
on function overload. Update not-cancel.h to avoid __typeof on functions
with variable arguments.
Co-Authored-By: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
Use explicit instantiation declaration and definition to silence Clang
error:
tst-unique3.cc:6:18: error: instantiation of variable 'S<char>::i' required here, but no definition is available [-Werror,-Wundefined-var-template]
6 | int t = S<char>::i;
| ^
./tst-unique3.h:5:14: note: forward declaration of template entity is here
5 | static int i;
| ^
tst-unique3.cc:6:18: note: add an explicit instantiation declaration to suppress this warning if 'S<char>::i' is explicitly instantiated in another translation unit
6 | int t = S<char>::i;
| ^
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
These inputs were generated with the programs from
https://gitlab.inria.fr/zimmerma/math_accuracy,
with rounding to nearest:
* for univariate binary32 functions by exhaustive search
* for other functions with the "threshold" parameter up to 10^6
The initstate{,_r} interfaces are documented in BSD as needing an aligned
array of 32-bit values, but neither POSIX nor glibc's own documentation
require it to be aligned. glibc's documentation says it "should" be a power
of 2, but not must.
Use memcpy to read and write to `state` to handle such an unaligned
argument.
Co-authored-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
The require size for mmap can be inferred from __vasprintf return
value. It also fixes tst-assert-2 when building with --enable-fortify,
where even if the format is not translated, __readonly_area fails
because malloc can not be used.
Checked on aarch64-linux-gnu.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
This fixes commit 5e249192ca ("elf:
Remove the GET_ADDR_ARGS and related macros from the TLS code"):
GET_ADDR_ARGS was indeed unused, but GET_ADDR_OFFSET was used
on several targets, those that define TLS_DTV_OFFSET. Instead
of reintroducing GET_ADDR_OFFSET, use TLS_DTV_OFFSET directly,
now that it is defined on all targets.
In the new tls_get_addr_adjust helper function, add a cast to
uintptr_t to help the s390 case, where the offset can be positive or
negative, depending on the addresses malloc returns. The cast avoids
pointer wraparound/overflow. The outer uintptr_t cast is needed
to suppress a warning on x86-64 x32 about mismatched integer/pointer
sizes.
Eventually this offset should be folded into the DTV addresses
themselves, to eliminate the subtraction on the TLS fast path.
This will require an adjustment to libthread_db because the
debugger interface currently returns unadjusted pointers.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
On arc, the definition of TLS_DTV_UNALLOCATED now comes from
<dl-dtv.h>.
For x86-64 x32, a separate version is needed because unsigned long int
is 32 bits on this target.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The old BSD 4.4 definition (not used by Linux) was not 64b-proof: the
cmsg_data field is supposed to CMSG_ALIGN'ed (as can be also seen in the
CMSG_LEN macro).
Suggested-by: Diego Nieto Cid <dnietoc@gmail.com>
* scripts/update-copyrights: Do not update copyright notices
in licenses imported from the Linux kernel.
This should prevent glitches such as those fixed in my
recent commit.
I've updated copyright dates in glibc for 2025. This is the patch for
the changes not generated by scripts/update-copyrights and subsequent
build / regeneration of generated files.
This is needed for the next patch which updates copyright dates.
* assert/test-assert-2.c: Remove trailing white space.
* elf/tst-startup-errno.c: Remove trailing empty lines.
Since https://gcc.gnu.org/r11-959, the compiler emits
-Wmaybe-uninitialized if a const pointer to an uninitialized buffer is
passed. Tell the compiler we don't dereference the pointer to remove
the false alarm.
Link: https://gcc.gnu.org/PR118194
Signed-off-by: Xi Ruoyao <xry111@xry111.site>
Reviewed-by: Sam James <sam@gentoo.org>
The new tunable can be used to control whether executable stacks are
allowed from either the main program or dependencies. The default is
to allow executable stacks.
The executable stacks default permission is checked agains the one
provided by the PT_GNU_STACK from program headers (if present). The
tunable also disables the stack permission change if any dependency
requires an executable stack at loading time.
Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
If some shared library loaded with dlopen/dlmopen requires an executable
stack, either implicitly because of a missing GNU_STACK ELF header
(where the ABI default flags implies in the executable bit) or explicitly
because of the executable bit from GNU_STACK; the loader will try to set
the both the main thread and all thread stacks (from the pthread cache)
as executable.
Besides the issue where any __nptl_change_stack_perm failure does not
undo the previous executable transition (meaning that if the library
fails to load, there can be thread stacks with executable stacks), this
behavior was used on a CVE [1] as a vector for RCE.
This patch changes that if a shared library requires an executable
stack, and the current stack is not executable, dlopen fails. The
change is done only for dynamically loaded modules, if the program
or any dependency requires an executable stack, the loader will still
change the main thread before program execution and any thread created
with default stack configuration.
[1] https://www.qualys.com/2023/07/19/cve-2023-38408/rce-openssh-forwarded-ssh-agent.txt
Checked on x86_64-linux-gnu and i686-linux-gnu.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Tested with build-many-glibcs.py with
--exclude m68k-linux-gnu-coldfire-soft
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
Move the x86-64 loader first, before the i386 and x32 loaders. In
most cases, it's the loader the script needs. This avoids an error
message if the i386 loader does not work.
The effect of this change to the generated ldd script looks like this:
-RTLDLIST="/lib/ld-linux.so.2 /lib64/ld-linux-x86-64.so.2 /libx32/ld-linux-x32.so.2"
+RTLDLIST="/lib64/ld-linux-x86-64.so.2 /lib/ld-linux.so.2 /libx32/ld-linux-x32.so.2"
Reviewed-by: Sam James <sam@gentoo.org>
The addition of the new thread_pointer.h header on HPPA resulted in
duplicated inline asm to get the current thread pointer from the cr27
register.
Include thread_pointer.h in tls.h and replace __get/set_cr27() with
__set_/thread_pointer() with the appropriate casts.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
This will be required by the rseq extensible ABI implementation on all
Linux architectures exposing the '__rseq_size' and '__rseq_offset'
symbols to set the initial value of the 'cpu_id' field which can be used
by applications to test if rseq is available and registered. As long as
the symbols are exposed it is valid for an application to perform this
test even if rseq is not yet implemented in libc for this architecture.
Compile tested with build-many-glibcs.py but I don't have access to any
hardware to run the tests.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
The previous use of padding within a union made it impossible to
re-use the padding for GLIBC_PRIVATE ABI preservation because
tcbhead_t could use up all of the padding (as was historically the
case on x86-64). Allocating padding unconditionally addresses this
issue.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This was used to manage an IA-64 ABI divergence is no longere needed
after the IA-64 removal.
(It should be possible to encode all the required information in
one machine word, so the pointer indirection is really unnecessary.
Technically, none of this is part of the ABI, so perhaps it's
possible to do this retroactively. See bug 27404.)
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
m68k-linux-gnu-coldfire-soft GCC and glibc often won't build due to
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103370
which results in build-many-glibcs.py failure. Add an option, --exclude,
to exclude some targets.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
Also document C and C++ compilers used to test glibc should come from
the same set of compilers.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
Since Linux 6.11 the kernel allows path to be NULL if flags &
AT_EMPTY_PATH. Let's allow users to take the advantage if they don't
care running on old kernels.
Signed-off-by: Miao Wang <shankerwangmiao@gmail.com>
Signed-off-by: Xi Ruoyao <xry111@xry111.site>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Change configure output for C++ Compiler from
...
checking -finput-charset=ascii in testing... -finput-charset=ascii
checking -finput-charset=ascii in testing... -finput-charset=ascii
...
to
...
checking -finput-charset=ascii in testing... -finput-charset=ascii
checking g++ -finput-charset=ascii in testing... -finput-charset=ascii
...
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
Add valid_decimal_value to check valid decimal value in a string to
avoid uninitialized endp in add_prefixlist and gaiconf_init as reported
by Clang 19:
./getaddrinfo.c:1884:11: error: variable 'endp' is used uninitialized whenever '||' condition is true [-Werror,-Wsometimes-uninitialized]
1884 | && (cp == NULL
| ^~~~~~~~~~
./getaddrinfo.c:1887:11: note: uninitialized use occurs here
1887 | && *endp == '\0'
| ^~~~
./getaddrinfo.c:1884:11: note: remove the '||' if its condition is always false
1884 | && (cp == NULL
| ^~~~~~~~~~
1885 | || (bits = strtoul (cp, &endp, 10)) != ULONG_MAX
| ~~
./getaddrinfo.c:1875:13: note: initialize the variable 'endp' to silence this warning
1875 | char *endp;
| ^
| = NULL
This fixes BZ #32465.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
move out __getpid from pt-mutex.h
and in pt-mutex-* include <unistd.h> where
__getpid was called
Signed-off-by: gfleury <gfleury@disroot.org>
Message-ID: <20241219203727.669825-8-gfleury@disroot.org>
Add a configure check for -Wno-fortify-source to suppress Clang warnings
on string/tester.c, like:
tester.c:385:10: error: 'strncat' size argument is too large; destination buffer has size 50, but size argument is 99 [-Werror,-Wfortify-source]
385 | check (strncat (one, "lmn", 99) == one, 1); /* Returned value. */
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
Since Clang doesn't properly handle
/* FALLTHROUGH */
in elf/tst-align2.c nor
/* fall through */
in misc/tst-tsearch.c
tst-align2.c💯9: error: unannotated fall-through between switch labels [-Werror,-Wimplicit-fallthrough]
100 | case 'A':
| ^
tst-align2.c💯9: note: insert '__attribute__((fallthrough));' to silence this warning
100 | case 'A':
| ^
| __attribute__((fallthrough));
tst-align2.c💯9: note: insert 'break;' to avoid fall-through
100 | case 'A':
| ^
| break;
suppress them when compiled with Clang.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
Since Clang doesn't support
DIAG_IGNORE_NEEDS_COMMENT (11, "-Wformat=");
and for unknown reasons, it doesn't warn the %#m specifier, suppress
-Wformat only for gcc in tst-sprintf-errno.c.
Co-Authored-By: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
Add __attribute_optimization_barrier__ to disable inlining and cloning on a
function. For Clang, expand it to
__attribute__ ((optnone))
Otherwise, expand it to
__attribute__ ((noinline, clone))
Co-Authored-By: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
Without stack protector, inhibit_stack_protector is undefined during build:
In file included from <command-line>:
./../include/libc-symbols.h:665:3: error: expected ';' before '__typeof'
665 | __typeof (type_name) *name##_ifunc (__VA_ARGS__)
\
| ^~~~~~~~
./../include/libc-symbols.h:676:3: note: in expansion of macro
'__ifunc_resolver'
676 | __ifunc_resolver (type_name, name, expr, init, static, __VA_ARGS__)
| ^~~~~~~~~~~~~~~~
./../include/libc-symbols.h:703:3: note: in expansion of macro '__ifunc_args'
703 | __ifunc_args (type_name, name, expr, init, arg)
| ^~~~~~~~~~~~
./../include/libc-symbols.h:790:3: note: in expansion of macro '__ifunc'
790 | __ifunc (redirected_name, name, expr, void, INIT_ARCH)
| ^~~~~~~
../sysdeps/x86_64/multiarch/memchr.c:29:1: note: in expansion of macro
'libc_ifunc_redirected'
29 | libc_ifunc_redirected (__redirect_memchr, memchr, IFUNC_SELECTOR ());
| ^~~~~~~~~~~~~~~~~~~~~
1. Fix a typo in include/libc-symbols.h to define inhibit_stack_protector
for build.
2. Don't include <config.h> in include/libc-symbols.h since it has been
included in include/libc-misc.h.
3. Change #include "libc-misc.h" to #include <libc-misc.h> in
string/test-string.h.
This fixes BZ #32494.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
Check if TEST_CC supports -Wno-restrict before using it to avoid Clang
error:
error: unknown warning option '-Wno-restrict' [-Werror,-Wunknown-warning-option]
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
This simplifies the handling of sanity check errors in clone.S.
Adjusted a couple of comments to reflect current code.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
The hppa Linux kernel supports the cacheflush() syscall
since version 6.5. This adds the glibc syscall wrapper.
Signed-off-by: Helge Deller <deller@gmx.de>
---
v2: This patch was too late in release cycle for GLIBC_2.40,
so update now to GLIBC_2.41 instead.
Since tst-deadline.c is an internal test, compile tst-deadline.c with
-Wno-ignored-attributes for Clang to silence -Werror,-Wunknown-attributes
errors. Also suppress -Wmaybe-uninitialized only for GCC in net-internal.h.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
Suppress Clang -Wgnu-folding-constant warnings, like
tst-freopen.c:44:13: error: variable length array folded to constant array as an extension [-Werror,-Wgnu-folding-constant]
44 | char temp[strlen (test) + 1];
| ^~~~~~~~~~~~~~~~~
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
Clang issues the following warning:
tst-vfprintf-width-i18n.c:51:34: error: invalid conversion specifier '1'
[-Werror,-Wformat-invalid-specifier]
TEST_COMPARE (sprintf (buf, "%I16d", 12345), 16);
~~^
since it does not how to handle %I.
Reviewed-by: Sam James <sam@gentoo.org>
clang warns that since the global variables are only used to function
calls (without being actually used), they are not needed and will
not be emitted.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
Suppress Clang warning on adding an integer to a string, like:
tst-iconv-sticky-input-error.c:125:42: error: adding 'int' to a string does not append to the string [-Werror,-Wstring-plus-int]
125 | expected_output = "ABXY" + skip;
| ~~~~~~~^~~~~~
tst-iconv-sticky-input-error.c:125:42: note: use array indexing to silence this warning
125 | expected_output = "ABXY" + skip;
| ^
| & [ ]
Co-Authored-By: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
clang does not handle %Z on print, and just suppressing
-Wformat-invalid-specifier might trigger another warning for extra
arguments (since %Z is ignored). So suppress -Wformat-extra-args
as well.
For tst-fphex.c a heavy hammer is used since the printf is more
complex and clang throws a more generic warning.
Reviewed-by: Sam James <sam@gentoo.org>
clang warns that the instantiation of the variable is required,
but no definition is available. They are implemented on
tst-unique4lib.so.
Checked on x86_64-linux-gnu.
Reviewed-by: Sam James <sam@gentoo.org>
tst-dlopen-nodelete-reloc requires STB_GNU_UNIQUE support so that NODELETE
is propagated by do_lookup_unique. Enable it only if TEST_CXX supports
STB_GNU_UNIQUE,
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
Check PDE load address with non-empty text section:
.globl _start
_start:
.globl __start
.byte 0
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
Clang supports __builtin_fabsf128 (despite not supporting _Float128) but
it does not support __builtin_fabsq. Fallback to back to
`typedef __float128 _Float128;` it clang is used.
Originally developed by Fangrui Song <maskray@google.com>.
Co-Authored-By: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
Four new macros are added:
* DIAG_{PUSH,POP}_NEEDS_COMMENT_CLANG are similar to
DIAG_{PUSH,POP}_NEEDS_COMMENT, but enable clang specific pragmas to
handle warnings for options only supported by clang.
* DIAG_IGNORE_NEEDS_COMMENT_{CLANG,GCC} are similar to
DIAG_IGNORE_NEEDS_COMMENT, but enable the warning suppression only
for the referenced compiler.
Reviewed-by: Sam James <sam@gentoo.org>
Compiler may default to -fno-semantic-interposition. But some elf test
modules must be compiled with -fsemantic-interposition to function properly.
Add a TEST_CC check for -fsemantic-interposition and use it on elf test
modules. This fixed
FAIL: elf/tst-dlclose-lazy
FAIL: elf/tst-pie1
FAIL: elf/tst-plt-rewrite1
FAIL: elf/unload4
when Clang 19 is used to test glibc.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
Use
__attribute__ ((optnone))
instead of
__attribute__ ((optimize ("-O0")))
to disable optimization for Clang.
Co-Authored-By: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
Clang emits the following warnings:
../sysdeps/unix/sysv/linux/tst-getdents64.c:111:18: error: fields must
have a constant size: 'variable length array in structure' extension
will never be supported
char buffer[buffer_size];
^
Co-Authored-By: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
Add include/libc-misc.h to provide miscellaneous definitions for both
glibc build and test:
1. Move inhibit_stack_protector to libc-misc.h and add Clang support.
2. Add test_inhibit_stack_protector for glibc testing.
3. Move inhibit_loop_to_libcall to libc-misc.h.
4. Add test_cc_inhibit_loop_to_libcall to handle TEST_CC != CC and
replace inhibit_loop_to_libcall with test_cc_inhibit_loop_to_libcall
in glibc tests.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Co-Authored-By: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: Sam James <sam@gentoo.org>
Clang doesn't support -ffloat-store:
clang: error: optimization flag '-ffloat-store' is not supported [-Werror,-Wignored-optimization-argument]
Define test-config-cflags-float-store for -ffloat-store and use it in
math/Makefile for testing.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
Since trampoline is required to test execstack, enable execstack tests
only if compiler supports trampoline.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
Since Clang doesn't support -mfpmath=387 on x86-64, on x86, include
test-flt-eval-method-387 only if -mfpmath=387 works.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
Unlike GCC, libmvec support in Clang is hard-coded. Clang doesn't use
macros defined in <bits/libm-simd-decl-stubs.h> to support new libmvec
functions added to glibc and can't vectorize all test loops to test
libmvec ABI:
https://github.com/llvm/llvm-project/issues/120868
disable libmvec ABI test for Clang.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
Remove the /usr/include/tgmath.h dependency generated by Clang even though
Clang never reads /usr/include/tgmath.h.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
Since math/math.h isn't a system header, clang issues errors:
In file included from test-flt-eval-method.c:20:
In file included from ../include/math.h:7:
../math/math.h:91:11: error: 'INFINITY' macro redefined [-Werror,-Wmacro-redefined]
91 | # define INFINITY (__builtin_inff ())
| ^
/usr/bin/../lib/clang/19/include/float.h:173:11: note: previous definition is here
173 | # define INFINITY (__builtin_inff())
| ^
In file included from test-flt-eval-method.c:20:
In file included from ../include/math.h:7:
../math/math.h:98:11: error: 'NAN' macro redefined [-Werror,-Wmacro-redefined]
98 | # define NAN (__builtin_nanf (""))
| ^
/usr/bin/../lib/clang/19/include/float.h:174:11: note: previous definition is here
174 | # define NAN (__builtin_nanf(""))
Don't define INFINITY nor NAN if they are defined.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
Since __builtin_complex was added to Clang 12, support __builtin_complex
for Clang 12.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
Check if -finput-charset=ascii is supported before using it in
check-installed-headers.sh.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
tgmath3-macro-tests won't compile with <float.h> and <tgmath.h> from
Clang due to missing C23 support:
https://github.com/llvm/llvm-project/issues/97335
Disable them for now when Clang is used for testing so that "make check"
can finish.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Since -mamx-tile is used only for testing, use LIBC_TRY_TEST_CC_COMMAND,
instead of LIBC_TRY_CC_AND_TEST_CC_COMMAND to check it and don't check
__builtin_ia32_ldtilecfg for Clang.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
Fix assert.c so that even the fallback
case conforms to POSIX, although not exactly the same as
the default case so a test can tell the difference.
Add a test that verifies that abort is called, and that the
message printed to stderr has all the info that POSIX requires.
Verify this even when malloc isn't usable.
Reviewed-by: Paul Eggert <eggert@cs.ucla.edu>
After
commit 215447f5cb
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Tue Dec 17 06:18:55 2024 +0800
cet: Pass -mshstk to compiler for tst-cet-legacy-10a[-static].c
we can remove '#pragma GCC target' in tst-cet-legacy-10a[-static].c.
Co-Authored-By: H.J. Lu <hjl.tools@gmail.com>
POSIX states that "if a child process cannot be created, or if the
termination status for the command language interpreter cannot be
obtained, system() shall return -1 and set errno to indicate the error."
In the glibc implementation it could happen when posix_spawn fails,
which happens when the underlying fork, vfork, or clone call fails. They
could fail with EAGAIN and ENOMEM.
Resolves: BZ #32450
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Clang has its own <tgmath.h> and doesn't use <tgmath.h> from glibc. Pass
"-I." to compiler only if $($(<F)-no-include-dot) are undefined. Define
it to yes for tgmath tests when testing with Clang.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
Clang 19 takes a very long time, it ran more than 27 minutes on Intel Core
i7-1195G7 before the process was killed, to compile bug28.c:
https://github.com/llvm/llvm-project/issues/120462
Exclude it when Clang is used for testing.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
This was discovered after extending elf/tst-audit23 to cover
dlclose of the dlmopen namespace.
Auditors already experience the new order during process
shutdown (_dl_fini), so no LAV_CURRENT bump or backwards
compatibility code seems necessary.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Previously, the ld.so link map was silently added to the namespace.
This change produces an auditing event for it.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
After commit 1d5024f4f0
("support: Build with exceptions and asynchronous unwind tables
[BZ #30587]"), libgcc_s is expected to show up in the DSO
list on 32-bit Arm. Do not update max_objs because vdso is not
tracked (and which is the reason why the test currently passes
even with libgcc_s present).
Also write the log output from the auditor to standard output,
for easier test debugging.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This avoids immediate GLIBC_PRIVATE ABI issues if the size of
struct link_map or struct auditstate changes.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Unconditionally define it to false for static builds.
This avoids the awkward use of weak_extern for _dl_rtld_map
in checks that cannot be possibly true on static builds.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Linux 6.12 adds a constant MSG_SOCK_DEVMEM (recall that various
constants such as this one are defined in the non-uapi linux/socket.h
but still form part of the kernel/userspace interface, so that
non-uapi header is one that needs checking each release for new such
constants). Add it to glibc's bits/socket.h.
Tested for x86_64.
This matches kernel behavior. With this change, it is possible
to use utimensat as a replacement for the futimens interface,
similar to what glibc does internally.
Reviewed-by: Paul Eggert <eggert@cs.ucla.edu>
This padding is difficult to use for preserving the internal
GLIBC_PRIVATE ABI. The comment is misleading. Current Address
Sanitizer uses heuristics to determine struct pthread size.
It does not depend on its precise layout. It merely scans for
pointers allocated using malloc.
Due to the removal of the padding, the assert for its start
is no longer required.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
The current DSO dependency sorting tests are for a limited number of
specific cases, including some from particular bug reports.
Add tests that systematically cover all possible DAGs for an
executable and the shared libraries it depends on, directly or
indirectly, up to four objects (an executable and three shared
libraries). (For this kind of DAG - ones with a single source vertex
from which all others are reachable, and an ordering on the edges from
each vertex - there are 57 DAGs on four vertices, 3399 on five
vertices and 1026944 on six vertices; see
https://arxiv.org/pdf/2303.14710 for more details on this enumeration.
I've tested that the 3399 cases with five vertices do all pass if
enabled.)
These tests are replicating the sorting logic from the dynamic linker
(thereby, for example, asserting that it doesn't accidentally change);
I'm not claiming that the logic in the dynamic linker is in some
abstract sense optimal. Note that these tests do illustrate how in
some cases the two sorting algorithms produce different results for a
DAG (I think all the existing tests for such differences are ones
involving cycles, and the motivation for the new algorithm was also to
improve the handling of cycles):
tst-dso-ordering-all4-44: a->[bc];{}->[cba]
output(glibc.rtld.dynamic_sort=1): c>b>a>{}<a<b<c
output(glibc.rtld.dynamic_sort=2): b>c>a>{}<a<c<b
They also illustrate that sometimes the sorting algorithms do not
follow the order in which dependencies are listed in DT_NEEDED even
though there is a valid topological sort that does follow that, which
might be counterintuitive considering that the DT_NEEDED ordering is
followed in the simplest cases:
tst-dso-ordering-all4-56: {}->[abc]
output: c>b>a>{}<a<b<c
shows such a simple case following DT_NEEDED order for destructor
execution (the reverse of it for constructor execution), but
tst-dso-ordering-all4-41: a->[cb];{}->[cba]
output: c>b>a>{}<a<b<c
shows that c and b are in the opposite order to what might be expected
from the simplest case, though there is no dependency requiring such
an opposite order to be used.
(I'm not asserting that either of those things is a problem, simply
observing them as less obvious properties of the sorting algorithms
shown up by these tests.)
Tested for x86_64.
This change implements vfork.S for direct support of the vfork
syscall. clone.S is revised to correct child support for the
vfork case.
The main bug was creating a frame prior to the clone syscall.
This was done to allow the rp and r4 registers to be saved and
restored from the stack frame. r4 was used to save and restore
the PIC register, r19, across the system call and the call to
set errno. But in the vfork case, it is undefined behavior
for the child to return from the function in which vfork was
called. It is surprising that this usually worked.
Syscalls on hppa save and restore rp and r19, so we don't need
to create a frame prior to the clone syscall. We only need a
frame when __syscall_error is called. We also don't need to
save and restore r19 around the call to $$dyncall as r19 is not
used in the code after $$dyncall.
This considerably simplifies clone.S.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
There are no new constants covered by tst-mman-consts.py,
tst-mount-consts.py or tst-pidfd-consts.py in Linux 6.12 that need any
header changes, so update the kernel version in those tests.
(tst-sched-consts.py will need updating separately along with adding
SCHED_EXT.)
Tested with build-many-glibcs.py.
The CORE-MATH implementation is correctly rounded (for any rounding mode),
although it should worse performance than current one. The current
implementation performance comes mainly from the internal usage of
the optimize expf implementation, and shows a maximum ULPs of 2 for
FE_TONEAREST and 3 for other rounding modes.
The code was adapted to glibc style and to use the definition of
math_config.h (to handle errno, overflow, and underflow).
Benchtest on x64_64 (Ryzen 9 5900X, gcc 14.2.1), aarch64 (Neoverse-N1,
gcc 13.3.1), and powerpc (POWER10, gcc 13.2.1):
Latency master patched improvement
x86_64 40.6995 49.0737 -20.58%
x86_64v2 40.5841 44.3604 -9.30%
x86_64v3 39.3879 39.7502 -0.92%
i686 112.3380 129.8570 -15.59%
aarch64 (Neoverse) 18.6914 17.0946 8.54%
power10 11.1343 9.3245 16.25%
reciprocal-throughput master patched improvement
x86_64 18.6471 24.1077 -29.28%
x86_64v2 17.7501 20.2946 -14.34%
x86_64v3 17.8262 17.1877 3.58%
i686 64.1454 86.5645 -34.95%
aarch64 (Neoverse) 9.77226 12.2314 -25.16%
power10 4.0200 5.3316 -32.63%
Signed-off-by: Alexei Sibidanov <sibid@uvic.ca>
Signed-off-by: Paul Zimmermann <Paul.Zimmermann@inria.fr>
Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: DJ Delorie <dj@redhat.com>
The pi defined constants are not the expected value for atan2
on non-default rounding modes. Instead use the autogenerated value.
Reviewed-by: DJ Delorie <dj@redhat.com>
For some correctly rounded inputs where infinity might generate
a number (like atanf), comparing to a pre-defined constant does not
yield the expected result in all rounding modes.
The most straightforward way to handle it would be to get the expected
result from mpfr, where it handles all the rounding modes.
This will be required by the rseq extensible ABI implementation on all
Linux architectures exposing the '__rseq_size' and '__rseq_offset'
symbols to set the initial value of the 'cpu_id' field which can be used
by applications to test if rseq is available and registered. As long as
the symbols are exposed it is valid for an application to perform this
test even if rseq is not yet implemented in libc for this architecture.
Compile tested with build-many-glibcs.py but I don't have access to any
hardware to run the tests.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Linux 6.12 has no new syscalls. Update the version number in
syscall-names.list to reflect that it is still current for 6.12.
Tested with build-many-glibcs.py.
Hide memset/bzero from compiler to silence Clang error:
./tester.c:1345:29: error: 'size' argument to memset is '0'; did you mean to transpose the last two arguments? [-Werror,-Wmemset-transposed-args]
1345 | (void) memset(one+2, 'y', 0);
| ^
./tester.c:1345:29: note: parenthesize the third argument to silence
./tester.c:1432:16: error: 'size' argument to bzero is '0' [-Werror,-Wsuspicious-bzero]
1432 | bzero(one+2, 0);
| ^
./tester.c:1432:16: note: parenthesize the second argument to silence
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
Set have-test-clang to yes if clang is used to test glibc. Set
have-test-clangxx to yes if clang++ is used to test glibc.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
Although _chk functions are exported in libc.so.6, their prototypes aren't
provided. Their built versions are supported by compiler. Replace
__strcpy_chk with __builtin___strcpy_chk to silence Clang error:
./tst-gnuglob-skeleton.c:225:3: error: call to undeclared function '__strcpy_chk'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
225 | __strcpy_chk (dir->d.d_name, filesystem[dir->idx].name, NAME_MAX);
| ^
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
The C standard requires that ungetc guarantees at least one pushback,
but the malloc call to allocate the pushback buffer could fail, thus
violating that requirement. Fix this by adding a single byte pushback
buffer in the FILE struct that the pushback can fall back to if malloc
fails.
The side-effect is that if the initial malloc fails and the 1-byte
fallback buffer is used, future resizing (if it succeeds) will be
2-bytes, 4-bytes and so on, which is suboptimal but it's after a malloc
failure, so maybe even desirable.
A future optimization here could be to have the pushback code use the
single byte buffer first and only fall back to malloc for subsequent
calls.
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: Maciej W. Rozycki <macro@redhat.com>
Clang does not define _Bool for -std=c++98:
/usr/include/bits/platform/features.h:31:19: error: unknown type name '_Bool'
31 | static __inline__ _Bool
| ^
Change _Bool to bool to silence clang++ error.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
GCC 4.9 issues an error for copysign in initializer:
In file included from tst-printf-format-p-double.c:20:0:
tst-printf-format-skeleton-double.c:29:3: error: initializer element is not a constant expression [-Werror]
{ -HUGE_VAL, -DBL_MAX, -DBL_MIN, copysign (0, -1), -NAN, NAN, 0, DBL_MIN,
^
since it can't fold "copysign (0, -1)". Replace copysign (0,-1) with -0.0.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Add explicit instantiation declaration of S<char>::i to silence Clang
error:
tst-unique3.cc:6:18: error: instantiation of variable 'S<char>::i' required here, but no definition is available [-Werror,-Wundefined-var-template]
6 | int t = S<char>::i;
| ^
./tst-unique3.h:5:14: note: forward declaration of template entity is here
5 | static int i;
| ^
tst-unique3.cc:6:18: note: add an explicit instantiation declaration to suppress this warning if 'S<char>::i' is explicitly instantiated in another translation unit
6 | int t = S<char>::i;
| ^
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
ieee_long_double_shape_type has
typedef union
{
long double value;
struct
{
...
int sign_exponent:16;
...
} parts;
} ieee_long_double_shape_type;
Clang issues an error:
../sysdeps/ieee754/ldbl-96/test-totalorderl-ldbl-96.c:49:2: error: implicit truncation from 'int' to bit-field changes value from 65535 to -1 [-Werror,-Wbitfield-constant-conversion]
49 | SET_LDOUBLE_WORDS (ldnx, 0xffff,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
50 | tests[i] >> 32, tests[i] & 0xffffffffULL);
|
Use -1, instead of 0xffff, to silence Clang.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
Add _Atomic to futex_wait argument and ctid in tst-clone3[-internal].c to
silence Clang error:
../sysdeps/unix/sysv/linux/tst-clone3-internal.c:93:3: error: address argument to atomic operation must be a pointer to _Atomic type ('pid_t *' (aka 'int *') invalid)
93 | wait_tid (&ctid, CTID_INIT_VAL);
| ^ ~~~~~
../sysdeps/unix/sysv/linux/tst-clone3-internal.c:51:21: note: expanded from macro 'wait_tid'
51 | while ((__tid = atomic_load_explicit (ctid_ptr, \
| ^ ~~~~~~~~
/usr/bin/../lib/clang/19/include/stdatomic.h:145:30: note: expanded from macro 'atomic_load_explicit'
145 | #define atomic_load_explicit __c11_atomic_load
| ^
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
Mark _exit_with_flush as noreturn to silence the Clang error on
tst-atexit-common.c:
In file included from tst-atexit.c:22:
../stdlib/tst-atexit-common.c:113:5: error: unannotated fall-through between switch labels [-Werror,-Wimplicit-fallthrough]
113 | case 0: /* Child. */
| ^
../stdlib/tst-atexit-common.c:113:5: note: insert 'break;' to avoid fall-through
113 | case 0: /* Child. */
| ^
| break;
1 error generated.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
Some hypervisors report 1 TiB L3 cache size. This results
in some variables incorrectly getting zeroed, causing crashes
in memcpy/memmove because invariants are violated.
Explicitly cast 192 and 168 to char to silence Clang error:
tst-resolv-invalid-cname.c:313:17: error: implicit conversion from 'int' to 'char' changes value from 192 to -64 [-Werror,-Wconstant-conversion]
313 | addr[0] = 192;
| ~ ^~~
tst-resolv-invalid-cname.c:314:17: error: implicit conversion from 'int' to 'char' changes value from 168 to -88 [-Werror,-Wconstant-conversion]
314 | addr[1] = 168;
| ~ ^~~
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
Use "#include <...>" to silence Clang #include_next error:
In file included from ../sysdeps/x86_64/fpu/test-double-vlen4-wrappers.c:19:
../sysdeps/x86_64/fpu/test-double-vlen4.h:19:2: error: #include_next in file found relative to primary source file or found by absolute path; will search from start of include path [-Werror,-Winclude-next-absolute-path]
19 | #include_next <test-double-vlen4.h>
| ^
1 error generated.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
Pass -mshstk to compiler to silence Clang:
In file included from ../sysdeps/x86_64/tst-cet-legacy-10a.c:2:
../sysdeps/x86_64/tst-cet-legacy-10.c:29:7: error: always_inline function '_get_ssp' requires target feature 'shstk', but would be inlined into function 'do_test' that is compiled without support for 'shstk'
29 | if (_get_ssp () != 0)
| ^
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
Load the polynomial evaluation coefficients into 2 vectors and use lanewise MLAs.
Also use intrinsics instead of native operations.
expf: 3% improvement in throughput microbenchmark on Neoverse V1, exp2f: 5%,
exp10f: 13%, coshf: 14%.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Load the polynomial evaluation coefficients into 2 vectors and use lanewise MLAs.
8% improvement in throughput microbenchmark on Neoverse V1.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Load the polynomial evaluation coefficients into 2 vectors and use lanewise MLAs.
8% improvement in throughput microbenchmark on Neoverse V1 for log2 and log,
and 2% for log10.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Since -1 isn't a power of two, compiler may reject it, hide memalign from
Clang 19 which issues an error:
tst-memalign.c:86:31: error: requested alignment is not a power of 2 [-Werror,-Wnon-power-of-two-alignment]
86 | p = memalign (-1, pagesize);
| ^~
tst-memalign.c:86:31: error: requested alignment must be 4294967296 bytes or smaller; maximum alignment assumed [-Werror,-Wbuiltin-assume-aligned-alignment]
86 | p = memalign (-1, pagesize);
| ^~
Update tst-malloc-aux.h to hide all malloc functions and include it in
all malloc tests to prevent compiler from optimizing out any malloc
functions.
Tested with Clang 19.1.5 and GCC 15 20241206 for BZ #32366.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
This was missed in a recent global change.
Fixes: 53fcdf5f74 (2024-11-25, "Silence most -Wzero-as-null-pointer-constant diagnostics")
Reported-by: "Maciej W. Rozycki" <macro@redhat.com>
Cc: Siddhesh Poyarekar <siddhesh@sourceware.org>
Cc: Bruno Haible <bruno@clisp.org>
Cc: Martin Uecker <uecker@tugraz.at>
Cc: Xi Ruoyao <xry111@xry111.site>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: Joseph Myers <josmyers@redhat.com>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
Reviewed-by: Maciej W. Rozycki <macro@redhat.com>
Remove the second
ifndef BUILD_CC
BUILD_CC = $(CC)
endif
in Makeconfig.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: DJ Delorie <dj@redhat.com>
Commit 6cbf845fcd ("iconv: Preserve iconv -c error exit on invalid
inputs (bug 32046)") changed the error exit code to report an error when
an input character has been transliterated. This looks like a bug as the
moto in the iconv program is to report an error code in the same
condition as the iconv() function.
This happens because the STANDARD_TO_LOOP_ERR_HANDLER macro sets a
default value for result and later updates it if the transliteration
succeed. With the changes, setting the default value also marks the
input as illegal.
Fix that by setting up the default value of result only when the
transliteration is not used. This works because __gconv_transliterate()
calls __gconv_mark_illegal_input() to return an error. At the same time
also fix the typo outself -> ourselves.
Fixes: 6cbf845fcd
Resolves: BZ #32448
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Use empty initializer to silence GCC 4.9 or older:
getaddrinfo.c: In function ‘gaih_inet’:
getaddrinfo.c:1135:24: error: missing braces around initializer [-Werror=missing-braces]
/ sizeof (struct gaih_typeproto)] = {0};
^
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
As of Linux 6.13, there is no code in the vDSO that declines this
initialization request with the special ~0UL state size. If the vDSO
has the function, the call succeeds and returns 0. It's expected
that the code would follow the “a negative value indicating an error”
convention, as indicated in the __cvdso_getrandom_data function
comment, so that INTERNAL_SYSCALL_ERROR_P on glibc's side would return
true. This commit changes the commit to check for zero to indicate
success instead, which covers potential future non-zero success
return values and error returns.
Fixes commit 4f5704ea34 ("powerpc: Use
correct procedure call standard for getrandom vDSO call (bug 32440)").
Use "main (void)" instead of "main (void)" to avoid GCC 4.9 warning:
tst-difftime.c:62:1: error: function declaration isn’t a prototype [-Werror=strict-prototypes]
main ()
^
tst-difftime.c: In function ‘main’:
tst-difftime.c:62:1: error: old-style function definition [-Werror=old-style-definition]
cc1: all warnings being treated as errors
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Since GCC 4.9 doesn't have __builtin_add_overflow:
In file included from tst-stringtable.c:180:0:
stringtable.c: In function ‘stringtable_finalize’:
stringtable.c:185:7: error: implicit declaration of function ‘__builtin_add_overflow’ [-Werror=implicit-function-declaration]
else if (__builtin_add_overflow (previous->offset,
^
return EXIT_UNSUPPORTED for GCC 4.9 or older.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
Since elf/ifuncmain9.c fails at run-time when compiled with GCC 5.4 or
older (PR ipa/81128), return EXIT_UNSUPPORTED for GCC 5.4 or older.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
GCC 4.9 issues an error when generating misc/check-installed-headers-c.out:
In file included from ../signal/signal.h:328:0,
from ../include/signal.h:2,
from ../misc/sys/param.h:28,
from ../include/sys/param.h:1,
from /tmp/cih_test_e156ZB.c:10:
../include/bits/sigstksz.h:5:7: error: "IS_IN" is not defined [-Werror=undef]
#elif IS_IN (libsupport)
^
Use "#else" instead.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
GCC 4.9 doesn't define __STDC_VERSION__ and issues an error:
In file included from ../include/regex.h:2:0,
from ../posix/re_comp.h:23,
from ../include/re_comp.h:1,
from /tmp/cih_test_7IKTRI.c:10:
../posix/regex.h:650:19: error: "__STDC_VERSION__" is not defined [-Werror=undef]
# elif 199901L <= __STDC_VERSION__ || defined restrict
^
Use "#else" instead.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
Since assert/tst-assert-c++.cc fails to compile with GCC 4.9:
./tst-assert-c++.cc: In function ‘constexpr int check_constexpr()’:
./tst-assert-c++.cc:30:1: error: body of constexpr function ‘constexpr int check_constexpr()’ not a return-statement
}
^
return EXIT_UNSUPPORTED for GCC 4.9 or older.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
Add braces to silence GCC 4.9 or older:
getaddrinfo.c: In function ‘gaih_inet’:
getaddrinfo.c:1135:24: error: missing braces around initializer [-Werror=missing-braces]
/ sizeof (struct gaih_typeproto)] = {0};
^
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
Since GCC 4.9 doesn't support __builtin_mul_overflow:
tst-fd_to_filename.c: In function ‘check_ranges’:
tst-fd_to_filename.c:51:3: error: implicit declaration of function ‘__builtin_mul_overflow’ [-Werror=implicit-function-declaration]
while (!__builtin_mul_overflow (power, base, &power));
^
cc1: all warnings being treated as errors
return EXIT_UNSUPPORTED for GCC 4.9 or older.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
Since ATOMIC_INT_LOCK_FREE in GCC 4.9 is defined as
#define ATOMIC_INT_LOCK_FREE \
__atomic_type_lock_free (atomic_int)
GCC 4.9 fails to compile tst-minsigstksz-1.c:
tst-minsigstksz-1.c:45:6: error: missing binary operator before token "("
# if ATOMIC_INT_LOCK_FREE != 2
^
Change tst-minsigstksz-1.c to define TEST_ATOMIC_OPS to 0 for GCC 4.9 or
older.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
Since GCC 4.9 issues an error:
In file included from inl-tester.c:6:0:
tester.c:58:1: error: unknown option after ‘#pragma GCC diagnostic’ kind [-Werror=pragmas]
DIAG_IGNORE_NEEDS_COMMENT (5.0, "-Wmemset-transposed-args");
^
use it for GCC 5 or newer.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
Add test to check xcheck rule so that TEST_CC and TEST_CXX are used for
"make test".
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
Since the C++ compiler is also used to compile links-dso-program.cc in
libsupport, use TEST_CXX to get C++ headers for testing, but don't use
TEST_CXX as CXX for build.
Tested for m68k-linux-gnu-coldfire build and native build on x86-64.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
If an executable is static PIE and has a non-zero load address
(compare to elf/tst-pie-address-static), it segfaults as
elf_machine_load_address() returns 0x0 and elf_machine_dynamic()
returns the run-time instead of link-time address of _DYNAMIC.
Now rely on __ehdr_start and _DYNAMIC as also done on other
architectures.
Checked back to old arch-levels that this approach works fine:
- 31bit: -march=g5
- 64bit: -march=z900
Note, that there is no static-PIE support on 31bit, but this
approach cleans it also up.
Furthermore this cleanup in glibc does not change anything
regarding the first GOT-element as the s390 ABI
(https://github.com/IBM/s390x-abi) explicitely defines:
The doubleword at _GLOBAL_OFFSET_TABLE_[0] is set by the linkage
editor to hold the address of the dynamic structure, referenced
with the symbol _DYNAMIC. This allows a program, such as the dynamic
linker, to find its own dynamic structure without having yet processed
its relocation entries. This is especially important for the dynamic
linker, because it must initialize itself without relying on other
programs to relocate its memory image.
This will be required by the rseq extensible ABI implementation on all
Linux architectures exposing the '__rseq_size' and '__rseq_offset'
symbols to set the initial value of the 'cpu_id' field which can be used
by applications to test if rseq is available and registered. As long as
the symbols are exposed it is valid for an application to perform this
test even if rseq is not yet implemented in libc for this architecture.
Compile tested with build-many-glibcs.py but I don't have access to any
hardware to run the tests.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Stafford Horne <shorne@gmail.com>
C23 adds various <math.h> function families originally defined in TS
18661-4. Add the atan2pi functions (atan2(y,x)/pi).
Tested for x86_64 and x86, and with build-many-glibcs.py.
Since the C++ compiler is used only for testing, use TEST_CXX as the C++
compiler if available. If C++ link test fails, clear both CXX and
TEST_CXX so that the C++ compiler isn't used for glibc build nor test.
Tested for m68k-linux-gnu-coldfire build and native build on x86-64.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
Since libm doesn't export __XXX math functions, don't declare them in
the installed math.h by adding <bits/mathcalls-macros.h> to declare
__XXX math functions internally for glibc build. This fixes BZ #32418.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
Optimize the bsearch() function to improve binary search performance.
Although the code size grew by 8 bytes, the new implementation achieves
a 15% reduction in execution time on my x86 machine, according to the
bench-bsearch benchmark results.
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
Reviewed-by: DJ Delorie <dj@redhat.com>
Introduce a benchmark test for the bsearch function to evaluate its
performance.
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
Reviewed-by: DJ Delorie <dj@redhat.com>
C23 adds various <math.h> function families originally defined in TS
18661-4. Add the atanpi functions (atan(x)/pi).
Tested for x86_64 and x86, and with build-many-glibcs.py.
A plain indirect function call does not work on POWER because
success and failure are signaled through a flag register, and
not via the usual Linux negative return value convention.
This has potential security impact, in two ways: the return value
could be out of bounds (EAGAIN is 11 on powerpc6le), and no
random bytes have been written despite the non-error return value.
Fixes commit 461cab1de7 ("linux: Add
support for getrandom vDSO").
Reported-by: Ján Stanček <jstancek@redhat.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Support testing glibc build with a different C compiler or a different
C++ compiler with
$ ../glibc-VERSION/configure TEST_CC="gcc-6.4.1" TEST_CXX="g++-6.4.1"
1. Add LIBC_TRY_CC_AND_TEST_CC_OPTION, LIBC_TRY_CC_AND_TEST_CC_COMMAND
and LIBC_TRY_CC_AND_TEST_LINK to test both CC and TEST_CC.
2. Add check and xcheck targets to Makefile.in and override build compiler
options with ones from TEST_CC and TEST_CXX.
Tested on Fedora 41/x86-64:
1. Building with GCC 14.2.1 and testing with GCC 6.4.1 and GCC 11.2.1.
2. Building with GCC 15 and testing with GCC 6.4.1.
Support for GCC versions older than GCC 6.2 may need to change the test
sources. Other targets may need to update configure.ac under sysdeps and
modify Makefile.in to override target build compiler options.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
This commit add tcache support in calloc() which can largely improve
the performance of small size allocation, especially in multi-thread
scenario. tcache_available() and tcache_try_malloc() are split out as
a helper function for better reusing the code.
Also fix tst-safe-linking failure after enabling tcache. In previous,
calloc() is used as a way to by-pass tcache in memory allocation and
trigger safe-linking check in fastbins path. With tcache enabled, it
needs extra workarounds to bypass tcache.
Result of bench-calloc-thread benchmark
Test Platform: Xeon-8380
Ratio: New / Original time_per_iteration (Lower is Better)
Threads# | Ratio
-----------|------
1 thread | 0.656
4 threads | 0.470
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
C23 adds various <math.h> function families originally defined in TS
18661-4. Add the asinpi functions (asin(x)/pi).
Tested for x86_64 and x86, and with build-many-glibcs.py.
GCC 15 introduces allocation dead code removal (DCE) for PR117370 in
r15-5255-g7828dc070510f8. This breaks various glibc tests which want
to assert various properties of the allocator without doing anything
obviously useful with the allocated memory.
Alexander Monakov rightly pointed out that we can and should do better
than passing -fno-malloc-dce to paper over the problem. Not least because
GCC 14 already does such DCE where there's no testing of malloc's return
value against NULL, and LLVM has such optimisations too.
Handle this by providing malloc (and friends) wrappers with a volatile
function pointer to obscure that we're calling malloc (et. al) from the
compiler.
Reviewed-by: Paul Eggert <eggert@cs.ucla.edu>
C23 adds various <math.h> function families originally defined in TS
18661-4. Add the acospi functions (acos(x)/pi).
Tested for x86_64 and x86, and with build-many-glibcs.py.
Add ROP protection for the getcontext, setcontext, makecontext, swapcontext
and __sigsetjmp_symbol functions.
Reviewed-by: Peter Bergner <bergner@linux.ibm.com>
This will be required by the rseq extensible ABI implementation on all
Linux architectures exposing the '__rseq_size' and '__rseq_offset'
symbols to set the initial value of the 'cpu_id' field which can be used
by applications to test if rseq is available and registered. As long as
the symbols are exposed it is valid for an application to perform this
test even if rseq is not yet implemented in libc for this architecture.
Compile tested with build-many-glibcs.py but I don't have access to any
hardware to run the tests.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Reviewed-by: Arjun Shankar <arjun@redhat.com>
This will be required by the rseq extensible ABI implementation on all
Linux architectures exposing the '__rseq_size' and '__rseq_offset'
symbols to set the initial value of the 'cpu_id' field which can be used
by applications to test if rseq is available and registered. As long as
the symbols are exposed it is valid for an application to perform this
test even if rseq is not yet implemented in libc for this architecture.
Both code paths tested on a Visionfive 2 with Debian sid.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Reviewed-by: Palmer Dabbelt <palmer@rivosinc.com>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Enable RSEQ for RISC-V, support was added in Linux 5.18.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Reviewed-by: Palmer Dabbelt <palmer@rivosinc.com>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Add inline helper for expm1 and rearrange operations so MOV
is not necessary in reduction or around the special-case handler.
Reduce memory access by using more indexed MLAs in polynomial.
Speedup on Neoverse V1 for expm1 (19%), sinh (8.5%), and tanh (7.5%).
Add inline helper for log1p and rearrange operations so MOV
is not necessary in reduction or around the special-case handler.
Reduce memory access by using more indexed MLAs in polynomial.
Speedup on Neoverse V1 for log1p (3.5%), acosh (7.5%) and atanh (10%).
Remove spurious ADRP and a few MOVs.
Reduce memory access by using more indexed MLAs in polynomial.
Align notation so that algorithms are easier to compare.
Speedup on Neoverse V1 for log10 (8%), log (8.5%), and log2 (10%).
Update error threshold in AdvSIMD log (now matches SVE log).
Remove spurious ADRP. Improve memory access by shuffling constants and
using more indexed MLAs.
A few more optimisation with no impact on accuracy
- force fmas contraction
- switch from shift-aided rint to rint instruction
Between 1 and 5% throughput improvement on Neoverse
V1 depending on benchmark.
Since internal tests don't have access to internal symbols in libm,
exclude them for internal tests. Also make tst-strtod5 and tst-strtod5i
depend on $(libm) to support older versions of GCC which can't inline
copysign family functions. This fixes BZ #32414.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sunil K Pandey <skpgkp2@gmail.com>
Remove
AC_SUBST(libc_cv_mtls_descriptor)
since there is no @libc_cv_mtls_descriptor@ and there is
LIBC_CONFIG_VAR([have-mtls-descriptor], [$libc_cv_mtls_descriptor])
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
C23 adds various <math.h> function families originally defined in TS
18661-4. Add the tanpi functions (tan(pi*x)).
Tested for x86_64 and x86, and with build-many-glibcs.py.
The postclean-generated setting in elf/Makefile lists
$(objpfx)/dso-sort-tests-2.generated-makefile twice and
$(objpfx)/dso-sort-tests-1.generated-makefile not at all, which looks
like a typo; fix it to list each once.
Tested for x86_64.
Update i686 libm-test-ulps to fix
FAIL: math/test-float64x-cospi
FAIL: math/test-float64x-sinpi
FAIL: math/test-ldouble-cospi
FAIL: math/test-ldouble-sinpi
when building glibc with GCC 7.4.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Add an additional test of TLS variables, with different alignment,
accessed from different modules. The idea of the alignment test is
similar to tst-tlsalign and the same code is shared for setting up
test variables, but unlike the tst-tlsalign code, there are multiple
threads and variables are accessed from multiple objects to verify
that they get a consistent notion of the address of an object within a
thread. Threads are repeatedly created and shut down to verify proper
initialization in each new thread. The test is also repeated with TLS
descriptors when supported. (However, only initial-exec TLS is
covered in this test.)
Tested for x86_64.
There already was a branch checking for this case in _hurd_fd_read ()
when the data is returned out-of-line. Do the same for inline data, as
well as for _hurd_fd_write (). It's also not possible for the length to
be negative, since it's stored in an unsigned integer.
Not verifying the returned length can confuse the callers who assume
the returned length is always reasonable. This manifested as libzstd
test suite failing on writes to /dev/zero, even though the write () call
appeared to succeed. In fact, the zero store backing /dev/zero was
returning a larger written length than the size actually submitted to
it, which is a separate bug to be fixed on the Hurd side. With this
patch, EGRATUITOUS is now propagated to the caller.
Reported-by: Diego Nieto Cid <dnietoc@gmail.com>
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-ID: <20241204112915.540032-1-bugaevc@gmail.com>
Fix variables in Makefiles:
1. There is a tab, not a space, between "variable" and =, +=, :=.
2. The last entry doesn't have a trailing \.
and sort them.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
C23 adds various <math.h> function families originally defined in TS
18661-4. Add the sinpi functions (sin(pi*x)).
Tested for x86_64 and x86, and with build-many-glibcs.py.
C23 adds various <math.h> function families originally defined in TS
18661-4. Add the cospi functions (cos(pi*x)).
Tested for x86_64 and x86, and with build-many-glibcs.py.
Add calloc-clear-memory.h to clear memory size up to 36 bytes (72 bytes
on 64-bit targets) for calloc. Use repeated stores with 1 branch, instead
of up to 3 branches. On x86-64, it is faster than memset since calling
memset needs 1 indirect branch, 1 broadcast, and up to 4 branches.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Esperanto, as an international language and a bit of a non-locale,
usually defaults to international consensus. In this commit, I make the
Esperanto locale more in line with ISO 8601 by setting the first day as
Monday, and the first week as containing January 4.
Closes: BZ #32323
Signed-off-by: Carmen Bianca BAKKER <carmen@carmenbianca.eu>
Reviewed-by: Mike FABIAN <mfabian@redhat.com>
This does not describe how to use RTLD_DI_ORIGIN and l_name
to reconstruct a full path for the an object. The reason
is that I think we should not recommend further use of
RTLD_DI_ORIGIN due to its buffer overflow potential (bug 24298).
This should be covered by another dlinfo extension. It would
also obsolete the need for the dladdr approach to obtain
the file name for the main executable.
Obtaining the lowest address from load segments in program
headers is quite clumsy and should be provided directly
via dlinfo.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Add tests that the dynamic linker works correctly with symbol names
involving hash collisions, for both choices of hash style (and
--hash-style=both as well). I note that there weren't actually any
previous tests using --hash-style (so tests would only cover the
default linker configuration in that regard). Also test symbol
versions involving hash collisions.
Tested for x86_64.
Add a threaded test for pthread_spin_trylock attempting to lock already
acquired spin lock and checking for correct return code.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Large chunks get added to the unsorted bin since
sorting them takes time, for small chunks the
benefit of adding them to the unsorted bin is
non-existant, actually hurting performance.
Splitting and malloc_consolidate still add small
chunks to unsorted, but we can hint the compiler
that that is a relatively rare occurance.
Benchmarking shows this to be consistently good.
Authored-by: k4lizen <k4lizen@proton.me>
Signed-off-by: Aleksa Siriški <sir@tmina.org>
Remove ZVA 128 support from memset - the new memset no longer
guarantees count >= 256, which can result in underflow and a
crash if ZVA size is 128 ([1]). Since only one CPU uses a ZVA
size of 128 and its memcpy implementation was removed in commit
e162ab2bf1, remove this special
case too.
[1] https://sourceware.org/pipermail/libc-alpha/2024-November/161626.html
Reviewed-by: Andrew Pinski <quic_apinski@quicinc.com>
Add a descriptive comment to the tst-pthread-cpuclockid-invalid test and
also drop pthread_getcpuclockid from the TODO-testing list since it now
has full coverage.
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
GCC 15 (e876acab6cdd84bb2b32c98fc69fb0ba29c81153) and binutils
(e7a16d9fd65098045ef5959bf98d990f12314111) both removed all Nios II
support, and the architecture has been EOL'ed by the vendor. The
kernel still has support, but without a proper compiler there
is no much sense in keep it on glibc.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Trivial cleanup to limit _IO_least_marker so that it's clear that it is
unused outside of genops.
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Tcache is an important optimzation to accelerate memory free(), things
within this code path should be kept as simple as possible. This commit
try to remove the function call when free() invokes tcache code path by
inlining _int_free().
Result of bench-malloc-thread benchmark
Test Platform: Xeon-8380
Ratio: New / Original time_per_iteration (Lower is Better)
Threads# | Ratio
-----------|------
1 thread | 0.879
4 threads | 0.874
The performance data shows it can improve bench-malloc-thread benchmark
by ~12% in both single thread and multi-thread scenario.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
Some CORE-MATH routines uses roundeven and most of ISA do not have
an specific instruction for the operation. In this case, the call
will be routed to generic implementation.
However, if the ISA does support round() and ctz() there is a better
alternative (as used by CORE-MATH).
This patch adds such optimization and also enables it on powerpc.
On a power10 it shows the following improvement:
expm1f master patched improvement
latency 9.8574 7.0139 28.85%
reciprocal-throughput 4.3742 2.6592 39.21%
Checked on powerpc64le-linux-gnu and aarch64-linux-gnu.
Reviewed-by: DJ Delorie <dj@redhat.com>
The constants themselves were added to elf.h back in 8754a4133e but the
array in _dl_show_auxv wasn't modified accordingly, resulting in the
following output when running LD_SHOW_AUXV=1 /bin/true on recent Linux:
AT_??? (0x1b): 0x1c
AT_??? (0x1c): 0x20
With this patch:
AT_RSEQ_FEATURE_SIZE: 28
AT_RSEQ_ALIGN: 32
Tested on Linux 6.11 x86_64
Signed-off-by: Yannick Le Pennec <yannick.lepennec@live.fr>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The test was added in commit ac8cc9e300
without all the required Makefile scaffolding. Tweak the test
so that it actually builds (including with dynamic SIGSTKSZ).
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
When adding explicit initialization of rseq fields prior to
registration, I glossed over the fact that 'cpu_id_start' is also
documented as initialized by user-space.
While current kernels don't validate the content of this field on
registration, future ones could.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Add ROP protect instructions to strncpy and ppc-mount functions.
Modify FRAME_MIN_SIZE to 48 bytes for ELFv2 to reserve additional
16 bytes for ROP save slot and padding.
Signed-off-by: Sachin Monga <smonga@linux.ibm.com>
Reviewed-by: Peter Bergner <bergner@linux.ibm.com>
The k>>31 in signgam = 1 - (((k&(k>>31))&1)<<1); is not portable:
* The ISO C standard says "If E1 has a signed type and a negative
value, the resulting value is implementation-defined." (this is
still in C23).
* If the int type is larger than 32 bits (e.g. a 64-bit type),
then k = INT_MAX; line 144 will make k>>31 put 1 in bit 0
(thus signgam will be -1) while 0 is expected.
Moreover, instead of the fx >= 0x1p31f condition, testing fx >= 0
is probably better for 2 reasons:
The signgam expression has more or less a condition on the sign
of fx (the goal of k>>31, which can be dropped with this new
condition). Since fx ≥ 0 should be the most common case, one can
get signgam directly in this case (value 1). And this simplifies
the expression for the other case (fx < 0).
This new condition may be easier/faster to test on the processor
(e.g. by avoiding a load of a constant from the memory).
This is commit d41459c731865516318f813cf4c966dafa0eecbf from CORE-MATH.
Checked on x86_64-linux-gnu.
Exercise the case where an exited thread will cause
pthread_getcpuclockid to fail.
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Test coverage of sem_getvalue is fairly limited. Add a test that runs
it on threads on each CPU. For this purpose I adapted
tst-skeleton-thread-affinity.c; it didn't seem very suitable to use
as-is or include directly in a different test doing things per-CPU,
but did seem a suitable starting point (thus sharing
tst-skeleton-affinity.c) for such testing.
Tested for x86_64.
The CORE-MATH implementation is correctly rounded (for any rounding mode)
and shows better performance to the generic tanf.
The code was adapted to glibc style, to use the definition of
math_config.h, to remove errno handling, and to use a generic
128 bit routine for ABIs that do not support it natively.
Benchtest on x64_64 (Ryzen 9 5900X, gcc 14.2.1), aarch64 (neoverse1,
gcc 13.2.1), and powerpc (POWER10, gcc 13.2.1):
latency master patched improvement
x86_64 82.3961 54.8052 33.49%
x86_64v2 82.3415 54.8052 33.44%
x86_64v3 69.3661 50.4864 27.22%
i686 219.271 45.5396 79.23%
aarch64 29.2127 19.1951 34.29%
power10 19.5060 16.2760 16.56%
reciprocal-throughput master patched improvement
x86_64 28.3976 19.7334 30.51%
x86_64v2 28.4568 19.7334 30.65%
x86_64v3 21.1815 16.1811 23.61%
i686 105.016 15.1426 85.58%
aarch64 18.1573 10.7681 40.70%
power10 8.7207 8.7097 0.13%
Signed-off-by: Alexei Sibidanov <sibid@uvic.ca>
Signed-off-by: Paul Zimmermann <Paul.Zimmermann@inria.fr>
Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: DJ Delorie <dj@redhat.com>
It is based on binary64 erfc-inputs, with random inputs in
[0,b=0x1.41bbf6p+3] where b in the smallest number such that
erfcf(b) rounds to 0 (to nearest).
Reviewed-by: DJ Delorie <dj@redhat.com>
It is based on binary64 erf-inputs, with random inputs in [0,b=0x1.f5a888p+1]
where b in the smallest number such that erff(b) rounds to 1 (to nearest).
Reviewed-by: DJ Delorie <dj@redhat.com>
For a static PIE with non-zero load address, its PT_DYNAMIC segment
entries contain the relocated values for the load address in static PIE.
Since static PIE usually doesn't have PT_PHDR segment, use p_vaddr of
the PT_LOAD segment with offset == 0 as the load address in static PIE
and adjust the entries of PT_DYNAMIC segment in static PIE by properly
setting the l_addr field for static PIE. This fixes BZ #31799.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
Async-signal-safety is preserved, too. In fact, getenv is fully
reentrant and can be called from the malloc call in setenv
(if a replacement malloc uses getenv during its initialization).
This is relatively easy to implement because even before this change,
setenv, unsetenv, clearenv, putenv do not deallocate the environment
strings themselves as they are removed from the environment.
The main changes are:
* Use release stores for environment array updates, following
the usual pattern for safely publishing immutable data
(in this case, the environment strings).
* Do not deallocate the environment array. Instead, keep older
versions around and adopt an exponential resizing policy. This
results in an amortized constant space leak per active environment
variable, but there already is such a leak for the variable itself
(and that is even length-dependent, and includes no-longer used
values).
* Add a seqlock-like mechanism to retry getenv if a concurrent
unsetenv is observed. Without that, it is possible that
getenv returns NULL for a variable that is never unset. This
is visible on some AArch64 implementations with the newly
added stdlib/tst-getenv-unsetenv test case. The mechanism
is not a pure seqlock because it tolerates one write from
unsetenv. This avoids the need for a second copy of the
environ array that getenv can read from a signal handler
that happens to interrupt an unsetenv call.
No manual updates are included with this patch because environ
usage with execve, posix_spawn, system is still not thread-safe
relative unsetenv. The new process may end up with an environment
that misses entries that were never unset. This is the same issue
described above for getenv.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The hardware architects have a new recommendation not to use
non-temporal load/stores for memset. This patch removes this path.
I found there was no difference in the memset speed with/without
non-temporal load/stores either.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The hardware architects have a new recommendation not to use
non-temporal load/stores for memcpy. This patch removes this path.
I found there was no difference in the memcpy speed with/without
non-temporal load/stores either.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The ROP instructions were added in ISA 3.1 (ie, Power10), however they
were defined so that if executed on older cpus, they would behave as
nops. This allows us to emit them on older cpus and they'd just be
ignored, but if run on a Power10, then the binary would be ROP protected.
Hash instructions use negative offsets so the default position
of ROP pointer is FRAME_ROP_SAVE from caller's SP.
Modified FRAME_MIN_SIZE_PARM to 112 for ELFv2 to reserve
additional 16 bytes for ROP save slot and padding.
Signed-off-by: Sachin Monga <smonga@linux.ibm.com>
Reviewed-by: Peter Bergner <bergner@linux.ibm.com>
This is an addendum to commit b7b52b9dec ("error, error_at_line: Add
missing va_end calls"), which added the va_end calls in the callers where
they belong.
Describe AArch64 specific flags PKEY_DISABLE_READ and PKEY_DISABLE_EXECUTE that
are available on AArch64 systems with enabled Stage 1 permission overlays
feature introduced in Armv8.9 / 9.4 (FEAT_S1POE).
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This patch adds support for memory protection keys on AArch64 systems with
enabled Stage 1 permission overlays feature introduced in Armv8.9 / 9.4
(FEAT_S1POE) [1].
1. Internal functions "pkey_read" and "pkey_write" to access data
associated with memory protection keys.
2. Implementation of API functions "pkey_get" and "pkey_set" for
the AArch64 target.
3. AArch64-specific PKEY flags for READ and EXECUTE (see below).
4. New target-specific test that checks behaviour of pkeys on
AArch64 targets.
5. This patch also extends existing generic test for pkeys.
6. HWCAP constant for Permission Overlay Extension feature.
To support more accurate mapping of underlying permissions to the
PKEY flags, we introduce additional AArch64-specific flags. The full
list of flags is:
- PKEY_UNRESTRICTED: 0x0 (for completeness)
- PKEY_DISABLE_ACCESS: 0x1 (existing flag)
- PKEY_DISABLE_WRITE: 0x2 (existing flag)
- PKEY_DISABLE_EXECUTE: 0x4 (new flag, AArch64 specific)
- PKEY_DISABLE_READ: 0x8 (new flag, AArch64 specific)
The problem here is that PKEY_DISABLE_ACCESS has unusual semantics as
it overlaps with existing PKEY_DISABLE_WRITE and new PKEY_DISABLE_READ.
For this reason mapping between permission bits RWX and "restrictions"
bits awxr (a for disable access, etc) becomes complicated:
- PKEY_DISABLE_ACCESS disables both R and W
- PKEY_DISABLE_{WRITE,READ} disables W and R respectively
- PKEY_DISABLE_EXECUTE disables X
Combinations like the one below are accepted although they are redundant:
- PKEY_DISABLE_ACCESS | PKEY_DISABLE_READ | PKEY_DISABLE_WRITE
Reverse mapping tries to retain backward compatibility and ORs
PKEY_DISABLE_ACCESS whenever both flags PKEY_DISABLE_READ and
PKEY_DISABLE_WRITE would be present.
This will break code that compares pkey_get output with == instead
of using bitwise operations. The latter is more correct since PKEY_*
constants are essentially bit flags.
It should be noted that PKEY_DISABLE_ACCESS does not prevent execution.
[1] https://developer.arm.com/documentation/ddi0487/ka/ section D8.4.1.4
Co-authored-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
ThunderX1 and ThunderX2 have been retired for a few years now.
So let's remove the thunderx{,2} specific versions of memcpy.
The performance gain or them was for medium and large sizes
while the generic (aarch64) memcpy will handle just slightly worse.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Two of the architecture bits/fenv.h headers define femode_t if
__GLIBC_USE (IEC_60559_BFP_EXT), instead of the correct condition
__GLIBC_USE (IEC_60559_BFP_EXT_C23) (both were added after commit
0175c9e9be, but were probably first
developed before it and then not updated to take account of its
changes). This results in failures of the installed headers check for
fenv.h when building with GCC 15 (defaults to -std=gnu23 - we don't
yet have an installed-headers test specifically for C23 mode and don't
yet require a compiler with such a mode for building glibc) together
with a combination of options leaving C23 features enabled, since the
declarations of functions using femode_t use the correct conditions;
see
<https://sourceware.org/pipermail/libc-testresults/2024q4/013163.html>.
Fix the conditionals to get <fenv.h> to work correctly in C23 mode
again.
Tested with build-many-glibcs.py (arc-linux-gnu, arch-linux-gnuhf,
or1k-linux-gnu-hard, or1k-linux-gnu-soft).
Update the inline asm syscall wrappers to match the newer register constraint
usage in INTERNAL_VSYSCALL_CALL_TYPE. Use the faster mfocrf instruction when
available, rather than the slower mfcr microcoded instruction.
Nameless function parameters have only been added to ISO C with the C23
revision of the language standard. Give names to the unused parameters
of the stub 'dladdr' implementation then so as to make compilation happy
with the earlier language definitions, fixing errors such as:
tst-printf-format-skeleton.c:374:9: error: parameter name omitted
374 | dladdr (const void *, Dl_info *)
reported by older compilers.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
The remaining_to_add variable can be 0 if (current_used + count) wraps,
This is caught by GCC 14+ on hppa, which determines from there that
target_seg could be be NULL when remaining_to_add is zero, which in
turns causes a -Wstringop-overflow warning:
In file included from ../include/atomic.h:49,
from dl-find_object.c:20:
In function '_dlfo_update_init_seg',
inlined from '_dl_find_object_update_1' at dl-find_object.c:689:30,
inlined from '_dl_find_object_update' at dl-find_object.c:805:13:
../sysdeps/unix/sysv/linux/hppa/atomic-machine.h:44:4: error: '__atomic_store_4' writing 4 bytes into a region of size 0 overflows the destination [-Werror=stringop-overflow=]
44 | __atomic_store_n ((mem), (val), __ATOMIC_RELAXED); \
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
dl-find_object.c:644:3: note: in expansion of macro 'atomic_store_relaxed'
644 | atomic_store_relaxed (&seg->size, new_seg_size);
| ^~~~~~~~~~~~~~~~~~~~
In function '_dl_find_object_update':
cc1: note: destination object is likely at address zero
In practice, this is not possible as it represent counts of link maps.
Link maps have sizes larger than 1 byte, so the sum of any two link map
counts will always fit within a size_t without wrapping around.
This patch therefore adds a check on remaining_to_add == 0 and tell GCC
that this can not happen using __builtin_unreachable.
Thanks to Andreas Schwab for the investigation.
Closes: BZ #32245
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Tested-by: John David Anglin <dave.anglin@bell.net>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Linux 6.11 has getrandom() in vDSO. It operates on a thread-local opaque
state allocated with mmap using flags specified by the vDSO.
Multiple states are allocated at once, as many as fit into a page, and
these are held in an array of available states to be doled out to each
thread upon first use, and recycled when a thread terminates. As these
states run low, more are allocated.
To make this procedure async-signal-safe, a simple guard is used in the
LSB of the opaque state address, falling back to the syscall if there's
reentrancy contention.
Also, _Fork() is handled by blocking signals on opaque state allocation
(so _Fork() always sees a consistent state even if it interrupts a
getrandom() call) and by iterating over the thread stack cache on
reclaim_stack. Each opaque state will be in the free states list
(grnd_alloc.states) or allocated to a running thread.
The cancellation is handled by always using GRND_NONBLOCK flags while
calling the vDSO, and falling back to the cancellable syscall if the
kernel returns EAGAIN (would block). Since getrandom is not defined by
POSIX and cancellation is supported as an extension, the cancellation is
handled as 'may occur' instead of 'shall occur' [1], meaning that if
vDSO does not block (the expected behavior) getrandom will not act as a
cancellation entrypoint. It avoids a pthread_testcancel call on the fast
path (different than 'shall occur' functions, like sem_wait()).
It is currently enabled for x86_64, which is available in Linux 6.11,
and aarch64, powerpc32, powerpc64, loongarch64, and s390x, which are
available in Linux 6.12.
Link: https://pubs.opengroup.org/onlinepubs/9799919799/nframe.html [1]
Co-developed-by: Jason A. Donenfeld <Jason@zx2c4.com>
Tested-by: Jason A. Donenfeld <Jason@zx2c4.com> # x86_64
Tested-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> # x86_64, aarch64
Tested-by: Xi Ruoyao <xry111@xry111.site> # x86_64, aarch64, loongarch64
Tested-by: Stefan Liebler <stli@linux.ibm.com> # s390x
Add a new test tst-faccessat-setuid that iterates through real and
effective UID/GID combination and tests the faccessat() interface for
default and AT_EACCESS flags.
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Use libsupport convenience functions and macros instead of the old
test-skeleton. Also add a new xdup() convenience wrapper function.
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Add xdup as the error-checking version of dup for test cases.
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
When building with e.g. -std=c99 and _ATFILE_SOURCE, stat.h was missing
including bits/types/struct_timespec.h to get the struct timespec
declaration for utimensat.
For instance, 1073741906 leads to system 16, subsystem 0 and code 82,
which is in range (max_code is 122), but not defined. Return EINVAL in
that case, like
A static analyzer apparently reported an uninitialized use of the
variable result in sem_open in the case where the file is required to
exist but does not exist.
The report appears to be correct; set result to SEM_FAILED in that
case, and add a test for it.
Note: the test passes for me even without the sem_open fix, I guess
because result happens to get value SEM_FAILED (i.e. 0) when
uninitialized.
Tested for x86_64.
Per the rseq syscall documentation, 3 fields are required to be
initialized by userspace prior to registration, they are 'cpu_id',
'rseq_cs' and 'flags'. Since we have no guarantee that 'struct pthread'
is cleared on all architectures, explicitly set those 3 fields prior to
registration.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
The declaration of found_other_class could be jumped
over via the goto just above it, but the code jumped
to uses found_other_class. Move the declaration
up a bit to ensure it's properly declared and initialized.
The commit 9247f53219 triggered some regressions on loongarch and
riscv:
math/test-float-log10
math/test-float32-log10
And it is due a wrong sync with CORE-MATH for special 0.0/-0.0
inputs.
Checked on aarch64-linux-gnu and loongarch64-linux-gnu-lp64d.
Wire vasprintf into test infrastructure for formatted printf output
specifiers.
Owing to mtrace logging these tests take amounts of time to complete
similar to those of corresponding asprintf tests, so set timeouts for
the tests accordingly, with a global default for all the vasprintf
tests, and then individual higher settings for double and long double
tests each.
Reviewed-by: DJ Delorie <dj@redhat.com>
Wire asprintf into test infrastructure for formatted printf output
specifiers.
Owing to mtrace logging of lots of memory allocation calls these tests
take a considerable amount of time to complete, except for the character
conversion, taking from 00m20s for 'tst-printf-format-as-s --direct s',
through 01m10s and 03m53s for 'tst-printf-format-as-char --direct i' and
'tst-printf-format-as-double --direct f' respectively, to 19m24s for
'tst-printf-format-as-ldouble --direct f', all in standalone execution
from NFS on a RISC-V FU740@1.2GHz system and with output redirected over
100Mbps network via SSH. It is with the skeleton's stub implementation
of dladdr(3); execution times with regular dladdr(3) are up to over
twice longer.
Set timeouts for the tests accordingly then, with a global default for
all the asprintf tests, and then individual higher settings for double
and long double tests each.
Reviewed-by: DJ Delorie <dj@redhat.com>
This is a collection of tests for formatted printf output specifiers
covering the d, i, o, u, x, and X integer conversions, the e, E, f, F,
g, and G floating-point conversions, the c character conversion, and the
s string conversion. Also the hh, h, l, and ll length modifiers are
covered with the integer conversions as is the L length modifier with
the floating-point conversions.
The -, +, space, #, and 0 flags are iterated over, as permitted by the
conversion handled, in tuples of 1..5, including tuples with repetitions
of 2, and combined with field width and/or precision, again as permitted
by the conversion. The resulting format string is then used to produce
output from respective sets of input data corresponding to the specific
conversion under test. POSIX extensions beyond ISO C are not used.
Output is produced in the form of records which include both the format
string (and width and/or precision where given in the form of separate
arguments) and the conversion result, and is verified with GNU AWK using
the format obtained from each such record against the reference value
also supplied, relying on the fact that GNU AWK has its own independent
implementation of format processing, striving to be ISO C compatible.
In the course of implementation I have determined that in the non-bignum
mode GNU AWK uses system sprintf(3) for the floating-point conversions,
defeating the objective of doing the verification against an independent
implementation. Additionally the bignum mode (using MPFR) is required
to correctly output wider integer and floating-point data. Therefore
for the conversions affected the relevant shell scripts sanity-check AWK
and terminate with unsupported status if the bignum mode is unavailable
for floating-point data or where data is output incorrectly.
The f and F floating-point conversions are build-time options for GNU
AWK, depending on the environment, so they are probed for before being
used. Similarly the a and A floating-point conversions, however they
are currently not used, see below. Also GNU AWK does not handle the b
or B integer conversions at all at the moment, as at 5.3.0. Support for
the a, A, b, and B conversions can however be easily added following the
approach taken for the f and F conversions.
Output produced by gawk for the a and A floating-point conversions does
not match one produced by us: insufficient precision is used where one
hasn't been explicitly given, e.g. for the negated maximum finite IEEE
754 64-bit value of -1.79769313486231570814527423731704357e+308 and "%a"
format we produce -0x1.fffffffffffffp+1023 vs gawk's -0x1.000000p+1024
and a different exponent is chosen otherwise, such as with "%.a" where
we output -0x2p+1023 vs gawk's -0x1p+1024 for the same value, or "%.20a"
where -0x1.fffffffffffff0000000p+1023 is our output, but gawk produces
-0xf.ffffffffffff80000000p+1020 instead. Consequently I chose not to
include a and A conversions in testing at this time.
And last but not least there are numerous corner cases that GNU AWK does
not handle correctly, which are worked around by explicit handling in
the AWK script. These are in particular:
- extraneous leading 0 produced for the alternative form with the o
conversion, e.g. { printf "%#.2o", 1 } produces "001" rather than
"01",
- unexpected 0 produced where no characters are expected for the input
of 0 and the alternative form with the precision of 0 and the integer
hexadecimal conversions, e.g. { printf "%#.x", 0 } produces "0" rather
than "",
- missing + character in the non-bignum mode only for the input of 0
with the + flag, precision of 0 and the signed integer conversions,
e.g. { printf "%+.i", 0 } produces "" rather than "+",
- missing space character in the non-bignum mode only for the input of 0
with the space flag, precision of 0 and the signed integer
conversions, e.g. { printf "% .i", 0 } produces "" rather than " ",
- for released gawk versions of up to 4.2.1 missing - character for the
input of -NaN with the floating-point conversions, e.g. { printf "%e",
"-nan" }' produces "nan" rather than "-nan",
- for released gawk versions from 5.0.0 onwards + character output for
the input of -NaN with the floating-point conversions, e.g. { printf
"%e", "-nan" }' produces "+nan" rather than "-nan",
- for released gawk versions from 5.0.0 onwards + character output for
the input of Inf or NaN in the absence of the + or space flags with
the floating-point conversions, e.g. { printf "%e", "inf" }' produces
"+inf" rather than "inf",
- for released gawk versions of up to 4.2.1 missing + character for the
input of Inf or NaN with the + flag and the floating-point
conversions, e.g. { printf "%+e", "inf" }' produces "inf" rather than
"+inf",
- for released gawk versions of up to 4.2.1 missing space character for
the input of Inf or NaN with the space flag and the floating-point
conversions, e.g. { printf "% e", "nan" }' produces "nan" rather than
" nan",
- for released gawk versions from 5.0.0 onwards + character output for
the input of Inf or NaN with the space flag and the floating-point
conversions, e.g. { printf "% e", "inf" }' produces "+inf" rather than
" inf",
- for released gawk versions from 5.0.0 onwards the field width is
ignored for the input of Inf or NaN and the floating-point
conversions, e.g. { printf "%20e", "-inf" }' produces "-inf" rather
than " -inf",
NB for released gawk versions of up to 4.2.1 floating-point conversion
issues apply to the bignum mode only, as in the non-bignum mode system
sprintf(3) is used. As from version 5.0.0 specialized handling has been
added for [-]Inf and [-]NaN inputs and the issues listed apply to both
modes. The '--posix' flag makes gawk versions from 5.0.0 onwards avoid
the issue with field width and the + character unconditionally output
for the input of Inf or NaN, however not the remaining issues and then
the 'gensub' function is not supported in the POSIX mode, so to go this
path I deemed not worth it.
Each test completes within single seconds except for the long double
one. There the F/f formats produce a large number of digits, which
appears to be computationally intensive and CPU-bound. Standalone
execution time for 'tst-printf-format-p-ldouble --direct f' is in the
range of 00m36s for POWER9@2.166GHz and 09m52s for FU740@1.2GHz and
output redirected locally to /dev/null, and 10m11s for FU740 and output
redirected over 100Mbps network via SSH to /dev/null, so the throughput
of the network adds very little (~3.2% in this case) to the processing
time. This is with IEEE 754 quad.
So I have scaled the timeout for 'tst-printf-format-skeleton-ldouble'
accordingly. Regardless, following recent practice the test has been
added to the standard rather than extended set. However, unlike most
of the remaining tests it has been split by the conversion specifier,
so as to allow better parallelization of this long-running test. As
a side effect this lets the test report the unsupported status for the
F/f conversions where applicable, so 'tst-printf-format-p-double' has
been split for consistency as well.
Only printf itself is handled at the moment, but the infrastructure
provides for all the printf family functions to be verified, changes
for which to be supplied separately. The complication around having
some tests iterating over all the relevant conversion specifiers and
other verifying conversion specifiers individually combined with
iterating over printf family functions has hit a peculiarity in GNU
make where the use of multiple targets with a pattern rule is handled
differently from such use with an ordinary rule. Consequently it
seems impossible to bulk-define a pattern rule using '$(foreach ...)',
where each target would simply trigger the recipe according to the
pattern and matching dependencies individually (such a rule does work,
but implies all targets to be updated with a single recipe execution).
Therefore as a compromise a single single-target pattern rule has been
defined that has listed all the conversion-specific scripts and all the
test executables as dependencies. Consequently tests will be rerun in
the absence of changes to their actual sources or scripts whenever an
unrelated file has changed that has been listed. Also all the formatted
printf output tests will always be built whenever any single one is to
be run. This only affects test development and not test runs in the
field, though it does change the order of execution of the individual
steps and also acts as a Makefile barrier in parallel runs. As the
execution time dominates the compilation time for these tests it is not
seen as a serious shortcoming.
As pointed out by Florian Weimer <fweimer@redhat.com> the malloc tracing
facility can take a substantial amount of time in calling dladdr(3) to
determine the caller's location. This is not needed by the verification
made with these tests, so I chose to interpose the symbol with a stub
implementation that always fails in the shared skeleton. We have total
control over the test environment, so I think it is a safe and minimal
impact approach. If there's ever anything else added to the tests that
would actually rely on dladdr(3) returning usable results, only then we
can think of a different approach.
Reviewed-by: DJ Delorie <dj@redhat.com>
On GCC 11 (x86-64), the previous code produced test failures like
this one:
Failure: Test: exp10m1_towardzero (-0x1.1p+4)
Result:
is: -1.00000000e+00 -0x1.000000p+0
should be: -9.99999940e-01 -0x1.fffffep-1
difference: 5.96046447e-08 0x1.000000p-24
ulp : 1.0000
max.ulp : 0.0000
Apply a similar fix to exp2m1f.
Co-authored-by: Paul Zimmermann <Paul.Zimmermann@inria.fr>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Change name of the access_rights argument to access_restrictions
of the following functions:
- pkey_alloc()
- pkey_set()
as this argument refers to access restrictions rather than access
rights and previous name might have been misleading.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Update the name of the argument in several pkey_*() functions that refers
to access restrictions rather than access rights: change access "rights"
to access "restrictions".
Specify that the result of the pkey_get() should be checked using bitwise
operations rather than plain equals comparison.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Before commit ee1ada1bdb
("elf: Rework exception handling in the dynamic loader
[BZ #25486]"), the previous order called the main calloc
to allocate a shadow GOT/PLT array for auditing support.
This happened before libc.so.6 ELF constructors were run, so
a user malloc could run without libc.so.6 having been
initialized fully. One observable effect was that
environ was NULL at this point.
It does not seem to be possible at present to trigger such
an allocation, but it seems more robust to delay switching
to main malloc after ld.so self-relocation is complete.
The elf/tst-rtld-no-malloc-audit test case fails with a
2.34-era glibc that does not have this fix.
Reviewed-by: DJ Delorie <dj@redhat.com>
For a long time, libc.so.6 has dependend on ld.so, which
means that there is a reference to ld.so in all processes,
and rtld_multiple_ref is always true. In fact, if
rtld_multiple_ref were false, some of the ld.so setup code
would not run.
Reviewed-by: DJ Delorie <dj@redhat.com>
Linux 3.15 and 6.2 added HWCAP2_* values for Arm. These bits have
already been added to dl-procinfo.{c,h} in commits 9aea0cb842 and
8ebe9c0b38. Also add them to <bits/hwcap.h> so that they can be used
in user code. For example, for checking bits in the value returned by
getauxval(AT_HWCAP2).
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Yury Khrustalev <yury.khrustalev@arm.com>
This patch starts preparation for C2Y support in glibc headers by
adding a feature test macro _ISOC2Y_SOURCE and corresponding
__GLIBC_USE (ISOC2Y). (I mostly copied the work of Joseph Myers
for C2X). As with other such macros, C2Y features are also
enabled by compiling for a standard newer than C23, or by using
_GNU_SOURCE.
This patch does not itself enable anything new in the headers for C2Y;
that is to be done in followup patches. (For example an implementation
of WG14 N3349.)
Once C2Y becomes an actual standard we'll presumably move to using the
actual year in the feature test macro and __GLIBC_USE, with some
period when both macro spellings are accepted, as was done with
_ISOC2X_SOURCE.
Tested for x86_64.
Signed-off-by: Lenard Mollenkopf <glibc@lenardmollenkopf.de>
By using a combination of mask-and-add instead of the shift-based
index calculation the routines can share the same table as other
variants with no performance degradation.
The tables change name because of other changes in downstream AOR.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
The CORE-MATH exp2m1f implementation showed slight worse latency
when using x86_64 baseline ABI. This patch adds a ifunc variant
with similar performance for x86_64-v3.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
Reviewed-by: DJ Delorie <dj@redhat.com>
The CORE-MATH exp10m1f implementation showed slight worse latency
when using x86_64 baseline ABI. This patch adds a ifunc variant
with similar performance for x86_64-v3.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
Reviewed-by: DJ Delorie <dj@redhat.com>
The CORE-MATH implementation is correctly rounded (for any rounding mode)
and shows better performance compared to the generic exp2m1f.
The code was adapted to glibc style and to use the definition of
math_config.h (to handle errno, overflow, and underflow). The
only change is to handle FLT_MAX_EXP for FE_DOWNWARD or FE_TOWARDZERO.
The benchmark inputs are based on exp2f ones.
Benchtest on x64_64 (Ryzen 9 5900X, gcc 14.2.1), aarch64 (Neoverse-N1,
gcc 13.3.1), and powerpc (POWER10, gcc 13.2.1):
Latency master patched improvement
x86_64 40.6042 48.7104 -19.96%
x86_64v2 40.7506 35.9032 11.90%
x86_64v3 35.2301 31.7956 9.75%
i686 102.094 94.6657 7.28%
aarch64 18.2704 15.1387 17.14%
power10 11.9444 8.2402 31.01%
reciprocal-throughput master patched improvement
x86_64 20.8683 16.1428 22.64%
x86_64v2 19.5076 10.4474 46.44%
x86_64v3 19.2106 10.4014 45.86%
i686 56.4054 59.3004 -5.13%
aarch64 12.0781 7.3953 38.77%
power10 6.5306 5.9388 9.06%
The generic implementation calls __ieee754_exp2f and x86_64 provides
an optimized ifunc version (built with -mfma -mavx2, not correctly
rounded). This explains the performance difference for x86_64.
Same for i686, where the ABI provides an optimized __ieee754_exp2f
version built with '-msse2 -mfpmath=sse'. When built wth same
flags, the new algorithm shows a better performance:
master patched improvement
latency 102.094 91.2823 10.59%
reciprocal-throughput 56.4054 52.7984 6.39%
Signed-off-by: Alexei Sibidanov <sibid@uvic.ca>
Signed-off-by: Paul Zimmermann <Paul.Zimmermann@inria.fr>
Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: DJ Delorie <dj@redhat.com>
The CORE-MATH implementation is correctly rounded (for any rounding mode)
and shows better performance compared to the generic exp10m1f.
The code was adapted to glibc style and to use the definition of
math_config.h (to handle errno, overflow, and underflow). I mostly
fixed some small issues in corner cases (sNaN handling, -INFINITY,
a specific overflow check).
Benchtest on x64_64 (Ryzen 9 5900X, gcc 14.2.1), aarch64 (Neoverse-N1,
gcc 13.3.1), and powerpc (POWER10, gcc 13.2.1):
Latency master patched improvement
x86_64 45.4690 49.5845 -9.05%
x86_64v2 46.1604 36.2665 21.43%
x86_64v3 37.8442 31.0359 17.99%
i686 121.367 93.0079 23.37%
aarch64 21.1126 15.0165 28.87%
power10 12.7426 8.4929 33.35%
reciprocal-throughput master patched improvement
x86_64 19.6005 17.4005 11.22%
x86_64v2 19.6008 11.1977 42.87%
x86_64v3 17.5427 10.2898 41.34%
i686 59.4215 60.9675 -2.60%
aarch64 13.9814 7.9173 43.37%
power10 6.7814 6.4258 5.24%
The generic implementation calls __ieee754_exp10f which has an
optimized version, although it is not correctly rounded, which is
the main culprit of the the latency difference for x86_64 and
throughp for i686.
Signed-off-by: Alexei Sibidanov <sibid@uvic.ca>
Signed-off-by: Paul Zimmermann <Paul.Zimmermann@inria.fr>
Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: DJ Delorie <dj@redhat.com>
Also remove the use of builtins in favor of standard names, compiler
already inline them (if supported) with current compiler options.
It also fixes and issue where __builtin_roundeven is not support on
gcc older than version 10.
Checked on x86_64-linux-gnu and i686-linux_gnu.
Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: DJ Delorie <dj@redhat.com>
This will be required by the rseq extensible ABI implementation on all
Linux architectures exposing the '__rseq_size' and '__rseq_offset'
symbols to set the initial value of the 'cpu_id' field which can be used
by applications to test if rseq is available and registered. As long as
the symbols are exposed it is valid for an application to perform this
test even if rseq is not yet implemented in libc for this architecture.
Both code paths are compile tested with build-many-glibcs.py but I don't
have access to any hardware to run the tests.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Reviewed-by: Arjun Shankar <arjun@redhat.com>
Save lr in a non-volatile register before scv in clone/clone3.
For clone, the non-volatile register was unused and already
saved/restored. Remove the dead code from clone.
Signed-off-by: Sachin Monga <smonga@linux.ibm.com>
Reviewed-by: Peter Bergner <bergner@linux.ibm.com>
There are no tests specifically focused on the functions time,
gettimeofday and clock_gettime, although there are some incidental
uses in tests of other functions. Add tests specifically for these
three functions.
Tested for x86_64 and x86.
There are various existing tests that call pthread_attr_init and then
verify properties of the resulting initial values retrieved with
pthread_attr_get* functions. However, those are missing coverage of
the initial values retrieved with pthread_attr_getschedparam and
pthread_attr_getstacksize. Add testing for initial values from those
functions as well.
(tst-attr2 covers pthread_attr_getdetachstate,
pthread_attr_getguardsize, pthread_attr_getinheritsched,
pthread_attr_getschedpolicy, pthread_attr_getscope. tst-attr3 covers
some of those together with pthread_attr_getaffinity_np.
tst-pthread-attr-sigmask covers pthread_attr_getsigmask_np.
pthread_attr_getstack has unspecified results if called before the
relevant attributes have been set, while pthread_attr_getstackaddr is
deprecated.)
Tested for x86_64.
The gilbc manual has some documentation in llio.texi of requirements
for moving between I/O on FILE * streams and file descriptors on the
same open file description.
The documentation of what must be done on a FILE * stream to move from
it to either a file descriptor or another FILE * for the same open
file description seems to match POSIX. However, there is an
additional requirement in POSIX on the *second* of the two handles
being moved between, which is not mentioned in the glibc manual: "If
any previous active handle has been used by a function that explicitly
changed the file offset, except as required above for the first
handle, the application shall perform an lseek() or fseek() (as
appropriate to the type of handle) to an appropriate location.".
Document this requirement on seeking in the glibc manual, limited to
the case that seems relevant to glibc (the new channel is a previously
active stream, on which the seeking previously occurred). Note that
I'm not sure what the "except as required above for the first handle"
is meant to be about, so I haven't documented anything for it. As far
as I can tell, nothing specified for moving from the first handle
actually list calling a seek function as one of the steps to be done.
(Current POSIX doesn't seem to have any relevant rationale for this
section. The rationale in the 1996 edition says "In requiring the
seek to an appropriate location for the new handle, the application is
required to know what it is doing if it is passing streams with seeks
involved. If the required seek is not done, the results are undefined
(and in fact the program probably will not work on many common
implementations)." - which also doesn't help in understanding the
purpose of "except as required above for the first handle".)
Tested with "make info" and "make pdf".
In both routines, reduce register pressure such that GCC 14 emits no
spills for erf and fewer spills for erfc. Also use more efficient
comparison for the special-case in erf.
Benchtests show erf improves by 6.4%, erfc by 1.0%.
In commit c628c22963 (elf: Remove
ldconfig kernel version check), the layout of auxcache entries
changed because the osversion field was removed from
struct aux_cache_file_entry. However, AUX_CACHEMAGIC was not
changed, so existing files are still used, potentially leading
to unintended ldconfig behavior. This commit changes AUX_CACHEMAGIC,
so that the file is regenerated.
Reported-by: DJ Delorie <dj@redhat.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This fixes a buffer overflow in wide character string output, reproducing
when output fails, such as if the output fd is closed or is redirected
to a full device.
Wide character output data attempts to maintain the invariant that
`_IO_buf_base <= _IO_write_base <= _IO_write_end <= _IO_buf_end` (that is,
that the write region is a sub-region of `_IO_buf`). Prior to this commit,
this invariant is violated by the `_IO_wfile_overflow` function as so:
1. `_IO_wsetg` is called, assigning `_IO_write_base` to `_IO_buf_base`
2. `_IO_doallocbuf` is called, which jumps to `_IO_wfile_doallocate` via
the _IO_wfile_jumps vtable. This function then assigns the wide data
`_IO_buf_base` and `_IO_buf_end` to a malloc'd buffer.
Thus the invariant is violated. The fix is simply to reverse the order:
malloc the `_IO_buf` first and then assign `_IO_write_base` to it.
We also take this opportunity to defensively guard the initialization of
the number of unwritten characters via pointer arithmetic. We now check
that the buffer end is not before the buffer beginning; this matches a
similar defensive check in the narrow analogue `fileops.c`.
Add a test which fails without the fix.
Signed-off-by: Peter Ammon <corydoras@ridiculousfish.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The scanf family of functions like sscanf and fscanf currently
ignore nan() and nan(n-char-sequence). This happens because
__vfscanf_internal only checks for 'nan'.
This commit adds support for all valid nan types i.e. nan, nan()
and nan(n-char-sequence), where n-char-sequence can be
[a-zA-Z0-9_]+, thus fixing the bug 30647. Any other representation
of NaN should result in conversion error.
New tests are also added to verify the correct parsing of NaN types for
float, double and long double formats.
Signed-off-by: Avinal Kumar <avinal.xlvii@gmail.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The refactoring did not take the change of variable into account.
Fixes commit 43db5e2c06
("elf: Signal RT_CONSISTENT after relocation processing in dlopen
(bug 31986)").
Previously, a la_activity audit event was generated before
relocation processing completed. This does did not match what
happened during initial startup in elf/rtld.c (towards the end
of dl_main). It also caused various problems if an auditor
tried to open the same shared object again using dlmopen:
If it was the directly loaded object, it had a search scope
associated with it, so the early exit in dl_open_worker_begin
was taken even though the object was unrelocated. This caused
the r_state == RT_CONSISTENT assert to fail. Avoidance of the
assert also depends on reversing the order of r_state update
and auditor event (already implemented in a previous commit).
At the later point, args->map can be NULL due to failure,
so use the assigned namespace ID instead if that is available.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Auditors can call into the dynamic loader again if
LA_ACT_CONSISTENT, and those recursive calls could observe
r_state != RT_CONSISTENT.
We should consider failing dlopen/dlmopen/dlclose if
r_state != RT_CONSISTENT. The dynamic linker is probably not
in a state in which it can handle reentrant calls. This
needs further investigation.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This is conceptually similar to the reported bug, but does not
depend on auditing. The fix is simple: just complete execution
of the constructors. This exposed the fact that the link map
for statically linked executables does not have l_init_called
set, even though constructors have run.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This avoids -Werror build issues in strace, which bundles UAPI
headers, but does not include them as system headers.
Fixes commit c444cc1d83
("Linux: Add missing scheduler constants to <sched.h>").
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
tst-popen-fork failed to build for Hurd due to not being linked with
libpthread. This commit fixes that.
Tested with build-many-glibcs.py for i686-gnu.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Add basic tests of pthread_mutexattr_gettype and
pthread_mutexattr_settype with each valid mutex kind, plus test for
EINVAL with an invalid mutex kind.
Tested for x86_64.
popen modifies its file handler book-keeping under a lock that wasn't
being taken during fork. This meant that a concurrent popen and fork
could end up copying the lock in a "locked" state into the fork child,
where subsequently calling popen would lead to a deadlock due to the
already (spuriously) held lock.
This commit fixes the deadlock by appropriately taking the lock before
fork, and releasing/resetting it in the parent/child after the fork.
A new test for concurrent popen and fork is also added. It consistently
hangs (and therefore fails via timeout) without the fix applied.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Glibc has two gnu-extension functions that are implemented as
macros but not documented as such: fread_unlocked and
fwrite_unlocked. Document them as such.
Additionally, putc_unlocked and getc_unlocked are documented in
POSIX as possibly being macros. Update the manual to add a warning
about those also, depite glibc not implementing them as macros.
The pthread_timedjoin_np and pthread_clockjoin_np functions do not
check that a valid time has been specified. The documentation for
these functions in the glibc manual isn't sufficiently detailed to say
if they should, but consistency with POSIX functions such as
pthread_mutex_timedlock and pthread_cond_timedwait strongly indicates
that an EINVAL error is appropriate (even if there might be some
ambiguity about exactly where such a check should go in relation to
other checks for whether the thread exists, whether it's immediately
joinable, etc.). Copy the logic for such a check used in
pthread_rwlock_common.c.
pthread_join_common had some logic calling valid_nanoseconds before
commit 9e92278ffa, "nptl: Remove
clockwait_tid"; I haven't checked exactly what cases that detected.
Tested for x86_64 and x86.
This makes b4 use inbox.sourceware.org instead of the default host
lore.kernel.org, so that every b4 user doesn't have to configure this
themselves for the glibc repo.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
The commit 'sparc: Use Linux kABI for syscall return'
(86c5d2cf0c) did not take into account
a subtle sparc syscall kABI constraint. For syscalls that might block
indefinitely, on an interrupt (like SIGCONT) the kernel will set the
instruction pointer to just before the syscall:
arch/sparc/kernel/signal_64.c
476 static void do_signal(struct pt_regs *regs, unsigned long orig_i0)
477 {
[...]
525 if (restart_syscall) {
526 switch (regs->u_regs[UREG_I0]) {
527 case ERESTARTNOHAND:
528 case ERESTARTSYS:
529 case ERESTARTNOINTR:
530 /* replay the system call when we are done */
531 regs->u_regs[UREG_I0] = orig_i0;
532 regs->tpc -= 4;
533 regs->tnpc -= 4;
534 pt_regs_clear_syscall(regs);
535 fallthrough;
536 case ERESTART_RESTARTBLOCK:
537 regs->u_regs[UREG_G1] = __NR_restart_syscall;
538 regs->tpc -= 4;
539 regs->tnpc -= 4;
540 pt_regs_clear_syscall(regs);
541 }
However, on a SIGCONT it seems that 'g1' register is being clobbered after the
syscall returns. Before 86c5d2cf0c, the 'g1' was always placed jus
before the 'ta' instruction which then reloads the syscall number and restarts
the syscall.
On master, where 'g1' might be placed before 'ta':
$ cat test.c
#include <unistd.h>
int main ()
{
pause ();
}
$ gcc test.c -o test
$ strace -f ./t
[...]
ppoll(NULL, 0, NULL, NULL, 0
On another terminal
$ kill -STOP 2262828
$ strace -f ./t
[...]
--- SIGSTOP {si_signo=SIGSTOP, si_code=SI_USER, si_pid=2521813, si_uid=8289} ---
--- stopped by SIGSTOP ---
And then
$ kill -CONT 2262828
Results in:
--- SIGCONT {si_signo=SIGCONT, si_code=SI_USER, si_pid=2521813, si_uid=8289} ---
restart_syscall(<... resuming interrupted ppoll ...>) = -1 EINTR (Interrupted system call)
Where the expected behaviour would be:
$ strace -f ./t
[...]
ppoll(NULL, 0, NULL, NULL, 0) = ? ERESTARTNOHAND (To be restarted if no handler)
--- SIGSTOP {si_signo=SIGSTOP, si_code=SI_USER, si_pid=2521813, si_uid=8289} ---
--- stopped by SIGSTOP ---
--- SIGCONT {si_signo=SIGCONT, si_code=SI_USER, si_pid=2521813, si_uid=8289} ---
ppoll(NULL, 0, NULL, NULL, 0
Just moving the 'g1' setting near the syscall asm is not suffice,
the compiler might optimize it away (as I saw on cancellation.c by
trying this fix). Instead, I have change the inline asm to put the
'g1' setup in ithe asm block. This would require to change the asm
constraint for INTERNAL_SYSCALL_NCS, since the syscall number is not
constant.
Checked on sparc64-linux-gnu.
Reported-by: René Rebe <rene@exactcode.de>
Tested-by: Sam James <sam@gentoo.org>
Reviewed-by: Sam James <sam@gentoo.org>
The manual contained several instances of incorrect formatting
that were correct texinfo but produced incorrectly rendered manuals
or incorrect behaviour from the tooling.
The most important was incorrect quoting of function returns
by failing to use {} to quote the return. The impact of this
mistake means that 'info libc func' does not jump to the function
in question but instead to the introductory page under the assumption
that func doesn't exist. The function returns are now correctly
quoted.
The second issue was the use of a category specifier with
@deftypefun which doesn't accept a category specifier. If a category
specifier is required then @deftypefn needs to be used. This is
corrected by changing the command to @deftypefn for such functions
that used {Deprecated function} as a category.
The last issue is a missing space between the function name and the
arguments which results in odd function names like "epoll_wait(int"
instead of "epoll_wait". This also impacts the use of 'info libc'
and is corrected.
We additionally remove ';' from the end of function arguments and
add an 'int' return type for dprintf.
Lastly we add a new test check-deftype.sh which verifies the expected
formatting of @deftypefun, @deftypefunx, @deftypefn, and
@deftypefnx. The new test is also run as the summary file is
generated to ensure we don't generate incorrect results.
The existing check-safety.sh is also run directly as a test to increase
coverage since the existing tests only ran on manual install.
The new tests now run as part of the standard "make check" that
pre-commit CI runs and developers should run.
No regressions on x86_64.
HTML and PDF rendering reviewed and looks correct for all changes.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
The CORE-MATH implementation is correctly rounded (for any rounding mode).
This can be checked by exhaustive tests in a few minutes since there are
less than 2^32 values to check against for example GNU MPFR.
This patch also adds some bench values for tgammaf.
Tested on x86_64 and x86 (cfarm26).
With the initial GNU libc code it gave on an Intel(R) Core(TM) i7-8700:
"tgammaf": {
"": {
"duration": 3.50188e+09,
"iterations": 2e+07,
"max": 602.891,
"min": 65.1415,
"mean": 175.094
}
}
With the new code:
"tgammaf": {
"": {
"duration": 3.30825e+09,
"iterations": 5e+07,
"max": 211.592,
"min": 32.0325,
"mean": 66.1649
}
}
With the initial GNU libc code it gave on cfarm26 (i686):
"tgammaf": {
"": {
"duration": 3.70505e+09,
"iterations": 6e+06,
"max": 2420.23,
"min": 243.154,
"mean": 617.509
}
}
With the new code:
"tgammaf": {
"": {
"duration": 3.24497e+09,
"iterations": 1.8e+07,
"max": 1238.15,
"min": 101.155,
"mean": 180.276
}
}
Signed-off-by: Alexei Sibidanov <sibid@uvic.ca>
Signed-off-by: Paul Zimmermann <Paul.Zimmermann@inria.fr>
Changes in v2:
- include <math.h> (fix the linknamespace failures)
- restored original benchtests/strcoll-inputs/filelist#en_US.UTF-8 file
- restored original wrapper code (math/w_tgammaf_compat.c),
except for the dealing with the sign
- removed the tgammaf/float entries in all libm-test-ulps files
- address other comments from Joseph Myers
(https://sourceware.org/pipermail/libc-alpha/2024-July/158736.html)
Changes in v3:
- pass NULL argument for signgam from w_tgammaf_compat.c
- use of math_narrow_eval
- added more comments
Changes in v4:
- initialize local_signgam to 0 in math/w_tgamma_template.c
- replace sysdeps/ieee754/dbl-64/gamma_productf.c by dummy file
Changes in v5:
- do not mention local_signgam any more in math/w_tgammaf_compat.c
- initialize local_signgam to 1 instead of 0 in w_tgamma_template.c
and added comment
Changes in v6:
- pass NULL as 2nd argument of __ieee754_gammaf_r in
w_tgammaf_compat.c, and check for NULL in e_gammaf_r.c
Changes in v7:
- added Signed-off-by line for Alexei Sibidanov (author of the code)
Changes in v8:
- added Signed-off-by line for Paul Zimmermann (submitted of the patch)
Changes in v9:
- address comments from review by Adhemerval Zanella
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Linux 6.11 adds a define IPPROTO_SMC to its include/uapi/linux/in.h
(commit d25a92ccae6b).
Checked on x86_64-linux-gnu.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
Linux 6.11 adds the new flag for pwritev2 (commit
c34fc6f26ab86d03a2d47446f42b6cd492dfdc56).
Checked on x86_64-linux-gnu on 6.11 kernel.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
It adds the new constants from 'fs: Add initial atomic write support
info to statx' (commit 0f9ca80fa4f9670ba09721e4e36b8baf086a500c).
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
This patch updates the kernel version in the tests tst-mount-consts.py,
and tst-sched-consts.py to 6.11.
There are no new constants covered by these tests in 6.11.
Tested with build-many-glibcs.py.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
This request the page to be never written out to swap, it will be zeroed
under memory pressure (so kernel can just drop the page), it is inherited
by fork, it is not counted against @code{mlock} budget, and if there is
no enough memory to service a page faults there is no fatal error (so not
signal is sent).
Tested with build-many-glibcs.py.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
Linux 6.11 adds some more PIDFD_* constants for 'pidfs: allow retrieval
of namespace file descriptors'
(5b08bd408534bfb3a7cf5778da5b27d4e4fffe12).
Tested with build-many-glibcs.py.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
Linux 6.11 changes for syscall are:
* fstat/newfstatat for loongarch (it should be safe to add since
255dc1e4ed that undefine them).
* clone3 for nios2, which only adds the entry point but defined
__ARCH_BROKEN_SYS_CLONE3 (the syscall will always return ENOSYS).
* uretprobe for x86_64 and x32.
Update syscall-names.list and regenerate the arch-syscall.h headers
with build-many-glibcs.py update-syscalls.
Tested with build-many-glibcs.py.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
GCC mainline produces a -Wheader-guard error building for x86_64-gnu.
Fix what seems to be incorrect macro naming in the #ifndef
conditional.
Tested with build-many-glibc.py for x86_64-gnu (GCC mainline).
Message-ID: <fd800046-5ecb-ebd5-4df1-29d4eb3d5433@redhat.com>
Test that clock_nanosleep rejects out of range time values.
Test that clock_nanosleep actually sleeps for at least the
requested time relative to the requested clock.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The recursive lock used on abort does not synchronize with a new process
creation (either by fork-like interfaces or posix_spawn ones), nor it
is reinitialized after fork().
Also, the SIGABRT unblock before raise() shows another race condition,
where a fork or posix_spawn() call by another thread, just after the
recursive lock release and before the SIGABRT signal, might create
programs with a non-expected signal mask. With the default option
(without POSIX_SPAWN_SETSIGDEF), the process can see SIG_DFL for
SIGABRT, where it should be SIG_IGN.
To fix the AS-safe, raise() does not change the process signal mask,
and an AS-safe lock is used if a SIGABRT is installed or the process
is blocked or ignored. With the signal mask change removal,
there is no need to use a recursive loc. The lock is also taken on
both _Fork() and posix_spawn(), to avoid the spawn process to see the
abort handler as SIG_DFL.
A read-write lock is used to avoid serialize _Fork and posix_spawn
execution. Both sigaction (SIGABRT) and abort() requires to lock
as writer (since both change the disposition).
The fallback is also simplified: there is no need to use a loop of
ABORT_INSTRUCTION after _exit() (if the syscall does not terminate the
process, the system is broken).
The proposed fix changes how setjmp works on a SIGABRT handler, where
glibc does not save the signal mask. So usage like the below will now
always abort.
static volatile int chk_fail_ok;
static jmp_buf chk_fail_buf;
static void
handler (int sig)
{
if (chk_fail_ok)
{
chk_fail_ok = 0;
longjmp (chk_fail_buf, 1);
}
else
_exit (127);
}
[...]
signal (SIGABRT, handler);
[....]
chk_fail_ok = 1;
if (! setjmp (chk_fail_buf))
{
// Something that can calls abort, like a failed fortify function.
chk_fail_ok = 0;
printf ("FAIL\n");
}
Such cases will need to use sigsetjmp instead.
The _dl_start_profile calls sigaction through _profil, and to avoid
pulling abort() on loader the call is replaced with __libc_sigaction.
Checked on x86_64-linux-gnu and aarch64-linux-gnu.
Reviewed-by: DJ Delorie <dj@redhat.com>
The BZ#24967 fix (1bdda52fe9) missed the time for
architectures that define USE_IFUNC_TIME. Although it is not
an issue, since there is no pointer mangling, there is also no need
to call dl_vdso_vsym since the vDSO setup was already done by the
loader.
Checked on x86_64-linux-gnu and i686-linux-gnu.
The BZ#24967 fix (1bdda52fe9) missed the gettimeofday for
architectures that define USE_IFUNC_GETTIMEOFDAY. Although it is not
an issue, since there is no pointer mangling, there is also no need
to call dl_vdso_vsym since the vDSO setup was already done by the
loader.
Checked on x86_64-linux-gnu and i686-linux-gnu.
Building the s390 specific iconv modules - utf16-utf32-z9.c, utf8-utf32-z9.c
and utf8-utf16-z9.c - with -fno-omit-frame-pointer leads to a build error
"error: %r11 cannot be used in 'asm' here" as r11 is needed as frame-pointer.
The cuXY-instructions need two even-odd register pairs. Therefore the register
pinning is used. This patch just uses a different register pair.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Several copies of the licenses in files contained whitespace related
problems. Two cases are addressed here, the first is two spaces
after a period which appears between "PURPOSE." and "See". The other
is a space after the last forward slash in the URL. Both issues are
corrected and the licenses now match the official textual description
of the license (and the other license in the sources).
Since these whitespaces changes do not alter the paragraph structure of
the license, nor create new sentences, they do not change the license.
Add tests of freopen adding or removing "c" (non-cancelling I/O) from
the mode string (so completing my planned tests of freopen with
different features used in the mode strings). Note that it's in the
nature of the uncertain time at which cancellation might act (possibly
during freopen, possibly during subsequent reads) that these can leak
memory or file descriptors, so these do not include leak tests.
Tested for x86_64.
The sparc clone mitigation (faeaa3bc9f) added the use of
flushw, which is not support by LEON/sparcv8. As discussed on
the libc-alpha, 'ta 3' is a working alternative [1].
[1] https://sourceware.org/pipermail/libc-alpha/2024-August/158905.html
Checked with a build for sparcv8-linux-gnu targetting leon.
Acked-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
LEON2/LEON3 are both sparcv8, which does not support branch hints
(bne,pn) nor the return instruction.
Checked with a build for sparcv8-linux-gnu targetting leon. I also
checked some cancellation tests with qemu-system (targeting LEON3).
Acked-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
GCC aligns global data to 16 bytes if their size is >= 16 bytes. This patch
changes the exp2f_data struct slightly so that the fields are better aligned.
As a result on targets that support them, load-pair instructions accessing
poly_scaled and invln2_scaled are now 16-byte aligned.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Even though building glibc with 64 bit time_t flags is not supported,
and the usual way is to patch the build system to avoid it; some
systems do enable it by default, and it increases the requirements
to build glibc in such cases (it also does not help newcomers when
trying to build glibc).
The conform namespace and linknamespace tests also do not expect
that flag to be set by default, so disable it as well.
Checked with a build/check for major ABI and some (i386, arm,
mipsel, hppa) with a toolchain that has LFS flags by default.
Reviewed-by: DJ Delorie <dj@redhat.com>
Even though building glibc with LFS flags is not supported, and the
the usual way is to patch the build system to avoid it [1]; some system
do enable it by default, and it increases the requirements to build
glibc in such cases (it also does not help newcomers when trying
to build glibc).
The conform namespace and linknamespace tests also do not expect
that flag to be set by default, so disable it as well.
Checked with a build/check for major ABI and some (i386, arm,
mipsel, hppa) with a toolchain that has LFS flags by default.
[1] https://sourceware.org/bugzilla/show_bug.cgi?id=31624
Reviewed-by: DJ Delorie <dj@redhat.com>
The -Wp does not work properly if the compiler is configured to enable
fortify by default, since it bypasses the compiler driver (which defines
the fortify flags in this case).
This patch is similar to the one used on Ubuntu [1].
I checked with a build for x86_64-linux-gnu, i686-linux-gnu,
aarch64-linux-gnu, s390x-linux-gnu, and riscv64-linux-gnu with
gcc-13 that enables the fortify by default.
Co-authored-by: Matthias Klose <matthias.klose@canonical.com>
[1] https://git.launchpad.net/~ubuntu-core-dev/ubuntu/+source/glibc/tree/debian/patches/ubuntu/fix-fortify-source.patch
Reviewed-by: DJ Delorie <dj@redhat.com>
Since _IO_vtable_offset is used to detect the old binaries, set it
in _IO_old_file_init_internal before calling _IO_link_in which checks
_IO_vtable_offset. Add a glibc 2.0 test with copy relocation on
_IO_stderr_@GLIBC_2.0 to verify that fopen won't cause memory corruption.
This fixes BZ #32148.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
Exercises fwrite's internal buffer when doing a file operation.
The new test, exercises 2 overflow behaviors:
1. Call fwrite multiple times making usage of fwrite's internal buffer.
The total number of bytes written is larger than fwrite's internal
buffer, forcing an automatic flush.
2. Call fwrite a single time with an amount of data that is larger than
fwrite's internal buffer.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
The loop should be aligned to 32-bytes so that it can ideally run out
the DSB. This is particularly important on Skylake-Server where
deficiencies in it's DSB implementation make it prone to not being
able to run loops out of the DSB.
For example running strcmp-evex on 200Mb string:
32-byte aligned loop:
- 43,399,578,766 idq.dsb_uops
not 32-byte aligned loop:
- 6,060,139,704 idq.dsb_uops
This results in a 25% performance degradation for the non-aligned
version.
The fix is to just ensure the code layout is such that the loop is
aligned. (Which was previously the case but was accidentally dropped
in 84e7c46df).
NB: The fix was actually 64-byte alignment. This is because 64-byte
alignment generally produces more stable performance than 32-byte
aligned code (cache line crosses can affect perf), so if we are going
past 16-byte alignmnent, might as well go to 64. 64-byte alignment
also matches most other functions we over-align, so it creates a
common point of optimization.
Times are reported as ratio of Time_With_Patch /
Time_Without_Patch. Lower is better.
The values being reported is the geometric mean of the ratio across
all tests in bench-strcmp and bench-strncmp.
Note this patch is only attempting to improve the Skylake-Server
strcmp for long strings. The rest of the numbers are only to test for
regressions.
Tigerlake Results Strings <= 512:
strcmp : 1.026
strncmp: 0.949
Tigerlake Results Strings > 512:
strcmp : 0.994
strncmp: 0.998
Skylake-Server Results Strings <= 512:
strcmp : 0.945
strncmp: 0.943
Skylake-Server Results Strings > 512:
strcmp : 0.778
strncmp: 1.000
The 2.6% regression on TGL-strcmp is due to slowdowns caused by
changes in alignment of code handling small sizes (most on the
page-cross logic). These should be safe to ignore because 1) We
previously only 16-byte aligned the function so this behavior is not
new and was essentially up to chance before this patch and 2) this
type of alignment related regression on small sizes really only comes
up in tight micro-benchmark loops and is unlikely to have any affect
on realworld performance.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
This hides the inconsistent TCB state (missing robust mutex list) from
signal handlers.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Unicode 16.0.0 Support: Character encoding, character type info, and
transliteration tables are all updated to Unicode 16.0.0, using
the generator scripts contributed by Mike FABIAN (Red Hat).
Changes in CHARMAP and WIDTH:
Total added characters in newly generated CHARMAP: 5185
Total removed characters in newly generated WIDTH: 1
Total added characters in newly generated WIDTH: 170
The removed character from WIDTH is U+1171E AHOM CONSONANT SIGN MEDIAL RA.
It changed like this:
UnicodeData.txt 15.1.0: 1171E;AHOM CONSONANT SIGN MEDIAL RA;Mn;0;NSM;;;;;N;;;;;
UnicodeData.txt 16.0.0: 1171E;AHOM CONSONANT SIGN MEDIAL RA;Mc;0;L;;;;;N;;;;;
EastAsianWidth.txt 15.1.0: 1171D..1171F ; N # Mn [3] AHOM CONSONANT SIGN MEDIAL LA..AHOM CONSONANT SIGN MEDIAL LIGATING RA
EastAsianWidth.txt 16.0.0: 1171E ; N # Mc AHOM CONSONANT SIGN MEDIAL RA
I.e it changed from Mn (Mark Nonspacing) to Mc (Mark Spacing
combining). So it should now have width 1 instead of 0, therefore it
is OK that it was removed from WIDTH, characters not in WIDTH get
width 1 by default.
Nothing suspicious when browsing the list of the 170 added characters.
Changes in ctype:
alpha: Added 4452 characters in new ctype which were not in old ctype
combining: Added 51 characters in new ctype which were not in old ctype
combining_level3: Added 43 characters in new ctype which were not in old ctype
graph: Added 5185 characters in new ctype which were not in old ctype
lower: Added 25 characters in new ctype which were not in old ctype
print: Added 5185 characters in new ctype which were not in old ctype
punct: Missing 33 characters of old ctype in new ctype
punct: Added 766 characters in new ctype which were not in old ctype
tolower: Added 27 characters in new ctype which were not in old ctype
totitle: Added 27 characters in new ctype which were not in old ctype
toupper: Added 27 characters in new ctype which were not in old ctype
upper: Added 27 characters in new ctype which were not in old ctype
Nothing suspicous in the additions.
About the 33 characters removed from `punct`:
U+0363 - U+036F are identical in UnicodeData.txt. Difference in DerivedCoreProperties.txt:
DerivedCoreProperties.txt 15.1.0: not there.
DerivedCoreProperties.txt 16.0.0: 0363..036F ; Alphabetic # Mn [13] COMBINING LATIN SMALL LETTER A..COMBINING LATIN SMALL LETTER X
So that’s the reason why they are added to `alpha` and removed from `punct`.
Same for U+1DD3 - U+1DE6, they are identical in UnicodeData.txt but there is a difference in DerivedCoreProperties.txt:
DerivedCoreProperties.txt 15.1.0: 1DE7..1DF4 ; Alphabetic # Mn [14] COMBINING LATIN SMALL LETTER ALPHA..COMBINING LATIN SMALL LETTER U WITH DIAERESIS
DerivedCoreProperties.txt 16.0.0: 1DD3..1DF4 ; Alphabetic # Mn [34] COMBINING LATIN SMALL LETTER FLATTENED OPEN A ABOVE..COMBINING LATIN SMALL LETTER U WITH DIAERESIS
So they became `Alphabetic` and were thus added to `alpha` and removed from `punct`.
Resolves: BZ #32168
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
This is not completely clear from the C standard (although there
is footnote number 289 in C11), but I assume that our implementation
works this way.
Reviewed-by: DJ Delorie <dj@redhat.com>
This was discussed on the hallway track at GNU Tools Cauldron
2024. There are concerns about stability of the big-endian
GCC backend, and Linux removed support for the only big-endian
ARC platform in commit dd7c7ab01a04d645b7e7baa8530bfd81e31a2202
("ARC: [plat-eznps]: Drop support for EZChip NPS platform").
In Linux 6.11, fstat and newfstatat are added back. To avoid the messy
usage of the fstat, newfstatat, and statx system calls, we will continue
using statx only in glibc, maintaining consistency with previous versions of
the LoongArch-specific glibc implementation.
Signed-off-by: caiyinyu <caiyinyu@loongson.cn>
Reviewed-by: Xi Ruoyao <xry111@xry111.site>
Suggested-by: Florian Weimer <fweimer@redhat.com>
Use the setresuid32 system call if it is available, prefering
it over setresuid. If both system calls exist, setresuid
is the 16-bit variant. This fixes a build failure on
sparcv9-linux-gnu.
Calling an extern function in a different translation unit before
self-relocation is brittle. The compiler may load the address
at an earlier point in _dl_start, before self-relocation. In
_dl_start_final, the call is behind a compiler barrier, so this
cannot happen.
This case is detected early in the elf/dl-version.c consistency
checks. (These checks could be disabled in the future to allow
the removal of symbol versioning from objects.)
Commit f0b2132b35 ("ld.so: Support moving versioned symbols between
sonames [BZ #24741]) removed another call to _dl_name_match_p. The
_dl_check_caller function no longer exists, and the remaining calls
to _dl_name_match_p happen under the loader lock. This means that
atomic accesses are no longer required for the l_libname list. This
supersedes commit 395be7c218 ("elf: Fix data race in _dl_name_match_p
[BZ #21349]").
With --enable-hardcoded-path-in-tests, $(test-program-prefix)
does not redirect to the built glibc, but we need to run
iconv (the program) against the built glibc even with
--enable-hardcoded-path-in-tests, as it is using the ABI
path for the dynamic linker (as an installed program).
Use $(run-program-prefix) instead.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
This operation can be simplified to use simpler multiply-round-convert
sequence, which uses fewer instructions and constants.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Rearrange operations so MOV is not necessary in reduction or around
the special-case handler. Reduce memory access by using more indexed
MLAs in polynomial.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
log1pf is quite register-intensive - use fewer registers for the
polynomial, and make various changes to shorten dependency chains in
parent routines. There is now no spilling with GCC 14. Accuracy moves
around a little - comments adjusted accordingly but does not require
regen-ulps.
Use the helper in log1pf as well, instead of having separate
implementations. The more accurate polynomial means special-casing can
be simplified, and the shorter dependency chain avoids the usual dance
around v0, which is otherwise difficult.
There is a small duplication of vectors containing 1.0f (or 0x3f800000) -
GCC is not currently able to efficiently handle values which fit in FMOV
but not MOVI, and are reinterpreted to integer. There may be potential
for more optimisation if this is fixed.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Reduce MOVPRFXs by using unpredicated (non-destructive) instructions
where possible. Similar to the recent change to AdvSIMD F32 logs,
adjust special-case arguments and bounds to allow for more optimal
register usage. For all 3 routines one MOVPRFX remains in the
reduction, which cannot be avoided as immediate AND and ASR are both
destructive.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Reduce MOV and MOVPRFX by improving special-case handling. Use inline
helper to duplicate the entire computation between the special- and
non-special case branches, removing the contention for z0 between x
and the return value.
Also rearrange some MLAs and MLSs - by making the multiplicand the
destination we can avoid a MOVPRFX in several cases. Also change which
constants go in the vector used for lanewise ops - the last lane is no
longer wasted.
Spotted that shift was incorrect in exp2f and exp10f, w.r.t. to the
comment that explains it. Fixed - worst-case ULP for exp2f moves
around but it doesn't change significantly for either routine.
Worst-case error for coshf increases due to passing x to exp rather
than abs(x) - updated the comment, but does not require regen-ulps.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
It tests long names and ENAMETOOLONG handling, specifically
for readdir_r. This is a regression test for bug 14699,
bug 32124, and bug 32128.
Reviewed-by: DJ Delorie <dj@redhat.com>
It is not necessary to do the conversion at the getdents64
layer for readdir64_r. Doing it piecewise for readdir64
is slightly simpler and allows deleting __old_getdents64.
This fixes bug 32128 because readdir64_r handles the length
check correctly.
Reviewed-by: DJ Delorie <dj@redhat.com>
The tests check that O_EXCL is used properly, that 0600 is used
as the mode, that the characters used are as expected, and that
the distribution of names generated is reasonably random.
The tests run very slowly on some kernel versions, so make them
xtests.
Reviewed-by: DJ Delorie <dj@redhat.com>
Add tests of special cases for freopen that were omitted from the more
general tests of different modes and similar issues. The special
cases in the three tests here are logically unconnected, it was simply
convenient to put these tests in one patch.
* Test freopen with a NULL path to the new file, in a chroot. Rather
than asserting that this fails (logically, failure in this case is
an implementation detail; it's not required for freopen to rely on
/proc), verify that either it fails (without memory leaks) or that
it succeeds and behaves as expected on success. There is no check
for file descriptor leaks because the machinery for that also
depends on /proc, so can't be used in a chroot.
* Test that freopen and freopen64 are genuinely different in
configurations with 32-bit off_t by checking for an EFBIG trying to
write past 2GB in a file opened with freopen in such a configuration
but no error with 64-bit off_t or when opening with freopen64.
* Test freopen of stdin, stdout and stderr.
Tested for x86_64 and x86.
The test tst-strtod-underflow covers various edge cases close to the
underflow threshold for strtod (especially cases where underflow on
architectures with after-rounding tininess detection depends on the
rounding mode). Make it use the type-generic machinery, with
corresponding test inputs for each supported floating-point format, so
that other functions in the strtod family are tested for underflow
edge cases as well.
Tested for x86_64.
There is very little test coverage of inputs to strtod-family
functions that don't contain anything that can be parsed as a number
(one test of ".y" in tst-strtod2), and none that I can see of skipping
initial whitespace. Add some tests of these things to tst-strtod2.
Tested for x86_64.
Although there are some tests in tst-strtod2 and tst-strtod3 for the
end pointer provided by strtod when it doesn't parse the whole string,
they aren't very thorough. Add tests of more such cases to
tst-strtod2.
Tested for x86_64.
Some of the strtod tests use type-generic machinery in tst-strtod.h to
test the strto* functions for all floating types, while others only
test double even when the tests are in fact meaningful for all
floating types.
Convert tst-strtod2 and tst-strtod5 to use the type-generic machinery
so they test all floating types. I haven't tried to convert them to
use newer test interfaces in other ways, just made the changes
necessary to use the type-generic machinery.
Tested for x86_64.
Previously, the second occurrence of the xtests target
expected all xtests to run (as the result of specifying
$(xtests)), but these tests have not been run due to
the the first xtests target is set up for run-built-tests=no:
it only runs tests in $(xtests-special). Consequently,
xtests are reported as UNSUPPORTED with “make xcheck
run-built-tests=no”. The xtests were not built, either.
After this change always, xtests are built regardless
of the $(run-built-tests) variable (except for xtests listed
in $(tests-unsupported)). To fix the UNSUPPORTED issue,
introduce xtests-expected and use that manage test
expectations in the second xtests target.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Add new testcase elf/tst-startup-errno.c which tests that errno is set
to 0 at first ELF constructor execution and at the start of the
program's main function.
Tested for x86_64
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Add new file libio/tst-fclose-unopened2.c that tests whether fclose on an
unopened file returns EOF.
This test differs from tst-fclose-unopened.c by ensuring the file's buffer
is allocated prior to double-fclose. A comment in tst-fclose-unopened.c
now clarifies that it is testing a file with an unallocated buffer.
Calling fclose on unopened files normally causes a use-after-free bug,
however the standard streams are an exception since they are not
deallocated by fclose.
Tested for x86_64.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Usually, the second and subsequent - return EOF immediately
and do not contribute to the output, but this is not an error.
Reviewed-by: DJ Delorie <dj@redhat.com>
Check if any of the input files overlaps with the output file, and use
a temporary file in this case, so that the input is no clobbered
before it is read. This fixes bug 10460. It allows to use iconv
more easily as a functional replacement for GNU recode.
The updated output buffer management truncates the output file
if there is no input, fixing bug 32033.
Reviewed-by: DJ Delorie <dj@redhat.com>
In several converters, a __GCONV_ILLEGAL_INPUT result gets overwritten
with __GCONV_FULL_OUTPUT. As a result, iconv (the function) returns
E2BIG instead of EILSEQ. The iconv program does not see the original
EILSEQ failure, does not recognize the invalid input, and may
incorrectly exit successfully.
To address this, a new __flags bit is used to indicate a sticky input
error state. All __GCONV_ILLEGAL_INPUT results are replaced with a
function call that sets this new __GCONV_ENCOUNTERED_ILLEGAL_INPUT and
returns __GCONV_ILLEGAL_INPUT. The iconv program checks for
__GCONV_ENCOUNTERED_ILLEGAL_INPUT and overrides the exit status.
The converter changes introducing __gconv_mark_illegal_input are
mostly mechanical, except for the res variable initialization in
iconvdata/iso-2022-jp.c: this error gets overwritten with __GCONV_OK
and other results in the following code. If res ==
__GCONV_ILLEGAL_INPUT afterwards, STANDARD_TO_LOOP_ERR_HANDLER below
will handle it.
The __gconv_mark_illegal_input changes do not alter the errno value
set by the iconv function. This is simpler to implement than
reviewing each __GCONV_FULL_OUTPUT result and adjust it not to
override a previous __GCONV_ILLEGAL_INPUT result. Doing it that way
would also change some E2BIG errors in to EILSEQ errors, so it had to
be done conditionally (under a flag set by the iconv program only), to
avoid confusing buffer management in other applications.
Reviewed-by: DJ Delorie <dj@redhat.com>
The __is_last field was replaced with a bitmask in
commit 85830c4c46 in 2000,
and multiple bits are in use today.
Reviewed-by: DJ Delorie <dj@redhat.com>
On current systems, very large files are needed before
mmap becomes beneficial. Simplify the implementation.
This exposed that inptr was not initialized correctly in
process_fd. Handling multiple input files resulted in
EFAULT in read because a null pointer was passed. This
could be observed previously if an input file was not
mappable and was reported as bug 17703.
Reviewed-by: DJ Delorie <dj@redhat.com>
This enables vectorisation of C23 logp1, which is an alias for log1p.
There are no new tests or ulp entries because the new symbols are simply
aliases.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
A common use case of access () / faccessat () is checking for file
existence, not any specific access permissions. In that case, we can
avoid doing the file_check_access () RPC; whether the given path had
been successfully resolved to a file is all we need to know to answer.
This is prompted by GLib switching to use faccessat (F_OK) to implement
g_file_query_exists () for local files.
https://gitlab.gnome.org/GNOME/glib/-/merge_requests/4272
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-ID: <20240919101439.179663-1-bugaevc@gmail.com>
This patch adds new flag --glibctunables to the cross-test-ssh.sh script
to pass Glibc tunables to the system on which tests are executed.
The value to pass can be also provided via the GLIBC_TUNABLES environment
variable.
This works similar to the TIMEOUTFACTOR variable.
Sometimes it is useful to cross test glibc with some non-default tunable,
and a global environment variable is the easiest way to inject some
tunable value into most tests. With this patch using cross-test-ssh.sh
script becomes very similar to running a test natively on the local host
when using non-default tunable is important.
Reviewed-by: Arjun Shankar <arjun@redhat.com>
It allows to read directories using the six readdir variants
without writing type-specific code or using skeleton files
that are compiled four times.
The readdir_r subtest for support_readdir_expect_error revealed
bug 32124.
Reviewed-by: DJ Delorie <dj@redhat.com>
And struct sched_attr.
In sysdeps/unix/sysv/linux/bits/sched.h, the hack that defines
sched_param around the inclusion of <linux/sched/types.h> is quite
ugly, but the definition of struct sched_param has already been
dropped by the kernel, so there is nothing else we can do and maintain
compatibility of <sched.h> with a wide range of kernel header
versions. (An alternative would involve introducing a separate header
for this functionality, but this seems unnecessary.)
The existing sched_* functions that change scheduler parameters
are already incompatible with PTHREAD_PRIO_PROTECT mutexes, so
there is no harm in adding more functionality in this area.
The documentation mostly defers to the Linux manual pages.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
From the existing @manpagefunctionstub{func,sec} macro,
so that URLs can be included in the manual without the
stub text.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
The reading loops did not check for read failures. Addresses
a static analysis report.
Manually tested by compiling a program with the GCC's
-finstrument-functions option, running it with
“LD_PRELOAD=debug/libpcprofile.so PCPROFILE_OUTPUT=output-file”,
and reviewing the output of “debug/pcprofiledump output-file”.
Improve small memsets by avoiding branches and use overlapping stores.
Use DC ZVA for copies over 128 bytes. Remove unnecessary code for ZVA sizes
other than 64 and 128. Performance of random memset benchmark improves by 24%
on Neoverse N1.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Since the last operation is destructive, the first argument to the FMA
also has to be the first argument to the special-case in order to
avoid unnecessary MOVs. Reorder arguments and adjust special-case
bounds to facilitate this.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
As recently discussed, document that freopen does not work with
streams opened with functions such as popen, fmemopen, open_memstream
or fopencookie. I've filed
<https://austingroupbugs.net/view.php?id=1855> to clarify this issue
in POSIX.
Tested with "make info" and "make html".
Using TLS directly introduces a GLIBC_PRIVATE ABI dependency
into libc_nonshared.a, and thus indirectly into applications.
Adding the !defined LIBC_NONSHARED condition deactivates direct
TLS access, and libc_nonshared.a code switches to using
__errno_location, like application code.
Currently, this has no effect because there is no code in
libc_nonshared.a that accesses errno.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Sync tzselect, zdump, zic to TZDB 2024b. This patch incorporates the
following TZDB source code changes:
6903dde3 Release 2024b
812aff32 Improve historical transitions in Mexico 1921-1997
52662566 Adjust to mailing list software change
7748036b Mention Internet RFC 9557
339e81d1 Mention Levine’s proposal to replace leap seconds
b4e6ad2d No leap second on 2024-12-31
7eb5bf88 Asia/Choibalsan is now an alias for Asia/Ulaanbaatar
43450cbf Improve historical data for Portugal and former possessions.
13d7348b Typo and validation fixes.
3c39cde8 Fix typo for “removed” in a comment
03fd9e45 More documentation updates for POSIX.1-2024
eb3bcceb POSIX.1-2014 is now published
913b0410 tzselect: support POSIX.1-2024 offset range
b5318b55 Document POSIX.1-2024 better
837609b7 Fix typo when making .txt man pages
d56ae6ee SUPPORT_C89 now defaults to 1, not 0
b1fe113d Port ! to Solaris make
8f1fd321 Avoid crash in Solaris 10 /usr/xpg4/bin/make
e0fcfdd6 Use ‘export VAR=VAL’ syntax
eba43166 Avoid an awk invocation via $'...'
36479a80 Avoid some subshells in tzselect
7f6cf054 * tzselect.ksh: Assume POSIX.2 awk.
a1cf1daf * tzselect.ksh: Assume POSIX.2 $PWD.
a9b8e536 Assume POSIX.2 command substitution
eaa4ef16 Avoid subshells when possible
9dac9eb7 Prefer $PWD to $(pwd) in Makefile
fada6a4c Prefer $(CMD) to `CMD` in Makefile
3e871b9a Assume POSIX.2 and eschew ‘expr’
c5d67805 difftime isn’t pure either
5857c056 * CONTRIBUTING: Document build assumptions.
6822cc82 ‘make check’ no longer depends on curl+Internet
cc6eb255 Document GCC bug 114833 and workaround
bcbc86bf Scale back on function attribute use
c0789e46 C23 [[reproducible]] and [[unsequenced]] fixups
bbd88154 More updates to GCC_DEBUG_FLAGS for GCC 14
1a35b7c8 Spelling fixes
f71085f2 POSIX.1-2024 removes asctime_r, ctime_r
70856f8e Adjust to refactored location of ctime, ctime_r
aacd151d Update GCC_DEBUG_FLAGS for GCC 14
967dcf3b Sub-second history for Maputo and Zurich
782d0826 Make EET, MET and WET links
a0b09c02 Mark CET, CST6CDT etc. as obsolescent
db7fb40d Document SMPTE timecodes and rolling leaps
97232e18 Don’t be so sure about leap seconds going away
5b6a74fb Update some URLs
a75a6251 * zic.8: Tweak for consistency.
1e75b31f Document what %s means before any rule applies
00c96cbb Conform to RFC 8536 section 3.2 for default type
3e944959 Document problems with stripped-down TZif readers
20fc91cf Shanks is likely wrong about Maputo switch to CAT
d99589b6 * zic.8: Add missing tab character.
94e6b3b0 Switch to %z in main dataform
2cd57b93 Treat W-Eur like Port when reguarding
ad6f6d94 Check that main.zi agrees with sources
a43b030f .gitignore: Add .pdf, .ps, .s. Remove obsolete ‘yearistype’.
253ca020 * theory.html: ‘CLT’ → ‘LTC’ (per Michael H Deckers)
a3dee8c8 * NEWS: ‘how’ → ‘now’ (thanks to Paul Goyette).
ea6341c5 * theory.html: Mention NASA and CLT (per Arthur David Olson).
0dcebe37 America/Scoresbysund matches America/Nuuk from now on
b1e07fb0 Update Vzic link (thanks to Allen Winter)
a4b05030 Fix wday/mday typo in previous patch
732a4803 Document how to detect mktime failure reliably
a64067e9 ziguard.awk: generalize for proposed Portugal patch
59c861fd Line up zdump examples
66c106c9 tzfile.5: srcfix
e5553001 Fix .RS/.RE problem in tzfile.5
d647eb01 Add Doctorow book
59d4a1ba Asia/Almaty matches Asia/Tashkent from now on
d4d3c3ba * asia: Update Philippine URLs (thanks to Guy Harris).
9fc11a27 Port unlikely overflow check to C23
b52a2969 Fix 2023d NEWS typo
e48c5b53 Cite "The NTP Leap Second File"
b1dc2122 Update Israel tz-link
6cf4e912 Extrapolate less from the 2022 CGPM resolution.
It fixes glibc build with gcc master [1].
Checked on x86_64-linux-gnu and on i686-linux-gnu.
[1] https://sourceware.org/pipermail/libc-alpha/2024-September/159571.html
Reviewed-by: Paul Eggert <eggert@cs.ucla.edu>
As reported in bug 23675 and shown up in the recently added tests of
different cases of freopen (relevant part of the test currently
conditioned under #if 0 to avoid a failure resulting from this bug),
freopen wrongly forces the stream to unoriented even when a mode with
,ccs= is specified, though such a mode is supposed to result in a
wide-oriented stream. Move the clearing of _mode to before the actual
reopening occurs, so that the main fopen implementation can leave a
wide-oriented stream in the ,ccs= case.
Tested for x86_64.
Add new file libio/tst-fclosed-unopened.c that tests whether fclose on
an unopened file returns EOF.
Calling fclose on unopened files normally causes a use-after-free bug,
however the standard streams are an exception since they are not
deallocated by fclose.
fclose returning EOF for unopened files is not part of the external
contract but there are dependancies on this behaviour. For example,
gnulib's close_stdout in lib/closeout.c.
Tested for x86_64.
Signed-off-by: Aaron Merey <amerey@redhat.com>
As reported in bug 32140, freopen leaks the FILE object when it
returns NULL: there is no valid use of the FILE * pointer (including
passing to freopen again or to fclose) after such an error return, so
the underlying object should be freed. Add code to free it.
Note 1: while I think it's clear from the relevant standards that the
object should be freed and the FILE * can't be used after the call in
this case (the stream is closed, which ends the lifetime of the FILE),
it's entirely possible that some existing code does in fact try to use
the existing FILE * in some way and could be broken by this change.
(Though the most common case for freopen may be stdin / stdout /
stderr, which _IO_deallocate_file explicitly checks for and does not
deallocate.)
Note 2: the deallocation is only done in the _IO_IS_FILEBUF case.
Other kinds of streams bypass all the freopen logic handling closing
the file, meaning a call to _IO_deallocate_file would neither be safe
(the FILE might still be linked into the list of all open FILEs) nor
sufficient (other internal memory allocations associated with the file
would not have been freed). I think the validity of freopen for any
other kind of stream will need clarifying with the Austin Group, but
if it is valid in any such case (where "valid" means "not undefined
behavior so required to close the stream" rather than "required to
successfully associate the stream with the new file in cases where
fopen would work"), more significant changes would be needed to ensure
the stream gets fully closed.
Tested for x86_64.
As reported in bug 32134, freopen does not clear the flags set in
fp->_flags2 by the "e", "m" or "c" mode characters. Clear these so
that they can be set or not as appropriate from the mode string passed
to freopen. The relevant test for "e" in tst-freopen2-main.c is
enabled accordingly; "c" is expected to be covered in a separately
written test (and while tst-freopen2-main.c does include transitions
to and from "m", that's not really a semantic flag intended to result
in behaving in an observably different way).
Tested for x86_64.
This allows to monitor the exact file system operations
performed by glibc and inject errors.
Hurd does not have <sys/mount.h>. To get the sources to compile
at least, the same approach as in support/test-container.c is used.
Reviewed-by: DJ Delorie <dj@redhat.com>
Upon error, return the errno value set by the __getdents call
in __readdir_unlocked. Previously, kernel-reported errors
were ignored.
Reviewed-by: DJ Delorie <dj@redhat.com>
Use static functions for readdir/readdir_r, so that
-D_FILE_OFFSET_BITS=64 does not improperly redirect calls to the wrong
implementation.
Reviewed-by: DJ Delorie <dj@redhat.com>
And include the required licensing information. The only
change is a removed trailing empty line in
LICENSES/exceptions/Linux-syscall-note.
Bundling <linux/fuse.h> is the recommended way to deal with
the evolution of the FUSE userspace interface because
structs change sizes over time. The kernel maintains
compatibility, but source-level compatibility on recompilation
may require additional code that is aware of older struct sizes.
Signed-off-by: Florian Weimer <fweimer@redhat.com>
Reviewed-by: DJ Delorie <dj@redhat.com>
freopen is rather minimally tested in libio/tst-freopen and
libio/test-freopen. Add some more thorough tests, covering different
cases for change of mode in particular. The tests are run for both
freopen and freopen64 (given that those functions have two separate
copies of much of the code, so any bug fix directly in the freopen
code would probably need applying in both places).
Note that there are two parts of the tests disabled because of bugs
discovered through running the tests, with bug numbers given in
comments. I expect to address those separately. The tests also don't
cover changes to cancellation ("c" in mode); I think that will better
be handled through a separate test. Also to handle separately:
testing on stdin / stdout / stderr; documenting lack of support for
streams opened with popen / fmemopen / open_memstream / fopencookie;
maybe also a chroot test without /proc; maybe also more thorough tests
for large file handling on 32-bit systems (freopen64).
Tested for x86_64.
_wide_data and _mode are not available in legacy code, so do not attempt
to free the wide backup buffer in legacy code.
Resolves: BZ #32137 and BZ #27821
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
As reported in bug 32045, it's incorrect for strtod/nan functions to
set errno based on overflowing payload (strtod should only set errno
for overflow / underflow of its actual result, and potentially if
nothing in the string can be parsed as a number at all; nan should be
a pure function that never sets it). Save and restore errno around
the internal strtoull call and add associated test coverage.
Tested for x86_64.
There are two separate sets of tests of NaN payloads in glibc:
* libm-test-{get,set}payload* verify that getpayload, setpayload,
setpayloadsig and __builtin_nan functions are consistent in their
payload handling.
* test-nan-payload verifies that strtod-family functions and the
not-built-in nan functions are consistent in their payload handling.
Nothing, however, connects the two sets of functions (i.e., verifies
that strtod / nan are consistent with getpayload / setpayload /
__builtin_nan).
Improve test-nan-payload to check actual payload value with getpayload
rather than just verifying that the strtod and nan functions produce
the same NaN. Also check that the NaNs produced aren't signaling and
extend the tests to cover _FloatN / _FloatNx.
Tested for x86_64.
On Linux most descriptors that do not correspond to file system
entities (such as anonymous pipes and sockets) have file permissions
that can be changed. While it is possible to create a custom file
system that returns (say) EINVAL for an fchmod attempt, testing this
does not appear to be useful.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
In __syscall_cancel_arch, there's a tail call to __syscall_do_cancel.
On P10, since the caller uses the TOC and the callee is using
PC-relative addressing, there's only a branch instruction with no NOPs
to restore the TOC, which causes the build error. The fix involves adding
the NOTOC directive to the branch instruction, informing the linker
not to generate a TOC stub, thus resolving the issue.
Some of the strtod tests use type-generic machinery in tst-strtod.h to
test the strto* functions for all floating types, while others only
test double even when the tests are in fact meaningful for all
floating types.
Convert the tests of the internal __strtod_internal interface to cover
all floating types. I haven't tried to convert them to use newer test
interfaces in other ways, just made the changes necessary to use the
type-generic machinery. As an internal interface, there are no
aliases for different types with the same ABI (however,
__strtold_internal is defined even if long double has the same ABI as
double), so macros used by the type-generic testing code are redefined
as needed to avoid expecting such aliases to be present.
Tested for x86_64.
As reported in bug 30220, the implementation of strtod-family
functions has a bug in the following case: the input string would,
with infinite exponent range, take one more bit to represent than is
available in the normal precision of the return type; the value
represented is in the subnormal range; and there are no nonzero bits
in the value, below those that can be represented in subnormal
precision, other than the least significant bit and possibly the
0.5ulp bit. In this case, round_and_return ends up discarding the
least significant bit.
Fix by saving that bit to merge into more_bits (it can't be merged in
at the time it's computed, because more_bits mustn't include this bit
in the case of after-rounding tininess detection checking if the
result is still subnormal when rounded to normal precision, so merging
this bit into more_bits needs to take place after that check).
Tested for x86_64.
Add tests of underflow in tst-strtod-round, and thus also test for
errno being unchanged when there is neither overflow nor underflow.
The errno setting before the function call to test for being unchanged
is adjusted to set errno to 12345 instead of 0, so that any bugs where
strtod sets errno to 0 would be detected.
This doesn't add any new test inputs for tst-strtod-round, and in
particular doesn't cover the edge cases of underflow the way
tst-strtod-underflow does (none of the existing test inputs for
tst-strtod-round actually exercise cases that have underflow with
before-rounding tininess detection but not with after-rounding
tininess detection), but at least it provides some coverage (as per
the recent discussions) that ordinary non-overflowing non-underflowing
inputs to these functions do not set errno.
Tested for x86_64.
Reference this new section from the O_PATH documentation.
And document the functions openat, openat64, fstatat, fstatat64.
(The safety assessment for fstatat was already obsolete because
current glibc assumes kernel support for the underlying system
call.)
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This patch uses 'Avoid_Non_Temporal_Memset' flag to access
the non-temporal memset implementation for hygon processors.
Test Results:
hygon1 arch
x86_memset_non_temporal_threshold = 8MB
size new performance time / old performance time
1MB 0.994
4MB 0.996
8MB 0.670
16MB 0.343
32MB 0.355
hygon2 arch
x86_memset_non_temporal_threshold = 8MB
size new performance time / old performance time
1MB 1
4MB 1
8MB 1.312
16MB 0.822
32MB 0.830
hygon3 arch
x86_memset_non_temporal_threshold = 8MB
size new performance time / old performance time
1MB 1
4MB 0.990
8MB 0.737
16MB 0.390
32MB 0.401
For hygon arch with this patch, non-temporal stores can improve
performance by 20% - 65%.
Signed-off-by: Feifei Wang <wangfeifei@hygon.cn>
Reviewed-by: Jing Li <lijing@hygon.cn>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
Add hygon branch in dl_init_cacheinfo function to initialize
cache size variables for hygon processors. In the meanwhile,
add handle_hygon() function to get cache information.
Signed-off-by: Feifei Wang <wangfeifei@hygon.cn>
Reviewed-by: Jing Li <lijing@hygon.cn>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
Add a new architecture type arch_kind_hygon to spilt Hygon branch
from AMD. This is to facilitate the Hygon processors to make settings
that are suitable for its own characteristics.
Signed-off-by: Feifei Wang <wangfeifei@hygon.cn>
Reviewed-by: Jing Li <lijing@hygon.cn>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
One can be very unlucky to call time_now first just before a second switch,
and mach_msg sleep just a bit more enough for the second time_now call to
count one second too many (or even more if scheduling is really unlucky).
So we have to protect against returning a bogus negative value in such case.
This patch modifies the current Power9 implementation of strcpy and
stpcpy to optimize it for Power9 and Power10.
No new Power10 instructions are used, so the original Power9 strcpy
is modified instead of creating a new implementation for Power10.
The changes also affect stpcpy, which uses the same implementation
with some additional code before returning.
Improvements compared to the old Power9 version:
Use simple comparisons for the first ~512 bytes:
The main loop is good for long strings, but comparing 16B each time is
better for shorter strings. After aligning the address to 16 bytes, we
unroll the loop four times, checking 128 bytes each time. There may be
some overlap with the main loop for unaligned strings, but it is better
for shorter strings.
Loop with 64 bytes for longer bytes:
Use 4 consecutive lxv/stxv instructions.
Showed an average improvement of 13%.
Reviewed-by: Paul E. Murphy <murphyp@linux.ibm.com>
Reviewed-by: Peter Bergner <bergner@linux.ibm.com>
The current racy approach is to enable asynchronous cancellation
before making the syscall and restore the previous cancellation
type once the syscall returns, and check if cancellation has happen
during the cancellation entrypoint.
As described in BZ#12683, this approach shows 2 problems:
1. Cancellation can act after the syscall has returned from the
kernel, but before userspace saves the return value. It might
result in a resource leak if the syscall allocated a resource or a
side effect (partial read/write), and there is no way to program
handle it with cancellation handlers.
2. If a signal is handled while the thread is blocked at a cancellable
syscall, the entire signal handler runs with asynchronous
cancellation enabled. This can lead to issues if the signal
handler call functions which are async-signal-safe but not
async-cancel-safe.
For the cancellation to work correctly, there are 5 points at which the
cancellation signal could arrive:
[ ... )[ ... )[ syscall ]( ...
1 2 3 4 5
1. Before initial testcancel, e.g. [*... testcancel)
2. Between testcancel and syscall start, e.g. [testcancel...syscall start)
3. While syscall is blocked and no side effects have yet taken
place, e.g. [ syscall ]
4. Same as 3 but with side-effects having occurred (e.g. a partial
read or write).
5. After syscall end e.g. (syscall end...*]
And libc wants to act on cancellation in cases 1, 2, and 3 but not
in cases 4 or 5. For the 4 and 5 cases, the cancellation will eventually
happen in the next cancellable entrypoint without any further external
event.
The proposed solution for each case is:
1. Do a conditional branch based on whether the thread has received
a cancellation request;
2. It can be caught by the signal handler determining that the saved
program counter (from the ucontext_t) is in some address range
beginning just before the "testcancel" and ending with the
syscall instruction.
3. SIGCANCEL can be caught by the signal handler and determine that
the saved program counter (from the ucontext_t) is in the address
range beginning just before "testcancel" and ending with the first
uninterruptable (via a signal) syscall instruction that enters the
kernel.
4. In this case, except for certain syscalls that ALWAYS fail with
EINTR even for non-interrupting signals, the kernel will reset
the program counter to point at the syscall instruction during
signal handling, so that the syscall is restarted when the signal
handler returns. So, from the signal handler's standpoint, this
looks the same as case 2, and thus it's taken care of.
5. For syscalls with side-effects, the kernel cannot restart the
syscall; when it's interrupted by a signal, the kernel must cause
the syscall to return with whatever partial result is obtained
(e.g. partial read or write).
6. The saved program counter points just after the syscall
instruction, so the signal handler won't act on cancellation.
This is similar to 4. since the program counter is past the syscall
instruction.
So The proposed fixes are:
1. Remove the enable_asynccancel/disable_asynccancel function usage in
cancellable syscall definition and instead make them call a common
symbol that will check if cancellation is enabled (__syscall_cancel
at nptl/cancellation.c), call the arch-specific cancellable
entry-point (__syscall_cancel_arch), and cancel the thread when
required.
2. Provide an arch-specific generic system call wrapper function
that contains global markers. These markers will be used in
SIGCANCEL signal handler to check if the interruption has been
called in a valid syscall and if the syscalls has side-effects.
A reference implementation sysdeps/unix/sysv/linux/syscall_cancel.c
is provided. However, the markers may not be set on correct
expected places depending on how INTERNAL_SYSCALL_NCS is
implemented by the architecture. It is expected that all
architectures add an arch-specific implementation.
3. Rewrite SIGCANCEL asynchronous handler to check for both canceling
type and if current IP from signal handler falls between the global
markers and act accordingly.
4. Adjust libc code to replace LIBC_CANCEL_ASYNC/LIBC_CANCEL_RESET to
use the appropriate cancelable syscalls.
5. Adjust 'lowlevellock-futex.h' arch-specific implementations to
provide cancelable futex calls.
Some architectures require specific support on syscall handling:
* On i386 the syscall cancel bridge needs to use the old int80
instruction because the optimized vDSO symbol the resulting PC value
for an interrupted syscall points to an address outside the expected
markers in __syscall_cancel_arch. It has been discussed in LKML [1]
on how kernel could help userland to accomplish it, but afaik
discussion has stalled.
Also, sysenter should not be used directly by libc since its calling
convention is set by the kernel depending of the underlying x86 chip
(check kernel commit 30bfa7b3488bfb1bb75c9f50a5fcac1832970c60).
* mips o32 is the only kABI that requires 7 argument syscall, and to
avoid add a requirement on all architectures to support it, mips
support is added with extra internal defines.
Checked on aarch64-linux-gnu, arm-linux-gnueabihf, powerpc-linux-gnu,
powerpc64-linux-gnu, powerpc64le-linux-gnu, i686-linux-gnu, and
x86_64-linux-gnu.
[1] https://lkml.org/lkml/2016/3/8/1105
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
The test io/tst-mkdirat doesn't verify the permissions on the created
directory (thus, doesn't verify at all anything about how mkdirat uses
the mode argument). Add checks of this to the existing test.
Tested for x86_64.
There is very little test coverage for getline (only a minimal
stdio-common/tstgetln.c which doesn't verify anything about the
results of the getline calls). Add some more thorough tests
(generally using fopencookie for convenience in testing various cases
for what the input and possible errors / EOF in the file read might
look like).
Note the following regarding testing of error cases:
* Nothing is said in the specifications about what if anything might
be written into the buffer, and whether it might be reallocated, in
error cases. The expectation of the tests (required to avoid memory
leaks on error) is that at least on error cases, the invariant that
lineptr points to at least n bytes is maintained.
* The optional EOVERFLOW error case specified in POSIX, "The number of
bytes to be written into the buffer, including the delimiter
character (if encountered), would exceed {SSIZE_MAX}.", doesn't seem
practically testable, as any case reading so many characters (half
the address space) would also be liable to run into allocation
failure along (ENOMEM) the way.
* If a read error occurs part way through reading an input line, it
seems unclear whether a partial line should be returned by getline
(avoid input getting lost), which is what glibc does at least in the
fopencookie case used in this test, or whether getline should return
-1 (error) (so avoiding the program misbehaving by processing a
truncated line as if it were complete). (There was a short,
inconclusive discussion about this on the Austin Group list on 9-10
November 2014.)
* The POSIX specification of getline inherits errors from fgetc. I
didn't try to cover fgetc errors systematically, just one example of
such an error.
Tested for x86_64 and x86.
This will avoid in the future cases like a57cbbd853 ("malloc: Link
threading tests with $(shared-thread-library") missing the memcheck
cases added in 251843e16f ("malloc: Link threading tests with
$(shared-thread-library)")
Tests for if_nameindex, if_name2index, and if_index2name
Tests that valid results are consistent.
Tests that invalid parameters fail correctly.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
With ia64 removal, the function descriptor supports is only used
by HPPA and new architectures do not seem leaning towards this
design.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
This commit uses a common implementation 'strnlen-evex-base.S' for both
'strnlen-evex' and 'strnlen-evex512'
This patch serves both to reduce the number of implementations, and it also does some small optimizations that benefit strnlen-evex and strnlen-evex512.
All tests pass on x86.
Benchmarks were taken on SKX.
https://www.intel.com/content/www/us/en/products/sku/123613/intel-core-i97900x-xseries-processor-13-75m-cache-up-to-4-30-ghz/specifications.html
Geometric mean for strnlen-evex over all benchmarks (N=10) was (new/old) 0.881
Geometric mean for strnlen-evex512 over all benchmarks (N=10) was (new/old) 0.953
Code Size Changes:
strnlen-evex : +31 bytes
strnlen-evex512 : +156 bytes
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
Secondary namespaces have a different malloc. Allocating the
buffer in one namespace and freeing it another results in
heap corruption. Fix this by using a static string (potentially
translated) in secondary namespaces. It would also be possible
to use the malloc from the initial namespace to manage the
buffer, but these functions would still not be safe to use in
auditors etc. because a call to strerror could still free a
buffer while it is used by the application. Another approach
could use proper initial-exec TLS, duplicated in secondary
namespaces, but that would need a callback interface for freeing
libc resources in namespaces on thread exit, which does not exist
today.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Macros will automatically use the correct types, without
having to fiddle with internal glibc macros. It's also
impossible to get the types wrong due to aliasing because
support_check_stat_fd and support_check_stat_path do not
depend on the struct stat* types.
The changes reveal some inconsistencies in tests.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This avoids the need to define struct_statx to an appropriate
struct stat type variant because struct statx does not change
based on time/file offset flags.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This is currently implied by the internal headers, but it makes
sense not to rely on this.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This is not needed: include/intprops.h has its own detection logic.
It makes building these files outside of glibc easer.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Fix an issue with commit 8f4632deb3 ("Linux: rseq registration tests")
and prevent testing from being run in the process of the test driver
itself rather than just the test child where one has been forked. The
problem here is the unguarded use of a destructor to call a part of the
testing. The destructor function, 'do_rseq_destructor_test' is called
implicitly at program completion, however because it is associated with
the executable itself rather than an individual process, it is called
both in the test child *and* in the test driver itself.
Prevent this from happening by providing a guard variable that only
enables test invocation from 'do_rseq_destructor_test' in the process
that has first run 'do_test'. Consequently extra testing is invoked
from 'do_rseq_destructor_test' only once and in the correct process,
regardless of the use or the lack of of the '--direct' option. Where
called in the controlling test driver process that has neved called
'do_test' the destructor function silently returns right away without
taking any further actions, letting the test driver fail gracefully
where applicable.
This arrangement prevents 'tst-rseq-nptl' from ever causing testing to
hang forever and never complete, such as currently happening with the
'mips-linux-gnu' (o32 ABI) target.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Previously if the setaffinity wrapper failed the rest of the subtest
would not execute and the current subtest would be reported as passing.
Now if the setaffinity wrapper fails the subtest is correctly reported
as faling. Tested manually by changing the conditions of the affinity
call including setting size to zero, or checking the wrong condition.
No regressions on x86_64.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
If a file descriptor is left unclosed and is cleaned up by _IO_cleanup
on exit, its backup buffer remains unfreed, registering as a leak in
valgrind. This is not strictly an issue since (1) the program should
ideally be closing the stream once it's not in use and (2) the program
is about to exit anyway, so keeping the backup buffer around a wee bit
longer isn't a real problem. Free it anyway to keep valgrind happy
when the streams in question are the standard ones, i.e. stdout, stdin
or stderr.
Also, the _IO_have_backup macro checks for _IO_save_base,
which is a roundabout way to check for a backup buffer instead of
directly looking for _IO_backup_base. The roundabout check breaks when
the main get area has not been used and user pushes a char into the
backup buffer with ungetc. Fix this to use the _IO_backup_base
directly.
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
When ungetc is called on an unused stream, the backup buffer is
allocated without the main get area being present. This results in
every subsequent ungetc (as the stream remains in the backup area)
checking uninitialized memory in the backup buffer when trying to put a
character back into the stream.
Avoid comparing the input character with buffer contents when in backup
to avoid this uninitialized read. The uninitialized read is harmless in
this context since the location is promptly overwritten with the input
character, thus fulfilling ungetc functionality.
Also adjust wording in the manual to drop the paragraph that says glibc
cannot do multiple ungetc back to back since with this change, ungetc
can actually do this.
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
The goal of this flag is to allow targets which don't prefer/have ERMS
to still access the non-temporal memset implementation.
There are 4 cases for tuning memset:
1) `Avoid_STOSB && Avoid_Non_Temporal_Memset`
- Memset with temporal stores
2) `Avoid_STOSB && !Avoid_Non_Temporal_Memset`
- Memset with temporal/non-temporal stores. Non-temporal path
goes through `rep stosb` path. We accomplish this by setting
`x86_rep_stosb_threshold` to
`x86_memset_non_temporal_threshold`.
3) `!Avoid_STOSB && Avoid_Non_Temporal_Memset`
- Memset with temporal stores/`rep stosb`
3) `!Avoid_STOSB && !Avoid_Non_Temporal_Memset`
- Memset with temporal stores/`rep stosb`/non-temporal stores.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
This is just a refactor and there should be no behavioral change from
this commit.
The goal is to make `Avoid_Non_Temporal_Memset` a more universal knob
for controlling whether we use non-temporal memset rather than having
extra logic based on vendor.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
Issue was we were expecting not matches with CHAR before the start of
the string in the page cross case.
The check code in the page cross case:
```
and $0xffffffffffffffc0,%rax
vmovdqa64 (%rax),%zmm17
vpcmpneqb %zmm17,%zmm16,%k1
vptestmb %zmm17,%zmm17,%k0{%k1}
kmovq %k0,%rax
inc %rax
shr %cl,%rax
je L(continue)
```
expects that all characters that neither match null nor CHAR will be
1s in `rax` prior to the `inc`. Then the `inc` will overflow all of
the 1s where no relevant match was found.
This is incorrect in the page-cross case, as the
`vmovdqa64 (%rax),%zmm17` loads from before the start of the input
string.
If there are matches with CHAR before the start of the string, `rax`
won't properly overflow.
The fix is quite simple. Just replace:
```
inc %rax
shr %cl,%rax
```
With:
```
sar %cl,%rax
inc %rax
```
The arithmetic shift will clear any matches prior to the start of the
string while maintaining the signbit so the 1s can properly overflow
to zero in the case of no matches.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
We have no tests that errno is set to ERANGE on overflow of
strtod-family functions (we do have some tests for underflow, in
tst-strtod-underflow). Add such tests to tst-strtod-round.
Tested for x86_64.
In _dl_tlsdesc_dynamic, there are three 'addi.d sp, sp, -size'
instructions to allocate stack size for Float/LSX/LASX registers.
Every 'addi.d sp, sp, -size' needs a cfi_adjust_cfa_offset because
of sp is used to compute CFA. But only one 'addi.d sp, sp, -size'
will be run according to HWCAP value. And all cfi_adjust_cfa_offset
will be executed in stack unwinding, it result in incorrect CFA.
Change _dl_tlsdesc_dynamic to _dl_tlsdesc_dynamic,
_dl_tlsdesc_dynamic_lsx and _dl_tlsdesc_dynamic_lasx.
Conflicting cfi instructions can be distributed to the three functions.
And cfi instructions can correspond to stack down instructions.
Fix an issue with commit b74121ae4b ("Update.") and prevent a stray
process from being left behind by tst-cancel7 (and also tst-cancelx7,
which is the same test built with '-fexceptions' additionally supplied
to the compiler), which then blocks remote testing until the process has
been killed by hand.
This test case creates a thread that runs an extra copy of the test via
system(3) and using the '--direct' option so that the test wrapper does
not interfere with this instance. This extra copy executes its business
and calls sigsuspend(2) and then never terminates by itself. Instead it
relies on being killed by the main test process directly via a thread
cancellation request or, should that fail, by issuing SIGKILL either at
the conclusion of 'do_test' or by the test driver via 'do_cleanup' where
the test timeout has been hit or the test driver interrupted.
However if the main test process has been instead killed by a signal,
such as due to incorrect execution, before it had a chance to kill the
extra copy of the test case, then the test wrapper will terminate
without running 'do_cleanup' and consequently the extra copy of the test
case will remain forever in its suspended state, and in the remote case
in particular it means that the remote test wrapper will wait forever
for the SSH command to complete.
This has been observed with the 'alpha-linux-gnu' target, where the main
test process triggers SIGSEGV and the test wrapper correctly records:
Didn't expect signal from child: got `Segmentation fault'
in nptl/tst-cancel7.out and terminates, but then the calling SSH command
continues waiting for the remaining process started in the same session
on the remote target to complete.
Address this problem by also registering 'do_cleanup' via atexit(3),
observing that 'support_delete_temp_files' is registered by the test
wrapper before the test initializing function 'do_prepare' is called and
that we call all the functions registered in the reverse of the order in
which they were registered, so it is safe to refer to 'pidfilename' in
'do_cleanup' invoked by exit(3) because by that time temporary files
have not yet been deleted.
A minor inconvenience is that if 'signal_handler' is invoked in the test
wrapper as a result of SIGALRM rather than SIGINT, then 'do_cleanup'
will be called twice, once as a cleanup handler and again by exit(3).
In reality it is harmless though, because issuing SIGKILL is guarded by
a record lock, so if the first call has succeeded in killing the extra
copy of the test case, then the subsequent call will do nothing.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Move the release of the semaphore used to synchronize between an extra
copy of the test run as a separate process and the main test process
until after the PID file has been locked. It is so that if the cleanup
function gets called by the test driver due to premature termination of
the main test process, then the function does not get at the PID file
before it has been locked and conclude that the extra copy of the test
has already terminated. This won't usually happen due to a relatively
high amount of time required to elapse before timeout triggers in the
test driver, but it will change with the next change.
There is still a small time window remaining with this change in place
where the main test process gets killed for some reason between the
extra copy of the test has been already started by pthread_create(3) and
a successful return from the call to sem_wait(3), in which case the
cleanup function can be reached before PID has been written to the PID
file and the file locked. It seems that with the test case structured
as it is now and PID-based process management we have no means to avoid
it.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Add a new randomized memset test similar to bench-random-memcpy. Instead of
repeating the same call to memset over and over again, it times a large number
of different inputs. The distribution of memset length and alignment is based
on SPEC2017 (length up to 4096 and alignment up to 64).
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Improve performance by handling another 16 bytes before entering the loop.
Use ADDHN in the loop to avoid SHRN+FMOV when it terminates. Change final
size computation to avoid increasing latency. On Neoverse V1 performance
of the random strlen benchmark improves by 4.6%.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
These functions are exp10m1, exp2m1, log10p1, log2p1.
Also regenerated ulps on x86_64.
For each format, there are 4 values, one for each rounding mode.
(For the intel96 format, there are 8 values, 4 for Intel hardware,
and 4 for AMD hardware. However, regen-ulps was only run on Intel.
It should be run in a separate patch on a AMD x86_64.)
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This is a follow-up to 10de4a47ef that
reworded the manual entries for putc and putwc and removed any
performance claims.
This commit further clarifies these entries and brings getc and getwc in
line with the descriptions of putc and putwc, removing any performance
claims from them as well.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
As for exit, also allows concurrent quick_exit to avoid race
conditions when it is called concurrently. Since it uses the same
internal function as exit, the __exit_lock lock is moved to
__run_exit_handlers. It also solved a potential concurrent when
calling exit and quick_exit concurrently.
The test case 'expected' is expanded to a value larger than the
minimum required by C/POSIX (32 entries) so at_quick_exit() will
require libc to allocate a new block. This makes the test mre likely to
trigger concurrent issues (through free() at __run_exit_handlers)
if quick_exit() interacts with the at_quick_exit list concurrently.
This is also the latest interpretation of the Austin Ticket [1].
Checked on x86_64-linux-gnu.
[1] https://austingroupbugs.net/view.php?id=1845
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
The old code used l_init_called as an indicator for whether TLS
initialization was complete. However, it is possible that
TLS for an object is initialized, written to, and then dlopen
for this object is called again, and l_init_called is not true at
this point. Previously, this resulted in TLS being initialized
twice, discarding any interim writes (technically introducing a
use-after-free bug even).
This commit introduces an explicit per-object flag, l_tls_in_slotinfo.
It indicates whether _dl_add_to_slotinfo has been called for this
object. This flag is used to avoid double-initialization of TLS.
In update_tls_slotinfo, the first_static_tls micro-optimization
is removed because preserving the initalization flag for subsequent
use by the second loop for static TLS is a bit complicated, and
another per-object flag does not seem to be worth it. Furthermore,
the l_init_called flag is dropped from the second loop (for static
TLS initialization) because l_need_tls_init on its own prevents
double-initialization.
The remaining l_init_called usage in resize_scopes and update_scopes
is just an optimization due to the use of scope_has_map, so it is
not changed in this commit.
The isupper check ensures that libc.so.6 is TLS is not reverted.
Such a revert happens if l_need_tls_init is not cleared in
_dl_allocate_tls_init for the main_thread case, now that
l_init_called is not checked anymore in update_tls_slotinfo
in elf/dl-open.c.
Reported-by: Jonathon Anderson <janderson@rice.edu>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Only return __GCONV_INCOMPLETE_INPUT for a partial match when the end of
the input buffer is reached. Otherwise it is a non-match, and other
patterns should be tried.
5476f8cd2e ("htl: move pthread_self info libc.") and
9dfa256216 ("htl: move pthread_equal into libc") to
1dc0bc8f07 ("htl: move pthread_attr_setdetachstate into libc")
moved some pthread_ symbols from libpthread.so to libc.so, but missed
adding the compat version like 5476f8cd2e ("htl: move pthread_self
info libc.") did: libc already had these symbols as forwards,
but versioned GLIBC_2.21, while the symbols in libpthread.so were
versioned GLIBC_2.12.
To fix running executables built before this, we thus have to add the
GLIBC_2.12 version, otherwise execution fails with e.g.
/usr/lib/i386-gnu/libglib-2.0.so: symbol lookup error: /usr/lib/i386-gnu/libglib-2.0.so: undefined symbol: pthread_attr_setinheritsched, version GLIBC_2.12
Previous GCC versions do not support the C23 change that
allows labels on declarations.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Add tests for MREMAP_MAYMOVE and MREMAP_FIXED. On Linux, also test
MREMAP_DONTUNMAP.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Update the mremap C implementation to support the optional argument for
MREMAP_DONTUNMAP added in Linux 5.7 since it may not always be correct
to implement a variadic function as a non-variadic function on all Linux
targets. Return MAP_FAILED and set errno to EINVAL for unknown flag bits.
This fixes BZ #31968.
Note: A test must be added when a new flag bit is introduced.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Add string/test-strncmp-nonarray and
wcsmbs/test-wcsncmp-nonarray.
This is the test that uncovered bug 31934. Test run time
is more than one minute on a fairly current system, so turn
these into xtests that do not run automatically.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
This helps HotColdSplitting in GCC/LLVM.
Thought about doing `exit` as well since its only called once per
process, but since its easy to imagine a hot path leading into
`exit(0)`, its less clear if its profitable.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Even if C/POSIX standard states that exit is not formally thread-unsafe,
calling it more than once is UB. The glibc already supports
it for the single-thread, and both elf/nodelete2.c and tst-rseq-disable.c
call exit from a DSO destructor (which is called by _dl_fini, registered
at program startup with __cxa_atexit).
However, there are still race issues when it is called more than once
concurrently by multiple threads. A recent Rust PR triggered this
issue [1], which resulted in an Austin Group ask for clarification [2].
Besides it, there is a discussion to make concurrent calling not UB [3],
wtih a defined semantic where any remaining callers block until the first
call to exit has finished (reentrant calls, leaving through longjmp, and
exceptions are still undefined).
For glibc, at least reentrant calls are required to be supported to avoid
changing the current behaviour. This requires locking using a recursive
lock, where any exit called by atexit() handlers resumes at the point of
the current handler (thus avoiding calling the current handle multiple
times).
Checked on x86_64-linux-gnu and aarch64-linux-gnu.
[1] https://github.com/rust-lang/rust/issues/126600
[2] https://austingroupbugs.net/view.php?id=1845
[3] https://www.openwall.com/lists/libc-coord/2024/07/24/4
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
It was added by commit c62b758bae6af16 as a way for userspace to
check if two file descriptors refer to the same struct file.
Checked on aarch64-linux-gnu.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
This patch updates the kernel version in the tests tst-mman-consts.py,
tst-mount-consts.py, and tst-pidfd-consts.py to 6.9.
There are no new constants covered by these tests in 6.10.
Tested with build-many-glibcs.py.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Linux 6.10 changes for syscall are:
* mseal for all architectures.
* map_shadow_stack for x32.
* Replace sync_file_range with sync_file_range2 for csky (which
fixes a broken sync_file_range usage).
Update syscall-names.list and regenerate the arch-syscall.h headers
with build-many-glibcs.py update-syscalls.
Tested with build-many-glibcs.py.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
The existing description for setrlimit() has some ambiguity. It could be
understood to have the semantics of getrlimit(), i.e., the limits from the
process are stored in the provided rlp pointer.
Make the description more explicit that rlp are the input values, and that
the limits of the process is changed with this function.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
The manual entry for `putc' described what "most systems" do instead of
describing the glibc implementation and its guarantees. This commit
fixes that by warning that putc may be implemented as a macro that
double-evaluates `stream', and removing the performance claim.
Even though the current `putc' implementation does not double-evaluate
`stream', offering this obscure guarantee as an extension to what
POSIX allows does not seem very useful.
The entry for `putwc' is also edited to bring it in line with `putc'.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
This helps compilers split the codegen for setting up the arguments
(`__expression`, `__filename`, etc...) from the potentially hot cold
where the `assert` is to a presumably cold region on the assertion
failure path.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Reviewed-by: Sam James <sam@gentoo.org>
Complement commit b03e4d7bd2 ("stdio: fix vfscanf with matches longer
than INT_MAX (bug 27650)") and add a test case for the issue, inspired
by the reproducer provided with the bug report.
This has been verified to succeed as from the commit referred and fail
beforehand.
As the test requires 2GiB of data to be passed around its performance
has been evaluated using a choice of systems and the execution time
determined to be respectively in the range of 9s for POWER9@2.166GHz,
24s for FU740@1.2GHz, and 40s for 74Kf@950MHz. As this is on the verge
of and beyond the default timeout it has been increased by the factor of
8. Regardless, following recent practice the test has been added to the
standard rather than extended set.
Reviewed-by: DJ Delorie <dj@redhat.com>
Add a FAIL test failure helper analogous to FAIL_RET, that does not
cause the current function to return, providing a standardized way to
report a test failure with a message supplied while permitting the
caller to continue executing, for further reporting, cleaning up, etc.
Update existing test cases that provide a conflicting definition of FAIL
by removing the local FAIL definition and then as follows:
- tst-fortify-syslog: provide a meaningful message in addition to the
file name already added by <support/check.h>; 'support_record_failure'
is already called by 'support_print_failure_impl' invoked by the new
FAIL test failure helper.
- tst-ctype: no update to FAIL calls required, with the name of the file
and the line number within of the failure site additionally included
by the new FAIL test failure helper, and error counting plus count
reporting upon test program termination also already provided by
'support_record_failure' and 'support_report_failure' respectively,
called by 'support_print_failure_impl' and 'adjust_exit_status' also
respectively. However in a number of places 'printf' is called and
the error count adjusted by hand, so update these places to make use
of FAIL instead. And last but not least adjust the final summary just
to report completion, with any error count following as reported by
the test driver.
- test-tgmath2: no update to FAIL calls required, with the name of the
file of the failure site additionally included by the new FAIL test
failure helper. Also there is no need to track the return status by
hand as any call to FAIL will eventually cause the test case to return
an unsuccesful exit status regardless of the return status from the
test function, via a call to 'adjust_exit_status' made by the test
driver.
Reviewed-by: DJ Delorie <dj@redhat.com>
Remove local FAIL macro in favor to FAIL_RET from <support/check.h>,
which provides equivalent reporting, with the name of the file of the
failure site additionally included, for the tst-truncate-common core
shared between the tst-truncate and tst-truncate64 tests.
Reviewed-by: DJ Delorie <dj@redhat.com>
Remove local FAIL macro in favor to FAIL_EXIT1 from <support/check.h>,
which provides equivalent reporting, with the name of the file and the
line number within of the failure site additionally included. Remove
FAIL_ERR altogether and include ": %m" explicitly with the format string
supplied to FAIL_EXIT1 as there seems little value to have a separate
macro just for this.
Reviewed-by: DJ Delorie <dj@redhat.com>
Use RXX_LP in RTLD_START_ENABLE_X86_FEATURES. Support shadow stack during
startup for Linux 6.10:
commit 2883f01ec37dd8668e7222dfdb5980c86fdfe277
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Fri Mar 15 07:04:33 2024 -0700
x86/shstk: Enable shadow stacks for x32
1. Add shadow stack support to x32 signal.
2. Use the 64-bit map_shadow_stack syscall for x32.
3. Set up shadow stack for x32.
Add the map_shadow_stack system call to <fixup-asm-unistd.h> and regenerate
arch-syscall.h. Tested on Intel Tiger Lake with CET enabled x32. There
are no regressions with CET enabled x86-64. There are no changes in CET
enabled x86-64 _dl_start_user.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
Remove sysdeps/x86_64/x32/dl-machine.h by folding x32 ARCH_LA_PLTENTER,
ARCH_LA_PLTEXIT and RTLD_START into sysdeps/x86_64/dl-machine.h. There
are no regressions on x86-64 nor x32. There are no changes in x86-64
_dl_start_user. On x32, _dl_start_user changes are
<_dl_start_user>:
mov %eax,%r12d
+ mov %esp,%r13d
mov (%rsp),%edx
mov %edx,%esi
- mov %esp,%r13d
and $0xfffffff0,%esp
mov 0x0(%rip),%edi # <_dl_start_user+0x14>
lea 0x8(%r13,%rdx,4),%ecx
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
For now, do not enable this mode by default due to the potential
impact on compatibility with existing deployments.
Reviewed-by: DJ Delorie <dj@redhat.com>
In single-request mode, there is no second response after an error
because the second query has not been sent yet. Waiting for it
introduces an unnecessary timeout.
Reviewed-by: DJ Delorie <dj@redhat.com>
Improve aligned_alloc/calloc/malloc test coverage by adding
multi-threaded tests with random memory allocations and with/without
cross-thread memory deallocations.
Perform a number of memory allocation calls with random sizes limited
to 0xffff.
Use the existing DSO ('malloc/tst-aligned_alloc-lib.c') to randomize
allocator selection.
The multi-threaded allocation/deallocation is staged as described below:
- Stage 1: Half of the threads will be allocating memory and the
other half will be waiting for them to finish the allocation.
- Stage 2: Half of the threads will be allocating memory and the
other half will be deallocating memory.
- Stage 3: Half of the threads will be deallocating memory and the
second half waiting on them to finish.
Add 'malloc/tst-aligned-alloc-random-thread.c' where each thread will
deallocate only the memory that was previously allocated by itself.
Add 'malloc/tst-aligned-alloc-random-thread-cross.c' where each thread
will deallocate memory that was previously allocated by another thread.
The intention is to be able to utilize existing malloc testing to ensure
that similar allocation APIs are also exposed to the same rigors.
Reviewed-by: Arjun Shankar <arjun@redhat.com>
Make sure the DSO used by aligned_alloc/calloc/malloc tests does not get
a global lock on multithreaded tests.
Reviewed-by: Arjun Shankar <arjun@redhat.com>
For each input readelf output, localplt.awk parses each 'Relocation
section' entry, checks its offset against the dynamic section entry, and
saves each DT_JMPREL, DT_RELA, and DT_REL offset value it finds. After
all lines are read, the script checks if any segment offset differed
from 0, meaning at least one 'Relocation section' was matched.
However, if the shared object was built with RELR support and the static
linker could place all the relocation on DT_RELR, there would be no
DT_JMPREL, DT_RELA, and DT_REL entries; only a DT_RELR.
For the current three ABIs that support (aarch64, x86, and powerpc64),
the powerpc64 ld.so shows the behavior above. Both x86_64 and aarch64
show extra relocations on '.rela.dyn', which makes the script check to
succeed.
This patch fixes by handling DT_RELR, where the offset is checked
against the dynamic section entries and if the shared object contains an
entry it means that there are no extra PLT entries (since all
relocations are relative).
It fixes the elf/check-localplt failure on powerpc.
Checked with a build/check for aarch64-linux-gnu, x86_64-linux-gnu,
i686-linux-gnu, arm-linux-gnueabihf, s390x-linux-gnu, powerpc-linux-gnu,
powerpc64-linux-gnu, and powerpc64le-linux-gnu.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
The powerpc pkey_get/pkey_set support was only added for 64-bit [1],
and tst-pkey only checks if the support was present with pkey_alloc
(which does not fail on powerpc32, at least running a 64-bit kernel).
Checked on powerpc-linux-gnu.
[1] https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=a803367bab167f5ec4fde1f0d0ec447707c29520
Reviewed-By: Andreas K. Huettel <dilfridge@gentoo.org>
AT_HWCAP on some architecture can indeed use all bits.
Checked on x86_64-linux-gnu and powerpc-linux-gnu.
Reviewed-By: Andreas K. Hüttel <dilfridge@gentoo.org>
Xfail elf/tst-platform-1 on x32 since kernel passes i686 in AT_PLATFORM.
See https://sourceware.org/bugzilla/show_bug.cgi?id=22363
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
This avoids a test failure when the system has no /etc/ld.so.cache.
Tested on x86_64-linux-gnu.
Signed-off-by: Xi Ruoyao <xry111@xry111.site>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
0e75c4a463 ("hurd: Fix pthread_self() without libpthread") added a
declaration for ___pthread_init_thread instead of __pthread_init_thread,
and missed defining the external hidden symbol.
5476f8cd2e ("htl: move pthread_self info libc.") moved the htl
pthread_self() function from libpthread to libc, replacing the previous libc
stub that just returns 0. And 53da64d1cf ("htl: Initialize ___pthread_self
early") added initialization code which is needed before being able to
call pthread_self. It is currently in libpthread, and thus never called
before programs can call pthread_self from libc, which then segfaults
when accessing _pthread_self()->thread.
This moves the initialization to libc itself, as initialized variables, so
pthread_self can always be called fine.
In _dl_tlsdesc_dynamic, there are three 'addi.d sp, sp, -size'
instructions to allocate stack size for Float/LSX/LASX registers.
Every 'addi.d sp, sp, -size' needs a cfi_adjust_cfa_offset because
of sp is used to compute CFA. But only one 'addi.d sp, sp, -size'
will be run according to HWCAP value. And all cfi_adjust_cfa_offset
will be executed in stack unwinding, it result in incorrect CFA.
Change _dl_tlsdesc_dynamic to _dl_tlsdesc_dynamic,
_dl_tlsdesc_dynamic_lsx and _dl_tlsdesc_dynamic_lasx.
Conflicting cfi instructions can be distributed to the three functions.
And cfi instructions can correspond to stack down instructions.
The original commit enabling non-temporal memset on Skylake Server had
erroneous benchmarks (actually done on ICX).
Further benchmarks indicate non-temporal stores may in fact by a
regression on Skylake Server.
This commit may be over-cautious in some cases, but should avoid any
regressions for 2.40.
Tested using qemu on all x86_64 cpu arch supported by both qemu +
GLIBC.
Reviewed-by: DJ Delorie <dj@redhat.com>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
We use thread_get_name and thread_set_name to get and set the thread
name, so nothing is stored in the thread structure since these functions
are supposed to be called sparingly.
One notable difference with Linux is that the thread name is up to 32
chars, whereas Linux's is 16.
Also added a mach_RPC_CHECK to check for the existing of gnumach RPCs.
save_data stores the start of the original message to be retried,
overwritten by the EINTR reply. In 64b builds the overwrite is however
rounded up to the 64b pointer size, so we have to save more than just
the 32b err.
Thanks a lot to Luca Dariz for the investigation!
Fix an issue with commit 2af4e3e566 ("Test of semaphores.") by making
the tst-sem11 and tst-sem12 tests use the test driver, preventing them
from ever causing testing to hang forever and never complete, such as
currently happening with the 'mips-linux-gnu' (o32 ABI) target. Adjust
the name of the PREPARE macro, which clashes with the interpretation of
its presence by the test driver, by using a TF_ prefix in reference to
the name of the 'tf' function.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Add a copyright notice to the tst-sem11 and tst-sem12 tests, observing
that they have been originally contributed back in 2007, with commit
2af4e3e566 ("Test of semaphores.").
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The z13/vector-optimized wcsncmp implementation segfaults if n=1
and there is only one character (equal on both strings) before
the page end. Then it loads and compares one character and misses
to check n again. The following load fails.
This patch removes the extra load and compare of the first character
and just start with the loop which uses vector-load-to-block-boundary.
This code-path also checks n.
With this patch both tests are passing:
- the simplified one mentioned in the bugzilla 31934
- the full one in Florian Weimer's patch:
"manual: Document a GNU extension for strncmp/wcsncmp"
(https://patchwork.sourceware.org/project/glibc/patch/874j9eml6y.fsf@oldenburg.str.redhat.com/):
On s390x-linux-gnu (z16), the new wcsncmp test fails due to bug 31934.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
The __rseq_size value is now the active area of struct rseq
(so 20 initially), not the full struct size including padding
at the end (32 initially).
Update misc/tst-rseq to print some additional diagnostics.
Reviewed-by: Michael Jeanson <mjeanson@efficios.com>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
The purpose of this patch is to add some system calls that (1) aren't
otherwise documented, and (2) are merely redirected to the kernel, so
can refer to their documentation; and define a standard way of doing
so in the future. A more detailed explaination of how system calls
are wrapped is added along with reference to the Linux Man-Pages
project.
Default version of man-pages is in configure.ac but can be overridden
by --with-man-pages=X.Y
Reviewed-by: Alejandro Colomar <alx@kernel.org>
ldconfig already ignores files with the -gdb.py suffix, but GDB also
looks for -gdb.gdb and -gdb.scm files. These aren't as widely used, but
libguile at least comes with a -gdb.scm file.
Rename is_gdb_python_file to is_gdb_extension_file, and make it
recognise all three types of GDB extension.
Signed-off-by: Adam Sampson <ats@offog.org>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
is_gdb_python_file is doing a similar test, so it can use this helper
function as well.
Signed-off-by: Adam Sampson <ats@offog.org>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This hasn't been looked at for a loong time (already guessing from
the number of missing entries), and it ain't pretty.
There are some 9-ulps results for float.
- ZaZaZebra (qemu-system-m68k clone of PowerBook 190 system)
- GCC 13.3.1 20240614 (Gentoo 13.3.1_p20240614 p17)
- ld GNU ld (Gentoo 2.42 p6) 2.42.0
- Linux ZaZaZebra 4.19.0-5-m68k #1 Gentoo 4.19.37-5 (2019-06-19) m68k 68040 68040 GNU/Linux
- manual build
- ../glibc/configure --enable-fortify-source --prefix=/usr
- Tested by Immolo (via Andreas K. Hüttel)
Signed-off-by: Andreas K. Hüttel <dilfridge@gentoo.org>
The __getrandom_nocancel used by __arc4random_buf uses
INLINE_SYSCALL_CALL (which returns -1/errno) and the loop checks for
the return value instead of errno to fallback to /dev/urandom.
The malloc code now uses __getrandom_nocancel_nostatus, which uses
INTERNAL_SYSCALL_CALL, so there is no need to use the variant that does
not set errno (BZ#29624).
Checked on x86_64-linux-gnu.
Reviewed-by: Xi Ruoyao <xry111@xry111.site>
While working on a patch to add support for the extensible rseq ABI, we
came across an issue where a new 'const' variable would be merged with
the existing '__rseq_size' variable. We tracked this to the use of
'-fmerge-all-constants' which allows the compiler to merge identical
constant variables. This means that all 'const' variables in a compile
unit that are of the same size and are initialized to the same value can
be merged.
In this specific case, on 32 bit systems 'unsigned int' and 'ptrdiff_t'
are both 4 bytes and initialized to 0 which should trigger the merge.
However for reasons we haven't delved into when the attribute 'section
(".data.rel.ro")' is added to the mix, only variables of the same exact
types are merged. As far as we know this behavior is not specified
anywhere and could change with a new compiler version, hence this patch.
Move the definitions of these variables into an assembler file and add
hidden writable aliases for internal use. This has the added bonus of
removing the asm workaround to set the values on rseq registration.
Tested on Debian 12 with GCC 12.2.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
This new section in the manual provides recommendations for
use of glibc in environments with higher integrity requirements.
It's reflecting both current implementation shortcomings, and
challenges we inherit from ELF and psABI requirements.
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Starting with commit
59974938fe
elf/rtld: Count skipped environment variables for enable_secure
The new testcase elf/tst-tunables-enable_secure-env segfaults on s390 (31bit).
There _start parses the auxiliary vector for some additional checks.
Therefore it skips over the zeros after the environment variables ...
0x7fffac20: 0x7fffbd17 0x7fffbd32 0x7fffbd69 0x00000000
------------------------------------------------^^^last environment variable
... and then it parses the auxiliary vector and stops at AT_NULL.
0x7fffac30: 0x00000000 0x00000021 0x00000000 0x00000000
--------------------------------^^^AT_SYSINFO_EHDR--------------^^^AT_NULL
----------------^^^newp-----------------------------------------^^^oldp
Afterwards it tries to access AT_PHDR which points to somewhere and segfaults.
Due to not incorporating the skip_env variable in the computation of oldp
when shuffling down the auxv in rtld.c, it just copies one entry with AT_NULL
and value 0x00000021 and stops the loop. In reality we have skipped
GLIBC_TUNABLES environment variable (=> skip_env=1). Thus we should copy from
here:
0x7fffac40: 0x00000021 0x7ffff000 0x00000010 0x007fffff
----------------^^^fixed-oldp
This patch fixes the computation of oldp when shuffling down auxiliary vector.
It also adds some checks in the testcase. Those checks also fail on
s390x (64bit) and x86_64 without the fix.
Co-authored-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Adhemerval noticed that the gettimeofday() and 32-bit clock_gettime()
vDSO calls won't be used by glibc on hppa, so there is no need to
declare them. Both syscalls will be emulated by utilizing return values
of the 64-bit clock_gettime() vDSO instead.
Signed-off-by: Helge Deller <deller@gmx.de>
Suggested-by: Adhemerval Zanella Netto <adhemerval.zanella@linaro.org>
The clang open fortify wrapper from 4228baef1a added
a restriction where open with 3 arguments where flags do not
contain O_CREAT or O_TMPFILE are handled as invalid. They are
not invalid, since the third argument is ignored, and the gcc
wrapper also allows it.
Checked x86_64-linux-gnu and with a yocto build for some affected
packages.
Tested-by: “Khem Raj <raj.khen@gmail.com>”
By default, if the C++ toolchain lacks support for static linking,
configure fails to find the C++ header files and the glibc build fails.
The --disable-static-c++-link-check option allows the glibc build to
finish, but static C++ tests will fail if the C++ toolchain doesn't
have the necessary static C++ libraries which may not be easily installed.
Add --disable-static-c++-tests option to skip the static C++ link check
and tests. This fixes BZ #31797.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
The current minimum GCC version of glibc build is GCC 6.2 or newer. But
building i686 glibc with GCC 6.4 on Fedora 40 failed since the C++ header
files couldn't be found which was caused by the static C++ link check
failure due to missing __divmoddi4 which was referenced in i686 libc.a
and added to GCC 7. Add --disable-static-c++-link-check configure option
to disable the static C++ link test. The newly built i686 libc.a can be
used by GCC 6.4 to create static C++ tests. This fixes BZ #31412.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Extend the list of MAP_* macros to include all macros available
to the average program (gcc -E -dM | grep MAP_*)
Extend the list of errno codes.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
MIPSr6 has MADDF.s/MADDF.d instructions, which are fused.
In MIPS ISA, double support can be subsetted. Only FMAF is enabled
for this case.
* sysdeps/mips/fpu/math-use-builtins-fma.h
Signed-off-by: YunQiang Su <syq@gcc.gnu.org>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
It turns out that quite a few applications use bundled mallocs that
have been built to use global-dynamic TLS (instead of the recommended
initial-exec TLS). The previous workaround from
commit afe42e935b ("elf: Avoid some
free (NULL) calls in _dl_update_slotinfo") does not fix all
encountered cases unfortunatelly.
This change avoids the TLS generation update for recursive use
of TLS from a malloc that was called during a TLS update. This
is possible because an interposed malloc has a fixed module ID and
TLS slot. (It cannot be unloaded.) If an initially-loaded module ID
is encountered in __tls_get_addr and the dynamic linker is already
in the middle of a TLS update, use the outdated DTV, thus avoiding
another call into malloc. It's still necessary to update the
DTV to the most recent generation, to get out of the slow path,
which is why the check for recursion is needed.
The bookkeeping is done using a global counter instead of per-thread
flag because TLS access in the dynamic linker is tricky.
All this will go away once the dynamic linker stops using malloc
for TLS, likely as part of a change that pre-allocates all TLS
during pthread_create/dlopen.
Fixes commit d2123d6827 ("elf: Fix slow
tls access after dlopen [BZ #19924]").
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
The conditionals for several mtrace-based tests in catgets, elf, libio,
malloc, misc, nptl, posix, and stdio-common were incorrect leading to
test failures when bootstrapping glibc without perl.
The correct conditional for mtrace-based tests requires three checks:
first checking for run-built-tests, then build-shared, and lastly that
PERL is not equal to "no" (missing perl).
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This will make it easier to add tests later; see also the test section
in malloc/Makefile that inspires this.
Signed-off-by: Michel Lind <salimma@fedoraproject.org>
Suggested-by: Arjun Shankar <arjun@redhat.com>
Reviewed-by: Arjun Shankar <arjun@redhat.com>
Current 'non_temporal_threshold' set to 'non_temporal_threshold_lowbound'
on Zhaoxin processors without ERMS. The default
'non_temporal_threshold_lowbound' is too small for the KH-40000 and KX-7000
Zhaoxin processors, this patch updates the value to
'shared / cachesize_non_temporal_divisor'.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
This patch optimizes large size copy using normal store when src > dst
and overlap. Make it the same as the logic in memmove-vec-unaligned-erms.S.
Current memmove-ssse3 use '__x86_shared_cache_size_half' as the non-
temporal threshold, this patch updates that value to
'__x86_shared_non_temporal_threshold'. Currently, the
__x86_shared_non_temporal_threshold is cpu-specific, and different CPUs
will have different values based on the related nt-benchmark results.
However, in memmove-ssse3, the nontemporal threshold uses
'__x86_shared_cache_size_half', which sounds unreasonable.
The performance is not changed drastically although shows overall
improvements without any major regressions or gains.
Results on Zhaoxin KX-7000:
bench-memcpy geometric_mean(N=20) New / Original: 0.999
bench-memcpy-random geometric_mean(N=20) New / Original: 0.999
bench-memcpy-large geometric_mean(N=20) New / Original: 0.978
bench-memmove geometric_mean(N=20) New / Original: 1.000
bench-memmmove-large geometric_mean(N=20) New / Original: 0.962
Results on Intel Core i5-6600K:
bench-memcpy geometric_mean(N=20) New / Original: 1.001
bench-memcpy-random geometric_mean(N=20) New / Original: 0.999
bench-memcpy-large geometric_mean(N=20) New / Original: 1.001
bench-memmove geometric_mean(N=20) New / Original: 0.995
bench-memmmove-large geometric_mean(N=20) New / Original: 0.936
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
Fix code formatting under the Zhaoxin branch and add comments for
different Zhaoxin models.
Unaligned AVX load are slower on KH-40000 and KX-7000, so disable
the AVX_Fast_Unaligned_Load.
Enable Prefer_No_VZEROUPPER and Fast_Unaligned_Load features to
use sse2_unaligned version of memset,strcpy and strcat.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
Qualcom's new core, oryon-1, has a different characteristics for
memset than the current versions of memset. For non-zero, larger
sizes, using GPRs rather than the SIMD stores is ~30% faster.
For even larger sizes, using the nontemporal stores is needed
not to polute the L1/L2 caches.
For zero values, using `dc zva` should be used. Since we
know the size will always be 64 bytes, we don't need to figure
out the size there.
I started with the emag memset and added back the `dc zva` code.
Changes since v1:
* v3: Fix comment formating
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Qualcomm's new core (oryon-1) has a different performance characteristic
than other cores. For memcpy, it is faster to use the GPRs to
do the copy for large sizes (2x faster). For even larger sizes,
it is better to use the nontemporal load/store instructions so
we don't pollute the L1/L2 caches.
For smaller sizes, the characteristic are very similar to
other cores.
I used the thunderx memcpy as a starting point and expanded from there.
Changes since v1:
* v2: Fix ordering in Makefile.
* v3: Fix comment grammar about the ldnp/stnp instructions.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The fcntl.h fortify wrapper for clang added by 86889e22db
missed the __fortify_clang_overload_arg and and also added the
mode argument for the __fortify_function_error_function function,
which leads clang to be able to correct resolve which overloaded
function it should emit.
Checked on x86_64-linux-gnu.
Reported-by: Khem Raj <raj.khem@gmail.com>
Tested-by: Khem Raj <raj.khem@gmail.com>
The mqueue.h fortify wrapper for clang added by c23107effb
is not fully correct, where correct 4 argument usage are not
being correctly handled. For instance, while building socat 1.8
with a yocto clang based system shows:
./socat-1.8.0.0/xio-posixmq.c:119:8: error: 'mq_open' is unavailable: mq_open can be called either with 2 or 4 arguments
119 | mqd = mq_open(name, oflag, opt_mode, NULL);
| ^
[...] /usr/include/bits/mqueue2.h:66:8: note: 'mq_open' has been explicitly marked unavailable here
66 | __NTH (mq_open (const char *__name, int __oflag, mode_t mode,
| ^
1 error generated.
The correct way to define the wrapper is to set invalid usage
with __fortify_clang_unavailable (for the case with 5 or more
arguments), followed by the expected ones. This fix make mq_open
similar to current open wrappers.
[1] http://www.dest-unreach.org/socat/
Reported-by: Khem Raj <raj.khem@gmail.com>
Acked-by: Khem Raj <raj.khem@gmail.com>
With gcc 14, I get this warning/werror when building the localedata tests:
tests-mbwc/tsp_common.c: In function ‘result.constprop.isra’:
tests-mbwc/tsp_common.c:55:43: error: ‘%s’ directive writing up to 92 bytes into a region of size between 0 and 114 [-Werror=format-overflow=]
55 | sprintf (result_rec, "%s:%s:%d:%d:%d:%c:%s\n", func, loc, rec_no, seq_no,
| ^~
In file included from ../include/bits/stdio2.h:1,
from ../libio/stdio.h:980,
from ../include/stdio.h:14,
from tests-mbwc/tsp_common.c:10:
In function ‘sprintf’,
inlined from ‘result.constprop.isra’ at tests-mbwc/tsp_common.c:55:3:
../libio/bits/stdio2.h:30:10: note: ‘__builtin___sprintf_chk’ output between 20 and 234 bytes into a destination of size 132
30 | return __builtin___sprintf_chk (__s, __USE_FORTIFY_LEVEL - 1,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
31 | __glibc_objsize (__s), __fmt,
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
32 | __va_arg_pack ());
| ~~~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors
This patch now gets rid of using sprintf and the result_rec buffer and just
prints to fp directly.
The test does not necessarily trigger the crash, depending on memcmp
behavior. A crash was observed in __memcmp_ia32 on i686 builds.
Reviewed-by: Paul Eggert <eggert@cs.ucla.edu>
* manual/string.texi: For strnlen (s, maxlen), do not say that s must
be of size maxlen, as it can be smaller if it is null-terminated.
This should help avoid confusion such as seen in
<https://lists.gnu.org/r/bug-gnulib/2024-06/msg00280.html>.
Mention that strnlen and wcsnlen have been in POSIX since
POSIX.1-2008.
This recently came up during a cleanup to remove misaligned accesses
from the RISC-V port.
Link: https://sourceware.org/pipermail/libc-alpha/2022-June/139961.html
Suggested-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Reviewed-by: Fangrui Song <maskray@google.com>
asm volatile ("movfcsr2gr $t0, $fcsr0" ::: "$t0");
asm volatile ("st.d $t0, %0" :"=m"(restore_fcsr));
generate to the following instructions with -Og flag:
movfcsr2gr $t0, $zero
addi.d $t0, $sp, 2047(0x7ff)
addi.d $t0, $t0, 77(0x4d)
st.w $t0, $t0, 0
fcsr0 register and restore_fcsr variable are both stored in t0 register.
Change to:
asm volatile ("movfcsr2gr %0, $fcsr0" :"=r"(restore_fcsr));
to avoid restore_fcsr address in t0.
Comparing float value using memcmp because float value cannot be
directly compared for equality.
Put LOAD_REGISTER_FCSR and SAVE_REGISTER_FCC after LOAD_REGISTER_FLOAT.
Some float instructions may change fcsr register.
If the pidfd_spawn/pidfd_spawnp helper process succeeds, but evecve
fails for some reason (either with an invalid/non-existent, memory
allocation, etc.) the resulting pidfd is never closed, nor returned
to caller (so it can call close).
Since the process creation failed, it should be up to posix_spawn to
also, close the file descriptor in this case (similar to what it
does to reap the process).
This patch also changes the waitpid with waitid (P_PIDFD) for pidfd
case, to avoid a possible pid re-use.
Checked on x86_64-linux-gnu.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
The atomic_spin_nop() macro can be used to run arch-specific
code in the body of a spin loop to potentially improve efficiency.
RISC-V's Zihintpause extension includes a PAUSE instruction for
this use-case, which is encoded as a HINT, which means that it
behaves like a NOP on systems that don't implement Zihintpause.
Binutils supports Zihintpause since 2.36, so this patch uses
the ".insn" directive to keep the code compatible with older
toolchains.
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
Reviewed-by: Palmer Dabbelt <palmer@rivosinc.com>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
MIPSr6 has MADDF.s/MADDF.d instructions, which are fused.
In MIPS ISA, double support can be subsetted. Only FMAF is enabled
for this case.
* sysdeps/mips/fpu/math-use-builtins-fma.h
Signed-off-by: YunQiang Su <syq@gcc.gnu.org>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
The upcoming parisc (hppa) v6.11 Linux kernel will include vDSO
support for gettimeofday(), clock_gettime() and clock_gettime64()
syscalls for 32- and 64-bit userspace.
The patch below adds the necessary glue code for glibc.
Signed-off-by: Helge Deller <deller@gmx.de>
Changes in v2:
- add vsyscalls for 64-bit too
The walk benchmarks don't measure anything useful - memory is not initialized
properly so doing a single walk in 32MB just measures reading the 4KB zero
page for reads and clear_page overhead for writes. The memset variants don't
even manage to do a walk in the 32MB region due to using incorrect pointer
increments... Neither is it clear why it is walking backwards since this
won't confuse modern prefetchers. If you fix the benchmark and print the
bandwidth, the results are identical for all sizes larger than ~1KB since it
is just testing memory bandwidth of a single 32MB block. This case is already
tested by the large benchmark, so overall it doesn't seem useful to keep these.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
The previous version expanded $0 and $@ twice.
The new version defines a q no-op shell command. The Perl syntax
error is masked by the eval Perl function. The q { … } construct
is executed by the shell without errors because the q shell function
was defined, but treated as a non-expanding quoted string by Perl,
effectively hiding its context from the Perl interpreter. As before
the script is read by require instead of executed directly, to avoid
infinite recursion because the #! line contains /bin/sh.
Introduce the “fatal” function to produce diagnostics that are not
suppressed by “do”. Use “do” instead of “require” because it has
fewer requirements on the executed script than “require”.
Prefix relative paths with './' because “do” (and “require“ before)
searches for the script in @INC if the path is relative and does not
start with './'. Use $_ to make the trampoline shorter.
Add an Emacs mode marker to indentify the script as a Perl script.
The test is not a run-time check, so update the description.
Also use readelf -W for a more stable output format and fix
an LC_ALL typo.
This avoids garbled configure messages:
checking for s390-specific static PIE requirements (runtime check)... 0x0000000000000017 (JMPREL) 0x280
yes
Based on a -march=x86-64-v4 -mfpmath=sse build, with and without
--disable-multi-arch, running on a Zen 4 CPU. Also used different
-march=x8i6-64-v… settings.
Generation of the Perl script does not depend on Perl, so we can
always install it even if $(PERL) is not set during the build.
Change the malloc/mtrace.pl text substition not to rely on $(PERL).
Instead use PATH at run time to find the Perl interpreter. The Perl
interpreter cannot execute directly a script that starts with
“#! /bin/sh”: it always executes it with /bin/sh. There is no
perl command line switch to disable this behavior. Instead, use
the Perl require function to execute the script. The additional
shift calls remove the “.” shell arguments. Perl interprets the
“.” as a string concatenation operator, making the expression
syntactically valid.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
HWCAP value is overwritten at the first comparison of the LASX case.
The second comparison at LSX get incorrect result.
Change to use t0 to save HWCAP value, and use t1 to save comparison
result.
The _dl_sysdep_parse_arguments function contains initalization
of a large on-stack variable:
dl_parse_auxv_t auxv_values = { 0, };
This uses a non-inline version of memset on powerpc64le-linux-gnu,
so it must use the baseline memset.
A desired hugetlb page size can be encoded in the flags parameter of
system calls such as mmap() and shmget(). The Linux UAPI headers have
included explicit definitions for these encodings since v4.14.
This patch adds these definitions that are used along with MAP_HUGETLB
and SHM_HUGETLB flags as specified in the corresponding man pages. This
relieves programs from having to duplicate and/or compute the encodings
manually.
Additionally, the filter on these definitions in tst-mman-consts.py is
removed, as suggested by Florian. I then ran this tests successfully,
confirming the alignment with the kernel headers.
PASS: misc/tst-mman-consts
original exit status 0
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Tested-by: Florian Weimer <fweimer@redhat.com>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Remove the definitions of HWCAP_IMPORTANT after removal of
LD_HWCAP_MASK / tunable glibc.cpu.hwcap_mask. There HWCAP_IMPORTANT
was used as default value.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Remove the environment variable LD_HWCAP_MASK and the tunable
glibc.cpu.hwcap_mask as those are not used anymore in common-code
after removal in elf/dl-cache.c:search_cache().
The only remaining user is sparc32 where it is used in
elf_machine_matches_host(). If sparc32 does not need it anymore,
we can get rid of it at all. Otherwise we could also move
LD_HWCAP_MASK / tunable glibc.cpu.hwcap_mask to be sparc32 specific.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Remove the definitions of _DL_PLATFORMS_COUNT as those are not used
anymore after removal in elf/dl-cache.c:search_cache().
Note: On x86, we can also get rid of the definitions
HWCAP_PLATFORMS_START and HWCAP_PLATFORMS_COUNT.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Remove the definitions of _DL_FIRST_PLATFORM as those were only used
in the _DL_HWCAP_PLATFORM definitions and in _dl_string_platform().
Both were removed.
Note: Removed on every architecture despite of powerpc, where
_dl_string_platform() is still used.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Remove the definitions of _DL_HWCAP_PLATFORM as those are not used
anymore after removal in elf/dl-cache.c:search_cache().
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Remove the platform strings in dl-procinfo.c where also
the implementation of _dl_string_platform() was removed.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Despite of powerpc where the returned integer is stored in tcb,
and the diagnostics output, there is no user anymore.
Thus this patch removes the diagnostics output and
_dl_string_platform for all other platforms.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The legacy hwcaps mechanism was removed with glibc 2.37:
See this commit series:
- d178c67535
x86_64: Remove platform directory library loading test
- 6099908fb8
elf: Remove legacy hwcaps support from the dynamic loader
- b78ff5a25d
elf: Remove legacy hwcaps support from ldconfig
- 4a7094119c
elf: Remove hwcap parameter from add_to_cache signature
- cfbf883db3
elf: Remove hwcap and bits_hwcap fields from struct cache_entry
- 78d9a1620b
Add NEWS entry for legacy hwcaps removal
- ab40f20364
elf: Remove _dl_string_hwcap
- e76369ed63
elf: Simplify output of hwcap subdirectories in ld.so help
According to Florian Weimer, this was an oversight and should also
have been removed.
As ldconfig does not generate ld.so.cache entries with hwcap/platform
bits in the hwcap-field anymore, this patch now skips those entries.
Thus currently only named-hwcap-entries and the default entries are
allowed.
For named-hwcap entries bit 62 is set and also the isa-level bits can
be set.
For the default entries the hwcap-field is 0.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Both defines are not used anymore. Those were only used for
_dl_string_hwcap(), which itself was removed with commit
ab40f20364
"elf: Remove _dl_string_hwcap"
Just clean up.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
As discussed at the patch review meeting
Signed-off-by: Andreas K. Hüttel <dilfridge@gentoo.org>
Reviewed-by: Simon Chopin <simon.chopin@canonical.com>
C23 adds various <math.h> function families originally defined in TS
18661-4. Add the exp2m1 and exp10m1 functions (exp2(x)-1 and
exp10(x)-1, like expm1).
As with other such functions, these use type-generic templates that
could be replaced with faster and more accurate type-specific
implementations in future. Test inputs are copied from those for
expm1, plus some additions close to the overflow threshold (copied
from exp2 and exp10) and also some near the underflow threshold.
exp2m1 has the unusual property of having an input (M_MAX_EXP) where
whether the function overflows (under IEEE semantics) depends on the
rounding mode. Although these could reasonably be XFAILed in the
testsuite (as we do in some cases for arguments very close to a
function's overflow threshold when an error of a few ulps in the
implementation can result in the implementation not agreeing with an
ideal one on whether overflow takes place - the testsuite isn't smart
enough to handle this automatically), since these functions aren't
required to be correctly rounding, I made the implementation check for
and handle this case specially.
The Makefile ordering expected by lint-makefiles for the new functions
is a bit peculiar, but I implemented it in this patch so that the test
passes; I don't know why log2 also needed moving in one Makefile
variable setting when it didn't in my previous patches, but the
failure showed a different place was expected for that function as
well.
The powerpc64le IFUNC setup seems not to be as self-contained as one
might hope; it shouldn't be necessary to add IFUNCs for new functions
such as these simply to get them building, but without setting up
IFUNCs for the new functions, there were undefined references to
__GI___expm1f128 (that IFUNC machinery results in no such function
being defined, but doesn't stop include/math.h from doing the
redirection resulting in the exp2m1f128 and exp10m1f128
implementations expecting to call it).
Tested for x86_64 and x86, and with build-many-glibcs.py.
C23 adds various <math.h> function families originally defined in TS
18661-4. Add the log10p1 functions (log10(1+x): like log1p, but for
base-10 logarithms).
This is directly analogous to the log2p1 implementation (except that
whereas log2p1 has a smaller underflow range than log1p, log10p1 has a
larger underflow range). The test inputs are copied from those for
log1p and log2p1, plus a few more inputs in that wider underflow
range.
Tested for x86_64 and x86, and with build-many-glibcs.py.
C23 adds various <math.h> function families originally defined in TS
18661-4. Add the logp1 functions (aliases for log1p functions - the
name is intended to be more consistent with the new log2p1 and
log10p1, where clearly it would have been very confusing to name those
functions log21p and log101p). As aliases rather than new functions,
the content of this patch is somewhat different from those actually
adding new functions.
Tests are shared with log1p, so this patch *does* mechanically update
all affected libm-test-ulps files to expect the same errors for both
functions.
The vector versions of log1p on aarch64 and x86_64 are *not* updated
to have logp1 aliases (and thus there are no corresponding header,
tests, abilist or ulps changes for vector functions either). It would
be reasonable for such vector aliases and corresponding changes to
other files to be made separately. For now, the log1p tests instead
avoid testing logp1 in the vector case (a Makefile change is needed to
avoid problems with grep, used in generating the .c files for vector
function tests, matching more than one ALL_RM_TEST line in a file
testing multiple functions with the same inputs, when it assumes that
the .inc file only has a single such line).
Tested for x86_64 and x86, and with build-many-glibcs.py.
POSIX.1-2024 (now official) specifies tm_gmtoff and tm_zone.
This is a good time to update the manual’s “Date and Time”
chapter so I went through it, fixed some outdated
stuff that had been in there for decades, and improved it to match
POSIX.1-2024 better and to clarify some implementation-defined
behavior. Glibc already conforms to POSIX.1-2024 in these matters, so
this is merely a documentation change.
* manual/examples/strftim.c: Use snprintf instead of now-deprecated
function asctime. Check for localtime failure. Simplify by using
puts instead of fputs. Prefer ‘buf, sizeof buf’ to less-obvious
‘buffer, SIZE’.
* manual/examples/timespec_subtract.c: Modernize to use struct
timespec not struct timeval, and rename from timeval_subtract.c.
All uses changed. Check for overflow. Do not check for negative
return value, which ought to be OK since negative time_t is OK.
Use GNU indenting style.
* manual/time.texi:
Document CLOCKS_PER_SEC, TIME_UTC, timespec_get, timespec_getres,
strftime_l.
Document the storage lifetime of tm_zone and of tzname.
Caution against use of tzname, timezone and daylight, saying that
these variables have unspecified values when TZ is geographic.
This is what glibc actually does (contrary to what the manual said
before this patch), and POSIX is planned to say the same thing
<https://austingroupbugs.net/view.php?id=1816>.
Also say that directly accessing the variables is not thread-safe.
Say that localtime_r and ctime_r don’t necessarily set time zone
state. Similarly, in the tzset documentation, say that it is called
by ctime, localtime, mktime, strftime, not that it is called by all
time conversion functions that depend on the time zone.
Say that tm_isdst is useful mostly just for mktime, and that
other uses should prefer tm_gmtoff and tm_zone instead.
Do not say that strftime ignores tm_gmtoff and tm_zone, because
it doesn’t do that.
Document what gmtime does to tm_gmtoff and tm_zone.
Say that the asctime, asctime_r, ctime, and ctime_r are now deprecated
and/or obsolescent, and that behavior is undefined if the year is <
1000 or > 9999. Document strftime before these now-obsolescent
functions, so that readers see the useful function first.
Coin the terms “geographical format” and “proleptic format” for the
two main formats of TZ settings, to simplify exposition. Use this
wording consistently.
Update top-level proleptic syntax to match POSIX.1-2024, which glibc
already implements. Document the angle-bracket quoted forms of time
zone abbreviations in proleptic TZ. Say that time zone abbreviations
can contain only ASCII alphanumerics, ‘+’, and ‘-’.
Document what happens if the proleptic form specifies a DST
abbreviation and offset but omits the rules. POSIX says this is
implementation-defined so we need to document it. Although this
documentation mentions ‘posixrules’ tersely, we need to rethink
‘posixrules’ since I think it stops working after 2038.
Clarify wording about TZ settings beginning with ‘;’.
Say that timegm is in ISO C (as of C23).
Say that POSIX.1-2024 removed gettimeofday.
Say that tm_gmtoff and tm_zone are extensions to ISO C, which is
clearer than saying they are invisible in a struct ISO C enviroment,
and gives us more wiggle room if we want to make them visible in
strict ISO C, something that ISO C allows.
Drop mention of old standards like POSIX.1c and POSIX.2-1992 in the
text when the history is so old that it’s no longer useful in a
general-purpose manual.
Define Coordinated Universal Time (UTC), time zone, time zone ruleset,
and POSIX Epoch, and use these phrases more consistently.
Improve TZ examples to show more variety, and to reflect current
practice and timestamps. Remove obsolete example about Argentina.
Add an example for Ireland.
Don’t rely on GCC extensions when explaining ctime_r.
Do not say that difftime produces the mathematically correct result,
since it might be inexact.
For clock_t don’t say “as in the example above” when there is no
such example, and don’t say that casting to double works “properly
and consistently no matter what”, as it suffers from rounding and
overflow.
Don’t say broken-down time is not useful for calculations; it’s
merely painful.
Say that UTC is not defined before 1960.
Rename Time Zone Functions to Time Zone State. All uses changed.
Update Internet RFC 822 → 5322, 1305 → 5905. Drop specific years of
ISO 8601 as they don’t matter.
Minor style changes: @code{"..."} → @t{"..."} to avoid overquoting in
info files, @code → @env for environment variables, Daylight Saving
Time → daylight saving time, white space → whitespace, prime meridian
→ Prime Meridian.
When we don't want to use non-temporal stores for memset, we set
`x86_memset_non_temporal_threshold` to SIZE_MAX.
The current code, however, we using `maximum_non_temporal_threshold`
as the upper bound which is `SIZE_MAX >> 4` so we ended up with a
value of `0`.
Fix is to just use `SIZE_MAX` as the upper bound for when setting the
tunable.
Tested-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
Linux pinacolada 6.6.32-gentoo #1 SMP PREEMPT Sun Jun 9 14:18:17 CEST 2024 x86_64 Intel(R) Core(TM) i7-6700 CPU @ 3.40GHz GenuineIntel GNU/Linux
32bit build for multilib environment
Signed-off-by: Andreas K. Hüttel <dilfridge@gentoo.org>
"ADDI sp, sp, 24" and "ADDI sp, sp, SZFCSREG" (SZFCSREG = 4) are
misaligning the stack: the ABI mandates a 16-byte alignment. Fix it
by changing the first one to "ADDI sp, sp, 32", and reuse the spare 4th
slot for saving fcsr.
Reported-by: Jinyang He <hejinyang@loongson.cn>
Signed-off-by: Xi Ruoyao <xry111@xry111.site>
This avoids changing _res.options, which inteferes with change
detection as part of automatic reloading of /etc/resolv.conf.
Reviewed-by: DJ Delorie <dj@redhat.com>
Properly set libc_cv_have_x86_isa_level in shell for MINIMUM_X86_ISA_LEVEL
defined as
(__X86_ISA_V1 + __X86_ISA_V2 + __X86_ISA_V3 + __X86_ISA_V4)
Also set __X86_ISA_V2 to 1 for i386 if __GCC_HAVE_SYNC_COMPARE_AND_SWAP_8
is defined. There are no changes in config.h nor in config.make on x86-64.
On i386, -march=x86-64-v2 with GCC generates
#define MINIMUM_X86_ISA_LEVEL 2
in config.h and
have-x86-isa-level = 2
in config.make. This fixes BZ #31883.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
Sort tunables list at the time it's generated. Note: adding new
tunables will cause other tunable IDs to change, but that was
the case before anyway. POSIX does not guarantee the order of "foo
in bar" AWK operators, so the order was indeterminate before anyway.
Even depending on the order to be the same across multiple calls,
such as in this script, is undefined, so sorting the list resolves
that also.
Note that sorting is not dependent on the user's locale.
The __stack_prot is used by Linux to make the stack executable if
a modules requires it. It is also marked as RELRO, which requires
to change the segment permission to RW to update it.
Also, there is no need to keep track of the flags: either the stack
will have the default permission of the ABI or should be change to
PROT_READ | PROT_WRITE | PROT_EXEC. The only additional flag,
PROT_GROWSDOWN or PROT_GROWSUP, is Linux only and can be deducted
from _STACK_GROWS_DOWN/_STACK_GROWS_UP.
Also, the check_consistency function was already removed some time
ago.
Checked on x86_64-linux-gnu and i686-linux-gnu.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
These comments were written in 2003 (added in 2c008571c3), predating
the addition of getdelim(3)/getline(3) in POSIX.1-2008.
Reviewed-by: Sam James <sam@gentoo.org>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
On i386, set the default minimum ISA level to 0, not 1 (baseline which
includes SSE2). There are no changes in config.h nor in config.make on
x86-64. This fixes BZ #31867.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Tested-by: Ian Jordan <immoloism@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
In commit 46b5e98ef6 ("x86: Add seperate non-temporal tunable for
memset") a tunable threshold for enabling non-temporal memset was added,
but only for Intel hardware.
Since that commit, new benchmark results suggest that non-temporal
memset is beneficial on AMD, as well, so allow this tunable to be set
for AMD.
See:
https://docs.google.com/spreadsheets/d/1opzukzvum4n6-RUVHTGddV6RjAEil4P2uMjjQGLbLcU/edit?usp=sharing
which has been updated to include data using different stategies for
large memset on AMD Zen2, Zen3, and Zen4.
Signed-off-by: Joe Damato <jdamato@fastly.com>
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
The manpage says that when the passed size is zero, they should set the
expected size and return 0. ERANGE shall be returned only when the non-zero
passed size is not large enough.
When no translator is set, __file_get_translator would return EINVAL
which is a confusing value. Better check for a passive translation
before getting the value.
The error message in xgetsockname was incorrectly referring to a
different function. This commit fixes that.
Suggested-by: Arjun Shankar <arjun@redhat.com>
Signed-off-by: Avinal Kumar <avinal.xlvii@gmail.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
These are required by the upcoming POSIX standard and are available on
other platforms.
Link: https://austingroupbugs.net/view.php?id=339
Signed-off-by: Mohamed Akram <mohd.akram@outlook.com>
Reviewed-by: Arjun Shankar <arjun@redhat.com>
As of Linux kernel 6.9, some ioctls and a parameters structure have been
introduced which allow user programs to control whether a particular
epoll context will busy poll.
Update the headers to include these for the convenience of user apps.
The ioctls were added in Linux kernel 6.9 commit 18e2bf0edf4dd
("eventpoll: Add epoll ioctl for epoll_params") [1] to
include/uapi/linux/eventpoll.h.
[1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/diff/?h=v6.9&id=18e2bf0edf4dd
Signed-off-by: Joe Damato <jdamato@fastly.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
difftime can signal an inexact conversion when converting to double,
so it should not be marked as pure or nothrow (BZ 31808).
Although we could do something more complicated, in which difftime is
plain on modern platforms but const and nothrow on obsolescent
platforms with 32-bit time_t, it hardly seems worth the trouble.
difftime is used so rarely that it's not worth taking pains to
optimize calls to it on obsolescent platforms.
Reviewed-by: DJ Delorie <dj@redhat.com>
The test aims to ensure that malloc uses the alternate path to
allocate memory when sbrk() or brk() fails.To achieve this,
the test first creates an obstruction at current program break,
tests that obstruction with a failing sbrk(), then checks if malloc
is still returning a valid ptr thus inferring that malloc() used
mmap() instead of brk() or sbrk() to allocate the memory.
Reviewed-by: Arjun Shankar <arjun@redhat.com>
Reviewed-by: Zack Weinberg <zack@owlfolio.org>
Left shift of ki is undefined when ki<0, copy the logic from exp,
which uses unsigned arithmetics, to fix it.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Deallocate the memory for the FILE structure when seeking to the end fails
in append mode.
Fixes: ea33158c96 ("Fix offset caching for streams and use it for ftell (BZ #16680)")
Linux 6.9 adds the ELF note type NT_ARM_FPMR. Add this to glibc's
elf.h, along with the previously missed NT_ARM_SSVE, NT_ARM_ZA and
NT_ARM_ZT (added in older kernel versions).
Tested for x86_64.
This has been confirmed to work around some interposed mallocs. Here
is a discussion of the impact test ust/libc-wrapper/test_libc-wrapper
in lttng-tools:
New TLS usage in libgcc_s.so.1, compatibility impact
<https://inbox.sourceware.org/libc-alpha/8734v1ieke.fsf@oldenburg.str.redhat.com/>
Reportedly, this patch also papers over a similar issue when tcmalloc
2.9.1 is not compiled with -ftls-model=initial-exec. Of course the
goal really should be to compile mallocs with the initial-exec TLS
model, but this commit appears to be a useful interim workaround.
Fixes commit d2123d6827 ("elf: Fix slow
tls access after dlopen [BZ #19924]").
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
The tuning for non-temporal stores for memset vs memcpy is not always
the same. This includes both the exact value and whether non-temporal
stores are profitable at all for a given arch.
This patch add `x86_memset_non_temporal_threshold`. Currently we
disable non-temporal stores for non Intel vendors as the only
benchmarks showing its benefit have been on Intel hardware.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
Previously we use `rep stosb` for all medium/large memsets. This is
notably worse than non-temporal stores for large (above a
few MBs) memsets.
See:
https://docs.google.com/spreadsheets/d/1opzukzvum4n6-RUVHTGddV6RjAEil4P2uMjjQGLbLcU/edit?usp=sharing
For data using different stategies for large memset on ICX and SKX.
Using non-temporal stores can be up to 3x faster on ICX and 2x faster
on SKX. Historically, these numbers would not have been so good
because of the zero-over-zero writeback optimization that `rep stosb`
is able to do. But, the zero-over-zero writeback optimization has been
removed as a potential side-channel attack, so there is no longer any
good reason to only rely on `rep stosb` for large memsets. On the flip
size, non-temporal writes can avoid data in their RFO requests saving
memory bandwidth.
All of the other changes to the file are to re-organize the
code-blocks to maintain "good" alignment given the new code added in
the `L(stosb_local)` case.
The results from running the GLIBC memset benchmarks on TGL-client for
N=20 runs:
Geometric Mean across the suite New / Old EXEX256: 0.979
Geometric Mean across the suite New / Old EXEX512: 0.979
Geometric Mean across the suite New / Old AVX2 : 0.986
Geometric Mean across the suite New / Old SSE2 : 0.979
Most of the cases are essentially unchanged, this is mostly to show
that adding the non-temporal case didn't add any regressions to the
other cases.
The results on the memset-large benchmark suite on TGL-client for N=20
runs:
Geometric Mean across the suite New / Old EXEX256: 0.926
Geometric Mean across the suite New / Old EXEX512: 0.925
Geometric Mean across the suite New / Old AVX2 : 0.928
Geometric Mean across the suite New / Old SSE2 : 0.924
So roughly a 7.5% speedup. This is lower than what we see on servers
(likely because clients typically have faster single-core bandwidth so
saving bandwidth on RFOs is less impactful), but still advantageous.
Full test-suite passes on x86_64 w/ and w/o multiarch.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
This new note type is defined at https://systemd.io/ELF_DLOPEN_METADATA/
and is used to list shared library dependencies loaded via dlopen().
Distro packagers can use this, via tools like those available at
https://github.com/systemd/package-notes to automatically generate
dependencies when building projects that make use of this
specification.
By defining the note id here we can use it in other projects as a
stable identifier, for example in 'readelf' to pretty-print its
content.
Signed-off-by: Luca Boccassi <bluca@debian.org>
Reviewed-by: Arjun Shankar <arjun@redhat.com>
Page was renamed some time ago, there's a redirect but better
to point to the right one
Signed-off-by: Luca Boccassi <bluca@debian.org>
Reviewed-by: Arjun Shankar <arjun@redhat.com>
A space is added before the left bracket of the x86_64 elf_machine_rela
function, in order to harmonize with the rest of the implementation of
the function and to make it easier to retrieve the function. The lines
where the function definition is located has been re-indented, as well
as its left curly bracket placed in the correct position.
Signed-off-by: Xin Wang <yw987194828@gmail.com>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
Because difftime's behavior depends on the floating-point environment,
the function is pure, not const (BZ 31802).
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
pidfd_getpid.c has
/* Ignore invalid large values. */
if (INT_MULTIPLY_WRAPV (10, n, &n)
|| INT_ADD_WRAPV (n, *l++ - '0', &n))
return -1;
For GCC older than GCC 7, INT_ADD_WRAPV(a, b, r) is defined as
_GL_INT_OP_WRAPV (a, b, r, +, _GL_INT_ADD_RANGE_OVERFLOW)
and *l++ - '0' is evaluated twice. Fix BZ #31798 by moving "l++" out of
the if statement. Tested with GCC 6.4 and GCC 14.1.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This reverts commit 30a745450e.
On ppc64le, without <stdio.h>, vfscanf is used and with <stdio.h>
__isoc23_vfscanfieee128 is used. I am reverting this since it doesn't
work on all targets.
Add a test for fscanf of long double without including <stdio.h>.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sunil K Pandey <skpgkp2@gmail.com>
The powerpc32 have an extra versionsort provided by LFS
versionsort64.o. It seems that 5226a81f55
used the wrong check to create the alias for the LFS to non-LFS version.
It should not matter for _DIRENT_MATCHES_DIRENT64 since both symbols
have the same implementation.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
This patch updates the kernel version in the tests tst-mman-consts.py
and tst-mount-consts.py to 6.9. (There are no new constants covered
by these tests in 6.9 that need any other header changes;
tst-pidfd-consts.py was updated separately along with adding new
constants relevant to that test.)
Tested with build-many-glibcs.py.
The libc.a for alpha, s390, and sparcv9 does not provide
copysignf64x, copysignf128, frexpf64x, frexpf128, modff64x, and
modff128.
Checked with a static build for the affected ABIs.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
Both the generic and POWER6 versions provide definitions of the
symbol, which are already provided by the ifunc resolver.
Checked on powerpc-linux-gnu-power4.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
For powerpc64 the generic version provides a weak definition of
strchrnul, which are already provided by the ifunc resolver. The
powerpc32 version is slight different, where for static case there
is no iFUNC support.
The strncasecmp_l is provided ifunc resolver.
Checked on powerpc-linux-gnu-power4 and powerpc64-linux-gnu.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
The generic version provides weak definitions of memchr/strlen,
which are already provided by the ifunc resolvers.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
Linux 6.9 adds some more PIDFD_* constants. Add them to glibc's
sys/pidfd.h, including updating comments that said FLAGS was reserved
and must be 0, along with updating tst-pidfd-consts.py.
Tested with build-many-glibcs.py.
libc.so doesn't use nor export write_profiling functions. There is no
point to define them in libc.so nor in libc.a. Fix BZ #31756 by defining
them only in profile library.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Don't provide __nexttowardf128_do_not_use, nexttowardf128_do_not_use,
finitef128_do_not_use, isinff128_do_not_use and isnanf128_do_not_use.
This fixes BZ #31757.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
scalb is a deprecated interface which was obsolescent in POSIX.1-2001,
removed in POSIX.1-2008, never made to C standard. significant was
originally from BSD and never made in any standard. Fix BZ #31760 by
not providing _FloatN aliases for them.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Some static implementation of float128 routines might call __isnanf128,
which is not provided by the static object.
Checked on x86_64-linux-gnu.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
It basically copy the already in place rules for dynamic tests for
auto-generated math functions for all support types. To avoid the
need to duplicate .inc files, a .SECONDEXPANSION rules is adeed for
the gen-libm-test.py generation.
New tests are added on the new rules 'libm-test-funcs-auto-static',
'libm-test-funcs-noauto-static', and 'libm-test-funcs-narrow-static';
similar to the non-static counterparts.
To avoid add extra build and disk requirement, the new math static
tests are only enable with a new define 'build-math-static-tests'.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
Since Glibc never provides symbol binary compatibility for relocatable
files, fix BZ #31766 by changing _IO_stderr_/_IO_stdin_/_IO_stdout to
compat symbols.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sunil K Pandey <skpgkp2@gmail.com>
There is no _dl_mcount_wrapper prototype in any installed header files.
Fix BZ #31765 by changing _dl_mcount_wrapper to a compat symbol and
obsolete it in glibc 2.40.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sunil K Pandey <skpgkp2@gmail.com>
The commit 08ddd26814 removed the static exp10 on i386 and m68k with an
empty w_exp10.c (required for the ABIs that uses the newly
implementation). This patch fixes by adding the required symbols on the
arch-specific w_exp{f}_compat.c implementation.
Checked on i686-linux-gnu and with a build for m68k-linux-gnu.
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
The commit 16439f419b removed the static fmod/fmodf on i386 and m68k
with and empty w_fmod.c (required for the ABIs that uses the newly
implementation). This patch fixes by adding the required symbols on
the arch-specific w_fmod{f}_compat.c implementation.
To statically build fmod fails on some ABI (alpha, s390, sparc) because
it does not export the ldexpf128, this is also fixed by this patch.
Checked on i686-linux-gnu and with a build for m68k-linux-gnu.
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Tested-by: Aurelien Jarno <aurelien@aurel32.net>
clone3 isn't exported from glibc and is hidden in libc.so. Fix BZ #31770
by removing clone3 alias.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Plus a small amount of moving includes around in order to be able to
remove duplicate definition of asuint64.
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
Rounding intrinsics may not be inlined without
-fno-math-errno. libmvec is free to do what it
likes with errno, so disable it for better
performance.
Tested with no regression on aarch64 and x86_64.
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
_res_opcodes was exported by accident as a variable. Fix BZ #31764 by
making _res_opcodes a compat symbol.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sunil K Pandey <skpgkp2@gmail.com>
Fix BZ #31768 by not defining stpncpy alias when used in IFUNC.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sunil K Pandey <skpgkp2@gmail.com>
Add start and end indicators that identify the test being run in the
verbose output. Better identify the tests for max errors in the
summary output. Count each exception checked for each test. Remove
double counting of tests for the check_<type> functions other than
check_float_internal. Rename print_max_error and
print_complex_max_error to check_max_error and check_complex_max_error
respectively since they have side effects.
Co-Authored-By: Carlos O'Donell <carlos@redhat.com>
Reviewed-By: Joseph Myers <josmyers@redhat.com>
C23 adds various <math.h> function families originally defined in TS
18661-4. Add the log2p1 functions (log2(1+x): like log1p, but for
base-2 logarithms).
This illustrates the intended structure of implementations of all
these function families: define them initially with a type-generic
template implementation. If someone wishes to add type-specific
implementations, it is likely such implementations can be both faster
and more accurate than the type-generic one and can then override it
for types for which they are implemented (adding benchmarks would be
desirable in such cases to demonstrate that a new implementation is
indeed faster).
The test inputs are copied from those for log1p. Note that these
changes make gen-auto-libm-tests depend on MPFR 4.2 (or later).
The bulk of the changes are fairly generic for any such new function.
(sysdeps/powerpc/nofpu/Makefile only needs changing for those
type-generic templates that use fabs.)
Tested for x86_64 and x86, and with build-many-glibcs.py.
Linux 6.9 has no new syscalls. Update the version number in
syscall-names.list to reflect that it is still current for 6.9.
Tested with build-many-glibcs.py.
Fix BZ #31755 by renaming the internal function procutils_read_file to
__libc_procutils_read_file.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Fix BZ #31759 by not defining nearbyint aliases when used in IFUNC.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
Since -r in GCC 6/7/8 doesn't imply -nostdlib -nostartfiles, update the
link-static-libc.out rule to also pass -nostdlib -nostartfiles. This
fixes BZ #31753.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
This supports common coding patterns. The GCC C front end before
version 7 rejects the may_alias attribute on a struct definition
if it was not present in a previous forward declaration, so this
attribute can only be conditionally applied.
This implements the spirit of the change in Austin Group issue 1641.
Suggested-by: Marek Polacek <polacek@redhat.com>
Suggested-by: Jakub Jelinek <jakub@redhat.com>
Reviewed-by: Sam James <sam@gentoo.org>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
This patch fixes BZ #27777 "fclose does a linear search, takes ages when
many FILE* are opened". Simply put, the master list of opened (FILE*),
namely _IO_list_all, is a singly-linked list. As a consequence, the
removal of a single element is in O(N), which cripples the performance
of fclose(). The patch switches to a doubly-linked list, yielding O(1)
removal. The one padding field in struct _IO_FILE, __pad5, is renamed
to _prevchain for a doubly-linked list. Since fields in struct _IO_FILE
after the _lock field are internal to glibc and opaque to applications.
We can change them as long as the size of struct _IO_FILE is unchanged,
which is checked as the part of glibc ABI with sizes of _IO_2_1_stdin_,
_IO_2_1_stdout_ and _IO_2_1_stderr_.
NB: When _IO_vtable_offset (fp) == 0, copy relocation will cover the
whole struct _IO_FILE. Otherwise, only fields up to the _lock field
will be copied to applications at run-time. It is used to check if
the _prevchain field can be safely accessed.
After opening 2 million (FILE*), the fclose() of 100 of them takes quite
a few seconds without the patch, and under 2 seconds with it on a loaded
machine.
No test is added since there are no functional changes.
Co-Authored-By: H.J. Lu <hjl.tools@gmail.com>
Signed-off-by: Alexandre Ferrieux <alexandre.ferrieux@orange.com>
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
This patch ensures that $libc_cv_cc_submachine, which is set from
"--with-cpu", overrides $CFLAGS for configure time tests.
Suggested-by: Peter Bergner <bergner@linux.ibm.com>
Reviewed-by: Peter Bergner <bergner@linux.ibm.com>
Measure duration of 100 fclose calls after opening 1 million FILEs.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
= `Default_Ignorable_Code_Point`s should have width 0 =
Unicode specifies (https://www.unicode.org/faq/unsup_char.html#3) that characters
with the `Default_Ignorable_Code_Point` property
> should be rendered as completely invisible (and non advancing, i.e. “zero width”),
if not explicitly supported in rendering.
Hence, `wcwidth()` should give them all a width of 0, with two exceptions:
- the soft hyphen (U+00AD SOFT HYPHEN) is assigned width 1 by longstanding precedent
- U+115F HANGUL CHOSEONG FILLER needs a carveout
due to the unique behavior of the conjoining Korean jamo characters.
One composed Hangul "syllable block" like 퓛
is made up of two to three individual component characters, or "jamo".
These are all assigned an `East_Asian_Width` of `Wide`
by Unicode, which would normally mean they would all be assigned
width 2 by glibc; a combination of (leading choseong jamo) +
(medial jungseong jamo) + (trailing jongseong jamo) would then have width 2 + 2 + 2 = 6.
However, glibc (and other wcwidth implementations) special-cases jungseong and jongseong,
assigning them all width 0,
to ensure that the complete block has width 2 + 0 + 0 = 2 as it should.
U+115F is meant for use in syllable blocks
that are intentionally missing a leading jamo;
it must be assigned a width of 2 even though it has no visible display
to ensure that the complete block has width 2.
However, `wcwidth()` currently (before this patch)
incorrectly assigns non-zero width to
U+3164 HANGUL FILLER and U+FFA0 HALFWIDTH HANGUL FILLER;
this commit fixes that.
Unicode spec references:
- Hangul: §3.12 https://www.unicode.org/versions/Unicode15.0.0/ch03.pdf#G24646 and
§18.6 https://www.unicode.org/versions/Unicode15.0.0/ch18.pdf#G31028
- `Default_Ignorable_Code_Point`: §5.21 https://www.unicode.org/versions/Unicode15.0.0/ch05.pdf#G40095.
= Non-`Default_Ignorable_Code_Point` format controls should be visible =
The Unicode Standard, §5.21 - Characters Ignored for Display
(https://www.unicode.org/versions/Unicode15.0.0/ch05.pdf#G40095)
says the following:
> A small number of format characters (General_Category = Cf )
> are also not given the Default_Ignorable_Code_Point property.
> This may surprise implementers, who often assume
> that all format characters are generally ignored in fallback display.
> The exact list of these exceptional format characters
> can be found in the Unicode Character Database.
> There are, however, three important sets of such format characters to note:
>
> - prepended concatenation marks
> - interlinear annotation characters
> - Egyptian hieroglyph format controls
>
> The prepended concatenation marks always have a visible display.
> See “Prepended Concatenation Marks” in [*Section 23.2, Layout Controls*](https://www.unicode.org/versions/Unicode15.1.0/ch23.pdf#M9.35858.HeadingBreak.132.Layout.Controls)
> for more discussion of the use and display of these signs.
>
> The other two notable sets of format characters that exceptionally are not ignored
> in fallback display consist of the interlinear annotation characters,
> U+FFF9 INTERLINEAR ANNOTATION ANCHOR through
> U+FFFB INTERLINEAR ANNOTATION TERMINATOR,
> and the Egyptian hieroglyph format controls,
> U+13430 EGYPTIAN HIEROGLYPH VERTICAL JOINER through
> U+1343F EGYPTIAN HIEROGLYPH END WALLED ENCLOSURE.
> These characters should have a visible glyph display for fallback rendering,
> because if they are not displayed,
> it is too easy to misread the resulting displayed text.
> See “Annotation Characters” in [*Section 23.8, Specials*](https://www.unicode.org/versions/Unicode15.1.0/ch23.pdf#M9.21335.Heading.133.Specials),
> as well as [*Section 11.4, Egyptian Hieroglyphs*](https://www.unicode.org/versions/Unicode15.1.0/ch11.pdf#M9.73291.Heading.1418.Egyptian.Hieroglyphs)
> for more discussion of the use and display of these characters.
glibc currently correctly assigns non-zero width to the prepended concatenation marks,
but it incorrectly gives zero width to the interlinear annotation characters
(which a generic terminal cannot interpret)
and the Egyptian hieroglyph format controls
(which are not widely supported in rendering implementations at present).
This commit fixes both these issues as well.
= Derive Hangul syllable type from Unicode data =
Previosuly, the jungseong and jongseong jamo ranges
were hard-coded into the script. With this commit, they are instead parsed
from the HangulSyllableType.txt data file published by Unicode.
This does not affect the end result.
Signed-off-by: Jules Bertholet <julesbertholet@quoi.xyz>
This is mostly based on AArch64 and RISC-V implementation.
Add R_LARCH_TLS_DESC32 and R_LARCH_TLS_DESC64 relocations.
For _dl_tlsdesc_dynamic function slow path, temporarily save and restore
all vector registers.
Allow the libm-test-driver based tests to have their verbosity set based
on the GLIBC_TEST_LIBM_VERBOSE environment variable. This allows the entire
testsuite to be run with a non-default verbosity.
While here check the conversion for the verbose option as well.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Add a DSO (malloc/tst-aligned_alloc-lib.so) that can be used during
testing to interpose malloc with a call that randomly uses either
aligned_alloc, __libc_malloc, or __libc_calloc in the place of malloc.
Use LD_PRELOAD with the DSO to mirror malloc/tst-malloc.c testing as an
example in malloc/tst-malloc-random.c. Add malloc/tst-aligned-alloc-random.c
as another example that does a number of malloc calls with randomly sized,
but limited to 0xffff, requests.
The intention is to be able to utilize existing malloc testing to ensure
that similar allocation APIs are also exposed to the same rigors.
Reviewed-by: DJ Delorie <dj@redhat.com>
Previously many routines used * to load from vector types stored
in the data table. This is emitted as ldr, which byte-swaps the
entire vector register, and causes bugs for big-endian when not
all lanes contain the same value. When a vector is to be used
this way, it has been replaced with an array and the load with an
explicit ld1 intrinsic, which byte-swaps only within lanes.
As well, many routines previously used non-standard GCC syntax
for vector operations such as indexing into vectors types with []
and assembling vectors using {}. This syntax should not be mixed
with ACLE, as the former does not respect endianness whereas the
latter does. Such examples have been replaced with, for instance,
vcombine_* and vgetq_lane* intrinsics. Helpers which only use the
GCC syntax, such as the v_call helpers, do not need changing as
they do not use intrinsics.
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
This test case calls `fopen':
FILE *fp = fopen (temp_file, "r");
however if that fails it reports `fdopen' being the origin of the error.
Adjust the message to say `fopen' then.
On Fedora 40/x86-64, linker enables --enable-new-dtags by default which
generates DT_RUNPATH instead of DT_RPATH. Unlike DT_RPATH, DT_RUNPATH
only applies to DT_NEEDED entries in the executable and doesn't applies
to DT_NEEDED entries in shared libraries which are loaded via DT_NEEDED
entries in the executable. Some glibc tests have libstdc++.so.6 in
DT_NEEDED, which has libm.so.6 in DT_NEEDED. When DT_RUNPATH is generated,
/lib64/libm.so.6 is loaded for such tests. If the newly built glibc is
older than glibc 2.36, these tests fail with
assert/tst-assert-c++: /export/build/gnu/tools-build/glibc-gitlab-release/build-x86_64-linux/libc.so.6: version `GLIBC_2.36' not found (required by /lib64/libm.so.6)
assert/tst-assert-c++: /export/build/gnu/tools-build/glibc-gitlab-release/build-x86_64-linux/libc.so.6: version `GLIBC_ABI_DT_RELR' not found (required by /lib64/libm.so.6)
Pass -Wl,--disable-new-dtags to linker when building glibc tests with
--enable-hardcoded-path-in-tests. This fixes BZ #31719.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
The e68b1151f7 commit changed the
__fesetround_inline_nocheck implementation to use mffscrni
(through __fe_mffscrn) instead of mtfsfi. For generic powerpc
ceil/floor/trunc, the function is supposed to disable the
floating-point inexact exception enable bit, however mffscrni
does not change any exception enable bits.
This patch fixes by reverting the optimization for the
__fesetround_inline_nocheck.
Checked on powerpc-linux-gnu.
Reviewed-by: Paul E. Murphy <murphyp@linux.ibm.com>
This code expects the WCSCAT preprocessor macro to be predefined in case
the evex implementation of the function should be defined with a name
different from __wcsncat_evex. However, when glibc is built for
x86-64-v4 without multiarch support, sysdeps/x86_64/wcsncat.S defines
WCSNCAT variable instead of WCSCAT to build it as wcsncat. Rename the
variable to WCSNCAT, as it is actually a better naming choice for the
variable in this case.
Reported-by: Kenton Groombridge
Link: https://bugs.gentoo.org/921945
Fixes: 64b8b6516b ("x86: Add evex optimized functions for the wchar_t strcpy family")
Signed-off-by: Gabi Falk <gabifalk@gmx.com>
Reviewed-by: Sunil K Pandey <skpgkp2@gmail.com>
Tunable with environment variables aliases are also ignored if
glibc.rtld.enable_secure is enabled. The tunable parsing is also
optimized a bit, where the loop that checks each environment variable
only checks for the tunables with aliases instead of all tables.
Checked on aarch64-linux-gnu and x86_64-linux-gnu.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
And move it to parse_tunables. It avoids a string comparison for
each tunable.
Checked on aarch64-linux-gnu and x86_64-linux-gnu.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
The 680c597e9c commit made loader reject ill-formatted strings by
first tracking all set tunables and then applying them. However, it does
not take into consideration if the same tunable is set multiple times,
where parse_tunables_string appends the found tunable without checking
if it was already in the list. It leads to a stack-based buffer overflow
if the tunable is specified more than the total number of tunables. For
instance:
GLIBC_TUNABLES=glibc.malloc.check=2:... (repeat over the number of
total support for different tunable).
Instead, use the index of the tunable list to get the expected tunable
entry. Since now the initial list is zero-initialized, the compiler
might emit an extra memset and this requires some minor adjustment
on some ports.
Checked on x86_64-linux-gnu and aarch64-linux-gnu.
Reported-by: Yuto Maeda <maeda@cyberdefense.jp>
Reported-by: Yutaro Shimizu <shimizu@cyberdefense.jp>
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Starting from glibc 2.1, crt1.o contains _IO_stdin_used which is checked
by _IO_check_libio to provide binary compatibility for glibc 2.0. Add
crt1-2.0.o for tests against glibc 2.0. Define tests-2.0 for glibc 2.0
compatibility tests. Add and update glibc 2.0 compatibility tests for
stderr, matherr and pthread_kill.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
This patch is based on __strcmp_power10.
Improvements from __strncmp_power9:
1. Uses new POWER10 instructions
- This code uses lxvp to decrease contention on load
by loading 32 bytes per instruction.
2. Performance implication
- This version has around 38% better performance on average.
- Minor performance regression is seen for few small sizes
and specific combination of alignments.
Signed-off-by: Amrita H S <amritahs@linux.ibm.com>
Reviewed-by: Peter Bergner <bergner@linux.ibm.com>
This adds the OpenRISC hard float glibc variant to the build many
script. We update the compiler for glibc to support hard-float
multilibs to allow us to use a single generic compiler for all glibc
variants, this requires updating the compiler name.
Tested and all builds are passing.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This patch adds hardware floating point support to OpenRISC. Hardware
floating point toolchain builds are enabled by passing the machine
specific argument -mhard-float to gcc via CFLAGS. With this enabled GCC
generates floating point instructions for single-precision operations
and exports __or1k_hard_float__.
There are 2 main parts to this patch.
- Implement fenv functions to update the FPCSR flags keeping it in sync
with sfp (software floating point).
- Update machine context functions to store and restore the FPCSR
state.
*On mcontext_t ABI*
This patch adds __fpcsr to mcontext_t. This is an ABI change, but also
an ABI fix. The Linux kernel has always defined padding in mcontext_t
that space was missing from the glibc ABI. In Linux this unused space
has now been re-purposed for storing the FPCSR. This patch brings
OpenRISC glibc in line with the Linux kernel and other libc
implementation (musl).
Compatibility getcontext, setcontext, etc symbols have been added to
allow for binaries expecting the old ABI to continue to work.
*Hard float ABI*
The calling conventions and types do not change with OpenRISC hard-float
so glibc hard-float builds continue to use dynamic linker
/lib/ld-linux-or1k.so.1.
*Testing*
I have tested this patch both with hard-float and soft-float builds and
the test results look fine to me. Results are as follows:
Hard Float
# failures
FAIL: elf/tst-sprof-basic (Haven't figured out yet, not related to hard-float)
FAIL: gmon/tst-gmon-pie (PIE bug in or1k toolchain)
FAIL: gmon/tst-gmon-pie-gprof (PIE bug in or1k toolchain)
FAIL: iconvdata/iconv-test (timeout, passed when run manually)
FAIL: nptl/tst-cond24 (Timeout)
FAIL: nptl/tst-mutex10 (Timeout)
# summary
6 FAIL
4289 PASS
86 UNSUPPORTED
16 XFAIL
2 XPASS
# versions
Toolchain: or1k-smhfpu-linux-gnu
Compiler: gcc version 14.0.1 20240324 (experimental) [master r14-9649-gbb04a11418f] (GCC)
Binutils: GNU assembler version 2.42.0 (or1k-smhfpu-linux-gnu) using BFD version (GNU Binutils) 2.42.0.20240324
Linux: Linux buildroot 6.9.0-rc1-00008-g4dc70e1aadfa #112 SMP Sat Apr 27 06:43:11 BST 2024 openrisc GNU/Linux
Tester: shorne
Glibc: 2024-04-25 b62928f907 Florian Weimer x86: In ld.so, diagnose missing APX support in APX-only builds (origin/master, origin/HEAD)
Soft Float
# failures
FAIL: elf/tst-sprof-basic
FAIL: gmon/tst-gmon-pie
FAIL: gmon/tst-gmon-pie-gprof
FAIL: nptl/tst-cond24
FAIL: nptl/tst-mutex10
# summary
5 FAIL
4295 PASS
81 UNSUPPORTED
16 XFAIL
2 XPASS
# versions
Toolchain: or1k-smh-linux-gnu
Compiler: gcc version 14.0.1 20240324 (experimental) [master r14-9649-gbb04a11418f] (GCC)
Binutils: GNU assembler version 2.42.0 (or1k-smh-linux-gnu) using BFD version (GNU Binutils) 2.42.0.20240324
Linux: Linux buildroot 6.9.0-rc1-00008-g4dc70e1aadfa #112 SMP Sat Apr 27 06:43:11 BST 2024 openrisc GNU/Linux
Tester: shorne
Glibc: 2024-04-25 b62928f907 Florian Weimer x86: In ld.so, diagnose missing APX support in APX-only builds (origin/master, origin/HEAD)
Documentation: https://raw.githubusercontent.com/openrisc/doc/master/openrisc-arch-1.4-rev0.pdf
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This patch adds the ulps test file to prepare for the upcoming
hard float patch. This is separated out to make the hard float patch
smaller.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Using int may give false results for future dates (timeouts after the
year 2028).
Fixes commit 04a21e050d64a1193a6daab872bca2528bda44b ("CVE-2024-33601,
CVE-2024-33602: nscd: netgroup: Use two buffers in addgetnetgrentX
(bug 31680)").
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
This change follows two previous fixes addressing multiple definitions
of __memcpy_chk and __mempcpy_chk functions on i586, and __memmove_chk
and __memset_chk functions on i686. The test is intended to prevent
such issues from occurring in the future.
Signed-off-by: Gabi Falk <gabifalk@gmx.com>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Dmitry V. Levin <ldv@altlinux.org>
Commit c73c96a4a1 updated memcpy.S and
mempcpy.S, but omitted memmove.S and memset.S. As a result, the static
library built as PIC, whether with or without multiarch support,
contains two definitions for each of the __memmove_chk and __memset_chk
symbols.
/usr/lib/gcc/i686-pc-linux-gnu/14/../../../../i686-pc-linux-gnu/bin/ld: /usr/lib/gcc/i686-pc-linux-gnu/14/../../../../lib/libc.a(memset-ia32.o): in function `__memset_chk':
/var/tmp/portage/sys-libs/glibc-2.39-r3/work/glibc-2.39/string/../sysdeps/i386/i686/memset.S:32: multiple definition of `__memset_chk'; /usr/lib/gcc/i686-pc-linux-gnu/14/../../../../lib/libc.a(memset_chk.o):/var/tmp/portage/sys-libs/glibc-2.39-r3/work/glibc-2.39/debug/../sysdeps/i386/i686/multiarch/memset_chk.c:24: first defined here
After this change, regardless of PIC options, the static library, built
for i686 with multiarch contains implementations of these functions
respectively from debug/memmove_chk.c and debug/memset_chk.c, and
without multiarch contains implementations of these functions
respectively from sysdeps/i386/memmove_chk.S and
sysdeps/i386/memset_chk.S. This ensures that memmove and memset won't
pull in __chk_fail and the routines it calls.
Reported-by: Sam James <sam@gentoo.org>
Tested-by: Sam James <sam@gentoo.org>
Fixes: c73c96a4a1 ("i686: Fix build with --disable-multiarch")
Signed-off-by: Gabi Falk <gabifalk@gmx.com>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Dmitry V. Levin <ldv@altlinux.org>
/home/bmg/install/compilers/x86_64-linux-gnu/lib/gcc/x86_64-glibc-linux-gnu/13.2.1/../../../../x86_64-glibc-linux-gnu/bin/ld: /home/bmg/build/glibcs/i586-linux-gnu/glibc/libc.a(memcpy_chk.o): in function `__memcpy_chk':
/home/bmg/src/glibc/debug/../sysdeps/i386/memcpy_chk.S:29: multiple definition of `__memcpy_chk';/home/bmg/build/glibcs/i586-linux-gnu/glibc/libc.a(memcpy.o):/home/bmg/src/glibc/string/../sysdeps/i386/i586/memcpy.S:31: first defined here /home/bmg/install/compilers/x86_64-linux-gnu/lib/gcc/x86_64-glibc-linux-gnu/13.2.1/../../../../x86_64-glibc-linux-gnu/bin/ld: /home/bmg/build/glibcs/i586-linux-gnu/glibc/libc.a(mempcpy_chk.o): in function `__mempcpy_chk': /home/bmg/src/glibc/debug/../sysdeps/i386/mempcpy_chk.S:28: multiple definition of `__mempcpy_chk'; /home/bmg/build/glibcs/i586-linux-gnu/glibc/libc.a(mempcpy.o):/home/bmg/src/glibc/string/../sysdeps/i386/i586/memcpy.S:31: first defined here
After this change, the static library built for i586, regardless of PIC
options, contains implementations of these functions respectively from
sysdeps/i386/memcpy_chk.S and sysdeps/i386/mempcpy_chk.S. This ensures
that memcpy and mempcpy won't pull in __chk_fail and the routines it
calls.
Reported-by: Florian Weimer <fweimer@redhat.com>
Signed-off-by: Gabi Falk <gabifalk@gmx.com>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Dmitry V. Levin <ldv@altlinux.org>
The FSF's Licensing and Compliance Lab noted a discrepancy in the
licensing of several files in the glibc package.
When timespect_get.c was impelemented the license did not include
the standard ", or (at your option) any later version." text.
Change the license in timespec_get.c and all copied files to match
the expected license.
This change was previously approved in principle by the FSF in
RT ticket #1316403. And a similar instance was fixed in
commit 46703efa02.
While AT_NO_AUTOMOUNT is similar in function to the Hurd's O_NOTRANS,
there are significant enough differences in semantics:
1. AT_NO_AUTOMOUNT has no effect on already established mounts,
whereas O_NOTRANS causes the lookup to ignore both passive and active
translators. A better approximation of the AT_NO_AUTOMOUNT behavior
would be to honor active translators, but avoid starting passive
ones; like what the file_name_lookup_carefully () routine from
sutils/clookup.c in the Hurd source tree does.
2. On GNU/Hurd, translators are used much more pervasively than mounts
on "traditional" Unix systems: among other things, translators
underlie features like symlinks, device nodes, and sockets. And while
on a "traditional" Unix system, the mountpoint and the root of the
mounted tree may look similar enough for many purposes (they're both
directories, for one thing), the Hurd allows for any combination of
the two node types, and indeed it is common to have e.g. a device
node "mounted" on top of a regular file node on the underlying
filesystem. Ignoring the translator and stat'ing the underlying node
is therefore likely to return very different results from what you'd
get if you stat the translator's root node.
In practice, mapping AT_NO_AUTOMOUNT to O_NOTRANS was breaking GNU
Coreutils, including stat(1) and ls(1):
$ stat /dev/hd0s1
File: /dev/hd0s1
Size: 0 Blocks: 8 IO Block: 8192 regular empty file
Device: 0,8 Inode: 32866 Links: 1
This was also breaking GNOME's glib, where a g_local_file_stat () call
that is supposed to stat () a file through a symlink uses
AT_NO_AUTOMOUNT, which gets mapped to O_NOTRANS, which then causes the
stat () call to stat symlink itself like lstat () would, rather then the
file it points to, which is what the logic expects to happen.
This reverts most of 13710e7e6a
"hurd: Add support for AT_NO_AUTOMOUNT".
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
This reverts commit 84e93afc7 ("Switch to UTF-8 for INSTALL") and
reinstates commit c14f2e4aa ("Make sure INSTALL is ASCII plaintext")
and regenerates INSTALL.
It turns out that different versions of makeinfo (texinfo/texi2any),
at least versions 7.0.3 and 7.1, put unicode quote glyphs in different
places (specifically whether contractions like you'd, don't, aren't or
you'll use ’ or '). This breaks the make dist target as used for
(snapshot) releases, which have a check on the regenerated INSTALL
file. Using --disable-encoding generates the same plaintext ASCII on
all versions.
An alternative would be to regenerate INSTALL with texinfo 7.1 and
require at least that version. But that seems too soon while various
distros don't have 7.1 yet. We can try again to use UTF-8 for INSTALL
in a couple of years.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
At this point, this is mainly a tool for testing the early ld.so
CPU compatibility diagnostics: GCC uses the new instructions in most
functions, so it's easy to spot if some of the early code is not
built correctly.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
Also compile dl-misc.os with $(rtld-early-cflags) to avoid
Program received signal SIGILL, Illegal instruction.
0x00007ffff7fd36ea in _dl_strtoul (nptr=nptr@entry=0x7fffffffe2c9 "2",
endptr=endptr@entry=0x7fffffffd728) at dl-misc.c:156
156 bool positive = true;
(gdb) bt
#0 0x00007ffff7fd36ea in _dl_strtoul (nptr=nptr@entry=0x7fffffffe2c9 "2",
endptr=endptr@entry=0x7fffffffd728) at dl-misc.c:156
#1 0x00007ffff7fdb1a9 in tunable_initialize (
cur=cur@entry=0x7ffff7ffbc00 <tunable_list+2176>,
strval=strval@entry=0x7fffffffe2c9 "2", len=len@entry=1)
at dl-tunables.c:131
#2 0x00007ffff7fdb3a2 in parse_tunables (valstring=<optimized out>)
at dl-tunables.c:258
#3 0x00007ffff7fdb5d9 in __GI___tunables_init (envp=0x7fffffffdd58)
at dl-tunables.c:288
#4 0x00007ffff7fe44c3 in _dl_sysdep_start (
start_argptr=start_argptr@entry=0x7fffffffdcb0,
dl_main=dl_main@entry=0x7ffff7fe5f80 <dl_main>)
at ../sysdeps/unix/sysv/linux/dl-sysdep.c:110
#5 0x00007ffff7fe5cae in _dl_start_final (arg=0x7fffffffdcb0) at rtld.c:494
#6 _dl_start (arg=0x7fffffffdcb0) at rtld.c:581
#7 0x00007ffff7fe4b38 in _start ()
(gdb)
when setting GLIBC_TUNABLES in glibc compiled with APX.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
This avoids potential memory corruption when the underlying NSS
callback function does not use the buffer space to store all strings
(e.g., for constant strings).
Instead of custom buffer management, two scratch buffers are used.
This increases stack usage somewhat.
Scratch buffer allocation failure is handled by return -1
(an invalid timeout value) instead of terminating the process.
This fixes bug 31679.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
The addgetnetgrentX call in addinnetgrX may have failed to produce
a result, so the result variable in addinnetgrX can be NULL.
Use db->negtimeout as the fallback value if there is no result data;
the timeout is also overwritten below.
Also avoid sending a second not-found response. (The client
disconnects after receiving the first response, so the data stream did
not go out of sync even without this fix.) It is still beneficial to
add the negative response to the mapping, so that the client can get
it from there in the future, instead of going through the socket.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
If we failed to add a not-found response to the cache, the dataset
point can be null, resulting in a null pointer dereference.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Add another difficult needle to strstr that clearly shows the quadratic
complexity of bruteforce algorithms.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Define MINIMUM_X86_ISA_LEVEL at configure time to avoid
/usr/bin/ld: …/build/elf/librtld.os: in function `init_cpu_features':
…/git/elf/../sysdeps/x86/cpu-features.c:1202: undefined reference to `_dl_runtime_resolve_fxsave'
/usr/bin/ld: …/build/elf/librtld.os: relocation R_X86_64_PC32 against undefined hidden symbol `_dl_runtime_resolve_fxsave' can not be used when making a shared object
/usr/bin/ld: final link failed: bad value
collect2: error: ld returned 1 exit status
when glibc is built with -march=x86-64-v3 and configured with
--with-rtld-early-cflags=-march=x86-64, which is used to allow ld.so to
print an error message on unsupported CPUs:
Fatal glibc error: CPU does not support x86-64-v3
This fixes BZ #31676.
Reviewed-by: Sunil K Pandey <skpgkp2@gmail.com>
The current IFUNC selection is always using the most recent
features which are available via AT_HWCAP. But in
some scenarios it is useful to adjust this selection.
The environment variable:
GLIBC_TUNABLES=glibc.cpu.hwcaps=-xxx,yyy,zzz,....
can be used to enable HWCAP feature yyy, disable HWCAP feature xxx,
where the feature name is case-sensitive and has to match the ones
used in sysdeps/loongarch/cpu-tunables.c.
Signed-off-by: caiyinyu <caiyinyu@loongson.cn>
Fall back to ppoll if ppoll_time64 fails with ENOSYS.
Fixes commit 370da8a121 ("nptl: Fix
tst-cancel30 on sparc64").
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Delay setting file->decided until the data has been successfully loaded
by _nl_load_locale(). If the function fails to load the data then we
must return and error and leave decided untouched to allow the caller to
attempt to load the data again at a later time. We should not set
decided to 1 early in the function since doing so may prevent attempting
to load it again. We want to try loading it again because that allows an
open to fail and set errno correctly.
On the other side of this problem is that if we are called again with
the same inputs we will fetch the cached version of the object and carry
out no open syscalls and that fails to set errno so we must set errno to
ENOENT in that case. There is a second code path that has to be handled
where the name of the locale matches but the codeset doesn't match.
These changes ensure that errno is correctly set on failure in all the
return paths in _nl_find_locale().
Adds tst-locale-loadlocale to cover the bug.
No regressions on x86_64.
Co-authored-by: Jeff Law <law@redhat.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
On some architectures and depending on the page size, the loader can
also allocate some memory during dependencies loading and it will be
marked as 'loader malloc'. However, if the system page size is
large enough, the initial data page will be enough for all required
allocation and there will be no extra loader mmap. To avoid false
negatives, the test does not check for such pages.
Checked on powerpc64le-linux-gnu with 64k pagesize.
Reviewed-by: Simon Chopin <simon.chopin@canonical.com>
Until GCC removes Nios II support (at which point we should do so as
well), this is now needed for GCC 14 / mainline to build for
nios2-linux-gnu target.
Tested with build-many-glibcs.py (GCC mainline) for nios2-linux-gnu.
These fields store timestamps when the system was running. No Linux
systems existed before 1970, so these values are unused. Switching
to unsigned types allows continued use of the existing struct layouts
beyond the year 2038.
The intent is to give distributions more time to switch to improved
interfaces that also avoid locking/data corruption issues.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
These structs describe file formats under /var/log, and should not
depend on the definition of _TIME_BITS. This is achieved by
defining __WORDSIZE_TIME64_COMPAT32 to 1 on 32-bit ports that
support 32-bit time_t values (where __time_t is 32 bits).
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The default <utmp-size.h> is for ports with a 64-bit time_t.
Ports with a 32-bit time_t or with __WORDSIZE_TIME64_COMPAT32=1
need to override it.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Add a simple benchmark to measure the overhead of internal libc locks in
the random() implementation on both single- and multi-threaded cases.
This relies on the implementation of random using internal locks to
access shared global data, and that the runtime uses multi-threaded
locking once a thread has been created (even after it finishes).
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
ISO-2022-CN-EXT uses escape sequences to indicate character set changes
(as specified by RFC 1922). While the SOdesignation has the expected
bounds checks, neither SS2designation nor SS3designation have its;
allowing a write overflow of 1, 2, or 3 bytes with fixed values:
'$+I', '$+J', '$+K', '$+L', '$+M', or '$*H'.
Checked on aarch64-linux-gnu.
Co-authored-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
When using the glibc.rtld.enable_secure tunable we need to keep track of
the count of environment variables we skip due to __libc_enable_secure
being set and adjust the auxv section of the stack. This fixes an
assertion when running ld.so directly with glibc.rtld.enable_secure set.
Add a testcase that ensures the assert is not hit.
elf/rtld.c:1324 assert (auxv == sp + 1);
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This seems to have stopped working with some GCC 14 versions,
which clobber r2. With other compilers, the kernel-provided
r2 value is still available at this point.
Reviewed-by: Peter Bergner <bergner@linux.ibm.com>
This reverts commit a1735e0aa8.
The test failure is a real valgrind bug that needs to be fixed before
valgrind is usable with a glibc that has been built with
CC="gcc -march=x86-64-v3". The proposed valgrind patch teaches
valgrind to replace ld.so strcmp with an unoptimized scalar
implementation, thus avoiding any AVX2-related problems.
Valgrind bug: <https://bugs.kde.org/show_bug.cgi?id=485487>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
It uses the same two-way algorithm used on strstr, strcasestr, and
memmem. Different than strstr, neither the "shift table" optimization
nor the self-adapting filtering check is used because it would result in
a too-large shift table (and it also simplifies the implementation bit).
Checked on x86_64-linux-gnu and aarch64-linux-gnu.
Reviewed-by: DJ Delorie <dj@redhat.com>
Parametrize test-strstr.c so it can be used to check wcsstr.
Checked on x86_64-linux-gnu and aarch64-linux-gnu.
Reviewed-by: DJ Delorie <dj@redhat.com>
The gnulib version contains an important change (9ce573cde), which
fixes some problems with multithreading, entropy loss, and ASLR leak
nfo. It also fixes an issue where getrandom is not being used
on some new files generation (only for __GT_NOCREATE on first try).
The 044bf893ac removed __path_search, which is now moved to another
gnulib shared files (stdio-common/tmpdir.{c,h}). Tthis patch
also fixes direxists to use __stat64_time64 instead of __xstat64,
and move the include of pathmax.h for !_LIBC (since it is not used
by glibc). The license is also changed from GPL 3.0 to 2.1, with
permission from the authors (Bruno Haible and Paul Eggert).
The sync also removed the clock fallback, since clock_gettime
with CLOCK_REALTIME is expected to always succeed.
It syncs with gnulib commit 323834962817af7b115187e8c9a833437f8d20ec.
Checked on x86_64-linux-gnu.
Co-authored-by: Bruno Haible <bruno@clisp.org>
Co-authored-by: Paul Eggert <eggert@cs.ucla.edu>
Reviewed-by: Bruno Haible <bruno@clisp.org>
This commit adds a simple bind/accept/connect test for an IPv4 TCP
connection to a local process via the loopback interface.
Reviewed-by: Arjun Shankar <arjun@redhat.com>
None of the existing tests seem to cover the case where
_dl_signal_error is called without an active error handler.
The new elf/tst-rtld-does-not-exist test triggers such a
_dl_signal_error call from _dl_map_object.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
_dl_signal_error may be called with objname == NULL. _dl_exception_create
checks objname == NULL. But fatal_error doesn't. Check objname before
calling fatal_error. This fixes BZ #31596.
Reviewed-by: Sunil K Pandey <skpgkp2@gmail.com>
When static PIE is enabled by default, we shouldn't use crtbeginS.o and
crtendS.o for non-PIE static executables. Check $($(@F)-no-pie) to use
crtbeginT.o and crtend.o to create non-PIE static executables.
Reviewed-by: Sunil K Pandey <skpgkp2@gmail.com>
This prints some information from struct cpu_features, and the midr_el1
and dczid_el0 system register contents on every CPU.
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
This is surprisingly difficult to implement if the goal is to produce
reasonably sized output. With the current approaches to output
compression (suppressing zeros and repeated results between CPUs,
folding ranges of identical subleaves, dealing with the %ecx
reflection issue), the output is less than 600 KiB even for systems
with 256 logical CPUs.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
Sync tzselect, zdump, zic to TZDB 2024a.
This patch incorporates the following TZDB source code changes,
listed roughly in descending order of importance.
zic now supports links to links, needed for future tzdata
zic now defaults to '-b slim'
zic now updates output files atomically
zic has new options -R, -l -, -p -
zic -r now uses -00 for unspecified timestamps
zdump now uses [lo,hi) for both -c and -t
Fix several integer overflow bugs
zic now checks input bytes more carefully
Simplify and fix new TZDIR setup
Default time_t to 64 bits on glibc 2.34+ 32-bit
zic now generates TZ strings that conform to POSIX when all-year DST
zic -v now shows extreme-int tm_year transitions
Fix zic bug in last time type of Asia/Gaza etc.
Fix zic bug with Palestine after 2075
Fix bug uncovered by recent change to Iran history
Fix 'zic -b fat' bug with Port Moresby 32-bit data
Fix zic bug with -r @X where X is deduced from TZ
Fix bug with zic -r cutoff before 1st transition
Fix leap second expiry and truncation
Fix zic bug on Linux 2.6.16 and 2.6.17
Fix bug with 'zic -d /a/b/c' if /a is unwriteable
Don't mistruncate TZif files at leap seconds
Fix zdump undefined behavior if !USE_LTZ
zdump -v reports localtime+gmtime failures better
Fix zdump diagnostic for missing timezone
Don't assume nonempty argv
Port better to C23
Do not assume negative >> behavior
I18nize zdump a bit better
Port zdump to right_only installations
New tzselect menu option 'now'
tzselect can now use current time to help choose
Improve tzselect behavior for Turkey etc.
tzselect: do not create temporary files
tzselect: work around mawk bug with {2,}
tzselect: Port to POSIX awk, which prohibits -v newlines
Do not use empty RE in tzselect
Don't set TZ in tzselect
Avoid sed, expr in tzselect
tzselect: Fix problems with spaces in TZDIR
Improve tzselect diagnostics
Remove zic workaround for Qt bug 53071
Remove zic support for "min" in Rule lines
Remove zic support for zic -y, Rule TYPEs, pacificnew
Remove tzselect workaround for Bash 1.14.7 bug
* SHARED-FILES: Update to match current sync.
* config.h.in (HAVE_STRERROR): Remove; no longer needed.
* timezone/Makefile ($(objpfx)zic.o): Depend on tzdir.h.
($(objpfx)tzdir.h): New rule to build a placeholder.
* timezone/private.h, timezone/tzfile.h, timezone/version:
* timezone/zdump.c, timezone/zic.c: Copy verbatim from TZDB 2024a.
* manual/search.texi (Array Search Function):
Correct the statement about lfind’s mean runtime:
it is proportional to a number (not that number),
and this is true only if random elements are searched for.
Relax the constraint on bsearch’s array argument:
POSIX says it need not be sorted, only partially sorted.
Say that the first arg passed to bsearch’s comparison function
is the key, and the second arg is an array element, as
POSIX requires. For bsearch and qsort, say that the
comparison function should not alter the array, as POSIX
requires. For qsort, say that the comparison function
must define a total order, as POSIX requires, that
it should not depend on element addresses, that
the original array index can be used for stable sorts,
and that if qsort still works if memory allocation fails.
Be more consistent in calling the array elements
“elements” rather than “objects”.
Co-authored-by: Zack Weinberg <zack@owlfolio.org>
When -mapxf is used to build glibc, the resulting glibc will never run
on FMA4 machines. Exclude FMA4 IFUNC functions when -mapxf is used.
This requires GCC which defines __APX_F__ for -mapxf with commit:
1df56719bd8 x86: Define __APX_F__ for -mapxf
Reviewed-by: Sunil K Pandey <skpgkp2@gmail.com>
The a4ed0471d7 removed the generic version which is included by
features.h and used by Hurd.
Checked by building i686-gnu and x86_64-gnu with build-many-glibc.py.
The implementations of trunc functions using x87 floating point (i386 and
x86_64 long double only) traps when FE_INEXACT is enabled. Although
this is a GNU extension outside the scope of the C standard, other
architectures that also support traps do not show this behavior.
The fix moves the implementation to a common one that holds any
exceptions with a 'fnclex' (libc_feholdexcept_setround_387).
Checked on x86_64-linux-gnu and i686-linux-gnu.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
The implementations of floor functions using x87 floating point (i386 and
86_64 long double only) traps when FE_INEXACT is enabled. Although
this is a GNU extension outside the scope of the C standard, other
architectures that also support traps do not show this behavior.
The fix moves the implementation to a common one that holds any
exceptions with a 'fnclex' (libc_feholdexcept_setround_387).
Checked on x86_64-linux-gnu and i686-linux-gnu.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
The implementations of ceil functions using x87 floating point (i386 and
x86_64 long double only) traps when FE_INEXACT is enabled. Although
this is a GNU extension outside the scope of the C standard, other
architectures that also support traps do not show this behavior.
The fix moves the implementation to a common one that holds any
exceptions with a 'fnclex' (libc_feholdexcept_setround_387).
Checked on x86_64-linux-gnu and i686-linux-gnu.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
In Linux 6.9 a new flag is added to allow for Per-io operations to
disable append mode even if a file was opened with the flag O_APPEND.
This is done with the new RWF_NOAPPEND flag.
This caused two test failures as these tests expected the flag 0x00000020
to be unused. Adding the flag definition now fixes these tests on Linux
6.9 (v6.9-rc1).
FAIL: misc/tst-preadvwritev2
FAIL: misc/tst-preadvwritev64v2
This patch adds the flag, adjusts the test and adds details to
documentation.
Link: https://lore.kernel.org/all/20200831153207.GO3265@brightrain.aerifal.cx/
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The ifunc variants now uses the powerpc implementation which in turn
uses the compiler builtin. Without the proper -mcpu switch the builtin
does not generate the expected optimization.
Checked on powerpc-linux-gnu.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Peter Bergner <bergner@linux.ibm.com>
Reflow all long lines adding comment terminators.
Sort all reflowed text using scripts/sort-makefile-lines.py.
No code generation changes observed in binary artifacts.
No regressions on x86_64 and i686.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
It was raised on libc-help [1] that some Linux kernel interfaces expect
the libc to define __USE_TIME_BITS64 to indicate the time_t size for the
kABI. Different than defined by the initial y2038 design document [2],
the __USE_TIME_BITS64 is only defined for ABIs that support more than
one time_t size (by defining the _TIME_BITS for each module).
The 64 bit time_t redirects are now enabled using a different internal
define (__USE_TIME64_REDIRECTS). There is no expected change in semantic
or code generation.
Checked on x86_64-linux-gnu, i686-linux-gnu, aarch64-linux-gnu, and
arm-linux-gnueabi
[1] https://sourceware.org/pipermail/libc-help/2024-January/006557.html
[2] https://sourceware.org/glibc/wiki/Y2038ProofnessDesign
Reviewed-by: DJ Delorie <dj@redhat.com>
Use same strategy as bench-strstr.c (93eebae516 and 80b2bfb535)
and use json_ctx for output to help standardize format across all
benchtests.
Reviewed-by: Arjun Shankar <arjun@redhat.com>
As indicated in a recent thread, this it is a simple brute-force
algorithm that checks the whole needle at a matching character pair
(and does so 1 byte at a time after the first 64 bytes of a needle).
Also it never skips ahead and thus can match at every haystack
position after trying to match all of the needle, which generic
implementation avoids.
As indicated by Wilco, a 4x larger needle and 16x larger haystack gives
a clear 65x slowdown both basic_strstr and __strstr_avx512:
"ifuncs": ["basic_strstr", "twoway_strstr", "__strstr_avx512",
"__strstr_sse2_unaligned", "__strstr_generic"],
{
"len_haystack": 65536,
"len_needle": 1024,
"align_haystack": 0,
"align_needle": 0,
"fail": 1,
"desc": "Difficult bruteforce needle",
"timings": [4.0948e+07, 15094.5, 3.20818e+07, 108558, 10839.2]
},
{
"len_haystack": 1048576,
"len_needle": 4096,
"align_haystack": 0,
"align_needle": 0,
"fail": 1,
"desc": "Difficult bruteforce needle",
"timings": [2.69767e+09, 100797, 2.08535e+09, 495706, 82666.9]
}
PS: I don't have an AVX512 capable machine to verify this issues, but
skimming through the code it does seems to follow what Wilco has
described.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
The value of l_scope is only valid post relocation, so this original
check was triggering undefined behavior. Instead just directly check to
see if the object has been relocated, at which point using l_scope is
safe.
Reported-by: Andreas Schwab <schwab@suse.de>
Closes: BZ #31317
Fixes: e0590f41fe ("RISC-V: Enable static-pie.")
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Previously, HTL would always allocate non-executable stacks. This has
never been noticed, since GNU Mach on x86 ignores VM_PROT_EXECUTE and
makes all pages implicitly executable. Since GNU Mach on AArch64
supports non-executable pages, HTL forgetting to pass VM_PROT_EXECUTE
immediately breaks any code that (unfortunately, still) relies on
executable stacks.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-ID: <20240323173301.151066-7-bugaevc@gmail.com>
While we could support it on any architecture, the tunable is currently
only defined on x86_64.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-ID: <20240323173301.151066-5-bugaevc@gmail.com>
We would like to avoid statically defining any specific page size on
aarch64-gnu, and instead make sure that everything uses the dynamic
page size, available via vm_page_size and GLRO(dl_pagesize).
There are currently a few places in glibc that require EXEC_PAGESIZE
to be defined. Per Roland's suggestion [0], drop the static
GLRO(dl_pagesize) initializers (for now, only if EXEC_PAGESIZE is not
defined), and don't require EXEC_PAGESIZE definition for libio to
enable mmap usage.
[0]: https://mail.gnu.org/archive/html/bug-hurd/2011-10/msg00035.html
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-ID: <20240323173301.151066-4-bugaevc@gmail.com>
We'd like to avoid committing to a specific size of virtual address
space (i.e. the value of VM_AARCH64_T0SZ) on AArch64. While the current
version of GNU Mach still exports VM_MAX_ADDRESS for compatibility, we
should try to avoid relying on it when we can. This piece of logic in
_hurdsig_getenv () doesn't actually care about the size of user-
accessible virtual address space, it just wants to preempt faults on any
addresses starting from the value of the P pointer and above. So, use
(unsigned long int) -1 instead of VM_MAX_ADDRESS.
While at it, change the casts to (unsigned long int) and not just
(long int), since the type of struct hurd_signal_preemptor.{first,last}
is unsigned long int.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-ID: <20240323173301.151066-3-bugaevc@gmail.com>
Move _hurd_self_sigstate (), _hurd_critical_section_lock (), and
_hurd_critical_section_unlock () inline implementations (that were
already guarded by #if defined _LIBC) to the internal version of the
header. While at it, add <tls.h> to the includes, and use
__LIBC_NO_TLS () unconditionally.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-ID: <20240323173301.151066-2-bugaevc@gmail.com>
The log incorrectly prints, setcontext failed. Update this to indicate
that actually swapcontext failed.
Signed-off-by: Stafford Horne <shorne@gmail.com>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
On OpenRISC variadic functions and regular functions have different
calling conventions so this wrapper is needed to translate. This
wrapper is copied from x86_64/x32. I don't know the build system enough
to find a cleaner way to share the code between x86_64/x32 and or1k
(maybe Implies?), so I went with the straight copy.
This fixes test failures:
misc/tst-prctl
nptl/tst-setgetname
This test failure:
math/test-fenv
If rounding mode and exception macros are defined then the fenv tests
run and always fail. This patch adds an ifdef using the
__or1k_hard_float__ macro provided by gcc to avoid defining these fenv
macros when they cnnot be used. This is similar to what is done in csky.
Note, I will post the or1k hard-float support soon. So, I prefer to
leave the hard-float bits here for now.
Old Linux kernels disable SVE after every system call. Calling the
SVE-optimized memcpy afterwards will then cause a trap to reenable SVE.
As a result, applications with a high use of syscalls may run slower with
the SVE memcpy. This is true for kernels between 4.15.0 and before 6.2.0,
except for 5.14.0 which was patched. Avoid this by checking the kernel
version and selecting the SVE ifunc on modern kernels.
Parse the kernel version reported by uname() into a 24-bit kernel.major.minor
value without calling any library functions. If uname() is not supported or
if the version format is not recognized, assume the kernel is modern.
Tested-by: Florian Weimer <fweimer@redhat.com>
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
The following three changes have been added to provide initial Power11 support.
1. Add the directories to hold Power11 files.
2. Add support to select Power11 libraries based on AT_PLATFORM.
3. Let submachine=power11 be set automatically.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Reviewed-by: Peter Bergner <bergner@linux.ibm.com>
This patch adds a new feature for powerpc. In order to get faster
access to the HWCAP3/HWCAP4 masks, similar to HWCAP/HWCAP2 (i.e. for
implementing __builtin_cpu_supports() in GCC) without the overhead of
reading them from the auxiliary vector, we now reserve space for them
in the TCB.
This is an ABI change for GLIBC 2.39.
Suggested-by: Peter Bergner <bergner@linux.ibm.com>
Reviewed-by: Peter Bergner <bergner@linux.ibm.com>
The aarch64 uses 'trad' for traditional tls and 'desc' for tls
descriptors, but unlike other targets it defaults to 'desc'. The
gnutls2 configure check does not set aarch64 as an ABI that uses
TLS descriptors, which then disable somes stests.
Also rename the internal machinery fron gnu2 to tls descriptors.
Checked on aarch64-linux-gnu.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
ARM _dl_tlsdesc_dynamic slow path has two issues:
* The ip/r12 is defined by AAPCS as a scratch register, and gcc is
used to save the stack pointer before on some function calls. So it
should also be saved/restored as well. It fixes the tst-gnu2-tls2.
* None of the possible VFP registers are saved/restored. ARM has the
additional complexity to have different VFP bank sizes (depending of
VFP support by the chip).
The tst-gnu2-tls2 test is extended to check for VFP registers, although
only for hardfp builds. Different than setcontext, _dl_tlsdesc_dynamic
does not have HWCAP_ARM_IWMMXT (I don't have a way to properly test
it and it is almost a decade since newer hardware was released).
With this patch there is no need to mark tst-gnu2-tls2 as XFAIL.
Checked on arm-linux-gnueabihf.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
_dl_tlsdesc_dynamic preserves RDI, RSI and RBX before realigning stack.
After realigning stack, it saves RCX, RDX, R8, R9, R10 and R11. Define
TLSDESC_CALL_REGISTER_SAVE_AREA to allocate space for RDI, RSI and RBX
to avoid clobbering saved RDI, RSI and RBX values on stack by xsave to
STATE_SAVE_OFFSET(%rsp).
+==================+<- stack frame start aligned at 8 or 16 bytes
| |<- RDI saved in the red zone
| |<- RSI saved in the red zone
| |<- RBX saved in the red zone
| |<- paddings for stack realignment of 64 bytes
|------------------|<- xsave buffer end aligned at 64 bytes
| |<-
| |<-
| |<-
|------------------|<- xsave buffer start at STATE_SAVE_OFFSET(%rsp)
| |<- 8-byte padding for 64-byte alignment
| |<- 8-byte padding for 64-byte alignment
| |<- R11
| |<- R10
| |<- R9
| |<- R8
| |<- RDX
| |<- RCX
+==================+<- RSP aligned at 64 bytes
Define TLSDESC_CALL_REGISTER_SAVE_AREA, the total register save area size
for all integer registers by adding 24 to STATE_SAVE_OFFSET since RDI, RSI
and RBX are saved onto stack without adjusting stack pointer first, using
the red-zone. This fixes BZ #31501.
Reviewed-by: Sunil K Pandey <skpgkp2@gmail.com>
Originally, nptl/descr.h included <sys/rseq.h>, but we removed that
in commit 2c6b4b272e ("nptl:
Unconditionally use a 32-byte rseq area"). After that, it was
not ensured that the RSEQ_SIG macro was defined during sched_getcpu.c
compilation that provided a definition. This commit always checks
the rseq area for CPU number information before using the other
approaches.
This adds an unnecessary (but well-predictable) branch on
architectures which do not define RSEQ_SIG, but its cost is small
compared to the system call. Most architectures that have vDSO
acceleration for getcpu also have rseq support.
Fixes: 2c6b4b272e
Fixes: 1d350aa060
Reviewed-by: Arjun Shankar <arjun@redhat.com>
Due to GCC bug 110901 -mcpu can override -march setting when compiling
asm code and thus a compiler targetting a specific cpu can fail the
configure check even when binutils gas supports SVE.
The workaround is that explicit .arch directive overrides both -mcpu
and -march, and since that's what the actual SVE memcpy uses the
configure check should use that too even if the GCC issue is fixed
independently.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
This patch updates the kernel version in the tests tst-mman-consts.py,
tst-mount-consts.py and tst-pidfd-consts.py to 6.8. (There are no new
constants covered by these tests in 6.8 that need any other header
changes.)
Tested with build-many-glibcs.py.
Linux 6.8 adds five new syscalls. Update syscall-names.list and
regenerate the arch-syscall.h headers with build-many-glibcs.py
update-syscalls.
Tested with build-many-glibcs.py.
Similar to strstr (1e9a550ba4), power8 strcasestr does not show much
improvement compared to the generic implementation. The geomean
on bench-strcasestr shows:
__strcasestr_power8 __strcasestr_ppc
power10 1159 1120
power9 1640 1469
power8 1787 1904
The strcasestr uses the same 'trick' as power7 strstr to detect
potential quadradic behavior, which only adds overheads for input
that trigger quadradic behavior and it is really a hack.
Checked on powerpc64le-linux-gnu.
Reviewed-by: DJ Delorie <dj@redhat.com>
The memcpy optimization (commit 587a1290a1) has a series
of mistakes:
- The implementation is wrong: the chunk size calculation is wrong
leading to invalid memory access.
- It adds ifunc supports as default, so --disable-multi-arch does
not work as expected for riscv.
- It mixes Linux files (memcpy ifunc selection which requires the
vDSO/syscall mechanism) with generic support (the memcpy
optimization itself).
- There is no __libc_ifunc_impl_list, which makes testing only
check the selected implementation instead of all supported
by the system.
This patch also simplifies the required bits to enable ifunc: there
is no need to memcopy.h; nor to add Linux-specific files.
The __memcpy_noalignment tail handling now uses a branchless strategy
similar to aarch64 (overlap 32-bits copies for sizes 4..7 and byte
copies for size 1..3).
Checked on riscv64 and riscv32 by explicitly enabling the function
on __libc_ifunc_impl_list on qemu-system.
Changes from v1:
* Implement the memcpy in assembly to correctly handle RISCV
strict-alignment.
Reviewed-by: Evan Green <evan@rivosinc.com>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Each mask in the sigset array is an unsigned long, so fix __sigisemptyset
to use that instead of int. The __sigword function returns a simple array
index, so it can return int instead of unsigned long.
Protect the global locale from being modified while we compute the size of
the locale category names. That allows the use of the global locale in a
single thread, while all other threads use the thread safe locale
functions.
Replace minimum ISA check ifdef conditional with if. Since
MINIMUM_X86_ISA_LEVEL and AVX_X86_ISA_LEVEL are compile time constants,
compiler will perform constant folding optimization, getting same
results.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
For CPU implementations that can perform unaligned accesses with little
or no performance penalty, create a memcpy implementation that does not
bother aligning buffers. It will use a block of integer registers, a
single integer register, and fall back to bytewise copy for the
remainder.
Signed-off-by: Evan Green <evan@rivosinc.com>
Reviewed-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Add a little helper method so it's easier to fetch a single value from
the hwprobe function when used within an ifunc selector.
Signed-off-by: Evan Green <evan@rivosinc.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
RISC-V is apparently the first architecture to pass more than one
argument to ifunc resolvers. The helper macros in libc-symbols.h,
__ifunc_resolver(), __ifunc(), and __ifunc_hidden(), are incompatible
with this. These macros have an "arg" (non-final) parameter that
represents the parameter signature of the ifunc resolver. The result is
an inability to pass the required comma through in a single preprocessor
argument.
Rearrange the __ifunc_resolver() macro to be variadic, and pass the
types as those variable parameters. Move the guts of __ifunc() and
__ifunc_hidden() into new macros, __ifunc_args(), and
__ifunc_args_hidden(), that pass the variable arguments down through to
__ifunc_resolver(). Then redefine __ifunc() and __ifunc_hidden(), which
are used in a bunch of places, to simply shuffle the arguments down into
__ifunc_args[_hidden]. Finally, define a riscv-ifunc.h header, which
provides convenience macros to those looking to write ifunc selectors
that use both arguments.
Signed-off-by: Evan Green <evan@rivosinc.com>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
The new __riscv_hwprobe() function is designed to be used by ifunc
selector functions. This presents a challenge for applications and
libraries, as ifunc selectors are invoked before all relocations have
been performed, so an external call to __riscv_hwprobe() from an ifunc
selector won't work. To address this, pass a pointer to the
__riscv_hwprobe() function into ifunc selectors as the second
argument (alongside dl_hwcap, which was already being passed).
Include a typedef as well for convenience, so that ifunc users don't
have to go through contortions to call this routine. Users will need to
remember to check the second argument for NULL, to account for older
glibcs that don't pass the function.
Signed-off-by: Evan Green <evan@rivosinc.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
The new riscv_hwprobe syscall also comes with a vDSO for faster answers
to your most common questions. Call in today to speak with a kernel
representative near you!
Signed-off-by: Evan Green <evan@rivosinc.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Add an INTERNAL_VSYSCALL() macro that makes a vDSO call, falling back to
a regular syscall, but without setting errno. Instead, the return value
is plumbed straight out of the macro.
Signed-off-by: Evan Green <evan@rivosinc.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Add awareness and a thin wrapper function around a new Linux system call
that allows callers to get architecture and microarchitecture
information about the CPUs from the kernel. This can be used to
do things like dynamically choose a memcpy implementation.
Signed-off-by: Evan Green <evan@rivosinc.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Add a tunable for setting __libc_enable_secure to 1. Do not set
__libc_enable_secure to 0 if the tunable is set to 0. Ignore all
tunables if glib.rtld.enable_secure is set. One use-case for this
addition is to enable testing code paths that depend on
__libc_enable_secure being set without the need to use setxid binaries.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
_dl_tlsdesc_dynamic should also preserve AMX registers which are
caller-saved. Add X86_XSTATE_TILECFG_ID and X86_XSTATE_TILEDATA_ID
to x86-64 TLSDESC_CALL_STATE_SAVE_MASK. Compute the AMX state size
and save it in xsave_state_full_size which is only used by
_dl_tlsdesc_dynamic_xsave and _dl_tlsdesc_dynamic_xsavec. This fixes
the AMX part of BZ #31372. Tested on AMX processor.
AMX test is enabled only for compilers with the fix for
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114098
GCC 14 and GCC 11/12/13 branches have the bug fix.
Reviewed-by: Sunil K Pandey <skpgkp2@gmail.com>
When strcmp-avx2.S is used as the default, elf/tst-valgrind-smoke fails
with
==1272761== Conditional jump or move depends on uninitialised value(s)
==1272761== at 0x4022C98: strcmp (strcmp-avx2.S:462)
==1272761== by 0x400B05B: _dl_name_match_p (dl-misc.c:75)
==1272761== by 0x40085F3: _dl_map_object (dl-load.c:1966)
==1272761== by 0x401AEA4: map_doit (rtld.c:644)
==1272761== by 0x4001488: _dl_catch_exception (dl-catch.c:237)
==1272761== by 0x40015AE: _dl_catch_error (dl-catch.c:256)
==1272761== by 0x401B38F: do_preload (rtld.c:816)
==1272761== by 0x401C116: handle_preload_list (rtld.c:892)
==1272761== by 0x401EDF5: dl_main (rtld.c:1842)
==1272761== by 0x401A79E: _dl_sysdep_start (dl-sysdep.c:140)
==1272761== by 0x401BEEE: _dl_start_final (rtld.c:494)
==1272761== by 0x401BEEE: _dl_start (rtld.c:581)
==1272761== by 0x401AD87: ??? (in */elf/ld.so)
The assembly codes are:
0x0000000004022c80 <+144>: vmovdqu 0x20(%rdi),%ymm0
0x0000000004022c85 <+149>: vpcmpeqb 0x20(%rsi),%ymm0,%ymm1
0x0000000004022c8a <+154>: vpcmpeqb %ymm0,%ymm15,%ymm2
0x0000000004022c8e <+158>: vpandn %ymm1,%ymm2,%ymm1
0x0000000004022c92 <+162>: vpmovmskb %ymm1,%ecx
0x0000000004022c96 <+166>: inc %ecx
=> 0x0000000004022c98 <+168>: jne 0x4022c32 <strcmp+66>
strcmp-avx2.S has 32-byte vector loads of strings which are shorter than
32 bytes:
(gdb) p (char *) ($rdi + 0x20)
$6 = 0x1ffeffea20 "memcheck-amd64-linux.so"
(gdb) p (char *) ($rsi + 0x20)
$7 = 0x4832640 "core-amd64-linux.so"
(gdb) call (int) strlen ((char *) ($rsi + 0x20))
$8 = 19
(gdb) call (int) strlen ((char *) ($rdi + 0x20))
$9 = 23
(gdb)
It triggers the valgrind error. The above code is safe since the loads
don't cross the page boundary. Update tst-valgrind-smoke.sh to accept
an optional suppression file and pass a suppression file to valgrind when
strcmp-avx2.S is the default implementation of strcmp.
Reviewed-by: Sunil K Pandey <skpgkp2@gmail.com>
When glibc is built with ISA level 3 or above enabled, SSE resolvers
aren't available and glibc fails to build:
ld: .../elf/librtld.os: in function `init_cpu_features':
.../elf/../sysdeps/x86/cpu-features.c:1200:(.text+0x1445f): undefined reference to `_dl_runtime_resolve_fxsave'
ld: .../elf/librtld.os: relocation R_X86_64_PC32 against undefined hidden symbol `_dl_runtime_resolve_fxsave' can not be used when making a shared object
/usr/local/bin/ld: final link failed: bad value
For ISA level 3 or above, don't use _dl_runtime_resolve_fxsave nor
_dl_tlsdesc_dynamic_fxsave.
This fixes BZ #31429.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
Compiler generates the following instruction sequence for GNU2 dynamic
TLS access:
leaq tls_var@TLSDESC(%rip), %rax
call *tls_var@TLSCALL(%rax)
or
leal tls_var@TLSDESC(%ebx), %eax
call *tls_var@TLSCALL(%eax)
CALL instruction is transparent to compiler which assumes all registers,
except for EFLAGS and RAX/EAX, are unchanged after CALL. When
_dl_tlsdesc_dynamic is called, it calls __tls_get_addr on the slow
path. __tls_get_addr is a normal function which doesn't preserve any
caller-saved registers. _dl_tlsdesc_dynamic saved and restored integer
caller-saved registers, but didn't preserve any other caller-saved
registers. Add _dl_tlsdesc_dynamic IFUNC functions for FNSAVE, FXSAVE,
XSAVE and XSAVEC to save and restore all caller-saved registers. This
fixes BZ #31372.
Add GLRO(dl_x86_64_runtime_resolve) with GLRO(dl_x86_tlsdesc_dynamic)
to optimize elf_machine_runtime_setup.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
When passed a pointer to a zero-sized struct, the access attribute
without the third argument misleads -Wstringop-overflow diagnostics to
think that a function is writing 1 byte into the zero-sized structs.
The attribute doesn't add that much value in this context, so drop it
completely for _FORTIFY_SOURCE=3.
Resolves: BZ #31383
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Instead of tying based on the linker name and version, check for the
required support:
* whether it does not generate dynamic TLS relocations in PIE
(binutils PR ld/22263);
* if it accepts --no-dynamic-linker (by using -static-pie);
* and if it adds a DT_JMPREL pointing to .rela.iplt with static pie.
The patch also trims the comments, for binutils one of the tests should
already cover it. The kernel ones are not clear which version should
have the backport, nor it is something that glibc can do much about
it. Finally, the glibc is somewhat confusing, since it refers
to commits not related to s390x.
Checked with a build for s390x-linux-gnu.
Reviewed-by: Stefan Liebler <stli@linux.ibm.com>
It improve mq_open. The compile and runtime checks have similar
coverage as with GCC.
Checked on aarch64, armhf, x86_64, and i686.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
It improves open, open64, openat, and openat64. The compile and runtime
checks have similar coverage as with GCC.
Checked on aarch64, armhf, x86_64, and i686.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
It improve fortify checks for wmemcpy, wmemmove, wmemset, wcscpy,
wcpcpy, wcsncpy, wcpncpy, wcscat, wcsncat, wcslcpy, wcslcat, swprintf,
fgetws, fgetws_unlocked, wcrtomb, mbsrtowcs, wcsrtombs, mbsnrtowcs, and
wcsnrtombs. The compile and runtime checks have similar coverage as
with GCC.
Checked on aarch64, armhf, x86_64, and i686.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
It improve fortify checks for syslog and vsyslog. The compile
and runtime hecks have similar coverage as with GCC.
The syslog fortify wrapper calls the va_arg version, since clang does
not support __va_arg_pack.
Checked on aarch64, armhf, x86_64, and i686.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
It improve fortify checks recv, recvfrom, poll, and ppoll. The compile
and runtime hecks have similar coverage as with GCC.
Checked on aarch64, armhf, x86_64, and i686.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
It improve fortify checks for read, pread, pread64, readlink,
readlinkat, getcwd, getwd, confstr, getgroups, ttyname_r, getlogin_r,
gethostname, and getdomainname. The compile and runtime checks have
similar coverage as with GCC.
Checked on aarch64, armhf, x86_64, and i686.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
It improve fortify checks for realpath, ptsname_r, wctomb, mbstowcs,
and wcstombs. The runtime and compile checks have similar coverage as
with GCC.
Checked on aarch64, armhf, x86_64, and i686.
Tested-by: Carlos O'Donell <carlos@redhat.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
It improve fortify checks for strcpy, stpcpy, strncpy, stpncpy, strcat,
strncat, strlcpy, and strlcat. The runtime and compile checks have
similar coverage as with GCC.
Checked on aarch64, armhf, x86_64, and i686.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
It improve fortify checks for sprintf, vsprintf, vsnsprintf, fprintf,
dprintf, asprintf, __asprintf, obstack_printf, gets, fgets,
fgets_unlocked, fread, and fread_unlocked. The runtime checks have
similar support coverage as with GCC.
For function with variadic argument (sprintf, snprintf, fprintf, printf,
dprintf, asprintf, __asprintf, obstack_printf) the fortify wrapper calls
the va_arg version since clang does not support __va_arg_pack.
Checked on aarch64, armhf, x86_64, and i686.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
For instance, the read wrapper is currently expanded as:
extern __inline
__attribute__((__always_inline__))
__attribute__((__artificial__))
__attribute__((__warn_unused_result__))
ssize_t read (int __fd, void *__buf, size_t __nbytes)
{
return __glibc_safe_or_unknown_len (__nbytes,
sizeof (char),
__glibc_objsize0 (__buf))
? __read_alias (__fd, __buf, __nbytes)
: __glibc_unsafe_len (__nbytes,
sizeof (char),
__glibc_objsize0 (__buf))
? __read_chk_warn (__fd,
__buf,
__nbytes,
__builtin_object_size (__buf, 0))
: __read_chk (__fd,
__buf,
__nbytes,
__builtin_object_size (__buf, 0));
}
The wrapper relies on __builtin_object_size call lowers to a constant at
compile-time and many other operations in the wrapper depends on
having a single, known value for parameters. Because this is
impossible to have for function parameters, the wrapper depends heavily
on inlining to work and While this is an entirely viable approach on
GCC, it is not fully reliable on clang. This is because by the time llvm
gets to inlining and optimizing, there is a minimal reliable source and
type-level information available (more information on a more deep
explanation on how to fortify wrapper works on clang [1]).
To allow the wrapper to work reliably and with the same functionality as
with GCC, clang requires a different approach:
* __attribute__((diagnose_if(c, “str”, “warning”))) which is a function
level attribute; if the compiler can determine that 'c' is true at
compile-time, it will emit a warning with the text 'str1'. If it would
be better to emit an error, the wrapper can use "error" instead of
"warning".
* __attribute__((overloadable)) which is also a function-level attribute;
and it allows C++-style overloading to occur on C functions.
* __attribute__((pass_object_size(n))) which is a parameter-level
attribute; and it makes the compiler evaluate
__builtin_object_size(param, n) at each call site of the function
that has the parameter, and passes it in as a hidden parameter.
This attribute has two side-effects that are key to how FORTIFY works:
1. It can overload solely on pass_object_size (e.g. there are two
overloads of foo in
void foo(char * __attribute__((pass_object_size(0))) c);
void foo(char *);
(The one with pass_object_size attribute has precende over the
default one).
2. A function with at least one pass_object_size parameter can never
have its address taken (and overload resolution respects this).
Thus the read wrapper can be implemented as follows, without
hindering any fortify coverage compile and runtime:
extern __inline
__attribute__((__always_inline__))
__attribute__((__artificial__))
__attribute__((__overloadable__))
__attribute__((__warn_unused_result__))
ssize_t read (int __fd,
void *const __attribute__((pass_object_size (0))) __buf,
size_t __nbytes)
__attribute__((__diagnose_if__ ((((__builtin_object_size (__buf, 0)) != -1ULL
&& (__nbytes) > (__builtin_object_size (__buf, 0)) / (1))),
"read called with bigger length than size of the destination buffer",
"warning")))
{
return (__builtin_object_size (__buf, 0) == (size_t) -1)
? __read_alias (__fd,
__buf,
__nbytes)
: __read_chk (__fd,
__buf,
__nbytes,
__builtin_object_size (__buf, 0));
}
To avoid changing the current semantic for GCC, a set of macros is
defined to enable the clang required attributes, along with some changes
on internal macros to avoid the need to issue the symbol_chk symbols
(which are done through the __diagnose_if__ attribute for clang).
The read wrapper is simplified as:
__fortify_function __attribute_overloadable__ __wur
ssize_t read (int __fd,
__fortify_clang_overload_arg0 (void *, ,__buf),
size_t __nbytes)
__fortify_clang_warning_only_if_bos0_lt (__nbytes, __buf,
"read called with bigger length than "
"size of the destination buffer")
{
return __glibc_fortify (read, __nbytes, sizeof (char),
__glibc_objsize0 (__buf),
__fd, __buf, __nbytes);
}
There is no expected semantic or code change when using GCC.
Also, clang does not support __va_arg_pack, so variadic functions are
expanded to call va_arg implementations. The error function must not
have bodies (address takes are expanded to nonfortified calls), and
with the __fortify_function compiler might still create a body with the
C++ mangling name (due to the overload attribute). In this case, the
function is defined with __fortify_function_error_function macro
instead.
[1] https://docs.google.com/document/d/1DFfZDICTbL7RqS74wJVIJ-YnjQOj1SaoqfhbgddFYSM/edit
Checked on aarch64, armhf, x86_64, and i686.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
This includes a fix for big-endian in AdvSIMD log, some cosmetic
changes, and numerous small optimisations mainly around inlining and
using indexed variants of MLA intrinsics.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Starting with commit e57d8fc97b
"S390: Always use svc 0"
clone clobbers the call-saved register r7 in error case:
function or stack is NULL.
This patch restores the saved registers also in the error case.
Furthermore the existing test misc/tst-clone is extended to check
all error cases and that clone does not clobber registers in this
error case.
When glibc is built with ISA level 3 or higher by default, the resulting
glibc binaries won't run on SSE or FMA4 processors. Exclude SSE, AVX and
FMA4 variants in libm multiarch when ISA level 3 or higher is enabled by
default.
When glibc is built with ISA level 2 enabled by default, only keep SSE4.1
variant.
Fixes BZ 31335.
NB: elf/tst-valgrind-smoke test fails with ISA level 4, because valgrind
doesn't support AVX512 instructions:
https://bugs.kde.org/show_bug.cgi?id=383010
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
Reflow and sort Makefile.
Code generation changes present due to link order changes.
No regressions on x86_64 and i686.
Tested with build-many-glibcs.py for x86_64-gnu.
Reflow and sort Makefile.
No code generation changes in non-test binary artifacts.
No regressions on x86_64 and i686.
Tested with build-many-glibcs.py for x86_64-gnu.
Add APX registers to STATE_SAVE_MASK so that APX registers are saved in
ld.so trampoline. This fixes BZ #31371.
Also update STATE_SAVE_OFFSET and STATE_SAVE_MASK for i386 which will
be used by i386 _dl_tlsdesc_dynamic.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
This patch adds more benchtests for rounding functions.
The double inputs are copied from trunc-inputs, the float inputs are copied from truncf-inputs. and the rintf is copied from rint-inputs.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Testing for `None`-ness with `==` operator is frowned upon and causes
warnings in at least "LGTM" python linter. Fix that.
Signed-off-by: Konstantin Kharlamov <Hi-Angel@yandex.ru>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The optimization is not faster than the generic algorithm,
using the bench-strstr the geometric mean running on a POWER10 machine
using gcc 13.1.1 is 482.47 while the default __strstr_ppc is 340.97
(which uses the generic implementation).
Also, there is no need to redirect the internal str*/mem* call
to optimized version, internal ifunc is supported and enabled
for internal calls (meaning that the generic implementation
will use any asm optimization if available).
Checked on powerpc64le-linux-gnu.
Reviewed-by: Peter Bergner <bergner@linux.ibm.com>
This patch adds some --disable-multi-arch variants for s390x.
As the used IFUNC variants and __GI symbols depend on the used
gcc -march=cpu-level, there are multiple new configurations.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The FSR version field is read-only and might be non-zero.
This allows math/test-fpucw* to correctly pass when the version is
non-zero.
Signed-off-by: Daniel Cederman <cederman@gaisler.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Commit ff026950e2 ("Add a C wrapper for
prctl [BZ #25896]") replaced the assembler wrapper with a C function.
However, on powerpc64le-linux-gnu, the C variadic function
implementation requires extra work in the caller to set up the
parameter save area. Calling a function that needs a parameter save
area without one (because the prototype used indicates the function is
not variadic) corrupts the caller's stack. The Linux manual pages
project documents prctl as a non-variadic function. This has resulted
in various projects over the years using non-variadic prototypes,
including the sanitizer libraries in LLVm and GCC (GCC PR 113728).
This commit switches back to the assembler implementation on most
targets and only keeps the C implementation for x86-64 x32.
Also add the __prctl_time64 alias from commit
b39ffab860 ("Linux: Add time64 alias for
prctl") to sysdeps/unix/sysv/linux/syscalls.list; it was not yet
present in commit ff026950e2.
This restores the old ABI on powerpc64le-linux-gnu, thus fixing
bug 29770.
Reviewed-By: Simon Chopin <simon.chopin@canonical.com>
Before this change, we incorrectly used the SSE2 variant in the
implementation, without checking that the system actually supports
SSE2.
Tested-by: Sam James <sam@gentoo.org>
'_' is used in Makefile variable names and many variables end with
"^# name". Relax sort-makefile-lines.py to allow '_' in name and
"^# name" as variable end. This fixes BZ #31385.
"number of arguments, from zero to five" is wrong, because on Linux maximal number
of arguments is 6, not 5. Also, maximal number of arguments is kernel-dependent,
so let's not include it here at all.
Moreover, "Each kind of system call has a definite number of arguments" is questionable.
Think about SYS_open on Linux, which takes 2 or 3 arguments. Or SYS_clone on Linux x86_64, which
takes 2 to 5 arguments. So I propose to fully remove this sentence.
Signed-off-by: Askar Safin <safinaskar@zohomail.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
__builtin_ffs{,ll} basically on __builtin_ctz{,ll} in MIPS GCC compiler.
The hardware ctz instructions were available after MIPS{32,64} Release1. By using builtin ctz. It can also reduce code size of ffs/ffsll.
Checked on mips o32. mips64.
Signed-off-by: Junxian Zhu <zhujunxian@oss.cipunited.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: Maxim Kuvyrkov <maxim.kuvyrkov@linaro.org>
For AMD Zen3+ architecture, the performance of the vectorized loop is
slightly better than ERMS.
Checked on x86_64-linux-gnu on Zen3.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
The REP MOVSB usage on memcpy/memmove does not show much performance
improvement on Zen3/Zen4 cores compared to the vectorized loops. Also,
as from BZ 30994, if the source is aligned and the destination is not
the performance can be 20x slower.
The performance difference is noticeable with small buffer sizes, closer
to the lower bounds limits when memcpy/memmove starts to use ERMS. The
performance of REP MOVSB is similar to vectorized instruction on the
size limit (the L2 cache). Also, there is no drawback to multiple cores
sharing the cache.
Checked on x86_64-linux-gnu on Zen3.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
Some shadow stack test scripts use the '==' operator with the 'test'
command to validate exit codes resulting in the following error:
sysdeps/x86_64/tst-shstk-legacy-1e.sh: 31: test: 139: unexpected operator
The '==' operator is invalid for the 'test' command, use '-eq' like the
previous call to 'test'.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
Change "\(" and "\)" to "\\(" and "\\)" in test_printers_common.py. This
fixes the test warning:
.../scripts/test_printers_common.py:101: SyntaxWarning: invalid escape sequence '\('
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Linux 6.7 adds a constant SOL_VSOCK (recall that various constants in
include/linux/socket.h are in fact part of the kernel-userspace API
despite that not being a uapi header). Add it to glibc's
bits/socket.h.
Tested for x86_64.
Otherwise on at least x86_64 and s390x there is an unwanted PLT entry
in libc.so when configured with --enable-fortify-source=3 and build
with -Os.
This is observed in elf/check-localplt
Extra PLT reference: libc.so: __strcpy_chk
The call to PLT entry is in inet/ruserpass.c.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The commit 49d877a80b (arm: Remove
_dl_skip_args usage) removed the _SKIP_ARGS literal, which was
previously loader to r4 on loader _start. However, the cleanup did not
remove the following 'ldr r4, [sl, r4]' on _dl_start_user, used to check
to skip the arguments after ld self-relocations.
In my testing, the kernel initially set r4 to 0, which makes the
ldr instruction just read the _GLOBAL_OFFSET_TABLE_. However, since r4
is a callee-saved register; a different runtime might not zero
initialize it and thus trigger an invalid memory access.
Checked on arm-linux-gnu.
Reported-by: Adrian Ratiu <adrian.ratiu@collabora.com>
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
On LoongArch GCC compiles __builtin_ffs{,ll} to basically
`(x ? __builtin_ctz (x) : -1) + 1`. Since a hardware ctz instruction is
available, this is much better than the table-driven generic
implementation.
Tested on loongarch64.
Signed-off-by: Xi Ruoyao <xry111@xry111.site>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
On s390x, I get warnings like this when do_one_test is inlined with SIZE_MAX:
In function ‘do_one_test’,
inlined from ‘do_overflow_tests’ at tst-strlcat2.c:184:2:
tst-strlcat2.c:49:18: error: ‘strnlen’ specified bound [18446744073709550866, 18446744073709551615] exceeds maximum object size 9223372036854775807 [-Werror=stringop-overflow=]
49 | # define STRNLEN strnlen
| ^
tst-strlcat2.c:89:23: note: in expansion of macro ‘STRNLEN’
89 | size_t dst_length = STRNLEN (dst, n);
| ^~~~~~~
This patch just marks the do_one_test function as noinline as also done in test-strncat.c:
Fix stringop-overflow warning in test-strncat.
https://sourceware.org/git/?p=glibc.git;a=commit;h=51aeab9a363a0d000d0912aa3d6490463a26fba2
For o32 we need to setup a minimal stack frame to allow cprestore
on __thread_start_clone3 (which instruct the linker to save the
gp for PIC). Also, there is no guarantee by kABI that $8 will be
preserved after syscall execution, so we need to save it on the
provided stack.
Checked on mipsel-linux-gnu.
Reported-by: Khem Raj <raj.khem@gmail.com>
Tested-by: Khem Raj <raj.khem@gmail.com>
* manual/search.texi (Comparison Functions, Array Sort Function):
Sort an array of long ints, not doubles, to avoid hassles
with NaNs.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Complete the internal renaming from "C2X" and related names in GCC by
renaming *-c2x and *-gnu2x tests to *-c23 and *-gnu23.
Tested for x86_64, and with build-many-glibcs.py for powerpc64le.
My recent change broke make pdf and in other documentation formats
results in weird rendering and invalid URL, all because of a forgotten
comma to separate @uref arguments.
When running the testsuite in parallel, for instance running make -j
$(nproc) check, occasionally tst-epoll fails with a timeout. It happens
because it sometimes takes a bit more than 10ms for the process to get
cloned and blocked by the syscall. In that case the signal is
sent to early, and the test fails with a timeout.
Checked on x86_64-linux-gnu.
The exp10, exp10l, fma, fmaf, and fmal default implementation do not
implement the appropriate semantics nor with an reasonable accuracy.
They are also not used by any supported port.
WG14 decided to use the name C23 as the informal name of the next
revision of the C standard (notwithstanding the publication date in
2024). Update references to C2X in glibc to use the C23 name.
This is intended to update everything *except* where it involves
renaming files (the changes involving renaming tests are intended to
be done separately). In the case of the _ISOC2X_SOURCE feature test
macro - the only user-visible interface involved - support for that
macro is kept for backwards compatibility, while adding
_ISOC23_SOURCE.
Tested for x86_64.
A version string may contain non-digit characters, commonly found in
built-from-VCS tools, e.g.
```
git version 2.39.GIT
git version 2.43.0.493.gbc7ee2e5e1
```
`int()` will raise a ValueError, leading to a spurious 'missing'.
Reviewed-by: DJ Delorie <dj@redhat.com>
The following patch uses the GCC 14 __builtin_stdc_* builtins in stdbit.h
for the type-generic macros, so that when compiled with GCC 14 or later,
it supports not just 8/16/32/64-bit unsigned integers, but also 128-bit
(if target supports them) and unsigned _BitInt (any supported precision).
And so that the macros don't expand arguments multiple times and can be
evaluated in constant expressions.
The new testcase is gcc's gcc/testsuite/gcc.dg/builtin-stdc-bit-1.c
adjusted to test stdbit.h and the type-generic macros in there instead
of the builtins and adjusted to use glibc test framework rather than
gcc style tests with __builtin_abort ().
Signed-off-by: Jakub Jelinek <jakub@redhat.com>
Reviewed-by: Joseph Myers <josmyers@redhat.com>
Starting with commits
- 7ea510127e
string: Add libc_hidden_proto for strchrnul
- 22999b2f0f
string: Add libc_hidden_proto for memrchr
building glibc on s390x with --disable-multi-arch fails if only
the C-variant of strchrnul / memrchr is used. This is the case
if gcc uses -march < z13.
The build fails with:
../sysdeps/s390/strchrnul-c.c:28:49: error: ‘__strchrnul_c’ undeclared here (not in a function); did you mean ‘__strchrnul’?
28 | __hidden_ver1 (__strchrnul_c, __GI___strchrnul, __strchrnul_c);
With --disable-multi-arch, __strchrnul_c is not available as string/strchrnul.c
is just included without defining STRCHRNUL and thus we also don't have to create
the internal hidden symbol.
Tested-by: Andreas K. Hüttel <dilfridge@gentoo.org>
Simplify the advisory format by dropping the -Backport tags and instead
stick to using just the -Commit tags. To identify backports, put a
substring of git-describe into the release version in the brackets next
to the commit ref. This way, it not only identifies that the fix (or
regression) is on the release/2.YY/master branch, it also disambiguates
regressions/fixes in the branch from those in the tarball.
Add a README to make it easier for consumers to understand the format.
Additionally, the Release wiki needs to be updated to inform the release
manager to:
1. Generate a NEWS snipped from the advisories directory
AND
2. on release/2.YY/master, replace the advisories directory with a text
file pointing to the advisories directory in master so that we don't
have to update multiple locations.
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: Andreas K. Hüttel <dilfridge@gentoo.org>
__vsyslog_internal calculated a buffer size by adding two integers, but
did not first check if the addition would overflow. This commit fixes
that.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
__vsyslog_internal used the return value of snprintf/vsnprintf to
calculate buffer sizes for memory allocation. If these functions (for
any reason) failed and returned -1, the resulting buffer would be too
small to hold output. This commit fixes that.
All snprintf/vsnprintf calls are checked for negative return values and
the function silently returns upon encountering them.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
__vsyslog_internal did not handle a case where printing a SYSLOG_HEADER
containing a long program name failed to update the required buffer
size, leading to the allocation and overflow of a too-small buffer on
the heap. This commit fixes that. It also adds a new regression test
that uses glibc.malloc.check.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
This change relicenses the IBM portions of resolv/base64.c and
resolv/res_debug.c to a new license that does not have use-limited
patent language. The top-level LICENSE file is updated with the
license.
The relicensing was approved by IBM.
Signed-off-by: Brad Topol, IBM Director of Open Technologies <btopol@us.ibm.com>
Signed-off-by: Richard Fontana <rfontana@redhat.com>
Signed-off-by: Carlos O'Donell <carlos@redhat.com>
On the summary page the order of the function arguments was reversed, but it is
in correct order in the other places of the manual.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
The increased malloc subsystem usage is a side effect of
commit d2123d6827 ("elf: Fix slow tls
access after dlopen [BZ #19924]").
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
For ports that use the default memset, the compiler might generate early
calls before the stack protector is initialized (for instance, riscv
with -fstack-protector-all on _dl_aux_init).
Checked on riscv64-linux-gnu-rv64imafdc-lp64d.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
In qsort_r we allocate a buffer sized QSORT_STACK_SIZE (1024) on stack
and we intend to use it if all elements can fit into it. But there is a
typo:
if (total_size < sizeof buf)
buf = tmp;
else
/* allocate a buffer on heap and use it ... */
Here "buf" is a pointer, thus sizeof buf is just 4 or 8, instead of
1024. There is also a minor issue that we should use "<=" instead of
"<".
This bug is detected debugging some strange heap corruption running the
Ruby-3.3.0 test suite (on an experimental Linux From Scratch build using
Binutils-2.41.90 and Glibc trunk, and also Fedora Rawhide [1]). It
seems Ruby is doing some wild "optimization" by jumping into somewhere
in qsort_r instead of calling it normally, resulting in a double free of
buf if we allocate it on heap. The issue can be reproduced
deterministically with:
LD_PRELOAD=/usr/lib/libc_malloc_debug.so MALLOC_CHECK_=3 \
LD_LIBRARY_PATH=. ./ruby test/runner.rb test/ruby/test_enum.rb
in Ruby-3.3.0 tree after building it. This change would hide the issue
for Ruby, but Ruby is likely still buggy (if using this "optimization"
sorting larger arrays).
[1]:https://kojipkgs.fedoraproject.org/work/tasks/9729/111889729/build.log
Signed-off-by: Xi Ruoyao <xry111@xry111.site>
The small counts copy bytes comparsion should be unsigned (as the
memmove size argument). It fixes string/tst-memmove-overflow on
sparcv9, where the input size triggers an invalid code path.
Checked on sparc64-linux-gnu and sparcv9-linux-gnu.
Similar to sparc32 fix, remove the unwind information on the signal
return stubs. This fixes the regressions:
FAIL: nptl/tst-cancel24-static
FAIL: nptl/tst-cond8-static
FAIL: nptl/tst-mutex8-static
FAIL: nptl/tst-mutexpi8-static
FAIL: nptl/tst-mutexpi9
On sparc64-linux-gnu.
It turns out that the replacement of datetime.datetime.utcnow(), for a
warning produced early in running build-many-glibcs.py with Python
3.12, (a) wasn't complete (there were other uses elsewhere in the
script also needing updating) and (b) broke reading of build-time from
build-state.json, because an aware datetime was written out including
+00:00 for the timezone, which was not expected by the strptime call.
Fix the first by making the change to
datetime.datetime.now(datetime.timezone.utc) for all the remaining
utcnow() calls. Fix the second by using strftime with an explicit
format instead of just str() when formatting build times for
build-state.json and and email subjects, and then setting the timezone
explicitly when reading from build-state.json. (Other uses, in
particular messages output by the bot, continue to use str() as the
precise format should not matter in those cases; it shouldn't actually
matter for email subjects either but it seems a good idea to keep
those short.)
Tested with a bot-cycle run and checking the format of times in
build-state.json afterwards.
The FPU used by LEON does not preserve NaN payload. This change allows
the math/test-*-canonicalize tests to pass on LEON.
Signed-off-by: Daniel Cederman <cederman@gaisler.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Use the math_force_eval() macro to force the calculation to complete and
raise the exception.
With this change the math/test-fenv test pass.
Signed-off-by: Daniel Cederman <cederman@gaisler.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Conversions from a float to a long long on SPARC v8 uses a libgcc function
that may not raise the correct exceptions on overflow. It also may raise
spurious "inexact" exceptions on non overflow cases. This patch fixes the
problem in the same way as for RV32.
Signed-off-by: Daniel Cederman <cederman@gaisler.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The functions were previously written in C, but were not compiled
with unwind information. The ENTRY/END macros includes .cfi_startproc
and .cfi_endproc which adds unwind information. This caused the
tests cleanup-8 and cleanup-10 in the GCC testsuite to fail.
This patch adds a version of the ENTRY/END macros without the
CFI instructions that can be used instead.
sigaction registers a restorer address that is located two instructions
before the stub function. This patch adds a two instruction padding to
avoid that the unwinder accesses the unwind information from the function
that the linker has placed right before it in memory. This fixes an issue
with pthread_cancel that caused tst-mutex8-static (and other tests) to fail.
Signed-off-by: Daniel Cederman <cederman@gaisler.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
On LEON, if the stfsr instruction is immediately following a floating-point
operation instruction in a running program, with no other instruction in
between the two, the stfsr might behave as if the order was reversed
between the two instructions and the stfsr occurred before the
floating-point operation.
Add a nop instruction before the stfsr to prevent this from happening.
Signed-off-by: Daniel Cederman <cederman@gaisler.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Macros for using inline assembly to access the fp state register exists
in both fenv_private.h and in fpu_control.h. Let fenv_private.h use the
macros from fpu_control.h
Signed-off-by: Daniel Cederman <cederman@gaisler.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This patch updates the kernel version in the tests tst-mman-consts.py,
tst-mount-consts.py and tst-pidfd-consts.py to 6.7. (There are no new
constants covered by these tests in 6.7 that need any other header
changes.)
Tested with build-many-glibcs.py.
Linux 6.7 adds the futex_requeue, futex_wait and futex_wake syscalls,
and enables map_shadow_stack for architectures previously missing it.
Update syscall-names.list and regenerate the arch-syscall.h headers
with build-many-glibcs.py update-syscalls.
Tested with build-many-glibcs.py.
Adjust the testing approach to start from scenarios with only 2
elements, as insertion sort no longer handles such cases.
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
When malloc fails to allocate a buffer and falls back to heapsort, the
current heapsort implementation does not perform sorting when there are
exactly two elements. Heapsort is now skipped only when there is
exactly one element.
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Resolves: BZ # 31239
The correct abbreviated month names were apparently given in the comment above `abmon`.
But the value of `abmon` was apparently just copied from the value of `mon` and this
mistake was hard to see because code point notation <Uxxxx> was used. After converting
to UTF-8 it was obvious that there was apparently a copy and paste mistake.
The mergesort removal from qsort implementation (commit 03bf8357e8)
had the side-effect of making sorting nonstable. Although neither
POSIX nor C standard specify that qsort should be stable, it seems
that it has become an instance of Hyrum's law where multiple programs
expect it.
Also, the resulting introsort implementation is not faster than
the previous mergesort (which makes the change even less appealing).
This patch restores the previous mergesort implementation, with the
exception of machinery that checks the resulting allocation against
the _SC_PHYS_PAGES (it only adds complexity and the heuristic not
always make sense depending on the system configuration and load).
The alloca usage was replaced with a fixed-size buffer.
For the fallback mechanism, the implementation uses heapsort. It is
simpler than quicksort, and it does not suffer from adversarial
inputs. With memory overcommit, it should be rarely triggered.
The drawback is mergesort requires O(n) extra space, and since it is
allocated with malloc the function is AS-signal-unsafe. It should be
feasible to change it to use mmap, although I am not sure how urgent
it is. The heapsort is also nonstable, so programs that require a
stable sort would still be subject to this latent issue.
The tst-qsort5 is removed since it will not create quicksort adversarial
inputs with the current qsort_r implementation.
Checked on x86_64-linux-gnu and aarch64-linux-gnu.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Systemd execution environment configuration may prohibit changing a memory
mapping to become executable:
MemoryDenyWriteExecute=
Takes a boolean argument. If set, attempts to create memory mappings
that are writable and executable at the same time, or to change existing
memory mappings to become executable, or mapping shared memory segments
as executable, are prohibited.
When it is set, systemd service stops working if PLT rewrite is enabled.
Check if mprotect works before rewriting PLT. This fixes BZ #31230.
This also works with SELinux when deny_execmem is on.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Ffsll function randomly regress by ~20%, depending on how code gets
aligned in memory. Ffsll function code size is 17 bytes. Since default
function alignment is 16 bytes, it can load on 16, 32, 48 or 64 bytes
aligned memory. When ffsll function load at 16, 32 or 64 bytes aligned
memory, entire code fits in single 64 bytes cache line. When ffsll
function load at 48 bytes aligned memory, it splits in two cache line,
hence random regression.
Ffsll function size reduction from 17 bytes to 12 bytes ensures that it
will always fit in single 64 bytes cache line.
This patch fixes ffsll function random performance regression.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
This patch referents the commit 374cef3 to add static-pie support. And
because the dummy link map is used when relocating ourselves, so need
not to set __global_pointer$ at this time.
It will also check whether toolchain supports to build static-pie.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The 551101e824 change is incorrect for
alpha and sparc, since __NR_stat is defined by both kABI. Use
__NR_newfstat to check whether to fallback to __NR_fstat64 (similar
to what fstatat64 does).
Checked on sparc64-linux-gnu.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Remove the error handling wrapper from exp10. This is very similar to
the changes done to exp and exp2, except that we also need to handle
pow10 and pow10l.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Increase benchmark iterations for math and vector math functions to improve
timing accuracy. Vector math benchmarks now take 1-3 seconds on a modern CPU.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Otherwise the warning message for the getwd symbol ends up being duplicated.
Signed-off-by: Frederic Cambus <fred@statdns.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Signed-off-by: Andreas K. Hüttel <dilfridge@gentoo.org>
The __getrandom_nocancel function returns errors as negative values
instead of errno. This is inconsistent with other _nocancel functions
and it breaks "TEMP_FAILURE_RETRY (__getrandom_nocancel (p, n, 0))" in
__arc4random_buf. Use INLINE_SYSCALL_CALL instead of
INTERNAL_SYSCALL_CALL to fix this issue.
But __getrandom_nocancel has been avoiding from touching errno for a
reason, see BZ 29624. So add a __getrandom_nocancel_nostatus function
and use it in tcache_key_initialize.
Signed-off-by: Xi Ruoyao <xry111@xry111.site>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Signed-off-by: Andreas K. Hüttel <dilfridge@gentoo.org>
CET feature bits in TCB, which are Linux specific, are used to check if
CET features are active. Move CET feature check to Linux/x86 directory.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
Starting with commit 40c0add7d4
"resolve: Remove __res_context_query alloca usage"
there is an endless loop in __res_context_query if
__res_context_mkquery fails e.g. if type is invalid. Then the
scratch buffer is resized to MAXPACKET size and it is retried again.
Before the mentioned commit, it was retried only once and with the
mentioned commit, there is no check and it retries in an endless loop.
This is observable with xtest resolv/tst-resolv-qtypes which times out
after 300s.
This patch retries mkquery only once as before the mentioned commit.
Furthermore, scratch_buffer_set_array_size is now only called with
nelem=2 if type is T_QUERY_A_AND_AAAA (also see mentioned commit).
The test tst-resolv-qtypes is also adjusted to verify that <func>
is really returning with -1 in case of an invalid type.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
After 78ca44da01
("elf: Relocate libc.so early during startup and dlmopen (bug 31083)")
we start seeing tst-nodeps2 failures when building the testsuite with
--enable-hard-coded-path-in-tests.
When building the testsuite with --enable-hard-coded-path-in-tests
the tst-nodeps2-mod.so is not built with the required DT_RUNPATH
values and the test escapes the test framework and loads the system
libraries and aborts. The fix is to use the existing
$(link-test-modules-rpath-link) variable to set DT_RUNPATH correctly.
No regressions on x86_64.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
1. Remove _dl_runtime_resolve_shstk and _dl_runtime_profile_shstk.
2. Move CET offsets from x86 cpu-features-offsets.sym to x86-64
features-offsets.sym.
3. Rename x86 cet-control.h to x86-64 feature-control.h since it is only
for x86-64 and also used for PLT rewrite.
4. Add x86-64 ldsodefs.h to include feature-control.h.
5. Change TUNABLE_CALLBACK (set_plt_rewrite) to x86-64 only.
6. Move x86 dl-procruntime.c to x86-64.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Move sysdeps/x86/libc-start.h to sysdeps/x86_64/libc-start.h and use
sysdeps/generic/libc-start.h for i386.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Running build-many-glibcs.py with Python 3.12 or later produces a
warning:
build-many-glibcs.py:566: DeprecationWarning: datetime.datetime.utcnow() is deprecated and scheduled for removal in a future version. Use timezone-aware objects to represent datetimes in UTC: datetime.datetime.now(datetime.UTC).
build_time = datetime.datetime.utcnow()
Replace with datetime.datetime.now(datetime.timezone.utc) (the
datetime.UTC constant is new in 3.11, so not suitable for use in this
script at present).
Running build-many-glibcs.py with Python 3.12 or later produces a
warning:
build-many-glibcs.py:173: SyntaxWarning: invalid escape sequence '\.'
m = re.fullmatch('([0-9]+)\.([0-9]+)[.0-9]*', l)
Use a raw string instead to avoid that warning. (Note: I haven't
checked whether any other Python scripts included with glibc might
have issues with newer Python versions.)
The feupdateenv tests added by 802aef27b2 do not restore the floating
point mask, which might keep some floating point exception enabled and
thus make the feupdateenv_single_test raise an unexpected signal.
Checked on x86_64-linux-gnu and aarch64-linux-gnu (on Apple M1 trapping
is supported).
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
On x86-64 machine with
[hjl@gnu-cfl-3 x86-glibc]$ ls -l /usr/include/asm/prctl.h sysdeps/unix/sysv/linux/x86_64/include/asm/prctl.h
-rw-r--r-- 1 hjl hjl 825 Jan 9 09:41 sysdeps/unix/sysv/linux/x86_64/include/asm/prctl.h
-rw-r--r-- 1 root root 1170 Nov 27 16:00 /usr/include/asm/prctl.h
[hjl@gnu-cfl-3 x86-glibc]$
glibc configured with --enable-cet build failed:
make[2]: Entering directory '/export/gnu/import/git/gitlab/x86-glibc/iconv'
../Makerules:327: update target
'/export/build/gnu/tools-build/glibc-cet-gitlab/build-x86_64-linux/gnu/lib-names-64.h'
due to: /export/build/gnu/tools-build/glibc-cet-gitlab/build-x86_64-linux/gnu/lib-names-64.stmp
:
../Makeconfig:1216: update target
'/export/build/gnu/tools-build/glibc-cet-gitlab/build-x86_64-linux/libc-modules.h'
due to: /export/build/gnu/tools-build/glibc-cet-gitlab/build-x86_64-linux/libc-modules.stmp
:
../Makerules:1126: update target '/usr/include/asm/prctl.h' due to:
../sysdeps/unix/sysv/linux/x86_64/64/../include/asm/prctl.h
force-install
/usr/bin/install -c -m 644
../sysdeps/unix/sysv/linux/x86_64/64/../include/asm/prctl.h
/usr/include/asm/prctl.h
/usr/bin/install: cannot remove '/usr/include/asm/prctl.h': Permission denied
make[2]: *** [../Makerules:1126: /usr/include/asm/prctl.h] Error 1
make[2]: Leaving directory '/export/gnu/import/git/gitlab/x86-glibc/iconv'
make[1]: *** [Makefile:484: iconv/subdir_lib] Error 2
make[1]: Leaving directory '/export/gnu/import/git/gitlab/x86-glibc'
make: *** [Makefile:9: all] Error 2
This is triggered by the rule in Makerules:
$(inst_includedir)/%.h: $(..)include/%.h $(+force)
$(do-install)
Since no files under include/ should be installed, remove it from
Makerules.
Tested it on x86-64. There are no differences in the installed header
files.
CET is only support for x86_64, this patch reverts:
- faaee1f07e x86: Support shadow stack pointer in setjmp/longjmp.
- be9ccd27c0 i386: Add _CET_ENDBR to indirect jump targets in
add_n.S/sub_n.S
- c02695d776 x86/CET: Update vfork to prevent child return
- 5d844e1b72 i386: Enable CET support in ucontext functions
- 124bcde683 x86: Add _CET_ENDBR to functions in crti.S
- 562837c002 x86: Add _CET_ENDBR to functions in dl-tlsdesc.S
- f753fa7dea x86: Support IBT and SHSTK in Intel CET [BZ #21598]
- 825b58f3fb i386-mcount.S: Add _CET_ENDBR to _mcount and __fentry__
- 7e119cd582 i386: Use _CET_NOTRACK in i686/memcmp.S
- 177824e232 i386: Use _CET_NOTRACK in memcmp-sse4.S
- 0a899af097 i386: Use _CET_NOTRACK in memcpy-ssse3-rep.S
- 7fb613361c i386: Use _CET_NOTRACK in memcpy-ssse3.S
- 77a8ae0948 i386: Use _CET_NOTRACK in memset-sse2-rep.S
- 00e7b76a8f i386: Use _CET_NOTRACK in memset-sse2.S
- 90d15dc577 i386: Use _CET_NOTRACK in strcat-sse2.S
- f1574581c7 i386: Use _CET_NOTRACK in strcpy-sse2.S
- 4031d7484a i386/sub_n.S: Add a missing _CET_ENDBR to indirect jump
- target
-
Checked on i686-linux-gnu.
The CET is only supported for x86_64 and there is no plan to add
kernel support for i386. Move the Makefile rules and files from the
generic x86 folder to x86_64 one.
Checked on x86_64-linux-gnu and i686-linux-gnu.
Resolves: BZ # 31221
glibc can recognise its code, but does not have its data.
This patch remedies that.
Signed-off-by: Janet Blackquill <uhhadd@gmail.com>
PLT rewrite calculated displacement with
ElfW(Addr) disp = value - branch_start - JMP32_INSN_SIZE;
On x32, displacement from 0xf7fbe060 to 0x401030 was calculated as
unsigned int disp = 0x401030 - 0xf7fbe060 - 5;
with disp == 0x8442fcb and caused displacement overflow. The PLT entry
was changed to:
0xf7fbe060 <+0>: e9 cb 2f 44 08 jmp 0x401030
0xf7fbe065 <+5>: cc int3
0xf7fbe066 <+6>: cc int3
0xf7fbe067 <+7>: cc int3
0xf7fbe068 <+8>: cc int3
0xf7fbe069 <+9>: cc int3
0xf7fbe06a <+10>: cc int3
0xf7fbe06b <+11>: cc int3
0xf7fbe06c <+12>: cc int3
0xf7fbe06d <+13>: cc int3
0xf7fbe06e <+14>: cc int3
0xf7fbe06f <+15>: cc int3
x32 has 32-bit address range, but it doesn't wrap address around at 4GB,
JMP target was changed to 0x100401030 (0xf7fbe060LL + 0x8442fcbLL + 5),
which is above 4GB.
Always use uint64_t to calculate displacement. This fixes BZ #31218.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
With gcc 6.5.0, 7.5.0, 8.5.0, and 9.5.0 the tst-stdbit-Wconversion
issues the warnings:
../stdlib/stdbit.h: In function ‘__clo16_inline’:
../stdlib/stdbit.h:128:26: error: conversion to ‘uint16_t {aka short
unsigned int}’ from ‘int’ may alter its value [-Werror=conversion]
return __clz16_inline (~__x);
^
../stdlib/stdbit.h: In function ‘__clo8_inline’:
../stdlib/stdbit.h:134:25: error: conversion to ‘uint8_t {aka unsigned
char}’ from ‘int’ may alter its value [-Werror=conversion]
return __clz8_inline (~__x);
^
../stdlib/stdbit.h: In function ‘__cto16_inline’:
../stdlib/stdbit.h:232:26: error: conversion to ‘uint16_t {aka short
unsigned int}’ from ‘int’ may alter its value [-Werror=conversion]
return __ctz16_inline (~__x);
^
../stdlib/stdbit.h: In function ‘__cto8_inline’:
../stdlib/stdbit.h:238:25: error: conversion to ‘uint8_t {aka unsigned
char}’ from ‘int’ may alter its value [-Werror=conversion]
return __ctz8_inline (~__x);
^
../stdlib/stdbit.h: In function ‘__bf16_inline’:
../stdlib/stdbit.h:701:23: error: conversion to ‘uint16_t {aka short
unsigned int}’ from ‘int’ may alter its value [-Werror=conversion]
return __x == 0 ? 0 : ((uint16_t) 1) << (__bw16_inline (__x) - 1);
~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../stdlib/stdbit.h: In function ‘__bf8_inline’:
../stdlib/stdbit.h:707:23: error: conversion to ‘uint8_t {aka unsigned
char}’ from ‘int’ may alter its value [-Werror=conversion]
return __x == 0 ? 0 : ((uint8_t) 1) << (__bw8_inline (__x) - 1);
~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../stdlib/stdbit.h: In function ‘__bc16_inline’:
../stdlib/stdbit.h:751:59: error: conversion to ‘uint16_t {aka short
unsigned int}’ from ‘int’ may alter its value [-Werror=conversion]
return __x <= 1 ? 1 : ((uint16_t) 2) << (__bw16_inline (__x - 1) -
1);
^~~
../stdlib/stdbit.h:751:23: error: conversion to ‘uint16_t {aka short
unsigned int}’ from ‘int’ may alter its value [-Werror=conversion]
return __x <= 1 ? 1 : ((uint16_t) 2) << (__bw16_inline (__x - 1) -
1);
~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../stdlib/stdbit.h: In function ‘__bc8_inline’:
../stdlib/stdbit.h:757:57: error: conversion to ‘uint8_t {aka unsigned
char}’ from ‘int’ may alter its value [-Werror=conversion]
return __x <= 1 ? 1 : ((uint8_t) 2) << (__bw8_inline (__x - 1) - 1);
^~~
../stdlib/stdbit.h:757:23: error: conversion to ‘uint8_t {aka unsigned
char}’ from ‘int’ may alter its value [-Werror=conversion]
return __x <= 1 ? 1 : ((uint8_t) 2) << (__bw8_inline (__x - 1) - 1);
It seems to boiler down to __builtin_clz not having a variant for 8 or
16 bits. Fix it by explicit casting to the expected types.
Checked on x86_64-linux-gnu and i686-linux-gnu with gcc 9.5.0.
Add ELF_DYNAMIC_AFTER_RELOC to allow target specific processing after
relocation.
For x86-64, add
#define DT_X86_64_PLT (DT_LOPROC + 0)
#define DT_X86_64_PLTSZ (DT_LOPROC + 1)
#define DT_X86_64_PLTENT (DT_LOPROC + 3)
1. DT_X86_64_PLT: The address of the procedure linkage table.
2. DT_X86_64_PLTSZ: The total size, in bytes, of the procedure linkage
table.
3. DT_X86_64_PLTENT: The size, in bytes, of a procedure linkage table
entry.
With the r_addend field of the R_X86_64_JUMP_SLOT relocation set to the
memory offset of the indirect branch instruction.
Define ELF_DYNAMIC_AFTER_RELOC for x86-64 to rewrite the PLT section
with direct branch after relocation when the lazy binding is disabled.
PLT rewrite is disabled by default since SELinux may disallow modifying
code pages and ld.so can't detect it in all cases. Use
$ export GLIBC_TUNABLES=glibc.cpu.plt_rewrite=1
to enable PLT rewrite with 32-bit direct jump at run-time or
$ export GLIBC_TUNABLES=glibc.cpu.plt_rewrite=2
to enable PLT rewrite with 32-bit direct jump and on APX processors with
64-bit absolute jump at run-time.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
These describe generic AArch64 CPU features, and are not tied to a
kernel-specific way of determining them. We can share them between
the Linux and Hurd AArch64 ports.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-ID: <20240103171502.1358371-13-bugaevc@gmail.com>
We fetch __vm_page_size as the very first RPC that we do, inside
__mach_init (). Propagate that to _dl_pagesize ASAP after that,
before any other initialization.
In dynamic builds, this is already done immediately after
__mach_init (), inside _dl_sysdep_start ().
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-ID: <20240103171502.1358371-12-bugaevc@gmail.com>
This is the case on both x86 architectures, but not on AArch64.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-ID: <20240103171502.1358371-11-bugaevc@gmail.com>
We already have the RETURN_TO macro for this exact use case, and it's already
used in the non-static code path. Use it here too.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-ID: <20240103171502.1358371-9-bugaevc@gmail.com>
Instead of relying on the stack frame layout to figure out where the stack
pointer was prior to the _hurd_stack_setup () call, just pass the pointer
as an argument explicitly. This is less brittle and much more portable.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-ID: <20240103171502.1358371-8-bugaevc@gmail.com>
setcontext and swapcontext put a restore token on the old shadow stack
which is used to restore the target shadow stack when switching user
contexts. When longjmp from a user context, the target shadow stack
can be different from the current shadow stack and INCSSP can't be
used to restore the shadow stack pointer to the target shadow stack.
Update longjmp to search for a restore token. If found, use the token
to restore the shadow stack pointer before using INCSSP to pop the
shadow stack. Stop the token search and use INCSSP if the shadow stack
entry value is the same as the current shadow stack pointer.
It is a user error if there is a shadow stack switch without leaving a
restore token on the old shadow stack.
The only difference between __longjmp.S and __longjmp_chk.S is that
__longjmp_chk.S has a check for invalid longjmp usages. Merge
__longjmp.S and __longjmp_chk.S by adding the CHECK_INVALID_LONGJMP
macro.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
Since shadow stack is only supported for x86-64, ignore --enable-cet for
i386. Always setting $(enable-cet) for i386 to "no" to support
ifneq ($(enable-cet),no)
in x86 Makefiles. We can't use
ifeq ($(enable-cet),yes)
since $(enable-cet) can be "yes", "no" or "permissive".
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
These symbols are internal and never exported; make sure the compiler
realizes that when compiling hurdsig.c and does not try to emit GOT
reads.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
C23 adds a header <stdbit.h> with various functions and type-generic
macros for bit-manipulation of unsigned integers (plus macro defines
related to endianness). Implement this header for glibc.
The functions have both inline definitions in the header (referenced
by macros defined in the header) and copies with external linkage in
the library (which are implemented in terms of those macros to avoid
duplication). They are documented in the glibc manual. Tests, as
well as verifying results for various inputs (of both the macros and
the out-of-line functions), verify the types of those results (which
showed up a bug in an earlier version with the type-generic macro
stdc_has_single_bit wrongly returning a promoted type), that the
macros can be used at top level in a source file (so don't use ({})),
that they evaluate their arguments exactly once, and that the macros
for the type-specific functions have the expected implicit conversions
to the relevant argument type.
Jakub previously referred to -Wconversion warnings in type-generic
macros, so I've included a test with -Wconversion (but the only
warnings I saw and fixed from that test were actually in inline
functions in the <stdbit.h> header - not anything coming from use of
the type-generic macros themselves).
This implementation of the type-generic macros does not handle
unsigned __int128, or unsigned _BitInt types with a width other than
that of a standard integer type (and C23 doesn't require the header to
handle such types either). Support for those types, using the new
type-generic built-in functions Jakub's added for GCC 14, can
reasonably be added in a followup (along of course with associated
tests).
This implementation doesn't do anything special to handle C++, or have
any tests of functionality in C++ beyond the existing tests that all
headers can be compiled in C++ code; it's not clear exactly what form
this header should take in C++, but probably not one using macros.
DIS ballot comment AT-107 asks for the word "count" to be added to the
names of the stdc_leading_zeros, stdc_leading_ones,
stdc_trailing_zeros and stdc_trailing_ones functions and macros. I
don't think it's likely to be accepted (accepting any technical
comments would mean having an FDIS ballot), but if it is accepted at
the WG14 meeting (22-26 January in Strasbourg, starting with DIS
ballot comment handling) then there would still be time to update
glibc for the renaming before the 2.39 release.
The new functions and header are placed in the stdlib/ directory in
glibc, rather than creating a new toplevel stdbit/ or putting them in
string/ alongside ffs.
Tested for x86_64 and x86.
Includes test for setcontext too.
The test directly checks after longjmp if ZA got disabled and the
ZA contents got saved following the lazy saving scheme. It does not
use ACLE code to verify that gcc can interoperate with glibc.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
For the ZA lazy saving scheme to work, setcontext has to call
__libc_arm_za_disable.
Also fixes swapcontext which uses setcontext internally.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
For the ZA lazy saving scheme to work, longjmp has to call
__libc_arm_za_disable.
In ld.so we assume ZA is not used so longjmp does not need
special support there.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The runtime support routines for the call ABI of the Scalable Matrix
Extension (SME) are mostly in libgcc. Since libc.so cannot depend on
libgcc_s.so have an implementation of __arm_za_disable in libc for
libc internal use in longjmp and similar APIs.
__libc_arm_za_disable follows the same PCS rules as __arm_za_disable,
but it's a hidden symbol so it does not need variant PCS marking.
Using __libc_fatal instead of abort because it can print a message and
works in ld.so too. But for now we don't need SME routines in ld.so.
To check the SME HWCAP in asm, we need the _dl_hwcap2 member offset in
_rtld_global_ro in the shared libc.so, while in libc.a the _dl_hwcap2
object is accessed.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The multibyte character needs to fit into the remaining buffer space,
not the already-written buffer space. Without the fix, we were never
moving the write pointer from the start of the buffer, always using
the single-character fallback buffer.
Fixes commit 04b76b5aa8 ("Don't error out writing
a multibyte character to an unbuffered stream (bug 17522)").
Seeing occasional failures in `__strchrnul_evex512` that are not
consistently reproducible. Hopefully by adding this the next failure
will provide enough information to debug.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
Verify that setjmp and longjmp work correctly between user contexts.
Arrange stacks for uctx_func1 and uctx_func2 so that ____longjmp_chk
works when setjmp and longjmp are called from different user contexts.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
When shadow stack is enabled, some CET tests failed when compiled with
GCC 14:
FAIL: elf/tst-cet-legacy-4
FAIL: elf/tst-cet-legacy-5a
FAIL: elf/tst-cet-legacy-6a
which are caused by
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113039
These tests use -fcf-protection -fcf-protection=branch and assume that
-fcf-protection=branch will override -fcf-protection. But this GCC 14
commit:
https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;h=1c6231c05bdcca
changed the -fcf-protection behavior such that
-fcf-protection -fcf-protection=branch
is treated the same as
-fcf-protection
Use
-fcf-protection -fcf-protection=none -fcf-protection=branch
as the workaround. This fixes BZ #31187.
Tested with GCC 13 and GCC 14 on Intel Tiger Lake.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
* posix/regex.c: [!_LIBC && __GNUC_PREREQ (4, 3)]:
Omit GCC pragmas no longer needed when this file is used as part of Gnulib.
-Wold-style-definition no longer needs to be ignored because the regex
code no longer uses old style definitions. -Wtype-limits no longer
needs to be ignored because Gnulib already arranges for it to be
ignored in the C compiler flags. This patch is taken from Gnulib.
I've updated copyright dates in glibc for 2024. This is the patch for
the changes not generated by scripts/update-copyrights and subsequent
build / regeneration of generated files.
Not all CET enabled applications and libraries have been properly tested
in CET enabled environments. Some CET enabled applications or libraries
will crash or misbehave when CET is enabled. Don't set CET active by
default so that all applications and libraries will run normally regardless
of whether CET is active or not. Shadow stack can be enabled by
$ export GLIBC_TUNABLES=glibc.cpu.hwcaps=SHSTK
at run-time if shadow stack can be enabled by kernel.
NB: This commit can be reverted if it is OK to enable CET by default for
all applications and libraries.
Initially, IBT and SHSTK are marked as active when CPU supports them
and CET are enabled in glibc. They can be disabled early by tunables
before relocation. Since after relocation, GLRO(dl_x86_cpu_features)
becomes read-only, we can't update GLRO(dl_x86_cpu_features) to mark
IBT and SHSTK as inactive. Instead, check the feature_1 field in TCB
to decide if IBT and SHST are active.
Previously, CET was enabled by kernel before passing control to user
space and the startup code must disable CET if applications or shared
libraries aren't CET enabled. Since the current kernel only supports
shadow stack and won't enable shadow stack before passing control to
user space, we need to enable shadow stack during startup if the
application and all shared library are shadow stack enabled. There
is no need to disable shadow stack at startup. Shadow stack can only
be enabled in a function which will never return. Otherwise, shadow
stack will underflow at the function return.
1. GL(dl_x86_feature_1) is set to the CET features which are supported
by the processor and are not disabled by the tunable. Only non-zero
features in GL(dl_x86_feature_1) should be enabled. After enabling
shadow stack with ARCH_SHSTK_ENABLE, ARCH_SHSTK_STATUS is used to check
if shadow stack is really enabled.
2. Use ARCH_SHSTK_ENABLE in RTLD_START in dynamic executable. It is
safe since RTLD_START never returns.
3. Call arch_prctl (ARCH_SHSTK_ENABLE) from ARCH_SETUP_TLS in static
executable. Since the start function using ARCH_SETUP_TLS never returns,
it is safe to enable shadow stack in ARCH_SETUP_TLS.
Sync with Linux kernel 6.6 shadow stack interface. Since only x86-64 is
supported, i386 shadow stack codes are unchanged and CET shouldn't be
enabled for i386.
1. When the shadow stack base in TCB is unset, the default shadow stack
is in use. Use the current shadow stack pointer as the marker for the
default shadow stack. It is used to identify if the current shadow stack
is the same as the target shadow stack when switching ucontexts. If yes,
INCSSP will be used to unwind shadow stack. Otherwise, shadow stack
restore token will be used.
2. Allocate shadow stack with the map_shadow_stack syscall. Since there
is no function to explicitly release ucontext, there is no place to
release shadow stack allocated by map_shadow_stack in ucontext functions.
Such shadow stacks will be leaked.
3. Rename arch_prctl CET commands to ARCH_SHSTK_XXX.
4. Rewrite the CET control functions with the current kernel shadow stack
interface.
Since CET is no longer enabled by kernel, a separate patch will enable
shadow stack during startup.
Code is mostly inspired from the LoongArch one, which has a similar ABI,
with minor changes to support riscv32 and register differences.
This fixes elf/tst-sprof-basic. This also fixes elf/tst-audit1,
elf/tst-audit2 and elf/tst-audit8 with recent binutils snapshots when
--enable-bind-now is used.
Resolves: BZ #31151
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Similar to other printf-like ones. It requires to be in a different
process so we can change the orientation of stdout.
Checked on aarch64, armhf, x86_64, and i686.
It requires to be in a container tests to avoid logging bogus
information on the system. The syslog also requires to be checked in
a different process because the internal printf call will abort with
the internal syslog lock taken (which makes subsequent syslog calls
deadlock).
Checked on aarch64, armhf, x86_64, and i686.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
The fortify wrappers for varargs functions already add fallbacks to
builtins calls if __va_arg_pack is not supported.
Checked on aarch64, armhf, x86_64, and i686.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
_dl_tlsdesc_undefweak and _dl_tlsdesc_dynamic access the thread pointer
via the tcb field in TCB:
_dl_tlsdesc_undefweak:
_CET_ENDBR
movq 8(%rax), %rax
subq %fs:0, %rax
ret
_dl_tlsdesc_dynamic:
...
subq %fs:0, %rax
movq -8(%rsp), %rdi
ret
Since the tcb field in TCB is a pointer, %fs:0 is a 32-bit location,
not 64-bit. It should use "sub %fs:0, %RAX_LP" instead. Since
_dl_tlsdesc_undefweak returns ptrdiff_t and _dl_make_tlsdesc_dynamic
returns void *, RAX_LP is appropriate here for x32 and x86-64. This
fixes BZ #31185.
On x32, I got
FAIL: elf/tst-tlsgap
$ gdb elf/tst-tlsgap
...
open tst-tlsgap-mod1.so
Thread 2 "tst-tlsgap" received signal SIGSEGV, Segmentation fault.
[Switching to LWP 2268754]
_dl_tlsdesc_dynamic () at ../sysdeps/x86_64/dl-tlsdesc.S:108
108 movq (%rsi), %rax
(gdb) p/x $rsi
$4 = 0xf7dbf9005655fb18
(gdb)
This is caused by
_dl_tlsdesc_dynamic:
_CET_ENDBR
/* Preserve call-clobbered registers that we modify.
We need two scratch regs anyway. */
movq %rsi, -16(%rsp)
movq %fs:DTV_OFFSET, %rsi
Since the dtv field in TCB is a pointer, %fs:DTV_OFFSET is a 32-bit
location, not 64-bit. Load the dtv field to RSI_LP instead of rsi.
This fixes BZ #31184.
No bug because this is not visible if glibc is built with
optimization. Otherwise this would be a critical resource leak.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
In permissive mode, don't disable IBT nor SHSTK when dlopening a legacy
shared library if not single threaded since IBT and SHSTK may be still
enabled in other threads. Other threads with IBT or SHSTK enabled will
crash when calling functions in the legacy shared library. Instead, an
error will be issued.
Improve readability and make maintenance easier for dl-feature.c by
modularizing sysdeps/x86/dl-cet.c:
1. Support processors with:
a. Only IBT. Or
b. Only SHSTK. Or
c. Both IBT and SHSTK.
2. Lock CET features only if IBT or SHSTK are enabled and are not
enabled permissively.
This is a minimal regression test for bug 29039 which only affects
targets with TLSDESC and a reproducer requires that
1) Have modid gaps (closed modules) with old generation.
2) Update a DTV to a newer generation (needs a newer dlopen).
3) But do not update the closed gap entry in that DTV.
4) Reuse the modid gap for a new module (another dlopen).
5) Use dynamic TLSDESC in that new module with old generation (bug).
6) Access TLS via this TLSDESC and the now outdated DTV.
However step (3) in practice rarely happens: during DTV update the
entries for closed modids are initialized to "unallocated" and then
dynamic TLSDESC calls __tls_get_addr independently of its generation.
The only exception to this is DTV setup at thread creation (gaps are
initialized to NULL instead of unallocated) or DTV resize where the
gap entries are outside the previous DTV array (again NULL instead
of unallocated, and this requires loading > DTV_SURPLUS modules).
So the bug can only cause NULL (+ offset) dereference, not use after
free. And the easiest way to get (3) is via thread creation.
Note that step (5) requires that the newly loaded module has larger
TLS than the remaining optional static TLS. And for (6) there cannot
be other TLS access or dlopen in the thread that updates the DTV.
Tested on aarch64-linux-gnu.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Added annotations for autovec by GCC and GFortran - this enables GCC
>= 9 to autovectorise math calls at -Ofast.
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
Compilers may emit calls to 'half-width' routines (two-lane
single-precision variants). These have been added in the form of
wrappers around the full-width versions, where the low half of the
vector is simply duplicated. This will perform poorly when one lane
triggers the special-case handler, as there will be a redundant call
to the scalar version, however this is expected to be rare at Ofast.
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
If /tmp is mounted nosuid and make xcheck is run,
then tst-env-setuid fails UNSUPPORTED with "SGID failed: GID and EGID match"
and /var/tmp/tst-sonamemove-runmod1.so.profile is created.
If you then try to rerun the test with a suid mounted test-dir
(the SGID binary is created in test-dir which defaults to /tmp)
with something like that:
make tst-env-setuid-ENV="TMPDIR=..." t=elf/tst-env-setuid test
the test fails as the LD_PROFILE output file is still available
from the previous run.
Thus this patch removes the LD_PROFILE output file in parent
before spawning the SGID binary.
Even if LD_PROFILE is not supported anymore in static binaries,
use a different library and thus output file for tst-env-setuid
and tst-env-setuid-static in order to not interfere if both
tests are run in parallel.
Furthermore the checks in test_child are now more verbose.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
When _FORTIFY_SOURCE is defined to 2, ____longjmp_chk is called,
instead of longjmp. ____longjmp_chk compares the relative stack
values to decide if it is called from a stack frame which called
setjmp. If not, ____longjmp_chk assumes that an alternate signal
stack is used. Since comparing the relative stack values isn't
reliable with user context, when there is no signal, ____longjmp_chk
will fail. Undefine _FORTIFY_SOURCE to avoid ____longjmp_chk in
user context test.
The expression
(excepts & FE_ALL_EXCEPT) << 27
produces a signed integer overflow when 'excepts' is specified as
FE_INVALID (= 0x10), because
- excepts is of type 'int',
- FE_ALL_EXCEPT is of type 'int',
- thus (excepts & FE_ALL_EXCEPT) is (int) 0x10,
- 'int' is 32 bits wide.
The patched code produces the same instruction sequence as
previosuly.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
It clears some exception flags that are outside the EXCEPTS argument.
It fixes math/test-fexcept on qemu-user.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
libc_feupdateenv_riscv should check for FE_DFL_ENV, similar to
libc_fesetenv_riscv.
Also extend the test-fenv.c to test fenvupdate.
Checked on riscv under qemu-system.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
According to ISO C23 (7.6.4.4), fesetexcept is supposed to set
floating-point exception flags without raising a trap (unlike
feraiseexcept, which is supposed to raise a trap if feenableexcept
was called with the appropriate argument).
The flags can be set in the 387 unit or in the SSE unit. When we need
to clear a flag, we need to do so in both units, due to the way
fetestexcept is implemented.
When we need to set a flag, it is sufficient to do it in the SSE unit,
because that is guaranteed to not trap. However, on i386 CPUs that have
only a 387 unit, set the flags in the 387, as long as this cannot trap.
Co-authored-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
According to ISO C23 (7.6.4.4), fesetexcept is supposed to set
floating-point exception flags without raising a trap (unlike
feraiseexcept, which is supposed to raise a trap if feenableexcept
was called with the appropriate argument).
The flags can be set in the 387 unit or in the SSE unit. To set
a flag, it is sufficient to do it in the SSE unit, because that is
guaranteed to not trap. However, on i386 CPUs that have only a
387 unit, set the flags in the 387, as long as this cannot trap.
Checked on i686-linux-gnu.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
According to ISO C23 (7.6.4.4), fesetexcept is supposed to set
floating-point exception flags without raising a trap (unlike
feraiseexcept, which is supposed to raise a trap if feenableexcept was
called with the appropriate argument).
This is a side-effect of how we implement the GNU extension
feenableexcept, where feenableexcept/fesetenv/fesetmode/feupdateenv
might issue prctl (PR_SET_FPEXC, PR_FP_EXC_PRECISE) depending of the
argument. And on PR_FP_EXC_PRECISE, setting a floating-point exception
flag triggers a trap.
To make the both functions follow the C23, fesetexcept and
fesetexceptflag now fail if the argument may trigger a trap.
The math tests now check for an value different than 0, instead
of bail out as unsupported for EXCEPTION_SET_FORCES_TRAP.
Checked on powerpc64le-linux-gnu.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
The loader now warns for invalid and out-of-range tunable values. The
patch also fixes the parsing of size_t maximum values, where
_dl_strtoul was failing for large values close to SIZE_MAX.
Checked on x86_64-linux-gnu.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
The tunable parsing duplicates the tunable environment variable so it
null-terminates each one since it simplifies the later parsing. It has
the drawback of adding another point of failure (__minimal_malloc
failing), and the memory copy requires tuning the compiler to avoid mem
operations calls.
The parsing now tracks the tunable start and its size. The
dl-tunable-parse.h adds helper functions to help parsing, like a strcmp
that also checks for size and an iterator for suboptions that are
comma-separated (used on hwcap parsing by x86, powerpc, and s390x).
Since the environment variable is allocated on the stack by the kernel,
it is safe to keep the references to the suboptions for later parsing
of string tunables (as done by set_hwcaps by multiple architectures).
Checked on x86_64-linux-gnu, powerpc64le-linux-gnu, and
aarch64-linux-gnu.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Since GCC commit f31a019d1161ec78846473da743aedf49cca8c27 "Emit
funcall external declarations only if actually used.", the glibc
testsuite has failed to build for 32-bit SPARC with GCC mainline.
/scratch/jmyers/glibc-bot/install/compilers/sparc64-linux-gnu/lib/gcc/sparc64-glibc-linux-gnu/14.0.0/../../../../sparc64-glibc-linux-gnu/bin/ld: /scratch/jmyers/glibc-bot/install/compilers/sparc64-linux-gnu/lib/gcc/sparc64-glibc-linux-gnu/14.0.0/32/libgcc.a(_divsi3.o): in function `.div':
/scratch/jmyers/glibc-bot/src/gcc/libgcc/config/sparc/lb1spc.S:138: multiple definition of `.div'; /scratch/jmyers/glibc-bot/build/glibcs/sparcv9-linux-gnu/glibc/libc.a(sdiv.o):/scratch/jmyers/glibc-bot/src/glibc/gnulib/../sysdeps/sparc/sparc32/sparcv9/sdiv.S:13: first defined here
/scratch/jmyers/glibc-bot/install/compilers/sparc64-linux-gnu/lib/gcc/sparc64-glibc-linux-gnu/14.0.0/../../../../sparc64-glibc-linux-gnu/bin/ld: disabling relaxation; it will not work with multiple definitions
collect2: error: ld returned 1 exit status
make[3]: *** [../Rules:298: /scratch/jmyers/glibc-bot/build/glibcs/sparcv9-linux-gnu/glibc/nptl/tst-cancel24-static] Error 1
https://sourceware.org/pipermail/libc-testresults/2023q4/012154.html
I'm not sure of the exact sequence of undefined references that cause
first the glibc object file defining .div and then the libgcc object
file defining both .div and .udiv to be pulled in (which must have
been perturbed by that GCC change in a way that introduced the build
failure), but I think the failure illustrates that it's inherently
fragile for glibc to define symbols in separate object files that
libgcc defines in the same object file - and indeed for glibc to
redefine libgcc symbols at all, since the division into object files
shouldn't really be part of the interface between libgcc and libc.
These symbols appear to be in libc only for compatibility, maybe one
of the cases where they were accidentally exported from shared libc in
glibc 2.0 before the introduction of symbol versioning and so programs
started expecting shared libc to provide them. Thus, there is no need
to have them in static libc. Add this set of libgcc functions to
shared-only-routines so they are no longer provided in static libc.
(No change is made regarding .mul - dotmul source file - since unlike
the other symbols in this grouping, it doesn't actually appear to be a
libgcc symbol, at least in current GCC.)
Tested with build-many-glibcs.py for sparcv9-linux-gnu with GCC
mainline.
Verify that legacy shadow stack code in .init_array section in application
and shared library, which are marked as shadow stack enabled, will trigger
segfault.
So far if the ucontext structure was obtained by getcontext and co,
the return address was stored in general purpose register 14 as
it is defined as return address in the ABI.
In contrast, the context passed to a signal handler contains the address
in psw.addr field.
If somebody e.g. wants to dump the address of the context, the origin
needs to be known.
Now this patch adjusts getcontext and friends and stores the return address
also in psw.addr field.
Note that setcontext isn't adjusted and it is not supported to pass a
ucontext structure from signal-handler to setcontext. We are not able to
restore all registers and branching to psw.addr without clobbering one
register.
This commit uses a common implementation 'strlen-evex-base.S' for both
'strlen-evex' and 'strlen-evex512'
The motivation is to reduce the number of implementations to maintain.
This incidentally gives a small performance improvement.
All tests pass on x86.
Benchmarks were taken on SKX.
https://www.intel.com/content/www/us/en/products/sku/123613/intel-core-i97900x-xseries-processor-13-75m-cache-up-to-4-30-ghz/specifications.html
Geometric mean for strlen-evex512 over all benchmarks (N=10) was (new/old) 0.939
Geometric mean for wcslen-evex512 over all benchmarks (N=10) was (new/old) 0.965
Code Size Changes:
strlen-evex512.S : +24 bytes
wcslen-evex512.S : +54 bytes
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
Since shadow stack (SHSTK) is enabled in the Linux kernel without
enabling indirect branch tracking (IBT), don't assume that SHSTK
implies IBT. Use "CPU_FEATURE_ACTIVE (IBT)" to check if IBT is active
and "CPU_FEATURE_ACTIVE (SHSTK)" to check if SHSTK is active.
Hello! I am Indonesian, was born and raised in Indonesia and still do live in
Indonesia.
This patch brings a few changes to the time locales of id_ID, which
includes :
\- Defining am_pm and time_fmpt_ampm
\- Changing time_fmt and d_t_fmt to use the 24-hour format
\- Changing first_weekday to Monday
This is a squashed version of what is previously a 5 patch set
Here are reasons and details of the changes :
Change 1 part 1
id_ID: Define `am_pm` string
Current formatting does not define am_pm string, leading to AM and PM
not being specified in 12 H time format. This change defines the string
by changing it from an empty string to "AM";"PM".
output of `date +%r`:
before commit: 01:23
after commit: 01:23 PM
Change 1 part 2
id_ID: Define time_fmt_ampm, change from an empty string
Currently, time_fmpt_ampm is set to an empty string, causing some
programs to not be able to display time in the 12-hour format, for
example, glib: https://gitlab.gnome.org/GNOME/glib/-/issues/2967.
This commit changes it from an empty string to "%I:%M:%S %p"
Change 2 part 1
id_ID: Use 24-hour format for time_fmt
Indonesian standard and formal time format uses the 24-hour format inst-
ead of the 12-hour format. This commit aims to change the id_ID locale's
time_fmt to match that accordingly.
Change 2 part 2
id_ID: Use 24-hour format for d_t_fmt.
Indonesian standard and formal time format uses the 24-hour format inst-
ead of the 12-hour format. This commit aims to change the id_ID locale's
d_t_fmt to match that accordingly.
Change 3
id_ID: Change first_weekday to monday
Indonesian calendar starts of the week with Monday, let's comply
Message-ID: <20230821035530.9075-1-rushing27alien@gmail.com>
Resolves: BZ # 30412
Reviewed-by: Mike Fabian <mfabian@redhat.com>
This patch reserves space for HWCAP3/HWCAP4 in the TCB of powerpc.
These hardware capabilities bits will be used by future Power
architectures.
Versioned symbol '__parse_hwcap_3_4_and_convert_at_platform' advertises
the availability of the new HWCAP3/HWCAP4 data in the TCB.
This is an ABI change for GLIBC 2.39.
Suggested-by: Peter Bergner <bergner@linux.ibm.com>
Reviewed-by: Peter Bergner <bergner@linux.ibm.com>
Current implementation of strcmp for power10 has
performance regression for multiple small sizes
and alignment combination.
Most of these performance issues are fixed by this
patch. The compare loop is unrolled and page crosses
of unrolled loop is handled.
Thanks to Paul E. Murphy for helping in fixing the
performance issues.
Signed-off-by: Amrita H S <amritahs@linux.vnet.ibm.com>
Co-Authored-By: Paul E. Murphy <murphyp@linux.ibm.com>
Reviewed-by: Rajalakshmi Srinivasaraghavan <rajis@linux.ibm.com>
Optimized memchr for POWER10 based on existing rawmemchr and strlen.
Reordering instructions and loop unrolling helped in getting better performance.
Reviewed-by: Rajalakshmi Srinivasaraghavan <rajis@linux.ibm.com>
The previous commit was incomplete: gettext() still returns a translation
if the file /usr/share/locale/C/LC_MESSAGES/<domain>.mo exists. This patch
prohibits the translation also in this case.
* gettext-runtime/intl/dcigettext.c (DCIGETTEXT): Treat C.<encoding> locale
like the C locale.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
After refactoring the alloca usage in 40c0add7d4 ("resolve: Remove
__res_context_query alloca usage") a few unaligned accesses to HEADER
fields surfaced. These unaligned accesses led to problems when running
the resolv test suite on sparc32-linux (leon) as many tests failed due to
SIGBUS crashes.
The issue(s) occured during T_QUERY_A_AND_AAAA queries as the second query
now can start on an unaligned address (previously it was explicitly aligned).
With this patch the unaligned accesses are now fixed by using the
UHEADER instead to ensure the fields are accessed with byte
loads/stores.
The patch has been verfied by running the resolv test suite on sparc32
and x86_64.
Signed-off-by: Ludwig Rydberg <ludwig.rydberg@gaisler.com>
Signed-off-by: Andreas Larsson <andreas@gaisler.com>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
The PT_GNU_PROPERTY segment is scanned before PT_NOTE. For binaries
with the PT_GNU_PROPERTY segment, we can check it to avoid scan of
the PT_NOTE segment.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
GLRO(dl_lazy) is used to set the parameters for the early
_dl_relocate_object call, so the consider_profiling setting has to
be applied before the call.
Fixes commit 78ca44da01 ("elf: Relocate
libc.so early during startup and dlmopen (bug 31083)").
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
One of the requirements to becoming a CVE Numbering Authority (CNA) is
to publish advisories. Do this by maintaining a file for each CVE fixed
in the advisories directory in the source tree. Links to the advisories
can then be shared as:
https://sourceware.org/git/?p=glibc.git;a=blob_plain;f=advisories/GLIBC-SA-YYYY-NNNN
The file format at the moment is rudimentary and derives from the git
commit format, i.e. a subject line and a potentially multi-paragraph
description and then tags to describe some meta information. This is a
loose format at the moment and could change as we evolve this.
Also add a script process-fixed-cves.sh that processes these advisories
and generates a list to add to NEWS at release time.
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
This patch is based on __strcmp_power9 and __strlen_power10.
Improvements from __strcmp_power9:
1. Uses new POWER10 instructions
- This code uses lxvp to decrease contention on load
by loading 32 bytes per instruction.
2. Performance implication
- This version has around 30% better performance on average.
- Performance regression is seen for a specific combination
of sizes and alignments. Some of them is observed without
changes also, while rest may be induced by the patch.
Signed-off-by: Amrita H S <amritahs@linux.vnet.ibm.com>
Reviewed-by: Paul E. Murphy <murphyp@linux.ibm.com>
It splits between process_envvars_secure and process_envvars_default,
with the former used to process arguments for __libc_enable_secure.
It does not have any semantic change, just simplify the code so there
is no need to handle __libc_enable_secure on each len switch.
Checked on x86_64-linux-gnu and aarch64-linux-gnu.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
To avoid any environment variable to change setuid binaries
semantics.
Checked on x86_64-linux-gnu.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Loader already ignores LD_DEBUG, LD_DEBUG_OUTPUT, and
LD_TRACE_LOADED_OBJECTS. Both LD_WARN and LD_VERBOSE are similar to
LD_DEBUG, in the sense they enable additional checks and debug
information, so it makes sense to disable them.
Also add both LD_VERBOSE and LD_WARN on filtered environment variables
for setuid binaries.
Checked on x86_64-linux-gnu.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Call the document a "Security Policy" to disambiguate it from the
security *process* documented in the security page. Also, point to the
security page for bug reporting and CVE assignment.
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
The .cfi_return_column directive changes the return column for the whole
FDE range. But the actual intent is to tell the unwinder that the value
in x30 (lr) now resides in x15 after the move, and that is expressed by
the .cfi_register directive.
New implementation is based on the existing exp/exp2, with different
reduction constants and polynomial. Worst-case error in round-to-
nearest is 0.513 ULP.
The exp/exp2 shared table is reused for exp10 - .rodata size of
e_exp_data increases by 64 bytes.
As for exp/exp2, targets with single-instruction rounding/conversion
intrinsics can use them by toggling TOINT_INTRINSICS=1 and adding the
necessary code to their math_private.h.
Improvements on Neoverse V1 compared to current GLIBC master:
exp10 thruput: 3.3x in [-0x1.439b746e36b52p+8 0x1.34413509f79ffp+8]
exp10 latency: 1.8x in [-0x1.439b746e36b52p+8 0x1.34413509f79ffp+8]
Tested on:
aarch64-linux-gnu (TOINT_INTRINSICS, fma contraction) and
x86_64-linux-gnu (!TOINT_INTRINSICS, no fma contraction)
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
The previous check did not do anything because tmp_ptr already
points before run_ptr due to the way it is initialized.
Fixes commit e4d8117b82
("stdlib: Avoid another self-comparison in qsort").
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
For i686, this change is no op but for x86_64 it forces all inlined port
rights to be 8 bytes long.
Message-ID: <20231124213041.952886-2-flaviocruz@gmail.com>
It is not strictly required by the POSIX, since O_PATH is a Linux
extension, but it is QoI to fail early instead of at readdir. Also
the check is free, since fdopendir already checks if the file
descriptor is opened for read.
Checked on x86_64-linux-gnu.
The linker just concatenates the .init and .fini sections which
results in the complete _init and _fini functions. If needed the
linker adds padding bytes due to an alignment. GNU ld is adding
NOPs, which is fine. But e.g. mold is adding traps which results
in broken _init and _fini functions.
Thus this patch removes the alignment in .init and .fini sections
in crtn.S files.
We keep the 4 byte function alignment in crti.S files. As the
assembler now also outputs the start of _init and _fini functions
as multiples of 4 byte, it perhaps has to fill it. Although GNU as
is using NOPs here, to be sure, we just keep the alignment with
0x07 (=NOPs) at the end of crti.S.
In order to avoid an obvious NOP slide in _fini, this patch also
uses an lg instead of lgr instruction. Then the emitted instructions
needs a multiple of 4 bytes.
Even for explicit large page support, allocation might use mmap without
the hugepage bit set if the requested size is smaller than
mmap_threshold. For this case where mmap is issued, MAP_HUGETLB is set
iff the allocation size is larger than the used large page.
To force such allocations to use large pages, also tune the mmap_threhold
(if it is not explicit set by a tunable). This forces allocation to
follow the sbrk path, which will fall back to mmap (which will try large
pages before galling back to default mmap).
Checked on x86_64-linux-gnu.
Reviewed-by: DJ Delorie <dj@redhat.com>
Tested-by: Zhangfei Gao <zhangfei.gao@linaro.org>
The patch adds two new macros, TUNABLE_GET_DEFAULT and TUNABLE_IS_INITIALIZED,
here the former get the default value with a signature similar to
TUNABLE_GET, while the later returns whether the tunable was set by
the environment variable.
Checked on x86_64-linux-gnu.
Reviewed-by: DJ Delorie <dj@redhat.com>
Tested-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Current code aligns to 2x VEC_SIZE. Aligning to 2x has no affect on
performance other than potentially resulting in an additional
iteration of the loop.
1x maintains aligned stores (the only reason to align in this case)
and doesn't incur any unnecessary loop iterations.
Reviewed-by: Sunil K Pandey <skpgkp2@gmail.com>
_dl_assign_tls_modid() assigns a slotinfo entry for a new module, but
does *not* do anything to the generation counter. The first time this
happens, the generation is zero and map_generation() returns the current
generation to be used during relocation processing. However, if
a slotinfo entry is later reused, it will already have a generation
assigned. If this generation has fallen behind the current global max
generation, then this causes an obsolete generation to be assigned
during relocation processing, as map_generation() returns this
generation if nonzero. _dl_add_to_slotinfo() eventually resets the
generation, but by then it is too late. This causes DTV updates to be
skipped, leading to NULL or broken TLS slot pointers and segfaults.
Fix this by resetting the generation to zero in _dl_assign_tls_modid(),
so it behaves the same as the first time a slot is assigned.
_dl_add_to_slotinfo() will still assign the correct static generation
later during module load, but relocation processing will no longer use
an obsolete generation.
Note that slotinfo entry (aka modid) reuse typically happens after a
dlclose and only TLS access via dynamic tlsdesc is affected. Because
tlsdesc is optimized to use the optional part of static TLS, dynamic
tlsdesc can be avoided by increasing the glibc.rtld.optional_static_tls
tunable to a large enough value, or by LD_PRELOAD-ing the affected
modules.
Fixes bug 29039.
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
This patch adds the TCP_MD5SIG_FLAG_IFINDEX constant from Linux 5.6 to
sysdeps/gnu/netinet/tcp.h and updates struct tcp_md5sig accordingly to
contain the device index.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
This is just a minor optimization. It also makes it more obvious that
_dl_relocate_object can be called multiple times.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
A recent commit, apparently commit
6c6fce572f "elf: Remove /etc/suid-debug
support", resulted in localplt failures for i686-gnu and x86_64-gnu:
Missing required PLT reference: ld.so: __access_noerrno
After that commit, __access_noerrno is actually no longer used at all.
So rather than just removing the localplt expectation for that symbol
for Hurd, completely remove all definitions of and references to that
symbol.
Tested for x86_64, and with build-many-glibcs.py for i686-gnu and
x86_64-gnu.
This restore the 2.33 semantic for arena_get2. It was changed by
11a02b035b to avoid arena_get2 call malloc (back when __get_nproc
was refactored to use an scratch_buffer - 903bc7dcc2). The
__get_nproc was refactored over then and now it also avoid to call
malloc.
The 11a02b035b did not take in consideration any performance
implication, which should have been discussed properly. The
__get_nprocs_sched is still used as a fallback mechanism if procfs
and sysfs is not acessible.
Checked on x86_64-linux-gnu.
Reviewed-by: DJ Delorie <dj@redhat.com>
These were broken by the new atan2 functions, as they were only
set up for univariate functions. Arity is now detected from the
input file - this revealed a mistake that the double-precision
inputs were being used for both single- and double-precision
routines, which is now remedied.
Many applications still rely on this prototype. Rebuilds without
this prototype result in an implicit function declaration, which can
introduce security vulnerabilities due to 32-bit pointer truncation.
The _dl_non_dynamic_init does not parse LD_PROFILE, which does not
enable profile for dlopen objects. Since dlopen is deprecated for
static objects, it is better to remove the support.
It also allows to trim down libc.a of profile support.
Checked on x86_64-linux-gnu.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Loader does not ignore LD_PROFILE in secure-execution mode (different
than man-page states [1]), rather it uses a different path
(/var/profile) and ignore LD_PROFILE_OUTPUT.
Allowing secure-execution profiling is already a non good security
boundary, since it enables different code paths and extra OS access by
the process. But by ignoring LD_PROFILE_OUTPUT, the resulting profile
file might also be acceded in a racy manner since the file name does not
use any process-specific information (such as pid, timing, etc.).
Another side-effect is it forces lazy binding even on libraries that
might be with DF_BIND_NOW.
[1] https://man7.org/linux/man-pages/man8/ld.so.8.html
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Using the memcmp symbol directly allows the compile to inline the
memcmp calls (especially because _dl_tunable_set_hwcaps uses constants
values), generating better code.
Checked with tst-tunables on s390x-linux-gnu (qemu system).
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
The dl-symbol-redir-ifunc.h redirects compiler-generated libcalls to
arch-specific memory implementations to avoid ifunc calls where it is not
yet possible. The memcmp-isa-default-impl.h aims to fix the same issue
by calling the specific memset implementation directly.
Using the memcmp symbol directly allows the compiler to inline the memset
calls (especially because _dl_tunable_set_hwcaps uses constants values),
generating better code.
Checked on x86_64-linux-gnu.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
The strlen might trigger and invalid GOT entry if it used before
the process is self-relocated (for instance on dl-tunables if any
error occurs).
For i386, _dl_writev with PIE requires to use the old 'int $0x80'
syscall mode because the calling the TLS register (gs) is not yet
initialized.
Checked on x86_64-linux-gnu.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Instead of ignoring ill-formatted tunable strings, first, check all the
tunable definitions are correct and then set each tunable value. It
means that partially invalid strings, like "key1=value1:key2=key2=value'
or 'key1=value':key2=value2=value2' do not enable 'key1=value1'. It
avoids possible user-defined errors in tunable definitions.
Checked on x86_64-linux-gnu.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Tunable definitions with more than one '=' on are parsed and enabled,
and any subsequent '=' are ignored. It means that tunables in the form
'tunable=tunable=value' or 'tunable=value=value' are handled as
'tunable=value'. These inputs are likely user input errors, which
should not be accepted.
Checked on x86_64-linux-gnu.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Some environment variables allow alteration of allocator behavior
across setuid boundaries, where a setuid program may ignore the
tunable, but its non-setuid child can read it and adjust the memory
allocator behavior accordingly.
Most library behavior tunings is limited to the current process and does
not bleed in scope; so it is unclear how pratical this misfeature is.
If behavior change across privilege boundaries is desirable, it would be
better done with a wrapper program around the non-setuid child that sets
these envvars, instead of using the setuid process as the messenger.
The patch as fixes tst-env-setuid, where it fail if any unsecvars is
set. It also adds a dynamic test, although it requires
--enable-hardcoded-path-in-tests so kernel correctly sets the setuid
bit (using the loader command directly would require to set the
setuid bit on the loader itself, which is not a usual deployment).
Co-authored-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Checked on x86_64-linux-gnu.
Reviewed-by: DJ Delorie <dj@redhat.com>
The tunable privilege levels were a retrofit to try and keep the malloc
tunable environment variables' behavior unchanged across security
boundaries. However, CVE-2023-4911 shows how tricky can be
tunable parsing in a security-sensitive environment.
Not only parsing, but the malloc tunable essentially changes some
semantics on setuid/setgid processes. Although it is not a direct
security issue, allowing users to change setuid/setgid semantics is not
a good security practice, and requires extra code and analysis to check
if each tunable is safe to use on all security boundaries.
It also means that security opt-in features, like aarch64 MTE, would
need to be explicit enabled by an administrator with a wrapper script
or with a possible future system-wide tunable setting.
Co-authored-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: DJ Delorie <dj@redhat.com>
setuid/setgid process now ignores any glibc tunables, and filters out
all environment variables that might changes its behavior. This patch
also adds GLIBC_TUNABLES, so any spawned process by setuid/setgid
processes should set tunable explicitly.
Checked on x86_64-linux-gnu.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Since malloc debug support moved to a different library
(libc_malloc_debug.so), the glibc.malloc.check requires preloading the
debug library to enable it. It means that suid-debug support has not
been working since 2.34.
To restore its support, it would require to add additional information
and parsing to where to find libc_malloc_debug.so.
It is one thing less that might change AT_SECURE binaries' behavior
due to environment configurations.
Checked on x86_64-linux-gnu.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
The existing logic avoided internal stack overflow. To avoid
a denial-of-service condition with adversarial input, it is necessary
to fall over to heapsort if tail-recursing deeply, too, which does
not result in a deep stack of pending partitions.
The new test stdlib/tst-qsort5 is based on Douglas McIlroy's paper
on this subject.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The previous implementation did not consistently apply the rule that
the child nodes of node K are at 2 * K + 1 and 2 * K + 2, or
that the parent node is at (K - 1) / 2.
Add an internal test that targets the heapsort implementation
directly.
Reported-by: Stepan Golosunov <stepan@golosunov.pp.ru>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
In the insertion phase, we could run off the start of the array if the
comparison function never runs zero. In that case, it never finds the
initial element that terminates the iteration.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Remove the unused 'char *name;' from the example.
Use write instead of putchar to write input as it is read.
Example tested on x86_64 by compiling and running the example.
Tested by building the manual pdf and reviewing the results.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Linux 6.6 (09da082b07bbae1c) added support for fchmodat2, which has
similar semantics as fchmodat with an extra flag argument. This
allows fchmodat to implement AT_SYMLINK_NOFOLLOW and AT_EMPTY_PATH
without the need for procfs.
The syscall is registered on all architectures (with value of 452
except on alpha which is 562, commit 78252deb023cf087).
The tst-lchmod.c requires a small fix where fchmodat checks two
contradictory assertions ('(st.st_mode & 0777) == 2' and
'(st.st_mode & 0777) == 3').
Checked on x86_64-linux-gnu on a 6.6 kernel.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
pool_max_size denotes total allocated rows in pool but possibly not yet
initialized. it's pool_size that represents number of actually occupied
rows hence use it when freeing pool to avoid freeing random addresses.
Signed-off-by: Jan Palus <jpalus@fastmail.com>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Arguments to a memchr call were swapped, causing incorrect skipping
of files.
Files related to dpkg have different names: they actually end in
.dpkg-new and .dpkg-tmp, not .tmp as I mistakenly assumed.
Fixes commit 2aa0974d25 ("elf: ldconfig should skip
temporary files created by package managers").
This ensures that the test still links with a linker that refuses
to create an executable stack marker automatically.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
So that the test is harder to confuse with elf/tst-execstack
(although the tests are supposed to be the same).
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Existing MiG does not support untyped messages and the Hurd will
continue to use typed messages for the foreseeable future.
Message-ID: <ZVmYX6j4pYNUfqn4@jupiter.tail36e24.ts.net>
The `ty` pointer is only set at the end of the loop so that
`msgtl_header.msgt_inline` and `msgtl_header.msgt_deallocate` remain
valid. Also, when deallocating memory, we use the length from the
message directly rather than hard coding mach_port_t since we want to
deallocate any kind of OOL data.
Message-ID: <ZVlGVD6eEN-dXsOr@jupiter.tail36e24.ts.net>
The force_first parameter was ineffective because the dlclose'd
object was not necessarily the first in the maps array. Also
enable force_first handling unconditionally, regardless of namespace.
The initial object in a namespace should be destructed first, too.
The _dl_sort_maps_dfs function had early returns for relocation
dependency processing which broke force_first handling, too, and
this is fixed in this change as well.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The open_path stops if a relative path in search path contains a
component that is a non directory (for instance, if the component
is an existing file).
For instance:
$ cat > lib.c <<EOF
> void foo (void) {}
> EOF
$ gcc -shared -fPIC -o lib.so lib.c
$ cat > main.c <<EOF
extern void foo ();
int main () { foo (); return 0; }
EOF
$ gcc -o main main.c lib.so
$ LD_LIBRARY_PATH=. ./main
$ LD_LIBRARY_PATH=non-existing/path:. ./main
$ LD_LIBRARY_PATH=$(pwd)/main:. ./main
$ LD_LIBRARY_PATH=./main:. ./main
./main: error while loading shared libraries: lib.so: cannot open shared object file: No such file or directory
The invalid './main' should be ignored as a non-existent one,
instead as a valid but non accessible file.
Absolute paths do not trigger this issue because their status are
initialized as 'unknown' and open_path check if this is a directory.
Checked on x86_64-linux-gnu.
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
strrchr-evex-base used `vpcompress{b|d}` in the page cross logic but
was missing the CPU_FEATURE checks for VBMI2 in the
ifunc/ifunc-impl-list.
The fix is either to add those checks or change the logic to not use
`vpcompress{b|d}`. Choosing the latter here so that the strrchr-evex
implementation is usable on SKX.
New implementation is a bit slower, but this is in a cold path so its
probably okay.
Fixes commit a61933fe27 ("sparc: Remove bzero optimization") that
after moving code jumped to the wrong label 4.
Verfied by successfully running string/test-memset on sparc32.
Signed-off-by: Andreas Larsson <andreas@gaisler.com>
Signed-off-by: Ludwig Rydberg <ludwig.rydberg@gaisler.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
When the given options do not include MACH_SEND_INTERRUPT,
_hurd_intr_rpc_mach_msg (aka mach_msg) is not supposed to return
MACH_SEND_INTERRUPTED. In such a case we thus have to retry sending the
message.
This was observed to fix various occurrences of spurious
"(ipc/send) interrupted" errors when running haskell programs.
The latest implementations of memcpy are actually faster than the Falkor
implementations [1], so remove the falkor/phecda ifuncs for memcpy and
the now unused IS_FALKOR/IS_PHECDA defines.
[1] https://sourceware.org/pipermail/libc-alpha/2022-December/144227.html
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Add a specialized memset for the common ZVA size of 64 to avoid the
overhead of reading the ZVA size. Since the code is identical to
__memset_falkor, remove the latter.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Cleanup emag memset - merge the memset_base64.S file, remove
the unused ZVA code (since it is disabled on emag).
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This improves compatibility with applications which assume that qsort
does not invoke the comparison function with equal pointer arguments.
The newly introduced branches should be predictable, as leading to a
call to the comparison function. If the prediction fails, we avoid
calling the function.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The PR_SET_VMA_ANON_NAME support is only enabled through a configurable
kernel switch, mainly because assigning a name to a
anonymous virtual memory area might prevent that area from being
merged with adjacent virtual memory areas.
For instance, with the following code:
void *p1 = mmap (NULL,
1024 * 4096,
PROT_READ | PROT_WRITE,
MAP_PRIVATE | MAP_ANONYMOUS,
-1,
0);
void *p2 = mmap (p1 + (1024 * 4096),
1024 * 4096,
PROT_READ | PROT_WRITE,
MAP_PRIVATE | MAP_ANONYMOUS,
-1,
0);
The kernel will potentially merge both mappings resulting in only one
segment of size 0x800000. If the segment is names with
PR_SET_VMA_ANON_NAME with different names, it results in two mappings.
Although this will unlikely be an issue for pthread stacks and malloc
arenas (since for pthread stacks the guard page will result in
a PROT_NONE segment, similar to the alignment requirement for the arena
block), it still might prevent the mmap memory allocated for detail
malloc.
There is also another potential scalability issue, where the prctl
requires
to take the mmap global lock which is still not fully fixed in Linux
[1] (for pthread stacks and arenas, it is mitigated by the stack
cached and the arena reuse).
So this patch disables anonymous mapping annotations as default and
add a new tunable, glibc.mem.decorate_maps, can be used to enable
it.
[1] https://lwn.net/Articles/906852/
Checked on x86_64-linux-gnu and aarch64-linux-gnu.
Reviewed-by: DJ Delorie <dj@redhat.com>
Add anonymous mmap annotations on loader malloc, malloc when it
allocates memory with mmap, and on malloc arena. The /proc/self/maps
will now print:
[anon: glibc: malloc arena]
[anon: glibc: malloc]
[anon: glibc: loader malloc]
On arena allocation, glibc annotates only the read/write mapping.
Checked on x86_64-linux-gnu and aarch64-linux-gnu.
Reviewed-by: DJ Delorie <dj@redhat.com>
Linux 4.5 removed thread stack annotations due to the complexity of
computing them [1], and Linux added PR_SET_VMA_ANON_NAME on 5.17
as a way to name anonymous virtual memory areas.
This patch adds decoration on the stack created and used by
pthread_create, for glibc crated thread stack the /proc/self/maps will
now show:
[anon: glibc: pthread stack: <tid>]
And for user-provided stacks:
[anon: glibc: pthread user stack: <tid>]
The guard page is not decorated, and the mapping name is cleared when
the thread finishes its execution (so the cached stack does not have any
name associated).
Checked on x86_64-linux-gnu aarch64 aarch64-linux-gnu.
[1] 65376df582
Co-authored-by: Ian Rogers <irogers@google.com>
Reviewed-by: DJ Delorie <dj@redhat.com>
Linux 5.17 added support to naming anonymous virtual memory areas
through the prctl syscall. The __set_vma_name is a wrapper to avoid
optimizing the prctl call if the kernel does not support it.
If the kernel does not support PR_SET_VMA_ANON_NAME, prctl returns
EINVAL. And it also returns the same error for an invalid argument.
Since it is an internal-only API, it assumes well-formatted input:
aligned START, with (START, START+LEN) being a valid memory range,
and NAME with a limit of 80 characters without an invalid one
("\\`$[]").
Reviewed-by: DJ Delorie <dj@redhat.com>
When invoking sem_open with O_CREAT as one of its flags, we'll end up
in the second part of sem_open's "if ((oflag & O_CREAT) == 0 || (oflag
& O_EXCL) == 0)", which means that we don't expect the semaphore file
to exist.
In that part, open_flags is initialized as "O_RDWR | O_CREAT | O_EXCL
| O_CLOEXEC" and there's an attempt to open(2) the file, which will
likely fail because it won't exist. After that first (expected)
failure, some cleanup is done and we go back to the label "try_again",
which lives in the first part of the aforementioned "if".
The problem is that, in that part of the code, we expect the semaphore
file to exist, and as such O_CREAT (this time the flag we pass to
open(2)) needs to be cleaned from open_flags, otherwise we'll see
another failure (this time unexpected) when trying to open the file,
which will lead the call to sem_open to fail as well.
This can cause very strange bugs, especially with OpenMPI, which makes
extensive use of semaphores.
Fix the bug by simplifying the logic when choosing open(2) flags and
making sure O_CREAT is not set when the semaphore file is expected to
exist.
A regression test for this issue would require a complex and cpu time
consuming logic, since to trigger the wrong code path is not
straightforward due the racy condition. There is a somewhat reliable
reproducer in the bug, but it requires using OpenMPI.
This resolves BZ #30789.
See also: https://bugs.launchpad.net/ubuntu/+source/h5py/+bug/2031912
Signed-off-by: Sergio Durigan Junior <sergiodj@sergiodj.net>
Co-Authored-By: Simon Chopin <simon.chopin@canonical.com>
Co-Authored-By: Adhemerval Zanella Netto <adhemerval.zanella@linaro.org>
Fixes: 533deafbdf ("Use O_CLOEXEC in more places (BZ #15722)")
It adds NT_X86_SHSTK (2fab02b25ae7cf5), NT_RISCV_CSR/NT_RISCV_VECTOR
(9300f00439743c4), and NT_LOONGARCH_HW_BREAK/NT_LOONGARCH_HW_WATCH
(1a69f7a161a78ae).
The years of dealing with Binutils, GCC and GDB test results
made the community create good tools for comparison and analysis
of DejaGnu test results. This change allows to use those tools
for Glibc's test results as well.
The motivation for this change is Linaro's pre-commit testers,
which use a modified version of GCC's validate_failures.py
to create test xfail lists with baseline failures and known
flaky tests. See below links for an example xfails file (only
one link is supposed to work at any given time):
- https://ci.linaro.org/job/tcwg_glibc_check--master-arm-build/lastSuccessfulBuild/artifact/artifacts/artifacts.precommit/sumfiles/xfails.xfail/*view*/
- https://ci.linaro.org/job/tcwg_glibc_check--master-arm-build/lastSuccessfulBuild/artifact/artifacts/sumfiles/xfails.xfail/*view*/
Specifacally, this patch changes format of glibc's .sum files from ...
<cut>
FAIL: elf/test1
PASS: string/test2
</cut>
... to ...
<cut>
=== glibc tests ===
Running elf ...
FAIL: elf/test1
Running string ...
PASS: string/test2
</cut>.
And output of "make check" from ...
<cut>
FAIL: elf/test1
</cut>
... to ...
<cut>
FAIL: elf/test1
=== Summary of results ===
1 FAIL
1 PASS
</cut>.
Signed-off-by: Maxim Kuvyrkov <maxim.kuvyrkov@linaro.org>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Cleanup ifuncs. Remove uses of libc_hidden_builtin_def, use ENTRY rather than
ENTRY_ALIGN, remove unnecessary defines and conditional compilation. Rename
strlen_mte to strlen_generic. Remove rtld-memset.
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
Commit 7f602256ab moved the tst-rfc3484*
tests from posix/ to nss/, but didn't correct references to point to
their new subdir when building for mach and arm. This commit fixes
that.
Tested with build-many-glibcs.sh for i686-gnu.
This patch adds a qsort and qsort_r to trigger the worst case
scenario for the quicksort (which glibc current lacks coverage).
The test is done with random input, dfferent internal types (uint8_t,
uint16_t, uint32_t, uint64_t, large size), and with
different set of element numbers.
Checked on x86_64-linux-gnu and i686-linux-gnu.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
This patch removes the mergesort optimization on qsort implementation
and uses the introsort instead. The mergesort implementation has some
issues:
- It is as-safe only for certain types sizes (if total size is less
than 1 KB with large element sizes also forcing memory allocation)
which contradicts the function documentation. Although not required
by the C standard, it is preferable and doable to have an O(1) space
implementation.
- The malloc for certain element size and element number adds
arbitrary latency (might even be worse if malloc is interposed).
- To avoid trigger swap from memory allocation the implementation
relies on system information that might be virtualized (for instance
VMs with overcommit memory) which might lead to potentially use of
swap even if system advertise more memory than actually has. The
check also have the downside of issuing syscalls where none is
expected (although only once per execution).
- The mergesort is suboptimal on an already sorted array (BZ#21719).
The introsort implementation is already optimized to use constant extra
space (due to the limit of total number of elements from maximum VM
size) and thus can be used to avoid the malloc usage issues.
Resulting performance is slower due the usage of qsort, specially in the
worst-case scenario (partialy or sorted arrays) and due the fact
mergesort uses a slight improved swap operations.
This change also renders the BZ#21719 fix unrequired (since it is meant
to fix the sorted input performance degradation for mergesort). The
manual is also updated to indicate the function is now async-cancel
safe.
Checked on x86_64-linux-gnu.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
This patch makes the quicksort implementation to acts as introsort, to
avoid worse-case performance (and thus making it O(nlog n)). It switch
to heapsort when the depth level reaches 2*log2(total elements). The
heapsort is a textbook implementation.
Checked on x86_64-linux-gnu and aarch64-linux-gnu.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
The optimization takes in consideration both the most common elements
are either 32 or 64 bit in size and inputs are aligned to the word
boundary. This is similar to what msort does.
For large buffer the swap operation uses memcpy/mempcpy with a
small fixed size buffer (so compiler might inline the operations).
Checked on x86_64-linux-gnu.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
The prototype is:
void __memswap (void *restrict p1, void *restrict p2, size_t n)
The function swaps the content of two memory blocks P1 and P2 of
len N. Memory overlap is NOT handled.
It will be used on qsort optimization.
Checked on x86_64-linux-gnu and aarch64-linux-gnu.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
All the crypt related functions, cryptographic algorithms, and
make requirements are removed, with only the exception of md5
implementation which is moved to locale folder since it is
required by localedef for integrity protection (libc's
locale-reading code does not check these, but localedef does
generate them).
Besides thec code itself, both internal documentation and the
manual is also adjusted. This allows to remove both --enable-crypt
and --enable-nss-crypt configure options.
Checked with a build for all affected ABIs.
Co-authored-by: Zack Weinberg <zack@owlfolio.org>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
The libcrypt was maked to be phase out on 2.38, and a better project
already exist that provide both compatibility and better API
(libxcrypt). The sparc optimizations add the burden to extra
build-many-glibcs.py configurations.
Checked on sparc64 and sparcv9.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Add support for MOPS in cpu_features and INIT_ARCH. Add ifuncs using MOPS for
memcpy, memmove and memset (use .inst for now so it works with all binutils
versions without needing complex configure and conditional compilation).
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
getnameinfo is an entry points for nss functionality. This commit moves
it from the 'inet' subdirectory to 'nss'. The corresponding Versions
entry is also moved from 'posix' into 'nss'.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
getaddrinfo is an entry point for nss functionality. This commit moves
it from 'sysdeps/posix' to 'nss', gets rid of the stub in 'posix', and
moves all associated tests as well.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The getservby* and getservent* routines are entry points for nss
functionality. This commit moves them from the 'inet' subdirectory to
'nss'.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The getrpcby* and getrpcent* routines are entry points for nss
functionality. This commit moves them from the 'inet' subdirectory to
'nss'. The Versions entries for these routines along with a test,
located in the 'sunrpc' subdirectory, are also moved into 'nss'.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The getprotoby* and getprotoent* routines are entry points for nss
functionality. This commit moves them from the 'inet' subdirectory to
'nss'.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The getnetby* and getnetent* routines are entry points for nss
functionality. This commit moves them from the 'inet' subdirectory to
'nss'.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
These netgroup routines are entry points for nss functionality.
This commit moves them along with netgroup.h from the 'inet'
subdirectory to 'nss', and adjusts any references accordingly.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The gethostby* and gethostent* routines are entry points for nss
functionality. This commit moves them from the 'inet' subdirectory to
'nss'.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
ether_hostton and ether_ntohost are entry points for nss functionality.
This commit moves them from the 'inet' subdirectory to 'nss', and
adjusts any references accordingly.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The aliases routines are entry points for nss functionality. This
commit moves aliases.h and the aliases routines from the 'inet'
subdirectory to 'nss', and adjusts any external references.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The majority of shadow routines are entry points for nss functionality.
This commit removes the 'shadow' subdirectory and moves all
functionality and tests to 'nss'. References to shadow/ are accordingly
changed.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The majority of pwd routines are entry points for nss functionality.
This commit removes the 'pwd' subdirectory and moves all functionality
and tests to 'nss'. References to pwd/ are accordingly changed.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The majority of gshadow routines are entry points for nss functionality.
This commit removes the 'gshadow' subdirectory and moves all
functionality and tests to 'nss'. References to gshadow/ are
accordingly changed.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The majority of grp routines are entry points for nss functionality.
This commit removes the 'grp' subdirectory and moves all nss-relevant
functionality and all tests to 'nss', and the 'setgroups' stub into
'posix' (alongside the 'getgroups' stub). References to grp/ are
accordingly changed. In addition, compat-initgroups.c, a fallback
implementation of initgroups is renamed to initgroups-fallback.c so that
the build system does not confuse it for nss_compat/compat-initgroups.c.
Build time improves very slightly; e.g. down from an average of 45.5s to
44.5s on an 8-thread mobile x86_64 CPU.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
With gcc 13.1 with --enable-fortify-source=2, tst-tcfree3 fails to
build on csky-linux-gnuabiv2 with:
../string/bits/string_fortified.h: In function ‘do_test’:
../string/bits/string_fortified.h:26:8: error: inlining failed in call
to ‘always_inline’ ‘memcpy’: target specific option mismatch
26 | __NTH (memcpy (void *__restrict __dest, const void *__restrict
__src,
| ^~~~~~
../misc/sys/cdefs.h:81:62: note: in definition of macro ‘__NTH’
81 | # define __NTH(fct) __attribute__ ((__nothrow__ __LEAF)) fct
| ^~~
tst-tcfree3.c:45:3: note: called from here
45 | memcpy (c, a, 32);
| ^~~~~~~~~~~~~~~~~
Instead of relying on -O0 to avoid malloc/free to be optimized away,
disable the builtin.
Reviewed-by: DJ Delorie <dj@redhat.com>
When building the testroot, the script runs the newly built ld.so on a
couple of binaries in order to copy over any additional libraries
needed. However, if the dependencies are found in the system cache, it
will be copied over using that path.
This is problematic if the system ld.so and the one built don't have the
exact same search configuration. We encountered this in Ubuntu, where we
build a variant of libc with -fno-omit-frame-pointer for accurate
performance profiling.
This variant is built using a non-standard slibdir to be able to be
co-installed with the default library (e.g. slibdir = /lib/libc6-prof).
Since we have /lib pointing to /usr/lib, any additional dependency
should still be reachable via /usr. However, resolving via the cache
might result in the additional DSOs being copied into $testroot/lib, out
of the search path in the container.
The problem has been triggered by 1d5024f4f0
("support: Build with exceptions and asynchronous unwind tables [BZ #30587]")
which introduced a dependency on libgcc_s.so.1 under some circumstances.
Downstream bug: https://bugs.launchpad.net/ubuntu/+source/glibc/+bug/2031495
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This avoids crashes due to partially written files, after a package
update is interrupted.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The arguments for "expected" and "got" are mismatched. Furthermore
this patch is dumping both values as hex.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
If feenableexcept or fedisableexcept gets excepts=FE_INVALID=0x80
as input, we have a signed left shift: 0x80 << 24 which is not
representable as int and thus is undefined behaviour according to
C standard.
This patch casts excepts as unsigned int before shifting, which is
defined.
For me, the observed undefined behaviour is that the shift is done
with "unsigned"-instructions, which is exactly what we want.
Furthermore, I don't get any exception-flags.
After the fix, the code is using the same instruction sequence as
before.
This reverts commit 6985865bc3.
Reason for revert:
The commit changes the order of ELF destructor calls too much relative
to what applications expect or can handle. In particular, during
process exit and _dl_fini, after the revert commit, we no longer call
the destructors of the main program first; that only happens after
some dlopen'ed objects have been destructed. This robs applications
of an opportunity to influence destructor order by calling dlclose
explicitly from the main program's ELF destructors. A couple of
different approaches involving reverse constructor order were tried,
and none of them worked really well. It seems we need to keep the
dependency sorting in _dl_fini.
There is also an ambiguity regarding nested dlopen calls from ELF
constructors: Should those destructors run before or after the object
that called dlopen? Commit 6985865bc3 used reverse order
of the start of ELF constructor calls for destructors, but arguably
using completion of constructors is more correct. However, that alone
is not sufficient to address application compatibility issues (it
does not change _dl_fini ordering at all).
This patch implements comprehensive tests for strlcat/wcslcat
functions. Tests are mostly derived from strncat test suites
and modified to incorporate strlcat/wcslcat specifications.
Reviewed-by: DJ Delorie <dj@redhat.com>
This patch implements comprehensive tests for strlcpy/wcslcpy
functions. Tests are mostly derived from strncpy test suites
and modified to incorporate strlcpy/wcslcpy specifications.
Reviewed-by: DJ Delorie <dj@redhat.com>
Linux 6.5 adds a constant SCM_PIDFD (recall that the non-uapi
linux/socket.h, where this constant is added, is in fact a header
providing many constants that are part of the kernel/userspace
interface). This shows up that SCM_SECURITY, from the same set of
definitions and added in Linux 2.6.17, is also missing from glibc,
although glibc has the first two constants from this set, SCM_RIGHTS
and SCM_CREDENTIALS; add both missing constants to glibc.
Tested for x86_64.
Linux 6.5 adds a constant AT_HANDLE_FID; add it to glibc. Because
this is a flag for the function name_to_handle_at declared in
bits/fcntl-linux.h, put the flag there rather than alongside other
AT_* flags in (OS-independent) fcntl.h.
Tested for x86_64.
With GCC 14 on 32-bit x86 the compiler emits a maybe-uninitialized
warning:
../sysdeps/ieee754/dbl-64/k_rem_pio2.c: In function '__kernel_rem_pio2':
../sysdeps/ieee754/dbl-64/k_rem_pio2.c:364:20: error: 'fq' may be used uninitialized [-Werror=maybe-uninitialized]
364 | y[0] = fq[0]; y[1] = fq[1]; y[2] = fw;
| ~~^~~
This is similar to the warning that is suppressed in the other branch of
the switch. Help the compiler knowing that the variable is always
initialized, which also makes the suppression obsolete.
For container tests, gdb needs to set the sysroot to the corresponding
testroot.root directory. The assumption was that PIDs < 3 means that
we are running within a container.
Starting with commit 2fe64148a8
"Allow for unpriviledged nested containers", the default is to use
the PID namespace of the parent. Thus support_test_main.c does not
recognize our container anymore.
This patch now assumes that we are running inside a container if
test-container.c has set PID_OUTSIDE_CONTAINER and always uses this
PID independent of having a new PID namespace or not.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The grouping verification only worked for a single-byte thousands
separator. With a multi-byte separator it returned as if no separators
were present. The actual parsing in str_to_mpn will then go wrong when
there are multiple adjacent multi-byte separators in the number.
Notes for future devs:
* Add tools as you find they're needed, with version 0,0
* Bump version when you find an old tool that doesn't work
* Don't add a version just because you know it works
Co-authored-by: Lukasz Majewski <lukma@denx.de>
Co-authored-by: Adhemerval Zanella Netto <adhemerval.zanella@linaro.org>
This commit refactors `strrchr-evex` and `strrchr-evex512` to use a
common implementation: `strrchr-evex-base.S`.
The motivation is `strrchr-evex` needed to be refactored to not use
64-bit masked registers in preperation for AVX10.
Once vec-width masked register combining was removed, the EVEX and
EVEX512 implementations can easily be implemented in the same file
without any major overhead.
The net result is performance improvements (measured on TGL) for both
`strrchr-evex` and `strrchr-evex512`. Although, note there are some
regressions in the test suite and it may be many of the cases that
make the total-geomean of improvement/regression across bench-strrchr
are cold. The point of the performance measurement is to show there
are no major regressions, but the primary motivation is preperation
for AVX10.
Benchmarks where taken on TGL:
https://www.intel.com/content/www/us/en/products/sku/213799/intel-core-i711850h-processor-24m-cache-up-to-4-80-ghz/specifications.html
EVEX geometric_mean(N=5) of all benchmarks New / Original : 0.74
EVEX512 geometric_mean(N=5) of all benchmarks New / Original: 0.87
Full check passes on x86.
* Transpose table layout for improved memory access
* Use half-vector special comparisons for AdvSIMD
* Improve register use near special-case branches
- Due to the presence of a function call, return value would get
mov-d out of x0 in order to facilitate PCS. By moving the final
computation after the branch this can be avoided
Also change SVE routines to use overloaded intrinsics for readability.
Use overloaded intrinsics for readability. Codegen does not
change, however while we're bringing the routines up-to-date with
recent improvements to other routines in AOR it is worth copying
this change over as well.
* Update ULP comment reflecting a new observed max in [-pi/2, pi/2]
* Use the same polynomial in AdvSIMD and SVE, rather than FTRIG instructions
* Improve register use near special-case branch
Also use overloaded intrinsics for SVE.
When -D_FORTIFY_SOURCE=2 was given during compilation,
sprintf and similar functions will check if their
first argument is in read-only memory and exit with
*** %n in writable segment detected ***
otherwise. To check if the memory is read-only, glibc
reads frpm the file "/proc/self/maps". If opening this
file fails due to too many open files (EMFILE), glibc
will now ignore this error.
Fixes [BZ #30932]
Signed-off-by: Volker Weißmann <volker.weissmann@gmx.de>
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Rearrange lists of routines, tests, etc. into one-per-line in
nss/Makefile and sort them using scripts/sort-makefile-lines.py.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Rearrange lists of routines, tests, etc. into one-per-line in
inet/Makefile and sort them using scripts/sort-makefile-lines.py.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
The iconv buffer sizes must not include the \0 string terminator.
And the output termination with *outbufpos = '\0' was OOB.
Consistently use non-null-terminated buffer sizes.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The string parsing routine may end up writing beyond bounds of tunestr
if the input tunable string is malformed, of the form name=name=val.
This gets processed twice, first as name=name=val and next as name=val,
resulting in tunestr being name=name=val:name=val, thus overflowing
tunestr.
Terminate the parsing loop at the first instance itself so that tunestr
does not overflow.
This also fixes up tst-env-setuid-tunables to actually handle failures
correct and add new tests to validate the fix for this CVE.
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
GLIBC_TUNABLES scrubbing happens earlier than envvar scrubbing and some
tunables are required to propagate past setxid boundary, like their
env_alias. Rely on tunable scrubbing to clean out GLIBC_TUNABLES like
before, restoring behaviour in glibc 2.37 and earlier.
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Linux v5.10 added a mount option MS_NOSYMFOLLOW, which was added to
glibc in commit 0ca21427d9.
Add the corresponding statfs/statvfs flag bit, ST_NOSYMFOLLOW.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The bufsize on current Linux build is:
size_t bufsize = (type == 439963904 ? 2 : 1) * (12 + 4 + 255 + 1);
So with upper bound as 544 (2 * (12 + 4 + 255 + 1)). However, it might
increase to 2 * PACKETSIZE later with malloc. The default scratch_buffer
should fullfill the most usual allocation requirement.
Checked on x86_64-linux-gnu and i686-linux-gnu.
Reviewed-by: Joe Simmons-Talbott <josimmon@redhat.com>
Read directly into the mips_abiflags struct rather than reading the
entire segment and using alloca when the passed buffer is not big enough.
Checked with build-many-glibcs.py on mips-linux-gnu
Tested-by: Ying Huang <ying.huang@oss.cipunited.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This commit add support for the new AVX10 cpu features:
https://cdrdv2-public.intel.com/784267/355989-intel-avx10-spec.pdf
We add checks for:
- `AVX10`: Check if AVX10 is present.
- `AVX10_{X,Y,Z}MM`: Check if a given vec class has AVX10 support.
`make check` passes and cpuid output was checked against GNR/DMR on an
emulator.
getaddrinfo doesn't look for any RESOLVER defines for conditional
compilation. Therefore, remove the unnecessary -DRESOLVER build flag in
getaddrinfo's CFLAGS.
Checked on x86_64 for code generation changes; none found.
ISO C2x defines scanf length modifiers wN (for intN_t / int_leastN_t /
uintN_t / uint_leastN_t) and wfN (for int_fastN_t / uint_fastN_t).
Add support for those length modifiers, similar to the printf support
previously added.
Tested for x86_64 and x86.
If the binary to run is 'env', test-containers skips it and adds
any required environment variable on the process envs variables.
This simplifies the required code to spawn new process (no need
to build an env-like program).
However, this is an issue for recursive_remove if there is any
LD_PRELOAD, since test-container will not prepend the loader command
along with required paths. If the required preloaded library can
not be loaded by the system glibc, the 'post-clean rsync' will
eventually fail.
One example is if system glibc does not support DT_RELR and the
built glibc does, the nss/tst-nss-gai-hv2-canonname test fails
with:
../scripts/evaluate-test.sh nss/tst-nss-gai-hv2-canonname $? false false
86_64-linux-gnu/nss/tst-nss-gai-hv2-canonname.test-result
rm: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_ABI_DT_RELR' not
found (required by x86_64-linux-gnu/malloc/libc_malloc_debug.so)
Instead trying to figure out the required loader arguments on how
to spawn the 'rm -rf', replace the command with a nftw call.
Checked on x86_64-linux-gnu.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: Stefan Liebler <stli@linux.ibm.com>
Compilation fails when building with -DNDEBUG after commit a3189f66a5.
Here is the error:
dl-close.c: In function ‘_dl_close_worker’:
dl-close.c:140:22: error: unused variable ‘nloaded’ [-Werror=unused-variable]
140 | const unsigned int nloaded = ns->_ns_nloaded;
Add __attribute_maybe_unused__ for‘nloaded’to fix it.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Now binutils use some E_MIPS_* macros and EF_MIPS_* macros, it is
difficult to decide which style macro we should use when we want
to add new ELF file header flags.
IRIX used to use EF_MIPS_* macros and in elf/elf.h there also has
comments "The following are unofficial names and should not be used".
So we should use EF_MIPS_* to keep same style with the beginning.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
On powerpc, SET_RESTORE_ROUND uses inline assembly to optimize the
prologue get/save/set rounding mode operations for POWER9 and
later by using 'mffscrn' where possible, this was introduced by
commit f1c56cdff0.
GCC version 14 onwards supports builtins as __builtin_set_fpscr_rn
which now returns the FPSCR fields in a double. This feature is
available on Power9 when the __SET_FPSCR_RN_RETURNS_FPSCR__ macro
is defined.
GCC commit ef3bbc69d15707e4db6e2f198c621effb636cc26 adds
this feature.
Changes are done to use __builtin_set_fpscr_rn instead of mffscrn
or mffscrni in __fe_mffscrn(rn).
Suggested-by: Carl Love <cel@us.ibm.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
AT_EMPTY_PATH is a requirement to implement fstat over fstatat,
however it does not prevent the kernel to read the path argument.
It is not an issue, but on x86-64 with SMAP-capable CPUs the kernel is
forced to perform expensive user memory access. After that regular
lookup is performed which adds even more overhead.
Instead, issue the fstat syscall directly on LFS fstat implementation
(32 bit architectures will still continue to use statx, which is
required to have 64 bit time_t support). it should be even a
small performance gain on non x86_64, since there is no need
to handle the path argument.
Checked on x86_64-linux-gnu.
During the review of a GCC analyzer test case, we found most stdio
functions accepting a FILE * argument expect it to be nonnull and just
segfault when the argument is NULL. Add nonnull attribute for them.
fflush and fflush_unlocked are well defined when __stream is NULL so
they are not touched.
For fputs, fgets, fread, fwrite, fprintf, vfprintf, and their unlocked
version, if __stream is empty but there is nothing to read or write,
they did not segfault. But the standard disallow __stream to be empty
here, so nonnull attribute is also added for them. Note that this may
blow up some old code already subtly broken.
Also add __nonnull for _chk variants and __fortify_function versions for
them.
Signed-off-by: Xi Ruoyao <xry111@xry111.site>
Reviewed-by: Alejandro Colomar <alx@kernel.org>
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Remove the unnecessary extra checks for sin (-0.0) from vector sin/sinf,
improving performance. Passes regress.
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
I made it to agree as much as possible with the rules from CLDR (see:
https://github.com/unicode-org/cldr/blob/main/common/collation/th.xml).
It seems to be impossible to follow the CLDR rules
&[before 1]๚<ฯ # should be "variable"
and
&๛<ๆ # should be "variable"
exactly though. These ask for a primary difference in punctuation
characters whose primary weight should be "IGNORE". But using a
secondary differnence instead still sorts the test data correctly and
the previously used collation in th_TH used tertiary differences for
these characters.
There was old localedata/th_TH.in test data in TIS-620 encoding which
was not used (it was not in the localedata/Makefile). I converted this
to UTF-8 and moved it to localedata/th_TH.UTF-8.in and added it to
localedata/Makefile.
Using the existing collation rules in the th_TH locale did not sort that
test file completely correct, I think my new collation rules based on
iso14651_t1 are better.
This patch updates the kernel version in the tests tst-mman-consts.py
and tst-pidfd-consts.py to 6.5. (There are no new constants covered
by these tests in 6.5 that need any other header changes;
tst-mount-consts.py was updated separately along with a header
constant addition.)
Tested with build-many-glibcs.py.
Add support for a no-mathvec flag to gen-auto-libm-tests.c.
Update input test sin (-0.0) to be skipped in vector math libraries and
regenerate testcases.
Reviewed-By: Paul Zimmermann <Paul.Zimmermann@inria.fr>
Unicode 15.1.0 Support: Character encoding, character type info, and
transliteration tables are all updated to Unicode 15.1.0, using
the generator scripts contributed by Mike FABIAN (Red Hat).
Total removed characters in newly generated CHARMAP: 0
Total changed characters in newly generated CHARMAP: 0
Total added characters in newly generated CHARMAP: 627
Total removed characters in newly generated WIDTH: 0
Total changed characters in newly generated WIDTH: 0
Total added characters in newly generated WIDTH: 627
alpha: Added 622 characters in new ctype which were not in old ctype
graph: Added 627 characters in new ctype which were not in old ctype
print: Added 627 characters in new ctype which were not in old ctype
punct: Added 5 characters in new ctype which were not in old ctype
The five characters added to punct are:
2FFC;IDEOGRAPHIC DESCRIPTION CHARACTER SURROUND FROM RIGHT;So;0;ON;;;;;N;;;;;
2FFD;IDEOGRAPHIC DESCRIPTION CHARACTER SURROUND FROM LOWER RIGHT;So;0;ON;;;;;N;;;;;
2FFE;IDEOGRAPHIC DESCRIPTION CHARACTER HORIZONTAL REFLECTION;So;0;ON;;;;;N;;;;;
2FFF;IDEOGRAPHIC DESCRIPTION CHARACTER ROTATION;So;0;ON;;;;;N;;;;;
31EF;IDEOGRAPHIC DESCRIPTION CHARACTER SUBTRACTION;So;0;ON;;;;;N;;;;;
The Unicode announcement blog entry says "[...] adds 627
characters, [...] additions include 622 CJK unified ideographs in
a new block, [...]", so that looks OK. The Unicode
blog mentions "six completely new emoji" but they don't appear here as
they are all sequences and not single code points.
Resolves: BZ #30854
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
When an NSS plugin only implements the _gethostbyname2_r and
_getcanonname_r callbacks, getaddrinfo could use memory that was freed
during tmpbuf resizing, through h_name in a previous query response.
The backing store for res->at->name when doing a query with
gethostbyname3_r or gethostbyname2_r is tmpbuf, which is reallocated in
gethosts during the query. For AF_INET6 lookup with AI_ALL |
AI_V4MAPPED, gethosts gets called twice, once for a v6 lookup and second
for a v4 lookup. In this case, if the first call reallocates tmpbuf
enough number of times, resulting in a malloc, th->h_name (that
res->at->name refers to) ends up on a heap allocated storage in tmpbuf.
Now if the second call to gethosts also causes the plugin callback to
return NSS_STATUS_TRYAGAIN, tmpbuf will get freed, resulting in a UAF
reference in res->at->name. This then gets dereferenced in the
getcanonname_r plugin call, resulting in the use after free.
Fix this by copying h_name over and freeing it at the end. This
resolves BZ #30843, which is assigned CVE-2023-4806.
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
According to glibc strrchr microbenchmark test results, this implementation
could reduce the runtime time as following:
Name Percent of rutime reduced
strrchr-lasx 10%-50%
strrchr-lsx 0%-50%
strrchr-aligned 5%-50%
Generic strrchr is implemented by function strlen + memrchr, the lasx version
will compare with generic strrchr implemented by strlen-lasx + memrchr-lasx,
the lsx version will compare with generic strrchr implemented by strlen-lsx +
memrchr-lsx, the aligned version will compare with generic strrchr implemented
by strlen-aligned + memrchr-generic.
According to glibc strcpy and stpcpy microbenchmark test results(changed
to use generic_strcpy and generic_stpcpy instead of strlen + memcpy),
comparing with the generic version, this implementation could reduce the
runtime as following:
Name Percent of rutime reduced
strcpy-aligned 8%-45%
strcpy-unaligned 8%-48%, comparing with the aligned version, unaligned
version takes less instructions to copy the tail of data
which length is less than 8. it also has better performance
in case src and dest cannot be both aligned with 8bytes
strcpy-lsx 20%-80%
strcpy-lasx 15%-86%
stpcpy-aligned 6%-43%
stpcpy-unaligned 8%-48%
stpcpy-lsx 10%-80%
stpcpy-lasx 10%-87%
This patch adds the MOVE_MOUNT_BENEATH constant from Linux 6.5 to
glibc's sys/mount.h and updates tst-mount-consts.py to reflect these
constants being up to date with that Linux kernel version.
Tested with build-many-glibcs.py.
Without passing alt_dns_packet_buffer, __res_context_search can only
store 2048 bytes (what fits into dns_packet_buffer). However,
the function returns the total packet size, and the subsequent
DNS parsing code in _nss_dns_gethostbyname4_r reads beyond the end
of the stack-allocated buffer.
Fixes commit f282cdbe7f ("resolv: Implement no-aaaa
stub resolver option") and bug 30842.
Linux 6.5 has one new syscall, cachestat, and also enables the
cacheflush syscall for hppa. Update syscall-names.list and regenerate
the arch-syscall.h headers with build-many-glibcs.py update-syscalls.
Tested with build-many-glibcs.py.
Needed since gcc-10 enabled -fno-common by default.
[In use in Gentoo since gcc-10, no problems observed.
Also discussed with and reviewed by Jessica Clarke from
Debian. Andreas]
Bug: https://bugs.gentoo.org/723268
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Signed-off-by: Sergei Trofimovich <slyich@gmail.com>
Signed-off-by: Andreas K. Hüttel <dilfridge@gentoo.org>
Use a fixed size array instead. The maximum number of arguments
is set by macro tricks.
Co-authored-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
IO_VTABLES_LEN is the size of the struct array in bytes, not the number
of __IO_jump_t's in the array. Drops just under 384kb from .rodata on
LP64 machines.
Fixes: 3020f72618 ("libio: Remove the usage of __libc_IO_vtables")
Signed-off-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Tested-by: Florian Weimer <fweimer@redhat.com>
It is a left-over from commit 52a01100ad
("elf: Remove ad-hoc restrictions on dlopen callers [BZ #22787]").
When backporting commmit 6985865bc3
("elf: Always call destructors in reverse constructor order
(bug 30785)"), we can move the l_init_called_next field to this
place, so that the internal GLIBC_PRIVATE ABI does not change.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
The current implementation of dlclose (and process exit) re-sorts the
link maps before calling ELF destructors. Destructor order is not the
reverse of the constructor order as a result: The second sort takes
relocation dependencies into account, and other differences can result
from ambiguous inputs, such as cycles. (The force_first handling in
_dl_sort_maps is not effective for dlclose.) After the changes in
this commit, there is still a required difference due to
dlopen/dlclose ordering by the application, but the previous
discrepancies went beyond that.
A new global (namespace-spanning) list of link maps,
_dl_init_called_list, is updated right before ELF constructors are
called from _dl_init.
In dl_close_worker, the maps variable, an on-stack variable length
array, is eliminated. (VLAs are problematic, and dlclose should not
call malloc because it cannot readily deal with malloc failure.)
Marking still-used objects uses the namespace list directly, with
next and next_idx replacing the done_index variable.
After marking, _dl_init_called_list is used to call the destructors
of now-unused maps in reverse destructor order. These destructors
can call dlopen. Previously, new objects do not have l_map_used set.
This had to change: There is no copy of the link map list anymore,
so processing would cover newly opened (and unmarked) mappings,
unloading them. Now, _dl_init (indirectly) sets l_map_used, too.
(dlclose is handled by the existing reentrancy guard.)
After _dl_init_called_list traversal, two more loops follow. The
processing order changes to the original link map order in the
namespace. Previously, dependency order was used. The difference
should not matter because relocation dependencies could already
reorder link maps in the old code.
The changes to _dl_fini remove the sorting step and replace it with
a traversal of _dl_init_called_list. The l_direct_opencount
decrement outside the loader lock is removed because it appears
incorrect: the counter manipulation could race with other dynamic
loader operations.
tst-audit23 needs adjustments to the changes in LA_ACT_DELETE
notifications. The new approach for checking la_activity should
make it clearer that la_activty calls come in pairs around namespace
updates.
The dependency sorting test cases need updates because the destructor
order is always the opposite order of constructor order, even with
relocation dependencies or cycles present.
There is a future cleanup opportunity to remove the now-constant
force_first and for_fini arguments from the _dl_sort_maps function.
Fixes commit 1df71d32fe ("elf: Implement
force_first handling in _dl_sort_maps_dfs (bug 28937)").
Reviewed-by: DJ Delorie <dj@redhat.com>
Commit 5f828ff824 ("io: Fix F_GETLK, F_SETLK, and F_SETLKW for
powerpc64") fixed an issue with the value of the lock constants on
powerpc64 when not using __USE_FILE_OFFSET64, but it ended-up also
changing the value when using __USE_FILE_OFFSET64 causing an API change.
Fix that by also checking that define, restoring the pre
4d0fe291ae commit values:
Default values:
- F_GETLK: 5
- F_SETLK: 6
- F_SETLKW: 7
With -D_FILE_OFFSET_BITS=64:
- F_GETLK: 12
- F_SETLK: 13
- F_SETLKW: 14
At the same time, it has been noticed that there was no test for io lock
with __USE_FILE_OFFSET64, so just add one.
Tested on x86_64-linux-gnu, i686-linux-gnu and
powerpc64le-unknown-linux-gnu.
Resolves: BZ #30804.
Co-authored-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
And shorten the section/node names a bit, so that the menu
entries become easier to read.
Texinfo 6.5 fails to process the previous structure:
./dynlink.texi:56: warning: node `Dynamic Linker Introspection' is
next for `Dynamic Linker Diagnostics' in sectioning but not in menu
./dynlink.texi:56: warning: node up `Dynamic Linker Diagnostics'
in menu `Dynamic Linker Invocation' and
in sectioning `Dynamic Linker' differ
./dynlink.texi:1: node `Dynamic Linker' lacks menu item for
`Dynamic Linker Diagnostics' despite being its Up target
./dynlink.texi:226: warning: node prev `Dynamic Linker Introspection' in menu `Dynamic Linker Invocation'
and in sectioning `Dynamic Linker Diagnostics' differ
Texinfo 7.0.2 does not report an error.
This fixes commit f21962ddfc
("manual: Document ld.so --list-diagnostics output").
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
XTheadBb has similar instructions like Zbb, which allow optimized
string processing:
* th.ff0: find-first zero is a CLZ instruction.
* th.tstnbz: Similar like orc.b, but with a bit-inverted result.
The instructions are documented here:
https://github.com/T-head-Semi/thead-extension-spec/tree/master/xtheadbb
These instructions can be found in the T-Head C906 and the C910.
Tested with the string tests.
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This code is generally unused in practice since there don't seem to be
any NSS modules that only implement _nss_MOD_gethostbyname2_r and not
_nss_MOD_gethostbyname3_r.
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
This interface allows to obtain the associated process ID from the
process file descriptor. It is done by parsing the procps fdinfo
information. Its prototype is:
pid_t pidfd_getpid (int fd)
It returns the associated pid or -1 in case of an error and sets the
errno accordingly. The possible errno values are those from open, read,
and close (used on procps parsing), along with:
- EBADF if the FD is negative, does not have a PID associated, or if
the fdinfo fields contain a value larger than pid_t.
- EREMOTE if the PID is in a separate namespace.
- ESRCH if the process is already terminated.
Checked on x86_64-linux-gnu on Linux 4.15 (no CLONE_PIDFD or waitid
support), Linux 5.4 (full support), and Linux 6.2.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Returning a pidfd allows a process to keep a race-free handle for a
child process, otherwise, the caller will need to either use pidfd_open
(which still might be subject to TOCTOU) or keep the old racy interface
base on pid_t.
To correct use pifd_spawn, the kernel must support not only returning
the pidfd with clone/clone3 but also waitid (P_PIDFD) (added on Linux
5.4). If kernel does not support the waitid, pidfd return ENOSYS.
It avoids the need to racy workarounds, such as reading the procfs
fdinfo to get the pid to use along with other wait interfaces.
These interfaces are similar to the posix_spawn and posix_spawnp, with
the only difference being it returns a process file descriptor (int)
instead of a process ID (pid_t). Their prototypes are:
int pidfd_spawn (int *restrict pidfd,
const char *restrict file,
const posix_spawn_file_actions_t *restrict facts,
const posix_spawnattr_t *restrict attrp,
char *const argv[restrict],
char *const envp[restrict])
int pidfd_spawnp (int *restrict pidfd,
const char *restrict path,
const posix_spawn_file_actions_t *restrict facts,
const posix_spawnattr_t *restrict attrp,
char *const argv[restrict_arr],
char *const envp[restrict_arr]);
A new symbol is used instead of a posix_spawn extension to avoid
possible issues with language bindings that might track the return
argument lifetime. Although on Linux pid_t and int are interchangeable,
POSIX only states that pid_t should be a signed integer.
Both symbols reuse the posix_spawn posix_spawn_file_actions_t and
posix_spawnattr_t, to void rehash posix_spawn API or add a new one. It
also means that both interfaces support the same attribute and file
actions, and a new flag or file action on posix_spawn is also added
automatically for pidfd_spawn.
Also, using posix_spawn plumbing allows the reusing of most of the
current testing with some changes:
- waitid is used instead of waitpid since it is a more generic
interface.
- tst-posix_spawn-setsid.c is adapted to take into consideration that
the caller can check for session id directly. The test now spawns
itself and writes the session id as a file instead.
- tst-spawn3.c need to know where pidfd_spawn is used so it keeps an
extra file description unused.
Checked on x86_64-linux-gnu on Linux 4.15 (no CLONE_PIDFD or waitid
support), Linux 5.4 (full support), and Linux 6.2.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
These functions allow to posix_spawn and posix_spawnp to use
CLONE_INTO_CGROUP with clone3, allowing the child process to
be created in a different cgroup version 2. These are GNU
extensions that are available only for Linux, and also only
for the architectures that implement clone3 wrapper
(HAVE_CLONE3_WRAPPER).
To create a process on a different cgroupv2, one can use the:
posix_spawnattr_t attr;
posix_spawnattr_init (&attr);
posix_spawnattr_setflags (&attr, POSIX_SPAWN_SETCGROUP);
posix_spawnattr_setcgroup_np (&attr, cgroup);
posix_spawn (...)
Similar to other posix_spawn flags, POSIX_SPAWN_SETCGROUP control
whether the cgroup file descriptor will be used or not with
clone3.
There is no fallback if either clone3 does not support the flag
or if the architecture does not provide the clone3 wrapper, in
this case posix_spawn returns EOPNOTSUPP.
Checked on x86_64-linux-gnu.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
It follows the internal signature:
extern int clone3 (struct clone_args *__cl_args, size_t __size,
int (*__func) (void *__arg), void *__arg);
Checked on mips64el-linux-gnueabihf, mips64el-n32-linux-gnu, and
mipsel-linux-gnu.
It follows the internal signature:
extern int clone3 (struct clone_args *__cl_args, size_t __size,
int (*__func) (void *__arg), void *__arg);
Checked on arm-linux-gnueabihf.
The wiki page https://sourceware.org/glibc/wiki/Proposals/C.UTF-8
says that "Setting LC_ALL=C.UTF-8 will ignore LANGUAGE just like it
does with LC_ALL=C." This patch implements it.
* intl/dcigettext.c (guess_category_value): Treat C.<encoding> locale
like the C locale.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
In short: __tls_get_addr checks the global generation counter and if
the current dtv is older then _dl_update_slotinfo updates dtv up to the
generation of the accessed module. So if the global generation is newer
than generation of the module then __tls_get_addr keeps hitting the
slow dtv update path. The dtv update path includes a number of checks
to see if any update is needed and this already causes measurable tls
access slow down after dlopen.
It may be possible to detect up-to-date dtv faster. But if there are
many modules loaded (> TLS_SLOTINFO_SURPLUS) then this requires at
least walking the slotinfo list.
This patch tries to update the dtv to the global generation instead, so
after a dlopen the tls access slow path is only hit once. The modules
with larger generation than the accessed one were not necessarily
synchronized before, so additional synchronization is needed.
This patch uses acquire/release synchronization when accessing the
generation counter.
Note: in the x86_64 version of dl-tls.c the generation is only loaded
once, since relaxed mo is not faster than acquire mo load.
I have not benchmarked this. Tested by Adhemerval Zanella on aarch64,
powerpc, sparc, x86 who reported that it fixes the performance issue
of bug 19924.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The old Intel software developer manual specified that the low byte of
EAX of CPUID leaf 2 returned 1 which indicated the number of rounds of
CPUDID leaf 2 was needed to retrieve the complete cache information. The
newer Intel manual has been changed to that it should always return 1
and be ignored. If the lower byte isn't 1, CPUID leaf 2 can't be used.
In this case, we ignore CPUID leaf 2 and use CPUID leaf 4 instead. If
CPUID leaf 4 doesn't contain the cache information, cache information
isn't available at all. This addresses BZ #30643.
This patch makes build-many-glibcs.py use the new GMP 6.3.0 and MPFR
4.2.1 releases.
Tested with build-many-glibcs.py (host-libraries, compilers and glibcs
builds).
Add common emojis to the translit-able characters (mostly
faces and hearts), and translit them to old-fashioned
smileys.
Signed-off-by: Colin Leroy-Mira <colin@colino.net>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Previously, if an entry was marked unusable for any reason, but had
not timed out yet, the assert would trigger.
One way to get into such state is if a data change is detected during
re-validation of an entry. This causes the entry to be marked as not
usable. If exits nscd soon after that, then the clock jumps
backwards, and nscd restarted, the cache re-validation run after
startup triggers the removed assert.
The change is more complicated than just the removal of the assert
because entries marked as not usable should be garbage-collected in
the second pass. To make this happen, it is necessary to update some
book-keeping data.
Reviewed-by: DJ Delorie <dj@redhat.com>
According to glibc memcmp microbenchmark test results(Add generic
memcmp), this implementation have performance improvement
except the length is less than 3, details as below:
Name Percent of time reduced
memcmp-lasx 16%-74%
memcmp-lsx 20%-50%
memcmp-aligned 5%-20%
According to glibc memset microbenchmark test results, for LSX and LASX
versions, A few cases with length less than 8 experience performace
degradation, overall, the LASX version could reduce the runtime about
15% - 75%, LSX version could reduce the runtime about 15%-50%.
The unaligned version uses unaligned memmory access to set data which
length is less than 64 and make address aligned with 8. For this part,
the performace is better than aligned version. Comparing with the generic
version, the performance is close when the length is larger than 128. When
the length is 8-128, the unaligned version could reduce the runtime about
30%-70%, the aligned version could reduce the runtime about 20%-50%.
According to glibc memrchr microbenchmark, this implementation could reduce
the runtime as following:
Name Percent of rutime reduced
memrchr-lasx 20%-83%
memrchr-lsx 20%-64%
According to glibc memchr microbenchmark, this implementation could reduce
the runtime as following:
Name Percent of runtime reduced
memchr-lasx 37%-83%
memchr-lsx 30%-66%
memchr-aligned 0%-15%
According to glibc rawmemchr microbenchmark, A few cases tested with
char '\0' experience performance degradation due to the lasx and lsx
versions don't handle the '\0' separately. Overall, rawmemchr-lasx
implementation could reduce the runtime about 40%-80%, rawmemchr-lsx
implementation could reduce the runtime about 40%-66%, rawmemchr-aligned
implementation could reduce the runtime about 20%-40%.
We are requiring Binutils >= 2.41, so explicit relocation syntax is
always supported by the assembler. Use it to reduce one instruction.
Signed-off-by: Xi Ruoyao <xry111@xry111.site>
This patch adds the new F_SEAL_EXEC constant from Linux 6.3 (see Linux
commit 6fd7353829c ("mm/memfd: add F_SEAL_EXEC") to bits/fcntl-linux.h.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Even though the alloca usage is relatively small and fixed size the code
can be written without using alloca. Convert to local variables.
Checked on x86_64-linux-gnu.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Convert to scratch_buffers to avoid potential stack overflow.
Checked on x86_64-linux-gnu and aarch64-linux-gnu.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This patch adds a new macro, M68K_SCALE_AVAILABLE, similar to gmp
scale_available_p (mpn/m68k/m68k-defs.m4) that expand to 1 if a
scale factor can be used in addressing modes. This is used
instead of __mc68020__ for some optimization decisions.
Checked on a build for m68k-linux-gnu target mc68020 and mc68040.
GCC currently does not define __mc68020__ for -mcpu=68040 or higher,
which memcpy/memmove assumptions. Since this memory copy optimization
seems only intended for m68020, disable for other m680X0 variants.
Checked on a build for m68k-linux-gnu target mc68020 and mc68040.
Parts of elf/tst-rtld-list-diagnostics.py have been copied from
scripts/tst-ld-trace.py.
The abnf module is entirely optional and used to verify the
ABNF grammar as included in the manual.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Based on the glibc microbenchmark, only a few short inputs with this
strncmp-aligned and strncmp-lsx implementation experience performance
degradation, overall, strncmp-aligned could reduce the runtime 0%-10%
for aligned comparision, 10%-25% for unaligend comparision, strncmp-lsx
could reduce the runtime about 0%-60%.
Based on the glibc microbenchmark, strcmp-aligned implementation could
reduce the runtime 0%-10% for aligned comparison, 10%-20% for unaligned
comparison, strcmp-lsx implemenation could reduce the runtime 0%-50%.
Based on the glibc microbenchmark, strnlen-aligned implementation could
reduce the runtime more than 10%, strnlen-lsx implementation could reduce
the runtime about 50%-78%, strnlen-lasx implementation could reduce the
runtime about 50%-88%.
The path auxv[*].a_val could either be an integer or a string,
depending on the a_type value. Use a separate field, a_val_string, to
simplify mechanical parsing of the --list-diagnostics output.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
On Skylake, it changes log1p bench performance by:
Before After Improvement
max 63.349 58.347 8%
min 4.448 5.651 -30%
mean 12.0674 10.336 14%
The minimum code path is
if (hx < 0x3FDA827A) /* x < 0.41422 */
{
if (__glibc_unlikely (ax >= 0x3ff00000)) /* x <= -1.0 */
{
...
}
if (__glibc_unlikely (ax < 0x3e200000)) /* |x| < 2**-29 */
{
math_force_eval (two54 + x); /* raise inexact */
if (ax < 0x3c900000) /* |x| < 2**-54 */
{
...
}
else
return x - x * x * 0.5;
FMA and non-FMA code sequences look similar. Non-FMA version is slightly
faster. Since log1p is called by asinh and atanh, it improves asinh
performance by:
Before After Improvement
max 75.645 63.135 16%
min 10.074 10.071 0%
mean 15.9483 14.9089 6%
and improves atanh performance by:
Before After Improvement
max 91.768 75.081 18%
min 15.548 13.883 10%
mean 18.3713 16.8011 8%
When building with fortify enabled, GCC < 12 issues a warning on the
fortify strncat wrapper might overflow the destination buffer (the
failure is tied to -Werror).
Checked on ppc64 and x86_64.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The static PIE configure check uses link tests. When bootstrapping
a cross-toolchain, the link tests fail due to missing crt-files /
libc.so. As we explicitely want to test an issue in binutils (ld),
we now also explicitely check for known linker versions.
See also commit 368b7c614b
S390: Use compile-only instead of also link-tests in configure.
These implementations improve the time to copy data in the glibc
microbenchmark as below:
memcpy-lasx reduces the runtime about 8%-76%
memcpy-lsx reduces the runtime about 8%-72%
memcpy-unaligned reduces the runtime of unaligned data copying up to 40%
memcpy-aligned reduece the runtime of unaligned data copying up to 25%
memmove-lasx reduces the runtime about 20%-73%
memmove-lsx reduces the runtime about 50%
memmove-unaligned reduces the runtime of unaligned data moving up to 40%
memmove-aligned reduces the runtime of unaligned data moving up to 25%
These implementations improve the time to run strchr{nul}
microbenchmark in glibc as below:
strchr-lasx reduces the runtime about 50%-83%
strchr-lsx reduces the runtime about 30%-67%
strchr-aligned reduces the runtime about 10%-20%
strchrnul-lasx reduces the runtime about 50%-83%
strchrnul-lsx reduces the runtime about 36%-65%
strchrnul-aligned reduces the runtime about 6%-10%
SYS_modify_ldt requires CONFIG_MODIFY_LDT_SYSCALL to be set in the kernel, which
some distributions may disable for hardening. Check if that's the case (unset)
and mark the test as UNSUPPORTED if so.
Reviewed-by: DJ Delorie <dj@redhat.com>
Signed-off-by: Sam James <sam@gentoo.org>
All callers pass 1 or 0x11 anyway (same meaning according to man page),
but still.
Reviewed-by: DJ Delorie <dj@redhat.com>
Signed-off-by: Sam James <sam@gentoo.org>
On i686 f_type is an i32 so the test fails when that has the top bit set.
Explicitly cast to u32.
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Commit 78ceef25d6 ("configure: Remove
--enable-all-warnings option") removed it due to a missing +.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
On the test workload (mpv --cache=yes with VP9 video decoding), the
bin scanning has a very poor success rate (less than 2%). The tcache
scanning has about 50% success rate, so keep that.
Update comments in malloc/tst-memalign-2 to indicate the purpose
of the tests. Even with the scanning removed, the additional
merging opportunities since commit 542b110585
("malloc: Enable merging of remainders in memalign (bug 30723)")
are sufficient to pass the existing large bins test.
Remove leftover variables from _int_free from refactoring in the
same commit.
Reviewed-by: DJ Delorie <dj@redhat.com>
On Skylake, it improves expm1 bench performance by:
Before After Improvement
max 70.204 68.054 3%
min 20.709 16.2 22%
mean 22.1221 16.7367 24%
NB: Add
extern long double __expm1l (long double);
extern long double __expm1f128 (long double);
for __typeof (__expm1l) and __typeof (__expm1f128) when __expm1 is
defined since __expm1 may be expanded in their declarations which
causes the build failure.
strlen-lasx is implemeted by LASX simd instructions(256bit)
strlen-lsx is implemeted by LSX simd instructions(128bit)
strlen-align is implemented by LA basic instructions and never use unaligned memory acess
LoongArch glibc can add some LASX/LSX vector instructions codes,
change the required minimum binutils version to 2.41 which could
support vector instructions. HAVE_LOONGARCH_VEC_ASM is removed
accordingly.
The following usage of macro LEAF/ENTRY are all feasible:
1. LEAF(fcn) -- the align value of fcn is .align 3(default value)
2. LEAF(fcn, 6) -- the align value of fcn is .align 6
The:
```
if (shared_per_thread > 0 && threads > 0)
shared_per_thread /= threads;
```
Code was accidentally moved to inside the else scope. This doesn't
match how it was previously (before af992e7abd).
This patch fixes that by putting the division after the `else` block.
Previously, calling _int_free from _int_memalign could put remainders
into the tcache or into fastbins, where they are invisible to the
low-level allocator. This results in missed merge opportunities
because once these freed chunks become available to the low-level
allocator, further memalign allocations (even of the same size are)
likely obstructing merges.
Furthermore, during forwards merging in _int_memalign, do not
completely give up when the remainder is too small to serve as a
chunk on its own. We can still give it back if it can be merged
with the following unused chunk. This makes it more likely that
memalign calls in a loop achieve a compact memory layout,
independently of initial heap layout.
Drop some useless (unsigned long) casts along the way, and tweak
the style to more closely match GNU on changed lines.
Reviewed-by: DJ Delorie <dj@redhat.com>
The nscd daemon caches hosts data from NSS modules verbatim, without
filtering protocol families or sorting them (otherwise separate caches
would be needed for certain ai_flags combinations). The cache
implementation is complete separate from the getaddrinfo code. This
means that rebuilding getaddrinfo is not needed. The only function
actually used is __bump_nl_timestamp from check_pf.c, and this change
moves it into nscd/connections.c.
Tested on x86_64-linux-gnu with -fexceptions, built with
build-many-glibcs.py. I also backported this patch into a distribution
that still supports nscd and verified manually that caching still works.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Since i686 provides the fortified wrappers for memcpy, mempcpy,
memmove, and memset on the same string implementation, the static
build tries to optimized it by not tying the fortified wrappers
to string routine (to avoid pulling the fortify function if
they are not required).
Checked on i686-linux-gnu building with different option:
default and --disable-multi-arch plus default, --disable-default-pie,
--enable-fortify-source={2,3}, and --enable-fortify-source={2,3}
with --disable-default-pie.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
With multiarch disabled, the default memmove implementation provides
the fortify routines for memcpy, mempcpy, and memmove. However, it
does not provide the internal hidden definitions used when building
with fortify enabled. The memset has a similar issue.
Checked on x86_64-linux-gnu building with different options:
default and --disable-multi-arch plus default, --disable-default-pie,
--enable-fortify-source={2,3}, and --enable-fortify-source={2,3}
with --disable-default-pie.
Tested-by: Andreas K. Huettel <dilfridge@gentoo.org>
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Linux 6.4 adds new constants PTRACE_SET_SYSCALL_USER_DISPATCH_CONFIG
and PTRACE_GET_SYSCALL_USER_DISPATCH_CONFIG. Add those to all
relevant sys/ptrace.h headers, along with adding the associated
argument structure to bits/ptrace-shared.h (named struct
__ptrace_sud_config there following the usual convention for such
structures).
Tested for x86_64 and with build-many-glibcs.py.
Making error_t defined to enum __error_t_codes conveniently makes the
debugger print symbolic values, but in C++ int is not interoperable with
enum __error_t_codes, leading to C++ application build issues, so let's
revert error_t to int in C++.
This is the only missing part in struct statvfs.
The LSB calls [f]statfs() deprecated, and its weird types are definitely
off-putting. However, its use is required to get f_type.
Instead, allocate one of the six spares to f_type,
copied directly from struct statfs.
This then becomes a small glibc extension to the standard interface
on Linux and the Hurd, instead of two different interfaces, one of which
is quite odd due to being an ABI type, and there no longer is any reason
to use statfs().
The underlying kernel type is a mess, but all architectures agree on u32
(or more) for the ABI, and all filesystem magicks are 32-bit integers.
We don't lose any generality by using u32, and by doing so we both make
the API consistent with the Hurd, and allow C++
switch(f_type) { case RAMFS_MAGIC: ...; }
Also fix tst-statvfs so that it actually fails;
as it stood, all it did was return 0 always.
Test statfs()' and statvfs()' f_types are the same.
Link: https://lore.kernel.org/linux-man/f54kudgblgk643u32tb6at4cd3kkzha6hslahv24szs4raroaz@ogivjbfdaqtb/t/#u
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
When using jemalloc, malloc() needs to use TSD, while libpthread
initialization needs malloc(). Having ___pthread_self set early to some
static storage allows TSD to work early, thus allowing jemalloc and
libpthread to initialize together.
This incidentaly simplifies __pthread_enable/disable_asynccancel and
__pthread_self, now that ___pthread_self is always initialized.
When using jemalloc, malloc() needs to use TSD, while libpthread
initialization needs malloc(). Supporting a static TSD area allows jemalloc
and libpthread to initialize together.
Some legacy AMD CPUs and hypervisors have the _cpuid_ '0x8000_001D'
set to Zero, thus resulting in zeroed-out computed cache values.
This patch reintroduces the old way of cache computation as a
fail-safe option to handle these exceptions.
Fixed 'level4_cache_size' value through handle_amd().
Reviewed-by: Premachandra Mallappa <premachandra.mallappa@amd.com>
Tested-by: Florian Weimer <fweimer@redhat.com>
04bf7d2d8a ("chk: Add and fix hidden builtin definitions for *_chk")
added an #undef for longjmp and siglongjmp to compensate for the
definition in include/setjmp.h, but missed doing so for the powerpc
version too.
Fixes: 04bf7d2d8a ("chk: Add and fix hidden builtin definitions for
*_chk")
Otherwise on gnu-i686 there are unwanted PLT entries in libc.so when
fortification is enabled.
Tested for i686-gnu, x86_64-gnu, i686-linux-gnu and x86_64-linux-gnu
Posix says that d_name is of unspecified size, and sizeof(d_name)
should not be used. It is indeed only 1-byte long in bits/dirent.h. We
can instead explictly provide the actual allocated size to
__strcpy_chk.
This fixes a hurd/check-installed-headers-c failure with
-std=c89 #define _FORTIFY_SOURCE 1:
In file included from ../hurd/hurd.h:354,
from ../sysdeps/hurd/include/hurd.h:2,
from /tmp/cih_test_9IaUwa.c:10:
/home/bmg/install/compilers/i686-gnu/lib/gcc/i686-glibc-gnu/13.2.1/include/stdarg.h:54:34: error: "__STDC_VERSION__" is not defined, evaluates to 0 [-Werror=undef]
54 | #if !defined(__STRICT_ANSI__) || __STDC_VERSION__ + 0 >= 199900L \
| ^~~~~~~~~~~~~~~~
/home/bmg/install/compilers/i686-gnu/lib/gcc/i686-glibc-gnu/13.2.1/include/stdarg.h:55:8: error: "__cplusplus" is not defined, evaluates to 0 [-Werror=undef]
55 | || __cplusplus + 0 >= 201103L
| ^~~~~~~~~~~
cc1: all warnings being treated as errors
Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Commit 91927b7c76 ("Rewrite iconv option parsing [BZ #19519]") changed the
iconv program to call __gconv_open directly instead of the iconv_open
wrapper, but the former does not set errno. Update the caller to
interpret the return codes like iconv_open does.
The option is not activelly tested and has bitrotten, to fix it
would require a lot of work and multiple fixes. A better option
would to evaluate each option and enable the warning if it makes
sense.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
This patch updates the kernel version in the tests tst-mman-consts.py,
tst-mount-consts.py and tst-pidfd-consts.py to 6.4. (There are no new
constants covered by these tests in 6.4 that need any other header
changes.)
Tested with build-many-glibcs.py.
This patch enables the option to influence hwcaps used by PowerPC.
The environment variable, GLIBC_TUNABLES=glibc.cpu.hwcaps=-xxx,yyy,-zzz....,
can be used to enable CPU/ARCH feature yyy, disable CPU/ARCH feature xxx
and zzz, where the feature name is case-sensitive and has to match the ones
mentioned in the file{sysdeps/powerpc/dl-procinfo.c}.
Note that the hwcap tunables only used in the IFUNC selection.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Avoid potential stack overflow from unbounded alloca. Use the existing
scratch_buffer instead.
Add testcases to exercise the code as suggested by Adhemerval Zanella Netto.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
On GCC before 11, IPA can make the fortified realpath aware that the
buffer size is not large enough (8 bytes instead of PATH_MAX bytes).
Fix this by using a buffer that is large enough.
When building with fortify enabled, GCC 6 issues an warning the fortify
wrapper might overflow the destination buffer. However, GCC does not
provide a specific flag to disable the warning (the failure is tied to
-Werror). So to avoid disable all errors, only enable the check for
GCC 7 or newer.
Checked on i686-linux-gnu.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
On __convert_scm_timestamps GCC 6 issues an warning that tvts[0]/tvts[1]
maybe be used uninitialized, however it would be used if type is set to a
value different than 0 (done by either COMPAT_SO_TIMESTAMP_OLD or
COMPAT_SO_TIMESTAMPNS_OLD) which will fallthrough to 'common' label.
It does not show with gcc 7 or more recent versions.
Checked on i686-linux-gnu.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Similar to memcpy, mempcpy, and memmove there is no need for an
specific memset_chk-nonshared.S. It can be provided by
memset-ia32.S itself for static library.
Checked on i686-linux-gnu.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
The i386 string routines provide multiple internal definitions
for memcpy, memmove, and mempcpy chk routines:
$ objdump -t libc.a | grep __memcpy_chk
00000000 g F .text 0000000e __memcpy_chk
00000000 g F .text 00000013 __memcpy_chk
$ objdump -t libc.a | grep __mempcpy_chk
00000000 g F .text 0000000e __mempcpy_chk
00000000 g F .text 00000013 __mempcpy_chk
$ objdump -t libc.a | grep __memmove_chk
00000000 g F .text 0000000e __memmove_chk
00000000 g F .text 00000013 __memmove_chk
Although is not an issue for normal static builds, with fortify=3
glibc itself might use the fortify chk functions and thus static
build might fail with multiple definitions. For instance:
x86_64-glibc-linux-gnu-gcc -m32 -march=i686 -o [...]math/test-signgam-uchar-static -nostdlib -nostartfiles -static -static-pie [...]
x86_64-glibc-linux-gnu/bin/ld: [...]/libc.a(mempcpy-ia32.o):
in function `__mempcpy_chk': [...]/glibc-git/string/../sysdeps/i386/i686/mempcpy.S:32: multiple definition of `__mempcpy_chk';
[...]/libc.a(mempcpy_chk-nonshared.o):[...]/debug/../sysdeps/i386/mempcpy_chk.S:28: first defined here
collect2: error: ld returned 1 exit status
make[2]: *** [../Rules:298:
There is no need for mem*-nonshared.S, the __mem*_chk routines
are already provided by the assembly routines.
Checked on i686-linux-gnu with gcc 13 built with fortify=1,2,3 and
without fortify.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
With gcc 11.3.1, building with -D_FORTIFY_SOURCE=2 shows:
In function ‘getgroups’,
inlined from ‘do_test’ at test-errno.c:129:12:
../misc/sys/cdefs.h:195:6: error: argument 1 value -1 is negative
[-Werror=stringop-overflow=]
195 | ? __ ## f ## _alias (__VA_ARGS__)
\
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../posix/bits/unistd.h:115:10: note: in expansion of macro
‘__glibc_fortify’
115 | return __glibc_fortify (getgroups, __size, sizeof (__gid_t),
| ^~~~~~~~~~~~~~~
../posix/bits/unistd.h: In function ‘do_test’:
../posix/bits/unistd-decl.h:135:28: note: in a call to function
‘__getgroups_alias’ declared with attribute ‘access (write_only, 2, 1)’
135 | extern int __REDIRECT_NTH (__getgroups_alias, (int __size,
__gid_t __list[]),
| ^~~~~~~~~~~~~~~~~
../misc/sys/cdefs.h:264:6: note: in definition of macro ‘__REDIRECT_NTH’
264 | name proto __asm__ (__ASMNAME (#alias)) __THROW
It builds fine with gcc 12 and gcc 13.
Checked on x86_64-linux-gnu.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
The compiler might not see that internal definition is an alias
due the libc_ifunc macro, which redefines __strchrnul. With
gcc 6 it fails with:
In file included from <command-line>:0:0:
./../include/libc-symbols.h:472:33: error: ‘__EI___strchrnul’ aliased to
undefined symbol ‘__GI___strchrnul’
extern thread __typeof (name) __EI_##name \
^
./../include/libc-symbols.h:468:3: note: in expansion of macro
‘__hidden_ver2’
__hidden_ver2 (, local, internal, name)
^~~~~~~~~~~~~
./../include/libc-symbols.h:476:29: note: in expansion of macro
‘__hidden_ver1’
# define hidden_def(name) __hidden_ver1(__GI_##name, name, name);
^~~~~~~~~~~~~
./../include/libc-symbols.h:557:32: note: in expansion of macro
‘hidden_def’
# define libc_hidden_def(name) hidden_def (name)
^~~~~~~~~~
../sysdeps/powerpc/powerpc64/multiarch/strchrnul.c:38:1: note: in
expansion of macro ‘libc_hidden_def’
libc_hidden_def (__strchrnul)
^~~~~~~~~~~~~~~
Use libc_ifunc_hidden as stpcpy. Checked on powerpc64 with
gcc 6 and gcc 13.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Old GCC might trigger the the comparison will always evaluate as ‘true’
warnig for static build:
set-freeres.c:87:14: error: the comparison will always evaluate as
‘true’ for the address of ‘__libc_getgrgid_freemem_ptr’ will never be
NULL [-Werror=address]
if (&__ptr != NULL) \
So add pragma weak for all affected usages.
Checked on x86_64 and i686 with gcc 6 and 13.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Generated on a Cavium Octeon III 2 board running Linux version 4.19.249
and GCC 13.1.0.
Needed due to commit cf7ffdd8a5 ("added pair of inputs for hypotf in
binary32").
This was added in 233399bce2 just for warn_if_unused
warnings rather than anything substantial.
Now that we have a proper configure argument for F_S (--enable-fortify-source),
just drop this entirely, to avoid conflicting with e.g. detected --enable-fortify-source
finding F_S=3, then nscd's Makefile setting F_S=2, resulting in a build-failure
because of the redefinition.
Signed-off-by: Sam James <sam@gentoo.org>
Signed-off-by: Andreas K. Hüttel <dilfridge@gentoo.org>
Reviewed-by: Andreas K. Hüttel <dilfridge@gentoo.org>
Starting with commit 2c6b4b272e
"nptl: Unconditionally use a 32-byte rseq area", the testcase
misc/tst-rseq-disable is UNSUPPORTED as RSEQ_SIG is not defined.
The mentioned commit removes inclusion of sys/rseq.h in nptl/descr.h.
Thus just include sys/rseq.h in the tst-rseq-disable.c as also done
in tst-rseq.c and tst-rseq-nptl.c.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
If fortify is enabled, the truncated output warning is issued by
the wrapper itself:
In function ‘strncpy’,
inlined from ‘test_strncpy’ at tester.c:505:10:
../string/bits/string_fortified.h:95:10: error: ‘__builtin_strncpy’
destination unchanged after copying no bytes from a string of length 3
[-Werror=stringop-truncation]
95 | return __builtin___strncpy_chk (__dest, __src, __len,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
96 | __glibc_objsize (__dest));
| ~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from ../include/bits/string_fortified.h:1,
from ../string/string.h:548,
from ../include/string.h:60,
from tester.c:33,
from inl-tester.c:6:
In function ‘strncpy’,
inlined from ‘test_strncpy’ at tester.c:505:10:
Checked on x86_64-linux-gnu.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
If fortify is enabled, the truncated output warning is issued by
the wrapper itself:
bug-strncat1.c: In function ‘main’:
bug-strncat1.c:14:3: error: ‘__builtin___strncat_chk’ output truncated
copying 1 byte from a string of length 2 [-Werror=stringop-truncation]
14 | strncat (d, "\5\6", 1);
| ^
Checked on x86_64-linux-gnu.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
The errno variable is potentially clobbered by the preceding
send call. It is not related to the to-be-cached information.
The parallel code in hstcache.c and servicescache.c already uses
errval.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Leads to build failures (preprocessor redefinitions), and there is not
enough time to address this properly. Deferred until after 2.38 release.
This reverts commit 59dc07637f.
Signed-off-by: Andreas K. Hüttel <dilfridge@gentoo.org>
Generated on a VisionFive 2 board running Linux version 6.4.2 and
GCC 13.1.0.
Needed due to commit cf7ffdd8a5 ("added pair of inputs for hypotf in
binary32").
Add new definitions for the MIPS target, specifically: relocation
types, machine flags, section type names, and object attribute tags
and values. On MIPS64, up to three relocations may be specified
within r_info, by the r_type, r_type2, and r_type3 fields, so add new
macros to get the respective reloc types for MIPS64.
If the kernel headers provide a larger struct rseq, we used that
size as the argument to the rseq system call. As a result,
rseq registration would fail on older kernels which only accept
size 32.
On GNU/Hurd, O_RDWR actually is O_WRONLY|O_RDONLY, so checking through
bitness really is wrong. O_ACCMODE is there for this.
Fixes: 5324d25842 ("fileops: Don't process ,ccs= as individual mode flags (BZ#18906)")
The 30379efad1 added _FORTIFY_SOURCE checks without check if compiler
does support all used fortify levels. This patch fixes it by first
checking at configure time the maximum support fortify level and using
it instead of a pre-defined one.
Checked on x86_64 with gcc 11, 12, and 13.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Tested-by: Florian Weimer <fweimer@redhat.com>
We mentioned eventual dropping of libcrypt in the 2.28 NEWS. Actually
put that plan in motion by first disabling building libcrypt by default.
note in NEWS that the library will be dropped completely in a future
release.
Also add a couple of builds into build-many-glibcs.py.
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: Andreas K. Hüttel <dilfridge@gentoo.org>
Fixes the following test-time errors, that lead to FAILs, on toolchains
that set -z now out o the box, such as the one used on Gentoo Hardened:
.../build-x86-x86_64-pc-linux-gnu-nptl $ grep '' nptl/tst-tls3*.out
nptl/tst-tls3.out:dlopen failed
nptl/tst-tls3-malloc.out:dlopen failed
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Signed-off-by: Andreas K. Hüttel <dilfridge@gentoo.org>
* nptl/descr.h (struct pthread): Remove end_padding member, which
made this type incomplete.
(PTHREAD_STRUCT_END_PADDING): Stop using end_padding.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
The _FORTIFY_SOURCE is used as default by some system compilers,
and there is no way to check if some fortify extension does not
trigger any conformance issue.
Checked on x86_64-linux-gnu.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Based on feedback by Mike Gilbert <floppym@gentoo.org>
Linux-6.1.38-dist x86_64 AMD Phenom-tm- II X6 1055T Processor
-march=amdfam10
failures occur for x32 ABI
Signed-off-by: Andreas K. Hüttel <dilfridge@gentoo.org>
Starting with commit 1bcfe0f732, the
test was enhanced and the object for __builtin_return_address (0)
is searched with _dl_find_object.
Unfortunately on e.g. s390 (31bit), a postprocessing step is needed
as the highest bit has to be masked out. This can be done with
__builtin_extract_return_addr.
Without this postprocessing, _dl_find_object returns with -1 and the
content of dlfo is invalid, which may lead to segfaults in basename.
Therefore those checks are now only done on success.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
On some machines we end up with incomplete cache information. This can
make the new calculation of `sizeof(total-L3)/custom-divisor` end up
lower than intended (and lower than the prior value). So reintroduce
the old bound as a lower bound to avoid potentially regressing code
where we don't have complete information to make the decision.
Reviewed-by: DJ Delorie <dj@redhat.com>
After:
```
commit af992e7abd
Author: Noah Goldstein <goldstein.w.n@gmail.com>
Date: Wed Jun 7 13:18:01 2023 -0500
x86: Increase `non_temporal_threshold` to roughly `sizeof_L3 / 4`
```
Split `shared` (cumulative cache size) from `shared_per_thread` (cache
size per socket), the `shared_per_thread` *can* be slightly off from
the previous calculation.
Previously we added `core` even if `threads_l2` was invalid, and only
used `threads_l2` to divide `core` if it was present. The changed
version only included `core` if `threads_l2` was valid.
This change restores the old behavior if `threads_l2` is invalid by
adding the entire value of `core`.
Reviewed-by: DJ Delorie <dj@redhat.com>
Based on feedback by Arsen Arsenović <arsen@gentoo.org>
Linux-6.1.38-gentoo-dist-hardened x86_64 AMD Ryzen 7 3800X 8-Core Processor
-march=x86-64-v2
Signed-off-by: Andreas K. Hüttel <dilfridge@gentoo.org>
Line numbers, version numbers, template date changed everywhere
Nontrivial changes in de, ro, uk, zh_TW
Signed-off-by: Andreas K. Hüttel <dilfridge@gentoo.org>
Bump autoconf requirement to 2.71 to allow regenerating configure on
more recent distributions. autoconf 2.71 has been in Fedora since F36
and is the current version in Debian stable (bookworm). It appears to
be current in Gentoo as well.
All sysdeps configure and preconfigure scripts have also been
regenerated; all changes are trivial transformations that do not affect
functionality.
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
It follows the internal signature:
extern int clone3 (struct clone_args *__cl_args, size_t __size,
int (*__func) (void *__arg), void *__arg);
Checked on s390x-linux-gnu and s390-linux-gnu.
The sparc ABI has multiple cases on how to handle JMP_SLOT relocations,
(sparc_fixup_plt/sparc64_fixup_plt). For BINDNOW, _dl_audit_symbind
will be responsible to setup the final relocation value; while for
lazy binding _dl_fixup/_dl_profile_fixup will call the audit callback
and tail cail elf_machine_fixup_plt (which will call
sparc64_fixup_plt).
This patch fixes by issuing the SPARC specific routine on bindnow and
forwarding the audit value to elf_machine_fixup_plt for lazy resolution.
It fixes the la_symbind for bind-now tests on sparc64 and sparcv9:
elf/tst-audit24a
elf/tst-audit24b
elf/tst-audit24c
elf/tst-audit24d
Checked on sparc64-linux-gnu and sparcv9-linux-gnu.
Tested-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
This patch checks if assembler supports vector instructions to
generate LASX/LSX code or not, and then define HAVE_LOONGARCH_VEC_ASM macro
We have added support for vector instructions in binutils-2.41
See:
https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=75b2f521b101d974354f6ce9ed7c054d8b2e3b7a
commit 75b2f521b101d974354f6ce9ed7c054d8b2e3b7a
Author: mengqinggang <mengqinggang@loongson.cn>
Date: Thu Jun 22 10:35:28 2023 +0800
LoongArch: gas: Add lsx and lasx instructions support
gas/ChangeLog:
* config/tc-loongarch.c (md_parse_option): Add lsx and lasx option.
(loongarch_after_parse_args): Add lsx and lasx option.
opcodes/ChangeLog:
* loongarch-opc.c (struct loongarch_ase): Add lsx and lasx
instructions.
Depending on build configuration, the [routine]-c.c files may be chosen
to provide fortified routines implementation. While [routines].c
implementation were automatically excluded, the [routines]-c.c ones were
not. This patch fixes that by adding these file to the list to be
filtered.
This brings in the new Romanian language translations, and updates
nine other translations. Important translations in this update
include the Italian and Japanese translations for ESTALE which
remove the mention of "NFS" from the error message translation.
Success is reported with a 0 return value, and failure is -1.
Enhance the kitchen sink test elf/tst-audit28 to cover
_dl_find_object as well.
Fixes commit 5d28a8962d ("elf: Add _dl_find_object function")
and bug 30515.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
The trim_threshold is too aggressive a heuristic to decide if chunk
reuse is OK for reallocated memory; for repeated small, shrinking
allocations it leads to internal fragmentation and for repeated larger
allocations that fragmentation may blow up even worse due to the dynamic
nature of the threshold.
Limit reuse only when it is within the alignment padding, which is 2 *
size_t for heap allocations and a page size for mmapped allocations.
There's the added wrinkle of THP, but this fix ignores it for now,
pessimizing that case in favor of keeping fragmentation low.
This resolves BZ #30579.
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reported-by: Nicolas Dusart <nicolas@freedelity.be>
Reported-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Tested-by: Aurelien Jarno <aurelien@aurel32.net>
Some locales define a list of mapping pairs of alternate digits and
separators for input digits (to_inpunct). This require the scanf
to create a list of all possible inputs for the optional type
modifier 'I'.
Checked on x86_64-linux-gnu.
Reviewed-by: Joe Simmons-Talbott <josimmon@redhat.com>
In processing the first 7 individual characters of the mode for fopen
if ,ccs= is used those characters will be processed as well. Stop
processing individual mode flags once a comma is encountered. This has
the effect of requiring ,ccs= to be the last mode flag in the mode
string. Add a testcase to check that the ,ccs= mode flag is not
processed as individual mode flags.
Reviewed-by: DJ Delorie <dj@redhat.com>
Return value from *scanf and *asprintf routines are now properly checked
in test-scanf-ldbl-compat-template.c and test-printf-ldbl-compat.c.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
The change is meant to avoid unwanted PLT entry for the fgets_unlocked
routine when _FORTIFY_SOURCE is set.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Move declarations from libio/bits/stdio.h to existing
libio/bits/stdio2-decl.h. This will enable future use of
__REDIRECT_FORTIFY in place of some __REDIRECT.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
This allows to include bits/syslog-decl.h in include/sys/syslog.h and
therefore be able to create the libc_hidden_builtin_proto (__syslog_chk)
prototype.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
The __fdelt_chk declaration needs to be available so that
libc_hidden_proto can be used while not redefining __FD_ELT.
Thus, misc/bits/select-decl.h is created to hold the corresponding
prototypes.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
The change is meant to avoid unwanted PLT entries for the read_chk,
getdomainname_chk and getlogin_r_chk routines when _FORTIFY_SOURCE is set.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
This change is similar to what was done for bits/wchar2.h.
Routines declaration are moved into a dedicated bits/unistd-decl.h file
which is then included into the bits/unistd.h file.
This will allow to adapt the files so that PLT entries are not created when
_FORTIFY_SOURCE is enabled.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
The change is meant to avoid unwanted PLT entries for the wmemset and
wcrtomb routines when _FORTIFY_SOURCE is set.
On top of that, ensure that *_chk routines have their hidden builtin
definitions available.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
The __REDIRECT* macros are creating aliases which may lead to unwanted
PLT entries when fortification is enabled.
To prevent these entries, the REDIRECT alias should be set to point to the
existing __GI_* aliases.
This is done transparently by creating a __REDIRECT_FORTIFY* version of
these macros, that can be overwritten internally when necessary.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
If libc_hidden_builtin_{def,proto} isn't properly set for *_chk routines,
there are unwanted PLT entries in libc.so.
There is a special case with __asprintf_chk:
If ldbl_* macros are used for asprintf, ABI gets broken on s390x,
if it isn't, ppc64le isn't building due to multiple asm redirections.
This is due to the inclusion of bits/stdio-lbdl.h for ppc64le whereas it
isn't for s390x. This header creates redirections, which are not
compatible with the ones generated using libc_hidden_def.
Yet, we can't use libc_hidden_ldbl_proto on s390x since it will not
create a simple strong alias (e.g. as done on x86_64), but a versioned
alias, leading to ABI breakage.
This results in errors on s390x:
/usr/bin/ld: glibc/iconv/../libio/bits/stdio2.h:137: undefined reference
to `__asprintf_chk'
Original __asprintf_chk symbols:
00000000001395b0 T __asprintf_chk
0000000000177e90 T __nldbl___asprintf_chk
__asprintf_chk symbols with ldbl_* macros:
000000000012d590 t ___asprintf_chk
000000000012d590 t __asprintf_chk@@GLIBC_2.4
000000000012d590 t __GI___asprintf_chk
000000000012d590 t __GL____asprintf_chk___asprintf_chk
0000000000172240 T __nldbl___asprintf_chk
__asprintf_chk symbols with the patch:
000000000012d590 t ___asprintf_chk
000000000012d590 T __asprintf_chk
000000000012d590 t __GI___asprintf_chk
0000000000172240 T __nldbl___asprintf_chk
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
If libc_hidden_builtin_{def,proto} isn't properly set for *_chk routines,
there are unwanted PLT entries in libc.so.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
The *_chk routines naming doesn't match the name that would be generated
using libc_hidden_ldbl_proto. Since the macro is needed for some of
these *_chk functions for _FORTIFY_SOURCE to be enabled, that needed to
be fixed.
While at it, all the *_chk function get renamed appropriately for
consistency, even if not strictly necessary.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: Paul E. Murphy <murphyp@linux.ibm.com>
Since the _FORTIFY_SOURCE feature uses some routines of Glibc, they need to
be excluded from the fortification.
On top of that:
- some tests explicitly verify that some level of fortification works
appropriately, we therefore shouldn't modify the level set for them.
- some objects need to be build with optimization disabled, which
prevents _FORTIFY_SOURCE to be used for them.
Assembler files that implement architecture specific versions of the
fortified routines were not excluded from _FORTIFY_SOURCE as there is no
C header included that would impact their behavior.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Add --enable-fortify-source option.
It is now possible to enable fortification through a configure option.
The level may be given as parameter, if none is provided, the configure
script will determine what is the highest level possible that can be set
considering GCC built-ins availability and set it.
If level is explicitly set to 3, configure checks if the compiler
supports the built-in function necessary for it or raise an error if it
isn't.
If the configure option isn't explicitly enabled, it _FORTIFY_SOURCE is
forcibly undefined (and therefore disabled).
The result of the configure checks are new variables, ${fortify_source}
and ${no_fortify_source} that can be used to appropriately populate
CFLAGS.
A dedicated patch will follow to make use of this variable in Makefiles
when necessary.
Updated NEWS and INSTALL.
Adding dedicated x86_64 variant that enables the configuration.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
The current implementation of strerror is thread-safe, but this
has implications for the lifetime of the return string.
Describe the strerror_l function. Describe both variants of the
strerror_r function. Mention the lifetime of the returned string
for strerrorname_np and strerrordesc_np. Clarify that perror
output depends on the current locale.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Describe the problems with signed characters, and the glibc extension
to deal with most of them. Mention that the is* functions return
zero for the special argument EOF.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Now that abort no longer calls fflush there is no reason to avoid locking
the stdio streams anywhere. This fixes a conformance issue and potential
heap corruption during exit.
MAP_FIXED is defined to silently replace any existing mappings at the
address range being mapped over. This, however, is a dangerous, and only
rarely desired behavior.
Various Unix systems provide replacements or additions to MAP_FIXED:
* SerenityOS and Linux provide MAP_FIXED_NOREPLACE. If the address space
already contains a mapping in the requested range, Linux returns
EEXIST. SerenityOS returns ENOMEM, however that is a bug, as the
MAP_FIXED_NOREPLACE implementation is intended to be compatible with
Linux.
* FreeBSD provides the MAP_EXCL flag that has to be used in combination
with MAP_FIXED. It returns EINVAL if the requested range already
contains existing mappings. This is directly analogous to the O_EXCL
flag in the open () call.
* DragonFly BSD, NetBSD, and OpenBSD provide MAP_TRYFIXED, but with
different semantics. DragonFly BSD returns ENOMEM if the requested
range already contains existing mappings. NetBSD does not return an
error, but instead creates the mapping at a different address if the
requested range contains mappings. OpenBSD behaves the same, but also
notes that this is the default behavior even without MAP_TRYFIXED
(which is the case on the Hurd too).
Since the Hurd leans closer to the BSD side, add MAP_EXCL as the primary
API to request the behavior of not replacing existing mappings. Declare
MAP_FIXED_NOREPLACE and MAP_TRYFIXED as aliases of (MAP_FIXED|MAP_EXCL),
so any existing software that checks for either of those macros will
pick them up automatically. For compatibility with Linux, return EEXIST
if a mapping already exists.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230625231751.404120-5-bugaevc@gmail.com>
Zero address passed to mmap () typically means the caller doesn't have
any specific preferred address. Not so if MAP_FIXED is passed: in this
case 0 means literal 0. Fix this case to pass anywhere = 0 into vm_map.
Also add some documentation.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230625231751.404120-4-bugaevc@gmail.com>
Only call vm_deallocate when we do have the old buffer, and check for
unexpected errors.
Spotted while debugging a msgids/readdir issue on x86_64-gnu.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230625231751.404120-3-bugaevc@gmail.com>
The rest of the heap (backed by individual pages) is already mapped RW.
Mapping these pages RWX presents a security hazard.
Also, in another branch memory gets allocated using vm_allocate, which
sets memory protection to VM_PROT_DEFAULT (which is RW). The mismatch
between protections prevents Mach from coalescing the VM map entries.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230625231751.404120-2-bugaevc@gmail.com>
Instead of trying to allocate a thread stack at a specific address,
looping over the address space, just set the ANYWHERE flag in
vm_allocate (). The previous behavior:
- defeats ASLR (for Mach versions that support ASLR),
- is particularly slow if the lower 4 GB of the address space are mapped
inaccessible, as we're planning to do on 64-bit Hurd,
- is just silly.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230625231751.404120-1-bugaevc@gmail.com>
This follows 1d44530a5b ("string: strerror must not return NULL (bug 30555)"):
«
For strerror, this fixes commit 28aff04781 ("string:
Implement strerror in terms of strerror_l"). This commit avoids
returning NULL for strerror_l as well, although POSIX allows this
behavior for strerror_l.
»
Changing tst-cleanup4.c to use xread instead of read caused
the nptl/tst-cleanupx4 test to fail. The routines in libsupport.a
need to be built with exception handling and asynchronous unwind
table support.
v2: Use "CFLAGS-.oS" instead of "override CFLAGS".
GCC was the only compiler affected by the issue with
__builtin_isinf_sign and float128.
Fix BZ #30550.
Reported-by: Qiu Chaofan <qiucofan@cn.ibm.com>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
The first segment in a shared library may be read-only, not executable.
To support LD_PREFER_MAP_32BIT_EXEC on such shared libraries, we also
check MAP_DENYWRITE to decide if MAP_32BIT should be passed to mmap.
Normally the first segment is mapped with MAP_COPY, which is defined
as (MAP_PRIVATE | MAP_DENYWRITE). But if the segment alignment is
greater than the page size, MAP_COPY isn't used to allocate enough
space to ensure that the segment can be properly aligned. Map the
first segment with MAP_COPY in this case to fix BZ #30452.
tm time struct contains tm_wday and tm_yday that were previously not
checked in this test. Also added new test cases for date formats
containing %D, %R or %h.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Optimised implementations for single and double precision, Advanced
SIMD and SVE, copied from Arm Optimized Routines.
As previously, data tables are used via a barrier to prevent
overly aggressive constant inlining. Special-case handlers are
marked NOINLINE to avoid incurring the penalty of switching call
standards unnecessarily.
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
Optimised implementations for single and double precision, Advanced
SIMD and SVE, copied from Arm Optimized Routines. Log lookup table
added as HIDDEN symbol to allow it to be shared between AdvSIMD and
SVE variants.
As previously, data tables are used via a barrier to prevent
overly aggressive constant inlining. Special-case handlers are
marked NOINLINE to avoid incurring the penalty of switching call
standards unnecessarily.
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
Optimised implementations for single and double precision, Advanced
SIMD and SVE, copied from Arm Optimized Routines.
As previously, data tables are used via a barrier to prevent
overly aggressive constant inlining. Special-case handlers are
marked NOINLINE to avoid incurring the penalty of switching call
standards unnecessarily.
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
Replace the loop-over-scalar placeholder routines with optimised
implementations from Arm Optimized Routines (AOR).
Also add some headers containing utilities for aarch64 libmvec
routines, and update libm-test-ulps.
Data tables for new routines are used via a pointer with a
barrier on it, in order to prevent overly aggressive constant
inlining in GCC. This allows a single adrp, combined with offset
loads, to be used for every constant in the table.
Special-case handlers are marked NOINLINE in order to confine the
save/restore overhead of switching from vector to normal calling
standard. This way we only incur the extra memory access in the
exceptional cases. NOINLINE definitions have been moved to
math_private.h in order to reduce duplication.
AOR exposes a config option, WANT_SIMD_EXCEPT, to enable
selective masking (and later fixing up) of invalid lanes, in
order to trigger fp exceptions correctly (AdvSIMD only). This is
tested and maintained in AOR, however it is configured off at
source level here for performance reasons. We keep the
WANT_SIMD_EXCEPT blocks in routine sources to greatly simplify
the upstreaming process from AOR to glibc.
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
Add --disable-encoding to makeinfo flags so that it does not generate
unicode quote glyphs.
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Linux 6.4 adds the riscv_hwprobe syscall on riscv and enables
memfd_secret on s390. Update syscall-names.list and regenerate the
arch-syscall.h headers with build-many-glibcs.py update-syscalls.
Tested with build-many-glibcs.py.
Trying to mount procfs can fail due multiples reasons: proc is locked
due the container configuration, mount syscall is filtered by a
Linux Secuirty Module, or any other security or hardening mechanism
that Linux might eventually add.
The tests does require a new procfs without binding to parent, and
to fully fix it would require to change how the container was created
(which is out of the scope of the test itself). Instead of trying to
foresee any possible scenario, if procfs can not be mount fail with
unsupported.
Checked on aarch64-linux-gnu.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
The tst-ttyname-direct.c checks the ttyname with procfs mounted in
bind mode (MS_BIND|MS_REC), while tst-ttyname-namespace.c checks
with procfs mount with MS_NOSUID|MS_NOEXEC|MS_NODEV in a new
namespace.
Checked on x86_64-linux-gnu and aarch64-linux-gnu.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
This patch improves tests-clean Makefile target to reliably clean
test artifacts from a build directory. Before this patch tests-clean
missed around 3k (out of total 9k) .out and .test-result files.
Signed-off-by: Maxim Kuvyrkov <maxim.kuvyrkov@linaro.org>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
These files could be useful to any port that wants to use ld.so.cache.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
ldconfig was allocating PATH_MAX bytes on the stack for the library file
name. The issues with PATH_MAX usage are well documented [0][1]; even if
a program does not rely on paths being limited to PATH_MAX bytes,
allocating 4096 bytes on the stack for paths that are typically rather
short (strlen ("/lib64/libc.so.6") is 16) is wasteful and dangerous.
[0]: https://insanecoding.blogspot.com/2007/11/pathmax-simply-isnt.html
[1]: https://eklitzke.org/path-max-is-tricky
Instead, make use of asprintf to dynamically allocate memory of just the
right size on the heap.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
In documentation, call strings like "CST" time zone abbreviations, not
time zone names. This terminology is more precise, and is what tzdb uses.
A string like "CST" is ambiguous and does not fully name a time zone.
Few tests needed to properly check for asprintf and system calls return
values with _FORTIFY_SOURCE enabled.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
The fread routine return value needs to be checked when fortification
is enabled, hence use xfread helper.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
The declaration and definition of these routines aren't consistent.
Make the definition of __readlink_chk and __readlinkat_chk match the
declaration of the routines they fortify. While there are no problems
today this avoids any future potential problems related to the mismatch.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
This will enable __REDIRECT_FORTIFY* macros to be used when _FORTIFY_SOURCE
is set.
Routine declarations that were in bits/wchar2.h are moved into the
bits/wchar2-decl.h file.
The file is now included into include/wchar.h irrespectively from
fortification.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Few tests using swprintf are passing incorrect maxlen parameter.
This triggers an abort when _FORTIFY_SOURCE is enabled.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
On i386 and x86_64, for libc.a specifically, __mempcpy_chk calls
mempcpy which leads POSIX routines to call non-POSIX mempcpy indirectly.
This leads the linknamespace test to fail when glibc is built with
__FORTIFY_SOURCE=3.
Since calling mempcpy doesn't bring any benefit for libc.a, directly
call __mempcpy instead.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Replace alloca with a scratch_buffer to avoid potential stack overflows.
Checked on i686-gnu and x86_64-linux-gnu
Message-Id: <20230619144334.2902429-1-josimmon@redhat.com>
There is a potential memory leak for large writes due to writev being a
"shall occur" cancellation point. Add back the cleanup handler removed
in cf30aa43a5.
Checked on i686-gnu and x86_64-linux-gnu.
Message-Id: <20230619143842.2901522-1-josimmon@redhat.com>
ISO C2x defines scanf %b for input of binary integers (with an
optional 0b or 0B prefix). Implement such support, along with the
corresponding SCNb* macros in <inttypes.h>. Unlike the support for
binary integers with 0b or 0B prefix with scanf %i, this is supported
in all versions of scanf (independent of the standards mode used for
compilation), because there are no backwards compatibility concerns
(%b wasn't previously a supported format) the way there were for %i.
Tested for x86_64 and x86.
ISO C2x defines printf length modifiers wN (for intN_t / int_leastN_t
/ uintN_t / uint_leastN_t) and wfN (for int_fastN_t / uint_fastN_t).
Add support for those length modifiers (such a feature was previously
requested in bug 24466). scanf support is to be added separately.
GCC 13 has format checking support for these modifiers.
When used with the support for registering format specifiers, these
modifiers are translated to existing flags in struct printf_info,
rather than trying to add some way of distinguishing them without
breaking the printf_info ABI. C2x requires an error to be returned
for unsupported values of N; this is implemented for printf-family
functions, but the parse_printf_format interface doesn't support error
returns, so such an error gets discarded by that function.
Tested for x86_64 and x86.
With fortification enabled, system calls return result needs to be checked,
has it gets the __wur macro enabled.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
With fortification enabled, read calls return result needs to be checked,
has it gets the __wur macro enabled.
Note on read call removal from sysdeps/pthread/tst-cancel20.c and
sysdeps/pthread/tst-cancel21.c:
It is assumed that this second read call was there to overcome the race
condition between pipe closure and thread cancellation that could happen
in the original code. Since this race condition got fixed by
d0e3ffb7a5 the second call seems
superfluous. Hence, instead of checking for the return value of read, it
looks reasonable to simply remove it.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Use a scratch_buffer rather than alloca to avoid potential stack
overflows.
Checked on i686-gnu and x86_64-linux-gnu
Message-Id: <20230608155844.976554-1-josimmon@redhat.com>
For strerror, this fixes commit 28aff04781 ("string:
Implement strerror in terms of strerror_l"). This commit avoids
returning NULL for strerror_l as well, although POSIX allows this
behavior for strerror_l.
Reviewed-by: Arjun Shankar <arjun@redhat.com>
These functions are about to be added to POSIX, under Austin Group
issue 986.
The fortified strlcat implementation does not raise SIGABRT if the
destination buffer does not contain a null terminator, it just
inherits the non-failing regular strlcat behavior.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
With fortification enabled, fgets calls return result needs to be checked,
has it gets the __wur macro enabled.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
With fortification enabled, fread calls return result needs to be checked,
has it gets the __wur macro enabled.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
The tst-mallocfork2 and tst-mallocfork3 create large number of
subprocesss, around 11k for former and 20k for latter, to check
for malloc async-signal-safeness on both fork and _Fork. However
they do not really exercise allocation patterns different than
other tests fro malloc itself, and the spawned process just exit
without any extra computation.
The tst-malloc-tcache-leak is similar, but creates 100k threads
and already checks the resulting with mallinfo.
These tests are also very sensitive to system load (since they
estresss heavy the kernel resource allocation), and adding them
on THP tunable and mcheck tests increase the pressure even more.
For THP the fork tests do not add any more coverage than other
tests. The mcheck is also not enable for tst-malloc-tcache-leak.
Checked on x86_64-linux-gnu.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
There is no fork detection on current arc4random implementation, so
use lower subprocess on fork tests. The tests now run on 0.1s
instead of 8s on a Ryzen9 5900X.
Checked on x86_64-linux-gnu.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
The getdate testcases all expect successful results. Add support for
negative testcases and testcases where a full date and time are not
supplied by skipping the tm checks in the test. Add a testcase that
would catch a use-after-free that was recently found.
Reviewed-by: Arjun Shankar <arjun@redhat.com>
Different systems prefer a different divisors.
From benchmarks[1] so far the following divisors have been found:
ICX : 2
SKX : 2
BWD : 8
For Intel, we are generalizing that BWD and older prefers 8 as a
divisor, and SKL and newer prefers 2. This number can be further tuned
as benchmarks are run.
[1]: https://github.com/goldsteinn/memcpy-nt-benchmarks
Reviewed-by: DJ Delorie <dj@redhat.com>
This patch should have no affect on existing functionality.
The current code, which has a single switch for model detection and
setting prefered features, is difficult to follow/extend. The cases
use magic numbers and many microarchitectures are missing. This makes
it difficult to reason about what is implemented so far and/or
how/where to add support for new features.
This patch splits the model detection and preference setting stages so
that CPU preferences can be set based on a complete list of available
microarchitectures, rather than based on model magic numbers.
Reviewed-by: DJ Delorie <dj@redhat.com>
Current `non_temporal_threshold` set to roughly '3/4 * sizeof_L3 /
ncores_per_socket'. This patch updates that value to roughly
'sizeof_L3 / 4`
The original value (specifically dividing the `ncores_per_socket`) was
done to limit the amount of other threads' data a `memcpy`/`memset`
could evict.
Dividing by 'ncores_per_socket', however leads to exceedingly low
non-temporal thresholds and leads to using non-temporal stores in
cases where REP MOVSB is multiple times faster.
Furthermore, non-temporal stores are written directly to main memory
so using it at a size much smaller than L3 can place soon to be
accessed data much further away than it otherwise could be. As well,
modern machines are able to detect streaming patterns (especially if
REP MOVSB is used) and provide LRU hints to the memory subsystem. This
in affect caps the total amount of eviction at 1/cache_associativity,
far below meaningfully thrashing the entire cache.
As best I can tell, the benchmarks that lead this small threshold
where done comparing non-temporal stores versus standard cacheable
stores. A better comparison (linked below) is to be REP MOVSB which,
on the measure systems, is nearly 2x faster than non-temporal stores
at the low-end of the previous threshold, and within 10% for over
100MB copies (well past even the current threshold). In cases with a
low number of threads competing for bandwidth, REP MOVSB is ~2x faster
up to `sizeof_L3`.
The divisor of `4` is a somewhat arbitrary value. From benchmarks it
seems Skylake and Icelake both prefer a divisor of `2`, but older CPUs
such as Broadwell prefer something closer to `8`. This patch is meant
to be followed up by another one to make the divisor cpu-specific, but
in the meantime (and for easier backporting), this patch settles on
`4` as a middle-ground.
Benchmarks comparing non-temporal stores, REP MOVSB, and cacheable
stores where done using:
https://github.com/goldsteinn/memcpy-nt-benchmarks
Sheets results (also available in pdf on the github):
https://docs.google.com/spreadsheets/d/e/2PACX-1vS183r0rW_jRX6tG_E90m9qVuFiMbRIJvi5VAE8yYOvEOIEEc3aSNuEsrFbuXw5c3nGboxMmrupZD7K/pubhtml
Reviewed-by: DJ Delorie <dj@redhat.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
tst-getdate used to rely on an in-tree datemsk file that was
subsequently replaced by a file created during test execution. This
commit removes the unused file and corresponding env-var and uses a more
appropriate name for the temp file.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
getdate would free the buffer pointed to by the result of its call to
strptime, then reference the same buffer later on -- leading to a
use-after-free. This commit fixes that.
Reported-by: Martin Coufal <mcoufal@redhat.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Since these functions are used in both catgets/gencat.c and
malloc/memusage{,stat}.c, it make sense to move them into a dedicated
header where they can be inlined.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
With fortification enabled, few function calls return result need to be
checked, has they get the __wur macro enabled.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
When enabling _FORTIFY_SOURCE, some functions now lead to warnings when
their result is not checked.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Container management default seccomp filter [1] only accepts
clock_settime if process has also CAP_SYS_TIME. So also handle
EPERM as well.
Also adapt the test to libsupport and add a proper Copyright header.
Checked on aarch64-linux-gnu.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Container management default seccomp filter [1] only accepts
personality(2) with PER_LINUX, (0x0), UNAME26 (0x20000),
PER_LINUX32 (0x8), UNAME26 | PER_LINUX32, and 0xffffffff (to query
current personality)
Although the documentation only state it is blocked to prevent
'enabling BSD emulation' (PER_BSD, not implemented by Linux), checking
on repository log the real reason is to block ASLR disable flag
(ADDR_NO_RANDOMIZE) and other poorly support emulations.
So handle EPERM and fail as UNSUPPORTED if we can really check for
BZ#19408.
Checked on aarch64-linux-gnu.
[1] https://github.com/moby/moby/blob/master/profiles/seccomp/default.json
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Since the area of the user's stack we use for the registers dump (and
otherwise as __sigreturn2's stack) can and does overlap the sigcontext,
we have to be very careful about the order of loads and stores that we
do. In particular we have to load sc_reply_port before we start
clobbering the sigcontext.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
We add a 'make check' test that lints all Makefiles in the source
directory of the glibc build. This linting test ensures that the
lines in all Makefiles will be sorted correctly as developers
creates patches. It is added to 'make check' because it is
light-weight and supports the existing developer workflow
The test adds ~3s to a 'make check' execution.
No regressions on x86_64 and i686.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Sort Makefile variables using scrips/sort-makefile-lines.py.
No code generation changes observed in non-test binary artifacts.
No regressions on x86_64 and i686.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
epoll_pwait2(2)'s second argument should be nonnull. We're going to add
__nonnull to the prototype, so let's fix the test accordingly. We can
use a dummy variable to avoid passing NULL.
Reported-by: Adhemerval Zanella Netto <adhemerval.zanella@linaro.org>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
With fortification enabled, few function calls return result need to be
checked, has they get the __wur macro enabled.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
With fortification enabled, ftruncate calls return result needs to be
checked, has it gets the __wur macro enabled.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Since the assembly source file with -evex suffix should use YMM registers,
not ZMM registers, include x86-evex256-vecs.h by default to use YMM
registers in memcmpeq-evex.S
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
In some cases (e.g. when podman creates user containers), the only other
group assigned to the executing user is nobody and fchown fails with it
because the group is not mapped. Do not fail the test in this case,
instead exit as unsupported.
Reported-by: Frédéric Bérat <fberat@redhat.com>
Tested-by: Frédéric Bérat <fberat@redhat.com>
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Different than other 64 bit architectures, powerpc64 defines the
LFS POSIX lock constants with values similar to 32 ABI, which
are meant to be used with fcntl64 syscall. Since powerpc64 kABI
does not have fcntl, the constants are adjusted with the
FCNTL_ADJUST_CMD macro.
The 4d0fe291ae changed the logic of generic constants
LFS value are equal to the default values; which is now wrong
for powerpc64.
Fix the value by explicit define the previous glibc constants
(powerpc64 does not need to use the 32 kABI value, but it simplifies
the FCNTL_ADJUST_CMD which should be kept as compatibility).
Checked on powerpc64-linux-gnu and powerpc-linux-gnu.
For architecture with default 64 bit time_t support, the kernel
does not provide LFS and non-LFS values for F_GETLK, F_GETLK, and
F_GETLK (the default value used for 64 bit architecture are used).
This is might be considered an ABI break, but the currenct exported
values is bogus anyway.
The POSIX lockf is not affected since it is aliased to lockf64,
which already uses the LFS values.
Checked on i686-linux-gnu and the new tests on a riscv32.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Previously, after destructors for a DSO have been invoked, ld.so refused
to bind against that DSO in all cases. Relax this restriction somewhat
if the referencing object is itself a DSO that is being unloaded. This
assumes that the symbol reference is not going to be stored anywhere.
The situation in the test case can arise fairly easily with C++ and
objects that are built with different optimization levels and therefore
define different functions with vague linkage.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
The LoongArch glibc was using the value of the SHMLBA macro from common code,
which is __getpagesize() (16k), but this was inconsistent with the value of
the SHMLBA macro in the kernel, which is SZ_64K (64k). This caused several
shmat-related tests in LTP (Linux Test Project) to fail. This commit fixes
the issue by ensuring that the glibc's SHMLBA macro value matches the value
used in the kernel like other architectures.
Applying this commit results in bit-identical libc.so.6.
The elf/ld-linux-x86-64.so.2 does change, but only in .note.gnu.build-id
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Use a scratch_buffer rather than either alloca or malloc to reduce the
possibility of a stack overflow.
Suggested-by: Adhemerval Zanella Netto <adhemerval.zanella@linaro.org>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
If `non_temporal_threshold` is below `minimum_non_temporal_threshold`,
it almost certainly means we failed to read the systems cache info.
In this case, rather than defaulting the minimum correct value, we
should default to a value that gets at least reasonable
performance. 64MB is chosen conservatively to be at the very high
end. This should never cause non-temporal stores when, if we had read
cache info, we wouldn't have otherwise.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
All the changes are in comments or '#error' messages.
Applying this commit results in bit-identical rebuild of iconvdata/*.so
Reviewed-by: Florian Weimer <fw@deneb.enyo.de>
Linux 6.3 adds new constants MFD_NOEXEC_SEAL and MFD_EXEC. Add these
to bits/mman-shared.h (conditional on MFD_NOEXEC_SEAL not already
being defined, similar to the existing conditional on the older MFD_*
macros).
Tested for x86_64.
Linux 6.3 adds constants AT_RSEQ_FEATURE_SIZE and AT_RSEQ_ALIGN; add
them to glibc's elf.h. (Recall that, although elf.h is a
system-independent header, so far we've put AT_* constants there even
if Linux-specific, as discussed in bug 15794. So rather than making
any attempt to fix that issue, the new constants are just added there
alongside the existing ones.)
Tested for x86_64.
This patch checks _dl_debug_vdprintf, by passing various inputs to
_dl_dprintf and comparing the output with invocations of snprintf.
Signed-off-by: Roy Eldar <royeldar0@gmail.com>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
_dl_debug_vdprintf is a bare-bones printf implementation; currently
printing a signed integer (using "%d" format specifier) behaves
incorrectly when the number is negative, as it just prints the
corresponding unsigned integer, preceeded by a minus sign.
For example, _dl_printf("%d", -1) would print '-4294967295'.
Signed-off-by: Roy Eldar <royeldar0@gmail.com>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
f55727ca53 updated open_path to use the
r_search_path_struct struct but failed to update the comment.
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
When dlopen is being called, efforts have been made to improve
future lookup performance. This includes marking a search path
as non-existent using `stat`. However, if the root directory
is given as a search path, there exists a bug which erroneously
marks it as non-existing.
The bug is reproduced under the following sequence:
1. dlopen is called to open a shared library, with at least:
1) a dependency 'A.so' not directly under the '/' directory
(e.g. /lib/A.so), and
2) another dependency 'B.so' resides in '/'.
2. for this bug to reproduce, 'A.so' should be searched *before* 'B.so'.
3. it first tries to find 'A.so' in /, (e.g. /A.so):
- this will (obviously) fail,
- since it's the first time we have seen the '/' directory,
its 'status' is 'unknown'.
4. `buf[buflen - namelen - 1] = '\0'` is executed:
- it intends to remove the leaf and its final slash,
- because of the speciality of '/', its buflen == namelen + 1,
- it erroneously clears the entire buffer.
6. it then calls 'stat' with the empty buffer:
- which will result in an error.
7. so it marks '/' as 'nonexisting', future lookups will not consider
this path.
8. while /B.so *does* exist, failure to look it up in the '/'
directory leads to a 'cannot open shared object file' error.
This patch fixes the bug by preventing 'buflen', an index to put '\0',
from being set to 0, so that the root '/' is always kept.
Relative search paths are always considered as 'existing' so this
wont be affected.
Writeup by Moody Liu <mooodyhunter@outlook.com>
Suggested-by: Carlos O'Donell <carlos@redhat.com>
Signed-off-by: Qixing ksyx Xue <qixingxue@outlook.com>
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
When the base is 0 or 2 and the first two characters are '0' and 'b',
but the rest are no binary digits. In this case this is no error,
and strtol must return 0 and ENDPTR points to the 'x' or 'b'.
Checked on x86_64-linux-gnu and i686-linux-gnu.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Add list end markers.
Sort text using scripts/sort-makefile-lines.py.
No code generation changes observed in non-test binary artifacts.
No regressions on x86_64 and i686.
All fixes are in comments, so the binaries should be identical
before/after this commit, but I can't verify this.
Reviewed-by: Rajalakshmi Srinivasaraghavan <rajis@linux.ibm.com>
Applying this commit results in a bit-identical rebuild of
mathvec/libmvec.so.1 (which is the only binary that gets rebuilt).
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
We do not want mach_i386.h to get installed into machine/, but into
i386/ or x86_64/ depending where mach_i386.defs was found, i.e.
according to 32/64 bitness.
Some of the s390-specific configure checks are using compile and
link configure tests. Now use only compile tests as the link
tests fails when e.g. bootstrapping a cross-toolchain due to
missing crt-files/libc.so. This is achieved by using
AC_COMPILE_IFELSE in configure.ac file.
This is observable e.g. when using buildroot which builds glibc
only once or the build-many-glibcs.py script. Note that the latter
one is building glibc twice in the compilers-step (configure-checks
fails) and in the glibcs-step (configure-checks succeed).
Note, that the s390 specific configure tests for static PIE have to
link an executable to test binutils support. Thus we can't fix
those tests.
We need to include hurd.h for libc_hidden_proto (__hurd_thread_self),
introduced in b44c1e1252 ("hurd: Fix using interposable
hurd_thread_self")
This the error log:
In file included from <command-line>:
./../include/libc-symbols.h:472:33: error: '__EI___hurd_thread_self' aliased to undefined symbol '__GI___hurd_thread_self'
472 | extern thread __typeof (name) __EI_##name \
| ^~~~~
./../include/libc-symbols.h:468:3: note: in expansion of macro '__hidden_ver2'
468 | __hidden_ver2 (, local, internal, name)
| ^~~~~~~~~~~~~
./../include/libc-symbols.h:476:41: note: in expansion of macro '__hidden_ver1'
476 | # define hidden_def(name) __hidden_ver1(__GI_##name, name, name);
| ^~~~~~~~~~~~~
./../include/libc-symbols.h:557:32: note: in expansion of macro 'hidden_def'
557 | # define libc_hidden_def(name) hidden_def (name)
| ^~~~~~~~~~
thread-self.c:27:1: note: in expansion of macro 'libc_hidden_def'
27 | libc_hidden_def (__hurd_thread_self)
| ^~~~~~~~~~~~~~~
Message-Id: <ZGr6wj2UOxg3F0qH@jupiter.tail36e24.ts.net>
Fixes 85f7554cd9
"Add test case for O_TMPFILE handling in open, openat"
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230520115531.3911877-3-bugaevc@gmail.com>
The __hurd_fail () inline function is the dedicated, idiomatic way of
reporting errors in the Hurd part of glibc. Not only is it more concise
than '{ errno = err; return -1; }', it is since commit
6639cc1002
"hurd: Mark error functions as __COLD" marked with the cold attribute,
telling the compiler that this codepath is unlikely to be executed.
In one case, use __hurd_dfail () over the plain __hurd_fail ().
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230520115531.3911877-1-bugaevc@gmail.com>
Create a private hidden __hurd_thread_self alias, and use that one.
Fixes 2f8ecb58a5
"hurd: Fix x86_64 _hurd_tls_fork" and
c7fcce38c8
"hurd: Make sure to not use tcb->self"
Reported-by: Joseph Myers <joseph@codesourcery.com>
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
4d3f846b88 ("hurd: Fix __TIMESIZE on x86_64") incidentaly dropped it
because it fixed hurd 64bit into setting __TIMESIZE to 64, and that case
was not having gai_suspend defined yet.
Fix LOCALE list formatting.
Sort all reflowed text using scripts/sort-makefile-lines.py.
No code generation changes observed in binary artifacts.
No regressions on x86_64 and i686.
Reflow all long lines adding comment terminators.
Sort all reflowed text using scripts/sort-makefile-lines.py.
No code generation changes observed in binary artifacts.
No regressions on x86_64 and i686.
Reflow all long lines adding comment terminators.
Sort all reflowed text using scripts/sort-makefile-lines.py.
No regressions running microbenchmarks.
No code generation changes observed in binary artifacts.
No regressions on x86_64 and i686.
Reflow all long lines adding comment terminators.
Sort all reflowed text using scripts/sort-makefile-lines.py.
No code generation changes observed in binary artifacts.
No regressions on x86_64 and i686.
Reflow all long lines adding comment terminators.
Rename files that cause inconsistent ordering.
Sort all reflowed text using scripts/sort-makefile-lines.py.
No code generation changes observed in binary artifacts.
No regressions on x86_64 and i686.
Reflow Makefile.
Sort using updated scripts/sort-makefile-lines.py.
Code generation is changed as routines are linked in sorted order
as expected.
No regressions on x86_64 and i686.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reflow Makefile.
Sort using updated scripts/sort-makefile-lines.py.
Code generation is changed as routines are linked in sorted order
as expected.
No regressions on x86_64 and i686.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Sort tests against updated scripts/sort-makefile-lines.py.
No changes in generated code.
No regressions on x86_64 and i686.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Sort tests against updated scripts/sort-makefile-lines.py.
No changes in generated code.
No regressions on x86_64 and i686.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
We must return < 0, 0, or > 0 as the result of the comparison function
for cmp_to_key() to work correctly across all comparisons.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Move content from the Security Process[1] and Security Exceptions[2]
wiki documents into the repository so that it is in a standard place for
analysis tools to look for the glibc security policy.
This is a more or less verbatim port of the wiki document with some
restructuring for a more coherent layout since the two pages are now
merged. There should be no change in messaging in this commit.
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Linux 6.3 adds six HWCAP2_SME* constants for AArch64; add them to the
corresponding bits/hwcap.h in glibc.
Tested with build-many-glibcs.py for aarch64-linux-gnu.
strlen, which is another ifunc-selected function, is invoked during
early static executable startup if the argv arrives from the exec
server. Make it not crash.
Checked on x86_64-gnu: statically linked executables launched after the
exec server is up now start up successfully.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230517191436.73636-10-bugaevc@gmail.com>
On x86_64, we have to pass function arguments in registers, not on the
stack. We also have to align the stack pointer in a specific way. Since
sharing the logic with i386 does not bring much benefit, split the file
back into i386- and x86_64-specific versions, and fix the x86_64 version
to set up the thread properly.
Bonus: i386 keeps doing the extra RPC inside __thread_set_pcsptp to
fetch the state of the thread before setting it; but x86_64 no lnoger
does that.
Checked on x86_64-gnu and i686-gnu.
Fixes be6d002ca2
"hurd: Set up the basic tree for x86_64-gnu"
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230517191436.73636-9-bugaevc@gmail.com>
It is illegal to call thread_get_state () on mach_thread_self (), so
this codepath cannot be used as-is to fork the calling thread's TLS.
Fortunately we can use THREAD_SELF (aka %fs:0x0) to find out the value
of our fs_base without calling into the kernel.
Fixes: f6cf701efc
"hurd: Implement TLS for x86_64"
Checked on x86_64-gnu: fork () now works!
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230517191436.73636-8-bugaevc@gmail.com>
Unlike sigstate->thread, tcb->self did not hold a Mach port reference on
the thread port it names. This means that the port can be deallocated,
and the name reused for something else, without anyone noticing. Using
tcb->self will then lead to port use-after-free.
Fortunately nothing was accessing tcb->self, other than it being
intially set to then-valid thread port name upon TCB initialization. To
assert that this keeps being the case without altering TCB layout,
rename self -> self_do_not_use, and stop initializing it.
Also, do not (re-)allocate a whole separate and unused stack for the
main thread, and just exit __pthread_setup early in this case.
Found upon attempting to use tcb->self and getting unexpected crashes.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230517191436.73636-7-bugaevc@gmail.com>
...instead of mach_setup_thread (), which is unsuitable for setting up
function calls.
Checked on x86_64-gnu: the signal thread no longer crashes upon trying
to process a message.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230517191436.73636-6-bugaevc@gmail.com>
This is just like mach_setup_thread (), but it's suitable for making the
thread call a function correctly, as opposed to explicitly setting the
thread's stack and instruction pointers to the given values. Internally,
it uses MACHINE_THREAD_STATE_SETUP_CALL.
Unlike mach_setup_thread (), which is exported via mach.h for the
benefit of the Hurd exec server, __mach_setup_thread_call () is private
to glibc for the time being.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230517191436.73636-5-bugaevc@gmail.com>
The existing two macros, MACHINE_THREAD_STATE_SET_PC and
MACHINE_THREAD_STATE_SET_SP, can be used to set program counter and the
stack pointer registers in a machine-specific thread state structure.
Useful as it is, this may not be enough to set up the thread to make a
function call, because the machine-specific ABI may impose additional
requirements. In particular, x86_64 ABI requires that upon function
entry, the stack pointer is 8 less than 16-byte aligned (sp & 15 == 8).
To deal with this, introduce a new macro,
MACHINE_THREAD_STATE_SETUP_CALL (), which sets both stack and
instruction pointers, and also applies any machine-specific requirements
to make a valid function call. The default implementation simply
forwards to MACHINE_THREAD_STATE_SET_PC and MACHINE_THREAD_STATE_SET_SP,
but on x86_64 we additionally align the stack pointer.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230517191436.73636-3-bugaevc@gmail.com>
This hasn't caused any problems yet but we are passing a pointer to struct
task_thread_times_info which can cause problems if we populate over the
existing size of the struct.
Message-Id: <ZGRDDNcOM2hA3CuT@jupiter.tail36e24.ts.net>
Reflow Makefile.
Sort using scripts/sort-makefile-lines.py.
Code generation is changed as routines are linked in sorted order
as expected.
No regressions on x86_64 and i686.
The last loop could attempt to overflow beyond INT_MAX on 32-bit
architectures.
Also switch to GNU style.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
This patch updates the kernel version in the tests tst-mman-consts.py,
tst-mount-consts.py and tst-pidfd-consts.py to 6.3. (There are no new
constants covered by these tests in 6.3 that need any other header
changes.)
Tested with build-many-glibcs.py.
So I was able to reproduce the hangs in the original source, and debug
it, and fix it. In doing so, I realized that we can't use anything
complex to trigger the thread because that "anything" might also cause
the expected segfault and force everything out of sync again.
Here's what I ended up with, and it doesn't seem to hang where the
original one hung quite often (in a tight while..end loop). The key
changes are:
1. Calls to futex are error checked, with retries, to ensure that the
futexes are actually doing what they're supposed to be doing. In the
original code, nearly every futex call returned an error.
2. The main loop has checks for whether the thread ran or not, and
"unlocks" the thread if it didn't (this is how the original source
hangs).
Note: the usleep() is not for timing purposes, but just to give the
kernel an excuse to run the other thread at that time. The test will
not hang without it, but is more likely to test the right bugfix
if the usleep() is present.
Test minimum and maximum long long values, zero, 32bit crossover points, and
part of the range of long long values. Use '-fno-builtin' to ensure we are
testing the implementation.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Test minimum and maximum long values, zero, and part of the range
of long values. Use '-fno-builtin' to ensure we are testing the
implementation.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Test minimum and maximum int values, zero, and part of the range
of int values. Use '-fno-builtin' to ensure we are testing the
implementation.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
The real i386_thread_state Mach structure has an alignment of 8 on
x86_64. However, in struct sigcontext, the compiler was packing sc_gs
(which is the first member of sc_i386_thread_state) into the same 8-byte
slot as sc_error; this resulted in the rest of sc_i386_thread_state
members having wrong offsets relative to each other, and the overall
sc_i386_thread_state layout mismatching that of i386_thread_state.
Fix this by explicitly adding the required padding members, and
statically asserting that this results in the desired alignment.
The same goes for sc_i386_float_state.
Checked on x86_64-gnu.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230515083323.1358039-4-bugaevc@gmail.com>
sizeof (*stackframe) appears to be divisible by 16, but we should not
rely on that. So make sure to leave enough space for the stackframe
first, and then align the final pointer at 16 bytes.
Checked on x86_64-gnu.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230515083323.1358039-3-bugaevc@gmail.com>
Fixes 60f9bf9746
"hurd: Port trampoline.c to x86_64"
Checked on x86_64-gnu.
Reported-by: Bruno Haible <bruno@clisp.org>
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230515083323.1358039-2-bugaevc@gmail.com>
Reflow Makefile.
Sort using scripts/sort-makefile-lines.py.
No code generation changes observed in binary artifacts.
No regressions on x86_64 and i686.
Reflow Makefile.
Sort using scripts/sort-makefile-lines.py.
No code generation changes observed in binary artifacts.
No regressions on x86_64 and i686.
Reflow Makefile.
Sort using scripts/sort-makefile-lines.py.
No code generation changes observed in binary artifacts.
No regressions on x86_64 and i686.
Reflow Makefile.
Sort using scripts/sort-makefile-lines.py.
Code generation is changed as routines are linked in sorted order
as expected.
No regressions on x86_64 and i686.
Reflow Makefile.
Sort using scripts/sort-makefile-lines.py.
Code generation is changed as routines are linked in sorted order
as expected.
No regressions on x86_64 and i686.
Reflow Makefile.
Sort using scripts/sort-makefile-lines.py.
Code generation is changed as routines are linked in sorted order
as expected.
No regressions on x86_64 and i686.
Fix list terminator whitspace.
Sort using scripts/sort-makefile-lines.py.
No code generation changes observed in binary artifacts.
No regressions on x86_64 and i686.
Fix list terminator whitspace.
Sort using scripts/sort-makefile-lines.py.
No code generation changes observed in binary artifacts.
No regressions on x86_64 and i686.
Calling fclose or freopen with a null FILE * is undefined behavior, and
doing so in practice will cause a SIGSEGV. So it seems suitable for
__nonnull.
This will help the compiler to warn for some buggy code, like
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109570.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This change harmonizes the declaration and use of `is_nscd' and fixes a
build failure with the "--enable-static-nss --enable-nscd"
configuration options due to `is_nscd' being used undeclared.
The purpose of `is_nscd' is to avoid (nss <-> nscd) recursion in
dynamically linked libc (SHARED) that is nscd-aware (USE_NSCD), and so
its declaration and use should be guarded by the definition of those
macros.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Linux 6.3 has no new syscalls. Update the version number in
syscall-names.list to reflect that it is still current for 6.3.
Tested with build-many-glibcs.py.
While mach/kern_return.h happens to pull mach/machine/kern_return.h,
mach/machine/boolean.h, and mach/machine/vm_types.h (and realpath-ing them
exposes the machine-specific machine symlink content), those headers do not
actually define anything machine-specific for the content of errno.h.
So we can just rule out these machine-specific from the dependency
comment.
We already did the same change for Hurd
(https://git.savannah.gnu.org/cgit/hurd/hurd.git/commit/?id=ef5924402864ef049f40a39e73967628583bc1a4)
Due to MiG requiring the subsystem to be defined early in order to know the
size of a port, this was causing a division by zero error during ./configure.
We could have just move subsystem to the top of the snippet, however it is
simpler to just remove the check given that we have no plans to use some other
MiG anyway.
HAVE_MIG_RETCODE is removed completely since this will be a no-op either
way (compiling against old Hurd headers will work the same, new Hurd
headers will result in the same stubs since retcode is a no-op).
Message-Id: <ZFspor91aoMwbh9T@jupiter.tail36e24.ts.net>
This patch redirects the error functions to the appropriate
longdouble variants which enables the compiler to optimize
for the abi ieeelongdouble.
Signed-off-by: Sachin Monga <smonga@linux.ibm.com>
Reflow all long lines adding comment terminators.
Sort all reflowed text using scripts/sort-makefile-lines.py.
No code generation changes observed in binary artifacts.
No regressions on x86_64 and i686.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
The scripts/sort-makefile-lines.py script sorts Makefile variables
according to project expected order.
The script can be used like this:
$ scripts/sort-makefile-lines.py < elf/Makefile > elf/Makefile.tmp
$ mv elf/Makefile.tmp elf/Makefile
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
This patch adds the strict checking for power-of-two alignments
in aligned_alloc(), and updates the manual accordingly.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
All of the mentioned variables are gone. gcc is just the default and
argv[1] can be used instead. /usr/include isn't hard-coded and you can
pass argv[2] with -I... to adjust.
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
The argument to @deftypefun must be on a single line.
Also add the missing @safety for sem_clockwait.
Reported-by: Nilgün Belma Bugüner <nillguine@gmail.com>
Summary of changes:
- Use BAD_TYPECHECK to perform type checking in a cleaner way.
BAD_TYPECHECK is moved into sysdeps/mach/rpc.h to avoid duplication.
- Remove assertions for mach_msg_type_t since those won't work for
x86_64.
- Update message structs to use mach_msg_type_t directly.
- Use designated initializers.
Message-Id: <ZFa+roan3ioo0ONM@jupiter.tail36e24.ts.net>
Reduce the usage of alloca() to the bare minimum to avoid the potential
for stack overflow. Use __strndup to simplify the code.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Summary of the changes:
- Introduce BAD_TYPECHECK from MiG to make it simpler to do type
checking.
- Replace int type with mach_msg_type_t. This assumes that
mach_msg_type_t is always the same size as int which is not true for
x86_64.
- Calculate the size and align using PTR_ALIGN_UP, which is a bit
cleaner and similar to what we do elsewhere.
- Define mach_msg_type_t to check using designated initializers.
Message-Id: <ZFMvrIkvoCSxqB/C@jupiter.tail36e24.ts.net>
Summary of the changes:
- Update msg_align to use ALIGN_UP like we have done in previous
patches. Use it below whenever necessary to avoid repeating the same
alignment logic.
- Define BAD_TYPECHECK to make it easier to do type checking in a few
places below.
- Update io2mach_type to use designated initializers.
- Make RetCodeType use mach_msg_type_t. mach_msg_type_t is 8 byte for
x86_64, so this make it portable.
- Also call msg_align for _IOT_COUNT2/_IOT_TYPE2 since it is more
correct.
Message-Id: <ZFMvVsuFKwIy2dUS@jupiter.tail36e24.ts.net>
arm_sve.h depends on stdint.h but that relies on libc headers unless
compiled in freestanding mode. Without this change a bootstrap glibc
build (that uses a compiler without installed libc headers) failed with
checking for availability of SVE ACLE... In file included from [...]/arm_sve.h:28,
from conftest.c:1:
[...]/stdint.h:9:16: fatal error: stdint.h: No such file or directory
9 | # include_next <stdint.h>
| ^~~~~~~~~~
compilation terminated.
configure: error: mathvec is enabled but compiler does not have SVE ACLE. [...]
This patch enables libmvec on AArch64. The proposed change is mainly
implementing build infrastructure to add the new routines to ABI,
tests and benchmarks. I have demonstrated how this all fits together
by adding implementations for vector cos, in both single and double
precision, targeting both Advanced SIMD and SVE.
The implementations of the routines themselves are just loops over the
scalar routine from libm for now, as we are more concerned with
getting the plumbing right at this point. We plan to contribute vector
routines from the Arm Optimized Routines repo that are compliant with
requirements described in the libmvec wiki.
Building libmvec requires minimum GCC 10 for SVE ACLE. To avoid raising
the minimum GCC by such a big jump, we allow users to disable libmvec
if their compiler is too old.
Note that at this point users have to manually call the vector math
functions. This seems to be acceptable to some downstream users.
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
dev_t are 64bit on Linux ports, so better increase their size on 64bit
Hurd. It happens that this helps with BZ 23084 there: st_dev has type fsid_t
(quad) and is specified by POSIX to have type dev_t. Making dev_t 64bit
makes these match.
This patch updates build-many-glibcs.py to use Linux 6.3 and GCC 13
branch by default.
Tested with build-many-glibcs.py (host-libraries, compilers and glibc
builds).
GCC docs explicitly list perror () as a good candidate for using
__attribute__ ((cold)). So apply __COLD to perror () and similar
functions.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230429131223.2507236-3-bugaevc@gmail.com>
include/regex.h had not been updated during the int -> Idx transition,
and the prototypes don't matched the definitions in regexec.c.
In regcomp.c, most interfaces were updated for Idx, except for two ones
guarded by #if _LIBC.
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
The standards want msg_lspid/msg_lrpid/shm_cpid/shm_lpid to be pid_t, see BZ
23083 and 23085.
We can leave them __rpc_pid_t on i386 for ABI compatibility, but avoid
hitting the issue on 64bit.
The standards want uid/cuid to be uid_t, gid/cgid to be gid_t and mode to be
mode_t, see BZ 23082.
We can leave them short ints on i386 for ABI compatibility, but avoid
hitting the issue on 64bit.
bits/ipc.h ends up being exactly the same in sysdeps/gnu/ and
sysdeps/unix/sysv/linux/, so remove the latter.
The standards want l_type and l_whence to be short ints, see BZ 23081.
We can leave them ints on i386 for ABI compatibility, but avoid hitting the
issue on 64bit.
cmsg_len is supposed to be socklen_t according to standards, but it was made
size_t on Linux, see BZ 16919. For ports that have it socklen_t, SIZE_MAX is
too large. We can however explicitly cast it to the type of cmsg_len so it
will fit according to that type.
These were created by creating stub files, running 'make update-abi',
and reviewing the results.
Also, set baseline ABI to GLIBC_2.38, the (upcoming) first glibc
release to first have x86_64-gnu support.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
If we're trying to interrupt an interruptible RPC, but the server fails
to respond to our __interrupt_operation () call, we instead destroy the
reply port we were expecting the reply to the RPC on.
Instead of deallocating the name completely, replace it with a dead
name, so the name won't get reused for some other right, and deallocate
it in _hurd_intr_rpc_mach_msg once we return from the signal handler.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230429201822.2605207-4-bugaevc@gmail.com>
We make lib{mach,hurd}user.so only call __mig_strlen which can be
relocated before libc.so is relocated, similar to what is done with
__mig_memcpy.
Message-Id: <ZE8DTRDpY2hpPZlJ@jupiter.tail36e24.ts.net>
Normally, in static builds, the first code that runs is _start, in e.g.
sysdeps/x86_64/start.S, which quickly calls __libc_start_main, passing
it the argv etc. Among the first things __libc_start_main does is
initializing the tunables (based on env), then CPU features, and then
calls _dl_relocate_static_pie (). Specifically, this runs ifunc
resolvers to pick, based on the CPU features discovered earlier, the
most suitable implementation of "string" functions such as memcpy.
Before that point, calling memcpy (or other ifunc-resolved functions)
will not work.
In the Hurd port, things are more complex. In order to get argv/env for
our process, glibc normally needs to do an RPC to the exec server,
unless our args/env are already located on the stack (which is what
happens to bootstrap processes spawned by GNU Mach). Fetching our
argv/env from the exec server has to be done before the call to
__libc_start_main, since we need to know what our argv/env are to pass
them to __libc_start_main.
On the other hand, the implementation of the RPC (and other initial
setup needed on the Hurd before __libc_start_main can be run) is not
very trivial. In particular, it may (and on x86_64, will) use memcpy.
But as described above, calling memcpy before __libc_start_main can not
work, since the GOT entry for it is not yet initialized at that point.
Work around this by pre-filling the GOT entry with the baseline version
of memcpy, __memcpy_sse2_unaligned. This makes it possible for early
calls to memcpy to just work. The initial value of the GOT entry is
unused on x86_64, and changing it won't interfere with the relocation
being performed later: once _dl_relocate_static_pie () is called, the
baseline version will get replaced with the most suitable one, and that
is what subsequent calls of memcpy are going to call.
Checked on x86_64-gnu.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230429201822.2605207-6-bugaevc@gmail.com>
Checked on x86_64-gnu.
[samuel.thibault@ens-lyon.org: Restored same comments as on i386]
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230429201822.2605207-3-bugaevc@gmail.com>
We need to align on uintptr_t to make this work for x86_64,
otherwise things will go wrong when RPCs return errors.
Message-Id: <ZE3aPH7uCEDti47H@jupiter.tail36e24.ts.net>
This should hopefully hint the compiler that they are unlikely
to be called.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230429131223.2507236-2-bugaevc@gmail.com>
This expands to __attribute__ ((cold)) when supported. It should be
used to mark up functions that are invoked rarely.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
We need to set file_name, not update retryname. This is what the other
branches do.
Before this change, any attempt to access such a file would segfault due
to file_name being unset:
$ settrans -ac /tmp/my-machtype /hurd/magic machtype
$ cat /tmp/my-machtype
Segmentation fault
Checked on i686-gnu.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230429131354.2507443-7-bugaevc@gmail.com>
If the process has set the close-on-exec flag for the file descriptor,
it expects the file descriptor to get closed on exec, even if we replace
what the file descriptor refers to.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230429131354.2507443-6-bugaevc@gmail.com>
If any of the early boot-up tasks calls exit () or returns from main (),
terminate it properly instead of crashing on trying to dereference
_hurd_ports and getting forcibly terminated by the kernel.
We sadly cannot make the __USEPORT macro do the check for _hurd_ports
being unset, because it evaluates to the value of the expression
provided as the second argument, and that can be of any type; so there
is no single suitable fallback value for the macro to evaluate to in
case _hurd_ports is unset. Instead, each use site that wants to care for
this case will have to do its own checking.
Checked on x86_64-gnu.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230429131354.2507443-4-bugaevc@gmail.com>
Each libc_hidden_def should be placed immediately next to its function,
not in some random unrelated place.
No functional change.
Fixes: 653d74f12a
"hurd: Global signal disposition"
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230429131354.2507443-2-bugaevc@gmail.com>
This block of code was doing exactly what _hurd_self_sigstate does; so
just call that and let it do its job.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230429131354.2507443-1-bugaevc@gmail.com>
There are reports for hang in __check_pf:
https://github.com/JoeDog/siege/issues/4
It is reproducible only under specific configurations:
1. Large number of cores (>= 64) and large number of threads (> 3X of
the number of cores) with long lived socket connection.
2. Low power (frequency) mode.
3. Power management is enabled.
While holding lock, __check_pf calls make_request which calls __sendto
and __recvmsg. Since __sendto and __recvmsg are cancellation points,
lock held by __check_pf won't be released and can cause deadlock when
thread cancellation happens in __sendto or __recvmsg. Add a cancellation
cleanup handler for __check_pf to unlock the lock when cancelled by
another thread. This fixes BZ #20975 and the siege hang issue.
__GLIBC_FLT_EVAL_METHOD will effect the definition of float_t and
double_t, currently we'll set __GLIBC_FLT_EVAL_METHOD to 2 when
__FLT_EVAL_METHOD__ is -1, that means we'll define float_t and double_t
to long double.
However some target isn't natively (HW) support long double like AArch64 and
RISC-V, they defined long double as 128-bits IEEE 754 floating point type.
That means setting __GLIBC_FLT_EVAL_METHOD to 2 will cause very inefficient
code gen for those target who didn't provide native support for long
double, and that's violate the spirit float_t and double_t - most efficient
types at least as wide as float and double.
So this patch propose to remap __GLIBC_FLT_EVAL_METHOD to 0 rather than
2 when __FLT_EVAL_METHOD__ is -1, which means we'll use float/double
rather than long double for float_t and double_t.
Note: __FLT_EVAL_METHOD__ == -1 means the precision is indeterminable,
which means compiler might using indeterminable precision during
optimization/code gen, clang will set this value to -1 when fast
math is enabled.
Note: Default definition float_t and double_t in current glibc:
| __GLIBC_FLT_EVAL_METHOD | float_t | double_t
| 0 or 16 | float | double
| 1 | double | doulbe
| 2 | long double | long double
More complete list see math/math.h
Note: RISC-V has defined ISA extension to support 128-bits IEEE 754
floating point operations, but only rare RISC-V core will implement that.
Related link:
[1] LLVM issue (__FLT_EVAL_METHOD__ is set to -1 with Ofast. #60781):
https://github.com/llvm/llvm-project/issues/60781
[2] Last version of this patch: https://sourceware.org/pipermail/libc-alpha/2023-February/145622.html
Acked-by: Palmer Dabbelt <palmer@rivosinc.com> # RISC-V
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Link: https://inbox.sourceware.org/libc-alpha/20230314151948.12892-1-kito.cheng@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
In some cases, we do not want to go through the resolver for function
calls. For example, functions with vector arguments will use vector
registers to pass arguments. In the resolver, we do not save/restore the
vector argument registers for lazy binding efficiency. To avoid ruining
the vector arguments, functions with vector arguments will not go
through the resolver.
To achieve the goal, we will annotate the function symbols with
STO_RISCV_VARIANT_CC flag and add DT_RISCV_VARIANT_CC tag in the dynamic
section. In the first pass on PLT relocations, we do not set up to call
_dl_runtime_resolve. Instead, we resolve the functions directly.
Signed-off-by: Hsiangkai Wang <kai.wang@sifive.com>
Signed-off-by: Vincent Chen <vincent.chen@sifive.com>
Reviewed-by: Palmer Dabbelt <palmer@rivosinc.com>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Link: https://inbox.sourceware.org/libc-alpha/20230314162512.35802-1-kito.cheng@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Support for SFrame format is available in Binutils 2.40. The GNU ld merges
the input .sframe sections and creates an output .sframe section in a
segment PT_GNU_SFRAME.
The build of glibc for i686-gnu has been failing for a while with GCC
mainline / GCC 13:
../sysdeps/mach/hurd/getcwd.c: In function '__hurd_canonicalize_directory_name_internal':
../sysdeps/mach/hurd/getcwd.c:242:48: error: pointer 'file_name' may be used after 'realloc' [-Werror=use-after-free]
242 | file_namep = &buf[file_namep - file_name + size / 2];
| ~~~~~~~~~~~^~~~~~~~~~~
../sysdeps/mach/hurd/getcwd.c:236:25: note: call to 'realloc' here
236 | buf = realloc (file_name, size);
| ^~~~~~~~~~~~~~~~~~~~~~~~~
Fix by doing the subtraction before the reallocation.
Tested with build-many-glibcs.py for i686-gnu.
[samuel.thibault@ens-lyon.rg: Removed mention of this being a bug]
Message-Id: <18587337-7815-4056-ebd0-724df262d591@codesourcery.com>
Since asprintf is called "if (mask & XPG_NORM_CODESET)" there is no
point in checking the mask again within the asprintf call.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
State that getpt is similar to posix_openpt. Use posix_openpt
instead of getpt in example.
Signed-off-by: Gavin Smith <gavinsmith0123@gmail.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This checks that:
* We can send and receive fds over Unix domain sockets using SCM_RIGHTS;
* msg_controllen, cmsg_level, cmsg_type, cmsg_len are all filled in
correctly on receive;
* Most importantly, the received fd has or has not the close-on-exec
flag set depending on whether we pass MSG_CMSG_CLOEXEC to recvmsg ().
Checked on i686-gnu and x86_64-linux-gnu.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230423160548.126576-4-bugaevc@gmail.com>
As fixed in 0822e3552a ("hurd: Don't pass FD_CLOEXEC in CMSG_DATA"),
senders currently don't have any flag to pass. We shouldn't blindly take
random flags that senders could be erroneously giving us.
This is a new flag that can be passed to recvmsg () to make it
atomically set the CLOEXEC flag on all the file descriptors received
using the SCM_RIGHTS mechanism. This is useful for all the same reasons
that the other XXX_CLOEXEC flags are useful: namely, it provides
atomicity with respect to another thread of the same process calling
(fork and then) exec at the same time.
This flag is already supported on Linux and FreeBSD. The flag's value,
0x40000, is choosen to match FreeBSD's.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230423160548.126576-2-bugaevc@gmail.com>
The flags are used by _hurd_intern_fd, which takes O_* flags, not FD_*.
Also, it is of no concern to the receiving process whether or not
the sender process wants to close its copy of sent file descriptor
upon exec, and it should not influence whether or not the received
file descriptor gets the FD_CLOEXEC flag set in the receiving process.
The latter should in fact be dependent on the MSG_CMSG_CLOEXEC flag
being passed to the recvmsg () call, which is going to be implemented
in the following commit.
Fixes 344e755248
"hurd: Support sending file descriptors over Unix sockets"
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
This makes the prefer_map_32bit_exec tunable no longer Linux-specific.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230423215526.346009-4-bugaevc@gmail.com>
This is a flag that can be passed to mmap () to request that the mapping
being established should be located in the lower 2 GB area of the
address space, so only the lower 31 (not 32) bits can be set in its
address, and the address can be represented as a 32-bit integer without
truncating it.
This flag is intended to be compatible with Linux, FreeBSD, and Darwin
flags of the same name. Out of those systems, it appears Linux and
FreeBSD take MAP_32BIT to mean "map 31 bit", whereas Darwin allows the
32nd bit to be set in the address as well. The Hurd follows Linux and
FreeBSD behavior.
Unlike on those systems, on the Hurd MAP_32BIT is defined on all
supported architectures (which currently are only i386 and x86_64).
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230423215526.346009-1-bugaevc@gmail.com>
When opening a temporary file without O_CLOEXEC we risk leaking the
file descriptor if another thread calls (fork and then) exec while we
have the fd open. Fix this by consistently passing O_CLOEXEC everywhere
where we open a file for internal use (and not to return it to the user,
in which case the API defines whether or not the close-on-exec flag
shall be set on the returned fd).
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230419160207.65988-4-bugaevc@gmail.com>
This is nicer, and is going to be required for the following changes
to reasonably stay within the 79 column limit.
No functional change.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Message-Id: <20230419160207.65988-2-bugaevc@gmail.com>
Copy strncpy tests for strndup. Covers some basic testcases with random
strings. Remove tests that set the destination's bytes and checked the
resulting buffer's bytes. Remove wide character test support since
wcsndup() doesn't exist.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Copy strcpy tests for strdup. Covers some basic testcases with random
strings. Add a zero-length string testcase.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Mark two variables as unused to silence warning when using
test-string.h for non-ifunc implementations.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Properly differentiate between setting up the real TLS with
TLS_INIT_TP, and setting up the early TLS (__init1_tcbhead) in static
builds. In the latter case, don't yet migrate the reply port into the
TCB, and don't yet set __libc_tls_initialized to 1.
This also lets us move the __init1_desc assignment inside
_hurd_tls_init ().
Fixes cd019ddd89
"hurd: Don't leak __hurd_reply_port0"
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Created tunable glibc.pthread.stack_hugetlb to control when hugepages
can be used for stack allocation.
In case THP are enabled and glibc.pthread.stack_hugetlb is set to
0, glibc will madvise the kernel not to use allow hugepages for stack
allocations.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Based on these comments in malloc.c:
size field is or'ed with NON_MAIN_ARENA if the chunk was obtained
from a non-main arena. This is only set immediately before handing
the chunk to the user, if necessary.
The NON_MAIN_ARENA flag is never set for unsorted chunks, so it
does not have to be taken into account in size comparisons.
When we pull a chunk off the unsorted list (or any list) we need to
make sure that flag is set properly before returning the chunk.
Use the rounded-up size for chunk_ok_for_memalign()
Do not scan the arena for reusable chunks if there's no arena.
Account for chunk overhead when determining if a chunk is a reuse
candidate.
mcheck interferes with memalign, so skip mcheck variants of
memalign tests.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
_hurd_thread_sigstate () already handles finding an existing sigstate
before allocating a new one, so just use that. Bonus: this will only
lock the _hurd_siglock once.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Much as the comment says, things on _hurd_subinit assume that _hurd_pid
is already initialized by the time _hurd_subinit is run, so
_hurd_proc_subinit has to run before it. Specifically, init_dtable ()
calls _hurd_port2fd (), which uses _hurd_pid and _hurd_pgrp to set up
ctty handling. With _hurd_subinit running before _hurd_proc_subinit,
ctty setup was broken:
13<--33(pid1255)->term_getctty () = 0 4<--39(pid1255)
task16(pid1255)->mach_port_deallocate (pn{ 10}) = 0
13<--33(pid1255)->term_open_ctty (0 0) = 0x40000016 (Invalid argument)
Fix this by running the _hurd_proc_subinit hook in the correct place --
just after _hurd_portarray is set up (so the proc server port is
available in its usual place) and just before running _hurd_subinit.
Fixes 1ccbb9258e
("hurd: Notify the proc server later during initialization").
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
We must not use the user's reply port (scp->sc_reply_port) for any of
our own RPCs, otherwise various things break. So, use MACH_PORT_DEAD as
a reply port when destroying our reply port, and make sure to do this
after _hurd_sigstate_unlock (), which may do a gsync_wake () RPC.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
It is common to have (some of) stdin, stdout and stderr point to the
very same port. We were making the ctty RPCs that _hurd_port2fd () does
for each one of them separately:
1. term_getctty ()
2. mach_port_deallocate ()
3. term_open_ctty ()
Instead, let's detect this case and duplicate the ctty port we already
have. This means we do 1 RPC instead of 3 (and create a single protid
on the server side) if the file is our ctty, and no RPCs instead of 1
if it's not. A clear win!
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Optimize the fast paths (x < y) and (x/y < 2^12). Delay handling of special
cases to reduce the number of instructions executed before the fast paths.
Performance improvements for fmod:
Skylake Zen2 Neoverse V1
subnormals 11.8% 4.2% 11.5%
normal 3.9% 0.01% -0.5%
close-exponents 6.3% 5.6% 19.4%
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Adjust iteration counts so benchmarks don't run too slowly or quickly.
Ensure benchmarks take less than 10 seconds on older, slower cores and
more than 0.5 seconds on fast cores.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
When glibc is built as a shared library, TLS is always initialized by
the call of TLS_INIT_TP () macro made inside the dynamic loader, prior
to running the main program (see dl-call_tls_init_tp.h). We can take
advantage of this: we know for sure that __LIBC_NO_TLS () will evaluate
to 0 in all other cases, so let the compiler know that explicitly too.
Also, only define _hurd_tls_init () and TLS_INIT_TP () under the same
conditions (either !SHARED or inside rtld), to statically assert that
this is the case.
Other than a microoptimization, this also helps with avoiding awkward
sharing of the __libc_tls_initialized variable between ld.so and libc.so
that we would have to do otherwise -- we know for sure that no sharing
is required, simply because __libc_tls_initialized would always be set
to true inside libc.so.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230319151017.531737-25-bugaevc@gmail.com>
Now that the signal code no longer accesses it, the only real user of it
was mig-reply.c, so move the logic for managing the port there.
If we're in SHARED and outside of rtld, we know that __LIBC_NO_TLS ()
always evaluates to 0, and a TLS reply port will always be used, not
__hurd_reply_port0. Still, the compiler does not see that
__hurd_reply_port0 is never used due to its address being taken. To deal
with this, explicitly compile out __hurd_reply_port0 when we know we
won't use it.
Also, instead of accessing the port via THREAD_SELF->reply_port, this
uses THREAD_GETMEM and THREAD_SETMEM directly, avoiding possible
miscompilations.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
The content of the pool array is initialized only until pool_size,
pointers between pool_size and pool_max_size were not initialized by the
realloc call in get_elem so they should not be freed.
This fixes aio tests crashing at their termination on GNU/Hurd.
This reverts commit b37899d34d.
Apparently we load libc.so (and thus start using its functions) before
calling TLS_INIT_TP, so libc.so functions should not actually assume
that TLS is always set up.
Previously, once we set up TLS, we would implicitly switch from using
__hurd_reply_port0 to reply_port inside the TCB, leaving the former
unused. But we never deallocated it, so it got leaked.
Instead, migrate the port into the new TCB's reply_port slot. This
avoids both the port leak and an extra syscall to create a new reply
port for the TCB.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230319151017.531737-28-bugaevc@gmail.com>
If we're doing signals, that means we've already got the signal thread
running, and that implies TLS having been set up. So we know that
__hurd_local_reply_port will resolve to THREAD_SELF->reply_port, and can
access that directly using the THREAD_GETMEM and THREAD_SETMEM macros.
This avoids potential miscompilations, and should also be a tiny bit
faster.
Also, use mach_port_mod_refs () and not mach_port_destroy () to destroy
the receive right. mach_port_destroy () should *never* be used on
mach_task_self (); this can easily lead to port use-after-free
vulnerabilities if the task has any other references to the same port.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230319151017.531737-26-bugaevc@gmail.com>
When glibc is built as a shared library, TLS is always initialized by
the call of TLS_INIT_TP () macro made inside the dynamic loader, prior
to running the main program (see dl-call_tls_init_tp.h). We can take
advantage of this: we know for sure that __LIBC_NO_TLS () will evaluate
to 0 in all other cases, so let the compiler know that explicitly too.
Also, only define _hurd_tls_init () and TLS_INIT_TP () under the same
conditions (either !SHARED or inside rtld), to statically assert that
this is the case.
Other than a microoptimization, this also helps with avoiding awkward
sharing of the __libc_tls_initialized variable between ld.so and libc.so
that we would have to do otherwise -- we know for sure that no sharing
is required, simply because __libc_tls_initialized would always be set
to true inside libc.so.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230319151017.531737-25-bugaevc@gmail.com>
These are just regular local variables that are not accessed in any
funny ways, not even though a pointer. There's absolutely no reason to
declare them volatile. It only ends up hurting the quality of the
generated machine code.
If anything, it would make sense to decalre sigsp as *pointing* to
volatile memory (volatile void *sigsp), but evidently that's not needed
either.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230403115621.258636-2-bugaevc@gmail.com>
This is based on the Linux port's version, but laid out to match Mach's
struct i386_thread_state, much like the i386 version does.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
* manual/string.texi (Truncating Strings): Update obsolescent
reference and use the more-generic term “AddressSanitizer”.
Mention fortification, too. -fcheck-pointer-bounds is no longer
supported.
* manual/string.texi: Editorial fixes. Do not say “text” when
“string” or “string contents” is meant, as a C string can contain
bytes that are not valid text in the current encoding.
When warning about strcat efficiency, warn similarly about strncat
and wcscat. “coping” → “copying”.
Mention at the start of the two problematic sections that problems
are discussed at section end.
FreeBSD makes these functions available by default, so we should
not treat them as GNU-specific and restrict them to _GNU_SOURCE.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
FreeBSD makes them available by default, too, so there does not seem
to be a reason to restrict these functions to _GNU_SOURCE.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Rename x86_cpu_INDEX_7_ECX_1 to x86_cpu_INDEX_7_ECX_15 for the unused bit
15 in ECX from CPUID with EAX == 0x7 and ECX == 0.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
sysdeps/mach/hurd/htl/pt-pthread_self.c: New file.
htl/Makefile: .. Add it to libc routine.
sysdeps/mach/hurd/htl/pt-sysdep.c(__pthread_self): Remove it.
sysdeps/mach/hurd/htl/pt-sysdep.h(__pthread_self): Add hidden propertie.
htl/Versions(__pthread_self) Version it as private symbol.
Signed-off-by: Guy-Fleury Iteriteka <gfleury@disroot.org>
Message-Id: <20230318095826.1125734-3-gfleury@disroot.org>
htl/pt-nthreads.c: new file.
htl/Makefile: Add it to routine.
htl/Versions: version it as private libc symbol.
htl/pt-create.c: remove his definition here.
htl/pt-internal.h: add propertie to it declaration.
Signed-off-by: Guy-Fleury Iteriteka <gfleury@disroot.org>
Message-Id: <20230318095826.1125734-2-gfleury@disroot.org>
As indicated by sparc kernel-features.h, even though sparc64 defines
__NR_pause, it is not supported (ENOSYS). Always use ppoll or the
64 bit time_t variant instead.
The error handling is moved to sysdeps/ieee754 version with no SVID
support. The compatibility symbol versions still use the wrapper
with SVID error handling around the new code. There is no new symbol
version nor compatibility code on !LIBM_SVID_COMPAT targets
(e.g. riscv).
The ia64 is unchanged, since it still uses the arch specific
__libm_error_region on its implementation. For both i686 and m68k,
which provive arch specific implementation, wrappers are added so
no new symbol are added (which would require to change the
implementations).
It shows an small improvement, the results for fmod:
Architecture | Input | master | patch
-----------------|-----------------|----------|--------
x86_64 (Ryzen 9) | subnormals | 12.5049 | 9.40992
x86_64 (Ryzen 9) | normal | 296.939 | 296.738
x86_64 (Ryzen 9) | close-exponents | 16.0244 | 13.119
aarch64 (N1) | subnormal | 6.81778 | 4.33313
aarch64 (N1) | normal | 155.620 | 152.915
aarch64 (N1) | close-exponents | 8.21306 | 5.76138
armhf (N1) | subnormal | 15.1083 | 14.5746
armhf (N1) | normal | 244.833 | 241.738
armhf (N1) | close-exponents | 21.8182 | 22.457
Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
This uses a new algorithm similar to already proposed earlier [1].
With x = mx * 2^ex and y = my * 2^ey (mx, my, ex, ey being integers),
the simplest implementation is:
mx * 2^ex == 2 * mx * 2^(ex - 1)
while (ex > ey)
{
mx *= 2;
--ex;
mx %= my;
}
With mx/my being mantissa of double floating pointer, on each step the
argument reduction can be improved 8 (which is sizeof of uint32_t minus
MANTISSA_WIDTH plus the signal bit):
while (ex > ey)
{
mx << 8;
ex -= 8;
mx %= my;
} */
The implementation uses builtin clz and ctz, along with shifts to
convert hx/hy back to doubles. Different than the original patch,
this path assume modulo/divide operation is slow, so use multiplication
with invert values.
I see the following performance improvements using fmod benchtests
(result only show the 'mean' result):
Architecture | Input | master | patch
-----------------|-----------------|----------|--------
x86_64 (Ryzen 9) | subnormals | 17.2549 | 12.0318
x86_64 (Ryzen 9) | normal | 85.4096 | 49.9641
x86_64 (Ryzen 9) | close-exponents | 19.1072 | 15.8224
aarch64 (N1) | subnormal | 10.2182 | 6.81778
aarch64 (N1) | normal | 60.0616 | 20.3667
aarch64 (N1) | close-exponents | 11.5256 | 8.39685
I also see similar improvements on arm-linux-gnueabihf when running on
the N1 aarch64 chips, where it a lot of soft-fp implementation (for
modulo, and multiplication):
Architecture | Input | master | patch
-----------------|-----------------|----------|--------
armhf (N1) | subnormal | 11.6662 | 10.8955
armhf (N1) | normal | 69.2759 | 34.1524
armhf (N1) | close-exponents | 13.6472 | 18.2131
Instead of using the math_private.h definitions, I used the
math_config.h instead which is used on newer math implementations.
Co-authored-by: kirill <kirill.okhotnikov@gmail.com>
[1] https://sourceware.org/pipermail/libc-alpha/2020-November/119794.html
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
This uses a new algorithm similar to already proposed earlier [1].
With x = mx * 2^ex and y = my * 2^ey (mx, my, ex, ey being integers),
the simplest implementation is:
mx * 2^ex == 2 * mx * 2^(ex - 1)
while (ex > ey)
{
mx *= 2;
--ex;
mx %= my;
}
With mx/my being mantissa of double floating pointer, on each step the
argument reduction can be improved 11 (which is sizeo of uint64_t minus
MANTISSA_WIDTH plus the signal bit):
while (ex > ey)
{
mx << 11;
ex -= 11;
mx %= my;
} */
The implementation uses builtin clz and ctz, along with shifts to
convert hx/hy back to doubles. Different than the original patch,
this path assume modulo/divide operation is slow, so use multiplication
with invert values.
I see the following performance improvements using fmod benchtests
(result only show the 'mean' result):
Architecture | Input | master | patch
-----------------|-----------------|----------|--------
x86_64 (Ryzen 9) | subnormals | 19.1584 | 12.5049
x86_64 (Ryzen 9) | normal | 1016.51 | 296.939
x86_64 (Ryzen 9) | close-exponents | 18.4428 | 16.0244
aarch64 (N1) | subnormal | 11.153 | 6.81778
aarch64 (N1) | normal | 528.649 | 155.62
aarch64 (N1) | close-exponents | 11.4517 | 8.21306
I also see similar improvements on arm-linux-gnueabihf when running on
the N1 aarch64 chips, where it a lot of soft-fp implementation (for
modulo, clz, ctz, and multiplication):
Architecture | Input | master | patch
-----------------|-----------------|----------|--------
armhf (N1) | subnormal | 15.908 | 15.1083
armhf (N1) | normal | 837.525 | 244.833
armhf (N1) | close-exponents | 16.2111 | 21.8182
Instead of using the math_private.h definitions, I used the
math_config.h instead which is used on newer math implementations.
Co-authored-by: kirill <kirill.okhotnikov@gmail.com>
[1] https://sourceware.org/pipermail/libc-alpha/2020-November/119794.html
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
1. Subnormals: 128 inputs.
2. Normal numbers with large exponent difference (|x/y| > 2^8):
1024 inputs between FLT_MIN and FLT_MAX;
3. Close exponents (ey >= -103 and |x/y| < 2^8): 1024 inputs with
exponents between -10 and 10.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Add three different dataset, from random floating point numbers:
1. Subnormals: 128 inputs.
2. Normal numbers with large exponent difference (|x/y| > 2^52):
1024 inputs between DBL_MIN and DBL_MAX;
3. Close exponents (ey >= -907 and |x/y| < 2^52): 1024 inputs with
exponents between -10 and 10.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Linux kernel uses AT_HWCAP2 to indicate if FSGSBASE instructions are
enabled. If the HWCAP2_FSGSBASE bit in AT_HWCAP2 is set, FSGSBASE
instructions can be used in user space. Define dl_check_hwcap2 to set
the FSGSBASE feature to active on Linux when the HWCAP2_FSGSBASE bit is
set.
Add a test to verify that FSGSBASE is active on current kernels.
NB: This test will fail if the kernel doesn't set the HWCAP2_FSGSBASE
bit in AT_HWCAP2 while fsgsbase shows up in /proc/cpuinfo.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
The divss instruction clobbers its first argument, and the constraints
need to reflect that. Fortunately, with GCC 12, generated code does
not actually change, so there is no externally visible bug.
Suggested-by: Jakub Jelinek <jakub@redhat.com>
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
The __warn_unused_result__ attribute is only enabled when fortification
is enabled. Mention that in the document. The rationale for this is
essentially to mitigate against CWE-252:
[1] https://cwe.mitre.org/data/definitions/252.html
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
When THREAD_GETMEM is defined with inline assembly, the compiler may not
optimize away the two reads of _hurd_sigstate. Help it out a little bit
by only reading it once. This also makes for a slightly cleaner code.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230319151017.531737-32-bugaevc@gmail.com>
Just like the other existing rtld-str* files, this provides rtld with
usable versions of stpncpy and strncpy.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230319151017.531737-22-bugaevc@gmail.com>
The source code is the same as sysdeps/i386/htl/tcb-offsets.sym, but of
course the produced tcb-offsets.h will be different.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230319151017.531737-21-bugaevc@gmail.com>
These do not need any changes to be used on x86_64.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230319151017.531737-20-bugaevc@gmail.com>
This is more correct, if only because these fields are defined as having
the type unsigned int in the Mach headers, so casting them to a signed
int and then back is suboptimal.
Also, remove an extra reassignment of uesp -- this is another remnant of
the ecx kludge.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230319151017.531737-16-bugaevc@gmail.com>
There's nothing Mach- or Hurd-specific about it; any port that ends
up with rtld pulling in strncpy will need this.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230319151017.531737-15-bugaevc@gmail.com>
This was used for the value of libc-lock's owner when TLS is not yet set
up, so THREAD_SELF can not be used. Since the value need not be anything
specific -- it just has to be non-NULL -- we can just use a plain
constant, such as (void *) 1, for this. This avoids accessing the symbol
through GOT, and exporting it from libc.so in the first place.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230319151017.531737-12-bugaevc@gmail.com>
In this case, _itoa_word () is already defined inline in the header (see
sysdeps/generic/_itoa.h), and the second definition causes an error.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230319151017.531737-11-bugaevc@gmail.com>
hurd/lookup-retry.c is compiled into rtld, the dynamic linker/loader. To
avoid pulling in file_set_size, file_utimens, tty/ctty stuff, more
string/memory code (memmove, strncpy, strcpy), and more strtoul/itoa
code, compile out support for O_TRUNC and FS_RETRY_MAGICAL when building
hurd/lookup-retry.c for rtld. None of that functionality is useful to
rtld during startup anyway. Keep support for FS_RETRY_MAGICAL("/"),
since that does not pull in much, and is required for following absolute
symlinks.
The large number of extra code being pulled into rtld was noticed by
reviewing librtld.map & elf/librtld.os.map in the build tree.
It is worth noting that once libc.so is loaded, the real __open, __stat,
etc. replace the minimal versions used initially by rtld -- this is
especially important in the Hurd port, where the minimal rtld versions
do not use the dtable and just pass real Mach port names as fds. Thus,
once libc.so is loaded, rtld will gain access to the full
__hurd_file_name_lookup_retry () version, complete with FS_RETRY_MAGICAL
support, which is important in case the program decides to
dlopen ("/proc/self/fd/...") or some such.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230319151017.531737-9-bugaevc@gmail.com>
...to keep `sigexc' port initialization in one place, and match what the
comments say.
No functional change.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230319151017.531737-7-bugaevc@gmail.com>
Noone is or should be using __hurd_threadvar_stack_{offset,mask}, we
have proper TLS now. These two remaining variables are never set to
anything other than zero, so any code that would try to use them as
described would just dereference a zero pointer and crash. So remove
them entirely.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230319151017.531737-6-bugaevc@gmail.com>
On EXC_BAD_ACCESS, exception subcode is used to pass the faulting memory
address, so it needs to be (at least) pointer-sized. Thus, make it into
a long. This matches the corresponding change in GNU Mach.
Message-Id: <20230319151017.531737-5-bugaevc@gmail.com>
strftime(3) doesn't accept null pointers in any of the parameters.
Cc: Paul Eggert <eggert@cs.ucla.edu>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This patch adds a chunk scanning algorithm to the _int_memalign code
path that reduces heap fragmentation by reusing already aligned chunks
instead of always looking for chunks of larger sizes and splitting
them. The tcache macros are extended to allow removing a chunk from
the middle of the list.
The goal is to fix the pathological use cases where heaps grow
continuously in workloads that are heavy users of memalign.
Note that tst-memalign-2 checks for tcache operation, which
malloc-check bypasses.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
And make always supported. The configure option was added on glibc 2.25
and some features require it (such as hwcap mask, huge pages support, and
lock elisition tuning). It also simplifies the build permutations.
Changes from v1:
* Remove glibc.rtld.dynamic_sort changes, it is orthogonal and needs
more discussion.
* Cleanup more code.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
It is the default since 2.26 and it has bitrotten over the years,
By using it multiple malloc tests fails:
FAIL: malloc/tst-memalign-2
FAIL: malloc/tst-memalign-2-malloc-hugetlb1
FAIL: malloc/tst-memalign-2-malloc-hugetlb2
FAIL: malloc/tst-memalign-2-mcheck
FAIL: malloc/tst-mxfast-malloc-hugetlb1
FAIL: malloc/tst-mxfast-malloc-hugetlb2
FAIL: malloc/tst-tcfree2
FAIL: malloc/tst-tcfree2-malloc-hugetlb1
FAIL: malloc/tst-tcfree2-malloc-hugetlb2
Checked on x86_64-linux-gnu.
Reviewed-by: DJ Delorie <dj@redhat.com>
Prevent sh from interpreting a user string as shell options if it
starts with '-' or '+'. Since the version of /bin/sh used for testing
system() is different from the full-fledged system /bin/sh add support
to it for handling "--" after "-c". Add a testcase to ensure the
expected behavior.
Signed-off-by: Joe Simmons-Talbott <josimmon@redhat.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Without these fixes, the first three included tests segfault (on a
NULL dereference); the fourth aborts on an assertion, which is itself
unnecessary.
Signed-off-by: Julian Squires <julian@cipht.net>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The hooks mechanism uses symbol sets for running lists of functions,
which requires either extra linker directives to provide any hardening
(such as RELRO) or additional code (such as pointer obfuscation via
mangling with random value).
Currently only hurd uses set-hooks.h so we remove it from the generic
includes. The generic implementation uses direct function calls which
provide hardening and good code generation, observability and debugging
without the need for extra linking options or special code handling.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Now that there is no need to use a special linker script to hardening
internal data structures, remove the --with-default-link configure
option and associated definitions.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Instead of using a special ELF section along with a linker script
directive to put the IO vtables within the RELRO section, the libio
vtables are all moved to an array marked as data.relro (so linker
will place in the RELRO segment without the need of extra directives).
To avoid static linking namespace issues and including all vtable
referenced objects, all required function pointers are set to weak alias.
Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Instead define the required fields in system dependend files. The only
system dependent definition is FILENAME_MAX, which should match POSIX
PATH_MAX, and it is obtained from either kernel UAPI or mach headers.
Currently set pre-defined value from current kernels.
It avoids a circular dependendy when including stdio.h in
gen-as-const-headers files.
Checked on x86_64-linux-gnu and i686-linux-gnu
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
They are both used by __libc_freeres to free all library malloc
allocated resources to help tooling like mtrace or valgrind with
memory leak tracking.
The current scheme uses assembly markers and linker script entries
to consolidate the free routine function pointers in the RELRO segment
and to be freed buffers in BSS.
This patch changes it to use specific free functions for
libc_freeres_ptrs buffers and call the function pointer array directly
with call_function_static_weak.
It allows the removal of both the internal macros and the linker
script sections.
Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
This allows other targets to use the same inputs for their own libmvec
microbenchmarks without having to duplicate them in their own
subdirectory.
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
Exactly the same as 35bcb08eaa.
If using -D_FORITFY_SOURCE=3 (in my case, I've patched GCC to add
=3 instead of =2 (we've done =2 for years in Gentoo)), building
glibc tests will fail on tst-bz11319-fortify2 like:
```
<command-line>: error: "_FORTIFY_SOURCE" redefined [-Werror]
<built-in>: note: this is the location of the previous definition
cc1: all warnings being treated as errors
```
It's just because we're always setting -D_FORTIFY_SOURCE=2
rather than unsetting it first. If F_S is already 2, it's harmless,
but if it's another value (say, 1, or 3), the compiler will bawk.
(I'm not aware of a reason this couldn't be tested with =3,
but the toolchain support is limited for that (too new), and we want
to run the tests everywhere possible.)
As Siddhesh noted previously, we could implement some fallback
logic to determine the maximal F_S value supported by the toolchain,
which is a bit easier now that autoconf-archive has been updated for F_S=3
(https://github.com/autoconf-archive/autoconf-archive/pull/269), but let's
revisit this if it continues to crop up.
Signed-off-by: Sam James <sam@gentoo.org>
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Binutils 2.40 sets EF_LARCH_OBJABI_V1 for shared objects:
$ ld --version | head -n1
GNU ld (GNU Binutils) 2.40
$ echo 'int dummy;' > dummy.c
$ cc dummy.c -shared
$ readelf -h a.out | grep Flags
Flags: 0x43, DOUBLE-FLOAT, OBJ-v1
We need to ignore it in ldconfig or ldconfig will consider all shared
objects linked by Binutils 2.40 "unsupported". Maybe we should stop
setting EF_LARCH_OBJABI_V1 for shared objects, but Binutils 2.40 is
already released and we cannot change it.
After commit ed3ce71f5c ("elf: Move la_activity (LA_ACT_ADD) after
_dl_add_to_namespace_list() (BZ #28062)") it is no longer necessary to
reset the debugger state in the error case, since the debugger
notification only happens after no more errors can occur.
Linux threads were removed about 12 years ago and the current
nptl implementation only requires 4-byte alignment for pthread
locks.
The 16-byte alignment causes various issues. For example in
building ignition-msgs, we have:
/usr/include/google/protobuf/map.h:124:37: error: static assertion failed
124 | static_assert(alignof(value_type) <= 8, "");
| ~~~~~~~~~~~~~~~~~~~~^~~~
This is caused by the 16-byte pthread lock alignment.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
WG14 accepted the changes in N3105 to define wcstofN and wcstofNx
functions for C2x. Thus enable those for C2x (given also __GLIBC_USE
(IEC_60559_TYPES_EXT) and support for the relevant _FloatN / _FloatNx
type) rather than only for __USE_GNU.
Tested for x86_64.
WG14 recently accepted two additions to the printf/scanf %b/%B
support: there are now PRIb* and SCNb* macros in <inttypes.h>, and
printf %B is now an optional feature defined in normative text,
instead of recommended practice, with corresponding PRIB* macros that
can also be used to test whether that optional feature is supported.
See N3072 items 14 and 15 for details (those changes were accepted,
some other changes in that paper weren't).
Add the corresponding PRI* macros to glibc and update one place in the
manual referring to %B as recommended. (SCNb* should naturally be
added at the same time as the corresponding scanf %b support.)
Tested for x86_64 and x86.
For better debug experience use separate code block with extra
cfi_* directives to run child (same as in __clone3).
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Use the clone3 wrapper on ARC. It doesn't care about stack alignment.
All callers should provide an aligned stack.
It follows the internal signature:
extern int clone3 (struct clone_args *__cl_args, size_t __size,
int (*__func) (void *__arg), void *__arg);
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Between versions v2.11 and v2.12 struct ntptimeval got new fields.
That wasn't a problem because new function ntp_gettimex was created
(and made default) to support new struct. Old ntp_gettime was not
using new fields so it was safe to call with old struct
definition. Then commits 5613afe9e3 and b6ad64b907 (added for
64 bit time_t support), ntp_gettime start setting new fields.
Sets fields manually to maintain compatibility with v2.11 struct
definition.
Resolves#30156
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
It was possible to run this test individually and have it fail because
it can't find testobj1.so. This patch adds that dependency, to prevent
such issues.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Some toolchains, such as that used on Gentoo Hardened, set -z now out of
the box. This trips up a couple of tests.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Remove simple_strcspn/strpbrk/strsep which are significantly slower than the
generic implementations. Also remove oldstrsep and oldstrtok since they are
practically identical to the generic implementation. Adjust iteration count
to reduce benchmark time.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Remove memchr_strnlen since it is now the same as generic_strnlen. Adjust
iteration count to reduce benchmark time. Keep memchr_strlen since the
generic strlen does not use memchr.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
git commit 0849eed45d ("malloc: Move MORECORE fallback mmap to
sysmalloc_mmap_fallback") moved a block of code from sysmalloc to a
new helper function sysmalloc_mmap_fallback(), but 'pagesize' is used
for the 'minsize' argument and 'MMAP_AS_MORECORE_SIZE' for the
'pagesize' argument.
Fixes: 0849eed45d ("malloc: Move MORECORE fallback mmap to sysmalloc_mmap_fallback")
Signed-off-by: Robert Morell <rmorell@nvidia.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
according to man-pages-posix-2017, shm_open() function may fail if the length
of the name argument exceeds {_POSIX_PATH_MAX} and set ENAMETOOLONG
Signed-off-by: abushwang <abushwangs@gmail.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
To avoid possible failure if any parent set any initial signal
disposition as SIG_IGN (for instance if the testcase is issued
with nohup).
Checked on x86_64-linux-gnu.
Tested-by: Florian Weimer <fweimer@redhat.com>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
We made tst-system.c depend on pthread, but that requires linking with
$(shared-thread-library). It does not fail under Linux because the
variable expands to nothing under Linux, but it fails for Hurd.
I tested verified via cross-compiling that "make check" now works
for Hurd.
Signed-off-by: Adam Yi <ayi@janestreet.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Recorded in [BZ #30183]:
1. export GLIBC_TUNABLES=glibc.cpu.hwcaps=-AVX512
2. Add _dl_printf("p -- %s\n", p); just before switch(nl) in
sysdeps/x86/cpu-tunables.c
3. compiled and run ./testrun.sh /usr/bin/ls
you will get:
p -- -AVX512
p -- LC_ADDRESS=en_US.UTF-8
p -- LC_NUMERIC=C
...
The function, TUNABLE_CALLBACK (set_hwcaps)
(tunable_val_t *valp), checks far more than it should and it
should stop at end of "-AVX512".
Fix bug that SIGCHLD is erroneously blocked forever in the following
scenario:
1. Thread A calls system but hasn't returned yet
2. Thread B calls another system but returns
SIGCHLD would be blocked forever in thread B after its system() returns,
even after the system() in thread A returns.
Although POSIX does not require, glibc system implementation aims to be
thread and cancellation safe. This bug was introduced in
5fb7fc9635 when we moved reverting signal
mask to happen when the last concurrently running system returns,
despite that signal mask is per thread. This commit reverts this logic
and adds a test.
Signed-off-by: Adam Yi <ayi@janestreet.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Before this change, sgetsgent_r did not set errno to ERANGE, but
sgetsgent only check errno, not the return value from sgetsgent_r.
Consequently, sgetsgent did not detect any error, and reported
success to the caller, without initializing the struct sgrp object
whose address was returned.
This commit changes sgetsgent_r to set errno as well. This avoids
similar issues in applications which only change errno.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
This patch updates the kernel version in the tests tst-mman-consts.py,
tst-mount-consts.py and tst-pidfd-consts.py to 6.2. (There are no new
constants covered by these tests in 6.2 that need any other header
changes, and the removed MAP_VARIABLE for hppa was addressed
separately.)
Tested with build-many-glibcs.py.
The __builtin_arm_uqsub8 is an internal GCC builtin which might change
in future release (the correct way is to include "arm_acle.h" and use
__uqsub8 ()). Since not all compilers support it, just use the
inline assembler instead.
Checked on armv7a-linux-gnueabihf.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
The generic implementation already cover word access along with
cmpbge for both aligned and unaligned, so use it instead.
Checked qemu static for alpha-linux-gnu.
The default, and power7 implementation just adds word aligned
access when inputs have the same aligment. The unaligned case
is still done by byte operations.
This is already covered by the generic implementation, which also add
the unaligned input optimization.
Checked on powerpc64-linux-gnu built without multi-arch for powerpc64,
power7, power8, and power9 (build for le).
Reviewed-by: Rajalakshmi Srinivasaraghavan <rajis@linux.ibm.com>
The default, power4, and power7 implementation just adds word aligned
access when inputs have the same aligment. The unaligned case
is still done by byte operations.
This is already covered by the generic implementation, which also add
the unaligned input optimization.
Checked on powerpc-linux-gnu built without multi-arch for powerpc,
power4, and power7.
Reviewed-by: Rajalakshmi Srinivasaraghavan <rajis@linux.ibm.com>
C2x adds binary integer constants starting with 0b or 0B, and supports
those constants for the %i scanf format (in addition to the %b format,
which isn't yet implemented for scanf in glibc). Implement that scanf
support for glibc.
As with the strtol support, this is incompatible with previous C
standard versions, in that such an input string starting with 0b or 0B
was previously required to be parsed as 0 (with the rest of the input
potentially matching subsequent parts of the scanf format string).
Thus this patch adds 12 new __isoc23_* functions per long double
format (12, 24 or 36 depending on how many long double formats the
glibc configuration supports), with appropriate header redirection
support (generally very closely following that for the __isoc99_*
scanf functions - note that __GLIBC_USE (DEPRECATED_SCANF) takes
precedence over __GLIBC_USE (C2X_STRTOL), so the case of GNU
extensions to C89 continues to get old-style GNU %a and does not get
this new feature). The function names would remain as __isoc23_* even
if C2x ends up published in 2024 rather than 2023.
When scanf %b support is added, I think it will be appropriate for all
versions of scanf to follow C2x rules for inputs to the %b format
(given that there are no compatibility concerns for a new format).
Tested for x86_64 (full glibc testsuite). The first version was also
tested for powerpc (32-bit) and powerpc64le (stdio-common/ and wcsmbs/
tests), and with build-many-glibcs.py.
Starting with commit
b2c474f8de
"x86: Fix strncat-avx2.S reading past length [BZ #30065]"
Building on s390 the test fails due warnings like:
In function ‘do_one_test’,
inlined from ‘do_overflow_tests’ at test-strncat.c:175:7:
test-strncat.c:31:18: error: ‘strnlen’ specified bound [4294966546, 4294967295] exceeds maximum object size 2147483647 [-Werror=stringop-overflow=]
31 | # define STRNLEN strnlen
| ^
test-strncat.c:83:16: note: in expansion of macro ‘STRNLEN’
83 | size_t len = STRNLEN (src, n);
| ^~~~~~~
In all werror cases, the call to strnlen (.., SIZE_MAX) is inlined.
Therefore this patch just marks the do_one_test function as noinline.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
When building with -O3 on s390x/x86_64, I get this stringop-truncation warning
which leads to a build fail:
In function ‘nis_local_host’,
inlined from ‘nis_local_host’ at nis_local_names.c:147:1:
nis_local_names.c:171:11: error: ‘strncpy’ output may be truncated copying between 0 and 1023 bytes from a string of length 1024 [-Werror=stringop-truncation]
171 | strncpy (cp, nis_local_directory (), NIS_MAXNAMELEN - len -1);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
We can just ignore this warning as the hostname + '.' + directory-name + '\0' always fits
in __nishostname with length of (NIS_MAXNAMELEN + 1) as there is the runtime check above.
Furthermore as we already know the length of the directory-name, we can also just use
memcpy to copy the directory-name inclusive the NUL-termination.
Note: This werror was introduced with commit
32c7acd464
"Replace rawmemchr (s, '\0') with strchr"
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Before GCC r13-2728, it would produce a normal dynamic-linked executable
with -static-pie. I mistakely believed it would produce a static-linked
executable, so failed to detect the breakage. Then with Binutils 2.40
and (vanilla) GCC 12, libc_cv_static_pie_on_loongarch is mistakenly
enabled and cause a building failure with "undefined reference to
_DYNAMIC".
Fix the issue by disabling static PIE if -static-pie creates something
with a INTERP header.
Also, fix a couple of typos. No functional change.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230301162355.426887-2-bugaevc@gmail.com>
"We don't need it any more"
The INTR_MSG_TRAP macro in intr-msg.h used to play little trick with
the stack pointer: it would temporarily save the "real" stack pointer
into ecx, while setting esp to point to just before the message buffer,
and then invoke the mach_msg trap. This way, INTR_MSG_TRAP reused the
on-stack arguments laid out for the containing call of
_hurd_intr_rpc_mach_msg (), passing them to the mach_msg trap directly.
This, however, required special support in hurdsig.c and trampoline.c,
since they now had to recognize when a thread is inside the piece of
code where esp doesn't point to the real tip of the stack, and handle
this situation specially.
Commit 1d20f33ff4 has removed the actual
temporary change of esp by actually re-pushing mach_msg arguments onto
the stack, and popping them back at end. It did not, however, deal with
the rest of "the ecx kludge" code in other files, resulting in potential
crashes if a signal arrives in the middle of pushing arguments onto the
stack.
Fix that by removing "the ecx kludge". Instead, when we want a thread
to skip the RPC, but cannot make just make it jump to after the trap
since it's not done adjusting the stack yet, set the SYSRETURN register
to MACH_SEND_INTERRUPTED (as we do anyway), and rely on the thread
itself for detecting this case and skipping the RPC.
This simplifies things somewhat and paves the way for a future x86_64
port of this code.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230301162355.426887-1-bugaevc@gmail.com>
The input argument passes an invalid string without a NUL terminator
on crypt settings inputs, which might lead to invalid OOB on strncmp.
Implementations only assume there is a NUL terminator if the string is
shorter than the specified size, so strings don't need to always be NUL
terminated (stratcliff.c has tests for this).
Also adapt the code to use libsupport.
Checked on arm-linux-gnuabihf.
The _FPU_SETCW and _FPU_GETCW macros are defined with inline assemblies.
They use the sfpc and efpc instructions, respectively. But both contain
a spurious second operand that leads to a compile error with Clang.
Removing this operand works both with gcc/gas (since binutils 2.18) as
well as with clang/llvm.
Needed due to recent commits:
- "added pair of inputs for hypotf in binary32"
commit ID cf7ffdd8a5
- "update auto-libm-test-out-hypot"
commit ID 3efbf11fdf
Linux 6.2 adds six new Arm HWCAP values and two new HWCAP2 values; add
them to glibc's Arm bits/hwcap.h, with corresponding dl-procinfo.c and
dl-procinfo.h updates.
Tested with build-many-glibcs.py for arm-linux-gnueabi.
This is for future-proofing. On i386, it is 4-byte aligned anyway, but
on x86_64, we want it 8-byte aligned, not 4-byte aligned.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230214173722.428140-4-bugaevc@gmail.com>
Update libm test ulps for
commit 3efbf11fdf
Author: Paul Zimmermann <Paul.Zimmermann@inria.fr>
Date: Tue Feb 14 11:24:59 2023 +0100
update auto-libm-test-out-hypot
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
This patch implements the LoongArch specific math barriers in order to omit
the store and load from stack if possible.
Signed-off-by: Xi Ruoyao <xry111@xry111.site>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The Linux kernel upstream commit 71bdea6f798b ("parisc: Align parisc
MADV_XXX constants with all other architectures") dropped the
parisc-specific MADV_* values in favour of the same constants as
other architectures. In the same commit a wrapper was added which
translates the old values to the standard MADV_* values to avoid
breakage of existing programs.
This upstream patch has been downported to all stable kernel trees as
well.
This patch now drops the parisc specific constants from glibc to
allow newly compliled programs to use the standard MADV_* constants.
v2: Added NEWS section, based on feedback from Florian Weimer
Signed-off-by: Helge Deller <deller@gmx.de>
This drops all of the return address rewriting kludges. The only
remaining hack is the jump out of a call stack while adjusting the
stack pointer.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Linux 6.2 has no new syscalls. Update the version number in
syscall-names.list to reflect that it is still current for 6.2.
Tested with build-many-glibcs.py.
Crossing 2GB boundaries with indirect calls and jumps can use more
branch prediction resources on Intel Golden Cove CPU (see the
"Misprediction for Branches >2GB" section in Intel 64 and IA-32
Architectures Optimization Reference Manual.) There is visible
performance improvement on workloads with many PLT calls when executable
and shared libraries are mmapped below 2GB. Add the Prefer_MAP_32BIT_EXEC
bit so that mmap will try to map executable or denywrite pages in shared
libraries with MAP_32BIT first.
NB: Prefer_MAP_32BIT_EXEC reduces bits available for address space
layout randomization (ASLR), which is always disabled for SUID programs
and can only be enabled by the tunable, glibc.cpu.prefer_map_32bit_exec,
or the environment variable, LD_PREFER_MAP_32BIT_EXEC. This works only
between shared libraries or between shared libraries and executables with
addresses below 2GB. PIEs are usually loaded at a random address above
4GB by the kernel.
V2 of this patch fixes an issue in V1, where the state was changed to ON not
OFF at end of _mcleanup. I hadn't noticed that (counterintuitively) ON=0 and
OFF=3, hence zeroing the buffer turned it back on. So set the state to OFF
after the memset.
1. Prevent double free, and reads from unallocated memory, when
_mcleanup is (incorrectly) called two or more times in a row,
without an intervening call to __monstartup; with this patch, the
second and subsequent calls effectively become no-ops instead.
While setting tos=NULL is minimal fix, safest action is to zero the
whole gmonparam buffer.
2. Prevent memory leak when __monstartup is (incorrectly) called two
or more times in a row, without an intervening call to _mcleanup;
with this patch, the second and subsequent calls effectively become
no-ops instead.
3. After _mcleanup, treat __moncontrol(1) as __moncontrol(0) instead.
With zeroing of gmonparam buffer in _mcleanup, this stops the
state incorrectly being changed to GMON_PROF_ON despite profiling
actually being off. If we'd just done the minimal fix to _mcleanup
of setting tos=NULL, there is risk of far worse memory corruption:
kcount would point to deallocated memory, and the __profil syscall
would make the kernel write profiling data into that memory,
which could have since been reallocated to something unrelated.
4. Ensure __moncontrol(0) still turns off profiling even in error
state. Otherwise, if mcount overflows and sets state to
GMON_PROF_ERROR, when _mcleanup calls __moncontrol(0), the __profil
syscall to disable profiling will not be invoked. _mcleanup will
free the buffer, but the kernel will still be writing profiling
data into it, potentially corrupted arbitrary memory.
Also adds a test case for (1). Issues (2)-(4) are not feasible to test.
Signed-off-by: Simon Kissane <skissane@gmail.com>
Reviewed-by: DJ Delorie <dj@redhat.com>
When mcount overflows, no gmon.out file is generated, but no message is printed
to the user, leaving the user with no idea why, and thinking maybe there is
some bug - which is how BZ 27576 ended up being logged. Print a message to
stderr in this case so the user knows what is going on.
As a comment in sys/gmon.h acknowledges, the hardcoded MAXARCS value is too
small for some large applications, including the test case in that BZ. Rather
than increase it, add tunables to enable MINARCS and MAXARCS to be overridden
at runtime (glibc.gmon.minarcs and glibc.gmon.maxarcs). So if a user gets the
mcount overflow error, they can try increasing maxarcs (they might need to
increase minarcs too if the heuristic is wrong in their case.)
Note setting minarcs/maxarcs too large can cause monstartup to fail with an
out of memory error. If you set them large enough, it can cause an integer
overflow in calculating the buffer size. I haven't done anything to defend
against that - it would not generally be a security vulnerability, since these
tunables will be ignored in suid/sgid programs (due to the SXID_ERASE default),
and if you can set GLIBC_TUNABLES in the environment of a process, you can take
it over anyway (LD_PRELOAD, LD_LIBRARY_PATH, etc). I thought about modifying
the code of monstartup to defend against integer overflows, but doing so is
complicated, and I realise the existing code is susceptible to them even prior
to this change (e.g. try passing a pathologically large highpc argument to
monstartup), so I decided just to leave that possibility in-place.
Add a test case which demonstrates mcount overflow and the tunables.
Document the new tunables in the manual.
Signed-off-by: Simon Kissane <skissane@gmail.com>
Reviewed-by: DJ Delorie <dj@redhat.com>
The `__monstartup()` allocates a buffer used to store all the data
accumulated by the monitor.
The size of this buffer depends on the size of the internal structures
used and the address range for which the monitor is activated, as well
as on the maximum density of call instructions and/or callable functions
that could be potentially on a segment of executable code.
In particular a hash table of arcs is placed at the end of this buffer.
The size of this hash table is calculated in bytes as
p->fromssize = p->textsize / HASHFRACTION;
but actually should be
p->fromssize = ROUNDUP(p->textsize / HASHFRACTION, sizeof(*p->froms));
This results in writing beyond the end of the allocated buffer when an
added arc corresponds to a call near from the end of the monitored
address range, since `_mcount()` check the incoming caller address for
monitored range but not the intermediate result hash-like index that
uses to write into the table.
It should be noted that when the results are output to `gmon.out`, the
table is read to the last element calculated from the allocated size in
bytes, so the arcs stored outside the buffer boundary did not fall into
`gprof` for analysis. Thus this "feature" help me to found this bug
during working with https://sourceware.org/bugzilla/show_bug.cgi?id=29438
Just in case, I will explicitly note that the problem breaks the
`make test t=gmon/tst-gmon-dso` added for Bug 29438.
There, the arc of the `f3()` call disappears from the output, since in
the DSO case, the call to `f3` is located close to the end of the
monitored range.
Signed-off-by: Леонид Юрьев (Leonid Yuriev) <leo@yuriev.ru>
Another minor error seems a related typo in the calculation of
`kcountsize`, but since kcounts are smaller than froms, this is
actually to align the p->froms data.
Co-authored-by: DJ Delorie <dj@redhat.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* malloc/malloc.c (_int_malloc): remove redundant check of
unsorted bin corruption
With commit "b90ddd08f6dd688e651df9ee89ca3a69ff88cd0c"
(malloc: Additional checks for unsorted bin integrity),
same check of (bck->fd != victim) is added before checking of unsorted
chunk corruption, which was added in "bdc3009b8ff0effdbbfb05eb6b10966753cbf9b8"
(Added check before removing from unsorted list).
..
3773 if (__glibc_unlikely (bck->fd != victim)
3774 || __glibc_unlikely (victim->fd != unsorted_chunks (av)))
3775 malloc_printerr ("malloc(): unsorted double linked list corrupted");
..
..
3815 /* remove from unsorted list */
3816 if (__glibc_unlikely (bck->fd != victim))
3817 malloc_printerr ("malloc(): corrupted unsorted chunks 3");
3818 unsorted_chunks (av)->bk = bck;
..
So this extra check can be removed.
Signed-off-by: Maninder Singh <maninder1.s@samsung.com>
Signed-off-by: Ayush Mittal <ayush.m@samsung.com>
Reviewed-by: DJ Delorie <dj@redhat.com>
Linux 6.2 removed the hppa compatibility MAP_VARIABLE define. That
means that, whether or not we remove it in glibc, it needs to be
ignored in tst-mman-consts.py (since this macro comparison
infrastructure expects that new kernel header versions only add new
macros, not remove old ones).
Tested with build-many-glibcs.py for hppa-linux-gnu (Linux 6.2
headers).
Fix the computation to allow for cntfrq_el0 being larger than 1GHz.
Assume cntfrq_el0 is a multiple of 1MHz to increase the maximum
interval (1024 seconds at 1GHz).
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
It fixes the build after 7ea510127e and 22999b2f0f.
Checked with build for s390x-linux-gnu with -march=z13.
Reviewed-by: Arjun Shankar <arjun@redhat.com>
__builtin_arm_uqsub8 is only available on gcc newer or equal than 10.
Checked on arm-linux-gnueabihf built with gcc 9.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
The default Linux implementation already handled the Linux generic
ABIs interface used on newer architectures, so there is no need to
Imply the generic any longer.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
And disable if kernel does not support it.
Checked on x86_64-linux-gnu and i686-linux-gnu.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
And disable if kernel does not support it.
Checked on x86_64-linux-gnu and i686-linux-gnu.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
And remove redundant entries on other architectures Version. The
version for fallocate64 was supposed to be 2.10, but it was then
added to 32-bit platforms in 2.11 because it mistakenly wasn't
exported for them in 2.10 (see the commit message for
1f3615a1c9).
The linux/generic did not exist before 2.15, i.e. when the tile
ports were added (and microblaze did not exist before 2.18), which
explains those differences but also illustrates that "2.11 for 32-bit,
2.10 for 64-bit" should be sufficient since versions older than the
minimum for the architecture are automatically adjusted.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
While cleaning up old libc version support, the deprecated libc4 code was
accidentally kept in `implicit_soname`, instead of the libc6 code.
This causes additional symlinks to be created by `ldconfig` for libraries
without a soname, e.g. a library `libsomething.123.456.789` without a soname
will create a `libsomething.123` -> `libsomething.123.456.789` symlink.
As the libc6 version of the `implicit_soname` code is a trivial `xstrdup`,
just inline it and remove `implicit_soname` altogether.
Some further simplification looks possible (e.g. the call to `create_links`
looks like a no-op if `soname == NULL`, other than the verbose printfs), but
logic is kept as-is for now.
Fixes: BZ #30125
Fixes: 8ee878592c ("Assume only FLAG_ELF_LIBC6 suport")
Signed-off-by: Joan Bruguera <joanbrugueram@gmail.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Post review removal of "goto restart" from
https://sourceware.org/pipermail/libc-alpha/2021-April/125470.html
introduced a bug when some atexit handers skipped.
Signed-off-by: Vitaly Buka <vitalybuka@google.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The PAGE_SIZE from the Mach headers statically defines the machine's
page size. There's no need to query it dynamically; furthermore, the
implementation of the vm_statistics () RPC unconditionally fills in
pagesize = PAGE_SIZE;
Not doing the extra RPC shaves off 2 RPCs from the start-up of every
process!
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230218203717.373211-7-bugaevc@gmail.com>
And make it a bit more 64-bit ready. This is in preparation to moving this
file into x86/
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230218203717.373211-6-bugaevc@gmail.com>
This ensures that a timer_t value can be cast to struct timer_node *
and back.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230218203717.373211-5-bugaevc@gmail.com>
Fix a few more cases of build errors caused by mismatched types. This is a
continuation of f4315054b4.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230218203717.373211-3-bugaevc@gmail.com>
This is going to be done differently on x86_64.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230218203717.373211-2-bugaevc@gmail.com>
Add extra check for compiler definitions to ensure that compiler provides
sqrt and fma hw fpu instructions else use software implementation.
As divide/sqrt and FMA hw support from CPU side is optional,
the compiler can be configured by options to generate hw FPU instructions,
but without use of FDDIV, FDSQRT, FSDIV, FSSQRT, FDMADD and FSMADD
instructions. In this case __builtin_sqrt and __builtin_sqrtf provided by
compiler can't be used inside the glibc code, as these builtins are used
in implementations of sqrt() and sqrtf() functions but at the same time
these builtins unfold to sqrt() and sqrtf(). So it is possible to receive
code like that:
0001c4b4 <__ieee754_sqrtf>:
1c4b4: 0001 0000 b 0 ;1c4b4 <__ieee754_sqrtf>
The same is also true for __builtin_fma and __builtin_fmaf.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The ARCv2 ABI requires 4 byte stack pointer alignment. Don't allow to
use unaligned child stack in clone. As the stack grows down,
align it down.
This was pointed by misc/tst-misalign-clone-internal and
misc/tst-misalign-clone tests. Stack alignmet fixes these tests
fails.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Use put/get macros __builtin_bswap32 instead. It allows to remove
the unaligned routines, the compiler will generate unaligned access
if the ABI allows it.
Checked on x86_64-linux-gnu and i686-linux-gnu.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
And use a packed structure instead. The compiler generates optimized
unaligned code if the architecture supports it.
Checked on x86_64-linux-gnu and i686-linux-gnu.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
It only adds a small overhead for unaligned inputs (which should not
be common) and unify the code.
Checked on x86_64-linux-gnu and i686-linux-gnu.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Assume unaligned inputs on all cases. The code is built and used only
in compat mode.
Checked on x86_64-linux-gnu and i686-linux-gnu.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Builds for s390 recently started failing with:
../sysdeps/s390/multiarch/ifunc-impl-list.c: In function '__libc_ifunc_impl_list':
../sysdeps/s390/multiarch/ifunc-impl-list.c:83:21: error: unused variable 'dl_hwcap' [-Werror=unused-variable]
83 | unsigned long int dl_hwcap = features->hwcap;
| ^~~~~~~~
https://sourceware.org/pipermail/libc-testresults/2023q1/010855.html
Add __attribute__ ((unused)) as already done for another variable
there.
Tested with build-many-glibcs.py (compilers and glibcs) for
s390x-linux-gnu and s390-linux-gnu.
Note: s390x-linux-gnu-O3 started failing with a different error
earlier; that problem may still need to be fixed after this fix is in.
https://sourceware.org/pipermail/libc-testresults/2023q1/010829.html
C2x adds binary integer constants starting with 0b or 0B, and supports
those constants in strtol-family functions when the base passed is 0
or 2. Implement that strtol support for glibc.
As discussed at
<https://sourceware.org/pipermail/libc-alpha/2020-December/120414.html>,
this is incompatible with previous C standard versions, in that such
an input string starting with 0b or 0B was previously required to be
parsed as 0 (with the rest of the string unprocessed). Thus, as
proposed there, this patch adds 20 new __isoc23_* functions with
appropriate header redirection support. This patch does *not* do
anything about scanf %i (which will need 12 new functions per long
double variant, so 12, 24 or 36 depending on the glibc configuration),
instead leaving that for a future patch. The function names would
remain as __isoc23_* even if C2x ends up published in 2024 rather than
2023.
Making this change leads to the question of what should happen to
internal uses of these functions in glibc and its tests. The header
redirection (which applies for _GNU_SOURCE or any other feature test
macros enabling C2x features) has the effect of redirecting internal
uses but without those uses then ending up at a hidden alias (see the
comment in include/stdio.h about interaction with libc_hidden_proto).
It seems desirable for the default for internal uses to be the same
versions used by normal code using _GNU_SOURCE, so rather than doing
anything to disable that redirection, similar macro definitions to
those in include/stdio.h are added to the include/ headers for the new
functions.
Given that the default for uses in glibc is for the redirections to
apply, the next question is whether the C2x semantics are correct for
all those uses. Uses with the base fixed to 10, 16 or any other value
other than 0 or 2 can be ignored. I think this leaves the following
internal uses to consider (an important consideration for review of
this patch will be both whether this list is complete and whether my
conclusions on all entries in it are correct):
benchtests/bench-malloc-simple.c
benchtests/bench-string.h
elf/sotruss-lib.c
math/libm-test-support.c
nptl/perf.c
nscd/nscd_conf.c
nss/nss_files/files-parse.c
posix/tst-fnmatch.c
posix/wordexp.c
resolv/inet_addr.c
rt/tst-mqueue7.c
soft-fp/testit.c
stdlib/fmtmsg.c
support/support_test_main.c
support/test-container.c
sysdeps/pthread/tst-mutex10.c
I think all of these places are OK with the new semantics, except for
resolv/inet_addr.c, where the POSIX semantics of inet_addr do not
allow for binary constants; thus, I changed that file (to use
__strtoul_internal, whose semantics are unchanged) and added a test
for this case. In the case of posix/wordexp.c I think accepting
binary constants is OK since POSIX explicitly allows additional forms
of shell arithmetic expressions, and in stdlib/fmtmsg.c SEV_LEVEL is
not in POSIX so again I think accepting binary constants is OK.
Functions such as __strtol_internal, which are only exported for
compatibility with old binaries from when those were used in inline
functions in headers, have unchanged semantics; the __*_l_internal
versions (purely internal to libc and not exported) have a new
argument to specify whether to accept binary constants.
As well as for the standard functions, the header redirection also
applies to the *_l versions (GNU extensions), and to legacy functions
such as strtoq, to avoid confusing inconsistency (the *q functions
redirect to __isoc23_*ll rather than needing their own __isoc23_*
entry points). For the functions that are only declared with
_GNU_SOURCE, this means the old versions are no longer available for
normal user programs at all. An internal __GLIBC_USE_C2X_STRTOL macro
is used to control the redirections in the headers, and cases in glibc
that wish to avoid the redirections - the function implementations
themselves and the tests of the old versions of the GNU functions -
then undefine and redefine that macro to allow the old versions to be
accessed. (There would of course be greater complexity should we wish
to make any of the old versions into compat symbols / avoid them being
defined at all for new glibc ABIs.)
strtol_l.c has some similarity to strtol.c in gnulib, but has already
diverged some way (and isn't listed at all at
https://sourceware.org/glibc/wiki/SharedSourceFiles unlike strtoll.c
and strtoul.c); I haven't made any attempts at gnulib compatibility in
the changes to that file.
I note incidentally that inttypes.h and wchar.h are missing the
__nonnull present on declarations of this family of functions in
stdlib.h; I didn't make any changes in that regard for the new
declarations added.
This macro from Mach headers conflicts with how
sysdeps/x86_64/multiarch/strcmp-sse2.S expects it to be defined.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230214173722.428140-3-bugaevc@gmail.com>
* Micro-optimize TLS access using GCC's native support for gs-based
addressing when available;
* Just use THREAD_GETMEM and THREAD_SETMEM instead of more inline
assembly;
* Sync tcbhead_t layout with NPTL, in particular update/fix __private_ss
offset;
* Statically assert that the two offsets that are a part of ABI are what
we expect them to be.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230214173722.428140-2-bugaevc@gmail.com>
ISO C does not support omitting parameter names in function definitions
before C2X,the compiler is giving an error with older versions of gcc and
this commit will resolve the test failure "error: parameter name omitted"
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
These are supposed to stay 32-bit even on 64-bit systems. This matches
BSD and Linux, as well as how these types are already defined in
tioctl.defs
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
'sem' is the opaque 'sem_t', 'isem' is the actual 'struct new_sem'.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230212111044.610942-6-bugaevc@gmail.com>
This does not seem like it is supposed to return negative error codes.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230212111044.610942-5-bugaevc@gmail.com>
When casting between a pointer and an integer of a different size, GCC
emits a warning (which is escalated to a build failure by -Werror).
Indeed, if what you start with is a pointer, which you then cast to a
shorter integer and then back again, you're going to cut off some bits
of the pointer.
But if you start with an integer (such as mach_port_t), then cast it to
a longer pointer (void *), and then back to a shorter integer, you are
fine. To keep GCC happy, cast through an intermediary uintptr_t, which
is always the same size as a pointer.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230212111044.610942-4-bugaevc@gmail.com>
It has been decided that on x86_64, mach_msg_type_number_t stays 32-bit.
Therefore, it's not possible to use mach_msg_type_number_t
interchangeably with size_t, in particular this breaks when a pointer to
a variable is passed to a MIG routine.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230212111044.610942-3-bugaevc@gmail.com>
Make the code flow more linear using early returns where possible. This
makes it so much easier to reason about what runs on error / successful
code paths.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230212111044.610942-2-bugaevc@gmail.com>
Likewise use __builtin_LINE instead of __LINE__.
When building C++, inline functions are required to have the exact same
sequence of tokens in every translation unit. But __FILE__ token, when
used in a header file, does not necessarily expand to the exact same
string literal, and that may cause compilation failure when C++ modules
are being used. (It would also cause unpredictable output on assertion
failure at runtime, but this rarely matters in practice.)
For example, given the following sources:
// a.h
#include <assert.h>
inline void fn () { assert (0); }
// a.cc
#include "a.h"
// b.cc
#include "foo/../a.h"
preprocessing a.cc will yield a call to __assert_fail("0", "a.h", ...)
but b.cc will yield __assert_fail("0", "foo/../a.h", ...)
We used to use .cfi_adjust_cfa_offset around %esp manipulation
asm instructions to fix unwinding, but when building glibc with
-fno-omit-frame-pointer this is bogus since in that case %ebp is the CFA and
does not move.
Instead, let's force -fno-omit-frame-pointer when building intr-msg.c so
that %ebp can always be used and no .cfi_adjust_cfa_offset is needed.
It follows the internal signature:
extern int clone3 (struct clone_args *__cl_args, size_t __size,
int (*__func) (void *__arg), void *__arg);
The powerpc64 ABI requires an initial stackframe so the child can
store/restore the TOC. It is create prior calling clone3 by
adjusting the stack size (since kernel will compute the stack as
stack plus size).
Checked on powerpc64-linux-gnu (power8, kernel 6.0) and
powerpc64le-linux-gnu (power9, kernel 4.18).
Reviewed-by: Paul E. Murphy <murphyp@linux.ibm.com>
For powerpc, strncmp is used on _dl_string_platform issued by
__tcb_parse_hwcap_and_convert_at_platform.
Reviewed-by: Carlos Eduardo Seo <carlos.seo@linaro.org>
Although static linker can optimize it to local call, it follows the
internal scheme to provide hidden proto and definitions.
Reviewed-by: Carlos Eduardo Seo <carlos.seo@linaro.org>
Although static linker can optimize it to local call, it follows the
internal scheme to provide hidden proto and definitions.
Reviewed-by: Carlos Eduardo Seo <carlos.seo@linaro.org>
The test is sufficient to detect the ldconfig bug fixed in
commit 9fe6f63638 ("elf: Fix 64 time_t
support for installed statically binaries").
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
The hard float abi and hard float are different,
Hard float abi: Use float register to pass float type arguments.
Hard float: Enable the hard float ISA feature.
So the with_fp_cond cannot represent these two features. When
-mfloat-abi=softfp, the float abi is soft and hard float is enabled.
So add 'with_hard_float_abi' in preconfigure and define 'CSKY_HARD_FLOAT_ABI'
if float abi is hard, and use 'CSKY_HARD_FLOAT_ABI' to determine
dynamic linker because it is what determines compatibility.
And with_fp_cond is still needed to tell glibc whether to enable
hard floating feature.
In addition, use AC_TRY_COMMAND to test gcc to ensure compatibility
between different versions of gcc. The original way has a problem
that __CSKY_HARD_FLOAT_FPU_SF__ means the target only has single
hard float-points ISA, so it's not defined in CPUs like ck810f.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This patch enables the option to influence hwcaps and stfle bits used
by the s390 specific ifunc-resolvers. The currently x86-specific
tunable glibc.cpu.hwcaps is also used on s390x to achieve the task. In
addition the user can also set a CPU arch-level like z13 instead of
single HWCAP and STFLE features.
Note that the tunable only handles the features which are really used
in the IFUNC-resolvers. All others are ignored as the values are only
used inside glibc. Thus we can influence:
- HWCAP_S390_VXRS (z13)
- HWCAP_S390_VXRS_EXT (z14)
- HWCAP_S390_VXRS_EXT2 (z15)
- STFLE_MIE3 (z15)
The influenced hwcap/stfle-bits are stored in the s390-specific
cpu_features struct which also contains reserved fields for future
usage.
The ifunc-resolvers and users of stfle bits are adjusted to use the
information from cpu_features struct.
On 31bit, the ELF_MACHINE_IRELATIVE macro is now also defined.
Otherwise the new ifunc-resolvers segfaults as they depend on
the not yet processed_rtld_global_ro@GLIBC_PRIVATE relocation.
It uses the bitmanip extension to optimize index_fist and index_last
with clz/ctz (using generic implementation that routes to compiler
builtin) and orc.b to check null bytes.
Checked the string test on riscv64 user mode.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
While ppc has the more important string functions in assembly,
there are still a few generic routines used.
Use the Power 6 CMPB insn for testing of zeros.
Checked on powerpc64le-linux-gnu.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
While arm has the more important string functions in assembly,
there are still a few generic routines used.
Use the UQSUB8 insn for testing of zeros.
Checked on armv7-linux-gnueabihf
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
While alpha has the more important string functions in assembly,
there are still a few for find the generic routines are used.
Use the CMPBGE insn, via the builtin, for testing of zeros. Use a
simplified expansion of __builtin_ctz when the insn isn't available.
Checked on alpha-linux-gnu.
Co-authored-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Use UXOR,SBZ to test for a zero byte within a word. While we can
get semi-decent code out of asm-goto, we would do slightly better
with a compiler builtin.
For index_zero et al, sequential testing of bytes is less expensive than
any tricks that involve a count-leading-zeros insn that we don't have.
Checked on hppa-linux-gnu.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
GCC's combine pass cannot merge (x >> c | y << (32 - c)) into a
double-word shift unless (1) the subtract is in the same basic block
and (2) the result of the subtract is used exactly once. Neither
condition is true for any use of MERGE.
By forcing the use of a double-word shift, we not only reduce
contention on SAR, but also allow the setting of SAR to be hoisted
outside of a loop.
Checked on hppa-linux-gnu.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Now that both strlen and memrchr have word vectorized implementation,
it should be faster to implement strrchr based on memrchr over the
string length instead of calling strchr on a loop.
Checked on x86_64-linux-gnu, i686-linux-gnu, powerpc-linux-gnu,
and powerpc64-linux-gnu by removing the arch-specific assembly
implementation and disabling multi-arch (it covers both LE and BE
for 64 and 32 bits).
New algorithm read the lastaligned address and mask off the unwanted
bytes. The loop now read word-aligned address and check using the
has_eq macro.
Checked on x86_64-linux-gnu, i686-linux-gnu, powerpc-linux-gnu,
and powerpc64-linux-gnu by removing the arch-specific assembly
implementation and disabling multi-arch (it covers both LE and BE
for 64 and 32 bits).
Co-authored-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
It also cleanups the multiple inclusion by leaving the ifunc
implementation to undef the weak_alias and libc_hidden_def.
Co-authored-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
New algorithm read the first aligned address and mask off the
unwanted bytes (this strategy is similar to arch-specific
implementations used on powerpc, sparc, and sh).
The loop now read word-aligned address and check using the has_eq
macro.
Checked on x86_64-linux-gnu, i686-linux-gnu, powerpc-linux-gnu,
and powerpc64-linux-gnu by removing the arch-specific assembly
implementation and disabling multi-arch (it covers both LE and BE
for 64 and 32 bits).
Co-authored-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
Now that stpcpy is vectorized based on op_t, it should be better to
call it instead of strlen plus memcpy.
Checked on x86_64-linux-gnu, i686-linux-gnu, powerpc64-linux-gnu,
and powerpc-linux-gnu by removing the arch-specific assembly
implementation and disabling multi-arch (it covers both LE and BE
for 64 and 32 bits).
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
It follows the strategy:
- Align the destination on word boundary using byte operations.
- If source is also word aligned, read a word per time, check for
null (using has_zero from string-fzb.h), and write the remaining
bytes.
- If source is not word aligned, loop by aligning the source, and
merging the result of two reads. Similar to aligned case,
check for null with has_zero, and write the remaining bytes if
null is found.
Checked on x86_64-linux-gnu, i686-linux-gnu, powerpc64-linux-gnu,
and powerpc-linux-gnu by removing the arch-specific assembly
implementation and disabling multi-arch (it covers both LE and BE
for 64 and 32 bits).
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
It follows the strategy:
- Align the first input to word boundary using byte operations.
- If second input is also word aligned, read a word per time, check
for null (using has_zero), and check final words using byte
operation.
- If second input is not word aligned, loop by aligning the source,
and merge the result of two reads. Similar to aligned case, check
for null with has_zero, and check final words using byte operation.
Checked on x86_64-linux-gnu, i686-linux-gnu, powerpc64-linux-gnu,
and powerpc-linux-gnu by removing the arch-specific assembly
implementation and disabling multi-arch (it covers both LE and BE
for 64 and 32 bits).
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
It follows the strategy:
- Align the first input to word boundary using byte operations.
- If second input is also word aligned, read a word per time, check for
null (using has_zero), and check final words using byte operation.
- If second input is not word aligned, loop by aligning the source, and
merging the result of two reads. Similar to aligned case, check for
null with has_zero, and check final words using byte operation.
Checked on x86_64-linux-gnu, i686-linux-gnu, powerpc64-linux-gnu,
and powerpc-linux-gnu by removing the arch-specific assembly
implementation and disabling multi-arch (it covers both LE and BE
for 64 and 32 bits).
Co-authored-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
New algorithm now calls strchrnul.
Checked on x86_64-linux-gnu, i686-linux-gnu, powerpc-linux-gnu,
and powerpc64-linux-gnu by removing the arch-specific assembly
implementation and disabling multi-arch (it covers both LE and BE
for 64 and 32 bits).
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
New algorithm read the first aligned address and mask off the unwanted
bytes (this strategy is similar to arch-specific implementations used
on powerpc, sparc, and sh).
The loop now read word-aligned address and check using the has_zero_eq
function.
Checked on x86_64-linux-gnu, i686-linux-gnu, powerpc64-linux-gnu,
and powerpc-linux-gnu by removing the arch-specific assembly
implementation and disabling multi-arch (it covers both LE and BE
for 64 and 32 bits).
Co-authored-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
New algorithm read the first aligned address and mask off the
unwanted bytes (this strategy is similar to arch-specific
implementations used on powerpc, sparc, and sh).
The loop now read word-aligned address and check using the has_zero
macro.
Checked on x86_64-linux-gnu, i686-linux-gnu, powerpc-linux-gnu,
and powercp64-linux-gnu by removing the arch-specific assembly
implementation and disabling multi-arch (it covers both LE and BE
for 64 and 32 bits).
Co-authored-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
This patch adds generic string find and detection meant to be used in
generic vectorized string implementation. The idea is to decompose the
basic string operation so each architecture can reimplement if it
provides any specialized hardware instruction.
The 'string-misc.h' provides miscellaneous functions:
- extractbyte: extracts the byte from an specific index.
- repeat_bytes: setup an word by replicate the argument on each byte.
The 'string-fza.h' provides zero byte detection functions:
- find_zero_low, find_zero_all, find_eq_low, find_eq_all,
find_zero_eq_low, find_zero_eq_all, and find_zero_ne_all
The 'string-fzb.h' provides boolean zero byte detection functions:
- has_zero: determine if any byte within a word is zero.
- has_eq: determine byte equality between two words.
- has_zero_eq: determine if any byte within a word is zero along with
byte equality between two words.
The 'string-fzi.h' provides positions for string-fza.h results:
- index_first: return index of first zero byte within a word.
- index_last: return index of first byte different between two words.
The 'string-fzc.h' provides a combined version of fza and fzi:
- index_first_zero_eq: return index of first zero byte within a word or
first byte different between two words.
- index_first_zero_ne: return index of first zero byte within a word or
first byte equal between two words.
- index_last_zero: return index of last zero byte within a word.
- index_last_eq: return index of last byte different between two words.
The 'string-shift.h' provides a way to mask off parts of a work based on
some alignmnet (to handle unaligned arguments):
- shift_find, shift_find_last.
Co-authored-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
It moves OP_T_THRES out of memcopy.h to its own header and adjust
each architecture that redefines it.
Checked with a build and check with run-built-tests=no for all major
Linux ABIs.
Co-authored-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
It moves the op_t definition out to an specific header, adds
the attribute 'may-alias', and cleanup its duplicated definitions.
Checked with a build and check with run-built-tests=no for all major
Linux ABIs.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Almost all uses of rawmemchr find the end of a string. Since most targets use
a generic implementation, replacing it with strchr is better since that is
optimized by compilers into strlen (s) + s. Also fix the generic rawmemchr
implementation to use a cast to unsigned char in the if statement.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Improve SVE memcpy by copying 2 vectors if the size is small enough.
This improves performance of random memcpy by ~9% on Neoverse V1, and
33-64 byte copies are ~16% faster.
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
This is a partial fix for mishandling of grouping when formatting
integers. It properly computes the width in the presence of grouping
characters when the width is larger than the number of significant
digits. The precision related issue is documented in bug 23432.
Co-authored-by: Andreas Schwab <schwab@suse.de>
If using -D_FORITFY_SOURCE=3 (in my case, I've patched GCC to add
=3 instead of =2 (we've done =2 for years in Gentoo)), building
glibc tests will fail on testmb like:
```
<command-line>: error: "_FORTIFY_SOURCE" redefined [-Werror]
<built-in>: note: this is the location of the previous definition
cc1: all warnings being treated as errors
make[2]: *** [../o-iterator.mk:9: /var/tmp/portage/sys-libs/glibc-2.36/work/build-x86-x86_64-pc-linux-gnu-nptl/stdlib/testmb.o] Error 1
make[2]: *** Waiting for unfinished jobs....
```
It's just because we're always setting -D_FORTIFY_SOURCE=2
rather than unsetting it first. If F_S is already 2, it's harmless,
but if it's another value (say, 1, or 3), the compiler will bawk.
(I'm not aware of a reason this couldn't be tested with =3,
but the toolchain support is limited for that (too new), and we want
to run the tests everywhere possible.)
Signed-off-by: Sam James <sam@gentoo.org>
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
This file is not used today since we end up using
sysdeps/i386/htl/machine-sp.h. Getting the stack pointer does not need
to be hurd specific and can go into sysdeps/<arch>.
Message-Id: <Y9tpWs2WOgE/Duiq@jupiter.tail36e24.ts.net>
Define the __glibc_fortify and other macros only when __FORTIFY_LEVEL >
0. This has the effect of not defining these macros on older C90
compilers that do not have support for variable length argument lists.
Also trim off the trailing backslashes from the definition of
__glibc_fortify and __glibc_fortify_n macros.
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
This adds a special SHM_ANON value that can be passed into shm_open ()
in place of a name. When called in this way, shm_open () will create a
new anonymous shared memory file. The file will be created in the same
way that other shared memory files are created (i.e., under /dev/shm/),
except that it is not given a name and therefore cannot be reached from
the file system, nor by other calls to shm_open (). This is accomplished
by utilizing O_TMPFILE.
This is intended to be compatible with FreeBSD's API of the same name.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230130125216.6254-4-bugaevc@gmail.com>
This is a flag that causes open () to create a new, unnamed file in the
same filesystem as the given directory. The file descriptor can be
simply used in the creating process as a temporary file, or shared with
children processes via fork (), or sent over a Unix socket. The file can
be left anonymous, in which case it will be deleted from the backing
file system once all copies of the file descriptor are closed, or given
a permanent name with a linkat () call, such as the following:
int fd = open ("/tmp", O_TMPFILE | O_RDWR, 0700);
/* Do something with the file... */
linkat (fd, "", AT_FDCWD, "/tmp/filename", AT_EMPTY_PATH);
In between creating the file and linking it to the file system, it is
possible to set the file content, mode, ownership, author, and other
attributes, so that the file visibly appears in the file system (perhaps
replacing another file) atomically, with all of its attributes already
set up.
The Hurd support for O_TMPFILE directly exposes the dir_mkfile RPC to
user programs. Previously, dir_mkfile was used by glibc internally, in
particular for implementing tmpfile (), but not exposed to user programs
through a Unix-level API.
O_TMPFILE was initially introduced by Linux. This implementation is
intended to be compatible with the Linux implementation, except that the
O_EXCL flag is not given the special meaning when used together with
O_TMPFILE, unlike on Linux.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230130125216.6254-3-bugaevc@gmail.com>
Instead of __file_name_lookup_at delegating to __file_name_lookup
in simple cases, make __file_name_lookup_at deal with both cases, and
have __file_name_lookup simply wrap __file_name_lookup_at.
This factorizes handling the empy name case.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20230130125216.6254-2-bugaevc@gmail.com>
Add an optimization to avoid calling clone3 when glibc detects that
there is no kernel support. It also adds __ASSUME_CLONE3, which allows
skipping this optimization and issuing the clone3 syscall directly.
It does not handle the the small window between 5.3 and 5.5 for
posix_spawn (CLONE_CLEAR_SIGHAND was added in 5.5).
Checked on x86_64-linux-gnu.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
It follow the internal signature:
extern int clone3 (struct clone_args *__cl_args, size_t __size,
int (*__func) (void *__arg), void *__arg);
Checked on aarch64-linux-gnu.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
The clone3 flag resets all signal handlers of the child not set to
SIG_IGN to SIG_DFL. It allows to skip most of the sigaction calls
to setup child signal handling, where previously a posix_spawn
had to issue 2 times NSIG sigaction calls (one to obtain the current
disposition and another to set either SIG_DFL or SIG_IGN).
With POSIX_SPAWN_SETSIGDEF the child will setup the signal for the case
where the disposition is SIG_IGN.
The code must handle the fallback where clone3 is not available. This is
done by splitting __clone_internal_fallback from __clone_internal.
Checked on x86_64-linux-gnu.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
All internal callers of __clone3 should provide an already aligned
stack. Removing the stack alignment in __clone3 is a net gain: it
simplifies the internal function contract (mask/unmask signals) along
with the arch-specific code.
Checked on x86_64-linux-gnu.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Different than kernel, clone3 returns EINVAL for NULL struct
clone_args or function pointer. This is similar to clone
interface that return EINVAL for NULL function argument.
It also clean up the Linux clone3.h interface, since it not
currently exported.
Checked on x86_64-linux-gnu.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
There is no need to issue another sigaction if the disposition is
already SIG_DFL.
Checked on x86_64-linux-gnu.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2023-02-01 08:42:11 -03:00
15519 changed files with 443431 additions and 278324 deletions