glibc/sysdeps
Szabolcs Nagy d2123d6827 elf: Fix slow tls access after dlopen [BZ #19924]
In short: __tls_get_addr checks the global generation counter and, if
the current dtv is older, _dl_update_slotinfo updates the dtv up to the
generation of the accessed module. So if the global generation is newer
than the generation of the module, __tls_get_addr keeps hitting the
slow dtv update path. The dtv update path includes a number of checks
to see if any update is needed, and this already causes a measurable
tls access slowdown after dlopen.

It may be possible to detect an up-to-date dtv faster.  But if there
are many modules loaded (> TLS_SLOTINFO_SURPLUS) then this requires at
least walking the slotinfo list.

This patch tries to update the dtv to the global generation instead, so
after a dlopen the tls access slow path is only hit once.  Modules with
a larger generation than the accessed one were not necessarily
synchronized before, so additional synchronization is needed.
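
The difference between the two update policies can be illustrated with a
toy model. This is only a sketch: the struct and function names below
(toy_state, toy_tls_access, toy_count_slow_hits) are hypothetical and do
not correspond to glibc's internal data structures. It assumes one
simulated dlopen of an unrelated module, followed by repeated accesses to
a module with an older generation.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: hypothetical names, not glibc internals.  */
struct toy_state
{
  size_t global_gen;  /* Bumped by every simulated dlopen.  */
  size_t dtv_gen;     /* Generation the thread's dtv was last updated to.  */
  size_t slow_hits;   /* Number of times the slow path ran.  */
};

/* Simulate one tls access to a module whose generation is MODULE_GEN.
   If UPDATE_TO_GLOBAL is nonzero, the slow path brings the dtv up to the
   global generation; otherwise only up to MODULE_GEN.  */
static void
toy_tls_access (struct toy_state *s, size_t module_gen, int update_to_global)
{
  assert (s != NULL);
  if (s->dtv_gen != s->global_gen)
    {
      /* Slow path: the dtv looks stale compared to the global counter.  */
      s->slow_hits++;
      size_t target = update_to_global ? s->global_gen : module_gen;
      if (s->dtv_gen < target)
        s->dtv_gen = target;
    }
}

/* Run ACCESSES accesses to a generation-1 module after one simulated
   dlopen and return how often the slow path was taken.  */
static size_t
toy_count_slow_hits (int update_to_global, int accesses)
{
  struct toy_state s = { .global_gen = 1, .dtv_gen = 1, .slow_hits = 0 };
  s.global_gen = 2;  /* Simulated dlopen of an unrelated module.  */
  for (int i = 0; i < accesses; i++)
    toy_tls_access (&s, 1, update_to_global);
  return s.slow_hits;
}
```

In this model, toy_count_slow_hits (0, 1000) takes the slow path on every
access, because the dtv generation never catches up with the global one,
while toy_count_slow_hits (1, 1000) takes it only on the first access.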

This patch uses acquire/release synchronization when accessing the
generation counter.
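
As a rough illustration of acquire/release synchronization on a
generation counter, here is a minimal C11 sketch. It is not the code from
the patch: toy_generation, toy_slot_data, toy_publish and
toy_read_if_newer are hypothetical names, and the sketch assumes that
writers are serialized by some external lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for a generation counter and the data that a
   new generation makes visible.  */
static _Atomic size_t toy_generation = 1;
static _Atomic size_t toy_slot_data;

/* Writer side (think: simulated dlopen).  Assumed to be serialized by an
   external lock.  The data is written first, then the new generation is
   published with a release store.  */
static void
toy_publish (size_t data)
{
  atomic_store_explicit (&toy_slot_data, data, memory_order_relaxed);
  size_t next
    = atomic_load_explicit (&toy_generation, memory_order_relaxed) + 1;
  atomic_store_explicit (&toy_generation, next, memory_order_release);
}

/* Reader side.  The acquire load pairs with the release store above, so
   a reader that observes a new generation also observes the data written
   before it was published.  Returns 1 if *SEEN_GEN was stale.  */
static int
toy_read_if_newer (size_t *seen_gen, size_t *out)
{
  assert (seen_gen != NULL && out != NULL);
  size_t gen = atomic_load_explicit (&toy_generation, memory_order_acquire);
  if (gen == *seen_gen)
    return 0;  /* Fast path: nothing changed since the last check.  */
  *out = atomic_load_explicit (&toy_slot_data, memory_order_relaxed);
  *seen_gen = gen;
  return 1;
}
```

A reader that starts with a seen generation of 1 gets 0 from
toy_read_if_newer until toy_publish is called; the next call returns 1,
stores the published value and records the new generation.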

Note: in the x86_64 version of dl-tls.c the generation is only loaded
once, since a relaxed mo load is not faster than an acquire mo load.

I have not benchmarked this. Tested by Adhemerval Zanella on aarch64,
powerpc, sparc, x86 who reported that it fixes the performance issue
of bug 19924.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2023-09-01 08:21:37 +01:00
..
aarch64 configure: Use autoconf 2.71 2023-07-17 10:08:10 -04:00
alpha Remove references to the defunct db2 subdir 2023-08-21 18:20:53 +02:00
arc configure: Use autoconf 2.71 2023-07-17 10:08:10 -04:00
arm configure: Use autoconf 2.71 2023-07-17 10:08:10 -04:00
csky configure: Use autoconf 2.71 2023-07-17 10:08:10 -04:00
generic elf: Fix slow tls access after dlopen [BZ #19924] 2023-09-01 08:21:37 +01:00
gnu configure: Use autoconf 2.71 2023-07-17 10:08:10 -04:00
hppa configure: Use autoconf 2.71 2023-07-17 10:08:10 -04:00
htl htl: move pthread_attr_setdetachstate into libc 2023-08-24 01:57:22 +02:00
hurd hurd: Fix using interposable hurd_thread_self 2023-05-19 20:45:51 +02:00
i386 i686: Fix build with --disable-multiarch 2023-08-10 10:29:29 -03:00
ia64 configure: Use autoconf 2.71 2023-07-17 10:08:10 -04:00
ieee754 x86_64: Add log1p with FMA 2023-08-21 10:44:26 -07:00
loongarch LoongArch: Change loongarch to LoongArch in comments 2023-08-29 10:35:38 +08:00
m68k m68k: Use M68K_SCALE_AVAILABLE on __mpn_lshift and __mpn_rshift 2023-08-25 10:07:24 -03:00
mach htl: move pthread_attr_setdetachstate into libc 2023-08-24 01:57:22 +02:00
microblaze configure: Use autoconf 2.71 2023-07-17 10:08:10 -04:00
mips MIPS: Update mips32 and mip64 libm test ulps 2023-07-25 22:20:57 +02:00
nios2 configure: Use autoconf 2.71 2023-07-17 10:08:10 -04:00
nptl Fix misspellings in sysdeps/ -- BZ 25337 2023-05-30 23:02:29 +00:00
or1k configure: Use autoconf 2.71 2023-07-17 10:08:10 -04:00
posix hurd: readv: Get rid of alloca 2023-06-20 19:15:10 +02:00
powerpc powerpc longjmp: Fix build after chk hidden builtin fix 2023-08-04 10:03:59 +02:00
pthread Exclude routines from fortification 2023-07-05 16:59:48 +02:00
riscv riscv: Update rvd libm test ulps 2023-07-22 15:55:33 +02:00
s390 s390x: Fix static PIE condition for toolchain bootstrapping. 2023-08-18 10:57:59 +02:00
sh configure: Use autoconf 2.71 2023-07-17 10:08:10 -04:00
sparc Remove references to the defunct db2 subdir 2023-08-21 18:20:53 +02:00
unix LoongArch: Micro-optimize LD_PCREL 2023-08-29 10:35:38 +08:00
wordsize-32 Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
wordsize-64 hurd: Fix tst-writev test 2023-05-01 13:01:30 +02:00
x86 x86: Check the lower byte of EAX of CPUID leaf 2 [BZ #30643] 2023-08-29 12:57:41 -07:00
x86_64 elf: Fix slow tls access after dlopen [BZ #19924] 2023-09-01 08:21:37 +01:00