Patch series "Introduce mseal", v10.

This patchset proposes a new mseal() syscall for the Linux kernel.

In a nutshell, mseal() protects the VMAs of a given virtual memory range against modifications, such as changes to their permission bits.

Modern CPUs support memory permissions, such as the read/write (RW) and no-execute (NX) bits. Linux has supported NX since the release of kernel version 2.6.8 in August 2004 [1]. The memory permission feature improves the security posture against memory corruption bugs, as an attacker cannot simply write to arbitrary memory and point the code to it. The memory must be marked with the X bit, or else an exception will occur. Internally, the kernel maintains the memory permissions in a data structure called VMA (vm_area_struct). mseal() additionally protects the VMA itself against modifications of the selected seal type.

Memory sealing is useful to mitigate memory corruption issues where a corrupted pointer is passed to a memory management system. For example, such an attacker primitive can break control-flow integrity guarantees, since read-only memory that is supposed to be trusted can become writable or .text pages can get remapped. Memory sealing can automatically be applied by the runtime loader to seal .text and .rodata pages, and applications can additionally seal security-critical data at runtime. A similar feature already exists in the XNU kernel with the VM_FLAGS_PERMANENT [3] flag and on OpenBSD with the mimmutable syscall [4]. Also, Chrome wants to adopt this feature for their CFI work [2] and this patchset has been designed to be compatible with the Chrome use case.

Two system calls are involved in sealing the map: mmap() and mseal().

The new mseal() is a syscall on 64-bit CPUs, with the following signature:

int mseal(void *addr, size_t len, unsigned long flags)
addr/len: memory range.
flags: reserved.

mseal() blocks the following operations for the given memory range:
1> Unmapping, moving to another location, and shrinking the size, via munmap() and mremap(). These can leave an empty space, which can then be replaced with a VMA with a new set of attributes.

2> Moving or expanding a different VMA into the current location, via mremap().

3> Modifying a VMA via mmap(MAP_FIXED).

4> Size expansion, via mremap(), does not appear to pose any specific risks to sealed VMAs. It is included anyway because the use case is unclear. In any case, users can rely on merging to expand a sealed VMA.

5> mprotect() and pkey_mprotect().

6> Some destructive madvise() behaviors (e.g. MADV_DONTNEED) for anonymous memory, when users don't have write permission to the memory. Those behaviors can alter region contents by discarding pages, effectively a memset(0) for anonymous memory.

The idea that inspired this patch comes from Stephen Röttger’s work in V8 CFI [5]. Chrome browser in ChromeOS will be the first user of this API.

Indeed, the Chrome browser has very specific requirements for sealing, which are distinct from those of most applications. For example, in the case of libc, sealing is only applied to read-only (RO) or read-execute (RX) memory segments (such as .text and .RELRO) to prevent them from becoming writable; the lifetime of those mappings is tied to the lifetime of the process.

Chrome wants to seal two large address space reservations that are managed by different allocators. The memory is mapped RW- and RWX respectively, but write access to it is restricted using pkeys (or, in the future, ARM permission overlay extensions). The lifetime of those mappings is not tied to the lifetime of the process; therefore, while the memory is sealed, the allocators still need to free or discard the unused memory, for example with madvise(DONTNEED).

However, always allowing madvise(DONTNEED) on this range poses a security risk.
For example, if a jump instruction crosses a page boundary and the second page gets discarded, it will overwrite the target bytes with zeros and change the control flow. Checking write permission before the discard operation allows us to control when the operation is valid. In this case, the madvise will only succeed if the executing thread has PKEY write permissions and PKRU changes are protected in software by control-flow integrity.

Although the initial version of this patch series is targeting the Chrome browser as its first user, it became evident during upstream discussions that we would also want to ensure that the patch set eventually is a complete solution for memory sealing and compatible with other use cases. The specific scenario currently in mind is glibc's use case of loading and sealing ELF executables. To this end, Stephen is working on a change to glibc to add sealing support to the dynamic linker, which will seal all non-writable segments at startup. Once this work is completed, all applications will be able to automatically benefit from these new protections.

In closing, I would like to formally acknowledge the valuable contributions received during the RFC process, which were instrumental in shaping this patch:

Jann Horn: raising awareness and providing valuable insights on the destructive madvise operations.
Liam R. Howlett: perf optimization.
Linus Torvalds: assisting in defining system call signature and scope.
Theo de Raadt: sharing the experiences and insight gained from implementing mimmutable() in OpenBSD.

MM perf benchmarks
==================
This patch adds a loop in mprotect/munmap/madvise(DONTNEED) to check the VMAs’ sealing flag, so that no partial update can be made when any segment within the given memory range is sealed.

To measure the performance impact of this loop, two tests were developed. [8]

The first measures the time taken for a particular system call, using clock_gettime(CLOCK_MONOTONIC).
The second uses PERF_COUNT_HW_REF_CPU_CYCLES (excluding user space). Both tests have similar results.

The tests roughly follow the sequence below:

for (i = 0; i < 1000; i++)
    create 1000 mappings (1 page per VMA)
    start the sampling
    for (j = 0; j < 1000; j++)
        mprotect one mapping
    stop and save the sample
    delete 1000 mappings
calculate all samples.

The tests below were performed on an Intel(R) Pentium(R) Gold 7505 @ 2.00GHz, 4G memory, Chromebook.

Based on the latest upstream code:

The first test (measuring time)
syscall__  vmas  t      t_mseal  delta_ns  per_vma  %
munmap__   1     909    944      35        35       104%
munmap__   2     1398   1502     104       52       107%
munmap__   4     2444   2594     149       37       106%
munmap__   8     4029   4323     293       37       107%
munmap__   16    6647   6935     288       18       104%
munmap__   32    11811  12398    587       18       105%
mprotect   1     439    465      26        26       106%
mprotect   2     1659   1745     86        43       105%
mprotect   4     3747   3889     142       36       104%
mprotect   8     6755   6969     215       27       103%
mprotect   16    13748  14144    396       25       103%
mprotect   32    27827  28969    1142      36       104%
madvise_   1     240    262      22        22       109%
madvise_   2     366    442      76        38       121%
madvise_   4     623    751      128       32       121%
madvise_   8     1110   1324     215       27       119%
madvise_   16    2127   2451     324       20       115%
madvise_   32    4109   4642     534       17       113%

The second test (measuring cpu cycle)
syscall__  vmas  cpu    cmseal   delta_cpu  per_vma  %
munmap__   1     1790   1890     100        100      106%
munmap__   2     2819   3033     214        107      108%
munmap__   4     4959   5271     312        78       106%
munmap__   8     8262   8745     483        60       106%
munmap__   16    13099  14116    1017       64       108%
munmap__   32    23221  24785    1565       49       107%
mprotect   1     906    967      62         62       107%
mprotect   2     3019   3203     184        92       106%
mprotect   4     6149   6569     420        105      107%
mprotect   8     9978   10524    545        68       105%
mprotect   16    20448  21427    979        61       105%
mprotect   32    40972  42935    1963       61       105%
madvise_   1     434    497      63         63       115%
madvise_   2     752    899      147        74       120%
madvise_   4     1313   1513     200        50       115%
madvise_   8     2271   2627     356        44       116%
madvise_   16    4312   4883     571        36       113%
madvise_   32    8376   9319     943        29       111%

Based on these results, for the 6.8 kernel, the sealing check adds 20-40 nanoseconds, or around 50-100 CPU cycles, per VMA.
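
The derived columns in the two tables above appear to follow simple arithmetic from the two measured columns. The following is a minimal Python sketch under that assumption: delta is the sealed measurement minus the baseline, per_vma is delta divided by the number of VMAs, and % is the sealed measurement relative to the baseline. The Row type and the derive() helper are hypothetical names used only for illustration. Some published rows differ from this arithmetic by one unit, presumably because the underlying averages were rounded before printing.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Row:
    """One benchmark row (hypothetical type, for illustration only)."""
    syscall: str
    vmas: int
    base: int    # baseline measurement, in ns or CPU cycles
    sealed: int  # same measurement with the sealing check applied


def derive(row: Row) -> Tuple[int, int, int]:
    """Return (delta, per_vma, percent) for one row.

    Assumption: these mirror the delta_*, per_vma and % columns.
    """
    delta = row.sealed - row.base
    per_vma = round(delta / row.vmas)
    percent = round(100 * row.sealed / row.base)
    return delta, per_vma, percent


# Three rows copied from the first (time) table.
rows = [
    Row("munmap", 1, 909, 944),
    Row("munmap", 2, 1398, 1502),
    Row("mprotect", 32, 27827, 28969),
]

for row in rows:
    print(row.syscall, row.vmas, derive(row))
```

Dividing by the VMA count is what makes rows with different range sizes comparable, which is why the summary is stated per VMA.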

In addition, I applied the sealing to the 5.10 kernel:

The first test (measuring time)
syscall__  vmas  t      t_mseal  delta_ns  per_vma  %
munmap__   1     357    390      33        33       109%
munmap__   2     442    463      21        11       105%
munmap__   4     614    634      20        5        103%
munmap__   8     1017   1137     120       15       112%
munmap__   16    1889   2153     263       16       114%
munmap__   32    4109   4088     -21       -1       99%
mprotect   1     235    227      -7        -7       97%
mprotect   2     495    464      -30       -15      94%
mprotect   4     741    764      24        6        103%
mprotect   8     1434   1437     2         0        100%
mprotect   16    2958   2991     33        2        101%
mprotect   32    6431   6608     177       6        103%
madvise_   1     191    208      16        16       109%
madvise_   2     300    324      24        12       108%
madvise_   4     450    473      23        6        105%
madvise_   8     753    806      53        7        107%
madvise_   16    1467   1592     125       8        108%
madvise_   32    2795   3405     610       19       122%

The second test (measuring cpu cycle)
syscall__  vmas  cpu    cmseal   delta_cpu  per_vma  %
munmap__   1     684    715      31         31       105%
munmap__   2     861    898      38         19       104%
munmap__   4     1183   1235     51         13       104%
munmap__   8     1999   2045     46         6        102%
munmap__   16    3839   3816     -23        -1       99%
munmap__   32    7672   7887     216        7        103%
mprotect   1     397    443      46         46       112%
mprotect   2     738    788      50         25       107%
mprotect   4     1221   1256     35         9        103%
mprotect   8     2356   2429     72         9        103%
mprotect   16    4961   4935     -26        -2       99%
mprotect   32    9882   10172    291        9        103%
madvise_   1     351    380      29         29       108%
madvise_   2     565    615      49         25       109%
madvise_   4     872    933      61         15       107%
madvise_   8     1508   1640     132        16       109%
madvise_   16    3078   3323     245        15       108%
madvise_   32    5893   6704     811        25       114%

For the 5.10 kernel, the sealing check adds 0-15 ns in time, or 10-30 CPU cycles, per VMA; there is even a decrease in some cases.

It might be interesting to compare the 5.10 and 6.8 kernels:

The first test (measuring time)
syscall__  vmas  t_5_10  t_6_8  delta_ns  per_vma  %
munmap__   1     357     909    552       552      254%
munmap__   2     442     1398   956       478      316%
munmap__   4     614     2444   1830      458      398%
munmap__   8     1017    4029   3012      377      396%
munmap__   16    1889    6647   4758      297      352%
munmap__   32    4109    11811  7702      241      287%
mprotect   1     235     439    204       204      187%
mprotect   2     495     1659   1164      582      335%
mprotect   4     741     3747   3006      752      506%
mprotect   8     1434    6755   5320      665      471%
mprotect   16    2958    13748  10790     674      465%
mprotect   32    6431    27827  21397     669      433%
madvise_   1     191     240    49        49       125%
madvise_   2     300     366    67        33       122%
madvise_   4     450     623    173       43       138%
madvise_   8     753     1110   357       45       147%
madvise_   16    1467    2127   660       41       145%
madvise_   32    2795    4109   1314      41       147%

The second test (measuring cpu cycle)
syscall__  vmas  cpu_5_10  c_6_8  delta_cpu  per_vma  %
munmap__   1     684       1790   1106       1106     262%
munmap__   2     861       2819   1958       979      327%
munmap__   4     1183      4959   3776       944      419%
munmap__   8     1999      8262   6263       783      413%
munmap__   16    3839      13099  9260       579      341%
munmap__   32    7672      23221  15549      486      303%
mprotect   1     397       906    509        509      228%
mprotect   2     738       3019   2281       1140     409%
mprotect   4     1221      6149   4929       1232     504%
mprotect   8     2356      9978   7622       953      423%
mprotect   16    4961      20448  15487      968      412%
mprotect   32    9882      40972  31091      972      415%
madvise_   1     351       434    82         82       123%
madvise_   2     565       752    186        93       133%
madvise_   4     872       1313   442        110      151%
madvise_   8     1508      2271   763        95       151%
madvise_   16    3078      4312   1234       77       140%
madvise_   32    5893      8376   2483       78       142%

From 5.10 to 6.8:
munmap: added 250-550 ns in time, or 500-1100 in cpu cycle, per vma.
mprotect: added 200-750 ns in time, or 500-1200 in cpu cycle, per vma.
madvise: added 33-50 ns in time, or 70-110 in cpu cycle, per vma.

In comparison to mseal, which adds 20-40 ns or 50-100 CPU cycles, the increase from 5.10 to 6.8 is significantly larger, approximately ten times greater for munmap and mprotect.
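
The "approximately ten times" comparison can be sanity-checked from the per-VMA ranges quoted above. This is a rough Python sketch, not part of the original measurements: it assumes, purely for illustration, that the midpoint of each quoted range is a fair summary of that range. The names mseal_ns, regression_ns and ratio_to_mseal() are hypothetical.

```python
from typing import Tuple

# Per-VMA overhead ranges in nanoseconds, as quoted in the text.
mseal_ns = (20, 40)
regression_ns = {
    "munmap": (250, 550),
    "mprotect": (200, 750),
    "madvise": (33, 50),
}


def midpoint(rng: Tuple[int, int]) -> float:
    """Midpoint of a (low, high) range; an illustrative summary only."""
    return (rng[0] + rng[1]) / 2


def ratio_to_mseal(rng: Tuple[int, int]) -> float:
    """How many times larger a range's midpoint is than mseal's midpoint."""
    return midpoint(rng) / midpoint(mseal_ns)


for name, rng in regression_ns.items():
    print(f"{name}: {ratio_to_mseal(rng):.1f}x the mseal overhead")
```

With these midpoints, munmap and mprotect come out above ten times the mseal overhead, while madvise stays within a factor of two, which matches the text singling out munmap and mprotect.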

When I discussed the mm performance with Brian Makin, an engineer who worked on performance, it was brought to my attention that such performance benchmarks, which measure millions of mm syscalls in a tight loop, may not accurately reflect real-world scenarios, such as that of a database service. Also, this was tested on a single hardware platform running ChromeOS; the data from other hardware or another distribution might be different. It might be best to take this data with a grain of salt.

This patch (of 5):

Wire up mseal syscall for all architectures.

Link: https://lkml.kernel.org/r/20240415163527.626541-1-jeffxu@chromium.org
Link: https://lkml.kernel.org/r/20240415163527.626541-2-jeffxu@chromium.org
Signed-off-by: Jeff Xu <jeffxu@chromium.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Guenter Roeck <groeck@chromium.org>
Cc: Jann Horn <jannh@google.com> [Bug #2]
Cc: Jeff Xu <jeffxu@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Jorge Lucangeli Obes <jorgelo@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Muhammad Usama Anjum <usama.anjum@collabora.com>
Cc: Pedro Falcato <pedro.falcato@gmail.com>
Cc: Stephen Röttger <sroettger@google.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Amer Al Shanawany <amer.shanawany@gmail.com>
Cc: Javier Carrasco <javier.carrasco.cruz@gmail.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
431 lines
15 KiB
Text
#
# 64-bit system call numbers and entry vectors
#
# The format is:
# <number> <abi> <name> <entry point>
#
# The __x64_sys_*() stubs are created on-the-fly for sys_*() system calls
#
# The abi is "common", "64" or "x32" for this file.
#
0 common read sys_read
1 common write sys_write
2 common open sys_open
3 common close sys_close
4 common stat sys_newstat
5 common fstat sys_newfstat
6 common lstat sys_newlstat
7 common poll sys_poll
8 common lseek sys_lseek
9 common mmap sys_mmap
10 common mprotect sys_mprotect
11 common munmap sys_munmap
12 common brk sys_brk
13 64 rt_sigaction sys_rt_sigaction
14 common rt_sigprocmask sys_rt_sigprocmask
15 64 rt_sigreturn sys_rt_sigreturn
16 64 ioctl sys_ioctl
17 common pread64 sys_pread64
18 common pwrite64 sys_pwrite64
19 64 readv sys_readv
20 64 writev sys_writev
21 common access sys_access
22 common pipe sys_pipe
23 common select sys_select
24 common sched_yield sys_sched_yield
25 common mremap sys_mremap
26 common msync sys_msync
27 common mincore sys_mincore
28 common madvise sys_madvise
29 common shmget sys_shmget
30 common shmat sys_shmat
31 common shmctl sys_shmctl
32 common dup sys_dup
33 common dup2 sys_dup2
34 common pause sys_pause
35 common nanosleep sys_nanosleep
36 common getitimer sys_getitimer
37 common alarm sys_alarm
38 common setitimer sys_setitimer
39 common getpid sys_getpid
40 common sendfile sys_sendfile64
41 common socket sys_socket
42 common connect sys_connect
43 common accept sys_accept
44 common sendto sys_sendto
45 64 recvfrom sys_recvfrom
46 64 sendmsg sys_sendmsg
47 64 recvmsg sys_recvmsg
48 common shutdown sys_shutdown
49 common bind sys_bind
50 common listen sys_listen
51 common getsockname sys_getsockname
52 common getpeername sys_getpeername
53 common socketpair sys_socketpair
54 64 setsockopt sys_setsockopt
55 64 getsockopt sys_getsockopt
56 common clone sys_clone
57 common fork sys_fork
58 common vfork sys_vfork
59 64 execve sys_execve
60 common exit sys_exit
61 common wait4 sys_wait4
62 common kill sys_kill
63 common uname sys_newuname
64 common semget sys_semget
65 common semop sys_semop
66 common semctl sys_semctl
67 common shmdt sys_shmdt
68 common msgget sys_msgget
69 common msgsnd sys_msgsnd
70 common msgrcv sys_msgrcv
71 common msgctl sys_msgctl
72 common fcntl sys_fcntl
73 common flock sys_flock
74 common fsync sys_fsync
75 common fdatasync sys_fdatasync
76 common truncate sys_truncate
77 common ftruncate sys_ftruncate
78 common getdents sys_getdents
79 common getcwd sys_getcwd
80 common chdir sys_chdir
81 common fchdir sys_fchdir
82 common rename sys_rename
83 common mkdir sys_mkdir
84 common rmdir sys_rmdir
85 common creat sys_creat
86 common link sys_link
87 common unlink sys_unlink
88 common symlink sys_symlink
89 common readlink sys_readlink
90 common chmod sys_chmod
91 common fchmod sys_fchmod
92 common chown sys_chown
93 common fchown sys_fchown
94 common lchown sys_lchown
95 common umask sys_umask
96 common gettimeofday sys_gettimeofday
97 common getrlimit sys_getrlimit
98 common getrusage sys_getrusage
99 common sysinfo sys_sysinfo
100 common times sys_times
101 64 ptrace sys_ptrace
102 common getuid sys_getuid
103 common syslog sys_syslog
104 common getgid sys_getgid
105 common setuid sys_setuid
106 common setgid sys_setgid
107 common geteuid sys_geteuid
108 common getegid sys_getegid
109 common setpgid sys_setpgid
110 common getppid sys_getppid
111 common getpgrp sys_getpgrp
112 common setsid sys_setsid
113 common setreuid sys_setreuid
114 common setregid sys_setregid
115 common getgroups sys_getgroups
116 common setgroups sys_setgroups
117 common setresuid sys_setresuid
118 common getresuid sys_getresuid
119 common setresgid sys_setresgid
120 common getresgid sys_getresgid
121 common getpgid sys_getpgid
122 common setfsuid sys_setfsuid
123 common setfsgid sys_setfsgid
124 common getsid sys_getsid
125 common capget sys_capget
126 common capset sys_capset
127 64 rt_sigpending sys_rt_sigpending
128 64 rt_sigtimedwait sys_rt_sigtimedwait
129 64 rt_sigqueueinfo sys_rt_sigqueueinfo
130 common rt_sigsuspend sys_rt_sigsuspend
131 64 sigaltstack sys_sigaltstack
132 common utime sys_utime
133 common mknod sys_mknod
134 64 uselib
135 common personality sys_personality
136 common ustat sys_ustat
137 common statfs sys_statfs
138 common fstatfs sys_fstatfs
139 common sysfs sys_sysfs
140 common getpriority sys_getpriority
141 common setpriority sys_setpriority
142 common sched_setparam sys_sched_setparam
143 common sched_getparam sys_sched_getparam
144 common sched_setscheduler sys_sched_setscheduler
145 common sched_getscheduler sys_sched_getscheduler
146 common sched_get_priority_max sys_sched_get_priority_max
147 common sched_get_priority_min sys_sched_get_priority_min
148 common sched_rr_get_interval sys_sched_rr_get_interval
149 common mlock sys_mlock
150 common munlock sys_munlock
151 common mlockall sys_mlockall
152 common munlockall sys_munlockall
153 common vhangup sys_vhangup
154 common modify_ldt sys_modify_ldt
155 common pivot_root sys_pivot_root
156 64 _sysctl sys_ni_syscall
157 common prctl sys_prctl
158 common arch_prctl sys_arch_prctl
159 common adjtimex sys_adjtimex
160 common setrlimit sys_setrlimit
161 common chroot sys_chroot
162 common sync sys_sync
163 common acct sys_acct
164 common settimeofday sys_settimeofday
165 common mount sys_mount
166 common umount2 sys_umount
167 common swapon sys_swapon
168 common swapoff sys_swapoff
169 common reboot sys_reboot
170 common sethostname sys_sethostname
171 common setdomainname sys_setdomainname
172 common iopl sys_iopl
173 common ioperm sys_ioperm
174 64 create_module
175 common init_module sys_init_module
176 common delete_module sys_delete_module
177 64 get_kernel_syms
178 64 query_module
179 common quotactl sys_quotactl
180 64 nfsservctl
181 common getpmsg
182 common putpmsg
183 common afs_syscall
184 common tuxcall
185 common security
186 common gettid sys_gettid
187 common readahead sys_readahead
188 common setxattr sys_setxattr
189 common lsetxattr sys_lsetxattr
190 common fsetxattr sys_fsetxattr
191 common getxattr sys_getxattr
192 common lgetxattr sys_lgetxattr
193 common fgetxattr sys_fgetxattr
194 common listxattr sys_listxattr
195 common llistxattr sys_llistxattr
196 common flistxattr sys_flistxattr
197 common removexattr sys_removexattr
198 common lremovexattr sys_lremovexattr
199 common fremovexattr sys_fremovexattr
200 common tkill sys_tkill
201 common time sys_time
202 common futex sys_futex
203 common sched_setaffinity sys_sched_setaffinity
204 common sched_getaffinity sys_sched_getaffinity
205 64 set_thread_area
206 64 io_setup sys_io_setup
207 common io_destroy sys_io_destroy
208 common io_getevents sys_io_getevents
209 64 io_submit sys_io_submit
210 common io_cancel sys_io_cancel
211 64 get_thread_area
212 common lookup_dcookie
213 common epoll_create sys_epoll_create
214 64 epoll_ctl_old
215 64 epoll_wait_old
216 common remap_file_pages sys_remap_file_pages
217 common getdents64 sys_getdents64
218 common set_tid_address sys_set_tid_address
219 common restart_syscall sys_restart_syscall
220 common semtimedop sys_semtimedop
221 common fadvise64 sys_fadvise64
222 64 timer_create sys_timer_create
223 common timer_settime sys_timer_settime
224 common timer_gettime sys_timer_gettime
225 common timer_getoverrun sys_timer_getoverrun
226 common timer_delete sys_timer_delete
227 common clock_settime sys_clock_settime
228 common clock_gettime sys_clock_gettime
229 common clock_getres sys_clock_getres
230 common clock_nanosleep sys_clock_nanosleep
231 common exit_group sys_exit_group
232 common epoll_wait sys_epoll_wait
233 common epoll_ctl sys_epoll_ctl
234 common tgkill sys_tgkill
235 common utimes sys_utimes
236 64 vserver
237 common mbind sys_mbind
238 common set_mempolicy sys_set_mempolicy
239 common get_mempolicy sys_get_mempolicy
240 common mq_open sys_mq_open
241 common mq_unlink sys_mq_unlink
242 common mq_timedsend sys_mq_timedsend
243 common mq_timedreceive sys_mq_timedreceive
244 64 mq_notify sys_mq_notify
245 common mq_getsetattr sys_mq_getsetattr
246 64 kexec_load sys_kexec_load
247 64 waitid sys_waitid
248 common add_key sys_add_key
249 common request_key sys_request_key
250 common keyctl sys_keyctl
251 common ioprio_set sys_ioprio_set
252 common ioprio_get sys_ioprio_get
253 common inotify_init sys_inotify_init
254 common inotify_add_watch sys_inotify_add_watch
255 common inotify_rm_watch sys_inotify_rm_watch
256 common migrate_pages sys_migrate_pages
257 common openat sys_openat
258 common mkdirat sys_mkdirat
259 common mknodat sys_mknodat
260 common fchownat sys_fchownat
261 common futimesat sys_futimesat
262 common newfstatat sys_newfstatat
263 common unlinkat sys_unlinkat
264 common renameat sys_renameat
265 common linkat sys_linkat
266 common symlinkat sys_symlinkat
267 common readlinkat sys_readlinkat
268 common fchmodat sys_fchmodat
269 common faccessat sys_faccessat
270 common pselect6 sys_pselect6
271 common ppoll sys_ppoll
272 common unshare sys_unshare
273 64 set_robust_list sys_set_robust_list
274 64 get_robust_list sys_get_robust_list
275 common splice sys_splice
276 common tee sys_tee
277 common sync_file_range sys_sync_file_range
278 64 vmsplice sys_vmsplice
279 64 move_pages sys_move_pages
280 common utimensat sys_utimensat
281 common epoll_pwait sys_epoll_pwait
282 common signalfd sys_signalfd
283 common timerfd_create sys_timerfd_create
284 common eventfd sys_eventfd
285 common fallocate sys_fallocate
286 common timerfd_settime sys_timerfd_settime
287 common timerfd_gettime sys_timerfd_gettime
288 common accept4 sys_accept4
289 common signalfd4 sys_signalfd4
290 common eventfd2 sys_eventfd2
291 common epoll_create1 sys_epoll_create1
292 common dup3 sys_dup3
293 common pipe2 sys_pipe2
294 common inotify_init1 sys_inotify_init1
295 64 preadv sys_preadv
296 64 pwritev sys_pwritev
297 64 rt_tgsigqueueinfo sys_rt_tgsigqueueinfo
298 common perf_event_open sys_perf_event_open
299 64 recvmmsg sys_recvmmsg
300 common fanotify_init sys_fanotify_init
301 common fanotify_mark sys_fanotify_mark
302 common prlimit64 sys_prlimit64
303 common name_to_handle_at sys_name_to_handle_at
304 common open_by_handle_at sys_open_by_handle_at
305 common clock_adjtime sys_clock_adjtime
306 common syncfs sys_syncfs
307 64 sendmmsg sys_sendmmsg
308 common setns sys_setns
309 common getcpu sys_getcpu
310 64 process_vm_readv sys_process_vm_readv
311 64 process_vm_writev sys_process_vm_writev
312 common kcmp sys_kcmp
313 common finit_module sys_finit_module
314 common sched_setattr sys_sched_setattr
315 common sched_getattr sys_sched_getattr
316 common renameat2 sys_renameat2
317 common seccomp sys_seccomp
318 common getrandom sys_getrandom
319 common memfd_create sys_memfd_create
320 common kexec_file_load sys_kexec_file_load
321 common bpf sys_bpf
322 64 execveat sys_execveat
323 common userfaultfd sys_userfaultfd
324 common membarrier sys_membarrier
325 common mlock2 sys_mlock2
326 common copy_file_range sys_copy_file_range
327 64 preadv2 sys_preadv2
328 64 pwritev2 sys_pwritev2
329 common pkey_mprotect sys_pkey_mprotect
330 common pkey_alloc sys_pkey_alloc
331 common pkey_free sys_pkey_free
332 common statx sys_statx
333 common io_pgetevents sys_io_pgetevents
334 common rseq sys_rseq
# don't use numbers 387 through 423, add new calls after the last
# 'common' entry
424 common pidfd_send_signal sys_pidfd_send_signal
425 common io_uring_setup sys_io_uring_setup
426 common io_uring_enter sys_io_uring_enter
427 common io_uring_register sys_io_uring_register
428 common open_tree sys_open_tree
429 common move_mount sys_move_mount
430 common fsopen sys_fsopen
431 common fsconfig sys_fsconfig
432 common fsmount sys_fsmount
433 common fspick sys_fspick
434 common pidfd_open sys_pidfd_open
435 common clone3 sys_clone3
436 common close_range sys_close_range
437 common openat2 sys_openat2
438 common pidfd_getfd sys_pidfd_getfd
439 common faccessat2 sys_faccessat2
440 common process_madvise sys_process_madvise
441 common epoll_pwait2 sys_epoll_pwait2
442 common mount_setattr sys_mount_setattr
443 common quotactl_fd sys_quotactl_fd
444 common landlock_create_ruleset sys_landlock_create_ruleset
445 common landlock_add_rule sys_landlock_add_rule
446 common landlock_restrict_self sys_landlock_restrict_self
447 common memfd_secret sys_memfd_secret
448 common process_mrelease sys_process_mrelease
449 common futex_waitv sys_futex_waitv
450 common set_mempolicy_home_node sys_set_mempolicy_home_node
451 common cachestat sys_cachestat
452 common fchmodat2 sys_fchmodat2
453 common map_shadow_stack sys_map_shadow_stack
454 common futex_wake sys_futex_wake
455 common futex_wait sys_futex_wait
456 common futex_requeue sys_futex_requeue
457 common statmount sys_statmount
458 common listmount sys_listmount
459 common lsm_get_self_attr sys_lsm_get_self_attr
460 common lsm_set_self_attr sys_lsm_set_self_attr
461 common lsm_list_modules sys_lsm_list_modules
462 common mseal sys_mseal

#
# Due to a historical design error, certain syscalls are numbered differently
# in x32 as compared to native x86_64. These syscalls have numbers 512-547.
# Do not add new syscalls to this range. Numbers 548 and above are available
# for non-x32 use.
#
512 x32 rt_sigaction compat_sys_rt_sigaction
513 x32 rt_sigreturn compat_sys_x32_rt_sigreturn
514 x32 ioctl compat_sys_ioctl
515 x32 readv sys_readv
516 x32 writev sys_writev
517 x32 recvfrom compat_sys_recvfrom
518 x32 sendmsg compat_sys_sendmsg
519 x32 recvmsg compat_sys_recvmsg
520 x32 execve compat_sys_execve
521 x32 ptrace compat_sys_ptrace
522 x32 rt_sigpending compat_sys_rt_sigpending
523 x32 rt_sigtimedwait compat_sys_rt_sigtimedwait_time64
524 x32 rt_sigqueueinfo compat_sys_rt_sigqueueinfo
525 x32 sigaltstack compat_sys_sigaltstack
526 x32 timer_create compat_sys_timer_create
527 x32 mq_notify compat_sys_mq_notify
528 x32 kexec_load compat_sys_kexec_load
529 x32 waitid compat_sys_waitid
530 x32 set_robust_list compat_sys_set_robust_list
531 x32 get_robust_list compat_sys_get_robust_list
532 x32 vmsplice sys_vmsplice
533 x32 move_pages sys_move_pages
534 x32 preadv compat_sys_preadv64
535 x32 pwritev compat_sys_pwritev64
536 x32 rt_tgsigqueueinfo compat_sys_rt_tgsigqueueinfo
537 x32 recvmmsg compat_sys_recvmmsg_time64
538 x32 sendmmsg compat_sys_sendmmsg
539 x32 process_vm_readv sys_process_vm_readv
540 x32 process_vm_writev sys_process_vm_writev
541 x32 setsockopt sys_setsockopt
542 x32 getsockopt sys_getsockopt
543 x32 io_setup compat_sys_io_setup
544 x32 io_submit compat_sys_io_submit
545 x32 execveat compat_sys_execveat
546 x32 preadv2 compat_sys_preadv64v2
547 x32 pwritev2 compat_sys_pwritev64v2
# This is the end of the legacy x32 range. Numbers 548 and above are
# not special and are not to be used for x32-specific syscalls.
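
The format described in the header comment of the table above (number, abi, name, and an optional entry point, with `#` starting a comment) is simple enough to parse mechanically. The following is a minimal Python sketch of such a parser, not the kernel's own tooling. SyscallEntry, parse_syscall_table() and SAMPLE are hypothetical names introduced for illustration, and the sketch only assumes the four-column layout shown in this file; tables for other architectures may carry additional columns.

```python
from typing import List, NamedTuple, Optional


class SyscallEntry(NamedTuple):
    """One table row (hypothetical type, for illustration only)."""
    number: int
    abi: str
    name: str
    entry_point: Optional[str]  # None when the row lists no entry point


def parse_syscall_table(text: str) -> List[SyscallEntry]:
    """Parse '<number> <abi> <name> [<entry point>]' rows.

    Blank lines and lines starting with '#' are skipped.
    """
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 3:
            raise ValueError(f"malformed row: {raw!r}")
        number, abi, name = fields[:3]
        entry_point = fields[3] if len(fields) > 3 else None
        entries.append(SyscallEntry(int(number), abi, name, entry_point))
    return entries


# A few rows copied from the table above.
SAMPLE = """\
# <number> <abi> <name> <entry point>
134 64 uselib
461 common lsm_list_modules sys_lsm_list_modules
462 common mseal sys_mseal
512 x32 rt_sigaction compat_sys_rt_sigaction
"""

entries = parse_syscall_table(SAMPLE)
by_name = {(e.abi, e.name): e for e in entries}
print(by_name[("common", "mseal")])
```

Keying the lookup on (abi, name) rather than name alone matters for this file, because names such as rt_sigaction appear once under "64" and again under "x32" with different numbers.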