1 | Adding a New System Call |
2 | ======================== |
3 | |
4 | This document describes what's involved in adding a new system call to the |
5 | Linux kernel, over and above the normal submission advice in |
6 | Documentation/SubmittingPatches. |
7 | |
8 | |
9 | System Call Alternatives |
10 | ------------------------ |
11 | |
12 | The first thing to consider when adding a new system call is whether one of |
13 | the alternatives might be suitable instead. Although system calls are the |
14 | most traditional and most obvious interaction points between userspace and the |
15 | kernel, there are other possibilities -- choose what fits best for your |
16 | interface. |
17 | |
18 | - If the operations involved can be made to look like a filesystem-like |
19 | object, it may make more sense to create a new filesystem or device. This |
20 | also makes it easier to encapsulate the new functionality in a kernel module |
21 | rather than requiring it to be built into the main kernel. |
22 | - If the new functionality involves operations where the kernel notifies |
23 | userspace that something has happened, then returning a new file |
24 | descriptor for the relevant object allows userspace to use |
25 | poll/select/epoll to receive that notification. |
26 | - However, operations that don't map to read(2)/write(2)-like operations |
27 | have to be implemented as ioctl(2) requests, which can lead to a |
28 | somewhat opaque API. |
29 | - If you're just exposing runtime system information, a new node in sysfs |
30 | (see Documentation/filesystems/sysfs.txt) or the /proc filesystem may be |
31 | more appropriate. However, access to these mechanisms requires that the |
32 | relevant filesystem is mounted, which might not always be the case (e.g. |
33 | in a namespaced/sandboxed/chrooted environment). Avoid adding any API to |
34 | debugfs, as this is not considered a 'production' interface to userspace. |
35 | - If the operation is specific to a particular file or file descriptor, then |
36 | an additional fcntl(2) command option may be more appropriate. However, |
37 | fcntl(2) is a multiplexing system call that hides a lot of complexity, so |
38 | this option is best for when the new function is closely analogous to |
39 | existing fcntl(2) functionality, or the new functionality is very simple |
40 | (for example, getting/setting a simple flag related to a file descriptor). |
41 | - If the operation is specific to a particular task or process, then an |
42 | additional prctl(2) command option may be more appropriate. As with |
43 | fcntl(2), this system call is a complicated multiplexor so is best reserved |
44 | for near-analogs of existing prctl() commands or getting/setting a simple |
45 | flag related to a process. |
46 | |
47 | |
48 | Designing the API: Planning for Extension |
49 | ----------------------------------------- |
50 | |
51 | A new system call forms part of the API of the kernel, and has to be supported |
52 | indefinitely. As such, it's a very good idea to explicitly discuss the |
53 | interface on the kernel mailing list, and it's important to plan for future |
54 | extensions of the interface. |
55 | |
56 | (The syscall table is littered with historical examples where this wasn't done, |
57 | together with the corresponding follow-up system calls -- eventfd/eventfd2, |
58 | dup2/dup3, inotify_init/inotify_init1, pipe/pipe2, renameat/renameat2 -- so |
59 | learn from the history of the kernel and plan for extensions from the start.) |
60 | |
61 | For simpler system calls that only take a couple of arguments, the preferred |
62 | way to allow for future extensibility is to include a flags argument to the |
63 | system call. To make sure that userspace programs can safely use flags |
64 | between kernel versions, check whether the flags value holds any unknown |
65 | flags, and reject the system call (with EINVAL) if it does: |
66 | |
67 | if (flags & ~(THING_FLAG1 | THING_FLAG2 | THING_FLAG3)) |
68 |     return -EINVAL; |
69 | |
70 | (If no flags values are used yet, check that the flags argument is zero.) |
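
The check above can be exercised outside the kernel. The following is a minimal userspace sketch; the flag values and the thing_check_flags() name are assumptions made for illustration, not part of any existing interface.

```c
#include <errno.h>

/* Hypothetical flag values; the names follow the THING_FLAG* placeholders. */
#define THING_FLAG1 0x1u
#define THING_FLAG2 0x2u
#define THING_FLAG3 0x4u
#define THING_VALID_FLAGS (THING_FLAG1 | THING_FLAG2 | THING_FLAG3)

/* Returns 0 if every set bit is a known flag, -EINVAL otherwise. */
int thing_check_flags(unsigned int flags)
{
	if (flags & ~THING_VALID_FLAGS)
		return -EINVAL;
	return 0;
}
```

Because unknown bits are rejected today, a later kernel can assign a meaning to one of them without changing the behaviour of any program that currently works.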
71 | |
72 | For more sophisticated system calls that involve a larger number of arguments, |
73 | it's preferred to encapsulate the majority of the arguments into a structure |
74 | that is passed in by pointer. Such a structure can cope with future extension |
75 | by including a size argument in the structure: |
76 | |
77 | struct xyzzy_params { |
78 |     u32 size; /* userspace sets p->size = sizeof(struct xyzzy_params) */ |
79 |     u32 param_1; |
80 |     u64 param_2; |
81 |     u64 param_3; |
82 |     u64 param_3; |
82 | }; |
83 | |
84 | As long as any subsequently added field, say param_4, is designed so that a |
85 | zero value gives the previous behaviour, then this allows both directions of |
86 | version mismatch: |
87 | |
88 | - To cope with a later userspace program calling an older kernel, the kernel |
89 | code should check that any memory beyond the size of the structure that it |
90 | expects is zero (effectively checking that param_4 == 0). |
91 | - To cope with an older userspace program calling a newer kernel, the kernel |
92 | code can zero-extend a smaller instance of the structure (effectively |
93 | setting param_4 = 0). |
94 | |
95 | See perf_event_open(2) and the perf_copy_attr() function (in |
96 | kernel/events/core.c) for an example of this approach. |
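
As a rough illustration of both directions of mismatch, here is a userspace sketch of the size handling. It is not the perf_copy_attr() code: the xyzzy_copy_params() name, the param_4 field and the XYZZY_PARAMS_SIZE_VER0 constant are assumptions, and a plain pointer with memcpy() stands in for a __user pointer with get_user()/copy_from_user(). The choice of -E2BIG for non-zero trailing bytes is also an assumption of this sketch.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* "Version 1" of the hypothetical structure: param_4 has been appended. */
struct xyzzy_params {
	uint32_t size;
	uint32_t param_1;
	uint64_t param_2;
	uint64_t param_3;
	uint64_t param_4;	/* zero must mean "previous behaviour" */
};

/* Size of the original layout: size, param_1, param_2, param_3. */
#define XYZZY_PARAMS_SIZE_VER0 24u

/*
 * Copy a caller-supplied structure into *dst. 'src' must point to at
 * least as many bytes as its leading size field claims.
 */
int xyzzy_copy_params(struct xyzzy_params *dst, const void *src)
{
	const unsigned char *bytes = src;
	const size_t ksize = sizeof(*dst);
	uint32_t usize;
	size_t i;

	memcpy(&usize, bytes, sizeof(usize));	/* kernel code: get_user() */
	if (usize < XYZZY_PARAMS_SIZE_VER0)
		return -EINVAL;			/* smaller than any known version */

	/* Newer caller: everything beyond what we understand must be zero. */
	for (i = ksize; i < usize; i++)
		if (bytes[i] != 0)
			return -E2BIG;

	/* Older caller: zero-extend so that missing fields read as zero. */
	memset(dst, 0, ksize);
	memcpy(dst, bytes, usize < ksize ? usize : ksize);
	dst->size = (uint32_t)ksize;
	return 0;
}
```

A caller built against the 24-byte layout ends up with param_4 == 0, while a caller passing a longer structure is accepted only if the extra bytes are all zero.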
97 | |
98 | |
99 | Designing the API: Other Considerations |
100 | --------------------------------------- |
101 | |
102 | If your new system call allows userspace to refer to a kernel object, it |
103 | should use a file descriptor as the handle for that object -- don't invent a |
104 | new type of userspace object handle when the kernel already has mechanisms and |
105 | well-defined semantics for using file descriptors. |
106 | |
107 | If your new xyzzy(2) system call does return a new file descriptor, then the |
108 | flags argument should include a value that is equivalent to setting O_CLOEXEC |
109 | on the new FD. This makes it possible for userspace to close the timing |
110 | window between xyzzy() and calling fcntl(fd, F_SETFD, FD_CLOEXEC), where an |
111 | unexpected fork() and execve() in another thread could leak a descriptor to |
112 | the exec'ed program. (However, resist the temptation to re-use the actual value |
113 | of the O_CLOEXEC constant, as it is architecture-specific and is part of a |
114 | numbering space of O_* flags that is fairly full.) |
115 | |
116 | If your system call returns a new file descriptor, you should also consider |
117 | what it means to use the poll(2) family of system calls on that file |
118 | descriptor. Making a file descriptor ready for reading or writing is the |
119 | normal way for the kernel to indicate to userspace that an event has |
120 | occurred on the corresponding kernel object. |
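
The existing eventfd(2) interface shows what this looks like from userspace. The sketch below assumes a Linux system with eventfd support; the two helper names are made up for the example.

```c
#include <poll.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Returns 1 if fd is readable right now, 0 if not, -1 on error. */
int fd_is_readable(int fd)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	int n = poll(&pfd, 1, 0);	/* timeout 0: do not block */

	if (n < 0)
		return -1;
	return (n == 1 && (pfd.revents & POLLIN)) ? 1 : 0;
}

/*
 * Returns 1 if an eventfd is not readable before an event is posted and
 * is readable afterwards, 0 if not, -1 on error.
 */
int eventfd_signals_via_poll(void)
{
	uint64_t one = 1;
	int before, after;
	int fd = eventfd(0, EFD_CLOEXEC | EFD_NONBLOCK);

	if (fd < 0)
		return -1;
	before = fd_is_readable(fd);
	if (write(fd, &one, sizeof(one)) != (ssize_t)sizeof(one)) {
		close(fd);
		return -1;
	}
	after = fd_is_readable(fd);
	close(fd);
	if (before < 0 || after < 0)
		return -1;
	return before == 0 && after == 1;
}
```

A descriptor returned by a new system call should behave just as predictably under poll(2), select(2) and epoll(7).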
121 | |
122 | If your new xyzzy(2) system call involves a filename argument: |
123 | |
124 | int sys_xyzzy(const char __user *path, ..., unsigned int flags); |
125 | |
126 | you should also consider whether an xyzzyat(2) version is more appropriate: |
127 | |
128 | int sys_xyzzyat(int dfd, const char __user *path, ..., unsigned int flags); |
129 | |
130 | This allows more flexibility for how userspace specifies the file in question; |
131 | in particular it allows userspace to request the functionality for an |
132 | already-opened file descriptor using the AT_EMPTY_PATH flag, effectively giving |
133 | an fxyzzy(3) operation for free: |
134 | |
135 | - xyzzyat(AT_FDCWD, path, ..., 0) is equivalent to xyzzy(path,...) |
136 | - xyzzyat(fd, "", ..., AT_EMPTY_PATH) is equivalent to fxyzzy(fd, ...) |
137 | |
138 | (For more details on the rationale of the *at() calls, see the openat(2) man |
139 | page; for an example of AT_EMPTY_PATH, see the fstatat(2) man page.) |
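
The two equivalences can be seen with the existing fstatat(2) call. This is a userspace sketch that assumes Linux; same_file_by_path_and_fd() is a name invented for the example, and the fallback AT_EMPTY_PATH definition is only there in case the libc headers do not expose it.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* for AT_EMPTY_PATH */
#endif
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

#ifndef AT_EMPTY_PATH
#define AT_EMPTY_PATH 0x1000	/* Linux value, used only as a fallback */
#endif

/*
 * Stat 'path' once by name relative to the current directory and once
 * through an open descriptor. Returns 1 if both calls describe the same
 * file, 0 if they differ, -1 on error.
 */
int same_file_by_path_and_fd(const char *path)
{
	struct stat by_path, by_fd;
	int fd, ret;

	/* Like stat(path, ...): dfd is AT_FDCWD, no flags. */
	if (fstatat(AT_FDCWD, path, &by_path, 0) != 0)
		return -1;

	fd = open(path, O_RDONLY | O_CLOEXEC);
	if (fd < 0)
		return -1;

	/* Like fstat(fd, ...): empty path plus AT_EMPTY_PATH. */
	ret = fstatat(fd, "", &by_fd, AT_EMPTY_PATH);
	close(fd);
	if (ret != 0)
		return -1;

	return by_path.st_dev == by_fd.st_dev &&
	       by_path.st_ino == by_fd.st_ino;
}
```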
140 | |
141 | If your new xyzzy(2) system call involves a parameter describing an offset |
142 | within a file, make its type loff_t so that 64-bit offsets can be supported |
143 | even on 32-bit architectures. |
144 | |
145 | If your new xyzzy(2) system call involves privileged functionality, it needs |
146 | to be governed by the appropriate Linux capability bit (checked with a call to |
147 | capable()), as described in the capabilities(7) man page. Choose an existing |
148 | capability bit that governs related functionality, but try to avoid combining |
149 | lots of only vaguely related functions together under the same bit, as this |
150 | goes against capabilities' purpose of splitting the power of root. In |
151 | particular, avoid adding new uses of the already overly-general CAP_SYS_ADMIN |
152 | capability. |
153 | |
154 | If your new xyzzy(2) system call manipulates a process other than the calling |
155 | process, it should be restricted (using a call to ptrace_may_access()) so that |
156 | only a calling process with the same permissions as the target process, or |
157 | with the necessary capabilities, can manipulate the target process. |
158 | |
159 | Finally, be aware that some non-x86 architectures have an easier time if |
160 | system call parameters that are explicitly 64-bit fall on odd-numbered |
161 | arguments (i.e. parameters 1, 3, 5), to allow use of contiguous pairs of 32-bit |
162 | registers. (This concern does not apply if the arguments are part of a |
163 | structure that's passed in by pointer.) |
164 | |
165 | |
166 | Proposing the API |
167 | ----------------- |
168 | |
169 | To make new system calls easy to review, it's best to divide up the patchset |
170 | into separate chunks. These should include at least the following items as |
171 | distinct commits (each of which is described further below): |
172 | |
173 | - The core implementation of the system call, together with prototypes, |
174 | generic numbering, Kconfig changes and fallback stub implementation. |
175 | - Wiring up of the new system call for one particular architecture, usually |
176 | x86 (including all of x86_64, x86_32 and x32). |
177 | - A demonstration of the use of the new system call in userspace via a |
178 | selftest in tools/testing/selftests/. |
179 | - A draft man-page for the new system call, either as plain text in the |
180 | cover letter, or as a patch to the (separate) man-pages repository. |
181 | |
182 | New system call proposals, like any change to the kernel's API, should always |
183 | be cc'ed to linux-api@vger.kernel.org. |
184 | |
185 | |
186 | Generic System Call Implementation |
187 | ---------------------------------- |
188 | |
189 | The main entry point for your new xyzzy(2) system call will be called |
190 | sys_xyzzy(), but you add this entry point with the appropriate |
191 | SYSCALL_DEFINEn() macro rather than explicitly. The 'n' indicates the number |
192 | of arguments to the system call, and the macro takes the system call name |
193 | followed by the (type, name) pairs for the parameters as arguments. Using |
194 | this macro allows metadata about the new system call to be made available for |
195 | other tools. |
196 | |
197 | The new entry point also needs a corresponding function prototype, in |
198 | include/linux/syscalls.h, marked as asmlinkage to match the way that system |
199 | calls are invoked: |
200 | |
201 | asmlinkage long sys_xyzzy(...); |
202 | |
203 | Some architectures (e.g. x86) have their own architecture-specific syscall |
204 | tables, but several other architectures share a generic syscall table. Add your |
205 | new system call to the generic list by adding an entry to the list in |
206 | include/uapi/asm-generic/unistd.h: |
207 | |
208 | #define __NR_xyzzy 292 |
209 | __SYSCALL(__NR_xyzzy, sys_xyzzy) |
210 | |
211 | Also update the __NR_syscalls count to reflect the additional system call, and |
212 | note that if multiple new system calls are added in the same merge window, |
213 | your new syscall number may get adjusted to resolve conflicts. |
214 | |
215 | The file kernel/sys_ni.c provides a fallback stub implementation of each system |
216 | call, returning -ENOSYS. Add your new system call here too: |
217 | |
218 | cond_syscall(sys_xyzzy); |
219 | |
220 | Your new kernel functionality, and the system call that controls it, should |
221 | normally be optional, so add a CONFIG option (typically to init/Kconfig) for |
222 | it. As usual for new CONFIG options: |
223 | |
224 | - Include a description of the new functionality and system call controlled |
225 | by the option. |
226 | - Make the option depend on EXPERT if it should be hidden from normal users. |
227 | - Make any new source files implementing the function dependent on the CONFIG |
228 | option in the Makefile (e.g. "obj-$(CONFIG_XYZZY_SYSCALL) += xyzzy.o"). |
229 | - Double check that the kernel still builds with the new CONFIG option turned |
230 | off. |
231 | |
232 | To summarize, you need a commit that includes: |
233 | |
234 | - CONFIG option for the new function, normally in init/Kconfig |
235 | - SYSCALL_DEFINEn(xyzzy, ...) for the entry point |
236 | - corresponding prototype in include/linux/syscalls.h |
237 | - generic table entry in include/uapi/asm-generic/unistd.h |
238 | - fallback stub in kernel/sys_ni.c |
239 | |
240 | |
241 | x86 System Call Implementation |
242 | ------------------------------ |
243 | |
244 | To wire up your new system call for x86 platforms, you need to update the |
245 | master syscall tables. Assuming your new system call isn't special in some |
246 | way (see below), this involves a "common" entry (for x86_64 and x32) in |
247 | arch/x86/entry/syscalls/syscall_64.tbl: |
248 | |
249 | 333 common xyzzy sys_xyzzy |
250 | |
251 | and an "i386" entry in arch/x86/entry/syscalls/syscall_32.tbl: |
252 | |
253 | 380 i386 xyzzy sys_xyzzy |
254 | |
255 | Again, these numbers are liable to be changed if there are conflicts in the |
256 | relevant merge window. |
257 | |
258 | |
259 | Compatibility System Calls (Generic) |
260 | ------------------------------------ |
261 | |
262 | For most system calls the same 64-bit implementation can be invoked even when |
263 | the userspace program is itself 32-bit; even if the system call's parameters |
264 | include an explicit pointer, this is handled transparently. |
265 | |
266 | However, there are a couple of situations where a compatibility layer is |
267 | needed to cope with size differences between 32-bit and 64-bit. |
268 | |
269 | The first is if the 64-bit kernel also supports 32-bit userspace programs, and |
270 | so needs to parse areas of (__user) memory that could hold either 32-bit or |
271 | 64-bit values. In particular, this is needed whenever a system call argument |
272 | is: |
273 | |
274 | - a pointer to a pointer |
275 | - a pointer to a struct containing a pointer (e.g. struct iovec __user *) |
276 | - a pointer to a varying sized integral type (time_t, off_t, long, ...) |
277 | - a pointer to a struct containing a varying sized integral type. |
278 | |
279 | The second situation that requires a compatibility layer is if one of the |
280 | system call's arguments has a type that is explicitly 64-bit even on a 32-bit |
281 | architecture, for example loff_t or __u64. In this case, a value that arrives |
282 | at a 64-bit kernel from a 32-bit application will be split into two 32-bit |
283 | values, which then need to be re-assembled in the compatibility layer. |
284 | |
285 | (Note that a system call argument that's a pointer to an explicit 64-bit type |
286 | does *not* need a compatibility layer; for example, splice(2)'s arguments of |
287 | type loff_t __user * do not trigger the need for a compat_ system call.) |
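
The re-assembly step itself is small. The helpers below are a userspace sketch with hypothetical names; which register carries the low half and which the high half depends on the architecture's calling convention, so the argument order here is an assumption.

```c
#include <stdint.h>

/* Split a 64-bit value into the two 32-bit halves a 32-bit caller passes. */
void compat_split_u64(uint64_t val, uint32_t *lo, uint32_t *hi)
{
	*lo = (uint32_t)val;
	*hi = (uint32_t)(val >> 32);
}

/* Re-assemble the 64-bit value inside the compatibility layer. */
uint64_t compat_join_u64(uint32_t lo, uint32_t hi)
{
	return ((uint64_t)hi << 32) | lo;
}
```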
288 | |
289 | The compatibility version of the system call is called compat_sys_xyzzy(), and |
290 | is added with the COMPAT_SYSCALL_DEFINEn() macro, analogously to |
291 | SYSCALL_DEFINEn. This version of the implementation runs as part of a 64-bit |
292 | kernel, but expects to receive 32-bit parameter values and does whatever is |
293 | needed to deal with them. (Typically, the compat_sys_ version converts the |
294 | values to 64-bit versions and either calls on to the sys_ version, or both of |
295 | them call a common inner implementation function.) |
296 | |
297 | The compat entry point also needs a corresponding function prototype, in |
298 | include/linux/compat.h, marked as asmlinkage to match the way that system |
299 | calls are invoked: |
300 | |
301 | asmlinkage long compat_sys_xyzzy(...); |
302 | |
303 | If the system call involves a structure that is laid out differently on 32-bit |
304 | and 64-bit systems, say struct xyzzy_args, then the include/linux/compat.h |
305 | header file should also include a compat version of the structure (struct |
306 | compat_xyzzy_args) where each variable-size field has the appropriate compat_ |
307 | type that corresponds to the type in struct xyzzy_args. The |
308 | compat_sys_xyzzy() routine can then use this compat_ structure to parse the |
309 | arguments from a 32-bit invocation. |
310 | |
311 | For example, if there are fields: |
312 | |
313 | struct xyzzy_args { |
314 |     const char __user *ptr; |
315 |     __kernel_long_t varying_val; |
316 |     u64 fixed_val; |
317 |     /* ... */ |
318 | }; |
319 | |
320 | in struct xyzzy_args, then struct compat_xyzzy_args would have: |
321 | |
322 | struct compat_xyzzy_args { |
323 |     compat_uptr_t ptr; |
324 |     compat_long_t varying_val; |
325 |     u64 fixed_val; |
326 |     /* ... */ |
327 | }; |
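
The conversion between the two layouts might look like the following userspace model. The typedefs are local stand-ins for the kernel's compat types (assumed to be 32 bits wide), the __user annotation is dropped, and xyzzy_args_from_compat() is a hypothetical helper name.

```c
#include <stdint.h>

/* Userspace stand-ins for the kernel's compat types. */
typedef uint32_t compat_uptr_t;
typedef int32_t compat_long_t;

struct xyzzy_args {
	const char *ptr;
	long varying_val;
	uint64_t fixed_val;
};

struct compat_xyzzy_args {
	compat_uptr_t ptr;
	compat_long_t varying_val;
	uint64_t fixed_val;
};

/* Widen each variable-size field; fixed-size fields are copied as-is. */
void xyzzy_args_from_compat(struct xyzzy_args *dst,
			    const struct compat_xyzzy_args *src)
{
	/* Pointers are zero-extended (kernel code would use compat_ptr()). */
	dst->ptr = (const char *)(uintptr_t)src->ptr;
	/* Signed integers are sign-extended by the implicit conversion. */
	dst->varying_val = src->varying_val;
	dst->fixed_val = src->fixed_val;
}
```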
328 | |
329 | The generic system call list also needs adjusting to allow for the compat |
330 | version; the entry in include/uapi/asm-generic/unistd.h should use |
331 | __SC_COMP rather than __SYSCALL: |
332 | |
333 | #define __NR_xyzzy 292 |
334 | __SC_COMP(__NR_xyzzy, sys_xyzzy, compat_sys_xyzzy) |
335 | |
336 | To summarize, you need: |
337 | |
338 | - a COMPAT_SYSCALL_DEFINEn(xyzzy, ...) for the compat entry point |
339 | - corresponding prototype in include/linux/compat.h |
340 | - (if needed) 32-bit mapping struct in include/linux/compat.h |
341 | - instance of __SC_COMP not __SYSCALL in include/uapi/asm-generic/unistd.h |
342 | |
343 | |
344 | Compatibility System Calls (x86) |
345 | -------------------------------- |
346 | |
347 | To wire up a system call with a compatibility version for the x86 |
348 | architecture, the entries in the syscall tables need to be adjusted. |
349 | |
350 | First, the entry in arch/x86/entry/syscalls/syscall_32.tbl gets an extra |
351 | column to indicate that a 32-bit userspace program running on a 64-bit kernel |
352 | should hit the compat entry point: |
353 | |
354 | 380 i386 xyzzy sys_xyzzy compat_sys_xyzzy |
355 | |
356 | Second, you need to figure out what should happen for the x32 ABI version of |
357 | the new system call. There's a choice here: the layout of the arguments |
358 | should either match the 64-bit version or the 32-bit version. |
359 | |
360 | If there's a pointer-to-a-pointer involved, the decision is easy: x32 is |
361 | ILP32, so the layout should match the 32-bit version, and the entry in |
362 | arch/x86/entry/syscalls/syscall_64.tbl is split so that x32 programs hit the |
363 | compatibility wrapper: |
364 | |
365 | 333 64 xyzzy sys_xyzzy |
366 | ... |
367 | 555 x32 xyzzy compat_sys_xyzzy |
368 | |
369 | If no pointers are involved, then it is preferable to re-use the 64-bit system |
370 | call for the x32 ABI (and consequently the entry in |
371 | arch/x86/entry/syscalls/syscall_64.tbl is unchanged). |
372 | |
373 | In either case, you should check that the types involved in your argument |
374 | layout do indeed map exactly from x32 (-mx32) to either the 32-bit (-m32) or |
375 | 64-bit (-m64) equivalents. |
376 | |
377 | |
378 | System Calls Returning Elsewhere |
379 | -------------------------------- |
380 | |
381 | For most system calls, once the system call is complete the user program |
382 | continues exactly where it left off -- at the next instruction, with the |
383 | stack the same and most of the registers the same as before the system call, |
384 | and with the same virtual memory space. |
385 | |
386 | However, a few system calls do things differently. They might return to a |
387 | different location (rt_sigreturn) or change the memory space (fork/vfork/clone) |
388 | or even architecture (execve/execveat) of the program. |
389 | |
390 | To allow for this, the kernel implementation of the system call may need to |
391 | save and restore additional registers to the kernel stack, allowing complete |
392 | control of where and how execution continues after the system call. |
393 | |
394 | This is arch-specific, but typically involves defining assembly entry points |
395 | that save/restore additional registers and invoke the real system call entry |
396 | point. |
397 | |
398 | For x86_64, this is implemented as a stub_xyzzy entry point in |
399 | arch/x86/entry/entry_64.S, and the entry in the syscall table |
400 | (arch/x86/entry/syscalls/syscall_64.tbl) is adjusted to match: |
401 | |
402 | 333 common xyzzy stub_xyzzy |
403 | |
404 | The equivalent for 32-bit programs running on a 64-bit kernel is normally |
405 | called stub32_xyzzy and implemented in arch/x86/entry/entry_64_compat.S, |
406 | with the corresponding syscall table adjustment in |
407 | arch/x86/entry/syscalls/syscall_32.tbl: |
408 | |
409 | 380 i386 xyzzy sys_xyzzy stub32_xyzzy |
410 | |
411 | If the system call needs a compatibility layer (as in the previous section) |
412 | then the stub32_ version needs to call on to the compat_sys_ version of the |
413 | system call rather than the native 64-bit version. Also, if the x32 ABI |
414 | implementation is not common with the x86_64 version, then its syscall |
415 | table will also need to invoke a stub that calls on to the compat_sys_ |
416 | version. |
417 | |
418 | For completeness, it's also nice to set up a mapping so that user-mode Linux |
419 | still works -- its syscall table will reference stub_xyzzy, but the UML build |
420 | doesn't include the arch/x86/entry/entry_64.S implementation (because UML |
421 | simulates registers etc). Fixing this is as simple as adding a #define to |
422 | arch/x86/um/sys_call_table_64.c: |
423 | |
424 | #define stub_xyzzy sys_xyzzy |
425 | |
426 | |
427 | Other Details |
428 | ------------- |
429 | |
430 | Most of the kernel treats system calls in a generic way, but there is the |
431 | occasional exception that may need updating for your particular system call. |
432 | |
433 | The audit subsystem is one such special case; it includes (arch-specific) |
434 | functions that classify some special types of system call -- specifically |
435 | file open (open/openat), program execution (execve/execveat) or socket |
436 | multiplexor (socketcall) operations. If your new system call is analogous to |
437 | one of these, then the audit system should be updated. |
438 | |
439 | More generally, if there is an existing system call that is analogous to your |
440 | new system call, it's worth doing a kernel-wide grep for the existing system |
441 | call to check there are no other special cases. |
442 | |
443 | |
444 | Testing |
445 | ------- |
446 | |
447 | A new system call should obviously be tested; it is also useful to provide |
448 | reviewers with a demonstration of how user space programs will use the system |
449 | call. A good way to combine these aims is to include a simple self-test |
450 | program in a new directory under tools/testing/selftests/. |
451 | |
452 | For a new system call, there will obviously be no libc wrapper function and so |
453 | the test will need to invoke it using syscall(); also, if the system call |
454 | involves a new userspace-visible structure, the corresponding header will need |
455 | to be installed to compile the test. |
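
A selftest wrapper built on syscall(2) might be sketched as below. Since __NR_xyzzy does not exist, the example substitutes the number of getpid(2) as a stand-in so that it can actually run; XYZZY_DEMO_NR and xyzzy_demo() are hypothetical names.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Stand-in number: a real selftest would use the __NR_xyzzy value from
 * the installed kernel headers instead.
 */
#define XYZZY_DEMO_NR SYS_getpid

/* Invoke the system call directly, bypassing any libc wrapper. */
long xyzzy_demo(void)
{
	return syscall(XYZZY_DEMO_NR);
}
```

On failure syscall() returns -1 and sets errno, so a test for a genuinely new system call can detect a kernel without it by checking for ENOSYS.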
456 | |
457 | Make sure the selftest runs successfully on all supported architectures. For |
458 | example, check that it works when compiled as an x86_64 (-m64), x86_32 (-m32) |
459 | and x32 (-mx32) ABI program. |
460 | |
461 | For more extensive and thorough testing of new functionality, you should also |
462 | consider adding tests to the Linux Test Project, or to the xfstests project |
463 | for filesystem-related changes. |
464 | - https://linux-test-project.github.io/ |
465 | - git://git.kernel.org/pub/scm/fs/xfs/xfstests-dev.git |
466 | |
467 | |
468 | Man Page |
469 | -------- |
470 | |
471 | All new system calls should come with a complete man page, ideally using groff |
472 | markup, but plain text will do. If groff is used, it's helpful to include a |
473 | pre-rendered ASCII version of the man page in the cover email for the |
474 | patchset, for the convenience of reviewers. |
475 | |
476 | The man page should be cc'ed to linux-man@vger.kernel.org. |
477 | For more details, see https://www.kernel.org/doc/man-pages/patches.html |
478 | |
479 | References and Sources |
480 | ---------------------- |
481 | |
482 | - LWN article from Michael Kerrisk on use of flags argument in system calls: |
483 | https://lwn.net/Articles/585415/ |
484 | - LWN article from Michael Kerrisk on how to handle unknown flags in a system |
485 | call: https://lwn.net/Articles/588444/ |
486 | - LWN article from Jake Edge describing constraints on 64-bit system call |
487 | arguments: https://lwn.net/Articles/311630/ |
488 | - Pair of LWN articles from David Drysdale that describe the system call |
489 | implementation paths in detail for v3.14: |
490 | - https://lwn.net/Articles/604287/ |
491 | - https://lwn.net/Articles/604515/ |
492 | - Architecture-specific requirements for system calls are discussed in the |
493 | syscall(2) man-page: |
494 | http://man7.org/linux/man-pages/man2/syscall.2.html#NOTES |
495 | - Collated emails from Linus Torvalds discussing the problems with ioctl(): |
496 | http://yarchive.net/comp/linux/ioctl.html |
497 | - "How to not invent kernel interfaces", Arnd Bergmann, |
498 | http://www.ukuug.org/events/linux2007/2007/papers/Bergmann.pdf |
499 | - LWN article from Michael Kerrisk on avoiding new uses of CAP_SYS_ADMIN: |
500 | https://lwn.net/Articles/486306/ |
501 | - Recommendation from Andrew Morton that all related information for a new |
502 | system call should come in the same email thread: |
503 | https://lkml.org/lkml/2014/7/24/641 |
504 | - Recommendation from Michael Kerrisk that a new system call should come with |
505 | a man page: https://lkml.org/lkml/2014/6/13/309 |
506 | - Suggestion from Thomas Gleixner that x86 wire-up should be in a separate |
507 | commit: https://lkml.org/lkml/2014/11/19/254 |
508 | - Suggestion from Greg Kroah-Hartman that it's good for new system calls to |
509 | come with a man-page & selftest: https://lkml.org/lkml/2014/3/19/710 |
510 | - Discussion from Michael Kerrisk of new system call vs. prctl(2) extension: |
511 | https://lkml.org/lkml/2014/6/3/411 |
512 | - Suggestion from Ingo Molnar that system calls that involve multiple |
513 | arguments should encapsulate those arguments in a struct, which includes a |
514 | size field for future extensibility: https://lkml.org/lkml/2015/7/30/117 |
515 | - Numbering oddities arising from (re-)use of O_* numbering space flags: |
516 | - commit 75069f2b5bfb ("vfs: renumber FMODE_NONOTIFY and add to uniqueness |
517 | check") |
518 | - commit 12ed2e36c98a ("fanotify: FMODE_NONOTIFY and __O_SYNC in sparc |
519 | conflict") |
520 | - commit bb458c644a59 ("Safer ABI for O_TMPFILE") |
521 | - Discussion from Matthew Wilcox about restrictions on 64-bit arguments: |
522 | https://lkml.org/lkml/2008/12/12/187 |
523 | - Recommendation from Greg Kroah-Hartman that unknown flags should be |
524 | policed: https://lkml.org/lkml/2014/7/17/577 |
525 | - Recommendation from Linus Torvalds that x32 system calls should prefer |
526 | compatibility with 64-bit versions rather than 32-bit versions: |
527 | https://lkml.org/lkml/2011/8/31/244 |
528 |