1 | <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN"> |
2 | <html><head> |
3 | <!-- saved from http://www.win.tue.nl/~aeb/linux/lk/lk-10.html --> |
4 | <meta name="GENERATOR" content="SGML-Tools 1.0.9"><title>The Linux kernel: Processes</title> |
5 | </head> |
6 | <body> |
7 | <hr> |
8 | <h2><a name="s10">10. Processes</a></h2> |
9 | |
10 | <p>Before looking at the Linux implementation, first a general Unix |
11 | description of threads, processes, process groups and sessions. |
12 | </p><p> |
13 | (See also <a href="http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap11.html">General Terminal Interface</a>) |
14 | </p><p>A session contains a number of process groups, and a process group |
15 | contains a number of processes, and a process contains a number |
16 | of threads. |
17 | </p><p>A session can have a controlling tty. |
18 | At most one process group in a session can be a foreground process group. |
19 | An interrupt character typed on a tty ("Teletype", i.e., terminal) |
20 | causes a signal to be sent to all members of the foreground process group |
21 | in the session (if any) that has that tty as controlling tty. |
22 | </p><p>All these objects have numbers, and we have thread IDs, process IDs, |
23 | process group IDs and session IDs. |
24 | </p><p> |
25 | </p><h2><a name="ss10.1">10.1 Processes</a> |
26 | </h2> |
27 | |
28 | <p> |
29 | </p><h3>Creation</h3> |
30 | |
31 | <p>A new process is traditionally started using the <code>fork()</code> |
32 | system call: |
33 | </p><blockquote> |
34 | <pre>pid_t p; |
35 | |
36 | p = fork(); |
37 | if (p == (pid_t) -1) |
38 |         /* ERROR */ |
39 | else if (p == 0) |
40 |         /* CHILD */ |
41 | else |
42 |         /* PARENT */ |
43 | </pre> |
44 | </blockquote> |
45 | <p>This creates a child as a duplicate of its parent. |
46 | Parent and child are identical in almost all respects. |
47 | In the code they are distinguished by the fact that the parent |
48 | learns the process ID of its child, while <code>fork()</code> |
49 | returns 0 in the child. (It can find the process ID of its |
50 | parent using the <code>getppid()</code> system call.) |
51 | </p><p> |
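The three branches above can be exercised with a small self-checking sketch.
The helper name <code>child_sees_parent()</code> is hypothetical, and reporting
the result through the child's exit code is just one convenient choice.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork once and let the child check that
 * getppid() names the process that called fork().
 * Returns 1 if it does, 0 if not, -1 if fork() or waitpid() failed. */
int child_sees_parent(void)
{
    pid_t parent = getpid();    /* remembered before the fork */
    pid_t p = fork();
    int status;

    if (p == (pid_t) -1)
        return -1;                              /* ERROR */
    if (p == 0)
        _exit(getppid() == parent ? 0 : 1);     /* CHILD: fork() returned 0 */

    /* PARENT: p is the process ID of the child */
    if (waitpid(p, &status, 0) != p)
        return -1;
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```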
52 | </p><h3>Termination</h3> |
53 | |
54 | <p>Normal termination is when the process does |
55 | </p><blockquote> |
56 | <pre>exit(n); |
57 | </pre> |
58 | </blockquote> |
59 | |
60 | or |
61 | <blockquote> |
62 | <pre>return n; |
63 | </pre> |
64 | </blockquote> |
65 | |
66 | from its <code>main()</code> procedure. It returns the single byte <code>n</code> |
67 | to its parent. |
68 | <p>Abnormal termination is usually caused by a signal. |
69 | </p><p> |
70 | </p><h3>Collecting the exit code. Zombies</h3> |
71 | |
72 | <p>The parent does |
73 | </p><blockquote> |
74 | <pre>pid_t p; |
75 | int status; |
76 | |
77 | p = wait(&status); |
78 | </pre> |
79 | </blockquote> |
80 | |
81 | and collects two bytes: |
82 | <p> |
83 | <figure> |
84 | <eps file="absent"> |
85 | <img src="ctty_files/exit_status.png"> |
86 | </eps> |
87 | </figure></p><p>A process that has terminated but has not yet been waited for |
88 | is a <i>zombie</i>. It need only store these two bytes: |
89 | exit code and reason for termination. |
90 | </p><p>On the other hand, if the parent dies first, <code>init</code> (process 1) |
91 | inherits the child and becomes its parent. |
92 | </p><p> |
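The two bytes are normally taken apart with the macros from
<code>&lt;sys/wait.h&gt;</code> rather than by hand. A minimal sketch, with
hypothetical helper names, that collects first an exit code and then a
terminating signal:

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that calls _exit(code) and
 * return the exit code the parent collects, or -1 on failure. */
int collected_exit_code(int code)
{
    int status;
    pid_t p = fork();

    if (p == (pid_t) -1)
        return -1;
    if (p == 0)
        _exit(code);
    if (waitpid(p, &status, 0) != p || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);     /* only the low 8 bits of code */
}

/* Hypothetical helper: fork a child that kills itself with SIGKILL and
 * return the terminating signal the parent collects, or -1 on failure. */
int collected_term_signal(void)
{
    int status;
    pid_t p = fork();

    if (p == (pid_t) -1)
        return -1;
    if (p == 0) {
        kill(getpid(), SIGKILL);
        _exit(127);                 /* not reached */
    }
    if (waitpid(p, &status, 0) != p || !WIFSIGNALED(status))
        return -1;
    return WTERMSIG(status);
}
```

Since only a single byte travels back, <code>collected_exit_code(259)</code>
yields 3, just like <code>collected_exit_code(3)</code>.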
93 | </p><h3>Signals</h3> |
94 | |
95 | <p> |
96 | </p><h3>Stopping</h3> |
97 | |
98 | <p>Some signals cause a process to stop: |
99 | <code>SIGSTOP</code> (stop!), |
100 | <code>SIGTSTP</code> (stop from tty: probably ^Z was typed), |
101 | <code>SIGTTIN</code> (tty input asked by background process), |
102 | <code>SIGTTOU</code> (tty output sent by background process, and this was |
103 | disallowed by <code>stty tostop</code>). |
104 | </p><p>Apart from ^Z there also is ^Y. The former stops the process |
105 | when it is typed, the latter stops it when it is read. |
106 | </p><p>Signals generated by typing the corresponding character on some tty |
107 | are sent to all processes that are in the foreground process group |
108 | of the session that has that tty as controlling tty. (Details below.) |
109 | </p><p>If a process is being traced, every signal will stop it. |
110 | </p><p> |
111 | </p><h3>Continuing</h3> |
112 | |
113 | <p><code>SIGCONT</code>: continue a stopped process. |
114 | </p><p> |
115 | </p><h3>Terminating</h3> |
116 | |
117 | <p><code>SIGKILL</code> (die! now!), |
118 | <code>SIGTERM</code> (please, go away), |
119 | <code>SIGHUP</code> (modem hangup), |
120 | <code>SIGINT</code> (^C), |
121 | <code>SIGQUIT</code> (^\), etc. |
122 | Many signals have as default action to kill the target. |
123 | (Sometimes with an additional core dump, when such is |
124 | allowed by rlimit.) |
125 | The signals <code>SIGCHLD</code> and <code>SIGWINCH</code> |
126 | are ignored by default. |
127 | All except <code>SIGKILL</code> and <code>SIGSTOP</code> can be |
128 | caught or ignored or blocked. |
129 | For details, see <code>signal(7)</code>. |
130 | </p><p> |
131 | </p><h2><a name="ss10.2">10.2 Process groups</a> |
132 | </h2> |
133 | |
134 | <p>Every process is a member of exactly one <i>process group</i>, |
135 | identified by its <i>process group ID</i>. |
136 | (When the process is created, it becomes a member of the process group |
137 | of its parent.) |
138 | By convention, the process group ID of a process group |
139 | equals the process ID of the first member of the process group, |
140 | called the <i>process group leader</i>. |
141 | A process finds the ID of its process group using the system call |
142 | <code>getpgrp()</code>, or, equivalently, <code>getpgid(0)</code>. |
143 | One finds the process group ID of process <code>p</code> using |
144 | <code>getpgid(p)</code>. |
145 | </p><p>One may use the command <code>ps j</code> to see PPID (parent process ID), |
146 | PID (process ID), PGID (process group ID) and SID (session ID) |
147 | of processes. With a shell that does not know about job control, |
148 | like <code>ash</code>, each of its children will be in the same session |
149 | and have the same process group as the shell. With a shell that knows |
150 | about job control, like <code>bash</code>, the processes of one pipeline, like |
151 | </p><blockquote> |
152 | <pre>% cat paper | ideal | pic | tbl | eqn | ditroff > out |
153 | </pre> |
154 | </blockquote> |
155 | |
156 | form a single process group. |
157 | <p> |
158 | </p><h3>Creation</h3> |
159 | |
160 | <p>A process <code>pid</code> is put into the process group <code>pgid</code> by |
161 | </p><blockquote> |
162 | <pre>setpgid(pid, pgid); |
163 | </pre> |
164 | </blockquote> |
165 | |
166 | If <code>pgid == pid</code> or <code>pgid == 0</code> then this creates |
167 | a new process group with process group leader <code>pid</code>. |
168 | Otherwise, this puts <code>pid</code> into the already existing |
169 | process group <code>pgid</code>. |
170 | A zero <code>pid</code> refers to the current process. |
171 | The call <code>setpgrp()</code> is equivalent to <code>setpgid(0,0)</code>. |
172 | <p> |
173 | </p><h3>Restrictions on setpgid()</h3> |
174 | |
175 | <p>The calling process must be <code>pid</code> itself, or its parent, |
176 | and the parent can only do this before <code>pid</code> has done |
177 | <code>exec()</code>, and only when both belong to the same session. |
178 | It is an error if process <code>pid</code> is a session leader |
179 | (and this call would change its <code>pgid</code>). |
180 | </p><p> |
181 | </p><h3>Typical sequence</h3> |
182 | |
183 | <p> |
184 | </p><blockquote> |
185 | <pre>p = fork(); |
186 | if (p == (pid_t) -1) { |
187 |         /* ERROR */ |
188 | } else if (p == 0) { /* CHILD */ |
189 |         setpgid(0, pgid); |
190 |         ... |
191 | } else { /* PARENT */ |
192 |         setpgid(p, pgid); |
193 |         ... |
194 | } |
194 | } |
195 | </pre> |
196 | </blockquote> |
197 | |
198 | This ensures that regardless of whether parent or child is scheduled |
199 | first, the process group setting is as expected by both. |
200 | <p> |
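A self-checking variant of this sequence might look as follows. It is a
sketch only: the helper name is hypothetical, the child is made leader of a
new group (so <code>pgid</code> is its own PID), and a pipe keeps the child
alive until the parent has looked at it.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: put a new child into a process group of its own,
 * calling setpgid() on both sides of the fork.
 * Returns 1 if parent and child both see the child as group leader,
 * 0 if not, -1 on failure. */
int child_leads_own_group(void)
{
    int fds[2], status, seen_by_parent;
    char c;
    pid_t p;

    if (pipe(fds) == -1)
        return -1;
    p = fork();
    if (p == (pid_t) -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (p == 0) {                       /* CHILD */
        close(fds[1]);
        setpgid(0, 0);                  /* pgid 0: use our own PID */
        if (read(fds[0], &c, 1) < 0)    /* wait until the parent is done */
            _exit(2);
        _exit(getpgrp() == getpid() ? 0 : 1);
    }
    /* PARENT */
    close(fds[0]);
    setpgid(p, p);                      /* same request, harmless if done */
    seen_by_parent = getpgid(p) == p;
    close(fds[1]);                      /* the child's read() now returns 0 */
    if (waitpid(p, &status, 0) != p)
        return -1;
    return seen_by_parent && WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```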
201 | </p><h3>Signalling and waiting</h3> |
202 | |
203 | <p>One can signal all members of a process group: |
204 | </p><blockquote> |
205 | <pre>killpg(pgrp, sig); |
206 | </pre> |
207 | </blockquote> |
208 | <p>One can wait for children in one's own process group: |
209 | </p><blockquote> |
210 | <pre>waitpid(0, &status, ...); |
211 | </pre> |
212 | </blockquote> |
213 | |
214 | or in a specified process group: |
215 | <blockquote> |
216 | <pre>waitpid(-pgrp, &status, ...); |
217 | </pre> |
218 | </blockquote> |
219 | <p> |
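Both calls can be combined in one sketch: start a few children in a fresh
process group, signal the group, and reap exactly its members. The helper
name is hypothetical, and <code>SIGKILL</code> is chosen only because it
cannot be ignored.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: start n children in one fresh process group, kill
 * the group with SIGKILL, and reap its members with waitpid(-pgrp, ...).
 * Returns the number of children that died of SIGKILL,
 * or -1 if no child could be started. */
int kill_whole_group(int n)
{
    pid_t pgrp = 0, p;
    int i, status, killed = 0;

    for (i = 0; i < n; i++) {
        p = fork();
        if (p == (pid_t) -1)
            break;
        if (p == 0) {                   /* CHILD */
            setpgid(0, pgrp);           /* pgrp is 0 for the first child */
            for (;;)
                pause();
        }
        if (pgrp == 0)
            pgrp = p;                   /* the first child leads the group */
        setpgid(p, pgrp);
    }
    if (pgrp == 0)
        return -1;                      /* never signal group 0 (our own) */

    killpg(pgrp, SIGKILL);
    while (waitpid(-pgrp, &status, 0) > 0)
        if (WIFSIGNALED(status) && WTERMSIG(status) == SIGKILL)
            killed++;
    return killed;
}
```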
220 | </p><h3>Foreground process group</h3> |
221 | |
222 | <p>Among the process groups in a session at most one can be |
223 | the <i>foreground process group</i> of that session. |
224 | The tty input and tty signals (signals generated by ^C, ^Z, etc.) |
225 | go to processes in this foreground process group. |
226 | </p><p>A process can determine the foreground process group in its session |
227 | using <code>tcgetpgrp(fd)</code>, where <code>fd</code> refers to its |
228 | controlling tty. If there is no foreground process group, this returns |
229 | some value larger than 1 that is not currently a process group ID. |
230 | </p><p>A process can set the foreground process group in its session |
231 | using <code>tcsetpgrp(fd,pgrp)</code>, where <code>fd</code> refers to its |
232 | controlling tty and <code>pgrp</code> is a process group in |
233 | its session. The call fails unless that session is still associated |
234 | with the controlling tty of the calling process. |
235 | </p><p>How does one get <code>fd</code>? By definition, <code>/dev/tty</code> |
236 | refers to the controlling tty, entirely independent of redirects |
237 | of standard input and output. (There is also the function |
238 | <code>ctermid()</code> to get the name of the controlling terminal. |
239 | On a POSIX standard system it will return <code>/dev/tty</code>.) |
240 | Opening the name of the |
241 | controlling tty gives a file descriptor <code>fd</code>. |
242 | </p><p> |
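Put together, finding the foreground process group might be sketched as
follows. The helper name is hypothetical; <code>O_NOCTTY</code> is added
only as a precaution.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: return the foreground process group of the
 * controlling tty, or -1 if there is no controlling tty to open. */
pid_t foreground_pgrp(void)
{
    pid_t fg;
    int fd = open("/dev/tty", O_RDWR | O_NOCTTY);

    if (fd == -1)
        return (pid_t) -1;          /* no controlling tty (or no /dev/tty) */
    fg = tcgetpgrp(fd);
    close(fd);
    return fg;
}
```

A process is then in the foreground precisely when
<code>foreground_pgrp() == getpgrp()</code>.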
243 | </p><h3>Background process groups</h3> |
244 | |
245 | <p>All process groups in a session other than the foreground |
246 | process group are <i>background process groups</i>. |
247 | Since the user at the keyboard is interacting with foreground |
248 | processes, background processes should stay away from the terminal. |
249 | When a background process reads from the terminal it gets |
250 | a SIGTTIN signal. Normally, that will stop it, the job control shell |
251 | notices and tells the user, who can say <code>fg</code> to continue |
252 | this background process as a foreground process, and then this |
253 | process can read from the terminal. But if the background process |
254 | ignores or blocks the SIGTTIN signal, or if its process group |
255 | is orphaned (see below), then the read() returns an EIO error, |
256 | and no signal is sent. (Indeed, the idea is to tell the process |
257 | that reading from the terminal is not allowed right now. |
258 | If it would not see the signal, it sees the error return instead.) |
259 | </p><p>When a background process writes to the terminal, it may get |
260 | a SIGTTOU signal. May: namely, when the flag that this must happen |
261 | is set (it is off by default). One can set the flag by |
262 | </p><blockquote> |
263 | <pre>% stty tostop |
264 | </pre> |
265 | </blockquote> |
266 | |
267 | and clear it again by |
268 | <blockquote> |
269 | <pre>% stty -tostop |
270 | </pre> |
271 | </blockquote> |
272 | |
273 | and inspect it by |
274 | <blockquote> |
275 | <pre>% stty -a |
276 | </pre> |
277 | </blockquote> |
278 | |
279 | If TOSTOP is set but the background process ignores or blocks |
280 | the SIGTTOU signal, then the write() succeeds and no signal is sent |
281 | (this is what the Single UNIX Specification prescribes). If instead its process group |
282 | is orphaned (see below), then the write() returns an EIO error, and no signal is sent. |
283 | <p> |
284 | </p><h3>Orphaned process groups</h3> |
285 | |
286 | <p>The process group leader is the first member of the process group. |
287 | It may terminate before the others, and then the process group is |
288 | without a leader. |
289 | </p><p>A process group is called <i>orphaned</i> when <i>the |
290 | parent of every member is either in the process group |
291 | or outside the session</i>. |
292 | In particular, the process group of the session leader |
293 | is always orphaned. |
294 | </p><p>If termination of a process causes a process group to become |
295 | orphaned, and some member is stopped, then all are sent first SIGHUP |
296 | and then SIGCONT. |
297 | </p><p>The idea is that perhaps the parent of the process group leader |
298 | is a job control shell. (In the same session but a different |
299 | process group.) As long as this parent is alive, it can |
300 | handle the stopping and starting of members in the process group. |
301 | When it dies, there may be nobody to continue stopped processes. |
302 | Therefore, these stopped processes are sent SIGHUP, so that they |
303 | die unless they catch or ignore it, and then SIGCONT to continue them. |
304 | </p><p>Note that the process group of the session leader is already |
305 | orphaned, so no signals are sent when the session leader dies. |
306 | </p><p>Note also that a process group can become orphaned in two ways |
307 | by termination of a process: either the process was a parent and not itself |
308 | in the process group, or it was the last member of the process group |
309 | with a parent outside the group but in the same session. |
310 | Furthermore, a process group can become orphaned |
311 | other than by termination of a process, namely when some |
312 | member is moved to a different process group. |
313 | </p><p> |
314 | </p><h2><a name="ss10.3">10.3 Sessions</a> |
315 | </h2> |
316 | |
317 | <p>Every process group is in a unique <i>session</i>. |
318 | (When the process is created, it becomes a member of the session |
319 | of its parent.) |
320 | By convention, the session ID of a session |
321 | equals the process ID of the first member of the session, |
322 | called the <i>session leader</i>. |
323 | A process finds the ID of its session using the system call |
324 | <code>getsid(0)</code>; <code>getsid(p)</code> gives the session ID of process <code>p</code>. |
325 | </p><p>A session may have a <i>controlling tty</i>, |
326 | which is then also called the controlling tty of each of |
327 | its member processes. |
328 | A file descriptor for the controlling tty is obtained by |
329 | opening <code>/dev/tty</code>. (And when that fails, there was no |
330 | controlling tty.) Given a file descriptor for the controlling tty, |
331 | one may obtain the SID using <code>tcgetsid(fd)</code>. |
332 | </p><p>A session is often set up by a login process. The terminal |
333 | on which one is logged in then becomes the controlling tty |
334 | of the session. All processes that are descendants of the |
335 | login process will in general be members of the session. |
336 | </p><p> |
337 | </p><h3>Creation</h3> |
338 | |
339 | <p>A new session is created by |
340 | </p><blockquote> |
341 | <pre>pid = setsid(); |
342 | </pre> |
343 | </blockquote> |
344 | |
345 | This is allowed only when the current process is not a process group leader. |
346 | In order to be sure of that we fork first: |
347 | <blockquote> |
348 | <pre>p = fork(); |
349 | if (p) exit(0); |
350 | pid = setsid(); |
351 | </pre> |
352 | </blockquote> |
353 | |
354 | The result is that the current process (with process ID <code>pid</code>) |
355 | becomes session leader of a new session with session ID <code>pid</code>. |
356 | Moreover, it becomes process group leader of a new process group. |
357 | Both session and process group contain only the single process <code>pid</code>. |
358 | Furthermore, this process has no controlling tty. |
359 | <p>The restriction that the current process must not be a process group leader |
360 | is needed: otherwise its PID serves as PGID of some existing process group |
361 | and cannot be used as the PGID of a new process group. |
362 | </p><p> |
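The properties listed above can be verified in a sketch like the following.
The helper name is hypothetical. Unlike the fragment above, the parent here
does not exit but waits for the child, so that it can report the result.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: call setsid() in a forked child and check that the
 * child leads a new session and a new process group and has no controlling
 * tty. Returns 1 if all of this holds, 0 if not, -1 on failure. */
int new_session_properties_hold(void)
{
    int status, fd;
    pid_t p = fork();

    if (p == (pid_t) -1)
        return -1;
    if (p == 0) {                           /* CHILD: not a group leader */
        pid_t pid = setsid();

        if (pid == (pid_t) -1)
            _exit(2);
        fd = open("/dev/tty", O_RDWR);      /* must fail: no controlling tty */
        _exit(pid == getpid() && getsid(0) == pid && getpgrp() == pid
              && fd == -1 ? 0 : 1);
    }
    if (waitpid(p, &status, 0) != p)
        return -1;
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```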
363 | </p><h3>Getting a controlling tty</h3> |
364 | |
365 | <p>How does one get a controlling terminal? Nobody knows, |
366 | this is a great mystery. |
367 | </p><p>The System V approach is that the first tty opened by the process |
368 | becomes its controlling tty. |
369 | </p><p>The BSD approach is that one has to explicitly call |
370 | </p><blockquote> |
371 | <pre>ioctl(fd, TIOCSCTTY, 0/1); |
372 | </pre> |
373 | </blockquote> |
374 | |
375 | to get a controlling tty. |
376 | <p>Linux tries to be compatible with both, as always, and this |
377 | results in a very obscure complex of conditions. Roughly: |
378 | </p><p>The <code>TIOCSCTTY</code> ioctl will give us a controlling tty, |
379 | provided that (i) the current process is a session leader, |
380 | and (ii) it does not yet have a controlling tty, and |
381 | (iii) the tty does not already control some other session. |
382 | If it does, the third argument decides: with 0 the call fails, |
383 | while with 1 we steal the tty from that session, |
384 | but only if we are all-powerful (root); |
385 | otherwise that too is an error. |
386 | </p><p>Opening some terminal will give us a controlling tty, |
387 | provided that (i) the current process is a session leader, and |
388 | (ii) it does not yet have a controlling tty, and |
389 | (iii) the tty does not already control some other session, and |
390 | (iv) the open did not have the <code>O_NOCTTY</code> flag, and |
391 | (v) the tty is not the foreground VT, and |
392 | (vi) the tty is not the console, and |
393 | (vii) maybe the tty should not be master or slave pty. |
394 | </p><p> |
395 | </p><h3>Getting rid of a controlling tty</h3> |
396 | |
397 | <p>If a process wants to continue as a daemon, it must detach itself |
398 | from its controlling tty. Above we saw that <code>setsid()</code> |
399 | will remove the controlling tty. Also the ioctl TIOCNOTTY does this. |
400 | Moreover, in order not to get a controlling tty again as soon as it |
401 | opens a tty, the process has to fork once more, to ensure that it |
402 | is not a session leader. Typical code fragment: |
403 | </p><p> |
404 | </p><pre>if (fork() != 0) |
405 |         exit(0); |
406 | setsid(); |
407 | if (fork() != 0) |
408 |         exit(0); |
409 | </pre> |
410 | <p>See also <code>daemon(3)</code>. |
411 | </p><p> |
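The effect of the second fork can be checked with a sketch like the one
below. The helper name and the pipe used for reporting are hypothetical
additions; the caller plays the role of the process that would normally
exit after the first fork.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run the setsid()-then-fork sequence in a child and
 * report, through a pipe, whether the surviving process is a session
 * leader. Returns 1 if it is not a leader (the desired outcome),
 * 0 if it is, -1 on failure. */
int detached_process_is_not_leader(void)
{
    int fds[2], status;
    char answer = 0;
    ssize_t n;
    pid_t p;

    if (pipe(fds) == -1)
        return -1;
    p = fork();                         /* first fork */
    if (p == (pid_t) -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (p == 0) {
        close(fds[0]);
        if (setsid() == (pid_t) -1)     /* new session, no controlling tty */
            _exit(1);
        if (fork() != 0)                /* second fork: the leader exits */
            _exit(0);
        answer = getsid(0) != getpid() ? 'y' : 'n';
        if (write(fds[1], &answer, 1) != 1)
            _exit(1);
        _exit(0);
    }
    close(fds[1]);
    waitpid(p, &status, 0);             /* reap the short-lived leader */
    n = read(fds[0], &answer, 1);
    close(fds[0]);
    return n == 1 ? answer == 'y' : -1;
}
```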
412 | </p><h3>Disconnect</h3> |
413 | |
414 | <p>If the terminal goes away by modem hangup, and the line was not local, |
415 | then a SIGHUP is sent to the session leader. |
416 | Any further reads from the gone terminal return EOF. |
417 | (Or possibly -1 with <code>errno</code> set to EIO.) |
418 | </p><p>If the terminal is the slave side of a pseudotty, and the master side |
419 | is closed (for the last time), then a SIGHUP is sent to the foreground |
420 | process group of the slave side. |
421 | </p><p>When the session leader dies, a SIGHUP is sent to all processes |
422 | in the foreground process group. Moreover, the terminal stops being |
423 | the controlling terminal of this session (so that it can become |
424 | the controlling terminal of another session). |
425 | </p><p>Thus, if the terminal goes away and the session leader is |
426 | a job control shell, then it can handle things for its descendants, |
427 | e.g. by sending them a SIGHUP in turn. |
428 | If on the other hand the session leader is an innocent process |
429 | that does not catch SIGHUP, it will die, and all foreground processes |
430 | get a SIGHUP. |
431 | </p><p> |
432 | </p><h2><a name="ss10.4">10.4 Threads</a> |
433 | </h2> |
434 | |
435 | <p>A process can have several threads. New threads (with the same PID |
436 | as the parent thread) are started using the <code>clone</code> system |
437 | call with the <code>CLONE_THREAD</code> flag. Threads are distinguished |
438 | by a <i>thread ID</i> (TID). An ordinary process has a single thread |
439 | with TID equal to PID. The system call <code>gettid()</code> returns the |
440 | TID. The system call <code>tkill()</code> sends a signal to a single thread. |
441 | </p><p>Example: a process with two threads. Both only print PID and TID and exit. |
442 | (Linux 2.4.19 or later.) |
443 | </p><pre>% cat << EOF > gettid-demo.c |
444 | #define _GNU_SOURCE |
445 | #include <sched.h> |
446 | #include <stdio.h> |
447 | #include <unistd.h> |
448 | #include <sys/syscall.h> |
449 | #include <sys/types.h> |
450 | #define gettid() ((pid_t) syscall(SYS_gettid)) |
451 | /* CLONE_THREAD needs CLONE_SIGHAND, which in turn needs CLONE_VM. */ |
452 | int thread(void *p) { |
453 |         printf("thread: %d %d\n", gettid(), getpid()); |
454 |         return 0; |
455 | } |
456 | int main(void) { |
457 |         static unsigned char stack[65536]; /* the stack grows down */ |
458 |         int i; |
459 |         i = clone(thread, stack + sizeof stack, CLONE_VM | CLONE_SIGHAND | CLONE_THREAD, NULL); |
460 |         if (i == -1) |
461 |                 perror("clone"); |
462 |         else |
463 |                 printf("clone returns %d\n", i); |
464 |         printf("parent: %d %d\n", gettid(), getpid()); |
465 |         sleep(1); /* give the thread time to print before we exit */ |
466 | } |
467 | EOF |
468 | % cc -o gettid-demo gettid-demo.c |
469 | % ./gettid-demo |
470 | clone returns 21826 |
471 | parent: 21825 21825 |
472 | thread: 21826 21825 |
473 | % |
474 | </pre> |
475 | <p> |
476 | </p><p> |
477 | </p><hr> |
478 | |
479 | </body></html> |
480 |