1 | Daemontools and runit |
2 | |
3 | Tired of PID files, needing root access, and writing init scripts just |
4 | to have your UNIX apps start when your server boots? Want a simpler, |
5 | better alternative that will also restart them if they crash? If so, |
6 | this is an introduction to process supervision with runit/daemontools. |
7 | |
8 | |
9 | Background |
10 | |
11 | Classic init scripts, e.g. /etc/init.d/apache, are widely used for |
12 | starting processes at system boot time, when they are executed by init. |
13 | Sadly, init scripts are cumbersome and error-prone to write, they must |
14 | typically be edited and run as root, and the processes they launch do |
15 | not get restarted automatically if they crash. |
16 | |
17 | In an alternative scheme called "process supervision", each important |
18 | process is looked after by a tiny supervising process, which deals with |
19 | starting and stopping the important process on request, and re-starting |
20 | it when it exits unexpectedly. Those supervising processes can in turn |
21 | be supervised by other supervising processes. |
22 | |
23 | Dan Bernstein wrote the process supervision toolkit, "daemontools", |
24 | which is a set of small, reliable programs that cooperate in the |
25 | UNIX tradition to manage process supervision trees. |
26 | |
27 | Runit is a more conveniently licensed and more actively maintained |
28 | reimplementation of daemontools, written by Gerrit Pape. |
29 | |
30 | Here I’ll use runit; however, the ideas are the same for other |
31 | daemontools-like projects (there are several). |
32 | |
33 | |
34 | Service directories and scripts |
35 | |
36 | In runit parlance a "service" is simply a directory containing a script |
37 | named "run". |
38 | |
39 | There are just two key programs in runit. Firstly, runsv supervises the |
40 | process for an individual service. Service directories themselves sit |
41 | inside a containing directory, and the runsvdir program supervises that |
42 | directory, running one child runsv process for the service in each |
43 | subdirectory. A typical choice is to start an instance of runsvdir |
44 | which supervises services in subdirectories of /var/service/. |
45 | |
46 | If SERVICE_DIR/log/ exists, runsv will supervise two services, |
47 | and will connect stdout of the main service to stdin of the log service. |
48 | This is primarily used for logging. |
49 | |
50 | You can debug an individual service by running its SERVICE_DIR/run script. |
51 | In this case, its stdout and stderr go to your terminal. |
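As a minimal sketch of the above, the following builds a throwaway service directory and runs its "run" script by hand. The service name "hello" and the script body are made up for illustration; a real run script would start its daemon in the foreground instead of printing a message.

```shell
#!/bin/sh
# Build a throwaway service directory; "hello" is a made-up service name.
base=$(mktemp -d)
svc="$base/hello"
mkdir -p "$svc"

# A service is just a directory containing an executable script named "run".
cat > "$svc/run" <<'EOF'
#!/bin/sh
# Merge stderr into stdout so both end up in the same place.
exec 2>&1
# A real service would start its daemon here, in the foreground.
echo "hello from ${PWD##*/}"
EOF
chmod +x "$svc/run"

# Debug run: execute the script from inside the service directory.
out=$(cd "$svc" && ./run)
echo "$out"     # prints: hello from hello
```

Because nothing supervises the script here, its output comes straight back to the calling shell.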
52 | |
53 | You can also run "runsv SERVICE_DIR", which runs both the service |
54 | and its logger service (SERVICE_DIR/log/run) if the logger service exists. |
55 | In that case, the output goes to the logger instead of the terminal. |
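The service/logger pairing can be tried without runit installed. In this sketch a shell pipeline stands in for the pipe that runsv would create, and the log/run script is a simple stand-in for a real logger such as svlogd. The service name "demo" and both script bodies are assumptions for illustration only.

```shell
#!/bin/sh
# Throwaway service with a logger; "demo" is a made-up service name.
svc="$(mktemp -d)/demo"
mkdir -p "$svc/log"

# Main service: writes to stdout (stderr merged into it).
cat > "$svc/run" <<'EOF'
#!/bin/sh
exec 2>&1
echo "service started"
EOF

# Logger: reads the main service's output on stdin.
# Stand-in for something like "exec svlogd -tt /var/log/service/demo".
cat > "$svc/log/run" <<'EOF'
#!/bin/sh
while IFS= read -r line; do
    echo "logged: $line"
done
EOF
chmod +x "$svc/run" "$svc/log/run"

# runsv would set up this pipe itself; a shell pipeline emulates it here.
out=$( (cd "$svc" && ./run) | (cd "$svc/log" && ./run) )
echo "$out"     # prints: logged: service started
```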
56 | |
57 | "runsvdir /var/service" merely runs "runsv SERVICE_DIR" for every subdirectory |
58 | in /var/service. |
59 | |
60 | |
61 | Examples |
62 | |
63 | This directory contains some examples of services: |
64 | |
65 | var_service/getty_<tty> |
66 | |
67 | Runs a getty on <tty>. (The run script looks at $PWD and extracts the suffix |
68 | after "_" as the tty name.) Create copies (or symlinks) of this directory |
69 | with different names to run many gettys on many ttys. |
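The suffix trick can be sketched with plain shell parameter expansion. The helper name suffix_of is hypothetical and is not necessarily how the shipped run script does it; inside a run script the same idea reduces to something like tty=${PWD##*_}.

```shell
#!/bin/sh
# Print the part of a directory's name after the last "_".
# suffix_of is a hypothetical helper, not part of runit.
suffix_of() {
    dir=${1%/}          # drop a trailing slash, if any
    name=${dir##*/}     # keep only the last path component
    echo "${name##*_}"  # keep what follows the last "_"
}

suffix_of /var/service/getty_tty1     # prints: tty1
suffix_of /var/service/dhcp_eth0/     # prints: eth0
```

Note that a name without any "_" is printed unchanged, so a real script may want to check for that case.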
70 | |
71 | var_service/gpm |
72 | |
73 | Runs gpm, the cut and paste utility and mouse server for text consoles. |
74 | |
75 | var_service/inetd |
76 | |
77 | Runs inetd. This is an example of a service with a log. The log service |
78 | writes timestamped, rotated log data to /var/log/service/inetd/* |
79 | using "svlogd -tt". The p_log and w_log scripts demonstrate how you can |
80 | "page log" and "watch log". |
81 | |
82 | Other services which have logs handle them in the same way. |
83 | |
84 | var_service/nmeter |
85 | |
86 | Runs nmeter '%t %c ....' with output to /dev/tty9. This gives you |
87 | a 1-second sampling of server load and health on a dedicated text console. |
88 | |
89 | |
90 | Networking examples |
91 | |
92 | In many cases, network configuration makes it necessary to run several daemons: |
93 | dhcp, zeroconf, ppp, openvpn and such. They need to be controlled, |
94 | and in many cases you also want to babysit them. |
95 | |
96 | They present a case where different services need to control (start, stop, |
97 | restart) each other. |
98 | |
99 | var_service/dhcp_if |
100 | |
101 | Controls a udhcpc instance which provides a DHCP-assigned IP |
102 | address on the interface named "if". Copy/rename this directory as needed to run |
103 | udhcpc on other interfaces (the var_service/dhcp_if/run script uses the _if suffix |
104 | of its parent directory's name as the interface name). |
105 | |
106 | When an IP address is obtained or lost, var_service/dhcp_if/dhcp_handler is run. |
107 | It saves new config data to /var/run/service/fw/dhcp_if.ipconf and (re)starts |
108 | the /var/service/fw service. This example can be used as a template for other |
109 | dynamic network link services (ppp/vpn/zcip). |
110 | |
111 | This is an example of a service which has a "finish" script. If downed ("sv d"), |
112 | "finish" is executed. For this service, it removes the DHCP address from |
113 | the interface. This is useful when ifplugd detects that the link is dead |
114 | (cable is no longer attached anywhere) and downs us: keeping DHCP-configured |
115 | addresses on the interface would make the kernel still try to use it. |
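A finish script of this kind might look roughly like the sketch below. It only writes the script into a throwaway directory and prints it; nothing touches a real interface. The "ip -4 addr flush" command and the variable name iface are assumptions chosen for illustration, and the shipped script may clear the address differently.

```shell
#!/bin/sh
# Write a sketch of a finish script into a throwaway copy of the service
# directory; the script itself is not executed here.
svc="$(mktemp -d)/dhcp_eth0"
mkdir -p "$svc"

cat > "$svc/finish" <<'EOF'
#!/bin/sh
# runsv runs ./finish from the service directory after ./run has exited.
# Interface name = suffix of the service directory name (assumed layout).
iface=${PWD##*/dhcp_}
echo "finish: dropping IPv4 addresses on $iface"
# Assumption: an iproute2-style "ip" is available; needs root to take effect.
ip -4 addr flush dev "$iface"
EOF
chmod +x "$svc/finish"

cat "$svc/finish"
```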
116 | |
117 | var_service/zcip_if |
118 | |
119 | Zeroconf IP service: assigns a 169.254.x.y/16 address to interface "if". |
120 | This allows talking to other devices on a network without a DHCP server |
121 | (if they also assign 169.254 addresses to themselves). |
122 | |
123 | var_service/ifplugd_if |
124 | |
125 | Watches link status of interface "if". Downs and ups /var/service/dhcp_if |
126 | service accordingly. In effect, it allows you to unplug/plug-to-different-network |
127 | and have your IP properly re-negotiated at once. |
128 | |
129 | var_service/dhcp_if_pinger |
130 | |
131 | Uses var_service/dhcp_if's data to determine router IP. Pings it. |
132 | If ping fails, restarts /var/service/dhcp_if service. |
133 | Basically, an example of a watchdog service for networks which are not reliable |
134 | and need babysitting. |
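The watchdog logic can be sketched as a small function with the probe and the restart action passed in, so it can be tried without a network. The failure threshold max_fail and the function name watchdog_step are assumptions added for illustration; in a real service the two commands might be something like "ping -c1 -W1 ROUTER_IP" and "sv t /var/service/dhcp_if".

```shell
#!/bin/sh
# Sketch of one watchdog iteration. The probe and restart commands are
# injected as strings and word-split on purpose when they are run.
max_fail=3      # hypothetical threshold of consecutive failures
fail_count=0

# Run one probe; run the restart command after max_fail failures in a row.
watchdog_step() {
    check_cmd=$1
    restart_cmd=$2
    if $check_cmd >/dev/null 2>&1; then
        fail_count=0
    else
        fail_count=$((fail_count + 1))
        if [ "$fail_count" -ge "$max_fail" ]; then
            $restart_cmd
            fail_count=0
        fi
    fi
}

# Dry run: "false" stands in for a failing ping, "echo restart" for sv.
for i in 1 2 3; do
    watchdog_step false "echo restart"     # third call prints: restart
done
```

A real run script would call such a step in an endless loop with a sleep between probes, which is why the process tree of a pinger service typically shows only a sleep.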
135 | |
136 | var_service/supplicant_if |
137 | |
138 | Wireless supplicant (wifi association and encryption daemon) service for |
139 | interface "if". |
140 | |
141 | var_service/fw |
142 | |
143 | "Firewall" script, although it is tasked with much more than setting up firewall. |
144 | It is responsible for all aspects of network configuration. |
145 | |
146 | This is an example of a *one-shot* service. |
147 | |
148 | It reconfigures network based on current known state of ALL interfaces. |
149 | Uses conf/*.ipconf (static config) and /var/run/service/fw/*.ipconf |
150 | (dynamic config from dhcp/ppp/vpn/etc) to determine what to do. |
151 | |
152 | One-shot-ness of this service means that it shuts itself off after a single run. |
153 | In other words, it is not a constantly running daemon sort of thing. |
154 | It starts, it configures the network, it shuts down, all done |
155 | (unlike infamous NetworkManagers which sit in RAM forever). |
156 | |
157 | However, any dhcp/ppp/vpn or similar service can restart it anytime |
158 | when it senses a change in network configuration. |
159 | This even works while the fw service runs: if dhcp signals fw to (re)start |
160 | while fw runs, fw will not stop after its execution, but will re-execute once, |
161 | picking up dhcp's new configuration. |
162 | This is achieved very simply by having |
163 |     # Make ourself one-shot |
164 |     sv o . |
165 | at the very beginning of the fw/run script, not at the end. |
166 | |
167 | Therefore, any "sv u /var/service/fw" command by any other |
168 | script "undoes" the o(ne-shot) command if fw still runs, thus |
169 | runsv will rerun it; or starts it in the normal way if fw is not running. |
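To make the ordering concrete, here is a sketch of how such a run script could be laid out. It writes the script into a throwaway directory and prints it rather than executing it, since "sv" only works under a running runsv. The echo line is a hypothetical placeholder for the real network configuration work.

```shell
#!/bin/sh
# Write a sketch of a one-shot run script into a throwaway directory.
svc="$(mktemp -d)/fw"
mkdir -p "$svc"

cat > "$svc/run" <<'EOF'
#!/bin/sh
exec 2>&1
# Make ourself one-shot: ask runsv not to restart us after we exit.
# Done first, so an "sv u" issued by another service while we are
# still working undoes it and we get re-executed once.
sv o .

# Placeholder for the real work (hypothetical): read the *.ipconf files
# and reconfigure interfaces, routes and firewall rules.
echo "configuring network"
EOF
chmod +x "$svc/run"

cat "$svc/run"
```

Had "sv o ." been placed at the end instead, an "sv u" arriving during the run would be overridden by it, and the new configuration would not be picked up.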
170 | |
171 | This mechanism is the reason why fw is a service, not just a script. |
172 | |
173 | System administrators are expected to edit fw/run script, since |
174 | network configuration needs are likely to be very complex and different |
175 | for non-trivial installations. |
176 | |
177 | var_service/ftpd |
178 | var_service/httpd |
179 | var_service/tftpd |
180 | var_service/ntpd |
181 | |
182 | Examples of typical network daemons. |
183 | |
184 | |
185 | Process tree |
186 | |
187 | Here is an example of the process tree from a live system with these services |
188 | (and a few others). An interesting detail is the ftpd and vpnc services, where |
189 | you can see only the logger process. These services are "downed" at the moment: |
190 | their daemons are not launched. |
191 | |
192 | PID TIME COMMAND |
193 | 553 0:04 runsvdir -P /var/service |
194 | 561 0:00 runsv sshd |
195 | 576 0:00 svlogd -tt /var/log/service/sshd |
196 | 589 0:00 /usr/sbin/sshd -D -e -p22 -u0 -h /var/service/sshd/ssh_host_rsa_key |
197 | 562 0:00 runsv dhcp_eth0 |
198 | 568 0:00 svlogd -tt /var/log/service/dhcp_eth0 |
199 | 850 0:00 udhcpc -vv --foreground --interface=eth0 |
200 | --pidfile=/var/service/dhcp_eth0/udhcpc.pid |
201 | --script=/var/service/dhcp_eth0/dhcp_handler -x hostname bbox |
202 | 563 0:00 runsv ntpd |
203 | 573 0:01 svlogd -tt /var/log/service/ntpd |
204 | 845 0:00 busybox ntpd -dddnNl -S ./ntp.script -p 10.x.x.x -p 10.x.x.x |
205 | 564 0:00 runsv ifplugd_wlan0 |
206 | 598 0:00 svlogd -tt /var/log/service/ifplugd_wlan0 |
207 | 614 0:05 ifplugd -apqns -t3 -u0 -d0 -i wlan0 |
208 | -r /var/service/ifplugd_wlan0/ifplugd_handler |
209 | 565 0:08 runsv dhcp_wlan0_pinger |
210 | 911 0:00 sleep 67 |
211 | 566 0:00 runsv unscd |
212 | 583 0:03 svlogd -tt /var/log/service/unscd |
213 | 599 0:02 nscd -dddd |
214 | 567 0:00 runsv dhcp_wlan0 |
215 | 591 0:00 svlogd -tt /var/log/service/dhcp_wlan0 |
216 | 802 0:00 udhcpc -vv -C -o -V --foreground --interface=wlan0 |
217 | --pidfile=/var/service/dhcp_wlan0/udhcpc.pid |
218 | --script=/var/service/dhcp_wlan0/dhcp_handler |
219 | 569 0:00 runsv fw |
220 | 570 0:00 runsv ifplugd_eth0 |
221 | 597 0:00 svlogd -tt /var/log/service/ifplugd_eth0 |
222 | 612 0:05 ifplugd -apqns -t3 -u8 -d8 -i eth0 |
223 | -r /var/service/ifplugd_eth0/ifplugd_handler |
224 | 571 0:00 runsv zcip_eth0 |
225 | 590 0:00 svlogd -tt /var/log/service/zcip_eth0 |
226 | 607 0:01 zcip -fvv eth0 /var/service/zcip_eth0/zcip_handler |
227 | 572 0:00 runsv ftpd |
228 | 604 0:00 svlogd -tt /var/log/service/ftpd |
229 | 574 0:00 runsv vpnc |
230 | 603 0:00 svlogd -tt /var/log/service/vpnc |
231 | 575 0:00 runsv httpd |
232 | 602 0:00 svlogd -tt /var/log/service/httpd |
233 | 622 0:00 busybox httpd -p80 -vvv -f -h /home/httpd_root |
234 | 577 0:00 runsv supplicant_wlan0 |
235 | 627 0:00 svlogd -tt /var/log/service/supplicant_wlan0 |
236 | 638 0:03 wpa_supplicant -i wlan0 -c /var/service/supplicant_wlan0/wpa_supplicant.conf -d |
237 |