<chapter xmlns="http://docbook.org/ns/docbook"
         xmlns:xlink="http://www.w3.org/1999/xlink"
         xml:id="ch-running">

<title>Running NixOS</title>

<para>This chapter describes various aspects of managing a running
NixOS system, such as how to use the <command>systemd</command>
service manager.</para>

<!--===============================================================-->

<section><title>Service management</title>

<para>In NixOS, all system services are started and monitored using
the systemd program. Systemd is the “init” process of the system
(i.e. PID 1), the parent of all other processes. It manages a set of
so-called “units”, which can be things like system services
(programs), but also mount points, swap files, devices, targets
(groups of units) and more. Units can have complex dependencies; for
instance, one unit can require that another unit must be successfully
started before the first unit can be started. When the system boots,
it starts a unit named <literal>default.target</literal>; the
dependencies of this unit cause all system services to be started,
file systems to be mounted, swap files to be activated, and so
on.</para>
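
<para>As an illustration of such dependencies, the following sketch
declares a service in <filename>configuration.nix</filename> that
requires PostgreSQL and is only started after it. The unit name
<literal>example-app</literal>, its description and the command it
runs are hypothetical stand-ins, and the fragment assumes that
<varname>pkgs</varname> is in scope, as it usually is in
<filename>configuration.nix</filename>:
<programlisting>
systemd.services.example-app = {
  description = "Example application (hypothetical)";
  # Pull the unit in when the system reaches multi-user.target.
  wantedBy = [ "multi-user.target" ];
  # Fail to start unless postgresql.service can be started ...
  requires = [ "postgresql.service" ];
  # ... and wait until it has been started before starting this unit.
  after = [ "postgresql.service" ];
  # Stand-in command; replace with the real program.
  serviceConfig.ExecStart = "${pkgs.hello}/bin/hello";
};
</programlisting>
Here <literal>requires</literal> expresses the dependency itself,
while <literal>after</literal> expresses the ordering; systemd treats
the two separately, so both are usually needed.</para>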

<para>The command <command>systemctl</command> is the main way to
interact with <command>systemd</command>. Without any arguments, it
shows the status of active units:
<screen>
$ systemctl
-.mount          loaded active mounted   /
swapfile.swap    loaded active active    /swapfile
sshd.service     loaded active running   SSH Daemon
graphical.target loaded active active    Graphical Interface
<replaceable>...</replaceable>
</screen>
</para>

<para>You can ask for detailed status information about a unit, for
instance, the PostgreSQL database service:
<screen>
$ systemctl status postgresql.service
postgresql.service - PostgreSQL Server
          Loaded: loaded (/nix/store/pn3q73mvh75gsrl8w7fdlfk3fq5qm5mw-unit/postgresql.service)
          Active: active (running) since Mon, 2013-01-07 15:55:57 CET; 9h ago
        Main PID: 2390 (postgres)
          CGroup: name=systemd:/system/postgresql.service
                  ├─2390 postgres
                  ├─2418 postgres: writer process
                  ├─2419 postgres: wal writer process
                  ├─2420 postgres: autovacuum launcher process
                  ├─2421 postgres: stats collector process
                  └─2498 postgres: zabbix zabbix [local] idle

Jan 07 15:55:55 hagbard postgres[2394]: [1-1] LOG: database system was shut down at 2013-01-07 15:55:05 CET
Jan 07 15:55:57 hagbard postgres[2390]: [1-1] LOG: database system is ready to accept connections
Jan 07 15:55:57 hagbard postgres[2420]: [1-1] LOG: autovacuum launcher started
Jan 07 15:55:57 hagbard systemd[1]: Started PostgreSQL Server.
</screen>
Note that this shows the status of the unit (active and running), all
the processes belonging to the service, as well as the most recent log
messages from the service.
</para>

<para>Units can be stopped, started or restarted:
<screen>
$ systemctl stop postgresql.service
$ systemctl start postgresql.service
$ systemctl restart postgresql.service
</screen>
These operations are synchronous: they wait until the service has
finished starting or stopping (or has failed). Starting a unit will
cause the dependencies of that unit to be started as well (if
necessary).</para>

<!-- - cgroups: each service and user session is a cgroup
- cgroup resource management -->

</section>

<!--===============================================================-->

<section><title>Rebooting and shutting down</title>

<para>The system can be shut down (and automatically powered off) by
doing:
<screen>
$ shutdown
</screen>
This is equivalent to running <command>systemctl
poweroff</command>.</para>

<para>To reboot the system, run
<screen>
$ reboot
</screen>
which is equivalent to <command>systemctl reboot</command>.
Alternatively, you can quickly reboot the system using
<literal>kexec</literal>, which bypasses the BIOS by directly loading
the new kernel into memory:
<screen>
$ systemctl kexec
</screen>
</para>

<para>The machine can be suspended to RAM (if supported) using
<command>systemctl suspend</command>, and suspended to disk using
<command>systemctl hibernate</command>.</para>

<para>These commands can be run by any user who is logged in locally,
i.e. on a virtual console or in X11; otherwise, the user is asked for
authentication.</para>

</section>

<!--===============================================================-->

<section><title>User sessions</title>

<para>Systemd keeps track of all users who are logged into the system
(e.g. on a virtual console or remotely via SSH). The command
<command>loginctl</command> allows querying and manipulating user
sessions. For instance, to list all user sessions:
<screen>
$ loginctl
   SESSION        UID USER             SEAT
        c1        500 eelco            seat0
        c3          0 root             seat0
        c4        500 alice
</screen>
This shows that two users are logged in locally, while another is
logged in remotely. (“Seats” are essentially the combinations of
displays and input devices attached to the system; usually, there is
only one seat.) To get information about a session:
<screen>
$ loginctl session-status c3
c3 - root (0)
           Since: Tue, 2013-01-08 01:17:56 CET; 4min 42s ago
          Leader: 2536 (login)
            Seat: seat0; vc3
             TTY: /dev/tty3
         Service: login; type tty; class user
           State: online
          CGroup: name=systemd:/user/root/c3
                  ├─ 2536 /nix/store/10mn4xip9n7y9bxqwnsx7xwx2v2g34xn-shadow-4.1.5.1/bin/login --
                  ├─10339 -bash
                  └─10355 w3m nixos.org
</screen>
This shows that the user is logged in on virtual console 3. It also
lists the processes belonging to this session. Since systemd keeps
track of this, you can terminate a session in a way that ensures that
all the session’s processes are gone:
<screen>
$ loginctl terminate-session c3
</screen>
</para>

</section>

<!--===============================================================-->

<section><title>Control groups</title>

<para>To keep track of the processes in a running system, systemd uses
<emphasis>control groups</emphasis> (cgroups). A control group is a
set of processes used to allocate resources such as CPU, memory or I/O
bandwidth. There can be multiple control group hierarchies, allowing
each kind of resource to be managed independently.</para>

<para>The command <command>systemd-cgls</command> lists all control
groups in the <literal>systemd</literal> hierarchy, which is what
systemd uses to keep track of the processes belonging to each service
or user session:
<screen>
$ systemd-cgls
├─user
│ └─eelco
│   └─c1
│     ├─ 2567 -:0
│     ├─ 2682 kdeinit4: kdeinit4 Running...
│     ├─ <replaceable>...</replaceable>
│     └─10851 sh -c less -R
└─system
  ├─httpd.service
  │ ├─2444 httpd -f /nix/store/3pyacby5cpr55a03qwbnndizpciwq161-httpd.conf -DNO_DETACH
  │ └─<replaceable>...</replaceable>
  ├─dhcpcd.service
  │ └─2376 dhcpcd --config /nix/store/f8dif8dsi2yaa70n03xir8r653776ka6-dhcpcd.conf
  └─ <replaceable>...</replaceable>
</screen>
Similarly, <command>systemd-cgls cpu</command> shows the cgroups in
the CPU hierarchy, which allows per-cgroup CPU scheduling priorities.
By default, every systemd service gets its own CPU cgroup, while all
user sessions are in the top-level CPU cgroup. This ensures, for
instance, that a thousand run-away processes in the
<literal>httpd.service</literal> cgroup cannot starve a single
process in the <literal>postgresql.service</literal> cgroup of CPU
time. (By contrast, if they were all in the same cgroup, the
PostgreSQL process would get only 1/1001 of that cgroup’s CPU time.)
You can limit a service’s CPU share in
<filename>configuration.nix</filename>:
<programlisting>
systemd.services.httpd.serviceConfig.CPUShares = 512;
</programlisting>
By default, every cgroup has 1024 CPU shares, so this will halve the
CPU allocation of the <literal>httpd.service</literal> cgroup.</para>

<para>There is also a <literal>memory</literal> hierarchy that
controls memory allocation limits; by default, all processes are in
the top-level cgroup, so any service or session can exhaust all
available memory. Per-cgroup memory limits can be specified in
<filename>configuration.nix</filename>; for instance, to limit
<literal>httpd.service</literal> to 512 MiB of RAM (excluding swap)
and to 640 MiB of RAM and swap combined:
<programlisting>
systemd.services.httpd.serviceConfig.MemoryLimit = "512M";
systemd.services.httpd.serviceConfig.ControlGroupAttribute = [ "memory.memsw.limit_in_bytes 640M" ];
</programlisting>
</para>
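
<para>The CPU and RAM limits shown above can also be grouped in a
single attribute set, which keeps a service’s resource settings in one
place. The following sketch reuses only the directive names already
shown. Note that newer systemd releases deprecate
<literal>CPUShares</literal> and <literal>MemoryLimit</literal> in
favour of <literal>CPUWeight</literal> and
<literal>MemoryMax</literal>, so it is worth checking the
<literal>systemd.resource-control</literal> manual page for the names
that your systemd version accepts:
<programlisting>
systemd.services.httpd.serviceConfig = {
  # Half of the default 1024 CPU shares.
  CPUShares = 512;
  # At most 512 MiB of RAM, excluding swap.
  MemoryLimit = "512M";
};
</programlisting>
Both spellings define the same options; the nested form is merely
easier to read once a service has several such settings.</para>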

<para>The command <command>systemd-cgtop</command> shows a
continuously updated list of all cgroups with their CPU and memory
usage.</para>

</section>

<!--===============================================================-->

<section><title>Logging</title>

<para>System-wide logging is provided by systemd’s
<emphasis>journal</emphasis>, which subsumes traditional logging
daemons such as syslogd and klogd. Log entries are kept in binary
files in <filename>/var/log/journal/</filename>. The command
<command>journalctl</command> allows you to see the contents of the
journal. For example,
<screen>
$ journalctl -b
</screen>
shows all journal entries since the last reboot. (The output of
<command>journalctl</command> is piped into <command>less</command> by
default.) You can use various options and match operators to restrict
output to messages of interest. For instance, to get all messages
from PostgreSQL:
<screen>
$ journalctl -u postgresql.service
-- Logs begin at Mon, 2013-01-07 13:28:01 CET, end at Tue, 2013-01-08 01:09:57 CET. --
...
Jan 07 15:44:14 hagbard postgres[2681]: [2-1] LOG: database system is shut down
-- Reboot --
Jan 07 15:45:10 hagbard postgres[2532]: [1-1] LOG: database system was shut down at 2013-01-07 15:44:14 CET
Jan 07 15:45:13 hagbard postgres[2500]: [1-1] LOG: database system is ready to accept connections
</screen>
Or to get all messages since the last reboot that have at least a
“critical” severity level:
<screen>
$ journalctl -b -p crit
Dec 17 21:08:06 mandark sudo[3673]: pam_unix(sudo:auth): auth could not identify password for [alice]
Dec 29 01:30:22 mandark kernel[6131]: [1053513.909444] CPU6: Core temperature above threshold, cpu clock throttled (total events = 1)
</screen>
</para>

<para>The system journal is readable by root and by users in the
<literal>wheel</literal> and <literal>systemd-journal</literal>
groups. All users have a private journal that can be read using
<command>journalctl</command>.</para>

</section>

<!--===============================================================-->

<section><title>Cleaning up the Nix store</title>

<para>Nix has a purely functional model, meaning that packages are
never upgraded in place. Instead, new versions of packages end up in a
different location in the Nix store (<filename>/nix/store</filename>).
You should periodically run Nix’s <emphasis>garbage
collector</emphasis> to remove old, unreferenced packages. This is
easy:
<screen>
$ nix-collect-garbage
</screen>
Alternatively, you can use a systemd unit that does the same in the
background:
<screen>
$ systemctl start nix-gc.service
</screen>
You can tell NixOS in <filename>configuration.nix</filename> to run
this unit automatically at certain points in time, for instance, every
night at 03:15:
<programlisting>
nix.gc.automatic = true;
nix.gc.dates = "03:15";
</programlisting>
</para>

<para>The commands above do not remove garbage collector roots, such
as old system configurations. Thus they do not remove the ability to
roll back to previous configurations. The following command deletes
old roots, removing the ability to roll back to them:
<screen>
$ nix-collect-garbage -d
</screen>
You can also do this for specific profiles, e.g.
<screen>
$ nix-env -p /nix/var/nix/profiles/per-user/eelco/profile --delete-generations old
</screen>
Note that NixOS system configurations are stored in the profile
<filename>/nix/var/nix/profiles/system</filename>.</para>
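
<para>Old roots can also be pruned on a schedule rather than by hand,
by passing extra arguments to the automatic collector. The following
is a minimal sketch, assuming that your NixOS version provides the
<literal>nix.gc.options</literal> option and that your
<command>nix-collect-garbage</command> supports
<option>--delete-older-than</option>; the 30-day period is an
arbitrary example value:
<programlisting>
nix.gc.automatic = true;
nix.gc.dates = "03:15";
# Additionally delete profile generations older than 30 days.
nix.gc.options = "--delete-older-than 30d";
</programlisting>
As with <command>nix-collect-garbage -d</command>, generations deleted
this way can no longer be rolled back to.</para>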

<para>Another way to reclaim disk space (often as much as 40% of the
size of the Nix store) is to run Nix’s store optimiser, which seeks
out identical files in the store and replaces them with hard links to
a single copy:
<screen>
$ nix-store --optimise
</screen>
Since this command needs to read the entire Nix store, it can take
quite a while to finish.</para>
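
<para>Because the optimiser is slow, it may be more convenient to let
NixOS run it in the background. The following sketch assumes a NixOS
version that ships the <literal>nix.optimise</literal> options; check
the option reference of your release before relying on it:
<programlisting>
# Periodically run the store optimiser as a systemd unit.
nix.optimise.automatic = true;
</programlisting>
</para>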

</section>

</chapter>