Infiniroot Blog: We sometimes write, too.

Syslog-NG in container stopped working: No space left on device /dev

Published on August 25th 2020


A central syslog server, running syslog-ng in an LXC (system) container, stopped working. A quick look at the local syslog logs suggested that the file system was full:

Aug 24 10:06:01 syslog syslog-ng[42]: Error suspend timeout has elapsed, attempting to write again; fd='26'
Aug 24 10:06:01 syslog syslog-ng[42]: I/O error occurred while writing; fd='26', error='No space left on device (28)'
Aug 24 10:06:01 syslog syslog-ng[42]: Suspending write operation because of an I/O error; fd='26', time_reopen='60'

Yet when checking the container's file system, there was still plenty of space available:

ckadm@syslog ~ $ df -h /
Filesystem            Type  Size  Used Avail Use% Mounted on
/dev/vgdata/syslog ext4  216G  192G   24G  90% /

ckadm@syslog ~ $ df -i /
Filesystem            Type Inodes IUsed IFree IUse% Mounted on
/dev/vgdata/syslog ext4    14M  100K   14M    1% /

But df revealed that a different file system was completely full: /dev!

root@syslog ~ # df -h /dev
Filesystem     Type   Size  Used Avail Use% Mounted on
none           tmpfs  492K  492K     0 100% /dev
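Checking every mount point by hand is easy to forget, so the same check can be scripted. The following is only a minimal sketch: the function name full_mounts is made up for this example, and it assumes the POSIX output format of "df -P" with mount points that contain no spaces.

```shell
#!/bin/sh
# Hypothetical helper: read "df -P" output on stdin and print every
# mount point whose usage is at or above the given percentage.
full_mounts() {
    threshold="${1:-100}"
    awk -v t="$threshold" '
        NR > 1 {                      # skip the header line
            use = $5                  # "Capacity" column, e.g. "100%"
            sub(/%/, "", use)
            if (use + 0 >= t + 0) print $6
        }'
}

# List all file systems that are at least 95% full.
df -P | full_mounts 95
```

On the container described here, such a check would have listed /dev while leaving / unreported.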

There are a couple of explanations why this happened.

What is /dev in an LXC container?

The /dev file system in an LXC container is (by default) created automatically when the container is started. From the documentation:

By default, lxc creates a few symbolic links (fd,stdin,stdout,stderr) in the container's /dev directory but does not automatically create device node entries. This allows the container's /dev to be set up as needed in the container rootfs. If lxc.autodev is set to 1, then after mounting the container's rootfs LXC will mount a fresh tmpfs under /dev (limited to 500K by default, unless defined in lxc.autodev.tmpfs.size) and fill in a minimal set of initial devices. This is generally required when starting a container containing a "systemd" based "init" but may be optional at other times.

The quote contains another important hint: the default size of /dev is 500K (shown as 492K in df). That is not much, but it is normally enough, because this tmpfs should only contain symbolic links and device nodes. As soon as "real" files are created in this path, by mistake or on purpose, the file system fills up quickly.
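The difference between 500K and the 492K shown by df can be reproduced with a little arithmetic. This sketch assumes that the default is handed to the tmpfs mount as 500000 bytes and that the kernel rounds a tmpfs size up to whole pages of 4096 bytes; the function name tmpfs_effective_bytes is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: round a requested tmpfs size (in bytes) up to
# whole pages. The page size defaults to 4096 bytes (an assumption).
tmpfs_effective_bytes() {
    page="${2:-4096}"
    echo $(( ($1 + page - 1) / page * page ))
}

tmpfs_effective_bytes 500000                          # prints 503808
echo $(( $(tmpfs_effective_bytes 500000) / 1024 ))    # prints 492
```

Under these assumptions 500000 bytes become 123 pages, or 503808 bytes, which df -h displays as 492K.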

Note: Since LXC 4.0 it is possible to define a size larger than 500K using the LXC config option lxc.autodev.tmpfs.size (as mentioned in the quote). We contributed the relevant code change to the LXC project.
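As an illustration only, a container config using this option could look like the fragment below. The value is an assumption: it is written as a plain number of bytes, and 2000000 is an arbitrary example rather than a recommendation. Check the lxc.container.conf man page of your LXC version for the accepted format.

```
# Container config, e.g. /var/lib/lxc/syslog/config (LXC >= 4.0)
lxc.autodev = 1
# Size of the /dev tmpfs, assumed to be given in bytes
lxc.autodev.tmpfs.size = 2000000
```

A larger /dev only delays the problem described in this article; it does not stop a process from writing regular files into /dev.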

Why is syslog-ng using /dev?

This question can easily be answered by looking at the logged errors from above again:

Aug 24 10:06:01 syslog syslog-ng[42]: Error suspend timeout has elapsed, attempting to write again; fd='26'
Aug 24 10:06:01 syslog syslog-ng[42]: I/O error occurred while writing; fd='26', error='No space left on device (28)'
Aug 24 10:06:01 syslog syslog-ng[42]: Suspending write operation because of an I/O error; fd='26', time_reopen='60'

Syslog-NG complains that it cannot write to fd (file descriptor) 26. Now what exactly is this fd 26? The real path can be revealed by using the /proc filesystem, using the PID of syslog-ng:

root@syslog ~ # pgrep syslog-ng
13072

root@syslog ~ # ls -l /proc/13072/fd/26
l-wx------ 1 root root 64 Aug 23 09:15 /proc/13072/fd/26 -> /dev/tty10

Obviously fd 26 points to /dev/tty10.
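The lookup can be wrapped in a small helper that also reports what kind of file the descriptor points to. This is a sketch, not part of the original troubleshooting session; the function name describe_fd is made up, and it relies on the Linux /proc file system.

```shell
#!/bin/sh
# Hypothetical helper: resolve a file descriptor of a process to its
# path and report whether the target is a device node or a regular file.
# $1 = PID (or "self"), $2 = file descriptor number
describe_fd() {
    target=$(readlink "/proc/$1/fd/$2") || return 1
    if [ -c "$target" ]; then
        kind="character device"
    elif [ -f "$target" ]; then
        kind="regular file"
    else
        kind="other"
    fi
    printf '%s (%s)\n' "$target" "$kind"
}
```

It could be called as describe_fd "$(pgrep -o syslog-ng)" 26. A real TTY would be reported as a character device, so any other result for a path under /dev is a warning sign.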

This tty10 can also be found in syslog-ng's config:

root@syslog ~ # grep tty10 /etc/syslog-ng/syslog-ng.conf
destination d_console_all { file(`tty10`); };

In this case the syslog-ng configuration defines a destination d_console_all that writes to tty10, which is meant as console output for logging. However, there is a small "problem" with TTYs in containers.

TTYs in containers

This article won't be explaining what a TTY is but if you want to know more, the "The TTY demystified" article from Linus Akesson is a great read!

When an LXC container is started, it will automatically create a tmpfs under /dev, as mentioned above. This also includes tty devices, which are used (and needed) for user interaction such as SSH input/output. The number of tty devices is configurable with the configuration option lxc.tty in LXC < 3.0 and lxc.tty.max in LXC >= 3.0.

Default configurations of LXC containers usually include "base configurations". Here on a Debian 9 (stretch) running an older LXC 2.x, the container's config file includes a common configuration file adapted for Debian systems:

root@lxchost ~ # egrep "^lxc.include" -B 1 /var/lib/lxc/syslog/config
# Common configuration
lxc.include = /usr/share/lxc/config/debian.common.conf

This included file in turn includes yet another configuration file:

root@lxchost ~ # cat /usr/share/lxc/config/debian.common.conf
# This derives from the global common config
lxc.include = /usr/share/lxc/config/common.conf
[...]

And finally, inside this common.conf configuration file, the number of TTYs is defined:

root@lxchost ~ # grep tty /usr/share/lxc/config/common.conf
lxc.devttydir = lxc
# Setup 4 tty devices
lxc.tty = 4
### /dev/tty
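Following the lxc.include chain by hand, as done above, can also be scripted. The sketch below is a simplification: the function name lxc_config_lookup is made up, it only follows includes that point to plain files, and it does not guard against include loops.

```shell
#!/bin/sh
# Hypothetical helper: print every line that sets a given key, following
# lxc.include directives recursively. Each match is prefixed with the
# file it was found in. $1 = config file, $2 = key (e.g. lxc.tty)
lxc_config_lookup() {
    [ -f "$1" ] || return 0
    while IFS= read -r line; do
        case "$line" in
            lxc.include[[:space:]=]*)
                inc="${line#*=}"                        # text after "="
                inc="${inc#"${inc%%[![:space:]]*}"}"    # trim leading blanks
                lxc_config_lookup "$inc" "$2"
                ;;
            "$2"[[:space:]=]*)
                printf '%s: %s\n' "$1" "$line"
                ;;
        esac
    done < "$1"
}
```

On the LXC host from this article, lxc_config_lookup /var/lib/lxc/syslog/config lxc.tty would be expected to report the setting found in common.conf. Matches are printed in the order they are read, so when a key is set more than once the last line shown is the one read last.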

This means that (unless overridden in the container's config file) four tty devices are created inside the container's /dev filesystem:

root@syslog ~ # ll /dev/tt*
crw-rw-rw- 1 root root   5, 0 Aug 21  2018 /dev/tty
crw--w---- 1 root tty  136, 0 Aug 21  2018 /dev/tty1
-rw-r----- 1 root adm  503808 Aug 23 09:21 /dev/tty10
crw--w---- 1 root tty  136, 1 Aug 21  2018 /dev/tty2
crw--w---- 1 root tty  136, 2 Aug 21  2018 /dev/tty3
crw--w---- 1 root tty  136, 3 Aug 21  2018 /dev/tty4

As you can see, there are the tty devices tty[1-4], but there is one more entry: tty10. The leading "-" in its mode string (instead of the "c" of the character devices) shows that tty10 is not a special device, it is a regular file! This file was created and written by syslog-ng, because the syslog-ng configuration tells syslog-ng to write the console output to this path. Syslog-NG does not verify whether the path is a special device node or a regular file; it just writes into it until /dev is full, which happens pretty quickly given the 500K capacity. The file size of 503808 bytes is exactly the 492K that df reports for /dev.
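Regular files like this one can be found without reading the listing by eye. A minimal sketch, using a made-up function name regular_files_in:

```shell
#!/bin/sh
# Hypothetical helper: list regular files below a directory (default
# /dev) without descending into other mounted file systems.
regular_files_in() {
    find "${1:-/dev}" -xdev -type f
}
```

Running regular_files_in /dev inside the container should print nothing on a healthy system; in the situation described here it would be expected to print /dev/tty10. The -xdev option keeps find on the /dev tmpfs itself, so separately mounted file systems below /dev are skipped.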

Change the TTY in syslog-ng config

Now that we know there is no /dev/tty10 device in the container, syslog-ng's configuration needs to be adjusted. To use one of the existing TTYs, the path of d_console_all has to point to it:

root@syslog ~ # grep tty10 -A 1 -B 1 /etc/syslog-ng/syslog-ng.conf
#destination d_console_all { file(`tty10`); };
destination d_console_all { file("/dev/tty2"); };

Here the path was set to "/dev/tty2". After removing the regular file /dev/tty10 and restarting syslog-ng, the /dev filesystem was usable again and syslog-ng went back to collecting and storing logs without errors.

root@syslog ~ # systemctl stop syslog-ng

root@syslog ~ # rm /dev/tty10
rm: remove regular file '/dev/tty10'? y

root@syslog ~ # systemctl start syslog-ng

root@syslog ~ # df -h /dev/
Filesystem     Type   Size  Used Avail Use% Mounted on
none           tmpfs  492K     0  492K   0% /dev
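To catch this kind of misconfiguration before /dev fills up, the configured destinations can be checked against the devices that actually exist. The following is a rough sketch with a made-up function name, non_device_destinations. It only understands file() destinations with a double-quoted path under /dev, so it would not resolve the backtick form `tty10` used in the original configuration, and it relies on grep -o, which GNU and BSD grep provide.

```shell
#!/bin/sh
# Hypothetical helper: print every /dev path used in a file("...")
# destination that is not a character device inside this system.
# $1 = syslog-ng config file (defaults to the path used in this article)
non_device_destinations() {
    conf="${1:-/etc/syslog-ng/syslog-ng.conf}"
    grep -v '^[[:space:]]*#' "$conf" |            # ignore commented lines
        grep -oE 'file\("/dev/[^"]+"' |           # file("/dev/...
        sed -E 's/^file\("//; s/"$//' |           # keep only the path
        sort -u |
        while IFS= read -r path; do
            [ -c "$path" ] || printf '%s\n' "$path"
        done
}
```

Called without arguments inside the container, it prints nothing when all such destinations point to existing character devices. A path like /dev/tty10 would be listed both when it is missing and when it already exists as a regular file.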