Arch Linux

FS#55149 - [linux] 4.12.7 hangs when instance of systemd-nspawn is run

Attached to Project: Arch Linux
Opened by Vladimir (_v_l) - Tuesday, 15 August 2017, 06:28 GMT
Last edited by Doug Newgard (Scimmia) - Thursday, 24 August 2017, 12:40 GMT
Description: my system (Arch Linux x86_64, Intel i5 2410M) hangs when I start a systemd-nspawn instance.

Additional info:
* package version(s):

kernel 4.12.7
systemd 234.11-8

* config and/or log files etc.

$ cat /etc/systemd/nspawn/node1-smoon4.nspawn


$ cat /etc/systemd/system/systemd-nspawn@node1-smoon4.service.d/memory.conf

Steps to reproduce:

It is 100% reproducible, but there is no single reliable recipe. I can only describe the steps that led to the hang:
1. I upgraded packages today (systemd 234.11-8 and kernel 4.12.7), rebooted, logged into awesome (SDDM -> awesome), started a terminal (urxvt-unicode) and started the systemd-nspawn instance. After a second the system became unresponsive.
2. After 3 minutes the laptop rebooted. I repeated the above steps three times, with the same result.
3. After the third time I decided to start another systemd-nspawn instance:

$ cat /etc/systemd/nspawn/node2-smoon4.nspawn



$ cat /etc/systemd/system/systemd-nspawn@node2-smoon4.service.d/memory.conf
# MemoryHigh=500M
# MemoryMax=900M
# MemorySwapMax=10M

At first all went well. Then I started the first systemd-nspawn instance, and for the first few moments everything was fine, so I decided to start Firefox and Chromium, and then the system hung again.

I checked the SSD (SMART), tested the memory and finally tried downgrading some packages. I tested with systemd 234.11-6 and kernel 4.12.6, and it seems that only the kernel matters. Right now I use the combination of systemd 234.11-8 and kernel 4.12.6, and it is stable.

The strange thing is that the same kernel and systemd versions (4.12.7, 234.11-8) seem to work fine on my other hosts: Intel i5 6200 (SSD disk), Intel i5 4570 and Intel i5 7400. All hosts have nearly the same configuration (SSD disk, systemd-nspawn configuration, XFS), OS and packages.

Vladimir Lomov

P.S. Should I report upstream? If so should I report to ML or bug tracker?

Comment by Vladimir (_v_l) - Wednesday, 16 August 2017, 02:53 GMT
I think I found the cause, and my previous assumption (about which kernel version is affected by the bug) was wrong. But first some additional information:

- the systemd-nspawn instance node1-smoonX is used to run the Yandex.Disk daemon (to synchronize files and directories);
- before kernel 4.12 I had used the linux-ck kernel with the bfq scheduler for some time (more than half a year);
- since bfq was merged into kernel 4.12 and CK had not provided his patches for that version, I decided to try the mainline kernel (linux from the repo);
- so I ran kernel 4.12.5 on all my hosts and it worked fine (with the node1-smoonX instance);
- I have several hosts with SSD and HDD disks; the system is installed on the SSD, and until recently all disks used bfq as the scheduler (it was set using a udev rule);
- the host that first hung for no apparent reason other than starting the node1-smoonX systemd-nspawn instance was smoon4 (hence the instance is named node1-smoon4), see my initial report;
- the other hosts were fine until today. Today I tried to install linux-ck 4.12.7-ck2 (CK has released his patches for kernel 4.12) on host smoon2, and it hung after starting the instance node1-smoon2. I removed linux-ck and tried again with the mainline kernel, and it hung too, but this time I got information from journald (journalctl -k -f), see the attached file. According to the kernel messages, some problem occurs in bfq and that hangs the kernel;
- I changed the scheduler to 'kyber' for HDD and 'mq-deadline' for SSD, and now everything works fine.
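
For anyone who wants to check which scheduler a disk is using, or to try the same workaround, here is a rough sketch. It is not the udev rule mentioned above. It assumes the usual sysfs files /sys/block/<dev>/queue/scheduler and /sys/block/<dev>/queue/rotational, assumes disks named sdX, and assumes that kyber and mq-deadline are available in the running kernel. The function names are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helpers, not taken from the original report.

# Extract the active scheduler from a sysfs line such as
# "mq-deadline kyber [bfq] none" (the active one is in brackets).
active_scheduler() {
    printf '%s\n' "$1" | sed -n 's/.*\[\(.*\)\].*/\1/p'
}

# Map the sysfs "rotational" flag to the scheduler choice described above:
# 1 = rotating disk (HDD) -> kyber, 0 = non-rotating (SSD) -> mq-deadline.
pick_scheduler() {
    case "$1" in
        1) echo kyber ;;
        0) echo mq-deadline ;;
        *) return 1 ;;
    esac
}

# Print (do not execute) the command that would switch each sdX disk.
# The optional first argument replaces /sys, which is handy for dry runs.
plan_schedulers() {
    sysroot="${1:-/sys}"
    for q in "$sysroot"/block/sd*/queue; do
        [ -r "$q/rotational" ] || continue
        sched=$(pick_scheduler "$(cat "$q/rotational")") || continue
        echo "echo $sched > $q/scheduler"
    done
}
```

Calling plan_schedulers without arguments only prints the commands; running them requires root, and the change does not survive a reboot unless it is also put into a udev rule or similar.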

I'll try to report the bfq behavior upstream.
Comment by Vladimir (_v_l) - Wednesday, 16 August 2017, 02:54 GMT
Part of the dmesg output (copied and pasted from the terminal) on the smoon2 host.
Comment by Vladimir (_v_l) - Tuesday, 22 August 2017, 13:21 GMT
This is an upstream bug in BFQ.

I opened a bug ticket. After I searched the bfq-iosched Google group, I found that this is a known issue: !topic/bfq-iosched/2odL08qoPS0, !topic/bfq-iosched/7I3DnJ2BuQ8, !topic/bfq-iosched/H_92hgaqgIQ.

I managed to "resolve" it by applying patches I found on the linux-block mailing list (see the bug report and the last thread on bfq-iosched). These patches will be in kernel 4.14, so I think this task can be closed after kernel 4.14 is released.
Comment by loqs (loqs) - Tuesday, 21 November 2017, 19:40 GMT
Does the issue still occur with linux 4.14.1-1 now in testing?
Comment by Vladimir (_v_l) - Wednesday, 22 November 2017, 00:35 GMT
I'm not sure, because due to this problem I had to build a kernel with the BFQ patches found on linux-block. I hope to stop building my own kernel when 4.15 is released (I'm watching the linux-block ML, and some new BFQ patches have been published that will definitely be in 4.15).