FS#59749 - [firewalld] Linux 4.18.3.arch1-1 breaks libvirtd/qemu networking (DHCP)

Attached to Project: Community Packages
Opened by wzrd tales (wzrdtales) - Wednesday, 22 August 2018, 09:41 GMT
Last edited by freswa (frederik) - Wednesday, 14 October 2020, 20:50 GMT
Task Type Bug Report
Category Packages
Status Closed
Assigned To Maxime Gauduin (Alucryd)
Architecture All
Severity Medium
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 8
Private No

Details

Description:

Upgrading to Linux 4.18.3.arch1-1, together with all packages related to that upgrade, breaks libvirtd/qemu networking. When the VM is started, it gets stuck requesting an IP via DHCP and can no longer obtain an address.

Additional info:

Package Downgrade that fixed the problem:

[2018-08-22 11:31] [ALPM] downgraded linux (4.18.3.arch1-1 -> 4.17.14.arch1-1)
[2018-08-22 11:31] [ALPM] downgraded tp_smapi (0.43-49 -> 0.43-45)
[2018-08-22 11:31] [ALPM] downgraded acpi_call (1.1.0-155 -> 1.1.0-151)
[2018-08-22 11:31] [ALPM] downgraded virtualbox-host-modules-arch (5.2.18-5 -> 5.2.16-11)


Couldn't find any errors in the logs.




Steps to reproduce:

Upgrade to Linux 4.18.3.arch1-1, create a VM with default networking, try to get an IP via DHCP.



Closed by  freswa (frederik)
Wednesday, 14 October 2020, 20:50 GMT
Reason for closing:  No response
Comment by George Angelopoulos (gangelop) - Friday, 24 August 2018, 12:24 GMT
DNS resolution also stopped working for me.
DNS + DHCP broken => dnsmasq problem?

I haven't tested downgrading yet but I thought I should mention it.
Comment by Jan Martens (JanMa) - Friday, 24 August 2018, 14:54 GMT
This is also affecting Docker containers on my machines, and really anything that tries to use a network bridge.
To reproduce, start any Docker container, connect to it, and try to upgrade packages.

Downgrading to kernel version 4.17.14 also solved the issue for me.
Comment by loqs (loqs) - Friday, 24 August 2018, 15:09 GMT
I would suggest bisecting the kernel and reporting the results upstream.
Comment by Solus (solusoperandi) - Saturday, 25 August 2018, 21:34 GMT
I noticed this problem Thursday evening after updating my kernel to 4.18 the previous day. I have two QEMU VMs managed by libvirt. The first is set up in bridged networking mode and worked normally. The second has the default NAT configuration. When booting Windows in the second VM, DHCP failed and the interface was assigned an APIPA address.

After I disabled firewalld, this machine functioned normally again: DHCP completed successfully and the machine had full networking capabilities. Once I re-enabled firewalld and restarted the machine, DHCP failed again.

I also have a server with several Docker containers, which also appear to be affected. I was able to temporarily fix the issue on my desktop by downgrading the linux and linux-headers packages to 4.17.
Comment by Jan Martens (JanMa) - Sunday, 26 August 2018, 13:44 GMT
Linux 4.18.5 didn't solve the issue. After some more digging, and with the help of Solus's comment, I am now certain that this is a bug in firewalld.
It somehow fails to forward packets originating from a virtual Ethernet bridge like "docker0" to their destination.
I went and opened a bug for the firewalld package. This task can be closed, IMO.
Comment by Daniel Apolinario (dapolinario) - Sunday, 26 August 2018, 19:54 GMT
Try adding the interface to a zone (in the example below I added it to the "public" zone):

firewall-cmd --zone=public --change-interface=virbr0 --permanent

Also add the "dns" and "dhcp" services to this zone (I did this through firewall-config).
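For anyone who prefers the command line over firewall-config, the steps above could be sketched roughly as follows. This is only an illustration: `bridge_zone_setup` is a made-up helper name, the zone and interface defaults are the ones from this comment, and it assumes firewalld's predefined `dhcp` and `dns` services are the ones meant here. It needs to run as root.

```shell
# Hypothetical helper (not part of firewalld): move a bridge into a zone
# and allow DHCP and DNS there. Defaults are taken from the comment above.
bridge_zone_setup() {
    zone="${1:-public}"
    iface="${2:-virbr0}"
    firewall-cmd --permanent --zone="$zone" --change-interface="$iface" &&
    firewall-cmd --permanent --zone="$zone" --add-service=dhcp &&
    firewall-cmd --permanent --zone="$zone" --add-service=dns &&
    # Permanent changes only reach the running firewall after a reload.
    firewall-cmd --reload
}

# Usage (as root):
#   bridge_zone_setup public virbr0
```

Nothing here is specific to virbr0, so the same call with another bridge name (for example docker0) should behave the same way.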
Comment by Maxime Gauduin (Alucryd) - Monday, 27 August 2018, 07:48 GMT
This also contains some workarounds: https://bugzilla.redhat.com/show_bug.cgi?id=1468914
Comment by Jan Martens (JanMa) - Monday, 27 August 2018, 08:12 GMT
Adding the affected interfaces and services to the public zone did not help. But using firewalld 0.5.4 from the official website solves the problem.
Comment by Jan Martens (JanMa) - Monday, 27 August 2018, 09:18 GMT
EDIT: sorry, duplicated comment by accident
Comment by Milan Kubik (apophys) - Monday, 27 August 2018, 09:32 GMT
To my knowledge, the issue is caused by the change of the default firewalld backend to nftables (and/or influenced by some changes in recent kernels; I remember voting for an issue that was opened against the 4.17 kernel but couldn't find it again), as described in https://bbs.archlinux.org/viewtopic.php?id=239362.

After changing the backend back to iptables, DHCP works in my libvirt setup, and I assume it will work with docker as well. However, this is still just a workaround. The underlying issue is that the combination of libvirt/docker and firewalld running on top of nftables doesn't seem to work.
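A minimal sketch of the backend switch, assuming the setting lives in `FirewallBackend=` in `/etc/firewalld/firewalld.conf` (the option firewalld uses to choose between nftables and iptables). `set_firewall_backend` is a hypothetical helper name used only for illustration, and `sed -i` is assumed to be GNU sed.

```shell
# Hypothetical helper (not part of firewalld): set FirewallBackend in a
# firewalld.conf-style file. The default path is an assumption.
set_firewall_backend() {
    backend="$1"
    conf="${2:-/etc/firewalld/firewalld.conf}"
    if grep -q '^FirewallBackend=' "$conf"; then
        # Rewrite the existing setting in place.
        sed -i "s/^FirewallBackend=.*/FirewallBackend=$backend/" "$conf"
    else
        # No explicit setting yet: append one.
        printf 'FirewallBackend=%s\n' "$backend" >> "$conf"
    fi
}

# Usage (as root), then restart so the new backend takes effect:
#   set_firewall_backend iptables
#   systemctl restart firewalld
```

Running `grep '^FirewallBackend' /etc/firewalld/firewalld.conf` before and after is a quick way to confirm which backend is configured.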
Comment by wzrd tales (wzrdtales) - Sunday, 02 September 2018, 19:20 GMT
@JanMa Do you have a link to the bug you opened?
Comment by Jan Martens (JanMa) - Monday, 03 September 2018, 18:06 GMT
@wzrdtales This bug right here has been reassigned to firewalld. The one I opened was therefore closed as a duplicate.
FYI, changing the backend to iptables as apophys suggested did the trick for me too.
Comment by Maxime Gauduin (Alucryd) - Tuesday, 20 August 2019, 16:17 GMT
Is this still happening with the latest firewalld? In any case this sounds like an upstream issue and should be reported to them rather than us.
