FS#36749 - [archiso] dhcpcd fails to start when using PXE boot on the archiso media.
Attached to Project:
Release Engineering
Opened by Alejandro Liu (alejandro_liu) - Saturday, 31 August 2013, 12:31 GMT
Last edited by David Runge (dvzrv) - Monday, 29 March 2021, 17:31 GMT
Details
In archiso-2013.08.01, dhcpcd fails to start properly.
The previous version that I tested was 2013.05.01, and that
worked properly.
With the new "Predictable Interface Names", udev renames network interfaces, which does not work in archiso. In archiso, when you boot using PXE, the network interface is initialized by the "archiso_pxe_common" hook. When systemd-udev tries to rename the network interface to the "predictable interface name", it fails. Later, a udev rule in "/etc/udev/rules.d/81-dhcpcd.rules" enables dhcpcd on the "predictable interface name", but this fails because no interface with that name exists.
Closed by David Runge (dvzrv)
Monday, 29 March 2021, 17:31 GMT
Reason for closing: Fixed
Additional comments about closing: Fixed with https://gitlab.archlinux.org/archlinux/archiso/-/merge_requests/106
udev is unable to rename ethX to these names due to the fact that the interface is up and running (having been configured by the pxe hook). So when it tries it says that the device is busy.
You're wrong. The rule creates dhcpcd instances on ADD events for network interfaces. It does *not* explicitly cater to the renamed interface names.
> udev is unable to rename ethX to these names due to the fact that the interface is up and running
Right, and the fact that it can't rename anything is fine.
You seem to be stating the following:
1) udev tries to rename the interface and it can't. Therefore, the interface remains eth0, but dhcpcd fails to start on eth0.
2) udev doesn't try to rename the interface. Therefore, the interface remains eth0 and dhcpcd starts on eth0.
...which is quite contradictory. There's always going to be an ADD event for eth0 which pulls in a dhcpcd instance for eth0, and your bug report doesn't make it clear why in one case it fails, and the other case it succeeds.
If I do NOT use "net.ifnames=0", then udev first tries to rename the interface to the "predictable name" (and fails), and then "dhcpcd" tries to start on the "predictable name" (apparently it does not know that the interface rename failed) and it also fails.
I admit, I am not too familiar with udev and launching "ADD" events, but if I boot without "net.ifnames=0", then "systemctl | grep dhcpcd" shows that dhcpcd is configured on enp0s3 (the predictable name).
If I boot with "net.ifnames=0", then "systemctl | grep dhcpcd" shows that dhcpcd is configured on eth0.
My assumption here is that the "/etc/udev/rules.d/81-dhcpcd.rules" is changing the dhcpcd configuration.
Either way, the "net.ifnames=0" workaround works, so I am happy.
You've yet to post any errors from dhcpcd (or anywhere else) -- care to share?
So what is going on is this:
1. archiso_pxe_common configures the network as "eth0" in the initramfs.
2. systemd and udev get started.
3. udevd tries to rename "eth0" to "enp3s0" and fails.
4. systemd starts dhcpcd@enp3s0 and also fails.
In this situation:
1. archiso_pxe_common configures the network as "eth0" in the initramfs.
2. systemd and udev get started.
3. net.ifnames=0 prevents udev from trying to rename "eth0" to "enp3s0".
4. systemd starts dhcpcd@eth0 and succeeds.
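The difference between the two sequences comes down to whether the interface name that dhcpcd is started for actually exists. A minimal sketch of that check follows; the function name `iface_exists` is hypothetical, and the optional second argument only exists so the check can be pointed at a directory other than /sys/class/net.

```shell
# Hypothetical helper, not part of archiso or dhcpcd.
# $1: interface name to look for
# $2: optional directory to search (defaults to /sys/class/net)
# Returns 0 if an entry with that name exists, non-zero otherwise.
iface_exists() {
    [ -e "${2:-/sys/class/net}/$1" ]
}
```

On a PXE-booted system where the rename failed, `iface_exists eth0` would be expected to succeed and `iface_exists enp3s0` to fail, which matches step 4 of the first sequence.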
There is a copytoram=n option that would prevent that from working, so people who use that option would still have to use the net.ifnames=0 workaround.
Tested on two x64 machines: one desktop and one Lenovo ThinkPad X200.
Tested 2 isos:
archlinux-2015.11.01-dual.iso
archlinux-2015.10.01-dual.iso
Nov 01 16:36:30 archiso systemd-udevd[345]: Error changing net interface name 'eth0' to 'enp6s0': Device or resource busy
Nov 01 16:36:30 archiso systemd-udevd[345]: could not rename interface '2' from 'eth0' to 'enp6s0': Device or resource busy
...
Nov 01 16:36:31 archiso dhcpcd[393]: enp6s0: interface not found or invalid
Nov 01 16:36:31 archiso dhcpcd[393]: dhcpcd exited
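To check whether a given boot hit the same failure, the udev message quoted above can be extracted from the journal. This is only a sketch: the function name `failed_renames` is made up for illustration, and the pattern assumes the message wording shown in the log lines above.

```shell
# Hypothetical helper: read log lines on stdin and print "<old> <new>"
# for every interface rename that udev reported as failed with
# "Device or resource busy".
failed_renames() {
    sed -n "s/.*Error changing net interface name '\([^']*\)' to '\([^']*\)': Device or resource busy.*/\1 \2/p"
}
```

It could be fed from the current boot's journal, for example `journalctl -b | failed_renames`.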
I updated the pxe wiki article so the workaround is documented.
If there is something that needs testing, I'd be happy to do it. One idea is to put net.ifnames=0 in the network boot grub menu items.
Thanks.
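For anyone scripting their own network boot menu entries, the workaround amounts to appending one parameter to the kernel command line. The sketch below shows that step on a plain string; `add_ifnames_workaround` is a hypothetical name, and a command line that already carries any net.ifnames= setting is deliberately left untouched.

```shell
# Hypothetical helper: print the given kernel command line with
# net.ifnames=0 appended, unless it already has a net.ifnames= setting.
# $1: kernel command line as a single string
add_ifnames_workaround() {
    case " $1 " in
        *" net.ifnames="*) printf '%s\n' "$1" ;;
        *) printf '%s\n' "$1 net.ifnames=0" ;;
    esac
}
```

For example, `add_ifnames_workaround 'archisobasedir=/arch ip=:::::eth0:dhcp'` prints the same line with ` net.ifnames=0` at the end.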
Linux archiso 4.9.11-1-ARCH #1 SMP PREEMPT Sun Feb 19 13:45:52 UTC 2017 x86_64 GNU/Linux
Without net.ifnames at all:
# systemctl status dhcpcd@enp0s25.service
● dhcpcd@enp0s25.service - dhcpcd on enp0s25
Loaded: loaded (/usr/lib/systemd/system/dhcpcd@.service; disabled; vendor preset: disabled)
Active: failed (Result: exit-code) since Wed 2017-03-15 12:37:32 UTC; 6min ago
Process: 390 ExecStart=/usr/bin/dhcpcd -q -w %I (code=exited, status=1/FAILURE)
Mar 15 12:37:32 archiso systemd[1]: Starting dhcpcd on enp0s25...
Mar 15 12:37:32 archiso systemd[1]: dhcpcd@enp0s25.service: Control process exited, code=exited status=1
Mar 15 12:37:32 archiso systemd[1]: Failed to start dhcpcd on enp0s25.
Mar 15 12:37:32 archiso systemd[1]: dhcpcd@enp0s25.service: Unit entered failed state.
Mar 15 12:37:32 archiso systemd[1]: dhcpcd@enp0s25.service: Failed with result 'exit-code'.
# cat /proc/cmdline
BOOT_IMAGE=(tftp)/archiso/arch/boot/x86_64/vmlinuz archisobasedir=/arch archiso_http_srv=192.168.10.22 ip=:::::eth0:dhcp
# cat /etc/resolv.conf
#
# /etc/resolv.conf
#
#search <yourdomain.tld>
#nameserver <ip>
# End of file
With net.ifnames=1:
# systemctl status dhcpcd@enp0s25.service
● dhcpcd@enp0s25.service - dhcpcd on enp0s25
Loaded: loaded (/usr/lib/systemd/system/dhcpcd@.service; disabled; vendor preset: disabled)
Active: failed (Result: exit-code) since Wed 2017-03-15 12:46:35 UTC; 32s ago
Process: 380 ExecStart=/usr/bin/dhcpcd -q -w %I (code=exited, status=1/FAILURE)
Mar 15 12:46:35 archiso systemd[1]: Starting dhcpcd on enp0s25...
Mar 15 12:46:35 archiso systemd[1]: dhcpcd@enp0s25.service: Control process exited, code=exited status=1
Mar 15 12:46:35 archiso systemd[1]: Failed to start dhcpcd on enp0s25.
Mar 15 12:46:35 archiso systemd[1]: dhcpcd@enp0s25.service: Unit entered failed state.
Mar 15 12:46:35 archiso systemd[1]: dhcpcd@enp0s25.service: Failed with result 'exit-code'.
# cat /proc/cmdline
BOOT_IMAGE=(tftp)/archiso/arch/boot/x86_64/vmlinuz archisobasedir=/arch archiso_http_srv=192.168.10.22 ip=:::::eth0:dhcp net.ifnames=1
# cat /etc/resolv.conf
#
# /etc/resolv.conf
#
#search <yourdomain.tld>
#nameserver <ip>
# End of file
With net.ifnames=0:
# systemctl status dhcpcd@eth0.service
● dhcpcd@eth0.service - dhcpcd on eth0
Loaded: loaded (/usr/lib/systemd/system/dhcpcd@.service; disabled; vendor preset: disabled)
Active: active (running) since Wed 2017-03-15 12:53:53 UTC; 35s ago
Process: 388 ExecStart=/usr/bin/dhcpcd -q -w %I (code=exited, status=0/SUCCESS)
Main PID: 463 (dhcpcd)
Tasks: 1 (limit: 4915)
CGroup: /system.slice/system-dhcpcd.slice/dhcpcd@eth0.service
└─463 /usr/bin/dhcpcd -q -w eth0
Mar 15 12:53:53 archiso dhcpcd[388]: DUID 00:01:00:01:20:5b:f5:e1:3c:97:0e:0f:c9:5a
Mar 15 12:53:53 archiso dhcpcd[388]: eth0: IAID 0e:0f:c9:5a
Mar 15 12:53:53 archiso dhcpcd[388]: eth0: soliciting a DHCP lease
Mar 15 12:53:53 archiso dhcpcd[388]: eth0: offered 192.168.10.101 from 192.168.10.22
Mar 15 12:53:53 archiso dhcpcd[388]: eth0: leased 192.168.10.101 for 14400 seconds
Mar 15 12:53:53 archiso dhcpcd[388]: eth0: adding route to 192.168.10.0/24
Mar 15 12:53:53 archiso dhcpcd[388]: eth0: adding default route via 192.168.10.1
Mar 15 12:53:53 archiso systemd[1]: Started dhcpcd on eth0.
Mar 15 12:53:53 archiso dhcpcd[463]: eth0: soliciting an IPv6 router
Mar 15 12:54:05 archiso dhcpcd[463]: eth0: no IPv6 Routers available
# cat /proc/cmdline
BOOT_IMAGE=(tftp)/archiso/arch/boot/x86_64/vmlinuz archisobasedir=/arch archiso_http_srv=192.168.10.22 ip=:::::eth0:dhcp net.ifnames=0
# cat /etc/resolv.conf
# Generated by resolvconf
domain asgardsrealm.net
nameserver 192.168.10.22
nameserver 192.168.10.2
...
Mar 15 13:22:07 archiso systemd-udevd[302]: Error changing net interface name 'eth0' to 'enp0s25': Device or resource busy
Mar 15 13:22:07 archiso systemd-udevd[302]: could not rename interface '2' from 'eth0' to 'enp0s25': Device or resource busy
...
Near as I can tell the intent is something like this:
$ diff -u /usr/lib/initcpio/hooks/archiso_pxe_common.orig /usr/lib/initcpio/hooks/archiso_pxe_common
--- /usr/lib/initcpio/hooks/archiso_pxe_common.orig 2017-04-07 13:51:43.999839112 -0700
+++ /usr/lib/initcpio/hooks/archiso_pxe_common 2017-04-07 13:52:09.536574991 -0700
@@ -68,5 +68,7 @@
elif [[ "${copy_resolvconf}" != "n" && -f /etc/resolv.conf ]]; then
cp /etc/resolv.conf /new_root/etc/resolv.conf
fi
+ ln -s /dev/null /new_root/etc/udev/rules.d/80-net-name-slot.rules
+ rm /new_root/etc/udev/rules.d/81-dhcpcd.rules
fi
}
While this change does appear to prevent the errors around attempting (and failing) to rename the interface, it does *not* result in a functional system.
root@archiso ~ # systemctl status dhcpcd@eth0.service
● dhcpcd@eth0.service - dhcpcd on eth0
Loaded: loaded (/usr/lib/systemd/system/dhcpcd@.service; disabled; vendor pre
Active: inactive (dead)
root@archiso ~ # systemctl list-units | grep dhcp | wc -l :(
0
root@archiso ~ # cat /etc/resolv.conf
#
# /etc/resolv.conf
#
#search <yourdomain.tld>
#nameserver <ip>
# End of file
if [[ -n "${bootif_dev}" ]]; then
ip addr flush dev "${bootif_dev}"
ip link set "${bootif_dev}" down
fi
I have created a patch that resets the IP configuration on all network interfaces, so it does not need BOOTIF to be set, and the DHCP client in the final root file system can run again.
You can find the patch and all the details here:
- https://gitlab.com/fdupoux/sysresccd-src/-/tree/master/patches
- https://gitlab.com/fdupoux/sysresccd-src/-/issues/19
Could you please consider applying this patch (and ideally my other unrelated patch) to the archiso sources?
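As a rough illustration of the idea only (this is not the linked patch, whose contents are not reproduced here), resetting every interface instead of just the BOOTIF one could look like the sketch below. The function names, the dry-run switch and the skipping of "lo" are assumptions; the two ip commands are the ones from the snippet above.

```shell
# Hypothetical helper: list every interface directory under a sysfs-style
# directory, skipping loopback.
# $1: optional directory to scan (defaults to /sys/class/net)
list_ifaces() {
    for path in "${1:-/sys/class/net}"/*; do
        # Skip non-directories (and the literal pattern if nothing matched).
        [ -d "${path}" ] || continue
        name=${path##*/}
        [ "${name}" = "lo" ] || printf '%s\n' "${name}"
    done
}

# Hypothetical helper: flush addresses and bring down every listed interface.
# $1: "run" to execute the commands; any other value only prints them
# $2: optional directory to scan, passed through to list_ifaces
reset_ifaces() {
    list_ifaces "$2" | while read -r dev; do
        if [ "$1" = "run" ]; then
            ip addr flush dev "${dev}"
            ip link set "${dev}" down
        else
            printf 'ip addr flush dev %s\nip link set %s down\n' "${dev}" "${dev}"
        fi
    done
}
```

Calling `reset_ifaces dry` prints the commands without touching the network, which is a safer way to see what `reset_ifaces run` would do.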
I will look into these fixes for sure! Thanks for providing them (I already saw you offered pull-requests for this on github [1]).
We'll switch to our gitlab soonish though (outside collaboration is not yet possible due to technical reasons) and github will really only be a read-only mirror.
That being said: The fixes seem like something we will want to include!
[1] https://github.com/archlinux/archiso/pulls