FS#9239 - [initscripts] net: network stop doesn't work on two cards bonding two ips

Attached to Project: Arch Linux
Opened by Daniel YC Lin (dlin) - Thursday, 17 January 2008, 02:28 GMT
Last edited by Tom Gundersen (tomegun) - Saturday, 04 June 2011, 18:17 GMT
Task Type Bug Report
Category Arch Projects
Status Closed
Assigned To Aaron Griffin (phrakture)
Thomas Bächler (brain0)
Roman Kyrylych (Romashka)
Tom Gundersen (tomegun)
Architecture All
Severity Low
Priority Normal
Reported Version 2007.08-2
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 1
Private No

Details

Description:
ref:
http://wiki.archlinux.org/index.php/Configuring_network#multiple_ip_on_multiple_card

I set up my PC with eth0 and eth1 bonded, using two IPs. When one card fails, the other takes over,
and both IPs remain available. But I found that when I set 'two IPs on the same interface', the /etc/rc.d/network

Additional info:
* package version(s) 2007.11-2
* config and/or log files etc.


Steps to reproduce:
1. Set up "two IPs on one card" or "two IPs on two cards".
2. /etc/rc.d/network start # this works
3. /etc/rc.d/network stop # this fails

Closed by  Tom Gundersen (tomegun)
Saturday, 04 June 2011, 18:17 GMT
Reason for closing:  No response
Additional comments about closing:  Use (or file bug against) netcfg.
Comment by Daniel YC Lin (dlin) - Thursday, 17 January 2008, 03:26 GMT
I've solved this problem by assuming that only bond0 exists.
vi /etc/rc.d/network
# add following function
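bond_down()
{
    # Detach the slaves of every enabled bond interface.
    for ifline in "${BOND_INTERFACES[@]}"; do
        # Entries starting with "!" are disabled; skip them.
        if [ "$ifline" = "${ifline#!}" ]; then
            eval bondcfg="\$bond_${ifline}"
            # $bondcfg is left unquoted so it splits into the slave names.
            /sbin/ifenslave -d "$ifline" $bondcfg || error=1
        fi
    done
}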

# modify the stop) section around lines 230-242: change the ifdown section to
R_IFACE=() # interfaces in reversed order
for ifline in "${INTERFACES[@]}"; do
    # Prepend, so the last interface started is the first one stopped.
    R_IFACE=("$ifline" "${R_IFACE[@]}")
done
for ifline in "${R_IFACE[@]}"; do
    if [ "$ifline" = "${ifline#!}" ]; then
        if [ "$ifline" = "bond0" ]; then
            bond_down
        fi
        ifdown "$ifline" || error=1
    fi
done

TODO: modify the whole network script so that every stop path shuts the interfaces down in reversed order.
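
A minimal sketch of the reversed stop order described above, kept separate from the real /etc/rc.d/network script. The function name stop_order and the sample interface list are hypothetical; the "!" test is the same "${ifline#!}" comparison used in the snippet above.

```bash
#!/bin/bash
# stop_order IFACE...: print the enabled entries in reverse order,
# one per line. Entries starting with "!" are treated as disabled.
stop_order() {
    local i ifline
    for (( i = $#; i >= 1; i-- )); do
        ifline=${!i}    # the i-th positional argument
        # "${ifline#!}" strips a leading "!"; if nothing was stripped,
        # the entry is enabled.
        if [ "$ifline" = "${ifline#!}" ]; then
            printf '%s\n' "$ifline"
        fi
    done
}

# Hypothetical list: eth1 is disabled, bond0 was started last.
INTERFACES=(eth0 '!eth1' bond0)
stop_order "${INTERFACES[@]}"
```

Printing the order instead of calling ifdown directly makes it possible to check the sequence before anything is actually brought down.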
Comment by Roman Kyrylych (Romashka) - Thursday, 28 February 2008, 11:39 GMT
status in 2008-02?
Comment by Aaron Griffin (phrakture) - Monday, 10 March 2008, 17:36 GMT
Roman, I know you mentioned doing some bonding work recently. Do you know if this is resolved for you?
Comment by Roman Kyrylych (Romashka) - Monday, 10 March 2008, 21:38 GMT
Yes, I was doing a backend for web-based configuration system, but that was on CentOS.
Tomorrow at work I will continue with the same task, and I'm going to install Arch on a VMware server there to test and try to fix those bonding configurations (VMware Server allows up to 4 ethernet interfaces, which is enough to test fairly different bonding configs).
Comment by Aaron Griffin (phrakture) - Friday, 14 March 2008, 18:19 GMT
Do we even need the R_IFACE section? Could we not just always call bond_down on shutdown?
Comment by Thomas Bächler (brain0) - Tuesday, 08 April 2008, 20:35 GMT
I am trying to make sense of the original report, but there doesn't even seem to be one complete sentence in the description. Can anybody explain what this is about?
Comment by Aaron Griffin (phrakture) - Wednesday, 09 April 2008, 18:34 GMT
The way I understand it, bond devices aren't shut down properly, which makes them fail if you restart or whatever. I may be wrong, but I got that by reading the code in the first comment and comparing it to the existing code.
Comment by Daniel YC Lin (dlin) - Thursday, 10 April 2008, 00:48 GMT
Hi, Thomas, the key point is:

The network has to be shut down in REVERSE order when multiple interfaces are attached (especially with bonding).
Comment by Gavin Bisesi (Daenyth) - Wednesday, 04 June 2008, 18:29 GMT
Is this fixed in the latest initscripts?
Comment by Roman Kyrylych (Romashka) - Friday, 08 August 2008, 12:20 GMT
First of all, I apologize for the huge delay on this issue,
real life, lack of time and laziness...

Back on topic:
the problem turned out to be very simple, and has nothing to do with real bonding.

suppose we have this in rc.conf (one card - two IP addresses):
eth0="eth0 ..."
eth1="eth0:1 ..."
INTERFACES=(eth0 eth1)

`/etc/rc.d/network stop` will try to stop eth0 *_and_eth1_*, which does not exist
(if you run ifconfig with the configuration above, you'll see eth0 and eth0:1 but no eth1),
so /etc/rc.d/network prints FAIL even though eth0 and eth0:1 are stopped correctly.

The solution is either to not FAIL when a non-existent device cannot be brought down,
or to improve the way this type of configuration is defined:
eth0=("eth0 ..."
"eth0:1 ...")
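
To illustrate the mismatch between the rc.conf variable name and the device it configures, here is a rough sketch. The helper device_for and the addresses are hypothetical, and it assumes that the first word of a static entry is the device name while a "dhcp" entry uses the variable name itself.

```bash
#!/bin/bash
# Hypothetical rc.conf-style settings: the variable eth1 configures
# the alias eth0:1, not a device called eth1.
eth0="eth0 192.168.0.2 netmask 255.255.255.0"
eth1="eth0:1 192.168.0.3 netmask 255.255.255.0"

# device_for NAME: print the device that the entry NAME configures.
device_for() {
    local ifcfg=${!1}    # indirect lookup of the variable named by $1
    case $ifcfg in
        dhcp|"") printf '%s\n' "$1" ;;            # no device in the value
        *)       printf '%s\n' "${ifcfg%% *}" ;;  # first word of the value
    esac
}

device_for eth0
device_for eth1
```

With a lookup like this, the stop path could act on eth0:1 rather than on the non-existent eth1.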
Comment by Roman Kyrylych (Romashka) - Friday, 08 August 2008, 12:25 GMT
There are also some issues with real bonding (I guess Daniel experienced two different issues at the same time),
I think slaves should be unbonded first before shutting down the master interface.
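
A dry-run sketch of that ordering, slaves first and master last. It only prints the commands it would run. The function name bond_stop_plan and the SYSFS_NET override are hypothetical; the slave list is read from the bonding driver's sysfs file bonding/slaves, assuming the driver is loaded and exposes it.

```bash
#!/bin/bash
# SYSFS_NET can be overridden to try the function against a fake tree.
SYSFS_NET=${SYSFS_NET:-/sys/class/net}

# bond_stop_plan BOND: print the commands that would release each slave
# and then bring the master down. Nothing is executed.
bond_stop_plan() {
    local bond=$1 slaves_file slave
    slaves_file="$SYSFS_NET/$bond/bonding/slaves"
    if [ -r "$slaves_file" ]; then
        # The file holds the slave names separated by spaces.
        for slave in $(cat "$slaves_file"); do
            echo "/sbin/ifenslave -d $bond $slave"
        done
    fi
    echo "/sbin/ifconfig $bond down"
}

bond_stop_plan bond0
```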
Comment by Daniel YC Lin (dlin) - Friday, 08 August 2008, 13:42 GMT
Yes, Roman, I mean that we should release the interfaces in INVERSE order.
Otherwise, in my bonding setup, 'network start' works but 'network stop' does not.
Comment by adam stokes (boris) - Sunday, 06 December 2009, 18:31 GMT
My rc.conf:

eth1="dhcp"
eth1_1="eth1:1 192.168.1.200 netmask 255.255.255.0 broadcast 192.168.1.255"
eth1_2="eth1:2 192.168.1.201 netmask 255.255.255.0 broadcast 192.168.1.255"
eth1_3="eth1:3 192.168.1.202 netmask 255.255.255.0 broadcast 192.168.1.255"
INTERFACES=(eth1 eth1_1 eth1_2 eth1_3)

The problem still exists in initscripts whether it be bonding or ip aliasing. Has there been any progress on this at all?

Thanks,
Comment by adam stokes (boris) - Monday, 07 December 2009, 01:21 GMT
Can someone have a look at this patch:

--- network-orig 2009-12-06 13:35:15.000000000 -0500
+++ network 2009-12-06 20:19:41.000000000 -0500
@@ -19,8 +19,6 @@
return 1
fi

- /sbin/ifconfig $1 up
-
wi_up $1 || return 1

eval ifcfg="\$${1}"
@@ -65,7 +63,11 @@
fi
fi
# Always bring the interface itself down
- /sbin/ifconfig ${1} down >/dev/null 2>&1
+ # Ignore aliases
+ `echo ${1} | grep -q '_'`
+ if [ $? -gt 0 ]; then
+ /sbin/ifconfig ${1} down >/dev/null 2>&1
+ fi
return $?
}
Comment by adam stokes (boris) - Monday, 07 December 2009, 01:22 GMT
I'd like to get an answer on why we run /sbin/ifconfig $1 up in the ifup function without doing any tests at all.

Also, I think it's a good idea to ignore aliased interfaces altogether, since bringing down the parent interface will automatically remove the aliased interfaces.

Thanks
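
One possible way to recognise aliases is sketched below. It is not what the patch above does: the patch greps the rc.conf variable name for "_", whereas this hypothetical is_alias helper looks for ":" in the device name, so it does not depend on how the variables are named.

```bash
#!/bin/bash
# is_alias DEVICE: succeed if DEVICE looks like an alias such as eth1:1.
is_alias() {
    case $1 in
        *:*) return 0 ;;
        *)   return 1 ;;
    esac
}

# Dry run over example device names: report what would happen to each.
for dev in eth1 eth1:1 eth1:2; do
    if is_alias "$dev"; then
        echo "$dev: alias, skipped"
    else
        echo "$dev: would run /sbin/ifconfig $dev down"
    fi
done
```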
Comment by Tom Gundersen (tomegun) - Wednesday, 27 April 2011, 10:19 GMT
We are considering limiting the scope of the network initscript so that it no longer covers this use case. Is there any reason you cannot use netcfg instead?
