FS#17147 - [netcfg] 2.5rc1 ignores ip settings

Attached to Project: Arch Linux
Opened by Tomas M. (eldragon) - Sunday, 15 November 2009, 16:05 GMT
Last edited by James Rayner (iphitus) - Tuesday, 23 February 2010, 11:43 GMT
Task Type Bug Report
Category Packages: Testing
Status Closed
Assigned To James Rayner (iphitus)
Architecture All
Severity High
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 1
Private No

Details

Description:

ive followed the upgrade path to the new netcfg 2.5 rc1 in testing.
when starting the daemon net-auto-wireless it reports success, but it fails to set up the network.

if, after running the daemon, i run "dhcpcd wlan0", it finishes what netcfg left off (acquires an ip).

ive tested both, static and dhcp settings in the profile.

package: netcfg 2.5 rc1 from testing

relevant snippets from rc.conf
--------------
eth0="dhcp" #eth0 192.168.0.2 netmask 255.255.255.0 broadcast 192.168.0.255"
INTERFACES=(!eth0)
gateway="default gw 192.168.0.1"
ROUTES=(!gateway)
NETWORKS=()
WIRELESS_INTERFACE="wlan0"
WIRED_INTERFACE="eth0"
DAEMONS=(syslog-ng hal network crond @net-auto-wireless @netfs @samba @alsa @cpufreq @laptop-mode @cups @pkgdd)
------------------

my network profile in /etc/network.d/my_profile
----------------------
CONNECTION="wireless"
DESCRIPTION="coneccion casa"
INTERFACE=wlan0
SCAN="yes"
SECURITY="wep"
ESSID="some_essid"
KEY="SOME_HEX_WEP_KEY"
IP="dhcp"
#IP="static"
#ADDR="192.168.10.11"
#GATEWAY="192.168.10.1"
#DNS=("192.168.10.1")
------------

when trying a static ip, i uncommented the static settings in the profile and commented out the dhcp line


Closed by  James Rayner (iphitus)
Tuesday, 23 February 2010, 11:43 GMT
Reason for closing:  Fixed
Comment by Michael Bentlage (mib1982) - Tuesday, 17 November 2009, 09:55 GMT
Maybe helpful. If I call 'netcfg2 profile-name' as root the connection is also established.
Comment by Tomas M. (eldragon) - Tuesday, 17 November 2009, 11:50 GMT
what a coincidence, i was about to write that. this morning, in order to skip scanning, i just ran "netcfg profile" and it did set up my static ip. the issue appears to be with net-auto-wireless rather than netcfg itself (??)
Comment by Vincent Van Houtte (zenlord) - Tuesday, 17 November 2009, 12:33 GMT
In your rc.conf I see that you didn't comment out the old settings for the network-daemon. This could interfere with netcfg.

IIRC you should comment out the first 5 lines of your rc.conf and disable the network-daemon.

Also, instead of constantly changing 1 profile, you could easily make a second (static) profile...
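A separate static profile could look like the sketch below. It reuses the values from the profile in the original report with the static lines uncommented; the file name `my_profile_static` and the amended DESCRIPTION are assumptions made for illustration.

```shell
# /etc/network.d/my_profile_static  (hypothetical file name)
# Same wireless settings as my_profile, but with a fixed address
# instead of DHCP.
CONNECTION="wireless"
DESCRIPTION="coneccion casa (static)"
INTERFACE=wlan0
SCAN="yes"
SECURITY="wep"
ESSID="some_essid"
KEY="SOME_HEX_WEP_KEY"
IP="static"
ADDR="192.168.10.11"
GATEWAY="192.168.10.1"
DNS=("192.168.10.1")
```

Keeping both files avoids editing one profile back and forth while testing.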
Comment by Vincent Van Houtte (zenlord) - Tuesday, 17 November 2009, 12:41 GMT
Of course I meant to say 'first 5 relevant snippets' you had shown and not the first 5 lines of your rc.conf-file...
Comment by Tomas M. (eldragon) - Tuesday, 17 November 2009, 15:54 GMT
i dont see how they would interfere since they are set for eth0 and are actually disabled..

gonna comment them anyway ;)
Comment by James Rayner (iphitus) - Wednesday, 18 November 2009, 00:02 GMT
ok, a few things that can be done.

On a normal boot as you have configured now, when it does not work, have a look in /var/log/daemon.log for any entries, relating to that particular boot, from wpa_actiond and dhcpcd. You can tell they're from the current boot by timestamp. Do this before fixing it with dhcpcd.


Then there's 2 tests:

After boot with the current config, where you do not have a functional connection, please provide the output of
wpa_cli status
netcfg status
pgrep dhcp
This should identify whether netcfg is somehow returning success yet not providing an IP.

Second, test only that profile on boot:
Set:
NETWORKS=(my_profile)
And ensure the only networking related daemon in your rc.conf is 'net-profiles'. Don't background it.
On boot, watch the output carefully and post it exactly - especially if it mentions dhcp at all.
If the network is not functional, run the 3 commands above and post the output.
This should identify whether it's a problem within netcfg and not the auto stuff.
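The checks requested above could be gathered in one go with something like the following sketch. The function name `collect_netcfg_diag` is a made-up helper, not part of netcfg; only `wpa_cli status`, `netcfg status`, `pgrep dhcp` and /var/log/daemon.log come from this comment.

```shell
#!/bin/bash
# Sketch: gather the requested diagnostics into one report.
# collect_netcfg_diag is a hypothetical helper name, not part of netcfg.
collect_netcfg_diag() {
    local log="${1:-/var/log/daemon.log}"

    echo "== wpa_actiond / dhcpcd entries in $log =="
    # Print only lines from the two daemons of interest; the ones from the
    # current boot still have to be picked out by timestamp.
    grep -E 'wpa_actiond|dhcpcd' "$log" 2>/dev/null || echo "(no entries)"

    local cmd
    for cmd in "wpa_cli status" "netcfg status" "pgrep dhcp"; do
        echo "== $cmd =="
        # Skip tools that are not installed instead of aborting.
        if command -v "${cmd%% *}" >/dev/null 2>&1; then
            $cmd 2>&1 || echo "(no output or non-zero exit)"
        else
            echo "(${cmd%% *} not found)"
        fi
    done
}
```

Run it as root right after a failed boot, before repairing the connection by hand, for example `collect_netcfg_diag > /tmp/netcfg-diag.txt`.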
Comment by Tomas M. (eldragon) - Wednesday, 18 November 2009, 02:33 GMT
checked daemon.log

only line worth mentioning:
-------
Nov 17 23:14:44 lappy wpa_actiond[2577]: Starting wpa_actiond session for interface 'wlan0'
-------

output of wpa_cli status (run with sudo)
------
Selected interface 'wlan0'
wpa_state=SCANNING
------

both netcfg status and pgrep dhcp dont give any output


second test:
bringing the profile up fails: wireless network not present..

ran the test again, but set the profile to SCAN="no" and it did bring the interface up successfully, including the ip.

i must say that if i run "/etc/rc.d/net-auto-wireless restart" with SCAN="yes" in the profile, it brings the connection up successfully, yet the ip is missing. (a "dhclient wlan0" fixes it)
Comment by Vincent Van Houtte (zenlord) - Wednesday, 18 November 2009, 07:11 GMT
With my home network, I have better results with dhclient than with dhcpcd. Setting DHCLIENT="yes" in your profile will make netcfg choose dhclient over dhcpcd.
Comment by Tomas M. (eldragon) - Wednesday, 18 November 2009, 11:36 GMT
im using a static ip, so this will be of no use :(
Comment by Michael Bentlage (mib1982) - Thursday, 19 November 2009, 22:57 GMT
Current state of my connection is as follows: During a normal boot, with the radio killswitch turned off (= wireless-card turned on), the dhcp-lease fails. Today I had the radio killswitch turned on (= wireless card turned off) during boot, and when I pushed it after logging in to X, the connection was established instantly.

Here are relevant snippets of rc.conf and of my network profile:

rc.conf
WIRELESS_INTERFACE="wlan0"
DAEMONS=(syslog-ng cpufreq net-rename hal @alsa @net-auto-wireless @mysqld @crond @laptop-mode)

network profile:
CONNECTION="wireless"
INTERFACE=wlan0
SCAN="yes"
SECURITY="wpa"
ESSID="Funkensprueher_ST"
KEY="some_key"
IP="dhcp"
TIMEOUT=60

Here are the outputs of the requested commands:
output of wpa_cli status:
Selected interface 'wlan0'
bssid=00:21:27:ff:15:2a
ssid=Funkensprueher_ST
id=1
id_str=steinfurt
pairwise_cipher=CCMP
group_cipher=TKIP
key_mgmt=WPA2-PSK
wpa_state=COMPLETED

output of netcfg status:
(no output)

output of pgrep dhcp:
(no output)

output of dhcpcd wlan0:
dhcpcd: version 5.1.3 starting
dhcpcd: wlan0: broadcasting for a lease
dhcpcd: wlan0: offered 192.168.1.100 from 192.168.1.250
dhcpcd: wlan0: acknowledged 192.168.1.100 from 192.168.1.250
dhcpcd: wlan0: checking for 192.168.1.100
dhcpcd: wlan0: leased 192.168.1.100 for infinity
dhcpcd: wlan0: MTU set to 576
dhcpcd: forking to background

The output of netcfg status is unchanged after obtaining the lease, but the output of pgrep dhcp changes to:
4464

I hope that this is helpful.
Comment by Tomas M. (eldragon) - Monday, 23 November 2009, 15:15 GMT
installed kernel 2.6.32-rc8 and it fixed half the issues.

now during boot, net-auto-wireless authenticates with the AP correctly, but it still fails to set up the ip for the wireless device. no error is reported, it just doesn't do it.

running netcfg profile-name works ok.

same config as above.

restarting net-auto-wireless doesnt fix the issue.
Comment by Tomas M. (eldragon) - Thursday, 03 December 2009, 12:39 GMT
this might be quite noobish of me, but ive been following the net-auto-wireless code and havent found a single line that suggests the ip settings are being set up / parsed.... iphitus, could you highlight this portion of the code please?
Comment by James Rayner (iphitus) - Thursday, 03 December 2009, 13:29 GMT
in netcfg-wpa_actiond-action, it calls /usr/lib/network/connections/ethernet with the profile name matching the network. 'ethernet' then configures it according to the options.

After it's associated and settled down, could you check your logs for lines like the following:
wpa_actiond[1234]: Interface 'ethX' connected to network 'mynetwork'
wpa_actiond[1234]: Interface 'ethX' reestablished/lost connection to network 'mynetwork'

If all is working fine,
- wpa_supplicant should associate
- wpa_actiond detects this event, and calls netcfg-wpa_actiond-action
- netcfg-wpa_actiond-action configures your IP settings

I'm looking at this. I'll get back to you in a day with a suggestion.
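To check whether the chain above got as far as the second step, the log can be tested for a connect event in the most recent wpa_actiond session. The following is only a sketch: `last_session_connected` is a hypothetical helper, and the patterns are taken from the log messages quoted in this report.

```shell
#!/bin/bash
# Sketch: did the most recent wpa_actiond session log a connect event?
# last_session_connected is a hypothetical helper, not part of netcfg.
# Exit status 0 means an event was logged, 1 means none was seen.
last_session_connected() {
    local log="$1"
    awk '
        # A new session starts: forget events from earlier sessions.
        /Starting wpa_actiond session/ { connected = 0 }
        # wpa_actiond noticed the association (or a reconnect).
        /(connected|reestablished connection) to network/ { connected = 1 }
        END { exit (connected ? 0 : 1) }
    ' "$log"
}
```

For example, `last_session_connected /var/log/daemon.log || echo "wpa_actiond never saw the association"`. If the function fails while `wpa_cli status` reports an associated interface, the action script was probably never called.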
Comment by Tomas M. (eldragon) - Thursday, 03 December 2009, 15:20 GMT
ive followed the codepath with echos and im stuck at /usr/sbin/wpa_actiond which runs with all the options,


additionally: here is what i get from my logs:
Dec 3 12:18:11 lappy wpa_actiond[13229]: Starting wpa_actiond session for interface 'wlan0'

nothing else.

when running netcfg profile, wpa_actiond is not used, is this correct?
Comment by Tomas M. (eldragon) - Friday, 04 December 2009, 23:23 GMT
im at a wireless spot which works half the time..

when it works, logs appear as stated above.

when it does not

check this log: the delays show something timeoutish about it:

Dec 4 19:50:01 lappy wpa_actiond[3123]: Terminating wpa_actiond session for interface 'wlan0'
Dec 4 19:50:07 lappy wpa_actiond[3825]: Starting wpa_actiond session for interface 'wlan0'
Dec 4 19:50:19 lappy wpa_actiond[3825]: Terminating wpa_actiond session for interface 'wlan0'
Dec 4 19:50:25 lappy wpa_actiond[3921]: Starting wpa_actiond session for interface 'wlan0'
Dec 4 19:51:38 lappy wpa_actiond[3921]: Terminating wpa_actiond session for interface 'wlan0'
Dec 4 20:12:17 lappy wpa_actiond[4788]: Starting wpa_actiond session for interface 'wlan0'
Dec 4 20:12:44 lappy wpa_actiond[4788]: Terminating wpa_actiond session for interface 'wlan0'
Dec 4 20:12:48 lappy wpa_actiond[4970]: Starting wpa_actiond session for interface 'wlan0'
Dec 4 20:12:48 lappy wpa_actiond[4970]: Interface 'wlan0' connected to network 'marianita'
Dec 4 20:15:04 lappy wpa_actiond[4970]: Interface 'wlan0' lost connection to network 'marianita'
Dec 4 20:15:05 lappy wpa_actiond[4970]: Interface 'wlan0' reestablished connection to network 'marianita'
Dec 4 20:22:21 lappy wpa_actiond[4970]: Interface 'wlan0' lost connection to network 'marianita'
Dec 4 20:22:22 lappy wpa_actiond[4970]: Interface 'wlan0' reestablished connection to network 'marianita'
Comment by James Rayner (iphitus) - Saturday, 05 December 2009, 12:38 GMT
which delays are you talking about?

Yes, when running a profile, wpa_actiond is not used.

Also could you test, running nothing network related at all on boot, and then /etc/rc.d/net-auto-wireless start after logging in?
Comment by Tomas M. (eldragon) - Saturday, 05 December 2009, 13:52 GMT
well yes. not starting net-auto-wireless during boot (or anything network related) and then starting the daemon in a shell works ok.

ive always had this small annoyance of not having a correct environment during boot. never been able to figure this out. do you know where to look?
Comment by Tomas M. (eldragon) - Monday, 07 December 2009, 17:18 GMT
ive noticed a custom kernel ive built does not work with wpa_actiond

the current 2.6.32-ARCH one does.. so i guess its safe to close this bug report as invalid. do you know what kernel config might trigger this?
Comment by James Rayner (iphitus) - Monday, 07 December 2009, 21:27 GMT
If starting net-auto-wireless after logging in works, while starting it at boot does not, that may indicate that the driver isn't ready when the daemon is started at boot.

I've seen this with some ethernet cards in the past too.

If your driver needs a daemon, e.g. crda, put that at the start of your DAEMONS and don't background it. Also try putting net-auto-wireless right at the end of your DAEMONS.
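As an illustration of the reordering, here is the DAEMONS line from the original report with only net-auto-wireless moved to the end. This is a sketch, not a recommended configuration: whether a helper such as crda is needed depends on the driver, and nothing else in the list has been changed or checked.

```shell
# /etc/rc.conf (sketch based on the DAEMONS line in the original report)
# net-auto-wireless now comes last, so it starts after everything else.
# A driver helper daemon such as crda, if needed, would go near the front
# without the '@' prefix so that it is not backgrounded.
DAEMONS=(syslog-ng hal network crond @netfs @samba @alsa @cpufreq @laptop-mode @cups @pkgdd @net-auto-wireless)
```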


Comment by Tomas M. (eldragon) - Monday, 07 December 2009, 22:33 GMT
yes, ive seen this happen...

but when this does happen, a SIOCTRL error (or something like that) appears in dmesg, and that is not the case here.

anyway, i've moved net-auto-wireless to the bottom of the list. might add a sleep 5 to the beginning of the rc script to help with the issue if it arises again ;)
Comment by James Rayner (iphitus) - Saturday, 30 January 2010, 23:43 GMT
try 2.5.0 in [testing].

brain0 identified a possible race which may be the cause of this. wpa_supplicant starts up and associates before wpa_actiond is able to connect and so it misses the event. A fix has been applied.
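A common way to close this kind of missed-event race is to query the current state once the listener is attached, rather than relying on the event alone. The sketch below illustrates that idea from the shell; it is not the fix that was applied, and `is_associated` is a hypothetical helper. It relies only on the `wpa_state=COMPLETED` line that `wpa_cli status` prints, as quoted earlier in this report.

```shell
#!/bin/bash
# Sketch of the general remedy for a missed-event race: after attaching,
# also check the current state in case the event has already fired.
# is_associated is a hypothetical helper. It reads `wpa_cli status` output
# on stdin and succeeds when wpa_state is COMPLETED.
is_associated() {
    grep -qx 'wpa_state=COMPLETED'
}

# Usage (needs a running wpa_supplicant on the interface):
#   if wpa_cli -i wlan0 status | is_associated; then
#       echo "already associated, configure the IP now"
#   fi
```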
Comment by Michael Bentlage (mib1982) - Tuesday, 02 February 2010, 21:48 GMT
Thank you very much. 2.5.1 fixed the issue for me.
