FS#56605 - [linux] 4.14.3-1-x86_64 breaks IPSec/L2TP tunnels

Attached to Project: Arch Linux
Opened by Alexander E. Patrakov (patrakov) - Wednesday, 06 December 2017, 01:40 GMT
Last edited by Jan Alexander Steffens (heftig) - Tuesday, 26 December 2017, 00:26 GMT
Task Type Bug Report
Category Kernel
Status Closed
Assigned To Tobias Powalowski (tpowa), Jan Alexander Steffens (heftig)
Architecture All
Severity Low
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 12
Private No

Details

Description:

I am a PureVPN user. I use strongswan and xl2tpd, plus networkmanager-l2tp. This setup worked with 4.13.x kernels but no longer works after the upgrade to 4.14.3. The error is:

Dec 06 00:09:11 yoga NetworkManager[519]: xl2tpd[2653]: Listening on IP address 0.0.0.0, port 1701
Dec 06 00:09:11 yoga NetworkManager[519]: xl2tpd[2653]: Connecting to host 192.253.242.2, port 1701
Dec 06 00:09:11 yoga charon[2593]: 15[KNL] creating acquire job for policy 192.168.7.130/32[udp/l2f] === 192.253.242.2/32[udp/l2f] with reqid {1}
Dec 06 00:09:11 yoga charon[2593]: 06[CFG] trap not found, unable to acquire reqid 1

Downgrading the kernel fixes the problem.
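
For anyone checking whether they are hitting the same failure, here is a minimal Python sketch that scans journal lines for the charon message quoted above. The regular expression is derived only from the two charon lines in this report; the function name is hypothetical and other strongswan versions may format the message differently.

```python
import re

# Signature copied from the charon log line quoted in this report.
TRAP_NOT_FOUND = re.compile(
    r"charon\[\d+\]: \d+\[CFG\] trap not found, unable to acquire reqid (\d+)"
)


def find_failed_reqids(lines):
    """Return the reqids for which charon logged 'trap not found'."""
    reqids = []
    for line in lines:
        match = TRAP_NOT_FOUND.search(line)
        if match:
            reqids.append(int(match.group(1)))
    return reqids


# The two charon lines from the report, without the timestamp prefix.
sample = [
    "yoga charon[2593]: 15[KNL] creating acquire job for policy "
    "192.168.7.130/32[udp/l2f] === 192.253.242.2/32[udp/l2f] with reqid {1}",
    "yoga charon[2593]: 06[CFG] trap not found, unable to acquire reqid 1",
]
print(find_failed_reqids(sample))
```

The same function can be fed the lines of a saved journal excerpt; an empty result only means this particular message was not logged, not that the tunnel works.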

Additional info:
* package version(s)

strongswan 5.6.1-1
xl2tpd 1.3.10-1
networkmanager-l2tp 1.2.4-3
linux 4.14.3-1 (broken), 4.13.12-1 (working)

* config and/or log files etc.

Here is the NetworkManager connection config:

[connection]
id=PureVPN L2TP HK Dedicated
uuid=786527b6-87cc-4892-8593-5429a624476f
type=vpn
autoconnect=false
permissions=
timestamp=1490625831

[vpn]
gateway=hk-ded-1.pointtoserver.com
ipsec-enabled=yes
ipsec-forceencaps=yes
ipsec-psk=12345678
lcp-echo-failure=5
lcp-echo-interval=30
mru=1320
mtu=1320
no-vj-comp=yes
nobsdcomp=yes
nodeflate=yes
password-flags=0
refuse-chap=yes
refuse-eap=yes
refuse-pap=yes
user=<censored>
service-type=org.freedesktop.NetworkManager.l2tp

[vpn-secrets]
password=<censored>

[ipv4]
dns=8.8.8.8;
dns-priority=-1
dns-search=
ignore-auto-dns=true
method=auto

[ipv6]
addr-gen-mode=stable-privacy
dns-search=
ip6-privacy=0
method=auto
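
To compare setups between affected users, the relevant settings can be pulled out of a connection file like the one above. This is a rough sketch that treats the keyfile as plain INI and reads it with Python's configparser; that is an approximation of the NetworkManager keyfile format, and the function name and the shortened sample are illustrative only.

```python
import configparser

# Shortened copy of the connection file from this report.
KEYFILE = """\
[connection]
id=PureVPN L2TP HK Dedicated
type=vpn

[vpn]
gateway=hk-ded-1.pointtoserver.com
ipsec-enabled=yes
ipsec-forceencaps=yes
mru=1320
mtu=1320
service-type=org.freedesktop.NetworkManager.l2tp
"""


def summarize_vpn(text):
    """Extract the settings most relevant to an L2TP/IPsec bug report."""
    # Interpolation is disabled so that a literal '%' in a value is not
    # misread as configparser syntax.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    vpn = parser["vpn"]
    return {
        "gateway": vpn.get("gateway"),
        "ipsec_enabled": vpn.get("ipsec-enabled") == "yes",
        "force_encaps": vpn.get("ipsec-forceencaps") == "yes",
        "mtu": vpn.getint("mtu"),
    }


print(summarize_vpn(KEYFILE))
```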

Closed by  Jan Alexander Steffens (heftig)
Tuesday, 26 December 2017, 00:26 GMT
Reason for closing:  Fixed
Additional comments about closing:  linux 4.14.9-1
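
A small Python sketch for checking an installed Arch linux package version against the range discussed in this task. The lower bound comes from the task title (4.14.3-1) and the upper bound from the closing comment (4.14.9-1). Whether earlier 4.14 releases or non-Arch builds of 4.14.9 behave the same way is not stated here, so treat the range as an assumption; the function names are hypothetical.

```python
import re

FIRST_REPORTED_BROKEN = (4, 14, 3)  # from the task title
FIXED_IN = (4, 14, 9)               # from the closing comment


def parse_release(release):
    """Turn '4.14.3-1' or '4.15-rc3' into a (major, minor, patch) tuple."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if not match:
        raise ValueError(f"unrecognized kernel version: {release!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def in_reported_range(release):
    """True if the version lies between the first broken and the fixed package."""
    return FIRST_REPORTED_BROKEN <= parse_release(release) < FIXED_IN


for version in ("4.13.12-1", "4.14.3-1", "4.14.6-1", "4.14.9-1"):
    print(version, in_reported_range(version))
```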
Comment by Dennis (OlliC) - Saturday, 09 December 2017, 13:38 GMT
Same problem here. I switched to the LTS kernel and am waiting for a fix in the main kernel.
Comment by loqs (loqs) - Saturday, 09 December 2017, 15:20 GMT
Have you notified or checked that upstream is aware that there is an issue to fix?
Comment by Andrew Boyko (www2287) - Monday, 11 December 2017, 15:59 GMT
Same here with L2TP/IPsec. Here is the libreswan bug report: https://github.com/libreswan/libreswan/issues/140
My log on 4.14:

002 "e68a2494-5164-4742-b94d-666f89e20c59" #1: initiating Main Mode
104 "e68a2494-5164-4742-b94d-666f89e20c59" #1: STATE_MAIN_I1: initiate
002 "e68a2494-5164-4742-b94d-666f89e20c59" #1: transition from state STATE_MAIN_I1 to state STATE_MAIN_I2
106 "e68a2494-5164-4742-b94d-666f89e20c59" #1: STATE_MAIN_I2: sent MI2, expecting MR2
002 "e68a2494-5164-4742-b94d-666f89e20c59" #1: transition from state STATE_MAIN_I2 to state STATE_MAIN_I3
108 "e68a2494-5164-4742-b94d-666f89e20c59" #1: STATE_MAIN_I3: sent MI3, expecting MR3
002 "e68a2494-5164-4742-b94d-666f89e20c59" #1: Peer ID is ID_IPV4_ADDR: '10.xxx.xxx.xxx'
002 "e68a2494-5164-4742-b94d-666f89e20c59" #1: transition from state STATE_MAIN_I3 to state STATE_MAIN_I4
004 "e68a2494-5164-4742-b94d-666f89e20c59" #1: STATE_MAIN_I4: ISAKMP SA established {auth=PRESHARED_KEY cipher=3des_cbc_192 integ=sha group=MODP1024}
002 "e68a2494-5164-4742-b94d-666f89e20c59" #2: initiating Quick Mode PSK+ENCRYPT+UP+IKEV1_ALLOW+IKEV2_ALLOW+SAREF_TRACK+IKE_FRAG_ALLOW+ESN_NO {using isakmp#1 msgid:b0ba3117 proposal=3DES(3)_000-SHA1(2) pfsgroup=no-pfs}
117 "e68a2494-5164-4742-b94d-666f89e20c59" #2: STATE_QUICK_I1: initiate
003 "e68a2494-5164-4742-b94d-666f89e20c59" #2: ignoring informational payload IPSEC_RESPONDER_LIFETIME, msgid=b0ba3117, length=28
003 "e68a2494-5164-4742-b94d-666f89e20c59" #2: NAT-Traversal: received 2 NAT-OA. Using first, ignoring others
003 "e68a2494-5164-4742-b94d-666f89e20c59" #2: our client subnet returned doesn't match my proposal - us:192.xxx.xxx.xxx/32 vs them:194.xx.xxx.xxx/32
003 "e68a2494-5164-4742-b94d-666f89e20c59" #2: Allowing questionable proposal anyway [ALLOW_MICROSOFT_BAD_PROPOSAL]
003 "e68a2494-5164-4742-b94d-666f89e20c59" #2: peer client subnet returned doesn't match my proposal - us:190.xxx.xxx.xxx/32 vs them:10.xxx.xxx.xxx/32
003 "e68a2494-5164-4742-b94d-666f89e20c59" #2: Allowing questionable proposal anyway [ALLOW_MICROSOFT_BAD_PROPOSAL]
002 "e68a2494-5164-4742-b94d-666f89e20c59" #2: transition from state STATE_QUICK_I1 to state STATE_QUICK_I2
004 "e68a2494-5164-4742-b94d-666f89e20c59" #2: STATE_QUICK_I2: sent QI2, IPsec SA established transport mode {ESP/NAT=>0xbcec3a1e <0x609c3d28 xfrm=3DES_0-HMAC_SHA1 NATOA=189.xxx.xxx.xxx NATD=190.xxx.xxx.xxx:4500 DPD=passive}
nm-l2tp[13414] <info> Libreswan IPsec tunnel is up.
** Message: xl2tpd started with pid 13852
xl2tpd[13852]: setsockopt recvref[30]: Protocol not available
xl2tpd[13852]: Using l2tp kernel support.
xl2tpd[13852]: xl2tpd version xl2tpd-1.3.10 started on lenovo-pc PID:13852
xl2tpd[13852]: Written by Mark Spencer, Copyright (C) 1998, Adtran, Inc.
xl2tpd[13852]: Forked by Scott Balmos and David Stipp, (C) 2001
xl2tpd[13852]: Inherited by Jeff McAdams, (C) 2002
xl2tpd[13852]: Forked again by Xelerance (www.xelerance.com) (C) 2006-2016
xl2tpd[13852]: Listening on IP address 0.0.0.0, port 1701
xl2tpd[13852]: get_call: allocating new tunnel for host 190.xxx.xxx.xxx, port 1701.
xl2tpd[13852]: Connecting to host 190.xxx.xxx.xxx, port 1701
xl2tpd[13852]: control_finish: message type is (null)(0). Tunnel is 0, call is 0.
xl2tpd[13852]: control_finish: sending SCCRQ
nm-l2tp[13414] <warn> Looks like pppd didn't initialize our dbus module
nm-l2tp[13414] <info> Terminated xl2tpd daemon with PID 13852.
xl2tpd[13852]: death_handler: Fatal signal 15 received
xl2tpd[13852]: Connection 0 closed to 190.xxx.xxx.xxx, port 1701 (Server closing)
002 "e68a2494-5164-4742-b94d-666f89e20c59": deleting non-instance connection
** Message: ipsec shut down
nm-l2tp[13414] <warn> xl2tpd exited with error code 1
** Message: ipsec shut down
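
The log above shows the IPsec part succeeding and the L2TP part stalling after the first control message (SCCRQ, the Start-Control-Connection-Request). A minimal Python sketch for locating that point in a similar log follows. The marker strings are copied from this log; the stage names and the function are hypothetical.

```python
# (stage name, substring from the log above), in the order they normally appear.
STAGES = [
    ("ike_established", "ISAKMP SA established"),
    ("ipsec_established", "IPsec SA established"),
    ("l2tp_sccrq_sent", "control_finish: sending SCCRQ"),
]


def last_stage_reached(lines):
    """Return the name of the last stage whose marker appears, or None."""
    reached = None
    for name, marker in STAGES:
        if any(marker in line for line in lines):
            reached = name
        else:
            break  # later stages cannot be reached without this one
    return reached


sample = [
    '004 "e68a2494" #1: STATE_MAIN_I4: ISAKMP SA established {auth=PRESHARED_KEY}',
    '004 "e68a2494" #2: STATE_QUICK_I2: sent QI2, IPsec SA established transport mode',
    "xl2tpd[13852]: control_finish: sending SCCRQ",
    "nm-l2tp[13414] <warn> Looks like pppd didn't initialize our dbus module",
]
print(last_stage_reached(sample))
```

A result of l2tp_sccrq_sent combined with the pppd warning matches the pattern in this comment: the request goes out and the tunnel never progresses further.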
Comment by loqs (loqs) - Monday, 11 December 2017, 22:24 GMT
Initial finding from upstream libreswan (https://github.com/libreswan/libreswan/issues/140#issuecomment-347707742): this is not a user space issue.
Please check linux 4.15-rc3. If the issue is not fixed there, please bisect between 4.13 and 4.14, then report the issue upstream against the bad commit.
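
For readers who have not bisected before, the idea is a binary search over the commits between a known-good and a known-bad kernel. The Python sketch below illustrates only that idea on a linear list with a made-up test function; real kernel history is not linear, and git bisect handles that along with the bookkeeping.

```python
def first_bad(commits, is_bad):
    """Find the first bad commit.

    commits is ordered oldest to newest; commits[0] is known good and
    commits[-1] is known bad. is_bad(commit) stands in for building that
    kernel and trying the VPN connection.
    """
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(commits[mid]):
            bad = mid
        else:
            good = mid
    return commits[bad]


# Hypothetical history of 16 commits where the regression lands in c11.
history = [f"c{i}" for i in range(16)]
print(first_bad(history, lambda commit: int(commit[1:]) >= 11))
```

Each round halves the remaining range, so the number of kernels to build and test grows only with the logarithm of the number of commits between the two releases.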
Comment by loqs (loqs) - Tuesday, 12 December 2017, 03:59 GMT
If no one is prepared to compile anything, I could try to build the l2tp modules from 4.13 backported to linux 4.14.
You would still have to copy them to /usr/lib/modules/extramodules-4.14-ARCH/ and then run depmod as root.
This is also the option with the lowest chance of success, as it assumes the bug must be in the l2tp modules.
Please note that if you change from 4.14.4-1-ARCH, I would need to change the target I am building against to match.
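
A small Python sketch of the two checks implied above: deriving the extramodules directory from the running kernel release, and confirming that the release still matches the one the modules were built against. The naming rule is inferred only from the example path and release in this comment, so treat it as an assumption; both function names are hypothetical.

```python
import re
from pathlib import PurePosixPath


def extramodules_dir(kernel_release):
    """Map '4.14.4-1-ARCH' to /usr/lib/modules/extramodules-4.14-ARCH.

    Assumption: the directory name is the major.minor series plus the
    flavor suffix, as in the path quoted in this comment.
    """
    match = re.fullmatch(r"(\d+\.\d+)\.\d+-\d+-(\w+)", kernel_release)
    if not match:
        raise ValueError(f"unexpected kernel release: {kernel_release!r}")
    series, flavor = match.groups()
    return PurePosixPath("/usr/lib/modules") / f"extramodules-{series}-{flavor}"


def matches_build_target(running_release, built_for):
    """The backported modules are only expected to fit the exact release."""
    return running_release == built_for


print(extramodules_dir("4.14.4-1-ARCH"))
print(matches_build_target("4.14.5-1-ARCH", "4.14.4-1-ARCH"))
```

On a live system the running release can be read with platform.release() from the standard library.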
Comment by loqs (loqs) - Tuesday, 12 December 2017, 21:25 GMT
https://github.com/loqs/l2tp-modules Please try this PKGBUILD. It installs the l2tp modules from 4.13.12 into the extra modules directory, overriding the 4.14 modules.
I went with 4.13.12 as that was the last release Arch did of the 4.13 series, but no one said specifically which was the last unaffected version.
Edit:
This should be less of an issue than I thought: nothing was changed in the l2tp modules from version 4.13.6 to 4.13.12 inclusive.
Comment by Josef Hopfgartner (joho1001) - Friday, 15 December 2017, 16:19 GMT
The current stable kernel changelog lists an XFRM fix:
https://cdn.kernel.org/pub/linux/kernel/v4.x/ChangeLog-4.14.6

(Search for XFRM -> xfrm: Copy policy family in clone_policy)

I've tested this kernel version, but the problem remains the same:
too many irritated SA get created when connecting L2TP VPN...

Comment by loqs (loqs) - Friday, 15 December 2017, 20:33 GMT
@joho1001 did you have the BUG message from 6610b9cb80ad9a58f71940b9658e3116262663db in the system's dmesg with versions before 4.14.6 but not with 4.14.6?
https://bugzilla.redhat.com/show_bug.cgi?id=1526203#c2 has one report of the issue being fixed in 4.15-rc3.
Comment by Alexander E. Patrakov (patrakov) - Monday, 18 December 2017, 06:47 GMT
4.14.6-1 does not fix it; I still get "trap not found, unable to acquire reqid 1".
Comment by Josef Hopfgartner (joho1001) - Tuesday, 19 December 2017, 06:15 GMT
Yeah. Still broken in 4.14. I had no idea how to search for this error in the kernel sources.
So I'm stuck with kernel 4.13 until the release of 4.15.
It's the simplest way for me.
Comment by loqs (loqs) - Sunday, 24 December 2017, 20:25 GMT
https://patchwork.ozlabs.org/patch/838470/
which needs
https://patchwork.ozlabs.org/patch/852277/
Please test if applying the above two patches resolves the issue.
Comment by loqs (loqs) - Sunday, 24 December 2017, 20:58 GMT
Two options for 4.14.8: a patch for the PKGBUILD, adjusted to fetch and apply the patches from the previous post, and a PKGBUILD with that patch already applied.
Comment by Alexandre Bique (babali) - Monday, 25 December 2017, 12:09 GMT
Hi,
Same issue here.
Quite critical for me.

Severity should be higher, because I can't connect to my office vpn anymore.
Comment by loqs (loqs) - Monday, 25 December 2017, 22:55 GMT
@babali even if the severity were set to critical, I would not expect that to change how quickly the issue is resolved.
Upstream will fix it whenever upstream believes the fix is ready, which may mean waiting until after the 4.15 release: https://patchwork.ozlabs.org/comment/1828319/
If you want Arch to apply the proposed patches before upstream does, there would need to be proof that the patches actually fix the issue,
so someone affected would have to build the kernel with the patches applied and test them.
