FS#38596 - [vsftpd] fails to start with "Bad page map" kernel error

Attached to Project: Community Packages
Opened by Howard Guo (howardg) - Tuesday, 21 January 2014, 10:39 GMT
Last edited by Jonathan Steel (jsteel) - Friday, 13 June 2014, 18:38 GMT
Task Type Bug Report
Category Upstream Bugs
Status Closed
Assigned To Jonathan Steel (jsteel)
Architecture All
Severity High
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 1
Private No

Details

My up-to-date Arch Linux installation on Amazon EC2 has vsftpd installed. It worked until a recent kernel upgrade; sorry, I cannot remember the exact version number.

Now when I attempt to start vsftpd with systemctl, the following errors appear in syslog:

Jan 21 11:38:40 ip-172-31-12-16 systemd[1]: Starting vsftpd daemon...
Jan 21 11:38:40 ip-172-31-12-16 systemd[1]: Started vsftpd daemon.
Jan 21 11:38:40 ip-172-31-12-16 systemd[1]: vsftpd.service: main process exited, code=exited, status=2/INVALIDARGUMENT
Jan 21 11:38:40 ip-172-31-12-16 systemd[1]: Unit vsftpd.service entered failed state.
Sep 27 10:18:58 ip-172-31-12-16 kernel: BUG: Bad page map in process vsftpd pte:8000000000000165 pmd:0f58a067
Sep 27 10:18:58 ip-172-31-12-16 kernel: page:ffffea0000000000 count:-2 mapcount:-2 mapping: (null) index:0x0
Sep 27 10:18:58 ip-172-31-12-16 kernel: page flags: 0x14(referenced|dirty)
Sep 27 10:18:58 ip-172-31-12-16 kernel: addr:00007fa98af23000 vm_flags:00100071 anon_vma:ffff88000f52ad00 mapping: (null) index:7fa98af23
Sep 27 10:18:58 ip-172-31-12-16 kernel: CPU: 0 PID: 634 Comm: vsftpd Tainted: G B 3.12.8-1-ec2 #1
Sep 27 10:18:58 ip-172-31-12-16 kernel: ffff88000f583508 ffff88000f5d3ca0 ffffffff814c77bb 00007fa98af23000
Sep 27 10:18:58 ip-172-31-12-16 kernel: ffff88000f5d3ce8 ffffffff8116788e 0000000000000000 0000000000000000
Sep 27 10:18:58 ip-172-31-12-16 kernel: ffff88000f58a918 ffffea0000000000 00007fa98af24000 ffff88000f5d3e10
Sep 27 10:18:58 ip-172-31-12-16 kernel: Call Trace:
Sep 27 10:18:58 ip-172-31-12-16 kernel: [<ffffffff814c77bb>] dump_stack+0x45/0x56
Sep 27 10:18:58 ip-172-31-12-16 kernel: [<ffffffff8116788e>] print_bad_pte+0x22e/0x250
Sep 27 10:18:58 ip-172-31-12-16 kernel: [<ffffffff811690b3>] unmap_single_vma+0x583/0x890
Sep 27 10:18:58 ip-172-31-12-16 kernel: [<ffffffff8116a445>] unmap_vmas+0x65/0x90
Sep 27 10:18:58 ip-172-31-12-16 kernel: [<ffffffff811737d5>] exit_mmap+0xc5/0x170
Sep 27 10:18:58 ip-172-31-12-16 kernel: [<ffffffff8105d295>] mmput+0x65/0x100
Sep 27 10:18:58 ip-172-31-12-16 kernel: [<ffffffff81062983>] do_exit+0x393/0x9e0
Sep 27 10:18:58 ip-172-31-12-16 kernel: [<ffffffff810630dc>] do_group_exit+0xcc/0x140
Sep 27 10:18:58 ip-172-31-12-16 kernel: [<ffffffff81063164>] SyS_exit_group+0x14/0x20
Sep 27 10:18:58 ip-172-31-12-16 kernel: [<ffffffff814d61ad>] system_call_fastpath+0x1a/0x1f
Sep 27 10:18:58 ip-172-31-12-16 kernel: BUG: Bad rss-counter state mm:ffff88000f5cd080 idx:0 val:-1
Sep 27 10:18:58 ip-172-31-12-16 kernel: BUG: Bad rss-counter state mm:ffff88000f5cd080 idx:1 val:1

Kernel version:

Linux 3.12.8-1-ec2 #1 SMP Mon Jan 20 09:58:48 UTC 2014 x86_64

Vsftpd version:

3.0.2-2

Closed by Jonathan Steel (jsteel)
Friday, 13 June 2014, 18:38 GMT
Reason for closing: Fixed
Comment by Steven Noonan (neunon) - Sunday, 26 January 2014, 05:05 GMT
  • Field changed: Percent Complete (100% → 0%)
Reproduced on stock kernel.
Comment by Steven Noonan (neunon) - Sunday, 26 January 2014, 05:32 GMT
Reproduced on stock Arch kernel:

[ 10.721057] BUG: Bad page map in process vsftpd pte:8000000a0bd0e165 pmd:e9c8f2067
[ 10.721070] page:ffffea00282f4380 count:0 mapcount:-1 mapping: (null) index:0x0
[ 10.721075] page flags: 0x2fc000000000014(referenced|dirty)
[ 10.721080] addr:00007f400eebd000 vm_flags:08100071 anon_vma:ffff880e9d335b40 mapping: (null) index:7f400eebd
[ 10.721087] CPU: 0 PID: 488 Comm: vsftpd Not tainted 3.12.8-1-ARCH #1
[ 10.721090] ffff880e98204da8 ffff880e996b3ca8 ffffffff814ec1e3 00007f400eebd000
[ 10.721103] ffff880e996b3cf0 ffffffff8115b014 ffffea00282f4380 0000000000000000
[ 10.721109] ffff880e9c8f25e8 ffffea00282f4380 00007f400eebe000 ffff880e996b3e18
[ 10.721116] Call Trace:
[ 10.721125] [<ffffffff814ec1e3>] dump_stack+0x54/0x8d
[ 10.721131] [<ffffffff8115b014>] print_bad_pte+0x1b4/0x270
[ 10.721136] [<ffffffff8115cc03>] unmap_single_vma+0x813/0x8e0
[ 10.721140] [<ffffffff8115de29>] unmap_vmas+0x49/0x90
[ 10.721146] [<ffffffff811671bc>] exit_mmap+0x9c/0x170
[ 10.721152] [<ffffffff8105ec19>] mmput+0x59/0x110
[ 10.721158] [<ffffffff81063fef>] do_exit+0x27f/0xab0
[ 10.721162] [<ffffffff8106489f>] do_group_exit+0x3f/0xa0
[ 10.721166] [<ffffffff81064914>] SyS_exit_group+0x14/0x20
[ 10.721170] [<ffffffff814faced>] system_call_fastpath+0x1a/0x1f
[ 10.721174] Disabling lock debugging due to kernel taint
[ 10.723219] BUG: Bad rss-counter state mm:ffff880e9c814380 idx:0 val:-1
[ 10.723224] BUG: Bad rss-counter state mm:ffff880e9c814380 idx:1 val:1

Reported upstream as well, interesting thread: https://lkml.org/lkml/2014/1/21/544
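
For anyone checking whether their own logs show the same signature, here is a minimal sketch that pulls the "Bad page map" lines out of saved kernel log text. The function name and the returned field names are illustrative only and are not part of any tool mentioned in this report; the pattern is taken from the first line of the traces above.

```python
import re

# Matches the first line of the traces quoted in this report, e.g.
# "BUG: Bad page map in process vsftpd pte:8000000000000165 pmd:0f58a067"
BAD_PAGE_MAP = re.compile(
    r"BUG: Bad page map in process (\S+)\s+pte:([0-9a-f]+)\s+pmd:([0-9a-f]+)"
)


def find_bad_page_maps(log_text):
    """Return one dict per 'Bad page map' line found in log_text.

    The helper and its field names are hypothetical, for illustration only.
    """
    return [
        {"process": m.group(1), "pte": m.group(2), "pmd": m.group(3)}
        for m in BAD_PAGE_MAP.finditer(log_text)
    ]
```

It can be fed the output of `dmesg` or `journalctl -k` saved to a file; an empty list means no matching line was found in that text.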
Comment by Steven Haigh (CRCinAU) - Saturday, 17 May 2014, 11:36 GMT
  • Field changed: Percent Complete (100% → 0%)
Looking at this thread, either this hasn't been fixed in 3.14.2 or has resurfaced.

kernel: BUG: Bad page map in process vsftpd pte:8000000000000165 pmd:0230b067
kernel: page:ffffea0000000000 count:-7 mapcount:-7 mapping: (null) index:0x0
kernel: page flags: 0x10(dirty)
kernel: page dumped because: bad pte
kernel: addr:00007f0ede480000 vm_flags:00100071 anon_vma:ffff880020505000 mapping: (null) index:7f0ede480
CPU: 0 PID: 3465 Comm: vsftpd Tainted: G B 3.14.2-1-ARCH #1
0000000000000000 00000000b2454f6c ffff880011e0dc98 ffffffff8150984e
00007f0ede480000 ffff880011e0dce0 ffffffff8116d201 ffff880011e0dce0
0000000000000000 ffff88000230b400 ffffea0000000000 00007f0ede481000
Call Trace:
[<ffffffff8150984e>] dump_stack+0x4d/0x6f
[<ffffffff8116d201>] print_bad_pte+0x1c1/0x290
[<ffffffff8116ef23>] unmap_single_vma+0x833/0x8c0
[<ffffffff81170284>] unmap_vmas+0x54/0xb0
[<ffffffff8117999c>] exit_mmap+0xac/0x1a0
[<ffffffff8151273b>] ? __do_page_fault+0x2eb/0x600
[<ffffffff81067b81>] mmput+0x51/0x120
[<ffffffff8106d2d9>] do_exit+0x349/0xb10
[<ffffffff8106db23>] do_group_exit+0x43/0xc0
[<ffffffff8106dbb4>] SyS_exit_group+0x14/0x20
[<ffffffff81517629>] system_call_fastpath+0x16/0x1b
BUG: Bad rss-counter state mm:ffff880001589c00 idx:0 val:-1
BUG: Bad rss-counter state mm:ffff880001589c00 idx:1 val:1

# cat /proc/version
Linux version 3.14.2-1-ARCH (nobody@var-lib-archbuild-testing-x86_64-tobias) (gcc version 4.9.0 (GCC) ) #1 SMP PREEMPT Sun Apr 27 11:28:44 CEST 2014
Comment by Steven Noonan (neunon) - Saturday, 17 May 2014, 11:39 GMT
The patch that corrected the issue was reverted because it had the negative side effect of breaking live migration. This (reworked) patch can be pulled into the kernel PKGBUILD to fix it:

https://git.kernel.org/cgit/linux/kernel/git/mel/linux-balancenuma.git/commit/?h=mm-numa-use-high-bit-v4r3&id=90689056584e7f9a2e4783f5d90e2b76ca1eda2b

This patch is part of the linux-ec2 kernel used in my AMIs.
Comment by Steven Noonan (neunon) - Tuesday, 20 May 2014, 06:02 GMT
Patch "mm: use paravirt friendly ops for NUMA hinting ptes" has been added to the 3.14-stable and 3.10-stable queues. Should be present in 3.14.5 and 3.10.41.
Comment by Steven Noonan (neunon) - Friday, 23 May 2014, 16:59 GMT
The fix was also added to the 3.12-stable queue this morning.
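
As a rough aid, the version thresholds from the two comments above can be turned into a quick check against a kernel release string. This is only a sketch: the function names are made up for illustration, the thresholds assume the fix landed in 3.14.5 and 3.10.41 as expected above, and 3.12 is treated as unknown because no 3.12 release number is given in this thread.

```python
import platform
import re

# Assumption: first stable releases expected to carry the fix, taken from the
# comment above ("Should be present in 3.14.5 and 3.10.41"). No release number
# is stated for 3.12-stable, so that series is deliberately left out.
FIXED_IN = {(3, 14): 5, (3, 10): 41}


def parse_release(release):
    """Parse a release string such as '3.14.6-1-ARCH' into (3, 14, 6)."""
    m = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if m is None:
        raise ValueError("unrecognised kernel release: %r" % release)
    major, minor, patch = m.groups()
    return int(major), int(minor), int(patch or 0)


def has_numa_pte_fix(release):
    """Return True or False for a known series, None when it is not listed."""
    major, minor, patch = parse_release(release)
    first_fixed = FIXED_IN.get((major, minor))
    if first_fixed is None:
        return None
    return patch >= first_fixed


if __name__ == "__main__":
    # platform.release() returns the running kernel's release string.
    print(platform.release(), has_numa_pte_fix(platform.release()))
```

A result of `None` means this thread gives no information for that kernel series, not that the kernel is unaffected.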
Comment by Jonathan Steel (jsteel) - Sunday, 01 June 2014, 08:36 GMT
Can someone try Linux 3.14.5 in [testing] and confirm if that fixes this?
Comment by Jonathan Steel (jsteel) - Monday, 09 June 2014, 11:44 GMT
Can someone confirm whether this issue persists with linux-3.14.6 (now in [core])?
Comment by Steven Haigh (CRCinAU) - Friday, 13 June 2014, 17:43 GMT
I'd hazard a guess that it would - however I don't run Arch on that system anymore... I needed to get vsftpd working and reverted to a 'known good' setup...
Comment by Steven Noonan (neunon) - Friday, 13 June 2014, 17:44 GMT
I'll test in a moment, just give me a few to launch an EC2 instance.
Comment by Steven Noonan (neunon) - Friday, 13 June 2014, 17:48 GMT
Tested 3.14.6-1-ARCH, verified vsftpd works fine in domU.
Comment by Jonathan Steel (jsteel) - Friday, 13 June 2014, 18:37 GMT
Thanks for testing and confirming.
