FS#46764 - [linux-grsec] resolvconf not working

Attached to Project: Community Packages
Opened by ITwrx (andriesinfoserv) - Saturday, 17 October 2015, 13:28 GMT
Last edited by Daniel Micay (thestinger) - Tuesday, 03 November 2015, 19:32 GMT
Task Type: Bug Report
Category: Upstream Bugs
Status: Closed
Assigned To: Daniel Micay (thestinger)
Architecture: x86_64
Severity: Medium
Priority: Normal
Reported Version:
Due in Version: Undecided
Due Date: Undecided
Percent Complete: 100%
Votes: 0
Private: No

Details

Description: resolvconf not working with linux-grsec

Additional info:

linux-grsec 4.2.3.201510161817-1
systemd 227-1

kernel: resolvconf[832] bad frame in rt_sigreturn ... in libc-2.22.so
kernel: grsec: brute force prevention initiated for the next 30 minutes or until service restarted, stalling each fork 30 seconds. Please investigate the crash report for /usr/bin/resolvconf[resolvconf:825] uid/euid:0/0 gid:egid:0/0
systemd-coredump: Process 832 (resolvconf) of user 0 dumped core.

Steps to reproduce: boot arch with linux-grsec using dhcp for net interface?

Thanks in advance.

Closed by Daniel Micay (thestinger) - Tuesday, 03 November 2015, 19:32 GMT
Reason for closing: Fixed
Comment by Daniel Micay (thestinger) - Saturday, 17 October 2015, 18:46 GMT
Does this occur with the previous packages? You can get the older packages/signatures from http://seblu.net/a/archive/packages/l/linux-grsec/ if you don't have them in your package cache (4.2.3.201510130858-1 and then 4.2.3.201510111839-1).
Comment by Daniel Micay (thestinger) - Saturday, 17 October 2015, 19:30 GMT
Also, are you on i686 or x86_64?
Comment by ITwrx (andriesinfoserv) - Saturday, 17 October 2015, 19:53 GMT
this just started a few (3-5?) releases ago. I just verified that resolvconf works with 4.0.8.201507111211-1-grsec.

i'm on x86_64.

thanks again
Comment by Daniel Micay (thestinger) - Saturday, 17 October 2015, 19:56 GMT
It would be very helpful to identify which release it was that first broke it (probably either the move from 4.0 -> 4.1, or 4.1 -> 4.2).
Comment by ITwrx (andriesinfoserv) - Saturday, 17 October 2015, 22:30 GMT
appears to be the move from 4.1 -> 4.2.
resolvconf works in linux-grsec-4.1.7.201509201149-1-x86_64.pkg.tar.xz

I'm guessing it's broken in linux-grsec-4.2.3.201510072230-1-x86_64.pkg.tar.xz but i get "bruteforce prevention initiated ... check crash log for /usr/lib/systemd/systemd-udevd" before it ever gets to the resolvconf issue.
same thing with linux-grsec-4.2.3.201510111839-1-x86_64.pkg.tar.xz.

then with linux-grsec-4.2.3.201510130858-1-x86_64.pkg.tar.xz udev issue is gone and resolvconf issue appears.

btw, i noticed i was unable to build a kernel while using one of the previous two working grsec kernels, as mkinitcpio triggered bruteforce prevention measures too. Please let me know if i need to replicate and report that as well.

thanks.
Comment by Daniel Micay (thestinger) - Saturday, 17 October 2015, 23:23 GMT
Can you provide the full kernel logs?
Comment by ITwrx (andriesinfoserv) - Sunday, 18 October 2015, 17:49 GMT
sure, which kernel(s)?
Comment by Daniel Micay (thestinger) - Sunday, 18 October 2015, 17:53 GMT
The current one is fine.
Comment by ITwrx (andriesinfoserv) - Sunday, 18 October 2015, 19:30 GMT
maybe the udevd issue is just intermittent? you'll notice that with the first boot with latest kernel the bruteforce prevention is triggered on/by udevd. then, after reboot, no issue with udevd only resolvconf.
   kernel.log (257.4 KiB)
Comment by Remi Gacogne (rgacogne) - Wednesday, 21 October 2015, 09:48 GMT
Hello,

I have the same kind of issue with 4.2.3.201510202025-1-grsec and mkinitcpio.

Oct 21 11:20:41 redacted kernel: mkinitcpio[502] bad frame in rt_sigreturn frame:0000038bb6ff26b8 ip:38526d4c150 sp:38bb6ff2c78 orax:ffffffffffffffff in libc-2.22.so[38526c70000+19b000]

Postfix doesn't seem to be able to start either:

Oct 21 11:17:46 redacted kernel: sh[281] bad frame in rt_sigreturn frame:00000397ecb10678 ip:31cd0925900 sp:397ecb10c18 orax:ffffffffffffffff in libc-2.22.so[31cd08f2000+19b000]
Oct 21 11:17:46 redacted kernel: grsec: bruteforce prevention initiated for the next 30 minutes or until service restarted, stalling each fork 30 seconds. Please investigate the crash report for /usr/bin/bash[sh:281] uid/euid:0/0 gid/egid:0/0, parent /usr/bin/bash[sh:264] uid/euid:0/0 gid/egid:0/0
Oct 21 11:17:46 redacted postfix[232]: /usr/lib/postfix/bin/post-install: line 481: 281 Segmentation fault (core dumped) ( case "$mail_version" in

And PostgreSQL is not working correctly either:

Oct 21 11:17:47 redacted kernel: traps: postgres[350] trap stack segment ip:36bb5288f35 sp:3becce7feb8 error:0 in libcrypto.so.1.0.0[36bb51b4000+24d000]
Oct 21 11:17:47 redacted systemd-coredump[374]: Process 350 (postgres) of user 88 dumped core.

Everything works fine after downgrading to linux-grsec-4.1.7.201509201149-1.

Ping me (here or on IRC) if providing more info can help.
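
The kernel lines quoted in this comment share one pattern: a process name and PID followed by "bad frame in rt_sigreturn". A minimal sketch for listing which programs are affected in a saved log; the function name is hypothetical, and it assumes syslog-style lines with a "kernel: name[pid]" prefix like the excerpts above.

```sh
# List the distinct process names that hit "bad frame in rt_sigreturn".
# $1: path to a saved kernel log (e.g. the output of `journalctl -k -b`)
rt_sigreturn_victims() {
    grep 'bad frame in rt_sigreturn' "$1" |
        sed -n 's/.*kernel: \([^[]*\)\[[0-9]*] bad frame in rt_sigreturn.*/\1/p' |
        sort -u
}

# Usage:
#   journalctl -k -b > kernel.log
#   rt_sigreturn_victims kernel.log
```

Lines for other faults, such as the postgres "trap stack segment" entry, are deliberately left out by the first grep.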
Comment by Daniel Micay (thestinger) - Wednesday, 21 October 2015, 20:02 GMT
Can you provide the /proc/cpuinfo output?
Comment by ITwrx (andriesinfoserv) - Wednesday, 21 October 2015, 21:39 GMT
no problem. pls see attached.
Comment by Remi Gacogne (rgacogne) - Friday, 23 October 2015, 20:23 GMT
Just for information, I have the same issue with 4.2.4.201510222059-1. Let me know if it makes sense to try various kernel configurations, I have the spare CPU time :)
Comment by Daniel Micay (thestinger) - Friday, 23 October 2015, 20:24 GMT
Could try building with all of the PaX / grsecurity features enabled to check if it's an issue with the baseline changes or a specific feature. Could also see if this occurs with the PaX patch too, not just the grsecurity patch which pulls in those changes and adds a lot more.
Comment by Daniel Micay (thestinger) - Friday, 23 October 2015, 21:31 GMT
Also, would be helpful to get cpuinfo from you too. I'm curious if there's some common hardware factor, since most people aren't running into this (both AMD? just a shot in the dark).
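
For comparing machines, the vendor and model lines are usually enough. A small sketch that condenses a cpuinfo file to those fields; the function name is hypothetical, and it assumes the usual "key : value" layout of /proc/cpuinfo on x86.

```sh
# Print the distinct CPU vendor and model lines from a cpuinfo file.
# $1: path to the file (normally /proc/cpuinfo, or a saved copy)
cpu_summary() {
    grep -E '^(vendor_id|model name)' "$1" |
        sed 's/[[:space:]]*:[[:space:]]*/: /' |
        sort -u
}

# Usage:
#   cpu_summary /proc/cpuinfo
```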
Comment by ITwrx (andriesinfoserv) - Saturday, 24 October 2015, 20:40 GMT
attached is another machine's cpuinfo that is affected by this bruteforce prevention issue (AMD APU). i also updated and tested on an Intel Core 2 Duo and i get no bruteforce prevention related messages there, only "kernel: PAX: size overflow detected in function minstrel_ht_get_rate net/mac80211/rc80211_minstrel_ht.c:1056" over and over again. Is that a bug? If so, would it be helpful for me to report it? let me know if there's anything else i can try to round up. thanks.
Comment by Daniel Micay (thestinger) - Saturday, 24 October 2015, 20:49 GMT
Yes, you should report the size_overflow issue upstream at https://forums.grsecurity.net/.
Comment by Brad Spengler (spendergrsec) - Sunday, 25 October 2015, 19:48 GMT
This should be fixed in the latest patch. You can also apply https://grsecurity.net/~spender/fpu_fix.diff but the latest patch also includes some size_overflow updates.

-Brad
Comment by ITwrx (andriesinfoserv) - Sunday, 25 October 2015, 20:12 GMT
cool, thanks

@thestinger
maybe this was closed due to a misunderstanding? I think spendergrsec was saying that the intel issue should be resolved with the new version, not this bruteforce prevention issue. I just tried to boot the 10-25-2015 build on my amd system and this issue remains. Haven't tested the intel yet for the other issue. Thanks.
Comment by Daniel Micay (thestinger) - Tuesday, 27 October 2015, 19:43 GMT
This issue was expected to be fixed now. The size overflow stuff is not related to the FPU fix.
Comment by Brad Spengler (spendergrsec) - Wednesday, 28 October 2015, 01:42 GMT
Spent the past couple hours trying to debug this.
Can you give me a full dmesg with CONFIG_X86_DEBUG_FPU=y ?

Thanks,
-Brad
Comment by ITwrx (andriesinfoserv) - Wednesday, 28 October 2015, 02:33 GMT
I appreciate your efforts, but don't overdo it on my account. This is not ultra time sensitive on my end. Please see attached. That's all that was in my kernel.log. i added "CONFIG_X86_DEBUG_FPU=y" as a kernel param. If this doesn't have what you need, let me know and i'll try again.

Thanks again.
Comment by Brad Spengler (spendergrsec) - Wednesday, 28 October 2015, 03:03 GMT
Sorry, I should have clarified:

The kernel would need to be recompiled with CONFIG_X86_DEBUG_FPU=y present in the kernel config. I'll see if the FPU info in the dmesg can help me track it down.

-Brad
Comment by Brad Spengler (spendergrsec) - Wednesday, 28 October 2015, 03:26 GMT
Something else to try is booting with eagerfpu=on added to the kernel command-line.

-Brad
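
The two suggestions above are different kinds of setting: CONFIG_X86_DEBUG_FPU is a build-time option that has to be present in the kernel config, while eagerfpu=on is a boot parameter on the kernel command line. A minimal sketch for checking each one; both helper names are hypothetical, and reading /proc/config.gz assumes the running kernel was built with CONFIG_IKCONFIG_PROC, which may not be the case for linux-grsec.

```sh
# Succeed if a kernel config file sets the given option to "y".
# $1: path to a config (plain text, or gzip-compressed if it ends in .gz)
# $2: option name, e.g. CONFIG_X86_DEBUG_FPU
config_has_option() {
    case "$1" in
        *.gz) gzip -dc "$1" ;;
        *)    cat "$1" ;;
    esac | grep -q "^$2=y\$"
}

# Succeed if a boot parameter appears as a whole word in a command line.
# $1: the command line string, $2: parameter, e.g. eagerfpu=on
cmdline_has_param() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Usage:
#   config_has_option /proc/config.gz CONFIG_X86_DEBUG_FPU && echo "built in"
#   cmdline_has_param "$(cat /proc/cmdline)" eagerfpu=on && echo "requested at boot"
```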
Comment by Daniel Micay (thestinger) - Wednesday, 28 October 2015, 03:57 GMT
@andriesinfoserv: To build it you just need to do `ABSROOT=. abs community/linux-grsec` to fetch the package sources, modify config.x86_64 by hand (can just uncomment/enable the option) and then `updpkgsums && makepkg`. The full guide is here, but half isn't necessary for this:

https://wiki.archlinux.org/index.php/Kernels/Arch_Build_System
Comment by Daniel Micay (thestinger) - Wednesday, 28 October 2015, 03:58 GMT
(er, meant community/linux-grsec :P)
Comment by Daniel Micay (thestinger) - Wednesday, 28 October 2015, 04:06 GMT
And it's not actually bad practice to edit it by hand since the build process always runs `make config`. I'm just used to that breaking because the interactive prompts for y/n/m don't work with the container-based build scripts. It works fine with `makepkg` alone.
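
The hand edit of config.x86_64 can also be scripted. A minimal sketch, assuming a disabled option appears in the file in the usual kconfig form "# CONFIG_NAME is not set"; the function name is hypothetical, and the checkout directory in the usage lines is an assumption.

```sh
# Enable a boolean option in a kernel config file, in place.
# $1: config file (e.g. config.x86_64), $2: option name
enable_config_option() {
    if grep -q "^# $2 is not set\$" "$1"; then
        # Turn the "is not set" comment into an explicit =y line.
        sed "s/^# $2 is not set\$/$2=y/" "$1" > "$1.new" && mv "$1.new" "$1"
    elif ! grep -q "^$2=" "$1"; then
        # Option not mentioned at all: append it.
        printf '%s=y\n' "$2" >> "$1"
    fi
}

# Usage, following the steps from the comments above:
#   ABSROOT=. abs community/linux-grsec
#   cd community/linux-grsec
#   enable_config_option config.x86_64 CONFIG_X86_DEBUG_FPU
#   updpkgsums && makepkg
```

If the option is already present with some value, the sketch leaves the file untouched.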
Comment by Remi Gacogne (rgacogne) - Wednesday, 28 October 2015, 10:16 GMT
Unfortunately I don't have access to the original host causing the issue for the moment, but I can reproduce the same issue on a Vultr KVM VPS. I realize this may not be an ideal setup to test kernel issues, but here is the full dmesg booting a 4.2.4.201510251836 recompiled with CONFIG_X86_DEBUG_FPU=y and eagerfpu=on.
Comment by ITwrx (andriesinfoserv) - Wednesday, 28 October 2015, 12:01 GMT
@spendergrsec and @thestinger
thanks for the info. i'll check it out soon.

@rgacogne
I had to sleep and today i have field work, so i wouldn't have been able to provide this info very quickly. Therefore, your input is very timely and appreciated.

@all
please disregard the attachment. That's just a refresh related error. :)
Comment by Brad Spengler (spendergrsec) - Wednesday, 28 October 2015, 12:32 GMT
Thanks, unfortunately it seems there's a bug in the upstream kernel that prevents that kernel commandline option from having any effect (it configures eagerfpu based on the variable set by the commandline option long before it ever even parses the commandline options). This is why you see the 'lazy' mode being printed in dmesg even though it shouldn't be. I think the bug has to do with the lazy FPU mode (and possibly those WARNs you're triggering) -- could you please edit arch/x86/kernel/fpu/init.c and add a line just before the "if (eagerfpu == ENABLE)" that looks like:
eagerfpu = ENABLE;

and recompile and give me a new dmesg?

Thanks,
-Brad
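
The source edit described above can be scripted for repeated test builds. A minimal sketch, assuming the file contains the literal text "if (eagerfpu == ENABLE)" as quoted; the function name is hypothetical, and this is only the debugging change requested here, not a fix.

```sh
# Insert "eagerfpu = ENABLE;" just before the first line containing
# "if (eagerfpu == ENABLE)", reusing that line's indentation.
# $1: path to the source file (arch/x86/kernel/fpu/init.c in the kernel tree)
force_eagerfpu() {
    awk '
        !done && index($0, "if (eagerfpu == ENABLE)") {
            # Copy the leading whitespace of the matched line.
            match($0, /^[ \t]*/)
            print substr($0, 1, RLENGTH) "eagerfpu = ENABLE;"
            done = 1
        }
        { print }
    ' "$1" > "$1.new" && mv "$1.new" "$1"
}

# Usage, from the top of the unpacked kernel source:
#   force_eagerfpu arch/x86/kernel/fpu/init.c
```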
Comment by Remi Gacogne (rgacogne) - Wednesday, 28 October 2015, 13:48 GMT
Ok, it looks like you were right, everything seems back to normal when eager FPU mode is forced by adding eagerfpu = ENABLE; before the check. I am attaching the new dmesg, but I don't see anything unusual.

Thank you, and please let me know if I can help by running more tests.
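
To confirm which mode a given boot actually used, the saved dmesg can be searched for the FPU line. A loose sketch, assuming the kernel prints the word "lazy" or "eager" on a line that mentions the FPU, as described earlier in this thread; the exact message wording is not quoted here, so the match is deliberately broad, and the function name is hypothetical.

```sh
# Print "lazy" or "eager" from the first FPU-related line that names a mode.
# $1: path to a saved dmesg
fpu_switch_mode() {
    grep -i 'fpu' "$1" | grep -E -o 'lazy|eager' | head -n 1
}

# Usage:
#   dmesg > dmesg.txt
#   fpu_switch_mode dmesg.txt
```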
Comment by C (phone0) - Wednesday, 28 October 2015, 17:39 GMT
eagerfpu fixed issues that had been ongoing since 3.19+ on a (gentoo) grsec + xen setup: many segfaults similar to the one originally posted, as well as constant bad rss-counter states.
Comment by Brad Spengler (spendergrsec) - Wednesday, 28 October 2015, 23:47 GMT
Could you provide me with some of those logs?
Comment by Daniel Micay (thestinger) - Thursday, 29 October 2015, 23:32 GMT
The FPU changes were reverted in 4.2.5.201510290852-1 so this problem will actually be gone now. Not actually "fixed" though, so I'm sure spender would still like to have logs.
Comment by ITwrx (andriesinfoserv) - Friday, 30 October 2015, 20:20 GMT
i can confirm that the problem doesn't exist with 4.2.5.201510290852-1-grsec.

@thestinger are you asking for logs from me or phone0? I was assuming rgacogne's logs would provide the same potential insights as mine. If y'all would still like mine just let me know and i'll look into it, though i don't know how fast i'll be.

Thanks
Comment by Remi Gacogne (rgacogne) - Tuesday, 03 November 2015, 13:23 GMT
I can confirm that I don't see the issue anymore, even in "lazy" mode, with 4.2.5.201511021814-1.
