Arch Linux

FS#67033 - [security][systemd] Out of Memory DOS attack vector in libnss_myhostname when running BGP

Attached to Project: Arch Linux
Opened by Jonathan (jdccdevel) - Wednesday, 17 June 2020, 20:08 GMT
Last edited by freswa (frederik) - Wednesday, 17 June 2020, 20:21 GMT
Task Type: Bug Report
Category: Packages: Core
Status: Assigned
Assigned To: Dave Reisner (falconindy), Christian Hesse (eworm), Levente Polyak (anthraxx)
Architecture: x86_64
Severity: High
Priority: Normal
Reported Version:
Due in Version: Undecided
Due Date: Undecided
Percent Complete: 0%
Votes: 0
Private: No

Details

Description:
In systems with large routing tables, such as those running BGP, some hostname lookup queries cause extremely high memory use.

In particular, high memory use is triggered by any process performing a reverse DNS lookup on an IPv4 address for which no PTR record exists. One example is the lookup performed when a client connects to the postfix smtpd.

In my case, on a system running BGP with 800k+ routes, connecting to smtpd (from any system without a reverse DNS entry) triggers smtpd to consume 1.3GB of memory PER INSTANCE. (By default, postfix allows 100 smtpd instances to be spawned.) This is a very straightforward mechanism for creating an Out-Of-Memory Denial of Service attack.
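To put the figures above together, here is a rough worst-case estimate. It assumes every smtpd instance reaches the reported ~1.3 GB (taken as 1300 MB, an approximation) and that the default limit of 100 instances is reached. The function name is made up for illustration.

```shell
#!/bin/sh
# Worst-case memory estimate: per-instance usage times the instance limit.
# Inputs are the figures from the report: ~1.3 GB per smtpd (approximated
# here as 1300 MB) and the postfix default of 100 smtpd instances.
# worst_case_mb is a hypothetical helper name, not part of any tool.
worst_case_mb() {
    per_instance_mb=$1
    instances=$2
    echo $(( per_instance_mb * instances ))
}

worst_case_mb 1300 100   # prints 130000 (MB), i.e. about 130 GB
```

That is far more than most routers or mail hosts have available, which is why a handful of connections from hosts without reverse DNS is enough to cause trouble.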

The SMTP daemon is just one way of triggering this bug. The simplest way to trigger it on a system with a large routing table is described under "Steps to reproduce".

The root cause seems to be a very inefficient netlink routing table query in libnss_myhostname.so (a systemd library).

Additional info:
- Affects at least systemd 2.45.6-6 and older
- Magnitude of the problem appears proportional to the size of the routing table.
- Can be reproduced with static routes, BGP daemon is not necessary.
- Any system with a large routing table is vulnerable.
- Removing "myhostname" from the hosts line in /etc/nsswitch.conf should be an acceptable mitigation.
- Additional details can be found here: https://github.com/systemd/systemd/issues/13199

Steps to reproduce:
1) Ensure "myhostname" is present in the hosts line in /etc/nsswitch.conf
2) Add a large number of routes to your routing table. (See attached script, it will take a while.)
3) Run the command
getent hosts 172.20.1.1   # Any RFC 1918 or public IPv4 address without reverse DNS works.
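For readers without the attached script, step 2 could be approximated along these lines. This is only a sketch, not the attached script: the function name, the dummy0 interface, and the choice of /32 host routes under 10.0.0.0/8 are all assumptions made for illustration. It prints lines in the format accepted by `ip -batch`, which is much faster than calling `ip route add` once per route.

```shell
#!/bin/sh
# Hypothetical helper (not the script attached to the report): print COUNT
# "route add" lines for use with `ip -batch`.
# Assumptions: /32 host routes under 10.0.0.0/8, pointed at a dummy
# interface (default name dummy0) so no real traffic is affected.
gen_routes() {
    count=$1
    dev=${2:-dummy0}
    awk -v n="$count" -v dev="$dev" 'BEGIN {
        for (i = 0; i < n; i++)
            # Spread the counter over the last three octets of 10.x.y.z
            printf "route add 10.%d.%d.%d/32 dev %s\n",
                   int(i / 65536) % 256, int(i / 256) % 256, i % 256, dev
    }'
}

# Possible usage, as root and only on a disposable test VM:
#   ip link add dummy0 type dummy
#   ip link set dummy0 up
#   gen_routes 800000 > routes.batch
#   ip -batch routes.batch
```

Deleting the dummy interface afterwards (`ip link del dummy0`) should remove the routes that point at it.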

In my test VM, valgrind shows roughly 550 MB or more of memory allocated to perform this query:
[root@testvm routes]# valgrind getent -s myhostname hosts 172.20.1.1
==867906== Memcheck, a memory error detector
==867906== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==867906== Using Valgrind-3.16.0.GIT and LibVEX; rerun with -h for copyright info
==867906== Command: getent -s myhostname hosts 172.20.1.1
==867906==
==867906==
==867906== HEAP SUMMARY:
==867906== in use at exit: 4,288 bytes in 24 blocks
==867906== total heap usage: 1,024,300 allocs, 1,024,276 frees, 558,538,874 bytes allocated
==867906==
...

Workaround/Mitigation:
- Remove myhostname from hosts line in nsswitch.conf (This will prevent the function from being called entirely. From my research, this will have minimal side effects, especially in systems running BGP or other routing suites.)
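As a rough sketch of this workaround, the two functions below check for and strip the myhostname entry. The function names are made up for illustration, and the sed expression assumes the usual one-line "hosts:" format. The output goes to stdout so it can be reviewed before replacing /etc/nsswitch.conf.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative, not part of any package).
# Both take the path to an nsswitch.conf-style file as $1.

# Succeeds if the hosts line lists myhostname as a service.
has_myhostname() {
    grep -Eq '^[[:space:]]*hosts:.*[[:space:]]myhostname([[:space:]]|$)' "$1"
}

# Prints the file with myhostname removed from the hosts line, along with
# any [STATUS=action] block that directly follows it. Other lines and
# other services are left untouched.
strip_myhostname() {
    sed -E '/^[[:space:]]*hosts:/ s/[[:space:]]+myhostname([[:space:]]*\[[^]]*\])?//' "$1"
}

# Possible usage (review the result before installing it):
#   strip_myhostname /etc/nsswitch.conf > /tmp/nsswitch.conf.new
#   diff /etc/nsswitch.conf /tmp/nsswitch.conf.new
```

Keeping a backup of the original file makes it easy to restore the entry once upstream fixes the bug.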


Suggestions:
1) I would like to see a warning about this added to the quagga package, and any other routing software with the potential to create a large routing table.
2) The "myhostname" entry could be removed from nsswitch.conf (in the filesystem package), until such time as the bug is fixed by systemd upstream. The only potential issue I see is that "_gateway" and possibly some other implicit hostnames will no longer resolve. (They can be manually added to /etc/hosts by admins that need it.)
3) Arch could patch the libnss_myhostname.so library in the systemd package to remove support for the "_gateway" implicit hostname.

