← All Posts

Debugging Mystery Emails: The 5 Whys at Work

Posted by

  tech

26 min read | 7508 words | 423 views | 0 comments

You may have heard of "The Five Whys" before. The concept is commonly used to illustrate the importance of not settling for a surface-level answer when something has gone wrong and instead digging deeper to get to the root cause.

IT professionals are likely to be as familiar with this concept as anybody. When troubleshooting something broken or unusual, troubleshooting problems back to the root cause can involve working back through several problems, not all of which initially seemed related. As a very concrete example, take a stack trace in a core dump file. Stack traces are important since they allow a programmer to see how a program got to the point of failure.

Unfortunately, not all problems come with a stack trace (and even those that do are not always trivial to debug). Recently, I troubleshot a bizarre email related issue that I think illustrates this concept nicely, so I'm sharing it here.

It all started yesterday. For some reason, I started getting emails from a server that looked something like this:

Subject: Postfix SMTP server: errors from hostname[127.0.0.1]

Transcript of session follows.

 Out: 220 hostname ESMTP Postfix (Debian/GNU)
 In:  EHLO hotmail.com
 Out: 250-hostname
 Out: 250-PIPELINING
 Out: 250-SIZE 10240000
 Out: 250-VRFY
 Out: 250-ETRN
 Out: 250-STARTTLS
 Out: 250-ENHANCEDSTATUSCODES
 Out: 250-8BITMIME
 Out: 250-DSN
 Out: 250-SMTPUTF8
 Out: 250 CHUNKING
 In:  STARTTLS
 Out: 220 2.0.0 Ready to start TLS
 In:  EHLO hotmail.com
 Out: 250-hostname
 Out: 250-PIPELINING
 Out: 250-SIZE 10240000
 Out: 250-VRFY
 Out: 250-ETRN
 Out: 250-AUTH DIGEST-MD5 NTLM CRAM-MD5 LOGIN PLAIN
 Out: 250-ENHANCEDSTATUSCODES
 Out: 250-8BITMIME
 Out: 250-DSN
 Out: 250-SMTPUTF8
 Out: 250 CHUNKING
 In:  MAIL FROM:
 Out: 452 4.3.1 Insufficient system storage
 In:  QUIT
 Out: 221 2.0.0 Bye


For other details, see the local mail logfile

Except, I didn't get just one of these, I got about a thousand of time, all sent within the same hour or so but delayed and received in batches due to the sheer volume of them.

Problem #1: Hundreds of Postmaster emails

Okay, so clearly something is wrong here. Fortunately, this seemed simple enough: the email even said Out: 452 4.3.1 Insufficient system storage

This actually made sense, since the server was prone to running out of disk space and sometimes needed a watchful eye on it to make sure that sufficient free disk space remained available. Running df -h on the server confirmed that only 11 MB of disk space remained, so, sure enough, free disk space was the problem.

Or rather, as we will see, it was a problem, but not the problem — perhaps even more a symptom of the problem than a problem itself.

Problem #2: Running out of disk space

Astute readers may have noticed something strange in the email above: where did [email protected] come from?

Hold onto that though. We'll be seeing a lot more of that.

Some manual effort cleared up some disk space, but not sufficiently. It was clear that there was something very abnormal going on — something was consuming a lot of disk space; the question was, what?

Linux has a lot of helpful disk utilities to help for this type of thing. A command I like to use is du -sh * | sort -rh. This will show the size of all the directories in the current directory, sorted from largest to smallest. It's an easy way to determine what's taking up abnormally large amounts of space from the command line. Not as pretty as WinDirStat, but you get used to it (and yes, I know there are similar GNU/Linux programs for that purpose, but these are Linux servers one does not simply run a Linux server with a GUI)

Some poking around revealed that /var/spool/postfix seemed to be particularly large for some reason. A little more poking around revealed a huge number of deferred emails in Postfix, on the order of thousands of emails, almost a GB worth of messages.

Okay, so Postfix was responsible for eating up all the disk space. Clearing out all the deferred messages fixed the issue quickly: postsuper -d ALL

Problem #3: Mysterious Deferred Emails

Clearing out the deferred messages proved to be a temporary fix; the deferred messages started queuing again almost as soon as they'd been sent. It became obvious before long that the server was behaving as if some kind of open mail relay was present. If you know what those are, you know that those are not good. On the contrary, those are very, very bad, and they will kill your server, not so much from a performance perspective, but it can lead to serious email delivery issues in the long run, so accidental open mail relays are every IT admin's worst nightmare.

Confirming this hunch, I ran tail -f /var/log/mail.log

Sure enough, it was scrolling by so fast that the text simply glossed right up off the monitor. Something was definitely not right here.

Not one to consider myself much of a postfix expert, I started looking at some of the postfix configuration, scouring all the options that could possibly be inadvertently turning the mail server into a free-for-all. I looked around on StackOverflow and ServerFault, but there wasn't anything relevant there, so I was on my own here. A perfect challenge.

Some intuition eventually led me to this line:

inet_interfaces = all

This was intentionally: the server was mainly used to send outgoing email, but it did occasionally need to receive incoming email to forward it out to a distribution list. But perhaps allowing that had created some loophole in the process.

inet_interfaces = loopback-only

Looked promising, but no cigar. The email activity was as fast and furious as ever.

Next, I tried changing the smtpd_relay_restrictions.

smtpd_relay_restrictions = permit_mynetworks permit_sasl_authenticated defer_unauth_destination

Surely, this shouldn't be allowing any SASL authenticated clients to relay SMTP traffic, right? Well, I didn't know, but I could find out easily:

smtpd_relay_restrictions = permit_mynetworks reject

Still no change but this was odd. I had mynetworks = 127.0.0.1, so that was ruled out. If it wasn't SASL authenticated, then what could it be?

As one of those "just to see" types of things, I changed it again, to:

smtpd_relay_restrictions = reject

Surprisingly, that did work. Now errors were scrolling by in the mail log. No more open mail relay.

But, if you haven't noticed already, there's a very disturbing detail about what we just noticed. Did you catch it?

The only difference between bad and good was removing permit_mynetworks from the list and mynetworks = 127.0.0.1, which means...

Uh oh.

If you've seen When A Stranger Calls (1978), there's a famous scene at the beginning where the babysitter is told the strange calls are coming from inside the house

While those words are famously tied to that movie, the idea behind the scene was not actually novel; it was used previously in Black Christmas, five years prior, in 1974:

Well, either way, that was the situation here. This was not actually an open mail relay. No, much worse the emails were coming from the server.

Problem #4: Internal Open Mail Relay

Okay, so the problem is more complicated than it initially seemed.

The postfix configuration was apparently not the cause of the issue, so I reverted the relay restrictions, also adding this to queue all outgoing mail but not actually send it, to avoid doing any further damage than what had already been done:

defer_transports = hold
default_transport = hold

It was also quite straightforward to confirm that that the emails were coming from the server.

root@hostname:/etc/postfix# tail -f /var/log/mail.log | grep "client="
Sep  9 14:52:11 voip postfix/smtpd[15418]: rec_put: type A len 20 data log_client
Sep  9 14:52:11 voip postfix/smtpd[15418]: rec_put: type A len 28 data log_client
Sep  9 14:52:11 voip postfix/smtpd[15418]: rec_put: type A len 21 data log_client
Sep  9 14:52:11 voip postfix/smtpd[15418]: rec_put: type A len 16 data client_nam
Sep  9 14:52:11 voip postfix/smtpd[15418]: rec_put: type A len 24 data client_add
Sep  9 14:52:11 voip postfix/smtpd[15418]: rec_put: type A len 17 data client_por
Sep  9 14:52:11 voip postfix/smtpd[15418]: rec_put: type A len 21 data client_add
Sep  9 14:52:11 voip postfix/smtpd[15418]: A409D2EF4C: client=hostname[127.0.0.1]
Sep  9 14:52:11 voip postfix/smtpd[15422]: rec_put: type A len 20 data log_client
Sep  9 14:52:11 voip postfix/smtpd[15422]: rec_put: type A len 28 data log_client
Sep  9 14:52:11 voip postfix/smtpd[15422]: rec_put: type A len 21 data log_client
Sep  9 14:52:11 voip postfix/smtpd[15422]: rec_put: type A len 16 data client_nam
Sep  9 14:52:11 voip postfix/smtpd[15422]: rec_put: type A len 24 data client_add
Sep  9 14:52:11 voip postfix/smtpd[15422]: rec_put: type A len 17 data client_por
Sep  9 14:52:11 voip postfix/smtpd[15422]: rec_put: type A len 21 data client_add
Sep  9 14:52:11 voip postfix/smtpd[15422]: B35122EF4D: client=hostname[127.0.0.1]
Sep  9 14:52:11 voip postfix/smtpd[15421]: rec_put: type A len 20 data log_client
Sep  9 14:52:11 voip postfix/smtpd[15421]: rec_put: type A len 28 data log_client
Sep  9 14:52:11 voip postfix/smtpd[15421]: rec_put: type A len 21 data log_client
Sep  9 14:52:11 voip postfix/smtpd[15421]: rec_put: type A len 16 data client_nam
Sep  9 14:52:11 voip postfix/smtpd[15421]: rec_put: type A len 24 data client_add
Sep  9 14:52:11 voip postfix/smtpd[15421]: rec_put: type A len 17 data client_por
Sep  9 14:52:11 voip postfix/smtpd[15421]: rec_put: type A len 21 data client_add
Sep  9 14:52:11 voip postfix/smtpd[15421]: C24D02EF4E: client=hostname[127.0.0.1]
Sep  9 14:52:11 voip postfix/smtpd[15423]: rec_put: type A len 20 data log_client
Sep  9 14:52:11 voip postfix/smtpd[15423]: rec_put: type A len 28 data log_client
Sep  9 14:52:11 voip postfix/smtpd[15423]: rec_put: type A len 21 data log_client
Sep  9 14:52:11 voip postfix/smtpd[15423]: rec_put: type A len 16 data client_nam
Sep  9 14:52:11 voip postfix/smtpd[15423]: rec_put: type A len 24 data client_add
Sep  9 14:52:11 voip postfix/smtpd[15423]: rec_put: type A len 17 data client_por
Sep  9 14:52:11 voip postfix/smtpd[15423]: rec_put: type A len 21 data client_add
Sep  9 14:52:11 voip postfix/smtpd[15423]: D19082EF4F: client=hostname[127.0.0.1]
Sep  9 14:52:12 voip postfix/smtpd[15418]: rec_put: type A len 20 data log_client
Sep  9 14:52:12 voip postfix/smtpd[15418]: rec_put: type A len 28 data log_client
Sep  9 14:52:12 voip postfix/smtpd[15418]: rec_put: type A len 21 data log_client
Sep  9 14:52:12 voip postfix/smtpd[15418]: rec_put: type A len 16 data client_nam
Sep  9 14:52:12 voip postfix/smtpd[15418]: rec_put: type A len 24 data client_add
Sep  9 14:52:12 voip postfix/smtpd[15418]: rec_put: type A len 17 data client_por
Sep  9 14:52:12 voip postfix/smtpd[15418]: rec_put: type A len 21 data client_add
Sep  9 14:52:12 voip postfix/smtpd[15418]: 57D7E2EF54: client=hostname[127.0.0.1]

The next step was figuring out what local process was actually sending the spammy emails over to postfix. To aid with that, I ran watch on netstat so I could get a live view of the ports that were relevant to the issue.

Every 2.0s: netstat --program --numeric-hosts --numeric-ports --extend | grep -E ":25|postfix|smtp"

tcp        0      0 127.0.0.1:49904         127.0.0.1:25            ESTABLISHED com        25790633   16558/sshd: com
tcp        0      0 127.0.0.1:59990         127.0.0.1:12301         ESTABLISHED postfix    25790104   15418/smtpd
tcp        0      0 127.0.0.1:25            127.0.0.1:49904         ESTABLISHED postfix    25790634   15423/smtpd
tcp        0      0 127.0.0.1:59994         127.0.0.1:12301         ESTABLISHED postfix    25790636   15423/smtpd
tcp        0      0 127.0.0.1:25            127.0.0.1:49898         ESTABLISHED postfix    25789448   15421/smtpd
tcp        0      0 127.0.0.1:25            127.0.0.1:49902         ESTABLISHED postfix    25790586   15422/smtpd
tcp        0      0 127.0.0.1:49900         127.0.0.1:25            ESTABLISHED com        25790101   16558/sshd: com
tcp        0      0 127.0.0.1:49898         127.0.0.1:25            ESTABLISHED com        25789447   16558/sshd: com
tcp        0      0 127.0.0.1:59988         127.0.0.1:12301         ESTABLISHED postfix    25789450   15421/smtpd
tcp        0      0 127.0.0.1:49902         127.0.0.1:25            ESTABLISHED com        25790585   16558/sshd: com
tcp        0      0 127.0.0.1:59992         127.0.0.1:12301         ESTABLISHED postfix    25790588   15422/smtpd
tcp        0      0 127.0.0.1:25            127.0.0.1:49900         ESTABLISHED postfix    25790102   15418/smtpd
unix  2      [ ]         DGRAM                    15382    631/rsyslogd         /var/spool/postfix/dev/log
unix  3      [ ]         STREAM     CONNECTED     25793866 15423/smtpd
unix  3      [ ]         STREAM     CONNECTED     25793874 15422/smtpd
unix  3      [ ]         STREAM     CONNECTED     25790120 15418/smtpd
unix  3      [ ]         STREAM     CONNECTED     25498626 15421/smtpd
unix  2      [ ]         DGRAM                    25498419 15421/smtpd
unix  2      [ ]         DGRAM                    25498311 15418/smtpd
unix  2      [ ]         DGRAM                    25498520 15423/smtpd
unix  2      [ ]         DGRAM                    25498472 15422/smtpd
unix  3      [ ]         STREAM     CONNECTED     25498657 15422/smtpd
unix  3      [ ]         STREAM     CONNECTED     25789475 15421/smtpd
unix  3      [ ]         STREAM     CONNECTED     25790642 15423/smtpd
unix  3      [ ]         STREAM     CONNECTED     25498599 15418/smtpd
unix  3      [ ]         STREAM     CONNECTED     25790631 15422/smtpd
unix  3      [ ]         STREAM     CONNECTED     25498736 15423/smtpd

Something stood out immediately from the above. The postfix entries were normal — after all, that was postfix itself the com entries were less expected. However, it was very clear: all the malicious email was getting sent to postfix by the com user.

Okay, now to see what programs the com user was currently running:

root@hostname:/var/spool/postfix/deferred/0# ps -f -aux | grep "com " | grep -v "grep"
root     16530  0.0  0.7  14600  7580 ?        Ss   14:57   0:00  \_ sshd: com [priv]
com      16558  0.4  0.4  14744  4644 ?        S    14:57   0:03  |   \_ sshd: com
root     16571  0.0  0.7  14676  7496 ?        Ss   14:57   0:00  \_ sshd: com [priv]
com      16595  0.0  0.4  14676  4624 ?        S    14:57   0:00  |   \_ sshd: com
root     16573  0.0  0.7  14676  7532 ?        Ss   14:57   0:00  \_ sshd: com [priv]
com      16586  0.0  0.4  14676  4688 ?        S    14:57   0:00  |   \_ sshd: com
root     16825  0.0  0.7  14672  7992 ?        Ss   14:58   0:00  \_ sshd: com [priv]
com      16856  0.0  0.4  14672  4932 ?        S    14:58   0:00      \_ sshd: com@pts/11
com      16858  0.0  0.3   6844  3376 pts/11   Ss+  14:58   0:00          \_ /bin/bash /home/com/main.sh
com      16536  0.0  0.7  21148  7872 ?        Ss   14:57   0:00 /lib/systemd/systemd --user
com      16537  0.0  0.1 171832  1436 ?        S    14:57   0:00  \_ (sd-pam)

Huh? Something looked very odd here. The com user was a limited-access (no root privileges) utility account used for providing public access to a custom program. So, the fact that some random host was using the com account wasn't in and of itself a concern but if something was using the com account, it should be executing the program that the system was set up to have it execute, not sshd: com, whatever that meant.

To test it out, I logged into the com account myself in a new SSH connection. That's what the /home/com/main.sh entry is from. The other connections were responsible for the spam emails. But they weren't executing a program it wasn't even clear what they were executing at all.

Problem #5: Code being executed by com user

As a sanity check, I then checked what TCP ports were in used by the process IDs shown above. In this case, the parent-level sshd process accepts an incoming SSH connection (the ones with [priv]), it's the children that are executing the program. Since the [priv] connections are the actual Internet-facing processes, checking these should reveal the remote IP addresses.

root@hostname:/var/spool/postfix/deferred/0# lsof -p 16530 | grep "TCP"
sshd    16530 root    3u  IPv4           25639921      0t0      TCP hostname.example.com:ssh->194.49.68.83:8785 (ESTABLISHED)
root@hostname:/var/spool/postfix/deferred/0# lsof -p 16571 | grep "TCP"
sshd    16571 root    3u  IPv4           25641628      0t0      TCP hostname.example.com:ssh->dedi89393.hostsailor.com:36368 (ESTABLISHED)
root@hostname:/var/spool/postfix/deferred/0# lsof -p 16573 | grep "TCP"
sshd    16573 root    3u  IPv4           25641735      0t0      TCP hostname.example.com:ssh->45.77.139.137.vultrusercontent.com:34294 (ESTABLISHED)

Okay, so for the first time, I had actual malicious IP addresses to go along with the malicious email spam. This wasn't something postfix had been able to provide, because postfix was receiving everything through localhost, not from the Internet.

Unfortunately, I wasn't any closer to figuring out the program responsible for the activity. So, next it fell to good old strace to see if we could get some clues that way. strace is a handy tool that lets you peek at what system calls a process is executing under the hood.

root@hostname:/var/log# strace -p 16558
strace: Process 16558 attached
select(15, [3 4 7 10 11 12], [], NULL, {tv_sec=119, tv_usec=970651}) = 1 (in [3], left {tv_sec=119, tv_usec=942260})
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
read(3, "\202\277\267\373AG\240\234F\335x\256\31{\307\306F\22F\2213\240\37K\252\270U\275M\306\325\243"..., 16384) = 116
select(15, [3 4 7 10 11 12], [12], NULL, {tv_sec=120, tv_usec=0}) = 1 (out [12], left {tv_sec=119, tv_usec=999994})
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
write(12, "\27\3\3\0B\37\236%/\340V4\300\261\230\23\224d\237\376\25\360\32\373\227+\311Z\351\357,\302"..., 71) = 71
select(15, [3 4 7 10 11 12], [], NULL, {tv_sec=120, tv_usec=0}) = 1 (in [12], left {tv_sec=119, tv_usec=999984})
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
read(12, "\27\3\3\0&\373\22F\24y\vt\205v\177&yO#\316\366\24!\nhU\26\264\4\354\321E"..., 16384) = 43
getpid()                                = 16558
select(15, [3 4 7 10 11 12], [3], NULL, {tv_sec=120, tv_usec=0}) = 1 (out [3], left {tv_sec=119, tv_usec=999997})
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
write(3, "\254\37\313\177C(Z\34\350G\37\302\225\33:^\376\360)\227\26\216\273igL\315\216\270h`\246"..., 84) = 84
select(15, [3 4 7 10 11 12], [], NULL, {tv_sec=120, tv_usec=0}) = 1 (in [11], left {tv_sec=119, tv_usec=999996})
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
read(11, "\27\3\3\0<\341sz\211\363\232\353B\240\327\362\217\234\274\376\250(\302\330\263\204m%\271.\32m"..., 16384) = 65
getpid()                                = 16558
select(15, [3 4 7 10 11 12], [3], NULL, {tv_sec=120, tv_usec=0}) = 1 (out [3], left {tv_sec=119, tv_usec=999998})
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
write(3, "d\332\331w\0064}\204\27\260\\\261Cs \253\234\270H\234(\231k\33\257\245\0\340q\251m\201"..., 100) = 100
select(15, [3 4 7 10 11 12], [], NULL, {tv_sec=120, tv_usec=0}) = 1 (in [3], left {tv_sec=119, tv_usec=965488})
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
read(3, "T8 fC\376\260\\u/\327}\353\2407\223\214\31TT\374\376Sx=S\252\375`\350\334\301"..., 16384) = 108
select(15, [3 4 7 10 11 12], [4], NULL, {tv_sec=120, tv_usec=0}) = 1 (out [4], left {tv_sec=119, tv_usec=999995})
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
write(4, "\27\3\3\0\36\276\356!\254\312\264\346we\312\305\321\250\301\316\337XhFG\256-\240\362\371\tB"..., 35) = 35
select(15, [3 4 7 10 11 12], [], NULL, {tv_sec=120, tv_usec=0}) = 1 (in [4], left {tv_sec=119, tv_usec=999996})
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
read(4, "\27\3\3\0&\362\231[\7\343\243\240\345\363\337\261\310Dh\246\360E\177\333G\226d\f\263\344w_"..., 16384) = 43
getpid()                                = 16558
select(15, [3 4 7 10 11 12], [3], NULL, {tv_sec=120, tv_usec=0}) = 1 (out [3], left {tv_sec=119, tv_usec=999998})
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
write(3, "\350\31\352N\33060\217\246\f\205\221u\216\252^\366\226\255\355\6yx\372\2025\37\344\32\264\275L"..., 84) = 84
select(15, [3 4 7 10 11 12], [], NULL, {tv_sec=120, tv_usec=0}) = 1 (in [3], left {tv_sec=119, tv_usec=980624})
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
read(3, "\261\v\353A,\0237\236\263\266\376\17WI-\r@\307g\27\315\333l\260\235\265\371,\332\351\375\211"..., 16384) = 16384
select(15, [3 4 7 10 11 12], [], NULL, {tv_sec=120, tv_usec=0}) = 1 (in [3], left {tv_sec=119, tv_usec=999995})
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
read(3, "6W\314\303JhX\320\263\254\314\371\265\310\f3\213\372{\245\300\372\337\326\272\307\240\223\0\3242U"..., 16384) = 188
select(15, [3 4 7 10 11 12], [10], NULL, {tv_sec=120, tv_usec=0}) = 1 (out [10], left {tv_sec=119, tv_usec=999996})
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
write(10, "\27\3\3\0O\313\257JZ\243\270\315\324\327\342\224fl\343\254B\r\250\333\22\362\235\226\201\377\376}"..., 16529) = 16529
select(15, [3 4 7 10 11 12], [], NULL, {tv_sec=120, tv_usec=0}) = 1 (in [10], left {tv_sec=119, tv_usec=950446})
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
read(10, "\27\3\3\0<\371H\203\317V\376v\250*K\252<\17\235\r\274\335\220\342\255\260\33.\33`G\377"..., 16384) = 65
getpid()                                = 16558
select(15, [3 4 7 10 11 12], [3], NULL, {tv_sec=120, tv_usec=0}) = 1 (out [3], left {tv_sec=119, tv_usec=999996})
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
write(3, "\377\226x\343\v~\3725\36\235p\364U?\331~\6\276\276 \327\240\34Q\315Y\277\354\271\241T\5"..., 100) = 100
select(15, [3 4 7 10 11 12], [], NULL, {tv_sec=120, tv_usec=0}) = 1 (in [3], left {tv_sec=119, tv_usec=994338})
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
read(3, "4\242\307\271s\237\333l\361\212\240\24\304\307\35\352\t\233\350\r\263TI\0EW\4\363i\36\320\222"..., 16384) = 224
select(15, [3 4 7 10 11 12], [4 12], NULL, {tv_sec=120, tv_usec=0}) = 2 (out [4 12], left {tv_sec=119, tv_usec=999996})
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
write(4, "\27\3\3\0B\276\356!\254\312\264\346x\233\223\253\34\1h?\354Dw\241\ta\31u\3\357\326\35"..., 71) = 71
write(12, "\27\3\3\0003\37\236%/\340V4\301\310\20!\271s\21\v3+&EE\21-\302\204E\251\n"..., 56) = 56
select(15, [3 4 7 10 11 12], [], NULL, {tv_sec=120, tv_usec=0}) = 1 (in [4], left {tv_sec=119, tv_usec=999997})
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
read(4, "\27\3\3\0&\362\231[\7\343\243\240\346\277N\272\346P\t\375#\275\312\306@|\6`\304\16dg"..., 16384) = 43
getpid()                                = 16558
select(15, [3 4 7 10 11 12], [3], NULL, {tv_sec=120, tv_usec=0}) = 1 (out [3], left {tv_sec=119, tv_usec=999998})
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
write(3, "\330\232e\"\212l.\222\22O\3\2\301\22\203\252(V/\2138/\355:\n\317u\263\244\322\240E"..., 84) = 84
select(15, [3 4 7 10 11 12], [], NULL, {tv_sec=120, tv_usec=0}) = 1 (in [12], left {tv_sec=119, tv_usec=998094})
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
read(12, "\27\3\3\0&\373\22F\24y\vt\206\r)=d\235\7\303S\270x\243+2\241EE?\352,"..., 16384) = 43
getpid()                                = 16558
select(15, [3 4 7 10 11 12], [3], NULL, {tv_sec=120, tv_usec=0}) = 1 (out [3], left {tv_sec=119, tv_usec=999998})
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
write(3, "\316\266>(&\256\356t\320\305\262\246\241G\t\227}\354\320\340\303\241\273i:3\334\314\301\351/\t"..., 84) = 84
select(15, [3 4 7 10 11 12], [], NULL, {tv_sec=120, tv_usec=0}) = 1 (in [3], left {tv_sec=119, tv_usec=944923})
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
read(3, "\3622i>6\307\330\240\333\2\316\216\20366\30\4\355hX\344\205E%\261\261&$vQ/\326"..., 16384) = 108
select(15, [3 4 7 10 11 12], [11], NULL, {tv_sec=120, tv_usec=0}) = 1 (out [11], left {tv_sec=119, tv_usec=999993})
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
write(11, "\27\3\3\0\36\220\277\356\242\302\215\256\3\377\353\302\345\247\251x\224\215#?\211\231\275\244\371\220\24\225"..., 35) = 35
select(15, [3 4 7 10 11 12], [], NULL, {tv_sec=120, tv_usec=0}) = 1 (in [11], left {tv_sec=119, tv_usec=999995})
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
read(11, "\27\3\3\0&\341sz\211\363\232\353C\253c\370\347E6\356\365\364\322Dn\2256\252'f`\217"..., 16384) = 43
getpid()                                = 16558
select(15, [3 4 7 10 11 12], [3], NULL, {tv_sec=120, tv_usec=0}) = 1 (out [3], left {tv_sec=119, tv_usec=999988})
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
write(3, "\323\261\250p:\352\26\3239\273\333\3\31\332\333\340(\203~\241'\255a`e\351\303\243b~\267\337"..., 84) = 84
select(15, [3 4 7 10 11 12], [], NULL, {tv_sec=120, tv_usec=0}) = 1 (in [3], left {tv_sec=119, tv_usec=940704})
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
read(3, "6\0201\200\362O\205\321a\32\342\35P\vT\221\224\26\211\302\211\273ye\344:\314\354\n\177\316?"..., 16384) = 216
select(15, [3 4 7 10 11 12], [4 12], NULL, {tv_sec=120, tv_usec=0}) = 2 (out [4 12], left {tv_sec=119, tv_usec=999996})
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0

Okay, so that probably looks like a bunch of gobble-de-gook, but that provided a very useful insight: whatever this process was doing, it was doing a lot of I/O: the write, select, and read calls are a dead giveaway on that.

Now at this point, I actually had identified the process IDs responsible for the email traffic. I could shut them down if I wanted to:

Every 0.5s: netstat --program --numeric-hosts --numeric-ports --extend | grep -E ":25|postfix|smtp"

tcp        0      0 127.0.0.1:25            127.0.0.1:35922         ESTABLISHED postfix    26597211   20974/smtpd
tcp        0      0 127.0.0.1:56342         127.0.0.1:12301         ESTABLISHED postfix    26594722   18121/smtpd
tcp        0      0 127.0.0.1:49614         127.0.0.1:25            ESTABLISHED com        26591390   16558/sshd: com
tcp        0      0 127.0.0.1:35920         127.0.0.1:25            ESTABLISHED com        26595494   16558/sshd: com
tcp        0      0 127.0.0.1:25            127.0.0.1:49614         ESTABLISHED postfix    26591391   20949/smtpd
tcp        0      0 127.0.0.1:35918         127.0.0.1:25            ESTABLISHED com        26594719   16558/sshd: com
tcp        0      0 127.0.0.1:42668         127.0.0.1:12301         ESTABLISHED postfix    26591393   20949/smtpd
tcp        0      0 127.0.0.1:56346         127.0.0.1:12301         ESTABLISHED postfix    26597213   20974/smtpd
tcp        0      0 127.0.0.1:25            127.0.0.1:35918         ESTABLISHED postfix    26594720   18121/smtpd
tcp        0      0 127.0.0.1:25            127.0.0.1:35920         ESTABLISHED postfix    26595495   21028/smtpd
tcp        0      0 127.0.0.1:35922         127.0.0.1:25            ESTABLISHED com        26597210   16558/sshd: com
tcp        0      0 127.0.0.1:56344         127.0.0.1:12301         ESTABLISHED postfix    26595497   21028/smtpd
unix  2      [ ]         DGRAM                    15382    631/rsyslogd         /var/spool/postfix/dev/log
unix  3      [ ]         STREAM     CONNECTED     26597222 20974/smtpd
unix  3      [ ]         STREAM     CONNECTED     26594747 18121/smtpd
unix  2      [ ]         DGRAM                    26420926 21028/smtpd
unix  3      [ ]         STREAM     CONNECTED     26421401 21028/smtpd
unix  3      [ ]         STREAM     CONNECTED     26417626 20974/smtpd
unix  2      [ ]         DGRAM                    26411865 20949/smtpd
unix  2      [ ]         DGRAM                    26417471 20974/smtpd
unix  2      [ ]         DGRAM                    25900529 18121/smtpd
unix  3      [ ]         STREAM     CONNECTED     26595518 21028/smtpd
unix  3      [ ]         STREAM     CONNECTED     26412269 20949/smtpd
unix  3      [ ]         STREAM     CONNECTED     26591420 20949/smtpd
unix  3      [ ]         STREAM     CONNECTED     26398758 18121/smtpd


kill -9 16558


Every 0.5s: netstat --program --numeric-hosts --numeric-ports --extend | grep -E ":25|postfix|smtp"

tcp        0      0 127.0.0.1:25            127.0.0.1:49188         ESTABLISHED postfix    26745049   20974/smtpd
tcp        0      0 127.0.0.1:49188         127.0.0.1:25            ESTABLISHED com        26745048   22882/sshd: com
tcp        0      0 127.0.0.1:36106         127.0.0.1:25            TIME_WAIT   root       0          -
tcp        0      0 127.0.0.1:49182         127.0.0.1:25            ESTABLISHED com        26744634   22882/sshd: com
tcp        0      0 127.0.0.1:36104         127.0.0.1:25            TIME_WAIT   root       0          -
tcp        0      0 127.0.0.1:25            127.0.0.1:49186         ESTABLISHED postfix    26744657   21028/smtpd
tcp        0      0 127.0.0.1:25            127.0.0.1:49182         ESTABLISHED postfix    26744635   18121/smtpd
tcp        0      0 127.0.0.1:25            127.0.0.1:49184         ESTABLISHED postfix    26744650   20949/smtpd
tcp        0      0 127.0.0.1:58192         127.0.0.1:12301         ESTABLISHED postfix    26744637   18121/smtpd
tcp        0      0 127.0.0.1:49186         127.0.0.1:25            ESTABLISHED com        26744656   22882/sshd: com
tcp        0      0 127.0.0.1:58196         127.0.0.1:12301         ESTABLISHED postfix    26744659   21028/smtpd
tcp        0      0 127.0.0.1:58194         127.0.0.1:12301         ESTABLISHED postfix    26744652   20949/smtpd
tcp        0      0 127.0.0.1:49184         127.0.0.1:25            ESTABLISHED com        26744649   22882/sshd: com
tcp        0      0 127.0.0.1:45344         127.0.0.1:25            TIME_WAIT   root       0          -
tcp        0      0 127.0.0.1:58198         127.0.0.1:12301         ESTABLISHED postfix    26745051   20974/smtpd
tcp        0      0 127.0.0.1:36102         127.0.0.1:25            TIME_WAIT   root       0          -
unix  2      [ ]         DGRAM                    15382    631/rsyslogd         /var/spool/postfix/dev/log
unix  3      [ ]         STREAM     CONNECTED     26744681 20949/smtpd
unix  2      [ ]         DGRAM                    26420926 21028/smtpd
unix  3      [ ]         STREAM     CONNECTED     26421401 21028/smtpd
unix  3      [ ]         STREAM     CONNECTED     26417626 20974/smtpd
unix  2      [ ]         DGRAM                    26411865 20949/smtpd
unix  2      [ ]         DGRAM                    26417471 20974/smtpd
unix  2      [ ]         DGRAM                    25900529 18121/smtpd
unix  3      [ ]         STREAM     CONNECTED     26412269 20949/smtpd
unix  3      [ ]         STREAM     CONNECTED     26744661 18121/smtpd
unix  3      [ ]         STREAM     CONNECTED     26744721 21028/smtpd
unix  3      [ ]         STREAM     CONNECTED     26745054 20974/smtpd
unix  3      [ ]         STREAM     CONNECTED     26398758 18121/smtpd

Okay, so this malicious mailer didn't like getting shutdown... notice that 16558 was simply replaced by 22882 — this thing, whatever it was, simply respawned itself, much like a virus. Hardly comforting, but details matter.

The next step was to see what was appearing in the authentication logs whenever these connections occured. I was sure that something had to be there, and perhaps there would be other clues.

root@hostname:/var/log# tail -f /var/log/auth.log
Sep  9 15:37:21 hostname login[28682]: pam_securetty(login:auth): access denied: tty '/dev/pts/2' is not secure !
Sep  9 15:37:21 hostname sshd[22872]: pam_unix(sshd:session): session closed for user com
Sep  9 15:37:21 hostname systemd-logind[537]: Session 16536 logged out. Waiting for processes to exit.
Sep  9 15:37:21 hostname systemd-logind[537]: Removed session 16536.
Sep  9 15:37:22 hostname login[28584]: FAILED LOGIN (5) on '/dev/pts/9' from '49.89.95.5' FOR 'root', Authentication failure
Sep  9 15:37:22 hostname login[28584]: TOO MANY LOGIN TRIES (5) on '/dev/pts/9' from '49.89.95.5' FOR 'root'
Sep  9 15:37:22 hostname login[28584]: pam_mail(login:session): pam_putenv: delete non-existent entry; MAIL
Sep  9 15:37:22 hostname login[28584]: pam_unix(login:session): session closed for user root
Sep  9 15:37:22 hostname xinetd[931]: START: telnet pid=28696 from=::ffff:49.89.95.5
Sep  9 15:37:23 hostname login[28687]: pam_securetty(login:auth): access denied: tty '/dev/pts/1' is not secure !
Sep  9 15:37:23 hostname login[28682]: FAILED LOGIN (1) on '/dev/pts/2' from 'host-79-3-72-208.business.telecomitalia.it' FOR 'root', Authentication failure
Sep  9 15:37:23 hostname login[28682]: pam_securetty(login:auth): access denied: tty '/dev/pts/2' is not secure !
Sep  9 15:37:23 hostname sshd[28702]: Accepted password for com from 194.49.68.83 port 9432 ssh2
Sep  9 15:37:23 hostname sshd[28702]: pam_unix(sshd:session): session opened for user com by (uid=0)
Sep  9 15:37:23 hostname systemd-logind[537]: New session 16566 of user com.
Sep  9 15:37:24 hostname sshd[28712]: error: bind [127.0.0.1]:25: Permission denied
Sep  9 15:37:24 hostname sshd[28712]: error: channel_setup_fwd_listener_tcpip: cannot listen to port: 25
Sep  9 15:37:25 hostname login[28697]: pam_securetty(login:auth): access denied: tty '/dev/pts/9' is not secure !
Sep  9 15:37:25 hostname login[28687]: FAILED LOGIN (1) on '/dev/pts/1' from '222.246.109.152' FOR 'root', Authentication failure
Sep  9 15:37:25 hostname login[28687]: pam_securetty(login:auth): access denied: tty '/dev/pts/1' is not secure !
Sep  9 15:37:27 hostname login[28682]: FAILED LOGIN (2) on '/dev/pts/2' from 'host-79-3-72-208.business.telecomitalia.it' FOR 'root', Authentication failure
Sep  9 15:37:27 hostname login[28682]: pam_securetty(login:auth): access denied: tty '/dev/pts/2' is not secure !
Sep  9 15:37:29 hostname login[28697]: FAILED LOGIN (1) on '/dev/pts/9' from '49.89.95.5' FOR 'root', Authentication failure
Sep  9 15:37:29 hostname login[28687]: FAILED LOGIN (2) on '/dev/pts/1' from '222.246.109.152' FOR 'root', Authentication failure
Sep  9 15:37:29 hostname login[28697]: pam_securetty(login:auth): access denied: tty '/dev/pts/9' is not secure !
Sep  9 15:37:29 hostname login[28687]: pam_securetty(login:auth): access denied: tty '/dev/pts/1' is not secure !
Sep  9 15:37:31 hostname login[28682]: FAILED LOGIN (3) on '/dev/pts/2' from 'host-79-3-72-208.business.telecomitalia.it' FOR 'root', Authentication failure
Sep  9 15:37:31 hostname login[28682]: pam_securetty(login:auth): access denied: tty '/dev/pts/2' is not secure !
Sep  9 15:37:31 hostname login[28697]: FAILED LOGIN (2) on '/dev/pts/9' from '49.89.95.5' FOR 'root', Authentication failure
Sep  9 15:37:32 hostname login[28687]: FAILED LOGIN (3) on '/dev/pts/1' from '222.246.109.152' FOR 'root', Authentication failure
Sep  9 15:37:32 hostname login[28697]: pam_securetty(login:auth): access denied: tty '/dev/pts/9' is not secure !
Sep  9 15:37:32 hostname login[28687]: pam_securetty(login:auth): access denied: tty '/dev/pts/1' is not secure !
Sep  9 15:37:34 hostname login[28682]: FAILED LOGIN (4) on '/dev/pts/2' from 'host-79-3-72-208.business.telecomitalia.it' FOR 'root', Authentication failure
Sep  9 15:37:35 hostname login[28682]: pam_securetty(login:auth): access denied: tty '/dev/pts/2' is not secure !

Most of that is irrelevant, but there are a few important lines:

Sep  9 15:37:21 hostname sshd[22872]: pam_unix(sshd:session): session closed for user com
Sep  9 15:37:21 hostname systemd-logind[537]: Session 16536 logged out. Waiting for processes to exit.
Sep  9 15:37:21 hostname systemd-logind[537]: Removed session 16536.

That happened as a result of running kill -9 on one of the malicious processes, thus disconnecting it. The next bit is due to a new authentication attempt:

Sep  9 15:37:23 hostname sshd[28702]: Accepted password for com from 194.49.68.83 port 9432 ssh2
Sep  9 15:37:23 hostname sshd[28702]: pam_unix(sshd:session): session opened for user com by (uid=0)
Sep  9 15:37:23 hostname systemd-logind[537]: New session 16566 of user com.
Sep  9 15:37:24 hostname sshd[28712]: error: bind [127.0.0.1]:25: Permission denied

Again, the password for the com account is publicly known, so the fact that it authenticated successfully wasn't surprising (although as a general rule of thumb, it's probably best to isolate services like these from other systems, more so than was the case here).

My immediate thought was perhaps there was a bug in the program executed by the com user that allowed a malicious user to somehow "escape" into a system shell and run arbitrary commands. To test that, I added a sleep to the beginning so that nothing could be done before the sleep timer expired. Yet, new connections were still doing socket I/O as soon as I ran strace on newly spawned processes:

root@hostname:~# kill -9 30340
root@hostname:~# strace -p 32109
strace: Process 32109 attached
select(13, [3 4 7 10 11 12], [], NULL, {tv_sec=119, tv_usec=944623}) = 1 (in [3], left {tv_sec=119, tv_usec=940748})
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
read(3, "\322\265\231?\212\\\22V\314\341\331\10\210|\256&T\203\266.\347\3741\317\2\326J\3572\24u\320"..., 16384) = 224
select(13, [3 4 7 10 11 12], [4 10], NULL, {tv_sec=120, tv_usec=0}) = 2 (out [4 10], left {tv_sec=119, tv_usec=999996})
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
write(4, "\27\3\3\0006\201L\303\315P}\233\355*\177kb\231Z\r>\326\252\237\215\316\343\271[\324S\203"..., 59) = 59
write(10, "\27\3\3\0B\262\344\357\316\375\211,\205\214&\335\246\244\302\v\365\227\177+.\323,=\256TsR"..., 71) = 71
select(13, [3 4 7 10 11 12], [], NULL, {tv_sec=120, tv_usec=0}) = 1 (in [10], left {tv_sec=119, tv_usec=999994})
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
read(10, "\27\3\3\0&\306\207A\r\341\256)\325\0\236*\202\255i\r\211fbj"..., 16384) = 43
getpid()                                = 32109
select(13, [3 4 7 10 11 12], [3], NULL, {tv_sec=120, tv_usec=0}) = 1 (out [3], left {tv_sec=119, tv_usec=999995})
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0

So, that led to one conclusion: the process wasn't executing the program at all it was doing "something else", whatever that was.

Problem #6: Malicious Processes Doing "Something Else"

To recap, here's the state of what we know, pretty much:

root@hostname:~# cat /proc/32109/cmdline
sshd: com

root@hostname:~# ls -la /proc/32109/exe
lrwxrwxrwx 1 root root 0 Sep  9 15:50 /proc/32109/exe -> /usr/sbin/sshd

root@hostname:/var/log# ps -f -aux | grep "com " | grep -v "grep"
root     16571  0.0  0.6  14676  6144 ?        Ss   14:57   0:00  \_ sshd: com [priv]
com      16595  0.0  0.3  14676  3652 ?        S    14:57   0:00  |   \_ sshd: com
root     16573  0.0  0.5  14676  5852 ?        Ss   14:57   0:00  \_ sshd: com [priv]
com      16586  0.0  0.3  14676  3340 ?        S    14:57   0:00  |   \_ sshd: com
root     28283  0.0  0.6  14672  6864 ?        Ss   15:36   0:00  \_ sshd: com [priv]
com      28310  0.0  0.4  14672  4048 ?        S    15:36   0:00  |   \_ sshd: com@pts/12
com      28311  0.0  0.3   6844  3232 pts/12   Ss+  15:36   0:00  |       \_ /bin/bash /home/com/main.sh
root     32099  0.0  0.7  14600  7544 ?        Ss   15:45   0:00  \_ sshd: com [priv]
com      32109  0.5  0.5  14740  5316 ?        S    15:45   0:02      \_ sshd: com
com      16536  0.0  0.5  21148  5312 ?        Ss   14:57   0:00 /lib/systemd/systemd --user
com      16537  0.0  0.0 171832   608 ?        S    14:57   0:00  \_ (sd-pam)

root@hostname:~# ls -la /proc/32099/exe
lrwxrwxrwx 1 root root 0 Sep  9 15:54 /proc/32099/exe -> /usr/sbin/sshd
root@hostname:~# ls -la /proc/32109/exe
lrwxrwxrwx 1 root root 0 Sep  9 15:53 /proc/32109/exe -> /usr/sbin/sshd
root@hostname:~# cat /proc/32099/cmdline
sshd: com [priv]root@hostname:~#
root@hostname:~#
root@hostname:~# cat /proc/32109/cmdline
sshd: comroot@hostname:~#
root@hostname:~#

Let's go back to the auth.log on new connections:

Sep  9 15:57:37 hostname sshd[5833]: Accepted password for com from 194.49.68.83 port 9789 ssh2
Sep  9 15:57:37 hostname sshd[5833]: pam_unix(sshd:session): session opened for user com by (uid=0)
Sep  9 15:57:37 hostname systemd-logind[537]: New session 16606 of user com.
Sep  9 15:57:38 hostname sshd[5843]: error: bind [127.0.0.1]:25: Permission denied
Sep  9 15:57:38 hostname sshd[5843]: error: channel_setup_fwd_listener_tcpip: cannot listen to port: 25

So, something logs in as the com user and then tries to do something with port 25, which happens to be the SMTP port. Coincidence? I think not!

It's as if it fails to listen to port 25 and directly send out email that way, so it instead finds a way to talk to the process running on port 25 already:

write(4, "\27\3\3\0B\276\356!\254\312\264\346x\233\223\253\34\1h?\354Dw\241\ta\31u\3\357\326\35"..., 71) = 71
write(12, "\27\3\3\0003\37\236%/\340V4\301\310\20!\271s\21\v3+&EE\21-\302\204E\251\n"..., 56) = 56
select(15, [3 4 7 10 11 12], [], NULL, {tv_sec=120, tv_usec=0}) = 1 (in [4], left {tv_sec=119, tv_usec=999997})

This gave me an idea: let's take a look at the file descriptors used by one of these questionable processes:

root@hostname:/var/log# strace -p 32621
strace: Process 32621 attached
restart_syscall(<... resuming interrupted read ...>) = 1
read(9, "\27\3\3\0A", 5)                = 5
read(9, "\320J\224j_R\343G\214\v\211\307\323o)\263\267\202\351\35\f\324A7I$\340\315*\226M\276"..., 65) = 65
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 76, MSG_NOSIGNAL, NULL, 0) = 76
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 100, MSG_NOSIGNAL, NULL, 0) = 100
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 94, MSG_NOSIGNAL, NULL, 0) = 94
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 94, MSG_NOSIGNAL, NULL, 0) = 94
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 121, MSG_NOSIGNAL, NULL, 0) = 121
poll([{fd=17, events=POLLIN}], 1, 0)    = 0 (Timeout)
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 100, MSG_NOSIGNAL, NULL, 0) = 100
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 69, MSG_NOSIGNAL, NULL, 0) = 69
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 64, MSG_NOSIGNAL, NULL, 0) = 64
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 90, MSG_NOSIGNAL, NULL, 0) = 90
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 77, MSG_NOSIGNAL, NULL, 0) = 77
poll([{fd=17, events=POLLOUT}], 1, 3600000) = 1 ([{fd=17, revents=POLLOUT}])
write(17, "request\0rewrite\0rule\0local\0addre"..., 65) = 65
poll([{fd=17, events=POLLIN}], 1, 3600000) = 1 ([{fd=17, revents=POLLIN}])
read(17, "flags\0000\0address\0hanifiarslan2000"..., 4096) = 46
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 77, MSG_NOSIGNAL, NULL, 0) = 77
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 89, MSG_NOSIGNAL, NULL, 0) = 89
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 69, MSG_NOSIGNAL, NULL, 0) = 69
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 66, MSG_NOSIGNAL, NULL, 0) = 66
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 91, MSG_NOSIGNAL, NULL, 0) = 91
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 71, MSG_NOSIGNAL, NULL, 0) = 71
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 93, MSG_NOSIGNAL, NULL, 0) = 93
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 101, MSG_NOSIGNAL, NULL, 0) = 101
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 69, MSG_NOSIGNAL, NULL, 0) = 69
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 123, MSG_NOSIGNAL, NULL, 0) = 123
poll([{fd=17, events=POLLIN}], 1, 0)    = 0 (Timeout)
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 100, MSG_NOSIGNAL, NULL, 0) = 100
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 69, MSG_NOSIGNAL, NULL, 0) = 69
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 64, MSG_NOSIGNAL, NULL, 0) = 64
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 91, MSG_NOSIGNAL, NULL, 0) = 91
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 77, MSG_NOSIGNAL, NULL, 0) = 77
poll([{fd=17, events=POLLOUT}], 1, 3600000) = 1 ([{fd=17, revents=POLLOUT}])
write(17, "request\0rewrite\0rule\0local\0addre"..., 66) = 66
poll([{fd=17, events=POLLIN}], 1, 3600000) = 1 ([{fd=17, revents=POLLIN}])
read(17, "flags\0000\0address\0david.stickland1"..., 4096) = 47
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 77, MSG_NOSIGNAL, NULL, 0) = 77
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 89, MSG_NOSIGNAL, NULL, 0) = 89
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 69, MSG_NOSIGNAL, NULL, 0) = 69
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 66, MSG_NOSIGNAL, NULL, 0) = 66
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 91, MSG_NOSIGNAL, NULL, 0) = 91
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 71, MSG_NOSIGNAL, NULL, 0) = 71
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 94, MSG_NOSIGNAL, NULL, 0) = 94
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 101, MSG_NOSIGNAL, NULL, 0) = 101
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 69, MSG_NOSIGNAL, NULL, 0) = 69
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 125, MSG_NOSIGNAL, NULL, 0) = 125
poll([{fd=17, events=POLLIN}], 1, 0)    = 0 (Timeout)
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 100, MSG_NOSIGNAL, NULL, 0) = 100
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 69, MSG_NOSIGNAL, NULL, 0) = 69
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 89, MSG_NOSIGNAL, NULL, 0) = 89
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 91, MSG_NOSIGNAL, NULL, 0) = 91
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 77, MSG_NOSIGNAL, NULL, 0) = 77
poll([{fd=17, events=POLLOUT}], 1, 3600000) = 1 ([{fd=17, revents=POLLOUT}])
write(17, "request\0resolve\0sender\0hanifiars"..., 91) = 91
poll([{fd=17, events=POLLIN}], 1, 3600000) = 1 ([{fd=17, revents=POLLIN}])
read(17, "flags\0000\0transport\0hold\0nexthop\0n"..., 4096) = 96
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 77, MSG_NOSIGNAL, NULL, 0) = 77
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 89, MSG_NOSIGNAL, NULL, 0) = 89
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 69, MSG_NOSIGNAL, NULL, 0) = 69
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 66, MSG_NOSIGNAL, NULL, 0) = 66
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 93, MSG_NOSIGNAL, NULL, 0) = 93
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 73, MSG_NOSIGNAL, NULL, 0) = 73
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 69, MSG_NOSIGNAL, NULL, 0) = 69
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 91, MSG_NOSIGNAL, NULL, 0) = 91
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 71, MSG_NOSIGNAL, NULL, 0) = 71
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 77, MSG_NOSIGNAL, NULL, 0) = 77
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 93, MSG_NOSIGNAL, NULL, 0) = 93
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 73, MSG_NOSIGNAL, NULL, 0) = 73
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 94, MSG_NOSIGNAL, NULL, 0) = 94
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 89, MSG_NOSIGNAL, NULL, 0) = 89
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 69, MSG_NOSIGNAL, NULL, 0) = 69
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 69, MSG_NOSIGNAL, NULL, 0) = 69
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 101, MSG_NOSIGNAL, NULL, 0) = 101
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 69, MSG_NOSIGNAL, NULL, 0) = 69
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 216, MSG_NOSIGNAL, NULL, 0) = 216
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 133, MSG_NOSIGNAL, NULL, 0) = 133
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 130, MSG_NOSIGNAL, NULL, 0) = 130
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 86, MSG_NOSIGNAL, NULL, 0) = 86
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 80, MSG_NOSIGNAL, NULL, 0) = 80
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 75, MSG_NOSIGNAL, NULL, 0) = 75
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 87, MSG_NOSIGNAL, NULL, 0) = 87
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 92, MSG_NOSIGNAL, NULL, 0) = 92
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 109, MSG_NOSIGNAL, NULL, 0) = 109
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 87, MSG_NOSIGNAL, NULL, 0) = 87
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 101, MSG_NOSIGNAL, NULL, 0) = 101
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 89, MSG_NOSIGNAL, NULL, 0) = 89
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 84, MSG_NOSIGNAL, NULL, 0) = 84
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 92, MSG_NOSIGNAL, NULL, 0) = 92
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 140, MSG_NOSIGNAL, NULL, 0) = 140
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 119, MSG_NOSIGNAL, NULL, 0) = 119
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 105, MSG_NOSIGNAL, NULL, 0) = 105
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 105, MSG_NOSIGNAL, NULL, 0) = 105
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 91, MSG_NOSIGNAL, NULL, 0) = 91
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 108, MSG_NOSIGNAL, NULL, 0) = 108
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 96, MSG_NOSIGNAL, NULL, 0) = 96
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 82, MSG_NOSIGNAL, NULL, 0) = 82
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 103, MSG_NOSIGNAL, NULL, 0) = 103
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 102, MSG_NOSIGNAL, NULL, 0) = 102
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 109, MSG_NOSIGNAL, NULL, 0) = 109
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 105, MSG_NOSIGNAL, NULL, 0) = 105
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 105, MSG_NOSIGNAL, NULL, 0) = 105
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 91, MSG_NOSIGNAL, NULL, 0) = 91
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 108, MSG_NOSIGNAL, NULL, 0) = 108
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 96, MSG_NOSIGNAL, NULL, 0) = 96
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 82, MSG_NOSIGNAL, NULL, 0) = 82
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 93, MSG_NOSIGNAL, NULL, 0) = 93
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 102, MSG_NOSIGNAL, NULL, 0) = 102
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 113, MSG_NOSIGNAL, NULL, 0) = 113
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 105, MSG_NOSIGNAL, NULL, 0) = 105
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 105, MSG_NOSIGNAL, NULL, 0) = 105
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 91, MSG_NOSIGNAL, NULL, 0) = 91
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 108, MSG_NOSIGNAL, NULL, 0) = 108
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 96, MSG_NOSIGNAL, NULL, 0) = 96
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 82, MSG_NOSIGNAL, NULL, 0) = 82
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 97, MSG_NOSIGNAL, NULL, 0) = 97
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 102, MSG_NOSIGNAL, NULL, 0) = 102
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 85, MSG_NOSIGNAL, NULL, 0) = 85
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 66, MSG_NOSIGNAL, NULL, 0) = 66
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 76, MSG_NOSIGNAL, NULL, 0) = 76
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 140, MSG_NOSIGNAL, NULL, 0) = 140
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 101, MSG_NOSIGNAL, NULL, 0) = 101
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 76, MSG_NOSIGNAL, NULL, 0) = 76
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 140, MSG_NOSIGNAL, NULL, 0) = 140
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 84, MSG_NOSIGNAL, NULL, 0) = 84
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 78, MSG_NOSIGNAL, NULL, 0) = 78
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 140, MSG_NOSIGNAL, NULL, 0) = 140
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 76, MSG_NOSIGNAL, NULL, 0) = 76
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 100, MSG_NOSIGNAL, NULL, 0) = 100
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 127, MSG_NOSIGNAL, NULL, 0) = 127
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 154, MSG_NOSIGNAL, NULL, 0) = 154
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 78, MSG_NOSIGNAL, NULL, 0) = 78
poll([{fd=11, events=POLLOUT}], 1, 30000) = 1 ([{fd=11, revents=POLLOUT}])
write(11, "\0\0\0XDR{rcpt_addr}\0david.sticklan"..., 129) = 129
poll([{fd=11, events=POLLIN}], 1, 30000) = 1 ([{fd=11, revents=POLLIN}])
read(11, "\0\0\0\1c", 130964)           = 5
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 76, MSG_NOSIGNAL, NULL, 0) = 76
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 76, MSG_NOSIGNAL, NULL, 0) = 76
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 171, MSG_NOSIGNAL, NULL, 0) = 171
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 170, MSG_NOSIGNAL, NULL, 0) = 170
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 61, MSG_NOSIGNAL, NULL, 0) = 61
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 59, MSG_NOSIGNAL, NULL, 0) = 59
socket(AF_UNIX, SOCK_STREAM, 0)         = 18
fcntl(18, F_GETFL)                      = 0x2 (flags O_RDWR)
fcntl(18, F_SETFL, O_RDWR)              = 0
connect(18, {sa_family=AF_UNIX, sun_path="public/cleanup"}, 110) = 0
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 77, MSG_NOSIGNAL, NULL, 0) = 77
poll([{fd=18, events=POLLIN}], 1, 3600000) = 1 ([{fd=18, revents=POLLIN}])
read(18, "queue_id\0A628437015\0\0", 4096) = 21
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 77, MSG_NOSIGNAL, NULL, 0) = 77
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 91, MSG_NOSIGNAL, NULL, 0) = 91
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 72, MSG_NOSIGNAL, NULL, 0) = 72
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 75, MSG_NOSIGNAL, NULL, 0) = 75
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 100, MSG_NOSIGNAL, NULL, 0) = 100
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 69, MSG_NOSIGNAL, NULL, 0) = 69
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 63, MSG_NOSIGNAL, NULL, 0) = 63
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 70, MSG_NOSIGNAL, NULL, 0) = 70
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 56, MSG_NOSIGNAL, NULL, 0) = 56
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 80, MSG_NOSIGNAL, NULL, 0) = 80
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 80, MSG_NOSIGNAL, NULL, 0) = 80
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 80, MSG_NOSIGNAL, NULL, 0) = 80
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 80, MSG_NOSIGNAL, NULL, 0) = 80
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 80, MSG_NOSIGNAL, NULL, 0) = 80
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 80, MSG_NOSIGNAL, NULL, 0) = 80
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 80, MSG_NOSIGNAL, NULL, 0) = 80
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 80, MSG_NOSIGNAL, NULL, 0) = 80
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 80, MSG_NOSIGNAL, NULL, 0) = 80
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 80, MSG_NOSIGNAL, NULL, 0) = 80
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 80, MSG_NOSIGNAL, NULL, 0) = 80
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 80, MSG_NOSIGNAL, NULL, 0) = 80
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 80, MSG_NOSIGNAL, NULL, 0) = 80
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 80, MSG_NOSIGNAL, NULL, 0) = 80
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 80, MSG_NOSIGNAL, NULL, 0) = 80
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 80, MSG_NOSIGNAL, NULL, 0) = 80
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 80, MSG_NOSIGNAL, NULL, 0) = 80
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 80, MSG_NOSIGNAL, NULL, 0) = 80
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 80, MSG_NOSIGNAL, NULL, 0) = 80
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 76, MSG_NOSIGNAL, NULL, 0) = 76
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 80, MSG_NOSIGNAL, NULL, 0) = 80
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 80, MSG_NOSIGNAL, NULL, 0) = 80
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 78, MSG_NOSIGNAL, NULL, 0) = 78
poll([{fd=18, events=POLLOUT}], 1, 3600000) = 1 ([{fd=18, revents=POLLOUT}])
write(18, "flags\000242\0\0m\0010T\0211662739228 43145"..., 552) = 552
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 73, MSG_NOSIGNAL, NULL, 0) = 73
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 70, MSG_NOSIGNAL, NULL, 0) = 70
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 76, MSG_NOSIGNAL, NULL, 0) = 76
write(9, "\27\3\3\0&\227\204\37\372/7\36\226e'dj?~\207hb\213\370\5x\t\341\223\20\332\241"..., 43) = 43
read(9, 0x55f68630bf63, 5)              = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=9, events=POLLIN}], 1, 300000) = 1 ([{fd=9, revents=POLLIN}])
read(9, "\27\3\3\0\36", 5)              = 5
read(9, "\320J\224j_R\343Hn\264\331\346\247,\262\211\370\304A\20\304g\311\341\366\35Ol\347G", 30) = 30
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 75, MSG_NOSIGNAL, NULL, 0) = 75
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 65, MSG_NOSIGNAL, NULL, 0) = 65
ioctl(9, FIONREAD, [0])                 = 0
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 68, MSG_NOSIGNAL, NULL, 0) = 68
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 66, MSG_NOSIGNAL, NULL, 0) = 66
getpid()                                = 32621
sendto(7, "<22>Sep  9 16:00:28 postfix/smtp"..., 82, MSG_NOSIGNAL, NULL, 0) = 82

Okay, hello! It doesn't get more obvious than that. This process is definitely communicating with postfix under the hood, trying to send hundreds of spam emails per minute.

But back to what we came here for: the process is writing to file descriptor 7, and reading from file descriptor 9: let's see what those file descriptors correspond to:

root@hostname:/var/log# ls -la /proc/32621/fd/7
lrwx------ 1 root root 64 Sep  9 16:01 /proc/32621/fd/7 -> 'socket:[28044945]'

root@hostname:/var/log# ls -la /proc/32621/fd/9
lrwx------ 1 root root 64 Sep  9 16:02 /proc/32621/fd/9 -> 'socket:[28992210]'

At this point, I somehow happened to recall there being some TCP port related settings in the SSH configuration file, and decided to check them out. To test out this new hunch, I added this to the end of the sshd config file:

Match User com
        X11Forwarding no
        AllowTcpForwarding no

And lo and behold, now when I looked at the auth log:

Sep  9 16:07:53 hostname systemd-logind[537]: Session 16606 logged out. Waiting for processes to exit.
Sep  9 16:07:53 hostname systemd-logind[537]: Removed session 16606.
Sep  9 16:07:53 hostname sshd[5833]: pam_unix(sshd:session): session closed for user com


Sep  9 16:07:56 hostname login[11151]: FAILED LOGIN (2) on '/dev/pts/1' from 'host-79-3-72-208.business.telecomitalia.it' FOR 'root', Authentication failure


Sep  9 16:07:56 hostname login[11151]: pam_securetty(login:auth): access denied: tty '/dev/pts/1' is not secure !
Sep  9 16:07:56 hostname sshd[11193]: Accepted password for com from 194.49.68.83 port 9971 ssh2
Sep  9 16:07:56 hostname sshd[11193]: pam_unix(sshd:session): session opened for user com by (uid=0)
Sep  9 16:07:56 hostname systemd-logind[537]: New session 16627 of user com.

Sep  9 16:07:56 hostname sshd[11203]: refused local port forward: originator 127.0.0.1 port 9972, target localhost port 25
Sep  9 16:07:56 hostname sshd[11203]: refused local port forward: originator 127.0.0.1 port 9973, target localhost port 25
Sep  9 16:07:56 hostname sshd[11203]: refused local port forward: originator 127.0.0.1 port 9974, target localhost port 25
Sep  9 16:07:56 hostname sshd[11203]: refused local port forward: originator 127.0.0.1 port 9975, target localhost port 25
Sep  9 16:07:56 hostname sshd[11203]: refused local port forward: originator 127.0.0.1 port 9976, target localhost port 25
Sep  9 16:07:56 hostname sshd[11203]: refused local port forward: originator 127.0.0.1 port 9977, target localhost port 25
Sep  9 16:07:56 hostname sshd[11203]: refused local port forward: originator 127.0.0.1 port 9978, target localhost port 25
Sep  9 16:07:56 hostname sshd[11203]: refused local port forward: originator 127.0.0.1 port 9979, target localhost port 25
Sep  9 16:07:57 hostname sshd[11203]: refused local port forward: originator 127.0.0.1 port 9980, target localhost port 25
Sep  9 16:07:57 hostname sshd[11203]: refused local port forward: originator 127.0.0.1 port 9981, target localhost port 25
Sep  9 16:07:57 hostname sshd[11203]: refused local port forward: originator 127.0.0.1 port 9982, target localhost port 25
Sep  9 16:07:57 hostname sshd[11203]: refused local port forward: originator 127.0.0.1 port 9983, target localhost port 25
Sep  9 16:07:57 hostname login[11120]: FAILED LOGIN (3) on '/dev/pts/3' from '222.246.109.152' FOR 'root', Authentication failure
Sep  9 16:07:57 hostname sshd[11203]: refused local port forward: originator 127.0.0.1 port 9984, target localhost port 25
Sep  9 16:07:57 hostname sshd[11203]: refused local port forward: originator 127.0.0.1 port 9985, target localhost port 25
Sep  9 16:07:57 hostname sshd[11203]: refused local port forward: originator 127.0.0.1 port 9986, target localhost port 25
Sep  9 16:07:57 hostname sshd[11203]: refused local port forward: originator 127.0.0.1 port 9987, target localhost port 25
Sep  9 16:07:57 hostname sshd[11203]: refused local port forward: originator 127.0.0.1 port 9988, target localhost port 25
Sep  9 16:07:57 hostname login[11120]: pam_securetty(login:auth): access denied: tty '/dev/pts/3' is not secure !
Sep  9 16:07:57 hostname login[11133]: FAILED LOGIN (3) on '/dev/pts/2' from '49.89.95.5' FOR 'root', Authentication failure
Sep  9 16:07:57 hostname sshd[11203]: refused local port forward: originator 127.0.0.1 port 9989, target localhost port 25
Sep  9 16:07:57 hostname login[11133]: pam_securetty(login:auth): access denied: tty '/dev/pts/2' is not secure !
Sep  9 16:07:57 hostname sshd[11203]: refused local port forward: originator 127.0.0.1 port 9990, target localhost port 25
Sep  9 16:07:57 hostname sshd[11203]: refused local port forward: originator 127.0.0.1 port 9991, target localhost port 25
Sep  9 16:07:57 hostname sshd[11203]: refused local port forward: originator 127.0.0.1 port 9992, target localhost port 25
Sep  9 16:07:57 hostname sshd[11203]: refused local port forward: originator 127.0.0.1 port 9993, target localhost port 25
Sep  9 16:07:57 hostname sshd[11203]: refused local port forward: originator 127.0.0.1 port 9994, target localhost port 25
Sep  9 16:07:57 hostname sshd[11203]: refused local port forward: originator 127.0.0.1 port 9995, target localhost port 25
Sep  9 16:07:57 hostname sshd[11203]: refused local port forward: originator 127.0.0.1 port 9996, target localhost port 25

Aha! This was the root problem that had led to a thousand emails going to the postmaster the day before. It had nothing to do with postfix itself, it was instead a configuration bug that had been present for years.

Looking back at the watch screen, all looked healthy there as well - the nefarious "com" connections were now gone:

Every 0.5s: netstat --program --numeric-hosts --numeric-ports --extend | grep -E ":25|postfix|smtp"

unix  2      [ ]         DGRAM                    15382    631/rsyslogd         /var/spool/postfix/dev/log
unix  3      [ ]         STREAM     CONNECTED     29321334 9852/smtpd
unix  2      [ ]         DGRAM                    29046832 8461/smtpd
unix  3      [ ]         STREAM     CONNECTED     29311729 32621/smtpd
unix  3      [ ]         STREAM     CONNECTED     29323293 8461/smtpd
unix  3      [ ]         STREAM     CONNECTED     29294833 8461/smtpd
unix  2      [ ]         DGRAM                    29163334 9852/smtpd
unix  2      [ ]         DGRAM                    28600647 4870/smtpd
unix  2      [ ]         DGRAM                    28044945 32621/smtpd
unix  3      [ ]         STREAM     CONNECTED     29321325 32621/smtpd
unix  3      [ ]         STREAM     CONNECTED     29320141 32523/smtpd
unix  3      [ ]         STREAM     CONNECTED     29302896 9852/smtpd
unix  3      [ ]         STREAM     CONNECTED     29320284 32523/smtpd
unix  2      [ ]         DGRAM                    28030150 32523/smtpd
unix  3      [ ]         STREAM     CONNECTED     25498067 4870/smtpd

Okay, now to delete all the queued messages that accumulated during the time spent debugging this, since we don't intend to actually send them:

root@hostname:~# postsuper -d ALL
postsuper: Deleted: 24493 messages

Whew! That's a handful right there! Sure is nice to go back to having an empty mail queue.


Sometimes, when things go wrong, they can offer visibility into a problem that has existed for a long time that has only now surfaced due to several things lining up in just the right way. When this happens, it's important to not just settle for the immediate surface-level fix and to roll up your sleeves and dig deep enough that the root vulnerabilities and problems responsible for an issue are mitigated so that they never happen again.

← All Posts


Comments

Log in to leave a comment!