fogbound.net




Sat, 7 Oct 2017

Simple file monitor

— SjG @ 11:59 am

Say you host a few web sites for various folks, and you give them write access to a directory on your server. Well, then, my friend, you’re as big a fool as I am.

Maybe you want to mitigate this foolhardiness by keeping an eye on what these folks upload. For example, when I see a user uploading SuperBulletinBoardThatIsTotallyNotASpamTool.php or SuperWordPressPasswordSharingPlugin.php, I can call them and explain why I’m deleting it. I can be a slightly-less-bastard operator from heck.

So here’s a quick bash script that I use. It’ll also help to alert you if somehow one of the WordPress sites gets compromised, and rogue php files get installed. It ignores commonly changing files or things we’re not interested in like images. It shouldn’t be considered an intrusion detection system, or a robust security auditing tool — this wouldn’t really help in the case of an actual hacker with any l33t skillz at all. It’s just a quick information source.


#/bin/bash

rm -f /tmp/fcl.txt

rm -f /tmp/fcld.txt

/usr/bin/find /var/www/ -type f -ctime -1 | /bin/egrep -v "\\.git|\\.svn|(*.jpg$)|(*.gif$)|(*.pdf$)|wp-content\\/cache|files\\/cache\\/zend_cache" > /tmp/fcl.txt

xargs -0 -n 1 ls -l < <(tr \\n \\0 /tmp/fcld.txt

[ -s /tmp/fcld.txt ] && /usr/bin/mail -aFrom:account@mydomain.com -s "MYDOMAIN.COM FILES UPDATED" you@youremail.com < /tmp/fcld.txt

Throw it into a crontab, and there you have it. You'll get an email with a list of files changed in the past day.


Thu, 31 Aug 2017

Getting nagios back up and running… again.

— SjG @ 12:42 pm

Nagios monitoring on one Centos 6.9 server seemed to have stopped working after an upgrade. All the tests showed status OK, but they hadn’t actually run in days.

Looking at the service details was weird, because the next scheduled check was about a minute in the past.

The Nagios help page wasn’t. And we’re running Core 4.3.x anyway, without a MySQL database.

The first clue was a bunch of lines in the event log:
Error: Could not open check result queue directory '/var/log/nagios/spool/checkresults' for reading.

Turns out we didn’t even have a /var/log/nagios/spool directory. Creating those directories helped. But Nagios still wouldn’t start from the usual startup scripts. Nothing in the main log. But then, another clue.

Who doesn’t love to see shit like this:

$ cat /var/log/nagios/nagios.configtest
ERROR: Errors in config files – see log for details: /var/log/nagios/nagios.configtest
$

So the startup script /etc/init.d/nagios searches for warnings, and aborts if they exist. It’s supposed to log them. For some reason it didn’t.

You can manually get those warnings and errors yourself by running

/usr/sbin/nagios -v /etc/nagios/nagios.cfg (adjust paths as appropriate).

I ended up with a bunch of warnings for deprecated parameters. So I went in and edited my config files to remove them or update them to the new equivalents. Oh yes. Software authors, please keep in mind: nothing pleases your users more than changing the names of variables in config files. We users live for this shit. When, oh when, will the author of our software next change “retry_check_interval” to “retry_interval”?

Fixing all of the warnings was not enough, though. The startup script gave the message “Starting nagios:” and then silently died. Well, sort of. It actually was starting now, but brokenly:

# ps aux | grep -i nagios
nagios 12610 0.0 0.0 12296 1220 ? Ss 16:36 0:00 /usr/sbin/nagios -d /etc/nagios/nagios.cfg
nagios 12611 0.0 0.0 0 0 ? Z 16:36 0:00 [nagios] <defunct>
nagios 12612 0.0 0.0 0 0 ? Z 16:36 0:00 [nagios] <defunct>
nagios 12613 0.0 0.0 0 0 ? Z 16:36 0:00 [nagios] <defunct>
nagios 12614 0.0 0.0 0 0 ? Z 16:36 0:00 [nagios] <defunct>
nagios 12616 0.0 0.0 11780 520 ? S 16:36 0:00 /usr/sbin/nagios -d /etc/nagios/nagios.cfg
root 12744 0.0 0.0 103328 876 pts/0 S+ 16:44 0:00 grep -i nagios

Starting directly from the command line worked:

# /usr/sbin/nagios -d /etc/nagios/nagios.cfg
# ps aux | grep nagios
nrpe 7282 0.0 0.0 41380 1340 ? Ss 16:51 0:00 /usr/sbin/nrpe -c /etc/nagios/nrpe.cfg -d
nagios 8010 0.0 0.0 16404 1280 ? Ss 17:31 0:00 /usr/sbin/nagios -d /etc/nagios/nagios.cfg
nagios 8011 0.0 0.0 10052 920 ? S 17:31 0:00 /usr/sbin/nagios –worker /var/spool/nagios/cmd/nagios.qh
nagios 8012 0.0 0.0 10052 920 ? S 17:31 0:00 /usr/sbin/nagios –worker /var/spool/nagios/cmd/nagios.qh
nagios 8013 0.0 0.0 10052 920 ? S 17:31 0:00 /usr/sbin/nagios –worker /var/spool/nagios/cmd/nagios.qh
nagios 8014 0.0 0.0 10052 920 ? S 17:31 0:00 /usr/sbin/nagios –worker /var/spool/nagios/cmd/nagios.qh
nagios 8015 0.0 0.0 15888 552 ? S 17:31 0:00 /usr/sbin/nagios -d /etc/nagios/nagios.cfg
root 8018 0.0 0.0 100956 616 pts/0 S+ 17:31 0:00 tail -f /var/log/nagios/nagios.log
root 8037 0.0 0.0 103328 856 pts/1 S+ 17:32 0:00 grep nagios

When things get this weird, there are only two options. Well, three, if you include the “rm -rf /” option. But the other two are: 1) reboot and see if stuff magically starts working, or 2) see what SELinux is breaking.

# tail -f /etc/audit/audit.log
type=AVC msg=audit(1504137329.209:41): avc: denied { execute_no_trans } for pid=7731 comm=”nagios” path=”/usr/sbin/nagios” dev=dm-0 ino=1201464 scontext=unconfined_u:system_r:nagios_t:s0 tcontext=system_u:object_r:nagios_exec_t:s0 tclass=file

Yup. As expected, SELinux breaking stuff.

So, my preferred way to solve this kind of problem now is to snip out all relevant the AVC “denied” sections from the log into a single file (which I called audit.log), and then using audit2allow to create a new module. Since there’s already a nagios module (containing insufficient privileges), I created a nagios2 module:

# audit2allow -M nagios2 < audit.log # semodule -i nagios2.pp

Hooray! After a few iterations of this process (discovering other blocked operation, granting them permission, restarting nagios), everything was working but check_disk_smb, which was returning “results from smbclient not suitable” even as it worked fine when tested from the command-line as follows:

# su – nagios -s /bin/bash -c “/usr/lib64/nagios/plugins/check_disk_smb -H SMBHOST -s share -a 10.X.X.X -u nagios -p \”password\” -w 90 -c 95″
Disk ok – 16.71G (11%) free on \\SMBHOST\share | ‘share’=134108676096B;136850492620.8;144453297766.4;0;152056102912

Diving in and editing check_disk_smb to throw the actual error message, I found nagios getting a “ERROR: Could not determine network interfaces, you must use a interfaces config line” from smbclient. So I edited /etc/samba/smb.conf, and explicitly told samba which interfaces it had available:

interfaces = lo eth0 10.X.X.X/24

Le sigh. Now this error went away, and I got to go for another fun and challenging round of “find all the SMB operations that SELinux is breaking.” This time, I got tripped up the “dontaudits” — there were operations being blocked, but not logged. I was saved by TrevorH and sfix, helpful people in Freenode’s #centos IRC channel:

TrevorH: semodule -DB to disable dontaudit rules, stay permissive, recreate, use the audit log to generate a policy as per the wiki
13:14 TrevorH: @selinux
13:14 centbot: Useful resources for SELinux: http://wiki.centos.org/HowTos/SELinux | http://wiki.centos.org/TipsAndTricks/SelinuxBooleans | http://docs.fedoraproject.org/en-US/Fedora/13/html/Security-Enhanced_Linux/ | http://www.youtube.com/watch?v=bQqX3RWn0Yw | http://opensource.com/business/13/11/selinux-policy-guide
13:15 _SjG_: Thanks
13:16 TrevorH: semodule -B when done (as well as setenforce 1)
13:25 _SjG_: TrevorH: thanks, that resolved it.
13:25 _SjG_: so what I was missing is that there can be donaudit rules that were preventing specific operations from showing up in the audit log?
13:28 TrevorH: yes
13:28 sfix: _SjG_: yep, there’s permissions in the policy that we know are requested but don’t want to allow for whatever reason. dontaudits are our way of preventing them from cluttering the audit log.
13:29 sfix: dontaudits tend to be a bit over-eager though

So there you have it.

I was finally back to where I had been mere days before.


Thu, 22 Sep 2016

Checking Solr index with nagios: obsolete versions

— SjG @ 12:33 pm

I needed to check that the index process that populates the Solr index succeeded and didn’t die during the night, leaving an empty index.

To make things more complicated, the versions of Solr and nagios in use are probably not the latest.

The check_solr -o numdocs command doesn’t work with our Solr configuration. But the internet tells me that the Solr query http://localhost:8983/solr/select/?debug=q‌uery&q=*:* includes the size of the result set. Testing it, I found this to be true:

<response>
   <lst name="responseHeader">
      <int name="status">0</int>
      <int name="QTime">0
      <lst name="params">
         <str name="q">*:*</str>
         <str name="debug">q‌uery</str>
      </lst>
   </lst>
   <result name="response" numFound="9832" start="0">
      <doc>
...

I want to use nagios to check that that numFound is never zero (or too small). I thought I’d just be able to use a nagios regex:

check_http -H localhost -p 8983 -u "/solr/select/?debug=query&q=*:*" -lr 'numFound=\"\d{2+}"'

It didn’t work. To make a long story short, there’s regex and then there’s regex. The kind that works for nagios is:

check_http -H localhost -p 8983 -u "/solr/select/?debug=query&q=*:*" -lr 'numFound=\"[1-9][0-9][0-9]'

This guarantees at least a hundred docs are in the index.


Mon, 29 Jun 2015

Sorting lots of files into directories, by date

— SjG @ 10:32 am

A process handles periodic tasks, and each time it does, it spits out some telemetry. The telemetry gets written off into a file each time. This process could be any of a number of interesting things: a procmail script doing something with incoming email, a Twitter-bot responding to searches, an IRC-bot responding to events, or whatever. The only important thing in this case is that it creates a variable number of log files each day, and the log files all get dumped into a single directory.

Before long, the log directory will be filled with thousands of files, and will be unmanageable. But, for some reason, we want to keep all these logs, and maybe actually use them. So the key is a script to move the logs into sub-directories based on date.

It turns out it’s easy to write a bash script to create directories in a YYYY-MM format, and move the files into them appropriately. The key is in the stat command. Conveniently, the implementation of this command is completely different and incompatible between Mac OS/BSD and Linux. Jesus H. Christ.

Linux:

#!/bin/bash
for i in *.log
do
filemonth=`stat --format=%y $i | cut -c 1-7`
mkdir -p $filemonth
mv $i $filemonth/$i
done

In Linux, the stat command will give you the data in an ISO format, and using the cut command, you can extract YYYY-MM information. The -p flag to mkdir makes it silently exit without complaining if a directory by that name already exists.

MacOS (or presumably other BSDs):

#!/bin/bash
for i in *.log
do
filemonth=`stat -f%Sm -t %Y-%m $i`
mkdir -p $filemonth
mv $i $filemonth/$i
done

In this case, we’re telling stat that we want the modified date as a string, and we specify the time format.

Either of these would be easy to modify to single day resolution (changing which columns you cut in the Linux version, or the timestamp format in the Mac version).


Tue, 2 Dec 2014

Compiling Kannel 1.4.4 under Centos 7.0

— SjG @ 4:28 pm

This took me while to get to work. If you follow these steps in order, it should work nicely.


# yum install mariadb-devel
# yum install libxml2-devel
# yum install bison
# yum install byacc
# cd /usr/local/src
# wget http://kannel.org/download/1.4.4/gateway-1.4.4.tar.gz
# tar xzvf gateway-1.4.4.tar.gz
# cd gateway-1.4.4
# ./configure --prefix=/usr/local/kannel --with-mysql --with-mysql-dir=/usr/include/mysql --disable-wap
# make

There are a few tricks here. First, just having libxml2 installed is not enough. You need the libxml development headers, etc. Should be obvious, but tricked me. Next, if you run ./configure before you have some of the dependencies installed (e.g., Bison), you will have modified source files that will still fail even after you install the dependency. Thus it’s important to install all that stuff before you run ./configure.

This stuff isn’t really that hard, but it can be time consuming to track down why it’s not working.