fogbound.net




Thu, 20 Apr 2023

Taming logwatch on Linux*

— SjG @ 6:59 am

*This is actually on Rocky Linux / CentOS / RHEL, but will likely work on others.

Logwatch can be a nice tool for keeping an eye on your servers. It goes through your logs and creates a nightly aggregate email to keep you apprised of various important details. It’s good for bringing things to the attention of lazy / overwhelmed sysadmins like me.

Where it fails, though, is when it overwhelms you with useless information. There are different output level settings, and if you turn the detail levels down far enough, it helps a lot. However, with certain configs and certain OSes, you still get swamped with non-actionable information. Here’s how to fix a few of those cases.

Crontabs. In Rocky Linux, systemd logs a success message for every cron session, each with a unique session number, which means the default log is filled with lots and lots of lines like session-685197.scope: Succeeded.: 1 Time(s), which logwatch happily throws into the nightly email. Most searches tell you to edit your /etc/logwatch/conf/ignore.conf file and add the following line:

session-.*scope: Succeeded

This doesn’t work for me. Further research indicates that the ignore.conf file wants a Perl-style regular expression. The recommendation above is sort-of-Perlish, but what ended up working correctly for me was putting the following line in my ignore.conf:

\s*session-(.*?)\.scope: Succeeded\.(.*)
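If you don’t want to wait for the next nightly email to find out whether it worked, you can test the pattern against a sample log line with Perl itself (since ignore.conf expects Perl regular expressions):

$ echo "session-685197.scope: Succeeded.: 1 Time(s)" | perl -ne 'print "matched\n" if /\s*session-(.*?)\.scope: Succeeded\.(.*)/'
matched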

HTTP. For some reason, someone thought having a long list of hostile IP addresses would be helpful. Maybe so you can block them manually? That seems like a hopeless task. Check out /usr/share/logwatch/scripts/services/http around line 596… and uncomment the conditional so the list of probing IPs only prints at higher detail levels:

$flag = 1;
foreach my $i (sort keys %ban_ip) {
   if ($flag) {
      print "\nA total of ".scalar(keys %ban_ip)." sites probed the server \n";
      $flag = 0;
   }
   if ($detail > 4) {   # was commented out
      print "   $i\n";
   }                    # was commented out
}

sshd. I know there are a lot of hackers, script kiddies, and bots out there. I don’t need to see the long list of people who tried and failed to log in with ssh. Unfortunately, the detail level settings for sshd aren’t very helpful. I ended up editing /usr/share/logwatch/scripts/services/sshd and liberally sprinkling in my own if ($Detail > 4) {} barriers starting around line 500. Hacky, I know. It will also get clobbered by the next logwatch update. Yuck.
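One consolation: you don’t have to wait for the nightly run to see the effect of any of these edits. You can invoke logwatch directly at a given detail level and have it print to the terminal:

$ logwatch --service sshd --service http --range today --detail 0 --output stdout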

Maybe it’s time for me to submit a bunch of pull requests.


Sun, 1 Jan 2023

New Year New Fear

— SjG @ 11:03 am

I thought I’d update this VPS from Ubuntu 20.04.5 LTS to Ubuntu 22.04.1 LTS. I ran do-release-upgrade which gave all the appropriate warnings and stuff, and proceeded to upgrade. This VPS is hosted on Linode, so there’s a local Ubuntu mirror — the download is blindingly fast.

The do-release-upgrade process appears to run inside a GNU screen session. Unfortunately, that interacted poorly with the terminal I was using, so any time I moved my mouse, it spewed control characters as input. That was bad when it was prompting … something? … probably about my nonstandard sshd configuration.

Anyway, Linode provides a web-based console access, so I was eventually able to fix the broken sshd, PHP, and Apache configurations. Only about two hours of downtime. Not bad for New Year’s Day.


Tue, 17 May 2022

Linux Command Line Magic

— SjG @ 12:24 pm

In day-to-day operations, circumstances often arise where you need simple answers to fairly complicated questions. In the best scenario, the information is available to you in some structured way, like in a database, and you can come up with a query (e.g., “what percentage of our customers in January spent more than $7.50 on two consecutive Wednesdays” is something you could probably query). In other scenarios, the information is not as readily available or not in a structured format.

One nice thing about Linux and Unix-like operating systems is that the filesystem can be interrogated by chaining various tools together to make it cough up the information you need.

For example, I needed to copy the assets from a digital asset management (DAM) system to a staging server to test a major code change. The wrinkle is that the DAM is located on a server with limited monthly bandwidth. So my challenge: what was the right number of files to copy down without exceeding the bandwidth cap?

So, to start out with, I use some simple commands to determine what I’m dealing with:

$ ls -1 asset_storage | wc -l
10384

$ du -hs asset_storage
409G	asset_storage

So that first command lists all the files in the “asset_storage” directory, with the -1 flag saying to list one file per line; that’s piped into the word-count command with the -l flag, which says to count lines. The second command tells me the storage requirement, with the -h flag asking for human-readable units and the -s flag asking for a single summary total.

I’ve got a problem. Over 10,000 files totalling over 400G of storage, and say my data cap is 5G. The first instinct is to say, “well, the average file size is 40M, so I may only be able to copy 125 files.” However, we know that’s wrong. There are some big video files and many small image thumbnails in there. So what if I only copy the smaller files?

$ find asset_storage -size -10M -print0 | xargs -0 du -hc | tail -n1
630M	total

Look at that beautiful sequence. Just look at it! The find command looks in the asset_storage directory for files smaller than 10M. The list it creates gets passed into the disk usage command via the super-useful xargs command. xargs takes a list that’s output from some command and uses that list as input parameters to another command. To be safe with weird characters (i.e., things that could cause trouble by being interpreted by the shell, like single quotes or parens or dollar signs) we use the -print0 flag from find (which forces it to use null terminators after each result output) and the -0 flag on xargs, which tells it to expect the null terminators. This takes the list of small files, passes them to the disk usage command with the -h (human-readable) and -c (cumulative) flags. The du command gives output for each file and for the sum total, but we only want the sum, so we pipe it into the tail command to just give us that last value.

So if we only include files under 10M, we can transfer them without getting close to our data cap. But what percentage of the files will be included?

$ find asset_storage -size -10M -print | wc -l
7708

Again, the find command looks in the asset_storage directory for files smaller than 10M and each line is passed into the word count as before. So if we include only files smaller than 10M, we get 7,708 of the 10,384 files, or just under 75% of them! Hooray!

But when I started to create the tar file to transfer the files, something was wrong! The tar file was 2G and growing! Control C! Control C! What’s going on here?

What was wrong? Well, this is where it gets into the weeds a bit, and it took me longer than I’d like to admit to track down. The kernel limits how long a command line can be, and xargs has its own limits as well. If the list it receives would exceed those limits, xargs splits the input and invokes the destination command multiple times, each time with a chunk of the list. So in my example above, the find command was overwhelming the xargs buffer, and the du command was called multiple times:

$ find asset_storage -size -10M -print0 | xargs -0 du -hc | grep -i total
6.1G	total
630M	total

My tail command was seeing that second total, and missing the first one! To make the computation work the way I wanted, I had to allocate more command-line length to xargs (the maximum you can set is system-dependent, and can be found with xargs --show-limits):

$ find asset_storage -size -10M -print0 | xargs -0 -s2000000 du -hc | grep -i total
6.6G	total
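As an aside, if your du is the GNU coreutils version, you can sidestep xargs (and its list-splitting behavior) entirely by having du read the NUL-separated list itself:

$ find asset_storage -size -10M -print0 | du -hc --files0-from=- | tail -n1
6.6G	total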

Playing with the file size threshold, I was finally able to determine that my ideal target was files under 5M, which still gave me 68% of the files and kept the final transfer down to about 3G.

In summary, do it this way:

$ find asset_storage -size -5M -print0 | xargs -0 -s2000000 du -hc | tail -n1
2.9G	total

$ find asset_storage -size -5M -print | wc -l
7094

$ find asset_storage -size -5M -print0 | xargs -0 -s2000000 tar czf dam_image_backup.tgz
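One caveat on that last command: if xargs ever does split the list after all, each extra tar invocation will overwrite the archive and silently drop the earlier files. GNU tar can read the NUL-separated list from stdin directly, which takes xargs out of the picture entirely:

$ find asset_storage -size -5M -print0 | tar czf dam_image_backup.tgz --null -T -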


Thu, 2 Sep 2021

Character Encoding

— SjG @ 4:25 pm

Hm. Looks like the server upgrade from Ubuntu 18.04.5 LTS to Ubuntu 20.04.3 LTS changed some character encoding defaults somewhere. I’ll have to track down if it’s in the database (most likely), the database connection (possible), PHP, or somewhere else.
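If you’re chasing something similar, the database defaults are easy to check first (assuming MySQL or MariaDB, and credentials that let you run a query):

$ mysql -e "SHOW VARIABLES LIKE 'character_set%'"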

In the meantime, please forgive the occasional bad unicode character.


Fri, 9 Jul 2021

Virtual Apache SSL Hosts in a Docker Container

— SjG @ 1:40 pm

You might want to “containerize” your development hosting environment so you can easily migrate it from machine to machine. As a Docker noob, I had a bunch of issues getting this set up the first time, and wanted to share a working configuration. This example assumes you have Docker installed and operating. You can also skip reading this, and just download the files at GitHub.

First, we’ll need to create some directories. I create one for the Apache configurations, and one for the code projects I’ll be working on.
mkdir php-apache
mkdir project

Within project, you can check out the code for your various projects into subdirectories. For simplicity, I’ve created project1 and project2 in the sample code. The Apache web server within the container will serve content from these directories.

We’re going to use a fictional top-level domain (TLD) for our development environment. This way, the URL you use to access your sites will be the same every time you spin up a new dev environment, without having to worry about name servers. You do, however, have to worry about your /etc/hosts file (or your platform’s equivalent). Choose a TLD that will be easy to remember. For my example, I’m using “mylocal”. Two things to watch out for: start the TLD with a letter rather than a number, and don’t include characters like hyphens. Please learn from my mistakes.

Edit your /etc/hosts, and add the lines:
127.0.0.1 project1.mylocal
127.0.0.1 project2.mylocal

Next, we’ll want to create an SSL certificate for use in development. The easiest way to do this is with mkcert. Once you have mkcert installed and working, you’ll create a wildcard certificate for your TLD:
cd php-apache
mkcert -install mylocal "*.mylocal"
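mkcert names the generated files after the first domain plus the number of additional names (here, mylocal+1.pem and mylocal+1-key.pem). If you want to confirm that the certificate really covers the wildcard, you can inspect it:

openssl x509 -in mylocal+1.pem -noout -text | grep -A1 "Subject Alternative Name"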

Next, we create a compose.yaml file:

version: "3.9"
services:
  php:
    container_name: ApachePHPVirtual
    networks:
      - apache
    build:
      context: .
      dockerfile: PhpApacheDockerfile
    volumes:
      - "./project:/var/www"
    ports:
      - 80:80
      - 443:443
    extra_hosts:
      - "project1.mylocal:127.0.0.1" # remember to add "127.0.0.1 project1.mylocal" to your /etc/hosts file or equivalent
      - "project2.mylocal:127.0.0.1" # remember to add "127.0.0.1 project2.mylocal" to your /etc/hosts file or equivalent
    hostname: project1.mylocal # default
    domainname: mylocal
    tty: true # if you want to debug
networks:
  apache:

This is pretty straightforward. We’re creating a container which we’ll call “ApachePHPVirtual”; it gets a network we call “apache” in case we want to connect to it from other containers, and it maps our top-level project directory to /var/www in the container. We map ports 80 and 443 on our host machine to those same ports in the container. The extra_hosts directive adds our project names to the container’s /etc/hosts. We set the container’s hostname to match our first project, and the default domain to our “mylocal” TLD.

We then want to create configurations for each of the Apache virtual hosts. In the php-apache directory, we create config files for each project. These are just standard virtual host declarations, e.g.:

<VirtualHost *:80>
    ServerName project1.mylocal
    Redirect permanent / https://project1.mylocal/
</VirtualHost>
<VirtualHost *:443>
    ServerName project1.mylocal
    DocumentRoot /var/www/project1
    ErrorLog ${APACHE_LOG_DIR}/project1-error.log
    CustomLog ${APACHE_LOG_DIR}/project1-access.log combined
    DirectoryIndex index.php

    <Directory "/var/www/project1">
        Options -Indexes +FollowSymLinks
        AllowOverride all
        Order allow,deny
        Allow from all
    </Directory>

    SSLEngine On
    SSLCertificateFile    /etc/apache2/ssl/cert.pem
    SSLCertificateKeyFile /etc/apache2/ssl/cert-key.pem
</VirtualHost>

You’ll need to create a similar configuration for each project. Note that the DocumentRoot points at the mapped host directory. That means you won’t need to rebuild the container to see project changes.
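If your projects are otherwise identical, you don’t have to write each config by hand; something like this (assuming only the project name differs) will generate the second one from the first:

sed 's/project1/project2/g' php-apache/project1.conf > php-apache/project2.conf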

The actual image for the Apache/PHP container is created and configured in our next file, “PhpApacheDockerfile”. So we create that:

FROM php:8.0.8-apache-buster

# add some packages
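# (note: depending on the base image, some of these extensions may need
# system dev libraries installed first, e.g. libpng-dev for gd or
# libzip-dev for zip; check the build output if this step fails)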
RUN docker-php-ext-install curl gd iconv pdo pdo_mysql soap zip

# Apache Config
COPY php-apache/project1.conf /etc/apache2/sites-available/project1.conf
COPY php-apache/project2.conf /etc/apache2/sites-available/project2.conf
COPY php-apache/mylocal+1-key.pem /etc/apache2/ssl/cert-key.pem
COPY php-apache/mylocal+1.pem /etc/apache2/ssl/cert.pem

# mod rewrite! SSL!
RUN a2enmod rewrite
RUN a2enmod ssl

# enable sites
RUN a2ensite project1.conf
RUN a2ensite project2.conf
RUN service apache2 restart

This pulls the php-8.0.8 image from DockerHub, adds in some PHP extensions, copies over our virtual host configuration files and our SSL certificate and key, enables the two sites, and restarts the Apache server.

Now, all that remains to do is build it and power it up:

docker-compose build && docker-compose up -d

You can now visit the project URLs in your browser, e.g., https://project1.mylocal/
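If you prefer the command line, you can also check that the redirect and the certificate are behaving (curl should trust the certificate, since mkcert -install added its CA to the system store):

curl -I http://project1.mylocal/
curl -I https://project1.mylocal/

The first should report the permanent redirect from the virtual host configuration; the second should return a 200 with no certificate complaints.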