fogbound.net

Mon, 21 Aug 2023

Another “sed” one-liner

— SjG @ 1:22 pm

I needed to generate a comma-delimited, quoted list of all the .svg image files in a directory, but without the extension.

This worked:

ls -1 ./svgs | sed 's/.*/"&"/' | sed 's/\.svg//' | paste -sd, -
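
For illustration (these filenames are made up), if ./svgs contains a.svg, b.svg, and c.svg, the pipeline produces:

"a","b","c"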

Hooray for the command-line!

(coming soon, why I needed that list…)


Wed, 14 Jun 2023

pamd running out of sessions for cron

— SjG @ 6:51 am

I manage a very busy Rocky Linux test server. For one staging environment, cron is already running five or six maintenance scripts every minute, and when the tests run, the system has to do a lot of additional permissions fixes and filesystem adjustments on top of that. I’ve started seeing the following error message in the logs:

pam_systemd(sudo:session): Failed to create session: Maximum number of sessions (8192) reached, refusing further sessions

Now, there is a known older problem with systemd and dbus that comes up when you search for this error message, but I couldn’t find any concrete actions I could take to fix the issue. The other major search results are Red Hat pages behind their subscription wall, and at this point I’m apparently too dumb and out of date to even figure out how to pay for a Red Hat subscription.

I think I’ve found at least a temporary solution, however. In /etc/systemd/logind.conf there is a SessionsMax setting where you can override the default. I doubled it to 16384, then ran systemctl restart systemd-logind.
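
For reference, the relevant bit of /etc/systemd/logind.conf ends up looking like this (a minimal excerpt; everything else stays at its defaults):

[Login]
SessionsMax=16384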

I’ll have to see if that’s a viable long-term fix rather than just treating the symptoms of a bigger issue.


Thu, 20 Apr 2023

Taming logwatch on Linux*

— SjG @ 6:59 am

*This is actually on Rocky Linux / CentOS / RHEL, but will likely work on other distributions.

Logwatch can be a nice tool for keeping an eye on your servers. It goes through your logs and creates a nightly aggregate email to keep you apprised of various important details. It’s good at bringing things to the attention of lazy / overwhelmed sysadmins like me.

Where it fails, though, is when it overwhelms you with useless information. There are different output detail settings, and turning them down far enough helps a lot. However, with certain configs and certain OSes, you still get buried in non-actionable information. Here’s how to fix a few of those cases.

Crontabs. In Rocky Linux, cron logs a success message that contains a unique session number, which means the default log is filled with lots and lots of lines like session-685197.scope: Succeeded.: 1 Time(s), which logwatch happily throws into the nightly email. Most searches tell you to edit your /etc/logwatch/conf/ignore.conf file and add the following line:

session-.*scope: Succeeded

This didn’t work for me. Further research indicates that the ignore.conf file wants Perl-style regular expressions. The recommendation above is sort-of-Perlish, but what ended up working correctly for me was putting the following line in my ignore.conf:

\s*session-(.*?)\.scope: Succeeded\.(.*)
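
You can sanity-check the pattern against a sample line before waiting for the next nightly email, using grep’s Perl-compatible mode (assuming your grep was built with PCRE support, which the stock Rocky Linux grep is):

$ echo 'session-685197.scope: Succeeded.: 1 Time(s)' | grep -P '\s*session-(.*?)\.scope: Succeeded\.(.*)'
session-685197.scope: Succeeded.: 1 Time(s)

If grep echoes the line back, the pattern matches; no output means it won’t filter anything.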

HTTP. For some reason, someone thought having a long list of hostile IP addresses would be helpful. Maybe to block them manually? Seems like a hopeless task. Check out /usr/share/logwatch/scripts/services/http around line 596… and un-comment the conditional, so the per-IP listing only shows up at higher detail levels.

$flag = 1;
foreach my $i (sort keys %ban_ip) {
   if ($flag) {
      print "\nA total of ".scalar(keys %ban_ip)." sites probed the server \n";
      $flag = 0;
   }
   #if ($detail > 4) {
      print "   $i\n";
   #}
} 
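
With the comment markers removed, the tail end of that loop becomes:

   if ($detail > 4) {
      print "   $i\n";
   }

Note that the “A total of N sites probed the server” summary line still prints; only the per-IP listing gets gated behind the detail level.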

sshd. I know there are a lot of hackers, script kiddies, and bots out there. I don’t need to see the long list of people who tried and failed to log in with ssh. Unfortunately, the detail level settings for sshd aren’t very helpful. I ended up editing /usr/share/logwatch/scripts/services/sshd and liberally sprinkling my own if ($Detail > 4) {} barriers starting around line 500. Hacky, I know. It will also be clobbered by the next logwatch update. Yuck.
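
For what it’s worth, the change is just wrapping each noisy report block in a detail check. Here’s a hypothetical, stripped-down illustration of the pattern (not the actual sshd script contents; the variable names and data are made up, and the real script sets $Detail from logwatch’s detail setting):

   my $Detail = 0;
   my %failures = ('192.0.2.10' => 37);   # stand-in data for per-IP failure counts
   if ($Detail > 4) {
      foreach my $ip (sort keys %failures) {
         print "   $ip: $failures{$ip} Time(s)\n";
      }
   }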

Maybe it’s time for me to submit a bunch of pull requests.


Sun, 1 Jan 2023

New Year New Fear

— SjG @ 11:03 am

I thought I’d update this VPS from Ubuntu 20.04.5 LTS to Ubuntu 22.04.1 LTS. I ran do-release-upgrade, which gave all the appropriate warnings and stuff, and proceeded to upgrade. This VPS is hosted on Linode, so there’s a local Ubuntu mirror — the download is blindingly fast.

The do-release-upgrade process appears to run inside a screen session. Unfortunately, that interacted poorly with the terminal I was using, so any time I moved my mouse, it spewed control characters as input. That was bad when it was prompting … something? … probably about my nonstandard sshd configuration.

Anyway, Linode provides web-based console access, so I was eventually able to fix the broken sshd, PHP, and Apache configurations. Only about two hours of downtime. Not bad for New Year’s Day.


Tue, 17 May 2022

Linux Command Line Magic

— SjG @ 12:24 pm

In day-to-day operations, circumstances often arise where you need simple answers to fairly complicated questions. In the best scenario, the information is available to you in some structured way, like in a database, and you can come up with a query (e.g., “what percentage of our customers in January spent more than $7.50 on two consecutive Wednesdays” is something you could probably answer with a query). In other scenarios, the information is not as readily available or not in a structured format.

One nice thing about Linux and Unix-like operating systems is that the filesystem can be interrogated by chaining together various tools to make it cough up the information you need.

For example, I needed to copy the assets from a digital asset management (DAM) system to a staging server to test a major code change. The wrinkle is that the DAM is located on a server with limited monthly bandwidth. So my challenge: what was the right number of files to copy down without exceeding the bandwidth cap?

So, to start out with, I use some simple commands to determine what I’m dealing with:

$ ls -1 asset_storage | wc -l
10384

$ du -hs asset_storage
409G	asset_storage

The first command lists all the files in the “asset_storage” directory, with the -1 flag saying to list one file per line; that output is then piped into the word-count command, whose -l flag says to count lines. The second command tells me the storage requirement, with the -s flag asking for a single summary total and the -h flag asking for human-readable units.

I’ve got a problem: over 10,000 files totalling over 400G of storage, and say my data cap is 5G. The first instinct is to say, “well, the average file size is 40M, so I may only be able to copy 125 files.” However, we know that’s misleading. There are some big video files and many small image thumbnails in there. So what if I only copy the smaller files?

$ find asset_storage -size -10M -print0 | xargs -0 du -hc | tail -n1
630M	total

Look at that beautiful sequence. Just look at it! The find command looks in the asset_storage directory for files smaller than 10M. The list it creates gets passed into the disk usage command via the super-useful xargs command. xargs takes a list that’s output from some command and uses that list as input parameters to another command. To be safe with weird characters in filenames (i.e., things that could trip up xargs when it splits the list, like spaces, newlines, quotes, or dollar signs), we use the -print0 flag on find (which terminates each result with a null character) and the -0 flag on xargs, which tells it to expect those null terminators. That takes the list of small files and passes it to the disk usage command with the -h (human-readable) and -c (grand total) flags. The du command gives output for each file and then the sum total, but we only want the sum, so we pipe it into the tail command to grab just that last line.

So if we only include files under 10M, we can transfer them without getting close to our data cap. But what percentage of the files will be included?

$ find asset_storage -size -10M -print | wc -l
7708

Again, the find command looks in the asset_storage directory for files smaller than 10M and each line is passed into the word count as before. So if we include only files smaller than 10M, we get 7,708 of the 10,384 files, or just under 75% of them! Hooray!
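
(If you don’t want to do that percentage in your head, bc will happily do it for you:)

$ echo "scale=1; 7708*100/10384" | bc
74.2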

But when I started to create the tar file to transfer the files, something was wrong! The tar file was 2G and growing! Control C! Control C! What’s going on here?

What was wrong? Well, this is where it gets into the weeds a bit, and it took me longer than I’d like to admit to track down. The operating system limits how long a command line can be, and xargs imposes its own limit on top of that. If the list it receives exceeds those limits, xargs splits the input and invokes the destination command multiple times, each with a chunk of the list. So in my example above, the find output was exceeding the xargs limit, and the du command was being called multiple times:

$ find asset_storage -size -10M -print0 | xargs -0 du -hc | grep -i total
6.1G	total
630M	total

My tail command was seeing that second total and missing the first one! To make the computation work the way I wanted, I had to allocate more command-line length to xargs (the maximum size you can set is system-dependent, and can be found with xargs --show-limits):

$ find asset_storage -size -10M -print0 | xargs -0 -s2000000 du -hc | grep -i total
6.6G	total

Playing with the file size threshold, I was finally able to determine that my ideal target was files under 5M, which still gave me 68% of the files and kept the final transfer down to about 3G.
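
“Playing with the threshold” just meant re-running the same pipeline with different size cutoffs; a little loop makes that less tedious (the list of sizes here is arbitrary):

$ for s in 2M 5M 10M; do printf '%s: ' "$s"; find asset_storage -size -$s -print0 | xargs -0 -s2000000 du -hc | tail -n1; done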

In summary, do it this way:

$ find asset_storage -size -5M -print0 | xargs -0 -s2000000 du -hc | tail -n1
2.9G	total

$ find asset_storage -size -5M -print | wc -l
7094

$ find asset_storage -size -5M -print0 | xargs -0 -s2000000 tar czf dam_image_backup.tgz