fogbound.net





Sat, 7 Oct 2017

Simple file monitor

— SjG @ 11:59 am

Say you host a few web sites for various folks, and you give them write access to a directory on your server. Well, then, my friend, you’re as big a fool as I am.

Maybe you want to mitigate this foolhardiness by keeping an eye on what these folks upload. For example, when I see a user uploading SuperBulletinBoardThatIsTotallyNotASpamTool.php or SuperWordPressPasswordSharingPlugin.php, I can call them and explain why I’m deleting it. I can be a slightly-less-bastard operator from heck.

So here’s a quick bash script that I use. It’ll also help to alert you if somehow one of the WordPress sites gets compromised, and rogue php files get installed. It ignores commonly changing files or things we’re not interested in like images. It shouldn’t be considered an intrusion detection system, or a robust security auditing tool — this wouldn’t really help in the case of an actual hacker with any l33t skillz at all. It’s just a quick information source.


#!/bin/bash

rm -f /tmp/fcl.txt
rm -f /tmp/fcld.txt

/usr/bin/find /var/www/ -type f -ctime -1 | /bin/egrep -v "\.git|\.svn|(\.jpg$)|(\.gif$)|(\.pdf$)|wp-content/cache|files/cache/zend_cache" > /tmp/fcl.txt

xargs -0 -n 1 ls -l < <(tr '\n' '\0' < /tmp/fcl.txt) > /tmp/fcld.txt

[ -s /tmp/fcld.txt ] && /usr/bin/mail -aFrom:account@mydomain.com -s "MYDOMAIN.COM FILES UPDATED" you@youremail.com < /tmp/fcld.txt

Throw it into a crontab, and there you have it. You'll get an email with a list of files changed in the past day.
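A daily crontab entry for it might look something like this (the script path and time are placeholders, not my actual setup):

# run the file monitor once a day, early in the morning
25 6 * * * /usr/local/bin/file-monitor.sh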


Wed, 27 Sep 2017

Seasonal Palettes

— SjG @ 7:43 pm

Over the years, I’ve written various JavaScript mandala-generators. I like giving variety to the color sets used, and in the past, I’ve hand-crafted collections of colors which I’ve given descriptive names like “Earthy,” “Angst,” and “Scorchio.”

For a new project, I wanted seasonal palettes. Being a northern-hemisphere dweller, I think of January as cool colors, May as yellows and greens, August as ambers and oranges, etc. Rather than hand-assemble them, I thought this would be a good use for the Interwebs.

So I wrote a bash/php/ImageMagick script that would hit flickr.com with a seasonal search term to bring back the first twenty-five matching pictures. It then made a composite of the pictures, did a pixelation process, reduced the colors to a minimum set, and built a palette from them.

With excuses of fair use, here’s a visual of that process, using the example where the search terms were “Landscape July”:

1. Images are brought down, each scaled to fit in a 64 x 64 pixel square, and then they’re all combined into a single image.

2. The combined image is pixelated by scaling to 5% of the original size, then scaling back up to a larger size.

3. To get a little more punch and a little less muddiness, the pixelated image has its histogram equalized.

4. For good measure, the script then reduces the image to 32 colors.
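In ImageMagick terms, those four steps come down to something like the following (a sketch with placeholder filenames and sizes, not the exact script):

# 1. tile the fetched images into one composite of 64 x 64 thumbnails
montage pics/*.jpg -geometry 64x64+0+0 combined.png
# 2. pixelate: shrink to 5%, then blow back up with a blocky point filter
convert combined.png -resize 5% -filter point -resize 2000% pixelated.png
# 3. equalize the histogram for more punch
convert pixelated.png -equalize equalized.png
# 4. reduce to 32 colors, then pull them out as a strip of swatches
convert equalized.png -colors 32 reduced.png
convert reduced.png -unique-colors -scale 3200% palette.png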

Now, some of this may be redundant. For example, we could easily skip step 2, since we’re reducing colors in step 4. However, this way we sort of reduce the color space before we equalize the histogram. Maybe I should experiment with other paths here.

In any case, the results for my first search term “($month) Landscape” were not very good:

I tried some other search terms for good measure.

Here’s “($month) colors”:

Here’s “($month) thoughts”:

And finally, here’s “($month) skies”:

I have a few conclusions. First, it’s obvious that a hand-created set of palettes would be better. The pictures Flickr returned for each search term didn’t match my expectations very well. Perhaps I’d have done better with season names instead of month names. Lastly, finding the best palette from an image is a problem that Google tells me many have worked on. I’m assuming others have probably done better than I.

But it’s a curious question — what are the “characteristic” colors from an image? My approach largely comes down to the number of pixels of a given general color. Are there lots of blues? My approach will have at least some blue. But if an accent color is “important,” whatever that means, my approach will probably lose it.

In any case, it’s probably back to mandalas and hand-crafted palettes for the next project.


Thu, 21 Sep 2017

Time Machine Backups

— SjG @ 3:37 pm

I use Time Machine for my local desktop backups. It’s a nice solution. It sits there quietly backing stuff up, keeping multiple revisions of files, and even keeping it all encrypted so if the external drive gets swiped it’s not going to be easy to get at the data.

Of course, it’s no substitute for a revision control system for code, nor is it good for situations where the office gets annihilated by a stray meteorite or drone strike. It’s not a complete solution, but it’s part of a broader collection of solutions.

Today I was reminded of some of the limitations. I used Time Machine to migrate to a new machine. That’s a pretty sweet process. You wait for a few hours of disk read time, and suddenly a new machine is populated with all your old settings, applications, data, and so on from your old machine.

But I found some things that weren’t quite right. Most of them had to do with processes that keep open files or databases, and don’t get backed up in a clean fashion.

  • Interestingly, Safari didn’t propagate uBlock Origin, which was a manually-added extension. This was the only surprising one of the bunch.
  • MySQL databases. I hadn’t shut down the MySQL server when the backup ran, so the table files from the Time Machine restore were corrupted. I was able to copy the files off the old machine (after shutting down mysqld gracefully) and use those; a sketch follows this list.
  • TimeKeeper’s datastore files were all corrupted. I had to delete them, and re-export/import the data from the old machine.
  • VMware Virtual Machines. I knew they’d get corrupted if backed up by Time Machine for the same reason as the above, but then I forgot that they weren’t backed up. I had to manually copy them off the old machine. This reminded me — if I want backups of my VMs, I need to do it myself!
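The MySQL recovery in particular came down to something like this (paths assume a tarball-style install; yours may differ):

# on the old machine: stop mysqld cleanly so the table files are consistent
sudo /usr/local/mysql/support-files/mysql.server stop
# copy the data directory over to the new machine, then restart mysqld there
rsync -a /usr/local/mysql/data/ newmachine:/usr/local/mysql/data/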

That’s all thus far. Nothing too surprising, but a good reminder: just because you’re backing up doesn’t necessarily mean you’re backing up stuff in a restorable state!


Thu, 31 Aug 2017

Getting nagios back up and running… again.

— SjG @ 12:42 pm

Nagios monitoring on one CentOS 6.9 server seemed to have stopped working after an upgrade. All the tests showed status OK, but they hadn’t actually run in days.

Looking at the service details was weird, because the next scheduled check was about a minute in the past.

The Nagios help page wasn’t much help. And we’re running Core 4.3.x anyway, without a MySQL database.

The first clue was a bunch of lines in the event log:
Error: Could not open check result queue directory '/var/log/nagios/spool/checkresults' for reading.

Turns out we didn’t even have a /var/log/nagios/spool directory. Creating those directories helped.
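Recreating them is quick (the nagios:nagios ownership is my assumption; adjust for your install):

mkdir -p /var/log/nagios/spool/checkresults
chown -R nagios:nagios /var/log/nagios/spool

But Nagios still wouldn’t start from the usual startup scripts. Nothing in the main log. But then, another clue.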

Who doesn’t love to see shit like this:

$ cat /var/log/nagios/nagios.configtest
ERROR: Errors in config files - see log for details: /var/log/nagios/nagios.configtest
$

So the startup script /etc/init.d/nagios searches for warnings, and aborts if they exist. It’s supposed to log them. For some reason it didn’t.

You can manually get those warnings and errors yourself by running

/usr/sbin/nagios -v /etc/nagios/nagios.cfg (adjust paths as appropriate).

I ended up with a bunch of warnings for deprecated parameters. So I went in and edited my config files to remove them or update them to the new equivalents. Oh yes. Software authors, please keep in mind: nothing pleases your users more than changing the names of variables in config files. We users live for this shit. When, oh when, will the author of our software next change “retry_check_interval” to “retry_interval”?
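For what it’s worth, a sed pass handles renames like that in bulk (the config path is an assumption; the .bak suffix keeps backups):

sed -i.bak 's/retry_check_interval/retry_interval/' /etc/nagios/objects/*.cfg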

Fixing all of the warnings was not enough, though. The startup script gave the message “Starting nagios:” and then silently died. Well, sort of. It actually was starting now, but brokenly:

# ps aux | grep -i nagios
nagios 12610 0.0 0.0 12296 1220 ? Ss 16:36 0:00 /usr/sbin/nagios -d /etc/nagios/nagios.cfg
nagios 12611 0.0 0.0 0 0 ? Z 16:36 0:00 [nagios] <defunct>
nagios 12612 0.0 0.0 0 0 ? Z 16:36 0:00 [nagios] <defunct>
nagios 12613 0.0 0.0 0 0 ? Z 16:36 0:00 [nagios] <defunct>
nagios 12614 0.0 0.0 0 0 ? Z 16:36 0:00 [nagios] <defunct>
nagios 12616 0.0 0.0 11780 520 ? S 16:36 0:00 /usr/sbin/nagios -d /etc/nagios/nagios.cfg
root 12744 0.0 0.0 103328 876 pts/0 S+ 16:44 0:00 grep -i nagios

Starting directly from the command line worked:

# /usr/sbin/nagios -d /etc/nagios/nagios.cfg
# ps aux | grep nagios
nrpe 7282 0.0 0.0 41380 1340 ? Ss 16:51 0:00 /usr/sbin/nrpe -c /etc/nagios/nrpe.cfg -d
nagios 8010 0.0 0.0 16404 1280 ? Ss 17:31 0:00 /usr/sbin/nagios -d /etc/nagios/nagios.cfg
nagios 8011 0.0 0.0 10052 920 ? S 17:31 0:00 /usr/sbin/nagios --worker /var/spool/nagios/cmd/nagios.qh
nagios 8012 0.0 0.0 10052 920 ? S 17:31 0:00 /usr/sbin/nagios --worker /var/spool/nagios/cmd/nagios.qh
nagios 8013 0.0 0.0 10052 920 ? S 17:31 0:00 /usr/sbin/nagios --worker /var/spool/nagios/cmd/nagios.qh
nagios 8014 0.0 0.0 10052 920 ? S 17:31 0:00 /usr/sbin/nagios --worker /var/spool/nagios/cmd/nagios.qh
nagios 8015 0.0 0.0 15888 552 ? S 17:31 0:00 /usr/sbin/nagios -d /etc/nagios/nagios.cfg
root 8018 0.0 0.0 100956 616 pts/0 S+ 17:31 0:00 tail -f /var/log/nagios/nagios.log
root 8037 0.0 0.0 103328 856 pts/1 S+ 17:32 0:00 grep nagios

When things get this weird, there are only two options. Well, three, if you include the “rm -rf /” option. But the other two are: 1) reboot and see if stuff magically starts working, or 2) see what SELinux is breaking.

# tail -f /var/log/audit/audit.log
type=AVC msg=audit(1504137329.209:41): avc: denied { execute_no_trans } for pid=7731 comm="nagios" path="/usr/sbin/nagios" dev=dm-0 ino=1201464 scontext=unconfined_u:system_r:nagios_t:s0 tcontext=system_u:object_r:nagios_exec_t:s0 tclass=file

Yup. As expected, SELinux breaking stuff.

So, my preferred way to solve this kind of problem is to snip out all the relevant AVC “denied” lines from the log into a single file (which I called audit.log), and then use audit2allow to create a new module. Since there’s already a nagios module (containing insufficient privileges), I created a nagios2 module:

# audit2allow -M nagios2 < audit.log
# semodule -i nagios2.pp

Hooray! After a few iterations of this process (discovering other blocked operations, granting them permission, restarting Nagios), everything was working but check_disk_smb, which was returning “results from smbclient not suitable” even though it worked fine when tested from the command line as follows:

# su - nagios -s /bin/bash -c "/usr/lib64/nagios/plugins/check_disk_smb -H SMBHOST -s share -a 10.X.X.X -u nagios -p \"password\" -w 90 -c 95"
Disk ok - 16.71G (11%) free on \\SMBHOST\share | 'share'=134108676096B;136850492620.8;144453297766.4;0;152056102912

Diving in and editing check_disk_smb to print the actual error message, I found smbclient returning “ERROR: Could not determine network interfaces, you must use a interfaces config line”. So I edited /etc/samba/smb.conf, and explicitly told samba which interfaces it had available:

interfaces = lo eth0 10.X.X.X/24
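That line lives in the [global] section of smb.conf. A quick way to confirm samba actually parsed it (output roughly like this):

# testparm -s | grep interfaces
	interfaces = lo eth0 10.X.X.X/24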

Le sigh. Now this error went away, and I got to go for another fun and challenging round of “find all the SMB operations that SELinux is breaking.” This time, I got tripped up by the “dontaudits” — there were operations being blocked, but not logged. I was saved by TrevorH and sfix, helpful people in Freenode’s #centos IRC channel:

TrevorH: semodule -DB to disable dontaudit rules, stay permissive, recreate, use the audit log to generate a policy as per the wiki
13:14 TrevorH: @selinux
13:14 centbot: Useful resources for SELinux: http://wiki.centos.org/HowTos/SELinux | http://wiki.centos.org/TipsAndTricks/SelinuxBooleans | http://docs.fedoraproject.org/en-US/Fedora/13/html/Security-Enhanced_Linux/ | http://www.youtube.com/watch?v=bQqX3RWn0Yw | http://opensource.com/business/13/11/selinux-policy-guide
13:15 _SjG_: Thanks
13:16 TrevorH: semodule -B when done (as well as setenforce 1)
13:25 _SjG_: TrevorH: thanks, that resolved it.
13:25 _SjG_: so what I was missing is that there can be donaudit rules that were preventing specific operations from showing up in the audit log?
13:28 TrevorH: yes
13:28 sfix: _SjG_: yep, there’s permissions in the policy that we know are requested but don’t want to allow for whatever reason. dontaudits are our way of preventing them from cluttering the audit log.
13:29 sfix: dontaudits tend to be a bit over-eager though
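For the record, TrevorH’s recipe comes down to something like this (the grep filter and module name are mine):

# disable the dontaudit rules so the hidden denials get logged
semodule -DB
setenforce 0
# ...restart nagios, exercise check_disk_smb, then build a module from the denials:
grep smb /var/log/audit/audit.log | audit2allow -M nagios2smb
semodule -i nagios2smb.pp
# put the dontaudit rules back and re-enable enforcement
semodule -B
setenforce 1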

So there you have it.

I was finally back to where I had been mere days before.


Fri, 14 Jul 2017

Surveillance, Big Data, and Big Stupidity

— SjG @ 4:21 pm

(This post was started in March of ’16, revised later.)

Recently, a friend I’ll call Cassie was on a trip abroad to a country I’ll call Absurdia. She went to access her Google mail account, and was promptly locked out by the clever security system. It had determined that someone was accessing the account from overseas. Presumably, she was asked one or more security questions that she couldn’t answer (“When did you first create this account?”) along with one or another of her own security questions. OK, bad on her, you might say, for not remembering the answers to her security questions, and hooray for adaptive security that protected her account from unauthorized access!

But let’s examine that for a moment. Adaptive security recognized that the access was from a new place — not merely a different computer or IP address, but a different country. Great, makes a lot of sense. But if we step back to the weeks before her departure, Cassie was being served ads for hotels around Absurdia. She was being served ads for taxi companies in Absurdia, airline bargains for nonstop flights to Absurdia, and online language courses in Absurdese. You see, Google processes Gmail messages, and extracts keywords and knowledge in order to serve ads that the user will find interesting1. When Cassie emailed people about her upcoming trip to Absurdia, Google’s algorithms understood enough to start serving travel-related ads for the place. Google “knew” that Cassie was going to Absurdia. But this knowledge was not propagated beyond the ad-serving system.

Back in the 80s, my sister did a semester abroad in Rostock, in what was then the German Democratic Republic — East Germany. There was a very limited exchange program between Brown University and the GDR, and she was one of a handful of American students who took advantage of it. We have some family history in Rostock. A great-aunt had lived there, and my sister wanted to do some research on what had become of her. This great-aunt had been elderly by the time of the Second World War, and my sister wanted to know if she had died of natural causes (sadly, it turns out that she had not).

Now, the reason I’m telling this seemingly unrelated story involves something that happened years later. After the reunification of Germany, and as part of the national reconciliation process, people could request their Stasi files. That’s the collection of data that had been accumulated by the Staatssicherheitsdienst — the secret police — gathered via informants, phone taps, reading mail, and so forth. Naturally, during the tense Cold-War Reagan years, the East German security apparatus assumed that any American who would study there was a CIA agent, so my sister’s file was extensive.

Her file was also slightly ridiculous: pages and pages of hand-written notes, filled with scuttlebutt and rumor. What was particularly enlightening was just how far off base the operatives had been. They missed critical details, and misinterpreted others. My sister’s attempts to track down our great-aunt became, in their notes, a frustrated attempt to make contact with a hitherto unknown agent. With all the data they gathered, with all the information they accumulated, there was no actual gain in knowledge. In fact, there could have been even greater costs: the incorrect assumptions and misunderstanding could have resulted in the agency siphoning off resources to pursue this phantom.

Now, you might suggest that I’m the one who is missing the point here. Perhaps, you could argue, that this is the nature of bureaucracy. The agents monitoring my sister were obligated to report to their superiors, so they grasped at whatever straws were available, and willfully ignored clues that would get in the way of a narrative that would please the authorities.

But in a way, that is the point. Surveillance generally finds what it’s seeking and only utilizes it for the purpose at hand.

In this day where Big Data is a tech industry buzzword, we continuously see articles on “business intelligence” and adaptive systems. More data gathering will solve all kinds of business problems. We read that credit card companies can predict divorce, that Target Stores predict pregnancies, and so on2.

And there are other successes. In the last year, there was a fascinating article on how a programmer helped discover cheating in the crossword puzzle world. “I guess that’s the nature of any data set. You might find things you’d rather not see,” said one of the people who contributed data to the collection that ended up confirming the plagiarism.

But Amazon still serves me ads for, say, umbrellas for weeks after I actually buy an umbrella from them. Maybe they set a flag when I look at the product, but don’t unset it when I buy one. I do some work on the MarriageToGo.com site from a new computer, and suddenly I’m being served wedding ads. Ads are scattershot, and the only penalty for throwing stuff at the wall to see what sticks is that lower-value ads crowd out higher-value ones.

This kind of bad data processing is annoying, but not harmful. The same is not true with crime-prediction, voter targeting, insurance assessment, and other tasks upon which “deep learning” is being brought to bear. If the AI is built with bad assumptions, it can have serious effects on people. Training AI with “real world data” that’s been filtered by the status quo is equally dangerous. I think it’s obvious what happens. You can become un-insurable, denied loans, put on a no-fly list, and worse. “I do assure you, Mrs. Buttle, the Ministry is very scrupulous about following up and eradicating any error.”3

Whenever I fuck up something spectacularly in a complicated piece of code, I think of the Donald Fagen lyric:

A just machine to make big decisions
Programmed by fellows with compassion and vision

Unfortunately, as we see time and again, both of those attributes are often lacking. Stressed or overworked programmers, get-rich-quick VC and startup culture, bad assumptions, and a lack of examining the biases built into data sets all contribute to the failure of our machines to live up to that ideal.

1 Google issued a blog post at the end of June 2017, saying this practice would stop.

2 Interestingly, in the update to that article, Visa indignantly claims they do not track marital status, nor offer a service to predict divorces. Maybe the protest is carefully worded to hide their capabilities, or maybe it’s straightforward and honest. The fact remains that credit card companies know an enormous amount about their customers.

3 As Terry Gilliam, Tom Stoppard, and Charles McKeown captured so deftly in Brazil.

