fogbound.net




Wed, 27 Jun 2007

Unix: How to find files lacking certain strings

— SjG @ 4:10 pm

So, I’m working on a convoluted web site, and a problem comes up. It seems that some vitally important code was not included in some pages (for the sake of argument, let’s say it’s a copyright string). This particular site has an ungodly mix of files, including .htm, .html, and .jsp files. Some of the .jsp files are actual pages, and others are stubs to be included in other .jsp pages. The majority of the full .jsp pages include a “footer.jsp” that has the desired string, so they’re good. But I need to generate a list of the full pages, of whatever sort, that lack this string.

The inverse of this problem is easy, and is the kind of thing I use all the time:
find . \( -name \*.htm -o -name \*.html -o -name \*.jsp \) -exec grep -il "myString" {} \;

Initially, I thought using the -v flag to grep would work for me, but grep -vl returns practically every file it sees, because -v inverts the match line by line: a file gets listed as soon as it contains any line that lacks the string, not only when the whole file lacks it. Then there’s the problem that I need to match “full” pages rather than included .jsp stubs.
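A quick way to see that behavior, as a sketch with two throwaway files (has.html and lacks.html are made-up names for this demo, not files from the real site):

```shell
# Scratch directory with two hypothetical pages.
demo=$(mktemp -d)
printf '<html>\nmyString\n</html>\n' > "$demo/has.html"
printf '<html>\nnothing here\n</html>\n' > "$demo/lacks.html"

# -l alone lists only the file that contains the string.
with_l=$(cd "$demo" && grep -il "myString" has.html lacks.html)

# -v inverts line by line, so -vl lists every file that has at least one
# non-matching line. Both files have such lines, so both are listed.
with_vl=$(cd "$demo" && grep -ivl "myString" has.html lacks.html)

printf 'grep -il:\n%s\n\ngrep -ivl:\n%s\n' "$with_l" "$with_vl"
rm -rf "$demo"
```

Even the file that contains the string shows up under -vl, because its other lines don't match.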

So here’s how the Mighty Power of Unix came to my rescue:

find . -name \*.htm -o -name \*.html -o -name \*.jsp | xargs grep -il "</html>" | sort -u > full_pages.txt

provides me with a list of pages that are not mere inclusions, if you accept my assumption that an inclusion won’t match the closing HTML tag.

Then I generate a list of full pages that contain the magic string and/or include the footer.jsp that would contain the magic string (so, despite its name, pages_no_string.txt lists the pages that are fine):
find . -name \*.htm -o -name \*.html -o -name \*.jsp | xargs grep -il "</html>" | xargs grep -le "uniqueCopyrightTag\|footer\.jsp" | sort -u > pages_no_string.txt

Then I compare the files to find out which full pages lack both the magic string and the include:
comm -13 pages_no_string.txt full_pages.txt
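For anyone unsure how comm treats two sorted lists like these, here is a small sketch with made-up lists (full.txt and with.txt are hypothetical stand-ins for the two files above). comm -3 drops the lines common to both files and indents lines unique to the second file with a tab; comm -13 prints only the lines unique to the second file, with no indentation.

```shell
demo=$(mktemp -d)

# Hypothetical sorted lists: every full page, and the pages that have the string.
printf '%s\n' ./a.html ./b.jsp ./c.htm > "$demo/full.txt"
printf '%s\n' ./a.html ./c.htm > "$demo/with.txt"

# -3 suppresses the "in both" column; the line unique to the second file
# is still printed in column 2, so it carries a leading tab.
unique_either=$(comm -3 "$demo/with.txt" "$demo/full.txt")

# -13 also suppresses column 1, leaving only lines unique to the second file.
only_full=$(comm -13 "$demo/with.txt" "$demo/full.txt")

printf 'comm -3:\n%s\ncomm -13:\n%s\n' "$unique_either" "$only_full"
rm -rf "$demo"
```

comm expects both inputs sorted the same way, which is what the sort -u at the end of each pipeline takes care of.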

Wow. There it is!

I bet there’s an easier way. Post an example in the comments if you know of one!
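One candidate, offered as a sketch rather than the definitive answer: GNU grep's -L flag (--files-without-match, also available in BSD grep) lists files that contain no matching line, which is the file-level inversion that -v doesn't give. It isn't in POSIX, so check your grep's man page. The scratch site and its file names below are made up for the demo, and the two -e patterns stand in for the \| alternation used above.

```shell
# Build a tiny hypothetical site: three full pages and one include stub.
demo=$(mktemp -d)
mkdir "$demo/site"
printf '%s\n' '<html>uniqueCopyrightTag</html>' > "$demo/site/tagged.htm"
printf '%s\n' '<html><%@ include file="footer.jsp" %></html>' > "$demo/site/footed.jsp"
printf '%s\n' '<html>no notice at all</html>' > "$demo/site/bare.html"
printf '%s\n' '<p>just a fragment</p>' > "$demo/site/stub.jsp"

# Step 1: keep only "full" pages (those containing </html>).
# Step 2: of those, list the files where neither pattern matches (-L).
missing=$(cd "$demo/site" &&
  find . \( -name '*.htm' -o -name '*.html' -o -name '*.jsp' \) \
       -exec grep -il '</html>' {} + |
  xargs grep -L -e 'uniqueCopyrightTag' -e 'footer\.jsp' |
  sort)

printf '%s\n' "$missing"
rm -rf "$demo"
```

This skips the two intermediate files and the comm step. Like the original pipelines, it assumes file names without spaces, and it assumes at least one full page exists so that the second grep always receives file arguments.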

NOTE: All commands are on a single line, regardless of whether they wrap in this particular display.


Sat, 16 Jun 2007

You Can’t Win

— SjG @ 6:16 pm

You Can’t Win, by Jack Black, 1926, reprinted by Nabat Press, 2000.

This is an interesting, conflicted, tripartite book. It’s an autobiography of a hobo and burglar, a jailbird, and a reform activist. The book starts as a good-natured telling of how Black left home, and became a hobo. We follow him as he gets caught up in the seamier side of life away from home, and how, ostensibly, through misunderstandings, he came to fall fully on the wrong side of the law. The arc continues through opium addiction, prison, abuse, and ends in reform and moral outrage.

The first part of the telling is a light, almost romantic adventure. The young man goes off, has adventures in the city, then starts to ride the rails. Sure, there’s danger, there are police and railyard bulls to avoid, there’s even sudden death from shifting cargo, but the telling almost brims with the exuberance of youth. Black encounters other hobos, who welcome him into the family, teach him the argot, and start showing him the ropes.

From here, the tale darkens. Black apprentices himself out to be a burglar, and the situations get more perilous. Friends get killed; Black gets into and out of prison. Still, the tale is rip-roaring adventure: now a member of the brotherhood of thieves, Black introduces us to a cast of wild characters. He describes to us the great hobo gatherings, with their camaraderie and drunken abandon. He details many hair-raising exploits of burglary and safe breaking.

The latter part of the book involves a lot more prison, betrayal, and drug addiction. It still has elaborate capers of theft and jailbreak, but now Black has suffered under the system. Authority is now beating him down, and he responds with wantonness and violence. In the end, there is kindness and reform.

The book is particularly intriguing for its shifts in tone. There is definite pride in the exploits, even if the words condemn his actions. The latter parts of the book are quite bitter, and the emotions are contradictory: Black blames the cruel neglect and abuse of society for making him into a monster, yet he also happily admits that he never had any interest in becoming part of society or behaving in a way that society would accept. This is what makes the book more than just a personal journey or a thriller; we experience the world from Black’s perspective, seeing hypocrisies both in the society with which he’s in conflict and in his antisocial lifestyle.


Church Signs

— SjG @ 5:41 pm

Facing the main street at a church up the road from here is one of those illuminated signboards with the movable letters. For years, it amused me with its inadvertent proclamation:

SUN.WORSHIP
10-11 WEEKLY
ALL WELCOME

I’d always had to resist stealing the punctuation. “Get thee behind me, Loki!” I’d say quietly under my breath. And somehow I forbore.

Evidently, however, I was not the only one who interpreted the sign that way. So it’s been changed:

SUNDAY
WORSHIP 10-11
ALL WELCOME

But I still wonder if it bothers them that they’ve just abstracted the misunderstanding by one level. After all, the word “Sunday” originates in Pagan sun worship (refs: here which links to other sources, and numerous others).


Thu, 3 May 2007

McCarthy’s Bar

— SjG @ 8:16 pm

McCarthy’s Bar, Pete McCarthy, 2000, Hodder and Stoughton.

Like The Red-Haired Girl from the Bog, McCarthy’s Bar starts with an author of Irish descent undertaking a search for identity in Ireland. McCarthy, however, has a much more down-to-earth approach, which begins with the rule that you should never pass up a bar with your name on it (sage advice for someone traveling in Ireland with the name of McCarthy, no doubt, but for us Goldsteins it should be understood that we might have to assume an alias to avoid serious sobriety and/or dehydration).

McCarthy’s writing is reminiscent of Bill Bryson — self-deprecating, incisive, descriptive, and howlingly funny in places. The humor shouldn’t suggest that he’s not very serious about his quest, nor does it soften the keen edge to his observations.

Monaghan’s writing makes it seem that she was relatively comfortable in accepting the mantle of identity. She is in many ways more distant from her heritage, and is able to project certain things on them (being poets and sennachie). McCarthy dwells more, perhaps, on what identity means to him. He meets more relatives on his trail, is accepted as a relative to people who may or may not be blood relations, and ruminates on non-Irish who are working at becoming Irish.

Of the two books, McCarthy’s Bar is probably the better travelogue. Reading it meshed more closely with our experiences, and sometimes this added to the humor (e.g., the derelict Titanic-themed bar we passed in Cobh, which he describes as still being planned while observing that the local predictions for it are not promising).


The Red-Haired Girl from the Bog

— SjG @ 7:49 pm

The Red-Haired Girl from the Bog: The Landscape of Celtic Myth and Spirit, Patricia Monaghan, 2004, New World Library.

I recently returned from three weeks traveling through Ireland with The Right Reverend Oakes. It was something of a whirlwind tour; we visited a lot of different places both on and off of the standard tourist track. Along the way, I finished up reading The Red-Haired Girl from the Bog, which is an interesting blend of Irish history, Irish mythology, and philosophy, along with a fair amount of rumination on the meaning of place, thoughts on personal relations, and observations on poetry, all wrapped loosely in a collection of personal anecdotes structured by regions in Ireland.

The book is not really a travelogue, per se, as it’s more concerned with mythology as it pertains to place, but it does provide counterpoint and commentary to someone traveling through the places described.

Monaghan starts the book with a not-very-mystical quest involving identity: what does Ireland tell a person about herself as a member of the Irish diaspora? It quickly goes beyond that, and shows the complex intertwining of the mythological layers in Ireland, from Pagan to early-Christian to Roman Catholic to neo-pagan. Monaghan also talks about how those beliefs may mesh and coexist in a way that is fairly alien to us Americans. The bulk of the book uses personal experiences as segues into mythology, and vice versa, in a very readable way.

Monaghan is clearly extremely well read, which allows her to bring together a broad perspective on topics, but also results in an overuse of descriptions of the form “what writer X has called ‘Y’.” I appreciate the need for attribution, but personally find footnotes less disruptive. As someone who is at best marginally familiar with Celtic, Irish, and Christian mythology (or history, for that matter), I also found the flurry of names and references somewhat dizzying. Fortunately, if you’re as undisciplined as I, you can let the references wash over you without needing to absorb them all; Monaghan’s overall narrative is good enough that you don’t need all the specifics.
