fogbound.net




Thu, 13 Jun 2013

Failures in Image Processing

— SjG @ 10:00 am

I have a tendency to have ideas that seem simple until I attempt the implementation. Then, all of the complexities jump out at me, and I give up and move on to the next idea. Once in a while, however, I’ll go so far as to prototype something or play around before abandoning ship. This here’s an example of the latter.

The Big Idea hit when I was looking up a recruiter who had emailed me out of the blue. He shared the name of someone I went to school with, and I wanted to see if it was, in fact, the same person. In this case, a quick Google Images search on the name and job title indicated that it was not the same person, so I didn’t have to fake remembered camaraderie.

While searching, though, I thought it interesting the variety of faces that showed up for that name. Hm. Wouldn’t it be cool, thought I, if I could enter a name into my simple little web service, and it would spit back the “average face” of the first ten or twenty matching images from Google? After writing a few pages of code, scrapping them, switching languages and libraries, writing a few pages of code, scrapping them, switching languages and libraries again, writing a few more pages of code, I then ditched the whole enterprise.

In the process, I did use some morphing software to test the underlying concept. Here is the average face for my picture mixed in with the first seven Samuel Goldstein images that were clear enough, angled correctly, and of sufficient size to work with:

For what it’s worth, here are a few of the software challenges to automating something like this:

  • Extracting the portrait from the background. This isn’t critical, but will simplify subsequent tasks.
  • Scaling the heads to be the same size.
  • Aligning the images more or less. I was going to use the eyes as the critical alignment points; if they couldn’t be matched within a certain degree of accuracy, that would suggest incompatible images (e.g., one a 3/4 portrait, the other straight on). A rough sketch of this detection step follows the list.
  • Detail extraction. This is finding key points that match on each image. Experimentally, when combining images by hand, it may be sufficient to match:
    • 4 points for each eye, marking the boundaries
    • 3 points marking the extremities of the nostrils and bottom center of septum
    • 5 points marking ear boundaries: top point where they connect to the head, top of extension, center, bottom of lobe, point where lobe connects to head
    • 5 points marking top outer edges, outer angle, and center of mandible
    • 5 points mapping hairline
    • 5 points mapping top of hair
    • 7 points along the superciliary ridge
  • Interpolate these points on each pair of images, then on each interpolated pair, and so on until a single final interpolation is achieved
  • Render the image to a useful format
  • View image, and laugh (or cry)
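
If you’re curious what the eye-detection step might look like, here’s a rough sketch using OpenCV’s bundled Haar cascades. To be clear, this is not code from my abandoned attempts; the library choice, function names, and thresholds are all mine, and your mileage may vary:

    # Hypothetical sketch: find the eye centers of the largest face in an image.
    import cv2

    face_cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    eye_cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_eye.xml")

    def eye_centers(image_path):
        """Return the two eye centers of the largest detected face, or None."""
        img = cv2.imread(image_path)
        if img is None:
            return None
        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        faces = face_cascade.detectMultiScale(gray, 1.3, 5)
        if len(faces) == 0:
            return None
        x, y, w, h = max(faces, key=lambda f: f[2] * f[3])
        eyes = eye_cascade.detectMultiScale(gray[y:y + h, x:x + w])
        if len(eyes) < 2:
            return None  # probably a 3/4 or profile shot; reject it
        # take the two largest detections as the eyes
        eyes = sorted(eyes, key=lambda e: e[2] * e[3], reverse=True)[:2]
        return [(x + ex + ew / 2.0, y + ey + eh / 2.0) for ex, ey, ew, eh in eyes]

Once you have two eye centers per image, the scale and rotation needed to line the faces up follow directly, and any image where two eyes can’t be found gets rejected as incompatible.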

A few other lessons learned:

The typical picture returned by Google images search for a name will be a thumbnail portrait. It’ll be small — maybe 150 by 300 pixels or so. While that’s enough data to recognize a face, it’s not a lot of data when you start manipulating it. Ideally, for nicer results, source images should be a minimum of two or three times that size.

Google gives back different results depending on whether you surround the name with quotes or not; it also makes a big difference if you pass size or color parameters. The “face” search is a good parameter to use, although when searching for “Samuel Goldstein,” face search inexplicably turned up lots of Hitlers and Osama bin Ladens. The “safe search” feature is also strongly recommended for this project; again, searching for “Samuel Goldstein” without safe search yielded a variety of unexpected vaginas.

Bing image search gives different results (not too surprisingly), but they also have some anomalies. My search brought back many of the same pictures, along with an inexplicable collection of calabashes and LP labels.

If any ambitious programmers go ahead and implement this, please send me the link when you’re done!


Mon, 8 Apr 2013

Why I need an extension on my taxes

— SjG @ 9:49 pm

… because I wanted to play with HTML 5 Canvas.

This should run on any reasonably sane browser. Play with it yourself here.


Tue, 19 Mar 2013

Godwin’s Law

— SjG @ 8:22 am

(Mary) Godwin’s Law: As a summer of incessant Swiss rain wears on, the probability of romantic poets turning into monsters approaches one.


Wed, 30 Jan 2013

Failures in Typographical Experimentation

— SjG @ 8:35 pm

This started with an idea.

Perhaps it would be interesting to create a family of type faces where the density of the characters was related to the frequency of their use. This font, to be called Densitas, would have variants based upon the text analyzed. For example, Densitas Shakespeare would use the collected works of Shakespeare for the character frequency corpus, while Densitas Brontë would use the works of the Brontë sisters for the corpus. For aesthetic purposes, perhaps the initial faces could be selected based on relevance to the source corpus as well.

What would this accomplish? It might reveal something interesting about the difference in usages between authors. It might end up being environmentally friendly by using less ink on more common characters. It might enhance readability. After all, it’s popularly understood that we tend to look at the shapes of words rather than the constituent letters. De-emphasizing the more common shapes may even make it easier to process text.

Any time I have an idea of this nature, I start thinking about code and design and try to avoid thinking about the end result. As my Father is wont to say, it is just as difficult to create something ugly as it is to create something beautiful. If I think too much on the end result, I will obsess over whether it will be worth the effort, and never get to the actual work. If I just dive in, I may find myself wasting a lot of time, but at least I will learn something.

This turns out to be one of those experiences. I thought it was an interesting idea. The end result is mediocre at best, dull perhaps, a waste of time. Still, I learned something in the process.

Step one was to write a character frequency analyzer. This code does a few things:

  • read a text file
  • compute the character frequencies
  • scale the results across the frequency range, so the least frequent character has a value of zero and the most frequent character has a value of one
  • map the characters to glyph names
  • write out a chunk of code to substitute into the next step

The next step is a FontLab Studio/RoboFab script, hence glyph names instead of raw character names. Since FontLab/RoboFab scripts are in Python, I figured I’d write this in Python as well (I don’t really know Python, but that kind of ignorance never stops me from writing code).

I ended up with this program: cf.py
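
If you don’t feel like reading it, the gist is roughly the following sketch. The names here are mine, not cf.py’s, and the real script handles more of the glyph-name mapping:

    # Approximate sketch of the frequency analysis; not the actual cf.py.
    from collections import Counter

    def scaled_frequencies(path):
        """Scale counts so the rarest character maps to 0.0, the commonest to 1.0."""
        with open(path) as f:
            counts = Counter(f.read())
        lo, hi = min(counts.values()), max(counts.values())
        return dict((ch, (n - lo) / float(hi - lo)) for ch, n in counts.items())

    # a few of the character-to-glyph-name mappings the FontLab step needs
    GLYPH_NAMES = {" ": "space", ".": "period", ",": "comma", "'": "quotesingle"}

    def emit_weights(freqs):
        """Print a chunk of Python to paste into the FontLab/RoboFab script."""
        for ch, weight in sorted(freqs.items()):
            print("weights[%r] = %.4f" % (GLYPH_NAMES.get(ch, ch), weight))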

I ran it against the plaintext of The Complete Works of William Shakespeare from Project Gutenberg (after stripping out the Project Gutenberg-specific text, which I believe is permitted since I’m not redistributing the text, merely crunching it with code).

The FontLab/RoboFab script accepts two font sources, and interpolates each glyph according to the frequency computed in the previous step, where the less frequently used glyphs are darkest. For my test, I used the current state of a sans-serif font I’ve been developing1. I have it in several weights, so I interpolated between the lightest and heaviest. The code to do this interpolation looks like shakespeare_weighter.py.
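
In spirit, the interpolation step is something like the sketch below. The master file names and the pasted-in weights dictionary are placeholders, and the real shakespeare_weighter.py does more housekeeping:

    # Approximate sketch of the per-glyph interpolation; not the linked script.
    from robofab.world import NewFont, OpenFont

    # scaled frequencies pasted in from the analyzer's output (abbreviated here)
    weights = {"e": 1.0, "t": 0.87, "q": 0.03, "z": 0.0}

    light = OpenFont("HopeGrotesque-Light.ufo")   # placeholder master paths
    heavy = OpenFont("HopeGrotesque-Heavy.ufo")
    out = NewFont()

    for name, weight in weights.items():
        if name in light.keys() and name in heavy.keys():
            glyph = out.newGlyph(name)
            # factor 0.0 is the light master, 1.0 the heavy master, so the
            # most common glyphs (weight near 1.0) come out lightest
            glyph.interpolate(1.0 - weight, light[name], heavy[name])
            glyph.width = light[name].width
    out.save("Densitas-Shakespeare.ufo")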

The results were unimpressive, to say the least:


There are some obvious problems: the distribution of densities is too stark; there seem to be only two or three distinct densities. Similarly, kerning gets really disrupted by the different densities. But first things first: why is the density contrast so extreme? Looking at the weighted frequency data answers that question:

For this chart, punctuation and other glyphs have been omitted.

So the next approach is to make the differences more gradual. Instead of going by pure letter frequency, we use a gradient based on the ranking of frequency. In other words, the least common glyph is the darkest, the next least common glyph is one increment lighter, and so on, until the most common glyph is the lightest. The code to compute this looks like cf2.py, and the output distribution looks like this:
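
As a sketch, my guess at that rank-based computation (cf2.py is the real thing; the function name below is mine):

    # Approximate sketch of the rank-based gradient; not the actual cf2.py.
    def rank_weights(counts):
        """Evenly spaced weights by rank: rarest 0.0 (darkest), commonest 1.0."""
        ranked = sorted(counts, key=counts.get)   # rarest glyph first
        steps = float(len(ranked) - 1)
        return dict((ch, i / steps) for i, ch in enumerate(ranked))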

Looks more promising, does it not? We substitute the values into our FontLab/RoboFab script (like this: shakespeare_weighter2.py), and run it. Alas, the end results are still pretty dull:



For the last try, we’ll do a few things differently. First, the thing that probably jumped out at you when you saw the first distribution graph: we’ll ignore all non-alphabetical characters when doing the frequency calculation. For the sake of readability, we’ll set all non-alphabetical characters to the median value. Secondly, we’ll take accented characters and consider them the same weight as their non-accented versions, so, for example, “á” and “ä” are the same density as “a.” Lastly — and this might be the big shift — we won’t interpolate between two weights of a font based on the frequency, but instead we will effectively halftone each glyph with a screen density based on the frequency.
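In sketch form, the revised frequency rules might look like this. It’s an approximation, with the accent folding done via Unicode decomposition; cf3.py may well go about it differently:

    # Approximate sketch of the third-pass frequency rules; not the actual cf3.py.
    import unicodedata

    def fold(ch):
        """Fold an accented letter onto its base letter (so a-acute counts as a)."""
        return unicodedata.normalize("NFKD", ch)[0].lower()

    def third_pass_weights(text):
        counts = {}
        for ch in text:
            if ch.isalpha():
                base = fold(ch)
                counts[base] = counts.get(base, 0) + 1
        weights = rank_weights(counts)   # rank-based gradient from the earlier sketch
        # non-alphabetic glyphs get a middle-of-the-road value
        # (the real script pins them to the median)
        weights["default"] = 0.5
        return weights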

To do this, we use the RoboFab halftoneGlyph() pen for inspiration. We take a much blunter approach: we impose a grid over the glyph, determine which points on the grid are inside, and replace those points with squares. The size of the squares is the same across a given glyph, and is based on the frequency. This process will then convert a nice, smooth glyph into a rougher, pixellated gray version of itself.
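
A sketch of that blunter approach, using RoboFab’s pointInside() test and a segment pen to draw the squares (the grid spacing and square sizes here are arbitrary choices of mine):

    # Approximate sketch of the grid "halftone" pass; sizes are arbitrary.
    GRID = 20   # grid spacing in font units

    def halftone(source, target, weight):
        """Rebuild target as a field of squares sampled from source on a grid.
        weight runs 0..1; the most common (lightest) glyphs get the smallest squares."""
        if not len(source):                  # skip empty glyphs such as space
            target.width = source.width
            return
        xmin, ymin, xmax, ymax = source.box
        size = GRID * (0.9 - 0.6 * weight)   # square size shrinks as weight rises
        pen = target.getPen()
        y = ymin
        while y <= ymax:
            x = xmin
            while x <= xmax:
                if source.pointInside((x, y)):
                    half = size / 2.0
                    pen.moveTo((x - half, y - half))
                    pen.lineTo((x + half, y - half))
                    pen.lineTo((x + half, y + half))
                    pen.lineTo((x - half, y + half))
                    pen.closePath()
                x += GRID
            y += GRID
        target.width = source.width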

The revised frequency computation code is here (cf3.py), and the resulting frequency graph looks like this:

From this, we generate the final FontLab/RoboFab script (this one: shakespeare_weighter3.py), and run it.

And yet again, we look at the results, and sigh. All this work, and really nothing to show for it. There are a number of problems. The font stresses most rendering engines with its very high contour count, and either gets blurred into oblivion or converted into a plaid checkerboard nightmare when viewed on a display. The differences in shades are only apparent when the characters are enormous, even when printing. And, of course, aesthetically, it’s nothing to write home about.


The lack of results is dispiriting enough to resort to quoting that reprobate Thomas Edison: “Results! Why, man, I have gotten a lot of results! I know several thousand things that won’t work.”

I can’t claim to know thousands of things that won’t work, but I do have another handful to add to the collection.

1 The font will be released as WL Hope Grotesque, when and if I ever complete it to my satisfaction.


Tue, 8 Jan 2013

iPhone 5 Wallpapers

— SjG @ 8:48 pm

I needed to change my iPhone lock screen image. With no further aesthetic commentary, here are three Retina-ready, iPhone 5-sized images you can use. They’ll also work on non-Retina or lower-resolution iPhone screens; you’ll just have to select a portion of the image (or scale it down). If you don’t know how to install Wallpaper images on your iPhone, the first page I Googled gave a pretty good step-by-step.

(Click on the thumbnail to see the full-size versions; right-click to download them)

Enjoy!