
Fri, 14 Jul 2017

Surveillance, Big Data, and Big Stupidity

— SjG @ 4:21 pm

(This post was started in March of ’16, revised later.)

Recently, a friend I’ll call Cassie was on a trip abroad to a country I’ll call Absurdia. She went to access her Google mail account, and was promptly locked out by the clever security system. It had determined that someone was accessing the account from overseas. Presumably, she was asked one or more security questions that she couldn’t answer (“When did you first create this account?”) along with one or another of her own security questions. OK, bad on her, you might say, for not remembering the answers to her security questions, and hooray for adaptive security that protected her account from unauthorized access!

But let’s examine that for a moment. Adaptive security recognized that the access was from a new place — not merely a different computer or IP address, but a different country. Great, makes a lot of sense. But if we step back to the weeks before her departure, Cassie was being served ads for hotels around Absurdia. She was being served ads for taxi companies in Absurdia, airline bargains for nonstop flights to Absurdia, and online language courses in Absurdese. You see, Google processes GMail messages, and extracts keywords and knowledge in order to serve ads that the user will find interesting¹. When Cassie emailed people about her upcoming trip to Absurdia, Google’s algorithms understood enough to start serving travel-related ads for the place. Google “knew” that Cassie was going to Absurdia. But this knowledge was not propagated beyond the ad-serving system.

Back in the 80s, my sister did a semester abroad in Rostock, in what was then the German Democratic Republic — East Germany. There was a very limited exchange program between Brown University and the GDR, and she was one of a handful of American students who took advantage of it. We have some family history in Rostock. A great-aunt had lived there, and my sister wanted to do some research on what had become of her. This great-aunt had been elderly by the time of the Second World War, and my sister wanted to know if she had died of natural causes (sadly, it turns out that she had not).

Now, the reason I’m telling this seemingly unrelated story involves something that happened years later. After the reunification of Germany, and as part of the national reconciliation process, people could request their Stasi files. That’s the collection of data that had been accumulated by the Staatssicherheitsdienst — the secret police — gathered via informants, phone taps, reading mail, and so forth. Naturally, during the tense Cold-War Reagan years, the East German security apparatus assumed that any American who would study there was a CIA agent, so my sister’s file was extensive.

Her file was also slightly ridiculous: pages and pages of hand-written notes, filled with scuttlebutt and rumor. What was particularly enlightening was just how far off base the operatives had been. They missed critical details, and misinterpreted others. My sister’s attempts to track down our great-aunt became, in their notes, a frustrated attempt to make contact with a hitherto unknown agent. With all the data they gathered, with all the information they accumulated, there was no actual gain in knowledge. In fact, there could have been even greater costs: the incorrect assumptions and misunderstanding could have resulted in the agency siphoning off resources to pursue this phantom.

Now, you might suggest that I’m the one who is missing the point here. Perhaps, you could argue, this is the nature of bureaucracy. The agents monitoring my sister were obligated to report to their superiors, so they grasped at whatever straws were available, and willfully ignored clues that would get in the way of a narrative that would please the authorities.

But in a way, that is the point. Surveillance generally finds what it’s seeking and only utilizes it for the purpose at hand.

In this day where Big Data is a tech industry buzzword, we continuously see articles on “business intelligence” and adaptive systems. More data gathering will solve all kinds of business problems. We read that credit card companies can predict divorce, that Target Stores predict pregnancies, and so on².

And there are other successes. In the last year, there was a fascinating article on how a programmer helped discover cheating in the crossword puzzle world. “I guess that’s the nature of any data set. You might find things you’d rather not see,” said one of the people who contributed data to the collection that ended up confirming the plagiarism.

But Amazon still serves me ads for, say, umbrellas for weeks after I actually buy an umbrella from them. Maybe they set the flag when I look at the products, but don’t unset it when I buy one. I do work on the site from a new computer, and suddenly I’m being served wedding ads. Ads are scattershot, and the only penalty for throwing stuff at the wall to see what sticks is that the lower-value ads could crowd out the higher-value ads.

This kind of bad data processing is annoying, but not harmful. The same is not true with crime-prediction, voter targeting, insurance assessment, and other tasks upon which “deep learning” is being brought to bear. If the AI is built with bad assumptions, it can have serious effects on people. Training AI with “real world data” that’s been filtered by the status quo is equally dangerous. I think it’s obvious what happens. You can become un-insurable, denied loans, put on a no-fly list, and worse. “I do assure you, Mrs. Buttle, the Ministry is very scrupulous about following up and eradicating any error.”³

Whenever I fuck up something spectacularly in a complicated piece of code, I think of the Donald Fagen lyric:

A just machine to make big decisions
Programmed by fellows with compassion and vision

Unfortunately, as we see time and again, both of those attributes are often lacking. Stressed or overworked programmers, get-rich-quick VC and startup culture, bad assumptions, and a lack of examining the biases built into data sets all contribute to the failure of our machines to live up to that ideal.

1 Google issued a blog post at the end of June 2017, saying this practice would stop.

2 Interestingly, in the update to that article, Visa indignantly claims they do not track marital status, nor offer a service to predict divorces. Maybe the protest is carefully worded to hide their capabilities, or maybe it’s straightforward and honest. The fact remains that credit card companies know an enormous amount about their customers.

3 As Terry Gilliam, Tom Stoppard, and Charles McKeown captured so deftly in Brazil.


Tue, 9 May 2017

This too shall pass

— SjG @ 9:49 pm

Back in October of 2015, I started writing the following, and never finished or published it:

I upgraded the Mac to Yosemite a year or so ago. Yesterday, I wanted to do some development on a project that I’d been idly thinking about. Unfortunately, it required a dependency in a package I’d installed via Mac Ports. I tried to upgrade it, but got an error that I was compiling for the wrong Darwin version. This means I haven’t actually updated any of my Ports since upgrading to Yosemite! For shame.

Rather than fix Mac Ports for Yosemite, and then again when I upgrade to El Capitan, I decided it was time to do that upgrade and then fix it. I also thought … hey, there’re all these neat new container technologies and configuration tools. Maybe I should look into some of those, and save myself the agony next time around.

So I dove into some articles, and pretty soon had become a seething mass of quivering rage.

To set up my environment in Docker, I need Docker, and a VM. I could set it up using Vagrant, or, as some people recommend, Vagrant running Chef or Solo. Then, of course, I need to set up some replacement for vboxsf so I can access my files in the Virtual environment. Each of these requires its own configuration, of course.

Today, I was struggling with something similar. I’m building an iOS app. Years ago, I’d built a few native iOS apps, but I’ve forgotten everything I ever knew about Objective C, and I don’t know Swift. Plus, I need to publish for Android too. So, six months ago, when I started this process, I decided I’d be using Ionic Framework. It had the advantage that it was based on AngularJS, and I’ve done some work in Angular.

Now that I’m starting, I discover that Ionic 2 is the way to go — oh wait, no, Ionic 3 was just released! And my AngularJS experience is ancient v1.2.x, knowledge which is largely obsolete. I’d be learning Angular 2 — no, we’re up to Angular 4 now — so better get cracking on that.

I remember, many, many years ago, how excited I was when there was a new version of Windows. I couldn’t wait to get all those 3.5″ floppies home so I could upgrade my machine to the latest and greatest. Now, I dread each year when a new version of Mac OS comes out, and I need to upgrade and track down all the things that broke, and rebuild my ports and and and… Not to mention when I installed a recent Linux on a VM to host some sites, and discovered to my chagrin that systemd has replaced all manner of things Unixy that I’ve been doing mostly-the-same for thirty years.

Well shit. There it is. I’ve become the grumpy old software guy. “Why are they changing things? Why can’t they just leave them alone?” The fact is, some of these changes are indisputably improvements. But so many of them seem to be changes for the sake of change. We have to have “new, improved!” all the time, even if it’s just changing the syntax (why, oh why, is *ngFor so much better than ng-repeat !?).

Part of this is struggling with obsolescence in general. It’s hard being middle-aged in tech. You can’t help seeing that look in the eyes of the youngsters: that old guy is so backwards. But it goes beyond that. My neighborhood is changing around me. Younger families are moving in, and suddenly I’m that guy who’s been in the neighborhood for a long time. I find myself navigating by past landmarks — it’s across from the Burger King, er, those condos, right by the Foster’s Freeze, er, Dunkin’ Donuts. The world is changing around me rapidly. The political world I grew up in has shifted. Every year, I see more obituaries for people I know, or whose names I know. Things that were true when I was a child are no longer true.

I have vague memories of hearing these thoughts expressed when I was younger by people I thought were old. I didn’t understand them then. I’m beginning to understand them now.

I just have to remind myself that change is constant, and not all bad. When I was a kid, there were no known exoplanets. When I was in my twenties, I’d come home from a night out, and I’d be stinking of second-hand cigarette smoke. We had to struggle with card catalogs to find books in the library. When trying to reach my friends, I’d have to leave messages on their home answering machines, and I’d have to call from a pay phone where I’d enter in a multidigit phone card number. If you were interested in obscure music or books, you’d have to read tiny ads in the back of magazines to track down sources or information. LGBTQ people were all but invisible, and same-sex marriage was barely even in the realm of speculative fiction.

So, that being said, I’d like change to slow down a bit. Could I please just finish a project before all the constituent languages, libraries, and frameworks have a major version increment?

Tue, 7 Jun 2016

JavaScript compares things weirdly

— SjG @ 2:52 pm

We’ve already established that PHP compares things weirdly.

It shouldn’t surprise us that JavaScript does too.

Consider the following:

> var k=['hello'];
> (k=='hello'?'Equals':'Nope');
'Equals'

Now, purists will point out that that’s an “equals” operator not an “identity” operator, but I mean seriously? We’re just going to pretend that

> ['hello']=='hello'
true

I think I’ll just go and rewrite all my client side code in C now.
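For the curious, here is a rough sketch of why that comparison comes out the way it does. When == gets an array on one side and a string on the other, the array is first converted to a primitive, which for a plain array amounts to joining its elements with commas. The helper looseEqualsArrayString below is hypothetical, made up purely for illustration; it mimics the observable behavior and is not how any engine actually implements it.

```javascript
// Hypothetical helper (not a real API): mimics what == does when
// the left side is a plain array and the right side is a string.
function looseEqualsArrayString(arr, str) {
  // The array is converted to a primitive first. For a plain array
  // this ends up as Array.prototype.toString, i.e. arr.join(',').
  const primitive = arr.join(',');
  // After the conversion, two strings are compared directly.
  return primitive === str;
}

const k = ['hello'];

console.log(k == 'hello');                        // true
console.log(looseEqualsArrayString(k, 'hello'));  // true
console.log(['a', 'b'] == 'a,b');                 // true
console.log(k === 'hello');                       // false (no conversion)
```

So the one-element array "equals" the string because its string form happens to be that string. The identity operator (===) skips the conversion and says what you'd expect.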

Mon, 28 Mar 2016

PHP Compares Things Weirdly

— SjG @ 10:36 am

This is a known .. uh … situation, but it bit me today.

So, consider the following:

$ php --version
PHP 5.4.16 (cli) (built: Jun 23 2015 21:17:27)
Copyright (c) 1997-2013 The PHP Group
Zend Engine v2.4.0, Copyright (c) 1998-2013 Zend Technologies
$ php -a
Interactive shell
php > $v1 = '479014103257633139480';
php > $v2 = '479014103257633139481';
php > echo ($v1==$v2?'Equal':'Not Equal');
Not Equal

Seems sane, yes? Reasonable. Kind of what you expect.

But then, consider this:

$ php --version
PHP 5.3.3 (cli) (built: Feb 9 2016 10:36:17)
Copyright (c) 1997-2010 The PHP Group
Zend Engine v2.3.0, Copyright (c) 1998-2010 Zend Technologies
$ php -a
Interactive shell
php > $v1 = '479014103257633139480';
php > $v2 = '479014103257633139481';
php > echo ($v1==$v2?'Equal':'Not Equal');
Equal

Yeah. Let that sink in for a moment.

Some versions of PHP (before 5.4.mumble) will preëmptively convert strings to numbers before comparing them (if they contain only digits). But if the number is large enough, you may lose the precision to compare them correctly.
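The precision loss is easy to reproduce outside PHP. Here's a small sketch in JavaScript, using the same two values, on the assumption that the old PHP behavior amounted to converting both strings to double-precision floats before comparing. The variable names are mine, and this only illustrates the arithmetic; it says nothing about PHP's internals.

```javascript
// The two digit strings from the PHP session above.
const v1 = '479014103257633139480';
const v2 = '479014103257633139481';

// Compared as strings, they differ in the last character.
const equalAsStrings = v1 === v2;

// Converted to double-precision floats, both round to the same value:
// beyond 2^53 a double can no longer represent every integer.
const equalAsFloats = Number(v1) === Number(v2);

// Arbitrary-precision integers keep every digit.
const equalAsBigInts = BigInt(v1) === BigInt(v2);

console.log(equalAsStrings);  // false
console.log(equalAsFloats);   // true
console.log(equalAsBigInts);  // false
```

At this magnitude, neighboring doubles are tens of thousands apart, so a difference of one in the last digit simply vanishes.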

Wow. I mean, just … well… I dunno.

For what it’s worth, strcmp will do the right thing regardless of PHP version. But seriously. I mean. Why do I use this turdburger of a language?

Fri, 27 Nov 2015


— SjG @ 1:48 pm

(This is a post from the end of September. I didn’t finish writing it then, but recent events made me revisit it.)

I just finished reading Camp and Community: Manzanar and the Owens Valley, an oral history compiled in the mid 1970s by Jessie A. Garrett and Ronald C. Larson. Unlike many of the oral histories of Manzanar, these interviews are not of internees. Rather, this is a collection of interviews of twenty some odd people who lived and worked in the area. Some of them worked at the camp itself (including one director of the camp), while some had no connection to it at all.

It’s a fascinating read. Not unexpectedly, people often contradict one another and the memories are rife with inconsistencies, but it paints a picture of a small, relatively isolated community being confronted with substantial change and influx of outsiders (both within the camp and with the outside personnel the camp required). The change was an economic boon in a lean time, and it brought outside attention to the area. Both of these factors affected the attitudes of the community.

There is a strong impression that some people’s feelings changed in the twenty-five to thirty years between when the events took place and the interviews occurred.

Among the people whose opinions changed against the internment, there were all of the expected explanations: it wasn’t actually so bad, some of the internees came voluntarily, it was for their own protection, the internment was a fait accompli and there was nothing to be done, there were legitimate mutual threats against America and Japanese Americans so this was sadly necessary, and so on. Among the people who supported the internment then and now, the arguments were also the expected ones: it was war, these were people of suspect loyalty, internees were treated better than the Japanese would treat Americans, to do otherwise would be to invite disaster.

One theme, as valid today as any time, is that fear is easily stirred up and manipulated to make people do things they would ordinarily oppose. Several of the interviewed people reflected on the fact that American citizens were unconstitutionally stripped of their rights, but excused it because there was a foreign threat to the country. It was also clear that the sense of “otherness” was key. Many of the people interviewed said they’d never seen (much less met) a person of Japanese descent before the establishment of the camp.

Another theme is essentially the William Goldman adage to “follow the money.” People like newspaperman Manchester Boddy helped establish the camps — and profited greatly on buying up the property of Japanese-Americans at firesale prices when they had twenty-four hours to liquidate their belongings before being shipped out.

Some of the defenses of the creation of Manzanar are true. People were afraid. We were at war. The imperial Japanese army was terrible and cruel to captured peoples. And yet, even if true, these are irrelevant. If our rights as Americans are subject to revocation when we’re afraid, then they’re not rights. If our answer to enemy cruelty is cruelty, then we’re no different than our enemy. If we can strip citizens of their freedom and property just because they look different than the majority, then we descend into mob rule and our lofty appeals to our ideals are just so much hot air.
