Thursday, April 28, 2016

The Data Hoarders

There seems to be something in the human psyche that makes us want to hoard data, information, gossip, facts.

Bureaucracies invariably seem to become infested with a certain breed of pervert who believes that, if only The State had sufficiently deep Files on Everyone, then some sort of utopia would be achieved. These are the same perverts who are pretty sure there ought to be cameras recording everything everywhere.

As mentioned earlier on this blog, there's a constant siren call, echoing like a particularly obnoxious form of tinnitus, in the mental ears of technologists of a certain stripe. These people feel that it is inevitable that everyone will have wearable recording devices and will record and broadcast constantly. This will lead to a utopia. A different utopia, apparently, than the one envisioned by the creepy bureaucrats. To me they look pretty much the same, but that's just me.

I went to grad school at Wesleyan University, in Connecticut. Every year around thesis time, as the seniors started to write their little papers that would get them, I think, a With Honors designation, the library shelves were stripped. Seniors would check out 50 books, 100 books. More academic tomes than they had even touched, let alone read, in their 4 or 5 years at the school. Somehow, possession of all this human knowledge made them feel like they knew something. They would ostentatiously surround themselves, in public study areas, with literal ramparts of books they would never read. It was touching, in a way.

Photographers have the same bad habit. Shoot, shoot, shoot. Pile up the exposures. Your first 10,000 pictures are your worst. Lightroom! I must organize my archive! Ha ha, my keeper rate is 0.01%! Well, MY keeper rate is 0.001%, making me ten times the photographer you are!

The trouble with all of these setups is that a mass of data, of information, becomes more valuable as it grows, but only up to a point. After that, as it keeps growing, the value drops off quite sharply, simply because you can't find anything or, worse, because you find too much stuff.

The great fallacy of data mining and big data is that with enough data we can find better answers. The reality is that with enough data piled up, you can find any answers at all, right or wrong.

Suppose you've got 100,000 pictures in your Lightroom archive. To flip through them at 1 per second, you'd need nearly 28 hours, after which your head would catch fire. If you instead glance at thumbnails, 40 at a shot, taking perhaps 4 or 5 seconds per sheet, you get that down to about 3 hours of concentration. After which your head would catch fire.
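
For anyone who wants to check the rough arithmetic above, here is a minimal sketch in Python. The function name `review_hours` and its parameters are made up for illustration, and the only inputs are the figures already given: 100,000 pictures, 1 second per picture, or 40 thumbnails per sheet at 4 to 5 seconds per sheet.

```python
def review_hours(num_pictures, pictures_per_view, seconds_per_view):
    """Hours needed to look at every picture once (hypothetical helper)."""
    # Ceiling division: a partly filled last sheet still takes a full glance.
    views = -(-num_pictures // pictures_per_view)
    return views * seconds_per_view / 3600


ARCHIVE = 100_000  # archive size used in the post

one_by_one = review_hours(ARCHIVE, 1, 1)    # one picture per second
sheets_fast = review_hours(ARCHIVE, 40, 4)  # 40 thumbnails, 4 s per sheet
sheets_slow = review_hours(ARCHIVE, 40, 5)  # 40 thumbnails, 5 s per sheet

print(f"one at a time:    {one_by_one:.1f} hours")
print(f"thumbnail sheets: {sheets_fast:.1f} to {sheets_slow:.1f} hours")
```

Plug in your own archive size and your own patience per sheet; the numbers scale linearly, which is rather the point.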

Given our marvelous visual memory, you don't have to sift the whole thing, of course. 100,000 pictures is probably manageable in this day and age, but just barely. Of course, our marvelous visual memory works by eliding a lot of stuff, so if you're relying on that you've actually got far fewer than 100,000 pictures. You've got whatever subset your brain can dredge up, which isn't all of them.

In this era of digital cameras, lots of people are just getting started at 100,000 pictures. They really feel like they're getting somewhere, and all too often, they are not. They're just piling up more and more of the same uninteresting pictures, and getting increasingly finicky about how they select the "good" uninteresting pictures out of their vast glacier of dross.

Travel light. Shoot to a specific goal, and move on. If you happen to find something you love in your archives, that's wonderful too. But the stuff you shot 2 years ago is gone, except for the handful of shots that still stick with you. You may have shot 50,000 pictures in 2014, the year you got your Canon 5D, but you've only got about 10.

And that is OK. 10 is great.

2 comments:

  1. http://www.dailymail.co.uk/news/article-3506304/Web-terror-Paris-Brussels.html

    http://abcnews.go.com/WNT/story?id=129563

    http://gawker.com/woolwich-attackers-were-known-to-mi5-were-not-consider-509683415

    "The great fallacy of data mining and big data is that with enough data we can find better answers. The reality is that with enough data piled up, you can find any answers at all, right or wrong."

    [I think that should be "can't"].

    Sure, record all our calls, keep all our emails, snoop on everything you can as a government agency. But you are making your own work harder. Terrorists are always "known to the authorities" because everyone is f*****g known to the authorities.

  2. One should not forget those photographers who take pictures as an intense way of seeing. The camera is a prop to focus their attention, and the goal is not so much the resulting pictures but rather the state of mind during the activity. One probably accumulates lots of pictures this way, but the approach is perfectly legitimate (and enjoyable, too).

    Best, Thomas
