Back in April, in writing about Union Square Ventures' Hacking Society event, I discussed the importance of measuring the unmeasurable, noting that we all too often seem to evaluate information-era economics using industrial-era metrics. That's a problem. Nick Grossman, who organized that Hacking Society session, has a great post discussing the same concept and highlighting Tim O'Reilly's discussion of the clothesline paradox -- an idea that apparently comes from a discussion in the early 1970s -- which captures how metrics can mislead. You can think of the clothesline paradox like this:
If you take down your clothes line and buy an electric clothes dryer the electric consumption of the nation rises slightly. If you go in the other direction and remove the electric clothes dryer and install a clothesline the consumption of electricity drops slightly, but there is no credit given anywhere on the charts and graphs to solar energy which is now drying the clothes.
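The paradox is easy to sketch as a toy accounting model. A minimal sketch in Python, with all figures hypothetical: the useful work (clothes dried) is identical either way, but only the metered grid consumption ever shows up on the charts.

```python
# Toy accounting model of the clothesline paradox (all figures hypothetical).
# Switching from an electric dryer to a clothesline lowers metered grid
# consumption, but the solar energy now doing the drying is recorded nowhere.

DRYER_KWH_PER_LOAD = 3.0  # hypothetical energy needed to dry one load


def metered_consumption(loads: int, use_dryer: bool) -> float:
    """Grid electricity the utility's charts will actually record."""
    return loads * DRYER_KWH_PER_LOAD if use_dryer else 0.0


def solar_contribution(loads: int, use_dryer: bool) -> float:
    """Energy the sun contributes -- absent from every official metric."""
    return 0.0 if use_dryer else loads * DRYER_KWH_PER_LOAD


loads = 10
for use_dryer in (True, False):
    grid = metered_consumption(loads, use_dryer)
    sun = solar_contribution(loads, use_dryer)
    # Total energy doing useful work is constant; only the measured share moves.
    print(f"dryer={use_dryer}: metered={grid} kWh, unmeasured solar={sun} kWh")
```

The point of the sketch is that `grid + sun` is constant across both scenarios, yet any analysis built only on the metered column concludes that something was lost.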
In my mind, there are two "problems" associated with this, and while there is some interest in attacking the first one, the second is often ignored. The first problem is that important information gets measured with the wrong metrics. We see this all the time in the internet era. People talk about "the collapse" of the music industry, but miss the fact that more music has been produced, recorded and released in the last decade than in any previous decade. In fact, some of the evidence suggests more music was produced and recorded in the last decade than in all other decades combined. Of course, that's an example of a metric that can be determined, but not all such metrics are that easy to pin down. For example, we talked about how Craigslist almost certainly contributed to the challenge many newspapers are facing, because it undercut the cash cow that supported many of them: the classified advertising business. If you used traditional metrics, you'd bizarrely and incorrectly conclude that Craigslist somehow "destroyed" value. But that's because no one takes into account all the value that Craigslist created, not for itself, but for its users. How do you measure the fact that I can now find someone to take my old couch away for free? There's value in that transaction, but no one "measures" it. What about the fact that I can more efficiently rent out an apartment, without having to pay the local newspaper? Again, there's value, but it's not properly measured.
The second problem is a little trickier to understand: once we have things that we can measure, we instinctively gravitate towards using those metrics, even if they're the wrong metrics!
I was thinking about this as I read Paul Graham's excellent thoughts on "black swan farming," which is all about the counterintuitive process involved in funding startups. There are a ton of tremendously thought-provoking lines in that piece, but I'm going to concentrate on one, which was really more of an aside, unrelated to the larger article (which you should go read), because it helped clarify my thinking on this point. Graham talks about deliberately not measuring how many of the Y Combinator companies he funds and trains go on to raise more money after their initial fundraising efforts, noting:
I deliberately avoid calculating that number, because if you start measuring something you start optimizing it, and I know it's the wrong thing to optimize.
And here's where the problem of using the wrong metrics becomes compounded. Even if you know something is the wrong metric, merely having the number almost forces you to optimize for it. So rather than looking at, say, what's best for the overall culture of music, we look at "revenue for the record labels" and decide we need to "fix" that. Or we treat the number of patents issued as a proxy for "innovation," and the focus becomes solely on increasing that number, rather than on actually maximizing innovation.
When you have the wrong metrics, not only do you have bad or incomplete information, but even when you know that, it's almost impossible not to optimize for those metrics, because you don't have anything else to work towards.
There is a lot of new interest in quantifying all sorts of data -- and one benefit of the information age is that it helps create new data that can be quantified. But not all quantified data is actually useful, and unfortunately, we often get so focused on the fact that we have a number that we ignore the possibility that the number isn't telling us anything useful.
I was recently reminded of Shelby Bonnie's opinion piece from three years ago about why we need to kill the CPM as a metric for advertising (for those who don't know, CPM -- "cost per mille," or cost per thousand impressions -- is how most banner ads are sold). He noted, quite accurately, that even those with the best of intentions to get away from "CPM-based" advertising seem to end up there in the end anyway. Because we have that number. And it becomes what people optimize around, just because it's there.
All campaigns start with the best of intentions: “let’s do something creative, engaging, and unique!” But unless someone really senior from the agency or client side intervenes, the road for a campaign always leads to the media buyer and the dreaded spreadsheet, where the two most important columns are impressions and cost. Ironically, there’s usually some good stuff in campaigns, but they are thrown in for free as “value adds.” At some point, publishers decide that if all clients care about is impressions, then OK, we’ll give them impressions. The output is an industry that overproduces shallow, superficial, commoditized impressions. Why do we have so many bad sites that republish the same junky content–content that’s often made by machines or $1-per-post contractors? Why do sites intentionally try to get us to turn lots of pages with tons of top 10 lists, photo galleries, or single-paragraph summaries of someone else’s story?
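Part of the gravitational pull of that dreaded spreadsheet is that the underlying math is trivially simple. A minimal sketch, with hypothetical rates chosen purely for illustration:

```python
# CPM pricing: "cost per mille," i.e. the price of 1,000 ad impressions.
# The media buyer's spreadsheet reduces a whole campaign to these two columns.

def campaign_cost(impressions: int, cpm_dollars: float) -> float:
    """Total spend for a buy priced at a given CPM rate."""
    return impressions / 1000 * cpm_dollars


# Hypothetical buy: 2 million impressions at a $5 CPM.
print(campaign_cost(2_000_000, 5.0))  # -> 10000.0
```

Everything that resists being expressed as one of those two numbers -- engagement, creativity, brand effect -- drops out of the comparison entirely, which is exactly the dynamic Bonnie describes.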
The more time I spend thinking about these issues, the more I think these combined problems -- not having the right data, and then optimizing for the wrong data -- are at the heart of many of the issues we regularly discuss around here. Figuring out ways to get beyond that, to find the right data and to break our habits of relying on bad data, is going to be increasingly important.