<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
xmlns:wfw="http://wellformedweb.org/CommentAPI/">
<channel>
<title>Techdirt. Stories filed under &quot;probability&quot;</title>
<description>Easily digestible tech news...</description>
<link>http://www.techdirt.com/</link>
<language>en-us</language>
<image><title>Techdirt. Stories filed under &quot;probability&quot;</title><url>http://www.techdirt.com/images/td-88x31.gif</url><link>http://www.techdirt.com/</link></image>
<item>
<pubDate>Wed, 7 Nov 2012 12:42:00 PST</pubDate>
<title>Why The Press Is Getting The Wrong Message Out Of The 'Nate Silver Walloped The Pundits' Story</title>
<dc:creator>Mike Masnick</dc:creator>
<link>http://www.techdirt.com/articles/20121107/07473420959/why-press-is-getting-wrong-message-out-nate-silver-walloped-pundits-story.shtml</link>
<guid>http://www.techdirt.com/articles/20121107/07473420959/why-press-is-getting-wrong-message-out-nate-silver-walloped-pundits-story.shtml</guid>
<description><![CDATA[ Let me start off by saying that I've been a longtime Nate Silver fan, back before he was the "fivethirtyeight" guy, and when he was just some random guy whose statistical models were helping my fantasy baseball team kick ass.  And let me follow that up by noting that even more than being a Nate Silver fan, I'm a huge fan of statistics in general.  I think that statistics should be a <i>required</i> class in school and that a combination of statistics and economics (the two go hand in hand) literacy (or lack thereof) is a major problem today, leading to numerous bad policy decisions.  Finally, I've never been a fan (at all) of political punditry that focuses on the "horse race" aspect of politics.  So, given all that, it has certainly been fun to follow the secondary storyline from last night -- which is how Nate Silver and his statistical genius <a href="http://www.businessweek.com/news/2012-11-07/nate-silver-led-statistics-men-crush-pundits-in-election" target="_blank">"crushed" the pundits</a> in predicting the election -- to the point that every single major press "pundit" was <a href="http://www.theatlanticwire.com/politics/2012/11/grading-pundit-predictions/58768/" target="_blank">flat out wrong</a>, and it looked like Silver had a perfect crystal ball.  And, given how much Silver was attacked for being a "stats guy," (or for being biased, rather than neutral) you can certainly understand why it's tempting to wish he'd do something like Whitney McNamara's <a href="http://tumblr.absono.us/post/35203726587" target="_blank">mock blog post</a>:
<center>
<a href="http://imgur.com/x6UJj"><img src="http://i.imgur.com/x6UJj.png" width=560 /></a>
</center>
In many ways, I agree that yesterday was the <a href="http://www.newyorker.com/online/blogs/newsdesk/2012/11/our-money-ball-election.html" target="_blank">"moneyball moment"</a> in politics, in which the prognosticators were shown to be faulty, while the number crunchers were shown to be accurate.  Hell, it was a much stronger example than the Moneyball case in baseball, which never had a "victory" quite as clearly aligned with the numbers.
<br /><br />
Of course, if you look at what's happened to baseball since "Moneyball" and the success of the first statistical analysis guys, it should be a reminder that statistical prognostication is still about the <i>probabilities</i> -- and not about true <i>predictions</i>.  And this is where the "suddenly-in-awe" pundits are still getting confused.  They seem to think that Silver or other statistical modelers suddenly have a magic crystal ball with which they can predict the future.  But probabilities and predictions are different, and Silver himself would likely admit (and, actually, <a href="http://www.onthemedia.org/2012/nov/02/forecasting-tuesday/?utm_source=local&#038;utm_media=treatment&#038;utm_campaign=daMost&#038;utm_content=damostviewed" target="_blank">did admit</a>) that when you're dealing in probabilities, you're still going to be completely wrong some percentage of the time (he can even tell you <i>what</i> percentage of the time!).  Even if the probabilities show a 90% likelihood that a certain event will happen, it still means that one time out of 10, you're going to be wrong.
<br /><br />
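The "one time out of 10" arithmetic compounds quickly once a forecaster makes more than one call.  Here is a minimal sketch of that, assuming the calls are independent and perfectly calibrated (both simplifications that real election forecasts do not satisfy); the function name is hypothetical and chosen only for illustration.
<pre>
```python
def chance_of_at_least_one_miss(p_correct, n_calls):
    """Probability that at least one of n independent calls turns out wrong.

    Assumes each call is right with probability p_correct and that the
    calls are independent of one another (an illustrative assumption).
    """
    return 1.0 - p_correct ** n_calls


# A single 90% call versus ten separate 90% calls.
one_call = chance_of_at_least_one_miss(0.9, 1)
ten_calls = chance_of_at_least_one_miss(0.9, 10)
print(f"one call: {one_call:.0%}, ten calls: {ten_calls:.0%}")
```
</pre>
Under those assumptions, a single 90% call misses about 10% of the time, while at least one miss somewhere in ten such calls is more likely than not (roughly 65%).
<br /><br />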
Unfortunately, our brains don't deal that well with probabilities.  We don't think in probabilities.  Because we're dealing with a (mostly) binary situation, we assume that as soon as the probabilities tilt in our favor, it means that a "win" is somehow assured, and mentally, the probabilities turn into a prediction.  It's very, very difficult for our brains not to think that way.
<br /><br />
So I'm thrilled to see statistical analysis "win" over the moronic pundit-class who thinks that "storylines" or "momentum" (or, um, the ultimate in believing in anecdotes over data, <a href="http://blogs.wsj.com/peggynoonan/2012/11/05/monday-morning/" target="_blank">"my friends see more yard signs" for one candidate</a>) are valid methods for prognosticating.  But it seems that the press, by going on to insist that Silver and his ilk are the new magic prognosticators, is missing the point just as much as those who thought the election could be predicted by political pundits.
<br /><br />
Statistics is a tool for highlighting the probabilities.  I'm sure that Nate Silver clones are going to be appearing a lot more on TV during the next major election cycles -- and I think that's a step forward.  But now it seems like some people are expecting Silver and other stats guys to be right every time.  And that's going to lead to backlash, just as the "failure" of Moneyball-type analysis to always get it exactly right resulted in some backlash in baseball.  There will be data analysis in future election cycles -- likely from Silver himself -- that is wrong.  That's the nature of probabilities.  It will happen.  And, unfortunately, people will then suddenly go back to arguing the opposite: that the stats geeks were "wrong."
<br /><br />
But, as they say in the stats world, these are small sample size issues.  Believing that statistical analysis is a perfect tool for predictions based on a <i>single</i> election is almost (though not quite) as weak as some of the traditional political punditry methods for predictions.
<br /><br />
Hopefully, as with baseball, after a few years, the whole idea that these are entirely separate worlds will melt away.  In baseball, every team now uses detailed statistical analysis as <i>a tool</i>, and most seem to understand that it suggests probabilities that help them find underexploited opportunities.  But no one relies on it as a crystal ball that predicts the absolute future.  Hopefully we'll reach that same sort of equilibrium in political analysis as well.<br /><br /><a href="http://www.techdirt.com/articles/20121107/07473420959/why-press-is-getting-wrong-message-out-nate-silver-walloped-pundits-story.shtml">Permalink</a> | <a href="http://www.techdirt.com/articles/20121107/07473420959/why-press-is-getting-wrong-message-out-nate-silver-walloped-pundits-story.shtml#comments">Comments</a> | <a href="http://www.techdirt.com/articles/20121107/07473420959/why-press-is-getting-wrong-message-out-nate-silver-walloped-pundits-story.shtml?op=sharethis">Email This Story</a><br />
 ]]></description>
<slash:department>small-sample-sizes</slash:department>
<wfw:commentRss>http://www.techdirt.com/comment_rss.php?sid=20121107/07473420959</wfw:commentRss>
</item>
<item>
<pubDate>Thu, 29 Mar 2012 17:00:00 PDT</pubDate>
<title>DailyDirt: Taxes On The Mathematically Challenged</title>
<dc:creator>Michael Ho</dc:creator>
<link>http://www.techdirt.com/articles/20100408/1323598942/dailydirt-taxes-mathematically-challenged.shtml</link>
<guid>http://www.techdirt.com/articles/20100408/1323598942/dailydirt-taxes-mathematically-challenged.shtml</guid>
<description><![CDATA[ Every so often, a lottery jackpot reaches such an insane amount that everyone starts to wonder if it's worthwhile to throw down a couple bucks on a ticket. If you haven't heard yet, it's that time again, and the Mega Millions multi-state lottery drawing could hand a lucky winner over half a billion dollars. Here are just a few reality checks if you're thinking about playing.

<ul>
<li> <a title="http://blogs.wsj.com/numbersguy/lottery-math-101-801/" href="http://on.wsj.com/HtsOHP">There are some interesting coincidences in the history of lottery drawings. For example, in one Bulgarian lottery, the same numbers were chosen twice in the same week.</a> Lightning actually does strike twice... [<a href="http://blogs.wsj.com/numbersguy/lottery-math-101-801/">url</a>]</li>

<li> <a title="http://www.slate.com/articles/life/do_the_math/2001/08/is_powerball_a_mugs_game.single.html" href="http://slate.me/HnA60m">If you want to see some math on expected values of lottery tickets and what the odds are for someone to win a given lottery, check out this advice from a mathematician.</a> "<i>If you play Powerball every day, stop playing Powerball every day.</i>" [<a href="http://www.slate.com/articles/life/do_the_math/2001/08/is_powerball_a_mugs_game.single.html">url</a>]</li>

<li> <a title="http://sbronars.wordpress.com/2012/03/28/why-a-mega-millions-ticket-is-a-good-bet/" href="http://bit.ly/H4vyYq">It's likely that there will be multiple winners (2.5 according to the math), but even so, the expected value of a Mega Millions ticket ($1.23) is greater than the cost of the ticket.</a> Still, the probability of zero winners is about 10%. [<a href="http://sbronars.wordpress.com/2012/03/28/why-a-mega-millions-ticket-is-a-good-bet/">url</a>]</li>

</ul> 
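The "2.5 expected winners" and "about 10% chance of zero winners" figures in the last item can be sanity-checked with a rough sketch.  This assumes the number of winning tickets follows a Poisson distribution, a common approximation when many tickets are sold and each has a tiny chance of winning; it is not necessarily the method used in the linked analysis, and the function name is hypothetical.
<pre>
```python
import math


def prob_no_winner(expected_winners):
    """Poisson probability of zero winners, given the expected number of winners.

    Assumes the winner count is Poisson-distributed (an approximation).
    """
    return math.exp(-expected_winners)


# 2.5 is the expected number of winners quoted in the list above.
p_zero = prob_no_winner(2.5)
print(f"chance nobody wins: {p_zero:.1%}")
```
</pre>
Under that approximation the chance of no winner comes out at roughly 8%, in the same ballpark as the "about 10%" quoted above; the gap presumably reflects different assumptions in the linked post.
<br /><br />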

As always, StumbleUpon can also recommend some good <a title="http://www.stumbleupon.com/to/stumble/stumblethru:www.techdirt.com" href="http://bit.ly/fagV8c">Techdirt</a> articles, too.<br /><br /><a href="http://www.techdirt.com/articles/20100408/1323598942/dailydirt-taxes-mathematically-challenged.shtml">Permalink</a> | <a href="http://www.techdirt.com/articles/20100408/1323598942/dailydirt-taxes-mathematically-challenged.shtml#comments">Comments</a> | <a href="http://www.techdirt.com/articles/20100408/1323598942/dailydirt-taxes-mathematically-challenged.shtml?op=sharethis">Email This Story</a><br />
 ]]></description>
<slash:department>urls-we-dig-up</slash:department>
<wfw:commentRss>http://www.techdirt.com/comment_rss.php?sid=20100408/1323598942</wfw:commentRss>
</item>
<item>
<pubDate>Thu, 20 May 2010 13:55:00 PDT</pubDate>
<title>The Mathematics Of Proving (Or Disproving) Identity Fraud</title>
<dc:creator>Mike Masnick</dc:creator>
<link>http://www.techdirt.com/articles/20100520/0032159501.shtml</link>
<guid>http://www.techdirt.com/articles/20100520/0032159501.shtml</guid>
<description><![CDATA[ Here's a fun one by Thomas O'Toole, looking into a lawsuit by the US gov't <a href="http://pblog.bna.com/techlaw/2010/05/improbable-argument-lifts-katrina-scammers-identity-theft-conviction.html" target="_blank">against a guy who committed identity fraud to apply for emergency disaster relief after Hurricane Katrina</a>.  Basically, the entire case hinged on a bit of probability.  The guy had applied for aid using 15 different social security numbers on 15 different applications.  Here's the thing: the law he was charged under says that it's a crime to "knowingly" make use of someone else's identity.  In other words, it's only identity fraud if the guy knew he was using someone else's SSN.  If he just made up the numbers, and they all turned out to be legit <i>by luck</i>, then he could say he did not knowingly commit fraud on the people those SSNs actually belonged to.  So, here's where the probability part comes in.  As O'Toole notes, if you just take a guess, you actually have about a 50% chance of getting an actual SSN (which doesn't seem like a very good system).  But to get 15 correct guesses in a row?  Well, simplifying things a bit, the probability of guessing right 15 times in a row is about 0.00003.
<br /><br />
So, the government argued, there was a 99.997% chance that the guy, Gregory Parks, must have known that the SSNs he was using came from real people, and thus, he was guilty of knowingly using their SSNs, against the law.  But Parks and his lawyers went a little deeper, and pointed out that the original calculation was wrong, in that it way over-simplified things:
<blockquote><i>
The first three digits of a social security number are known as "area numbers." These numbers correlate to states. All of the numbers Parks used had Texas or Louisiana area numbers. Except for two: one had an Oklahoma area number and the other a Michigan area number. Area codes are published on the SSA website.
<br /><br />
The SSA also publishes on its website information indicating the extent to which the second pair of digits in a social security number -- the "group number" -- have been assigned. In Parks' case, this information indicated that, for the 13 social security numbers he used in the Texas and Louisiana area codes, the two-digit "group number" was 99, meaning that nearly all of those numbers had been assigned. Louisiana and Texas were the areas hardest hit by Hurricane Katrina.
<br /><br />
The group numbers for the two other area numbers used by Parks indicated that the social security numbers for those areas were not assigned to such an extent. For area number 446 (Oklahoma), the group number was 19 (out of a possible 99); for area number 372 (Michigan), the group number was 31 (again, out of 99).
<br /><br />
All of this extra information dramatically increased Parks' odds of randomly guessing valid social security numbers. According to the court, the new math looked like this:
<br /><br />
    1 * 1 * 1 * 1 * 1 * 1 * 1 * 1 * 1 * 1 * 1 * 1 * 1 * 0.59 * 0.65 = .38
<br /><br />
Thus, with a little knowledge about how the SSA doles out social security numbers, Parks had a 38 percent chance of "randomly" choosing 15 valid social security numbers.
<br /><br />
According to the court's math. And that was the math that counted here. The court ruled that the high odds of making 15 educated guesses about social security numbers was sufficient to vacate Parks' conviction
</i></blockquote>
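Both calculations in this story are the same operation: multiply the per-guess probabilities together.  A minimal sketch, assuming the 15 guesses are independent (which is what both the government's figure and the court's figure implicitly assume); the function name is hypothetical, and the per-guess values are the ones quoted above.
<pre>
```python
import math


def prob_all_valid(per_guess_probs):
    """Probability that every guess hits a valid SSN, assuming independent guesses."""
    return math.prod(per_guess_probs)


# Government's simplified version: 15 blind guesses at about 50% each.
naive = prob_all_valid([0.5] * 15)

# Court's version: 13 near-certain guesses plus the Oklahoma and Michigan numbers.
informed = prob_all_valid([1.0] * 13 + [0.59, 0.65])

print(f"blind guessing: {naive:.6f}, educated guessing: {informed:.2f}")
```
</pre>
The blind-guess product is 0.5 to the 15th power, or about 1 in 32,768 (roughly 0.00003), which is where the 99.997% figure comes from; the court's product is dominated entirely by the two numbers outside Texas and Louisiana.
<br /><br />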
While amusing, this does raise a few points.  First of all, it highlights how ridiculous it is to use Social Security Numbers as identifiers, given just how easy it is to guess legit SSNs.  Second, it makes you wonder why the law dealing with identity fraud cares one way or another if the fake SSN was used "knowingly" or not.  The guy still was guilty of mail fraud -- so it's not like he gets off completely free.  But does it make sense that the laws on identity fraud only apply if you know that the SSN you're using is someone else's, but don't apply if you just make it up?<br /><br /><a href="http://www.techdirt.com/articles/20100520/0032159501.shtml">Permalink</a> | <a href="http://www.techdirt.com/articles/20100520/0032159501.shtml#comments">Comments</a> | <a href="http://www.techdirt.com/articles/20100520/0032159501.shtml?op=sharethis">Email This Story</a><br />
 ]]></description>
<slash:department>brush-up-on-your-probability</slash:department>
<wfw:commentRss>http://www.techdirt.com/comment_rss.php?sid=20100520/0032159501</wfw:commentRss>
</item>
<item>
<pubDate>Mon, 20 Jul 2009 15:25:00 PDT</pubDate>
<title>Can The Lottery Make People Save More?</title>
<dc:creator>Mike Masnick</dc:creator>
<link>http://www.techdirt.com/articles/20090719/0140115590.shtml</link>
<guid>http://www.techdirt.com/articles/20090719/0140115590.shtml</guid>
<description><![CDATA[ The lottery has often been described as a "tax on those who don't understand probability."  However, it seems some enterprising folks are trying to use that basic fact to help people who have trouble saving money (who often overlap with the folks who don't understand probability) to save more.  Apparently some credit unions in Michigan are experimenting with <a href="http://online.wsj.com/article/SB124786612839159989.html" target="_new">a lottery feature as a part of a savings account</a>:
<blockquote><i>
Psychologists have long known that people tend to overestimate the odds of rare events. Applying that behavioral insight, finance professor Peter Tufano of Harvard Business School has devised a clever program called "Save to Win." Launched earlier this year for members of eight credit unions in Michigan, it is a cross between a certificate of deposit and a raffle ticket. Members who put $25 or more into a Save to Win one-year CD are entered into a monthly "savings raffle" for prizes up to $400, plus one annual drawing for a $100,000 jackpot.
</i></blockquote>
Apparently, this program has attracted $3.1 million in new deposits, many (the article claims) from people who have never been able to save much money.  In many ways it is like buying a lottery ticket, except that you don't lose the money paid for the ticket.  The credit unions make this work by paying out a slightly lower interest rate on the CD in question, but the net effect works out to benefit everyone.  Many who put their money into such an account would never have put their money into a higher rate CD in the first place.  In some ways, it's a neat example of efficient price discrimination that expands an overall market.<br /><br /><a href="http://www.techdirt.com/articles/20090719/0140115590.shtml">Permalink</a> | <a href="http://www.techdirt.com/articles/20090719/0140115590.shtml#comments">Comments</a> | <a href="http://www.techdirt.com/articles/20090719/0140115590.shtml?op=sharethis">Email This Story</a><br />
 ]]></description>
<slash:department>tax-on-the-poor</slash:department>
<wfw:commentRss>http://www.techdirt.com/comment_rss.php?sid=20090719/0140115590</wfw:commentRss>
</item>
</channel>
</rss>