How Much Would It Cost To Store All US Phone Calls Made In A Year?

from the cheaper-than-you-think dept

An early criticism of Snowden's leak about NSA spying activity was that the $20 million annual cost for PRISM -- whatever that turns out to be -- was simply too low to be credible. One person who knows more about storage costs than practically anyone -- well, outside the NSA, at least -- is Brewster Kahle, who set up the Internet Archive, essentially a backup for the entire Web plus a wonderfully rich store of many other materials. He's carried out a fascinating back-of-the-envelope calculation of how much it would cost annually to record every phone call made in the US and store it in the cloud:

These estimates show only $27M in capital cost, and $2M in electricity and take less than 5,000 square feet of space to store and process all US phonecalls made in a year. The NSA seems to be spending $1.7 billion on a 100k square foot datacenter that could easily handle this and much much more. Therefore, money and technology would not hold back such a project -- it would be held back if someone did not have the opportunity or will.
Kahle has made the calculation available as a shared document (on Google, appropriately enough), so you can inspect his assumptions there and play around with the numbers. It's also worth reading through the comments to his short post, since they make some interesting points. However, even if the numbers are off by a factor of two, there's no doubt about the feasibility of recording all US phone calls.
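
To get a feel for the kind of arithmetic involved, here is a minimal sketch of a storage-volume estimate. It is not Kahle's spreadsheet: the subscriber count and codec bitrate are illustrative assumptions, and only the 638 minutes per subscriber per month comes from a source cited in the reader comments (Odlyzko, for wireless voice in 2010).

```python
def annual_call_storage_petabytes(subscribers, minutes_per_subscriber_per_month,
                                  codec_bits_per_second):
    """Rough yearly storage, in petabytes, for recorded voice minutes."""
    minutes_per_year = subscribers * minutes_per_subscriber_per_month * 12
    # bits/s -> bytes/s -> bytes per minute of audio
    bytes_per_minute = codec_bits_per_second / 8 * 60
    return minutes_per_year * bytes_per_minute / 1e15


# ASSUMPTION: round subscriber count, chosen only for illustration.
ASSUMED_SUBSCRIBERS = 300_000_000
# From the Odlyzko citation in the reader comments (wireless, 2010).
MINUTES_PER_SUBSCRIBER_PER_MONTH = 638
# ASSUMPTION: a low-bitrate telephony codec at 8 kbit/s.
ASSUMED_CODEC_BITS_PER_SECOND = 8_000

estimate = annual_call_storage_petabytes(
    ASSUMED_SUBSCRIBERS,
    MINUTES_PER_SUBSCRIBER_PER_MONTH,
    ASSUMED_CODEC_BITS_PER_SECOND,
)
print(f"{estimate:.1f} PB per year")
```

Because every subscriber's minutes are counted separately, a call between two subscribers is counted twice, so under these assumptions the figure is closer to an upper bound than a best guess.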

And that's for sound files, which take up quite a lot of space. Text-based information pulled in from emails, Web pages and chat logs could be stored more compactly. That would make the routine recording of vast swathes of what those in the US -- and outside it -- do online not just plausible, but so cheap in comparison to the NSA's presumably large budget, that the latter might feel it would be crazy not to do so as a matter of course.

Follow me @glynmoody on Twitter or identi.ca, and on Google+



Reader Comments

  1.  
    Anonymous Coward, Jun 21st, 2013 @ 6:34pm

    Transit costs

    The big number left out of Brewster Kahle's calculation is the cost to transport the data from the point of acquisition to the long-term storage facility.

    There are, of course, two basic approaches:
      • Simultaneous transport to long-term storage
      • Delayed transport to long-term storage

    Those two different approaches will have different costs.

     

  2.  
    Anonymous Coward, Jun 21st, 2013 @ 6:38pm

    You constantly hear the GOP bitching about the budget. Funny, you don't hear them bitching about the military budget that went up 800% for the Iraq/Afghanistan wars. Nor do you hear them bitching about the cost of the worthless TSA who has never produced one terrorist for all the money they've spent. Nor do you hear them bitching about the cost of all the new facilities built for the NSA to house these phone calls. It isn't the cost of the storage space, it's the cost of the facility, the hardware, the whole budget process for it.

    Funny the problem should be paying out for food stamps in a tough economy.

    This whole thing is a fiasco that should be followed right down to the last penny, the last broken law, the last false interpretation of what the law says.

    It's looking more every day like you could change the symbol and color of the flag and be China or Russia. Sorry but this doesn't look like the country I grew up in at all.

     

  3.  
    Eponymous Coward, Jun 21st, 2013 @ 6:49pm

    Something else to consider:

    The NSA is probably using some very good compression on this data (possibly even some written specially for these applications), so the storage requirements (and thus the price, discounting the cost of developing the compression algorithm) may be drastically reduced.

     

  4.  
    Jon Renaut (profile), Jun 21st, 2013 @ 7:05pm

    This is government, remember

    There are a ton of hidden costs that most people won't see. The NSA may be different, but a certain government agency that I happen to work for has some ridiculous IT security policies that prevent anyone from using any technology that might be considered modern, efficient, or useful.

    The difference between "the cost to store and manage the data" and "the cost for a dozen Oracle licenses because we're inherently terrified of open source" is many millions of dollars.

     

  5.  
    Anonymous Coward, Jun 21st, 2013 @ 7:31pm

    Voice call minutes per month

    Andy Odlyzko says 638 wireless voice minutes/subscriber/month in 2010.   ( The volume and value of information, A. Odlyzko. International Journal of Communication, vol. 6, 2012. Table 2 on p. 929 (p.10 in PDF))

    Andy's deriving his numbers (by calculation) from:
    CTIA. (2011). Year-end 2010 top-line survey results.

    See Andy's footnote 9.

     

  6.  
    That One Guy (profile), Jun 21st, 2013 @ 7:52pm

    Re: Transit costs

    Indeed, now if there was only some system whereby you could transfer lots of data easily and cheaply from one source to another...

     

  7.  
    Anonymous Coward, Jun 21st, 2013 @ 8:07pm

    Re: Re: Transit costs

    Dude, this is engineering. Shut up and work. You got a cite for a better number than around $3.00/Mbps ?

    How about the number of central offices in the U.S.? That number would be handy right now.

    Or just go ahead and make comments with no sources and no numbers and no calculations and no thought.

     

  8.  
    Anonymous Coward, Jun 21st, 2013 @ 9:05pm

    Central office storage and transit utilization ratio

    Andrew Odlyzko, in another paper, says that, historically, utilization ratios for AT&T long distance switched voice lines were about 33%.

    Data networks are lightly utilized, and will stay that way, A. M. Odlyzko, Review of Network Economics, 2 (no. 3), September 2003, p. 218.
    Voice networks, such as that of AT&T, are engineered to provide a low-cost solution to all normal demands. This means that many calls may get blocked in cases of an earthquake, say, but even peak hour demands during the busiest days, such as Mother's Day or the Monday after Thanksgiving, are accommodated. For example, to cite a small sample of the data in Ash (1998), on Monday, Dec. 2, 1991, which was the busiest day for the AT&T network until then, of 157.5 million calls, only 228 were blocked on intercity connections. In spite of this, the average utilization of long distance links in the switched voice network is close to 33%, as is explained in Coffman et al. (1998), based on data from Ash (1998).


    It occurs to me that for an intercept network, even if it's desired to have most calls forwarded to the processing/storage facility in near real time, it may not make sense to engineer the transit network for peak load conditions.

    That is, if there's some storage provisioned at the call capture point, then even if most call captures are forwarded immediately, the intercept network can even out bursts during holidays, and other peak periods.

    In addition, a day or two of storage at the capture point would allow call captures to be retained even in the event of an outage in the intercept transit network.
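
    The buffering idea above can be sketched as a simple store-and-forward queue. This is only an illustration: the function name, the hourly traffic figures and the link capacity are made up, and the units are arbitrary.

```python
def peak_backlog(arrivals_per_interval, link_capacity_per_interval):
    """Largest amount of captured data waiting at the capture point.

    Each interval, new captures arrive and the transit link drains up to
    its capacity; whatever is left over waits in local storage.
    """
    backlog = 0
    peak = 0
    for arrived in arrivals_per_interval:
        backlog = max(0, backlog + arrived - link_capacity_per_interval)
        peak = max(peak, backlog)
    return peak


# Hypothetical traffic with a two-interval burst (e.g. a holiday peak).
hypothetical_arrivals = [1, 1, 5, 5, 1, 1]
# Link sized for the average load (18 / 6 = 3), not for the peak of 5.
needed_local_storage = peak_backlog(hypothetical_arrivals, 3)
print(needed_local_storage)
```

    The local storage needed is whatever the link cannot drain during the burst, so a link sized near the average load trades transit capacity for some disk at each capture point.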

     

  9.  
    Anonymous Coward, Jun 21st, 2013 @ 9:17pm

    Re: Re: Transit costs

    Try OpenGarden, which automagically creates a mesh network specially built with protocols used to handle the mammoth datasets that the Hadron Collider produces.

    https://en.wikipedia.org/wiki/Open_Garden

    That should do it.

     

  10.  
    Anonymous Coward, Jun 21st, 2013 @ 9:19pm

    Re: Re: Re: Transit costs

    How about $0.0003/Mbps?

     

  11.  
    Rekrul, Jun 21st, 2013 @ 9:20pm

    Don't forget that this is the government. With the way they waste money, they're probably paying about $100 per gigabyte.

     

  12.  
    Anonymous Coward, Jun 21st, 2013 @ 9:47pm

    Re: Re: Re: Re: Transit costs

    Sorry, $3/Mbps is per month. Didn't add the /mo, although it might have been evident from context.

    But I don't necessarily like the source I've got for that $3 number.

     

  13.  
    jimb (profile), Jun 21st, 2013 @ 11:21pm

    Re: Re: Transit costs

    NSA has lots of money, our money, and they're all true believers. As soon as they think they have a terrorist, they're going to want to go back in time as far as they can and listen to the contents of all the calls. They have to, to save us. After all, if they don't save America from terrorists, who will?! Safety before the Bill of Rights, that's the operating principle here. So, along with the storage, I am sure the NSA can afford to string all kinds of really really fast optical fibers from wherever to wherever... what's a few hundred million, or even a few billion when you're saving America from terrorists? Transporting large volumes of data is no problem when you can build data pipes out of billions of dollars.

     

  14.  
    Anonymous Coward, Jun 21st, 2013 @ 11:21pm

    Re: Re: Re: Re: Re: Transit costs

    Those are some crazy high costs compared to Sweden. 1000/1000 is ~ $90 atm :x

     

  15. This comment has been flagged by the community.
    horse with no name, Jun 22nd, 2013 @ 1:52am

    all of this is nice but

    There is no indication they are recording all phone calls, or even a very small percentage. They are recording metadata, not actual phone calls.

    Don't let reality get in the way of painting the current administration into a corner.

    (oh, and day 6 of getting my posts "moderated" rather than posted... censorship lives at Techdirt!)

     

  16.  
    Anonymouse, Jun 22nd, 2013 @ 2:17am

    storage is one thing...

    Maybe people are forgetting the computing power required to trawl through all this data. After all, you don't really think they are only capturing and storing this lot.

     

  17.  
    Anonymous Coward, Jun 22nd, 2013 @ 3:41am

    Re:

    The NSA has forgotten the faces of its fathers and should be sent West.

     

  18.  
    Anonymous Coward, Jun 22nd, 2013 @ 4:51am

    irrelevant! it will still be done anyway!

     

  19.  
    Anonymous Coward, Jun 22nd, 2013 @ 8:04am

    It would cost the US government very little, as all that information is already held on the servers of the phone companies, which keep it for their own purposes (billing, etc.).

    The phone companies also maintain specific servers to hold the information requested, and it would not be a very large amount of information to store. You can store a huge amount of data on a 2TB drive, and they are cheap.

     

  20.  
    Anonymous Coward, Jun 22nd, 2013 @ 8:18am

    Re: Re: Re: Re: Re: Re: Transit costs

    Those are some crazy high costs compared to Sweden.

    “IP transit price declines steepen”, TeleGeography, 2 Aug 2012
    Prices for wholesale IP transit service continue to decline throughout the world. According to new data from TeleGeography’s IP Transit Pricing Service…

    The median monthly lease price for a full GigE port in London dropped … to USD3.13 per Mbps.… In New York, the comparable price dropped … to USD3.50 per Mbps.… Pricing for short term promotions and high capacities has dropped below USD1.00 per Mbps per month.

    While prices have declined globally, significant geographic disparities persist.…
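
    To turn those per-Mbps prices into a monthly bill, a rough sketch follows. The yearly data volume is a placeholder chosen so the arithmetic is easy to check, not an estimate of real call traffic; the prices are the TeleGeography figures quoted above, with USD1.00 standing in for the "below USD1.00" promotional tier.

```python
SECONDS_PER_YEAR = 365 * 86_400


def average_mbps(bytes_per_year):
    """Average sustained rate needed to move a yearly volume, in Mbps."""
    return bytes_per_year * 8 / SECONDS_PER_YEAR / 1e6


def monthly_transit_cost(bytes_per_year, usd_per_mbps_per_month):
    """Monthly transit bill if the link is sized for the average rate only."""
    return average_mbps(bytes_per_year) * usd_per_mbps_per_month


# ASSUMPTION: placeholder volume; it works out to exactly 8 Mbps.
placeholder_bytes_per_year = 3.1536e13

for city, price in [("London", 3.13), ("New York", 3.50), ("promotional", 1.00)]:
    cost = monthly_transit_cost(placeholder_bytes_per_year, price)
    print(f"{city}: ${cost:,.2f} per month")
```

    Sizing for the average rate understates the bill for a link that has to carry peak-hour traffic in real time, so treat the result as a floor.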

     

  21.  
    Anonymous Coward, Jun 22nd, 2013 @ 9:23am

    Interesting estimates but completely irrelevant

    We're talking about people who've modified a nuclear submarine to tap undersea cables. Clearly, NO amount of money is an obstacle to ANYTHING that they want to do.

    So while this discussion is interesting from an academic point of view, it's completely meaningless to the real world. Presume, for all practical purposes, that the NSA has infinite money and reason accordingly.

     

  22.  
    Anonymous Coward, Jun 22nd, 2013 @ 3:44pm

    Re: Interesting estimates but completely irrelevant

    you kinda make it sound like they have infinite reason, which they obviously do not

     

  23.  
    McCrea (profile), Jun 22nd, 2013 @ 7:59pm

    Re:

    I didn't think the point was the cost. The point is that the NSA is lying about the cost.

     

  24.  
    Anonymous Coward, Jun 23rd, 2013 @ 5:01am

    Telco CO count

    The FCC's Wireline Competition Bureau (WCB) collects a variety of information from various segments of the voice telephony industry in the U.S.   The FCC WCB publishes a number of statistical reports summarizing that information. The latest copy of the Trends in Telephone Service Report that I've found on the FCC WCB website is from September 2010.

    According to that report, from Table 17.4 (p.17-8) (p.142 in PDF), there are 24,357 telco central offices in the United States. Note that this number may be an undercount, because the FCC WCB is relying on industry reports which may not be required from all telcos. In addition, it seems probable that this number is only for wireline central offices.

    Let's call it twenty-five thousand ( 25,000 ) in round numbers.

    Other tables in that report (table 17.1 and 17.2) give us an idea of the distribution of traffic across switches associated with those telco COs (CO switches and remote switches). Note though, that table 17.1 comes from numbers which are only reported to the FCC by RBOCs. And table 17.2 comes from numbers which are only reported by ILECs (including RBOCs).

     

    Acronyms:

    CO: Central Office
    FCC: Federal Communications Commission
    WCB: FCC Wireline Competition Bureau
    RBOC: Regional Bell Operating Company
    ILEC: Incumbent Local Exchange Carrier
    CLEC: Competitive Local Exchange Carrier

     

  25.  
    Anonymous Coward, Jun 24th, 2013 @ 3:32am

    Re: Re: Re: Transit costs

    what's a few hundred million, or even a few billion


    According to the FCC (p.214 in PDF):
    The cost of owning fiber ranges from US$40,000 to more than US$250,000 per mile, depending upon geography, soil characteristics and whether it is buried underground or strung overhead

    Consider just the 25,000 wireline central offices (neglecting the unknown number of mobile switch centers). Let's say those telco COs are on average 1,000 miles away from the Utah facility. That's probably low, considering the population and infrastructure density on the East coast. If the network is built as one big physical star, then that's 25 million miles of fiber (conservatively).

    If fiber is priced at the bargain-basement price of only $10,000/mile, then it comes to a quarter trillion dollars for a physical star network. That would make a dent in the federal budget.
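
    The star-network arithmetic above can be checked with a few lines. The function name is made up for illustration; the inputs are the figures from this comment and the FCC per-mile range quoted in it.

```python
def star_network_fiber_cost(central_offices, average_miles_to_hub, cost_per_mile):
    """Total fiber miles and cost for one dedicated run per central office."""
    total_miles = central_offices * average_miles_to_hub
    return total_miles, total_miles * cost_per_mile


# Bargain-basement price used above, then the FCC's quoted low and high ends.
for cost_per_mile in (10_000, 40_000, 250_000):
    miles, dollars = star_network_fiber_cost(25_000, 1_000, cost_per_mile)
    print(f"${cost_per_mile:,}/mile: {miles:,} miles, ${dollars:,}")
```

    At $10,000 per mile this gives 25 million miles and $250 billion; at the FCC's quoted range the same star would cost between $1 trillion and $6.25 trillion.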

    No, I don't think, “the NSA can afford to string all kinds of really really fast optical fibers from wherever to wherever”.

     
