IRS Finally Examines Backup Tapes, Recovers 30,000 'Missing' Lois Lerner Emails

from the oh,-you-mean-THESE-backup-tapes? dept

Whether or not the IRS is subjecting certain politically-affiliated groups to an unfair amount of attention remains to be seen. What is indisputable is that the agency’s document retention policies are an unenforced joke. As citizens, we’re required to hold onto pertinent financial records for 2-7 years just in case the IRS wants to look through them. The IRS, however, seemingly only retains records for as long as it can keep itself from inadvertently destroying them.

Emails from IRS official Lois Lerner have been sought for several months. At first, the IRS said it had them. Then it said it couldn’t find them. Then it said Lerner’s computer suffered a hard drive crash, taking with it a bunch of the emails being sought. Then it said more computers had crashed, taking out even more emails. Then it said it had recycled the crashed hard drives, making any data unrecoverable.

Questions were asked, most of them being “Bro, do you even back up files to a server?” Apparently, the IRS did no such thing, or was unaware of it, or didn’t understand the question… and so on. The IRS admitted it told officials to print out and save emails (per internal guidance) but apparently no one took these rules very seriously, as there was no hard copy to be found either. A Justice Department official noted that there were backups, but that it was too hard to recover stuff from them, before dozing off in mid-sentence.

Now, all of a sudden (well, actually on a pre-Thanksgiving week Friday afternoon), the IRS has found the emails it claimed were lost.

Up to 30,000 missing emails sent by former Internal Revenue Service official Lois Lerner have been recovered by the IRS inspector general, five months after they were deemed lost forever.

The U.S. Treasury Inspector General for Tax Administration (TIGTA) informed congressional staffers from several committees on Friday that the emails were found among hundreds of “disaster recovery tapes” that were used to back up the IRS email system.

The prodigal Lerner emails have returned! And there was much rejoicing, especially in Darrell Issa’s camp, which has been applying much of the pressure over the past several months.

It will still be some time before these emails are turned over, however. The investigators looked through 744 disaster recovery tapes, holding an estimated 250 million emails, and say it will be a few weeks before the recovered emails are in a readable format. If this goes at the usual speed of government, it will be next year before the emails even make their way into the hands of the investigating committee, and longer than that before the public can take a look for itself.

The good news is that despite the IRS’s internal failures, the system still mostly worked. A backup backed up files and (after much hassling) an internal investigation recovered most of what had been declared officially missing. It’s almost enough to restore your faith in the IRS (and the government as a whole), except for almost everything else about the IRS (and the government as a whole).


Comments on “IRS Finally Examines Backup Tapes, Recovers 30,000 'Missing' Lois Lerner Emails”

37 Comments
Rich Kulawiec (profile) says:

What the heck were they thinking?

“The investigators looked through 744 disaster recovery tapes, holding an estimated 250 million emails […]”

Let’s talk about that statement for a moment.

I happen to have access to some very large and diverse email archives. A small sample of those (363K messages) suggests an average message size of just over 8K each (including full headers). 250M such messages would occupy 2T – well within the capacity of a single external drive, even without compression and allowing for the overhead of encryption. If we presume for a moment that their message corpus has an average size that is 500% larger, this still remains a tractable problem: buy a bunch of external drives, encrypt, copy each year’s subset onto each of a set of three drives, store in diverse locations, test periodically and replace any drive that fails by cloning one of the ones that hasn’t.
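
As a back-of-the-envelope check of those numbers, here is a minimal sketch. It assumes "8K" means 8 KiB per message, uses decimal terabytes, and reads "500% larger" as six times the base size; these are interpretations, not figures from the IRS.

```python
# Rough storage estimate for the email corpus described above.
# Assumptions (not from the IRS): 8 KiB average message, decimal terabytes.

MESSAGES = 250_000_000
AVG_BYTES = 8 * 1024  # "just over 8K each", rounded down to 8 KiB


def corpus_terabytes(messages: int, avg_bytes: int) -> float:
    """Return the uncompressed corpus size in decimal terabytes (10**12 bytes)."""
    return messages * avg_bytes / 10**12


base = corpus_terabytes(MESSAGES, AVG_BYTES)
# "500% larger" is read here as base + 5x base, i.e. six times the base size.
pessimistic = corpus_terabytes(MESSAGES, AVG_BYTES * 6)

print(f"base estimate:        {base:.2f} TB")
print(f"500% larger messages: {pessimistic:.2f} TB")
```

Even the pessimistic figure fits on a handful of commodity drives, which is the point being made: the size of the mail itself does not explain hundreds of tapes.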

So why are they screwing around with hundreds of tapes? (And are those 744 tapes replicated somewhere?)

Anonymous Coward says:

Re: What the heck were they thinking?

Probably because it isn’t an email backup.

It probably is a full server backup. That is, each backup set includes not only emails, but also things like shared file servers and databases. While email databases can be small (unless your users love file attachments), shared file servers can have years of accumulated junk, and databases can be huge even without accumulated junk.

The emails were probably backed up just as a side effect of backing up everything else.

Rich Kulawiec (profile) says:

Re: Re: What the heck were they thinking?

But why would they do such a thing?

If I’m tasked with creating a legally-mandated backup/archive of an email corpus in order to comply with future discovery requests, then I do that separately from routine server backups that include everything on the disks. The former only needs to contain email messages and perhaps the log files associated with them. The latter needs to contain everything including the OS and all the software, libraries, config files, etc.

But it’s the former that I would retain in triplicate in order to comply with the law, not the latter. (I don’t think anybody’s going to come along 5 years later and ask for a recovered copy of /usr/sbin/sendmail.) The former is an archive designed to achieve compliance with records retention laws; the latter is a backup designed to defend from hardware/software/human failure or from a successful intrusion.

tqk (profile) says:

Re: Re: Re: What the heck were they thinking?

Yes. 🙂

About those tapes, of course tapes do fail. They’re just as fragile as hard drives, possibly even more so. The good thing about them is they’re cheaper. Tape drives fail too, and tend to take the tape with them when they do. I wonder if anyone’s still making them.

I would think multi-CD or DVDs would make a better and more permanent medium, and I think somebody’s even come up with a way to laser burn data onto glass which can last a heck of a long time (though even glass “flows”, given enough time).

However, kudos to the IRS backup team! At least somebody was taking their responsibilities seriously. Good job. Any time data is successfully recovered from backups, it’s time for a Snoopy dance.

Anonymous Coward says:

Re: Re: Re:4 What the heck were they thinking?

According to Wikipedia, “Writing in the American Journal of Physics, materials engineer Edgar D. Zanotto states ‘… the predicted relaxation time for GeO2 at room temperature is 10^32 years. Hence, the relaxation period (characteristic flow time) of cathedral glasses would be even longer.'”

https://en.wikipedia.org/wiki/Glass#Behavior_of_antique_glass

That’s why I said it’s “pretty much” a myth. Yes, glass can flow, but it’s SO slow you’d never notice it even in centuries-old glass – or glass that’s been around for the last billion years, for that matter. As the article points out, if this were true you’d expect really old telescopes (telescopes are very sensitive to any change in the lens) to be unusable, but this is not the case. And the really ancient Roman or Egyptian glass should be showing proportionally more “flow” if it was flowing, but this is also not observed.

The cathedral glass LOOKS this way because the glass was spun unevenly, and they ordinarily put the thick side down for stability.

Wayne says:

Re: Re: Re:2 What the heck were they thinking?

Tapes are not fragile. And they suffer far less from data degradation than any disk, be it one with platters or one encased in plastic. Furthermore, they aren’t that cheap, at least not in comparison to enterprise-grade hard drives. I’m not sure enterprise-grade writable DVDs even exist or could exist (outside of that laser-on-glass idea someone mentioned).
I will give you that if your tape drive fails, you’re gonna have a bad time. None of the drives I have worked with were backwards compatible.

Anonymous Coward says:

Re: Re: Re: What the heck were they thinking?

Rich, the difference is you run a mail server and know what you are doing.

I’m pretty sure they are running an Exchange server, since most .gov types I’ve run across usually are, so backups for data retention and message discovery are a bit different. Usually it’s either a separate appliance (Proofpoint, Barracuda, GFI, et al.) or you use an in-house mailbox and journaling, then just export the journal mailbox to a PST file for archival storage, e.g. if you only need active discovery of messages for the past year or so. In other words, it’s a bit messier than your typical archival/compliance message backups, but with the export it’s generally pretty much the same process.

Eldakka (profile) says:

Re: Re: Re: What the heck were they thinking?

From the article, these do NOT sound like a “legally-mandated backup/archive of an email corpus in order to comply with future discovery requests”.

As the article says, these are from DR tapes, which likely means they are full server backup tapes.

DR backups are not designed to go “let’s recover email x from the email database stored on server Z”.

DR backups are designed for restoring entire servers to an operational state after some sort of disaster – ranging from someone taking a hammer to a server to a nuke destroying the data center.

At best they could restore the entire email database, then search through the restored database for emails (either by querying it directly or importing it back into an email server to do ‘normal’ email searches).

Not to mention they may have to actually restore multiple instances of the database, because if they are DR backups they are probably monthly full backups, so to retrieve emails that have been deleted at various times, they may have to restore several backup versions to pull out emails that may have been deleted prior to subsequent backups.

All in all, it sounds like they have a bodgy system designed to NOT be able to easily audit/version emails, with no email-specific archiving mechanism (hey, they even said the official way to ‘archive’ business-relevant emails is to print them out and literally file them on a paper file). Therefore it is a SIDE-EFFECT of DR backups that they are able to retrieve old emails.

Hell, if I was running the backup system and the intent (even if unofficially) was to not keep a history of business decisions, I’d only keep 2 ‘monthly’ backups, overwriting the 3-month old backup with the current month (in addition to the daily incremental backups which would only be kept until the next full backup is verified successful). That way you COULDN’T go back more than 3 months, which is more than sufficient for DR.

Of course, that’s if you are using a grandfather/father/son backup schedule. I always preferred “incremental forever” systems myself.
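
The rotation described above can be sketched as a simple pruning rule. This is only an illustration: the `Backup` record, the `prune` function and the `keep_fulls` parameter are hypothetical names, not part of any real backup product, and the reading of the scheme (keep the newest N full backups plus only the incrementals taken after the latest full) is an assumption.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Backup:
    taken: date
    kind: str  # "full" or "incremental"


def prune(backups: list[Backup], keep_fulls: int = 2) -> list[Backup]:
    """Keep the newest `keep_fulls` full backups and any incrementals
    taken after the most recent full; everything else is overwritten."""
    fulls = sorted((b for b in backups if b.kind == "full"),
                   key=lambda b: b.taken, reverse=True)
    kept_fulls = fulls[:keep_fulls]
    if not kept_fulls:
        return list(backups)  # no full backup to anchor on, so drop nothing
    latest_full = kept_fulls[0].taken
    kept_incrementals = [b for b in backups
                         if b.kind == "incremental" and b.taken > latest_full]
    return sorted(kept_fulls + kept_incrementals, key=lambda b: b.taken)


# Hypothetical history: three monthly fulls and a few daily incrementals.
history = [
    Backup(date(2014, 8, 1), "full"),
    Backup(date(2014, 9, 1), "full"),
    Backup(date(2014, 9, 15), "incremental"),
    Backup(date(2014, 10, 1), "full"),
    Backup(date(2014, 10, 2), "incremental"),
    Backup(date(2014, 10, 3), "incremental"),
]
for b in prune(history):
    print(b.taken, b.kind)
```

Under a rule like this, anything deleted before the oldest retained full backup is simply gone, which is why such a schedule is adequate for DR but useless as a records archive.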

allengarvin (profile) says:

Re: What the heck were they thinking?

It’s probably important to note these are “disaster recovery tapes”, not email archive tapes. As such, they’re probably full disk or LUN image backups, probably taken at regular intervals (month? quarter?). A fragment of data is not particularly useful in a DR situation. So I can easily imagine that recovering email over several years requires examination of a bunch of tapes, especially if there’s a regular purge going on, on the servers. DR tapes aren’t going to have space-saving measures like incremental diffs or dedupe. You want to dump a full image and bring it up as fast as possible.

(Also, as for size, I have 9 years of email from a previous job, from 2005 to 2014, exported as PSTs, converted to mbox format and exported to timestamped individual files. The median size is 12k [ls -l | awk '{print $5}' | sort -n | sed -n "$(($(echo * | wc -w)/2))p"], but the average size is a much larger 180k [46k messages totalling 8.1G], and I know I would frequently delete mails with big attachments that I didn’t want to save.)
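
The gap between median and average is what you would expect when a few large attachments dominate the total. A minimal sketch of the same measurement in Python, assuming one message per file in a single directory; the `size_summary` helper and the demo file sizes are made up for illustration:

```python
import statistics
import tempfile
from pathlib import Path


def size_summary(directory: str) -> tuple[float, float]:
    """Return (median, mean) size in bytes of the regular files in `directory`."""
    sizes = [p.stat().st_size for p in Path(directory).iterdir() if p.is_file()]
    if not sizes:
        raise ValueError(f"no regular files in {directory!r}")
    return statistics.median(sizes), statistics.mean(sizes)


# Demo on a throwaway directory: four small messages plus one big attachment.
with tempfile.TemporaryDirectory() as tmp:
    for i, size in enumerate([1_000, 2_000, 3_000, 4_000, 990_000]):
        (Path(tmp) / f"msg{i}.eml").write_bytes(b"x" * size)
    median, mean = size_summary(tmp)

print(f"median: {median} bytes, mean: {mean} bytes")
```

A single large file pulls the mean far above the median, so any capacity estimate should be based on the mean (or the total), not on the typical message.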

spodula (profile) says:

Re: What the heck were they thinking?

“I happen to have access to some very large and diverse email archives. A small sample of those (363K messages) suggests an average message size of just over 8K each (including full headers). 250M such messages would occupy 2T –”

QIC-120s were good enough for my Grandfather and they are good enough for me! And apparently the IRS….

Young Whippersnapper!

Adam (profile) says:

It isn’t that they had them, lost them, crashed them, found them that ticks me off. I have been involved in an IRS discovery for a private company, as a worker for the contractor that stored their data. The IRS wanted ALL data within 7 days. ALL OF IT. They wouldn’t even accept “we have it, but it’s on our 744 tapes and it will take more than 7 days”. They took a no-excuses approach. So, 5 months? pshh

Anon says:

What Work?

If someone is composing and sending 30,000 emails, when do they have time to get any work done? Even if they are just reading 30,000 emails, when do they do real work?

I’ve been reading up on DRM security for Outlook/Exchange, and it’s scary how fragile the system could be. Lose the server and certificate, your emails are unreadable.

744 tapes to look through just sounds far too disorganized. I suspect you see chaotic disorganization as a side effect of a government that does not pay for good IT (despite the fact that the IRS’s primary job is essentially about organizing data) plus the typical civil service / large bureaucracy problem that nobody can make a timely decision or fix a problem. Backup takes too many tapes? Backup crashes? We’ll set up a taskforce to study the problem and make recommendations. Need a new tape system? We’ll put in a request in next year’s budget and hope it doesn’t get cut.

amoshias (profile) says:

Look, I know conspiracy theories are fun and all...

but let’s try to get our heads on straight here. The assumption from the beginning, by many in the IT world, has been a very reasonable one – that we’re not dealing with messages from a server here. Given the federal government’s generally TERRIBLE infrastructure, many sites have pathetically tiny email systems. I worked at a DoD site – considered a tech center – where users were given 30 megs of storage space. Because of the nature of the work I was regularly sent individual attachments which would fill up my inbox. EVERYONE had outlook .PST files on their individual computers. Is this a stupid way to do things? Yes, absolutely. Do I know for certain that the Lerner emails that are missing are from a personal hard drive? Absolutely not. But it seems extremely likely.

To Mr. 5-months-is-enough-to-delete-incriminating-emails – I know 30k emails SOUNDS like a lot. It is not. It’s probably about 15 days of work for someone who had never done that kind of thing before, probably half that for someone who actually has some familiarity with document review.

Disaster recovery guy? It sounds like you’ve got an IT background, but it’s possible that you’re not familiar with disaster recovery: it doesn’t work like a drive mirror. The things you’re talking about bear no relation to the reality of how tape archiving works.

But from my point of view, the biggest problem here is that by even talking about server problems and email infrastructures, you’re fundamentally buying into the idea that there’s a scandal here. We’ve been through literally YEARS of Issa bullying everyone in his arm’s reach, trying to manufacture a scandal whether there was one or not. And what’s come out is that the IRS took the somewhat lazy tack of focusing on groups whose names clearly indicated that they intended to violate the laws pertaining to their nonprofit status. Which, to be honest, seems reasonable to me. But then again, profiling always SEEMS reasonable when it’s against people you don’t like, and I generally don’t like liberal or conservative groups who pretend to nonprofit and nonpartisanship.

So it’s profiling, and that’s wrong, and I’m absolutely willing to let that principle stand over my own feelings of “yeah, not too bothered by this.” But where’s the coverup? What’s the point? The scandal – such as it is – is out in the open. Occam’s Razor is that this is stupidity and laziness.

But who knows? Maybe it’ll come out – if Issa keeps pushing to keep himself in the news… oops, I mean, if he keeps tirelessly investigating this – that the White House ordered the IRS to use their powers to TAKE AWAY THE TAX BREAKS of a bunch of fairly insignificant progressive and tea party groups. Truly, a plan worthy of Lex Luthor.

Anonymous Coward says:

Re: Look, I know conspiracy theories are fun and all...

Look, I know conspiracy theories are fun and all but let’s try to get our heads on straight here. The assumption from the beginning, by many in the IT world, has been a very reasonable one – that we’re not dealing with messages from a server here. Given the federal government’s generally TERRIBLE infrastructure, many sites have pathetically tiny email systems. I worked at a DoD site – considered a tech center – where users were given 30 megs of storage space. Because of the nature of the work I was regularly sent individual attachments which would fill up my inbox. EVERYONE had outlook .PST files on their individual computers. Is this a stupid way to do things? Yes, absolutely. Do I know for certain that the Lerner emails that are missing are from a personal hard drive? Absolutely not. But it seems extremely likely.

Conspiracy? Ehh your point is invalid here because they stated they lost the emails, yet five months later here they are. Whether it was an attempted coverup or gross incompetence is beside the point: They said they lost them, you argue that’s probable, except they didn’t lose them so your argument is null and void.

tqk (profile) says:

Re: Re: Look, I know conspiracy theories are fun and all...

They said they lost them, you argue that’s probable, except they didn’t lose them so your argument is null and void.

You appear to have forgotten about the left hand, right hand problem. Somebody said they lost them. That’s entirely possible if they were unaware of those disaster recovery backup copies. They were lost to them, and then somebody at the back of the room raised his hand.

Anonymous Coward says:

Re: Re: Re: Look, I know conspiracy theories are fun and all...

I forgot? Nah, the guy was saying it’s probable they lost them.

Do I know for certain that the Lerner emails that are missing are from a personal hard drive? Absolutely not. But it seems extremely likely.

After they found them. Which is a null and void argument to make because they found them. Left/right hand doesn’t enter into it.

It’s like if you lost some emails and found them six months later, then told me that you found them, and I then piped up and said, “naw dog, you probably lost them for good because of x y and z”.

That’s his argument in a nutshell.

Anonymous Coward says:

Re: Look, I know conspiracy theories are fun and all...

But who knows? Maybe it’ll come out – if Issa keeps pushing to keep himself in the news… oops, I mean, if he keeps tirelessly investigating this – that the White House ordered the IRS to use their powers to TAKE AWAY THE TAX BREAKS of a bunch of fairly insignificant progressive and tea party groups. Truly, a plan worthy of Lex Luthor.

You sound dismissive, but if it’s true it’s worthy of impeachment, just like a third-rate burglary would be. And just like a certain third-rate burglary, the problem may not be so much the crime as the coverup.

Anon says:

Exchange Restore

OK, so let’s say the department has really crappy backup software. Don’t assume the mail database is too small, assume the opposite. Backups generally were intended to prevent loss of data from server crashes, not as a searchable archive.

The cheaper backup solutions simply backed up the database. OK, but how do we restore? Overwrite the (corrupt, failed) existing database? The server may not exist any more, or it’s a production server and needs to be there to allow people to keep getting email. So we create a new server: additional cost, additional resources. Format it as an Exchange server. Restore the database and attach it. But this server needs to be on the domain so the user IDs are valid. Either do it with the existing domain, or duplicate a domain controller on the private network. After that restore, open the mailboxes required and run a search for the requested emails. All done? Restore tape number two and repeat. Then three… all the way to seven hundred. An email may be on one tape but not the next, if there was not enough mailbox room and it was moved to a local PST. Eliminate duplicates.

I hope nobody asks for another user’s email, or they do this all over again from tape 1.
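
The loop described above (restore a tape, search the mailbox, move on to the next tape, then eliminate duplicates) can be sketched as follows. This is a toy model, not Exchange tooling: each "tape" is just an in-memory list of messages, and `Message`, `search_tapes` and the sample data are hypothetical. Deduplicating on the Message-ID header is an assumption about how one might recognise the same email on several tapes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    message_id: str
    mailbox: str
    subject: str


def search_tapes(tapes: list[list[Message]], mailbox: str) -> list[Message]:
    """Walk every restored snapshot in order and collect each message for
    `mailbox` exactly once, keyed by Message-ID."""
    seen: dict[str, Message] = {}
    for snapshot in tapes:  # stands in for "restore tape N, open the mailbox"
        for msg in snapshot:
            if msg.mailbox == mailbox and msg.message_id not in seen:
                seen[msg.message_id] = msg
    return list(seen.values())


tapes = [
    [Message("<1@example.gov>", "user_a", "budget"),
     Message("<2@example.gov>", "user_a", "lunch")],
    # "lunch" was deleted before the second backup; "review" is new.
    [Message("<1@example.gov>", "user_a", "budget"),
     Message("<3@example.gov>", "user_a", "review"),
     Message("<4@example.gov>", "user_b", "other")],
]

found = search_tapes(tapes, "user_a")
print([m.subject for m in found])
```

Note that a request for a second mailbox means calling `search_tapes` again, which in the real world means restoring every tape all over again.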

Anonymous Coward says:

Re: Exchange Restore

Journaling has been around since Exchange 2003, so really you are just restoring 1 mailbox from a specific time on that separate server or using something like Veeam Explorer to search that mailstore backup from off the tape. If they need true eDiscovery, it’s done on the journal mailbox which contains everything sent/received by the server itself and can’t be deleted by anyone other than an admin unless you have an illegal operation.

Anonymous Coward says:

Koskinen should be fired

I’ve seen Koskinen testify before Congress. The nerve of Congress that they would impugn his integrity!

This guy needs to be fired, and replaced with someone who will actually work to get to the bottom of this mess, rather than someone who is a lackey to delay as long as possible, and continue to sweep things under the rug.

I thought Koskinen’s appointment would be a good move, but his actions prove he has the intelligence of a garden slug….Apologies to all of you garden slugs out there.

Docrailgun says:

PACs

The email debacle aside, I think we’ve lost the signal in all the noise about the eeeevil IRS. Of course the IRS was looking into the flood of new PACs created around the time this all occurred – many of them were religiously-related groups who may well have been breaking the rules. That these groups were considered “conservative” or right-wing really had nothing to do with the process. There were groups on the left that were also investigated, but since most of these didn’t fall under the category that most needed investigation to assure that they were established under the rules there were of course fewer of them investigated. Further, most of these new groups expecting to be given tax-free status would also be considered “right-wing” but that had nothing to do with the investigation… so of course they’re going to have had more investigations, by both ratio and by sheer numbers.
This whole issue is like Benghazi – parts of the situation were a tragedy but the sinister conspiracy just didn’t happen the way that some people have tried to make it seem.

Anonymous Coward says:

I find it very interesting that these emails were “suddenly” recovered by the IRS after the mid-term elections, especially when you consider Republicans have just taken back majority control of both houses of Congress.

Things that make you go hmmmm. One thing we can say for sure: there’s a bunch of bald-faced liars inside the IRS.

New Mexico Mark says:

Problem with their process...

“The IRS admitted it told officials to print out and save emails (per internal guidance)”

The problem is obvious. They forgot to add, “Then fax your printed e-mails to our offsite backup where scribes will copy them to papyrus scrolls for long-term storage.”

It is 2015 and the IRS is using computers (sort of) but their procedures are stuck in 1980 mode.

cargostud says:

IRS IT

I work for a large state government agency. Guess who is coming to audit us for IT security compliance? That’s right! The IRS. What a bunch of hypocrites. Their own house doesn’t seem to be in order. I am not an Exchange guy, but I do know that you can have your email saved up on a server and have a local copy at the same time. That is a great way to always make sure you have your email. That was how I did it, because I was always working in tactical environments where I had no connectivity but I still had the emails I downloaded. A high-powered agency director such as Lerner is not going to be left in the lurch without access to email, and without local copies of it along with backups of her PST files. Some of these folks move from job to job with the same PST file and have years of saved emails. Then we have the active archival system of Exchange. After that there may be secondary or tertiary backups, and finally there is tape. Tape is a bitch, but I think it’s great that someone finally was able to save Ms. Lerner’s emails. She must have been distraught at the thought of so many safeguards that failed and resulted in the loss of her precious emails. I wonder if anyone got fired for this.
