Court Says Metadata Should Be Released Under Freedom Of Information Act Request

from the commence-metadata-scrubbing dept

Copycense points us to the fascinating news that a federal judge has ordered Immigration and Customs Enforcement (ICE) to reveal the metadata on a document as part of a Freedom of Information Act (FOIA) request. ICE had responded to the FOIA request (apparently “after significant delay”) but provided the requested content as an unsearchable PDF. The original requestor, the National Day Laborer Organizing Network, complained that this was unfair and that the information had to be supplied with its metadata, and the court agreed. The court also agreed that making the PDF unsearchable was not justified:

“Metadata maintained by the agency as a part of an electronic record is presumptively producible under FOIA, unless the agency demonstrates that such metadata is not ‘readily producible.’”

Sounds like some government employees are going to need to spend the next few weeks scrubbing metadata from documents. Wouldn’t want people to find out who really wrote various laws by looking at the metadata on Word docs, would we?

As for the unsearchable format, the judge slammed ICE for clearly going out of its way to make the document “more difficult or burdensome for the requesting party to use,” in violation of standard discovery rules. Nice to see that ICE has the time to purposely obfuscate records requested in a FOIA request. Transparency in action…


Comments on “Court Says Metadata Should Be Released Under Freedom Of Information Act Request”

19 Comments
Anonymous Coward says:

Re: Re: Re:

No, ugly for anyone working on a document in its development stages. It will limit people’s desire to make comments, to offer ideas, or to disagree with anything in the document because they will not want their objection to be part of the metadata.

It is a huge negative to the entire process.

Anonymous Coward says:

Re: Re: Re:2 Re:

Apparently you have never been in the room when legal theories and possible courses of action are discussed. It’s sort of like the comments here, all over the road.

Requiring full disclosure of every step and every part of the document, no matter what the end result, is very likely to stop people from making suggestions, or at least stop them from using official channels to do so.

It’s the same logic as Piracy. Block protocol X, and they try a new one with more covering and more privacy. If you buy that logic, you know exactly what the government people are going to do in the future: Use undocumented ways of discussing and working on product so their intermediate workflow stuff cannot be easily obtained.

velox says:

Re: Re:

“this looks more like a fishing expedition”

The discovery phase of any lawsuit these days is almost always a “fishing expedition”.
That’s what lawyers do.

If you don’t like it, stop using software that hides a bunch of metadata.
For some reason, software engineers have this desire to store a bunch of metadata inside working files, and it rarely serves a purpose or benefits the end user.
MS Office is one of the worst, but it’s a widespread issue.

Rekrul says:

As for the unsearchable format, the judge slammed ICE for clearly going out of its way to make the document “more difficult or burdensome for the requesting party to use,” in violation of standard discovery rules.

My first thought is that they simply scanned the documents as images and then compiled those images into a PDF. I see this done all the time. It’s faster than using OCR and then cleaning up the hundreds of mistakes that the software makes. However, images aren’t searchable.

Borax Bob says:

Maybe Not Intentional

Obviously I have no idea if this is true or not, but where I work, probably over 50% of the people who have a PDF “printer” of one sort or another installed on their computer will still print something out and then walk over and scan it. Even large documents. Heck, they have an 80-page change to a 400-page policy they put out a year ago; the original policy was searchable, but the changes to it aren’t, and the scan isn’t even all that good. It looks like someone wanted to highlight parts of the changes, so they printed it out, used a highlighter, and then scanned it. I guess it makes sense, then, that the page numbers don’t match. The “changes” appear to have actually been made to the entire original document, which is distributed electronically; why only the changes were put out, I have no idea. I was assured the changes were temporary, but it’s been over a year….
My point is, it might be (or, since it’s the Federal Government, it probably is) just plain incompetence on someone’s part.

Rabbit80 says:

DMS

I work in a scanning bureau and with various document management systems. We get sent tons of paperwork (mostly typed) for scanning, which is presumably no longer available in its original (i.e. Word etc.) form.

Some of the document management software we use will not produce searchable PDFs. The images are stored as single-page TIFFs with an accompanying XML or TXT file – not much use to anyone in that format! To make the PDFs searchable, they have to be run through a separate OCR process after exporting.

This software is also capable of redacting the image files before exporting them.

It’s pretty easy to see why non-searchable PDFs may have been given.

Anonymous Coward says:

Usually pretty easy to make a PDF searchable as long as it is in the form of an electronic file (unless the conversion gives you the dreaded “this document contains non-renderable text”). Redactions also create problems.

If a PDF is printed to paper, conversion via OCR is a nightmare.

I presume the FOIA requestor will hereafter make sure to request electronic copies of the originals so that metadata is preserved, but making them searchable is problematic as noted above. This would have at least one benefit. Printouts cannot be tossed over the transom as a compendium of several thousand pages that do not clearly demarcate where one file ends and another begins.

mischab1 says:

If it’s anything like my company, half the metadata is meaningless anyway. Someone wants to copy the formatting style of a document, so they copy an existing file and replace all the text with their own (on a completely different subject). But they don’t know anything about metadata, so the file still lists as Author the person who clicked File -> New 5+ years ago.
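
For anyone curious what that stale Author field actually looks like, here is a minimal sketch, assuming the file is a modern .docx. Those are ZIP packages that normally carry the author fields in a part named docProps/core.xml. The helper names read_core_properties and make_sample_docx, and the sample author names, are made up for illustration; the sample is a stripped-down stand-in package, not a complete Word document.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by docProps/core.xml in Office Open XML packages.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

# Hypothetical sample data: only the two author fields are present.
SAMPLE_CORE_XML = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<cp:coreProperties'
    ' xmlns:cp="http://schemas.openxmlformats.org/package/2006/metadata/core-properties"'
    ' xmlns:dc="http://purl.org/dc/elements/1.1/"'
    ' xmlns:dcterms="http://purl.org/dc/terms/">'
    '<dc:creator>Original Template Owner</dc:creator>'
    '<cp:lastModifiedBy>Current Editor</cp:lastModifiedBy>'
    '</cp:coreProperties>'
)


def read_core_properties(docx_file):
    """Return author-related metadata from a .docx path or file-like object."""
    with zipfile.ZipFile(docx_file) as package:
        root = ET.fromstring(package.read("docProps/core.xml"))

    def text_of(tag):
        # Missing elements are reported as None rather than raising.
        node = root.find(tag, NS)
        return node.text if node is not None else None

    return {
        "creator": text_of("dc:creator"),
        "last_modified_by": text_of("cp:lastModifiedBy"),
        "created": text_of("dcterms:created"),
        "modified": text_of("dcterms:modified"),
    }


def make_sample_docx():
    """Build an in-memory stand-in package containing only the metadata part."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("docProps/core.xml", SAMPLE_CORE_XML)
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    for field, value in read_core_properties(make_sample_docx()).items():
        print(f"{field}: {value}")
```

In the copied-template scenario above, the creator would still name whoever first made the file, while last_modified_by names the person who most recently saved it, which is why the Author field alone says little about who wrote the current text.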
