Good Luck Trying To Delete Stuff Off The Internet
from the yeah,-that'll-work dept
ethorad writes “In the UK, in an attempt to promote the work the police do, some forces name and shame criminals that they catch and prosecute. All good so far as it helps the community see that crimes are being tackled (assuming they are …)
However the Ministry of Justice has now said that police forces who do that must remove the details from their website after one month. Yeah, good luck with that. Place your bets now on how many third party websites (especially local community ones) will start scraping the details from police websites for long term storage?”
Comments on “Good Luck Trying To Delete Stuff Off The Internet”
You have to be something of a fanatic to systematically scrape and preserve someone else’s web site. The practical reality for most web sites is that if you delete something it’s pretty much deleted. There are exceptions of course, but those don’t amount to a general inability to delete.
Re: Re:
http://www.archive.org/index.php
150 billion pages archived. That’s pretty fanatical…
There are various other projects that do the same thing too, eg Archives NZ recently decided to download as much of the .nz webspace as they could find, plus a whole lot of NZ-related sites in other domains. They’re not fanatics, it’s actually part of their job description…
Re: Re: Re:
If you search that site for http://www.techdirt.com you find nothing archived for 2009 and only 3 pages for 2008 – pretty useless for e.g. techdirt trying to recover from being hacked earlier in the year.
150 billion pages just doesn’t cut it.
Re: Re: Re: Re:
I think you have a point you’re trying to make, but I can’t quite see it.
http://www.nytimes.com/idg/IDG_852573C40069388000257442004ECECE.html?pagewanted=print
Re: Re:
Hi, you must be new to the internet. Everything is indexed and cached and not just by google.
Re: Re: Re:
Hi, you must be new to the internet. Indexing and caching doesn’t mean stuff exists forever even if it’s done by google.
Re: Re: Re: Re:
But it does mean that it exists on the internet for longer than the intended month.
Re: Re:
Agree that there are websites out there which aren’t specifically preserved by third parties (search engine cache and various internet archive sites aside). However, local criminals will be a point of interest for the local community so I’m sure that there will be a specific interest to retain this data.
Basically someone who is interested in keeping tabs on crime in their local community – Neighbourhood Watch, etc style. For example I’m sure someone will have a Google Maps mashup that takes the police data and tags it on a map of the UK (if there isn’t one already of course!). Once the data is copied from the police site and into someone else’s database it’s that bit harder to delete.
Re: Re: Re:
Basically if the local community is interested in some info on a police web site they can read it and remember it – that’s the whole point of putting it on the web in the first place. Anything no one is interested in gets deleted which means …. deleted.
The police don’t maintain an ever growing list of criminals and criminal activity online, which is also OK.
Unpublishing
Kind of reminds me of when one of the biggest tabloids here in Sweden tried to interview the leader of our most famous union about a recently revealed scandal. After they had published the paper, however, it turned out that the woman in the article’s photo, who seemed very dismissive and unwilling to comment on the whole story (for natural reasons, it would turn out), was a completely different person who just happened to live near the union leader. This mistake was of course deeply embarrassing for the tabloid, and I listened to a radio interview with a representative of the newspaper afterwards. He said that of course they “unpublished” the article as soon as they found out.
He was probably referring to the internet version of the article – the paper was already out everywhere. Still I found it very interesting to hear a person at a newspaper believe that it’s possible to “unpublish” something. To me it seems just as impossible as “untelling” a secret or “unbreaking” a vase.
Re: Unpublishing
Very good observation.
Have a nice Christmas.
You may be interested in a site I know of. It’s called Google. I don’t know if there are many more like it but it’s quite fascinating what deleted items can be found there…
With a National Security Letter, everything is available outside of the usual judicial process.
I wouldn’t be surprised if this whole Tiger Woods fiasco, and the voice mails, were released under political duress of a chatty intel officer who, in turn, obtained them as a result of the PATRIOT Act.
Thanks, Bush Administration.
Re: Re:
Perhaps you haven’t read Judge Victor Marrero’s 103-page decision on NSLs (or 120 pages as the news organizations reported…)
Here’s a copy.
http://www.aclu.org/pdfs/safefree/nsldecision.pdf
Re: Re:
Or was told to by N’obama, needed some more smoke for the socialist agenda…
Re: Re: Re:
Hey man, you voted for the Patriot Act. Shouldn’t you be angry about your personal agenda instead?
Re: Re:
Thanks, Bush Administration.
Add – Congress for voting for it and the Obama Administration for continuing it.
wrong
there is a site dedicated to caching everything on the net forever, minus whatever robots.txt excludes
it’s been around for what, 10 years or is it 15
you just aren’t told it’s doing it
but oddly once you are it is public
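As a rough illustration of the "minus the robots.txt" part: a well-behaved archiving crawler checks a site's robots.txt before fetching a page. The sketch below uses Python's standard `urllib.robotparser`. The robots.txt contents, the `example-archiver` user agent, the `example.org` URLs, and the `may_archive` helper are all made up for the example, and nothing here claims to show how any particular archive actually works.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents. A real crawler would download this
# file from the site it is about to archive.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def may_archive(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for page in ("https://example.org/news/item.html",
                 "https://example.org/private/notes.html"):
        print(page, may_archive(ROBOTS_TXT, "example-archiver", page))
```

Note that robots.txt is only a request. A third party who wants to keep a copy can simply ignore it.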
UK Govt. is becoming synonymous with technically illiterate. It won’t be long before it’s listed in thesauri worldwide.
“Fanatic”?
Hardly. It takes 5 minutes to set up a program like httrack to rip a website daily or even hourly.
Never Goes Away
When I first started playing with the web (1994 or so) a friend told me that once you post something online (including email) it never goes away.
The longer I live the more I believe it’s true.
This will last until...
These sorts of lists sound good in meetings. Name and Shame, John Lists, Blame where blame is due etc.
Until you make up your list and at some point you have a typo. Dan Smith of Cardiff instead of Don Smith of Cardiff. All of a sudden Dan Smith has people saying ‘Dan was arrested with a prostitute?’. Add in one pontificating lawyer, cries that ‘this list put my life in ruins’, and the list gets quietly taken down.
I’m not sure how some of you people think there is much to scraping as used in this situation.
You write your script to run the scraper, then automate it to run at certain intervals. Once that is done, the only time you have to spend on it is when something causes it to fail. For someone who can code a scraper, there is little time spent on it. And those who can’t code? There’s an app for that……
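To give a feel for how little code that involves, here is a minimal sketch in Python using only the standard library. It fetches one page at a fixed interval and keeps a copy each time the content changes. The function names, the archive directory layout, and the idea of naming files by their SHA-256 digest are all assumptions made for the example, not anything the police sites or any existing tool prescribe.

```python
import hashlib
import time
from pathlib import Path
from typing import Optional
from urllib.request import urlopen


def save_if_changed(content: bytes, archive_dir: Path) -> Optional[Path]:
    """Store content under its SHA-256 digest.

    Returns the new file's path, or None if an identical copy
    is already in the archive.
    """
    archive_dir.mkdir(parents=True, exist_ok=True)
    digest = hashlib.sha256(content).hexdigest()
    target = archive_dir / f"{digest}.html"
    if target.exists():
        return None
    target.write_bytes(content)
    return target


def scrape_forever(url: str, archive_dir: Path, interval_seconds: float) -> None:
    """Fetch url every interval_seconds and archive each new version."""
    while True:
        try:
            with urlopen(url, timeout=30) as response:
                saved = save_if_changed(response.read(), archive_dir)
            if saved is not None:
                print(f"archived new version: {saved}")
        except OSError as error:
            # Network failures are logged and retried on the next pass.
            print(f"fetch failed: {error}")
        time.sleep(interval_seconds)
```

Calling something like `scrape_forever("https://example.org/page.html", Path("archive"), 3600)` (a placeholder URL) would check hourly. In practice the loop would more likely be replaced by a cron job or similar scheduler that runs a single fetch each time.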
The Droid Sucks.
The issue is that there are commercially sold programs that allow for data to be recovered.
So it’s easily possible to have something that was deleted years ago come back and haunt you.
ha