Company Claims Its Software Can Magically Identify 'Rogue Sites'
from the that's-not-how-it-works dept
A company called RogueFinder is claiming that it has automated the process for finding rogue sites:
The basic idea is to draw links between seemingly unconnected “rogue” web sites, e.g. web sites selling counterfeit goods. According to the RogueFinder web site, its software takes minutes to do what it takes forensics teams months to achieve.
It uses data from registries, registrars, web hosts, servers, and ISPs as well as inspecting the sites’ “invisible source code”.
Sounds useful for playing parlor tricks. Not so sure for a system involved in blocking protected speech. As we’ve discussed time and time again, one of the issues in all of this is that determining what is and what is not infringing is not an easy task. At all. It takes a human being who can actually analyze the situation and how it falls under copyright law — including exploring specific exemptions. It’s time that we got rid of the myth that there’s any significant way to magically identify what’s infringing and what’s not.
Filed Under: monitoring, rogue sites
Comments on “Company Claims Its Software Can Magically Identify 'Rogue Sites'”
I have a brain that can realistically without any majicks identify bullshit claims..
And I can honestly state It has identified one again
And WTF is invisible source code? Is that like Imaginary Hollywood Matrix stuff.. oooo scary
It sounds like the special way you explain the magic fairies that live inside the computer to the dinosaurs who are sure the internet is just a fad you can kill off before anyone starts using it.
ps my tablet typing skills need work…that and i forgot my pw again…
Re: Re: Re:
I prefer the ‘magic smoke’ theory. Once you let the magic smoke out, it stops working.
Re: Re: Re: Re:
I thought I covered the magic smoke before… had to go find it again…
“The magic pixies who live inside the thinking box, when you double click it they are forced to once again pick up their instruments and reproduce the drivel to appease their human captors.
The magic smoke one sometimes sees leaving a computer case is actually the souls of pixies pushed too far and too hard to reproduce too many songs in a public performance.
Before you torrent that next album, won’t you stop and think of the pixies?”
Re: Re: Re:2 Re:
It’s all that Technical Hitch called TIKUF who is the real culprit [ http://www.youtube.com/watch?v=Yq_i-swEK14 ]
They say if you pronounce TIKUF backwards three times, that he will appear to smite your enemies.. Honest and for true!
all humans can get this, they just need to carefully care for the truth, and be exposed to a dangerous amount of lies
Wow, it certainly didn’t take long for the “anti-piracy” companies/leeches to jump on the whole “rogue” bandwagon in an effort to make a quick buck.
Link a system like this up with SOPA, automate it, and you could probably shut down a significant chunk of the web, certainly what’s accessible in the US. The false positives would just be acceptable collateral damage with, it seems, limited recourse. Of course then the countermeasures to get around the system come online and we have a race to see who is better.
What a waste of time/money/effort.
Re: Response to: Anonymous Coward on Dec 13th, 2011 @ 11:25pm
It’s yet further proof, if any were needed, that the US government isn’t actually interested in helping the tech sector, and is simply trying to create new “jobs”.
But if the source code is invisible how can they see it? O.o
Interesting… but not at all surprising….
Gioconda, Joseph Joseph.Gioconda@RogueFinder.com
42-40 Bell Boulevard
Bayside, New York 11361
Which apparently is these folks.
why am I not surprised that a lawyer makes a claim that he has a majiktastic solution to just KNOW what’s infringing.
A lawyer who just so happens to make all of his money from IP.
Re: Re: Re:
Sounds like a ‘make work’ program for lawyers….
We have software that can identify thousands of people you can sue automatically… Imagine not having to spend all that time and effort gathering ip addresses and fake names for your extortion schemes… er legal filings, with our automated software, you just point it to a piece of content, enter the number of suckers (aka litigants) you want to try and extort money from, and our system will use its “Magic Six Degrees of Kevin Bacon Methodology” to identify the appropriate number of individuals to include in your suit.
fine print: no warranty expressed or implied, all results made up on the spot based on random ip address associations, no guarantee of actual infringement or any proof is ever provided by this software, the results of this system are not valid for legal filings and should not be relied upon for initiating legal proceedings… (we know nobody reads the fine print… so if you use our software to identify people to sue, you are violating our licensing agreement on any suits filed, and you agree to pay us $1000 per name identified by our softwar and used in your suit)
Yes, THIS IS SOFTWAR…. get in the game or move on…
Re: Re: Re: Re:
Why not just use the phone book?
The worst part is, the SOPA crowd would probably buy that. If I went to a congressman’s house and showed him where “view page source” is in Internet Explorer 4 (or possibly Netscape Navigator), he’d think it was some kind of secret legendary hacker trick.
I thought they meant the server-side code, as in PHP or perhaps even the actual source code for the webserver. And that would have been an impressive feat.
Re: Re: Re:
that would have been an impressive feat
And probably illegal. Unauthorized access to a computer system, anyone?
I doubt it
If this were possible, don’t you think Google (or some other search engine) would be selling this service to the MPAA/RIAA/etc.?
Everyone likes to shout “free market” all the time, but forget to take it into account in most discussions that actually require it.
Plus first of all, all they are doing is using bots to crawl websites they ‘mark’ as rogue, and then to make a database of other sites that are linked to by sites that link to that original marked site(s).
This would be the 3rd homework problem in any class teaching how to make a search engine after 1) how to make a crawler-bot, 2) how to make a DB of sites 3) (this) how to link sites together in groups
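To make the "homework problem" concrete, here is a minimal sketch of that third step, assuming the crawl has already produced a table of which site links to which. The `links` dictionary, the `co_linked_sites` name, and the example domains are all hypothetical; nothing here reflects how RogueFinder actually works.

```python
def co_linked_sites(links, marked):
    """Return sites that are linked to by any site that also links to a marked site.

    links:  dict mapping a site to the list of sites it links to
            (hypothetical output of a crawler, not real data).
    marked: iterable of sites someone has already labelled 'rogue'.
    """
    marked = set(marked)
    # Step 1: find every crawled site that links to at least one marked site.
    referrers = {site for site, targets in links.items() if marked & set(targets)}
    # Step 2: collect everything else those referrers link to.
    related = set()
    for site in referrers:
        related.update(links[site])
    # The marked sites themselves are already known, so leave them out.
    return related - marked


if __name__ == "__main__":
    crawl = {
        "a.example": ["bad.example", "x.example"],
        "b.example": ["y.example"],
        "c.example": ["bad.example", "z.example", "a.example"],
    }
    print(sorted(co_linked_sites(crawl, ["bad.example"])))
```

Note that the result is only "sites that keep the same company", which says nothing about whether any of them is doing anything unlawful.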
Re: I doubt it
The only way for a system to identify potentially infringing material is for all copyrighted material to be registered and fingerprinted, and for each and every promotional track sent out to be listed as such in a huge database.
All that would do is give you what is ‘potentially infringing’ not what is infringing.
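A rough sketch of the registry idea described above, under loud assumptions: the fingerprint here is a plain SHA-256 of the file bytes, which only matches byte-identical copies (real audio or video fingerprinting is far more involved), and the registry contents and status labels are invented for illustration.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Exact-match fingerprint: SHA-256 of the raw bytes (a stand-in, not perceptual matching)."""
    return hashlib.sha256(data).hexdigest()


def lookup(data: bytes, registry: dict):
    """Return the registered status for this content, or None if it is not in the registry.

    registry maps fingerprint -> status, e.g. 'registered' or 'promotional'
    (hypothetical labels). A hit means 'potentially infringing' at most.
    """
    return registry.get(fingerprint(data))


if __name__ == "__main__":
    registry = {
        fingerprint(b"album track bytes"): "registered",
        fingerprint(b"free promo track bytes"): "promotional",
    }
    for sample in (b"album track bytes", b"free promo track bytes", b"home video bytes"):
        print(lookup(sample, registry))
```

Even with a perfect registry, a hit cannot tell a licensed copy, a promo, or a fair use from an infringing one, which is the commenter's point.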
From Wikipedia, the free encyclopedia
see it’s listed, so it *must* be a rogue site…
meanwhile those _poor_ industries struck by piracy “help” add to the list… …
It might be useful, but it will invariably have a lot of false positives and false negatives. Especially if you consider the logical malleability of law and how it relates to IP.
It might be useful as a tool to gather potential instances of infringement. However, these instances will still need people to verify if they are infringing or not.
Although, based upon many companies’ previous behavior when given tools to locate potentially infringing material, they are likely to take this software’s list and send out mass takedown notices without properly checking.
Can’t wait to see the false positive rate from this software.
I can only guess that false positive rate is higher than 100%. Illogical, but then again it’s fitting for software that makes even less sense.
Oh great like we need this.
DMCA law gave them cruise missiles. Over half target a business rival. One third were invalid attacks.
Now SOPA will give them nuclear weapons and you can watch part of the Internet get obliterated before your eyes.
Then what better than for lazy copyright owners to put the pending WWIII all on computer control?
Technology being used to crack down on piracy?
Sounds like Mike Masnick’s worst nightmare.
Comments that really have no bearing on the conversation?
Sounds like a troll with penis envy talking to hear himself talk again.
Sounds like an IP extremist’s wet dream. And about as likely to become reality.
Yea, ask the spam filtering software field how well that ‘promise of filtering the baddies’ has worked out for them.
“invisible source code” is not a technical term based on actual technology; therefore this is not “technology being used to crack down on piracy”.
It’s more like a dash of poor understanding and a bucket of desperation sweat mixed with a cup of web crawling search engine bots to make a “magic” potion that cures the poor, poor ailment all the IP welfare leeches are suffering from known as “being forced to adapt your business model to fit reality.”
No one else in the world gets to sit around like a lazy piece of trash and perpetually make money off of work they did in the past. So make sure you keep the tear stains off your resume as you go out there and look for a real job.
Shotgun approach of extortion scheme targeting to become legal?
Sounds like a criminals’ dream come true.
Invisible source code?… as in what appears on your screen when you click “View source code?”.. aka what you get when you request any website?
This is obviously some alternate reality where invisible means what you can see.
Actually, if I were to guess, it means that the software will find links on the pages, and then follow those links and/or access those ‘other’ pages.
For example, on our ‘job application’ page, we have been getting several ‘error’ emails (every time someone doesn’t fill out the page correctly, and attempts to submit the job application in a ‘bad’ aka ‘sql injection’ type format, we are sent an email informing us of the submitting computer/user info…yes, we coded this ourselves…not a software package).
If you physically look at our page, all you see is a submit button. In the “source code” section, you can see where some things are processed and then it jumps to a different page. That 2nd page does some additional checking, and then inserts the data into the database. The user never sees anything except ‘processing’.
We have had bots recently that have been skipping the first page, and going directly to the 2nd page and attempting to inject code there. However, we have already built in for that possibility, so the 2nd page errors out and shoots us an email.
If you were to attempt to explain ‘invisible source code’ to a non techie, then technically, to them, the 2nd page is ‘invisible’. Nothing on their screen gives them the impression that they are on a different page.
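For readers who have not seen such a "2nd page", a minimal sketch of one follows. It is not the commenter's code: the table name, field names, validation patterns, and the `alerts` list standing in for the notification email are all assumptions made for illustration. The key point is that the handler validates its input itself, so a bot that skips the first page gains nothing.

```python
import re
import sqlite3

# Deliberately strict, hypothetical patterns for the two form fields.
NAME_RE = re.compile(r"[A-Za-z][A-Za-z .'-]{0,79}")
EMAIL_RE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def handle_submission(conn, form, alerts):
    """Validate a job application and store it; record an alert on bad input.

    conn:   an open sqlite3 connection with an 'applications' table.
    form:   dict of submitted fields (may come from a bot hitting this page directly).
    alerts: list that stands in for the 'error' email described above.
    """
    name = form.get("name", "")
    email = form.get("email", "")
    if not NAME_RE.fullmatch(name) or not EMAIL_RE.fullmatch(email):
        alerts.append({"reason": "rejected input", "remote": form.get("remote", "unknown")})
        return False
    # Parameter placeholders keep submitted text from being interpreted as SQL.
    conn.execute("INSERT INTO applications (name, email) VALUES (?, ?)", (name, email))
    conn.commit()
    return True


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE applications (name TEXT, email TEXT)")
    alerts = []
    print(handle_submission(conn, {"name": "Pat Doe", "email": "pat@example.com"}, alerts))
    print(handle_submission(conn, {"name": "x'; DROP TABLE applications;--", "email": "a@b.c"}, alerts))
    print(len(alerts))
```

None of this is invisible in any meaningful sense; it simply runs on the server, where the visitor's browser never displays it.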
Re: Re: Re:
It’s invisible to those who do not know it is there.
Not really in line with the actual definition of the word however.
In our inexhaustible race to the bottom, would such snake oil sites fall under the guise of rogue?
1 vagrant, tramp
2 a dishonest or worthless person : scoundrel
3 a mischievous person : scamp
4 a horse inclined to shirk or misbehave
5 an individual exhibiting a chance and usually inferior biological variation
As I was reading the article I pictured a crowd of people gathered round a wagon with a man standing on it saying “You sir, yes you. Are you plagued with rogue sites in the middle of the night? Want something to get rid of them? Well look no further, I’ve got the solution to all your ills.”
Re: Re: Re:
How does it do on stains?
The funny part is to some extent, this isn’t really a hard thing to do.
Many of the “rogue sites” do things that are common between sites. From linking images from “file hosts” to intentional misspellings of words, there are plenty of things you can use to filter down and give the old mark 1 eyeball something to look at.
Many of these sites use similar source code and layouts; those that move from domain to domain and host to host often upload the same site over and over again, with minor variations. Over time, you can build up a library of these pages and be able to spot similar sites. Duplicate content is one of the ways these sites often stand out.
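One common way to spot "the same site with minor variations" is word shingling with Jaccard similarity. The sketch below assumes the page text has already been fetched and stripped of markup; the function names, the shingle size, and the threshold are illustrative choices, not anything the commenter or RogueFinder specified.

```python
def shingles(text, k=2):
    """Return the set of k-word sequences in the text (case-insensitive)."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: shared items divided by all items."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def near_duplicates(library, page_text, threshold=0.5, k=2):
    """Return names of library pages whose similarity to page_text meets the threshold.

    library: dict mapping a page name to its text (a hypothetical collection
             of previously reviewed pages).
    """
    target = shingles(page_text, k)
    return [name for name, text in library.items()
            if jaccard(shingles(text, k), target) >= threshold]


if __name__ == "__main__":
    known = {"old-shop": "cheap nike shoes free shipping worldwide today"}
    print(near_duplicates(known, "cheap nike shoes free shipping worldwide now"))
```

A match only means two pages share a lot of text; a template, a shared shop platform, or a quoted product description will trigger it too, so the "mark 1 eyeball" still has to do the deciding.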
You can also look at the products they offer, the hosts they use, the payment processors, and all of that stuff to look for commonalities. If you can filter down 100,000 sites offering “nike shoes” down to a list of 200-300 that are likely rogue, then review them by hand, you would probably have a pretty high success rate.
You could also use honeypots to catch their spam. Opening a wordpress site and allowing open comments is a great way to find out who is scamming what. Similar results can happen using various forum software and other types of sites that permit user comments or postings.
100% success rate? No way. Reasonably successful? I suspect that it can be done.
And we thought precognition technologies were dead.
Obviously you automagically know what the MAFIAA, IFPI and BSA don’t.
Do not think you are superior to God.
Again, it’s about what it’s gonna take to get there.
Some of us ‘old time network admins’ who just now got out from under the ‘omg how do we filter spam’ umbrella know that it took YEARS before filtering out spam without filtering legit emails became manageable.
The hours we spent configuring software solution after software solution….the months we spent reading log files, the years we spent making phone calls to the ISP, to the sending server’s IT dept, the finger pointing about whose fault it is that a legit email didn’t make it.
(don’t give me that ‘3rd party crap’, those are the hardest to track down why an email didn’t make it to its destination…but I digress.)
Now….now….because some dying “entertainment” industry can’t save their own ass and want to go crying to Gov for a handout…..now I get to find out why our purchasing agent can’t find the rivets he needs to build this sidewall to the plane, because of more filtering crap.
Haven’t we learned by now?
SO I wanna know……are the **AAs or the government going to reimburse businesses for IT time spent tracking down problems with legit business activities…such as purchasing steel, or shipping products, because of the 92 different filtering software packages that are going to flood the market with horrible code, and a lack of understanding of business rules outside of their own world?
It might actually work
I sat on a plane soon after 9/11 with someone whose obscure research was suddenly now funded. It in effect looked at meta data of phone calls and could identify different types of groups (family, sales force, people planning a bachelor party, terrorist cell) by the different network behaviour.
So, based on various factors (a site 2 days old with gigs of content for example?) registrant’s associations with previous “rogues”, traffic patterns (if they could get access to this or deduce it from response times), I could see how sites could be characterised into types (news, eCommerce etc) pretty rapidly.
And this doesn’t need to be perfect – it just needs to make the xxAA’s job slightly less of a needle in a haystack. Simply finding a site which has music for download on has already narrowed the field a bit. (It’s not like the results are ever going to be used as actual proof of anything). And the backlinks (the company they keep) will give clues too.
If they can cheaply trawl a million newly registered domains and give a vague probability that a site might be a non legit download site, that changes the odds and the timelag in the game of whack a mole.
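The kind of rough triage described above could look something like the sketch below. Every field name, point value, and threshold is made up for illustration; the only things taken from the comment are the signals themselves (a very new site with gigabytes of content, a registrant associated with earlier flagged sites, music offered for download). The output is a queue for a human to look at, nothing more.

```python
from dataclasses import dataclass


@dataclass
class SiteInfo:
    domain: str
    age_days: int
    content_gb: float
    registrant_linked_to_flagged: bool
    offers_music_downloads: bool


def suspicion_points(site: SiteInfo) -> int:
    """Add up arbitrary points (0 to 10) for each signal that is present."""
    points = 0
    if site.age_days <= 2 and site.content_gb >= 1.0:
        points += 4  # brand new domain already holding gigabytes of content
    if site.registrant_linked_to_flagged:
        points += 4  # registrant associated with previously flagged sites
    if site.offers_music_downloads:
        points += 2  # narrows the field, proves nothing on its own
    return points


def review_queue(sites, threshold=5):
    """Return sites at or above the threshold, highest points first, for manual review."""
    flagged = [s for s in sites if suspicion_points(s) >= threshold]
    return sorted(flagged, key=suspicion_points, reverse=True)


if __name__ == "__main__":
    sites = [
        SiteInfo("new-dump.example", 1, 5.0, True, False),
        SiteInfo("band-blog.example", 900, 0.2, False, True),
    ]
    print([s.domain for s in review_queue(sites)])
```

The second example site shows the obvious weakness: a band giving away its own music trips the "music for download" signal, which is why a score like this can only rank candidates and cannot decide anything.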
I suspect that this company will sell the “software” as a service, charge a fortune, but their “server” will actually be some google-literate students told to locate content in return for pocket money, focusing particularly on content pertinent to the paying customers they have signed up. In the distorted world of said content holders, this will appear to offer great value, and a follow on service of filing a takedown will be sold by the lawyers for each site located. Content holders will think this is hugely helpful and will be reminded of all the lost sales that it is preventing.
And not a single extra CD will be sold as a result.
One interesting question arises.
If said “software” finds a site (or thousands of sites) and verifies that they are indeed offering infringing content (how, by downloading?), is the holder of the software in violation of any laws? Every time this software downloads, a sale is lost!
Re: It might actually work
“And not a single extra CD will be sold as a result.”
Re: Re: It might actually work
I don't know about this software...
but I personally use Faerie Fire when I want to fish out a rogue.
Maybe I can use this to find a couple of files I’ve been having a hard time getting ahold of.
Beyond the thought that “invisible source code” might be something as innocent as shrouded PHP scripting using the Zend engine I have no idea what that is. And all that does is make the source inaccessible to prevent copying unless it’s encrypted which means the browser has to receive an encryption key so that it will actually run the code on the client side if that’s what’s intended.
The other possibility is that it’s code that runs on the server and the client side never sees it during the code’s execution. From what they describe it could be either or nothing at all. If that’s what’s happening they may be going to use the application to break into servers, something itself that’s illegal but I guess this band of lawyers gets to excuse this because they’re on the side of the “angels”. At least in their minds.
Now data mining CAN be useful. Not WILL be useful, as their site (a multi-page advertisement in reality) claims, as there are no guarantees. First you have to know what you’re looking for. They claim they do, though the sites they describe are usually those associated with harvesting credit card numbers, passwords, identity theft and that sort of thing, in the sense that they set up look-alike sites of a bank and ask questions of the user no bank ever would. They may also have to do with the gray/black market for prescription drugs. They claim that through their software’s analysis of the mined data they can create a collection of, frankly, unbelievable connections between owners, hosts, ISPs and other data to bring the offender to court.
The thing is this, found on the About Us page.
“ROGUEFINDER Investigative Software is currently in active development by a team at RogueFinder LLC, located in New York City.
The impressive team includes experienced intellectual property attorneys, private investigators, software analysts and technical consultants. Each team member is involved in critically important elements of the software, including:..”
Whoops. The software isn’t finished yet. But, hey, we’re working on it.
Notably missing from the list are statistical analysts, which one needs to do effective data mining, as all data mining does is result in a stack of statistics which get tossed out of whack the moment some unexpected data comes along if you’re relying on a collection of preset
They also claim the software is patent pending, along with the usual copyright and trade mark claims. While I won’t, completely, dispute the last two the first seems unlikely as they would be relying on an aircraft carrier stuffed full of prior art to do what they claim to be able to do. (With unfinished software even!). As for copyright, there may be questions there too as some things cannot be subject to copyright. Things like facts, mathematical equations (aka algorithms) and many others that appear in software. The specific expression in that software is protected with copyright before someone tries to jump on me for that.
More than anything the site looks like an almost well written ad for vapourware stuffed with an overabundance of stock photos. If they’re looking to tag those who send out spam with the Nigerian scam in it, fake bank notices about expired passwords and what have you, it’s gonna fail. If for no other reason than that sites like those are run by organized crime, often Russian, who have far more resources available to them to counter this vapourware than this law firm has. And I can hear them laughing from here some 4000 miles away to the east as the crow flies. I can hear them tapping out software right now to counter what this software claims to do.
As for file sharing sites, the ones copyright purists want to target as SOPA and PIPA claim to do, virtually all of those are small operations with few, if any ads, collecting some support through donations and stuff. Not the kind of sites that are likely to be raking in money.
File lockers are both ad and subscription supported but their legitimate uses far outweigh any illegitimate uses. They do respond to takedown notices so they follow the letter and spirit of the DMCA as it is.
As I said, all they’ve done is warn the very people best equipped to counter them. And counter them they will while the law firm collects a hoped for ton of fees on games of whack a mole. I have yet to figure out how a New York based law firm can bring suit in Russia, Canada, France, the UK and so on when they’re not members of the Bar in any of those countries. Unless, once again, the idea is to collect a liability award in the United States and whack the site owners if they’re foolish enough to visit the U.S. at some point in the future under the name they used to register their site(s). Good luck there.
I’m not for a moment minimizing the threat of fake prescription drugs, the possibility of identity theft or other serious issues where organized crime would see a profit. Hell, I’ll even concede that perhaps another fake Dior handbag might hurt someone, somewhere though we already know and have known for years that the majority of those come from Hong Kong.
File sharing by individuals it won’t stop.
Still, if I was tasked with reviewing this software with an eye to using it I’d want to see real world data, test results, a complete and detailed description of the methodology and the complete source code. Until I got all of that not a penny would go their way.
Something about this stinks. Badly.
Oh yeah, and the database needed to store all the data gathered to data mine the Internet would be enormous. Given that the HTTP protocol is connect, exchange, drop, then do it all again at the next click, the number of connections and drops on the Web is massive. Beyond massive.
Even if they do write software to isolate suspicious transactions, at the end of the day it will still take human eyeballs to verify it all.
Of course, all it takes is to bust one 14 year old girl and one granny sharing their own photos that are mistakenly identified as bearing an actionable copyright. Not that we haven’t been down that road before. Of course it won’t happen. Not in a million years! Ok, a million microseconds then.
What Protected Speech?
Tell me again: if someone is enabling and encouraging the downloading of unauthorized copies of a work, which is illegal under the law of the location where the download is saved, how is that “protected speech”? Free speech is “free” as in “freedom”: you can convey your own ideas about society and government, and critique and petition the government. “Free” speech is not “free” as in “I don’t want to pay”. Not to say there aren’t potential problems in SOPA/PIPA or creating a new ITC action in OPEN, but I still don’t see how it’s censorship or a restraint on free speech if it were used to stop illegal downloading from foreign sites.
Re: What Protected Speech?
it’s called a forum as in a forum for free speech. if you go look around these sites you seem to think are just posting copyrighted material you will most likely find something that says forum where opinions are expressed about software, music, movies, and politics.
i got an offer like this once. it was a virus.
so what you are telling me is that I need to go put .divx or axxo in the comments on every mpaa and riaa related site I can find
Identifying Rogue sites is easy peasy, all you need to do is ask for their character sheets.
The thing is that you wouldn’t need this software to be anywhere near 100 percent accurate. Investigators working for civil litigants and the feds now aren’t anywhere near 100 percent accurate, and their conclusions are routinely accepted when Courts seize sites. As for how it works, I would think it would be pretty easy to link sites based on a lot of variables that you can see without even getting into “invisible source code,” whatever that means lol
Re: Rogue Sites
Oh and I looked up the lawyer behind it: he has already shut down hundreds of websites.