<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
xmlns:wfw="http://wellformedweb.org/CommentAPI/">
<channel>
<title>Techdirt. Stories filed under &quot;scraping&quot;</title>
<description>Easily digestible tech news...</description>
<link>http://www.techdirt.com/</link>
<language>en-us</language>
<image><title>Techdirt. Stories filed under &quot;scraping&quot;</title><url>http://www.techdirt.com/images/td-88x31.gif</url><link>http://www.techdirt.com/</link></image>
<item>
<pubDate>Fri, 21 Jan 2011 01:06:00 PST</pubDate>
<title>Dating Site's Plans To Create Profiles By Scraping Social Networks: Publicity Stunt Or Just Dumb?</title>
<dc:creator>Mike Masnick</dc:creator>
<link>http://www.techdirt.com/articles/20110119/04141612721/dating-sites-plans-to-create-profiles-scraping-social-networks-publicity-stunt-just-dumb.shtml</link>
<guid>http://www.techdirt.com/articles/20110119/04141612721/dating-sites-plans-to-create-profiles-scraping-social-networks-publicity-stunt-just-dumb.shtml</guid>
<description><![CDATA[ If you thought Match.com was in legal trouble due to <a href="http://www.techdirt.com/articles/20110114/09203212668/matchcom-sued-over-deadfake-profiles.shtml">dead or fake profiles</a>, just imagine the legal issues facing an Australian dating site that claims it's going to <a href="http://www.itnews.com.au/News/244737,dating-site-creates-profiles-from-public-records.aspx" target="_blank">scrape social networking profiles and turn them into dating site profiles</a>.  I'm not even going to mention the name of the company, because I'm pretty sure this was just a publicity stunt to get its name in the press, before "backing away" from the plan.  If it's an actual plan, it's stupid.  Not just because of the potential privacy concerns and lawsuits, and not just because some of the social networks from which they scrape the info may find ways to sue them as well, but because this seems like a <i>terrible</i> strategy for a dating site.  I mean, if you're looking to find a dating site where you're likely to actually meet someone, are you going to use the site where the vast majority of the "members" <i>don't even know they're members</i>?  It's hard to see how that makes for a compelling pitch.  And I'm not even getting into what will happen when it starts creating profiles of people who are married or in long term relationships...<br /><br /><a href="http://www.techdirt.com/articles/20110119/04141612721/dating-sites-plans-to-create-profiles-scraping-social-networks-publicity-stunt-just-dumb.shtml">Permalink</a> | <a href="http://www.techdirt.com/articles/20110119/04141612721/dating-sites-plans-to-create-profiles-scraping-social-networks-publicity-stunt-just-dumb.shtml#comments">Comments</a> | <a href="http://www.techdirt.com/articles/20110119/04141612721/dating-sites-plans-to-create-profiles-scraping-social-networks-publicity-stunt-just-dumb.shtml?op=sharethis">Email This Story</a><br />
 ]]></description>
<slash:department>lawsuit-waiting-to-happen</slash:department>
<wfw:commentRss>http://www.techdirt.com/comment_rss.php?sid=20110119/04141612721</wfw:commentRss>
</item>
<item>
<pubDate>Wed, 9 Sep 2009 06:41:00 PDT</pubDate>
<title>Why Doesn't Century 21 Canada Want More People Viewing Its Real Estate Listings?</title>
<dc:creator>Mike Masnick</dc:creator>
<link>http://www.techdirt.com/articles/20090908/1251356129.shtml</link>
<guid>http://www.techdirt.com/articles/20090908/1251356129.shtml</guid>
<description><![CDATA[ A whole bunch of folks have sent in this rather odd legal dispute up in Canada, with real estate firm Century 21 Canada <a href="http://www.canada.com/business/fp/Century Canada does battle with Rogers/1969611/story.html" target="_new">suing telco Rogers and its subsidiary Zoocasa</a> for creating what appears to be a real estate info portal/search engine.  At issue: Zoocasa apparently scrapes various real estate listings, including those from Century 21 Canada, to provide them in its own search results, along with some additional info -- but still links back to the original Century 21 listing.  In other words, it acts like a basic search engine.  It's difficult to see how or why that should be against the law.
<br /><br />
Of course, the real estate business has always been focused on bogus exclusions on data through the MLS system -- and apparently they don't like the idea of that data being more widely available.  But, still, it's difficult to see what Century 21 has to complain about, since the site links to Century 21 postings and should only provide them with more traffic.  Unless, of course, its fear is that it can't compete by offering enough useful info on its own site.<br /><br /><a href="http://www.techdirt.com/articles/20090908/1251356129.shtml">Permalink</a> | <a href="http://www.techdirt.com/articles/20090908/1251356129.shtml#comments">Comments</a> | <a href="http://www.techdirt.com/articles/20090908/1251356129.shtml?op=sharethis">Email This Story</a><br />
 ]]></description>
<slash:department>someone-please-explain</slash:department>
<wfw:commentRss>http://www.techdirt.com/comment_rss.php?sid=20090908/1251356129</wfw:commentRss>
</item>
<item>
<pubDate>Fri, 10 Jul 2009 15:20:00 PDT</pubDate>
<title>Power.com Says Facebook Can't Block Access To User Data</title>
<dc:creator>Mike Masnick</dc:creator>
<link>http://www.techdirt.com/articles/20090710/0222325507.shtml</link>
<guid>http://www.techdirt.com/articles/20090710/0222325507.shtml</guid>
<description><![CDATA[ Earlier this year, we had trouble understanding <a href="http://www.techdirt.com/articles/20090104/2328183283.shtml">Facebook's reasoning</a> for suing Power.com, a site that tried to aggregate a variety of social network sites into a single interface (something that seems rather useful).  However, Facebook insisted that Power.com violated its copyright, and in a slightly troubling ruling in the case, the judge seemed to find that <a href="http://www.techdirt.com/articles/20090605/2228205147.shtml">any scraping</a> could be copyright infringement, even if the scraping was just to get at <i>non-infringing</i> content.  The court's argument was that in order to get at the non-infringing content, you first have to scrape the copyrighted content around it too.
<br /><br />
Now the case is getting odder, as Power.com <a href="http://www.techcrunch.com/2009/07/09/powercom-countersues-facebook-over-data-portability/" target="_new">has countersued Facebook</a>, claiming that Facebook is "unlawfully withholding the data that users own (as stated in Facebook’s own ToS)."  Of course, even if that's true, I'm not sure Power.com has the standing to make that claim.  Wouldn't that be an issue for users to raise themselves?  Besides, I don't think there's any rule saying that a site that lets you retain the copyright on your content also needs to make that content easy to access.  So now we have lawsuits coming from both sides that don't make much sense.  The two sites should just learn to play nicely with each other.<br /><br /><a href="http://www.techdirt.com/articles/20090710/0222325507.shtml">Permalink</a> | <a href="http://www.techdirt.com/articles/20090710/0222325507.shtml#comments">Comments</a> | <a href="http://www.techdirt.com/articles/20090710/0222325507.shtml?op=sharethis">Email This Story</a><br />
 ]]></description>
<slash:department>seems-like-a-tough-claim</slash:department>
<wfw:commentRss>http://www.techdirt.com/comment_rss.php?sid=20090710/0222325507</wfw:commentRss>
</item>
<item>
<pubDate>Wed, 10 Jun 2009 21:21:00 PDT</pubDate>
<title>Can Scraping Non-Infringing Content Become Copyright Infringement... Because Of How Scrapers Work?</title>
<dc:creator>Mike Masnick</dc:creator>
<link>http://www.techdirt.com/articles/20090605/2228205147.shtml</link>
<guid>http://www.techdirt.com/articles/20090605/2228205147.shtml</guid>
<description><![CDATA[ Earlier this year, we couldn't figure out how Facebook's lawsuit against Power.com <a href="http://www.techdirt.com/articles/20090104/2328183283.shtml">made any sense</a>.  Power.com tried to aggregate various social networking accounts in a single place, so you could manage them all at once through a single interface.  Yet Facebook charged the company with all sorts of complaints, including copyright and trademark infringement, unlawful competition and violation of the Computer Fraud and Abuse Act.  Power.com asked for the case to be dismissed, but last month the judge sided with Facebook, and did so in a troubling way: by basically suggesting that because Facebook's terms of service prohibited these uses, they amounted to copyright infringement.  <a href="http://twitter.com/InternetLaw/statuses/1951217187">Michael Scott</a> points us to <a href="http://newmedialaw.proskauer.com/2009/05/articles/contracts/facebook-takes-a-page-from-ticketmasters-playbook-block-unauthorized-web-site-access-with-carefully-drafted-terms-of-use/" target="_new">lawyer Jeff Neuberger's take on the ruling</a>, and separately <a href="http://pblog.bna.com/techlaw/2009/05/provocative-ruling-in-facebook-v-power-ventures.html" target="_new">Tom O'Toole has a good analysis</a> of the ruling.  Neuberger states the following:
<blockquote><i>
Judge Fogel concluded that the allegations of the complaint made out a sufficient claim of copyright infringement because Power Ventures "need only access and copy one page to commit copyright infringement." The court also found that the ToU prohibited downloading, scraping or distributing content from the Facebook Web site content except that belonging to the user, and that in any event, using automated methods, i.e., "data mining, robots, scraping, or similar data gathering or extraction methods" to access any content were also prohibited by the ToU. Thus, the court found that the allegation that Power Ventures accessed Facebook via automated means constituted made out a claim of direct copyright infringement, while the allegation that Facebook users utilized the Power.com interface to access their own profile pages made out claim of secondary copyright infringement.
</i></blockquote>
Thus, because the terms of service said you can't do any automated scraping of the site, it's suddenly infringing?  Even worse, the court found that even though the data being used by Power.com isn't owned by Facebook (it's the users') the scraping was <i>still</i> copyright infringement, because in order to scrape the non-infringing content, Power.com had to first "scrape" the whole page.  O'Toole explains:
<blockquote><i>
OK, so far the court has found that Power.com made unauthorized copies of the Facebook Web site. What about the fact that Facebook does not own the copyright in its users' profile data? Facebook surmounted this hurdle by arguing that the content of the Facebook page that surrounded the user's data is copyrightable and is owned by Facebook. According to Facebook, the Power.com scraper operated in a manner that required it to copy the entire Web page in order to extract the user's profile data....
<br /><br />
Note that the court is conditioning its ruling on the assertion that the Power Ventures scraper necessarily copied the entire Web page before it processed the page and extracted the profile data. That comports with my (limited) understanding of how a Web scraper works. But is it true? If it were true, couldn't an argument be made that this is a fair use of the page? I'll leave that for better lawyers.
</i></blockquote>
All of this seems a bit troubling, as it would effectively rule out scraping even non-infringing content, just because the scraper had to first read through copyrighted content to get to the non-infringing stuff.  But that seems to go against the entire purpose of copyright law.  The fact that the scraper <i>reads</i> copyrighted content shouldn't mean that it's infringement.  It's not <i>doing</i> anything with that content other than using it to find the content it can make use of.  Anyway, this ruling probably doesn't mean all that much, since it merely rejected the dismissal request, but it does seem odd that the judge gave so much weight to Facebook's terms of service and seemed to indicate that the mere act of scraping can be copyright infringement.<br /><br /><a href="http://www.techdirt.com/articles/20090605/2228205147.shtml">Permalink</a> | <a href="http://www.techdirt.com/articles/20090605/2228205147.shtml#comments">Comments</a> | <a href="http://www.techdirt.com/articles/20090605/2228205147.shtml?op=sharethis">Email This Story</a><br />
 ]]></description>
<slash:department>this-seems-troubling</slash:department>
<wfw:commentRss>http://www.techdirt.com/comment_rss.php?sid=20090605/2228205147</wfw:commentRss>
</item>
<item>
<pubDate>Wed, 22 Apr 2009 04:39:00 PDT</pubDate>
<title>New Consortium Says If Others Can Monetize Better Than We Can... We Deserve Their Money?</title>
<dc:creator>Mike Masnick</dc:creator>
<link>http://www.techdirt.com/articles/20090421/1319244600.shtml</link>
<guid>http://www.techdirt.com/articles/20090421/1319244600.shtml</guid>
<description><![CDATA[ We've pointed out in the past how <a href="http://www.techdirt.com/articles/20090116/0348223430.shtml">silly</a> it is to be worried about various spam/scraper sites that take content from sites (including ours) and repost it on their own.  Those sites never add any real value, but just repost the content.  They get no significant traffic and retain no real audience.  They tend to come and go pretty quickly.  Worrying about them is a total waste of time (time that can be used making sure your own site is more valuable).  Yet, apparently a group of publishers has put together a "Fair Syndication Consortium" that has decided that rather than go after these sites directly, it will simply <a href="http://www.techcrunch.com/2009/04/21/should-ad-networks-pay-publishers-for-stolen-content-the-fair-syndication-consortium-thinks-so/" target="_new">try to get the ad networks that serve ads on such sites to hand over some money to the original content creators</a>.  As far as I can tell, that's basically the content creators saying "well, if others can monetize our content better than we can, we deserve some of that cash."
<br /><br />
That makes no sense to me.  If you can't monetize your own content better than other sites, you don't deserve to be in business.  If other sites are actually getting traffic and ad revenue that you think you deserve, it means you're doing a bad job giving people a real reason to visit your site and to interact with your community.  Simply demanding money from the sites that have done things better makes no sense.  Of course, the reality is that most of these sites <i>haven't</i> done things better, and don't make any money.  So the whole grandstanding seems rather wasted effort.
<br /><br />
Focus on making your own site worth visiting.  Stop worrying what others are doing with your content.<br /><br /><a href="http://www.techdirt.com/articles/20090421/1319244600.shtml">Permalink</a> | <a href="http://www.techdirt.com/articles/20090421/1319244600.shtml#comments">Comments</a> | <a href="http://www.techdirt.com/articles/20090421/1319244600.shtml?op=sharethis">Email This Story</a><br />
 ]]></description>
<slash:department>please-explain</slash:department>
<wfw:commentRss>http://www.techdirt.com/comment_rss.php?sid=20090421/1319244600</wfw:commentRss>
</item>
<item>
<pubDate>Fri, 8 Aug 2008 11:11:38 PDT</pubDate>
<title>Airline Plans To Cancel All Flights Booked Through 3rd Party Websites</title>
<dc:creator>Mike Masnick</dc:creator>
<link>http://www.techdirt.com/articles/20080808/1057031933.shtml</link>
<guid>http://www.techdirt.com/articles/20080808/1057031933.shtml</guid>
<description><![CDATA[ And people wonder why airlines have so much trouble staying in business?  We were already confused enough by American Airlines' desire <a href="http://www.techdirt.com/articles/20080725/1322411794.shtml">not</a> to be listed on the sites where people search for airfare, and easyJet's plan to <a href="http://www.techdirt.com/articles/20080627/1250171538.shtml">sue</a> the sites that send it customers, but Irish-based airline Ryanair is taking this all to a new level.  Beyond just being upset about those 3rd party sites (i.e., sites that send it business!), it's planning to <a href="http://www.independent.ie/business/irish/ryanair-travellers-may-lose-bookings-1449260.html" target="_new">cancel the flights for everyone who booked through one of those services</a> (thanks to <a href="http://seansicily.wordpress.com">Sean</a> for the link).
<br /><br />
Yes, we understand that these airlines prefer people to purchase flights from the airlines directly, but it still seems bizarre to try to cut off a great promotional channel.  People already know to go look at 3rd party sites for airfare, so actively working against having your flights promoted doesn't make much sense.  Then actively pissing off a bunch of your customers who booked through those sites by canceling their flights is even more braindead, as you've just formed a huge group of customers who will complain about your airline and spread the word about how you canceled their legitimately purchased flight for no reason other than spite and a confusion over business models.  When Ryanair started promoting how some of its seats might come with <a href="http://www.ryanair.com/site/EN/news.php?yr=08&#038;month=jun&#038;story=gen-en-200608">sexual gratification</a>, I'd bet many passengers didn't realize it would end with them getting screwed.<br /><br /><a href="http://www.techdirt.com/articles/20080808/1057031933.shtml">Permalink</a> | <a href="http://www.techdirt.com/articles/20080808/1057031933.shtml#comments">Comments</a> | <a href="http://www.techdirt.com/articles/20080808/1057031933.shtml?op=sharethis">Email This Story</a><br />
 ]]></description>
<slash:department>piss-off-your-customers-much?</slash:department>
<wfw:commentRss>http://www.techdirt.com/comment_rss.php?sid=20080808/1057031933</wfw:commentRss>
</item>
<item>
<pubDate>Tue, 29 Jul 2008 01:50:25 PDT</pubDate>
<title>The Airlines' Ongoing Struggle With Price Aggregation Sites</title>
<dc:creator>Tom Lee</dc:creator>
<link>http://www.techdirt.com/articles/20080725/1322411794.shtml</link>
<guid>http://www.techdirt.com/articles/20080725/1322411794.shtml</guid>
<description><![CDATA[ <p>It's proving pretty difficult to figure out exactly what happened between American Airlines and Kayak last week.  Last Wednesday TechCrunch reported that American Airlines was <a href="http://www.techcrunch.com/2008/07/23/trouble-in-online-travel-american-airlines-ditches-kayak-maybe-orbitz-too/">pulling its listings</a> from the airfare search engine.  Comments left by Kayak's CEO Steve Hafner and VP Keith Melnick chalked the split up to Kayak's display of AA fares from Orbitz: American had demanded that Kayak suppress the Orbitz listings, and Kayak refused.</p>

<p>Presumably one of two things is making American want to avoid comparison to Orbitz prices: either, as TechCrunch speculates, users clicking the Orbitz option put AA on the hook for two referral fees -- one to Kayak and one to Orbitz; or AA has struck a deal with Orbitz that provides the latter's users with cheaper fares than can be found on <a href="http://aa.com">aa.com</a>.</p>

<p>Either way, the news doesn't appear to be as dire as it first sounded.  It doesn't seem that AA <em>flights</em> will be disappearing from Kayak -- it's just the links to buy them at aa.com that will go missing.  As <a href="http://www.jaunted.com/story/2008/7/24/133325/514/travel/Booking+Engine+Fiascos:+Trying+to+Parse+the+Kayak-AA+Drama">Jaunted points out</a> this might wind up costing flyers a few more dollars, but it shouldn't be a major inconvenience for Kayak customers.</p>

<p>The more interesting aspect of this episode is how it reveals the stresses at play in the relationship between the airlines and travel search engines like Kayak.  It's no secret, of course, that the airlines are having a rough time as rising fuel prices put even more pressure on their perennially-failing business model.  But while an airline attempting to control the distribution of its prices is <a href="http://southwest.com">nothing new</a>, one can't help but wonder whether ever-narrowing margins might lead to a shakeup of this market.</p>

<p>Kayak, like most travel search sites, gets its data from one of a handful of <a href="http://en.wikipedia.org/wiki/Global_distribution_system">Global Distribution Systems</a>: businesses that charge airlines a fee to aggregate price and reservation information.  Some airlines, like Southwest, opt out of the GDSes in order to avoid those fees.  Others, like American, participate in the system but try to send as much online business as possible to their own sites.  Presumably each airline tries to find an equilibrium point at which the business brought in by participation in a GDS, net of the payments associated with it, yields the most profit.</p>

<p>But so long as the financial temptation to retreat from the GDSes persists, GDS data will be less than complete.  And that creates an opportunity for another kind of fare-aggregation business -- one based upon scraping the data from the airlines' websites.  <a href="http://www.allbusiness.com/transportation-communications/transportation-services/4153746-1.html">It's been done before</a>, after all, albeit on a limited scale.  And since most people recognize that <a href="http://news.cnet.com/2100-1023-976296.html">prices can't be copyrighted</a>, there doesn't seem to be any legal barrier stopping such an aggregator from stepping in (nothing besides the need to write a lot of tedious screen-scraping software, that is).  Of course, that won't stop airlines from <a href="http://www.techdirt.com/articles/20080627/1250171538.shtml">suing</a>, but the legal basis for their argument seems pretty weak.</p>

<p>Whether such a business is likely to emerge and succeed, I couldn't say.  But it does seem certain that as fuel prices rise we'll be seeing more and more travel industry infighting -- and more and more hoops for online fare-shoppers to jump through.</p><br /><br /><a href="http://www.techdirt.com/articles/20080725/1322411794.shtml">Permalink</a> | <a href="http://www.techdirt.com/articles/20080725/1322411794.shtml#comments">Comments</a> | <a href="http://www.techdirt.com/articles/20080725/1322411794.shtml?op=sharethis">Email This Story</a><br />
 ]]></description>
<slash:department>airlines-vs.-aggregators?</slash:department>
<wfw:commentRss>http://www.techdirt.com/comment_rss.php?sid=20080725/1322411794</wfw:commentRss>
</item>
<item>
<pubDate>Fri, 4 Jan 2008 15:20:07 PST</pubDate>
<title>Just Assume Any Info You Put Online Is Public</title>
<dc:creator>Tom Lee</dc:creator>
<link>http://www.techdirt.com/articles/20080104/151042.shtml</link>
<guid>http://www.techdirt.com/articles/20080104/151042.shtml</guid>
<description><![CDATA[ I have to admit that I was sorry to see that my fellow Techdirt blogger Julian had beaten me to the punch, writing a <a href="http://techdirt.com/articles/20080103/124455.shtml">characteristically insightful post on the Robert Scoble/Facebook story</a>.  But Facebook and screen-scraping are two of my favorite things to talk about, so I can&#39;t resist pointing out that I disagree with some of Julian&#39;s analysis.  <p>Having noted that a script acting on Scoble&#39;s behalf can only access information that Scoble himself can reach manually, Julian argues that this can&#39;t be considered the only criterion in evaluating the situation:</p>  <blockquote>[P]rivacy is not just a function of the publicity of your personal information, but of the searchability and aggregability of that information. Public closed-circuit surveillance cameras, for instance, typically capture the same information that a casual observer on the street is already privy to. But we recognize that being spotted by diverse random pedestrians, or even being captured on diffuse and disconnected private security cameras, is not intrusive in the same way as being captured on a citywide surveillance system that is searchable from a centralized location.</blockquote>  <p>All of this seems true: individuals&#39; attitudes about privacy are rightly driven by a pragmatic appraisal of the likelihood of someone doing something bad with the available information &mdash; a judgment based on the information&#39;s value and the cost of obtaining it.  Ripping up your credit card statement before throwing it in the trash doesn&#39;t make it impossible for a dumpster-diving thief to target you, but it  increases the difficulty of ripping you off <em>enough</em> that you&#39;ll probably be safe.</p>  <p>But I think Julian makes a mistake when he assumes that this is a viable way to conduct your life online.  
The problem with applying this approach to a digital context is that a user&#39;s estimation of the accessibility of a given piece of online information is almost invariably going to be too low &mdash; and will become more so by the second.  The costs of automatically collecting data are very small and getting smaller.</p>  <p>There are a few reasons for this.  First, the tools are getting better.  Libraries like WWW::Mechanize are simple for any programmer to use and available in a <a href="http://mechanize.rubyforge.org/mechanize/">variety</a> <a href="http://wwwsearch.sourceforge.net/mechanize/">of</a> <a href="http://search.cpan.org/dist/WWW-Mechanize/lib/WWW/Mechanize.pm">languages</a>.  And GUI-based applications like <a href="http://www.dapper.net/">Dapper</a> and <a href="http://simile.mit.edu/wiki/Piggy_Bank">Piggy Bank</a> aim to make things even simpler. Second, if done properly, automated data collection is very difficult to prevent, detect or punish.  Facebook&#39;s script detection technology is impressively existent relative to that of its competitors, but it&#39;s still almost certainly trivial to subvert it with proxies, faked user agents and plausibly human delays.  Third, once the data is collected it can, of course, be easily distributed.</p>  <p>And the situation is only going to get worse! In fact, it&#39;s getting worse at such a rapid rate that counting on the privacy of any even <em>slightly</em> public online information is a mistake.</p>  <p>The negative reaction to Scoble&#39;s script is coming from users who think of it as a violation of the covenant they perceived to surround their data.  But that covenant was based upon their own mistaken understanding of the internet.  
Scoble&#39;s actions shouldn&#39;t be viewed by these users as a transgression against them, but rather as a pleasantly benign lesson.</p>  <p>It&#39;s fine to lament the situation, or to applaud Facebook for taking steps to keep its valuable, freely-acquired user data away from competitors (and, while they&#39;re at it, script-employing users).  But this assertion of community norms is unlikely to stop those who, unlike Scoble, are genuinely acting in bad faith.   The technology for containing digital cats in digital bags is woefully inadequate, and it&#39;s unlikely to improve anytime soon.</p><br /><br /><a href="http://www.techdirt.com/articles/20080104/151042.shtml">Permalink</a> | <a href="http://www.techdirt.com/articles/20080104/151042.shtml#comments">Comments</a> | <a href="http://www.techdirt.com/articles/20080104/151042.shtml?op=sharethis">Email This Story</a><br />
 ]]></description>
<slash:department>welcome-to-the-new-world</slash:department>
<wfw:commentRss>http://www.techdirt.com/comment_rss.php?sid=20080104/151042</wfw:commentRss>
</item>
<item>
<pubDate>Thu, 3 Jan 2008 15:11:26 PST</pubDate>
<title>Is There A Conflict Between Open Social Graphs And Your Privacy?</title>
<dc:creator>Julian Sanchez</dc:creator>
<link>http://www.techdirt.com/articles/20080103/124455.shtml</link>
<guid>http://www.techdirt.com/articles/20080103/124455.shtml</guid>
<description><![CDATA[ Techblogger <a href="http://scobleizer.com/">Robert Scoble</a> has apparently been <a href="http://uk.techcrunch.com/2008/01/03/facebook-blocks-scoble-for-downloading-his-contacts/">barred from Facebook</a> for <a href="http://blog.wired.com/monkeybites/2008/01/scobles-slap-in.html">running a script</a> from Plaxo to export his relationship information (or &quot;<a href="http://bradfitz.com/social-graph-problem/">social graph</a>,&quot; as the kids say), in violation of the site&#39;s terms of service.  On one read, this makes him a martyr to the cause of <a href="http://www.sixapart.com/about/news/2007/09/were_opening_th.html">open social graphs</a>. I&#39;m a bit more ambivalent.<p>Intuitively, it makes sense for users to be able to make whatever use they please of information about their own social networks.  But in a social network, &quot;your&quot; information is someone else&#39;s as well. And on a site like Facebook, much of that information will have been provided in the context of a set of  individually calibrated privacy controls, by people who expected it to be used in that context by a limited audience.  Exporting that information without permission, then, raises important privacy questions.</p><p>Within Facebook, users have a fair amount of control over who can access what information about them.  I can choose to block particular users on Facebook, rendering myself wholly invisible to them, as though I weren&#39;t even on the network.  I can decide how much of my profile information will be visible to friends, to people who live in my region, to the general Facebook membership, and to the Internet at large. I can even decide how aggressively public, so to speak, such information will be.  
Lots of Facebook users are happy to let friends view their relationship status, but disable those status notifications in their news feeds, to prevent everyone they know from being simultaneously blasted with the news that &quot;Bob has gone from being in a relationship to being single.&quot;  Automated data collection &quot;liberates&quot; information from those constraints, possibly against the wishes of the people who provided it.</p><p>It&#39;s true that a script can only sweep up information that would already have been visible to a particular user anyway.  But privacy is not just a function of the <em>publicity</em> of your personal information, but of the <em>searchability</em> and <em>aggregability</em> of that information.  Public closed-circuit surveillance cameras, for instance, typically capture the same information that a casual observer on the street is already privy to.  But we recognize that being spotted by diverse random pedestrians, or even being captured on diffuse and disconnected private security cameras, is not intrusive in the same way as being captured on a citywide surveillance system that is searchable from a centralized location.   By the same token, I may be unhappy with the possibility of someone forming an external public database full of data I&#39;ve freely shared with more narrow communities&mdash;personal, regional, or whatever. </p>None of this is to deny the initial intuition that it&#39;s desirable for users&#39; social graphs to be portable to some extent.  But as with all forms of intimacy, openness and privacy complement each other: We feel free to share information about ourselves to the extent that we have some assurances about how that information will be used.  
So while it&#39;s one thing to argue that Facebook should enable greater openness or portability in some particular way, subject to user control, it seems like quite another to criticize them for enforcing a rule about indiscriminate automated data collection.<br /><br /><a href="http://www.techdirt.com/articles/20080103/124455.shtml">Permalink</a> | <a href="http://www.techdirt.com/articles/20080103/124455.shtml#comments">Comments</a> | <a href="http://www.techdirt.com/articles/20080103/124455.shtml?op=sharethis">Email This Story</a><br />
 ]]></description>
<slash:department>what-about-your-friends?</slash:department>
<wfw:commentRss>http://www.techdirt.com/comment_rss.php?sid=20080103/124455</wfw:commentRss>
</item>
<item>
<pubDate>Wed, 10 Oct 2007 09:41:00 PDT</pubDate>
<title>AP Sues VeriSign For Copyright Infringement; Mostly Pointless</title>
<dc:creator>Mike Masnick</dc:creator>
<link>http://www.techdirt.com/articles/20071009/185531.shtml</link>
<guid>http://www.techdirt.com/articles/20071009/185531.shtml</guid>
<description><![CDATA[ The Associated Press is apparently <a href="http://money.cnn.com/news/newsfeeds/articles/apwire/ce9f60ee582eefd7a232c772d046ef4a.htm">suing VeriSign's Moreover for copyright infringement</a>, <strike>though the details are woefully unclear (even in the AP version of the article).  It's unclear if the complaint is over the fact that Moreover scrapes and links people to AP content, or if there's something else going on</strike> (<b>Update below</b>).  If it is just that Moreover is pointing people to AP content, then this is quite ridiculous -- but most likely driven by the AP's ability to get Google to <a href="http://www.techdirt.com/articles/20060803/0851258.shtml">pay</a> up for the same thing.  The article hints that there may be a bigger problem with Moreover providing the full text of the content, but details are lacking.  If it is true that Moreover provides the full text -- then it probably is violating the AP's copyright.  However, if that content is simply stored away for indexing purposes, and people are sent to legitimate AP sources, then it's hard to see how this is a copyright violation at all.  If anything, it's the opposite -- pointing more people to AP content.  The AP is also complaining that Moreover lists the AP as a news source on its site -- but that's just a petty complaint from the AP.  Listing news sources is hardly a trademark violation.  Hell, the AP is a "news source" for the content we write here on Techdirt all the time -- and there's nothing wrong with saying so.  All in all, unless more details prove otherwise, this sounds like the AP continuing to struggle with the changing marketplace it's facing, and lashing out at one of the companies that helps deliver more traffic to AP content for not paying the AP for the privilege of promoting AP content.
<br /><br />
<b>Update</b>: Rafat Ali from PaidContent stopped by in the comments to point to <a href="http://www.paidcontent.org/entry/419-ap-sues-moreover-and-verisign-for-stories-copyright-infringement/">the full lawsuit</a> documents, posted on his site.  From there, it appears that the AP's lawsuit is mostly ridiculous, with just a tiny bit of reasonableness thrown in.  Most of the claims are about the fact that Moreover is spidering and scraping AP news feeds, and providing both free and paid subscribers headlines and the opening lede.  However, it's pretty difficult for the AP to make a copyright claim here, since those links are almost definitely fair use, especially since they point people to legitimate AP licensees.  There's a little gray area where Moreover indexes and caches the articles on its own servers -- but Google has been doing that for years without much of a problem -- and if the AP is really upset about it, there's always the old robots.txt solution.  The one area where the AP may have a claim (though the evidence does not seem clear from the exhibits) is in saying that there are times when Moreover will show subscribers a full AP article hosted on its own servers, rather than passing them through to a licensee.  If true, then that would likely be copyright infringement -- though the "damages" would be minimal, if anything.  Finally, the claim that listing the AP as a news source infringes the AP's trademark seems laughable.  All in all, the original assessment stands: this is the AP unable to adapt and lashing out at those who are helping to promote its content.<br /><br /><a href="http://www.techdirt.com/articles/20071009/185531.shtml">Permalink</a> | <a href="http://www.techdirt.com/articles/20071009/185531.shtml#comments">Comments</a> | <a href="http://www.techdirt.com/articles/20071009/185531.shtml?op=sharethis">Email This Story</a><br />
 ]]></description>
<slash:department>what's-going-on-here?</slash:department>
<wfw:commentRss>http://www.techdirt.com/comment_rss.php?sid=20071009/185531</wfw:commentRss>
</item>
</channel>
</rss>