Brought to you by Floor64 and the Techdirt crew.

stories filed under: "algorithms"
Culture


by Mike Masnick


Filed Under:
algorithms, culture, hits, long tails, recommendation systems, winner takes all



Winner Takes All, Long Tails And The Fractalization Of Culture

from the rethinking-the-niche dept

Reader Eileen points us to a thought-provoking article by Joshua-Michele Ross discussing the idea that, rather than a diverse "long-tail" culture, we're actually being driven to a homogenized "winner-take-all" culture thanks to the rise of our robot overlords, better known as online recommendation engines. Or something like that. It's a nice theory, with some interesting statistical modelling behind it. And, I've always been interested in "winner takes all" economies, since the guy who taught me Econ 101 literally wrote the book on "winner takes all" economics.

That said, I think this really only tells a part of the story -- and maybe not the most important or most interesting part. That's because (and, again, this may be due to my own econ education) it doesn't surprise me in the slightest that we'd see hits follow a winner takes all approach (that's how hits work). Nor is it a surprise that the effect would seem stronger as the world globalizes and borders and barriers become less of an issue. So, yes, of course there will be a "globalized" winner takes all situation at the hits level. But is that all?

What's much more interesting to me is what happens beyond the hits. And, as you start to dig down into subsectors or subcultures, you begin to notice an interesting pattern there as well: that those subsectors and subcultures follow that same power law pattern themselves. The big name bands in a subculture may seem "small" in the wider world, but they're huge within the subculture. Within that subculture, they're the winner who took all -- but from a more limited population.

In some ways, it's the fractalization of culture.

Just as a fractal repeats the same pattern as you zoom in and look more closely at its smaller segments, so do cultural subsegments. And those segments continue to thrive, despite the recommendation systems just pushing people to the hits. Part of that may be that once you've begun exploring those subcultures, the recommendation engines and collaborative filters drive you towards the "hits within" the subculture -- or it may be that the impact of algorithmic recommendation engines isn't quite as dominating as some make it out to be. Yes, people do rely on those recommendation engines... somewhat. But they trust people they know even more. And once you get involved in a subculture you quickly find other people already involved in that culture who act as guides, pointing you both to the "hits" and to the interesting and "diverse" long tail places to go.
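To make the fractal idea a bit more concrete, here is a toy sketch in which attention follows a Zipf-style power law at two levels: across subcultures, and across the bands inside each subculture. Everything in it (the exponent, the counts of 50 subcultures and 20 bands, the function names) is an assumption picked for illustration, not data from Ross's model or from anywhere else.

```python
def zipf_shares(n, exponent=1.0):
    """Return n shares proportional to 1/rank**exponent, summing to 1."""
    weights = [1.0 / (rank ** exponent) for rank in range(1, n + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def top_share(shares):
    """Fraction of the total captured by the single biggest item."""
    return max(shares) / sum(shares)


# Hypothetical numbers: 50 subcultures, 20 bands in each.
subculture_shares = zipf_shares(50)
bands_within = zipf_shares(20)

# Global share of every band = share of its subculture * share within it.
global_shares = [s * b for s in subculture_shares for b in bands_within]

# The top band of the smallest subculture is tiny on the global scale...
smallest_top_global = subculture_shares[-1] * bands_within[0]
# ...but inside its own subculture it holds the same share as the top band
# of any other subculture, because the pattern repeats at each level.
smallest_top_local = top_share([subculture_shares[-1] * b for b in bands_within])

print(f"top band overall, global share:  {max(global_shares):.4f}")
print(f"niche top band, global share:    {smallest_top_global:.4f}")
print(f"niche top band, share in niche:  {smallest_top_local:.4f}")
```

In this toy setup the niche's top band barely registers globally, yet within its own niche it is just as dominant as a global hit is within the mainstream.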

So, yes, there is a winner take all effect found in the recommendation engines, but it hasn't resulted in less diversity within our cultural output or our cultural consumption -- and that's because people don't just follow that limited algorithmic overlord to find the content they want to consume. In fact, the original statistical model highlighted above more or less makes this point. It shows that even if each individual sees a more diverse culture, the overall result can still be a more homogenized culture -- but really only among the hits. Because the world is global, the really big hits go global and become winner-take-all in a much larger market. But, at the same time, the niches thrive as well.

Legal Issues


by Mike Masnick


Filed Under:
algorithms, copyright, facts, wolfram alpha

Companies:
wolfram



Can You Copyright Algorithmic Output?

from the do-computers-need-incentive-to-create? dept

A bunch of folks have been sending in Neil McAllister's writeup at InfoWorld about how Wolfram Alpha, the incredibly overhyped "knowledge engine" (that, in my experience, doesn't work very well), is claiming copyright on all of its output, which raises questions about what would happen if others did the same thing:

"In other words, Wolfram Research is claiming that each page of results returned by the Wolfram Alpha engine is a unique, copyrightable work, like a report or term paper. That makes Wolfram Alpha different not just from classic search engines, but from most software. While software companies routinely retain sole ownership of their software and license it to users, Wolfram Research has taken the additional step of claiming ownership of the output of the software itself. It's a bold assertion, and one that could have significant ramifications for the software industry as a whole."

It really depends on the output, but in many cases I have trouble believing the output really is copyrightable. After all, you cannot copyright facts and (in the US, at least) you can't copyright a collection of facts, either. The article doesn't discuss that, and seems to assume that the output may be copyrightable, but I would think that it would need to be significantly more distinctive and show additional creativity before it could be covered (and then, only the distinctive parts would be covered). Still, there may be a legal gray area, as McAllister notes:

"Suppose you have an Excel spreadsheet full of numbers that you input, but then you ask Excel to generate a series of complex graphs based on rules, formulae, and templates designed by Microsoft. Or what about pivot tables? What about mash-ups or tools like Mozilla Jetpack? If unique presentations based on software-based manipulation of mundane data are copyrightable, who retains what rights to the resulting works?"

I'm guessing that the graphs still wouldn't be copyrightable, as they'd really just be the same collection of data, but you could see a mathematically illiterate court finding otherwise...

Legal Issues


by Mike Masnick


Filed Under:
algorithms, france, libel, suggestions

Companies:
cnfdi, direct energie, google



Two Separate Rulings In France Split Over Whether Google's Suggestion Algorithm Can Be Libelous

from the confusion-abounds dept

Reader Yann alerts us to an interesting pair of lawsuits and decisions in France, both concerning the Google Suggest feature. One case involved a company named Direct Energie and the other a company named CNFDI (both links go to the Google translation of the news).

In both cases, the companies were upset that when people started searching on their company names, the first suggestion was their company name followed by the word "arnaque," which means "scam." Of course, as you probably know, Google Suggest works by finding the most common searches on what you've typed and letting you know. So, all this really meant was that an awful lot of people were doing searches questioning whether or not these two companies were scams. But, is Google liable for its algorithm accurately suggesting the most common searches associated with those company names? It appears the courts split on that question (it's worth noting that there was one major difference between the lawsuits: Direct Energie sued under the civil code, while CNFDI sued for libel -- which apparently makes it a criminal case in France).
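For anyone who wants to see the basic idea, here is a minimal sketch of frequency-based suggestions as described above: count what people actually searched for, then show the most common queries that start with what's been typed. This is only an illustration, not Google's implementation; the `suggest` function, the query log and the "acme energy" company are all made up.

```python
from collections import Counter


def suggest(query_log, prefix, limit=3):
    """Return the most frequently searched queries that start with prefix."""
    counts = Counter(q.strip().lower() for q in query_log)
    matches = [(q, n) for q, n in counts.items() if q.startswith(prefix.lower())]
    # Most-searched first; ties broken alphabetically for a stable order.
    matches.sort(key=lambda item: (-item[1], item[0]))
    return [q for q, _ in matches[:limit]]


# Made-up query log for a made-up company name.
log = (
    ["acme energy scam"] * 5
    + ["acme energy prices"] * 3
    + ["acme energy contact"] * 2
    + ["acme anvils"] * 4
)

print(suggest(log, "acme energy"))
# ['acme energy scam', 'acme energy prices', 'acme energy contact']
```

Note that the order depends only on the counts: "contact" would come first alphabetically, but "scam" wins because more people typed it.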

With Direct Energie, the judge seemed not to really understand Google Suggest or how it worked, declaring that no algorithm could justify the prejudice caused by Google. He then got confused, saying that it was clearly Google's fault because the search on "direct energie arnaque" was not the first alphabetically in the list, nor did it have the highest number of results. Despite Google explaining it, the judge seems to have totally ignored the reason why it was at the top of the list (the number of people searching for it). Because of this, he said it's no limit on free speech to force Google to change the results, and ordered Google to do so (though he did not allow any damages to be awarded). This seems to get the basic facts backwards, and it seems quite ridiculous to find Google guilty of such a charge when all it's actually doing is accurately counting up what people are legitimately searching for.

The CNFDI ruling seems much more reasonable. There was one oddity (though it's probably got more to do with French law than with the judge), and that is that the judge ruled that Google could be liable for libel because the company had been informed by CNFDI of the issue, thereby removing any safe harbors. In the US, Section 230 safe harbors on libel thankfully do not get waived if you've been informed. Instead, they take the much more logical position that a third party service provider should never be blamed for actions of its users. Thus, it would be flat-out ridiculous to blame Google for the phrases people are searching for. But, even having lost its local "safe harbor" protections, the judge properly recognized that the suggestion came from the algorithm looking at what people were searching for, and noted that the suggestion was based on "a valid observation." On top of that, he pointed out that search engines are "important tools for the free circulation of ideas and information," and the fact that many people were questioning whether CNFDI was a scam was, in fact, important and potentially useful information, and thus not libelous by itself. Finally, the court also noted that forcing Google to remove such a suggestion would be too big a burden on free speech and citizens' rights.

It should be no surprise that I think the second ruling is much more sensible, while the first ruling makes little sense, and appears to have been decided without a full understanding of what Google's Suggest feature is or how it works. Still, I imagine we'll be seeing similar cases around the world... and hopefully they'll find themselves in front of judges more like the one that dealt with the CNFDI case...

Overhype


by Mike Masnick


Filed Under:
algorithms, entitlement, search, search engines, sem, seo, transparency

Companies:
google



Isn't There Something Ironic In An Anonymous Exec Demanding Transparency From Google?

from the entitlement-culture dept

It really is amazing sometimes to see how many people think that Google "owes" them something. For example, we've had a few different stories about companies suing Google because they don't like how Google ranks them. That makes little sense. Google doesn't owe anyone a spot in its index. It determines its index by figuring out what it thinks people will like best, and it's always tweaking it. If it fails to figure that out properly and someone else (like Microsoft) does figure it out, then Google will lose business. So, it seems a bit odd that some anonymous "well known exec at one of the largest sites on the Internet" is suddenly demanding transparency into how Google ranks content, suggesting that it's somehow unfair and arbitrary in its rankings -- and only by opening up the details of its algorithm will "fairness" be restored.

Ryan, who alerted us to this story, has written up a biting, but reasonable, response, where he notes that being ranked highly in Google is no one's right. And demanding that Google be transparent about its algorithm is meaningless (while being especially ironic, given that this "well-known exec" is demanding transparency while wanting to remain anonymous himself). The key point Ryan makes:

You want an algorithm, here it is:
1.) Sites that are useful to visitors will rank high.
2.) Popular sites that are useful to visitors will rank higher.
3.) Sites that don't offer any value to the web or are irrelevant to the query won't rank well.
4.) Sites that are harmful or spammy won't be included in the index.

Seriously, that's Google’s algorithm in plain English. There's your disclosure. The weighting factors and code behind it don't matter -- these principles are all you really need to know.

Indeed. Create useful sites with useful content that people use, and don't be spammy, and you'll most likely rank well in Google. You don't need to force Google to reveal the nuts and bolts of its algorithm. That doesn't change anything. If you're trying to craft your websites to the specifics of the algorithm, you're already lost. If you're creating websites that match the "plain English" code above, you're going to be just fine.
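Just for fun, the four plain-English rules quoted above can be written out as a toy ranking function. To be clear, this is not Google's algorithm or anything close to it: the `Site` fields, the 0-to-1 scores and the way popularity boosts usefulness are all assumptions invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Site:
    name: str
    usefulness: float  # 0..1, hypothetical "useful to visitors" score
    popularity: float  # 0..1, hypothetical popularity score
    relevant: bool     # does it match the query at all?
    spammy: bool       # harmful or spammy


def rank(sites):
    """Order sites by the four plain-English rules."""
    # Rule 4: harmful or spammy sites are not in the index at all.
    indexed = [s for s in sites if not s.spammy]

    def score(site):
        # Rule 3: sites irrelevant to the query score zero.
        if not site.relevant:
            return 0.0
        # Rules 1 and 2: usefulness counts, and popularity boosts it.
        return site.usefulness * (1.0 + site.popularity)

    return sorted(indexed, key=score, reverse=True)


# Made-up sites for illustration.
sites = [
    Site("off-topic", 0.9, 0.9, relevant=False, spammy=False),
    Site("spam", 0.5, 0.9, relevant=True, spammy=True),
    Site("useful", 0.9, 0.1, relevant=True, spammy=False),
    Site("useful-and-popular", 0.9, 0.8, relevant=True, spammy=False),
]

print([s.name for s in rank(sites)])
# ['useful-and-popular', 'useful', 'off-topic']
```

The exact weights are arbitrary, which is rather the point: change them however you like and a useful, non-spammy site still comes out ahead of a useless or spammy one.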

(Mis)Uses of Technology


by Mike Masnick


Filed Under:
algorithms, computers, quants, trust, wall street



Garbage In, Financial Crisis Out

from the so-much-for-the-quants dept

With everyone trying to figure out just what went wrong to cause the rather spectacular financial mess Wall Street finds itself in these days, Saul Hansell over at the NY Times wanted to find out why all the sophisticated risk management quant algorithms that Wall St. has been so big on lately failed to warn of impending doom. His answer, basically, is that people on Wall St. were lying to the algorithms, coming up with ways to purposely enter data such that the risk seemed much less than it actually was -- in order to let them keep pushing the boundary. Then, it became a situation where people started relying on the computers just because the computers said so -- even though the data was bad. This happens time and time again. Even when people know that computers make mistakes, it's just so convenient to have a computer "confirm" your thinking that you start ignoring other warning signs.

Failures


by Joseph Weisenthal


Filed Under:
algorithms, hedge funds, quant

Companies:
skype



From Hedge Funds To Skype, Collapses Prove Unavoidable

from the crashing-down dept

Is there a connection between the recent meltdown at quant funds and last week's outage at Skype? Nick Carr makes the provocative argument that both events are the result of what happens when algorithms fail to anticipate behavior that is somehow out of the ordinary. In the case of quant funds, their models failed to anticipate the market's wild volatility, whereas with Skype (if you believe the company's official explanation), the glitch was the result of mass reboots taxing network capacity. Interestingly, both Skype engineers and hedge fund managers were heard using the phrase "perfect storm" to describe the sequence of events that led to their respective collapses. Of course, as hedge funds learn every few years, these perfect storms that are mathematically supposed to occur just once in a thousand years seem to happen quite a bit more often. The same goes for any network that suffers an outage despite the best-laid contingency plans. The problem is that it's difficult to craft an algorithm or a model that's robust during both 'normal' times and abnormal times. In finance, one hopes that the profits are big enough during the good times so that you can survive the occasional mess. The one problem, of course, with the comparison between hedge funds and Skype is that Skype's explanation doesn't ring particularly true. The connection between Microsoft patches, mass reboots and the network collapse seems tenuous at best. Thus, it's entirely possible that this particular outage had nothing to do with abnormal crowd behavior. Still, as the surprise outage at 365 Main demonstrates, it's difficult, if not fully impossible, to completely inoculate oneself against adverse events.
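A rough way to see why "once in a thousand years" events can show up more often than the math says: how rare an extreme move looks depends heavily on the distribution the model assumes. The sketch below compares the tail of a normal distribution with a heavy-tailed stand-in, a Student's t distribution with 2 degrees of freedom. That stand-in is an assumption chosen only because its tail has a simple closed form; it is not a claim about how real markets or networks behave.

```python
import math


def normal_tail(k):
    """P(Z > k) for a standard normal variable."""
    return 0.5 * math.erfc(k / math.sqrt(2))


def student_t2_tail(k):
    """P(T > k) for Student's t with 2 degrees of freedom (closed form)."""
    return 0.5 - k / (2 * math.sqrt(2 + k * k))


# How likely is a move more than k units above the center?
for k in (2, 4, 6):
    thin = normal_tail(k)
    heavy = student_t2_tail(k)
    print(f"k={k}: normal {thin:.2e}, heavy-tailed {heavy:.2e}, "
          f"ratio {heavy / thin:.1f}x")
```

The further out you go, the bigger the gap gets, so a model built on the thin-tailed assumption will keep being surprised by the same kind of event.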
