Powerset: Is There More Than Buzzwords And Patent Threats?

There’s been so much hype around search startup Powerset that it seems like it’s going to be quite difficult to live up to it. The company kicked off by raising a lot of money at an insanely high valuation for a seed stage company, and then used some of that cash to license some natural language technology from PARC. Of course, natural language search has been tried and failed many times before — sometimes because the technology sucks, but more often because there just isn’t that big a benefit to it compared to traditional keyword search (especially as more people have become comfortable with keyword searching). However, Powerset keeps generating lots of attention and hype, and on Thursday apparently revealed a lot more concerning what it’s about… we think. That is, the company revealed a lot, but an awful lot of it comes off as simply repeating every buzzword they can think of and reminding everyone they have patents.

It’s always a worrying sign when a company opens a description of its product by bragging about its patents rather than the product’s actual benefits, yet Powerset kicked off the discussion by talking about how “locked down” its patents are. If the company is really doing something special, then people will beat a path to its door, whether or not it has patents. If the technology is useless, the patents will also be meaningless. We don’t care about the patents, we care about what’s useful. The rest of the talk apparently was about this incredibly confusing buzzword-fest of a social network/ecosystem that the company is apparently trying to build around its search engine:

“Imagine a mashup between Facebook, Digg and Google Apps, but you get to participate in the building of the products that sit on top of our platform. You log into a social network, like you would Facebook, and you get certified to be a Powerlabber. Once certified you can join different interest groups, such as travel, and participate in idea and mashup competitions. QA is embedded and its all bloggable.”

What does that mean? I’ve read it many times and I still can’t figure it out. He goes on to mention MySpace, Second Life and Wikipedia, of course. It sounds like the company is trying to build the ultimate web platform — which is a good strategy, but it needs to get away from buzzwords and patents and actually explain what makes it useful.

Sanguine Dream says:


An effort to create a more tightly focused search engine. I have to say that such a concept does show promise. It does get aggravating to search for something in Google and get results that are nowhere near what you searched for.

However useful this turns out to be, I also think that they may become victims of their own hype.

Don Dodge (user link) says:

Powerset does have some secret sauce

I agree that all the hype and buzzwords make it hard to sort out what is going on at Powerset, but there is some secret sauce. I have spent a lot of time with the founders of Powerset and know several employees very well.

Focus on the search index, not the user query. The NLP rocket science is applied to indexing the billions of web pages. NLP is not that helpful in parsing the typical two or three word query, but that is what everyone focuses on.

However, take a quick look at this query. “Who is the best ballplayer of all time?” Powerset breaks this query down very carefully using linguistic ontologies and all sorts of proprietary rules. For example, they know that “ballplayer” can mean Sports. Sports can be separated into categories that involve a “Ball”. Things like baseball, basketball, soccer, and football. Note that soccer does not include the word ball, yet Powerset knows this is a sport that includes a ball. Powerset knows that “ballplayer” can mean an individual player of a sport that includes a ball. They know that “best of all time” means history, not time in the clock sense.

Knowing all of this is cool but the real rocket science is in the index. Powerset uses all these rules and linguistic approaches to analyze millions and billions of web pages, and adds “meta data” hooks into each word on each page. As you can imagine this is a huge scaling problem, that has been impossible to solve economically. With Moore’s Law applied to constantly reducing the cost of computing, storage, and bandwidth, it is now possible to solve this problem, and within a few years it will be economically viable.
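To make the index-time idea concrete, here is a minimal sketch of annotating words with concept tags while indexing, so that a query for "ballplayer" can reach a page that only mentions "soccer". This is a toy illustration, not Powerset's actual method: the `ONTOLOGY` table, the tag names, and the functions `tokenize`, `build_index`, and `search` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical toy ontology: surface word -> concept tags (made up for illustration).
ONTOLOGY = {
    "ballplayer": {"ball_sport", "athlete"},
    "baseball": {"ball_sport"},
    "basketball": {"ball_sport"},
    "soccer": {"ball_sport"},
    "football": {"ball_sport"},
    "striker": {"athlete"},
}

PUNCT = ".,?!\"'"


def tokenize(text):
    """Lowercase the text and strip surrounding punctuation from each word."""
    words = (w.strip(PUNCT).lower() for w in text.split())
    return [w for w in words if w]


def build_index(docs):
    """Index each document under its literal words AND their concept tags."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in tokenize(text):
            index[("word", word)].add(doc_id)
            # The "metadata hook": attach concept tags at indexing time.
            for concept in ONTOLOGY.get(word, ()):
                index[("concept", concept)].add(doc_id)
    return index


def search(index, query):
    """Return documents matching any query word or any of its concepts."""
    hits = set()
    for word in tokenize(query):
        hits |= index.get(("word", word), set())
        for concept in ONTOLOGY.get(word, ()):
            hits |= index.get(("concept", concept), set())
    return hits


DOCS = {
    "d1": "Pele is often called the greatest soccer striker.",
    "d2": "The clock tower tells the time.",
}
INDEX = build_index(DOCS)
print(search(INDEX, "ballplayer"))  # d1 matches via concept tags, not keywords
```

The point of the sketch is the cost profile described above: the expensive annotation work happens once per page at indexing time, which is why scaling it to billions of pages is the hard part.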

I wrote a blog post about Powerset today that goes into more detail. See http://dondodge.typepad.com/the_next_big_thing/2007/06/powerset—open.html

Don Dodge

Jos says:

Re: Powerset does have some secret sauce

They may intend to do what you describe, but they obviously have not implemented anything like it in their current demo. Just type your sample query about the best ballplayer into the Powerlabs Wikipedia search demo and into Google and judge for yourself. Running a ball sport? The listing of the song “Closer to God” (number one on some “all time” list of videos) suggests that nothing fancier is happening than substituting synonyms for, e.g., “best” in what is still a dumb keyword search (something Google also does once in a while, it seems to me).

Also try “Where did Einstein die?” and see how the meaning of “to die” is completely ignored by Powerset, which just happily returns samples containing the German determiner “die”.

“Where did Babe Ruth die?” similarly suggests that “where” is just taken as a sort of keyword wild card (anything of the location type), but that the crucial NLP information (that Ruth did the dying at that location) is just ignored.
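The "die" problem is easy to reproduce with a few lines of code. The sketch below contrasts a bare keyword match with a match on hand-labeled (word, tag) pairs. It is only an illustration of the failure mode, not a description of how Powerset or Google work; the function names, the sample sentence, and the `VERB`/`DET`/`NAME` tags are assumptions made up for the example.

```python
PUNCT = "?.,!"


def keyword_match(query, doc):
    """Plain keyword overlap: words are compared as bare lowercase strings."""
    q = {w.strip(PUNCT).lower() for w in query.split()}
    d = {w.strip(PUNCT).lower() for w in doc.split()}
    return q & d


def tagged_match(query_tokens, doc_tokens):
    """Overlap on (word, tag) pairs, so the same spelling with a different role differs."""
    return set(query_tokens) & set(doc_tokens)


QUERY = "Where did Einstein die?"
GERMAN_DOC = "Einstein und die Relativitätstheorie"

# Bare keywords: the English verb and the German determiner look identical.
print(keyword_match(QUERY, GERMAN_DOC))

# Hand-labeled tokens (hypothetical tags): "die" is a verb in the query
# but a determiner in the German text, so it no longer matches.
QUERY_TOKENS = [("einstein", "NAME"), ("die", "VERB")]
DOC_TOKENS = [("einstein", "NAME"), ("die", "DET")]
print(tagged_match(QUERY_TOKENS, DOC_TOKENS))
```

A real system would need a tagger to produce those labels automatically; the sketch only shows why ignoring them lets the German text through.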

Petréa Mitchell says:

It appears to me that they’re taking search ranking, trying a new parsing algorithm, and then applying the Wikipedia model where the person with the most spare time wins.

As for that quote from Google on statistics and semantics, I’ve tried Google’s supposedly world-changing statistics-based translator, and the translations I got were totally incoherent. Babelfish may not be that great, but Google is nowhere near being any threat to it.

Chris Maresca (user link) says:

I’ve known one of the founders for almost 10 years (Barney Pell, the CEO) and I was at the launch party at Steve Newcomb’s house. I’m skeptical about Powerset, not because of the technology, which may well be great, but because technology is only 1/10th of what’s needed to win in the marketplace. Just look at Mac OS.

However, I will point out that Barney has, in the past, created one of the best job search engines around, FlipDog. FlipDog worked because it knew what a job listing looked like and crawled the net looking for, and indexing, job listings from corporate websites. If Powerset can build something similar for the entire web, then they might be onto something, but I think that beating Google is going to be a very, very difficult task since Google is the new Microsoft (i.e. an entrenched incumbent with huge advantages).

That said, the switching costs for search are pretty low, so perhaps it’s possible…


Mark Johnson (user link) says:


Hi, Mike,

I’m the product manager at Powerset for Powerlabs, so I thought that I’d clear up Steve’s statement a bit. Powerlabs is going to be Powerset’s platform for trying out our newest product ideas and letting users test and comment on them. The Facebook aspect is the community, the Digg aspect is the ability to rate and comment on ideas, and the Google Labs (not Apps) aspect is that we’re going to release products before they’re ready for prime time, i.e. in the “Labs”.

If you have any questions, you know where to find us =)


Charlie (user link) says:

Re: Clarification

I made a post on my blog (http://artificialminds.blogspot.com/2007/06/how-would-nlp-parse-buzzwords.html) where I pretty much agreed with Mike on the buzzwords and patent threats. Mark Johnson left a similar comment on my blog as well, trying to point out that the statements being criticized are about Powerlabs, the community they’re trying to build around the development of Powerset. So all of the mashup buzzword medley has to do with their effort to add some Web 2.0-ness to the project…getting users involved early, soliciting feedback, etc. The presentation might have been a bit off, but it could be a good idea nevertheless, and hopefully they’ll have the technology to back up their lofty claims.

I signed up for Powerlabs to find out more about their work, as I’m very interested in Natural Language Processing. According to Mark Johnson’s blog (http://deliberateambiguity.typepad.com/blog/2007/06/powerlabs_scree.html), it looks like it won’t be opening up until September, but I did receive an e-mail from Mark with a link to this short video, which sums up what’s been discussed so far: http://www.youtube.com/watch?v=8D6czWVYc-o

ashkan karbasfrooshan (user link) says:

Good PR positioning for Sale?

Agree with the comment that technology is 1/10th of the challenge; the main problem is that the company will get very little distribution and traffic. Just because there’s a buzz amongst people who know the team does not necessarily mean it will be a marketing or sales success.

Then again, this is all great for getting Google, MSFT, Ask and AOL interested in a bidding war.

Fear is a great motivational tool in M&A:

Venture Itch (user link) says:

Natural language search

You made a very good point: “people have become comfortable with keyword searching”. I would even say that people have been comfortable with keyword searching all along. I’m not sure that natural language search exists as a behavioral phenomenon on the web. It’s much easier and more natural to run the search “florist Boston” than “what florist would you recommend in Boston?” I hope the Powerset team has enough common sense to figure out what direction to take. If natural language search is the only option, probably the best course of action is just to return capital to investors.

NitinK (user link) says:

Mass confusion! We're talking about different thin

Hmm – at the risk of making things even murkier, let me add my $0.02 as well.
[Disclaimer: I’m NOT a Powerset employee, nor associated with Powerset in any way.]

I think there is a lot of confusion here, at different levels:

1. The Digg-Facebook-GLabs buzzwords apply to PowerLabs, the feedback community for the product [a la Dell Ideastorm], not to the search site or platform.

2. As Don Dodge points out above, the Semantic Processing aspect applies to *both* – the query and the indexed content. (This had not been clear to me before last night.) This understanding of meaning will allow Powerset to provide query results with a much higher level of relevance than keyword search, IMHO.

3. Finally, although there is incredible potential here, Powerset seems to be following a disciplined approach with the following progression: (i) search site (ii) widgets, mashups, APIs (iii) search platform.

Of course, this is all based on what they told us last night – the usual disclaimers apply!

Ob.plug for my own blog post on this topic:
Powerset is Not a Google-killer!

Chris (user link) says:

Natural Search & All That Crap!

Guys & Gals
Please understand that most users are not very sophisticated when entering their queries, let alone when using a multi-keyword or non-keyword phrase.

Google is what it is today because it could monetize those specific keywords into AdWords for publishers. How can Powerset do the same to generate revenues from user search queries applying natural search?

Where does Paris Hilton buy her underwear?

I wonder what results Powerset would deliver.

