Update 4/29, 5:21pm: Wolfram updated his blog today and linked to his demo video, and the product does look as niche as I feared. It is “smart answers” on steroids, and while it may complement regular search results nicely, it’s not moving the field of Internet search and indexing forward at all. Perhaps it’s the press’s fault for pumping up Alpha as the next Google – clearly it is not, nor are they trying to be. They’re tackling a relatively small problem (compared to indexing the entire Internet) and they appear to be targeting a small audience (academics and scientists), so we should probably stop discussing Alpha in the same breath as Google, Yahoo, and the rest. Please move along, nothing to see here.
Much ado has been made lately about Wolfram Alpha, a new-fangled “search engine” due to release in May that promises to give answers to questions that are asked in plain English. Predictably, it’s much ado about nothing. TechCrunch responded today to leaked screenshots by sitting on both sides of the fence, saying it’s unlikely Wolfram has “something Google doesn’t or can’t build in a year,” while also saying that their own guest editor’s predictions of Wolfram’s search greatness are “persuasive.” Which is it, guys?
Let me boil it down for you based on what I’ve read so far: Wolfram Alpha’s pitch is that their search engine is built to answer plain English “computational” questions, i.e. questions that have specific answers that can be calculated. To do this, they are sucking in all the databases they can find – population stats, weather stats, census data, geographic data, and any other corpora that are readily available.
Once they have all the data compiled, they make it mine-able using plain English queries. In his TechCrunch guest article, Nova Spivack gives three sample queries that are supposed to show the awesome potential of Alpha:
- What country is Timbuktu in?
- How many protons are in a hydrogen atom?
- What is the average rainfall in Seattle?
It’s great that Alpha can answer these, but did Spivack bother to try these queries in Google? Google gives an answer to every single one in the summary of the top result. I didn’t even have to click through. Hopefully these are just shoddy examples from Spivack rather than an example of how lame Alpha actually is.
Here are a few query types I’m hoping Alpha can answer that Google cannot:
- What was Bank of America’s stock price at close on September 11, 2001?
- Is next year a leap year?
- How have home prices in San Diego, CA changed in the last 5 years?
These are questions that have specific answers that can be calculated from readily available data, but (here’s the key) are unlikely to have been written about on the web in a way that would make them findable by Google. These questions are so specific (long-tail) that Google just won’t have answers sitting around in its index.
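To make the “calculated, not indexed” distinction concrete, here is a minimal sketch of the leap-year question answered by computation. The function name is my own invention for illustration and says nothing about how Alpha actually works; the only real API used is Python’s standard `calendar.isleap`.

```python
import calendar
from datetime import date


def is_next_year_leap(today: date) -> bool:
    """Answer "Is next year a leap year?" by computing it from today's date.

    Hypothetical helper for illustration only. No web page has to exist
    containing the answer; it falls out of the calendar rules.
    """
    return calendar.isleap(today.year + 1)


if __name__ == "__main__":
    # The answer changes depending on when the question is asked, which is
    # exactly why a static index of web pages is a poor fit for it.
    print(is_next_year_leap(date.today()))
```

The stock price and home price questions would need a real data source behind them, but the shape is the same: parse the question, then compute over data rather than retrieve a document.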
I can hear your next question already: if Wolfram Alpha is only good for such long-tail questions, how can it possibly compete with Google? The answer is: it can’t.
For all the glowing talk, Alpha appears to be a large set of regular expressions that parse natural language so users can mine a massive database. This is not dissimilar to how Ask.com worked in the late ’90s when they had a huge, human-built database of question templates allowing them to parse queries and provide links as answers. Remember how well that worked?
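For the curious, here is roughly what I mean by question templates. This is a toy sketch of the general technique, not Alpha’s or Ask.com’s actual implementation, which I have not seen. The `FACTS` table, the `TEMPLATES` list, and the `answer` function are all hypothetical names I made up, and the “database” is three hard-coded facts.

```python
import re
from typing import Optional

# Tiny stand-in for the "massive database". Keys are (attribute, entity).
# Hypothetical structure, for illustration only.
FACTS = {
    ("country", "timbuktu"): "Mali",
    ("protons", "hydrogen"): "1",
    ("protons", "helium"): "2",
}

# Hypothetical question templates: each pairs a regular expression with the
# attribute it asks about. The named group captures the entity.
TEMPLATES = [
    (re.compile(r"^what country is (?P<entity>.+?) in\??$", re.I), "country"),
    (re.compile(r"^how many protons are in an? (?P<entity>.+?) atom\??$", re.I), "protons"),
]


def answer(question: str) -> Optional[str]:
    """Return a stored fact if the question fits a known template, else None."""
    q = question.strip()
    for pattern, attribute in TEMPLATES:
        match = pattern.match(q)
        if match:
            entity = match.group("entity").lower()
            return FACTS.get((attribute, entity))
    return None  # no template matched, so the engine has nothing to say
```

Ask it “What country is Timbuktu in?” and it finds an answer. Ask it the Seattle rainfall question and it returns nothing, because nobody wrote a template for that phrasing. That brittleness is the whole problem with the approach: coverage grows only as fast as humans can write templates.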
Alpha must be an acquisition play. They must be developing this answer engine with an eye towards selling it to one of the big players (Google, Yahoo, MS, or even Ask.com) so they can beef up their search results. All of the majors already have smart answers features (a la Ask.com) that give exact answers to a small set of templatic questions, so acquiring Alpha would make an existing smart answers feature more robust.
TechCrunch reported that an Alpha “insider” today leaked the screenshot below in an attempt to show how Alpha is so much cooler than Google’s smart answers:
Wolfram Alpha Leaked Screenshot
On the left, we have Alpha’s result for a search on “ISS.” The result it gives is a map and technical details of the International Space Station’s orbit. Wow. That is a truly horrible result. Why anyone would leak this to show the power of the engine is beyond me. Here’s what’s wrong with it:
- Who says I want information about the International Space Station? Maybe I wanted Internet Security Systems or International Schools Services or info about the company ISS A/S out of Copenhagen. How about a little disambiguation, guys? Clearly Alpha is not trying to be a comprehensive search engine.
- Who wants data like that? If I want info about the International Space Station, I would probably rather see its homepage than some crazy technical data about where it is right this second.
- What happened to the natural language queries, eh? Showing that your engine can figure out what I meant by a search on “ISS” hardly shows any natural language parsing ability, and conversely shows the complete lack of disambiguation as I discussed above.
- Lastly a nitpicky Product Manager thing: at the top, it says that International Space Station is the “input interpretation” of ISS. “Input interpretation”? Really? How many users would have any idea what you’re trying to say there? This is a product made by nerds for nerds. I’m a nerd, so I can say this.
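The disambiguation I’m asking for is not exotic. Here is a minimal sketch, assuming a hypothetical `SENSES` lookup populated with the four meanings of “ISS” I listed above; the function names and the response wording are mine, not anything Alpha or Google exposes.

```python
from typing import Dict, List

# Hypothetical index of possible meanings per query, using the candidates
# named in this post. The first entry is treated as the default guess.
SENSES: Dict[str, List[str]] = {
    "iss": [
        "International Space Station",
        "Internet Security Systems",
        "International Schools Services",
        "ISS A/S",
    ],
}


def interpret(query: str) -> List[str]:
    """Return every known meaning of the query, not just the first one."""
    return list(SENSES.get(query.strip().lower(), []))


def respond(query: str) -> str:
    """Pick a default meaning but surface the alternatives to the user."""
    senses = interpret(query)
    if not senses:
        return f'No known interpretation for "{query}".'
    if len(senses) == 1:
        return f"Showing results for {senses[0]}."
    others = ", ".join(senses[1:])
    return f"Showing results for {senses[0]}. Did you mean: {others}?"
```

Guessing the space station first is fine. Pretending the other three meanings don’t exist is not.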
On the right, we see Google giving a fantastic answer to the query “maine population”. (I’ll assume that someone changed the text in the query box to read “california population” after looking up Maine first.) Google 1, Alpha 0.
Ultimately, Wolfram Alpha is not a search engine, but rather a data mining language for answers about a relatively small set of known entities. If you want to know about the International Schools Services, use Google. If you want to know where Timbuktu is, use Google. If you want to know the inclination and orbital period of the International Space Station, then by all means, go ahead and use Wolfram Alpha. (Note: Google gives a pretty good result when searching for “International Space Station Inclination.”) When Alpha finally sells to one of the majors, it will most likely settle in as a feature of a search engine, not a search engine itself.
Caveat: all conclusions I’ve drawn are from the information available now. We’ll see what it’s actually capable of when it launches in the coming weeks.