Tag Archives: search engine

How Facebook’s Open Graph will own identity and threaten Google

Facebook today announced some potentially ‘net-changing features they are releasing under the moniker Open Graph. Open Graph replaces Facebook Connect, or perhaps deprecates it if you like, making it easier for people to use their Facebook data within the context of other websites. Sounds fancy, eh? Let’s break it down into understandable examples:

Most prominently, websites can embed “Like” buttons on their pages, just as Facebook has on its activity feed items and various other pages around their site. Website creators will embed these Like buttons because it lets their users publish links they like back to their Facebook feed with a single click – they don’t even need to sign in to the creator’s site, as long as they are already signed in on Facebook. That is free marketing for the website.

In addition to the Like tool, Facebook offers a variety of other “social plugins” to help site creators make their sites more social and more integrated with Facebook.   The Activity Feed lets users see what their friends are doing on the creator’s site.  Login with Faces shows a user which of their friends are already members of a site and prompts them to sign up with that site to connect with them.  Comments lets users comment on individual items on the creator’s site, and gives them a seamless option to post that comment back to Facebook as well.  All this without having to create an account on the creator’s site.  You get the picture.

Facebook = identity

The most significant immediate implication of Facebook’s Open Graph is that site creators may no longer bother having their own registration systems at all, as FriendFeed founder Bret Taylor (now with Facebook) explained. My interpretation is that Facebook wants to own identity on the web, and site creators are likely to fall in line because Facebook has made it in their best interest. If creators adopt the tools, they get free marketing and a seamless experience for their users. The only cost is sharing their user data with Facebook.

Open Graph is easier to implement than Facebook Connect, and people can start interacting with sites immediately, no login required, which is great.   It appears that Facebook will share users’ “basic” information once they connect on a site, so creators get names, email addresses, genders, etc. – all the basic things they would ask for anyway.  Easy for the user, easy for me, everyone wins.

Especially Facebook.  Once Open Graph plugins become widespread, Facebook will know exactly what users are doing…all the time.  They’ll know what sites you visit, and they’ll know what things you Like on those sites.  Who knows what evil lurks in the hearts of men?  Big Brother Zuckerberg knows.

Spam bait

Interesting note for you privacy fans: it appears that any data you make public on your profile (which is most of it by default), including things and sites you like, will now be available to other sites so they can tailor their content to your tastes.  Cool? Yes.  Spooky? Yes.  Ripe for abuse? Most definitely.  While I don’t have a problem with this personally because I am pretty careful and sparing about what I share on Facebook, a lot of people are going to get stung, and spammers and direct marketers will try to abuse the system to deliver unsolicited ads.  I wish Facebook well, but this is going to be a hornet’s nest.


The data

Now for what I think is the real meat of all this: the data.  When site creators implement these Like buttons and other plugins, Facebook is encouraging them to tag their pages with specific types of common metadata that may be relevant:  image, name, location, email address, phone number, and “type” (e.g. sport, activity, restaurant, athlete, city, product, book, blog, website, etc.)  If creators take the time to tag their pages like this, then when their users “like” something, Facebook will know exactly what it is and can present it nicely within the Facebook context.
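To make the tagging concrete, here is a minimal sketch of how a site creator might generate that metadata for a page. The `og:` property names follow the Open Graph protocol as I understand it; the helper function, the page values, and the example.com URLs are made up for illustration.

```python
from html import escape


def open_graph_tags(properties):
    """Render a dict of Open Graph properties as HTML <meta> tags.

    Hypothetical helper: keys are property names without the "og:" prefix.
    """
    lines = []
    for name, value in properties.items():
        # Escape values so quotes or angle brackets cannot break the markup.
        lines.append(
            '<meta property="og:%s" content="%s" />' % (escape(name), escape(value))
        )
    return "\n".join(lines)


# Made-up page data for a restaurant listing.
page = {
    "title": "Taco Shack",
    "type": "restaurant",
    "image": "http://example.com/taco-shack.jpg",
    "url": "http://example.com/taco-shack",
}

print(open_graph_tags(page))
```

The output would go in the page’s head section, where anything that reads the page can pick up the type and name without guessing.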

Think about this for a minute. Suddenly, one organization on the web has the ability to know what pages are about without having to crawl every page (and its backlinks) to figure it out.  Site creators are telling Facebook exactly what their pages are about using structured data.  Here is the quote from their Open Graph page that jumped right out at me:

Based on the structured data you provide via the Open Graph protocol, your pages show up richly across Facebook: in user profiles, within search results and in News Feed.

Search results, eh?  Any page on my site that I tag with structured data can show up in Facebook search.  Facebook could presumably let their users filter the search so it’s for “actors” or “politicians” or “athletes” or whatever type of object.  They can search for activities, landmarks, restaurants near their current location…  This sounds an awful lot like Google, but with 1/100th of the effort that Google goes to when compiling their monstrous index of every page on the Internet.

Even better, all these links are ranked by humans. Every “Like” button that we press makes this massive index of webpages and real-life offline things smarter. This is getting impressively close to the holy grail of search: social search. Not only is it vetted by humans, but it’s real-time – no need to wait for a crawler to poke around every corner of the web. The best of Google search with the best of Twitter search in one package.
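Here is a rough sketch of what that kind of search could look like: filter the tagged pages by type, then rank by Like count. Everything in it (the `TaggedPage` record, the `search` function, the sample data) is my own invention to illustrate the idea, not anything Facebook has described.

```python
from dataclasses import dataclass


@dataclass
class TaggedPage:
    """A page as a site creator might describe it, plus its Like count."""
    url: str
    name: str
    kind: str   # the declared "type", e.g. "restaurant" or "athlete"
    likes: int


def search(pages, kind):
    """Return pages of the given type, most-liked first."""
    matches = [p for p in pages if p.kind == kind]
    return sorted(matches, key=lambda p: p.likes, reverse=True)


# Made-up index entries.
PAGES = [
    TaggedPage("http://example.com/a", "Taco Shack", "restaurant", 120),
    TaggedPage("http://example.com/b", "Some Sprinter", "athlete", 900),
    TaggedPage("http://example.com/c", "Noodle Bar", "restaurant", 450),
]

for page in search(PAGES, "restaurant"):
    print(page.name, page.likes)
```

No crawling and no link analysis: the site supplies the type, and the users supply the ranking.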

Google is surely watching these developments keenly, and probably wishing they had acquired Facebook back when they had the chance. Microsoft is surely dancing a jig. (Hey Stumbleupon: love you guys, but it’s time to pack up your bags and go home.)

If I were Google, I’d give an arm and a leg for all this data. With Microsoft being a major investor in Facebook, don’t be surprised to see this data integrated into Bing in the not-too-distant future.

Wolfram Alpha is a feature, not a search engine

Update 4/29, 5:21pm: Wolfram updated his blog today and linked to his demo video, and the product does look as niche as I feared. It is “smart answers” on steroids, and while it may complement regular search results nicely, it’s not moving the field of Internet search and indexing forward at all. Perhaps it’s the press’s fault for pumping up Alpha as the next Google – clearly it is not, nor are they trying to be. They’re tackling a relatively small problem (compared to indexing the entire Internet) and they appear to be targeting a small audience (academics and scientists), so we should probably stop discussing Alpha in the same breath as Google, Yahoo, and the rest. Please move along, nothing to see here.

Much ado has been made lately about Wolfram Alpha, a new-fangled “search engine” due to release in May that promises to give answers to questions that are asked in plain English. Predictably, it’s much ado about nothing. TechCrunch responded today to leaked screenshots by sitting on both sides of the fence, saying it’s unlikely Wolfram has “something Google doesn’t or can’t build in a year,” while also saying that their own guest editor’s predictions of Wolfram’s search greatness are “persuasive.” Which is it, guys?

Let me boil it down for you based on what I’ve read so far: Wolfram Alpha’s pitch is that their search engine is built to answer plain English “computational” questions, i.e. questions that have specific answers that can be calculated.  To do this, they are sucking in all the databases they can find – population stats, weather stats, census data, geographic data, and any other corpora that are readily available.  

Once they have all the data compiled, they make it mine-able using plain English queries.  In his TechCrunch guest article, Nova Spivack gives three sample queries that are supposed to show the awesome potential of Alpha:

  • What country is Timbuktu in?
  • How many protons are in a hydrogen atom?
  • What is the average rainfall in Seattle?

It’s great that Alpha can answer these, but did Spivack bother to try these queries in Google?  Google gives an answer to every single one in the summary of the top result.  I didn’t even have to click through.  Hopefully these are just shoddy examples from Spivack rather than an example of how lame Alpha actually is.

Here are a few query types I’m hoping Alpha can answer that Google cannot:

  • What was Bank of America’s stock price at close on September 11, 2001?
  • Is next year a leap year?
  • How have home prices in San Diego, CA changed in the last 5 years?

These are questions that have specific answers that can be calculated from readily available data, but (here’s the key) are unlikely to have been written about on the web in a way that would make them findable by Google.  These questions are so specific (long-tail), that Google just won’t have answers sitting around in its index.
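To show what “calculated rather than looked up” means, here is a tiny sketch for the leap year question. No web page needs to exist with the answer, because it falls out of a calendar rule. The function name is mine, and this says nothing about how Alpha actually does it.

```python
import calendar
from datetime import date


def is_next_year_leap(today):
    """Answer "Is next year a leap year?" relative to the given date."""
    return calendar.isleap(today.year + 1)


# Asked in 2011, "next year" is 2012, which is a leap year.
print(is_next_year_leap(date(2011, 6, 1)))
```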

I can hear your next question already: if Wolfram Alpha is only good for such long-tail questions, how can it possibly compete with Google?  The answer is: it can’t.  

For all the glowing talk, Alpha appears to be a large set of regular expressions that parse natural language so users can mine a massive database. This is not dissimilar to how Ask.com worked in the late ’90s when they had a huge, human-built database of question templates allowing them to parse queries and provide links as answers. Remember how well that worked?
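For the curious, the template approach I’m describing looks roughly like the sketch below. To be clear, this is my guess at the general technique, not Alpha’s actual implementation; the patterns, the `FACTS` table, and the `answer` function are all hypothetical.

```python
import re

# A stand-in for the "massive database": (relation, subject) -> answer.
FACTS = {
    ("country", "timbuktu"): "Mali",
    ("protons", "hydrogen"): "1",
}

# Each template pairs a question pattern with the relation it asks about.
TEMPLATES = [
    (re.compile(r"what country is (?P<subject>[\w\s]+?) in\??$", re.I), "country"),
    (re.compile(r"how many protons are in an? (?P<subject>\w+) atom\??$", re.I), "protons"),
]


def answer(question):
    """Return the stored answer for a question, or None if no template fits."""
    text = question.strip()
    for pattern, relation in TEMPLATES:
        match = pattern.match(text)
        if match:
            subject = match.group("subject").strip().lower()
            return FACTS.get((relation, subject))
    return None


print(answer("What country is Timbuktu in?"))
print(answer("How many protons are in a hydrogen atom?"))
print(answer("What is the meaning of life?"))
```

The weakness is obvious: every new kind of question needs a new hand-written template, and anything phrased slightly differently falls through to nothing.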

Alpha must be an acquisition play.  They must be developing this answer engine with an eye towards selling it to one of the big players (Google, Yahoo, MS, or even Ask.com) so they can beef up their search results.  All of the majors already have smart answers features (a la Ask.com) that give exact answers to a small set of templatic questions, so acquiring Alpha would make an existing smart answers feature more robust.

TechCrunch reported that an Alpha “insider” today leaked the screenshot below in an attempt to show how Alpha is so much cooler than Google’s smart answers:

Wolfram Alpha Leaked Screenshot


On the left, we have Alpha’s result for a search on “ISS.” The result it gives is a map and technical details of the International Space Station’s orbit. Wow. That is a truly horrible result. Why anyone would leak this to show the power of the engine is beyond me. Here’s what’s wrong with it:

  • Who says I want information about the International Space Station? Maybe I wanted Internet Security Systems or International Schools Services or info about the company ISS A/S out of Copenhagen. How about a little disambiguation, guys? Clearly Alpha is not trying to be a comprehensive search engine.
  • Who wants data like that?  If I want info about the International Space Station, I would probably rather see its homepage than some crazy technical data about where it is right this second.
  • What happened to the natural language queries, eh?  Showing that your engine can figure out what I meant by a search on “ISS” hardly shows any natural language parsing ability, and conversely shows the complete lack of disambiguation as I discussed above.
  • Lastly, a nitpicky Product Manager thing: at the top, it says that International Space Station is the “input interpretation” of ISS. “Input interpretation”? Really? How many users would have any idea what you’re trying to say there? This is a product made by nerds for nerds. I’m a nerd, so I can say this.

On the right, we see Google giving a fantastic answer to the query “maine population”.  (I’ll assume that someone changed the text in the query box to read “california population” after looking up Maine first.)  Google 1, Alpha 0.

Ultimately, Wolfram Alpha is not a search engine, but rather a data mining language for answers about a relatively small set of known entities. If you want to know about International Schools Services, use Google. If you want to know where Timbuktu is, use Google. If you want to know the inclination and orbital period of the International Space Station, then by all means, go ahead and use Wolfram Alpha. (Note: Google gives a pretty good result when searching for “International Space Station Inclination.”) When Alpha finally sells to one of the majors, it will most likely settle in as a feature of a search engine, not a search engine itself.

Caveat: all conclusions I’ve drawn are from the information available now.  We’ll see what it’s actually capable of when it launches in the coming weeks.