I apologize for the silence lately, but I’ve been cranking pretty hard on my new project, Degree3 Q&A. It’s a social Q&A system that sites can quickly and easily integrate into their site, helping visitors find answers more easily than using comments or forums. I’ll be talking more about it later, but for now, if you’re interested in trying it out, drop me a line or apply for our Private Beta at Degree3.com. In the meantime, try it out on this page, just to the right of this post, and let me know what you think!
Category Archives: Social Q&A
Pointless Twittering: According to a study by Pear Analytics, 40% of Tweets are “Pointless Babble” with another 38% being “Conversational” (which I suppose is a step above Pointless Babble. A small step.) Only 3.6% of posts were classified as news, confirming my assertion that Twitter is more of a communication tool than a source of information. If you’re in the market for pointless babble or conversation, now you know where to go. On a side note, I love the use of the term “pointless babble” in serious research.
Celeb name power: The social media press should stop talking about Hunch.com just because it was started by a Flickr founder. It’s not interesting and it’s not (as the press keeps calling it) social Q&A; it’s social polling, more akin to Sodahead or even the old-school Coolquiz than Yahoo! Answers or my baby. With social Q&A you get to ask a question any way you want and let other people answer your question. On Hunch, you can’t ask a question at all – you have to search for decision-making wizards that other users have already created. And even then, it’s only good for making decisions like whether you should mow your lawn or whether you should renew your World of Warcraft subscription. If you want to know why the sky is blue or what sights to see in Istanbul, you’re out of luck. Yawn. The initial burst of traffic they got from the press is fading, although not as precipitously as Wolfram Alpha’s.
Dumb, smart!: Radio Shack is smart to try rebranding as The Shack because they have nothing to lose. Their old brand stands for irritable, aggressive salespeople, batteries, and out-of-date, no-name electronics devices, so they could stand to shed some of that. Pizza Hut, on the other hand, is nuts to drop the ‘Pizza’ and call themselves The Hut. They will forever be associated with Jabba, and there was nothing terribly wrong with their brand as it was. The Hut says they changed the name to allow them to broaden their menu, but I say if Burger King can sell salads, you guys can sell just about anything short of sushi. Don’t get me started on Syfy.
Microyawn: I suppose the whole Yahoo-Microsoft deal is big news, but for the average web user, it just won’t mean anything. Yahoo search results will look different. Big deal.
Wake me when Google Wave comes out. It’s way too overblown to gain any mass-market acceptance, but at least it’ll be a fun toy for tech geeks like myself.
This is an example of computer software trying to be too smart. Computers, no matter how smart they get, will never be able to comprehend context as well as a person, and this is why Yedda has so much trouble figuring out what I'm actually an expert in, and which questions I'm most likely to answer.
Last time I brought this up (Yedda said I was an expert on “What’s that smell?”), Yaniv from Yedda stopped by and pointed out:
As for why you received this one, I guess that the Yedda active distribution system figured out on its own that it’s a pretty good match for “tool, basketball, and (?) perfect circle”. Perhaps we should change our slogan to “Interested in silly topics? Get silly questions!”
Gee, thanks, Yaniv. Even if I were just entering “silly topics”, I’d say the “active distribution system” doesn’t seem too smart if it thinks that “tool”, “basketball”, or “perfect circle” had anything to do with “What’s that smell?”, or with the latest question I got about toilet paper rolls, eh?
The crew at Yedda is asking a computer program to determine the actual meaning of things, and computers aren’t very good at that yet, so this is what happens. The technology just isn’t ready. This is why on Answerbag we ask users to tell us what categories they want to answer questions in using a full hierarchical category structure to give context, and we only send them questions in that category.
Yaniv, since you’re based in Israel, my interests may sound silly to you, but someone from the US in my rough demographic would probably recognize Tool and Perfect Circle as popular American bands. A good categorization system would be able to differentiate between a tool that you hammer with and a Tool whose CD you can buy. Answerbag uses a full category hierarchy to achieve this, so “Tool” is a category under “Musical artists”, giving us the proper context to know what the user is really interested in. Yedda’s system tries to figure out what “tool” and “perfect circle” are with no context, and, being a computer system, it just can’t figure out that these are bands. It probably thinks I’m interested in tools and geometry.
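To make the point concrete, here’s a minimal sketch (not Answerbag’s actual code — the tree, paths, and function are all illustrative) of why a full category path disambiguates where a bare keyword can’t:

```python
# A tiny illustration of hierarchical categories supplying context.
# The leaf word "Tool" is ambiguous; the full path to it is not.
CATEGORY_TREE = {
    "Music > Musical artists > Tool": "the American rock band",
    "Home & Garden > Hardware > Tools": "things you hammer with",
    "Sports > Basketball": "the game with the real five-second rule",
}

def describe(path: str) -> str:
    """Look up what a category path actually refers to.

    A keyword-only system sees just the last word; a hierarchical
    system sees the whole path, which carries the context."""
    return CATEGORY_TREE.get(path, "unknown category")

print(describe("Music > Musical artists > Tool"))    # the American rock band
print(describe("Home & Garden > Hardware > Tools"))  # things you hammer with
```

The design point is simply that the user picks the path once, and the system never has to guess which “tool” they meant.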
A common theme in interaction design these days is to make the system smarter, so the user doesn’t have to think, but this assumes that the system can be made to be smart. However, in order to make certain systems work correctly (like context-determination), we still need to rely on our users to do a little thinking of their own.
When someone writes for a publication like MIT’s Technology Review, they have an obligation to write articles that are objective and scientifically sound. To represent a brand like MIT, they have to observe the standards of review journalism, such as creating measurable comparison criteria, applying those standards consistently, and giving consistent, even-handed treatment to their subjects. However, Wade Roush of Tech Review last month ignored all of these rules in his article What’s the Best Q&A Site?
Perhaps Roush’s objective was to take a light-hearted look at the Social Q&A space, and he was therefore lax in his editorial rigor, but if that’s the case, his review should have been published on a blog somewhere, not on Tech Review, and it should have had the appropriate disclaimers. When MIT Tech Review publishes an article with hard numbers comparing websites, that review becomes gospel for the hordes of other sites that reference it, so it had better be accurate.
I will show here why that article never should have been published by the MIT Technology Review.
Before we begin, here’s my big, fat disclaimer: I founded Answerbag, one of the subjects of the review, so I’m hardly unbiased. However, to be thorough, I will show here that Roush’s reviews of all of the QnA sites — not just mine — were cursory, unbalanced, and inaccurate.
I’ll start with Roush’s description of his test:
I also devised a diabolically difficult, two-part test. First, I searched each site’s archive for existing answers to the question “Is there any truth to the five-second rule?” (I meant the rule about not eating food after it’s been on the floor for more than five seconds, not the basketball rule about holding.) Second, I posted the same two original questions at each site: “Why did the Mormons settle in Utah?” and “What is the best way to make a grilled cheese sandwich?” The first question called for factual, historical answers, while the second simply invited people to share their favorite sandwich-making methods and recipes.
The five-second rule in basketball actually has nothing to do with holding, but I’ll let that one go.
I awarded each site up to three points for the richness and originality of its features, and up to three points for the quality of the answers to my three questions, for a total of 12 possible points.
Okay, so here’s the fun part: three points for “richness” and “originality” of its features – these seem like awfully vague terms to use, but it’s only 3 points out of a total 12, so I suppose it’s not that big a deal. He finishes by awarding up to 3 points for answers to each of his three questions, based on the quality of the answer. That seems fair.
Since his review starts with my site, Answerbag, I’ll dig into his review of Answerbag first.
Members get points for asking and answering questions as well as for rating other members’ questions and answers.
Somewhat true. Members can receive points for their questions and answers from other members, but they do not automatically receive points (as they do on some other Q&A systems, a difference that is key to the quality of the answers.) Also, members do not receive points for rating other members’ Q&A. But hey, he says he only had a few days to write this review, so I can’t expect him to get all the details.
He goes on to talk about our Levels system and our Widgets, but never discusses the core differences between Answerbag and other sites, such as the fact that we never close our questions to new answers or ratings on our answers. We also get no credit for pioneering video answers or image answers. We get no credit for allowing our users to sort the questions in a category by popularity so they can easily learn about a topic like “PlayStation 2 troubleshooting” or “foreclosures.” No credit for a unique profile page that lets users see all of the latest activity on the site that involves them (answers to their questions, ratings on their answers, comments on their answers, etc.) No credit for giving more power to users who have earned a lot of points on the site. Then he gives us 1 point out of 3, apparently for not having “rich” and “original” features. Ouch. Well, it’s a subjective measure, so I’ll let it go.
Is there any truth to the five-second rule? All of AnswerBag’s answers about the five-second rule pertained to basketball. Points: 0
Busted. But then, to our answer for why the Mormons settled in Utah, he has this to say:
That’s more or less in line with the best answers to this question at other sites. Points: 1
So…you’re telling me that for an answer that was just as good as the best answers on other sites, we got 1 point out of 3? Is that fair in the review guidelines set forth by MIT’s Tech Review?
To his question on “What’s the best way to make a grilled cheese sandwich?”, he says “I rated the answers to this question purely according to their mouthwateringness.” Very scientific. He received six answers, and gave Answerbag 2 points out of 3 with no explanation whatsoever. I guess they just weren’t mouthwatering enough to get the full 3 points.
3 points for features – he liked the mechanics of the site. Fair enough, since it was established as a subjective measure.
0/3 points for having no answers to the “five-second rule” question. Bummer.
On the Mormon question, Roush cites a couple of the answers, and then gives 2 points out of 3, with no explanation of what was wrong with the answer, or what a correct answer is.
On the grilled cheese question, he got a couple answers that “provided just the basics” as well as some that were taken from “grilledcheese-contest.com” but complains they were “not very original.” Perhaps if he wanted people to submit their own personal, original grilled cheese recipes, he should have asked for that. Poor Askville only got 1 point out of 3 here thanks to Roush not asking for what he wanted.
For a QnA system that is virtually identical to Yahoo’s, Askville’s, or Yedda’s, MSN gets 1 point from Roush. He complains that as users earn points for their answers on the site, they can rise through levels but “there are no other rewards.” This was his only criticism, and Yahoo’s point system works virtually the same way. But MSN still scores only 1 point from Roush. Maybe he thinks Windows crashes too much.
2/3 points for answers about the “five second rule.” No explanation of what the answers were missing to keep them from getting 3/3.
2/3 points for answers to the Mormon question, again with no explanation of what’s wrong with them.
For the grilled-cheese question, he got three answers, including one he says “sounded delicious,” but I guess it didn’t sound delicious enough. 2/3 points.
Roush must have been approaching his deadline. He gave an overview of Wondir’s features, and then gave them 1/3 points with no explanation at all of what he didn’t like.
Five second rule: 0/3 points because it was very hard to find answers. Fair enough.
Mormons settling: He got “six answers, 3 of which were useful.” No explanation of what was missing. 2/3 points.
This one is great: for the grilled cheese question, he changed his question (for no apparent reason) to ask how to make a healthy grilled cheese sandwich. He got answers he considered “brief and obvious” like using wheat bread, lowfat cheese, and margarine. I have no clue what he would have considered a good answer. Those answers are the best things I could think of to make a healthy grilled-cheese sandwich; if you’re cooking something with bread, cheese, and butter, what else can you do? Yet, Roush was not satisfied, and this answer got 1/3 points. Brutal.
Roush starts by explaining how Y!Answers has similar features to other sites. He praises their “My QnA” page, but doesn’t acknowledge that almost every other site in the comparison has a similar page. Then he seems to get excited:
One fun twist: users can choose and customize their own cartoon self-portraits, which appear alongside their questions and answers and give the site a surprisingly jaunty feel.
How much is that worth? 3/3 points. THREE POINTS. For an avatar system that anyone who’s used Yahoo (particularly Yahoo Games or Yahoo Instant Messenger) knows has been a feature of Yahoo (not Yahoo Answers) for several years. Somehow their avatars added so much to the experience that he thought Yahoo had the BEST FEATURES out of any of the other QnA sites in the review (tied with Askville.) Perhaps it’s his first time using the web?
Five second rule: This question had been asked 11 times and had 160 answers. It seems that the time it took him to find a good answer under the 11 iterations of the same question was not a hindrance – after reading through the 160 answers (yawn), he apparently liked one that referenced a show from the Discovery Channel as a source. 3/3 points. Alright.
Mormons settling in Utah: Again, Roush inexplicably changed the wording of his question, adding a bit about “why didn’t they go on to California or someplace more desolate.” (Rule #1 of doing experiments: perform the SAME experiment on every subject, Wade.) He got one “offensive” answer, four “cursory” answers, and one he considered good enough for a 2/3, again with no explanation.
Grilled cheese sandwich: This time he just searched for existing answers, giving no word on why he didn’t do this on other services, or why it was good enough to look at existing answers on Yahoo rather than testing how quickly he could get answers to a new question as he had done on other sites. He seemed to be pleased that he had to look at over 100 copies of the same question to get a recipe. Something in there must have sounded yummy to him, but he doesn’t tell us much besides the fact that he found thousands of answers. Yahoo Search returns millions of results to many queries, but that doesn’t make it the best search. Roush gives Y!Answers 3/3 points regardless.
His evaluation of Yedda’s features culminates in “In practice, however, I couldn’t see much difference between the answers at Yedda and those at Live QnA, Yahoo Answers, or the other sites.” For being the same as those sites, Yedda only gets a 2/3. No further explanation offered.
Five second rule: Nothing. 0/3.
Mormons: One answer that didn’t address the question. He’s nice and gives Yedda a 1/3 anyway. How generous.
Grilled-cheese: He got one answer that focused on the cheese used in the sandwich. 1/3 points.
How this article got through MIT Tech Review’s editors is beyond me. Time and time again Roush gives random points to these sites without even saying what he liked or didn’t like about them. He never even established the “correct” answer for the five-second rule or why the Mormons settled in Utah, so we, his readers, have no basis for evaluating the answers he received. At the very least, he could have told us what was wrong with the wrong answers.
Here’s what a good review would look like: come up with a set of real, measurable criteria, such as question-answering time, accuracy of answers, and intuitiveness. For certain types of questions that are looking for a variety of answers (such as grilled-cheese recipes), the number of answers received may be a useful metric. It might be nice to see a feature comparison chart of the sites in question, as well as a discussion of why those features are important. You could, perhaps, give one point per useful feature out of a fixed set of features.
Most importantly, apply these measurable standards even-handedly to every subject, and explain why they receive or don’t receive points. These are the basics.
Am I upset just because Yahoo won? No. I’m upset because the review was cursory and didn’t do justice to any of the sites involved. Maybe it was meant to be fun, to be a “jaunty” exploration of Social Q&A, but is Tech Review really the right forum for that? I’m not going to pretend I’m unbiased in the matter, but Roush’s review should not have been represented as a measure of The Best Q&A Site by a well-respected publication.
I guess Bezos is getting tired of the ecommerce biz – Amazon has just released their Askville Social Q&A service, a competitor to sites like Yahoo Answers, MSN Live QnA, and my own Answerbag (which predates all of them, incidentally. Not being snippy, just pointing it out so the title of this post doesn’t sound hypocritical.)
I suppose for Amazon it’s opportunistic – they saw it work at Yahoo Answers, so they did it themselves, and they have enough traffic already to make almost any social service work. (I won’t pretend that they were imitating Answerbag, the smaller, nimble competitor!) The bummer is that they really didn’t add much to the concept.
It works essentially the same way as Yahoo Answers and MSN’s service, but it’s actually even more limiting – when someone asks a question, only 5 people can give answers, and then those five people are the only ones who can evaluate the answers and decide which one is the best. Why the limits, guys? Server too small? Wouldn’t having more people evaluate your answers make them better? Wouldn’t allowing more people to PROVIDE answers result in better answers? According to their site:
We’ve placed a limit on the number of answers per question to make sure you are not overwhelmed with too many answers to your question. If you want to get more than 5 answers you can simply ask your question again – it’s free!
They’re probably right. Six answers is just too many for my feeble brain to process. Thank you for giving me that limit, and encouraging me to ask the question again if I really, really want a sixth answer.
One more thing: you can’t see the answers until all five have been posted, so you may end up posting the exact same answer someone else already gave, and you won’t know until the question gets its five answers. Another bummer is that if your question doesn’t get answered, it gets deleted, and you have to come back and ask again. Seems like a pain.
I will give them credit for Map Answers – neat idea, and it was implemented well. They do Video Answers (as Answerbag does), but for some reason don’t do Image Answers. And, of course, they let you embed Amazon product listings in your answers, if you happen to be eager to help a struggling ecommerce company hawk their wares.
I could go on, but I am far too biased to make this post sound objective at all. I have nothing against Amazon as a company and in general I like their stuff and I’ve even found inspiration from various features of their main site. But this time…c’mon guys – let’s innovate a little!