I apologize for the silence lately, but I’ve been cranking pretty hard on my new project, Degree3 Q&A. It’s a social Q&A system that sites can quickly and easily integrate into their site, helping visitors find answers more easily than using comments or forums. I’ll be talking more about it later, but for now, if you’re interested in trying it out, drop me a line or apply for our Private Beta at Degree3.com. In the meantime, try it out on this page, just to the right of this post, and let me know what you think!
When someone writes for a publication like MIT’s Technology Review, they have an obligation to write articles that are objective and scientifically sound. To represent a brand like MIT, they have to observe the standards of review journalism such as creating measurable comparison criteria, applying those standards consistently, and giving consistent, even-handed treatment to their subjects. However, Wade Roush of Tech Review last month ignored all of these rules in his article What’s the Best Q&A Site?
Perhaps Roush’s objective was to take a light-hearted look at the Social Q&A space, and he was therefore lax in his editorial rigor, but if that’s the case, his review should have been published on a blog somewhere, not on Tech Review, and it should have had the appropriate disclaimers. When MIT Tech Review publishes an article with hard numbers comparing websites, that review becomes gospel for the hordes of other sites that reference it, so it had better be accurate.
I will show here why that article never should have been published by the MIT Technology Review.
Before we begin, here’s my big, fat disclaimer: I founded Answerbag, one of the subjects of the review, so I’m hardly unbiased. However, to be thorough, I will show here that Roush’s reviews of all of the QnA sites (not just mine) were cursory, unbalanced, and inaccurate.
I’ll start with Roush’s description of his test:
I also devised a diabolically difficult, two-part test. First, I searched each site’s archive for existing answers to the question “Is there any truth to the five-second rule?” (I meant the rule about not eating food after it’s been on the floor for more than five seconds, not the basketball rule about holding.) Second, I posted the same two original questions at each site: “Why did the Mormons settle in Utah?” and “What is the best way to make a grilled cheese sandwich?” The first question called for factual, historical answers, while the second simply invited people to share their favorite sandwich-making methods and recipes.
The five-second rule in basketball actually has nothing to do with holding, but I’ll let that one go.
I awarded each site up to three points for the richness and originality of its features, and up to three points for the quality of the answers to my three questions, for a total of 12 possible points.
Okay, so here’s the fun part: three points for “richness” and “originality” of its features – these seem like awfully vague terms to use, but it’s only 3 points out of a total 12, so I suppose it’s not that big a deal. He finishes by awarding up to 3 points for answers to each of his three questions, based on the quality of the answer. That seems fair.
Since his review starts with my site, Answerbag, I’ll dig into his review of Answerbag first.
Members get points for asking and answering questions as well as for rating other members’ questions and answers.
Somewhat true. Members can receive points for their questions and answers from other members, but they do not automatically receive points (as they do on some other Q&A systems, a difference that is key to the quality of the answers). Also, members do not receive points for rating other members’ Q&A. But hey, he says he only had a few days to write this review, so I can’t expect him to get all the details.
He goes on to talk about our Levels system and our Widgets, but never discusses the core differences between Answerbag and other sites, such as the fact that we never close our questions to new answers or ratings on our answers. We also get no credit for pioneering video answers or image answers. We get no credit for allowing our users to sort the questions in a category by popularity so they can easily learn about a topic like “PlayStation 2 troubleshooting” or “foreclosures.” No credit for a unique profile page that lets users see all of the latest activity on the site that involves them (answers to their questions, ratings on their answers, comments on their answers, etc.) No credit for giving more power to users who have earned a lot of points on the site. Then he gives us 1 point out of 3, apparently for not having “rich” and “original” features. Ouch. Well, it’s a subjective measure, so I’ll let it go.
Is there any truth to the five-second rule? All of AnswerBag’s answers about the five-second rule pertained to basketball. Points: 0
Busted. But then, to our answer for why the Mormons settled in Utah, he has this to say:
That’s more or less in line with the best answers to this question at other sites. Points: 1
So…you’re telling me that for an answer that was just as good as the best answers on other sites, we got 1 point out of 3? Is that fair in the review guidelines set forth by MIT’s Tech Review?
To his question on “What’s the best way to make a grilled cheese sandwich?”, he says “I rated the answers to this question purely according to their mouthwateringness.” Very scientific. He received six answers, and gave Answerbag 2 points out of 3 with no explanation whatsoever. I guess they just weren’t mouthwatering enough to get the full 3 points.
Next up is Askville, which gets 3 points for features – he liked the mechanics of the site. Fair enough, since it was established as a subjective measure.
0/3 points for having no answers to the “five-second rule” question. Bummer.
On the Mormon question, Roush cites a couple of the answers, and then gives 2 points out of 3, with no explanation of what was wrong with the answer, or what a correct answer is.
On the grilled cheese question, he got a couple answers that “provided just the basics” as well as some that were taken from “grilledcheese-contest.com” but complains they were “not very original.” Perhaps if he wanted people to submit their own personal, original grilled cheese recipes, he should have asked for that. Poor Askville only got 1 point out of 3 here thanks to Roush not asking for what he wanted.
For a QnA system that is virtually identical to Yahoo’s, Askville’s, or Yedda’s, MSN gets 1 point out of 3 for features from Roush. He complains that as users earn points for their answers on the site, they can move up in levels but “there are no other rewards.” This was his only criticism, and Yahoo’s point system works virtually the same way. Still, MS only scores 1 point from Roush. Maybe he thinks Windows crashes too much.
2/3 points for answers about the “five second rule.” No explanation of what the answers were missing to keep them from getting 3/3.
2/3 points for answers to the Mormon question, again with no explanation of what’s wrong with them.
For the grilled-cheese question, he got three answers, including one he says “sounded delicious,” but I guess it didn’t sound delicious enough. 2/3 points.
Roush must have been approaching his deadline. He gave an overview of Wondir’s features, and then gave the site 1/3 points with no explanation at all of what he didn’t like about it.
Five second rule: 0/3 points because it was very hard to find answers. Fair enough.
Mormons settling: He got “six answers, 3 of which were useful.” No explanation of what was missing. 2/3 points.
This one is great: for the grilled cheese question, he changed his question (for no apparent reason) to ask how to make a healthy grilled cheese sandwich. He got answers he considered “brief and obvious” like using wheat bread, lowfat cheese, and margarine. I have no clue what he would have considered a good answer. Those answers are the best things I could think of to make a healthy grilled-cheese sandwich; if you’re cooking something with bread, cheese, and butter, what else can you do? Yet, Roush was not satisfied, and this answer got 1/3 points. Brutal.
Roush starts by explaining how Y!Answers has similar features to other sites. He praises their “My QnA” page, but doesn’t acknowledge that almost every other site in the comparison has a similar page. Then he seems to get excited:
One fun twist: users can choose and customize their own cartoon self-portraits, which appear alongside their questions and answers and give the site a surprisingly jaunty feel.
How much is that worth? 3/3 points. THREE POINTS. For an avatar system that anyone who’s used Yahoo (particularly Yahoo Games or Yahoo Instant Messenger) knows has been a feature of Yahoo (not Yahoo Answers) for several years. Somehow their avatars added so much to the experience that he thought Yahoo had the BEST FEATURES of any QnA site in the review (tied with Askville). Perhaps it’s his first time using the web?
Five second rule: This question had been asked 11 times and had 160 answers. It seems that the time it took him to find a good answer under the 11 iterations of the same question was not a hindrance – after reading through the 160 answers (yawn), he apparently liked one that referenced a show from the Discovery Channel as a source. 3/3 points. Alright.
Mormons settling in Utah: Again, Roush inexplicably changed the wording of his question, adding a bit about “why didn’t they go on to California or someplace more desolate.” (Rule #1 of doing experiments: perform the SAME experiment on every subject, Wade.) He got one “offensive” answer, four “cursory” answers, and one he considered good enough for a 2/3, again with no explanation.
Grilled cheese sandwich: This time he just searched for existing answers, giving no word on why he didn’t do this on other services, or why it was good enough to look at existing answers on Yahoo rather than testing how quickly he could get answers to a new question as he had done on other sites. He seemed to be pleased that he had to look at over 100 copies of the same question to get a recipe. Something in there must have sounded yummy to him, but he doesn’t tell us much besides the fact that he found thousands of answers. Yahoo Search returns millions of results to many queries, but that doesn’t make it the best search. Roush gives Y!Answers 3/3 points regardless.
His evaluation of Yedda’s features culminates in “In practice, however, I couldn’t see much difference between the answers at Yedda and those at Live QnA, Yahoo Answers, or the other sites.” For being the same as those sites, Yedda only gets a 2/3. No further explanation offered.
Five second rule: Nothing. 0/3.
Mormons: One answer that didn’t address the question. He’s nice and gives Yedda a 1/3 anyway. How generous.
Grilled-cheese: He got one answer that focused on the cheese used in the sandwich. 1/3 points.
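For reference, here is a quick tally of the scores exactly as I’ve quoted them above, written as a small Python sketch. The site labels are my reading of which part of the review each score belongs to, and the dictionary layout and variable names are just an illustration, not anything from Roush’s article.

```python
# Scores as quoted in this post, in the order:
# (features, five-second rule, Mormons in Utah, grilled cheese).
# Each component is out of 3, so the maximum is 12.
scores = {
    "Answerbag": (1, 0, 1, 2),
    "Askville": (3, 0, 2, 1),
    "Live QnA": (1, 2, 2, 2),
    "Wondir": (1, 0, 2, 1),
    "Yahoo Answers": (3, 3, 2, 3),
    "Yedda": (2, 0, 1, 1),
}

MAX_PER_COMPONENT = 3
max_total = MAX_PER_COMPONENT * 4  # one features score plus three questions

# Sum the four components for each site, then list sites from highest to lowest.
totals = {site: sum(parts) for site, parts in scores.items()}
ranking = sorted(totals.items(), key=lambda item: item[1], reverse=True)

for site, total in ranking:
    print(f"{site}: {total}/{max_total}")
```

The totals are only as meaningful as the individual scores that go into them, which is the whole problem.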
How this article got through MIT Tech Review’s editors is beyond me. Time and time again Roush gives random points to these sites without even saying what he liked or didn’t like about them. He never even established the “correct” answer for the five-second rule or why the Mormons settled in Utah, so we, his readers, have no basis for evaluating the answers he received. At the very least, he could have told us what was wrong with the wrong answers.
Here’s what a good review would look like: come up with a set of real, measurable criteria, such as question answering time, accuracy of answers, and intuitiveness. For certain types of questions that are looking for a variety of answers (such as grilled-cheese recipes), the number of answers received may be a useful metric. It might be nice to see a feature comparison chart of the sites in question, as well as a discussion of why those features are important. You could, perhaps, give one point per useful feature out of a fixed set of features.
Most importantly, apply these measurable standards even-handedly to every subject, and explain why they receive or don’t receive points. These are the basics.
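To make that concrete, here is a rough Python sketch of what such a rubric might look like. Everything in it is hypothetical: the feature checklist, the 24-hour cutoff, the field names, and the example site are my own placeholders, and intuitiveness is left out because it would need its own measurement (user testing, for instance). The point is only that every score comes with a stated reason.

```python
from dataclasses import dataclass

# Hypothetical fixed checklist; a real review would justify each entry.
FEATURE_CHECKLIST = (
    "video answers",
    "image answers",
    "sort by popularity",
    "activity page",
)

@dataclass
class Observation:
    site: str
    features: set                 # features the reviewer actually found
    hours_to_first_answer: float  # time until the first answer arrived
    accurate_answers: int         # answers matching the stated correct answer
    total_answers: int

def score(obs):
    """Return (criterion, points, reason) rows so every point is explained."""
    rows = []
    # One point per feature from the fixed checklist.
    for feature in FEATURE_CHECKLIST:
        has_it = feature in obs.features
        rows.append((feature, 1 if has_it else 0, "present" if has_it else "missing"))
    # The 24-hour cutoff is an assumed placeholder, not taken from any review.
    fast = obs.hours_to_first_answer <= 24
    rows.append(("answer time", 1 if fast else 0,
                 f"first answer after {obs.hours_to_first_answer}h"))
    # Accuracy scaled to 0-3 points using integer arithmetic.
    points = (3 * obs.accurate_answers) // obs.total_answers if obs.total_answers else 0
    rows.append(("accuracy", points,
                 f"{obs.accurate_answers}/{obs.total_answers} accurate"))
    return rows

example = Observation("ExampleSite", {"video answers", "image answers"}, 5.0, 2, 4)
rows = score(example)
for criterion, points, reason in rows:
    print(f"{criterion}: {points} ({reason})")
```

Run the same function on every site with the same questions, and readers can see exactly where each point came from.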
Am I upset just because Yahoo won? No. I’m upset because the review was cursory and didn’t do justice to any of the sites involved. Maybe it was meant to be fun, to be a “jaunty” exploration of Social Q&A, but is Tech Review really the right forum for that? I’m not going to pretend I’m unbiased in the matter, but Roush’s review should not have been represented as a measure of The Best Q&A Site by a well-respected publication.