Announcing Wikirank: Tracking what's popular on Wikipedia
A few months ago, the four of us at Small Batch Inc. were kicking around ideas for what we should build next. We had just launched the Twitter Election site -- a fun real-time data visualization project -- and wanted to keep building things that help people make sense of the enormous amount of information that bombards us each day. Late one night soon after, I followed a link to a repository of Wikipedia server logs. There were gigabytes of data sitting there just begging to be visualized. We got to work.
The result is Wikirank, a tool for exploring what's popular on Wikipedia, discovering comparisons between topics, and sharing them with the world. We launched it yesterday, happily coinciding with the 14th anniversary of Ward Cunningham's invention of the wiki.
There are a bunch of reasons why I think Wikirank is cool, but my favorite is how it helps people find stories in the data. One of the great things about the web is how measuring tiny behaviors reveals patterns that tell stories. The data we get from Wikipedia is no different; as we started playing around with the numbers, we saw loads of interesting shapes emerge in the charts.
For example, big news stories show up as dramatic spikes where there was no data before. When the astounding story broke that an airliner had made an emergency landing in the Hudson, a page was created on Wikipedia within minutes. Over the next two days, that page was one of the most popular on the site.
The shape takes a different angle as we watch the marketing buzz and fan excitement build towards the release of the Watchmen movie earlier this month.
Comparisons are where Wikirank really shines, however. The weekly viewing habits of television watchers come into clear view -- as does the day on which each show airs -- when we compare Heroes to Lost. (It's also my guess that the occasionally perplexing plotlines of both of those shows lead a fair number of people to Wikipedia to find out what the heck just happened.)
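To make the weekly pattern concrete, here is a minimal sketch of how one might find the weekday on which a page's views peak. It is not how Wikirank itself works; the function names and the view counts below are hypothetical, made up purely for illustration.

```python
from collections import defaultdict
from datetime import date, timedelta

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def average_views_by_weekday(daily_views):
    """Average a {date: view_count} mapping by weekday name."""
    totals = defaultdict(int)
    counts = defaultdict(int)
    for day, views in daily_views.items():
        name = WEEKDAYS[day.weekday()]  # date.weekday(): Monday is 0
        totals[name] += views
        counts[name] += 1
    return {name: totals[name] / counts[name] for name in totals}


def peak_weekday(daily_views):
    """Return the weekday name with the highest average views."""
    averages = average_views_by_weekday(daily_views)
    return max(averages, key=averages.get)


# Hypothetical data: four weeks of daily views with a bump every Monday,
# standing in for a show that airs on Monday nights.
start = date(2009, 3, 2)  # a Monday
hypothetical_views = {
    start + timedelta(days=i): 50_000 if i % 7 == 0 else 10_000
    for i in range(28)
}

print(peak_weekday(hypothetical_views))
```

Running the same function over two pages' daily counts would show whether their peaks fall on different days, which is the kind of comparison the charts make visible at a glance.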
I'm really pleased with how this project came out. I'll write more about the technology behind the project in a followup post, but the reality is this launch represents a pretty big shift in how we build web apps. With only a couple of developers and a rack of rented machines in the cloud, we pulled this all together in just a few weeks. That simply wasn't possible the last time we did this.
The team behind all of this includes my long-time friends and business partners Bryan Mason, Ryan Carver, and Greg Veen. We were also extremely fortunate to be able to work with Dan Cederholm of Simplebits, who helped us with visual design and identity. Dan rocks.
We'd love to know what you think. You can join the discussion at Get Satisfaction.
This entry was written by Jeffrey Veen and posted 26 March 2009 at 10:04 AM. It was filed under Personal, Technology.
congrats, pleasure to assist!
This. Is. So. Cool. Seriously, as someone with a fascination with the crowd consciousness I may have just lost a few hours of my life. Thanks to you and your team for doing this :)
Very polished launch, and cool cache of data to mine.
Thanks Domas!
I was going to save this for the next post on the technology we used, but I do want to say how grateful we are to you for making this data available and hosting on your site.
For those of you who don't know, Domas Mituzas is a board member for the Wikimedia Foundation and a database system guru. He started dumping logs from the Wikipedia proxy servers to his site about a year ago as a public service. None of this would have been possible without this generous service.
Man, I'm on all your channels bugging you about this, but I'm really curious if you're planning on releasing the graph tools to the public! The graphs are great, and there are way too few good graphing libraries out there.
Besides that, and I've said it before, Wikirank is damn awesome. :)
congrats on the launch, jeff!
wikiRank wikiRocks. Congrats on the launch.
Can you do this for wikiHow too? I'm the founder of wikiHow. Email me if I can help you do this.
Love data visualization and think you guys are great evangelists and pioneers in this field. I have been a huge fan over the years. Two comments. 1) I felt compelled to put two topics in the initial search to be compared, comma separated, with quotes for multi-word combinations. Is that a future consideration? 2) When the list of matches appears below the search term or comparison you enter, I feel it would be beneficial to see the number of views for each of those terms before you commit to picking one, to determine if it is the best choice for your comparison or even if it is worth comparing. For example, comparing Twitter, which is above 1 million views, to Facebook, which had less than a few thousand. Hate that I have to blindly add it first, see that it plots as a straight line, and then remove it.
Keep up the great work! Thanks for keeping us inspired.
Jeff, very nicely done.
Have you considered doing something similar for Wikipedia EDIT data rather than just views? Could be a nice view into the community beyond the opaque recent changes.
http://en.wikipedia.org/wiki/Special:RecentChanges
Ed Chi from PARC has done some work here with Wikidashboard, but I think there's a lot more that could be discovered with more/better data visualizations.
http://wikidashboard.parc.com/
Micah
Congratulations guys... awesome work..
This is very, very cool Jeff - I love the concept, I love the data visualisation and the design; what's even more cool is the time it took to turn it around from an idea in your head to launch. Very impressive!
You mention that you'll follow this up with a post on the technology behind it, but it would be great if you could talk about the WikiRank UX approach and any user research/testing you may have done, given the short timeframe and your exhaustive legacy of work involving users at its core.
I very much look forward to the next Small Batch project.
Jeff, congrats on the new site launch. Looks great! Cool functionality.
I admired your work with MeasureMap and Google Analytics, and this new work is no exception. I have a possible project that would be of very similar interest to these types of sites. However, it relates more closely to the private sector than the public sector, but has some great potential. By private sector, I mean not working for a company or contracted by a company, but catering to customers of a company.
If you are interested, please let me know, I can share some more information.
@Henrik - Thanks for being persistent. We built the graphs on top of the Primer drawing package.
http://github.com/mojombo/primer/blob/master/README
It gives us an ActionScript-like abstraction for drawing primitives. The actual line graphs we did ourselves. I'm not sure if we'll release it or not, but we're certainly thinking about it. Problem with releasing code like this is that you need to support it and, well, that's time consuming. There are only four of us...
@Greg - yeah, I'm not satisfied with how search works either. We're using the Wikipedia search API and the results it returns aren't really ordered in any meaningful way. Not only that, the results show an aggregation of what people tried to search for, rather than what they ended up finding useful. So it's filled with misspellings and inappropriate capitalizations. We're working on an alternative.
@Miles - that's a good idea. I'll write about that too.
@Jeff: I understand your concern relating to support, but putting something out there doesn't necessarily mean you *have to* support it. In the GitHub-days with forking and source code collaboration, if it's good, it'll live on!
WikiRank looks great. I really like the snippets from the Wikipedia page that are included below the graph. I look forward to reading about the technical details involved. I made a tool some months ago that looks at the revision history of Wikipedia articles over time. It might be of interest to some of the readers - http://sergionunes.com/p/wikichanges/