Friday, December 02, 2011

BOINCing Angry Birds for BHL: Purposeful Gaming in Digital Libraries

We've been talking about the subject of crowdsourcing and gaming for the Biodiversity Heritage Library for several months now, but as it is outside any BHL project funding, no one has taken on the challenge. During last month's Life & Literature Conference I led the Technology breakout session, and the subject of gaming came up again, but this time with the caveat that it should have a purpose - as in, creating a game out of BHL content that works to enhance the data contained within BHL, like defining article boundaries, setting page types (e.g., map, illustration, text, blank), rekeying scientific names, etc.

I think the purposeful gamification of BHL would be a huge opportunity to make BHL an even richer resource. Other digital libraries are taking this approach. The National Library of Finland leads the pack with its "DigitalKoot" project, which features different Facebook apps that have players rekey suspect text in OCR via games that, in one example, build bridges for moles to find love. I've played them. They're kind of fun. I've ended up killing a lot of lonely moles because I don't have the right characters on my US keyboard.

You could imagine BHL putting its OCR into the same games for improvement. But I actually want something else - I want offline games. DigitalKoot runs through Facebook, so I have to be connected to play. I travel internationally, so I spend a lot of down time on planes and I turn off data roaming on my iPhone whenever I reach my destination because it's incredibly expensive. Who the heck wants to spend hundreds of euros rekeying Finnish OCR? But if I had an offline game, one that didn't require me to always be connected to the Internet, I could spend LOTS of time doing this kind of purposeful gaming.

I was asked to expand on this idea a bit today while in Brussels for a BHL-Europe meeting. I was finally able to put it together in a way that made sense for people - I want a game like Angry Birds that I can play while I'm standing in a long grocery store line, or flying, or on a train...basically any time that I have some spare cycles.

And when I described this as "spare cycles" it brought me full circle to a project we did in 2005 called SciLINC ("Scientific Literature Indexing on Networked Computers", my best acronym ever, thank you very much). SciLINC was a project that used the BOINC framework, which is open-source software for volunteer computing and grid computing, to find scientific names in literature, preceding our work with TaxonFinder. The BOINC platform grew out of the SETI@Home project, where users downloaded software that ran as a screensaver and used the spare cycles of an unused computer to crunch through radio waves looking for signs of extraterrestrial life, and then reported the results back to a main server when the computing job was done and when it had a live network connection.

In the SciLINC project we packaged up OCR text and sent it to volunteer computers along with an algorithm for finding scientific names. It was a wonderful demonstration, but we ran out of jobs in about 2 days because text indexing isn't processor intensive; the best BOINC projects take small inputs that require lots of computing resources. Lessons learned (final report here, & appendix). In doing the project I gained an understanding of how jobs are packaged for distribution to volunteers and the kinds of inputs and outputs that are successful in an asynchronous computing environment.
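To make the packaging idea concrete, here is a minimal sketch of how OCR text might be grouped into small work units. This is not the SciLINC code and does not use the BOINC API; the function name `make_work_units`, the `max_bytes` limit, and the page identifiers are all hypothetical, chosen only for illustration.

```python
def make_work_units(pages, max_bytes=2000):
    """Group consecutive OCR pages into work units of at most max_bytes.

    `pages` is a list of (page_id, text) pairs. Both the function and the
    size limit are illustrative assumptions, not part of BOINC or SciLINC.
    A single page larger than max_bytes still gets a unit of its own.
    """
    units, current, size = [], [], 0
    for page_id, text in pages:
        page_size = len(text.encode("utf-8"))
        # Close the current unit if adding this page would exceed the limit.
        if current and size + page_size > max_bytes:
            units.append(current)
            current, size = [], 0
        current.append((page_id, text))
        size += page_size
    if current:
        units.append(current)
    return units


# Hypothetical pages: three blocks of OCR text of different sizes.
pages = [("p1", "a" * 1500), ("p2", "b" * 1000), ("p3", "c" * 400)]
units = make_work_units(pages)
print([[page_id for page_id, _ in unit] for unit in units])
```

Small units like these are cheap to ship to volunteers, but as noted above, they only keep volunteers busy if each one takes real effort to process.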

So to bring this all together, I want a gaming system that improves the metadata in BHL. I want it to be asynchronous and offline so that I can play the game using my own spare cycles (time & brainpower) whenever & wherever and then upload my results to the "game master" when I'm next connected. To do that requires that the system sends me packages of data that don't wreak havoc on a mobile data plan (small inputs), but that give me enough tasks to work on with all of my spare cycles on planes. The tasks have to result in improvements in BHL, not just throwing birds at pigs. And it has to be fun. And it should probably have a killer soundtrack (Tetris, anyone?).
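The download, play offline, upload later loop described above could be sketched roughly as follows. Everything here is an assumption made for illustration: the `Task` and `OfflineQueue` names, the JSON package layout, and the `upload` callback standing in for the "game master" are hypothetical, and no real BHL service is involved.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Task:
    task_id: str   # hypothetical identifier assigned by the "game master"
    kind: str      # e.g. "page_type" or "rekey_name" (illustrative only)
    payload: dict  # small input, such as a page reference or OCR snippet


@dataclass
class OfflineQueue:
    tasks: list = field(default_factory=list)    # downloaded, not yet played
    results: list = field(default_factory=list)  # played, not yet uploaded

    def load_package(self, package_json):
        """Store a small downloaded package of tasks for later offline play."""
        package = json.loads(package_json)
        new_tasks = [Task(**t) for t in package["tasks"]]
        self.tasks.extend(new_tasks)
        return len(new_tasks)

    def next_task(self):
        """Return the next unplayed task, or None if the queue is empty."""
        return self.tasks[0] if self.tasks else None

    def answer(self, task_id, answer):
        """Record an answer locally; no network connection is needed."""
        self.tasks = [t for t in self.tasks if t.task_id != task_id]
        self.results.append({"task_id": task_id, "answer": answer})

    def sync(self, upload, online):
        """Upload pending results if connected; otherwise keep them queued."""
        if not online or not self.results:
            return 0
        upload(json.dumps(self.results))
        sent = len(self.results)
        self.results.clear()
        return sent


# Usage: download once, play on the plane, upload after landing.
queue = OfflineQueue()
queue.load_package(json.dumps({"tasks": [
    {"task_id": "t1", "kind": "page_type", "payload": {"page": "p1"}},
]}))
queue.answer("t1", "map")
uploaded = []
queue.sync(uploaded.append, online=False)  # still offline: nothing is sent
queue.sync(uploaded.append, online=True)   # connected: results are uploaded
```

Keeping answers in a local list until a connection is available is what lets the game run on spare cycles without touching a mobile data plan.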

Who's with me?

***Update Feb 24, 2012*** This post has generated a fair amount of interest since its original posting date, leading BHL to develop a public wiki page that has some example tasks/data challenges we think are candidates for gamification. That page is online at http://biodivlib.wikispaces.com/BHL+and+Gaming

***Update Sep 26, 2013*** It was announced today that BHL, through Missouri Botanical Garden, received a National Leadership Grant for Libraries from the Institute of Museum and Library Services for a project to develop & evaluate purposeful gaming techniques for crowdsourced metadata enhancement.  Kudos to the team for operationalizing these ideas!! 

7 comments:

Roderic Page said...

Couple of quick comments.

The Finnish example is cool. I guess the assumption is that it's easier to get people involved if they are disconnected from the goal of the underlying text and are just interested in solving the problem in isolation. reCAPTCHA is a variation on this theme - I'm correcting OCR text somewhere but I never see the results of that effort. I wonder whether in the case of BHL you actually want users to see the results of what they are doing?

The other comment is the words "mobile" and "offline" made me think of CouchDB, and then this project by Patrick Heneise (see http://patrick.heneise.de/my-master-thesis/a-modern-approach-to-the-transcription-of-vintage-literature-using-mobile-technology-and-cloud-services and https://github.com/LIACS/Transcribe, I've added this thesis to the BHL group on Mendeley). This project would be a great place to start if someone was serious about developing a tool that you could use offline but then sync when you had a connection again.

Tom Gilissen said...

Sounds like a great idea Chris. You could also consider creating a game that can be played online (multiplayer) as well as offline (single player vs phone).

Two games I can think of that might serve as an example for your game: wordfeud and connect four. In both games you should use a mix of images of characters: characters the OCR software is sure about and characters it isn't. The usage of the unknown characters in the game then gives some extra clue to what they actually are. In a wordfeud-like game each player should of course get different kinds of characters in order to be able to form words. In a connect four-like game each player gets only one character. Say one gets only a's and the other only d's. The players are supposed to only use the correct characters; the wrong ones are then singled out.

FabulousLadyB said...

The popularity of Words with Friends http://www.wordswithfriends.com/ makes me think that there is good potential for some kind of OCR correction game...

Chris Freeland said...

@Rod, I do think gaming concepts for BHL will work better if the player knows how they're improving BHL, or where the improvements fit in. I get a sense of satisfaction after paginating a book, knowing that now more illustrations can be found. It's not a game, just a meaningful, useful task. Also, thanks for pointing us to Patrick - have already gotten our friends in BHL-Europe to meet him this week at a conference they're both at, so great timing!

@Tom & @Bianca - Yes, I think online & multiplayer options will also be needed for a comprehensive suite of games, like your examples, but since those are the ones that most people consider when doing games for digital libraries I wanted to first float the idea of an offline game with my specific use case. I like the Connect Four concept, one of my fave games as a kid. We also thought about a Memory-style matching game using BHL illustrations, but not sure how to make it purposeful. And as for Words with Friends, yep, love it. I play Scrabble as much as Angry Birds, and I play Scrabble for the same reason described here - don't always have a connection, so can't always play Words with Friends.

Lots of great ideas! Hoping we can take action on them.

Roderic Page said...

@Chris I suspect people's motivations will vary widely. Some people love playing games for games' sake; I hardly play games because (a) I'm not good at them, and (b) I can't escape the feeling that if I'm in front of a computer I want the world to be changed (even just a minuscule amount) at the end of it.

Maybe tasks could be created that combine gaming with meaning. For example, we could have a list of names for which we haven't found the original description due to OCR problems, and the game is to find the first occurrence of that name. The game could be pitched as "discovering species", but in this case in a zoo of papers, not in the real world. Or it could be done as "find the missing papers", where we have a taxonomist who has "lost" his publications and we want to find them. Or perhaps we could do geography and make the game about decoding past geographic place names, reconstructing a "lost voyage" or expedition.

Roderic Page said...

The latest Wired UK has an article on Reid Hoffman (LinkedIn) where he mentions that adding a "profile completeness" bar to LinkedIn resulted in an increase of 20% in the profile length (the amount of stuff you tell LinkedIn about yourself). This is an aspect of gamification (see http://blogs.imediaconnection.com/blog/2011/08/03/15-brand-examples-of-gamification/ for other examples) that is perhaps less ambitious than pure games, but which might be an intermediate step towards engaging users with BHL content. Stack Overflow is a great example of a reputation system inspired by game badges; maybe we could develop something similar for BHL?

Stew said...

That idea is one for the books, mate. If you can come up with something as perennial and popular as Angry Birds that at the same time has a rich historical concept and knowledge to bring forth, it will be a feat that all of us in this digital preservation management industry can certainly be proud of.