Smorball tackles a major challenge for digital libraries: poor output from Optical Character Recognition (OCR) software significantly hampers full-text searching of digitized material. When first scanned, the pages of digitized books and journals are merely image files, which makes them unsearchable and virtually unusable. While OCR converts page images to searchable, machine-encoded text, historic literature is difficult for OCR to render accurately because its fonts, typesetting and layouts vary so widely.

This educational game enables citizen scientists to engage in “purposeful gaming”: Smorball asks players to correctly type the words they see on the screen, punctuation and all. The phrases it presents come from scanned pages held by cultural heritage institutions. After much verification, the words players type are sent to the libraries that store the corresponding pages, allowing those pages to be searched and data-mined and ultimately making historic literature more usable for institutions, scholars, educators and the public.

Source: Smorball
