Description

Textonic is an open-source web interface for processing text-based data with Amazon Mechanical Turk (MTurk).

(See the Background page for more about Mechanical Turk.)

Textonic aims to make new kinds of data-gathering campaigns possible. It is intended for tagging and categorizing SMS messages sent from mobile phones to a centralized database, and it lets the user configure the automated submission of Human Intelligence Tasks (HITs) to MTurk. Workers on MTurk are then shown these messages and can choose one or more user-defined tags that they think apply to each message. In this way the user can quickly and cheaply generate custom metadata for each message.

Textonic was conceived as an extension to UNICEF’s RapidSMS software for use in Africa. Specific use cases include:

  • Moderation of story submissions related to the Young Africa Summit – MTurk workers could judge if a particular story submission was relevant and/or appropriate. If a majority of workers approved a particular story, it could automatically be posted to a website without requiring action from a UNICEF worker.
  • Monitoring of famine conditions – simple SMS surveys could ask participants questions such as “When was the last time you ate?” MTurk workers could then categorize the natural-language responses into programmatically defined buckets. Responses of “just after dawn,” “right after I woke up” and “before I went to town for food” could all be determined by the MTurk worker to mean “today,” for example, while it would be very difficult to write an automated parser that could achieve the same result. This would also eliminate the need to train respondents beforehand about format requirements for the SMS messages.
  • Determining a user’s location – current SMS campaigns rely on complex systems of location codes for specifying where messages are coming from. MTurk workers could determine the location expressed in an SMS message at a variety of scales (neighborhood, city, region, Internally Displaced Persons camp) without requiring a specific message format.
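
The majority-approval rule mentioned in the story-moderation use case can be illustrated with a minimal Python sketch. This is not Textonic's actual code: the function name `majority_tag`, the `min_share` threshold, and the sample message IDs and tags are all assumptions made for illustration. It shows one plausible way to reduce several workers' tag choices for a message to a single decision.

```python
from collections import Counter


def majority_tag(votes, min_share=0.5):
    """Return the tag chosen by more than `min_share` of workers, else None.

    `votes` is a list with one tag per worker (hypothetical data shape).
    """
    if not votes:
        return None
    # most_common(1) gives the single most frequent tag and its count.
    tag, count = Counter(votes).most_common(1)[0]
    return tag if count / len(votes) > min_share else None


# Hypothetical worker responses, keyed by message ID.
votes_by_message = {
    "msg-1": ["approve", "approve", "reject"],
    "msg-2": ["approve", "reject"],
}

# A message with no strict majority (such as a tie) maps to None,
# which could be routed to a human administrator instead.
decisions = {msg: majority_tag(v) for msg, v in votes_by_message.items()}
```

The same tally works for the famine-monitoring and location scenarios: the tags would simply be buckets such as "today" or place names rather than approve/reject.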

Textonic’s flexible tag-based structure will allow users to leverage MTurk for these scenarios and many others. The goal of this project is to build a robust web interface to allow users to configure the algorithmic submission of these Human Intelligence Tasks to MTurk without additional software development. In addition, Textonic will make campaigns to gather data from the field more failure-tolerant, because administrators will have more options for processing the data that has already been gathered.

The hope is that Textonic can give nonprofits and other organizations an alternative to paid commercial tools. Current commercial tools include:

  • Smartsheet, an online group collaboration spreadsheet tool with optional integration with MTurk
  • Dolores Labs, a company that offers software and consulting services
  • HIT-Builder, a stand-alone tool for managing MTurk workers