Journalist Guide To Crowdsourced Data Collection With Amazon Mechanical Turk
If you are looking for a cost-effective solution to offload any amount of tedious, time-consuming tasks to a qualified, on-demand human workforce, Amazon Mechanical Turk, or one of its new international competitors, may be your best choice. These services not only help you match your task to an always-on distributed workforce, but they also save you tons of money, costing a fraction of what a professional editor or data input person would normally cost in Europe or the US.
Photo credit: Xiao Fang Hu
This new breed of online services, which leverage qualified freelance workers in fast-developing countries, can indeed provide great time and cost savings to many web publishers needing to sift, verify, collect, input or validate large sets of data.
The key idea behind these crowdsourced workforce services is to flexibly provide qualified workers for tasks that a human being can do much more effectively than a computer, such as identifying objects in a photo or video, filling in a spreadsheet, writing short technical reviews, transcribing video clips or verifying specific bits of information online.
While tasks like these have traditionally been carried out by hiring a temporary workforce (which is time-consuming, expensive and difficult to scale), these "marketplaces of human intelligence" have been designed to create much larger pools of workers that can be hired on demand and just in time.
The most popular of these services is Amazon Mechanical Turk, which has been in existence since 2005.
Amazon created the service for people who had shelved certain projects because the cost of hiring a skilled individual outweighed the value of completing the project. By creating a marketplace where business owners can scale their costs to match their needs precisely, Mechanical Turk has proved to be an effective service, and it is still running today.
In this guide from ProPublica, republished here with full permission from the authors, you will find a step-by-step journalist's guide on how to use Mechanical Turk to its fullest.
While the guide is written specifically with the Amazon service in mind, you can certainly transfer its insights and "best practices" to other situations and to some of the new and emerging competing services. The key limitation of Mechanical Turk, in fact, is that the Amazon service is available only in the US.
If you live outside of the US, you may want to browse through this small map in which I have started collecting online service alternatives to Mechanical Turk, which can be used by any individual or company outside of the US.
Here is the Journalist Guide to crowdsourced data collection with Mechanical Turk:
ProPublica's Guide to Mechanical Turk
by Srinivas Rao and Amanda Michel
Amazon Mechanical Turk - or mTurk - is an online marketplace, set up by the online shopping site Amazon, where anyone can hire workers to complete short, simple tasks over the Internet.
Amazon originally developed it as an in-house tool, and commercialized it in 2005.
This is a guide for journalists looking to use Mechanical Turk in their data projects. It's meant for users who are already familiar with mTurk and are looking for ways to improve their results.
Readers who are new to Mechanical Turk should start with Amazon's mTurk Resource Center.
1) The Basics
Before you start putting projects up, there are some fundamentals to get right.
a) Get the Lingo Down
Mechanical Turk is great for delegating lots of simple, well-defined requests, which Amazon refers to as "Human Intelligence Tasks," or "HITs."
People who complete tasks are called "workers" and those who post them are "requesters."
In this guide, the word "project" refers to a batch of similar tasks. "Spammers" are workers who try to make money without actually doing the work by answering with random data.
b) Make Sure Your Project is Suitable for mTurk
- Can you easily deconstruct your project into tasks that can be completed independently? Use mTurk.
- Do these tasks require specialized knowledge? Don't use mTurk.
- Are the tasks simple and quick to finish? Use mTurk.
- Are there multiple correct answers for each task? Don't use mTurk.
For example, ProPublica uses mTurk to help figure out which companies are getting stimulus money for our Recovery Tracker.
We get the data from the government, but in at least 400 cases last quarter, the data identified some companies only by their DUNS number instead of the company name. And because the online DUNS database uses a CAPTCHA to prevent scraping, we used mTurk workers to find company names by their DUNS numbers.
Each task is simple; a worker plugs a DUNS number into a field on the DUNS website and then copies and pastes relevant company information into fields in mTurk.
c) Be Careful! Mechanical Turk Is a Public Site
Amazon doesn't allow search engines to index its content, nor does it maintain a public archive of tasks or projects.
However, third party programs can still collect information about the projects you're running, and workers are under no obligation to keep their work secret.
To date, we've run all of our projects under the ProPublica name because it makes our work more interesting for potential workers.
If you're uncomfortable with someone accessing your work, and if your ethics rules are OK with it, you can run sensitive work under an assumed name.
Or, you can follow our lead and make it hard for anyone to scrape your work by allowing only approved workers to see your HITs (just use Amazon's built-in qualification tests to screen workers based on accuracy rate, location and experience).
2) Ready, Set, Turk
mTurk is a big marketplace, and what attracts workers' attention is:
- project type,
- work quantity,
- pay, and
- your organization's reputation.
Here are some tips for when you're ready to assign mTurk workers to a project:
a) Design Your HITs so They're Hits
Think through the best approach to designing your projects and describing your tasks.
Do you have one batch of tasks or are there several to be done in sequence?
What information do you need to provide so people can complete a task?
We've found it's best to keep tasks as simple as possible.
You'll want to experiment with breaking down projects into different kinds of tasks to see which ones work best for you.
b) Write Crystal-Clear and Detailed Instructions
Workers won't be able to ask you questions as they complete your tasks, so you need to make sure your instructions are well-written and precise.
Any ambiguity will increase your error rate.
It will also frustrate workers because they'll make unnecessary mistakes.
We recommend giving workers as much information about projects as possible, including how you'll review their work.
Newsrooms and nonprofits should underscore their mission, as many workers respond favorably to mission-driven work.
Always clue workers into your project's goals so they're better positioned to troubleshoot any problems.
Don't be the judge of your own instructions: Ask someone who knows little about your subject to complete a task before you launch your project. Amanda's favorite test subject is her sister, who has no problem voicing complaints.
c) Error-Proof Your Project Beforehand
Here are a few best practices to minimize the chances that your projects will yield useless results:
I) Use Redundancy
You should never trust the response of a single worker. Instead, delegate each task to at least two but sometimes as many as five workers.
The number of workers you should use depends on the task's difficulty.
At ProPublica we often delegate each task to four workers and trust responses when three or more of the workers agree.
Our use of several workers for each task is based on Professor Ipeirotis' research on the subject.
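The three-of-four rule described above can be sketched in a few lines of Python. This is only an illustration, not ProPublica's actual code: the function names, the `min_agree` parameter and the choice to ignore differences in case and spacing before comparing answers are all assumptions.

```python
from collections import Counter


def normalize(answer):
    """Collapse whitespace and ignore case (an assumed cleanup step)."""
    return " ".join(answer.split()).casefold()


def consensus(answers, min_agree=3):
    """Return the normalized answer given by at least `min_agree` workers.

    Returns None when no answer reaches the threshold, which means the
    task needs a manual look or another round of workers.
    """
    if not answers:
        return None
    counts = Counter(normalize(a) for a in answers)
    top_answer, votes = counts.most_common(1)[0]
    return top_answer if votes >= min_agree else None
```

For example, `consensus(["Acme Corp", "acme corp", "ACME  Corp", "Acme Inc"])` returns `"acme corp"` because three of the four workers agree, while a two-against-two split returns `None`.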
II) Troubleshoot Any Complications
...a worker may face and adjust accordingly.
For instance, for our DUNS project, we originally forgot to plan for situations in which the DUNS number could not be found in the database. This led to workers entering responses like "error" and "no zip code listed" into the "company name" field. The next time we ran this task, we gave workers the option to note that a DUNS number returned an error.
III) Price Your HITs Appropriately
Pricing on mTurk is more art than science, unfortunately.
Pay too little, and workers won't complete your tasks.
Pay too much and you'll attract more spammers.
At ProPublica, we pay between 1 and 10 cents per task.
Professor Ipeirotis recently surveyed tasks on mTurk and found that 90 percent of them were priced within this range.
Requesters can also reward workers with bonuses after the project has been completed. But don't expect the promise of a bonus to work in your favor.
Workers rarely trust new requesters who guarantee bonuses for good work.
IV) Use "Code Words"
...to ward off spammers.
Spammers love multiple choice questions because they can complete tasks quickly by clicking randomly. Avoid this problem by asking workers to obtain an arbitrary piece of data on the Internet, even if you won't use it. You can use this data to verify that workers actually completed the task and didn't simply provide answers at random.
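As a rough sketch of how such a check might look once the results are in, the Python below compares each submitted code word with a value you looked up yourself beforehand. The field names (`task_id`, `worker_id`, `code_word`) and the `expected` lookup are hypothetical and would need to match your own results file.

```python
def failed_code_word(submissions, expected):
    """Return the sorted IDs of workers whose code word was wrong at least once.

    `submissions` is a list of dicts with hypothetical keys "task_id",
    "worker_id" and "code_word"; `expected` maps each task ID to the
    value you verified yourself.
    """
    flagged = set()
    for sub in submissions:
        want = expected[sub["task_id"]].strip().casefold()
        got = sub["code_word"].strip().casefold()
        if got != want:
            flagged.add(sub["worker_id"])
    return sorted(flagged)
```

A flagged worker is not necessarily a spammer, so it is worth reading a few of their answers before deciding anything.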
V) Spot-Check Results
Before we launch a project on mTurk, we try to complete 50 or so tasks on our own, and then compare the results to mTurk's.
The first time we ran a project on mTurk, we ran an old data set we'd completed ourselves and then compared the results.
The good news: In 524 cases, data from mTurk matched our own results - and mTurk actually caught 10 errors we made in-house.
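A comparison like this is easy to automate. The sketch below assumes you have both result sets as dictionaries keyed by task ID; the function name and the decision to ignore case and surrounding spaces are illustrative assumptions, not a description of how ProPublica ran its check.

```python
def compare_results(in_house, turk):
    """Split the task IDs present in both sets into matches and mismatches."""
    matches, mismatches = [], []
    for task_id in sorted(in_house.keys() & turk.keys()):
        ours = in_house[task_id].strip().casefold()
        theirs = turk[task_id].strip().casefold()
        if ours == theirs:
            matches.append(task_id)
        else:
            mismatches.append(task_id)
    return matches, mismatches
```

Mismatches deserve a manual review rather than an automatic verdict: as noted above, the error can just as easily be on the in-house side.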
VI) Be Good to Workers
Workers like requesters who treat them with respect.
Develop a reputation for prompt payment, clear instructions and fair judgment, and you will attract high-quality workers who will do quick and accurate work.
VII) Get to Know Your Workers
...and give them a heads-up when you post to mTurk.
mTurk assigns every worker a persistent ID, but doesn't show you their name or provide contact information. We've worked around this by adding a text field to our HITs that reads, "If you would be interested in receiving email updates about upcoming HITs from ProPublica, enter your email address here. Don't worry, we will only send you messages related to Amazon Mechanical Turk."
Our plan is to send our best workers an e-mail alert before uploading a new project.
VIII) Test Massive Projects
Before posting a massive project, post a batch of 100-200 HITs as a test. Testing has helped us catch possible complications and errors. A $3 test can save you from running $100 of flawed results.
You can see some of these tips in practice in our DUNS project.
In the HIT template, we showed workers a DUNS Number, and asked them to retrieve the corresponding company's name and ZIP code.
We assigned each task to four workers to ensure accuracy.
Since the tasks involved copying text, an arbitrary code word was not necessary.
We also asked workers to note when a DUNS number was not listed in the database, or if a company did not have a U.S. ZIP code.
We told workers that we would reject their work only if there was evidence of cheating, and we gave them an opportunity to sign up for ProPublica's mTurk team.
We paid workers one cent per task. This is on the low side of our pay range, as we underestimated the time it takes to run DUNS numbers through the database.
We'll pay three cents next time.
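To see what these choices mean for a budget, multiply tasks by workers by price. The sketch below does the arithmetic in whole cents and covers worker payments only; Amazon's commission is not stated in this guide, so it is left out. The task count of 401 is the figure the guide gives for the DUNS project, and the function name is made up for illustration.

```python
def worker_payments_cents(num_tasks, workers_per_task, cents_per_assignment):
    """Total paid to workers, in cents, excluding any Amazon fees."""
    return num_tasks * workers_per_task * cents_per_assignment


# DUNS project: 401 tasks, four workers per task.
duns_at_1_cent = worker_payments_cents(401, 4, 1)   # 1604 cents = $16.04
duns_at_3_cents = worker_payments_cents(401, 4, 3)  # 4812 cents = $48.12
```

Working in integer cents avoids the rounding surprises that come with floating-point dollar amounts.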
Uploading Your Project To mTurk
When you're actually filling in the details of your HITs, there are some good tips to keep in mind: Don't worry too much about your task's expiration date or the "time allotment per HIT."
Workers can't search for HITs based on either of these factors, so they don't generally have an effect on the speed or accuracy of your project.
Our recommendation: Just keep the default values.
Always use your company name as a keyword. Add a few more that describe your work. We use "data," "fast," and "easy" a lot.
Professor Ipeirotis has assembled a list of the top keywords.
We recommend you allow only workers with a 95 percent accuracy rate or higher to view your HITs.
If your tasks require knowledge of American culture or slang, limit your project to American workers.
Believe it or not, the bigger your job, the faster it will complete.
If your project includes more than 200 HITs and is simple, we've found that it will almost always finish within 12 hours.
Smaller projects take longer - in one case, a job with 18 HITs took four days to complete and a similar job with 155 HITs finished in less than an hour.
Run your project over a weekend or at night.
We've found that projects run over the weekend finish faster (very few HITs are loaded up on the weekend).
If you want to run your projects during the week, launch them in the evening, U.S. Eastern Time, when American workers are getting home from their day jobs, and the Indian workday is just beginning.
We published our DUNS project on a Friday afternoon at 5:30 p.m., and all 401 tasks were completed within six hours.
Higher-paying work doesn't get prioritized.
People favor simple tasks and large projects. One-cent jobs often complete faster than two- or three-cent jobs. That doesn't mean that under-priced HITs attract quality workers; workers tend to search for easy tasks to complete, and lower-priced tasks tend to be easier.
Project Completed, Now The Review Begins
Unless you ran your project over the weekend, don't let more than a day pass before you review the results. Make sure you examine the data closely.
If there are a lot of HITs in which workers disagreed about the correct answer, you should check if one worker was responsible for all the errors.
You can also check for spammers by calculating whether one worker completed tasks much faster than others.
You can see how we analyzed the results of our DUNS project.
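Both checks can be roughed out in Python. What follows is a sketch under assumptions, not ProPublica's own analysis: each assignment is taken to be a dict with hypothetical keys `task_id`, `worker_id`, `answer` and `seconds`, and the 0.5 speed threshold is an arbitrary starting point to tune against your own data.

```python
from collections import Counter, defaultdict
from statistics import median


def flag_fast_workers(assignments, ratio=0.5):
    """Flag workers whose median time per task is below ratio * overall median."""
    if not assignments:
        return []
    per_worker = defaultdict(list)
    for a in assignments:
        per_worker[a["worker_id"]].append(a["seconds"])
    overall = median(a["seconds"] for a in assignments)
    return sorted(
        worker for worker, times in per_worker.items()
        if median(times) < ratio * overall
    )


def dissent_counts(assignments):
    """Count how often each worker disagreed with a task's most common answer.

    When answers are tied, the one submitted first is treated as the
    most common, so tied tasks still need a manual look.
    """
    by_task = defaultdict(list)
    for a in assignments:
        by_task[a["task_id"]].append(a)
    dissent = Counter()
    for rows in by_task.values():
        top_answer, _ = Counter(r["answer"] for r in rows).most_common(1)[0]
        for r in rows:
            if r["answer"] != top_answer:
                dissent[r["worker_id"]] += 1
    return dissent
```

A worker who appears in both lists is a strong candidate for a closer look, but neither signal alone is proof of spam.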
Pay workers promptly.
mTurk workers are paid only once you've approved their work.
You should pay workers within one business day of posting your project.
If there is no evidence of spam, payment should happen immediately.
While Mechanical Turk gives you the ability to reject submissions and refuse payment to workers, we highly recommend against doing so unless there is clear evidence that workers are providing spam responses.
Many projects won't accept workers who have had less than 95 percent of their work accepted, so workers are understandably protective of their accuracy rates.
Rejecting work that was completed with good intentions is not only unfair, it will also get your company a reputation for being inhospitable to workers.
About Srinivas Rao
Srinivas Rao has been employed by ProPublica since March 2010. He graduated from George Washington University in 2008 with a degree in political communication.
About Amanda Michel
Amanda Michel is the director of distributed reporting at ProPublica. She recently directed HuffPost's OffTheBus, a citizen journalism site. She has also worked as an online organizer and strategist for political campaigns, including the Dean and Kerry Internet teams.
Image of Amanda Michel - Lars Klove
Other Images - Jpsdk
Reference: ProPublica