The power of collective research, task-based investigations and swarm intelligence

In January 2007 a well known computer scientist named Jim Gray was lost at sea off the California coast on his way to the Farallon islands.

It was a moment that many will remember either because Jim Gray was a big influence personally or professionally or because the method of the search for him was a real eye opener about the power of the Internet.  It was a group task-based investigation of epic proportions using the latest and greatest technology of the day.

I didn’t know him, but I will never forget what happened.  Not only did the Coast Guard’s air and surface search cover 40,000 square miles, but a distributed army of 12,000 people scanned NASA satellite imagery covering 30,000 square miles.  We all used Amazon’s Mechanical Turk to flip through tiles looking for a boat that would’ve been about 6 pixels in size.

They attacked the search in some phenomenal ways.  Here is Werner Vogel’s public call for help. You can also go back and read the daily search logs posted by his friends on the blog here.  Both Wired and the New York Times covered this incredible drama in detail.

Since then we’ve seen the Internet come to the rescue or at least try to make a difference using similar crowdmapping techniques.  Perhaps the most powerful example is the role crisis mappers and the Ushahidi platform played in the major Haiti earthquake in 2010.

But it’s not just crisis where these technologies are serving a public good.  We’ve seen these swarming techniques applied in a range of ways for journalism and many other activities on the Internet.

Perhaps the gold standard for collective investigative reporting is the MPs Expenses experiment by Simon Willison at the Guardian where 170,000 documents were reviewed by 15,000 people in the first 80 hours after it went live.  The Guardian has deployed its readers to uncover truth in a range of different stories, most recently with the Privatised Public Spaces story.  We’ve also looked at crowdmapping broadband speeds across the UK, and Joanna Geary’s ‘Tracking the Trackers‘ project uncovered some fascinating data about the worst web browser cookie abusers.

Last year Germany’s defense minister Karl-Theodor zu Guttenberg, a man once considered destined for an even larger role in the government, was forced to resign from his post as a result of allegations that he plagiarized his doctoral thesis.  It was proved to be true by a group of people working collectively on the investigation using a site called GuttenPlag Wiki.

ProPublica is a real pioneer in collective reporting and data journalism.  For example, their 2010 investigation into which politicians were given Super Bowl tickets provided a wonderful window into the investigative process.  And the Stimulus Spotcheck project invited people to assess whether or not the 2009 stimulus package in the US was in fact having an impact.

Also, Kevin Anderson reminded me of http://www.ipaidabribe.com tracking local corruption and http://oilreporter.org/ which came out of the Gulf of Mexico Oil Spill in 2010 and helps people report wildlife damage, share photos, etc.

Of course, swarming projects can have a range of different intentions, and if one were to try and count them I would bet only a small percentage are high impact journalistic endeavors.

Andy Baio is a pioneer in this kind of concept and has either been the curator of data already in existence or the inspiration for a crowdsourced investigation.  For example, his “Girl Turk” collective research uncovered an exhaustive list of artist and track names sampled for Girl Talk’s Feed the Animals album.

The big advertising brands intuitively understand the power of swarming intelligence, too, as they see it as a way to use their loyal customers to help them acquire new customers or to at least build a stronger direct relationship with a large group of people.  This is essentially the pitch once used by MySpace and adopted by Facebook, Twitter and Google +…Step 1: create a brand page where people can congregate, Step 2: inspire people to do something collectively that spreads virally.

The technologies that make these group tasks possible are getting easier and more accessible all the time. The wiki format works great for some projects.  DocumentCloud is a tremendous platform.   Google Docs are providing a lot of power for collective investigations, as we’ve discovered several times on the Guardian’s Datablog. And, of course, crowdmapping can be done with little technical intervention using Ushahidi and n0tice.

Of course, you can’t discount the power of the social networks as distribution platforms and amplifiers for group-based investigations.  Creating the space for swarming activity is one thing, but getting the word out is a role that Facebook and Twitter are very good at playing.  It’s a perfect marriage, in many ways.

An army of helpers may be accessible in other ways, too.

Amanda Michel who famously drove the Off The Bus campaign at HuffPo (more on that below) produced a guide to “Using Amazon’s Mechanical Turk for Data Projects” while at ProPublica where she describes how they hired workers to complete short, simple tasks.

But I imagine that the next wave of activity will arise as some of the human patterns of group tasks inspire more sustainable technology platforms.  As Martin Kotynek and ‘PlagDoc’ acknowledge in their wonderful report “Swarm of thoughts” there’s a need for some sort of centralized research platform so this kind of activity is easier to trigger and run with.

Perhaps it’s a matter of identifying a few very specific collective research concepts that work and fueling ongoing community activity around those ideas.  Citizen journalism, for example, is an obvious activity where communities are forming.

CNN’s iReport has a ready-built citizen journalist network incentivized by exposure on cnn.com, and the n0tice platform can enable citizen-powered crowdmapping activity for a range of different projects and get exposure and distribution across different platforms.  Both are capable of serving an ongoing role as useful every-day citizen journalism services that can crank up the volume on a particular issue when the appropriate moment arises.

Platforms can create some ongoing momentum, but so can issues.

Off The Bus was an 18-month HuffPo initiative where readers and staff covered the US elections collaboratively from their own communities. The project had the additional benefit of generating insights that turned into larger editorial investigations such as the Superdelegate Investigation, a report on the Evangelical Vote and the Political Campaign HQ crowdmapping project.  Ryan Tate’s book The 20% Doctrine goes into some detail about Off The Bus, how it developed, and how Amanda managed it all.

I suspect that a whole class of swarming intelligence projects is starting to bubble up that may only appear when the human story, the technology, and the amplifier join up and create a perfect storm.

In the end, it comes down to projects that resonate with people on a personal level.

Though Jim Gray was never found, the thinking about how to conduct the search amongst the leaders of the crowd at the time could not have been more cogent.  The instructions for participants were inspiring, detailing a simple task and the result of completing it:

You will be presented with 5 images. The task is to indicate any satellite images which contain any foreign objects in the water that may resemble Jim’s sailboat or parts of a boat. Jim’s sailboat will show up as a regular object with sharp edges, white or nearly white, about 10 pixels long and 4 pixels wide in the image. If in doubt, be conservative and mark the image. Marked images will be sent to a team of specialists who will determine if they contain information on the whereabouts of Jim Gray. Friends and family of Jim Gray would like to thank you for helping them with this cause.

It’s conceivable that the most important thing social media has accomplished over the last 3-5 years is that it has unlocked the natural desire people have to impact what’s happening in the world in a way they may not have felt empowered to do for decades.

Now it’s simply a matter of joining up the technologies in ways that enable those ideas to come to life.


 

A List of Collective Investigations

Below are some of the projects mentioned above and several others that have been sent to me.  I’ve included a few things that aren’t journalism investigations that are worth a closer look simply because they can be instructive.

Tenacious SearchSince January 28, the San Francisco police, the Coast Guard and Jim’s friends and family have conducted an extensive search to find him and his sailboat, Tenacious, off the California coast. I want to summarize the status of that search here, so that the broad volunteer community that’s done so much knows where we stand.

Embedly Powered

Crisis mapping brings online tool to Haitian disaster relief effortPatrick Meier learned about the earthquakes at 7 p.m. Tuesday while he was watching the news in Boston. By 7:20, he’d contacted a colleague in Atlanta. By 7:40, the two were mobilizing an online tool created by a Kenyan lawyer in South Africa. By 8, they were gathering intelligence from everyplace,…

Embedly Powered

The Brian Lehrer Show – Are You Being Gouged?Our latest “crowdsourcing” project asks listeners to go to their local grocery store and find out the price of three goods: milk, lettuce and beer. You don’t have to buy them (or consume them), but we want to know how much they cost in different neighborhoods throughout the New York area.

Embedly Powered

via Wnyc
Investigate your MP’s expensesWe have 458,832 pages of documents. 33,105 of you have reviewed 226,139 of them. Only 232,693 to go… Start reviewing Please read our privacy policy to find out how we use your data. You must also read our terms of service.

Embedly Powered

Privately owned public space: where are they and who owns them?We’re in the middle of a creeping privatisation of public space. Streets and open spaces are being defined as private land after redevelopment. It began with Canary Wharf but is now a standard feature of urban regeneration. In future, one of the biggest public squares in Europe – Granary square, in the new development around Kings Cross – will be privately owned.

Embedly Powered

Broadband Britain: how fast is your connection?With your help, the Guardian is creating an up-to-date broadband map of Britain, showing advertised versus real speeds. We want to highlight the best and worst-served communities, and bring attention to the broadband blackspots.

Embedly Powered

Tracking the trackers: help us reveal the unseen world of cookiesCookies and other web trackers monitor our online behaviour and store our browsing habits, but who are the companies behind them and what are they doing with our data? We have teamed up with Mozilla to try to find out.

Embedly Powered

GuttenPlag WikiAchtung: Dies sind keine Initiativen von GuttenPlag Dies ist eine kollaborative Dokumentation der Plagiate – jeder ist eingeladen, hier mitzuarbeiten. Ergänzungen und Änderungen in diesem Wiki sind transparent und jederzeit nachvollziehbar. Jede Bearbeitung wird protokolliert. Siehe: Letzte Änderungen (ohne Diskussionsbeiträge) Guttenbergs Dissertation und die Plagiatsvorwürfe wurden seit dem 16.

Embedly Powered

via Wikia
ProPublica’s Super Bowl Blitz: Which Congressmen Are Getting Super Bowl Perks?Carson, André (D)(202) 225-4011 7th IN Don’t call [ Sebastian Jones, ProPublica ] Don’t call [ Sebastian Jones, ProPublica ] Awaiting reply [ Kathleen McLaughlin, Indianapolis Business Journal | Congressional staff Glendal Jones, press secretary, Feb 3, 2010 ] Delahunt, Bill (D)(202) 225-3111 10th MA Staff doesn’t know.

Embedly Powered

I Paid a Bribe | Uncover the market price of corruption in Indiaipaidabribe: Share your story on bribes and corruption. Read latest news on corruption in Indian bureaucracy and civic agencies. Read corruption and bribery related stories from all over India

Embedly Powered

Deepwater Oil Reporter Crowdsourcing PlatformHere are a few things we think you need to know before joining this open data sharing initiative. Please read before you proceed. Know that all data reported on Oil Reporter is PUBLIC. If you don’t want to share information with the public, Oil Reporter isn’t for you.

Embedly Powered

Girl Turk: Mechanical Turk Meets Girl Talk’s “Feed the Animals” – Waxy.orgGirl Talk’s Feed the Animals is one of my favorite albums this year, a hyperactive mish-mash sampling hundreds of songs from the last 45 years of popular music. Gregg Gillis created a beautiful, illegal mess of copyright clearance hell, which you should download immediately.

Embedly Powered

via Waxy
HuffPost Launches OffTheBus Citizen Journalism Project Ahead of 2012 ElectionsWASHINGTON — If you are like most people, you don’t much like the way the “national media” cover politics. As a long-time member of the Washington press corps, I agree with you. We can be trivial, shortsighted, credulous, ideologically blinkered and timid — on a good day.

Embedly Powered

Netflix Prize: HomeThe Netflix Prize sought to substantially improve the accuracy of predictions about how much someone is going to enjoy a movie based on their movie preferences. On September 21, 2009 we awarded the $1M Grand Prize to team “BellKor’s Pragmatic Chaos”. Read about their algorithm, checkout team scores on the Leaderboard, and join the discussions on the Forum.

Embedly Powered

HerdictWeb : AboutAbout Us Herdict is a project of the Berkman Center for Internet & Society at Harvard University. Herdict is a portmanteau of ‘herd’ and ‘verdict’ and seeks to show the verdict of the users (the herd). Herdict Web seeks to gain insight into what users around the world are experiencing in terms of web accessibility; or in other words, determine the herdict.

Embedly Powered

The High Price of Creating Free Ads – New York TimesFrom an advertiser’s perspective, it sounds so easy: invite the public to create commercials for your brand, hold a contest to pick the best one and sit back while average Americans do the creative work. But look at the videos H. J. Heinz is getting on YouTube.

Embedly Powered

SpotCrime Crime MapArrest Arson Assault Burglary Robbery Shooting Theft Vandalism Other Loading Crime Data… City and county crime map showing crime incident data down to neighborhood crime activity. Subscribe for crime alerts and reports.

Embedly Powered

The Peer to Patent Project – Community Patent ReviewThe Community Patent Review: Peer to Patent project On June 15, 2007, the United States Patent and Trademark Office (USPTO) opened the patent examination process for online public participation for the first time.

Embedly Powered

via Nyls
Prize4LifeOur mission is to accelerate the discovery of treatments and a cure for ALS by using powerful incentives to attract new people and drive innovation. We know that the solutions to some of the biggest challenges in ALS research will require out-of-the-box thinking, and some of the most critical discoveries may come from unlikely places.

Embedly Powered

FixMyStreetHow to report a problem Enter a nearby GB postcode, or street name and area Locate the problem on a map of the area Enter details of the problem We send it to the council on your behalf 1,616 reports in past week 2,529 fixed in past month 204,852 updates on reports

Embedly Powered

Reporting Recipe: Using Amazon’s Mechanical Turk for Data ProjectsOf all of journalism’s recent evolutions, data-driven reporting is one of the most celebrated. But as much as we should toast data’s powers, we must acknowledge its cost: Assembling even a small dataset can require hours of tedious work, deterring even the most disciplined of journalists and their editors.

Embedly Powered

HuffPost’s OffTheBus Superdelegate InvestigationWe asked HuffPost readers to join with us and profile the hundreds of superdelegates who are likely to decide the Democratic nomination for president. Hundreds of you responded and we can now present our initial findings. Just click on a state or territory and a list of superdelegate profiles, as compiled by our citizen journalists, will pop up.

Embedly Powered

The Political Campaign HQ Next Door: OffTheBus Special Ops PhotographsWhere are the state campaign headquarters located, exactly, for the party that claims to represent Main Street? Where are they located for the party that claims to represent everyone? Thanks to the work of HuffPost OffTheBus Special Ops, you can visit offices around the nation in just a few key strokes.

Embedly Powered

Introducing Stimulus Spot CheckJuly 20, 2009: This post has been corrected. It’s the middle of July and we’re all wondering whether the stimulus is working. If we do as the administration has advised, we should remain patient – and let the administration measure its own success.

Embedly Powered

WNYC – Mapping the Storm Clean-upWe’ve been asking readers and listeners to let us know if their streets have been plowed. Here are maps from Tuesday, Wednesday and Thursday (white balloons represent unplowed streets, blue plowed). Click the balloons for full information and voice messages where available. Submit yours by texting PLOW to 30644.

Embedly Powered

via Wnyc
How Do You Feel About the Economy? – Interactive Feature – NYTimes.comEnter the word that best describes your current mood. You can submit a response once a day.

Embedly Powered

Adjunct ProjectThe Project The Adjunct Project exists for the growing number of graduate degree holders who are unemployed and underemployed. Many of these highly educated and passionate people are being forced to take jobs dramatically below their achievement and earning potential.

Embedly Powered

The Scrapbook – POPS Report: Tell Us About New York City’s Privately-Owned Public SpacesListen: Project Intro from October 19th // Listen: Wrap-Up from November 9th // WNYC’s Brian Leher Show and The New York World are collaborating on a project to map and report on New York City’s Privately-Owned Public Spaces, aka POPS. We want to figure out how public these public spaces really are.

Embedly Powered

via Wnyc