The power of collective research, task-based investigations and swarm intelligence

In January 2007 a well-known computer scientist named Jim Gray was lost at sea off the California coast on his way to the Farallon Islands.

It was a moment that many will remember either because Jim Gray was a big influence personally or professionally or because the method of the search for him was a real eye opener about the power of the Internet.  It was a group task-based investigation of epic proportions using the latest and greatest technology of the day.

I didn’t know him, but I will never forget what happened.  Not only did the Coast Guard’s air and surface search cover 40,000 square miles, but a distributed army of 12,000 people scanned NASA satellite imagery covering 30,000 square miles.  We all used Amazon’s Mechanical Turk to flip through tiles looking for a boat that would’ve been about 6 pixels in size.

They attacked the search in some phenomenal ways.  Here is Werner Vogels’s public call for help. You can also go back and read the daily search logs posted by Jim’s friends on the blog here.  Both Wired and the New York Times covered this incredible drama in detail.

Since then we’ve seen the Internet come to the rescue or at least try to make a difference using similar crowdmapping techniques.  Perhaps the most powerful example is the role crisis mappers and the Ushahidi platform played in the major Haiti earthquake in 2010.

But it’s not just crisis where these technologies are serving a public good.  We’ve seen these swarming techniques applied in a range of ways for journalism and many other activities on the Internet.

Perhaps the gold standard for collective investigative reporting is the MPs Expenses experiment by Simon Willison at the Guardian, where 170,000 documents were reviewed by 15,000 people in the first 80 hours after it went live.  The Guardian has deployed its readers to uncover truth in a range of different stories, most recently with the Privatised Public Spaces story.  We’ve also looked at crowdmapping broadband speeds across the UK, and Joanna Geary’s ‘Tracking the Trackers’ project uncovered some fascinating data about the worst web browser cookie abusers.

Last year Germany’s defense minister Karl-Theodor zu Guttenberg, a man once considered destined for an even larger role in the government, was forced to resign from his post as a result of allegations that he plagiarized his doctoral thesis.  It was proved to be true by a group of people working collectively on the investigation using a site called GuttenPlag Wiki.

ProPublica is a real pioneer in collective reporting and data journalism.  For example, their 2010 investigation into which politicians were given Super Bowl tickets provided a wonderful window into the investigative process.  And the Stimulus Spotcheck project invited people to assess whether or not the 2009 stimulus package in the US was in fact having an impact.

Also, Kevin Anderson reminded me of I Paid a Bribe, which tracks local corruption in India, and Oil Reporter, which came out of the Gulf of Mexico oil spill in 2010 and helps people report wildlife damage, share photos, etc.

Of course, swarming projects can have a range of different intentions, and if one were to try and count them I would bet only a small percentage are high impact journalistic endeavors.

Andy Baio is a pioneer in this kind of concept and has either been the curator of data already in existence or the inspiration for a crowdsourced investigation.  For example, his “Girl Turk” collective research uncovered an exhaustive list of artist and track names sampled for Girl Talk’s Feed the Animals album.

The big advertising brands intuitively understand the power of swarming intelligence, too, as they see it as a way to use their loyal customers to help them acquire new customers or to at least build a stronger direct relationship with a large group of people.  This is essentially the pitch once used by MySpace and adopted by Facebook, Twitter and Google+.  Step 1: create a brand page where people can congregate.  Step 2: inspire people to do something collectively that spreads virally.

The technologies that make these group tasks possible are getting easier and more accessible all the time. The wiki format works great for some projects.  DocumentCloud is a tremendous platform.   Google Docs are providing a lot of power for collective investigations, as we’ve discovered several times on the Guardian’s Datablog. And, of course, crowdmapping can be done with little technical intervention using Ushahidi and n0tice.

Of course, you can’t discount the power of the social networks as distribution platforms and amplifiers for group-based investigations.  Creating the space for swarming activity is one thing, but getting the word out is a role that Facebook and Twitter are very good at playing.  It’s a perfect marriage, in many ways.

An army of helpers may be accessible in other ways, too.

Amanda Michel, who famously drove the Off The Bus campaign at HuffPo (more on that below), produced a guide to “Using Amazon’s Mechanical Turk for Data Projects” while at ProPublica, in which she describes how they hired workers to complete short, simple tasks.

But I imagine that the next wave of activity will arise as some of the human patterns of group tasks inspire more sustainable technology platforms.  As Martin Kotynek and ‘PlagDoc’ acknowledge in their wonderful report “Swarm of thoughts,” there’s a need for some sort of centralized research platform so that this kind of activity is easier to trigger and run with.

Perhaps it’s a matter of identifying a few very specific collective research concepts that work and fueling ongoing community activity around those ideas.  Citizen journalism, for example, is an obvious activity where communities are forming.

CNN’s iReport has a ready-built citizen journalist network incentivized by exposure on CNN, and the n0tice platform can enable citizen-powered crowdmapping activity for a range of different projects and get exposure and distribution across different platforms.  Both are capable of serving an ongoing role as useful every-day citizen journalism services that can crank up the volume on a particular issue when the appropriate moment arises.

Platforms can create some ongoing momentum, but so can issues.

Off The Bus was an 18-month HuffPo initiative where readers and staff covered the US elections collaboratively from their own communities. The project had the additional benefit of generating insights that turned into larger editorial investigations such as the Superdelegate Investigation, a report on the Evangelical Vote and the Political Campaign HQ crowdmapping project.  Ryan Tate’s book The 20% Doctrine goes into some detail about Off The Bus, how it developed, and how Amanda managed it all.

I suspect that a whole class of swarming intelligence projects is starting to bubble up that may only appear when the human story, the technology, and the amplifier join up and create a perfect storm.

In the end, it comes down to projects that resonate with people on a personal level.

Though Jim Gray was never found, the thinking about how to conduct the search amongst the leaders of the crowd at the time could not have been more cogent.  The instructions for participants were inspiring, detailing a simple task and the result of completing it:

You will be presented with 5 images. The task is to indicate any satellite images which contain any foreign objects in the water that may resemble Jim’s sailboat or parts of a boat. Jim’s sailboat will show up as a regular object with sharp edges, white or nearly white, about 10 pixels long and 4 pixels wide in the image. If in doubt, be conservative and mark the image. Marked images will be sent to a team of specialists who will determine if they contain information on the whereabouts of Jim Gray. Friends and family of Jim Gray would like to thank you for helping them with this cause.
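
To make the shape of that task concrete, here is a minimal Python sketch of the workflow the instructions describe: tiles handed out five at a time, anything not clearly empty gets marked, and marked tiles are passed on to specialists. The function names, the three verdict labels and the tile IDs are all hypothetical, made up for illustration; this is not the code the search team actually ran.

```python
from typing import Callable, List

BATCH_SIZE = 5  # the instructions present 5 images per task


def make_batches(tile_ids: List[str], size: int = BATCH_SIZE) -> List[List[str]]:
    """Split the full list of tiles into small tasks of `size` images each."""
    return [tile_ids[i:i + size] for i in range(0, len(tile_ids), size)]


def flag_for_specialists(
    batches: List[List[str]],
    review: Callable[[str], str],
) -> List[str]:
    """Collect every tile a volunteer did not judge clearly empty.

    `review` is a stand-in for a human volunteer and returns one of the
    hypothetical labels "clear", "unsure" or "object". Following the rule
    "if in doubt, be conservative and mark the image", both "unsure" and
    "object" are escalated.
    """
    flagged = []
    for batch in batches:
        for tile in batch:
            if review(tile) != "clear":
                flagged.append(tile)
    return flagged


if __name__ == "__main__":
    tiles = [f"tile-{n}" for n in range(12)]
    # Pretend verdicts from a volunteer; every other tile is "clear".
    verdicts = {"tile-3": "object", "tile-7": "unsure"}
    marked = flag_for_specialists(
        make_batches(tiles), lambda t: verdicts.get(t, "clear")
    )
    print(marked)
```

The point of the design is that each volunteer only ever makes a tiny, low-stakes decision, and the expensive expert judgment is reserved for the small set of tiles that survive the first pass.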

It’s conceivable that the most important thing social media has accomplished over the last 3-5 years is that it has unlocked the natural desire people have to impact what’s happening in the world in a way they may not have felt empowered to do for decades.

Now it’s simply a matter of joining up the technologies in ways that enable those ideas to come to life.


A List of Collective Investigations

Below are some of the projects mentioned above and several others that have been sent to me.  I’ve included a few things that aren’t journalism investigations that are worth a closer look simply because they can be instructive.

Tenacious Search
Since January 28, the San Francisco police, the Coast Guard and Jim’s friends and family have conducted an extensive search to find him and his sailboat, Tenacious, off the California coast. I want to summarize the status of that search here, so that the broad volunteer community that’s done so much knows where we stand.

Crisis mapping brings online tool to Haitian disaster relief effort
Patrick Meier learned about the earthquakes at 7 p.m. Tuesday while he was watching the news in Boston. By 7:20, he’d contacted a colleague in Atlanta. By 7:40, the two were mobilizing an online tool created by a Kenyan lawyer in South Africa. By 8, they were gathering intelligence from everyplace,…

The Brian Lehrer Show – Are You Being Gouged?
Our latest “crowdsourcing” project asks listeners to go to their local grocery store and find out the price of three goods: milk, lettuce and beer. You don’t have to buy them (or consume them), but we want to know how much they cost in different neighborhoods throughout the New York area.

Investigate your MP’s expenses
We have 458,832 pages of documents. 33,105 of you have reviewed 226,139 of them. Only 232,693 to go… Start reviewing. Please read our privacy policy to find out how we use your data. You must also read our terms of service.

Privately owned public space: where are they and who owns them?
We’re in the middle of a creeping privatisation of public space. Streets and open spaces are being defined as private land after redevelopment. It began with Canary Wharf but is now a standard feature of urban regeneration. In future, one of the biggest public squares in Europe – Granary square, in the new development around Kings Cross – will be privately owned.

Broadband Britain: how fast is your connection?
With your help, the Guardian is creating an up-to-date broadband map of Britain, showing advertised versus real speeds. We want to highlight the best and worst-served communities, and bring attention to the broadband blackspots.

Tracking the trackers: help us reveal the unseen world of cookies
Cookies and other web trackers monitor our online behaviour and store our browsing habits, but who are the companies behind them and what are they doing with our data? We have teamed up with Mozilla to try to find out.

GuttenPlag Wiki
Note: These are not GuttenPlag initiatives. This is a collaborative documentation of the plagiarism; everyone is invited to contribute here. Additions and changes in this wiki are transparent and can be traced at any time. Every edit is logged. See: Recent changes (excluding discussion posts). Guttenberg’s dissertation and the plagiarism allegations have, since the 16th…

ProPublica’s Super Bowl Blitz: Which Congressmen Are Getting Super Bowl Perks?
Carson, André (D)(202) 225-4011 7th IN Don’t call [ Sebastian Jones, ProPublica ] Don’t call [ Sebastian Jones, ProPublica ] Awaiting reply [ Kathleen McLaughlin, Indianapolis Business Journal | Congressional staff Glendal Jones, press secretary, Feb 3, 2010 ] Delahunt, Bill (D)(202) 225-3111 10th MA Staff doesn’t know.

I Paid a Bribe | Uncover the market price of corruption in India
ipaidabribe: Share your story on bribes and corruption. Read latest news on corruption in Indian bureaucracy and civic agencies. Read corruption and bribery related stories from all over India.

Deepwater Oil Reporter Crowdsourcing Platform
Here are a few things we think you need to know before joining this open data sharing initiative. Please read before you proceed. Know that all data reported on Oil Reporter is PUBLIC. If you don’t want to share information with the public, Oil Reporter isn’t for you.

Girl Turk: Mechanical Turk Meets Girl Talk’s “Feed the Animals” – Waxy.org
Girl Talk’s Feed the Animals is one of my favorite albums this year, a hyperactive mish-mash sampling hundreds of songs from the last 45 years of popular music. Gregg Gillis created a beautiful, illegal mess of copyright clearance hell, which you should download immediately.

HuffPost Launches OffTheBus Citizen Journalism Project Ahead of 2012 Elections
WASHINGTON — If you are like most people, you don’t much like the way the “national media” cover politics. As a long-time member of the Washington press corps, I agree with you. We can be trivial, shortsighted, credulous, ideologically blinkered and timid — on a good day.

Netflix Prize: Home
The Netflix Prize sought to substantially improve the accuracy of predictions about how much someone is going to enjoy a movie based on their movie preferences. On September 21, 2009 we awarded the $1M Grand Prize to team “BellKor’s Pragmatic Chaos”. Read about their algorithm, checkout team scores on the Leaderboard, and join the discussions on the Forum.

HerdictWeb : About
About Us: Herdict is a project of the Berkman Center for Internet & Society at Harvard University. Herdict is a portmanteau of ‘herd’ and ‘verdict’ and seeks to show the verdict of the users (the herd). Herdict Web seeks to gain insight into what users around the world are experiencing in terms of web accessibility; or in other words, determine the herdict.

The High Price of Creating Free Ads – New York Times
From an advertiser’s perspective, it sounds so easy: invite the public to create commercials for your brand, hold a contest to pick the best one and sit back while average Americans do the creative work. But look at the videos H. J. Heinz is getting on YouTube.

SpotCrime Crime Map
City and county crime map showing crime incident data down to neighborhood crime activity. Subscribe for crime alerts and reports.

The Peer to Patent Project – Community Patent Review
The Community Patent Review: Peer to Patent project. On June 15, 2007, the United States Patent and Trademark Office (USPTO) opened the patent examination process for online public participation for the first time.

Prize4Life
Our mission is to accelerate the discovery of treatments and a cure for ALS by using powerful incentives to attract new people and drive innovation. We know that the solutions to some of the biggest challenges in ALS research will require out-of-the-box thinking, and some of the most critical discoveries may come from unlikely places.

FixMyStreet
How to report a problem: enter a nearby GB postcode, or street name and area; locate the problem on a map of the area; enter details of the problem; we send it to the council on your behalf. 1,616 reports in past week; 2,529 fixed in past month; 204,852 updates on reports.

Reporting Recipe: Using Amazon’s Mechanical Turk for Data Projects
Of all of journalism’s recent evolutions, data-driven reporting is one of the most celebrated. But as much as we should toast data’s powers, we must acknowledge its cost: Assembling even a small dataset can require hours of tedious work, deterring even the most disciplined of journalists and their editors.

HuffPost’s OffTheBus Superdelegate Investigation
We asked HuffPost readers to join with us and profile the hundreds of superdelegates who are likely to decide the Democratic nomination for president. Hundreds of you responded and we can now present our initial findings. Just click on a state or territory and a list of superdelegate profiles, as compiled by our citizen journalists, will pop up.

The Political Campaign HQ Next Door: OffTheBus Special Ops Photographs
Where are the state campaign headquarters located, exactly, for the party that claims to represent Main Street? Where are they located for the party that claims to represent everyone? Thanks to the work of HuffPost OffTheBus Special Ops, you can visit offices around the nation in just a few key strokes.

Introducing Stimulus Spot Check
July 20, 2009: This post has been corrected. It’s the middle of July and we’re all wondering whether the stimulus is working. If we do as the administration has advised, we should remain patient – and let the administration measure its own success.

WNYC – Mapping the Storm Clean-up
We’ve been asking readers and listeners to let us know if their streets have been plowed. Here are maps from Tuesday, Wednesday and Thursday (white balloons represent unplowed streets, blue plowed). Click the balloons for full information and voice messages where available. Submit yours by texting PLOW to 30644.

How Do You Feel About the Economy? – Interactive Feature – NYTimes.com
Enter the word that best describes your current mood. You can submit a response once a day.

Adjunct Project
The Project: The Adjunct Project exists for the growing number of graduate degree holders who are unemployed and underemployed. Many of these highly educated and passionate people are being forced to take jobs dramatically below their achievement and earning potential.

The Scrapbook – POPS Report: Tell Us About New York City’s Privately-Owned Public Spaces
Listen: Project Intro from October 19th // Listen: Wrap-Up from November 9th // WNYC’s Brian Lehrer Show and The New York World are collaborating on a project to map and report on New York City’s Privately-Owned Public Spaces, aka POPS. We want to figure out how public these public spaces really are.


What media can learn from swarming activities

The old saying, “if you build it they will come” doesn’t apply to participation in the same way it can sometimes work for information and entertainment.

However, active participation is possible and potentially very meaningful when the conditions for swarming exist.  If bidirectionality is the way forward for media then it seems to me that great success lies ahead if editors and advertisers can evolve their models to fuel swarming projects.

The business of publishing online has come a long way since the early days of broadcast web sites and banner ads, though the high impact story or campaign is still very elusive on the Internet.

News operations are tackling this problem by adopting cultures of participation, and brands are getting wiser to the power of direct relationships with people.

John Battelle has been exploring this shift among brands and even goes as far as stating that “all brands are publishers”:

“Dictating a message to your audience is no longer acceptable. Consumers online expect dialogue, so pairing your brand with relevant and passion-driven topics is one of the best ways to ensure that you are engaged with key audiences.”

Now, publishing and having conversations is much better than interrupting people, but this strategy can easily become broadcast disguised as conversation if you’re not careful.

Even worse, that strategy can reinforce the dependency on the major network firehoses where people spend their time online and create a layer of context between the individual and the source instead of the direct and meaningful relationship both publishers and advertisers want to have with people.

There is another way, a better way for everyone involved.

Swarm intelligence is about simple beings following simple rules, each one acting on local information.  National Geographic published an influential feature in 2007 talking about how the swarm tactics of ants and bees have inspired new transport systems and even military tactics.

“Decentralized control, response to local cues, simple rules of thumb—add up to a shrewd strategy to cope with complexity.”
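
For anyone who wants to see that idea in miniature, here is a toy Python sketch. It is my own illustration, not anything from the National Geographic feature: a handful of agents sit in a ring, each holding a number, and the only rule is "replace your value with the average of yourself and your two neighbours". The function names and the starting values are hypothetical. Nobody is in charge and nobody sees the whole picture, yet the group settles on the overall average.

```python
from typing import List


def local_step(values: List[float]) -> List[float]:
    """One round: every agent averages itself with its two ring neighbours.

    Each agent uses only local information (its own value and the values
    immediately to its left and right). Index -1 wraps around to the last
    agent, which closes the ring.
    """
    n = len(values)
    return [
        (values[i - 1] + values[i] + values[(i + 1) % n]) / 3
        for i in range(n)
    ]


def run_swarm(values: List[float], rounds: int) -> List[float]:
    """Repeat the local rule for a fixed number of rounds."""
    for _ in range(rounds):
        values = local_step(values)
    return values


if __name__ == "__main__":
    start = [0.0, 10.0, 20.0, 30.0, 40.0, 50.0]  # overall average is 25
    for v in run_swarm(start, rounds=100):
        print(round(v, 6))
```

Real swarms, whether ants or volunteers, are far messier than this, but the sketch shows why a simple rule applied locally can add up to a sensible collective result without central control.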

The mathematical challenges inherent in this type of world view are considerable, which, of course, makes swarm intelligence projects very alluring for most alpha geeks.

Author John Robb described how this swarming approach has fueled a new kind of protest movement, recently demonstrated by the Occupy demonstrations.  He says, “Open source protest is an organizational technique.”

It’s made up of a few key ingredients, described here by mathematics professor Lee Worden(*):

  • Plausible promise: A simple goal that people can get behind, that you can believably offer
  • Open invitation: you don’t have to agree on everything, just on what we are doing
  • Many leaders: let everyone innovate, do multiple things at once. Support anyone in a leadership role that either a) grows the movement or b) advances the movement closer to its goal. Oppose (ignore) anybody that proposes a larger, more complex agenda or those that claim ownership over the movement.
  • Open source: If a new technique works, document it, use it again, and share it with everyone else. Copy everything that works.
  • Spread the word of the movement as widely as possible.

It’s the antithesis of the Baby Boomer protest model which was about a collective barricade, a massive force of immovable inertia.

The open source protest conditions are not exclusive to a protest.  They are natural tendencies that draw on human emotion and our sense of purpose.  We want to belong to something and to participate in that thing, even if it’s just a small role in the overall goal.

This is precisely where media and marketing brands need to focus now.

How do you either initiate or participate in meaningful swarming activity?

We all know now that collaboration in and of itself is not interesting as an objective.  But when a project does have a purpose, collaborating with people from around the world who have varying views and levels of expertise can be absolutely thrilling.  Ask anyone who contributes substantially to an open source project of any shape or size.

Just like open source protests or open source software development, swarming media activities tend to share the same principles and include the same ingredients – a widely understood purpose, simple little participatory actions that feed into the whole, a high level of openness in the system, authoritative advocates and demonstrable leadership among them at least on a part-time basis, repeatability when successes appear and efficient ways to share learnings, and strong signals to participate spreading far and wide.

There can be big successes with swarming activities in a more task-like point solution approach.  It’s the viral marketing equivalent of breaking the Billboard Top 10 list.  We experienced this in dramatic fashion at the Guardian with the MPs Expenses scandal.

Creating a more long term and sustainable solution for swarming activities that modify themselves, adjust over time, and fluctuate in intensity is a bit more complicated.

I would have trouble believing that either Jimmy Wales or Jack Dorsey had an explicit plan for turning their swarming engines into global phenomena.  They had a lot of help and a bit of luck along the way.  That said, you can be sure they both intuitively understood the open source model and how to both lead and get out of the way at the right time.

I suspect many alpha geeks, as Tim O’Reilly calls them, will continue to find success working on various approaches to swarming tools and technologies over the next few years.  Media organizations would be wise to think more broadly about swarming strategies and specifically about how to use these techniques in the news agenda and in branding campaigns.

It would be presumptuous to say we have an answer with the n0tice platform, but it’s undeniably capable of serving this function for customers who want to use it for swarming projects.

For example, look at the recently launched Best Bookshops project on the Guardian which is sponsored by National Book Tokens.

Both the Guardian editors and National Book Tokens are clever to imagine an activity that encourages people to reacquaint themselves with their local bookshops.  It’s an interesting editorial proposition and a brilliantly selfless brand campaign that encourages people to do something good for themselves and local business.

They are clever to approach that campaign in a way that will fuel a collective interest in spending time at local bookshops through a swarming exercise rather than trying to push the idea on people.

It’s also interesting how many different constituents are involved in this idea.  In addition to Guardian editors and the advertiser, National Book Tokens, bookshops across the UK are collectively posting events on the Bookshops noticeboard, and people are encouraged to post photos of their experiences at their local bookshops.

There’s a shared experience at the local level that feeds a larger context with an understood purpose: local bookshops matter.  To reinforce this larger purpose the campaign offers a unified view of the swarming activity presented as a map residing on the Guardian web site.  It will continue to live there indefinitely.

Everybody wins here.  People become reacquainted with their local bookshops.  Bookshops build better ties with their community.  National Book Tokens strengthens its brand and its role with bookshops and book shoppers.  The Guardian earns money from the sponsorship and provides a great service to its readers.  And n0tice builds a stronger user base across the n0tice network.

It’s a classic case of generative media in action.

The swarming media concept may need adjusting a few times before people get it right.  But the Internet is ideally suited for it.

Sometimes I wonder, actually, is it possible that swarming activities are THE network-native format for successful campaigns?  It’s early days still, but the opportunity seems absolutely massive.