Category Archives: maps

Orchestrating streams of data from across the Internet

The liveblog was a revelation for us at the Guardian. The sports desk had been doing them for years, experimenting with different styles, methods and tone. Then, about three years ago, the news desk started using them liberally to great effect.

I think it was Matt Wells who suggested that perhaps the liveblog was *the* network-native format for news. I think that’s nearly right…though it’s less the ‘format’ of a liveblog than the activity powering the page that demonstrates where news editing in a networked world is going.

It’s about orchestrating the streams of data flowing across the Internet into something compelling, in one form or another. One way to render that data is the liveblog. Another is a map with placemarks. Another is an RSS feed. A stream of tweets. Storify. Etc.

I’m not talking about Big Data for news. There is certainly a very hairy challenge in big data investigations and intelligent data visualizations to give meaning to complex statistics and databases. But this is different.

I’m talking about telling stories by playing DJ to the beat of human observation pumping across the network.

We’re working on one such experiment with a location-tagging tool we call FeedWax. It creates location-aware streams of data for you by looking across various media sources including Twitter, Instagram, YouTube, Google News, Daylife, etc.

The idea with FeedWax is to unify various types of data through shared contexts, beginning with location. These sources may only have a keyword to join them up or perhaps nothing at all, but when you add location they may begin sharing important meaning and relevance. The context of space and time is natural connective tissue, particularly when the words people use to describe something may vary.
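To make the idea of shared context concrete, here is a minimal sketch in Python. It is not how FeedWax works internally; the `Item` record, the `context_key` and `shared_contexts` helpers, the 0.01-degree grid cell and the 60-minute window are all assumptions chosen for illustration. It simply buckets items by a coarse place and time and keeps the buckets where more than one source shows up.

```python
import math
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Item:
    """One geotagged post. Field names are hypothetical, not a real API schema."""
    source: str     # a label such as "twitter" or "instagram"
    text: str
    lat: float
    lon: float
    when: datetime  # expected to be timezone-aware


def context_key(item, cell_deg=0.01, window_min=60):
    """Reduce an item to a coarse (grid cell, time slot) pair."""
    cell = (math.floor(item.lat / cell_deg), math.floor(item.lon / cell_deg))
    slot = int(item.when.timestamp() // (window_min * 60))
    return cell, slot


def shared_contexts(items, cell_deg=0.01, window_min=60):
    """Group items by place and time; keep groups seen by more than one source."""
    buckets = defaultdict(list)
    for item in items:
        buckets[context_key(item, cell_deg, window_min)].append(item)
    return {
        key: group
        for key, group in buckets.items()
        if len({i.source for i in group}) > 1
    }
```

Notice that the text of each item is never compared. Two posts with no words in common still land in the same bucket if they were made in roughly the same place at roughly the same time. A fixed grid is the crudest possible choice, and items on either side of a cell boundary will be split apart, so a real tool would likely want something smarter.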

We’ve been conducting experiments in orchestrated stream-based and map-based storytelling on n0tice for a while now. When you start crafting the inputs with tools like FeedWax you have what feels like a more frictionless mechanism for steering the flood of data that comes across Twitter, Instagram, Flickr, etc. into something interesting.

For example, when the space shuttle Endeavour flew its last flight and subsequently labored through the streets of LA there was no shortage of coverage from on-the-ground citizen reporters. I’d bet not one of them considered themselves a citizen reporter. They were just trying to get a photo of this awesome sight and share it, perhaps getting some acknowledgement in the process.

You can see the stream of images and tweets here: http://n0tice.com/search?q=endeavor+OR+endeavour. And you can see them all plotted on a map here: http://goo.gl/maps/osh8T.

Interestingly, the location of the photos gives you a very clear picture of the flight path. This is crowdmapping without requiring that anyone do anything they wouldn’t already do. It’s orchestrating streams that already exist.
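For the curious, the flight-path effect needs very little machinery. The sketch below is an assumption about how you might do it, not a description of what n0tice or the map above actually does: it takes sightings as hypothetical `(when, lat, lon)` tuples, orders them by time, and measures the resulting path with the standard haversine formula.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, a common approximation


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def trace_path(sightings):
    """Order (when, lat, lon) sightings by time and return the (lat, lon) points."""
    ordered = sorted(sightings, key=lambda s: s[0])
    return [(lat, lon) for _, lat, lon in ordered]


def path_length_km(path):
    """Sum the distances between consecutive points on the path."""
    return sum(haversine_km(p, q) for p, q in zip(path, path[1:]))
```

Nobody who took a photo had to agree to take part in a mapping exercise. The timestamp and the coordinates they were already sharing are enough to draw the line.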

This behavior isn’t exclusive to on-the-ground reporting. I’ve got a list of similar types of activities in a blog post here which includes task-based reporting like the search for computer scientist Jim Gray, the use of Ushahidi during the Haiti earthquake, the Guardian’s MPs Expenses project, etc. It’s also interesting to see how people like Jon Udell approach this problem with other data streams out there such as event and venue calendars.

Sometimes people refer to the art of code and code-as-art. What I see in my mind when I hear people say that is a giant global canvas in the form of a connected network, rivers of different colored paints in the form of data streams, and a range of paint brushes and paint strokes in the form of software and hardware.

The savvy editors in today’s world are learning from and working with these artists, using their tools and techniques to tease out the right mix of streams to tell stories that people care about. There’s no lack of material or tools to work with. Becoming network-native sometimes just means looking at the world through a different lens.

Interactive journalism: An amazing homicide mashup

I had the pleasure of interviewing Sean Connelly and Katy Newton for YDN Theater recently with YDN videographer Ricky Montalvo. They created the amazing (and award-winning) crime data mashup Not Just A Number in partnership with The Oakland Tribune.

After getting tired of watching the homicide count for 2006 climb higher and higher, they decided to humanize the issue and talk to the families of the victims directly. They wanted to expose the story beneath the number and give a platform upon which the community could make the issue real.

Statistics can tell effective stories, but death and loss reach emotional depths beyond the power of any numerical exploration.

Sean and Katy posted recordings of the families talking about the sons, daughters, sisters and brothers that they lost. They integrated family photos, message boards, articles and more along with the interactive homicide map on the site to round out the experience, making it much more human than the traditional crime data mashup.

Here is the video (7 min.):

I also asked them if they had trouble getting data to make the site, and they said the Oakland Tribune staff were very supportive. There weren’t any usable open data sets coming out of the city, so they had to collect and enter everything themselves.

This, of course, is a very manual process. Given the challenge of getting the data, Sean and Katy didn’t see how the idea could possibly scale outside of the city of Oakland.

Somebody needs to take that on as a challenge.

I’m hopeful that efforts like Not Just A Number and the Open Government Data organization will be able to surface why it’s important for our government to open up access to the many data repositories it holds. And if the government won’t do it, then it should be the job of journalists and media companies to surface government data so that people can use it in meaningful ways.

This is a great example of how the Internet can empower people who otherwise have no voice or audience despite having profound stories to tell.

Crime data stories

My Potrero Hill neighbors tell me that the sweet song of crackling firearms in the evening always begins again in May as the days get longer, hotter and schoolless.

Recently, I witnessed a sample of the gun play happening in the nearby projects, and I decided to do some of my own research to understand what’s going on. The first thing I found was that I wasn’t the only witness to this particular incident:

“Two of the bullets hit our daughters bedroom– one went through the wall and crossed a small portion of the room and lodged in another wall near her sliding glass door.

[The Police] told us that based on the 24 bullet shells they found up the hill on Missouri St. near the public housing, there were two guns involved, one of which was an AK47 the other was probably a 9mm pistol. The police have no idea who was firing the guns and given that there are not witnesses, there is not likely to be any resolution to the incident. The officers were confident that the two bullets that hit the condo were random and not targeted at us.”

There are lots of factors behind violent neighborhoods, and the San Francisco projects are pretty densely representative of many of those factors. But it really irritates me that guns are so prevalent in the area, and, in general, so prevalent in America.

So, I started my journey at the old PotreroHillSF Crime Mashup which apparently doesn’t work any more. There is an ongoing “Police Blotter” on the site, though, with some good reporting.


I then found the official San Francisco Police Department Crime Map. Of course, the data is wrapped in their own heavy-handed user interface and unavailable in common shareable web data formats. The tool is burdened with legal trappings and strangely fails to acknowledge homicides, though they offer an explanation:

“A homicide may not appear correctly on the map because:

  1. The incident was initially reported as an assault and the victim died some time later from the injuries.
  2. The incident was reported as an arson, and the body was not found until a later time.
  3. A body was found and the cause of death was not obvious to the officer making the incident report.”

I’m hoping that the City has more advanced reporting capabilities internally, as it seems pretty obvious that we have a data visualization failure going on here. I can see some data around assaults, robberies, larceny, vandalism, drug incidents, etc.

But the compelling visual storytelling is missing.

I want to know how many crime incidents in the projects this year involved guns. How many guns in these events are registered/unregistered? How many of the gun incidents were or became homicides vs non-gun related incidents? Where did the guns come from? What kinds are being used?

I suspect most guns aren’t registered, which is an argument used by those who think a gun ban would be useless. People who want guns will find them, legal or not. But I also suspect that the victims aren’t carrying guns. Thus, the argument that people should have the right to own a gun to protect themselves isn’t a counterbalancing force. People who avoid violence won’t carry guns, legal or not.

As I progressed with this research I realized that somewhere in between raw data and overt campaigning is an interesting space. Data can help us learn and make more intelligent and informed decisions about how to manage and evolve our society and its rules.

Unfortunately, that space seems more difficult to find than it should be. I should be able to download data for myself or at least be able to visualize the stories behind the data in relevant pictures and charts.
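If a plain download did exist, the questions above would take only a few lines to start answering. Here is a rough sketch in Python, assuming a CSV export with `weapon` and `fatal` columns. Those column names, the `GUN_WEAPONS` set and the `summarize` helper are all hypothetical; no such file is published by the City as far as I can tell.

```python
import csv
import io
from collections import Counter

# Hypothetical weapon labels that would count as "gun involved".
GUN_WEAPONS = {"handgun", "rifle", "shotgun"}


def summarize(csv_text):
    """Count incidents by (gun involved, fatal) from CSV text.

    Assumes a header row with 'weapon' and 'fatal' columns, where
    'fatal' is 'yes' or 'no'. Both are illustrative assumptions.
    """
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        gun = row["weapon"].strip().lower() in GUN_WEAPONS
        fatal = row["fatal"].strip().lower() == "yes"
        counts[(gun, fatal)] += 1
    return counts


def fatal_share(counts, gun):
    """Fraction of gun (or non-gun) incidents that were fatal, or None if no data."""
    total = counts[(gun, True)] + counts[(gun, False)]
    return counts[(gun, True)] / total if total else None
```

Comparing `fatal_share(counts, True)` with `fatal_share(counts, False)` is the gun versus non-gun homicide comparison I asked for. The counting is the easy part. The hard part is that the data isn’t available in a form anyone can feed into it.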

Of course, there’s the fantastic ChicagoCrime.org web site which has done a lot to raise awareness about crime data. Despite the lack of available data from the local government, site owner Adrian Holovaty found a way to collect what he needed to make this site through an automated script:

“Each weekday, my computer program goes to the Chicago Police Department’s website and gathers all crimes reported in Chicago.”

The site has some great info (such as this screenshot of “Armed Robbery: Handgun Incidents”), though I still want to see an editorial lens on this data that puts a bit more meaning behind it.

For example, it only takes a glance to see in this series of Census images of San Francisco that the City is incredibly segregated, something I think many residents choose to ignore under the mask of open-mindedness. Even here, though, the story is incomplete without some intelligence wrapped around the data. What’s the trend? Is it becoming whiter? Where are people going who are leaving?

This same question punctures my happy place every time I exit onto Palo Alto’s University Avenue from Highway 101 and pass what is now a high end office park where one of the most dangerous areas in the country used to exist only a decade or so ago. I’m very pleased it’s a safer place, but do we understand the cost of that transition? Where did those people go? Are they better off?

Yahoo! colleague Micah Laaker pointed me to an interesting project he worked on back in 2002 and 2003 called the Denver Census Tract Animation Project. He worked with Citizen Mapmakers to trend movement of the African-American population in Denver from 1960 to 2000. Here’s a snapshot of their work:

I really like the way they visualized data to tell a story here. We need similar visualizations for crime data.

The InfoPlease “School Shootings” site gets closer to telling a story about guns just by focusing on a type of statistic and representing it. What a powerful domain name! However, the data here is still pretty raw and limited. This is hugely important information, but there’s an implicit argument here that should be made much more explicit with actionable information and analysis. In its current state it’s just telling us that there are a lot of school shootings (a surprising number in Europe, actually).


The Citizen Crime Watch site for New Orleans gets even closer to what I want to see. Similar to ChicagoCrime.org, they visualize with your standard data-on-a-map mashup, but the hover links point to coverage in the local media. I’m suddenly given a much more human window into the crime scene, and I can read about each event. For example, on April 9, 2007, there was a homicide in a trailer park:

“…Officers found Williams lying on the floor of the trailer with blunt-force trauma to her head. Emergency medical technicians declared her dead at the scene. An autopsy shows she had been beaten to death, said John Gagliano, chief investigator for the Orleans Parish coroner’s office.

The trailer is in a trailer park at 6801 Press Drive run by the Federal Emergency Management Agency. Although the trailer park is near the campus of Southern University, the chancellor, Victor Ukpolo, said neither faculty nor students live there.

The murder is being investigated by Detective Harold Wischan, who can be reached at (504) 658-5300.”

I’m very thankful for local reporting from sites like Nola.com, The Times-Picayune, and community leaders such as Mike Lin of PotreroHillSF and the increasingly active Yahoo! Group Potrero Hill Parents Association who all help surface this kind of information, but it’s not enough. The City needs to make it easier for its residents to both report on things that matter to us and to collect the data, filter it, and act on it.

People will always want greater access to information. This is particularly true in communities where poor decision-making creates mistrust:

“Under pressure from constituents who say New Orleans police stonewall requests for crime data, the City Council’s criminal justice subcommittee took police representatives to task Wednesday, calling for a faster, freer flow of public information…When asked for a written breakdown of policy and procedures relating to the release of public information, Maj. Michael Sauter, the head of technology, told the council most of that information was ‘not meant for the public.’”

Similarly, Rick Klau has begun experimenting with this kind of thing in response to the Magnetix toy recall incident. He calls it “Open source parenting” and observes that bottom-up community-driven politics is likely to be more successful than anything a politician can enable:

“If the government is under-staffed and under-funded to help parents avoid harmful toys, then why can’t we help ourselves?…Give thousands of parents the tools to easily identify harmful products, leverage the community’s ability to provide visibility to legitimate threats while minimizing less serious risks, and quickly disseminate information that could be instrumental in avoiding a serious accident.”

I’m suddenly wondering what role politicians will play if communities are able to form solutions to issues locally, nationally and internationally on their own. Maybe instead of legislators (or merely professional campaigners/marketers), politicians will become community managers.

I also start wondering what politicians do all day if they can’t sort out ways to curb violence in our neighborhoods. I don’t see why anyone living in this country or any other should have to worry about whether their child will be shot accidentally in his or her bedroom by stray AK47 bullets or intentionally while at school.

I’m convinced the answer is in the data that is already being collected in various government crime databases. And I’m sure the answer is related to gun access.

Where is Tufte when you need him?