Archive for the 'video' Category

Open source grid computing takes off

This has been fun to watch. The Hadoop team at Yahoo! is moving quickly to push the technology to reach its potential. They’ve now adopted it on one of the most important applications in the entire business, Yahoo! Search.

From the Hadoop Blog:

The Webmap build starts with every Web page crawled by Yahoo! and produces a database of all known Web pages and sites on the internet and a vast array of data about every page and site. This derived data feeds the Machine Learned Ranking algorithms at the heart of Yahoo! Search.

Some Webmap size data:

  • Number of links between pages in the index: roughly 1 trillion links
  • Size of output: over 300 TB, compressed!
  • Number of cores used to run a single Map-Reduce job: over 10,000
  • Raw disk used in the production cluster: over 5 Petabytes

I’m still trying to figure out what all this means, to be honest, but Jeremy Zawodny helps to break it down. In this interview, he gets some answers from Arnab Bhattacharjee (manager of the Yahoo! Webmap Team) and Sameer Paranjpye (manager of our Hadoop development):

The Hadoop project is opening up a really interesting discussion around computing scale. A few years ago I never would have imagined that the open source world would be contributing software solutions like this to the market. I don’t know why I had that perception, really. Perhaps all the positioning by enterprise software companies to discredit open source software started to sink in.

As Jeremy said, “It’s not just an experiment or research project. There’s real money on the line.”

For more background on what’s going on here, check out this article by Mark Chu-Carroll, “Databases are hammers; MapReduce is a screwdriver”.

This story is going to get bigger, I’m certain.

Interactive journalism: An amazing homicide mashup

I had the pleasure of interviewing Sean Connelly and Katy Newton for YDN Theater recently with YDN videographer Ricky Montalvo. They created the amazing (and award-winning) crime data mashup Not Just A Number in partnership with The Oakland Tribune.

After getting tired of watching the homicide count for 2006 climb higher and higher, they decided to humanize the issue and talk to the families of the victims directly. They wanted to expose the story beneath the number and give a platform upon which the community could make the issue real.

Statistics can tell effective stories, but death and loss reach emotional depths beyond the power of any numerical exploration.

Sean and Katy posted recordings of the families talking about the sons, daughters, sisters and brothers that they lost. They integrated family photos, message boards, articles and more along with the interactive homicide map on the site to round out the experience, making it much more human than the traditional crime data mashup.

Here is the video (7 min.):

I also asked them if they had trouble getting data to make the site, and they said the Oakland Tribune staff were very supportive. There weren’t any usable open data sets coming out of the city, so they had to collect and enter everything themselves.

This, of course, is a very manual process. Given the challenge of getting the data, Sean and Katy didn’t see how the idea could possibly scale outside of the city of Oakland.

Somebody needs to take that on as a challenge.

I’m hopeful that efforts like Not Just A Number and the Open Government Data organization will be able to surface why it’s important for our government to open up access to the many data repositories they hold. And if the government won’t do it, then it should be the job of journalists and media companies to surface government data so that people can use it in meaningful ways.

This is a great example of how the Internet can empower people who otherwise have no voice or audience despite having profound stories to tell.

What is a MashUp?

YDN went to Ireland for MashupCamp and came back with some hilarious footage. Rated PG:

Investing in video at YDN

We’ve been playing around with video as a communications mechanism on Yahoo! Developer Network for a while now. Our casual attempts to generate interest in Yahoo! technologies through interviews, screencasts, tech talks, etc. have worked really well.

So, we hired a full-time videographer/filmmaker named Ricky Montalvo and got him some decent gear to push the envelope a little further. And today we rolled out YDN Theater on the YDN web site to establish a home for all the work he has been producing.

The journey here started with a pretty lame but surprisingly successful screencast that Dan Theurer and I did to explain how browser-based authentication worked. It was blurry. We made mistakes. The subject matter was pretty abstract. And neither Dan nor I have particularly strong camera presence.

Regardless, it has been viewed over 19,000 times, so far.

We kept pushing with new types of videos such as partner showcases with people like Joyce Park, Adam Rifkin, and Leah Culver. We brought the camera to our various Hack Days and produced a particularly funny recap of the London event. And we recorded tech talks from our own staff at Yahoo! and presentations from guest speakers like Grady Booch, Joe Hewitt and David Weinberger.

By the time we found Ricky, we knew we were building a program that was going to be really interesting. Yet, at that point, we had hardly spent any money other than on a few cheap cameras and some basic editing tools, including Camtasia.

The success to date, I think, has been in large part due to the fact that we haven’t tried to pimp out our videos with any professional plastic gloss or staged demos. We also try to have a little fun with them. Jeremy Zawodny is a really good interviewer. His unassuming yet pointed questions get people to say things they otherwise wouldn’t include on any planned script. And the fact that the videos are raw, with few cuts or edits, makes them feel real, too.

There are some good video program ideas floating around here that could be a lot of fun, but now we’re torn between how much time we want to spend building out the video offering and how much time we want to spend on all the other ways the team can evangelize Yahoo! technologies.

I’m not sure how to measure that decision just yet, but as long as people are consuming these shows with such enthusiasm, we’ll probably tilt the scale in favor of doing more video whenever possible.

The Hack Day London Video

I’m heading back home from Hack Day London tomorrow. What a spectacular event.

I did my best to capture the behind-the-scenes action this time, as I think the Hack Day event process itself is really interesting, too. Of course, sharing the day-to-day work would be frighteningly boring, but you can at least get a sense of what happens the day or so before Hack Day starts in this video here:

What’s easy to forget is that the event process itself is treated like a hack. We break the rules. We invent on the fly. We don’t know if it will work.

Anyhow, there’s more to come, I’m sure.

(Apologies for the horrible editing in this video…it’s my own hack contribution…unpolished, experimental, and a little bit broken.)