Data dynamics: How the rules of sharing are changing

Today it’s easy to store and share my pictures, my favorite URLs, my thoughts and lots of other things online. A range of data repositories lets me do this kind of thing in different ways.

What still needs work is how I give trusted services access to much more private data — things like my current location, my spending behavior, access to my friends and family, etc.

To date, most services follow the premise that the looser the controls, the more fluidly data will travel. And that’s all that mattered when it was still hard to get data flowing.

Data flow is no longer an issue. Perhaps data flow has actually become too easy now. And therein lies the problem.

Clearly, blogging, RSS and feed readers drove a lot of the early thinking about syndication. Blogging enabled people to post content in a publicly accessible data repository somewhere for anyone to pull out without any privacy or permissioning controls. The further your content syndicated, the better.

Wikis and community sites like Slashdot created a slightly more complex read/write dynamic against a central content repository that lots of people could access together. The permissioning model was essentially hierarchical, with controls kept in the hands of a smaller community.

Then Flickr broke ground with a new approach. They applied a user-centric friends and family relationship model to permissioning access to personal photos. Flickr opened up what was once considered private data and defaulted it to a public read-only permission status. But each individual still has a great deal of control over the data he or she contributes.

Similarly, del.icio.us made it possible to store and publicly address what had previously been private data. The nice twist here was the easy-to-understand URLs that allowed machines to consume, interpret and redistribute data stored in del.icio.us.

Where services like Facebook and Wesabe are now breaking ground again is in identifying a security model around highly sensitive data. Contact lists are very personal, but there aren’t many data sets more personal than my purchases and spending patterns.

Neat things can happen when I give machines access to my data, both the things I explicitly ‘own’ and my implicit behaviors. I want machines to act on my behalf and make my data more useful to me in a range of different contexts.

For example, I like the fact that Facebook slurps up my Twitter activity and shares it with my friends in the Facebook network. I don’t want to change my ‘status’ on every service that shows status messages. Similarly, I like that Last.fm captures my listening behavior from iTunes and then uses that data to give back personal recommendations on a badge posted to my blog.

Allowing machines to automatically act on personal data on my behalf is the right direction for things to go. But important questions need to be resolved.

For example, what happens to my data in all the places I’ve allowed it to appear when I change it? How do permissions pass from one service to another? How do I guarantee that a permission type I grant in one service means the same thing in another service? How do changes propagate? How does consent get revoked?
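To make those questions concrete, here’s a tiny sketch of what a permission grant might look like as data, and why passing it between services is hard. Every service and audience name below is hypothetical; the point is just that one service’s vocabulary rarely maps one-to-one onto another’s, and that revocation has to travel with the grant.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Grant:
    service: str       # where the grant was issued (hypothetical names)
    scope: str         # what data the grant covers, in that service's terms
    audience: str      # who may see it, in that service's vocabulary
    revoked: bool = False

# Each service defines its own audience terms, so propagating a grant
# means translating between vocabularies that may not line up.
AUDIENCE_MAP = {
    ("photoservice", "friends-and-family"): {"socialnet": "friends"},
}

def translate(grant: Grant, target: str) -> Optional[Grant]:
    """Best-effort translation of a grant into another service's terms.
    Returns None when there is no equivalent -- the safe default."""
    if grant.revoked:
        return None  # revocation must propagate too
    equivalent = AUDIENCE_MAP.get((grant.service, grant.audience), {}).get(target)
    if equivalent is None:
        return None  # no shared meaning: do not propagate the data
    return Grant(service=target, scope=grant.scope, audience=equivalent)

g = Grant(service="photoservice", scope="photos", audience="friends-and-family")
print(translate(g, "socialnet"))   # a translated grant
g.revoked = True
print(translate(g, "socialnet"))   # None: the grant no longer travels
```

Even this toy version shows the gap: the mapping table has to be agreed on somewhere, and nothing here enforces that the downstream service actually honors the revocation.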

And even trickier than all that will be the methods for enforcing protection of privacy and penalties for breaking those permissions.

Until trust is measurable with explicit consensual triggers, loosely coupled networks that act on the data I wish to protect are going to struggle to talk to each other. Standards need to enable common sharing tactics. Responsibility needs to be clearly defined. And policies need to be enforceable.

Empowering a person to invest in storing and sharing the more sensitive data he or she owns is going to require a lot more than traditional read/write controls. But given the pace of change right now, I suspect the answers will emerge as the people behind these services work things out together, before the industry task forces, legal entities and blogosphere sort it out for them.

How we made the BBAuth screencast

The news that seemed to get overlooked amid the amazingness that was Hack Day was the release of a login API: BBAuth, or Browser-Based Authentication. This new service allows any web site or web application to identify a user who has a Yahoo! ID, with the user’s consent. Dan Theurer explains it on his blog:

…instead of creating your own sign-up flow, which requires users to pick yet another username and password, you can let them sign in with their existing Yahoo! account.

My mind keeps spinning thinking of the implications of this…more on that in a later post.

When I heard about it, it was immediately obvious that this concept would be hard to fully grok without some visuals to explain it. So I sat with Dan yesterday to create a video walk-through that might help people digest it (myself included). Here is a 5-minute screencast talking about what it is, plus an example of it in action (also available on the YDN blog and on Yahoo! Video):
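For those who want the handshake in code rather than video, here’s a rough sketch of the sign-in URL a site builds before redirecting the user to Yahoo!. The endpoint path and signature recipe below are my reading of the docs, so treat them as assumptions and check the YDN documentation for the real values.

```python
import hashlib
import time

# Sketch of a BBAuth-style sign-in URL. Endpoint path and signature
# recipe are assumptions -- verify against the official BBAuth docs.

APP_ID = "my-app-id"        # issued when you register your application
SHARED_SECRET = "s3cret"    # issued with the app id; never sent over the wire

def login_url() -> str:
    """Build the URL your site redirects the user to for sign-in."""
    ts = int(time.time())
    path = f"/WSLogin/V1/wslogin?appid={APP_ID}&ts={ts}"
    # The signature proves the request came from your app: a hash of
    # the request path plus your shared secret.
    sig = hashlib.md5((path + SHARED_SECRET).encode()).hexdigest()
    return f"https://api.login.yahoo.com{path}&sig={sig}"

# After the user signs in, Yahoo! redirects back to the endpoint you
# registered, carrying a single-use token your server can then exchange
# for credentials to call Yahoo! services on the user's behalf.
print(login_url())
```

The key idea, and what the screencast shows visually, is that the user only ever types their password on Yahoo!’s own pages; your site sees a token, never the credentials.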

The recording itself took only a few minutes. Here’s how it went down:

  1. I closed all my applications on my laptop other than my browser (or so I thought) and launched Camtasia.
  2. We spent 5 minutes discussing what we were going to say.
  3. I clicked ‘record’.
  4. We talked for 5 minutes.
  5. I clicked ‘stop’.
  6. I selected the output settings and it then produced a video file for me.
  7. DONE. That part took about 20 minutes.

The next part, posting to a video sharing site, got a little sticky, but here’s what I learned:

  • I tried Yahoo! Video, JumpCut and YouTube.
  • Outputting my screencast in 320×240 resolution saves a lot of time for the video sharing sites.
  • Yahoo! Video liked the MPEG4 format most. YouTube claims the same, though it wasn’t obvious after trying a few formats which one it liked most.
  • JumpCut was a snap to use, but the output quality was a little fuzzier.
  • Titles…I forgot the damn titles, and it just looked too weak without some kind of intro and outro. Camtasia gives you a couple of very simple options. I added an intro title in less than 5 minutes.
  • Logo! Ugh. After encoding it about 8 times to get the right format, I realized the logo really needed to be in there:
    1. I took a quick Snag-It screenshot of the YDN web site, played with it a bit and made a simple title screen.
    2. Saved it as a JPEG.
    3. Imported it into my Camtasia screencast.
    4. Inserted the title image at the beginning and a variation of the same at the end.
    5. Dropped a transition between the title frames and the video.
    6. Titles DONE. That took less than 30 minutes…could have taken 2 seconds if I’d been prepared.
  • Wait…the screen wasn’t big enough. You couldn’t see the graphic that Dan points to in his explanation because it’s too small. Not a problem. Camtasia includes a simple zoom tool:
    1. I played the screencast again and found where I needed to zoom.
    2. Inserted opening zoom marker
    3. Selected zoom size. Clicked done.
    4. Found the end of the segment where I wanted to zoom out.
    5. Inserted another zoom marker.
    6. Opened zoom window back up to full size.
    7. DONE. Maybe 15 minutes to do that.
  • Output one last time.
  • Upload.
  • DONE.

Then all I had to do was write a blog post and embed the video in that post. That took about 10 minutes.

All in all, I probably spent close to 2 hours beginning to end producing this screencast, but most of that was learning a few tricks. Next time I do this, I bet I can complete the whole thing from launching Camtasia to posting on a blog in 45 minutes, possibly less.