Search: Big indexes versus microformats

I was just notified by Amazon that John Battelle’s book on search is in the mail.  It got me thinking that John probably didn’t have enough time to take into account the recent developments with microformats and tagging that are changing the search landscape, though, of course, I haven’t read it yet.  But if his book is all about the battle of the indexes and the big brains that made them, then it may be missing the pending explosion in information discovery happening all around us today outside of those big indexes.

In my mind, the end game is not about whether or not the search results page is good enough.  Search is just one piece of the process, the negotiation.  The transaction that ultimately matters to me is discovery.  Indexing enables some very useful tools for broad searching across really wide data sets to narrow results to something digestible. And that story is really important in the evolution of the Internet.  But there’s a new approach to finding things happening right now that may be more profound.

Soon search is going to be about the explicit relationships between people and the discovery mechanisms they create directly between each other.  People who think like me, who have solved the same problems I have, who know people who can help me are going to provide a direct route to precisely the things that matter to me.  I don’t have to negotiate with a machine when I can use the knowledge pool of my peers to locate exactly what I need when I need it.

This change became more obvious to me when tags came along.  Tags are labels people add to pages.  They are explicit organizing principles applied to things of interest.  My connection to people and the tags that they use to track things of interest to them makes it possible to discover things that also matter to me.  I don’t even have to ask my friends for their expertise...they are leaving breadcrumbs to their minds all over the Internet.

When I look in MyWeb, I see things that I need to read.  They are items that people in my small network have saved and tagged.  My explicit relationship to them makes the items more relevant than the results of a wide net cast over a huge data set that gives me data in a just-in-time kind of way...a very 1990s way.  For example, MyWeb shows me things that people who are living in a Web 2.0 world discover about XML.  That helps me stay in tune with the implications of XML in a new media environment in a far more precise and relevant way than any result I get from a search engine query on "XML".  Further, I’m not likely to check out music just because it interests people who find XML fascinating; I have a different network of friends serving that need for me…again, not something I get from big indexes in search engines.  You can’t say to Google, "show me new music that I might like".
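To make the idea concrete, here is a rough sketch of discovery through a network's tags rather than through a big index. It is not how MyWeb works internally; the bookmark data, the user names, and the `discover` function are all made up for illustration.

```python
from collections import Counter

# Hypothetical data: who saved what, and the tags they applied.
bookmarks = [
    {"user": "alice", "url": "http://example.com/xml-feeds", "tags": {"xml", "web2.0"}},
    {"user": "bob", "url": "http://example.com/xml-feeds", "tags": {"xml", "rss"}},
    {"user": "bob", "url": "http://example.com/new-album", "tags": {"music"}},
    {"user": "carol", "url": "http://example.com/xml-schema", "tags": {"xml"}},
]

def discover(bookmarks, network, tag):
    """Rank URLs carrying `tag` by how many people in `network` saved them."""
    counts = Counter(
        b["url"]
        for b in bookmarks
        if b["user"] in network and tag in b["tags"]
    )
    # most_common() sorts by count, highest first.
    return [url for url, _ in counts.most_common()]

# Only items saved by people I have an explicit relationship with are considered.
my_network = {"alice", "bob"}
results = discover(bookmarks, my_network, "xml")
```

The same function with a different `network` set answers a different question, which is the point: my XML network and my music network do not have to be the same people.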

But tags are just the beginning of this breakthrough in precise discovery.  Microformats are going to make it possible to mash up global data in a very local way.  When people begin marking more of the things they interact with using explicit descriptive data (events, prices, ratings, playlists, etc.), then tools will evolve that give me the things that matter to me via the people that matter to me in the context that matters at the moment.
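As an illustration of what "explicit descriptive data" can look like, here is a minimal sketch that pulls an event out of a page marked up with hCalendar, one of the existing microformats. The class names `vevent`, `summary`, `dtstart` and `location` come from hCalendar; the `VEventParser` class and the sample event are hypothetical, and the parser only handles flat markup with no tags nested inside a property.

```python
from html.parser import HTMLParser

# hCalendar property class names this sketch looks for.
PROPERTIES = {"summary", "dtstart", "location"}

class VEventParser(HTMLParser):
    """Collects hCalendar events from simple, flat markup (a sketch, not a full parser)."""

    def __init__(self):
        super().__init__()
        self.events = []
        self._current = None  # event dict being filled in
        self._prop = None     # property whose text is being collected

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = set((attrs.get("class") or "").split())
        if "vevent" in classes:
            self._current = {}
            self.events.append(self._current)
            return
        if self._current is None:
            return
        found = classes & PROPERTIES
        if found:
            name = found.pop()
            if tag == "abbr" and attrs.get("title"):
                # Machine-readable value lives in the title attribute.
                self._current[name] = attrs["title"]
            else:
                self._prop = name
                self._current[name] = ""

    def handle_data(self, data):
        if self._current is not None and self._prop:
            self._current[self._prop] += data

    def handle_endtag(self, tag):
        # Any closing tag ends the current property (flat markup only).
        self._prop = None

# Made-up sample event.
html = """
<div class="vevent">
  <span class="summary">Tagging meetup</span>
  <abbr class="dtstart" title="2005-10-01">October 1</abbr>
  <span class="location">Example City</span>
</div>
"""

parser = VEventParser()
parser.feed(html)
events = parser.events
```

Once the event is a plain dictionary, a tool can combine it with who published it and who in my network saved it, which is where the local mashup comes from.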

The hard part is making a fluid connection from a thing to a person to me.  It must take no more than 2 seconds for a person to add microformat data.  And there must be an immediate and beneficial gain upon completing the additional markup or they won’t do it.

It would be a mistake to assume that people will tag or add microformat data because they can.  People will do it because it’s easy and because there’s an intrinsic incentive.  When you rate something in the Yahoo! Music Engine, you get the sense that the system will offer related music that you like.  When you tag things in MyWeb, your search results get more precise.  When you post a change in Wikipedia, your post appears immediately on the live site.  The more you interact with these tools, the smarter they become.  That is what will differentiate the successful search tools in the next generation.

Chris Tolles, a former member of the Open Directory Project, likes to dispel the myths of tagging.  He has a man vs machine view where machines are more likely to win the game of finding things:

"If I had a nickel for every starry eyed idealist point to tagging saving the world, I'd be able to fund my own blog search engine."

But the economics of participatory media are starting to form some tangible results.  Umair Haque at Bubblegeneration explains the model:

"Web 2.0 is about the shift from network search economies, which realize mild exponential gains - your utility is bounded by the number of things (people, etc) you can find on the network - to network coordination economies, which realize combinatorial gains: your utility is bounded by the number of things (transactions, etc) you can do on the network."

It’s not man versus machine.  It’s man helping machine to help man.


Re: Search: Big indexes versus microformats
by Tolles on Thu 15 Sep 2005 11:55 PM EDT

Nice writeup -- I actually agree with a lot of what you're pointing out here (despite my snarky quote). The nuances of making all of this work when confronted with massive mechanical SPAM attacks will be, in my opinion, the hallmarks of the next generation of search.

The centralized authority of an Amazon type approach to aggregate user generated reviews and opinions has worked very well -- but it's far different when talking about the "adhocracies" as they gain audiences. (Just look at Google's blog search to get an idea of what SPAM is doing to user generated data through the filter of search).

I'd like to see more thought on the security and trust frameworks that are likely to be required for all of this to remain valuable -- Ebay, the ODP, Amazon have all done an ok job in creating something that aggregates user "tagged" info, scales and resists attack -- making a similar system work in a truly distributed way is far, far harder.

Check out threadwatch or some of the *really* blackhat SEO forums out there - The Morlocks are sharpening their knives and licking their chops over this tagging thing.

All that being said -- I'd like this all to work :-)


Matt McAlister :: Search: Big indexes versus microformats
Excerpt:  Matt, while waiting for Battelle's The Search to arrive, posits that microformats will trump massive indices of content in the long run. Example -- I use the GMap Pedometer to build bicycle rides...
Posted:  Fri Sep 16 08:18:52 EDT 2005