October 03, 2007

Who Can Build an Inverse Technorati?

With all the hubbub earlier this week around the introduction of the new TechMeme Leaderboard, the fading aura of Technorati was once again brought to the fore. TechMeme's new offering was seen as challenging the longtime blog search engine's hold on deciding who owns the most "Authority" on the Web - best indicated by the number of unique sites that provide direct links to a blog.

In the last 12-18 months, Technorati has seen more than its fair share of bad news and bad karma. Between consistent bouts of downtime and sluggish responsiveness, an all-out assault from Google to own the blog search space, bloggers' gaming of the site's ranking index, and the loss of CEO David Sifry, many don't see the Web 2.0 pioneer pulling out of the spiral and reclaiming share - especially as its latest forays into innovation, WTF and Topics, are more confusing than useful.

Despite all the above, Technorati still performs an excellent set of functions - tracking who has linked to your blog, sorted by date or by "Authority", and giving you your own "Authority" count, based on the number of individual blogs pointing your way in the last six months.

But, partly due to our recent thoughts around internal links, and the work of Yuvi Panda showing how some of the biggest sites link outwardly, I've been thinking we need a spider-driven search engine that will index blogs and report on whom we link to most frequently. The question is, who builds it?

Ideally, the service would:

1. Provide aggregate reports on how many internal and external links were created, and in how many posts, over a given period.
2. Provide a ranking of the most-frequently linked-to sites or pages in a given period.
3. Recognize links from blog posts, and be able to exclude both "sidebars" and "action" buttons (e.g. for Digg, Ballhype, StumbleUpon).
4. Be able to display subsets of data, such as the ranking of most-frequently linked-to sites in posts where I had used a specific tag (e.g. Sports, Technology, Media).
5. Show me which bloggers have similar sites in their "Top 10 Linked", for example, which might indicate people with similar interests, whom I would undoubtedly want to read.
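
To make items 1, 2 and 5 a little more concrete, here is a minimal Python sketch of what the reporting side might look like. It is not how Technorati or Yuvi Panda's engine works; the names LinkCollector, link_report and top_linked_overlap are made up for illustration. It assumes the post bodies have already been fetched and stripped of sidebars and action buttons (item 3), and it uses simple set overlap (Jaccard similarity) as one possible stand-in for "similar sites".

```python
from collections import Counter
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag in one post body."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def link_report(posts, blog_url, top_n=10):
    """Summarize links in `posts` (a list of HTML post bodies).

    Assumption: a link is "internal" when its host matches the host of
    `blog_url`; every other host counts as external.
    """
    blog_host = urlparse(blog_url).netloc.lower()
    internal = external = posts_with_links = 0
    targets = Counter()  # external hosts only, for the ranking

    for html in posts:
        collector = LinkCollector()
        collector.feed(html)
        found = False
        for href in collector.hrefs:
            # Resolve relative links against the blog's own URL.
            host = urlparse(urljoin(blog_url, href)).netloc.lower()
            if not host:  # e.g. mailto: links have no host
                continue
            found = True
            if host == blog_host:
                internal += 1
            else:
                external += 1
                targets[host] += 1
        if found:
            posts_with_links += 1

    return {
        "internal": internal,
        "external": external,
        "posts_with_links": posts_with_links,
        "top_linked": targets.most_common(top_n),
    }


def top_linked_overlap(mine, others):
    """Rank other bloggers by the overlap between their top-linked hosts and mine.

    `mine` is a collection of hosts; `others` maps a blogger name to a
    collection of hosts. The score is Jaccard similarity (shared / combined).
    """
    mine = set(mine)
    scores = []
    for name, theirs in others.items():
        theirs = set(theirs)
        union = mine | theirs
        score = len(mine & theirs) / len(union) if union else 0.0
        scores.append((name, score))
    return sorted(scores, key=lambda pair: pair[1], reverse=True)
```

Feeding link_report a month of post bodies would give the internal and external totals plus a "Top 10 Linked" list; passing each blogger's list to top_linked_overlap would surface the people whose linking habits look most like mine. Filtering by tag (item 4) would just mean selecting which posts go in.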

Yuvi Panda has created a statistical engine that crunches a single site at a time, reporting back on internal and external link frequency. Could this service be expanded to crawl the entire blogosphere, like Technorati, and provide individual bloggers with their own statistics? And could this service ever be brought to market? I know I'd love to use it.