Why Technorati is so slow
"Why Technorati feels slow" is a relevant piece by Stephen Baker in Business Week. It is very interesting if you've ever wondered why the service is so slow:
In the last year, with the blogosphere doubling twice in size, Technorati has had to re-engineer its system. Originally, says Hertz, it dealt with all the data in one big (and ever-expanding) pool. In the last nine months, engineers have rearranged the data in different segments. At the same time, they're enabling it to comb through the data more intelligently, sorting each piece so that it can be cross-referenced. For example, this post can be associated with me as a blogger, with Blogspotting, with BW, with Technorati, with the search industry, and with any of you who link to it. Each one of those relations has meaning and value. But offering all these dimensions adds layer upon layer of complexity to blog search. "In general, our traffic isn’t the big gating factor," he says. "It’s the amount of new data that we’re managing." (...) New services will continue to add to the complexity. In the future, says Hertz, Technorati will organize bloggers by their specialties, and perhaps even rank the authority they have on certain subject matters. (...) For many, the first impulse when faced with a crush of blog data would be to add servers. That's the easiest part, says Hertz. "The trick here is when we have to break things into pieces, or invent brand new systems to do the data management."
What's more, blog search engines, unlike Google, have to update this data continuously. They're providing a look at time as it passes. Yesterday, with the London bombing, traffic exploded, taxing the Technorati system. Instead of the usual 800,000 new posts, Technorati was on track yesterday to process 1.2 million of them.
There is more about it there, in a .doc file written by the author. Why do I blog this? Technorati is a powerful and interesting tool, but it's sometimes slow, and I just wanted to know why. What is also cool in this blog post is that the CTO of PubSub (a Technorati competitor) expands the discussion.