AOL data release and data mining freaks

It seems that data mining researchers and hackers have been going crazy over the recent AOL release of tons of data. The blog post "A chance to play with big data" gives a hint of the excitement:

Second, the new AOL Research site has posted a list of APIs and data collections from AOL.

Of most interest to me is data set of "500k User Queries Sampled Over 3 Months" that apparently includes {UserID, Query, QueryTime, ClickedRank, DestinationDomainUrl} for each of 20M queries. Drool, drool!

Update: Sadly, AOL has now taken the 500k data set offline. This is a loss to academic research community which, until now, has had no access to this kind of data.

There's also a NYT column about it:

A list of 20 million search inquiries collected over a three-month period was published last month on a new Web site meant to endear AOL to academic researchers by providing several sets of data for study. AOL assigned each of the users a unique number, so the list shows what a person was interested in over many different searches.

The release of the data shines a light on how much information people disclose about themselves, phrase by phrase, as they use search engines.
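To see why a per-user number matters, here is a minimal sketch of how records with the five fields quoted above could be grouped into one search history per user. The tab-separated layout, the header row, and the function name queries_by_user are my own assumptions, and the sample rows are made up for illustration; none of it comes from the actual AOL files.

```python
import csv
import io
from collections import defaultdict

# Made-up rows for illustration only; NOT taken from the AOL data.
# The tab-separated layout and header names are assumptions based on
# the field list {UserID, Query, QueryTime, ClickedRank, DestinationDomainUrl}.
SAMPLE = (
    "UserID\tQuery\tQueryTime\tClickedRank\tDestinationDomainUrl\n"
    "1\tcheap flights\t2006-03-01 10:00:00\t1\thttp://example.com\n"
    "2\tpizza recipes\t2006-03-01 10:05:00\t\t\n"
    "1\thotels in paris\t2006-03-02 09:30:00\t2\thttp://example.org\n"
)


def queries_by_user(text):
    """Group (QueryTime, Query) pairs by UserID, each list sorted by time."""
    history = defaultdict(list)
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    for row in reader:
        history[row["UserID"]].append((row["QueryTime"], row["Query"]))
    # Timestamps in this assumed format sort correctly as plain strings.
    for entries in history.values():
        entries.sort()
    return dict(history)


if __name__ == "__main__":
    for user, entries in queries_by_user(SAMPLE).items():
        print(user, [query for _, query in entries])
```

Even in this toy version, everything one user typed ends up in a single list, which is exactly the "phrase by phrase" picture the column describes.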