Improving Bloglines Recommendations
Bloglines is in the best possible position, if it chooses, to build a recommendation service. It already has a basic service, but its usefulness is pretty limited because it is heavily weighted by popularity. Bloglines has consistently suggested that I should be reading Slashdot, The Register, and Salon. Let’s assume that if my profile suggests I’m a geek, and I’m not subscribed to Slashdot, it isn’t because I haven’t heard of it. What we need is inverse frequency weighting: if you and I are the only people on Earth who subscribe to a handful of feeds, I’m very interested in your recommendations for related reading. The more I think about this, the more it sounds like a vector space problem.
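To make the idea concrete, here is a rough sketch of what inverse frequency weighting over subscriptions might look like. Nothing here reflects how Bloglines actually works: the subscriber names, the feed names, and the functions idf_weights, similarity, and recommend are all made up for illustration, and the scoring (cosine similarity between subscribers, with candidate feeds discounted by their own popularity) is just one reasonable choice.

```python
import math
from collections import Counter

# Hypothetical data: subscriber -> feeds they subscribe to.
subscriptions = {
    "me":  ["obscure-a", "obscure-b"],
    "you": ["obscure-a", "obscure-b", "niche-c", "slashdot"],
    "u3":  ["slashdot", "register"],
    "u4":  ["slashdot", "salon"],
    "u5":  ["slashdot", "register", "salon"],
}

def idf_weights(subscriptions):
    """Weight each feed by log(N / subscribers): rare feeds score high."""
    n = len(subscriptions)
    counts = Counter(f for feeds in subscriptions.values() for f in set(feeds))
    return {feed: math.log(n / c) for feed, c in counts.items()}

def similarity(a, b, weights):
    """Cosine similarity between two subscribers' weighted feed vectors."""
    a, b = set(a), set(b)
    dot = sum(weights[f] ** 2 for f in a & b)
    norm_a = math.sqrt(sum(weights[f] ** 2 for f in a))
    norm_b = math.sqrt(sum(weights[f] ** 2 for f in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def recommend(user, subscriptions, top_n=3):
    """Suggest feeds the user lacks, favouring similar readers and rare feeds."""
    weights = idf_weights(subscriptions)
    mine = set(subscriptions[user])
    scores = Counter()
    for other, feeds in subscriptions.items():
        if other == user:
            continue
        sim = similarity(mine, feeds, weights)
        for feed in set(feeds) - mine:
            # Discount popular candidates so Slashdot does not always win.
            scores[feed] += sim * weights[feed]
    return [feed for feed, score in scores.most_common(top_n) if score > 0]

print(recommend("me", subscriptions))
```

In this toy data set the only subscriber who shares my obscure feeds is "you", so the rare feed "niche-c" should rank ahead of the near-universal "slashdot".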
And Bloglines has a few advantages that other services, like Winer’s Share Your OPML, don’t have. Bloglines knows whether I’m actually reading the feeds I’m subscribed to, or just letting items pile up. Bloglines knows I’m more likely to read the Guardian than the BBC, more likely to read the New York Times Review of Books than their national news, and that, while I’m subscribed to 6 feeds on ColdFusion, I haven’t read any of them since December.
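One way that reading behaviour could feed into the model is to replace the yes/no subscription flag with an engagement score. The sketch below assumes, hypothetically, that per-feed counts of items read and items published are available; the function names and the numbers are invented for illustration and are not real Bloglines data.

```python
def engagement(items_read, items_published):
    """Fraction of a feed's items actually read, clamped to [0, 1]."""
    if items_published <= 0:
        return 0.0
    return max(0.0, min(items_read / items_published, 1.0))

def reading_vector(stats):
    """Map feed -> engagement score, dropping feeds that are never read."""
    vector = {}
    for feed, (read, published) in stats.items():
        score = engagement(read, published)
        if score > 0:
            vector[feed] = score
    return vector

# Hypothetical numbers: (items read, items published) per feed.
stats = {
    "guardian": (45, 50),
    "bbc": (10, 50),
    "coldfusion-1": (0, 30),  # subscribed, but unread since December
}
print(reading_vector(stats))
```

A feed I subscribe to but never open drops out of my vector entirely, which is the signal a bare OPML subscription list cannot provide.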
Of course, as my own experiments with finding similar items suggest, modelling problems as a vector space has its drawbacks. Vector space models tend to consume a lot of memory, and they can’t be updated incrementally. With Bloglines seemingly adding several hundred new feeds a day to the system, there would probably have to be a delay of a week to a month before new feeds were added to the mix/vector.