Tuesday, December 9, 2008

Netflix

Hi, this is a little blog about team GreenCircle in the Netflix Prize competition. My name is Willem Mestrom and I am the sole member of team GreenCircle. I first read about this competition in May 2008. Since I am a mathematician and computer scientist, I figured it would be nice to see if I could compete, and so team GreenCircle was born.

Since then I have learned an awful lot about collaborative filtering, from different matrix factorization techniques and Restricted Boltzmann Machines to asymmetric factoring and some really fancy nearest neighbor techniques.

Putting all of this new knowledge into a computer program and optimizing it so it can be used on a huge dataset like the Netflix dataset proved interesting. Even though this is pretty close to the kind of thing I had to do at university, it is still no easy task to manage so many data points efficiently.

On November 7th, 2008 I finally got to a point where I had a nice prediction set to submit to Netflix. It came back with an RMSE of 0.8909, a very nice result for a first submission. In the following weeks I submitted a few more sets, making big steps towards a nice ranking on the leaderboard. Today, December 26th, 2008, I reached a score of 0.8773, which at this time ranks me at position 42. That is just short of the top 40, which is displayed by default. But with quite a few ideas for improvements still on my desk, I'm sure I will be in the top 40 really soon...
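
For anyone unfamiliar with the score: RMSE is the root mean squared error between the predicted and the actual ratings, so lower is better. Here is a minimal sketch in Python of how such a score can be computed. The function name `rmse` and the ratings are made up for illustration; they are not Netflix data and this is not the code behind my submissions.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between two equally long rating lists."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty lists of equal length")
    # Sum the squared differences, average them, then take the square root.
    squared_error = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared_error / len(predicted))


# Hypothetical example: four predictions against four true star ratings.
predicted = [3.5, 4.0, 2.0, 5.0]
actual = [4, 4, 1, 5]
score = rmse(predicted, actual)
print(round(score, 4))
```

Because the errors are squared before averaging, one prediction that is off by a full star hurts more than two predictions that are each off by half a star.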
