Sunday, 28 December 2008

Into the top 40

Yahoo! After the submission on December 26th 2008 that put me at position 42, today (December 28th 2008) I made it into the top 40. With a new (and simple) postprocessor that accounts for a bias a user may have on a specific day, I got the score down to 0.8755, which ranks me at position 36! Only one small disappointment... the improvement over the Cinematch score is 7.98%. The next step is to get over 8%... it shouldn't be too hard since there is still plenty of room for improvement. I just have to wait another 24 hours before I can submit again...
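For readers wondering what such a postprocessor could look like, here is a minimal sketch in Python. It is not the actual GreenCircle code: the function names, the input layout and the `shrinkage` constant are all assumptions made for illustration. The idea is to take the predictions of an existing model, average what is left over (rating minus prediction) per user and day, and add that average back as a correction.

```python
from collections import defaultdict


def fit_user_day_bias(samples, shrinkage=5.0):
    """Estimate a residual bias for every (user, day) pair.

    samples: iterable of (user, day, rating, base_prediction) tuples,
    where base_prediction comes from some already-trained model.
    shrinkage is a hypothetical damping constant: it pulls the bias
    towards zero when a user rated only a few movies on that day.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for user, day, rating, base in samples:
        sums[(user, day)] += rating - base
        counts[(user, day)] += 1
    return {key: sums[key] / (counts[key] + shrinkage) for key in sums}


def postprocess(user, day, base, bias, lo=1.0, hi=5.0):
    """Add the (user, day) bias to a base prediction and clip to the rating scale."""
    corrected = base + bias.get((user, day), 0.0)
    return min(hi, max(lo, corrected))


# Toy data (made up): user "u1" rates higher than predicted on 2005-03-01.
train = [
    ("u1", "2005-03-01", 5, 3.6),
    ("u1", "2005-03-01", 4, 3.2),
    ("u1", "2005-07-09", 2, 3.1),
]
bias = fit_user_day_bias(train, shrinkage=2.0)
print(postprocess("u1", "2005-03-01", 3.0, bias))  # nudged upwards
print(postprocess("u2", "2005-03-01", 3.0, bias))  # unknown pair, unchanged
```

The shrinkage term matters because many user and day combinations contain only one or two ratings; without it the correction would mostly fit noise.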

Tuesday, 9 December 2008


Hi, this is a little blog about team GreenCircle in the Netflix Prize competition. My name is Willem Mestrom and I am the sole member of team GreenCircle. In May 2008 I first read some things about this competition. Since I am a mathematician and computer scientist, I figured it would be nice to see if I could compete, and team GreenCircle was born.

Since then I have learned an awful lot about collaborative filtering: from different matrix factorization techniques and Restricted Boltzmann Machines to asymmetric factoring and some really fancy nearest neighbor techniques.

Putting all of this new knowledge into a computer program and optimizing it so it can be used on a huge dataset like the Netflix dataset proved interesting. Even though this is pretty close to the kind of things I had to do at university when I was studying, it is still no easy task to manage so many data points in an efficient way.

On November 7th 2008 I finally got to a point where I had a nice prediction set to submit to Netflix. It came back with an RMSE of 0.8909. A very nice result for a first submission. In the next weeks I submitted a few more sets, making big steps towards a nice ranking on the leaderboard. Today, December 26th 2008, I reached a score of 0.8773 (at this time ranking me at position 42). Just short of the top 40, which is displayed by default. But with quite a few ideas for improvements still on my desk I'm sure I will be in the top 40 really soon...
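For anyone not familiar with the scoring: RMSE is the root mean squared error between predicted and actual ratings, so lower is better. The small Python sketch below shows how it is computed and how a score translates into a percentage improvement over a baseline. The function names are made up for this example, and the baseline of 0.9514 is the publicly reported Cinematch score on the quiz set, which is not stated in this post.

```python
import math


def rmse(predictions, ratings):
    """Root mean squared error between two equally long sequences."""
    if len(predictions) != len(ratings) or not ratings:
        raise ValueError("need two non-empty sequences of equal length")
    squared = sum((p - r) ** 2 for p, r in zip(predictions, ratings))
    return math.sqrt(squared / len(ratings))


def improvement_pct(score, baseline):
    """Relative improvement of score over baseline, in percent."""
    return 100.0 * (baseline - score) / baseline


# Assumption: Cinematch quiz-set RMSE as reported by Netflix, not taken from this post.
CINEMATCH_QUIZ_RMSE = 0.9514

# The two scores mentioned above.
for score in (0.8909, 0.8773):
    print(f"RMSE {score:.4f}: {improvement_pct(score, CINEMATCH_QUIZ_RMSE):.2f}% better than the baseline")
```

Since the grand prize requires a 10% improvement, the gap that still has to be closed can be read off directly from these percentages.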