## Matrix Factorization

**Recovering latent factors in a matrix**
V is a matrix with n rows (users) and m columns (movies): V[i,j] = user i’s rating of movie j.

**KDD 2011**
talk pilfered from …..

**Recovering latent factors in a matrix**
V ≈ W H, where V is n users × m movies, W is n × r, and H is r × m. V[i,j] = user i’s rating of movie j.

**Matrix factorization as SGD**
[Equation slide; surviving label: step size]

**Matrix factorization as SGD - why does this work?**
Here’s the key claim:

**Checking the claim**
• Think of SGD for logistic regression
• LR loss = compare y and ŷ = dot(w,x)
• Similar, but now update both w (user weights) and x (movie weights)

**What loss functions are possible?**
• N1, N2: diagonal matrices, sort of like IDF factors for the users/movies
• “Generalized” KL-divergence

**More detail….**
• Randomly permute rows/cols of matrix
• Chop V, W, H into blocks of size d x d
  • m/d blocks in W, n/d blocks in H
• Group the data:
  • Pick a set of blocks with no overlapping rows or columns (a stratum)
  • Repeat until all blocks in V are covered
• Train the SGD
  • Process strata in series
  • Process blocks within a stratum in parallel

**More detail….**
Z was V

**More detail….**
M = [matrix figure not recovered]
• Initialize W, H randomly
  • not at zero
• Choose a random ordering (random sort) of the points in a stratum in each “sub-epoch”
• Pick strata sequence by permuting rows and columns of M, and using M’[k,i] as column index of row i in subepoch k
• Use “bold driver” to set step size:
  • increase step size when loss decreases (in an epoch)
  • decrease step size when loss increases
• Implemented in Hadoop and R/Snowfall

**Hadoop scalability**
Hadoop process setup time starts to dominate
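
**Sketch: stratified SGD in code**
The procedure in the "More detail" slides (permute, chop into blocks, group blocks into strata, run SGD stratum by stratum, set the step size with a bold driver) can be sketched in plain Python. This is a minimal illustration under several assumptions that are not in the slides: squared loss; strata built as shifted diagonals of the block grid rather than from the permuted matrix M; blocks within a stratum run one after another instead of in parallel; bold-driver factors of 1.05 and 0.5; and H stored transposed, so `H[j]` holds the r factors of column j. All function names (`make_strata`, `sgd_block`, `dsgd`) are hypothetical.

```python
import random


def make_strata(num_blocks):
    """Stratum k pairs row block b with column block (b + k) % num_blocks,
    so no two blocks in a stratum share rows or columns."""
    return [[(b, (b + k) % num_blocks) for b in range(num_blocks)]
            for k in range(num_blocks)]


def predict(W, H, i, j):
    """Dot product of row i's factors and column j's factors."""
    return sum(w * h for w, h in zip(W[i], H[j]))


def squared_loss(ratings, W, H):
    return sum((v - predict(W, H, i, j)) ** 2 for i, j, v in ratings)


def sgd_block(block, W, H, step, rng):
    """One SGD pass over a block's (i, j, v) entries, in random order."""
    rng.shuffle(block)
    for i, j, v in block:
        err = v - predict(W, H, i, j)
        for f in range(len(W[i])):
            w, h = W[i][f], H[j][f]
            # Squared-loss gradient step; the constant 2 is folded into step.
            W[i][f] = w + step * err * h
            H[j][f] = h + step * err * w


def dsgd(ratings, n_rows, n_cols, rank, num_blocks, epochs, step=0.02, seed=0):
    rng = random.Random(seed)
    # Initialize W, H randomly, not at zero (range is an assumption).
    W = [[rng.uniform(0.1, 0.5) for _ in range(rank)] for _ in range(n_rows)]
    H = [[rng.uniform(0.1, 0.5) for _ in range(rank)] for _ in range(n_cols)]

    # Randomly permute rows/columns, then chop them into num_blocks groups.
    row_pos = list(range(n_rows))
    col_pos = list(range(n_cols))
    rng.shuffle(row_pos)
    rng.shuffle(col_pos)
    blocks = {}
    for i, j, v in ratings:
        key = (row_pos[i] * num_blocks // n_rows,
               col_pos[j] * num_blocks // n_cols)
        blocks.setdefault(key, []).append((i, j, v))

    strata = make_strata(num_blocks)
    history = [squared_loss(ratings, W, H)]
    for _ in range(epochs):
        rng.shuffle(strata)              # random stratum order each epoch
        for stratum in strata:           # strata are processed in series
            for key in stratum:          # disjoint rows/cols: parallelizable
                sgd_block(blocks.get(key, []), W, H, step, rng)
        history.append(squared_loss(ratings, W, H))
        # Bold driver: grow the step if the loss fell, shrink it if it rose.
        step *= 1.05 if history[-1] < history[-2] else 0.5
    return W, H, history
```

For example, `dsgd(ratings, 4, 4, rank=2, num_blocks=2, epochs=50)` on a list of `(row, column, value)` triples returns the two factor lists and the per-epoch loss history. Because the blocks in one stratum touch disjoint rows of W and disjoint columns of H, the inner loop over `stratum` is the part a distributed implementation would hand to separate workers.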