MLConf 2013 notes (2)

Pandora talk re: recommendation systems

Eric Bieschke, Chief Scientist @pandora

  • Expresses great love for A/B testing
  • "the big advantage about having a lot of data is that you can do experiments with real data, real users"

The importance of metrics

  • how you judge experiments shapes where you are headed
  • choose the wrong measuring stick and you wind up in the wrong place
  • choose the right measuring stick and progress is inevitable
  • improvements come from better hypotheses and better measuring sticks

What they used to measure

  • they originally looked at thumbs-up percentages as their measuring stick
    • did a lot of incremental optimizations around that, but
    • the optimizations skewed toward web users (not device users), who are more likely to vote either way
  • a better metric: total listening hours
    • but the weakness of that is that is skews pathological music nerds and Pandora moves towards people who want to listen to tons of music
    • events like major traffic delays in major cities make listenership go up, so not an accurate indicator of "addictiveness"
  • Finally hit on listening return rate (a rough sketch of computing such a rate follows after this list)
    • you want many users listening frequently on many days
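
Below is a minimal sketch of how a listening-return-rate style metric might be computed, assuming a simple event log of (user_id, date) listening sessions; the schema, function name, and exact definition (distinct listening days per user over a fixed window) are illustrative assumptions, not Pandora's actual formula.

```python
from collections import defaultdict
from datetime import date

def listening_return_rate(sessions, window_days=28):
    """Fraction of the window on which each user listened, averaged over users.

    sessions: iterable of (user_id, listen_date) tuples (assumed schema).
    Returns a float in [0, 1]; higher means users come back on more days.
    """
    days_per_user = defaultdict(set)
    for user_id, listen_date in sessions:
        days_per_user[user_id].add(listen_date)

    if not days_per_user:
        return 0.0

    per_user_rates = [len(days) / window_days for days in days_per_user.values()]
    return sum(per_user_rates) / len(per_user_rates)


# Example: two users over a 28-day window
sessions = [
    ("alice", date(2013, 11, 1)),
    ("alice", date(2013, 11, 2)),
    ("alice", date(2013, 11, 5)),
    ("bob", date(2013, 11, 1)),
]
print(listening_return_rate(sessions))  # (3/28 + 1/28) / 2 ≈ 0.071
```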

"deeper metrics"

The above metrics sit on top of "deeper metrics"

  • relevance - how tight and how broad specific stations are for specific users
  • prediction accuracy - how well did we recommend something
  • musical diversity
    • we want you to listen to 200 artists, not just 50 (a rough diversity sketch follows after this list)
    • novelty / surprise - how likely are we to hit you with something you've never heard; given two choices, Pandora will pick the one you least expect
  • awesomeness - hard to measure
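
To make the diversity idea concrete, here is a small sketch that counts distinct artists and adds a normalized-entropy score over a user's play history; the input format and scoring choices are assumptions for illustration, not the metrics from the talk.

```python
import math
from collections import Counter

def artist_diversity(plays):
    """Diversity of a user's play history.

    plays: list of artist names, one entry per spin (assumed input format).
    Returns (distinct_artist_count, normalized_entropy); the entropy term is
    near 1.0 when plays are spread evenly across artists and near 0.0 when
    one artist dominates.
    """
    counts = Counter(plays)
    n_artists = len(counts)
    total = sum(counts.values())
    if n_artists <= 1 or total == 0:
        return n_artists, 0.0

    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    return n_artists, entropy / math.log(n_artists)


print(artist_diversity(["A", "A", "A", "B"]))  # few artists, skewed plays
print(artist_diversity(["A", "B", "C", "D"]))  # evenly spread -> (4, ~1.0)
```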

What actual recommendation system do they use?

  • An "ensemble" recommender
  • they add and remove recommenders as they prove effective for a given user
  • the more varied the component techniques, the stronger the ensemble; orthogonality matters: the vintage-vinyl freak and the top-40 fan have very different profiles (a toy weighted-ensemble sketch follows after this list)
  • it's all about results; you need to have a goal in mind
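
A toy sketch of the ensemble idea as described: several recommenders score candidate tracks, per-user weights blend the scores, and a simple update nudges a recommender's weight as it proves effective for that user. The function names and the weight-update rule are illustrative assumptions, not Pandora's actual system.

```python
def ensemble_score(candidates, recommenders, user_weights):
    """Blend per-recommender scores with per-user weights.

    candidates:   list of track ids to rank
    recommenders: dict name -> scoring function f(track) -> float
    user_weights: dict name -> weight for this user (tuned from feedback)
    Returns candidates sorted best-first.
    """
    def blended(track):
        return sum(
            user_weights.get(name, 0.0) * score(track)
            for name, score in recommenders.items()
        )
    return sorted(candidates, key=blended, reverse=True)


def update_weight(weight, thumbs_up, learning_rate=0.1):
    """Nudge a recommender's weight up on a thumbs-up, down otherwise
    (a simple illustrative update rule, not the one from the talk)."""
    return weight + learning_rate if thumbs_up else max(0.0, weight - learning_rate)


# Example with two fake recommenders for one user
recommenders = {
    "genome_similarity": lambda t: {"t1": 0.9, "t2": 0.4}.get(t, 0.0),
    "collaborative":     lambda t: {"t1": 0.2, "t2": 0.8}.get(t, 0.0),
}
weights = {"genome_similarity": 0.7, "collaborative": 0.3}
print(ensemble_score(["t1", "t2"], recommenders, weights))  # ['t1', 't2']

# After a thumbs-up attributed to the collaborative recommender:
weights["collaborative"] = update_weight(weights["collaborative"], thumbs_up=True)
```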

The ensemble contains

  • the music genome project
  • collaborative filtering - when you have a lot of data, you don't need to get fancy with matrix factorization (see the co-occurrence sketch after this list)
  • collective intelligence; reinforcement learning - our listeners know what they want; they use "stations" as a de facto search term
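
A sketch of the "no need to get fancy" point: with enough data, plain item-item co-occurrence counts over listeners' liked tracks already yield usable collaborative-filtering neighbors. The data layout and function name are assumptions for illustration.

```python
from collections import defaultdict
from itertools import combinations

def cooccurrence_neighbors(user_likes):
    """Item-item collaborative filtering from raw co-occurrence counts.

    user_likes: dict user_id -> set of liked track ids (assumed layout).
    Returns dict track -> list of (other_track, co_like_count), best first.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for likes in user_likes.values():
        for a, b in combinations(sorted(likes), 2):
            counts[a][b] += 1
            counts[b][a] += 1

    return {
        track: sorted(others.items(), key=lambda kv: kv[1], reverse=True)
        for track, others in counts.items()
    }


likes = {
    "u1": {"songA", "songB", "songC"},
    "u2": {"songA", "songB"},
    "u3": {"songB", "songC"},
}
print(cooccurrence_neighbors(likes)["songB"])
# [('songA', 2), ('songC', 2)] -> tracks most often co-liked with songB
```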
