vansika/info.md

## info.md

The message would have been long, so I have written it here. Please read all of it.
**get me top x recommendations:**

This is not a good option. We already show recs sorted by score, and they aren't good enough. Even for a small
change, I don't think we should go from bad -> bad or bad -> worse, only from bad -> better (at least in theory). This UI feature would not really please users, and I don't think they will come back to it after trying it once.
**get bottom x recommendations:**

This looks better than the previous playlist (in my case): the artists are spread out down the playlist. There can be many reasons for this. In my case, collaborative filtering (CF) couldn't find a user similar to me who listens to Indian artists, so these artists and their tracks went down the list and got scattered. But I don't think this is a good option either. We haven't processed this list in any way, so users might still get a playlist like the one from get me top x recommendations, i.e. centered around one or a few artists.
**Note:** Machine learning is only one aspect of recommendations. Just as we pre-process data before applying ML algorithms, we must post-process the output to get something useful. It is not a good idea to show whatever we get from Spark on the site as is.
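
To make the "centered around one or a few artists" point concrete, here is a minimal sketch that counts tracks per artist in the top x and bottom x of a scored list. The field names (`artist`, `score`), the helper `artist_spread`, and the sample data are all made up for illustration; this is not the actual recs format.

```python
from collections import Counter


def artist_spread(recs, x, from_top=True):
    """Count tracks per artist in the top (or bottom) x recommendations.

    `recs` is assumed to be a list of dicts with hypothetical 'artist'
    and 'score' keys.
    """
    ranked = sorted(recs, key=lambda r: r['score'], reverse=True)
    if from_top:
        window = ranked[:x]
    else:
        # Slice from the end; max() keeps the start index non-negative.
        window = ranked[max(len(ranked) - x, 0):]
    return Counter(r['artist'] for r in window)


# Made-up data: one artist dominates the highest scores.
sample_recs = [
    {'artist': 'A', 'score': 0.9},
    {'artist': 'A', 'score': 0.8},
    {'artist': 'A', 'score': 0.7},
    {'artist': 'B', 'score': 0.3},
    {'artist': 'C', 'score': 0.2},
    {'artist': 'D', 'score': 0.1},
]

top = artist_spread(sample_recs, 3)
bottom = artist_spread(sample_recs, 3, from_top=False)
print(top)
print(bottom)
```

In this toy data the top 3 are all one artist while the bottom 3 are three different artists, but nothing guarantees that spread in general, which is why some post-processing seems needed either way.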
**get a small playable playlist:**
**Note:** Showing a diverse playlist of maybe 10 or 25 tracks (depending on how many artists are kicked out by the mapping) is way better than showing a playlist of 1k tracks, which will overwhelm users and may put them off coming back.
Small playlists are playable, and users may eagerly wait for next week's playlist once they have played all the songs of the current one (hopefully).

This looks simple and sweet to me (just a hack to manage the 1k recs till the daily jams are ready).

    for row in rows:
        if mbids_and_ratings.get(row['recording_mbid']) is not None:
            listens.append({
                'listened_at': 0,
                'track_metadata': {
                    'artist_name': row['artist_credit_name'],
                    'track_name': row['recording_name'],
                    'release_name': row.get('release_name', ""),
                    'additional_info': {
                        'recording_mbid': row['recording_mbid'],
                        'artist_mbids': row['[artist_credit_mbids]']
                    }
                },
                'score': mbids_and_ratings[row['recording_mbid']],
                'artist_credit_id': row['artist_credit_id']
            })

    listens = sorted(listens, key=lambda x: x['score'], reverse=True)
    
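    # Keep at most two tracks per artist credit so that no single artist
    # dominates the playlist. `listens` is already sorted by score, so the
    # two highest-scored tracks of each artist are the ones that survive.
    playlist = []
    tracks_per_artist = {}
    for row in listens:
        artist_id = row['artist_credit_id']
        if tracks_per_artist.get(artist_id, 0) < 2:
            tracks_per_artist[artist_id] = tracks_per_artist.get(artist_id, 0) + 1
            playlist.append(row)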

The code is attached for reference. It is of course not the final version, just meant to give an idea.
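
For anyone who wants to try the idea in isolation, here is a self-contained sketch of the same per-artist cap. The function name `cap_tracks_per_artist`, the `max_per_artist` parameter, and the sample rows are assumptions for illustration; only the `artist_credit_id` and `score` keys come from the snippet above.

```python
from collections import defaultdict


def cap_tracks_per_artist(recs, max_per_artist=2):
    """Return recs sorted by score, keeping at most `max_per_artist`
    tracks for each artist_credit_id."""
    ranked = sorted(recs, key=lambda r: r['score'], reverse=True)
    counts = defaultdict(int)
    playlist = []
    for rec in ranked:
        artist_id = rec['artist_credit_id']
        if counts[artist_id] < max_per_artist:
            counts[artist_id] += 1
            playlist.append(rec)
    return playlist


# Made-up rows: two artists with three tracks each, one with a single track.
sample = [
    {'recording_name': 't1', 'artist_credit_id': 1, 'score': 0.9},
    {'recording_name': 't2', 'artist_credit_id': 1, 'score': 0.8},
    {'recording_name': 't3', 'artist_credit_id': 1, 'score': 0.7},
    {'recording_name': 't4', 'artist_credit_id': 2, 'score': 0.6},
    {'recording_name': 't5', 'artist_credit_id': 2, 'score': 0.5},
    {'recording_name': 't6', 'artist_credit_id': 2, 'score': 0.4},
    {'recording_name': 't7', 'artist_credit_id': 3, 'score': 0.3},
]

small_playlist = cap_tracks_per_artist(sample)
names = [rec['recording_name'] for rec in small_playlist]
print(names)
```

The cap is a parameter here only so it is easy to experiment with 1, 2, or 3 tracks per artist and see how short the playlist gets.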
I wanted to use a table to store the processed 1k recs, which will possibly reduce to 30 or so depending on how many different artists there are. We want to store them in the db so that the playlist doesn't change with every refresh.
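
A rough sketch of the "store it once, read it on every refresh" idea follows. It uses an in-memory SQLite database from the Python standard library purely as a stand-in for whatever db we actually use. The table name `weekly_playlist`, its columns, and the helper functions are hypothetical.

```python
import json
import sqlite3


def create_table(conn):
    # One row per user; the processed playlist is stored as a JSON string.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS weekly_playlist ("
        "user_name TEXT PRIMARY KEY, tracks TEXT NOT NULL)"
    )


def save_playlist(conn, user_name, playlist):
    # Overwrite the stored playlist only when a new one is generated.
    conn.execute(
        "INSERT OR REPLACE INTO weekly_playlist (user_name, tracks) "
        "VALUES (?, ?)",
        (user_name, json.dumps(playlist)),
    )
    conn.commit()


def load_playlist(conn, user_name):
    # Every page refresh reads the same stored playlist.
    row = conn.execute(
        "SELECT tracks FROM weekly_playlist WHERE user_name = ?",
        (user_name,),
    ).fetchone()
    return json.loads(row[0]) if row else None


conn = sqlite3.connect(":memory:")
create_table(conn)
stored = [{'recording_name': 't1', 'score': 0.9}]
save_playlist(conn, 'example_user', stored)
first_view = load_playlist(conn, 'example_user')
second_view = load_playlist(conn, 'example_user')
print(first_view == second_view)
```

The point of the design is that the filtering runs once per batch of recs, and the site only ever reads the stored result.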
If you don't agree with this temporary solution of mine, I am open to suggestions and other ways of implementing it.