@vansika
Last active October 16, 2020 13:45

The message would have been long, so I have written it here. Please read all of it.

**get me top x recommendations:

Screenshot from 2020-10-16 18-15-03

This is not a good option. We already show recs sorted by score, and they aren't good enough. Even a small change should take us from bad to better (at least in theory), not from bad to bad or from bad to worse. This UI feature would not really please users, and I don't think they would come back to it after trying it once.

**get bottom x recommendations:

Screenshot from 2020-10-16 18-19-38

This looks better than the previous playlist (in my case). The artists are spread out down the playlist. There can be many reasons for this. Let me explain why it happened in my case: collaborative filtering (CF) couldn't find a user similar to me who listens to Indian artists, so these artists and their tracks went down the list and are scattered. But I don't think this is a good option either. We haven't processed this list in any way, so users might still get a playlist like "get me top x recommendations", i.e. one centered around one or a few artists.

Note**: Machine learning is only one aspect of recommendations. Just as we pre-process data before applying ML algorithms, we must post-process the output to get something useful. It is not a good idea to show on the site whatever we get from Spark as-is.

**get a small playable playlist:

Note**: showing a diverse playlist of maybe 10 or 25 tracks (depending on how many artists are kicked out by the mapping) is far better than showing a playlist of 1k tracks, which will overwhelm the user and may keep them from coming back. Small playlists are playable, and users may eagerly wait for next week's playlist once they have played all the songs in the current one (hopefully).

Screenshot from 2020-10-16 18-37-05

This looks simple and sweet to me (just a hack to manage the 1k recs until the daily jams are ready).

    # `rows` and `mbids_and_ratings` (recording MBID -> score) are assumed
    # to be defined earlier in the enclosing function.
    listens = []
    for row in rows:
        score = mbids_and_ratings.get(row['recording_mbid'])
        if score is not None:
            listens.append({
                'listened_at': 0,
                'track_metadata': {
                    'artist_name': row['artist_credit_name'],
                    'track_name': row['recording_name'],
                    'release_name': row.get('release_name', ""),
                    'additional_info': {
                        'recording_mbid': row['recording_mbid'],
                        'artist_mbids': row['[artist_credit_mbids]']
                    }
                },
                'score': score,
                'artist_credit_id': row['artist_credit_id']
            })

    # Highest-scored recommendations first.
    listens = sorted(listens, key=lambda x: x['score'], reverse=True)

    # Keep at most two tracks per artist credit, preserving score order.
    playlist = []
    tracks_per_artist = {}
    for row in listens:
        artist_credit_id = row['artist_credit_id']
        count = tracks_per_artist.get(artist_credit_id, 0)
        if count < 2:
            tracks_per_artist[artist_credit_id] = count + 1
            playlist.append(row)

Code attached for reference. It is of course not the final version, just something to give an idea.
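For discussion, the same idea can be packaged as a small standalone function. This is only a sketch under assumptions: the function name `diversify`, the parameters `max_per_artist` and `playlist_size`, and the sample data are all made up for illustration and are not part of the existing code. It sorts by score, keeps at most a fixed number of tracks per artist, and stops once the playlist is long enough.

```python
from collections import Counter


def diversify(recommendations, max_per_artist=2, playlist_size=25):
    """Hypothetical helper: return at most `playlist_size` recs, highest
    score first, with no more than `max_per_artist` tracks per artist."""
    ranked = sorted(recommendations, key=lambda rec: rec['score'], reverse=True)
    tracks_per_artist = Counter()
    playlist = []
    for rec in ranked:
        artist = rec['artist_credit_id']
        # Skip the track if this artist already has enough entries.
        if tracks_per_artist[artist] >= max_per_artist:
            continue
        tracks_per_artist[artist] += 1
        playlist.append(rec)
        if len(playlist) == playlist_size:
            break
    return playlist


# Made-up sample data: artist 1 dominates the top of the list.
sample = [
    {'recording_name': 'a1', 'artist_credit_id': 1, 'score': 0.9},
    {'recording_name': 'a2', 'artist_credit_id': 1, 'score': 0.8},
    {'recording_name': 'a3', 'artist_credit_id': 1, 'score': 0.7},
    {'recording_name': 'b1', 'artist_credit_id': 2, 'score': 0.6},
    {'recording_name': 'c1', 'artist_credit_id': 3, 'score': 0.5},
]

small_playlist = diversify(sample, max_per_artist=2, playlist_size=3)
names = [rec['recording_name'] for rec in small_playlist]
```

With the sample above, the third track by artist 1 is dropped and the next artist moves up, so the playlist stays short and is not centered on a single artist.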

I want to use a table to store the processed 1k recs, which will possibly reduce to 30 or so depending on how many different artists there are. We want to store them in the DB so that the playlist doesn't change with every refresh.
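To make the "store once, serve the same playlist on every refresh" idea concrete, here is a rough sketch. It uses an in-memory SQLite database from the Python standard library purely as a stand-in for our real DB, and the table name `processed_recs`, the function `get_or_create_playlist`, and the `build_playlist` callback are all hypothetical names, not existing code.

```python
import json
import sqlite3


def get_or_create_playlist(conn, user_id, build_playlist):
    """Hypothetical helper: return the stored playlist for `user_id`,
    building and saving it only if nothing is stored yet."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS processed_recs ("
        "user_id INTEGER PRIMARY KEY, playlist TEXT NOT NULL)"
    )
    row = conn.execute(
        "SELECT playlist FROM processed_recs WHERE user_id = ?", (user_id,)
    ).fetchone()
    if row is not None:
        # Already processed: serve the stored copy unchanged.
        return json.loads(row[0])
    playlist = build_playlist()
    conn.execute(
        "INSERT INTO processed_recs (user_id, playlist) VALUES (?, ?)",
        (user_id, json.dumps(playlist)),
    )
    conn.commit()
    return playlist


conn = sqlite3.connect(':memory:')
calls = []


def build():
    # Stand-in for the real post-processing step; records each invocation.
    calls.append(1)
    return ['rec-%d' % len(calls)]


first = get_or_create_playlist(conn, 1, build)
second = get_or_create_playlist(conn, 1, build)  # simulates a page refresh
```

The second call returns the stored playlist without running the processing again. Replacing the stored row when new recommendations are generated is left out of this sketch.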

If you don't agree with this temporary solution of mine, I am open to suggestions and other ways of implementing it.
