@paramsingh
Last active February 12, 2017 11:06
ListenBrainz stats work.

Last.fm currently shows top artists, top albums and top tracks on the user page, in addition to the flat list of listens that ListenBrainz already has. On artist and album pages, it prominently shows top listeners and top tracks.

There is already an idea for this on the GSoC 2017 page, but right now it only covers generating user stats, with no mention of artist or album pages. I propose that we add artist and album pages to the idea as well, and then break it into two parts: the first I'll work on before the summer, and the second would be a GSoC project in which a student (hopefully me) continues the work already done.

Right now, I think the work can be broken into two parts as follows:

  1. The first part would be getting the basic infrastructure for graph building done before the summer. I would start with the infrastructure required for getting information out of the data that we're streaming to BigQuery. I'm not sure of the exact technical details yet (I'll probably need some time and help with this), but right now I'm thinking that we'll need a class BigQueryReader with methods that run the queries passed to it against the data we have in BigQuery. BigQueryReader would use google-api-client to send queries and would probably have some way of allowing queries of any type to be run on the data. When this is done, I would start with graphs for the user pages, as these require the least legwork. I was thinking of using a charting library built on d3.js (like plotly.js) to draw the graphs. Right now, I think bar charts for top artists, top albums and top tracks are the minimum that we should add first, as that brings us up to speed with what Last.fm shows on its user page.

  2. The second part would involve completing any graphs for user pages that haven't been done by then and building pages for artists, albums and tracks. This would be the actual GSoC project. Once the user stats are done, I would start with the pages for artists, albums and tracks. There are no views for any of these in ListenBrainz as of now, but we can borrow ideas from CritiqueBrainz to get this done well, and then add stats to each of these pages too.
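
To make the BigQueryReader idea from the first part a bit more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the table name `listens`, the column names `user_name`, `artist_name` and `listen_count`, the `run_query` callable (which stands in for whatever google-api-client code ends up sending the query) and the `to_bar_trace` helper are all hypothetical. The point is only that the reader can accept arbitrary queries while the transport stays swappable, and that the rows map easily onto a bar chart trace of the kind plotly.js draws.

```python
from typing import Any, Callable, Dict, List, Optional

# Hypothetical runner signature: takes SQL text plus named parameters and
# returns the result rows as dicts. The real one would wrap google-api-client.
QueryRunner = Callable[[str, Dict[str, Any]], List[Dict[str, Any]]]


class BigQueryReader:
    """Thin wrapper that hides how queries actually reach BigQuery."""

    def __init__(self, run_query: QueryRunner, table: str = "listens"):
        self._run_query = run_query
        self._table = table  # hypothetical table name

    def query(self, sql: str, params: Optional[Dict[str, Any]] = None) -> List[Dict[str, Any]]:
        """Run any query passed in and return its rows."""
        return self._run_query(sql, params or {})

    def top_artists(self, user_name: str, limit: int = 10) -> List[Dict[str, Any]]:
        """Most-listened artists for one user (column names are assumptions)."""
        sql = (
            "SELECT artist_name, COUNT(*) AS listen_count "
            f"FROM {self._table} "
            "WHERE user_name = @user_name "
            "GROUP BY artist_name "
            "ORDER BY listen_count DESC "
            f"LIMIT {int(limit)}"  # int() keeps the inlined value numeric
        )
        return self.query(sql, {"user_name": user_name})


def to_bar_trace(rows: List[Dict[str, Any]], label_key: str,
                 count_key: str = "listen_count") -> Dict[str, Any]:
    """Turn result rows into a dict shaped like a plotly.js bar trace."""
    return {
        "type": "bar",
        "x": [row[label_key] for row in rows],
        "y": [row[count_key] for row in rows],
    }
```

Because the runner is injected, the class can be tested with a stub function that returns canned rows, without touching BigQuery at all. Top albums and top tracks would be near copies of `top_artists` with a different grouping column.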

Tentative Timeline of work before the summer

  • February: Be done with tests and alpha_importer and get beta deployed and tested, fixing minor bugs etc.
  • March 1 - March 20 (before student application): Complete work on the method to get information from BigQuery and, if this gets done early (which it probably will), start with the charts on user pages.
  • March 20 - April: Student application. The exact project will depend on how much work is done already, but it will surely include adding pages for artists, albums and tracks.
  • Later: Students are announced; hopefully I'm one of them and I start working on the project.

Notes

  • Adding pages for artists and albums brings up the problem that some listens will not have MBIDs identifying their artists etc. I don't know exactly how Last.fm identifies scrobbles as belonging to a particular artist, but maybe we can do some automatic recognition of metadata à la Picard (again, I'm not very familiar with how Picard does this, but it works really well most of the time). From what I see, getting this done is listed as a GSoC idea in itself, and it's a pretty interesting problem.

  • Adding pages for artists etc. also brings up the problem that CritiqueBrainz is facing right now: too many requests made through musicbrainzngs. There is a GSoC idea for CritiqueBrainz which involves modifying the code to use the MusicBrainz database directly instead of making API requests. Maybe we can avoid the problem entirely by using the MusicBrainz database from the beginning.

  • Right now, I've not given much thought to caching the results, although I think it would be part of the work done in the summer, as we'd probably like to cache the data required by opened pages in Redis. Probably an LRU cache here; I'd love opinions on this.

  • Once my work before the summer is complete, making and adding new graphs to ListenBrainz should be pretty easy. We could probably branch off to new graphs that Last.fm does not show. I haven't yet thought much about what we could do and am hoping the community chimes in with ideas on what it wants. :)
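
For the caching note above, this is a small in-process sketch of what LRU eviction means, written with Python's `OrderedDict`. It is only an illustration of the policy, not a proposal for the actual cache: the class name `LRUCache` and the example key format are made up here, and in practice Redis can be configured to evict keys with an (approximate) LRU policy itself through its `maxmemory-policy` setting, so we might not need to write any of this by hand.

```python
from collections import OrderedDict


class LRUCache:
    """Keeps at most `capacity` items, dropping the least recently used one."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._capacity = capacity
        self._items = OrderedDict()  # oldest entry first, newest last

    def get(self, key, default=None):
        if key not in self._items:
            return default
        self._items.move_to_end(key)  # a read counts as a use
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)  # a write counts as a use
        if len(self._items) > self._capacity:
            self._items.popitem(last=False)  # evict the least recently used


# Hypothetical usage: cache computed stats per page.
cache = LRUCache(capacity=2)
cache.put("user:alice:top_artists", ["A", "B"])
cache.put("user:bob:top_artists", ["C"])
cache.get("user:alice:top_artists")      # alice's page was just opened again
cache.put("user:carol:top_artists", [])  # evicts bob's entry, not alice's
```

The appeal of LRU for stats pages is that pages people keep opening stay warm, while stats for rarely visited users fall out of the cache and get recomputed on demand.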
