The stats we compute from BigQuery need to be saved on our server. The approach I've come up with is one table per stat, with foreign keys to the "user" table and msids identifying the entities.
table | columns | notes |
---|---|---|
user_top_artists | (user_id, artist_msid, listen_count) | Top artists of a user; user_id is an FK to "user". |
user_top_releases | (user_id, release_msid, listen_count) | Top releases of a user. |
user_top_recordings | (user_id, recording_msid, listen_count) | Top recordings of a user. |
artist_top_releases | (artist_msid, release_msid, listen_count) | Top releases of an artist. |
artist_top_recordings | (artist_msid, recording_msid, listen_count) | Top recordings of an artist. |
release_top_recordings | (release_msid, recording_msid, listen_count) | Top recordings of a release. |
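As a rough sketch of how these tables hang together, here is a minimal version of two of them with an example query. SQLite is used purely for illustration (the production database and exact column types are not specified here), and the sample "user" table columns and msid values are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Minimal stand-in for the existing "user" table (columns assumed).
CREATE TABLE "user" (
    id             INTEGER PRIMARY KEY,
    musicbrainz_id TEXT NOT NULL
);

-- One row per (user, artist) pair with the aggregated listen count.
CREATE TABLE user_top_artists (
    user_id      INTEGER NOT NULL REFERENCES "user" (id),
    artist_msid  TEXT    NOT NULL,
    listen_count INTEGER NOT NULL
);

-- Same shape, but keyed by msids on both sides.
CREATE TABLE artist_top_releases (
    artist_msid  TEXT    NOT NULL,
    release_msid TEXT    NOT NULL,
    listen_count INTEGER NOT NULL
);
""")

# Fetching a user's top artists is then a simple ordered select.
conn.execute('INSERT INTO "user" VALUES (1, ?)', ("rob",))
conn.executemany(
    "INSERT INTO user_top_artists VALUES (?, ?, ?)",
    [(1, "msid-a", 40), (1, "msid-b", 95)],
)
top = conn.execute(
    "SELECT artist_msid, listen_count FROM user_top_artists "
    "WHERE user_id = ? ORDER BY listen_count DESC",
    (1,),
).fetchall()
print(top)  # [('msid-b', 95), ('msid-a', 40)]
```

The remaining four tables follow the same pattern, differing only in which msid columns form the key.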
We also need to record when the stats for a particular entity or user were last calculated. This involves creating one new table per entity (i.e. three new tables: artist, release, recording), each containing a last_updated column that stores the timestamp of the last stats update for that entity. A last_updated column will also be added to the "user" table.
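The bookkeeping above could look like the following sketch, again using SQLite for illustration; the msid primary key and the upsert pattern are assumptions about how these tables would be populated.

```python
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- One table per entity, each carrying a last_updated timestamp.
-- ("release" is quoted because it is an SQL keyword.)
CREATE TABLE artist (
    msid         TEXT PRIMARY KEY,
    last_updated TEXT
);
CREATE TABLE "release" (
    msid         TEXT PRIMARY KEY,
    last_updated TEXT
);
CREATE TABLE recording (
    msid         TEXT PRIMARY KEY,
    last_updated TEXT
);
""")

# After recomputing the stats for an artist, stamp when it happened.
now = datetime.now(timezone.utc).isoformat()
conn.execute(
    "INSERT INTO artist (msid, last_updated) VALUES (?, ?) "
    "ON CONFLICT (msid) DO UPDATE SET last_updated = excluded.last_updated",
    ("msid-a", now),
)
stored = conn.execute(
    "SELECT last_updated FROM artist WHERE msid = ?", ("msid-a",)
).fetchone()[0]
print(stored == now)  # True
```

A stats-generation job can then compare last_updated against the current time to decide which entities are stale and need recomputation.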