Skip to content

Instantly share code, notes, and snippets.

@ferbncode
Created March 5, 2017 18:31
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save ferbncode/4ed7b7d61cb5a0488acf15753990f74f to your computer and use it in GitHub Desktop.
Save ferbncode/4ed7b7d61cb5a0488acf15753990f74f to your computer and use it in GitHub Desktop.
dad
nickiana> help
6:26 PM im working on a project to extract data from music brainz, and we want the process to be completely automated.
6:27 PM
<yvanz> nickiana: just ask your question
6:27 PM
<ruaok> hi nickiana. what do you need help with then?
6:28 PM
<nickiana> our current environment is AWS RDS MYSQL, and we are trying to figure out the best option for us
6:29 PM i see that they are 2 options, i.e. virtual image or the data dump
6:29 PM im not sure what the pros and cons are of each option
6:29 PM
<ruaok> before you go any further, let us remind you that we do not support a DB other than postgres.
6:30 PM so, if you absolutely require using mysql, in whatever form, it is up to you to manage that. we DO NOT recommend you do that.
6:30 PM
<nickiana> 212.36.169.74 - http://webchat.freenode.net okay so it would have to postgres based, that's fine we can work round that
6:30 PM
<ruaok> great.
6:31 PM do you use docker for your installs?
6:31 PM if so, or are at least open to it, then this project is the easiest way to get our data into a working setup:
6:31 PM https://github.com/jsturgis/musicbrainz-docker
6:32 PM
<nickiana> 212.36.169.74 - http://webchat.freenode.net apologies i dont know what a docker is
6:32 PM
<ruaok> container technology for making deployment of software easier
6:32 PM
<nickiana> 212.36.169.74 - http://webchat.freenode.net my skillset has been predominately been sql
6:33 PM
<ruaok> https://www.docker.com/
6:33 PM k. does your team have a system administration or devops person?
6:33 PM
<nickiana> 212.36.169.74 - http://webchat.freenode.net yes we do
6:34 PM
<ruaok> then it might be best for you to work with them to figure how to deploy our software on your infrastructure.
6:34 PM you can send them here to ask for help.
6:34 PM as for you, you can assume that MB data will be delivered and available to you in Postgres.
6:34 PM what is your use case? how do you plan to use our data?
6:39 PM ⇐ arbenina quit (~quassel@46.149.92.139) Ping timeout: 260 seconds
6:45 PM
<nickiana> 212.36.169.74 - http://webchat.freenode.net one of my colleagues had done some work where he downloaded the virtual image, and was querying the database to get data out
6:45 PM
<ruaok> querying the database or using the API that is part of the virtual image?
6:45 PM
<nickiana> 212.36.169.74 - http://webchat.freenode.net but we would like the process to be automated, without any manual intervention
6:45 PM
<ruaok> we generally do not suggest that people not use the DB directly, but make calls against the API.
6:46 PM there are ways to automate the process, yes. but what data are you fetching?
6:46 PM
<nickiana> 212.36.169.74 - http://webchat.freenode.net we would like to enrich our database with MBZ data, so where we have missing fields like release date, track length
6:47 PM we would also like minimize data entry on our side, so create a process where the users can pull data across from MBZ into our database as and when required
6:48 PM
<ruaok> ok, no problem.
6:49 PM there are a few things that need to be done to get this to work right:
6:50 PM 1) match data from your DB to our DB. For that we suggest that when you find a match between the databases, that you import the MBID for the given entity into your database. This may very well be the most tedious step.
6:50 PM
<nickiana> 212.36.169.74 - http://webchat.freenode.net so we currently capture tracks and products information, and some of that information has missing fields, so we would like to update that where it is available on MBZ
6:50 PM
<ruaok> ah. I should ask you another question then... do you every need updates from you finish populating the data?
6:50 PM e.g. you pull in a release date. later we realize the release date is wrong and fix it our DB.
6:51 PM do you then need to update the date in your DB?
6:52 PM
<nickiana> 212.36.169.74 - http://webchat.freenode.net Yes absolutely we need to make sure that any changes made to mbz is in sync with our database, so we need a way of keeping tracks to mbz updates
6:52 PM
<ruaok> ok, then the process I started outlining is the right onw.
6:52 PM one.
6:52 PM
— ruaok continues with #2.
6:53 PM
<ruaok> 2) For this part, you will actually need to query the DB directly, but this is the only thing we suggest talking to the DB directly to.
6:54 PM Each of our tables has a last_updated or created_on column. So for each table that you are interested, you need to keep track of when you last checked the table for updates. then via our Live Data Feed we can provide updates hourly. then periodically, perhaps hourly, your script can run and see which MBIDs have changed.
6:55 PM 3) pass the list of changed MBIDs to another script that makes call to our API (hosted on your server) which then updates the data in your database.
6:55 PM I hope that makes sense.
6:56 PM
<Quesito> Hi nickiana! Is the project you are working on commercial?
6:57 PM
<nickiana> 212.36.169.74 - http://webchat.freenode.net yes it is
6:57 PM i work for universal music
6:58 PM
<ruaok> ah ha. :)
6:59 PM
<Quesito> Is this the first time you all are attempting set up?
6:59 PM
<ruaok> Quesito: no, they've been using our data for a while, but the person who was working with hit has left the company.
7:00 PM
<Quesito> ah, ok! thanks ruaok
7:01 PM
<nickiana> 212.36.169.74 - http://webchat.freenode.net my colleague has been doing some work on it, but the set up he has queries the data manually,but we would like to automate that
7:01 PM
<Quesito> well nickiana welcome! :)
7:02 PM
<nickiana> 212.36.169.74 - http://webchat.freenode.net i.e. i need to verify what he does but it involves alot of copying and pasting
7:02 PM
<ruaok> ugh. that's awful.
7:04 PM
<nickiana> 212.36.169.74 - http://webchat.freenode.net so basically i need to follow the steps on the github link
7:04 PM
<ruaok> for installing the service onto your own servers, yes.
7:04 PM
<nickiana> 212.36.169.74 - http://webchat.freenode.net i know you have aws postgres version as well
7:05 PM isnt there a way of replicating the data from postgres aws to mysql aws
7:06 PM
<ruaok> people have written various ways of doing that, but the software is never maintained.
7:06 PM this will become a problem for you later.
7:06 PM
<nickiana> 212.36.169.74 - http://webchat.freenode.net oh okay, so you wouldnt recommend that
7:07 PM
<ruaok> very much NOT recommended.
7:07 PM
<reosarevok> /query ruaok
7:07 PM :D or not
7:08 PM
<nickiana> 212.36.169.74 - http://webchat.freenode.net oakey dokey, many thanks ruaok, you have been very helpful, i'll be in touch if i get stuck
7:11 PM
<ruaok> ok, good luck!
7:16 PM → hibiscuskazeneko joined ⇐ Xinayder quit
7:22 PM
<nickiana> 212.36.169.74 - http://webchat.freenode.net thank you, much appreciated
7:23 PM → Xinayder, Leo_Verto and github joined ⇐ D4RK-PH0ENiX quit ↔ flyingrub nipped out
7:32 PM
<github>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment