Ryan5453/apple-music.md

## apple-music.md

      
Note: this no longer works.
Let's start.
First we need something to request. We're going to start with a playlist, but I'll show you how to request a track and an album later.
url = "https://music.apple.com/us/playlist/todays-hits/pl.f4d106fed2bd41149aaacabb233eb5eb"
Let's get the raw HTML so we can extract what we need (the token).
import requests

url = "https://music.apple.com/us/playlist/todays-hits/pl.f4d106fed2bd41149aaacabb233eb5eb"

response = requests.get(url)
html = response.text
print(html)
Now, the raw HTML isn't helpful on its own, so we will use BeautifulSoup (bs4) to pull out the meta tag that holds the data we need.
import requests
from bs4 import BeautifulSoup

url = "https://music.apple.com/us/playlist/todays-hits/pl.f4d106fed2bd41149aaacabb233eb5eb"

response = requests.get(url)
html = response.text
webpage = BeautifulSoup(html, "html.parser")
raw = webpage.find_all("meta",attrs={"name":"desktop-music-app/config/environment"})[0].get("content")
print(raw)
Ew, this data is all messy and URL-encoded. Let's make it look nicer.
import requests
from bs4 import BeautifulSoup
from urllib.parse import unquote

url = "https://music.apple.com/us/playlist/todays-hits/pl.f4d106fed2bd41149aaacabb233eb5eb"

response = requests.get(url)
html = response.text
webpage = BeautifulSoup(html, "html.parser")
raw = webpage.find_all("meta",attrs={"name":"desktop-music-app/config/environment"})[0].get("content")
clean = unquote(raw)
print(clean)
OK, that looks nicer. The decoded string is JSON, so now let's parse it and get the data we really need.
import requests
from bs4 import BeautifulSoup
from urllib.parse import unquote
import json

url = "https://music.apple.com/us/playlist/todays-hits/pl.f4d106fed2bd41149aaacabb233eb5eb"

response = requests.get(url)
html = response.text
webpage = BeautifulSoup(html, "html.parser")
raw = webpage.find_all("meta",attrs={"name":"desktop-music-app/config/environment"})[0].get("content")
clean = unquote(raw)
dictifyed = json.loads(clean)
data = dictifyed["MEDIA_API"]["token"]
print(data)
Awesome, we now have the token for the API.
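If you want to see the decoding steps on their own without hitting the network, here's a minimal sketch. The sample value is made up by me (the real content attribute is much longer), and extract_token is just a hypothetical helper name. It returns None instead of crashing when the expected keys aren't there, which is roughly what you'd run into if the page layout changes.

```python
import json
from urllib.parse import quote, unquote


def extract_token(raw_content):
    """Decode the URL-encoded JSON from the meta tag and return the token, or None."""
    try:
        config = json.loads(unquote(raw_content))
        return config["MEDIA_API"]["token"]
    except (ValueError, KeyError, TypeError):
        # Not JSON, or the expected keys are missing (for example, the page changed).
        return None


# Made-up stand-in for the meta tag's content attribute.
sample_raw = quote(json.dumps({"MEDIA_API": {"token": "example-token"}}))

print(sample_raw)                    # the messy URL-encoded form
print(extract_token(sample_raw))     # the token pulled out of it
print(extract_token("not%20json"))   # None, because this isn't JSON
```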
At the time of writing (9/26/2021), Apple only requires the Authorization: Bearer <token> header to access the track data. A plain requests.get() works for the normal webpage, but a real browser sends more headers than that, so we will send them too and make our requests look more like natural ones.
import requests
from bs4 import BeautifulSoup
from urllib.parse import unquote
import json

url = "https://music.apple.com/us/playlist/todays-hits/pl.f4d106fed2bd41149aaacabb233eb5eb"

def get_creds(url):
  headers = {
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
    'Host': 'music.apple.com',
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.0 Safari/605.1.15',
    'Accept-Language': 'en-US,en;q=0.9',
    'Accept-Encoding': 'gzip, deflate, br',
    'Connection': 'keep-alive',
  }
  response = requests.get(url, headers=headers)
  html = response.text
  webpage = BeautifulSoup(html, "html.parser")
  raw = webpage.find_all("meta",attrs={"name":"desktop-music-app/config/environment"})[0].get("content")
  clean = unquote(raw)
  dictifyed = json.loads(clean)
  data = dictifyed["MEDIA_API"]["token"]
  return data
print(get_creds(url))
These are the exact headers that Safari on my MacBook Pro sent, minus the Cookie header. I've also moved the code into its own function to make it easier to read.
OK, now that we have a lower chance of getting blocked by Apple, let's get the actual API data. We're again going to copy the headers from my MacBook.
import requests
from bs4 import BeautifulSoup
from urllib.parse import unquote
import json

url = "https://music.apple.com/us/playlist/todays-hits/pl.f4d106fed2bd41149aaacabb233eb5eb"

def get_creds(url):
  headers = {
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
    'Host': 'music.apple.com',
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.0 Safari/605.1.15',
    'Accept-Language': 'en-US,en;q=0.9',
    'Accept-Encoding': 'gzip, deflate, br',
    'Connection': 'keep-alive',
  }
  response = requests.get(url, headers=headers)
  html = response.text
  webpage = BeautifulSoup(html, "html.parser")
  raw = webpage.find_all("meta",attrs={"name":"desktop-music-app/config/environment"})[0].get("content")
  clean = unquote(raw)
  dictifyed = json.loads(clean)
  data = dictifyed["MEDIA_API"]["token"]
  return data

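def extract_url_data(url):
    # Splitting the playlist URL on "/" gives:
    # ['https:', '', 'music.apple.com', 'us', 'playlist', 'todays-hits', '<playlist id>']
    array = url.split("/")
    country = array[3]
    playlist_id = array[6]
    return country, playlist_id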

def get_data(country, playlist_id, token):
  headers = {
    'Accept': '*/*',
    'Origin': 'music.apple.com',
    'Referer': 'music.apple.com',
    'Accept-Language': 'en-US,en;q=0.9',
    'Host': 'amp-api.music.apple.com',
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.0 Safari/605.1.15',
    'Authorization': f"Bearer {token}",
    'Accept-Encoding': 'gzip, deflate, br',
    'Connection': 'keep-alive',
    'Sec-Fetch-Dest': 'empty',
    'Sec-Fetch-Mode': 'cors',
    'Sec-Fetch-Site': 'same-site',
  }
  params = (
    ('l', 'en-us'),
  )
  url = f"https://amp-api.music.apple.com/v1/catalog/{country}/playlists/{playlist_id}/tracks"
  response = requests.get(url, headers=headers, params=params)
  return response.text

creds = get_creds(url)
country, playlist_id = extract_url_data(url)
data = get_data(country, playlist_id, creds)
print(data)
Woah! That's a big change. I'll explain.
Let's start with the extract_url_data function. This one's pretty easy: all it does is take in the URL and return a tuple of the country and the playlist ID from the URL. We use this data when we request the Apple Music API.
Now onto the get_data function. This is the main function that requests the data from the Apple Music API. It uses the token we got earlier from the get_creds function.
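Splitting on "/" works, but it keeps any query string glued to the last piece. As an alternative, here's a sketch that uses urllib.parse.urlsplit from the standard library instead. The name parse_music_url is mine, and it assumes the URL path always looks like /country/type/name/id, which is what the examples here do.

```python
from urllib.parse import urlsplit


def parse_music_url(url):
    """Return (country, item_id) from an Apple Music web URL."""
    # urlsplit separates the path from the query string, so "?..." never ends up in the ID.
    parts = urlsplit(url).path.strip("/").split("/")
    if len(parts) < 4:
        raise ValueError(f"Unexpected Apple Music URL: {url}")
    return parts[0], parts[3]


playlist_url = "https://music.apple.com/us/playlist/todays-hits/pl.f4d106fed2bd41149aaacabb233eb5eb"
print(parse_music_url(playlist_url))
```

For the playlist URL above this gives the same country and ID as the split-based version.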
That's it! Or is it? We never covered how to request a track or album.
Before we get to requesting an Apple Music album, we need to talk about a single track.
Here's an example of an album: https://music.apple.com/us/album/rock-the-beat-ii/1440636622
And a track from that album: https://music.apple.com/us/album/rock-the-beat-ii/1440636622?i=144063662
Looks familiar, right? The URL of an Apple Music track is just the album URL with an i query parameter that holds the track ID. When you want the data for a track, you request the whole album and iterate through its tracks until the id in a track dictionary matches i. Then you have your data.
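The lookup described above can be sketched like this. The album data, IDs, and URL are made up for the example, and both function names are mine. I'm also assuming the track IDs come back as strings, so if yours are numbers you'd need to convert before comparing.

```python
from urllib.parse import parse_qs, urlsplit


def track_id_from_url(url):
    """Return the value of the i query parameter, or None if the URL has none."""
    values = parse_qs(urlsplit(url).query).get("i")
    return values[0] if values else None


def find_track(tracks, track_id):
    """Return the first track dictionary whose "id" equals track_id, or None."""
    for track in tracks:
        if track.get("id") == track_id:
            return track
    return None


# Made-up album tracks and IDs, only to show the lookup.
sample_tracks = [
    {"id": "1001", "attributes": {"name": "First Song"}},
    {"id": "1002", "attributes": {"name": "Second Song"}},
]
sample_url = "https://music.apple.com/us/album/example-album/1000?i=1002"

wanted = track_id_from_url(sample_url)
print(find_track(sample_tracks, wanted))
```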
Requesting an Apple Music album is very similar to requesting a playlist, so we can make our functions modular and support both albums and playlists.
import requests
from bs4 import BeautifulSoup
from urllib.parse import unquote
import json

url = "https://music.apple.com/us/album/sorry-for-party-rocking/1440636622"

def get_creds(url):
  headers = {
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
    'Host': 'music.apple.com',
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.0 Safari/605.1.15',
    'Accept-Language': 'en-US,en;q=0.9',
    'Accept-Encoding': 'gzip, deflate, br',
    'Connection': 'keep-alive',
  }
  response = requests.get(url, headers=headers)
  html = response.text
  webpage = BeautifulSoup(html, "html.parser")
  raw = webpage.find_all("meta",attrs={"name":"desktop-music-app/config/environment"})[0].get("content")
  clean = unquote(raw)
  dictifyed = json.loads(clean)
  data = dictifyed["MEDIA_API"]["token"]
  return data

def extract_url_data(url):
    # array[3] is the country, array[4] is "playlist" or "album", array[6] is the ID
    array = url.split("/")
    country = array[3]
    item_id = array[6]
    type = array[4]
    return country, item_id, type

def get_data(country, id, token, type):
  headers = {
    'Accept': '*/*',
    'Origin': 'music.apple.com',
    'Referer': 'music.apple.com',
    'Accept-Language': 'en-US,en;q=0.9',
    'Host': 'amp-api.music.apple.com',
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.0 Safari/605.1.15',
    'Authorization': f"Bearer {token}",
    'Accept-Encoding': 'gzip, deflate, br',
    'Connection': 'keep-alive',
    'Sec-Fetch-Dest': 'empty',
    'Sec-Fetch-Mode': 'cors',
    'Sec-Fetch-Site': 'same-site',
  }
  params = (
    ('l', 'en-us'),
  )
  if type == "playlist":
      type = "playlists"
  if type == "album":
      type = "albums"
  response = requests.get(f"https://amp-api.music.apple.com/v1/catalog/{country}/{type}/{id}/tracks", headers=headers, params=params)
  return response.json()

creds = get_creds(url)
country, id, type = extract_url_data(url)
data = get_data(country, id, creds, type)
print(data)
I've changed some variable names and made it support both playlists and albums. Cool, right?
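If you ever add more types, the two if statements could become a dictionary lookup. Here's a small sketch of that idea. build_tracks_url and RESOURCE_NAMES are names I made up, and I've only included the two types covered here.

```python
API_BASE = "https://amp-api.music.apple.com"

# URL path segment -> plural resource name used in the API endpoint.
RESOURCE_NAMES = {"playlist": "playlists", "album": "albums"}


def build_tracks_url(country, kind, item_id):
    """Build the tracks endpoint for a playlist or an album."""
    try:
        resource = RESOURCE_NAMES[kind]
    except KeyError:
        # Fail loudly instead of requesting an endpoint that doesn't exist.
        raise ValueError(f"Unsupported URL type: {kind!r}") from None
    return f"{API_BASE}/v1/catalog/{country}/{resource}/{item_id}/tracks"


print(build_tracks_url("us", "album", "1440636622"))
```

It also avoids reusing type and id as variable names, which shadow Python built-ins.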
One more thing. The Apple Music API uses pagination, which means it returns at most 100 tracks per request. For a playlist like the one we used, that isn't a problem because it's under 100 songs, but for longer ones a single request won't get you all the tracks. Luckily, the API makes it extremely easy to know when you need to paginate.
A playlist with 100 tracks or fewer will return data like this:
{
 "data": []
}
When a playlist needs to be paginated, you'll see a response like this:
{
 "next": "/v1/catalog/us/playlists/pl.u-BNA6YaXtp5lgjr/tracks?l=en-US&offset=100",
 "data": []
}
That link in "next" isn't a valid URL by itself. You'll need to attach the base URL (https://amp-api.music.apple.com) to the front, and then request it with the correct headers.
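Putting that together, a pagination loop could look something like the sketch below. To keep it runnable offline, the function takes the fetching step as a parameter, and I feed it made-up pages from a dictionary. The names collect_tracks and fetch_json are mine, not part of any library.

```python
API_BASE = "https://amp-api.music.apple.com"


def collect_tracks(first_page, fetch_json):
    """Gather "data" from every page, following "next" until it disappears."""
    tracks = list(first_page.get("data", []))
    page = first_page
    while page.get("next"):
        # "next" is only a path, so the base URL goes in front of it.
        page = fetch_json(API_BASE + page["next"])
        tracks.extend(page.get("data", []))
    return tracks


# Stand-in for the network: maps full URLs to made-up responses.
fake_pages = {
    API_BASE + "/v1/catalog/us/playlists/pl.example/tracks?l=en-US&offset=100": {
        "data": [{"id": "3"}],
    },
}
first = {
    "next": "/v1/catalog/us/playlists/pl.example/tracks?l=en-US&offset=100",
    "data": [{"id": "1"}, {"id": "2"}],
}

all_tracks = collect_tracks(first, fake_pages.__getitem__)
print([track["id"] for track in all_tracks])
```

With real requests, fetch_json would be something like a small function that calls requests.get(url, headers=headers).json() using the same headers as get_data.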