@dbeley
Last active September 14, 2020 00:44
Script to extract youtube urls from a json file returned by the google takeout export.
"""
Extract youtube urls from a json file returned by the google takeout export.
Usage : python extract_urls.py <name of json file>
"""
import logging
import argparse
import json
from pathlib import Path
logger = logging.getLogger()
def main():
args = parse_args()
with open(args.file, "r", encoding="utf-8") as f:
content = json.loads(f.read())
ids = [
"https://youtube.com/watch?v=" + x["contentDetails"]["videoId"]
for x in content
]
with open(f"Export_{Path(args.file).stem}.txt", "w") as f:
for i in ids:
f.write(i + "\n")
def parse_args():
format = "%(levelname)s :: %(message)s"
parser = argparse.ArgumentParser(
description="Extract youtube urls from a json file returned by the google takekout export."
)
parser.add_argument(
"--debug",
help="Display debugging information.",
action="store_const",
dest="loglevel",
const=logging.DEBUG,
default=logging.INFO,
)
parser.add_argument(
"file",
help="Youtube playlist JSON file from a google takeout export.",
type=str,
)
parser.set_defaults(boolean_flag=False)
args = parser.parse_args()
logging.basicConfig(level=args.loglevel, format=format)
return args
if __name__ == "__main__":
main()
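
For reference, the script assumes the Takeout file is a JSON array whose items carry the video ID under `contentDetails.videoId`. Here is a minimal sketch of the same extraction on made-up data. The IDs and the `extract_urls` helper are placeholders for illustration, not part of the gist, and a real export contains more fields per item.

```python
import json

# Hypothetical sample mimicking only the fields the script reads.
sample = json.loads("""
[
  {"contentDetails": {"videoId": "abc123"}},
  {"contentDetails": {"videoId": "def456"}}
]
""")


def extract_urls(items):
    """Build one watch URL per item, as the list comprehension in main() does."""
    return [
        "https://youtube.com/watch?v=" + item["contentDetails"]["videoId"]
        for item in items
    ]


urls = extract_urls(sample)
print("\n".join(urls))
```

An item without a `contentDetails` or `videoId` key would raise a `KeyError` here, just as it would in the script.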
@yannickcola

Hi, I am receiving this error when trying to run the script on my Google Takeout "YouTube Liked Videos" JSON file:

Traceback (most recent call last):
  File "extract_urls.py", line 55, in <module>
    main()
  File "extract_urls.py", line 17, in main
    content = json.loads(f.read())
  File "C:\Users\ycola\AppData\Local\Programs\Python\Python36\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x8d in position 520: character maps to <undefined>

Would you happen to have come across this before?

@dbeley

dbeley commented Sep 13, 2020

I've never had this kind of issue, but it seems to be an encoding error.

You can try forcing UTF-8 decoding by replacing line 16 of the script with this one:
    with open(args.file, "r", encoding="utf-8") as f:

(I also updated the gist to reflect the change.)
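
A likely explanation for the error: when `open()` is called without an `encoding` argument, Python falls back to the platform's preferred encoding, which the traceback shows was cp1252 on that Windows machine. Byte `0x8d` has no mapping in cp1252, but it is common inside UTF-8 multi-byte sequences. The sketch below reproduces this with a made-up one-item file. The `title` field, the file name, and the video ID are assumptions for illustration, not taken from a real export.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical item: "\u010d" (c with caron) is encoded in UTF-8 as bytes 0xC4 0x8D.
data = [{"title": "Humoreska \u010d. 7", "contentDetails": {"videoId": "abc123"}}]

path = Path(tempfile.mkdtemp()) / "likes.json"
path.write_text(json.dumps(data, ensure_ascii=False), encoding="utf-8")

# Decoding the UTF-8 bytes as cp1252 fails on 0x8d, as in the reported traceback.
try:
    path.read_text(encoding="cp1252")
    cp1252_failed = False
except UnicodeDecodeError as err:
    cp1252_failed = True
    print("cp1252:", err)

# Decoding explicitly as UTF-8 works regardless of the platform default.
content = json.loads(path.read_text(encoding="utf-8"))
print(content[0]["contentDetails"]["videoId"])
```

Passing `encoding="utf-8"` explicitly, as in the updated gist, makes the script behave the same on Windows, Linux, and macOS.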

@yannickcola

It worked, thank you very much!
