@kafran
Last active November 3, 2023 20:04
Python 3 script to extract images from HTTP Archive (HAR) files
import base64
import json
import os

# Images are written to ./imgs; the directory is created if it does not exist yet.
folder = os.path.join(os.getcwd(), "imgs")
os.makedirs(folder, exist_ok=True)

with open("scr.har", "r", encoding="utf-8") as f:
    har = json.load(f)

for entry in har["log"]["entries"]:
    content = entry["response"]["content"]
    # Only WebP responses are extracted; entries without a stored body are skipped.
    if content.get("mimeType") != "image/webp" or "text" not in content:
        continue
    # Name the output file after the last segment of the request URL.
    filename = entry["request"]["url"].split("/")[-1]
    file = os.path.join(folder, "{}.webp".format(filename))
    print(file)
    with open(file, "wb") as out:
        out.write(base64.b64decode(content["text"]))
@Lewiscowles1986


FurloSK commented Nov 3, 2023

Note that this gist overwrites files that share a filename but come from different URL paths, because the code does not create subfolders.

For an updated version that creates subfolders and takes the input file and output folder as parameters, see the fork here:
https://gist.github.com/FurloSK/0477e01024f701db42341fc3223a5d8c
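
To illustrate the subfolder idea, here is a minimal sketch of how a request URL could be mapped to an output path that mirrors the URL path. This is not the code from the fork; the helper names `output_path_for` and `save_bytes`, the `index` fallback name, and the choice to include the host as the top-level folder are all assumptions made for this example.

```python
import os
from urllib.parse import urlsplit


def output_path_for(url, out_dir):
    """Map a request URL to a file path under out_dir that mirrors the URL path."""
    parts = urlsplit(url)
    # Drop empty, "." and ".." segments so the result cannot escape out_dir.
    segments = [s for s in parts.path.split("/") if s not in ("", ".", "..")]
    if not segments:
        segments = ["index"]  # hypothetical fallback for URLs with no path
    # Assumption: a port separator in the host is replaced for filesystem safety.
    host = parts.netloc.replace(":", "_")
    return os.path.join(out_dir, host, *segments)


def save_bytes(url, data, out_dir="imgs"):
    """Write data to the mirrored path, creating subfolders as needed."""
    path = output_path_for(url, out_dir)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "wb") as out:
        out.write(data)
    return path
```

With this mapping, `https://example.com/a/logo.webp` and `https://example.com/b/logo.webp` land in different folders instead of overwriting each other. The query string is ignored, so two URLs that differ only in their query would still collide.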

kafran (Author) commented Nov 3, 2023

Thank you guys. This code is so old I barely remember it =)
