@josephx86
Forked from Lewiscowles1986/extract_har.py
Created March 30, 2023 05:58
Python 3 script to extract images from HTTP Archive (HAR) files
import base64
import json
import os

# Create the output directory if it does not already exist.
folder = os.path.join(os.getcwd(), "imgs")
os.makedirs(folder, exist_ok=True)

# Map each supported image MIME type to the file extension to write.
extensions = {
    "image/webp": "webp",
    "image/jpeg": "jpg",
    "image/png": "png",
}

with open("src.har", "r", encoding="utf-8") as f:
    har = json.load(f)

for entry in har["log"]["entries"]:
    content = entry["response"]["content"]
    ext = extensions.get(content.get("mimeType"))
    # Skip non-image responses and entries whose body was not captured.
    if ext is None or "text" not in content:
        continue
    filename = entry["request"]["url"].split("/")[-1]
    file = os.path.join(folder, f"{filename}.{ext}")
    print(file)
    # The script assumes the captured image body is base64-encoded.
    with open(file, "wb") as out:
        out.write(base64.b64decode(content["text"]))
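
The script takes the file name from the last segment of the request URL, so a URL with a query string or an existing extension gives names such as `photo.jpg?w=200.jpg`. Below is a minimal sketch of one way to build a cleaner name. The helper `image_filename` and its `fallback` parameter are hypothetical and not part of the original gist.

```python
import os
from urllib.parse import unquote, urlsplit


def image_filename(url: str, ext: str, fallback: str = "image") -> str:
    """Build a file name from a URL's path, ignoring its query string and fragment."""
    # urlsplit separates the path from any "?query" and "#fragment" parts.
    path = unquote(urlsplit(url).path)
    base = os.path.basename(path.rstrip("/"))
    # Drop an extension that is already present so "photo.jpg" does not become "photo.jpg.jpg".
    stem, _ = os.path.splitext(base)
    # Fall back to a generic stem when the URL path has no usable last segment.
    return f"{stem or fallback}.{ext}"
```

In the loop, `image_filename(entry["request"]["url"], ext)` could stand in for the `split("/")[-1]` expression and the f-string that appends the extension. Different URLs can still map to the same name, so later entries may overwrite earlier ones.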