Skip to content

Instantly share code, notes, and snippets.

Last active February 25, 2021 17:22
Show Gist options
  • Save craigds/00331c6ff8fd2334de68a52ef0cd79c2 to your computer and use it in GitHub Desktop.
Save craigds/00331c6ff8fd2334de68a52ef0cd79c2 to your computer and use it in GitHub Desktop.
#!/usr/bin/env python
Converts Internet Explorer 'capture network traffic' XML to a HAR file.
Turns out that XML is just a HAR file anyways, but in XML form. So this
just converts it to JSON, and Bob's your uncle.
Requires Python 2.7+ and LXML.
from __future__ import unicode_literals
import argparse
import json
from lxml import objectify
import sys
if sys.version_info > (3,):
str_type = str
str_type = unicode
list_things = {
def xml_to_dict(element):
if element.tag in list_things:
return [xml_to_dict(e) for e in element.getchildren()]
if element.getchildren():
return {e.tag: xml_to_dict(e) for e in element.getchildren()}
return str_type(element.pyval)
def main():
parser = argparse.ArgumentParser(description="Convert IE's crazy XML-HAR into a real HAR file")
parser.add_argument('infile', type=argparse.FileType('r'), default=sys.stdin)
parser.add_argument('outfile', type=argparse.FileType('w'), default=sys.stdout)
args = parser.parse_args()
tree = objectify.parse(args.infile)
root = tree.getroot()
d = {root.tag: xml_to_dict(root)}
json.dump(d, args.outfile, indent=2, sort_keys=True)
if __name__ == '__main__':
Copy link

This is fantastic, thank you very much.

Copy link

cordylus commented May 1, 2017

This just saved my bacon. Thanks!

Copy link

YES, thanks alot! The other alternatives where getting Fiddler to work on linux or get a windows VM....

Copy link

fbender commented Aug 18, 2017

For anyone struggling with file encodings on the Windows (MSYSGIT) command line like me, I created a fork that forces UTF-8 when reading and writing the files. This small fix cost me almost three hours today since I was puzzled why lxml or an extra open() would ignore my encoding (unless I specify 'ascii'). Turns out argparse has a bit more magic in it than I thought.

Copy link

Coming from php; had to do this to the json in order for it to be validated in Source xml came from IE11.

$contents = file_get_contents('NetworkData.har');
$json = json_decode($contents, TRUE);
foreach ($json['log']['entries'] as $key => &$values) {
  $values['time'] = (int) $values['time'];
  $values['timings']['receive'] = (int) $values['timings']['receive'];
  $values['timings']['send'] = (int) $values['timings']['send'];
  $values['timings']['wait'] = (int) $values['timings']['wait'];
  if (!isset($values['response']['redirectURL'])) {
    $values['response']['redirectURL'] = '';
  if (!isset($values['cache'])) {
    $values['cache'] = array();
file_put_contents('NetworkData2.har', $output);

Copy link

Does it work in Python 3, or only in 2.7?

Copy link

Thanks a lot for this handy tool, much appreciated!

Copy link

mohitt commented May 9, 2019

No Bob is not my uncle :P

Copy link

giriva commented Sep 25, 2019

I want to record chrome session in HAR and playback it in IE. Is there an option?

Copy link

nimiiy commented Nov 13, 2020

So glad I landed here, it's very useful :D Thank you so much!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment