Simple script to extract all <dom-module> Polymer component definitions from a vulcanized file.
# usage: python
# Parses a file called `vulcanized.html` and splits it into one file per
# <dom-module>, grouped into folders named after the first path segment
# of each module's `assetpath` attribute.
# It does not create those folders: if one is missing, open() raises an
# IOError/OSError, and the traceback shows which path to create.
# Requires beautifulsoup4.
from bs4 import BeautifulSoup
with open("vulcanized.html", encoding="utf-8") as data:
    soup = BeautifulSoup(data, "html.parser")

modules_list = soup.find_all('dom-module')
for module in modules_list:
    print(module.attrs["id"], module.attrs["assetpath"])
    # Drop the "../" prefixes and keep only the first path segment as the folder name.
    subfolder = module.attrs["assetpath"].replace("../", "")
    subfolder = subfolder[:subfolder.index("/")]
    with open("extracted_components/{1}/{0}.html".format(module.attrs["id"], subfolder), "wb") as dest:
        # Assumed body (not shown in the source): write the module's markup as UTF-8 bytes.
        dest.write(str(module).encode("utf-8"))
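The header comments note that the script does not create the destination folders. A possible variant, sketched below with only the standard library, derives the same path and creates the folder before writing. The helper names `destination_for` and `write_component` are hypothetical and not part of the original script. Using `split` instead of `index` is a small assumed change so that an `assetpath` without a slash does not raise `ValueError`.

```python
import os


def destination_for(module_id, assetpath, root="extracted_components"):
    """Return (folder, file_path) for one component, following the script's naming."""
    # Drop the "../" prefixes, then keep only the first path segment.
    subfolder = assetpath.replace("../", "").split("/", 1)[0]
    folder = os.path.join(root, subfolder)
    return folder, os.path.join(folder, module_id + ".html")


def write_component(module_id, assetpath, markup, root="extracted_components"):
    """Write one component's markup, creating its folder if it is missing."""
    folder, path = destination_for(module_id, assetpath, root)
    os.makedirs(folder, exist_ok=True)  # no error if the folder already exists
    with open(path, "wb") as dest:
        dest.write(markup.encode("utf-8"))
    return path
```

Inside the loop this could be called as `write_component(module.attrs["id"], module.attrs["assetpath"], str(module))`, assuming `module` is a BeautifulSoup tag as in the script.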