@Manoj-nathwani
Last active July 31, 2019 02:27
Update Jekyll blog to self-host images

Problem

  • My Jekyll blog uses hundreds of images on Flickr for free image hosting
  • buuuuuuuut it's 2019 now and Flickr is no longer free 🙈

Solution

  • Download all images hosted on Flickr and save them to /images/articles/
  • Update every image src attribute to point to the new image location
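
To make the second step concrete, here is a minimal sketch of the URL-to-path mapping. The helper name `local_image_src` and the example URL are hypothetical, made up for illustration. The script itself simply takes the last segment of the URL with `split('/')`; this variant also drops any query string, which assumes some image URLs might carry one.

```python
import posixpath
from urllib.parse import urlparse


def local_image_src(url, prefix='/images/articles/'):
    """Return the site-relative src for a remotely hosted image (hypothetical helper)."""
    # Use only the path part of the URL so a query string such as ?v=0
    # does not end up in the local filename
    name = posixpath.basename(urlparse(url).path)
    return prefix + name


# Hypothetical Flickr URL, for illustration only
print(local_image_src('https://farm1.staticflickr.com/123/456_abc_b.jpg?v=0'))
# prints /images/articles/456_abc_b.jpg
```

Keeping only the filename means two Flickr images with the same filename would overwrite each other locally, so it is worth checking for collisions if your image names are not unique.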
import os
import sys

import requests
from bs4 import BeautifulSoup

if sys.version_info[0] < 3:
    raise Exception('Run script using Python 3!')

directory = '_posts/'
image_directory = 'images/articles/'

# make sure the download folder exists before writing to it
os.makedirs(image_directory, exist_ok=True)

for filename in os.listdir(directory):
    if filename.endswith('.html'):
        print('Updating {}'.format(filename))
        path = os.path.join(directory, filename)

        # parse article
        with open(path, 'r') as f:
            html = f.read()
        soup = BeautifulSoup(html, 'html.parser')

        # get the unique flickr image urls (skip <img> tags without a src)
        images = list(set(
            x['src']
            for x in soup.find_all('img')
            if 'flickr' in x.get('src', '')
        ))

        # download image to local folder & update html image src
        for image in images:
            image_name = image.split('/')[-1]
            response = requests.get(image)
            response.raise_for_status()  # don't save an error page as an image
            with open(os.path.join(image_directory, image_name), 'wb') as f:
                f.write(response.content)
            html = html.replace(image, '/images/articles/{}'.format(image_name))

        # save updated html
        with open(path, 'w') as f:
            f.write(html)
        print('{} images downloaded ✅'.format(len(images)))
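
After running the script it can be handy to confirm nothing was missed. The following is a rough sketch of such a check, not part of the original script: the names `FlickrImageFinder`, `remaining_flickr_images` and `check_posts` are hypothetical, and it uses the standard library's `html.parser` instead of BeautifulSoup so it runs without extra dependencies. It assumes the same `_posts/` layout as above.

```python
import os
from html.parser import HTMLParser


class FlickrImageFinder(HTMLParser):
    """Collects src values of <img> tags that still mention flickr (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        if tag == 'img':
            # attrs is a list of (name, value) pairs; value may be None
            src = dict(attrs).get('src') or ''
            if 'flickr' in src:
                self.found.append(src)


def remaining_flickr_images(html):
    """Return the flickr image srcs still present in an HTML string."""
    finder = FlickrImageFinder()
    finder.feed(html)
    return finder.found


def check_posts(directory='_posts/'):
    """Map each .html post to any flickr image srcs it still contains."""
    leftovers = {}
    for filename in sorted(os.listdir(directory)):
        if filename.endswith('.html'):
            with open(os.path.join(directory, filename), 'r') as f:
                found = remaining_flickr_images(f.read())
            if found:
                leftovers[filename] = found
    return leftovers


if __name__ == '__main__':
    # only scan when run from the root of the Jekyll site
    if os.path.isdir('_posts/'):
        for filename, srcs in check_posts().items():
            print('{}: {} flickr images left'.format(filename, len(srcs)))
```

If this prints nothing, every `<img>` in the `.html` posts points away from Flickr. Posts written in Markdown are not covered, since the script above only touches `.html` files.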