@sbarratt
Last active June 2, 2016 18:09
Takes a list of URLs from stdin and writes formatted hyperlinks to stdout.
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""URLs to Hyperlinks.
This script takes newline-separated URLs from stdin and writes formatted HTML hyperlinks to stdout.
Usage:
# cat urls.txt | python ltoh.py > hyperlinks.
"""
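import fileinput
import html
import sys
from urllib.request import urlopen

from bs4 import BeautifulSoup


def url_to_hyperlink(url):
    """Fetch url and return an HTML anchor labelled with the page title."""
    soup = BeautifulSoup(urlopen(url), "lxml")
    # Fall back to the URL itself when the page has no usable <title>.
    title = soup.title.string if soup.title and soup.title.string else url
    return '<a href="%s">%s</a><br />' % (html.escape(url), html.escape(title.strip()))


if __name__ == '__main__':
    for line in fileinput.input():
        url = line.strip()  # drop the trailing newline read from stdin
        if not url:
            continue  # skip blank lines
        sys.stdout.write(url_to_hyperlink(url) + "\n")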
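
To try the formatting step without network access or third-party packages, here is a minimal sketch that swaps BeautifulSoup for the standard library's `html.parser` and works on HTML that has already been fetched. The names `TitleParser` and `format_hyperlink` are hypothetical and not part of the script above. The fallback to the URL when no title is found is also an assumption, as is the escaping of the URL and title.

```python
import html
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the text inside the first <title> element (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.done = False
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # Only the first <title> element is of interest.
        if tag == "title" and not self.done:
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self.in_title:
            self.in_title = False
            self.done = True

    def handle_data(self, data):
        if self.in_title:
            self.parts.append(data)


def format_hyperlink(url, page_html):
    """Build an '<a href=...>title</a><br />' line from already-fetched HTML."""
    parser = TitleParser()
    parser.feed(page_html)
    # Assumption: use the URL as the link text when the page has no title.
    title = "".join(parser.parts).strip() or url
    return '<a href="%s">%s</a><br />' % (html.escape(url), html.escape(title))


if __name__ == "__main__":
    sample = "<html><head><title>Example Domain</title></head><body></body></html>"
    print(format_hyperlink("https://example.com/", sample))
```

Separating title extraction from fetching means the output format can be checked against fixed HTML strings, while the real script still needs `urlopen` to retrieve each page.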