@clbarnes
Last active November 3, 2023 13:41
Scrape all of the Project Rosalind problems and create a Python stub for each (obsolete code transferred from an old account)
#!/usr/bin/python2
from bs4 import BeautifulSoup
import urllib

# Directory that receives the generated stub files.
path = '/home/tunisia/Projects/rosalind/'
# Name the parser explicitly so bs4 does not have to guess one.
html = BeautifulSoup(urllib.urlopen('http://rosalind.info/problems/list-view/'), 'html.parser')
tr_all = html.find_all('tr')
probnum = 0
url_base = 'http://www.rosalind.info'
header_fmt = '''
#!/usr/bin/python
"""Project Rosalind Problem %d: "%s - %s" in Python
%s
"""
'''.strip()
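# Skip the table's header row; each remaining row describes one problem.
for tr in tr_all[1:]:
    probnum += 1
    td = tr.find_all('td')
    abbrev = td[0].string
    # Drop the two trailing characters and any newlines from the link text.
    title = td[1].a.string[:-2].rstrip().replace('\n', '')
    filename = 'prob%d_%s - %s.py' % (probnum, abbrev, title)
    pathNname = path + filename
    url = url_base + td[1].a['href']
    header = header_fmt % (probnum, abbrev, title, url)
    # Write the stub; the with-block closes the file even if the write fails.
    with open(pathNname, 'w') as myfile:
        myfile.write(header)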
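
The script above targets Python 2 and puts the problem title straight into the filename, so a title containing a path separator would break the `open` call. As a rough sketch of how the naming and header steps might look in Python 3, the helpers below rebuild the same filename and docstring header from one table row's values. The names `stub_filename` and `stub_header`, and the rule of replacing unsafe filename characters with `-`, are assumptions for illustration and not part of the original gist; fetching and parsing the page are left out.

```python
import re

# Same header layout as the original script's header_fmt (after .strip()).
HEADER_FMT = '''#!/usr/bin/python
"""Project Rosalind Problem %d: "%s - %s" in Python
%s
"""'''

URL_BASE = 'http://www.rosalind.info'


def stub_filename(probnum, abbrev, title):
    """Build the stub's filename (hypothetical helper)."""
    # Assumption: characters that are unsafe in file names become '-'.
    safe_title = re.sub(r'[\\/:*?"<>|]', '-', title).strip()
    return 'prob%d_%s - %s.py' % (probnum, abbrev, safe_title)


def stub_header(probnum, abbrev, title, href):
    """Fill in the header for one problem (hypothetical helper)."""
    return HEADER_FMT % (probnum, abbrev, title, URL_BASE + href)
```

For instance, `stub_filename(2, 'X', 'A/B')` returns `prob2_X - A-B.py`, so the slash no longer ends up in the path. Keeping these steps as pure functions means they can be checked without touching the network or the filesystem.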