@jss367
Created March 29, 2017 18:27
import re
from urllib import request
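# Download the plain-text edition of Great Expectations from Project Gutenberg
url = 'http://www.gutenberg.org/files/1400/1400-0.txt'
with request.urlopen(url) as response:
    # The file is served as bytes; decode it into a str
    raw = response.read().decode('utf-8')
# Take a short passage to work with. These character offsets are hard-coded,
# so they will point at different text if the file on the server changes.
text = raw[886:1091]
# Strip the line-break characters: drop carriage returns, then turn newlines into spaces
text = text.replace('\r', '')
text = text.replace('\n', ' ')
print(text)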
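
The two `replace` calls above can also be written as a single regular-expression substitution, which may be what the otherwise unused `import re` was meant for. The following is only a sketch: the helper name `collapse_whitespace` and the sample string are illustrative assumptions, not part of the original gist, and the behavior differs slightly. It collapses any run of whitespace (including a blank line) into one space and trims the ends, whereas the original leaves one space per newline.

```python
import re


def collapse_whitespace(text: str) -> str:
    # Hypothetical helper: replace each run of whitespace
    # (spaces, tabs, carriage returns, newlines) with a single space,
    # then trim leading and trailing spaces.
    return re.sub(r"\s+", " ", text).strip()


# Stand-in sample with Windows-style line endings, so no download is needed
sample = "My father's family name\r\nbeing Pirrip,\r\n\r\nand my Christian name Philip"
print(collapse_whitespace(sample))
```

Applied to the downloaded passage, this would be `text = collapse_whitespace(raw[886:1091])`.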