@MineRobber9000
Created June 21, 2020 09:10
Gopher transport adapter for python-requests

requests_gopher

A transport adapter to make gopher requests from python-requests. Here be dragons.
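For orientation, here is a rough sketch of what "making a gopher request" boils down to once the URL has been picked apart: a host, a port (70 unless the URL says otherwise), and one line of text ending in CRLF. The helper name `gopher_request_line` and the `ITEM_TYPE` constant are made up for this illustration and are not part of the adapter; it uses only the standard library and does not touch the network.

```python
import re
from urllib.parse import urlparse, unquote

# leading item type such as "/1" or "/7" (same character class the adapter uses)
ITEM_TYPE = re.compile(r"/[0-9+gITdhs](/.+)")


def gopher_request_line(url):
    """Illustrative helper (not part of the adapter): URL -> (host, port, bytes to send)."""
    parts = urlparse(url)
    host = parts.hostname
    port = parts.port or 70  # Gopher's default port
    # '?' means nothing special in a gopher URL, so glue it back onto the path
    path = parts.path + ("?" + parts.query if parts.query else "")
    # decode %09 and friends only after parsing; newer Pythons strip raw tabs in urlparse
    path = unquote(path) or "/"
    # a TAB separates the selector from the search terms
    selector, tab, search = path.partition("\t")
    # the item type is a hint for the client and is not sent to the server
    m = ITEM_TYPE.match(selector)
    if m:
        selector = m.group(1)
    return host, port, (selector + tab + search + "\r\n").encode("utf-8")
```

Calling `gopher_request_line("gopher://gopher.floodgap.com")` gives back the host, port 70, and the bytes for a bare `/` selector; everything the adapter does on top of that is socket plumbing and dressing the reply up as a `requests.Response`.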

I've documented most of this shit with my comments, but I seriously warn you, for your own sanity: don't do this. Don't use this, don't try and write your own transport adapter, there's just so much weird crap you have to deal with. Just take a look, maybe test it once or twice, say "neat", and walk away.

I won't blame you.

This is licensed under MIT, except for the parse_url method, which I implemented based on dotcomboom's Pituophis. Just in case there's some legal problems there (I doubt there would be, he's a cool guy), that part is licensed under BSD 2-Clause from Pituophis. That copyright notice is included below:

# BSD 2-Clause License
#
# Copyright (c) 2020, dotcomboom <dotcomboom@somnolescent.net> and contributors
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are met:
#
# * Redistributions of source code must retain the above copyright notice, this
#   List of conditions and the following disclaimer.
#
# * Redistributions in binary form must reproduce the above copyright notice,
#   this List of conditions and the following disclaimer in the documentation
#   and/or other materials provided with the distribution.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
# AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
# DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
# FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
# DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
# SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
# CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
# OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
#
# Portions copyright solderpunk & VF-1 contributors, licensed under the BSD 2-Clause License above.
"""A transport adapter to make gopher requests from python-requests. Here be
dragons.

I've documented most of this shit with my comments, but I seriously warn you,
for your own sanity: don't do this. Don't use this, don't try and write your
own transport adapter, there's just so much weird crap you have to deal with.
Just take a look, maybe test it once or twice, say "neat", and walk away.

I won't blame you.

This is licensed under MIT, except for the parse_url method, which I
implemented based on dotcomboom's Pituophis. Just in case there's some legal
problems there (I doubt there would be, he's a cool guy), that part is licensed
under BSD 2-Clause from Pituophis. That copyright notice is included below:

BSD 2-Clause License

Copyright (c) 2020, dotcomboom <dotcomboom@somnolescent.net> and contributors
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

* Redistributions of source code must retain the above copyright notice, this
  List of conditions and the following disclaimer.

* Redistributions in binary form must reproduce the above copyright notice,
  this List of conditions and the following disclaimer in the documentation
  and/or other materials provided with the distribution.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Portions copyright solderpunk & VF-1 contributors, licensed under the BSD
2-Clause License above.
"""
import requests, re, io, socket
from urllib.parse import urlparse, unquote_plus
# Regex to detect itemtype in path
__ITEM_TYPE_IN_PATH = re.compile(r"(/[0-9+gITdhs])(/.+)")
# remove itemtype from path (we can't include it in our request)
deitemize = lambda x: __ITEM_TYPE_IN_PATH.sub(lambda m: m.groups()[1], x)
# detect itemtype in path
itemized = lambda x: __ITEM_TYPE_IN_PATH.match(x) is not None

class HoldsThings:
    """It's like a namedtuple, but you can't index by number and it's actually mutable."""
    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)

def parse_url(url):
    # newer Pythons strip raw tabs inside urlparse(), so re-escape them first
    url = url.replace("\t", "%09")
    # parse the URL
    res = urlparse(url)
    # create the object to hold the things
    ret = HoldsThings(**res._asdict())
    # gopher URLs separate the query with "%09", not '?', so fold any '?' part
    # back into the path
    if res.query:
        ret.path = res.path + "?" + res.query
    # there shouldn't be a reason for ret.query to exist at this stage
    del ret.query
    # turn the escaped tab back into a real '\t'
    ret.path = ret.path.replace("%09", "\t")
    # default to a base path
    if not ret.path:
        ret.path = "/"
    # split out the gopher query here
    if "\t" in ret.path:
        ret.path, ret.query = ret.path.split("\t", 1)
    # we can't do anything with an item type so get rid of it
    if itemized(ret.path):
        ret.path = deitemize(ret.path)
    return ret

class GopherAdapter(requests.adapters.BaseAdapter):
    def _netloc_to_tuple(self, netloc):
        # partition based on the ":"
        host, sep, port = netloc.rpartition(":")
        if sep: # we have a manually specified port
            port = int(port)
        else: # default to port 70
            host = port
            port = 70
        return (host, port)

    def _connect_and_read(self, parsed):
        # connect
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        s.connect(self._netloc_to_tuple(parsed.netloc))
        # send path
        msg = parsed.path
        if hasattr(parsed, "query"): # add query if it exists
            msg += "\t" + parsed.query
        msg += "\r\n"
        s.sendall(msg.encode("utf-8"))
        # use file API because it'll be easier to work with
        f = s.makefile("rb")
        # read data in 16 byte chunks at a time
        res = b""
        data = f.read(16)
        while data:
            res += data
            data = f.read(16)
        # close the damn socket (closing the file object alone leaves it open)
        f.close()
        s.close()
        # return the data
        return res

    def _build_response(self, request, res):
        resp = requests.Response()
        # if there's a 3 line in there, we've got an error, so put in an error status code
        # otherwise put in a success code
        resp.status_code = 400 if (res.startswith(b"3")
                                   or b"\r\n3" in res) else 200
        # no headers in gopher, this is just to maintain the API
        resp.headers = requests.structures.CaseInsensitiveDict({})
        # Assume utf-8 encoding
        resp.encoding = "utf-8"
        # requests.Response.raw expects a file-like object, so use io.BytesIO
        resp.raw = io.BytesIO(res)
        # some basic stuff that requests.adapters.HTTPAdapter.build_response sets
        resp.url = request.url
        resp.request = request
        resp.connection = self
        return resp

    def send(self, request, **kwargs):
        # cowardly refuse to do anything but GET a gopher URL
        assert request.method == "GET", f"You can't {request.method.lower()!r} a Gopher resource!"
        # parse the URL
        parsed = parse_url(unquote_plus(request.url))
        # get the response
        res = self._connect_and_read(parsed)
        # build a requests.Response and send it
        return self._build_response(request, res)

    def close(self):
        # BaseAdapter.close() raises NotImplementedError, which would blow up
        # Session.close(); there are no pooled connections to clean up here
        pass

if __name__ == "__main__":
    # test this by making sure we can hit Floodgap
    # this is the setup to make gopher requests work
    s = requests.Session()
    s.mount("gopher:", GopherAdapter())
    # now try and hit Floodgap
    resp = s.get("gopher://gopher.floodgap.com")
    # if Floodgap gives us an error we've fucked up somewhere
    assert resp.status_code == 200
    # if we're still going it worked
    print("Test passed!")
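
The self-test above needs a live connection to Floodgap. As a rough offline alternative, the sketch below replays the same wire exchange against a throwaway server on localhost, using only the standard library. Everything named here (`TinyGopherHandler`, `fetch`, `looks_like_error`, `demo`, the fake menu text) is invented for the illustration; `fetch` imitates what `_connect_and_read` does, and `looks_like_error` copies the "3 line" check from `_build_response`, but neither is the adapter itself.

```python
import socket
import socketserver
import threading


class TinyGopherHandler(socketserver.StreamRequestHandler):
    """Hypothetical one-shot Gopher server: knows the root menu, errors on the rest."""

    def handle(self):
        # a Gopher request is a single line: selector, optional TAB + query, CRLF
        selector = self.rfile.readline().decode("utf-8").rstrip("\r\n")
        if selector in ("", "/"):
            body = "iHello from a fake gopherhole\tfake\t(NULL)\t0\r\n.\r\n"
        else:
            # item type 3 marks an error line
            body = "3'{}' does not exist\tfake\t(NULL)\t0\r\n.\r\n".format(selector)
        self.wfile.write(body.encode("utf-8"))
        # returning closes the connection, which is how the client knows we're done


def fetch(host, port, selector):
    """Send one selector and read until the server closes the connection."""
    with socket.create_connection((host, port), timeout=5) as s:
        s.sendall((selector + "\r\n").encode("utf-8"))
        chunks = []
        while True:
            data = s.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)


def looks_like_error(res):
    """Same heuristic as the adapter: any line starting with item type 3."""
    return res.startswith(b"3") or b"\r\n3" in res


def demo():
    # port 0 asks the OS for any free port
    with socketserver.TCPServer(("127.0.0.1", 0), TinyGopherHandler) as server:
        port = server.server_address[1]
        thread = threading.Thread(target=server.serve_forever, daemon=True)
        thread.start()
        try:
            ok = fetch("127.0.0.1", port, "/")
            missing = fetch("127.0.0.1", port, "/nope")
        finally:
            server.shutdown()
    return ok, missing
```

`demo()` returns the raw bytes for a good selector and a bad one, so `looks_like_error` can be checked against both without leaving the machine. The same heuristic also flags any successful menu that happens to contain an error-type line, which is worth keeping in mind before trusting the 400.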