@willmarkley
Last active November 8, 2017 00:52
Export Trademark Electronic Search System results to an excel document
<!DOCTYPE html>
<html>
<head>
<title>TESS Search to XLS</title>
</head>
<body>
<h1>TESS Search to XLS</h1>
<h5>This tool converts a page of TESS results to an XLS file that can be opened in Excel.</h5>
<h5>This application is still in beta. The conversion may produce an error page or an XLS file with no data. If that happens, reload the home page and try again.</h5>
<h5>Since TESS search results are limited to 50 entries per page, this tool captures only the 50 entries on the URL provided.</h5>
<h5>Please enter a valid URL into the text box. For best results, copy and paste the URL of the TESS results page. Failure to enter a valid URL will result in an error page.</h5>
<form method="post" action="/tess">
<p>Enter Result URL here:
<input type="text" name="url">
<input type="submit" name = "Submit">
</p>
</form>
</body>
</html>

Export Trademark Electronic Search System results to an excel document

To accomplish this task, download and set up an Apache HTTP Server.

Assuming the Apache default location is unchanged (/var/www), the following structure will work:

html/index.html
html/downloads/    ## a directory named "downloads" must be present under the HTML folder
cgi-bin/tess.py
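
Before starting the server, the layout above can be checked with a short script. This is only a sketch in Python 3, not part of the original gist: the names EXPECTED and missing_paths are hypothetical, and /var/www is assumed as the Apache root, as stated above.

```python
from pathlib import Path

# Entries the layout above expects, relative to the Apache root.
# Assumption: the root is /var/www, as the README states.
EXPECTED = ("html/index.html", "html/downloads", "cgi-bin/tess.py")


def missing_paths(root):
    """Return the expected entries that do not exist under root."""
    base = Path(root)
    return [rel for rel in EXPECTED if not (base / rel).exists()]


if __name__ == "__main__":
    # Print one line per missing file or directory; print nothing if all exist.
    for rel in missing_paths("/var/www"):
        print("missing:", rel)
```

If html/downloads is reported as missing, tess.py will fail when it tries to open its output file.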

In the Apache httpd.conf, the following line is needed under <IfModule alias_module>:

ScriptAlias /tess "/var/www/cgi-bin/tess.py"

The final step is to run the Apache server daemon.

#!/usr/bin/python
# Python 2 CGI script: fetches a TESS results page and saves its results table as an .xls file.
import sys
import os
import time
import urllib
import cgi
import cgitb
cgitb.enable()
print("Content-Type: text/html;charset=utf-8")
print  # Python 2 print statement: the blank line ends the CGI headers
print("Download Link: ")
success = True
finished = False
formData = cgi.FieldStorage()
url = formData.getvalue('url')
while not finished:
    try:
        response = urllib.urlopen(url)
    except:
        success = False
        print(" INVALID URL")
        sys.exit()
    html = response.read()
    read = False
    start_tag = '<TABLE BORDER=2>'
    end_tag = '</TABLE>'
    # Use a timestamp so each download gets a distinct file name
    timestamp = int(time.time())
    file_name = 'downloads/TESS_Results_'+str(timestamp)+'.xls'
    output_file = open('../html/'+file_name, 'w+')
    # Copy every non-empty line from the start tag through the end tag
    for l in html.splitlines():
        line = l.rstrip()
        if not line:
            continue
        if(start_tag in line):
            read = True
        if(read and (end_tag in line)):
            output_file.write(line)
            read = False
        if(read):
            output_file.write(line)
    finished = True
    output_file.close()
print('<a href="../'+file_name+'" download>TESS Results as .XLS</a>')
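
The table-extraction loop in tess.py can also be isolated as a pure function, which makes it easier to try without Apache or a live TESS page. The following is a minimal sketch in Python 3, not the gist's own code. The function name extract_table is hypothetical, and the default tags are the ones tess.py searches for. Unlike the script, it joins the kept lines with newlines instead of writing them back to back.

```python
def extract_table(html, start_tag="<TABLE BORDER=2>", end_tag="</TABLE>"):
    """Return the non-empty lines from each start_tag line through its end_tag line."""
    kept = []
    reading = False
    for raw in html.splitlines():
        line = raw.rstrip()
        if not line:
            continue  # skip blank lines, as tess.py does
        if start_tag in line:
            reading = True
        if reading:
            kept.append(line)
            if end_tag in line:
                reading = False  # stop copying until another start tag appears
    return "\n".join(kept)


if __name__ == "__main__":
    sample = "<html>\n<TABLE BORDER=2>\n<TR><TD>1</TD></TR>\n</TABLE>\n</html>"
    print(extract_table(sample))
```

An empty return value corresponds to the "XLS file with no data" case mentioned on the home page: the start tag was not found in the fetched page.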