@lancepioch
Created May 8, 2022 04:32
import os
import re

# Match Zippyshare download links; dots are escaped so they match literally.
pattern = re.compile(r'https://www\d+\.zippyshare\.com/v/\w+/file\.html')
directory = 'gogs'
urls = []

for filename in os.listdir(directory):
    f = os.path.join(directory, filename)
    if not os.path.isfile(f):
        continue
    print("Checking file: " + filename)
    # Use a context manager so each file is closed after reading.
    with open(f, 'r') as infile:
        for line in infile:
            # findall picks up every link on the line, not just the first
            urls.extend(pattern.findall(line))

# remove duplicates
urls = list(set(urls))

with open('gog-urls.txt', 'w') as f:
    for url in urls:
        f.write("%s\n" % url)
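The URL pattern can be checked against a few lines before running the script over a whole directory. The sketch below is only an illustration: it assumes an escaped form of the regex (literal dots and slashes), and the helper name `extract_urls` and the sample lines are made up, not part of the gist.

```python
import re

# Assumed escaped variant of the pattern: "." and "//" are matched literally.
PATTERN = re.compile(r'https://www\d+\.zippyshare\.com/v/\w+/file\.html')


def extract_urls(lines):
    """Return every matching URL in first-seen order, without duplicates."""
    seen = {}  # dicts keep insertion order, so this de-duplicates stably
    for line in lines:
        for url in PATTERN.findall(line):
            seen.setdefault(url, None)
    return list(seen)


# Hypothetical sample input, invented for illustration only.
sample = [
    'part 1: https://www12.zippyshare.com/v/AbC123/file.html',
    'mirror: https://www12.zippyshare.com/v/AbC123/file.html '
    'and https://www3.zippyshare.com/v/xYz9/file.html',
    'unrelated: https://example.com/v/abc/file.html',
]

for url in extract_urls(sample):
    print(url)
```

Unlike `list(set(urls))`, the dictionary-based de-duplication keeps the links in the order they were found, which may make the output file easier to diff between runs.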