@mluerig
Created October 30, 2018 17:35
Finds citations in text files based on the pattern "four digits inside parentheses" and writes the list of matches to a text file, one citation per line.
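import os
import re

# Work in the manuscript directory (path is specific to the author's machine)
os.chdir("E:\\PhD\\Chapters\\2_Sondes_2015")

findings = []
# Read the manuscript line by line; the with-block closes the file afterwards
with open(os.path.join(os.getcwd(), "sondes_manuscript.txt"), encoding="utf8") as file:
    for line in file:
        # Non-greedy match of everything between "(" and ")"
        par_content = re.findall(r'\((.*?)\)', line)
        for item in par_content:
            # Keep only parentheses that contain a four-digit number (a year)
            if re.search(r"\d{4}", item):
                # Several citations in one pair of parentheses are separated by "; "
                findings.extend(item.split("; "))

# Write one citation per line; utf8 so non-ASCII author names are preserved
with open(os.path.join(os.getcwd(), "out.txt"), "w", encoding="utf8") as res_file:
    for item in findings:
        res_file.write(item + "\n")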
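To check the matching logic without touching any files, the same two regular expressions can be tried on a single string. This is a minimal sketch: the function name `extract_citations` and the sample sentence are hypothetical and not part of the original script.

```python
import re

def extract_citations(text):
    """Return citations from parenthesized groups that contain a four-digit number."""
    found = []
    # Same non-greedy pattern as the script: capture text between "(" and ")"
    for group in re.findall(r'\((.*?)\)', text):
        # Keep the group only if it contains four consecutive digits (a year)
        if re.search(r"\d{4}", group):
            # Split multi-citation groups on "; "
            found.extend(group.split("; "))
    return found

# Hypothetical sample sentence for illustration
sample = "Lakes stratify in summer (Smith 2010; Jones et al. 2015), see map (Fig. 2)."
print(extract_citations(sample))
# → ['Smith 2010', 'Jones et al. 2015']
```

Because the year test is the only filter, any parenthetical with four consecutive digits is kept as-is, so "(see Smith 2010)" would come back as "see Smith 2010". Parentheses that span a line break are not matched, since the script reads the file line by line.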