@Hilal-Urun
Created March 28, 2022 15:12
Getting full-text data from PMC
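from Bio import Entrez
from html_parser import strip_tags  # local helper module, not part of Biopython

Entrez.email = 'your.email@example.com'  # contact address sent to NCBI with each request
pmc_id = "Write pmc id here"  # the paper's PMC ID; several IDs can also be given

# Request the article record from the PMC database as XML.
fetch = Entrez.efetch(db='pmc',
                      retmode='xml',
                      id=pmc_id,
                      rettype='full')
data = fetch.read().decode("UTF-8")  # the XML response arrives as bytes
fetch.close()
full_text = strip_tags(data)
# full_text now holds the text of the PMC article with the markup removed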
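
The `html_parser` module imported above is not included in the gist, so `strip_tags` cannot be run as posted. Below is a minimal sketch of what such a helper might look like, assuming it only needs to drop the XML tags and keep the text nodes. The implementation is a hypothetical stand-in built on the standard library's `xml.etree.ElementTree`, not the author's original code, and its whitespace handling is deliberately crude.

```python
import xml.etree.ElementTree as ET


def strip_tags(xml_text):
    """Hypothetical stand-in: return the text of an XML string with tags removed."""
    root = ET.fromstring(xml_text)
    # itertext() walks the tree in document order, yielding text and tail strings.
    chunks = (chunk.strip() for chunk in root.itertext())
    # Join non-empty pieces with single spaces so adjacent elements do not fuse.
    return " ".join(chunk for chunk in chunks if chunk)


sample = "<article><title>Demo</title><body><p>Hello <b>world</b>.</p></body></article>"
print(strip_tags(sample))
```

Saving this function in a file named `html_parser.py` next to the script would satisfy the import, under the assumption that nothing else from that module is needed.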