Rajshekhar K v-sukt
v-sukt / extract_pdf_notes.py
Last active February 29, 2024 06:44 — forked from Samathy/dumppdfcomments.py
Python script to extract highlighted text, images (square/rectangle annotations, e.g. a table you mark with a box), and text annotations from PDFs. Uses python-poppler-qt5 and PyQt5. Adapted from https://stackoverflow.com/questions/21050551/extracting-text-from-higlighted-text-using-poppler-qt4-python-poppler-qt4 with minor modifications.
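import popplerqt5
import sys
import PyQt5

# Resolution (DPI) for rendering; used later in the script
resolution = 150


def main():
    # Load the PDF whose path is given as the first command-line argument
    doc = popplerqt5.Poppler.Document.load(sys.argv[1])
    # Running count of annotations found across all pages
    total_annotations = 0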