Document Processing documentprocessing

@documentprocessing
documentprocessing / add-new-column-to-excel-xls.py
Created March 26, 2024 19:09
Add new Column to Excel XLS file with Pyexcel-XLS
import pyexcel as p
#open a sample Excel file
sheet = p.get_sheet(file_name="example.xls")
#add a new column to the existing Excel file (first item is the column header)
sheet.column += ["Column 3", 10, 11, 12]
#save the file to disc
sheet.save_as("addNewColumn.xls")
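To make the effect of appending a column concrete, here is a minimal plain-Python sketch that models a sheet as a list of rows. It does not use pyexcel and is not how pyexcel is implemented; the helper name `append_column`, the sample data, and the choice to pad short rows with empty strings are all assumptions made for illustration.

```python
def append_column(rows, column, fill=""):
    """Return a new table with `column` added as the last column of `rows`.

    `rows` is a list of lists (one inner list per row). Missing cells are
    padded with `fill`, which is an assumption of this sketch.
    """
    height = max(len(rows), len(column))
    width = max((len(r) for r in rows), default=0)
    result = []
    for i in range(height):
        # Copy the existing row, or start an empty one past the end of the table
        row = list(rows[i]) if i < len(rows) else []
        # Pad ragged rows so the new value lands in the same column everywhere
        row += [fill] * (width - len(row))
        row.append(column[i] if i < len(column) else fill)
        result.append(row)
    return result


# Hypothetical sheet contents: a header row followed by three data rows
table = [
    ["Column 1", "Column 2"],
    [1, 4],
    [2, 5],
    [3, 6],
]
extended = append_column(table, ["Column 3", 10, 11, 12])
for row in extended:
    print(row)
```

The first item of the appended list lines up with the header row, which is why the gist passes the header text together with the values.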
documentprocessing / add-new-row-to-excel-xls-with-pyexcel-xls.py
Created March 26, 2024 19:08
Add new row to Excel XLS file with Pyexcel-xls in Python
import pyexcel as p
#open a sample Excel file
sheet = p.get_sheet(file_name="example.xls")
#add a new row of values to the end of the sheet
sheet.row += [12, 11]
sheet.save_as("addNewRowToXLS.xls")
documentprocessing / read-and-save-xls-file.py
Created March 26, 2024 19:07
Read and Save XLS file using Pyexcel-XLS
#This example opens an existing XLS file and saves a copy of it to disc
import pyexcel as p
#open a sample Excel file
sheet = p.get_sheet(file_name="example.xls")
#save a copy under a new file name (the copy keeps the original data)
sheet.save_as("emptyExcelFile.xls")
documentprocessing / insert-table-in-docx-file-with-python.py
Created February 22, 2024 12:52
Add table to Word DOCX file using Python
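from docx import Document
# create a new, empty document to hold the table
document = Document()
# add a table with two rows and two columns
table = document.add_table(rows=2, cols=2)
# access the cell at first row and second column
cell = table.cell(0, 1)
# insert some text
cell.text = 'Document Processing'
# add a new row to the table
row = table.add_row()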
documentprocessing / insert-picture-in-docx-with-python-docx.py
Created February 22, 2024 11:52
Insert picture in DOCX with Python-docx library
from docx import Document
document = Document()
document.add_picture('file-name-of-image.png')
document.save('docx with image from file.docx')
documentprocessing / open-docx-document-and-save-after-edit.py
Created February 22, 2024 07:32
Open a DOCX file with Python-docx library
from docx import Document
document = Document('existing-docx-file.docx')
document.save('save-with-new-file-name.docx')
documentprocessing / create-docx-file-using-python-docx-library.py
Created February 22, 2024 06:48
Create Word DOCX file using Python-docx Library
from docx import Document
document = Document()
document.save('test.docx')
documentprocessing / extract-form-values-from-pdfs-using-pdfplumber-library.py
Created December 15, 2023 14:07
Extract Content from PDF documents using pdfplumber library. Check [URL] for details.
# Import necessary libraries for PDF processing
import pdfplumber
from pdfplumber.utils.pdfinternals import resolve_and_decode, resolve
from pprint import pprint
# Open the PDF document for processing
pdf = pdfplumber.open("form_pdf.pdf")
# Define a helper function to parse form fields recursively
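The gist preview ends before the helper itself is shown. As a rough illustration of the recursive idea only, the sketch below walks a field tree built from plain Python dictionaries. It assumes the PDF objects have already been resolved into dicts using the standard AcroForm keys `T` (partial field name), `V` (value) and `Kids` (child fields). The function name `parse_field` and the sample data are hypothetical, and this is not the gist author's code.

```python
def parse_field(field, prefix=""):
    """Flatten one field dictionary and its children into (name, value) pairs.

    Assumes `field` is a plain dict with optional "T", "V" and "Kids" keys;
    a real PDF would need its indirect objects resolved first.
    """
    partial = field.get("T", "")
    # Fully qualified names join the partial names of all ancestors with dots
    name = f"{prefix}.{partial}" if prefix and partial else (prefix or partial)
    # Only children that carry their own name are treated as sub-fields
    named_kids = [kid for kid in field.get("Kids", []) if "T" in kid]
    if named_kids:
        pairs = []
        for kid in named_kids:
            pairs.extend(parse_field(kid, name))
        return pairs
    # Terminal field: report its value (None when no value is set)
    return [(name, field.get("V"))]


# Hypothetical, already-resolved field tree used only for demonstration
sample_fields = [
    {"T": "name", "V": "Jane"},
    {"T": "address", "Kids": [{"T": "city", "V": "Oslo"}, {"T": "zip", "V": "0150"}]},
]
form_values = [pair for field in sample_fields for pair in parse_field(field)]
for field_name, value in form_values:
    print(field_name, value)
```

Children without a `T` key are skipped because, in this simplified model, they stand for widget annotations rather than separate fields, so the value is read from the parent.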
documentprocessing / edit-metadata-of-pdfs-using-pymupdf-library.py
Last active December 7, 2023 17:39
Read and edit PDF metadata (standard and XML metadata) in Python using PyMuPDF library. Check https://products.documentprocessing.com/metadata/python/pymupdf/ for more details.
# Import PyMuPDF library
import fitz
# Open the PDF file
doc = fitz.open('documentprocessing.pdf')
# Define new metadata
new_metadata = {
'author': 'Document Processing',
'title': 'Test Document',
documentprocessing / add-a-watermark-using-react-pdf-viewer-library.js
Created December 1, 2023 10:35
Render PDFs in React Web Applications. Check [URL] for more details.
// Import necessary dependencies from React and react-pdf-viewer library
import React from 'react';
import { Viewer, SpecialZoomLevel, Worker } from '@react-pdf-viewer/core';
import '@react-pdf-viewer/core/lib/styles/index.css';
// Functional component for rendering a PDF with a watermark
const WaterMarkExample = ({ fileUrl }) => {
// Custom rendering function for each page
const renderPage = (props) => (