kordless/README.md

## README.md

      
To run the Python script that splits a PDF into segments of just under 25 MB each, follow the steps below.

### Prerequisites

- Python: Ensure that Python is installed on your system. If it is not, download and install it from python.org.
- PyPDF2 library: The script uses the PyPDF2 library, which you can install with pip, Python's package installer. pip comes bundled with Python 3.4 and later.

### Installation Steps

1. Open Terminal or Command Prompt:
   - On Windows, open Command Prompt by searching for cmd in the Start menu.
   - On macOS or Linux, open Terminal.
2. Install PyPDF2 by running the following command:

       pip install PyPDF2
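
To confirm that the installation worked before running the splitter, a quick check like the one below can help. It is only a sketch: the helper name `is_installed` is hypothetical and not part of the script, and it relies solely on the standard library's `importlib.util.find_spec`.

```python
import importlib.util


def is_installed(module_name):
    """Return True if the named top-level module can be found for import."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # "PyPDF2" is the import name the splitting script uses.
    if is_installed("PyPDF2"):
        print("PyPDF2 is available.")
    else:
        print("PyPDF2 is missing; run: pip install PyPDF2")
```

If the check reports that PyPDF2 is missing even after installing it, `pip` may belong to a different Python installation than the one running the script.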

### Running the Script

1. Save the script: Save the provided Python script to a file on your computer, for example split_pdf_25MB.py.
2. Locate the PDF: The script looks for the PDF in its default directory, ~/Desktop/mitta/. Place the file there, or change default_directory in the script to point at the folder that holds it.
3. Open Terminal or Command Prompt in the script's directory:
   - On Windows: Navigate to the folder where split_pdf_25MB.py is saved using the cd command. For example, if it is saved in C:\Users\YourUsername\Documents, use cd C:\Users\YourUsername\Documents.
   - On macOS/Linux: Use the cd command to navigate to the directory where the script is saved.
4. Run the script:

       python split_pdf_25MB.py

5. Follow the on-screen prompts to enter the file name and the output prefix.
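
While it runs, the script packs consecutive pages into one output file until adding another page would exceed the size limit, then starts the next file. The sketch below illustrates that greedy idea with made-up per-page byte counts. The function `group_pages` and its inputs are hypothetical. The real script does not add up page sizes; it measures each candidate segment by writing it to memory, because pages in a PDF can share resources and their sizes are not simply additive.

```python
def group_pages(page_sizes, max_size):
    """Greedily group consecutive pages so each group's total stays within max_size.

    Returns a list of (start, end) index pairs, with end exclusive.
    A page that alone exceeds max_size is placed in a group of its own.
    """
    groups = []
    start = 0
    while start < len(page_sizes):
        end = start
        total = 0
        # Extend the group while the next page still fits.
        while end < len(page_sizes) and total + page_sizes[end] <= max_size:
            total += page_sizes[end]
            end += 1
        if end == start:
            # Oversized single page: include it anyway so the loop advances.
            end += 1
        groups.append((start, end))
        start = end
    return groups


if __name__ == "__main__":
    # Illustrative sizes only; these are not measurements from a real PDF.
    print(group_pages([10, 10, 10, 40, 5], max_size=25))
```

Each pair maps to one output file. The pair (0, 2), for instance, corresponds to pages 1 to 2 in the output file name.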

### Notes

- The script asks for the file name of the PDF and a prefix for the output files. The file name is resolved against the default directory (~/Desktop/mitta/), so make sure the file exists there.
- The output PDFs are saved in the same directory as the input file and are named PREFIX_pages_FIRST_to_LAST.pdf, where FIRST and LAST are the 1-based page numbers covered by that segment.
- If installation or the script fails, read the error message printed in the terminal or command prompt. It usually identifies the cause, for example a missing PyPDF2 installation or a file name that does not exist in the default directory.
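
The path handling described in these notes can be sketched as follows. The helper names `resolve_input` and `output_name`, and the sample values `report.pdf` and `part`, are hypothetical. Only the default directory and the file name pattern come from the script itself.

```python
import os


def resolve_input(file_name, base_dir="~/Desktop/mitta/"):
    """Join file_name onto the expanded base directory, as the script does."""
    return os.path.join(os.path.expanduser(base_dir), file_name)


def output_name(input_path, prefix, start_page, end_page):
    """Build the output path for pages start_page (0-based) up to end_page (exclusive)."""
    name = f"{prefix}_pages_{start_page + 1}_to_{end_page}.pdf"
    return os.path.join(os.path.dirname(input_path), name)


if __name__ == "__main__":
    path = resolve_input("report.pdf")  # hypothetical file name
    if not os.path.isfile(path):
        print(f"Not found: {path}")
    # The first 12 pages would be written next to the input file.
    print(output_name(path, "part", 0, 12))
```

Checking `os.path.isfile` before splitting gives a clearer message than the `FileNotFoundError` that `open` would otherwise raise.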

  
## split_pdf.py
# written by ChatGPT 4 and Kord Campbell.
# do what you will with it

import PyPDF2
import os
from io import BytesIO

def get_pdf_size(writer):
    """Get the size of the PDF currently in the writer."""
    temp_buffer = BytesIO()
    writer.write(temp_buffer)
    size = len(temp_buffer.getvalue())
    return size

def split_pdf(file_path, output_prefix, max_size=25*1024*1024):  # max_size in bytes
    with open(file_path, 'rb') as file:
        reader = PyPDF2.PdfReader(file)
        total_pages = len(reader.pages)
        start_page = 0

        while start_page < total_pages:
            writer = PyPDF2.PdfWriter()
            end_page = start_page  # exclusive end of the current segment

            while end_page < total_pages:
                writer.add_page(reader.pages[end_page])

                if get_pdf_size(writer) > max_size:
                    if end_page == start_page:
                        # A single page is larger than max_size, so it gets a file of its own.
                        end_page += 1
                    else:
                        # The page just added pushed the segment over the limit.
                        # Rebuild the writer without it; that page starts the next segment.
                        writer = PyPDF2.PdfWriter()
                        for page_index in range(start_page, end_page):
                            writer.add_page(reader.pages[page_index])
                    break
                end_page += 1

            # File names use 1-based, inclusive page numbers.
            output_filename = os.path.join(os.path.dirname(file_path), f"{output_prefix}_pages_{start_page + 1}_to_{end_page}.pdf")
            with open(output_filename, 'wb') as output_file:
                writer.write(output_file)

            start_page = end_page

if __name__ == "__main__":
    default_directory = os.path.expanduser("~/Desktop/mitta/")
    file_name = input("Enter the file name (e.g., filename.pdf): ")
    file_path = os.path.join(default_directory, file_name)
    output_prefix = input("Enter the prefix for the output files: ")
    split_pdf(file_path, output_prefix)