@e96031413
Created January 22, 2021 03:47
Search keywords in ppt files with python
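# REF https://stackoverflow.com/questions/55497789/find-a-word-in-multiple-powerpoint-files-python/55763992#55763992
import os

from pptx import Presentation
from pptx.enum.shapes import MSO_SHAPE_TYPE

path = "./"
# Keep the keyword lowercase: shape text is lowercased before comparing.
keyword = "what_ever_you_want_to_find"
files = [x for x in os.listdir(path) if x.endswith(".pptx")]


def check_recursively_for_text(shapes, keyword):
    """Return True if any shape, including shapes nested in groups, contains keyword."""
    for shape in shapes:
        if shape.shape_type == MSO_SHAPE_TYPE.GROUP:
            # A group holds its own collection of shapes, so recurse into it.
            if check_recursively_for_text(shape.shapes, keyword):
                return True
        elif hasattr(shape, "text"):
            # Compare against a lowercase copy instead of overwriting shape.text.
            if keyword in shape.text.lower():
                return True
    return False


found_any = False
for eachfile in files:
    prs = Presentation(os.path.join(path, eachfile))
    for slide in prs.slides:
        # Reports the file once for every slide that contains the keyword.
        if check_recursively_for_text(slide.shapes, keyword):
            found_any = True
            print(eachfile)
            print("----------------------")

if not found_any:
    print("No text found in these PPTs")
    print("----------------------")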
@ThermalPermal

Currently, this code reports a match every time "what_ever_you_want_to_find" is found.

Ex: if "what_ever_you_want_to_find" is found on 3 different slides of the same PowerPoint, it prints the same file name 3 times, along with "text found" or whatever message you choose.

How can we change the code so that it stops searching a file as soon as it finds the first "what_ever_you_want_to_find"?

@e96031413
Author

You can use a variable to count how many times the keyword has been found.
Once the variable reaches 1, you can break out of the loop over the slides and move on to the next file.
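
Here is a rough sketch of that idea. Instead of a counter it uses an early return, which should have the same effect: the search stops at the first match in each file. The names `shapes_contain`, `presentation_contains`, and `find_files` are made up for this example and are not part of python-pptx. Groups are detected by checking for a `.shapes` attribute rather than `MSO_SHAPE_TYPE.GROUP`, which is an assumption that only group shapes expose one.

```python
import os


def shapes_contain(shapes, keyword):
    """Return True as soon as one shape (or a shape nested in a group) contains keyword."""
    for shape in shapes:
        if hasattr(shape, "shapes"):
            # Assumption: only group shapes expose .shapes, so recurse into them.
            if shapes_contain(shape.shapes, keyword):
                return True
        elif hasattr(shape, "text") and keyword in shape.text.lower():
            return True
    return False


def presentation_contains(prs, keyword):
    """Return True at the first slide that contains keyword (case-insensitive)."""
    keyword = keyword.lower()
    # any() stops at the first True, so the remaining slides are never searched.
    return any(shapes_contain(slide.shapes, keyword) for slide in prs.slides)


def find_files(path, keyword):
    """Return the .pptx file names under path that contain keyword, once per file."""
    # Imported here so the two helpers above can be used without python-pptx.
    from pptx import Presentation

    matches = []
    for name in sorted(os.listdir(path)):
        if not name.endswith(".pptx"):
            continue
        prs = Presentation(os.path.join(path, name))
        if presentation_contains(prs, keyword):
            matches.append(name)
    return matches
```

Calling `find_files("./", "what_ever_you_want_to_find")` should then list each matching file only once, however many slides contain the keyword.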
