@dmccreary
Last active February 26, 2024 16:49
Find big images in PowerPoint
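#!/bin/sh
# List every image over 100 KB in a PowerPoint (.pptx) file together with
# the slide number(s) that reference it.
# Usage: find-big-images-in-ppt-file.sh presentation.pptx
if [ $# -ne 1 ] || [ ! -f "$1" ]; then
  echo "usage: $0 file.pptx" >&2
  exit 1
fi
echo "working on" "$1"
workdir=/tmp/big-images
# start from a clean working directory; -f avoids prompts on read-only files
rm -rf "$workdir"
mkdir "$workdir"
# copy the file into the working directory under a .zip name;
# basename keeps this working when $1 includes a directory path
zipfile="$workdir/$(basename "$1").zip"
cp "$1" "$zipfile"
# unzip it
echo "unzipping $zipfile"
unzip -q "$zipfile" -d "$workdir"
# find large images
find "$workdir/ppt/media/" -type f -size +100k > "$workdir/big-images-list.txt"
# process each image
while IFS= read -r image; do
  size=$(wc -c < "$image" | tr -d ' ')
  name=$(basename "$image")
  # match the name as a fixed string and join multiple slide numbers with commas
  slide=$(grep -lF "media/$name" "$workdir"/ppt/slides/_rels/*.rels 2> /dev/null | sed 's/.*slide\([0-9]*\)\.xml\.rels$/\1/' | sort -n | paste -s -d , -)
  # an image that no slide .rels file references gets an explicit label instead of a blank
  [ -n "$slide" ] || slide="none found"
  echo "Image: $name (Size: $size bytes) is on Slide: $slide"
done < "$workdir/big-images-list.txt"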
@DKroot

DKroot commented Feb 2, 2024

Quite useful! A couple of suggestions:

  • It fails on a .ppt file name with spaces. Replacing spaces with underscores solves it.
  • It would be nice if the second part of the script could list image(s) right along a slide where they are present.
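The underscore workaround can also be applied inside the script, so the original file never has to be renamed. This is a minimal sketch, not part of the gist: `sanitize_name` is a hypothetical helper that drops any directory part and replaces spaces with underscores, giving a safe name for the working copy.

```shell
#!/bin/sh
# sanitize_name (hypothetical helper, not in the original script):
# print the file name of $1 without its directory, spaces replaced by underscores.
sanitize_name() {
  basename "$1" | tr ' ' '_'
}

# Possible use inside the script, assuming the same /tmp/big-images work directory:
#   copy="/tmp/big-images/$(sanitize_name "$1").zip"
#   cp "$1" "$copy"
```

Copying straight to the sanitized name also removes the separate rename step, because the copy is created with its final name.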

@dmccreary
Author

Thanks! I am glad you found it helpful!

@dmccreary
Author

I updated it so that the images and the slide numbers appear together. Actually, I had GPT-4 do it for me.

https://chat.openai.com/share/80969a42-2bc3-46ce-91b8-985200f9a16a

@DKroot

DKroot commented Feb 26, 2024

Great idea. I've tried the latest version. It seems to be having issues:

# find-big-images-in-ppt-file.sh 6315.pptx
working on 6315.pptx
unzipping /tmp/big-images/6315.pptx.zip
Image: image76.png (Size: 307216 bytes) is on Slide: 20
Image: image17.svg (Size: 149307 bytes) is on Slide:
Image: image29.svg (Size: 147285 bytes) is on Slide:
Image: image14.png (Size: 438446 bytes) is on Slide:
Image: image28.png (Size: 420906 bytes) is on Slide:
Image: image140.png (Size: 355936 bytes) is on Slide: 29
Image: image16.png (Size: 577500 bytes) is on Slide:
Image: image13.png (Size: 121206 bytes) is on Slide:
Image: image38.png (Size: 317595 bytes) is on Slide:
Image: image35.png (Size: 291360 bytes) is on Slide:
Image: image21.png (Size: 234788 bytes) is on Slide:
Image: image20.png (Size: 109419 bytes) is on Slide:
Image: image34.png (Size: 231865 bytes) is on Slide:
Image: image22.png (Size: 367941 bytes) is on Slide:
Image: image37.png (Size: 3277172 bytes) is on Slide:
Image: image33.png (Size: 106525 bytes) is on Slide:
Image: image32.png (Size: 338258 bytes) is on Slide:
Image: image26.png (Size: 403154 bytes) is on Slide:
Image: image6.svg (Size: 4414565 bytes) is on Slide:
Image: image18.png (Size: 110659 bytes) is on Slide:
Image: image30.png (Size: 164577 bytes) is on Slide:
Image: image24.png (Size: 388468 bytes) is on Slide:
Image: image31.png (Size: 190860 bytes) is on Slide:
Image: image19.png (Size: 106205 bytes) is on Slide:
Image: image95.png (Size: 110667 bytes) is on Slide: 23
Image: image25.svg (Size: 147362 bytes) is on Slide:
Image: image27.svg (Size: 147568 bytes) is on Slide:
Image: image5.png (Size: 365179 bytes) is on Slide:
Image: image87.png (Size: 200728 bytes) is on Slide: 23
Image: image23.svg (Size: 289559 bytes) is on Slide:
Image: image51.png (Size: 171798 bytes) is on Slide: 5
Image: image40.jpeg (Size: 127599 bytes) is on Slide:
Image: image110.png (Size: 166043 bytes) is on Slide: 24
Image: image90.png (Size: 329167 bytes) is on Slide: 23
24
Image: image91.png (Size: 109977 bytes) is on Slide: 23
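Most images in this run come back with an empty slide number. One possible cause, which the thread does not confirm, is that those images are referenced from a slide layout, a slide master, or another part of the package, while the script only searches ppt/slides/_rels. The sketch below is one way to check: `find_image_refs` is a hypothetical helper that lists every .rels file under ppt/ that mentions a given image. It assumes the .pptx has already been unzipped into the directory passed as the first argument.

```shell
#!/bin/sh
# find_image_refs (hypothetical helper, not in the original script)
#   $1: directory where the .pptx was unzipped, e.g. /tmp/big-images
#   $2: image file name, e.g. image14.png
# Prints the path, relative to ppt/, of every .rels file that references the image.
find_image_refs() {
  (
    cd "$1/ppt" || exit 1
    # -F treats the name as a fixed string, so "." is not a regex wildcard
    find . -type f -name '*.rels' -exec grep -lF "media/$2" {} + |
      sed 's|^\./||' | sort
  )
}
```

Running `find_image_refs /tmp/big-images image14.png` after the script has finished would show whether that image belongs to a slide, a layout, a master, or nothing at all.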

Here is the output from the previous version:

# find-big-images-in-ppt-file.sh 6315.pptx
working on 6315.pptx
override r-------- korobskd/wheel for /tmp/big-images/6315.pptx.zip? y
unzipping /tmp/big-images/6315.pptx.zip
Copy any of the following lines into the shell to view the image
open /tmp/big-images/ppt/media/image6.svg
open /tmp/big-images/ppt/media/image37.png
open /tmp/big-images/ppt/media/image16.png
open /tmp/big-images/ppt/media/image14.png
open /tmp/big-images/ppt/media/image28.png
open /tmp/big-images/ppt/media/image26.png
open /tmp/big-images/ppt/media/image24.png
open /tmp/big-images/ppt/media/image22.png
open /tmp/big-images/ppt/media/image5.png
open /tmp/big-images/ppt/media/image140.png
open /tmp/big-images/ppt/media/image32.png
open /tmp/big-images/ppt/media/image90.png
open /tmp/big-images/ppt/media/image38.png
open /tmp/big-images/ppt/media/image76.png
open /tmp/big-images/ppt/media/image35.png
open /tmp/big-images/ppt/media/image23.svg
open /tmp/big-images/ppt/media/image21.png
open /tmp/big-images/ppt/media/image34.png
open /tmp/big-images/ppt/media/image87.png
open /tmp/big-images/ppt/media/image31.png
open /tmp/big-images/ppt/media/image51.png
open /tmp/big-images/ppt/media/image110.png
open /tmp/big-images/ppt/media/image30.png
open /tmp/big-images/ppt/media/image17.svg
open /tmp/big-images/ppt/media/image27.svg
open /tmp/big-images/ppt/media/image25.svg
open /tmp/big-images/ppt/media/image29.svg
open /tmp/big-images/ppt/media/image40.jpeg
open /tmp/big-images/ppt/media/image13.png
open /tmp/big-images/ppt/media/image95.png
open /tmp/big-images/ppt/media/image18.png
open /tmp/big-images/ppt/media/image91.png
open /tmp/big-images/ppt/media/image20.png
open /tmp/big-images/ppt/media/image33.png
open /tmp/big-images/ppt/media/image19.png
open /tmp/big-images/ppt/media/image137.png
The following images are over 100K bytes
4414565 image6.svg
3277172 image37.png
577500 image16.png
438446 image14.png
420906 image28.png
403154 image26.png
388468 image24.png
367941 image22.png
365179 image5.png
355936 image140.png
338258 image32.png
329167 image90.png
317595 image38.png
307216 image76.png
291360 image35.png
289559 image23.svg
234788 image21.png
231865 image34.png
200728 image87.png
190860 image31.png
171798 image51.png
166043 image110.png
164577 image30.png
149307 image17.svg
147568 image27.svg
147362 image25.svg
147285 image29.svg
127599 image40.jpeg
121206 image13.png
110667 image95.png
110659 image18.png
109977 image91.png
109419 image20.png
106525 image33.png
106205 image19.png
102178 image137.png

 The images are on the following slides:
slide29.xml.rels
slide23.xml.rels
slide24.xml.rels
slide20.xml.rels
slide23.xml.rels
slide5.xml.rels
slide24.xml.rels
slide23.xml.rels
slide23.xml.rels
slide28.xml.rels
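The slide list above repeats file names and is unsorted. As a small sketch, the hypothetical filter `slide_numbers` below reads such a list on standard input and prints the distinct slide numbers in ascending order on one line; it assumes each input line has the form slideN.xml.rels.

```shell
#!/bin/sh
# slide_numbers (hypothetical helper, not in the original script):
# read lines such as "slide23.xml.rels" on stdin and print the unique
# slide numbers, sorted numerically and joined with commas.
slide_numbers() {
  sed -n 's/^slide\([0-9][0-9]*\)\.xml\.rels$/\1/p' |
    sort -n -u |
    paste -s -d , -
}
```

Fed the ten lines listed above, it prints 5,20,23,24,28,29.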
