@3lpsy
Created June 10, 2021 22:12
Converting Docx to PDFs via the CLI on Linux using Docker/Podman

I want to be able to convert docx files to PDFs without installing LibreOffice on my host, mostly because it's fairly large and has a lot of dependencies. If you don't mind installing LibreOffice, just install it and use the command from the Dockerfile's ENTRYPOINT. In my opinion, using a container makes it easy to "clean up" afterwards, although there are plenty of other options.

I use podman as root here, but docker works too: just substitute docker for podman. Note that running rootless docker/podman may affect the ownership of the generated files.
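Since the podman and docker commands used in this gist are interchangeable, the choice can be wrapped in a small helper. This is only a sketch: container_runtime is a hypothetical name, not something podman or docker provide, and it assumes that preferring podman when both are installed is what you want.

```shell
# hypothetical helper: print the first available container runtime
container_runtime() {
	if command -v podman >/dev/null 2>&1; then
		echo podman
	elif command -v docker >/dev/null 2>&1; then
		echo docker
	else
		# neither runtime is installed
		echo "no container runtime found" >&2
		return 1
	fi
}
```

With this in place, a command such as sudo "$(container_runtime)" build -t doc2pdf . would use whichever runtime is installed.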

Building Image

Make a dir to build the image

$ mkdir buildir
$ cd buildir

Dockerfile:

Create the Dockerfile. I create a hostuser whose uid/gid match my normal host user. The reason is that mounting a directory and running as root inside the container would leave the output owned by root; if the IDs match, the generated PDF is owned by my host user instead. Rootless docker/podman remaps user IDs, so this strategy may not work there. You can also just change the permissions after creation in the shell function to avoid this, in which case you'd remove the "USER" and "addgroup/adduser" lines. Run id on your host to get the necessary uid/gid.

FROM alpine:latest

# --no-cache fetches the package index on the fly and leaves nothing in /var/cache/apk
RUN apk add --no-cache libreoffice-writer openjdk8-jre

# run `id` on your host to get these values for default group and user ids.
# matching helps with perms as mentioned above
RUN addgroup -g 1001 -S hostuser && adduser -u 1000 -S hostuser -G hostuser -D
USER hostuser 

WORKDIR /data
ENTRYPOINT ["libreoffice", "--headless", "--convert-to", "pdf", "--outdir", "/data"]
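
The uid/gid in the addgroup/adduser line are specific to my machine. As a rough sketch, the snippet below reads the current user's IDs with id and prints the matching Dockerfile line to paste in. The variable names host_uid and host_gid are made up for illustration.

```shell
# numeric user and primary group IDs of the current host user
host_uid=$(id -u)
host_gid=$(id -g)
echo "uid=${host_uid} gid=${host_gid}"

# print the Dockerfile line with those IDs filled in
echo "RUN addgroup -g ${host_gid} -S hostuser && adduser -u ${host_uid} -S hostuser -G hostuser -D"
```

Run it as your normal user, not with sudo, otherwise it reports root's IDs.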

Next, build the image:

$ sudo podman build -t doc2pdf .
# alternatively
# docker build -t doc2pdf .

Create a Shell Function

Throw this in your ~/.bashrc, ~/.zshrc or equivalent

function doc2pdf() {
	# bail out early when no argument is given or the argument is not a regular file
	[ -z "$1" ] && { echo "No docx passed"; return 1; }
	[ ! -f "$1" ] && { echo "File $1 is not a file"; return 1; }
	local _path _base _dir
	_path=$(readlink -f "$1")   # absolute path of the input
	_base="${_path##*/}"        # file name only
	_dir="${_path%/*}"          # containing directory
	# you can add an optional check to not overwrite an existing pdf here if you want
	# [ -f "${_dir}/${_base%.*}.pdf" ] && { echo "File exists"; return 1; }

	echo "Converting ${1} (/data/${_base})..."
	# this uses mounts. plan accordingly
	sudo podman run -it --rm --name doc2pdf -v "$_dir":/data doc2pdf "$_base"
}

The pdf will be output in the same directory as the docx.
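
To make the output location concrete, here is a minimal sketch that derives the expected PDF path from a docx path using the same parameter expansions as the function. pdf_path_for is a hypothetical helper, and it assumes LibreOffice names the output after the input with the extension swapped for .pdf.

```shell
# hypothetical helper: print where the converted PDF is expected to land
pdf_path_for() {
	local _path _base _dir
	_path=$(readlink -f "$1") || return 1   # absolute path of the input
	_base="${_path##*/}"                    # file name only
	_dir="${_path%/*}"                      # containing directory
	# strip the last extension (e.g. .docx) and append .pdf
	echo "${_dir}/${_base%.*}.pdf"
}
```

For example, pdf_path_for ./report.docx prints the absolute path of report.pdf next to the input, which is also the path to test if you want to avoid overwriting an existing PDF.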
