Skip to content

Instantly share code, notes, and snippets.

@grownseed
Last active October 12, 2023 17:36
Show Gist options
  • Save grownseed/4fd2e91eca829cc039de to your computer and use it in GitHub Desktop.
Save grownseed/4fd2e91eca829cc039de to your computer and use it in GitHub Desktop.

Give me back my sanity

One of the many things I do for my group at work is to take care of automating as many things as possible. It usually brings me a lot of satisfaction, mostly because I get a kick out of making people's lives easier.

But sometimes, maybe too often, I end up in drawn-out struggles with machines and programs. And sometimes, these struggles bring me to the edge of despair, so much so that I regularly consider living on a computer-less island growing vegetables for a living.

This is the story of how I had to install Pandoc in a CentOS 6 Docker container. But more generally, this is the story of how I think computing is inherently broken, how programmers (myself included) tend to think that their way is the way, how we're ultimately replicating what most of us think is wrong with society, building upon layers and layers of (best-case scenario) obscure and/or weak foundations.

I would like to extend my gratitude to Google, StackOverflow, GitHub issues but mostly, the people who make the beer I drink


It all starts with this beautifully simple command:

yum install -y pandoc

After resolving about 80 dependencies and downloading around 150Mb of data, here we have it, a beautiful new Pandoc.

Except it isn't so new, because CentOS doesn't really do new, so instead we have Pandoc 1.9.4.1. You'd be tempted to think it doesn't matter so much, and that a few missing features aren't the end of the world. Unfortunately though this particular version is shipped with its very own show-stopping bugs.


"Do not fret" I hear, let's build a newer version of Pandoc ourselves!

Pandoc was written in Haskell, a quick Google search (like the 100's of other quick searches that led to this point) tells me that the compiler for Haskell is named GHC, why not...

My first reaction is to go:

yum install -y ghc

Quite a bit of time later (GHC is provided with 4 different flavours of every single library), we now have GHC 7.0.4-46.

Fast-forward a couple of hours and realize that this particular version of GHC is outdated, too...

It's ok, let's build it ourselves then:

RUN wget http://www.haskell.org/ghc/dist/7.8.2/ghc-7.8.2-x86_64-unknown-linux-centos65.tar.bz2
RUN tar xf ghc-7.8.2-x86_64-unknown-linux-centos65.tar.bz2
RUN cd ghc-7.8.2 && ./configure
RUN cd ghc-7.8.2 && make install

GHC doesn't need make, just make install; they're pretty proud of that - I couldn't care less.

And that successfully installed the new version of GHC. Getting there, or are we?


Well as it turns out, Pandoc is installed through Cabal, some sort of package-manager-which-isn't-really-one for Haskell.

I tentatively try yum, which of course ends in failure, and proceed to build Cabal:

yum install -y which gmp-devel

Because my base CentOS Docker image doesn't have which, and because Cabal's compilation throws an error about a missing -lgmp, stupid me, it's obviously the gmp-devel package.

wget http://www.haskell.org/cabal/release/cabal-install-1.20.0.3/cabal-install-1.20.0.3.tar.gz
tar xf cabal-install-1.20.0.3.tar.gz
cd cabal-install-1.20.0.3 && ./bootstrap.sh

And it builds, and downloads, and builds some more, and downloads some more...

ln -s /.cabal/bin/cabal /usr/bin/cabal

Because installing Cabal doesn't add Cabal to your path. This is nasty, I don't care, let's move on.

cabal update

Because it couldn't have done so previously, and so it builds and downloads some more.


By this point, I could have probably hand-written the converted files on a daily basis and still wasted less time. But we're almost there...

So I install Pandoc (and Citeproc), finally:

cabal install pandoc pandoc-citeproc

Hmm, there's an error with a missing UTF-8 locale, let's generate it with this completely obvious command:

localedef -v -c -i en_US -f UTF-8 en_US.UTF-8

Funny thing is, this command actually throws an error, which Docker doesn't like when building, so hey, why not:

RUN localedef -v -c -i en_US -f UTF-8 en_US.UTF-8 || true

And here we go, error ignored whatever happens.

But wait, building Pandoc still doesn't go through, as it turns out it's missing an environment variable which we could tell from the previous step was hard-coded somewhere else. Of course it doesn't explicitly tell you, the depths of the Internet are your friend:

ENV LANG en_US.UTF-8

I re-build Pandoc and go read the entirety of Les Miserables in the mean time...

Another nasty symlink to the executables, because Cabal doesn't do that either:

ln -s /.cabal/bin/pandoc /usr/bin/pandoc
ln -s /.cabal/bin/pandoc-citeproc /usr/bin/pandoc-citeproc

And here we go, we now have a working Pandoc!


You may not believe it, but this is a really concise version of what I actually had to go through.

I don't doubt that I'm not the sharpest tool in the box or that there aren't more shortcuts I could have taken, but in my view this process is completely crazy.

In my opinion, we've done this to ourselves, maybe because of developer ego, maybe because we're not as smart as we think we are, or maybe because we and the rest of the world expect too much, too quickly. I don't have an answer, nor do I have a solution to the whole problem, but I'll definitely think twice about telling anybody else to get into this business.

@michaelnt
Copy link

Seems to have got easier now if you use stack as per the official pandoc page.

Here a Dockerfile for rhel6

FROM rhel6
ENV VERSION 1.17.0.3

# stack https://docs.haskellstack.org/en/stable/install_and_upgrade/#centos
RUN curl -sSL https://s3.amazonaws.com/download.fpcomplete.com/centos/6/fpco.repo | tee /etc/yum.repos.d/fpco.repo
RUN curl -s -o /tmp/pandoc-${VERSION}.tar.gz https://hackage.haskell.org/package/pandoc-1.17.0.3/pandoc-${VERSION}.tar.gz

RUN yum -y install stack gcc zlib-devel

RUN tar zxf /tmp/pandoc-${VERSION}.tar.gz -C /usr/local/src
RUN cd /usr/local/src/pandoc-${VERSION} && stack setup 
RUN cd /usr/local/src/pandoc-${VERSION} && stack install

@StubbsPKS
Copy link

My issue with using stack is that I STILL have a binary that isn't portable bc it wants to look in weird directories for a stack work directory. I don't want stack on all of our prod machines. I JUST want pandoc.

@caot
Copy link

caot commented Oct 26, 2017

pandoc through anaconda will be helpful. https://www.anaconda.com/

@pjrola
Copy link

pjrola commented Aug 30, 2018

I'm leaving this link here for anyone who is still struggling I managed to install it by following https://github.com/rstudio/rmarkdown/blob/master/PANDOC.md

wget -P /etc/yum.repos.d/ https://copr.fedoraproject.org/coprs/petersen/pandoc-el5/repo/epel-5/petersen-pandoc-el5-epel-5.repo
yum install -y pandoc pandoc-citeproc

will get you pandoc 1.15 if however you need the latest you should do

wget -P /etc/yum.repos.d/ https://copr.fedorainfracloud.org/coprs/petersen/pandoc/repo/epel-7/petersen-pandoc-epel-7.repo
yum install -y pandoc pandoc-citeproc

which will get you 2.2.1
hopefully this saves someone from days of work like I just experienced.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment