@infowolfe
Last active January 1, 2016 04:59
mongodb is fat.... how fat, exactly? The update adds a calculation of aggregate sizes and shows where those sizes live. /usr/lib/debug seems to be the largest offender.

Why?

  • We were discussing the massive amount of space that MongoDB consumes during its build (it came up because @infowolfe has PORTAGE_TMPDIR="/dev/shm", so the build happens in tmpfs).
  • @robbat2 was planning on committing a warning message/disk space prerequisite check to the MongoDB ebuild on Gentoo Linux, noting that it takes up an impressive amount of disk space during both build and install (@infowolfe saw >12GB of tmpfs in use prior to merge).
  • @robbat2 wanted to know exactly how much space was being consumed but didn't personally have MongoDB installed. He then asked me (@infowolfe) to run a perl one-liner[1] twice (once restricted to /usr/ and once without the restriction). On my system (nostrip && CFLAGS=-g), the difference between everything MongoDB installs and what it installs under /usr came to 108,675 bytes, roughly 106 KiB:
  1. / (all filesystems): 2,473,771,415 bytes
  2. /usr/ (only /usr): 2,473,662,740 bytes
  3. differential: 108,675 bytes
  • @robbat2 suggested that I drill down further into MongoDB's CONTENTS file to see exactly where the ~2.4GB in total went. My first attempt printed raw byte counts rather than the human-readable figures the current script produces:
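#!/bin/bash
# Print the total size of the installed files below each directory listed in CONTENTS.

# Deliberately used unquoted below so the shell expands the glob.
mcont="/var/db/pkg/dev-db/mongodb-2.*/CONTENTS"

for dir in $(grep ^dir ${mcont} | sed -e 's~^dir ~~') ; do
	# Escape the slashes and append a trailing one so the path can sit inside the perl regex.
	pdir=$(echo ${dir} | sed -e 's~/~\\/~g' -e 's~.*~&\\/~')
	echo -en "${dir}:\t"
	# Add up the size (-s) of every "obj" entry below this directory.
	perl  -e 'while(<>) { chomp; next unless $_ =~ /obj '${pdir}'/; @a = split /\s+/, $_; $sum += -s $a[1]; }; print $sum,"\n"; ' ${mcont}
done | awk '{printf "%-45s\t%-8s %s %s\n", $1, $2, $3, $4}'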

I didn't notice anything in the output initially, but @robbat2 did, and pointed this out (note that /usr/include is three digits shorter than the other two, i.e. roughly three orders of magnitude smaller):

freya:~$ ./run_perl.sh  | egrep -e '^/usr(:|/include:|/lib/debug:)'
/usr:                                        	2473662740
/usr/include:                                	1973187
/usr/lib/debug:                              	2340973686
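
To put a number on that gap, here is a minimal sketch that works out what share of /usr belongs to /usr/lib/debug, using the two byte counts from the output above. The helper name share_percent is made up for this illustration and is not part of the original script.

```shell
#!/bin/bash
# share_percent is a hypothetical helper, not part of the gist's script.
# Usage: share_percent PART_BYTES TOTAL_BYTES
share_percent() {
    part=$1
    total=$2
    # Work in tenths of a percent so plain integer arithmetic is enough.
    permille=$((part * 1000 / total))
    # The last digit is truncated, not rounded.
    printf '%d.%d%%\n' $((permille / 10)) $((permille % 10))
}

# /usr/lib/debug versus all of /usr, byte counts taken from the transcript.
share_percent 2340973686 2473662740
```

With those two figures this prints 94.6%, so nearly everything the package puts under /usr is debug information.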

Because I completely failed to see the above, I decided it would be prudent (in case this script is ever reused) to make the output human-readable. This particular "tool" can actually be pointed at any Gentoo package's /var/db/pkg/$category/$PN-$PV/CONTENTS file to get a readout of disk space used on a per-directory basis. Because it was still not super easy to tell, say, GiB from MiB when the lines are stacked vertically, I also colorized the bash output: blue (bytes), green (kilobytes), yellow (megabytes) and red (gigabytes). All figures are base-2, not base-10 (1 mebibyte = 1,048,576 bytes vs. 1 megabyte = 1,000,000 bytes).

Anyway, if you have a use for this, enjoy. My portion (the bash) is being released into the public domain: this means you may use it any way you'd like, including modifying and redistributing it, and you don't need to cite my name if you do so. I also will not under any circumstances be supporting this or accepting patches. @robbat2 just informed me that his perl is licensed BSD-2.
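
The base-2 conversion can also be done without bc. The following is only a sketch under that assumption: human_size is a hypothetical helper name, it uses the same 1024-based thresholds as the script, and it truncates to three decimals the way bc does with scale=3. Unlike the script, which has no branch for 1 TiB and above, this sketch simply reports anything that large in GiB.

```shell
#!/bin/bash
# human_size is a hypothetical helper, not part of the gist's script.
# Usage: human_size BYTES
human_size() {
    size=$1
    kilo=1024
    mega=$((1024 * 1024))
    giga=$((1024 * 1024 * 1024))
    if [ "$size" -lt "$kilo" ]; then
        echo "$size bytes"
        return
    elif [ "$size" -lt "$mega" ]; then
        unit=$kilo; suffix=kiB
    elif [ "$size" -lt "$giga" ]; then
        unit=$mega; suffix=MiB
    else
        unit=$giga; suffix=GiB
    fi
    # Integer part, then three truncated decimals from the remainder.
    printf '%d.%03d %s\n' $((size / unit)) $(((size % unit) * 1000 / unit)) "$suffix"
}
```

For example, human_size 2473662740 gives 2.303 GiB and human_size 1973187 gives 1.881 MiB, matching the /usr and /usr/include lines in the script's output.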

Authors: @robbat2: perl // @infowolfe: bash

Note: wrapped for convenience; please extract the perl from the bash scripts above/below for standalone use.

[1] perl -e 'while(<>) { chomp; next unless $_ =~ /obj \/usr\//; @a = split /\s+/, $_; $sum += -s $a[1]; }; print $sum,"\n";' /var/db/pkg/dev-db/mongodb-2.*/CONTENTS
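
For anyone who doesn't read perl, the one-liner does roughly what this shell sketch does: walk the CONTENTS file, keep the "obj" lines whose path sits below a given directory, and add up the sizes of those files on disk. The name sum_obj_sizes is hypothetical. The sketch is not an exact equivalent: it matches the path as a prefix (the perl regex is unanchored), it skips files that no longer exist, and like the original it assumes paths without spaces.

```shell
#!/bin/bash
# sum_obj_sizes is a hypothetical helper, not part of the gist's script.
# Usage: sum_obj_sizes CONTENTS_FILE DIRECTORY
sum_obj_sizes() {
    contents=$1   # path to a package's CONTENTS file
    prefix=$2     # directory to total up, e.g. /usr
    total=0
    # Each line starts with an entry type (dir, obj, sym) followed by the path.
    while read -r kind path _rest; do
        [ "$kind" = "obj" ] || continue
        case $path in
            "$prefix"/*) ;;      # below the requested directory
            *) continue ;;
        esac
        [ -f "$path" ] || continue
        size=$(wc -c < "$path")  # size in bytes
        total=$((total + size))
    done < "$contents"
    echo "$total"
}
```

Something like sum_obj_sizes /var/db/pkg/$category/$PN-$PV/CONTENTS /usr would then print one byte count, much like the perl above; it is slower, since it runs wc once per file.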

Coloration - imgur screenshot via droplr

freya:~$ cat ./run_perl.sh ; echo -e "\n\n" ; ./run_perl.sh | cat
#!/bin/bash
mcont="/var/db/pkg/dev-db/mongodb-2.*/CONTENTS"
kilo="1024"
mega=$((1024 * 1024))
giga=$((1024 * 1024 * 1024))
tera=$((1024 * 1024 * 1024 * 1024))
for dir in $(grep ^dir ${mcont} | sed -e 's~^dir ~~') ; do
pdir=$(echo ${dir} | sed -e 's~/~\\/~g' -e 's~.*~&\\/~')
size=$(perl -e 'while(<>) { chomp; next unless $_ =~ /obj '${pdir}'/; @a = split /\s+/, $_; $sum += -s $a[1]; }; print $sum,"\n"; ' ${mcont})
if [[ "${size}" -lt "${kilo}" ]]; then
color="" # dark blue
color="" # light blue
output="${size} bytes"
elif [[ "${size}" -lt "${mega}" ]]; then
color="" # dark green
color="" # light green
output="$(echo -e "scale=3\n${size} / ${kilo}" | bc) kiB"
elif [[ "${size}" -lt "${giga}" ]]; then
color="" # dark yellow
color="" # light yellow
output="$(echo -e "scale=3\n${size} / ${mega}" | bc) MiB"
elif [[ "${size}" -lt "${tera}" ]]; then
color="" # dark red
color="" # light red
output="$(echo -e "scale=3\n${size} / ${giga}" | bc) GiB"
fi
echo -en "${color}${dir}:\t${output}\n"
done | awk {'printf "%-45s\t%-8s %s %s\n", $1, $2, $3, $4'}
/usr: 2.303 GiB
/usr/include: 1.881 MiB
/usr/include/mongo: 1.881 MiB
/usr/include/mongo/base: 53.936 kiB
/usr/include/mongo/bson: 171.173 kiB
/usr/include/mongo/bson/util: 22.447 kiB
/usr/include/mongo/client: 159.850 kiB
/usr/include/mongo/db: 771.620 kiB
/usr/include/mongo/db/auth: 51.883 kiB
/usr/include/mongo/db/ops: 52.774 kiB
/usr/include/mongo/db/repl: 65.227 kiB
/usr/include/mongo/db/stats: 14.171 kiB
/usr/include/mongo/platform: 40.834 kiB
/usr/include/mongo/s: 246.395 kiB
/usr/include/mongo/scripting: 82.681 kiB
/usr/include/mongo/shell: 20.939 kiB
/usr/include/mongo/util: 374.523 kiB
/usr/include/mongo/util/concurrency: 91.199 kiB
/usr/include/mongo/util/mongoutils: 16.264 kiB
/usr/include/mongo/util/net: 40.488 kiB
/usr/bin: 123.237 MiB
/usr/lib64: 1.374 MiB
/usr/share: 49.803 kiB
/usr/share/man: 47.885 kiB
/usr/share/man/man1: 47.885 kiB
/usr/share/doc: 1.917 kiB
/usr/share/doc/mongodb-2.4.8: 1.917 kiB
/usr/lib: 2.180 GiB
/usr/lib/systemd: 220 bytes
/usr/lib/systemd/system: 220 bytes
/usr/lib/debug: 2.180 GiB
/usr/lib/debug/usr: 2.180 GiB
/usr/lib/debug/usr/bin: 2.160 GiB
/usr/lib/debug/usr/lib64: 20.224 MiB
/var: 0 bytes
/var/lib: 0 bytes
/var/lib/mongodb: 0 bytes
/var/log: 0 bytes
/var/log/mongodb: 0 bytes
/etc: 5.130 kiB
/etc/init.d: 3.771 kiB
/etc/conf.d: 941 bytes
/etc/logrotate.d: 205 bytes
/opt: 100.997 kiB
/opt/mms-agent: 100.997 kiB

The same run again, this time piped through cat -e so that the otherwise invisible color escapes (^[[1;31m and friends) and the line ends ($) show up:

freya:~$ cat -e ./run_perl.sh ; echo -e "\n\n" ; ./run_perl.sh | cat -e
#!/bin/bash$
$
mcont="/var/db/pkg/dev-db/mongodb-2.*/CONTENTS"$
$
kilo="1024"$
mega=$((1024 * 1024))$
giga=$((1024 * 1024 * 1024))$
tera=$((1024 * 1024 * 1024 * 1024))$
$
for dir in $(grep ^dir ${mcont} | sed -e 's~^dir ~~') ; do$
pdir=$(echo ${dir} | sed -e 's~/~\\/~g' -e 's~.*~&\\/~')$
size=$(perl -e 'while(<>) { chomp; next unless $_ =~ /obj '${pdir}'/; @a = split /\s+/, $_; $sum += -s $a[1]; }; print $sum,"\n"; ' ${mcont})$
if [[ "${size}" -lt "${kilo}" ]]; then$
color="^[[0;34m" # dark blue$
color="^[[1;34m" # light blue$
output="${size} bytes"$
elif [[ "${size}" -lt "${mega}" ]]; then$
color="^[[0;32m" # dark green$
color="^[[1;32m" # light green$
output="$(echo -e "scale=3\n${size} / ${kilo}" | bc) kiB"$
elif [[ "${size}" -lt "${giga}" ]]; then$
color="^[[0;33m" # dark yellow$
color="^[[1;33m" # light yellow$
output="$(echo -e "scale=3\n${size} / ${mega}" | bc) MiB"$
elif [[ "${size}" -lt "${tera}" ]]; then$
color="^[[0;31m" # dark red$
color="^[[1;31m" # light red$
output="$(echo -e "scale=3\n${size} / ${giga}" | bc) GiB"$
fi$
echo -en "${color}${dir}:\t${output}\n"$
done | awk {'printf "%-45s\t%-8s %s %s\n", $1, $2, $3, $4'}$
^[[1;31m/usr: 2.303 GiB $
^[[1;33m/usr/include: 1.881 MiB $
^[[1;33m/usr/include/mongo: 1.881 MiB $
^[[1;32m/usr/include/mongo/base: 53.936 kiB $
^[[1;32m/usr/include/mongo/bson: 171.173 kiB $
^[[1;32m/usr/include/mongo/bson/util: 22.447 kiB $
^[[1;32m/usr/include/mongo/client: 159.850 kiB $
^[[1;32m/usr/include/mongo/db: 771.620 kiB $
^[[1;32m/usr/include/mongo/db/auth: 51.883 kiB $
^[[1;32m/usr/include/mongo/db/ops: 52.774 kiB $
^[[1;32m/usr/include/mongo/db/repl: 65.227 kiB $
^[[1;32m/usr/include/mongo/db/stats: 14.171 kiB $
^[[1;32m/usr/include/mongo/platform: 40.834 kiB $
^[[1;32m/usr/include/mongo/s: 246.395 kiB $
^[[1;32m/usr/include/mongo/scripting: 82.681 kiB $
^[[1;32m/usr/include/mongo/shell: 20.939 kiB $
^[[1;32m/usr/include/mongo/util: 374.523 kiB $
^[[1;32m/usr/include/mongo/util/concurrency: 91.199 kiB $
^[[1;32m/usr/include/mongo/util/mongoutils: 16.264 kiB $
^[[1;32m/usr/include/mongo/util/net: 40.488 kiB $
^[[1;33m/usr/bin: 123.237 MiB $
^[[1;33m/usr/lib64: 1.374 MiB $
^[[1;32m/usr/share: 49.803 kiB $
^[[1;32m/usr/share/man: 47.885 kiB $
^[[1;32m/usr/share/man/man1: 47.885 kiB $
^[[1;32m/usr/share/doc: 1.917 kiB $
^[[1;32m/usr/share/doc/mongodb-2.4.8: 1.917 kiB $
^[[1;31m/usr/lib: 2.180 GiB $
^[[1;34m/usr/lib/systemd: 220 bytes $
^[[1;34m/usr/lib/systemd/system: 220 bytes $
^[[1;31m/usr/lib/debug: 2.180 GiB $
^[[1;31m/usr/lib/debug/usr: 2.180 GiB $
^[[1;31m/usr/lib/debug/usr/bin: 2.160 GiB $
^[[1;33m/usr/lib/debug/usr/lib64: 20.224 MiB $
^[[1;34m/var: 0 bytes $
^[[1;34m/var/lib: 0 bytes $
^[[1;34m/var/lib/mongodb: 0 bytes $
^[[1;34m/var/log: 0 bytes $
^[[1;34m/var/log/mongodb: 0 bytes $
^[[1;32m/etc: 5.130 kiB $
^[[1;32m/etc/init.d: 3.771 kiB $
^[[1;34m/etc/conf.d: 941 bytes $
^[[1;34m/etc/logrotate.d: 205 bytes $
^[[1;32m/opt: 100.997 kiB $
^[[1;32m/opt/mms-agent: 100.997 kiB $
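
The ^[[1;31m-style sequences in the cat -e output are ANSI color escapes (ESC followed by [1;31m and so on). In the script each branch assigns color twice, so only the second, "light" value is ever used, and the color is never reset afterwards. A minimal sketch of the same unit-to-color mapping with an explicit reset might look like this; color_for_unit and colorize are hypothetical names, and writing ESC as \033 for printf is a choice made here rather than what the script does (it embeds the raw escape character).

```shell
#!/bin/bash
# color_for_unit and colorize are hypothetical helpers, not part of the gist's script.
color_for_unit() {
    case $1 in
        bytes) printf '\033[1;34m' ;;  # light blue
        kiB)   printf '\033[1;32m' ;;  # light green
        MiB)   printf '\033[1;33m' ;;  # light yellow
        GiB)   printf '\033[1;31m' ;;  # light red
    esac
}

# Usage: colorize UNIT TEXT
colorize() {
    unit=$1
    text=$2
    color_for_unit "$unit"
    # \033[0m resets all attributes so the color does not leak into later output.
    printf '%s\033[0m\n' "$text"
}
```

Running colorize GiB "/usr: 2.303 GiB" | cat -v should show the line wrapped in ^[[1;31m and ^[[0m.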