@fedarko
Last active August 31, 2022 08:11
Sort and remove duplicate entries in a BBL file (the bibliography file that BibTeX generates); useful when combining multiple BBL files (e.g. when using the multibib package) into a single one
#! /usr/bin/env python3
# NOTE: this is a hack, so it will probably break if you have BBL files that
# don't look like the natbib-generated ones I'm used to. It is also pretty
# unintelligent about *how* it sorts entries (it defers most of the work
# to python), so if you have cases where some of your references are by
# the same person or whatever then that might cause the output to not match
# your expectations.
import sys
blocks = []
# We assume that this file starts with
# \begin{thebibliography}{} and ends with \end{thebibliography}
BBL_FP = sys.argv[1]
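with open(BBL_FP, "r") as f:
    curr_block = ""
    for line in f:
        # A blank line ends the current entry
        if len(line.strip()) == 0:
            if len(curr_block) > 0:
                blocks.append(curr_block)
                curr_block = ""
            continue
        if "{thebibliography}" in line:
            continue
        else:
            curr_block += line
    # Keep the last entry even if no blank line precedes \end{thebibliography}
    if len(curr_block) > 0:
        blocks.append(curr_block)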
# bbl entries with author titles encapsulated in {{ (e.g. due to not being
# real names, but instead the names of groups of people or whatevs) will mess
# with the sorting. so we replace the {{ with a { just for the purpose of
# sorting (the {{ remains for the output bbl)
sorted_blocks = sorted(set(blocks), key=lambda b: b.replace("{{", "{"))
out_text = r"\begin{thebibliography}{}" + "\n"
for block in sorted_blocks:
    out_text += "\n" + block
out_text += "\n" + r"\end{thebibliography}"
with open(BBL_FP, "w") as f:
    f.write(out_text)
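
To see what the `{{` replacement changes, here is a minimal sketch of the same split / dedupe / sort logic run on an in-memory string instead of a file. The function names (`split_bbl_blocks`, `dedupe_and_sort`) and the sample entries are made up for illustration and are not part of the script above; the sketch assumes natbib-style entries separated by blank lines.

```python
def split_bbl_blocks(text):
    """Split BBL text into entry blocks separated by blank lines."""
    blocks = []
    curr_block = ""
    for line in text.splitlines(keepends=True):
        if not line.strip():
            # A blank line ends the current entry
            if curr_block:
                blocks.append(curr_block)
                curr_block = ""
            continue
        if "{thebibliography}" in line:
            # Skip the \begin / \end lines
            continue
        curr_block += line
    if curr_block:
        blocks.append(curr_block)
    return blocks


def dedupe_and_sort(text):
    """Drop exact duplicate blocks, then sort as if "{{" were "{"."""
    return sorted(
        set(split_bbl_blocks(text)),
        key=lambda b: b.replace("{{", "{"),
    )


# Hypothetical sample: one group author wrapped in {{...}}, one duplicate
sample = (
    "\\begin{thebibliography}{}\n"
    "\n"
    "\\bibitem[{Baker}(2019)]{baker2019}\n"
    "Baker, B. 2019, Example Journal, 2, 2\n"
    "\n"
    "\\bibitem[{{Alpha Collaboration}}(2020)]{alpha2020}\n"
    "{Alpha Collaboration}. 2020, Example Journal, 1, 1\n"
    "\n"
    "\\bibitem[{Baker}(2019)]{baker2019}\n"
    "Baker, B. 2019, Example Journal, 2, 2\n"
    "\n"
    "\\end{thebibliography}\n"
)

result = dedupe_and_sort(sample)
plain = sorted(set(split_bbl_blocks(sample)))

for block in result:
    print(block.splitlines()[0])
```

With the replacement in the sort key, the Alpha Collaboration entry sorts before Baker. With a plain `sorted()` it would land after Baker, because `{` compares greater than any ASCII letter. The duplicate Baker entry is dropped only because the two blocks are character-for-character identical; entries that differ in whitespace or formatting would both be kept.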