Extract only needed citations from large bibfiles

2019-09-18 · 1 min read
blog software

I like to use cryptobib, because it gives me consistent results and it contains almost everything relevant to me.

However, crypto.bib is over 725000 lines long, so parsing it to build the bibliography takes a long time. As a result, my $\LaTeX$ compile jobs take much too long.

I have now written a Python script that extracts only the entries you need from large bibliographies.

The script reads your biblatex .bcf file, which lists the keys your document cites, and extracts the matching entries from the larger bibfiles.
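To give a rough idea of how such an extraction can work, here is a minimal sketch. It is not the actual script: the function names `cited_keys` and `extract_entries` are made up for illustration, and it assumes that the .bcf file is XML in which each cited key is the text of a `citekey` element. It also skips things a real tool would probably need, such as `@string` abbreviations and `crossref` parents.

```python
import re
import xml.etree.ElementTree as ET

# Matches the start of a regular entry such as "@Article{some:key,".
# @string definitions do not match, because their name is followed by "=".
ENTRY_START = re.compile(r"@(\w+)\s*\{\s*([^\s,{}=]+)\s*,")


def cited_keys(bcf_xml):
    """Collect citation keys from the XML content (str or bytes) of a .bcf file."""
    root = ET.fromstring(bcf_xml)
    keys = set()
    for element in root.iter():
        # Compare only the local tag name, so the namespace URI does not matter.
        if element.tag.rsplit("}", 1)[-1] == "citekey" and element.text:
            keys.add(element.text.strip())
    return keys


def extract_entries(bib_text, keys):
    """Return the entries in bib_text whose key is in keys, in file order."""
    entries = []
    pos = 0
    while True:
        match = ENTRY_START.search(bib_text, pos)
        if match is None:
            break
        pos = match.end()
        if match.group(2) not in keys:
            continue
        # Walk forward until the braces opened by this entry are balanced.
        depth = 0
        for index in range(match.start(), len(bib_text)):
            char = bib_text[index]
            if char == "{":
                depth += 1
            elif char == "}":
                depth -= 1
                if depth == 0:
                    entries.append(bib_text[match.start():index + 1])
                    pos = index + 1
                    break
    return entries
```

With these two helpers you would read the .bcf and .bib files, call `extract_entries(bib_text, cited_keys(bcf_xml))`, and print the returned entries separated by blank lines. Only the entries that are actually cited get scanned brace by brace, which keeps the pass over a very large bibfile reasonably cheap.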

Makefile example

The only catch is that you need to make sure to rerun the script whenever you add new references to your document. You can do this manually, or use the example Makefile below to keep your extracted .bib file up to date.

.PHONY: main.pdf
main.pdf: extracted_cryptobib.bib
	./latexrun --bibtex-cmd=biber -Wall main.tex
	# or latexmk, or just pdflatex a bunch of times...

latex.out/main.bcf:
	mkdir -p latex.out
	pdflatex -interaction=batchmode -output-directory=latex.out main

extracted_cryptobib.bib: latex.out/main.bcf cryptobib/crypto.bib
	python3 extract_from_bibliography.py $^ > $@

You can find this script on its GitHub page.

Thom Wiggers
Authors
Senior Cryptography Researcher
Thom Wiggers is a cryptography researcher at PQShield. His PhD thesis was on the interactions of post-quantum cryptography with protocols, under the supervision of Peter Schwabe, at the Institute of Computing and Information Sciences, Radboud University in The Netherlands.