Manage your scientific article library using PDF metadata, Unix-style.
Most PDF articles found online have useless metadata. To manage a library you will have to rely on
- the filesystem, which is limited
- or an external database, which means additional state to maintain and can lock you into an "ecosystem" :-###..
However, PDF metadata is stored alongside the content and can contain a list of authors and a journal for easier searching. Keeping the library data there gives you:
- interoperability: files stay where they were and their metadata is open to other programs (pro-tip: look into fzf),
- freedom: no proprietary databases with special tools enforcing their workflow,
- hackability: the scripts are short, POSIX-compliant, pipe-able and accept wildcards, where applicable.
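
Because the metadata lives in the file itself, ordinary tools can read it. As a rough sketch (not part of fam): pdfinfo prints "Key: value" lines, so a small filter can pull out a single field. The name pdf_field is a hypothetical helper, and the usage lines assume the PDF actually carries an Author entry.

```sh
# Hypothetical helper (not shipped with fam): print the value of one field
# from pdfinfo-style "Key:   value" lines read on stdin.
pdf_field() {
    sed -n "s/^$1:[[:space:]]*//p"
}

# Example usage, assuming pdfinfo is installed and paper.pdf exists:
#   pdfinfo paper.pdf | pdf_field Author
#
# Listing "file<TAB>authors" for a directory and filtering it with fzf:
#   for f in *.pdf; do
#       printf '%s\t%s\n' "$f" "$(pdfinfo "$f" | pdf_field Author)"
#   done | fzf
```

The field name is passed straight into the sed pattern, so this only suits plain names such as Title or Author.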

    git clone https://github.com/ivan-boikov/fam && cd fam && make install

By default it will install symlinks in $HOME/.local/bin.
To change the location, edit the Makefile, but be sure the new directory is in $PATH.
Run make uninstall to uninstall.
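
If the scripts are not found after installation, the install directory is probably missing from $PATH. A minimal check, assuming the default $HOME/.local/bin location; in_path is a hypothetical helper and not part of fam:

```sh
# Hypothetical helper: succeed if directory $1 is an entry of the
# colon-separated list $2 (defaults to the current $PATH).
in_path() {
    case ":${2:-$PATH}:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

if in_path "$HOME/.local/bin"; then
    echo "install directory is in PATH"
else
    echo "add this to your shell profile: export PATH=\"\$HOME/.local/bin:\$PATH\""
fi
```

The comparison is a literal string match, so a trailing slash or an unexpanded ~ in $PATH will not be recognised.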
- Interactively find the correct DOI of a PDF with doi-repair. There are no automatic internet searches or fancy algorithms: you are far more reliable at finding the correct DOI anyway. Then, data from CrossRef is used to populate standard PDF metadata with a list of authors, a title, and an ISO-690-like reference with a DOI.
- Enjoy the profits: use doi-rename to unify your filenames, or chain doi-infer with doi-to-bibtex to generate a BibTeX file

      doi-infer *.pdf | doi-to-bibtex > literature.bib

  (don't ask too much too quickly, don't anger CrossRef) or be extra fancy and search recursively with a bit cleaner output

      find <library path> -name '*.pdf' | doi-infer | doi-to-bibtex | sed 's/\}, /\},\n\t/g' > literature.bib

  or whatever else you have, the data is open, so use it!
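
To stay polite to CrossRef, the lookups can be spaced out. The sketch below is not part of fam: throttle is a hypothetical helper that passes lines through with a pause after each one. It assumes that doi-infer prints one DOI per line and that doi-to-bibtex issues its request as each DOI arrives on stdin; if it reads all of its input first, the delay has no effect.

```sh
# Hypothetical helper: copy stdin to stdout line by line,
# sleeping $1 seconds (default 1) after each line.
throttle() {
    delay="${1:-1}"
    while IFS= read -r line; do
        printf '%s\n' "$line"
        sleep "$delay"
    done
}

# Example usage (assumptions as described above):
#   doi-infer *.pdf | throttle 1 | doi-to-bibtex > literature.bib
```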
The things you most likely already have:
- pdftk, an excellent PDF editing utility, highly useful on its own
- pdfinfo from the poppler package (a dependency of pdftk)
- a PDF viewer callable by xdg-open
- standard Linux utilities: grep, curl and others
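
A quick way to see whether anything from this list is missing is to probe each command with command -v. This is only a sketch: missing_deps is a hypothetical helper, not something fam ships, and the command names are the ones listed above.

```sh
# Hypothetical helper: print every command from the argument list
# that cannot be found in $PATH.
missing_deps() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}

# Prints nothing when all dependencies are available.
missing_deps pdftk pdfinfo xdg-open grep curl
```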
No guarantees, things might crash and burn, so be careful. Works on my machine™.
- embedding keywords
- optional removal of accented characters
- support conference proceedings