Customizing¶
The core functionality of bibtexparser is deliberately kept simple:
Upon parsing, the input string is merely split into different parts (blocks) and corresponding subparts (fields, keys, …).
Upon writing, the splitting is reversed and the blocks are joined together again, with few formatting options.
Advanced transformations of blocks, such as sorting, encoding, cross-referencing, etc. are not part of the core functionality, but can be optionally added to the parse stack by using the corresponding middleware layers: Middleware layers helper classes providing the functionality take a library object and return a new, transformed version of said library.
Middleware Layers¶
import bibtexparser.middlewares as m
# We want to add three new middleware layers to our parse stack:
layers = [
m.MonthIntMiddleware(), # Months should be represented as int (0-12)
m.SeparateCoAuthors(), # Co-authors should be separated as list of strings
m.SplitNameParts() # Individual Names should be split into first, von, last, jr parts
]
library = bibtexparser.parse_file('bibtex.bib', append_middleware=layers)
This example adds three new middleware layers to the parse stack:
The first layer converts the month field (which may be represented as String (“February”), native string reference (feb) or integer (2) to the integer representation (0-12).
The second layer splits the author field into a list of authors (and similarly for editors, translators, etc.).
The third layer splits the author names into a object representing the first, von, last and jr parts of the name.
Default Parse-Stack¶
BibtexParser foresees a default parse stack; i.e., some middleware is automatically applied as we assume it to be part of the expected functionality for most users.
Currently, the default parse stack consists of the following layers:
bibtexparser.middlewares.ResolveStringReferencesMiddleware
: De-Reference reference to @string definitions.bibtexparser.middlewares.RemoveEnclosingMiddleware
: Removes enclosing (e.g. curly braces or “”) from values.
The default write stack consists of the following layers:
bibtexparser.middlewares.AddEnclosingMiddleware
: Encloses values in curly braces where needed.
When specifying their own stack, user get to chose if they want to add to or overwrite the default stack
by selecting the corresponding argument when calling bibtexparser.parse
or bibtexparser.write
:
append_middleware
: Add middleware to the default parse stack (similarlyprepend_middleware
for write stack).parse_stack
: Overwrite the default parse stack (similarlywrite_stack
for write stack).
Warning
The default parse and write stacks may change on minor version updates and between pre-releases.
To reduce the risk of unnoticed changes in parsing stack, critical applications may want to hard-code
the full parse stack in their code using parse_stack
and write_stack
arguments.
Core Middleware¶
bibtexparser comes with a number of middleware options:
Encoding and Enclosing of Values¶
bibtexparser.middlewares.AddEnclosingMiddleware
bibtexparser.middlewares.RemoveEnclosingMiddleware
bibtexparser.middlewares.LatexEncodingMiddleware
bibtexparser.middlewares.LatexDecodingMiddleware
Value References and Representation¶
bibtexparser.middlewares.ResolveStringReferencesMiddleware
bibtexparser.middlewares.MonthIntMiddleware
bibtexparser.middlewares.MonthAbbreviationMiddleware
bibtexparser.middlewares.MonthLongStringMiddleware
Names¶
bibtexparser.middlewares.SeparateCoAuthors
bibtexparser.middlewares.MergeCoAuthors
bibtexparser.middlewares.SplitNameParts
(requires SeperateCoAuthors to be applied first)bibtexparser.middlewares.MergeNameParts
Sorting¶
bibtexparser.middlewares.SortBlocksByTypeAndKeyMiddleware
bibtexparser.middlewares.SortFieldsAlphabeticallyMiddleware
bibtexparser.middlewares.SortFieldsCustomMiddleware
Note
As opposed to bibtexparser v1, the en- and decoding of latex characters is now handled by a third-party library. Previously, this part was responsible for much of the code complexity and bugs in bibtexparser, and leaving this to an established solution is intended to make the use of bibtexparser much more stable, even if it comes at the cost of slightly reduced functionality and performance. See the migration docs, if you are migrating from bibtexparser v1.
Write your own Middleware¶
Functions working on blocks individually¶
Should extend the bibtexparser.middlewares.BlockMiddleware
class.
This includes functionalities similar to
Encoding and Enclosing of Values, Value References and Representation, and Names.
Basic example:
from bibtexparser.middlewares import BlockMiddleware class MyMiddleware(BlockMiddleware): def transform_entry(self, entry, *args, **kwargs): # Do something with the entry, e.g. entry["title"] = entry["title"].lower() # Return the transformed entry return entry
Initialize the middleware with some parameters:
from bibtexparser.middlewares import BlockMiddleware class MyMiddleware(BlockMiddleware): def __init__(self, my_param): self.my_param = my_param super().__init__() def transform_entry(self, entry, *args, **kwargs): # Do something with the entry, e.g. entry["title"] = entry["title"].lower() # Return the transformed entry return entry
Library-wide transformations¶
Should extend the bibtexparser.middlewares.LibraryMiddleware
class.
This includes functionalities similar to sorting blocks
(e.g. bibtexparser.middlewares.SortBlocksByTypeAndKeyMiddleware
).
Warning¶
bibtexparser.middlewares.BlockMiddleware
and bibtexparser.middlewares.LibraryMiddleware
have two default arguments:
allow_parallel_execution=True
, seebibtexparser.middlewares.Middleware.allow_inplace_modification()
.allow_inplace_modification=True
, seebibtexparser.middlewares.Middleware.allow_parallel_execution()
.
If you want to change these defaults, specify them in the call to the super constructor. E.g.:
from bibtexparser.middlewares import BlockMiddleware
class MyMiddleware(BlockMiddleware):
def __init__(self, my_param):
self.my_param = my_param
super().__init__(
allow_parallel_execution = False,
allow_inplace_modification = False,
)
def transform_entry(self, entry, *args, **kwargs):
# Do something with the entry, e.g.
entry["title"] = entry["title"].lower()
# Return the transformed entry
return entry
Community-Provided Middleware¶
We encourage users to provide their own middleware layers and share them with the community.
We are happy to provide a list of community-provided middleware layers here, so please let us know if you have written one!
See CONTRIBUTING.md
for suggestions how to contribute.
Metadata Fields¶
All blocks have a metadata
attribute, which is a dictionary of arbitrary middleware-value pairs.
This is intended for middleware layers to store metadata about the transformation made by them,
which in turn can be used by other middleware layers (e.g. to reverse the transformation).
The metadata attribute and its exact specification is still experimental and subject to breaking changes even within minor/path versions. Even when not experimental anymore, it is not intended to be used by users directly, and may be changed as needed by the corresponding middleware maintainers.
Formatting Options for Writing¶
Basic formatting options (e.g. indentation, line breaks, etc.) have no influence on the bibtexparser.bparser.Library
representation and should not / cannot therefore be specified as middleware layers.
These options are instead specified as arguments to the bibtexparser.write
function.
Specifically, a user may pass a bibtexparser.BibtexFormatter
object to the bibtex_format
argument of bibtexparser.write
.
bibtex_format = bibtexparser.BibtexFormat()
bibtex_format.indent = ' '
bibtex_format.block_separator = '\n\n'
bib_str = bibtexparser.write_string(library, bibtex_format=bibtex_format)
A few more options are provided and we refer to the docstrings of bibtexparser.BibtexFormat
for details.
Note: Sorting of blocks and fields is done with the corresponding middleware, as described above.