Contents
- bibtexparser: API
  - bibtexparser: Parsing and writing BibTeX files
  - bibtexparser.bibdatabase: The bibliographic database object
  - bibtexparser.bparser: Tune the default parser
  - bibtexparser.customization: Functions to customize records
  - bibtexparser.bwriter: Tune the default writer
  - bibtexparser.bibtexexpression: Parser's core relying on pyparsing
bibtexparser: API

bibtexparser: Parsing and writing BibTeX files

BibTeX is a bibliographic data file format. The bibtexparser module can parse BibTeX files and write them. The API is similar to the json module. The parsed data is returned as a simple BibDatabase object with the main attribute being entries, representing bibliographic sources such as books and journal articles.
The following functions provide a quick and basic way to manipulate a BibTeX file. More advanced features are also available in this module.
Parsing a file is as simple as:

    import bibtexparser
    with open('bibtex.bib') as bibtex_file:
        bibtex_database = bibtexparser.load(bibtex_file)

And writing:

    import bibtexparser
    with open('bibtex.bib', 'w') as bibtex_file:
        bibtexparser.dump(bibtex_database, bibtex_file)

bibtexparser.load(bibtex_file, parser=None)
Load BibDatabase object from a file.
Parameters:
- bibtex_file (file) – input file to be parsed
- parser (BibTexParser) – custom parser to use (optional)
Returns: bibliographic database object
Return type: BibDatabase
Example:

    import bibtexparser
    with open('bibtex.bib') as bibtex_file:
        bibtex_database = bibtexparser.load(bibtex_file)

bibtexparser.loads(bibtex_str, parser=None)
Load BibDatabase object from a string.
Parameters:
- bibtex_str (str or unicode) – input BibTeX string to be parsed
- parser (BibTexParser) – custom parser to use (optional)
Returns: bibliographic database object
Return type: BibDatabase

bibtexparser.dumps(bib_database, writer=None)
Dump BibDatabase object to a BibTeX string.
Parameters:
- bib_database (BibDatabase) – bibliographic database object
- writer (BibTexWriter) – custom writer to use (optional)
Returns: BibTeX string
Return type: unicode

bibtexparser.dump(bib_database, bibtex_file, writer=None)
Dump BibDatabase object as a BibTeX text file.
Parameters:
- bib_database (BibDatabase) – bibliographic database object
- bibtex_file (file) – file to write to
- writer (BibTexWriter) – custom writer to use (optional)
Example:

    import bibtexparser
    with open('bibtex.bib', 'w') as bibtex_file:
        bibtexparser.dump(bibtex_database, bibtex_file)

bibtexparser.bibdatabase: The bibliographic database object

class bibdatabase.BibDatabase
Bibliographic database object that follows the data structure of a BibTeX file.

comments = None
List of BibTeX comment (@comment{…}) blocks.

entries = None
List of BibTeX entries, for example @book{…}, @article{…}, etc. Each entry is a simple dict with BibTeX field-value pairs, for example 'author': 'Bird, R.B. and Armstrong, R.C. and Hassager, O.' Each entry will always have the following dict keys (in addition to other BibTeX fields):
- ID (BibTeX key)
- ENTRYTYPE (entry type in lowercase, e.g. book, article etc.)

entries_dict
Return a dictionary of BibTeX entries. The dict key is the BibTeX entry key.

preambles = None
List of BibTeX preamble (@preamble{…}) blocks.

strings = None
OrderedDict of BibTeX string definitions (@string{…}), in order of definition.

bibtexparser.bparser: Tune the default parser

class bparser.BibTexParser(data=None, customization=None, ignore_nonstandard_types=True, homogenize_fields=False, interpolate_strings=True, common_strings=False)
A parser for reading BibTeX bibliographic data files.
Example:

    import bibtexparser
    from bibtexparser.bparser import BibTexParser

    bibtex_str = ...

    parser = BibTexParser()
    parser.ignore_nonstandard_types = False
    parser.homogenize_fields = False
    parser.common_strings = False
    bib_database = bibtexparser.loads(bibtex_str, parser)

Parameters:
- customization – function or None (default). Customization to apply to parsed entries.
- ignore_nonstandard_types – bool (default True). If True, ignores non-standard BibTeX entry types.
- homogenize_fields – bool (default False). Common field name replacements (as set in the alt_dict attribute).
- interpolate_strings – bool (default True). If True, replaces BibTeX strings by their value; otherwise uses BibDataString objects.
- common_strings – bool (default False). Include common string definitions (e.g. month abbreviations) in the BibTeX file.

common_strings = None
Load common strings such as month abbreviations. Default: False.

customization = None
Callback function to process BibTeX entries after parsing, for example to create a list from a string with multiple values. By default all BibTeX values are treated as simple strings. Default: None.

homogenize_fields = None
Sanitize BibTeX field names, for example change url to link etc. Field names are always converted to lowercase names. Default: False.

ignore_nonstandard_types = None
Ignore entry types other than the standard BibTeX types (book, article, etc.). Default: True.

interpolate_strings = None
Interpolate BibTeX strings, or keep their structure. Default: True.

bibtexparser.customization: Functions to customize records

A set of functions useful for customizing BibTeX fields. You can draw inspiration from these functions to design your own. Each of them takes a record and returns the modified record.
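
A custom function only has to follow the same contract: take a record dict and return it. The sketch below is a hypothetical example, not part of the library; the name strip_title_period is invented here.

```python
def strip_title_period(record):
    """Hypothetical customization: drop trailing periods from the title."""
    if "title" in record:
        record["title"] = record["title"].rstrip(".")
    return record


# Records are plain dicts, so the function can be tried without a parser.
record = {"ID": "Cesar2013", "ENTRYTYPE": "article",
          "title": "An amazing title."}
record = strip_title_period(record)
print(record["title"])
```

A function like this can be assigned to the customization attribute of a BibTexParser, either on its own or called from a larger callback.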

customization.splitname(name, strict_mode=True)
Break a name into its constituent parts: First, von, Last, and Jr.
Parameters:
- name (string) – a string containing a single name
- strict_mode (Boolean) – whether to use strict mode
Returns: dictionary of constituent parts
Raises: customization.InvalidName – if an invalid name is given and strict_mode = True.

In BibTeX, a name can be represented in any of three forms:
- First von Last
- von Last, First
- von Last, Jr, First

This function attempts to split a given name into its four parts. The returned dictionary has the keys first, last, von and jr. Each value is a list of the words making up that part; this may be an empty list. If the input has no non-whitespace characters, a blank dictionary is returned.

It is capable of detecting some errors with the input name. If the strict_mode parameter is True, which is the default, this results in a customization.InvalidName exception being raised. If it is False, the function continues, working around the error as best it can. The errors that can be detected are listed below along with the handling for non-strict mode:
- Name finishes with a trailing comma: delete the comma
- Too many parts (e.g., von Last, Jr, First, Error): merge extra parts into First
- Unterminated opening brace: add closing brace to end of input
- Unmatched closing brace: add opening brace at start of word

customization.getnames(names)
Convert people's names to the form "surname, firstnames" or "surname, initials".
Parameters: names (list) – a list of names
Returns: list – correctly formatted names
This function is known to be too simple to properly handle the complex rules for names. We would like to enhance this in forthcoming releases.

customization.author(record)
Split author field into a list of "Name, Surname".
Parameters: record (dict) – the record.
Returns: dict – the modified record.

customization.editor(record)
Turn the editor field into a dict composed of the original editor name and an editor id (without commas or blanks).
Parameters: record (dict) – the record.
Returns: dict – the modified record.

customization.journal(record)
Turn the journal field into a dict composed of the original journal name and a journal id (without commas or blanks).
Parameters: record (dict) – the record.
Returns: dict – the modified record.

customization.keyword(record, sep=', |;')
Split keyword field into a list.
Parameters:
- record (dict) – the record.
- sep (string, optional) – pattern used for the splitting regexp.
Returns: dict – the modified record.

customization.link(record)
Parameters: record (dict) – the record.
Returns: dict – the modified record.

customization.page_double_hyphen(record)
Separate pages by a double hyphen (--).
Parameters: record (dict) – the record.
Returns: dict – the modified record.

customization.doi(record)
Parameters: record (dict) – the record.
Returns: dict – the modified record.

customization.type(record)
Put the type into lower case.
Parameters: record (dict) – the record.
Returns: dict – the modified record.

customization.convert_to_unicode(record)
Convert accents from LaTeX to Unicode style.
Parameters: record (dict) – the record.
Returns: dict – the modified record.

customization.homogenize_latex_encoding(record)
Homogenize the LaTeX encoding style for BibTeX.
This function is experimental.
Parameters: record (dict) – the record.
Returns: dict – the modified record.

customization.add_plaintext_fields(record)
For each field in the record, add a plain_ field containing the plaintext, stripped of braces and similar. See https://github.com/sciunto-org/python-bibtexparser/issues/116.
Parameters: record (dict) – the record.
Returns: dict – the modified record.

Exception classes

class customization.InvalidName
Exception raised by customization.splitname() when an invalid name is input.

bibtexparser.bwriter: Tune the default writer

class bwriter.BibTexWriter(write_common_strings=False)
Writer to convert a BibDatabase object to a string or file formatted as a BibTeX file.
Example:

    import bibtexparser
    from bibtexparser.bwriter import BibTexWriter

    bib_database = ...

    writer = BibTexWriter()
    writer.contents = ['comments', 'entries']
    writer.indent = ' '
    writer.order_entries_by = ('ENTRYTYPE', 'author', 'year')
    bibtex_str = bibtexparser.dumps(bib_database, writer)

align_values = None
Align values. Determines the maximal number of characters used in any field name and aligns all values to that width.

comma_first = None
BibTeX syntax allows comma-first syntax (common in functional languages). Set this to enable comma-first syntax in the bwriter output.

common_strings = None
Whether common strings are written.

contents = None
List of BibTeX elements to write. Valid values are entries, comments, preambles, strings.

display_order = None
Tuple of fields for display order in a single BibTeX entry. Fields not listed here will be displayed alphabetically at the end. Set to '[]' for alphabetical order. Default: '[]'

entry_separator = None
Character(s) for separating BibTeX entries. Default: new line.

indent = None
Character(s) for indenting BibTeX field-value pairs. Default: single space.

order_entries_by = None
Tuple of fields for ordering BibTeX entries. Set to None to disable sorting. Default: BibTeX key ('ID', ).

write(bib_database)
Converts a bibliographic database to a BibTeX-formatted string.
Parameters: bib_database (BibDatabase) – bibliographic database to be converted to a BibTeX string
Returns: BibTeX-formatted string
Return type: str or unicode