Contents
- bibtexparser: API
  - bibtexparser: Parsing and writing BibTeX files
  - bibtexparser.bibdatabase: The bibliographic database object
  - bibtexparser.bparser: Tune the default parser
  - bibtexparser.customization: Functions to customize records
  - bibtexparser.bwriter: Tune the default writer
  - bibtexparser.bibtexexpression: Parser’s core relying on pyparsing
bibtexparser: API
bibtexparser: Parsing and writing BibTeX files
BibTeX is a bibliographic data file format. The bibtexparser module can parse BibTeX files and write them. The API is similar to that of the json module. The parsed data is returned as a simple BibDatabase object whose main attribute, entries, represents bibliographic sources such as books and journal articles.
The following functions provide a quick and basic way to manipulate a BibTeX file. More advanced features are also available in this module.
Parsing a file is as simple as:
import bibtexparser

with open('bibtex.bib') as bibtex_file:
    bibtex_database = bibtexparser.load(bibtex_file)
And writing:
import bibtexparser

with open('bibtex.bib', 'w') as bibtex_file:
    bibtexparser.dump(bibtex_database, bibtex_file)
-
bibtexparser.load(bibtex_file, parser=None)
Load BibDatabase object from a file.
Parameters:
- bibtex_file (file) – input file to be parsed
- parser (BibTexParser) – custom parser to use (optional)
Returns: bibliographic database object
Return type: BibDatabase
Example:
import bibtexparser

with open('bibtex.bib') as bibtex_file:
    bibtex_database = bibtexparser.load(bibtex_file)
-
bibtexparser.loads(bibtex_str, parser=None)
Load BibDatabase object from a string.
Parameters:
- bibtex_str (str or unicode) – input BibTeX string to be parsed
- parser (BibTexParser) – custom parser to use (optional)
Returns: bibliographic database object
Return type: BibDatabase
-
bibtexparser.dumps(bib_database, writer=None)
Dump BibDatabase object to a BibTeX string.
Parameters:
- bib_database (BibDatabase) – bibliographic database object
- writer (BibTexWriter) – custom writer to use (optional) (not yet implemented)
Returns: BibTeX string
Return type: unicode
-
bibtexparser.dump(bib_database, bibtex_file, writer=None)
Dump BibDatabase object as a BibTeX text file.
Parameters:
- bib_database (BibDatabase) – bibliographic database object
- bibtex_file (file) – file to write to
- writer (BibTexWriter) – custom writer to use (optional) (not yet implemented)
Example:
import bibtexparser

with open('bibtex.bib', 'w') as bibtex_file:
    bibtexparser.dump(bibtex_database, bibtex_file)
bibtexparser.bibdatabase: The bibliographic database object
-
class bibtexparser.bibdatabase.BibDatabase
Bibliographic database object that follows the data structure of a BibTeX file.
-
comments = None
List of BibTeX comment (@comment{…}) blocks.
-
entries = None
List of BibTeX entries, for example @book{…}, @article{…}, etc. Each entry is a simple dict with BibTeX field-value pairs, for example 'author': 'Bird, R.B. and Armstrong, R.C. and Hassager, O.'. Each entry will always have the following dict keys (in addition to other BibTeX fields):
- ID (BibTeX key)
- ENTRYTYPE (entry type in lowercase, e.g. book, article etc.)
-
entries_dict
Return a dictionary of BibTeX entries. The dict key is the BibTeX entry key.
-
preambles = None
List of BibTeX preamble (@preamble{…}) blocks.
-
strings = None
OrderedDict of BibTeX string definitions (@string{…}), in order of definition.
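The shape of entries described above can be sketched without parsing anything. The list below is a hand-written stand-in for BibDatabase.entries (the keys bird_example and doe_example and the second author are made up for illustration); it shows the two guaranteed keys and how a lookup by BibTeX key, like the one entries_dict is described as returning, relates to the list.

```python
# Hand-written stand-in for BibDatabase.entries; keys and values are
# illustrative, not output from a real parse.
entries = [
    {
        "ID": "bird_example",
        "ENTRYTYPE": "book",
        "author": "Bird, R.B. and Armstrong, R.C. and Hassager, O.",
    },
    {
        "ID": "doe_example",
        "ENTRYTYPE": "article",
        "author": "Doe, J.",
    },
]

# Index the entries by BibTeX key, the mapping that entries_dict is
# described as returning.
by_key = {entry["ID"]: entry for entry in entries}

# Filter on the lowercase entry type stored under ENTRYTYPE.
book_keys = [entry["ID"] for entry in entries if entry["ENTRYTYPE"] == "book"]

print(by_key["doe_example"]["author"])
print(book_keys)
```

Because each entry is a plain dict, ordinary dict and list operations are enough for most filtering and lookup tasks.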
bibtexparser.bparser: Tune the default parser
-
class bibtexparser.bparser.BibTexParser(data=None, customization=None, ignore_nonstandard_types=True, homogenize_fields=False, interpolate_strings=True, common_strings=False, add_missing_from_crossref=False)
A parser for reading BibTeX bibliographic data files.
Example:
import bibtexparser
from bibtexparser.bparser import BibTexParser

bibtex_str = ...
parser = BibTexParser()
parser.ignore_nonstandard_types = False
parser.homogenize_fields = False
parser.common_strings = False
bib_database = bibtexparser.loads(bibtex_str, parser)
Parameters:
- customization – function or None (default). Customization to apply to parsed entries.
- ignore_nonstandard_types – bool (default True). If True, ignores non-standard BibTeX entry types.
- homogenize_fields – bool (default False). Common field name replacements (as set in the alt_dict attribute).
- interpolate_strings – bool (default True). If True, replace BibTeX strings with their values; otherwise use BibDataString objects.
- common_strings – bool (default False). Add common string definitions (e.g. month abbreviations) to the BibTeX file.
- add_missing_from_crossref – bool (default False). Resolve BibTeX references set in the crossref field for BibTeX entries and add the fields from the referenced entry to the referencing entry.
-
common_strings = None
Load common strings such as month abbreviations. Default: False.
-
customization = None
Callback function to process BibTeX entries after parsing, for example to create a list from a string with multiple values. By default all BibTeX values are treated as simple strings. Default: None.
-
homogenize_fields = None
Sanitize BibTeX field names, for example change url to link etc. Field names are always converted to lowercase names. Default: False.
-
ignore_nonstandard_types = None
Ignore entries whose type is not a standard BibTeX type (standard types include book, article, etc.). Default: True.
-
interpolate_strings = None
Interpolate BibTeX strings or keep the structure. Default: True.
-
parse(bibtex_str, partial=False)
Parse a BibTeX string into an object.
Parameters:
- bibtex_str (str or unicode) – BibTeX string
- partial (boolean) – If True, print errors only on parsing failures. If False, an exception is raised.
Returns: bibliographic database
Return type: BibDatabase
bibtexparser.customization: Functions to customize records
A set of functions useful for customizing BibTeX fields. You can draw inspiration from these functions to design your own. Unless its signature says otherwise, each of them takes a record and returns the modified record.
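As a minimal sketch of the record-in, record-out contract, here is a hypothetical customization function (collapse_title_whitespace is not part of the library) that tidies the title field. It assumes only that a record is a plain dict, as described for entries above.

```python
def collapse_title_whitespace(record):
    """Hypothetical customization: collapse runs of whitespace in the title."""
    if "title" in record:
        # str.split() with no argument splits on any whitespace, including newlines.
        record["title"] = " ".join(record["title"].split())
    return record


# A record is a plain dict, so the function can be tried without a parser.
record = {"ID": "example", "ENTRYTYPE": "article", "title": "  An   Example\n Title "}
record = collapse_title_whitespace(record)
print(record["title"])
```

Per the parser documentation above, a function of this shape is what the customization parameter or attribute of BibTexParser expects.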
-
bibtexparser.customization.splitname(name, strict_mode=True)
Break a name into its constituent parts: First, von, Last, and Jr.
Parameters:
- name (string) – a string containing a single name
- strict_mode (Boolean) – whether to use strict mode
Returns: dictionary of constituent parts
Raises: customization.InvalidName – If an invalid name is given and strict_mode = True.
In BibTeX, a name can be represented in any of three forms:
- First von Last
- von Last, First
- von Last, Jr, First
This function attempts to split a given name into its four parts. The returned dictionary has keys of first, last, von and jr. Each value is a list of the words making up that part; this may be an empty list. If the input has no non-whitespace characters, a blank dictionary is returned.
It is capable of detecting some errors with the input name. If the strict_mode parameter is True, which is the default, this results in a customization.InvalidName exception being raised. If it is False, the function continues, working around the error as best it can. The errors that can be detected are listed below along with the handling for non-strict mode:
- Name finishes with a trailing comma: delete the comma
- Too many parts (e.g., von Last, Jr, First, Error): merge extra parts into First
- Unterminated opening brace: add closing brace to end of input
- Unmatched closing brace: add opening brace at start of word
-
bibtexparser.customization.getnames(names)
Convert people's names to the form "surname, firstnames" or "surname, initials".
Parameters: names (list) – a list of names
Returns: list – correctly formatted names
Note
This function is known to be too simple to handle the complex rules properly. We would like to enhance this in forthcoming releases.
-
bibtexparser.customization.author(record)
Split author field into a list of “Name, Surname”.
Parameters: record (dict) – the record.
Returns: dict – the modified record.
-
bibtexparser.customization.editor(record)
Turn the editor field into a dict composed of the original editor name and an editor id (without comma or blank).
Parameters: record (dict) – the record.
Returns: dict – the modified record.
-
bibtexparser.customization.journal(record)
Turn the journal field into a dict composed of the original journal name and a journal id (without comma or blank).
Parameters: record (dict) – the record.
Returns: dict – the modified record.
-
bibtexparser.customization.keyword(record, sep=', |;')
Split keyword field into a list.
Parameters:
- record (dict) – the record.
- sep – pattern used for the splitting regexp.
Returns: dict – the modified record.
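To see what the default sep pattern matches, the sketch below applies it with the standard library's re.split. It only illustrates the pattern itself; any further processing that keyword may perform (such as trimming whitespace) is not described here and is left out. The sample keyword strings are made up.

```python
import re

sep = ", |;"  # default from the signature: "comma then space" or "semicolon"

split_default = re.split(sep, "rheology, polymers;fluid dynamics")
# A bare comma with no following space does not match the default pattern.
unsplit = re.split(sep, "rheology,polymers")

print(split_default)
print(unsplit)
```

If a file separates keywords with bare commas, a pattern such as ',\s*|;' passed as sep would be one way to cover that case.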
-
bibtexparser.customization.link(record)
Parameters: record (dict) – the record.
Returns: dict – the modified record.
-
bibtexparser.customization.page_double_hyphen(record)
Separate pages by a double hyphen (--).
Parameters: record (dict) – the record.
Returns: dict – the modified record.
-
bibtexparser.customization.doi(record)
Parameters: record (dict) – the record.
Returns: dict – the modified record.
-
bibtexparser.customization.type(record)
Put the type into lower case.
Parameters: record (dict) – the record.
Returns: dict – the modified record.
-
bibtexparser.customization.convert_to_unicode(record)
Convert accents from LaTeX to Unicode style.
Parameters: record (dict) – the record.
Returns: dict – the modified record.
-
bibtexparser.customization.homogenize_latex_encoding(record)
Homogenize the LaTeX encoding style for BibTeX.
This function is experimental.
Parameters: record (dict) – the record.
Returns: dict – the modified record.
-
bibtexparser.customization.add_plaintext_fields(record)
For each field in the record, add a plain_ field containing the plaintext, stripped of braces and similar. See https://github.com/sciunto-org/python-bibtexparser/issues/116.
Parameters: record (dict) – the record.
Returns: dict – the modified record.
bibtexparser.bwriter: Tune the default writer
-
class bibtexparser.bwriter.BibTexWriter(write_common_strings=False)
Writer to convert a BibDatabase object to a string or file formatted as a BibTeX file.
Example:
import bibtexparser
from bibtexparser.bwriter import BibTexWriter

bib_database = ...
writer = BibTexWriter()
writer.contents = ['comments', 'entries']
writer.indent = ' '
writer.order_entries_by = ('ENTRYTYPE', 'author', 'year')
bibtex_str = bibtexparser.dumps(bib_database, writer)
-
add_trailing_comma = None
BibTeX syntax allows the comma to be optional at the end of the last field in an entry. Use this to enable writing this last comma in the bwriter output. Default: False.
-
comma_first = None
BibTeX syntax allows comma-first syntax (common in functional languages). Use this to enable comma-first syntax in the bwriter output.
-
common_strings = None
Whether common strings are written.
-
contents = None
List of BibTeX elements to write. Valid values are entries, comments, preambles, strings.
-
display_order = None
Tuple of fields for display order in a single BibTeX entry. Fields not listed here will be displayed alphabetically at the end. Set to '[]' for alphabetical order. Default: '[]'.
-
entry_separator = None
Character(s) for separating BibTeX entries. Default: new line.
-
indent = None
Character(s) for indenting BibTeX field-value pairs. Default: single space.
-
order_entries_by = None
Tuple of fields for ordering BibTeX entries. Set to None to disable sorting. Default: BibTeX key ('ID',).
-
write(bib_database)
Converts a bibliographic database to a BibTeX-formatted string.
Parameters: bib_database (BibDatabase) – bibliographic database to be converted to a BibTeX string
Returns: BibTeX-formatted string
Return type: str or unicode
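What ordering by a tuple of fields means can be sketched in plain Python. This is an illustration of the idea behind order_entries_by and display_order, not the writer's implementation: how the writer treats missing fields or letter case is not stated here, so this sketch simply sorts missing fields as empty strings. The entries and field names below are made up.

```python
# Stand-in entries; values are made up for illustration.
entries = [
    {"ID": "b", "ENTRYTYPE": "book", "author": "Smith", "year": "2001"},
    {"ID": "a", "ENTRYTYPE": "article", "author": "Jones", "year": "2010"},
    {"ID": "c", "ENTRYTYPE": "article", "author": "Jones", "year": "1999"},
]

order_entries_by = ("ENTRYTYPE", "author", "year")


def sort_key(entry):
    # Compare entries field by field; missing fields sort as empty strings here.
    return tuple(entry.get(field, "") for field in order_entries_by)


sorted_ids = [entry["ID"] for entry in sorted(entries, key=sort_key)]

# display_order: listed fields first, remaining fields alphabetically.
display_order = ("title", "author")
fields = ["year", "author", "publisher", "title"]
ordered_fields = [f for f in display_order if f in fields] + sorted(
    f for f in fields if f not in display_order
)

print(sorted_ids)
print(ordered_fields)
```

Entries that tie on the first field are separated by the next one, which is why the two articles by the same author end up ordered by year.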
bibtexparser.bibtexexpression: Parser’s core relying on pyparsing
-
class bibtexparser.bibtexexpression.BibtexExpression
Gives access to pyparsing expressions.
Attributes are pyparsing expressions for the following elements:
- main_expression: the bibtex file
- string_def: a string definition
- preamble_decl: a preamble declaration
- explicit_comment: an explicit comment
- entry: an entry definition
- implicit_comment: an implicit comment
-
exception ParseException(pstr, loc=0, msg=None, elem=None)
Exception thrown when parse expressions don’t match class; supported attributes by name are:
- lineno - returns the line number of the exception text
- col - returns the column number of the exception text
- line - returns the line containing the exception text
Example:
from pyparsing import ParseException, Word, nums

try:
    Word(nums).setName("integer").parseString("ABC")
except ParseException as pe:
    print(pe)
    print("column: {}".format(pe.col))
prints:
Expected integer (at char 0), (line:1, col:1)
column: 1
-
static explain(exc, depth=16)
Method to take an exception and translate the Python internal traceback into a list of the pyparsing expressions that caused the exception to be raised.
Parameters:
- exc - exception raised during parsing (need not be a ParseException, in support of Python exceptions that might be raised in a parse action)
- depth (default=16) - number of levels back in the stack trace to list expression and function names; if None, the full stack trace names will be listed; if 0, only the failing input line, marker, and exception string will be shown
Returns a multi-line string listing the ParserElements and/or function names in the exception’s stack trace.
Note: the diagnostic output will include string representations of the expressions that failed to parse. These representations will be more helpful if you use setName to give identifiable names to your expressions. Otherwise they will use the default string forms, which may be cryptic to read.
explain() is only supported under Python 3.
-
add_log_function(log_fun)
Add notice to logger on entry, comment, preamble, string definitions.
Parameters: log_fun – logger function
-
set_string_expression_parse_action(fun)
Set the parseAction for string_expression expression.
Note
See set_string_name_parse_action.
-
set_string_name_parse_action(fun)
Set the parseAction for string name expression.
Note
For some reason pyparsing duplicates the string_name expression so setting its parseAction a posteriori has no effect in the context of a string expression. This is why this function should be used instead.
-
bibtexparser.bibtexexpression.add_logger_parse_action(expr, log_func)
Register a callback on expression parsing with the adequate message.