license_expression package
Module contents
Define a mini language to parse, validate, deduplicate, simplify, normalize and compare license expressions using a boolean logic engine.
This module supports SPDX and ScanCode license expressions and also accepts other license naming conventions and license identifiers aliases to recognize and normalize licenses.
Using boolean logic, license expressions can be tested for equality, containment, equivalence and can be normalized, deduplicated or simplified.
The main entry point is the Licensing object.
- class license_expression.BaseSymbol(obj)
Bases:
Renderable, Symbol
A base class for all symbols.
- decompose()
Yield the underlying symbols of this symbol.
- exception license_expression.ExpressionError
Bases:
Exception
- class license_expression.ExpressionInfo(original_expression, normalized_expression=None, errors=None, invalid_symbols=None)
Bases:
object
The ExpressionInfo class is returned by Licensing.validate() where it stores information about a given license expression passed into Licensing.validate().
The ExpressionInfo class has the following fields:
- original_expression: str.
This is the license expression that was originally passed into Licensing.validate()
- normalized_expression: str.
If a valid license expression has been passed into validate(), the normalized license expression string is set in this field.
- errors: list
If there were errors validating a license expression, the error messages will be appended here.
- invalid_symbols: list
If the license expression passed into validate() contains invalid license keys (either unknown or not used in the right context), or its syntax is incorrect because an invalid symbol was used, those symbols will be appended here.
- exception license_expression.ExpressionParseError(token_type=None, token_string='', position=-1, error_code=0)
Bases:
ParseError,ExpressionError
- class license_expression.Keyword(value, type)
Bases:
tuple
- type
Alias for field number 1
- value
Alias for field number 0
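The "Alias for field number N" descriptions are what collections.namedtuple generates for its fields. As an illustration only, the sketch below builds a stand-in with the same field order as Keyword(value, type); the example values are placeholders, not the token types used by the library.

```python
from collections import namedtuple

# Stand-in with the same field order as the documented Keyword(value, type).
Keyword = namedtuple("Keyword", "value type")

# Placeholder values for illustration; the library defines its own token types.
kw = Keyword(value="AND", type="PLACEHOLDER_TOKEN_TYPE")

# Fields can be read by name or by position.
assert kw.value == kw[0] == "AND"
assert kw.type == kw[1] == "PLACEHOLDER_TOKEN_TYPE"

# Being a tuple, it can also be unpacked.
value, token_type = kw
```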
- class license_expression.LicenseSymbol(key, aliases=(), is_deprecated=False, is_exception=False, *args, **kwargs)
Bases:
BaseSymbol
A LicenseSymbol represents a license key or identifier as used in a license expression.
- FALSE = FALSE
- Symbol
alias of LicenseSymbol
- TRUE = TRUE
- decompose()
Return an iterable of the underlying license symbols for this symbol.
- render(template='{symbol.key}', *args, **kwargs)
Return a formatted string rendering for this expression using the template format string to render each license symbol. The variables available are symbol.key and any other attribute attached to a LicenseSymbol-like instance; a custom template can be provided to handle custom rendering such as HTML.
For symbols that hold multiple licenses (e.g. in a “XXX WITH YYY” statement) the template is applied to each symbol individually.
Note that when render() is called, the *args and **kwargs are passed down recursively to the render() method of any Renderable object.
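The default template '{symbol.key}' is an ordinary Python format string with attribute access. The sketch below shows how such a template can turn a symbol into plain text or an HTML link. It assumes the template is applied with str.format and that the object is exposed under the name symbol; FakeSymbol, render_symbol and the url attribute are hypothetical names used for illustration and are not part of the library.

```python
class FakeSymbol:
    """Hypothetical stand-in for a LicenseSymbol-like object."""

    def __init__(self, key, url):
        self.key = key
        self.url = url  # extra attribute, made available to the template


def render_symbol(symbol, template="{symbol.key}", **kwargs):
    # Assumption: the template is a str.format string that receives the
    # object as "symbol", so any attribute can be referenced in it.
    return template.format(symbol=symbol, **kwargs)


mit = FakeSymbol("mit", "https://example.org/licenses/mit")

plain = render_symbol(mit)
html = render_symbol(mit, template='<a href="{symbol.url}">{symbol.key}</a>')
```

Here plain is the bare key and html is an anchor element built from the same object.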
- classmethod symbol_like(symbol)
Return True if symbol is a symbol-like object with its essential attributes.
- class license_expression.LicenseSymbolLike(symbol_like, *args, **kwargs)
Bases:
LicenseSymbol
A LicenseSymbolLike object wraps a symbol-like object to expose its LicenseSymbol behavior.
- render(template='{symbol.key}', *args, **kwargs)
Return a formatted string rendering for this expression using the template format string to render each license symbol. The variables available are symbol.key and any other attribute attached to a LicenseSymbol-like instance; a custom template can be provided to handle custom rendering such as HTML.
For symbols that hold multiple licenses (e.g. in a “XXX WITH YYY” statement) the template is applied to each symbol individually.
Note that when render() is called, the *args and **kwargs are passed down recursively to the render() method of any Renderable object.
- class license_expression.LicenseWithExceptionSymbol(license_symbol, exception_symbol, strict=False, *args, **kwargs)
Bases:
BaseSymbol
A LicenseWithExceptionSymbol represents a license with a “WITH” keyword and a license exception such as the Classpath exception. When used in a license expression, this is treated as a single Symbol. It holds two LicenseSymbol objects: one for the left-hand side license proper and one for the right-hand side exception to the license. It deals with the specifics of resolution, validation and representation.
- FALSE = FALSE
- Symbol
alias of LicenseSymbol
- TRUE = TRUE
- decompose()
Yield the underlying symbols of this symbol.
- render(template='{symbol.key}', wrap_with_in_parens=False, *args, **kwargs)
Return a formatted “WITH” expression. If wrap_with_in_parens is True, wrap the expression in parens as in “(XXX WITH YYY)”.
- class license_expression.Licensing(symbols=(), quiet=True)
Bases:
BooleanAlgebra
Licensing defines a mini language to parse, validate and compare license expressions. This is the main entry point in this library.
Some of the features are:
- Licenses can be validated against user-provided lists of known license “symbols” (such as ScanCode licenses or the SPDX list).
- Expression parsing and license recognition are flexible, including for licenses with spaces, keywords (such as AND, OR, WITH) or parens in their names.
- In an expression, licenses can be more than just identifiers: they can be short or long names with spaces, symbols and even parentheses.
- A license can have multiple aliases (such as GPL-2.0, GPLv2 or GPL2) and each will be properly recognized when parsing. The expression is rendered normalized using the canonical license keys.
- Expressions can be deduplicated, simplified, normalized, sorted and compared for containment and/or logical equivalence thanks to a built-in boolean logic engine.
- Once parsed, expressions can be rendered using simple templates (for instance to render as HTML links in a web UI).
For example:
>>> l = Licensing()
>>> expr = l.parse(" GPL-2.0 or LGPL-2.1 and mit ")
>>> expected = 'GPL-2.0 OR (LGPL-2.1 AND mit)'
>>> assert expected == expr.render('{symbol.key}')

>>> expected = [
...     LicenseSymbol('GPL-2.0'),
...     LicenseSymbol('LGPL-2.1'),
...     LicenseSymbol('mit')
... ]
>>> assert expected == l.license_symbols(expr)

>>> symbols = ['GPL-2.0+', 'Classpath', 'BSD']
>>> l = Licensing(symbols)
>>> expression = 'GPL-2.0+ with Classpath or (bsd)'
>>> parsed = l.parse(expression)
>>> expected = 'GPL-2.0+ WITH Classpath OR BSD'
>>> assert expected == parsed.render('{symbol.key}')

>>> expected = [
...     LicenseSymbol('GPL-2.0+'),
...     LicenseSymbol('Classpath'),
...     LicenseSymbol('BSD')
... ]
>>> assert expected == l.license_symbols(parsed)
>>> assert expected == l.license_symbols(expression)
- advanced_tokenizer(expression)
Return an iterable of Token from an expression string.
- contains(expression1, expression2, **kwargs)
Return True if expression1 contains expression2, where each expression is either a string or a LicenseExpression object. If a string is provided, it will be parsed and simplified.
Extra kwargs are passed down to the parse() function.
- dedup(expression)
Return a deduplicated LicenseExpression given a license expression string or LicenseExpression object.
The deduplication process is similar to simplification but is specialized for working with license expressions. Simplification is otherwise a generic boolean operation that is not aware of the specifics of license expressions.
The deduplication:
- Does not sort the licenses of sub-expressions in an expression. They stay in the same order as in the original expression.
- Keeps choices (as in “MIT or GPL”) as-is and does not treat them as simplifiable. This avoids dropping important choice options in complex expressions, which is never desirable.
- get_advanced_tokenizer()
Return an AdvancedTokenizer instance for this Licensing either cached or created as needed.
If symbols were provided when this Licensing object was created, the tokenizer will recognize known symbol keys and aliases (ignoring case) when tokenizing expressions.
A license symbol is any string separated by keywords and parens (and it can include spaces).
- is_equivalent(expression1, expression2, **kwargs)
Return True if both expression1 and expression2 LicenseExpression objects are equivalent. If a string is provided, it will be parsed and simplified. Extra kwargs are passed down to the parse() function. Raise ExpressionError on parse errors.
- license_keys(expression, unique=True, **kwargs)
Return a list of license keys used in an expression in the same order as they first appear in the expression. expression is either a string or a LicenseExpression object.
If unique is True only return unique keys. Extra kwargs are passed down to the parse() function.
For example:
>>> l = Licensing()
>>> expr = ' GPL-2.0 and mit+ with blabla and mit or LGPL-2.1 and mit and mit+ with GPL-2.0'
>>> expected = ['GPL-2.0', 'mit+', 'blabla', 'mit', 'LGPL-2.1']
>>> assert expected == l.license_keys(l.parse(expr))
- license_symbols(expression, unique=True, decompose=True, **kwargs)
Return a list of LicenseSymbol objects used in an expression in the same order as they first appear in the expression tree.
expression is either a string or a LicenseExpression object. If a string is provided, it will be parsed.
If unique is True only return unique symbols.
If decompose is True then composite LicenseWithExceptionSymbol instances are not returned directly; instead their underlying license and exception symbols are returned.
Extra kwargs are passed down to the parse() function.
For example:
>>> l = Licensing()
>>> expected = [
...     LicenseSymbol('GPL-2.0'),
...     LicenseSymbol('LGPL-2.1+')
... ]
>>> result = l.license_symbols(l.parse('GPL-2.0 or LGPL-2.1+'))
>>> assert expected == result
- parse(expression, validate=False, strict=False, simple=False, **kwargs)
Return a new LicenseExpression object by parsing a license expression. Check that the expression syntax is valid and raise an ExpressionError or an ExpressionParseError on errors.
Return None for empty expressions. expression is either a string or a LicenseExpression object. If expression is a LicenseExpression it is returned as-is.
Symbols are always recognized from known Licensing symbols if symbols were provided at Licensing creation time: each license and exception is recognized from known license keys (and from aliases for a symbol if available).
If validate is True and a license is unknown, an ExpressionError is raised with a message listing the unknown license keys.
If validate is False, no error is raised if the expression syntax is correct. You can then call the unknown_license_keys() or unknown_license_symbols() methods to get unknown license keys or symbols found in the parsed LicenseExpression.
If strict is True, an ExpressionError is raised if, in a “WITH” expression such as “XXX with ZZZ”, the XXX symbol has is_exception set to True or the ZZZ symbol has is_exception set to False. This checks that symbols are used strictly as intended in a “WITH” subexpression, with a license on the left and an exception on the right.
If simple is True, parsing will use a simple tokenizer that assumes that license symbols are all license keys and do not contain spaces.
For example:
>>> expression = 'EPL-1.0 and Apache-1.1 OR GPL-2.0 with Classpath-exception'
>>> parsed = Licensing().parse(expression)
>>> expected = '(EPL-1.0 AND Apache-1.1) OR GPL-2.0 WITH Classpath-exception'
>>> assert expected == parsed.render(template='{symbol.key}')
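The strict rule for “WITH” subexpressions can be pictured as a check on the is_exception flag of each side. The following is a minimal sketch of that rule only, not the library's implementation: Sym, strict_with_errors and the error messages are hypothetical names and strings used for illustration.

```python
from dataclasses import dataclass


@dataclass
class Sym:
    """Hypothetical stand-in for a LicenseSymbol with an is_exception flag."""
    key: str
    is_exception: bool = False


def strict_with_errors(license_symbol, exception_symbol):
    """Return a list of problems found in a "license WITH exception" pair."""
    errors = []
    if license_symbol.is_exception:
        # The left-hand side must be a license proper.
        errors.append(f"{license_symbol.key} is an exception used as a license")
    if not exception_symbol.is_exception:
        # The right-hand side must be a license exception.
        errors.append(f"{exception_symbol.key} is a license used as an exception")
    return errors


gpl = Sym("GPL-2.0")
classpath = Sym("Classpath", is_exception=True)

ok = strict_with_errors(gpl, classpath)       # license WITH exception
swapped = strict_with_errors(classpath, gpl)  # exception WITH license
```

In this sketch the well-formed pair yields no errors, while the swapped pair yields one error per side.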
- primary_license_key(expression, **kwargs)
Return the left-most license key of an expression or None. The underlying symbols are decomposed. expression is either a string or a LicenseExpression object.
Extra kwargs are passed down to the parse() function.
- primary_license_symbol(expression, decompose=True, **kwargs)
Return the left-most license symbol of an expression or None. expression is either a string or a LicenseExpression object.
If decompose is True, only the left-hand license symbol of a decomposed LicenseWithExceptionSymbol symbol will be returned if this is the left-most member. Otherwise a composite LicenseWithExceptionSymbol is returned in this case.
Extra kwargs are passed down to the parse() function.
- simple_tokenizer(expression)
Return an iterable of Token from an expression string.
The split is done on spaces, keywords and parens. Anything else is a symbol token, typically a license key or license id (that contains no spaces or parens).
If symbols were provided when this Licensing object was created, the tokenizer will recognize known symbol keys (ignoring case) when tokenizing expressions.
- tokenize(expression, strict=False, simple=False)
Return an iterable of 3-tuples describing each token given an expression string. See boolean.BooleanAlgebra.tokenize() for API details.
This 3-tuple contains these items: (token, token string, position):
- token: either a Symbol instance or one of the TOKEN_* token types.
- token string: the original token string.
- position: the starting index of the token string in the expression string.
If strict is True, additional exceptions will be raised in an expression such as “XXX with ZZZ” if the XXX symbol has is_exception set to True or the ZZZ symbol has is_exception set to False.
If simple is True, use a simple tokenizer that assumes that license symbols are all license keys that do not contain spaces.
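To make the (token, token string, position) shape concrete, here is a minimal sketch of a simple tokenizer that splits on spaces, keywords and parens. It is not the library's tokenizer: simple_tokens is a hypothetical function, the token type names are placeholder strings rather than the real TOKEN_* constants, and symbols are reported as the string "SYMBOL" instead of Symbol instances.

```python
import re

# Placeholder token type names; the library uses its own TOKEN_* constants.
KEYWORDS = {
    "and": "TOKEN_AND",
    "or": "TOKEN_OR",
    "with": "TOKEN_WITH",
    "(": "TOKEN_LPAR",
    ")": "TOKEN_RPAR",
}

# A token is either a single paren or a run of characters that are
# neither whitespace nor parens.
_splitter = re.compile(r"[()]|[^\s()]+")


def simple_tokens(expression):
    """Yield (token, token string, position) 3-tuples for an expression."""
    for match in _splitter.finditer(expression):
        text = match.group()
        # Keywords are matched ignoring case; anything else is a symbol.
        token = KEYWORDS.get(text.lower(), "SYMBOL")
        yield token, text, match.start()


tokens = list(simple_tokens("mit OR (gpl-2.0 with classpath)"))
```

Each tuple keeps the original token string and its starting index, which is what allows error messages to point at the offending position in the expression.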
- unknown_license_keys(expression, unique=True, **kwargs)
Return a list of unknown license keys used in an expression in the same order as they first appear in the expression. expression is either a string or a LicenseExpression object. If a string is provided, it will be parsed.
If unique is True only return unique keys. Extra kwargs are passed down to the parse() function.
- unknown_license_symbols(expression, unique=True, **kwargs)
Return a list of unknown license symbols used in an expression in the same order as they first appear in the expression. expression is either a string or a LicenseExpression object.
If unique is True only return unique symbols. Extra kwargs are passed down to the parse() function.
- validate(expression, strict=True, **kwargs)
Return an ExpressionInfo object that contains information about the validation of an expression license expression string.
If the syntax and license keys of expression are valid, then ExpressionInfo.normalized_expression is set.
If an error was encountered when validating expression, ExpressionInfo.errors will be populated with strings containing the error messages that occurred. If an error occurred due to unknown license keys or an invalid license symbol, the offending keys or symbols will be present in ExpressionInfo.invalid_symbols.
If strict is True, validation error messages will be included if, in a “WITH” expression such as “XXX with ZZZ”, the XXX symbol has is_exception set to True or the ZZZ symbol has is_exception set to False. This checks that exception symbols are used strictly as intended on the right side of a “WITH” statement.
- validate_license_keys(expression)
- class license_expression.Renderable
Bases:
object
An interface for renderable objects.
- render(template='{symbol.key}', *args, **kwargs)
Return a formatted string rendering for this expression using the template format string to render each license symbol. The variables available are symbol.key and any other attribute attached to a LicenseSymbol-like instance; a custom template can be provided to handle custom rendering such as HTML.
For symbols that hold multiple licenses (e.g. in a “XXX WITH YYY” statement) the template is applied to each symbol individually.
Note that when render() is called, the *args and **kwargs are passed down recursively to the render() method of any Renderable object.
- render_as_readable(template='{symbol.key}', *args, **kwargs)
Return a formatted string rendering for this expression using the template format string to render each symbol. Add extra parentheses around “WITH” sub-expressions such as in “(XXX WITH YYY)” for improved readability. See render() for other arguments.
- class license_expression.RenderableFunction
Bases:
Renderable
- render(template='{symbol.key}', *args, **kwargs)
Render an expression as a string, recursively applying the string template to every symbol and operator.
- license_expression.as_symbols(symbols)
Return an iterable of LicenseSymbol objects from a symbols sequence of strings or LicenseSymbol-like objects.
If an item is a string, then create a new LicenseSymbol for it using the string as key. If it is not a string, it must be a LicenseSymbol-like type. Raise a TypeError exception if an item is neither a string nor LicenseSymbol-like.
- license_expression.build_licensing(license_index)
Return a Licensing object that has been loaded with license keys and attributes from a license_index list of simple ScanCode license mappings.
- license_expression.build_spdx_licensing(license_index)
Return a Licensing object that has been loaded with license keys and attributes from a license_index list of simple SPDX license mappings.
- license_expression.build_symbols_from_unknown_tokens(tokens)
Yield Token given a tokens sequence of Token, replacing unmatched contiguous tokens by a single token with a LicenseSymbol.
- license_expression.build_token_groups_for_with_subexpression(tokens)
Yield tuples of Token given a tokens sequence of Token such that:
- all “XXX WITH YYY” sequences of 3 tokens are grouped in a three-tuple
- single tokens are just wrapped in a tuple for consistency.
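The grouping rule can be sketched with plain strings standing in for Token objects. This is an illustration of the rule as described, not the library's code: group_with_subexpressions is a hypothetical name, and treating any three-token window whose middle item is "WITH" as a group is a simplifying assumption.

```python
def group_with_subexpressions(tokens):
    """Yield tuples of tokens, grouping "XXX WITH YYY" runs into 3-tuples."""
    tokens = list(tokens)
    i = 0
    while i < len(tokens):
        triple = tokens[i:i + 3]
        if len(triple) == 3 and triple[1].upper() == "WITH":
            # A license, the WITH keyword and an exception: keep them together.
            yield tuple(triple)
            i += 3
        else:
            # Any other token is wrapped in a 1-tuple for consistency.
            yield (tokens[i],)
            i += 1


groups = list(group_with_subexpressions(["gpl", "WITH", "classpath", "OR", "mit"]))
```

Wrapping single tokens in a tuple lets downstream code treat every item the same way and branch only on the tuple length.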
- license_expression.combine_expressions(expressions, relation='AND', unique=True, licensing=<license_expression.Licensing object>)
Return a combined LicenseExpression object with the relation, given a list of license expressions strings or LicenseExpression objects. If unique is True remove duplicates before combining expressions.
For example:
>>> a = 'mit'
>>> b = 'gpl'
>>> str(combine_expressions([a, b]))
'mit AND gpl'
>>> assert 'mit' == str(combine_expressions([a]))
>>> combine_expressions([])
>>> combine_expressions(None)
>>> str(combine_expressions(('gpl', 'mit', 'apache',)))
'gpl AND mit AND apache'
>>> str(combine_expressions(('gpl', 'mit', 'apache',), relation='OR'))
'gpl OR mit OR apache'
>>> str(combine_expressions(('gpl', 'mit', 'mit',)))
'gpl AND mit'
>>> str(combine_expressions(('mit WITH foo', 'gpl', 'mit',)))
'mit WITH foo AND gpl AND mit'
>>> str(combine_expressions(('gpl', 'mit', 'mit',), relation='OR', unique=False))
'gpl OR mit OR mit'
>>> str(combine_expressions(('mit', 'gpl', 'mit',)))
'mit AND gpl'
- license_expression.get_license_index(license_index_location='/builddir/build/BUILD/python-license-expression-30.4.4-build/license-expression-30.4.4/src/license_expression/data/scancode-licensedb-index.json')
Return a list of mappings that contain license key information from license_index_location.
The default value of license_index_location points to a vendored copy of the license index from https://scancode-licensedb.aboutcode.org/
- license_expression.get_scancode_licensing(license_index_location='/builddir/build/BUILD/python-license-expression-30.4.4-build/license-expression-30.4.4/src/license_expression/data/scancode-licensedb-index.json')
Return a Licensing object using ScanCode license keys loaded from a license_index_location location of a license db JSON index file. See https://scancode-licensedb.aboutcode.org/index.json
- license_expression.get_spdx_licensing(license_index_location='/builddir/build/BUILD/python-license-expression-30.4.4-build/license-expression-30.4.4/src/license_expression/data/scancode-licensedb-index.json')
Return a Licensing object using SPDX license keys loaded from a license_index_location location of a license db JSON index file. See https://scancode-licensedb.aboutcode.org/index.json
- license_expression.is_valid_license_key(string, pos=0, endpos=9223372036854775807)
Matches zero or more characters at the beginning of the string.
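The signature (string, pos=0, endpos=...) and the description above are those of the match method of a compiled regular expression, which suggests that this name is bound to such a method. The sketch below shows the idiom under that assumption. The pattern itself is hypothetical and chosen only for illustration; it is not confirmed by this page.

```python
import re

# Hypothetical pattern: word characters plus a few punctuation characters,
# anchored at both ends so that the whole string must match.
is_valid_key = re.compile(r"^[-:\w+.]+$").match

good = is_valid_key("GPL-2.0+")   # a re.Match object, which is truthy
bad = is_valid_key("not a key")   # None, because spaces are not allowed
```

With this idiom the return value is a match object or None rather than True or False, so it is best used in a boolean context.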
- license_expression.is_with_subexpression(tokens_tripple)
Return True if a tokens_tripple Token triple is a “WITH” license sub-expression.
- license_expression.load_licensing_from_license_index(license_index)
Return a Licensing object that has been loaded with license keys and attributes from a license_index list of license mappings.
- license_expression.ordered_unique(seq)
Return unique items in a sequence seq, preserving their original order.
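An order-preserving deduplication of this kind can be sketched in a few lines. The function below is one possible way to write such a helper, not necessarily how the library does it; ordered_unique_sketch is a hypothetical name. It uses list membership rather than a set so that unhashable items also work, at the cost of quadratic time on long sequences.

```python
def ordered_unique_sketch(seq):
    """Return the unique items of seq, keeping the first occurrence of each."""
    uniques = []
    for item in seq or []:
        if item not in uniques:
            uniques.append(item)
    return uniques


keys = ordered_unique_sketch(["mit", "gpl", "mit"])
```

The first occurrence wins, so the relative order of the remaining items is unchanged.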
- license_expression.replace_with_subexpression_by_license_symbol(tokens, strict=False)
Given a tokens iterable of Token, yield updated Token(s) replacing any “XXX WITH ZZZ” subexpression by a LicenseWithExceptionSymbol symbol.
Check validity of WITH subexpressions and raise ParseError on errors.
If strict is True also raise ParseError if the left-hand side LicenseSymbol has is_exception True or if the right-hand side LicenseSymbol has is_exception False.
- license_expression.validate_symbols(symbols, validate_keys=False)
Return a tuple of (warnings, errors) given a sequence of symbols LicenseSymbol-like objects.
warnings is a list of validation warning messages (possibly empty if there were no warnings).
errors is a list of validation error messages (possibly empty if there were no errors).
Keys and aliases are cleaned and validated for uniqueness.
If validate_keys is True, also validate that license keys are known keys.