Note: This refactor opens the door for implementing a ContextualLexer for Earley.
But unlike the existing one for LALR, it will have to be computed at runtime,
rather than ahead of time.
- Makes rule ordering the default ambiguity tie breaker.
E.g.
start: a | b
a: "A"
b: "A"
will return:
start
a
start: b | a
a: "A"
b: "A"
will return
start
b
- Replaces the ambiguity='resolve__antiscore_sum' with a separate option: 'priority'.
The priority option has 4 values: 'auto', 'none', 'normal', 'invert'.
'Auto' maps to 'Normal' for CYK and Earley and 'None' for LALR.
'None' filters your priorities and ignores them. This saves some extra tree walking on Earley.
'Normal' uses your priorities untouched, mimicing the old behaviour.
'Invert' negates your priorities, emulating the old 'resolve__antiscore_sum' behaviour.
This allows you to use priority logic even when ambiguity=='explicit', to get a better idea
of the shape of your tree; and to easily disable priorities without removing them from the
grammar for testing (or performance).
- ambiguity='explicit' now correctly returns an ambiguous tree again, as 0.6 did.
When using the same parser repeatedly for small parsers we incur
significant overhead by recreating the ForestVisitor each parser.
We can cache the Forest walker and re-use it by making it stateless.
Also, we can use slots for all of the Forest Walkers to reduce
construction delay and function call overhead.
Changed dynamic lexer behavior to only match terminals to their maximum length (i.e. greedy match), emulating the standard lexer.
The original dynamic lexer behavior, that attempts to match all appearances of a terminal, has been moved to the "dynamic_complete" lexer.
For example, when applying a terminal "a"+ to the text "aaa":
- dynamic: ["aaa"]
- dynamic_complete: ["a", "aa", "aaa"]
* All exceptions are now under exceptions.py
* UnexpectedInput is now superclass of UnexpectedToken and UnexpectedCharacters,
all of which support the get_context() and match_examples() methods.