night199uk
bb22c84df3
Speed up repetitive parsing using the same parser
When using the same parser repeatedly for small parses, we incur
significant overhead by recreating the ForestVisitor on each parse.
We can cache the Forest walker and re-use it by making it stateless.
Also, we can use slots for all of the Forest Walkers to reduce
construction delay and function call overhead.
6 years ago
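
The idea in the commit above (a stateless walker that is built once and reused, with `__slots__` on the walker classes) can be sketched roughly as follows. This is only an illustration under assumptions: `Node`, `StatelessSumVisitor` and `CachingParser` are hypothetical names and are not Lark's actual classes.

```python
class Node:
    """Hypothetical forest node; not Lark's actual SPPF node class."""
    __slots__ = ("value", "children")

    def __init__(self, value, children=()):
        self.value = value
        self.children = tuple(children)


class StatelessSumVisitor:
    """Keeps no per-parse state on the instance, so one instance can be reused."""
    __slots__ = ()  # no per-instance __dict__ is allocated

    def visit(self, root):
        # All working state lives in local variables, not on self.
        total = 0
        stack = [root]
        while stack:
            node = stack.pop()
            total += node.value
            stack.extend(node.children)
        return total


class CachingParser:
    """Hypothetical stand-in for a parser that owns one cached visitor."""
    __slots__ = ("_visitor",)

    def __init__(self):
        # Built once here, then reused by every parse() call.
        self._visitor = StatelessSumVisitor()

    def parse(self, forest):
        return self._visitor.visit(forest)


forest = Node(1, [Node(2), Node(3, [Node(4)])])
parser = CachingParser()
results = [parser.parse(forest) for _ in range(3)]
```

Because the visitor stores nothing between calls, repeated parses give the same answer without constructing a new visitor each time.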
Erez Shinan
b2489e13e2
A few fixes (tests + interface)
6 years ago
Erez Shinan
0077366255
Earley now ignores infinite recursion
6 years ago
Erez Shinan
1798971455
Adjustments
6 years ago
Erez Shinan
6c8ba76b79
Fixed a deep bug in grammar analysis involving empty rules (Issue #250)
6 years ago
night199uk
1d5fd7301a
Heavy modifications to the Earley parser to try and make it handle more
types of ambiguity.
- Rewritten along the lines of Elizabeth Scott's parser.
https://www.sciencedirect.com/science/article/pii/S1571066108001497
- Implement SPPF trees per Elizabeth Scott and Bram van der Sanden's work.
http://www.bramvandersanden.com/post/2014/06/shared-packed-parse-forest/
6 years ago
Erez Shinan
242ac24ea6
Fixed the propagate_positions implementation, and added start_pos/end_pos attributes to Tree.Meta
Related to issue #216
6 years ago
Erez Shinan
181f061091
BUGFIX - Fixed 2 issues with line counting
1) Failed to detect newlines in regexps of the form [^...]
2) Last token didn't get end_line & end_column
6 years ago
Erez Shinan
32b78b8ee5
BUGFIX: Repeated use of optional rules tripped up the simplifier, manifesting when aliases were used (Issue #197)
6 years ago
Erez Shinan
454c88b58a
Refactoring and fixes for merge 2fd0087
6 years ago
Julien Malard
a03e01bc12
Fixed test.
6 years ago
Julien Malard
12004b3c65
Reimplemented relative and multiple imports.
6 years ago
Julien Malard
405f6a399d
From and relative type imports seem to work.
6 years ago
Erez Shinan
6ea4588bcf
Dynamic lexer now returns the maximum match only. Complete lexing behavior moved to "dynamic_complete"
Changed the dynamic lexer behavior to match terminals only at their maximum length (i.e. greedy match), emulating the standard lexer.
The original dynamic lexer behavior, which attempts to match all appearances of a terminal, has been moved to the "dynamic_complete" lexer.
For example, when applying a terminal "a"+ to the text "aaa":
- dynamic: ["aaa"]
- dynamic_complete: ["a", "aa", "aaa"]
6 years ago
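
The difference between the two behaviors described in the commit above can be illustrated with a small sketch built on Python's `re` module. The helper names `longest_match` and `all_matches` are hypothetical, and this is not how Lark implements its lexers; it only reproduces the `"a"+` on `"aaa"` example.

```python
import re


def longest_match(pattern, text, pos=0):
    """Greedy behavior: return only the single match that re finds at pos.

    For a simple greedy pattern such as a+ this is the maximum-length match.
    """
    m = re.compile(pattern).match(text, pos)
    return [m.group(0)] if m else []


def all_matches(pattern, text, pos=0):
    """Complete behavior: return every non-empty substring starting at pos
    that the pattern matches in full."""
    regex = re.compile(pattern)
    return [
        text[pos:end]
        for end in range(pos + 1, len(text) + 1)
        if regex.fullmatch(text, pos, end)
    ]


dynamic = longest_match(r"a+", "aaa")
dynamic_complete = all_matches(r"a+", "aaa")
```

Here `dynamic` holds one candidate and `dynamic_complete` holds three, which is why the complete mode gives the parser more options to explore on ambiguous input.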
Erez Shinan
5c6df8e825
Moved and restructured exceptions
* All exceptions are now under exceptions.py
* UnexpectedInput is now superclass of UnexpectedToken and UnexpectedCharacters,
all of which support the get_context() and match_examples() methods.
6 years ago
Erez Shinan
6bfc27c11d
New transformers near completion
Nearley tool still needs fixing
6 years ago
Erez Shinan
5e546f38a9
args decorators actually work now
6 years ago
Erez Shinan
9daacb9082
Refactored transformers, better code
6 years ago
Erez Shinan
2b4ef11ebf
Columns now start at 1
6 years ago
Erez Shinan
7b32ffd83a
Fixed token visibility rules (Issue #109)
Anonymous tokens would become visible if they had the same value as named tokens.
That's because they are merged for the lexer. But after this change, the rules for
visibility are based on their use in the rule, and not their name or identity.
6 years ago
Erez Shinan
33caa391d5
Breaking backwards compatibility:
* Removed the scanless parsing feature (dynamic lexing is king)
* Default LALR lexer is now contextual
6 years ago
Erez Shinan
0f0776c0fa
BUGFIX in lexer: Embedding strings overwrote priority (Issue #121)
6 years ago
Erez Shinan
c3bce19dc2
More steps towards a good solution
6 years ago
Erez Shinan
f69bceb335
Snap more things into place
6 years ago
Erez Shinan
f960c1b8ac
Initial: Added transformers.py, and Meta to tree
6 years ago
Erez Shinan
4f2330fc9b
Fixed bug in Earley prioritization
6 years ago
Erez Shinan
25c3c51b1c
Fixed bug in Earley: A tree builder optimization clashed with explicit ambiguity
6 years ago
Erez Shinan
255ef0d973
Added error message for the alias syntax in terminals (Issue #97)
7 years ago
Erez Shinan
7d11dfa5cd
FEATURE: Added support for ranged-repeat for rules and terminals (Issues #75, #19)
Syntax: symbol~number
      | symbol~min..max
Example:
    HEXCOLOR: "#" (HEXDIGIT~3 | HEXDIGIT~6)
    short_sentence: word~4..20
Added range for tokens
7 years ago
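
One way to read the ranged-repeat syntax in the commit above is as shorthand for a set of alternatives, one per allowed repetition count. The sketch below spells that reading out; `expand_repeat` is a hypothetical helper and not Lark's implementation, and the regex assumes that `HEXDIGIT` means a single character in `[0-9a-fA-F]`.

```python
import re


def expand_repeat(symbol, minimum, maximum=None):
    """Expand symbol~minimum or symbol~minimum..maximum into explicit alternatives.

    Each alternative is a list holding the symbol repeated n times.
    """
    if maximum is None:
        maximum = minimum  # symbol~number means exactly that many
    if minimum < 0 or maximum < minimum:
        raise ValueError("invalid repeat range")
    return [[symbol] * n for n in range(minimum, maximum + 1)]


exact = expand_repeat("HEXDIGIT", 3)   # HEXDIGIT~3
ranged = expand_repeat("word", 2, 4)   # word~2..4

# Regex counterpart of: HEXCOLOR: "#" (HEXDIGIT~3 | HEXDIGIT~6)
HEXCOLOR = re.compile(r"#(?:[0-9a-fA-F]{3}|[0-9a-fA-F]{6})")
```

Under this reading, `HEXCOLOR` accepts `#fff` and `#a1b2c3` but rejects a four-digit value such as `#ffff`.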
Erez Shinan
22e525f53e
Fixed propagate positions. Added lexer_callbacks option.
7 years ago
Erez Shinan
748e9b7248
All relevant tests passing. Also indentation and other refactoring.
7 years ago
Erez Shinan
d173d6d66b
Validate against zero-width terminals in XEarley (Issue #63)
7 years ago
Erez Shinan
5fd331be54
BUGFIX: Internally repetitive rules are now handled silently (Issue #60)
7 years ago
Erez Shinan
38c5fd244a
Improved grammar validation and refactored the lexers
7 years ago
Erez Shinan
209ac5ab4e
BUGFIX: Mishandling of quotes (Issue #50)
7 years ago
Erez Shinan
dcb7297c30
Flags are now part of the terminal identity
7 years ago
Erez Shinan
6f85ca4294
%ignore bug fixed in xearley (thanks to issue #44)
7 years ago
Erez Shinan
34449651bf
Added UnexpectedInput exception (with line & column) to xearley (Issue #43)
7 years ago
Erez Shinan
08a8a747b8
Fixed escaping for all tests
7 years ago
Erez Shinan
593446d025
Improved Readme
7 years ago
Erez Shinan
27fb1889cf
Added test
7 years ago
Erez Shinan
816266a5eb
BUGFIX for issue #24: Dynamic Earley mishandled %ignore tokens
7 years ago
Erez Shinan
692307f683
Added the fruitflies test. Found bug in scanless reconstruction of tokens
7 years ago
Erez Shinan
e8810e3b80
Fixed some deprecation warnings due to changes in Py3.6 regexps
7 years ago
Erez Shinan
baae08e399
Fixed tree-construction semantics: Alias now overrides the "?rule" operator
Breaking change!!
7 years ago
Erez Shinan
035eea234f
BUGFIX: Tree comparison in Earley wasn't hashed, which caused a huge spike in run-time for some cases.
7 years ago
Erez Shinan
b532bf4e3c
Fixed test
7 years ago
Erez Shinan
19a9c9c206
Towards an introspectable tree-builder. Also added tests.
7 years ago
Kaspar Emanuel
ed04b22c4c
Fix UTF-8 test
7 years ago
Kaspar Emanuel
7d21c754a1
Add test for UTF-8 characters in grammar
7 years ago