Lark is a modern general-purpose parsing library for Python that focuses on simplicity and power.
Lark accepts grammars as EBNF and lets you choose between two parsing algorithms: Earley and LALR(1).
Both algorithms are pure-Python implementations and can be used interchangeably (aside from algorithmic restrictions).
Lark can automagically build an AST from your grammar, without any more code on your part.
Separates code from grammar: The result is parsers that are cleaner and easier to read & work with.
Automatically builds a tree (AST): Trees are always simpler to work with than state machines. (But if you want to provide a callback for efficiency reasons, Lark lets you do that too.)
Follows Python’s Idioms: Beautiful is better than ugly. Readability counts.
Here is a little program to parse “Hello, World!” (or any other similar phrase):
```python
from lark import Lark

l = Lark(r'''start: WORD "," WORD "!"
WORD: /\w+/
SPACE.ignore: " "
''')
print(l.parse("Hello, World!"))
```
And the output is:
```
Tree(start, [Token(WORD, Hello), Token(WORD, World)])
```
Notice punctuation doesn’t appear in the resulting tree. It’s automatically filtered away by Lark.
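To illustrate why a filtered tree is convenient to consume, here is a minimal sketch that walks a tree shaped like the one printed above and collects the words. The `Tree` and `Token` types below are hypothetical stand-ins built from `namedtuple` so the snippet runs on its own; they are not Lark's classes, and `collect_words` is a name made up for this example.

```python
from collections import namedtuple

# Hypothetical stand-ins that mirror the printed output above; not Lark's own classes.
Token = namedtuple("Token", ["type", "value"])
Tree = namedtuple("Tree", ["data", "children"])


def collect_words(tree):
    """Return the values of all WORD tokens in the tree, depth-first."""
    words = []
    for child in tree.children:
        if isinstance(child, Tree):
            # Recurse into subtrees produced by nested rules.
            words.extend(collect_words(child))
        elif child.type == "WORD":
            words.append(child.value)
    return words


tree = Tree("start", [Token("WORD", "Hello"), Token("WORD", "World")])
print(collect_words(tree))  # → ['Hello', 'World']
```

Because the punctuation never reaches the tree, the walker only has to distinguish subtrees from the tokens it cares about.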
To learn more about Lark:
These features are planned to be implemented in the near future:
Library | Algorithm | LOC | Grammar | Builds AST |
---|---|---|---|---|
Lark | Earley/LALR(1) | 0.5K | EBNF+ | Yes! |
PLY | LALR(1) | 4.6K | Yacc-like BNF | No |
PyParsing | PEG | 5.7K | Parser combinators | No |
Parsley | PEG | 3.3K | EBNF-like | No |
funcparserlib | Recursive-Descent | 0.5K | Parser combinators | No |
(LOC measures lines of code of the parsing algorithm(s), without accompanying files)
It’s hard to compare parsers with different parsing algorithms, since each algorithm has many advantages and disadvantages. However, I will try to summarize the main points here:
Lark offers both Earley and LALR(1), which means you can choose between the most powerful and the most efficient algorithms, without having to change libraries.
(* According to Wikipedia, it remains unanswered whether PEGs can really parse all deterministic CFGs)
Lark uses the GPL3 license.
If you have any questions or want to contribute, please email me at erezshin at gmail com.