@@ -7,7 +7,7 @@ Lark focuses on simplicity, power, and speed. It lets you choose between two par
  - Earley : Parses all context-free grammars (even ambiguous ones)! It is the default.
  - LALR(1): Only LR grammars. Outperforms PLY and most (if not all) other pure-python parsing libraries.
-Both algorithms are written in Python and can be used interchangably with the same grammar (aside for algorithmic restrictions). See "Comparison to other parsers" for more details.
+Both algorithms are written in Python and can be used interchangeably with the same grammar (aside for algorithmic restrictions). See "Comparison to other parsers" for more details.
 Lark can automagically build an AST from your grammar, without any more code on your part.
@@ -108,7 +108,7 @@ These features are planned to be implemented in the near future:
 - Standard library of tokens (string, int, name, etc.)
 - Contextual lexing for LALR (already working, needs some finishing touches)
-- Parser generator - create a small parser, indepdendent of Lark, to embed in your project.
+- Parser generator - create a small parser, independent of Lark, to embed in your project.
 - Grammar composition (in cases that the tokens can reliably signify a grammar change)
 - Optimizations in both the parsers and the lexer
 - Better handling of ambiguity
@@ -76,7 +76,7 @@ Notice that WS, which matches whitespace, gets flagged with "ignore". This tells
 Once we have our grammar, creating the parser is very simple.
-We simply instanciate Lark, and tell it to accept a "value":
+We simply instantiate Lark, and tell it to accept a "value":
 ```python
 from lark import Lark
@@ -272,7 +272,7 @@ Now, of course there are JSON libraries for Python written in C, and we can neve
 The first step for optimizing is to have a benchmark. For this benchmark I'm going to take data from [json-generator.com/](http://www.json-generator.com/). I took their default suggestion and changed it to 5000 objects. The result is a 6.6MB sparse JSON file.
-Our first program is going to be just a concatanation of everything we've done so far:
+Our first program is going to be just a concatenation of everything we've done so far:
 ```python
 import sys
@@ -348,7 +348,7 @@ json_parser = Lark(json_grammar, start='value', parser='lalr')
 user 0m7.504s
 sys 0m0.175s
-Ah, that's much better. The resulting JSON is of course exactly the same. You can run it for yourself an see.
+Ah, that's much better. The resulting JSON is of course exactly the same. You can run it for yourself and see.
 It's important to note that not all grammars are LR-compatible, and so you can't always switch to LALR(1). But there's no harm in trying! If Lark lets you build the grammar, it means you're good to go.
@@ -117,7 +117,7 @@ Lark will parse "(hello world)" as:
 "world"
-b. Rules that recieve a question mark (?) at the beginning of their definition, will be inlined if they have a single child.
+b. Rules that receive a question mark (?) at the beginning of their definition, will be inlined if they have a single child.
 Example:
@@ -157,7 +157,7 @@ When initializing the Lark object, you can provide it with keyword options:
 - transformer - Applies the transformer to every parse tree (only allowed with parser="lalr")
 - only\_lex - Don't build a parser. Useful for debugging (default: False)
 - postlex - Lexer post-processing (Default: None)
-- profile - Measure run-time usage in Lark. Read results from the profiler proprety (Default: False)
+- profile - Measure run-time usage in Lark. Read results from the profiler property (Default: False)
 To be supported: