
Merge pull request #1 from lark-parser/master

Merge from origin.
tags/gm/2021-09-23T00Z/github.com--lark-parser-lark/0.8.2
Yuan authored 5 years ago, committed by GitHub
parent commit c95257c906
53 changed files with 1986 additions and 763 deletions
  1. .gitignore (+1 −0)
  2. README.md (+20 −5)
  3. docs/classes.md (+27 −117)
  4. docs/grammar.md (+77 −5)
  5. docs/how_to_develop.md (+2 −2)
  6. docs/how_to_use.md (+1 −1)
  7. docs/index.md (+3 −2)
  8. docs/json_tutorial.md (+10 −5)
  9. docs/parsers.md (+3 −3)
  10. docs/recipes.md (+6 −6)
  11. docs/tree_construction.md (+24 −0)
  12. docs/visitors.md (+117 −0)
  13. examples/README.md (+1 −0)
  14. examples/json_parser.py (+11 −1)
  15. examples/lark.lark (+6 −4)
  16. examples/python3.lark (+5 −5)
  17. examples/python_bytecode.py (+77 −0)
  18. examples/reconstruct_json.py (+2 −8)
  19. examples/standalone/create_standalone.sh (+1 −0)
  20. examples/standalone/json_parser.py (+418 −197)
  21. lark/__init__.py (+1 −1)
  22. lark/common.py (+1 −0)
  23. lark/exceptions.py (+14 −4)
  24. lark/grammar.py (+4 −2)
  25. lark/lark.py (+41 −40)
  26. lark/lexer.py (+84 −50)
  27. lark/load_grammar.py (+66 −57)
  28. lark/parse_tree_builder.py (+33 −9)
  29. lark/parser_frontends.py (+41 −18)
  30. lark/parsers/cyk.py (+8 −6)
  31. lark/parsers/earley.py (+25 −14)
  32. lark/parsers/earley_forest.py (+3 −3)
  33. lark/parsers/grammar_analysis.py (+46 −14)
  34. lark/parsers/lalr_analysis.py (+199 −52)
  35. lark/parsers/lalr_parser.py (+16 −16)
  36. lark/parsers/xearley.py (+4 −3)
  37. lark/reconstruct.py (+47 −12)
  38. lark/tools/nearley.py (+2 −2)
  39. lark/tools/serialize.py (+39 −0)
  40. lark/tools/standalone.py (+3 −0)
  41. lark/tree.py (+31 −28)
  42. lark/utils.py (+28 −2)
  43. lark/visitors.py (+113 −43)
  44. mkdocs.yml (+1 −0)
  45. readthedocs.yml (+10 −0)
  46. tests/__main__.py (+2 −1)
  47. tests/grammars/test_unicode.lark (+1 −0)
  48. tests/test_nearley/test_nearley.py (+4 −1)
  49. tests/test_parser.py (+239 −6)
  50. tests/test_reconstructor.py (+2 −2)
  51. tests/test_relative_import_unicode.lark (+3 −0)
  52. tests/test_tools.py (+8 −15)
  53. tests/test_trees.py (+55 −1)

.gitignore (+1 −0)

@@ -4,6 +4,7 @@
/lark_parser.egg-info/** /lark_parser.egg-info/**
tags tags
.vscode .vscode
.idea
.ropeproject .ropeproject
.cache .cache
/dist /dist


README.md (+20 −5)

@@ -34,13 +34,16 @@ Lark has no dependencies.


[![Build Status](https://travis-ci.org/lark-parser/lark.svg?branch=master)](https://travis-ci.org/lark-parser/lark) [![Build Status](https://travis-ci.org/lark-parser/lark.svg?branch=master)](https://travis-ci.org/lark-parser/lark)


### Syntax Highlighting (new)
### Syntax Highlighting


Lark now provides syntax highlighting for its grammar files (\*.lark):
Lark provides syntax highlighting for its grammar files (\*.lark):


- [Sublime Text & TextMate](https://github.com/lark-parser/lark_syntax) - [Sublime Text & TextMate](https://github.com/lark-parser/lark_syntax)
- [vscode](https://github.com/lark-parser/vscode-lark) - [vscode](https://github.com/lark-parser/vscode-lark)


### Clones

- [Lerche (Julia)](https://github.com/jamesrhester/Lerche.jl) - an unofficial clone, written entirely in Julia.


### Hello World ### Hello World


@@ -72,7 +75,7 @@ Lark is great at handling ambiguity. Let's parse the phrase "fruit flies like ba


![fruitflies.png](examples/fruitflies.png) ![fruitflies.png](examples/fruitflies.png)


See more [examples in the wiki](https://github.com/erezsh/lark/wiki/Examples)
See more [examples here](https://github.com/lark-parser/lark/tree/master/examples)






@@ -95,7 +98,7 @@ See more [examples in the wiki](https://github.com/erezsh/lark/wiki/Examples)
- Extensive test suite [![codecov](https://codecov.io/gh/erezsh/lark/branch/master/graph/badge.svg)](https://codecov.io/gh/erezsh/lark) - Extensive test suite [![codecov](https://codecov.io/gh/erezsh/lark/branch/master/graph/badge.svg)](https://codecov.io/gh/erezsh/lark)
- And much more! - And much more!


See the full list of [features in the wiki](https://github.com/erezsh/lark/wiki/Features)
See the full list of [features here](https://lark-parser.readthedocs.io/en/latest/features/)




### Comparison to other libraries ### Comparison to other libraries
@@ -132,9 +135,21 @@ Check out the [JSON tutorial](/docs/json_tutorial.md#conclusion) for more detail


### Projects using Lark ### Projects using Lark


- [storyscript](https://github.com/storyscript/storyscript) - The programming language for Application Storytelling
- [tartiflette](https://github.com/dailymotion/tartiflette) - a GraphQL engine by Dailymotion. Lark is used to parse the GraphQL schemas definitions.
- [Hypothesis](https://github.com/HypothesisWorks/hypothesis) - Library for property-based testing
- [mappyfile](https://github.com/geographika/mappyfile) - a MapFile parser for working with MapServer configuration - [mappyfile](https://github.com/geographika/mappyfile) - a MapFile parser for working with MapServer configuration
- [synapse](https://github.com/vertexproject/synapse) - an intelligence analysis platform
- [Datacube-core](https://github.com/opendatacube/datacube-core) - Open Data Cube analyses continental scale Earth Observation data through time
- [SPFlow](https://github.com/SPFlow/SPFlow) - Library for Sum-Product Networks
- [Torchani](https://github.com/aiqm/torchani) - Accurate Neural Network Potential on PyTorch
- [Command-Block-Assembly](https://github.com/simon816/Command-Block-Assembly) - An assembly language, and C compiler, for Minecraft commands
- [Fabric-SDK-Py](https://github.com/hyperledger/fabric-sdk-py) - Hyperledger fabric SDK with Python 3.x
- [required](https://github.com/shezadkhan137/required) - multi-field validation using docstrings
- [miniwdl](https://github.com/chanzuckerberg/miniwdl) - A static analysis toolkit for the Workflow Description Language
- [pytreeview](https://gitlab.com/parmenti/pytreeview) - a lightweight tree-based grammar explorer - [pytreeview](https://gitlab.com/parmenti/pytreeview) - a lightweight tree-based grammar explorer
- [tartiflette](https://github.com/dailymotion/tartiflette) - a GraphQL engine by Dailymotion (Lark is used to parse the GraphQL schemas definitions)
- [harmalysis](https://github.com/napulen/harmalysis) - A language for harmonic analysis and music theory



Using Lark? Send me a message and I'll add your project! Using Lark? Send me a message and I'll add your project!




docs/classes.md (+27 −117)

@@ -1,42 +1,42 @@
# Classes - Reference
# Classes Reference


This page details the important classes in Lark. This page details the important classes in Lark.


---- ----


## Lark
## lark.Lark


The Lark class is the main interface for the library. It's mostly a thin wrapper for the many different parsers, and for the tree constructor. The Lark class is the main interface for the library. It's mostly a thin wrapper for the many different parsers, and for the tree constructor.


### Methods

#### \_\_init\_\_(self, grammar, **options) #### \_\_init\_\_(self, grammar, **options)


The Lark class accepts a grammar string or file object, and keyword options: The Lark class accepts a grammar string or file object, and keyword options:


* start - The symbol in the grammar that begins the parse (Default: `"start"`)
* **start** - A list of the rules in the grammar that begin the parse (Default: `["start"]`)


* parser - Decides which parser engine to use, "earley", "lalr" or "cyk". (Default: `"earley"`)
* **parser** - Decides which parser engine to use, "earley", "lalr" or "cyk". (Default: `"earley"`)


* lexer - Overrides default lexer.
* **lexer** - Overrides default lexer, depending on parser.


* transformer - Applies the transformer instead of building a parse tree (only allowed with parser="lalr")
* **transformer** - Applies the provided transformer instead of building a parse tree (only allowed with parser="lalr")


* postlex - Lexer post-processing (Default: None. only works when lexer is "standard" or "contextual")
* **postlex** - Lexer post-processing (Default: `None`. Only works when lexer is "standard" or "contextual")


* ambiguity (only relevant for earley and cyk)
* **ambiguity** (only relevant for earley and cyk)


* "explicit" - Return all derivations inside an "_ambig" data node. * "explicit" - Return all derivations inside an "_ambig" data node.


* "resolve" - Let the parser choose the best derivation (greedy for tokens, non-greedy for rules. Default) * "resolve" - Let the parser choose the best derivation (greedy for tokens, non-greedy for rules. Default)


* debug - Display warnings (such as Shift-Reduce warnings for LALR)
* **debug** - Display warnings (such as Shift-Reduce warnings for LALR)

* **keep_all_tokens** - Don't throw away any terminals from the tree (Default=`False`)


* keep_all_tokens - Don't throw away any terminals from the tree (Default=False)
* **propagate_positions** - Propagate line/column count to tree nodes, at the cost of performance (default=`False`)


* propagate_positions - Propagate line/column count to tree nodes (default=False)
* **maybe_placeholders** - When `True`, the `[]` operator returns `None` when not matched. When `False`, `[]` behaves like the `?` operator, and returns no value at all, which may be a little faster (default=`False`)


* lexer_callbacks - A dictionary of callbacks of type f(Token) -> Token, used to interface with the lexer Token generation. Only works with the standard and contextual lexers. See [Recipes](recipes.md) for more information.
* **lexer_callbacks** - A dictionary of callbacks of type f(Token) -> Token, used to interface with the lexer Token generation. Only works with the standard and contextual lexers. See [Recipes](recipes.md) for more information.


#### parse(self, text) #### parse(self, text)


@@ -50,13 +50,10 @@ If a transformer is supplied to `__init__`, returns whatever is the result of th


The main tree class The main tree class


### Properties

* `data` - The name of the rule or alias * `data` - The name of the rule or alias
* `children` - List of matched sub-rules and terminals * `children` - List of matched sub-rules and terminals
* `meta` - Line & Column numbers, if using `propagate_positions`

### Methods
* `meta` - Line & Column numbers (if `propagate_positions` is enabled)
* meta attributes: `line`, `column`, `start_pos`, `end_line`, `end_column`, `end_pos`


#### \_\_init\_\_(self, data, children) #### \_\_init\_\_(self, data, children)


@@ -92,102 +89,6 @@ Trees can be hashed and compared.


---- ----


## Transformers & Visitors

Transformers & Visitors provide a convenient interface to process the parse-trees that Lark returns.

They are used by inheriting from the correct class (visitor or transformer), and implementing methods corresponding to the rule you wish to process. Each methods accepts the children as an argument. That can be modified using the `v-args` decorator, which allows to inline the arguments (akin to `*args`), or add the tree `meta` property as an argument.

See: https://github.com/lark-parser/lark/blob/master/lark/visitors.py

### Visitors

Visitors visit each node of the tree, and run the appropriate method on it according to the node's data.

They work bottom-up, starting with the leaves and ending at the root of the tree.

**Example**
```python
class IncreaseAllNumbers(Visitor):
def number(self, tree):
assert tree.data == "number"
tree.children[0] += 1

IncreaseAllNumbers().visit(parse_tree)
```

There are two classes that implement the visitor interface:

* Visitor - Visit every node (without recursion)

* Visitor_Recursive - Visit every node using recursion. Slightly faster.

### Transformers

Transformers visit each node of the tree, and run the appropriate method on it according to the node's data.

They work bottom-up (or: depth-first), starting with the leaves and ending at the root of the tree.

Transformers can be used to implement map & reduce patterns.

Because nodes are reduced from leaf to root, at any point the callbacks may assume the children have already been transformed (if applicable).

Transformers can be chained into a new transformer by using multiplication.

**Example:**
```python
from lark import Tree, Transformer

class EvalExpressions(Transformer):
def expr(self, args):
return eval(args[0])

t = Tree('a', [Tree('expr', ['1+2'])])
print(EvalExpressions().transform( t ))

# Prints: Tree(a, [3])
```


Here are the classes that implement the transformer interface:

- Transformer - Recursively transforms the tree. This is the one you probably want.
- Transformer_InPlace - Non-recursive. Changes the tree in-place instead of returning new instances
- Transformer_InPlaceRecursive - Recursive. Changes the tree in-place instead of returning new instances

### v_args

`v_args` is a decorator.

By default, callback methods of transformers/visitors accept one argument: a list of the node's children. `v_args` can modify this behavior.

When used on a transformer/visitor class definition, it applies to all the callback methods inside it.

`v_args` accepts one of three flags:

- `inline` - Children are provided as `*args` instead of a list argument (not recommended for very long lists).
- `meta` - Provides two arguments: `children` and `meta` (instead of just the first)
- `tree` - Provides the entire tree as the argument, instead of the children.

Examples:

```python
@v_args(inline=True)
class SolveArith(Transformer):
def add(self, left, right):
return left + right


class ReverseNotation(Transformer_InPlace):
@v_args(tree=True):
def tree_node(self, tree):
tree.children = tree.children[::-1]
```

### Discard

When raising the `Discard` exception in a transformer callback, that node is discarded and won't appear in the parent.

## Token ## Token


When using a lexer, the resulting tokens in the trees will be of the Token class, which inherits from Python's string. So, normal string comparisons and operations will work as expected. Tokens also have other useful attributes: When using a lexer, the resulting tokens in the trees will be of the Token class, which inherits from Python's string. So, normal string comparisons and operations will work as expected. Tokens also have other useful attributes:
@@ -198,18 +99,27 @@ When using a lexer, the resulting tokens in the trees will be of the Token class
* `column` - The column of the token in the text (starting with 1) * `column` - The column of the token in the text (starting with 1)
* `end_line` - The line where the token ends * `end_line` - The line where the token ends
* `end_column` - The next column after the end of the token. For example, if the token is a single character with a `column` value of 4, `end_column` will be 5. * `end_column` - The next column after the end of the token. For example, if the token is a single character with a `column` value of 4, `end_column` will be 5.
* `end_pos` - the index where the token ends (basically pos_in_stream + len(token))

## Transformer
## Visitor
## Interpreter

See the [visitors page](visitors.md)




## UnexpectedInput ## UnexpectedInput


## UnexpectedToken

## UnexpectedException

- `UnexpectedInput` - `UnexpectedInput`
- `UnexpectedToken` - The parser recieved an unexpected token - `UnexpectedToken` - The parser recieved an unexpected token
- `UnexpectedCharacters` - The lexer encountered an unexpected string - `UnexpectedCharacters` - The lexer encountered an unexpected string


After catching one of these exceptions, you may call the following helper methods to create a nicer error message: After catching one of these exceptions, you may call the following helper methods to create a nicer error message:


### Methods

#### get_context(text, span) #### get_context(text, span)


Returns a pretty string pinpointing the error in the text, with `span` amount of context characters around it. Returns a pretty string pinpointing the error in the text, with `span` amount of context characters around it.
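
For reviewers who want to see these helpers in action, here is a minimal sketch (not part of this diff; the grammar and input are invented for illustration) of catching an `UnexpectedInput` subclass and calling `get_context`:

```python
from lark import Lark, UnexpectedInput

# Hypothetical toy grammar, used only to demonstrate get_context()
list_parser = Lark(r"""
    start: "[" [NUMBER ("," NUMBER)*] "]"
    NUMBER: /\d+/
    %ignore " "
""", parser='lalr')

text = "[1, 2, oops]"
try:
    list_parser.parse(text)
except UnexpectedInput as e:
    # Pinpoint the error with 20 characters of surrounding context
    print(e.get_context(text, 20))
```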


docs/grammar.md (+77 −5)

@@ -1,5 +1,13 @@
# Grammar Reference # Grammar Reference


Table of contents:

1. [Definitions](#defs)
1. [Terminals](#terms)
1. [Rules](#rules)
1. [Directives](#dirs)

<a name="defs"></a>
## Definitions ## Definitions


**A grammar** is a list of rules and terminals, that together define a language. **A grammar** is a list of rules and terminals, that together define a language.
@@ -25,6 +33,7 @@ Lark begins the parse with the rule 'start', unless specified otherwise in the o
Names of rules are always in lowercase, while names of terminals are always in uppercase. This distinction has practical effects, for the shape of the generated parse-tree, and the automatic construction of the lexer (aka tokenizer, or scanner). Names of rules are always in lowercase, while names of terminals are always in uppercase. This distinction has practical effects, for the shape of the generated parse-tree, and the automatic construction of the lexer (aka tokenizer, or scanner).




<a name="terms"></a>
## Terminals ## Terminals


Terminals are used to match text into symbols. They can be defined as a combination of literals and other terminals. Terminals are used to match text into symbols. They can be defined as a combination of literals and other terminals.
@@ -45,6 +54,16 @@ Literals can be one of:
* `/re with flags/imulx` * `/re with flags/imulx`
* Literal range: `"a".."z"`, `"1".."9"`, etc. * Literal range: `"a".."z"`, `"1".."9"`, etc.


Terminals also support grammar operators, such as `|`, `+`, `*` and `?`.

Terminals are a linear construct, and therefore may not contain themselves (recursion isn't allowed).

### Priority

Terminals can be assigned priority only when using a lexer (future versions may support Earley's dynamic lexing).

Priority can be either positive or negative. If not specified for a terminal, it's assumed to be 1 (i.e. the default).

#### Notes for when using a lexer: #### Notes for when using a lexer:


When using a lexer (standard or contextual), it is the grammar-author's responsibility to make sure the literals don't collide, or that if they do, they are matched in the desired order. Literals are matched in an order according to the following criteria: When using a lexer (standard or contextual), it is the grammar-author's responsibility to make sure the literals don't collide, or that if they do, they are matched in the desired order. Literals are matched in an order according to the following criteria:
@@ -59,11 +78,58 @@ When using a lexer (standard or contextual), it is the grammar-author's responsi
IF: "if" IF: "if"
INTEGER : /[0-9]+/ INTEGER : /[0-9]+/
INTEGER2 : ("0".."9")+ //# Same as INTEGER INTEGER2 : ("0".."9")+ //# Same as INTEGER
DECIMAL.2: INTEGER "." INTEGER //# Will be matched before INTEGER
DECIMAL.2: INTEGER? "." INTEGER //# Will be matched before INTEGER
WHITESPACE: (" " | /\t/ )+ WHITESPACE: (" " | /\t/ )+
SQL_SELECT: "select"i SQL_SELECT: "select"i
``` ```


### Regular expressions & Ambiguity

Each terminal is eventually compiled to a regular expression. All the operators and references inside it are mapped to their respective expressions.

For example, in the following grammar, `A1` and `A2`, are equivalent:
```perl
A1: "a" | "b"
A2: /a|b/
```

This means that inside terminals, Lark cannot detect or resolve ambiguity, even when using Earley.

For example, for this grammar:
```perl
start : (A | B)+
A : "a" | "ab"
B : "b"
```
We get this behavior:

```bash
>>> p.parse("ab")
Tree(start, [Token(A, 'a'), Token(B, 'b')])
```

This is happening because Python's regex engine always returns the first matching option.

If you find yourself in this situation, the recommended solution is to use rules instead.

Example:

```python
>>> p = Lark("""start: (a | b)+
... !a: "a" | "ab"
... !b: "b"
... """, ambiguity="explicit")
>>> print(p.parse("ab").pretty())
_ambig
  start
    a   ab
  start
    a   a
    b   b
```


<a name="rules"></a>
## Rules ## Rules


**Syntax:** **Syntax:**
@@ -85,24 +151,30 @@ Each item is one of:
* `TERMINAL` * `TERMINAL`
* `"string literal"` or `/regexp literal/` * `"string literal"` or `/regexp literal/`
* `(item item ..)` - Group items * `(item item ..)` - Group items
* `[item item ..]` - Maybe. Same as `(item item ..)?`
* `[item item ..]` - Maybe. Same as `(item item ..)?`, but generates `None` if there is no match
* `item?` - Zero or one instances of item ("maybe") * `item?` - Zero or one instances of item ("maybe")
* `item*` - Zero or more instances of item * `item*` - Zero or more instances of item
* `item+` - One or more instances of item * `item+` - One or more instances of item
* `item ~ n` - Exactly *n* instances of item * `item ~ n` - Exactly *n* instances of item
* `item ~ n..m` - Between *n* to *m* instances of item
* `item ~ n..m` - Between *n* to *m* instances of item (not recommended for wide ranges, due to performance issues)


**Examples:** **Examples:**
```perl ```perl
hello_world: "hello" "world" hello_world: "hello" "world"
mul: [mul "*"] number //# Left-recursion is allowed!
mul: (mul "*")? number //# Left-recursion is allowed and encouraged!
expr: expr operator expr expr: expr operator expr
| value //# Multi-line, belongs to expr | value //# Multi-line, belongs to expr


four_words: word ~ 4 four_words: word ~ 4
``` ```


### Priority

Rules can be assigned priority only when using Earley (future versions may support LALR as well).

Priority can be either positive or negative. If not specified for a rule, it's assumed to be 1 (i.e. the default).


<a name="dirs"></a>
## Directives ## Directives


### %ignore ### %ignore
@@ -111,7 +183,7 @@ All occurrences of the terminal will be ignored, and won't be part of the parse.


Using the `%ignore` directive results in a cleaner grammar. Using the `%ignore` directive results in a cleaner grammar.


It's especially important for the LALR(1) algorithm, because adding whitespace (or comments, or other extranous elements) explicitly in the grammar, harms its predictive abilities, which are based on a lookahead of 1.
It's especially important for the LALR(1) algorithm, because adding whitespace (or comments, or other extraneous elements) explicitly in the grammar, harms its predictive abilities, which are based on a lookahead of 1.


**Syntax:** **Syntax:**
```html ```html


docs/how_to_develop.md (+2 −2)

@@ -7,7 +7,7 @@ There are many ways you can help the project:
* Write new grammars for Lark's library * Write new grammars for Lark's library
* Write a blog post introducing Lark to your audience * Write a blog post introducing Lark to your audience
* Port Lark to another language * Port Lark to another language
* Help me with code developemnt
* Help me with code development


If you're interested in taking one of these on, let me know and I will provide more details and assist you in the process. If you're interested in taking one of these on, let me know and I will provide more details and assist you in the process.


@@ -60,4 +60,4 @@ Another way to run the tests is using setup.py:


```bash ```bash
python setup.py test python setup.py test
```
```

docs/how_to_use.md (+1 −1)

@@ -10,7 +10,7 @@ This is the recommended process for working with Lark:


3. Try your grammar in Lark against each input sample. Make sure the resulting parse-trees make sense. 3. Try your grammar in Lark against each input sample. Make sure the resulting parse-trees make sense.


4. Use Lark's grammar features to [[shape the tree|Tree Construction]]: Get rid of superfluous rules by inlining them, and use aliases when specific cases need clarification.
4. Use Lark's grammar features to [shape the tree](tree_construction.md): Get rid of superfluous rules by inlining them, and use aliases when specific cases need clarification.


- You can perform steps 1-4 repeatedly, gradually growing your grammar to include more sentences. - You can perform steps 1-4 repeatedly, gradually growing your grammar to include more sentences.




docs/index.md (+3 −2)

@@ -35,8 +35,8 @@ $ pip install lark-parser
* [Examples](https://github.com/lark-parser/lark/tree/master/examples) * [Examples](https://github.com/lark-parser/lark/tree/master/examples)
* Tutorials * Tutorials
* [How to write a DSL](http://blog.erezsh.com/how-to-write-a-dsl-in-python-with-lark/) - Implements a toy LOGO-like language with an interpreter * [How to write a DSL](http://blog.erezsh.com/how-to-write-a-dsl-in-python-with-lark/) - Implements a toy LOGO-like language with an interpreter
* [How to write a JSON parser](json_tutorial.md)
* External
* [How to write a JSON parser](json_tutorial.md) - Teaches you how to use Lark
* Unofficial
* [Program Synthesis is Possible](https://www.cs.cornell.edu/~asampson/blog/minisynth.html) - Creates a DSL for Z3 * [Program Synthesis is Possible](https://www.cs.cornell.edu/~asampson/blog/minisynth.html) - Creates a DSL for Z3
* Guides * Guides
* [How to use Lark](how_to_use.md) * [How to use Lark](how_to_use.md)
@@ -44,6 +44,7 @@ $ pip install lark-parser
* Reference * Reference
* [Grammar](grammar.md) * [Grammar](grammar.md)
* [Tree Construction](tree_construction.md) * [Tree Construction](tree_construction.md)
* [Visitors & Transformers](visitors.md)
* [Classes](classes.md) * [Classes](classes.md)
* [Cheatsheet (PDF)](lark_cheatsheet.pdf) * [Cheatsheet (PDF)](lark_cheatsheet.pdf)
* Discussion * Discussion


docs/json_tutorial.md (+10 −5)

@@ -230,7 +230,8 @@ from lark import Transformer
class MyTransformer(Transformer): class MyTransformer(Transformer):
def list(self, items): def list(self, items):
return list(items) return list(items)
def pair(self, (k,v)):
def pair(self, key_value):
k, v = key_value
return k, v return k, v
def dict(self, items): def dict(self, items):
return dict(items) return dict(items)
@@ -251,9 +252,11 @@ Also, our definitions of list and dict are a bit verbose. We can do better:
from lark import Transformer from lark import Transformer


class TreeToJson(Transformer): class TreeToJson(Transformer):
def string(self, (s,)):
def string(self, s):
(s,) = s
return s[1:-1] return s[1:-1]
def number(self, (n,)):
def number(self, n):
(n,) = n
return float(n) return float(n)


list = list list = list
@@ -315,9 +318,11 @@ json_grammar = r"""
""" """


class TreeToJson(Transformer): class TreeToJson(Transformer):
def string(self, (s,)):
def string(self, s):
(s,) = s
return s[1:-1] return s[1:-1]
def number(self, (n,)):
def number(self, n):
(n,) = n
return float(n) return float(n)


list = list list = list


docs/parsers.md (+3 −3)

@@ -5,9 +5,9 @@ Lark implements the following parsing algorithms: Earley, LALR(1), and CYK


An [Earley Parser](https://www.wikiwand.com/en/Earley_parser) is a chart parser capable of parsing any context-free grammar at O(n^3), and O(n^2) when the grammar is unambiguous. It can parse most LR grammars at O(n). Most programming languages are LR, and can be parsed at a linear time. An [Earley Parser](https://www.wikiwand.com/en/Earley_parser) is a chart parser capable of parsing any context-free grammar at O(n^3), and O(n^2) when the grammar is unambiguous. It can parse most LR grammars at O(n). Most programming languages are LR, and can be parsed at a linear time.


Lark's Earley implementation runs on top of a skipping chart parser, which allows it to use regular expressions, instead of matching characters one-by-one. This is a huge improvement to Earley that is unique to Lark. This feature is used by default, but can also be requested explicitely using `lexer='dynamic'`.
Lark's Earley implementation runs on top of a skipping chart parser, which allows it to use regular expressions, instead of matching characters one-by-one. This is a huge improvement to Earley that is unique to Lark. This feature is used by default, but can also be requested explicitly using `lexer='dynamic'`.


It's possible to bypass the dynamic lexer, and use the regular Earley parser with a traditional lexer, that tokenizes as an independant first step. Doing so will provide a speed benefit, but will tokenize without using Earley's ambiguity-resolution ability. So choose this only if you know why! Activate with `lexer='standard'`
It's possible to bypass the dynamic lexing, and use the regular Earley parser with a traditional lexer, that tokenizes as an independent first step. Doing so will provide a speed benefit, but will tokenize without using Earley's ambiguity-resolution ability. So choose this only if you know why! Activate with `lexer='standard'`


**SPPF & Ambiguity resolution** **SPPF & Ambiguity resolution**


@@ -21,7 +21,7 @@ Lark provides the following options to combat ambiguity:


1) Lark will choose the best derivation for you (default). Users can choose between different disambiguation strategies, and can prioritize (or demote) individual rules over others, using the rule-priority syntax. 1) Lark will choose the best derivation for you (default). Users can choose between different disambiguation strategies, and can prioritize (or demote) individual rules over others, using the rule-priority syntax.


2) Users may choose to recieve the set of all possible parse-trees (using ambiguity='explicit'), and choose the best derivation themselves. While simple and flexible, it comes at the cost of space and performance, and so it isn't recommended for highly ambiguous grammars, or very long inputs.
2) Users may choose to receive the set of all possible parse-trees (using ambiguity='explicit'), and choose the best derivation themselves. While simple and flexible, it comes at the cost of space and performance, and so it isn't recommended for highly ambiguous grammars, or very long inputs.


3) As an advanced feature, users may use specialized visitors to iterate the SPPF themselves. Future versions of Lark intend to improve and simplify this interface. 3) As an advanced feature, users may use specialized visitors to iterate the SPPF themselves. Future versions of Lark intend to improve and simplify this interface.




docs/recipes.md (+6 −6)

@@ -19,18 +19,18 @@ It only works with the standard and contextual lexers.
### Example 1: Replace string values with ints for INT tokens ### Example 1: Replace string values with ints for INT tokens


```python ```python
from lark import Lark, Token
from lark import Lark, Transformer


def tok_to_int(tok):
"Convert the value of `tok` from string to int, while maintaining line number & column."
# tok.type == 'INT'
return Token.new_borrow_pos(tok.type, int(tok), tok)
class T(Transformer):
def INT(self, tok):
"Convert the value of `tok` from string to int, while maintaining line number & column."
return tok.update(value=int(tok))


parser = Lark(""" parser = Lark("""
start: INT* start: INT*
%import common.INT %import common.INT
%ignore " " %ignore " "
""", parser="lalr", lexer_callbacks = {'INT': tok_to_int})
""", parser="lalr", transformer=T())


print(parser.parse('3 14 159')) print(parser.parse('3 14 159'))
``` ```


docs/tree_construction.md (+24 −0)

@@ -7,6 +7,12 @@ For example, the rule `node: child1 child2` will create a tree node with two chi


Using `item+` or `item*` will result in a list of items, equivalent to writing `item item item ..`. Using `item+` or `item*` will result in a list of items, equivalent to writing `item item item ..`.


Using `item?` will return the item if it matched, or nothing.

If `maybe_placeholders=False` (the default), then `[]` behaves like `()?`.

If `maybe_placeholders=True`, then using `[item]` will return the item if it matched, or the value `None`, if it didn't.

### Terminals ### Terminals


Terminals are always values in the tree, never branches. Terminals are always values in the tree, never branches.
@@ -23,6 +29,24 @@ Lark filters out certain types of terminals by default, considering them punctua
- Unnamed regular expressions (like `/[0-9]/`) - Unnamed regular expressions (like `/[0-9]/`)
- Named terminals whose name starts with a letter (like `DIGIT`) - Named terminals whose name starts with a letter (like `DIGIT`)


Note: Terminals composed of literals and other terminals always include the entire match without filtering any part.

**Example:**
```
start: PNAME pname

PNAME: "(" NAME ")"
pname: "(" NAME ")"

NAME: /\w+/
%ignore /\s+/
```
Lark will parse "(Hello) (World)" as:

    start
      (Hello)
      pname
        World

Rules prefixed with `!` will retain all their literals regardless. Rules prefixed with `!` will retain all their literals regardless.
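
As a side note for reviewers, the `maybe_placeholders` behaviour described above can be demonstrated with a tiny, invented grammar (this sketch is not part of the diff):

```python
from lark import Lark

# Hypothetical grammar: the NUMBER after "a" is optional
grammar = r"""
    start: "a" [NUMBER]
    NUMBER: /\d+/
    %ignore " "
"""

print(Lark(grammar, maybe_placeholders=False).parse("a").children)  # []
print(Lark(grammar, maybe_placeholders=True).parse("a").children)   # [None]
```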






docs/visitors.md (+117 −0)

@@ -0,0 +1,117 @@
## Transformers & Visitors

Transformers & Visitors provide a convenient interface to process the parse-trees that Lark returns.

They are used by inheriting from the correct class (visitor or transformer), and implementing methods corresponding to the rule you wish to process. Each method accepts the children as an argument. That can be modified using the `v_args` decorator, which allows inlining the arguments (akin to `*args`), or adding the tree's `meta` property as an argument.

See: <a href="https://github.com/lark-parser/lark/blob/master/lark/visitors.py">visitors.py</a>

### Visitors

Visitors visit each node of the tree, and run the appropriate method on it according to the node's data.

They work bottom-up, starting with the leaves and ending at the root of the tree.

**Example**
```python
class IncreaseAllNumbers(Visitor):
    def number(self, tree):
        assert tree.data == "number"
        tree.children[0] += 1

IncreaseAllNumbers().visit(parse_tree)
```

There are two classes that implement the visitor interface:

* Visitor - Visit every node (without recursion)

* Visitor_Recursive - Visit every node using recursion. Slightly faster.

### Transformers

Transformers visit each node of the tree, and run the appropriate method on it according to the node's data.

They work bottom-up (or: depth-first), starting with the leaves and ending at the root of the tree.

Transformers can be used to implement map & reduce patterns.

Because nodes are reduced from leaf to root, at any point the callbacks may assume the children have already been transformed (if applicable).

Transformers can be chained into a new transformer by using multiplication.

`Transformer` can do anything `Visitor` can do, but because it reconstructs the tree, it is slightly less efficient.


**Example:**
```python
from lark import Tree, Transformer

class EvalExpressions(Transformer):
    def expr(self, args):
        return eval(args[0])

t = Tree('a', [Tree('expr', ['1+2'])])
print(EvalExpressions().transform( t ))

# Prints: Tree(a, [3])
```

All these classes implement the transformer interface:

- Transformer - Recursively transforms the tree. This is the one you probably want.
- Transformer_InPlace - Non-recursive. Changes the tree in-place instead of returning new instances
- Transformer_InPlaceRecursive - Recursive. Changes the tree in-place instead of returning new instances

### visit_tokens

By default, transformers only visit rules. `visit_tokens=True` will tell Transformer to visit tokens as well. This is a slightly slower alternative to `lexer_callbacks`, but it's easier to maintain and works for all algorithms (even when there isn't a lexer).

Example:

```python
class T(Transformer):
    INT = int
    NUMBER = float
    def NAME(self, name):
        return lookup_dict.get(name, name)


T(visit_tokens=True).transform(tree)
```


### v_args

`v_args` is a decorator.

By default, callback methods of transformers/visitors accept one argument: a list of the node's children. `v_args` can modify this behavior.

When used on a transformer/visitor class definition, it applies to all the callback methods inside it.

`v_args` accepts one of three flags:

- `inline` - Children are provided as `*args` instead of a list argument (not recommended for very long lists).
- `meta` - Provides two arguments: `children` and `meta` (instead of just the first)
- `tree` - Provides the entire tree as the argument, instead of the children.

Examples:

```python
@v_args(inline=True)
class SolveArith(Transformer):
    def add(self, left, right):
        return left + right


class ReverseNotation(Transformer_InPlace):
    @v_args(tree=True)
    def tree_node(self, tree):
        tree.children = tree.children[::-1]
```

### Discard

When raising the `Discard` exception in a transformer callback, that node is discarded and won't appear in the parent.
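
The new page shows no snippet for `Discard`, so here is a small illustrative sketch (not part of the diff; the `comment` rule and the input are invented):

```python
from lark import Lark
from lark.visitors import Transformer, Discard

class DropComments(Transformer):
    def comment(self, children):
        # Raising Discard removes this node from its parent's children
        raise Discard

parser = Lark(r"""
    start: (word | comment)+
    comment: "#" WORD
    word: WORD
    WORD: /\w+/
    %ignore " "
""")

print(DropComments().transform(parser.parse("hello #skipme world")).pretty())
```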



examples/README.md (+1 −0)

@@ -27,6 +27,7 @@ For example, the following will parse all the Python files in the standard libra


- [error\_reporting\_lalr.py](error_reporting_lalr.py) - A demonstration of example-driven error reporting with the LALR parser - [error\_reporting\_lalr.py](error_reporting_lalr.py) - A demonstration of example-driven error reporting with the LALR parser
- [python\_parser.py](python_parser.py) - A fully-working Python 2 & 3 parser (but not production ready yet!) - [python\_parser.py](python_parser.py) - A fully-working Python 2 & 3 parser (but not production ready yet!)
- [python\_bytecode.py](python_bytecode.py) - A toy example showing how to compile Python directly to bytecode
- [conf\_lalr.py](conf_lalr.py) - Demonstrates the power of LALR's contextual lexer on a toy configuration language - [conf\_lalr.py](conf_lalr.py) - Demonstrates the power of LALR's contextual lexer on a toy configuration language
- [conf\_earley.py](conf_earley.py) - Demonstrates the power of Earley's dynamic lexer on a toy configuration language - [conf\_earley.py](conf_earley.py) - Demonstrates the power of Earley's dynamic lexer on a toy configuration language
- [custom\_lexer.py](custom_lexer.py) - Demonstrates using a custom lexer to parse a non-textual stream of data - [custom\_lexer.py](custom_lexer.py) - Demonstrates using a custom lexer to parse a non-textual stream of data


examples/json_parser.py (+11 −1)

@@ -49,11 +49,21 @@ class TreeToJson(Transformer):
false = lambda self, _: False false = lambda self, _: False




### Create the JSON parser with Lark, using the Earley algorithm
# json_parser = Lark(json_grammar, parser='earley', lexer='standard') # json_parser = Lark(json_grammar, parser='earley', lexer='standard')
# def parse(x): # def parse(x):
# return TreeToJson().transform(json_parser.parse(x)) # return TreeToJson().transform(json_parser.parse(x))


json_parser = Lark(json_grammar, parser='lalr', lexer='standard', transformer=TreeToJson())
### Create the JSON parser with Lark, using the LALR algorithm
json_parser = Lark(json_grammar, parser='lalr',
# Using the standard lexer isn't required, and isn't usually recommended.
# But, it's good enough for JSON, and it's slightly faster.
lexer='standard',
# Disabling propagate_positions and placeholders slightly improves speed
propagate_positions=False,
maybe_placeholders=False,
# Using an internal transformer is faster and more memory efficient
transformer=TreeToJson())
parse = json_parser.parse parse = json_parser.parse






examples/lark.lark (+6 −4)

@@ -10,10 +10,12 @@ token: TOKEN priority? ":" expansions _NL
priority: "." NUMBER priority: "." NUMBER


statement: "%ignore" expansions _NL -> ignore statement: "%ignore" expansions _NL -> ignore
| "%import" import_args ["->" name] _NL -> import
| "%import" import_path ["->" name] _NL -> import
| "%import" import_path name_list _NL -> multi_import
| "%declare" name+ -> declare | "%declare" name+ -> declare


import_args: "."? name ("." name)*
!import_path: "."? name ("." name)*
name_list: "(" name ("," name)* ")"


?expansions: alias (_VBAR alias)* ?expansions: alias (_VBAR alias)*


@@ -33,7 +35,7 @@ name: RULE
| TOKEN | TOKEN


_VBAR: _NL? "|" _VBAR: _NL? "|"
OP: /[+*][?]?|[?](?![a-z])/
OP: /[+*]|[?](?![a-z])/
RULE: /!?[_?]?[a-z][_a-z0-9]*/ RULE: /!?[_?]?[a-z][_a-z0-9]*/
TOKEN: /_?[A-Z][_A-Z0-9]*/ TOKEN: /_?[A-Z][_A-Z0-9]*/
STRING: _STRING "i"? STRING: _STRING "i"?
@@ -44,7 +46,7 @@ _NL: /(\r?\n)+\s*/
%import common.INT -> NUMBER %import common.INT -> NUMBER
%import common.WS_INLINE %import common.WS_INLINE


COMMENT: "//" /[^\n]/*
COMMENT: /\s*/ "//" /[^\n]/*


%ignore WS_INLINE %ignore WS_INLINE
%ignore COMMENT %ignore COMMENT
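
For context, the new `import_path`/`name_list` rules above are what allow a single `%import` statement to pull in several names. A hedged sketch (the grammar content is invented, not taken from this diff):

```python
from lark import Lark

# Hypothetical grammar using the multi-import form of %import
parser = Lark(r"""
    start: INT+
    %import common (INT, WS)
    %ignore WS
""")

print(parser.parse("1 2 3"))
```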

examples/python3.lark (+5 −5)

@@ -81,7 +81,7 @@ with_item: test ["as" expr]
except_clause: "except" [test ["as" NAME]] except_clause: "except" [test ["as" NAME]]
suite: simple_stmt | _NEWLINE _INDENT stmt+ _DEDENT suite: simple_stmt | _NEWLINE _INDENT stmt+ _DEDENT


?test: or_test ["if" or_test "else" test] | lambdef
?test: or_test ("if" or_test "else" test)? | lambdef
?test_nocond: or_test | lambdef_nocond ?test_nocond: or_test | lambdef_nocond
lambdef: "lambda" [varargslist] ":" test lambdef: "lambda" [varargslist] ":" test
lambdef_nocond: "lambda" [varargslist] ":" test_nocond lambdef_nocond: "lambda" [varargslist] ":" test_nocond
@@ -107,7 +107,7 @@ star_expr: "*" expr
// sake of a __future__ import described in PEP 401 (which really works :-) // sake of a __future__ import described in PEP 401 (which really works :-)
!_comp_op: "<"|">"|"=="|">="|"<="|"<>"|"!="|"in"|"not" "in"|"is"|"is" "not" !_comp_op: "<"|">"|"=="|">="|"<="|"<>"|"!="|"in"|"not" "in"|"is"|"is" "not"


?power: await_expr ["**" factor]
?power: await_expr ("**" factor)?
?await_expr: AWAIT? atom_expr ?await_expr: AWAIT? atom_expr
AWAIT: "await" AWAIT: "await"


@@ -137,7 +137,7 @@ dictorsetmaker: ( ((test ":" test | "**" expr) (comp_for | ("," (test ":" test |


classdef: "class" NAME ["(" [arguments] ")"] ":" suite classdef: "class" NAME ["(" [arguments] ")"] ":" suite


arguments: argvalue ("," argvalue)* ["," [ starargs | kwargs]]
arguments: argvalue ("," argvalue)* ("," [ starargs | kwargs])?
| starargs | starargs
| kwargs | kwargs
| test comp_for | test comp_for
@@ -145,7 +145,7 @@ arguments: argvalue ("," argvalue)* ["," [ starargs | kwargs]]
starargs: "*" test ("," "*" test)* ("," argvalue)* ["," kwargs] starargs: "*" test ("," "*" test)* ("," argvalue)* ["," kwargs]
kwargs: "**" test kwargs: "**" test


?argvalue: test ["=" test]
?argvalue: test ("=" test)?






@@ -178,7 +178,7 @@ HEX_NUMBER.2: /0x[\da-f]*/i
OCT_NUMBER.2: /0o[0-7]*/i OCT_NUMBER.2: /0o[0-7]*/i
BIN_NUMBER.2 : /0b[0-1]*/i BIN_NUMBER.2 : /0b[0-1]*/i
FLOAT_NUMBER.2: /((\d+\.\d*|\.\d+)(e[-+]?\d+)?|\d+(e[-+]?\d+))/i FLOAT_NUMBER.2: /((\d+\.\d*|\.\d+)(e[-+]?\d+)?|\d+(e[-+]?\d+))/i
IMAG_NUMBER.2: /\d+j|${FLOAT_NUMBER}j/i
IMAG_NUMBER.2: /\d+j/i | FLOAT_NUMBER "j"i


%ignore /[\t \f]+/ // WS %ignore /[\t \f]+/ // WS
%ignore /\\[\t \f]*\r?\n/ // LINE_CONT %ignore /\\[\t \f]*\r?\n/ // LINE_CONT


examples/python_bytecode.py (+77 −0)

@@ -0,0 +1,77 @@
#
# This is a toy example that compiles Python directly to bytecode, without generating an AST.
# It currently only works for very very simple Python code.
#
# It requires the 'bytecode' library. You can get it using
#
# $ pip install bytecode
#

from lark import Lark, Transformer, v_args
from lark.indenter import Indenter

from bytecode import Instr, Bytecode

class PythonIndenter(Indenter):
    NL_type = '_NEWLINE'
    OPEN_PAREN_types = ['LPAR', 'LSQB', 'LBRACE']
    CLOSE_PAREN_types = ['RPAR', 'RSQB', 'RBRACE']
    INDENT_type = '_INDENT'
    DEDENT_type = '_DEDENT'
    tab_len = 8


@v_args(inline=True)
class Compile(Transformer):
    def number(self, n):
        return [Instr('LOAD_CONST', int(n))]
    def string(self, s):
        return [Instr('LOAD_CONST', s[1:-1])]
    def var(self, n):
        return [Instr('LOAD_NAME', n)]

    def arith_expr(self, a, op, b):
        # TODO support chain arithmetic
        assert op == '+'
        return a + b + [Instr('BINARY_ADD')]

    def arguments(self, args):
        return args

    def funccall(self, name, args):
        return name + args + [Instr('CALL_FUNCTION', 1)]

    @v_args(inline=False)
    def file_input(self, stmts):
        return sum(stmts, []) + [Instr("RETURN_VALUE")]

    def expr_stmt(self, lval, rval):
        # TODO more complicated than that
        name ,= lval
        assert name.name == 'LOAD_NAME'    # XXX avoid with another layer of abstraction
        return rval + [Instr("STORE_NAME", name.arg)]

    def __default__(self, *args):
        assert False, args


python_parser3 = Lark.open('python3.lark', rel_to=__file__, start='file_input',
                           parser='lalr', postlex=PythonIndenter(),
                           transformer=Compile(), propagate_positions=False)

def compile_python(s):
    insts = python_parser3.parse(s+"\n")
    return Bytecode(insts).to_code()

code = compile_python("""
a = 3
b = 5
print("Hello World!")
print(a+(b+2))
print((a+b)+2)
""")
exec(code)
# -- Output --
# Hello World!
# 10
# 10

examples/reconstruct_json.py (+2 −8)

@@ -25,15 +25,9 @@ test_json = '''


def test_earley(): def test_earley():


json_parser = Lark(json_grammar)
json_parser = Lark(json_grammar, maybe_placeholders=False)
tree = json_parser.parse(test_json) tree = json_parser.parse(test_json)


# print ('@@', tree.pretty())
# for x in tree.find_data('true'):
# x.data = 'false'
# # x.children[0].value = '"HAHA"'


new_json = Reconstructor(json_parser).reconstruct(tree) new_json = Reconstructor(json_parser).reconstruct(tree)
print (new_json) print (new_json)
print (json.loads(new_json) == json.loads(test_json)) print (json.loads(new_json) == json.loads(test_json))
@@ -41,7 +35,7 @@ def test_earley():


def test_lalr(): def test_lalr():


json_parser = Lark(json_grammar, parser='lalr')
json_parser = Lark(json_grammar, parser='lalr', maybe_placeholders=False)
tree = json_parser.parse(test_json) tree = json_parser.parse(test_json)


new_json = Reconstructor(json_parser).reconstruct(tree) new_json = Reconstructor(json_parser).reconstruct(tree)


examples/standalone/create_standalone.sh (+1 −0)

@@ -1 +1,2 @@
#!/bin/sh
PYTHONPATH=../.. python -m lark.tools.standalone json.lark > json_parser.py PYTHONPATH=../.. python -m lark.tools.standalone json.lark > json_parser.py

examples/standalone/json_parser.py (+418 −197)
File diff suppressed because it is too large


lark/__init__.py (+1 −1)

@@ -5,4 +5,4 @@ from .exceptions import ParseError, LexError, GrammarError, UnexpectedToken, Une
from .lexer import Token from .lexer import Token
from .lark import Lark from .lark import Lark


__version__ = "0.7.0"
__version__ = "0.8.1"

lark/common.py (+1 −0)

@@ -20,6 +20,7 @@ class LexerConf(Serialize):


class ParserConf: class ParserConf:
def __init__(self, rules, callbacks, start): def __init__(self, rules, callbacks, start):
assert isinstance(start, list)
self.rules = rules self.rules = rules
self.callbacks = callbacks self.callbacks = callbacks
self.start = start self.start = start


lark/exceptions.py (+14 −4)

@@ -13,6 +13,14 @@ class ParseError(LarkError):
class LexError(LarkError): class LexError(LarkError):
pass pass


class UnexpectedEOF(ParseError):
def __init__(self, expected):
self.expected = expected

message = ("Unexpected end-of-input. Expected one of: \n\t* %s\n" % '\n\t* '.join(x.name for x in self.expected))
super(UnexpectedEOF, self).__init__(message)


class UnexpectedInput(LarkError): class UnexpectedInput(LarkError):
pos_in_stream = None pos_in_stream = None


@@ -52,7 +60,7 @@ class UnexpectedInput(LarkError):




class UnexpectedCharacters(LexError, UnexpectedInput): class UnexpectedCharacters(LexError, UnexpectedInput):
def __init__(self, seq, lex_pos, line, column, allowed=None, considered_tokens=None, state=None):
def __init__(self, seq, lex_pos, line, column, allowed=None, considered_tokens=None, state=None, token_history=None):
message = "No terminal defined for '%s' at line %d col %d" % (seq[lex_pos], line, column) message = "No terminal defined for '%s' at line %d col %d" % (seq[lex_pos], line, column)


self.line = line self.line = line
@@ -65,6 +73,8 @@ class UnexpectedCharacters(LexError, UnexpectedInput):
message += '\n\n' + self.get_context(seq) message += '\n\n' + self.get_context(seq)
if allowed: if allowed:
message += '\nExpecting: %s\n' % allowed message += '\nExpecting: %s\n' % allowed
if token_history:
message += '\nPrevious tokens: %s\n' % ', '.join(repr(t) for t in token_history)


super(UnexpectedCharacters, self).__init__(message) super(UnexpectedCharacters, self).__init__(message)


@@ -87,10 +97,10 @@ class UnexpectedToken(ParseError, UnexpectedInput):
super(UnexpectedToken, self).__init__(message) super(UnexpectedToken, self).__init__(message)


class VisitError(LarkError): class VisitError(LarkError):
def __init__(self, tree, orig_exc):
self.tree = tree
def __init__(self, rule, obj, orig_exc):
self.obj = obj
self.orig_exc = orig_exc self.orig_exc = orig_exc


message = 'Error trying to process rule "%s":\n\n%s' % (tree.data, orig_exc)
message = 'Error trying to process rule "%s":\n\n%s' % (rule, orig_exc)
super(VisitError, self).__init__(message) super(VisitError, self).__init__(message)
###} ###}

lark/grammar.py (+4 −2)

@@ -3,6 +3,8 @@ from .utils import Serialize
###{standalone ###{standalone


class Symbol(Serialize): class Symbol(Serialize):
__slots__ = ('name',)

is_term = NotImplemented is_term = NotImplemented


def __init__(self, name): def __init__(self, name):
@@ -79,7 +81,7 @@ class Rule(Serialize):
self.expansion = expansion self.expansion = expansion
self.alias = alias self.alias = alias
self.order = order self.order = order
self.options = options
self.options = options or RuleOptions()
self._hash = hash((self.origin, tuple(self.expansion))) self._hash = hash((self.origin, tuple(self.expansion)))


def _deserialize(self): def _deserialize(self):
@@ -101,4 +103,4 @@ class Rule(Serialize):






###}
###}

lark/lark.py (+41 −40)

@@ -1,8 +1,6 @@
from __future__ import absolute_import from __future__ import absolute_import


import os import os
import time
from collections import defaultdict
from io import open from io import open


from .utils import STRING_TYPE, Serialize, SerializeMemoizer from .utils import STRING_TYPE, Serialize, SerializeMemoizer
@@ -43,8 +41,7 @@ class LarkOptions(Serialize):
keep_all_tokens - Don't automagically remove "punctuation" tokens (default: False) keep_all_tokens - Don't automagically remove "punctuation" tokens (default: False)
cache_grammar - Cache the Lark grammar (Default: False) cache_grammar - Cache the Lark grammar (Default: False)
postlex - Lexer post-processing (Default: None) Only works with the standard and contextual lexers. postlex - Lexer post-processing (Default: None) Only works with the standard and contextual lexers.
start - The start symbol (Default: start)
profile - Measure run-time usage in Lark. Read results from the profiler proprety (Default: False)
start - The start symbol, either a string, or a list of strings for multiple possible starts (Default: "start")
priority - How priorities should be evaluated - auto, none, normal, invert (Default: auto) priority - How priorities should be evaluated - auto, none, normal, invert (Default: auto)
propagate_positions - Propagates [line, column, end_line, end_column] attributes into all tree branches. propagate_positions - Propagates [line, column, end_line, end_column] attributes into all tree branches.
lexer_callbacks - Dictionary of callbacks for the lexer. May alter tokens during lexing. Use with caution. lexer_callbacks - Dictionary of callbacks for the lexer. May alter tokens during lexing. Use with caution.
@@ -63,12 +60,12 @@ class LarkOptions(Serialize):
'lexer': 'auto', 'lexer': 'auto',
'transformer': None, 'transformer': None,
'start': 'start', 'start': 'start',
'profile': False,
'priority': 'auto', 'priority': 'auto',
'ambiguity': 'auto', 'ambiguity': 'auto',
'propagate_positions': False, 'propagate_positions': False,
'lexer_callbacks': {}, 'lexer_callbacks': {},
'maybe_placeholders': False, 'maybe_placeholders': False,
'edit_terminals': None,
} }


def __init__(self, options_dict): def __init__(self, options_dict):
@@ -85,6 +82,9 @@ class LarkOptions(Serialize):


options[name] = value options[name] = value


if isinstance(options['start'], STRING_TYPE):
options['start'] = [options['start']]

self.__dict__['options'] = options self.__dict__['options'] = options


assert self.parser in ('earley', 'lalr', 'cyk', None) assert self.parser in ('earley', 'lalr', 'cyk', None)
@@ -97,7 +97,11 @@ class LarkOptions(Serialize):
raise ValueError("Unknown options: %s" % o.keys()) raise ValueError("Unknown options: %s" % o.keys())


def __getattr__(self, name): def __getattr__(self, name):
return self.options[name]
try:
return self.options[name]
except KeyError as e:
raise AttributeError(e)

def __setattr__(self, name, value): def __setattr__(self, name, value):
assert name in self.options assert name in self.options
self.options[name] = value self.options[name] = value
@@ -110,30 +114,6 @@ class LarkOptions(Serialize):
return cls(data) return cls(data)




class Profiler:
def __init__(self):
self.total_time = defaultdict(float)
self.cur_section = '__init__'
self.last_enter_time = time.time()

def enter_section(self, name):
cur_time = time.time()
self.total_time[self.cur_section] += cur_time - self.last_enter_time
self.last_enter_time = cur_time
self.cur_section = name

def make_wrapper(self, name, f):
def wrapper(*args, **kwargs):
last_section = self.cur_section
self.enter_section(name)
try:
return f(*args, **kwargs)
finally:
self.enter_section(last_section)

return wrapper


class Lark(Serialize): class Lark(Serialize):
def __init__(self, grammar, **options): def __init__(self, grammar, **options):
""" """
@@ -161,9 +141,6 @@ class Lark(Serialize):
if self.options.cache_grammar: if self.options.cache_grammar:
raise NotImplementedError("Not available yet") raise NotImplementedError("Not available yet")


assert not self.options.profile, "Feature temporarily disabled"
# self.profiler = Profiler() if self.options.profile else None

if self.options.lexer == 'auto': if self.options.lexer == 'auto':
if self.options.parser == 'lalr': if self.options.parser == 'lalr':
self.options.lexer = 'contextual' self.options.lexer = 'contextual'
@@ -200,22 +177,37 @@ class Lark(Serialize):
self.grammar = load_grammar(grammar, self.source) self.grammar = load_grammar(grammar, self.source)


# Compile the EBNF grammar into BNF # Compile the EBNF grammar into BNF
self.terminals, self.rules, self.ignore_tokens = self.grammar.compile()
self.terminals, self.rules, self.ignore_tokens = self.grammar.compile(self.options.start)

if self.options.edit_terminals:
for t in self.terminals:
self.options.edit_terminals(t)

self._terminals_dict = {t.name:t for t in self.terminals}


# If the user asked to invert the priorities, negate them all here. # If the user asked to invert the priorities, negate them all here.
# This replaces the old 'resolve__antiscore_sum' option. # This replaces the old 'resolve__antiscore_sum' option.
if self.options.priority == 'invert': if self.options.priority == 'invert':
for rule in self.rules: for rule in self.rules:
if rule.options and rule.options.priority is not None:
if rule.options.priority is not None:
rule.options.priority = -rule.options.priority rule.options.priority = -rule.options.priority
# Else, if the user asked to disable priorities, strip them from the # Else, if the user asked to disable priorities, strip them from the
# rules. This allows the Earley parsers to skip an extra forest walk # rules. This allows the Earley parsers to skip an extra forest walk
# for improved performance, if you don't need them (or didn't specify any). # for improved performance, if you don't need them (or didn't specify any).
elif self.options.priority == None: elif self.options.priority == None:
for rule in self.rules: for rule in self.rules:
if rule.options and rule.options.priority is not None:
if rule.options.priority is not None:
rule.options.priority = None rule.options.priority = None
self.lexer_conf = LexerConf(self.terminals, self.ignore_tokens, self.options.postlex, self.options.lexer_callbacks)

# TODO Deprecate lexer_callbacks?
lexer_callbacks = dict(self.options.lexer_callbacks)
if self.options.transformer:
t = self.options.transformer
for term in self.terminals:
if hasattr(t, term.name):
lexer_callbacks[term.name] = getattr(t, term.name)

self.lexer_conf = LexerConf(self.terminals, self.ignore_tokens, self.options.postlex, lexer_callbacks)


if self.options.parser: if self.options.parser:
self.parser = self._build_parser() self.parser = self._build_parser()
@@ -287,8 +279,17 @@ class Lark(Serialize):
return self.options.postlex.process(stream) return self.options.postlex.process(stream)
return stream return stream


def parse(self, text):
"Parse the given text, according to the options provided. Returns a tree, unless specified otherwise."
return self.parser.parse(text)
def get_terminal(self, name):
"Get information about a terminal"
return self._terminals_dict[name]

def parse(self, text, start=None):
"""Parse the given text, according to the options provided.

The 'start' parameter is required if Lark was given multiple possible start symbols (using the start option).

Returns a tree, unless specified otherwise.
"""
return self.parser.parse(text, start=start)


###} ###}
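Taken together, the lark.py changes let one Lark instance serve several start symbols and expose its terminal definitions. A rough usage sketch; the grammar, rule names and terminal names below are made up for illustration:

```python
from lark import Lark

grammar = """
greeting: "hello" NAME
farewell: "bye" NAME
NAME: /[a-z]+/
%ignore " "
"""

# 'start' may be a list of rule names; parse() then selects one with start=...
parser = Lark(grammar, parser='lalr', start=['greeting', 'farewell'])
print(parser.parse("hello world", start='greeting'))
print(parser.parse("bye world", start='farewell'))

# New helper for inspecting a terminal defined in the grammar.
print(parser.get_terminal('NAME'))
```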

+ 84  - 50   lark/lexer.py


@@ -3,12 +3,11 @@
import re import re


from .utils import Str, classify, get_regexp_width, Py36, Serialize from .utils import Str, classify, get_regexp_width, Py36, Serialize
from .exceptions import UnexpectedCharacters, LexError
from .exceptions import UnexpectedCharacters, LexError, UnexpectedToken


###{standalone ###{standalone


class Pattern(Serialize): class Pattern(Serialize):
__serialize_fields__ = 'value', 'flags'


def __init__(self, value, flags=()): def __init__(self, value, flags=()):
self.value = value self.value = value
@@ -41,6 +40,10 @@ class Pattern(Serialize):




class PatternStr(Pattern): class PatternStr(Pattern):
__serialize_fields__ = 'value', 'flags'

type = "str"

def to_regexp(self): def to_regexp(self):
return self._get_flags(re.escape(self.value)) return self._get_flags(re.escape(self.value))


@@ -50,15 +53,25 @@ class PatternStr(Pattern):
max_width = min_width max_width = min_width


class PatternRE(Pattern): class PatternRE(Pattern):
__serialize_fields__ = 'value', 'flags', '_width'

type = "re"

def to_regexp(self): def to_regexp(self):
return self._get_flags(self.value) return self._get_flags(self.value)


_width = None
def _get_width(self):
if self._width is None:
self._width = get_regexp_width(self.to_regexp())
return self._width

@property @property
def min_width(self): def min_width(self):
return get_regexp_width(self.to_regexp())[0]
return self._get_width()[0]
@property @property
def max_width(self): def max_width(self):
return get_regexp_width(self.to_regexp())[1]
return self._get_width()[1]




class TerminalDef(Serialize): class TerminalDef(Serialize):
@@ -77,9 +90,9 @@ class TerminalDef(Serialize):




class Token(Str): class Token(Str):
__slots__ = ('type', 'pos_in_stream', 'value', 'line', 'column', 'end_line', 'end_column')
__slots__ = ('type', 'pos_in_stream', 'value', 'line', 'column', 'end_line', 'end_column', 'end_pos')


def __new__(cls, type_, value, pos_in_stream=None, line=None, column=None, end_line=None, end_column=None):
def __new__(cls, type_, value, pos_in_stream=None, line=None, column=None, end_line=None, end_column=None, end_pos=None):
try: try:
self = super(Token, cls).__new__(cls, value) self = super(Token, cls).__new__(cls, value)
except UnicodeDecodeError: except UnicodeDecodeError:
@@ -93,11 +106,19 @@ class Token(Str):
self.column = column self.column = column
self.end_line = end_line self.end_line = end_line
self.end_column = end_column self.end_column = end_column
self.end_pos = end_pos
return self return self


def update(self, type_=None, value=None):
return Token.new_borrow_pos(
type_ if type_ is not None else self.type,
value if value is not None else self.value,
self
)

@classmethod @classmethod
def new_borrow_pos(cls, type_, value, borrow_t): def new_borrow_pos(cls, type_, value, borrow_t):
return cls(type_, value, borrow_t.pos_in_stream, borrow_t.line, borrow_t.column, borrow_t.end_line, borrow_t.end_column)
return cls(type_, value, borrow_t.pos_in_stream, borrow_t.line, borrow_t.column, borrow_t.end_line, borrow_t.end_column, borrow_t.end_pos)


def __reduce__(self): def __reduce__(self):
return (self.__class__, (self.type, self.value, self.pos_in_stream, self.line, self.column, )) return (self.__class__, (self.type, self.value, self.pos_in_stream, self.line, self.column, ))
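Tokens now record end_pos and can be rebuilt with update() while keeping all position attributes. A small sketch; the terminal name and positions are illustrative:

```python
from lark import Token

tok = Token('WORD', 'hello', pos_in_stream=0, line=1, column=1,
            end_line=1, end_column=6, end_pos=5)

# update() returns a new Token, optionally overriding type or value,
# and borrows every position attribute (including end_pos) from the original.
upper = tok.update(value=tok.upper())
assert upper == 'HELLO' and upper.end_pos == tok.end_pos
```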
@@ -149,38 +170,38 @@ class _Lex:
newline_types = frozenset(newline_types) newline_types = frozenset(newline_types)
ignore_types = frozenset(ignore_types) ignore_types = frozenset(ignore_types)
line_ctr = LineCounter() line_ctr = LineCounter()
last_token = None


while line_ctr.char_pos < len(stream): while line_ctr.char_pos < len(stream):
lexer = self.lexer lexer = self.lexer
for mre, type_from_index in lexer.mres:
m = mre.match(stream, line_ctr.char_pos)
if not m:
continue

t = None
value = m.group(0)
type_ = type_from_index[m.lastindex]
if type_ not in ignore_types:
t = Token(type_, value, line_ctr.char_pos, line_ctr.line, line_ctr.column)
if t.type in lexer.callback:
t = lexer.callback[t.type](t)
if not isinstance(t, Token):
raise ValueError("Callbacks must return a token (returned %r)" % t)
yield t
else:
if type_ in lexer.callback:
t = Token(type_, value, line_ctr.char_pos, line_ctr.line, line_ctr.column)
lexer.callback[type_](t)
res = lexer.match(stream, line_ctr.char_pos)
if not res:
allowed = {v for m, tfi in lexer.mres for v in tfi.values()} - ignore_types
if not allowed:
allowed = {"<END-OF-FILE>"}
raise UnexpectedCharacters(stream, line_ctr.char_pos, line_ctr.line, line_ctr.column, allowed=allowed, state=self.state, token_history=last_token and [last_token])


line_ctr.feed(value, type_ in newline_types)
if t:
t.end_line = line_ctr.line
t.end_column = line_ctr.column
value, type_ = res


break
if type_ not in ignore_types:
t = Token(type_, value, line_ctr.char_pos, line_ctr.line, line_ctr.column)
line_ctr.feed(value, type_ in newline_types)
t.end_line = line_ctr.line
t.end_column = line_ctr.column
t.end_pos = line_ctr.char_pos
if t.type in lexer.callback:
t = lexer.callback[t.type](t)
if not isinstance(t, Token):
raise ValueError("Callbacks must return a token (returned %r)" % t)
yield t
last_token = t
else: else:
allowed = {v for m, tfi in lexer.mres for v in tfi.values()}
raise UnexpectedCharacters(stream, line_ctr.char_pos, line_ctr.line, line_ctr.column, allowed=allowed, state=self.state)
if type_ in lexer.callback:
t2 = Token(type_, value, line_ctr.char_pos, line_ctr.line, line_ctr.column)
lexer.callback[type_](t2)
line_ctr.feed(value, type_ in newline_types)






class UnlessCallback: class UnlessCallback:
@@ -253,23 +274,21 @@ def build_mres(terminals, match_whole=False):
return _build_mres(terminals, len(terminals), match_whole) return _build_mres(terminals, len(terminals), match_whole)


def _regexp_has_newline(r): def _regexp_has_newline(r):
"""Expressions that may indicate newlines in a regexp:
r"""Expressions that may indicate newlines in a regexp:
- newlines (\n) - newlines (\n)
- escaped newline (\\n) - escaped newline (\\n)
- anything but ([^...]) - anything but ([^...])
- any-char (.) when the flag (?s) exists - any-char (.) when the flag (?s) exists
- spaces (\s)
""" """
return '\n' in r or '\\n' in r or '[^' in r or ('(?s' in r and '.' in r)
return '\n' in r or '\\n' in r or '\\s' in r or '[^' in r or ('(?s' in r and '.' in r)


class Lexer(object): class Lexer(object):
"""Lexer interface """Lexer interface


Method Signatures: Method Signatures:
lex(self, stream) -> Iterator[Token] lex(self, stream) -> Iterator[Token]

set_parser_state(self, state) # Optional
""" """
set_parser_state = NotImplemented
lex = NotImplemented lex = NotImplemented




@@ -284,7 +303,7 @@ class TraditionalLexer(Lexer):
for t in terminals: for t in terminals:
try: try:
re.compile(t.pattern.to_regexp()) re.compile(t.pattern.to_regexp())
except:
except re.error:
raise LexError("Cannot compile token %s: %s" % (t.name, t.pattern)) raise LexError("Cannot compile token %s: %s" % (t.name, t.pattern))


if t.pattern.min_width == 0: if t.pattern.min_width == 0:
@@ -314,6 +333,11 @@ class TraditionalLexer(Lexer):


self.mres = build_mres(terminals) self.mres = build_mres(terminals)


def match(self, stream, pos):
for mre, type_from_index in self.mres:
m = mre.match(stream, pos)
if m:
return m.group(0), type_from_index[m.lastindex]


def lex(self, stream): def lex(self, stream):
return _Lex(self).lex(stream, self.newline_types, self.ignore_types) return _Lex(self).lex(stream, self.newline_types, self.ignore_types)
@@ -322,6 +346,7 @@ class TraditionalLexer(Lexer):




class ContextualLexer(Lexer): class ContextualLexer(Lexer):

def __init__(self, terminals, states, ignore=(), always_accept=(), user_callbacks={}): def __init__(self, terminals, states, ignore=(), always_accept=(), user_callbacks={}):
tokens_by_name = {} tokens_by_name = {}
for t in terminals: for t in terminals:
@@ -344,16 +369,25 @@ class ContextualLexer(Lexer):


self.root_lexer = TraditionalLexer(terminals, ignore=ignore, user_callbacks=user_callbacks) self.root_lexer = TraditionalLexer(terminals, ignore=ignore, user_callbacks=user_callbacks)


self.set_parser_state(None) # Needs to be set on the outside

def set_parser_state(self, state):
self.parser_state = state

def lex(self, stream):
l = _Lex(self.lexers[self.parser_state], self.parser_state)
for x in l.lex(stream, self.root_lexer.newline_types, self.root_lexer.ignore_types):
yield x
l.lexer = self.lexers[self.parser_state]
l.state = self.parser_state
def lex(self, stream, get_parser_state):
parser_state = get_parser_state()
l = _Lex(self.lexers[parser_state], parser_state)
try:
for x in l.lex(stream, self.root_lexer.newline_types, self.root_lexer.ignore_types):
yield x
parser_state = get_parser_state()
l.lexer = self.lexers[parser_state]
l.state = parser_state # For debug only, no need to worry about multithreading
except UnexpectedCharacters as e:
# In the contextual lexer, UnexpectedCharacters can mean that the terminal is defined,
# but not in the current context.
# This tests the input against the global context, to provide a nicer error.
root_match = self.root_lexer.match(stream, e.pos_in_stream)
if not root_match:
raise

value, type_ = root_match
t = Token(type_, value, e.pos_in_stream, e.line, e.column)
raise UnexpectedToken(t, e.allowed, state=e.state)


###} ###}
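For users of the contextual lexer, the practical effect of the new error path is that input matching a terminal that exists in the grammar but is not legal at the current position now raises UnexpectedToken instead of UnexpectedCharacters, so the two cases can be handled separately. A hedged sketch with an invented grammar:

```python
from lark import Lark
from lark.exceptions import UnexpectedToken, UnexpectedCharacters

parser = Lark('''
start: NAME "=" NUMBER
NAME: /[a-z]+/
NUMBER: /[0-9]+/
%ignore " "
''', parser='lalr', lexer='contextual')

try:
    parser.parse("42 = 42")                  # NUMBER is defined, but not allowed here
except UnexpectedToken as err:
    print("unexpected token:", err.token)    # e.g. Token(NUMBER, '42')
except UnexpectedCharacters as err:          # still raised when nothing matches at all
    print("bad input at line", err.line, "column", err.column)
```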

+ 66  - 57   lark/load_grammar.py

@@ -2,17 +2,17 @@


import os.path import os.path
import sys import sys
from ast import literal_eval
from copy import copy, deepcopy from copy import copy, deepcopy
from io import open


from .utils import bfs
from .utils import bfs, eval_escaping
from .lexer import Token, TerminalDef, PatternStr, PatternRE from .lexer import Token, TerminalDef, PatternStr, PatternRE


from .parse_tree_builder import ParseTreeBuilder from .parse_tree_builder import ParseTreeBuilder
from .parser_frontends import LALR_TraditionalLexer from .parser_frontends import LALR_TraditionalLexer
from .common import LexerConf, ParserConf from .common import LexerConf, ParserConf
from .grammar import RuleOptions, Rule, Terminal, NonTerminal, Symbol from .grammar import RuleOptions, Rule, Terminal, NonTerminal, Symbol
from .utils import classify, suppress, dedup_list
from .utils import classify, suppress, dedup_list, Str
from .exceptions import GrammarError, UnexpectedCharacters, UnexpectedToken from .exceptions import GrammarError, UnexpectedCharacters, UnexpectedToken


from .tree import Tree, SlottedTree as ST from .tree import Tree, SlottedTree as ST
@@ -73,11 +73,12 @@ TERMINALS = {
'_RPAR': r'\)', '_RPAR': r'\)',
'_LBRA': r'\[', '_LBRA': r'\[',
'_RBRA': r'\]', '_RBRA': r'\]',
'OP': '[+*][?]?|[?](?![a-z])',
'OP': '[+*]|[?](?![a-z])',
'_COLON': ':', '_COLON': ':',
'_COMMA': ',', '_COMMA': ',',
'_OR': r'\|', '_OR': r'\|',
'_DOT': r'\.',
'_DOT': r'\.(?!\.)',
'_DOTDOT': r'\.\.',
'TILDE': '~', 'TILDE': '~',
'RULE': '!?[_?]?[a-z][_a-z0-9]*', 'RULE': '!?[_?]?[a-z][_a-z0-9]*',
'TERMINAL': '_?[A-Z][_A-Z0-9]*', 'TERMINAL': '_?[A-Z][_A-Z0-9]*',
@@ -85,12 +86,12 @@ TERMINALS = {
'REGEXP': r'/(?!/)(\\/|\\\\|[^/\n])*?/[%s]*' % _RE_FLAGS, 'REGEXP': r'/(?!/)(\\/|\\\\|[^/\n])*?/[%s]*' % _RE_FLAGS,
'_NL': r'(\r?\n)+\s*', '_NL': r'(\r?\n)+\s*',
'WS': r'[ \t]+', 'WS': r'[ \t]+',
'COMMENT': r'//[^\n]*',
'COMMENT': r'\s*//[^\n]*',
'_TO': '->', '_TO': '->',
'_IGNORE': r'%ignore', '_IGNORE': r'%ignore',
'_DECLARE': r'%declare', '_DECLARE': r'%declare',
'_IMPORT': r'%import', '_IMPORT': r'%import',
'NUMBER': r'\d+',
'NUMBER': r'[+-]?\d+',
} }


RULES = { RULES = {
@@ -112,7 +113,7 @@ RULES = {
'?expr': ['atom', '?expr': ['atom',
'atom OP', 'atom OP',
'atom TILDE NUMBER', 'atom TILDE NUMBER',
'atom TILDE NUMBER _DOT _DOT NUMBER',
'atom TILDE NUMBER _DOTDOT NUMBER',
], ],


'?atom': ['_LPAR expansions _RPAR', '?atom': ['_LPAR expansions _RPAR',
@@ -130,7 +131,7 @@ RULES = {
'?name': ['RULE', 'TERMINAL'], '?name': ['RULE', 'TERMINAL'],


'maybe': ['_LBRA expansions _RBRA'], 'maybe': ['_LBRA expansions _RBRA'],
'range': ['STRING _DOT _DOT STRING'],
'range': ['STRING _DOTDOT STRING'],


'term': ['TERMINAL _COLON expansions _NL', 'term': ['TERMINAL _COLON expansions _NL',
'TERMINAL _DOT NUMBER _COLON expansions _NL'], 'TERMINAL _DOT NUMBER _COLON expansions _NL'],
@@ -196,7 +197,7 @@ class EBNF_to_BNF(Transformer_InPlace):
mn = mx = int(args[0]) mn = mx = int(args[0])
else: else:
mn, mx = map(int, args) mn, mx = map(int, args)
if mx < mn:
if mx < mn or mn < 0:
raise GrammarError("Bad Range for %s (%d..%d isn't allowed)" % (rule, mn, mx)) raise GrammarError("Bad Range for %s (%d..%d isn't allowed)" % (rule, mn, mx))
return ST('expansions', [ST('expansion', [rule] * n) for n in range(mn, mx+1)]) return ST('expansions', [ST('expansion', [rule] * n) for n in range(mn, mx+1)])
assert False, op assert False, op
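This check guards the grammar-side repetition operator `~`. A tiny sketch of what it accepts and rejects; the rule and terminal names are invented:

```python
from lark import Lark

parser = Lark('''
start: DIGIT ~ 3 "-" DIGIT ~ 4
DIGIT: "0".."9"
''', parser='lalr')
print(parser.parse("212-1000"))

# A reversed range such as  DIGIT ~ 5..2, or a negative count such as  DIGIT ~ -1
# (possible now that NUMBER accepts a sign), raises GrammarError at load time.
```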
@@ -205,7 +206,7 @@ class EBNF_to_BNF(Transformer_InPlace):
keep_all_tokens = self.rule_options and self.rule_options.keep_all_tokens keep_all_tokens = self.rule_options and self.rule_options.keep_all_tokens


def will_not_get_removed(sym): def will_not_get_removed(sym):
if isinstance(sym, NonTerminal):
if isinstance(sym, NonTerminal):
return not sym.name.startswith('_') return not sym.name.startswith('_')
if isinstance(sym, Terminal): if isinstance(sym, Terminal):
return keep_all_tokens or not sym.filter_out return keep_all_tokens or not sym.filter_out
@@ -345,28 +346,6 @@ def _rfind(s, choices):






def _fix_escaping(s):
w = ''
i = iter(s)
for n in i:
w += n
if n == '\\':
n2 = next(i)
if n2 == '\\':
w += '\\\\'
elif n2 not in 'uxnftr':
w += '\\'
w += n2
w = w.replace('\\"', '"').replace("'", "\\'")

to_eval = "u'''%s'''" % w
try:
s = literal_eval(to_eval)
except SyntaxError as e:
raise ValueError(s, e)

return s



def _literal_to_pattern(literal): def _literal_to_pattern(literal):
v = literal.value v = literal.value
@@ -379,7 +358,7 @@ def _literal_to_pattern(literal):
assert v[0] == v[-1] and v[0] in '"/' assert v[0] == v[-1] and v[0] in '"/'
x = v[1:-1] x = v[1:-1]


s = _fix_escaping(x)
s = eval_escaping(x)


if literal.type == 'STRING': if literal.type == 'STRING':
s = s.replace('\\\\', '\\') s = s.replace('\\\\', '\\')
@@ -397,7 +376,7 @@ class PrepareLiterals(Transformer_InPlace):
assert start.type == end.type == 'STRING' assert start.type == end.type == 'STRING'
start = start.value[1:-1] start = start.value[1:-1]
end = end.value[1:-1] end = end.value[1:-1]
assert len(_fix_escaping(start)) == len(_fix_escaping(end)) == 1, (start, end, len(_fix_escaping(start)), len(_fix_escaping(end)))
assert len(eval_escaping(start)) == len(eval_escaping(end)) == 1, (start, end, len(eval_escaping(start)), len(eval_escaping(end)))
regexp = '[%s-%s]' % (start, end) regexp = '[%s-%s]' % (start, end)
return ST('pattern', [PatternRE(regexp)]) return ST('pattern', [PatternRE(regexp)])


@@ -451,9 +430,9 @@ class PrepareSymbols(Transformer_InPlace):
if isinstance(v, Tree): if isinstance(v, Tree):
return v return v
elif v.type == 'RULE': elif v.type == 'RULE':
return NonTerminal(v.value)
return NonTerminal(Str(v.value))
elif v.type == 'TERMINAL': elif v.type == 'TERMINAL':
return Terminal(v.value, filter_out=v.startswith('_'))
return Terminal(Str(v.value), filter_out=v.startswith('_'))
assert False assert False


def _choice_of_rules(rules): def _choice_of_rules(rules):
@@ -465,7 +444,7 @@ class Grammar:
self.rule_defs = rule_defs self.rule_defs = rule_defs
self.ignore = ignore self.ignore = ignore


def compile(self):
def compile(self, start):
# We change the trees in-place (to support huge grammars) # We change the trees in-place (to support huge grammars)
# So deepcopy allows calling compile more than once. # So deepcopy allows calling compile more than once.
term_defs = deepcopy(list(self.term_defs)) term_defs = deepcopy(list(self.term_defs))
@@ -476,7 +455,7 @@ class Grammar:
# =================== # ===================


# Convert terminal-trees to strings/regexps # Convert terminal-trees to strings/regexps
transformer = PrepareLiterals() * TerminalTreeToPattern()
for name, (term_tree, priority) in term_defs: for name, (term_tree, priority) in term_defs:
if term_tree is None: # Terminal added through %declare if term_tree is None: # Terminal added through %declare
continue continue
@@ -484,7 +463,8 @@ class Grammar:
if len(expansions) == 1 and not expansions[0].children: if len(expansions) == 1 and not expansions[0].children:
raise GrammarError("Terminals cannot be empty (%s)" % name) raise GrammarError("Terminals cannot be empty (%s)" % name)


terminals = [TerminalDef(name, transformer.transform(term_tree), priority)
transformer = PrepareLiterals() * TerminalTreeToPattern()
terminals = [TerminalDef(name, transformer.transform( term_tree ), priority)
for name, (term_tree, priority) in term_defs if term_tree] for name, (term_tree, priority) in term_defs if term_tree]


# ================= # =================
@@ -498,7 +478,8 @@ class Grammar:
ebnf_to_bnf = EBNF_to_BNF() ebnf_to_bnf = EBNF_to_BNF()
rules = [] rules = []
for name, rule_tree, options in rule_defs: for name, rule_tree, options in rule_defs:
ebnf_to_bnf.rule_options = RuleOptions(keep_all_tokens=True) if options and options.keep_all_tokens else None
ebnf_to_bnf.rule_options = RuleOptions(keep_all_tokens=True) if options.keep_all_tokens else None
ebnf_to_bnf.prefix = name
tree = transformer.transform(rule_tree) tree = transformer.transform(rule_tree)
res = ebnf_to_bnf.transform(tree) res = ebnf_to_bnf.transform(tree)
rules.append((name, res, options)) rules.append((name, res, options))
@@ -511,18 +492,18 @@ class Grammar:


simplify_rule = SimplifyRule_Visitor() simplify_rule = SimplifyRule_Visitor()
compiled_rules = [] compiled_rules = []
for i, rule_content in enumerate(rules):
for rule_content in rules:
name, tree, options = rule_content name, tree, options = rule_content
simplify_rule.visit(tree) simplify_rule.visit(tree)
expansions = rule_tree_to_text.transform(tree) expansions = rule_tree_to_text.transform(tree)


for expansion, alias in expansions:
for i, (expansion, alias) in enumerate(expansions):
if alias and name.startswith('_'): if alias and name.startswith('_'):
raise GrammarError("Rule %s is marked for expansion (it starts with an underscore) and isn't allowed to have aliases (alias=%s)" % (name, alias)) raise GrammarError("Rule %s is marked for expansion (it starts with an underscore) and isn't allowed to have aliases (alias=%s)" % (name, alias))


empty_indices = [x==_EMPTY for i, x in enumerate(expansion)]
empty_indices = [x==_EMPTY for x in expansion]
if any(empty_indices): if any(empty_indices):
exp_options = copy(options) if options else RuleOptions()
exp_options = copy(options) or RuleOptions()
exp_options.empty_indices = empty_indices exp_options.empty_indices = empty_indices
expansion = [x for x in expansion if x!=_EMPTY] expansion = [x for x in expansion if x!=_EMPTY]
else: else:
@@ -538,7 +519,8 @@ class Grammar:
for dups in duplicates.values(): for dups in duplicates.values():
if len(dups) > 1: if len(dups) > 1:
if dups[0].expansion: if dups[0].expansion:
raise GrammarError("Rules defined twice: %s" % ', '.join(str(i) for i in duplicates))
raise GrammarError("Rules defined twice: %s\n\n(Might happen due to colliding expansion of optionals: [] or ?)"
% ''.join('\n * %s' % i for i in dups))


# Empty rule; assert all other attributes are equal # Empty rule; assert all other attributes are equal
assert len({(r.alias, r.order, r.options) for r in dups}) == len(dups) assert len({(r.alias, r.order, r.options) for r in dups}) == len(dups)
@@ -546,6 +528,19 @@ class Grammar:
# Remove duplicates # Remove duplicates
compiled_rules = list(set(compiled_rules)) compiled_rules = list(set(compiled_rules))



# Filter out unused rules
while True:
c = len(compiled_rules)
used_rules = {s for r in compiled_rules
for s in r.expansion
if isinstance(s, NonTerminal)
and s != r.origin}
used_rules |= {NonTerminal(s) for s in start}
compiled_rules = [r for r in compiled_rules if r.origin in used_rules]
if len(compiled_rules) == c:
break

# Filter out unused terminals # Filter out unused terminals
used_terms = {t.name for r in compiled_rules used_terms = {t.name for r in compiled_rules
for t in r.expansion for t in r.expansion
@@ -563,13 +558,13 @@ def import_grammar(grammar_path, base_paths=[]):
for import_path in import_paths: for import_path in import_paths:
with suppress(IOError): with suppress(IOError):
joined_path = os.path.join(import_path, grammar_path) joined_path = os.path.join(import_path, grammar_path)
with open(joined_path) as f:
with open(joined_path, encoding='utf8') as f:
text = f.read() text = f.read()
grammar = load_grammar(text, joined_path) grammar = load_grammar(text, joined_path)
_imported_grammars[grammar_path] = grammar _imported_grammars[grammar_path] = grammar
break break
else: else:
open(grammar_path)
open(grammar_path, encoding='utf8')
assert False assert False


return _imported_grammars[grammar_path] return _imported_grammars[grammar_path]
@@ -592,7 +587,9 @@ def import_from_grammar_into_namespace(grammar, namespace, aliases):
_, tree, _ = imported_rules[symbol] _, tree, _ = imported_rules[symbol]
except KeyError: except KeyError:
raise GrammarError("Missing symbol '%s' in grammar %s" % (symbol, namespace)) raise GrammarError("Missing symbol '%s' in grammar %s" % (symbol, namespace))
return tree.scan_values(lambda x: x.type in ('RULE', 'TERMINAL'))

return _find_used_symbols(tree)



def get_namespace_name(name): def get_namespace_name(name):
try: try:
@@ -620,11 +617,10 @@ def import_from_grammar_into_namespace(grammar, namespace, aliases):




def resolve_term_references(term_defs): def resolve_term_references(term_defs):
# TODO Cycles detection
# TODO Solve with transitive closure (maybe) # TODO Solve with transitive closure (maybe)


token_dict = {k:t for k, (t,_p) in term_defs}
assert len(token_dict) == len(term_defs), "Same name defined twice?"
term_dict = {k:t for k, (t,_p) in term_defs}
assert len(term_dict) == len(term_defs), "Same name defined twice?"


while True: while True:
changed = False changed = False
@@ -637,11 +633,21 @@ def resolve_term_references(term_defs):
if item.type == 'RULE': if item.type == 'RULE':
raise GrammarError("Rules aren't allowed inside terminals (%s in %s)" % (item, name)) raise GrammarError("Rules aren't allowed inside terminals (%s in %s)" % (item, name))
if item.type == 'TERMINAL': if item.type == 'TERMINAL':
exp.children[0] = token_dict[item]
term_value = term_dict[item]
assert term_value is not None
exp.children[0] = term_value
changed = True changed = True
if not changed: if not changed:
break break


for name, term in term_dict.items():
if term: # Not just declared
for child in term.children:
ids = [id(x) for x in child.iter_subtrees()]
if id(term) in ids:
raise GrammarError("Recursion in terminal '%s' (recursion is only allowed in rules, not terminals)" % name)


def options_from_rule(name, *x): def options_from_rule(name, *x):
if len(x) > 1: if len(x) > 1:
priority, expansions = x priority, expansions = x
@@ -669,6 +675,11 @@ class PrepareGrammar(Transformer_InPlace):
return name return name




def _find_used_symbols(tree):
assert tree.data == 'expansions'
return {t for x in tree.find_data('expansion')
for t in x.scan_values(lambda t: t.type in ('RULE', 'TERMINAL'))}

class GrammarLoader: class GrammarLoader:
def __init__(self): def __init__(self):
terminals = [TerminalDef(name, PatternRE(value)) for name, value in TERMINALS.items()] terminals = [TerminalDef(name, PatternRE(value)) for name, value in TERMINALS.items()]
@@ -678,7 +689,7 @@ class GrammarLoader:
callback = ParseTreeBuilder(rules, ST).create_callback() callback = ParseTreeBuilder(rules, ST).create_callback()
lexer_conf = LexerConf(terminals, ['WS', 'COMMENT']) lexer_conf = LexerConf(terminals, ['WS', 'COMMENT'])


parser_conf = ParserConf(rules, callback, 'start')
parser_conf = ParserConf(rules, callback, ['start'])
self.parser = LALR_TraditionalLexer(lexer_conf, parser_conf) self.parser = LALR_TraditionalLexer(lexer_conf, parser_conf)


self.canonize_tree = CanonizeTree() self.canonize_tree = CanonizeTree()
@@ -830,9 +841,7 @@ class GrammarLoader:
rule_names.add(name) rule_names.add(name)


for name, expansions, _o in rules: for name, expansions, _o in rules:
used_symbols = {t for x in expansions.find_data('expansion')
for t in x.scan_values(lambda t: t.type in ('RULE', 'TERMINAL'))}
for sym in used_symbols:
for sym in _find_used_symbols(expansions):
if sym.type == 'TERMINAL': if sym.type == 'TERMINAL':
if sym not in terminal_names: if sym not in terminal_names:
raise GrammarError("Token '%s' used but not defined (in rule %s)" % (sym, name)) raise GrammarError("Token '%s' used but not defined (in rule %s)" % (sym, name))


+ 33  - 9   lark/parse_tree_builder.py

@@ -2,6 +2,8 @@ from .exceptions import GrammarError
from .lexer import Token from .lexer import Token
from .tree import Tree from .tree import Tree
from .visitors import InlineTransformer # XXX Deprecated from .visitors import InlineTransformer # XXX Deprecated
from .visitors import Transformer_InPlace
from .visitors import _vargs_meta, _vargs_meta_inline


###{standalone ###{standalone
from functools import partial, wraps from functools import partial, wraps
@@ -27,7 +29,7 @@ class PropagatePositions:


if isinstance(res, Tree): if isinstance(res, Tree):
for c in children: for c in children:
if isinstance(c, Tree) and c.children and not c.meta.empty:
if isinstance(c, Tree) and not c.meta.empty:
res.meta.line = c.meta.line res.meta.line = c.meta.line
res.meta.column = c.meta.column res.meta.column = c.meta.column
res.meta.start_pos = c.meta.start_pos res.meta.start_pos = c.meta.start_pos
@@ -41,7 +43,7 @@ class PropagatePositions:
break break


for c in reversed(children): for c in reversed(children):
if isinstance(c, Tree) and c.children and not c.meta.empty:
if isinstance(c, Tree) and not c.meta.empty:
res.meta.end_line = c.meta.end_line res.meta.end_line = c.meta.end_line
res.meta.end_column = c.meta.end_column res.meta.end_column = c.meta.end_column
res.meta.end_pos = c.meta.end_pos res.meta.end_pos = c.meta.end_pos
@@ -50,7 +52,7 @@ class PropagatePositions:
elif isinstance(c, Token): elif isinstance(c, Token):
res.meta.end_line = c.end_line res.meta.end_line = c.end_line
res.meta.end_column = c.end_column res.meta.end_column = c.end_column
res.meta.end_pos = c.pos_in_stream + len(c.value)
res.meta.end_pos = c.end_pos
res.meta.empty = False res.meta.empty = False
break break


@@ -193,6 +195,23 @@ def ptb_inline_args(func):
return func(*children) return func(*children)
return f return f


def inplace_transformer(func):
@wraps(func)
def f(children):
# function name in a Transformer is a rule name.
tree = Tree(func.__name__, children)
return func(tree)
return f

def apply_visit_wrapper(func, name, wrapper):
if wrapper is _vargs_meta or wrapper is _vargs_meta_inline:
raise NotImplementedError("Meta args not supported for internal transformer")
@wraps(func)
def f(children):
return wrapper(func, name, children, None)
return f


class ParseTreeBuilder: class ParseTreeBuilder:
def __init__(self, rules, tree_class, propagate_positions=False, keep_all_tokens=False, ambiguous=False, maybe_placeholders=False): def __init__(self, rules, tree_class, propagate_positions=False, keep_all_tokens=False, ambiguous=False, maybe_placeholders=False):
self.tree_class = tree_class self.tree_class = tree_class
@@ -206,12 +225,12 @@ class ParseTreeBuilder:
def _init_builders(self, rules): def _init_builders(self, rules):
for rule in rules: for rule in rules:
options = rule.options options = rule.options
keep_all_tokens = self.always_keep_all_tokens or (options.keep_all_tokens if options else False)
expand_single_child = options.expand1 if options else False
keep_all_tokens = self.always_keep_all_tokens or options.keep_all_tokens
expand_single_child = options.expand1


wrapper_chain = list(filter(None, [ wrapper_chain = list(filter(None, [
(expand_single_child and not rule.alias) and ExpandSingleChild, (expand_single_child and not rule.alias) and ExpandSingleChild,
maybe_create_child_filter(rule.expansion, keep_all_tokens, self.ambiguous, options.empty_indices if self.maybe_placeholders and options else None),
maybe_create_child_filter(rule.expansion, keep_all_tokens, self.ambiguous, options.empty_indices if self.maybe_placeholders else None),
self.propagate_positions and PropagatePositions, self.propagate_positions and PropagatePositions,
self.ambiguous and maybe_create_ambiguous_expander(self.tree_class, rule.expansion, keep_all_tokens), self.ambiguous and maybe_create_ambiguous_expander(self.tree_class, rule.expansion, keep_all_tokens),
])) ]))
@@ -227,10 +246,15 @@ class ParseTreeBuilder:
user_callback_name = rule.alias or rule.origin.name user_callback_name = rule.alias or rule.origin.name
try: try:
f = getattr(transformer, user_callback_name) f = getattr(transformer, user_callback_name)
assert not getattr(f, 'meta', False), "Meta args not supported for internal transformer"
# XXX InlineTransformer is deprecated! # XXX InlineTransformer is deprecated!
if getattr(f, 'inline', False) or isinstance(transformer, InlineTransformer):
f = ptb_inline_args(f)
wrapper = getattr(f, 'visit_wrapper', None)
if wrapper is not None:
f = apply_visit_wrapper(f, user_callback_name, wrapper)
else:
if isinstance(transformer, InlineTransformer):
f = ptb_inline_args(f)
elif isinstance(transformer, Transformer_InPlace):
f = inplace_transformer(f)
except AttributeError: except AttributeError:
f = partial(self.tree_class, user_callback_name) f = partial(self.tree_class, user_callback_name)




+ 41  - 18   lark/parser_frontends.py

@@ -44,18 +44,28 @@ def get_frontend(parser, lexer):
raise ValueError('Unknown parser: %s' % parser) raise ValueError('Unknown parser: %s' % parser)




class _ParserFrontend(Serialize):
def _parse(self, input, start, *args):
if start is None:
start = self.start
if len(start) > 1:
raise ValueError("Lark initialized with more than 1 possible start rule. Must specify which start rule to parse", start)
start ,= start
return self.parser.parse(input, start, *args)




class WithLexer(Serialize):
class WithLexer(_ParserFrontend):
lexer = None lexer = None
parser = None parser = None
lexer_conf = None lexer_conf = None
start = None


__serialize_fields__ = 'parser', 'lexer_conf'
__serialize_fields__ = 'parser', 'lexer_conf', 'start'
__serialize_namespace__ = LexerConf, __serialize_namespace__ = LexerConf,


def __init__(self, lexer_conf, parser_conf, options=None): def __init__(self, lexer_conf, parser_conf, options=None):
self.lexer_conf = lexer_conf self.lexer_conf = lexer_conf
self.start = parser_conf.start
self.postlex = lexer_conf.postlex self.postlex = lexer_conf.postlex


@classmethod @classmethod
@@ -65,18 +75,17 @@ class WithLexer(Serialize):
inst.parser = LALR_Parser.deserialize(inst.parser, memo, callbacks) inst.parser = LALR_Parser.deserialize(inst.parser, memo, callbacks)
inst.init_lexer() inst.init_lexer()
return inst return inst
def _serialize(self, data, memo): def _serialize(self, data, memo):
data['parser'] = data['parser'].serialize(memo) data['parser'] = data['parser'].serialize(memo)


def lex(self, text):
stream = self.lexer.lex(text)
def lex(self, *args):
stream = self.lexer.lex(*args)
return self.postlex.process(stream) if self.postlex else stream return self.postlex.process(stream) if self.postlex else stream


def parse(self, text):
def parse(self, text, start=None):
token_stream = self.lex(text) token_stream = self.lex(text)
sps = self.lexer.set_parser_state
return self.parser.parse(token_stream, *[sps] if sps is not NotImplemented else [])
return self._parse(token_stream, start)


def init_traditional_lexer(self): def init_traditional_lexer(self):
self.lexer = TraditionalLexer(self.lexer_conf.tokens, ignore=self.lexer_conf.ignore, user_callbacks=self.lexer_conf.callbacks) self.lexer = TraditionalLexer(self.lexer_conf.tokens, ignore=self.lexer_conf.ignore, user_callbacks=self.lexer_conf.callbacks)
@@ -104,14 +113,24 @@ class LALR_ContextualLexer(LALR_WithLexer):
ignore=self.lexer_conf.ignore, ignore=self.lexer_conf.ignore,
always_accept=always_accept, always_accept=always_accept,
user_callbacks=self.lexer_conf.callbacks) user_callbacks=self.lexer_conf.callbacks)


def parse(self, text, start=None):
parser_state = [None]
def set_parser_state(s):
parser_state[0] = s

token_stream = self.lex(text, lambda: parser_state[0])
return self._parse(token_stream, start, set_parser_state)
###} ###}


class LALR_CustomLexer(LALR_WithLexer): class LALR_CustomLexer(LALR_WithLexer):
def __init__(self, lexer_cls, lexer_conf, parser_conf, options=None): def __init__(self, lexer_cls, lexer_conf, parser_conf, options=None):
pass # TODO
self.lexer = lexer_cls(lexer_conf)
debug = options.debug if options else False
self.parser = LALR_Parser(parser_conf, debug=debug)
WithLexer.__init__(self, lexer_conf, parser_conf, options)


def init_lexer(self):
self.lexer = lexer_cls(self.lexer_conf)


def tokenize_text(text): def tokenize_text(text):
line = 1 line = 1
@@ -128,22 +147,26 @@ class Earley(WithLexer):
self.init_traditional_lexer() self.init_traditional_lexer()


resolve_ambiguity = options.ambiguity == 'resolve' resolve_ambiguity = options.ambiguity == 'resolve'
self.parser = earley.Parser(parser_conf, self.match, resolve_ambiguity=resolve_ambiguity)
debug = options.debug if options else False
self.parser = earley.Parser(parser_conf, self.match, resolve_ambiguity=resolve_ambiguity, debug=debug)


def match(self, term, token): def match(self, term, token):
return term.name == token.type return term.name == token.type




class XEarley:
class XEarley(_ParserFrontend):
def __init__(self, lexer_conf, parser_conf, options=None, **kw): def __init__(self, lexer_conf, parser_conf, options=None, **kw):
self.token_by_name = {t.name:t for t in lexer_conf.tokens} self.token_by_name = {t.name:t for t in lexer_conf.tokens}
self.start = parser_conf.start


self._prepare_match(lexer_conf) self._prepare_match(lexer_conf)
resolve_ambiguity = options.ambiguity == 'resolve' resolve_ambiguity = options.ambiguity == 'resolve'
debug = options.debug if options else False
self.parser = xearley.Parser(parser_conf, self.parser = xearley.Parser(parser_conf,
self.match, self.match,
ignore=lexer_conf.ignore, ignore=lexer_conf.ignore,
resolve_ambiguity=resolve_ambiguity, resolve_ambiguity=resolve_ambiguity,
debug=debug,
**kw **kw
) )


@@ -166,8 +189,8 @@ class XEarley:


self.regexps[t.name] = re.compile(regexp) self.regexps[t.name] = re.compile(regexp)


def parse(self, text):
return self.parser.parse(text)
def parse(self, text, start):
return self._parse(text, start)


class XEarley_CompleteLex(XEarley): class XEarley_CompleteLex(XEarley):
def __init__(self, *args, **kw): def __init__(self, *args, **kw):
@@ -182,13 +205,13 @@ class CYK(WithLexer):
self.init_traditional_lexer() self.init_traditional_lexer()


self._analysis = GrammarAnalyzer(parser_conf) self._analysis = GrammarAnalyzer(parser_conf)
self._parser = cyk.Parser(parser_conf.rules, parser_conf.start)
self.parser = cyk.Parser(parser_conf.rules)


self.callbacks = parser_conf.callbacks self.callbacks = parser_conf.callbacks


def parse(self, text):
def parse(self, text, start):
tokens = list(self.lex(text)) tokens = list(self.lex(text))
parse = self._parser.parse(tokens)
parse = self._parse(tokens, start)
parse = self._transform(parse) parse = self._transform(parse)
return parse return parse




+ 8  - 6   lark/parsers/cyk.py

@@ -84,12 +84,11 @@ class RuleNode(object):
class Parser(object): class Parser(object):
"""Parser wrapper.""" """Parser wrapper."""


def __init__(self, rules, start):
def __init__(self, rules):
super(Parser, self).__init__() super(Parser, self).__init__()
self.orig_rules = {rule: rule for rule in rules} self.orig_rules = {rule: rule for rule in rules}
rules = [self._to_rule(rule) for rule in rules] rules = [self._to_rule(rule) for rule in rules]
self.grammar = to_cnf(Grammar(rules)) self.grammar = to_cnf(Grammar(rules))
self.start = NT(start)


def _to_rule(self, lark_rule): def _to_rule(self, lark_rule):
"""Converts a lark rule, (lhs, rhs, callback, options), to a Rule.""" """Converts a lark rule, (lhs, rhs, callback, options), to a Rule."""
@@ -97,16 +96,19 @@ class Parser(object):
assert all(isinstance(x, Symbol) for x in lark_rule.expansion) assert all(isinstance(x, Symbol) for x in lark_rule.expansion)
return Rule( return Rule(
lark_rule.origin, lark_rule.expansion, lark_rule.origin, lark_rule.expansion,
weight=lark_rule.options.priority if lark_rule.options and lark_rule.options.priority else 0,
weight=lark_rule.options.priority if lark_rule.options.priority else 0,
alias=lark_rule) alias=lark_rule)


def parse(self, tokenized): # pylint: disable=invalid-name
def parse(self, tokenized, start): # pylint: disable=invalid-name
"""Parses input, which is a list of tokens.""" """Parses input, which is a list of tokens."""
assert start
start = NT(start)

table, trees = _parse(tokenized, self.grammar) table, trees = _parse(tokenized, self.grammar)
# Check if the parse succeeded. # Check if the parse succeeded.
if all(r.lhs != self.start for r in table[(0, len(tokenized) - 1)]):
if all(r.lhs != start for r in table[(0, len(tokenized) - 1)]):
raise ParseError('Parsing failed.') raise ParseError('Parsing failed.')
parse = trees[(0, len(tokenized) - 1)][self.start]
parse = trees[(0, len(tokenized) - 1)][start]
return self._to_tree(revert_cnf(parse)) return self._to_tree(revert_cnf(parse))


def _to_tree(self, rule_node): def _to_tree(self, rule_node):


+ 25  - 14   lark/parsers/earley.py

@@ -10,20 +10,22 @@ is better documented here:
http://www.bramvandersanden.com/post/2014/06/shared-packed-parse-forest/ http://www.bramvandersanden.com/post/2014/06/shared-packed-parse-forest/
""" """


import logging
from collections import deque from collections import deque


from ..visitors import Transformer_InPlace, v_args from ..visitors import Transformer_InPlace, v_args
from ..exceptions import ParseError, UnexpectedToken
from ..exceptions import UnexpectedEOF, UnexpectedToken
from .grammar_analysis import GrammarAnalyzer from .grammar_analysis import GrammarAnalyzer
from ..grammar import NonTerminal from ..grammar import NonTerminal
from .earley_common import Item, TransitiveItem from .earley_common import Item, TransitiveItem
from .earley_forest import ForestToTreeVisitor, ForestSumVisitor, SymbolNode, ForestToAmbiguousTreeVisitor from .earley_forest import ForestToTreeVisitor, ForestSumVisitor, SymbolNode, ForestToAmbiguousTreeVisitor


class Parser: class Parser:
def __init__(self, parser_conf, term_matcher, resolve_ambiguity=True):
def __init__(self, parser_conf, term_matcher, resolve_ambiguity=True, debug=False):
analysis = GrammarAnalyzer(parser_conf) analysis = GrammarAnalyzer(parser_conf)
self.parser_conf = parser_conf self.parser_conf = parser_conf
self.resolve_ambiguity = resolve_ambiguity self.resolve_ambiguity = resolve_ambiguity
self.debug = debug


self.FIRST = analysis.FIRST self.FIRST = analysis.FIRST
self.NULLABLE = analysis.NULLABLE self.NULLABLE = analysis.NULLABLE
@@ -43,13 +45,9 @@ class Parser:
# the priorities will be stripped from all rules before they reach us, allowing us to # the priorities will be stripped from all rules before they reach us, allowing us to
# skip the extra tree walk. We'll also skip this if the user just didn't specify priorities # skip the extra tree walk. We'll also skip this if the user just didn't specify priorities
# on any rules. # on any rules.
if self.forest_sum_visitor is None and rule.options and rule.options.priority is not None:
self.forest_sum_visitor = ForestSumVisitor()
if self.forest_sum_visitor is None and rule.options.priority is not None:
self.forest_sum_visitor = ForestSumVisitor


if resolve_ambiguity:
self.forest_tree_visitor = ForestToTreeVisitor(self.callbacks, self.forest_sum_visitor)
else:
self.forest_tree_visitor = ForestToAmbiguousTreeVisitor(self.callbacks, self.forest_sum_visitor)
self.term_matcher = term_matcher self.term_matcher = term_matcher




@@ -272,9 +270,11 @@ class Parser:


## Column is now the final column in the parse. ## Column is now the final column in the parse.
assert i == len(columns)-1 assert i == len(columns)-1
return to_scan


def parse(self, stream, start_symbol=None):
start_symbol = NonTerminal(start_symbol or self.parser_conf.start)
def parse(self, stream, start):
assert start, start
start_symbol = NonTerminal(start)


columns = [set()] columns = [set()]
to_scan = set() # The scan buffer. 'Q' in E.Scott's paper. to_scan = set() # The scan buffer. 'Q' in E.Scott's paper.
@@ -289,22 +289,33 @@ class Parser:
else: else:
columns[0].add(item) columns[0].add(item)


self._parse(stream, columns, to_scan, start_symbol)
to_scan = self._parse(stream, columns, to_scan, start_symbol)


# If the parse was successful, the start # If the parse was successful, the start
# symbol should have been completed in the last step of the Earley cycle, and will be in # symbol should have been completed in the last step of the Earley cycle, and will be in
# this column. Find the item for the start_symbol, which is the root of the SPPF tree. # this column. Find the item for the start_symbol, which is the root of the SPPF tree.
solutions = [n.node for n in columns[-1] if n.is_complete and n.node is not None and n.s == start_symbol and n.start == 0] solutions = [n.node for n in columns[-1] if n.is_complete and n.node is not None and n.s == start_symbol and n.start == 0]
if self.debug:
from .earley_forest import ForestToPyDotVisitor
try:
debug_walker = ForestToPyDotVisitor()
except ImportError:
logging.warning("Cannot find dependency 'pydot', will not generate sppf debug image")
else:
debug_walker.visit(solutions[0], "sppf.png")



if not solutions: if not solutions:
expected_tokens = [t.expect for t in to_scan] expected_tokens = [t.expect for t in to_scan]
# raise ParseError('Incomplete parse: Could not find a solution to input')
raise ParseError('Unexpected end of input! Expecting a terminal of: %s' % expected_tokens)
raise UnexpectedEOF(expected_tokens)
elif len(solutions) > 1: elif len(solutions) > 1:
assert False, 'Earley should not generate multiple start symbol items!' assert False, 'Earley should not generate multiple start symbol items!'


# Perform our SPPF -> AST conversion using the right ForestVisitor. # Perform our SPPF -> AST conversion using the right ForestVisitor.
return self.forest_tree_visitor.visit(solutions[0])
forest_tree_visitor_cls = ForestToTreeVisitor if self.resolve_ambiguity else ForestToAmbiguousTreeVisitor
forest_tree_visitor = forest_tree_visitor_cls(self.callbacks, self.forest_sum_visitor and self.forest_sum_visitor())

return forest_tree_visitor.visit(solutions[0])




class ApplyCallbacks(Transformer_InPlace): class ApplyCallbacks(Transformer_InPlace):
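When debug is enabled, the Earley frontend now tries to render the SPPF of a successful parse to sppf.png through pydot, and merely logs a warning if pydot is missing. A sketch; the grammar is invented and the optional pydot dependency is assumed for the image:

```python
from lark import Lark

parser = Lark('''
start: "a" start | "a"
''', parser='earley', debug=True)

# On success, an 'sppf.png' visualisation of the shared packed parse forest
# is written to the working directory, provided pydot is importable.
parser.parse("aaa")
```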


+ 3  - 3   lark/parsers/earley_forest.py

@@ -122,7 +122,7 @@ class PackedNode(ForestNode):
ambiguously. Hence, we use the sort order to identify ambiguously. Hence, we use the sort order to identify
the order in which ambiguous children should be considered. the order in which ambiguous children should be considered.
""" """
return self.is_empty, -self.priority, -self.rule.order
return self.is_empty, -self.priority, self.rule.order


def __iter__(self): def __iter__(self):
return iter([self.left, self.right]) return iter([self.left, self.right])
@@ -195,7 +195,7 @@ class ForestVisitor(object):
continue continue


if id(next_node) in visiting: if id(next_node) in visiting:
raise ParseError("Infinite recursion in grammar!")
raise ParseError("Infinite recursion in grammar, in rule '%s'!" % next_node.s.name)


input_stack.append(next_node) input_stack.append(next_node)
continue continue
@@ -250,7 +250,7 @@ class ForestSumVisitor(ForestVisitor):
return iter(node.children) return iter(node.children)


def visit_packed_node_out(self, node): def visit_packed_node_out(self, node):
priority = node.rule.options.priority if not node.parent.is_intermediate and node.rule.options and node.rule.options.priority else 0
priority = node.rule.options.priority if not node.parent.is_intermediate and node.rule.options.priority else 0
priority += getattr(node.right, 'priority', 0) priority += getattr(node.right, 'priority', 0)
priority += getattr(node.left, 'priority', 0) priority += getattr(node.left, 'priority', 0)
node.priority = priority node.priority = priority


+ 46  - 14   lark/parsers/grammar_analysis.py

@@ -1,4 +1,4 @@
from collections import Counter
from collections import Counter, defaultdict


from ..utils import bfs, fzset, classify from ..utils import bfs, fzset, classify
from ..exceptions import GrammarError from ..exceptions import GrammarError
@@ -37,8 +37,22 @@ class RulePtr(object):
return hash((self.rule, self.index)) return hash((self.rule, self.index))




# state generation ensures no duplicate LR0ItemSets
class LR0ItemSet(object):
__slots__ = ('kernel', 'closure', 'transitions', 'lookaheads')

def __init__(self, kernel, closure):
self.kernel = fzset(kernel)
self.closure = fzset(closure)
self.transitions = {}
self.lookaheads = defaultdict(set)

def __repr__(self):
return '{%s | %s}' % (', '.join([repr(r) for r in self.kernel]), ', '.join([repr(r) for r in self.closure]))


def update_set(set1, set2): def update_set(set1, set2):
if not set2:
if not set2 or set1 > set2:
return False return False


copy = set(set1) copy = set(set1)
@@ -85,6 +99,8 @@ def calculate_sets(rules):
if set(rule.expansion[:i]) <= NULLABLE: if set(rule.expansion[:i]) <= NULLABLE:
if update_set(FIRST[rule.origin], FIRST[sym]): if update_set(FIRST[rule.origin], FIRST[sym]):
changed = True changed = True
else:
break


# Calculate FOLLOW # Calculate FOLLOW
changed = True changed = True
@@ -109,7 +125,10 @@ class GrammarAnalyzer(object):
def __init__(self, parser_conf, debug=False): def __init__(self, parser_conf, debug=False):
self.debug = debug self.debug = debug


rules = parser_conf.rules + [Rule(NonTerminal('$root'), [NonTerminal(parser_conf.start), Terminal('$END')])]
root_rules = {start: Rule(NonTerminal('$root_' + start), [NonTerminal(start), Terminal('$END')])
for start in parser_conf.start}

rules = parser_conf.rules + list(root_rules.values())
self.rules_by_origin = classify(rules, lambda r: r.origin) self.rules_by_origin = classify(rules, lambda r: r.origin)


if len(rules) != len(set(rules)): if len(rules) != len(set(rules)):
@@ -121,17 +140,37 @@ class GrammarAnalyzer(object):
if not (sym.is_term or sym in self.rules_by_origin): if not (sym.is_term or sym in self.rules_by_origin):
raise GrammarError("Using an undefined rule: %s" % sym) # TODO test validation raise GrammarError("Using an undefined rule: %s" % sym) # TODO test validation


self.start_state = self.expand_rule(NonTerminal('$root'))
self.start_states = {start: self.expand_rule(root_rule.origin)
for start, root_rule in root_rules.items()}

self.end_states = {start: fzset({RulePtr(root_rule, len(root_rule.expansion))})
for start, root_rule in root_rules.items()}

lr0_root_rules = {start: Rule(NonTerminal('$root_' + start), [NonTerminal(start)])
for start in parser_conf.start}

lr0_rules = parser_conf.rules + list(lr0_root_rules.values())
assert(len(lr0_rules) == len(set(lr0_rules)))

self.lr0_rules_by_origin = classify(lr0_rules, lambda r: r.origin)

# cache RulePtr(r, 0) in r (no duplicate RulePtr objects)
self.lr0_start_states = {start: LR0ItemSet([RulePtr(root_rule, 0)], self.expand_rule(root_rule.origin, self.lr0_rules_by_origin))
for start, root_rule in lr0_root_rules.items()}


self.FIRST, self.FOLLOW, self.NULLABLE = calculate_sets(rules) self.FIRST, self.FOLLOW, self.NULLABLE = calculate_sets(rules)


def expand_rule(self, rule):
def expand_rule(self, source_rule, rules_by_origin=None):
"Returns all init_ptrs accessible by rule (recursive)" "Returns all init_ptrs accessible by rule (recursive)"

if rules_by_origin is None:
rules_by_origin = self.rules_by_origin

init_ptrs = set() init_ptrs = set()
def _expand_rule(rule): def _expand_rule(rule):
assert not rule.is_term, rule assert not rule.is_term, rule


for r in self.rules_by_origin[rule]:
for r in rules_by_origin[rule]:
init_ptr = RulePtr(r, 0) init_ptr = RulePtr(r, 0)
init_ptrs.add(init_ptr) init_ptrs.add(init_ptr)


@@ -140,14 +179,7 @@ class GrammarAnalyzer(object):
if not new_r.is_term: if not new_r.is_term:
yield new_r yield new_r


for _ in bfs([rule], _expand_rule):
for _ in bfs([source_rule], _expand_rule):
pass pass


return fzset(init_ptrs) return fzset(init_ptrs)

def _first(self, r):
if r.is_term:
return {r}
else:
return {rp.next for rp in self.expand_rule(r) if rp.next.is_term}


+ 199  - 52   lark/parsers/lalr_analysis.py

@@ -7,12 +7,12 @@ For now, shift/reduce conflicts are automatically resolved as shifts.
# Email : erezshin@gmail.com # Email : erezshin@gmail.com


import logging import logging
from collections import defaultdict
from collections import defaultdict, deque


from ..utils import classify, classify_bool, bfs, fzset, Serialize, Enumerator from ..utils import classify, classify_bool, bfs, fzset, Serialize, Enumerator
from ..exceptions import GrammarError from ..exceptions import GrammarError


from .grammar_analysis import GrammarAnalyzer, Terminal
from .grammar_analysis import GrammarAnalyzer, Terminal, LR0ItemSet
from ..grammar import Rule from ..grammar import Rule


###{standalone ###{standalone
@@ -28,11 +28,12 @@ class Action:
Shift = Action('Shift') Shift = Action('Shift')
Reduce = Action('Reduce') Reduce = Action('Reduce')



class ParseTable: class ParseTable:
def __init__(self, states, start_state, end_state):
def __init__(self, states, start_states, end_states):
self.states = states self.states = states
self.start_state = start_state
self.end_state = end_state
self.start_states = start_states
self.end_states = end_states


def serialize(self, memo): def serialize(self, memo):
tokens = Enumerator() tokens = Enumerator()
@@ -47,8 +48,8 @@ class ParseTable:
return { return {
'tokens': tokens.reversed(), 'tokens': tokens.reversed(),
'states': states, 'states': states,
'start_state': self.start_state,
'end_state': self.end_state,
'start_states': self.start_states,
'end_states': self.end_states,
} }


@classmethod @classmethod
@@ -59,7 +60,7 @@ class ParseTable:
for token, (action, arg) in actions.items()} for token, (action, arg) in actions.items()}
for state, actions in data['states'].items() for state, actions in data['states'].items()
} }
return cls(states, data['start_state'], data['end_state'])
return cls(states, data['start_states'], data['end_states'])




class IntParseTable(ParseTable): class IntParseTable(ParseTable):
@@ -76,66 +77,212 @@ class IntParseTable(ParseTable):
int_states[ state_to_idx[s] ] = la int_states[ state_to_idx[s] ] = la




start_state = state_to_idx[parse_table.start_state]
end_state = state_to_idx[parse_table.end_state]
return cls(int_states, start_state, end_state)
start_states = {start:state_to_idx[s] for start, s in parse_table.start_states.items()}
end_states = {start:state_to_idx[s] for start, s in parse_table.end_states.items()}
return cls(int_states, start_states, end_states)


###} ###}



# digraph and traverse, see The Theory and Practice of Compiler Writing

# computes F(x) = G(x) union (union { F(y) | x R y }), i.e. the closure of G over the relation R
# X: nodes
# R: relation (function mapping node -> list of nodes that satisfy the relation)
# G: set valued function
def digraph(X, R, G):
F = {}
S = []
N = {}
for x in X:
N[x] = 0
for x in X:
# this is always true for the first iteration, but N[x] may be updated in traverse below
if N[x] == 0:
traverse(x, S, N, X, R, G, F)
return F

# x: single node
# S: stack
# N: weights
# X: nodes
# R: relation (see above)
# G: set valued function
# F: set valued function we are computing (map of input -> output)
def traverse(x, S, N, X, R, G, F):
S.append(x)
d = len(S)
N[x] = d
F[x] = G[x]
for y in R[x]:
if N[y] == 0:
traverse(y, S, N, X, R, G, F)
n_x = N[x]
assert(n_x > 0)
n_y = N[y]
assert(n_y != 0)
if (n_y > 0) and (n_y < n_x):
N[x] = n_y
F[x].update(F[y])
if N[x] == d:
f_x = F[x]
while True:
z = S.pop()
N[z] = -1
F[z] = f_x
if z == x:
break
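digraph/traverse implement the DeRemer-Pennello closure step: they compute the smallest F with F(x) ⊇ G(x) and F(x) ⊇ F(y) whenever x R y, collapsing strongly connected components of R along the way. A standalone toy run of the same helper; the node names and sets are made up, and note that the sets in G are updated in place (which is why the analyzer passes defaultdict(set) values):

```python
from collections import defaultdict
from lark.parsers.lalr_analysis import digraph

# x R y:  a -> b -> c, plus a two-node cycle d <-> e
R = {'a': ['b'], 'b': ['c'], 'c': [], 'd': ['e'], 'e': ['d']}
G = defaultdict(set, {'a': {1}, 'b': {2}, 'c': {3}, 'd': {4}, 'e': {5}})

F = digraph(list(R), R, G)
# F == {'a': {1, 2, 3}, 'b': {2, 3}, 'c': {3}, 'd': {4, 5}, 'e': {4, 5}}
```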


class LALR_Analyzer(GrammarAnalyzer): class LALR_Analyzer(GrammarAnalyzer):
def __init__(self, parser_conf, debug=False):
GrammarAnalyzer.__init__(self, parser_conf, debug)
self.nonterminal_transitions = []
self.directly_reads = defaultdict(set)
self.reads = defaultdict(set)
self.includes = defaultdict(set)
self.lookback = defaultdict(set)



def compute_lookahead(self):
self.end_states = []
def compute_lr0_states(self):
self.lr0_states = set()
# map of kernels to LR0ItemSets
cache = {}


self.states = {}
def step(state): def step(state):
lookahead = defaultdict(list)
sat, unsat = classify_bool(state, lambda rp: rp.is_satisfied)
for rp in sat:
for term in self.FOLLOW.get(rp.rule.origin, ()):
lookahead[term].append((Reduce, rp.rule))
_, unsat = classify_bool(state.closure, lambda rp: rp.is_satisfied)


d = classify(unsat, lambda rp: rp.next) d = classify(unsat, lambda rp: rp.next)
for sym, rps in d.items(): for sym, rps in d.items():
rps = {rp.advance(sym) for rp in rps}
kernel = fzset({rp.advance(sym) for rp in rps})
new_state = cache.get(kernel, None)
if new_state is None:
closure = set(kernel)
for rp in kernel:
if not rp.is_satisfied and not rp.next.is_term:
closure |= self.expand_rule(rp.next, self.lr0_rules_by_origin)
new_state = LR0ItemSet(kernel, closure)
cache[kernel] = new_state

state.transitions[sym] = new_state
yield new_state


for rp in set(rps):
if not rp.is_satisfied and not rp.next.is_term:
rps |= self.expand_rule(rp.next)
self.lr0_states.add(state)


new_state = fzset(rps)
lookahead[sym].append((Shift, new_state))
if sym == Terminal('$END'):
self.end_states.append( new_state )
yield new_state
for _ in bfs(self.lr0_start_states.values(), step):
pass


for k, v in lookahead.items():
if len(v) > 1:
def compute_reads_relations(self):
# handle start state
for root in self.lr0_start_states.values():
assert(len(root.kernel) == 1)
for rp in root.kernel:
assert(rp.index == 0)
self.directly_reads[(root, rp.next)] = set([ Terminal('$END') ])

for state in self.lr0_states:
seen = set()
for rp in state.closure:
if rp.is_satisfied:
continue
s = rp.next
# if s is not a nonterminal
if s not in self.lr0_rules_by_origin:
continue
if s in seen:
continue
seen.add(s)
nt = (state, s)
self.nonterminal_transitions.append(nt)
dr = self.directly_reads[nt]
r = self.reads[nt]
next_state = state.transitions[s]
for rp2 in next_state.closure:
if rp2.is_satisfied:
continue
s2 = rp2.next
# if s2 is a terminal
if s2 not in self.lr0_rules_by_origin:
dr.add(s2)
if s2 in self.NULLABLE:
r.add((next_state, s2))

def compute_includes_lookback(self):
for nt in self.nonterminal_transitions:
state, nonterminal = nt
includes = []
lookback = self.lookback[nt]
for rp in state.closure:
if rp.rule.origin != nonterminal:
continue
# traverse the states for rp(.rule)
state2 = state
for i in range(rp.index, len(rp.rule.expansion)):
s = rp.rule.expansion[i]
nt2 = (state2, s)
state2 = state2.transitions[s]
if nt2 not in self.reads:
continue
for j in range(i + 1, len(rp.rule.expansion)):
if not rp.rule.expansion[j] in self.NULLABLE:
break
else:
includes.append(nt2)
# state2 is at the final state for rp.rule
if rp.index == 0:
for rp2 in state2.closure:
if (rp2.rule == rp.rule) and rp2.is_satisfied:
lookback.add((state2, rp2.rule))
for nt2 in includes:
self.includes[nt2].add(nt)

def compute_lookaheads(self):
read_sets = digraph(self.nonterminal_transitions, self.reads, self.directly_reads)
follow_sets = digraph(self.nonterminal_transitions, self.includes, read_sets)

for nt, lookbacks in self.lookback.items():
for state, rule in lookbacks:
for s in follow_sets[nt]:
state.lookaheads[s].add(rule)

def compute_lalr1_states(self):
m = {}
for state in self.lr0_states:
actions = {}
for la, next_state in state.transitions.items():
actions[la] = (Shift, next_state.closure)
for la, rules in state.lookaheads.items():
if len(rules) > 1:
raise GrammarError('Reduce/Reduce collision in %s between the following rules: %s' % (la, ''.join([ '\n\t\t- ' + str(r) for r in rules ])))
if la in actions:
if self.debug: if self.debug:
logging.warn("Shift/reduce conflict for terminal %s: (resolving as shift)", k.name)
for act, arg in v:
logging.warn(' * %s: %s', act, arg)
for x in v:
# XXX resolving shift/reduce into shift, like PLY
# Give a proper warning
if x[0] is Shift:
lookahead[k] = [x]

for k, v in lookahead.items():
if not len(v) == 1:
raise GrammarError("Collision in %s: %s" %(k, ', '.join(['\n * %s: %s' % x for x in v])))

self.states[state] = {k.name:v[0] for k, v in lookahead.items()}

for _ in bfs([self.start_state], step):
pass
logging.warning('Shift/Reduce conflict for terminal %s: (resolving as shift)', la.name)
logging.warning(' * %s', list(rules)[0])
else:
actions[la] = (Reduce, list(rules)[0])
m[state] = { k.name: v for k, v in actions.items() }


self.end_state ,= self.end_states
states = { k.closure: v for k, v in m.items() }


self._parse_table = ParseTable(self.states, self.start_state, self.end_state)
# compute end states
end_states = {}
for state in states:
for rp in state:
for start in self.lr0_start_states:
if rp.rule.origin.name == ('$root_' + start) and rp.is_satisfied:
assert(not start in end_states)
end_states[start] = state

_parse_table = ParseTable(states, { start: state.closure for start, state in self.lr0_start_states.items() }, end_states)


if self.debug: if self.debug:
self.parse_table = self._parse_table
self.parse_table = _parse_table
else: else:
self.parse_table = IntParseTable.from_ParseTable(self._parse_table)

self.parse_table = IntParseTable.from_ParseTable(_parse_table)

def compute_lalr(self):
self.compute_lr0_states()
self.compute_reads_relations()
self.compute_includes_lookback()
self.compute_lookaheads()
self.compute_lalr1_states()
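
Note: `compute_lookaheads` above feeds the `reads`/`includes` relations into a `digraph` fixed-point routine, i.e. the DeRemer–Pennello set-propagation algorithm. A minimal, self-contained sketch of that idea (names and structure here are illustrative, not Lark's actual helper):

```python
def digraph(nodes, relation, base_sets):
    """Propagate base_sets along `relation` until, at the fixed point,
    F[x] = base_sets[x] | union(F[y] for every y related to x).
    Nodes in one strongly connected component end up sharing a single set."""
    F = {}
    stack = []
    depth = {x: 0 for x in nodes}   # 0 = unvisited, inf = finished

    def traverse(x):
        stack.append(x)
        d = len(stack)
        depth[x] = d
        F[x] = set(base_sets.get(x, ()))
        for y in relation.get(x, ()):   # assumes every related y is itself a node
            if depth[y] == 0:
                traverse(y)
            depth[x] = min(depth[x], depth[y])
            F[x] |= F[y]
        if depth[x] == d:               # x is the root of its SCC: close the component
            while True:
                z = stack.pop()
                depth[z] = float('inf')
                F[z] = F[x]
                if z == x:
                    break

    for x in nodes:
        if depth[x] == 0:
            traverse(x)
    return F
```

In the diff above, the nodes are the `nonterminal_transitions`, the relations are `reads` and `includes`, and the base sets are `directly_reads` and the resulting read sets.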

+ 16
- 16
lark/parsers/lalr_parser.py View File

@@ -6,16 +6,15 @@ from ..exceptions import UnexpectedToken
from ..lexer import Token from ..lexer import Token
from ..utils import Enumerator, Serialize from ..utils import Enumerator, Serialize


from .lalr_analysis import LALR_Analyzer, Shift, IntParseTable
from .lalr_analysis import LALR_Analyzer, Shift, Reduce, IntParseTable




###{standalone ###{standalone
class LALR_Parser(object): class LALR_Parser(object):
def __init__(self, parser_conf, debug=False): def __init__(self, parser_conf, debug=False):
assert all(r.options is None or r.options.priority is None
for r in parser_conf.rules), "LALR doesn't yet support prioritization"
assert all(r.options.priority is None for r in parser_conf.rules), "LALR doesn't yet support prioritization"
analysis = LALR_Analyzer(parser_conf, debug=debug) analysis = LALR_Analyzer(parser_conf, debug=debug)
analysis.compute_lookahead()
analysis.compute_lalr()
callbacks = parser_conf.callbacks callbacks = parser_conf.callbacks


self._parse_table = analysis.parse_table self._parse_table = analysis.parse_table
@@ -39,19 +38,22 @@ class LALR_Parser(object):
class _Parser: class _Parser:
def __init__(self, parse_table, callbacks): def __init__(self, parse_table, callbacks):
self.states = parse_table.states self.states = parse_table.states
self.start_state = parse_table.start_state
self.end_state = parse_table.end_state
self.start_states = parse_table.start_states
self.end_states = parse_table.end_states
self.callbacks = callbacks self.callbacks = callbacks


def parse(self, seq, set_state=None):
def parse(self, seq, start, set_state=None):
token = None token = None
stream = iter(seq) stream = iter(seq)
states = self.states states = self.states


state_stack = [self.start_state]
start_state = self.start_states[start]
end_state = self.end_states[start]

state_stack = [start_state]
value_stack = [] value_stack = []


if set_state: set_state(self.start_state)
if set_state: set_state(start_state)


def get_action(token): def get_action(token):
state = state_stack[-1] state = state_stack[-1]
@@ -81,7 +83,7 @@ class _Parser:
for token in stream: for token in stream:
while True: while True:
action, arg = get_action(token) action, arg = get_action(token)
assert arg != self.end_state
assert arg != end_state


if action is Shift: if action is Shift:
state_stack.append(arg) state_stack.append(arg)
@@ -94,11 +96,9 @@ class _Parser:
token = Token.new_borrow_pos('$END', '', token) if token else Token('$END', '', 0, 1, 1) token = Token.new_borrow_pos('$END', '', token) if token else Token('$END', '', 0, 1, 1)
while True: while True:
_action, arg = get_action(token) _action, arg = get_action(token)
if _action is Shift:
assert arg == self.end_state
val ,= value_stack
return val
else:
reduce(arg)
assert(_action is Reduce)
reduce(arg)
if state_stack[-1] == end_state:
return value_stack[-1]


###} ###}
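
Note: the switch from a single `start_state`/`end_state` to `start_states`/`end_states` dictionaries is what allows one parser to expose several start symbols. At the API level this corresponds to `start=[...]` plus a `start` argument to `parse()`, mirroring `test_multi_start` further down (`parser='lalr'` is just one of the configurations that test runs):

```python
from lark import Lark

parser = Lark('''
    a: "x" "a"?
    b: "x" "b"?
''', parser='lalr', start=['a', 'b'])

print(parser.parse('xa', 'a'))   # Tree('a', [])
print(parser.parse('xb', 'b'))   # Tree('b', [])
```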

+ 4
- 3
lark/parsers/xearley.py View File

@@ -24,8 +24,8 @@ from .earley_forest import SymbolNode




class Parser(BaseParser): class Parser(BaseParser):
def __init__(self, parser_conf, term_matcher, resolve_ambiguity=True, ignore = (), complete_lex = False):
BaseParser.__init__(self, parser_conf, term_matcher, resolve_ambiguity)
def __init__(self, parser_conf, term_matcher, resolve_ambiguity=True, ignore = (), complete_lex = False, debug=False):
BaseParser.__init__(self, parser_conf, term_matcher, resolve_ambiguity, debug)
self.ignore = [Terminal(t) for t in ignore] self.ignore = [Terminal(t) for t in ignore]
self.complete_lex = complete_lex self.complete_lex = complete_lex


@@ -146,4 +146,5 @@ class Parser(BaseParser):
self.predict_and_complete(i, to_scan, columns, transitives) self.predict_and_complete(i, to_scan, columns, transitives)


## Column is now the final column in the parse. ## Column is now the final column in the parse.
assert i == len(columns)-1
assert i == len(columns)-1
return to_scan

+ 47
- 12
lark/reconstruct.py View File

@@ -19,13 +19,15 @@ def is_iter_empty(i):
except StopIteration: except StopIteration:
return True return True



class WriteTokensTransformer(Transformer_InPlace): class WriteTokensTransformer(Transformer_InPlace):
def __init__(self, tokens):
"Inserts discarded tokens into their correct place, according to the rules of grammar"

def __init__(self, tokens, term_subs):
self.tokens = tokens self.tokens = tokens
self.term_subs = term_subs


def __default__(self, data, children, meta): def __default__(self, data, children, meta):
# if not isinstance(t, MatchTree):
# return t
if not getattr(meta, 'match_tree', False): if not getattr(meta, 'match_tree', False):
return Tree(data, children) return Tree(data, children)


@@ -33,10 +35,15 @@ class WriteTokensTransformer(Transformer_InPlace):
to_write = [] to_write = []
for sym in meta.orig_expansion: for sym in meta.orig_expansion:
if is_discarded_terminal(sym): if is_discarded_terminal(sym):
t = self.tokens[sym.name]
if not isinstance(t.pattern, PatternStr):
raise NotImplementedError("Reconstructing regexps not supported yet: %s" % t)
to_write.append(t.pattern.value)
try:
v = self.term_subs[sym.name](sym)
except KeyError:
t = self.tokens[sym.name]
if not isinstance(t.pattern, PatternStr):
raise NotImplementedError("Reconstructing regexps not supported yet: %s" % t)

v = t.pattern.value
to_write.append(v)
else: else:
x = next(iter_args) x = next(iter_args)
if isinstance(x, list): if isinstance(x, list):
@@ -66,19 +73,39 @@ class MakeMatchTree:
t.meta.orig_expansion = self.expansion t.meta.orig_expansion = self.expansion
return t return t


def best_from_group(seq, group_key, cmp_key):
d = {}
for item in seq:
key = group_key(item)
if key in d:
v1 = cmp_key(item)
v2 = cmp_key(d[key])
if v2 > v1:
d[key] = item
else:
d[key] = item
return list(d.values())

class Reconstructor: class Reconstructor:
def __init__(self, parser):
def __init__(self, parser, term_subs={}):
# XXX TODO calling compile twice returns different results! # XXX TODO calling compile twice returns different results!
tokens, rules, _grammar_extra = parser.grammar.compile()
assert parser.options.maybe_placeholders == False
tokens, rules, _grammar_extra = parser.grammar.compile(parser.options.start)


self.write_tokens = WriteTokensTransformer({t.name:t for t in tokens})
self.write_tokens = WriteTokensTransformer({t.name:t for t in tokens}, term_subs)
self.rules = list(self._build_recons_rules(rules)) self.rules = list(self._build_recons_rules(rules))
self.rules.reverse()

# Choose the best rule from each group of {rule => [rule.alias]}, since we only really need one derivation.
self.rules = best_from_group(self.rules, lambda r: r, lambda r: -len(r.expansion))

self.rules.sort(key=lambda r: len(r.expansion))
callbacks = {rule: rule.alias for rule in self.rules} # TODO pass callbacks through dict, instead of alias? callbacks = {rule: rule.alias for rule in self.rules} # TODO pass callbacks through dict, instead of alias?
self.parser = earley.Parser(ParserConf(self.rules, callbacks, parser.options.start), self.parser = earley.Parser(ParserConf(self.rules, callbacks, parser.options.start),
self._match, resolve_ambiguity=True) self._match, resolve_ambiguity=True)


def _build_recons_rules(self, rules): def _build_recons_rules(self, rules):
expand1s = {r.origin for r in rules if r.options and r.options.expand1}
expand1s = {r.origin for r in rules if r.options.expand1}


aliases = defaultdict(list) aliases = defaultdict(list)
for r in rules: for r in rules:
@@ -126,4 +153,12 @@ class Reconstructor:
yield item yield item


def reconstruct(self, tree): def reconstruct(self, tree):
return ''.join(self._reconstruct(tree))
x = self._reconstruct(tree)
y = []
prev_item = ''
for item in x:
if prev_item and item and prev_item[-1].isalnum() and item[0].isalnum():
y.append(' ')
y.append(item)
prev_item = item
return ''.join(y)
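
Note: two behavioural points fall out of this diff: `Reconstructor` now requires the source parser to be built with `maybe_placeholders=False` (see the assert above), and `reconstruct()` inserts a space wherever two adjacent output items are both alphanumeric. A small usage sketch with a made-up grammar (exact output spacing depends on the grammar's ignored terminals):

```python
from lark import Lark
from lark.reconstruct import Reconstructor

grammar = '''
    start: item*
    item: NAME "=" NAME ";"
    NAME: /[a-z]+/
    %ignore " "
'''

parser = Lark(grammar, parser='lalr', maybe_placeholders=False)
tree = parser.parse("foo = bar; baz = qux;")
# Roughly "foo=bar;baz=qux;" -- ignored whitespace is not restored
print(Reconstructor(parser).reconstruct(tree))
```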

+ 2
- 2
lark/tools/nearley.py View File

@@ -18,7 +18,7 @@ nearley_grammar = r"""


expansion: expr+ js expansion: expr+ js


?expr: item [":" /[+*?]/]
?expr: item (":" /[+*?]/)?


?item: rule|string|regexp|null ?item: rule|string|regexp|null
| "(" expansions ")" | "(" expansions ")"
@@ -167,7 +167,7 @@ def create_code_for_nearley_grammar(g, start, builtin_path, folder_path):
emit(" __default__ = lambda self, n, c, m: c if c else None") emit(" __default__ = lambda self, n, c, m: c if c else None")


emit() emit()
emit('parser = Lark(grammar, start="n_%s")' % start)
emit('parser = Lark(grammar, start="n_%s", maybe_placeholders=False)' % start)
emit('def parse(text):') emit('def parse(text):')
emit(' return TransformNearley().transform(parser.parse(text))') emit(' return TransformNearley().transform(parser.parse(text))')
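
Note: the grammar tweak from `[":" /[+*?]/]` to `(":" /[+*?]/)?` matters because of `maybe_placeholders`: with placeholders enabled, a missing `[...]` item is kept in the tree as `None`, while `(...)?` simply produces nothing. A tiny sketch of the distinction (grammar is made up; output shown as I understand the option):

```python
from lark import Lark

g = '''
    start: "a" [WORD]
    %import common.WORD
'''
print(Lark(g, maybe_placeholders=True).parse("a").children)    # [None]  -- placeholder kept
print(Lark(g, maybe_placeholders=False).parse("a").children)   # []      -- nothing kept
```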




+ 39
- 0
lark/tools/serialize.py View File

@@ -0,0 +1,39 @@
import codecs
import sys
import json

from lark import Lark
from lark.grammar import RuleOptions, Rule
from lark.lexer import TerminalDef

import argparse

argparser = argparse.ArgumentParser(prog='python -m lark.tools.serialize') #description='''Lark Serialization Tool -- Stores Lark's internal state & LALR analysis as a convenient JSON file''')

argparser.add_argument('grammar_file', type=argparse.FileType('r'), help='A valid .lark file')
argparser.add_argument('-o', '--out', type=argparse.FileType('w'), default=sys.stdout, help='json file path to create (default=stdout)')
argparser.add_argument('-s', '--start', default='start', help='start symbol (default="start")', nargs='+')
argparser.add_argument('-l', '--lexer', default='standard', choices=['standard', 'contextual'], help='lexer type (default="standard")')


def serialize(infile, outfile, lexer, start):
lark_inst = Lark(infile, parser="lalr", lexer=lexer, start=start) # TODO contextual

data, memo = lark_inst.memo_serialize([TerminalDef, Rule])
outfile.write('{\n')
outfile.write(' "data": %s,\n' % json.dumps(data))
outfile.write(' "memo": %s\n' % json.dumps(memo))
outfile.write('}\n')


def main():
if len(sys.argv) == 1 or '-h' in sys.argv or '--help' in sys.argv:
print("Lark Serialization Tool - Stores Lark's internal state & LALR analysis as a JSON file")
print("")
argparser.print_help()
else:
args = argparser.parse_args()
serialize(args.grammar_file, args.out, args.lexer, args.start)

if __name__ == '__main__':
main()
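
Note: the tool is intended to be invoked as `python -m lark.tools.serialize` (see the argparse `prog`). A rough programmatic equivalent, with hypothetical file names:

```python
from lark.tools.serialize import serialize

# Roughly what `python -m lark.tools.serialize my_grammar.lark -o my_grammar.json` does
with open('my_grammar.lark') as infile, open('my_grammar.json', 'w') as outfile:
    serialize(infile, outfile, lexer='standard', start='start')
```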

+ 3
- 0
lark/tools/standalone.py View File

@@ -34,6 +34,9 @@
# See <http://www.gnu.org/licenses/>. # See <http://www.gnu.org/licenses/>.
# #
# #

import os
from io import open
###} ###}


import pprint import pprint


+ 31
- 28
lark/tree.py View File

@@ -56,30 +56,6 @@ class Tree(object):


def __hash__(self): def __hash__(self):
return hash((self.data, tuple(self.children))) return hash((self.data, tuple(self.children)))
###}

def expand_kids_by_index(self, *indices):
"Expand (inline) children at the given indices"
for i in sorted(indices, reverse=True): # reverse so that changing tail won't affect indices
kid = self.children[i]
self.children[i:i+1] = kid.children

def find_pred(self, pred):
"Find all nodes where pred(tree) == True"
return filter(pred, self.iter_subtrees())

def find_data(self, data):
"Find all nodes where tree.data == data"
return self.find_pred(lambda t: t.data == data)

def scan_values(self, pred):
for c in self.children:
if isinstance(c, Tree):
for t in c.scan_values(pred):
yield t
else:
if pred(c):
yield c


def iter_subtrees(self): def iter_subtrees(self):
# TODO: Re-write as a more efficient version # TODO: Re-write as a more efficient version
@@ -102,6 +78,31 @@ class Tree(object):
yield x yield x
seen.add(id(x)) seen.add(id(x))


def find_pred(self, pred):
"Find all nodes where pred(tree) == True"
return filter(pred, self.iter_subtrees())

def find_data(self, data):
"Find all nodes where tree.data == data"
return self.find_pred(lambda t: t.data == data)

###}

def expand_kids_by_index(self, *indices):
"Expand (inline) children at the given indices"
for i in sorted(indices, reverse=True): # reverse so that changing tail won't affect indices
kid = self.children[i]
self.children[i:i+1] = kid.children

def scan_values(self, pred):
for c in self.children:
if isinstance(c, Tree):
for t in c.scan_values(pred):
yield t
else:
if pred(c):
yield c

def iter_subtrees_topdown(self): def iter_subtrees_topdown(self):
stack = [self] stack = [self]
while stack: while stack:
@@ -141,17 +142,19 @@ class SlottedTree(Tree):
__slots__ = 'data', 'children', 'rule', '_meta' __slots__ = 'data', 'children', 'rule', '_meta'




def pydot__tree_to_png(tree, filename, rankdir="LR"):
def pydot__tree_to_png(tree, filename, rankdir="LR", **kwargs):
"""Creates a colorful image that represents the tree (data+children, without meta) """Creates a colorful image that represents the tree (data+children, without meta)


Possible values for `rankdir` are "TB", "LR", "BT", "RL", corresponding to Possible values for `rankdir` are "TB", "LR", "BT", "RL", corresponding to
directed graphs drawn from top to bottom, from left to right, from bottom to directed graphs drawn from top to bottom, from left to right, from bottom to
top, and from right to left, respectively. See:
https://www.graphviz.org/doc/info/attrs.html#k:rankdir
top, and from right to left, respectively.

`kwargs` can be any graph attribute (e. g. `dpi=200`). For a list of
possible attributes, see https://www.graphviz.org/doc/info/attrs.html.
""" """


import pydot import pydot
graph = pydot.Dot(graph_type='digraph', rankdir=rankdir)
graph = pydot.Dot(graph_type='digraph', rankdir=rankdir, **kwargs)


i = [0] i = [0]
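
Note: `pydot__tree_to_png` now forwards arbitrary keyword arguments as Graphviz graph attributes. A quick sketch (requires the optional `pydot` dependency; the tree here is constructed by hand):

```python
from lark.tree import Tree, pydot__tree_to_png

t = Tree('start', [Tree('a', ['x']), Tree('b', ['y'])])
# `dpi` is the attribute mentioned in the docstring; any graph attribute passes through the same way.
pydot__tree_to_png(t, 'tree.png', rankdir="TB", dpi=200)
```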




+ 28
- 2
lark/utils.py View File

@@ -1,4 +1,5 @@
import sys import sys
from ast import literal_eval
from collections import deque from collections import deque


class fzset(frozenset): class fzset(frozenset):
@@ -160,7 +161,7 @@ def smart_decorator(f, create_decorator):


elif isinstance(f, partial): elif isinstance(f, partial):
# wraps does not work for partials in 2.7: https://bugs.python.org/issue3445 # wraps does not work for partials in 2.7: https://bugs.python.org/issue3445
return create_decorator(f.__func__, True)
return wraps(f.func)(create_decorator(lambda *args, **kw: f(*args[1:], **kw), True))


else: else:
return create_decorator(f.__func__.__call__, True) return create_decorator(f.__func__.__call__, True)
@@ -172,7 +173,7 @@ import sre_parse
import sre_constants import sre_constants
def get_regexp_width(regexp): def get_regexp_width(regexp):
try: try:
return sre_parse.parse(regexp).getwidth()
return [int(x) for x in sre_parse.parse(regexp).getwidth()]
except sre_constants.error: except sre_constants.error:
raise ValueError(regexp) raise ValueError(regexp)


@@ -239,3 +240,28 @@ class Enumerator(Serialize):
assert len(r) == len(self.enums) assert len(r) == len(self.enums)
return r return r



def eval_escaping(s):
w = ''
i = iter(s)
for n in i:
w += n
if n == '\\':
try:
n2 = next(i)
except StopIteration:
raise ValueError("Literal ended unexpectedly (bad escaping): `%r`" % s)
if n2 == '\\':
w += '\\\\'
elif n2 not in 'uxnftr':
w += '\\'
w += n2
w = w.replace('\\"', '"').replace("'", "\\'")

to_eval = "u'''%s'''" % w
try:
s = literal_eval(to_eval)
except SyntaxError as e:
raise ValueError(s, e)

return s
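
Note: the new `eval_escaping` helper centralizes how escape sequences in grammar literals are interpreted. For illustration (the input string is made up):

```python
from lark.utils import eval_escaping

s = eval_escaping('Hello\\nWorld')
print(s)   # Hello
           # World   -- the '\n' escape became a real newline
```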

+ 113
- 43
lark/visitors.py View File

@@ -3,6 +3,7 @@ from functools import wraps
from .utils import smart_decorator from .utils import smart_decorator
from .tree import Tree from .tree import Tree
from .exceptions import VisitError, GrammarError from .exceptions import VisitError, GrammarError
from .lexer import Token


###{standalone ###{standalone
from inspect import getmembers, getmro from inspect import getmembers, getmro
@@ -12,7 +13,31 @@ class Discard(Exception):


# Transformers # Transformers


class Transformer:
class _Decoratable:
@classmethod
def _apply_decorator(cls, decorator, **kwargs):
mro = getmro(cls)
assert mro[0] is cls
libmembers = {name for _cls in mro[1:] for name, _ in getmembers(_cls)}
for name, value in getmembers(cls):

# Make sure the function isn't inherited (unless it's overwritten)
if name.startswith('_') or (name in libmembers and name not in cls.__dict__):
continue
if not callable(cls.__dict__[name]):
continue

# Skip if v_args already applied (at the function level)
if hasattr(cls.__dict__[name], 'vargs_applied'):
continue

static = isinstance(cls.__dict__[name], (staticmethod, classmethod))
setattr(cls, name, decorator(value, static=static, **kwargs))
return cls



class Transformer(_Decoratable):
"""Visits the tree recursively, starting with the leaves and finally the root (bottom-up) """Visits the tree recursively, starting with the leaves and finally the root (bottom-up)


Calls its methods (provided by user via inheritance) according to tree.data Calls its methods (provided by user via inheritance) according to tree.data
@@ -21,6 +46,10 @@ class Transformer:
Can be used to implement map or reduce. Can be used to implement map or reduce.
""" """


__visit_tokens__ = True # For backwards compatibility
def __init__(self, visit_tokens=True):
self.__visit_tokens__ = visit_tokens

def _call_userfunc(self, tree, new_children=None): def _call_userfunc(self, tree, new_children=None):
# Assumes tree is already transformed # Assumes tree is already transformed
children = new_children if new_children is not None else tree.children children = new_children if new_children is not None else tree.children
@@ -30,25 +59,39 @@ class Transformer:
return self.__default__(tree.data, children, tree.meta) return self.__default__(tree.data, children, tree.meta)
else: else:
try: try:
if getattr(f, 'meta', False):
return f(children, tree.meta)
elif getattr(f, 'inline', False):
return f(*children)
elif getattr(f, 'whole_tree', False):
if new_children is not None:
raise NotImplementedError("Doesn't work with the base Transformer class")
return f(tree)
wrapper = getattr(f, 'visit_wrapper', None)
if wrapper is not None:
return f.visit_wrapper(f, tree.data, children, tree.meta)
else: else:
return f(children) return f(children)
except (GrammarError, Discard): except (GrammarError, Discard):
raise raise
except Exception as e: except Exception as e:
raise VisitError(tree, e)
raise VisitError(tree.data, tree, e)

def _call_userfunc_token(self, token):
try:
f = getattr(self, token.type)
except AttributeError:
return self.__default_token__(token)
else:
try:
return f(token)
except (GrammarError, Discard):
raise
except Exception as e:
raise VisitError(token.type, token, e)



def _transform_children(self, children): def _transform_children(self, children):
for c in children: for c in children:
try: try:
yield self._transform_tree(c) if isinstance(c, Tree) else c
if isinstance(c, Tree):
yield self._transform_tree(c)
elif self.__visit_tokens__ and isinstance(c, Token):
yield self._call_userfunc_token(c)
else:
yield c
except Discard: except Discard:
pass pass


@@ -66,26 +109,10 @@ class Transformer:
"Default operation on tree (for override)" "Default operation on tree (for override)"
return Tree(data, children, meta) return Tree(data, children, meta)


@classmethod
def _apply_decorator(cls, decorator, **kwargs):
mro = getmro(cls)
assert mro[0] is cls
libmembers = {name for _cls in mro[1:] for name, _ in getmembers(_cls)}
for name, value in getmembers(cls):

# Make sure the function isn't inherited (unless it's overwritten)
if name.startswith('_') or (name in libmembers and name not in cls.__dict__):
continue
if not callable(cls.__dict__[name]):
continue

# Skip if v_args already applied (at the function level)
if hasattr(cls.__dict__[name], 'vargs_applied'):
continue
def __default_token__(self, token):
"Default operation on token (for override)"
return token


static = isinstance(cls.__dict__[name], (staticmethod, classmethod))
setattr(cls, name, decorator(value, static=static, **kwargs))
return cls




class InlineTransformer(Transformer): # XXX Deprecated class InlineTransformer(Transformer): # XXX Deprecated
@@ -157,6 +184,11 @@ class Visitor(VisitorBase):
self._call_userfunc(subtree) self._call_userfunc(subtree)
return tree return tree


def visit_topdown(self,tree):
for subtree in tree.iter_subtrees_topdown():
self._call_userfunc(subtree)
return tree

class Visitor_Recursive(VisitorBase): class Visitor_Recursive(VisitorBase):
"""Bottom-up visitor, recursive """Bottom-up visitor, recursive


@@ -169,8 +201,16 @@ class Visitor_Recursive(VisitorBase):
if isinstance(child, Tree): if isinstance(child, Tree):
self.visit(child) self.visit(child)


f = getattr(self, tree.data, self.__default__)
f(tree)
self._call_userfunc(tree)
return tree

def visit_topdown(self,tree):
self._call_userfunc(tree)

for child in tree.children:
if isinstance(child, Tree):
self.visit_topdown(child)

return tree return tree




@@ -184,7 +224,7 @@ def visit_children_decor(func):
return inner return inner




class Interpreter:
class Interpreter(_Decoratable):
"""Top-down visitor, recursive """Top-down visitor, recursive


Visits the tree, starting with the root and finally the leaves (top-down) Visits the tree, starting with the root and finally the leaves (top-down)
@@ -193,8 +233,14 @@ class Interpreter:
Unlike Transformer and Visitor, the Interpreter doesn't automatically visit its sub-branches. Unlike Transformer and Visitor, the Interpreter doesn't automatically visit its sub-branches.
The user has to explicitly call visit_children, or use the @visit_children_decor The user has to explicitly call visit_children, or use the @visit_children_decor
""" """

def visit(self, tree): def visit(self, tree):
return getattr(self, tree.data)(tree)
f = getattr(self, tree.data)
wrapper = getattr(f, 'visit_wrapper', None)
if wrapper is not None:
return f.visit_wrapper(f, tree.data, tree.children, tree.meta)
else:
return f(tree)


def visit_children(self, tree): def visit_children(self, tree):
return [self.visit(child) if isinstance(child, Tree) else child return [self.visit(child) if isinstance(child, Tree) else child
@@ -240,8 +286,7 @@ def inline_args(obj): # XXX Deprecated






def _visitor_args_func_dec(func, inline=False, meta=False, whole_tree=False, static=False):
assert [whole_tree, meta, inline].count(True) <= 1
def _visitor_args_func_dec(func, visit_wrapper=None, static=False):
def create_decorator(_f, with_self): def create_decorator(_f, with_self):
if with_self: if with_self:
def f(self, *args, **kwargs): def f(self, *args, **kwargs):
@@ -256,17 +301,42 @@ def _visitor_args_func_dec(func, inline=False, meta=False, whole_tree=False, sta
else: else:
f = smart_decorator(func, create_decorator) f = smart_decorator(func, create_decorator)
f.vargs_applied = True f.vargs_applied = True
f.inline = inline
f.meta = meta
f.whole_tree = whole_tree
f.visit_wrapper = visit_wrapper
return f return f


def v_args(inline=False, meta=False, tree=False):

def _vargs_inline(f, data, children, meta):
return f(*children)
def _vargs_meta_inline(f, data, children, meta):
return f(meta, *children)
def _vargs_meta(f, data, children, meta):
return f(children, meta) # TODO swap these for consistency? Backwards incompatible!
def _vargs_tree(f, data, children, meta):
return f(Tree(data, children, meta))

def v_args(inline=False, meta=False, tree=False, wrapper=None):
"A convenience decorator factory, for modifying the behavior of user-supplied visitor methods" "A convenience decorator factory, for modifying the behavior of user-supplied visitor methods"
if [tree, meta, inline].count(True) > 1:
raise ValueError("Visitor functions can either accept tree, or meta, or be inlined. These cannot be combined.")
if tree and (meta or inline):
raise ValueError("Visitor functions cannot combine 'tree' with 'meta' or 'inline'.")

func = None
if meta:
if inline:
func = _vargs_meta_inline
else:
func = _vargs_meta
elif inline:
func = _vargs_inline
elif tree:
func = _vargs_tree

if wrapper is not None:
if func is not None:
raise ValueError("Cannot use 'wrapper' along with 'tree', 'meta' or 'inline'.")
func = wrapper

def _visitor_args_dec(obj): def _visitor_args_dec(obj):
return _apply_decorator(obj, _visitor_args_func_dec, inline=inline, meta=meta, whole_tree=tree)
return _apply_decorator(obj, _visitor_args_func_dec, visit_wrapper=func)
return _visitor_args_dec return _visitor_args_dec
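
Note: taken together, the visitor changes mean `Transformer` now visits tokens by default (methods named after terminals are called on tokens; opt out with `visit_tokens=False`), and `v_args` can combine `meta` with `inline` or accept a custom `wrapper`. A small combined sketch (grammar and class are made up):

```python
from lark import Lark
from lark.visitors import Transformer, v_args

class Calc(Transformer):
    def NUMBER(self, tok):          # called on tokens, since visit_tokens defaults to True
        return int(tok)

    @v_args(inline=True)
    def add(self, a, b):
        return a + b

parser = Lark('''
    start: add
    add: NUMBER "+" NUMBER
    %import common.NUMBER
    %ignore " "
''', parser='lalr')

print(Calc().transform(parser.parse("1 + 2")).children)   # [3]
```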






+ 1
- 0
mkdocs.yml View File

@@ -9,5 +9,6 @@ pages:
- How To Develop (Guide): how_to_develop.md - How To Develop (Guide): how_to_develop.md
- Grammar Reference: grammar.md - Grammar Reference: grammar.md
- Tree Construction Reference: tree_construction.md - Tree Construction Reference: tree_construction.md
- Visitors and Transformers: visitors.md
- Classes Reference: classes.md - Classes Reference: classes.md
- Recipes: recipes.md - Recipes: recipes.md

+ 10
- 0
readthedocs.yml View File

@@ -0,0 +1,10 @@
version: 2

mkdocs:
configuration: mkdocs.yml
fail_on_warning: false

formats: all

python:
version: 3.5

+ 2
- 1
tests/__main__.py View File

@@ -10,7 +10,7 @@ from .test_reconstructor import TestReconstructor
try: try:
from .test_nearley.test_nearley import TestNearley from .test_nearley.test_nearley import TestNearley
except ImportError: except ImportError:
pass
logging.warning("Warning: Skipping tests for Nearley grammar imports (js2py required)")


# from .test_selectors import TestSelectors # from .test_selectors import TestSelectors
# from .test_grammars import TestPythonG, TestConfigG # from .test_grammars import TestPythonG, TestConfigG
@@ -21,6 +21,7 @@ from .test_parser import (
TestCykStandard, TestCykStandard,
TestLalrContextual, TestLalrContextual,
TestEarleyDynamic, TestEarleyDynamic,
TestLalrCustom,


# TestFullEarleyStandard, # TestFullEarleyStandard,
TestFullEarleyDynamic, TestFullEarleyDynamic,


+ 1
- 0
tests/grammars/test_unicode.lark View File

@@ -0,0 +1 @@
UNICODE : /[a-zØ-öø-ÿ]/

+ 4
- 1
tests/test_nearley/test_nearley.py View File

@@ -15,9 +15,12 @@ NEARLEY_PATH = os.path.join(TEST_PATH, 'nearley')
BUILTIN_PATH = os.path.join(NEARLEY_PATH, 'builtin') BUILTIN_PATH = os.path.join(NEARLEY_PATH, 'builtin')


if not os.path.exists(NEARLEY_PATH): if not os.path.exists(NEARLEY_PATH):
print("Skipping Nearley tests!")
logging.warn("Nearley not installed. Skipping Nearley tests!")
raise ImportError("Skipping Nearley tests!") raise ImportError("Skipping Nearley tests!")


import js2py # Ensures that js2py exists, to avoid failing tests


class TestNearley(unittest.TestCase): class TestNearley(unittest.TestCase):
def test_css(self): def test_css(self):
fn = os.path.join(NEARLEY_PATH, 'examples/csscolor.ne') fn = os.path.join(NEARLEY_PATH, 'examples/csscolor.ne')


+ 239
- 6
tests/test_parser.py View File

@@ -5,6 +5,7 @@ import unittest
import logging import logging
import os import os
import sys import sys
from copy import deepcopy
try: try:
from cStringIO import StringIO as cStringIO from cStringIO import StringIO as cStringIO
except ImportError: except ImportError:
@@ -20,9 +21,9 @@ logging.basicConfig(level=logging.INFO)
from lark.lark import Lark from lark.lark import Lark
from lark.exceptions import GrammarError, ParseError, UnexpectedToken, UnexpectedInput, UnexpectedCharacters from lark.exceptions import GrammarError, ParseError, UnexpectedToken, UnexpectedInput, UnexpectedCharacters
from lark.tree import Tree from lark.tree import Tree
from lark.visitors import Transformer
from lark.visitors import Transformer, Transformer_InPlace, v_args
from lark.grammar import Rule from lark.grammar import Rule
from lark.lexer import TerminalDef
from lark.lexer import TerminalDef, Lexer, TraditionalLexer


__path__ = os.path.dirname(__file__) __path__ = os.path.dirname(__file__)
def _read(n, *args): def _read(n, *args):
@@ -62,6 +63,14 @@ class TestParsers(unittest.TestCase):
r = g.parse('a') r = g.parse('a')
self.assertEqual( r.children[0].meta.line, 1 ) self.assertEqual( r.children[0].meta.line, 1 )


g = Lark("""start: x
x: a
a: "a"
""", propagate_positions=True)

r = g.parse('a')
self.assertEqual( r.children[0].meta.line, 1 )

def test_expand1(self): def test_expand1(self):


g = Lark("""start: a g = Lark("""start: a
@@ -94,6 +103,98 @@ class TestParsers(unittest.TestCase):
r = g.parse('xx') r = g.parse('xx')
self.assertEqual( r.children[0].data, "c" ) self.assertEqual( r.children[0].data, "c" )


def test_comment_in_rule_definition(self):
g = Lark("""start: a
a: "a"
// A comment
// Another comment
| "b"
// Still more

c: "unrelated"
""")
r = g.parse('b')
self.assertEqual( r.children[0].data, "a" )

def test_visit_tokens(self):
class T(Transformer):
def a(self, children):
return children[0] + "!"
def A(self, tok):
return tok.update(value=tok.upper())

# Test regular
g = """start: a
a : A
A: "x"
"""
p = Lark(g, parser='lalr')
r = T(False).transform(p.parse("x"))
self.assertEqual( r.children, ["x!"] )
r = T().transform(p.parse("x"))
self.assertEqual( r.children, ["X!"] )

# Test internal transformer
p = Lark(g, parser='lalr', transformer=T())
r = p.parse("x")
self.assertEqual( r.children, ["X!"] )

def test_vargs_meta(self):

@v_args(meta=True)
class T1(Transformer):
def a(self, children, meta):
assert not children
return meta.line

def start(self, children, meta):
return children

@v_args(meta=True, inline=True)
class T2(Transformer):
def a(self, meta):
return meta.line

def start(self, meta, *res):
return list(res)

for T in (T1, T2):
for internal in [False, True]:
try:
g = Lark(r"""start: a+
a : "x" _NL?
_NL: /\n/+
""", parser='lalr', transformer=T() if internal else None, propagate_positions=True)
except NotImplementedError:
assert internal
continue

res = g.parse("xx\nx\nxxx\n\n\nxx")
assert not internal
res = T().transform(res)

self.assertEqual(res, [1, 1, 2, 3, 3, 3, 6, 6])

def test_vargs_tree(self):
tree = Lark('''
start: a a a
!a: "A"
''').parse('AAA')
tree_copy = deepcopy(tree)

@v_args(tree=True)
class T(Transformer):
def a(self, tree):
return 1
def start(self, tree):
return tree.children

res = T().transform(tree)
self.assertEqual(res, [1, 1, 1])
self.assertEqual(tree, tree_copy)



def test_embedded_transformer(self): def test_embedded_transformer(self):
class T(Transformer): class T(Transformer):
def a(self, children): def a(self, children):
@@ -150,6 +251,51 @@ class TestParsers(unittest.TestCase):
r = g.parse("xx") r = g.parse("xx")
self.assertEqual( r.children, ["<c>"] ) self.assertEqual( r.children, ["<c>"] )


def test_embedded_transformer_inplace(self):
@v_args(tree=True)
class T1(Transformer_InPlace):
def a(self, tree):
assert isinstance(tree, Tree), tree
tree.children.append("tested")
return tree

def b(self, tree):
return Tree(tree.data, tree.children + ['tested2'])

@v_args(tree=True)
class T2(Transformer):
def a(self, tree):
assert isinstance(tree, Tree), tree
tree.children.append("tested")
return tree

def b(self, tree):
return Tree(tree.data, tree.children + ['tested2'])

class T3(Transformer):
@v_args(tree=True)
def a(self, tree):
assert isinstance(tree, Tree)
tree.children.append("tested")
return tree

@v_args(tree=True)
def b(self, tree):
return Tree(tree.data, tree.children + ['tested2'])

for t in [T1(), T2(), T3()]:
for internal in [False, True]:
g = Lark("""start: a b
a : "x"
b : "y"
""", parser='lalr', transformer=t if internal else None)
r = g.parse("xy")
if not internal:
r = t.transform(r)

a, b = r.children
self.assertEqual(a.children, ["tested"])
self.assertEqual(b.children, ["tested2"])


def test_alias(self): def test_alias(self):
Lark("""start: ["a"] "b" ["c"] "e" ["f"] ["g"] ["h"] "x" -> d """) Lark("""start: ["a"] "b" ["c"] "e" ["f"] ["g"] ["h"] "x" -> d """)
@@ -386,12 +532,22 @@ def _make_full_earley_test(LEXER):
_TestFullEarley.__name__ = _NAME _TestFullEarley.__name__ = _NAME
globals()[_NAME] = _TestFullEarley globals()[_NAME] = _TestFullEarley


class CustomLexer(Lexer):
"""
Purpose of this custom lexer is to test the integration,
so it uses the TraditionalLexer as its implementation, without custom lexing behaviour.
"""
def __init__(self, lexer_conf):
self.lexer = TraditionalLexer(lexer_conf.tokens, ignore=lexer_conf.ignore, user_callbacks=lexer_conf.callbacks)
def lex(self, *args, **kwargs):
return self.lexer.lex(*args, **kwargs)


def _make_parser_test(LEXER, PARSER): def _make_parser_test(LEXER, PARSER):
lexer_class_or_name = CustomLexer if LEXER == 'custom' else LEXER
def _Lark(grammar, **kwargs): def _Lark(grammar, **kwargs):
return Lark(grammar, lexer=LEXER, parser=PARSER, propagate_positions=True, **kwargs)
return Lark(grammar, lexer=lexer_class_or_name, parser=PARSER, propagate_positions=True, **kwargs)
def _Lark_open(gfilename, **kwargs): def _Lark_open(gfilename, **kwargs):
return Lark.open(gfilename, lexer=LEXER, parser=PARSER, propagate_positions=True, **kwargs)
return Lark.open(gfilename, lexer=lexer_class_or_name, parser=PARSER, propagate_positions=True, **kwargs)
class _TestParser(unittest.TestCase): class _TestParser(unittest.TestCase):
def test_basic1(self): def test_basic1(self):
g = _Lark("""start: a+ b a* "b" a* g = _Lark("""start: a+ b a* "b" a*
@@ -890,7 +1046,7 @@ def _make_parser_test(LEXER, PARSER):


@unittest.skipIf(PARSER == 'cyk', "No empty rules") @unittest.skipIf(PARSER == 'cyk', "No empty rules")
def test_twice_empty(self): def test_twice_empty(self):
g = """!start: [["A"]]
g = """!start: ("A"?)?
""" """
l = _Lark(g) l = _Lark(g)
tree = l.parse('A') tree = l.parse('A')
@@ -984,6 +1140,32 @@ def _make_parser_test(LEXER, PARSER):
self.assertEqual(res.children, ['ab']) self.assertEqual(res.children, ['ab'])




grammar = """
start: A B | AB
A: "a"
B.-20: "b"
AB.-10: "ab"
"""
l = _Lark(grammar)
res = l.parse("ab")
self.assertEqual(res.children, ['a', 'b'])


grammar = """
start: A B | AB
A.-99999999999999999999999: "a"
B: "b"
AB: "ab"
"""
l = _Lark(grammar)
res = l.parse("ab")

self.assertEqual(res.children, ['ab'])







def test_import(self): def test_import(self):
grammar = """ grammar = """
@@ -1021,6 +1203,12 @@ def _make_parser_test(LEXER, PARSER):
self.assertEqual(x.children, ['12', 'lions']) self.assertEqual(x.children, ['12', 'lions'])




def test_relative_import_unicode(self):
l = _Lark_open('test_relative_import_unicode.lark', rel_to=__file__)
x = l.parse(u'Ø')
self.assertEqual(x.children, [u'Ø'])


def test_relative_import_rename(self): def test_relative_import_rename(self):
l = _Lark_open('test_relative_import_rename.lark', rel_to=__file__) l = _Lark_open('test_relative_import_rename.lark', rel_to=__file__)
x = l.parse('12 lions') x = l.parse('12 lions')
@@ -1448,7 +1636,20 @@ def _make_parser_test(LEXER, PARSER):


parser.parse(r'"That" "And a \"b"') parser.parse(r'"That" "And a \"b"')


@unittest.skipIf(PARSER!='lalr', "Serialize currently only works for LALR parsers (though it should be easy to extend)")

def test_meddling_unused(self):
"Unless 'unused' is removed, LALR analysis will fail on reduce-reduce collision"

grammar = """
start: EKS* x
x: EKS
unused: x*
EKS: "x"
"""
parser = _Lark(grammar)


@unittest.skipIf(PARSER!='lalr' or LEXER=='custom', "Serialize currently only works for LALR parsers without custom lexers (though it should be easy to extend)")
def test_serialize(self): def test_serialize(self):
grammar = """ grammar = """
start: _ANY b "C" start: _ANY b "C"
@@ -1465,6 +1666,37 @@ def _make_parser_test(LEXER, PARSER):
parser3 = Lark.deserialize(d, namespace, m) parser3 = Lark.deserialize(d, namespace, m)
self.assertEqual(parser3.parse('ABC'), Tree('start', [Tree('b', [])]) ) self.assertEqual(parser3.parse('ABC'), Tree('start', [Tree('b', [])]) )


def test_multi_start(self):
parser = _Lark('''
a: "x" "a"?
b: "x" "b"?
''', start=['a', 'b'])

self.assertEqual(parser.parse('xa', 'a'), Tree('a', []))
self.assertEqual(parser.parse('xb', 'b'), Tree('b', []))

def test_lexer_detect_newline_tokens(self):
# Detect newlines in regular tokens
g = _Lark(r"""start: "go" tail*
!tail : SA "@" | SB "@" | SC "@" | SD "@"
SA : "a" /\n/
SB : /b./s
SC : "c" /[^a-z]/
SD : "d" /\s/
""")
a,b,c,d = [x.children[1] for x in g.parse('goa\n@b\n@c\n@d\n@').children]
self.assertEqual(a.line, 2)
self.assertEqual(b.line, 3)
self.assertEqual(c.line, 4)
self.assertEqual(d.line, 5)

# Detect newlines in ignored tokens
for re in ['/\\n/', '/[^a-z]/', '/\\s/']:
g = _Lark('''!start: "a" "a"
%ignore {}'''.format(re))
a, b = g.parse('a\na').children
self.assertEqual(a.line, 1)
self.assertEqual(b.line, 2)




_NAME = "Test" + PARSER.capitalize() + LEXER.capitalize() _NAME = "Test" + PARSER.capitalize() + LEXER.capitalize()
@@ -1479,6 +1711,7 @@ _TO_TEST = [
('dynamic_complete', 'earley'), ('dynamic_complete', 'earley'),
('standard', 'lalr'), ('standard', 'lalr'),
('contextual', 'lalr'), ('contextual', 'lalr'),
('custom', 'lalr'),
# (None, 'earley'), # (None, 'earley'),
] ]




+ 2
- 2
tests/test_reconstructor.py View File

@@ -16,7 +16,7 @@ def _remove_ws(s):
class TestReconstructor(TestCase): class TestReconstructor(TestCase):


def assert_reconstruct(self, grammar, code): def assert_reconstruct(self, grammar, code):
parser = Lark(grammar, parser='lalr')
parser = Lark(grammar, parser='lalr', maybe_placeholders=False)
tree = parser.parse(code) tree = parser.parse(code)
new = Reconstructor(parser).reconstruct(tree) new = Reconstructor(parser).reconstruct(tree)
self.assertEqual(_remove_ws(code), _remove_ws(new)) self.assertEqual(_remove_ws(code), _remove_ws(new))
@@ -105,7 +105,7 @@ class TestReconstructor(TestCase):
%ignore WS %ignore WS
""" """


json_parser = Lark(json_grammar, parser='lalr')
json_parser = Lark(json_grammar, parser='lalr', maybe_placeholders=False)
tree = json_parser.parse(test_json) tree = json_parser.parse(test_json)


new_json = Reconstructor(json_parser).reconstruct(tree) new_json = Reconstructor(json_parser).reconstruct(tree)


+ 3
- 0
tests/test_relative_import_unicode.lark View File

@@ -0,0 +1,3 @@
start: UNICODE

%import .grammars.test_unicode.UNICODE

+ 8
- 15
tests/test_tools.py View File

@@ -1,11 +1,9 @@
from __future__ import absolute_import from __future__ import absolute_import


import sys import sys
import unittest
from unittest import TestCase
from unittest import TestCase, main


from lark.tree import Tree from lark.tree import Tree

from lark.tools import standalone from lark.tools import standalone


try: try:
@@ -49,6 +47,8 @@ class TestStandalone(TestCase):
l = _Lark() l = _Lark()
x = l.parse('12 elephants') x = l.parse('12 elephants')
self.assertEqual(x.children, ['12', 'elephants']) self.assertEqual(x.children, ['12', 'elephants'])
x = l.parse('16 candles')
self.assertEqual(x.children, ['16', 'candles'])


def test_contextual(self): def test_contextual(self):
grammar = """ grammar = """
@@ -92,26 +92,19 @@ class TestStandalone(TestCase):
_NEWLINE: /\n/ _NEWLINE: /\n/
""" """


# from lark import Lark
# l = Lark(grammar, parser='lalr', lexer='contextual', postlex=MyIndenter())
# x = l.parse('(\n)\n')
# print('@@', x)


context = self._create_standalone(grammar) context = self._create_standalone(grammar)
_Lark = context['Lark_StandAlone'] _Lark = context['Lark_StandAlone']


# l = _Lark(postlex=MyIndenter())
# x = l.parse('()\n')
# print(x)
l = _Lark(postlex=MyIndenter())
x = l.parse('()\n')
self.assertEqual(x, Tree('start', []))
l = _Lark(postlex=MyIndenter()) l = _Lark(postlex=MyIndenter())
x = l.parse('(\n)\n') x = l.parse('(\n)\n')
print(x)

self.assertEqual(x, Tree('start', []))






if __name__ == '__main__': if __name__ == '__main__':
unittest.main()
main()





+ 55
- 1
tests/test_trees.py View File

@@ -4,9 +4,10 @@ import unittest
from unittest import TestCase from unittest import TestCase
import copy import copy
import pickle import pickle
import functools


from lark.tree import Tree from lark.tree import Tree
from lark.visitors import Transformer, Interpreter, visit_children_decor, v_args, Discard
from lark.visitors import Visitor, Visitor_Recursive, Transformer, Interpreter, visit_children_decor, v_args, Discard




class TestTrees(TestCase): class TestTrees(TestCase):
@@ -33,6 +34,43 @@ class TestTrees(TestCase):
nodes = list(self.tree1.iter_subtrees_topdown()) nodes = list(self.tree1.iter_subtrees_topdown())
self.assertEqual(nodes, expected) self.assertEqual(nodes, expected)


def test_visitor(self):
class Visitor1(Visitor):
def __init__(self):
self.nodes=[]

def __default__(self,tree):
self.nodes.append(tree)
class Visitor1_Recursive(Visitor_Recursive):
def __init__(self):
self.nodes=[]

def __default__(self,tree):
self.nodes.append(tree)

visitor1=Visitor1()
visitor1_recursive=Visitor1_Recursive()

expected_top_down = [Tree('a', [Tree('b', 'x'), Tree('c', 'y'), Tree('d', 'z')]),
Tree('b', 'x'), Tree('c', 'y'), Tree('d', 'z')]
expected_bottom_up = [Tree('b', 'x'), Tree('c', 'y'), Tree('d', 'z'),
Tree('a', [Tree('b', 'x'), Tree('c', 'y'), Tree('d', 'z')])]

visitor1.visit(self.tree1)
self.assertEqual(visitor1.nodes, expected_bottom_up)

visitor1_recursive.visit(self.tree1)
self.assertEqual(visitor1_recursive.nodes, expected_bottom_up)

visitor1.nodes=[]
visitor1_recursive.nodes=[]

visitor1.visit_topdown(self.tree1)
self.assertEqual(visitor1.nodes,expected_top_down)

visitor1_recursive.visit_topdown(self.tree1)
self.assertEqual(visitor1_recursive.nodes,expected_top_down)

def test_interp(self): def test_interp(self):
t = Tree('a', [Tree('b', []), Tree('c', []), 'd']) t = Tree('a', [Tree('b', []), Tree('c', []), 'd'])


@@ -146,6 +184,22 @@ class TestTrees(TestCase):
res = T().transform(t) res = T().transform(t)
self.assertEqual(res, 2.9) self.assertEqual(res, 2.9)


def test_partial(self):

tree = Tree("start", [Tree("a", ["test1"]), Tree("b", ["test2"])])

def test(prefix, s, postfix):
return prefix + s.upper() + postfix

@v_args(inline=True)
class T(Transformer):
a = functools.partial(test, "@", postfix="!")
b = functools.partial(lambda s: s + "!")

res = T().transform(tree)
assert res.children == ["@TEST1!", "test2!"]


def test_discard(self): def test_discard(self):
class MyTransformer(Transformer): class MyTransformer(Transformer):
def a(self, args): def a(self, args):

