This is the recommended process for working with Lark:
Collect or create input samples, that demonstrate key features or behaviors in the language you’re trying to parse.
Write a grammar. Try to aim for a structure that is intuitive, and in a way that imitates how you would explain your language to a fellow human.
Try your grammar in Lark against each input sample. Make sure the resulting parse-trees make sense.
Use Lark’s grammar features to shape the tree: Get rid of superfluous rules by inlining them, and use aliases when specific cases need clarification.
Of course, some specific use-cases may deviate from this process. Feel free to suggest these cases, and I’ll add them to this page.
For common use, you only need to know 3 classes: Lark, Tree, Transformer (Classes Reference)
Here is some mock usage of them. You can see a real example in the examples
from lark import Lark, Transformer
grammar = """start: rules and more rules
rule1: other rules AND TOKENS
| rule1 "+" rule2 -> add
| some value [maybe]
rule2: rule1 "-" (rule2 | "whatever")*
TOKEN1: "a literal"
TOKEN2: TOKEN1 "and literals"
"""
parser = Lark(grammar)
tree = parser.parse("some input string")
class MyTransformer(Transformer):
def rule1(self, matches):
return matches[0] + matches[1]
# I don't have to implement rule2 if I don't feel like it!
new_tree = MyTransformer().transform(tree)
By default Lark silently resolves Shift/Reduce conflicts as Shift. To enable warnings pass debug=True
. To get the messages printed you have to configure logging
framework beforehand. For example:
from lark import Lark
import logging
logging.basicConfig(level=logging.DEBUG)
collision_grammar = '''
start: as as
as: a*
a: 'a'
'''
p = Lark(collision_grammar, parser='lalr', debug=True)