Below is an example of a simple grammar that is able to parse strings of
integers separated by any amount of white space and a `+` symbol.
    grammar Addition
      rule additive
        number plus (additive | number)
      end
      rule number
        [0-9]+ space
      end
      rule plus
        '+' space
      end
      rule space
        [ \t]*
      end
    end
One thing to note about the above example: every grammar and rule declaration
is closed with the `end` keyword.
The grammar above is able to parse simple mathematical expressions such as “1+2” and “1 + 2+3”, but it does not have enough semantic information to interpret these expressions.
At this point, when the grammar parses a string it generates a tree of Match objects. Each match is created by a rule and may itself be composed of any number of submatches.
Submatches are created whenever a rule contains another rule. For example, in
the grammar above `number` matches a string of digits followed by white space.
Thus, a match generated by this rule will contain two submatches.
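
To picture the tree described above, here is a minimal sketch in plain Ruby. It does not use Citrus at all: `SketchNode` is a hypothetical stand-in invented for illustration, and the real Match class may store its submatches differently.

```ruby
# Hypothetical stand-in for a match object; not Citrus's real Match class.
SketchNode = Struct.new(:string, :submatches) do
  # A node with no submatches is a leaf of the tree.
  def leaf?
    submatches.empty?
  end
end

# The `number` rule is a sequence of two expressions, [0-9]+ and space,
# so a match for "42 " is modeled here with one submatch for each.
digits = SketchNode.new("42", [])
space  = SketchNode.new(" ", [])
number = SketchNode.new(digits.string + space.string, [digits, space])

puts number.submatches.length
puts number.submatches.map(&:string).inspect
```

The parent node carries the full matched text, while each child carries only the part its own expression consumed.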
We can define a method inside a set of curly braces that will be used to extend
a particular rule’s matches. This works in similar fashion to using Ruby’s
blocks. Let’s extend the `Addition` grammar using this technique.
    grammar Addition
      rule additive
        (number plus term:(additive | number)) {
          number.value + term.value
        }
      end
      rule number
        ([0-9]+ space) {
          to_i
        }
      end
      rule plus
        '+' space
      end
      rule space
        [ \t]*
      end
    end
In this version of the grammar we have added two semantic blocks, one each for
the `additive` and `number` rules. These blocks contain code that we can
execute by calling `value` on match objects that result from those rules. It’s
easiest to explain what is going on here by starting with the lowest level
block, which is defined within `number`.
Inside this block we see a call to another method, namely `to_i`. When called in
the context of a match object, methods that are not defined may be called on a
match’s internal string object via `method_missing`. Thus, the call to `to_i`
should return the integer value of the match.
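
The delegation described above can be sketched in plain Ruby as follows. This is only an illustration of the `method_missing` technique, under the assumption that a match wraps a string. The class name `StringBackedMatch` is hypothetical and the code is not taken from Citrus.

```ruby
# Hypothetical stand-in for a match object; not Citrus's real Match class.
class StringBackedMatch
  def initialize(string)
    @string = string
  end

  # Forward any method this class does not define to the wrapped string.
  def method_missing(name, *args, &block)
    if @string.respond_to?(name)
      @string.public_send(name, *args, &block)
    else
      super
    end
  end

  # Keep respond_to? consistent with method_missing.
  def respond_to_missing?(name, include_private = false)
    @string.respond_to?(name, include_private) || super
  end
end

match = StringBackedMatch.new("42 ")
puts match.to_i    # String#to_i ignores the trailing space
puts match.length  # also forwarded to the string
```

Because `String#to_i` stops at the first non-digit character, the trailing white space consumed by the `space` rule does not affect the result.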
Similarly, matches created by `additive` will also have a `value` method. Notice
the use of the `term` label within the rule definition. This label allows the
match that is created by the choice between `additive` and `number` to be
retrieved using the `term` method. The value of an additive match is the sum of
the values of its `number` and `term` matches, computed with Ruby’s addition
operator.
Since `additive` is the first rule defined in the grammar, any match that
results from parsing a string with this grammar will have a `value` method that
can be used to recursively calculate the collective value of the entire match
tree.
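
As a rough illustration of that recursion, the sketch below builds by hand the kind of tree that parsing “1 + 2 + 3” might produce and evaluates it. The `NumberSketch` and `AdditiveSketch` structs are hypothetical stand-ins, not Citrus classes, and the tree shape is an assumption based on the rule definitions above.

```ruby
# Hypothetical leaf node: mirrors the `number` rule, whose value is to_i
# applied to the matched text.
NumberSketch = Struct.new(:text) do
  def value
    text.to_i
  end
end

# Hypothetical inner node: mirrors the `additive` rule, whose value is
# number.value + term.value. The term may itself be another additive.
AdditiveSketch = Struct.new(:number, :term) do
  def value
    number.value + term.value
  end
end

# "1 + 2 + 3" read as 1 + (2 + 3), following the right-recursive rule
# number plus term:(additive | number).
tree = AdditiveSketch.new(
  NumberSketch.new("1 "),
  AdditiveSketch.new(NumberSketch.new("2 "), NumberSketch.new("3"))
)

puts tree.value
```

Calling `value` on the root triggers `value` on each nested node in turn, so the whole expression is evaluated with a single call.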
To give it a try, save the code for the `Addition` grammar in a file called
addition.citrus. Next, assuming you have the Citrus gem installed, try the
following sequence of commands in a terminal.
    $ irb
    > require 'citrus'
     => true
    > Citrus.load 'addition'
     => [Addition]
    > m = Addition.parse '1 + 2 + 3'
     => #<Citrus::Match ...
    > m.value
     => 6
Congratulations! You just ran your first piece of Citrus code.
One interesting thing to notice about the above sequence of commands is the
return value of `Citrus.load`. When you use `Citrus.load` to load a grammar
file (and likewise `Citrus.eval` to evaluate a raw string of grammar code), the
return value is an array of all the grammars present in that file.
Take a look at examples/calc.citrus for an example of a calculator that is able to parse and evaluate more complex mathematical expressions.
If you need more than just a `value` method on your match object, you can attach
additional methods as well. There are two ways to do this. The first lets you
define additional methods inline in your semantic block. This block will be used
to create a new Module using `Module.new`. Using the `Addition` example above,
we might refactor the `additive` rule to look like this:
    rule additive
      (number plus term:(additive | number)) {
        def lhs
          number.value
        end
        def rhs
          term.value
        end
        def value
          lhs + rhs
        end
      }
    end
Now, in addition to having a `value` method, matches that result from the
`additive` rule will have a `lhs` and a `rhs` method as well. Although not
particularly useful in this example, this technique can be useful when unit
testing more complex rules. For example, using this method you might make the
following assertions in a unit test:
    match = Addition.parse('1 + 4')
    assert_equal(1, match.lhs)
    assert_equal(4, match.rhs)
    assert_equal(5, match.value)
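
For readers unfamiliar with building a module from a block, the following plain-Ruby sketch shows the general mechanism: `Module.new` turns a block of method definitions into an anonymous module, and `extend` attaches that module to a single object. `SketchAdditive` and its fields are hypothetical stand-ins. How Citrus applies the module internally is not shown here.

```ruby
# Hypothetical stand-in for a match created by the `additive` rule.
SketchAdditive = Struct.new(:number_value, :term_value)

# Module.new accepts a block; methods defined inside it become instance
# methods of the new anonymous module.
extension = Module.new do
  def lhs
    number_value
  end

  def rhs
    term_value
  end

  def value
    lhs + rhs
  end
end

# Extending one object adds the methods to that object only.
match = SketchAdditive.new(1, 4)
match.extend(extension)

puts match.lhs
puts match.rhs
puts match.value
```

Other instances of the same class are unaffected, which is what allows each rule to give its own matches a different set of methods.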
If you would like to abstract away the code in a semantic block, simply create a separate Ruby module (in another file) that contains the extension methods you want and use the angle bracket notation to indicate that a rule should use that module when extending matches.
To demonstrate this method with the above example, in a Ruby file you would define the following module.
    module Additive
      def lhs
        number.value
      end
      def rhs
        term.value
      end
      def value
        lhs + rhs
      end
    end
Then, in your Citrus grammar file the rule definition would look like this:
    rule additive
      (number plus term:(additive | number)) <Additive>
    end
This method of defining extensions can help keep your grammar files cleaner.
However, you do need to make sure that your extension modules are already loaded
before using `Citrus.load` to load your grammar file.
Copyright © 2015 Michael Jackson