Parser

This package can be installed separately with Composer, using the command composer require phplrt/parser

The parser provides a set of components for the syntactic analysis (parsing) of source code and for converting it into an abstract syntax tree (AST).

Let's create a primitive lexer that can handle whitespace, numbers and the plus sign.

use Phplrt\Lexer\Lexer;

$lexer = (new Lexer())
    ->append('T_NUMBER', '\\d+')
    ->append('T_PLUS', '\\+')
    ->append('T_WHITESPACE', '\\s+')
        ->skip('T_WHITESPACE')
;
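To make the token definitions above more concrete, here is a minimal, dependency-free sketch of what such a lexer conceptually does with its three patterns. It is not phplrt's implementation: the tokenize() function and the array shape of its tokens are hypothetical names introduced only for illustration.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not part of phplrt): splits the source into tokens
 * using the same three patterns as the lexer above.
 *
 * @return list<array{name: string, value: string, offset: int}>
 */
function tokenize(string $source): array
{
    $patterns = [
        'T_NUMBER'     => '\\d+',
        'T_PLUS'       => '\\+',
        'T_WHITESPACE' => '\\s+',
    ];
    // Tokens that are recognized but not emitted.
    $skipped = ['T_WHITESPACE'];

    $tokens = [];
    $offset = 0;
    $length = strlen($source);

    while ($offset < $length) {
        $matched = false;

        foreach ($patterns as $name => $pattern) {
            // \G anchors the match at the current offset.
            if (preg_match('/\\G(?:' . $pattern . ')/', $source, $m, 0, $offset) === 1) {
                if (!in_array($name, $skipped, true)) {
                    $tokens[] = ['name' => $name, 'value' => $m[0], 'offset' => $offset];
                }
                $offset += strlen($m[0]);
                $matched = true;
                break;
            }
        }

        if (!$matched) {
            throw new RuntimeException(sprintf('Unrecognized character at offset %d', $offset));
        }
    }

    return $tokens;
}

// Three tokens remain for "2 + 2": T_NUMBER, T_PLUS, T_NUMBER.
$tokens = tokenize('2 + 2');
```

Whitespace is matched like any other token, so the offsets stay correct, but it is dropped from the result. This is why the grammar below never has to mention T_WHITESPACE.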

The grammar is a little more complicated. It must define the order in which tokens may appear in the source text that we are going to parse.

First, we describe the rule in (E)BNF format:

(* A simple example of adding two numbers will look like this: *)
expression = T_NUMBER T_PLUS T_NUMBER ;

To define this rule in the grammar, we only need two classes that describe the parts of the production: Concatenation, which matches a sequence of rules, and Lexeme, which matches a single token.

use Phplrt\Parser\Grammar\Concatenation;
use Phplrt\Parser\Grammar\Lexeme;
use Phplrt\Parser\Parser;

$options = [Parser::CONFIG_INITIAL_RULE => 'expression'];

//
// This (E)BNF construction:
// expression = T_NUMBER T_PLUS T_NUMBER ;
// 
// Looks like:
// Concatenation = Token1 Token2 Token1
//
$grammar = [
    // Concat: 1 then 2 then 1
    'expression' => new Concatenation([1, 2, 1]),

    // 1 is a T_NUMBER lexeme
    1 => new Lexeme('T_NUMBER'),

    // 2 is a T_PLUS lexeme
    2 => new Lexeme('T_PLUS'),
];
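The integer keys in the grammar act as identifiers that other rules refer to. The following dependency-free sketch shows one way such a table can be walked. It uses a plain-array stand-in for the grammar and a hypothetical matchRule() function; it is an illustration under those assumptions, not the algorithm phplrt actually uses.

```php
<?php

declare(strict_types=1);

// Plain-array stand-in for the grammar above (hypothetical shape):
// a 'concat' rule lists the rule IDs to match in order,
// a 'lexeme' rule names a single token.
$grammar = [
    'expression' => ['concat', [1, 2, 1]],
    1            => ['lexeme', 'T_NUMBER'],
    2            => ['lexeme', 'T_PLUS'],
];

/**
 * Hypothetical matcher: returns the position after the match, or null on failure.
 *
 * @param list<string> $tokens token names, whitespace already skipped
 */
function matchRule(array $grammar, int|string $rule, array $tokens, int $pos): ?int
{
    [$type, $arg] = $grammar[$rule];

    if ($type === 'lexeme') {
        // A lexeme matches exactly one token with the expected name.
        return ($tokens[$pos] ?? null) === $arg ? $pos + 1 : null;
    }

    // Concatenation: every referenced rule must match, one after another.
    foreach ($arg as $child) {
        $pos = matchRule($grammar, $child, $tokens, $pos);
        if ($pos === null) {
            return null;
        }
    }

    return $pos;
}

// "2 + 2": all three tokens are consumed, so the result is 3.
$ok = matchRule($grammar, 'expression', ['T_NUMBER', 'T_PLUS', 'T_NUMBER'], 0);

// "2 + + 2": the third token is T_PLUS where rule 1 expects T_NUMBER, so the result is null.
$bad = matchRule($grammar, 'expression', ['T_NUMBER', 'T_PLUS', 'T_PLUS', 'T_NUMBER'], 0);
```

A real parser additionally checks that no tokens are left over after the initial rule has matched, and builds AST nodes along the way; the sketch only answers whether the token sequence fits the rule.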

To test the grammar, simply parse a source string:

$parser = new \Phplrt\Parser\Parser($lexer, $grammar, $options);

var_dump($parser->parse('2 + 2'));

If the source contains a syntax error, the parser reports exactly where the error occurred:

$result = $parser->parse('2 + + 2');
// Syntax error, unexpected "+" (T_PLUS)
//   1. | 2 + + 2
//      |     ^ in .../Parser/src/Exception/UnexpectedTokenException.php:37
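The pointer under the offending token can be derived from nothing more than the source text and the token's offset. The sketch below shows one way to do that. The renderErrorPointer() function is a hypothetical helper written for this illustration and is not taken from phplrt.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not phplrt's code): renders the source line that
 * contains the given byte offset, with a caret under that offset.
 */
function renderErrorPointer(string $source, int $offset): string
{
    // Line number = number of newlines before the offset, plus one.
    $before = substr($source, 0, $offset);
    $line   = substr_count($before, "\n") + 1;

    // Column = distance from the start of the current line.
    $lineStart = strrpos($before, "\n");
    $lineStart = $lineStart === false ? 0 : $lineStart + 1;
    $column    = $offset - $lineStart;

    // Text of the offending line only.
    $lineEnd = strpos($source, "\n", $offset);
    $lineEnd = $lineEnd === false ? strlen($source) : $lineEnd;
    $text    = substr($source, $lineStart, $lineEnd - $lineStart);

    $prefix = sprintf('%d. | ', $line);

    // The second row keeps the "|" aligned with the first row and places the caret.
    return $prefix . $text . "\n"
        . str_repeat(' ', strlen($prefix) - 2) . '| ' . str_repeat(' ', $column) . '^';
}

// Offset 4 is the second "+" in the faulty source.
echo renderErrorPointer('2 + + 2', 4), "\n";
```

The offset used here is the kind of position a lexer records for each token, which is why a parser can point at the unexpected token without re-scanning the input.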