End Of Input

The EOI (end of input) token marks the point where lexical analysis of the source is complete. A parser can use it as a marker that there is nothing left to parse.

By default, this token is named T_EOI, but the name can be changed.

Let's take a look at a simple lexer example:

$lexer = new Phplrt\Lexer\Lexer(
    tokens: [ 'T_DIGIT' => '\d+' ],
);

foreach ($lexer->lex('42') as $token) {
    echo $token->getName() . "\n";
}

// T_DIGIT
// T_EOI
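
The following is a minimal sketch of how a consumer might use the EOI name as an end marker. It does not call the lexer: the function namesBeforeEoi() is a hypothetical helper, and the array passed to it is a stand-in for the names obtained from $token->getName() in the example above.

```php
/**
 * Collects token names up to (not including) the EOI marker.
 * Hypothetical helper for illustration; it is not part of phplrt.
 *
 * @param iterable<string> $names token names, e.g. taken from $token->getName()
 * @return list<string>
 */
function namesBeforeEoi(iterable $names, string $eoi = 'T_EOI'): array
{
    $result = [];

    foreach ($names as $name) {
        if ($name === $eoi) {
            // The marker was reached: the input has been fully consumed.
            return $result;
        }

        $result[] = $name;
    }

    // The stream ended without the marker (for example, it was disabled).
    throw new \LogicException('Token stream ended without ' . $eoi);
}

// Stand-in for the names produced by the lexer in the example above.
echo implode(', ', namesBeforeEoi(['T_DIGIT', 'T_EOI'])) . "\n";

// T_DIGIT
```

Throwing when the marker is missing is one possible design choice; it makes a truncated or misconfigured token stream fail loudly instead of being treated as complete input.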

Renaming

Renaming the EOI token is done using the eoi constructor argument:

$lexer = new Phplrt\Lexer\Lexer(
    tokens: [ 'T_DIGIT' => '\d+' ],
    eoi: 'CUSTOM_EOI_NAME',
);

foreach ($lexer->lex('42') as $token) {
    echo $token->getName() . "\n";
}

// T_DIGIT
// CUSTOM_EOI_NAME

Behaviour Change

By default, an EOI token is placed at the very end of the list of returned tokens each time lexical analysis is performed. This behavior can be overridden using the onEndOfInput constructor argument:

$lexer = new Phplrt\Lexer\Lexer(
    tokens: ['T_DIGIT' => '\d+'],
    onEndOfInput: new \Phplrt\Lexer\Config\NullHandler(),
    eoi: 'CUSTOM_EOI_NAME',
);

foreach ($lexer->lex('42') as $token) {
    echo $token->getName() . "\n";
}

// T_DIGIT

In this case, Phplrt\Lexer\Config\NullHandler overrides the default behaviour of returning the EOI token and removes it from the list of returned tokens.

You can also specify your own handler.

use Phplrt\Contracts\Lexer\TokenInterface;
use Phplrt\Contracts\Source\ReadableInterface;
use Phplrt\Lexer\Config\HandlerInterface;

$lexer = new Phplrt\Lexer\Lexer(
    tokens: ['T_DIGIT' => '\d+'],
    onEndOfInput: new class implements HandlerInterface {
        public function handle(ReadableInterface $source, TokenInterface $token): ?TokenInterface
        {
            throw new \RuntimeException('Input must never end!');
        }
    },
    eoi: 'CUSTOM_EOI_NAME',
);

foreach ($lexer->lex('42') as $token) {
    echo $token->getName() . "\n";
}

// T_DIGIT
// Uncaught RuntimeException: Input must never end!
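
A handler does not have to throw. The nullable return type of handle() suggests that it may also return a token, so the sketch below returns the token it receives. It relies on two assumptions that the text above does not confirm: that the token passed to handle() is the EOI token, and that a non-null return value is kept in the list of returned tokens. It requires phplrt to be installed and autoloaded.

```php
use Phplrt\Contracts\Lexer\TokenInterface;
use Phplrt\Contracts\Source\ReadableInterface;
use Phplrt\Lexer\Config\HandlerInterface;

$lexer = new Phplrt\Lexer\Lexer(
    tokens: ['T_DIGIT' => '\d+'],
    onEndOfInput: new class implements HandlerInterface {
        public function handle(ReadableInterface $source, TokenInterface $token): ?TokenInterface
        {
            // Assumption: returning the received token keeps it in the output,
            // while returning null would omit it (as NullHandler does).
            return $token;
        }
    },
);

foreach ($lexer->lex('42') as $token) {
    echo $token->getName() . "\n";
}
```

If these assumptions hold, a handler of this shape could decide case by case whether to keep the EOI token, drop it, or fail.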