End Of Input
The EOI token marks the end of lexical analysis of the source and can be used as a marker for the end of parsing.
By default, this token is named T_EOI, but the name can be changed.
Let's take a look at a simple lexer example:
$lexer = new Phplrt\Lexer\Lexer(
    tokens: ['T_DIGIT' => '\d+'],
);

foreach ($lexer->lex('42') as $token) {
    echo $token->getName() . "\n";
}

// T_DIGIT
// T_EOI
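Because the EOI token is emitted last by default, a consumer can treat it as a stop signal. The following is a minimal sketch, not the library's prescribed pattern: it assumes the default T_EOI name, relies only on the getName() call shown above, and assumes the package is autoloaded through Composer. The $names variable and the autoload path are introduced here purely for illustration.

```php
<?php

// Assumption: phplrt is installed via Composer and this path points to its autoloader.
require __DIR__ . '/vendor/autoload.php';

$lexer = new Phplrt\Lexer\Lexer(
    tokens: ['T_DIGIT' => '\d+'],
);

// Hypothetical accumulator for the names of all "real" tokens.
$names = [];

foreach ($lexer->lex('42') as $token) {
    // Stop consuming as soon as the end-of-input marker is reached.
    if ($token->getName() === 'T_EOI') {
        break;
    }

    $names[] = $token->getName();
}

echo implode(', ', $names) . "\n";
```

If the token is renamed through the eoi argument, the comparison has to use the new name instead.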
Renaming
The EOI token is renamed using the eoi constructor argument:
$lexer = new Phplrt\Lexer\Lexer(
    tokens: ['T_DIGIT' => '\d+'],
    eoi: 'CUSTOM_EOI_NAME',
);

foreach ($lexer->lex('42') as $token) {
    echo $token->getName() . "\n";
}

// T_DIGIT
// CUSTOM_EOI_NAME
Behaviour Change
By default, each lexical analysis run places an EOI token at the very end of the
list of returned tokens. This behavior can be overridden using the onEndOfInput
constructor argument:
$lexer = new Phplrt\Lexer\Lexer(
    tokens: ['T_DIGIT' => '\d+'],
    onEndOfInput: new \Phplrt\Lexer\Config\NullHandler(),
    eoi: 'CUSTOM_EOI_NAME',
);

foreach ($lexer->lex('42') as $token) {
    echo $token->getName() . "\n";
}

// T_DIGIT
In this case, Phplrt\Lexer\Config\NullHandler overrides the default behaviour and removes the EOI token from the list of returned tokens.
You can also specify your own handler.
use Phplrt\Contracts\Lexer\TokenInterface;
use Phplrt\Contracts\Source\ReadableInterface;
use Phplrt\Lexer\Config\HandlerInterface;

$lexer = new Phplrt\Lexer\Lexer(
    tokens: ['T_DIGIT' => '\d+'],
    // The handler is invoked when the lexer reaches the end of the input.
    onEndOfInput: new class implements HandlerInterface {
        public function handle(ReadableInterface $source, TokenInterface $token): ?TokenInterface
        {
            throw new \RuntimeException('Input must never end!');
        }
    },
    eoi: 'CUSTOM_EOI_NAME',
);

foreach ($lexer->lex('42') as $token) {
    echo $token->getName() . "\n";
}

// T_DIGIT
// Uncaught RuntimeException: Input must never end!