Define Grammar AST types explicitly, refine EBNF-based terminals to avoid synthetic capturing groups #1966
base: main
…icated AST type `TerminalElement`
* updated grammar

… to RegExp now avoiding unnecessary capturing groups '(...)'
* added 'parenthesized' flag to type TerminalElement
* marked required synthetic groups as non-capturing '(?:...)'
* updated tests
* updated example languages
Looks good, thanks! A few details below.
@@ -0,0 +1,242 @@
type AbstractRule = InfixRule | ParserRule | TerminalRule;
Please add the copyright header with Copyright 2025

interface AbstractElement {
    cardinality?: "*" | "+" | "?";
    // parenthesized: boolean;
Let's remove it instead of leaving it commented out.
    return reflection.isInstance(item, TerminalElement.$type);
}

export interface CharacterRange extends TerminalElement {
It looks like we're not sorting the types properly prior to generating them – the order shouldn't change after switching to declared types.
This PR introduces `langium-types.langium`, defining all the AST types of Langium's grammar language. As an initial customization it contributes the interface `TerminalElement extends AbstractElement`, and `AbstractElement#lookahead` is moved to `TerminalElement`.

The second commit refines the synthesis of the RegExps representing the terminals: synthetic pairs of parentheses that are added to the RegExp are marked as non-capturing groups `(?:...)`, while parentheses present in the terminal definition are transferred as capturing groups `(...)`. To enable that, an additional flag `TerminalElement#parenthesized` is introduced to record whether a sub-terminal is enclosed in parentheses.

This gives adopters more control over the capturing groups within the RegExp, which is relevant when re-using the generated RegExps, e.g. for value conversion, like
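
As a rough illustration (not the PR author's original example), the following sketch shows why the distinction matters for value conversion. Both regular expressions are hand-written assumptions that mimic what a string-like terminal might be translated to, and `convertString` is a hypothetical helper, not part of Langium's API.

```typescript
// Hypothetical RegExp in which every group is capturing, including the
// synthetic one wrapped around the alternative. Group indices are harder
// to predict, and the inner group captures a value nobody asked for.
const allCapturing = /"(([^"\\]|\\.)*)"/;

// Hypothetical RegExp in which only the parentheses written by the grammar
// author capture; the synthetic group is marked as non-capturing.
const refined = /"((?:[^"\\]|\\.)*)"/;

// Hypothetical value converter that relies on group 1 holding the content
// between the quotes.
function convertString(input: string, regex: RegExp): string | undefined {
    const match = regex.exec(input);
    return match?.[1];
}

const input = '"hello"';

const before = allCapturing.exec(input)!;
const after = refined.exec(input)!;

console.log(before.length); // 3: full match, content, last inner alternative
console.log(after.length);  // 2: full match, content
console.log(convertString(input, refined)); // hello
```

With the refined form, the number and position of capturing groups follow directly from the parentheses in the terminal definition, so a converter can address them by index without accounting for groups the generator added.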