Child pages
  • Developing Custom Language Plugins for IntelliJ IDEA

Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.
Comment: fix GitHub links

...

Types of tokens for lexers used in IDEA are defined by instances of IElementType. A number of token types common for all languages are defined in the TokenType interface; custom language plugins should reuse these token types wherever applicable. For all other token types, the plugin needs to create new IElementType instances and associate with the language in which the token type is used. The same IElementType instance should be returned every time a particular token type is encountered by the lexer.

Example: Token types for Properties language

An important feature which can be implemented at lexer level is mixing languages within a file (for example, embedding fragments of Java code in some template language). If a language supports embedding its fragments in another language, it needs to define the chameleon token types for different types of fragments which can be embedded, and these token types need to implement the ILazyParseableElementType interface. The lexer of the enclosing language needs to return the entire fragment of the embedded language as a single chameleon token, of the type defined by the embedded language. To parse the contents of the chameleon token, IDEA will call the parser of the embedded language through a call to ILazyParseableElementType.parseContents().

...

Second, a PSI (Program Structure Interface) tree is built on top of the AST, adding semantics and methods for manipulating specific language constructs. Nodes of the PSI tree are represented by classes implementing the PsiElement interface and are created by the language plugin in the ParserDefinition.createElement() method. The top-level node of the PSI tree for a file needs to implement the PsiFile interface, and is created in the ParserDefinition.createFile() method.

Example: ParserDefinition for Properties language

The lifecycle of the PSI is described in more detail in IntelliJ IDEA Architectural Overview.

...

The language plugin provides the parser implementation as an implementation of the PsiParser interface, returned from ParserDefinition.createParser(). The parser receives an instance of the PsiBuilder class, which is used to get the stream of tokens from the lexer and to hold the intermediate state of the AST being built. The parser must process all tokens returned by the lexer up to the end of stream (until PsiBuilder.getTokenType() returns null), even if the tokens are not valid according to the language syntax.

Example: PsiParser implementation for Properties language

The parser works by setting pairs of markers (PsiBuilder.Marker instances) within the stream of tokens received from the lexer. Each pair of markers defines the range of lexer tokens for a single node in the AST tree. If a pair of markers is nested in another pair (starts after its start and ends before its end), it becomes the child node of the outer pair.

...

The syntax and error highlighting is performed on multiple levels. The first level of syntax highlighting is based on the lexer output, and is provided through the SyntaxHighlighter interface. The syntax highlighter returns the TextAttributeKey instances for each token type which needs special highlighting. For highlighting lexer errors, the standard TextAttributeKey for bad characters (HighligherColors.BAD_CHARACTER) can be used.

Example: SyntaxHighlighlighter implementation for Properties language

Parser

The second level of error highlighting happens during parsing. If a particular sequence of tokens is invalid according to the grammar of the language, the PsiBuilder.error() method can be used to highlight the invalid tokens and display an error message showing why they are not valid.

...

When the rename refactoring is performed, the method PsiNamedElement.setName() is called for the renamed element, and PsiReference.handleElementRename() is called for all references to the renamed element. Both of these methods perform basically the same action: replace the underlying AST node of the PSI element with the node containing the new text entered by the user. Creating a fully correct AST node from scratch is quite difficult. Thus, surprisingly, the easiest way to get the replacement node is to create a dummy file in the custom language so that it would contain the necessary node in its parse tree, build the parse tree and extract the necessary node from it.

Example: setName() implementation for a Property

Another interface related to the Rename refactoring is NamesValidator. This interface allows a plugin to check if the name entered by the user in the Rename dialog is a valid identifier (and not a keyword) according to the custom language rules. If an implementation of this interface is not provided by the plugin, Java rules for validating identifiers are used. Implementations of NamesValidator are registered in the com.intellij.lang.namesValidator extension point.

...

  • The PsiElement.delete() method for the PsiElement subclasses for which Safe Delete is available. Deleting PSI elements is implemented by deleting the underlying AST nodes from the AST tree (which, in turn, causes the text ranges corresponding to the AST nodes to be deleted from the document).

Example: delete() implementation for a Property

If needed, it's possible to further customize how Safe Delete is performed for a particular type of element (how references are searched, etc). This is done by implementing the SafeDeleteProcessorDelegate interface.

...