IntelliJ IDEA's indexing framework provides a quick way to locate certain elements (for example, files containing a certain word or methods with a particular name) in large code bases. Plugin developers can use the existing indexes built by IntelliJ IDEA itself, as well as build and use their own indexes.

IntelliJ IDEA supports two main types of indexes: file-based indexes and stub indexes. File-based indexes are built directly over the content of files, and stub indexes are built over serialized stub trees. A stub tree for a source file is a subset of its PSI tree which contains only externally visible declarations and is serialized in a compact binary format. Querying a file-based index gets you the set of files matching a certain condition, and querying a stub index gets you the set of matching PSI elements. Therefore, custom language plugin developers should typically use stub indexes in their plugin implementations.

File-based Indexes

File-based indexes in IntelliJ IDEA are based on a map/reduce architecture. Each index has a certain type of key and a certain type of value. The key is what's later used to retrieve data from the index; for example, in the word index the key is the word itself. The value is arbitrary data which is associated with the key in the index; for example, in the word index the value is a mask indicating in which context the word occurs (code, string literal or comment). In the simplest case (when we only need to know in what files some data occurs), the value has type Void and is not stored in the index.

When the index implementation indexes a file, it receives the content of a file and returns a map from the keys found in the file to the associated values. When the index is accessed, you specify the key that you're interested in and get back the list of files in which the key occurs and the value associated with each file.
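The map-and-query flow described above can be sketched with plain Java collections. This is only a toy model of the idea, not the IntelliJ IDEA API: the class `ToyWordIndex`, its methods and the two context flags are hypothetical names chosen for illustration, and treating `//` as the only comment marker is a simplifying assumption.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Toy model of a file-based index. NOT the IntelliJ API; all names are hypothetical. */
class ToyWordIndex {
    // Hypothetical context flags, imitating the "mask" value of the word index.
    static final int IN_CODE = 1;
    static final int IN_COMMENT = 2;

    // key (word) -> (file name -> value (context mask))
    private final Map<String, Map<String, Integer>> data = new HashMap<>();

    /** Indexing step: maps the content of one file to key/value pairs. */
    static Map<String, Integer> indexFile(String content) {
        Map<String, Integer> result = new HashMap<>();
        for (String line : content.split("\n")) {
            // Simplifying assumption: everything after "//" is a comment.
            int commentStart = line.indexOf("//");
            String code = commentStart < 0 ? line : line.substring(0, commentStart);
            String comment = commentStart < 0 ? "" : line.substring(commentStart + 2);
            addWords(result, code, IN_CODE);
            addWords(result, comment, IN_COMMENT);
        }
        return result;
    }

    private static void addWords(Map<String, Integer> out, String text, int mask) {
        for (String word : text.split("\\W+")) {
            if (!word.isEmpty()) {
                out.merge(word, mask, (a, b) -> a | b); // combine context flags
            }
        }
    }

    /** Re-indexes one file: drops its old entries, then stores the new ones. */
    void update(String fileName, String content) {
        for (Map<String, Integer> perFile : data.values()) {
            perFile.remove(fileName);
        }
        indexFile(content).forEach((word, mask) ->
                data.computeIfAbsent(word, k -> new TreeMap<>()).put(fileName, mask));
    }

    /** Query step: for a key, returns the files containing it and the value per file. */
    Map<String, Integer> getFilesWithValues(String word) {
        Map<String, Integer> hits = data.get(word);
        return hits == null ? Collections.<String, Integer>emptyMap() : hits;
    }
}
```

In this sketch, calling `update("A.java", ...)` plays the role of indexing a file, and `getFilesWithValues("count")` plays the role of an index query: it returns every file in which the key occurs together with the value stored for that file.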

Implementing a File-based Index

The standard word index has a fairly straightforward implementation; you can refer to it as an example to understand this discussion better.

Each specific index implementation is a class extending FileBasedIndexExtension, registered in the <fileBasedIndex> extension point. The implementation consists of the following main parts:

If you don't need to associate any value with the files (i.e. your value type is Void), you can simplify the implementation by using ScalarIndexExtension as the base class.

Accessing a File-based Index

Access to file-based indexes is performed through the FileBasedIndex class. The following primary operations are supported:

Standard Indexes

A number of the standard file-based indexes contained in IntelliJ IDEA are often useful for plugin developers. The first of them is the above-mentioned word index. It should generally be accessed not directly, but through the helper methods in the PsiSearchHelper class.

The second is FilenameIndex. It provides a quick way to find all files matching a certain file name.

FileTypeIndex serves a similar goal: it lets you quickly find all files of a certain file type.

Stub Trees

As mentioned above, a stub tree is a subset of the PSI tree for a file which is stored in a compact serialized binary format. The PSI tree for a file can be backed either by the AST (built by parsing the text of the file) or by the stub tree (deserialized from disk); switching between the two is transparent. The stub tree contains only a subset of the nodes (typically only the nodes that are needed to resolve the declarations contained in this file from external files). Trying to access any node which is not part of the stub tree, or to perform any operation which cannot be satisfied by the stub tree (such as accessing the text of a PSI element), causes the file to be parsed and the PSI to switch to AST backing.
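The transparent switch from stub backing to AST backing can be pictured with a small plain-Java model. This is a sketch of the behaviour only, not how IntelliJ IDEA implements it: `ToyPsiFile` and its members are hypothetical, a list of names stands in for the deserialized stub tree, and splitting the text on whitespace stands in for parsing.

```java
import java.util.Arrays;
import java.util.List;

/** Toy model of a PSI file that is stub-backed until the AST is needed. Not the IntelliJ API. */
class ToyPsiFile {
    private final String text;                // stands in for the file content on disk
    private final List<String> stubbedNames;  // stands in for the deserialized stub tree
    private List<String> astTokens;           // stands in for the AST; null until "parsed"

    ToyPsiFile(String text, List<String> stubbedNames) {
        this.text = text;
        this.stubbedNames = stubbedNames;
    }

    boolean isAstLoaded() {
        return astTokens != null;
    }

    /** Can be answered from the stub tree alone, so no parsing happens. */
    List<String> getDeclaredNames() {
        return stubbedNames;
    }

    /** Cannot be answered from stubs: the file is "parsed" and stays AST-backed afterwards. */
    String getText() {
        if (astTokens == null) {
            astTokens = Arrays.asList(text.trim().split("\\s+"));
        }
        return String.join(" ", astTokens);
    }
}
```

Callers use the same object before and after the switch; only the first call to `getText()` pays the cost of parsing.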

Each stub in the stub tree is simply a bean class with no behavior, which stores a subset of the state of the corresponding PSI element (for example, its name and modifier flags such as public or final). The stub also holds a pointer to its parent in the tree and a list of its child stubs.
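As a rough picture of such a bean, the following plain-Java class holds a name, modifier flags, a parent pointer and a list of children, and nothing else. `ToyStub` and its flag constants are hypothetical names for this sketch; real stubs implement the platform's stub interfaces, which are not shown here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Toy stub: a passive data holder with tree links. Not an IntelliJ class. */
class ToyStub {
    // Hypothetical modifier flags stored as a bit mask.
    static final int PUBLIC = 1;
    static final int FINAL = 2;

    private final String name;
    private final int modifierFlags;
    private final ToyStub parent;
    private final List<ToyStub> children = new ArrayList<>();

    ToyStub(ToyStub parent, String name, int modifierFlags) {
        this.parent = parent;
        this.name = name;
        this.modifierFlags = modifierFlags;
        if (parent != null) {
            parent.children.add(this); // keep the parent's child list in sync
        }
    }

    String getName() {
        return name;
    }

    boolean hasModifier(int flag) {
        return (modifierFlags & flag) != 0;
    }

    ToyStub getParent() {
        return parent;
    }

    List<ToyStub> getChildren() {
        return Collections.unmodifiableList(children);
    }
}
```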

To support stubs for your custom language, you first need to decide which of the elements of your PSI tree should be stored as stubs. Typically you need to have stubs for things like methods or fields, which are visible from other files, and don't need to have stubs for things like statements or local variables, which are not visible externally.

For each element type that you want to store in the stub tree, you need to perform the following steps:

The following steps need to be performed only once for each language that supports stubs:

If you need to change the stored binary format for the stubs (for example, if you want to store some additional data or some new elements), make sure that you advance the stub version returned from IStubFileElementType.getStubVersion() for your language. This will cause the stubs and stub indexes to be rebuilt, and will avoid mismatches between the stored data format and the code trying to load it.
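Why a version bump protects against format mismatches can be shown with a toy serializer. This is an illustrative sketch, not the platform's storage code: `ToyStubStorage` is hypothetical, and writing the version in front of each record is an assumption made to keep the example small.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Toy versioned storage for one stub record. Not the IntelliJ API. */
class ToyStubStorage {
    /** Writes the format version first, then the payload (here just a name). */
    static byte[] save(int version, String name) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(version);
            out.writeUTF(name);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /**
     * Returns the stored name, or null when the data was written with another
     * format version and therefore has to be rebuilt instead of being read.
     */
    static String loadOrNull(int currentVersion, byte[] stored) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(stored))) {
            if (in.readInt() != currentVersion) {
                return null; // version mismatch: never interpret the old bytes
            }
            return in.readUTF();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Data saved with version 1 loads normally while the code still reports version 1. Once the code reports version 2, the old bytes are rejected rather than misread, which is the effect that advancing the stub version is meant to achieve.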

By default, if a PSI element extends StubBasedPsiElement, all elements of that type will be stored in the stub tree. If you need more precise control over which elements are stored, override IStubElementType.shouldCreateStub() and return false for elements which should not be included in the stub tree. Note that the exclusion is not recursive: if some children of an element for which you returned false are also stub-based PSI elements, they will still be included in the stub tree.
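The non-recursive nature of the exclusion is easier to see on a small tree. The sketch below is a plain-Java model, not the platform's stub builder: `ToyPsiNode`, `ToyStubNode` and `ToyStubTreeBuilder` are hypothetical, the `shouldCreateStub` field stands in for the result of the real method, and attaching the surviving descendants to the nearest ancestor that did get a stub is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy PSI node. Not an IntelliJ class. */
class ToyPsiNode {
    final String name;
    final boolean stubBased;        // stands in for "extends StubBasedPsiElement"
    final boolean shouldCreateStub; // stands in for the result of shouldCreateStub()
    final List<ToyPsiNode> children = new ArrayList<>();

    ToyPsiNode(String name, boolean stubBased, boolean shouldCreateStub, ToyPsiNode... children) {
        this.name = name;
        this.stubBased = stubBased;
        this.shouldCreateStub = shouldCreateStub;
        for (ToyPsiNode child : children) {
            this.children.add(child);
        }
    }
}

/** Toy stub node that renders itself as nested text, e.g. "file(Outer(Local))". */
class ToyStubNode {
    final String name;
    final List<ToyStubNode> children = new ArrayList<>();

    ToyStubNode(String name) {
        this.name = name;
    }

    @Override
    public String toString() {
        if (children.isEmpty()) {
            return name;
        }
        StringBuilder sb = new StringBuilder(name).append('(');
        for (int i = 0; i < children.size(); i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(children.get(i));
        }
        return sb.append(')').toString();
    }
}

class ToyStubTreeBuilder {
    /** Builds the stub tree for a file; the file root always gets a stub. */
    static ToyStubNode build(ToyPsiNode fileRoot) {
        ToyStubNode root = new ToyStubNode(fileRoot.name);
        for (ToyPsiNode child : fileRoot.children) {
            collect(child, root);
        }
        return root;
    }

    private static void collect(ToyPsiNode psi, ToyStubNode parentStub) {
        ToyStubNode target = parentStub;
        if (psi.stubBased && psi.shouldCreateStub) {
            target = new ToyStubNode(psi.name);
            parentStub.children.add(target);
        }
        // Children are visited even when this node was skipped:
        // the exclusion applies to the node itself, not to its subtree.
        for (ToyPsiNode child : psi.children) {
            collect(child, target);
        }
    }
}
```

For example, if a stub-based node `helper` is excluded but contains a stub-based node `Local`, then `Local` still appears in the result, attached to the stub of `helper`'s parent.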

It's essential to make sure that all information stored in the stub tree depends only on the contents of the file for which stubs are being built, and does not depend on any external files. Otherwise the stub tree will not be rebuilt when an external dependency changes, and you will have stale and incorrect data in the stub tree.

Stub Indexes