
Both RegexRule and TokenRule/ExtensibleTokenRule exist to describe tokens. However, they have different properties.

RegexRule supports only regular grammars and does not let you analyze the structure of the parsed string. The result of parsing with a RegexRule is an NSpan. RegexRule is faster than a comparable TokenRule or ExtensibleTokenRule and consumes less memory.
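As a rough analogy only (this is Python, not Nitra syntax), the sketch below shows what "the result is just a span" means in practice. The `NUMBER` pattern and the `match_span` helper are hypothetical names introduced for illustration; Python's `re` module merely stands in for a regex rule.

```python
import re

# Analogy only: a compiled regular expression plays the role of a RegexRule.
# NUMBER and match_span are hypothetical names, not part of any Nitra API.
NUMBER = re.compile(r"[0-9]+(?:\.[0-9]+)?")

def match_span(text, pos):
    """Return the (start, end) span of a number starting at pos, or None.

    Only the span is reported; nothing about the token's internal
    structure (integer part, fraction) is exposed to the caller.
    """
    m = NUMBER.match(text, pos)
    return (m.start(), m.end()) if m else None
```

The caller learns where the token begins and ends, and nothing else, which is cheap to compute and cheap to store.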

TokenRule/ExtensibleTokenRule supports all SimpleRule/ExtensibleRule syntax capabilities: parsed string structure analysis, extensibility, predicates, recursion, error recovery, etc. However, TokenRule/ExtensibleTokenRule is slower than RegexRule and stores more data.

For simple tokens such as numbers, regex will suffice.

Use token rules for complex tokens such as strings and comments: their structure is important, they are easier to express through predicates, and they require error recovery. If, for example, you use a regex rule to describe strings, you will not be able to recover from errors inside them, and therefore highlighting escape sequences will become impossible.
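To make the difference concrete, here is a minimal Python sketch (again an analogy, not Nitra code) of a hand-written string-literal scanner that keeps the token's structure and recovers from errors. `StringToken`, `VALID_ESCAPES`, and `parse_string` are hypothetical names, and the escape set is an assumption chosen for the example.

```python
from dataclasses import dataclass, field

@dataclass
class StringToken:
    start: int
    end: int                                      # exclusive end offset
    escapes: list = field(default_factory=list)   # (start, end) spans of escape sequences
    errors: list = field(default_factory=list)    # (offset, message) pairs

# Assumed escape set for this illustration only.
VALID_ESCAPES = set('nrt\\"0')

def parse_string(text, pos):
    """Scan a double-quoted string at pos, recording escapes and errors."""
    if pos >= len(text) or text[pos] != '"':
        return None
    tok = StringToken(start=pos, end=pos)
    i = pos + 1
    while i < len(text) and text[i] != "\n":
        ch = text[i]
        if ch == '"':                      # closing quote: token is complete
            tok.end = i + 1
            return tok
        if ch == "\\":
            if i + 1 < len(text) and text[i + 1] != "\n":
                if text[i + 1] in VALID_ESCAPES:
                    tok.escapes.append((i, i + 2))
                else:
                    # Recovery: report the problem and keep scanning.
                    tok.errors.append((i, "unknown escape sequence"))
                i += 2
                continue
            tok.errors.append((i, "incomplete escape sequence"))
        i += 1
    # Recovery: end of line or input reached without a closing quote.
    tok.errors.append((i, "unterminated string"))
    tok.end = i
    return tok
```

Because the escape spans are kept, a highlighter could color them separately, and a malformed string still yields a token plus diagnostics instead of a failed match.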

It probably makes sense to parse numbers with a complex structure using TokenRule/ExtensibleTokenRule as well. This way you can get information about the different parts of the parsed string.
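The following Python sketch illustrates the idea of exposing the parts of a number; it is not Nitra code, and `NumberToken` and `parse_number` are hypothetical names. The number format (integer part, optional fraction, optional exponent) is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Span = Tuple[int, int]
DIGITS = "0123456789"

@dataclass
class NumberToken:
    integer: Span
    fraction: Optional[Span] = None   # digits after the decimal point
    exponent: Optional[Span] = None   # optional sign plus digits after e/E

def _digits(text, i):
    """Return the offset just past the run of ASCII digits starting at i."""
    while i < len(text) and text[i] in DIGITS:
        i += 1
    return i

def parse_number(text, pos):
    """Parse a number at pos and return the spans of its parts, or None."""
    end = _digits(text, pos)
    if end == pos:
        return None
    tok = NumberToken(integer=(pos, end))
    if end < len(text) and text[end] == ".":
        frac_end = _digits(text, end + 1)
        if frac_end > end + 1:
            tok.fraction = (end + 1, frac_end)
            end = frac_end
    if end < len(text) and text[end] in "eE":
        j = end + 1
        if j < len(text) and text[j] in "+-":
            j += 1
        exp_end = _digits(text, j)
        if exp_end > j:
            tok.exponent = (end + 1, exp_end)
    return tok
```

Each part is reported as its own span, so later stages can inspect the fraction or exponent without re-scanning the text.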
