Scope Definition
One of the main concerns when creating a LiClipse language file is determining how to partition
the source code of a language into scopes. LiClipse may then handle each of
these scopes differently for highlighting, navigation, outline, etc.
The main definition for that is the scope_definition_rules element, which should be a top-level
element in the YAML file. This element defines the top-level scopes (partitions) for a file.
Afterwards, another top-level entry named scope is used to provide the actual colors for the
contents of a previously matched partition.
The default colors for each partition are given by yet another entry called scope_to_color_name.
The names of the available colors can be seen at: ColorThemeKeys.java.
The actual colors assigned to each of those depend on the selected theme. See: Change Color Theme for details.
Below is a commented example which shows how this works in a simple structure.
Example:
# For this example we have a language which has multi-line comments where we start and end with "###".
# Also, it has a class definition as: class ClassName, and the indentation is used to define new scopes.

scope_to_color_name: {
  multiLineComment: string, # For this example, our multiLineComment will use the string color defined in general > appearance > color theme
  default: foreground, # In the default scope, we'll use the foreground color
  # Anything else uses a color with the same name as the scope (e.g.: class, keyword)
}

scope_definition_rules:
  # Define that our comment is anything between ###....###
  # Note that any text that does not match any of these rules is considered to be in the default scope.
  - {type: MultiLineRule, scope: multiLineComment, start: "###", end: "###", escapeCharacter: \}

scope: # Here we'll specify sub-partitions for top-level scopes
  default: # We're defining things internal to the default scope (in this example we could also define things in the multiLineComment scope).
    #keyword must be a color at this point already (not a scope)
    keyword: [class, pass] # Define that we want to consider 'class' and 'pass' as keywords, colored with the 'keyword' color.

    sub_rules: [
      # There are things which may need more work to match in a scope, so, for this case we can
      # use sub_rules.
      {type: CompositeRule, sub_rules: [ # A composite rule will only be matched if all the rules it contains also match.
        { type: SequenceRule, scope: keyword, sequence: 'class'}, # Define that 'class' is a keyword
        { type: OneOrMoreSpacesRule, scope: default}, # After class we need at least a space
        { type: AnyWordRule, scope: class }] # And any name after that is the class we matched
      },
    ]

file_extensions: [liclipse_example] # The extensions matched by this language
filename: []
name: LiClipse Example # Name of the language

outline: # Note that we just specify 'flat' items here, the indent is later used to specify that an item creates a new scope.
  - {type: Scope, scope: [default, class], define: class} # Wherever we have a class inside the default scope we'll show a class icon in the outline.

indent: {
  type: spaces, # Our example language uses spaces for indenting
  outline_scopes: [class], # We have to say which outline entries actually create a new scope (so, indent and outline work together to specify the tree).
}

# Specify that the default comment action (Ctrl+/) creates a multiLineComment and that it should wrap it with '###'.
comment: {type: multiLine, start: '###', end: '###', scope: multiLineComment}
The rules available are:
CompositeRule:
A rule which is only matched if all its internal rules also match. When used in the context of a coloring, it'll return a token composed of multiple scopes.
sub_rules: A list with the rules that compose this rule.
Example:
#In this example, if we have a word and a colon afterwards, the word will use the class color and the colon the operator color.
default:
  sub_rules: [
    {type: CompositeRule, sub_rules: [
      { type: AnyWordRule, scope: class},
      { type: SequenceRule, scope: operator, sequence: ':'}]
    },
  ]
MultiLineRule:
A rule which may span multiple lines after we find the start sequence (and which ends at the end sequence).
start: The start sequence for the rule
end: The end sequence for the rule
scope: The scope which this rule defines
escapeCharacter: a Character used to escape the next char in the rule
Example:
scope_definition_rules:
# Matching a multi line comment for HTML
- {type: MultiLineRule, scope: multiLineComment, start: '<!--', end: '-->', escapeCharacter: '\0'}
# Matching a multi line string in Python
- {type: MultiLineRule, scope: singleQuotedMultiLineString, start: "'''", end: "'''", escapeCharacter: \}
OptionalMultiLineRule:
Same thing as a MultiLineRule, usually used in a CompositeRule to specify that some
multi line pattern may be optionally matched.
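As a rough sketch of how this could look inside a CompositeRule: the fragment below assumes that OptionalMultiLineRule accepts the same attributes as MultiLineRule (start, end, scope, escapeCharacter), which the text above suggests but does not spell out. The language construct (a 'function' keyword followed by a name and an optional parenthesized block) and the colors chosen are only illustrative.

```yaml
scope:
  default:
    sub_rules: [
      # 'function', a name, and then a '(...)' block which may or may not be present.
      {type: CompositeRule, sub_rules: [
        {type: SequenceRule, scope: keyword, sequence: 'function'},
        {type: OneOrMoreSpacesRule, scope: default},
        {type: AnyWordRule, scope: class},
        # Assumption: same attributes as MultiLineRule.
        {type: OptionalMultiLineRule, scope: string, start: '(', end: ')', escapeCharacter: '\0'},
      ]},
    ]
```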
MultiLineRuleWithSkip:
A rule which may span multiple lines after we find the start sequence (and which ends at the end sequence) with
optional rules which may be used to skip sections while doing the match.
start: The start sequence for the rule
end: The end sequence for the rule
scope: The scope which this rule defines
escapeCharacter: a Character used to escape the next char in the rule
skip_rules: a list of rules which when matched may skip a section of the document while matching the multi line rule.
Example:
#Matching '<' all the way through '>' and skipping strings which may have < or > inside it.
{type: MultiLineRuleWithSkip, scope: tag, start: '<', end: '>', escapeCharacter: '\0',
skip_rules:[
#Needed because if we find the end sequence within a string, we want to skip it.
{type: MultiLineRule, scope: unused0, start: '"', end: '"', escapeCharacter: '\0'},
{type: MultiLineRule, scope: unused1, start: "'", end: "'", escapeCharacter: '\0'},
]
}
MultiLineRuleRecursive:
It's the same thing as the MultiLineRuleWithSkip, but if the start sequence is found again, a level is added
and the same number of end sequences must be found to finish the rule.
Example:
#Matching '<' all the way through its matching '>' (each nested '<' requires one more '>') and skipping strings which may have < or > inside it.
{type: MultiLineRuleRecursive, scope: tag, start: '<', end: '>', escapeCharacter: '\0',
skip_rules:[
#Needed because if we find the end sequence within a string, we want to skip it.
{type: MultiLineRule, scope: unused0, start: '"', end: '"', escapeCharacter: '\0'},
{type: MultiLineRule, scope: unused1, start: "'", end: "'", escapeCharacter: '\0'},
]
}
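To make the level counting more concrete, here is a minimal sketch using parentheses. The scope name parens is hypothetical, and it is an assumption that the rule can be declared like this in scope_definition_rules with a single skip rule.

```yaml
scope_definition_rules:
  # For a text such as: (a (b) c)
  # the inner '(' adds a level, so the rule only finishes at the second ')'.
  - {type: MultiLineRuleRecursive, scope: parens, start: '(', end: ')', escapeCharacter: '\0',
      skip_rules: [
        # A ')' inside a double quoted string should not close a level.
        {type: MultiLineRule, scope: unused0, start: '"', end: '"', escapeCharacter: '\0'},
      ]}
```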
RegexpRule:
Rule that matches a regular expression. The regexp is only matched against a single line and at most 512 chars (whichever is shorter).
The regexp uses Java semantics.
Example:
# Matches a regular expression
{ type: RegexpRule, regexp: 'aabb', scope: decorator }
AnyWordRule:
Rule that matches any identifier word
(i.e.: Where the first character matches Character.isJavaIdentifierStart and
the remaining ones match Character.isJavaIdentifierPart).
Usually used in a CompositeRule.
mustStartUppercase: if True it'll only start matching if the first char is uppercase
except: list of strings with the words that shouldn't be matched by the rule
additionalChars: string with the additional characters to be matched by the rule
Example:
# Matching any word after detecting we're in a decorator context in Python.
{ type: AnyWordRule, scope: decorator, mustStartUppercase: False }
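The optional attributes can be combined. The following is only a sketch: the word list and the extra character are made up for illustration, and how additionalChars interacts with the first character of the word is not stated above.

```yaml
scope:
  default:
    sub_rules: [
      {type: CompositeRule, sub_rules: [
        {type: SequenceRule, scope: keyword, sequence: 'class'},
        {type: OneOrMoreSpacesRule, scope: default},
        # Only words starting with an uppercase char, also accepting '-' as part
        # of the word and refusing the listed words.
        {type: AnyWordRule, scope: class, mustStartUppercase: True,
          except: ['None', 'True', 'False'], additionalChars: '-'},
      ]},
    ]
```

The words in except are quoted so that YAML reads them as strings and not as booleans.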
PatternRule:
A complex rule upon which most other rules are built (so, usually there's a more specific rule for a given case).
startSequence: the start sequence for this rule
endSequence: the end sequence for this rule
scope: the scope which this rule defines
escapeCharacter: a char used to escape the next char in the rule
breaksOnEOL: indicates that the end of line also terminates the pattern
breaksOnEOF: indicates that the end of the file also terminates the pattern
escapeContinuesLine: If true when the escape char is found, a following new line will not terminate the pattern (even if breaksOnEOL is set).
Example:
scope_definition_rules:
# Matching [xxxx]_ only in the current line
- {type: PatternRule, scope: javadocLink, startSequence: '[',
endSequence: ']_', escapeCharacter: '\0', breaksOnEOL: true,
breaksOnEOF: false, escapeContinuesLine: false}
SingleLineRule:
A rule which must start and end on the same line
sequence: the start/end sequence for this rule
scope: the scope which this rule defines
escapeCharacter: a char used to escape the next char in the rule
escapeContinuesLine: If true when the escape char is found, a following new line will not terminate the pattern.
Example:
scope_definition_rules:
# Matching a double quoted string in Python
- {type: SingleLineRule, scope: doubleQuotedString, sequence: '"', escapeCharacter: \, escapeContinuesLine: true}
SequenceRule:
Usually used in a CompositeRule to specify that some sequence must be matched.
sequence: the exact sequence to be matched in this rule.
scope: the scope which this rule defines.
Example:
#Matching the word 'function'
{ type: SequenceRule, scope: keyword, sequence: 'function'}
SequencesRule:
Usually used in a CompositeRule to specify that one of the specified sequences must be matched.
sequences: the sequences to be matched (one of them must be matched to validate this rule).
scope: the scope which this rule defines.
Example:
#Matching the word 'function' or 'def'
{ type: SequencesRule, scope: keyword, sequences: ['function', 'def']}
OptionalSequenceRule:
Same thing as the SequenceRule, but meant to be used in a CompositeRule to optionally match some sequence.
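A possible use, assuming OptionalSequenceRule takes the same sequence and scope attributes as SequenceRule (the 'def' construct below is just an illustration):

```yaml
scope:
  default:
    sub_rules: [
      {type: CompositeRule, sub_rules: [
        {type: SequenceRule, scope: keyword, sequence: 'def'},
        {type: OneOrMoreSpacesRule, scope: default},
        {type: AnyWordRule, scope: class},
        # The trailing ':' is colored as an operator when present, but it's not required.
        {type: OptionalSequenceRule, scope: operator, sequence: ':'},
      ]},
    ]
```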
EndOfLineRule:
A rule which starts after a sequence is found and ends at the end of the line.
start: the sequence which starts matching this rule.
scope: the scope which this rule defines.
Example:
scope_definition_rules:
# Matching a comment in Python
- {type: EndOfLineRule, scope: singleLineComment, start: '#'}
OneOrMoreSpacesRule:
Usually used in a CompositeRule to match one or more whitespaces. A whitespace is matched if Character.isWhitespace(char) returns true and it's not a newline.
scope: the scope which this rule defines.
ZeroOrMoreSpacesRule:
Usually used in a CompositeRule to match zero or more whitespaces. A whitespace is matched if Character.isWhitespace(char) returns true and it's not a newline.
scope: the scope which this rule defines.
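For instance, a composite rule could use ZeroOrMoreSpacesRule where spacing is optional (OneOrMoreSpacesRule would be used the same way where at least one space is required). This is only a sketch and the construct being matched is made up:

```yaml
scope:
  default:
    sub_rules: [
      # Matches 'name=' as well as 'name   ='.
      {type: CompositeRule, sub_rules: [
        {type: AnyWordRule, scope: class},
        {type: ZeroOrMoreSpacesRule, scope: default},
        {type: SequenceRule, scope: operator, sequence: '='},
      ]},
    ]
```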
NumberRule:
A rule which matches a number.
Some examples of numbers matched:
1.3e4 (float with exp), 0xAF (hex), 1 (int), 33. (float)
scope: the scope which this rule defines.
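A minimal sketch of how the rule might be declared. It assumes that the rule may be used in the sub_rules of the default scope (as the rules in the first example are) and that number is one of the color names available in the selected theme; neither is confirmed here.

```yaml
scope:
  default:
    sub_rules: [
      # Assumption: 'number' is an available color name.
      {type: NumberRule, scope: number},
    ]
```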
SwitchLanguageHtmlRule:
This is a special rule so that when we are inside a given script pattern in HTML we consider the contents
of that language as a different language (i.e.: for when we embed css, javascript, etc inside HTML).
Example:
scope_definition_rules:
#The SwitchLanguageHtmlRule is a special hand-made rule to match the html script tag.
#If there are other 'language' switching cases, this may need to be more flexible.
#It creates sub-tokens for the tag as the rules here (open_tag, close_tag, class, etc, so, if this
#changes, the rule may need to be changed too).
- {type: SwitchLanguageHtmlRule, #custom rule matching for: '<script type="???", language="???">', end: '</script>'
scope: this, #On a switch, the scope must always be 'this'
tag: 'script',
type_attr: {
'application/javascript': javascript, 'application/ecmascript': javascript, 'application/x-javascript': javascript,
'application/x-ecmascript': javascript, 'text/javascript': javascript, 'text/ecmascript': javascript, 'text/jscript':javascript
},
language_attr: {JavaScript: javascript} #the expected language attr to switch to the target language (used with startswith() and case-independent)
}
SwitchLanguageRule:
Same as a multi-line rule, but the contents inside the rule will be partitioned/scanned as being another language
(actions which are language dependent should obey this properly later on).
scope: the scope which this rule defines.
start: the sequence which starts matching this rule.
end: the sequence which ends matching this rule.
language: the language to be considered inside the block.
Example:
{type: SwitchLanguageRule, scope: python_block, start: '<%', end: '%>', language: python}
IndentedBlockRule:
Rule which will match any indented block after some prefix is found.
start: the sequence which starts matching this rule.
scope: the scope which this rule defines.
column: -1 means it can start anywhere, 0 (default) means it'll only match it if the sequence is found in column 0.
additional_start: Optional: it's a list of rules. If given, after matching the start it also has to match these rules (as if it was inside a CompositeRule).
Example:
scope_definition_rules:
#Literal Block (column -1 means it can start anywhere)
# literal block::
# xxx xxx xxx
# xxx xxx xxx
- {type: IndentedBlockRule, scope: literalBlock, start: '::', column: -1}
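The additional_start attribute is not covered by the example above. As a sketch only (the scope name directive is hypothetical and the rules chosen for additional_start are an assumption about what can be listed there), a block that starts with '..' followed by a word and '::' might be declared as:

```yaml
scope_definition_rules:
  # Matches an indented block such as:
  # .. note::
  #     xxx xxx xxx
  - {type: IndentedBlockRule, scope: directive, start: '..', column: 0,
      additional_start: [
        {type: OneOrMoreSpacesRule, scope: directive},
        {type: AnyWordRule, scope: directive},
        {type: SequenceRule, scope: directive, sequence: '::'},
      ]}
```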
MatchLineStartRule:
This rule should only be used inside a CompositeRule. Matches the start of a line (column 0). Its length is always 0.
scope: the scope which this rule defines.
Example:
scope_definition_rules:
# Matching an rst title:
# xxxxx
# ------
- {type: CompositeRule, sub_rules:[ #Note: when a composite rule is defined here,
# all the scopes in the inner parts must have the same type.
{ type: MatchLineStartRule, scope: title},
{ type: SkipLineRule, scope: title},
{ type: RepeatCharToEolRule, scope: title, chars: ['-', '=', '_', '~', '`']},
]}
SkipLineRule:
This rule will skip all the contents of the current line from the current position until the end of the line. Made to be used inside a CompositeRule.
scope: the scope which this rule defines.
Example:
scope_definition_rules:
# Matching an rst title:
# xxxxx
# ------
- {type: CompositeRule, sub_rules:[ #Note: when a composite rule is defined here,
# all the scopes in the inner parts must have the same type.
{ type: MatchLineStartRule, scope: title},
{ type: SkipLineRule, scope: title},
{ type: RepeatCharToEolRule, scope: title, chars: ['-', '=', '_', '~', '`']},
]}
RepeatCharToEolRule:
This rule will match if all the characters in the current line until the end of the line (or file) are the same.
scope: the scope which this rule defines.
chars: the characters that this rule can match.
Example:
scope_definition_rules:
# Matching an rst title:
# xxxxx
# ------
- {type: CompositeRule, sub_rules:[ #Note: when a composite rule is defined here,
# all the scopes in the inner parts must have the same type.
{ type: MatchLineStartRule, scope: title},
{ type: SkipLineRule, scope: title},
{ type: RepeatCharToEolRule, scope: title, chars: ['-', '=', '_', '~', '`']},
]}
PrevCharNotIn:
This rule is always a 0-sized rule which just checks that the previous char is not one of the passed characters.
scope: the scope which this rule defines.
chars: the characters which must not appear right before the current position.
Example:
# Single Line Strings start only if not right after a number.
- {type: CompositeRule, sub_rules: [
{type: PrevCharNotIn, scope: singleQuotedString, chars: '0123456789'}, # I.e.: we can't be inside a number
{type: SingleLineRule, scope: singleQuotedString, sequence: "'", escapeCharacter: \, escapeContinuesLine: true},
]}
SingleLineRuleWithSkip:
This rule would usually match a single line after some character is found and can optionally have additional rules if we need to skip some parts (such as a parenthesis that must be closed).
scope: the scope which this rule defines.
start: the characters that start this rule.
escapeCharacter: a char used to escape the next char in the rule
escapeContinuesLine: If true when the escape char is found, a following new line will not terminate the pattern.
skipRules: The rules that should be skipped after the start sequence is found.
Example:
{type: SingleLineRuleWithSkip, scope: line_statement, start: '#', escapeCharacter: '\0', escapeContinuesLine: false, skipRules:[
{type: MultiLineRule, scope: keyword, start: '(', end: ')', escapeCharacter: '\0'},
{type: MultiLineRule, scope: keyword, start: '[', end: ']', escapeCharacter: '\0'},
{type: MultiLineRule, scope: keyword, start: '{', end: '}', escapeCharacter: '\0'},
]}