Prism is awesome out of the box, but it’s even awesomer when it’s customized to your own needs. This section will help you write new language definitions and plugins, and hack on Prism in general.
Every language is defined as a set of tokens, which are expressed as regular expressions. For example, this is the language definition for CSS:
A regular expression literal is the simplest way to express a token. An alternative way, with more options, is to use an object literal. With that notation, the regular expression describing the token is the pattern attribute:
...
'tokenname': {
	pattern: /regex/
}
...
So far the functionality is exactly the same between the short and extended notations. However, the extended notation allows for additional options:
lookbehind: when set to true, the first capturing group in the regex pattern is discarded when matching this token, so it effectively behaves as if it were a lookbehind. For an example of this, check out the C-like language definition, in particular the comment and class-name tokens.
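As a rough illustration of the lookbehind option (this is not the C-like definition itself), the sketch below shows what discarding the first capturing group amounts to. The helper matchWithLookbehind and the className pattern are hypothetical names made up for this example; Prism's own matching code is more involved.

```javascript
// Hypothetical helper (not part of Prism): applies a pattern the way the
// lookbehind option is described, i.e. the first capturing group must
// match but is dropped from the resulting token.
function matchWithLookbehind(pattern, text) {
  const match = pattern.exec(text);
  if (!match) {
    return null;
  }
  // Length of the discarded prefix (0 when group 1 did not participate).
  const skipped = match[1] ? match[1].length : 0;
  return {
    index: match.index + skipped,
    text: match[0].slice(skipped)
  };
}

// Illustrative pattern: a word counts only when it follows "class ".
const className = /(\bclass\s+)\w+/;
const result = matchWithLookbehind(className, 'public class Foo {}');
console.log(result); // { index: 13, text: 'Foo' }
```

The token starts after the discarded group, so "class " itself stays available for other tokens (such as a keyword token) to match.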
rest: for an example where rest is useful, check the Markup definitions above.
alias: in the following example, the token name latex-equation is not supported by any theme, but it will be highlighted the same as a string.
{
	'latex-equation': {
		pattern: /\$(\\?.)*?\$/g,
		alias: 'string'
	}
}
greedy: for example, if the text /* foo */ appears inside a string, you would not want it to be highlighted as a comment. The greedy property allows a pattern to ignore previous matches of other patterns and overwrite them when necessary. Use this flag with restraint, as it incurs a small performance overhead. The following example demonstrates its usage:
'string': {
	pattern: /(["'])(\\(?:\r\n|[\s\S])|(?!\1)[^\\\r\n])*\1/,
	greedy: true
}
Unless explicitly allowed through the inside property, a token cannot contain other tokens, so the order of the tokens is significant. Although the ECMAScript specification does not require objects to keep a specific ordering of their properties, in practice they do in every modern browser.
In most languages there are multiple ways of declaring the same construct (e.g. comments, strings, ...) and sometimes it is difficult or impractical to match all of them with a single regular expression. To add multiple regular expressions for one token name, an array can be used:
...
'tokenname': [ /regex0/, /regex1/, { pattern: /regex2/ } ]
...
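The three notations (bare regular expression, object literal, array of either) can be viewed as one uniform shape. The following is a minimal sketch of that idea; normalizeToken and the sample grammar are hypothetical and written only for this illustration, they are not part of Prism's API.

```javascript
// Hypothetical helper (not part of Prism): turns any of the three
// notations into an array of { pattern, ...options } objects.
function normalizeToken(definition) {
  const list = Array.isArray(definition) ? definition : [definition];
  return list.map(function (entry) {
    return entry instanceof RegExp ? { pattern: entry } : entry;
  });
}

// Made-up grammar that uses each notation once.
const grammar = {
  'comment': /\/\*[\s\S]*?\*\//,                          // short notation
  'string': { pattern: /"[^"]*"/, greedy: true },         // extended notation
  'number': [/\b0x[\da-f]+\b/i, { pattern: /\b\d+\b/ }]   // array of both
};

const normalized = {};
for (const name of Object.keys(grammar)) {
  normalized[name] = normalizeToken(grammar[name]);
}
console.log(normalized.number.length); // 2
```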
Prism.languages.insertBefore(inside, before, insert, root)
This is a helper method to ease modifying existing languages. For example, the CSS language definition not only defines CSS highlighting for CSS documents, but also needs to define highlighting for CSS embedded in HTML through <style> elements. To do this, it needs to modify Prism.languages.markup and add the appropriate tokens. However, Prism.languages.markup is a regular JavaScript object literal, so if you do this:
Prism.languages.markup.style = {
	/* tokens */
};
then the style token will be added (and processed) at the end. Prism.languages.insertBefore allows you to insert tokens before existing tokens. For the CSS example above, you would use it like this:
Prism.languages.insertBefore('markup', 'cdata', {
	'style': {
		/* tokens */
	}
});
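To see why a helper is needed at all, here is a minimal sketch of inserting keys before an existing key in a plain object. It relies on the insertion-order behaviour of string keys and returns a new object. insertKeysBefore and the sample patterns are hypothetical; this is not Prism's actual implementation of Prism.languages.insertBefore.

```javascript
// Hypothetical stand-in for the idea behind insertBefore: rebuild the
// object so that the new keys land just before `beforeKey`.
function insertKeysBefore(target, beforeKey, additions) {
  const result = {};
  for (const key of Object.keys(target)) {
    if (key === beforeKey) {
      // Copy the new tokens first so they are processed earlier.
      Object.assign(result, additions);
    }
    // Keys redefined by `additions` are not copied from the original.
    if (!Object.prototype.hasOwnProperty.call(additions, key)) {
      result[key] = target[key];
    }
  }
  return result;
}

// Made-up, simplified markup grammar used only for this example.
const markup = {
  comment: /<!--[\s\S]*?-->/,
  cdata: /<!\[CDATA\[[\s\S]*?\]\]>/i,
  tag: /<\/?[\w-]+[^>]*>/
};
const extended = insertKeysBefore(markup, 'cdata', {
  style: /<style[\s\S]*?<\/style>/i
});
console.log(Object.keys(extended)); // [ 'comment', 'style', 'cdata', 'tag' ]
```

Because the sketch builds a new object instead of mutating the original, the caller has to store the returned value wherever the grammar is looked up.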
inside: the property of root that contains the object to be modified, usually a language id.
before: the key to insert before.
insert: an object containing the tokens to be inserted.
root: the object that contains the object to be modified. Defaults to Prism.languages.
Prism’s plugin architecture is fairly simple. To add a callback, you use Prism.hooks.add(hookname, callback). hookname is a string with the hook id, which uniquely identifies the hook your code should run at. callback is a function that accepts one parameter: an object with various variables that can be modified, since objects in JavaScript are passed by reference.
For example, here’s a plugin from the Markup language definition that adds a tooltip to entity tokens which shows the actual character encoded:
Prism.hooks.add('wrap', function(env) {
	// Entity tokens get a tooltip showing the decoded character.
	if (env.type === 'entity') {
		env.attributes['title'] = env.content.replace(/&amp;/, '&');
	}
});
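The hook mechanism itself can be pictured as a small registry of callbacks keyed by hook id. The sketch below is a simplified, hypothetical version written for illustration; it is not Prism's source, and the shape of the env object (type, content, attributes) is an assumption based on the example above.

```javascript
// Simplified, hypothetical hook registry in the spirit of Prism.hooks.
const hooks = {
  all: {},
  add(name, callback) {
    (this.all[name] = this.all[name] || []).push(callback);
  },
  run(name, env) {
    for (const callback of this.all[name] || []) {
      callback(env); // callbacks mutate the shared env object
    }
  }
};

// Same idea as the entity tooltip plugin, registered on the sketch registry.
hooks.add('wrap', function (env) {
  if (env.type === 'entity') {
    env.attributes.title = env.content.replace(/&amp;/, '&');
  }
});

// Assumed env shape for a wrapped entity token.
const env = { type: 'entity', content: '&amp;copy;', attributes: {} };
hooks.run('wrap', env);
console.log(env.attributes.title); // &copy;
```

Since every callback receives the same env object, a plugin changes the output simply by modifying its properties; nothing needs to be returned.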
Of course, to understand which hooks to use you would have to read Prism’s source. Imagine where you would add your code and then find the appropriate hook. If there is no hook you can use, you may request one to be added, detailing why you need it there.
Prism.highlightAll(async, callback)
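This is the most high-level function in Prism’s API. It fetches all the elements that have a .language-xxxx class and then calls Prism.highlightElement() on each one of them.
async: whether to use Web Workers for the highlighting. All language definitions required to highlight the code must be included in the main prism.js file for the async highlighting to work. You can build your own bundle on the Download page.
callback: an optional function invoked after the highlighting is done. Mostly useful when async is true, since in that case the highlighting is done asynchronously.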
Prism.highlightAllUnder(element, async, callback)
Fetches all the descendants of element that have a .language-xxxx class and then calls Prism.highlightElement() on each one of them.
element: the root element, whose descendants that have a .language-xxxx class will be highlighted.
async: same as in Prism.highlightAll().
callback: same as in Prism.highlightAll().
Prism.highlightElement(element, async, callback)
Highlights the code inside a single element.
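element: the element containing the code. It must have a class of language-xxxx to be processed, where xxxx is a valid language identifier.
async: same as in Prism.highlightAll().
callback: same as in Prism.highlightAll().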
Prism.highlight(text, grammar)
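Low-level function; only use it if you know what you’re doing. It accepts a string of text as input and the language definitions to use, and returns a string with the HTML produced.
text: a string with the code to be highlighted.
grammar: an object containing the tokens to use, usually a language definition like Prism.languages.markup.
Returns: the highlighted HTML.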
Prism.tokenize(text, grammar)
This is the heart of Prism, and the most low-level function you can use. It accepts a string of text as input and the language definitions to use, and returns an array with the tokenized code. When the language definition includes nested tokens, the function is called recursively on each of these tokens. This method could be useful in other contexts as well, as a very crude parser.
text: a string with the code to be tokenized.
grammar: an object containing the tokens to use, usually a language definition like Prism.languages.markup.
Returns: an array of strings, tokens (class Prism.Token) and other arrays.
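Because the result mixes strings, tokens and nested arrays, code that consumes it usually has to walk it recursively. The following sketch shows one way to do that, under the assumption that each token exposes its text or nested stream in a content property; tokensToText is a hypothetical helper, and plain objects stand in for Prism.Token instances so the example runs without Prism loaded.

```javascript
// Hypothetical helper: recovers the plain text from a token stream shaped
// like the return value described above.
function tokensToText(stream) {
  if (typeof stream === 'string') {
    return stream;
  }
  if (Array.isArray(stream)) {
    return stream.map(tokensToText).join('');
  }
  // Token-like object: its content is a string, a token or an array.
  return tokensToText(stream.content);
}

// Hand-written sample stream, not actual Prism output.
const sample = [
  {
    type: 'tag',
    content: [
      { type: 'punctuation', content: '<' },
      'p',
      { type: 'punctuation', content: '>' }
    ]
  },
  'Hello ',
  { type: 'entity', content: '&amp;' }
];
console.log(tokensToText(sample)); // <p>Hello &amp;
```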