Adding a New Syntax to Prism

on Sep 5, 2017

At Kausal, we’re making it as easy as possible for developers to understand their applications’ behavior. To that end, we want writing PromQL expressions to be as fast and painless as possible, so we are adding intelligent autocompletion for metric names, label names, label values, and aggregation by-clauses. Kausal’s clever query editor is built on Slate.js, a rich-text editing framework. Its syntax highlighting is powered by Prism.js.

PromQL syntax highlighting in Slate editor

Since Prism does not come with PromQL syntax rules, we had to write our own to get proper highlighting for PromQL queries. Prometheus metric names present a difficulty because they are not known up front. This post shows how we added a new syntax to Prism, with dynamically loaded keywords.

Mapping Prism Tokens to a New Syntax

Syntax highlighting is essentially a mapping from tokens to colors. Some concepts are identical across languages, e.g., reserved keywords, numbers, and comments. Ideally, we would like to reuse those when adding a new syntax. To get started, we should see which tokens are available. The best resource I found was the prism.css stylesheet itself. There you can see that several tokens already map to the same color:

...

.token.selector,
.token.attr-name,
.token.string,
.token.char,
.token.builtin,
.token.inserted {
	color: #690;
}

...

Token names like attr-name already look useful for PromQL, where a label key in a selector plays much the same role as an attribute name.

Quick Look at the PromQL Syntax

PromQL queries can be as simple as “up” or as complex as the following:

# Aggregated metric with label selector
sum(rate(prometheus_evaluator_duration_seconds{node="foo"}[1m])) by (quantile)

At this point we have to make a choice. Either we choose to fully implement the grammar rules (e.g., by reverse engineering the PromQL lexer), or we see how far we get with defining our own custom tokens. In addition to simply mapping node="foo" to attr-name="attr-value" we can also define wider contexts. These nested rules make the token matching easier to reason about. Looking at the query above, we can dissect it as follows:

# Aggregated metric with label selector
aggregation(function(metric{label-key="label-value"}[range])) keyword (label-key)
                           <--  context-labels   --><-- -->   <--             -->
                                                 context-range
                                                             context-aggregation

Token Rules

Before we can specify the rules, we need to collect the reserved functions and operators like by and group_left and store them as simple arrays like OPERATORS. These lists will allow the construction of long OR-match expressions when joined with |.
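As a rough illustration of that joining step, the sketch below builds a keyword pattern from a short list. The entries in OPERATORS here are only a sample chosen for the example, not the full list used in the editor, and keywordPattern is a hypothetical helper name.

```javascript
// Sample subset only; the real lists of PromQL keywords and functions are longer.
const OPERATORS = ['by', 'without', 'group_left', 'group_right', 'on', 'ignoring', 'and', 'or', 'unless', 'offset'];

// Hypothetical helper: join the words into one alternation, anchored on word
// boundaries so that "by" does not match inside "standby".
function keywordPattern(words) {
  return new RegExp(`\\b(?:${words.join('|')})\\b`, 'i');
}

const operatorPattern = keywordPattern(OPERATORS);

console.log(operatorPattern.test('sum(up) by (job)')); // → true
console.log(operatorPattern.test('standby_total'));    // → false
```

The word boundaries matter because many metric names contain short keywords such as by, on, or or as substrings.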

When defining the tokens, we are free to use PromQL concepts like label-value. The alias fields then map the concept to a predefined Prism token, e.g., label-value is mapped to attr-value. Here is our current rule set:

Prism.languages.promql = {
  'comment': {
    pattern: /(^|[^\n])#.*/,
    lookbehind: true,
  },
  'context-aggregation': {
    pattern: /((by|without)\s*)\([^)]*\)/, // by ()
    lookbehind: true,
    inside: {
      'label-key': {
        pattern: /[^,\s][^,]*[^,\s]*/,
        alias: 'attr-name',
      },
    }
  },
  'context-labels': {
    pattern: /\{[^}]*(?=})/,
    inside: {
      'label-key': {
        pattern: /[a-z_]\w*(?=\s*(=|!=|=~|!~))/,
        alias: 'attr-name',
      },
      'label-value': {
        pattern: /"(?:\\.|[^\\"])*"/,
        greedy: true,
        alias: 'attr-value',
      },
    }
  },
  'function': new RegExp(`\\b(?:${FUNCTIONS.join('|')})(?=\\s*\\()`, 'i'),
  'context-range': [{
    pattern: /\[[^\]]*(?=])/, // [1m]
    inside: {
      'range-duration': {
        pattern: /\b\d+[smhdwy]\b/i,
        alias: 'number',
      }
    }
  }, {
    pattern: /(offset\s+)\w+/, // offset 1m
    lookbehind: true,
    inside: {
      'range-duration': {
        pattern: /\b\d+[smhdwy]\b/i,
        alias: 'number',
      }
    }
  }],
  'number': /-?\d+((\.\d*)?([eE][+-]?\d+)?)?\b/,
  'operator': new RegExp(`[-+*/=%^~]|&&?|\\|?\\||!=?|<(?:=>?|<|>)?|>[>=]?|\\b(?:${OPERATORS.join('|')})\\b`, 'i'),
  'punctuation': /[{};()`,.]/
}

Note how nested token definitions like context-labels, which simply matches everything from the opening { up to the closing }, greatly simplify the matching of the elements inside. For a full description of other matching features like greedy and lookbehind, see the official Prism documentation.
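To make the effect of nesting concrete, here is a minimal sketch that mimics the two-step matching with plain regular expressions and no Prism dependency. It illustrates the idea rather than how Prism works internally. labelTokens is a hypothetical helper, and the patterns are copied from the context-labels rule above.

```javascript
// Outer pattern: the opening brace and everything up to (not including) the closing one.
const CONTEXT_LABELS = /\{[^}]*(?=})/;
// Inner patterns, only ever applied to the text matched by the outer pattern.
const LABEL_KEY = /[a-z_]\w*(?=\s*(=|!=|=~|!~))/g;
const LABEL_VALUE = /"(?:\\.|[^\\"])*"/g;

// Hypothetical helper: return the label keys and values found in the first
// label selector of a query, or null if the query has no selector.
function labelTokens(query) {
  const context = query.match(CONTEXT_LABELS);
  if (!context) return null;
  const inner = context[0];
  return {
    keys: inner.match(LABEL_KEY) || [],
    values: inner.match(LABEL_VALUE) || [],
  };
}

const query = 'sum(rate(prometheus_evaluator_duration_seconds{node="foo"}[1m])) by (quantile)';
console.log(labelTokens(query));
// The inner patterns never see "sum", "rate", or "quantile", so they do not
// need to guard against matching outside the braces.
```

Unlike Prism, this sketch applies the key and value patterns independently, so a value that itself contains text like b=c would also yield a spurious key. Prism avoids this kind of overlap by consuming matched text as it goes.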

Dynamically Adding Rules

Now what about queries like up? This is a single metric name, and not matched by any of the rules. We found it easiest to just define an additional token for all metric names. The problem is that the query field is displayed immediately, while the full list of available metric names is loading in the background.

Luckily, new tokens can be added at any point. When the request for metric names returns with the list, we can simply construct a new token pattern on the fly and add it to the PromQL rules:

function updateMetricNames(metrics) {
  Prism.languages.promql.metric = {
    pattern: new RegExp(`\\b(${metrics.join('|')})\\b`, 'i'),
    alias: 'variable',
  };
}

The new token rules come into effect the next time the query field renders. In our case we trigger a re-render of the Slate editor field after updateMetricNames is called.
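A slightly more defensive variant is sketched below as one possible refinement, not as the code running in the editor. It assumes the metric list may be empty, tries longer names first, and drops the case-insensitive flag because Prometheus metric names are case-sensitive. buildMetricRule and escapeRegExp are hypothetical helpers.

```javascript
// Escape regex metacharacters. Valid Prometheus metric names contain only
// letters, digits, underscores and colons, so this is purely defensive.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: build the token rule, or return null when there is
// nothing to match (an empty alternation would match the empty string).
function buildMetricRule(metrics) {
  if (!metrics || metrics.length === 0) return null;
  const alternatives = [...metrics]
    .sort((a, b) => b.length - a.length) // longest names first
    .map(escapeRegExp);
  return {
    pattern: new RegExp(`\\b(?:${alternatives.join('|')})\\b`),
    alias: 'variable',
  };
}

// Assumes the global Prism object and the promql rules defined earlier.
function updateMetricNames(metrics) {
  const rule = buildMetricRule(metrics);
  if (rule) {
    Prism.languages.promql.metric = rule;
  }
}

const rule = buildMetricRule(['up', 'up:rate5m']);
console.log('up:rate5m'.match(rule.pattern)[0]); // → up:rate5m
```

The ordering matters for names that contain colons. A colon counts as a word boundary, so with up listed first the pattern would stop after up and leave :rate5m unhighlighted.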

Conclusion

Initial tests showed that our custom token rules got us pretty far. The context markers in particular have been very helpful in deciding which suggestions to load when a user starts typing labels like {instance=. Admittedly, the rules will not parse all possible PromQL expressions; for example, group_left combined with ignoring clauses is not covered. For those cases we need to see whether we can add or tweak rules, or whether we have to implement a proper lexer.
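To give a feel for how a context can drive suggestions, here is a simplified sketch that classifies the text to the left of the cursor. It uses plain regular expressions instead of the Prism tokens the editor relies on, so it is an illustration of the idea and not our implementation. contextAtCursor and the returned context names are hypothetical, and braces inside quoted label values are not handled.

```javascript
// Hypothetical helper: decide what kind of suggestion fits at the cursor.
function contextAtCursor(query, cursor) {
  const before = query.slice(0, cursor);

  // Inside a label selector that has been opened but not yet closed.
  const openBrace = before.lastIndexOf('{');
  if (openBrace > before.lastIndexOf('}')) {
    const selector = before.slice(openBrace + 1);
    // A matching operator with no completed value after it: suggest values.
    return /(=|!=|=~|!~)\s*"?[^",]*$/.test(selector) ? 'label-value' : 'label-key';
  }

  // Inside an unclosed by (...) or without (...) clause.
  if (/\b(?:by|without)\s*\([^)]*$/.test(before)) return 'aggregation-label';

  return 'metric-or-function';
}

console.log(contextAtCursor('up{instance=', 12));         // → label-value
console.log(contextAtCursor('up{job="a", ', 12));         // → label-key
console.log(contextAtCursor('sum(rate(x[1m])) by (', 21)); // → aggregation-label
```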

If you spot any rules that could be tweaked, or would like to chat more about Slate, Prism, or syntax highlighting, reach out on Twitter.
