Behind the Scenes: Query Language Editor

24.10.2022

The last blog post taught you how to set up a query language lexer and parser using ANTLR. This post will cover making this setup accessible in a user interface.

The Options

The last blog post introduced a function toTargetPredicate taking a simple query string and returning a tree structure according to the defined grammar. So the only thing the user needs to do is to define the query string. The obvious cheap option would be to provide a simple text field and parse the input using toTargetPredicate. Since we want more advanced user support, like auto-completion, syntax highlighting, and error handling, we decided to go with the Monaco editor. The Monaco editor is what powers VS Code under the hood. It can also be used in web frontends using the monaco-editor-core module.

There are tons of tutorials out there covering how to use this package together with your frontend technology, so I will concentrate on what's needed to combine our previously described query language lexer and parser with the powers of a full-blown editor.

The Basic Setup

Starting with Monaco is as simple as it can be. The only import needed is:

import * as monaco from 'monaco-editor-core';

The monaco.editor.create function accepts a lot of properties to control the visual appearance and behavior of the editor; just check the docs.

The first thing to do with the import is to register a language using the following:

monaco.languages.register({ id: 'MY_LANGUAGE' });

After this, you should be ready to have a small, simple text editor in your front end, as in the following image (note that the styling and the ? depend on your usage).
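As a rough sketch of how the pieces fit together, the options for monaco.editor.create can be collected in a small helper. The helper name buildQueryEditorOptions and the chosen values are assumptions for illustration. The option names themselves (value, language, theme, minimap, lineNumbers, automaticLayout) are regular Monaco editor options.

```typescript
// Hypothetical helper: collects the options passed to monaco.editor.create.
// The values are only one possible choice for a compact query field.
interface QueryEditorOptions {
  value: string;
  language: string;
  theme: string;
  minimap: { enabled: boolean };
  lineNumbers: 'on' | 'off';
  automaticLayout: boolean;
}

function buildQueryEditorOptions(initialQuery: string): QueryEditorOptions {
  return {
    value: initialQuery,
    language: 'MY_LANGUAGE', // must match the id given to monaco.languages.register
    theme: 'vs', // built-in light theme
    minimap: { enabled: false },
    lineNumbers: 'off',
    automaticLayout: true, // re-layout when the container is resized
  };
}

// In the browser, with monaco imported as above and a container element at hand:
// const editor = monaco.editor.create(container, buildQueryEditorOptions(''));
```

Keeping the options in a plain function makes them easy to reuse and to check without a browser.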

Syntax Highlighting

Before we explain our approach, one word upfront: Our setup and needs may likely differ from yours. I always recommend checking the official docs for all functions used to see how much you can customize to create your desired experience.

To get syntax highlighting support, you need to apply two steps:

  • tokenize your string and

  • define colors for your tokens

Let's start with the easy part, defining the colors. Monaco's static function defineTheme is the way to go.

// An object holding your colors
const colors = {
  green: '#7b9826',
  normalText: '#2c2421',
  purple: '#c33ef4',
  lightBlue: '#3d97b8',
  blue: '#407ee7',
  default: '#8F9DAD',
  neutral100: '#F4F7FB'
};
// Using your colors to define some tokens
monaco.editor.defineTheme('MY_THEME', {
  base: 'vs',
  rules: [
    { token: 'AND', foreground: colors.green },
    { token: 'OR', foreground: colors.green },
    { token: 'NOT', foreground: colors.green },
    { token: 'LPAREN', foreground: colors.normalText },
    { token: 'RPAREN', foreground: colors.normalText },
    { token: 'OP_NOT_EQUAL', foreground: colors.purple },
    { token: 'OP_EQUAL', foreground: colors.purple },
    { token: 'OP_TILDE', foreground: colors.purple },
    { token: 'OP_NOT_TILDE', foreground: colors.purple },
    { token: 'QUOTED', foreground: colors.lightBlue },
    { token: 'TERM', foreground: colors.blue },
    { token: 'DEFAULT_SKIP', foreground: colors.default },
    { token: 'UNKNOWN', foreground: colors.normalText },
  ],
  inherit: false,
  colors: {
    'editor.background': colors.neutral100,
  },
});

A rule here maps a token name to a color. The tokens defined here are very likely not the tokens you will have. They come from our grammar and can be found in the lexer generated by ANTLR.
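Since the token names come from the generated lexer, it can help to check that every name has a theme rule. The following is a minimal sketch under assumptions: tokensWithoutRule is a hypothetical helper, and the sample arrays stand in for QueryLanguageLexer.symbolicNames and the rules passed to defineTheme. ANTLR's generated symbolicNames array typically contains null entries, so those are skipped.

```typescript
// Hypothetical helper: lists lexer token names that have no theme rule yet.
function tokensWithoutRule(
  symbolicNames: Array<string | null | undefined>,
  rules: Array<{ token: string }>,
): string[] {
  const themed = rules.map((rule) => rule.token);
  return symbolicNames.filter(
    (name): name is string => typeof name === 'string' && themed.indexOf(name) === -1,
  );
}

// Stand-in data; the real setup would pass QueryLanguageLexer.symbolicNames
// and the rules array used in the theme definition.
const missing = tokensWithoutRule(
  [null, 'AND', 'OR', 'TERM'],
  [{ token: 'AND' }, { token: 'OR' }],
);
console.log(missing); // ['TERM']
```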

Visually, nothing has changed so far. This is because we need to tell Monaco how to create these tokens from the given text string.

Let's check the following code:

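// A class cannot be instantiated before its declaration has been evaluated,
// so in a single file this call has to run after the classes below.
monaco.languages.setTokensProvider('MY_LANGUAGE', new TokensProvider());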
class State implements monaco.languages.IState {
  clone(): monaco.languages.IState {
    return new State();
  }
  equals(other: monaco.languages.IState): boolean {
    // can always be true for our example
    return true;
  }
}

class TokensProvider implements monaco.languages.TokensProvider {
  getInitialState(): monaco.languages.IState {
    return new State();
  }
  tokenize(line: string): monaco.languages.ILineTokens {
    // So far we ignore the state, which may harm performance for massive texts
    return tokensForLine(line);
  }
}

function tokensForLine(input: string): monaco.languages.ILineTokens {
  // Using our created tokenize functionality to cut tokens and map them to monaco tokens
  const tokens = tokenize(input);
  return new LineTokens(
    tokens.map(
      (token) => new Token(QueryLanguageLexer.symbolicNames[token.type] || 'UNKNOWN', token.start, token.stop)
    )
  );
}

function tokenize(input: string): antlr4.Token[] {
  const chars = new antlr4.InputStream(input);
  const lexer = new QueryLanguageLexer(chars);
  return lexer.getAllTokens();
}

class LineTokens implements monaco.languages.ILineTokens {
  endState: monaco.languages.IState;
  tokens: monaco.languages.IToken[];

  constructor(tokens: monaco.languages.IToken[]) {
    this.endState = new State();
    this.tokens = tokens;
  }
}

class Token implements monaco.languages.IToken {
  scopes: string;
  startIndex: number;
  endIndex: number;

  constructor(ruleName: string, startIndex: number, endIndex: number) {
    // important: the ruleName must match your theme definition
    this.scopes = ruleName;
    this.startIndex = startIndex;
    this.endIndex = endIndex;
  }
}

Phew, there are a lot of classes here. Since Monaco was designed to support a variety of use cases, there is a lot of boilerplate code. One can copy this code as it is for a good start. The only change needed is the implementation of the function tokensForLine. Here, we use our own support function tokenize to create tokens according to the defined grammar. The rest is boilerplate for optimizations.
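To see what tokensForLine has to produce without running ANTLR, here is a small stand-in sketch. It is not the generated lexer: the regular expressions below are invented for illustration, and only the token names are taken from the theme above. The returned objects mirror the startIndex and scopes fields of Monaco's IToken.

```typescript
// Minimal shapes mirroring monaco.languages.IToken and ILineTokens.
interface SketchToken {
  startIndex: number;
  scopes: string;
}
interface SketchLineTokens {
  tokens: SketchToken[];
}

// Invented patterns, tried in order; the real ones live in the ANTLR grammar.
const patterns: Array<[string, RegExp]> = [
  ['DEFAULT_SKIP', /^\s+/],
  ['AND', /^and\b/i],
  ['OR', /^or\b/i],
  ['NOT', /^not\b/i],
  ['LPAREN', /^\(/],
  ['RPAREN', /^\)/],
  ['OP_NOT_EQUAL', /^!=/],
  ['OP_EQUAL', /^==/],
  ['OP_NOT_TILDE', /^!~/],
  ['OP_TILDE', /^=~/],
  ['QUOTED', /^"[^"]*"/],
  ['TERM', /^[\w.]+/],
];

function sketchTokensForLine(line: string): SketchLineTokens {
  const tokens: SketchToken[] = [];
  let index = 0;
  while (index < line.length) {
    const rest = line.slice(index);
    let scopes = 'UNKNOWN';
    let length = 1; // unmatched characters become one-character UNKNOWN tokens
    for (const [name, pattern] of patterns) {
      const match = pattern.exec(rest);
      if (match) {
        scopes = name;
        length = match[0].length;
        break;
      }
    }
    tokens.push({ startIndex: index, scopes });
    index += length;
  }
  return { tokens };
}

const sample = sketchTokensForLine('a == "b" and c');
console.log(sample.tokens.map((token) => token.scopes).join(' '));
```

Each scopes value is looked up in the theme rules, which is why the names have to match. In the real setup the ANTLR lexer replaces this hand-written loop.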

That is all you need to do to get this result:

With this basic setup and the syntax highlighting, you already have good support to show the user the concepts of your language. However, in this state users still have to explore the language on their own. With good documentation, you can already guide users toward writing text queries with your own defined grammar. There are some features left that enhance the experience of your editor even more.

Error Handling

Nothing is more frustrating than writing invalid queries without anything telling you what is wrong. Monaco has an interface to set so-called markers in the editor. Markers are annotations attached to specific text ranges, such as error messages or warnings. Setting markers is pretty straightforward:

// editor is what you get back calling monaco.editor.create(...)
const model = editor.getModel();
if (model) {
  monaco.editor.setModelMarkers(model, 'owner', markers);
}

markers is an array of:

{
  severity: monaco.MarkerSeverity.Error,
  message: parseResult.message,
  startLineNumber: parseResult.line,
  startColumn: parseResult.column + 1,
  endLineNumber: parseResult.line,
  endColumn: parseResult.column + 4,
}

parseResult can be calculated using toTargetPredicate, described in the last blog post:

const parseResult = toTargetPredicate(text);
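The conversion from a parse error to the markers array can be sketched as a small pure function. The ParseError shape is an assumption derived from the fields used above (message, line, column), and toMarkers is a hypothetical helper. How a failed parse is detected depends on what toTargetPredicate returns, so that part is left out. The severity is passed in so that the sketch does not depend on Monaco being loaded.

```typescript
// Assumed shape of a failed parse, derived from the fields used above.
interface ParseError {
  message: string;
  line: number;
  column: number;
}

// Mirrors the marker fields used in this post.
interface SketchMarker {
  severity: number;
  message: string;
  startLineNumber: number;
  startColumn: number;
  endLineNumber: number;
  endColumn: number;
}

function toMarkers(error: ParseError | undefined, severity: number, length = 1): SketchMarker[] {
  if (!error) {
    return []; // setting an empty array removes earlier markers of the same owner
  }
  const startColumn = error.column + 1; // shift a 0-based column to Monaco's 1-based columns
  return [
    {
      severity,
      message: error.message,
      startLineNumber: error.line,
      startColumn,
      endLineNumber: error.line,
      endColumn: startColumn + Math.max(1, length), // underline at least one character
    },
  ];
}

// In the editor, assuming `error` holds the failed parse, if any:
// monaco.editor.setModelMarkers(model, 'owner', toMarkers(error, monaco.MarkerSeverity.Error));
```

Returning an empty array for a valid query matters: otherwise the marker of the last invalid input would stay visible.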

Auto-completion

Auto-completion is a bigger beast to tame. The basic boilerplate looks like this:

const disposable = monaco.languages.registerCompletionItemProvider("MY_LANGUAGE", {
  triggerCharacters: ['!', '~', '.', '"', '=', '('],
  provideCompletionItems: (model, position) => {

    // Caution: Monaco line numbers and columns start at 1, not 0.
    const { column, lineNumber } = position;
    const line = model.getValueInRange({
      startLineNumber: lineNumber,
      endLineNumber: lineNumber,
      startColumn: 1,
      endColumn: model.getLineMaxColumn(lineNumber),
    });
    // your work goes here
  }
});
// Sometime later, when your UI component gets unmounted
disposable.dispose();

This is the raw skeleton you need to tell Monaco that you want to provide your own suggestions. Everything else is already covered. The suggestion popup will automatically open when one of the triggerCharacters is entered. You also have the option to open the popup using the Ctrl+Space shortcut.

The fiddly part is to find out where users are with their cursors and what kinds of suggestions you actually want to provide. Since this topic is very subjective, let's start with always giving back the same static set of suggestions:

const range = {
  startLineNumber: lineNumber,
  endLineNumber: lineNumber,
  startColumn: column,
  endColumn: column,
};
return {
  suggestions: [
    {
      label: "pretty printed label",
      insertText: "this gets into your editor when applying",
      kind: monaco.languages.CompletionItemKind.Keyword,
      range,
    }
  ]
}

As you can already guess, you have full control over what is rendered in the popup, what is inserted in your editor, and the range where the suggestion is inserted. There are even more advanced options. A look at the documentation is also recommended here.
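As one possible next step beyond a static list, the partial word left of the cursor can be used to narrow the suggestions and to compute the range that gets replaced. This is only a sketch: wordBeforeCursor, suggestionsFor, and the candidates list are hypothetical, and the definition of a word (letters, digits, underscore, dot) is an assumption that may not match your grammar.

```typescript
interface SketchRange {
  startLineNumber: number;
  endLineNumber: number;
  startColumn: number;
  endColumn: number;
}

// Returns the partial word directly left of the cursor and the range it occupies.
// `column` is Monaco's 1-based cursor column, so the text left of the cursor
// is line.slice(0, column - 1).
function wordBeforeCursor(line: string, lineNumber: number, column: number): { word: string; range: SketchRange } {
  const textBefore = line.slice(0, column - 1);
  const match = /[A-Za-z0-9_.]*$/.exec(textBefore);
  const word = match ? match[0] : '';
  return {
    word,
    range: {
      startLineNumber: lineNumber,
      endLineNumber: lineNumber,
      startColumn: column - word.length,
      endColumn: column,
    },
  };
}

// Hypothetical field names a query could refer to.
const candidates = ['name', 'namespace', 'id'];

function suggestionsFor(line: string, lineNumber: number, column: number) {
  const { word, range } = wordBeforeCursor(line, lineNumber, column);
  const prefix = word.toLowerCase();
  return candidates
    .filter((candidate) => candidate.toLowerCase().indexOf(prefix) === 0)
    // For Monaco, each item additionally needs a `kind`, as in the static example.
    .map((candidate) => ({ label: candidate, insertText: candidate, range }));
}

console.log(suggestionsFor('id == 1 and na', 1, 15).map((item) => item.label)); // ['name', 'namespace']
```

Because the range covers the partial word, accepting a suggestion replaces what was already typed instead of appending to it.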