Language injection comments & Tree-sitter directives #10747
lemontheme asked this question in Q&A (unanswered)
I'm trying to formulate a generic Tree-sitter injection query that mimics PyCharm's language injection comments. These are Python comments with the following form:
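As a rough illustration of that form, here is a minimal sketch assuming PyCharm's `# language=<language_id>` comment syntax. The ID `jinja2`, the pattern `INJECTION_COMMENT`, and the helper `injected_language` are hypothetical names chosen for this example, not something taken from PyCharm or Helix.

```python
import re
from typing import Optional

# Hypothetical pattern for a PyCharm-style injection comment such as
# "# language=jinja2"; the language ID is captured in the group "lang".
INJECTION_COMMENT = re.compile(r"^#\s*language=(?P<lang>[\w.+-]+)\s*$")

# language=jinja2
a = "Hello, {{ name }}!"


def injected_language(comment: str) -> Optional[str]:
    """Return the language named by an injection comment, or None otherwise."""
    match = INJECTION_COMMENT.match(comment.strip())
    return match.group("lang") if match else None
```

Calling `injected_language("# language=jinja2")` extracts the language ID from the comment text, which is the piece of information a generic injection query would somehow need to forward to the injected language.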
When located directly above a string literal, they specify the language that is to be injected into the string. For example, if such a comment sits directly above an assignment to the variable `a`, the string literal assigned to `a` will be highlighted as a Jinja2 template.

Suppose I only care about being able to inject the Jinja language. Then this Tree-sitter query suffices:
But there are several other languages I'd like to be able to inject, such as HTML, XML, regex, and SQL. The most obvious approach would be to duplicate the query above for each language, replacing 'jinja' with the corresponding language name.
In maintainability terms, though, that doesn't spark joy. Unless I generate my queries from a template, I'd be duplicating a lot of query code that might need minor tweaks later on. Also, if in theory I wanted to make all languages injectable, then I'd have hundreds of injection queries. (I haven't tested if this would have an impact on performance.)
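For reference, the "generate my queries from a template" fallback could look roughly like the sketch below. Everything in it is an assumption made for illustration: the query shape (node names as in the tree-sitter-python grammar, untested against Helix), the `LANGUAGES` mapping from comment IDs to injected language names, and the helper `render_injection_queries`.

```python
from string import Template
from typing import Mapping

# Hypothetical per-language injection query. "$$" renders a literal "$",
# and "\\s" is kept doubled because the regex sits inside a query string.
QUERY_TEMPLATE = Template(r"""
((comment) @_comment
 .
 (expression_statement
   (assignment right: (string) @injection.content))
 (#match? @_comment "^#\\s*language=${comment_id}\\s*$$")
 (#set! injection.language "${injected_name}"))
""")

# Assumed mapping: ID written in the comment -> language name to inject.
# IDs are expected to be plain word characters (no regex metacharacters).
LANGUAGES = {
    "jinja2": "jinja",
    "html": "html",
    "xml": "xml",
    "regex": "regex",
    "sql": "sql",
}


def render_injection_queries(languages: Mapping[str, str]) -> str:
    """Render one injection query per language, separated by blank lines."""
    return "\n\n".join(
        QUERY_TEMPLATE.substitute(comment_id=cid, injected_name=name).strip()
        for cid, name in languages.items()
    )


if __name__ == "__main__":
    print(render_injection_queries(LANGUAGES))
```

This keeps a single source of truth for the query shape, but it still emits one query per language, so it does not address the concern about ending up with hundreds of injection queries.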
Instead, I'd much rather be able to express my goal using a single generic query. This is where I'm stuck.
Ideally, instead of this...
... I could do something like this...
... and in `#set! injection.language` refer back to the language name captured by the regex.

However, I can scrap this idea almost immediately, since backreferences to capture groups appear nowhere in the Tree-sitter docs. On to the next best thing.
As always, let's have a look at how Neovim does it. I've come across several discussion threads where the solution seems to be to use the directive `offset!`, which lets one capture a substring of a node's content, e.g. the name of the language specified in the language injection comment.

With `offset!`, my language-specific query can be made generic:

Unfortunately, I can find no reference to this directive in the Tree-sitter Rust bindings or in the Helix codebase. It would appear that it, along with several others, is specific to Neovim.
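To make the effect of `offset!` concrete, here is a toy model of it in Python. It assumes the directive's four arguments are deltas added to the captured node's start row, start column, end row, and end column. The `Range` type and both helpers are hypothetical and are not Neovim's implementation.

```python
from typing import List, NamedTuple


class Range(NamedTuple):
    start_row: int
    start_col: int
    end_row: int
    end_col: int


def offset(node: Range, d_start_row: int, d_start_col: int,
           d_end_row: int, d_end_col: int) -> Range:
    """Shift each edge of a node's range by the given deltas."""
    return Range(
        node.start_row + d_start_row,
        node.start_col + d_start_col,
        node.end_row + d_end_row,
        node.end_col + d_end_col,
    )


def text_in_range(lines: List[str], r: Range) -> str:
    """Return the text covered by a single-line range."""
    if r.start_row != r.end_row:
        raise ValueError("multi-line ranges are out of scope for this sketch")
    return lines[r.start_row][r.start_col:r.end_col]


source = ["# language=jinja2", 'a = "Hello, {{ name }}!"']

# Range of the whole comment node on the first line.
comment = Range(0, 0, 0, len(source[0]))

# Skip the fixed "# language=" prefix so only the language name remains.
language = offset(comment, 0, len("# language="), 0, 0)
language_name = text_in_range(source, language)
```

Note that a fixed column offset only works when the comment prefix has a fixed length, so a comment with extra whitespace would need a stricter match or a different mechanism.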
My questions: