@swarn
Last active July 19, 2024 17:09
Using semantic highlighting in neovim

Semantic Highlighting in Neovim

What is Semantic Highlighting?

And how is it different from treesitter highlighting? Here's a small example:

[Image: treesitter and LSP highlights]

In C++, treesitter will highlight member variable declarations with @property and names in the parameter list as @parameter. But when they are used inside the function body, treesitter can't tell the difference between them, so they are all just blue @variable identifiers. Semantic highlighting uses an LSP server (clangd, in this case) to show more accurate highlights.

Being able to tell the difference with a glance is useful:

[Image: a possible error]

You know immediately — without seeing any other code — that something strange is going on with z. Maybe it's just poorly named, or maybe it's shadowing another variable.

Semantic highlighting can do much more. Here's another C++ example that highlights functions and variables by scope:

[Image: highlighting scopes]

Seeing variable scope at a glance is so useful that many C++ projects use conventions like "prefix member variables with m_." But there isn't a universal convention, and even if there were, people would make mistakes. If you use semantic highlighting, you can simply assign a specific color to member variables.

Highlighting variables by scope is only one option! Instead, you could choose to highlight mutable variables, or async functions, or anything else that an LSP server tells you about your code. You probably care about different properties for each language you write in.

Treesitter and semantic highlighting work great together! Treesitter is a fast, in-process parser. It understands the structure of your code, and it will always handle most of the highlighting. An LSP server can add more highlights, or more accurate ones, for some parts of your code, but it is a slower, separate process.

Default Highlighting

Tokens to Highlights

An LSP server that supports semantic highlighting sends "tokens" to the LSP client. A token is data that describes a piece of text. Each token has a type, and zero or more modifiers.

For this C++ code:

//        Let's look at this token ↓
int function(int const p) { return p; }

The LSP server tells us that p has a token with type parameter and two modifiers: readonly and functionScope. The default highlighting will apply five highlights to p:

  • @lsp.type.parameter.cpp
  • @lsp.mod.readonly.cpp
  • @lsp.mod.functionScope.cpp
  • @lsp.typemod.parameter.readonly.cpp
  • @lsp.typemod.parameter.functionScope.cpp

In general, it applies:

  • one @lsp.type.<type>.<ft> highlight for each token
  • one @lsp.mod.<mod>.<ft> highlight for each modifier of each token
  • one @lsp.typemod.<type>.<mod>.<ft> highlight for each modifier of each token
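
As a rough illustration of that naming scheme, here is a plain-Lua sketch that builds the group names for a single token. The semantic_groups helper is hypothetical and not part of Neovim's API; it only mirrors the pattern in the list above.

```lua
-- Hypothetical helper (not a Neovim function): list the highlight groups
-- that the default semantic highlighting would apply for one token.
local function semantic_groups(token_type, modifiers, ft)
  -- One group for the token type itself.
  local groups = { ("@lsp.type.%s.%s"):format(token_type, ft) }
  -- One group per modifier.
  for _, mod in ipairs(modifiers) do
    groups[#groups + 1] = ("@lsp.mod.%s.%s"):format(mod, ft)
  end
  -- One group per (type, modifier) combination.
  for _, mod in ipairs(modifiers) do
    groups[#groups + 1] = ("@lsp.typemod.%s.%s.%s"):format(token_type, mod, ft)
  end
  return groups
end

-- The token for `p` in the C++ example above.
for _, group in ipairs(semantic_groups("parameter", { "readonly", "functionScope" }, "cpp")) do
  print(group)
end
```

For the p token this prints the same five group names listed earlier.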

You can use the :Inspect command to see what semantic highlights are being applied to your code.
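
If you'd rather get that information from Lua, vim.inspect_pos() returns the data behind :Inspect. The following is a minimal sketch, assuming Neovim 0.9 or later and that each semantic token entry exposes its group and priority under opts. The SemanticGroups command name is made up for this example.

```lua
-- Sketch: print the semantic highlight groups under the cursor.
-- "SemanticGroups" is a hypothetical command name.
vim.api.nvim_create_user_command("SemanticGroups", function()
  -- With no arguments, vim.inspect_pos() inspects the cursor position
  -- in the current buffer.
  local info = vim.inspect_pos()
  local lines = {}
  for _, token in ipairs(info.semantic_tokens or {}) do
    lines[#lines + 1] = ("%s (priority %d)"):format(
      token.opts.hl_group, token.opts.priority)
  end
  if #lines == 0 then
    lines = { "no semantic highlights here" }
  end
  print(table.concat(lines, "\n"))
end, {})
```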

Changing Highlights

Most of these highlight groups will be undefined, so they won't change the appearance of your code. To make parameters purple:

hi @lsp.type.parameter guifg=Purple

Or, with the equivalent Lua:

vim.api.nvim_set_hl(0, '@lsp.type.parameter', { fg='Purple' })

Just like treesitter highlights, if there is no specific-to-C++ @lsp.type.parameter.cpp group, it will fall back to the @lsp.type.parameter group.

Then, if you want everything which is read-only to be italic:

hi @lsp.mod.readonly gui=italic

If you only want parameters which are read-only to be italic:

hi @lsp.typemod.parameter.readonly gui=italic

To make sure your changes persist after changing colorschemes, wrap them in an autocommand that will reapply them after each colorscheme change:

vim.api.nvim_create_autocmd('ColorScheme', {
  callback = function ()
    vim.api.nvim_set_hl(0, '@lsp.type.parameter', { fg='Purple' })
    vim.api.nvim_set_hl(0, '@lsp.mod.readonly', { italic=true })
  end
})

Be careful to create the autocommand before calling :colorscheme in your init.
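
A sketch of that ordering in an init.lua follows. The augroup name is arbitrary, and "default" stands in for whatever colorscheme you actually use.

```lua
-- 1. Register the autocommand first. The group name is an arbitrary choice;
--    clear = true avoids duplicates if the file is sourced again.
local group = vim.api.nvim_create_augroup("MySemanticHighlights", { clear = true })
vim.api.nvim_create_autocmd("ColorScheme", {
  group = group,
  callback = function()
    vim.api.nvim_set_hl(0, "@lsp.type.parameter", { fg = "Purple" })
    vim.api.nvim_set_hl(0, "@lsp.mod.readonly", { italic = true })
  end,
})

-- 2. Only then load the colorscheme, which fires the ColorScheme event
--    and runs the callback above.
vim.cmd.colorscheme("default") -- stand-in; use your own colorscheme name
```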

The C++ scopes example above can be created with a handful of highlights:

hi @lsp.type.class      guifg=Aqua
hi @lsp.type.function   guifg=Yellow
hi @lsp.type.method     guifg=Green
hi @lsp.type.parameter  guifg=Purple
hi @lsp.type.variable   guifg=Blue
hi @lsp.type.property   guifg=Green

hi @lsp.typemod.function.classScope  guifg=Orange
hi @lsp.typemod.variable.classScope  guifg=Orange
hi @lsp.typemod.variable.fileScope   guifg=Orange
hi @lsp.typemod.variable.globalScope guifg=Red

You probably want to use nicer colors than these!

If your colorscheme doesn't define @lsp.* groups yet, but it does define treesitter highlights, you might find it useful to link the semantic groups to the treesitter groups to get consistent colors:

local links = {
  ['@lsp.type.namespace'] = '@namespace',
  ['@lsp.type.type'] = '@type',
  ['@lsp.type.class'] = '@type',
  ['@lsp.type.enum'] = '@type',
  ['@lsp.type.interface'] = '@type',
  ['@lsp.type.struct'] = '@structure',
  ['@lsp.type.parameter'] = '@parameter',
  ['@lsp.type.variable'] = '@variable',
  ['@lsp.type.property'] = '@property',
  ['@lsp.type.enumMember'] = '@constant',
  ['@lsp.type.function'] = '@function',
  ['@lsp.type.method'] = '@method',
  ['@lsp.type.macro'] = '@macro',
  ['@lsp.type.decorator'] = '@function',
}
for newgroup, oldgroup in pairs(links) do
  vim.api.nvim_set_hl(0, newgroup, { link = oldgroup, default = true })
end

Disabling Highlights

You can disable semantic highlighting by clearing the semantic highlighting groups.

For example, maybe you don't like the semantic highlighting of functions in Lua. Disable it with:

vim.api.nvim_set_hl(0, '@lsp.type.function.lua', {})

Or, you can disable all semantic highlights by clearing all the groups:

for _, group in ipairs(vim.fn.getcompletion("@lsp", "highlight")) do
  vim.api.nvim_set_hl(0, group, {})
end
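
Clearing groups only hides the highlights. A different approach, not covered above, is to remove the semantic tokens capability from a client when it attaches, so that Neovim does not request tokens from that server at all. Here is a minimal sketch; "lua_ls" is only an example server name.

```lua
-- Sketch: opt out of semantic tokens for one server.
vim.api.nvim_create_autocmd("LspAttach", {
  callback = function(args)
    local client = vim.lsp.get_client_by_id(args.data.client_id)
    -- "lua_ls" is an example; substitute the server you want to silence.
    if client and client.name == "lua_ls" then
      client.server_capabilities.semanticTokensProvider = nil
    end
  end,
})
```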

Using LspTokenUpdate for Complex Highlighting

You can apply custom highlights based on semantic tokens using the LspTokenUpdate event. This event is triggered every time a visible token is updated. You can write code to inspect the token, then apply a highlight with the vim.lsp.semantic_tokens.highlight_token function. Here are a few examples:

Highlighting Based on More Than One Modifier

What if I want all global variables that aren't read-only to get a special highlight? I can check the modifiers for the semantic tokens and use whatever logic I want:

vim.api.nvim_create_autocmd("LspTokenUpdate", {
  callback = function(args)
    local token = args.data.token
    if
      token.type == "variable"
      and token.modifiers.globalScope
      and not token.modifiers.readonly
    then
      vim.lsp.semantic_tokens.highlight_token(
        token, args.buf, args.data.client_id, "MyMutableGlobalHL")
    end
  end,
})

vim.api.nvim_set_hl(0, 'MyMutableGlobalHL', { fg = 'red' })

By default, highlight_token applies its highlight at a higher priority than the standard semantic highlights, so MyMutableGlobalHL wins wherever the two conflict.
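
If you need to control the priority yourself, highlight_token takes an optional table as its last argument. This is a sketch of the same autocommand with an explicit priority; the offset of 10 is an arbitrary choice for illustration.

```lua
vim.api.nvim_create_autocmd("LspTokenUpdate", {
  callback = function(args)
    local token = args.data.token
    if
      token.type == "variable"
      and token.modifiers.globalScope
      and not token.modifiers.readonly
    then
      vim.lsp.semantic_tokens.highlight_token(
        token, args.buf, args.data.client_id, "MyMutableGlobalHL",
        -- Arbitrary offset above the base semantic token priority.
        { priority = vim.highlight.priorities.semantic_tokens + 10 })
    end
  end,
})
```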

Dealing with Ambiguity

Imagine I have these highlights:

hi @lsp.typemod.variable.globalScope     guifg=Red
hi @lsp.typemod.variable.defaultLibrary  guifg=Green

And I have the following C++:

std::cout << "Hello";

The semantic highlights applied to cout will be:

  • @lsp.type.variable.cpp, priority: 125
  • @lsp.mod.defaultLibrary.cpp, priority: 126
  • @lsp.mod.globalScope.cpp, priority: 126
  • @lsp.typemod.variable.defaultLibrary.cpp, priority: 127
  • @lsp.typemod.variable.globalScope.cpp, priority: 127

There are two different highlights (the last two) with the same priority and different colors. Because of that, there's no way to tell whether cout will be red or green.

One way to fix that is to make sure you use composable highlights. If globalScope is red and defaultLibrary is underlined, then cout will be both red and underlined.
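
For instance, a composable version of the two highlights might look like this sketch, where one group sets only a color and the other sets only an underline:

```lua
-- The two groups set different attributes, so they combine instead of
-- competing when both apply to the same token.
vim.api.nvim_set_hl(0, "@lsp.typemod.variable.globalScope", { fg = "Red" })
vim.api.nvim_set_hl(0, "@lsp.typemod.variable.defaultLibrary", { underline = true })
```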

Another alternative is to use a callback to apply the highlight you want at a higher priority:

vim.api.nvim_create_autocmd("LspTokenUpdate", {
  callback = function(args)
    local token = args.data.token
    if token.type == "variable" and token.modifiers.defaultLibrary then
      vim.lsp.semantic_tokens.highlight_token(
        token, args.buf, args.data.client_id, "@lsp.mod.defaultLibrary")
    end
  end,
})

Complex Highlighting

You can write highlighting logic that uses more than just the token type and modifiers. Here's an example that highlights variable names written in ALL_CAPS that aren't constant:

local function show_unconst_caps(args)
  local token = args.data.token
  if token.type ~= "variable" or token.modifiers.readonly then return end

  local text = vim.api.nvim_buf_get_text(
    args.buf, token.line, token.start_col, token.line, token.end_col, {})[1]
  if text ~= string.upper(text) then return end

  vim.lsp.semantic_tokens.highlight_token(
    token, args.buf, args.data.client_id, "Error")
end

vim.api.nvim_create_autocmd("LspTokenUpdate", {
  callback = show_unconst_caps,
})
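
Note that the text ~= string.upper(text) test also accepts names that contain no letters at all, such as _. If that matters to you, a stricter check is possible with a Lua pattern. This is a plain-Lua sketch; is_all_caps is a hypothetical helper, and the pattern only considers ASCII letters.

```lua
-- Hypothetical stricter check: the name must start with an uppercase ASCII
-- letter and contain only uppercase letters, digits, and underscores.
local function is_all_caps(text)
  return text:match("^[A-Z][A-Z0-9_]*$") ~= nil
end

print(is_all_caps("MAX_SIZE")) -- true
print(is_all_caps("_"))        -- false
print(is_all_caps("x2"))       -- false
```

Inside show_unconst_caps, the string.upper comparison could then be replaced with `if not is_all_caps(text) then return end`.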

Controlling When Highlights are Applied

The previous example, which highlighted mutable variables, only makes sense for languages that have some way of marking variables as read-only, like const in C++ and TypeScript. In languages like Lua or Python, which have no such marker, that highlight won't work correctly.

Thankfully, there are many ways to control how the highlights are applied:

  • :h autocmd-pattern explains how you can filter autocommands based on file name:

    vim.api.nvim_create_autocmd("LspTokenUpdate", {
      pattern = {"*.cpp", "*.hpp"},
      callback = show_unconst_caps,
    })
  • :h LspTokenUpdate tells you that the client_id is in the args, so you can just return early if it's not an LSP server you want to highlight:

    local function show_unconst_caps(args)
      local client = vim.lsp.get_client_by_id(args.data.client_id)
      if client.name ~= "clangd" then return end
    
      local token = args.data.token
      -- etc
    end
  • You can create buffer-local autocommands (:h autocmd-buflocal) whenever an LSP client attaches to a buffer:

    require('lspconfig').clangd.setup {
      on_attach = function(client, buffer)
        vim.api.nvim_create_autocmd("LspTokenUpdate", {
          buffer = buffer,
          callback = show_unconst_caps,
        })
    
        -- other on_attach logic
      end
    }
    
  • You can also create buffer-local autocommands inside an :h LspAttach event callback:

    vim.api.nvim_create_autocmd("LspAttach", {
      callback = function(args)
        local client = vim.lsp.get_client_by_id(args.data.client_id)
        if client.name ~= "clangd" then return end
    
        vim.api.nvim_create_autocmd("LspTokenUpdate", {
          buffer = args.buf,
          callback = show_unconst_caps,
        })
      end
    })

@carschandler

  1. I prefer to lower the priority because in the case that treesitter doesn't pick something up, we can fall back to semantic. Granted, I highly doubt I'd come across a case where this occurs, but why not have the option rather than completely disabling? Also the config is cleaner than having to iterate over each item and disabling it.
  2. When I :Inspect the text in p.text, for example, the only semantic highlight is @lsp.type.variable.python. Granted, this may be a shortcoming of the LSP.
  3. I'm trying out basedpyright, which does implement semantic highlighting, which explains why I never had this issue before, so I guess it's an issue I should open on their page?

@swarn (author) commented Jul 3, 2024

That all makes sense to me! It seems reasonable to submit that as an issue at basedpyright.

I have an alternative solution which would work better for me, because I wouldn't want to change the highlight priorities for all LSP servers, just the misbehaving one. If that server is simply applying variable to too many identifiers, then I'd disable just that highlight:

vim.api.nvim_set_hl(0, '@lsp.type.variable.python', {})

If I didn't like that server's highlighting in general, I'd probably just disable semantic highlighting for it, and check to see where their development is in another six months or so.

It would be neat if the semantic token highlighting priority could be set as a buffer option in neovim. If you submit that as an issue there, I suspect people will yell at you :).

@carschandler

Yep that's definitely fair and probably the better option. Thanks for the alternatives :)
