LSP: Writing a Language Server in Bash

October 23, 2023 · 7 min read

Jeffrey Chupp

Prefab Founding Engineer. Three-time dad. Polyglot. I am a pleaser. He/him.

Implementing a language server is so easy that we're going to do it in Bash.

You really shouldn't write your language server in Bash, but we'll show that you could.

The minimum viable language server needs to

receive JSON-RPC requests from the client
respond with JSON-RPC to client requests

Like a good bash program, we'll talk over stdin and stdout.

Mad scientist holding a bash language server

JSON-RPC

JSON-RPC is simply a protocol for communication over JSON.

A request message includes an id, params, and the method to be invoked.
A response message includes the id from the request and either a result or error payload.
LSP adds one requirement on top: a header giving the Content-Length of the message body in bytes, followed by a blank line.

An example request might look like

Content-Length: 226\r\n
\r\n
{"jsonrpc":"2.0","method":"initialize","id":1,"params":{"trace":"off","processId":2729,"capabilities":[],"workspaceFolders":null,"rootUri":null,"rootPath":null,"clientInfo":{"version":"0.10.0-dev+Homebrew","name":"Neovim"}}}

An example response might look like

Content-Length: 114\r\n
\r\n
{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "capabilities": {
      "completionProvider": {}
    }
  }
}

For our language server in bash, we'll write the following function:

respond() {
  local body="$1"
  local length=${#body}
  local response="Content-Length: $length\r\n\r\n$body"

  echo -e "$response"
}

This takes a JSON string as an argument, prefixes it with the Content-Length header, and echoes the result. The -e makes echo turn the \r\n escapes into real carriage returns and line feeds.

Listening for messages and parsing them

Our language server will listen for messages on stdin and write messages on stdout.

Let's name the bash script /tmp/bash-ls and chmod +x it.

I'll connect it to my editor, Neovim, using

vim.lsp.start {
    name = "Bash LS",
    cmd = {"/tmp/bash-ls"},
    capabilities = vim.lsp.protocol.make_client_capabilities(),
}

Now, we'll work on our read/print loop.

We'll start with the Bash classic

while IFS= read -r line; do

This gives us a value for $line that looks like Content-Length: 3386

The content length will vary based on the capabilities of your editor, but the gist here is that we need to read 3386 bytes to get the entire JSON payload.

Let's extract the content length number

while IFS= read -r line; do
  # Capture the content-length header value
  [[ "$line" =~ ^Content-Length:\ ([0-9]+) ]]
  length="${BASH_REMATCH[1]}"

Reading that line consumed the header, but the blank \r\n line that separates the header from the body is still waiting on stdin. We need to add 2 to the number to account for it, so we'll set length=$((length + 2)).
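To see why those two extra bytes matter, here's a small experiment you can run on its own. It's only a sketch: read_payload is a hypothetical helper, and the tiny message is made up. It reads the header the same way our loop does, then reads either exactly Content-Length bytes or two more.

```shell
# A made-up 8-byte body, framed the way an LSP client would frame it.
body='{"id":1}'
message=$(printf 'Content-Length: %d\r\n\r\n%s' "${#body}" "$body")

# Hypothetical helper: read the header line, then read length + $1 bytes.
read_payload() {
  local extra="$1" line length
  IFS= read -r line
  [[ "$line" =~ ^Content-Length:\ ([0-9]+) ]] || return 1
  length="${BASH_REMATCH[1]}"
  # Drop \r and \n so only the JSON characters are printed.
  head -c "$((length + extra))" | tr -d '\r\n'
}

printf '%s' "$message" | read_payload 0; echo   # prints {"id":   (two bytes short)
printf '%s' "$message" | read_payload 2; echo   # prints {"id":1}
```

Without the + 2, the separator line eats two bytes of our budget and the JSON comes back truncated.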

Now we're ready to read the JSON payload:

while IFS= read -r line; do
  # Capture the content-length header value
  [[ "$line" =~ ^Content-Length:\ ([0-9]+) ]]
  length="${BASH_REMATCH[1]}"

  # account for the blank \r\n line between the header and the body
  length=$((length + 2))

  # Read the message based on the Content-Length value
  json_payload=$(head -c "$length")

Remember that JSON-RPC requires us to include the id of the request message in our response. We could write some convoluted JSON parsing in bash to extract the id, but we'll lean on jq instead.

while IFS= read -r line; do
  # Capture the content-length header value
  [[ "$line" =~ ^Content-Length:\ ([0-9]+) ]]
  length="${BASH_REMATCH[1]}"

  # account for the blank \r\n line between the header and the body
  length=$((length + 2))

  # Read the message based on the Content-Length value
  json_payload=$(head -c "$length")

  # We need -E here because jq fails on newline chars -- https://github.com/jqlang/jq/issues/1049
  id=$(echo -E "$json_payload" | jq -r '.id')

Now, we have everything we need to read and reply to our first message.
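One thing to keep in mind about jq -r: it prints raw strings, so it only round-trips cleanly when the id is a number, which is what Neovim sends. JSON-RPC also allows string ids. Here's a quick sketch of the difference, using a made-up request:

```shell
# A made-up request that uses a string id, which JSON-RPC allows.
request='{"jsonrpc":"2.0","id":"abc-1","method":"initialize","params":{}}'

# -r prints raw strings, so the quotes around a string id are lost.
raw_id=$(printf '%s' "$request" | jq -r '.id')
# Without -r (here with -c for compact output) the id stays valid JSON.
json_id=$(printf '%s' "$request" | jq -c '.id')

printf 'raw:  %s\njson: %s\n' "$raw_id" "$json_id"
# raw:  abc-1
# json: "abc-1"
```

If you ever pair this server with a client that uses string ids, extracting the id without -r would let you paste it into the response unchanged.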

The `initialize` method

The first message the client sends is an initialize request. It describes the client's capabilities to the server.

You can think of this message as saying, "Hey, language server, here are all the features I support!"

The server replies with, "Oh, hi, client. Given the things you support, here are the things I know how to handle."
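As a rough sketch of what "given the things you support" could mean in practice, a server might peek at a single client capability before answering. The helper name and the sample request below are made up for illustration; snippetSupport is the LSP client capability for snippet-style completion items.

```shell
# Hypothetical helper: reads an initialize request on stdin and prints
# true or false depending on whether the client advertises snippet support.
client_supports_snippets() {
  jq -r '.params.capabilities.textDocument.completion.completionItem.snippetSupport // false'
}

# A made-up, heavily trimmed initialize request.
request='{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"capabilities":{"textDocument":{"completion":{"completionItem":{"snippetSupport":true}}}}}}'

printf '%s' "$request" | client_supports_snippets
# true
```

The // false fallback covers clients that leave the capability out entirely.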

Well, that's how it should work, anyway. For our MVP here, we'll provide a canned response with an empty capabilities section.

while IFS= read -r line; do
  # Capture the content-length header value
  [[ "$line" =~ ^Content-Length:\ ([0-9]+) ]]
  length="${BASH_REMATCH[1]}"

  # account for the blank \r\n line between the header and the body
  length=$((length + 2))

  # Read the message based on the Content-Length value
  json_payload=$(head -c "$length")

  # We need -E here because jq fails on newline chars -- https://github.com/jqlang/jq/issues/1049
  id=$(echo -E "$json_payload" | jq -r '.id')
  method=$(echo -E "$json_payload" | jq -r '.method')

  case "$method" in
  'initialize')
    respond '{
          "jsonrpc": "2.0",
          "id": '"$id"',
          "result": {
            "capabilities": {}
          }
        }'
    ;;

  *) ;;
  esac
done

We pluck out the request's method and use a case statement to reply to the correct method. If we don't support the method, we don't respond to the client.

If we didn't use a case statement here and always replied with our canned message, we'd make it past initialization, but then the client would get confused when we respond to (e.g.) its request for text completions with an initialize result.
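Staying silent is fine for our MVP, but a stricter server would tell the client when it doesn't know a method. JSON-RPC reserves error code -32601 for "method not found", and notifications (messages without an id) never get a reply. Here's a sketch of how the fallback branch could build that error. unsupported_body is a hypothetical helper, and it assumes the method name contains nothing that needs JSON escaping.

```shell
# Hypothetical helper: prints an error body for an unsupported request,
# or nothing at all for a notification.
unsupported_body() {
  local id="$1" method="$2"
  # jq -r '.id' prints "null" when the message has no id, i.e. a notification.
  [[ "$id" == "null" ]] && return 0
  printf '{"jsonrpc":"2.0","id":%s,"error":{"code":-32601,"message":"Method not found: %s"}}' "$id" "$method"
}

# In the case statement, the fallback branch could then become:
#   *)
#     body=$(unsupported_body "$id" "$method")
#     [[ -n "$body" ]] && respond "$body"
#     ;;

unsupported_body 7 'textDocument/hover'; echo
```

We'll stick with the silent version for the rest of the post.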

That's all you need for a minimum viable language server built in Bash. It doesn't do anything besides the initialization handshake, but it works.
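You don't need an editor to poke at it, either. Here's a minimal sketch of a helper that frames a JSON body the way a client would, so you can pipe a request straight into the script. frame is a hypothetical name, and it assumes an ASCII-only body so that the character count equals the byte count.

```shell
# Hypothetical helper: wrap a JSON body in an LSP-style Content-Length frame.
# Assumes an ASCII-only body, so ${#body} is also the byte count.
frame() {
  local body="$1"
  printf 'Content-Length: %d\r\n\r\n%s' "${#body}" "$body"
}

# Example, assuming the script lives at /tmp/bash-ls as above:
#   frame '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{}}' | /tmp/bash-ls
```

If everything is wired up, that pipeline should print a Content-Length header followed by the initialize result, and then exit once stdin closes.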

Adding completion

A language server that doesn't do anything is no fun, so let's teach it how to respond to textDocument/completion to offer text completions.

First, we'll need to modify our capabilities in our initialize response to indicate that we support completion:

          "result": {
            "capabilities": {
              "completionProvider": {}
            }
          }

We'll start with hardcoded results to verify things work. This is as easy as adding a new condition to our case statement.

  'textDocument/completion')
    respond '{
          "jsonrpc": "2.0",
          "id": '"$id"',
          "result": {
            "isIncomplete": false,
            "items": [
              { "label": "a sample completion item" },
              { "label": "another sample completion item" }
            ]
          }
        }'
    ;;

That works as we hoped. Let's jazz it up a little by completing the first 1000 words from the dict file on macOS (your path may differ).
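The only new trick is turning a list of words into completion items, which the script does with jq. Here's that step on its own as a sketch, with three made-up words standing in for the dict file (words_to_items is a hypothetical name, and -c is only there to keep the output on one line):

```shell
# Hypothetical helper: read one word per line on stdin and emit a JSON
# array of completion items.
words_to_items() {
  # --raw-input --slurp reads all of stdin as one string. Splitting on
  # newlines leaves an empty string after the final newline, which [:-1] drops.
  jq -c --raw-input --slurp 'split("\n")[:-1] | map({ label: . })'
}

printf 'apple\nbanana\ncherry\n' | words_to_items
# [{"label":"apple"},{"label":"banana"},{"label":"cherry"}]
```

Because jq builds the JSON, words containing quotes or backslashes get escaped for us.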

Here's the final version of the script:

#!/bin/bash

respond() {
  local body="$1"
  local length=${#body}
  local response="Content-Length: $length\r\n\r\n$body"

  echo "$response" >>/tmp/out.log

  echo -e "$response"
}

completions=$(head -n 1000 /usr/share/dict/words | jq --raw-input --slurp 'split("\n")[:-1] | map({ label: . })')

while IFS= read -r line; do
  # Capture the content-length header value
  [[ "$line" =~ ^Content-Length:\ ([0-9]+) ]]
  length="${BASH_REMATCH[1]}"

  # account for the blank \r\n line between the header and the body
  length=$((length + 2))

  # Read the message based on the Content-Length value
  json_payload=$(head -c "$length")

  # We need -E here because jq fails on newline chars -- https://github.com/jqlang/jq/issues/1049
  id=$(echo -E "$json_payload" | jq -r '.id')
  method=$(echo -E "$json_payload" | jq -r '.method')

  case "$method" in
  'initialize')
    respond '{
          "jsonrpc": "2.0",
          "id": '"$id"',
          "result": {
            "capabilities": {
              "completionProvider": {}
            }
          }
        }'
    ;;

  'textDocument/completion')
    respond '{
          "jsonrpc": "2.0",
          "id": '"$id"',
          "result": {
            "isIncomplete": false,
            "items": '"$completions"'
          }
        }'
    ;;

  *) ;;
  esac
done

Lovely.

Closing

In 56 lines of bash, we've implemented a usable (if boring) language server.

I wouldn't advocate anyone writing a serious language server in bash. Hopefully, this has illustrated how easy it is to get started with language servers and has made the LSP and JSON-RPC a little less magical.

What language would I recommend for writing a language server? That's probably a whole article in itself, but the short answer is

All things being equal, choose TypeScript. The first-party libraries (e.g., vscode-languageserver-node) are written in TypeScript, and the community and ecosystem are excellent.
If you don't want to use TypeScript, use whatever language you're most productive in. There's probably already a library for writing a language server in your preferred language, but if there isn't, you now know how easy it is to write it yourself.
