How to write your own mode for GNU Emacs and publish it in MELPA

Some time ago I developed a GNU Emacs mode for editing the configuration files of the Embox operating system. Besides studying Emacs Lisp in depth, I had to understand how a mode module is structured, as well as the process and requirements for publishing packages to MELPA, the most popular package archive for GNU Emacs. In this guide, I’ll cover what you need to know to write your own mode and how to publish it as a package.

Introduction

Emacs has two kinds of modes: major and minor.

A major mode is the primary mode of a buffer, while minor modes provide features that the major mode lacks. Often these are features that apply to many different major modes.

For example, emacs-lisp-mode is responsible for editing files with Emacs Lisp code. This is a major mode. It knows how to highlight syntax, how to indent properly, and it even shows information about functions in the minibuffer. But it doesn’t provide handy features like code completion and on-the-fly code checking. That is the job of minor modes: for example, company for completion and flymake for checking.

In this article we will only consider developing a major mode, but the situation is much the same for minor modes.

Emacs distinguishes three kinds of major modes:

  • text (text-mode)

  • programming (prog-mode)

  • special (special-mode)

text-mode is designed for editing text files: plain (txt), formatted (org, md), and structured (html, json).

prog-mode is for programming languages.

special-mode is not tied to a specific file and is used to build applications, such as magit for working with Git repositories, dired for working with the file system, or tetris.

In this article, we’ll look at developing our own mode based on prog-mode. I chose prog-mode for Embox configuration files because their syntax is very similar to that of C-like languages (only without the semicolons).

The Mybuild config file looks something like this:

package arduino_due.examples

module blinking_led {
       depends embox.driver.gpio.sam3
       source "blinking_led.c"
}

General considerations and tasks

The first thing to decide when designing your mode is how far you are willing to go. The more experience you have with GNU Emacs and Emacs Lisp, the more likely it is that what you plan will actually get implemented. If you have little experience, you can limit yourself to syntax highlighting and, perhaps, correct indentation, which is much harder to do.

The second consideration is the choice of the base mode. As mentioned in the introduction, there are three kinds of base modes in Emacs. If you want to add support for a completely new markup language, base your mode on text-mode. For a completely new programming language with original syntax, take prog-mode. If you want to write an application, you need special-mode. The good news is that if your project is not very original, you can use a more advanced mode as a basis; for example, for languages with a C-like syntax you can use c-mode. The bad news is that the base mode is unlikely to work perfectly with your language, so you will have to write some of the functions from scratch.

If you are going to write a mode for a markup, configuration, or programming language, a grammar for it will be a great help. For a programming language, find out the complete list of keywords, the comment format, and the formatting conventions. This is the minimum you will need to implement.

Come up with a name for your mode. For languages, it is usually made up of the name of the language and the suffix -mode. In my case it is mybuild-mode. Applications can have any name; use a hyphen (-) as the word separator. All characters must be lowercase.

Next, let’s move on to the structure of the project.

Structure of the main mode module

The main module of the mode should be named the same as the mode itself, with the addition of the file name extension .el (which means Emacs Lisp).

Let’s go over the main points here; for the full text of the module, see my project’s repository.

Documentation

The mode module must begin with comments that follow a fairly strict format. Three things will help you check that you wrote everything correctly: 1) the hints of Emacs itself, 2) the checkdoc command, and 3) the MELPA requirements (more on that below).

The first line has three parts:

  1. module file name

  2. then three dashes and a very brief description of what your mode does

  3. Emacs directive to enable lexical binding

In my case, the first line looks like this:

;;; mybuild-mode.el --- Major mode for editing Mybuild files from Embox  -*- lexical-binding: t; -*-

If you are not familiar with Emacs Lisp, here is a brief explanation. Comment lines start with a semicolon (;). Additional semicolons mark headings of different levels. Most often, headings are marked with three semicolons, the documentation text itself with two, and a trailing comment on a line of code with one.

Early Lisp dialects used what is called dynamic binding: a free variable in a function body is resolved at call time, against the bindings established by whoever called the function. With lexical binding, such a name refers to the binding visible at the place in the source where the function is defined. The latter mechanism is far more common in other languages. In Emacs Lisp, dynamic binding is the historical default, but lexical binding can be enabled for an individual file. The MELPA maintainers expect lexical binding to be used always.
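The difference is easiest to see on a closure. The sketch below is illustrative only: the demo-* names are hypothetical, and it uses the optional second argument of eval to choose the binding discipline explicitly instead of relying on the file header.

```elisp
;; Hypothetical demo names; not part of mybuild-mode.
(defconst demo-counter-form
  '(let ((demo-count 0))
     (lambda () (setq demo-count (1+ demo-count))))
  "A form that returns a function incrementing a local counter.")

;; Evaluated with lexical binding, the lambda captures `demo-count'.
(defconst demo-lexical-counter (eval demo-counter-form t))

;; Evaluated with dynamic binding, `demo-count' exists only while the
;; `let' is running, so the returned function finds no binding later.
(defconst demo-dynamic-counter (eval demo-counter-form nil))

(defun demo-call-safely (fn)
  "Call FN and return its value, or the symbol `void' if a variable is unbound."
  (condition-case nil
      (funcall fn)
    (void-variable 'void)))
```

Calling (demo-call-safely demo-lexical-counter) repeatedly keeps counting, while (demo-call-safely demo-dynamic-counter) reports an unbound variable.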

The second line contains the copyright. For example:

;; Copyright (c) 2022 Evgeny Simonenko

Then you specify:

  • module author

    ;; Author: Evgeny Simonenko <easimonenko@gmail.com>
    
  • keywords

    ;; Keywords: languages
    
  • version

    ;; Version: 0.2.0
    
  • package dependencies

    ;; Package-Requires: ((emacs "24.3"))
    
  • date the mode was created

    ;; Created: August 2022
    
  • Home Page URL

    ;; URL: https://github.com/easimonenko/mybuild-mode
    
  • Link to the repository

    ;; Repository: https://github.com/easimonenko/mybuild-mode
    

Note that the correct dependencies can be determined automatically by calling package-lint (see below).
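As a quick sanity check, the built-in lisp-mnt library can read these header fields back. This is a minimal sketch that assumes the header is available as a string; demo-header and demo-header-field are hypothetical names used only for illustration.

```elisp
(require 'lisp-mnt)

(defconst demo-header
  (concat
   ";;; mybuild-mode.el --- Major mode for editing Mybuild files  -*- lexical-binding: t; -*-\n"
   "\n"
   ";; Author: Evgeny Simonenko <easimonenko@gmail.com>\n"
   ";; Keywords: languages\n"
   ";; Version: 0.2.0\n"
   ";; Package-Requires: ((emacs \"24.3\"))\n"
   ";; URL: https://github.com/easimonenko/mybuild-mode\n")
  "Shortened copy of the header comments, for illustration only.")

(defun demo-header-field (name)
  "Return the value of header field NAME in `demo-header', or nil."
  (with-temp-buffer
    (insert demo-header)
    ;; `lm-header' searches the current buffer for a `;; NAME: value' line.
    (lm-header name)))
```

For example, (demo-header-field "version") reads back the version string, and a field that is absent yields nil.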

Next, under the heading License you specify the license for the package code. Below is an example text for the GNU GPL v3 license:

;;; License:
;;
;; This program is free software; you can redistribute it and/or modify
;; it under the terms of the GNU General Public License as published by
;; the Free Software Foundation, either version 3 of the License, or
;; (at your option) any later version.
;;
;; This program is distributed in the hope that it will be useful,
;; but WITHOUT ANY WARRANTY; without even the implied warranty of
;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
;; GNU General Public License for more details.
;;
;; You should have received a copy of the GNU General Public License
;; along with this program.  If not, see <http://www.gnu.org/licenses/>.

Both GNU Emacs itself and the built-in packages are published under this license.

Under the heading Commentary we place a detailed description of the mode. Usually it briefly describes what the mode does, how to work with it, and how to configure it. I described my mode like this:

;;; Commentary:
;;
;; Major mode for editing Mybuild files from Embox operating system
;;
;; mybuild-mode supports:
;;
;; * syntax highlighting;
;; * proper indentations;
;; * autoload for Mybuild, *.my, mods.conf files.
;;
;; Customization
;; -------------
;;
;; You can set the width of the indentation by setting the customizable user
;; option variable mybuild-indent-offset from customization group mybuild.
;; By default, it is set to 2.

Code

Not bored yet? Let’s move on to the most interesting part! The heading Code precedes your module’s code.

;;;###autoload
(define-derived-mode mybuild-mode prog-mode "Mybuild"
  "Major mode for editing Mybuild files from Embox operating system."
  :syntax-table mybuild-mode-syntax-table
  (setq-local comment-start "// ")
  (setq-local comment-end "")
  (setq-local indent-tabs-mode nil)
  (setq-local indent-line-function 'mybuild-mode-indent-line)
  (setq-local font-lock-defaults '(mybuild-highlights)))

To declare our mode, we use the define-derived-mode macro. Macros and special forms in Lisp (and Scheme) look like function calls, but their arguments are not evaluated in the usual way, so they cannot be implemented as ordinary functions. A macro, in addition, generates code from the description it is given.

The ;;;###autoload directive tells Emacs to register the form it marks in the package’s autoloads file, so that it takes effect without loading the whole module and our mode is started when necessary. And when is it necessary? Here is when:

;;;###autoload
(add-to-list 'auto-mode-alist '("\\(?:/Mybuild\\|\\.my\\|/mods\\.conf\\)\\'" . mybuild-mode))

This form is always evaluated, so that files named Mybuild or mods.conf, as well as files with the .my extension, are associated with our mode.
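To check which file names the pattern actually covers, it can be tried on a few paths. A small sketch, with hypothetical demo-* names and made-up paths; matching is forced to be case-sensitive here only to keep the demo deterministic.

```elisp
(defconst demo-mybuild-file-regexp
  "\\(?:/Mybuild\\|\\.my\\|/mods\\.conf\\)\\'"
  "Copy of the regexp used in the `auto-mode-alist' entry.")

(defun demo-mybuild-file-p (path)
  "Return t if PATH is matched by `demo-mybuild-file-regexp'."
  (let ((case-fold-search nil))     ; compare case-sensitively in this demo
    ;; `\\'' anchors the match at the very end of PATH.
    (and (string-match-p demo-mybuild-file-regexp path) t)))
```

Paths such as "/src/Mybuild", "/src/demo.my" and "/conf/mods.conf" match, while "/src/main.c" or "/src/Mybuild.bak" do not.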

To define-derived-mode we pass the name of the mode, the parent mode (see the introduction), and the human-readable name of the mode. The next line gives a brief description of the mode (in principle, the same as in the first comment line). Next, we point to the variable describing the “syntax” of the language with the :syntax-table keyword. Then we set the main parameters of the mode.
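The effect of these arguments can be observed on a throwaway mode. The following sketch defines a hypothetical demo-conf-mode (not part of mybuild-mode) and inspects a buffer in which it is active.

```elisp
(define-derived-mode demo-conf-mode prog-mode "DemoConf"
  "Hypothetical major mode that only illustrates `define-derived-mode'."
  ;; The body runs each time the mode is turned on in a buffer.
  (setq-local comment-start "// ")
  (setq-local comment-end ""))

(defun demo-mode-summary ()
  "Activate `demo-conf-mode' in a scratch buffer and describe the result."
  (with-temp-buffer
    (demo-conf-mode)
    (list major-mode                        ; the mode symbol
          mode-name                         ; the human-readable name
          comment-start                     ; buffer-local value set above
          (and (derived-mode-p 'prog-mode) t))))
```

(demo-mode-summary) returns the mode symbol, its display name, the buffer-local comment-start, and t for the prog-mode ancestry.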

The syntax table for Mybuild config files is as follows:

(defvar mybuild-mode-syntax-table
  (let ((st (make-syntax-table)))
    (modify-syntax-entry ?@ "w" st)
    (modify-syntax-entry ?_ "w" st)
    (modify-syntax-entry ?\{ "(}" st)
    (modify-syntax-entry ?\} "){" st)
    (modify-syntax-entry ?\( "()" st)
    (modify-syntax-entry ?\) ")(" st)
    (modify-syntax-entry ?\/ ". 124b" st)
    (modify-syntax-entry ?* ". 23" st)
    (modify-syntax-entry ?\n "> b" st)
    st)
  "Syntax table for `mybuild-mode'.")

The special form defvar declares variables. Lisp, unlike Haskell and, in part, Scheme, is not a pure language, and it has variables. A variable declaration consists of a name, an expression whose result becomes the initial value, and a documentation string.

To create a syntax table, we call the function make-syntax-table. After that, we make entries in it by calling modify-syntax-entry. In our example, we describe which characters make up the names of “variables” (the w class), paired brackets (descriptors of the form (}), and the characters that open and close comments. (I agree if all this seems very convoluted. It took me a lot of time to figure it out and get the desired result.)
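One way to convince yourself that the comment entries do what you expect is to ask the Emacs parser directly. The sketch below uses a reduced copy of the table and two hypothetical helper functions; positions are ordinary 1-based buffer positions.

```elisp
(defconst demo-comment-syntax-table
  (let ((st (make-syntax-table)))
    (modify-syntax-entry ?_ "w" st)        ; underscore is part of a word
    (modify-syntax-entry ?\/ ". 124b" st)  ; `/' starts // and /*, ends */
    (modify-syntax-entry ?* ". 23" st)     ; `*' is the inner char of /* */
    (modify-syntax-entry ?\n "> b" st)     ; newline ends a // comment
    st)
  "Reduced copy of the Mybuild syntax table, for illustration only.")

(defun demo-in-comment-p (text pos)
  "Return t if buffer position POS in TEXT lies inside a comment."
  (with-temp-buffer
    (set-syntax-table demo-comment-syntax-table)
    (insert text)
    ;; Element 4 of the parser state is non-nil inside a comment.
    (and (nth 4 (parse-partial-sexp (point-min) pos)) t)))

(defun demo-word-char-p (char)
  "Return t if CHAR has word syntax in `demo-comment-syntax-table'."
  (with-syntax-table demo-comment-syntax-table
    (eq (char-syntax char) ?w)))
```

For instance, in the text "a // b" followed by a newline and "c", position 6 is inside the line comment, and the position after the newline is not.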

If we go back to the mode declaration, we can see that the comment description is also set here:

(setq-local comment-start "// ")
(setq-local comment-end "")

Here, the strings that begin and end line comments are set; block comments were described above, in the syntax table.

The setq-local macro sets the buffer-local value of a variable.

Before GNU Emacs 29, syntax highlighting had to be built on the built-in font-lock-mode. To activate it, we set the variable font-lock-defaults:

(setq-local font-lock-defaults '(mybuild-highlights))

'(mybuild-highlights) means that there is a variable describing how to highlight various code fragments, and we pass its name in a list.

;; The word lists used below must be defined earlier in the file.
(defvar mybuild-highlights
  `(("'''[^z-a]*?'''" . 'font-lock-string-face)       ; triple-quoted strings
    ("@[A-Za-z][A-Za-z0-9-+_]*" . 'font-lock-preprocessor-face) ; annotations
    ( ,(regexp-opt mybuild-keywords 'words) . 'font-lock-keyword-face)
    ( ,(regexp-opt mybuild-types 'words) . 'font-lock-type-face)
    ( ,(regexp-opt mybuild-constants 'words) . 'font-lock-constant-face)
    ("[A-Za-z][A-Za-z0-9-+/_]*" . 'font-lock-function-name-face)) ; other names
  "Mybuild syntax highlighting with `font-lock-mode'.")

This code is less convoluted than the syntax table, but it still needs some explanation. The backquote character ` builds a list much like an ordinary quote, except that elements preceded by a comma are evaluated. The dot . denotes a pair (a cons cell).

With the help of regular expressions, we describe what to treat as strings, keywords, constants, function names, and text for the preprocessor. Keep in mind that these are all conventions: behind the “text for the preprocessor” you can hide compiler directives, or indeed anything you like. For convenient construction of regular expressions, the comma operator (,) and the function regexp-opt are used. Together they turn a list of words into a regular expression that matches any of them. We define the word lists in the corresponding variables:

(defvar mybuild-keywords
  '("package" "import" "annotation" "interface" "extends" "feature" "module"
    "static" "abstract" "depends" "provides" "requires" "source" "object" "option"
    "configuration" "include"))

(defvar mybuild-types
  '("string" "number" "boolean"))

(defvar mybuild-constants
  '("true" "false"))
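To see what the comma and regexp-opt produce together, here is a small sketch with a shortened, hypothetical keyword list. The demo-* names are not part of mybuild-mode.

```elisp
(defconst demo-keywords '("module" "package" "depends")
  "Hypothetical, shortened keyword list.")

;; `regexp-opt' with the symbol `words' wraps the alternatives in
;; word-boundary markers, so only whole words match.
(defconst demo-keyword-regexp (regexp-opt demo-keywords 'words))

;; The comma inside the backquoted list inserts the computed regexp;
;; everything else in the list stays quoted as written.
(defconst demo-highlights
  `((,demo-keyword-regexp . 'font-lock-keyword-face)))

(defun demo-keyword-at-start-p (line)
  "Return t if LINE begins with one of `demo-keywords' as a whole word."
  (and (eql (string-match-p demo-keyword-regexp line) 0) t))
```

With these definitions, "module blinking_led" starts with a keyword, while "modules" does not, because the word boundary after "module" is missing.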

By the way, the recently released version 29 offers a more advanced way to write syntax-aware modes, based on tree-sitter grammars. You can read more about this in the article “Tree-sitter: an overview of the incremental parser”.

We continue to analyze the mode definition. The next line turns off the use of tab characters for indentation:

(setq-local indent-tabs-mode nil)

One of the most important and at the same time most complex functions of a mode is correct indentation. The function that will be called every time you press Enter or Tab is set in the variable indent-line-function:

(setq-local indent-line-function 'mybuild-mode-indent-line)

Since indentation rules vary greatly between languages, we will only consider the main points of this code:

(defun mybuild-mode-indent-line ()
  "Indent current line for `mybuild-mode'."
  (interactive)
  (let ((indent-col 0))
    (save-excursion
      (beginning-of-line)
      (condition-case nil
          (while t
            (backward-up-list 1)
            (when (looking-at "[{]")
              (setq indent-col (+ indent-col mybuild-indent-offset))))
        (error nil)))
    (save-excursion
      (back-to-indentation)
      (when (and (looking-at "[}]") (>= indent-col mybuild-indent-offset))
        (setq indent-col (- indent-col mybuild-indent-offset))))
    (indent-line-to indent-col)))

The defun macro declares a function. In our case it takes no arguments, so an empty pair of parentheses () appears in the definition. The interactive form tells Emacs that this function can be called interactively, which is useful for manual debugging.

Mybuild files follow the indentation convention of C-like languages, but life is complicated by the absence of semicolons, which is why we have to write our own function instead of using the one available in c-mode.

The save-excursion form remembers the current cursor position in the buffer and restores it afterwards. beginning-of-line puts the cursor at the beginning of the line. backward-up-list moves the cursor out of the enclosing pair of brackets, to its opening bracket, and looking-at then checks whether that bracket is an opening brace. If it is, the indentation amount indent-col is incremented by the value of the setting mybuild-indent-offset (more on mode settings below). The loop repeats until there is no enclosing bracket left and backward-up-list signals an error. Then we check whether the current line starts with a closing brace and, if so, reduce the indentation. Finally, indent-line-to indents the line by the calculated value indent-col. Put simply, each enclosing { block increases the indent, and a line that starts with } is indented one level less.
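The core of the algorithm, counting the enclosing braces with backward-up-list, can be tried in isolation. The sketch below is a simplified stand-in, not the author's function: demo-brace-depth and the sample text are hypothetical, and it only reports the nesting depth without indenting anything.

```elisp
(defconst demo-mybuild-text
  (concat "module a {\n"
          "source \"a.c\"\n"
          "option {\n"
          "depends b\n"
          "}\n"
          "}\n")
  "Hypothetical unindented Mybuild-like text.")

(defun demo-brace-depth (text line)
  "Return how many `{' blocks enclose the start of LINE (1-based) in TEXT."
  (with-temp-buffer
    (let ((st (make-syntax-table)))
      (modify-syntax-entry ?\{ "(}" st)   ; braces are paired brackets
      (modify-syntax-entry ?\} "){" st)
      (set-syntax-table st))
    (insert text)
    (goto-char (point-min))
    (forward-line (1- line))
    (let ((depth 0))
      ;; Climb out of enclosing brackets until the top level is reached,
      ;; where `backward-up-list' signals an error that ends the loop.
      (condition-case nil
          (while t
            (backward-up-list 1)
            (when (looking-at "[{]")
              (setq depth (1+ depth))))
        (error nil))
      depth)))
```

Multiplying the returned depth by the indentation offset gives the column for an ordinary line; a line that begins with } gets one level less, as in the function above.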

Only a little remains, just a couple of points: key bindings and mode settings.

Key bindings that let you call some of the mode’s functions are set up with a combination of calls to make-keymap and define-key:

(defvar mybuild-mode-map
  (let ((map (make-keymap)))
    (define-key map "\C-j" 'newline-and-indent)
    map)
  "Keymap for `mybuild-mode'.")

Here we specify that Ctrl-j calls newline-and-indent, which inserts a newline and indents the new line correctly.
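A binding can be verified without pressing any keys by looking it up in the keymap. A minimal sketch with a hypothetical demo-mode-map; it uses make-sparse-keymap because only one key is bound, though make-keymap would work just as well.

```elisp
(defconst demo-mode-map
  (let ((map (make-sparse-keymap)))
    ;; (kbd "C-j") produces the same key sequence as the string "\C-j".
    (define-key map (kbd "C-j") #'newline-and-indent)
    map)
  "Hypothetical keymap with a single binding, for illustration only.")

(defun demo-binding (keys)
  "Return the command bound to KEYS (in `kbd' notation) in `demo-mode-map'."
  (lookup-key demo-mode-map (kbd keys)))
```

(demo-binding "C-j") returns the symbol newline-and-indent, and an unbound key such as "C-k" yields nil.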

One of the most common settings for language modes is the indentation width. To create such a setting, we use the macros defgroup and defcustom. After that, the user can set the desired indent value through the editor’s customization interface.

(defgroup mybuild nil
  "Customization variables for Mybuild mode."
  :group 'languages
  :tag "Mybuild")

(defcustom mybuild-indent-offset 2
  "Indentation offset for `mybuild-mode'."
  :group 'mybuild
  :type 'integer
  :safe 'integerp)

With defgroup we created the settings group mybuild for our mode. In the interface, it is available under the name Mybuild. Then with defcustom we added to the group mybuild the setting mybuild-indent-offset, with a default value of 2 and an integer type; the :safe predicate integerp marks integer values as safe when the variable is set as a file-local variable.
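The role of the :safe predicate can be checked with safe-local-variable-p. The sketch below repeats the pattern with hypothetical demo-* names so that it does not depend on mybuild-mode being loaded.

```elisp
(defgroup demo-mybuild nil
  "Hypothetical customization group, for illustration only."
  :group 'languages)

(defcustom demo-indent-offset 2
  "Hypothetical indentation offset."
  :group 'demo-mybuild
  :type 'integer
  :safe #'integerp)

(defun demo-offset-safe-p (value)
  "Return t if VALUE may be used as a file-local `demo-indent-offset'."
  ;; Emacs consults the predicate given with :safe for this decision.
  (and (safe-local-variable-p 'demo-indent-offset value) t))
```

An integer such as 4 is accepted, while the string "4" is not. A user of the real mode could change the setting with M-x customize-group RET mybuild, or with (setq mybuild-indent-offset 4) in the init file.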

Finally, the module code ends with a comment like:

;;; mybuild-mode.el ends here

Well, the most difficult, and perhaps the most interesting, part is over. What remains is a bit of routine.

Package Structure

In addition to the main and auxiliary modules, the project directory should contain a few supporting files, first of all a README.

In the README, give a brief description of the mode and list its capabilities; you can also state what the mode does not do. Then give a full description of how to use and configure your mode. It is also a good idea to describe how the mode works and which other modes or programs it depends on. Finally, state the license of the package code and its authorship.

After publishing a package in MELPA, you can add badges with links to the package pages.

Since publishing to MELPA assumes that the package sources are located in a Git (or Mercurial) repository, you will also put a .gitignore file there.

My .gitignore looks like this:

# Compiled
*.elc

# Packaging
.cask

# Backup files
*~

# Undo-tree save-files
*.~undo-tree

The process of publishing a package in MELPA

Now we need to publish our mode to MELPA. This process is not complicated, but not trivial either. The main steps are:

  1. Log in to your GitHub account and fork the MELPA repository.

  2. Write a recipe file for your mode and add it to your MELPA fork.

  3. Check your package for compliance with the MELPA requirements.

  4. Submit a pull request with your mode’s recipe.

  5. Wait for a review from the MELPA maintainers.

  6. Make the required corrections and publish them.

  7. Leave a comment on your PR saying that you’re done.

  8. If all goes well, your package will appear in the MELPA archive.

  9. If you didn’t fix everything, or other errors were found, go back to point 6.

The recipe file is written in Emacs Lisp and must have the structure described in the MELPA README. In its simplest form, the recipe will look similar to mine:

(mybuild-mode :fetcher github :repo "easimonenko/mybuild-mode")

That is, first we write the name of the mode package, then we indicate where its repository is located. Besides GitHub, other hosts are supported, such as GitLab, as well as plain Git and Mercurial repositories. You may also need the :files option, which lets you list the files included in the package and exclude some of them with a nested :exclude option. You probably won’t need it.

The recipe file itself should be named after your package.
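Because a recipe is plain Lisp data, its parts can be inspected with ordinary list functions. The sketch below shows a hypothetical variant of the recipe with an explicit :files list; the "*-test.el" pattern is an assumed example and is not taken from the real mybuild-mode recipe.

```elisp
;; Hypothetical recipe variant, for illustration only.
(defconst demo-recipe
  '(mybuild-mode :fetcher github
                 :repo "easimonenko/mybuild-mode"
                 :files ("*.el" (:exclude "*-test.el"))))

(defun demo-recipe-property (recipe key)
  "Return the value of KEY in RECIPE, a list of the form (NAME . PLIST)."
  (plist-get (cdr recipe) key))

(defun demo-recipe-file-name (recipe)
  "Return the name the recipe file should have: the package name."
  (symbol-name (car recipe)))
```

Here (demo-recipe-file-name demo-recipe) gives "mybuild-mode", which is also the name the recipe file must carry.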

Once a package is published, MELPA keeps track of updates to its repository, and each time you push new commits, MELPA publishes a fresh version of the package. To publish a package in MELPA Stable, you must create at least one tag or release in the repository. Similarly, each time the next tag or release is published, a new version of the package is created.

When publishing a PR, you will need to fill in the data according to the proposed template and go through the list of requirements:

  1. The license must be free and compatible with the GNU GPL. My choice is GPL v3, i.e. the license under which GNU Emacs itself is distributed.

  2. You need to read the CONTRIBUTING document, which describes the specifics of publishing a new package.

  3. The package code must be checked with package-lint.

  4. Compile the Emacs Lisp code to bytecode with the command M-x byte-compile-file.

  5. Check the mode documentation with the command M-x checkdoc.

  6. Build locally and install the package.
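Items 4 and 5 can also be run from Lisp, which is convenient for scripting. The following is only a sketch under assumptions: demo-check-package-file is a hypothetical helper, and package-lint is a separate package that is not called here.

```elisp
(require 'bytecomp)
(require 'checkdoc)

(defun demo-check-package-file (file)
  "Byte-compile FILE and run checkdoc on it.
Return non-nil if byte compilation succeeded.  Checkdoc findings are
reported as warnings and do not affect the return value."
  (let ((compiled (byte-compile-file file)))
    (checkdoc-file file)
    (and compiled t)))
```

Calling (demo-check-package-file "mybuild-mode.el") from the project directory would compile the module and print any checkdoc complaints.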

The first five points are clear enough. How to do the sixth is described in the same CONTRIBUTING document. For this we need the make utility. Change to the MELPA fork directory where we placed our recipe file and run:

make recipes/mybuild-mode

Then, in Emacs, we run the command M-x package-install-file to install the built package from the resulting archive.

Conclusion

To summarize: we figured out how modes are developed and packaged for MELPA, looked at the structure of the package module, and covered what you need to know to write your own mode based on prog-mode. Of course, this all looks quite intimidating, and in general it is: after all, we need not only to master the technology, but also to learn how to program in Lisp and write code for GNU Emacs, which in itself is not trivial. But let’s not lose heart, because how can you live with Emacs if you don’t love it? 😉 Good luck!

Oh, yes, I almost forgot: if you have also developed in Emacs Lisp, share your experience in the comments or, who knows, in an article of your own. And if you find an error in the text, be sure to write about it in the comments or in a message; I will be grateful.

(c) Simonenko Evgeny, 2023
