Special characters in Vim

Particularly when editing documents for human consumption rather than code, it’s often necessary to enter special characters into a document that can’t otherwise be produced by a single key press:

  • Letters with diacritical marks like ä, é, and ô, which Vim lets you enter as digraphs; these are a particular problem when using a US keyboard layout
  • Unicode characters like typographic dashes or the copyright symbol ©, or other symbols from the multi-byte portion of the UTF-8 character set (including non-Latin scripts)
  • Literal control characters like <Tab>

Vim has a method for inserting each of these within the editor, rather than having to copy-paste them from another document. We won’t discuss Vim’s alternative multibyte input methods here, and will assume that you’re using a keyboard with a US or UK layout, and predominantly type in English — apologies to international readers, but I do not have another type of keyboard to test this out!

Some of the following assumes that you’re using Vim in a UTF-8 capable terminal, and with the encoding option in your .vimrc set to utf-8, which is highly recommended for the vast majority of editing requirements:

set encoding=utf-8

It also assumes that your font is capable of displaying all of the characters concerned; monospace fonts with workable symbol coverage include Consolas, Inconsolata, and Ubuntu Mono.

Digraphs

Vim has a special shorthand for entering characters with diacritical marks. If you need some familiar variant of a Latin alphabet character with a diacritical mark or embellishment, it’s likely you’ll be able to input it with the digraph system. It also has support for some other sometimes-needed characters like thorn Þ and eszett ß, and Cyrillic characters.

Digraph input is started in insert or command mode (but not normal mode) by pressing Ctrl-k, then two printable characters in succession; the first is often the “base” form of the letter, and the second denotes the appropriate embellishment.

Some simple examples that might occasionally be needed for English speakers to correctly type one of the language’s many “loan words”:

  • Ctrl-k c , -> ç
  • Ctrl-k e ' -> é
  • Ctrl-k o ^ -> ô
  • Ctrl-k a ! -> à
  • Ctrl-k u : -> ü
  • Ctrl-k = e -> €

This is just a small sample; Vim has support for a great many digraphs. See :help digraphs for a complete treatment of the feature. You can also type :digraphs within Vim to get a complete list of digraphs, which runs to several screenfuls.
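If a character you type often has no convenient digraph, the same :digraphs command can also define one. The following is a minimal sketch rather than a recommended setup; the pair "sm" and the smiley character are arbitrary choices made for illustration.

```vim
" Lines for ~/.vimrc; at the : prompt, type them without the comments.

" Define a custom digraph: {char1}{char2} followed by the DECIMAL
" code of the target character. U+263A (WHITE SMILING FACE) is 9786.
" The pair "sm" is a hypothetical choice, not a Vim default.
digraphs sm 9786
```

After this, Ctrl-k s m in insert mode should produce the character. The built-in table is also available under :help digraph-table.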

Note that you can enter all of these characters using the Unicode mode discussed later in this article as well; two-character mnemonic digraphs simply happen to be easier to remember than four-digit codes.

Unicode characters

For characters not covered in the digraph set, you can also enter Unicode characters by referring to their code point. In insert or command mode (but not normal mode) this is done by typing Ctrl-v and then u, followed by the four-digit hexadecimal code. Some potentially useful examples:

  • Ctrl-v u 2018 -> ‘, a LEFT SINGLE QUOTATION MARK
  • Ctrl-v u 2019 -> ’, a RIGHT SINGLE QUOTATION MARK
  • Ctrl-v u 2014 -> —, an EM DASH
  • Ctrl-v u 00a9 -> ©, a COPYRIGHT SIGN

These are handy in some cases when writing HTML documents, as an alternative to using HTML entities like &mdash; or &copy;. An exhaustive summary of these characters and their codes is available on the Unicode website.
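To check a code before typing it, Vim's built-in nr2char() and char2nr() functions convert between numbers and characters. A minimal sketch, assuming encoding is set to utf-8 as described earlier:

```vim
" Run from the : prompt (without the comments) or source from a file.
" Assumes 'encoding' is utf-8.

" Print the character for a hexadecimal code point (here the em dash).
echo nr2char(0x2014)

" Go the other way: print the four-digit hex code of a character,
" in the form that Ctrl-v u expects.
echo printf('%04x', char2nr('©'))
```

In normal mode, ga over a character already in the buffer reports its code in decimal, hexadecimal, and octal.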

Other non-printable characters

The Unicode character input method is actually a specialised case of inputting literal characters with a Ctrl-v prefix. We can input other non-printable and control characters using this prefix:

  • Ctrl-v <Enter> -> ^M
  • Ctrl-v <Tab> -> ^I

This is sometimes handy when conforming to someone else’s tab style, and also when you need to match such a character literally in a search pattern.
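A few commands where literal control characters come up in practice, as a rough sketch; the patterns use Vim's \t and \r escapes, which match the same characters you would get from Ctrl-v <Tab> and Ctrl-v <Enter>:

```vim
" Type these at the prompt, without the comments.

" Search for literal tab characters.
/\t

" Strip carriage returns (displayed as ^M) from the ends of lines;
" the e flag suppresses the error if none are found.
:%s/\r$//e

" Make control characters visible: with the default 'listchars',
" tabs are displayed as ^I and each line end as $.
:set list
```

With expandtab set, Ctrl-v <Tab> is also the way to insert a real tab character rather than spaces.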