Custom commands

As users grow more familiar with the feature set available to them on UNIX-like operating systems, and more comfortable using the command line, they tend to develop their own routines for solving problems with their preferred tools, often repeatedly solving the same problem in the same way. You can usually tell you’ve entered this stage if one or more of the below applies:

  • You repeatedly search the web for the same long commands to copy-paste.
  • You type a particular long command so often it’s gone into muscle memory, and you type it without thinking.
  • You have a text file somewhere with a list of useful commands to solve some frequently recurring problem or task, and you copy-paste from it a lot.
  • You’re keeping large amounts of history so you can search back through commands you ran weeks or months ago with ^R, to find the last time an instance of a problem came up, and getting angry when you realize it’s fallen off the end of your history file.
  • You’ve found that you prefer to run a tool like ls(1) more often with a non-default flag than without it; -l is a common example.

You can definitely accomplish a lot of work quickly by shoving the output of some monolithic program through a terse one-liner to get the information you want, or by developing muscle memory for your chosen toolbox and oft-repeated commands, but if you want to apply more discipline and automation to managing these sorts of tasks, it may be useful to explore defining your own commands more rigorously, for use during your shell sessions or for automation purposes.

This is consistent with the original idea of the Unix shell as a programming environment; the tools provided by the base system are intentionally very general, not prescribing how they’re used, an approach which allows the user to build and customize their own command set as appropriate for their system’s needs, even on a per-user basis.

What this all means is that you need not treat the tools available to you as holy writ. To leverage the Unix philosophy’s real power, you should consider customizing and extending the command set in ways that are useful to you, refining them as you go, and sharing those extensions and tweaks if they may be useful to others. We’ll discuss here a few methods for implementing custom commands, and where and how to apply them.

Aliases

The first step users take toward customizing the behaviour of their shell tools is often to define shell aliases in their shell’s startup file, usually specifically for interactive sessions; for Bash, this is usually ~/.bashrc.

Some aliases are so common that they’re included as commented-out suggestions in the default ~/.bashrc file for new users. For example, on Debian systems, the following alias is defined by default if the dircolors(1) tool is available for coloring ls(1) output by filetype:

alias ls='ls --color=auto'

With this defined at startup, invoking ls, with or without other arguments, will expand to run ls --color=auto, including any given arguments on the end as well.

In the same block of that file, but commented out, are suggestions for other aliases to enable coloured output for GNU versions of the dir and grep tools:

#alias dir='dir --color=auto'
#alias vdir='vdir --color=auto'

#alias grep='grep --color=auto'
#alias fgrep='fgrep --color=auto'
#alias egrep='egrep --color=auto'

Further down still, there are some suggestions for different methods of invoking ls:

#alias ll='ls -l'
#alias la='ls -A'
#alias l='ls -CF'

Uncommenting these would make ll, la, and l work as commands during an interactive session, with the appropriate options added to the call.

You can check the aliases defined in your current shell session by typing alias with no arguments:

$ alias
alias ls='ls --color=auto'

Aliases are convenient ways to add options to commands, and are very common features of ~/.bashrc files shared on the web. They also work in POSIX-conforming shells besides Bash. However, for general use, they aren’t very sophisticated. For one thing, you can’t process arguments with them:

# An attempt to write an alias that searches for a given pattern in a fixed
# file; doesn't work because aliases don't expand parameters
alias grepvim='grep "$1" ~/.vimrc'

They also don’t work for defining new commands within scripts for certain shells; Bash, for example, doesn’t expand aliases in non-interactive shells by default:

#!/bin/bash
alias ll='ls -l'
ll

When saved in a file as test, made executable, and run, this script fails:

./test: line 3: ll: command not found

So, once you understand how aliases work well enough to read them when others define them in startup files, my suggestion is that there’s little point in writing your own. Aside from some very niche evaluation tricks, they have no functional advantages over shell functions and scripts.

Functions

A more flexible method for defining custom commands for an interactive shell (or within a script) is to use a shell function. We could define ll in a Bash startup file as a function instead of an alias like so:

# Shortcut to call ls(1) with the -l flag
ll() {
    command ls -l "$@"
}

Note the use of the command builtin here to specify that the ll function should invoke the program named ls, and not any function named ls. This is particularly important when writing a function wrapper around a command of the same name, to prevent the function calling itself in an infinite loop:

# Always add -q to invocations of gdb(1)
gdb() {
    command gdb -q "$@"
}

In both examples, note also the use of the "$@" expansion, to add to the final command line any arguments given to the function. We wrap it in double quotes to stop spaces and other shell metacharacters in the arguments from causing problems. This means that the ll command will work correctly if you pass it further options and/or one or more directories as arguments:

$ ll -a
$ ll ~/.config

Shell functions declared in this way are specified by POSIX for Bourne-style shells, so they should work in your shell of choice, including Bash, dash, Korn shell, and Zsh. They can also be used within scripts, allowing you to abstract away multiple instances of similar commands to improve the clarity of your script, in much the same way the basics of functions work in general-purpose programming languages.
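
As a quick illustration, the grepvim attempt from the aliases section works fine when written as a function, since functions really do process their arguments:

# The earlier grepvim attempt, as a working function; "$1" is the
# function's first argument
grepvim() {
    grep "$1" ~/.vimrc
}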

Functions are a good and portable way to approach adding features to your interactive shell; written carefully, they even allow you to port features you might like from other shells into your shell of choice. I’m fond of taking commands I like from Korn shell or Zsh and implementing them in Bash or POSIX shell functions, such as Zsh’s vared or its two-argument cd features.
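
As a taste of what such porting can look like, here’s a minimal sketch of Zsh’s two-argument cd as a Bash function; note it relies on Bash’s pattern substitution, and treats its first argument as a pattern to replace in $PWD:

# Sketch of Zsh's two-argument cd in Bash: substitute the second
# argument for the first in the current path, and change to the result
cd() {
    if [ "$#" -eq 2 ]; then
        command cd -- "${PWD/$1/$2}"
    else
        command cd "$@"
    fi
}

With this defined, typing cd local share while in /usr/local/bin would move you to /usr/share/bin.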

If you end up writing a lot of shell functions, you should consider putting them into separate configuration subfiles to keep your shell’s primary startup file from becoming unmanageably large.
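
One way to do that, as a sketch, is to source every file from a dedicated directory at the end of ~/.bashrc; the ~/.bashrc.d path here is just an arbitrary choice:

# Load any extra function definitions kept in ~/.bashrc.d
if [ -d ~/.bashrc.d ]; then
    for sub in ~/.bashrc.d/*.bash; do
        [ -e "$sub" ] && . "$sub"
    done
    unset -v sub
fi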

Examples from the author

You can take a look at some of the shell functions I have defined here that are useful to me in general shell usage; a lot of these amount to implementing convenience features that I wish my shell had, especially for quick directory navigation, or adding options to commands.

Variables in shell functions

You can manipulate variables within shell functions, too:

# Print the filename of a path, stripping off its leading path and
# extension
fn() {
    name=$1
    name=${name##*/}
    name=${name%.*}
    printf '%s\n' "$name"
}

This works fine, but the catch is that after the function is done, the value for name will still be defined in the shell, and will overwrite whatever was in there previously:

$ printf '%s\n' "$name"
foobar
$ fn /home/you/Task_List.doc
Task_List
$ printf '%s\n' "$name"
Task_List

This may be desirable if you actually want the function to change some aspect of your current shell session, such as managing variables or changing the working directory. If you don’t want that, you will probably want to find some means of avoiding name collisions in your variables.
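
The crudest such means is a naming convention: prefix a function’s variables with its name, so that collisions are merely unlikely rather than impossible. This is only a convention, not real scoping:

# Convention only; the variable is still global, just unlikely to clash
fn() {
    _fn_name=$1
    _fn_name=${_fn_name##*/}
    _fn_name=${_fn_name%.*}
    printf '%s\n' "$_fn_name"
    unset -v _fn_name
}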

If your function is only for use with a shell that provides the local (Bash) or typeset (Ksh) features, you can declare the variable as local to the function to remove its global scope, to prevent this happening:

# Bash-like
fn() {
    local name
    name=$1
    name=${name##*/}
    name=${name%.*}
    printf '%s\n' "$name"
}

# Ksh-like
# Note different syntax for first line
function fn {
    typeset name
    name=$1
    name=${name##*/}
    name=${name%.*}
    printf '%s\n' "$name"
}

If you’re using a shell that lacks these features, or you want to aim for POSIX compatibility, things are a little trickier, since local function variables aren’t specified by the standard. One option is to use a subshell, so that the variables are only defined for the duration of the function:

# POSIX; note we're using plain parentheses rather than curly brackets, for
# a subshell
fn() (
    name=$1
    name=${name##*/}
    name=${name%.*}
    printf '%s\n' "$name"
)

# POSIX; alternative approach using command substitution:
fn() {
    printf '%s\n' "$(
        name=$1
        name=${name##*/}
        name=${name%.*}
        printf %s "$name"
    )"
}

This subshell method also allows you to change directory with cd within a function without changing the working directory of the user’s interactive shell, or to change shell options with set or Bash options with shopt only temporarily for the purposes of the function.
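
For example, this hypothetical dirfiles function changes directory only within its subshell body, counting the entries in the given directory without moving the calling shell:

# POSIX; the cd applies only inside the function's subshell
dirfiles() (
    cd -- "${1:-.}" || exit
    ls -A | wc -l
)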

Another method to deal with variables is to manipulate the positional parameters directly ($1, $2 … ) with set, since they are local to the function call too:

# POSIX; using positional parameters
fn() {
    set -- "${1##*/}"
    set -- "${1%.*}"
    printf '%s\n' "$1"
}

These methods work well, and can sometimes even be combined, but they’re awkward to write, and harder to read than the modern shell versions. If you only need your functions to work with your modern shell, I recommend just using local or typeset. The Bash Guide on Greg’s Wiki has a very thorough breakdown of functions in Bash, if you want to read about this and other aspects of functions in more detail.

Keeping functions for later

As you get comfortable with defining and using functions during an interactive session, you might define them in ad-hoc ways on the command line for calling in a loop or some other similar circumstance, just to solve a task in that moment.

As an example, I recently made an ad-hoc function called monit to run a set of commands for its hostname argument that together established different types of monitoring system checks, using an existing script called nmfs:

$ monit() { nmfs "$1" Ping Y ; nmfs "$1" HTTP Y ; nmfs "$1" SNMP Y ; }
$ for host in webhost{1..10} ; do
> monit "$host"
> done

After that task was done, I realized I was likely to use the monit command interactively again, so I decided to keep it. Shell functions only last as long as the current shell, so if you want to make them permanent, you need to store their definitions somewhere in your startup files. If you’re using Bash, and you’re content to add things to the end of your ~/.bashrc file, you could do something like this:

$ declare -f monit >> ~/.bashrc

That would append the existing definition of monit in parseable form to your ~/.bashrc file, and the monit function would then be loaded and available to you for future interactive sessions. Later on, I ended up converting monit into a shell script, as its use wasn’t limited to just an interactive shell.

If you want a more robust approach to keeping functions like this for Bash permanently, I wrote a tool called Bashkeep, which allows you to quickly store functions and variables defined in your current shell into separate and appropriately-named files, including viewing and managing the list of names conveniently:

$ keep monit
$ keep
monit
$ ls ~/.bashkeep.d
monit.bash
$ keep -d monit

Scripts

Shell functions are a great way to portably customize behaviour you want for your interactive shell, but if a task isn’t specific to an interactive shell context, you should instead consider putting it into its own script, whether written in shell or not, to be invoked somewhere from your PATH. This makes the script usable in contexts besides an interactive shell with your personal configuration loaded; for example, from within another script, by another user, or from an X11 session via a launcher like dmenu.

Even if your set of commands is only a few lines long, if you need to call it often, especially from other scripts and in varying contexts, making it into a generally-available shell script has many advantages.

/usr/local/bin

Users making their own scripts often start by putting them in /usr/local/bin and making them executable with sudo chmod +x, since many Unix systems include this directory in the system PATH. If you want a script to be generally available to all users on a system, this is a reasonable approach. However, if the script is just something for your own personal use, or if you don’t have the permissions necessary to write to this system path, it may be preferable to have your own directory for logical binaries, including scripts.

Private bindir

Users of Unix-like systems who do this seem to vary in where they choose to put their private logical binaries directory. I’ve seen each of the below used or recommended:

  • ~/bin
  • ~/.bin
  • ~/.local/bin
  • ~/Scripts

I personally favour ~/.local/bin, but you can put your scripts wherever they best fit into your HOME directory layout. You may want to choose something that fits in well with the XDG standard, or whatever existing standard or system your distribution chooses for filesystem layout in $HOME.

In order to make this work, you will want to customize your login shell startup to include the directory in your PATH environment variable. It’s better to put this into ~/.profile or whichever file your shell runs on login, so that it’s only run once. That should be all that’s necessary, as PATH is typically exported as an environment variable for all the shell’s child processes. A line like this at the end of one of those scripts works well to extend the system PATH for our login shell:

PATH=$HOME/.local/bin:$PATH

Note that we specifically put our new path at the front of the PATH variable’s value, so that it’s the first directory searched for programs. This allows you to implement or install your own versions of programs with the same name as those in the system; this is useful, for example, if you like to experiment with building software in $HOME.
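
You can check which program the shell will now run first for a given name with the POSIX command -v; the output here is hypothetical:

$ command -v ctags
/home/tom/.local/bin/ctags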

If you’re using a systemd-based GNU/Linux, and particularly if you’re using a display manager like GDM rather than a TTY login and startx for your X11 environment, you may find it more robust to instead set this variable with the appropriate systemd configuration file. Another option you may prefer on systems using PAM is to set it with pam_env(8).

After logging in, we first verify the directory is in place in the PATH variable:

$ printf '%s\n' "$PATH"
/home/tom/.local/bin:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games

We can test this is working correctly by placing a test script into the directory, including the #!/bin/sh shebang, and making it executable by the current user with chmod(1):

$ cat >~/.local/bin/test-private-bindir
#!/bin/sh
printf 'Working!\n'
^D
$ chmod u+x ~/.local/bin/test-private-bindir
$ test-private-bindir
Working!

Examples from the author

I publish the more generic scripts from my ~/.local/bin directory, keeping them up to date on my personal systems in version control using Git, along with my configuration files. Many of the scripts are very short, and are intended mostly as building blocks for other scripts in the same directory. A few examples:

  • gscr(1df): Run a set of commands on a Git repository to minimize its size.
  • fgscr(1df): Find all Git repositories in a directory tree and run gscr(1df) over them.
  • hurl(1df): Extract URLs from links in an HTML document.
  • maybe(1df): Exit with success or failure with a given probability.
  • rfcr(1df): Download and read a given Request for Comments document.
  • tot(1df): Add up a list of numbers.

For such scripts, I try to write them as much as possible to use tools specified by POSIX, so that there’s a decent chance of them working on whatever Unix-like system I need them to.

On systems I use or manage, I might specify commands to do things relevant specifically to that system, such as:

  • Filter out uninteresting lines in an Apache HTTPD logfile with awk.
  • Check whether mail has been delivered to system users in /var/mail.
  • Upgrade the Adobe Flash player in a private Firefox instance.

The tasks you need to solve both generally and specifically will almost certainly be different; this is where you can get creative with your automation and abstraction.

X windows scripts

An additional advantage worth mentioning of using scripts rather than shell functions where possible is that they can be called from environments besides shells, such as in X11 or by other scripts. You can combine this method with X11-based utilities such as dmenu(1), libnotify’s notify-send(1), or ImageMagick’s import(1) to implement custom interactive behaviour for your X windows session, without having to write your own X11-interfacing code.
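
As a sketch of the kind of glue this enables, the script below uses dmenu(1) to pick a file from a notes directory and opens it in the user’s editor in a new terminal window; the paths and programs here are assumptions to adapt:

#!/bin/sh
# Pick a note with dmenu(1) and open it for editing; intended to be
# bound to a key in the window manager rather than run from a shell
dir=$HOME/notes
choice=$(ls -- "$dir" | dmenu) || exit
exec xterm -e "${EDITOR:-vi}" "$dir/$choice"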

Other languages

Of course, you’re not limited to just shell scripts with this system; it might suit you to write a script completely in a language like awk(1), or even sed(1). If portability isn’t a concern for the particular script, you should use your favourite scripting language. Notably, don’t fall into the trap of implementing a script in shell for no reason …

#!/bin/sh
awk 'NF>2 && /foobar/ {print $1}' "$@"

… when you can instead write the whole script in the main language used, and save a fork(2) syscall and a layer of quoting:

#!/usr/bin/awk -f
NF>2 && /foobar/ {print $1}
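
Saved as an executable file named, say, fields (a name invented for this example), it’s then invoked like any other command:

$ chmod +x fields
$ ./fields data.txt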

Versioning and sharing

Finally, if you end up writing more than a couple of useful shell functions and scripts, you should consider versioning them with Git or a similar version control system. This also eases implementing your shell setup and scripts on other systems, and makes them easier to share with others by publishing on GitHub. You might even go so far as to write a Makefile to install them, or manual pages for quick reference as documentation … if you’re just a little bit crazy …

Testing exit values in Bash

In Bash scripting (and shell scripting in general), we often want to check the exit value of a command to decide an action to take after it completes, likely for the purpose of error handling. For example, to determine whether a particular regular expression regex was present somewhere in a file options, we might apply grep(1) with its POSIX -q option to suppress output and just use the exit value:

grep -q regex options

An approach sometimes taken is then to test the exit value with the $? parameter, using if to check if it’s non-zero, which is not very elegant and a bit hard to read:

# Bad practice
grep -q regex options
if (($? > 0)); then
    printf '%s\n' 'myscript: Pattern not found!' >&2
    exit 1
fi

Because the if construct by design tests the exit value of commands, it’s better to test the command directly, making the expansion of $? unnecessary:

# Better
if grep -q regex options; then
    # Do nothing
    :
else
    printf '%s\n' 'myscript: Pattern not found!' >&2
    exit 1
fi

We can precede the command to be tested with ! to negate the test, which saves us from having to use an else branch at all:

# Best
if ! grep -q regex options; then
    printf '%s\n' 'myscript: Pattern not found!' >&2
    exit 1
fi

An alternative syntax is to use && and || to perform if and else tests with grouped commands between braces, but these tend to be harder to read:

# Alternative
grep -q regex options || {
    printf '%s\n' 'myscript: Pattern not found!' >&2
    exit 1
}

With this syntax, the two commands in the block are only executed if the grep(1) call exits with a non-zero status. We can apply && instead to execute commands if it does exit with zero.
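
For example, to print a message only when the pattern is found:

grep -q regex options && printf '%s\n' 'myscript: Pattern found'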

That syntax can be convenient for quickly short-circuiting failures in scripts, for example due to nonexistent commands, particularly if the command being tested already outputs its own error message. The following line cuts the script off if the given command fails, here likely due to ffmpeg(1) being unavailable on the system:

hash ffmpeg || exit 1

Note that the braces for a grouped command are not needed here, as there’s only one command to be run in case of failure, the exit call.

Calls to cd are another good use case here, as continuing a script in the wrong directory after a failed cd could have really nasty effects:

cd wherever || exit 1

In general, you’ll probably only want to test $? when you have specific non-zero error conditions to catch. For example, if we were using the --max-delete option for rsync(1), we could check a call’s return value to see whether rsync(1) hit the threshold for deleted file count and write a message to a logfile appropriately:

rsync --archive --delete --max-delete=5 source destination
if (($? == 25)); then
    printf '%s\n' 'Deletion limit was reached' >"$logfile"
fi

It may be tempting to use the errexit feature in the hopes of stopping a script as soon as it encounters any error, but there are some problems with its usage that make it a bit error-prone. It’s generally more straightforward to simply write your own error handling using the methods above.

For a really thorough breakdown of dealing with conditionals in Bash, take a look at the relevant chapter of the Bash Guide.

Unix as IDE: Editing

The text editor is the core tool for any programmer, which is why choice of editor evokes such tongue-in-cheek zealotry in debate among programmers. Unix is the operating system most strongly linked with two enduring favourites, Emacs and Vi, and their modern versions in GNU Emacs and Vim, two editors with very different editing philosophies but comparable power.

Being a Vim heretic myself, here I’ll discuss the indispensable features of Vim for programming, and in particular the use of shell tools called from within Vim to complement the editor’s built-in functionality. Some of the principles discussed here will be applicable to those using Emacs as well, but probably not for underpowered editors like Nano.

This will be a very general survey, as Vim’s toolset for programmers is enormous, and it’ll still end up being quite long. I’ll focus on the essentials and the things I feel are most helpful, and try to provide links to articles with a more comprehensive treatment of the topic. Don’t forget that Vim’s :help has surprised many people new to the editor with its high quality and usefulness.

Filetype detection

Vim has built-in settings to adjust its behaviour, in particular its syntax highlighting, based on the filetype being loaded, which it detects automatically and generally gets right. In particular, this allows you to set an indenting style conformant with the way a particular language is usually written. This should be one of the first things in your .vimrc file.

if has("autocmd")
  filetype indent plugin on
endif

Syntax highlighting

Even if you’re only working with a 16-color terminal, include the following in your .vimrc if it isn’t there already:

syntax on

The colorschemes available with a default 16-color terminal are not pretty, largely by necessity, but they do the job, and for most languages syntax definition files are available that work very well. There’s a tremendous array of colorschemes available, and it’s not hard to tweak them to suit or even to write your own. Using a 256-color terminal or gVim will give you more options. Good syntax highlighting files will show you definite syntax errors with a glaring red background.

Line numbering

To turn line numbers on if you use them a lot in your traditional IDE:

set number

You might like to try this as well, if you have at least Vim 7.3 and are keen to try numbering lines relative to the current line rather than absolutely:

set relativenumber

Tags files

Vim works very well with the output from the ctags utility. This allows you to search quickly for all uses of a particular identifier throughout the project, or to navigate straight to the declaration of a variable from one of its uses, regardless of whether it’s in the same file. For large C projects in multiple files this can save huge amounts of otherwise wasted time, and is probably Vim’s best answer to similar features in mainstream IDEs.

You can run :!ctags -R on the root directory of projects in many popular languages to generate a tags file filled with definitions and locations for identifiers throughout your project. Once a tags file for your project is available, you can search for uses of an appropriate tag throughout the project like so:

:tag someClass

The commands :tn and :tp will allow you to iterate through successive uses of the tag elsewhere in the project. The built-in tags functionality for this already covers most of the bases you’ll probably need, but for features such as a tag list window, you could try installing the very popular Taglist plugin. Tim Pope’s Unimpaired plugin also contains a couple of useful relevant mappings.
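
If you’d rather generate or refresh the tags file from outside the editor, running ctags(1) from the shell works just as well; the project path here is hypothetical:

$ cd ~/dev/myproject
$ ctags -R .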

Calling external programs

Until 2017, there were three major methods of calling external programs during a Vim session:

  • :!<command> — Useful for issuing commands from within a Vim context, particularly in cases where you intend to record output in a buffer.
  • :shell — Drop to a shell as a subprocess of Vim. Good for interactive commands.
  • Ctrl-Z — Suspend Vim and issue commands from the shell that called it.

Since 2017, Vim 8.x has included a :terminal command to bring up a terminal emulator buffer in a window. This seems to work better than previous plugin-based attempts at doing this, such as Conque. For the moment I still strongly recommend using one of the older methods, all of which also work in other vi-type editors.

Lint programs and syntax checkers

Checking syntax or compiling with an external program call (e.g. perl -c, gcc) is one of the calls that’s good to make from within the editor using :! commands. If you were editing a Perl file, you could run this like so:

:!perl -c %

/home/tom/project/test.pl syntax OK

Press Enter or type command to continue

The % symbol is shorthand for the file loaded in the current buffer. Running this prints the output of the command, if any, below the command line. If you wanted to call this check often, you could perhaps map it as a command, or even a key combination in your .vimrc file. In this case, we define a command :PerlLint which can be called from normal mode with \l:

command PerlLint !perl -c %
nnoremap <leader>l :PerlLint<CR>

For a lot of languages there’s an even better way to do this, though, which allows us to capitalise on Vim’s built-in quickfix window. We can do this by setting an appropriate makeprg for the filetype, in this case including a module that provides us with output that Vim can use for its quickfix list, and a definition for its two formats:

:set makeprg=perl\ -c\ -MVi::QuickFix\ %
:set errorformat+=%m\ at\ %f\ line\ %l\.
:set errorformat+=%m\ at\ %f\ line\ %l

You may need to install this module first via CPAN, or the Debian package libvi-quickfix-perl. This done, you can type :make after saving the file to check its syntax, and if errors are found, you can open the quickfix window with :copen to inspect the errors, and use :cn and :cp to jump to them within the buffer.

Vim quickfix working on a Perl file

This also works for output from gcc, and pretty much any other compiler or syntax checker that you might want to use, as long as it includes filenames, line numbers, and error strings in its error output. It’s even possible to do this with web-focused languages like PHP, and for tools like JSLint for JavaScript. There’s also an excellent plugin named Syntastic that does something similar.

Reading output from other commands

You can use :r! to call commands and paste their output directly into the buffer with which you’re working. For example, to pull a quick directory listing for the current folder into the buffer, you could type:

:r!ls

This doesn’t just work for commands, of course; you can read in other files this way with plain :r, like public keys or your own custom boilerplate:

:r ~/.ssh/id_rsa.pub
:r ~/dev/perl/boilerplate/copyright.pl

Filtering output through other commands

You can extend this to actually filter text in the buffer through external commands, perhaps selected by a range or visual mode, and replace it with the command’s output. While Vim’s visual block mode is great for working with columnar data, it’s very often helpful to bust out tools like column, cut, sort, or awk.

For example, you could sort the entire file in reverse by the second column by typing:

:%!sort -k2,2r

You could print only the third column of some selected text where the line matches the pattern /vim/ with:

:'<,'>!awk '/vim/ {print $3}'

You could arrange keywords from lines 1 to 10 in nicely formatted columns like:

:1,10!column -t

Really any kind of text filter or command can be manipulated like this in Vim, a simple interoperability feature that expands what the editor can do by an order of magnitude. It effectively makes the Vim buffer into a text stream, which is a language that all of these classic tools speak.

There is a lot more detail on this in my “Shell from Vi” post.

Built-in alternatives

It’s worth noting that for really common operations like sorting and searching, Vim has built-in methods in :sort and :grep, which can be helpful if you’re stuck using Vim on Windows, but don’t have nearly the adaptability of shell calls.

Diffing

Vim has a diffing mode, vimdiff, which allows you to not only view the differences between different versions of a file, but also to resolve conflicts via a three-way merge and to replace differences to and fro with commands like :diffput and :diffget for ranges of text. You can call vimdiff from the command line directly with at least two files to compare like so:

$ vimdiff file-v1.c file-v2.c

Vim diffing a .vimrc file

Version control

You can call version control methods directly from within Vim, which is probably all you need most of the time. It’s useful to remember here that % is always a shortcut for the buffer’s current file:

:!svn status
:!svn add %
:!git commit -a

Recently a clear winner for Git functionality in Vim has emerged in Tim Pope’s Fugitive, which I highly recommend to anyone doing Git development with Vim. There’ll be a more comprehensive treatment of version control’s basis and history in Unix in Part 7 of this series.

The difference

Part of the reason Vim is thought of as a toy or relic by a lot of programmers used to GUI-based IDEs is that it’s seen as just a tool for editing files on servers, rather than as a very capable editing component for the shell in its own right. Because its built-in features compose so well with external tools on Unix-friendly systems, it becomes a text editing powerhouse that sometimes surprises even experienced users.