Temporary files

With judicious use of pipes, redirects, and process substitution in modern shells, it’s very often possible to avoid temporary files altogether, doing everything inline and keeping things quite neat. However, when manipulating a lot of data into various formats, you do occasionally find yourself needing a temporary file just to hold intermediate data.

A common way to deal with this is to create a temporary file in your home directory, with some arbitrary name, something like test or working:

$ ps -ef >~/test

If you want to save the information indefinitely for later use, this makes sense, although it would be better to give it a slightly more instructive name than just test.

If you really only need the data temporarily, however, you’re much better off using the temporary files directory. This is usually /tmp, but for good practice’s sake it’s better to check the value of TMPDIR first, and only use /tmp as a default:

$ ps -ef >"${TMPDIR:-/tmp}"/test

This is getting better, but there is still a significant problem: there’s no built-in check that the test file doesn’t already exist, perhaps in use by some other user or program, particularly another running instance of the same script.

To that end, we have the mktemp program, which creates an empty temporary file in the appropriate directory for you without overwriting anything, and prints the filename it created. This allows you to use the file inline in both shell scripts and one-liners, and is much safer than specifying hardcoded paths:

$ mktemp
/tmp/tmp.yezXn0evDf
$ procsfile=$(mktemp)
$ printf '%s\n' "$procsfile"
/tmp/tmp.9rBjzWYaSU
$ ps -ef >"$procsfile"

If you’re going to create several such files for related purposes, you could also create a directory in which to put them using the -d option:

$ procsdir=$(mktemp -d)
$ printf '%s\n' "$procsdir"
/tmp/tmp.HMAhM2RBSO
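
When a script creates files like this, it’s worth arranging for them to be cleaned up when you’re done, however the script exits. A minimal sketch of that pattern, using trap to remove a temporary working directory on exit (the variable and file names here are just placeholders):

#!/usr/bin/env bash
# Create a private working directory; bail out if we can't.
workdir=$(mktemp -d) || exit 1

# Remove the directory and everything in it when the script exits,
# whether normally or because of an error.
trap 'rm -rf -- "$workdir"' EXIT

ps -ef >"$workdir"/procs
grep '^root' "$workdir"/procs >"$workdir"/root-procs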

On GNU/Linux systems, files of a sufficient age in TMPDIR are cleared out on boot or periodically (controlled by /etc/default/rcS on Debian-derived systems, and by /etc/cron.daily/tmpwatch on Red Hat ones), making /tmp useful as a general scratchpad as well as for a kind of relatively reliable inter-process communication, without cluttering up users’ home directories.

In some cases, there may be additional advantages in using /tmp for its designed purpose, as some administrators choose to mount it as a tmpfs filesystem, so that it operates in RAM and works very quickly. It’s also common practice to set the noexec flag on the mount to prevent malicious users from executing any code they manage to find or save in the directory.
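
If you’re curious what that looks like, a line along these lines in /etc/fstab is one common way to set it up; the exact options and size here are just an illustration and will vary from site to site:

tmpfs   /tmp   tmpfs   defaults,noexec,nosuid,size=512m   0   0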

Bash process substitution

For tools like diff that work with multiple files as parameters, it can be useful to work not just with files on the filesystem, but also with the output of arbitrary commands. Say, for example, you wanted to compare the output of ps and ps -e with diff -u. An obvious way to do this is to write both outputs to files and then compare those:

$ ps > ps.out
$ ps -e > pse.out
$ diff -u ps.out pse.out

This works just fine, but Bash provides a shortcut in the form of process substitution, allowing you to treat the standard output of commands as files. This is done with the <() and >() operators. In our case, we want the standard output of two commands to stand in for files:

$ diff -u <(ps) <(ps -e)

This is functionally equivalent, except it’s a little tidier because it doesn’t leave files lying around. This is also very handy for elegantly comparing files across servers, using ssh:

$ diff -u .bashrc <(ssh remote cat .bashrc)

Conversely, you can also use the >() operator to direct from a filename context to the standard input of a command. This is handy for setting up in-place filters for things like logs. In the following example, I’m making a call to rsync, specifying that it should make a log of its actions in log.txt, but filter it through grep -vF .tmp first to remove anything matching the fixed string .tmp:

$ rsync -arv --log-file=>(grep -vF .tmp >log.txt) src/ host::dst/

Combined with tee, this syntax is a way of simulating multiple filters for a stdout stream, transforming output from a command in as many ways as you see fit:

$ ps -ef | tee >(awk '$1=="tom"' >toms-procs.txt) \
               >(awk '$1=="root"' >roots-procs.txt) \
               >(awk '$1!="httpd"' >not-apache-procs.txt) \
               >(awk 'NR>1{print $1}' >pids-only.txt)

In general, the idea is that wherever on the command line you could specify a file to be read from or written to, you can instead use this syntax to make an implicit named pipe for the text stream.
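
You can see the mechanism at work by expanding one of these substitutions on its own. On Linux systems, Bash typically passes the command a path under /dev/fd, though the descriptor number will differ and other systems may use a named pipe instead:

$ echo <(true)
/dev/fd/63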

Thanks to Reddit user Rhomboid for pointing out an incorrect assertion about this syntax necessarily abstracting mkfifo calls, which I’ve since removed.

Unix as IDE: Debugging

This entry is part 6 of 7 in the series Unix as IDE.

When unexpected behaviour is noticed in a program, GNU/Linux provides a wide variety of command-line tools for diagnosing problems. The use of gdb, the GNU debugger, and related tools like the lesser-known Perl debugger will be familiar to those who use IDEs to set breakpoints in their code and to examine program state as it runs. However, other tools of interest are available to observe in more detail how a program is interacting with the system and using its resources.

Debugging with gdb

You can use gdb in a very similar fashion to the built-in debuggers in modern IDEs like Eclipse and Visual Studio. If you are debugging a program that you’ve just compiled, it makes sense to compile it with its debugging symbols added to the binary, which you can do with a gcc call containing the -g option. If you’re having problems with some code, it helps to also use -Wall to show any warnings you may have otherwise missed:

$ gcc -g -Wall example.c -o example

The classic way to use gdb is as the shell for a running program compiled in C or C++, to allow you to inspect the program’s state as it proceeds towards its crash.

$ gdb example
...
Reading symbols from /home/tom/example...done.
(gdb)

At the (gdb) prompt, you can type run to start the program, and it may provide you with more detailed information about the causes of errors such as segmentation faults, including the source file and line number at which the problem occurred. If you’re able to compile the code with debugging symbols as above and inspect its running state like this, it makes figuring out the cause of a particular bug a lot easier.

(gdb) run
Starting program: /home/tom/gdb/example 

Program received signal SIGSEGV, Segmentation fault.
0x000000000040072e in main () at example.c:43
43     printf("%d\n", *segfault);

After an error terminates the program within the (gdb) shell, you can type backtrace to see the chain of calling functions, including the specific parameters passed, which may have something to do with what caused the crash.

(gdb) backtrace
#0  0x000000000040072e in main () at example.c:43
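
While stopped here, you can also inspect values directly with print; assuming segfault is a pointer variable in the example program, output along these lines would confirm that it was null when dereferenced:

(gdb) print segfault
$1 = (int *) 0x0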

You can set breakpoints for gdb using the break command to halt the program’s run if it reaches a matching line number or function call:

(gdb) break 42
Breakpoint 1 at 0x400722: file example.c, line 42.
(gdb) break malloc
Breakpoint 1 at 0x4004c0
(gdb) run
Starting program: /home/tom/gdb/example 

Breakpoint 1, 0x00007ffff7df2310 in malloc () from /lib64/ld-linux-x86-64.so.2
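
Breakpoints can also be made conditional, so that they only halt the program when some expression holds; here i stands in for whatever counter or flag is relevant in your own code:

(gdb) break 42 if i == 100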

Thereafter it’s helpful to step through successive lines of code using step; like any gdb command, you can repeat it just by pressing Enter, stepping through lines one at a time:

(gdb) step
Single stepping until exit from function _start,
which has no line number information.
0x00007ffff7a74db0 in __libc_start_main () from /lib/x86_64-linux-gnu/libc.so.6

You can even attach gdb to a process that is already running, by finding the process ID and passing it to gdb:

$ pgrep example
1524
$ gdb -p 1524

This can be useful for redirecting streams of output for a task that is taking an unexpectedly long time to run.
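
Once attached, you can inspect the process much as usual and then let it carry on its way, something like this (output elided):

(gdb) backtrace
...
(gdb) detach
(gdb) quit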

Debugging with valgrind

The much newer valgrind can be used as a debugging tool in a similar way. There are many different checks and debugging methods this program can run, but one of the most useful is its Memcheck tool, which can be used to detect common memory errors like buffer overflow:

$ valgrind --leak-check=yes ./example
==29557== Memcheck, a memory error detector
==29557== Copyright (C) 2002-2011, and GNU GPL'd, by Julian Seward et al.
==29557== Using Valgrind-3.7.0 and LibVEX; rerun with -h for copyright info
==29557== Command: ./example
==29557== 
==29557== Invalid read of size 1
==29557==    at 0x40072E: main (example.c:43)
==29557==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
==29557== 
...
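
If Memcheck reports use of uninitialised values, you can also ask it to track where they originated, at some cost in speed; a full leak check is often worth adding at the same time:

$ valgrind --leak-check=full --track-origins=yes ./example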

The gdb and valgrind tools can be used together for a very thorough survey of a program’s run. Zed Shaw’s Learn C the Hard Way includes a really good introduction for elementary use of valgrind with a deliberately broken program.

Tracing system and library calls with ltrace

The strace and ltrace tools are designed to allow watching system calls and library calls respectively for running programs, and logging them to the screen or, more usefully, to files.

You can have ltrace run the program you want to monitor for you, simply by providing the program as its sole parameter. It will then give you a listing of all the library calls the program makes until it exits.

$ ltrace ./example
__libc_start_main(0x4006ad, 1, 0x7fff9d7e5838, 0x400770, 0x400760 
srand(4, 0x7fff9d7e5838, 0x7fff9d7e5848, 0, 0x7ff3aebde320) = 0
malloc(24)                                                  = 0x01070010
rand(0, 0x1070020, 0, 0x1070000, 0x7ff3aebdee60)            = 0x754e7ddd
malloc(24)                                                  = 0x01070030
rand(0x7ff3aebdee60, 24, 0, 0x1070020, 0x7ff3aebdeec8)      = 0x11265233
malloc(24)                                                  = 0x01070050
rand(0x7ff3aebdee60, 24, 0, 0x1070040, 0x7ff3aebdeec8)      = 0x18799942
malloc(24)                                                  = 0x01070070
rand(0x7ff3aebdee60, 24, 0, 0x1070060, 0x7ff3aebdeec8)      = 0x214a541e
malloc(24)                                                  = 0x01070090
rand(0x7ff3aebdee60, 24, 0, 0x1070080, 0x7ff3aebdeec8)      = 0x1b6d90f3
malloc(24)                                                  = 0x010700b0
rand(0x7ff3aebdee60, 24, 0, 0x10700a0, 0x7ff3aebdeec8)      = 0x2e19c419
malloc(24)                                                  = 0x010700d0
rand(0x7ff3aebdee60, 24, 0, 0x10700c0, 0x7ff3aebdeec8)      = 0x35bc1a99
malloc(24)                                                  = 0x010700f0
rand(0x7ff3aebdee60, 24, 0, 0x10700e0, 0x7ff3aebdeec8)      = 0x53b8d61b
malloc(24)                                                  = 0x01070110
rand(0x7ff3aebdee60, 24, 0, 0x1070100, 0x7ff3aebdeec8)      = 0x18e0f924
malloc(24)                                                  = 0x01070130
rand(0x7ff3aebdee60, 24, 0, 0x1070120, 0x7ff3aebdeec8)      = 0x27a51979
--- SIGSEGV (Segmentation fault) ---
+++ killed by SIGSEGV +++

You can also attach it to a process that’s already running:

$ pgrep example
5138
$ ltrace -p 5138

This generally produces a good deal more than a couple of screenfuls of text, so it’s helpful to use the -o option to specify an output file to which to log the calls:

$ ltrace -o example.ltrace ./example

You can then view this trace in a text editor like Vim, which includes syntax highlighting for ltrace output:

Vim session with ltrace output

I’ve found ltrace very useful for debugging problems where I suspect improper linking may be at fault, or the absence of some needed resource in a chroot environment, since among its output it shows the program’s search for libraries at dynamic linking time, its opening of configuration files in /etc, and its use of devices like /dev/random or /dev/zero.
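
For the same sort of hunting, strace’s syscall filtering is handy too; for example, to watch every file-related call a program makes and pick out the ones touching /etc (the trace=file class selects system calls that take a filename, and strace writes to standard error, hence the redirection):

$ strace -f -e trace=file ./example 2>&1 | grep '/etc'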

Tracking open files with lsof

If you want to view what devices, files, or streams a running process has open, you can do that with lsof:

$ pgrep example
5051
$ lsof -p 5051

For example, the first few lines of output for the apache2 process running on my home server are:

# lsof -p 30779
COMMAND   PID USER   FD   TYPE DEVICE SIZE/OFF    NODE NAME
apache2 30779 root  cwd    DIR    8,1     4096       2 /
apache2 30779 root  rtd    DIR    8,1     4096       2 /
apache2 30779 root  txt    REG    8,1   485384  990111 /usr/lib/apache2/mpm-prefork/apache2
apache2 30779 root  DEL    REG    8,1          1087891 /lib/x86_64-linux-gnu/libgcc_s.so.1
apache2 30779 root  mem    REG    8,1    35216 1079715 /usr/lib/php5/20090626/pdo_mysql.so
...

Interestingly, another way to list the open files for a process is to check the corresponding entry for the process in the dynamic /proc directory:

# ls -l /proc/30779/fd

This can be very useful in confusing situations with file locks, or identifying whether a process is holding open files that it needn’t.
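
Going the other way, if you know the file but not the process, you can give lsof the filename instead and it will list the processes holding it open; the path here is just an example:

$ lsof /var/log/syslog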

Viewing memory allocation with pmap

As a final debugging tip, you can view the memory allocations for a particular process with pmap:

# pmap 30779 
30779:   /usr/sbin/apache2 -k start
00007fdb3883e000     84K r-x--  /lib/x86_64-linux-gnu/libgcc_s.so.1 (deleted)
00007fdb38853000   2048K -----  /lib/x86_64-linux-gnu/libgcc_s.so.1 (deleted)
00007fdb38a53000      4K rw---  /lib/x86_64-linux-gnu/libgcc_s.so.1 (deleted)
00007fdb38a54000      4K -----    [ anon ]
00007fdb38a55000   8192K rw---    [ anon ]
00007fdb392e5000     28K r-x--  /usr/lib/php5/20090626/pdo_mysql.so
00007fdb392ec000   2048K -----  /usr/lib/php5/20090626/pdo_mysql.so
00007fdb394ec000      4K r----  /usr/lib/php5/20090626/pdo_mysql.so
00007fdb394ed000      4K rw---  /usr/lib/php5/20090626/pdo_mysql.so
...
total           152520K

This will show you what libraries a running process is using, including those in shared memory. The total given at the bottom is a little misleading, as for loaded shared libraries the running process is not necessarily the only one using that memory; determining “actual” memory usage for a given process is a little more involved than it might seem once shared libraries enter the picture.
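
If you want a somewhat truer picture, pmap’s extended output adds resident and dirty figures for each mapping, and the /proc/<pid>/smaps file breaks things down in still more detail:

# pmap -x 30779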