Useless use of cat

If you’re reasonably confident with using pipes and redirection to build command lines in Bash and similar shells, at some point somebody is going to tell you off for a “useless use of cat”. This means that somewhere in your script or pipeline, you’ve used the cat command where it wasn’t needed.

Here’s a simple example of such a use of cat: I’m running an Apache log through a pipe to awk to print the first field of every line, which gives me the IP address behind each request to the server.

# cat /var/log/apache2/access.log | awk '{print $1}'

That works just fine, but I don’t actually need the cat instance there. Instead, I can supply a file argument directly to awk, and the output will be the same, with one fewer process spawned:

# awk '{print $1}' /var/log/apache2/access.log

Most of the standard UNIX command-line tools designed for filtering text actually work like this; like me, you probably already knew that, but fell into the habit of using cat anyway.
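As a quick check of that claim, here’s a minimal sketch that runs a few common filters both ways on a throwaway file. The file contents and variable names are made up for illustration.

```shell
# Throwaway sample file; the contents are invented for this sketch.
f=$(mktemp)
printf '%s\n' 'pear 3' 'apple 1' 'fig 2' > "$f"

# Each pair gives the same result; the second form skips the cat process.
grep_piped=$(cat "$f" | grep 'p')
grep_direct=$(grep 'p' "$f")

sort_piped=$(cat "$f" | sort)
sort_direct=$(sort "$f")

awk_piped=$(cat "$f" | awk '{print $2}')
awk_direct=$(awk '{print $2}' "$f")

rm -f "$f"
```

One difference to watch for: some tools label their output when given a filename. wc is the usual example, printing the filename after the count when it’s passed a file, but only the count when it reads standard input.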

For tools that read only from standard input and don’t accept a filename argument, such as mail, you can explicitly specify that a file’s contents should be used as the standard input stream. Consider this line, where I’m mailing a copy of my Apache configuration to myself:

# cat /etc/apache2/apache2.conf | mail tom@sanctum.geek.nz

That works fine, but we can compact it using < as a redirection symbol for the same result, and a command line a whole five characters shorter:

# mail tom@sanctum.geek.nz </etc/apache2/apache2.conf
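tr is another tool that only ever reads standard input, which makes it an easy way to try the same idea without sending any mail. A minimal sketch, using a made-up temporary file:

```shell
# Throwaway sample file; the contents are invented for this sketch.
f=$(mktemp)
printf '%s\n' 'useless use of cat' > "$f"

# tr takes no filename argument, so the file has to arrive on standard input.
piped=$(cat "$f" | tr 'a-z' 'A-Z')
redirected=$(tr 'a-z' 'A-Z' < "$f")

# The redirection may also come before the command name, which keeps the
# left-to-right reading order of the cat version.
leading=$(< "$f" tr 'a-z' 'A-Z')

printf '%s\n' "$redirected"   # prints USELESS USE OF CAT
rm -f "$f"
```

All three variables end up holding the same text; only the first form starts an extra process to get there.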

If you overuse cat, then these two methods will probably enable you to fix that with a little effort. Incidentally, Bash has a related syntax called process substitution, written <(command), which supplies a command’s output in place of a filename argument for tools that read from files. It isn’t part of POSIX sh, though ksh and zsh support it too. Here, for example, I’m using diff to compare the output of grep run with two different flags on the same file:

# diff -u <(grep -E '(a|b)' file.txt) <(grep -F '(a|b)' file.txt)
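To try that line without a file.txt of your own, here’s a minimal self-contained sketch. The sample contents and the variable names are invented for illustration, and I’ve dropped -u so that the output is short enough to read at a glance.

```shell
#!/usr/bin/env bash
# The <(...) syntax needs Bash, ksh or zsh; plain POSIX sh does not have it.

# Throwaway sample file; the contents are invented for this sketch.
f=$(mktemp)
printf '%s\n' 'a' 'b' '(a|b)' 'c' > "$f"

# -E reads the pattern as an extended regex, matching any line containing a or b.
# -F reads it as a fixed string, matching only the literal text (a|b).
# diff exits with status 1 when its inputs differ, hence the "|| true".
changes=$(diff <(grep -E '(a|b)' "$f") <(grep -F '(a|b)' "$f") || true)

# Lines starting with "<" appear only in the -E output.
printf '%s\n' "$changes"
rm -f "$f"
```

Behind the scenes the shell replaces each <(...) with a path that diff can open like any other file, typically something under /dev/fd on Linux, so diff never needs to know that it’s reading from a process.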