Bash – A few commands to use again and again
Introduction
These days I spend a lot of time in the bash shell. I use it for ad-hoc scripting or driving several Linux boxes. In my current project we set up a continuous delivery environment and migrate code onto it. I lift code from CVS to SVN, mavenize Ant builds and funnel artifacts into Nexus. One script I wrote determines if a jar that was checked into a CVS source tree exists in Nexus or not. This check can be done via the Nexus REST API. More on this script at the end of the blog. But first let’s have a look at a few bash commands that I use all the time in day-to-day bash usage, in no particular order.
- find
- for
- tr
- awk
- sed
- xargs
- grep
- sort
- Reverse search (CTRL-R)
- !!
find
Find searches for files recursively, starting in the current directory.
$ find . -name '*.jar'
This command lists all jars in the current directory, recursively. We use this command to figure out if a source tree has jars. If this is the case we add them to Nexus and to the pom as part of the migration from Ant to Maven.
$ find . -name '*.jar' -exec sha1sum {} \;
Find combined with -exec is very powerful. This command lists the jars and computes the SHA-1 checksum of each of them. The sha1sum command is put directly after the -exec flag. The {} is replaced with the jar that was found. The \; is an escaped semicolon that tells find where the command ends.
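Here is a small sketch of the same idea that you can run anywhere. It assumes a throwaway directory created with mktemp, and the file names are made up. The pattern is quoted so that find, rather than the shell, expands the wildcard, and `{} +` is used as a variant of `{} \;` that hands several files to one sha1sum call.

```shell
# Build a throwaway tree; directory and file names are made up for the demo.
demo=$(mktemp -d)
mkdir -p "$demo/lib/sub"
echo a > "$demo/lib/one.jar"
echo b > "$demo/lib/sub/two.jar"
echo c > "$demo/readme.txt"

# Quote the pattern so find, not the shell, expands the wildcard.
jar_list=$(find "$demo" -name '*.jar' | sort)
echo "$jar_list"

# With '+' instead of '\;' find passes many files to a single sha1sum call.
checksums=$(find "$demo" -name '*.jar' -exec sha1sum {} + | sort -k2)
echo "$checksums"

rm -rf "$demo"
```

For a handful of files the two forms behave the same. The `+` form mostly pays off on large trees, where starting one process per file gets slow.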
for
For loops are often the basis of my shell scripts. I start with a for loop that just echoes some values to the terminal so I can check if it works and then go from there.
$ for i in $(cat items.txt); do echo $i; done;
The list after in and the last command before done must each be terminated by a newline or a ';'. When the for loop is OK I add more commands between do and done. Note that I could have also used find -exec, but if I have a script that is more than a one-liner I prefer a for loop for readability.
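One thing to keep in mind with this pattern: `$(cat items.txt)` is split on any whitespace, not only on newlines. The sketch below compares it with a `while read` loop, using a made-up input file whose second line contains a space.

```shell
# Hypothetical input file: the second item contains a space.
items=$(mktemp)
printf '%s\n' 'alpha' 'beta gamma' > "$items"

# for + $(cat ...) splits on any whitespace, so this loop runs three times.
for_count=0
for i in $(cat "$items"); do
  for_count=$((for_count + 1))
done

# while + read handles one line per iteration, so this loop runs twice.
while_count=0
while IFS= read -r line; do
  while_count=$((while_count + 1))
done < "$items"

echo "for: $for_count, while: $while_count"
rm -f "$items"
```

For lists of simple names the for loop is fine. For file paths that may contain spaces the while loop is the safer choice.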
tr
Transliterate. You can use this to delete certain characters or to replace them one for one.
$ echo 'Com_Acme_Library' | tr '_A-Z' '.a-z'
Lowercases and replaces underscores with dots.
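A few more tr variations, as a quick sketch. The input strings are made up. The -d and -s flags are standard tr options for deleting characters and squeezing repeats.

```shell
# -d deletes the listed characters instead of translating them.
no_digits=$(echo 'lib123-core' | tr -d '0-9')
echo "$no_digits"

# -s squeezes runs of the same character into one.
squeezed=$(echo 'a   b    c' | tr -s ' ')
echo "$squeezed"

# The translation from the text, with plain ASCII quotes.
dotted=$(echo 'Com_Acme_Library' | tr '_A-Z' '.a-z')
echo "$dotted"
```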
awk
$ echo 'one two three' | awk '{ print $2, $3 }'
Prints the second and third columns of the input. Awk is of course a full-blown programming language, but I use snippets like this a lot for selecting columns from the output of another command.
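Column selection goes a bit further with three standard awk features: -F to set the field separator, $NF for the last field and an END block for totals. The sketch below uses made-up input, including a hypothetical group:artifact:version string.

```shell
# -F sets the field separator; here a colon-separated record (made-up coordinates).
artifact=$(echo 'com.acme:library:1.0' | awk -F: '{ print $2 }')
echo "$artifact"

# $NF is the last field, whatever the number of columns.
last=$(echo 'one two three' | awk '{ print $NF }')
echo "$last"

# Sum a numeric column across lines.
total=$(printf '%s\n' 'a 10' 'b 32' | awk '{ sum += $2 } END { print sum }')
echo "$total"
```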
sed
Stream EDitor. A complete tool on its own, yet I use it mostly for small substitutions.
$ echo 'foo bar baz' | sed -e 's/foo/quux/'
Replaces the first foo on each line with quux.
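Two details of the s command are worth a quick sketch: the g flag, and the fact that the delimiter does not have to be a slash. The strings and paths below are made up.

```shell
# Without a flag only the first match on each line is replaced.
first_only=$(echo 'foo bar foo' | sed -e 's/foo/quux/')
echo "$first_only"

# The g flag replaces every match on the line.
all=$(echo 'foo bar foo' | sed -e 's/foo/quux/g')
echo "$all"

# Any delimiter works after s, which helps when the pattern contains slashes.
path=$(echo '/opt/cvs/lib' | sed -e 's|/opt/cvs|/opt/svn|')
echo "$path"
```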
xargs
Runs a command for every item read from standard input.
$ cat jars.txt | xargs -n1 sha1sum
Runs sha1sum once for every entry in the file. Note that xargs splits its input on whitespace, so this assumes paths without spaces. This is another for loop or find -exec alternative. I use this when I have a long pipeline of commands in a one-liner and want to process every line in the end result.
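When file names may contain spaces, find and xargs can be paired through NUL-separated output. This is a sketch that assumes a find with -print0 and an xargs with -0, which GNU and BSD both provide. The directory and file names are made up.

```shell
# Throwaway directory; the file name with a space is deliberate.
demo=$(mktemp -d)
echo a > "$demo/plain.jar"
echo b > "$demo/with space.jar"

# -print0 and -0 use NUL as separator, so the space does not split the name.
sums=$(find "$demo" -name '*.jar' -print0 | xargs -0 -n1 sha1sum)
echo "$sums"

rm -rf "$demo"
```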
grep
Here are some grep features you might not know:
$ grep -A3 -B3 keyword data.txt
This lists every line in data.txt that matches keyword, together with the 3 lines after (-A3) and the 3 lines before (-B3) the match.
$ grep -v keyword data.txt
Inverse match. Prints every line that does not contain keyword.
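To see both features at work, here is a sketch on a made-up five-line file. It uses one line of context instead of three to keep the output short, and adds -c, which counts matching lines instead of printing them.

```shell
# Hypothetical five-line data file.
data=$(mktemp)
printf '%s\n' 'one' 'two' 'keyword' 'four' 'five' > "$data"

# One line of context on each side (-A1 -B1) prints: two, keyword, four.
context=$(grep -A1 -B1 keyword "$data")
echo "$context"

# -v selects the four lines that do not match; -c counts them instead of printing.
others=$(grep -v -c keyword "$data")
echo "$others"

rm -f "$data"
```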
sort
Sort is another command often used at the end of a pipeline. For numerical sorting use
$ sort -n
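The difference between the default and the numerical sort shows up as soon as the numbers have different lengths. A short sketch with made-up values follows. The last command also sorts on a specific column with -k.

```shell
# Plain sort compares text, so 10 comes before 9.
lexical_first=$(printf '%s\n' 10 9 100 | sort | head -n1)
echo "$lexical_first"

# -n compares numbers, so 9 comes first.
numeric_first=$(printf '%s\n' 10 9 100 | sort -n | head -n1)
echo "$numeric_first"

# -k2,2n sorts numerically on the second column only (made-up sizes).
smallest=$(printf '%s\n' 'b.jar 300' 'a.jar 20' 'c.jar 1000' | sort -k2,2n | head -n1)
echo "$smallest"
```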
Reverse search (CTRL-R)
This one isn’t a real command but it’s really useful. Instead of typing history and looking up a previous command, press CTRL-R, start typing and let bash complete the command from your history. Use escape to quit reverse search mode. When you press CTRL-R your prompt will look like this:
(reverse-i-search)`':
!!
Pronounced ‘bang-bang’. Repeats the previous command. Here is the cool thing:
$ !!:s/foo/bar
This repeats the previous command, but with the first occurrence of foo replaced by bar. Useful if you entered a long command with a typo. Instead of retyping the command and fixing the argument by hand, replace it this way.
Bash script – checking artifacts in Nexus
Below is the script I talked about. It loops over every jar and dll file below the current directory, calls Nexus via curl and, when Nexus knows the artifact, writes a pom dependency snippet. It also adds a status column at the end of the output, either an OK or a KO, which makes the output easy to grep for further processing.
#!/bin/bash
# Counters: jars/dlls found, and how many of them Nexus recognises.
ok=0
jars=0
for jar in $(find "$(pwd)" -name '*.jar' -o -name '*.dll' 2>/dev/null)
do
  ((jars+=1))
  output=$(basename "$jar")-pom.xml
  sha1=$(sha1sum "$jar" | awk '{print $1}')
  # Look the checksum up in the Nexus index.
  response=$(curl -s "http://oss.sonatype.org/service/local/data_index?sha1=$sha1")
  if [[ $response =~ groupId ]]; then
    ((ok+=1))
    echo "findjars $jar OK"
    echo "" >> "$output"
    echo "$response" | grep groupId -A3 -m1 >> "$output"
    echo "" >> "$output"
  else
    echo "findjars $jar KO"
  fi
done
# Numeric comparison; inside [[ ]] the > operator compares strings.
if (( jars > 0 )); then
  echo "findjars Found $ok/$jars jars/dlls. See -pom.xml file for XML snippet"
  exit 1
fi
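Because every result line ends in OK or KO, the report is easy to post-process. The sketch below works on sample lines in the script's output format. The paths are made up, and in practice the report would come from running the script itself.

```shell
# Sample lines in the script's output format; the paths are made up.
report=$(printf '%s\n' \
  'findjars /src/lib/a.jar OK' \
  'findjars /src/lib/b.jar KO' \
  'findjars /src/lib/c.dll KO')

# Count the artifacts that Nexus did not recognise.
missing_count=$(echo "$report" | grep -c ' KO$')
echo "$missing_count"

# List only their paths (second column).
missing_paths=$(echo "$report" | grep ' KO$' | awk '{ print $2 }')
echo "$missing_paths"
```

Anchoring the pattern with `$` keeps a path that happens to contain "KO" from being counted by mistake.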
Conclusions
It is amazing what you can do in terms of scripting when you combine just these commands via pipes and redirection! It’s like Pareto’s law applied to shell scripting: 20% of the features of bash and related tools provide 80% of the results. The basis of most scripts can be a for loop. Inside the for loop the resulting data can be transliterated, grepped, replaced by sed and finally run through another program via xargs.
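As a closing illustration, here is a sketch that strings these pieces together in the order just described. The jar names are made up, and echo stands in for whatever program you would really run at the end.

```shell
# Made-up list of jar names, one of them a test jar.
names=$(printf '%s\n' 'Acme_Core.jar' 'acme_test.jar' 'Acme_Util.jar')

result=""
for name in $names; do
  # Lowercase and dot-separate, drop test jars, strip the extension.
  cleaned=$(echo "$name" | tr '_A-Z' '.a-z' | grep -v 'test' | sed -e 's/\.jar$//')
  # grep -v leaves nothing for filtered names, so skip empty results.
  if [ -n "$cleaned" ]; then
    result="$result$cleaned "
  fi
done

# xargs turns the collected words into arguments of a final command.
final=$(echo "$result" | xargs -n1 echo | sort)
echo "$final"
```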
References
The Bash Cookbook is a great overview of how to solve common problems using bash. It also teaches good bash coding style.