A Shell Scripting Example
Hopefully, now you are getting the hang of Linux Shell Scripting! In this section, we'll develop a shell script from scratch to give you some idea of what sort of thing you can do and how a typical script develops.
Note: Seasoned scripting hands may tell you that you can achieve the same results as this script (or better) from the command line, but that's not the point here: we are simply taking a real-world requirement as an example and refining the script step by step, to show you the process.
Let us say that our PC is running slowly and we want to find out what is causing it. We can't just sit around watching the System Monitor all day, so we decide to write a script which we can schedule via cron to keep tabs on what's running on the system. We have done our research (by typing "linux top cpu batch" into a search engine) and found that we can use the top command to do this. So we start with a basic script that looks as follows:
# Script by Fred Bloggs to find the top CPU hog
top -n 1 -b >> tmpFile
All this script does is to append the output of the top command to "tmpFile". This would certainly show you what was running, but it may be a little hard to see the wood for the trees in all that output! So, as a next step, let's narrow down the number of lines we append to the file. The code we end up with might look as follows:
# Script by Fred Bloggs to find the top CPU hog
top -n 1 -b | head >> tmpFile
Now, this certainly cuts down the number of lines written to the file, but it's a bit crude: head keeps only the first ten lines, so we end up with the headers from the top command and only three of the top five processes:
$ more tmpFile
top - 15:50:00 up 6:14, 3 users, load average: 0.27, 0.20, 0.18
Tasks: 222 total, 1 running, 221 sleeping, 0 stopped, 0 zombie
Cpu(s): 2.7%us, 1.0%sy, 0.1%ni, 95.8%id, 0.3%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 1795140k total, 1743708k used, 51432k free, 88316k buffers
Swap: 1999868k total, 436k used, 1999432k free, 435792k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1226 root 20 0 283m 131m 34m S 6 7.5 23:55.29 Xorg
1948 fredb 20 0 349m 18m 13m S 4 1.0 0:06.39 clock-applet
3885 fredb 20 0 879m 77m 20m S 4 4.4 0:22.53 chromium-browse
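The arithmetic behind the shortfall can be checked with a small sketch. It assumes top's preamble is the five summary lines, one blank line (not visible in the listing above) and the column-header row; the helper name header_lines is hypothetical and only serves this illustration.

```shell
# Hypothetical helper: print the line number of top's column-header row,
# i.e. the number of preamble lines that precede the process list.
header_lines() {
    awk '$1 == "PID" { print NR; exit }'
}

# Sample preamble copied from the listing above, with the blank line restored.
sample='top - 15:50:00 up 6:14, 3 users, load average: 0.27, 0.20, 0.18
Tasks: 222 total, 1 running, 221 sleeping, 0 stopped, 0 zombie
Cpu(s): 2.7%us, 1.0%sy, 0.1%ni, 95.8%id, 0.3%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 1795140k total, 1743708k used, 51432k free, 88316k buffers
Swap: 1999868k total, 436k used, 1999432k free, 435792k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND'

preamble=$(printf '%s\n' "$sample" | header_lines)
wanted=5   # number of process rows we would like to keep

echo "preamble lines: $preamble"
echo "lines to keep:  $((preamble + wanted))"
```

With seven preamble lines, the default ten lines from head leave room for only three processes; keeping five processes needs twelve lines in total.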
Let's revisit the script and change it to read through the output line by line, skipping the header and showing the data for the top five processes only:
# Script by Fred Bloggs to find the top CPU hog
top -n 1 -b | head --lines=12 > tmpFile
while read topLine
do
    firstArg=`echo "$topLine" | awk '{ print $1 }'`
    echo "$firstArg" | grep -q '[A-Za-z]'
    if [ "$?" -eq 1 ] ; then
        echo $topLine >> outFile
    fi
done < tmpFile
OK, this is getting a bit more complex. The slight complication here is that top prints several header lines ahead of the process list, so we've got to skip them. The code above puts the first field of each line into the variable firstArg. The next line checks whether it contains any upper- or lower-case letters, and the following if writes the line to "outFile" only if none are found. This removes all lines but those with a numeric PID as the first field, and if we print out "outFile" it should now look something like this:
$ more outFile
7197 fredb 20 0 1008m 86m 38m S 8 4.9 6:59.04 rhythmbox
1226 root 20 0 282m 125m 29m S 6 7.2 29:43.11 Xorg
1757 fredb 9 -11 353m 10m 8504 S 4 0.6 3:45.38 pulseaudio
3921 fredb 20 0 312m 16m 10m S 2 1.0 0:50.21 gnome-terminal
26715 fredb 20 0 19272 1340 920 R 2 0.1 0:00.02 top
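For comparison, the same filtering idea can be written as a single awk test on the first field. This is only a sketch of an alternative, not a replacement for the loop being developed here; the helper name pid_rows and the sample lines are made up for the illustration. Because it insists on digits only, it also drops empty lines.

```shell
# Hypothetical helper: keep only lines whose first field is entirely digits,
# which is how a process row is told apart from top's header lines here.
pid_rows() {
    awk '$1 ~ /^[0-9]+$/'
}

# Sample input: two header lines, a blank line, and two process rows.
printf '%s\n' \
    'Swap: 1999868k total, 436k used, 1999432k free, 435792k cached' \
    '' \
    'PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND' \
    '7197 fredb 20 0 1008m 86m 38m S 8 4.9 6:59.04 rhythmbox' \
    '1226 root 20 0 282m 125m 29m S 6 7.2 29:43.11 Xorg' |
    pid_rows
```

Only the rhythmbox and Xorg rows are printed. On a live system the helper could be fed from top directly, for example top -n 1 -b | pid_rows | head -n 5.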
So, we've got the data we want, but it's still not that readable to the lay person, so let's alter the code again, to format the output nicely:
# Script by Fred Bloggs to find the top CPU hog
top -n 1 -b | head --lines=12 > tmpFile
while read topLine
do
    firstArg=`echo "$topLine" | awk '{ print $1 }'`
    echo "$firstArg" | grep -q '[A-Za-z]'
    if [ "$?" -eq 1 ] ; then
        pid="$firstArg"
        user=`echo "$topLine" | awk '{ print $2 }'`
        pri=`echo "$topLine" | awk '{ print $3 }'`
        cpu=`echo "$topLine" | awk '{ print $9 }'`
        mem=`echo "$topLine" | awk '{ print $10 }'`
        name=`echo "$topLine" | awk '{ print $12 }'`
        echo " $name CPU%=$cpu Mem%=$mem ($user Pri=$pri)" >> outFile
    fi
done < tmpFile
Using the new script we get the following output:
$ more outFile
CPU%= Mem%= ( Pri=)
rhythmbox CPU%=6 Mem%=5.0 (fredb Pri=20)
Xorg CPU%=4 Mem%=7.2 (root Pri=20)
top CPU%=4 Mem%=0.1 (fredb Pri=20)
pulseaudio CPU%=2 Mem%=0.6 (fredb Pri=9)
init CPU%=0 Mem%=0.1 (root Pri=20)
This exposes another problem: a blank line has slipped through. We can remove that quickly, by changing one line:
if [ "$?" -eq 1 -a ! -z "$firstArg" ] ; then
This just checks that firstArg is non-blank. All should now look good, except that when the script runs multiple times, we can't see which output is from which run. We need to output the time and date ahead of each run's output:
# Script by Fred Bloggs to find the top CPU hog
top -n 1 -b | head --lines=12 > tmpFile
echo `date` >> outFile
while read topLine
do
    firstArg=`echo "$topLine" | awk '{ print $1 }'`
    echo "$firstArg" | grep -q '[A-Za-z]'
    if [ "$?" -eq 1 -a ! -z "$firstArg" ] ; then
        pid="$firstArg"
        user=`echo "$topLine" | awk '{ print $2 }'`
        pri=`echo "$topLine" | awk '{ print $3 }'`
        cpu=`echo "$topLine" | awk '{ print $9 }'`
        mem=`echo "$topLine" | awk '{ print $10 }'`
        name=`echo "$topLine" | awk '{ print $12 }'`
        echo " $name CPU%=$cpu Mem%=$mem ($user Pri=$pri)" >> outFile
    fi
done < tmpFile
echo "" >> outFile
The final output now looks how we would like it:
Tue Jan 25 18:41:46 GMT 2011
rhythmbox CPU%=6 Mem%=5.0 (fredb Pri=20)
Xorg CPU%=4 Mem%=7.2 (root Pri=20)
pulseaudio CPU%=4 Mem%=0.6 (fredb Pri=9)
top CPU%=2 Mem%=0.1 (fredb Pri=20)
init CPU%=0 Mem%=0.1 (root Pri=20)
Tue Jan 25 18:44:02 GMT 2011
rhythmbox CPU%=8 Mem%=5.0 (fredb Pri=20)
Xorg CPU%=2 Mem%=7.2 (root Pri=20)
pulseaudio CPU%=2 Mem%=0.6 (fredb Pri=9)
compiz CPU%=2 Mem%=1.8 (fredb Pri=20)
top CPU%=2 Mem%=0.1 (fredb Pri=20)
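As an aside, the field-by-field awk calls and the two-part test can be condensed by letting read split each line into named variables. The following is a minimal sketch under the assumption that top's columns arrive in the order shown in the header row earlier (PID, USER, PR, NI, VIRT, RES, SHR, S, %CPU, %MEM, TIME+, COMMAND); the function name format_rows and the variable names are hypothetical.

```shell
# Hypothetical helper: read top's rows from stdin and print each process row
# in the same summary format the script uses.
format_rows() {
    # read splits each line on whitespace; "rest" soaks up any extra words.
    while read -r pid user pri ni virt res shr state cpu mem cputime name rest
    do
        case "$pid" in
            ''|*[!0-9]*) continue ;;   # empty or non-numeric: not a process row
        esac
        echo " $name CPU%=$cpu Mem%=$mem ($user Pri=$pri)"
    done
}

# Sample input: the column header, a blank line, and one process row.
printf '%s\n' \
    'PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND' \
    '' \
    '7197 fredb 20 0 1008m 86m 38m S 8 4.9 6:59.04 rhythmbox' |
    format_rows
```

The case pattern rejects both the blank line and any line whose first field contains a non-digit, so it covers the two conditions that the if statement checks separately.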
However, it would be useful to be able to change the file that the information is written to, so let's pass that in as a command-line argument, such as:
$ ./processMon.sh /tmp/myProcesses.log
Here's the new code:
# Script by Fred Bloggs to find the top CPU hog
# Parm $1 = filename to use (default is /tmp/outFile)
if [ ! -z "$1" ] ; then
    outFile="$1"
else
    outFile="/tmp/outFile"
fi
top -n 1 -b | head --lines=12 > tmpFile
echo `date` >> "$outFile"
while read topLine
do
    firstArg=`echo "$topLine" | awk '{ print $1 }'`
    echo "$firstArg" | grep -q '[A-Za-z]'
    if [ "$?" -eq 1 -a ! -z "$firstArg" ] ; then
        pid="$firstArg"
        user=`echo "$topLine" | awk '{ print $2 }'`
        pri=`echo "$topLine" | awk '{ print $3 }'`
        cpu=`echo "$topLine" | awk '{ print $9 }'`
        mem=`echo "$topLine" | awk '{ print $10 }'`
        name=`echo "$topLine" | awk '{ print $12 }'`
        echo " $name CPU%=$cpu Mem%=$mem ($user Pri=$pri)" >> "$outFile"
    fi
done < tmpFile
echo "" >> "$outFile"
The script can now be scheduled to run periodically throughout the day using cron, to build up a picture of which processes are often among the top five resource hogs. For example, here's the line that should be added to the crontab to run the script every 5 minutes, writing the output to /tmp/myProcesses.log:
0,5,10,15,20,25,30,35,40,45,50,55 * * * * /home/fredb/scripts/processMon.sh /tmp/myProcesses.log
Note: you need to fully qualify the path to the script and any filenames (i.e. use absolute, not relative, paths) when running under cron.
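The same advice applies to the scratch file, since the script writes "tmpFile" relative to whatever directory it is started from, and under cron that is typically not the scripts directory. One possible way to handle this, sketched below, is to let mktemp pick an absolute scratch path and to use the shell's default-value expansion for the output file. The helper name resolve_out_file is hypothetical, and the sketch assumes mktemp is installed, as it is on most Linux systems.

```shell
# Hypothetical helper: print the output file to use, falling back to the
# script's default of /tmp/outFile when no argument (or an empty one) is given.
resolve_out_file() {
    echo "${1:-/tmp/outFile}"
}

outFile=$(resolve_out_file "$@")

# Assumption: mktemp is available. It creates a uniquely named, empty file
# (normally under /tmp) and prints its path.
tmpFile=$(mktemp) || exit 1

# Remove the scratch file when the script exits.
trap 'rm -f "$tmpFile"' EXIT

echo "writing report to $outFile, using scratch file $tmpFile"
```

The ${1:-/tmp/outFile} expansion has the same effect as the if/else block in the script: it yields the first argument unless that is unset or empty, in which case the default is used.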