Introduction

Purpose: Understand how shell constructs for looping can help improve the efficiency and correctness of scientific data processing.

Background: Individual commands are powerful, but the ability to repeat operations in a loop massively increases the efficiency and effectiveness of scientific computations.

Learning outcomes

Why loops

  • Automate repetitive tasks compared to the GUI
  • Increase efficiency
  • Fewer commands, less likelihood for errors

A basic loop

$ for filen in paleo*
    do echo $filen
 done

produces

paleo-mammals-v2.txt
paleo-mammals-v3.txt
paleo-mammals.txt

Running multiple commands in a loop

$ for file in paleo-mammals-v?*
do
       echo $file
       wc -l $file
       diff -q paleo-mammals.txt $file
done

produces

paleo-mammals-v2.txt
      23 paleo-mammals-v2.txt
Files paleo-mammals.txt and paleo-mammals-v2.txt differ
paleo-mammals-v3.txt
      23 paleo-mammals-v3.txt
Files paleo-mammals.txt and paleo-mammals-v3.txt differ

Renaming files

  • Imagine you have a lot of data files named similarly
  • To rename all of them, try a loop like:
$ for filename in plot-*.csv
do
   mv $filename orig-$filename
done
$ ls orig*

Bash scripts

  • Series of commands can be saved in a file and run at any time
  • Useful to write a program that automates a task
  • Start with #! syntax to indicate which program should be used for the script
    • e.g., #!/bin/bash to use the bash shell

A simple bash script

#!/bin/bash

# Example of a simple shell script
# Do not really use this for backup!
PREFIX="backup-"
FILES=$@
for file in $FILES
do
    cp $file $PREFIX$file
done

Save this in a file called ‘crazy-backup.sh’, and then make it executable, and run it:

$ chmod u+x crazy-backup.sh                                                            
$ ./crazy-backup.sh paleo*
$ ls
1-servers-net.Rmd           4-regex.Rmd                 paleo-mammals-v2.txt
1-servers-net.html          backup-paleo-mammals-v2.txt paleo-mammals-v3.txt
2-commandline-intro.Rmd     backup-paleo-mammals-v3.txt paleo-mammals.txt
2-commandline-intro.html    backup-paleo-mammals.txt    plotobs.csv
3-bash-loops.Rmd            crazy-backup.sh
3-bash-loops.html           images
$