Practical Computing for Biologists: Chapters 2, 5, 6, 16, Appendices 2, 3.
Unix “Basics” and “Finding Things” from UConn CBC: http://bioinformatics.uconn.edu/unix-basics/
Software Carpentry Shell Novice lesson: Episodes 5-7: https://swcarpentry.github.io/shell-novice/
Review basic commands and server access from UConn_Unix_basics
pwd - print working directory
ls - list directory contents
mkdir - create a directory
unzip - decompression
mv - move file
cp - copy file
cat - print contents of file
touch - create empty file
rm - remove file
wc - count lines/words/characters in file
> - redirects output to a new file (overwrites the file if it already exists)
>> - redirects output and appends it to an existing file
* - wildcard that matches any string of characters in a file name
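As a quick refresher, the commands above can be tried together in a throwaway directory. This is only a minimal sketch: the directory and file names are made up for illustration, and echo is used just to generate some text to redirect.

```shell
# Work in a temporary directory so no real files are touched.
scratch=$(mktemp -d)
cd "$scratch"

mkdir demo                       # create a directory
cd demo
pwd                              # print the working directory
touch empty.txt                  # create an empty file
echo "first line" > notes.txt    # > creates (or overwrites) notes.txt
echo "second line" >> notes.txt  # >> appends to notes.txt
cat notes.txt                    # print the contents of the file
cp notes.txt copy.txt            # copy a file
mv copy.txt renamed.txt          # move (here: rename) a file
rm empty.txt                     # remove a file
ls                               # lists the two remaining files
wc -l notes.txt                  # count lines only (2 here)
```

Running the same echo line with > a second time would replace the contents of notes.txt, which is why >> is the safer choice when a file should grow.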
One can select multiple files using the * wildcard. Navigate to the ~/MEDS5420/lec02_files directory and type:
wc *.txt
Instead of seeing the three columns of numbers (lines, words, and characters), we can limit the wc command to show only the number of lines using the -l flag:
wc -l *.txt
One can also add some specificity to wildcards using brackets:
wc -l [Wt]*.txt # matches .txt files whose names start with a "W" or a "t"
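To see exactly which files a bracket pattern selects, it can help to pass the same pattern to ls first. The sketch below uses three made-up file names in a temporary directory; they are not the lecture files.

```shell
# Hypothetical files, created only for this illustration.
scratch=$(mktemp -d)
cd "$scratch"
printf 'a\nb\n' > Walden.txt
printf 'a\n' > treasure.txt
printf 'a\nb\nc\n' > other.txt

ls [Wt]*.txt      # matches Walden.txt and treasure.txt
ls [!Wt]*.txt     # ! negates the set, so only other.txt matches
wc -l [Wt]*.txt   # line counts for the two matching files, plus a total
```

The shell expands the pattern before the command runs, so ls and wc receive the same list of file names.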
Let’s find which file is shortest. First, save the wc output to disk with the > redirection operator; then verify that the contents of lengths.txt match what wc produces, using cat or less:
wc -l *.txt > lengths.txt
cat lengths.txt
less lengths.txt
To find the shortest file, we then sort the lengths numerically using the sort -n command. The shortest file is now on the first line, so we pick it out using head -n 1:
sort -n lengths.txt > sorted-lengths.txt
head -n 1 sorted-lengths.txt
Using intermediate files can be confusing, especially in more complex problems. We can avoid a lot of messy files and typing by using pipes (|), which send the output of one command directly to the next:
wc -l *.txt | sort -n | head -n 1
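The two approaches can be compared side by side. The following is a minimal sketch with three made-up files; the intermediate files are given an .out extension (an assumption of this example, not the lecture's naming) so that the *.txt wildcard does not pick them up on later runs.

```shell
# Hypothetical input files of 1, 3, and 5 lines.
scratch=$(mktemp -d)
cd "$scratch"
printf 'x\n' > short.txt
printf 'x\nx\nx\n' > medium.txt
printf 'x\nx\nx\nx\nx\n' > long.txt

# Version 1: intermediate files.
wc -l *.txt > lengths.out
sort -n lengths.out > sorted-lengths.out
head -n 1 sorted-lengths.out

# Version 2: the same result with pipes and no extra files.
wc -l *.txt | sort -n | head -n 1
```

Both versions print the line for short.txt. Note that once lengths.txt and sorted-lengths.txt exist in the working directory, they also match *.txt and will be counted by wc the next time the command is run.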
A file called animals.txt contains the following data:
deer
rabbit
raccoon
rabbit
deer
fox
rabbit
bear
What text passes through each of the pipes and the final redirect in the pipeline below? Work through the input by hand before you run the command or break it into stages.
cat animals.txt | head -n 5 | tail -n 3 | sort > final.txt
Alter the commands so that the final output contains only the three rabbits.
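One way to check your reasoning about a pipeline is to run it one stage at a time and look at what comes out. The sketch below does this for a different, made-up file (colors.txt) so that it does not give away the exercise.

```shell
# Hypothetical data, one color per line.
scratch=$(mktemp -d)
cd "$scratch"
printf 'red\nblue\ngreen\nblue\nred\nyellow\n' > colors.txt

cat colors.txt | head -n 4                  # red, blue, green, blue
cat colors.txt | head -n 4 | tail -n 2      # green, blue
cat colors.txt | head -n 4 | tail -n 2 | sort > final.txt
cat final.txt                               # blue, green
```

Each line adds one command to the previous one, so any surprise can be traced to the stage where it first appears.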
|gzip|compression/decompression tool using Lempel-Ziv coding (LZ77)|
|tar|Bundling files in folders|
|grep|Global Regular Expression Print (useful flags:|
|find|Recursively list all files and directories and filter|
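As a first look at the searching commands in the table, here is a minimal sketch using find and grep on a small made-up directory tree. The paths and file contents are hypothetical, and the -c flag to grep is just one example option chosen for this sketch.

```shell
# Hypothetical directory tree for illustration.
scratch=$(mktemp -d)
cd "$scratch"
mkdir -p project/data
printf 'deer\nrabbit\nfox\nrabbit\n' > project/data/animals.txt
printf 'notes\n' > project/readme.md

find project                              # every file and directory under project
find project -name '*.txt'                # filter by name: project/data/animals.txt
find project -type d                      # directories only
grep rabbit project/data/animals.txt      # prints the two lines containing "rabbit"
grep -c rabbit project/data/animals.txt   # -c counts matching lines: 2
```

The pattern given to find -name is quoted so that the shell passes it to find unchanged instead of expanding it in the current directory.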
1. Variables (creating and printing to screen).
2. Basics of shell scripts.
Download data-shell.tar from GitHub and move it to your MEDS5420 folder. See the third code chunk of section 6 of Lecture 2 for how to accomplish this on Windows.
We already unzipped a file using unzip:
unzip -d Example_files Example_files.zip
Other types of archives you will encounter:
.tar # bundles multiple files or folders into a single archive
.gz # file compressed with gzip
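To get a feel for these two archive types, the sketch below bundles a small made-up folder with tar, compresses and decompresses the bundle with gzip and gunzip, and then unpacks it again. The folder and file names are assumptions for illustration and are unrelated to data-shell.tar.

```shell
# Hypothetical folder with two small files.
scratch=$(mktemp -d)
cd "$scratch"
mkdir bundle
echo "hello" > bundle/a.txt
echo "world" > bundle/b.txt

tar -cf bundle.tar bundle   # -c create an archive, -f name of the archive file
tar -tf bundle.tar          # -t list the contents without extracting
gzip bundle.tar             # replaces bundle.tar with bundle.tar.gz
gunzip bundle.tar.gz        # restores bundle.tar
rm -r bundle                # delete the original folder
tar -xf bundle.tar          # -x extract; recreates bundle/
cat bundle/a.txt            # hello
```

Note that gzip compresses a single file in place, which is why folders are usually bundled with tar first.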