Tutorial - Unix/Linux Primer
Taras V. Pogorelov and Mike Hallock
School of Chemical Sciences, University of Illinois
First published August 25, 2017
Updated/reviewed May 17, 2024
To download and save this tutorial to your computer, print this page ('Command + P' or 'Ctrl + P') and save the document as a PDF.
For the purposes of this document, we will use a dollar sign ($) to denote the prompt, and to signify that the line is a command that you can type in. Please note that you will not actually type the dollar sign ($). The command you will type will be denoted as code you type.
1 Filesystem Basics
In Unix, the first central concept is that of a filesystem. It is the hierarchical, tree-like structure that provides a unified namespace for everything on the system. The tree is made up of directories that in turn hold files and other directories. There is a special directory, called the root directory, that represents the very top of the filesystem, and all files and directories are descendants of it. It is written as the forward-slash character ( / ). A path is the sequence of directories one must travel through to reach a target file or directory. An absolute path is one that begins at the root, and therefore begins with a forward-slash. A path is constructed by joining the directory names traveled to reach a file together with forward-slashes. Consider the following absolute path:
/path/to/my/file.txt
The file file.txt is located in the directory my, which is located in the directory to, which is in the directory path, located at the root.
As you work, you will be constantly navigating inside the filesystem. At all times, you will have a current working directory that represents the directory you are “in” and that will be used as the starting point for a relative path. A relative path is one that does not start at the root of the filesystem, but implicitly begins in your working directory. There are also two specially named directories that you can use to construct paths: the single-dot and the double-dot. Single-dot (.) represents the current directory, and double-dot (..) represents the parent of the current directory. If our working directory is /path/to, the following are all equivalent relative paths:
my/file.txt
./my/file.txt
../to/my/file.txt
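The equivalence of these relative paths can be checked in a throwaway directory. The following is a minimal sketch, assuming a scratch location created with mktemp; the file contents and the variable names scratch and reached are made up for illustration.

```shell
# Build a scratch copy of the example tree under a temporary directory.
scratch=$(mktemp -d)
mkdir -p "$scratch/path/to/my"
echo "found me" > "$scratch/path/to/my/file.txt"

# Make path/to the working directory, as in the text.
cd "$scratch/path/to"

# Read the file through each relative path and count the successes.
reached=0
for p in my/file.txt ./my/file.txt ../to/my/file.txt; do
    if [ "$(cat "$p")" = "found me" ]; then
        echo "$p reaches the file"
        reached=$((reached + 1))
    fi
done
echo "$reached of 3 paths worked"
```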
When you log in to a computer, the program that is started for you is called the shell. It is the program you interact with that is interpreting all the commands you are typing in. We can use shell commands to find out where we are in the filesystem, to print out directory contents, and to move between directories. When the shell is ready to take input from you, it will have printed a prompt at the beginning of the line. The prompt may contain your username, the machine you are on, the directory you are in, or a whole assortment of other information. The default prompts will vary from system to system, and are configurable.
The first command to try is to find out where we are. Use pwd (print working directory) to display the absolute path for the current working directory:
$ pwd
If you’ve just logged in, then you are probably in your home directory, which is the space on the system that you can keep your files in. We can use the ls command to get a listing of files and directories, and the cd (change directory) command to move to a new working directory. However, we may not have anything interesting to look at quite yet, so let's make a new directory:
$ mkdir unix-tutorial
And now list our current directory:
$ ls
And go into our newly-created tutorial directory:
$ cd unix-tutorial
Note that I used a relative path to get to my new directory, where I could have also used an absolute path. If my home directory happens to be /home/mike, I could have also typed the absolute path:
$ cd /home/mike/unix-tutorial
Tip: cd with no arguments will always return you to your home directory. The tilde (~) is also a shortcut for your home directory, so
$ cd ~
will return you home as well.
As it is pretty empty in here, let's use the copy command to bring a file into this directory. We will grab a list of dictionary words to play with later:
$ cp /usr/share/dict/words .
The copy command (cp) takes two arguments: the source file first, and the destination second. Note we used an absolute path to the words file in the filesystem, and used the relative path (.) as the destination, to signify we want to copy to the directory we are currently working in. You can now run ls to verify the file is in your directory.
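Now make a new directory, copy the words file into it, and list both the new directory and the dictionary directory:
$ mkdir testing124
$ cp words testing124
$ ls testing124
$ ls /usr/share/dict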
Suppose the directory should have been named testing123. Use the move command (mv) to rename that.
$ mv testing124 testing123
Like cp, mv takes two arguments: the source name and the destination name. Do an ls to verify the directory name changed. Do ls on the renamed directory to verify that the contents of the directory are unmodified.
$ cd testing123
$ pwd
$ cp ../words words2
Here we used the double-dot (..) to refer to the parent directory to use it as the source of the file we want to copy. What is the parent directory? What directory is the parent of the parent (../../)? What do you think would have happened if we used (.) as the destination for that last copy command?
If we run the command:
$ mv [src] [dst]
- If [dst] doesn't exist, then [src] is renamed to [dst].
- If [dst] is a directory, then [src] is moved into the directory.
- If [src] and [dst] are files, then [src] replaces [dst], meaning [dst] is deleted and [src] is renamed to [dst].
- If [src] is a directory, and [dst] is a file, then this is an error! A directory cannot replace a file.
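The four cases above can be tried safely in a scratch directory. This is a small sketch rather than part of the tutorial's own exercise; all file and directory names (a.txt, box, old.txt, and so on) are hypothetical, and the exact error message printed in the last case varies between systems, so it is hidden here.

```shell
# Work in a throwaway directory so nothing important is touched.
scratch=$(mktemp -d)
cd "$scratch"

# Case 1: the destination does not exist, so the source is renamed.
echo "one" > a.txt
mv a.txt b.txt

# Case 2: the destination is a directory, so the source is moved into it.
mkdir box
mv b.txt box

# Case 3: both are files, so the source replaces the destination.
echo "old" > old.txt
echo "new" > new.txt
mv new.txt old.txt
cat old.txt

# Case 4: a directory cannot replace a file, so mv fails.
mkdir dir
echo "x" > plain.txt
if mv dir plain.txt 2>/dev/null; then
    case4="moved"
else
    case4="error"
fi
echo "case 4 result: $case4"
```

Note that in case 3 the old contents are gone without any warning, which is why it is worth checking with ls before moving a file onto an existing name.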
For cp, the rules are similar.
- If you copy a file to a new filename, you end up with two copies of the file.
- If you copy a file with a destination of a directory, you get a copy of the file in that directory.
- However, if the source is a directory, you’ll get a puzzling message that the directory you tried to copy has been omitted.
Copying directories requires an option to the cp command that asks it to recursively make a copy of the whole directory. We will learn about command options in the next section.
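As a preview, the recursive option on common systems is -R (most implementations also accept -r). The sketch below assumes a scratch directory and hypothetical names (src, copy1, copy2); it first shows the plain copy being refused and then the recursive copy succeeding.

```shell
# Work in a throwaway directory.
scratch=$(mktemp -d)
cd "$scratch"

# A small directory to copy.
mkdir src
echo "data" > src/file.txt

# Without the recursive option, cp declines to copy the directory.
if cp src copy1 2>/dev/null; then
    echo "plain cp copied the directory"
else
    echo "plain cp skipped the directory"
fi

# With -R, the directory and everything inside it are copied.
cp -R src copy2
ls copy2
```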
2 Command Options and Arguments
We have seen commands that require arguments (cp and mv), and those where the arguments are optional (cd and ls). Nearly all commands also support a wide array of options that modify their behavior in some way. Options come in two flavors: those that require an argument of their own, and those that do not (these are commonly referred to as flags). Options are typically written as a dash and a single letter, or two dashes and a word. For options that require an argument, the argument should directly follow the option. Let's look at some examples to make this a bit more clear.
Start with the -l (long listing) flag for ls:
$ cd
$ ls -l
$ ls -l unix-tutorial
Before, ls just gave us the list of file and directory names, but now we have seven columns of output:
- Permissions of the file
- Number of links
- Owner of the file
- Group
- File size
- Modification date of the file
- Filename
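Flags can be combined. For example, we can sort the long listing by modification time (-t), and in reverse order (-r) so that the newest files are on the bottom of the list. The following three commands are equivalent:
$ ls -l -r -t
$ ls -lrt
$ ls -ltr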
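One way to convince yourself that separate and grouped flags behave the same is to compare their output directly. This sketch assumes a scratch directory with three hypothetical files whose modification times are set explicitly with touch -t (a standard option taking a CCYYMMDDhhmm timestamp).

```shell
# Work in a throwaway directory.
scratch=$(mktemp -d)
cd "$scratch"

# Create three files with known, different modification times.
touch -t 202001010000 oldest
touch -t 202101010000 middle
touch -t 202201010000 newest

# Capture the output of the three spellings of the same command.
a=$(ls -l -r -t)
b=$(ls -lrt)
c=$(ls -ltr)

# Report whether all three listings are identical.
if [ "$a" = "$b" ] && [ "$b" = "$c" ]; then
    echo "all three forms give the same listing"
fi

# With -r and -t together, the newest file is listed last.
last=$(ls -rt | tail -n 1)
echo "last entry: $last"
```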
Be careful, though, when mixing options that require arguments with those that do not, as ordering and grouping now matter. Consider the tar command (short for Tape ARchive), which creates and extracts tarfiles, which are sort of the UNIX equivalent of ZIP files.
$ tar -z -x -f file.tar.gz -v
This runs tar with the options -z, -x, and -v, and the -f option specifies the file to operate on. When options are grouped behind a single dash, an option that takes an argument must be last in the group, although more options may follow separately.
- The -z flag indicates that the file is zipped (compressed)
- -v stands for verbose (a common option in Unix; it usually makes programs print more information)
- -x stands for eXtract (as opposed to -c for Create)
- -f gives the file to either extract or create
These are all functionally identical:
$ tar -f file.tar.gz -z -x -v
$ tar -zxvf file.tar.gz
$ tar -xvf file.tar.gz -z
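To try these without a real archive at hand, you can create a small one first with -c. The sketch below is an illustration under assumptions: the directory data, the file note.txt, and the output directories out1 to out3 are all hypothetical, and each extraction runs in a subshell so that the working directory is restored afterwards.

```shell
# Work in a throwaway directory.
scratch=$(mktemp -d)
cd "$scratch"

# Create a tiny compressed archive to experiment with (-c creates).
mkdir data
echo "sample" > data/note.txt
tar -czf file.tar.gz data

# Extract it three times, once per spelling, each into its own directory.
mkdir out1 out2 out3
(cd out1 && tar -f ../file.tar.gz -z -x -v)
(cd out2 && tar -zxvf ../file.tar.gz)
(cd out3 && tar -xvf ../file.tar.gz -z)

# Each directory should now hold the same extracted file.
ls out1/data out2/data out3/data
```

In each form, the file name comes immediately after -f, which is what makes the three spellings equivalent.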
3 Finding Help
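Most commands come with a manual page that documents their options and arguments. Use the man command to read one:
$ man ls
The manual page is displayed in a pager. Use the following keys to move around in it: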
- down-arrow, enter, or j to scroll down one line
- up-arrow or k to scroll up one line
- spacebar or f to go down a page
- b to go up a page
- g to go to the top of the document
- G to go to the bottom of the document
- q to quit and exit the pager
- / to search
- h to get help on all these keys and more
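A pager can also be run directly on a file. Let's view the words file with less:
$ cd
$ cd unix-tutorial
$ less words
To search for a word, type /chemistry and press enter. You can search for the next occurrence from your current position by just hitting / and pressing enter.
4 Searching and Manipulating Text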
A key tool for searching text is grep. grep searches for patterns of characters in a file. These patterns are called regular expressions, which are written in a special grammar for describing a text string to search for. Many Unix utilities use regular expressions, so having a basic understanding is important.
The grep command takes its first command-line argument as the pattern to search for, and any subsequent arguments as files to search for the pattern. A regular expression can be as simple as a word. Let's look through the words file again and pull out all the entries that contain the word “chemistry”:
$ grep chemistry words
Patterns can also use special characters to be more or less specific. Try each of the following:
$ grep chem words
$ grep ^chem words
$ grep chem$ words
$ grep ch.m$ words
$ grep 'physics\?$' words
$ grep 'lag\+e' words
$ grep 'x*hello' words
$ grep '^super.*man$' words
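The system dictionary differs from machine to machine, so the sketch below runs the same kinds of patterns against a short, made-up word list (the file sample.txt and the variable names are hypothetical). It uses grep -c, which prints the number of matching lines, and the comments state what each special character means. The \? and \+ forms are the backslash spellings used above; some grep implementations may treat them differently.

```shell
# Work in a throwaway directory with a small, hand-written word list.
scratch=$(mktemp -d)
cd "$scratch"
printf '%s\n' alchemy biochemistry charm chemical chemistry chum \
    flagged hello lager lagoon physic physics \
    superhuman superman supermarket superwoman village > sample.txt

# A plain string matches anywhere in the line.
n_chem=$(grep -c 'chem' sample.txt)

# ^ anchors the match to the start of the line.
n_start=$(grep -c '^chem' sample.txt)

# . matches any single character; $ anchors the match to the end of the line.
n_dot=$(grep -c 'ch.m$' sample.txt)

# \? makes the preceding character optional (zero or one s).
n_opt=$(grep -c 'physics\?$' sample.txt)

# \+ means one or more of the preceding character (one or more g).
n_plus=$(grep -c 'lag\+e' sample.txt)

# * means zero or more of the preceding character, so no x is needed at all.
n_star=$(grep -c 'x*hello' sample.txt)

# .* matches any run of characters between the two anchored ends.
n_super=$(grep -c '^super.*man$' sample.txt)

echo "$n_chem $n_start $n_dot $n_opt $n_plus $n_star $n_super"
```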
Several of these patterns are quoted because the shell treats some characters specially. For example, the asterisk (*) is also a wildcard character, where it can match multiple filenames. We also have to use backslashes before some of the special characters to get all flavors of grep to recognize them.
Another useful text tool is awk. It takes text one line at a time and breaks the text into individual words. There are many features to awk, but to get started we will use it to simply select which columns of text we want, and only print those.
A simple command to try this with is date. It tells you the current date and time. Let's use awk to print only the day, month, and year. When awk reads a line of text, it splits it into words, and saves each word into a variable, starting with $1 for the first word, $2 for the second, and so on. We can then use the print command to output these variables back to the screen. Type the following:
$ date | awk '{print $3, $2, $6}'
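Because the output of date changes every second and its layout can vary between systems, the field numbering is easier to see on a fixed line of text. The sketch below uses a made-up date string in the common six-word layout; the variable names line and picked are only for illustration.

```shell
# A fixed example line with six space-separated words.
line="Fri May 17 14:30:00 CDT 2024"

# awk numbers the words $1..$6; select the day, month, and year.
picked=$(echo "$line" | awk '{print $3, $2, $6}')
echo "$picked"    # prints: 17 May 2024
```

If date on your system prints its fields in a different order, the column numbers in the awk command need to change to match.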
5 Redirection and Piping
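A pipe (|) sends the output of one command to the input of the next:
$ grep super words | grep man
$ grep super words | grep man | sed -e 's/super/hulk/'
Did you notice that no file name was given to the second grep? Without a file name, it reads from standard input by default. We also used sed here to perform a substitution. It used the output of the second grep command as its input.
wc counts words and lines:
$ wc -l words
$ grep cat words | wc -l
Output can be written to a file with the greater-than (>) sign:
$ ls -l > file_list.txt
Input can be read from a file with the less-than (<) sign:
$ grep unix < file_list.txt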
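The pieces of this section can be combined on a small, predictable input. This sketch assumes a scratch directory and a hypothetical four-line file heroes.txt, so that the results do not depend on the contents of the system dictionary.

```shell
# Work in a throwaway directory with a tiny input file.
scratch=$(mktemp -d)
cd "$scratch"
printf '%s\n' superman supermarket superwoman batman > heroes.txt

# Pipe: the second grep has no file name, so it reads the first grep's output.
grep super heroes.txt | grep man

# sed rewrites each line that reaches it through the pipe.
renamed=$(grep super heroes.txt | grep 'man$' | sed -e 's/super/hulk/')
echo "$renamed"

# > writes standard output to a file; < feeds a file to standard input.
grep man heroes.txt > matches.txt
count=$(wc -l < matches.txt)
echo "lines containing man: $count"
```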
6 Practice
Download a structure file from the Protein Data Bank with the curl command:
$ curl https://files.rcsb.org/download/1qd5.pdb.gz > 1qd5.pdb.gz
The file is compressed, so uncompress it:
$ gunzip 1qd5.pdb.gz
gunzip will have stripped the .gz suffix off the filename. Use ls to verify that. Use less to take a quick look at the file and see its structure. Use grep and wc to determine how many atoms are in the structure.
Then pipe the atom lines through awk to print the residue number of each atom:
$ awk '{print $6}'
Use the uniq command to remove consecutive, repeated lines. What change to the awk command could you make in order to get the residue type (the three-letter code in column 4) printed along with the residue number? Finally, count the total number of residues, which tells us how long this protein is.
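If uniq is new to you, it helps to see what "consecutive" means on a toy input before applying it to the structure file. The numbers below are made up and are not taken from 1qd5.pdb; the variable names are only for illustration.

```shell
# Seven input lines; repeats are collapsed only when they are adjacent.
collapsed=$(printf '%s\n' 1 1 2 2 2 3 1 | uniq)
echo "$collapsed"

# Count the lines that remain after collapsing.
lines=$(printf '%s\n' 1 1 2 2 2 3 1 | uniq | wc -l)
echo "lines left: $lines"
```

The final 1 survives because it is not next to the earlier 1s, so four lines remain.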