
4. Shell basics

4.1 Command line editing and parameter substitution

Modern shells offer fairly advanced command-editing functions. The right and left arrows move the cursor within the command line you are editing; the up and down arrows browse your history of previous commands. The <tab> key performs the so-called expansion (completion) of the word you are currently editing, i.e. the shell looks for a file name that fits the (incomplete) one you are typing. This is better explained with some examples:

The !last and ~user notations

The !last notation can be used to repeat a command you already entered to the shell:

The ~user notation can be used to substitute the full path to user's home directory (see 4.2): you type pico ~someuser/somefile and the shell executes pico /u/grp1/someuser/somefile. Your own home directory is simply ~/. Please note that some programs (notably ssh) treat the tilde as a special character in some situations, so you might need to type it twice to make it actually appear on screen.
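A minimal sketch of tilde expansion, assuming a Bourne-compatible shell such as bash; the variable name my_home and the file name somefile are only illustrative.

```shell
# ~ alone expands to your own home directory (normally the value of $HOME)
my_home=~
echo "$my_home"

# ~/ followed by a path is expanded in the same way
echo ~/somefile

# quoting suppresses the expansion, so this prints a literal ~
quoted=$(echo '~/somefile')
echo "$quoted"
```

The shell performs the substitution before the command runs, so the command itself never sees the tilde unless it is quoted.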


Wildcards

The characters ? and * are special to the shell. ? means «any single character» and * means «any number of characters (including none)». Thus if you have file1.tex, file2.tex, file1.dvi, file2.dvi, typing ls file?.tex actually means ls file1.tex file2.tex, while ls file1* means ls file1.dvi file1.tex. Wildcard expansion is performed by the shell before any command is executed. If you need to pass a * or ? character to some command you need to escape it (prepend a backslash, so \* instead of *, \? instead of ?) or single-quote the word containing it ('word*' instead of word*, 'word?' instead of word?).
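The expansion described above can be tried safely in a scratch directory. This is only a sketch: the directory comes from mktemp, and the variable names are illustrative.

```shell
# work in a scratch directory so that no real files are touched
scratch=$(mktemp -d)
cd "$scratch" || exit 1
touch file1.tex file2.tex file1.dvi file2.dvi

# the shell replaces each pattern with the matching names before echo runs
tex_files=$(echo file?.tex)
file1_files=$(echo file1*)
echo "$tex_files"
echo "$file1_files"

# escaped or single-quoted wildcards reach the command unchanged
literal=$(echo file\?.tex 'file1*')
echo "$literal"

cd / && rm -r "$scratch"
```

Using echo instead of ls makes it easy to see exactly which words the shell hands to the command.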


4.2 Working with files and directories

Every file in a unix system can be referred to by means of an absolute or relative path. The absolute path of a file describes its position relative to the root of the whole filesystem; the relative path describes its position relative to the current directory: if the file is in the current directory it can be referred to by its name only.
This is not true for executable files: they are searched for only in certain directories (those listed in $PATH, see 4.6). If you need to execute a file in the current directory you will probably need to write something like ./file-name

The following picture shows (a little part of) a typical filesystem structure; directory names end with /, file names are bold.

[Figure: directory tree of a typical filesystem (img/directory)]

If our current directory were /u/g1/user2/f77/, we could refer to our fortran sources simply with alpha.f and beta.f. For any other file we would need to write an absolute or relative path, e.g. a relative path ../mail/read-mail or an absolute path /usr/bin/less. Note that / is the path separator, and .. is the parent directory (one level up). Absolute paths always start with the forward-slash /.
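The three ways of naming a file can be compared with a small sketch. It rebuilds a fragment of the tree under a scratch directory rather than under the real /u/g1/, and the file contents are made up for the example.

```shell
scratch=$(mktemp -d)
mkdir -p "$scratch/user2/f77" "$scratch/user2/mail"
echo 'program alpha' > "$scratch/user2/f77/alpha.f"
echo 'hello' > "$scratch/user2/mail/read-mail"

cd "$scratch/user2/f77" || exit 1

# a file in the current directory: the name alone is enough
by_name=$(cat alpha.f)
# relative path: .. goes one level up, then down into mail/
by_relative=$(cat ../mail/read-mail)
# absolute path: spelled out from the root /
by_absolute=$(cat "$scratch/user2/mail/read-mail")

echo "$by_name / $by_relative / $by_absolute"
cd / && rm -r "$scratch"
```

The relative and the absolute path reach the same file; only the starting point differs.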

file & directory commands quick reference
pwd                          show current directory name
cd                           change to your home directory
cd dir-name                  change to directory dir-name
cd ..                        change to parent directory
mkdir dir-name               create new directory dir-name
rmdir dir-name               remove (empty) directory dir-name
cp name1 name2               copy file name1 to name2
mv name1 name2               move (or rename) file name1 to name2
rm name                      delete file name
rm -R dir-name               recursively delete directory dir-name and its contents
ln name1 name2               create a (hard) link to file name1, named name2
ln -s name1 name2            create a symbolic link to file name1, named name2
grep string name             search string in file name
find path -name 'pattern'    find files whose name matches pattern, starting from path
diff name1 name2             find differences between files name1 and name2
du dir-name                  show space used by directory dir-name and its contents
df dir-name                  show free and used space in the filesystem containing dir-name
quota                        show your filesystem quota
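A short session exercising several of the commands above might look like the following sketch. Everything happens inside a scratch directory, and the names project, notes.txt and copy.txt are placeholders.

```shell
scratch=$(mktemp -d)
cd "$scratch" || exit 1

mkdir project                      # create a directory
echo 'some text' > notes.txt
cp notes.txt project/copy.txt      # copy
mv notes.txt project/notes.txt     # move
ls project

matches=$(grep -c text project/notes.txt)    # number of matching lines
if diff project/notes.txt project/copy.txt > /dev/null; then
    same=yes
else
    same=no
fi
echo "matching lines: $matches, identical: $same"

rm project/copy.txt                # delete a single file
rm -R project                      # delete the directory and what is left in it
cd / && rmdir "$scratch"           # rmdir succeeds because scratch is now empty
```

Note that rm and rm -R do not ask for confirmation, which is why the sketch confines itself to a throwaway directory.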

Organizing your directories

You should create a logical structure for your files and directories in order to manage them easily. There are several ways of organizing your directory structure:

«flat» structure:
all subdirectories are placed immediately under the top-level one; this is probably the simplest structure, but it works well only if the total number of subdirectories is small
«tall» structure:
a highly hierarchical structure, useful if you work on many large and complex projects
by project:
each project you work on gets its own directory under the top level one; each project directory contains all types of files (and subdirectories) related to the project
by tool:
a single directory under the top one for all files created/worked on with a specific tool or programming language; each directory may then contain project subdirectories
by type:
a single directory for each file type (executables, images, ...)

Mount, mountpoints

For those who come from the Windows world: a unix filesystem is quite different from what you are accustomed to. You do not access «devices» (A:, C:, ...), but a single filesystem structure. Every possible file and directory is found somewhere under the root directory /. This does not mean that there is a single storage device: instead, a device can be mounted to a directory, i.e. its contents can be made accessible as a sub-tree of the existing filesystem. Mounting and unmounting is usually reserved to the super-user, with the notable exception of removable devices. You can mount the floppy disk with mount /mnt/floppy; everything on the floppy will then be accessible from the directory /mnt/floppy. Always remember to unmount the floppy before ejecting it: this is done with umount /mnt/floppy (see footnote 4.1). This is needed because the unix filesystem is asynchronous, i.e. data are not immediately written to the physical device: if you ejected the floppy before unmounting it, you could lose data. The same mount/umount procedure is needed for any CD-ROM, which is usually accessed in the /mnt/cdrom directory. Most graphical environments provide you with floppy/CD-ROM icons that automatically perform the mount operation when clicked. Please note that a device cannot be unmounted if any process is currently accessing it, i.e.:

Links, symbolic links

In a unix filesystem files are not actually contained in directories, they are only linked from them. As a consequence the following statements hold:

Links to existing files are created with the ln command; you cannot create hard links to directories. The ln -s command is used to create symbolic links to files. A symbolic link is a sort of reminder for the filesystem: «when the user tries to access this file name, redirect them to this other location» (see footnote 4.3). Symbolic links can point to directories, and can cross filesystem boundaries; they are not checked when the file they point to is deleted, so you can end up with symbolic links that point to non-existent files.
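The difference between the two kinds of link shows up as soon as the original name is removed. This is a sketch in a scratch directory; the names hard and soft are arbitrary.

```shell
scratch=$(mktemp -d)
cd "$scratch" || exit 1
echo 'original content' > name1

ln name1 hard       # hard link: a second name for the same file
ln -s name1 soft    # symbolic link: a pointer to the name "name1"

rm name1            # remove the original name

# the data is still reachable through the hard link
via_hard=$(cat hard)

# the symbolic link still exists (-L) but its target is gone (-e fails)
if [ -L soft ]; then soft_exists=yes; else soft_exists=no; fi
if [ -e soft ]; then soft_state=valid; else soft_state=dangling; fi

echo "$via_hard / link present: $soft_exists / target: $soft_state"
cd / && rm -r "$scratch"
```

The file's data is released only when its last hard link is removed, whereas a symbolic link keeps pointing at a name whether or not that name still exists.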

M- commands

If you ever worked with MS-DOS, you may feel more comfortable using the tools provided by the mtools package. The most useful tools are mcd, mcopy, mdel, mdir, mformat, mmd, mmove, mrd, mren, mtype, which perform the same actions as the corresponding MS-DOS commands, and can be used when working on files on a FAT filesystem (e.g. most floppies). Syntax and behaviour are the same as MS-DOS commands, when applicable. This means that:

1. you can use the A: notation to refer to the floppy disk (if present on the system, a Zip drive is attached to Z:, and a Jaz drive to J:)
2. file names are displayed and referred to in the 8+3 format
3. file names are case insensitive
4. you do not need to mount and umount any media to access files on it
5. the path separator is the backslash \, not the forward slash /
but
6. you still need to use unix syntax when copying to or from a unix filesystem

Mcopy will also handle character encoding and carriage return conversion if instructed to do so (-a and -T switches). See man mtools for a complete list of available mtools.


4.3 Redirection and pipelines

Many unix commands read their input (if any) from the so-called standard input (stdin) and write their output to the so-called standard output (stdout), plus optional warnings to the so-called standard error (stderr). When launching them from an interactive shell, stdin is normally connected to the keyboard, while stdout and stderr are both connected to the terminal screen.

Sometimes you will find it useful to redirect standard input, output and/or error, i.e. make the running command read from or write to some file instead of keyboard and screen. The < operator is used to redirect stdin; the > operator is used to redirect stdout; the 2> operator is used to redirect stderr; the >& operator is used to redirect both stdout and stderr to the same place.

When redirecting output, you can append to an existing file instead of overwriting it; just substitute > with >>. Mind that the behaviour of the > and >> operators with existing (resp. non-existing) files is shell- and setup-specific; refer to shell manuals for details.
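The operators can be seen at work in the sketch below, which assumes a Bourne-compatible shell and uses made-up file names. Since >& is not understood by every sh, the last step uses the portable spelling > file 2>&1, which has the same effect.

```shell
scratch=$(mktemp -d)
cd "$scratch" || exit 1

echo 'first line' > out.txt       # > creates out.txt (or overwrites it)
echo 'second line' >> out.txt     # >> appends instead of overwriting

# < connects out.txt to the stdin of wc
line_count=$(wc -l < out.txt | tr -d ' ')

# 2> sends only the error message to err.txt
ls no-such-file 2> err.txt || true
if [ -s err.txt ]; then got_error=yes; else got_error=no; fi

# stdout goes to both.txt, then stderr is sent to the same place
{ echo 'to stdout'; echo 'to stderr' >&2; } > both.txt 2>&1
both_count=$(wc -l < both.txt | tr -d ' ')

echo "out.txt: $line_count lines, error captured: $got_error, both.txt: $both_count lines"
cd / && rm -r "$scratch"
```

The order matters in the last redirection: 2>&1 must come after > both.txt, otherwise stderr is attached to the old stdout.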

Recent versions of the bash shell can redirect output to a remote host via tcp or udp with the following notation:

(tcp)
somecommand >/dev/tcp/hostname/port
(udp)
somecommand >/dev/udp/hostname/port
(note that /dev/tcp/ and /dev/udp/ are not actual directories)

You can even redirect the output of one command to be the input of another command: this is done with the | (pipe) operator. It is quite common to have several commands connected by redirection operators in a pipeline:
cat mybadPS.ps | fixps | psbook -q | psnup -2 > mybook.ps
In this pipeline, fixps, psbook and psnup are called filters, i.e. data flow through them without hitting the disk.
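Since the PostScript tools above may not be installed everywhere, here is a comparable sketch built only from standard utilities. It finds the most frequent line in a small, made-up list.

```shell
# sort groups equal lines together, uniq -c counts each group,
# sort -rn puts the largest count first, head keeps only that line
result=$(printf 'pear\napple\npear\nfig\npear\napple\n' | sort | uniq -c | sort -rn | head -n 1)
echo "$result"
```

Each command reads the stdout of the previous one as its stdin, so no intermediate file is ever created.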

4.4 Searching for files

The find utility can search a directory (sub-)tree for files matching a pattern; find is invoked as

find directories options

where directories is a list of directories to search (often . only) and options is a list of options to select files (and possibly do something with them).

some find options
-name 'pattern'     find files with names matching the specified pattern
-iname 'pattern'    find files with names matching the specified pattern, case insensitive
-mtime -n           find files modified less than n days ago
-mtime +n           find files modified more than n days ago
-mmin -n            find files modified less than n minutes ago
-maxdepth n         do not descend more than n levels of subdirectories
-type f             find regular files only
-type d             find directories only

Example: find C source files in my home directory modified in the last 5 days
find ~ -mtime -5 -name '*.c'

There are many more options for find, but some of them are quite tricky, so please be sure to understand the full man page before experimenting with them.
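The options listed above can be tried on a small scratch tree, as in this sketch. The file names are invented, and -iname and -maxdepth are assumed to be available, as they are in GNU and BSD find.

```shell
scratch=$(mktemp -d)
mkdir -p "$scratch/src/deep"
touch "$scratch/main.c" "$scratch/README" "$scratch/src/util.c" "$scratch/src/deep/Old.C"

# recently modified files ending in .c (case sensitive: Old.C is skipped)
recent_c=$(find "$scratch" -mtime -5 -name '*.c' | wc -l | tr -d ' ')

# the same name test, limited to the top directory
top_only=$(find "$scratch" -maxdepth 1 -type f -name '*.c')

# case-insensitive match also catches Old.C
any_case=$(find "$scratch" -type f -iname '*.c' | wc -l | tr -d ' ')

# directories only: the scratch directory itself, src and src/deep
dirs=$(find "$scratch" -type d | wc -l | tr -d ' ')

echo "$recent_c recent .c files, $any_case ignoring case, $dirs directories"
echo "$top_only"
rm -r "$scratch"
```

Quoting the pattern is essential: an unquoted *.c would be expanded by the shell before find ever sees it.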

Please note that find /usr/bin -name some-program-name is not the right way to find the location of an executable: which some-program-name is shorter to type and faster (and will also search directories other than /usr/bin). If you are looking for a «system» file (i.e. a file located in /usr or /etc) use locate file-name; locate uses a database of files that is reindexed every night, so it is very fast, but its output does not reflect last minute changes.

4.5 Searching for strings

The grep utility can be used to search for strings or patterns; grep can work on regular files or as a filter (see 4.3):
grep somestring somefile will output every line of somefile containing somestring;
cat somefile | grep somestring will do exactly the same.

some grep options
-C n    show n lines of context, i.e. show n lines before and after the matched line
-c      suppress normal output, show only the number of matching lines
-i      perform case-insensitive match
-l      suppress normal output, show only the names of files with matching lines
-n      show line numbers
-r      descend recursively into subdirectories
-v      show non-matching lines
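A few of these options are shown below on a four-line sample file. This is only a sketch; the file name somefile and its contents are made up.

```shell
scratch=$(mktemp -d)
cd "$scratch" || exit 1
printf 'Alpha\nbeta\nalpha beta\ngamma\n' > somefile

with_i=$(grep -c -i alpha somefile)      # count lines matching in any case
numbered=$(grep -n gamma somefile)       # prefix the match with its line number
inverted=$(grep -c -v beta somefile)     # count lines NOT containing beta
names=$(grep -l gamma somefile)          # print only the file name
as_filter=$(cat somefile | grep beta | wc -l | tr -d ' ')   # grep used as a filter

echo "$with_i $numbered $inverted $names $as_filter"
cd / && rm -r "$scratch"
```

Options can be combined freely, e.g. -c together with -i or -v as above.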

Grep can search for simple strings or for more complex patterns, also known as regular expressions (regex). Regular expressions are strings where some characters have special meanings:

some grep regex components
.       any character
\?      multiplier: 0 or 1 time
*       multiplier: 0 or more times (no backslash; \* matches a literal *)
\+      multiplier: 1 or more times
\|      logical or
\(\)    grouping

grep regex examples
Password\|password    matches any line containing «password» or «Password»
Aa*rgh!\+             matches «Argh!», «Aaaargh!», «Aaargh!!!», ...
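The patterns can be checked against sample input, as in the sketch below. It assumes GNU grep, where \?, \+ and \| are extensions to the basic syntax and a bare * is the «0 or more» multiplier; the sample lines are invented.

```shell
# Password or password, but not PASSWORD
either=$(printf 'Password: x\nno password here\nPASSWORD\n' | grep -c 'Password\|password')

# A, any number of a, rgh, at least one !
shouts=$(printf 'Argh!\nAaaargh!\nAaargh!!!\nArgh\nrgh!\n' | grep -c 'Aa*rgh!\+')

# an optional u: matches color and colour, not colouur
optional=$(printf 'color\ncolour\ncolouur\n' | grep -c 'colou\?r')

echo "$either $shouts $optional"
```

Single quotes keep the shell from interpreting the backslashes and the * before grep receives the pattern.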

4.6 Shell scripting

Much more work can be done using shell scripts, i.e. (usually) short «programs» written in the language of the shell: one can use variables (maybe arrays too), conditionals, loops, pattern matching and substitution, simple arithmetic and more. Please refer to shell manuals for this.

Some example scripts

Use at your own risk: do not complain if any one of these scripts deletes all your files, locks your account or kills your cat.

Countdown

#!/bin/bash

if [ $# -eq 0 ]; then
    # no command-line arguments: use the default timeout
    timeout=10
else
    # the 1st command-line argument is the desired timeout
    timeout=$1
fi

printf "%3d" "$timeout"
while : ; do                      # infinite loop
    sleep 1
    timeout=$((timeout-1))
    printf "\b\b\b"               # move back over the previous number
    printf "%3d" "$timeout"
    if [ "$timeout" -le 0 ]; then
        printf "...expired !\n\a"
        exit 0                    # jump out
    fi
done

This is not actually very useful, but shows plenty of bash features: command-line arguments handling ($1, $#), conditionals (if ...else ...fi), loop (while ...do ...done).

Open ssh session in a new xterm

#!/bin/sh

xterm -T "$1" -e ssh -t "$@" &

Much shorter, a bit more useful.

Mass conversion

#!/bin/sh

for im in *.gif ; do
    name=`basename "$im" .gif`
    convert "$im" "${name}.png"
done

Convert all GIFs (in the current directory) to PNGs, using the ImageMagick convert tool.

Environment and shell setup

Various scripts are automatically executed by the shell at startup and shutdown. The following table shows which files are executed in various situations by bash and tcsh. Files in /etc/ are owned by the super-user and are read-only, as far as you are concerned. Files in your home directory (~/) can be customized; they are all «hidden» files (name begins with a dot), so they do not show up with ls; use ls -a to list all files, including hidden ones, or ls -d .* to list hidden files only.


                         bash               tcsh
at login                 /etc/profile       /etc/csh.cshrc
                         ~/.profile         /etc/csh.login
                         ~/.bash_profile    ~/.tcshrc or ~/.cshrc
                         ~/.bash_login      ~/.login
                                            ~/.cshdirs
at startup of a          ~/.bashrc          /etc/csh.cshrc
non-login interactive                       ~/.tcshrc or ~/.cshrc
shell
at logout                ~/.bash_logout     /etc/csh.logout
                                            ~/.logout


There is no conceptual limit to what you can do from these scripts. However, they usually set up some environment variables, i.e. little pieces of information that can be read by programs started afterwards. To create an environment variable you use the export VARIABLE=value syntax when working with bash and setenv VARIABLE value when working with tcsh. To read the value of a variable use $VARIABLE, e.g. echo $HOME.
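The effect of export can be observed by starting a child shell, as in this sketch for bash or another Bourne-compatible shell. LOCAL_ONLY and GREETING are hypothetical names chosen for the example.

```shell
# a plain shell variable is not passed on to child processes...
LOCAL_ONLY=hidden
seen_local=$(sh -c 'echo "${LOCAL_ONLY:-unset}"')

# ...while an exported one becomes part of the environment they inherit
export GREETING=hello
seen_exported=$(sh -c 'echo "$GREETING"')

echo "child sees LOCAL_ONLY as: $seen_local"
echo "child sees GREETING as: $seen_exported"
```

The single quotes matter: they make the child shell, not the parent, expand the variables.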

common environment variables
$PATH       colon-delimited list of directories where to look for executables
$HOME       your home directory
$PWD        shell current directory
$MAIL       your INBOX folder
$LANG       this is related to character encoding - RedHat 8.0 users should probably set LANG=C
$DISPLAY    where to display graphics windows - if you logged in via ssh this has already been set for you
$EDITOR     which program to start when an editor is needed by some other command
$?          status of the most recently executed foreground command (0 means success)
$$          process ID of the running shell
$!          process ID of the most recently executed background command
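The last three entries are set by the shell itself rather than by startup scripts. A minimal sketch of how they behave, with illustrative variable names:

```shell
# $? holds the exit status of the command that just finished
true
status_ok=$?
false || status_fail=$?

# $! is the process ID of the job just started in the background
sleep 0 &
bg_pid=$!
wait "$bg_pid"

# $$ is the process ID of the shell running these lines
shell_pid=$$

# $PATH is a colon-delimited list; this keeps only its first directory
first_dir=${PATH%%:*}

echo "status: $status_ok and $status_fail, shell $shell_pid, background job $bg_pid"
echo "first directory searched for executables: $first_dir"
```

Because $? is overwritten by every command, it has to be saved immediately if it is needed later.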



Footnotes

4.1 Please note that the command is umount, not unmount.
4.2 As a consequence, for a device with open but unlinked files, (available space) ≤ (total space) - (space occupied by shown files). Some programs actually create large temporary files in /tmp and immediately unlink them, so you can have little space left on /tmp without any files in it!
4.3 Unix symbolic links are quite different from Windows .lnk files, since they actually contain nothing but the name of the file they point to (in fact they occupy practically no space on the device) and no special software is required to perform the redirection (instead, special care is required to read the link itself).
Piero Calucci 2004-11-05