Manipulating files and directories
Where am I?
Files and directories are identified by an absolute path, which shows how
to reach it from the root directory.
The root directory is identified by /
.
As an example, /home/benoist
is a directory located in the home
directory,
itself located in the root directory.
/home/benoist/list.txt
is a file located in /home/benoist
.
Importantly, one cannot know if a path refers to a file or a directory simply
based on its name.
As a convention, most files will have an extension, i.e. the part following
the .
character in a file name (e.g. .txt
is the extension of file foo.txt
).
In contrast to absolute path, one can refer to a file or directory thanks to its relative path, which is its path relatively to the current working directory.
As an example, if the current working directory is /home/benoist
,
then ./lists.txt
is the relative path to /home/benoist/lists.txt
.
The command pwd
prints the absolute path of the current working directory.
$ pwd
/data
Listing the content of a directory
To display the content of a directory, use the ls
command (which stands
for “listing”).
Use the ls
command to list files and directories present in your current
working directory.
$ ls
animals creatures iris.csv molecules tooth.csv
As you can see, based on this result, there is no way to distinguish files from
directories.
creatures
and molecules
, which have no extension, are probably directories
and iris.csv
, which extension is .csv
is probably a file.
But this is only a convention and we need a better way to make sure of it.
Identifying files and directories
The command ls -l
runs the command ls
with the option -l
(which stands
for “long”)
$ ls -l
total 20
drwxr-xr-x 2 fish fish 4096 Nov 23 15:11 animals
drwxr-xr-x 2 fish fish 4096 Nov 14 15:31 creatures
-rw-r--r-- 1 fish fish 3715 Nov 14 15:35 iris.csv
drwxr-xr-x 2 fish fish 4096 Nov 13 13:58 molecules
-rw-r--r-- 1 fish fish 1855 Nov 20 16:07 tooth.csv
The character d
starting each path line indicates that this path is
a directory.
Knowing files size
Question: use man ls
to find how to print human readable file sizes.
Solution:
The
-h
option allows to print human readable sizes.total 20K drwxr-xr-x 2 fish fish 4.0K Nov 23 15:11 animals drwxr-xr-x 2 fish fish 4.0K Nov 14 15:31 creatures -rw-r--r-- 1 fish fish 3.7K Nov 14 15:35 iris.csv drwxr-xr-x 2 fish fish 4.0K Nov 13 13:58 molecules -rw-r--r-- 1 fish fish 1.9K Nov 20 16:07 tooth.csv
Question: list the content of the directory named molecules
.
Solution:
$ ls molecules cubane.pdb ethane.pdb methane.pdb octane.pdb pentane.pdb propane.pdb
Question: list the content of the directory named molecules
with the
element size.
Solution:
$ ls -lh molecules total 24K -rw-r--r-- 1 fish fish 1.2K Nov 13 13:58 cubane.pdb -rw-r--r-- 1 fish fish 622 Nov 13 13:58 ethane.pdb -rw-r--r-- 1 fish fish 422 Nov 13 13:58 methane.pdb -rw-r--r-- 1 fish fish 1.8K Nov 13 13:58 octane.pdb -rw-r--r-- 1 fish fish 1.2K Nov 13 13:58 pentane.pdb -rw-r--r-- 1 fish fish 825 Nov 13 13:58 propane.pdb
Question: list the content of the root directory.
Solution:
$ ls / bin dev initrd.img lib64 mnt root srv tmp vmlinuz boot etc initrd.img.old lost+found opt run swapfile usr vmlinuz.old data home lib media proc sbin sys var
Question: list the content of directory home/fish
which is located
in the parent directory of /data
.
Solution:
$ ls ../home/fish/
..
is the shortcut for the parent directory of the current directory. In the same fashion,../..
would be the parent of the parent.In this example, there is no output to the
ls
command, which basically means that the directory is empty.
Knowing directories size
Importantly, the ls
command won’t display the actual disk usage for directories.
To convince us, let’s list the content of the molecules
directory:
$ ls -lh molecules
total 24K
-rw-r--r-- 1 fish fish 1.2K Nov 13 13:58 cubane.pdb
-rw-r--r-- 1 fish fish 622 Nov 13 13:58 ethane.pdb
-rw-r--r-- 1 fish fish 422 Nov 13 13:58 methane.pdb
-rw-r--r-- 1 fish fish 1.8K Nov 13 13:58 octane.pdb
-rw-r--r-- 1 fish fish 1.2K Nov 13 13:58 pentane.pdb
-rw-r--r-- 1 fish fish 825 Nov 13 13:58 propane.pdb
The first line of this output tells us that the total file size is 24K.
However, a moment ago, ls -l
displayed 4K for the molecules
directory size.
This is because the ls
command won’t recurse into subdirectories to get file
sizes and, more importantly, won’t sum file and subdirectories sizes to get
the actual disk usage for a directory.
The command du
(disk usage) will give us this information:
$ du
12 ./creatures
28 ./molecules
20 ./animals
72 .
As for ls
, the -h
option allows to display human readable sizes:
$ du -h
12K ./creatures
28K ./molecules
20K ./animals
72K .
Question: use the manual to find du
’s option that will limit recursion
to a single subdirectory, then get the disk usage for /usr
.
Solution:
$ du -h --max-depth=1 /usr 136M /usr/src 60K /usr/local 131M /usr/lib 157M /usr/share 200K /usr/include 41M /usr/bin 4.0K /usr/games 8.1M /usr/sbin 473M /usr
Displaying the tree structure
Being able to see a directory tree structure can be very useful.
ls -R
allow to recurse into subdirectories:
$ ls -R
.:
animals creatures iris.csv molecules tooth.csv
./animals:
animals2.txt animals3.txt animals4.txt animals.txt
./creatures:
basilisk.fasta unicorn.fasta
./molecules:
cubane.pdb ethane.pdb methane.pdb octane.pdb pentane.pdb propane.pdb
Unfortunately, the output may be quite unreadable.
For this reason, the tree
program may be useful:
$ tree
.
├── animals
│ ├── animals2.txt
│ ├── animals3.txt
│ ├── animals4.txt
│ └── animals.txt
├── creatures
│ ├── basilisk.fasta
│ └── unicorn.fasta
├── iris.csv
├── molecules
│ ├── cubane.pdb
│ ├── ethane.pdb
│ ├── methane.pdb
│ ├── octane.pdb
│ ├── pentane.pdb
│ └── propane.pdb
└── tooth.csv
3 directories, 14 files
Changing directory
The command to change the current working directory is cd
(change directory).
$ pwd
/data
$ cd creatures
$ pwd
/data/creatures
Question: now that you are in the creatures
directory, how to go back
to the parent directory?
Solution:
$ cd ..
Copying files and directories
The command cp
(copy) usage is cp [options] <SOURCE> <TARGET>
,
meaning that, appart from options, it takes two arguments, meaning the source
and the target:
$ cp iris.csv iris-copy.csv
$ ls
animals creatures iris-copy.csv iris.csv molecules tooth.csv
If the target name is a directory, the file will be copied to the directory with the same name:
$ cp iris.csv /home/fish
$ ls /home/fish/
iris.csv
Question: find in the manual how to copy a whole directory (recursive copy),
then copy creatures
to tmp
.
Solution:
$ cp -r creatures /tmp $ ls /tmp/creatures basilisk.fasta unicorn.fasta
Moving files and directories
In unix, “moving” a file and renaming a file is actually the same thing and
we use the command mv
(move):
# Rename iris-copy.csv to iris2.csv
$ mv iris-copy.csv iris2.csv
# Moves iris2.csv to the /tmp directory
$ mv iris2.csv /tmp
Creating a directory
To create a directory, use the command mkdir
(make directory):
$ mkdir tempdir
Delete files and directories
To remove a file, use the command rm
(remove):
$ rm /tmp/iris2.csv
To remove a directory, use can either use rm -r
(recursive) or the
rmdir
command which will only work if the target directory is empty:
Question: remove the directory tempdir
we just created.
Solution:
$ rmdir tempdir # Alternatively (rm -r tempdir)
Searching files and directories
The command find
, which quite intuitively searches and finds files and directories,
is a very powerful and this tutorial will not cover 10% of its capacities.
So this is a very short list of examples
By name
$ # find files and directories named "ethane.pdb"
$ find . -name ethane.pdb
./molecules/ethane.pdb
$ # find files and directories named "ethane.pdb" and located in /home/fish
$ find /home/fish -name ethane.pdb
$ # no result found
By extension
$ # find files and directories with extension ".pdb"
$ find . -name *.pdb
./molecules/octane.pdb
./molecules/propane.pdb
./molecules/cubane.pdb
./molecules/methane.pdb
./molecules/ethane.pdb
./molecules/pentane.pdb
By prefix
$ # find files and directories starting with "c"
$ find . -name 'c*'
./molecules/cubane.pdb
./creatures
By size
$ # find files and directories larger than 2K
$ find . -size '+2k'
.
./molecules
./iris.csv
./creatures
By modification date
$ # find files and directories modified in the last 60 days
$ find . -mtime -60
# [...]
$ # find files and directories modified in more than 60 days ago
$ find . -mtime +60
# [...]
Find only files/directories
$ # find files and directories starting with "c"
$ find . -name 'c*'
./molecules/cubane.pdb
./creatures
$ # find only files starting with "c"
$ find . -type f -name 'c*'
./molecules/cubane.pdb
$ # find only directories starting with "c"
$ find . -type d -name 'c*'
./creatures