
Hard links

$ touch file.txt
$ ln file.txt another-name-for-a-file.txt
$ echo Hello > file.txt
$ cat another-name-for-a-file.txt
Hello

So, these things are great if you want one file to appear in more than one place. I once thought that a hard link could let two processes running as different users access and modify one file, with each name keeping its own fairly strict access mode. I could not have been more wrong.

A directory contains a filename-to-inode mapping for each file, and a hard link is just another filename for the same inode. The directory entry does not hold the ownership information or the access mode:

$ ls -l file.txt another-name-for-a-file.txt
-rw-rw-r-- 2 rtg rtg 6 Feb 23 16:34 another-name-for-a-file.txt
-rw-rw-r-- 2 rtg rtg 6 Feb 23 16:34 file.txt

$ chmod 0600 another-name-for-a-file.txt
$ ls -l file.txt another-name-for-a-file.txt
-rw------- 2 rtg rtg 6 Feb 23 16:34 another-name-for-a-file.txt
-rw------- 2 rtg rtg 6 Feb 23 16:34 file.txt

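Yes, that information is stored in the inode itself, so it is shared between all the names of the file. The 2 in the second column of the listing is the link count, that is, the number of names pointing at that inode.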
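To make the shared inode visible, here is a minimal sketch that prints the inode number behind each name with ls -i and then compares the modes after a chmod through only one of the names. It assumes a scratch directory created with mktemp -d (not part of the session above) and a filesystem that supports hard links.

```shell
#!/bin/sh
# Sketch: two hard-linked names resolve to the same inode and share its mode.
# The scratch directory is an assumption made for this illustration.
set -eu
dir=$(mktemp -d)
trap 'rm -rf "$dir"' EXIT
cd "$dir"

echo Hello > file.txt
ln file.txt another-name-for-a-file.txt

# ls -i prints the inode number in front of each name.
ls -i file.txt another-name-for-a-file.txt

inode_a=$(ls -i file.txt | awk '{print $1}')
inode_b=$(ls -i another-name-for-a-file.txt | awk '{print $1}')

# Change the mode through one name only, then read it back through both.
chmod 0600 another-name-for-a-file.txt
mode_a=$(ls -l file.txt | awk '{print $1}')
mode_b=$(ls -l another-name-for-a-file.txt | awk '{print $1}')

echo "inodes: $inode_a $inode_b"
echo "modes:  $mode_a $mode_b"
```

Both inode numbers should match, and so should both mode strings, since there is only one inode to store them in.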

Hard links cannot cross filesystem boundaries because they reference an inode number rather than a path, and an inode number is only meaningful within a single filesystem.

You can’t create a hard link to a directory because it could introduce an infinite loop when traversing directories. You can still build such a loop with symbolic links, but the system utilities will cope with it, because a symbolic link is clearly distinguishable from the canonical name of the directory:

$ mkdir /tmp/a
$ ln -s /tmp/a /tmp/a/b
$ ls -lHR /tmp/a
/tmp/a:
total 0
lrwxrwxrwx 1 rtg rtg 6 Feb 23 16:47 b -> /tmp/a
$ find -L /tmp/a
/tmp/a
find: File system loop detected; `/tmp/a/b' is part of the same file system loop as `/tmp/a'.

At some point I realized that I had no idea why hard links would be useful. However, as a nice blog post by Paul Cobbaut and its comments suggest, hard links come into play when a file is renamed within a single file system: first a new link to the same inode is created, then the old one is removed.
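The two steps of such a rename can be spelled out by hand. This is only an illustration of the idea: the file names and the scratch directory are made up, and on current systems mv generally asks the kernel to do the rename in one atomic step rather than running ln and rm separately.

```shell
#!/bin/sh
# Sketch: "rename" a file as link-then-unlink and check the inode is unchanged.
# old-name.txt and new-name.txt are hypothetical names for this example.
set -eu
dir=$(mktemp -d)
trap 'rm -rf "$dir"' EXIT
cd "$dir"

echo Hello > old-name.txt
inode_before=$(ls -i old-name.txt | awk '{print $1}')

# Step 1: add a second name for the same inode.
ln old-name.txt new-name.txt
# Step 2: remove the original name; the data stays because one link remains.
rm old-name.txt

inode_after=$(ls -i new-name.txt | awk '{print $1}')
content=$(cat new-name.txt)

echo "inode before: $inode_before, after: $inode_after"
echo "content: $content"
```

The inode number stays the same before and after, which is why a rename within one file system does not need to copy any data.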

Good to know


sil

2013-02-23 at 17:21

They’re useful in the case of mostly-duplicated data, too. Backups are a good example of this. If you’ve got a folder things/ with thing1, thing2, thing3 in it, and you create a backup of it on your backup server, you’ll have:

backupserver:
/backup-2013-02-23
    /things
        thing1
        thing2
        thing3

then you delete thing2 from your disk, and back up again. In theory, you’d get a new folder on the backup server containing thing1 and thing3, so the backup server would have two copies of thing1 and thing3, which wastes space because they haven’t changed. However, in practice, you get this:

backupserver:
/backup-2013-02-23
    /things
        thing1
        thing2
        thing3
/backup-2013-02-24
    /things
        thing1 (hardlink to thing1 in previous folder)
        thing3 (hardlink to thing3 in previous folder)

so there’s only actually one copy of thing1, even though it’s in both backup folders. This means that every backup folder looks like a full backup, but only actually takes the space of an incremental backup.
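A rough sketch of how such a snapshot could be built by hand follows. It is not taken from any particular backup tool: the loop, the cmp -s comparison and the scratch directory are assumptions for illustration, and everything lives on one filesystem so that ln works. Real tools offer this as a feature, for example rsync with its --link-dest option.

```shell
#!/bin/sh
# Sketch: second backup that hard-links unchanged files to the previous one.
# Directory names mirror the example above; the logic itself is hypothetical.
set -eu
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work"

# Source data and a first, full backup made of plain copies.
mkdir -p things backup-2013-02-23/things
for f in thing1 thing2 thing3; do
    echo "$f" > "things/$f"
done
cp things/* backup-2013-02-23/things/

# thing2 is deleted before the second backup.
rm things/thing2

prev=backup-2013-02-23/things
next=backup-2013-02-24/things
mkdir -p "$next"
for src in things/*; do
    name=$(basename "$src")
    if [ -f "$prev/$name" ] && cmp -s "$src" "$prev/$name"; then
        ln "$prev/$name" "$next/$name"   # unchanged: share the existing inode
    else
        cp "$src" "$next/$name"          # new or changed: store a fresh copy
    fi
done

# Matching inode numbers show which files are shared between the backups.
ls -li "$prev" "$next"
```

In the listing, thing1 and thing3 should show the same inode number in both folders and a link count of 2, while thing2 exists only in the first backup.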