In what situations would one want to use a hard-link rather than a soft-link? I personally have never run across a situation where I'd want to use a hard-link over a soft-link, and the only use-case I've come across when searching the web is deduplicating identical files.
|
Aside from the backup usage mentioned in another comment, which I believe also includes snapshots on a BTRFS volume, a use-case for hard-links over soft-links is a tag-sorted collection of files. (It's not necessarily the best way to build a collection, and a database-driven method is potentially better, but for a simple collection that's reasonably stable, it's not too bad.)

Consider a media collection where all files are stored in one flat directory and are sorted into other directories based on various criteria, e.g. year, subject, artist, genre. This could be a personal movie collection or a commercial studio's collective works. The work is essentially finished: the file is saved, not likely to be modified, and sorted, possibly into multiple locations, by links.

Bear in mind that the concepts of "original" and "copy" do not apply to hard-links: every link to the file is an original, and there is no "copy" in the normal sense. For describing the use-case, however, the terms mimic the logic of the behavior. The "original" is saved in the "catalog" directory, and the sorted "copies" are hard-linked to those files.

The file attributes on the sorting directories can be set to r/o, preventing any accidental changes to the file names and sorted structure, while the attributes on the catalog directory can be r/w, allowing it to be modified as needed. (A case for that would be music files, where some players attempt to rename and reorganize files based on tags embedded in the media file, user input, or internet retrieval.) Additionally, since the attributes of the "copy" directories can differ from those of the "original" directory, the sorted structure could be made available to the group, or the world, with restricted access, while the main "catalog" is only accessible to the principal user, with full access. The files themselves, however, will always have the same attributes on all links to that inode. (ACLs could be explored to enhance that, but that's not my knowledge area.)

If the original is renamed or moved (because the single "catalog" directory becomes too large to manage, for example), the hard-links remain valid, while soft-links are broken. If the "copies" are moved and the soft-links are relative, the soft-links will again be broken, and the hard-links will not be.

Note: there seems to be inconsistency in how different tools report disk usage when soft-links are involved. With hard-links, however, it seems consistent. With 100 files in a catalog sorted into a collection of "tags", there could easily be 500 linked "copies" (for a photograph collection, say: date, photographer, and an average of 3 "subject" tags). Dolphin, for example, would report that as 100 files for hard-links, and 600 files if soft-links are used. Interestingly, it reports the same disk-space usage either way, so it looks like a large collection of small files for soft-links, and a small collection of large files for hard-links.

A caveat to this type of use-case is that on file-systems that use COW, modifying the "original" could break the hard-links, but not the soft-links. But if the intent is to have the master copy saved and sorted only after editing, COW doesn't enter the scenario.
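To make the catalog idea concrete, here is a minimal Python sketch under stated assumptions: the directory names (catalog, by-year-hard, by-year-soft), the file name photo.jpg and the year 2019 are all made up for illustration. It builds one catalog entry, tags it once with a hard-link and once with an absolute soft-link, renames the catalog, and then counts directory entries versus distinct inodes.

```python
import os
import tempfile


def build_collection(root):
    """Create a flat catalog plus two tag directories that link back to it."""
    catalog = os.path.join(root, "catalog")
    hard_tags = os.path.join(root, "by-year-hard", "2019")
    soft_tags = os.path.join(root, "by-year-soft", "2019")
    for d in (catalog, hard_tags, soft_tags):
        os.makedirs(d)
    original = os.path.join(catalog, "photo.jpg")
    with open(original, "w") as f:
        f.write("image data")
    hard = os.path.join(hard_tags, "photo.jpg")
    soft = os.path.join(soft_tags, "photo.jpg")
    os.link(original, hard)     # a second name for the same inode
    os.symlink(original, soft)  # a pointer to the (absolute) catalog path
    return catalog, hard, soft


def demo():
    with tempfile.TemporaryDirectory() as root:
        catalog, hard, soft = build_collection(root)
        # Reorganise the catalog: the path stored in the soft-link disappears.
        os.rename(catalog, os.path.join(root, "catalog-2019"))
        hard_ok = os.path.exists(hard)  # the inode is still reachable
        soft_ok = os.path.exists(soft)  # exists() follows the now-dangling link
        # Count directory entries and distinct (device, inode) pairs.
        entries, inodes = 0, set()
        for dirpath, _dirs, files in os.walk(root):
            for name in files:
                st = os.lstat(os.path.join(dirpath, name))
                entries += 1
                inodes.add((st.st_dev, st.st_ino))
        return hard_ok, soft_ok, entries, len(inodes)


if __name__ == "__main__":
    print(demo())
```

The two hard-linked names count as a single inode, while the soft-link is a small file of its own, which is one plausible reason file managers disagree on the number of "files".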
|
|
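Hard links are useful for cases where you don't want to tie the existence of one file name to another. Consider this: with a soft link, once the original is removed, the link dangles and the data is gone.
Whereas with a hard link, the data stays reachable through the remaining name until the last link is removed.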
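A small Python sketch of that difference, assuming made-up file names (original.txt, soft.txt, hard.txt) in a temporary directory:

```python
import os
import tempfile


def demo():
    with tempfile.TemporaryDirectory() as d:
        original = os.path.join(d, "original.txt")
        soft = os.path.join(d, "soft.txt")
        hard = os.path.join(d, "hard.txt")
        with open(original, "w") as f:
            f.write("payload")
        os.symlink(original, soft)  # stores the path of the original
        os.link(original, hard)     # adds a second name to the same inode
        links_before = os.stat(original).st_nlink

        os.remove(original)         # drops one name, not the data

        # The soft link is still a link, but its target no longer exists.
        soft_dangling = os.path.islink(soft) and not os.path.exists(soft)
        # The hard link still reads the data.
        with open(hard) as f:
            data = f.read()
        return links_before, soft_dangling, data, os.stat(hard).st_nlink


if __name__ == "__main__":
    print(demo())
```

The link count drops from 2 to 1 when the first name is removed; the file's storage is only released when the count reaches zero and no process has it open.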
|
|
|
A single program may change its behavior depending on what name it is launched as:
Over in the source, this is decided via something like
though the exact details will vary depending on the OS and language involved. This means (mostly) identical code does not have to be compiled into two (mostly) identical binaries. Bear in mind that Unix dates from days when disk space was very expensive, though according to Stevens in APUE chapter 4, symlinks were introduced in 4.2BSD (1983) to get around various limitations of hardlinks. A test program to check whether the symlink name is used as the program name might look something like:
And tested via:
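A minimal Python sketch of the same idea, assuming a made-up script and the made-up names pack, unpack and other. The script picks its behavior from the base name of sys.argv[0]; it is then run through its own name, through a hard link, and through a symlink.

```python
import os
import subprocess
import sys
import tempfile

# Hypothetical multi-call script: behaviour depends on the invoked name.
SCRIPT = '''\
import os, sys
name = os.path.basename(sys.argv[0])
if name == "pack":
    print("compress")
elif name == "unpack":
    print("decompress")
else:
    print("unknown: " + name)
'''


def run_as(path):
    """Run the script via the given name and return what it printed."""
    result = subprocess.run(
        [sys.executable, path], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


def demo():
    with tempfile.TemporaryDirectory() as d:
        pack = os.path.join(d, "pack")
        unpack = os.path.join(d, "unpack")
        other = os.path.join(d, "other")
        with open(pack, "w") as f:
            f.write(SCRIPT)
        os.link(pack, unpack)    # hard link: second name, same inode
        os.symlink(pack, other)  # symlink under a third name
        return run_as(pack), run_as(unpack), run_as(other)


if __name__ == "__main__":
    print(demo())
```

In this setup the symlink's own name, not the name of its target, is what the script sees, so either kind of link works for name-based dispatch here; other languages and operating systems may differ.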
|
|
|
Filesystems are a simple yet efficient way to organize and classify files (this is their primary reason for existence). Hardlinks allow a higher degree of flexibility in this matter. As mentioned, there is no concept of original and copy when dealing with hardlinks: all directory entries (hardlinks) are simply references to the file (they point to its inode) with no precedence, hence there are also no broken hardlinks. Here are some of the use cases that hardlinks serve but softlinks don't:
|
|
|
I recently had a use case for a somewhat safe update procedure for U-Boot based systems where
Without hardlinks it wouldn't be that simple. |
|
|
When my P2P software finishes downloading a file, the file is placed in a specific directory. Downloaded files hardly ever need to be edited. The common case is that I make a hardlink in a different directory where I need the file to be. Advantages:
The main point: if I knew in advance which file I would |
|
|
|
Very common, real-world example that needs hardlinks:
This clones from a local Git repo with nearly zero copying. Instead of copying the object files (immutable files used by Git for its "database"), it simply hardlinks them. Any repo can remove an object, but the inode stays valid for the rest of the repos. And if an object is removed from all repos, it's deleted from disk. Hard links make for a beautifully robust and fast solution. This is very common on CI servers. There is a non-hard-link version:
|
|
|
BackupPC is a backup system that uses hard links on the servers to provide file-level deduplication. Hard links are superior to soft links here because they provide automatic reference counting. Files are first stored in a "pool" directory tree based on their MD5 hash. Any backup that makes use of that file makes a hard link to the pool file. As backups expire or are deleted, their hard links are removed from the filesystem. A cron job periodically deletes any files in the pool directory that don't have more than one link. This method has some disadvantages (principally, that it is difficult to use filesystem-based tools to replicate the backup store), but it has proven to be quite robust in practice.

Another use case: the Tomcat Java web application server treats file names as metadata (a Java "war" file must be named based on its path on the web server), e.g.:

Unfortunately, it resolves symlinks before making this decision. So, say you want to deploy an application build and give it a descriptive file name (e.g., with a release number or date). You can't make a symlink to the file with the "real" name; you have to make a hardlink.
I don't like this Tomcat behavior, but hardlinks give me a way around it.
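The pooling scheme described for BackupPC can be sketched in a few lines of Python. This is a toy model under stated assumptions, not BackupPC's actual layout or code: the directory names (pool, backup1, backup2) and the helper names store and prune are made up, and the pool is flat rather than a tree.

```python
import hashlib
import os
import tempfile


def store(pool, backup_dir, name, data):
    """Put data into the pool once, then hard-link it into a backup."""
    digest = hashlib.md5(data).hexdigest()  # pool key, as in the description
    pooled = os.path.join(pool, digest)
    if not os.path.exists(pooled):
        with open(pooled, "wb") as f:
            f.write(data)
    os.link(pooled, os.path.join(backup_dir, name))


def prune(pool):
    """Delete pool files that no backup links to any more (link count 1)."""
    removed = []
    for entry in os.listdir(pool):
        path = os.path.join(pool, entry)
        if os.stat(path).st_nlink == 1:
            os.remove(path)
            removed.append(entry)
    return removed


def demo():
    with tempfile.TemporaryDirectory() as root:
        pool = os.path.join(root, "pool")
        b1 = os.path.join(root, "backup1")
        b2 = os.path.join(root, "backup2")
        for d in (pool, b1, b2):
            os.mkdir(d)
        store(pool, b1, "a.txt", b"same content")
        store(pool, b2, "a.txt", b"same content")  # deduplicated in the pool
        store(pool, b1, "b.txt", b"only in backup1")
        pooled_before = len(os.listdir(pool))
        # Expire backup1 by removing its links.
        for name in os.listdir(b1):
            os.remove(os.path.join(b1, name))
        removed = prune(pool)
        return pooled_before, len(removed), len(os.listdir(pool))


if __name__ == "__main__":
    print(demo())
```

The filesystem's link count does the reference counting: the pruning step never needs to know which backups exist, only whether anything besides the pool still names the file.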
|
|
|
One use that I have had for hard links is when downloading or uncompressing a broken file. The program that does the downloading or uncompressing (such as unzip or unrar) will often automatically remove the incomplete file when it encounters an error, and there is usually no option to keep it. If I want to keep the file, I can make a hard link to it. |
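A minimal Python sketch of that trick. The function flaky_extract is a made-up stand-in for a tool such as unzip that deletes its partial output on error, and the callback stands in for the user making the hard link while the tool is still running; the file names are assumptions too.

```python
import os
import tempfile


def flaky_extract(path, while_running):
    """Hypothetical tool: writes partial output, then removes it on error."""
    with open(path, "w") as f:
        f.write("partial data")
        f.flush()
        while_running()  # the moment at which the user can still act
    os.remove(path)      # cleanup after the simulated error


def demo():
    with tempfile.TemporaryDirectory() as d:
        out = os.path.join(d, "archive.part")
        keep = os.path.join(d, "rescued.part")
        # Make a second name for the file before the tool unlinks the first.
        flaky_extract(out, lambda: os.link(out, keep))
        with open(keep) as f:
            return os.path.exists(out), f.read()


if __name__ == "__main__":
    print(demo())
```

The tool only removes its own name for the file, so the data survives under the extra name.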
|
|
".." is always the same inode as "." in the parent directory. Things like find can check that link-count=2 to detect leaf directories, and avoid stat-ing the entries from readdir to look for subdirectories. But that's only a minor feature enabled by support for hardlinks of non-directory files (regular, symlink, device, socket, and named-pipe). (Yes, symlinks have their own inode, and can be hardlinked.) – Peter Cordes 5 hours ago
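A quick way to check the ".." claim from Python, as a sketch with a made-up directory name (child). The link count of the leaf directory is returned but deliberately not interpreted, because the "2 for a leaf" convention is an assumption that holds on many traditional Unix filesystems but not on all of them.

```python
import os
import tempfile


def demo():
    with tempfile.TemporaryDirectory() as parent:
        child = os.path.join(parent, "child")
        os.mkdir(child)
        dotdot = os.stat(os.path.join(child, ".."))
        par = os.stat(parent)
        # ".." inside child and the parent directory are the same inode.
        same_inode = (dotdot.st_dev, dotdot.st_ino) == (par.st_dev, par.st_ino)
        # Link count of a directory with no subdirectories (filesystem-dependent).
        leaf_links = os.stat(child).st_nlink
        return same_inode, leaf_links


if __name__ == "__main__":
    print(demo())
```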