How to identify duplicate files on Linux

Identifying files that share disk space relies on the fact that such files share the same inode, the data structure that stores all the information about a file except its name and content. If two or more files have different names and file system locations yet share an inode, they also share content, ownership, permissions, and so on.
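As a minimal sketch of this idea, the commands below create a file and a second name for it in a throwaway directory, then compare inode numbers. The file names original and twin are made up for illustration, and the stat -c format option assumes GNU coreutils, as found on most Linux systems.

```shell
# Work in a temporary directory so no real files are touched.
tmpdir=$(mktemp -d)

echo "some content" > "$tmpdir/original"
ln "$tmpdir/original" "$tmpdir/twin"   # second name for the same inode

# GNU stat format codes: %i = inode number, %h = number of hard links
inode_original=$(stat -c '%i' "$tmpdir/original")
inode_twin=$(stat -c '%i' "$tmpdir/twin")
link_count=$(stat -c '%h' "$tmpdir/original")

echo "original inode: $inode_original"
echo "twin inode:     $inode_twin"
echo "link count:     $link_count"
```

Both names should report the same inode number, and the link count should reflect the two names that now refer to it.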

These files are often referred to as “hard links”, unlike symbolic links, which simply point to other files by containing their names. Symbolic links are easy to pick out in a long file listing by the “l” in the first position and the -> symbol that points to the file being referenced.

$ ls -l my*
-rw-r--r-- 4 shs shs 228 Apr 12 19:37 myfile
lrwxrwxrwx 1 shs shs 6 Apr 15 11:18 myref -> myfile
-rw-r--r-- 4 shs shs 228 Apr 12 19:37 mytwin

Hard links in a single directory are not as obvious, but they are still easy to find. A link count greater than 1 in the second column of ls -l output (4 for myfile and mytwin above) is a hint that a file has other names. If you list the files using the ls -i command and sort them by inode number, files that share an inode end up on adjacent lines. In this type of ls output, the first column shows the inode numbers.
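The approach can be sketched as follows, under a few assumptions: the directory and the unrelated file are invented for the demonstration, and the awk and uniq -d step is one possible way to single out repeated inode numbers rather than a method taken from the article.

```shell
# Build a small test directory containing one pair of hard links.
tmpdir=$(mktemp -d)
echo "data" > "$tmpdir/myfile"
ln "$tmpdir/myfile" "$tmpdir/mytwin"     # hard link to myfile
echo "other" > "$tmpdir/unrelated"       # separate file, separate inode

# List inode numbers (first column) with file names, sorted by inode,
# so that hard links to the same inode sit next to each other.
listing=$(ls -i "$tmpdir" | sort -n)
echo "$listing"

# Keep only the inode numbers that occur more than once in the directory.
shared=$(ls -i "$tmpdir" | awk '{print $1}' | sort -n | uniq -d)
echo "inodes with more than one name here: $shared"
```

Note that this only finds names within one directory. Hard links to the same inode can also live in other directories on the same file system, and a directory listing will not show those.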
