First, read this:
http://dashohoxha.fs.al/deduplicating-data-with-xfs-and-reflinks/
Then, a few additions:
I recommend using fdupes to find the duplicates, so that you really reclaim the space used by all duplicated files. duperemove on its own can miss some of them.
fdupes -r . | duperemove --fdupes
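The pipeline above fails in a confusing way if either tool is missing, so here is a minimal sketch of a guard around it. The helper names need and dedupe_tree are made up for illustration and are not part of fdupes or duperemove; the pipeline itself is the same one as above.

```shell
#!/bin/sh
# Succeed only if the named program is on PATH; otherwise complain on stderr.
need() {
    command -v "$1" >/dev/null 2>&1 || { echo "missing: $1" >&2; return 1; }
}

# Hypothetical wrapper: hand every duplicate set that fdupes finds under "$1"
# (default: current directory) to duperemove, but only if both tools exist.
dedupe_tree() {
    need fdupes && need duperemove || return 1
    fdupes -r "${1:-.}" | duperemove --fdupes
}

# Usage: dedupe_tree /path/to/data
```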
Finally, once the duplicate files have been deduplicated, you can use the same and block dedupe options to save even more space.
duperemove -hdr --hashfile=/tmp/test.hash --dedupe-options=same,block .
With this, I saved 15 GB on a 40 GB qcow image file of a QEMU Windows guest.
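To check a saving like this yourself, compare the filesystem's used space before and after the run. du is not much help here, since as far as I know it counts shared extents once per file. The following is only a sketch: used_kib and report_savings are hypothetical helpers, and the figure is approximate because anything else writing to the same filesystem during the run is counted too.

```shell
#!/bin/sh
# Print the used space, in KiB, of the filesystem that holds the given path.
# df -P forces one output line per filesystem; column 3 is "Used".
used_kib() {
    df -Pk "$1" | awk 'NR == 2 { print $3 }'
}

# Hypothetical helper: run a command and report how much the used space of
# the filesystem holding "$1" dropped while it ran (negative if it grew).
report_savings() {
    target=$1
    shift
    before=$(used_kib "$target")
    "$@"
    sync
    after=$(used_kib "$target")
    echo "saved $((before - after)) KiB"
}

# Usage:
# report_savings . duperemove -hdr --hashfile=/tmp/test.hash --dedupe-options=same,block .
```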