Monday, February 15, 2010

Linux fdupes: Get Rid Of (Delete) Duplicate Files In A Directory

How do I find duplicate files in a given set of directories and delete them using a shell script or command-line options?

How do I get rid of duplicate files stored in the ~/foo and /u2/foo directories?

You need to use a tool called fdupes. It searches the given path for duplicate files. Such files are found by comparing file sizes and MD5 signatures, followed by a byte-by-byte comparison. fdupes is a nice tool to get rid of duplicate files.
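
To make the comparison steps concrete, here is a rough sketch of the same idea using only standard tools: hash each file with md5sum, then confirm candidate matches with cmp. It is not how fdupes is implemented. The function name find_dupes_sketch is made up for this example, the file-size pre-filter is skipped for brevity, only the top level of the directory is scanned, and file names are assumed not to contain newlines or leading/trailing spaces.

```shell
#!/bin/sh
# Hypothetical helper (not part of fdupes): print sets of identical files
# found directly inside the directory given as $1.
find_dupes_sketch() {
    group_sum=      # MD5 of the current candidate set
    group_first=    # first file seen with that MD5
    in_group=0      # 1 once the current set has at least two members

    # Hash every regular file, then sort so equal hashes become adjacent.
    find "$1" -maxdepth 1 -type f -exec md5sum {} + | sort |
    while read -r sum file; do
        # Same hash is only a hint; cmp does the byte-by-byte confirmation.
        if [ "$sum" = "$group_sum" ] && cmp -s "$group_first" "$file"; then
            if [ "$in_group" -eq 0 ]; then
                printf '%s\n' "$group_first"
            fi
            printf '%s\n' "$file"
            in_group=1
        else
            # A blank line separates one set from the next.
            if [ "$in_group" -eq 1 ]; then
                printf '\n'
            fi
            group_sum=$sum
            group_first=$file
            in_group=0
        fi
    done
}

# Example call (assumed path):
# find_dupes_sketch /etc
```

For real work fdupes is the better choice; the sketch only illustrates why a matching checksum is followed by a full comparison before two files are reported as duplicates.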


Install fdupes
Type the following command under Debian / Ubuntu Linux:

# apt-get install fdupes

Type the following command under Red Hat / RHEL / Fedora / CentOS Linux (turn on the rpmforge repo before running the yum command):


# yum install fdupes


How Do I Use fdupes?
To find duplicate files in the /etc/ directory, enter:


# fdupes /etc


Sample outputs:
/etc/vimrc
/etc/virc

How Do I Delete Unwanted Files?
You can force fdupes to prompt you for files to preserve, deleting all others (use this with care, otherwise you may lose data):


# fdupes -d /etc


Sample outputs:
[1] /etc/vimrc
[2] /etc/virc

Set 1 of 1, preserve files [1 - 2, all]: 1

[+] /etc/vimrc
[-] /etc/virc
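
Since deleting is risky, it may help to review the extra copies first. The sketch below wraps two fdupes options described in its man page: -f (--omitfirst), which omits the first file in each set so only the extra copies are listed, and -N (--noprompt), which together with -d keeps the first file of each set and deletes the rest without asking. The function names list_extra_copies and delete_extra_copies are made up for this example. Which file counts as "first" in a set is decided by fdupes, so check the listing on your own system before running the delete step.

```shell
#!/bin/sh
# Hypothetical wrappers around fdupes; the names are illustrative only.

# List every duplicate except the first file of each set.
list_extra_copies() {
    fdupes -f "$@"
}

# Keep the first file of each set and delete the others WITHOUT prompting.
# Review the output of list_extra_copies before calling this.
delete_extra_copies() {
    fdupes -dN "$@"
}

# Example calls (assumed path):
# list_extra_copies /dir1
# delete_extra_copies /dir1
```

The listing is only a preview, so run the two commands back to back and make sure the directory has not changed in between.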

How Do I Recursively Search A Directory?
With the -r option, fdupes searches every given directory recursively, following the subdirectories it encounters. Enter:


# fdupes -r /dir1


How Do I Find Dupes In Two Directories?
Type the command as follows:


# fdupes /dir1 /dir2


OR

# fdupes -r /etc /data/etc /nas95/etc


How Do I See Size Of Duplicate Files?
Type the following command with the -S option:


# fdupes -S /etc


Sample outputs:
1533 bytes each:
/etc/vimrc
/etc/virc

Further readings:
  • man page fdupes
