App to Verify Copies using Hash

A few years ago (correction: a decade ago) I wrote a short script to verify copied files. As much as I would like to believe a NAS is more fault-tolerant, that would be a moot point if the files were already damaged going in; as the saying goes, garbage in, garbage out. Once I had the system up and running, the need for such a script subsided very quickly, but it was definitely a story I kept on the back burner.

The idea behind the old script, which I used myself and which is available online in many variations, is rather simple: compute the hash values of two supposedly identical file sets and compare them. The real problem is maintaining even a simple script, and having it reliably churn out the same results.
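To make the idea concrete, here is a minimal Python sketch of that kind of script. It is not my original code; the function names hash_tree and compare_trees are made up for illustration, and it assumes the files are small enough to read into memory in one go.

```python
import hashlib
from pathlib import Path


def hash_tree(root):
    """Map each file's path (relative to root) to its MD5 hex digest."""
    root = Path(root)
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            # read_bytes() loads the whole file at once: fine for a sketch,
            # not ideal for very large files.
            rel = path.relative_to(root).as_posix()
            digests[rel] = hashlib.md5(path.read_bytes()).hexdigest()
    return digests


def compare_trees(dir1, dir2):
    """Return (missing_from_dir2, extra_in_dir2, changed) as sorted lists."""
    a, b = hash_tree(dir1), hash_tree(dir2)
    missing = sorted(a.keys() - b.keys())
    extra = sorted(b.keys() - a.keys())
    changed = sorted(p for p in a.keys() & b.keys() if a[p] != b[p])
    return missing, extra, changed
```

Calling compare_trees("/path/to/dir1", "/path/to/dir2") returns three lists of relative paths; three empty lists mean the two sets match.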

Though confusingly named md5deep, the tool (also distributed as hashdeep) is not limited to MD5: the developer cites ‘historical reasons’ for keeping the name, but it supports other hash algorithms of your choice. It’s cross-platform and open-source. Since we are only checking for accidental corruption, not defending against deliberate tampering, I believe MD5 is a good enough algorithm for the job.
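For a sense of what picking an algorithm looks like in code, here is a small Python sketch using the standard hashlib module. This is not how md5deep is implemented; file_digest is a hypothetical helper, and the 1 MiB chunk size is an arbitrary choice.

```python
import hashlib


def file_digest(path, algorithm="md5", chunk_size=1 << 20):
    """Hash a file in chunks so large files never have to fit in memory."""
    h = hashlib.new(algorithm)  # e.g. "md5", "sha1", "sha256"
    with open(path, "rb") as f:
        # Keep reading fixed-size chunks until read() returns empty bytes.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()
```

Switching from MD5 to something stronger is then a one-word change, for example file_digest(path, "sha256"), at the cost of somewhat slower hashing.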

Here’s an example of how I would use it. Let’s assume we have two paths, /path/to/dir1 and /path/to/dir2, that are supposedly full of the same files in the same hierarchy.

Running the following command writes the hashes of every file under the first path to a file. I called it path1.md5, but you can name it however you like.

md5deep -r /path/to/dir1 > path1.md5

Next, running another command prints any irregularities, that is, files under the second path whose hash is not found in path1.md5.

md5deep -X path1.md5 -r /path/to/dir2
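As far as I understand it, the second command works by negative matching: it reports files whose hash does not appear in the saved list, regardless of file name. The Python sketch below imitates that behaviour to show what it does and does not catch. It is not md5deep's code; load_known_hashes and unmatched_files are hypothetical names, and I am assuming each line of the saved file starts with the hex digest followed by whitespace and the path.

```python
import hashlib
from pathlib import Path


def load_known_hashes(listing):
    """Read lines of the assumed form '<hex digest>  <path>' into a set of digests."""
    known = set()
    for line in Path(listing).read_text().splitlines():
        parts = line.split(None, 1)  # digest first, the rest is the path
        if parts:
            known.add(parts[0].lower())
    return known


def unmatched_files(listing, root):
    """Return sorted paths under root whose MD5 is not in the listing."""
    known = load_known_hashes(listing)
    result = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            digest = hashlib.md5(path.read_bytes()).hexdigest()
            if digest not in known:
                result.append(str(path))
    return result
```

One consequence of matching this way: a file that exists in dir1 but is missing from dir2 is never reported, because there is nothing in dir2 to hash. Running the check again in the opposite direction covers that case.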

One disclosure, though how it would affect users is simply beyond me: the developers were in the Air Force, so the code references the U.S. Air Force. I can’t see that it matters in any significant way, aside from possibly covering some legal ground. The code is also available on GitHub.
