Similarity digesting is a class of algorithms and technologies that generate hashes from files and preserve file similarity. They find applications in various areas across security industry: malware variant detection, spam filtering, computer forensic analysis, data loss prevention and etc.. There are a few schemes and tools available that include ssdeep, sdhash and TLSH. While being useful for detecting file similarity, they define similarity from different perspectives. In other words, they take different approaches to describe what file similarity is about. In order to compare those tools with better evaluation, we introduce a simple mathematical model to describe similarity that would cover all three schemes and beyond. This model enables us to establish a theoretic framework for analyzing essential differences of various similarity digesting tools. The general use cases proposed by NIST are studied. As a result, a few tools are found to be complementary to each other so that we can use them in a hybrid approach in practice. Data experiment results are be provided to support the theoretic analysis.