The "Universal Similarity Metric"

This similarity measure was proposed by Vitanyi et al. It compares the information content of two files with the information content of their concatenation. The less information the concatenation adds beyond either file alone, the more the two bit-level representations have in common, and so the more similar the files are.
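The comparison described above can be sketched in Python. This is only an illustration under stated assumptions, not the exact procedure used in the thesis: it approximates information content by zlib-compressed size and uses the normalized compression distance, a common practical stand-in for the USM. The function names compressed_size and similarity_distance are made up for this example.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Size in bytes of the zlib-compressed data.

    Assumption: compressed size is used as a rough estimate of
    information content; any real compressor is only an approximation.
    """
    return len(zlib.compress(data, 9))


def similarity_distance(x: bytes, y: bytes) -> float:
    """Normalized compression distance between x and y.

    Values near 0 suggest the inputs share most of their information;
    values near 1 suggest they share very little.
    """
    cx = compressed_size(x)
    cy = compressed_size(y)
    cxy = compressed_size(x + y)  # the concatenated version of the two inputs
    return (cxy - min(cx, cy)) / max(cx, cy)
```

To compare two files, read each one in binary mode (for example with open(path, "rb").read()) and pass the resulting bytes to similarity_distance. A smaller result means the files are more alike.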

The USM can be applied on any computer to measure how similar two files are. A list of quick, easy-to-follow steps is shown below:

Quick and easy to follow steps

Issues with the USM

The measure has some issues:

These issues will be discussed thoroughly in my thesis, along with some proposed solutions.