# Measuring the resolution of Torus

How good is Torus at identifying tunes? Suppose we pick tunes in the database uniformly at random and give Torus the first k characters of their shape strings, for some fixed k. Then we can define two possible measures:

• the uniqueness probability = the probability that a tune is identified uniquely;
• the resolution = the reciprocal of the expected number of matches found.
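As a rough illustration of the two definitions above, here is a minimal Python sketch. It is not Torus's actual code: the function names, the toy shape strings, the `U`/`D` alphabet, and the rule that a query "matches" every tune whose shape string starts with it are all assumptions made for the example.

```python
def match_counts(shapes, k):
    """For each tune, count the tunes whose shape string starts with
    the first k characters of that tune's shape string.
    (Prefix matching is an assumption about how a lookup works.)"""
    queries = [s[:k] for s in shapes]
    return [sum(1 for t in shapes if t.startswith(q)) for q in queries]


def measures(shapes, k):
    """Return (uniqueness probability, resolution) for queries of length k,
    with tunes picked uniformly at random from `shapes`."""
    counts = match_counts(shapes, k)
    n = len(counts)
    # Fraction of tunes whose query matches exactly one tune.
    uniqueness = sum(1 for c in counts if c == 1) / n
    # Reciprocal of the mean number of matches: 1 / (sum(counts) / n).
    resolution = n / sum(counts)
    return uniqueness, resolution


# Hypothetical toy database of four shape strings.
shapes = ["UDU", "UDD", "DUU", "UD"]
print(measures(shapes, 1))  # (0.25, 0.4)
print(measures(shapes, 3))  # uniqueness 0.75, resolution 4/6
```

In this toy database the short string `UD` still matches three tunes at k = 3, so neither measure reaches 100% however large k becomes.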

We can express each of these measures as a percentage: the higher the value, the better Torus is at identifying tunes by using k shape characters. We can then plot these measures against k to get an idea of the best length of shape string to use:

Ideally we'd expect both measures to reach 100% at some point, i.e. once we've used enough shape characters to distinguish every tune uniquely. However, there may be pairs of tunes that Torus can't distinguish, in which case the measures never reach this value. This happens when the shape string for one tune appears at the start of the shape string for the other (and I can't extend the shorter shape string because I've now forgotten the relevant tune). Currently the final uniqueness probability is 97.0% and the final resolution is 98.0%.
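One way to find the offending pairs is to look for shape strings that are prefixes of other shape strings. The sketch below is only illustrative: `prefix_pairs` and the toy data are hypothetical names and values, and the brute-force comparison is chosen for clarity rather than speed.

```python
def prefix_pairs(shapes):
    """Return index pairs (i, j), i != j, where shapes[i] appears at the
    start of shapes[j]. Two identical strings are reported in both orders."""
    return [
        (i, j)
        for i, s in enumerate(shapes)
        for j, t in enumerate(shapes)
        if i != j and t.startswith(s)
    ]


# Hypothetical toy database: "UD" is a prefix of both "UDU" and "UDD".
shapes = ["UDU", "UDD", "DUU", "UD"]
print(prefix_pairs(shapes))  # [(3, 0), (3, 1)]
```

Each pair returned names a tune (the first index) that cannot be identified uniquely however many of its shape characters are supplied.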

The graph above suggests some puzzles: What shape should we expect the curves to be? Do they approach each other as k becomes large?