### Variation of information

In probability theory and information theory, the variation of information or shared information distance is a measure of the distance between two clusterings (partitions of elements). It is closely related to mutual information; indeed, it is a simple linear expression involving the mutual information. Unlike the mutual information, however, the variation of information is a true metric, in that it obeys the triangle inequality. Even more, it is a universal metric, in that if any other distance measure judges two items as close, then the variation of information will also judge them close.

## Definition

Suppose we have two clusterings (divisions of a set into several disjoint subsets) $X$ and $Y$, where $X = \{X_1, X_2, \ldots, X_k\}$, $p_i = |X_i| / n$, and $n = \sum_i |X_i|$. Then the variation of information between the two clusterings is:

$VI(X; Y) = H(X) + H(Y) - 2I(X, Y)$

where $H(X)$ is the entropy of $X$ and $I(X, Y)$ is the mutual information between $X$ and $Y$.

This is completely equivalent to the shared information distance.
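The definition above can be sketched in code. The following is a minimal illustration (not a reference implementation): it computes $VI$ directly from the joint cluster-overlap probabilities $r_{ij} = |X_i \cap Y_j| / n$, using the equivalent form $VI(X; Y) = -\sum_{i,j} r_{ij}\left[\log(r_{ij}/p_i) + \log(r_{ij}/q_j)\right]$, which follows from expanding $H(X) + H(Y) - 2I(X, Y)$. Clusterings are assumed to be given as lists of sets partitioning the same underlying elements.

```python
from math import log

def variation_of_information(X, Y):
    """Variation of information between two clusterings.

    X and Y are lists of sets that each partition the same set
    of n elements. Returns VI in nats (natural logarithm).
    """
    n = sum(len(x) for x in X)  # total number of elements
    vi = 0.0
    for x in X:
        p = len(x) / n          # p_i = |X_i| / n
        for y in Y:
            q = len(y) / n      # q_j = |Y_j| / n
            r = len(x & y) / n  # joint probability r_ij
            if r > 0.0:
                # Each overlapping pair contributes
                # -r * [log(r/p) + log(r/q)] >= 0.
                vi -= r * (log(r / p) + log(r / q))
    return vi
```

For identical clusterings the overlaps coincide with the clusters themselves, so every term vanishes and the distance is zero; merging all elements into a single cluster in $Y$ yields $VI = H(X)$, since then $I(X, Y) = H(Y) = 0$.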