A function to compute the normalized mutual information (NMI) between two classifications

NMI(
  c1,
  c2,
  variant = c("max", "min", "sqrt", "sum", "joint"),
  sorted_pairs = NULL
)

Arguments

c1

A vector of length $n$ with values between 0 and $N_1 < n$ representing the first classification. Supported types: integer, numeric, or factor. Avoid character vectors for better performance. Must not be a list.

c2

A vector of length $n$ with values between 0 and $N_2 < n$ representing the second classification. Supported types: integer, numeric, or factor. Avoid character vectors for better performance. Must not be a list.

variant

A string, one of "max", "min", "sqrt", "sum", or "joint", selecting the variant of NMI to compute. Defaults to "max".
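
The variant names suggest how the mutual information is normalized. The base R sketch below illustrates one common reading: the mutual information is divided by the maximum, minimum, geometric mean, or arithmetic mean of the two marginal entropies, or by the joint entropy. The helper `nmi_sketch` is hypothetical and is not part of the package, and the normalizers are an assumption rather than a statement of the package's internals.

```r
# Hypothetical helper, not part of the package: NMI from a contingency table.
nmi_sketch <- function(c1, c2,
                       variant = c("max", "min", "sqrt", "sum", "joint")) {
  variant <- match.arg(variant)
  stopifnot(length(c1) == length(c2))

  # Joint and marginal empirical distributions of the two labelings.
  p12 <- table(c1, c2) / length(c1)
  p1  <- rowSums(p12)
  p2  <- colSums(p12)

  # Shannon entropy in nats; empty cells are skipped (0 * log(0) taken as 0).
  entropy <- function(p) {
    p <- p[p > 0]
    -sum(p * log(p))
  }
  h1  <- entropy(p1)
  h2  <- entropy(p2)
  h12 <- entropy(p12)

  # Mutual information, then the variant-specific normalizer (assumed).
  mi <- h1 + h2 - h12
  denom <- switch(variant,
    max   = max(h1, h2),
    min   = min(h1, h2),
    sqrt  = sqrt(h1 * h2),
    sum   = (h1 + h2) / 2,
    joint = h12
  )
  mi / denom
}

# Compare all variants on the clustering used in the Examples section.
cl <- cutree(hclust(dist(iris[, -5])), 4)
sapply(c("max", "min", "sqrt", "sum", "joint"),
       function(v) nmi_sketch(cl, iris$Species, variant = v))
```

Under these assumed normalizers the denominators are ordered min ≤ sqrt ≤ sum ≤ max ≤ joint, so "min" gives the largest score and "joint" the smallest for the same pair of classifications.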

sorted_pairs

Optional output of the function `sort_pairs` (if already computed). If `NULL` (the default), `sort_pairs` is called internally.

Value

A scalar: the normalized mutual information.

Examples

data(iris)
cl <- cutree(hclust(dist(iris[, -5])), 4)
NMI(cl, iris$Species)
#> [1] 0.643852
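
When several variants are needed for the same pair of classifications, the `sorted_pairs` argument can save repeated work. The sketch below makes two assumptions: that this page belongs to the aricode package (the package name does not appear on this page), and that `sort_pairs` takes the two classifications as its first two arguments.

```r
library(aricode)  # assumption: the package that provides NMI() and sort_pairs()

cl <- cutree(hclust(dist(iris[, -5])), 4)

# Compute the pair summary once (assumed call signature: sort_pairs(c1, c2)).
sp <- sort_pairs(cl, iris$Species)

# Reuse it for every variant instead of recomputing it inside each NMI() call.
variants <- c("max", "min", "sqrt", "sum", "joint")
nmi_reused <- sapply(variants, function(v) {
  NMI(cl, iris$Species, variant = v, sorted_pairs = sp)
})

# Same computation without the precomputed summary, for comparison.
nmi_direct <- sapply(variants, function(v) {
  NMI(cl, iris$Species, variant = v)
})

print(nmi_reused)
```

Both calls should return the same values. Reuse only pays off when `NMI` (or another function that accepts `sorted_pairs`) is called several times on the same two classifications.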