Computes the modified adjusted Rand index (MARI) between two classifications, as proposed by Sundqvist et al. (2023), based on a multinomial model.

MARI(c1, c2, sorted_pairs = NULL, raw = FALSE)

Arguments

c1

A vector of length $n$ with values between 0 and $N_1 < n$ representing the first classification. Supported types: integer, numeric, or factor. Avoid character vectors for better performance. Must not be a list.

c2

A vector of length $n$ with values between 0 and $N_2 < n$ representing the second classification. Supported types: integer, numeric, or factor. Avoid character vectors for better performance. Must not be a list.

sorted_pairs

Optional output of the function `sort_pairs` (if already computed). If `NULL` (the default), `sort_pairs` is called internally.

raw

Boolean: should the raw version of the MARI be computed? Defaults to `FALSE`.
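
The `sorted_pairs` and `raw` arguments can be combined as in the minimal sketch below. It assumes that `MARI` and `sort_pairs` are provided by the aricode package (the package name is not stated on this page) and that `sort_pairs` accepts the same two classification vectors. The variable names `sp`, `mari_default`, and `mari_raw` are illustrative only.

```r
library(aricode)  # assumption: package providing MARI() and sort_pairs()

data(iris)
cl <- cutree(hclust(dist(iris[, -5])), 4)

# Compute the pair counts once so they can be reused across calls
sp <- sort_pairs(cl, iris$Species)

# Default (non-raw) MARI, reusing the precomputed pairs
mari_default <- MARI(cl, iris$Species, sorted_pairs = sp)

# Raw version of the MARI, reusing the same precomputed pairs
mari_raw <- MARI(cl, iris$Species, sorted_pairs = sp, raw = TRUE)
```

Passing `sorted_pairs` is mainly useful when several indices are computed on the same pair of classifications, since the pair counts are then computed only once.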

Value

A scalar containing the modified ARI.

References

Sundqvist, Martina, Julien Chiquet, and Guillem Rigaill. "Adjusting the adjusted Rand Index: A multinomial story." Computational Statistics 38.1 (2023): 327-347.

Examples

data(iris)
cl <- cutree(hclust(dist(iris[, -5])), 4)
MARI(cl, iris$Species)
#> [1] 0.5894345
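
Since character vectors are discouraged for `c1` and `c2`, labels stored as strings can be converted to a factor before the call. The following is a sketch under the assumption that `MARI` comes from the aricode package; `species_chr` and `species_fct` are hypothetical names used only for illustration.

```r
library(aricode)  # assumption: package providing MARI()

data(iris)
cl <- cutree(hclust(dist(iris[, -5])), 4)

# Suppose the second classification is only available as character labels
species_chr <- as.character(iris$Species)

# Convert once to a factor, one of the supported input types
species_fct <- factor(species_chr)

mari_fct <- MARI(cl, species_fct)
```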