Measures of similarity between two classifications

Computes, in a single call, all the similarity measures implemented in this package: (A)RI, (N)MI, (N)VI, (N)ID, Chi2, MARI, and Frobenius.

compare_clustering(c1, c2, sorted_pairs = NULL, AMI = FALSE)

Arguments

c1: A vector of length $n$ with values between 0 and $N_1 < n$ representing the first classification. Supported types: integer, numeric, or factor. Avoid character vectors for better performance. Must not be a list.
c2: A vector of length $n$ with values between 0 and $N_2 < n$ representing the second classification. Supported types: integer, numeric, or factor. Avoid character vectors for better performance. Must not be a list.
sorted_pairs: Optional output of the function sort_pairs, if it has already been computed. If `NULL` (the default), sort_pairs is called internally.
AMI: Boolean: should the AMI be computed (more costly than all other measures)? Default is `FALSE`.

Value

A named list containing all the available measures.
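To make two of the returned entries concrete, the following base-R sketch shows how the Rand index (RI) and adjusted Rand index (ARI) are conventionally computed from the contingency table of two classifications. `rand_indices` is a hypothetical helper written for illustration only; it is not part of this package, and the package's own implementation may differ.

```r
# Illustrative sketch only: rand_indices() is a hypothetical helper,
# not a function exported by the package.
rand_indices <- function(c1, c2) {
  stopifnot(length(c1) == length(c2))
  tab <- table(c1, c2)                      # contingency table n_ij
  total <- choose(sum(tab), 2)              # number of pairs of observations

  # Pairs of observations placed together by both, by c1, and by c2
  together_both <- sum(choose(as.vector(tab), 2))
  together_c1 <- sum(choose(rowSums(tab), 2))
  together_c2 <- sum(choose(colSums(tab), 2))

  # RI: proportion of pairs on which the two classifications agree
  ri <- (total + 2 * together_both - together_c1 - together_c2) / total

  # ARI: RI corrected for chance agreement (conventional Hubert-Arabie form)
  expected <- together_c1 * together_c2 / total
  ari <- (together_both - expected) /
    ((together_c1 + together_c2) / 2 - expected)

  list(RI = ri, ARI = ari)
}

# Identical classifications agree on every pair
rand_indices(c(1, 1, 2, 2), c(1, 1, 2, 2))
```

The ARI is undefined (division by zero) in degenerate cases, for example when both classifications place every observation in a single class.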

Examples

data(iris)
cl <- cutree(hclust(dist(iris[, -5])), 4)
compare_clustering(cl, iris$Species)
#> $RI
#> [1] 0.821745
#> 
#> $ARI
#> [1] 0.5894567
#> 
#> $MI
#> [1] 0.8035825
#> 
#> $NMI
#> [1] 0.643852
#> 
#> $VI
#> [1] 0.7395331
#> 
#> $NVI
#> [1] 0.4792468
#> 
#> $ID
#> [1] 0.4445033
#> 
#> $NID
#> [1] 0.356148
#> 
#> $Chi2
#> [1] 209.1143
#> 
#> $MARI
#> [1] 0.5894345
#> 
#> $MARIraw
#> [1] 0.1279572
#> 
#> $Frobenius
#> [1] 2.21181
#>
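
The information-theoretic entries in the output above are tied together by simple identities. For instance, NID equals 1 - NMI (0.356148 = 1 - 0.643852), and the printed values are consistent with natural-log entropies, NMI normalised by the larger of the two entropies, and NVI normalised by the joint entropy. The base-R sketch below spells out these relationships. `info_measures` is a hypothetical helper, and the normalisations are inferred from the example output rather than taken from the package source.

```r
# Illustrative sketch only: info_measures() is a hypothetical helper,
# and the normalisations are inferred from the example output.
info_measures <- function(c1, c2) {
  stopifnot(length(c1) == length(c2))
  p <- table(c1, c2) / length(c1)           # joint distribution of labels

  # Shannon entropy with natural logarithm, skipping empty cells
  entropy <- function(q) {
    q <- q[q > 0]
    -sum(q * log(q))
  }

  h1 <- entropy(rowSums(p))                 # entropy of c1
  h2 <- entropy(colSums(p))                 # entropy of c2
  h12 <- entropy(p)                         # joint entropy
  mi <- h1 + h2 - h12                       # mutual information
  vi <- h1 + h2 - 2 * mi                    # variation of information

  list(
    MI = mi,
    NMI = mi / max(h1, h2),
    VI = vi,
    NVI = vi / h12,
    ID = max(h1, h2) - mi,
    NID = 1 - mi / max(h1, h2)
  )
}

# Two independent classifications share no information, so MI is zero
info_measures(c(1, 1, 2, 2), c(1, 2, 1, 2))
```

The normalised quantities are undefined when both classifications consist of a single class, since every entropy is then zero.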
