Compute the empirical entropy of each of two classification vectors, together with their joint entropy

entropy(c1, c2, sorted_pairs = NULL)

Arguments

c1

A vector of length $n$ with values between 0 and $N_1 < n$ representing the first classification. Supported types: integer, numeric, or factor. Avoid character vectors for better performance. Must not be a list.

c2

A vector of length $n$ with values between 0 and $N_2 < n$ representing the second classification. Supported types: integer, numeric, or factor. Avoid character vectors for better performance. Must not be a list.

sorted_pairs

Optional output of the function sort_pairs, if it has already been computed. If `NULL` (the default), sort_pairs is called internally.

Value

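A list with the joint entropy (`UV`), the two marginal entropies (`U` for c1 and `V` for c2), and the output of sort_pairs (`sort_pairs`). Entropies are in nats (natural logarithm): in the example below, `V` equals log(3) for three classes of equal size.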
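To make the returned quantities concrete, here is a minimal base-R sketch that computes the same three kinds of entropy from a contingency table. It is not the package's implementation: the helper `entropy_from_counts` and the toy vectors are hypothetical names introduced only for illustration, and the natural logarithm is assumed.

```r
# Hypothetical helper (not part of the package): Shannon entropy, in nats,
# of a vector of counts. Zero counts are dropped, since 0 * log(0) is taken as 0.
entropy_from_counts <- function(counts) {
  p <- counts[counts > 0] / sum(counts)
  -sum(p * log(p))
}

# Two toy classifications of n = 6 observations
c1 <- c(0, 0, 1, 1, 2, 2)
c2 <- c(0, 0, 0, 1, 1, 1)

# Contingency table of the pairs (c1[i], c2[i])
joint <- table(c1, c2)

U  <- entropy_from_counts(rowSums(joint))    # entropy of c1
V  <- entropy_from_counts(colSums(joint))    # entropy of c2
UV <- entropy_from_counts(as.vector(joint))  # joint entropy of (c1, c2)
```

Here c1 has three classes of equal size and c2 has two, so `U` is log(3) and `V` is log(2).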

Examples

data(iris)
cl <- cutree(hclust(dist(iris[, -5])), 4)
entropy(cl, iris$Species)
#> $UV
#> [1] 1.543116
#> 
#> $U
#> [1] 1.248086
#> 
#> $V
#> [1] 1.098612
#> 
#> $sort_pairs
#> $sort_pairs$spMat
#> NULL
#> 
#> $sort_pairs$levels
#> $sort_pairs$levels$c1
#> [1] 1 2 3 4
#> 
#> $sort_pairs$levels$c2
#> [1] 1 2 3
#> 
#> 
#> $sort_pairs$nij
#> [1] 50 23 37 27  1 12
#> 
#> $sort_pairs$ni.
#> [1] 50 60 28 12
#> 
#> $sort_pairs$n.j
#> [1] 50 50 50
#> 
#> $sort_pairs$pair_c1
#> [1] 0 1 1 2 2 3
#> 
#> $sort_pairs$pair_c2
#> [1] 0 1 2 1 2 2
#> 
#>
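The count vectors printed above are enough to recover the three entropies by hand. The sketch below copies `nij`, `ni.` and `n.j` from the output, recomputes `UV`, `U` and `V`, and then derives the conditional entropies and the mutual information through the standard identities. The variable names and the helper `h` are hypothetical and not part of the package.

```r
# Counts copied from the sort_pairs output printed above
nij <- c(50, 23, 37, 27, 1, 12)  # sizes of the non-empty (c1, c2) pairs
ni  <- c(50, 60, 28, 12)         # class sizes in c1 (ni.)
nj  <- c(50, 50, 50)             # class sizes in c2 (n.j)

# Hypothetical helper: Shannon entropy (in nats) of strictly positive counts
h <- function(counts) {
  p <- counts / sum(counts)
  -sum(p * log(p))
}

UV <- h(nij)  # joint entropy, about 1.543116
U  <- h(ni)   # entropy of c1, about 1.248086
V  <- h(nj)   # entropy of c2, equal to log(3)

# Quantities derived from the three entropies
h_c1_given_c2 <- UV - V      # conditional entropy of c1 given c2
h_c2_given_c1 <- UV - U      # conditional entropy of c2 given c1
mutual_info   <- U + V - UV  # mutual information between c1 and c2
```

The recomputed values agree with the printed `$UV`, `$U` and `$V`, which is consistent with reading `U` and `V` as the entropies of the individual classifications.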