Entropy is measured in bits when the logarithm is base 2:
$$H = \sum_{i=1}^n p_i \cdot \log_2\left(\frac{1}{p_i}\right) = -\sum_{i=1}^n p_i \cdot \log_2(p_i)$$
Entropy is maximum when all outcomes are equally likely.
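A minimal sketch of the formula above (the function and example distributions are mine, not from the notes), showing that the uniform distribution maximizes entropy:

```python
import math

def entropy(probs):
    """Shannon entropy in bits: H = -sum p_i * log2(p_i), with 0*log(0) := 0."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# A fair coin carries exactly 1 bit of uncertainty.
print(entropy([0.5, 0.5]))   # 1.0
# A biased coin carries less: uncertainty drops as outcomes become predictable.
print(entropy([0.9, 0.1]))   # ≈ 0.469
# Uniform over 4 outcomes hits the maximum, log2(4) = 2 bits.
print(entropy([0.25] * 4))   # 2.0
```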
Huffman encoding builds an optimal prefix code by repeatedly merging the two least likely symbols into one node; the expected codeword length $L$ satisfies $H \le L < H + 1$.
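A minimal sketch of the merging procedure, using Python's `heapq` to always pull the two least probable nodes (the function name and the example alphabet are assumptions, not from the notes):

```python
import heapq
from itertools import count

def huffman_code(freqs):
    """Build a prefix code by repeatedly merging the two least probable nodes.
    freqs: dict mapping symbol -> probability (or count)."""
    tiebreak = count()  # keeps heap comparisons away from unorderable dicts
    heap = [(p, next(tiebreak), {sym: ""}) for sym, p in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)
        p2, _, c2 = heapq.heappop(heap)
        # Prepend a distinguishing bit to every codeword in each subtree.
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (p1 + p2, next(tiebreak), merged))
    return heap[0][2]

code = huffman_code({"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125})
# With dyadic probabilities, codeword lengths match -log2(p) exactly:
# "a" gets 1 bit, "b" gets 2, "c" and "d" get 3, so L = H here.
```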
4 axioms for entropy (Shannon's characterization):
- Continuity (if I only change the probabilities a little, the information of the process should change only a little).
- Symmetry (if I reorder the list of probabilities I gave you, you should get the same answer).
- Condition of Maximum Information: $H$ is at its maximum value when all the $p_i$ are equal.
- Coarse-Graining: if two outcomes $b$ and $c$ are merged into a single group $G$, then $$H(X) = H(X') + p_{bc}H(G)$$ where $X'$ is the coarse-grained variable, $p_{bc} = p_b + p_c$, and $H(G)$ is the uncertainty of the choice within the group $G$ containing $b$ and $c$.
Quantifying coding failure using KL-divergence: $$KL(p\|q)=\sum_{i=1}^N p(x_i)\log_2\frac{p(x_i)}{q(x_i)}$$ where $p$ is the true distribution of the source and $q$ is the distribution the code was designed for; it is the average number of extra bits per symbol paid for the mismatch.
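A minimal sketch of the divergence as the coding penalty (the example distributions are assumptions, not from the notes):

```python
import math

def kl_divergence(p, q):
    """KL(p || q) in bits: expected extra code length when encoding
    samples drawn from p with a code optimized for q."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.5, 0.25, 0.25]      # true source distribution
q = [1/3, 1/3, 1/3]        # distribution the code assumed

print(kl_divergence(p, p))  # 0.0 — no penalty when the model is exact
print(kl_divergence(p, q))  # > 0 — extra bits per symbol from the mismatch
```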
Conditional entropy of $X$ given $Y$: $$H(X|Y) = -\sum_{x,y} p(x,y)\log_2 p(x|y)$$ the uncertainty remaining in $X$ once $Y$ is known; $H(X|Y) \le H(X)$, with equality iff $X$ and $Y$ are independent.
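The definition can be checked numerically at the two extremes — $Y$ determines $X$, and $Y$ is independent of $X$ (the joint distributions below are mine, for illustration only):

```python
import math

def conditional_entropy(joint):
    """H(X|Y) = -sum_{x,y} p(x,y) * log2 p(x|y); joint is a dict {(x, y): p}."""
    p_y = {}
    for (x, y), p in joint.items():
        p_y[y] = p_y.get(y, 0.0) + p
    return -sum(p * math.log2(p / p_y[y])
                for (x, y), p in joint.items() if p > 0)

# X and Y perfectly correlated: knowing Y removes all uncertainty about X.
joint_dep = {(0, 0): 0.5, (1, 1): 0.5}
print(conditional_entropy(joint_dep))   # 0.0

# X and Y independent fair bits: knowing Y tells us nothing, H(X|Y) = H(X) = 1.
joint_ind = {(x, y): 0.25 for x in (0, 1) for y in (0, 1)}
print(conditional_entropy(joint_ind))   # 1.0
```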