I've been playing with calculating the entropy of a toy system used to illustrate the connection between "disorder" and entropy. (See Entropy 101 and Entropy 102.) I needed to calculate the minimum number of moves required to sort a disordered collection, but that turns out to be an NP problem (no doubt related to Traveling Salesman). The illustration bridges Boltzmann and Shannon entropy, which got me playing around with the latter, and that led to some practical results. This article describes the code I used, mainly as an excuse to try embedding my own colorized source code blocks.

I'm not going to get into entropy here (I've posted about it plenty elsewhere). What matters is that this code involves Shannon entropy, not Boltzmann entropy. They're closely related, but they involve quite different domains and enough differences to make calling them the same thing ever so slightly questionable.

Regardless, rather than Boltzmann's famous thermodynamic formula:

S = k·log(W)

this code implements Shannon's:

H(X) = -Σ P(x)·log2( P(x) )

Which gives an answer in binary bits because of the log2 function. Using the log10 function gives a result in terms of decimal digits. The symbol density determines the log base (or vice versa).

Implementing the equation is very simple:

```python
01| from math import log
02| def shannon_entropy ( NP, PF, logbase = 2 ) :
03|     """Calculate the Shannon Entropy given PF(x)."""
04|     # lambda N, P : -sum( P( x ) * log( P( x ), logbase ) for x in range( N ) )
05|
06|     probs = [ PF( x ) for x in range( NP ) ]
07|     tprob = sum( probs )
08|
09|     terms = [ p * log( p, logbase ) for p in probs ]
10|     ent = -sum( terms )
11|     return ent, probs, tprob, terms
```

As the comment on line #4 shows, the whole thing could be a one-liner, and that would suffice if I didn't also want the intermediate data. But I do, so first I generate a list of all the probabilities for each system micro-state (line #6). As a safety check, in the next line, I also generate a sum of those probabilities (something the one-liner can't do). The expected sum is 1.000 (give or take a floating-point error). Then I generate a list of summation terms (line #9), sum them to derive the entropy (line #10), and return the entropy, the total probability, and both lists. Most applications will only care about the entropy value, but the other data is available if wanted.
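The first demo simply drives shannon_entropy with every pattern equally likely. Here's a minimal sketch of it; the name demo_equal_probabilities, the argument list, and the print layout are my assumptions, while the NP=27 default and the 1/NP probability function come from the discussion below:

```python
# Sketch only: names and layout are assumptions, not the original listing.
# Uses shannon_entropy from the listing above.
def demo_equal_probabilities ( NP = 27, logbase = 2 ) :
    """Shannon entropy when all NP patterns are equally likely."""
    probs = lambda x : 1.0 / NP   # every pattern gets the same probability
    ent, _, tprob, _ = shannon_entropy ( NP, probs, logbase )
    print ( 'NP=%d' % NP )
    print ( ent )                 # ~4.755 bits for NP=27, logbase=2
    print ( tprob )               # should be 1.000, give or take

demo_equal_probabilities ()
```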
When run with the 27-pattern default, the demo prints:

4.754887502163471

Telling us the entropy is ~4.755 bits and, as expected, the probabilities do sum to 1.000, give or take a floating-point error. That's because the probability function (line #15) assigns equal probability (1/NP) to all possible patterns. When all pattern probabilities are the same, the entropy is maximum, and the same as the number of bits required to express all possible patterns. (If we had 32 patterns, the max entropy would be exactly 5.000, because five bits can express exactly 32 patterns.)

Note that if we change logbase to 10, then the demo prints:

1.4313637641589871

Telling us the same 27-pattern system, when all patterns are equally likely, has an entropy of ~1.43 decimal digits. Setting logbase to other values results in different entropy values. The entropy just says how many symbols are required to express NP many values.

In a way, that's it; everything else is an application of that calculation. But there are some interesting things we can do to explore its consequences.

Imagine we have a "tea" variable that can specify one of 27 types of tea. (Imagine a tiny dial with a pointer.) As just shown, when all 27 teas are equally likely, the dial could point to any of them, and we say the variable has maximum entropy (of ~4.755 if we mean bits). This effectively means the variable is random: it has maximum surprise.

Things get more interesting when we vary the pattern probabilities. A very simple extension assigns a high probability to one pattern (tea) and a low probability to all others. We're saying the tea dial is very likely to point to some given tea, and very unlikely to point to any other. In this simple case we'll assign the same low probability to all other possible teas. Here's some code:

```python
03| def demo_shannon_entropy ( NB = 5, p_of_x = None ) :
04|     """Shannon entropy for a specific P(X)."""
05|     NP = 2 ** NB
06|     PF = ( 1.0 / NP ) if p_of_x is None else p_of_x
07|     PX = 0    # the favored pattern (index assumed; any fixed index works)
08|     probs = lambda x : PF if x == PX else ( ( 1.0 - PF ) / ( NP - 1 ) )
09|     ent, _, tprob, _ = shannon_entropy ( NP, probs )
10|
11|     print ( 'NB=%d, NP=%d' % ( NB, NP ) )
12|     print ( ent )
13|     print ( tprob )
```

The sum of all probabilities must be 1.000, so the probability function must be normalized. That's why the example above used 1/NP for each pattern's probability. The sum is NP×(1/NP), which is obviously one. Here, normalization is ever so slightly trickier. Firstly, the demo function takes NB (number of bits), rather than NP (number of patterns).
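Even so, the skewed probability function on line #8 stays normalized by construction: the favored tea contributes PF, and the other NP-1 teas together contribute (NP-1)×(1-PF)/(NP-1) = 1-PF, so the total is PF + (1-PF) = 1.000 for any PF between 0 and 1. A quick check, using an arbitrary example value of 0.9 for PF (the value is mine, not from the demo):

```python
# Normalization check for the skewed distribution; PF = 0.9 is an
# arbitrary example value. Uses shannon_entropy from the listing above.
NP, PF, PX = 32, 0.9, 0
probs = lambda x : PF if x == PX else ( ( 1.0 - PF ) / ( NP - 1 ) )
ent, _, tprob, _ = shannon_entropy ( NP, probs )
print ( tprob )   # 1.000, give or take a floating-point error
print ( ent )     # well below the 5.000-bit maximum for 32 patterns
```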