Shannon Entropy

Information is uncertainty, surprise, difficulty, and entropy.

—James Gleick

Everything tends toward entropy. Entropy means disorder and chaos; it is the final state toward which everything drifts. Sooner or later every system ends up in a state of maximum entropy: totally random, its remaining energy completely dissipated, unable to produce any meaningful work anymore. In short, it’s pretty close to our everyday understanding of chaos. Entropy is a concept used in many fields, such as physics, information theory, and thermodynamics, and it always refers to a measure of disorder or randomness in a system.

Shannon Entropy 1/27 s, f/1.5, ISO 40, 26 mm, iPhone 14

In 1865 Rudolf Clausius first introduced the concept of entropy. He wanted to measure the unavailability of energy in a system, its uselessness for work. He formed the word entropy from the Greek for “transformation”, describing the quantity as a body’s “transformation content”.

But there’s more to it than that. In information theory, entropy actually means information: it represents the uncertainty or surprise associated with a random piece of information. The more disorder, the more information; the more order, the less entropy and the less information. That sounds almost like the opposite of what we were just talking about, right? More disorder actually means more information? To make sense of this, we need to introduce one additional concept: uncertainty.

Shannon Entropy 1/26 s, f/1.5, ISO 50, 26 mm, iPhone 14

Uncertainty in information theory can be seen as a measure of the amount of information a message contains. The more uncertain a message is, the less predictable it is, and the more surprise, that is, information, it carries. In other words, if you already know what a message is about after reading just the first couple of words, you might as well skip the rest, because it holds no additional information. It’s totally predictable, it doesn’t present much new information to you, and its entropy is correspondingly low.
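This link between predictability and information can be made concrete with Shannon’s formula, H = Σ p · log₂(1/p), applied to character frequencies. The following is just an illustrative sketch (the function name and toy strings are my own, not anything from Shannon’s paper):

```python
import math
from collections import Counter

def shannon_entropy(text: str) -> float:
    """Shannon entropy in bits per character, estimated from the
    empirical character frequencies of the given text."""
    counts = Counter(text)
    total = len(text)
    # H = sum over symbols of p * log2(1/p)
    return sum((n / total) * math.log2(total / n) for n in counts.values())

# A perfectly predictable message carries no surprise at all ...
print(shannon_entropy("aaaaaaaaaa"))   # 0.0 bits per character
# ... while one where every character is different carries much more.
print(shannon_entropy("abcdefghij"))   # ≈ 3.32 bits per character
```

The repeated letter yields zero bits because each new character is certain in advance; the ten distinct letters yield log₂(10) bits, the maximum for a ten-symbol alphabet.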

Shannon Entropy 1/25 s, f/1.5, ISO 64, 26 mm, iPhone 14

In 1948 Claude Shannon, an American mathematician, computer scientist, and cryptographer, later called the “father of information theory”, published his famous paper “A Mathematical Theory of Communication”. Since then his understanding of the term entropy has been referred to as “information entropy” or “Shannon entropy”. At its core it’s all about how information, in this case a message, is transferred from a sender through a certain channel to a receiver. The question is how much of the original message needs to arrive at the other end for it to still be decipherable, because, as we all know, there can be many problems with noise, encoding, and compression along the way.

Shannon Entropy 1/17 s, f/1.5, ISO 64, 26 mm, iPhone 14

Shannon’s analysis of the English language is fascinating, especially his findings on how much redundancy this particular language has compared to others. Redundancy here means the amount of overlap every English letter or word carries so that a message stays comprehensible even under difficult conditions like noise, a bad connection or channel, or a loud environment. He also calculated the entropy of the English language to be about 2.6 bits per letter, on average. That means you need about 2.6 “yes” or “no” questions to correctly guess a certain letter in its given context. That’s not that much, if you think about it. So you could indeed argue that English is actually quite redundant: you’d still be able to understand an English message quite well even under bad conditions.
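The “yes/no questions” picture can be sketched directly: each question that halves the set of possible letters is worth one bit. Without any context, a letter drawn uniformly from 26 possibilities costs log₂(26) ≈ 4.7 bits; context is what pushes the average down toward Shannon’s much lower figure. A toy guessing game, under that uniform (context-free) assumption:

```python
import math

# A letter chosen uniformly from 26 carries log2(26) ≈ 4.7 bits,
# i.e. roughly five yes/no questions in the worst case.
print(math.log2(26))

def questions_needed(candidates, target):
    """Guess `target` by repeatedly halving the candidate list with
    yes/no questions of the form 'is it in the first half?'."""
    count = 0
    while len(candidates) > 1:
        mid = len(candidates) // 2
        first_half = candidates[:mid]
        count += 1
        candidates = first_half if target in first_half else candidates[mid:]
    return count

alphabet = [chr(c) for c in range(ord("a"), ord("z") + 1)]
print(questions_needed(alphabet, "z"))  # 5 questions in the worst case
```

Shannon’s 2.6-bit estimate says that, once context is taken into account, you can do noticeably better than this blind halving.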

So if you want to send a cheaper, faster, more efficient, more condensed message to a close friend, you’d probably send a message with higher entropy. You might skip some, if not all, vowels, you might skip whole words, and you probably won’t write in full sentences either. A few emojis with their highly compressed meanings might not hurt as well. On the other hand, if you write an important message to your boss where every bit counts, you’d send a message with more redundancy, with every word properly spelled out, just to make completely sure that your message comes across properly. Such a message includes less information per character, and therefore less entropy.
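A rough way to see this redundancy-versus-density trade-off in practice is to use a general-purpose compressor as a proxy: redundant text shrinks dramatically, while data that is already dense (here, random bytes standing in for a maximally “surprising” message) barely shrinks at all. A hedged sketch using Python’s zlib, with made-up sample data:

```python
import os
import zlib

# A very redundant "message": the same English phrase over and over.
redundant = b"the quick brown fox jumps over the lazy dog " * 25
# A maximally dense stand-in: random bytes offer no patterns to exploit.
dense = os.urandom(len(redundant))

compressed_redundant = zlib.compress(redundant)
compressed_dense = zlib.compress(dense)

# The redundancy compresses away to a small fraction of the original;
# the dense data stays essentially the same size.
print(len(redundant), "->", len(compressed_redundant))
print(len(dense), "->", len(compressed_dense))
```

In Shannon’s terms, the compressor strips out exactly the part of the message that was predictable, leaving only the information.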


· photography, science