This mathematical tool could improve how fast information is shared

Apr 4, 2024

Researchers develop "relative attention entropy" to optimize data transmission, aiding AI learning and communication systems.

In an increasingly interconnected world, handling and transmitting vast amounts of data accurately and efficiently is crucial. However, this task is fraught with challenges, including bandwidth limitations, potential data loss, and the need to use resources effectively.

To tackle these issues, researchers have developed a new mathematical tool called relative attention entropy, which optimizes how messages are composed so that they convey as much relevant information as possible within these constraints.

“We reveal hidden structures of communication and formalize them mathematically,” said Torsten Enßlin, head of the Information Field Theory Group at the Max Planck Institute for Astrophysics and lead author of the study, in an email. “What happens is that the focus is put on essential possible situations. Mathematically, there is a space of possible states of the world (or parts of it) and an emphasis is given to those states that are more relevant, such that their probability is more accurately communicated than less relevant ones.”

The approach promises to enhance the reliability and effectiveness of data transmission systems. Naturally, the application of this formulation extends to training artificial intelligence (AI), potentially accelerating the learning process by enabling algorithms to discern which pieces of information are most pertinent.

Building on relative entropy

Scientists earlier introduced the concept of “relative entropy” (distinct from the relative attention entropy explored in the current study) to help both senders and receivers make better use of the available capacity for transmitting data, such as bandwidth, network infrastructure, or other communication channels. This concept, also known as the Kullback–Leibler divergence, allows for a precise calculation of the best message to send to the receiver, which ultimately helps the recipient make more informed decisions based on the transmitted information.

“In a nutshell, relative entropy measures distances between the knowledge of communication partners and can be used to design optimal communication acts between those. It states how much information is lost in the transfer of knowledge,” said Enßlin. “If Alice calculates the expected relative entropy for each of the possible messages she could send to Bob and then sends the message with the lowest relative entropy, she would inform him to the best of her abilities.”
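As a rough illustration of this idea (with hypothetical numbers and helper names, not the paper's formalism), suppose Alice's knowledge is a probability distribution over a few possible weather states, and each candidate message would leave Bob with some belief distribution. The Kullback–Leibler divergence then scores how much information each message loses, and Alice sends the one with the lowest score:

```python
import numpy as np

def relative_entropy(p, q):
    """Kullback-Leibler divergence D(p || q) in bits: the information
    lost when the distribution q is used in place of p."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    mask = p > 0  # states with p[i] == 0 contribute nothing
    return float(np.sum(p[mask] * np.log2(p[mask] / q[mask])))

# Alice's knowledge: probabilities of sun, clouds, and heavy rain
# (hypothetical numbers, purely for illustration).
alice = [0.70, 0.25, 0.05]

# The belief Bob would hold after receiving each candidate message.
messages = {
    "mostly sunny":       [0.80, 0.15, 0.05],
    "changeable, cloudy": [0.40, 0.50, 0.10],
    "rain very unlikely": [0.60, 0.39, 0.01],
}

for text, bob in messages.items():
    print(f"{text!r}: D = {relative_entropy(alice, bob):.3f} bits")

# Alice sends the message that loses the least information.
best = min(messages, key=lambda m: relative_entropy(alice, messages[m]))
print("Best message:", best)
```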

However, no matter how useful and widely utilized relative entropy has been, it still has its drawbacks. “Relative entropy measures how many bits are lost in a communication. However, it does not quantify whether these bits were important or not,” Enßlin explained.

“For example, if Alice sends Bob a weather forecast, some aspects of the weather are more important than others. Informing about the possibility of dangerous weather conditions (like heavy rain) should get more weight than informing about details of less severe possibilities,” he added.

To solve this, Enßlin and his colleagues, Carolin Weidinger and Philipp Frank, refined the concept of relative entropy, taking into consideration the varying importance of different pieces of information, and introducing the tool they called relative attention entropy. In simpler terms, they developed a method to create messages that highlight the most important parts of the data while ensuring the accuracy of the information remains intact.

“Attention functions are introduced […] that allow for a weighting of the communicated possibilities for their relevance to the receiver,” Enßlin explained. “If the sender’s full information can be transferred accurately to the receiver, relative attention entropy will request that this happens. Otherwise, if a precise communication is impossible, it ensures that the more relevant aspects of the sender’s knowledge are preferentially communicated.”
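The article does not reproduce the paper's exact definition, but the mechanism can be sketched by scaling each state's contribution to the relative entropy with an attention weight. In the illustrative weighting below (an assumption for this sketch, not the published formula), a message that understates the heavy-rain risk scores far worse than one that understates the sunshine, even though it loses fewer bits overall:

```python
import numpy as np

def attention_entropy(p, q, a):
    """Attention-weighted divergence: each state's log-loss term is
    scaled by an attention weight a[i]. An illustrative sketch of the
    idea, not the definition from the paper."""
    p, q, a = (np.asarray(x, dtype=float) for x in (p, q, a))
    mask = p > 0
    return float(np.sum(a[mask] * p[mask] * np.log2(p[mask] / q[mask])))

alice     = [0.70, 0.25, 0.05]   # sun, clouds, heavy rain
attention = [1.0, 1.0, 10.0]     # the dangerous outcome matters most

# Two candidate messages: one understates the rain risk, one the sunshine.
downplays_rain = [0.71, 0.27, 0.02]
downplays_sun  = [0.60, 0.35, 0.05]

for name, bob in [("downplays rain", downplays_rain),
                  ("downplays sun",  downplays_sun)]:
    plain    = attention_entropy(alice, bob, [1.0, 1.0, 1.0])
    weighted = attention_entropy(alice, bob, attention)
    print(f"{name}: plain D = {plain:.3f} bits, "
          f"attention-weighted = {weighted:.3f}")
```

With uniform attention, the rain-downplaying message looks slightly better (about 0.024 bits lost versus 0.034), but the attention-weighted score flips the ranking decisively, which is exactly the behavior the weather example calls for.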

Applications in AI and beyond

The team believes that although their study is purely mathematical, their modification to relative entropy could be implemented in practice in the near future. In particular, they envision integrating relative attention entropy into deep learning algorithms that handle vast datasets.

“Relative attention entropy can be directly incorporated into existing AI systems that use relative entropy,” said Enßlin. This advancement is anticipated to bolster the algorithms’ capability to discern critical information within the data, thereby potentially accelerating the learning process and improving the accuracy of results generated by neural networks.

Moreover, the researchers anticipate that beyond aiding in the development of more sophisticated AI systems, this concept could also be applied in fields such as biology.

“In biological systems, which perform information processing on all levels, from brains down to the cellular apparatus, computation under resource constraints is the norm, and attention mechanisms are at work in order to ensure good performance,” concluded Enßlin. “Relative attention entropy might therefore be a theoretical tool that aids our understanding of how these systems were shaped in their biological evolution.”

Reference: T. Enßlin, C. Weidinger, and P. Frank, Attention to Entropic Communication, Annalen der Physik. DOI: 10.1002/andp.202300334

Feature image credit: Rene Böhmer on Unsplash
