Semantic information theory in 20 minutes


    1. Towards a Theory of Semantic Communication
       Jie Bao, Prithwish Basu, Mike Dean, Craig Partridge, Ananthram Swami, Will Leland and Jim Hendler
       RPI, Raytheon BBN, and ARL
       IEEE Network Science Workshop 2011, West Point, June 23rd, 2011
    2. Outline
       - Background
       - A general semantic communication model
       - Semantic data compression (source coding)
       - Semantic reliable communication (channel coding)
       - Path ahead
    3. Shannon, 1948
       - "The fundamental problem of communication is that of reproducing at one point either exactly or approximately a message selected at another point. Frequently the messages have meaning; ... These semantic aspects of communication are irrelevant to the engineering problem."
       - Claude E. Shannon. A mathematical theory of communication. Bell System Technical Journal, 27:379-423, 623-656, 1948.
       [Figure: Shannon's block diagram, message -> signal -> signal -> message]
    4. However, are these just bits?
       - Movie streams
       - Software codes
       - DNA sequences
       - Emails
       - Tweets
       - ...
       "The fundamental problem of communication is that of reproducing at one point either exactly or approximately a message selected at another point. Frequently the messages have meaning; ..."
       "These semantic aspects of communication are irrelevant to the engineering problem"?
    5. Our Contributions
       - We develop a generic model of semantic communication, extending the classic model-theoretical work of (Carnap and Bar-Hillel 1952);
       - We discuss the role of semantics in reducing source redundancy, and potential approaches for lossless and lossy semantic data compression;
       - We define the notions of semantic noise and semantic channel, and obtain the semantic capacity of a channel.
    6. Shannon, 1948: From IT to SIT
       [Figure: the classical Shannon model (message -> signal -> signal -> message) extended with a semantic channel carrying expressed messages (e.g., commands and reports), contrasting (classical) information theory with semantic information theory.]
    7. A 3-level Model (adapted from Weaver)
       [Figure: three levels of communication. A (Technical): source -> transmitter -> physical channel (with technical noise) -> receiver -> destination, carrying the technical message. B (Semantic): a semantic transmitter and semantic receiver exchange the expressed message derived from the intended message, each using local knowledge plus shared knowledge, subject to semantic noise. C (Effectiveness): context, utility, trust, etc.]
    8. A Semantic Communication Model
       [Figure: the sender observes the world and has a world model W_s, background knowledge K_s, an inference procedure I_s, and a message generator M_s producing messages {m} in a message syntax M; the receiver has W_r, K_r, I_r, and a message interpreter M_r; a feedback path (?) runs from receiver to sender.]
    9. Semantic Sources
       - Key: a semantic source tells something that is "true"
         - Engineering bits are neither true nor false!
       - Goals: 1) more correctness (sent as "true" -> received as "true"); 2) less ambiguity
    10. Which message is more "surprising"?
        - "Rex is not a tyrannosaurus"
        - "Rex is not a dog"
    11. Semantics of Messages
        - If a message (an expression) is more commonly true, it contains less semantic information
          - inf(Sunny & Cold) > inf(Cold)
          - inf(Cold) > inf(Cold or Warm)
        - Shannon information: how often a message appears; semantic information: how often a message is true
    12. Semantics of Messages
        - Carnap & Bar-Hillel (1952), "An outline of a theory of semantic information"
          - m(exp) = |mod(exp)| / |all models|
          - inf(exp) = -log m(exp)
        - Example (two propositional variables A and B, hence four models)
          - m(A v B) = 3/4, m(A ^ B) = 1/4
          - inf(A v B) = 0.415, inf(A ^ B) = 2
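       The example can be checked with a small model-counting sketch (Python; the helper names are mine, not part of the deck or the linked calculator): it enumerates all truth assignments over A and B and recovers the numbers above.

           from itertools import product
           from math import log2

           def all_models(num_vars):
               # Every truth assignment over num_vars propositional variables.
               return list(product([False, True], repeat=num_vars))

           def m(expr, num_vars):
               # m(exp): fraction of all models in which the expression holds.
               worlds = all_models(num_vars)
               return sum(1 for w in worlds if expr(*w)) / len(worlds)

           def inf(expr, num_vars):
               # inf(exp) = -log2 m(exp), in bits.
               return -log2(m(expr, num_vars))

           print(m(lambda a, b: a or b, 2), inf(lambda a, b: a or b, 2))    # 0.75, ~0.415
           print(m(lambda a, b: a and b, 2), inf(lambda a, b: a and b, 2))  # 0.25, 2.0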
    13. Knowledge Entropy
        - Extending Carnap & Bar-Hillel (1952)
          - Models have a distribution
          - Background knowledge may be present
        - Example: Weekend = 2/7, Saturday = 1/7
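       Read this way (my interpretation of the Weekend/Saturday figures as statements over seven equally likely days), the measure becomes the total probability of the worlds in which a message is true; a short Python sketch:

           from math import log2

           days = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
           p = {d: 1/7 for d in days}  # a uniform distribution over seven possible worlds

           def m(satisfying_worlds):
               # Total probability of the worlds in which the message holds.
               return sum(p[w] for w in satisfying_worlds)

           def inf(satisfying_worlds):
               return -log2(m(satisfying_worlds))

           print(m({"Sat", "Sun"}), inf({"Sat", "Sun"}))  # 2/7, ~1.81 bits ("it is a weekend day")
           print(m({"Sat"}), inf({"Sat"}))                # 1/7, ~2.81 bits ("it is Saturday")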
    14. Semantic Information Calculator
        - http://www.cs.rpi.edu/~baojie/sit/index.php
    15. Semantic Information and Coding
        - Data compression (source coding)
        - Reliable communication (channel coding)
    16. Compression with Shared Knowledge
        - Background knowledge (A -> B), when shared, helps compress the source
          - Side information in the form of entailment
          - Not addressed by classical information theory
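       A minimal illustration (Python; my own example, not from the deck) of how the shared entailment A -> B shrinks the space of possible worlds and hence the bits needed to identify the true one:

           from itertools import product
           from math import log2

           worlds = list(product([False, True], repeat=2))             # (A, B): 4 worlds
           consistent = [(a, b) for (a, b) in worlds if (not a) or b]  # worlds where A -> B holds

           # Without shared knowledge, pinning down one of 4 equally likely worlds
           # costs log2(4) = 2 bits; once both sides know A -> B, only 3 remain.
           print(log2(len(worlds)))      # 2.0 bits
           print(log2(len(consistent)))  # ~1.585 bits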
    17. Lossless Message Compression
        - Intuition: compress by removing synonyms
          - "pig" = "swine"
          - a -> (a ^ b) v (b ^ c) is equivalent to a -> b
        - Theorem: there is a semantically lossless code for source X with message entropy H >= H(X_eq)
          - X_eq are the equivalence classes of X
        - Other lossless compression strategies may exist
          - e.g., by using semantic ambiguity
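       A sketch of why coding over equivalence classes can only help (Python; the message set and probabilities are invented for illustration): pooling the probabilities of synonymous messages lowers the entropy, so H(X_eq) <= H(X).

           from math import log2

           def H(dist):
               return -sum(p * log2(p) for p in dist.values() if p > 0)

           # Four surface messages; "pig" and "swine" mean the same thing.
           px = {"pig": 0.25, "swine": 0.25, "dog": 0.3, "cat": 0.2}
           # Equivalence classes: synonymous messages are merged.
           px_eq = {"pig/swine": 0.5, "dog": 0.3, "cat": 0.2}

           print(H(px))     # ~1.985 bits
           print(H(px_eq))  # ~1.485 bits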
    18. Other Source Coding Strategies
        - Lossless model compression
          - e.g., using minimal models
        - Lossy message compression
          - Sometimes a semantic loss is an intentional compression
            - "How about having lunch at 1pm? See you soon!" => "lunch@1? C U"
            - A textual description of an image
        - Ongoing work
    19. Semantic Errors and Noises
        - Examples
          - From engineering noise: "copy machine" or "coffee machine"?
          - Semantic mismatch: the source and receiver use different background knowledge or inference
          - Lost in translation: the word "uncle" in English has no exact correspondence in Chinese
        - Note
          - Not all syntactical errors are semantic!
          - Nor are all semantic errors syntactical.
        [Figure: model W -> sent message X -> (noise) -> received message Y -> model W'; goal: W' |= x]
    20. Semantic Channel Coding Theorem
        - Theorem: if the transmission rate is smaller than C_s (the semantic channel capacity), a semantic-error-free coding exists
        - C_s combines the engineering channel capacity, the encoder's semantic ambiguity, and the decoder's "smartness"
        [Figure: model W -> sent message X -> received message Y -> model W'; goal: W' |= x]
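       The three annotated quantities suggest a capacity expression of the following form (a hedged reconstruction from the labels on this slide, not a verbatim quote; the precise statement and proof are in the linked tech report):

           C_s = \sup_{P(X \mid W)} \{ I(X;Y) - H(W \mid X) + \overline{H_s(Y)} \}

       where I(X;Y) is the engineering channel-capacity term, H(W|X) measures the encoder's semantic ambiguity, and the average semantic information of received messages reflects the decoder's "smartness".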
    21. Path Ahead
        - Extensions
          - First-order languages [probabilistic logics]
          - Inconsistent KBs (misinformation) [paraconsistent logics]
          - Lossy source coding [clustering and similarity measurement]
          - Semantic mismatches
        - Applications
          - Semantic compression for RDF/OWL
          - Semantic retrieval, e.g., extending TF*IDF
    22. Questions?
        - Slides: http://www.slideshare.net/baojie_iowa
        - Tech report: http://www.cs.rpi.edu/~baojie/pub/2011-03-28_nsw_tr.pdf
        - Contact: baojie@gmail.com
        Image courtesy: http://www.addletters.com/pictures/bart-simpson-generator/900788.htm
    23. Backup
    24. Measuring Semantic Information
        - Statistical approach: inference may change the distribution of symbols, and hence the entropy of the source.
        - Model-theoretical approach (our approach): the less "likely" a message is to be true, the more information it contains.
        - Algorithmic approach: what is the minimal program needed to describe messages and their deductions?
        - Situation-theoretical approach: measuring the divergence of messages from "truth".
    25. Shannon: Information = "surprise"
        - H(tyrannosaurus) > H(dog)
        - Word frequencies captured from: http://www.wordcount.org/main.php
    26. Model Semantics
        - tyrannosaurus
        - dog
        [Figure: the sets of models (possible worlds) for the two terms]
    27. Conditional Knowledge Entropy
        - When there is background knowledge, the set of possible worlds decreases.
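       A small sketch of the effect (Python; reusing the days-of-the-week reading of slide 13, which is my own illustration): the entropy is computed over the worlds that remain consistent with the background knowledge.

           from math import log2

           def H(dist):
               return -sum(p * log2(p) for p in dist.values() if p > 0)

           days = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
           prior = {d: 1/7 for d in days}
           print(H(prior))  # ~2.807 bits: knowledge entropy with no background knowledge

           # Background knowledge "it is a weekend day" leaves two worlds; renormalize.
           posterior = {d: 1/2 for d in ["Sat", "Sun"]}
           print(H(posterior))  # 1.0 bit: conditional knowledge entropy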
    28. Semantic Noise and Channel Coding
        [Figure: W -> X -> Y -> W' example in which "coffee machine", "copy machine", and "Xerox" are confused across the channel, with transition probabilities 0.9, 0.1, and 1.0.]
        Scenario developed based on reports in http://english.visitkorea.or.kr/enu/AK/AK_EN_1_6_8_5.jsp and http://blog.cleveland.com/metro/2011/03/identifying_photocopy_machine.html
    29. Compressing by Semantic Ambiguity
        [Figure: two weather-status sources]
        - Status in {Sunny, Light Rain, Heavy Rain} with probabilities 0.5, 0.25, 0.25: H(X) = 1.5
        - Status in {Sunny, Rain, Light Rain, Heavy Rain} with probabilities 0.5, 0.2, 0.2, 0.1: H(X') = 1.76
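       The two entropies check out numerically (Python; the pairing of probabilities to status sets is my reconstruction of the flattened figure, verified by the computation below):

           from math import log2

           def H(probs):
               return -sum(p * log2(p) for p in probs if p > 0)

           print(H([0.5, 0.25, 0.25]))     # 1.5   -> H(X):  {Sunny, Light Rain, Heavy Rain}
           print(H([0.5, 0.2, 0.2, 0.1]))  # ~1.76 -> H(X'): {Sunny, Rain, Light Rain, Heavy Rain}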
