Workshop #1
semantica
Part 1 WARM UP
(10 min)
 Exercise
 Ask for a volunteer
 Inside the box, ask the volunteer to look for 1 specific item
 Blindfold the volunteer and ask her to find similar items, retrieve them and put them in
different places depending on their similarity
Part 1 WARM UP - EXPLANATION
 What does it all mean?
 What you have just experienced is the problem that your computer faces:
 If you ask it to “find” the item that you need, it will do so: “find” actually
means “match what I give you with what you have in your db”
 What your computer is not able to do is to put similar things
together or to separate the different ones
 Or rather, it’s not able to make new categories which include similar items
 In other words, topics ;)
But why isn’t your computer able to make up such new categories?
The answer is pretty straightforward… Because it does not know what those objects are and “mean”
Everything your computer sees in a text is a series of characters. So, in a sentence like
“Roberto is having great fun in this workshop!”,
the thing that your computer sees is actually just…
“Xxxxxx zv gdatdin dhdp3 axnwbx sdxn hwbxwbx xbwxhbjwx!”
That is, it just does not know what those sequences of chars stand for.
And the only things that it can put together are “similar shapes”
That’s why you need to tag...
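To make the point concrete, here is a minimal sketch of the difference between "find" and "group". The item names and tags are made up for illustration: exact matching works on shapes alone, while grouping only works once someone has supplied the meaning as tags.

```python
# Hypothetical mini "db": item names and tags are invented for this example.
items = ["red apple", "green apple", "banana", "hammer"]


def find(query, db):
    """Exact matching: return only the entries identical to the query."""
    return [entry for entry in db if entry == query]


# Grouping is only possible once a human has supplied the meaning as tags.
tags = {
    "red apple": "fruit",
    "green apple": "fruit",
    "banana": "fruit",
    "hammer": "tool",
}


def group_by_tag(db, tags):
    """Put entries that share a tag into the same category."""
    groups = {}
    for entry in db:
        groups.setdefault(tags[entry], []).append(entry)
    return groups


print(find("banana", items))  # ['banana']
print(find("fruit", items))   # [] : no entry has exactly this shape
print(group_by_tag(items, tags))
```

Without the `tags` table, the computer has no way to know that an apple and a banana belong together.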
PART 2
WHAT IS “SEMANTICS”
AND WHAT IS IT FOR A COMPUTER?
THERE ARE MANY TYPES OF MEANING
Semantics is meaning. But first of all, let’s broaden the
meaning of what “meaning” means :P
Actually, we had better speak in the plural, i.e. meanings.
There are several types of meaning, and each one
depends on the purpose of the communication (or rather, of the
communicative action)
Some examples of types of meaning could be:
 Referents, labels, relations
 Events
 Text cohesion
 Units of analysis
 User intentions
 Context: implications and consequences
Homo Sapiens (or just people, ehe) can understand all these types of meanings, and many more.
BUT WHAT CAN A COMPUTER UNDERSTAND?
A lot, for sure. But a computer does not have (yet) the knowledge about the
world that a human being has gained since she was born. By “knowledge of
the world” we mean all the information acquired in many ways,
from biological perception to cultural education and social habits.
If this all sounds far away from your need to make a tool work, think of it like this:
there are so many implied and underlying meanings in the text your tool is
processing that it just does not know anything about. That’s why you need
to fill the gaps in its knowledge.
REFERENTS, LABELS AND RELATIONS
A REFERENT is the OBJECT: a person, a company, an event (oh, btw, these are also the
SmartThemes in TalkWalker!).
A Referent can carry more than just one LABEL: a name is a label for a person, or a
company, or even an event (e.g. a title for a conference). For example, the curly guy writing
these slides is named Roberto. And he’s also a Digital Analyst. And he’s also the guy travelling
from and to Bergamo every day. These are all ways (labels) that could be used ALTERNATIVELY to refer to the
same Referent (object).
This is important to your computer (and you behind it; btw, why aren’t you sitting in the front?)
because the writer of a text could refer to the same object in many ways, and you could miss
some results because you didn’t set up those keywords in your query.
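A minimal sketch of how labels could be collected per referent and turned into a query. The label table, the referent ID and the `OR` syntax are assumptions made for illustration; check the query syntax of your own tool.

```python
# Hypothetical label table: every label points to one referent ID.
LABELS = {
    "roberto": "person:roberto",
    "the digital analyst": "person:roberto",
    "the guy travelling from and to bergamo": "person:roberto",
}


def labels_for(referent, labels=LABELS):
    """All the labels that a writer might use for one referent."""
    return sorted(label for label, ref in labels.items() if ref == referent)


def build_query(referent):
    """Join the labels with OR (assumed syntax) so that no label is missed."""
    return " OR ".join(f'"{label}"' for label in labels_for(referent))


print(build_query("person:roberto"))
```

A query built on a single label would miss every text that uses one of the other two.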
REFERENTS, LABELS AND RELATIONS
And then there are SYNONYMS, i.e. words that are similar in meaning and thus sometimes occur in
the same contexts, close to one another. Unfortunately, a computer doesn’t know that these 2
words are synonyms, unless you tell it that they are. Thus, 2 or more words could have a
RELATION of synonymy (or antonymy, to put it veeeery simply) and belong to the same
SEMANTIC AREA.
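A minimal sketch of what "telling" the computer could look like. The semantic areas and word lists below are invented for illustration and are not taken from any real dictionary.

```python
# Hypothetical semantic areas: the word lists are illustrative only.
SEMANTIC_AREAS = {
    "positive_quality": {"nice", "cool", "great"},
    "negative_quality": {"terrible", "awful", "bad"},
}


def semantic_area(word):
    """Return the semantic area a word belongs to, or None if unknown."""
    for area, words in SEMANTIC_AREAS.items():
        if word.lower() in words:
            return area
    return None


def are_synonyms(a, b):
    """Two words count as synonyms here only if they share a known area."""
    area = semantic_area(a)
    return area is not None and area == semantic_area(b)


print(are_synonyms("nice", "cool"))      # True
print(are_synonyms("nice", "terrible"))  # False
print(are_synonyms("nice", "phone"))     # False: "phone" is not in the table
```

Any word missing from the table is simply unknown to the computer, which is exactly the gap described above.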
Referents, labels, relations
Entities
 Have attributes and features and are involved in events
 Could be referred to with different labels
 Are in relations with other similar entities, which have
different names, but which sometimes are used in their
place
BUT WHAT CAN A COMPUTER UNDERSTAND?
First of all, a computer lacks all of this knowledge
about the world and the language. This is why tech
giants are building it.
Schema.org
Google’s Knowledge Graph
EVENTS
Event structures
Frames
Thematic roles
EVENT STRUCTURES
 Give me 10 words to describe this
situation: what’s going on?
EVENT STRUCTURES - FRAMES
 The basic idea is that one cannot understand the meaning of a single word without
access to all the essential knowledge that relates to that word.
 For example, one would not be able to understand the word "sell" without knowing
anything about the situation of commercial transfer, which also involves, among
other things, a seller, a buyer, goods, money, the relation between the money
and the goods, the relations between the seller and the goods and the money, the
relation between the buyer and the goods and the money and so on.
 Thus, a word activates, or evokes, a frame of semantic knowledge relating to the
specific concept it refers to (or highlights, in frame semantic terminology)
EVENT STRUCTURES – THEMATIC ROLES
EVENT STRUCTURES
So, in events there are
PEOPLE DOING THINGS TO OTHER PEOPLE, maybe WITH SOME THING
So far, so good.
Imagine if you could relate the entities you identify to the actions they take.
Imagine if your computer could do that….
And indeed, there are projects working on the description of events. Ever heard about FrameNet?
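A minimal sketch of a frame as a data structure, using the "sell" example from the frames slide. The frame below is hand-written for illustration and is not FrameNet data; the `Frame` class and helper names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    name: str
    roles: tuple     # the participants the frame expects
    triggers: tuple  # the words that evoke the frame


# Hand-written toy frame (an assumption, not taken from FrameNet).
COMMERCIAL_TRANSFER = Frame(
    name="commercial_transfer",
    roles=("seller", "buyer", "goods", "money"),
    triggers=("sell", "buy", "pay"),
)


def evoked_frames(sentence, frames):
    """Names of the frames evoked by at least one trigger word."""
    words = sentence.lower().replace(".", "").split()
    return [f.name for f in frames if any(t in words for t in f.triggers)]


def fill_roles(frame, **fillers):
    """Relate entities to the roles of a frame; unfilled roles stay None."""
    unknown = set(fillers) - set(frame.roles)
    if unknown:
        raise ValueError(f"roles not in frame: {sorted(unknown)}")
    return {role: fillers.get(role) for role in frame.roles}


print(evoked_frames("Roberto will sell his phone.", [COMMERCIAL_TRANSFER]))
print(fill_roles(COMMERCIAL_TRANSFER, seller="Roberto", goods="phone"))
```

The empty roles (buyer, money) are what a human reader fills in without thinking, and what the computer has to be told.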
UNDERSTANDING TEXTS
Understanding a text is definitely much more than simply reading it (or its
“graphical shapes”). Especially when it comes to relating it to other texts and
building meaningful collections of content, i.e. gathering texts into topics
In order to have a computer understand even the plainest meaning of a sentence
(not even a text) we would need at least
 A dictionary: to provide linguistic information (e.g. grammatical metadata)
 An ontology: to relate entities and frame them into event structures
This is the future of the Semantic Web. But that is already another story…
USER’S INTENTIONS
 What is an “intention”?
 It’s a form of meaning, i.e. pragmatic meaning: something that is
present in one’s mind
 Intentions are made of both the motivation to take an action
and the result that one wants to achieve by that action
 Intentions show the background where the motivation was
formed (e.g. emotionally) and the direction in which the user’s
attention is heading
 In search: how does a search engine satisfy the query with a broader
scope?
 Uses a dictionary (with synonyms, variants, etc. all included in the
algorithm): the case of Google’s broad match type
 In social networks:
 Posts, comments, shares: which one counts more?
 A scale of “original” content
 Why did Fb introduce reactions?
 To give a limited and predefined set of emotions beyond the Like button
 Are they a source for sentiment analysis? (yes, to be counted in social media analytics, with a
breakdown)
USER’S INTENTIONS
And what about text? How can a computer understand the intention of a (written or spoken)
text?
IT IS A HUUUUUUUUUUUUGGGEEE QUESTION
Most tech giants are working hard on this, making great efforts to develop algorithms and
computing systems. Here’s an excerpt from the Microsoft Speech Technology Dept.
 Intent understanding is about identifying the action a user wants a computer to take or the
information she/he would like to obtain, conveyed in a spoken utterance or a text query.
USER’S INTENTIONS
 … and in which way are they of interest to ORM?
 Identifying the background and the direction of an intention may provide a “path” of
action, which could potentially be a pattern (in which it could be possible to intervene)
 Words can be ambiguous (not clearly connoted, or not used for their literal meaning,
e.g. irony). Identifying the intended use can help attribute the best-fitting sentiment
 Computers don’t have (yet) the ability to “sense” the overall mood of a situation
 Considering the pragmatic dimension of intention (and context, later) broadens the
perspective of ORM beyond keywords and metrics, and can help write more significant
insights
Context: implications and consequences
 Contextual meaning can be conceived of as the meaning that is received by people (more
or less) independently of the speaker’s intentions
 Or, said differently, the effects that a communicative act brings about within the environment
that it enters
 Shares on social networks: a case of “mute meaning”
 The intention is to amplify the attention on the news, to make it one’s own, to show support
 If no comments are added, shares are examples of how the consequences of the
original content spread
PART 3 SENTIMENT ANALYSIS
 Units of analysis for sentiment attribution
 Word?
 Sentence?
 Document?
 Discourse?
 Topic / Theme?
 Data-driven approach
 Exc. 2
 Top-down?
 Bottom-up?
WHAT IS AN OPINION? ABSTRACTION 1
 “I bought an iPhone a few days ago. It is such a nice phone. The touch screen
is really cool. The voice quality is clear too. It is much better than my old
Blackberry, which was a terrible phone and so difficult to type with its tiny
keys. However, my mother was mad with me as I did not tell her before I
bought the phone. She also thought the phone was too expensive, ...” (Liu, Ch.
in NLP handbook, 2010)
 One can look at this review/blog at the
 document level, i.e., is this review + or -?
 sentence level, i.e., is each sentence + or -?
 entity and feature/aspect level
Entity and aspect/feature level
 “I bought an iPhone a few days ago. It is such a nice phone. The touch screen is really cool.
The voice quality is clear too. It is much better than my old Blackberry, which was a terrible
phone and so difficult to type with its tiny keys. However, my mother was mad with me as I
did not tell her before I bought the phone. She also thought the phone was too expensive,
...”
 What do we see?
 Opinion targets: entities and their features/aspects
 Sentiments: positive and negative
 Opinion holders: persons who hold the opinions
 Time: when opinions are expressed
OPINION LOGIC STRUCTURE
An opinion is a quintuple
(e_j, a_jk, so_ijkl, h_i, t_l)
where
 e_j is a target entity.
 a_jk is an aspect/feature of the entity e_j.
 so_ijkl is the sentiment value of the opinion from the opinion holder h_i on aspect a_jk of entity
e_j at time t_l. so_ijkl is positive, negative, or neutral, or a more granular rating.
 h_i is an opinion holder.
 t_l is the time when the opinion is expressed.
Opinion definition (Liu, Ch. in NLP handbook, 2010)
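A minimal sketch of the quintuple as a data structure, filled in by hand from the iPhone review quoted earlier. The class name, the field names and the "GENERAL" placeholder for opinions about the entity as a whole are assumptions made for illustration. The review gives no date, so the time field is left empty.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Opinion:
    entity: str                 # target entity
    aspect: str                 # aspect/feature of the entity
    sentiment: str              # "positive", "negative" or "neutral"
    holder: str                 # who holds the opinion
    time: Optional[str] = None  # when it was expressed, if known


# Hand-annotated from the review; "GENERAL" means the entity as a whole.
opinions = [
    Opinion("iPhone", "GENERAL", "positive", "author"),
    Opinion("iPhone", "touch screen", "positive", "author"),
    Opinion("iPhone", "voice quality", "positive", "author"),
    Opinion("Blackberry", "GENERAL", "negative", "author"),
    Opinion("Blackberry", "keys", "negative", "author"),
    Opinion("iPhone", "price", "negative", "author's mother"),
]

for opinion in opinions:
    print(opinion)
```

Note that the last opinion has a different holder: the same review contains opinions from two people.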
HOW TO USE THIS OPINION LOGIC STRUCTURE?
With this logic, it’s possible to tackle the problem of structuring the unstructured
 Goal: given an opinionated document,
 discover all quintuples (e_j, a_jk, so_ijkl, h_i, t_l),
 or solve some simpler forms of the problem, e.g. sentiment classification at the document or
sentence level.
 With the quintuples, it’s possible to convert unstructured text to structured data
 Traditional data and visualization tools can be used to slice, dice and visualize the results
 However, as seen in the logic structure, tools need to have dictionaries and ontologies built in
 It is then possible to enable qualitative and quantitative analysis
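A minimal sketch of "slice and dice" once the text has become rows. The rows below are hypothetical quintuples in the order (entity, aspect, sentiment, holder, time), written by hand for illustration, and the helper names are assumptions.

```python
from collections import Counter

# Hypothetical rows: (entity, aspect, sentiment, holder, time).
# The time is None because the example review gives no date.
rows = [
    ("iPhone", "touch screen", "positive", "author", None),
    ("iPhone", "voice quality", "positive", "author", None),
    ("iPhone", "price", "negative", "author's mother", None),
    ("Blackberry", "keys", "negative", "author", None),
]


def slice_rows(rows, entity=None, holder=None):
    """Keep only the rows matching the given entity and/or holder."""
    return [
        r for r in rows
        if (entity is None or r[0] == entity)
        and (holder is None or r[3] == holder)
    ]


def sentiment_counts(rows):
    """Quantitative view: how many opinions per sentiment value."""
    return Counter(r[2] for r in rows)


print(sentiment_counts(slice_rows(rows, entity="iPhone")))
print(slice_rows(rows, holder="author's mother"))
```

The same rows could just as well be loaded into a spreadsheet or a BI tool; the point is only that they are now rows.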
OPINION SUMMARY (ABSTRACTION 2)
With a lot of opinions, a summary is necessary.
 It’s a multi-document summarization task
 For factual texts, summarization is to select the most important facts and present them in a
sensible order while avoiding repetition
 1 fact = any number of the same fact
 But for opinion documents, it is different because opinions have a quantitative side & have
targets
 1 opinion = a number of opinions
 Aspect-based summary is more suitable
 quintuples form the basis for opinion summarization
Aspect-based opinion summary
(Hu & Liu, 2004)
“I bought an iPhone a few days ago. It is such a nice phone. The touch screen is really cool. The voice quality is clear
too. It is much better than my old Blackberry, which was a terrible phone and so difficult to type with its tiny keys.
However, my mother was mad with me as I did not tell her before I bought the phone. She also thought the phone was
too expensive, ...”
Feature Based Summary of iPhone:
Feature1: Touch screen
Positive: 212
 The touchscreen was really cool.
 The touch screen was so easy to use and can do amazing things.
...
Negative: 6
 The screen is easily scratched.
 I have a lot of difficulty in removing finger marks from the touch screen.
...
Feature2: voice quality
…
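A minimal sketch of how such a feature-based summary could be produced from already-tagged opinions. The input triples (aspect, sentiment, sentence) reuse the four sentences shown above, so the counts are tiny compared with the 212 and 6 of the slide; the function names are assumptions.

```python
from collections import defaultdict


def aspect_summary(opinions):
    """Group (aspect, sentiment, sentence) triples by aspect and sentiment."""
    summary = defaultdict(lambda: defaultdict(list))
    for aspect, sentiment, sentence in opinions:
        summary[aspect][sentiment].append(sentence)
    return summary


def render(summary):
    """Format the summary as counts followed by the supporting sentences."""
    lines = []
    for aspect, by_sentiment in summary.items():
        lines.append(f"Feature: {aspect}")
        for sentiment in ("positive", "negative"):
            sentences = by_sentiment.get(sentiment, [])
            lines.append(f"  {sentiment.capitalize()}: {len(sentences)}")
            lines.extend(f"    - {s}" for s in sentences)
    return "\n".join(lines)


# Illustrative input: the sentences come from the example summary above.
tagged = [
    ("touch screen", "positive", "The touchscreen was really cool."),
    ("touch screen", "positive",
     "The touch screen was so easy to use and can do amazing things."),
    ("touch screen", "negative", "The screen is easily scratched."),
    ("touch screen", "negative",
     "I have a lot of difficulty in removing finger marks from the touch screen."),
]

print(render(aspect_summary(tagged)))
```

Unlike a factual summary, the counts matter here: 1 opinion is not the same as many identical opinions.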
ASPECT-BASED OPINION SUMMARY
This approach also seems more suitable for ORM purposes,
because the variety and fragmentation of target objects is extremely
wide when it comes to summarizing the reputation of products and
people
Indeed, it makes it possible to break an object down into several aspects and to
assess each of them
AN EXAMPLE: APP2CHECK
 https://app2check.finsa.it/webapp/app/login.html
DATA-DRIVEN APPROACH
 Exc. 2
 Think of the parts of today’s presentation
 On colored sticky notes, write the part that you liked the most (green), the one in between (yellow) and
the one you liked the least (red)
 Parts were
 Topic grouping
 Semantics
 Sentiment and opinions
 Top-down?
 The one that we currently use
 We identify some documents and assign them some sentiment
 It’s a “fake bottom-up approach” because we can’t read all documents (because of limited time and
resources)
 Bottom-up?
 a more fine-grained approach: sentiment tagging at word, sentence, or document level?
 App2check tags at sentence level. It then calculates the average sentiment of all sentences and assigns it to the
document (a single review)
 The document sentiment is compared and matched against the user’s rating, as a control measure
 Topics are also rated: topics are identified as keywords within opinionated sentences, and their sentiment is
averaged
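A minimal sketch of the bottom-up steps just described: average the sentence scores, then check the result against the user's rating. The slides do not give App2check's actual scale, mapping or threshold, so the [-1, 1] score range, the linear star mapping and the tolerance below are all assumptions.

```python
from statistics import mean


def document_sentiment(sentence_scores):
    """Average of the sentence scores, each assumed to lie in [-1, 1]."""
    return mean(sentence_scores)


def rating_to_score(stars, max_stars=5):
    """Map a 1..max_stars rating linearly onto [-1, 1] (assumed mapping)."""
    return 2 * (stars - 1) / (max_stars - 1) - 1


def agrees_with_rating(sentence_scores, stars, tolerance=0.5):
    """Control measure: is the document sentiment close to the rating?"""
    gap = abs(document_sentiment(sentence_scores) - rating_to_score(stars))
    return gap <= tolerance


# Hypothetical scores for a four-sentence review.
scores = [1.0, 0.5, -0.5, 0.0]
print(document_sentiment(scores))     # 0.25
print(agrees_with_rating(scores, 4))  # True: 4 stars maps to 0.5
print(agrees_with_rating(scores, 1))  # False: 1 star maps to -1.0
```

A disagreement between the two values is a useful flag for a human check, for example in the case of irony.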
Web & Social Media Analystics - Workshop Semantica

  • 2. Part 1WARM UP (10 min)  Exercise  Ask for a volunteer  Inside the box, ask the V. to look for 1 specific item  Bend the V.’s eyes and ask her to find similar items, retrieve and put them in different places depending on their similarity
  • 3. Part 1WARM UP - EXPLANATION  What does it all mean?  What you have just met is the problem the your computer faces:  If you ask it to “find” the item that you need, it will do: “find” actually means “match what I give you with what you have in your db”  What your computer is not able to do is to put similar things together or to separate the different ones  Or better, it’s not able to make new categories which include similar items  In other words, topics ;)
  • 4. But why isn’t your computer able to make up such new categories? The answer is pretty straight… Because it does not know what those objects are and “mean” Everything your computer sees in a text is a series of characters. So, in a sentence like “Roberto is having great fun in this workshop!”, the thing that your computer sees is actually just… “Xxxxxx zv gdatdin dhdp3 axnwbx sdxn hwbxwbx xbwxhbjwx!” That is, it just does not know what those sequences of chars stand for. And the only things that he can put together are ”similar shapes” That’s why you need to tag... Part 1WARM UP - EXPLANATION
  • 5. PART 2 WHAT IS “SEMANTICS” AND WHAT IS IT FOR A COMPUTER?
  • 6. THERE ARE MANY TYPES OF MEANING Semantics is meaning. But first of all, let’s broaden the meaning of what “meaning” means :P Actually, we should better talk with plurals, ie. meanings. There are several types of meanings, and each one depends on the purpose of the communication (or better, communicative action)
  • 7. Some examples of types of meaning could be:  Referents, labels, relations  Events  Text cohesion Units of analysis User intentions  Context: implications and consequences Homo Sapiens (or just people, ehe) can understand all these types of meanings, and many more. THERE ARE MANY TYPES OF MEANING
  • 8. BUT WHAT CAN A COMPUTER UNDERSTAND? A lot, for sure. But a computer does not have (yet) the knowledge about the world that a human being has gained since she was born. By “knowledge of the world” we mean all of possible information registered in many ways, from biological perception to cultural education and social habits. if this all sounds so far away from your need to make a tool work, think like this: there are so many implied and underlying meanings in the text your tool is processing that it just does not know anything about. That’s why you need to cover its lacks of knowledge.
  • 9. REFERENTS, LABELS AND RELATIONS a REFERENT is the OBJECT : a person, a company, an event (oh, btw, these are also the SmartThemes in TalkWalker!). A Referent could be sticked with more than just one LABEL: a name is a label for a person, or company, or even an event (eg. a title for a conference). For example, the curly guy writing these slides is named Roberto. And he’s also Digital Analyst. And he’s also the guy travelling from and to Bergamo every day. These are all ways (labels) that could be used to refer to the same Referent (object) ALTERNATIVELY. This is important to your computer (and you behind it; btw, why aren’t you sitting in the front?) because the writer of a text could refer to the same object in many ways, and you could miss out some results because you didn’t set up those keyword in your query.
  • 10. REFERENTS, LABELS AND RELATIONS And then there are SYNONYMS, ie. words with similar meanings, which therefore sometimes occur in the same contexts, close to one another. Unfortunately, a computer doesn’t know that these 2 words are synonyms unless you instruct it that they are. Thus, 2 or more words could have a RELATION of synonymy (or antonymy, to put it veeeery simply) and belong to the same SEMANTIC AREA.
  • 11. Referents, labels, relations Entities • Have attributes and features and are involved in events • Could be referred to with different labels • Are in relations with other similar entities, which have different names, but which sometimes are used in their place
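
The point of slides 9 to 11 (one referent, many labels, plus synonyms) can be sketched as a small query-expansion helper. This is only an illustration: the dictionary layout, the function name `expand_query` and the synonym entries are assumptions, and only the labels for Roberto come from the slides.

```python
# Hypothetical lookup tables: the labels for "roberto" come from the slides,
# the synonym entries are invented for illustration.
REFERENT_LABELS = {
    "roberto": ["Roberto", "Digital Analyst", "the guy travelling from and to Bergamo"],
}
SYNONYMS = {
    "phone": ["mobile", "smartphone"],
}


def expand_query(terms, referents=REFERENT_LABELS, synonyms=SYNONYMS):
    """Return the query terms plus every alternative label or synonym we know of."""
    expanded = []
    for term in terms:
        key = term.lower()
        candidates = [term] + referents.get(key, []) + synonyms.get(key, [])
        for candidate in candidates:
            if candidate not in expanded:  # keep order, drop duplicates
                expanded.append(candidate)
    return expanded


if __name__ == "__main__":
    print(expand_query(["Roberto", "phone"]))
```

A term the tables do not know is returned unchanged, which mirrors the slide's warning: the computer only knows two labels belong together if you tell it so.
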
  • 12. BUT WHAT CAN A COMPUTER UNDERSTAND? First of all, a computer lacks all of this knowledge about the world and the language. This is why tech giants are building it: Schema.org, Google’s Knowledge Graph
  • 14. EVENT STRUCTURES • Give me 10 words to describe this situation: what’s going on?
  • 15. EVENT STRUCTURES - FRAMES • The basic idea is that one cannot understand the meaning of a single word without access to all the essential knowledge that relates to that word. • For example, one would not be able to understand the word "sell" without knowing anything about the situation of commercial transfer, which also involves, among other things, a seller, a buyer, goods, money, the relation between the money and the goods, the relations between the seller and the goods and the money, the relation between the buyer and the goods and the money and so on. • Thus, a word activates, or evokes, a frame of semantic knowledge relating to the specific concept it refers to (or highlights, in frame semantic terminology)
  • 16. EVENT STRUCTURES – THEMATIC ROLES
  • 17. EVENT STRUCTURES So, in events there are PEOPLE DOING THINGS TO OTHER PEOPLE, maybe WITH SOME THING. So far, so good. Imagine if you could relate the entities you identify to the actions they take. Imagine if your computer could do that… And indeed, there are projects working on the description of events. Ever heard of FrameNet?
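
The "sell" example from slide 15 can be written down as a tiny data structure: a frame lists the roles a situation involves, and an event records which of them a sentence actually fills. This is a minimal sketch, not the FrameNet schema; the class names, the role names and the people in the example are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Frame:
    """A situation type and the roles it involves (hypothetical structure)."""
    name: str
    roles: tuple


@dataclass
class Event:
    """One occurrence of a frame, evoked by a trigger word in a text."""
    frame: Frame
    trigger: str
    fillers: dict = field(default_factory=dict)

    def missing_roles(self):
        # Roles the frame expects but the text did not mention.
        return [role for role in self.frame.roles if role not in self.fillers]


# Roles taken from the "sell" example on slide 15.
COMMERCIAL_TRANSFER = Frame("commercial_transfer", ("seller", "buyer", "goods", "money"))

# Invented sentence: "Anna sold Marco a bike."
event = Event(
    frame=COMMERCIAL_TRANSFER,
    trigger="sell",
    fillers={"seller": "Anna", "buyer": "Marco", "goods": "a bike"},
)

if __name__ == "__main__":
    print(event.missing_roles())
```

Even when the price is never mentioned, the frame tells the computer that money is part of the situation, which is exactly the kind of implied knowledge the slides say it lacks.
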
  • 18. UNDERSTANDING TEXTS Understanding a text is definitely much more than plainly reading it (or its “graphical shapes”). Especially when it comes to relating it to other texts and building meaningful collections of content, ie. gathering texts into topics. In order to have a computer understand even the flattest meaning of a sentence (not even a text) we would need at least • A dictionary: to provide linguistic information (eg. grammatical metadata) • An ontology: to relate entities and frame them into event structures This is the future of the Semantic Web. But that is already another story…
  • 19. USER’S INTENTIONS • What is an “intention”? • It’s a form of meaning, ie. pragmatic meaning. Something that is present in one’s mind • Intentions are made of both the motivation to take an action and the result that one wants to achieve by that action • Intentions show the background where the motivation was formed (eg. emotionally) and the direction where the user’s attention is heading
  • 20. USER’S INTENTIONS • In search: how does a search engine satisfy the query with a broader scope? • It uses a dictionary (with synonyms, variants, etc. all included in the algorithm): the case of Google’s broad match type • In social networks: • Posts, comments, shares: which one counts more? • A scale of “original” content • Why did Fb introduce reactions? • To give a limited and predefined set of emotions beyond the Like button • Are they a source for sentiment analysis? (yes, to count in social media analytics, with breakdown)
  • 21. USER’S INTENTIONS And what about text? How can a computer understand the intention of a (written or spoken) text? IT IS A HUUUUUUUUUUUUGGGEEE QUESTION Most tech giants are working hard on this, making great efforts in developing algorithms and computing systems. Here’s an excerpt from the Microsoft Speech Technology Dept. • Intent understanding is about identifying the action a user wants a computer to take or the information she/he would like to obtain, conveyed in a spoken utterance or a text query.
  • 22. USER’S INTENTIONS • … and in which way are they of interest to ORM? • Identifying the background and the direction of an intention may provide a “path” of action, which could potentially be a pattern (in which it could be possible to intervene) • Words can be ambiguous (not clearly connoted, or not used in their literal meaning, eg. irony). Identifying the intention behind their use can help attribute the best-fitting sentiment • Computers don’t have (yet) the ability to “sense” the overall mood of a situation • Considering the pragmatic dimension of intention (and context, later) broadens the perspective of ORM beyond keywords and metrics, and can help in writing more significant insights
  • 23. Context: implications and consequences • Context meaning can be conceived of as the meaning that is received by people (more or less) independently of the speaker’s intentions • Or, said differently, the effects that a communicative act brings about within the environment that it enters • Shares on social networks: a case of “mute meaning” • The intention is to amplify the attention on the news, to make it one’s own, to show support • If no comments are added, shares show how a piece of content spreads the consequences of the original content
  • 24. PART 3 SENTIMENT ANALYSIS • Units of analysis for sentiment attribution • Word? • Sentence? • Document? • Discourse? • Topic / Theme? • Data-driven approach • Exc. 2 • Top-down? • Bottom-up?
  • 25. WHAT IS AN OPINION? ABSTRACTION 1 • “I bought an iPhone a few days ago. It is such a nice phone. The touch screen is really cool. The voice quality is clear too. It is much better than my old Blackberry, which was a terrible phone and so difficult to type with its tiny keys. However, my mother was mad with me as I did not tell her before I bought the phone. She also thought the phone was too expensive, ...” (Liu, Ch. in NLP handbook, 2010) • One can look at this review/blog at the • document level, i.e., is this review + or -? • sentence level, i.e., is each sentence + or -? • entity and feature/aspect level
  • 26. Entity and aspect/feature level • “I bought an iPhone a few days ago. It is such a nice phone. The touch screen is really cool. The voice quality is clear too. It is much better than my old Blackberry, which was a terrible phone and so difficult to type with its tiny keys. However, my mother was mad with me as I did not tell her before I bought the phone. She also thought the phone was too expensive, ...” • What do we see? • Opinion targets: entities and their features/aspects • Sentiments: positive and negative • Opinion holders: persons who hold the opinions • Time: when opinions are expressed
  • 27. OPINION LOGIC STRUCTURE An opinion is a quintuple (e_j, a_jk, so_ijkl, h_i, t_l) where • e_j is a target entity. • a_jk is an aspect/feature of the entity e_j. • so_ijkl is the sentiment value of the opinion from the opinion holder h_i on aspect a_jk of entity e_j at time t_l. so_ijkl is positive, negative, or neutral, or a more granular rating. • h_i is an opinion holder. • t_l is the time when the opinion is expressed. Opinion definition (Liu, Ch. in NLP handbook, 2010)
  • 28. HOW TO USE THIS OPINION LOGIC STRUCTURE? With this logic, it becomes possible to tackle the problem of structuring the unstructured • Goal: given an opinionated document • we can discover all quintuples (e_j, a_jk, so_ijkl, h_i, t_l), • or solve some simpler forms of the problem, e.g. sentiment classification at the document or sentence level. • With the quintuples, it’s possible to convert unstructured text to structured data • Traditional data and visualization tools can be used to slice, dice and visualize the results • However, as seen in the logic structure, tools need to have dictionaries and ontologies built in • It is then possible to enable qualitative and quantitative analysis
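
To make the step from unstructured text to structured data concrete, the quintuple can be written as a small record type and the iPhone review turned into rows. The rows below are annotated by hand for illustration, not produced by a tool; the class name `Opinion`, the `GENERAL` aspect label and the holder names are assumptions, and the time is left empty because the review does not state one.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Opinion:
    entity: str                 # e_j: the target entity
    aspect: str                 # a_jk: aspect/feature of the entity
    sentiment: str              # so_ijkl: positive, negative or neutral
    holder: str                 # h_i: who holds the opinion
    time: Optional[str] = None  # t_l: when it was expressed (unknown here)


# Hand-made annotation of the review quoted on slides 25 and 26.
opinions = [
    Opinion("iPhone", "GENERAL", "positive", "author"),
    Opinion("iPhone", "touch screen", "positive", "author"),
    Opinion("iPhone", "voice quality", "positive", "author"),
    Opinion("Blackberry", "GENERAL", "negative", "author"),
    Opinion("Blackberry", "keys", "negative", "author"),
    Opinion("iPhone", "price", "negative", "author's mother"),
]

# Once the text is structured, ordinary counting and filtering work.
by_entity = Counter((o.entity, o.sentiment) for o in opinions)
by_holder = Counter((o.holder, o.sentiment) for o in opinions)

if __name__ == "__main__":
    print(by_entity)
    print(by_holder)
```

Note how the holder matters: the one negative opinion about the iPhone belongs to the author's mother, which a document-level score would blur.
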
  • 29. OPINION SUMMARY (ABSTRACTION 2) With a lot of opinions, a summary is necessary. • It’s a multi-document summarization task • For factual texts, summarization is to select the most important facts and present them in a sensible order while avoiding repetition • 1 fact = any number of the same fact • But for opinion documents, it is different because opinions have a quantitative side & have targets • 1 opinion ≠ a number of opinions • Aspect-based summary is more suitable • Quintuples form the basis for opinion summarization
  • 30. Aspect-based opinion summary (Hu & Liu, 2004) “I bought an iPhone a few days ago. It is such a nice phone. The touch screen is really cool. The voice quality is clear too. It is much better than my old Blackberry, which was a terrible phone and so difficult to type with its tiny keys. However, my mother was mad with me as I did not tell her before I bought the phone. She also thought the phone was too expensive, ...” Feature Based Summary of iPhone: Feature1: Touch screen Positive: 212 • The touchscreen was really cool. • The touch screen was so easy to use and can do amazing things. ... Negative: 6 • The screen is easily scratched. • I have a lot of difficulty in removing finger marks from the touch screen. ... Feature2: voice quality …
  • 31. ASPECT-BASED OPINION SUMMARY This approach also seems more suitable for ORM purposes, because the variety and fragmentation of target objects is extremely wide when it comes to summarizing the reputation of products and people. Indeed, it allows us to break an object down into several aspects and to assess each of them.
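
An aspect-based summary like the one on slide 30 boils down to grouping opinionated sentences by aspect and sentiment and counting them. The sketch below assumes the sentences are already labelled; the labelling itself is the hard part and is not shown. The function names and the tiny sample are illustrative, so the counts differ from the 212 and 6 on the slide.

```python
from collections import defaultdict

# (aspect, sentiment, sentence): sample rows reusing sentences from the slides.
labelled = [
    ("touch screen", "positive", "The touch screen was really cool."),
    ("touch screen", "positive", "The touch screen was so easy to use and can do amazing things."),
    ("touch screen", "negative", "The screen is easily scratched."),
    ("touch screen", "negative", "I have a lot of difficulty in removing finger marks from the touch screen."),
    ("voice quality", "positive", "The voice quality is clear too."),
]


def summarize(rows):
    """Group sentences as summary[aspect][sentiment] -> list of sentences."""
    summary = defaultdict(lambda: defaultdict(list))
    for aspect, sentiment, sentence in rows:
        summary[aspect][sentiment].append(sentence)
    return summary


def render(summary):
    """Format the summary in the style of the slide: one block per feature."""
    lines = []
    for aspect, groups in summary.items():
        lines.append(f"Feature: {aspect}")
        for sentiment in ("positive", "negative"):
            lines.append(f"  {sentiment.capitalize()}: {len(groups[sentiment])}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render(summarize(labelled)))
```

Because each aspect keeps its own counts, a product with a loved screen and a hated price does not collapse into a single lukewarm score.
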
  • 32. AN EXAMPLE: APP2CHECK • https://app2check.finsa.it/webapp/app/login.html
  • 33. DATA-DRIVEN APPROACH • Exc. 2 • Think of the parts of today’s presentation • On colored sticky notes, write the part that you liked the most (green), so-so (yellow) and the least (red) • Parts were • Topic grouping • Semantics • Sentiment and opinions
  • 34. DATA-DRIVEN APPROACH • Top-down? • The one that we currently use • We identify some documents and assign them a sentiment • It’s a “fake bottom-up approach” because we can’t read all documents (for lack of time and resources) • Bottom-up? • A more fine-grained approach: sentiment tagging at word, sentence, or document level? • App2check tags at sentence level. It then calculates the average sentiment of all sentences and assigns it to the document (a single review) • The document sentiment is compared and matched against the user’s rating, as a control measure • Topics are also rated: topics are identified as keywords within the opinionated sentences, and their sentiment is averaged
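
The bottom-up flow described on slide 34 (score each sentence, average into a document score, check against the user's rating) can be sketched as follows. This is not App2check's actual algorithm: the word lists, the scoring rule, the function names and the 3-star neutral point are all assumptions chosen to keep the example short.

```python
import re
from statistics import mean

# Hypothetical mini-lexicon; a real tool would use a much richer dictionary.
POSITIVE = {"nice", "cool", "clear", "better"}
NEGATIVE = {"terrible", "difficult", "mad", "expensive"}


def split_sentences(text):
    """Naive split on ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def sentence_score(sentence):
    """Return -1, 0 or 1 depending on which lexicon dominates the sentence."""
    words = re.findall(r"[a-z']+", sentence.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return max(-1, min(1, score))


def document_score(text):
    """Average of the sentence scores, 0.0 for an empty text."""
    scores = [sentence_score(s) for s in split_sentences(text)]
    return mean(scores) if scores else 0.0


def agrees_with_rating(score, stars, neutral_stars=3):
    """Control measure: does the sign of the score match the star rating?"""
    if score > 0:
        return stars > neutral_stars
    if score < 0:
        return stars < neutral_stars
    return stars == neutral_stars


if __name__ == "__main__":
    review = "It is such a nice phone. The touch screen is really cool. It was too expensive."
    score = document_score(review)
    print(score, agrees_with_rating(score, stars=4))
```

Reviews where the computed score and the star rating disagree are the interesting ones to read by hand, since they often contain irony or mixed opinions.
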

Editor's Notes

  1. these can all be topics